Signature analysis and Computer Forensics

Michael Yip

School of Computer Science

University of Birmingham

Birmingham, B15 2TT, U.K.


December, 2008

Abstract: Computer Forensics is the process of using scientific knowledge to collect, analyze and present digital evidence to courts or tribunals. Since files are the standard persistent form of data on computers, the collection, analysis and presentation of computer files as digital evidence is essential in Computer Forensics. However, data can be hidden behind files well enough to trick the naked eye. Therefore, a more comprehensive method of data analysis, called file signature analysis, is needed to support the process of Computer Forensics. This article describes and discusses the method in detail.


Computer Forensics is the process of using scientific knowledge to collect, analyse and present data to courts. This process involves the preservation, identification, extraction and documentation of computer evidence stored on magnetic, optical or electronic media. The steps in the forensic process include:

1. Create an exact physical copy of the digital media, e.g. the computer's hard disk. This copy is often called a bitwise image.
2. Load the image onto an empty or formatted hard disk.
3. Secure the original media in a sealed container.
4. Mark and retrieve data of evidential value.
5. Present the evidence in a readable form for a court or tribunal.

Step 4 involves examining the image and searching it for evidence. With millions of files stored on a computer, forensic examiners need methods that reduce the search space and pick out suspicious files. This is where signature analysis is used as part of the forensic process.


Signature analysis is a process in which the headers and extensions of files are compared with a known database of file headers and extensions, in an attempt to verify all files on the storage media and discover those which may be hidden. To show why signature analysis is useful, this article first introduces the structure of computer files and how such files can be hidden. A demonstration then shows how signature analysis can be used to defeat such data hiding techniques.

Understanding the structure of a file

Since data are stored on computers as files, all of these files must be searched and examined, as if they were files in an office, in order to gather digital evidence. To understand the process of data hiding, one must first understand the structure of a computer file. The structure of a file normally consists of:

1. Filename
2. File header/footer
3. File content

1. Filename

The filename is a unique identifier which allows the computer to correctly identify each file stored on the disk. The first piece of information about the file format is given by the name of the file, e.g. essay.doc. Different applications use different file formats to encode data in files, which can prevent other applications from extracting the data. The part “.doc” is the filename extension. Many modern operating systems, such as Mac OS X and Windows, use it to determine the format of the file and to associate the file with a list of compatible applications.

2. File header/footer

The information which describes the type of the file, e.g. which application the file is associated with, is stored in the header or footer (or both) of the file. Such information is called the signature of the file, or file signature, and the signature of each file type is most often distinct from the others. The file extension and the file signature of each file should match in most cases, but there are exceptions: a comparison can produce mismatches, no match, unknown types and anomalous results.

Files generally have a header or footer which contains information on the format of the content stored in the file. It can either form part of the file or be stored as a separate file. Files of a particular type can be searched for using the information stored in the file header alone. Such information can easily be obtained by opening files in a hex editor such as HexEdit. Such tools allow users to see and edit the raw, exact contents of a file. Below is an Adobe PDF file opened using a hex editor:

Fig 1. Hexadecimal representation of an Adobe PDF file
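
The hex editor view described above can be approximated in a few lines of Python. This is only a minimal sketch, not how HexEdit itself works: the function names hex_dump and read_header are hypothetical, and the sample bytes are a stand-in for the start of a PDF file, which begins with the ASCII characters %PDF (25 50 44 46 in hexadecimal).

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex pairs and printable ASCII, like a hex editor."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Show printable ASCII as-is; replace every other byte with a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08X}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


def read_header(path: str, length: int = 16) -> bytes:
    """Return the first `length` bytes of a file without reading the rest."""
    with open(path, "rb") as handle:
        return handle.read(length)


# Hypothetical sample bytes standing in for the first line of a PDF file.
sample = b"%PDF-1.4\n"
print(hex_dump(sample))
```

Calling hex_dump(read_header(path)) on a real file would show its leading bytes in the same layout, which is usually enough to read off the signature by eye.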


whether a filename extension has been tampered with and once detected, a further investigation can

be carried out on such files, narrowing the search space.

Normally, file signature analysis is carried out using forensic applications such as EnCase, which enable the user to examine a disk image and carry out several different procedures. Such applications make use of an extensive list of published file signatures and match them against files’ extensions. If a mismatch is found, the file’s extension has most likely been altered and the file warrants a closer examination.
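
The matching step can be sketched as follows. This is an illustration under stated assumptions, not EnCase's actual implementation: the SIGNATURES table holds only a handful of well-known signatures, the classify function and its three result labels are hypothetical, and footers and anomalous results are not handled.

```python
import os

# A small illustrative table mapping leading bytes to expected extensions.
# Real forensic tools ship far larger lists.
SIGNATURES = {
    b"%PDF": {".pdf"},
    b"PK\x03\x04": {".zip", ".docx", ".xlsx", ".pptx"},
    b"\xFF\xD8\xFF": {".jpg", ".jpeg"},
    b"\x89PNG\r\n\x1a\n": {".png"},
    b"\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1": {".doc", ".xls", ".ppt"},
}


def classify(filename: str, header: bytes) -> str:
    """Compare a file's extension with the signature found in its header."""
    extension = os.path.splitext(filename)[1].lower()
    for magic, extensions in SIGNATURES.items():
        if header.startswith(magic):
            # Signature recognised: does the extension agree with it?
            return "match" if extension in extensions else "mismatch"
    # No known signature at the start of the file.
    return "unknown"


print(classify("report.pdf", b"%PDF-1.4"))        # extension agrees with header
print(classify("holiday.jpg", b"PK\x03\x04..."))  # ZIP-based content, .jpg name
```

Only files reported as a mismatch need a closer look, which is how the technique narrows the search space.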

In this case, the concept of file signature analysis is demonstrated by examining file signatures using

HexEdit rather than EnCase.

Case study: Hiding a Microsoft Office 2007 document

This case study demonstrates how easy it is to trick Windows Explorer into displaying the wrong file type simply by altering the file’s extension, and how examining the file’s signature using HexEdit defeats this method.

Platform: Windows Vista

Method of hiding: Changing file’s extension

Forensic technique: File signature analysis using HexEdit

Fig 3. confidential.docx created in a directory named secret

First, a new directory called secret was created on the desktop, and a new Microsoft Office 2007 document was created in it, as shown in Fig. 3 above. The document contains the secret message “You shouldn’t read this!”, as shown in Fig. 4 below.

Fig 4. The message stored in confidential.docx
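
The premise of the case study can be reproduced with a short script. This is a sketch under assumptions rather than the procedure used in the article: it runs in a temporary directory instead of the desktop, it uses a plain ZIP archive as a stand-in for a real Word document (Office 2007 .docx files are ZIP containers, so they begin with the bytes 50 4B 03 04, "PK.."), and the replacement extension .jpg is a hypothetical choice.

```python
import os
import tempfile
import zipfile

ZIP_MAGIC = b"PK\x03\x04"  # leading bytes of a ZIP container, including .docx


def starts_with_zip_magic(path: str) -> bool:
    """Check whether the first four bytes of the file are the ZIP signature."""
    with open(path, "rb") as handle:
        return handle.read(4) == ZIP_MAGIC


def demo() -> bool:
    with tempfile.TemporaryDirectory() as secret:
        original = os.path.join(secret, "confidential.docx")
        # Stand-in for a real Word document: a ZIP archive with one entry.
        with zipfile.ZipFile(original, "w") as archive:
            archive.writestr("word/document.xml", "You shouldn't read this!")
        # "Hide" the file by changing only its extension (hypothetical .jpg).
        disguised = os.path.join(secret, "confidential.jpg")
        os.rename(original, disguised)
        # The name changed, but the bytes on disk did not.
        return starts_with_zip_magic(disguised)


print(demo())
```

Renaming changes only the directory entry, so the signature inside the file still identifies it as a ZIP-based document whatever extension it carries.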
