
Cyber Forensics Basics

The document provides an overview of cyberforensics including the basics, perspectives, topics covered, computer crimes, and cyberforensic procedures. It discusses preparation, incident detection, documentation, duplication, analysis, and reporting as key steps in the cyberforensic process.

Uploaded by

MdMehediHasan

Cyber-Forensics

The Basics

CERTConf2006
Tim Vidas

Because you have to start somewhere

Who are we?


Tim Vidas
Sr. Tech. Research Fellow
UNO/PKI/NUCIA
Certs: CISSP, 40xx, Guidance,
AccessData etc.
Instructor: UNO, Guidance, LM
RRCF

Joe Wilson
Recent Graduate (MS/MIS)
RRCF
2

NUCIA
Nebraska University Consortium on
Information Assurance
IA full time
Traditional university coursework in
IA, Crypto, Forensics, Secure
Administration, Certification and
Accreditation, etc
STEAL Labs
Other work
Most of us are around CERTConf.

Who are you?

Who are you?


Where do you work?
What do you do?
How many of you are planning on
attending all Forensics sessions?
What are you expecting to get out of them? (I'll try to be accommodating)

The learning theory:


A technical, practical, hands-on approach
Technical means the class(es) will either require or provide a significant amount of technical expertise.
Practical implies that the information covered should give you the capacity to conduct many cyberforensic activities.
Hands-on: the additional hands-on component provides an active experience in which you will be immersed in related exercises.
The best way to learn is by doing.

Disclaimer
Even though this class touches on quite a few legal topics, nothing should be construed as advice or legal instruction
Before performing many of the skills learned this week on a computer other than your own, you may need to seek permission (possibly written) and/or seek advice from your own legal counsel.

_______ forensics
Whereas computer forensics is defined as "the collection of techniques and tools used to find evidence in a computer," digital forensics has been defined as "the use of scientifically derived and proven methods toward the preservation, collection, validation, identification, analysis, interpretation, documentation, and presentation of digital evidence derived from digital sources for the purpose of facilitating or furthering the reconstruction of events found to be criminal, or helping to anticipate unauthorized actions shown to be disruptive to planned operations."

What is Cyberforensics?
This really depends on the point of view
Traditionally Cyber forensics involves the

preservation,
collection,
validation,
identification,
analysis,
interpretation,
documentation and
presentation

of computer evidence stored on a computer.


Forensics is the application of science to the
legal process.
Jim Christy, DCCI

Rapid-Response
Cyberforensics
Characterized by:
Live-response
Military-type contexts
But not of necessity

Judicious a priori planning


Prior strategic incident response planning
Requisite training in
Basic forensic procedures
Live-response
Network forensics

Continued updating of skills as technology changes
Technically adept with a diversity of tools & toolkits

Viewpoint
According to the CFEWG curriculum
group there are three perspectives of
cyberforensics
Law enforcement
FBI/IRS

Business/Industry
Cisco

Military/counterintelligence
AF OSI/NSA

Although not mutually exclusive, each can have its own thrust.
Academia is becoming a fourth

Viewpoint
Each perspective has different objectives. Even though there is overlap, the approaches of each remain mainly ad hoc and uncoordinated
Technology is vendor-driven
No industry certification
No standards
ASCLD/LAB for labs

Interesting situations with the court system
Who is more believable?
Evidence isn't questioned

Coverage from OS
perspective
Windows
95% of cases involve Windows (FBI)
Topics
File systems: FAT & NTFS
Multiple tools:
Commercial
Freeware
Windows & Linux

Live response
Network forensics
13

Topics we will cover


We're going to start by establishing a basis for cyber-forensics
Hexadecimal notation
Traditional post-mortem forensics

Duplication
Analysis
File systems
Footprints
Etc.

Then build upon this foundation and explore other avenues
Generally speaking, if you don't know how a particular tool works behind the scenes, your findings might not hold weight on the witness stand (or corporate report, or ____ )

Cybercrime &
Cyberwarfare
Information warfare specialists at the Pentagon estimate that a properly prepared and well-coordinated attack by fewer than 30 computer virtuosos strategically located around the world, with a budget of less than $10 million, could bring the United States to its knees.
Center for Strategic & International Studies (CSIS)
http://www.csis.org/pubs/cyberfor.html

Cybercrime &
Cyberwarfare
Such a strategic attack, mounted
by a cyberterrorist group, either
substate or nonstate actors, would
shut down everything from electric
power grids to air traffic control
centers.

Center for Strategic & International Studies (CSIS)
http://www.csis.org/pubs/cyberfor.html

Scope of the Problem


In 1990 a computer hard drive seized in a criminal investigation would contain approximately 50,000 pages of text
The same hard drives now contain 5 million to 50 million pages of data.
But the ability of these agencies to retain computer talent is seriously jeopardized by the compensation packages offered by the private sector.
Center for Strategic & International Studies (CSIS)
http://www.csis.org/pubs/cyberfor.html

Computer Crime
Sample of computer crimes from 2001
Demoted employee installs a logic bomb,
which later deactivates hand-held
computers used by the sales force.
eBay
User advertises goods, but on receiving payment
never ships the goods.
Advertised collectibles turn out to be fakes

Disgruntled student sends threatening emails, leading to school closing down.
Ring of software pirates uses web site to distribute pirated software
Stephenson, 2001.

Computer Crimes
Software company employee is indicted for altering a copyrighted program to overcome file reading limitations
Hacker accesses 65 U.S. Court computers and downloads large quantities of private information.
Hacker accesses bank records, steals banking and personal details.
15-year-old boy runs scripts that launch DoS (denial-of-service) attacks against eBay, Yahoo!, AOL, etc.

Moral: NO SUCH THING AS TYPICAL COMPUTER CRIME.
Must be flexible in your response
Stephenson, 2001.

Taxonomy of Computer
Crime Scenes
Computer Crime
Computer used to conduct crime
Examples?

Computer is target of crime


Examples?

Response
Live: real-time
After the fact
20

Introduction
Computer forensics involves
Preservation
Evidence changed, court case is gone

Identification
Of the 100,000 files, what is evidence of a crime?

Extraction
Take the evidence off the hard drive for presentation

Documentation
Document what you found to present in court

Interpretation
Interpret the evidence in light of the charges

As much art as science


21

Goals (Questions) of
Forensic Analysis
Identify root cause of an event to ensure it won't happen again
Must understand the problem before you can be sure it won't be exploited again.

Who was responsible for the event?
Most computer crime cases are not prosecuted
Consider acceptability in a court of law as our standard for investigative practice.
Ultimate goal is to conduct the investigation in a manner that will stand up to legal scrutiny.
Treat every case like a court case!
Kruse & Heiser, 2002

Cyberforensics
Procedures
Cyberforensics is a large, complex
problem composed of various
flexible steps
Each step has an input and an
output

23

Preparation

Detection

IR Team Informed
First Response

Secure System
Create Response
Strategy

Incident?

Begin Evidence
Acquisition

Duplicate

Is this
Correct?

Duplication
Required?

Begin
Investigation

Report

Feedback to Preparation &


Secure System
24

Adapted from Mandia & Prosise 2003

Preparation
What to do before the incident
Incident response plan
What to do in case of
User incident
User or customer reports problem
Application incident
Web page changed, etc.
System incident
Virus
Server down
Denial-of-service attack
Hostile code
Unauthorized access
Network probes

25

Preparation
What to do before the incident
Incident response team

Systems administrators
Forensic analysts
Users
Managers

May have to wear more than one hat

26

Detecting Incidents
You detect something you believe
to be an incident
Something outside the scope of
normal operation
Now what?

DO UNTIL DONE
Document everything
Document everything
Document everything
27

Incident Detection
Follow a well-defined methodology
Care and due diligence must be applied to each case
TREAT EACH CASE AS IF IT MAY END UP IN COURT
Don't begin analysis, decide you have a problem, and only THEN start handling it as evidence
TOO LATE by then, because you have changed the scene of the crime.
Defense attorney won't care whether this was done accidentally or not.

Incident Detection
How to document
Create a notification checklist
Ensures you won't miss any details
Facts to include:
Time & Date
Who or what is reporting the incident
User, sysadmin, IDS

When incident is suspected to have occurred
Hardware/software
POC

Chain-of-custody
CRITICAL to keep documentation of how evidence is handled.
Establishes continuity of who/what/where regarding evidence

Who collected the evidence
What comprises the evidence
When evidence was collected
If hardware (take a photo)
Make, model, serial #

Description of the evidence, technical information
Name and signature of individual receiving evidence
Case number & tag (bag & tag)
If electronic, cryptographic hashes
Mandia & Prosise, 2003

Chain-of-custody
How to bag & tag electronic evidence?
Cryptographic hash of the electronic file
More on this stuff a bit later
Time & date stamp before and after
capture
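
One way to picture "bag & tag" for an electronic file is the minimal Python sketch below. It records a timestamp before and after reading the file and computes two hashes in a single pass. The function name `tag_evidence` and the record layout are hypothetical, chosen only for illustration; a real lab would follow its own forms and tools.

```python
import hashlib
from datetime import datetime, timezone

def tag_evidence(path, collector, chunk_size=1024 * 1024):
    """Return a custody record for one evidence file (hypothetical layout)."""
    started = datetime.now(timezone.utc).isoformat()
    md5, sha1 = hashlib.md5(), hashlib.sha1()
    with open(path, "rb") as fh:          # opened read-only; evidence is never written
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    finished = datetime.now(timezone.utc).isoformat()
    return {
        "item": path,
        "collected_by": collector,
        "capture_started_utc": started,
        "capture_finished_utc": finished,
        "md5": md5.hexdigest(),
        "sha1": sha1.hexdigest(),
    }
```

Re-running the function later and comparing the stored hashes is what shows the file is unchanged; the timestamps only document when the capture happened.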

31

Evidence Checkout Log

Item | Date | Time | Location | Name | Reason
Dell Inspiron 8000 SN# 4005 | 10/8/2002 | 13:05 | Locked up in STEAL Lab cabinet | Vidas | Safekeeping
Dell Inspiron 8000 SN# 4005 | 10/9/2002 | 8:02 | Removed | Vidas | Analysis
Dell Inspiron 8000 SN# 4005 | 10/9/2002 | 15:33 | Locked up in STEAL Lab cabinet | Vidas | Safekeeping
Dell Inspiron 8000 SN# 4005 | 10/11/2002 | 7:30 | Removed | Nicoll | Analysis
Dell Inspiron 8000 SN# 4005 | 10/11/2002 | 12:00 | Locked up in STEAL Lab cabinet | Nicoll | Safekeeping

Adapted from Kruse & Heiser, 2002

Handling Evidence
Chain-of-Custody
Goal is to protect the integrity of your evidence
Make it difficult for the defense attorney to successfully argue that the evidence was tampered with while it was in your custody

Document the following questions

Who collected the evidence?
How was it collected? From where was it collected?
Who took possession of it?
How was it stored and protected in storage?
Who took it out of storage and why?

Kruse & Heiser, 2002

Chain-of-Custody
Be meticulous
Defense attorney will cross-reference
with other documents to determine
any inconsistencies

The fewer people who have access to your evidence or locker room, the better.
Defense attorneys will argue otherwise

Chain-of-Custody
What does a typical CoC list look
like?
CoC is quite a bit different with
digital evidence these days

35

First Response
You've detected an incident, now what?
Verify incident and related
information
Initiate network monitoring if appropriate
IDS
Sniffer

Users involved
Business impact if any
36

Formulate/Execute
Response Strategy
Your response strategy should be driven by your incident response plan
If you don't have one, you must develop one on the fly.
Select the most appropriate strategy
Best if you have thought about this beforehand

Context determines whether to do a live response or an offline media analysis after forensic duplication
Big difference between the two, notwithstanding legal implications

Formulate/Execute
Response Strategy
Determine

How serious the problem is


Sensitivity of the compromised information
Potential offenders
Whether the incident is public or private
Internal network vs. web page

Level of access gained by intruder


Skill of the intruder
Level of tolerable downtime
Determines live response vs. offline

$$$$ lost
Mandia, Prosise & Pepe, 2003

38

Formulate/Execute
Response Strategy
Incident:
DoS (denial of service)

Example
Smurf attack

Strategy
Reconfigure router to minimize effect of
flooding
Establishing perpetrator too costly

Likely outcome
Reconfiguration reduces effect of flooding
Mandia, Prosise & Pepe, 2003

39

Formulate/Execute
Response Strategy
Incident:
Unauthorized use

Example
Porn surfing from company workstation

Strategy
Perform forensic duplication
Offline analysis
Interview user

Likely outcome
Suspect identified and evidence collected for
disciplinary action.
Mandia, Prosise & Pepe, 2003

40

Formulate/Execute
Response Strategy
Incident:
Computer intrusion

Example
Buffer-overflow gives intruder root access to critical
system

Strategy
Monitor intruder activities
Isolate the machine, reduce problem scope
Secure and recover the system

Likely outcome
Vulnerability identified, system recovered.
Mandia, Prosise & Pepe, 2003

41

Formulate/Execute
Response Strategy
Incident:
Stolen information

Example
Stolen CC numbers from company database

Strategy
Issue public statement
Perform forensic duplication & analysis
Contact LE

Likely outcome
LE agents participate in investigation
Systems offline until problem resolved.

Mandia, Prosise & Pepe, 2003

42

Considerations
Presenting strategies to management,
consider
Downtime
Network/system
User

Legal liability
e.g., downstream liability
Stolen CC

Publicity
Most intrusions are not reported

Theft of IP
Mandia, Prosise & Pepe, 2003

43

Forensic Duplication
Your strategy is to take the system
offline.
Case may go to court or involve high-cost damage
Need to perform a bit-level copy of the
system
WHY?
Two types of analysis
Logical
Physical

44

Forensic Duplication
Your strategy is to take the system
offline.
Can't do a physical analysis on a mere logical copy of the hard drive
Misses ambient data that may contain a wealth of evidence
Must access each sector of the HD
Ambient data is found in areas not privy to the user

Forensic Duplication
Your strategy is to take the system
offline.
Offline analysis allows you to preserve the
system as-is, i.e., like putting yellow police
tape around the scene of a crime
Offline analysis doesn't affect the integrity of the evidence because you are doing analyses on copies of the evidence.
Of course you'll likely lose all volatile data by shutting down the machine

Forensic Duplication
More on this a little later

47

Authenticate the Evidence


It is difficult to show that evidence of any kind is the same as what was left behind by a criminal
Computer drives deteriorate slowly
Child pornography and Taliban terror plans don't show up randomly on a HD
Chain of custody and other handling rules assure the jury that no unanticipated or introduced changes occurred.
The "prove who was at the keyboard" problem
Kruse & Heiser, 2002

Investigation
Answers
Who, what, when, where, how
How you perform the investigation is determined by whether you have a forensic duplicate or are conducting a live response.
E.g., you can't get to certain portions of a hard disk when working in live-response
Can't do a string search on a swap file under live-response

Investigation
What is the goal?
Search for appropriate types of information
Graphics/images
Text

Problems:
There are hundreds or thousands of files
Needle in a stack of needles problem
Files can be hidden

Kiddy porn graphic saved as myhomework.doc


Steganography or alternate data streams
Files deleted
.files
Hidden areas of disk
obfuscation

50

Common Mistakes
Altering time and date stamps.
Killing rogue processes.
Patching the system before the
investigation.
Not recording commands executed on
the system.
Using untrusted commands and
binaries.
Writing over potential evidence by:
Installing software on the evidence media
Running programs that store their output on the evidence media.
Kruse & Heiser, 2002

How do you know something is wrong?
Failed login attempts
Logins into dormant and default
accounts
Activity during nonworking hours
Presence of new accounts not
created by the systems
administrator
Unfamiliar files or programs
52

Running Processes
What is this?

What's wrong with this?

Linux: top, ps

53

Other:

Event Log
Computer management (mmc)
Open Shares (mmc)
Network connections (netstat)
Services ( mmc)
Connected users (mmc)

MMC, administrative tools, and 3rd party applications are all going to be valuable

Detection
Unexplained changes in file and
directory permissions
Unexplained elevation or use of
privileges
An altered web page
Presence of pornographic images
on a system
55

Detection
Use of commands or functions not normally associated with a user's job
Presence of contraband utilities
(cracking, hacking, crypto,
obfuscating, etc)
Gaps in or erasure of system logs

56

Detection
Changes in DNS tables or router
or firewall rules that cannot be
accounted for.
Unusually slow system
performance
System crashes
Social engineering attempts
57

Where do I find this evidence?
It depends on the OS
For Windows
it will likely be in various GUI-based
utilities
Or in highly obfuscated portions of
specific files

For UNIX/Linux, it will likely be in various text files

The Initial Assessment


What probably happened?
Best response?
Investigator must assess scene and
respond accordingly
Difference between
Someone lying on the ground bleeding at scene
of the crime
Someone lying on the ground dead at the scene
of the crime

Responses differ depending on circumstances
Mandia & Prosise, 2003

59

Incident Notification Checklist


Who called:
Time/Date
Phone

Nature of incident
When did it occur?
How was it detected?
When was it detected?
Immediate and future impact to client:
Mandia & Prosise, 2003

60

Always practice safe hex.

Hex

Why HEX?
While hex is less readable than ASCII text, it is more readable than the code the machine understands
The number 65535 would be written down as 16 ones, or 1111111111111111 (base 2)
Prone to error: was that 16 or 17 1s?
To condense the same information we use a base 16 system, called hexadecimal.

What is HEX?
Hex uses the decimal digits first, followed by alphabetic characters.
It is fairly straightforward to convert back and forth from binary to hex
Decimal: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Hex:     0 1 2 3 4 5 6 7 8 9 A  B  C  D  E  F

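BIN % | OCT | DEC | HEX 0x
0 | 0 | 0 | 0
1 | 1 | 1 | 1
10 | 2 | 2 | 2
11 | 3 | 3 | 3
100 | 4 | 4 | 4
101 | 5 | 5 | 5
110 | 6 | 6 | 6
111 | 7 | 7 | 7
1000 | 10 | 8 | 8
1001 | 11 | 9 | 9
1010 | 12 | 10 | A
1011 | 13 | 11 | B
1100 | 14 | 12 | C
1101 | 15 | 13 | D
1110 | 16 | 14 | E
1111 | 17 | 15 | F
10000 | 20 | 16 | 10
10001 | 21 | 17 | 11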

Converting
If you write down 1234 (base 10), you are talking about the number one thousand, two hundred and thirty-four.
This can be rewritten as:
1 * 10^3 + 2 * 10^2 + 3 * 10^1 + 4 * 10^0

Converting
It is the same in all other bases: each place represents a power of the base, with the rightmost place being base^0.

66

Converting
What is 0xCB in Decimal?
C = 12 and B = 11 so
12 * 16^1 + 11 * 16^0 = 203

What about binary?


C = 12 = 1100 B = 11 = 1011
CB = 1100 . 1011
so 0xCB = %11001011

67

Converting
What is 0xAF1 in Decimal?
A = 10, F = 15 so
10 * 16^2 + 15 * 16^1 + 1 * 16^0 = 2801

What about binary?


A = 1010 F = 1111 1 = 0001
AF1= 1010 . 1111 . 0001
so 0xAF1 = %101011110001
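
The two worked conversions above can be checked with a short Python sketch. The helper names `hex_to_decimal` and `hex_to_binary` are made up for this illustration; the last line uses Python's built-in conversions as a cross-check.

```python
def hex_to_decimal(hex_digits):
    """Sum digit * 16**position, working from the rightmost digit."""
    total = 0
    for position, digit in enumerate(reversed(hex_digits)):
        total += int(digit, 16) * 16 ** position
    return total

def hex_to_binary(hex_digits):
    """Expand every hex digit into its own 4-bit group."""
    return " ".join(format(int(digit, 16), "04b") for digit in hex_digits)

print(hex_to_decimal("CB"))    # 203
print(hex_to_binary("CB"))     # 1100 1011
print(hex_to_decimal("AF1"))   # 2801
print(hex_to_binary("AF1"))    # 1010 1111 0001
# Built-ins agree, and 65535 really is 16 ones: prints 2801 0xffff 16
print(int("AF1", 16), hex(65535), bin(65535).count("1"))
```

Each hex digit maps to exactly four bits, which is why converting between hex and binary needs no arithmetic, only a lookup per digit.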

68

Practical bits
Netmask:
So most people just type in:
255.255.255.0
What does this mean?

69

Practical bits
Netmask:
IP addresses are written as a dotted quad; the dots just break up the bits to make them easier to read.
How many bits does it take to represent 256 different values (0 through 255)?

70

Practical bits
Netmask:
11111111 = 255, so 8 bits give 256 unique values
Therefore, 255.255.255.0 is decimal dotted quad for the base 2 number:
11111111.11111111.11111111.00000000
This is also sometimes referred to as a /24 network because there are 24 1s
Netmasks almost always start with sequential 1s and end with sequential 0s

slight diversion now


Netmask:
11111111 . 11111111 . 11111111 . 00000000
network (subnet): the first 24 bits; host: the last 8 bits
So this particular netmask (/24) allows for 256 different hosts (well, actually a bit less, but let's just say 256) on one subnet. Every time you add a bit to the netmask, you get more subnets and fewer hosts per subnet.
Example:
192.168.100.0 - 192.168.100.255

slight diversion now


Netmask:
11111111.11111111.11111111.1 0000000
network (subnet): the first 25 bits; host: the last 7 bits
So this particular netmask (/25) has 2 subnets...
Example:
192.168.100.0 - 192.168.100.127 (subnet1)
192.168.100.128 - 192.168.100.255 (subnet2)
So /26 has 4 subnets, /27 has 8 subnets, all the way through /30 which has 64 subnets (4 addresses per subnet)
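
The subnet counts above can be reproduced with Python's standard `ipaddress` module. This is only a sketch for checking the arithmetic; the 192.168.100.0/24 block is the example range used on these slides.

```python
import ipaddress

block = ipaddress.ip_network("192.168.100.0/24")
for prefix in (25, 26, 27, 30):
    subnets = list(block.subnets(new_prefix=prefix))
    first = subnets[0]
    print(f"/{prefix}: {len(subnets)} subnets, "
          f"{first.num_addresses} addresses each, "
          f"first is {first[0]} - {first[-1]}")

# The second /25 starts where the first one ends.
second_half = list(block.subnets(new_prefix=25))[1]
print(second_half[0], "-", second_half[-1])   # 192.168.100.128 - 192.168.100.255
```

Note that the address count per subnet includes the network and broadcast addresses, which is why the number of usable hosts is "a bit less" than the raw count.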
73

slight diversion now


Netmask:
Looking at netmasks shorter than /24 gets into Class A, B, C type discussions, which are definitely out of scope here
Basically each fourth of the dotted quad controls a class, so using letters to represent the class a bit belongs to:
AAAAAAAA.BBBBBBBB.CCCCCCCC.xxxxxxxx
Class D is used for multicast
Class E is experimental and is basically a leftover from bureaucratic / political design-by-committee fallout

Practical Bits
Netmask:
What does the mask actually do?
Used for a bitwise AND with a host's address
If my computer is 137.48.112.123
and my netmask is 255.255.255.0
    10001001.00110000.01110000.01111011
AND 11111111.11111111.11111111.00000000
  = 10001001.00110000.01110000.00000000
So for the very common /24 netmask the result may look familiar: the last number (123) is the host id, and the rest (137.48.112) is the network.
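
The same bitwise AND can be run in Python as a quick check. This is a minimal sketch using the address and netmask from the slide; the variable names are illustrative only.

```python
import ipaddress

address = int(ipaddress.IPv4Address("137.48.112.123"))
netmask = int(ipaddress.IPv4Address("255.255.255.0"))

network = address & netmask                  # keep the bits where the mask is 1
host_id = address & ~netmask & 0xFFFFFFFF    # keep the bits where the mask is 0

print(ipaddress.IPv4Address(network))   # 137.48.112.0
print(host_id)                          # 123
print(bin(netmask).count("1"))          # 24, hence "/24"
```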
75

Why does all this matter?


So as a forensic examiner you
might not be overly concerned with
netmasks, or the class of a
particular network
And you may not be able to decode
machine language when you see it
But you should understand what it is
and realize that decoding it correctly
could change data into
information
76

Why does all this matter?


In the physical world if an
investigator found a letter at a crime
scene he would not throw it away
just because the crime was
committed in Nebraska and the
letter was written in Chinese.

77

Why does all this matter?


A set of 1s and 0s that translates into a peculiar set of hex characters may appear to be gibberish, but upon proper decoding, it may reveal a MIME-encoded message (for example)
Just because the data isn't in a particularly useful form doesn't mean that it's not valuable.

Peek into the Future:


Windows stores all kinds of data in all kinds of places
An interesting example is lnk files
An extension of .lnk means?

79

Peek into the Future:


It turns out that the date / time information for the original file the lnk points to (deleted or not) is stored in the lnk.
Starting at byte offsets 28, 36, and 44 you can glean the creation, last access and last modification times.
These are Windows Date/Time values: 64-bit, little-endian
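
A minimal Python sketch of decoding those three values follows. It assumes the 64-bit value is a Windows FILETIME, i.e. a count of 100-nanosecond intervals since 1601-01-01 UTC; the function names are hypothetical, and only the offsets 28, 36 and 44 come from the slide.

```python
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def filetime_to_datetime(raw8):
    """Decode 8 little-endian bytes holding 100 ns ticks since 1601-01-01 UTC."""
    ticks = int.from_bytes(raw8, "little")
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)

def lnk_timestamps(header):
    """Read the three timestamps at the byte offsets quoted above."""
    names = ("created", "last_access", "last_modified")
    return {name: filetime_to_datetime(header[off:off + 8])
            for name, off in zip(names, (28, 36, 44))}
```

To use it, read at least the first 52 bytes of a .lnk file in binary mode and pass them to `lnk_timestamps`. The results are in UTC, so a local-time reading still needs the time zone offset applied.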
80

Peek into the Future:


CST is 6 hours behind GMT
Notice the highlighted portion of
hex in winhex
What happened on 11/16/04 at
about 9:54 AM?

81

Peek into the Future:


What are the odds?
Every time a document is
accessed a lnk is created in the
hidden system folder RECENT
This folder exists for all users
individually
Obviously this knowledge has a
variety of uses
82

Encoding is not Encrypting


It is also important to note the difference between Encoding and Encrypting
Encoding is done primarily to make information EASY to interpret
Encrypting is done primarily to make information HARD to interpret
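
As a small illustration of the encoding side, the Python sketch below uses Base64, the encoding commonly seen in MIME message bodies. The sample message is made up. The point is that no key or password is involved: anyone who recognizes the encoding can reverse it.

```python
import base64

message = b"Meet at the usual place"
encoded = base64.b64encode(message)           # looks like gibberish at first glance
print(encoded)
print(base64.b64decode(encoded) == message)   # True: reversible by anyone, no key
```

Encrypted data, by contrast, cannot be turned back into the original without the key.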

83

Encoding is not Encrypting


The very fact that data has been
encrypted is sometimes enough to
raise red flags
Depending on circumstances the
existence of encrypted files may
create, or be a contributing factor
for Probable Cause
This is not the case with encoded
files
84

The Hex Editor


In Windows you may find a tool such as winhex, frHed, or Hackman valuable.
In Linux, maybe something like xxd, Heme, SHED, gHex, KHexEdit or some other abstraction (Autopsy, for example, has a hex view option).

85

Hex Editor
You can use these hex tools at varying granularity, e.g. by file:
Viewing a FILE
NOTICE the offset starts at 0
86

Hex Editor
How is this different?
Viewing a DISK

Contents
do not
start at 0

87

Files
Many low-level things can be determined at the hex level
Files typically have particular header information (this is different from file extensions like .doc or .jpeg)
This is often called a file signature or magic number

Files
When considering graphics files
; Windows Bitmap graphics
BMP=0x00:"BM"
; Compressed BM? File
BM_=0x00:"SZDD"
; Graphics Interchange Format bitmap graphics
GIF=0x00:"GIF8"
; Graphics Interchange Format bitmap graphics (GIF 87a)
GIF87A=0x00:"GIF87a"
; Graphics Interchange Format bitmap graphics (GIF 89a)
GIF89A=0x00:"GIF89a"
; JPEG Bitmap graphics
JPE=0x00:0xFF,0xD8,0xFF,0xE0,0x00,0x10,"JFIF"
; JPEG Bitmap graphics
JPG=0x00:0xFF,0xD8,0xFF,0xE0,0x00,0x10,"JFIF"
JS=0x00:"/"
These are standard types, the information is widely
available, these particular lines came from drivespy.ini
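
A signature check like this can be sketched in a few lines of Python. The table below copies a handful of the offset-0 patterns listed above; the names `SIGNATURES` and `identify` are hypothetical, and this is not how drivespy itself is implemented.

```python
SIGNATURES = {   # offset-0 byte patterns taken from the list above
    b"BM": "Windows bitmap (BMP)",
    b"GIF87a": "GIF (87a)",
    b"GIF89a": "GIF (89a)",
    b"\xFF\xD8\xFF\xE0\x00\x10JFIF": "JPEG (JFIF)",
    b"SZDD": "compressed file (SZDD)",
}

def identify(first_bytes):
    """Return the type whose signature starts the data, or None if unknown."""
    for magic, label in SIGNATURES.items():
        if first_bytes.startswith(magic):
            return label
    return None

# A file named myhomework.doc whose first bytes are a JPEG header:
header = b"\xFF\xD8\xFF\xE0\x00\x10JFIF\x00\x01"
print("myhomework.doc ->", identify(header))   # myhomework.doc -> JPEG (JFIF)
```

Comparing the identified type with the file extension is a cheap way to flag files that have simply been renamed.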

89

Files
This is the hex representation of a
jpg:

90

Files
If files are simply stored in hidden areas, like unallocated, slack, or interpartition space, they will still have header information
If files are enciphered in some way (like steganography) then there is no header information
If files are encrypted / compressed, there may not be header information about the file, but there will typically be header information about the encryption / compression for decryption / decompression purposes

Files
In some cases you may find portions or fragments of a file. If you suspect that the fragment may be part of what used to be a JPEG, for example (because near where the header should be you found "FIF" and you know that JPEG headers contain "JFIF"), you can attempt to recover the file by editing the correct header information back onto the disk.

Hashing

No. Not that kind.

Hashing
One of the best ways to describe
hashing is to describe a hash as a
fingerprint [of an image].
Fingerprints uniquely identify a
much larger object (human) from a
much smaller object ( the
fingerprint)

94

Hashing
Similarly, a digital hash is a practically unique representation of a larger object like an image
This hash is completely separate from the image that it is fingerprinting and has a fixed length like 128 or 160 bits.
A 1 MB file and a 1 GB file will both produce hashes of the same length

Hashing
The general idea is that a very
small (any) change in the source
file will result in a very large
change in the hash
The hashes we are referring to are
one-way hashes
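
The "small change in, large change out" behaviour is easy to see with Python's `hashlib`. This is a minimal sketch: the sample text is made up, and `differing_bits` is a helper written for this illustration.

```python
import hashlib

def differing_bits(a, b):
    """Count the bit positions in which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

original = b"Please wire the funds on Friday."
altered = original + b" "            # one extra trailing space

d1 = hashlib.md5(original).digest()
d2 = hashlib.md5(altered).digest()
print(hashlib.md5(original).hexdigest())
print(hashlib.md5(altered).hexdigest())
print(differing_bits(d1, d2), "of", len(d1) * 8, "bits differ")
```

For a well-behaved hash function, roughly half of the output bits are expected to change, even though only one character was added.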

96

Hashing
There are many automated tools
that provide hashing components.
Most *nix distributions provide hashing tools by default; for Windows you'll have to download software

97

Hashing

The algorithm is independent of OS: the same hash is produced from the same file on Linux and Windows
98

Hashing
The same software typically provides the means to check whether the hash of a given file has changed, in this case a -c option

Add a single space to the email


99

Hashing
MD5 128 bits
Sha1 160 bits
Sha256 256 bits
sha384, sha512
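
The output lengths listed above can be confirmed with Python's `hashlib`, which also shows that the length does not depend on the size of the input. A short sketch:

```python
import hashlib

for name in ("md5", "sha1", "sha256", "sha384", "sha512"):
    small = hashlib.new(name, b"x").digest()
    large = hashlib.new(name, b"x" * 1_000_000).digest()
    assert len(small) == len(large)    # length never depends on input size
    print(f"{name}: {len(small) * 8} bits")
```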

100

Hashing

Use something like md5deep or sha1deep for recursion:
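
md5deep and sha1deep are separate downloads. The idea of recursive hashing can be sketched in Python as below; `hash_tree` is a hypothetical helper written for this illustration and is not part of either tool.

```python
import hashlib
import os

def hash_tree(root, algorithm="md5", chunk_size=1024 * 1024):
    """Yield (hex digest, path) for every file found below root."""
    for folder, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(folder, name)
            digest = hashlib.new(algorithm)
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(chunk_size), b""):
                    digest.update(chunk)
            yield digest.hexdigest(), path
```

Calling `for digest, path in hash_tree("/some/evidence/copy"): print(digest, path)` produces one line per file. Saving that listing and comparing it with a later run shows which files changed.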

101

Side Rant: Hashing DLs


When downloading software a
hash is often provided along with
the download.
What purpose does this hash serve?

102

MD5 hash collisions


What's all this hoopla about?
Who has heard of this?
Explain

103

Hash Collisions
Hash Collision (n): a term in
computer programming for a
situation that occurs when two
distinct inputs into a hash function
produce identical outputs.
What does this mean to us
forensically?
http://en.wikipedia.org/wiki/Hash_collision

Hash Collisions
It's all relative to computation time

1) Create bad file
2) Gen MD5
3) Was it the MD5 I wanted? (no)
4) Mod bad file in some way
5) go to 2 (until done)
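
The loop above can be sketched in Python, with one big assumption made so that it finishes: only the first 2 bytes (16 bits) of the MD5 have to match, not all 128 bits. The function name `forge` and the padding scheme are made up for this illustration.

```python
import hashlib

def forge(good, bad, prefix_bytes=2, max_tries=5_000_000):
    """Pad `bad` until the first `prefix_bytes` of its MD5 match those of `good`."""
    wanted = hashlib.md5(good).digest()[:prefix_bytes]        # the MD5 "I wanted"
    for tries in range(max_tries):
        candidate = bad + b" #" + str(tries).encode()         # step 4: modify bad file
        if hashlib.md5(candidate).digest()[:prefix_bytes] == wanted:   # steps 2-3
            return candidate, tries
    return None, max_tries

candidate, tries = forge(b"good file contents", b"bad file contents")
print(tries, "modifications needed to match 16 bits")
```

Matching 16 bits takes on the order of tens of thousands of attempts. Every additional bit doubles the expected work, so matching all 128 bits this way is out of reach in practice, which is why the shortcut attacks on MD5's internals matter.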

It turns out that if you can produce, locate, etc. two strings of the same arbitrary length that happen to hash to the same MD5, then you can do some interesting things

Hash Collisions
Without getting too into it
Message Digest 5 uses Merkle-Damgård construction rounds
Starting with a 128-bit state, then processing 512 more bits at a time
Unfortunately, if two arbitrary files of the same length going through the rounds reach the same intermediate hash at the end of a given round, then identical arbitrary data can be appended afterward to both, and the resultant hashes will match

Hash collisions
Tools like stripwire can actually create 2 files that have the same MD5, and very quickly
stripwire $VERSION: Conflation Attack Using Colliding MD5 Test Vectors
Author: Dan Kaminsky (dan\@doxpara.com)
Example: ./stripwire.pl -v -b test.pl -r fire.bin
Options: -b [file.pl]  : Build encrypted archives of this perl code
         -r [file.bin] : Attempt to self-decrypt and execute this file
         -v            : Increase verbosity.
         -a            : Rename active payload (fire.bin)
         -i            : Rename inactive payload (ice.bin)

Hash Collisions
What does this mean to us?
Actually very little!
Since we are creating two new files with the same MD5, this doesn't even affect Known Hash Set Lists, like KFF or NIST / NSRL
This whole discussion was on MD5, but does/may apply to other hashing algorithms
This can be mitigated by simply storing dual hashes or, in a weaker sense, by storing other metadata like file size

Difference between the 2

d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b
d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
dd 53 e2 b4 87 da 03 fd 02 39 63 06 d2 48 cd a0
e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 a8 0d 1e
c6 98 21 bc b6 a8 83 93 96 f9 65 2b 6f f7 2a 70

d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b
d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
dd 53 e2 34 87 da 03 fd 02 39 63 06 d2 48 cd a0
e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 28 0d 1e
c6 98 21 bc b6 a8 83 93 96 f9 65 ab 6f f7 2a 70
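
The two 128-byte blocks above can be fed straight into Python's `hashlib` to see the collision, the effect of appending identical data to both, and why a second hash helps. This is a sketch for checking the slide's claim: the hex is copied from the two dumps, and the suffix is an arbitrary made-up string.

```python
import hashlib

BLOCK_A = bytes.fromhex("""
d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 87 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 71 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd f2 80 37 3c 5b
d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
dd 53 e2 b4 87 da 03 fd 02 39 63 06 d2 48 cd a0
e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 a8 0d 1e
c6 98 21 bc b6 a8 83 93 96 f9 65 2b 6f f7 2a 70
""")
BLOCK_B = bytes.fromhex("""
d1 31 dd 02 c5 e6 ee c4 69 3d 9a 06 98 af f9 5c
2f ca b5 07 12 46 7e ab 40 04 58 3e b8 fb 7f 89
55 ad 34 06 09 f4 b3 02 83 e4 88 83 25 f1 41 5a
08 51 25 e8 f7 cd c9 9f d9 1d bd 72 80 37 3c 5b
d8 82 3e 31 56 34 8f 5b ae 6d ac d4 36 c9 19 c6
dd 53 e2 34 87 da 03 fd 02 39 63 06 d2 48 cd a0
e9 9f 33 42 0f 57 7e e8 ce 54 b6 70 80 28 0d 1e
c6 98 21 bc b6 a8 83 93 96 f9 65 ab 6f f7 2a 70
""")

def md5(data):
    return hashlib.md5(data).hexdigest()

suffix = b"any identical data appended to both"

print(sum(x != y for x, y in zip(BLOCK_A, BLOCK_B)), "bytes differ")   # 6 bytes differ
print(md5(BLOCK_A) == md5(BLOCK_B))                        # True: same MD5
print(md5(BLOCK_A + suffix) == md5(BLOCK_B + suffix))      # True: still the same
print(hashlib.sha1(BLOCK_A).hexdigest() ==
      hashlib.sha1(BLOCK_B).hexdigest())                   # False: SHA-1 tells them apart
```

The last line is the "dual hashes" mitigation in action: blocks built to collide under MD5 are not expected to collide under a second, different algorithm.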

110

Bit Rot
Has anyone ever heard of Bit
Rot?

111

Bit Rot
There is actually a lot of DA
backlash about hashing as part of
Chain of Custody
Over time, MTBF kicks in and a bit
mysteriously flips on an HD.
This one bit obliterates the hash
What are the legal repercussions to a
re-opened case?
112

Resources
http://md5deep.sourceforge.net/
www.doxpara.org
en.wikipedia.org

113

References
Casey, E. (2001). Digital Evidence and
Computer Crime. Academic Press.
Casey, E. (2002). Handbook of Computer
Crime Investigation: Forensic Tools and
Technology. Academic Press.
Kruse, W.G. III, & Heiser, J.G. (2002).
Computer Forensics : Incident Response
Essentials. Addison-Wesley.
Mandia, K., Prosise, C., & Pepe, M. ( 2003).
Incident Response: Investigating Computer
Crime. Osborne.
114

References
Stephenson, P. (2001).
Investigating Computer-Related
Crime. CRC Press.
Center for Strategic & International Studies (CSIS)
http://www.csis.org/pubs/cyberfor.html
http://www.ascld-lab.org/
http://www.Dcci.gov
