KWAME NKRUMAH UNIVERSITY OF SCIENCE AND TECHNOLOGY
DEPARTMENT OF COMPUTER SCIENCE
CSDF 557: COMPUTER FORENSICS
GROUP ASSIGNMENT
RESEARCH INTO HASHING AND HOW HASHING IS RELEVANT IN THE DIGITAL
FORENSICS PROCESS
SUBMITTED BY
CHRISTIAN ADU-BOAHENE - PG3443722
HARRISON AGBEDANU DORMENYO - PG3443922
MICHAEL OFORI ASARE - PG3493922
LAWRENCE BOATENG - PG3494022
KOMBAT SALAM - PG3494222
MICHAEL ATSU - PG3496622
1.0 DEFINITION OF HASHING
Hashing is a cryptographic process that takes data of any size and applies a mathematical
function to it to produce a fixed-length string of characters (Ke, Liu, Wang, & Goyal, 2011).
The output is not strictly unique, since infinitely many inputs map onto a finite set of outputs,
but for a strong algorithm it is computationally infeasible to find two inputs that produce the
same output. The mathematical function applied is known as the hash function or hash algorithm,
and its output is known as the hash value or hash digest. Hashing is used for everything from
data organization to file integrity verification. Because the process is practically
irreversible, hashing is described as a one-way cryptographic function.
Figure 1 below shows how hashing works.
Figure 1: How hashing works
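The fixed-length property described above can be illustrated with a minimal Python sketch. It
assumes SHA-256 from the standard hashlib module as the hash function; the helper name
sha256_hex and the sample inputs are chosen only for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Inputs of very different sizes (illustrative values only).
inputs = [b"", b"a", b"forensics", b"x" * 1_000_000]
digests = [sha256_hex(item) for item in inputs]

for item, digest in zip(inputs, digests):
    # Every digest is 64 hexadecimal characters (256 bits), whatever the input size.
    print(f"{len(item):>9} bytes -> {digest}")
```

Each input, from zero bytes to one million bytes, yields a digest of the same length.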
2.0 HASH ALGORITHMS OR HASH FUNCTIONS
A hash algorithm is an algorithm that turns a variable-sized amount of data into a fixed-sized
output known as the hash value (Kumar, Sofat, Jain, & Aggarwal, 2012). Several hash algorithms
are used in digital forensics. The most popular ones are the Secure Hash Algorithm (SHA) family
and Message Digest Algorithm 5 (MD5).
The MD5 algorithm was designed by Ronald Rivest in 1991 as a successor to the MD4 algorithm, to
provide the means for digital signature verification. It was one of the first hashing algorithms
to be adopted globally. MD5 is a cryptographic hash algorithm that generates a 128-bit digest
from an input of any length and represents the digest as a 32-digit hexadecimal number.
Practical collision attacks against MD5 have since been demonstrated, so it is no longer
considered collision-resistant.
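The SHA family is published as a standard by the US National Institute of Standards and
Technology (NIST). It encompasses the SHA-1, SHA-2 and SHA-3 algorithms, which also have their
sub-families. SHA-1 and SHA-2 were designed by the US National Security Agency, while SHA-3 is
based on the Keccak algorithm, which was selected through a public NIST competition. The SHA-2
family includes SHA-224, SHA-256, SHA-384 and SHA-512. SHA-3 includes SHA3-224, SHA3-256,
SHA3-384, and SHA3-512. Today SHA-2 and SHA-3 are in use, as SHA-1 has been deprecated.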
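The digest sizes of the algorithms named above can be checked with a short Python sketch. It
assumes that the standard hashlib module exposes all of them under the names listed in
ALGORITHMS (some hardened systems disable MD5); the helper digest_bits is introduced only for
illustration.

```python
import hashlib

# Algorithm names as hashlib spells them (assumed to be available on this system).
ALGORITHMS = ["md5", "sha1", "sha224", "sha256", "sha384", "sha512",
              "sha3_224", "sha3_256", "sha3_384", "sha3_512"]


def digest_bits(name: str) -> int:
    """Return the digest size of the named algorithm in bits."""
    return hashlib.new(name).digest_size * 8


sizes = {name: digest_bits(name) for name in ALGORITHMS}

for name, bits in sizes.items():
    # Each hexadecimal digit encodes 4 bits, so a 128-bit digest prints as 32 digits.
    hex_length = len(hashlib.new(name, b"evidence").hexdigest())
    print(f"{name:<9} {bits:>3} bits  {hex_length:>3} hex digits")
```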
3.0 RELEVANCE OF HASHING IN THE DIGITAL FORENSICS PROCESS
Hashing and data imaging are two concepts that are fundamental to digital forensics. The most
important rule in digital forensics is never to perform direct examination and analysis on the
original digital evidence. Doing so risks changing file metadata such as the MAC (Modified,
Accessed and Created) timestamps. The evidence may then be declared as tampered with, since
changes have been made to the original file or storage media, and it may be ruled inadmissible
in a court of law and therefore rendered useless. Imaging creates a copy of the digital evidence
on which the investigation can be conducted. Hashing makes it possible to verify integrity, that
is, to show that the original evidence and its image have not been changed or altered in any
way. Forensic laboratories use various imaging tools, such as FTK Imager and EnCase. Generally,
the forensic software tool creates a hash value when the digital evidence is imaged, and this
hash value is used to check the integrity of the evidence. The hash value computed during
imaging should match the hash value recomputed whenever the image is later opened for detailed
analysis; a mismatch indicates that the image has been altered.
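The verification step described above can be sketched in Python as follows. This is a minimal
illustration and not the procedure used by FTK Imager or EnCase: the functions hash_file and
verify_image are hypothetical helpers, SHA-256 is an assumed choice of algorithm, and a small
temporary file stands in for a disk image.

```python
import hashlib
import os
import tempfile


def hash_file(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large images need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_image(path: str, recorded_hash: str, algorithm: str = "sha256") -> bool:
    """Return True if the file still matches the hash recorded at acquisition."""
    return hash_file(path, algorithm) == recorded_hash.lower()


with tempfile.TemporaryDirectory() as workdir:
    image_path = os.path.join(workdir, "evidence.img")  # stand-in for a real image
    with open(image_path, "wb") as handle:
        handle.write(b"\x00" * 4096 + b"simulated disk contents")

    acquisition_hash = hash_file(image_path)              # recorded at imaging time
    intact = verify_image(image_path, acquisition_hash)   # checked before analysis

    # Simulate tampering by changing a single byte of the image.
    with open(image_path, "r+b") as handle:
        handle.seek(10)
        handle.write(b"\x01")
    still_matches = verify_image(image_path, acquisition_hash)

print("unmodified image verifies:", intact)
print("modified image verifies:", still_matches)
```

Changing a single byte is enough to make the recomputed hash differ from the recorded one.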
4.0 REQUIREMENTS OF A STRONG HASH ALGORITHM
The following are the requirements of a strong hash algorithm:
4.0.1 Irreversibility: Hashing algorithms are one-way functions, so it should be computationally
infeasible to work out the original input from the hash value. This means that you can easily
convert an input into a hash but you cannot derive the input from its hash value.
4.0.2 Determinism: A hashing algorithm must always produce the same hash value for the same
input. The output length should also be the same regardless of the size of the input. This helps
to keep attackers from knowing how large the original input was, because all outputs are
fixed-length no matter how long or short the original input is.
4.0.3 Collision Resistance: In hashing, a collision is said to have occurred when two different
inputs produce the same hash value. Hashing algorithms must be resistant to collision attacks,
since an attacker who can produce a collision can substitute one input for another without
changing the hash value (Rasjid, Soewito, Witjaksono, & Abdurachman, 2017).
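A collision can be demonstrated in Python by deliberately weakening a hash. The sketch below
assumes SHA-256 truncated to its first 4 hexadecimal digits (16 bits), which is an artificial
construction used only for illustration. With only 65,536 possible outputs a collision is found
quickly, whereas the same search against the full 256-bit digest is computationally infeasible.
The helper names and the "file-N" messages are hypothetical.

```python
import hashlib


def truncated_hash(data: bytes, hex_chars: int = 4) -> str:
    """Return only the first hex_chars digits of SHA-256 (a deliberately weak hash)."""
    return hashlib.sha256(data).hexdigest()[:hex_chars]


def find_collision(hex_chars: int = 4):
    """Try successive messages until two of them share a truncated hash."""
    seen = {}
    counter = 0
    while True:
        message = f"file-{counter}".encode()
        tag = truncated_hash(message, hex_chars)
        if tag in seen:
            return seen[tag], message
        seen[tag] = message
        counter += 1


first, second = find_collision()
print(first, second, "share the truncated hash", truncated_hash(first))
```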
4.0.4 Avalanche Effect: The smallest change in a file should result in a significant change in the
hash value. Figure 2 is a demonstration of the avalanche effect.
Figure 2: Demonstration of the avalanche effect
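The avalanche effect can also be measured directly. The following Python sketch assumes SHA-256
and two illustrative inputs that differ in a single character; the helper bit_difference is
introduced only for this example and counts how many output bits differ between the two digests.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


original = hashlib.sha256(b"The quick brown fox").digest()
modified = hashlib.sha256(b"The quick brown fix").digest()  # one character changed

changed = bit_difference(original, modified)
print(f"{changed} of {len(original) * 8} bits differ")
```

For a strong hash function, roughly half of the output bits are expected to change.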
4.0.5 Speed: For most uses, such as verifying the integrity of a large file or disk image, the
hashing function should compute hashes at an extremely high speed. However, not all hashing
functions are supposed to be quick. Some applications require hashing functions to be slow. This
is seen in the calculation of a password hash. A user who enters the password of his/her account
still expects to log in almost instantly, but the calculation is deliberately made slower so
that it is harder and more time-consuming for attackers to brute-force users’ passwords or carry
out rainbow table attacks if the password hash database gets stolen.
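A deliberately slow password hash can be sketched in Python with PBKDF2 from the standard
hashlib module. This is a minimal illustration under stated assumptions: the iteration count,
the 16-byte salt size and the helper names hash_password and verify_password are illustrative
choices and not recommendations from the sources cited in this report.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor; higher values slow each guess further


def hash_password(password: str, salt=None, iterations: int = ITERATIONS):
    """Derive a slow, salted hash of the password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # a random salt defeats precomputed rainbow tables
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, derived


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = ITERATIONS) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, derived = hash_password(password, salt, iterations)
    return hmac.compare_digest(derived, expected)


salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

Each guess costs the attacker the full number of iterations, while a legitimate user pays that
cost only once per login.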
5.0 HASHING AND ENCRYPTION
Hashing and encryption are often confused because both transform data into a form that looks
unintelligible. However, there is a stark difference between the two. Hashing is a one-way
function or process. This means that once an input gets hashed, there is no practical way to
recover it from the hash value. Encryption, on the other hand, is a two-way method. It is an
entirely different process that is designed to be reversed: encrypted data is meant to be
decrypted by someone who holds the correct key, which reverts it to its original form.
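The difference can be illustrated with a toy Python sketch. It assumes a simple XOR cipher with
a random key as long as the message, used here only because it needs no third-party library; it
is not a substitute for a vetted encryption library. The helper xor_bytes and the sample message
are hypothetical.

```python
import hashlib
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte (toy cipher, illustration only)."""
    if len(key) != len(data):
        raise ValueError("key must be the same length as the data")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"seized laptop, exhibit 12"
key = secrets.token_bytes(len(message))

# Encryption is two-way: applying the same key again restores the message.
ciphertext = xor_bytes(message, key)
recovered = xor_bytes(ciphertext, key)

# Hashing is one-way: no key exists that turns the digest back into the message.
digest = hashlib.sha256(message).hexdigest()

print("recovered:", recovered)
print("digest:   ", digest)
```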
REFERENCES
Ke, H.-J., Liu, J., Wang, S.-J., & Goyal, D. (2011). Hash-Algorithms Output for Digital
Evidence in Computer Forensics. 2011 International Conference on Broadband and
Wireless Computing, Communication and Applications (pp. 399-404). Barcelona, Spain:
IEEE. doi:10.1109/BWCCA.2011.65.
Kumar, K., Sofat, S., Jain, S., & Aggarwal, N. (2012). Significance of hash value generation in
digital forensics: A case study. International Journal of Engineering Research and
Development, 2(5), 64-70. Retrieved from
https://www.ijerd.com/paper/vol2-issue5/I02056470.pdf
Rasjid, Z. E., Soewito, B., Witjaksono, G., & Abdurachman, E. (2017). A review of collisions in
cryptographic hash function used in digital forensic tools. Procedia Computer Science, 116,
381-392. doi:10.1016/j.procs.2017.10.072