Lect-8 (Hash Function)
Cryptographic Hash Functions
Hash Function
A hash function is a mathematical algorithm that
takes an input and produces a fixed-size string of
bytes.
It is commonly used in computer science for various
purposes, including data integrity verification, digital
signatures, and password hashing.
One key feature of a secure hash function is that it is
irreversible, meaning the original input cannot feasibly be
determined from the hash value. Additionally, a small
change to the input data should produce a vastly
different hash value.
1 Integrity
Data protected by a hash value should remain unaltered
during transmission or storage; any change shows up as a
mismatched hash.
2 Non-repudiation
Hash functions are used to create digital signatures,
thereby preventing individuals from denying the
authenticity of their digital messages.
3 Authentication
Combined with a key or a digital signature, the hash
function serves as a way to verify the identity of the
sender and ensures the data's origin.
4 Confidentiality
While hash functions mainly focus on ensuring data
integrity, they can also help protect the confidentiality of
data, for example by storing the hash of a password
instead of the password itself.
Why do we need Hash Functions ?
1. Standard Length
When you hash a message, it takes your file or
message of any size, runs it through a mathematical
algorithm, and spits out an output of a fixed length.
In Table 1, we converted the same input message (the letters
CFI) into hash values using three different hash functions
(MD5, SHA-1, and SHA-256). Each of those hash functions
produces an output with a fixed number of hexadecimal
characters: 32 characters (128 bits) for MD5, 40 characters
(160 bits) for SHA-1, and 64 characters (256 bits) for
SHA-256.
It doesn’t matter what we put in as an input: the same hash
function will always produce a hash value that has the
same number of characters. In Table 2, we change
the message each time, but using the same hash function
(SHA-1 in this case), the output is always 40 hexadecimal
characters long.
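The fixed output length can be checked directly. The following is a minimal sketch using Python's standard hashlib module; the helper name hex_digest and the test messages other than CFI are assumptions made for illustration.

```python
import hashlib

def hex_digest(algorithm: str, message: str) -> str:
    """Return the hexadecimal digest of `message` under the named algorithm."""
    return hashlib.new(algorithm, message.encode("utf-8")).hexdigest()

# Same input, three different hash functions.
for name in ("md5", "sha1", "sha256"):
    digest = hex_digest(name, "CFI")
    print(f"{name:>6}: {len(digest)} hex characters  {digest}")

# Different inputs, same hash function: the output length never changes.
sha1_lengths = [len(hex_digest("sha1", m)) for m in ("", "CFI", "CFI" * 1000)]
print(sha1_lengths)  # [40, 40, 40]
```

Each hexadecimal character encodes 4 bits, so 32, 40, and 64 characters correspond to 128, 160, and 256 bits.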
2. Ensure data integrity
Let’s think of an example where you want to send a digital message
or document to someone, and you want to make sure that it hasn’t
been tampered with along the way. You could send it multiple times
and have the recipient verify each copy is the same, but that would
not be feasible if the file or message was very large.
It would be much easier if there was a way of having a shorter and
set number of characters for the sender and receiver to check. And
that’s essentially what a hash function allows two computers to do.
Rather than compare the data in its original (and larger) form, by
comparing the two hashes of the data, computers can quickly
confirm that the data has not been tampered with and changed.
Hash functions, therefore, serve as a checksum or a way for
someone to identify whether digital data has been tampered with
after it’s been created.
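As a rough sketch of this check, the receiver can recompute the hash of what arrived and compare it with the hash the sender published. SHA-256, the helper names, and the sample messages are assumptions for illustration. The sketch also assumes the published hash reaches the receiver over a channel the attacker cannot modify.

```python
import hashlib
import hmac

def sha256_hex(data: bytes) -> str:
    """Hash arbitrary data down to 64 hexadecimal characters."""
    return hashlib.sha256(data).hexdigest()

def is_unmodified(received: bytes, expected_digest: str) -> bool:
    """Compare the hash of the received data with the published hash."""
    # compare_digest runs in constant time, a common habit for digest checks.
    return hmac.compare_digest(sha256_hex(received), expected_digest)

original = b"Pay Alice 100 dollars"
published_digest = sha256_hex(original)  # sent alongside the message

print(is_unmodified(original, published_digest))                  # True
print(is_unmodified(b"Pay Alice 900 dollars", published_digest))  # False
```

Only the short digests are compared, however large the original data is.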
3. Verify authenticity
For example, if you send out an email, it can be intercepted easily
(especially if it is sent over an unsecured WiFi network).
The recipient of the email has no way of knowing whether someone has altered
its contents along the way, which is known as a “Man-in-the-Middle”
(MitM) attack.
However, if the sender hashes the email contents and signs that hash with
their private key, the receiver can check that the email contents have not
been modified after being digitally signed.
To do this, the receiver re-generates the hash of the received email
using the same hash function as the sender, and uses the signer’s public
key to check it against the hash value in the digital signature.
If they match, no one has altered the message. If the
hashes are different, the receiver knows that the contents of the
email are not authentic, because even a small change to the message
produces a completely different hash.
Hash Functions Properties:
There are three central properties which hash functions need
to possess in order to be secure:
1. preimage resistance (or one-wayness)
2. second preimage resistance (or weak collision resistance)
3. collision resistance (or strong collision resistance)
1. Preimage resistance:
2. Second Preimage resistance:
3. Collision resistance:
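The three properties listed above are commonly stated as follows. This is a standard textbook formulation, not taken from the original slides; the symbols h, n, x, and y are notation introduced here.

```latex
% h : {0,1}^* -> {0,1}^n is the hash function; n is the output length in bits.
\begin{align*}
\text{Preimage resistance:} \quad
  & \text{given } y = h(x), \text{ it is infeasible to find any } x' \text{ with } h(x') = y. \\
\text{Second preimage resistance:} \quad
  & \text{given } x_1, \text{ it is infeasible to find } x_2 \neq x_1 \text{ with } h(x_2) = h(x_1). \\
\text{Collision resistance:} \quad
  & \text{it is infeasible to find any pair } x_1 \neq x_2 \text{ with } h(x_1) = h(x_2).
\end{align*}
```

For an ideal n-bit hash function, a generic preimage or second preimage search takes about 2^n evaluations, while a generic collision search takes only about 2^(n/2) evaluations because of the birthday paradox.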
How Does a Hash Function Work?
The “Avalanche Effect”
The data blocks are processed one at a time. The output of the first data block is
fed as input along with the second data block. Consequently, the output of the
second is fed along with the third block, and so on.
Thus, the final output is the combined value of all the blocks. If you change
one bit anywhere in the message, the entire hash value changes. This is called
‘the avalanche effect’.
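The avalanche effect can be observed by flipping a single input bit and counting how many output bits change. This is a minimal sketch using SHA-256; the helper names and the sample message are assumptions for illustration.

```python
import hashlib

def sha256_as_int(data: bytes) -> int:
    """Return the 256-bit SHA-256 digest as one integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of `data` with exactly one bit inverted."""
    flipped = bytearray(data)
    flipped[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(flipped)

message = b"The quick brown fox jumps over the lazy dog"
altered = flip_bit(message, 0)  # differs from `message` in a single bit

# XOR the two digests and count the 1 bits, i.e. the positions that differ.
differing = bin(sha256_as_int(message) ^ sha256_as_int(altered)).count("1")
print(f"{differing} of 256 output bits changed")
```

For a good hash function, roughly half of the output bits are expected to change, whichever input bit is flipped.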
Uniqueness and Determinism
Hash functions must be deterministic, meaning that every time you put in the
same input, it will always create the same output.
The output, or hash value, should also be effectively unique to the exact input.
Because inputs can be arbitrarily long while the output has a fixed length, two
different messages with the same hash must exist, but it should be computationally
infeasible to find them. If two different pieces of data that produce the same
output are found, it is known as a “hash collision,” and the algorithm can no
longer be trusted for security purposes.
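To see why the output length matters for collisions, the sketch below deliberately weakens SHA-256 by keeping only its first 3 bytes (24 bits) and then searches for two inputs with the same truncated hash. The truncation, the helper names, and the counter-based messages are assumptions for illustration; this is not an attack on full SHA-256.

```python
import hashlib

def truncated_hash(data: bytes, num_bytes: int = 3) -> bytes:
    """A deliberately weak hash: only the first `num_bytes` of SHA-256."""
    return hashlib.sha256(data).digest()[:num_bytes]

def find_collision(num_bytes: int = 3):
    """Hash the messages b"0", b"1", b"2", ... until two share a truncated hash."""
    seen = {}  # truncated hash -> message that produced it
    counter = 0
    while True:
        message = str(counter).encode()
        tag = truncated_hash(message, num_bytes)
        if tag in seen:
            return seen[tag], message, counter + 1
        seen[tag] = message
        counter += 1

first, second, tries = find_collision()
print(f"{first!r} and {second!r} collide after {tries} messages")
```

With 24 output bits a collision is expected after a few thousand messages, about 2^12, rather than 2^24. The same birthday effect means a full 256-bit hash needs about 2^128 attempts, which is out of reach in practice.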
Irreversibility
Ideally, hash functions should be irreversible: while it is quick and
easy to compute the hash of a known input message with any given hash
function, it is very difficult to go through the process in reverse and compute the
input message if you only know the hash value.
Design of Hashing Algorithms
The hash value of the first message block becomes an input to the second hash operation,
the output of which alters the result of the third operation, and so on. This chaining
gives rise to the avalanche effect of hashing.
The avalanche effect results in substantially different hash values for two messages that differ
by even a single bit of data.
It is important to distinguish between a hash function and a hashing algorithm. The hash
function generates a hash code by operating on two blocks of fixed-length binary data.
The hashing algorithm is a process for using the hash function: it specifies how the message will
be broken up and how the results from previous message blocks are chained together.
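The split between the hash function and the hashing algorithm can be sketched as a toy chained construction. Everything below is an assumption for illustration: the 8-byte block and state sizes, the padding rule, and the use of truncated SHA-256 as a stand-in for the two-input function. Real designs such as MD5 and SHA-256 use their own dedicated compression functions and much larger states.

```python
import hashlib

BLOCK_SIZE = 8  # bytes per message block (toy value)
STATE_SIZE = 8  # bytes of chaining state (toy value)

def compress(state: bytes, block: bytes) -> bytes:
    """The 'hash function': two fixed-length inputs, one fixed-length output."""
    # Stand-in only: truncated SHA-256 over the state followed by the block.
    return hashlib.sha256(state + block).digest()[:STATE_SIZE]

def pad(message: bytes) -> bytes:
    """Append a marker byte, zero fill, and the 8-byte message length."""
    padded = message + b"\x80"
    padded += b"\x00" * (-len(padded) % BLOCK_SIZE)
    return padded + len(message).to_bytes(8, "big")

def toy_hash(message: bytes) -> str:
    """The 'hashing algorithm': split into blocks and chain the results."""
    state = b"\x00" * STATE_SIZE  # fixed initial value
    data = pad(message)
    for start in range(0, len(data), BLOCK_SIZE):
        # The output for the previous block is an input for the next block.
        state = compress(state, data[start:start + BLOCK_SIZE])
    return state.hex()

print(toy_hash(b"hash functions"))
print(toy_hash(b"hash functionz"))
```

Because every block's result feeds into the next call, a change in an early block propagates through all later steps to the final value.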
Popular Hash Functions
1. Message Digest (MD)
MD5 was the most popular and widely used hash function for
quite some years.
• The MD family comprises the hash functions MD2, MD4,
MD5 and MD6. MD5 is specified in RFC 1321 and is a
128-bit hash function.
• MD5 digests have been widely used in the software world to
provide assurance about the integrity of transferred files. For
example, file servers often provide a pre-computed MD5
checksum for the files, so that a user can compare the
checksum of the downloaded file to it.
• In 2004, collisions were found in MD5. An analytical attack
was reported to succeed in only about an hour.
Examples:
Applications of Hash Functions
Hash functions have several direct applications based on
their cryptographic properties. The most common
cryptographic applications are:
1. Password Storage
Hash functions provide protection to password storage.
• Instead of storing passwords in the clear, almost all logon
processes store the hash values of passwords in a file.
• The password file consists of a table of pairs which are in
the form (user id, h(P)).
• The process of logon is depicted in the following
illustration.
• Even an intruder who reads the password file sees only the
hashes of the passwords.
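The (user id, h(P)) table can be sketched as follows. This is an illustrative sketch, not the scheme from the slides: the per-user random salt, the PBKDF2 function from Python's hashlib, the iteration count, and all names are assumptions added here, reflecting common practice for making stored password hashes harder to guess.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor; higher means slower guessing

password_file = {}  # user id -> (salt, h(P)); the password itself is never stored

def hash_password(password: str, salt: bytes) -> bytes:
    """Compute h(P) with salted PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

def register(user_id: str, password: str) -> None:
    salt = os.urandom(16)  # fresh random salt for each user
    password_file[user_id] = (salt, hash_password(password, salt))

def logon(user_id: str, password: str) -> bool:
    """Hash the entered password and compare it with the stored hash."""
    if user_id not in password_file:
        return False
    salt, stored_hash = password_file[user_id]
    return hmac.compare_digest(stored_hash, hash_password(password, salt))

register("alice", "correct horse battery staple")
print(logon("alice", "correct horse battery staple"))  # True
print(logon("alice", "wrong guess"))                   # False
```

At logon the system never needs the stored password, only the hash of what the user typed.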
3. Signature Generation and Verification
Verifying signatures is a mathematical process
used to verify the authenticity of digital
documents or messages. A valid digital
signature, where the prerequisites are satisfied,
gives its receiver strong proof that a known
sender created the message and that it was not
altered in transit.
A digital signature scheme typically consists of
three algorithms: a key generation algorithm; a
signing algorithm that, given a message and a
private key, produces a signature; and a
verification algorithm that, given the message,
the public key, and the signature, accepts or
rejects the signature.
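The three algorithms can be illustrated with a toy hash-then-sign scheme based on textbook RSA. This is a teaching sketch only and is not secure: the two small Mersenne primes, the exponent 65537, the lack of padding, and all names are assumptions chosen to keep the example short (real deployments use keys of 2048 bits or more and a padding scheme). It requires Python 3.8 or later for the modular inverse via pow.

```python
import hashlib

# Key generation (toy parameters): two known Mersenne primes.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent

def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce the digest into the range of the modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Signing: apply the private key to the hash of the message."""
    return pow(digest_as_int(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Verification: recover the signed hash with the public key and compare."""
    return pow(signature, E, N) == digest_as_int(message)

signature = sign(b"transfer 10 coins to Bob")
print(verify(b"transfer 10 coins to Bob", signature))  # True
print(verify(b"transfer 99 coins to Bob", signature))  # False
```

Note that the signature is computed over the hash of the message, not over the message itself, so the security of the signature depends on the properties of the hash function.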
Hash Function and Cryptographic Hash Function ?
Exercises
• Exercise: Explain the purpose of the collision-resistance requirement for the hash function
used in a digital signature scheme.