M - 3. Cryptographic Hash Functions
CRYPTOGRAPHIC HASH FUNCTIONS
• A hash function is a mathematical function that converts an input
value (treated as a number or bit string) into a compressed
numerical value. The input may be of arbitrary length, but the
output is always a fixed-length hash value h = H(M).
• Values returned by a hash function are called message digests or
simply hash values.
• In general terms, the principal object of a hash function is data
integrity. A change to any bit or bits in M results, with high
probability, in a change to the hash code.
• The kind of hash function needed for security applications is
referred to as a cryptographic hash function.
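The fixed output length and the sensitivity to small input changes can be seen in a short sketch. It assumes SHA-256 from Python's standard hashlib module as the hash function H; the helper names digest_hex and differing_bits are made up for this illustration.

```python
import hashlib

def digest_hex(message: bytes) -> str:
    """Return h = H(M) as a hex string, with SHA-256 standing in for H."""
    return hashlib.sha256(message).hexdigest()

def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

short = digest_hex(b"a")
long_ = digest_hex(b"a" * 10_000)
changed = digest_hex(b"b")  # 'a' (0x61) and 'b' (0x62) differ in two input bits

# Each hex digit carries 4 bits, so both lengths are 256 bits.
print(len(short) * 4, len(long_) * 4)
# Number of digest bits that changed after the small input change.
print(differing_bits(short, changed))
```

For a well-behaved hash function, roughly half of the digest bits are expected to change, although the exact count depends on the inputs.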
• A cryptographic hash function is an algorithm for which it is
computationally infeasible to find either
• (a) a data object that maps to a pre-specified hash result (the
one-way property) or
• (b) two data objects that map to the same hash result (the
collision-free property).
Because of these characteristics, hash functions are often used
to determine whether or not data has changed.
• The goal of any message digest function is to produce a digest
that appears to be random.
• The integrity check helps the user detect any changes made to the
original file.
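A minimal sketch of such an integrity check follows. It assumes SHA-256 as the digest function and that a trusted copy of the digest has been published separately; file_digest and is_unchanged are hypothetical helper names.

```python
import hashlib
import hmac

def file_digest(data: bytes) -> str:
    """Digest of the file contents (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(data).hexdigest()

def is_unchanged(data: bytes, published_digest: str) -> bool:
    """Recompute the digest and compare it with the published one."""
    return hmac.compare_digest(file_digest(data), published_digest)

original = b"release-1.0 contents"
published = file_digest(original)

print(is_unchanged(original, published))         # True
print(is_unchanged(original + b"!", published))  # False
```

A plain digest only detects changes if the published digest itself is trustworthy; an attacker who can replace both the file and the digest is not detected.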
MESSAGE AUTHENTICATION CODE
Message Authentication is concerned with:
• protecting the integrity of a message
• validating identity of originator
• non-repudiation of origin (dispute resolution)
Security requirements (attacks that must be addressed):
• Disclosure
• Traffic analysis
• Masquerade
• Content modification
• Sequence modification
• Timing modification
• Source repudiation
• Destination repudiation
MESSAGE ENCRYPTION
• Message encryption by itself also provides a measure of
authentication
• If symmetric encryption is used then:
• the receiver knows the sender must have created it
• since only the sender and receiver know the key used
• the receiver knows the content cannot have been altered
• if the message has suitable structure, redundancy or a checksum
to detect any changes
• If public-key encryption is used:
• encryption provides confidentiality but not authentication
• since anyone potentially knows the public key
• Source (A) encrypts the message (M) using the public key of
destination (B)
• Since B has its own private key, only B can decrypt the message
• Any opponent can use B's public key to encrypt a message and
claim to be A.
• If public-key encryption is used:
• to provide authentication
• A uses its private key to encrypt the message (M)
• B uses A's public key to decrypt it
• If public-key encryption is used:
• to provide both confidentiality and authentication,
• A can encrypt M first using its private key, which provides the
digital signature, and then using B’s public key, which provides
confidentiality
• The disadvantage of this approach is that the public-key
algorithm, which is complex, must be exercised four times rather
than two in each communication
MESSAGE AUTHENTICATION CODE
• generated by an algorithm that creates a small fixed-sized block
• depending on both message and some key
• like encryption though need not be reversible
• appended to message as a signature
• receiver performs same computation on message and checks it
matches the MAC
• provides assurance that message is unaltered and comes from
sender
• on its own, the MAC provides authentication but not confidentiality
• can also use encryption for secrecy
• generally use separate keys for each
• can compute MAC either before or after encryption
• is generally regarded as better done before
• why use a MAC?
• sometimes only authentication is needed
• sometimes authentication needs to persist longer than the
encryption (e.g. archival use)
• note that a MAC is not a digital signature
MAC PROPERTIES
• a MAC is a cryptographic checksum
MAC = C(K, M)
• condenses a variable-length message M
• using a secret key K
• to a fixed-sized authenticator
• is a many-to-one function
• potentially many messages have same MAC
• but finding these needs to be very difficult
MAC BASED ON HASH FUNCTION
• HMAC (Hash-based Message Authentication Code) is a type of a
message authentication code (MAC) that is acquired by
executing a cryptographic hash function on the data (that is) to
be authenticated and a secret shared key.
• Like any MAC, it is used for both data integrity and
authentication. The cryptographic hash function may be MD5,
SHA-1, or SHA-256.
• Digital signatures are similar to HMACs in that both employ a
hash function and a key. The difference lies in the keys: HMACs
use a symmetric key (the same copy shared by both parties), while
signatures use an asymmetric key pair (two different keys).
• HMAC has been issued as RFC 2104, has been chosen as the
mandatory-to-implement MAC for IP security, and is used in
other Internet protocols, such as SSL.
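As a minimal usage sketch, assuming HMAC with SHA-256 from Python's standard hmac and hashlib modules: the sender computes a tag over the message with the shared key, and the receiver recomputes it and compares. The names make_tag and verify_tag are made up for this example.

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> bytes:
    """Sender side: compute the HMAC tag that is sent along with the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

shared_key = b"example shared secret"  # illustrative value only
message = b"transfer 100 to account 42"
tag = make_tag(shared_key, message)

print(verify_tag(shared_key, message, tag))                        # True
print(verify_tag(shared_key, b"transfer 900 to account 42", tag))  # False
```

Because both parties hold the same key, either of them could have produced the tag, which is why a MAC does not provide non-repudiation in the way a digital signature does.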
HMAC DESIGN OBJECTIVES
RFC 2104 lists the following design objectives for HMAC.
• To use, without modifications, available hash functions. In
particular, to use hash functions that perform well in software and
for which code is freely and widely available.
• To allow for easy replaceability of the embedded hash function in
case faster or more secure hash functions are found or required.
• To preserve the original performance of the hash function without
incurring a significant degradation.
• To use and handle keys in a simple way.
• To have a well understood cryptographic analysis of the strength
of the authentication mechanism based on reasonable assumptions
about the embedded hash function.
HMAC STRUCTURE
H = embedded hash function (e.g., MD5, SHA-1, RIPEMD-160)
IV = initial value input to hash function
M = message input to HMAC (including the padding specified in
the embedded hash function)
Y_i = i-th block of M, 0 <= i <= (L - 1)
L = number of blocks in M
b = number of bits in a block
n = length of hash code produced by embedded hash function
K = secret key; recommended length is >= n; if the key length is
greater than b, the key is input to the hash function to produce an
n-bit key
K+ = K padded with zeros so that the result is b bits in length
(RFC 2104 appends the zeros to the end of K)
ipad = 00110110 (36 in hexadecimal) repeated b/8 times
opad = 01011100 (5C in hexadecimal) repeated b/8 times
Then HMAC can be expressed as
HMAC(K, M) = H[(K+ ⊕ opad) || H[(K+ ⊕ ipad) || M]]
HMAC ALGORITHM
We can describe the algorithm as follows.
1. Append zeros to the end of K to create a b-bit string K+ (e.g.,
if K is of length 160 bits and b = 512, then 352 zero bits, i.e.
44 zero bytes, are appended).
2. XOR (bitwise exclusive-OR) K+ with ipad to produce the b-bit
block S_i.
3. Append M to S_i.
4. Apply H to the stream generated in step 3.
5. XOR K+ with opad to produce the b-bit block S_o.
6. Append the hash result from step 4 to S_o.
7. Apply H to the stream generated in step 6 and output the result.
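The seven steps can be sketched directly in Python and compared with the standard hmac module. This is an illustrative sketch, not a replacement for the library: the function name hmac_steps is made up, SHA-256 is assumed as the default embedded hash, and the zero bytes are appended to the end of K, which is what RFC 2104 specifies.

```python
import hashlib
import hmac

def hmac_steps(key: bytes, message: bytes, hash_name: str = "sha256") -> bytes:
    """Compute HMAC(K, M) = H[(K+ xor opad) || H[(K+ xor ipad) || M]] step by step."""
    def H(data: bytes) -> bytes:
        return hashlib.new(hash_name, data).digest()

    b = hashlib.new(hash_name).block_size  # block size in bytes (64 for SHA-256)
    if len(key) > b:                       # keys longer than a block are hashed first
        key = H(key)
    k_plus = key + bytes(b - len(key))     # step 1: pad K with zero bytes to b bytes
    s_i = bytes(x ^ 0x36 for x in k_plus)  # step 2: K+ xor ipad
    inner = H(s_i + message)               # steps 3 and 4
    s_o = bytes(x ^ 0x5C for x in k_plus)  # step 5: K+ xor opad
    return H(s_o + inner)                  # steps 6 and 7

key, msg = b"secret key", b"message to authenticate"
print(hmac_steps(key, msg) == hmac.new(key, msg, hashlib.sha256).digest())  # True
```

Only the embedded hash function is called, twice and without modification, which matches the design objective of using available hash functions as they are.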
MAC BASED ON BLOCK CIPHER
Cipher-based message authentication codes (CMACs) are a tool
for calculating message authentication codes using a block cipher
coupled with a secret key. A CMAC can be used to verify both the
integrity and the authenticity of a message.
First, let us define the operation of CMAC when the message is an
integer multiple n of the cipher block length b. For AES, b = 128,
and for triple DES, b = 64.
The message is divided into n blocks (M1, M2, …, Mn). The
algorithm makes use of a k-bit encryption key K and a b-bit
constant, K1. For AES, the key size k is 128, 192, or 256 bits; for
triple DES, the key size is 112 or 168 bits.
CMAC is calculated as follows:
C_1 = E(K, M_1)
C_2 = E(K, [M_2 ⊕ C_1])
C_3 = E(K, [M_3 ⊕ C_2])
...
C_n = E(K, [M_n ⊕ C_{n-1} ⊕ K1])
T = MSB_Tlen(C_n)
where
T = message authentication code, also referred to as the tag
Tlen = bit length of T
MSB_s(X) = the s leftmost bits of the bit string X
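The chaining structure above can be sketched as follows for messages that are an exact multiple of the block length. Python's standard library has no AES, so this sketch uses a hypothetical stand-in, toy_encrypt_block, built from SHA-256. It is NOT a real block cipher and offers no security guarantee; it only lets the chaining run. A real implementation would pass AES or triple DES as encrypt_block, and K1 would come from the subkey derivation. Tlen is assumed to be a multiple of 8.

```python
import hashlib

BLOCK_BYTES = 16  # b = 128 bits

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Stand-in for E(K, X). Not a real block cipher, for illustration only."""
    return hashlib.sha256(key + block).digest()[:BLOCK_BYTES]

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cmac_full_blocks(key, k1, message, tlen_bits=64, encrypt_block=toy_encrypt_block):
    """Chain C_i = E(K, M_i xor C_{i-1}); the last block is also XORed with K1."""
    if len(message) == 0 or len(message) % BLOCK_BYTES != 0:
        raise ValueError("this sketch handles only whole, non-empty blocks")
    blocks = [message[i:i + BLOCK_BYTES] for i in range(0, len(message), BLOCK_BYTES)]
    c = bytes(BLOCK_BYTES)  # C_0 = 0, so that C_1 = E(K, M_1)
    for m in blocks[:-1]:
        c = encrypt_block(key, xor_bytes(m, c))
    c = encrypt_block(key, xor_bytes(xor_bytes(blocks[-1], c), k1))
    return c[:tlen_bits // 8]  # T = MSB_Tlen(C_n)

key = b"k" * 16
k1 = bytes(range(16))  # placeholder value for the constant K1
tag = cmac_full_blocks(key, k1, bytes(32))
print(len(tag) * 8)  # 64
```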
If the message is not an integer multiple of the cipher block length,
then the final block is padded to the right (least significant bits) with
a 1 and as many 0s as necessary so that the final block is also of
length b. The CMAC operation then proceeds as before, except that
a different b-bit key K2 is used instead of K1.
The two b-bit keys are derived from the k-bit encryption key as
follows.
L = E(K, 0^b)
K1 = L · x
K2 = L · x^2 = (L · x) · x
where multiplication (·) is done in the finite field GF(2^b), 0^b is
the block of b zero bits, and x and x^2 are first- and second-order
polynomials that are elements of GF(2^b).
Thus, the binary representation of x consists of b - 2 zeros followed
by 10; the binary representation of x^2 consists of b - 3 zeros
followed by 100.
The finite field is defined with respect to an irreducible polynomial
that is lexicographically first among all such polynomials with the
minimum possible number of nonzero terms.
For the two approved block sizes, the polynomials are
x^64 + x^4 + x^3 + x + 1 and x^128 + x^7 + x^2 + x + 1.
To generate K1 and K2, the block cipher is applied to the block that
consists entirely of 0 bits. The first subkey is derived from the
resulting ciphertext by a left shift of one bit and, conditionally, by
XORing a constant that depends on the block size. The second
subkey is derived in the same manner from the first subkey.
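The shift-and-conditional-XOR rule can be sketched with plain integers. The constants 0x1B and 0x87 are the low-order terms of the two polynomials above (x^4 + x^3 + x + 1 and x^7 + x^2 + x + 1). The function names are made up for this sketch, and the value of L is taken as an input rather than computed, since that step needs a block cipher. The sample values are the subkey example from RFC 4493 as recalled here, so treat them as an assumption to check against the RFC.

```python
# Constant XORed in when the shifted-out bit is 1, per block size b.
REDUCTION = {64: 0x1B, 128: 0x87}

def multiply_by_x(block: int, b: int = 128) -> int:
    """Multiply a b-bit block by x in GF(2^b): left shift, then conditional XOR."""
    shifted = (block << 1) & ((1 << b) - 1)
    if block >> (b - 1):          # the most significant bit was 1
        shifted ^= REDUCTION[b]
    return shifted

def derive_subkeys(l_block: int, b: int = 128):
    """Return (K1, K2) where K1 = L * x and K2 = K1 * x."""
    k1 = multiply_by_x(l_block, b)
    k2 = multiply_by_x(k1, b)
    return k1, k2

# L = E(K, 0^128) for the example key in RFC 4493 (assumed value, see lead-in).
l_block = 0x7DF76B0C1AB899B33E42F047B91B546F
k1, k2 = derive_subkeys(l_block)
print(f"{k1:032x}")  # fbeed618357133667c85e08f7236a8de
print(f"{k2:032x}")  # f7ddac306ae266ccf90bc11ee46d513b
```

Here the top bit of L is 0, so K1 is a pure shift, while the top bit of K1 is 1, so K2 also picks up the XOR with 0x87.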
MD5 ALGORITHM
MD5 (Message Digest algorithm 5) is a cryptographic hash algorithm
used to generate a 128-bit digest from a string of any length.
Digests are represented as 32-digit hexadecimal numbers.
The digest size is always 128 bits, and by design a minor change in
the input string generates a drastically different digest. This helps
keep distinct inputs from producing the same digest, an event known
as a hash collision.
MD5 ALGORITHM STEPS
1. Padding Bits
The input string is padded so that its length in bits is 64 bits
short of a multiple of 512. The padding consists of a single 1 bit
followed by as many 0 bits as are needed.
2. Padding Length
The final string must be an exact multiple of 512 bits. To achieve
this, the length of the original input (in bits) is expressed as a
64-bit value and appended after the padding bits. Once the two are
combined, the final string is ready to be hashed.
Assuming message length = 112 bits
Number of padding bits required = (512 * 1) – 112 - 64 = 336
Layout of the padded 512-bit block:
message (112 bits) | padding 10…0 (336 bits) | length (64 bits)
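The padding arithmetic of steps 1 and 2 can be checked with a short sketch. It works on whole bytes, so the leading 1 bit plus seven 0 bits become the byte 0x80, and it stores the length low-order byte first, as MD5 does. The function name md5_pad is made up for this example.

```python
def md5_pad(message: bytes) -> bytes:
    """Append 1 bit, then 0 bits, then the 64-bit length, per steps 1 and 2."""
    bit_len = (8 * len(message)) % (1 << 64)
    zero_bytes = (55 - len(message)) % 64  # leaves exactly 8 bytes for the length
    return message + b"\x80" + bytes(zero_bytes) + bit_len.to_bytes(8, "little")

msg = bytes(14)  # a 112-bit message
padded = md5_pad(msg)
padding_bits = 8 * (len(padded) - len(msg) - 8)

print(len(padded) * 8, padding_bits)  # 512 336
```

Padding is always added, even when the message is already 448 bits modulo 512; in that case a full 512 bits of padding are appended.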
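3. Initialize MD Buffer
A 128-bit buffer of four 32-bit words holds the intermediate and
final results. The words are initialized to the following values
(hexadecimal, low-order byte first):
• word A: 01 23 45 67
• word B: 89 AB CD EF
• word C: FE DC BA 98
• word D: 76 54 32 10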
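Because the bytes are listed low-order first, the numeric value of each 32-bit word is the byte-reversed form. A small sketch of the conversion (the dictionary names are made up for this example):

```python
# Initial buffer bytes as listed above, low-order byte first.
MD5_INIT_BYTES = {
    "A": "01234567",
    "B": "89ABCDEF",
    "C": "FEDCBA98",
    "D": "76543210",
}

# Interpret each 4-byte sequence as a little-endian 32-bit integer.
MD5_INIT_WORDS = {
    name: int.from_bytes(bytes.fromhex(hex_bytes), "little")
    for name, hex_bytes in MD5_INIT_BYTES.items()
}

for name, word in MD5_INIT_WORDS.items():
    print(name, f"{word:08X}")
# A 67452301
# B EFCDAB89
# C 98BADCFE
# D 10325476
```

These are the same numeric values that SHA-1 uses for its first four registers, where they are written high-order byte first.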
4. Process Each Block
Each 512-bit block gets broken down further into 16 sub-blocks of
32 bits each. There are four rounds of operations, with each round
utilizing all the sub-blocks, the buffers, and a constant array value.
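[Figure: MD5 message digest generation. The message with its padding
(100…0) and length field forms L × 512 bits. Each 512-bit block is
processed by HMD5 together with the 128-bit MD buffer, starting from
the IV, and the final buffer value is the 128-bit digest.]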
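The non-linear function g applied in each step is different for each
of the four rounds:
• Round 1: (B AND C) OR ((NOT B) AND D)
• Round 2: (B AND D) OR (C AND (NOT D))
• Round 3: B XOR C XOR D
• Round 4: C XOR (B OR (NOT D))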
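[Figure: a single MD5 step. The value g(B, C, D) is added to A together
with X[k] and T[i]; the sum is circularly left-shifted by s bits (CLS_s)
and added to B, and the four words A, B, C, D are then rotated.]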
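Putting steps 1 to 4 together, the whole algorithm fits in a short sketch. The details not given on the slides (the shift amounts, the sine-based table T, the order in which the sub-blocks X[k] are used, and the functions for rounds 2 to 4) are taken from RFC 1321 as recalled here, so this is an illustration rather than a reference implementation. The name md5_sketch is made up, and the result is compared with hashlib.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF
# Left-rotation amounts s: four values per round, repeated four times.
SHIFTS = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
# Constant array T[i] = floor(2^32 * |sin(i + 1)|).
T = [int(abs(math.sin(i + 1)) * 2**32) & MASK for i in range(64)]

def rotl(x: int, s: int) -> int:
    """Circular left shift of a 32-bit word by s bits."""
    return ((x << s) | (x >> (32 - s))) & MASK

def md5_sketch(message: bytes) -> bytes:
    a0, b0, c0, d0 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
    bit_len = (8 * len(message)) & 0xFFFFFFFFFFFFFFFF
    padded = message + b"\x80" + bytes((55 - len(message)) % 64) + struct.pack("<Q", bit_len)
    for off in range(0, len(padded), 64):
        x = struct.unpack("<16I", padded[off:off + 64])  # 16 sub-blocks of 32 bits
        a, b, c, d = a0, b0, c0, d0
        for i in range(64):
            if i < 16:    # round 1
                g, k = (b & c) | (~b & d), i
            elif i < 32:  # round 2
                g, k = (b & d) | (c & ~d), (5 * i + 1) % 16
            elif i < 48:  # round 3
                g, k = b ^ c ^ d, (3 * i + 5) % 16
            else:         # round 4
                g, k = c ^ (b | ~d), (7 * i) % 16
            total = (a + g + x[k] + T[i]) & MASK
            a, b, c, d = d, (b + rotl(total, SHIFTS[i])) & MASK, b, c
        # Add this block's result into the running buffer.
        a0, b0, c0, d0 = [(u + v) & MASK for u, v in zip((a0, b0, c0, d0), (a, b, c, d))]
    return struct.pack("<4I", a0, b0, c0, d0)

print(md5_sketch(b"abc").hex() == hashlib.md5(b"abc").hexdigest())  # True
```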
SHA 1
Developed by NIST (National Institute of Standards and Technology).
SHA-1 logic:
The algorithm takes as input a message with a maximum length of
less than 2^64 bits and produces a 160-bit message digest.
The input is processed in 512-bit blocks.
Processing Steps :
Step 1 : Append padding bits (Same as MD5)
Step 2 : Append length (Same as MD5)
Step 3 : Initialize MD buffer.
A 160-bit buffer is used to hold intermediate and final results of the
hash function.
It is represented as five 32-bit registers {A, B, C, D, E}.
The initial register values (hexadecimal) are:
A = 67 45 23 01
B = EF CD AB 89
C = 98 BA DC FE
D = 10 32 54 76
E = C3 D2 E1 F0
Step 4 : Process the message in 512 bit blocks.
The compression function consists of four rounds.
Each round consists of 20 processing steps.
The four rounds have a similar structure, but each uses a different
primitive logical function f1, f2, f3, and f4.
Each round takes as an input the current 512-bit block being
processed Yq and the 160-bit buffer value {ABCDE} and updates the
contents of the buffer.
Each step t also uses an additive constant K_t (hexadecimal):
◦ For 0 ≤ t ≤ 19: K_t = 5A827999
◦ For 20 ≤ t ≤ 39: K_t = 6ED9EBA1
◦ For 40 ≤ t ≤ 59: K_t = 8F1BBCDC
◦ For 60 ≤ t ≤ 79: K_t = CA62C1D6
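The four constants are not arbitrary. Each is the integer part of 2^30 times the square root of 2, 3, 5 and 10 respectively. This fact is not stated on the slides, and the short sketch below checks it with exact integer arithmetic. The function name sha1_constant is made up for this example.

```python
import math

def sha1_constant(n: int) -> int:
    """Return floor(2^30 * sqrt(n)), computed exactly as isqrt(n * 2^60)."""
    return math.isqrt(n << 60)

for first_step, n in ((0, 2), (20, 3), (40, 5), (60, 10)):
    print(f"steps {first_step}..{first_step + 19}: K_t = {sha1_constant(n):08X}")
# steps 0..19: K_t = 5A827999
# steps 20..39: K_t = 6ED9EBA1
# steps 40..59: K_t = 8F1BBCDC
# steps 60..79: K_t = CA62C1D6
```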
How are the 32-bit word values W_t derived from the 512-bit
message?
The first sixteen values of W_t are taken directly from the 16 words
of the current block, and the remaining values are defined as
W_t = S^1(W_{t-16} ⊕ W_{t-14} ⊕ W_{t-8} ⊕ W_{t-3})
i.e. for the remaining 64 steps, the value W_t consists of the circular
left shift by one bit of the XOR of four of the preceding values of W_t.
A single step of the SHA-1 operation :
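The single step, together with the padding, the message schedule W_t and the constants K_t, can be sketched as a compact function and compared with hashlib. The round functions f1 to f4 are not spelled out on the slides; the forms used here are the ones from the SHA-1 standard as recalled, so treat this as an illustration under that assumption. The name sha1_sketch is made up.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF

def rotl(x: int, s: int) -> int:
    """Circular left shift S^s of a 32-bit word."""
    return ((x << s) | (x >> (32 - s))) & MASK

def sha1_sketch(message: bytes) -> bytes:
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]
    # Same padding rule as MD5, but the 64-bit length is stored high-order byte first.
    padded = message + b"\x80" + bytes((55 - len(message)) % 64) + struct.pack(">Q", 8 * len(message))
    for off in range(0, len(padded), 64):
        w = list(struct.unpack(">16I", padded[off:off + 64]))
        for t in range(16, 80):  # message schedule for the remaining 64 steps
            w.append(rotl(w[t - 16] ^ w[t - 14] ^ w[t - 8] ^ w[t - 3], 1))
        a, b, c, d, e = h
        for t in range(80):
            if t < 20:
                f, k = (b & c) | (~b & d), 0x5A827999
            elif t < 40:
                f, k = b ^ c ^ d, 0x6ED9EBA1
            elif t < 60:
                f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6
            # One step: new A = S^5(A) + f(B, C, D) + E + W_t + K_t; B is rotated by 30.
            temp = (rotl(a, 5) + f + e + w[t] + k) & MASK
            a, b, c, d, e = temp, a, rotl(b, 30), c, d
        h = [(u + v) & MASK for u, v in zip(h, (a, b, c, d, e))]
    return struct.pack(">5I", *h)

print(sha1_sketch(b"abc").hex() == hashlib.sha1(b"abc").hexdigest())  # True
```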