Chapter 10 Crypto 33
Message Authentication
Encryption protects against passive attack (eavesdropping). A different requirement is to
protect against active attack (falsification of data and transactions). Protection against such
attacks is known as message authentication.
A message, file, document, or other collection of data is said to be authentic when it is
genuine and came from its alleged source. Message authentication is a procedure that allows
communicating parties to verify that a received message is authentic. The two important aspects
are to verify that the contents of the message have not been altered and that the source is
authentic. We may also wish to verify a message’s timeliness (that it has not been artificially delayed
and replayed) and its sequence relative to other messages flowing between the two parties.
One authentication technique involves the use of a secret key to generate a small block of
data, known as a message authentication code (MAC), that is appended to the message. This technique
assumes that two communicating parties, say A and B, share a common secret key $K_{AB}$. When
A has a message M to send to B, it calculates the message authentication code as a function of
the message and the key: $MAC_M = F(K_{AB}, M)$. The message plus code are transmitted to the
intended recipient. The recipient performs the same calculation on the received message,
using the same secret key, to generate a new message authentication code. The received code
is compared to the calculated code. If we assume that only the receiver and the sender know
the key, and if the received code matches the calculated code, then
1. The receiver is assured that the message has not been altered.
2. The receiver is assured that the message is from the alleged sender. Because no one
else knows the secret key, no one else could prepare a message with a proper code.
3. If the message includes a sequence number, then the receiver can be assured of the
proper sequence, because an attacker cannot successfully alter the sequence number.
A number of algorithms could be used to generate the code. The National Bureau of
Standards, in its publication DES Modes of Operation, recommends the use of the Data
Encryption Algorithm (DEA).
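The send-and-verify procedure described above can be sketched in Python. This is a minimal illustration, not the DEA-based scheme the text refers to: HMAC-SHA-256 from the standard library is assumed as a stand-in for the function F, and the key and message values are invented for the example.

```python
import hashlib
import hmac

def make_mac(key: bytes, message: bytes) -> bytes:
    """Compute MAC_M = F(K_AB, M); HMAC-SHA-256 is an assumed stand-in for F."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_mac(key: bytes, message: bytes, received_mac: bytes) -> bool:
    """Recompute the code over the received message and compare it with the received code."""
    calculated = make_mac(key, message)
    # compare_digest compares the two codes in constant time
    return hmac.compare_digest(calculated, received_mac)

# Illustrative values only: the shared key K_AB and a message carrying a sequence number.
k_ab = b"shared secret key (illustrative)"
message = b"seq=1;pay 100 to B"

mac = make_mac(k_ab, message)                             # A computes the code
ok = verify_mac(k_ab, message, mac)                       # B accepts the genuine message
tampered_ok = verify_mac(k_ab, b"seq=1;pay 900 to B", mac)  # altered message is rejected
print(ok, tampered_ok)
```

Because the sequence number is part of the authenticated message, changing it invalidates the code in the same way as changing any other byte.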
[Fig. 10.1: message authentication using a MAC. Diagram labels: Message, K, MAC Algorithm, MAC, Transmit, Compare.]
function to the received message) and comparing the result with the hash value that was
prepended to the received message. If the two hash values agree, the probability is high
that there was no manipulation; if the two disagree, it is certain that some manipulation or
corruption in the data took place. Thus hash functions generate a sort of error-detecting
code.
The figure shows a technique that uses a hash function. This technique assumes that two
communicating parties, say A and B, share a common secret value $S_{AB}$. When A has a
message to send to B, it calculates the hash function over the concatenation of the secret
value and the message: $MD_M = H(S_{AB} \| M)$. It then sends $[M \| MD_M]$ to B. Because B
possesses $S_{AB}$, it can recompute $H(S_{AB} \| M)$ and verify $MD_M$. Because the secret value
itself is not sent, it is not possible for an attacker to modify an intercepted message. As
long as the secret value remains secret, it is also not possible for an attacker to generate a
false message.
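The exchange just described can be sketched as follows. This is a minimal sketch under stated assumptions: SHA-256 is an assumed choice for H, and the function names and the secret value are hypothetical.

```python
import hashlib
import hmac

def message_digest(secret: bytes, message: bytes) -> bytes:
    """MD_M = H(S_AB || M); SHA-256 is an assumed choice for H."""
    return hashlib.sha256(secret + message).digest()

def send(secret: bytes, message: bytes) -> tuple:
    """A transmits the pair [M || MD_M]; the secret itself is never sent."""
    return message, message_digest(secret, message)

def receive(secret: bytes, message: bytes, received_md: bytes) -> bool:
    """B recomputes H(S_AB || M) and compares it with the received digest."""
    return hmac.compare_digest(message_digest(secret, message), received_md)

s_ab = b"illustrative shared secret"          # hypothetical value of S_AB
m, md = send(s_ab, b"meet at noon")
genuine = receive(s_ab, m, md)                # unmodified message
modified = receive(s_ab, b"meet at nine", md) # message changed in transit
print(genuine, modified)
```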
A variation on the third technique, called HMAC, is the one adopted for IP security.
[Figure: message authentication using a hash function and a shared secret value. Diagram labels: Alice, Bob, message, $S_{AB}$, H, $MD_M$, Channel carrying $[M \| MD_M]$, Compare.]
Example 1. Grace and Alan are concerned that their e-mails may be intercepted and
modified in transit. They agree to compute a hash value H(x) of an e-mail message x as
follows. Group the letters (ignoring spaces and punctuation) into five-letter blocks and pad
if necessary at the end to make the message length exactly a multiple of 5. Then, treating
the letters as representing numbers in the range 0 to 25, they sum letters 1, 6, 11, ... modulo 26
to obtain the first hash letter $y_1$, sum letters 2, 7, 12, ... modulo 26 to obtain a second hash
letter $y_2$, and so on. The five letters representing these five sums are the value of this hash function:
$H(x) = y_1 y_2 y_3 y_4 y_5$. The hash value will be sent first in a preliminary message, and then
the message itself will be sent.
For example, suppose Grace wants to send the message:
It is much easier to apologize than it is to get permission.
x = ITISM UCHEA SIERT OAPOL OGIZE THANI TISTO GETPE RMISS IONXX
Grace then computes the hash letters by
$y_1$ = I+U+S+O+O+T+T+G+R+I = N (mod 26)
$y_2$ = T+C+I+A+G+H+I+E+M+O = C (mod 26)
$y_3$ = I+H+E+P+I+A+S+T+I+N = W (mod 26)
$y_4$ = S+E+R+O+Z+N+T+P+S+X = K (mod 26)
$y_5$ = M+A+T+L+E+I+O+E+S+X = J (mod 26)
She first sends NCWKJ and then she sends the plaintext unencrypted.
On the other end, after receiving a hash value and a message, Alan sums the message
letters in the same way to produce a hash word. He compares this with the one that preceded the
message. If the two are the same, he regards the message as likely to be the one Grace
sent. If they are different, he is certain that either the message was altered, or the hash
value was altered, or both. In this case, he rejects the received message.
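The hash from Example 1 is small enough to check by machine. The sketch below is one possible Python rendering of the rule given in the example; the function names are hypothetical, the pad letter X is taken from the padded blocks shown in the example, and the message is spelled with "apologize" so that it matches those blocks.

```python
def letters_only(text: str) -> str:
    """Keep only the letters A-Z, ignoring spaces and punctuation."""
    return "".join(ch for ch in text.upper() if "A" <= ch <= "Z")

def toy_hash(text: str, block: int = 5, pad: str = "X") -> str:
    """Sum letters 1, 6, 11, ... (then 2, 7, 12, ... and so on) modulo 26."""
    letters = letters_only(text)
    letters += pad * (-len(letters) % block)      # pad to a multiple of the block length
    sums = [0] * block
    for i, ch in enumerate(letters):
        column = i % block                        # position of the letter within its block
        sums[column] = (sums[column] + ord(ch) - ord("A")) % 26
    return "".join(chr(s + ord("A")) for s in sums)

message = "It is much easier to apologize than it is to get permission."
sent_hash = toy_hash(message)             # Grace sends this first
print(sent_hash)                          # NCWKJ

# Alan recomputes the hash over what he received and compares.
accepted = toy_hash(message) == sent_hash
print(accepted)                           # True
```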
Why does agreement between Alan’s computed and received hash words make it likely
that the message was unaltered? Suppose, for instance, that Evelyn, an opponent, knows
the hashing algorithm, has learned the hash value Grace sent, and would like to alter or
replace the message to Alan. If, for example, she changed the letters EASI to HARD in
the message, then Alan would (you should verify)
compute the hash word MXWNJ and compare it with NCWKJ to see that something was
amiss. If Evelyn picked an intelligible English sentence at random, what would be the
probability that it hashed to NCWKJ? An easier question is: if Evelyn picked a random
string of letters, what would be the probability that it hashed to NCWKJ? Assuming that
the strings that do hash to NCWKJ are in some sense uniformly distributed through all
strings, this probability is
$$\frac{1}{26^5} = \frac{1}{11881376} \approx 8.4 \times 10^{-8}$$
(why?). The subcollection of
these strings hashing to NCWKJ that are also intelligible English sentences is only a tiny
proportion, so the probability of Evelyn choosing an English text at random that hashes to
NCWKJ is small indeed. Of course, knowing the details of the hashing algorithm, Evelyn
might be able to improve her odds significantly. Hash functions are selected so that their
values will be essentially uniformly distributed for the population of expected messages. If
there are M possible messages, and z is a k-bit hash value, then there are about $M/2^k$
messages that hash to z, so the probability of an adversary guessing a message that hashes
to z is about
$$\frac{M/2^k}{M} = \frac{1}{2^k}$$
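The two probabilities above can be reproduced with exact arithmetic. The following is a small sketch, assuming uniformly distributed hash values as the text does; the function names and the sample values of M and k are chosen only for illustration.

```python
from fractions import Fraction

def toy_guess_probability(hash_letters: int = 5, alphabet: int = 26) -> Fraction:
    """Chance that a random string hits one particular five-letter hash word."""
    return Fraction(1, alphabet ** hash_letters)

def k_bit_guess_probability(num_messages: int, k: int) -> Fraction:
    """(M / 2^k) / M: the fraction of the M messages that hash to a given k-bit value."""
    matching = Fraction(num_messages, 2 ** k)     # about M / 2^k messages hash to z
    return matching / num_messages

p_toy = toy_guess_probability()
print(p_toy)                  # 1/11881376
print(float(p_toy))           # roughly 8.4e-08

# M cancels, so the result depends only on k (illustrative values of M and k).
p_k = k_bit_guess_probability(num_messages=10**12, k=32)
print(p_k == Fraction(1, 2**32))   # True
```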
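All hash functions operate using the following general principles. The input (message,
file, etc.) is viewed as a sequence of n-bit blocks. The input is processed one block at a
time in an iterative fashion to produce an n-bit hash value.
One of the simplest hash functions is the bit-by-bit exclusive-OR (XOR) of every
block. This can be expressed as follows:
$$C_i = b_{i1} \oplus b_{i2} \oplus \cdots \oplus b_{im}$$
where $C_i$ is the $i$th bit of the hash code ($1 \le i \le n$), $m$ is the number of n-bit blocks in the input, and $b_{ij}$ is the $i$th bit of the $j$th block.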
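The bit-by-bit XOR of every block can be sketched in a few lines of Python. This is an illustration under assumptions: blocks are 8 bytes (64 bits) by default, and a short final block is zero-padded, a detail the text does not specify.

```python
def xor_hash(data: bytes, block_size: int = 8) -> bytes:
    """XOR all block_size-byte blocks of data together, bit position by bit position."""
    # Zero-pad the final block (an assumption; the text does not describe padding).
    data += b"\x00" * (-len(data) % block_size)
    code = bytes(block_size)                      # all-zero starting value
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        code = bytes(c ^ b for c, b in zip(code, block))
    return code

first = b"ABCDEFGH"
second = b"12345678"
code = xor_hash(first + second)
print(code.hex())

# XOR is commutative, so reordering the blocks leaves the hash code unchanged.
same_after_swap = xor_hash(second + first) == code
print(same_after_swap)   # True
```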
A technique originally proposed by the National Bureau of Standards used the simple XOR
applied to 64-bit blocks of the message and then an encryption of the entire message that used
the cipher block chaining (CBC) mode. We can define the scheme as follows: given a
message consisting of a sequence of 64-bit blocks $X_1, X_2, \ldots, X_N$, define the hash code C as the
block-by-block XOR of all blocks and append the hash code as the final block:
$$C = X_{N+1} = X_1 \oplus X_2 \oplus \cdots \oplus X_N$$
Next, encrypt the entire message plus hash code, using CBC mode to produce the encrypted
message $Y_1, Y_2, \ldots, Y_{N+1}$.
The SHA-1 algorithm takes as input a message with a maximum length of less than $2^{64}$ bits
and produces as output a 160-bit message digest. The input is processed in 512-bit blocks.
The SHA-1 algorithm has the property that every bit of the hash code is a function of
every bit of the input. The complex repetition of the basic function f produces results that are
well mixed; that is, it is unlikely that two messages chosen at random, even if they exhibit
similar regularities, will have the same hash code. Unless there is some hidden weakness in
SHA-1, which has not so far been published, the difficulty of coming up with two messages
having the same message digest is on the order of $2^{80}$ operations, while the difficulty of
finding a message with a given digest is on the order of $2^{160}$ operations.
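The digest size, block size, and mixing behaviour described above can be observed with the SHA-1 implementation in Python's standard `hashlib` module. The sketch below uses an invented sample message and flips a single input bit; the helper `bit_difference` is a hypothetical name introduced here.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

sha1 = hashlib.sha1()
print(sha1.digest_size * 8)    # 160-bit message digest
print(sha1.block_size * 8)     # 512-bit input blocks

original = b"message authentication"               # illustrative input
flipped = bytes([original[0] ^ 0x01]) + original[1:]  # same input with one bit changed

d1 = hashlib.sha1(original).digest()
d2 = hashlib.sha1(flipped).digest()

# For a well-mixed hash, a one-bit input change alters many of the 160 output bits.
changed_bits = bit_difference(d1, d2)
print(changed_bits)
```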
In this section we look at two other secure hash functions that, in addition to SHA-1,
have gained commercial acceptance. Some of the principal characteristics are compared in
table.
So with a secure hash function that produces, say, a 160-bit hash value, as does the federal
Secure Hash Standard, the probability of random guesswork leading to a message for
a given hash value is
$$\frac{1}{2^{160}} = \frac{1}{1461501637330902918203684832716283019655932542976} \approx 6.84228 \times 10^{-49}$$
Presumably, very sophisticated guessing might raise this number a few orders of
magnitude, but the probability will still be infinitesimal.
The MD5 algorithm takes as input a message of arbitrary length and produces as output a 128-bit
message digest. The input is processed in 512-bit blocks.
As processor speeds have increased, the security of a 128-bit hash code has become
questionable. It can be shown that the difficulty of coming up with two messages having the
same message digest is on the order of $2^{64}$ operations, whereas the difficulty of finding a
message with a given digest is on the order of $2^{128}$ operations. The former figure is too
small for security. Further, a number of cryptanalytic attacks have been developed that
suggest the vulnerability of MD5 to cryptanalysis.
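The figures quoted for MD5 and SHA-1 follow one pattern: for an n-bit digest, finding two messages with the same digest costs on the order of $2^{n/2}$ operations (the birthday bound), while finding a message with a given digest costs on the order of $2^n$. The sketch below only restates that arithmetic; the function name is hypothetical and the bounds assume no structural weakness in the hash.

```python
def attack_exponents(digest_bits: int) -> tuple:
    """Return (collision, preimage) work exponents for an ideal n-bit hash: n/2 and n."""
    return digest_bits // 2, digest_bits

# Digest lengths as stated in the text.
digest_bits = {"MD5": 128, "SHA-1": 160}

for name, bits in digest_bits.items():
    collision, preimage = attack_exponents(bits)
    print(f"{name}: collision about 2^{collision}, given digest about 2^{preimage}")

# How much more work a SHA-1 collision search is than an MD5 one, as a factor.
ratio = 2 ** (attack_exponents(160)[0] - attack_exponents(128)[0])
print(ratio)   # 65536
```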
10.4.1 RIPEMD-160
The RIPEMD-160 message-digest algorithm was developed under the European RACE
Integrity Primitives Evaluation (RIPE) project, by a group of researchers that launched
partially successful attacks on MD4 and MD5. The group originally developed a 128-bit
version of RIPEMD. After the end of the RIPE project, H. Dobbertin (who was not a part of the
RIPE project) found attacks on two rounds of RIPEMD, and later on MD4 and MD5. Because
of these attacks, some members of the RIPE consortium decided to upgrade RIPEMD. The
design work was done by them and by Dobbertin.
RIPEMD-160 is quite similar in structure to SHA-1. The algorithm takes as input a
message of arbitrary length and produces as output a 160-bit message digest. The input is
processed in 512-bit blocks.
10.4.2 HMAC
In recent years, there has been increased interest in developing a MAC derived from a
cryptographic hash code, such as SHA-1.
A hash function such as SHA-1 was not designed for use as a MAC and cannot be
used directly for that purpose because it does not rely on a secret key. There have been a
number of proposals for the incorporation of a secret key into an existing hash algorithm. The
approach that has received the most support is HMAC. HMAC has been issued as RFC 2104,
has been chosen as the mandatory-to-implement MAC for IP Security, and is used in other
Internet protocols, such as transport layer security (TLS, soon to replace secure sockets layer)
and secure electronic transaction (SET).
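To make concrete how HMAC incorporates a secret key into an existing hash algorithm, the sketch below follows the construction defined in RFC 2104 with SHA-1 as the underlying hash and checks the result against Python's standard `hmac` module. The key and message are invented for illustration, and the hand-written function is for exposition only; in practice the library routine would be used.

```python
import hashlib
import hmac

def hmac_sha1(key: bytes, message: bytes) -> bytes:
    """HMAC per RFC 2104: H((K xor opad) || H((K xor ipad) || message)), with H = SHA-1."""
    block_size = hashlib.sha1().block_size        # 64 bytes for SHA-1
    if len(key) > block_size:
        key = hashlib.sha1(key).digest()          # over-long keys are hashed first
    key = key.ljust(block_size, b"\x00")          # then zero-padded to one block
    inner_key = bytes(b ^ 0x36 for b in key)      # K xor ipad
    outer_key = bytes(b ^ 0x5C for b in key)      # K xor opad
    inner = hashlib.sha1(inner_key + message).digest()
    return hashlib.sha1(outer_key + inner).digest()

key = b"illustrative key"                         # hypothetical shared key
message = b"message to authenticate"

ours = hmac_sha1(key, message)
library = hmac.new(key, message, hashlib.sha1).digest()
matches = hmac.compare_digest(ours, library)
print(matches)   # True
```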