Journal of Cryptology: Differential Cryptanalysis of DES-like Cryptosystems 1
Abstract. The Data Encryption Standard (DES) is the best known and most
widely used cryptosystem for civilian applications. It was developed at IBM and
adopted by the National Bureau of Standards in the mid 1970s, and has successfully
withstood all the attacks published so far in the open literature. In this paper
we develop a new type of cryptanalytic attack which can break the reduced variant
of DES with eight rounds in a few minutes on a personal computer and can break
any reduced variant of DES (with up to 15 rounds) using less than $2^{56}$ operations
and chosen plaintexts. The new attack can be applied to a variety of DES-like
substitution/permutation cryptosystems, and demonstrates the crucial role of the
(unpublished) design rules.
Key words. Data Encryption Standard, Differential cryptanalysis, Iterated
cryptosystems.
1. Introduction
round function of Lucifer has a combination of nonlinear S boxes and a bit permu-
tation. The input bits are divided into groups of four consecutive bits. Each group
is translated by a reversible S box giving a four-bit result. The output bits of all the
S boxes are permuted in order to mix them when they become the input to the
following round. In Lucifer only two fixed S boxes (S0 and S1) were chosen. Each
S box can be used at any S box location and the choice is key dependent. De-
cryption is accomplished by running the data backward using the inverse of each
S box.
The Data Encryption Standard (DES) [15] is an improved version of Lucifer. It
was developed at IBM and adopted by the U.S. National Bureau of Standards
(NBS) as the standard cryptosystem for sensitive but unclassified data (such as
financial transactions and email messages). DES has become a well-known and
widely used cryptosystem. The key size of DES is 56 bits and the block size is 64
bits. This block is divided into halves of 32 bits each. The main part of the round
function is the F function, which works on the right half of the data using a subkey
of 48 bits and eight (six-bit to four-bit) S boxes. The 32 output bits of the F function
are XORed with the left half of the data and the halves are exchanged. The complete
specification of the DES algorithm appears in [15].
An extensive cryptanalytic literature on DES has been published since its adoption in
1977. Yet no short-cuts which can reduce the complexity of cryptanalysis to less
than half of exhaustive search were ever reported in the open literature.
The 50% reduction [9] (under a chosen plaintext attack) is based on the following
symmetry under complementation:
$T = \mathrm{DES}(P, K)$
implies that
$\bar{T} = \mathrm{DES}(\bar{P}, \bar{K})$,
where $\bar{X}$ is the bit-by-bit complementation of $X$. Cryptanalysis can exploit this
symmetry if two plaintext/ciphertext pairs $(P_1, T_1)$ and $(P_2, T_2)$ are available with
$P_1 = \bar{P}_2$ (or similarly $T_1 = \bar{T}_2$). The attacker encrypts $P_1$ under all the $2^{55}$ keys $K$
whose least-significant bit is zero. If such a ciphertext $T$ is equal to $T_1$, then the
corresponding key $K$ is likely to be the real key. If $T = \bar{T}_2$, then $\bar{K}$ is likely to be
the real key. Otherwise neither $K$ nor $\bar{K}$ can be the real key. Since testing whether
$T = \bar{T}_2$ is much faster than an encryption, the computational saving is very close
to 50%.
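The complementation symmetry can be illustrated on a toy Feistel cipher. The sketch below is not DES: the 16-bit halves, the round function `f`, and the subkeys are hypothetical choices made only for illustration. It relies on the one structural feature that matters here, namely that the round function sees the data only after it has been XORed with the subkey, so complementing both leaves the round function input unchanged.

```python
MASK = 0xFFFF  # 16-bit halves: a toy size, not the 32-bit halves of DES


def f(x):
    """Arbitrary fixed nonlinear function (hypothetical stand-in for S boxes and P)."""
    return ((x * 0x9E37 + 0x79B9) ^ (x >> 3)) & MASK


def toy_encrypt(left, right, subkeys):
    """Feistel network whose round function only sees (right XOR subkey)."""
    for k in subkeys:
        left, right = right, left ^ f(right ^ k)
    return left, right


def comp(x):
    """Bit-by-bit complement of a 16-bit value."""
    return x ^ MASK


subkeys = [0x1A2B, 0x3C4D, 0x5E6F, 0x7081]          # hypothetical subkeys
plain = (0x0123, 0x4567)                             # hypothetical plaintext halves

cipher = toy_encrypt(plain[0], plain[1], subkeys)
cipher_of_complements = toy_encrypt(
    comp(plain[0]), comp(plain[1]), [comp(k) for k in subkeys]
)

# Complementing plaintext and key complements the ciphertext.
symmetry_holds = cipher_of_complements == (comp(cipher[0]), comp(cipher[1]))
print(symmetry_holds)
```

Because the complemented ciphertext comes for free, one trial encryption tests two keys at once, which is where the factor of two in the search comes from.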
Diffie and Hellman [6] suggested exhaustive search of the entire key space on a
parallel machine. They estimate that a VLSI chip may be built which can search
one key every microsecond. By building a search machine with a million such chips,
all searching in parallel, $10^{12}$ keys can be searched per second. The entire key space
contains about $7 \cdot 10^{16}$ keys and it can be searched in $10^5$ seconds which is about a
day. They estimate the cost of this machine to be $20 million and the cost per
solution to be $5000.
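The arithmetic behind this estimate is easy to reproduce. The short sketch below only recomputes the figures quoted above from the stated assumptions (one key per microsecond per chip, a million chips); the variable names are ours.

```python
keys = 2 ** 56                      # size of the DES key space
keys_per_second = 10 ** 6 * 10 ** 6  # a million chips, each testing 10^6 keys per second

seconds = keys / keys_per_second     # time to sweep the whole key space
hours = seconds / 3600

print(f"{keys:.2e} keys, {seconds:.0f} s, {hours:.1f} h")
```

The exact sweep takes roughly 72,000 seconds (about 20 hours), which the text rounds to $10^5$ seconds, or about a day.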
Hellman [8] presented a time memory tradeoff method for a chosen plaintext
attack which takes $mt$ words of memory and $t^2$ operations provided $mt^2$ equals the
number of possible keys ($2^{56}$ for DES). A special case ($m = t$) of this method takes
about $2^{38}$ time and $2^{38}$ memory, with a $2^{56}$ preprocessing time. Hellman suggests
a special purpose machine which produces 100 solutions per day with an average
wait of 1 day. He estimates that the machine costs about $4 million and the cost
per solution is about $1-$100. The preprocessing is estimated to take 2.3 years on
the same machine.
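The special case $m = t$ can be checked numerically. The sketch below solves the tradeoff condition for $t$ and reports the resulting time and memory as powers of two; it is only a sanity check of the quoted exponents, not a model of Hellman's machine.

```python
import math

N = 2 ** 56                 # number of possible DES keys
t = N ** (1 / 3)            # with m = t, the condition m * t^2 = N gives t = N^(1/3)
m = t

time_log2 = math.log2(t * t)     # t^2 operations
memory_log2 = math.log2(m * t)   # m * t words of memory

print(f"time ~ 2^{time_log2:.2f}, memory ~ 2^{memory_log2:.2f}")
```

Both exponents come out at $112/3 \approx 37.3$, consistent with the rounded figure of about $2^{38}$.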
The Method of Formal Coding in which the formal expression of each bit in the
ciphertext is found as an XOR sum of products of the bits of the plaintext and the
key was suggested in [9]. The formal manipulations of these expressions may decrease
the key search effort. Schaumuller-Bichl [16], [17] studied this method and
concluded that it requires an enormous amount of computer memory which makes
the whole approach impractical.
In 1985 Chaum and Evertse [2] showed that a meet in the middle attack can
reduce the key search for DES reduced to a small number of rounds by the follow-
ing factors:
They also showed that a slightly modified version of DES reduced to seven rounds
can be solved with a reduction factor of 2. However, they proved that a meet in the
middle attack of this kind is not applicable to DES reduced to eight or more
rounds.
In their method they look for a set of data bits (J) in a middle round and a set of
key bits (I) for which any change of the values of the I bits cannot change the
value of the J bits in either direction. Knowing those fixed sets and given several
plaintext/ciphertext pairs the following algorithm is used:
1. Try all the keys in which all the key bits in I are zero. Partially encrypt and
decrypt a plaintext/ciphertext pair to get the data in the middle round.
2. Discard the keys for which the J bits are not the same under partial
encryption/decryption.
3. For the remaining keys try all the possible values of the key bits in I.
This algorithm requires about $2^{56-|I|} + 2^{56-|J|}$ encryption/decryption attempts.
In 1987 Davies [3] described a known plaintext cryptanalytic attack on
DES. Given sufficient data, it could yield 16 linear relationships among key bits,
thus reducing the size of a subsequent key search to $2^{40}$. It exploited the correlation
between the outputs of adjacent S boxes, due to their inputs being derived from,
among other things, a pair of identical bits produced by the bit expansion opera-
tion. This correlation could reveal a linear relationship among the four bits of key
used to modify these S box input bits. The 32-bit halves of the DES result (ignoring
IP) receive these outputs independently, so each pair of adjacent S boxes could be
exploited twice, yielding 16 bits of key information.
6 E. Bihamand A. Shamir
The analysis does not require the plaintext $P$ or ciphertext $T$ but uses the quantity
$P \oplus T$ and requires a huge number of random inputs. The S box pairs vary in
the extent of correlation they produce so that, for example, the pair S7/S8 needs
about $10^{17}$ samples but pair S2/S3 needs about $10^{21}$. With about $10^{23}$ samples, all
but the pair S3/S4 should give results (i.e., a total of 14 bits of key information). To
exploit all pairs the cryptanalyst needs about $10^{26}$ samples. The S boxes do not
appear to have been designed to minimize the correlation but they are somewhat
better than a random choice in this respect. Since the number of samples is larger
than the $2^{64}$ size of the sample space, this attack is purely theoretical and cannot
be carried out. However, for DES reduced to eight rounds the sample size of $10^{12}$
or $10^{13}$ (about $2^{40}$) is on the verge of practicality. Therefore, Davies' analysis had
penetrated more rounds than previously reported attacks.
During the last decade several cryptosystems which are variants of DES were
suggested. Schaumuller-Bichl suggested three such cryptosystems [16], [18]. Two
of them (called C80 and C82) are based on the DES structure with the replacement
of the F function by nonreversible functions. The third one, called the Generalized
DES Scheme (GDES), is an attempt to speed up DES. GDES has 16 rounds with
the original DES F function but with a larger block size which is divided into more
than two parts. She claims that GDES increases the encryption speed of DES
without decreasing its security.
Another variant is the Fast Data Encryption Algorithm (Feal). Feal was designed
to be efficiently implementable on an eight-bit microprocessor. The first version of
Feal [20], called Feal-4, has four rounds. Feal-4 was broken by Den Boer [4] using
a chosen plaintext attack with 100-10,000 encryptions. The creators of Feal reacted
by introducing a new version, called Feal-8, with eight rounds [19], [14]. Both
versions were described as cryptographically better than DES in several aspects.
In this paper we describe a new kind of attack that can be applied to many
DES-like iterated cryptosystems. This is a chosen plaintext attack which uses only
the resultant ciphertexts. The basic tool of the attack is the ciphertext pair which is
a pair of ciphertexts whose plaintexts have particular differences. The two plain-
texts can be chosen at random, as long as they satisfy the difference condition, and
the cryptanalyst does not have to know their values. The attack is statistical in
nature and can fail in rare instances.
The main results described in this paper are as follows (note that the complexities
we quote are based on the number of encryptions needed to create all the necessary
pairs on the target machine, while the attacking algorithm itself uses fewer and
simpler operations). DES reduced to six rounds was broken in less than 0.3 seconds
on a personal computer using 240 ciphertexts. DES reduced to eight rounds was
broken in less than 2 minutes on a computer by analysing 15,000 ciphertexts chosen
from a pool of 50,000 candidate ciphertexts. DES reduced to up to 15 rounds is
breakable faster than exhaustive search, but DES with 16 rounds still requires $2^{58}$
steps (which is slightly higher than the complexity of exhaustive search). A summary
of the cryptanalytic results on DES reduced to an intermediate number of rounds
appears in Table 1.
Table 1. Summary of the cryptanalysis of DES: The number of operations and chosen
plaintexts required to break the specified number of rounds.
Rounds   Complexity
4        $2^{4}$
6        $2^{8}$
8        $2^{16}$
9        $2^{26}$
10       $2^{35}$
11       $2^{36}$
12       $2^{43}$
13       $2^{44}$
14       $2^{51}$
15       $2^{52}$
16       $2^{58}$
Some researchers have proposed to strengthen DES by making all the subkeys
$K_i$ independent (or at least to derive them in a more complicated way from a longer
actual key K). Our attack can be carried out even in this case. DES reduced to eight
rounds with independent subkeys (i.e., with $8 \cdot 48 = 384$ independent key bits which
are not compatible with the key scheduling algorithm) was broken in less than 2
minutes using the same ciphertexts as in the case of dependent subkeys. The full
DES with independent subkeys (i.e., with $16 \cdot 48 = 768$ independent key bits) is
breakable within $2^{61}$ steps. As a result, any modification of the key scheduling
algorithm cannot make DES much stronger. The attacks on DES reduced to 9-16
rounds are not influenced by the P permutation and the replacement of the P
permutation by any other permutation cannot make them less successful. On the
other hand, the replacement of the order of the eight DES S boxes (without changing
their values) can make DES much weaker: DES with 16 rounds with a particular
replaced order is breakable in about $2^{46}$ steps. The replacement of the XOR
operation by the more complex addition operation makes this cryptosystem much
weaker. DES with random S boxes is shown to be very easy to break. Even a
minimal change of one entry in one of the DES S boxes can make DES easier to
break. GDES is shown to be trivially breakable with six encryptions in less than
0.2 seconds, while GDES with independent subkeys is breakable with 16 encryp-
tions in less than 3 seconds.
This attack is applicable also to a wide variety of DES-like cryptosystems. In
forthcoming papers we describe several extensions to our new attack. Lucifer re-
duced to eight rounds can be broken using less than 60 ciphertexts (30 pairs). The
Feal-8 cryptosystem can be broken with less than 2000 ciphertexts (1000 pairs) and
the Feal-4 cryptosystem can be broken with just eight ciphertexts and one of their
plaintexts. As a reaction to our attack on Feal-8, its creators introduced Feal-N
[11], with any even number of rounds N. They suggest the use of Feal-N with 16
and 32 rounds. Feal-NX [12] is similar to Feal-N with the extension of the key size
to 128 bits. Nevertheless, Feal-N and Feal-NX can be broken for any N < 31
rounds faster than exhaustive search.
Differential cryptanalytic techniques are applicable to hash functions, in addition
to cryptosystems. For example, the following messages hash to the same value in
Merkle's Snefru [10] function with two passes:
• 00000000 00000000 00000000 00000000 00000000 00000000 00000000
  00000000 00000000 00000000 00000000 00000000
• 00000000 f1301600 13dfc53e 4cc3b093 37461661 ccd8b94d 24d9d36f
  71471fde 00000000 00000000 00000000 00000000
• 00000000 1d197f00 2abd3f6f cf33f3d1 8674966a 816e5d51 acd9a905
  53c1d180 00000000 00000000 00000000 00000000
• 00000000 e98c8300 1e777a47 b5271f34 a04974bb 44cc8b62 be4b0efc
  18131756 00000000 00000000 00000000 00000000
and the following two messages hash to the same value in a variant of Miyaguchi's
N-Hash [13] function with six rounds:
• CAECE595 127ABF3C 1ADE09C8 1F9AD8C2
• 4A8C6595 921A3F3C 1ADE09C8 1F9AD8C2.
Fig. 1. DES reduced to eight rounds (diagram: plaintext P, subkeys K1, ..., K8, ciphertext T).
$a, \ldots, j$: The 32-bit inputs of the F function in the various rounds. See Fig. 1. Note
that $a = R$.
$A, \ldots, J$: The 32-bit outputs of the F function in the various rounds. See Fig. 1.
$Si$: The S boxes S1, S2, ..., S8.
$Si_{EX}$, $Si_{KX}$, $Si_{IX}$, $Si_{OX}$: The input of $Si$ in round $X$ is denoted by $Si_{IX}$ for
$X \in \{a, \ldots, j\}$. The output of $Si$ in round $X$ is denoted by $Si_{OX}$. The value of the six
subkey bits entering the S box $Si$ is denoted by $Si_{KX}$ and the value of the six
input bits of the expanded data ($E(X)$) which are XORed with $Si_{KX}$ to form
$Si_{IX}$ is denoted by $Si_{EX}$. The S box number $i$ and the round marker $X$ are
optional. For example $S1_{Ea}$ denotes the first six bits of $E(a)$. $S1_{Ka}$ denotes the
first six bits of the subkey K1. $S1_{Ia}$ denotes the input of the S box S1 which is
$S1_{Ia} = S1_{Ea} \oplus S1_{Ka}$. $S1_{Oa}$ denotes the output of S1 which is $S1_{Oa} = S1(S1_{Ia})$.
See Fig. 2.
Fig. 2. The F function of DES (diagram: the expanded 48-bit input $S1_E, \ldots, S8_E$ is XORed with the subkey bits $S1_K, \ldots, S8_K$ to form the S box inputs $S1_I, \ldots, S8_I$).
Example 1. DES has $2^{16 \cdot 48} = 2^{768}$ possible independent keys, but only $2^{56}$ possible
keys. Note that every key can be viewed as a special type of an independent key.
Remark. To simplify the probabilistic analysis of our attack, we assume that all
the subkeys are independent. Attacks on DES with dependent subkeys seem to be
just as successful in practice, but their theoretical analysis is much harder.
Let us recall how the DES F function behaves in these terms. The F function
takes a 32-bit input and a 48-bit key. The input is expanded (by the E expansion)
to 48 bits and XORed with the key. The result is fed into the S boxes and the
resultant bits are permuted.
Given the XOR value of an input pair to the F function it is easy to determine
its XOR value after the expansion by the formula
$E(X) \oplus E(X^*) = E(X \oplus X^*)$.
The XOR with the key does not change the XOR value in the pair, i.e., the
expanded XOR stays valid even after the XOR with the key, by the formula
$(X \oplus K) \oplus (X^* \oplus K) = X \oplus X^*$.
The output of the S boxes is mixed by the P permutation and thus the XOR of the
pair after the P permutation is the permuted value of the S boxes output XOR, by
the formula
$P(X) \oplus P(X^*) = P(X \oplus X^*)$.
The output XOR of the F function is linear in the XOR operation that connects
the different rounds:
$(X \oplus Y) \oplus (X^* \oplus Y^*) = (X \oplus X^*) \oplus (Y \oplus Y^*)$.
The XOR of pairs is thus invariant in the key and is linear in the E expansion, the
P permutation, and the XOR operation.
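These identities can be checked directly. The sketch below implements E and P as bit selections, using the standard DES tables with bit 1 as the leftmost bit, and tests the three formulas on random values. The helper name `select_bits` and the random test values are ours; the identities hold for any bit selection, so the check does not depend on the particular inputs.

```python
import random

# DES expansion E (32 -> 48 bits) and permutation P (32 -> 32 bits),
# given as 1-based source bit positions, bit 1 being the most significant.
E = [32, 1, 2, 3, 4, 5, 4, 5, 6, 7, 8, 9, 8, 9, 10, 11, 12, 13,
     12, 13, 14, 15, 16, 17, 16, 17, 18, 19, 20, 21, 20, 21, 22, 23, 24, 25,
     24, 25, 26, 27, 28, 29, 28, 29, 30, 31, 32, 1]
P = [16, 7, 20, 21, 29, 12, 28, 17, 1, 15, 23, 26, 5, 18, 31, 10,
     2, 8, 24, 14, 32, 27, 3, 9, 19, 13, 30, 6, 22, 11, 4, 25]


def select_bits(value, table, width=32):
    """Output bit i (from the left) is input bit table[i] of a width-bit value."""
    out = 0
    for pos in table:
        out = (out << 1) | ((value >> (width - pos)) & 1)
    return out


rng = random.Random(0)
x, x_star = rng.getrandbits(32), rng.getrandbits(32)
key = rng.getrandbits(48)

# E is linear with respect to XOR.
e_linear = select_bits(x, E) ^ select_bits(x_star, E) == select_bits(x ^ x_star, E)
# The key cancels out of the XOR of the pair.
key_cancels = (select_bits(x, E) ^ key) ^ (select_bits(x_star, E) ^ key) == select_bits(x ^ x_star, E)
# P is linear with respect to XOR.
p_linear = select_bits(x, P) ^ select_bits(x_star, P) == select_bits(x ^ x_star, P)

print(e_linear, key_cancels, p_linear)
print(f"{select_bits(0xE0000000, P):08X}")
```

The last line applies P to the value E0 00 00 00 (hexadecimal), which is used later in the one-round characteristic.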
The S boxes are known to be nonlinear. Knowledge of the XOR of the input pairs
cannot guarantee knowledge of the XOR of the output pairs. Usually several output
XORs are possible. A special case arises when both inputs are equal, in
which case both outputs must be equal too. However, a crucial observation is that
for any particular input XOR not all the output XORs are possible, the possible
ones do not appear uniformly, and some XORed values appear much more fre-
quently than others.
Before we proceed we want to mention the known design principles of the S
boxes [1]:
1. No S box is a linear or affine function of its input.
2. Changing one input bit to an S box results in changing at least two output
bits.
3. $S(X)$ and $S(X \oplus 001100)$ must differ in at least two bits.
4. $S(X) \neq S(X \oplus 11ef00)$ for any choice of $e$ and $f$.
5. The S boxes were chosen to minimize the differences between the number of
ones and zeros in any S box output when any single bit is held constant.
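Principles 2, 3, and 4 can be verified mechanically for a given S box. The sketch below does so for S1, using the values of Table 3 and the usual DES indexing convention (the first and last input bits select the row, the middle four bits select the column). The function names are ours, and only S1 is checked.

```python
# S1 of DES (Table 3): four rows of sixteen 4-bit values.
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def s1(x):
    """Look up a 6-bit input: outer bits give the row, inner bits the column."""
    row = ((x >> 4) & 2) | (x & 1)
    col = (x >> 1) & 0xF
    return S1[row][col]


def weight(v):
    """Number of one bits in v."""
    return bin(v).count("1")


# Principle 2: flipping any single input bit changes at least two output bits.
principle_2 = all(weight(s1(x) ^ s1(x ^ (1 << b))) >= 2
                  for x in range(64) for b in range(6))

# Principle 3: S(X) and S(X xor 001100) differ in at least two bits.
principle_3 = all(weight(s1(x) ^ s1(x ^ 0b001100)) >= 2 for x in range(64))

# Principle 4: S(X) != S(X xor 11ef00) for any choice of e and f.
principle_4 = all(s1(x) != s1(x ^ (0b110000 | (e << 3) | (f << 2)))
                  for x in range(64) for e in (0, 1) for f in (0, 1))

print(principle_2, principle_3, principle_4)
```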
In DES any S box has $64 \cdot 64$ possible input pairs, and each one of them has an
input XOR and an output XOR. There are only $64 \cdot 16$ possible tuples of input and
output XORs. Therefore, each tuple results on average from four pairs. However,
not all the tuples exist as a result of a pair, and the existing ones do not have a
uniform distribution. Very important properties of the S boxes are derived from the
analysis of the tables that summarize this distribution:
Definition 2. A table that shows the distribution of the input XORs and output
XORs of all the possible pairs of an S box is called the pairs XOR distribution table
of the S box. In this table each row corresponds to a particular input XOR, each
column corresponds to a particular output XOR, and the entries themselves count
the number of possible pairs with such an input XOR and an output XOR.
Table 2. Pairs XOR distribution table of S1.
Input  Output XOR
XOR    0x  1x  2x  3x  4x  5x  6x  7x  8x  9x  Ax  Bx  Cx  Dx  Ex  Fx
0x     64   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
1x      0   0   0   6   0   2   4   4   0  10  12   4  10   6   2   4
2x      0   0   0   8   0   4   4   4   0   6   8   6  12   6   4   2
3x     14   4   2   2  10   6   4   2   6   4   4   0   2   2   2   0
4x      0   0   0   6   0  10  10   6   0   4   6   4   2   8   6   2
5x      4   8   6   2   2   4   4   2   0   4   4   0  12   2   4   6
6x      0   4   2   4   8   2   6   2   8   4   4   2   4   2   0  12
7x      2   4  10   4   0   4   8   4   2   4   8   2   2   2   4   4
8x      0   0   0  12   0   8   8   4   0   6   2   8   8   2   2   4
9x     10   2   4   0   2   4   6   0   2   2   8   0  10   0   2  12
Ax      0   8   6   2   2   8   6   0   6   4   6   0   4   0   2  10
Bx      2   4   0  10   2   2   4   0   2   6   2   6   6   4   2  12
Cx      0   0   0   8   0   6   6   0   0   6   6   4   6   6  14   2
Dx      6   6   4   8   4   8   2   6   0   6   4   6   0   2   0   2
Ex      0   4   8   8   6   6   4   0   6   6   4   0   0   4   0   8
Fx      2   0   2   4   4   6   4   2   4   8   2   2   2   6   8   8
...
30x     0   4   6   0  12   6   2   2   8   2   4   4   6   2   2   4
31x     4   8   2  10   2   2   2   2   6   0   0   2   2   4  10   8
32x     4   2   6   4   4   2   2   4   6   6   4   8   2   2   8   0
33x     4   4   6   2  10   8   4   2   4   0   2   2   4   6   2   4
34x     0   8  16   6   2   0   0  12   6   0   0   0   0   8   0   6
35x     2   2   4   0   8   0   0   0  14   4   6   8   0   2  14   0
36x     2   6   2   2   8   0   2   2   4   2   6   8   6   4  10   0
37x     2   2  12   4   2   4   4  10   4   4   2   6   0   2   2   4
38x     0   6   2   2   2   0   2   2   4   6   4   4   4   6  10  10
39x     6   2   2   4  12   6   4   8   4   0   2   4   2   4   4   0
3Ax     6   4   6   4   6   8   0   6   2   2   6   2   2   6   4   0
3Bx     2   6   4   0   0   2   4   6   4   6   8   6   4   4   6   2
3Cx     0  10   4   0  12   0   4   2   6   0   4  12   4   4   2   0
3Dx     0   8   6   2   2   6   0   8   4   4   0   4   0  12   4   4
3Ex     4   8   2   2   2   4   4  14   4   2   0   2   0   8   4   4
3Fx     4   8   4   2   4   0   2   4   4   2   4   8   8   6   2   2
different entries. Thus in each line in the table the average of the entries is exactly
four.
Example 3. The first line of Table 2 shows that, for the zero input XOR, the
output XOR must be zero too, as we noticed above. Also, the different lines in the
table have different output XOR distributions.
* The full pairs XOR distribution tables of all the S boxes appear in Appendix B.
Table 3. S1 table.
14 4 13 1 2 15 11 8 3 10 6 12 5 9 0 7
0 15 7 4 14 2 13 1 10 6 12 11 9 5 3 8
4 1 14 8 13 6 2 11 15 12 9 7 3 10 5 0
15 12 8 2 4 9 1 7 5 11 3 14 10 0 6 13
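The pairs XOR distribution table of Definition 2 can be computed from Table 3 by brute force over all $64 \cdot 64$ input pairs. The sketch below assumes the usual DES indexing convention for S1 (outer input bits select the row, inner bits the column); the function names are ours.

```python
# S1 of DES (Table 3).
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def s1(x):
    """Look up a 6-bit input: outer bits give the row, inner bits the column."""
    row = ((x >> 4) & 2) | (x & 1)
    col = (x >> 1) & 0xF
    return S1[row][col]


def xor_distribution_table(sbox):
    """table[input_xor][output_xor] = number of ordered input pairs with these XORs."""
    table = [[0] * 16 for _ in range(64)]
    for x in range(64):
        for x_star in range(64):
            table[x ^ x_star][sbox(x) ^ sbox(x_star)] += 1
    return table


ddt = xor_distribution_table(s1)

# Fraction of (input XOR, output XOR) entries that are possible at all.
possible = sum(1 for row in ddt for entry in row if entry > 0)
fraction_possible = possible / (64 * 16)

print(ddt[0x34])
print(f"{100 * fraction_possible:.1f}% of the entries are possible")
```

Every row sums to 64 over 16 columns, which is why the average entry in each line is exactly four.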
Definition 3. Let X be a six-bit value and let Y be a four-bit value. We say that X
may cause Y by an S box if there is a pair in which the input XOR of the S box
equals X and the output XOR of the S box equals Y. If there is such a pair we write
$X \to Y$, and if there is no such pair we say that X may not cause Y by the S box and
write $X \not\to Y$.
Example 4. Consider the input XOR $S1'_I = 34_x$. It has only eight possible output
XORs, while the other eight entries are impossible. The possible output XORs $S1'_O$
are $1_x$, $2_x$, $3_x$, $4_x$, $7_x$, $8_x$, $D_x$, and $F_x$. Therefore, the input XOR $S1'_I = 34_x$ may cause
output XOR $S1'_O = 1_x$ ($34_x \to 1_x$). Also $34_x \to 2_x$ and $34_x \to F_x$. On the other hand,
$34_x \not\to 0_x$ and $34_x \not\to 9_x$.
Examples 3 and 4 demonstrate that for a fixed input XOR, the possible output
XORs do not have a uniform distribution. The following definition extends Defini-
tion 3 with probabilities.
Example 5. $34_x \to 2_x$ results from 16 out of the 64 pairs of S1, i.e., with probability
1/4. $34_x \to 4_x$ results only from two out of the 64 pairs of S1, i.e., with probability
1/32.
Different distributions appear in different lines of the table. In total between 70%
and 80% of the entries are possible and between 20% and 30% are impossible. The
exact percentage for each S box is shown in Table 4. In various formulas in this
paper we approximate the percentage of the possible entries by 80%.
The pairs XOR distribution tables let us find the possible input and output
values of pairs given their input and output XORs. The following example shows
a simple case:
Example 6. Consider the entry $34_x \to 4_x$ in the pairs XOR distribution table of S1.
Since the entry $34_x \to 4_x$ has value 2, only two pairs satisfy these XORs. These pairs
are duals. If the first pair is $S1_I$, $S1^*_I$, then the other pair is $S1^*_I$, $S1_I$. By looking at
Table 5 we see that these inputs must be $13_x$ and $27_x$ whose corresponding outputs
are $6_x$ and $2_x$, respectively.
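The inputs behind a given table entry can be listed by exhaustive search. The sketch below does this for S1, assuming the usual DES indexing convention; `inputs_for` is a hypothetical helper name.

```python
# S1 of DES (Table 3).
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def s1(x):
    """Look up a 6-bit input: outer bits give the row, inner bits the column."""
    row = ((x >> 4) & 2) | (x & 1)
    col = (x >> 1) & 0xF
    return S1[row][col]


def inputs_for(input_xor, output_xor):
    """All S box inputs x for which the pair (x, x xor input_xor) gives output_xor."""
    return sorted(x for x in range(64)
                  if s1(x) ^ s1(x ^ input_xor) == output_xor)


rare = inputs_for(0x34, 0x4)      # the two dual pairs of Example 6
common = inputs_for(0x34, 0x2)    # the most likely output XOR for 34x

print([f"{x:02X}" for x in rare], [f"{s1(x):X}" for x in rare])
print(len(common), "inputs, probability", len(common) / 64)
```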
Table 4. Percentage of the possible entries in the various pairs XOR distribution tables.
S box   Percentage
S1      79.4
S2      78.6
S3      79.6
S4      68.5
S5      76.5
S6      80.4
S7      77.2
S8      77.1
Next we show how to find the key bits using known input pairs and output XOR
of an S box in the F function.
Example 7. Consider S1 and assume that the input pair is $S1_E = 1_x$, $S1^*_E = 35_x$
and that the value of the corresponding six key bits is $S1_K = 23_x$. Then the actual
inputs of S1 (after XORing the input and key bits) are $S1_I = 22_x$, $S1^*_I = 16_x$ and
the outputs are $S1_O = 1_x$, $S1^*_O = C_x$, respectively. The output XOR is $S1'_O = D_x$.
Assume we know that $S1_E = 1_x$, $S1^*_E = 35_x$, and $S1'_O = D_x$ and we want to find
the key value $S1_K$. The input XOR is $S1'_E = S1'_I = 34_x$ regardless of the actual value
of $S1_K$. By consulting Table 2 we can see that the input to the S box has eight
possibilities. These eight possibilities make eight possibilities for the key (by $S_K =
S_E \oplus S_I$) as described in Table 6. Each line in the table describes two pairs with the
same two inputs but with the opposite order. Each pair leads to one key, so each
line leads to two keys (which are $S_E \oplus S_I$ and $S_E \oplus S^*_I$). The right key value $S1_K$
must occur in this table.
Using additional pairs we can get additional candidates for $S1_K$. Let us look at
the input pair $S1_E = 21_x$, $S1^*_E = 15_x$ (with the same $S1_K = 23_x$). The inputs to the
S box are $S1_I = 2_x$, $S1^*_I = 36_x$ and the outputs are $S1_O = 4_x$, $S1^*_O = 7_x$. The output
XOR is $S1'_O = 3_x$. The possible inputs to the S box where $34_x \to 3_x$ and the
Table 5. Possible input values for the input XOR $S1'_I = 34_x$, by the
output XOR (in hexadecimal).
Output XOR ($S1'_O$)   Possible inputs ($S1_I$)
1    03, 0F, 1E, 1F, 2A, 2B, 37, 3B
2    04, 05, 0E, 11, 12, 14, 1A, 1B, 20, 25, 26, 2E, 2F, 30, 31, 3A
3    01, 02, 15, 21, 35, 36
4    13, 27
7    00, 08, 0D, 17, 18, 1D, 23, 29, 2C, 34, 39, 3C
8    09, 0C, 19, 2D, 38, 3D
D    06, 10, 16, 1C, 22, 24, 28, 32
F    07, 0A, 0B, 33, 3E, 3F
Table 6. Possible keys for $34_x \to D_x$ by S1 with input $1_x$, $35_x$ (in hexadecimal).
S box input   Possible keys
06, 32        07, 33
10, 24        11, 25
16, 22        17, 23
1C, 28        1D, 29
corresponding possible keys are described in Table 7. The right key must occur in
both tables. The only common key values in Tables 6 and 7 are $17_x$ and $23_x$. These
two values are indistinguishable with this input XOR since $17_x \oplus 23_x = 34_x = S1'_E$,
but may become distinguishable by using a pair with a different input XOR value
($S1'_E \neq 34_x$).
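The key search of Example 7 can be reproduced by trying all 64 values of the six key bits and keeping those consistent with the observed output XOR. The sketch below is a brute-force restatement of the table lookups for S1, assuming the usual DES indexing convention; `key_candidates` is a hypothetical helper name.

```python
# S1 of DES (Table 3).
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def s1(x):
    """Look up a 6-bit input: outer bits give the row, inner bits the column."""
    row = ((x >> 4) & 2) | (x & 1)
    col = (x >> 1) & 0xF
    return S1[row][col]


def key_candidates(s_e, s_e_star, output_xor):
    """Six-bit key values consistent with one input pair and its output XOR."""
    return {k for k in range(64)
            if s1(s_e ^ k) ^ s1(s_e_star ^ k) == output_xor}


first = key_candidates(0x01, 0x35, 0xD)    # the pair behind Table 6
second = key_candidates(0x21, 0x15, 0x3)   # the additional pair of Example 7
common = first & second

print(sorted(f"{k:02X}" for k in first))
print(sorted(f"{k:02X}" for k in common))
```

The two surviving values differ by the input XOR $34_x$, so pairs with this input XOR alone can never separate them.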
Example 8. Assume we have a ciphertext pair whose plaintext XOR is known and
the values of the six bits 64, 33, ..., 37 of the plaintext XOR are zero. The input
XOR of the first round is zero in all the bits entering S1 ($S1'_{Ea} = S1'_{Ia} = 0$) and thus
the output XOR of S1 in the first round must be zero ($S1'_{Oa} = 0$). The left half of the
ciphertext is calculated as the XOR value of the left half of the plaintext, the output
of the first round and the output of the third round ($l = L \oplus A \oplus C$). Since the
plaintext XOR and the ciphertext XOR are known and the output XOR of S1 in
the first round is known as well, the output XOR of S1 in the third round can be
calculated. The input pair $S1_{Ec}$, $S1^*_{Ec}$ in the third round is easily extractable from
the ciphertext pair.
If the input pair of S1 in the third round is $S1_{Ec} = 1_x$, $S1^*_{Ec} = 35_x$ and the output
XOR is $S1'_{Oc} = D_x$, then the value of $S1_{Kc}$ can be found as in Example 7 and it must
appear in Table 6. Using additional pairs we can discard some of the possible values
until we get a unique value of $S1_{Kc}$. Since $S1'_{Ec}$ is not constant, there should not be
any indistinguishable values of the subkey.
The following definition extends Definitions 3 and 4 for use with the F function:
Definition 5. Let X and Y be 32-bit values. We say that X may cause Y with
probability p by the F function if for a fraction p of all the possible input pairs
encrypted by all the possible subkey values in which the input XOR of the F function
equals X, the output XOR equals Y. If $p > 0$ we denote this possibility by
$X \to Y$.
Proof. To prove the lemma it suffices to show the property for each of the S boxes.
For each input XOR of the data $S'_E$ there is $S'_I = S'_E$ regardless of $S_K$. If there are $k$
possible input pairs to the S box with this input XOR that may cause a given output
XOR, we can choose precisely $k$ key values $S_K = S_E \oplus S_I$, each taking the fixed
input pair $S_E$, $S^*_E$ to one of the possible input pairs $S_I$, $S^*_I$ of the S box and thus
causing the given output XOR. Thus, the fraction p is held constant for all the input
pairs, and therefore equals the average over all the input pairs. []
In other iterated cryptosystems this lemma does not necessarily hold. However, we
assume that the fraction is very close to p, which is usually the case.
The above discussion about finding the key bits entering S boxes can be extended
to find the subkeys entering the F function. The method is as follows:
1. Choose an appropriate plaintext XOR.
2. Create an appropriate number of plaintext pairs with the chosen plaintext
XOR, encrypt them and keep only the resultant ciphertext pairs.
3. For each pair derive the expected output XOR of as many S boxes in the last
round as possible from the plaintext XOR and the ciphertext pair. (Note that
the input pair of the last round is known since it appears as part of the ci-
phertext pair.)
4. For each possible key value, count the number of pairs that result with the
expected output XOR using this key value in the last round.
5. The right key value is the (hopefully unique) key value suggested by all the
pairs.
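The counting in steps 4 and 5 can be sketched for a single S box. The example below is a simplification and not the full attack: it assumes the output XOR of S1 is known exactly for every pair (no wrong pairs), uses a hypothetical secret key value and hypothetical input XORs, and works on S1 alone with the usual DES indexing convention.

```python
import random
from collections import Counter

# S1 of DES (Table 3).
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def s1(x):
    """Look up a 6-bit input: outer bits give the row, inner bits the column."""
    row = ((x >> 4) & 2) | (x & 1)
    col = (x >> 1) & 0xF
    return S1[row][col]


def count_keys(pairs):
    """For each six-bit key value, count the pairs whose output XOR it explains."""
    counts = Counter()
    for s_e, s_e_star, output_xor in pairs:
        for k in range(64):
            if s1(s_e ^ k) ^ s1(s_e_star ^ k) == output_xor:
                counts[k] += 1
    return counts


rng = random.Random(1)
secret = 0x23                                  # hypothetical key bits entering S1
pairs = []
for input_xor in (0x34, 0x0C, 0x21):           # hypothetical choice of input XORs
    for _ in range(4):
        s_e = rng.randrange(64)
        s_e_star = s_e ^ input_xor
        pairs.append((s_e, s_e_star, s1(s_e ^ secret) ^ s1(s_e_star ^ secret)))

counts = count_keys(pairs)
best = max(counts.values())
winners = {k for k, c in counts.items() if c == best}

print(sorted(f"{k:02X}" for k in winners))
```

With exact output XORs the right key is suggested by every pair, so it always attains the maximum count. In the real attack the expected output XOR holds only with the probability of the characteristic, and the right key is the one that stands out above the noise.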
We are left with the problem of pushing the knowledge of the XORs of the
plaintext pairs as many rounds as possible (in step 3) without making them all
zeros. When the XORs of the pairs are zero, i.e., both texts are equal, the outputs
are equal too, which makes all the keys equally likely. The pushing mechanism is
a statistical characteristic of the cryptosystem which is an extension of the single
round analysis. Before we define it formally we give an informal definition and three
examples.
Definition 6 (informal). Associated with any pair of encryptions are the XOR
value of its two plaintexts, the XOR of its ciphertexts, the XORs of the inputs of
each round in the two executions, and the XORs of the outputs of each round in
the two executions. These XOR values form an n-round characteristic. A character-
istic has a probability, which is the probability that a random pair with the chosen
plaintext XOR has the round and ciphertext XORs specified in the characteristic.
We denote the plaintexts XOR of a characteristic by $\Omega_P$ and its ciphertexts XOR
by $\Omega_T$.
The following example describes a simple one-round characteristic with proba-
bility 14/64.
Example 10. In this one-round characteristic all the S box input XORs except one
are zero. One S box input XOR is not zero, and is chosen to maximize the probabil-
ity that the input XOR may cause the output XOR. Since there are several input
bits that enter two neighboring S boxes by the E expansion we have to ensure
that the XORs of these bits are zero. There are only two private bits entering
each S box. These bits can have nonzero XOR values. The best such probability for
S1 is 14/64 (i.e., there is an entry that contains 14 pairs that does not cause the input
of the neighboring S2 or S8 to be nonzero). Thus, it is easy to get a one-round
characteristic with probability 14/64 which is
S1: $0C_x \to E_x$ with probability 14/64,
S2, ..., S8: $00_x \to 0_x$ always.
This characteristic can also be written (for any L') as
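$\Omega_P = (L', 60\,00\,00\,00_x)$
$a' = 60\,00\,00\,00_x \to A' = P(E0\,00\,00\,00_x) = 00\,80\,82\,00_x$ with probability 14/64
$\Omega_T = (L' \oplus 00\,80\,82\,00_x, 60\,00\,00\,00_x)$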
One-round characteristics with probability 1/4 are possible using nonzero input
XOR in S2 or S6.
The following example describes a two-round characteristic which is easily
obtained by concatenating the two one-round characteristics that are described in
Examples 10 and 9:
$\Omega_P = 00\,80\,82\,00\ 60\,00\,00\,00_x$
$a' = 60\,00\,00\,00_x \to A' = 00\,80\,82\,00_x$ with probability 14/64
$b' = 0 \to B' = 0$ with probability 1
$\Omega_T = 60\,00\,00\,00\ 00\,00\,00\,00_x$
Definition 7. An n-round characteristic is a tuple $\Omega = (\Omega_P, \Omega_\Lambda, \Omega_T)$ where $\Omega_P$ and
$\Omega_T$ are m-bit numbers and $\Omega_\Lambda$ is a list of n elements $\Omega_\Lambda = (\Lambda_1, \Lambda_2, \ldots, \Lambda_n)$, each
of which is a pair of the form $\Lambda_i = (\lambda_I^i, \lambda_O^i)$ where $\lambda_I^i$ and $\lambda_O^i$ are (m/2)-bit numbers
and m is the block size of the cryptosystem. A characteristic satisfies the following
requirements:
$\lambda_I^1$ = the right half of $\Omega_P$,
$\lambda_I^2$ = the left half of $\Omega_P \oplus \lambda_O^1$,
$\lambda_I^n$ = the right half of $\Omega_T$,
$\lambda_I^{n-1}$ = the left half of $\Omega_T \oplus \lambda_O^n$,
The following definitions and theorem deal with the probability of characteristics:
$p^{\Omega} = \prod_{i=1}^{n} p_i^{\Omega}$.
Note that by Definitions 9 and 11 the probability of a characteristic $\Omega$ which is
the concatenation of the characteristic $\Omega^1$ with the characteristic $\Omega^2$ is the product
of their probabilities: $p^{\Omega} = p^{\Omega^1} \cdot p^{\Omega^2}$. As a result, every n-round characteristic can be
described as the concatenation of n one-round characteristics with probability
which is the product of the one-round characteristics' probabilities.
Proof. The probability of any fixed plaintext pair satisfying $P' = \Omega_P$ to be a right
pair is the probability that at all the rounds $i$: $\lambda_I^i \to \lambda_O^i$. The probability at each
round is independent of its exact input (as proved in Lemma 1) and independent of
the action of the previous rounds (since the independent keys completely randomize
the inputs to each S box, leaving only the XOR value fixed). Therefore, the
probability of a pair to be a right pair is the product of the probabilities of $\lambda_I^i \to \lambda_O^i$,
which was defined above as the probability of the characteristic. □
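$\Omega_P = 00\,80\,82\,00\ 60\,00\,00\,00_x$
$a' = 60\,00\,00\,00_x \to A' = 00\,80\,82\,00_x$ with probability 14/64
$b' = 0 \to B' = 0$ with probability 1
$c' = 60\,00\,00\,00_x \to C' = 00\,80\,82\,00_x$ with probability 14/64
$\Omega_T = 00\,80\,82\,00\ 60\,00\,00\,00_x$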
where in the fourth round $d' = b' \oplus C' = C' = A'$. We see that when the plaintexts
differ in the five specified bit locations, with probability about 0.05 there is a difference
of only three bits at the input of the fourth round. After the bit expansion, five
S boxes have nonzero input XOR and three have zero input XORs and thus zero
output XORs. In this case it is possible to deduce 12 bits of $e'$ by $e' = c' \oplus D'$.
This structure of three rounds with a zero input XOR in the middle round is very
useful and forms the best possible probability for three-round characteristics. 3 A
similar structure can be used in five-round characteristics. The middle round has
zero input and output XORs and there is a symmetry around it, i.e.,
3 Since fewer than two differing S boxes are impossible and there are characteristics of this structure
with two differing S boxes, each with the best possible probability (1/4).
( $\Omega_P = (L', R')$ )
$D' = R' \leftarrow d' = L' \oplus A'$ with probability $p_b$
( $\Omega_T = \Omega_P = (L', R')$ )
where in the sixth round $f' = d' \oplus E' = b' \oplus A' = L'$. The existence of a string
$b' \to a' \to A'$ ensures the existence of such a five-round characteristic. The
characteristic's probability is quite low since three S box inputs must differ in both rounds
$b' \to a'$ and $a' \to A'$, and six in the whole five-round characteristic. The best
probability for an S box is 16/64 = 1/4. This limits the five-round characteristic's
probability to be lower than or equal to $(1/4)^6 = 1/4096$. In fact, the best known
five-round characteristic has probability about 1/10,486.
Among the most useful characteristics are those that can be iterated.
function that may cause a zero output XOR (i.e., two different inputs yield the same
output). This is possible in DES if at least three neighboring S boxes differ in
the pair (this phenomenon is also described in [5] and [1]). The structure of these
characteristics is described in the following example.
Example 13. If the input XOR of the F function is marked by $\psi$, such that $\psi \to 0$,
then we have the following iterative characteristic:
( $\Omega_P = (L', R') = (\psi, 0)$ )
The best such characteristic has probability about 1/234. A five-round character-
istic based on this iterative characteristic has probability about 1/55,000.
The statistical behavior of most characteristics does not allow us to look for the
intersection of all the keys suggested by the various pairs as we did in Example 7,
since the intersection is usually empty: the wrong pairs do not necessarily list the
right key as a possible value. However, we know that the right key value should
result from all the right pairs which occur (approximately) with the characteristic's
probability. All the other possible key values are fairly randomly distributed: the
expected XOR value (which is usually not the real value in the pair) with the known
ciphertext pair can cause any key value to be possible, and even the wrong key
values suggested by the right pairs are quite random. Consequently, the right key
appears with the characteristic's probability (from right pairs) plus other random
occurrences (from wrong pairs). To find the key we just have to count the number
of occurrences of each of the suggested keys. The right key is likely to be the one
that occurs most often.
Each characteristic lets us look for a particular number of bits in the subkey of
the last round (all the bits that enter some particular S boxes). The most useful
characteristics are those which have a maximal probability and a maximal number
of subkey bits whose occurrences can be counted. Yet it is not necessary to count
on all the possible subkey bits. The advantages of counting on all the possible
subkey bits are the good identification of the right key value and the small amount
of data needed. However, counting the number of occurrences of all the possible
values of a large number of bits usually demands huge memory which can make
the attack impractical. We can count on a smaller number of subkey bits entering
a smaller number of S boxes, and use all the other S boxes only to identify and
discard those wrong pairs in which the input XORs in such S boxes cannot cause
the expected output XORs. Since about 20% of the entries in the pairs XOR distribution
tables of the S boxes are impossible, about 20% of the wrong pairs can be
discarded by each S box before they are actually counted.
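The pairs XOR distribution table referred to here can be tabulated directly. The sketch below does so for S1, assuming the standard published S1 table and the usual DES convention that the two outer input bits select the row and the four middle bits select the column; the function names are illustrative. It prints the fraction of zero (impossible) entries for this one S box rather than asserting a particular value.

```python
# Standard DES S1 table (rows selected by the outer bits, columns by the middle bits).
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]

def sbox(table, x):
    """Apply a DES S box to a six-bit input."""
    row = ((x >> 5) & 1) << 1 | (x & 1)
    col = (x >> 1) & 0xF
    return table[row][col]

def pairs_xor_distribution(table):
    """ddt[dx][dy] = number of inputs x with S(x) xor S(x xor dx) == dy."""
    ddt = [[0] * 16 for _ in range(64)]
    for x in range(64):
        for dx in range(64):
            dy = sbox(table, x) ^ sbox(table, x ^ dx)
            ddt[dx][dy] += 1
    return ddt

ddt = pairs_xor_distribution(S1)
impossible = sum(1 for row in ddt for entry in row if entry == 0)
print("fraction of impossible entries in S1:", impossible / (64 * 16))
```

A pair whose input and output XORs fall on a zero entry cannot be a right pair for that S box, which is the basis of the discarding step described above.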
The following definition gives us a tool to evaluate the usability of a counting
scheme based on a characteristic:
Definition 13. The ratio between the number of right pairs and the average count
in a counting scheme is called the signal-to-noise ratio of the counting scheme and
is denoted by S/N.
To find the right key in a counting scheme we need a high probability character-
istic and enough ciphertext pairs to guarantee the existence of several right pairs.
This means that for a characteristic with probability 1/10,000 we need several tens
of thousands of pairs. How many pairs we need depends on the probability of the
characteristic, the number of key bits that we count on, and the level of identifica-
tion of wrong pairs that can be discarded before the counting. If we are looking for
k key bits, then we count the number of occurrences of $2^k$ possible key values in $2^k$
counters. The counters contain an average count of $m \cdot \alpha \cdot \beta / 2^k$ counts, where m is
the number of pairs, $\alpha$ is the average count per counted pair, and $\beta$ is the ratio of
the counted to all pairs (i.e., counted and discarded). The right key value is counted
about $m \cdot p$ times using the right pairs where p is the characteristic's probability,
plus the random counts estimated above for all the possible keys. The signal-to-noise
ratio of a counting scheme is therefore
$$S/N = \frac{m \cdot p}{m \cdot \alpha \cdot \beta / 2^k} = \frac{2^k \cdot p}{\alpha \cdot \beta}.$$
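The signal-to-noise formula is easy to evaluate numerically. The following Python sketch is one way to do so; the function name and the sample parameters (about four suggested key values per counted S box, so $\alpha = 4^5$ or $4^4$) are illustrative assumptions rather than part of the definition.

```python
def signal_to_noise(k, p, alpha, beta):
    """S/N = 2^k * p / (alpha * beta) for a counting scheme on k key bits.

    k     -- number of key bits counted
    p     -- probability of the characteristic
    alpha -- average count per counted pair
    beta  -- fraction of pairs that are counted rather than discarded
    """
    return (2 ** k) * p / (alpha * beta)

# Illustrative parameter sets (assumed values, not measurements).
sn_30_bits = signal_to_noise(k=30, p=1 / 16, alpha=4 ** 5, beta=1.0)
sn_24_bits = signal_to_noise(k=24, p=1 / 5243, alpha=4 ** 4, beta=0.8)
print(sn_30_bits, sn_24_bits)
```

Note that the number of pairs m cancels out of the ratio, so S/N depends only on the counting scheme and the characteristic, not on the amount of data.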
Example 14. The following four plaintexts form a quartet (where $\psi_1$ and $\psi_2$ are
the plaintext XORs of the characteristics):
1. A random plaintext $P$.
2. $P \oplus \psi_1$.
3. $P \oplus \psi_2$.
4. $P \oplus \psi_1 \oplus \psi_2$.
The two pairs of the first characteristic are the pairs labeled (1, 2) and (3, 4) and the
two pairs of the second characteristic are the pairs labeled (1, 3) and (2, 4).
The use of these structures can be done in two ways. When an attack uses n pairs
of each one of two characteristics we can use n/2 quartets which contain the same
information as each of the n pairs of each characteristic. Thus, we save half the data.
Using three characteristics we can save two-thirds of the data. The other approach
is used when an attack can simultaneously use two characteristics while counting
the same bits. Then we can divide the data so that half of the pairs are based on the
first characteristic and the other half on the second. When quartets can be used we
can save half the data, and when octets can be used we can save two-thirds of the
data.
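A quartet can be built and checked in a few lines. In the sketch below the two 64-bit plaintext XOR constants and the function name are hypothetical placeholders; any two distinct nonzero XOR values behave the same way.

```python
import secrets

def make_quartet(p, xor1, xor2):
    """Return the four plaintexts P, P^xor1, P^xor2, P^xor1^xor2."""
    return [p, p ^ xor1, p ^ xor2, p ^ xor1 ^ xor2]

# Hypothetical plaintext XORs of two characteristics (64-bit values).
psi1 = 0x2000000000000000
psi2 = 0x0222222200000000

p = secrets.randbits(64)
q = make_quartet(p, psi1, psi2)

# Pairs (1, 2) and (3, 4) carry the first XOR; pairs (1, 3) and (2, 4) the second.
print(hex(q[0] ^ q[1]), hex(q[2] ^ q[3]))
print(hex(q[0] ^ q[2]), hex(q[1] ^ q[3]))
```

Four encryptions thus yield two pairs for each characteristic, where independent pairs would need eight.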
( $\Omega_P = \text{20 00 00 00 00 00 00 00}_x$ )
( $\Omega_T = \text{20 00 00 00 00 00 00 00}_x$ )
where in the second round $b' = L' \oplus A' = \text{20 00 00 00}_x$.
In the first round the characteristic has $a' = 0 \to A' = 0$ with probability 1. The
single bit difference between the two plaintexts starts to play a role in the second
round in S1. Since the inputs to S1 differ only in one bit, at least two output
bits must differ. Typically such two bits enter three S boxes in the third round
($c' = a' \oplus B' = B'$), where there is a difference of one bit in each S box input. Thus,
about six output bits differ at the third round. These bits are XORed with the
known difference of the input of S1 in the second round ($d' = b' \oplus C'$), making a
difference of about seven bits in the input of the fourth round and about 11 bits in
the entries of the S boxes (due to the E expansion). Such an avalanche makes it very
likely that the input of all the S boxes differ at the fourth round. Even if an input
of an S box does not differ in one pair it can differ in another pair and the exact
value of d' is usually different for every pair.
The 28 output XOR bits of S2, ..., S8 in $B'$ must be equal to zero since their input
XORs are zero. Since $a' \oplus B' = c' = D' \oplus l'$ (see Fig. 3) then
$$D' = a' \oplus l' \oplus B'. \qquad (1)$$
When the ciphertext pair values $T$ and $T^*$ are known then $d$ and $d^*$ are known to
be their right halves (by $d = r$). Since $a'$, $l'$ and the 28 bits of $B'$ are known, the
corresponding 28 bits of $D'$ are known as well by (1). These 28 bits are the output
XORs of S boxes S2, ..., S8. Thus, we know the values $S_{Ed}$, $S^*_{Ed}$, and $S'_{Od}$ of seven S
boxes in the fourth round.
Fig. 3. DES reduced to four rounds (plaintext P, subkeys K1, K2, K3, K4, ciphertext T).
Given the encrypted pairs we use a separate counting procedure for each one of
the seven S boxes in the fourth round. We try all the 64 possible values of $S_{Kd}$ and
check whether
$$S(S_{Ed} \oplus S_{Kd}) \oplus S(S^*_{Ed} \oplus S_{Kd}) = S'_{Od}.$$
For each key we count the number of pairs for which the test succeeds. The right
key value is suggested by all the pairs since we use a characteristic with probabil-
ity 1 which causes all the pairs to be right pairs. The other 63 key values may occur in
some of the pairs. It is unlikely that a value occurs in all the pairs for which $S'_{Ed}$ are
different and $S'_{Od}$ are different. In rare cases when more than one key value is
suggested by all the pairs a few additional pairs can be tried, or the analysis of the other
key bits can be done in parallel for all the surviving candidates.
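The counting procedure for a single S box can be sketched as follows. This is a minimal illustration using S1 and simulated data: the S1 table is the standard published one, while the function names, the simulated "true" key value, and the way test pairs are generated are assumptions made for the example (the attack itself works on S2, ..., S8 with values derived from real ciphertexts).

```python
import random

S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]

def sbox(x):
    """Six-bit input: outer bits select the row, middle bits the column."""
    return S1[((x >> 5) & 1) << 1 | (x & 1)][(x >> 1) & 0xF]

def count_key_candidates(pairs):
    """pairs: (e, e_star, out_xor) = expanded inputs of both encryptions and
    the expected four-bit output XOR. Returns a count for each of the 64 keys."""
    counts = [0] * 64
    for e, e_star, out_xor in pairs:
        for key in range(64):
            if sbox(e ^ key) ^ sbox(e_star ^ key) == out_xor:
                counts[key] += 1
    return counts

# Simulated data: eight pairs generated under an assumed six-bit subkey.
rng = random.Random(1)
true_key = 0x2A
pairs = []
for _ in range(8):
    e, e_star = rng.randrange(64), rng.randrange(64)
    pairs.append((e, e_star, sbox(e ^ true_key) ^ sbox(e_star ^ true_key)))

counts = count_key_candidates(pairs)
print([k for k, c in enumerate(counts) if c == len(pairs)])
```

Because every simulated pair is a right pair, the true key value is suggested by all of them; any other value printed would be one of the rare surviving alternatives mentioned above.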
So far we have found $7 \cdot 6 = 42$ bits of the subkey of the last round (K4). If the
subkeys are calculated via the DES key scheduling algorithm these are 42 actual
key bits out of the DES 56 key bits, and 14 key bits are still missing. We can now
try all the $2^{14}$ possibilities of the missing bits and decrypt the given ciphertexts
using the resulting keys. The right key should satisfy the known plaintext XOR
value for all the pairs, but the other $2^{14} - 1$ values have only probability $2^{-64}$ to
satisfy this condition.
Some researchers have proposed to strengthen DES by making all the subkeys
Ki independent (or at least to derive them in a more complicated way from a longer
actual key K). Our attack can be carried out even in this case. To find the six
missing bits of K4 and to find K3 we use another plaintext XOR value with the
following characteristic $\Omega^2$:
( $\Omega_P = \text{02 22 22 22 00 00 00 00}_x$ )
$a' = 0 \to A' = 0$ always
( $\Omega_T = \text{02 22 22 22 00 00 00 00}_x$ )
and d' are known. The counting method is used again to count the number of
occurrences of the possible keys of all the eight S boxes at the third round. The
values that are counted for all the pairs are likely to be the right key values. As a
result the complete K3 is found with high probability.
The $P'$ values used above are insufficient to find a unique K2 since the $S'_{Eb}$ are
constant for all the pairs, and thus the right key values are indistinguishable from
the alternative key values obtained by XORing them with $S'_{Eb}$. Although we can
find these two possibilities for each S box, i.e., $2^8$ possibilities for K2, we cannot use
the above XOR values to find K1 since in both XOR values there is $R' = 0$ and thus
$a' = 0$ and $A' = 0$. Note that
$$a' = 0 \to A' = 0$$
happens regardless of the key and thus all the possible values of K1 are equally
likely using these XOR values. To solve this problem we have to use an additional
characteristic which has a nonzero input XOR for all the S boxes of the first round.
In addition we want to be able to distinguish the key values of all the S boxes so
we choose two characteristics $\Omega^3$ and $\Omega^4$. These characteristics can be chosen
arbitrarily under the following two conditions:
• $S'_{Ea} \neq 0$ for all the S boxes using either $\Omega^3$ or $\Omega^4$.
• For every particular S box, $S'_{Ea}$ of the characteristic $\Omega^3$ is different from $S'_{Ea}$ of
$\Omega^4$.
Then b and b* are known by decryption of the third round and B' is known by
$$B' = a' \oplus c' = R' \oplus c'.$$
The counting method is used to find K2. This time it has to use the appropriate R'
value for each pair. Now a, a*, and a' are known by decryption of the second round
and A' is known by
$$A' = L' \oplus b'.$$
The counting method finds K1. Using K1, K2, K3, and K4 we can decrypt the
original ciphertexts to get the corresponding plaintexts and then verify their plain-
text XOR values. If we find only one possibility for all the subkeys the verification
must succeed. If several possibilities are found, then only one of them is likely to be
verified successfully, and thus the right key can be identified.
Typically, 16 encryptions are sufficient for this attack. These 16 encryptions
contain eight pairs of the characteristic $\Omega^1$, eight pairs of $\Omega^2$, four pairs of $\Omega^3$, and four
pairs of $\Omega^4$. In order not to increase the amount of data needed we use two octets
that occupy four pairs of each of three plaintext XORs.
The cryptanalysis of DES reduced to six rounds is more complex than the crypt-
analysis of the four-round version. We use two statistical characteristics with prob-
ability 1/16, and choose the key value that is counted most often. Each one of the
two characteristics lets us find the 30 key bits of K6 which are used at the input of
five S boxes in the sixth round, but three of the S boxes are common so the total
number of key bits found by the two characteristics is 42. The other 14 key bits can
be found later by means of exhaustive search or by a more careful counting on the
key bits entering the eighth S box in the sixth round.
The first characteristic $\Omega^1$ is
( $\Omega_P = \text{40 08 00 00 04 00 00 00}_x$ )
$B' = 0 \leftarrow b' = 0$ always
( $\Omega_T = \text{40 08 00 00 04 00 00 00}_x$ )
( $\Omega_P = \text{00 20 00 08 00 00 04 00}_x$ )
$B' = 0 \leftarrow b' = 0$ always
$C' = \text{00 20 00 08}_x \leftarrow c' = \text{00 00 04 00}_x$ with probability 1/4
( $\Omega_T = \text{00 20 00 08 00 00 04 00}_x$ )
XORs are as expected. The verification of most of the 64 possibilities of the six
missing bits of K6 should fail, and with high probability only one possibility sur-
vives. This value completes K6. Only eight key bits are missing now. They can be
found by trying all the 256 possibilities, or by applying a similar analysis to key bits
that enter S boxes in the fifth round.
How much data is needed? The signal-to-noise ratio of the first part of the algo-
rithm (which finds 30 key bits) is
$$S/N = \frac{2^{30} \cdot 1/16}{4^5} = 2^{30-4-10} = 2^{16}.$$
The S/N is high and thus only seven or eight right pairs of each characteristic are
needed. Since the characteristics' probability is 1/16, we need about 120 pairs of
each characteristic for the analysis. The S/N of the later part is
$$S/N = \frac{2^6 \cdot 1}{4} = 16.$$
This is lower, but we do not care since we can almost certainly identify and use only
the seven or eight right pairs from the first part (while eliminating most of the noise)
and intersect the sets of possible key values. To reduce the number of ciphertexts
needed we use quartets which combine the two characteristics. As a result only 240
ciphertexts (representing 120 pairs of each characteristic) are needed for the com-
plete cryptanalysis.
In order to decrease the amount of memory needed in the first part of this attack
we devised an equivalent but faster counting algorithm that uses negligible memory
and can count on all the countable subkey bits simultaneously. This algorithm can
be used in any counting scheme that needs a huge memory but analyses a relatively
small number of pairs (after filtering out all the identifiable wrong pairs). The idea
behind this algorithm is to describe the pairs and the possible key values by a graph.
In this graph each pair is a vertex and every two pairs which suggest a common key
value have a connecting edge labeled by this value. Thus, each key value forms a
clique which contains all its suggesting pairs. The largest clique corresponds to the
key value which is counted by the largest number of pairs. In our implementation,
for each of the five S boxes we keep a bit mask of 64 bits, one bit for each possible
key. Given the values of $S_E$, $S^*_E$, and $S'_O$ we set the bits of the key masks that
correspond to possible keys. Each pair has five such key masks, one for every S box. A
clique is defined as a set of pairs for which for each of the five key masks there is a
common bit set in all the pairs in the set (i.e., the binary "and" operation is nonzero
for all the five key masks). Finding the largest clique can be done in the following
way: first compare the key masks for every pair with all the following pairs in the
pairs list. At each comparison there is usually at least one key mask without any
common bit set. For the remaining possibilities we try to "and" the result with third
pairs, fourth pairs, and so on until no more pairs can be added to the clique. Given
the largest clique we can easily compute the corresponding key bits by looking at
each key mask for the key value it represents.
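The clique search just described can be sketched as a small recursive routine over bit masks. This is a simplified illustration, not the authors' program: the function names, the tuple-of-integers representation of the five key masks, and the toy input are all assumptions.

```python
FULL_MASK = (1 << 64) - 1  # one bit per possible six-bit key value

def and_masks(a, b):
    """Bitwise 'and' of two tuples of key masks, S box by S box."""
    return tuple(x & y for x, y in zip(a, b))

def largest_clique(pair_masks):
    """Return (member indices, common masks) of the largest set of pairs whose
    key masks share at least one bit in every S box."""
    best_members, best_masks = [], ()

    def extend(members, masks, start):
        nonlocal best_members, best_masks
        if len(members) > len(best_members):
            best_members, best_masks = list(members), masks
        for j in range(start, len(pair_masks)):
            merged = and_masks(masks, pair_masks[j])
            if all(merged):  # every S box still has a common key bit
                members.append(j)
                extend(members, merged, j + 1)
                members.pop()

    if pair_masks:
        extend([], (FULL_MASK,) * len(pair_masks[0]), 0)
    return best_members, best_masks

def bit(k):
    return 1 << k

# Toy data: four pairs, five key masks each (hypothetical values).
pair_masks = [
    (bit(5) | bit(9), bit(7), bit(1), bit(2), bit(3)),
    (bit(9), bit(8), bit(1), bit(2), bit(3)),
    (bit(5), bit(7) | bit(8), bit(1), bit(2), bit(3)),
    (bit(5), bit(7), bit(1), bit(2), bit(3)),
]
members, masks = largest_clique(pair_masks)
# Highest surviving candidate per S box; normally exactly one bit remains.
keys = [m.bit_length() - 1 for m in masks]
print(members, keys)
```

The search is exponential in the worst case, which is acceptable only because, as noted above, most comparisons fail immediately once the identifiable wrong pairs have been filtered out.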
Using the clique algorithm with 240 ciphertexts it takes about 0.3 seconds on a
COMPAQ personal computer to find the key in 95% of the tests conducted on DES
reduced to six rounds. When 320 ciphertexts are used the program succeeds in
almost all the cases. The program uses about 100K bytes of memory, most of which
is devoted to various preprocessed tables used to speed up the algorithm.
DES reduced to eight rounds can be broken using about 25,000 ciphertext pairs for
which the plaintext XOR is $P' = \text{40 5C 00 00 04 00 00 00}_x$. The method finds 30
bits of K8. Eighteen additional key bits can be found using similar manipulations
on the pairs. The remaining eight key bits can be found using exhaustive search.
The following characteristic is used in this analysis:
( $\Omega_P = \text{40 5C 00 00 04 00 00 00}_x$ )
$B' = \text{04 00 00 00}_x = P(\text{00 10 00 00}_x) \leftarrow b' = \text{00 54 00 00}_x$ with probability $\frac{10 \cdot 16}{64 \cdot 64}$
( $\Omega_T = \text{40 5C 00 00 04 00 00 00}_x$ )
This characteristic has probability 1/10,486. The input XOR in the sixth round
of a right pair is
$$f' = d' \oplus E' = b' \oplus A' = L' = \text{40 5C 00 00}_x.$$
by 80% of the pairs. Therefore, the probability of $e' \to E'$ is $\frac{16}{64} + 0.8 \cdot \frac{20}{64} = \frac{32}{64} = \frac{1}{2}$.
The probability of the five-round modified characteristic is
$(16 \cdot 10 \cdot 16/64^3) \cdot (16 \cdot 10 \cdot 32/64^3) \approx 1/5243$. The signal-to-noise ratio of a counting scheme which
counts on the 24 subkey bits entering S2, S6, S7, and S8 is $S/N = 2^{24}/(4^4 \cdot 0.8 \cdot 5243) \approx 15.6$.
This signal-to-noise ratio allows us to use only about five right pairs. Therefore,
it uses a total amount of about 25,000 pairs. The signal-to-noise ratio of a
counting scheme which counts on 18 subkey bits entering three S boxes out of
S2, S6, S7, and S8 is $S/N = 2^{18}/(4^3 \cdot 0.8^2 \cdot 5243) \approx 1.2$. This counting scheme which
counts on 18 bits needs 150,000 pairs and has an average of about 24 counts for
any wrong key value and about 53 counts for the right key value (53 = 24 +
150,000/5243 = 24 + 29).
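The expected counter values quoted for the 18-bit scheme follow from the average-count expression $m \cdot \alpha \cdot \beta / 2^k$ and from $m \cdot p$. The sketch below recomputes them; the function name is illustrative, and the choice $\alpha = 4^3$, $\beta = 0.8^2$ is an assumption read off the S/N expression for this scheme.

```python
def expected_counts(m, k, p, alpha, beta):
    """Return (average count of any key value, extra counts of the right key)."""
    noise = m * alpha * beta / 2 ** k
    signal = m * p
    return noise, signal

noise, signal = expected_counts(
    m=150_000, k=18, p=1 / 5243, alpha=4 ** 3, beta=0.8 ** 2
)
print(f"wrong key: about {noise:.1f}, right key: about {noise + signal:.1f}")
```

The results land slightly below the rounded figures of 24 and 53 given in the text, which is consistent with the text rounding each term upward.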
A summary of this cryptanalytic method using $2^{18}$ memory cells is as follows:
1. Set up an array of $2^{18}$ counters which is initialized by zeros. The array
corresponds to the $2^{18}$ values of the 18 key bits of K8 entering S6, S7, and S8.
2. Preprocess the possible values of $S_I$ that satisfy each $S'_I \to S'_O$ for the eight S
boxes into a table. This table is used to speed up the program.
3. For each ciphertext pair do:
(a) Assume $h' = r'$, $H' = l'$, and $h = r$. Calculate $S'_{Eh} = S'_{Ih}$ and $S'_{Oh}$ for S2, S5,
..., S8 by $h'$ and $H'$. Calculate $S_{Eh}$ for S6, S7, and S8 by $h$.
(b) For each one of the S boxes S2, S5, S6, S7, and S8 check if $S'_{Ih} \to S'_{Oh}$. If
$S'_{Ih} \not\to S'_{Oh}$ for one of the S boxes, then discard the pair as a wrong pair.
(c) For each one of the S boxes S6, S7, and S8: fetch from the preprocessed
table all the values of $S_{Ih}$ which are possible for $S'_{Ih} \to S'_{Oh}$. For each
possible value calculate $S_{Kh} = S_{Ih} \oplus S_{Eh}$. Increment by one all the counters
corresponding to combinations of the possible values of $S6_{Kh}$, $S7_{Kh}$, and
$S8_{Kh}$.
4. Find the entry in the array that contains the maximal count. The entry index
is most likely to be the real value of $S6_{Kh}$, $S7_{Kh}$, and $S8_{Kh}$ which is the value
of the 18 bits 31, ..., 48 of K8.
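Steps 1, 3(c), and 4 of this summary can be sketched in Python as follows. The sketch assumes that the per-S-box candidate key values have already been produced by the filtering and table lookup steps; the function names and the packing order of the three six-bit values into an 18-bit index are assumptions.

```python
from itertools import product

# Step 1: 2^18 one-byte counters (one byte is enough while counts stay below 256).
counters = bytearray(2 ** 18)

def add_pair(counters, s6_keys, s7_keys, s8_keys):
    """Step 3(c): increment the counter of every combination of the candidate
    six-bit key values suggested by one pair for S6, S7, and S8."""
    for k6, k7, k8 in product(s6_keys, s7_keys, s8_keys):
        counters[(k6 << 12) | (k7 << 6) | k8] += 1

def most_counted(counters):
    """Step 4: index of the counter holding the maximal count."""
    return max(range(len(counters)), key=counters.__getitem__)

# Two hypothetical pairs that agree only on the combination (2, 3, 5).
add_pair(counters, [1, 2], [3], [4, 5])
add_pair(counters, [2], [3, 7], [5])
best = most_counted(counters)
print(best >> 12, (best >> 6) & 0x3F, best & 0x3F)
```

Using a `bytearray` mirrors the one-byte-per-counter layout, at the cost of an overflow error if a count ever reached 256.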
To find the other bits, we filter all the pairs and leave just the pairs with the
expected $S'_O$ value using the known values of $h$ and the known bits of K8 entering
S6, S7, and S8. The expected number of the remaining pairs is 53.
The next bits we are looking for are the 12 bits of K8 that correspond to S2 and
S5. We use a similar counting method (exploiting the enhanced S/N created by the
higher concentration of right pairs) and then filter more pairs. A wrong pair is not
discarded by either this filter or its predecessor with probability $2^{-20}$ and thus
almost all the remaining pairs are right pairs.
Using the known subkey bits of K8 we can calculate the values of 20 bits of each
of $H$ and $H^*$ for each pair and thus 20 bits of each of $g$ and $g^*$ (by $g = l \oplus H$).
Table 9 shows the dependence of the $g$ bits and the subkey bits of K7 at the seventh
round on the known and unknown subkey bits of K8 at the eighth round. The
digits 1, 3, and 4 mean that they depend on the value of the unknown key bits
entering the corresponding S box in the eighth round. "+" means that it depends
only on the known bits of K8. Eight key bits are not used at all in K8 and are
marked by ".".
The expected value of $G'$ is known by the formula $G' = f' \oplus h'$. We can now look
for the 18 missing bits of K8 by exhaustive search of $2^{18}$ possibilities for every pair.
Thus we know $H$, $H^*$ and $g$, $g^*$ and 40 bits of K7. For each pair we check that the
expected value of G' holds. For the right value of those 18 key bits the expected
G' holds for almost all the filtered pairs. All the other possible values satisfy the
expected G' value only for a few pairs (usually two or three pairs while the right
value holds for 15 pairs). To save computer time we search primarily for the 12 key
bits entering S1 and $4 in the eighth round. They suffice to compute S3~g as seen
in Table 9. By similar methods we find these 12 bits and then find the other six
bits. This completes the calculation of the 48 bits of K8. Only eight key bits are still
missing and they can be found by exhaustive search of 256 cases, using one pair of
ciphertexts, and verifying that the plaintext XOR is as expected.
To save disk space we can filter the pairs as soon as they are created and discard
all the identifiable wrong pairs (leaving $0.8^5 \approx 1/3$ of all the pairs). Therefore, in the
case of counting on 24 bits, the 25,000 pairs are reduced to about 7500 pairs. For
the case of counting on 18 bits we devised another criterion which discards most of
the wrong pairs while leaving almost all the right pairs. This criterion is based on
a carefully chosen weighting function and discards any pair whose weight is lower
than a particular threshold. This criterion is the extension of the filtering of the
identifiable wrong pairs (where the threshold is actually zero) and is based on the
idea that a right pair typically suggests more possible key values than a wrong pair.
The weighting function is the product of the number of possible keys of each of the
five countable S boxes (i.e., the number in the corresponding entry in the pairs XOR
distribution tables). The threshold is chosen to maximize the amount of discarded
pairs, while leaving as many right pairs as possible. The best threshold value was
experimentally found to be 8192 which discards about 97% of the wrong pairs and
leaves almost all the right pairs. This reduces the number of pairs we actually
analyze from 150,000 to about 7500, with a corresponding reduction in the running
time of the attack.
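The weighting criterion can be expressed compactly. In the sketch below, each pair is represented by the five entries it hits in the pairs XOR distribution tables of the countable S boxes; the function names and the sample entries are hypothetical, and only the threshold 8192 comes from the text.

```python
from math import prod

THRESHOLD = 8192  # experimentally chosen value quoted in the text

def pair_weight(table_entries):
    """Product of the numbers of possible keys of the five countable S boxes."""
    return prod(table_entries)

def keep_pair(table_entries, threshold=THRESHOLD):
    """Discard any pair whose weight is lower than the threshold."""
    return pair_weight(table_entries) >= threshold

# Hypothetical pairs, described by their five distribution-table entries.
print(keep_pair([8, 8, 8, 4, 4]))     # weight 8192
print(keep_pair([8, 8, 8, 4, 2]))     # weight 4096
print(keep_pair([16, 16, 16, 16, 0])) # an impossible entry gives weight 0
```

With a threshold of zero replaced by any positive value, a pair containing an impossible entry is always discarded, so this filter extends the basic identification of wrong pairs.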
The attacking program finds the key in less than 2 minutes on a COMPAQ
personal computer with 95% success rate (using 150,000 pairs). Using 250,000 pairs
the success rate is increased to almost 100%. The program uses 460K bytes of
memory, most of it for the counting array (one byte suffices for each counter since the
maximum count is about 53, and thus the total array size is $2^{18}$ bytes), and the
preprocessed speed up tables. The program which counts using $2^{24}$ memory cells
finds the key using only 25,000 pairs.
( $\Omega_T = \text{04 00 00 00 40 5C 00 00}_x$ )
This characteristic has probability $12 \cdot 14 \cdot 16/64^3 \approx 1/100$ and thus the
probability of the concatenated six-round characteristic is about 1/1,000,000.
DES reduced to nine rounds can be broken using 30 million pairs by a method
based on this six-round characteristic and using an array of size $2^{30}$ with
$S/N = 2^{30}/(4^5 \cdot 1{,}000{,}000) \approx 1$. The first part of the algorithm that finds the first 30 key bits
is almost the same as in the eight-round algorithm except that it counts on all the
30 bits at once. The second part of the algorithm that uses Table 9 is slightly
different since the key scheduling at the ninth round is based on a shift of one bit
instead of two bits. The input part stays the same.
( $\Omega_P = (\psi, 0) = \text{19 60 00 00 00 00 00 00}_x$ )
$b' = \text{19 60 00 00}_x$
( $\Omega_T = (0, \psi) = \text{00 00 00 00 19 60 00 00}_x$ )
where $\psi = \text{19 60 00 00}_x$.
Due to the importance of this iterative characteristic, throughout this paper we
refer to it as the iterative characteristic.
Table 11. The probability of the iterative characteristic versus number of rounds.
Number of rounds    Probability
3                   1/234
5                   1/55,000
7                   $\approx 2^{-24}$
9                   $\approx 2^{-32}$
11                  $\approx 2^{-40}$
13                  $\approx 2^{-48}$
15                  $\approx 2^{-56}$
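The entries of Table 11 can be approximated by raising the two-round probability 1/234 to the number of nontrivial rounds. The sketch below assumes that an odd number of rounds $2j + 1$ contains $j$ such rounds; the function name is illustrative.

```python
from math import log2

ONE_ITERATION = 1 / 234  # probability of one nontrivial round, as quoted in the text

def log2_probability(rounds):
    """log2 of the iterative characteristic's probability over an odd number of rounds."""
    nontrivial_rounds = (rounds - 1) // 2
    return nontrivial_rounds * log2(ONE_ITERATION)

table = {rounds: log2_probability(rounds) for rounds in range(3, 16, 2)}
for rounds, exponent in table.items():
    print(f"{rounds:2d} rounds: about 2^{exponent:.1f}")
```

The exponents obtained this way lie within one bit of the rounded powers of two listed in the table.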
Proof. $S'_{Eb} \neq 0$ only at three S boxes: S1, S2 and S3, for which
$S1'_{Eb} = S1'_{Ib} = 03_x \to S1'_{Ob} = 0$ with probability 14/64,
and for the next round (without loss of generality we use the notation of a five-round
characteristic)
$f' = \psi$,
and five of its S boxes satisfy $S'_{Ef} = 0$.
Proof. The results of this theorem are derived from Definition 11 and Lemma 2.
The XOR data during the intermediate rounds looks like:
$\Omega_P = (\psi, 0)$
$a' = 0 \to A' = 0$ always,
$b' = \psi \to B' = 0$ with probability about 1/234,
$c' = a' \oplus B' = 0 \to C' = 0$ always,
$d' = \psi \to D' = 0$ with probability about 1/234,
$e' = c' \oplus D' = 0 \to E' = 0$ always,
Note. There is another value for which Lemma 2 and Theorem 2 hold with the
same probabilities. This value is $\psi^\dagger = \text{1B 60 00 00}_x$. There are several additional
values for which the probabilities are smaller. The best of them is $\psi^\ddagger = \text{00 19 60 00}_x$
for which the probability is exactly 1/256. The extension of this iterative
characteristic to 15 rounds has probability $2^{-56}$.
There are several possible types of attack, depending on the number of additional
rounds in the cryptosystem that are not covered by the characteristic itself. The
attack on DES reduced to eight rounds in Section 5 uses a five-round characteristic
and there were three additional rounds. This kind of attack is called a 3R-attack.
The other kinds of attacks are a 2R-attack, with two additional rounds, and a
1R-attack, with one additional round (where the characteristic causes $r'$ to be fixed).
A 0R-attack is also possible but it can be reduced to a 1R-attack with better
statistics and the same S/N. A 0R-attack has the advantage that the right pairs can be
recognized almost without mistakes (the probability of a wrong pair to survive is
$2^{-64}$) and thus the memory requirements can become negligible using the clique
method. For a fixed cryptosystem it is advisable to use the shortest possible
characteristic due to its better statistics. Thus, a 3R-attack is advisable over a 2R-attack
and both are advisable over a 1R-attack.
In the following sections the actual attacks on DES reduced to 8-16 rounds are
described. All these attacks find some bits of the subkey of the last round. The other
bits of the subkey of the last round can be calculated using these known bits and a
reduction of the cryptosystem to a smaller number of rounds can be done. Only
eight bits do not appear in the subkey of the last round and they can be found by
trying all the 256 possible keys.
6.1. 3R-Attacks
In 3R-attacks counting can be done on all the bits of the subkey of the last round
entering the S boxes that have zero input XORs at the round that follows the last
round of the characteristic. The four, six, eight, and nine-round attacks described
in the previous sections are of this type.
In DES reduced to eight rounds the first 30 subkey bits can be found using the
iterative characteristic with five rounds (whose probability is about 1/55,000) by an
attack which is similar to the one described in Section 5. Using an array of size $2^{24}$
we have $S/N = 2^{24}/(4^4 \cdot 0.8 \cdot 55{,}000) \approx 1.5$. We need about $2^{20}$ pairs. Using an array
of size $2^{30}$ we have $S/N = 2^{30}/(4^5 \cdot 55{,}000) \approx 19$. About 67% ($1 - 0.8^5$) of the pairs
can be identified in advance as wrong pairs.
6.2. 2R-Attacks
In 2R-attacks counting can be done on all the bits of the subkey of the last round.
Possibility checks can be done for all the previous round S boxes. An S box whose
input XOR is zero should also have an output XOR of zero, i.e., the success rate of
this check is 1/16. For the other S boxes the success rate is about 0.8.
In DES reduced to nine rounds the 48 bits of K9 can be found using $2^{26}$ pairs
using the seven-round characteristic. We know that
$\Omega_P = (\psi, 0)$
$a' = 0 \to A' = 0$ always,
$b' = \psi \to B' = 0$ with probability about 1/234,
$c' = 0 \to C' = 0$ always,
$g' = 0 \to G' = 0$ always,
$h' = \psi \to H' = i' \oplus g' = r'$,
$i' = r' \to I' = h' \oplus l' = l' \oplus \psi$,
$T' = (l', r')$.
We can check that $h' \to H'$ and $i' \to I'$ and count the possible occurrences of the key
bits. At $h' \to H'$ five S boxes satisfy $S'_{Eh} = S'_{Ih} = 0$ and thus $S'_{Oh}$ must be zero (which
happens for wrong pairs with probability 1/16), while the other three S boxes satisfy
$S'_{Ih} \to S'_{Oh}$ (which happens for wrong pairs with probability 0.8). Therefore the
counting on all the 48 bits of K9 has $S/N = 2^{48} \cdot 2^{-24}/(4^8 \cdot 0.8^3 \cdot (1/16)^5) \approx 2^{29}$ and
counting on 18 bits has $S/N = 2^{18} \cdot 2^{-24}/(4^3 \cdot 0.8^5 \cdot 0.8^3 \cdot (1/16)^5) \approx 2^{11}$. Even a separate
counting on the six key bits entering each S box is possible with
$S/N = 2^6 \cdot 2^{-24}/(4 \cdot 0.8^7 \cdot 0.8^3 \cdot (1/16)^5) \approx 10$. The identification of the wrong pairs leaves only
$0.8^3 \cdot (1/16)^5 \cdot 0.8^8 \approx 2^{-24}$ of the wrong pairs and thus only about one wrong pair is left per each
right pair. The characteristic's probability is $2^{-24}$ and thus we need about $2^{26}$ pairs
for the cryptanalysis. This attack needs more data than the previous 3R-attack on
DES reduced to nine rounds but needs much less memory. Due to the very good
identification of wrong pairs (only about eight pairs are not discarded, four right
pairs and four wrong pairs) it is possible to use the clique method on all the 48 bits.
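The 2R-attack signal-to-noise figures can be recomputed with a small helper. The sketch below assumes the structure used in the expressions above: about four key suggestions per counted S box, a factor 0.8 for each last-round S box that is only checked, and the factor $0.8^3 \cdot (1/16)^5$ for the checks one round earlier. The function name and parameter names are illustrative.

```python
from math import log2

def sn_2r(k_bits, log2_p, counted_boxes):
    """S/N of a 2R counting scheme on k_bits = 6 * counted_boxes subkey bits."""
    avg_keys = 4 ** counted_boxes                   # about four keys per counted S box
    last_round_filter = 0.8 ** (8 - counted_boxes)  # last-round S boxes that are only checked
    prev_round_filter = 0.8 ** 3 * (1 / 16) ** 5    # checks on the preceding round
    noise = avg_keys * last_round_filter * prev_round_filter
    return 2 ** k_bits * 2 ** log2_p / noise

for k_bits, boxes in [(48, 8), (18, 3), (6, 1)]:
    sn = sn_2r(k_bits, log2_p=-24, counted_boxes=boxes)
    print(f"{k_bits:2d} bits: S/N about 2^{log2(sn):.1f}")
```

Calling the helper with `log2_p=-32` gives the corresponding values for the eleven-round attack that uses the nine-round characteristic.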
Eleven rounds can be broken by using the nine-round characteristic with an
array of size $2^{18}$ and $S/N = 2^{18} \cdot 2^{-32}/(4^3 \cdot 0.8^5 \cdot 0.8^3 \cdot (1/16)^5) \approx 6$ using $2^{35}$ pairs. The
clique method can still be used on 48 subkey bits with $S/N = 2^{48} \cdot 2^{-32}/(4^8 \cdot 0.8^3 \cdot (1/16)^5) \approx 2^{21}$
with an identification that leaves $2^{32} \cdot 2^{-24} = 2^8$ wrong pairs per each
right pair.
Thirteen rounds can be broken using the eleven-round characteristic with an
array of size $2^{30}$ and $S/N = 2^{30} \cdot 2^{-40}/(4^5 \cdot 0.8^3 \cdot 0.8^3 \cdot (1/16)^5) \approx 4$ using $2^{43}$ pairs.
The clique method is not possible since $2^{43} \cdot 2^{-24} = 2^{19}$ pairs are not discarded.
Counting schemes on 18 and 24 bits are not advisable due to the low S/N.
Fifteen rounds can be broken using the 13-round characteristic with an array
of size $2^{42}$ and $S/N = 2^{42} \cdot 2^{-48}/(4^7 \cdot 0.8 \cdot 0.8^3 \cdot (1/16)^5) \approx 2.5$ using $2^{51}$ pairs. This is
still faster than exhaustive search, but requires unrealistic amounts of space and
ciphertexts.
6.3. 1R-Attacks
In 1R-attacks counting can be done on all the bits of the subkey of the last round
entering the S boxes with nonzero input XORs. Verification of the values of r' itself
and possibility checks on all the other S boxes in the last round can be done. For
those S boxes with a zero input XOR the output XOR should be zero too, i.e., the
check success rate is 1/16. Since the input XOR is constant we cannot distinguish
between several subkey values. However, the number of such values is small (eight
in all the 1R-attacks described here) and each can be checked later in parallel by
the next part of the algorithm (either via exhaustive search or by a differential
cryptanalytic attack).
Ten rounds can be broken using the nine-round characteristic where
$h' = \psi \rightarrow H' = 0$ with probability 1/234,
$i' = 0 \rightarrow I' = 0$ always,
$j' = \psi = r' \rightarrow J' = l' \oplus i' = l'$.
We can identify the right pairs easily. Those pairs satisfy $r' = \psi$, and the 20 bits in
$l'$ going out of S4, ..., S8 are zero. This also holds for $2^{-52}$ of the wrong pairs. For
the other three S boxes we count the possible values of their 18 key bits with
$S/N = \frac{2^{18} \cdot 2^{-32}}{4^3 \cdot 2^{-52}} = 2^{32}$. Thus we need $2^{34}$ pairs.
Twelve rounds can be broken using the eleven-round characteristic with
$S/N = \frac{2^{18} \cdot 2^{-40}}{4^3 \cdot 2^{-52}} = 2^{24}$ and with $2^{42}$ pairs.
Fourteen rounds can be broken using the 13-round characteristic with
$S/N = \frac{2^{18} \cdot 2^{-48}}{4^3 \cdot 2^{-52}} = 2^{16}$ and with $2^{50}$ pairs.
For 16 rounds we get $S/N = \frac{2^{18} \cdot 2^{-56}}{4^3 \cdot 2^{-52}} = 2^8$ using the 15-round
characteristic. This can be broken using $2^{57}$ pairs. Note that the creation of $2^{57}$ pairs is
more time consuming than exhaustive search for the $2^{56}$ possible keys.
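Since every factor in the 1R-attack ratios is a power of two, the exponents can be tracked with integer arithmetic. The following sketch does this bookkeeping in log2 units; the constant names are illustrative, and the values (18 counted key bits, $4^3$ keys suggested per pair, a $2^{-52}$ survival fraction for wrong pairs) are taken from the text above.

```python
# Exponent bookkeeping for the 1R-attack figures, all in log2 units.
KEY_BITS = 18              # key bits counted (three S boxes)
LOG2_KEYS_PER_PAIR = 6     # 4^3 keys suggested per analyzed pair
LOG2_WRONG_SURVIVE = -52   # fraction of wrong pairs that pass the filter

def log2_signal_to_noise(log2_char_prob):
    """Return log2(S/N) for a characteristic of probability 2^log2_char_prob."""
    return KEY_BITS + log2_char_prob - LOG2_KEYS_PER_PAIR - LOG2_WRONG_SURVIVE

for rounds, log2_p in [(10, -32), (12, -40), (14, -48), (16, -56)]:
    print("%d rounds: S/N = 2^%d" % (rounds, log2_signal_to_noise(log2_p)))
```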
Table 13. Possible inputs and outputs for $32_x \rightarrow 0$ by S2 (in binary).
The XOR value of bit 6 of $S2_I$ and of bit 2 of $S3_I$ equals the XOR value of the
corresponding key bits in $S2_K$ and $S3_K$ since the corresponding bits in $S2_E$ and $S3_E$
are the same bit due to the bit expansion. If their XOR value is known to be
one, then the probability of the iterative characteristic becomes
$\frac{14 \cdot 8 \cdot 8}{64^2 \cdot 32} = \frac{7}{2^{10}} \approx 1/146$. If their XOR value is known
to be zero, then the probability becomes
$\frac{14 \cdot 8 \cdot 2}{64^2 \cdot 32} = \frac{7}{2^{12}} \approx 1/585$.
The other characteristic described with the same probability has the opposite
direction. When $36_x \rightarrow 0$ by S2 the value of bit number 6 is always zero and thus
the probabilities are exchanged. If the XOR of the key bits is zero, then the
probability is 1/146 and if one it is 1/585.
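The two conditional probabilities can be checked with exact rational arithmetic. The sketch below also assumes that the unconditional probability of about 1/234 factors as 14 · 8 · 10 / 64^3; that factorization is an assumption of this example rather than something stated in this passage.

```python
from fractions import Fraction

# Probability of the iterative characteristic when the XOR of the two key bits
# is known to be one or zero (values from the text above).
key_xor_one = Fraction(14 * 8 * 8, 64 ** 2 * 32)
key_xor_zero = Fraction(14 * 8 * 2, 64 ** 2 * 32)

# Assumed factorization of the unconditional probability (about 1/234).
unconditional = Fraction(14 * 8 * 10, 64 ** 3)

print(key_xor_one, float(1 / key_xor_one))      # 7/1024, about 146.3
print(key_xor_zero, float(1 / key_xor_zero))    # 7/4096, about 585.1
print((key_xor_one + key_xor_zero) / 2 == unconditional)
```

Under that assumption, averaging the two cases over a uniformly distributed key-bit XOR reproduces the unconditional value.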
The attack on DES with 16 rounds is now as follows. There are seven rounds in
which the input XOR is assumed to be $\psi$. Suppose that, out of these seven rounds,
we have $n$ rounds ($0 \le n \le 7$) whose key bit number 6 of $S2_K$ equals key bit number
2 of $S3_K$. In this case, the probability of the 15-round characteristic is
$$\left(\frac{7}{2^{12}}\right)^{n} \left(\frac{7}{2^{10}}\right)^{7-n} = 4^{7-n} \left(\frac{7}{2^{12}}\right)^{7} \approx 4^{7-n} \cdot 1.6 \cdot 2^{-65}.$$
all the keys but only for a small fraction of them. For this fraction exhaustive search
is still faster. Table 15 shows that although the knowledge of the specific bit values
during the rounds of the characteristics enhances the attack and decreases the
number of pairs needed, the improvement is relatively small and does not affect the
overall complexity.
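To see how strongly the characteristic probability depends on the key, the following sketch tabulates it as a function of n, the number of the seven rounds in which the two key bits are equal. It assumes the per-round probabilities $7/2^{12}$ (bits equal) and $7/2^{10}$ (bits different) given earlier; the function name is illustrative.

```python
from fractions import Fraction
from math import log2

def char_probability(n_equal, rounds=7):
    """Probability of the 15-round characteristic when n_equal of the
    seven relevant rounds have equal key bits (assumed per-round values)."""
    p_equal = Fraction(7, 2 ** 12)
    p_differ = Fraction(7, 2 ** 10)
    return p_equal ** n_equal * p_differ ** (rounds - n_equal)

for n in range(8):
    print(n, round(log2(char_probability(n)), 2))
```

Each round with differing key bits gains a factor of four, so the exponent moves in steps of two between roughly $2^{-50.3}$ (n = 0) and $2^{-64.3}$ (n = 7).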
7. Variants of DES
This section describes several variants of DES and how the attack works on them.
7.3.1. Modifying the XORs Within the F Function. If we replace the occurrences
of the XORs within the F function by addition operations we get a much weaker
cryptosystem. The attack uses the following iterative characteristic:
[Figure: iterative characteristic with $\Omega_P$ = 00 0C 00 00 00 00 00 00$_x$]
7.3.2. Modifying all the XORs. Modifying all the XORs by additions changes the
probability of this characteristic from $2^{-6}$ to $2^{-8}$. This happens because the
additional addition operation (for example $c = a + B$) preserves the input XOR
($c' = a'$ for $B' = 0$) with probability 1/4. Thus the 16-round characteristic has
probability $2^{-64}$, the 15-round characteristic has probability $2^{-58}$, the 14-round
characteristic has probability $2^{-56}$, and the 13-round characteristic has probability $2^{-50}$.
The analysis of this attack shows that $2^{52}$ pairs are needed to cryptanalyze the
14-round cryptosystem. The attacks on the 15-round and 16-round cryptosystems
are slower than exhaustive search.
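The factor 1/4 contributed by an addition can be reproduced by exhaustive enumeration on a reduced word size. The sketch below is an illustration under stated assumptions, not the original analysis: it uses 8-bit words instead of 32-bit words so that all operand pairs can be enumerated, and it assumes an XOR difference with two set bits below the most significant bit (such as the byte 0C).

```python
from fractions import Fraction

def xor_preserved_fraction(delta, word_bits=8):
    """Fraction of (a, b) for which (a + b) and ((a ^ delta) + b),
    both taken modulo 2^word_bits, still differ by exactly delta."""
    mask = (1 << word_bits) - 1
    hits = 0
    for a in range(mask + 1):
        for b in range(mask + 1):
            c = (a + b) & mask
            c_star = ((a ^ delta) + b) & mask
            if (c ^ c_star) == delta:
                hits += 1
    return Fraction(hits, (mask + 1) ** 2)

print(xor_preserved_fraction(0x0C))  # two set bits
print(xor_preserved_fraction(0x04))  # one set bit
```

Each set bit below the top bit halves the probability, because the difference survives only when no carry chain is disturbed at that position.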
7.3.3. Modifying all the XORs in an Equivalent DES Description. DES has an
equivalent description in which the expansion is moved to the end of the F function
and all the calculations are done using 48 bits instead of 32. The cryptosystem
which is the result of modifying all the XORs in this description by additions is
[Figure: characteristic with $\Omega_P$ = 60 00 00 00 00 00 00 00$_x$; $A' = 0 \leftarrow a' = 0$ always; $\Omega_T$ = 00 00 00 00 60 00 00 00$_x$]
Ninety-seven percent of the sets of eight S boxes have such an iterative characteristic
with probability 1/8 or more. The corresponding 13-round characteristics have
probability $2^{-18}$ for which the 3R-attack on 42 subkey bits needs $2^{20}$ pairs with
S/N = 21~ Table 16 describes the relationship between the probability of the
characteristics, the number of pairs needed, and the probability that a set of ran-
dom S boxes has such a characteristic.
In S boxes chosen as four random permutations (as in the original DES S boxes)
two different inputs that differ in the private bits of one S box must have different
outputs. But there is a high probability that there are two different inputs differing
in the input bits of two S boxes which have the same output. In this case there is
an iterative characteristic which is (without loss of generality the difference is in S1
and S2 and the differing bits of the data are given by the bit mask 7E 00 00 00$_x$)
[Figure: iterative characteristic with $\Omega_P$ = 7E 00 00 00 00 00 00 00$_x$ and $\Omega_T$ = 00 00 00 00 7E 00 00 00$_x$]
In random tests we found several attacks that use $2^{43}$ to $2^{47}$ pairs. We estimate that
attacks that use this number of pairs can be found for more than 90% of the
16-round cryptosystems which use S boxes chosen as four random permutations.
With a single modification in one entry of one of the original DES S boxes we
can force this S box to have two different inputs with the same output. For example,
such a modification may set the value of S(4) to be equal to S(0) (i.e., the third value
in the first line to be equal to the first value in the first line). Therefore there are two
different inputs (0 and 4) with the same output (the input XOR is 4 and the output
XOR is 0). The probability of $4 \rightarrow 0$ by this S box is 1/32. An iterative characteristic
based on this property has probability 1/32 and is (without loss of generality the
difference is in S1)
[Figure: characteristic with $\Omega_P$ = B0 00 00 00 00 00 05 00$_x$, $a'$ = 00 00 05 00$_x$, $\Omega_T$ = 00 00 05 00 B0 00 00 00$_x$]
Using a 2R-attack only $2^{28}$ pairs are needed to break the 16-round cryptosystem.
There are several additional characteristics that can be used to attack the
cryptosystem with a similar number of pairs.
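The effect of the single-entry modification described above (setting S(4) equal to S(0)) can be illustrated on a toy S box. The sketch below uses four arbitrary seeded permutations as rows, not the real DES tables, and assumes the usual DES indexing in which the two outer input bits select the row and the four middle bits select the column. All names are hypothetical.

```python
import random
from fractions import Fraction

def make_toy_sbox(seed=0):
    """Four random permutations of 0..15 as rows; NOT the real DES tables."""
    rng = random.Random(seed)
    rows = []
    for _ in range(4):
        row = list(range(16))
        rng.shuffle(row)
        rows.append(row)
    return rows

def lookup(rows, x):
    row = ((x >> 4) & 2) | (x & 1)   # outer two bits select the row
    col = (x >> 1) & 0xF             # middle four bits select the column
    return rows[row][col]

def xor_pair_count(rows, in_xor, out_xor):
    """Number of inputs x with S(x) ^ S(x ^ in_xor) == out_xor."""
    return sum((lookup(rows, x) ^ lookup(rows, x ^ in_xor)) == out_xor
               for x in range(64))

sbox = make_toy_sbox()
before = xor_pair_count(sbox, 4, 0)   # rows are permutations, so no collision
sbox[0][2] = sbox[0][0]               # force S(4) = S(0): third entry of the first row
after = xor_pair_count(sbox, 4, 0)
print(before, after, Fraction(after, 64))
```

Inputs 0 and 4 differ only in the middle bits, so they fall in the same row; after the change they form the single colliding pair, counted once in each order, which gives 2/64 = 1/32.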
the missing bit of S3Kg and then the 128 possibilities of S3K, and the missing bit of
S4Kg. Now K8 is completely known. To find K7 we repeat the algorithm of finding
K7 described above with the difference that now we know all K8. Only one bit of
K7 remains indistinguishable. This bit is bit number 2 of $S1_{Kg}$.
So far we have used the filtered pairs. These pairs are assumed to be right pairs
whose $f'$ is as expected. They cannot help in finding K6 since the input XORs of five
of the S boxes are zero so this part of K6 cannot be found at all. The other three S
boxes have constant input XORs so there are two indistinguishable values for the
subkey bits entering each S box. In order to find K6 we have to use wrong pairs for
which the characteristic holds in the first three of the five rounds. From now on
we use all the pairs and filter them by a different criterion in each phase of the
cryptanalysis.
K6: To find K6 we decrypt two rounds of the ciphertexts and get the values of $f$
and $f^*$. We assume that the first three rounds of the characteristic hold in the
chosen pairs so $d'$ is as expected with zero input XORs entering six S boxes. Thus
we can calculate the output XORs of these S boxes in the sixth round by
$F' = c' \oplus D' \oplus g'$. Since $c' = 0$ and $S'_{Od}$ is zero in the six S boxes, we get that
$F' = g'$ in the output bits of these S boxes. The filtering chooses all the pairs for
which $f'$ and $F'$ satisfy $S'_{Ef} \rightarrow S'_{Of}$ for S1, S2, S5, ..., S8. Using the resultant
pairs we count on the 12 subkey bits entering S1 and S2 and the missing bit of K7
(needed for the decryption of the seventh round).
To find the other bits of K6 we filter the pairs again by using the known bits of
K6 to check the output XOR of S1 and S2, and count on $S5_{Kf}, \ldots, S8_{Kf}$, a separate
counting for each S box (we have a very good filtering so the S/N is high enough).
In parallel we count on $S3_{Kf}$ and on $S4_{Kf}$ using the assumption that $e'$ is as
expected by the characteristic (four rounds hold) and the filter that discards any
pair for which $S'_{Oe} \neq 0$ for S1, S3, ..., S8 (since only $S2'_{Ee} \neq 0$). Several possibilities
are found for some of the S boxes' key bits, and the following phases are run on
each one of them in parallel.
K5: We assume $c' = 0$ and $d' = b'$. Then $D' = e'$ where $e$ and $e^*$ are calculated
by a partial decryption. $S'_{Od}$ must be zero in the six S boxes in which $S'_{Ed} = 0$. We
filter the pairs and leave only those that have $S'_{Od} = 0$. Then we count on each of
the eight S boxes of the fifth round. Several possibilities can be found for some of
the $S_{Ke}$'s. A list of all the possibilities of K5 is created and used to try each one of
them in parallel in the following phases.
K4: At the second round there must be $S2'_{Eb} = S6'_{Eb} = 0$ for any pair (these S box
inputs do not depend on the differing bits of the plaintexts). $d$ and $d^*$ are found by
a partial decryption. In addition $D' = a' \oplus B' \oplus e'$ so $S2'_{Od}$ and $S6'_{Od}$ are known and
there must be $S2'_{Ed} \rightarrow S2'_{Od}$ and $S6'_{Ed} \rightarrow S6'_{Od}$. If this does not hold for even one pair it
is not a filtering problem. It must be a wrong value of the subkeys K5, ..., K8. A
separate counting is done for each of the six S boxes S1, S2, S5, ..., S8. The counting
on the other S boxes S3 and S4 is done only for pairs whose $d'$ is as expected by
the characteristic since otherwise we cannot know the value of $S3'_{Od}$ and $S4'_{Od}$
because $S3'_{Ob}$ and $S4'_{Ob}$ are unknown. Since $S3'_{Ed}$ and $S4'_{Ed}$ are constants there are two
indistinguishable values for each of their keys. As usual we create a list of the
possible K4 values and try them in parallel.
K3: $c$ and $c^*$ can be found by a partial decryption of the following rounds using
K4, ..., K8. $S'_{Ea} = 0$ in all the S boxes except S2. Thus $S'_{Oc}$ can be found for S1,
S3, ..., S8 by $C' = L' \oplus A' \oplus d'$. For every pair there must be $S'_{Ec} \rightarrow S'_{Oc}$. Therefore,
even if only one S box (S1 or S3, ..., S8) of one pair does not match $S'_{Ec} \rightarrow S'_{Oc}$ it
must be that the values of K4, ..., K8 are wrong. If this does not happen, the
counting is done in parallel for all the S boxes except S2 using all the pairs.
$S2'_{Ea} \neq 0$, thus the calculation of $S2'_{Oc}$ is impossible without further assumptions.
Therefore we assume that the values of $A'$ and $b'$ are as expected by the
characteristic. The filtering discards any pair that does not have $S'_{Ob} = 0$ for S1, S2,
and S5, ..., S8 using $B' = a' \oplus c' = R' \oplus c'$ (since we assume $S'_{Eb} = 0$ in these
S boxes). The counting of $S2_{Kc}$ is done using the filtered pairs.
K2 and K1: The plaintext XOR used above is useless to find K2 and K1 since
all the pairs have $S2'_{Eb} = S6'_{Eb} = 0$ and for all the S boxes of the first round except
S2 there is $S'_{Ea} = 0$. The key bits cannot be found at all for these S boxes. For K1
and K2 we must use another plaintext XOR. We need only 100 such pairs, which
can be obtained without adding new ciphertexts by arranging some of the original
ciphertexts in quartets. This plaintext XOR and the algorithm of finding K1 and
K2 are very similar to the case of K1 and K2 in the four-round version. See the end
of Section 3 for more details.
This attack was implemented in C on a COMPAQ personal computer. It finds
the key in less than 2 minutes with 95% success rate using 150,000 pairs. Using
250,000 pairs the success rate is almost 100%. The program uses 460K bytes of
memory, most of it for the counting array (of size $2^{18}$ bytes) and the preprocessed
optimization tables. The program which counts using $2^{24}$ memory cells finds the
key using only 25,000 pairs. As demonstrated by these figures, DES reduced to eight
rounds with independent subkeys is almost as easy to solve as the case of dependent
subkeys.
rounds are possible with even smaller complexity. Therefore the cryptanalysis of the
full DES with 16 rounds with independent keys takes about $2^{61}$ steps and uses $2^{59}$
pairs. Even though this is an impractical complexity bound, it is much faster than
the $2^{768}$ complexity of exhaustive search.
The Generalized DES Scheme (GDES) is an attempt to speed up DES which was
suggested by Schaumuller-Bichl [16], [18]. The speed up is obtained by increasing
the ratio between the block size and the number of calculations of the F function.
The GDES blocks are divided into q parts of 32 bits each. The F function is
calculated once per round on the rightmost part, and the result is XORed into all
the other parts, which are then cyclically rotated to the right. After the last round
the order of the parts is exchanged to make the encryption and decryption differ
only in the order of the subkeys. The scheme is shown in Fig. 4, where n is the
number of rounds of the GDES cryptosystem,
$$B_i^{(j)} = B_{i-1}^{(j-1)} \oplus F(B_{i-1}^{(q)}, K_i), \quad j \in \{2, \ldots, q\}, \; i \in \{1, \ldots, n\},$$
$$B_i^{(1)} = B_{i-1}^{(q)}, \quad i \in \{1, \ldots, n\}.$$
9.1. GDES Properties
This section describes several properties of GDES.
1. In GDES with n < q,
Btoi)~ q~ = Bt~"+i', Vi ~ {1. . . . . q - n},
and for pairs of plaintexts for which Btoq-"§ . . . . . B~o~) are kept constant (i.e.,
a;, ..... = o):
[Fig. 4: the GDES scheme; labels: Plaintext, F, $K_i$, $K_n$, Ciphertext (swapped)]
In GDES with $n = q - 1$,
$$B_0'^{(i)} = 0, \quad \forall i \in \{2, \ldots, q\},$$
implies that
$$B_n'^{(i)} = 0, \quad \forall i \in \{1, \ldots, q - 1\},$$
and
$$B_n'^{(q)} = B_0'^{(1)}.$$
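The property for n = q − 1 can be checked mechanically on a toy model. The sketch below assumes the round structure described earlier (the F output of the rightmost part is XORed into all other parts, followed by a cyclic rotation to the right), omits the final exchange of parts, and replaces the DES F function by a hypothetical 32-bit mixing function `toy_f`; none of these names come from the paper.

```python
def toy_f(x, k):
    """Hypothetical stand-in for the DES F function (32 bits in, 32 bits out)."""
    x = (x ^ k) & 0xFFFFFFFF
    x = (x * 0x9E3779B1) & 0xFFFFFFFF
    return x ^ (x >> 15)

def gdes_rounds(parts, subkeys, f=toy_f):
    """Apply one GDES-style round per subkey; the final swap is omitted."""
    state = list(parts)
    for k in subkeys:
        t = f(state[-1], k)                               # F on the rightmost part
        state = [state[-1]] + [b ^ t for b in state[:-1]]  # XOR into the rest, rotate right
    return state

q = 8
subkeys = [0x0F1E2D3C + i for i in range(q - 1)]          # n = q - 1 rounds
plain = [(0x01234567 * (i + 1)) & 0xFFFFFFFF for i in range(q)]
diff = 0x19600000
plain_star = [plain[0] ^ diff] + plain[1:]                # difference only in part 1

out = gdes_rounds(plain, subkeys)
out_star = gdes_rounds(plain_star, subkeys)
xors = [a ^ b for a, b in zip(out, out_star)]
print([hex(v) for v in xors])
```

The differing part never reaches the rightmost position before the last rotation, so no F input differs and the difference simply travels from part 1 to part q. This holds for any choice of F, which is why the toy function suffices here.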
5. In GDES with $n = 2q - 2$,
$$B_0'^{(1)} = \eta_1,$$
$$B_0'^{(2)} = \eta_2,$$
with probability 1/16 since $\eta_2 \rightarrow \eta_1 \oplus \eta_2$ with probability 1/4. There are
additional values for $\eta_1$ and $\eta_2$ with smaller probabilities.
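6. In GDES with $n = 2q - 1$,
$$B_0'^{(1)} = \psi$$
and
$$B_0'^{(j)} = 0, \quad \forall j \in \{2, \ldots, q\}$$
(where $\psi$ is the value used in Section 6: $\psi$ = 19 60 00 00$_x$) implies that
$$B_n'^{(j)} = 0, \quad \forall j \in \{1, \ldots, q - 1\},$$
and
$$B_n'^{(q)} = \psi.$$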
9.2. Cryptanalysis of GDES
This section describes how to cryptanalyze GDES for various values of n and q.
We assume that q is even (as suggested in [16] and [18]), but note that odd q
can be attacked by variants of our technique. All the attacks find the subkeys
and are independent of the key scheduling algorithm. The special case of q = 8 and
n = 16 which is suggested in [16] and [18] as a faster and more secure alternative
to DES is breakable with just six ciphertexts in a fraction of a second on a personal
computer.
$$B_0'^{(2)} = \eta_2,$$
$$B_0'^{(j)} = 0, \quad \forall j \in \{3, \ldots, q\},$$
where $\eta_1$ and $\eta_2$ are defined in Section 9.1. The right pairs are about 1/16 of all the
pairs. We can identify most of the wrong pairs by checking that the input XOR
cannot cause the output XOR. This happens with probability about 0.8 for each S
box. Thus only $0.8^{8q} \approx 0.16^q$ of the wrong pairs remain. When $q > 3$ this is less than
$0.8^{8 \cdot 3} \approx 1/250$ of the pairs. This excellent identification makes it possible to
consider only 48 pairs, and identify the three expected occurrences of right pairs among
them. We can further decrease this amount to 24 pairs by using quartets of two
XOR values.
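The filtering estimate can be reproduced with a few lines of arithmetic. The sketch below assumes, as in the text, that each of the 8q S-box checks lets a wrong pair through with probability about 0.8 and that right pairs make up 1/16 of all pairs; the function name is illustrative.

```python
def surviving_wrong_fraction(q, pass_prob=0.8, sboxes_per_part=8):
    """Fraction of wrong pairs that pass all 8q possibility checks."""
    return pass_prob ** (sboxes_per_part * q)

for q in (3, 4, 8):
    print(q, surviving_wrong_fraction(q))

pairs = 48
expected_right = pairs / 16
expected_wrong_survivors = pairs * (15 / 16) * surviving_wrong_fraction(3)
print(expected_right, expected_wrong_survivors)
```

For q = 3 the exact value of $0.8^{24}$ is about 1/212; the figure 1/250 results from first rounding $0.8^8$ to 0.16. Either way, fewer than one wrong pair is expected to survive among 48 pairs.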
This attack shows that any GDES which is faster than DES is also less secure
than DES. GDES with $n = 8q$ rounds is just as fast as DES. Consider GDES with
$n = 8q - 1$ which is slightly faster than DES. Then the usable characteristic has
$7q - 1$ rounds and six repetitions of the iterative characteristic. Thus its probability
is about $(1/234)^6 \approx 2^{-48}$. Counting on all the 48 bits of the subkey of the last round
has
$$S/N = \frac{2^{48} \cdot 2^{-48}}{4^8 \cdot 0.8^{8q-13} \cdot 2^{-20}} \approx 2^{2.5q}.$$
Thus about four to eight right pairs are needed, giving a total of $8 \cdot 2^{48} = 2^{51}$ pairs.
This complexity decreases rapidly when we try to make GDES even faster by
making n substantially smaller than 8q.
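The approximation S/N ≈ $2^{2.5q}$ can be checked by evaluating the ratio in log2 units. This is a minimal sketch that takes the expression $2^{48} \cdot 2^{-48} / (4^8 \cdot 0.8^{8q-13} \cdot 2^{-20})$ at face value; the function name is illustrative.

```python
from math import log2

def log2_signal_to_noise(q):
    """log2 of 2^48 * 2^-48 / (4^8 * 0.8^(8q-13) * 2^-20)."""
    numerator = 48 - 48
    denominator = 16 + (8 * q - 13) * log2(0.8) - 20
    return numerator - denominator

for q in (2, 4, 6, 8):
    print(q, round(log2_signal_to_noise(q), 2), 2.5 * q)
```

The exact exponent is about 2.58q − 0.19, which stays within half a bit of 2.5q for the even values of q up to 8 shown here.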
9.2.6. The Actual Breaking Algorithm for n = 2q. The breaking algorithm for the
recommended case of n = 2q needs six ciphertexts with particular plaintext XOR
values. In this section we describe an attack on the extension of GDES which uses
independent subkeys, which needs 16 encryptions.
The attacker chooses a random plaintext P, encrypts the following 16 plaintexts,
Kn) = O
j=2
The input XOR is easily computed as $B_{n-1}'^{(q)} = B_n'^{(1)}$ and the input itself is $B_n^{(1)}$. Now
we try all the possible key bits for each S box separately and check that for the given
input XOR we get the given output XOR value. For each S box there are at least
five pairs which can distinguish values of the key bits. The (almost certainly unique)
value suggested by all the pairs is the key of the corresponding S box. Therefore,
the whole subkey of the last round is found. Now a decryption of the last round
can be done reducing the cryptosystem to 2q - 1 rounds.
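The per-S-box key search described above amounts to keeping the six-bit key values that are consistent with every observed pair. The following sketch illustrates this on a toy 6-to-4-bit S box generated from a seed (not a DES table); `consistent_keys` and `make_observations` are hypothetical helpers, and each observation holds the two data inputs to the S box before key mixing together with the observed output XOR.

```python
import random

_sbox_rng = random.Random(1)
TOY_SBOX = [_sbox_rng.randrange(16) for _ in range(64)]  # hypothetical S box

def consistent_keys(observations, sbox=TOY_SBOX):
    """Keep the 6-bit keys that explain every (x, x_star, output XOR) triple."""
    return [k for k in range(64)
            if all((sbox[x ^ k] ^ sbox[x_star ^ k]) == out_xor
                   for x, x_star, out_xor in observations)]

def make_observations(secret_key, count, sbox=TOY_SBOX, seed=2):
    """Simulate pairs whose output XORs were produced under secret_key."""
    rng = random.Random(seed)
    obs = []
    for _ in range(count):
        x, x_star = rng.sample(range(64), 2)
        obs.append((x, x_star, sbox[x ^ secret_key] ^ sbox[x_star ^ secret_key]))
    return obs

secret = 0x2B
candidates = consistent_keys(make_observations(secret, 6))
print(candidates)
```

The true key always survives. With several pairs of different input XORs the candidate list usually shrinks to that single value, although this sketch does not guarantee uniqueness.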
Note that if the subkeys are derived by the DES key scheduling algorithm, then
48 bits out of the 56 key bits are known at this point. The others can be easily found
by trying all the 256 possibilities of the missing eight key bits. We thus proceed to
analyze the case of independent subkeys.
In the following q - 1 rounds we get the input and the input XOR of the F
function from the (partially decrypted) ciphertexts. The output XOR is calculated
by the formula
q
P'(n fl, K,) = B6 9 O
j=2
where r is the round number ($r \in \{q + 1, \ldots, 2q - 1\}$). In this case the first ten
ciphertexts are used. The additional four ciphertexts are needed primarily to find
K(q + 1) since in the first six encryptions there are too many zero XOR bits and
more variety is needed. These added ciphertexts do not help in the nth round since
there we want the output XORs of the S boxes in the qth round to be zero.
In the remaining q rounds we use all the 16 ciphertexts. The additional cipher-
texts have nonzero differences in all the S boxes in all the rounds, whereas the first
ten had a constant value during the first q - 1 rounds. The input XOR is calculated
by the formula
q
j=2
In this section we describe several novel attacks on DES reduced to three to six
rounds which are not based on the ciphertext pair paradigm. These attacks are of
three kinds: ciphertext-only attacks, known-plaintext attacks, and statistical-
known-plaintext attacks. Compared with differential attacks, they analyse fewer
ciphertexts but require more time.
10.1.1. A Three-Round Attack. This attack assumes that the eight plaintext bytes
are ASCII characters whose most-significant bits are zeros. The Initial Permutation
(IP) packs the most-significant bits of all these bytes into a single byte. This byte
is the fifth byte of the permuted plaintext which is the first byte of the right half.
Given a ciphertext $T = (l, r)$ we can easily calculate eight bits of the output of the
second round by $B = a \oplus c = R \oplus r$. From Table 26 (see Appendix A) we see that
these eight bits are the output of seven S boxes in the second round (where two of
them come from S5). The attack is as follows:
1. We try all the possibilities of the key bits entering S5 in the second round and
   all the key bits entering the six S boxes S1, S2, S3, S4, S6, and S8 in the third
   round. Their output bits are XORed with the data bits entering S5 in the
   second round. Three bits are counted in both rounds and thus 39 bits are
   exhaustively tried.
2. Using the tried key bits and any ciphertext we find the output of the six S
   boxes in the third round and the input and output of S5 in the second round.
3. We compare the two computed output bits of S5 in the second round to their
   expected value. If they are different, then the 39 key bits are wrong. A quarter
   of the tried keys have the expected value. By trying additional ciphertexts we
   can discard more key values. We stop when only one candidate remains.
Since we start with $2^{39}$ possible keys and only a quarter of them survive each test,
we need about $\log_4 2^{39} = 19.5$ ciphertexts. When the correct 39 key bits are
determined, we can exhaustively try all the possible values of the remaining 17 bits by
checking whether the decoded plaintexts are ASCII characters. The attack thus
needs a total of $2^{39}$ steps and 20 ciphertexts to break DES reduced to three rounds.
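The ciphertext count follows from how many bits of key uncertainty each test removes. As a small worked example (the helper name is illustrative), a test that keeps a fraction s of the wrong candidates removes $\log_2(1/s)$ bits per ciphertext:

```python
from math import ceil, log2

def ciphertexts_needed(candidate_key_bits, survival_fraction):
    """Tests needed to reduce 2^bits candidates to about one survivor."""
    return candidate_key_bits / log2(1 / survival_fraction)

needed = ciphertexts_needed(39, 1 / 4)
remaining_bits = 56 - 39
print(needed, ceil(needed), remaining_bits)
```

With 39 tried key bits and a survival fraction of one quarter this gives 19.5, rounded up to 20 ciphertexts, and leaves 17 key bits for the final exhaustive step.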
10.1.2. Another Three-Round Attack. In this attack we assume that the plaintext
bytes belong to a smaller set in which the three most-significant bits are constant.
Such sets are the ASCII capital letters, the ASCII lowercase letters, and the ASCII
digits. The three most-significant bits of all the eight plaintext bytes are packed into
three bytes by the initial permutation. These three bytes are the first byte of the left
half and the first and second bytes of the right half. Since the first and second bytes
of the right half are constant in all the plaintext blocks, the inputs of S2 and S3 in
the first round are constant and thus their outputs are constant as well. We can
calculate the output bits of the third round by the equation
$$C = L \oplus A \oplus l. \qquad (2)$$
Two bits of the eight constant bits in $L$ have corresponding constant bits in $A$: one
of them is an output of S2 and the other is an output of S3 (see Table 26). Since $l$ is
known, the two bits in $C$ are known up to an XOR with a constant. These bits are
outputs of S2 and S3. Trying all the 64 possibilities of the key bits entering S2 in
the third round, we can check that in any pair of ciphertexts the output bit of S2
satisfies $C_1 \oplus l_1 = C_2 \oplus l_2$. Since half the keys satisfy this condition, we need about
$1 + \log_2 64 = 7$ ciphertexts to find the six key bits entering S2 in the third round.
The same ciphertexts can be used to find the six key bits entering S3 in the third
round. This leaves 44 unknown key bits, which can be found in $2^{44}$ steps with seven
ciphertexts.
10.2.1. A Three-Round Attack. The DES key scheduling algorithm divides the 56
key bits into halves. Each half has 28 bits, and supplies the key bits to the same four
S boxes in all the rounds.
Consider DES reduced to three rounds with a single known plaintext/ciphertext
pair. The exclusive-or value of the output of the first round and the third round is
known by the equation
$$A \oplus C = L \oplus l.$$
We first try all the $2^{28}$ possibilities of half of the key. Each candidate makes it
possible to compute the output of four S boxes in the first round and the output of
the same S boxes in the third round. We know their expected exclusive-or value.
Since the value has 16 bits, only about $2^{-16}$ of the candidates survive this test. Thus
we get about $2^{12}$ possibilities for the first 28 bits of the key. In a similar way we get
about $2^{12}$ possibilities for the other 28 bits of the key. Therefore we find about
$2^{12} \cdot 2^{12} = 2^{24}$ possibilities for the full key, which can be exhaustively searched. The
complexity of this algorithm is about $2^{29}$, and can be reduced to about $2^{21}$ by
choosing the key bits entering each S box sequentially rather than in parallel, and
discarding partial keys as soon as they lead to a contradiction.
10.3.1. A Three-Round Attack. In this attack we use the fact that in a pairs XOR
distribution table, if we know that the output XOR is zero, then the input XOR is
zero with probability 1/4. Given the plaintext and the ciphertext of an encryption
we can easily calculate $A \oplus C = L \oplus l$. Then the following algorithm is used for
each S box. Choose only the encryptions whose output XOR from this S box is zero
(1/16 of the encryptions): $S_{Oa} \oplus S_{Oc} = 0$. If $S_{Ia} \oplus S_{Ic} = 0$, then the corresponding bits
of $a \oplus c = R \oplus r$ equal $S_{Ka} \oplus S_{Kc}$. We count the number of occurrences of each such
XOR value. The right value is suggested by about a quarter of the encryptions.
Each incorrect value is suggested by about $\frac{3}{4} \cdot \frac{1}{63}$ of the encryptions. The value that
appears most frequently is likely to be the value of $S_{Ka} \oplus S_{Kc}$. This algorithm is used
for each S box and thus we find $8 \cdot 6 = 48$ bits that are XORs of the actual key bits.
Then trying $2^8$ possibilities we can find the full 56-bit key. We need about four
occurrences of the right value of the key XOR for each S box, i.e., a total of about
$4 \cdot 4 \cdot 16 = 256$ plaintext/ciphertext pairs.
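The counts in this attack follow from a short calculation. The sketch below is an illustration under stated assumptions: S-box outputs are taken to be uniformly distributed (so 1/16 of the encryptions show a zero output XOR), and the wrong key-XOR values are assumed to share the remaining 3/4 of the suggestions evenly among the 63 alternatives.

```python
from fractions import Fraction

p_output_xor_zero = Fraction(1, 16)     # all four output bits of the S box agree
# 64 ordered pairs with input XOR zero out of 64*64/16 pairs with output XOR zero
p_right_given_zero = Fraction(64, 64 * 64 // 16)
p_each_wrong = Fraction(3, 4) / 63      # assumed even spread over 63 wrong values
right_hits_wanted = 4

pairs_needed = right_hits_wanted / (p_output_xor_zero * p_right_given_zero)
advantage = p_right_given_zero / p_each_wrong
remaining_trials = 2 ** (56 - 8 * 6)

print(p_right_given_zero, pairs_needed, advantage, remaining_trials)
```

Under these assumptions the right value is suggested 21 times more often than any single wrong value, which is why about four occurrences per S box are enough to make it stand out.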
10.3.2. A Four-Round Attack. In this attack we use the fact that for all the S boxes
there is a weak correlation between the value of the XOR of the four output bits
and the value of bit number 2 of the input. In particular, for every two inputs of an
S box, if the XOR of the four output bits of the first input equals the corresponding
value of the second input, then both bits 2 of the input are equal with a certain
probability. This probability is different for each S box and varies between 0.56 and
0.70.
Given the plaintext and the ciphertext of an encryption we can easily calculate
$S_{Oa} \oplus S_{Oc}$ by
$$A \oplus C = L \oplus r.$$
Then the following algorithm is used separately for each S box. For every
encryption calculate the (single bit) XOR of the four output bits of the first round and the
four output bits of the third round by the above equation. This value is likely to be
equal to the XOR of bit number 2 of the inputs of the S box in these two rounds.
$S_{Ia}$ is known up to an XOR with the key (by the plaintext) and thus bit number 2
of the input in the third round is known up to an XOR with a constant with a high
probability. This constant is the XOR of the corresponding bit number 2 in
$S_{Ka} \oplus S_{Kc}$. Thus by $D = l \oplus c$ we find the corresponding output bit in the fourth round
up to that constant with a high probability. We try all the 64 possibilities of the key
bits entering the corresponding S box in the fourth round and the two possibilities
of the constant and verify that the specific output bit of the S box equals its expected
value. The right key value is counted in about 56%-70% of the encryptions,
depending on the exact S box. Any wrong key value is counted in about half of the
encryptions. The key value which is counted most frequently is likely to be the right
value. This attack finds a total of seven bits: six of them are actual key bits and the
seventh is an XOR of two key bits.
The attack obtains the best results when the probability is as high as possible.
To increase the probability we use only encryptions with specific values of $S_{Oa} \oplus S_{Oc}$
which maximize this probability. For instance, when $S5_{Oa} \oplus S5_{Oc} = 0$ this
probability is about 0.81. There is a tradeoff between the number of allowed values and
the corresponding probability. As the number of allowed values increases, the
probability decreases so we need more data to carry out the attack. However, as the
number of allowed values decreases we need more data to make the occurrence of
these values sufficiently probable. Table 17 describes the best tradeoff achievable
by this attack. To make the best use of this attack it is advisable to use about 200
plaintext/ciphertext pairs, from which we can find almost 28 key bits, and search
exhaustively for the (about $2^{28}$) remaining possibilities of the key. Using about 370
plaintext/ciphertext pairs we can find almost 42 key bits and search exhaustively
for the (about $2^{14}$) remaining possibilities of the key.

                               Average              Best tradeoff
By S box   Finding bits of   probability (%)     Values   Encryptions
S1         S4                66                  16       75
S2         S8                57                  8        195
S3         S1                58                  7        240
S4         S2                56                  9        370
S5         S1                70                  16       50
S6         S8                61                  8        135
S7         S5                60                  14       210
S8         S6                63                  12       120
10.3.4. A Six-Round Attack. This attack is again similar to the attack on five
rounds, but we also have to count all the possibilities of the 36 subkey bits of the
sixth round which enter S boxes whose output bits enter the counted S box in the
fifth round by the P permutation. In total we count on 49 bits. The total complexity
of this attack is about $2^{55}$-$2^{56}$ but the basic operation (which is similar to a single
application of the F function) is much simpler than an encryption, and thus the
attack is marginally faster than exhaustive search.
References
[1] E. F. Brickell, J. H. Moore, M. R. Purtill, Structure in the S-boxes of the DES, Advances in
Cryptology, Proceedings of CRYPTO 86, pp. 3-7, 1986.
[2] D. Chaum, J.-H. Evertse, Cryptanalysis of DES with a Reduced Number of Rounds, Sequences of
Linear Factors in Block Ciphers, Advances in Cryptology, Proceedings of CRYPTO 85,
pp. 192-211, 1985.
[3] D. W. Davies, Private communications.
[4] B. Den Boer, Cryptanalysis of F.E.A.L, Advances in Cryptology, Proceedings of EUROCRYPT
88, pp. 293-300, 1988.
[5] Y. Desmedt, J.-J. Quisquater, M. Davio, Dependence of output on input in DES: small avalanche
characteristics, Advances in Cryptology, Proceedings of CRYPTO 84, pp. 359-376, 1984.
[6] W. Diffie, M. E. Hellman, Exhaustive cryptanalysis of the NBS Data Encryption Standard,
Computer, Vol. 10, No. 6, pp. 74-84, June 1977.
[7] H. Feistel, Cryptography and data security, Scientific American, Vol. 228, No. 5, pp. 15-23, May
1973.
[8] M. E. Hellman, A cryptanalytic time-memory tradeoff, IEEE Transactions on Information Theory,
Vol. 26, No. 4, pp. 401-406, July 1980.
[9] M. E. Hellman, R. Merkle, R. Schroeppel, L. Washington, W. Diffie, S. Pohlig, P. Schweitzer,
Results of an Initial Attempt to Cryptanalyze the NBS Data Encryption Standard, Stanford
University, September 1976.
[10] R. C. Merkle, A fast software one-way hash function, Journal of Cryptology, Vol. 3, No. 1,
pp. 43-58, 1990.
[11] S. Miyaguchi, Feal-N specifications, NTT, 1989.
[12] S. Miyaguchi, News on Feal Cipher, Talk at the RUMP session at CRYPTO 90, 1990.
[13] S. Miyaguchi, K. Ohta, M. Iwata, 128-bit hash function (N-Hash), Proceedings of SECURICOM
90, pp. 123-137, March 1990.
[14] S. Miyaguchi, A. Shiraishi, A. Shimizu, Fast data encryption algorithm Feal-8, Review of Electrical
Communications Laboratories, Vol. 36, No. 4, pp. 433-437, 1988.
[15] National Bureau of Standards, Data Encryption Standard, FIPS publication, No. 46, U.S.
Department of Commerce, January 1977.
[16] I. Schaumuller-Bichl, Zur Analyse des Data Encryption Standard und Synthese Verwandter
Chiffriersysteme, Ph.D. Thesis, Linz University, May 1981.
[17] I. Schaumuller-Bichl, Cryptanalysis of the Data Encryption Standard by the method of formal
coding, Cryptologia, Proceedings of CRYPTO 82, pp. 235-255, 1982.
[18] I. Schaumuller-Bichl, On the Design and Analysis of New Cipher Systems Related to the DES,
Technical Report, Linz University, 1983.
[19] A. Shimizu, S. Miyaguchi, Fast Data Encryption Algorithm Feal, Advances in Cryptology,
Proceedings of EUROCRYPT 87, pp. 267-278, 1987.
[20] A. Shimizu, S. Miyaguchi, Fast Data Encryption Algorithm Feal, Abstracts of EUROCRYPT 87,
pp. VII-11-VII-14, April 1987.