0% found this document useful (0 votes)
38 views70 pages

Journal of Cryptology: Differential Cryptanalysis of DES-like Cryptosystems 1

This document summarizes previous cryptanalytic attacks on the Data Encryption Standard (DES) and iterated cryptosystems in general. It describes attacks that reduce the complexity of cryptanalysis, such as exploiting symmetries to reduce the key space by 50% and meet-in-the-middle attacks that can break reduced variants of DES with up to 7 rounds. It also outlines linear cryptanalysis techniques that can reveal relationships among key bits. The document establishes the context for the new differential cryptanalysis attack that will be introduced to break variants of DES with up to 15 rounds using less than 2^56 operations.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
38 views70 pages

Journal of Cryptology: Differential Cryptanalysis of DES-like Cryptosystems 1

This document summarizes previous cryptanalytic attacks on the Data Encryption Standard (DES) and iterated cryptosystems in general. It describes attacks that reduce the complexity of cryptanalysis, such as exploiting symmetries to reduce the key space by 50% and meet-in-the-middle attacks that can break reduced variants of DES with up to 7 rounds. It also outlines linear cryptanalysis techniques that can reveal relationships among key bits. The document establishes the context for the new differential cryptanalysis attack that will be introduced to break variants of DES with up to 15 rounds using less than 2^56 operations.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 70

J.

Cryptology (1991) 4:3-72


Journal of Cryptology
9 1991 International Association for
Cryptologic Research

Differential Cryptanalysis of DES-like Cryptosystems 1


Eli Biham and Adi Shamir
Department of Applied Mathematics and Computer Science,
The Weizmann Institute of Science, Rehovot 76100, Israel

Abstract. The Data Encryption Standard (DES) is the best known and most
widely used cryptosystem for civilian applications. It was developed at IBM and
adopted by the National Bureau of Standards in the mid 1970s,and has successful-
ly withstood all the attacks published so far in the open literature. In this paper
we develop a new type of cryptanalytic attack which can break the reduced variant
of DES with eight rounds in a few minutes on a personal computer and can break
any reduced variant of DES (with up to 15 rounds) using less than 2 s6 operations
and chosen plaintexts. The new attack can be applied to a variety of DES-like
substitution/permutation cryptosystems, and demonstrates the crucial role of the
(unpublished) design rules.
Key words. Data Encryption Standard, Differential cryptanalysis, Iterated
cryptosystems.

1. Introduction

Iterated cryptosystems are a family of cryptographically strong functions based on


iterating a weaker function n times. Each iteration is called a round and the crypto-
system is called an n-round cryptosystem. The round function is a function of the
o u t p u t of the previous r o u n d and of a subkey which is a key-dependent value
calculated via a key-scheduling algorithm. The r o u n d function is usually based on
S boxes, bit permutations, arithmetic operations, and the exclusive-or (denoted by
@ and X O R ) operations. The S boxes are nonlinear translation tables m a p p i n g a
small n u m b e r of input bits to a small n u m b e r of output bits. They are usually the
only part of the cryptosystem that is not linear and thus the security of the crypto-
system crucially depends on their choice. The bit permutation is used to rearrange
the o u t p u t bits of the S boxes in order to m a k e the input bits of each S box in the
following r o u n d depend on the output of as m a n y S boxes as possible. The X O R
operation is often used to mix the subkey with the data. In most applications the
encryption algorithm is assumed to be k n o w n and the secrecy of the data depends
only on the secrecy of the r a n d o m l y chosen key.
An early proposal for an iterated cryptosystem was Lucifer [7], which was
designed at I B M to resolve the growing need for data security in its products. The

i Date received:July 12, 1990. Date revised: February 5, 1991.


4 E. Bihamand A. Shamir

round function of Lucifer has a combination of nonlinear S boxes and a bit permu-
tation. The input bits are divided into groups of four consecutive bits. Each group
is translated by a reversible S box giving a four-bit result. The output bits of all the
S boxes are permuted in order to mix them when they become the input to the
following round. In Lucifer only two fixed S boxes (So and $1) were chosen. Each
S box can be used at any S box location and the choice is key dependent. De-
cryption is accomplished by running the data backward using the inverse of each
S box.
The Data Encryption Standard (DES) [15] is an improved version of Lucifer. It
was developed at IBM and adopted by the U.S. National Bureau of Standards
(NBS) as the standard cryptosystem for sensitive but unclassified data (such as
financial transactions and email messages). DES has become a well-known and
widely used cryptosystem. The key size of DES is 56 bits and the block size is 64
bits. This block is divided into halves of 32 bits each. The main part of the round
function is the F function, which works on the right half of the data using a subkey
of 48 bits and eight (six-bit to four-bit) S boxes. The 32 output bits of the F function
are XORed with the left half of the data and the halves are exchanged. The complete
specification of the DES algorithm appears in [15].
An extensive cryptanalytic literature on DES was published since its adoption in
1977. Yet no short-cuts which can reduce the complexity of cryptanalysis to less
than half of exhaustive search were ever reported in the open literature.
The 50~ reduction [9-] (under a chosen plaintext attack) is based on the follow-
ing symmetry under complementation:
T = DES(P, K)
implies that
= DES(P, K),
where X is the bit-by-bit complementation of X. Cryptanalysis can exploit this
symmetry if two plaintext/ciphertext pairs (P1, T1) and (P2, T2) are available with
P1 =/52 (or similarly T1 = T2). The attacker encrypts P1 under all the 2 55 keys K
whose least-significant bit is zero. If such a ciphertext T is equal to 7"1, then the
corresponding key K is likely to be the real key. If T = T2, t h e n / ~ is likely to be
the real key. Otherwise neither K n o r / ~ can be the real key. Since testing whether
T = T2 is much faster than an encryption, the computational saving is very close
to 50~.
Diffie and Hellman I-6] suggested exhaustive search of the entire key space on a
parallel machine. They estimate that a VLSI chip may be built which can search
one key every microsecond. By building a search machine with a million such chips,
all searching in parallel, 1012 keys can be searched per second. The entire key space
contains about 7.1016 keys and it can be searched in 105 seconds which is about a
day. They estimate the cost of this machine to be $20 million and the cost per
solution to be $5000.
Hellman I-8-] presented a time memory tradeoff method for a chosen plaintext
attack which takes mt words of memory and t z operations provided mt z equals the
number of possible keys (256 for DES). A special case (m = t) of this method takes
Differential Cryptanalysis of DES-like Cryptosystems

about 238 time and 238 memory, with a 256 preprocessing time. Hellman suggests
a special purpose machine which produces 100 solutions per day with an average
wait of 1 day. He estimates that the machine costs about $4 million and the cost
per solution is about $1-$100. The preprocessing is estimated to take 2.3 years on
the same machine.
The Method of Formal Coding in which the formal expression of each bit in the
ciphertext is found as an XOR sum of products of the bits of the plaintext and the
key was suggested in [9]. The formal manipulations of these expressions may de-
crease the key search effort. Schaumuller-Bichl [ 16], [17] studied this method and
concluded that it requires an enormous amount of computer memory which makes
the whole approach impractical.
In 1985 Chaum and Evertse [2] showed that a meet in the middle attack can
reduce the key search for DES reduced to a small number of rounds by the follow-
ing factors:

Number of rounds Reduction factor


4 219
5 29
6 22
7

They also showed that a slightly modified version of DES reduced to seven rounds
can be solved with a reduction factor of 2. However, they proved that a meet in the
middle attack of this kind is not applicable to DES reduced to eight or more
rounds.
In their method they look for a set of data bits (J) in a middle round and a set of
key bits (I) for which any change of the values of the I bits cannot change the
value of the J bits in either directions. Knowing those fixed sets and given several
plaintext/ciphertext pairs the following algorithm is used:
1. Try all the keys in which all the key bits in I are zero. Partially encrypt and
decrypt a plaintext/ciphertext pair to get the data in the middle round.
2. Discard the keys for which the J bits are not the same under partial
encryption/decryption.
3. For the remaining keys try all the possible values of the key bits in I.
This algorithm requires about 256-1zl -1- 2 Itl encryption/decryption attempts.
In 1987 Davies [3] described a known plaintext cryptanalytic attack on
DES. Given sufficient data, it could yield 16 linear relationships among key bits,
thus reducing the size of a subsequent key search to 24~ It exploited the correlation
between the outputs of adjacent S boxes, due to their inputs being derived from,
among other things, a pair of identical bits produced by the bit expansion opera-
tion. This correlation could reveal a linear relationship among the four bits of key
used to modify these S box input bits. The 32-bit halves of the DES result (ignoring
IP) receive these outputs independently, so each pair of adjacent S boxes could be
exploited twice, yielding 16 bits of key information.
6 E. Bihamand A. Shamir

The analysis does not require the plaintext P or ciphertext T but uses the quan-
tity P ~) T and requires a huge number of random inputs. The S box pairs vary in
the extent of correlation they produce so that, for example, the pair $7/$8 needs
about 1017 samples but pair $2/$3 needs about 1021. With about 1023 samples, all
but the pair $3/$4 should give results (i.e., a total of 14 bits of key information). To
exploit all pairs the cryptanalyst needs about 1026 samples. The S boxes do not
appear to have been designed to minimize the correlation but they are somewhat
better than a random choice in this respect. Since the number of samples is larger
than the 264 size of the sample space, this attack is purely theoretical and cannot
be carried out. However, for DES reduced to eight rounds the sample size of 1012
o r 1013 (about 24~ is on the verge of practicality. Therefore, Davies' analysis had
penetrated more rounds than previously reported attacks.
During the last decade several cryptosystems which are variants of DES were
suggested. Schaumuller-Bichl suggested three such cryptosystems [16], [18]. Two
of them (called C80 and C82) are based on the DES structure with the replacement
of the F function by nonreversible functions. The third one, called the Generalized
DES Scheme (GDES), is an attempt to speed up DES. GDES has 16 rounds with
the original DES F function but with a larger block size which is divided into more
than two parts. She claims that GDES increases the encryption speed of DES
without decreasing its security.
Another variant is the Fast Data Encryption Algorithm (Feal). Feal was designed
to be efficiently implementable on an eight-bit microprocessor. The first version of
Feal [20-1, called Feal-4, has four rounds. Feal-4 was broken by Den Boer [4,1 using
a chosen plaintext attack with 100-10,000 encryptions. The creators of Feal reacted
by introducing a new version, called Feal-8, with eight rounds [19,1, [14,1. Both
versions were described as cryptographically better than DES in several aspects.
In this paper we describe a new kind of attack that can be applied to many
DES-like iterated cryptosystems. This is a chosen plaintext atack which uses only
the resultant ciphertexts. The basic tool of the attack is the ciphertext pair which is
a pair of ciphertexts whose plaintexts have particular differences. The two plain-
texts can be chosen at random, as long as they satisfy the difference condition, and
the cryptanalyst does not have to know their values. The attack is statistical in
nature and can fail in rare instances.
The main results described in this paper are as follows (note that the complexities
we quote are based on the number of encryptions needed to create all the necessary
pairs on the target machine, while the attacking algorithm itself uses fewer and
simpler operations). DES reduced to six rounds was broken in less than 0.3 seconds
on a personal computer using 240 ciphertexts. DES reduced to eight rounds was
broken in less than 2 minutes on a computer by analysing 15,000 ciphertexts chosen
from a pool of 50,000 candidate ciphertexts. DES reduced to up to 15 rounds is
breakable faster than exhaustive search, but DES with 16 rounds still requires 258
steps (which is slightly higher than the complexity of exhaustive search). A summa-
ry of the cryptanalytic results on DES reduced to intermediate number of rounds
appears in Table 1.
Some researchers have proposed to strengthen DES by making all the subkeys
Differential Cryptanalysis of DES-like Cryptosystems

Table 1. Summary of the cryptanalysis of DES: The number of operations and chosen
plaintexts required to break the specified number of rounds.

Rounds Complexity

4 24
6 28
8 216
9 226
10 235
11 236
12 243
13 244
14 251
15 252
16 258

Ki independent (or at least to derive them in a more complicated way from a longer
actual key K). Our attack can be carried out even in this case. DES reduced to eight
rounds with independent subkeys (i.e., with 8" 48 = 384 independent key bits which
are not compatible with the key scheduling algorithm) was broken in less than 2
minutes using the same ciphertexts as in the case of dependent subkeys. The full
DES with independent subkeys (i.e., with 16.48 = 768 independent key bits) is
breakable within 261 steps. As a result, any modification of the key scheduling
algorithm cannot make DES much stronger. The attacks on DES reduced to 9-16
rounds are not influenced by the P permutation and the replacement of the P
permutation by any other permutation cannot make them less successful. On the
other hand, the replacement of the order of the eight DES S boxes (without chang-
ing their values) can make DES much weaker: DES with 16 rounds with a particu-
lar replaced order is breakable in about 246 steps. The replacement of the XOR
operation by the more complex addition operation makes this cryptosystem much
weaker. DES with random S boxes is shown to be very easy to break. Even a
minimal change of one entry in one of the DES S boxes can make DES easier to
break. GDES is shown to be trivially breakable with six encryptions in less than
0.2 seconds, while GDES with independent subkeys is breakable with 16 encryp-
tions in less than 3 seconds.
This attack is applicable also to a wide variety of DES-like cryptosystems. In
forthcoming papers we describe several extensions to our new attack. Lucifer re-
duced to eight rounds can be broken using less than 60 ciphertexts (30 pairs). The
Feal-8 cryptosystem can be broken with less than 2000 ciphertexts (1000 pairs) and
the Feal-4 cryptosystem can be broken with just eight ciphertexts and one of their
plaintexts. As a reaction to our attack on Feal-8, its creators introduced Feal-N
[11], with any even number of rounds N. They suggest the use of Feal-N with 16
and 32 rounds. FeaI-NX 1-12] is similar to Feal-N with the extension of the key size
to 128 bits. Nevertheless, Feal-N and Feal-NX can be broken for any N < 31
rounds faster than exhaustive search.
Differential cryptanalytic techniques are applicable to hash functions, in addition-
8 E. Bihamand A. Shamir

to cryptosystems. For example, the following messages hash to the same value in
Merkle's Snefru 1-10] function with two passes:
9 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000
9 00000000 f1301600 13dfc53e 4cc3b093 37461661 ccd8b94d 24d9d36f
71471fde 00000000 00000000 00000000 00000000
9 00000000 1d197f00 2abd3f6f cf33f3d1 8674966a 816e5d51 acd9a905
53cl d180 00000000 00000000 00000000 00000000
9 00000000 e98c8300 1e777a47 b5271f34 a04974bb 44cc8b62 be4b0efc
18131756 00000000 00000000 00000000 00000000

and the following two messages hash to the same value in a variant of Miyaguchi's
N-Hash [13] function with six rounds:
9 CAECE595 127ABF3C 1ADE09C8 1 F9AD8C2
9 4A8C6595 921A3F3C 1ADE09C8 1F9AD8C2.

2. Introduction to Differential Cryptanalysis

Differential cryptanalysis is a method which analyses the effect of particular differ-


ences in plaintext pairs on the differences of the resultant ciphertext pairs. These
differences can be used to assign probabilities to the possible keys and to locate the
most probable key. This method usually works on many pairs of plaintexts with
the same particular difference using only the resultant ciphertext pairs. For DES-
like cryptosystems the difference is chosen as a fixed XORed value of the two
plaintexts. In this introduction we show how these differences can be analyzed and
exploited.
We now introduce the following notation:
nx: An hexadecimal number is denoted by a subscript x (i.e., 10x = 16).
X*, X': At any intermediate point during the encryption of pairs of messages, X
and X* are the corresponding intermediate values of the two executions of the
algorithm, and X' is defined to be X' = X ~ X*.
P(X): The P permutation is denoted by P(X). Note that P as a variable denotes
the plaintext.
E(X): The E expansion is denoted by E(X).
IP(X): The initial permutation. In this paper the existence of I P and I P -1 are
ignored, since they have no cryptanalytic significance in our attack.
P: The plaintext (after the known initial permutation IP) is denoted by P. P* is
the other plaintext in the pair and P' = P ~ P* is the plaintexts' XOR.
T: The ciphertexts of the corresponding plaintexts P and P* (before the inverse
initial permutation IP-1) are denoted by T and T*. T' = T ~ T* is the ci-
phertexts' XOR.
(L, R): The left and right halves of the plaintext P are denoted by L and R,
respectively.
(l, r): The left and right halves of the ciphertext T are denoted by l and r,
respectively.
Differential Cryptanalysis of DES-like Cryptosystems

Plaintext(P))
K1

K2

K3

K4

K5

I(6

K7

K8

Ciphertext(T))
Fig. 1. DES reduced to eight rounds.

a . . . . . j: The 32-bit inputs of the F function in the various rounds. See Fig. 1. Note
that a = R.
A . . . . . J: The 32-bit outputs of the F function in the various rounds. See Fig. 1.
Si: The S boxes S1, $2 . . . . . $8.
SiEx, SiKx, Silx, Siox: The input of Si in round X is denoted by Sijx for X e
{a . . . . . j}. The output of Si in round X is denoted by Siox. The value of the six
subkey bits entering the S box Si is denoted by SiKx and the value of the six
input bits of the expanded data (E(X)) which are XORed with SiKX to form
10 E. Biham and A. Shamir

l input (32 bits) I

+
SIs 48 bits STE SS~v
$2s $3E $4E SSE I $6s s,,: n x s ~ s~x s~K szx ssx

SlI $21 831 % % s61 ~7l ssI

$10 6"20 $30 S40 ~ S$0 S6o S7o S8 o

T
Fig. 2. The F function of DES.

Si~x is denoted by S i ~ x . The S box number i and the round marker X are
optional. For example Sl~a denotes the first six bits of E(a). S I K a denotes the
first six bits of the subkey K1. S lia denotes the input of the S box S1 which is
SlI~ = Slea 9 S l o a denotes the output of S1 which is S l o a = SI(SI,a).
See Fig. 2.

Definition 1. An i n d e p e n d e n t k e y is a list of n subkeys which is not necessarily


derived from some key via the key scheduling algorithm.

Example 1. DES has 21648 = 2768 possible independent keys, but only 256 possi-
ble keys. Note that every key can be viewed as a special type of an independent key.

Remark. To simplify the probabilistic analysis of our attack, we assume that all
the subkeys are independent. Attacks on DES with dependent subkeys seem to be
just as successful in practice, but their theoretical analysis is much harder.

Let us recall how the DES F function behaves in these terms. The F function
takes a 32-bit input and a 48-bit key. The input is expanded (by the E expansion)
to 48 bits and XORed with the key. The result is fed into the S boxes and the
resultant bits are permuted.
Given the XOR value of an input pair to the F function it is easy to determine
its XOR value after the expansion by the formula
e ( x ) ~ E(X*) = e ( x ~ X*).
DifferentialCryptanalysisof DES-likeCryptosystems 11

The XOR with the key does not change the XOR value in the pair, i.e., the
expanded XOR stays valid even after the XOR with the key, by the formula
(X ~) K) ~) (X* ~) K) = X ~) X*.
The output of the S boxes is mixed by the P permutation and thus the XOR of the
pair after the P permutation is the permuted value of the S boxes output XOR, by
the formula
P(X) • P(X*) = P(X ~3 X*).
The output XOR of the F function is linear in the XOR operation that connects
the different rounds:
(X ~) Y ) ~ ) ( X * ~ Y*)= (X ~3 X*)O(YO) Y*).
The XOR of pairs is thus invariant in the key and is linear in the E expansion, the
P permutation, and the XOR operation.
The S boxes are known to be nonlinear. Knowledge of the XOR of the input pairs
cannot guarantee knowledge of the XOR of the output pairs. Usually several out-
put XORs are possible. A special case arises when the both inputs are equal, in
which case both outputs must be equal too. However, a crucial observation is that
for any particular input XOR not all the output XORs are possible, the possible
ones do not appear uniformly, and some XORed values appear much more fre-
quently than others.
Before we proceed we want to mention the known design principles of the S
boxes I-1]:
1. No S box is a linear or affine function of its input.
2. Changing one input bit to an S box results in changing at least two output
bits.
3. S(X) and S(X ~ 001100) must differ in at least two bits.
4. S(X) # S(X O) 1lefO0) for any choice of e and f.
5. The S boxes were chosen to minimize the differences between the number of
ones and zeros in any S box output when any single bit is held constant.
In DES any S box has 64.64 possible input pairs, and each one of them has an
input XOR and an output XOR. There are only 64.16 possible tuples of input and
output XORs. Therefore, each tuple results in average from four pairs. However,
not all the tuples exist as a result of a pair, and the existing ones do not have a
uniform distribution. Very important properties of the S boxes are derived from the
analysis of the tables that summarize this distribution:

Definition 2. A table that shows the distribution of the input XORs and output
XORs of all the possible pairs of an S box is called the pairs XOR distribution table
of the S box. In this table each row corresponds to a particular input XOR, each
column corresponds to a particular output XOR, and the entries themselves count
the number of possible pairs with such an input XOR and an output XOR.

Each line in a pairs XOR distribution table contains 64 possible pairs in 16


12 E. Biham and A. Shamir

Table 2. Partial pairs XOR distribution table of Sl.

Output XOR
Input
XOR 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F,
0, 64 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1, 0 0 0 6 0 2 4 4 0 10 12 4 10 6 2 4
2, 0 0 0 8 0 4 4 4 0 6 8 6 12 6 4 2
3, 14 4 2 2 10 6 4 2 6 4 4 0 2 2 2 0
4, 0 0 0 6 0 10 IO 6 0 4 6 4 2 8 6 2
5, 4 8 6 2 2 4 4 2 0 4 4 0 12 2 4 6
6, 0 4 2 4 8 2 6 2 8 4 4 2 4 2 0 12
7, 2 4 10 4 0 4 8 4 2 4 8 2 2 2 4 4
8, 0 0 0 12 0 8 8 4 0 6 2 8 8 2 2 4
9, 10 2 4 0 2 4 6 0 2 2 8 0 10 0 2 12
A, 0 8 6 2 2 8 6 0 6 4 6 0 4 0 2 10
4 2 4 0 10 2 2 4 0 2 6 2 6 6 4 2 12
CC 0 0 0 8 0 6 6 0 0 6 6 4 6 6 14 2
DX 6 6 4 8 4 8 2 6 0 6 4 6 0 2 0 2
L 0 4 8 8 6 6 4 0 6 6 4 0 0 4 0 8
FX 2 0 2 4 4 6 4. 2 4 8 2 2 2 6 8 8

3% 0 4 6 0 12 6 2 2’ 8 2 4 4 6 2 2 4
31, 4 8 2 10 2 2 2 2 6 0 0 2 2 4 10 8
32, 4 2 6 4 4 2 2 4 6 6 4 8 2 2 8 0
33, 4 4 6 2 10 8 4 2 4 0 2 2 4 6 2 4
34, 0 8 16 6 2 0 0 12 6 0 0 0 0 8 0 6
35, 2 2 4 0 8 0 0 0 14 4 6 8 0 2 14 0
36, 2 6 2 2 8 0 2 2 4 2 6 8 6 4 10 0
37, 2 2 12 4 2 4 4 10 4 4 2 6 0 2 2 4
38, 0 6 2 2 2 0 2 2 4 6 4 4 4 6 10 10
39, 6 2 2 4 12 6 4 8 4 0 2 4 2 4 4 0
3.4, 6 4 6 4 6 8 0 6 2 2 6 2 2 6 4 0
3B, 2 6 4 0 0 2 4 6 4 6 8 6 4 4 6 2
3G 0 10 4 0 12 0 4 2 6 0 4 12 4 4 2 0
3D, 0 8 6 2 2 6 0 8 4 4 0 4 0 12 4 4
3E, 4 8 2 2 2 4 4 14 4 2 0 2 0 8 4 4
3F, 4 8 4 2 4 0 2 4 4 2 4 8 8 6 2 2

different entries. Thus in each line in the table the average of the entries is exactly
four.

Example 2. Table 2 is a partial2 pairs XOR distribution table of Sl. Sl itself is


described in Table 3.

Example 3. The first line of Table 2 shows that, for the zero input XOR, the
output XOR must be zero too, as we noticed above. Also, the different lines in the
table have different output XOR distributions.

* The full pairs XOR distribution tables of all the S boxes appear in Appendix B.
DifferentialCryptanalysisof DES-likeCryptosystems 13

Table 3. S1 table.
14 4 13 1 2 15 11 8 3 10 6 12 5 9 0 7
0 15 7 4 14 2 13 1 10 6 12 11 9 5 3 8
4 1 14 8 13 6 2 11 15 12 9 7 3 10 5 0
15 12 8 2 4 9 1 7 5 11 3 14 10 0 6 13

The following definition deals with pairs XOR distribution tables:

Definition 3. Let X be a six-bit value and let Y be a four-bit value. We say that X
may cause Y by an S box if there is a pair in which the input XOR of the S box
equals X and the output XOR of the S box equals Y. If there is such a pair we write
X ~ Y, and if there is no such pair we say that X may not cause Y by the S box and
write X ~ Y.

Example 4. Consider the input XOR SI~ = 34~. It has only eight possible output
XORs, while the other eight entries are impossible. The possible output XORs S 1~
are 1x, 2~, 3x, 4~, 7x, 8~, Dx, and Fx. Therefore, the input XOR SI~ = 34~ may cause
output XOR SI~ = 1x (34x ~ Ix). Also 34~ ---,2x and 34x ~ Fx. On the other hand,
34x -~ Ox and 34x ~ 9x.

Examples 3 and 4 demonstrate that for a fixed input XOR, the possible output
XORs do not have a uniform distribution. The following definition extends Defini-
tion 3 with probabilities.

Definition 4. We say that X may cause Y with probability p by an S box if for a


fraction p of the pairs in which the input XOR of the S box equals X, the output
XOR equals Y.

Example 5. 34x ---,2~ results from 16 out of the 64 pairs of S1, i.e., with probability
1/4. 34x --* 4 x results only from two out of the 64 pairs of S1, i.e., with probability
1/32.

Different distributions appear in different lines of the table. In total between 70y/o
and 80~o of the entries are possible and between 2070 and 30Y/oare impossible. The
exact percentage for each S box is shown in Table 4. In various formulas in this
paper we approximate the percentage of the possible entries by 809/o.
The pairs XOR distribution tables let us find the possible input and output
values of pairs given their input and output XORs. The following example shows
a simple case:

Example 6. Consider the entry 34x ~ 4x in the pairs XOR distribution table of S1.
Since the entry 34x ~ 4x has value 2, only two pairs satisfy these XORs. These pairs
are duals. If the first pair is S 1~, S IT, then the other pair is S 1T, S 11. By looking at
Table 5 we see that these inputs must be 13x and 27x whose corresponding outputs
are 6x and 2x, respectively.
14 E. Biham and A. Shamir

Table 4. Percentage of the possible entries in the various pairs XOR distribution tables.

S box Percentage
S1 79.4
$2 78.6
$3 79.6
$4 68.5
$5 76.5
$6 80.4
$7 77.2
$8 77.1

Next we show how to find the key bits using k n o w n input pairs and output X O R
of an S box in the F function.

Example 7. Consider S1 and assume that the input pair is S1E = 1x, SI~ = 35 x
and that the value of the corresponding six key bits is S1 r = 23x. Then the actual
inputs of $1 (after X O R i n g the input and key bits) are $1~ = 22x, $1" = 16x and
the outputs are S1 o = lx, S l g -- Cx, respectively. The output X O R is S l ~ = Dx.
Assume we k n o w that S1E = 1~, Sl~ = 35~, and S l ~ = D~ and we want to find
the key value S 1K. The input X O R is S l~ = S 1~ -- 34x regardless of the actual value
of S1 r. By consulting Table 2 we can see that the input to the S box has eight
possibilities. These eight possibilities make eight possibilities for the key (by SK --
SE 9 SI) as described in Table 6~ Each line in the table describes two pairs with the
same two inputs but with the opposite order. Each pair leads to one key, so each
line leads to two keys (which are Se ~ Sl and Se q)S*). The right key value S 1 r
must occur in this table.
Using additional pairs we can get additional candidates for S 1r. Let us look at
the input pair S l e = 21~, $1~ = 15~ (with the same S1K = 23x). The inputs to the
S box are S11 = 2~, S I * = 36~ and the outputs are S1 o = 4x, $1~ = 7~. The output
X O R is S I ~ = 3x. The possible inputs to the S box where 3 4 x ~ 3 x and the

Table 5. Possible input values for the input XOR S1) = 34:, by the
output XOR (in hexadecimal).

Output
XOR
(Sly) Possible Inputs (S 11)
03, OF, 1E, 1F, 2A, 2B, 37, 3B
04, 05, 0E, 11, 12, 14, 1A, 1B, 20, 25, 26, 2E, 2F, 30, 31, 3A
01, 02, 15, 21, 35, 36
13, 27
00, 08, 0D, 17, 18, 1D, 23, 29, 2C, 34, 39, 3C
09, 0C, 19, 2D, 38, 3D
06, 10, 16, 1C, 22, 24, 28, 32
07, 0A, 0B, 33, 3E, 3F
DifferentialCryptanalysisof DES-like Cryptosystems 15

Table 6. Possible keys for 34x ~ D~by S1 with input 1~, 35x(in hexadecimal).
S box input Possiblekeys
06, 32 07, 33
10, 24 11, 25
16, 22 17, 23
1c, 28 ID, 29

corresponding possible keys are described in Table 7. The right key must occur in
both tables. The only common key values in Tables 6 and 7 are 17x and 23~. These
two values are indistinguishable with this input XOR since 17x @ 23~ = 34~ = S 1~,
but may become distinguishable by using a pair with a different input XOR value
(S 1~ :/: 34x).

The following example is an extension of Example 7 to a three-round crypto-


system.

Example 8. Assume we have a ciphertext pair whose plaintext XOR is known and
the values of the six bits 64, 33 . . . . . 37 of the plaintext XOR are zero. The input
XOR of the first round is zero in all the bits entering S 1 (S 1~ = S lla = 0) and thus
the output XOR of $1 in the first round must be zero (Slba = 0). The left half of the
ciphertext is calculated as the XOR value of the left half of the plaintext, the output
of the first round and the output of the third round (l = L 9 A 0) C). Since the
plaintext XOR and the ciphertext XOR are known and the output XOR of S1 in
the first round is known as well, the output XOR of S1 in the third round can be
calculated. The input pair S 1Ec, S l~c in the third round is easily extractable from
the ciphertext pair.
If the input pair of S1 in the third round is S 1Ec = Ix, S l*c = 35 x and the output
XOR is Slb~ = Dx, then the value of Slr~ can be found as in Example 7 and it must
appear in Table 6. Using additional pairs we can discard some of the possible values
until we get a unique value of S 1K~. Since S lk~ is not constant, there should not be
any indistinguishable values of the subkey.

The following definition extends Definitions 3 and 4 for use with the F function:

Definition 5. Let X and Y be 32-bit values. We say that X may cause Y with
probability p by the F function if for a fraction p of all the possible input pairs

Table 7. Possible keysfor 34x--3~ by S1 with input 21x, 15x(inhexadecimal).


S box input Possiblekeys
01, 35 03, 37
02, 36 00, 34
15, 21 17, 23
16 E. Biham and A. Shamir

encrypted by all the possible subkey values in which the input XOR of the F func-
tion equals X, the output XOR equals Y. If p > 0 we denote this possibility by
X~Y.

Lemma 1. In DES if X ~ Y with probability p by the F function, then every fixed


input pair Z, Z* with Z' = Z ~ Z* = X causes the F function output XOR to be Y
by the same fraction p of the possible subkey values.

Proof. To prove the lemma it suffices to show the property for each of the S boxes.
For each input XOR of the data Sk there is S~ = S~ regardless of SK. If there are k
possible input pairs to the S box with this input XOR that may cause a given output
XOR, we can choose precisely k key values Sx = Se ~ St, each taking the fixed
input pair SE, S* to one of the possible input pairs St, S~' of the S box and thus
causing the given output XOR. Thus, the fraction p is held constant for all the input
pairs, and therefore equals the average over all the input pairs. []

In other iterated cryptosystems this lemma does not necessarily hold. However, we
assume that the fraction is very close to p, which is usually the case.

Corollary 1. The probability p of X --* Y by the F function is the product of Pi in


which X~ ~ Yi by the S boxes Si (i ~ {1..... 8}) where X 1 X 2 X 3 X 4 X s X 6 X 7 X 8 =
E(X) and Y1 Y2 Y3 Y4 Ys Y6 YT Ys = p - l ( y ) .

The above discussion about finding the key bits entering S boxes can be extended
to find the subkeys entering the F function. The method is as follows:
1. Choose an appropriate plaintext XOR.
2. Create an appropriate number of plaintext pairs with the chosen plaintext
XOR, encrypt them and keep only the resultant ciphertext pairs.
3. For each pair derive the expected output XOR of as many S boxes in the last
round as possible from the plaintext XOR and the ciphertext pair. (Note that
the input pair of the last round is known since it appears as part of the ci-
phertext pair.)
4. For each possible key value, count the number of pairs that result with the
expected output XOR using this key value in the last round.
5. The right key value is the (hopefully unique) key value suggested by all the
pairs.
We are left with the problem of pushing the knowledge of the XORs of the
plaintext pairs as many rounds as possible (in step 3) without making them all
zeros. When the XORs of the pairs are zero, i.e., both texts are equal, the outputs
are equal too, which makes all the keys equally likely. The pushing mechanism is
a statistical characteristic of the cryptosystem which is an extension of the single
round analysis. Before we define it formally we give an informal definition and three
examples.
DifferentialCryptanalysisof DES-like Cryptosystems 17

Definition 6 (informal). Associated with any pair of encryptions are the XOR
value of its two plaintexts, the XOR of its ciphertexts, the XORs of the inputs of
each round in the two executions, and the XORs of the outputs of each round in
the two executions. These XOR values form an n-round characteristic. A character-
istic has a probability, which is the probability that a random pair with the chosen
plaintext XOR has the round and ciphertext XORs specified in the characteristic.
We denote the plaintexts XOR of a characteristic by f~p and its ciphertexts XOR
by fl r.

The following example describes a one-round characteristic with probability 1.


This is the only one-round characteristic with probability greater than 1/4. This
characteristic is very useful and is applicable in any DES-like cryptosystem.

Example 9. A one-round characteristic with probability 1 is (for any L'):

tip = (L', 0~) )

A' = O~ ~-]~, a ' = O~ always

( )
The following example describes a simple one-round characteristic with proba-
bility 14/64.

Example 10. In this one-round characteristic all the S box input XORs except one
are zero. One S box input XOR is not zero, and is chosen to maximize the probabil-
ity that the input XOR may cause the output XOR. Since there are several input
bits that enter two neighboring S boxes by the E expansion we have to ensure
that the XORs of these bits are zero. There are only two private bits entering
each S box. These bits can have nonzero XOR values. The best such probability for
S1 is 14/64 (i.e., there is an entry that contains 14 pairs that does not cause the input
of the neighboring $2 or $8 to be nonzero). Thus, it is easy to get a one-round
characteristic with probability 14/64 which is
SI: 0Cx ~ Ex with probability 14/64,
$2 . . . . . $8: 00 x ~ 0x always.
This characteristic can also be written (for any L') as
18 E. Biham and A. Shamir

np = (L', 60 oo oo 0%) )

J~.A,= oo so s2 oo~ ~ - - ] a,= 6o oo oo oox with probability 14

~ -= P(EO O0 O0 00~)

~T = (L, eoo so s2 oo~,~o oo oo oo~

One-round characteristics with probability 1/4 are possible using nonzero input
XOR in $2 or $6.
The following example describes a two-round characteristic which is easily
obtained by concatenating the two one-round characteristics that are described in
Examples 10 and 9:

Example 11. A two-round characteristic with probability 14/64:

ftp = 00 80 82 00 6 0 0 0 0 0 0 0 x

t. A / = 00 80 82 00~
~ --].. a t = 60 00 00 00~: with probability

f~_ B'=O ~ -~, b' = 0 always

( ~T=6o oo oo oo oo oo oooo~ )

We can now formulate the exact definition of a characteristic:

Definition 7. An n-round characteristic is a tuple fl = (fie, flA, fir) where lip and
O r are m-bit numbers and F~A is a list of n elements ~A = (A1, A 2. . . . . A.), each
of which is a pair of the form Ai = (2~, 2/0) where 2~ and 2~ are (m/2)-bit numbers
and m is the block size of the cryptosystem. A characteristic satisfies the following
rea uirements:
2] = the right half of f~p,
2~ = the left half of f~e (9 2~,
27 = the right half of f~T,
2~'-1 = the left half of fiT O 2~,
Differential Cryptanalysis of DES-like Cryptosystems 19

and, for every i such that 2 < i < n - 1,


=

Definition 8. A right pair with respect to an n-round characteristic f2 = (f~p, ~'~A,


f2T) and an independent key K is a pair for which P' = f e and for the first n rounds
of the encryption of the pair using the independent key K the input XOR of the ith
round equals 2j and the output XOR of the F function equals 2~. Every pair which
is not a right pair with respect to the characteristic and the independent key is
called a wrong pair with respect to the characteristic and the independent key.
Throughout this paper we refer them shortly by right pair and wrong pair.

Definition 9. The concatenation of an n-round characteristic f l = (f~, f ~ , f2~)


with an m-round characteristic ['~2 ----(fie, 2 f~), where f~l equals the swapped
2 f2A,
value of the halves of [2p2, is the characteristic f = (f~, fA, f22), where f~A is the
concatenation of the lists ~ and f22.

The following definitions and theorem deal with the probability of characteristics:

Definition 10. Round i of a characteristic f has probability p~ if 2~ ~ 2~ with


probability p~ by the F function.

Definition 11. An n-round characteristic f has probability pU if pt~ is the product


of the probabilities of its n rounds:

pa = ~ p/U.
i=1
Note that by Definitions 9 and 11 the probability of a characteristic f~ which is
the concatenation of the characteristic fl 1 with the characteristic f12 is the product
of their probabilities: pa = pn,. pn2. As a result, every n-round characteristic can be
described as the concatenation of n one-round characteristics with probability
which is the product of the one-round characteristics' probabilities.

Theorem 1. The formally defined probability of a characteristic f = (f~e, fA, f~r)


is the actual probability that any fixed plaintext pair satisfying P' = f~e is a right pair
when random independent keys are used.

Proof. The probability of any fixed plaintext pair satisfying P' = ~ , to be a right
pair is the probability that at all the rounds i: 2~ ~ 2~. The probability at each
round is independent of its exact input (as proved in Lemma 1) and independent of
the action of the previous rounds (since the independent keys completely random-
ize the inputs to each S box, leaving only the XOR value fixed). Therefore, the
probability of a pair to be a right pair is the product of the probabilities of 2~ ~ 2~,
which was defined above as the probability of the characteristic. []
20 E. Biham and A. Shamir

For practical purposes the significant probability with respect to a characteristic


is the probability that a pair whose plaintext XOR equals the characteristic's plain-
text XOR is a right pair using a fixed key (the one we try to find). This probability
is not constant for all the keys (as we show later in this paper in a special case).
However, we assume that the characteristic's probability is a very good approxima-
tion of it, which is usually the case.
After this formal discussion we show a three-round characteristic:

Example 12. An extension to three rounds of the characteristic described in


Example 11 can be achieved by concatenating it with the characteristic of
Example 10. Thus a three-round characteristic with probability (14/64) 2 ~ 0.05 is

f~v=O0808200 60000000~ )

9 q gi

9 4

( so s2 oo 6o oo oooo )

where in the fourth round d' = b' O) C' = C' = A'. We see that when the plaintexts
differ in the five specified bit locations, with probability about 0.05 there is a differ-
ence of only three bits at the input of the fourth round. After the bit expansion, five
S boxes have nonzero input XOR and three have zero input XORs and thus zero
output XORs. In this case it is possible to deduce 12 bits of e' by e' = c' ~ D'.

This structure of three rounds with a zero input XOR in the middle round is very
useful and forms the best possible probability for three-round characteristics. 3 A
similar structure can be used in five-round characteristics. The middle round has
zero input and output XORs and there is a symmetry around it, i.e.,

3 Since less than two differingS boxes are impossibleand there are characteristicsof this structure
with two differingS boxes,each with the best possible probability(1/4).
Differential Cryptanalysis of DES-like Cryptosystems 21

ap = (L', R') )

with some probability p~


A' ~ l ~" a' = R'
"t:-
s B' = a' = R' ~ . b' = L' @ A' with some probability Pb
T
2. C' = 0 ~ l ~ c' = 0 always

2.
"F
D' = R' ~-],, d' = L' @ A' with probability Pb

E ' = A' ~ . e' = R' with probability p~

f
( ftr=ae=(L',R') )

where in the sixth round f ' = d' ~) E' = b' ~) A' = L'. The existence of a string
b' ~ a' ~ A' ensures the existence of such a five-round characteristic. The charac-
teristic's probability is quite low since three S box inputs must differ in both rounds
b' ~ a' and a' ~ A', and six in the whole five-round characteristic. The best proba-
bility for an S box is 16/64 = 1/4. This limits the five-round characteristic's proba-
bility to be lower than or equal to (1/4) 6 = 1/4096. In fact, the best known five-
round characteristic has probability about 1/10,486.
Among the most useful characteristics are those that can be iterated.

Definition 12. A characteristic f~ = (11~,, [~A, f~r) is called an iterative characteris-


tic if the swapped value of the halves of [le equals f~r.

We can concatenate an iterative characteristic to itself any number of times and


can get characteristic with an arbitrary number of rounds. The advantage of itera-
tive characteristics is that we can build an n-round characteristic for any large n
with a fixed reduction rate of the probability for each additional round, while in
noniterative characteristics the reduction rate of the probability usually increases
due to the avalanche effect.
There are several kinds of iterative characteristics but the simplest ones are the
most useful. These characteristics are based on a nonzero input XOR to the F
22 E. Bihamand A. Shamir

function that may cause a zero output XOR (i.e., two different inputs yield the same
output). This is possible in DES if at least three neighboring S boxes differ in
the pair (this phenomena is also described in I-5] and [1]). The structure of these
characteristics is described in the following example.

Example 13. If the input XOR of the F funtion is marked by if, such that r ~ 0,
then we have the following iterative characteristic:

~p = (L', R') = (r O) )

A' = 0 a' = 0 always

B'=O ~-~ b'=L'@A'=r with some probability

( )
The best such characteristic has probability about 1/234. A five-round character-
istic based on this iterative characteristic has probability about 1/55,000.

The statistical behavior of most characteristics does not allow us to look for the
intersection of all the keys suggested by the various pairs as we did in Example 7,
since the intersection is usually empty: the wrong pairs do not necessarily list the
right key as a possible value. However, we know that the right key value should
result from all the right pairs which occur (approximately) with the characteristic's
probability. All the other possible key values are fairly randomly distributed: the
expected XOR value (which is usually not the real value in the pair) with the known
ciphertext pair can cause any key value to be possible, and even the wrong key
values suggested by the right pairs are quite random. Consequently, the right key
appears with the characteristic's probability (from right pairs) plus other random
occurrences (from wrong pairs). To find the key we just have to count the number
of occurrences of each of the suggested keys. The right key is likely to be the one
that occurs most often.
Each characteristic lets us look for a particular number of bits in the subkey of
the last round (all the bits that enter some particular S boxes). The most useful
characteristics are those which have a maximal probability and a maximal number
of subkey bits whose occurrences can be counted. Yet it is not necessary to count
on all the possible subkey bits. The advantages of counting on all the possible
subkey bits are the good identification of the right key value and the small amount
of data needed. However, counting the number of occurrences of all the possible
DifferentialCryptanalysisof DES-likeCryptosystems 23

values of a large number of bits usually demands huge memory which can make
the attack impractical. We can count on a smaller number of subkey bits entering
a smaller number of S boxes, and use all the other S boxes only to identify and
discard those wrong pairs in which the input XORs in such S boxes cannot cause
the expected output XORs. Since about 209/0 of the entries in the pairs XOR distri-
bution tables of the S boxes are impossible, about 209/o of the wrong pairs can be
discarded by each S box before they are actually counted.
The following definition gives us a tool to evaluate the usability of a counting
scheme based on a characteristic:

Definition 13. The ratio between the number of right pairs and the average count
in a counting scheme is called the signal-to-noise ratio of the counting scheme and
is denoted by SIN.

To find the right key in a counting scheme we need a high probability character-
istic and enough ciphertext pairs to guarantee the existence of several right pairs.
This means that for a characteristic with probability 1/10,000 we need several tens
of thousands of pairs. How many pairs we need depends on the probability of the
characteristic, the number of key bits that we count on, and the level of identifica-
tion of wrong pairs that can be discarded before the counting. If we are looking for
k key bits, then we count the number of occurrences of 2k possible key values in 2 k
counters. The counters contain an average count of m" ~" ill2 k counts, where m is
the number of pairs, 0t is the average count per counted pair, and fl is the ratio of
the counted to all pairs (i.e., counted and discarded). The right key value is counted
about m.p times using the right pairs where p is the characteristic's probability,
plus the random counts estimated above for all the possible keys. The signal-to-
noise ratio of a counting scheme is therefore
m.p 2k'p
s/N - - - -
m" ~ . ~/2 ~ ~. ~ "

A simple corollary of this formula is that the signal-to-noise ratio of a counting


scheme is independent of the amount of pairs used in the scheme. Another corollary
is that different counting schemes based on the same characteristic but with a
different number of subkey bits have different SIN.
Usually we relate the number of pairs needed by a counting scheme to the num-
ber of the right pairs needed. The number of right pairs needed is mainly a function
of the signal-to-noise ratio. When the SIN is high enough, only a few occurrences
of right pairs are needed to identify uniquely the right value of the subkey bits. We
observed experimentally that when the SIN is about 1-2, about 20-40 occurrences
of right pairs are sufficient. When the SIN is much higher even three or four right
pairs are usually enough. When the SIN is much smaller the identification of the
right value of the subkey bits requires an unreasonably large number of pairs.
In many attacks we use several simultaneous characteristics. In order to mini-
mize the number of ciphertexts needed, we can pack them into more economical
structures.
24 E. Biham and A. Shamir

Definition 14. A quartet is a structure of four eiphertexts that simultaneously


contains two ciphertext pairs of one characteristic and two ciphertext pairs of a
second characteristic. An octet is a structure of eight ciphertexts that simultane-
ously contains four ciphertext pairs of each of three characteristics.

Example 14. The following four plaintexts form a quartet (where ~b1 and It~2 are
the plaintext XORs of the characteristics):
1. A random plaintext P.
2. PO~1.
3. P@~2.
4. P@~kl@~b 2.
The two pairs of the first characteristic are the pairs labeled (1, 2) and (3, 4) and the
two pairs of the second characteristic are the pairs labeled (1, 3) and (2, 4).

The use of these structures can be done in two ways. When an attack uses n pairs
of each one of two characteristics we can use n/2 quartets which contain the same
information as each of the n pairs of each characteristic. Thus, we save half the data.
Using three characteristics we can save two-thirds of the data. The other approach
is used when an attack can simultaneously use two characteristics while counting
the same bits. Then we can divide the data so that half of the pairs are based on the
first characteristic and the other half on the second. When quartets can be used we
can save half the data, and when octets can be used we can save two-thirds of the
data.

3. D E S Reduced to Four Rounds

In Section 2 we defined the notions of pairs and characteristics. In this section we


describe how it can be used to cryptanalyze DES reduced to four rounds. This
cryptanalysis is quite simple since it uses a characteristic with probability 1, but it
serves as a good introductory example to the method of differential cryptanalysis.
In this attack we use the following one-round characteristic t) 1 with probabil-
ity 1 which is an instance of the characteristic described in Example 9:

~ = 20 00 00 00 00 00 00 00~ )

i A' 0~ ~ a' = 0~ always

=2ooooooooooooooo
( )
where in the second round b' = L' ~9 A' = 20 00 00 00 x.
Differential Cryptanalysis of DES-like Cryptosystems 25

In the first round the characteristic has a' = 0 ~ A' = 0 with probability 1. The
single bit difference between the two plaintexts starts to play a role in the second
round in S1. Since the inputs to S1 differ only in one bit, at least two output
bits must differ. Typically such two bits enter three S boxes in the third round
(c' = a' ~ B' = B'), where there is a difference of one bit in each S box input. Thus,
about six output bits differ at the third round. These bits are XORed with the
known difference of the input of S1 in the second round (d' = b' ~ C'), making a
difference of about seven bits in the input of the fourth round and about 11 bits in
the entries of the S boxes (due to the E expansion). Such an avalanche makes it very
likely that the input of all the S boxes differ at the fourth round. Even if an input
of an S box does not differ in one pair it can differ in another pair and the exact
value of d' is usually different for every pair.
The 28 output XOR bits of $2 .... , $8 in B' must be equal to zero since their input
XORs are zero. Since a' 9 B' = c' = D' @ l' (see Fig. 3) then
D'=a'~I'OB'. (1)
When the ciphertext pair values T and T* are known then d and d* are known to
be their right halves (by d = r). Since a', l' and the 28 bits of B' are known, the
corresponding 28 bits of D' are known as well by (1). These 28 bits are the output
XORs of S boxes $2 . . . . . $8. Thus, we know the values Sed, S'd, and S~d of seven S
boxes in the fourth round.
Given the encrypted pairs we use a separate counting procedure for each one of

Plaintext(P))
K1

K2

K3

K4

(Cipherext ,T,)
Fig. 3. DES reduced to four rounds.
26 E. Biham and A. Shamir

the seven S boxes in the fourth round. We try all the 64 possible values of Sxd and
check whether
S(SEd @ SK~) ~ S(S~a G SKd) = SOd.
For each key we count the number of pairs for which the test succeeds. The right
key value is suggested by all the pairs since we use a characteristic with probabil-
ity 1 which causes all the pairs to be right pairs. The other 63 key values may occur in
some of the pairs. It is unlikely that a value occurs in all the pairs for which S~ are
different and S~ are different. In rare cases when more than one key value is sug-
gested by all the pairs a few additional pairs can be tried, or the analysis of the other
key bits can be done in parallel for all the surviving candidates.
So far we have found 7.6 = 42 bits of the subkey of the last round (K4). If the
subkeys are calculated via the DES key scheduling algorithm these are 42 actual
key bits out of the DES 56 key bits, and 14 key bits are still missing. We can now
try all the 214 possibilities of the missing bits and decrypt the given ciphertexts
using the resulting keys. The right key should satisfy the known plaintext XOR
value for all the pairs, but the other 214 - 1 values have only probability 2 -64 to
satisfy this condition.
Some researchers have proposed to strengthen DES by making all the subkeys
Ki independent (or at least to derive them in a more complicated way from a longer
actual key K). Our attack can be carried out even in this case. To find the six
missing bits of K4 and to find K3 we use another plaintext XOR value with the
following characteristic f~2:

f ~ = 02 22 22 22 00000000~ ~)

a I = 0z always

( o2 22 22 22 oo oo oo

where in the second round b' = L' 9 A' = 02 22 22 22x.


The value of Sl~b is zero. Thus, Sl~b = 0. As above we find Sl'oa using (1) and
similarly we can find the corresponding six key bits S 1Ka.
Now we know the complete fourth round subkey K4. Using K4 we partially
decrypt all the given ciphertexts by "peeling off" the effect of the last round. As a
result we remain with a three-round cryptosystem. In this cryptosystem the second
P' value lets us calculate the third round subkey K3. The inputs to the third round
c and c* are known as halves of the ciphertexts of the three-round cryptosystem.
The input XOR c' is easily calculated. The output XOR C' is C' = b' 0) d' where b'
DifferentialCryptanalysisof DES-like Cryptosystems 27

and d' are known. The counting method is used again to count the number of
occurrences of the possible keys of all the eight S boxes at the third round. The
values that are counted for all the pairs are likely to be the right key values. As a
result the complete K3 is found with high probability.
The P' values used above are insufficient to find a unique K2 since the S~b are
constant for all the pairs, and thus the right key values are indistinguishable from
the alternative key values obtained by XORing them with S~b. Although we can
find these two possibilities for each S box, i.e., 2 8 possibilities for K2, we cannot use
the above XOR values to find K1 since in both XOR values there is R' = 0 and thus
a' = 0 and A' = 0. Note that
a'=0~A'=0
happens regardless of the key and thus all the possible values of K1 are equally
likely using these XOR values. To solve this problem we have to use an additional
characteristic which has a nonzero input XOR for all the S boxes of the first round.
In addition we want to be able to distinguish the key values of all the S boxes so
we choose two characteristics f~3 and fP. These characteristics can be chosen
arbitrarily under the following two conditions:
9 5~, # 0 for all the S boxes using either 12~,or f2~,.
9 For every particular S box 5~o of the characteristic t2~ is different from 5~., of
tar
Then b and b* are known by decryption of the third round and B' is known by
B'=a'@c'=R'~c'.
The counting method is used to find K2. This time it has to use the appropriate R'
value for each pair. Now a, a*, and a' are known by decryption of the second round
and A' is known by
A'=L'~b'.
The counting method finds K1. Using K1, K2, K3, and K4 we can decrypt the
original ciphertexts to get the corresponding plaintexts and then verify their plain-
text XOR values. If we find only one possibility for all the subkeys the verification
must succeed. If several possibilities are found, then only one of them is likely to be
verified successfully, and thus the right key can be identified.
Typically, 16 encryptions are sufficient for this attack. These 16 encryptions con-
tain eight pairs of the characteristic [21, eight pairs of [22, four pairs of ~3, and four
pairs of f~4. In order not to increase the amount of data needed we use two octets
that occupy four pairs of each of three plaintext XORs.

4. D E S Reduced to Six Rounds

The cryptanalysis of DES reduced to six rounds is more complex than the crypt-
analysis of the four-round version. We use two statistical characteristics with prob-
28 E. Bihamand A. Shamir

ability 1/16, and choose the key value that is counted most often. Each one of the
two characteristics lets us find the 30 key bits of K6 which are used at the input of
five S boxes in the sixth round, but three of the S boxes are common so the total
number of key bits found by the two characteristics is 42. The other 14 key bits can
be found later by means of exhaustive search or by a more careful counting on the
key bits entering the eighth S box in the sixth round.
The first characteristic t) 1 is

f l j , = 4 0 0 8 0 0 0 0 04000000. )

~d A ' = 40 08 00 00~ ~ - ] 1 ~ a 1=04000000~ with probability 1

Z.
y- B' = Or ~ r " b' = O~ always

C ' = 40 08 00 00~ [ ~ c ' = 0 4 0 0 0 0 0 0 ~ with probability 88

T
( _- oo oooo o, oooooo )

where in the fourth round


d'=b'@C' = 4 0 0 8 0 0 0 0 x.
Five S boxes in the fourth round ($2, $5 . . . . . $8) have zero input XORs (S~d = 0)
and thus their output XORs are zero (S~a = 0). The corresponding output XORs
in the sixth round can be found by
F'=c'~D'~)l'.
Since the right key value is not suggested by all the pairs (due to the probabilistic
nature of the characteristic), we cannot use a separate counting procedure for the
subkey bits entering each S box. In order to increase the S/N we should simultane-
ously count on subkey bits entering several S boxes. The best approach is to count
on all the 30 countable subkey bits together, which maximizes the probability that
the right key value is the one counted most often. A straightforward implementation
of this method requires 230 counters, which is impractical on most computers.
However, the improved counting procedure described at the end of this section
achieves exactly the same result with much smaller memory.
The same efficient algorithm is used to find the 30 key bits of S1, $2, $4, $5, and
$6 using the second characteristic r2 which is
Differential Cryptanalysisof DES-likeCryptosystems 29

~ , = 00 20 00 08 00000400~ )

~ _ A' = 00 20 00 08~ a ~= O0 O0 04 00~ with probability 88


y- q

,.•_ B' = 0~
~ -7: b~= O~ ] always
I
r~C' = 00 20 00 08= U ~ c' = 00 00 04 00= with probability 88
4

( f~=00200008 00000400= )

where in the fourth round d' = b' ~ C' = 00 20 00 08x.


Again, five S boxes in the fourth round (S1, $2, $4, $5, and S6) have zero input
XORs. The computed key values of the common S boxes $2, $5, and $6 should be
the same in the both calculations (otherwise we should analyze more pairs or con-
sider additional candidate keys with almost maximal counts). If this test is success-
ful, we have probably found 42 bits of K6.
DES has 56 key bits. Fourteen of them are still missing. The simplest way to find
them is to search all the 214 possibilities for the expected plaintext XOR value of
the decrypted ciphertexts. A faster way is to start looking for the six missing bits of
K6 which enter $3 (the other eight key bits occur only in other subkeys). At first we
use our partial knowledge of the key to filter the given pairs. For each pair we check
if at the five S boxes having S~d = 0 by the characteristic, the value of S~s obtained
b y f and f * and the known key bits form the expected value from F' = c' @ D' ~ l'.
If not, then this cannot be a right pair. Otherwise it is almost certainly a right pair
(since the condition can be satisfied accidentally only with probability 2-2~ For
the remainder of the cryptanalysis we use only the (roughly) ~6 of the pairs which
are believed to be the fight pairs. This filtration greatly improves the signal-to-noise
ratio of the following scheme, which otherwise would be impractical.
Table 8 describes the known bits of the key and the input to the F function at
the fifth round assuming we know the 42 key bits. The digit "3" means that the bit
depends on the exact value of the missing key bits that enter $3 in the sixth round.
" + " means that it depends only on known key bits. Eight key bits are not used at
all in the subkey K6, and are marked by ".". This table shows that by guessing the
six missing bits of K6 we can verify its correctness by calculating e and e* for each
right pair by a single round decryption with K6 and by verifying that the values of
S2'oe, SYoe, and S8be (for which all the input and key bits are known) are as
expected by E' = d' ~ f'. Furthermore, we can verify that there are values of the
missing key bits (for each S box separately) such that the other S boxes output
30 E. Biham and A. Shamir

Table 8. Knownbits at the fifth round.

Into S box e bits Key bits


number SEe SKe
S1 ++++++ 3+..++
$2 ++3+++ +3+333
$3 ++++++ ++++++
++++3+ ++..++
$5 3+++++ +++.++
$6 ++++3+ +.+.++
$7 3+++++ +++.++
$8 ++3+++ ++++++

XORs are as expected. The verification of most of the 64 possibilities of the six
missing bits of K6 should fail, and with high probability only one possibility sur-
vives. This value completes K6. Only eight key bits are missing now. They can be
found by trying all the 256 possibilities, or by applying a similar analysis to key bits
that enter S boxes in the fifth round.
How much data is needed? The signal-to-noise ratio of the first part of the algo-
rithm (which finds 30 key bits) is
230. 1/16 _ 23o_4_1o = 2a6.
S/N - 45
The S / N is high and thus only seven or eight right pairs of each characteristic are
needed. Since the characteristics' probability is 1/16, we need about 120 pairs of
each characteristic for the analysis. The S / N of the later part is
26. 1
SIN- - 16.
4
This is lower, but we do not care since we can almost certainly identify and use only
the seven or eight right pairs from the first part (while eliminating most of the noise)
and intersect the sets of possible key values. To reduce the number of ciphertexts
needed we use quartets which combine the two characteristics. As a result only 240
ciphertexts (representing 120 pairs of each characteristic) are needed for the com-
plete cryptanalysis.
In order to decrease the amount of memory needed in the first part of this attack
we devised an equivalent but faster counting algorithm that uses negligible memory
and can count on all the countable subkey bits simultaneously. This algorithm can
be used in any counting scheme that needs a huge memory but analyses a relatively
small number of pairs (after filtering out all the identifiable wrong pairs). The idea
behind this algorithm is to describe the pairs and the possible key values by a graph.
In this graph each pair is a vertex and every two pairs which suggest a common key
value have a connecting edge labeled by this value. Thus, each key value forms a
clique which contains all its suggesting pairs. The largest clique corresponds to the
key value which is counted by the largest number of pairs. In our implementation,
for each of the five S boxes we keep a bit mask of 64 bits, one bit for each possible
key. Given the values of SE, S*, and S~ we set the bits of the key masks that corre-
spond to possible keys. Each pair has five such key masks, one for every S box. A
clique is defined as a set of pairs for which for each of the five key masks there is a
DifferentialCryptanalysisof DES-likeCryptosystems 31

common bit set in all the pairs in the set (i.e., the binary "and" operation is nonzero
for all the five key masks). Finding the largest clique can be done in the following
way: first compare the key masks for every pair with all the following pairs in the
pairs list. At each comparison there is usually at least one key mask without any
common bit set. For the remaining possibilities we try to "and" the result with third
pairs, fourth pairs, and so on until no more pairs can be added to the clique. Given
the largest clique we can easily compute the corresponding key bits by looking at
each key mask for the key value it represents.
Using the clique algorithm with 240 ciphertexts it takes about 0.3 seconds on a
COMPAQ personal computer to find the key in 95?/oof the tests conducted on DES
reduced to six rounds. When 320 ciphertexts are used the program succeeds in
almost all the cases. The program uses about 100K bytes of memory, most of which
is devoted to various preprocessed tables used tO speed up the algorithm.

5. D E S Reduced to Eight Rounds

DES reduced to eight rounds can be broken using about 25,000 ciphertext pairs for
which the plaintext XOR is P' = 40 5C 00 00 04 00 00 00x. The method finds 30
bits of K8. Eighteen additional key bits can be found using similar manipulations
on the pairs. The remaining eight key bits can be found using exhaustive search.
The following characteristic is used in this analysis:

~tp= 40 5C 00 00 04000000~ )

+ A ' = 40 08 00 00~ ~ - ~ a'=04000000~ with probability 88


= P(OA O0 O0 00~)

+" B'
0O(P= 04 100000~)
00 00 00~ ~ - 7 = bt = OO54 00 with probability 10-16
64-64

C'= 0 ~ c' = 0 always


,~- ] r [q

D'=04000000~ ~ d ' = 00 54 00 00~ with probability 610.16


4.64

,~_E'=40080000~ ~ e ' = 04 00 00 00~ with probability 1

( f l T = 4 0 5 C 0 0 0 0 04000000~ )
32 E. Biham and A. Shamir

This characteristic has probability 1/10,486. The input XOR in the sixth round
of a right pair is
f ' = d' ~ E' = b' G A' = L' = 4 0 5 C 0 0 0 0 x.

Consequently, for five S boxes S[I = S~I = 0 and S~I = 0.

Note. There is an additional five-round characteristic with probability about


1/33,000. Its plaintext XOR is
f~v = 04 04 07 80 00 20 20 00 x.
In this characteristic only four S boxes in the sixth round satisfy Ski = 0. There are
other characteristics for which either the probability or the number of unchanged
S boxes in the sixth round are smaller, and thus their use is less advantageous.
In right pairs the five S boxes $2, $5 . . . . . $8 satisfy S~I = S~I = 0 and Sbl = 0. By
H' = 1' @ g' = I' @ e' @ F' we can find the output XORs of the corresponding S
boxes in the eighth round. The input data of the eighth round is known from the
ciphertexts. Therefore, we can use the counting method to find the 30 subkey bits
entering the five S boxes at the eighth round. The signal-to-noise ratio of this
counting scheme is S I N = 230/45.10,486 = 100.
Counting on 30 subkey bits demands a huge memory of 23~ counters. In this case
the clique method is not recommended since its computation time grows very fast
(more than quadratically) with the number of pairs, while the computation time of
the counting method is linear in the number of pairs. Nevertheless, we can reduce
the amount of memory by counting on fewer subkey bits entering fewer S boxes.
The remaining S boxes can be used for identification of some of the wrong pairs (in
which S'rh-P S'oh). About 20?/0 of the entries in the pairs XOR distribution tables
are impossible and thus each remaining S box discards 20Yo of the wrong pairs.
Counting on 24 key bits thus has S I N = 224/44. 0.8.10,486 ~ 7.8 and counting on
18 key bits has S I N = 218/43 "0.82. 10,486 ~ 0.6.
In counting schemes that count on a reduced number of bits we can choose the
reduced set of countable S boxes arbitrarily. In this particular case we can choose
the reduced set with the advantage of increasing the characteristic's probability and
the signal-to-noise ratio by using a slightly modified characteristic which ignores
output bits that are not counted anyway. The slightly modified characteristic is
similar to the original one except that in the fifth round only one bit of S2be is fixed
and all the combinations of the other three are allowed:
e' = 04 O0 O0 00~, --,, E' = P ( O W O0 O000x) = XO OY ZO 00,,,
where W ~ {0, 1, 2, 3, 8, 9, A, B}, X ~ {0, 4}, Y s {0, 8}, and Z E {0, 4}. Therefore at
the sixth round
f ' = X O 5 V ZO 00,,,
where V = Y G 4. The only possible combination in which Z = 0 is 04 00 00 00 x --*
40 08 00 00, which has probability 16/64. All the other combinations (in which
Z = 4) have an overall probability 20/64. We cannot count on the subkey bits S5Kh
but it is still advisable to cheek the possibility of S5~h ~ S5bh which is satisfied
Differential Cryptanalysis of DES-like Cryptosystems 33

by 80Y/o of the pairs. Therefore, the probability of e' ~ E' is ~16 q_ 0.8~2~= --6432=
89 The probability of the five-round modified characteristic is (16.10.16/643) 9
(16- 10- 32/643) ~ 1/5243. The signal-to-noise ratio of a counting scheme which
count on the 24 subkey bits entering $2, $6, $7, and $8 is SIN = 224/44. 0.8. 5243
15.6. This signal-to-noise ratio allows us to use only about five right pairs. There-
fore, it uses a total amount of about 25,000 pairs. The signal-to-noise ratio of a
counting scheme which counts on 18 subkey bits entering three S boxes out of
$2, $6, $7, and $8 is SIN = 21s/4 a .0.82. 5243 ~ 1.2. This counting scheme which
counts on 18 bits needs 150,000 pairs and has an average of about 24 counts for
any wrong key value and about 53 counts for the right key value (53 = 24 +
150,000/5243 = 24 + 29).
A summary of this cryptanalytic method using 2 is memory cells is as follows:
1. Set up an array of 2 TM counters which is initialized by zeros. The array corre-
sponds to the 2 la values of the 18 key bits of K8 entering $6, $7, and $8.
2. Preproc.ess the possible values of St that satisfy each 5~ ~ S~ for the eight S
boxes into a table. This table is used to speed up the program.
3. For each ciphertext pair do:
(a) Assume h' = r ,' H' = l,' and h = r. Calculate S~h = 5~h and S~h for $2,$5,
..., $8 by h' and H'. Calculate SEh for $6, $7, and $8 by h.
(b) For each one of the S boxes $2, $5, $6, $7, and $8 check if S'~h~ S~h. If
S~h -p S~h for one of the S boxes, then discard the pair as a wrong pair.
(c) For each one of the S boxes $6, $7, and $8: fetch from the preprocessed
table all the values of S~h which are possible for S~ ~ S~h. For each
possible value calculate Srh = S~ht~ SEh.Increment by one all the counters
corresponding to combinations of the possible values of S6Kh, S7r.h, and
S8Kh.
4. Find the entry in the array that contains the maximal count. The entry index
is most likely to be the real value of S6Kh, S7rh, and S8xh which is the value
of the 18 bits 31 . . . . . 48 of K8.
To find the other bits, we filter all the pairs and leave just the pairs with the
expected 5~ value using the known values of h and the known bits of K8 entering
$6, $7, and $8. The expected number of the remaining pairs is 53.
The next bits we are looking for are the 12 bits of K8 that correspond to $2 and
$5. We use a similar counting method (exploiting the enhanced SIN created by the
higher concentration of right pairs) and then filter more pairs. A wrong pair is not
discarded by either this filter or its predecessor with probability 2 -2o and thus
almost all the remaining pairs are right pairs.
Using the known subkey bits of K8 we can calculate the values of 20 bits of each
of H and H* for each pair and thus 20 bits of each of g and g* (by g = 10) H),
Table 9 shows the dependence of the g bits and the subkey bits of K7 at the seventh
round on the known and unknown subkey bits of K8 at the eighth round. The
digits 1, 3, and 4 mean that they depend on the value of the unknown key bits
entering the corresponding S box in the eighth round. " + " means that it depends
only on the known bits of KS. Eight key bits are not used at all in K8 and are
marked by ".".
34 E. Bihamand A. Shamir

Tzb]e 9. Knownbits at the seventhround.


Into S box g bits Key bits
number SEg SKg
S1 +4++++ 3+ . . 4+
$2 ++3++1 134333
$3 +14+++ +1 +41 +
$4 ++++31 11 . . 1 +
$5 31 ++4+ +++ . ++
$6 4++1 3+ + . + . ++
$7 3+4+++ + + + . ++
$8 ++31 +4 ++++++

The expected value of G' is known by the formula G' = f ' ~) h'. We can now look
for the 18 missing bits of K8 by exhaustive search of 218 possibilities for every pair.
Thus we know H, H* and g, 0" and 40 bits of K7. For each pair we check that the
expected value of G' holds. For the right value of those 18 key bits the expected
G' holds for almost all the filtered pairs. All the other possible values satisfy the
expected G' value only for a few pairs (usually two or three pairs while the right
value holds for 15 pairs). To save computer time we search primarily for the 12 key
bits entering S1 and $4 in the eighth round. They suffice to compute S3~g as seen
in Table 9. By similar methods we find these 12 bits and then find the other eight
bits. This completes the calculation of the 48 bits of K8. Only eight key bits are still
missing and they can be found by exhaustive search of 256 cases, using one pair of
ciphertexts, and verifying that the plaintext XOR is as expected.
To save disk space we can filter the pairs as soon as they are created and discard
all the identifiable wrong pairs (leaving 0.85 ,,~ ~ of all the pairs). Therefore, in the
case of counting on 24 bits, the 25,000 pairs are reduced to about 7500 pairs. For
the case of counting on 18 bits we devised another criterion which discards most of
the wrong pairs while leaving almost all the right pairs. This criterion is based on
a carefully chosen weighting function and discards any pair whose weight is lower
than a particular threshold. This criterion is the extension of the filtering of the
identifiable wrong pairs (where the threshold is actually zero) and is based on the
idea that a right pair typically suggests more possible key values than a wrong pair.
The weighting function is the product of the number of possible keys of each of the
five countable S boxes (i.e., the number in the corresponding entry in the pairs XOR
distribution tables). The threshold is chosen to maximize the amount of discarded
pairs, while leaving as many right pairs as possible. The best threshold value was
experimentally found to be 8192 which discards about 97~o of the wrong pairs and
leaves almost all the right pairs. This reduces the number of pairs we actually
analyze from 150,000 to about 7500, with a corresponding reduction in the running
time of the attack.
The attacking program finds the key in less than 2 minutes on a C O M P A Q
personal computer with 95~o success rate (using 150,000 pairs). Using 250,000 pairs
the success rate is increased to almost 100~o. The program uses 460K bytes of
DifferentialCryptanalysisof DES-likeCryptosystems 35

Table 10. The possible instances of 08x ~ Ax


by $2 (in binary).
$21 $2" $2o $2"
123456 123456 1234 1234

000010 001010 0001 1011


0001 I 0 0011 I 0 11 I 0 0100
010001 011001 11 O0 0110
010101 011101 0001 1011
100000 101000 0000 1010
100010 101010 1110 01 O0
1001 O0 1011 O0 0111 1101
100110 101110 1011 0001

memory, most of it for the counting array (one byte suffices for each counter since the
maximum count is about 53, and thus the total array size is 2 is bytes), and the
preprocessed speed up tables. The program which counts using 224 memory cells
finds the key using only 25,000 pairs.

5.1. Enhanced Characteristic's Probability


In addition to the statistical behavior of the characteristic we can use the possible
values of individual input and output bits of the S boxes. Let us look at the first
round of the characteristic. We have 08 x -* A x by $2 with probability 16/64. Table
10 describes the possible input and output values.
We can see that the input bits number 2 and 6 are always equal. In addition for
12/16 of the input values they are both zero and for 4/16 of them they are both one.
If we know the XOR of the key bits entering these two bits of $2 in the first round
(i.e., bits 57 and 42 of the key) we can use only plaintexts whose corresponding bits
(i.e., bits 5 and 9) have the same XOR value (causing bits number 2 and 6 to be
equal). Other pairs of plaintexts cannot satisfy the characteristic. The statistics and
the SIN ratio are then twice as good, and let us use less than half the number of
pairs.
If we know the values of both bits in a key we can choose the two bits in the
plaintexts such that the bit values entering $2 are both zero. In this case the statis-
tics for $2 becomes 12/16 instead of 16/64. Thus we get a factor of three in the
statistics and the SIN. The higher SIN lets us use less than one-third of the pairs
needed originally. A factor of four can be easily obtained by a characteristic that
holds for all the inputs in which bit number 1 has value one and both bits number
2 and 6 have value zero.

5.2. Extension to Nine Rounds


The five-round characteristic can be extended to six rounds by concatenating it to
the following characteristic:
36 E. Bihamand A. Shamir

f~p 84 41 13 46 405C00 00~ )


.J

~4 A' = 80 41 13 46~ [""'-~4 a' = 40 5C 00 00~ with probability about i-~


1
= P(30 B E O0 00~) I " I

( ~T = o4oo oo oo 4o 5c oo oo~ )

This characteristic has probability 12- 14.16/643 ~ 1/100 and thus the probabil-
ity of the concatenated six-round characteristic is about 1/1,000,000.
DES reduced to nine rounds can be broken using 30 million pairs by a method
based on this six-round characteristic and using an array of size 230 with S / N =
230/45. 1,000,000 ~ 1. The first part of the algorithm that finds the first 30 key bits
is almost the same as in the eight-round algorithm except that it counts on all the
30 bits at once. The second part of the algorithm that uses Table 9 is slightly
different since the key scheduling at the ninth round is based on a shift of one bit
instead of two bits. The input part stays the same.

6. DES withan ArbitraryNumberof Rounds


The following iterative characteristic can be used to cryptanalyze (at least in princi-
ple) variants of DES with an arbitrary number of rounds.

p = (r 0) = 1 9 60 00 00 00000000~

A'=0 ~ q a'=0 ] always

BI = 0 ~--~q ] with probability about A_


234

19 60 00 00~

1
= (o, _- oo oo oo oo 19 oo oo o o )

where ~ = 19 60 00 00 x.
Due to the importance of this iterative characteristic, throughout this paper we
refer it as the iterative characteristic.
Differential Cryptanalysis of DES-like Cryptosystems 37

Table 11. The probability of the iterative characteristic versus number of rounds.

N u m b e r of rounds Probability

3 1/234
5 1/55,000
7 ~ 2 -24
9 ~2 -32
11 ~ 2 -4~
13 ~ 2 -4s
15 ~ 2 -56

L e m m a 2. The iterative characteristic has probability 1 4 . 8 ' 10/643 ~ 1/234.

Proof. S'Eb# 0 only at three S boxes: S1, $2 and $3, for which
Sl~b = Sl'~b = 03~ ~ Slbb = 0 with probability 14/64,

S2'Eb = S2'~b = 32~ ~ S2~b = 0 with probability 8/64,


S3'rb = S3'tb = 2C~ ~ S3~b = 0 with probability 10/64,
and for the other S boxes ($4, ..., $8)

S'eb = S'1b = 0 --~ Sob = 0 always.


Thus B' = 0 with probability 14.8.10/643 ~ 1/234. []

Theorem 2. By an iterative concatenation of the iterative characteristic with itself


and with the one-round characteristic with probability 1 (described in Example 9)
we get characteristics with probabilities as summarized in Table 11. In addition the
plaintext X ORs and the ciphertext X ORs of these characteristics are equal:
flv = fl r = 19 60 00 00 00 00 00 00 x = (~b, 0)

and for the next round (without loss of generality we use the notation of a five-round
characteristic)

f ' = ~,
and five of its S boxes satisfy S'Ef = O.

Proof. The results of this theorem are derived from Definition 11 and L e m m a 2.
The X O R data during the intermediate rounds looks like:
n,, = (~, o)
a' = 0 ~ A' = 0 always,
b' = ~ ~ B' = 0 with probability a b o u t 1/234,
38 E. Biham and A. Shamir

c'=a'~B'=O~C'=O always,
d' = ~ b ~ D ' = 0 with probability about 1/234,
e'=c'~D'=O~E'=O always,

and so forth for any number of rounds. []

Note. There is another value for which Lemma 2 and Theorem 2 hold with the
same probabilities. This value is ~,t = 1B 60 00 00 x. There are several additional
values for which the probabilities are smaller. The best of them is ~b~ = 00 19 60 00x
for which the probability is exactly 1/256. The extension of this iterative character-
istic to 15 rounds has probability 2 - 5 6 .

There are several possible types of attack, depending on the number of additional
rounds in the cryptosystem that are not covered by the characteristic itself. The
attack on DES reduced to eight rounds in Section 5 uses a five-round characteristic
and there were three additional rounds. This kind of attack is called a 3R-attack.
The other kinds of attacks are a 2R-attack, with two additional rounds, and a
1 R-attack, with one additional round (where the characteristic causes r' to be fixed).
A 0R-attack is also possible but it can be reduced to a 1R-attack with better statis-
tics and the same S/N. A 0R-attack has the advantage that the right pairs can be
recognized almost without mistakes (the probability of a wrong pair to survive is
2 - 6 4 ) and thus the memory requirements can become negligible using the clique
method. For a fixed cryptosystem it is~advisable to use the shortest possible charac-
teristic due to its better statistics. Thus, a 3R-attack is advisable over a 2R-attack
and both are advisable over a 1R-attack.
In the following sections the actual attacks on DES reduced to 8-16 rounds are
described. All these attacks find some bits of the subkey of the last round. The other
bits of the subkey of the last round can be calculated using these known bits and a
reduction of the cryptosystem to a smaller number of rounds can be done. Only
eight bits do not appear in the subkey of the last round and they can be found by
trying all the 256 possible keys.

6.1. 3R-Attacks
In 3R-attacks counting can be done on all the bits of the subkey of the last round
entering the S boxes that have zero input XORs at the round that follows the last
round of the characteristic. The four, six, eight, and nine-round attacks described
in the previous sections are of this type.
In DES reduced to eight rounds the first 30 subkey bits can be found using the
iterative characteristic with five rounds (whose probability is about 1/55,000) by an
attack which is similar to the one described in Section 5. Using an array of size 224
we have S/N = 224/44. 0.8.55,000 = 1.5. We need about 220 pairs. Using an array
of size 230 we have S/N = 230/45. 55,000 ~ 19. About 67~ (1 - 0.8 s) of the pairs
can be identified in advance as wrong pairs.
Differential Cryptanalysis of DES-like Cryptosystems 39

6.2. 2R-Attacks
In 2R-attacks counting can be done on all the bits of the subkey of the last round.
Possibility checks can be done for all the previous round S boxes. An S box whose
input XOR is zero should also have an output XOR of zero, i.e., the success rate of
this check is 1/16. For the other S boxes the success rate is about 0.8.
In DES reduced to nine rounds the 48 bits of K9 can be found using 226 pairs
using the seven-round characteristic. We know that
f~p = (~,, 0)
a'=O~A'=O always,
b'=~k~B'=O with probability about 1/234,
c'=0 ~C'=0 always,

g'=0~G'=0 always,
h'=lp~H'=i'G#'=r',
i'= r ' ~ l ' = h' O l ' = l' O)~k,
I) r = (l', r').

We can check that h' ~ H' and i' --* I' and count the possible occurrences of the key
bits. At h' ~ H' five S boxes satisfy S~h = S~h = 0 and thus S~h must be zero (which
happens for wrong pairs with probability 1/16), while the other three S boxes satisfy
S~h ~ S~h (which happens for wrong pairs with probability 0.8). Therefore the
counting on all the 48 bits of K9 has S/N = 248. 2-24/48.0.83. (1~6)s ~ 229 and
counting on 18 bits has SIN = 218. 2-24/43. 0.85. 0.83. (~6) s ~ 211. Even a separate
counting on the six key bits entering each S box is possible with SIN = 26. 2-24/4 9
0.87. 0.83. (116)5 ~ 10. The identification of the wrong pairs leaves only 0.83. ( ~ ) s .
0.88 ~ 2 -24 of the wrong pairs and thus only about one wrong pair is left per each
right pair. The characteristic's probability is 2 -24 and thus we need about 226 pairs
for the cryptanalysis. This attack needs more data than the previous 3R-attack on
DES reduced to nine rounds but needs much less memory. Due to the very good
identification of wrong pairs (only about eight pairs are not discarded, four right
pairs and four wrong pairs) it is possible to use the clique method on all the 48 bits.
Eleven rounds can be broken by using the nine-round characteristic with an
array of size 218 and S/N = 2 is "2-32/43 "0.85 "0.83 -(~6) 5 ~ 6 using 235 pairs. The
clique method can still be used on 48 subkey bits with SIN = 248. 2-32/48"0.83.
(~6) 5 ,~ 221 with an identification that leaves 232. 2 -24 = 28 wrong pairs per each
right pair.
Thirteen rounds can be broken using the eleven-round characteristic with an
array of size 230 and SIN = 23~176 ~ 4 using 243 pairs.
The clique method is not possible since 243. 2 -24 = 219 pairs are not discarded.
Counting schemes on 18 and 24 bits are not advisable due to the low SIN.
Fifteen rounds can be broken using the 13-round characteristic with an array
40 E. Biham and A. Shamir

of size 242 and S/N =242"2-48/47"0.8"0.83"(1) 5 ~ 2.5 using 251 pairs. This is
still faster than exhaustive search, but requires unrealistic amounts of space and
ciphertexts.

6.3. 1R-Attacks
In 1R-attacks counting can be done on all the bits of the subkey of the last round
entering the S boxes with nonzero input XORs. Verification of the values of r' itself
and possibility checks on all the other S boxes in the last round can be done. For
those S boxes with a zero input XOR the output XOR should be zero too, i.e., the
check success rate is 1/16. Since the input XOR is constant we cannot distinguish
between several subkey values. However, the number of such values is small (eight
in all the 1R-attacks described here) and each can be checked later in parallel by
the next part of the algorithm (either via exhaustive search or by a differential
cryptanalytic attack).
Ten rounds can be broken using the nine-round characteristic where
h' = ~b -~ H' = 0 with probability 1/234,
i' = 0 --* I' = 0 always,
j'=~b=r'~J'=l'~i'=l'.
We can identify the right pairs easily. Those pairs satisfy r' = ~, and the 20 bits in
l' going out of $4 . . . . . $8 ae zero. This also holds for 2 -52 of the wrong pairs. For
the other three S boxes we count the possible values of their 18 key bits with
SIN = 2 is" 2-32/43. 2 -s2 = 232. Thus we need 234 pairs.
Twelve rounds can be broken using the eleven-round characteristic with S/N =
2 is. 2-40/43. 2 -52 = 224 and with 242 pairs.
Fourteen rounds can be broken using the 13-round characteristic with S/N =
2 is. 2-4s/43- 2 -s2 = 216 and with 2 s~ pairs.
For 16 rounds we get SIN = 2 as" 2-56/43. 2 -52 = 2 s using the 15-round charac-
teristic. This can be broken using 257 pairs. Note that the creation of 257 pairs is
more time consuming than exhaustive search for the 256 possible keys.

6.4. Summary of the Cryptanalysis


A summary of the cryptanalytic results appears in Table 12. The description of each
field is as follows:
No. of rounds: The number of rounds in the cryptosystem.
No. pairs needed: The number of pairs needed to cryptanalyze the cryptosystem.
The number of ciphertexts needed is twice the number of pairs.
No. pairs used: The number of pairs that are actually used in the attack, excluding
the identifiable wrong pairs that can be easily discarded during the collection
phase.
No. bits found: The number of key bits found in the initial attack (using a single
characteristic). The other key bits can be found by auxiliary techniques.
Characteristic: The number of rounds and the probability of the characteristic
used in the attack.
Differential Cryptanalysis of DES-like Cryptosystems 41

Table 12. Summary of the cryptanalysis of DES.

No. of No. pairs No. pairs No. bits


rounds needed used found Characteristics S/N Comments
4 23 23 42 1 1 16 [6]
6 27 27 30 3 1/16 216 *
8 215 213 30 5 1/10,486 15.6 [24]
8 217 213 30 5 1/10,486 1.2 [18]
8 22o 219 30 5 1/55,000 1.5 [24] The iterative characteristic
9 225 224 30 6 1/1,000,000 1.0 [30] Extension to six rounds
9 226 8 48 7 2-24 229 *
10 234 4 18 9 2-32 232 *
11 235 211 48 9 2 -32 221 *
12 242 4 18 11 2-40 224 *
13 243 219 48 11 2-40 4 [30]
14 2 s~ 4 18 13 2-4s 216 *
15 251 227 48 13 2-48 2.5 [42] Needs a huge memory.
With less memory needs
257 pairs

16 257 25 18 15 2 -56 2s * Slower than exhaustive


search

S/N: T h e signal-to-noise r a t i o o f the attack. T h e n u m b e r in b r a c k e t s (if any)


d e n o t e s the n u m b e r of initial bits f o u n d with t h a t S/N. A n asterisk denotes t h a t
the clique m e t h o d is preferable over the c o u n t i n g m e t h o d a n d then the SIN is
o n the n u m b e r of bits found. T h e o t h e r key bits are f o u n d either in parallel o r
at a s e c o n d pass.
Comments: Real c o m m e n t s .

6.5. Enhanced Characteristic's Probability

In a d d i t i o n to the statistical b e h a v i o r of the iterative characteristic we can use the


i n d i v i d u a l values of the i n p u t a n d o u t p u t bits of the S boxes.
I n the iterative c h a r a c t e r i s t i c we have the following behavior. W h e n 32 x --, 0 b y
$2 the values o f the i n p u t bits n u m b e r 4 a n d 6 are always b o t h one (see T a b l e 13).
It does n o t h a p p e n in the first r o u n d a n d thus it c a n n o t be used as in Section 5.1.
Also we have 2Cx ~ 0 b y $3 where is 8/10 of the cases bit n u m b e r 2 equals zero a n d
in 2/10 of the cases bit n u m b e r 2 equals one (see T a b l e 14).

Table 13. Possible inputs and outputs for 32x ---,0 by $2 (in binary).

S21 S2T S20 = S2~


123456 123456 1234

000111 110101 0111


001111 111101 1110
010101 100111 0001
010111 100101 1010
42 E. B i h a m a n d A. S h a m i r

Table 14. Possible i n p u t s a n d o u t p u t s for 2C~ ---, 0 by $3 (in binary).

$31 SYr $3o = $3"


123456 123456 1234

000010 101110 0000


000011 101111 0111
000111 101011 1001
001111 100011 1010
010001 111101 0010

The XOR value of bit 6 of $2~ and of bit 2 of $3~ equals the XOR value of the
corresponding key bits in S2r and S3K since the corresponding bits in S2E and S3E
are the same bit due to the bit expansion. If their XOR value is known to be
one, then the probability of the iterative characteristic becomes 14" 8- 8/642. 32 =
7/21~ ~ 1/146. If their XOR value is known to be zero, then the probability be-
comes 14" 8- 2/642. 32 = 7/212 ,,~ 1/585.
The other characteristic described with the same probability has the opposite
direction. When 36x ~ 0 by $2 the value of bit number 6 is always zero and thus
the probabilities are exchanged. If the XOR of the key bits is zero, then the proba-
bility is 1/146 and if one it is 1/585.
The attack on DES with 16 rounds is now as follows. There are seven rounds in
which the input XOR is assumed to be r Suppose that, out of these seven rounds,
we have n rounds (0 < n < 7) whose key bit number 6 of S2K equals key bit number
2 of S3x. In this case, the probability of the 15-round characteristic is
47-n
( 7 x~n~ 7 '~71 = 47_,( 7 " ~ 7 1 . 6 - -
2 ,

For the other characteristic it is 1.6(4"/265). Table 15 describes the probabilities


for each number of equalities among the key bits and the relative frequency of such
keys.
To increase the probability (especially in the worse cases) we use quartets based
on both characteristics. Since both characteristics allow counting on the same S
boxes we can use them simultaneously. We can see from the table that even though
we can now break 16 rounds with less than 256 encryptions, it does not work for

Table 15. P r o b a b i l i t i e s by n u m b e r of key bits equalities.

No. of Keys P r o b a b i l i t y of P r o b a b i l i t y of S u m of No. needed


equals ratio first characteristic other characteristic probabilities ciphertexts

0 !12a 1.6- 2 T M 1.6' 2 -65 1.6" 2 T M 1.25" 252


1 ~12s 1.6- 2 -53 1.6" 2 -63 1.6" 2 .53 1.25" 254
2 2~
12s 1.6.2-5s 1.6.2-61 1.625.2-55 1.23.256
3 35 1.6" 2 -57 1.6' 2 .59 2 -s6 25a
128
4 ~'~
128
1.6.2 .59 1.6" 2 .57 2 .56 2 sa
5 2~
128
1.6" 2 -61 1.6' 2 -55 1.625" 2 _55 1.23" 2 $6
6 ~laa
- 1.6" 2 -6a 1.6" 2 .53 1.6' 2 _53 1.25' 2 ~'*
7 !128 1.6' 2 -65 1.6" 2 T M 1.6' 2 -51 1.25" 2 ~2
DifferentialCryptanalysisof DES-likeCryptosystems 43

all the keys but only for a small fraction of them. For this fraction exhaustive search
is still faster. Table 15 shows that although the knowledge of the specific bit values
during the rounds of the characteristics enhances the attack and decreases the
number of pairs needed, the improvement is relatively small and does not affect the
overall complexity.

7. Variants of DES

This section describes several variants of DES and how the attack works on them.

7.1. Modifying the P Permutation


All the attacks based on the iterative characteristic are independent of the choice
of the P permutation. Thus any modification of the P permutation by any other
pcrmutation cannot make the attack less successful.

7.2. Modifying the Order of the S Boxes


The DES cryptosystem specifies a certain order of the eight S boxes. A modification
of the order of the S boxes can make the cryptosystem much weaker. Consider, for
example, the case in which $1, $7, and $4 are brought together in this order (with-
out loss of generality in the first three places) and the other S boxes are set in any
order. Then there is a similar iterative characteristic. This characteristic is denoted
by ~b~ = 1D 40 00 00x, where
SI: 03x --, 0 with probability 14/64,
$7: 3A~ ---,0 with probability 16/64,
$4: 28~ ---,0 with probabiliity 16/64,
and ~" ~ 0 with probability 14.16.16/643 g 1/73.
The 15-round characteristic has probability 1/737 ~ 2 -43 and thus the 16-round
cryptosystem can be attacked using 245 pairs with S/N = 218" 2 - 4 3 / 4 3 . 2 -52 ----221.
The 17-round characteristic has probability 1/738 ~ 2 -50 and thus the 18-round
cryptosystem can be attacked using 252 pairs with S/N = 218. 2-50/43. 2 -52 = 214.
In these attacks the clique method can be used due to the excellent identification
of wrong pairs (only 2 -53 of them remain). As in the attack based on the iterative
characteristic this attack is independent of the choice of the P permutation.

7.3. Modifying XORs by Additions


In DES there are two XOR operations in each round. The first XORs the expanded
input with the subkey within the F function while the other XORs the output of
the F function with the other half of the input data. The following subsections
describe three possible modifications which replace some of the XOR operations
by addition operations. The same analysis holds for modification by subtraction
operations.
44 E. Biham and A. Shamir

7.3.1. Modifying the XORs Within the F Function. If we replace the occurrences
of the XORs within the F function by addition operations we get a much weaker
cryptosystem. The attack uses the following iterative characteristic:

(.. Qp=O0000000 O00CO000~ )


r
1
A' = 0 [ " - - 7 - 7 a' = 00 0C 00 00~ with probability

r
fZT=O00CO000 00000000~ )

The 00 0C 00 00x ~ 0 should be explained: 00 0C 00 00x is the input XOR of the F


function. The expansion to 48 bits is 000058000000x. The addition of the key causes
the input XOR to become 000028000000~ with probability 1/16. Thus the input
XORs of all the S boxes except $4 is zero, while $4~ = 28~. However, 28~ ~ 0 by
$4 with probability 1/4.
The 15-round characteristic has probability (1/64)7= 2 -42. The 1R-attack
counting scheme which finds the six subkey bits entering $4 in the 16th round
has S/N = 26/242. 2 -32 "2-24"4 = 2 is. Thus the attack needs about 244 pairs of
encryptions. The six key bits entering $3 can then be found using the same
encryptions with even higher signal-to-noise ratio. Exhaustive search of the 244
possible keys (with 12 fixed bits) recovers the right key. The total complexity of this
attack is thus 245.

7.3.2. Modifying all the XORs. Modifying all the XORs by additions changes the
probability of this characteristic from 2 -6 to 2 -8 . This happens because the addi-
tional addition operation (for example c = a + B) does not change the input XOR
(c' = a' for B' = 0) with probability 1/4. Thus the 16-round characteristic has prob-
ability 2 -64, the 15-round characteristic has probability 2 -58, the 14-round charac-
teristic has probability 2 -56, and the 13-round characteristic has probability 2 -s~
The analysis of this attack shows that 2 s2 pairs are needed to cryptanalize the
14-round cryptosystem. The attacks on the 15-round and 16-round cryptosystems
are slower than exhaustive search.

7.3.3. Modifying all the XORs in an Equivalent DES Description. DES has an
equivalent description in which the expansion is moved to the end of the F function
and all the calculations are done using 48 bits instead of 32. The cryptosystem
which is the result of modifying all the XORs in this description by additions is
DifferentialCryptanalysisof DES-likeCryptosystems 45

not equivalent to the modified standard cryptosystem as described in the previous


subsection. In this subsection we show that this cryptosystem is much weaker than
the modified standard cryptosystem. We can save the repeated cancellation of non-
zero input XORs entering S3 in the previous characteristic by doing it in the first
addition, since during the various rounds the data bits entering each S box are kept
expanded. We get a two-round iterative characteristic with probability 1/16 which
is concatenated to a single occurrence of a one-round characteristic with probabil-
ity 1/16 at the first round. Thus an n-round characteristic with an odd n has proba-
bility (1/16)-(1/16) ("-1)/2 = 2 -2-2".
The 15-round characteristic has probability 2 -32. A 1R-attack on the 16-round
cryptosystem while counting the six key bits entering $4 in the last round has
S/N = 26/232. 2 -4s. 2 -42. 1 = 264. Thus only about 234 pairs are needed. The oth-
er key bits entering the last round can be found using similar characteristics. The
best three characteristics have probabilities between 2 -32 and 2 -35 , and the attacks
based on them can find 18 key bits. Therefore, 237 pairs are needed to find the first
18 key bits. The remaining 38 key bits can be found by exhaustive search. The total
complexity of this attack is thus 239 .

7.4. Random and Modified S Boxes


In a random S box there is a very high probability (about 0.998) that there are two
different inputs that differ in the two middle input bits of an S box (which do not
affect the neighboring S boxes) which have the same output. In this case there is an
iterative characteristic which is (without loss of generality the S box is S1 and
S1~ = Cx)

t i p = 6 0 0 0 0 0 0 0 00000000~ )

~ 4
A'= 0 ~-~4 a' = 0 always

B'=0 ~ ~ b'=60000000~ with some probability

(T=oooooooorooooooo)
Ninety-seven percent of the sets of eight S boxes have such iterative characteristic
with probability 1/8 or more. The corresponding 13-round characteristics have
probability 2 -18 for which the 3R-attack on 42 subkey bits needs 220 pairs with
S/N = 21~ Table 16 describes the relationship between the probability of the
characteristics, the number of pairs needed, and the probability that a set of ran-
dom S boxes has such a characteristic.
46 E. Biham and A. Shamir

Table 16. Characteristicprobabilitieswith random S boxes.


Char. Prob. 8 13 rnds 13 rnds Needed
prob. S boxes char. prob. S/N pairs
1/32 1.00000 2 -30 2 -2
2/32 1.00000 2-:4 2'* 227
3/32 0.99991 2 -20.5 27's 223
4/32 0.97079 2 -18 2 l~ 22o
5/32 0.68375 2 -16"1 211"9 218
6/32 0.27330 2 -14"5 213"s 217
7/32 0.07240 2 -13'2 214"8 215
8/32 0.01499 2 -12 216 214
9/32 0.00260 2 -11'~ 217"0 213
10/32 0.00039 2 -'~ 217"9 2 '2

In S boxes chosen as four random permutations (as in the original DES S boxes)
two different inputs that differ in the private bits of one S box must have different
outputs. But there is a high probability that there are two different inputs differing
in the input bits of two S boxes which have the same output. In this case there is
an iterative characteristic which is (without loss of generality the difference is in S1
and $2 and the differing bits of the data are by bit mask 7E 00 00 00x)

Qp=7E000000 00000000~ )

A' = 0 ~, a' = 0 always


Y5
B' = 0 ~-~,, b' = 7E 00 00 00~ with some probability

T- l
=oo oo oooo 7E oooo )

In random tests we found several attacks that use 243 to 247 pairs. We estimate that
attacks that use this number of pairs can be found for more than 90% of the 16-
round cryptosystems which use S boxes chosen as four random permutations.
With a single modification in one entry of one of the original DES S boxes we
can force this S box to have two different inputs with the same output. For example,
such a modification may set the value of S(4) to be equal to S(0) (i.e., the third value
in the first line to be equal to the first value in the first line). Therefore there are two
different inputs (0 and 4) with the same output (the input XOR is 4 and the output
X O R is 0). The probability of 4 ~ 0 by this S box is 1/32. An iterative characteristic
based on this property has probability 1/32 and is (without loss of generality the
difference is in S1)
DifferentialCryptanalysisof DES-like Cryptosystems 47

- f~p= 20 00 00 00 0000 00 00~ )


1
A' = 0 ~ a' = 0 always

B'=0 ~ b' = 20 00 00 00. with probability


M~ TM I /" I9
r
(., =0o 00 00 00 20 00 00 00~ )

Therefore the probability of the 15-round characteristic is 1/327 = 2 -35. Using a


1R-attack 237 pairs are needed to attack the 16-round modified DES with S I N =
26. 2-35/4 92 -60 = 229 in order to find two indistinguishable values of the first six
key bits.

7.5. Four-Bit to Four-Bit S Boxes


A cryptosystem similar to DES in which the E expansion is eliminated and the S
boxes map four bits to four bits is quite weak. Even the cryptosystems that use
permutations derived from the original S boxes are easily attacked. For example,
using the first lines of the original S boxes we can find the following four-round
iterative characteristic with probability 1/256:

( ~2p=B0000000 00000500~ )
r
,~_ A,= lO oo oo oo~ ~ a' = 00 00 05 0 0 . with probability 88

r~_ B' = 00 00 02 00~ bI = A0 00 00 00~ with probability

t2~ C ' = 1 0 00 00 00. ~ c' = 00 00 07 00. with probability 88


"~- = A'

t~_ D' = 00 00 02 00~ ~ - - ~ [ ~ d' = B0 00 00 00~ with probability


~" --- B'

( n~= oo oo o5 oo B0 o0 0o 00~ )
48 E. Bihamand A. Shamir

Using a 2R-attack only 228 pairs are needed to break the 16-round cryptosystem.
There are several additional characteristics that can be used to attack the crypto-
system with a similar amount of pairs.

8. DES with Independent Keys

In this section we describe an attack on DES reduced to eight rounds with


independent keys and its application to DES with 16 rounds with independent
keys.

8.1. Eiffht Rounds


The attack on DES reduced to eight rounds with independent keys is basically
similar to the attack on DES reduced to eight rounds described in Section 5. We
start by using the same algorithm to find the first 30 bits of K8 and then proceed
to find the remaining bits of K8 and the bits of all the other subkeys by variants of
this algorithm. The attack uses the same characteristic as in the attack described in
Section 5 plus 100 pairs with additional two characteristics.
After finding the first 30 bits of K8, we filter the pairs, identify the right pairs, and
discard all the wrong pairs (with relatively few errors). The other 18 bits of K8
cannot be found yet since we cannot assume that the subkeys are related to each
other by the key scheduling algorithm. To avoid this problem we first look for bits
of K7. Table 9 shows the bits in g that can be calculated for any given ciphertext
(the known key bits there are irrelevant to our case). For each of the eight S boxes
of the seventh round and for each of its 64 possible key values we count the number
of pairs for which this key value is possible. A key value is possible for an S box in
a pair if there is an input pair to the S box whose computable bits have the calcu-
lated value, the other bits have any value and the output XOR is as expected by
the characteristic and the ciphertexts (by G ' = f ' ~ h ' = f ' ~ r'). The most fre-
quent key value is likely to be the right key value. Since there is not enough data
to make this key value unique we look for the set of key values with maximal counts
and choose the bits that have the same value in all the set. Those bits are likely to
have the right values. The other bits stay unknown. Experience has shown that the
known bits of Slrg, S3rg, and S4xg are at the locations denoted by "1" bits in 2Fx,
27x, and 3Cx, respectively. If some of these bits are unknown it is almost certainly
due to a mistaken value of the known bits of K8.
By the knowledge of the subkey bits of the eighth round we can calculate several
input bits of the seventh round for any ciphertext. The input to the seventh round
# has missing bits that enter all the S boxes. There is one S box whose input
depends just on one missing bit while the inputs of all the other S boxes depend on
two missing bits at least. This S box is S1, whose input bit could be calculated if the
output of $4 of the eighth round was known. To find the key bits of S4rh we try all
the 64 possibilities of its value for each pair, and find the key bits value by the
counting method. Now each of the inputs of S3eg and S4Eg have one missing bit:
S3rg could be calculated if Slob was known and S4Eg could be calculated if S3oh
was known. To find these subkey bits we try all the 128 possibilities of Slrh and
Differential Cryptanalysis of DES-like Cryptosystems 49

the missing bit of S3Kg and then the 128 possibilities of S3K, and the missing bit of
S4Kg. Now K8 is completely known. To find K7 we repeat the algorithm of finding
K7 described above with the difference that now we know all K8. Only one bit of
K7 remains indistinguishable. This bit is bit number 2 of Slrg.
So far we have used the filtered pairs. These pairs are assumed to be right pairs
whose f ' is as expected. They cannot help finding K6 since the input XORs of five
of the S boxes are zero so this part of K6 cannot be found at all. The other three S
boxes have constant input XORs so there are two indistinguishable values for the
subkey bits entering each S box. In order to find K6 we have to use wrong pairs for
which the characteristic holds in the first three of the five rounds. From now on
we use all the pairs and filter them by a different criterion in each phase of the
cryptanalysis.
K6: To find K6 we decrypt two rounds of the ciphertexts and get the values o f f
and f * . We assume that the first three rounds of the characteristic hold in the
chosen pairs so d' is as expected with zero input XORs entering six S boxes. Thus
we can calculate the output XORs of these S boxes in the sixth round by F ' =
c' ~ D' O) g'. Since c' = 0 and S~d is zero in the six S boxes, we get that F' = g' in
the output bits of these S boxes. The filtering chooses all the pairs for which f ' and
F' satisfy S~I -} S~I for S1, $2, $5 . . . . . $8. Using the resultant pairs we count on the
12 subkey bits entering S1 and $2 and the missing bit of K7 (needed for the de-
cryption of the seventh round).
To find the other bits of K6 we filter the pairs again by using the known bits of
K6 to check the output XOR of S1 and $2, and count on S5rr . . . . . S8r,r, a separate
counting for each S box (we have a very good filtering so the SIN is high enough).
In parallel we count on S3rr and on S4r,r using the assumption that e' is as
expected by the characteristic (four rounds hold) and the filter that discards any
pair for which S~e # 0 for S1, $3 . . . . . $8 (since only S2~e # 0). Several possibilities
are found for some of the S boxes' key bits, and the following phases are run on
each one of them in parallel.
K5: We assume c' = 0 and d' = b'. Then D' = e' where e and e* are calculated
by a partial decryption. S~d must be zero in the six S boxes in which S~d = 0. We
filter the pairs and leave only those that have S'oa = 0. Then we count on each of
the eight S boxes of the fifth round. Sex;eral possibilities can be found for some of
the SKIS. A list of all the possibilities of K5 is created and used to try each one of
them in parallel in the following phases.
K4: At the second round there must be S2~b = S6~b = 0 for any pair (these S box
inputs do not depend on the differing bits of the plaintexts), d and d* are found by
a partial decryption. In addition D' = a' ~) B' <~ e' so S2~d and S6~d are known and
there must be S2kd ~ S2~a and S6~d ~ S6~d. If it does not hold for even one pair it
is not a filtering problem. It must be a wrong value of the subkeys K5, ..., K8. A
separate counting is done for each of the six S boxes S1, $2, $5 . . . . . $8. The counting
on the other S boxes $3 and $4 is done only for pairs whose d' is as expected by
the characteristic since otherwise we cannot know the value of S3~d and S4~d be-
cause S3~b and S4~b are unknown. Since S3~d and S4~d are constants there are two
indistinguishable values for each of their keys. As usual we create a list of the
possible K4 values and try them in parallel.
50 E. Bihamand A. Shamir

K3: c and c* can be found by a partial decryption of the following rounds using
K4 . . . . . K8. S[o = 0 in all the S boxes except $2. Thus S~c can be found for S1,
$3, ..., $8 by C' = L' ~ A' ~) d'. For every pair there must be Sic ~ S~c. Therefore,
even if only one S box (S1 or $3 . . . . . $8) of one pair does not match S[c ~ S~r it
must be that the values of K4 . . . . . K8 are wrong. If this does not happen, the
counting is done in parallel for all the S boxes except $2 using all the pairs.
S2~a # 0, thus the calculation of $ 2 ~ is impossible without further assumptions.
Therefore we assume that the values of A' and b' are as expected by the characteris-
tic. The filtering discards any pair that does not have S~b = 0 for S1, $2, and $5 . . . . .
$8 using B' = a' ~ c' = R' ~) c' (since we assume S[b = 0 in these S boxes). The
counting of S2K~ is done using the filtered pairs.
K2 and KI: The plaintext XOR used above is useless to find K2 and K1 since
all the pairs have S2~b = S6~b = 0 and for all the S boxes of the first round except
$2 there is StEa : O. The key bits cannot be found at all for these S boxes. For K1
and K2 we must use another plaintext XOR. We need only 100 such pairs, which
can be obtained without adding new ciphertexts by arranging some of the original
ciphertexts in quartets. This plaintext XOR and the algorithm of finding K1 and
K2 are very similar to the case of K1 and K2 in the four-round version. See the end
of Section 3 for more details.
This attack was implemented in C on a C O M P A Q personal computer. It finds
the key in less than 2 minutes with 95% success rate using 150,000 pairs. Using
250,000 pairs the success rate is almost 100%. The program uses 460K bytes of
memory, most of it for the counting array (of size 218 bytes) and the preprocessed
optimization tables. The program which counts using 224 memory cells finds the
key using only 25,000 pairs. As demonstrated by these figures, DES reduced to eight
rounds with independent subkeys is almost as easy to solve as the case of dependent
subkeys.

8.2. Sixteen Rounds


DES with independent keys with any number of rounds is vulnerable to similar
attacks. Let us concentrate on DES with 16 rounds with independent keys. As we
noticed in Section 6 we can find eight possibilities for 18 bits of K16 using 257 pairs.
Three characteristics can be used to cover the subkey bits entering all the S boxes
in the 16th round. The three characteristics are the iterative characteristic itself, a
similar iterative characteristic which is nonzero in the input XORs of $3, $4, and
$5 whose 15-round probability is 2 -56 and a similar characteristic with nonzero
input XORs to $6, $7, and $8 whose 15-round probability is about 2 -57. Altogeth-
er, about 259 pairs are needed to find two possibilities for the six bits entering
each of the S boxes, except $2 whose bits are completely determined by two
characteristics. Therefore 27 possibilities for K16 are found. We try in parallel all
the 128 possibilities of the value of K16 and reduce the cryptanalytic problem to a
DES reduced to 15 rounds. Since we know how to attack DES reduced to 15
rounds with less data in a complexity that is smaller by a factor of 26 then trying
the 128 possibilities takes up to twice the time of finding the possibilities of K16.
Most of the possibilities are discarded during this reduction and reductions to fewer
Differential Cryptanalysis of DES-like Cryptosystems 51

rounds are possible with even smaller complexity. Therefore the cryptanalysis of the
full DES with 16 rounds with independent keys takes about 261 steps and use 259
pairs. Even though this is an impractical complexity bound, it is much faster than
the 276s complexity of exhaustive search.

9. The Generalized DES Scheme (GDES)

The Generalized DES Scheme (GDES) is an attempt to speed up DES which was
suggested by Schaumuller-Bichl [16], 1-18]. The speed up is obtained by increasing
the ratio between the block size and the number of calculations of the F function.
The GDES blocks are divided into q parts of 32 bits each. The F function is
calculated once per round on the rightmost part, and the result is XORed into all
the other parts, which are then cyclically rotated to the fight. After the last round
the order of the parts is exchanged to make the encryption and decryption differ
only in the order of the subkeys. The scheme is shown in Fig. 4, where n is the
number of rounds of the GDES cryptosystem,
B~j~=B~_,
u-l) c~
~F(B~_)~. K,),. .
je{2, . .,q},
. iE{1, .,n},
9-,i/](1)= B~q_)l, i ~ {1, ..., n},

Bo = (Bto1).... , Btoq)) is the plaintext, and B~ = (B~~). . . . . B~1)) is the ciphertext.

9.1. G D E S Properties
This section describes several properties of GDES.
1. In GDES with n < q,
Btoi)~ q~ = Bt~"+i', Vi ~ {1. . . . . q - n},

where ~o = (~j=l-~--7-1~tn(~),Kj). Thus, the following formulae are satisfied for


any i, j e {1,..., q - n}:

B~oO= Be ) r B~.+i) = B~'+J),

and for pairs of plaintexts for which Btoq-"§ . . . . . B~o~) are kept constant (i.e.,
a;, ..... = o):

B~(i) = B~ -+~ = B "("+i), Vi ~ {1. . . . . q - n}, Vme {0. . . . . n}.


2. In GDES with n < q, any pair of encryptions in which Btoq-"+2). . . . . B~o~) are
kept constant satisfies
B~)tq-,+l) = ~ = B~(1).
3. For any odd q and any n the following equation is satisfied:

~J:)B~'=(~)B~,'=(~B: j', Vme{0 . . . . . n}.


j=l j=l j=l
52 E. Bihamand A. Shamir

Plalntext

+ I"
%

%.
~, F x G ~. F x~ - ~ _ I
% % ~... .
J
. . . .
Ki

Kn

Ciphertext (swapped)

Fig. 4. The Generalized DES Scheme.

. In G D E S w i t h n = q - 1,
8~o, = o, V i e { 2 . . . . . q},

implies that
/~s, = O, V i e {1 . . . . . q - 1},

and
B'~ = B'om.
Differential Cryptanalysis of DES-like Cryptosystems 53

5. In G D E S with n = 2q - 2,
n~(1) = ~1,

Bo(2) = 172,

B~u, = 0, Vj ~ {3. . . . , q},


where ~h = 4 4 0 8 00 00x and r/2 = 04 00 00 00~ or rh = 00 20 04 08~ and r/2 =
00 00 04 00~ implies that
=o, vj {1 ..... q - 2 } ,
B~(q-1) ?72,
B;,t~)

with probability 1/16 since r/2 ~ ~h ~ ~/2 with probability 1/4. There are addi-
tional values for rh and r/z with smaller probabilities.
6. In G D E S with n = 2q - 1,

and
-- 0, vj {2 . . . . . q}
(where ~b is the value used in Section 6: $ = 19 60 00 00x) implies that
B~~ = 0, Vj~{1 . . . . . q - l } ,
and
B'(~) = ~k

with probability about 1/234. G D E S with n = lq - 1 satisfies it for any l _> 2


with probability about (1/234) H .

9.2. Cryptanalysis o f G D E S
This section describes how to cryptanalyze G D E S for various values of n and q.
We assume that q is even (as suggested in 1,16] and 1,18]), but note that odd q
can be attacked by variants of our technique. All the attacks find the subkeys
and are independent of the key scheduling algorithm. The special case of q = 8 and
n = 16 which is suggested in 1,16] and 118] as a faster and more secure alternative
to DES is breakable with just six ciphertexts in a fraction of a second on a personal
computer.

9.2.1. A k n o w n - P l a i n t e x t A t t a c k f o r n = q. Using a known-plaintext attack we


are given several plaintexts (each one of the form B~o1). . . . . B~o~ and the corre-
sponding ciphertexts (each one of the form B~ = (B~q). . . . . B~I))). Then

j=l j=l
54 E. Biham and A. Shamir

and for any i e { 1, ..., n}

F(BJ _'I, Kj) = 8(oq+' -" 9 a'. q+l-''.


j=l
j~:i

Thus the output of the F functions is


q
F(B}q_)1, K,) = B~oq+l-~ ~ B~q+X-O~ ( ~ (n~oJ)~ By))
j=l

and the input of the F functions is


i-1
B}~_)1 ___ S{oq+1-0 (~) O F(B)~-)I,Ki)"
j=l
We thus have Sn and So of each one of the 8n S boxes. As a result we get only
four choices for the six subkey bits of each S box. Using two or three encryptions
the choices can be filtered by leaving only the ones that appear in all the encryp-
tions, and thus all the subkey bits can be found.

9.2.2. A Second Known-Plaintext Attack for n = q. Using pairs whose plaintext


XORs are known we can compute the input and output XORs of the F functions
by the same method used in the previous known-plaintext attack. We can thus find
all the subkeys (starting with the subkey of the last round and working backward
toward the first round) using three pairs of ciphertexts with different plaintext
XORs.

9.2.3. A Chosen-Plaintext Attack for n = 2q - 1. Using a chosen-plaintext attack


with pairs satisfying
B'ou) = 0, Vj ~ {2, ..., q},
and any B~") # 0, we get
B'qU_')x= 0, Vje{1 . . . . . q - l } ,
and
B'q~ = B'o(1).
The rest of the encryption is based on q rounds and thus an attack similar to the
second known-plaintext attack for n = q can be used to find q subkeys by analyzing
three ciphertext pairs.
The other q - 1 subkeys can be found using a similar attack with two additional
ciphertexts.

9.2.4. A Chosen-Plaintext Attack for n = 3q - 2. This attack is similar to the


previous one, and uses ciphertext pairs satisfying
"B~)(1) ~ ~1,

B~(2) = r/z,
= o, vj {3 . . . . . q},
DifferentialCryptanalysisof DES-likeCryptosystems 55

where ~/1 and ~/2 are defined in Section 9.1. The right pairs are about 1/16 of all the
pairs. We can identify most of the wrong pairs by checking that the input XOR
cannot cause the output XOR. This happens with probability about 0.8 for each S
box. Thus only 0.8 sq = 0.16 q of the wrong pairs remain. When q > 3 this is less than
0.88.3 = 1/250 of the pairs. This excellent identification makes it possible to consid-
er only 48 pairs, and identify the three expected occurrences of right pairs among
them. We can further decrease this amount to 24 pairs by using quartets of two
XOR values.

9.2.5. A Chosen-Plaintext Attack for n = lq - 1. This attack works for n = lq - 1


rounds for I > 3. It is similar to the previous ones using
B~ 1) = ff = 19 60 00 00 x
B~t~ = 0, Vj 9 {2. . . . , q}.

The ( ( l - 1 ) q - 1)-round characteristic holds with probability (1/234) 1-2. The


identification leaves 0.8 sq-5.(1/16) s of the wrong pairs. Thus if 0.8 a~-s-2-20<<
(1/234) t-2 (i.e., for q = 8: l < 7 and n < 55), then the identification is excellent and
only three right pairs and needed (among the 3- 234 z-2 pairs considered) for count-
ing the occurrences for each S box separately. Otherwise we can count on several S
boxes simultaneously using more memory and a better SIN. Counting on the 48
bits of the subkey of the last round has
248.2-8(t-2)
SIN = 4a . 0 . 8 8 q _ 1 3 " 2_20 ~ 264-8/+2"5q.

This attack shows that any G D E S which is faster than DES is also less secure
than DES. G D E S with n = 8q rounds is just as fast as DES. Consider G D E S with
n --- 8q - 1 which is slightly faster than DES. Then the usable characteristic has
7q - 1 rounds and six repetitions of the iterative characteristic. Thus its probability
is about (1/234) 6 ~ 2 -48. Counting on all the 48 bits of the subkey of the last round
has
248.2-48
S/N = 48.0.8sq_13.2_2o ,~ 225q.

Thus about four to eight right pairs are needed, giving a total of 8" 24a = 251 pairs.
This complexity decreases rapidly when we try to make G D E S even faster by mak-
ing n substantially smaller than 8q.

9.2.6. The Actual Breaking Algorithm for n = 2q. The breaking algorithm for the
recommended case of n = 2q needs six ciphertexts with particular plaintext XOR
values. In this section we describe an attack on the extension of G D E S which uses
independent subkeys, which needs 16 encryptions.
The attacker chooses a random plaintext P, encrypts the following 16 plaintexts,
56 E. Bihamand A. Shamir

and uses only the resultant ciphertexts:


9 The plaintext P itself.
9 The nine plaintexts obtained from P by XORing 66 00 00 00x, 60 60 00 00x,
60 00 60 00~, 60 00 00 60x, 60 00 00 06x, 9E 5F AC 7D:,, F7 A5 35 C7x,
7A F A 78 D5x, and 21 22 E3 2C~ into B~o1~(the first 32 bits of P).
9 The six plaintexts obtained from P by XORing A6 BD EF B7x, F4 F3 82 3C~,
4F 5C 37 51 x, 2B 76 7A DBx, 5A 19 F9 68~, and 33 EE DD FFx into all the Bto~
blocks.
These XOR values are chosen by the following criteria:
1. The first plaintext is the randomly chosen basis for the differential attack.
2. Five plaintexts have the maximal number of unchanged inputs to S boxes in
the qth round compared with P and with each other. For the values chosen
at least five of the inputs to each S box are unchanged, which makes it possible
to find the subkey of the last round.
3. Four other plaintexts have a maximal difference in the S boxes of the qth
round. This is used to find the subkeys of the (q + 1)th and all the subsequent
rounds (There is not enough variability in the previous values to find all those
subkeys.)
4. Six plaintexts have a maximal difference in the S boxes of the first q rounds.
This makes it possible to find the first q subkeys.
The cryptanalytic algorithm is as follows. At first the attacker tries to find the
subkey of the last round. Each one of the 15 pairs formed by the first six encryp-
tions has a different set of six S boxes whose input XORs in Bto1~ are zero. All
the other Bto~ i E {2, ..., q} have input XORs which are trivially zero. Thus each
one of the first q - 1 F functions have the same input and output values in all the
pairs. In each pairs the qth F function has zero input and output XORs in six of
the eight S boxes. Using this knowledge we get the output XOR of these six S boxes
in the last (2qth) round by the formula
q

Kn) = O
j=2

The input XOR is easily computed as ~_~ u, tq~ = B,t~ and the input itself is B~1~. Now
we try all the possible key bits for each S box separately and check that for the given
input XOR we get the given output XOR value. For each S box there are at least
five pairs which can distinguish values of the key bits. The (almost certainly unique)
value suggested by all the pairs is the key of the corresponding S box. Therefore,
the whole subkey of the last round is found. Now a decryption of the last round
can be done reducing the cryptosystem to 2q - 1 rounds.
Note that if the subkeys are derived by the DES key scheduling algorithm, then
48 bits out of the 56 key bits are known at this point. The others can be easily found
by trying all the 256 possibilities of the missing eight key bits. We thus proceed to
analyze the case of independent subkeys.
In the following q - 1 rounds we get the input and the input XOR of the F
Differential Cryptanalysis of DES-like Cryptosystems 57

function from the (partially decrypted) ciphertexts. The output XOR is calculated
by the formula
q
P'(n fl, K,) = B6 9 O
j=2

where r is the round number (r ~ {q + 1. . . . . 2q - 1}). In this case the first ten
ciphertexts are used. The additional four ciphertexts are needed primarily to find
K(q + 1) since in the first six encryptions there are too m a n y zero X O R bits and
more variety is needed. These added ciphertexts do no help in the nth round since
there we want the output XORs of the S boxes in the qth round to be zero.
In the remaining q rounds we use all the 16 ciphertexts. The additional cipher-
texts have nonzero differences in all the S boxes in all the rounds, whereas the first
ten had a constant value during the first q - 1 rounds. The input XOR is calculated
by the formula
q

j=2

where r is the round number (r e {1 . . . . . q}) and ~p is


~B~)r if r < q,
~ P = ( B 6 ~2) if r=q.

9.2.7. Conclusions. G D E S with n = q = 8 is breakable using a known-plaintext


attack with three ciphertexts. With a key scheduling similar to DES, G D E S is
vulnerable to a known-plaintext attack when n = q + 1 as well.
G D E S with q = 8 and n = 16 was suggested in [-16] and [18]. The 15-round
variant is easily breakable using the n = 2q - 1 attack with three ciphertexts. The
16-round version is breakable using the extension to n = 2q with six ciphertexts in
0.2 seconds on a C O M P A Q personal computer. If independent keys are used, then
it is breakable with 16 ciphertexts in 3 seconds on the same computer.
G D E S with q = 8 and n = 22 is breakable using the n = 3q - 2 attack with 48
ciphertexts (24 pairs). G D E S with q = 8 and n = 31 is breakable using the n =
4q - 1 attack with 250,000 pairs and SIN = 21a/(2342. 0.813) ~ 27 with memory of
size 2 la. Even G D E S with q = 8 and n = 63 is weaker than DES and is breakable
using 252 ciphertexts. In general, any G D E S which is faster than DES is also less
secure than DES.

10. Nondifferential Attacks on D E S Reduced to Few Rounds

In this section we describe several novel attacks on DES reduced to three to six
rounds which are not based on the ciphertext pair paradigm. These attacks are of
three kinds: ciphertext-only attacks, known-plaintext attacks, and statistical-
known-plaintext attacks. C o m p a r e d with differential attacks, they analyse fewer
ciphertexts but require more time.
58 E. Biham and A. Shamir

10.1. Ciphertext-Only Attacks

10.1.1. A Three-Round Attack. This attack assumes that the eight plaintext bytes
are ASCII characters whose most-significant bits are zeros. The Initial Permutation
(IP) packs the most-significant bits of all these bytes into a single byte. This byte
is the fifth byte of the permuted plaintext which is the first byte of the right half.
Given a ciphertext T = (l, r) we can easily calculate eight bits of the output of the
second round by B = a ~9 c = R q) r. From Table 26 (see Appendix A) we see that
these eight bits are the output of seven S boxes in the second round (where two of
them come from $5). The attack is as follows:
1. We try all the possibilities of the key bits entering $5 in the second round and
all the key bits entering the six S boxes S1, $2, $3, $4, $6, and $8 in the third
round. Their output bits are XORed with the data bits entering $5 in the
second round. Three bits are counted in both rounds and thus 39 bits are
exhaustively tried.
2. Using the tried key bits and any ciphertext we find the output of the six S
boxes in the third round and the input and output of $5 in the second round.
3. We compare the two computed output bits of $5 in the second round to their
expected value. If they are different, then the 39 key bits are wrong. A quarter
of the tried keys have the expected value. By trying additional ciphertexts we
can discard more key values. We stop when only one candidate remains.
Since we start with 239 possible keys and only a quarter of them survive each test,
we need about log4 239 = 19.5 ciphertexts. When the correct 39 key bits are deter-
mined, we can exhaustively try all the possible values of the remaining 17 bits b y
checking whether the decoded plaintexts are ASCII characters. The attack thus
needs a total of 239 steps and 20 ciphertexts to break DES reduced to three rounds.

10.1.2. Another Three-Round Attack. In this attack we assume that the plaintext
bytes belong to a smaller set in which the three most-significant bits are constant.
Such sets are the ASCII capital letters, the ASCII lowercase letters, and the ASCII
digits. The three most-significant bits of all the eight plaintext bytes are packed into
three bytes by the initial permutation. These three bytes are the first byte of the left
half and the first and second bytes of the right half. Since the first and second bytes
of the right half are constant in all the plaintext blocks, the inputs of $2 and $3 in
the first round are constant and thus their outputs are constant as well. We can
calculate the output bits of the third round by the equation
C=L~A~I. (2)
Two bits of the eight constant bits in L have corresponding constant bits in A: one
of them is an output of $2 and the other is an output of $3 (see Table 26). Since I is
known, the two bits in C are known up to an XOR with a constant. These bits are
outputs of $2 and $3. Trying all the 64 possibilities of the key bits entering $2 in
the third round, we can check that in any pair of ciphertexts the output bit of $2
satisfies C 1 ~ 11 = C 2 O) 12. Since half the keys satisfy this condition, we need about
1 + log2 64 = 7 ciphertexts to find the six key bits entering $2 in the third round.
DifferentialCryptanalysisof DES-likeCryptosystems 59

The same ciphertexts can be used to find the six key bits entering $3 in the third
round. This leaves 44 unknown key bits, which can be found in 244 steps with seven
ciphertexts.

10.1.3. A Four-Round Attack. This attack is an extension of the previous three-


round attack and assumes (as before) that the three most-significant bits of each
plaintext byte are constant. In this attack two bits of C are found by the equation
C= L~AO)d= L~AO)r
which is similar to (2). Then two output bits (one in $2 and one in $3 in the third
round) are known up to a constant. We try all the possible key values of the six key
bits of $2 (or similarly $3) in the third round and all the possible key values of the
six S boxes in the fourth round whose output bits are XORed with the data bits
entering $2 (or $3) in the third round. We try a total of 36 key bits entering the
fourth round and six key bits entering the third round, but five bits are common
(six when using $3) and thus we have to try 237 possible key values. We need about
1 + log2 237 = 1 + 37 = 38 ciphertexts to make the computed key unique.

10.2. Known-Plaintext Attacks

10.2.1. A Three-Round Attack. The DES key scheduling algorithm divides the 56
key bits into halves. Each half has 28 bits, and supplies the key bits to the same four
S boxes in all the rounds.
Consider DES reduced to three rounds with a single known plaintext/ciphertext
pair. The exclusive-or value of the output of the first round and the third round is
known by the equation
AO)C=LO)I.
We first try all the 228 possibilities of half of the key. Each candidate makes it
possible to compute the output of four S boxes in the first round and the output of
the same S boxes in the third round. We know their expected exclusive-or value.
Since the value has 16 bits, only about 2 -16 of the candidates survive this test. Thus
we get about 212 possibilities for the first 28 bits of the key. In a similar way we get
about 212 possibilities for the other 28 bits of the key. Therefore we find about
212" 212 = 224 possibilities for the full key, which can be exhaustively searched. The
complexity of this algorithm is about 229, and can be reduced to about 221 by
choosing the key bits entering each S box sequentially rather than in parallel, and
discarding partial keys as soon as they lead to a contradiction.

10.3. Statistical-Known-Plaintext Attacks

10.3.1. A Three-Round Attack. In this attack we use the fact that in a pairs XOR
distribution table, if we know that the output XOR is zero, then the input XOR is
zero with probability 1/4. Given the plaintext and the ciphertext of an encryption
we can easily calculate A 0) C = L ~ 1. Then the following algorithm is used for
each S box. Choose only the encryptions whose output XOR from this S box is zero
( ~ of the encryptions): Soa ~3 Soc = 0. If S~o ~ Sxc = 0, then the corresponding bits
60 E. Bihamand A. Shamir

ofa ~) c = R 0) r equal SKo ~ SKc.We count the number of occurrences of each such
XOR value. The right value is suggested by about a quarter of the encryptions.
Each incorrect value is suggested by about 88 ~3 of the encryptions. The value that
appears most frequently is likely to be the value of Sx,, O) Sxc. This algorithm is used
for each S box and thus we find 8- 6 = 48 bits that are XORs of the actual key bits.
Then trying 2 s possibilities we can find the full 56-bit key. We need about four
occurrences of the right value of the key XOR for each S box, i.e., total of about
4" 4" 16 = 256 plaintext/ciphertext pairs.

10.3.2. A Four-Round Attack. In this attack we use the fact that for all the S boxes
there is a weak correlation between the value of the XOR of the four output bits
and the value of bit number 2 of the input. In particular, for every two inputs of an
S box, if the XOR of the four output bits of the first input equals the corresponding
value of the second input, then both bits 2 of the input are equal with a certain
probability. This probability is different for each S box and varies between 0.56 and
0.70.
Given the plaintext and the ciphertext of an encryption we can easily calculate
So``~ Soc by
A~C=LOr.
Then the following algorithm is used separately for each S box. For every encryp-
tion calculate the (single bit) XOR of the four output bits of the first round and the
four output bits of the third round by the about equation. This value is likely to be
equal to the XOR of bit number 2 of the inputs of the S box in these two rounds.
$I`` is known up to an XOR with the key (by the plaintext) and thus bit number 2
of the input in the third round is known up to an XOR with a constant with a high
probability. This constant is the XOR of the corresponding bit number 2 in SK``
Src. Thus by D = l ~ c we find the corresponding output bit in the fourth round
up to that constant with a high probability. We try all the 64 possibilities of the key
bits entering the corresponding S box in the fourth round and the two possibilities
of the constant and verify that the specific output bit of the S box equals its expected
value. The right key value is counted in about 56%-70% of the encryptions, de-
pending on the exact S box. Any wrong key value is counted in about half of the
encryptions. The key value which is counted most frequently is likely to be the right
value. This attack finds a total of seven bits: six of them are actual key bits and the
seventh is an XOR of two key bits.
The attack obtains the best results when the probability is as high as possible.
To increase the probability we use only encryptions with specific values of So,, O) Soc
which maximize this probability, For instance, when $5o``O) $5Oc = 0 this proba-
bility is about 0.81. There is a tradeoff between the number of allowed values and
the corresponding probability. As the number of allowed values increases, the prob-
ability decreases so we need more data to carry out the attack. However, as the
number of allowed values decreases we need more data to make the occurrence of
these values sufficiently probable. Table 17 describes the best tradeoff achievable
by this attack. To make the best use of this attack it is advisable to use about 200
plaintext/ciphertext pairs, from which we can find almost 28 key bits, and search
exhaustively for the (about 228 ) remaining possibilities of the key. Using about 370
Differential Cryptanalysisof DES-like Cryptosystems 61

Table 17. Number of encryptions needed to find SK~ for


each S box.

Best tradeoff
By Finding Average
S box bits of probability(%) Values Encryptions
s1 $4 66 16 75
$2 $8 57 8 195
$3 s1 58 7 240
$4 $2 56 9 370
$5 s1 70 16 50
$6 $8 61 8 135
$7 $5 60 14 210
$8 $6 63 12 120

plaintext/ciphertext pairs we can find almost 42 key bits and search exhaustively
for the (about 214 ) remaining possibilities of the key.

10.3.3. A Five-Round Attack. This five-round attack is similar to the previous


algorithm. We can calculate B ~ D = R ~ r. Then an input XOR bit of the S box
in the second and fourth round is known with probability between 0.56 and 0.70.
As a result, an output bit of A ~ E is known up to an XOR with a constant by
L t~ A = b and d E) E = I and thus
A O) E = b ~ d ~ ) L ~ l.
Using a counting method that counts on the key bits entering the same S box in
the first round, the key bits entering the corresponding S box in the fifth round, and
the constant, we can find 13 bits of the key: six of them are actual key bits from the
first round, six are actual key bits from the fifth round, and the thirteenth is an XOR
of two key bits. The amount of data needed to find these 13 key bits is about the
same as in the previous attack.

10.3.4. A Six-Round Attack. This attack is again similar to the attack on five
rounds, but we also have to count all the possibilities of the 36 subkey bits of the
sixth round which enter S boxes whose output bits enter the counted S box in the
fifth round by the P permutation. In total we count on 49 bits. The total complexity
of this attack is about 255-256 but the basic operation (which is similar to a single
application of the F function) is much simpler than an encryption, and thus the time
needed is marginally faster than exhaustive search.

Appendix A. DES Tables

Table 18. S1 table.


14 4 13 1 2 15 11 8 3 10 6 12 5 9 0 7
0 15 7 4 14 2 13 1 10 6 12 11 9 5 3 8
4 1 14 8 13 6 2 11 15 12 9 7 3 10 5 0
15 12 8 2 4 9 1 7 5 11 3 14 10 0 6 13
~w ~w m
~m

~m

~r

J~
~J

z~
Differential Cryptanalysis of DES-like Cryptosystems 63

Table 26. The P permutation table.

From To

Bit S box Bit mask Bit S box Bit mask Missing


no. and bit (hex) no. and bit (hex) S box

1 S1 1 80 00 00 00 9 $2.6 $3.2 00800000 $7


2 2 40 00 00 00 17 $4.6 $5.2 00008000
3 3 20 00 00 00 23 $6.4 00000200
4 4 100000000 31 $8.4 00000002
5 $2 1 08 00 00 00 13 $3.6 $4.2 00080000 $6
6 2 04 00 00 00 28 $7.5 $8.1 00000010
7 3 02000000 2 S1.3 40000000
8 4 01 00 00 00 18 $5.3 00004000
9 $3 1 00800000 24 $6.5 $7.1 00000100 SI
10 2 00 40 00 00 16 $4.5 $5.1 00010000
11 3 00 20 00 00 30 $8.3 00000004
12 4 00 10 00 00 6 $2.3 04000000
13 $4 1 00 08 00 00 26 $7.3 00000040 $2
14 2 00 04 00 00 20 $5.5 $6.1 00001000
15 3 00 02 00 00 10 $3.3 00400000
16 4 00 01 00 00 1 $8.6 S1.2 80000000
17 $5 1 00 00 80 00 8 $2.5 $3.1 01 000000 $8
18 2 00004000 14 $4.3 00040000
19 3 00 00 20 00 25 $6.6 $7.2 00000080
20 4 00001000 3 S1.4 20000000
21 $6 1 00 00 08 00 4 S1.5 $2.1 10000000 $4
22 2 00 00 04 00 29 $7.6 $8.2 00000008
23 3 00 00 02 00 11 $3.4 002000000
24 4 00 00 01 00 19 $5.4 00002000
25 $7 1 00 00 00 80 32 $8.5 SI.I 0O 0000 O1 $5
26 2 00 00 00 40 12 $3.5 $4.1 O0 100000
27 3 00 00 00 20 22 $6.3 00000400
28 4 00 00 00 10 7 $2.4 02000000
29 $8 1 00 00 00 08 5 S1.6 $2.2 08000000 $3
30 2 00 00 00 04 27 $7.4 00000020
31 3 00 001~0 02 15 $4.4 00020000
32 4 00 00 O0 01 21 $5.6 $6.2 00000800
X
0

o
~ o o ~ m ~ o ~ o ~ o ~ H ~ H ~ o m H o ~ o o ~ o ~ o ~ o ~ o ~ o o o ~ o m o ~ o m o ~ o ~ o ~ o ~ o ~ o o ~

II

.w

.~ r~

~ o o
C~

~r"

k
x
9

~4

~r

o"

~r

C~
~r
"0

O"

C)

o
r~
x~

u=
~=

x
0

g.
0"
x
0

~n

o ~"

~o~
0

o
x
o

o.I

b~
x
O

~D

,ii
k

~r

~r

o
72 E. Biham and A. Shamir

References

[1] E. F. Brickell, J. H. Moore, M. R. Purtill, Structure in the S-boxes of the DES, Advances in
Cryptolooy, Proceedings of CRYPTO 86, pp. 3-7, 1986.
[2] D. Chaum, J.-H. Evertse, Cryptanalysis of DES with a Reduced Number of Rounds, Sequences of
Linear Factors in Block Ciphers, Advances in Cryptology, Proceedings of CR YPTO 85, pp. 192-
211, 1985.
[3] D. W. Davies, Private communications.
[4] B. Den Boer, Cryptanalysis of F.E.A.L, Advances in Cryptolooy, Proceedings of EUROCRYPT
88, pp. 293-300, 1988.
[5] Y. Desmedt, J.-J. Quisquater, M. Davio, Dependence of output on input in DES: small avalanche
characteristics, Advances in Cryptolooy, Proceedings of CR YPTO 84, pp. 359-376, 1984.
[6] W. Diffie, M. E. HeUman, Exhaustive cryptanalysis of the NBS Data Encryption Standard,
Computer, Vol. 10, No. 6, pp. 74-84, June 1977.
[7] H. Feistel, Cryptography and data security, Scientific American, Vol. 228, No. 5, pp. 15-23, May
1973.
[8] M. E. Heliman, A cryptanalytic time-memory tradeoff, IEEE Transactions on Information Theory,
Vol. 26, No. 4, pp. 401-406, July 1980.
[9] M. E. Hellman, R. Merkle, R. Schroppei, L. Washington, W. Diffie, S. Pohlig, P. Schweitzer,
Results of an Initial Attempt to Cryptanalyze the NBS Data Encryption Standard, Standford
University, September 1976.
[10] R. C. Merkle, A fast software one-way hash function, Journal of Cryptology, Vol. 3, No. 1,
pp. 43-58, 1990.
[11] S. Miyaguchi, Feal-N specifications, NTT, 1989.
[12] S. Miyaguchi, News on Feal Cipher, Talk at the RUMP session at CRYPTO 90, 1990.
[13] S. Miyaguchi, K. Ohta, M. Iwata, 128-bit hash function (N-Hash), Proceedings of SECURICOM
90, pp. 123-137, March 1990.
[14] S. Miyaguchi, A. Shiraishi, A. Shimizu, Fast data encryption algorithm Feal-8, Review of Electrical
Communications Laboratories, Vol. 36, No. 4, pp. 433-437, 1988.
[15] National Bureau of Standars, Data Encryption Standard, FIPS publication, No. 46, U.S. Depart-
ment of Commerce, January 1977.
[16] I. Schaumuller-Bichl, Zur Analyse des Data Encryption Standard und Synthese Verwandter
Chiffriersysteme, Ph.D. Thesis, Linz University, May 1981.
[17] I. Schaumuller-Bichl, Cryptanalysis of the Data Encryption Standard by the method of formal
coding, Cryptolooia, Proceedings of CR YPTO 82, pp. 235-255, 1982.
[18] I. Schaumuller-Bichl, On the Design and Analysis of New Cipher Systems Related to the DES,
Technical Report, Linz University, 1983.
[19] A. Shimizu, S. Miyaguchi, Fast Data Encryption Algorithm Feai, Advances in Cryptolooy, Pro-
ceedinos of EUROCRYPT 87, pp. 267-278, 1987.
[20] A. Shimizu, S. Miyaguchi, Fast Data Encryption Algorithm Feal, Abstracts of EUROCRYPT 87,
pp. VII-11-VII-14, April 1987.

You might also like