Encrypt, Sign, Attack A Compact Introduction To Cryptography
Olaf Manz
Mathematics Study Resources
Series Editors
Kolja Knauer
Departament de Matemàtiques Informàtic
Universitat de Barcelona
Barcelona, Barcelona, Spain
Elijah Liflyand
Department of Mathematics
Bar-Ilan University
Ramat-Gan, Israel
This series comprises direct translations of successful foreign language titles, especially
from the German language.
Powered by advances in automated translation, these books draw on global teaching
excellence to provide students and lecturers with diverse materials for teaching
and study.
© The Editor(s) (if applicable) and The Author(s), under exclusive licence to Springer-Verlag GmbH, DE, part
of Springer Nature 2022
Translation from the German language edition: “Verschlüsseln, Signieren, Angreifen” by Olaf Manz, © Springer-
Verlag GmbH Deutschland, ein Teil von Springer Nature 2019. Published by Springer Berlin Heidelberg. All
Rights Reserved.
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the
whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations,
recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now
known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does
not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective
laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book are
believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give
a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that
may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
This Springer imprint is published by the registered company Springer-Verlag GmbH, DE, part of Springer Nature.
The registered company address is: Heidelberger Platz 3, 14197 Berlin, Germany
Preface
Have you ever wondered whether even your most closely guarded secrets can safely be
confided to a mobile phone? Or whether online banking is really secure these days? Or
whether an electronic signature on contracts sent by e-mail meets legal requirements? All
of this has to do with the encryption, or ciphering, of data, which is transmitted over data
highways or wirelessly, or stored on data carriers, every day in large and ever-increasing
quantities.
Textbooks and reference books take a more scientific approach to the topic of data
encryption under the title of cryptography. They deal with the mathematical theory of
the common procedures, describe their algorithms and software implementations, and
also cover many aspects of organizational implementation. As a basis for lectures or
seminars, their primary goal must be to introduce students to scientific work and to areas
of current research. Practitioners working in the subject also need a correspondingly
comprehensive presentation. On the other hand, there are also numerous popular science
publications that aim at a generally understandable level. This works very well here, since
simple ciphering methods can easily be explained to interested laypeople and illustrated
with examples from everyday practice. The mathematics behind them, however, usually
remains hidden.
This book aims to be a balancing act between the two. It is a fact that cryptography can
be understood quite comprehensively with very little mathematics. Our goal is therefore,
without a theoretical superstructure, to deal specifically with the most important proce-
dures of encryption, signing and authentication, and to present them in a compact and
mathematically understandable manner, which is reflected in many practical examples.
• We focus first on symmetric ciphers, where anyone who knows how a message was
enciphered can also decipher it. The procedures go back to antiquity, to the Caesar
cipher, in which each letter of the alphabet is replaced by the letter three places further
on. The Vigenère cipher from the sixteenth century does this much more subtly, while
more modern methods such as Triple DES (Data Encryption Standard) and especially
today’s standard method AES (Advanced Encryption Standard) are considerably more
complex.
• But how can it be that one can encrypt yet cannot decrypt, even with the help of the
biggest and most modern computers? The keyword is public key. We will learn about
the standard methods: RSA relies on the difficulty of decomposing large natural numbers
into their prime factors, while Diffie-Hellman and ElGamal exploit the fact that “discrete
logarithms” cannot be computed efficiently. Here, we even encounter “elliptic curves”
with ECDH.
• Ciphering would certainly be unnecessary if there were no rogues, and above all no
professional attackers, who expect political, military or economic advantage from
knowing secret data and therefore try to “crack” the encryption. In addition to classical
attacks using statistical analysis and Friedman’s coincidence index, we learn about
Pollard’s methods for efficiently factorizing large natural numbers in order to potentially
“crack” RSA. Finally, we also attack the “discrete logarithm” with Baby-Step-Giant-Step
and Pohlig-Hellman.
• Particularly fatal, however, is an attack in which an unauthorized person not only
passively listens in but also actively intervenes in the message traffic and alters it for
their own purposes. In this case, the recipient of a message cannot tell whether the
information received really originates from the specified sender in exactly this form. To
prevent this situation, digital signatures are used, for example the RSA, DSA or ECDSA
procedure, thus giving a man-in-the-middle attack no chance.
• Of course, we will always deal with practical applications. Historically interesting are,
for example, the Illuminati cipher and the Enigma machine. The Internet with HTTPS
is perhaps the most prominent modern application for secure data transmission, but
WLAN networks and the Bluetooth radio interface are also well protected today. The
PGP (Pretty Good Privacy) method is widely used for e-mails, while mobile
communications with GSM are only partially secure against eavesdropping, but those
with UMTS/LTE are much more secure. Another focus is on online banking, credit
cards and Bitcoin. Finally, e-passports with their biometric data are also designed to be
forgery-proof. Last but not least, data stored on hard disks, and thus passwords in
particular, must be protected against unauthorized access.
The target audience for this book is basically anyone who is enthusiastic about the topic;
in particular, it is also intended as an introduction to more advanced literature. We will
need relatively little mathematics, but some nevertheless: arithmetic with binary numbers
(bits) and with remainders modulo a natural number, as well as an understanding of
permutations, both for the conceptual background and for the occasional formal
derivation. However, we will build this up piece by piece, with special emphasis on the
plausibility of the relationships. So, let’s plunge into the adventure, and have fun.
As a guide, here is a brief reading guide for the four chapters of this book in advance:
Contents
4 Digital Signature
4.1 Man-in-the-Middle Attack and Authentication
4.2 RSA and ElGamal Signature
4.3 Hash Value and Secure Hash Algorithm SHA
4.4 Email with PGP and WhatsApp
4.5 DSA and ECDSA Signature
4.6 Online Banking
4.7 Blind Signature and Cryptocurrencies
4.8 Password Security and Challenge Response
4.9 Mobile Phone, Credit Card and Passport
Bibliography
Index
1 Basics and History
An online channel can be thought of as computer networks (LAN, Internet, etc.), mobile
communications networks or digital television via cable or satellite. Storage media include,
for example, hard disk or USB stick.
Let us first use Fig. 1.2 to clarify the terms and the individual steps in a little more detail.
First, the desired information must be structured and “put on paper”. This can be done
in German, English or another language and should also be illustrated with some graphics
and photos. To do this, you may already be using WORD for text passages and TIFF for
graphic formats and have thus already digitized your documents.
When sending or archiving documents, however, it must also be taken into account that
the transmission time should be as short as possible and the required storage space as
small as possible, i.e. the data should be used in a suitably compressed form. The optimal
digitization of data together with suitable compression is a sub-aspect of the field of
information theory.
In a further step, our digitized document should be protected against eavesdropping or
even changes by unauthorized third parties. To do this, we encrypt its contents in such a
way that it cannot be read or even changed by strangers. This is called ciphering, the
related field is called cryptography. In addition, we have to find methods to “crack” the
ciphers used, i.e. to put ourselves in the role of an attacker, either in real or virtual terms.
This is called cryptanalysis.
Last but not least, our transmission channel is susceptible to interference (e.g. short-term
noise), or the storage medium used might have been damaged (e.g. scratches on a DVD).
In the so-called encoding process, we add a little redundant information to our text so that
any errors that occur can usually be detected and, where possible, corrected without
having to ask the sender again. This step is the core of the field of coding theory [Man].
1.2 Alphabets and Digitisation 3
After receiving our document or reading out the corresponding memory contents, the
above steps must all be undone, as shown in Fig. 1.3.
Reception errors should at least be detected or, even better, automatically corrected. In
addition, the redundancy must be removed again and thus the original message recovered
(decoding). Then the message must be decrypted (deciphering), which of course requires
that the recipient knows the decryption procedure. Finally, the document must be con-
verted from its compressed and digitized state back into the readable source text including
embedded graphics. Only now can the content of the document be understood by the
recipient.
The topic of this book is cryptography and cryptanalysis. Thus, we will discuss ciphering
in detail, i.e., how to protect sensitive information against unwanted eavesdropping or
even unauthorized modification.
The basic prerequisite for the transmission or storage of abstract information is first of all
its structured documentation. As a rule, a text is documented with letters, while sequences
of digits or combinations of digits and letters are used for identifiers (e.g. a passport
number); none of this necessarily has to be digital at first. However, if you use Word, for
example, the text is automatically digitized. In the case of graphics, which nowadays are
no longer drawn by hand but created with PowerPoint, for instance, and of digital photos,
there is ultimately no other choice anyway. In any case, the structuring of information is
always based on the use of so-called alphabets. Here are some examples.
For arithmetic implementations, letters naturally have the disadvantage that one cannot
calculate with them. Digits pose a problem too, because calculations quickly leave the
one-digit range: 7 + 8 = 15 and 3 ∙ 4 = 12, for example, are no longer single digits. It is
better to operate with remainders modulo a natural number m instead of letters or digits,
i.e. with the possible remainders when dividing by m. The alphabet then consists of
0, 1,…, m − 1, which can be added and multiplied modulo m. For example, for m = 10,
7 + 8 has remainder 5 when divided by 10, and thus 7 + 8 = 5, read modulo 10. The
product 3 ∙ 4 has remainder 2 when divided by 10, so 3 ∙ 4 = 2, also read modulo 10. Sum
and product are thus again elements of the alphabet in our modulo calculation. We can
now also calculate with letters by simply taking the m = 26 capital letters as 0, 1, 2,…, 25
and thus as remainders modulo 26.
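The calculations above can be retraced with a few lines of Python. This is only a minimal sketch; the helper names letter_to_number and number_to_letter are chosen here for illustration and do not come from the text.

```python
# Arithmetic modulo m: every result stays inside the alphabet 0, 1, ..., m - 1.
m = 10
print((7 + 8) % m)  # remainder of 15 when divided by 10
print((3 * 4) % m)  # remainder of 12 when divided by 10


def letter_to_number(letter):
    """Map the capital letters A, B, ..., Z to the remainders 0, 1, ..., 25."""
    return ord(letter) - ord("A")


def number_to_letter(number):
    """Map any integer back to a capital letter via its remainder modulo 26."""
    return chr(number % 26 + ord("A"))


print(letter_to_number("Z"))   # Z corresponds to 25
print(number_to_letter(29))    # 29 has remainder 3 modulo 26, i.e. the letter D
```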
Calculating modulo a natural number m will turn out to be an important procedure in
many places. For two integers a and b, which may also be negative, one writes
a = b (mod m) and means by this that a and b are equal modulo m, i.e. have the same
remainder 0, 1,…, or m − 1 when divided by m. The following criteria are then
equivalent:
• a = b (mod m)
• a − b is divisible by m
• a and b differ only by a multiple of m
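As a small illustration of these criteria, the following sketch tests congruence via divisibility of a − b. The function name congruent is an assumption made for this example.

```python
def congruent(a, b, m):
    """Return True if a = b (mod m), i.e. if a - b is divisible by m."""
    return (a - b) % m == 0


# For a positive modulus, Python's % operator returns a remainder in
# 0, 1, ..., m - 1 even when the left operand is negative.
print(-3 % 26)                 # 23
print(congruent(-3, 23, 26))   # True: -3 and 23 differ by 26
print(congruent(7, 8, 10))     # False: 7 - 8 is not divisible by 10
```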
+ | 0 1        ∙ | 0 1
0 | 0 1        0 | 0 0
1 | 1 0        1 | 0 1
It also makes sense to consider alphabets made up of blocks of bits, such as blocks of 2
bits, namely 00, 01, 10, 11, but often blocks of 8 bits are used, as we will now see.
So far, we have used the terms digitization and alphabet rather intuitively. In general, digi-
tization is understood as the conversion of abstract information or analog values into a
sequence of “discrete” characters. As a rule, this involves only a finite number of charac-
ters, and their totality is then called an alphabet. Against this background, the conversion
of a spoken text into a sequence of letters can already be regarded as “digitization”.
Digitization in the narrower – and today always assumed – sense, however, also means
that the elements of the alphabet are represented as a binary string, i.e. as a sequence of
bits 0 and 1. In the case of texts, for example, their letters consist of blocks of bits; in the
case of photos and graphics, the same is true for the color and brightness values of the
individual image points, so-called pixels, as Fig. 1.4 illustrates.
One block length has proven to be particularly useful, namely the length 8, with which
one can represent up to 2^8 = 256 characters such as letters, digits or brightness values.
A block of 8 bits is called a byte. However, the representation of characters is not always
limited to one byte and thus to 256 values; more than 8 bits or even several bytes can
also be used.
There is a standard that includes upper and lower case letters, digits, and most special
and control characters, called the ASCII character set (American Standard Code for
Information Interchange). Strictly speaking, ASCII defines 128 characters using 7 bits; its
common 8-bit extensions fill a whole byte with 256 characters. Although many procedures
are more flexible today, people still like to work with ASCII characters. In this case, the
number of the respective ASCII character must be written in binary representation in
order to derive the desired byte. Table 1.1 shows an extract from the table of all ASCII
values as an example.
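The step from character number to byte can be sketched as follows. The function name char_to_byte is hypothetical; the sketch relies on Python's built-in ord, whose values agree with ASCII for the characters 0 to 127.

```python
def char_to_byte(ch):
    """Return the 8-bit binary representation of a character's number."""
    code = ord(ch)
    if code > 255:
        raise ValueError("character does not fit into a single byte")
    return format(code, "08b")


# The letter A has the number 65 = 64 + 1, i.e. the byte 01000001.
for ch in "Aa":
    print(ch, ord(ch), char_to_byte(ch))
```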
Historical ciphers, of course, do not yet use bits and bytes, but letters, digits, and pos-
sibly some special characters, as we will see in the rest of this chapter.
We begin with historical ciphers in antiquity. Roman sources say that the statesman and
general Gaius Julius Caesar encrypted his secret communication by replacing each letter
of the alphabet by the one three places further on, i.e. “A” by “D” and finally “Z” by “C”.
Figure 1.5 shows an example.
Therefore, this cipher procedure is also called Caesar cipher, the underlying alphabet
being A, B,…, Z. We first make two observations:
• Of course, instead of three places, one could have chosen another number i from 0 to 25
and replaced each letter with the one i places further on.
• We have also already learned that instead of letters it is better to use the remainders 0,
1,…, 25 modulo 26. Based on these observations, each character z of our alphabet is
enciphered as z → z + i (mod 26). This is called a shift cipher.
1.3 Caesar Cipher 7
But how did the governors in the Roman provinces decode Caesar’s orders, i.e. decipher
them? Of course, they had to know the shift value i = 3, and they then moved all the
letters back three places in the alphabet, so that, for example, “A” becomes “X”. Using our
algorithmic notation for shift ciphers, this means z → z − i = z + (26 − i) (mod 26). Thus,
for the Caesar cipher with i = 3 and the letter A, i.e., z = 0, we get 0 → −3 = 26 − 3 = 23
(mod 26), which is the letter X. In this way, one can unambiguously decipher each
character in the encoded letter sequence.
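Enciphering and deciphering with a shift cipher can be sketched in a few lines. The function name shift and the sample text are assumptions for illustration; characters other than the capital letters A to Z are simply passed through.

```python
def shift(text, i):
    """Apply z -> z + i (mod 26) to every capital letter of the text."""
    result = []
    for c in text:
        if "A" <= c <= "Z":
            z = ord(c) - ord("A")
            result.append(chr((z + i) % 26 + ord("A")))
        else:
            result.append(c)  # leave spaces and other characters unchanged
    return "".join(result)


ciphertext = shift("VENI VIDI VICI", 3)   # Caesar cipher, i = 3
print(ciphertext)
print(shift(ciphertext, -3))              # deciphering: shift back by 3
print(shift(ciphertext, 26 - 3))          # the same, written as z + (26 - i)
```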
But how secure is a shift cipher? So let’s do some cryptanalysis and examine how the
encryption method can be cracked. This can be done quite easily by means of statistics. In
most languages, the “E” is by far the most frequent letter:
German 17.4%
English 12.7%
French 14.7%
Spanish 13.7%
Italian 12.0%
So you determine in a sufficiently large passage of the ciphertext the most common
letter, which could be, for example, the “Q”. This then most likely corresponds to the letter
“E”, which is 12 places before Q in the alphabet. Assuming that the cipher is a shift cipher,
one only needs to replace each letter in the ciphertext with the letter 12 places before it in
the alphabet, and the readable plaintext is obtained.
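The statistical attack just described can be sketched as follows. The function names and the sample sentence are made up for this example, and the guess only works if the most frequent plaintext letter really is “E”, which is not guaranteed for short texts.

```python
from collections import Counter


def shift(text, i):
    """Shift cipher z -> z + i (mod 26) on the capital letters A-Z."""
    return "".join(
        chr((ord(c) - ord("A") + i) % 26 + ord("A")) if "A" <= c <= "Z" else c
        for c in text
    )


def guess_shift(ciphertext, most_common_plain="E"):
    """Guess the shift value i from the most frequent ciphertext letter."""
    letters = [c for c in ciphertext if "A" <= c <= "Z"]
    most_common_cipher = Counter(letters).most_common(1)[0][0]
    return (ord(most_common_cipher) - ord(most_common_plain)) % 26


plaintext = "MEET ME HERE BEFORE THE SEVENTH EVENING"   # hypothetical sample
ciphertext = shift(plaintext, 12)
i = guess_shift(ciphertext)
print(i)                       # the guessed shift value
print(shift(ciphertext, -i))   # the recovered plaintext
```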
Now we make our shift cipher a bit more complicated: we not only add i, but also
multiply by a factor j, i.e. z → j ∙ z + i (mod 26). This is called an affine cipher. In this
case, however, j must be chosen to be coprime to 26, i.e., j odd and j not equal to 13.
Only then can the inverse j^(−1) (mod 26) be computed (Sect. 3.1) and the assignment
z → j^(−1) ∙ z − j^(−1) ∙ i (mod 26) be used for deciphering. Thus, the receiver must now
know both i and j. For example, for j = 3, j^(−1) = 9 (mod 26), since 3 ∙ 9 = 27 = 1
(mod 26).
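A minimal sketch of the affine cipher follows. The function names are chosen for illustration; the modular inverse is computed with Python's built-in pow(j, -1, 26), which requires Python 3.8 or later and is used here in place of the method from Sect. 3.1.

```python
from math import gcd


def affine_encrypt(text, j, i):
    """Apply z -> j * z + i (mod 26) to the capital letters A-Z."""
    if gcd(j, 26) != 1:
        raise ValueError("j must be coprime to 26")
    return "".join(
        chr((j * (ord(c) - ord("A")) + i) % 26 + ord("A")) if "A" <= c <= "Z" else c
        for c in text
    )


def affine_decrypt(text, j, i):
    """Apply z -> j^(-1) * z - j^(-1) * i (mod 26) to the capital letters A-Z."""
    j_inv = pow(j, -1, 26)  # raises ValueError if j is not invertible mod 26
    return "".join(
        chr((j_inv * (ord(c) - ord("A") - i)) % 26 + ord("A")) if "A" <= c <= "Z" else c
        for c in text
    )


print(pow(3, -1, 26))                      # 9, since 3 * 9 = 27 = 1 (mod 26)
secret = affine_encrypt("AFFINE CIPHER", 3, 5)
print(secret)
print(affine_decrypt(secret, 3, 5))
```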
But even the affine cipher is not much more secure than a shift cipher: an unauthorized
listener merely has to determine the second most frequent letter in a ciphertext in
addition to the most frequent one. In German, these will correspond to the letters E (i.e.
z = 4) and N (i.e. z = 13). For example, if one has determined A (i.e., z = 0) and F (i.e.,
z = 5) as the two most frequent letters in the ciphertext, then, assuming an affine cipher,
0 = j ∙ 4 + i (mod 26) and 5 = j ∙ 13 + i (mod 26) follow. Subtracting the first equation
from the second, we get 5 = j ∙ 9 (mod 26). Now multiplying by 3, the inverse of 9
modulo 26, we get 15 = j ∙ 9 ∙ 3 = j ∙ 27 = j (mod 26), so j = 15. Substituting this into the
first equation, we get 0 = 15 ∙ 4 + i = 60 + i = 8 + i (mod 26), so i = −8 = 18 (mod 26).
Therefore, the encryption is z → 15 ∙ z + 18 (mod 26). From this, the deciphering
procedure and hence the entire plaintext can be computed.
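The elimination carried out above can be written as a short routine. The function name recover_affine_key is hypothetical, and the sketch assumes that the difference of the two plaintext values is invertible modulo 26 (as 13 − 4 = 9 is); otherwise pow raises a ValueError and other letter pairs would have to be tried.

```python
def recover_affine_key(p1, c1, p2, c2):
    """Solve c1 = j*p1 + i and c2 = j*p2 + i (mod 26) for the key (j, i)."""
    # Subtracting the equations gives c2 - c1 = j * (p2 - p1) (mod 26).
    diff_inv = pow((p2 - p1) % 26, -1, 26)   # Python 3.8+
    j = ((c2 - c1) * diff_inv) % 26
    i = (c1 - j * p1) % 26
    return j, i


# E (z = 4) was enciphered as A (z = 0), N (z = 13) as F (z = 5).
print(recover_affine_key(4, 0, 13, 5))   # (15, 18)
```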
A historically popular means of secrecy is the use of secret scripts, such as that of the
Illuminati (Latin for “the enlightened”), a secret order founded in the eighteenth century,
around which numerous myths and conspiracy theories involving the Catholic Church
have grown up. The Illuminati became famous not least through Dan Brown’s best-selling
novel Angels & Demons (published in German as Illuminati) and the film adaptation
starring Tom Hanks. In the Illuminati’s cipher [Kuh], each letter and number is replaced
by a fixed, specially invented secret character of the Illuminati alphabet, as Fig. 1.6 shows.
The Illuminati secret writing therefore appears at first glance to be extremely strange
and hardly decipherable, as the simple example in Fig. 1.7 shows.
Nevertheless, there are also cryptanalytic starting points here. The frequency of letters
is transferred to the uniquely assigned secret character, so that statistical methods can be
applied again. The first step is to obtain the statistical frequencies of all letters in as many
different languages as possible. Table 1.2 shows some examples.
Now count the relative frequencies of all characters in a sufficiently large passage of the
ciphertext and compare them with the table. This will reveal quite a few plaintext letters
unambiguously. For the remaining, less clear-cut characters, one has to puzzle a bit over
which of the possibilities results in a meaningful plaintext. More problems, however, are
caused by digits used in the text, for which there are of course no statistical predictions.
Instead of inventing new characters, one can just as well permute the alphabet itself, so
that each plaintext letter is always encrypted by the same ciphertext letter. This is called
a monoalphabetic cipher. Here is an example of such a permutation π of letters, which is
to be kept secret:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
π U F L P W D R A S J M C O N Q Y B V T E X H Z K G I
Decryption is done with the reverse permutation π−1. The ciphertext looks visually
much less strange than in the Illuminati cipher, but the cipher itself has exactly the same
cryptanalytic effect. The affine cipher and thus also the Caesar cipher are simple special
cases of a monoalphabetic cipher.
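The permutation π from the table above and its inverse can be tried out with the following sketch. It uses Python's str.maketrans and str.translate; the function names are chosen for illustration.

```python
import string

PLAIN = string.ascii_uppercase          # A B C ... Z
PI = "UFLPWDRASJMCONQYBVTEXHZKGI"       # images of A, B, ..., Z under π

ENCRYPT_TABLE = str.maketrans(PLAIN, PI)
DECRYPT_TABLE = str.maketrans(PI, PLAIN)   # the reverse permutation


def mono_encrypt(text):
    """Replace every capital letter by its image under π."""
    return text.translate(ENCRYPT_TABLE)


def mono_decrypt(text):
    """Undo the substitution with the reverse permutation."""
    return text.translate(DECRYPT_TABLE)


secret = mono_encrypt("HELLO")
print(secret)                  # AWCCQ
print(mono_decrypt(secret))    # HELLO
```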
We now make another attempt to generalize the Caesar or shift cipher. We no longer shift
each letter in the plaintext by the same number i of places. Rather, we allow for different
values of i, but the pattern should repeat after a certain period d. For example, i might
cycle through 0, 21, and 4, and then the whole thing starts over with period d = 3.
Figure 1.8 visualizes the procedure.
In practice, of course, the period is much larger. Therefore, it has become common
practice to specify the corresponding letters of the alphabet as a so-called keyword
instead of the sequence of numbers, in our example AVE for the values 0, 21 and 4. Thus,
“A” stands for the number 0, “V” for 21 and “E” for 4. The keyword must be transmitted
secretly from the sender to the receiver, because it is needed to decrypt the message.
Therefore, text passages from literature are often used, which do not have to be
transmitted in full but only in a simpler and shorter form as a reference, such as “Faust I,
verse 512-13”.
Historically, the Vigenère cipher was described in a different way, namely via the so-called
Vigenère tableau, which is shown in Table 1.3.
In the header line, it contains all 26 letters for which the cipher assignment must be
determined. In the left margin column, each line is marked consecutively with the 26
letters. The lines within the tableau also contain the entire alphabet, but always shifted
cyclically to the left by one position.
Now the Vigenère keyword comes into play. Let’s take the keyword PAUSE with the
period d = 5 as an example instead of AVE. Then the first letter in the plaintext to be
encoded is encoded according to line “P” in the tableau, the second according to line “A”,
then according to the lines “U”, “S” and “E”. At the sixth letter in the plaintext, the whole
thing starts again from the beginning. To decode, the receiver must know the keyword. The
receiver then deciphers with the keyword as well, inverting from the corresponding line of
the Vigenère tableau back to the header line. Here is a simple example of a Vigenère cipher.
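The tableau lookup is equivalent to adding the keyword letter's value modulo 26, which the following sketch uses. The function name vigenere and the sample plaintext are assumptions for illustration; only the capital letters A to Z are enciphered, and the keyword advances only on such letters.

```python
def vigenere(text, keyword, decrypt=False):
    """Encipher (or decipher) the capital letters of text with the keyword."""
    result = []
    k = 0  # position within the keyword
    for c in text:
        if "A" <= c <= "Z":
            shift = ord(keyword[k % len(keyword)]) - ord("A")
            if decrypt:
                shift = -shift
            result.append(chr((ord(c) - ord("A") + shift) % 26 + ord("A")))
            k += 1
        else:
            result.append(c)
    return "".join(result)


secret = vigenere("ATTACK AT DAWN", "PAUSE")   # hypothetical sample plaintext
print(secret)
print(vigenere(secret, "PAUSE", decrypt=True))
print(vigenere("EEEEE", "PAUSE"))   # TEYWI: the same letter, five different images
```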
The main advantage of the Vigenère cipher is that it smooths the statistical frequency of
letters in natural languages. Let us again take the keyword PAUSE and as an example the
ciphertext letter I. We take from the Vigenère tableau that I may have arisen from the
following plaintext letters: T, I, O, Q, and E. Here are the rounded statistical frequencies
of these letters in the German language:
T : 6% I : 8% O : 2% Q : 0% E : 17%
So we see that I can arise from the most frequent letter E, but also from the almost
non-existent Q, from the relatively frequent I, from the quite rare O, and from the
moderately frequent T. So with our statistical approach alone, we can no longer get to the
bottom of the Vigenère cipher. For this we need a new idea.
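The five possible plaintext letters behind the ciphertext letter I can be recomputed by subtracting each keyword letter modulo 26. The function name plaintext_candidates is made up for this sketch.

```python
def plaintext_candidates(cipher_letter, keyword="PAUSE"):
    """List the plaintext letters that each keyword letter maps to cipher_letter."""
    c = ord(cipher_letter) - ord("A")
    return [chr((c - (ord(k) - ord("A"))) % 26 + ord("A")) for k in keyword]


print(plaintext_candidates("I"))   # ['T', 'I', 'O', 'Q', 'E']
```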
So let’s do some cryptanalysis again and try to crack the Vigenère cipher. As we will see
in a moment, this is not too difficult if the keyword is relatively short and thus the period
d is relatively small compared to the length of the ciphertext. The attack is then done in
two steps.
• First we determine the length of the keyword, i.e. the period d. For this we will learn the
Kasiski attack in a moment.
1.6 Kasiski and Friedman Attack 13
• Then one determines the keyword itself. If the period d is known, then it is only a matter
of d different shift ciphers, because the positions 1, d + 1, 2 ∙ d + 1,… are encrypted by
identical shift ciphers, as are the positions 2, d + 2, 2 ∙ d + 2,… and so on. As is well
known, these can easily be cracked individually using statistical methods (Sect. 1.3).
The Kasiski attack goes back to Friedrich Wilhelm Kasiski (1805–1881), who pub-
lished the method in 1863. It is based on the following idea: If certain character sequences
occur frequently in the plaintext, they also become identical ciphertext sequences if their
spacing is a multiple of the period d. Such repetitions of strings can of course also occur
randomly in the ciphertext, but this is much less often the case. Thus, in the Kasiski attack,
the ciphertext is examined for repetitions of strings of at least three characters and the
respective distances are determined. This gives clear indications of the period d. As a rule,
only a few possibilities remain, which are then examined more closely. This is best seen in
an example [Hau1], for example in the following ciphertext:
Table 1.4 lists some of the repeating strings with their positions and the respective
distance.
Since the long string LGFJVFSY is very unlikely to repeat randomly, we can assume
that the period is a divisor of 50. Since the number 5 divides the occurring distances in
nine cases, but 2 in only six cases and 25 in only one case, it is plausible to assume that
the period is d = 5. The string UJC would then have been the only one to repeat at
random. With d = 5 one now proceeds to the statistical analysis. If this does not lead to
success, one would next try the second most plausible possibility, d = 10. So the Kasiski
attack also needs some luck, namely that the ciphertext contains repeating strings that
are as long as possible.
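The bookkeeping of the Kasiski attack, i.e. finding repeated strings, their distances, and the candidate periods that divide them, can be sketched as follows. The function names and the short sample string are assumptions for illustration and are not the ciphertext from the example.

```python
from collections import defaultdict


def repeated_sequences(ciphertext, length=3):
    """Map every string of the given length that occurs more than once to its positions."""
    positions = defaultdict(list)
    for start in range(len(ciphertext) - length + 1):
        positions[ciphertext[start:start + length]].append(start)
    return {seq: pos for seq, pos in positions.items() if len(pos) > 1}


def distances(repeats):
    """Distances between consecutive occurrences of each repeated string."""
    return sorted(b - a for pos in repeats.values() for a, b in zip(pos, pos[1:]))


def divisor_votes(dists, candidates=range(2, 21)):
    """Count for each candidate period how many distances it divides."""
    return {d: sum(1 for x in dists if x % d == 0) for d in candidates}


sample = "ABCDEFGHIJABC"             # hypothetical text, ABC repeats after 10 places
repeats = repeated_sequences(sample)
print(repeats)
print(divisor_votes(distances(repeats), candidates=(2, 3, 5, 10)))
```

The candidate with the most votes is only a plausible guess for d; as in the text, the statistical analysis of the individual shift ciphers has to confirm it.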
To determine the period d of a Vigenère cipher using the Friedman attack, we think of the
ciphertext with its m letters read row by row into a table with d columns. Moreover, we
now also concretely assume that the plaintext comes from the German language. Then we
can state the following two facts:
• Each column of this table was encoded using a shift cipher. In particular, their letter
distribution corresponds to that of the German language.
• On the other hand, the Vigenère cipher smooths the overall letter frequency. Therefore,
we can approximately assume that all letters occur with equal probability in the
entire table.
1.7 Enigma Machine 15
Thus, if we select two letters of the ciphertext from the same column of the table, the
probability of drawing the same letter twice is approximately I_D = 0.0762, the value for
German text; if the two letters come from different columns, this probability is
approximately I_gW = 0.0385, the value 1/26 for uniformly distributed letters. The number
g of pairs of letters from the same column is g = m ∙ (m/d − 1)/2, and the number v of
pairs of letters from different columns is v = m ∙ (m − m/d)/2. Together they make up the
m ∙ (m − 1)/2 pairs of letters in the entire table and hence in the entire ciphertext.
Consequently, the probability I that two randomly selected ciphertext letters coincide is
approximately given by
I = (g ∙ I_D + v ∙ I_gW)/(m ∙ (m − 1)/2)
= (0.0762 ∙ m ∙ (m/d − 1)/2 + 0.0385 ∙ m ∙ (m − m/d)/2)/(m ∙ (m − 1)/2),
which resolves to
d = 0.0377 ∙ m/((m − 1) ∙ I − 0.0385 ∙ m + 0.0762).
Inserting the calculation formula
I = (m_0 ∙ (m_0 − 1) + … + m_25 ∙ (m_25 − 1))/(m ∙ (m − 1)),
where m_0,…, m_25 denote how often each of the 26 letters occurs in the ciphertext, gives
an at least approximate formula for determining the period d.
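The two formulas can be combined into a short routine. This is a sketch under the assumptions of the text (German plaintext with I_D = 0.0762 and I_gW = 0.0385); the function names are chosen for illustration, and the estimate is only meaningful for sufficiently long ciphertexts.

```python
from collections import Counter


def coincidence_index(ciphertext):
    """I = sum of m_k * (m_k - 1) over all letters, divided by m * (m - 1)."""
    letters = [c for c in ciphertext if "A" <= c <= "Z"]
    m = len(letters)
    counts = Counter(letters)
    return sum(n * (n - 1) for n in counts.values()) / (m * (m - 1))


def estimate_period(m, index, i_lang=0.0762, i_uniform=0.0385):
    """d = (I_D - I_gW) * m / ((m - 1) * I - I_gW * m + I_D)."""
    # The denominator can be zero or negative for unsuitable input;
    # such cases are not handled in this sketch.
    return (i_lang - i_uniform) * m / ((m - 1) * index - i_uniform * m + i_lang)


def friedman_period(ciphertext):
    """Estimate the period d of a Vigenère ciphertext."""
    m = sum(1 for c in ciphertext if "A" <= c <= "Z")
    return estimate_period(m, coincidence_index(ciphertext))


# Sanity check: a text whose index equals I_D behaves like a single shift cipher.
print(estimate_period(1000, 0.0762))   # approximately 1
```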
To conclude the chapter on historical ciphers, we now turn to a cipher machine that was
used by the German Wehrmacht during World War II: the Enigma machine (Greek
ainigma, “riddle”). Its inventor was Arthur Scherbius (1878–1929), whose first patent
dates back to 1918.
In the Enigma machine, the plaintext is entered via a keyboard. When a letter key is
pressed, electric current from a battery flows through an ingenious arrangement of
circuits in the Enigma and finally lights up an indicator lamp in the lamp field, which
indicates the encoding of the pressed letter. In this circuitry, the electrical signal is
typically first fed to a plugboard. This has 26 contacts, one for each letter of the alphabet.
Of these 26 contacts, ten pairs are selected to be wired together. Figure 1.9 shows the
schematic of the Enigma plugboard with ten exemplary wirings visualized in red. So the
signal from the keyboard is possibly redirected to another letter on the plugboard.
Non-wired letters remain unchanged.
The signal is then applied to a roller set consisting of three rollers. Each roller has 26
input and output contacts, which are interconnected in pairs within the roller. As shown
schematically in Fig. 1.10, the signal arriving on the left for a letter is first passed along the
red path through the three rollers and then meets a reflector, which in turn has 26 contacts
connected in pairs. The reflector passes the signal back through the three rollers. Each
roller consists of two parts: the core with the fixed wiring for substitution and the ring. The
ring position determines the offset between the internal wiring of the rollers and the letter
at which the carryover to the next roller occurs. After exiting the roller set, the signal is
passed over the plugboard again and then ultimately displayed in the lamp field as a letter.
While the reflector is immobile, the three rollers, driven by a mechanical coupling,
rotate as follows after each input of a letter: The left “fast” roller starts and rotates one
position after each letter is entered, so that it returns to its original position after 26 rota-
tional steps. After that, the “middle” roller rotates by one position, and then the first one
again completes a full rotation. After 26 rotation steps of the “middle” roller, the right
“slow” roller starts to rotate by one position, and this continues until the end of the text.
Due to its rotation mechanism, the Enigma machine is also a polyalphabetic cipher.
Basically, the five rollers I to V and the three reflectors A to C were available for the
Enigma machine [WPEnM]:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
I E K M F L G D Q V Z N T O W Y H X U S P A I B R C J
II A J D K S I R U X B L H W T M C Q G Z N P Y F V O E
III B D F H J L C P R T X V Z N Y E I W G A K M U S Q O
IV E S O V P Z J A Y Q U I R H X L N F T G K D C M W B
V V Z B R G I T Y U P S D N H L X A W M J Q O F E C K
A AE BJ CM DZ FL GY HX IV KW NR OQ PU ST
B AY BR CU DH EQ FS GL IP JX KN MO TZ VW
C AF BV CP DJ EI GO HY KR LZ MX NW QT SU
This results in the following degrees of freedom in the configuration of the Enigma
machine:
The trick of the machine is that, because the reflector mirrors the incoming signal, every
substitution of a letter is involutory, i.e., if the letter X is enciphered as Y, then the letter Y
would have been enciphered as X at the same position in the text. Therefore, a receiver who
had been secretly informed of the chosen configuration could decode the received message
with an identically configured Enigma machine.
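The involution can be tried out with a small Python sketch. It uses the wirings of rollers I to III and reflector B from the table above. The function name `encipher`, the start positions, the omission of the plugboard and of the ring positions, and the simple odometer-style stepping are simplifying assumptions made for illustration only; this is not a faithful model of the historical machine.

```python
import string

A = string.ascii_uppercase
ROLLERS = {
    "I": "EKMFLGDQVZNTOWYHXUSPAIBRCJ",
    "II": "AJDKSIRUXBLHWTMCQGZNPYFVOE",
    "III": "BDFHJLCPRTXVZNYEIWGAKMUSQO",
}
REFLECTOR_B = "AY BR CU DH EQ FS GL IP JX KN MO TZ VW"


def reflector_table(pairs):
    """Turn the list of letter pairs into a lookup table on 0..25."""
    swap = {}
    for a, b in pairs.split():
        swap[a], swap[b] = b, a
    return [A.index(swap[ch]) for ch in A]


def encipher(text, rollers=("I", "II", "III"), start=(0, 0, 0)):
    """Simplified Enigma: three rollers (fast one first), reflector B, no plugboard."""
    wirings = [[A.index(ch) for ch in ROLLERS[name]] for name in rollers]
    inverses = [[w.index(i) for i in range(26)] for w in wirings]
    reflector = reflector_table(REFLECTOR_B)
    pos = list(start)
    out = []
    for ch in text:
        c = A.index(ch)
        for w, p in zip(wirings, pos):                      # way to the reflector
            c = (w[(c + p) % 26] - p) % 26
        c = reflector[c]
        for w_inv, p in zip(reversed(inverses), reversed(pos)):  # way back
            c = (w_inv[(c + p) % 26] - p) % 26
        out.append(A[c])
        for i in range(3):                                  # assumed odometer stepping
            pos[i] = (pos[i] + 1) % 26
            if pos[i] != 0:
                break
    return "".join(out)


message = "WETTERBERICHT"
cipher = encipher(message)
print(cipher, encipher(cipher))
```

Enciphering the ciphertext again with the same start positions returns the message, and no letter is ever mapped to itself. Both properties follow from the reflector, whatever rollers and positions are chosen.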
The configuration options of Enigma were enormous for that time. Nevertheless, many
of them do not contribute much to Enigma’s security. The plugboard, for example, provides
nothing more than a simple monoalphabetic substitution. Moreover, the facts that Enigma
is involutory and fixed-point free, i.e. can never substitute a letter by itself, provide points
of attack for cryptanalysis. The Enigma encryption was indeed cracked by the British,
by the group around Alan Turing.
2 Symmetric Ciphers
It is now time to understand our approach to ciphering in a slightly more conceptual way.
Every encryption method consists of an encryption algorithm together with one or more
parameters, the encryption key, which can be used to assign a ciphertext one-to-one to
any plaintext. The cipher key thus determines the characteristics of the algorithm.
In the case of the shift cipher, the algorithm describes the “shifting” or, mathematically
speaking, the “addition modulo 26”. The key corresponds to the shift value i, i.e. in the
case of the Caesar cipher i = 3. In the case of the affine cipher, one even needs a key pair,
namely (i, j). In the case of the Vigenère cipher, the Vigenère tableau describes the algo-
rithm and the keyword, or the text passage used as the key, already has a whole set of
individual parameters, namely the letters of the keyword or text.
The requirement of the ciphertext being one-to-one means that for each ciphertext its
corresponding plaintext can be uniquely determined. The decryption of the ciphertext is
therefore carried out with a decryption algorithm belonging to the encryption method,
which in turn depends on one or more parameters, the decryption key.
The decipherment algorithm for the shift cipher is also a shift cipher, and the decipherment
key is −i. The decipherment algorithm for the affine cipher is again an affine cipher,
and the decipherment key consists of the key pair ($j^{-1} \pmod{26}$, $-j^{-1} \cdot i \pmod{26}$). In the
decipherment algorithm for the Vigenère cipher, one reads backwards from the rows of the
Vigenère tableau to the header row, using the same keyword or keytext that is used for
enciphering. The decipherment algorithm of the Enigma machine simply uses an identically
configured Enigma machine.
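The relation between encryption key and decryption key of the affine cipher can be checked with a short Python sketch. It assumes that the letters A to Z are identified with 0 to 25 and that the affine cipher maps x to j ∙ x + i (mod 26), with shift i and multiplier j; this convention and the function names `affine` and `decryption_key` are assumptions made for illustration.

```python
M = 26  # size of the alphabet


def affine(text, shift, mult):
    """Apply x -> mult * x + shift (mod 26) to every letter A..Z."""
    return "".join(chr((mult * (ord(ch) - 65) + shift) % M + 65) for ch in text)


def decryption_key(shift, mult):
    """Return (shift, mult) of the inverse map; mult must be coprime to 26."""
    mult_inv = pow(mult, -1, M)            # modular inverse, needs Python 3.8+
    return (-mult_inv * shift) % M, mult_inv


cipher = affine("CAESAR", shift=3, mult=5)
plain = affine(cipher, *decryption_key(3, 5))
print(cipher, plain)
```

With multiplier 1 the sketch reduces to the shift cipher, and the decryption key becomes the shift −i (mod 26).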
In our examples, the decryption key could easily be determined from the encryption key,
in some cases both were even the same. In this case we speak of symmetric ciphers. With
symmetric ciphers, knowledge of the cipher key is sufficient for decryption; in short, one
simply speaks of the key of a symmetric cipher.
If, on the other hand, the decryption key cannot be calculated from the encryption key,
or can only be calculated with an extremely large amount of effort and therefore in an
unrealistically long time, we speak of asymmetric ciphers. In this chapter we deal
exclusively with symmetric ciphers and turn to asymmetric ciphers only in the next
chapter.
Despite Kerckhoffs’ principle, the algorithms of many encryption processes are still kept
secret, especially in the military and intelligence sectors. An example is the satellite navi-
gation system GPS of the USA. A satellite navigation system is based on several satellites
that constantly broadcast their current position and the exact time using radio signals.
Special receivers can then calculate their own position from the signal propagation times
of four satellites. There are several systems worldwide, in particular GPS (Global
Positioning System) of the USA, Galileo (EU), GLONASS (Russia) and Beidou (China).
GPS dates back to the late 1980s and was originally developed for navigation by the US
Navy (NAVSTAR GPS). Today, however, it is at least partially available for civilian use
and is the de facto standard for road navigation. On the so-called L1 carrier frequency of
1575.42 MHz, the C/A code (Coarse/Acquisition) is transmitted as the basis for civilian
use. The non-public P/Y code (Precision/Encrypted) for precise military positioning is
transmitted separately on top of this. To protect against a possible enemy, the P-code is
encrypted into a Y-code. The entire procedure is kept secret by the military.
Based on the Kerckhoffs principle, the key of a symmetric encryption method must be
communicated to the authorized recipient in a secure way. However, there are several rea-
sons why the entire message is not transmitted in this secure way:
To crack a cipher, one therefore probably comes up with the obvious idea of obtaining the
secretly exchanged key in some way. But on the one hand, the handover of the key was
also particularly well secured for historical ciphers. On the other hand, there are now mod-
ern methods of key exchange, which we will get to know in the next chapter, that make
such an attempt hopeless from the outset.
So you might try to brutally check all possible keys one after the other. This is called a
brute-force attack. If necessary, the sequence can be selected according to probabilities
known from experience. Even with modern encryption methods, this method is always
useful if the key space is not large enough. In this case, networked computers may be able
to calculate all possibilities in a reasonable amount of time.
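For the shift cipher with its tiny key space of 26 keys, a brute-force attack fits into a few lines of Python. The following is only a sketch; the sample plaintext and the function names `shift_text` and `brute_force` are chosen for illustration.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def shift_text(text, i):
    """Shift every letter by i positions (addition modulo 26)."""
    return "".join(ALPHABET[(ALPHABET.index(ch) + i) % 26] for ch in text)


def brute_force(ciphertext):
    """Return the candidate plaintext for every possible key i = 0..25."""
    return [shift_text(ciphertext, -i) for i in range(26)]


ciphertext = shift_text("ATTACKATDAWN", 3)      # Caesar cipher, key i = 3
candidates = brute_force(ciphertext)
for key, candidate in enumerate(candidates):
    print(key, candidate)
```

The attacker then only has to pick out the one candidate that is meaningful text. For modern ciphers the same loop would have to run over the entire key space, which is why its size matters so much.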
The most elementary variant is that an attacker listens to the entire ciphertext or at least
large parts of it and tries to use it to find the key or at least to deduce the corresponding
plaintext. This attack is called a ciphertext-only attack. The statistical cryptanalysis of
shift ciphers and more general monoalphabetic ciphers as well as the pattern recognition
of the Kasiski attack are typical examples.
The known plaintext attack has a greater chance of determining the key. The attacker
listens to the ciphertext, but also knows parts of the plaintext or at least assumes to know
them. For example, Enigma could be cracked with the knowledge that event messages
always started with the place and date and that the daily weather report was routinely sent.
Another example is the attack on the old encryption method WEP of the WLAN (Sect.
3.2), which exploits the fact that the encrypted header data of the WLAN protocol are
predictable.
The chosen plaintext attack is even more powerful. Here, the attacker is able to have
plaintext passages of his choice encrypted to a certain extent. To do this, the attacker must,
for example, be able to foist the messages to be encrypted on the victim in such a way that
the victim is not aware of this. Or he has at least temporary access to the encryption device,
for example through a break-in or theft, without the current key being directly readable
(e.g. on a smartphone). In this way, the plaintext can be varied and the resulting changes
in the ciphertext can be analyzed.
Finally, there is also the chosen ciphertext attack, whereby the attacker temporarily
even has the possibility of having ciphertexts of his choice decrypted to a certain extent.
This is the case, for example, if the encryption device is also used for decryption and the
attacker has at least temporary access to the device, for example by stealing it. This attack
is often fatal for the security of the procedure.
Cryptanalysis is not only carried out by attackers with the aim of cracking a cipher
procedure and thus eavesdropping on the secret information, but also by cryptographers in
order to prove or quantify the security of the procedure.
Plaintext   11000101
Key         01101100
Ciphertext  10101001…
But what does a random sequence of bits mean? The decisive factor is how the sequence
was generated, namely each bit as an independent fair coin toss, with probability 1/2 for
both the 0 and the 1. The Vernam cipher is made particularly secure if such a bit sequence
is used only once for encryption. This is called a one-time pad.
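The bitwise addition ⊕ of the example above can be reproduced with a minimal Python sketch. The helper names `xor_bits` and `one_time_pad` are assumptions; `secrets.randbits` draws bits from the random source of the operating system, which here merely stands in for the ideal independent fair coin tosses.

```python
import secrets


def xor_bits(a, b):
    """Add two bit lists of equal length position by position (XOR)."""
    if len(a) != len(b):
        raise ValueError("the key must be exactly as long as the text")
    return [x ^ y for x, y in zip(a, b)]


def one_time_pad(n):
    """Draw n fresh key bits; such a key must never be reused."""
    return [secrets.randbits(1) for _ in range(n)]


plaintext = [1, 1, 0, 0, 0, 1, 0, 1]
key = [0, 1, 1, 0, 1, 1, 0, 0]
ciphertext = xor_bits(plaintext, key)     # encryption
recovered = xor_bits(ciphertext, key)     # decryption is the same operation

fresh_key = one_time_pad(len(plaintext))
print(ciphertext, recovered)
```

Since adding the same bit twice gives 0, encryption and decryption coincide, so the Vernam cipher is a symmetric cipher in the strictest sense.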
So much for the theory of the Vernam cipher. But now the cat bites its own tail, because to
encrypt plaintext, you need a randomly generated bit string of the same length as a key,
which you have to communicate secretly to the recipient beforehand so that he can decrypt
it. How can this be done in a practicable way? In order to get out of this dilemma, a method
has been devised in which a much shorter key can be exchanged and in which the receiver
is nevertheless able to generate the bit string used by the sender himself. This method is
based on digital switching elements, so-called linear feedback shift registers, the func-
tion of which we will now illustrate using the example shown in Fig. 2.1 [Man, Beu].
A shift register has m cells, each with one bit z1,…, zm as cell content. The sender
secretly informs the receiver of the initialization of the cell contents z1 = i1 to zm = im. In
our example, i1…i8 = 01100101. In addition, the sender secretly informs the receiver
whether to interconnect after a cell (vj = 1) or not (vj = 0). In our example, this means
v1…v8 = 01010011. With each clock pulse of the switching element, the contents of the
cells are shifted one position to the right and the last bit is output on the right. Via the so-
called feedback equation z1 ∙ v1 + … + zm ∙ vm, the first cell is simultaneously filled again.
In our example, this means that for the first clock pulse z1 ∙ v1 + … + z8 ∙ v8 = i1 ∙
v1 + … + i8 ∙ v8 = 0 ∙ 0 + 1 ∙ 1 + 1 ∙ 0 + 0 ∙ 1 + 0 ∙ 0 + 1 ∙ 0 + 0 ∙ 1 + 1 ∙ 1 = 1 + 1 = 0,
where the addition and multiplication of bits is meant here (Sect. 1.2). So 1 is output on
the right and the first cell is filled with 0. In the second clock pulse, the new cell contents
z1 to zm are used and the procedure is repeated. The whole thing can therefore be continued
as often as desired. In this way, the sender and receiver can exchange the comparatively
short key consisting of the 2 ∙ m bits i1…im v1…vm for a shift register of length m and thus
generate the same, initially rather random-looking bit sequence. This can then be added
bitwise ⊕ to a plaintext of arbitrary length as a shift register cipher, as in the Vernam cipher.
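The clocking of the shift register from our example can be sketched in Python as follows. The initialization 01100101 and the interconnection 01010011 are taken from the text; the function name `lfsr` and the list representation of the cells are assumptions made for illustration.

```python
def lfsr(init, taps, n):
    """Clock a linear feedback shift register n times and return the output bits.

    init = i1..im (cell contents), taps = v1..vm (interconnection), as 0/1 lists.
    """
    state = list(init)
    out = []
    for _ in range(n):
        feedback = 0
        for z, v in zip(state, taps):      # feedback equation z1*v1 + ... + zm*vm
            feedback ^= z & v
        out.append(state[-1])              # the last cell is output on the right
        state = [feedback] + state[:-1]    # shift right and refill the first cell
    return out


INIT = [0, 1, 1, 0, 0, 1, 0, 1]
TAPS = [0, 1, 0, 1, 0, 0, 1, 1]
keystream = lfsr(INIT, TAPS, 16)
print(keystream)

# Shift register cipher: add the key stream bitwise to the plaintext.
plaintext = [1, 1, 0, 0, 0, 1, 0, 1]
ciphertext = [p ^ k for p, k in zip(plaintext, keystream)]
```

The first eight output bits are simply the initialization read from right to left; only afterwards do the bits computed by the feedback equation leave the register, the first of them being the 0 calculated above.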
But sequences generated by linear feedback shift registers are not truly random sequences,
because the next bit is always determined by the current contents of the m cells. The
sequence is even periodic with a period length of at most $2^m - 1$. This follows simply from
the fact that the m cells can take at most $2^m$ different values z1 to zm, and if z1 = … = zm = 0,
only the sequence consisting of all 0 is generated.
Fig. 2.1 Example of a linear feedback shift register with initialization (blue) and interconnec-
tion (red)
Thus with linear feedback shift registers only pseudo-random sequences can be generated.
It is true that there are criteria for a shift register to have maximum period $2^m - 1$,
and in practice it is advisable to use only such registers in shift register ciphers in the first place.
However, if an attacker can obtain only 2 ∙ m consecutive bits of the pseudo-random
sequence generated by the shift register, he knows the entire formation law and can deci-
pher at his leisure from then on. All he has to do is solve a system of equations with m
equations and m unknowns. We consider this with a small example and imagine that an
attacker has identified a sequence of 2 ∙ m = 6 bits as part of a pseudo-random sequence
generated by a shift register:
These immediately provide the attacker with the shift register he is looking for with cur-
rent initialization, as shown in Fig. 2.2.
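The attack can be sketched in Python for a register of length m = 3. The six intercepted bits 100101 used here are a hypothetical example chosen for illustration. Each output bit is the sum of the m preceding output bits weighted with v1 to vm, which gives m linear equations for the unknown interconnection; the helper names `solve_gf2`, `recover_taps` and `lfsr` are likewise assumptions.

```python
def solve_gf2(rows, rhs):
    """Solve a square linear system over bits by Gaussian elimination."""
    n = len(rows)
    aug = [row[:] + [b] for row, b in zip(rows, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            raise ValueError("system is singular, use other intercepted bits")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col and aug[r][col]:
                aug[r] = [x ^ y for x, y in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


def recover_taps(bits, m):
    """Recover v1..vm from 2*m consecutive output bits s1..s2m.

    Equation number t reads: s(t+m) = v1*s(t+m-1) + ... + vm*s(t).
    """
    rows = [[bits[t + m - 1 - j] for j in range(m)] for t in range(m)]
    rhs = [bits[t + m] for t in range(m)]
    return solve_gf2(rows, rhs)


def lfsr(init, taps, n):
    """Same register model as in the text: output on the right, feedback on the left."""
    state, out = list(init), []
    for _ in range(n):
        feedback = 0
        for z, v in zip(state, taps):
            feedback ^= z & v
        out.append(state[-1])
        state = [feedback] + state[:-1]
    return out


intercepted = [1, 0, 0, 1, 0, 1]           # hypothetical 2*m = 6 bits
taps = recover_taps(intercepted, 3)
init = intercepted[:3][::-1]               # cell contents when the first bit left
print(taps, init, lfsr(init, taps, 10))
```

With the recovered interconnection and the cell contents read off from the first m bits, the attacker can reproduce the intercepted bits and predict all further bits of the sequence.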
Shift register ciphers that use only a single linear feedback shift register are therefore
completely unsuitable for cryptographic practice. In order to be able to use their technical
advantages nevertheless, several shift registers are sometimes concatenated (Sect. 2.3).
GSM (Global System for Mobile Communications) is the standard for digital mobile
communications networks of the so-called second generation (2G), the successor to the
analogue networks of the first generation. It was primarily designed for telephony and short
messages (SMS, Short Message Service). GSM was introduced in Germany in 1992 and is still used
today by many mobile phone customers worldwide.
For data encryption, GSM uses the algorithms A8 for key generation and A5 for the actual
encryption of the telephone call or SMS. A5 is a procedure which was initially designed in
1987 as A5/1 and in 1989 additionally in a weakened version for certain export regions as
A5/2. Originally, an attempt was made to keep the algorithm secret, contrary to the Kerckhoffs
principle, but this failed. In the meantime, however, A5/1 is open and standardized. The
algorithm A8 is defined by the respective network operator and kept secret as far as possible.
GSM data encryption, which is visualized in Fig. 2.3, uses personalized chip cards
(ICC Integrated Circuit Card). These so-called SIM cards (Subscriber Identification
Module) are issued by the network operators to their customers. Each subscriber is thus
assigned a 128-bit subscriber key ki (Subscriber Authentication Key), which is stored on
the SIM card on the one hand and in the mobile communications server on the other. The
mobile network also sends a 128 bit long random number RAND when the subscriber logs
on. The A8 algorithm uses RAND and the subscriber key ki to generate a 64-bit key kc on
the subscriber’s SIM card and in the mobile communications server. The A5 algorithm
together with the key kc ultimately performs the encryption and decryption of the calls and
SMS messages. (For subscriber authentication, see Sect. 4.9.)
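The data flow of the key generation can be illustrated with a few lines of Python. Since A8 is defined by the network operator and kept secret, the function `a8_stand_in` below is purely a placeholder (a truncated SHA-256 hash) and has nothing to do with any real A8; only the bit lengths of ki, RAND and kc are taken from the text.

```python
import hashlib
import secrets


def a8_stand_in(ki: bytes, rand: bytes) -> bytes:
    """Placeholder for the secret A8: first 64 bits of SHA-256(ki || RAND)."""
    return hashlib.sha256(ki + rand).digest()[:8]


ki = secrets.token_bytes(16)      # 128-bit subscriber key, on SIM card and server
rand = secrets.token_bytes(16)    # 128-bit random number sent at log-on

kc_on_sim = a8_stand_in(ki, rand)       # computed on the subscriber's SIM card
kc_on_server = a8_stand_in(ki, rand)    # computed in the mobile communications server
print(kc_on_sim.hex())
```

The point of the sketch is merely that both sides arrive at the same 64-bit key kc without ki ever being transmitted, while an eavesdropper only sees RAND.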
The A5 algorithm of version A5/1 is a shift register cipher with three linear feedback shift
registers connected in parallel. For encryption, the outputs of all three shift registers are
added in binary and added to the plaintext. Figure 2.4 shows the structure.
In contrast to the shift register ciphers described so far, however, the lengths and inter-
connections are publicly known here, i.e. they are part of the algorithm. The same applies
to the initialization of the cells. At the beginning, they all contain the value 0. Only now
does the 64-bit cipher key kc come into play. It is successively loaded into the first cell of
each of the three shift registers by bitwise addition ⊕. In this process, the shift registers are
clocked 64 times, and in each case another bit of the cipher key kc is added to the contents
of the first cells. After that, the registers are clocked several times irregularly, depending on
the contents of the cells 8, 10 and 10 highlighted in yellow. The output bits expire unused,
and only then does the actual encryption begin by binary addition to the plaintext [Sto].
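The following Python sketch mimics this procedure in a simplified form. The register lengths 19, 22 and 23, the feedback cells and the majority rule for the irregular clocking are taken from publicly available descriptions of A5/1, not from the text above; cells are counted from 0, the additional loading of a frame number is omitted, and the number of 100 discarded bits is likewise an assumption. The sketch is therefore not interoperable with real GSM equipment.

```python
# (register length, feedback cells, clocking cell), cells counted from 0
REGISTERS = [
    (19, (13, 16, 17, 18), 8),
    (22, (20, 21), 10),
    (23, (7, 20, 21, 22), 10),
]


def shift(state, taps, extra_bit=0):
    """Clock one register; extra_bit is added to the bit entering the first cell."""
    feedback = extra_bit
    for t in taps:
        feedback ^= state[t]
    return [feedback] + state[:-1]


def majority(a, b, c):
    return (a & b) | (a & c) | (b & c)


def keystream(key_bits, n, warmup=100):
    states = [[0] * length for length, _, _ in REGISTERS]      # all cells start at 0
    for bit in key_bits:                                       # 64 regular clock pulses
        states = [shift(s, taps, bit) for s, (_, taps, _) in zip(states, REGISTERS)]
    out = []
    for step in range(warmup + n):                             # irregular clocking
        m = majority(*(s[clk] for s, (_, _, clk) in zip(states, REGISTERS)))
        states = [shift(s, taps) if s[clk] == m else s
                  for s, (_, taps, clk) in zip(states, REGISTERS)]
        if step >= warmup:                                     # earlier bits expire unused
            out.append(states[0][-1] ^ states[1][-1] ^ states[2][-1])
    return out


key = [(0x0123456789ABCDEF >> i) & 1 for i in range(64)]       # arbitrary demo key kc
ks = keystream(key, 16)
plaintext = [1, 1, 0, 0, 0, 1, 0, 1] * 2
ciphertext = [p ^ k for p, k in zip(plaintext, ks)]
print(ks)
```

In each irregular step, only those registers move whose clocking cell agrees with the majority of the three clocking cells, so at least two registers are clocked every time.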
The method A5/1 and especially the similar but weaker version A5/2 are considered
insecure, the encryption cannot provide significant security against serious attacks [Sto].
But at least it prevents simple eavesdropping. The successor versions A5/3 and A5/4,
which are considered secure, differ fundamentally from A5/1 (Sect. 2.7).
Stream ciphers are encryption methods in which the plaintext characters are encrypted
one after the other, with the encryption varying (pseudo-)randomly in each step. If, on the other
hand, the plaintext is divided into blocks of fixed length, which are all encrypted separately,
and the encryption method is the same for each block, one speaks of block ciphers. Thus,
in order to design secure ciphers, one either invests in the costly generation of the key or in
complex encryption methods on blocks of suitably large length, where the key can be cho-
sen more simply. An advantage of stream ciphers compared to block ciphers is that one can
decrypt character by character and does not always have to wait for a whole ciphertext block.
In this sense, the Vernam cipher and the shift register cipher are stream ciphers. The
shift and affine ciphers are block ciphers of block length 1. The Vigenère cipher is also a
block cipher, with the block length determined by its period d. Most of today’s important
ciphers are or are at least based (Sect. 2.6) on block ciphers.
But what are the quality criteria that should be applied to a block cipher? Claude Shannon
(1916–2001) formulated two rather intuitive criteria as early as 1949, but they still hold
true today.
We now come to the prototype of modern block ciphers par excellence, the Feistel cipher,
which goes back to Horst Feistel (1915–1990). In 1973, under the project name
LUCIFER, he developed an encryption method that can be regarded as the forerunner of
the DES (Data Encryption Standard) (Sect. 2.5).
However, the Feistel cipher is rather a construction principle for a block cipher, which
is composed of an arbitrary number of so-called rounds. The plaintext m = m1…mn is
taken as a binary string and divided into blocks m1 to mn of even length 2 ∙ t, where t is
arbitrary. It may be necessary to suitably pad mn in the process. Each of these blocks is
now ciphered separately. So we consider a fixed such block L0 R0 with binary strings L0
and R0 of length t. Let further F(∙, λ) be an arbitrary function that transforms a binary
string of length t into a binary string of length t and that has as parameter a binary string λ
of arbitrary length. Furthermore, let ki be a bit string of the same length as the parameter
λ, the so-called round key for the i-th round. We now want to describe the so-called round
function of a Feistel cipher, and this is done recursively. Let $L_{i-1} R_{i-1}$ be the binary string
of length 2 ∙ t that has arisen after the (i − 1)-th round. Then the round function that
computes the next binary string $L_i R_i$ in the i-th round of a Feistel cipher is as follows:
$$L_i = R_{i-1}, \qquad R_i = L_{i-1} \oplus F(R_{i-1}, k_i)$$
It is best to look at the first two steps graphically, as shown in Fig. 2.5.
In principle, therefore, any number of rounds is possible with a Feistel cipher. At the
same time, however, the number of round keys ki and thus the size of the total key k1, k2,
k3… increases considerably. Therefore, a Feistel cipher always includes the basic idea of
deriving the individual round keys ki from a relatively short “base key”. Here
are the essential advantages of a Feistel cipher.
• Ciphering and deciphering is done with exactly the same algorithm, where you only
have to apply the round keys in reverse order. This has the advantage for the computer
implementation that the same program modules are sufficient for both. Figure 2.6
shows the last two rounds of decryption.
• However, this also means that, unlike a cipher, the function F(∙, λ) need not be one-to-
one, i.e., one has significantly more degrees of freedom in a concrete realization of the
Feistel cipher.
• Finally, the Feistel cipher effectively operates on only half the block length t and is
therefore much faster to implement.
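These properties can be tried out with a minimal Python sketch of a Feistel cipher on byte strings. The round function `F` below is a hypothetical, deliberately non-invertible example (a truncated SHA-256 hash), and the round keys are arbitrary; in addition, the sketch swaps the two halves after the last round, an assumption that makes decryption literally the same function with the round keys in reverse order.

```python
import hashlib


def F(right: bytes, round_key: bytes) -> bytes:
    """Hypothetical round function; it need not be one-to-one."""
    return hashlib.sha256(round_key + right).digest()[:len(right)]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def feistel(block: bytes, round_keys) -> bytes:
    """Run one block of even length (at most 64 bytes) through all rounds."""
    t = len(block) // 2
    left, right = block[:t], block[t:]
    for k in round_keys:
        # L_i = R_(i-1),  R_i = L_(i-1) XOR F(R_(i-1), k_i)
        left, right = right, xor(left, F(right, k))
    return right + left                      # assumed final swap of the halves


round_keys = [b"k1", b"k2", b"k3", b"k4"]    # arbitrary demo round keys
block = b"ATTACKATDAWN!!!!"                  # one block of 2 * t = 16 bytes
cipher = feistel(block, round_keys)
plain = feistel(cipher, round_keys[::-1])    # same algorithm, keys reversed
print(cipher.hex(), plain)
```

Although `F` cannot be inverted, decryption works, because in each round the value F(R, k) is simply added a second time and thereby cancels out.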
In 1973 and 1974, the US standardization authority NBS (National Bureau of Standards,
today NIST, National Institute of Standards and Technology) issued two calls for proposals
for a standardized cryptographic algorithm. After none of the candidates appeared to be
suitable in the first tender, LUCIFER
from IBM remained as the only acceptable proposal in the second tender.
In the course of the assessments by the NSA (National Security Agency), numerous
modifications and adjustments were made, as will certainly become clear from the
following description of the procedure. In 1977, the DES (Data Encryption Standard) came
into force in its final form. DES was the first cryptographic algorithm ever to be standardized,
with all details published. Subsequently, the standard was reviewed and extended every 5 years.
The DES is a block cipher with 64-digit binary input and output blocks. Its key is also
formally 64 bits long. However, it effectively consists of eight strings with seven bits each,
to which one bit each is appended for parity checking for error detection, i.e. effectively
56 bits in total. More precisely, DES is a Feistel cipher with a total of 16 rounds. Figure 2.7
gives a first overview of a DES round. The 16 rounds are preceded and followed by fixed,
mutually inverse input and output permutations on 64 bits, which do not play any role for
the security of DES [Buc, Hau1]. As with all Feistel ciphers, DES is decrypted using the
same algorithm with the round keys in reverse order.
We now proceed to the round function of the i-th round and thus ultimately to the choice
of the mapping F(∙, λ) in DES. Here, we first replace the parameter λ by a 48-bit round key
ki, which in turn is derived from the 56 bits of the effective DES key. We postpone for now
how exactly this is done and first describe, in the four steps
• Expansion
• Key addition
• S-Boxes
• Permutation,
how F(∙, ki) operates on the right 32-bit block R of a 64-bit block LR.
First the 32-bit string R is enlarged to 48 bits with an expansion ε. To do this, the 32-bit
string a1 a2…a32 is divided into eight sub-blocks of 4 bits each and each of these sub-blocks
is expanded by the edge bit of the predecessor and successor sub-block to 6 bits.
…        a9 a10 a11 a12     |      a13 a14 a15 a16     |      a17 a18 a19 a20     …
…  a8  a9 a10 a11 a12  a13  |  a12 a13 a14 a15 a16 a17 |  a16 a17 a18 a19 a20 a21 …
The last bit of R is used at the beginning of the first block and the first bit of R is used
at the end of the last block.
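The expansion ε can be written down directly from this description. In the following Python sketch the function name `expand` and the representation of R as a list of 32 bits are assumptions made for illustration.

```python
def expand(r_bits):
    """Expand 32 bits to 48 bits: each 4-bit sub-block gets both neighbouring edge bits."""
    if len(r_bits) != 32:
        raise ValueError("R must consist of 32 bits")
    out = []
    for block in range(8):
        start = 4 * block
        out.append(r_bits[(start - 1) % 32])     # edge bit of the predecessor
        out.extend(r_bits[start:start + 4])      # the sub-block itself
        out.append(r_bits[(start + 4) % 32])     # edge bit of the successor
    return out


positions = list(range(1, 33))                   # use the indices 1..32 as labels
expanded = expand(positions)
for block in range(8):
    print(expanded[6 * block:6 * block + 6])
```

Feeding in the indices 1 to 32 instead of bits makes the wrap-around visible: the first 6-bit block starts with a32 and the last one ends with a1.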
To this 48-bit string we add, position by position ⊕, the 48-bit round key ki. After key
addition, we call the bits bj and bj′, respectively. Here, the bj refer to the aj within the
original sub-blocks, and the bj′ refer to the boundary bits aj borrowed from the neighboring
sub-blocks. Note that because of the key addition, bj and bj′ may differ.
The so-called S-boxes (substitution boxes) are the core of the algorithm. For each of the
eight sub-blocks, each consisting of six bits, there is a fixed S-box, namely a matrix with four
rows and 16 columns. The rows are indexed by bit strings of length 2, the columns by bit
strings of length 4 in ascending binary order. Each row of the matrix also contains all bit
strings of length 4, but in rather jumbled order. For example, Table 2.1 shows the third S-box,
matching the third 6-bit subblock b8′b9b10b11b12b13′. To illustrate how S-boxes work, consider
the specific example b8′b9b10b11b12b13′ = 101011. The two outer bits b8′b13′ = 11 denote the row
of the matrix, the four inner bits b9 b10 b11 b12 = 0101 decide the column. Therefore, the S-box
returns the bit string 1001 and thus decides that the string b9b10b11b12 = 0101 is substituted by
the string c9c10c11c12 = 1001. The border bits b8′b13′ are discarded again.
The output of the eight S-boxes finally results in a bit string c1 c2…c32 of length 32. For
the sake of completeness, all eight S-boxes of DES are listed in Table 2.2.
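The lookup in an S-box can be illustrated with a short Python sketch for the third S-box. The entries are those of Table 2.1, written as decimal numbers; the function name `sbox_lookup` and the representation of the bits as a string are assumptions.

```python
# Third S-box of DES (Table 2.1), entries as decimal numbers 0..15
S3 = [
    [10, 0, 9, 14, 6, 3, 15, 5, 1, 13, 12, 7, 11, 4, 2, 8],
    [13, 7, 0, 9, 3, 4, 6, 10, 2, 8, 5, 14, 12, 11, 15, 1],
    [13, 6, 4, 9, 8, 15, 3, 0, 11, 1, 2, 12, 5, 10, 14, 7],
    [1, 10, 13, 0, 6, 9, 8, 7, 4, 15, 14, 3, 11, 5, 2, 12],
]


def sbox_lookup(box, six_bits: str) -> str:
    """Outer two bits select the row, inner four bits the column."""
    row = int(six_bits[0] + six_bits[5], 2)
    col = int(six_bits[1:5], 2)
    return format(box[row][col], "04b")


print(sbox_lookup(S3, "101011"))
```

For the input 101011 the sketch selects row 11 and column 0101 and thus reproduces the substitution by 1001 from the example above.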
Table 2.1 Third S-box S3 of DES
S3 0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111
00 1010 0000 1001 1110 0110 0011 1111 0101 0001 1101 1100 0111 1011 0100 0010 1000
01 1101 0111 0000 1001 0011 0100 0110 1010 0010 1000 0101 1110 1100 1011 1111 0001
10 1101 0110 0100 1001 1000 1111 0011 0000 1011 0001 0010 1100 0101 1010 1110 0111
11 0001 1010 1101 0000 0110 1001 1000 0111 0100 1111 1110 0011 1011 0101 0010 1100

Finally, the output string c1 c2…c32 of the eight S-boxes of bit length 32 is subjected to
the permutation π. It is permuted in the order according to Table 2.3, i.e. the bit from position
16 is put in position 1, the bit from position 7 is put in position 2, and so on. This serves
the equal distribution of the bits from round i to the S-boxes in round i + 1.
We now return to the selection and determination of the round key ki. For this purpose, the
64-bit total key to be exchanged secretly is first written into the scheme of Table 2.4. The
right column contains the parity check bits, which are now omitted.
The remaining bits are divided into two registers C (framed in bold on the left) and D,
in the order determined according to Table 2.5. This is called PC-1 (Permuted Choice 1).
In each round, 24 bits are selected from each of the two registers, always the same ones
and always in the same order. Table 2.6 shows the selection and the sequence. Here, the
numbering refers to the position numbers shown in italics in Table 2.5. This is called PC-2
(Permuted Choice 2).
However, in order to obtain different round keys ki for each round, the values in regis-
ters C and D are cyclically shifted to the left after each round according to Table 2.5,
namely by one position after rounds 1, 2, 9 and 16 and by two positions after the remaining
rounds. Therefore, during the 16 rounds, a total of 28 shift operations are performed, so
that the registers are in their initial state again afterwards. Therefore, the next 64-bit block
can be enciphered without reloading the key.
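That the shift amounts add up to exactly one full rotation of the 28-bit registers can be checked with a few lines of Python. The names `SHIFTS` and `rotate_left` are assumptions; the register is filled with the position numbers 1 to 28 instead of key bits so that the rotation remains visible.

```python
# One position after rounds 1, 2, 9 and 16, two positions after all other rounds
SHIFTS = [1 if rnd in (1, 2, 9, 16) else 2 for rnd in range(1, 17)]


def rotate_left(register, n):
    """Cyclic left shift of a register by n positions."""
    return register[n:] + register[:n]


initial = list(range(1, 29))         # position numbers instead of key bits
register = initial
for rnd, shift in enumerate(SHIFTS, start=1):
    register = rotate_left(register, shift)
    print(rnd, shift, register[:4])

print(sum(SHIFTS), register == initial)
```

Because 4 ∙ 1 + 12 ∙ 2 = 28 equals the register length, the register is back in its initial state after the 16th round.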
Table 2.5 PC-1: key bits loaded into the registers C and D
Position   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
C         57 49 41 33 25 17  9  1 58 50 42 34 26 18 10  2 59 51 43 35 27 19 11  3 60 52 44 36
Position  29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56
D         63 55 47 39 31 23 15  7 62 54 46 38 30 22 14  6 61 53 45 37 29 21 13  5 28 20 12  4

The many operations and permutations per round do not serve the actual encryption,
because they are all publicly known and realized in DES programs. Rather, their purpose is
to mix the secret key into the plaintext as thoroughly as possible. For example, it can
be shown that after only five rounds of DES, every bit depends on every plaintext bit and
every key bit, i.e., DES produces a high degree of diffusion. This is also one reason why,
with the exception of the brute-force attack, no other truly practical attack on
DES is known to date, not even differential and linear cryptanalysis, which emerged in the early
1990s and are generally applicable to iterative block ciphers and, in particular, Feistel
ciphers. We will now explain these two important attack strategies in more detail using
DES. This will get a bit tricky, but is not absolutely necessary for further understanding.
Therefore, both topics can also just be “skimmed over” or even skipped.
Differential cryptanalysis is a special chosen plaintext attack (Sect. 2.1). The attacker
encrypts two plaintext blocks m and m′ with a self-selected difference (= sum) m ⊕ m′ and
learns at least the difference c ⊕ c′ of the ciphertext blocks c and c′. When executed mul-
tiple times, he or she can thus examine the effects of differences in plaintext blocks on
differences in the associated ciphertext blocks. This allows the probability of keys, and
hence the most likely key, to be determined.
We want to explain the procedure using the example of DES, but for the sake of clarity
only for one round, and only for the part that operates on the right block with a
round key k. Let two 32-bit strings R and R′ as well as the difference F(R, k) ⊕ F(R′, k) be
known. On the way through the DES round, the only unknown is the key k, which has to
be determined or narrowed down. Let B and B′ be the input strings to the S-boxes belong-
ing to R and R′, respectively, and C and C′ be the corresponding output strings. Then
B = ε(R) ⊕ k and B′ = ε(R′) ⊕ k with expansion mapping ε, and consequently B ⊕ B′
= (ε(R) ⊕ k) ⊕ (ε(R′) ⊕ k) = ε(R) ⊕ ε(R′) = ε(R ⊕ R′). In particular, therefore,
although B and B′ are not known individually, at least their difference is known.
Furthermore, C ⊕ C′ = π−1(F(R, k)) ⊕ π−1(F(R′, k)) = π−1(F(R, k) ⊕ F(R′, k)), such that
with the help of the inverse permutation π−1 of π, the difference C ⊕ C′ is also known. If it
is now possible to restrict B or B′, then k is also determined accordingly because of
k = B ⊕ ε(R) = B′ ⊕ ε(R′).
We use the abbreviations E = ε(R) and E′ = ε(R′) and explain the procedure on the
basis of the first S-box S1. Here, the index 1 denotes the part of the respective bit
strings that refers to S1. Thus, with this notation, B1 ⊕ B1′ and C1 ⊕ C1′ are also
known in particular, and k1 = B1 ⊕ E1 holds. As a concrete example [Hau1], now let
R = 00101 ∗ … ∗ 1, R′ = 10001 ∗ … ∗ 1 and C1 ⊕ C1′ = 0110. Then E1 = 100101, E1′ =
110001 and consequently B1 ⊕ B1′ = E1 ⊕ E1′ = 010100. Now one determines all 6-bit
pairs X and X′ with X ⊕ X′ = 010100 in such a way that these result in 4-bit pairs Y and
Y′ with difference Y ⊕ Y′ = 0110 when passing through the S-box S1. After a little calcula-
tion one obtains for B1 the four possibilities 100010, 110110, 101010, 111110 and conse-
quently for k1 = B1 ⊕ E1 the four possibilities 000111, 010011, 001111, 011011. If one
repeats the procedure for other R and R′, one can thus further narrow down k1. Accordingly,
one proceeds to determine the total round key k = k1…k8 with the other S-boxes.
To shorten the calculations, one can of course keep corresponding tables for all eight
S-boxes and all possible differences. For our example of the S-box S1 with input
difference X ⊕ X′ = 010100, Table 2.7 lists, for every output difference Y ⊕ Y′, the
number of inputs X that produce it.
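The "little calculation" of the example is easily delegated to a computer. The following Python sketch uses the standard entries of the first S-box S1 of DES, written as decimal numbers; the function names and the representation of 6-bit strings as integers are assumptions made for illustration.

```python
from collections import Counter

# First S-box of DES, entries as decimal numbers 0..15
S1 = [
    [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7],
    [0, 15, 7, 4, 14, 2, 13, 1, 10, 6, 12, 11, 9, 5, 3, 8],
    [4, 1, 14, 8, 13, 6, 2, 11, 15, 12, 9, 7, 3, 10, 5, 0],
    [15, 12, 8, 2, 4, 9, 1, 7, 5, 11, 3, 14, 10, 0, 6, 13],
]


def sbox(box, x):
    """Look up a 6-bit input: outer bits give the row, inner bits the column."""
    row = (((x >> 5) & 1) << 1) | (x & 1)
    col = (x >> 1) & 0b1111
    return box[row][col]


def inputs_with_difference(box, in_diff, out_diff):
    """All X such that X and X xor in_diff lead to outputs with difference out_diff."""
    return [x for x in range(64) if (sbox(box, x) ^ sbox(box, x ^ in_diff)) == out_diff]


E1, E1_PRIME = 0b100101, 0b110001
in_diff = E1 ^ E1_PRIME                                   # 010100
b1_candidates = inputs_with_difference(S1, in_diff, 0b0110)
k1_candidates = [b ^ E1 for b in b1_candidates]           # k1 = B1 xor E1
print([format(b, "06b") for b in b1_candidates])
print([format(k, "06b") for k in k1_candidates])

# How often does each output difference occur for this input difference?
distribution = Counter(sbox(S1, x) ^ sbox(S1, x ^ in_diff) for x in range(64))
print(sorted(distribution.items()))
```

The sketch yields exactly the four possibilities for B1 and k1 stated above, and the counter shows how unevenly the output differences are distributed for the input difference 010100.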
With more than one DES round, however, differential cryptanalysis becomes more and
more complex; with 16 rounds it is not significantly more effective than a brute-force
attack. Although not officially published until 1991 by Eli Biham (born 1960) and Adi
Shamir (born 1952), the DES developers nevertheless already knew the underly-
ing method.
Table 2.7 Number of output differences of S-Box S1 with input difference 010100

We have tacitly used the equations ε(R ⊕ R′) = ε(R) ⊕ ε(R′) and
π(C ⊕ C′) = π(C) ⊕ π(C′) for the expansion ε and the permutation π in differential
cryptanalysis. Indeed, both ε and π are so-called linear transformations. However, the
S-boxes are highly nonlinear transformations, as can be seen, for example, from Table 2.7.
If the S-box S1 were linear, all 64 input pairs with input difference 010100 would lead to
the same output difference. Therefore, the entire DES algorithm is also nonlinear. Linear
cryptanalysis is a known-plaintext attack (Sect. 2.1) that attempts to “linearly approxi-
mate” a block cipher as optimally as possible in order to determine the key from a suffi-
cient number of plaintext/ciphertext pairs, at least partially and with a certain probability.
The method was developed by Mitsuru Matsui (born 1961) in 1993.
We want to explain the basic idea again by the example of the DES, and again only by
one round and quite concretely by the S-box S5. Let B5 = b1…b6 be an input string and
C5 = c1…c4 the corresponding output string. For the linear approximation of the S-box
S5 one searches for bit strings u1…u6 and v1…v4 in such a way that one of the two following
equations is fulfilled for far more than half of all $2^6 = 64$ possible input strings B5 with
the corresponding output strings C5 and thus has a high probability:
$$u_1 \cdot b_1 + \dots + u_6 \cdot b_6 = v_1 \cdot c_1 + \dots + v_4 \cdot c_4$$
$$u_1 \cdot b_1 + \dots + u_6 \cdot b_6 = v_1 \cdot c_1 + \dots + v_4 \cdot c_4 + 1$$
To do this, one sets up a table that contains, for all values of u1…u6 and v1…v4, the
indication of how often the first of the two equations is satisfied for input string B5 with
output string C5.
For example, for the values u1…u6 = 010000 and v1…v4 = 1111, when passing through
the S-box S5 one finds [Fra] that the first equation u1 ∙ b1 + … + u6 ∙ b6 = b2 = c1 + … + c4 = v1
∙ c1 + … + v4 ∙ c4 holds only in 12 cases out of a total of 64. So, conversely, the second
equation b2 = c1 + … + c4 + 1 holds in 52 out of 64 cases and therefore has a probability
of about 0.81. Considering now again the expansion ε(R5) = E5 = e1…e6 and the key portion
K5 = k1…k6 belonging to the S-box S5: in a known-plaintext attack with n plaintext/ciphertext
pairs, the respective bit strings E5 and C5 are known, and B5 = E5 ⊕ K5 holds with
previously unknown K5. Therefore, our linear approximation yields e2 + k2 = c1 + … + c4 + 1,
and k2 is with high probability the bit 0 or 1 for which the equation is correct for more than
n/2 of the plaintext/ciphertext pairs.
For u1…u6 = 111111 and v1…v4 = 0100, one can verify that u1 ∙ b1 + … + u6 ∙
b6 = b1 + … + b6 = c2 = v1 ∙ c1 + … + v4 ∙ c4 holds in 46 out of a total of 64 cases, i.e. with
probability 0.72. This yields e1 + … + e6 + k1 + … + k6 = c2, and one in turn determines
from this k1 + … + k6 as the bit for which the equation is correct for more than n/2 of the
plaintext/ciphertext pairs. The two results for the key bits K5 = k1…k6 can now be com-
bined with each other or further linear approximations of high probability can be deter-
mined and used.
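The two counts quoted above can be reproduced with a few lines of Python. The following is only a sketch: it assumes the standard DES table for S5 with the usual convention that b1 b6 select the row and b2…b5 the column, and the function names are chosen here for illustration.

```python
# DES S-box S5 (standard table): row = b1 b6, column = b2 b3 b4 b5.
S5 = [
    [2, 12, 4, 1, 7, 10, 11, 6, 8, 5, 3, 15, 13, 0, 14, 9],
    [14, 11, 2, 12, 4, 7, 13, 1, 5, 0, 15, 10, 3, 9, 8, 6],
    [4, 2, 1, 11, 10, 13, 7, 8, 15, 9, 12, 5, 6, 3, 0, 14],
    [11, 8, 12, 7, 1, 14, 2, 13, 6, 15, 0, 9, 10, 4, 5, 3],
]

def sbox5(b):
    """Map the 6-bit input b1...b6 (b1 = most significant bit) to c1...c4."""
    row = (((b >> 5) & 1) << 1) | (b & 1)
    col = (b >> 1) & 0b1111
    return S5[row][col]

def parity(x):
    """Sum of the bits of x modulo 2."""
    return bin(x).count("1") % 2

def count_first_equation(u, v):
    """Number of inputs B5 with u1*b1 + ... + u6*b6 = v1*c1 + ... + v4*c4."""
    return sum(parity(u & b) == parity(v & sbox5(b)) for b in range(64))

print(count_first_equation(0b010000, 0b1111))  # 12, so the second equation holds 52 times
print(count_first_equation(0b111111, 0b0100))  # 46
```

Looping `count_first_equation` over all 64 values of u and all 16 values of v yields the complete table described in the text.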
With more than one DES round, the creation of linear approximations becomes more
and more difficult and thus the linear cryptanalysis more and more complex. The method
was not known to the developers of DES, in contrast to differential cryptanalysis. Consequently,
the S-boxes are not fully optimized against linear cryptanalysis.
By design, DES would actually only have required an effective key length of 48 bits.
However, even when it was first standardized, this tended to be insecure because of the
possibility of brute-force attacks. Nevertheless, the relatively short key of effectively 56
bits proved to be DES’s greatest weakness. While a complete key search was hardly con-
ceivable at the time of DES’s introduction, it came within immediate reach in the 1990s.
With large networked computers, it was possible to get into the day and even hour range.
To increase the effective key length, an obvious approach is to encrypt multiple times with
DES. However, even double DES encryption provides hardly any more security, because a
meet-in-the-middle attack needs only about twice as many DES operations as a complete key
search for single DES, at the cost of a large amount of memory. To get around this problem,
the so-called Triple-DES has been introduced. Here, one uses three DES operations with two
independent keys k1 and k2: first and last, one applies the DES encryption EDES (∙, k1) with
the key k1, but in between the DES decryption DDES (∙, k2) with the key k2. Triple-DES then
has an effective key length of 112 bits.
$$\text{Plaintext } m \;\longrightarrow\; \text{Ciphertext } c = E_{DES}\bigl(D_{DES}\bigl(E_{DES}(m, k_1), k_2\bigr), k_1\bigr)$$
Triple DES was and is implemented in many practical applications for the encryption
of data requiring protection. However, it has been successively replaced by the more mod-
ern AES (Sect. 2.8), or at least AES is offered as an alternative.
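The encrypt-decrypt-encrypt structure of Triple-DES can be illustrated as follows. Since Python's standard library contains no DES, this sketch substitutes a small hypothetical Feistel cipher (`toy_encrypt`, `toy_decrypt`) for EDES and DDES; only the composition is the point. It also shows a well-known side effect of placing a decryption in the middle: with k1 = k2 the construction collapses to a single encryption.

```python
import hashlib

def _f(half: int, key: bytes, rnd: int) -> int:
    """Round function of the toy cipher (hypothetical, not the DES round function)."""
    data = key + bytes([rnd]) + half.to_bytes(4, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")

def toy_encrypt(block: int, key: bytes, rounds: int = 4) -> int:
    """Toy Feistel cipher on 64-bit blocks, standing in for E_DES."""
    left, right = block >> 32, block & 0xFFFFFFFF
    for rnd in range(rounds):
        left, right = right, left ^ _f(right, key, rnd)
    return (left << 32) | right

def toy_decrypt(block: int, key: bytes, rounds: int = 4) -> int:
    """Inverse of toy_encrypt, standing in for D_DES."""
    left, right = block >> 32, block & 0xFFFFFFFF
    for rnd in reversed(range(rounds)):
        left, right = right ^ _f(left, key, rnd), left
    return (left << 32) | right

def triple_encrypt(m: int, k1: bytes, k2: bytes) -> int:
    """c = E(D(E(m, k1), k2), k1)"""
    return toy_encrypt(toy_decrypt(toy_encrypt(m, k1), k2), k1)

def triple_decrypt(c: int, k1: bytes, k2: bytes) -> int:
    """m = D(E(D(c, k1), k2), k1)"""
    return toy_decrypt(toy_encrypt(toy_decrypt(c, k1), k2), k1)

m, k1, k2 = 0x0123456789ABCDEF, b"key one", b"key two"
c = triple_encrypt(m, k1, k2)
print(triple_decrypt(c, k1, k2) == m)                    # True
print(triple_encrypt(m, k1, k1) == toy_encrypt(m, k1))   # True: k1 = k2 gives single encryption
```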
Let E = E(∙, k) be an arbitrary block cipher with key k and with binary input and output
blocks, e.g. DES or Triple-DES. Furthermore, let D = D(∙, k′) be the corresponding
decryption scheme with possibly different decryption key k′. Let us again denote the
blocks of plaintext by m1, m2,…, mn, with the last block mn padded to the same length as
the others, if necessary. Then the blocks ci of the ciphertext are computed according to
ci = E(mi, k). This use of a block cipher, namely according to its very own definition, as it
is also visualized in Fig. 2.8, is called electronic codebook mode (ECB). This means that
identical blocks are always encrypted identically. This preserves large-scale plaintext pat-
terns, and the frequency of identical plaintext areas is only inadequately disguised. Thus,
the ECB mode provides ideal attack conditions for statistical analyses, as we have used
several times for historical ciphers. Another disadvantage of the ECB mode is that the
receiver of the ciphertext cannot necessarily detect whether an attacker has deleted,
swapped, or even added blocks during data transmission. In general, the ECB mode should
therefore only be used for short messages with few blocks. Decryption in ECB mode is
performed according to mi = D(ci, k′).
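The pattern-preserving behaviour of the ECB mode is easy to observe in a small experiment. The sketch below assumes a hypothetical toy Feistel cipher `E`/`D` on 8-byte blocks in place of DES, uses the same key for encryption and decryption, and pads with zero bytes; none of these choices come from a standard.

```python
import hashlib

BLOCK = 8  # block length in bytes (64 bits, as for DES)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _f(half: bytes, key: bytes, rnd: int) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:BLOCK // 2]

def E(block: bytes, key: bytes, rounds: int = 4) -> bytes:
    """Toy Feistel block cipher (illustrative stand-in for DES)."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in range(rounds):
        left, right = right, xor(left, _f(right, key, rnd))
    return left + right

def D(block: bytes, key: bytes, rounds: int = 4) -> bytes:
    """Inverse of E."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in reversed(range(rounds)):
        left, right = xor(right, _f(left, key, rnd)), left
    return left + right

def split(data: bytes) -> list:
    """Pad the last block with zero bytes and cut the data into blocks."""
    data += b"\x00" * (-len(data) % BLOCK)
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def ecb_encrypt(message: bytes, key: bytes) -> list:
    return [E(m, key) for m in split(message)]      # ci = E(mi, k)

def ecb_decrypt(blocks: list, key: bytes) -> bytes:
    return b"".join(D(c, key) for c in blocks)      # mi = D(ci, k)

key = b"secret"
c = ecb_encrypt(b"ATTACK!!ATTACK!!RETREAT!", key)
print(c[0] == c[1])   # True: identical plaintext blocks give identical ciphertext blocks
print(c[0] == c[2])   # False
```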
Plaintext patterns can be destroyed using contextual encryption. In cipher block chaining
mode (CBC), one adds the previous ciphertext block to the current plaintext block and
then encrypts the result. Thus, the blocks ci of the ciphertext are calculated according to
ci = E(mi ⊕ c_{i−1}, k). However, since one does not yet have a ciphertext block available for
the first plaintext block m1, one uses an initialization block c0, which is sent to the receiver
together with the entire ciphertext. Figure 2.9 again visualizes the procedure. Decryption
is performed according to mi = D(ci, k′) ⊕ c_{i−1}.
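As a sketch of the chaining, the following code implements both CBC formulas on top of a hypothetical toy Feistel cipher `E`/`D` (an assumption made for illustration, with the same key for both directions). In contrast to ECB, two identical plaintext blocks are now encrypted differently.

```python
import hashlib

BLOCK = 8  # block length in bytes

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _f(half: bytes, key: bytes, rnd: int) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:BLOCK // 2]

def E(block: bytes, key: bytes, rounds: int = 4) -> bytes:
    """Toy Feistel block cipher (illustrative stand-in)."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in range(rounds):
        left, right = right, xor(left, _f(right, key, rnd))
    return left + right

def D(block: bytes, key: bytes, rounds: int = 4) -> bytes:
    """Inverse of E."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for rnd in reversed(range(rounds)):
        left, right = xor(right, _f(left, key, rnd)), left
    return left + right

def cbc_encrypt(blocks: list, key: bytes, c0: bytes) -> list:
    out, prev = [], c0
    for m in blocks:
        prev = E(xor(m, prev), key)        # ci = E(mi XOR c_{i-1}, k)
        out.append(prev)
    return out

def cbc_decrypt(blocks: list, key: bytes, c0: bytes) -> list:
    out, prev = [], c0
    for c in blocks:
        out.append(xor(D(c, key), prev))   # mi = D(ci, k) XOR c_{i-1}
        prev = c
    return out

key, c0 = b"secret", bytes(range(BLOCK))
plain = [b"ATTACK!!", b"ATTACK!!", b"RETREAT!"]
cipher = cbc_encrypt(plain, key, c0)
print(cipher[0] != cipher[1])                  # True: the repetition is no longer visible
print(cbc_decrypt(cipher, key, c0) == plain)   # True
```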
In cipher feedback mode (CFB), one computes the blocks ci of the ciphertext according
to ci = mi ⊕ E(c_{i−1}, k). Thus, it is a stream cipher where the block cipher E is used to
generate a context-dependent pseudo-random sequence that is added to the plaintext.
Again, an initialization block c0 is required, which is sent to the receiver along with the
entire ciphertext. Figure 2.10 again visualizes the procedure.
Because of ci ⊕ E(c_{i−1}, k) = mi ⊕ E(c_{i−1}, k) ⊕ E(c_{i−1}, k) = mi, one does not even
need D for deciphering, but computes the plaintext according to mi = ci ⊕ E(c_{i−1}, k).
Here, as with all stream ciphers, the receiver has the additional advantage that he or she
does not have to wait for the entire ciphertext block ci, but can decrypt it bit by bit.
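A minimal sketch of the CFB mode follows. Because only E is ever applied, the block cipher is replaced here by a truncated keyed hash; this stand-in is an assumption for illustration and not a real block cipher.

```python
import hashlib

BLOCK = 8  # block length in bytes

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def E(block: bytes, key: bytes) -> bytes:
    """Stand-in for the block cipher E: a keyed hash cut to the block length.
    It is not invertible, which CFB does not require."""
    return hashlib.sha256(key + block).digest()[:BLOCK]

def cfb_encrypt(blocks: list, key: bytes, c0: bytes) -> list:
    out, prev = [], c0
    for m in blocks:
        prev = xor(m, E(prev, key))        # ci = mi XOR E(c_{i-1}, k)
        out.append(prev)
    return out

def cfb_decrypt(blocks: list, key: bytes, c0: bytes) -> list:
    out, prev = [], c0
    for c in blocks:
        out.append(xor(c, E(prev, key)))   # mi = ci XOR E(c_{i-1}, k)
        prev = c
    return out

key, c0 = b"secret", b"\x10" * BLOCK
plain = [b"ATTACK!!", b"ATTACK!!", b"RETREAT!"]
cipher = cfb_encrypt(plain, key, c0)
print(cfb_decrypt(cipher, key, c0) == plain)   # True, although D was never used
```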
However, it is also possible to design a stream cipher based on the block cipher E in a
context-independent manner, which has the advantage that the pseudo-random sequence
can be calculated in advance. For this purpose, the sender and receiver agree on an initial
value s0 of the same length as mi. In output feedback mode (OFB), the ciphertext block
ci is then determined according to ci = mi ⊕ E(s_{i−1}, k), so here the pseudo-random sequence
si = E(s_{i−1}, k) has no reference to the context. Figure 2.11 shows the procedure. Decryption
is again bitwise according to mi = ci ⊕ E(s_{i−1}, k).
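The OFB mode can be sketched in the same style. The function `E` is again a hypothetical keyed-hash stand-in for the block cipher; the point is that the key stream depends only on s0 and k and can therefore be computed before any plaintext is available.

```python
import hashlib

BLOCK = 8  # block length in bytes

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def E(block: bytes, key: bytes) -> bytes:
    """Stand-in for the block cipher E (keyed hash, for illustration only)."""
    return hashlib.sha256(key + block).digest()[:BLOCK]

def ofb_keystream(n_blocks: int, key: bytes, s0: bytes) -> list:
    """s1, s2, ..., sn with si = E(s_{i-1}, k); independent of the plaintext."""
    stream, s = [], s0
    for _ in range(n_blocks):
        s = E(s, key)
        stream.append(s)
    return stream

def ofb_apply(blocks: list, key: bytes, s0: bytes) -> list:
    """Encryption and decryption are the same operation: XOR with the key stream."""
    return [xor(b, s) for b, s in zip(blocks, ofb_keystream(len(blocks), key, s0))]

key, s0 = b"secret", b"\x20" * BLOCK
plain = [b"ATTACK!!", b"ATTACK!!", b"RETREAT!"]
cipher = ofb_apply(plain, key, s0)
print(ofb_apply(cipher, key, s0) == plain)   # True
```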
Finally, we want to describe the counter mode (CTR), where the encryption of the plaintext
block mi depends on its position i in the text m. For this purpose, one writes the position
i = i0 ∙ 2^0 + i1 ∙ 2^1 + i2 ∙ 2^2 + i3 ∙ 2^3 + … + i_{b−1} ∙ 2^(b−1) as a binary expansion with bits ij,
where b is the block length of mi, and again agrees on a base value s0 of length b. Identifying
i with the bit string i0 i1 i2 i3… i_{b−1}, one can add s0 and i bitwise (⊕) and obtain a
context-dependent stream cipher in which the ciphertext block ci is computed according to
ci = mi ⊕ E(s0 ⊕ i, k). Despite this context dependency, the pseudo-random sequence can be
calculated in advance here, as in the OFB mode. Figure 2.12 illustrates the
CTR mode. Once position i = 2^b has been reached, one simply starts counting from the beginning again.
Decryption in the CTR mode is likewise done bit by bit according to mi = ci ⊕ E(s0 ⊕ i, k).
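The CTR formula can be sketched as follows. As assumptions for this illustration, `E` is a keyed-hash stand-in for the block cipher, positions are counted from i = 1, and the counter is written as a big-endian bit string; the text leaves these conventions open.

```python
import hashlib

BLOCK = 8  # block length in bytes, i.e. b = 64 bits

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def E(block: bytes, key: bytes) -> bytes:
    """Stand-in for the block cipher E (keyed hash, for illustration only)."""
    return hashlib.sha256(key + block).digest()[:BLOCK]

def ctr_block(i: int, key: bytes, s0: bytes) -> bytes:
    """Key stream block E(s0 XOR i, k) for position i, counted modulo 2^b."""
    counter = (i % (1 << (8 * BLOCK))).to_bytes(BLOCK, "big")
    return E(xor(s0, counter), key)

def ctr_apply(blocks: list, key: bytes, s0: bytes) -> list:
    """Encryption and decryption coincide: XOR block i with ctr_block(i)."""
    return [xor(b, ctr_block(i, key, s0)) for i, b in enumerate(blocks, start=1)]

key, s0 = b"secret", b"\x30" * BLOCK
plain = [b"ATTACK!!", b"ATTACK!!", b"RETREAT!"]
cipher = ctr_apply(plain, key, s0)
print(ctr_apply(cipher, key, s0) == plain)   # True

# Any single block can be decrypted without touching the others:
print(xor(cipher[2], ctr_block(3, key, s0)))  # b'RETREAT!'
```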
The operating modes were first standardized for use with DES in 1981, but are of
course used with other block ciphers as well.
2.7 UMTS/LTE Mobile Communications and Digital Television
We will now take up the encryption procedure for the GSM mobile communications stan-
dard (Sect. 2.3). In the course of the 1990s, UMTS (Universal Mobile Telecommunications
System) was developed as a third-generation (3G) mobile communications standard with
significantly higher data transmission rates than GSM. UMTS includes additional services
such as e-mail and Internet access. UMTS has been commercially available in Germany since
2004, and there are now UMTS networks in over 100 countries. In the meantime, LTE
(Long Term Evolution) has been launched as a fourth-generation (4G) mobile communications
standard, but it has a similar architecture to UMTS. In 2010, the first LTE licenses
were auctioned in Germany and the first LTE transmission masts were put into operation.
While the old version A5/1 is still widely used in many GSM networks and is only gradually
being replaced by A5/3, the A5/4 version is already implemented for data encryption
in UMTS and LTE. Both A5/3 and A5/4 are fundamentally different from A5/1: they are based
on the Japanese KASUMI cipher (English: fog, mist), a variant of MISTY1 from 1995. KASUMI is a
Feistel cipher with 8 rounds on 64-bit blocks and a 128-bit key. It generates a pseudo-random
sequence in a combination of CTR and OFB modes and is therefore operated as a
stream cipher. Since we have already dealt in detail with DES and thus with by far the most
important Feistel cipher, we will not give an explicit description of the round function
[3GPP] for the KASUMI cipher.
The standardized version A5/3 in fact has an effective key length of only 64 bits.
This key is simply doubled to obtain the 128-bit key for the KASUMI algorithm. One reason
for this is that the key generation for GSM can be used unchanged for A5/3 and thus GSM
can be upgraded to A5/3 more easily (Sect. 2.3). However, this means that A5/3 is just as
vulnerable to brute-force attacks as DES. For this reason, ETSI (European
Telecommunications Standards Institute) has also launched version A5/4, also with
KASUMI cipher, but with an effective key length of 128 bits [ETSI2, WPA5A].
DVB (Digital Video Broadcasting) is a standard for the digital transmission of television
programmes. There are different sub-standards for different transmission paths, which dif-
fer, among other things, in the modulation method: DVB-S for transmission via satellite,
DVB-C for transmission via cable networks, DVB-T for transmission via terrestrial trans-
mitters. DVB-S and DVB-C were ratified in 1994, DVB-T 3 years later. In the meantime,
however, there is already a successor standard DVB2. The video and audio contents of
DVB are transmitted by means of so-called MPEG2 transport packets. These are named
after the MPEG (Moving Picture Experts Group), which has been creating various stan-
dards for video and audio formats since the late 1980s. Each MPEG2 transport packet
consists of header data with controlling information as well as the actual payload data. For
example, there is a 2-bit field that encodes a possible encryption, where 00 stands for
unencrypted. The transport packets are reassembled during playback to form the so-called
elementary stream, which ultimately generates the video and audio playback.
We will now look at DVB encryption for pay-TV channels. The procedure, which origi-
nated in 1994, is called CSA (Common Scrambling Algorithm). Each receiver requires a
CA module (Conditional Access) and an individual smart card (ICC). In addition to the
MPEG2 transport packets, the DVB provider sends separate ECM packets (Entitlement
Control Message) with which the keys for the decryption of the pay-TV channel are
transmitted. The CA module filters the ECM packets out of the data stream and uses the
smart card to calculate the 64-bit key that is valid at that time.
The CSA encryption method itself consists of a combination of block cipher and stream
cipher, with the block cipher being used for encryption first. This is not a Feistel cipher,
but more generally an iterative substitution permutation cipher of 56 rounds on blocks of
64 bits, operated in CBC mode. The round function is shown schematically in Fig. 2.13,
where the permutation, the substitution box, and the derivation of the round key are specified
separately.
Following the block cipher, an additional complex stream cipher is used, which outputs
two pseudo-random bits at each of its clock pulses, which are added to the bit stream to be
encrypted [WPCAS]. CSA was kept secret for many years, contrary to the Kerckhoffs
principle, but then became public knowledge in 2002. Although a brute-force attack ini-
tially appears feasible due to the small key length, it is hampered by the frequent change
of the key in the ECM packets.
In 2013, ETSI standardized a successor procedure CSA3, which is based on the mod-
ern standard procedure AES (Sect. 2.8) and on an XRC cipher, which is again kept secret
[ETSI1]. The AES cipher is operated with a key length of 128 bits in CBC mode. However,
CSA3 is hardly used, and CSA therefore remains the dominant method for protecting pay-
TV channels in DVB.
2.8 Advanced Encryption Standard AES
• a preliminary round,
• 9, 11 or 13 normal rounds (each for key length 128, 192 or 256 bits) and
• a final round.
• SubByte
• ShiftRow
• MixColumn
• AddRoundKey
The preliminary round uses only AddRoundKey, and the final round does without
MixColumn. Before we describe these four building blocks in a little more detail, we want
to point out what is actually new about the AES cipher.
Here again are the addition and multiplication tables for bits (Sect. 1.2):
 + | 0 1        ∙ | 0 1
 0 | 0 1        0 | 0 0
 1 | 1 0        1 | 0 1
As we already know, information units such as letters and pixels are usually interpreted
as blocks of several bits, especially often as bytes with 8 bits. The question therefore arises
whether it is also possible to add and multiply blocks of bits in a meaningful way. But what
should “meaningful” mean in this context? It should mean that certain rules of calculation
apply, which are absolutely necessary for further considerations. It is important that one
can not only add and multiply, but also subtract and divide. For this, one needs the following
formal properties:
• Each bit block can be made 0…0 by adding a second bit block (so-called additive
inverse, which corresponds to a subtraction). 0…0 is also called the 0-element.
• Each bit block not equal to 0…0 can be made 0…01 by multiplication with a second bit
block (so-called multiplicative inverse, which corresponds to a division). 0…01 is
also called the 1-element.
In mathematics, such a structure is called a field. Let us look at the simple example
of bit pairs. A first approach is to add and multiply the bits position by position. For our
well-known bitwise addition ⊕ this works well, because if you add a bit pair to itself,
the result is always 00. But for the multiplication, unfortunately, this approach fails,
because no matter what one multiplies 10 by position by position, the result is never
01. So one has to define the multiplication in a more elaborate way. Here are the desired
addition and multiplication tables for bit pairs:
  ⊕ | 00 01 10 11         ⊗ | 00 01 10 11
 00 | 00 01 10 11        00 | 00 00 00 00
 01 | 01 00 11 10        01 | 00 01 10 11
 10 | 10 11 00 01        10 | 00 10 11 01
 11 | 11 10 01 00        11 | 00 11 01 10
In a way, this is an extension of the bit addition and multiplication, because the latter are
found exactly in the upper left quarter of each table, applied to the last position of the bit
pair. Now there is also a bit pair for 10, namely 11, for which the product 10 ⊗ 11 results in
the 1-element 01. A corresponding multiplication ⊗ also works for bit blocks of any
length, in particular for bytes with their 8 bits. However, we refrain here from reproducing
the byte multiplication table ⊗ with its 256 rows and columns.
Instead we want to explain briefly how to define this reasonable multiplication ⊗ conceptually
for bit blocks of arbitrary length n. For this we number the positions of the bits
from the right, starting from 0 up to n − 1, and set for abbreviation t = 0…0010. For i from 0 to
n − 1 we now define t^i = t ⊗ t ⊗ … ⊗ t (with i factors) as the bit string which has bit 1
exactly at position i and 0 otherwise. In particular, t^0 = 0…001 is the 1-element, and
t^i ⊗ t^j = t^(i + j). Arbitrary bit strings, i.e. sums ⊕ of some t^i, are multiplied by the
distributive rule. For example, for n = 8 we have 00000101 ⊗ 00001110 = (t^2 ⊕ t^0) ⊗ (t^3
⊕ t^2 ⊕ t^1) = t^5 ⊕ t^4 ⊕ t^3 ⊕ t^3 ⊕ t^2 ⊕ t^1 = t^5 ⊕ t^4 ⊕ t^2 ⊕ t^1 = 00110110. But wait: this
multiplication rule ⊗ makes sense only if the exponents of t are at most n − 1. So one needs
some kind of recursion formula for t^n. But this formula is not so easy to construct for
arbitrary n. For bit pairs, for example, the formula is t^2 = t^1 ⊕ t^0 = 11, and for
bit triples one can use t^3 = t^1 ⊕ t^0 = 011. The reader is asked to check this for bit pairs in
the multiplication table above and to create the multiplication table for bit triples with its
eight rows and columns. For bytes, at any rate, one can use, as one of several possibilities,
the recursion formula t^8 = t^4 ⊕ t^3 ⊕ t^1 ⊕ t^0 = 00011011, and it is exactly this formula which
is used for AES. Mathematically speaking, bytes thus form a field [Man, Wil, Buc].
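The multiplication ⊗ described above can be written as a short shift-and-add loop. In this sketch, the function name `gf_mul` and its parameters are chosen for illustration; `reduction` holds the bit string that replaces t^n according to the recursion formula, so the same function covers bit pairs, bit triples and bytes.

```python
def gf_mul(a: int, b: int, n: int = 8, reduction: int = 0b00011011) -> int:
    """Multiply the bit blocks a and b of length n.

    `reduction` is the bit string for t^n given by the recursion formula,
    e.g. 00011011 for bytes (t^8 = t^4 + t^3 + t^1 + t^0).
    """
    result = 0
    for _ in range(n):
        if b & 1:                      # current power of t occurs in b
            result ^= a                # add (XOR) the matching multiple of a
        b >>= 1
        carry = (a >> (n - 1)) & 1     # would a * t contain t^n?
        a = (a << 1) & ((1 << n) - 1)  # multiply a by t
        if carry:
            a ^= reduction             # replace t^n using the recursion formula
    return result

# Example from the text: 00000101 * 00001110 = 00110110
print(format(gf_mul(0b00000101, 0b00001110), "08b"))    # 00110110

# Bit pairs with t^2 = t^1 + t^0 = 11: 10 * 11 = 01
print(format(gf_mul(0b10, 0b11, n=2, reduction=0b11), "02b"))  # 01

# Multiplication table for bit triples with t^3 = t^1 + t^0 = 011
for x in range(8):
    print(" ".join(format(gf_mul(x, y, n=3, reduction=0b011), "03b") for y in range(8)))
```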
What hence is essentially new compared to DES is that AES is not designed for bit struc-
tures, but for byte structures and their addition and multiplication. A plaintext block of
AES has 128 bits, i.e. 128/8 = 16 bytes. Each such block a1…a16 of 16 bytes is read for
encryption column by column into a matrix with four rows and four columns:
a1 a5 a9 a13
a2 a6 a10 a14
a3 a7 a11 a15
a4 a8 a12 a16
On the basis of this so-called state matrix, all mapping modules are now defined.
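The block SubByte is the actual substitution part in the AES procedure. From our preliminary
consideration we already know that every byte ai which is not equal to 00000000 has
a multiplicative inverse ai^(−1), for which ai ⊗ ai^(−1) = 00000001 holds. So for
each byte ai in the state matrix, we first compute bi = ai^(−1) if ai is not equal to 00000000,
and set bi = 00000000 for ai = 00000000. Then we write each byte bi = β7…β0 again
as a bit string of length 8 and transform the bits β7,…, β0 according to
$$\alpha_j = \lambda_{j0} \cdot \beta_0 + \lambda_{j1} \cdot \beta_1 + \dots + \lambda_{j7} \cdot \beta_7 + \delta_j \qquad (j = 0, \dots, 7),$$
where the bits are added modulo 2.
Here, the bits λjl and δj are determined by the AES algorithm as follows (rows j = 0,…, 7
from top to bottom, columns l = 0,…, 7 from left to right):
$$(\lambda_{jl}) = \begin{pmatrix}
1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1
\end{pmatrix}
\quad \text{and} \quad
(\delta_j) = \begin{pmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{pmatrix}$$
The module SubByte then replaces each byte ai in the state matrix with the byte
ai′ = α7…α0 determined in this way.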
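The two steps of SubByte, inversion in the byte field followed by the bit transformation with λjl and δj, can be sketched as follows. The inverse is found by brute-force search, which is slow but keeps the code short; real implementations use a precomputed table. The function names are chosen here for illustration.

```python
def gf_mul(a: int, b: int) -> int:
    """Byte multiplication with t^8 = t^4 + t^3 + t^1 + t^0 (00011011)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0b00011011
    return result

def gf_inv(a: int) -> int:
    """Multiplicative inverse of a byte; 00000000 is mapped to itself."""
    if a == 0:
        return 0
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)

# Rows j = 0..7, columns l = 0..7 (index 0 = rightmost bit of the byte).
LAMBDA = [
    [1, 0, 0, 0, 1, 1, 1, 1],
    [1, 1, 0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0, 1, 1],
    [1, 1, 1, 1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1, 1],
]
DELTA = [1, 1, 0, 0, 0, 1, 1, 0]

def sub_byte(a: int) -> int:
    """SubByte for a single byte: inversion, then the bit transformation."""
    b = gf_inv(a)
    beta = [(b >> l) & 1 for l in range(8)]
    result = 0
    for j in range(8):
        bit = DELTA[j]
        for l in range(8):
            bit ^= LAMBDA[j][l] & beta[l]
        result |= bit << j
    return result

print(hex(sub_byte(0x00)))  # 0x63
print(hex(sub_byte(0x01)))  # 0x7c
print(hex(sub_byte(0x53)))  # 0xed
```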
The module ShiftRow changes the rows of the state matrix. The first row remains
unchanged, the second row is cyclically shifted to the left by one place, the third row by
two places and the fourth row by three places. In this way, each byte ai′ of the state matrix
is converted into a byte ai′′.
The module MixColumn changes the columns of the state matrix. For abbreviation we
write e = 00000001 (the 1-element), t = 00000010 and s = e ⊕ t = 00000011. Then the
elements a1′′′, a2′′′, a3′′′ and a4′′′ of the new first column of the state matrix are
calculated as
$$\begin{aligned}
a_1''' &= t \otimes a_1'' \oplus s \otimes a_2'' \oplus e \otimes a_3'' \oplus e \otimes a_4'' \\
a_2''' &= e \otimes a_1'' \oplus t \otimes a_2'' \oplus s \otimes a_3'' \oplus e \otimes a_4'' \\
a_3''' &= e \otimes a_1'' \oplus e \otimes a_2'' \oplus t \otimes a_3'' \oplus s \otimes a_4'' \\
a_4''' &= s \otimes a_1'' \oplus e \otimes a_2'' \oplus e \otimes a_3'' \oplus t \otimes a_4''
\end{aligned}$$
The new elements a5′′′, a6′′′, a7′′′ and a8′′′ of the second column, a9′′′, a10′′′, a11′′′ and a12′′′
of the third column and a13′′′, a14′′′, a15′′′ and a16′′′ of the fourth column of the state matrix
are calculated in the same way.
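ShiftRow and MixColumn can be sketched on a state stored as a list of the 16 bytes a1…a16 in column order, as in the state matrix above. The function names are illustrative; the column used in the example is a commonly quoted test column for MixColumn.

```python
def gf_mul(a: int, b: int) -> int:
    """Byte multiplication with t^8 = t^4 + t^3 + t^1 + t^0 (00011011)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0b00011011
    return result

E_, T_, S_ = 0b00000001, 0b00000010, 0b00000011   # e, t and s = e + t

def shift_row(state: list) -> list:
    """Row r (r = 0..3) of the column-ordered state is rotated left by r places."""
    out = list(state)
    for r in range(4):
        row = [state[r + 4 * c] for c in range(4)]
        row = row[r:] + row[:r]
        for c in range(4):
            out[r + 4 * c] = row[c]
    return out

def mix_one_column(col: list) -> list:
    """Apply the four MixColumn equations to one column (a1, a2, a3, a4)."""
    a1, a2, a3, a4 = col
    return [
        gf_mul(T_, a1) ^ gf_mul(S_, a2) ^ gf_mul(E_, a3) ^ gf_mul(E_, a4),
        gf_mul(E_, a1) ^ gf_mul(T_, a2) ^ gf_mul(S_, a3) ^ gf_mul(E_, a4),
        gf_mul(E_, a1) ^ gf_mul(E_, a2) ^ gf_mul(T_, a3) ^ gf_mul(S_, a4),
        gf_mul(S_, a1) ^ gf_mul(E_, a2) ^ gf_mul(E_, a3) ^ gf_mul(T_, a4),
    ]

def mix_column(state: list) -> list:
    """Apply mix_one_column to each of the four columns of the state."""
    out = []
    for c in range(4):
        out += mix_one_column(state[4 * c:4 * c + 4])
    return out

print(shift_row(list(range(1, 17))))
# [1, 6, 11, 16, 5, 10, 15, 4, 9, 14, 3, 8, 13, 2, 7, 12]
print([hex(x) for x in mix_one_column([0xDB, 0x13, 0x53, 0x45])])
# ['0x8e', '0x4d', '0xa1', '0xbc']
```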
Now, of course, the secret key k must also come into play. Since one wants to use a different
key for each round, the round keys are constructed successively on the basis of the 128-, 192-, or
256-bit AES key. We will explain the procedure using the example of a 128-bit key. To
do this, the key k is first divided into four blocks k0, k1, k2 and k3 of 32 bits each. The block
AddRoundKey then adds the actual AES key k = k0 k1 k2 k3 as a round key bit by bit (⊕) to
the plaintext block a1 a2…a15 a16 in the preliminary round.
For the j-th round, one recursively derives the following four 32-bit blocks from k:
$$\begin{aligned}
k_{4j} &= k_{4j-4} \oplus T(k_{4j-1}) \\
k_{4j+1} &= k_{4j-3} \oplus k_{4j} \\
k_{4j+2} &= k_{4j-2} \oplus k_{4j+1} \\
k_{4j+3} &= k_{4j-1} \oplus k_{4j+2}
\end{aligned}$$
Here the transformation T of the 32-bit block k_{4j−1} must still be described. This block
again consists of four bytes, say k_{4j−1} = c1 c2 c3 c4 with the bytes c1 to c4, which depend
on j. Then T(k_{4j−1}) = c1′ c2′ c3′ c4′ consists of the bytes
$$\begin{aligned}
c_1' &= S(c_2) \oplus t^{\,j-1} \\
c_2' &= S(c_3) \\
c_3' &= S(c_4) \\
c_4' &= S(c_1)
\end{aligned}$$
with the transformation S described at the module SubByte and with the byte
t = 00000010.
In the j-th round, the 128-bit round key k_{4j}‖k_{4j+1}‖k_{4j+2}‖k_{4j+3}, formed by stringing together
k_{4j}, k_{4j+1}, k_{4j+2} and k_{4j+3}, is added bit by bit (⊕) to the concatenation a1′′′‖a2′′′‖…‖a15′′′‖a16′′′
of the entries a1′′′ to a16′′′ of the state matrix in the AddRoundKey module.
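The recursion for the round keys of a 128-bit key can be sketched as follows. The S transformation is written here in an equivalent rotation form of the SubByte bit transformation, and the inverse is found by brute force; both are choices made for brevity. The key in the example is the sample key commonly used in the AES specification.

```python
def gf_mul(a: int, b: int) -> int:
    """Byte multiplication with t^8 = t^4 + t^3 + t^1 + t^0 (00011011)."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0b00011011
    return result

def rotl8(x: int, s: int) -> int:
    return ((x << s) | (x >> (8 - s))) & 0xFF

def S(a: int) -> int:
    """SubByte on one byte (inversion, then the bit transformation in rotation form)."""
    b = 0 if a == 0 else next(x for x in range(1, 256) if gf_mul(a, x) == 1)
    return b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63

def xor_words(u: list, v: list) -> list:
    return [x ^ y for x, y in zip(u, v)]

def T(word: list, t_power: int) -> list:
    """T(c1 c2 c3 c4) = (S(c2) + t^(j-1)) S(c3) S(c4) S(c1)."""
    c1, c2, c3, c4 = word
    return [S(c2) ^ t_power, S(c3), S(c4), S(c1)]

def expand_key(key: bytes) -> list:
    """Return the 44 32-bit blocks k0, ..., k43 for a 128-bit key."""
    k = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    t_power = 0b00000001                      # t^(j-1) for j = 1
    for j in range(1, 11):
        k.append(xor_words(k[4 * j - 4], T(k[4 * j - 1], t_power)))
        for i in (1, 2, 3):                   # k_{4j+i} = k_{4j-4+i} + k_{4j+i-1}
            k.append(xor_words(k[4 * j - 4 + i], k[4 * j + i - 1]))
        t_power = gf_mul(t_power, 0b00000010)
    return k

words = expand_key(bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c"))
print("".join(bytes(w).hex() for w in words[4:8]))
# a0fafe1788542cb123a339392a6c7605  (round key for round 1)
```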
AES is not a Feistel cipher, so the same procedure cannot simply be reused for deciphering
with the round keys in reverse order. But it is not hard to see that all AES modules are
invertible. For deciphering, one applies the inverted modules in reverse order and needs
the same round keys, again in reverse order.
Of course, it is not surprising that numerous cryptanalytic attacks have been carried out on
AES. However, the method is secure against all attacks known to date, e.g. also against
differential and linear cryptanalysis. The inversion of the bytes in the module SubByte
makes the method highly complex, and the modules ShiftRow and MixColumn provide a high
degree of confusion and diffusion. However, it is debatable whether the simple algebraic design
could be a weakness of AES and thus a possible point of attack.
2.9 Hard Disk and ZIP Archive
The magnetic storage medium hard disk has been the most important mass storage
medium for many decades. Hard disk drives are installed in computers, but are also offered
as external drives. The combined read/write head at the tip of the actuator arm is basically
a small electromagnet. It magnetizes tiny areas of the disk surface differently and thus
writes the data to the hard disk. Conversely, when reading, the changes in the magnetiza-
tion of the surface cause a voltage pulse in the read head due to electromagnetic induction.
Hard disks organize their data in so-called sectors (with e.g. 512, 2048 or 4096 bytes),
which can only ever be read or written as a whole. Encryption of hard disks therefore usu-
ally takes place per sector.
There are a large number of hard disk encryption software products on the market. Many
of them use the CBC-AES method. Here, each sector is divided into blocks of 128 bits
each and the blocks are encrypted one after the other using AES in CBC operating mode.
This is generally considered sufficient for most security applications.
However, both the BSI (Bundesamt für Sicherheit in der Informationstechnik, English:
German Federal Office for Information Security) and NIST recommend in particular
XTS-AES [BSI] for hard disk encryption. This is a standardized procedure that is also
based on AES. The abbreviation XTS stands for “Xor-Encrypt-Xor-based tweaked-
codebook mode with ciphertext stealing”. With XTS-AES, too, each sector is divided into
blocks of 128 bits, but AES is operated in an optimized (“tweaked”) variant of the ECB
mode. This is done with the following trick, which we have already seen for bytes (Sect.
2.8). Even for bit blocks of length 128, a reasonable addition ⊕ and multiplication ⊗ can
be defined. For the 128-bit string t = 0…010, for example, the recursion formula
t^128 = t^7 ⊕ t^2 ⊕ t^1 ⊕ t^0 can be used. Just like bytes, the bit blocks of length 128 then form a
field in the mathematical sense. But we know more about a field, namely that there is at
least one element g whose successive powers g^j = g ⊗ … ⊗ g (with j factors) for j = 0,
1, 2,… yield all 128-bit blocks except 0…0 [Man, Wil, Buc]. Therefore, g is called a
generating element (Sect. 2.5).
XTS-AES uses two AES keys. The key k1 is used to AES-encrypt the 128-bit blocks per
sector, and the other k2 encrypts an initialization value IV of also 128 bits, which is usually
derived from the sector address. The diagram in Fig. 2.14 schematically shows the workflow
of an XTS-AES encryption for the j-th block within a sector. The procedure is as follows
in detail:
• The initialization value IV with 128 bits is encrypted using AES and the key k2.
• The result, again a string of 128 bits, is multiplied (⊗) by g^j.
• This string is added bitwise (⊕) to the plaintext of the j-th block.
• The result is subjected to an AES cipher with key k1.
• Finally, the string from the second point above is added bit by bit (⊕) once more.
The graphic in Fig. 2.14 serves as a simplified representation of the procedure. If one were
to proceed in exactly this way in practice, the IV value would be encrypted again and again
for each block of the sector and g^j would be recalculated again and again. This is unnecessary.
Therefore, the encryption of IV is done only once per sector at the beginning, and g^j
is calculated successively as g^j = g^(j − 1) ⊗ g.
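The five steps and the successive computation of g^j can be sketched as follows. Several assumptions are made for this illustration: AES is replaced by a hypothetical toy Feistel cipher `E`/`D` on 128-bit integers, the generating element is taken to be g = t = 0…010, blocks are numbered from j = 0, and the byte-order conventions of the actual standard as well as ciphertext stealing are ignored.

```python
import hashlib

MASK128 = (1 << 128) - 1
MASK64 = (1 << 64) - 1

def _f(half: int, key: bytes, rnd: int) -> int:
    data = key + bytes([rnd]) + half.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def E(block: int, key: bytes, rounds: int = 4) -> int:
    """Toy Feistel cipher on 128-bit blocks (illustrative stand-in for AES)."""
    left, right = block >> 64, block & MASK64
    for rnd in range(rounds):
        left, right = right, left ^ _f(right, key, rnd)
    return (left << 64) | right

def D(block: int, key: bytes, rounds: int = 4) -> int:
    """Inverse of E."""
    left, right = block >> 64, block & MASK64
    for rnd in reversed(range(rounds)):
        left, right = right ^ _f(left, key, rnd), left
    return (left << 64) | right

def times_g(x: int) -> int:
    """Multiply a 128-bit block by g = t, using t^128 = t^7 + t^2 + t^1 + t^0."""
    x <<= 1
    if x >> 128:
        x = (x & MASK128) ^ 0b10000111
    return x

def xts_encrypt(blocks: list, k1: bytes, k2: bytes, iv: int) -> list:
    tweak = E(iv, k2)                             # encrypted once per sector
    out = []
    for m in blocks:
        out.append(E(m ^ tweak, k1) ^ tweak)      # XOR, encrypt, XOR
        tweak = times_g(tweak)                    # E(IV) * g^(j+1) from E(IV) * g^j
    return out

def xts_decrypt(blocks: list, k1: bytes, k2: bytes, iv: int) -> list:
    tweak = E(iv, k2)
    out = []
    for c in blocks:
        out.append(D(c ^ tweak, k1) ^ tweak)
        tweak = times_g(tweak)
    return out

k1, k2, iv = b"data key", b"tweak key", 42        # iv: e.g. the sector address
sector = [0x1111, 0x1111, 0x2222]
cipher = xts_encrypt(sector, k1, k2, iv)
print(cipher[0] != cipher[1])                     # True: equal blocks, different positions
print(xts_decrypt(cipher, k1, k2, iv) == sector)  # True
```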
If the sector length is not a multiple of the block length, a partial block of less than
128 bits remains at the end. This is then filled up with the last bits of the ciphertext of the
penultimate block (“ciphertext stealing”) [WPDET].
Unlike CBC-AES, in the tweaked-ECB mode of XTS-AES each block is independent
and not concatenated with other blocks. This means that if stored ciphered data is cor-
rupted, only the data of that particular block is unrecoverable. However, XTS-AES requires
AES keys twice as long, so 256 bits and 512 bits for AES-128 and AES-256 respectively.
Other storage media such as USB sticks (so-called flash memories) are also commer-
cially available with CBC-AES or XTS-AES encryption [Kin].
The ZIP file format was originally developed in 1989 by Phil Katz (1962–2000). Today,
there is a whole range of standard programs for creating and editing so-called ZIP archives,
such as WinZip and 7-Zip. The use of ZIP archives offers a whole range of advantages.
They function as a container file into which several files belonging together or even entire
directory trees can be packed. They also store data in compressed form, which was, incidentally,
the real reason for their development. This way you can save space on your hard
drive, fit more data on a USB stick, and upload or send files over the Internet more conveniently.
The compression method developed by Phil Katz is called DEFLATE.
ZIP archives are also very popular because they can be optionally encrypted, which
increases data security, especially when sending files. Encrypted ZIP archives can only be
accessed by entering a password. The files of a ZIP archive are encrypted with DES in
older versions, but with AES in newer versions, alternatively with the key lengths 128 bit
or 256 bit.
3 Public-Key Ciphers
Up to now, all our encryption methods were designed in such a way that the encryption key
was immediately known as the decryption key, or at least that it could be calculated with-
out great difficulty. We called these methods symmetric ciphers (Sect. 2.1). In the case of
asymmetric ciphers, it should be practically impossible to deduce the decryption key from
the knowledge of the encryption key. Therefore, in this case, the encryption key can be
made public. This is why these methods are also called public-key ciphers.
3.1 Factorization and RSA Cipher
But what, on the one hand, should be easy to handle as a key, but, on the other hand, cannot
be calculated in a reasonable amount of time, especially today, with our networked super-
computers? Mathematical topics probably come to mind, which have had great appeal
since antiquity, but have steadfastly eluded a reasonable solution to this day. One of these
problems is the decomposition of a natural number into factors, preferably prime numbers.
A prime number is a natural number that is divisible only by 1 and itself, such as 2, 3, 5,
7, 11, 13, 17,… While it is known in principle that any natural number can be uniquely
decomposed into its prime factors, e.g. 60 = 2^2 ∙ 3 ∙ 5, how does one do this concretely?
The obvious thing to do is to examine a given number for possible divisors. However, for
very large natural numbers, this method quickly reaches its runtime limits. In short:
Factoring (in a reasonable time) is difficult. However, one cannot prove this mathemati-
cally conclusively. In any case, if the puzzle is unexpectedly solved tomorrow, some of
what we are about to learn will have to be completely rethought.
How to translate the problem of factoring into a public-key cipher is what we want to look
at now. The idea for this is based on the following statement, the so-called Fermat’s little
theorem: Let p be a prime number and a a natural number coprime with p. Then a^(p − 1) = 1
(mod p) holds.
This statement, which goes back to Pierre de Fermat (1607–1665), again contains a
modulo calculation (Sect. 1.2), as we already know it from letters (mod n = 26) or from
bits (mod n = 2). If a = x (mod n) and b = y (mod n), then a ∙ b = x ∙ y (mod n) is also true.
In words, this means, “Whether you first calculate the remainders modulo n and then multiply,
or whether you first multiply and then calculate the remainders modulo n, it comes
out to the same thing.” This rule of calculation, which we shall use very frequently in what
follows, can be seen as follows: if a = x + r ∙ n and b = y + s ∙ n with integers r and s,
then a ∙ b = (x + r ∙ n) ∙ (y + s ∙ n) = x ∙ y + (x ∙ s + r ∙ y + r ∙ s ∙ n) ∙ n and therefore x ∙ y = a
∙ b (mod n). Thus, by our rule of arithmetic, if in particular a^i = x (mod n) and a^j = y (mod
n), then a^(i + j) = x ∙ y (mod n).
Let us also prove Fermat’s little theorem for practice. First, we note that all numbers
1 ∙ a, 2 ∙ a,…, (p − 1) ∙ a are distinct, and this is true even when they are considered as
remainders modulo p. Indeed, if i ∙ a = j ∙ a (mod p) holds with natural numbers i and j from the
range 1 to p − 1, then (j − i) ∙ a = 0 (mod p). But since the prime number p does not
divide a by assumption, p must divide the difference j − i, and since this difference lies
strictly between −p and p, it follows that j = i. Thus the numbers
1 ∙ a, 2 ∙ a,…, (p − 1) ∙ a, each considered as a remainder modulo p, pass through all the
remainders 1, 2,…, p − 1, but possibly in a different order. If we form the product in
each case, then a^(p − 1) ∙ 1 ∙ 2 ∙ 3… ∙ (p − 1) = 1 ∙ 2 ∙ 3… ∙ (p − 1) (mod p), and p thus
divides (a^(p − 1) − 1) ∙ 1 ∙ 2 ∙ 3… ∙ (p − 1). Since p is a prime number and divides none of
the factors 1, 2,…, p − 1, it must divide a^(p − 1) − 1, so a^(p − 1) = 1 (mod p).
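Both the key step of the proof and the theorem itself can be checked numerically for small primes. The following sketch uses Python's built-in three-argument `pow` for modular exponentiation; the helper name is chosen for illustration.

```python
def multiples_mod_p(a: int, p: int) -> list:
    """Remainders of 1*a, 2*a, ..., (p-1)*a modulo p, sorted."""
    return sorted((i * a) % p for i in range(1, p))

p, a = 13, 5

# Key step of the proof: the multiples of a run through all remainders 1, ..., p-1.
print(multiples_mod_p(a, p) == list(range(1, p)))   # True

# Fermat's little theorem: a^(p-1) = 1 (mod p).
print(pow(a, p - 1, p))                             # 1

# The assumption that p is prime matters: for n = 15 and a = 2 we get 2^14 = 4 (mod 15).
print(pow(2, 14, 15))                               # 4
```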
Before we come to the announced public key cipher, we want to remind you of the
Euclidean algorithm [Wil, Buc]. It is named after Euclid of Alexandria (third century
BC). The algorithm is used to determine the greatest common divisor of two natural num-
bers m and n. For this, let m be greater than n. Then set r0 = m and r1 = n and divide r0 by
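r1 with remainder, i.e. r0 = q1 ∙ r1 + r2 with r2 less than r1, and continue the procedure
iteratively until, at the k-th step, the division leaves no remainder:
$$\begin{aligned}
r_0 &= q_1 \cdot r_1 + r_2 \\
r_1 &= q_2 \cdot r_2 + r_3 \\
&\;\;\vdots \\
r_i &= q_{i+1} \cdot r_{i+1} + r_{i+2} \\
&\;\;\vdots \\
r_{k-3} &= q_{k-2} \cdot r_{k-2} + r_{k-1} \\
r_{k-2} &= q_{k-1} \cdot r_{k-1} + r_k \\
r_{k-1} &= q_k \cdot r_k
\end{aligned}$$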
Then the Euclidean algorithm states that g = rk is the greatest common divisor of m and
n. Calculating iteratively backwards from the next-to-last equation by substituting the previous
ri, we get
$$g = r_k = r_{k-2} - q_{k-1} \cdot r_{k-1} = r_{k-2} - q_{k-1} \cdot (r_{k-3} - q_{k-2} \cdot r_{k-2}) = \dots$$
and the greatest common divisor g can finally be written as the multiple sum g = x ∙
m + y ∙ n with integers x and y. These can be chosen so that x is positive and y is negative.
Otherwise, one modifies the multiple sum according to g = x ∙ m + n ∙ m + y ∙
n − m ∙ n = (x + n) ∙ m + (y − m) ∙ n, repeating this if necessary. This is also called the
extended Euclidean algorithm [Wil, Buc]. The method is highly efficient and easy to implement.
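A compact way to implement the extended Euclidean algorithm is sketched below. Note that it does not substitute backwards as described in the text but carries the coefficients x and y along during the divisions, a common forward variant that yields the same kind of representation g = x ∙ m + y ∙ n. The function names are chosen for illustration.

```python
def extended_euclid(m: int, n: int) -> tuple:
    """Return (g, x, y) with g = gcd(m, n) = x * m + y * n."""
    r0, r1 = m, n
    x0, x1 = 1, 0      # coefficients of m for r0 and r1
    y0, y1 = 0, 1      # coefficients of n for r0 and r1
    while r1 != 0:
        q = r0 // r1                       # division with remainder
        r0, r1 = r1, r0 - q * r1
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return r0, x0, y0

def make_x_positive(x: int, y: int, m: int, n: int) -> tuple:
    """Replace (x, y) by (x + n, y - m) until x is positive; the sum is unchanged."""
    while x <= 0:
        x, y = x + n, y - m
    return x, y

g, x, y = extended_euclid(8, 3)
print(g, x, y)                        # 1 -1 3, i.e. 1 = -1 * 8 + 3 * 3
print(make_x_positive(x, y, 8, 3))    # (2, -5), i.e. 1 = 2 * 8 - 5 * 3
```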
The public key cipher we are about to describe was created by Ronald Rivest (b. 1947),
Adi Shamir (b. 1952), and Leonard Adleman (b. 1945). It was published in 1977 and is
known as the RSA cipher. How it works is shown schematically in Fig. 3.1 and
described below.
The potential communication participant Y(ollanda) first obtains two different very
large prime numbers p and q and multiplies them to the number n = p ∙ q. She also chooses
a natural number e smaller than (p − 1) ∙ (q − 1), which is coprime with (p − 1) ∙ (q − 1).
Using the extended Euclidean algorithm, she can then write the greatest common divisor
1 as the multiple sum of e and (p − 1) ∙ (q − 1), that is, 1 = d ∙ e + b ∙ (p − 1) ∙ (q − 1) with
a natural number d and a negative integer b. Our participant Y now registers in a central
registry with her name and her so-called public key (n, e). She is then also said to have a
certified RSA key. However, she keeps her private key d secret.
Now suppose sender X(avier) wants to send a secret message to receiver Y. X then first
looks up the public key (n, e) of Y in the central registry. Let the message m be a natural
number smaller than the very large n. Now X sends the remainder of m^e when divided by
n to Y. She takes the received m^e (mod n) and her private key d and computes (m^e)^d = m^(ed)
(mod n). As a result, she obtains m = m^(ed) (mod n), and since m is smaller than n, exactly
the desired plaintext m.
To see this, we first consider that the statement m^(ed) = m^(1 − b(p − 1)(q − 1)) = m ∙ (m^(p − 1))^(−b(q − 1)) = m
(mod p) holds. Namely, if p is not a divisor of m, then Fermat’s little theorem gives
m^(p − 1) = 1 (mod p). However, if p divides the number m, then both sides are equal to 0
modulo p. Similarly, for the other prime number q, m^(ed) = m (mod q) also holds. Since p and q
are different prime numbers, n = p ∙ q is thus a divisor of m^(ed) − m, so m^(ed) = m (mod n).
(a) To explain the procedure concretely, we start with a very small example. Receiver
Y(ollanda) chooses prime numbers p = 3 and q = 5. Therefore, n = 15 and (p − 1) ∙
(q − 1) = 2 ∙ 4 = 8. Since e = 3 is coprime with 8, she can choose (n, e) = (15, 3) as her
public key. To determine her private key d, she uses the extended Euclidean algorithm
for (p − 1) ∙ (q − 1) = 8 and e = 3. Here first is the iterated division with remainder.
8 = 2 ∙ 3 + 2
3 = 1 ∙ 2 + 1
2 = 2 ∙ 1 + 0
Since the division in the last equation leaves no remainder, the divisor 1 is the greatest
common divisor, but this was already clear in this small example anyway. Much more
important here is the fact that, calculating iteratively backwards from the previous equations,
one can represent the greatest common divisor 1 as the multiple sum of (p − 1)
∙ (q − 1) = 8 and e = 3, viz.
1 = 3 − 1 ∙ 2 = 3 − 1 ∙ (8 − 2 ∙ 3) = 3 ∙ 3 − 1 ∙ 8 = 3 ∙ e − 1 ∙ (p − 1) ∙ (q − 1) = d ∙ e + b ∙ (p − 1) ∙ (q − 1).
So d = 3 is the private key of Y. For example, let m = 7 be the message to be sent. Then
sender X(avier) computes the value m^e (mod n), so 7^3 = 343 = 13 (mod 15), and therefore
sends 13. Receiver Y uses her private key d = 3, computes 13^d = 13^3 = 2197 = 7
(mod 15), and thus receives the message m = 7.
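The numbers of this small example can be checked with a few lines of Python. This is only a sketch with toy parameters; the helper names `encrypt` and `decrypt` are chosen for illustration, and the modular inverse via `pow(e, -1, phi)` assumes Python 3.8 or later.

```python
p, q, e = 3, 5, 3                  # toy parameters from example (a)
n = p * q                          # public modulus n = 15
phi = (p - 1) * (q - 1)            # (p - 1) * (q - 1) = 8
d = pow(e, -1, phi)                # private key d with d * e = 1 (mod phi)


def encrypt(m):
    """Sender side: m^e (mod n) with the public key (n, e)."""
    return pow(m, e, n)


def decrypt(c):
    """Receiver side: c^d (mod n) with the private key d."""
    return pow(c, d, n)


c = encrypt(7)
print(d, c, decrypt(c))
```

With real key sizes the same three calls to `pow` are used; only the generation of p and q and the padding of m, which this sketch omits, become more involved.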
(b) Here is a slightly larger example. For the prime numbers p = 17 and q = 19,
n = 17 ∙ 19 = 323 and (p − 1) ∙ (q − 1) = 16 ∙ 18 = 2^5 ∙ 3^2 = 288. Since e = 5 is
coprime with 288, receiver Y(ollanda) can choose (n, e) = (323, 5) as her public
key in this case. By means of the extended Euclidean algorithm we get
288 = 57 ∙ 5 + 3
5 = 1 ∙ 3 + 2
3 = 1 ∙ 2 + 1
2 = 2 ∙ 1 + 0
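Carried out as in example (a), the backward substitution for these four equations can be sketched as follows. This is a reconstruction, assuming that d is again made positive by the shift described for the extended Euclidean algorithm; the symbols d and b are used as in the text.

```latex
\begin{align*}
1 &= 3 - 1 \cdot 2 = 3 - 1 \cdot (5 - 1 \cdot 3) = 2 \cdot 3 - 1 \cdot 5 \\
  &= 2 \cdot (288 - 57 \cdot 5) - 1 \cdot 5 = 2 \cdot 288 - 115 \cdot 5 \\
  &= (-115 + 288) \cdot 5 + (2 - 5) \cdot 288 = 173 \cdot 5 - 3 \cdot 288 .
\end{align*}
```

Under this reading, d = 173 and b = −3, and indeed 173 ∙ 5 = 865 = 3 ∙ 288 + 1.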
The RSA cipher requires a faster procedure for computing m^k (mod n) for natural numbers
k than the obvious successive multiplication by m. To do this, write k as a binary expansion
k = k(r) ∙ 2^r + … + k(1) ∙ 2^1 + k(0) ∙ 2^0 = (…((k(r) ∙ 2 + k(r − 1)) ∙ 2 + k(r − 2)) ∙ 2 + … + k(1)) ∙
2 + k(0) with k(i) = 0 or k(i) = 1, but in any case k(r) = 1. Then m^k = ((…((m^2 ∙ m^(k(r − 1)))^2 ∙
m^(k(r − 2)))^2…)^2 ∙ m^(k(1)))^2 ∙ m^(k(0)). This is referred to as repeated squaring.
Each step consists of a squaring and, for k(i) = 1, of an additional multiplication by m. If
you do not want to calculate m^k itself, but only the remainder m^k (mod n), you form the
remainder modulo n after each squaring and multiplication.
Let us illustrate this with an example [Hau1] and choose m = 296, k = 53 and n = 13 ∙
23 = 299. Then k = 53 = 2^5 + 2^4 + 2^2 + 2^0 and therefore m^k = 296^53 = ((((296^2 ∙ 296)^2)^2 ∙
296)^2)^2 ∙ 296. We now calculate successively.
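The successive calculation can be traced with a short Python sketch. The function name `power_mod` is an assumption for illustration; Python's built-in three-argument `pow` does the same job and is used here only as a cross-check.

```python
def power_mod(m, k, n):
    """Compute m^k (mod n) by repeated squaring for k >= 1."""
    bits = bin(k)[2:]              # binary expansion, leading bit k(r) = 1 first
    result = m % n                 # accounts for the leading bit
    for bit in bits[1:]:
        result = (result * result) % n     # squaring step, reduced modulo n
        if bit == "1":
            result = (result * m) % n      # extra multiplication for k(i) = 1
        print(bit, result)                 # intermediate remainder after this bit
    return result


value = power_mod(296, 53, 299)
print(value == pow(296, 53, 299))
```

For k = 53 the loop runs five times, one pass per remaining binary digit, instead of the 52 multiplications of the naive method.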
One can show the following [BNS, Kob]: Knowing the public key (n, e), it is just as “hard”
to factorize n into its prime factors p and q as it is to compute the private key d. If
one believes the statement “Factorizing in reasonable time is hard”, this is exactly the
situation needed for a public key cipher. Nevertheless, it must also be clearly stated that it is
unknown whether one really needs the private key d for decryption or whether there may
be other efficient methods.
In order to attack RSA, one therefore tries as a matter of priority to develop factorization
algorithms that are as fast as possible (Sect. 3.4). For security reasons, the BSI guideline
[BSI1] recommends that the RSA modulus n = p ∙ q should have a length of the
order of 2000 bits in its binary expansion. It is therefore a natural number with about 600
decimal digits. In view of increasing computer performance, however, the BSI recommends
using RSA moduli n with a length of 3000 bits for a deployment period beyond 2022.
The other part of the public key, e, can be chosen to be quite small, but not too small for
security reasons.
Among all possible attack methods (Sect. 2.1), chosen plaintext attacks are the most
basic against public-key schemes, since any attacker knows the public keys and can there-
fore encrypt all plaintexts of his choice. Let us briefly consider here that RSA is vulnerable
to certain chosen ciphertext attacks. To this end, let (n, e) be the RSA public key of participant
X(avier). We assume attacker A(rchibald) intercepts the ciphertext c, but does not
know the corresponding plaintext message m with c = m^e (mod n). However, he cannot have
c itself decrypted either, because that would be too conspicuous. Then he chooses a plaintext m1
and encrypts it to c1 = (m1)^e (mod n). Attacker A can assume that c1 is coprime with n,
otherwise he would have found a divisor of n and thus cracked the method. So he uses the
extended Euclidean algorithm to compute the multiple sum representation 1 = x ∙ c1 + y ∙ n.
Consequently, x ∙ c1 = 1 (mod n), i.e., (c1)^(−1) = x (mod n). Thus he forms c2 = (c1)^(−1) ∙ c
(mod n) and has the more innocuous c2 decrypted to m2, i.e. c2 = (m2)^e (mod n). But then c = c1
∙ c2 = (m1)^e ∙ (m2)^e = (m1 ∙ m2)^e (mod n), so attacker A can decrypt the ciphertext c to the
plaintext m = m1 ∙ m2 (mod n).
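The attack can be replayed on the toy key of example (b). In the following Python sketch, `decryption_oracle` is a hypothetical stand-in for the key owner who is willing to decrypt an innocuous-looking ciphertext; the values m = 42 and m1 = 2 are arbitrary choices for illustration.

```python
from math import gcd

p, q, e = 17, 19, 5                 # toy key from example (b)
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))   # private key, known only to the key owner


def decryption_oracle(c):
    """Hypothetical key owner decrypting a ciphertext that looks harmless."""
    return pow(c, d, n)


m = 42                              # plaintext unknown to the attacker
c = pow(m, e, n)                    # intercepted ciphertext c = m^e (mod n)

m1 = 2                              # plaintext chosen by the attacker
c1 = pow(m1, e, n)
assert gcd(c1, n) == 1              # otherwise a divisor of n has been found
c2 = (pow(c1, -1, n) * c) % n       # c2 = c1^(-1) * c (mod n)
m2 = decryption_oracle(c2)          # the owner decrypts c2 without suspicion
recovered = (m1 * m2) % n           # m = m1 * m2 (mod n)
print(recovered == m)
```

The attacker never sees d; the multiplicative structure of RSA alone is enough to unblind the answer.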
Given the size of n, it is not surprising that the RSA cipher is very slow despite the repeated
squaring method; at any rate, it requires much more computation time than symmetric ciphers.
Therefore, both are usually used in combination in practice:
• With the public-key cipher RSA, one merely exchanges the key necessary for a sym-
metric cipher (so-called key exchange).
• For the actual transmission of information, the much faster symmetric cipher is then
used, e.g. Triple-DES or AES with the secretly exchanged key.
This is also the reason why in practice the message m in an RSA cipher, i.e., the key of a
symmetric cipher, is always smaller than the RSA modulus n. For the 128-bit key k = k0 k1
k2…k127 of AES, for example, one uses its binary expansion m = k0 ∙ 2^0 + k1 ∙ 2^1 + k2 ∙
2^2 + k3 ∙ 2^3 + … + k127 ∙ 2^127 to make it a natural number m for the RSA cipher.
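The conversion between key bits and the natural number m can be sketched in Python as follows. The bit list `key_bits` is randomly generated here purely for illustration; in practice padding schemes are applied before RSA encryption, which this sketch leaves out.

```python
import secrets

key_bits = [secrets.randbits(1) for _ in range(128)]       # k0, k1, ..., k127
m = sum(bit << i for i, bit in enumerate(key_bits))         # m = k0*2^0 + ... + k127*2^127
recovered_bits = [(m >> i) & 1 for i in range(128)]         # read the bits back from m
print(m.bit_length() <= 128, recovered_bits == key_bits)
```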
located. The OSI model is shown in the left column of Table 3.1. The more rudimentary a
communication is, the fewer layers are necessary or the more rudimentary the protocols on
the upper layers can be. If, for example, a communication runs completely without users,
no application layer is necessary. If only point-to-point connections are involved, no net-
work layer is required.
Especially for the Internet and the Internet protocol family, the seven layers of the OSI
model are usually combined into four levels in the TCP/IP model according to the right
column of Table 3.1. The basic elements are the IP protocol (Internet Protocol), and TCP
(Transmission Control Protocol), which organizes data transport. In the link layer, the
Ethernet protocol is often used as well as DSL (Digital Subscriber Line) for fast bit trans-
mission. The application layer is home to a variety of protocols.
3.2.2 Confidential Work on the Internet with HTTPS, SMTPS and FTPS
The protocol TLS (Transport Layer Security), formerly known as SSL (Secure Sockets
Layer), enables the secure transmission of information on the application layer via
TCP/IP-based connections on the Internet. TLS is located at the upper end of the transport layer
above TCP in the TCP/IP model, as Table 3.2 shows. It often works together with the fol-
lowing protocols of the application layer:
• HTTP (Hypertext Transfer Protocol), with which a user (client) can access the pages
of a provider (server) by means of a browser.
• SMTP (Simple Mail Transfer Protocol), which is used to send e-mails.
• FTP (File Transfer Protocol), which allows files to be downloaded from a server to the
client or uploaded from the client to the server.
To indicate the interaction with TLS, an S for “Secure” is usually appended to the protocol
of the application layer, i.e. HTTPS, SMTPS and FTPS. If, for example, you call up an
Internet page, you will find an “HTTPS” in its Internet address if the respective provider
wants to particularly secure the content of this page. This is particularly the case if book-
ings can be made on the page or other transactions can be carried out. The short message
service Twitter also uses the TLS infrastructure for secure data transmission.
Preferably, TLS is operated with the symmetric cipher AES in CBC or CTR mode and
keys of length 128 or 256 bits. Triple DES no longer plays a significant role. RSA, among
others, can be used as a public-key cipher. TLS consists of several subprotocols. With the
TLS Handshake Protocol, the user (client) and provider (server) determine the cipher
method to be used and agree on the key for the symmetric cipher. If RSA is used for the
key exchange, the server sends its certified public RSA key to the client. The client then
sends the server a secret random natural number encrypted with this key, which is to be
used as the AES key. The server decrypts the random number with its private RSA key.
After that, the TLS Record Protocol, the actual heart of TLS, can begin. It encrypts
the communication on the Internet, for example via a browser, with AES [WPTLS,
WPTLSe, BSI2].
WLAN (Wireless Local Area Network, or Wi-Fi) refers to a local wireless network. This
can involve larger installations with a central server, but devices (router, laptop, printer, etc.)
in the private environment or in office communication are also often networked wirelessly
with each other via a WLAN. In connection with the Internet, it is often only used
as an interface through which a laptop or smartphone connects to the Internet wirelessly
via a nearby router.
As the successor to WEP (Wired Equivalent Privacy), which was considered insecure,
the new WLAN standard specifies the WPA2 (Wi-Fi Protected Access 2) method. WEP
was based on the stream cipher RC4 (Ron’s Code 4) from 1987 by Ronald Rivest (born
1947), which was kept secret but was anonymously made public in 1994. WPA2, on the
other hand, uses AES for data encryption with a key length of 128 bits in CTR operat-
ing mode.
In large WLAN installations, the server has a certified RSA key that can be used to
exchange the AES key via the so-called EAP-TLS protocol. In smaller networks in the
so-called SoHo domain (Small Office, Home Office), the PSK (Pre-shared key) procedure
is usually used. The PSK key must be known to all devices in the WLAN, as it is used to
generate the AES key. It can usually be entered on the various devices, and changing it
regularly also increases security [WPWP2, BSI3].
Now, at the latest, a fundamental question arises. How on earth is it possible to obtain suf-
ficiently large prime numbers for an RSA cipher? One thing is reassuring: There are infi-
nitely many prime numbers, hence arbitrarily large ones, as we have known since Euclid.
But in order to definitively determine whether a natural number p is really a prime number,
one must actually check that it is not divisible by any number greater than 1 and smaller than p,
for which checking the numbers up to √p is sufficient. In practice, however, it certainly cannot
work like this, because otherwise one could search the number n = p ∙ q for divisors in this way
in a reasonable runtime. But that exactly this should not be possible was the basic idea of the
RSA cipher. Is this already the practical end of RSA? Not quite: There are other primality tests.
These are quite ingenious, but usually have a flaw: They only prove with a certain probability
that a given number is a prime number.
To understand the procedure, we start with a simple method based on Fermat’s little theorem.
Let n be an odd natural number that we want to test for primality.
Let k be a fixed number of samples. Then we choose k random natural numbers a in
the range from 2 to n − 1, which are coprime with n. The Euclidean algorithm can quickly
decide on the coprimeness. If one finds an a for which a^(n − 1) is not equal to 1 modulo n, then
n is certainly not a prime number according to Fermat’s little theorem. If one finds no such
a after k samples, then n is at least possibly a prime number. Unfortunately, it may actually
happen that n passes this test even for all numbers a that are coprime with n, without actually
being a prime number. Such numbers are called Carmichael numbers, and there are even
infinitely many of them, the smallest being 561 = 3 ∙ 11 ∙ 17. So this test yields the
statement “possibly prime” for infinitely many n, although this is factually not
true at all.
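The test and its weakness can be tried out with a small Python sketch. The function names `passes_fermat` and `fermat_test` are assumptions for illustration; the sketch follows the description above and redraws a sample until it is coprime with n.

```python
import random
from math import gcd


def passes_fermat(n, a):
    """True if a^(n-1) = 1 (mod n), i.e. the sample a does not expose n."""
    return pow(a, n - 1, n) == 1


def fermat_test(n, k=10):
    """Return False if n is certainly composite, True if n is possibly prime."""
    for _ in range(k):
        a = random.randrange(2, n)          # random sample with 2 <= a <= n - 1
        while gcd(a, n) != 1:               # redraw until a is coprime with n
            a = random.randrange(2, n)
        if not passes_fermat(n, a):
            return False
    return True


# Every sample coprime with the Carmichael number 561 passes the test.
coprime_samples = [a for a in range(2, 561) if gcd(a, 561) == 1]
print(all(passes_fermat(561, a) for a in coprime_samples))
```

So `fermat_test(561)` reports "possibly prime" no matter how many samples are drawn, although 561 is composite.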
Let p be a prime number greater than 2 and let a be a natural number coprime with p. Then
it follows from Fermat’s little theorem and the third binomial formula that p divides the number
a^(p − 1) − 1 = (a^((p − 1)/2) + 1)(a^((p − 1)/2) − 1). Since p is a prime number, p must divide either
a^((p − 1)/2) + 1 or a^((p − 1)/2) − 1. If the second case is true, then because a^((p − 1)/2) − 1 = (a^((p − 1)/4) + 1)
(a^((p − 1)/4) − 1), it follows that p divides either a^((p − 1)/4) + 1 or a^((p − 1)/4) − 1. If here again the
second case is true, then one continues this argument with a^((p − 1)/8) + 1 and a^((p − 1)/8) − 1. And
so one can go on and on, as long as in each case the second case is true, and successively
halve the exponent until this is no longer possible. This observation can also be expressed
the other way round. For this, let p − 1 = s ∙ 2^t with an odd number s. Then in our procedure
at some point the first case is true, or p finally divides a^s − 1. In the first case, however,
there is a j smaller than t such that p is a divisor of (…((a^s)^2)^2…)^2 + 1, where a^s is
squared j times.
The Miller-Rabin primality test dates from 1976 and is named after Gary Miller and
Michael Rabin (b. 1931). It is based on the above simple corollary from Fermat’s little
theorem and works as follows: Let n be the odd natural number to be tested. For this, one
again writes n − 1 = s ∙ 2^t with an odd number s. Further, let k be a fixed number of samples.
Then choose k random natural numbers a in the range from 2 to n − 1 that are
coprime with n. The coprimeness is again easy to determine using the Euclidean algorithm.
One then checks whether a^s = 1 (mod n) or a^s = −1 (mod n) holds. If yes, then one
places a check mark against the chosen sample a and takes the next random number. If no, then one
checks by iterative squaring whether (…((a^s)^2)^2…)^2 = −1 (mod n) holds for some j smaller
than t, where a^s is squared j times. If this is the case, then one places a check mark against
the selected sample a and takes the next random number. If no check mark can be placed
against a sample, then n is certainly not a prime number. If, however, a check mark is placed
against all samples, then n is a prime number candidate in the sense of Miller-Rabin.
After the experience with Carmichael numbers, one must now naturally ask the question:
How many numbers pass the Miller-Rabin test as prime number candidates, even
though they are not actually prime numbers? But now we are in a much better situation.
One can show, although this is not so easy and we will therefore not do it here: The probability that
a number n passes the Miller-Rabin test for a randomly chosen a, although it is not a prime
number at all, is at most 1/4 [Kob, Buc]. Thus, if one performs the test for k independent
samples a, the probability of error is at most (1/4)^k and can therefore be made arbitrarily
small by choosing enough samples. For k = 5, for example, this probability is already
(1/4)^5 = 1/1024, i.e., smaller than 1‰. Natural numbers that pass the Miller-Rabin test for
sufficiently many samples can thus be safely used as prime numbers for an RSA cipher. This is
indeed the way to get large prime numbers for RSA. The Miller-Rabin test is a so-called Monte
Carlo method, i.e. a random-based method that gives a false result only with a probability
bounded from above.
(mod n = 221). This is not equal to 1 and not equal to −1 modulo n = 221. So we also
compute (a^s)^2 = 47^2 = −1 (mod n = 221). This shows that n = 221 is a prime number
candidate in the sense of Miller-Rabin with respect to the one sample a = 174. We try
a second sample b = 137. Then b^s = 137^55 = 188 (mod n = 221), so again not equal to 1 and
not equal to −1 modulo n = 221. Also (b^s)^2 = 188^2 = 205 (mod n = 221) is not equal to −1
modulo n = 221. Therefore the sample b = 137 excludes the number n = 221 as a
prime number.
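The two samples for n = 221 can be re-checked with the following Python sketch of the Miller-Rabin test. The function names `decompose`, `passes_miller_rabin` and `miller_rabin` are assumptions for illustration; the random sampling redraws until a sample is coprime with n, as in the description of the test.

```python
import random
from math import gcd


def decompose(n):
    """Write n - 1 = s * 2^t with an odd number s and return (s, t)."""
    s, t = n - 1, 0
    while s % 2 == 0:
        s //= 2
        t += 1
    return s, t


def passes_miller_rabin(n, a):
    """True if the sample a earns a check mark for the odd number n."""
    s, t = decompose(n)
    x = pow(a, s, n)
    if x == 1 or x == n - 1:               # a^s = 1 or a^s = -1 (mod n)
        return True
    for _ in range(t - 1):                 # squarings j = 1, ..., t - 1
        x = (x * x) % n
        if x == n - 1:
            return True
    return False


def miller_rabin(n, k=20):
    """Return False if n is certainly composite, True for a prime number candidate."""
    for _ in range(k):
        a = random.randrange(2, n)
        while gcd(a, n) != 1:              # redraw until a is coprime with n
            a = random.randrange(2, n)
        if not passes_miller_rabin(n, a):
            return False
    return True


print(decompose(221), passes_miller_rabin(221, 174), passes_miller_rabin(221, 137))
```

The sample 174 gets its check mark although 221 = 13 ∙ 17 is composite, while 137 exposes the number, which is why several independent samples are drawn.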
The so-called Euler criterion, named after Leonhard Euler (1707–1783), is a slight
tightening of Fermat’s little theorem, which we will not prove here [Wil] and which states
the following: If p is a prime number greater than 2 and a is a natural number coprime
with p, then
a^((p − 1)/2) = 1 (mod p) if a is a square modulo p, and a^((p − 1)/2) = −1 (mod p) otherwise.
Let us look at a simple example. Let p = 7. Then 1^2 = 1 (mod 7), 2^2 = 4 (mod 7), 3^2 = 2
(mod 7), 4^2 = 2 (mod 7), 5^2 = 4 (mod 7), and 6^2 = 1 (mod 7). So the squares modulo 7 are
exactly the numbers 1, 2 and 4. For the square a = 2 (mod 7) we get a^((p − 1)/2) = 2^3 = 1 (mod
7), and for the non-square a = 3 (mod 7) we get a^((p − 1)/2) = 3^3 = 6 = −1 (mod 7) in agreement
with Euler’s criterion.
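The same check can be run for all residues modulo 7 at once. This is a minimal Python sketch in which the value p − 1 stands for −1 modulo p; the variable names are chosen for illustration.

```python
p = 7
squares = {(x * x) % p for x in range(1, p)}       # the squares modulo p
for a in range(1, p):
    value = pow(a, (p - 1) // 2, p)                 # a^((p-1)/2) mod p
    expected = 1 if a in squares else p - 1         # p - 1 represents -1
    print(a, value, value == expected)
```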
Here is another primality test as an example of a Monte Carlo method, namely the
Solovay-Strassen primality test published by Robert Solovay (b. 1938) and Volker Strassen (b.
1936) in 1977. It uses the Euler criterion and otherwise follows the same strategy as the
Miller-Rabin test. Namely, let n be the odd natural number to be tested and k be a fixed
number of samples. Then one chooses k random natural numbers a in the range from 2 to
n − 1, which are coprime with n, and calculates a^((n − 1)/2) (mod n). If one finds an a that does
not satisfy the Euler criterion (for n instead of p), then n is certainly not a prime number.
However, if the Euler criterion is satisfied for all samples a, then n is a prime number
candidate in the sense of Solovay-Strassen.
If, in the Solovay-Strassen test, a^((n − 1)/2) = 1 (mod n) or = −1 (mod n), then one must also
decide whether a is a square modulo n. To do this, one does not try to explicitly calculate
a “square root” b modulo n from a, which would be difficult for large composite n anyway.
Much more effective is the method of computing with so-called Legendre and Jacobi symbols
[Wil]. However, the running time of the Solovay-Strassen test is still worse than that
of Miller-Rabin. In addition, the probability that a number n passes the test for a, although
it is not a prime number, is twice as high as in the Miller-Rabin test, namely at most 1/2
[Kob]. If, however, the test is carried out again for k independent samples a, the error probability
is at most (1/2)^k, and here too we very quickly arrive at an extremely small residual risk.
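A possible Python sketch of the test is given below. It assumes the usual formulation in which a^((n − 1)/2) is compared with the Jacobi symbol of a and n; the function names `jacobi`, `passes_solovay_strassen` and `solovay_strassen` are chosen for illustration, and the Jacobi symbol is computed with the standard reciprocity-based algorithm rather than any method specific to the text.

```python
import random
from math import gcd


def jacobi(a, n):
    """Jacobi symbol (a/n) for an odd natural number n; returns -1, 0 or 1."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:                  # pull out factors of 2
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                        # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def passes_solovay_strassen(n, a):
    """True if the sample a satisfies the Euler criterion for n."""
    symbol = jacobi(a, n) % n              # maps -1 to n - 1
    return symbol != 0 and pow(a, (n - 1) // 2, n) == symbol


def solovay_strassen(n, k=20):
    """Return False if n is certainly composite, True for a prime number candidate."""
    for _ in range(k):
        a = random.randrange(2, n)
        while gcd(a, n) != 1:              # redraw until a is coprime with n
            a = random.randrange(2, n)
        if not passes_solovay_strassen(n, a):
            return False
    return True


print(passes_solovay_strassen(221, 174), passes_solovay_strassen(221, 137))
```

For n = 221 the sample 174 again slips through, while 137 shows that 221 is composite.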
It is true that in 2002, with the AKS primality test [Wil, Hau3], a deterministic method
was published for the first time, which thus identifies prime numbers as such with certainty
and which also has “in principle a reasonable running time”. However, despite some
improvements in the meantime, this running time is still too high for practical applications.
Unfortunately, if a number fails a primality test, one has no clue as to what factors this number has.
But this is exactly what one would need to know in order to calculate the decomposition
n = p ∙ q into prime numbers from the public key (n, e) of an RSA cipher and thus crack
the cipher. So, primality tests do not provide a starting point for cryptanalysis. Successively
testing divisibility by the numbers up to √n is much too slow for a magnitude of over 2000
bits, or 600 decimal digits. So let’s do some cryptanalysis again and look for faster methods
for factorization in order to crack the RSA method or quantify its security.
The following factorization method again goes back to Pierre de Fermat (1607–1665).
The basic idea here is to write the natural number n to be factorized as the difference of
two squares, i.e. n = x^2 − y^2 with two natural numbers x and y. Using the third binomial
formula, this results in n = x^2 − y^2 = (x + y) ∙ (x − y), i.e. a factorization of n.
But first you need the largest natural number s less than or equal to √n. This is best
done with the Heron method according to Heron of Alexandria. You start with x0 = n and
iteratively calculate x_{i+1} = (x_i + n/x_i)/2 until you find the first k such that x_k − x_{k+1} is less
than 1. Then s is the integer part of x_{k+1}. If s = √n, then already n = s^2. Otherwise, the
Fermat factorization successively computes (s + 1)^2 − n, (s + 2)^2 − n,…, (s + i)^2 − n,…,
until one finds a square number. Since these differences are relatively small numbers,
one can test this by successive trying, or one again uses the Heron
method. However, one does not successively recalculate the squares (s + i)^2, but uses the
first binomial formula (s + (i + 1))^2 = ((s + i) + 1)^2 = (s + i)^2 + 2 ∙ (s + i) + 1, thus adding 2 ∙
(s + i) + 1 to the square of the predecessor (s + i)^2. Finally, if by this procedure one has
identified an i for which (s + i)^2 − n = a^2 is a square number, then by the third binomial
formula n = (s + i)^2 − a^2 = (s + i + a) ∙ (s + i − a), and one has found two factors s + i + a and
s + i − a of n.
The Fermat method thus searches for the divisor closest to √n and arrives at a solution
in a few iterations if the number n can be decomposed into two factors of approximately
equal size. As a consequence for the security of the RSA cipher it follows that the two
prime numbers p and q in n = p ∙ q must not be too close to each other.
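A compact Python sketch of the Fermat factorization follows. It assumes an odd number n, and it uses `math.isqrt` (Python 3.8 or later) in place of the Heron method for the integer square roots; the function name `fermat_factor` is chosen for illustration.

```python
from math import isqrt


def fermat_factor(n):
    """Split an odd n into two factors x + a and x - a with n = x^2 - a^2."""
    s = isqrt(n)                       # largest natural number with s^2 <= n
    if s * s == n:
        return s, s
    x = s + 1
    square = x * x
    while True:
        diff = square - n              # candidate for a^2
        a = isqrt(diff)
        if a * a == diff:
            return x + a, x - a
        square += 2 * x + 1            # (x + 1)^2 = x^2 + 2*x + 1
        x += 1


print(fermat_factor(323))
```

For n = 323 = 17 ∙ 19 the very first candidate 18^2 − 323 = 1 is already a square, which illustrates why p and q must not lie too close together.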
Instead of determining natural numbers x and y with n = x^2 − y^2 for a given natural number
n, one can also search for x and y more generally with x^2 = y^2 (mod n). Namely, then n is
a divisor of the difference x^2 − y^2 = (x + y) ∙ (x − y), and if n divides neither x + y nor x − y,
then one can use the Euclidean algorithm to compute the greatest common divisor of n and
x + y or x − y, respectively, and thus find a proper divisor of n. We make the procedure clear with
an example [WPQSi]. Let n = 1649. Then for x1 = 41, the equation x1^2 = 41^2 = 2^5 (mod
1649), and for x2 = 43, x2^2 = 43^2 = 2^3 ∙ 5^2 (mod 1649) hold. If we multiply both equations,
it follows that (41 ∙ 43)^2 = 41^2 ∙ 43^2 = 2^5 ∙ 2^3 ∙ 5^2 = (2^4 ∙ 5)^2 (mod 1649). Thus we have
found x = 41 ∙ 43 and y = 2^4 ∙ 5 with x^2 = y^2 (mod 1649). Since the greatest common divisor
of x − y = 41 ∙ 43 − 2^4 ∙ 5 = 1683 and n = 1649 equals 17, the desired factorization is
n = 1649 = 17 ∙ 97.
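The arithmetic of this example can be verified in a few lines of Python; the variable names are chosen for illustration.

```python
from math import gcd

n = 1649
x = 41 * 43
y = 2 ** 4 * 5
print((x * x - y * y) % n == 0)    # x^2 = y^2 (mod n)
p = gcd(x - y, n)                  # Euclidean algorithm on x - y and n
q = n // p
print(p, q)                        # prints 17 97
```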
In the search for x and y we therefore proceeded in such a way that we first determined
the remainders y_i with x_i^2 = y_i (mod n) for the samples x1 and x2. In our example, y1 and y2
had only the very small prime divisors 2 and 5. This construction principle can be used in
general by allowing only small prime divisors p1 = 2, p2 = 3, p3 = 5,…, p_{b−1}, p_b for y_i, up
to a self-defined bound b. These prime numbers are then called the factor base. The factorization
method originating from John Dixon thus seeks m distinct numbers x_i such that the
remainders y_i of x_i^2 modulo n consist only of prime factors of the factor base and that the
remainder of y1 ∙ … ∙ y_m modulo n is a square y^2. Indeed, the latter can then be easily read from
the exponents of the factor base [Wil]. In any case, from x_i^2 = y_i (mod n) and y1 ∙ … ∙ y_m = y^2
(mod n), it then follows for x = x1 ∙ … ∙ x_m that x^2 = y^2 (mod n).
Complementing Dixon’s method, Carl Pomerance (b. 1944) has developed a method
that systematically searches for the x_i using a sieve [Wil, Buc]. One therefore speaks altogether
of the quadratic sieve.
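How the square is read off from the exponents can be illustrated with the numbers of the example n = 1649. The following Python sketch uses trial division over a small factor base; the function name `exponents_over_base` and the choice of the base 2, 3, 5 are assumptions for illustration, and no sieve is involved.

```python
def exponents_over_base(y, base):
    """Return the exponent list of y over the factor base, or None if y does not factor."""
    if y <= 0:
        return None
    exponents = []
    for p in base:
        e = 0
        while y % p == 0:
            y //= p
            e += 1
        exponents.append(e)
    return exponents if y == 1 else None


n = 1649
base = [2, 3, 5]
relations = {x: exponents_over_base((x * x) % n, base) for x in (41, 42, 43)}
total = [a + b for a, b in zip(relations[41], relations[43])]
print(relations, total)
```

The remainder for x = 42 is discarded because it contains the prime factor 23, while the exponents for 41 and 43 add up to even numbers only, so their product is a square.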
We now want to get to know a method of factorization developed by John Pollard (born
1941), namely Pollard’s ρ-factorization from 1975. Let n again be the natural number to
be factorized. Then one considers a sequence x_i of natural numbers, starting for this purpose
with a small natural number x0, for example x0 = 1 or x0 = 2. One computes the
sequence recursively by x_{i+1} = x_i^2 + 1 (mod n) up to an x_b, imposing this “pain threshold” b
on oneself and increasing it in case of failure. In the hope of finding a proper divisor of n,
one now uses the Euclidean algorithm to successively determine the greatest common
divisor of n and
x1 − x0
x2 − x0, x2 − x1
x3 − x0, x3 − x1, x3 − x2
…
x_b − x0, x_b − x1, x_b − x2, …, x_b − x_{b−2}, x_b − x_{b−1}
The name ρ-method is derived from the Greek letter ρ. One draws the sequence x0, x1,
x2,… as a chain of points. If one finds a prime number p as a divisor of n and x_k − x_j, then
because of x_k = x_j (mod p) the sequence of the x_i becomes periodic modulo p, and the chain
closes, so to speak, modulo p at the indices j and k. This is visualized in Fig. 3.2.
Pollard’s ρ-method has a particularly good chance of finding a proper divisor of n if the
number n has at least one smaller factor. As a consequence for the security of the RSA
cipher it follows that the two prime numbers p and q in n = p ∙ q should not be too far apart,
because otherwise one of them becomes relatively small.
We choose as an example [Hau3] the number n = 143, start with x0 = 2 and choose as pain
threshold b = 6. Then we get the following values for x1 to x6:
x0 x1 x2 x3 x4 x5 x6
2 5 26 105 15 83 26
Now we successively compute the greatest common divisor of n = 143 and x_k − x_j for
k greater than j and k and j less than or equal to b = 6, finding the first proper divisor of n
at x4 − x0 = 13. Thus 143 = 13 ∙ 11.
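The example can be reproduced with the following Python sketch, which compares all pairs of sequence elements exactly as in the triangular scheme. The function name `pollard_rho_pairs` and its return format are assumptions for illustration; practical implementations usually compare far fewer pairs.

```python
from math import gcd


def pollard_rho_pairs(n, x0=2, b=6):
    """Return (divisor, (j, k), sequence) for the first proper divisor found, if any."""
    xs = [x0]
    for _ in range(b):
        xs.append((xs[-1] ** 2 + 1) % n)       # x_{i+1} = x_i^2 + 1 (mod n)
    for k in range(1, b + 1):
        for j in range(k):
            g = gcd(xs[k] - xs[j], n)
            if 1 < g < n:                      # proper divisor of n
                return g, (j, k), xs
    return None, None, xs


divisor, indices, sequence = pollard_rho_pairs(143)
print(sequence, indices, divisor)
```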
The second factorization method by John Pollard (born 1941), which we will describe
here, dates from 1974 and is called Pollard’s p − 1-factorization. Let n again be the natu-
ral number to be factorized. First, one chooses a natural number b as the pain threshold and
computes the least common multiple k of all the natural numbers from 1 to b. Then one
chooses a random natural number a in the range from 2 to n − 1 and determines the great-
est common divisor of a and n using the Euclidean algorithm. If you find a proper divisor,
you have already factorized n. So, you can assume that a and n are coprime. Now calculate
the remainder c = a^k (mod n). If c = 1 (mod n), then one tries another a or changes the pain
threshold b. Otherwise, use the Euclidean algorithm to determine the greatest common
divisor of c − 1 and n, hoping to find a proper divisor of n.
When and why does Pollard’s p − 1 method work? To examine this, suppose n has a
prime divisor p such that p − 1 can be written as the product of relatively small prime pow-
ers; more precisely, all such powers should be at most equal to the pain threshold b. Then
k is a multiple of p − 1, so we can write k as k = (p − 1) ⋅ k′ with a natural number k′.
Because of c = a^k (mod n), also c = a^k (mod p), and from Fermat’s little theorem follows
c = a^k = (a^(p − 1))^k′ = 1 (mod p), i.e., p divides both n and c − 1. So in this case one finds a
proper divisor of n as the greatest common divisor of c − 1 and n.
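Pollard's p − 1 method can be sketched in Python as follows. The function name `pollard_p_minus_1` and the example values are assumptions for illustration: n = 299 = 13 ∙ 23 is taken with a = 2 and pain threshold b = 4, so that k = 12 is a multiple of 13 − 1.

```python
from math import gcd


def pollard_p_minus_1(n, a, b):
    """Return a proper divisor of n, or None if this choice of a and b fails."""
    g = gcd(a, n)
    if 1 < g < n:
        return g                           # a already shares a factor with n
    k = 1
    for i in range(2, b + 1):
        k = k * i // gcd(k, i)             # least common multiple of 1, ..., b
    c = pow(a, k, n)                       # c = a^k (mod n)
    if c == 1:
        return None                        # try another a or another b
    g = gcd(c - 1, n)
    return g if 1 < g < n else None


print(pollard_p_minus_1(299, 2, 4))
```

The prime factor 13 is found because 13 − 1 = 2^2 ∙ 3 consists only of prime powers up to 4, whereas 23 − 1 = 2 ∙ 11 does not.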
We now want to take care of a second difficult problem of mathematics and see how to use
it for a public key cipher. This is the so-called discrete logarithm, which is easier to
understand than the term first suggests.
Let p again be a prime number. We know from Fermat’s little theorem (Sect. 3.1) that for
all natural numbers c that are smaller than p and thus coprime with p, the statement
c^(p − 1) = 1 (mod p) holds. What Fermat’s little theorem does not rule out, however, is the
possibility that there are also smaller exponents a in the range from 1 to p − 2 with c^a = 1
(mod p). For example, for prime numbers p greater than 2, Euler’s criterion (Sect. 3.3) says
that for squares c modulo p, c^((p − 1)/2) = 1 (mod p) already holds. Let us consider another
concrete example. For p = 7 and c = 2, already c^((p − 1)/2) = 2^3 = 8 = 1 (mod p = 7). For g = 3,
however, 3^1 = 3 (mod 7), 3^2 = 2 (mod 7), 3^3 = 6 (mod 7), 3^4 = 4 (mod 7), 3^5 = 5 (mod 7)
and 3^6 = 1 (mod 7), and consequently here only g^(p − 1) equals 1 modulo p.
This observation on a small example is also valid in general. For a prime number p
greater than 2 there is always at least one natural number g less than p such that g^a is not
equal to 1 modulo p for all a in the range from 1 to p − 2. One calls g a generating element
modulo p. In contrast to Fermat’s little theorem, however, this statement is not quite
so obvious. We therefore omit a formal derivation [Wil, Buc].
The actual reason for the existence of generating elements is that the remainders modulo
p form a field in the mathematical sense with respect to their addition and multiplication,
as we have already seen with the bytes (Sect. 2.8) and with the example of hard disk
encryption (Sect. 2.9). Every remainder r not equal to 0 modulo p has a multiplicative
inverse. Since r and the prime number p are coprime, the inverse r^(−1) = r0 (mod p) is simply
determined by the extended Euclidean algorithm from r ∙ r0 = 1 (mod p). It also follows that
the powers 1 = g^0, g^1, g^2,…, g^(p − 2) of the generating element g pass through all residues not
equal to 0 modulo p, thus generating them. Indeed, if g^i = g^j (mod p) with i not greater than j,
multiply by the inverse g^(−i) = g^(p − 1 − i) modulo p and obtain g^(j − i) = 1 (mod p).
Consequently, i = j, and the g^i are pairwise different modulo p.
The mathematical derivation that there is always at least one generating element g modulo
a prime number p greater than 2 is only one side of the coin. But how to obtain such a
generating element g0 in concrete terms, especially when, as in our case, very large prime
numbers p are involved, is a completely different question. The answer is: You do it by trial
and error, so you choose a random natural number g0 smaller than p and then have to
test in principle whether all powers g0^a for a in the range from 1 to p − 2 are not equal
to 1 modulo p. This will be very laborious for large p.
However, there is a criterion which makes the whole thing easier, but which we also do
not want to derive formally [Wil, Buc]. Namely, one only has to check that g0^((p − 1)/q) is not
equal to 1 modulo p for all prime numbers q that divide p − 1. And the most efficient way
to do this is to use the method of repeated squaring (Sect. 3.1). But what is the probability
of success with the random choice of g0? This in turn depends on the prime factorization
of p − 1. If, for example, p − 1 = 2 ∙ q with a prime number q, the probability is about 1/2.
We also consider a concrete example of this [Buc]. Let p = 23, so p − 1 = 2 ∙ 11. Then,
2^11 = 3^11 = 1 (mod 23), and hence 2 and 3 are eliminated as generating elements modulo
23. However, 5^2 = 2 (mod 23) and 5^11 = 22 (mod 23) as well as 7^2 = 3 (mod 23) and 7^11 = 22
(mod 23) hold. Therefore, g = 5 as well as g = 7 are generating elements modulo 23.
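The criterion translates directly into a short Python sketch. The function names `prime_factors` and `is_generator` are assumptions for illustration; the factorization of p − 1 is done by simple trial division, which is only adequate for small examples such as p = 23.

```python
def prime_factors(m):
    """Return the set of prime divisors of m by trial division."""
    factors, d = set(), 2
    while d * d <= m:
        while m % d == 0:
            factors.add(d)
            m //= d
        d += 1
    if m > 1:
        factors.add(m)
    return factors


def is_generator(g, p):
    """True if g^((p-1)/q) != 1 (mod p) for every prime q dividing p - 1."""
    return all(pow(g, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))


print([g for g in (2, 3, 5, 7) if is_generator(g, 23)])    # prints [5, 7]
```

Only two modular powers per candidate are needed for p = 23 instead of the 21 powers of the naive test.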
Let a generating element g modulo the prime number p with p greater than 2 be given. For
a given b the exponent a in b = g^a (mod p) with a in the range from 0 to p − 2 is called the
discrete logarithm of b to the base g modulo p. The term discrete logarithm derives from
the fact that this is the analogue of the logarithm of real analysis for finite (discrete) sets.
For a large prime number p and given g and b, computing a by successively trying the
powers g^1, g^2, g^3,… modulo p quickly reaches its limits. So we note: computing the discrete
logarithm (in manageable time) is a difficult problem. However, as with factoring, this cannot
be proven mathematically conclusively.
We consider again a small example. For p = 19, we calculate that g = 2 is a generating
element modulo 19. The discrete logarithm a of b = 7 to the base g = 2 is a = 6, because
2^6 = 64 = 7 (mod 19).
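For small primes the successive trying of powers is easy to write down. The following Python sketch uses the hypothetical function name `discrete_log`; its running time grows linearly with p, which is exactly why this approach is hopeless for large primes.

```python
def discrete_log(b, g, p):
    """Return the exponent a in 0..p-2 with g^a = b (mod p), or None if there is none."""
    power = 1                              # g^0
    for a in range(p - 1):
        if power == b % p:
            return a
        power = (power * g) % p            # next power g^(a+1) (mod p)
    return None


print(discrete_log(7, 2, 19))              # prints 6
```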
As with the RSA cipher, the question now arises of how to convert this difficult mathemat-
ical problem into a practicable public-key cipher, which in principle can be used to send
any message m in encrypted form (Sect. 3.8). However, we first recall the fact that public-
key ciphers are usually only used in combination with symmetric ciphers for their key
exchange. A method designed only for key exchange, based on the discrete logarithm, was
published in 1976 by Whit Diffie (b. 1944) and Martin Hellman (b. 1945). It was these
two who first proposed the idea of public-key cryptography in this seminal paper.
The Diffie-Hellman key exchange works as follows. Let p be a large prime number
and g a generating element modulo p. Let both g and p be publicly known. We now imag-
ine that X(avier) and Y(ollanda) want to agree on the key of a symmetric cipher for a
planned secret data transmission. For this key exchange, they both use the following pro-
cedure, which is visualized in Fig. 3.3.
(a) We start with a simple example [WPDHS]. We use the prime number p = 13 and
convince ourselves that g = 2 is a generating element modulo p = 13. X(avier) chooses the
random number e = 5, and Y(ollanda) chooses f = 8. Then X sends the value $6 = 2^5$
(mod 13) to Y, and Y sends $9 = 2^8$ (mod 13) to X. Now X computes the value $k = 3 = 9^5$
(mod 13), and Y computes $k = 3 = 6^8$ (mod 13). So k = 3 is the mutually agreed key
for a symmetric cipher.
(b) Here is another slightly larger example [Kob]. We take as prime number p = 53 and the
number g = 2, which is a generating element modulo p = 53. X(avier) chooses e = 29
and therefore transmits $45 = 2^{29}$ (mod 53) to Y(ollanda). For her part, Y chooses f = 19
and therefore transmits $12 = 2^{19}$ (mod 53) to X. Thereupon, X computes the number
$k = 21 = 12^{29}$ (mod 53) and Y computes the number $k = 21 = 45^{19}$ (mod 53). After that,
both X and Y know the common key k = 21.
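Both examples can be replayed with a short sketch. The function name `diffie_hellman` and its return convention are assumptions of this illustration; in a real exchange e and f would be chosen at random and never leave the respective party.

```python
def diffie_hellman(p, g, e, f):
    """Simulate one exchange; return (value X sends, value Y sends, shared key)."""
    x_sends = pow(g, e, p)        # X transmits g^e (mod p)
    y_sends = pow(g, f, p)        # Y transmits g^f (mod p)
    key_x = pow(y_sends, e, p)    # X computes (g^f)^e
    key_y = pow(x_sends, f, p)    # Y computes (g^e)^f
    assert key_x == key_y         # both sides arrive at g^(e*f) (mod p)
    return x_sends, y_sends, key_x


print(diffie_hellman(13, 2, 5, 8))    # example (a): (6, 9, 3)
print(diffie_hellman(53, 2, 29, 19))  # example (b): (45, 12, 21)
```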
To crack the Diffie-Hellman method, the obvious thing to do is to try to compute the discrete
logarithm a from $b = g^a$ (mod p). So this, if one believes in the statement “discrete
logarithm in reasonable time is hard”, is again exactly the situation needed for a public-key
cipher. However, it is unclear whether the Diffie-Hellman method can be broken only by
computing discrete logarithms, or whether there may be entirely different efficient methods.
In order to attack Diffie-Hellman, one therefore primarily tries to develop fast algo-
rithms for calculating the discrete logarithm (Sect. 3.6). For security reasons, the BSI
guideline [BSI1] recommends that the key length of the prime number p should be of the
order of 2000 bits. These are therefore prime numbers with about 600 decimal places. We
have already examined how to obtain such prime numbers using the Monte Carlo method
for the RSA procedure (Sect. 3.3). However, with increasing computer performance, the
BSI recommends using prime numbers p with a length of 3000 bits for a period of use
beyond 2022.
One can also think of the Diffie-Hellman key exchange somewhat more vividly as a color
game, as illustrated in Fig. 3.4. X(avier) and Y(ollanda) know the common color yellow
(namely p and g), which every potential attacker A(rchibald) also knows. They each mix
yellow with a secret color red (namely e) and turquoise (namely f) known only to them.
These mixed colors brown and blue are now exchanged, and it is assumed that they are
observed doing so. The underlying assumption is that A will not be able to determine
exactly the red and turquoise used from the mixed colors. But this would be necessary in
order, as X and Y do, to subsequently produce the common military green mixture (namely
the key k) that they use for their communication.
r          0    1    2    3    4    5
$11^r$     1   11    5   26   25   14   (mod 29)
Because of $g^{-t} = 11^{-6} = 11^{28-6} = 11^{22} = 13$ (mod 29), the giant steps are successively
calculated as $3 \cdot 13^q$, and we obtain
q                0    1    2
$3 \cdot 13^q$   3   10   14   (mod 29)
At 14 (mod 29), there is a match with the baby-step table. Therefore r = 5, q = 2 and
thus the sought-after a equals a = t ∙ q + r = 6 ∙ 2 + 5 = 17.
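The two tables can be reproduced with the following sketch. The parameters p = 29, g = 11, b = 3 and t = 6 are read off from the tables above; the function name and the default choice of t (roughly the square root of p − 1) are assumptions of this illustration.

```python
from math import isqrt


def baby_step_giant_step(g, b, p, t=None):
    """Return a with g^a = b (mod p), written as a = t*q + r, or None."""
    if t is None:
        t = isqrt(p - 1) + 1
    # Baby steps: table of g^r for r = 0..t-1.
    baby = {}
    value = 1
    for r in range(t):
        baby.setdefault(value, r)
        value = (value * g) % p
    # g^(-t) = g^(p-1-t) (mod p), since g^(p-1) = 1.
    factor = pow(g, (p - 1) - t, p)
    # Giant steps: b * (g^(-t))^q for q = 0, 1, 2, ...
    giant = b % p
    for q in range(t + 1):
        if giant in baby:
            return t * q + baby[giant]
        giant = (giant * factor) % p
    return None


print(baby_step_giant_step(11, 3, 29, t=6))  # 17, as in the example
```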
The Pohlig-Hellman method by Stephen Pohlig (1953–2017) and Martin Hellman (b.
1945) was published in 1978. It also allows the computation of the discrete logarithm a of
$b = g^a$ (mod p) for a prime number p and a generating element g modulo p. The method is
faster than baby-step-giant-step when all prime divisors q of p − 1 are quite small. The
trick works like this: For a prime q, let $q^e$ be the highest power of q that divides p − 1. Then
one first determines the remainder c with c = a (mod $q^e$) for the initially still unknown a.
If this is done for all prime divisors q of p − 1, then there is a well-known standard
technique to determine the sought a smaller than p − 1. This is the so-called Chinese
remainder theorem [Wil, Buc], which already appears in various early Chinese sources,
probably for the first time around 300 A.D. Let $q_1, \ldots, q_s$ be the distinct prime divisors
of p − 1. For i = 1,…, s we denote by $n_i$ the highest power of $q_i$ dividing p − 1, and set
$n_i' = (p-1)/n_i$. Since $n_i$ and $n_i'$ are coprime for each i, we use the extended Euclidean
algorithm to find natural numbers $x_i$ with $x_i \cdot n_i' = 1$ (mod $n_i$), i.e. $(n_i')^{-1} = x_i$ (mod $n_i$).
Since, according to the assumption, we have already determined all $c_i$ with $c_i = a$ (mod $n_i$),
we can compute $y_i = c_i \cdot (n_i')^{-1}$ (mod $n_i$) as well as $x = y_1 \cdot n_1' + \ldots + y_s \cdot n_s'$.
Since each $n_i$ is a divisor of $n_j'$ for different i and j, we obtain
$x = y_i \cdot n_i' = c_i \cdot (n_i')^{-1} \cdot n_i' = c_i = a$ (mod $n_i$). Because of $n_1 \cdot \ldots \cdot n_s = p - 1$ with pairwise coprime
$n_i$, it follows that a = x (mod p − 1).
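The construction with the Chinese remainder theorem translates almost literally into code. The sketch below follows the notation of the text; the function name is an assumption, and Python's built-in `pow(x, -1, n)` (available from Python 3.8) stands in for the extended Euclidean algorithm.

```python
from math import prod


def chinese_remainder(residues, moduli):
    """Return x modulo prod(moduli) with x = c_i (mod n_i), moduli pairwise coprime."""
    n = prod(moduli)
    x = 0
    for c_i, n_i in zip(residues, moduli):
        n_i_prime = n // n_i                 # n_i' = n / n_i
        inverse = pow(n_i_prime, -1, n_i)    # (n_i')^(-1) (mod n_i)
        y_i = (c_i * inverse) % n_i          # y_i = c_i * (n_i')^(-1) (mod n_i)
        x += y_i * n_i_prime
    return x % n


# Classic textbook instance: x = 2 (mod 3), x = 3 (mod 5), x = 2 (mod 7).
print(chinese_remainder([2, 3, 2], [3, 5, 7]))  # 23
```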
So it remains to calculate c with c = a (mod $q^e$). For this, first think of
$c = c(0) \cdot q^0 + c(1) \cdot q + c(2) \cdot q^2 + \ldots + c(e-1) \cdot q^{e-1}$ written as an expansion in powers of q with coefficients c(i)
smaller than q, as we have already done for binary expansions (i.e. for q = 2). Let $h = g^{(p-1)/q}$
(mod p), then $h^q = 1$ (mod p), and because c = a (mod $q^e$) it follows
$b^{(p-1)/q} = g^{a \cdot (p-1)/q} = h^a = h^c = h^{c(0)}$ (mod p).
Since c(0) is less than q, one can determine
from it c(0) by successively comparing $b^{(p-1)/q}$ (mod p) with $1, h, h^2, \ldots, h^{q-1}$ (mod p). This
is effective since q should be quite small by assumption. Now consider
$b_1 = b \cdot g^{-c(0)} = g^{a - c(0)}$ (mod p).
Then
$b_1^{(p-1)/q^2} = g^{(a - c(0)) \cdot (p-1)/q^2} = h^{(a - c(0))/q} = h^{(c - c(0))/q} = h^{c(1)}$ (mod p),
and because c(1) is smaller
than q, one can in turn determine c(1) from this by successively comparing $b_1^{(p-1)/q^2}$ (mod
p) with $1, h, h^2, \ldots, h^{q-1}$ (mod p). Next, one forms $b_2 = b \cdot g^{-c(0) - c(1) \cdot q} = g^{a - c(0) - c(1) \cdot q}$ (mod p)
and $b_2^{(p-1)/q^3} = h^{c(2)}$ (mod p), determines from that c(2) and calculates in this way
successively all coefficients c(i) and thus also c itself.
We will also give an example [Wil], which must be sufficiently complicated to demon-
strate the Pohlig-Hellman method comprehensively. We choose the prime number
p = 1999, thus $p - 1 = 1998 = 2 \cdot 3^3 \cdot 37$. As one has to verify, g = 3 is a generating element
modulo p = 1999. For b = 1996 we want to determine the discrete logarithm a to the base
g = 3 modulo p = 1999, for which thus $1996 = b = g^a = 3^a$ (mod p = 1999) holds.
We first examine the three prime divisors 2, 3, and 37 of p − 1. Using the terms of the
Pohlig-Hellman method, we start with q = 2 and calculate c = a (mod 2). To do this, we
first note that c = c(0) holds. Because of $h = g^{(p-1)/q} = 3^{1998/2} = -1$ (mod 1999) and
$b^{(p-1)/q} = 1996^{1998/2} = 1$ (mod 1999) it follows that c(0) = 0 and therefore c = 0.
We next consider q = 3 and now calculate c = a (mod $3^3$). To do this, we again note that
$c = c(0) + c(1) \cdot 3 + c(2) \cdot 3^2$ holds. Because of $h = g^{(p-1)/q} = 3^{1998/3} = 808$ (mod 1999) and
$b^{(p-1)/q} = 1996^{1998/3} = 808$ (mod 1999), it first follows c(0) = 1. Because of
$b_1 = b \cdot g^{-c(0)} = b \cdot g^{-1} = 1996 \cdot 3^{-1} = -1$ (mod 1999) and
$b_1^{(p-1)/q^2} = (-1)^{1998/9} = 1$ (mod 1999), it now follows c(1) = 0. Finally,
$b_2 = b \cdot g^{-c(0) - c(1) \cdot q} = b \cdot g^{-1} = b_1 = -1$ (mod 1999), so
that $b_2^{(p-1)/q^3} = (-1)^{1998/27} = 1$ (mod 1999) finally yields c(2) = 0. Altogether, therefore,
c = c(0) = 1.
It remains to consider q = 37. We again look for c = a (mod 37) and note that c = c(0)
holds. Now $h = g^{(p-1)/q} = 3^{1998/37} = 1309$ (mod 1999) as well as $b^{(p-1)/q} = 1996^{1998/37} = 1309$
(mod 1999), from which follows c(0) = 1 and thus also c = 1.
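Now, using the notation from the Pohlig-Hellman method, we collect what we already
know in our example. It is $n_1 = 2$ and $n_1' = 3^3 \cdot 37$, $n_2 = 3^3$ and $n_2' = 2 \cdot 37$, and $n_3 = 37$ and
$n_3' = 2 \cdot 3^3$. Also, we have just determined $c_1 = 0$, $c_2 = 1$, and $c_3 = 1$. Then the Chinese
remainder theorem states that the sought a can be calculated as a = x (mod 1998) with
$x = y_1 \cdot n_1' + y_2 \cdot n_2' + y_3 \cdot n_3'$. Here $y_1 = 0$, and from $(n_2')^{-1} = 74^{-1} = 23$ (mod 27) and
$(n_3')^{-1} = 54^{-1} = 24$ (mod 37) we get $y_2 = 23$ and $y_3 = 24$, so that
$x = 0 \cdot 999 + 23 \cdot 74 + 24 \cdot 54 = 2998$ and therefore a = 1000 (mod 1998).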
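The whole Pohlig-Hellman computation for the example p = 1999 can be sketched as follows. This is an illustration under simplifying assumptions: the function names are invented here, p − 1 is factored by trial division, each coefficient c(i) is found by plain comparison with the powers of h, and the residues are combined with the Chinese remainder theorem as described in the text.

```python
def factorize(n):
    """Return {prime: exponent} for n (trial division, small n only)."""
    factors, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def dlog_mod_prime_power(g, b, p, q, e):
    """Return c = a (mod q^e), built from its q-adic coefficients c(0), c(1), ..."""
    h = pow(g, (p - 1) // q, p)
    g_inv = pow(g, -1, p)
    c = 0
    for i in range(e):
        b_i = (b * pow(g_inv, c, p)) % p              # b_i = b * g^(-c so far)
        target = pow(b_i, (p - 1) // q ** (i + 1), p)
        # Compare with 1, h, h^2, ..., h^(q-1); assumes g is a generating element.
        digit = next(d for d in range(q) if pow(h, d, p) == target)
        c += digit * q ** i
    return c


def pohlig_hellman(g, b, p):
    """Return the discrete logarithm a of b to the base g modulo p."""
    n = p - 1
    x = 0
    for q, e in factorize(n).items():
        n_i = q ** e
        c_i = dlog_mod_prime_power(g, b, p, q, e)
        n_i_prime = n // n_i
        x += (c_i * pow(n_i_prime, -1, n_i) % n_i) * n_i_prime
    return x % n


a = pohlig_hellman(3, 1996, 1999)
print(a, pow(3, a, 1999))
```

The final line prints the logarithm together with the check value $3^a$ (mod 1999), which must equal b = 1996.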
We also want to discuss a second ρ-method by John Pollard (born 1941), namely the one
for discrete logarithms from the year 1978. Thus, we are again looking for a with $b = g^a$
(mod p) for a prime number p and a generating element g modulo p. To do this, one first
divides the numbers 1, 2,…, p − 1 into three roughly equal ranges $B_1$, $B_2$ and $B_3$, say $B_1$
from 1 to some $n_1$, $B_2$ from $n_1 + 1$ to some $n_2$ and $B_3$ from $n_2 + 1$ to p − 1. Inspired by Pollard’s
ρ-factorization, one now defines a sequence $x_i$ of natural numbers in the range from 1 to
p − 1 and starts for this with a random number $x_0 = g^{k_0}$ (mod p). The sequence itself is
calculated recursively by
$x_{i+1} = g \cdot x_i$ (mod p), if $x_i$ lies in the range $B_1$
$x_{i+1} = x_i^2$ (mod p), if $x_i$ lies in the range $B_2$
$x_{i+1} = b \cdot x_i$ (mod p), if $x_i$ lies in the range $B_3$
By construction, every $x_i$ can be written as $x_i = g^{k_i} \cdot b^{m_i}$ (mod p) with exponents $k_i$ and $m_i$
that are updated together with $x_i$. Since $x_i$ can take only the finitely many values 1, 2,…, p − 1, there must be among
the $k_i$ and $m_i$ numbers k and k′ as well as m and m′ with $g^k \cdot b^m = g^{k'} \cdot b^{m'}$ (mod p) and the
chain closes, so to speak, modulo p. Putting here $b = g^a$ (mod p) and summing up the
exponents of g, it follows for these k − k′ = a ∙ (m′ − m) (mod p − 1). In order to be able to
calculate the discrete logarithm a concretely, one must first get the values k, k′, m and m′
and afterwards determine the sought a from the just derived equation modulo p − 1. If
there are several solutions a, one must determine the correct one by trial and error or
alternatively start with a new $x_0 = g^{k_0}$.
As an example [WPPRL] for Pollard’s ρ-method we choose the prime number p = 1019
and as generating element modulo p = 1019 the number g = 2, which of course has to be
checked again. We look for the discrete logarithm a for b = 5 to the base g = 2 modulo
p = 1019, so $5 = 2^a$ (mod 1019). A small computer program helps to calculate
$2^{681} \cdot 5^{378} = 1010 = 2^{301} \cdot 5^{416}$ (mod 1019), and consequently a ∙ (416 − 378) = (681 − 301) (mod
1018). In any case, for this a = 10 is a solution with $2^a = 2^{10} = 1024 = 5$ (mod 1019). There
is also a second solution a′ = 519, but for which $2^{a'} = 2^{519} = 1014 \neq 5$ (mod 1019) holds.
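A "small computer program" of this kind might look like the sketch below. It is not the program behind the cited example: the collision is found here with Floyd's cycle detection (a slow and a fast runner), the start exponents $k_0$ are tried in order 1, 2, 3,… instead of at random, and the function name and the parameter `max_starts` are assumptions. The exponents found can therefore differ from 681, 378, 301 and 416, while the resulting logarithm is the same.

```python
from math import gcd


def pollard_rho_dlog(g, b, p, max_starts=50):
    """Return a with g^a = b (mod p) using Pollard's rho idea, or None."""
    n = p - 1  # exponents are reduced modulo p - 1

    def step(x, k, m):
        # Invariant: x = g^k * b^m (mod p).
        if x <= n // 3:                    # range B1
            return (g * x) % p, (k + 1) % n, m
        if x <= 2 * n // 3:                # range B2
            return (x * x) % p, (2 * k) % n, (2 * m) % n
        return (b * x) % p, k, (m + 1) % n  # range B3

    for k0 in range(1, max_starts + 1):
        slow = (pow(g, k0, p), k0, 0)
        fast = step(*slow)
        while slow[0] != fast[0]:          # wait until the chain closes
            slow = step(*slow)
            fast = step(*step(*fast))
        (_, k, m), (_, k2, m2) = slow, fast
        # g^k * b^m = g^k2 * b^m2  implies  k - k2 = a * (m2 - m) (mod p - 1).
        coeff, rhs = (m2 - m) % n, (k - k2) % n
        if coeff == 0:
            continue                       # useless collision, try a new start
        d = gcd(coeff, n)
        if rhs % d != 0:
            continue
        a0 = (rhs // d) * pow(coeff // d, -1, n // d) % (n // d)
        for j in range(d):                 # several solutions: test each one
            a = a0 + j * (n // d)
            if pow(g, a, p) == b % p:
                return a
    return None


print(pollard_rho_dlog(2, 5, 1019))
```

For the example the congruence has two solutions modulo 1018, and the final test with `pow(g, a, p)` selects the correct one, a = 10.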
We will not discuss here another method for computing discrete logarithms, the index-
calculus method [Wil, Buc].
As a practical example, we will now look at the Bluetooth radio interface. This is a stan-
dard developed in the 1990s by the Bluetooth Special Interest Group for data transmis-
sion over short distances via radio technology. The devices involved transmit in a
license-free so-called ISM band (Industrial, Scientific, Medical Band) at about 2.4 GHz
and may be operated worldwide without approval. Depending on the transmission power,
the range is between 1 m and 100 m, whereby the characteristics of the environment, such
as the presence of partitions, can also strongly influence the range. The name Bluetooth is
derived from the Danish king Harald Bluetooth. The main purpose of Bluetooth is to
replace cable connections between different devices. Bluetooth provides an interface
through which small mobile devices such as mobile phones and tablets as well as comput-
ers, printers and other peripheral devices can communicate with each other and
exchange data.
Like any radio transmission, Bluetooth offers potential attackers good opportunities to
hack into the communication, especially at greater distances. For this reason, Bluetooth
data are transmitted in encrypted form. The original encryption method was very similar
to that used for GSM mobile radio (Sect. 2.3). The so-called E2 procedure and a random
natural number RAND were used to generate the key for the actual encryption method E0.
E0 was a stream cipher in which the pseudo-random sequences were generated with the
aid of four shift registers of lengths 25, 31, 33 and 39 [Fox].
In the meantime, however, Bluetooth has switched to newer methods. For the encryption
of the data to be transmitted, AES with a key length of 128 bits is used in CTR operating
mode. The key exchange for the AES key takes place with the Diffie-Hellman
method [Blu].
But this was not yet the full truth. In order to explain this, however, we must first venture
a small digression. In real analysis, elliptic curves are given by equations of the form
$y^2 = x^3 + r \cdot x + s$ with real coefficients r and s and variables x and y (together with the technical
condition that $4 \cdot r^3 + 27 \cdot s^2$ does not equal 0). For example, $y^2 = x^3 + 5 \cdot x + 3$ is an elliptic curve,
and the point $P = (x_0, y_0) = (1, 3)$ lies on the curve since $y_0^2 = 9 = 1^3 + 5 \cdot 1 + 3 = x_0^3 + 5 \cdot x_0 + 3$
holds. Figure 3.5 shows the graphs of two typical elliptic curves in the real (x, y) plane.
Since one can add and multiply remainders modulo a prime number p, an elliptic curve
is also conceivable modulo p. This means that the coefficients r and s are remainders
modulo p and that also for the variables x and y only remainders modulo p are admissible.
At first this sounds like abstract mathematical gimmickry, especially since one cannot
imagine and plot the graph with its finitely many points in such a concrete way. However,
one can operate with two points P = (x0, y0) and Q = (x1, y1) of the elliptic curve with resi-
dues x0, x1, y0 and y1 modulo p in such a way that the result is again a point on the curve,
thus satisfying the curve equation. This operation is geometrically motivated from the real
(x, y)-plane [BNS, Kob], but it is quite complicated. For the sake of simplicity it is written
as addition P + Q, although it has nothing at all to do with a simple addition x0 + x1 and
y0 + y1 of the coordinates.
So here are the calculation rules for the addition of two points, but only for information or
even for “deterrence”, and explicitly with the possibility to “skim” or even skip this. For
further understanding, the intuitive notion of an additive operation of two curve points is
quite sufficient. Let $P = (x_0, y_0)$ and $Q = (x_1, y_1)$ be two points on an elliptic curve
$y^2 = x^3 + r \cdot x + s$ modulo a prime number p, where we want to assume that p is greater than 3.
First, define a fictitious point O of the curve, the so-called point at infinity, which is to
be the neutral element of the addition, i.e. O + P = P + O = P. Moreover, we set $-P = (x_0, -y_0)$
and P + (−P) = O. We now want to define in general how to compute the sum
$P + Q = (x_S, y_S)$ with remainders $x_S$ and $y_S$ modulo p. Since we have already done this for
Q = −P, we can assume that Q is not equal to −P.
If in addition Q is unequal to P, then $x_0$ is also unequal to $x_1$. Otherwise
$0 = y_1^2 - y_0^2 = (y_1 - y_0) \cdot (y_1 + y_0)$ (mod p) would follow from the formula for the elliptic curve, and
therefore $y_1 = y_0$ or $y_1 = -y_0$, i.e. Q = P or Q = −P. Thus one can divide by $x_1 - x_0$ modulo p (Sect. 3.5). One
then defines
$x_S = ((y_1 - y_0)/(x_1 - x_0))^2 - x_0 - x_1$ (mod p)
$y_S = -y_0 + ((y_1 - y_0)/(x_1 - x_0)) \cdot (x_0 - x_S)$ (mod p)
If, on the other hand, Q = P, then $y_0$ is not equal to 0 modulo p because Q is not equal to −P, so one can divide by $2 \cdot y_0$ and defines
$x_S = ((3 \cdot x_0^2 + r)/(2 \cdot y_0))^2 - 2 \cdot x_0$ (mod p)
$y_S = -y_0 + ((3 \cdot x_0^2 + r)/(2 \cdot y_0)) \cdot (x_0 - x_S)$ (mod p)
Strictly speaking, one first has to prove that P + Q is again a point on the elliptic curve, i.e. that the
equation of the curve $y_S^2 = x_S^3 + r \cdot x_S + s$ is fulfilled. If you are motivated enough, you can
also try to prove that the addition of the points satisfies reasonable calculation rules, i.e.
P + Q = Q + P (commutative law) as well as (P + Q) + R = P + (Q + R) (associative law)
for a third point R = (x2, y2) on the elliptic curve.
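For readers who prefer code to formulas, the addition rules can be sketched as follows. The toy curve $y^2 = x^3 + 2 \cdot x + 2$ modulo p = 17 with the point (5, 1) is a common small teaching example and is not taken from this text; the function names and the representation of the point at infinity by `None` are assumptions of this illustration.

```python
O = None  # the point at infinity (neutral element)


def on_curve(P, r, s, p):
    """Check the curve equation y^2 = x^3 + r*x + s (mod p)."""
    if P is O:
        return True
    x, y = P
    return (y * y - (x ** 3 + r * x + s)) % p == 0


def ec_add(P, Q, r, p):
    """Add two curve points with coordinates reduced modulo p."""
    if P is O:
        return Q
    if Q is O:
        return P
    (x0, y0), (x1, y1) = P, Q
    if x0 == x1 and (y0 + y1) % p == 0:
        return O                                  # Q = -P
    if P == Q:                                    # doubling: (3*x0^2 + r) / (2*y0)
        slope = (3 * x0 * x0 + r) * pow((2 * y0) % p, -1, p) % p
    else:                                         # (y1 - y0) / (x1 - x0)
        slope = (y1 - y0) * pow((x1 - x0) % p, -1, p) % p
    x_s = (slope * slope - x0 - x1) % p
    y_s = (-y0 + slope * (x0 - x_s)) % p
    return (x_s, y_s)


P = (5, 1)
print(ec_add(P, P, 2, 17))                        # (6, 3)
print(on_curve(ec_add(P, P, 2, 17), 2, 2, 17))    # True
```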
So, at the latest now, elliptic curves modulo p become interesting also for cryptography. The
idea is this: One chooses a large prime number p and an elliptic curve $y^2 = x^3 + r \cdot x + s$
modulo p by fixing its coefficients r and s. Also, one finds a point G on the curve for which the iterative
addition i ∙ G = G + …i… + G passes through as many different points on the curve as
possible. One calls the number o after which a point repeats for the first time in this process,
i.e., j ∙ G = o ∙ G for a suitable j smaller than o, the order o of G. But this in turn implies O = (o − j) ∙ G,
and by the choice of o, j = 0 and o ∙ G = O is the neutral element of the addition.
From experience one knows, or at least believes to know, that it is a hard problem to
determine the natural number n for a base point G of large order o from the knowledge of
a point P = n ∙ G. One suspects that this is even significantly more difficult than determining
the discrete logarithm a of $b = g^a$ (mod p). So in this comparison the base point G plays
the role of the generating element g modulo p. The baby-step-giant-step method, the
Pohlig-Hellman method, and Pollard’s ρ-method (Sect. 3.6) can, suitably “translated”, also
be applied to calculate the number n from P = n ∙ G. But all known methods are much less
efficient for elliptic curves than for the discrete logarithm.
3.7.6 ECDH
We now want to formulate the Diffie-Hellman key exchange for elliptic curves ECDH
(Elliptic Curve Diffie Hellman). Let the values p, r, s, G and o be defined as above and
publicly known.
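A toy version of such an exchange can be sketched as follows, assuming that it mirrors the classical Diffie-Hellman procedure with the multiples e ∙ G and f ∙ G in place of the powers $g^e$ and $g^f$. The small curve $y^2 = x^3 + 2 \cdot x + 2$ modulo 17 with base point G = (5, 1) of order 19 is a common teaching example and not a parameter set from this text; all function names and the secret numbers 3 and 7 are hypothetical.

```python
def ec_add(P, Q, r, p):
    """Point addition on y^2 = x^3 + r*x + s (mod p); None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x0, y0), (x1, y1) = P, Q
    if x0 == x1 and (y0 + y1) % p == 0:
        return None
    if P == Q:
        slope = (3 * x0 * x0 + r) * pow((2 * y0) % p, -1, p) % p
    else:
        slope = (y1 - y0) * pow((x1 - x0) % p, -1, p) % p
    x_s = (slope * slope - x0 - x1) % p
    return (x_s, (-y0 + slope * (x0 - x_s)) % p)


def ec_mul(n, P, r, p):
    """Compute n * P by double-and-add (the analogue of repeated squaring)."""
    result, addend = None, P
    while n > 0:
        if n & 1:
            result = ec_add(result, addend, r, p)
        addend = ec_add(addend, addend, r, p)
        n >>= 1
    return result


p, r, G = 17, 2, (5, 1)          # toy parameters; s = 2 is not needed for the addition
e, f = 3, 7                       # secret numbers of X and Y (hypothetical)
x_sends = ec_mul(e, G, r, p)      # X transmits e * G
y_sends = ec_mul(f, G, r, p)      # Y transmits f * G
key_x = ec_mul(e, y_sends, r, p)  # X computes e * (f * G)
key_y = ec_mul(f, x_sends, r, p)  # Y computes f * (e * G)
print(key_x, key_y)
```

Both parties end up with the same curve point, from which the symmetric key would then be derived.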
ECDH is thus the key exchange used in Bluetooth [Blu]. In standard procedures based on
ECDH, the parameters (p, r, s, G, o) are predefined and are thus effectively part of the
algorithm. In order to show explicitly what such procedures look like, the NIST standard
P-256 used for Bluetooth is reproduced here [BeL, USG]:
• prime number p
–– p = FFFFFFFF 00000001 00000000 00000000 00000000 FFFFFFFF FFFFFFFF
FFFFFFFF
• Elliptic curve $y^2 = x^3 + r \cdot x + s$ with
–– r = FFFFFFFF 00000001 00000000 00000000 00000000 FFFFFFFF FFFFFFFF
FFFFFFFC = p − 3
–– s = 5AC635D8 AA3A93E7 B3EBBD55 769886BC 651D06B0 CC53B0F6
3BCE3C3E 27D2604B
• Base point G = (xG, yG) of prime order o
–– xG = 6B17D1F2 E12C4247 F8BCE6E5 63A440F2 77037D81 2DEB33A0
F4A13945 D898C296
The parameters are represented hexadecimally, whereby 4 bits are combined into numbers
from 0 to 15. The letters stand for the two-digit numbers A = 10, B = 11,…, F = 15.
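The hexadecimal parameters can be read into ordinary integers and checked for consistency. The sketch below uses only the values of p, r, s and $x_G$ listed above; the variable names and the helper `from_hex` are assumptions of this illustration.

```python
def from_hex(text):
    """Convert a blank-separated hexadecimal string into an integer."""
    return int(text.replace(" ", ""), 16)


p = from_hex("FFFFFFFF 00000001 00000000 00000000 00000000 FFFFFFFF FFFFFFFF FFFFFFFF")
r = from_hex("FFFFFFFF 00000001 00000000 00000000 00000000 FFFFFFFF FFFFFFFF FFFFFFFC")
s = from_hex("5AC635D8 AA3A93E7 B3EBBD55 769886BC 651D06B0 CC53B0F6 3BCE3C3E 27D2604B")
x_g = from_hex("6B17D1F2 E12C4247 F8BCE6E5 63A440F2 77037D81 2DEB33A0 F4A13945 D898C296")

print(p.bit_length())                                    # 256
print(p == 2**256 - 2**224 + 2**192 + 2**96 - 1)         # True: special form of p
print(r == p - 3)                                        # True
print((4 * r**3 + 27 * s**2) % p != 0)                   # technical condition holds
print(x_g < p)                                           # coordinate is a remainder mod p
```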
Even on the Internet, the case can arise that neither of the two communication partners has
a valid RSA certificate for the key agreement. For this reason, the TLS handshake protocol
also allows the AES key to be exchanged using the Diffie-Hellman method. Both the vari-
ant with discrete logarithm (DH) and the variant with elliptic curves (ECDH) are supported.
In order to achieve a security level comparable to RSA or Diffie-Hellman, the bit length
of the order o of the base point G is decisive for ECDH. The BSI guideline [BSI1]
recommends at least 250 bits for this, as is also the case with P-256.
In general, the methods based on elliptic curves show a better runtime behavior even
compared to RSA. In order to generate a value for o in the order of 250 bits, it is sufficient
to use primes p with only about 250 bits, as can be seen in the example P-256. It is true
that the addition of the points of an elliptic curve is relatively complex. But here one has
to calculate only with remainders modulo prime numbers with about 250 bits and can
therefore avoid the complex calculations with remainders in the order of 2000 bits.
3.8 ElGamal Cipher
How to transform the discrete logarithm problem into a public key cipher that can be used
to encrypt arbitrary messages m is what we will now look at. The method originates from
Taher ElGamal (born 1955) in 1984.
The ElGamal cipher is visualized in Fig. 3.6. Each potential receiver Y(ollanda) of mes-
sages registers in a central registry with her public key. To do this, she obtains a very large
prime number p and a generating element g modulo p. She also chooses a natural number
a in the range from 2 to p − 2 and computes $b = g^a$ (mod p). Finally, Y publishes (p, g, b) as her
public key but keeps the number a secret as her private key. Thus, to obtain the private key
a, an attacker A(rchibald) would have to solve the discrete logarithm of b to the base g
modulo p.
Now suppose sender X(avier) wants to send a secret message m to receiver Y again.
Then X looks up the public key (p, g, b) of Y in the central register and in turn chooses a
random natural number k in the range from 2 to p − 2. Let the message m be a natural
number smaller than the very large p. Sender X then transmits the encrypted information
$g^k$ (mod p) and $m \cdot b^k$ (mod p) to Y. She in turn takes her private key a and computes
$(g^k)^{p-1-a} \cdot m \cdot b^k = m \cdot (g^k)^{p-1-a} \cdot (g^a)^k = m \cdot g^{(p-1)k - ka + ak} = m \cdot g^{(p-1)k} = m$ (mod p).
Since m is less than p, she has thus obtained the desired plaintext message.
The ElGamal procedure is very similar to the Diffie-Hellman key exchange, and the
key selection even corresponds exactly to the semi-static Diffie-Hellman variant. Here $g^k$
(mod p) is the exchange value of X, and $b^k$ (mod p) is the mutually agreed Diffie-
Hellman key, which “masks” the secret message m in the ElGamal cipher.
• With ElGamal two cipher values have to be calculated and sent, with RSA only one.
• With ElGamal, two modular exponentiations are required for ciphering, with RSA
only one.
• With ElGamal a new random number must be generated each time, with RSA none.
As an example [Wil], receiver Y(ollanda) chooses the prime number p = 107. As can be
calculated, g = 2 is a generating element modulo p = 107. Y also chooses a = 51, which she
keeps secret as her private key, and calculates $b = g^a = 2^{51} = 80$ (mod 107). Therefore, she
registers (p, g, b) = (107, 2, 80) as her public key.
Let us now assume that sender X(avier) wants to send Y the message m = 83. He
chooses k = 17 as random number, calculates $g^k = 2^{17} = 104$ (mod 107) as well as
$m \cdot b^k = 83 \cdot 80^{17} = 74$ (mod 107) and therefore sends 104 and 74. Receiver Y uses her private
key a = 51 and calculates $(g^k)^{p-1-a} \cdot m \cdot b^k = 104^{55} \cdot 74 = 83$ (mod 107) and receives the
plaintext message 83 from X this way.
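The example can be checked with a few lines of Python. The function names `elgamal_encrypt` and `elgamal_decrypt` are assumptions of this sketch, and the random number k is passed in explicitly so that the numbers of the example are reproduced; in practice k must be chosen freshly at random for every message.

```python
def elgamal_encrypt(m, k, p, g, b):
    """Return the two cipher values g^k and m * b^k (mod p)."""
    return pow(g, k, p), (m * pow(b, k, p)) % p


def elgamal_decrypt(c1, c2, a, p):
    """Recover m as (g^k)^(p-1-a) * (m * b^k) (mod p) with the private key a."""
    return (pow(c1, p - 1 - a, p) * c2) % p


p, g, a = 107, 2, 51
b = pow(g, a, p)                       # public key component: 80
c1, c2 = elgamal_encrypt(83, 17, p, g, b)
print(b, (c1, c2), elgamal_decrypt(c1, c2, a, p))
```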
Again, it is necessary to note that the public-key cipher ElGamal requires much more
computing time than symmetric ciphers. Therefore, like RSA, it is only used for key
exchange for symmetric ciphers. This is also the reason why in practice the message m,
i.e. the binary expanded key of a symmetric cipher, is smaller than p.
Analogous to the Diffie-Hellman key exchange, the ElGamal cipher can also be applied to
elliptic curves. Let p again be a prime number and $y^2 = x^3 + r \cdot x + s$ an elliptic curve
modulo p. Moreover, let G be a base point on the curve with order o. Y(ollanda) chooses a
natural number b in the range from 2 to o − 1 and computes the point B = b ∙ G. The public
key of Y is then (p, r, s, G, o, B), the private one b.
If X(avier) wants to send the message m to Y, i.e. usually the key for a symmetric
cipher, he must first interpret m as a point M on the elliptic curve, preferably as the
x-coordinate of $M = (m, y_0)$. From the curve equation $y_0^2 = m^3 + r \cdot m + s$, it should then
be possible to compute $y_0$ as a square root modulo p. There are efficient calculation methods
for this [Kob], but they can only lead to the goal if such a square root exists at all
according to the Euler criterion (Sect. 3.3). However, about half of the remainders modulo p are
squares, so X has a good chance that $M = (m, y_0)$ is a curve point. But if not, he just
tries the point $M_1 = (m + 1, y_1)$ with $y_1^2 = (m + 1)^3 + r \cdot (m + 1) + s$ and continues with m + i
until he succeeds [Kob].
To send m or M, X then chooses a random natural number k in the range from 2 to o − 1
and sends the points k ∙ G and M + k ∙ B to Y. The latter uses her private key b and com-
putes (M + k ∙ B) − b ∙ (k ∙ G) = M + k ∙ B − k ∙ B = M = (m, y0), i.e., in particular, the
desired message m.
As attractive as the idea of developing public-key methods on the basis of difficult math-
ematical problems sounds, it is sobering to find that the pool is relatively small. The Rabin
cipher by Michael Rabin (born 1931) is worth mentioning, but it is not used in practice
[BNS, Buc]. It is based on the fact that computing “square roots” modulo a large number
n = p . q is as difficult as factoring n. In contrast, Robert McEliece’s (b. 1942) cipher using
error-correcting Goppa codes [BNS, Man] requires an exorbitantly large, currently
impractical key length. To complete this chapter we want to explain a totally different,
easily understandable method, which is, however, of more historical interest.
The so-called knapsack deals with the following problem: Given are natural numbers
v1,…, vn, where some of the vi can be equal. Now bits bi are to be calculated for a natural
number v such that v = b1 ∙ v1 + … + bn ∙ vn. One imagines a knapsack of volume v, which
one wants to fill optimally with provisions each of volume vi without leaving any part of
the knapsack unused. Admittedly, this idea requires a very cunning backpacker. In any
case, the fact is that solving this problem is hard, at least for sufficiently large vi and n.
There may be a unique solution to this, but there may also be many or none at all.
Now it is again a matter of turning the Knapsack problem into a public-key procedure. The
idea for the Merkle-Hellman cipher originated in 1978 from Ralph Merkle (born 1952)
and Martin Hellman (born 1945).
The potential receiver Y(ollanda) chooses a super-knapsack v1,…, vn. She also chooses
a number m greater than v1 + … + vn, and a number a in the range from 1 to m − 1 that is
coprime with m. Then Y can also compute $b = a^{-1}$ (mod m) by using the extended Euclidean
algorithm to determine the linear combination 1 = b ∙ a + x ∙ m. Finally, Y computes the residues
wi = a ∙ vi (mod m), making w1,…, wn no longer a super-knapsack. As her public key, Y
publishes the knapsack w1,…, wn, her private key is (b, m), from which the super-knapsack
v1,…, vn can again be computed inversely.
A message block in the Merkle-Hellman cipher consists of a bit string b1…bn of length
n. Thus, if sender X(avier) wants to send a message to Y, he computes w = b1 ∙ w1 + … + bn
∙ wn and sends the number w. Since w1,…, wn is not a super-knapsack, a potential attacker
A(rchibald) has a hard time computing the message b1…bn from w and the wi.
For receiver Y, however, this is easy. Namely, she calculates v = b ∙ w = b1 ∙ b ∙
w1 + … + bn ∙ b ∙ wn = b1 ∙ v1 + … + bn ∙ vn (mod m). But since by assumption m is greater
than v1 + … + vn, even v = b1 ∙ v1 + … + bn ∙ vn. Now all Y has to do is solve the super-
knapsack v1,…, vn for the number v to determine the message b1…bn.
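The complete procedure can be tried out with small numbers. The sketch below assumes that a super-knapsack is a sequence in which every $v_i$ is greater than the sum of all preceding ones, so that it can be solved greedily from the largest element downwards; the concrete numbers (m = 491, a = 41) and all function names are hypothetical and chosen only for illustration.

```python
from math import gcd


def mh_keygen(v, m, a):
    """Return the public knapsack w and the private key (b, m)."""
    assert all(v[i] > sum(v[:i]) for i in range(len(v)))  # assumed super-knapsack property
    assert m > sum(v) and gcd(a, m) == 1
    b = pow(a, -1, m)                      # b = a^(-1) (mod m)
    w = [(a * v_i) % m for v_i in v]       # w_i = a * v_i (mod m)
    return w, (b, m)


def mh_encrypt(bits, w):
    """Return w = b_1 * w_1 + ... + b_n * w_n."""
    return sum(bit * w_i for bit, w_i in zip(bits, w))


def mh_decrypt(cipher, w, private_key):
    """Recover the bits by solving the super-knapsack for b * cipher (mod m)."""
    b, m = private_key
    v = [(b * w_i) % m for w_i in w]       # recompute the super-knapsack
    target = (b * cipher) % m
    bits = []
    for v_i in reversed(v):                # greedy: largest element first
        if v_i <= target:
            bits.append(1)
            target -= v_i
        else:
            bits.append(0)
    assert target == 0
    return bits[::-1]


w, private_key = mh_keygen([2, 3, 7, 14, 30, 57, 120, 251], m=491, a=41)
message = [1, 0, 0, 1, 0, 1, 1, 0]
cipher = mh_encrypt(message, w)
print(w, cipher, mh_decrypt(cipher, w, private_key))
```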
Even with multiple transformations with different moduli m and factors a, the super-
knapsack v1,…, vn is modified only slightly, namely too little. As early as 1982, Adi Shamir (b.
1952) therefore found a method that exploits this shortcoming and cracks the
Merkle-Hellman cipher with only one transformation in reasonable time. Shortly thereafter,
Leonard Adleman (b. 1945) elaborated that Shamir’s method also works for Merkle-
Hellman ciphers with multiple transformations. Knapsack-based methods are therefore no
longer considered secure.
4 Digital Signature
So far, we have always implicitly assumed that a potential attacker A(rchibald) plays only
a passive role. He is intent on undermining the confidentiality between sender X(avier)
and receiver Y(ollanda) by eavesdropping on and decrypting the secret communication in
order to use the acquired knowledge for his own purposes immediately afterwards or even
after a time delay. The scenario of passive eavesdropping is shown in Fig. 4.1.
However, attackers can also play an active role. This is called a man-in-the-middle
attack. In this case, an attacker A(rchibald) inserts himself into a possibly two-way com-
munication between X(avier) and Y(ollanda) and plays the role of Y to X and the role of
X to Y. This scenario is visualized in Fig. 4.2. In general, active intervention requires that
the attacker can also decrypt the messages, because only then can he specifically change
the communication to his liking.
Receiver Y(ollanda) of a message should always be sure that she has received exactly the
message that sender X(avier) has really sent. In addition, she should be able to convince
herself beyond doubt of the origin of the message, i.e., that the message actually originated
from the specified sender X. For example, if a man-in-the-middle attacker A(rchibald) suc-
ceeds in converting the originating message “Secret meeting tomorrow 10 a.m. at my
place, Xavier” into
First of all, we would like to delimit the somewhat different concept of authentication of
users. Here, a user usually registers with a central point of a system, the so-called verifier,
which in turn authenticates the user on the basis of characteristics. Examples of this are the
reading of a bank card with subsequent PIN entry at an ATM, the password entry when
dialing into a computer network or the biometric passport check when entering and leav-
ing the country at the airport. The procedures for authenticating a user generally rely on
the following features, although combinations can also be used:
The focus of the authentication of a user is solely on the question of whether the user is
currently really who he claims to be. In contrast to the authentication of messages, con-
tents, for example the actions planned by a user, are of no importance. Furthermore, the
authentication of a user only assesses his current situation, while old, archived messages
also retain their authentication once it has been obtained.
Passive eavesdropping is sufficient to gain unauthorized knowledge of the features used
for authentication; a man-in-the-middle attack is therefore not required. However, the veri-
fier must know the respective characteristics of the features so that it can also check them.
Thus, their secure and secret storage on the verifier’s server is required. How to proceed
with passwords, for example, and how to use the digital signature as an alternative will
be discussed at the end of the chapter (Sect. 4.8).
Both message and user authentication are usually referred to briefly (and laxly) as
authentication, and the meaning is only clear from context.
If X(avier) and Y(ollanda) agree on their key for a symmetric cipher using Diffie-Hellman
key exchange, a man-in-the-middle attack is fatal. This is because attacker A(rchibald) can
agree on a key with X in the role of Y and agree on a key with Y in the role of X without
either of the communication partners X and Y being aware of it. Subsequently, A can tap
messages from X, decrypt them, and forward them modified to Y, as well as vice versa. A
Diffie-Hellman key exchange, such as with the TLS protocol on the Internet or with
Bluetooth, is therefore exposed to a man-in-the-middle attack without any further precautions.
In this case, it is therefore advisable for the two communication partners to authenticate
themselves unambiguously to each other beforehand, i.e. not to a verifier. This is
why, for example, the so-called SSP procedure (Secure Simple Pairing) has been
implemented for Bluetooth, in which the users can first authenticate the devices communicating
via Bluetooth by means of a six-digit number.
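Why an unauthenticated Diffie-Hellman exchange is defenceless against an active attacker can be made concrete with a small simulation. The function name `mitm_demo` and the attacker's secret numbers a1 and a2 are hypothetical; the parameters p = 53 and g = 2 are reused from the earlier key exchange example.

```python
def mitm_demo(p, g, e, f, a1, a2):
    """Simulate A intercepting both exchange values and substituting his own."""
    from_x = pow(g, e, p)        # sent by X, intercepted by A
    from_y = pow(g, f, p)        # sent by Y, intercepted by A
    to_y = pow(g, a1, p)         # A forwards his own value to Y (in the role of X)
    to_x = pow(g, a2, p)         # A forwards his own value to X (in the role of Y)
    key_x = pow(to_x, e, p)      # X believes this key is shared with Y
    key_y = pow(to_y, f, p)      # Y believes this key is shared with X
    key_a_x = pow(from_x, a2, p) # A's key for the connection to X
    key_a_y = pow(from_y, a1, p) # A's key for the connection to Y
    return key_x, key_a_x, key_y, key_a_y


key_x, key_a_x, key_y, key_a_y = mitm_demo(53, 2, e=29, f=19, a1=7, a2=11)
print(key_x == key_a_x, key_y == key_a_y)  # True True: A shares a key with each side
```

A can now decrypt, read, modify and re-encrypt every message in both directions, while X and Y each hold a key that works and therefore notice nothing.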
We have dealt in detail with public-key ciphers in the last chapter, where participant
Y(ollanda) registers her encryption key ke publicly, since it is practically impossible to
compute her private decryption key kd from it. Thus, the key kd is an absolute secret of
Y. Let us denote by E(∙, ∙) and D(∙, ∙) the ciphering and deciphering, respectively, with an
initially still arbitrary public-key cipher. In this case, if X(avier) wants to send a confiden-
tial message m to Y, X encrypts the message m into the ciphertext c = E(m, ke) using Y’s
public key ke. Receiver Y decrypts using her private key kd and receives the plaintext mes-
sage m = D(c, kd).
We now imagine as a scenario that in the context of a public key cipher two additional
computation rules sig(∙, ∙) and ver(∙, ∙, ∙) would be specified, one for signing and one for
verifying. Participant Y(ollanda) should be able to “sign” her message m in this scenario
by computing a digital signature s = sig(m, kd) using her private key kd. So only partici-
pant Y can generate this signature. Now Y sends the signature s together with the message
m to participant X(avier), where we want to imagine m unencrypted for the moment.
Participant X in this scenario should be able to verify the signature using Y’s public key ke
by computing ver(m, s, ke). If the verification returns an “o. k.”, then X accepts Y’s signa-
ture as “valid”.
If the message text m had been changed by attacker A(rchibald) into another message
m′, then ver(m′, s, ke) would not result in an “o. k.”. So we summarize the scenario again:
• Only participant Y(ollanda) can generate her signature s = sig(m, kd), since only she
knows her private key kd.
• Participant X(avier) can use Y’s public key ke to verify the authenticity of the signature
using ver(m, s, ke).
• Once the signature is verified, X can be sure that the message really came from Y
and has not been altered along the way.
• X can even clearly prove this to any third party, such as Z(izi), since he could not have
calculated the signature himself.
We will now look at how this scenario of a digital signature can be realized on the basis
of the public-key methods RSA and ElGamal.
The RSA signature is quite simple and obvious: it uses exactly the same computational
rules as the RSA cipher itself. Thus, participant Y(ollanda) chooses two different
prime numbers p and q and multiplies them to n = p ∙ q. She also chooses a natural number
e less than (p − 1) ∙ (q − 1), which is coprime with (p − 1) ∙ (q − 1). Using the extended
Euclidean algorithm, she determines a natural number d with 1 = d ∙ e + b ∙ (p − 1) ∙
(q − 1) for some integer b. The public key of Y is then (n, e), her private key is d. Let the
message m again be a natural number smaller than n. Then $m^{ed} = m^{de} = m \pmod{n}$ (Sect. 3.1).
For the digital RSA signature of m, Y computes the remainder of $m^{d}$ modulo n, which
she sends to receiver X(avier) along with the plaintext m. The latter uses Y’s public key (n,
e) and the received value $m^{d} \pmod{n}$ and computes $(m^{d})^{e} = m^{de} \pmod{n}$. Receiver X
considers the signature verified if his computation produces exactly the received message,
i.e. $m = m^{de} \pmod{n}$. The procedure is shown schematically in Fig. 4.3.
Here is an example [Hau1] for the RSA signature. Y(ollanda) chooses n = p ∙ q = 13
∙ 23 = 299, and because of (p − 1) ∙ (q − 1) = 264 she can use e = 5 for her public key (n,
e) = (299, 5). Using the extended Euclidean algorithm, she computes 1 = 53 ∙ 5 − 1 ∙ 264,
so d = 53 is her private key.
If Y wants to sign the message m = 296, she calculates $m^{d} = 296^{53} = 212 \pmod{n = 299}$
(Sect. 3.1) and sends the signature 212 together with the message m = 296. Receiver
X(avier) uses Y’s public key (299, 5) and calculates $m^{de} = 212^{5} = 296 \pmod{299}$. Since this
results in m = 296, X accepts Y’s RSA signature.
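The numbers of this example can be checked with a short sketch. The function names `rsa_sign` and `rsa_verify` are hypothetical, and the code is textbook RSA without hashing or padding, so it only mirrors the arithmetic above.

```python
def rsa_sign(m, d, n):
    """Signature s = m^d mod n, computed with the private key d."""
    return pow(m, d, n)

def rsa_verify(m, s, e, n):
    """Accept if s^e mod n reproduces the received message m."""
    return pow(s, e, n) == m % n

p, q, e = 13, 23, 5
n = p * q                              # 299
d = pow(e, -1, (p - 1) * (q - 1))      # modular inverse (Python 3.8+)
s = rsa_sign(296, d, n)

print(d, s, rsa_verify(296, s, e, n))
```

Python's built-in `pow` with exponent −1 plays the role of the extended Euclidean algorithm here.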
The ElGamal signature, published by Taher ElGamal (b. 1955) together with his public
key cipher in 1984, is a bit more complicated. Participant Y(ollanda) chooses a prime
number p and a generating element g modulo p. She also chooses a natural number a in the
range from 2 to p − 2 and computes $b = g^{a} \pmod{p}$. Her public key is then (p, g, b), her
private key is a. Let the message m again be a natural number smaller than p.
For the digital ElGamal signature of m, Y chooses a random natural number k in the
range from 2 to p − 2, which is coprime with p − 1, and also keeps this secret. Using the
extended Euclidean algorithm, she computes a natural number x with 1 = x ∙ k (mod
p − 1), so that $k^{-1} = x \pmod{p-1}$ holds. As signature, along with the plaintext m, she
sends the residues $u = g^{k} \pmod{p}$ and $v = (m - a \cdot u) \cdot k^{-1} \pmod{p-1}$.
To verify the signature, receiver X(avier) computes $b^{u} \cdot u^{v} \pmod{p}$. He can do this
because, on the one hand, he receives u and v and, on the other hand, he can look up b and
p as Y’s public key. Because of $b = g^{a} \pmod{p}$, $g^{p-1} = 1 \pmod{p}$, and $k^{-1} = x \pmod{p-1}$,
he gets $b^{u} \cdot u^{v} = g^{au} \cdot g^{k(m-au)x} = g^{au} \cdot g^{m-au} = g^{m} \pmod{p}$. As a check, receiver X also
computes the value $g^{m} \pmod{p}$ directly. He can do this since he receives m and can look up g
as Y’s public key. If both results are identical, X considers the signature verified. Figure 4.4
visualizes the procedure.
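A minimal sketch of signing and verifying may make the steps concrete. The parameters p = 467, g = 2, a = 127 and m = 100 are assumptions chosen for illustration and do not come from the text; the range check on u is a common extra precaution that the description above does not mention.

```python
import math
import secrets

def elgamal_sign(m, p, g, a):
    """Return the signature values (u, v) for message m and private key a."""
    while True:
        k = secrets.randbelow(p - 3) + 2           # range 2 .. p - 2
        if math.gcd(k, p - 1) == 1:                # k must be invertible mod p - 1
            break
    u = pow(g, k, p)
    v = (m - a * u) * pow(k, -1, p - 1) % (p - 1)
    return u, v

def elgamal_verify(m, u, v, p, g, b):
    """Check b^u * u^v = g^m (mod p) using the public key (p, g, b)."""
    if not 0 < u < p:                              # extra precaution (assumption)
        return False
    return pow(b, u, p) * pow(u, v, p) % p == pow(g, m, p)

p, g = 467, 2            # toy prime and generating element (illustrative)
a = 127                  # private key
b = pow(g, a, p)         # public key component
m = 100
u, v = elgamal_sign(m, p, g, a)
print(elgamal_verify(m, u, v, p, g, b))
```

Because a fresh k is drawn for every call, signing the same message twice yields different pairs (u, v), both of which verify.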
• With ElGamal two signature values have to be calculated and sent, with RSA only one.
• With ElGamal three modular exponentiations are necessary for verification, with RSA
only one.
• With ElGamal a new random number must be generated each time, with RSA none.
characterizes the message. This fingerprint can then be signed instead of the complete
message m in the cryptographic envelope.
• It should not be possible to find two different messages m and m′ with the same hash
value h(m) = h(m′) in a reasonable amount of time (so-called collision resistance).
• It should not be possible to find a message m with hash value h(m) = y for a randomly
chosen bit string y of length n in a reasonable amount of time (so-called one-way
property).
Digital fingerprints are thus generated by means of cryptographic hash functions whose
length should be at most a few hundred bits. Then, on the one hand, it follows from the
computability that the hash value h(m) and thus also the signature sig(h(m)) of the binary
expanded hash value can be computed much faster than the signature sig(m) of the entire
message m. On the other hand, the collision resistance ensures that the hash value h(m)
uniquely characterizes the message m in principle and thus makes a meaningful signature
possible in the first place. The one-way property primarily concerns the RSA signature. In
this case, an attacker can determine a y for a given z via verification with sig(y) = z. If he
were also additionally able to efficiently compute a message m with h(m) = y, he might
falsely claim that z = sig(h(m)) would be a valid signature for m. Where the one-way prop-
erty is still needed, we see in the secure storage of passwords as their hash values (Sect. 4.8).
We will mention another application of hash functions in a moment. They can also be used
as a MAC (Sect. 4.1) for the authentication of a message m and are then referred to as an
HMAC. It would be obvious to use a hash function h(∙) and a MAC key km to calculate the
hash value h(km‖m). However, this procedure is considered to be insecure.
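As a hedged illustration of how an HMAC avoids the naive h(km‖m), the following sketch builds the nested construction of RFC 2104 by hand and compares it with Python's standard `hmac` module. The text itself does not spell out this construction; the key and message are made-up examples.

```python
import hashlib
import hmac

def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC as in RFC 2104: h((K xor opad) || h((K xor ipad) || m))."""
    block_size = 64                               # SHA-256 works on 512-bit blocks
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()        # overlong keys are hashed first
    key = key.ljust(block_size, b"\x00")          # then padded with zero bytes
    inner_pad = bytes(b ^ 0x36 for b in key)
    outer_pad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(inner_pad + message).digest()
    return hashlib.sha256(outer_pad + inner).digest()

key = b"shared MAC key"                           # illustrative key
msg = b"transfer 100 EUR to Y"                    # illustrative message
tag = hmac_sha256(key, msg)

# The receiver recomputes the tag and compares in constant time.
print(hmac.compare_digest(tag, hmac.new(key, msg, hashlib.sha256).digest()))
```

In practice one would call `hmac.new` directly; the hand-written version is only meant to show the two nested hash evaluations.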
4.3 Hash Value and Secure Hash Algorithm SHA 95
Let us first clarify the principle of one of the most common hash function construction
methods, the Merkle-Damgård construction, which dates back to work by Ralph
Merkle (b. 1952) and Ivan Damgård (b. 1956). One iteratively constructs a hash function
h(∙) of length n using the Merkle-Damgård construction, but requires a compression
function F(∙) that maps bit sequences of length n + r to bit sequences of length n for a
suitable natural number r. We will see below how F(∙) can be chosen. The hash value h(m)
of any message m is then computed in the Merkle-Damgård construction using the com-
pression function F(∙) as follows: First, one decomposes the message m = m1…mt into t bit
blocks mi of length r, appropriately padding at the end for each concrete procedure. Then
one starts with an initial bit sequence h0 of length n, which is also specifically determined
for each concrete procedure, and calculates the value h1 = F(h0‖m1) for the bit sequence
h0‖m1 of length n + r. The compression function F(∙) maps the bit sequence h0‖m1 of
length n + r into a bit sequence h1 of length n. In the next step, one computes h2 = F(h1‖m2)
and in general hi = F(hi − 1‖mi). Finally, the last result ht is the hash value h(m) of length n
for message m. The diagram in Fig. 4.5 illustrates the Merkle-Damgård construction again
[WPMDK].
The construction method was originally proposed by Ralph Merkle in 1979. Ivan Damgård
proved in 1989 that if the message m is suitably prepared, a collision-resistant
compression function F(∙) leads to a collision-resistant hash function h(∙).
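The iteration can be sketched as follows. All concrete choices are assumptions for illustration: the toy sizes n = 32 and r = 64 bits, the padding rule (a 1 bit, zeros, then the message length), the all-zero initial value, and a truncated SHA-256 as a stand-in for the compression function F.

```python
import hashlib

N_BYTES = 4    # hash length n = 32 bits (toy size)
R_BYTES = 8    # block length r = 64 bits (toy size)

def toy_compress(chunk: bytes) -> bytes:
    """Stand-in for F: maps n + r bits to n bits (truncated SHA-256)."""
    assert len(chunk) == N_BYTES + R_BYTES
    return hashlib.sha256(chunk).digest()[:N_BYTES]

def pad(message: bytes) -> bytes:
    """Append 0x80, zero bytes and the 8-byte bit length of the message."""
    padded = message + b"\x80"
    padded += b"\x00" * (-(len(padded) + 8) % R_BYTES)
    return padded + (8 * len(message)).to_bytes(8, "big")

def merkle_damgard(message: bytes, compress=toy_compress,
                   h0: bytes = b"\x00" * N_BYTES) -> bytes:
    """Iterate h_i = F(h_{i-1} || m_i) over all padded blocks."""
    data = pad(message)
    h = h0
    for i in range(0, len(data), R_BYTES):
        h = compress(h + data[i:i + R_BYTES])
    return h                                      # h_t is the hash value h(m)

print(merkle_damgard(b"abc").hex())
```

Including the message length in the padding is one way of "suitably preparing" the message in the sense of Damgård's result.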
Thus, to concretely specify a hash function of length n using the Merkle-Damgård con-
struction, it suffices to specify the compression function F(∙). The main component of F(∙)
is usually a block cipher E(∙, ∙). This can be a standard cipher such as Triple-DES or AES,
but an individually constructed cipher is usually used. Indeed, in hash functions, one
places particular emphasis on simple, fast operations that can be implemented efficiently.
We will look at an example, namely SHA-2-256 (Sect. 4.3 end).
The Davies-Meyer compression function $F_{DM}(\cdot)$ uses the message block $m_i$ of bit
length r as the key for the cipher E(∙, ∙), and the preceding iterated hash value $h_{i-1}$ of
length n as its plaintext. The ciphertext is then added bitwise (⊕) to $h_{i-1}$ to compute
the next iterated hash value $h_i$. The compression function is thus
$F_{DM}(h_{i-1} \| m_i) = E(h_{i-1}, m_i) \oplus h_{i-1}$, which is visualized schematically in Fig. 4.6.
Alternatively, in the Davies-Meyer compression function, one also decomposes
$h_{i-1} = h_{i-1}^{(1)} \ldots h_{i-1}^{(n/w)}$ and $E(h_{i-1}, m_i) = c_i^{(1)} \ldots c_i^{(n/w)}$ into components
$h_{i-1}^{(j)}$ and $c_i^{(j)}$ of smaller bit length w and computes $h_i = h_i^{(1)} \ldots h_i^{(n/w)}$
component-wise as $h_i^{(j)} = c_i^{(j)} \boxplus h_{i-1}^{(j)}$ by adding ⊞ modulo $2^{w}$, where the bit strings
of length w are interpreted as a binary expansion of a natural number.
The Matyas-Meyer-Oseas compression function $F_{MMO}(\cdot)$ proceeds approximately in
reverse. However, the prerequisite for this is that the bit length r of the message blocks $m_i$
is chosen to be equal to the length n of the hash value. The message block $m_i$ is then used
as the plaintext block for the block cipher E(∙, ∙), and the ciphertext is subsequently added
to $m_i$ bitwise (⊕), so as to compute the next iterated hash value $h_i$. The previous iterated hash
value $h_{i-1}$ is used as the key for the block cipher. However, this only works if the block
cipher has equal block and key length. If this is not the case, $h_{i-1}$ is first made suitable
using a suitable function G(∙). Thus, the Matyas-Meyer-Oseas compression function is
$F_{MMO}(h_{i-1} \| m_i) = E(m_i, G(h_{i-1})) \oplus m_i$. Figure 4.7 again visualizes this schematically.
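The two compression functions differ only in which input serves as key and which as plaintext. The sketch below makes this visible with a deliberately simple stand-in cipher: `toy_encrypt` is a hypothetical 32-bit Feistel network that is not secure and accepts keys of any length, so the adapter G can be the identity here.

```python
import hashlib

def toy_encrypt(block: bytes, key: bytes) -> bytes:
    """Toy 32-bit Feistel cipher E(block, key); a stand-in, not secure."""
    assert len(block) == 4
    left, right = block[:2], block[2:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:2]
        left, right = right, bytes(a ^ b for a, b in zip(left, f))
    return left + right

def xor(a: bytes, b: bytes) -> bytes:
    """Bitwise addition of two equally long byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def davies_meyer(h_prev: bytes, m_block: bytes) -> bytes:
    """E(h_prev, m_block) xor h_prev: the message block is the key."""
    return xor(toy_encrypt(h_prev, m_block), h_prev)

def matyas_meyer_oseas(h_prev: bytes, m_block: bytes) -> bytes:
    """E(m_block, G(h_prev)) xor m_block with G = identity (assumption)."""
    assert len(m_block) == len(h_prev)            # requires r = n
    return xor(toy_encrypt(m_block, h_prev), m_block)

h_prev = bytes.fromhex("01234567")
print(davies_meyer(h_prev, b"an arbitrary message block").hex())
print(matyas_meyer_oseas(h_prev, b"blk1").hex())
```

Both functions return n = 32 bits, so either could be plugged into a Merkle-Damgård iteration as F.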
The cryptographic hash functions most frequently used in practice today are those of the
so-called SHA family (Secure Hash Algorithm). The first generation SHA-1 has a length
of 160 bits and was standardized by NIST in 1995. It is based on a Merkle-Damgård con-
struction together with a Davies-Meyer compression function. However, one did not use
an existing, possibly standardized block cipher, but developed it individually for SHA-1.
By 2004, there had been several successful attacks against SHA-1, and it was discovered
that SHA-1 is far less collision resistant than had been theoretically expected.
In response to the attacks that became known, NIST held a workshop in 2005 to discuss
the current status of hash functions. NIST recommended the transition to the SHA-2 hash
functions of the second generation. These are the variants SHA-2-224, SHA-2-256, SHA-2-384
and SHA-2-512, where the appended number indicates the length of the hash value in bits in
each case. They are also based on a Merkle-Damgård construction together with a Davies-Meyer
compression function, although the block cipher has been modified in SHA-2 compared to
SHA-1 [WPSH2]. There have been no relevant attacks on SHA-2 so far, so SHA-2 may
still be considered secure with the exception of the smallest variant SHA-2-224. However,
had SHA-2 also turned out to be compromised or insecure, there would initially have been
no other standardized hash function available that was recognized as secure.
Therefore, it was decided to create a new standard that would take into account current
research. In order to standardize a hash function with a different construction principle,
NIST organized a public competition in 2007 along the lines of the AES selection. The choice
was made in 2012 for the method called Keccak, which was standardized in 2015 as SHA-3 with
variants SHA-3-224, SHA-3-256, SHA-3-384, and SHA-3-512. SHA-3 is constructed in a
fundamentally different way than SHA-2, namely with the help of a so-called sponge
construction [WPSH3].
Since the SHA-2 hash functions have now become the de facto standard, let us take a
closer look at SHA-2-256 [IWS, WPSH2e]. It is a 256-bit hash value computed using a
Merkle-Damgård construction together with the second variant of a Davies-Meyer com-
pression function. The message m = m1…mt is thereby split into blocks mi of length 512
bits, with padding at the end according to rules not described in more detail here. Thus, we
must first describe how the block cipher E(hi − 1, mi) is constructed for an arbitrary index i
and the iterated hash value hi − 1. To do this, consider the 256 bits of hi − 1 composed as
hi − 1 = hi − 1(1)‖…‖hi − 1(8) with eight blocks hi − 1(1), …, hi − 1(8) of 32 bits each. Namely, the
cipher E(∙, ∙) operates on eight 32-bit blocks, has 64 rounds, and uses only simple and fast
operations:
⊕          Bitwise addition
∘          Bitwise multiplication
¬          Bitwise NOT (i.e. ¬ 0 = 1, ¬ 1 = 0)
$R_r^{k}$   Cyclic shifting of a bit string to the right by k positions
$S_r^{k}$   Shifting a bit string to the right by k positions, padded with 0
⊞          Addition modulo $2^{32}$ of 32-bit strings (interpreted as natural numbers)
Derived from this, the following operations are required for bit strings x, y, z of
length 32:
$C(x, y, z) = (x \circ y) \oplus (\neg x \circ z)$
$M(x, y, z) = (x \circ y) \oplus (x \circ z) \oplus (y \circ z)$
$\Sigma_0(x) = R_r^{2}(x) \oplus R_r^{13}(x) \oplus R_r^{22}(x)$
$\Sigma_1(x) = R_r^{6}(x) \oplus R_r^{11}(x) \oplus R_r^{25}(x)$
$\sigma_0(x) = R_r^{7}(x) \oplus R_r^{18}(x) \oplus S_r^{3}(x)$
$\sigma_1(x) = R_r^{17}(x) \oplus R_r^{19}(x) \oplus S_r^{10}(x)$
Figure 4.8 shows how, for the total of 64 rounds of E(∙, ∙), round j for j = 0,…, 63 is
constructed. Here, a,…, h denote the placeholders of 32 bits each, which are initialized
with a = hi − 1(1),…, h = hi − 1(8) and are recalculated and filled accordingly for each round.
The wj result from the message block mi: let mi = w0w1…w15 be the decomposition
of the message block mi into 16 sub-blocks wj with 32 bits each. The remaining wj for
j = 16,…, 63 are recursively computed from wj = σ1(wj − 2) ⊞ wj − 7 ⊞ σ0(wj − 15) ⊞ wj − 16.
The constants k0,…, k63, are as follows:
The kj are represented hexadecimally, whereby 4 bits each are combined to form num-
bers from 0 to 15. The letters stand for the two-digit numbers a = 10, b = 11,…, f = 15.
As a result of the 64 rounds, the procedure thus yields eight blocks ci(1), …, ci(8) with 32
bits each, from which E(hi − 1, mi) = ci(1)‖…‖ci(8) is composed. According to the second
variant of the Davies-Meyer compression function, hi is finally calculated as hi = hi(1)||…
||hi(8), where hi(1) = hi-1(1) ⊞ ci(1), …, hi(8) = hi-1(8) ⊞ ci(8) applies. The initial hash value
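h0 = h0(1)‖…‖h0(8) is given by the eight 32-bit words (in hexadecimal notation)
6a09e667, bb67ae85, 3c6ef372, a54ff53a, 510e527f, 9b05688c, 1f83d9ab, 5be0cd19.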
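For readers who want to trace the computation, here is a compact sketch of SHA-2-256 that is checked against Python's `hashlib`. Two points are assumptions taken from the NIST standard FIPS 180-4 rather than from the text above: the padding rule, and the fact that the constants kj and the initial hash value are the first 32 fractional bits of the cube roots of the first 64 primes and of the square roots of the first 8 primes. The round structure follows the standard and is meant to correspond to Fig. 4.8.

```python
import hashlib
import math

MASK = 0xFFFFFFFF

def first_primes(count):
    """The first `count` primes by trial division."""
    primes, n = [], 2
    while len(primes) < count:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes

def icbrt(n):
    """Integer cube root by binary search."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

K = [icbrt(p << 96) & MASK for p in first_primes(64)]       # constants k_0 .. k_63
H0 = [math.isqrt(p << 64) & MASK for p in first_primes(8)]  # initial hash value

def rotr(x, k):
    """Cyclic shift of a 32-bit word to the right by k positions."""
    return ((x >> k) | (x << (32 - k))) & MASK

def sha256(message: bytes) -> bytes:
    data = message + b"\x80"                                # padding (FIPS 180-4)
    data += b"\x00" * (-(len(data) + 8) % 64)
    data += (8 * len(message)).to_bytes(8, "big")
    h = list(H0)
    for pos in range(0, len(data), 64):                     # one 512-bit block m_i
        w = [int.from_bytes(data[pos + 4 * j:pos + 4 * j + 4], "big")
             for j in range(16)]
        for j in range(16, 64):                             # message schedule
            s0 = rotr(w[j - 15], 7) ^ rotr(w[j - 15], 18) ^ (w[j - 15] >> 3)
            s1 = rotr(w[j - 2], 17) ^ rotr(w[j - 2], 19) ^ (w[j - 2] >> 10)
            w.append((s1 + w[j - 7] + s0 + w[j - 16]) & MASK)
        a, b, c, d, e, f, g, hh = h
        for j in range(64):                                 # the 64 rounds of E
            big1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
            ch = (e & f) ^ (~e & g)
            t1 = (hh + big1 + ch + K[j] + w[j]) & MASK
            big0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
            maj = (a & b) ^ (a & c) ^ (b & c)
            t2 = (big0 + maj) & MASK
            a, b, c, d, e, f, g, hh = ((t1 + t2) & MASK, a, b, c,
                                       (d + t1) & MASK, e, f, g)
        # Davies-Meyer step, second variant: word-wise addition modulo 2^32
        h = [(x + y) & MASK for x, y in zip(h, (a, b, c, d, e, f, g, hh))]
    return b"".join(x.to_bytes(4, "big") for x in h)

print(sha256(b"abc") == hashlib.sha256(b"abc").digest())
```

Deriving the constants with integer arithmetic avoids typing 64 hexadecimal values by hand; production code would of course simply use `hashlib`.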
It remains to be clarified and described what is really calculated and sent in messages in a
cryptographic envelope. You only sign the hash value, i.e. the digital fingerprint of a mes-
sage. And this is how the entire procedure looks in principle when sending and receiving:
• Sender Y(ollanda) chooses a symmetric cipher S(∙, ∙), a public-key cipher E(∙, ∙), a
digital signature sig(∙), and a hash function h(∙), which she agrees on in advance with
receiver X(avier).
• Moreover, Y generates the cipher key k of the symmetric cipher S(∙, ∙) using a random
generator, looks up the public key e of X for the public key cipher E(∙, ∙), and computes
E(k, e).
• Now Y calculates the hash value h(m) for message m and signs it with her private sig-
nature key.
• Finally, to send m secret and signed to receiver X, Y transmits the cryptographic enve-
lope S(m||sig(h(m)), k) together with E(k, e).
• X first deciphers E(k, e) with his private key and then S(m||sig(h(m)), k) with key k. He
thus receives the message m together with the signature of the hash value h(m).
• Now he in turn computes the hash value h(m) of the received message m.
• Finally, X looks up Y’s public signature key and uses it to verify the received signature
sig(h(m)), i.e. he checks whether the verification function for h(m), sig(h(m)) and the
public key does yield an o.k.
• If this is successful, receiver X considers both sender Y and message m to be
authenticated.
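The whole envelope procedure can be played through with toy components. Everything concrete below is an assumption for illustration: the symmetric cipher S is an XOR with a SHA-256 counter keystream, E and sig are textbook RSA without padding, the RSA keys are built from known Mersenne primes (far too structured for real use), and the hash value is reduced modulo n before signing.

```python
import hashlib
import secrets

def rsa_keypair(p, q, e=65537):
    """Return ((n, e), d) for textbook RSA."""
    n = p * q
    return (n, e), pow(e, -1, (p - 1) * (q - 1))

def stream_xor(data: bytes, key: bytes) -> bytes:
    """Toy cipher S: XOR with a keystream; encrypts and decrypts alike."""
    out = bytearray()
    for i in range(0, len(data), 32):
        keystream = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        out += bytes(x ^ y for x, y in zip(data[i:i + 32], keystream))
    return bytes(out)

def hash_int(m: bytes, n: int) -> int:
    """Hash value h(m) as a number below n (toy reduction)."""
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % n

x_pub, x_priv = rsa_keypair(2**61 - 1, 2**89 - 1)     # X's encryption key pair
y_pub, y_priv = rsa_keypair(2**107 - 1, 2**127 - 1)   # Y's signature key pair
SIG_LEN = 32                                          # bytes reserved for sig

def seal(m: bytes):
    """Sender Y: returns S(m || sig(h(m)), k) and E(k, e)."""
    k = secrets.token_bytes(16)
    sig = pow(hash_int(m, y_pub[0]), y_priv, y_pub[0])
    body = stream_xor(m + sig.to_bytes(SIG_LEN, "big"), k)
    wrapped_key = pow(int.from_bytes(k, "big"), x_pub[1], x_pub[0])
    return body, wrapped_key

def open_envelope(body: bytes, wrapped_key: int):
    """Receiver X: recovers m and reports whether the signature verifies."""
    k = pow(wrapped_key, x_priv, x_pub[0]).to_bytes(16, "big")
    plain = stream_xor(body, k)
    m, sig = plain[:-SIG_LEN], int.from_bytes(plain[-SIG_LEN:], "big")
    return m, pow(sig, y_pub[1], y_pub[0]) == hash_int(m, y_pub[0])

body, wrapped_key = seal(b"pay 10 EUR to X")
print(open_envelope(body, wrapped_key))
```

Only the short values k and h(m) pass through the slow public-key operations; the message itself is handled by the fast symmetric cipher.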
We want to concretize the procedure using the example of e-mails. The PGP (Pretty Good
Privacy) program package encrypts and authenticates data and is primarily used
for e-mails. It was originally written by Phil Zimmermann (born 1954) and first
published in 1991. PGP uses both symmetric and public-key ciphers.
The following description of how an e-mail is encrypted and digitally signed with PGP is
schematically visualized in Fig. 4.9.
• To ensure that the message cannot be tampered with and to unambiguously prove the
sender, PGP generates a digital signature of the entire e-mail m. First, the hash value
h(m) is calculated using a hash function h(∙). This creates a unique digital fingerprint
that is much shorter than m itself. Subsequently, the sender’s digital signature sig(h(m))
is generated for this hash value using the sender’s private signature key.
• Now PGP can perform the encryption of the e-mail. To do this, m is first combined with
the digital signature sig(h(m)) to form a data record and subjected to data compression
C(∙). On the one hand, this reduces the size of the data set and, on the other hand, makes
cryptanalysis more difficult by reducing, for example, linguistic redundancy. The com-
pressed data C(m‖sig(h(m))) are now encrypted with a symmetric cipher S(∙, ∙) and a
randomly generated key k to form a ciphertext S(C(m‖sig(h(m))), k).
• The randomly generated key k must also be communicated to the receiver. To do this,
k is encrypted using a public-key cipher E(∙, ∙) and the recipient’s public key e to form
E(k, e) and prefixed to the symmetrically encrypted ciphertext S(C(m‖sig(h(m))), k).
This entire packet is finally sent under PGP.
The PGP program was sold several times over the course of time to various software
houses, from which licenses could be purchased. In 2010, the software was transferred to
the US company Symantec. However, Phil Zimmermann already published the complete
PGP source code in 1995 in the book “PGP Source Code and Internals”. This was pains-
takingly typed out, and on the basis of this the freely available OpenPGP standard was
developed and maintained in parallel with open source software packages. Initiated by
Werner Koch (born 1961), the first version of GPG (GNU Privacy Guard) was published
in 1997. This is also a freely available system developed on the basis of OpenPGP, com-
parable to PGP in structure and range of functions and largely compatible
[WPGPG, WPOPG].
Originally, PGP used DES as the symmetric cipher. In the meantime, however,
PGP/OpenPGP/GPG offers, among others, Triple-DES and AES with 128- or 256-bit keys, each of
which is operated in CFB mode and therefore as a stream cipher with a pseudo-random
sequence. The public key cipher for the exchange of the cipher key and for the digital
signature was only RSA in the original PGP version. In the meantime, PGP/OpenPGP/GPG
can also use ElGamal and ECDH with the P-256 standard for key exchange.
SHA-2-256 and MD5, among others, can be used as hash functions, and data compression
is performed using the ZIP format.
For digital signatures, PGP/OpenPGP/GPG now also provides the DSA and ECDSA
methods as well as the important secp256k1, brainpoolP256r1 and Curve25519 standards
for elliptic curves (which will be explained in Sect. 4.5).
4.4.5 WhatsApp
But first we want to talk about another popular communication service besides SMS and
email, namely WhatsApp. WhatsApp was founded in 2009 and has been part of Facebook
since 2014. Users can exchange text messages as well as image, video and sound files
between two people or in groups via WhatsApp.
WhatsApp has had a comprehensive security concept [WhA] since 2016. This is basi-
cally designed as follows: First, an ECDH key, the so-called identity key, is generated for
each subscriber during the WhatsApp installation for a semi-static Diffie-Hellman key
exchange based on elliptic curves (Sect. 3.7). The public part of the identity key is trans-
ferred to the WhatsApp server, the private part cannot be accessed. For ECDH, the elliptic
curve according to Standard Curve25519 is used (Sect. 4.5).
In order to establish a protected WhatsApp communication, the sender performs a
semi-static ECDH key exchange with the recipient. Both thus have a shared secret, the
so-called master secret, from which a so-called root key is derived. When a WhatsApp
message is to be sent, another ECDH key exchange is performed between the sender and
the recipient, and a so-called chain key is derived from this using the root key, from which
a so-called message key is formed using a hash function. The first substring of 256 bits of
this message key is used as the key for an AES cipher in CBC mode, which is used to
encrypt the WhatsApp message. Unlike PGP, WhatsApp does not use a digital signature to
authenticate messages, but rather an HMAC (Sect. 4.3) based on the SHA-2-256 hash
function and a 256-bit MAC key derived from the next substring of the message key. In
contrast to PGP, which can be used in particular to send contractual documents, the authen-
ticity of a message cannot be proven to third parties with WhatsApp.
Because of the vulnerability of Diffie-Hellman and thus also ECDH against a man-in-
the-middle attack, a procedure for authentication of the communication partners has also
been implemented for WhatsApp, similar to Bluetooth. A hash value is generated from the
participant name and identity key using SHA-2-512, which can be scanned in as a QR
code for verification.
Facebook is also planning a similar security concept for its Instagram photo and video
sharing social networking service.
Let again p be a prime number and g a generating element modulo p. If q is a prime
number dividing p − 1, then we can compute $h = g^{(p-1)/q} \pmod{p}$. Then $h^{q} = 1 \pmod{p}$, and the
powers $1 = h^{0}, h^{1}, h^{2}, h^{3}, \ldots, h^{q-1}$ run through exactly q distinct values modulo p. Given
b, one again calls the exponent a in $b = h^{a} \pmod{p}$ the discrete logarithm of b, but this
time to the base of the nongenerating element h modulo p. It is believed that for
sufficiently large q this problem is as difficult as the discrete logarithm to the base g,
although in fact the $h^{i} \pmod{p}$ run through fewer distinct elements than the $g^{j} \pmod{p}$,
namely q instead of p − 1. In any case, all known methods for calculating discrete
logarithms have an unrealistically long running time with respect to the base h, just
as they have with respect to the base g, provided that q has not been chosen
too small.
In practice, the ElGamal signature (Sect. 4.2), which is more important from a historical
point of view, is hardly ever used. Instead, the standard procedure DSA (Digital Signature
Algorithm) is usually used. This is a more efficient variant of the ElGamal signature,
which was first standardised by NIST in 1994, with the latest revision dating from 2013.
Participant Y(ollanda) chooses prime numbers p and q such that q is a divisor of p − 1.
For a generating element g modulo p, she computes $h = g^{(p-1)/q} \pmod{p}$, chooses a natural
number a in the range from 2 to q − 1, and computes $b = h^{a} \pmod{p}$. Her public key is then
(p, q, h, b), her private key is a.
For the DSA signature of a message m, Y chooses a random natural number k in the
range from 2 to q − 1, which is thus automatically coprime with the prime number q, and
also keeps this secret. Using the extended Euclidean algorithm, she computes a natural
number x with 1 = x ∙ k (mod q), so that $k^{-1} = x \pmod{q}$ holds. As her signature, together
with the plaintext m, she sends on the one hand the residue $u = (h^{k} \bmod p) \bmod q$, where
here first modulo p and then modulo q is calculated, as well as the remainder
$v = k^{-1} \cdot (m + a \cdot u) \pmod{q}$. If u = 0 or v = 0 holds, Y starts again with another random number k.
Receiver X(avier) verifies the authenticity of the signature values u and v as follows:
Using the extended Euclidean algorithm, he computes a natural number y for v with 1 = y
∙ v (mod q), so $v^{-1} = y \pmod{q}$ holds. This is possible because the prime number q is part
of the public key of Y and v is not 0 and is therefore coprime with q. Using the received
message m, he computes the residues $w = m \cdot v^{-1} \pmod{q}$ and $w' = u \cdot v^{-1} \pmod{q}$ and uses
Y’s public key (p, q, h, b) to check whether the residue $(h^{w} \cdot b^{w'} \bmod p) \bmod q$ equals
the first signature value u. If it does, X considers the signature to be verified. This is because
if the DSA signature is correct, then due to $h^{q} = 1 \pmod{p}$, it follows that
$h^{w} \cdot b^{w'} = h^{my} \cdot b^{uy} = h^{y(m + au)} = h^{yvk} = h^{k} \pmod{p}$ holds. In particular, u is then equal to
$(h^{w} \cdot b^{w'} \bmod p) \bmod q$. Figure 4.10 visualizes the procedure.
The NIST standard specifies values for p and q in the order of 1024 and 160 bits, 2048
and 224 bits, 2048 and 256 bits, and 3072 and 256 bits, respectively, although 1024 and
160 bits are no longer recommended today.
In our description, we have signed the message m itself for the sake of simplicity.
However, as we know, you actually sign a hash value of m. In the original NIST standard,
SHA-1 was intended for this purpose, but this is no longer considered completely secure.
Therefore, SHA-2 has also been approved in the meantime.
Although DSA at first seems more complicated than the ElGamal signature, it requires
only two instead of three modular exponentiations for verification. Moreover, the expo-
nents as well as the signature values to be sent are also significantly smaller, since they are
only remainders modulo the smaller prime q.
As an example [DIM], participant Y(ollanda) chooses prime numbers p = 283 and q = 47,
where q divides p − 1 = 282. Also, Y chooses the number h = 60, for which
$h^{q} = 60^{47} = 1 \pmod{p = 283}$, and a = 24, and computes $b = h^{a} = 60^{24} = 158 \pmod{p = 283}$. Therefore,
she registers (p, q, h, b) = (283, 47, 60, 158) as her public key, and her private key is a = 24.
Now suppose Y wants to send the message m = 41 to receiver X(avier) and also wants
to sign it digitally. Then Y chooses a random number k = 15 and calculates
$x = k^{-1} = 15^{-1} = 22 \pmod{q = 47}$. Now Y first computes $h^{k} = 60^{15} = 207 \pmod{p = 283}$ and then
the remainder u = 207 = 19 (mod q = 47). Finally, since u is not equal to 0, Y also computes
the remainder $v = (m + a \cdot u) \cdot k^{-1} = (41 + 24 \cdot 19) \cdot 22 = 30 \pmod{q = 47}$. Since v is also not
equal to 0, Y sends the signature values u = 19 and v = 30 to X along with the plaintext m = 41.
Receiver X first calculates $y = v^{-1} = 30^{-1} = 11 \pmod{q = 47}$. With this in turn, he
calculates the residues $w = m \cdot v^{-1} = 41 \cdot 11 = 28 \pmod{q = 47}$ and
$w' = u \cdot v^{-1} = 19 \cdot 11 = 21 \pmod{q = 47}$. This ultimately shows that
$h^{w} \cdot b^{w'} = 60^{28} \cdot 158^{21} = 106 \cdot 42 = 207 \pmod{p = 283}$ and hence 207 = 19 (mod q = 47).
Since this yields the signature value u = 19 as a remainder, X considers Y’s DSA signature
to be verified.
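The example can be reproduced with a few lines of code. The function names are hypothetical, the message is signed directly instead of a hash value (as in the example), and the range check in the verification is an added precaution.

```python
def dsa_sign(m, k, p, q, h, a):
    """Signature values (u, v) for the given random number k."""
    u = pow(h, k, p) % q                      # first modulo p, then modulo q
    v = pow(k, -1, q) * (m + a * u) % q
    return u, v                               # retry with a new k if u or v is 0

def dsa_verify(m, u, v, p, q, h, b):
    """Check whether (h^w * b^w' mod p) mod q equals u."""
    if not (0 < u < q and 0 < v < q):
        return False
    y = pow(v, -1, q)                         # inverse of v modulo q
    w, w_prime = m * y % q, u * y % q
    return pow(h, w, p) * pow(b, w_prime, p) % p % q == u

p, q, h, a = 283, 47, 60, 24
b = pow(h, a, p)                              # public key component
u, v = dsa_sign(41, 15, p, q, h, a)
print(b, u, v, dsa_verify(41, u, v, p, q, h, b))
```

A real implementation would draw k at random for every signature and never reuse it.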
At this point a reference to the Diffie-Hellman key exchange is necessary (Sect. 3.5). In
practice this can be, and usually is, done with a non-generating element $h = g^{(p-1)/q} \pmod{p}$
instead of the generating element g modulo p. So the prime numbers p and q as well as h
are publicly known. The procedure then looks quite analogous, as follows:
In the semi-static variant of Diffie-Hellman key exchange, one of the two, say participant
Y(ollanda), has a public key $(p, q, h, b = h^{a})$ and a private key f = a, but participant X(avier)
does not.
We have already transferred the Diffie-Hellman key exchange as ECDH (Sect. 3.7) as well
as the ElGamal cipher (Sect. 3.8) to elliptic curves. DSA also has a variant based on
elliptic curves, which is called ECDSA (Elliptic Curve Digital Signature Algorithm). Let p be
a prime number and $y^{2} = x^{3} + r \cdot x + s$ an elliptic curve modulo p. Furthermore, let G be a
base point on the curve with the highest possible order o, and let q be a prime
number that divides o. Then for the point H = (o/q) ∙ G, we have q ∙ H = q ∙ (o/q) ∙ G = o
∙ G = O, and H therefore has order q. Participant Y(ollanda) chooses a natural number b in
the range from 2 to q − 1 and computes the point B = b ∙ H. Her public key is (p, r, s, H,
q, B), and the private one is b.
If Y wants to send the message m to X(avier), she first chooses a random natural
number k in the range from 2 to q − 1, computes k ∙ H = (x0, y0) with remainders x0 and y0
modulo p, and determines u = x0 (mod q). If u = 0, she chooses a different k. Since q is a
prime number and hence coprime with k, she uses the extended Euclidean algorithm to
determine a natural number x with x ∙ k = 1 (mod q), i.e., $k^{-1} = x \pmod{q}$. Further, Y
computes the residue $v = k^{-1} \cdot (m + b \cdot u) \pmod{q}$. If v = 0, she starts the procedure with a new
k. Finally, as a signature, she sends, together with the plaintext m, the residues u = x0 (mod
q) and $v = k^{-1} \cdot (m + b \cdot u) \pmod{q}$.
To verify the signature values, receiver X computes a natural number y with 1 = y ∙ v
(mod q) using the extended Euclidean algorithm, so that $v^{-1} = y \pmod{q}$ holds. This is
possible because q is part of the public key of Y and because v is not 0 and therefore is
coprime with the prime number q. Finally, he computes the curve point
$A = v^{-1} \cdot (m \cdot H + u \cdot B) = (x_1, y_1)$ with residues x1 and y1 modulo p. If A is the point O, then X does not
accept the signature. Otherwise, he determines t = x1 (mod q) and then considers the
signature as verified if t = u. Indeed, if the signature really originates from Y, then for her
private key b and because of q ∙ H = O, it follows first that
$A = v^{-1} \cdot (m \cdot H + u \cdot B) = v^{-1} \cdot (m + u \cdot b) \cdot H = k \cdot H$ and hence t = u.
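A toy run of ECDSA needs only point addition and scalar multiplication. The curve below is an assumption chosen for illustration and is far too small for any real use: y² = x³ + 2x + 2 modulo p = 17 with the point H = (5, 1) of prime order q = 19. The private key b = 7 and the message value 10 are likewise made up.

```python
import secrets

P, R, S = 17, 2, 2            # toy curve y^2 = x^3 + 2x + 2 (mod 17)
H, Q = (5, 1), 19             # point H and its prime order q
O = None                      # the point at infinity

def add(A, B):
    """Point addition on the curve."""
    if A is O:
        return B
    if B is O:
        return A
    (x1, y1), (x2, y2) = A, B
    if x1 == x2 and (y1 + y2) % P == 0:
        return O                                       # A + (-A) = O
    if A == B:
        lam = (3 * x1 * x1 + R) * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P

def mul(n, A):
    """Scalar multiple n * A by double-and-add."""
    result = O
    while n > 0:
        if n & 1:
            result = add(result, A)
        A, n = add(A, A), n >> 1
    return result

def ecdsa_sign(m, b):
    """Signature values (u, v) for message number m and private key b."""
    while True:
        k = secrets.randbelow(Q - 2) + 2               # range 2 .. q - 1
        x0, _ = mul(k, H)
        u = x0 % Q
        v = pow(k, -1, Q) * (m + b * u) % Q
        if u != 0 and v != 0:                          # otherwise try a new k
            return u, v

def ecdsa_verify(m, u, v, B):
    """Accept if the x-coordinate of v^-1 * (m*H + u*B) equals u mod q."""
    if not (0 < u < Q and 0 < v < Q):
        return False
    y = pow(v, -1, Q)
    A = mul(y, add(mul(m % Q, H), mul(u, B)))
    return A is not O and A[0] % Q == u

b = 7                          # private key (illustrative)
B = mul(b, H)                  # public key point
u, v = ecdsa_sign(10, b)
print(ecdsa_verify(10, u, v, B))
```

With such a tiny q, a forged message can verify by coincidence; the standardized curves listed below use values of q with about 256 bits.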
Of course, again in reality a hash value of m is signed. ECDSA is a formal translation
of the DSA procedure, whereby the role of h and b is taken over by the points H and B.
For security reasons, the BSI guideline [BSI1] recommends key lengths for p and q in the
order of 2000 and 250 bits, respectively, for the DSA signature and also for Diffie-Hellman
with non-generating element. However, with increasing computer performance, the BSI
recommends using prime numbers p with a length of 3000 bits for a deployment period
beyond 2022. For ECDSA, a key length for q of at least 250 bits is recommended.
In standard procedures based on ECDH or ECDSA, the parameters (p, r, s, H, q) are
predefined and are thus effectively part of the algorithm. Thus, only B is the public key and b
is the private key. In addition to the standard P-256 (Sect. 3.7), we also want to explicitly
list the two standards secp256k1 and brainpoolP256r1 with their parameters (p, r, s, G, o).
For all above mentioned standards G = H is a base point of prime order o = q.
The EC standard secp256k1 was proposed by the SECG (Standards for Efficient
Cryptography Group) [BeL, SEC]:
• prime number p
–– p = FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
FFFFFFFE FFFFFC2F
• Elliptic curve $y^{2} = x^{3} + 7$ (i.e. r = 0 and s = 7)
• Base point G = (xG, yG) of prime order o
–– xG = 79BE667E F9DCBBAC 55A06295 CE870B07 029BFCDB 2DCE28D9
59F2815B 16F81798
–– yG = 483ADA77 26A3C465 5DA4FBFC 0E1108A8 FD17B448 A6855419
9C47D08F FB10D4B8
–– o = FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE BAAEDCE6 AF48A03B
BFD25E8C D0364141
The EC standard brainpoolP256r1 has the following parameters:
• prime number p
–– p = A9FB57DB A1EEA9BC 3E660A90 9D838D72 6E3BF623 D5262028
2013481D 1F6E5377
• Elliptic curve $y^{2} = x^{3} + r \cdot x + s$
–– r = 7D5A0975 FC2C3057 EEF67530 417AFFE7 FB8055C1 26DC5C6C
E94A4B44 F330B5D9
–– s = 26DC5C6C E94A4B44 F330B5D9 BBD77CBF 95841629 5CF7E1CE
6BCCDC18 FF8C07B6
• Base point G = (xG, yG) of prime order o
–– xG = 8BD2AEB9 CB7E57CB 2C4B482F FC81B7AF B9DE27E1 E3BD23C2
3A4453BD 9ACE3262
–– yG = 547EF835 C3DAC4FD 97F8461A 14611DC9 C2774513 2DED8E54
5C1D54C7 2F046997
–– o = A9FB57DB A1EEA9BC 3E660A90 9D838D71 8C397AA3 B561A6F7
901E0E82 974856A7
The parameters are given in hexadecimal notation, in which each group of 4 bits is written as
one digit with a value from 0 to 15. The letters stand for the two-digit values A = 10, B = 11,…, F = 15.
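As a plausibility check of the parameter lists above, the following minimal Python sketch converts the hexadecimal strings into integers and tests whether the base point G satisfies the respective curve equation modulo p. The helper names hx, CURVES and on_curve are chosen here for illustration only and are not part of any standard.

```python
def hx(s: str) -> int:
    """Convert a hexadecimal string (spaces allowed) into an integer."""
    return int(s.replace(" ", ""), 16)

# Parameters copied from the lists above: y^2 = x^3 + r*x + s (mod p)
CURVES = {
    "secp256k1": {
        "p": hx("FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE FFFFFC2F"),
        "r": 0,
        "s": 7,
        "xG": hx("79BE667E F9DCBBAC 55A06295 CE870B07 029BFCDB 2DCE28D9 59F2815B 16F81798"),
        "yG": hx("483ADA77 26A3C465 5DA4FBFC 0E1108A8 FD17B448 A6855419 9C47D08F FB10D4B8"),
    },
    "brainpoolP256r1": {
        "p": hx("A9FB57DB A1EEA9BC 3E660A90 9D838D72 6E3BF623 D5262028 2013481D 1F6E5377"),
        "r": hx("7D5A0975 FC2C3057 EEF67530 417AFFE7 FB8055C1 26DC5C6C E94A4B44 F330B5D9"),
        "s": hx("26DC5C6C E94A4B44 F330B5D9 BBD77CBF 95841629 5CF7E1CE 6BCCDC18 FF8C07B6"),
        "xG": hx("8BD2AEB9 CB7E57CB 2C4B482F FC81B7AF B9DE27E1 E3BD23C2 3A4453BD 9ACE3262"),
        "yG": hx("547EF835 C3DAC4FD 97F8461A 14611DC9 C2774513 2DED8E54 5C1D54C7 2F046997"),
    },
}

def on_curve(c: dict) -> bool:
    """Check whether the base point G = (xG, yG) lies on the curve."""
    x, y, p = c["xG"], c["yG"], c["p"]
    return (y * y - (x ** 3 + c["r"] * x + c["s"])) % p == 0

for name, c in CURVES.items():
    print(name, "bit length of p:", c["p"].bit_length(), "G on curve:", on_curve(c))
```

The check only confirms that the listed numbers are consistent with the curve equation; it says nothing about the order o of G.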
So far we have only encountered elliptic curves in the so-called Weierstrass form
y^2 = x^3 + r ∙ x + s. The EC standard Curve25519, which has not yet been explicitly
mentioned, uses a different representation of elliptic curves, in this case
y^2 = x^3 + 486662 ∙ x^2 + x modulo the prime p = 2^255 − 19. Curve25519 was proposed in 2005 by
Daniel Bernstein (born 1971). Unlike the usual form of elliptic curves, it allows the use
of algorithms immune to so-called side-channel attacks. Such attacks do not target the
cryptographic cipher itself, but a specific implementation in a device (e.g. a chip card)
[WPC29].
We would now like to discuss the data security of online banking. Online banking is the
processing of banking transactions with the help of laptops, tablets or smartphones, where
you can access the bank computer directly from home or on the road via the Internet.
Online banking can be used to make account inquiries, but also to make transfers.
108 4 Digital Signature
With the conventional security concept, the so-called 2-factor authentication, each
account-moving transaction, such as a transfer, additionally requires a new TAN
(transaction number), which the bank sends to the customer via a separate channel.
With the so-called TAN list, the customer receives a list of different TANs from his
bank by post, one of which must be entered for each transfer order in online banking.
Once a TAN has been used, it has expired and can no longer be used. As a moderate further
development, indexed TAN lists (so-called iTAN) were introduced, in which the TANs are
numbered consecutively. As part of the online transfer order, the customer is requested to enter
the TAN corresponding to the number displayed. However, TAN lists and the iTAN
procedure have been discontinued since the end of 2019.
A still common procedure is called mTAN (mobile TAN). After the transfer entered in
online banking has been transmitted securely over the Internet, the bank sends the customer a
TAN that can only be used for this transaction, together with (parts of) the target account
number, by SMS to his mobile phone. The transfer order must then be confirmed with this
TAN within a few minutes before it is actually executed. In another variant, the eTAN
procedure, the customer uses a TAN generator. This generates a TAN, which is likewise only
valid for a short period of time, based on a displayed barcode and the inserted bank card of
the customer as well as the current date and time. The bank can use the same algorithm to
check the TAN entered.
The use of the TAN list and iTAN at least ensures that a passive attacker cannot repeat-
edly use the TAN once it has been intercepted. But why have these procedures been phased
out? The reason is that they are vulnerable to a man-in-the-middle attack. These attacks
are carried out via Trojans on the customer’s device. The scenario then looks like this:
• The fraudster foists an Internet address on his victim’s device that leads to a fictitious
page and makes the customer believe that he has official access to his bank’s online
banking. The customer falls for the trick and, by entering his access data, makes them
available to the fraudster.
• The fraudster uses the access data and simultaneously establishes a connection to the
bank’s real online banking.
• Meanwhile, the customer enters the data for a bank transfer in the fake system.
• The fraudster also starts a transfer process from the customer’s account at the real
online banking, but with a large amount to an account held in the Cayman Islands. In
the last step, the fraudster is asked to enter the TAN (for the displayed number) for
authorization.
• The fraudster then asks the customer in the fake system to enter the TAN (for this
number).
• The customer enters the correct TAN from his list, the fraudster receives it, enters it as
part of his own ongoing transfer process and successfully authorises the transfer in
this way.
If two separate end devices are used for the mTAN procedure, e.g. a laptop for the transfer
and a smartphone for SMS, this can be considered sufficiently secure, as the probability
that both end devices are infected is low. However, the whole thing becomes critical if the
fraudster can hack into both communications via a man-in-the-middle attack and also
change the content of the SMS. The 2-factor authentication is merely a doubled verification
based on the static PIN and the dynamic TAN.
To use the more modern security concept RAH-7 and RAH-9 available with FinTS, the
customer needs a special individual chip card (ICC), which is delivered by the bank in a
secure way, and a chip card reader, which must be connected to the customer’s computer.
For the time being, however, the RAH-10 software solution installed on the customer’s
computer itself is also possible. Figure 4.11 visualises the process of a bank transfer with
FinTS and digital signature described below:
Fig. 4.11 Schematic workflow for online banking with digital signature
• To initiate a bank transfer, the customer enters the online banking of his bank on the
Internet, logs in with user name and PIN and prepares the transfer there.
• He then inserts his chip card into the card reader and enters his PIN again for security.
Thereupon
–– the digital signature of a digital fingerprint of the transfer is made using the private
key of the customer stored on the chip card,
–– a key is randomly generated and the entire transfer is encrypted using a symmetric
cipher, and
–– this key is in turn encrypted with the public key of the bank and attached.
• The transfer is then sent in this form to the bank server via a TLS-secured Internet connection.
• As soon as the bank receives the transfer, it decrypts it and checks the signature using
the customer’s public key.
• Only if this check is successful is the transfer order actually executed.
This procedure is not only tap-proof, but also tamper-proof against a man-in-the-middle
attack. However, the FinTS standard still allows the mTAN and eTAN procedures as
alternatives.
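The workflow of signing, symmetric encryption and key wrapping described above can be illustrated with a small Python sketch. It is deliberately a toy model and not the FinTS implementation: the RSA keys are built from Mersenne primes (completely insecure, chosen only because they are known primes), RSA is used without any padding, and the function keystream_xor is a hypothetical stand-in for the symmetric cipher, since the Python standard library contains no AES. All function names are assumptions made for this illustration.

```python
import hashlib
import secrets

def toy_rsa_key(p: int, q: int, e: int = 2**16 + 1):
    """Return (n, e, d) for the given primes (requires Python 3.8+ for pow(e, -1, m))."""
    return p * q, e, pow(e, -1, (p - 1) * (q - 1))

# Toy keys from Mersenne primes: insecure, for illustration only
cust_n, cust_e, cust_d = toy_rsa_key(2**521 - 1, 2**607 - 1)
bank_n, bank_e, bank_d = toy_rsa_key(2**1279 - 1, 2**2203 - 1)

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Stand-in for the symmetric cipher (NOT AES-CBC): XOR with a SHA-256 keystream."""
    out = bytearray()
    for i in range(0, len(data), 32):
        pad = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        out += bytes(x ^ y for x, y in zip(data[i:i + 32], pad))
    return bytes(out)

def customer_send(transfer: bytes):
    # 1. sign the digital fingerprint with the customer's private key
    fingerprint = int.from_bytes(hashlib.sha256(transfer).digest(), "big")
    signature = pow(fingerprint, cust_d, cust_n)
    # 2. encrypt the transfer with a randomly generated 256-bit key
    session_key = secrets.token_bytes(32)
    ciphertext = keystream_xor(session_key, transfer)
    # 3. encrypt this key with the bank's public key and attach it
    wrapped_key = pow(int.from_bytes(session_key, "big"), bank_e, bank_n)
    return ciphertext, wrapped_key, signature

def bank_receive(ciphertext: bytes, wrapped_key: int, signature: int):
    session_key = pow(wrapped_key, bank_d, bank_n).to_bytes(32, "big")
    transfer = keystream_xor(session_key, ciphertext)
    fingerprint = int.from_bytes(hashlib.sha256(transfer).digest(), "big")
    return transfer, pow(signature, cust_e, cust_n) == fingerprint

order = b"Transfer 100 EUR to account 12345"
ciphertext, wrapped_key, signature = customer_send(order)
received, signature_ok = bank_receive(ciphertext, wrapped_key, signature)
print(received == order, signature_ok)
```

If an attacker changes the ciphertext on the way, the fingerprint computed by the bank no longer matches the signed one, and the order is rejected.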
We now want to list the procedures that are used in FinTS [BDB]. While in version 3.0
Triple-DES was allowed as an alternative symmetric cipher for encrypting the transfer
data, in version 4.0 only AES with a key length of 256 bits is allowed. The key is a freshly
generated 256-bit random number. AES encryption takes place in CBC operating mode.
The public key cipher both for encrypting the AES key and for signing transfer data is
RSA. The customer keys are generated by the processor on the individual chip card (ICC).
For this purpose, prime numbers p and q are generated and the RSA modulus n = p ∙ q is
calculated based on the currently recommended bit length of about 2000 bits. The second
public key component e is fixed as the prime number e = 2^16 + 1, while the private key d is
computed using the extended Euclidean algorithm from 1 = d ∙ e + b ∙ (p − 1) ∙ (q − 1).
As a rule, when the customer accesses FinTS for the first time, the public keys of the cus-
tomer and the bank are mutually exchanged. Alternatively, this is also possible on data
carriers. The private customer key d is stored in an area of the chip from which it cannot
be read and therefore never leaves the chip card.
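The computation of the private key d from the equation 1 = d ∙ e + b ∙ (p − 1) ∙ (q − 1) can be sketched in a few lines of Python. The primes used here are two small Mersenne primes, chosen only so that the example is reproducible; they are an assumption for illustration and far too small and too special for real keys. The function names are likewise hypothetical.

```python
def extended_gcd(a: int, b: int):
    """Return (g, x, y) with g = gcd(a, b) = x*a + y*b (iterative extended Euclid)."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        quot, a, b = a // b, b, a % b
        x0, x1 = x1, x0 - quot * x1
        y0, y1 = y1, y0 - quot * y1
    return a, x0, y0

def rsa_private_exponent(p: int, q: int, e: int = 2**16 + 1) -> int:
    """Solve 1 = d*e + b*(p-1)*(q-1) for d and reduce d to the range 0..phi-1."""
    phi = (p - 1) * (q - 1)
    g, d, _ = extended_gcd(e, phi)
    if g != 1:
        raise ValueError("e must be coprime to (p - 1)(q - 1)")
    return d % phi

p, q = 2**61 - 1, 2**89 - 1      # toy primes, NOT suitable for real use
e = 2**16 + 1
n = p * q
d = rsa_private_exponent(p, q, e)

m = 123456789
print(pow(pow(m, e, n), d, n) == m)   # encrypting and decrypting returns m
```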
The hash function SHA-2-256 is used to generate the digital fingerprint of the transfer
data. In addition, the transfer data is compressed using the zip format before being sent.
signature, the so-called blind signature. Here, X(avier) acquires a valid digital signature
from Y(ollanda) for an information content that Y cannot even recognize when signing.
Such a method was first proposed in 1981 by David Chaum (born 1955) based on the
RSA cipher. For this, let (n, e) be Y’s RSA public key and d be her private key. X chooses
a random natural number r in the range from 2 to n − 1 that is coprime to n and keeps
it secret. Further, X computes the value r^(−1) = x (mod n) by using the extended Euclidean
algorithm to determine the linear combination 1 = x ∙ r + y ∙ n. Using the random number r, X
modifies his text m to be signed to m′ = r^e ⋅ m (mod n) and submits the “blinded” m′ to Y
for digital signature. Since r was chosen at random, Y cannot infer m from m′. “Faithfully”,
Y signs the blinded text m′ with her private key d, thus calculating sig(m′) = (m′)^d (mod n),
and submits this signature to X. The latter multiplies Y’s signature by r^(−1) modulo n, thereby
obtaining
r^(−1) ⋅ sig(m′) = r^(−1) ⋅ (m′)^d = r^(−1) ⋅ (r^e ⋅ m)^d = r^(−1) ⋅ r^(ed) ⋅ m^d = r^(−1) ⋅ r ⋅ m^d = m^d = sig(m) (mod n)
and in this way has “unblinded” the blinded signature again.
Therefore, X indeed has the digital signature sig(m) = r^(−1) ⋅ sig(m′) (mod n) of Y for the
original text m. Everyone can verify this by using the public key (n, e) of Y and checking
whether (sig(m))^e = (m^d)^e = m (mod n) really holds. Figure 4.12 illustrates the procedure
graphically.
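The individual steps of the blind signature (blinding, signing, unblinding, verifying) can be traced with the following Python sketch. The RSA key is again a toy key built from two small Mersenne primes, which is an assumption for illustration only; the function names blind, sign_blinded and unblind are likewise chosen here and do not come from any library.

```python
import math
import secrets

# Y's toy RSA key (insecure Mersenne primes, for illustration only)
p, q = 2**61 - 1, 2**89 - 1
n, e = p * q, 2**16 + 1
d = pow(e, -1, (p - 1) * (q - 1))     # requires Python 3.8+

def blind(m: int):
    """X: choose r coprime to n and compute m' = r^e * m (mod n)."""
    while True:
        r = secrets.randbelow(n - 2) + 2          # random r with 2 <= r <= n - 1
        if math.gcd(r, n) == 1:
            return (pow(r, e, n) * m) % n, r

def sign_blinded(m_blinded: int) -> int:
    """Y: sig(m') = (m')^d (mod n), without learning m."""
    return pow(m_blinded, d, n)

def unblind(sig_blinded: int, r: int) -> int:
    """X: sig(m) = r^(-1) * sig(m') (mod n)."""
    return (pow(r, -1, n) * sig_blinded) % n

m = int.from_bytes(b"pay 100", "big")
m_blinded, r = blind(m)
sig = unblind(sign_blinded(m_blinded), r)

print(sig == pow(m, d, n))     # same value as a direct signature of m
print(pow(sig, e, n) == m)     # public verification with (n, e)
```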
But who would blindly sign “faithfully” in Chaum’s procedure? Yes, there are such situa-
tions. Chaum has used his concept of the blind signature to define a so-called cryptocur-
rency. This is understood to be a new manifestation of money, in addition to the classical
banknotes and coins of the central bank and the book money of commercial banks. A
cryptocurrency is held in digitally encrypted form, stored on a server or computers some-
where in the network or in a cloud. In addition to more practical requirements such as
user-friendliness and availability, from a cryptographic point of view, cryptocurrencies are
required to be counterfeit-proof and verifiable. If one adds the request to be anonymous
here, there can no longer even be a kind of serial number as with banknotes, but it is then
more of a digital coin system.
Chaum conceived of his cryptocurrency, which he called eCash, as a digitally stored
claim on a financial institution. So in his model, banks issue eCash shares, which custom-
ers can purchase with “normal” money to make purchases. For a piece of eCash worth say
US$100, Bank B issues the following specifications:
• a specially created public RSA key (n, e), but keeps the associated private key d
secret, and
• a redundancy scheme according to which digital data sets must be prepared.
As a simple example, such a redundancy scheme could look like this: the data record must
consist of a digital string of length 5, which must then be repeated twice, e.g. 11001 11001
11001. In order to purchase a piece of eCash worth US$100 from bank B, customer C has
bank B blindly sign a digital string m prepared according to the redundancy scheme with
the RSA key. The bank then debits the US$100 from his account. The value g = sig(m) = m^d
(mod n) is then the piece of eCash worth US$100 that customer C purchased from bank
B. It is clear that g as a digital signature could only be generated using Bank B’s private
key d (counterfeit-proof) and that anyone can verify the authenticity of g using Bank B’s
public key (n, e) and the published redundancy scheme (verifiable). Moreover, g does not
reveal any information about C; not even Bank B knows that it actually signed m and thus
issued the piece of eCash g = sig(m) to C (anonymous).
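How a recipient could check such a piece of eCash can be sketched as follows. The sketch assumes the simple redundancy scheme from the example (a string of 5 bits written three times in a row) and a toy RSA key of the bank built from small Mersenne primes; both the key and the function names are assumptions for illustration. A scheme this short would of course be far too weak in practice.

```python
# Bank B's toy RSA key for coins worth US$100 (insecure, for illustration only)
p, q = 2**61 - 1, 2**89 - 1
n, e = p * q, 2**16 + 1
d = pow(e, -1, (p - 1) * (q - 1))     # requires Python 3.8+

def is_well_formed(bits: str) -> bool:
    """Redundancy scheme of the example: 5 bits, written three times in a row."""
    return len(bits) == 15 and bits[:5] == bits[5:10] == bits[10:]

def verify_coin(g: int) -> bool:
    """Recover m = g^e (mod n) with the public key and check the redundancy."""
    m = pow(g, e, n)
    return m < 2**15 and is_well_formed(format(m, "015b"))

m = int("11001" * 3, 2)       # data record 11001 11001 11001
coin = pow(m, d, n)           # g = sig(m), normally obtained via blind signature

print(verify_coin(coin))              # the genuine coin is accepted
print(verify_coin((coin + 1) % n))    # an arbitrary other number is rejected
```

The redundancy is what makes the coin verifiable: without it, any number g would decrypt to some m and could be passed off as a coin.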
This is a brief sketch of the theory, and we do not want to go into any further details of
the technical and organisational implementation of eCash. For the commercial marketing
of his financial product, Chaum founded the company DigiCash at the beginning of the
1990s, which was able to win Deutsche Bank and Credit Suisse, among others, as European
eCash licensees. However, Chaum was probably too far ahead of his time with the crypto-
currency eCash. In any case, DigiCash went bankrupt at the end of the 1990s.
There are now many different cryptocurrencies on the market, which are usually much
more complicated in structure and implementation than the comparatively simple eCash.
Probably the best known, namely Bitcoin (BTC), was invented in 2009 under the pseud-
onym Satoshi Nakamoto and traded publicly for the first time. Bitcoin currently has a
market share of over 50%. It is followed by the cryptocurrencies Ripple/XRP and Ether/
ETH. The conversion rate of Bitcoins into other means of payment is determined by sup-
ply and demand. Thus, unlike eCash, Bitcoin does not require commercial banks.
Bitcoins are exchanged electronically between the parties involved in the trade. A digital
signature scheme is required for authentication. Bitcoin uses ECDSA with the standard
secp256k1 (Sect. 4.5) and the hash function SHA-2-256 (Sect. 4.3). Each participant
needs an ECDSA public key and a private key. The public key also serves as the basis for
establishing the identity of the participant and thus, as a random-looking string, simultaneously
guarantees the desired anonymity. Each participant must import their private key
into their so-called Bitcoin wallet and then store it securely. In addition, the Bitcoin wallet
also holds their current account balance, so it must be protected with a strong password.
Bitcoin deliberately dispenses with any central authority (such as central banks or com-
mercial banks) that could mediate financial transactions; instead, these are to take place
directly between the participants involved. However, if data were stored in the cloud or on
the Internet, one would have to access at least one or even several servers. Therefore,
Bitcoin data storage is decentralized in a so-called P2P network (peer-to-peer), where
each participant is directly connected to others. The prerequisite for this is the installation
of free software.
The Bitcoin blockchain consists of a chain of blocks, each of which contains a certain
number of Bitcoin transactions. New blocks are created in a computationally intensive
process called mining, and then distributed to participants via the Bitcoin P2P network.
Mining, meanwhile, consumes large amounts of energy.
Within a Bitcoin block, two transactions are combined in pairs and a common hash
value is calculated for their individual hash values. The resulting hash values are again
combined in pairs and a common hash value is calculated for each. In this way, a tree
structure of hash values is successively created, the last of which, the so-called root hash,
is stored in the block header together with a time stamp. The blocks are linked to each
other using the hash values of their header data. This is done in such a way that the hash
value of the header data of the previous block is also written to the header of each block.
The procedure is shown schematically in Fig. 4.13. Thus the sequence of the blocks is
clearly defined. In addition, the subsequent modification or even deletion of previous
blocks or transactions is practically excluded, since the hash values of all subsequent
blocks would also have to be recalculated in a short time.
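The pairwise hashing of transactions up to the root hash and the linking of the blocks via their header hashes can be sketched in Python as follows. This is a simplified model under stated assumptions: a single SHA-256 is used as hash function, the header consists only of the previous header hash, the root hash and a time stamp, and for an odd number of hash values the last one is duplicated. The real Bitcoin block format contains further fields; the function names are chosen here for illustration.

```python
import hashlib

def H(data: bytes) -> bytes:
    """Hash function of the sketch (single SHA-256, a simplification)."""
    return hashlib.sha256(data).digest()

def merkle_root(transactions: list) -> bytes:
    """Combine hash values in pairs until only the root hash remains (non-empty list)."""
    level = [H(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2:                 # odd number: duplicate the last hash value
            level.append(level[-1])
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def make_header(prev_header_hash: bytes, transactions: list, timestamp: int) -> bytes:
    """Simplified header: hash of previous header, root hash, time stamp."""
    return prev_header_hash + merkle_root(transactions) + timestamp.to_bytes(8, "big")

block1 = make_header(bytes(32), [b"tx a", b"tx b"], 1)
block2 = make_header(H(block1), [b"tx c", b"tx d", b"tx e"], 2)

# Changing a transaction in block 1 changes its header hash,
# so the reference stored in block 2 no longer matches.
forged1 = make_header(bytes(32), [b"tx a", b"tx X"], 1)
print(H(block1) == block2[:32], H(forged1) == block2[:32])
```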
at least the system administrator has authorized access to this, but possibly also other
unauthorized attackers. The solution to this problem is once again a cryptographic hash
function, i.e. the password or PIN is only stored in the form of its hash value. Under the
SHA-1 hash function, for example, two popular passwords look like this [Rau]:
admin1234 7B902E6FF1DB9F560443F2048974FD7D386975B0
password 5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8
The 160-bit string of SHA-1 is again represented here in hexadecimal. Here, 4 bits are
combined into numbers from 0 to 15, whereby the two-digit hexadecimal numbers are
written as letters, namely A = 10, B = 11,…, F = 15.
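Such hash values are easy to reproduce, for example with the module hashlib from the Python standard library. The function name sha1_hex is chosen here for illustration; the sketch assumes that the password is encoded as UTF-8 before hashing.

```python
import hashlib

def sha1_hex(password: str) -> str:
    """Return the SHA-1 hash value of a password as 40 hexadecimal digits."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()

for pw in ("admin1234", "password"):
    print(pw, sha1_hex(pw))
```

Each output consists of 40 hexadecimal digits, i.e. 160 bits, regardless of the length of the password.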
Of course, with hash values of passwords the focus does not lie on data compression,
as is the case with digital signatures. Passwords are usually only a few characters long.
Instead, other properties of cryptographic hash functions are used. For example, every
hash value of a given hash function has the same length, no matter how long the corre-
sponding password is. Thus, one cannot infer the length of the password. Also, the hash
value does not allow one to infer the number of digits and special characters. Hash func-
tions cause great confusion and diffusion, a property inherited from block ciphers as their
building blocks. Hash values for similar passwords therefore differ significantly; even
small changes result in fundamentally different hash values. Here is an example using the
hash function SHA-2-224 [WPSH2e], where just a period has been added
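This effect can be observed with a short Python sketch that hashes two texts differing only in a final period and counts the bit positions in which the two SHA-2-224 hash values differ. The sample sentence and the helper differing_bits are assumptions chosen for this illustration.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equally long byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

text1 = b"The quick brown fox jumps over the lazy dog"
text2 = text1 + b"."                       # just a period has been added

h1 = hashlib.sha224(text1)
h2 = hashlib.sha224(text2)

print(h1.hexdigest())
print(h2.hexdigest())
print("differing bits out of 224:", differing_bits(h1.digest(), h2.digest()))
```

For a good hash function one expects roughly half of the 224 bits to change.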
Overall, it must not be possible to infer an associated password from a given hash value
using efficient methods. And exactly this is formally guaranteed by the one-way property
of a cryptographic hash function.
Nevertheless, it is well known that attempts are always made to crack password files. A
brute-force attack is used to try out all theoretically conceivable passwords in sequence
and compare their hash value with the stored values. With an alphabet of about 70 characters,
consisting of upper and lower case letters, digits and some special characters, there are
exactly 70^8 different passwords of length 8, i.e. a little less than 10^15 or 2^50. This is in
the order of the number of DES keys and is therefore no longer a problem for today's computers.
An even faster way to achieve the goal is a dictionary attack, which goes through a dictionary,
supplemented by first names and calendar dates, instead of arbitrary passwords. There are also
so-called rainbow tables with a specially developed data structure that allows an extremely
fast search for passwords for a given hash value.
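Both the size of the search space and the principle of a dictionary attack can be illustrated in Python. The word list and the function name dictionary_attack are assumptions for this sketch; real attacks use dictionaries with many millions of entries.

```python
import hashlib
import math

# Size of the search space for a brute-force attack
total = 70 ** 8
print(total, total < 10 ** 15 < 2 ** 50, round(math.log2(total), 1))

def sha1_hex(password: str) -> str:
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()

def dictionary_attack(stored_hash: str, words: list):
    """Hash every candidate word and compare it with the stored hash value."""
    for word in words:
        if sha1_hex(word) == stored_hash:
            return word
    return None

stored = "5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8"     # hash value from the password file
wordlist = ["letmein", "qwerty", "admin1234", "password"]
print(dictionary_attack(stored, wordlist))
```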
You can increase the password security a bit more by using the concept of salting. This
adds a little “salt to the corned beef hash”. Each time a password is entered, a few charac-
ters that are as meaningless as possible are automatically added, such as &7T?a$. This
makes the hash look completely different and avoids passwords that are too simple. Salting
can also be customized with individual additions for each user. As a result, different hash
values are stored for two users with the same password. Another aspect is also
advantageous: many users use the same password for some or all of their
applications. However, since the applications all use different salts, the hash values look
completely different everywhere.
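A minimal sketch of salting with an individual salt per user could look as follows in Python. It assumes that a random 16-byte salt is stored next to the hash value and that SHA-2-256 is applied to salt and password; the function names are hypothetical. In practice, a deliberately slow function such as hashlib.pbkdf2_hmac is preferable to a single hash computation.

```python
import hashlib
import hmac
import os

def store_password(password: str):
    """Return (salt, hash value) as they would be written to the password file."""
    salt = os.urandom(16)                                   # individual salt per user
    return salt, hashlib.sha256(salt + password.encode("utf-8")).digest()

def check_password(password: str, salt: bytes, stored: bytes) -> bool:
    """Recompute the salted hash value and compare it with the stored one."""
    candidate = hashlib.sha256(salt + password.encode("utf-8")).digest()
    return hmac.compare_digest(candidate, stored)

# Two users with the same password get different entries
salt1, hash1 = store_password("password")
salt2, hash2 = store_password("password")
print(hash1 != hash2, check_password("password", salt1, hash1))
```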
All the effort involved in storing passwords is only necessary because the verifier must be
able to check the subscriber’s knowledge in the form of his password. Another weakness
of the method is also that an attacker who has intercepted a password does not necessarily
have to use it immediately, but can use it sometime later and even multiple times.
These problems can be circumvented with challenge-response authentication. User
T(ina) is confronted with a challenge by verifier V(ictor), which she can only solve based
on her secret knowledge. In most cases, this is a kind of “arithmetic problem”. User T
sends the solution as a response back to the verifier V, who checks the answer. If it is
correct, user T has been successfully authenticated by verifier V. This procedure requires
that the task be randomly generated and thus vary sufficiently with each new authentication
process. Furthermore, it must be possible for the verifier V to check the correctness
of the answer without knowing the user T’s secret knowledge.
4.8 Password Security and Challenge Response 117
Finally, we will now look at some examples of how the authentication of users is imple-
mented in practice.
We have already explained the encryption method used in second-generation GSM mobile
communications (Sect. 2.3). It uses personalised chip cards (ICC). These so-called SIM cards
(Subscriber Identification Module) are issued by the network operators to their customers.
Each subscriber is assigned a 128-bit subscriber authentication key k_i, which is stored on
the SIM card in the mobile phone and in the mobile communications server and which we
will now refer to as k for short.
Now we will deal with the authentication of mobile communications subscribers. The
subscriber authenticates himself by knowledge, namely the PIN, and possession, namely
the SIM card. The PIN entry on the mobile phone is checked by the chip of the SIM card.
However, it can also be deactivated. If the PIN is entered incorrectly three times in a row,
the SIM card is automatically blocked. To unlock it again, the PUK (Personal Unblocking
Key) is required.
After successful PIN entry, the SIM card is authenticated by the network operator’s
mobile communications server on the basis of its knowledge, namely the subscriber key k.
The so-called A3 algorithm is used for this purpose. Similar to the A8 algorithm for key
generation, the definition of A3 is also the responsibility of the respective network opera-
tor; it is also kept secret as far as possible. In any case, the mobile communications server
sends a 128-bit random number RAND to the subscriber’s mobile phone as a “challenge”.
The subscriber’s SIM card calculates a 32-bit response SRES (signed response) from
RAND and k using the A3 algorithm and sends it back to the mobile communications
server as a response. There, the subscriber’s individual key k is read from a database and
SRES is also calculated. Only if the two values match is the SIM card authenticated and
the subscriber is granted access to the network. Figure 4.15 visualizes the process.
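The message flow of this challenge-response procedure can be sketched in Python. Since A3 is operator-specific and kept secret, the function a3_stand_in below is purely a hypothetical placeholder (HMAC-SHA-256 truncated to 32 bits) and not the real algorithm; the subscriber identifier and the database structure are likewise assumptions.

```python
import hashlib
import hmac
import secrets

def a3_stand_in(k: bytes, rand: bytes) -> bytes:
    """Hypothetical stand-in for the secret A3 algorithm: 32-bit response from RAND and k."""
    return hmac.new(k, rand, hashlib.sha256).digest()[:4]

# 128-bit subscriber key, stored on the SIM card and in the operator's database
k = secrets.token_bytes(16)
subscriber_db = {"subscriber-1": k}

# Server: send a 128-bit random number RAND as challenge
rand = secrets.token_bytes(16)

# SIM card: compute the 32-bit response SRES from RAND and k
sres = a3_stand_in(k, rand)

# Server: read k from the database, compute SRES as well and compare
expected = a3_stand_in(subscriber_db["subscriber-1"], rand)
authenticated = hmac.compare_digest(sres, expected)
print(authenticated)
```

Because RAND changes with every run, an intercepted SRES is of no use for a later authentication.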
In the UMTS and LTE standards of the third and fourth generation, a similar challenge-response
procedure is used to authenticate a mobile communications subscriber, also using the 128-bit
subscriber key k and the 128-bit random number RAND. However, the A3 algorithm of
GSM is replaced by a standardized procedure, which still leaves some possibilities for the
network operator to configure suitably. The entire procedure is called MILENAGE, of
which the algorithm for authentication is only a part. Among other things, MILENAGE
also generates a 128-bit cipher key which, together with the A5/4 cipher (Sect. 2.7),
encrypts the data to be transmitted, such as the calls or Internet pages. We explain the
authentication part of the algorithm using the workflow in Fig. 4.16 [ETSI3].
• The input value to the MILENAGE procedure is the 128-bit random number RAND.
• Ek = E(∙, k) is a block cipher on 128-bit blocks, which depends on a 128-bit key k. The
block cipher is applied multiple times in the procedure. The standard leaves the choice
of Ek open in principle, but strongly recommends using AES with key length 128 bits.
• OP is a 128-bit constant that can be freely configured by the respective network operator.
It is transformed into OP_C using the block cipher Ek and in this form is added bitwise (⊕)
several times within the procedure.
• For the 128-bit constant c2, which is added bitwise (⊕) once in the procedure, the
standard suggests c2 = 00…001.
• For the constant r2, which can assume the values 0, 1,…, 127, the standard proposes r2 = 0.
In general, it causes the cyclic shift ZL of a bit string by r2 positions to
the left.
• The output value of the MILENAGE procedure is initially a 128-bit number. From this,
the 64-bit number RES is derived as the right half, which is used for the authentication
of the SIM card and thus of the subscriber.
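The interplay of these components can be sketched in Python. The sketch follows the structure of the authentication function of MILENAGE as it is commonly described (OP_C = OP ⊕ Ek(OP), then two further applications of Ek); the details should be checked against [ETSI3]. An important assumption: because the Python standard library contains no AES, the function e_stand_in is only a placeholder for the block cipher Ek and is not a real block cipher, so the numerical results do not match the standard's test data.

```python
import hashlib

MASK128 = (1 << 128) - 1

def e_stand_in(block: int, k: int) -> int:
    """Placeholder for E_k (NOT AES): keyed SHA-256, truncated to 128 bits."""
    data = k.to_bytes(16, "big") + block.to_bytes(16, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:16], "big")

def rotl128(x: int, r: int) -> int:
    """Cyclic shift of a 128-bit string by r positions to the left."""
    r %= 128
    return ((x << r) | (x >> (128 - r))) & MASK128

def milenage_res(rand: int, k: int, op: int, E=e_stand_in, c2: int = 1, r2: int = 0) -> int:
    """Compute the 64-bit value RES from RAND, the subscriber key k and the constant OP."""
    op_c = op ^ E(op, k)                              # OP is modified to OP_C
    temp = E(rand ^ op_c, k)
    out = E(rotl128(temp ^ op_c, r2) ^ c2, k) ^ op_c  # 128-bit output value
    return out & ((1 << 64) - 1)                      # RES is the right half

k = 0x000102030405060708090A0B0C0D0E0F       # example values, chosen arbitrarily
op = 0x0F0E0D0C0B0A09080706050403020100
rand = 0x1234567890ABCDEF1234567890ABCDEF

res_sim = milenage_res(rand, k, op)          # computed on the SIM card
res_server = milenage_res(rand, k, op)       # computed by the network operator
print(hex(res_sim), res_sim == res_server)
```

Passing a real AES-128 encryption function as the parameter E would turn the sketch into the variant recommended by the standard.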
The GSM/UMTS/LTE standard does not specify how often authentication is to be per-
formed. It must be performed at least when the mobile phone is switched on, but can also
be performed operator-dependently when dialling into a new cell tower and automatically
in fixed time cycles.
The EMV (Europay International, MasterCard and VISA) takes care of the creation and
review of specifications and requirements for secure payment with credit cards. Today, it
is a joint organization of American Express, Discover, JCB, Mastercard, UnionPay and
Visa in cooperation with numerous banks, retailers and industry [EMV1].
Of course, in the field of credit card payments, secure data transmission is of particular
importance, especially in view of the many parties involved in the process. This involves
The specification and guideline issued by EMV [EMV2, EMV4] preferably permits
Triple-DES for encryption, but also AES with key lengths of 128, 192 and 256 bits, oper-
ated in ECB or CBC mode. A CBC-MAC is recommended for authentication of the data
transmission (Sect. 4.1).
4.9 Mobile Phone, Credit Card and Passport 121
Here we want to look more specifically at the authentication of cardholders when making
purchases with credit cards. Every credit card is an ICC (Integrated Circuit Card) and
therefore contains an integrated chip. The chip stores the cardholder’s individual public
RSA key (n, e) and his private key d, both of which were generated when the card was created.
For e, only the values 3 and 2^16 + 1 are permitted. The private key d is stored in an area of
the chip from which it cannot be read.
The holder of a credit card authenticates himself by knowledge (PIN) and possession
(credit card) when the card is read by a terminal. In the process, the PIN entry is verified by the
chip of the credit card. The chip in turn is authenticated by the terminal on the basis of
knowledge, as we will explain in principle using DDA (Dynamic Data Authentication)
[EMV2, EMV3].
The terminal generates a bit string of defined terminal data iT and prefixes it with a
random number zT of 4 bytes length. It sends the “challenge” zT‖iT to the chip of the
credit card.
The chip supplements the bit string with defined data iC from its memory and prefixes
it with a further 2 to 8 byte random number zC. This results in the bit string m = zC‖iC‖zT‖iT,
where we have neglected a few more format bits here. In any case, the chip computes a
160-bit hash value h(m) using the hash function SHA-1. We now use m0 = zC‖iC as an
abbreviation. The procedure ensures that the bit length of m0‖h(m) is smaller than that
of the RSA modulus n of the cardholder. Now the chip signs the bit string m0‖h(m) with
the private key d of the cardholder and sends sig(m0‖h(m)) = (m0‖h(m))^d (mod n)
as “response” to the terminal.
The terminal uses the public key (n, e) of the cardholder and calculates
m0′‖h′ = (sig(m0‖h(m)))^e = (m0‖h(m))^(de) (mod n).
Here h′ denotes the 160 rightmost bits in the calculated bit string and m0′ the rest.
If the signature is correct, this should result in m0′ = m0 and h′ = h(m). The terminal
therefore interprets m0′ as the bit string m0 = zC‖iC of the chip, which is unknown to it,
and appends the bit string zT‖iT, which is known to it, to obtain m′ = m0′‖zT‖iT.
From m′ it calculates the hash value h(m′). If this is equal to the
received h′, the terminal considers the chip and thus the credit card as authenticated. For
only a signature with the correct private key d could result in this coincidence of the
hash values.
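The exchange between terminal and chip can be traced with the following Python sketch. It is a toy model under several assumptions: the cardholder's RSA key is built from two Mersenne primes (insecure, used only because they are known primes), the data fields iT and iC are arbitrary sample strings, and a leading byte 0x6A stands in for the format bits neglected above, so that no leading zero bytes are lost when converting to a number. The function names are chosen for illustration.

```python
import hashlib
import secrets

# Cardholder's toy RSA key (insecure Mersenne primes, for illustration only)
p, q = 2**521 - 1, 2**607 - 1
n, e = p * q, 2**16 + 1
d = pow(e, -1, (p - 1) * (q - 1))     # requires Python 3.8+

def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()            # 160-bit hash value

def chip_respond(challenge: bytes) -> int:
    """Chip: build m0 = zC || iC, hash m = m0 || zT || iT, sign m0 || h(m)."""
    z_c, i_c = secrets.token_bytes(8), b"chip data"
    m0 = b"\x6a" + z_c + i_c                      # leading byte: stand-in for format bits
    block = int.from_bytes(m0 + sha1(m0 + challenge), "big")
    assert block < n                              # must be smaller than the modulus
    return pow(block, d, n)

def terminal_verify(signature: int, challenge: bytes) -> bool:
    """Terminal: recover m0' || h' with (n, e) and compare h' with h(m0' || zT || iT)."""
    x = pow(signature, e, n)
    block = x.to_bytes((x.bit_length() + 7) // 8, "big")
    if len(block) <= 20:
        return False
    m0_rec, h_rec = block[:-20], block[-20:]      # h' = the 160 rightmost bits
    return sha1(m0_rec + challenge) == h_rec

challenge = secrets.token_bytes(4) + b"terminal data"     # zT || iT
response = chip_respond(challenge)
print(terminal_verify(response, challenge))
```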
The ePassport (electronic passport) was introduced in Germany in 2005. Originally, there
was a chip in its cover with which a terminal can exchange data contactlessly via RF
(Radio Frequency). Since 2017, the chip has been integrated into the passport’s chip card.
The chip stores the personal data as well as the biometric picture and two fingerprints of
the passport holder [BSI4].
However, the chip of the ePassport also contains both the public (p, q, h, b) and the
private DSA key a of the passport holder (Sect. 4.5). The private key is located in an area
of the chip from which it cannot be read. Therefore, even if the complete chip is “cloned”,
it is not possible to copy the private key as well. The public key of the passport holder is
readable, but again secured by a digital signature of the issuing authority. In the passport
creation phase, a hash value of the personal data stored on the chip is also digitally signed
and stored with the passport holder’s DSA key.
The passport holder authenticates himself at the automatic passport control by possession
(ePassport) and by his characteristics (biometric picture, fingerprint). In the process, the
characteristics are verified by the chip integrated in the ePassport. The chip is in turn
authenticated by the terminal by checking its knowledge. This is done in two steps, which
we will now explain in principle [BSI5], [BSI6].
Passive authentication (PA) is used to verify the authenticity of the passport and the
integrity of the data on the chip. To do this, the terminal reads the personal data and their
digital DSA signature from the chip and verifies the signature with the passport holder’s
public key. However, with passive authentication, copying the data from one chip to
another would remain undetected.
Chip authentication (CA2) is additionally used to recognize “cloned” chips in ePassports.
For this purpose, the terminal generates a random natural number f, reads the public
key of the passport holder and sends c = h^f (mod p) to the chip as a “challenge”.
The chip uses its private key a to calculate the remainder k = c^a = h^(fa) (mod p), chooses
a random natural number r and uses a hash function h(∙) (not to be confused with the element h
of the public key) to calculate the hash value km = h(k‖r), where k‖r is to be interpreted as
the concatenation of the digital representations of the numbers k and r. In order to calculate
the CBC-MAC of c, km is suitably truncated to km′ and thus t = macCBC(c, km′) is calculated
(Sect. 4.1). Finally, the chip sends the values r and t to the terminal as a “response”.
The terminal, for its part, uses its random number f to calculate k = b^f = h^(af) (mod p) from
the passport holder’s public key, which ultimately corresponds to a semi-static Diffie-Hellman
key exchange (Sect. 4.5). Now the terminal is also able to compute the hash value
km = h(k‖r) from the received r, and in turn to compute the CBC-MAC macCBC(c, km′) from
it. If this matches the received value t, the terminal considers the ePassport as authenticated.
Indeed, a “cloned” chip cannot have the original private key a, and if it were simply
to use a different private key, the Diffie-Hellman keys k and subsequently t computed on
both sides would differ. If, on the other hand, entirely new DSA keys had been generated
for a “cloned” chip, this would have been noticed during passive authentication, since the
public key is protected against unnoticed changes by an official digital signature.
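The challenge-response steps of chip authentication can be illustrated with a small Python sketch. The parameters are toy values and an assumption of this example (p = 2039, q = 1019 and h = 4, an element of order q modulo p); real passports use parameters of cryptographic size. In addition, SHA-2-256 serves as hash function and a truncated HMAC as a stand-in for the CBC-MAC, because the Python standard library contains no block cipher.

```python
import hashlib
import hmac
import secrets

p, q, h = 2039, 1019, 4                 # toy DSA parameters: h has order q modulo p
a = secrets.randbelow(q - 1) + 1        # private key of the passport holder
b = pow(h, a, p)                        # public key component b = h^a (mod p)

def num(x: int) -> bytes:
    return x.to_bytes(2, "big")         # digital representation of a number below 2^16

def mac_key(k: int, r: bytes) -> bytes:
    """km' : hash value of k || r, truncated to 128 bits."""
    return hashlib.sha256(num(k) + r).digest()[:16]

def mac(c: int, key: bytes) -> bytes:
    """Stand-in for the CBC-MAC of c (here: truncated HMAC-SHA-256)."""
    return hmac.new(key, num(c), hashlib.sha256).digest()[:8]

def chip_respond(c: int, private_key: int):
    k = pow(c, private_key, p)          # k = c^a (mod p)
    r = secrets.token_bytes(8)          # random value r
    return r, mac(c, mac_key(k, r))

def terminal_check(r: bytes, t: bytes, f: int, b: int, c: int) -> bool:
    k = pow(b, f, p)                    # k = b^f (mod p)
    return hmac.compare_digest(t, mac(c, mac_key(k, r)))

f = secrets.randbelow(q - 1) + 1        # terminal's random number
c = pow(h, f, p)                        # challenge c = h^f (mod p)

r, t = chip_respond(c, a)
genuine_accepted = terminal_check(r, t, f, b, c)

a_clone = a % (q - 1) + 1               # a cloned chip has to use a different private key
r2, t2 = chip_respond(c, a_clone)
clone_accepted = terminal_check(r2, t2, f, b, c)
print(genuine_accepted, clone_accepted)
```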
In addition to DSA and semi-static DH, the BSI guideline also allows ECDSA and
semi-static ECDH, among others with the standards P-256 and brainpoolP256r1.
Incidentally, a key kc is also generated from the Diffie-Hellman key k in a similar way for
the encryption of data transmission. Triple-DES and AES with key lengths of 128, 192 and
256 bits are permitted as symmetric ciphers for CBC-MAC and data encryption, and
SHA-1 and SHA-2 as hash functions.
Bibliography
[3GPP] 3GPP: A5/3 Encryption Algorithm for GSM (Technical specification). Sophia Antipolis
Valbonne/France (2003).
https://www.gsma.com/aboutus/wp-content/uploads/2014/12/a53andgea3specifications.pdf
[BeL] Bernstein, D., Lange, T.: SafeCurves: Choosing Safe Curves for Elliptic-Curve Cryptography
(Internet information). Eindhoven/Netherlands. https://safecurves.cr.yp.to/. Accessed 10
Apr 2019
[Beu] Beutelspacher, A.: Kryptologie (Nonfiction book). Springer Spektrum, Wiesbaden (2015)
[BNS] Beutelspacher, A., Neumann, H., Schwarzpaul, T.: Kryptografie in Theorie und Praxis
(Textbook). Vieweg+Teubner, Wiesbaden (2010)
[BiC] Bitcoinworld: Bitcoin-Lexikon (Internet information).
https://www.bitcoin-welt.com/bitcoin-lexikon/. Accessed 10 Apr 2019
[Blu] Bluetooth: Bluetooth Core Specification v. 5.0 (Technical specification). (2016).
https://www.bluetooth.com/specifications/bluetooth-core-specification
[Bre] Bressoud, D.: Factorization and Primality Testing (Textbook). Springer, New York (1989)
[Buc] Buchmann, J.: Einführung in die Kryptographie (Textbook). Springer Spektrum, Berlin (2016)
[BSI1] Bundesamt für Sicherheit in der Informationstechnik: Kryptographische Verfahren 1:
Empfehlungen und Schlüssellängen (Technical guideline). Bonn/Germany (2018).
https://www.bsi.bund.de/DE/Publikationen/TechnischeRichtlinien/tr02102/index_htm.html;jsessionid=D4F0ACAD39ED0893ECBE3F951AE6B66C.2_cid360
[BSI2] Bundesamt für Sicherheit in der Informationstechnik: Kryptographische Verfahren 2:
Verwendung von Transport Layer Security (TLS) (Technical guideline). Bonn/Germany
(2018).
https://www.bsi.bund.de/DE/Publikationen/TechnischeRichtlinien/tr02102/index_htm.html;jsessionid=D4F0ACAD39ED0893ECBE3F951AE6B66C.2_cid360
[BSI3] Bundesamt für Sicherheit in der Informationstechnik: Sichere Nutzung von WLAN
(Technical guideline). Bonn/Germany (2018).
https://www.bsi.bund.de/SharedDocs/Downloads/DE/BSI/Internetsicherheit/isi_wlan_leitlinie.pdf?__blob=publicationFile
[BSI4] Bundesamt für Sicherheit in der Informationstechnik: Der elektronische Reisepass
(ePass) (Internet information).
https://www.bsi.bund.de/DE/Themen/DigitaleGesellschaft/ElektronischeIdentitaeten/ePass/ePassSeite.html.
Accessed 10 Apr 2019
[BSI5] Bundesamt für Sicherheit in der Informationstechnik: Advanced Security Mechanisms for
Machine Readable Travel Documents and eIDAS token – Part 2: Protocols (Technical guideline).
Bonn/Germany (2016).
https://www.bsi.bund.de/DE/Publikationen/TechnischeRichtlinien/tr03110/index_htm.html?nn=6615602
© The Author(s), under exclusive license to Springer-Verlag GmbH, DE, part of Springer Nature 2022
O. Manz, Encrypt, Sign, Attack, Mathematics Study Resources,
https://doi.org/10.1007/978-3-662-66015-7
[BSI6] Bundesamt für Sicherheit in der Informationstechnik: Advanced Security Mechanisms for Machine Readable Travel Documents and eIDAS token – Part 3: Specifications (Technical Guideline). Bonn, Germany (2016). https://www.bsi.bund.de/DE/Publikationen/TechnischeRichtlinien/tr03110/index_htm.html?nn=6615602
[BDB] Bundesverband Deutscher Banken: Financial Transaction Services FinTS (Security Specification). Berlin, Germany (2014). https://www.hbci-zka.de/
[DIM] DI-Management: Public key cryptography using discrete logarithms (Internet tutorial). https://www.di-mgt.com.au/public-key-crypto-discrete-logs-4-dsa.html. Accessed 10 Apr 2019
[DSB] Datenschutzbeauftragter: Bitcoin – Technische Grundlagen der Kryptowährung (Internet resource). https://www.datenschutzbeauftragter-info.de/bitcoin-technische-grundlagen-der-kryptowaehrung/. Accessed 10 Apr 2019
[DeM] Deutsches Museum: Die Rotor-Chiffriermaschine Enigma der deutschen Wehrmacht (Internet resource). https://www.deutsches-museum.de/sammlungen/meisterwerke/meisterwerke-ii/enigma/. Accessed 10 Apr 2019
[EMV1] EMVCo: Overview (Internet resource). https://www.emvco.com/about/overview/. Accessed 10 Apr 2019
[EMV2] EMVCo: ICC Specifications for Payment Systems – Security and Key Management (Technical Specification). Foster City, CA, USA (2011). https://www.emvco.com/document-search/
[EMV3] EMVCo: ICC Specifications for Payment Systems – Application Specification (Technical Specification). Foster City, CA, USA (2011). https://www.emvco.com/document-search/
[EMV4] EMVCo: Issuer and Application Security Guidelines (Technical Guideline). Foster City, CA, USA (2018). https://www.emvco.com/document-search/
[ETC] Enuma Technologies: A Tale of Two Curves (Internet resource). https://blog.enuma.io/update/2016/11/01/a-tale-of-two-curves-hardware-signing-for-ethereum.html. Accessed 10 Apr 2019
[ETSI1] ETSI: Digital Video Broadcasting (DVB) – Content Scrambling Algorithms (Technical Specification). Sophia Antipolis Cedex, France (2013). https://www.etsi.org/deliver/etsi_ts/103100_103199/103127/01.01.01_60/ts_103127v010101p.pdf
[ETSI2] ETSI: A5/4 Encryption Algorithm for GSM (Technical Specification). Sophia Antipolis Cedex, France (2011). https://www.etsi.org/deliver/etsi_ts/155200_155299/155226/09.00.00_60/ts_155226v090000p.pdf
[ETSI3] ETSI: MILENAGE Algorithm for UMTS (Technical Specification). Sophia Antipolis Cedex, France (2010). https://www.etsi.org/deliver/etsi_ts/135200_135299/135206/09.00.00_60/ts_135206v090000p.pdf
[Fox] Fox, D.: Sicherheit des Bluetooth-Standards (Survey article). Tagungsband des Deutschen IT-Sicherheitskongresses des BSI, Ingelheim, Germany (2003). https://www.secorvo.de/publikationen/bluetooth-sicherheit-fox-2003.pdf
[Fra] Franz, E.: Kryptographie und Kryptoanalyse (Lecture slides). Dresden, Germany (2015). https://www.inf.tu-dresden.de/content/institutes/sya/dud/lectures/2015sommersemester/Kryptoanalyse/KuKA15_01_1s.pdf
[Gar] Gartner, L.: Häufigkeitstabellen (Internet blog). http://www.mathe.tu-freiberg.de/~hebisch/cafe/kryptographie/haeufigkeitstabellen.html. Accessed 10 Apr 2019
[HEZ] Hassan, Z., Elgard, T., Zekry, A.: Modifying authentication techniques in mobile communication systems (Research article). Int. J. Eng. Res. Appl. 4, Cairo, Egypt (2014)
[Hau1] Hauck, P.: Kryptologie und Datensicherheit (Lecture notes). Tübingen, Germany (2009)
[Hau2] Hauck, P.: Kryptologie (Lecture notes). Tübingen, Germany (2015). https://www.fsi.uni-tuebingen.de/_media/studium/skripte/kryptows1415.pdf
Index

A
A5, 25, 41
Adleman, L., 55, 86
Advanced Encryption Standard (AES), 44
Alphabet, 3, 5
ASCII, 5
Attack
  brute-force, 21
  chosen ciphertext, 22
  chosen plaintext, 21
  ciphertext-only, 21
  dictionary, 115
  Friedman, 14
  Kasiski, 13
  known plaintext, 21
  man-in-the-middle, 87
Authentication
  challenge-response, 116
  of messages, 88
  of users, 89
  zero-knowledge, 117

B
Baby-step-giant-step method, 74
Bernstein, D., 107
Bit, 4
Bitcoin, 112
  blockchain, 113
  mining, 114
  transaction, 113
  wallet, 113
Blind signature, 111
Bluetooth, 78
brainpoolP256r1, 106
Byte, 5

C
Caesar, G.J., 6
Carmichael number, 62
CBC-AES, 50
CBC-MAC, 88
Chaum, D., 111
Chinese remainder theorem, 75
Cipher
  affine, 7
  asymmetric, 20
  block, 26
  Caesar, 6
  ElGamal, 83
  Feistel, 27
  KASUMI, 41
  Merkle-Hellman, 85
  monoalphabetic, 10
  polyalphabetic, 11
  Rabin, 84
  RSA, 55
  shift, 6
  shift register, 23
  stream, 26
  symmetric, 20
  Vernam, 22
  Vigenère, 11
Common Scrambling Algorithm (CSA), 43
Compression function, 95
  Davies-Meyer, 96
  Matyas-Meyer-Oseas, 96
Confusion, 27
Credit card, 120
Cryptanalysis, 2
  differential, 35
  linear, 37
Cryptocurrency, 111
Cryptographic envelope, 93
Cryptography, 2
Curve25519, 107

D
Daemen, J., 44
Damgård, I., 95
Data Encryption Standard (DES), 30
DEFLATE, 51
Diffie-Hellman key exchange, 72, 105
  semi-static, 74
Diffie, W., 72
Diffusion, 27
Digital fingerprint, 93
Digital signature, 90
Digital Signature Algorithm (DSA), 103
Digital Video Broadcasting (DVB), 42
Digitization, 5
Discrete logarithm, 71, 102
Dixon, J., 66

E
eCash, 112
Electronic passport (ePassport), 121
ElGamal signature, 92
ElGamal, T., 83, 92
Elliptic Curve Diffie Hellman (ECDH), 81
Elliptic Curve Digital Signature Algorithm (ECDSA), 105
Elliptic curves, 79
EMV, 120
Enigma machine, 15
Euclidean algorithm, 54
  extended, 55
Euclid of Alexandria, 54
Euler criterion, 64
Euler, L., 64
European Telecommunications Standards Institute (ETSI), 42
Extensible Authentication Protocol (EAP), 116

F
Factorization
  Fermat, 65
  Pollard’s p − 1, 68
  Pollard’s ρ, 67
  quadratic sieve, 67
Feistel, H., 27
Fermat, P. de, 54, 65
Fermat’s little theorem, 54
Fiat, A., 117
Field, 45
Financial Transaction Services (FinTS), 108
Friedman coincidence index, 14
Friedman, W.F., 14
FTPS, 60

G
Generating element modulo p, 70
Global Positioning System (GPS), 20
Global System for Mobile Communications (GSM), 24, 118
GNU Privacy Guard (GPG), 101

H
Hash function, 94
  collision resistance of, 94
  cryptographic, 94
  one-way property of, 94
Hash value, 94
Hellman, M., 72, 75, 85
Heron method, 65
Heron of Alexandria, 65
HMAC, 94
HTTPS, 60

I
Illuminati, 8
Integrated Circuit Card (ICC), 25

K
Kasiski, F.W., 13
Katz, P., 51
Kerckhoffs, A., 20
Kerckhoffs principle, 20
Shift register
  linear feedback, 23
SIM card, 25, 118
SMTPS, 60
Solovay, R., 64
Strassen, V., 64

T
TCP/IP, 60
Transaction number (TAN), 108
Transport Layer Security (TLS), 60
Triple-DES, 38
Turing, A., 17
2-factor authentication, 108
Twofish, 43

U
Universal Mobile Telecommunications System (UMTS), 41, 119

V
Verifier, 89
Vernam, G., 22
Vigenère, B. de, 11

W
Wired Equivalent Privacy (WEP), 61
WhatsApp, 102
Wireless Local Area Network (WLAN), 61
Wi-Fi Protected Access 2 (WPA2), 61

X
XTS-AES, 50

Z
Zimmermann, P., 100
ZIP, 51