
Crypto 101

Laurens Van Houtven


Copyright 2013-2014, Laurens Van Houtven
This book is made possible by your donations. If you enjoyed it, please
consider making a donation, so it can be made even better and reach
even more people.
This work is available under the Creative Commons Attribution-
NonCommercial 4.0 International (CC BY-NC 4.0) license. You
can find the full text of the license at https://creativecommons.org/
licenses/by-nc/4.0/.
The following is a human-readable summary of (and not a substitute
for) the license. You can:
Share: copy and redistribute the material in any medium or for-
mat
Adapt: remix, transform, and build upon the material
The licensor cannot revoke these freedoms as long as you follow
the license terms:
Attribution: you must give appropriate credit, provide a link to
the license, and indicate if changes were made. You may do so
in any reasonable manner, but not in any way that suggests the
licensor endorses you or your use.
NonCommercial: you may not use the material for commercial
purposes.
No additional restrictions: you may not apply legal terms or
technological measures that legally restrict others from doing
anything the license permits.
You do not have to comply with the license for elements of the
material in the public domain or where your use is permitted by an
applicable exception or limitation.
No warranties are given. The license may not give you all of the
permissions necessary for your intended use. For example, other rights
such as publicity, privacy, or moral rights may limit how you use the
material.
Pomidorkowi
Contents
Contents 5
I Foreword 11
1 About this book 13
2 Development 15
3 Acknowledgments 17
II Building blocks 19
4 Exclusive or 21
4.1 Description . . . . . . . . . . . . . . . . . . . . . . 21
4.2 Bitwise XOR . . . . . . . . . . . . . . . . . . . . . 22
4.3 One-time pads . . . . . . . . . . . . . . . . . . . . 23
4.4 Attacks on one-time pads . . . . . . . . . . . . . . 25
4.5 Remaining problems . . . . . . . . . . . . . . . . . 31
5 Block ciphers 33
5.1 Description . . . . . . . . . . . . . . . . . . . . . . 33
5.2 AES . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.3 DES and 3DES . . . . . . . . . . . . . . . . . . . . 38
5.4 Remaining problems . . . . . . . . . . . . . . . . . 40
6 Stream ciphers 41
6.1 Description . . . . . . . . . . . . . . . . . . . . . . 41
6.2 A naive attempt with block ciphers . . . . . . . . . . 41
6.3 Block cipher modes of operation . . . . . . . . . . . 49
6.4 CBC mode . . . . . . . . . . . . . . . . . . . . . . 49
6.5 CBC bit flipping attacks . . . . . . . . . . . . . 51
6.6 Padding . . . . . . . . . . . . . . . . . . . . . . . . 54
6.7 CBC padding attacks . . . . . . . . . . . . . . . . . 55
6.8 Native stream ciphers . . . . . . . . . . . . . . . . . 62
6.9 RC4 . . . . . . . . . . . . . . . . . . . . . . . . . . 63
6.10 Salsa20 . . . . . . . . . . . . . . . . . . . . . . . . 72
6.11 Native stream ciphers versus modes of operation . . . 74
6.12 CTR mode . . . . . . . . . . . . . . . . . . . . . . 74
6.13 Stream cipher bit flipping attacks . . . . . . . . . . . 76
6.14 Authenticating modes of operation . . . . . . . . . . 76
6.15 Remaining problem . . . . . . . . . . . . . . . . . . 77
7 Key exchange 79
7.1 Description . . . . . . . . . . . . . . . . . . . . . . 79
7.2 Abstract Diffie-Hellman . . . . . . . . . . . . . . . 80
7.3 Diffie-Hellman with discrete logarithms . . . . . . . 84
7.4 Diffie-Hellman with elliptic curves . . . . . . . . . . 85
7.5 Remaining problems . . . . . . . . . . . . . . . . . 87
8 Public-key encryption 89
8.1 Description . . . . . . . . . . . . . . . . . . . . . . 89
8.2 Why not use public-key encryption for everything? . 90
8.3 RSA . . . . . . . . . . . . . . . . . . . . . . . . . . 91
8.4 Elliptic curve cryptography . . . . . . . . . . . . . . 96
8.5 Remaining problem: unauthenticated encryption . . 96
9 Hash functions 99
9.1 Description . . . . . . . . . . . . . . . . . . . . . . 99
9.2 MD5 . . . . . . . . . . . . . . . . . . . . . . . . . 101
9.3 SHA-1 . . . . . . . . . . . . . . . . . . . . . . . . 101
9.4 SHA-2 . . . . . . . . . . . . . . . . . . . . . . . . 101
9.5 Keccak and SHA-3 . . . . . . . . . . . . . . . . . . 101
9.6 BLAKE and BLAKE2 . . . . . . . . . . . . . . . . 101
9.7 Password storage . . . . . . . . . . . . . . . . . . . 101
9.8 Length extension attacks . . . . . . . . . . . . . . . 105
9.9 Hash trees . . . . . . . . . . . . . . . . . . . . . . . 107
9.10 Remaining issues . . . . . . . . . . . . . . . . . . . 108
10 Message authentication codes 109
10.1 Description . . . . . . . . . . . . . . . . . . . . . . 109
10.2 Combining MAC and message . . . . . . . . . . . . 112
10.3 A naive attempt with hash functions . . . . . . . . . 113
10.4 HMAC . . . . . . . . . . . . . . . . . . . . . . . . 117
10.5 One-time MACs . . . . . . . . . . . . . . . . . . . 119
10.6 Carter-Wegman MAC . . . . . . . . . . . . . . . . 122
10.7 Authenticated encryption modes . . . . . . . . . . . 123
10.8 OCB mode . . . . . . . . . . . . . . . . . . . . . . 125
10.9 GCM mode . . . . . . . . . . . . . . . . . . . . . . 127
11 Signature algorithms 129
11.1 Description . . . . . . . . . . . . . . . . . . . . . . 129
11.2 RSA-based signatures . . . . . . . . . . . . . . . . . 130
11.3 DSA . . . . . . . . . . . . . . . . . . . . . . . . . . 130
11.4 ECDSA . . . . . . . . . . . . . . . . . . . . . . . . 135
11.5 Repudiable authenticators . . . . . . . . . . . . . . . 135
12 Key derivation functions 137
12.1 Description . . . . . . . . . . . . . . . . . . . . . . 137
12.2 Password strength . . . . . . . . . . . . . . . . . . . 139
12.3 PBKDF2 . . . . . . . . . . . . . . . . . . . . . . . 139
12.4 bcrypt . . . . . . . . . . . . . . . . . . . . . . . . . 139
12.5 scrypt . . . . . . . . . . . . . . . . . . . . . . . . . 139
12.6 HKDF . . . . . . . . . . . . . . . . . . . . . . . . . 139
13 Random number generators 145
13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . 145
13.2 True random number generators . . . . . . . . . . . 146
13.3 Yarrow . . . . . . . . . . . . . . . . . . . . . . . . . 149
13.4 Blum Blum Shub . . . . . . . . . . . . . . . . . . . 149
13.5 Dual_EC_DRBG . . . . . . . . . . . . . . . . . . . . . 149
13.6 Mersenne Twister . . . . . . . . . . . . . . . . . . . 157
III Complete cryptosystems 165
14 SSL and TLS 167
14.1 Description . . . . . . . . . . . . . . . . . . . . . . 167
14.2 Handshakes . . . . . . . . . . . . . . . . . . . . . . 168
14.3 Certificate authorities . . . . . . . . . . . . . . . 169
14.4 Self-signed certificates . . . . . . . . . . . . . . 170
14.5 Client certificates . . . . . . . . . . . . . . . . . 170
14.6 Perfect forward secrecy . . . . . . . . . . . . . . . . 171
14.7 Session resumption . . . . . . . . . . . . . . . . . . 172
14.8 Attacks . . . . . . . . . . . . . . . . . . . . . . . . 172
14.9 HSTS . . . . . . . . . . . . . . . . . . . . . . . . . 176
14.10 Certificate pinning . . . . . . . . . . . . . . . . 178
14.11 Secure configurations . . . . . . . . . . . . . . . 178
15 OpenPGP and GPG 181
15.1 Description . . . . . . . . . . . . . . . . . . . . . . 181
15.2 The web of trust . . . . . . . . . . . . . . . . . . 182
16 Off-The-Record Messaging (OTR) 185
16.1 Description . . . . . . . . . . . . . . . . . . . . . . 185
IV Appendices 189
A Modular arithmetic 191
A.1 Addition and subtraction . . . . . . . . . . . . . . . 191
A.2 Prime numbers . . . . . . . . . . . . . . . . . . . . 194
A.3 Multiplication . . . . . . . . . . . . . . . . . . . . . 195
A.4 Division and modular inverses . . . . . . . . . . . . 195
A.5 Exponentiation . . . . . . . . . . . . . . . . . . . . 197
A.6 Discrete logarithm . . . . . . . . . . . . . . . . . . 200
B Elliptic curves 203
B.1 The elliptic curve discrete log problem . . . . . . . . 205
C Side-channel attacks 207
C.1 Timing attacks . . . . . . . . . . . . . . . . . . . . 207
C.2 Power measurement attacks . . . . . . . . . . . . . . 207
Bibliography 209
Glossary 215
Acronyms 221
Part I
Foreword
1 About this book
Lots of people working in cryptography have no deep
concern with real application issues. They are trying to
discover things clever enough to write papers about.
Whitfield Diffie
This book is intended as an introduction to cryptography for pro-
grammers of any skill level. It's a continuation of a talk of the same
name, which was given by the author at PyCon 2013.
The structure of this book is very similar: it starts with very sim-
ple primitives, and gradually introduces new ones, demonstrating why
they're necessary. Eventually, all of this is put together into complete,
practical cryptosystems, such as TLS, GPG and OTR.
The goal of this book is not to make anyone a cryptographer or a
security researcher. The goal of this book is to understand how com-
plete cryptosystems work from a bird's eye view, and how to apply
them in real software.
The exercises accompanying this book focus on teaching cryptog-
raphy by breaking inferior systems. That way, you won't just know
that some particular thing is broken; you'll know exactly how it's bro-
ken, and that you, yourself, armed with little more than some spare
time and your favorite programming language, can break them. By
seeing how easily systems that are ostensibly secure to the layman can
be broken, you will understand why certain primitives and construc-
tions are necessary. Hopefully, these exercises will also leave you with
a healthy distrust of DIY cryptography in all its forms.
For a long time, cryptography has been deemed the exclusive realm
of experts. From the many leaks we've seen over the years of the
internals of both large and small corporations alike, it has become
obvious that that approach is doing more harm than good. We can no
longer afford to keep the two worlds strictly separate. We must join
them into one world where all programmers are educated in the basic
underpinnings of information security, so that they, together with in-
formation security professionals, can work together to produce more
secure software systems for all. That does not make people such as
penetration testers and security researchers obsolete or less valuable;
quite the opposite, in fact. By sensitizing all programmers to security
concerns, the need for professional security audits will become more
apparent, not less.
This book hopes to be a bridge: to teach everyday programmers
from any field or specialization to understand just enough cryptogra-
phy to do their jobs, or maybe just satisfy their appetite.
2 Development
The entire Crypto 101 project is publicly developed on GitHub under
the crypto101 organization, including this book.
This is an early pre-release of this book. All of your questions,
comments and bug reports are highly appreciated. If you don't under-
stand something after reading it, or a sentence is particularly clumsily
worded, that's a bug and I would very much like to fix it! Of course, if
I never hear about your issue, it's very hard for me to address it.
The copy of this book that you are reading right now is based on
the git commit with hash 1cbe8aa, also known as v0.1.0-4-g1cbe8aa.
3 Acknowledgments
Various people have made this book possible. Some people reviewed
the text, some people provided technical review, and some people
helped with the original talk. In no particular order:
My wife, Ewa
Brian Warner
Oskar Żabik
Ian Cordasco
Zooko Wilcox-O'Hearn
Nathan Nguyen (@nathanhere)
Part II
Building blocks
4 Exclusive or
4.1 Description
Exclusive or, often called XOR, is a Boolean (it uses only true and
false as input and output values), binary (it takes two parameters)
operator that is true when either the first input or the second input,
but not both, are true. Another way to think of XOR is a programmable
inverter: a Boolean binary operator where one input bit decides whether
or not to invert the other input bit. Inverting bits is much more commonly
called flipping bits, a term we'll use often throughout the book.
In mathematics and cryptography papers, exclusive or is generally
represented by a cross in a circle: ⊕. We'll use the same notation in
this book.
The inputs and output here are named as if we're using XOR as an
encryption operation. On the left, we have the plaintext bit p_i. The i
is just an index, since we'll usually deal with more than one such bit.
On top, we have the key bit k_i, that decides whether or not to invert
p_i. On the right, we have the ciphertext bit, c_i, which is the result of
the XOR operation.
4.2 Bitwise XOR
XOR, as weve just defned it, operates only on single bits or Boolean
values. Since we usually deal with values comprised of many bits, most
4.3. ONE-TIME PADS 23
programming languages provide a bitwise XOR operator: an oper-
ator that performs XOR on the respective bits in a value.
Python, for example, provides the ^ (caret) operator that performs
bitwise XOR on integers. It does this by frst expressing those two
integers in binary
3
, and then performing XORon their respective bits.
Hence the name, bitwise XOR.
73 ⊕ 87 = 0b1001001 ⊕ 0b1010111
        =   1 0 0 1 0 0 1   (left)
          ⊕ 1 0 1 0 1 1 1   (right)
        =   0 0 1 1 1 1 0
        = 0b0011110
        = 30
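
For illustration, here is a minimal sketch of the same computation in Python, on integers and on byte strings. The xor_bytes helper is a small convenience function written for this book's examples, not something the standard library provides:

    # XOR on integers, using Python's ^ operator.
    assert 73 ^ 87 == 0b0011110 == 30

    def xor_bytes(left, right):
        """XOR two equal-length byte strings, byte by byte."""
        assert len(left) == len(right)
        return bytes(a ^ b for a, b in zip(left, right))

    assert xor_bytes(b"\x49", b"\x57") == b"\x1e"  # 73 ^ 87 == 30 == 0x1e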
4.3 One-time pads
XOR may seem like an awfully simple, even trivial operator. Even so,
there's an encryption scheme, called a one-time pad, which consists of
just that single operator. It's called a one-time pad because it involves
a sequence (the pad) of random bits, and the security of the scheme
depends on only using that pad once. This scheme is unique not only
in its simplicity, but also because it has the strongest possible security
guarantee. If the bits are truly random (and therefore unpredictable by
an attacker), and the pad is only used once, the attacker learns nothing
about the plaintext when they see a ciphertext.
Suppose we can translate our plaintext into a sequence of bits. We
also have the pad of random bits, shared between the sender and the
(one or more) recipients. We can compute the ciphertext by taking the
bitwise XOR of the two sequences of bits.
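
A minimal sketch of that in Python, reusing the xor_bytes helper from earlier; os.urandom merely stands in for a source of random bits here, which is an assumption for illustration:

    import os

    def otp_encrypt(pad, plaintext):
        """One-time pad encryption: just the bitwise XOR of pad and plaintext."""
        assert len(pad) >= len(plaintext)
        return xor_bytes(pad[:len(plaintext)], plaintext)

    # Decryption is the same operation: XORing with the pad again cancels it out.
    otp_decrypt = otp_encrypt

    pad = os.urandom(32)              # shared ahead of time, used only once
    ciphertext = otp_encrypt(pad, b"ATTACK AT DAWN")
    assert otp_decrypt(pad, ciphertext) == b"ATTACK AT DAWN"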
If an attacker sees the ciphertext, we can prove that they will learn
zero information about the plaintext, which is why this scheme is con-
sidered unbreakable. The proof can be understood intuitively by
thinking of XOR as a programmable inverter, and then looking at a
particular bit intercepted by Eve, the eavesdropper.
Let's say Eve sees that a particular ciphertext bit c_i is 1. She has
no idea if the matching plaintext bit p_i was 0 or 1, because she has
no idea whether the key bit k_i was 0 or 1. Since all of the key bits are
truly random, both options are exactly equally probable.
4.4 Attacks on one-time pads
The one-time pad security guarantee only holds if it is used correctly.
First of all, the one-time pad has to consist of truly random data. Sec-
ondly, the one-time pad can only be used once (hence the name). Un-
fortunately, most commercial one-time pads are snake oil, and don't
satisfy at least one of those two properties.
Not using truly random data
The first issue is that they use various deterministic constructs to pro-
duce the one-time pad, instead of using truly random data. That isn't
necessarily insecure: in fact, the most obvious example, a synchronous
stream cipher, is something we'll see later in the book. However, it
does invalidate the unbreakable security property of one-time pads.
The end user would be better served by a more honest cryptosystem,
instead of one that lies about its security properties.
Reusing the one-time pad
The other issue is with key reuse, which is much more serious. Suppose
an attacker gets two ciphertexts with the same one-time pad. The
attacker can then XOR the two ciphertexts, which is also the XOR of
the plaintexts:
c_1 ⊕ c_2 = (p_1 ⊕ k) ⊕ (p_2 ⊕ k)    (definition)
         = p_1 ⊕ k ⊕ p_2 ⊕ k         (reorder terms)
         = p_1 ⊕ p_2 ⊕ k ⊕ k         (a ⊕ b = b ⊕ a)
         = p_1 ⊕ p_2 ⊕ 0             (x ⊕ x = 0)
         = p_1 ⊕ p_2                 (x ⊕ 0 = x)
At first sight, that may not seem like an issue. To extract either p_1
or p_2, you'd need to cancel out the XOR operation, which means you
need to know the other plaintext. The problem is that even the result of
the XOR operation on two plaintexts contains quite a bit of information
about the plaintexts themselves. We'll illustrate this visually with some
images from a broken one-time pad process, starting with figure 4.1
on page 27.
Figure 4.1: Two plaintexts, the re-used key, their respective cipher-
texts, and the XOR of the ciphertexts: (a) first plaintext, (b) second
plaintext, (c) first ciphertext, (d) second ciphertext, (e) reused key,
(f) XOR of ciphertexts. Information about the plaintexts clearly leaks
through when we XOR the ciphertexts.
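
A quick sketch of the same problem in Python, reusing the helpers from the earlier examples; the two messages are made up for illustration:

    pad = os.urandom(16)

    p1 = b"SEND MORE TROOPS"   # 16 bytes
    p2 = b"HOLD THE BRIDGE!"   # 16 bytes
    c1 = xor_bytes(pad, p1)
    c2 = xor_bytes(pad, p2)

    # The pad cancels out: the attacker learns the XOR of the two plaintexts
    # without ever seeing the pad itself.
    assert xor_bytes(c1, c2) == xor_bytes(p1, p2)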
Crib-dragging
A classical approach to breaking multi-time pad systems involves
crib-dragging, a process that uses small sequences that are expected
to occur with high probability. Those sequences are called cribs. The
name crib-dragging originated from the fact that these small cribs
are dragged from left to right across each ciphertext, and from top to
bottom across the ciphertexts, in the hope of finding a match some-
where. Those matches mark the starting points, the cribs, if you will,
of further decryption.
The idea is fairly simple. Suppose we have several encrypted mes-
sages C_i encrypted with the same one-time pad K. (We use capital
letters when referring to an entire message, as opposed to just bits of
a message.) If we could correctly guess the plaintext for one of the
messages, let's say C_j, we'd know K:
C_j ⊕ P_j = (P_j ⊕ K) ⊕ P_j
          = K ⊕ P_j ⊕ P_j
          = K ⊕ 0
          = K
Since K is the shared secret, we can now use it to decrypt all of
the other messages, just as if we were the recipient:

P_i = C_i ⊕ K    (for all i)
Since we usually can't guess an entire message, this doesn't actually
work. However, we might be able to guess parts of a message.
If we guess a few plaintext bits correctly for any of the messages,
that would reveal the key bits at that position for all of the messages,
since k = c_i ⊕ p_i. Hence, all of the plaintext bits at that position
are revealed: using that value for k, we can compute the plaintext bits
p_i = c_i ⊕ k for all the other messages.
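
A minimal sketch of that step in Python, reusing os and the xor_bytes helper from the earlier examples; the messages, the crib and the offset are all made up for illustration:

    pad = os.urandom(24)
    ciphertexts = [xor_bytes(pad[:len(m)], m) for m in
                   (b"meet me at the bridge", b"send more gold today.")]

    def try_crib(ciphertexts, crib, offset):
        """Guess that `crib` occurs at `offset` in the first ciphertext; the guessed
        key bits at that position then decrypt the same position everywhere else."""
        key_guess = xor_bytes(ciphertexts[0][offset:offset + len(crib)], crib)
        return [xor_bytes(c[offset:offset + len(crib)], key_guess)
                for c in ciphertexts[1:]]

    # A correct guess reveals readable plaintext in the other message.
    print(try_crib(ciphertexts, b" the ", offset=10))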
Guessing parts of the plaintext is a lot easier than guessing the
entire plaintext. Suppose we know that the plaintext is in English.
There are some sequences that we know will occur very commonly, for
example (the symbol ␣ denotes a space):

␣the␣ and variants, like .␣The␣
␣of␣ and variants
␣to␣ and variants
␣and␣ (less common at the start of a sentence)
␣a␣ and variants

If we know more about the plaintext, we can make even better
guesses. For example, if it's HTTP serving HTML, we would expect
to see things like Content-Type, <a>, and so on.
That only tells us which plaintext sequences are likely, giving us
likely guesses. How do we tell if any of those guesses are correct? If
our guess is correct, we know all the other plaintexts at that position
as well, using the technique described earlier. We could simply look at
those plaintexts and decide if they look correct. For example, if they
also contain English text, we'd expect to see a lot of letters e, t, a, o, i,
n. If we're seeing binary nonsense instead, we know that the guess was
probably incorrect, or perhaps that message is actually binary data.
These small, highly probable sequences are called cribs because
they're the start of a larger decryption process. Suppose your crib,
␣the␣, was successful and found the five-letter sequence t␣thr in an-
other message. You can then use a dictionary to find common words
starting with thr, such as through. If that guess were correct, it would
reveal four more bytes in all of the ciphertexts, which can be used to
reveal even more. Similarly, you can use the dictionary to find words
ending in t.
This becomes even more effective for some plaintexts that we know
more about. If some HTTP data has the plaintext ent-Len in it, then
we can expand that to Content-Length:, revealing many more bytes.
While this technique works as soon as two messages are encrypted
with the same key, it's clear that this becomes even easier with more
ciphertexts using the same key, since all of the steps become more ef-
fective:

We get more cribbing positions.

More plaintext bytes are revealed with each successful crib and
guess, leading to more guessing options elsewhere.

More ciphertexts are available for any given position, making
guess validation easier and sometimes more accurate.
These are just simple ideas for breaking multi-time pads. While
they're already quite effective, people have invented even more effec-
tive methods by applying advanced, statistical models based on natu-
ral language analysis. This only demonstrates further just how broken
multi-time pads are. [30]
4.5 Remaining problems
Real one-time pads, implemented properly, have an extremely strong
security guarantee. It would appear, then, that cryptography is over:
encryption is a solved problem, and we can all go home. Obviously,
that's not the case.
One-time pads are impractical: the key is at least as large as all
information you'd like to transmit put together. Plus, you'd have to
exchange those keys securely, ahead of time, with all people you'd like
to communicate with. We'd like to communicate securely with every-
one on the Internet, and that's an impossibly large number of people.
Furthermore, since the keys have to consist of truly random data for
the security property to hold, key generation is fairly difficult and time-
consuming without specialized hardware.
One-time pads pose a trade-off. It's an algorithm with a security
guarantee, but it also has extremely impractical key exchange require-
ments. However, as we'll see throughout this book, secure symmetric
encryption algorithms aren't the problem. Cryptographers have de-
signed plenty of those, while practical key management remains one
of the toughest challenges facing modern cryptography. One-time
pads may solve a problem, but it's the wrong problem.
While they may have their uses, they're obviously not a panacea.
We need something with manageable key sizes while maintaining se-
crecy. We need ways to negotiate keys over the Internet with people
we've never met before.
5 Block ciphers
Few false ideas have more firmly gripped the minds
of so many intelligent men than the one that, if they just
tried, they could invent a cipher that no one could break.
David Kahn
5.1 Description
A block cipher is an algorithm that allows us to encrypt blocks of a
fixed length. It provides an encryption function E, which takes a key k
and a plaintext block P, and produces a ciphertext block C:
C = E(k, P) (5.1)
The plaintext and ciphertext blocks are sequences of bytes. They
are always the same size as one another, and that size is fixed by the
block cipher: it's called the block cipher's block size.
Once we've encrypted plaintext blocks into ciphertext blocks, they
later have to be decrypted again to recover the original plaintext block.
This is done using a decryption function D, which takes the ciphertext
block C and the key k (the same one used to encrypt the block) as
inputs, and produces the original plaintext block P:
P = D(k, C) (5.2)
Or, in blocks:
A block cipher is a keyed permutation. In the set of possible blocks,
which is the set of all possible byte sequences of the cipher's block size,
the block cipher maps every block to some other block. For illustration
purposes, we'll look at a block cipher with an impractically tiny 3-bit
block size, so 2^3 = 8 possible blocks. Encryption would look like this:
The points a, b, c . . . are blocks. The arrows show which blocks
map to which blocks: the block at the start of the arrow, encrypted
using E under key k, is mapped to the block at the end of the arrow.
For example, E(k, a) = b.
When you're decrypting instead of encrypting, the block cipher
just computes the inverse permutation. We get the same illustrations,
with all the arrows going in the other direction:
The only way to know which block maps to which other block is
to know the key. A different key will lead to a completely different set
of arrows, for example under a different key k':
Knowing a bunch of (input, output) pairs shouldn't give you any
information about any other (input, output) pairs. (The attentive
reader may have noticed that this breaks in the extremes: if you know
all but one of the pairs, then you know the last one by exclusion.) As
long as we're talking about a hypothetical perfect block cipher, there's
no easier way
to decrypt a block other than to brute-force the key: i.e. just try
every single one of them until you find the right one.
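
To make the keyed permutation idea concrete, here is a toy sketch in Python of a 3-bit "block cipher" that simply derives a random-looking permutation of the 8 possible blocks from the key. It is purely illustrative, not a real cipher design:

    import hashlib, random

    BLOCKS = list(range(8))  # all possible 3-bit blocks

    def toy_permutation(key):
        """Derive a permutation of the 8 blocks from the key (illustration only)."""
        seed = hashlib.sha256(key).digest()
        table = BLOCKS[:]
        random.Random(seed).shuffle(table)
        return table

    def toy_encrypt(key, block):
        return toy_permutation(key)[block]

    def toy_decrypt(key, block):
        return toy_permutation(key).index(block)

    key = b"some key"
    assert all(toy_decrypt(key, toy_encrypt(key, b)) == b for b in BLOCKS)
    # A different key gives a completely different set of arrows:
    print(toy_permutation(b"some key"))
    print(toy_permutation(b"another key"))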
Our toy illustration block cipher only has 2^3 = 8 possible blocks.
Real, modern block ciphers have much larger block sizes, such as 128
bits. Mathematics tells us that there are n! (pronounced "n factorial")
different permutations of an n-element set. It's defined as the product
of all of the numbers from 1 up to and including n:

n! = 1 · 2 · 3 · . . . · (n − 1) · n

Factorials grow incredibly quickly. For example, 5! = 120, 10! =
3628800, and the rate continues to increase. The number of permuta-
tions of the set of blocks of a cipher with a 128 bit block size is (2^128)!.
Just 2^128 is large already (it takes 39 digits to write it down), so (2^128)!
is a mind-bogglingly huge number, impossible to comprehend. Com-
mon key sizes are only in the range of 128 to 256 bits, yielding 2^128 to
2^256 possibilities. That means that only a tiny fraction of all possible
permutations can actually be selected by a key. That's okay: that tiny
fraction is still more than large enough that it's impossible for an
attacker to just try them all.
Of course, a block cipher should be as easy to compute as possible,
as long as it doesn't sacrifice any of the above properties.
5.2 AES
The most common block cipher in current use is AES, the Advanced
Encryption Standard. Prior to being chosen as the Advanced Encryp-
tion Standard, the algorithm was known as Rijndael. Rijndael defined
a family of block ciphers, with block sizes and key sizes that could be
any multiple of 32 bits between 128 bits and 256 bits. [15] When
Rijndael became AES through the Federal Information Processing
Standards (FIPS) standardization process, the parameters were re-
stricted to a block size of 128 bits and key sizes of 128, 192 and 256
bits. [1]
REVIEW: Show how AES works internally?
There are no practical attacks known against AES. While there
have been some developments in the last few years, most of them in-
volve related-key attacks [9], some of them only on reduced-round
versions of AES [8].
A related-key attack involves making some predictions about how
AES will behave with two different keys with some specific mathemat-
ical relation. Those predictions provide some information about what
identical (input, output) pairs will look like under those different keys.
Most of these attacks attempt to recover the key entirely, completely
breaking the encryption. While an ideal block cipher wouldn't be vul-
nerable to a related-key attack, no system in the real world should ever
end up with such related keys. If it does, things have gone so com-
pletely wrong that all further bets are off.
5.3 DES and 3DES
The Data Encryption Standard (DES) is one of the oldest block ci-
phers that saw widespread use. It was published as an official FIPS
standard in 1977. It is no longer considered secure, mainly due to its
tiny key size of 56 bits. (The DES algorithm actually takes a 64 bit key
input, but the remaining 8 bits are only used for parity checking, and
are discarded immediately.) It shouldn't be used in new systems. On
modern hardware, DES can be brute forced in less than a day. [19]
In an effort to extend the life of the DES algorithm, in a way that
allowed much of the spent hardware development effort to be reused,
people came up with 3DES: a scheme where input is first encrypted,
then decrypted, then encrypted again:
C = E_DES(k_1, D_DES(k_2, E_DES(k_3, p)))    (5.3)
This scheme provides two improvements:

By applying the algorithm three times, the cipher becomes
harder to attack directly through cryptanalysis.

By having the option of using many more total key bits, spread
over the three keys, the set of all possible keys (commonly called
the keyspace) becomes much larger, making brute-forcing im-
practical.
The three keys could all be chosen independently (yielding 168 key
bits), or k_3 = k_1 (yielding 112 key bits), or k_1 = k_2 = k_3, which, of
course, is just plain old DES (with 56 key bits). In the last keying op-
tion, the middle decryption reverses the first encryption, so you really
only get the effect of the last encryption. This is intended as a back-
wards compatibility mode for existing DES systems. If 3DES had
been defined as E(k_1, E(k_2, E(k_3, p))), it would've been impossible
to use 3DES implementations for systems that required compatibility
with DES.
Some attacks on 3DES are known, reducing its effective secu-
rity. While breaking 3DES with the first keying option is currently
impractical, 3DES is a poor choice for any modern cryptosystem. The
security margin is already small, and continues to shrink as crypto-
graphic attacks improve and processing power grows.
Far better alternatives, such as AES, are available. Not only are they
more secure than 3DES, they are also generally much, much faster. On
the same hardware and in the same mode of operation (we'll explain
what that means in the next chapter), AES-128 only takes 12.6 cycles
per byte, while 3DES takes up to 134.5 cycles per byte. [16] Despite
being worse from a security point of view, 3DES is literally an order of
magnitude slower.
While more iterations of DES might increase the security margin,
they aren't used in practice. Not only has the process never been stan-
dardized, but the performance picture only becomes worse when you
add more iterations of the DES algorithm, since more computation is
required for all those extra iterations. Furthermore, increasing the key
bits has diminishing security returns, only increasing the security level
of the resulting algorithm by a smaller amount as the number of key
bits increases. While 3DES with keying option 1 has a key length of
168 bits, the effective security level is estimated at only 112 bits.
Even though 3DES is significantly worse in terms of performance
and slightly worse in terms of security, 3DES is still the workhorse
of the financial industry. With a plethora of standards already in ex-
istence and new ones continuing to be created, in such an extremely
technologically conservative industry where Fortran and Cobol still
reign supreme on massive mainframes, it will probably continue to be
used for many years to come, unless there are some large cryptanalytic
breakthroughs that threaten the security of 3DES.
TODO: Explain security levels? See also: explain entropy?
5.4 Remaining problems
Even with block ciphers, there are still some unsolved problems.
For example, we can only send messages of a very limited length:
the block length of the block cipher. Obviously, we'd like to be able to
send much larger messages, or, ideally, streams of indeterminate size.
We'll address this problem with a stream cipher.
Although we have reduced the key size drastically (from the total
size of all data ever sent under a one-time pad scheme to a few bytes
for most block ciphers), we still need to address the issue of agreeing
on those few key bytes, potentially over an insecure channel. We'll
address this problem in a later chapter with a key exchange protocol.
6 Stream ciphers
6.1 Description
A stream cipher is a symmetric encryption algorithm that encrypts a
stream of bits. Ideally, that stream could be as long as we'd like; real-
world stream ciphers have limits, but they are normally sufficiently
large that they don't pose a practical problem.
6.2 A naive attempt with block ciphers
Let's try to build a stream cipher using the tools we already have. Since
we already have block ciphers, we could simply divide an incoming
stream into different blocks, and encrypt each block:
abcdefgh  ijklmnop  qrstuvwx  ...
   ↓          ↓         ↓
APOHGMMW  PVMEHQOM  MEEZSNFM  ...    (6.1)
This scheme is called ECB mode, and it is one of the many ways
that block ciphers can be used to construct stream ciphers. Unfor-
tunately, while being very common in home-grown cryptosystems, it
has very serious security flaws. For example, in ECB mode, identical
input blocks will always map to identical output blocks:
abcdefgh  abcdefgh  abcdefgh  ...
   ↓          ↓         ↓
APOHGMMW  APOHGMMW  APOHGMMW  ...    (6.2)
At first, this might not seem like a particularly serious problem.
Assuming the block cipher is secure, it doesn't look like an attacker
would be able to decrypt anything. By dividing the ciphertext stream
up into blocks, an attacker would only be able to see that a ciphertext
block, and therefore a plaintext block, was repeated.
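
You can see the repetition directly in a few lines of Python; this sketch assumes the third-party pyca/cryptography package (recent versions, where no backend argument is needed):

    import os
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    key = os.urandom(16)
    encryptor = Cipher(algorithms.AES(key), modes.ECB()).encryptor()

    # Two identical 16-byte plaintext blocks...
    plaintext = b"YELLOW SUBMARINE" * 2
    ciphertext = encryptor.update(plaintext) + encryptor.finalize()

    # ...produce two identical ciphertext blocks.
    assert ciphertext[:16] == ciphertext[16:32]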
We'll now illustrate the many flaws of ECB mode with two at-
tacks. First, we'll exploit the fact that repeating plaintext blocks result
in repeating ciphertext blocks, by visually inspecting an encrypted im-
age. Then, we'll demonstrate that attackers can often decrypt messages
encrypted in ECB mode by communicating with the person perform-
ing the encryption.
Visual inspection of an encrypted stream
To demonstrate that this is, in fact, a serious problem, we'll use a sim-
ulated block cipher of various block sizes and apply it to an image
(this particular demonstration only works on uncompressed bitmaps;
for other media, the effect isn't significantly less damning, it's just less
visual). We'll then visually inspect the different outputs.
Because identical blocks of pixels in the plaintext will map to iden-
tical blocks of pixels in the ciphertext, the global structure of the image
is largely preserved.
As you can see, the situation appears to get slightly better with
larger block sizes, but the fundamental problem still remains: the
macrostructure of the image remains visible in all but the most ex-
treme block sizes. Furthermore, all but the smallest of these block
sizes are unrealistically large. For an uncompressed bitmap with three
color channels of 8 bit depth, each pixel takes 24 bits to store. Since
the block size of AES is only 128 bits, that would equate to 5.33 . . .
pixels per block, significantly less than the larger block sizes in the
example. But AES is the workhorse of modern block ciphers; it can't
be at fault, certainly not because of an insufficient block size.
When we look at a picture of what would happen with an idealized
encryption scheme, we notice that it looks like random noise. Keep
in mind that looking like random noise doesn't mean something is
properly encrypted: it just means that we can't inspect it using methods
this trivial.
Figure 6.1: Plaintext image with ciphertext images under idealized
encryption and ECB mode encryption with various block sizes: (a)
plaintext image, 2000 by 1400 pixels, 24 bit color depth; (b) ECB
mode ciphertext, 5 pixel (120 bit) block size; (c) ECB mode cipher-
text, 30 pixel (720 bit) block size; (d) ECB mode ciphertext, 100 pixel
(2400 bit) block size; (e) ECB mode ciphertext, 400 pixel (9600 bit)
block size; (f) ciphertext under idealized encryption. Information
about the macro-structure of the image clearly leaks. This becomes
less apparent as block sizes increase, but only at block sizes far larger
than typical block ciphers. Only the first block size (figure b, a block
size of 5 pixels or 120 bits) is realistic.
Encryption oracle attack
In the previous section, we've focused on how an attacker can inspect
a ciphertext encrypted using ECB mode. That's a passive, ciphertext-
only attack. It's passive because the attacker doesn't really interfere
in any communication; they're simply examining a ciphertext. In this
section, we'll study an active attack, where the attacker actively com-
municates with their target. We'll see how the active attack can enable
an attacker to decrypt ciphertexts encrypted using ECB mode.
To do this, we'll introduce a new concept called an oracle. For-
mally defined oracles are used in the study of computer science, but
for our purposes it's sufficient to just say that an oracle is something
that will compute some particular function for you.
In our case, the oracle will perform a specific encryption for the
attacker, which is why it's called an encryption oracle. Given some data
A chosen by the attacker, the oracle will encrypt that data, followed
by a secret suffix S, in ECB mode. Or, in symbols:
C = ECB(E_k, A || S)
You can see why the concept of an oracle is important here: the
attacker would not be able to compute C themselves, since they do
not have access to the encryption key k or the secret suffix S. The
goal of the oracle is for those values to remain secret, but we'll see
how an attacker can recover S by inspecting the ciphertext C for many
carefully chosen values of the prefix A.
Decrypting a block using the oracle
The attacker starts by sending in a plaintext A that's just one byte
shorter than the block size. That means the block that's being en-
crypted will consist of those bytes, plus the first byte of S, which we'll
call s_0. The attacker remembers the encrypted block. They don't know
the value of s_0 yet, but now they do know the value of the first en-
crypted block: E_k(A || s_0). In the illustration, this is block C_R1.
Then, the attacker tries a full-size block, trying all possible values
for the final byte. Eventually, they'll find the value of s_0; they know
the guess is correct because the resulting ciphertext block will match
the ciphertext block C_R1 they remembered earlier.
The attacker can repeat this for the penultimate byte. They submit
a plaintext A that's two bytes shorter than the block size. The oracle
will encrypt a first block consisting of that A followed by the first two
bytes of the secret suffix, s_0 s_1. The attacker remembers that block.
Since the attacker already knows s_0, they try A || s_0 followed by all
possible values of s_1. Eventually they'll guess correctly, which, again,
they'll know because the ciphertext blocks match:
The attacker can rinse and repeat, eventually decrypting an entire
block. This allows them to brute-force a block in p · b attempts, where
p is the number of possible values for each byte (so, for 8-bit bytes,
that's 2^8 = 256) and b is the block size. Normally, they'd have to try
all of the possible combinations, which would be:

p · p · . . . · p  (b positions)  = p^b

For a typical block size of 16 bytes (or 128 bits), brute forc-
ing would mean trying 256^16 combinations. That's a huge, 39-digit
number. It's so large that trying all of those combinations is consid-
ered impossible. This attack allows an attacker to do it in at most
256 · 16 = 4096 tries, a far more manageable number.
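
Here is a rough sketch of the byte-at-a-time attack in Python, against a hypothetical oracle(data) function that returns ECB(E_k, data || S); the oracle and the 16-byte block size are assumptions for illustration:

    BLOCK = 16  # assumed block size in bytes

    def recover_first_block(oracle):
        """Recover the first block of the secret suffix, one byte at a time."""
        known = b""
        for i in range(BLOCK):
            prefix = b"A" * (BLOCK - 1 - i)
            # Ciphertext of the block containing our prefix plus i+1 secret bytes.
            target = oracle(prefix)[:BLOCK]
            for guess in range(256):
                candidate = prefix + known + bytes([guess])
                if oracle(candidate)[:BLOCK] == target:
                    known += bytes([guess])
                    break
        return known

This only recovers the first block of S; extending it to later blocks works the same way, by lining up the next unknown byte at the end of a block.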
Conclusion
In the real world, block ciphers are used in systems that encrypt large
amounts of data all the time. We've seen that when using ECB mode,
an attacker can both analyze ciphertexts to recognize repeating pat-
terns, and even decrypt messages when given access to an encryption
oracle.
Even when we use idealized block ciphers with unrealistic prop-
erties, such as block sizes of more than a thousand bits, an attacker
ends up being able to decrypt the ciphertexts. Real world block ci-
phers only have more limitations than our idealized examples, such as
much smaller block sizes.
We aren't even taking into account any potential weaknesses in
the block cipher. It's not AES (or our test block ciphers) that causes
this problem, it's our ECB construction. Clearly, we need something
better.
6.3 Block cipher modes of operation
One of the more common ways of producing a stream cipher is to use
a block cipher in a particular configuration. The compound system be-
haves like a stream cipher. These configurations are commonly called
modes of operation (traditionally, modes of operation seem to be re-
ferred to by a three-letter acronym). They aren't specific to a particular
block cipher.
ECB mode, which we've just seen, is the simplest such mode of
operation. The letters ECB stand for electronic code book. For reasons
we've already gone into, ECB mode is very ineffective. Fortunately,
there are plenty of other choices.
6.4 CBC mode
CBC mode, which stands for cipher block chaining, is a very com-
mon mode of operation where plaintext blocks are XORed with the
previous ciphertext block before being encrypted by the block cipher.
Of course, this leaves us with a problem for the first plaintext block:
there is no previous ciphertext block to XOR it with. Instead, we pick
an initialization vector (IV): a random number that takes the place of
the first ciphertext in this construction. Initialization vectors also
appear in many other algorithms. An initialization vector should be
unpredictable; ideally, it will be cryptographically random. IVs do not
have to be secret: they are typically just added to ciphertext messages
in plaintext.
The following diagram demonstrates encryption in CBC mode.
Decryption is the inverse construction, with block ciphers in de-
cryption mode instead of encryption mode.
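
As a rough sketch of the same construction in Python, building on the AES calls and the xor_bytes helper from the earlier examples (for illustration only, not production use):

    def aes_block(key, block, decrypt=False):
        """Run the raw AES permutation on exactly one 16-byte block."""
        cipher = Cipher(algorithms.AES(key), modes.ECB())
        op = cipher.decryptor() if decrypt else cipher.encryptor()
        return op.update(block) + op.finalize()

    def cbc_encrypt(key, iv, blocks):
        previous, out = iv, []
        for p in blocks:
            previous = aes_block(key, xor_bytes(p, previous))
            out.append(previous)
        return out

    def cbc_decrypt(key, iv, blocks):
        previous, out = iv, []
        for c in blocks:
            out.append(xor_bytes(aes_block(key, c, decrypt=True), previous))
            previous = c
        return out

    key, iv = os.urandom(16), os.urandom(16)
    blocks = [b"YELLOW SUBMARINE", b"SIXTEEN BYTE MSG"]
    assert cbc_decrypt(key, iv, cbc_encrypt(key, iv, blocks)) == blocks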
While CBC mode itself is not inherently insecure (unlike ECB
mode), its particular use in TLS 1.0 was. This eventually led to
the Browser Exploit Against SSL/TLS (BEAST) attack, which we'll
cover in more detail in the section on SSL/TLS. The short version is
that instead of using unpredictable initialization vectors, for example
by choosing random ones, the previous ciphertext block was used. Un-
fortunately, it turns out that attackers figured out how to exploit that
property.
6.5 CBC bit flipping attacks
An interesting attack on CBC mode is called a bit flipping attack.
Using a CBC bit flipping attack, attackers can modify ciphertexts en-
crypted in CBC mode so that the change has a predictable effect on
the plaintext.
Suppose we have a CBC encrypted ciphertext. This could be, for
example, a cookie. We take a particular ciphertext block, and we flip
some bits in it. What happens to the plaintext?
When we flip some bits, we do that by XORing with a sequence
of bits, which we'll call X. If the corresponding bit in X is 1, the bit
will be flipped; otherwise, the bit will remain the same.
When we try to decrypt the ciphertext block with the flipped bits,
we will get indecipherable (excuse the pun) nonsense. Remember how
CBC decryption works: the output of the block cipher is XORed with
the previous ciphertext block to produce the plaintext block. Now that
the input ciphertext block C_i has been modified, the output of the block
cipher will be some random unrelated block, and, statistically speaking,
nonsense. After being XORed with that previous ciphertext block, it will
still be nonsense. As a result, the produced plaintext block is still just
nonsense. In the illustration, this unintelligible plaintext block is P'_i.
However, in the block after that, the bits we flipped in the ci-
phertext will be flipped in the plaintext as well! This is because, in
CBC decryption, ciphertext blocks are decrypted by the block cipher,
and the result is XORed with the previous ciphertext block. But since
we modified the previous ciphertext block by XORing it with X, the
plaintext block P_{i+1} will also be XORed with X. As a result, the at-
tacker completely controls that plaintext block, since they can just
flip the bits that aren't the value they want them to be.
TODO: add previous illustration, but mark the path X takes to
influence P'_{i+1} in red or something
This may not sound like a huge deal at first. If you don't know the
plaintext bytes of that next block, you have no idea which bits to flip
in order to get the plaintext you want.
To illustrate how attackers can turn this into a practical attack, let's
consider a website using cookies. When you register, your chosen user
name is put into a cookie. The website encrypts the cookie and sends
it to your browser. The next time your browser visits the website, it
will provide the encrypted cookie; the website decrypts it and knows
who you are.
An attacker can often control at least part of the plaintext being
encrypted. In this example, the user name is part of the plaintext of
the cookie. Of course, the website just lets you provide whatever value
for the user name you want at registration, so the attacker can just add
a very long string of Z bytes to their user name. The server will happily
encrypt such a cookie, giving the attacker an encrypted ciphertext that
matches a plaintext with many such Z bytes in it. The plaintext getting
modified will then probably be part of that sequence of Z bytes.
An attacker may have some target bytes that they'd like to see in the
decrypted plaintext, for example, ;admin=1;. In order to figure out
which bytes they should flip (so, the value of X in the illustration),
they just XOR the filler bytes (ZZZ. . . ) with that target. Because two
XOR operations with the same value cancel each other out, the two
filler values (ZZZ. . . ) will cancel out, and the attacker can expect to see
;admin=1; pop up in the next plaintext block:
P'_{i+1} = P_{i+1} ⊕ X
         = P_{i+1} ⊕ ZZZZZZZZZ ⊕ ;admin=1;
         = ZZZZZZZZZ ⊕ ZZZZZZZZZ ⊕ ;admin=1;
         = ;admin=1;
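
A small sketch of that computation in Python (the cookie layout and block offsets are made up for illustration; xor_bytes is the helper from earlier):

    filler = b"ZZZZZZZZZ"
    target = b";admin=1;"

    # The attacker XORs this mask into the ciphertext block *before* the one
    # that decrypts to the filler bytes.
    mask = xor_bytes(filler, target)

    def flip(ciphertext, block_index, offset, mask, block_size=16):
        """Return a modified ciphertext that flips the chosen plaintext bytes."""
        pos = block_index * block_size + offset
        patched = xor_bytes(ciphertext[pos:pos + len(mask)], mask)
        return ciphertext[:pos] + patched + ciphertext[pos + len(mask):]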
This attack is another demonstration of an important crypto-
graphic principle: encryption is not authentication! It's virtually never
sufficient to simply encrypt a message. Encryption may prevent an
attacker from reading it, but that often isn't even necessary for the at-
tacker to be able to modify it to say whatever they want it to.
6.6 Padding
So far, we've conveniently assumed that all messages just happened to
fit exactly in our system of block ciphers, be it CBC or ECB. That
means that all messages happen to be a multiple of the block size,
which, in a typical block cipher such as AES, is 16 bytes. Of course,
real messages can be of arbitrary length. We need some scheme to
make them fit. That process is called padding.
Padding with zeroes (or some other pad byte)
One way to pad would be to simply append a particular byte value until
the plaintext is of the appropriate length. To undo the padding, you
just remove those bytes. This scheme has an obvious flaw: you can't
send messages that end in that particular byte value, or you will be
unable to distinguish between padding and the actual message.
PKCS#5/PKCS#7 padding
A better, and much more popular, scheme is PKCS#5/PKCS#7
padding.
PKCS#5, PKCS#7 and later CMS padding are all more or less the
same idea. (Technically, PKCS#5 padding is only defined for 8 byte
block sizes, but the idea clearly generalizes easily, and it's also the most
commonly used term.) Take the number of bytes you have to pad, and
pad
with that many copies of the byte with that value. For example, if the
block size is 8 bytes, and the last block has the three bytes 12 34 45,
the block becomes 12 34 45 05 05 05 05 05 after padding.
If the plaintext happened to be exactly a multiple of the block size,
an entire block of padding is used. Otherwise, the recipient would
look at the last byte of the plaintext, treat it as a padding length, and
almost certainly conclude the message was improperly padded.
This scheme is described in [22].
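
A minimal sketch of PKCS#7-style padding in Python (illustration only; real implementations must also take care to check the padding in constant time):

    def pad(data, block_size=16):
        """Append n bytes of value n, so the result is a multiple of block_size."""
        n = block_size - (len(data) % block_size)
        return data + bytes([n]) * n

    def unpad(data):
        """Strip and verify PKCS#7 padding, raising ValueError if it is invalid."""
        n = data[-1]
        if n == 0 or n > len(data) or data[-n:] != bytes([n]) * n:
            raise ValueError("invalid padding")
        return data[:-n]

    assert pad(b"\x12\x34\x45", 8) == b"\x12\x34\x45\x05\x05\x05\x05\x05"
    assert unpad(pad(b"hello")) == b"hello"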
6.7 CBC padding attacks
We can refine CBC bit flipping attacks to trick a recipient into de-
crypting arbitrary messages!
As we've just discussed, CBC mode requires padding the message
to a multiple of the block size. If the padding is incorrect, the recipient
typically rejects the message, saying that the padding was invalid. We
can use that tiny bit of information about the padding of the plaintext
to iteratively decrypt the entire message.
The attacker will do this, one ciphertext block at a time, by trying
to get an entire plaintext block worth of valid padding. We'll see that
this tells them what the decryption of their target ciphertext block
is, under the block cipher. We'll also see that you can do this effi-
ciently and iteratively, just from that little leak of information about
the padding being valid or not.
It may be helpful to keep in mind that a CBC padding attack does
not actually attack the padding for a given message; instead the at-
tacker will be constructing paddings to decrypt a message.
To mount this attack, an attacker only needs two things:
1. A target ciphertext to decrypt
2. A padding oracle: a function that takes ciphertexts and tells the
attacker if the padding was correct
In this chapter, we'll assume that PKCS#5/PKCS#7 padding is
being used, since that's the most popular option. The attack is general
enough to work on other kinds of padding, with minor modifications.
Decrypting the first byte
The attacker fills a block with arbitrary bytes R = r_1, r_2 . . . r_b. They
also pick a target block C_i from the ciphertext that they'd like to de-
crypt. The attacker asks the padding oracle if R || C_i has valid padding.
Statistically speaking, such a random plaintext block probably won't
have valid padding: the odds are in the half-a-percent ballpark. If by
pure chance the message happens to already have valid padding, they
can simply skip the next step.
Next, the attacker tries to modify the message so that it does
have valid padding. They can do that by playing with the last byte
of the plaintext: eventually that byte will be 01, which is always valid
padding. In order to modify the last byte of a plaintext block, the
attacker modifies the last byte of the previous ciphertext block. This
works exactly like it did with CBC bit flipping attacks. That previous
ciphertext block is the block R, so the byte being modified is the last
byte of R, r_b.
One way to try all values for that last byte of R is to XOR it with
every value from 0 up to 255, since a byte has 256 possible values.
Eventually, the padding oracle will report that for some ciphertext
block R, the decrypted plaintext of R || C_i has valid padding.
Discovering the padding length
The oracle has just told the attacker that for our chosen value of R,
the plaintext of R || C_i has valid padding. Since we're working with
PKCS#5 padding, that means that the plaintext block P_i ends in one
of the following byte sequences:

01
02 02
03 03 03
. . .

The first option (01) is much more likely than the others, since
it only requires one byte to have a particular value. The attacker is
modifying that byte to take every possible value, so it is quite likely
that they happened to stumble upon 01. All of the other valid padding
options not only require that byte to have some particular value, but
also one or more other bytes. For an attacker to end up with a valid
01 padding, they just have to try every possible byte; for an attacker to
end up with a valid 02 02 padding, they have to try every possible byte
and happen to have picked a block that has a 02 in the second-to-last
position.
In order to successfully decrypt the message, we still need to figure
out which one of those options is the actual value of the padding. To
do that, we try to discover the length of the padding by modifying
bytes starting at the left-hand side of P_i until the padding becomes
invalid again. As with everything else in this attack, we modify those
bytes in P_i by modifying the equivalent bytes in our chosen block R.
As soon as the padding breaks, you know that the last byte you modified
was part of the valid padding, which tells you how many padding bytes
there are. Since we're using PKCS#5 padding, that also tells you what
their value is.
Let's illustrate this with an example. Suppose we've successfully
found some block R so that R || C_i has valid padding. Let's say that
padding is 03 03 03. Normally, you wouldn't know this; the point of
this procedure is to discover what that padding is. Suppose the block
size is 8 bytes. So, we know that P_i is currently:

p_0 p_1 p_2 p_3 p_4 03 03 03    (6.3)
where p_0 through p_4 are some bytes of the plaintext. Their actual value
doesn't matter: the only thing that matters is that they're not part of
the padding. When we modify the first byte of R, we'll cause a change
in the first byte of P_i, so that p_0 becomes some other byte p'_0:
p'_0 p_1 p_2 p_3 p_4 03 03 03    (6.4)
As you can see, this doesn't affect the validity of the padding. The
same goes for p_1, p_2, p_3 and p_4. However, when we modify the
byte after that (say, we turn that first 03 into a 02), P_i looks like this:
p'_0 p'_1 p'_2 p'_3 p'_4 02 03 03    (6.5)
Since 02 03 03 isn't valid PKCS#5 padding, the server will reject
the message. At that point, we know that the padding breaks once we
modify the sixth byte. That means the sixth byte is the first byte of the
padding. Since the block is 8 bytes long, we know that the padding
consists of bytes 6, 7 and 8, which means that the padding is three
bytes long, and, in PKCS#5, equal to 03 03 03.
For the next section, we'll assume that it was just 01, since that is
the most common case. The attack doesn't really change depending on
the length of the padding. If you guess more bytes of padding correctly,
that just means that there are fewer remaining bytes you will have to
guess manually. (This will become clear once you understand the rest
of the attack.)
Decrypting one byte
At this point, we've actually already successfully decrypted the last byte
of the target block of ciphertext! (Actually, we've decrypted as many
bytes as we have valid padding; we're just assuming the worst case sce-
nario that that's only a single byte.) We know that the last byte
of the decrypted ciphertext block C_i (we'll call that byte D(C_i)[b]),
XORed with our iteratively found value r_b, is 01:

D(C_i)[b] ⊕ r_b = 01

We can just move the XOR operation to the other side, and we
get:

D(C_i)[b] = 01 ⊕ r_b

The attacker has now tricked the receiver into decrypting the last
byte of the block C_i.
Decrypting subsequent bytes
Next, the attacker tricks the receiver into decrypting the next byte.
Remember the previous equation, where we reasoned that the last byte
of the plaintext was 01:

D(C_i)[b] ⊕ r_b = 01

Now, we'd like to get that byte to say 02, to produce an almost valid
padding: the last byte would be correct for a 2-byte PKCS#5 padding
(02 02), but that second-to-last byte probably isn't 02 yet. To do that,
we XOR with 01 to cancel the 01 that's already there (since two XORs
with the same value cancel each other out), and then we XOR with 02
to get 02:

D(C_i)[b] ⊕ r_b ⊕ 01 ⊕ 02 = 01 ⊕ 01 ⊕ 02
                          = 02
The attacker uses that value for the last byte. Then, they try all
possible values for the second-to-last byte (index b − 1). Eventually,
one of them will cause the message to have valid padding. Since we
modified the random block so that the final byte of the plaintext will
be 02, the only byte in the second-to-last position that can cause valid
padding is 02 as well. Using the same math as above, the attacker has
recovered the second-to-last byte.
Then, it's just rinse and repeat. The last two bytes are modified to
create an almost-valid padding of 03 03, then the third byte from the
right is modified until the padding is valid, and so on. Repeating this
for all the bytes in the block means the attacker can decrypt the entire
block; repeating it for different blocks means the attacker can read the
entire message.
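
A rough sketch of the core loop in Python, against a hypothetical padding_oracle(data) function that returns True when the decrypted message has valid padding; the block size, the helper names and the simplifying assumption that no accidental longer padding occurs are all illustrative:

    BLOCK = 16

    def decrypt_block(padding_oracle, target_block):
        """Recover D(target_block) using only the padding oracle."""
        decrypted = bytearray(BLOCK)
        for pad_value in range(1, BLOCK + 1):          # 01, then 02 02, ...
            index = BLOCK - pad_value
            r = bytearray(BLOCK)
            # Force the already-recovered tail to decrypt to pad_value.
            for j in range(index + 1, BLOCK):
                r[j] = decrypted[j] ^ pad_value
            for guess in range(256):
                r[index] = guess
                if padding_oracle(bytes(r) + target_block):
                    decrypted[index] = guess ^ pad_value
                    break
        return bytes(decrypted)

    # XORing with the real previous ciphertext block then yields the plaintext:
    # plaintext_block = xor_bytes(decrypt_block(oracle, C_i), C_previous)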
This attack has proven to be very subtle and hard to fix. First of all,
messages should be authenticated, as well as encrypted. That would
cause modified messages to be rejected. However, many systems de-
crypted (and removed padding) before authenticating the message, so
the information about the padding being valid already leaked.
You might consider just getting rid of the invalid padding mes-
sage, declaring the message invalid without specifying why it was in-
valid. That turns out to be only a partial solution for systems that
decrypt before authenticating. Those systems would typically reject
messages with an invalid padding slightly faster than messages with a
valid padding. After all, they didn't have to do the authentication step:
if the padding is invalid, the message can't possibly be valid.
Tat discrepancy was commonly exploited as well. By measur-
ing how long it takes the recipient to reject the message, the attacker
62 CHAPTER 6. STREAM CIPHERS
can tell if the recipient performed the authentication step. Tat tells
them if the padding was correct or not, providing the padding oracle
to complete the attack.
TODO: Remove TODO about Vaudenay's padding attack later, refer to this
6.8 Native stream ciphers
In addition to block ciphers being used in a particular mode of operation, there are also native stream ciphers: algorithms that are designed from the ground up to be a stream cipher.
The most common type of stream cipher is called a synchronous stream cipher. These algorithms produce a long stream of pseudorandom bits from a secret symmetric key. This stream, called the keystream, is then XORed with the plaintext to produce the ciphertext. Decryption is the identical operation as encryption, just repeated: the keystream is produced from the key, and is XORed with the ciphertext to produce the plaintext.
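As a small illustration (not tied to any particular cipher), XORing with a keystream can be sketched as follows; the keystream is assumed to be at least as long as the message. Applying the same function again with the same keystream gives back the original input, which is exactly why encryption and decryption are the same operation.

def xor_keystream(data, keystream):
    # XOR each byte of the input with the corresponding keystream byte.
    return ''.join(chr(ord(d) ^ ord(k)) for d, k in zip(data, keystream))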
TODO: Explain parallel with one-time pads
Historically, native stream ciphers have had their issues. For example, the NESSIE competition, an international competition for new cryptographic primitives, did not result in any new stream ciphers: all of the participants were broken before the competition ended. RC4, one of the most popular stream ciphers, has had serious known issues for years. By comparison, some of the constructions using block ciphers seem bulletproof.
Fortunately, more recently, several new cipher algorithms provide new hope that we can get practical, secure and performant stream ciphers.
6.9 RC4
By far the most common stream cipher in use on desktop and mobile devices is RC4.
RC4 is sometimes also called ARCFOUR or ARC4, which stands for "alleged RC4". While its source code has been leaked and its implementation is now well-known, RSA Security, where RC4 originated and who still hold the trademark on the name, has never acknowledged that it is the real algorithm.
It quickly became popular because it's very simple and very fast. It's not just extremely simple to implement, it's also extremely simple to apply. Being a synchronous stream cipher, there's little that can go wrong; with a block cipher, you'd have to worry about things like modes of operation and padding. Clocking in at around 13.9 cycles per byte, it's comparable to AES-128 in CTR (12.6 cycles per byte) or CBC (16.0 cycles per byte) modes. AES came out a few years after RC4; when RC4 was designed, the state of the art was 3DES, which was excruciatingly slow by comparison (134.5 cycles per byte in CTR mode). [16]
This algorithm is, unfortunately, quite broken. To better understand just how broken, we'll take a look at how RC4 works. The description requires understanding modular addition; if you aren't familiar with it, you may want to review the appendix section on modular addition.
Everything in RC4 revolves around a state array and two indexes into that array. The array consists of 256 bytes forming a permutation: that is, all possible index values occur exactly once as a value in the array. That means it maps every possible byte value to every possible byte value: usually a different one, but sometimes the same one. We know that it's a permutation because S starts as one, and all operations that modify S always swap values, which obviously keeps it a permutation.
RC4 consists of two major components that work on these indexes i, j and the state array S:

• The key scheduling algorithm, which produces an initial state array S for a given key.
• The pseudorandom generator, which produces pseudorandom bytes from the state array S, modifying it as it goes along.
The key scheduling algorithm
The key scheduling algorithm starts with the identity permutation. That means that each byte is mapped to itself.
Then, the key is mixed in with the state. This is done by iterating over every element of the state. The j index is found by adding the current value of j (starting at 0) with the next byte of the key, and the current state element:
Once j has been found, S[i] and S[j] are swapped:
This process is repeated for all the elements of S. If you run out of key bytes, you just wrap around on the key. This explains why RC4 accepts keys from anywhere between 1 and 256 bytes long. Usually, 128 bit (16 byte) keys are used, which means that each byte in the key is used 16 times.
Or, in Python:

from itertools import cycle

def key_schedule(key):
    # Start from the identity permutation: S[i] == i.
    s = range(256)
    # Repeat the key bytes as often as needed to cover all 256 elements.
    key_bytes = cycle(ord(x) for x in key)

    j = 0
    for i in xrange(256):
        # Mix in the next key byte and the current state element...
        j = (j + s[i] + next(key_bytes)) % 256
        # ...and swap the two state elements.
        s[i], s[j] = s[j], s[i]

    return s
The pseudorandom generator
The pseudorandom generator is responsible for producing pseudorandom bytes from the state S. For each index i, it computes j = j + S[i] (j starts at 0). Then, S[i] and S[j] are swapped:
To produce the output byte, S[i] and S[j] are added together. Their sum is used as an index into S; the value at S[S[i] + S[j]] is the keystream byte K_i:
We can express this in Python:

def pseudorandom_generator(s):
    j = 0
    for i in cycle(range(256)):
        # Update j and swap the two state elements...
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]

        # ...then use the sum of the swapped elements to pick the
        # keystream byte.
        k = (s[i] + s[j]) % 256
        yield s[k]
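Putting the two pieces together, encrypting (or, identically, decrypting) with RC4 is just XORing the data with the generated keystream. This is a bare illustration of how the functions above fit together, not something you should use: as the rest of this section explains, RC4 is broken.

def rc4(key, data):
    keystream = pseudorandom_generator(key_schedule(key))
    return ''.join(chr(ord(c) ^ next(keystream)) for c in data)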
Attacks
There are many attacks on RC4-using cryptosystems where RC4 isn't really the issue, but which are caused by things like key reuse or failing to authenticate the message. We won't discuss these in this section. Right now, we're only talking about issues specific to the RC4 algorithm itself.
Intuitively, we can understand how an ideal stream cipher would produce a stream of random bits. After all, if that's what it did, we'd end up in a situation quite similar to that of a one-time pad.
The stream cipher is ideal if the best way we have to attack it is to try all of the keys, a process called brute-forcing the key. If there's an easier way, such as through a bias in the output bytes, that's a flaw of the stream cipher.
Throughout the history of RC4, people have found many such biases. In the mid-nineties, Andrew Roos noticed two such flaws:

• The first three bytes of the key are correlated with the first byte of the keystream.
• The first few bytes of the state are related to the key with a simple (linear) relation.

For an ideal stream cipher, the first byte of the keystream should tell me nothing about the key. In RC4, it gives me some information about the first three bytes of the key. The latter seems less serious: after all, the attacker isn't supposed to know the state of the cipher.
As always, attacks never get worse. They only get better.
Adi Shamir and Itsik Mantin showed that the second byte produced by the cipher is twice as likely to be zero as it should be. Other researchers showed similar biases in the first few bytes of the keystream. This sparked further research by Mantin, Shamir and Fluhrer [18], showing large biases in the first bytes of the keystream. They also showed that knowing even small parts of the key would allow attackers to make strong predictions about the state and outputs of the cipher.
Most modern stream ciphers provide a way to combine a long-term key with a nonce, to produce multiple different keystreams from the same long-term key. RC4, by itself, doesn't do that. The most common approach was also the simplest: concatenate the long-term key k and the nonce n into k‖n, taking advantage of RC4's flexible key length requirements. This scheme meant attackers knew some bits of the key, allowing them to slowly recover the long-term key from a large amount of messages (around 2^24 to 2^26).
WEP, a standard for protecting wireless networks that was popular at the time, was heavily affected by this attack, since it used this simplistic nonce combination scheme. A scheme where the long-term key and the nonce are combined using a cryptographic hash function wouldn't have this weakness; TLS and other standards were therefore not affected.
Again, attacks only get better. Andreas Klein showed more extensive correlation between the key and the keystream [24]. Instead of the tens of millions of messages needed for the Fluhrer, Mantin, Shamir attacks, attackers now only needed several tens of thousands of messages to make the attack practical. This was applied against WEP with great effect.
In 2013, a team of researchers at Royal Holloway in London produced a combination of devastating practical attacks [3]. They demonstrated two attacks.
The first attack is based on single-byte biases in the first 256 bytes of the keystream. By performing statistical analysis on the keystreams produced by a large number of keys, they were able to analyze the already well-known biases in the early keystream bytes of RC4 in much greater detail.
TODO: illustrate: http://www.isg.rhul.ac.uk/tls/RC4_keystream_dist_2_45.txt
The second attack is based on double-byte biases anywhere in the keystream. It turns out that adjacent bytes of the keystream have an exploitable relation, whereas in an ideal stream cipher you would expect them to be completely independent.
Byte pair       Byte position (mod 256) i       Probability
(0, 0)          i = 1                           2^-16 (1 + 2^-9)
(0, 0)          i ∉ {1, 255}                    2^-16 (1 + 2^-8)
(0, 1)          i ∉ {0, 1}                      2^-16 (1 + 2^-8)
(0, i + 1)      i ∉ {0, 255}                    2^-16 (1 + 2^-8)
(i + 1, 255)    i ≠ 254                         2^-16 (1 + 2^-8)
(255, i + 1)    i ∉ {1, 254}                    2^-16 (1 + 2^-8)
(255, i + 2)    i ∉ {0, 253, 254, 255}          2^-16 (1 + 2^-8)
(255, 0)        i = 254                         2^-16 (1 + 2^-8)
(255, 1)        i = 255                         2^-16 (1 + 2^-8)
(255, 2)        i ∈ {0, 1}                      2^-16 (1 + 2^-8)
(255, 255)      i ≠ 254                         2^-16 (1 + 2^-8)
(129, 129)      i = 2                           2^-16 (1 + 2^-8)
This table may seem a bit daunting at first. The probability expression in the rightmost column may look a bit complex, but there's a reason it's expressed that way. Suppose that RC4 was a good stream cipher, and all values occurred with equal probability. Then you'd expect the probability for any given byte value to be 2^-8, since there are 2^8 different byte values. If RC4 was a good stream cipher, two adjacent bytes would each have probability 2^-8, so any given pair of two bytes would have probability 2^-8 · 2^-8 = 2^-16. However, RC4 isn't an ideal stream cipher, so these properties aren't true. By writing the probability in the 2^-16 (1 + 2^-k) form, it's easier to see how much RC4 deviates from what you'd expect from an ideal stream cipher.
So, let's try to read the first line of the table. It says that when the first byte (i = 1) of any 256-byte chunk from the cipher is 0, then the byte following it is slightly more likely (1 + 2^-9 times as likely, to be exact) to be 0 than to be any other number. We can also see that when one of the keystream bytes is 255, you can make many predictions about the next byte, depending on where it occurs in the keystream. It's more likely to be 0, 1, 2, 255, or the position in the keystream plus one or two.
TODO: demonstrate attack success
Again, attacks only get better. These attacks have primarily focused on the cipher itself, and haven't been fully optimized for practical attacks on, say, web services. The attacks can be greatly improved with some extra information about the plaintext you're attempting to recover. For example, HTTP cookies are often base-64 or hex encoded.
6.10 Salsa20
Salsa20 is a newer stream cipher designed by Dan Bernstein. Bernstein is well-known for writing a lot of open source (public domain) software, a lot of which is either directly security related or built with computer security very much in mind.
There are two minor variants of Salsa20, called Salsa20/12 and Salsa20/8, which are simply the same algorithm except with 12 and 8 rounds⁶ respectively, down from the original 20. ChaCha is another, orthogonal tweak of the Salsa20 cipher, which tries to increase the amount of diffusion per round while maintaining or improving performance. ChaCha doesn't have a 20 after it; specific algorithms do have a number after them (ChaCha8, ChaCha12, ChaCha20), but that refers to the number of rounds.
This cipher is among the state of the art of modern stream ciphers. As of the time of writing, there are no known attacks against Salsa20, ChaCha20, nor against their reduced-round variants. It is also pretty fast. For long streams, it takes about 4 cycles per byte for the full-round version, about 3 cycles per byte for the 12-round version and about 2 cycles per byte for the 8-round version, on modern Intel processors [7] and modern AMD processors [16]. To put that into comparison, that's more than three times faster than RC4⁷, approximately three times faster than AES-CTR with a 128 bit key at 12.6 cycles per byte, and roughly in the ballpark of AES GCM mode⁸ with specialized hardware instructions.
Salsa20 has one particularly interesting property. It's possible to jump to a particular point in the keystream without computing all previous bits. This can be useful, for example, if a large file is encrypted, and you'd like to be able to do random reads in the middle of the file. While many encryption schemes require the entire file to be decrypted, with Salsa20, you can just select the portion you need. Another construction that has this property is a mode of operation called CTR mode, which we'll talk about later.

⁶ Rounds are repetitions of an internal function. Typically a number of rounds are required to make an algorithm work effectively; attacks often start on reduced-round versions of an algorithm.
⁷ The quoted benchmarks don't mention RC4 but MARC4, which stands for "modified alleged RC4". The RC4 section explains why it's "alleged", and "modified" means it throws away the first 256 bytes because of a weakness in RC4.
⁸ GCM mode is an authenticated encryption mode, which we will see in more detail in a later chapter.
6.11 Native stream ciphers versus modes of operation
Some texts only consider native stream ciphers to be stream ciphers. This book emphasizes what the functionality of the algorithm is. Since both block ciphers in a mode of operation and a native stream cipher take a secret key and can be used to encrypt a stream, and the two can usually replace each other in a cryptosystem, we just call both of them stream ciphers and be done with it.
We will further emphasize the tight link between the two with CTR mode, a mode of operation which produces a synchronous stream cipher. While there are also modes of operation (like OFB and CFB) that can produce self-synchronizing stream ciphers, these are far less common, and not discussed here.
6.12 CTR mode
CTR mode, short for counter mode, is a mode of operation that works by concatenating a nonce (which stands for a "number used once") and a counter. The counter is incremented with each block, and padded with zeroes so that the whole is as long as the block size. The resulting concatenated string is run through a block cipher. The outputs of the block cipher are then used as the keystream.
This illustration shows a single input block N‖00…0‖i, consisting of nonce N, current counter value i and padding, being run through block cipher E using key k to produce keystream block S_i, which is then XORed with the plaintext block P_i to produce ciphertext block C_i.
Obviously, to decrypt, you do the exact same thing again, since XORing a bit with the same value twice always produces the original bit: p_i ⊕ s_i ⊕ s_i = p_i. As a consequence, CTR encryption and decryption are the same thing: in both cases you produce the keystream, and you XOR either the plaintext or the ciphertext with it in order to get the other one.
For CTR mode to be secure, it is critical that nonces aren't reused. If they are, the entire keystream will be repeated, allowing an attacker to mount multi-time pad attacks.
This is different from an initialization vector such as the one used by CBC. An IV has to be unpredictable. An attacker being able to predict a CTR nonce doesn't really matter: without the secret key, they have no idea what the output of the block cipher (the sequence in the keystream) would be.
Like Salsa20, CTR mode has the interesting property that you can jump to any point in the keystream easily: just increment the counter to that point. The Salsa20 paragraph on this topic explains why that might be useful.
Another interesting property is that since none of the computations depend on any previous computations, both encryption and decryption are trivial to compute in parallel.
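A minimal sketch of the keystream generation, with the nonce in front, zero padding in the middle and the counter at the end of each input block. The block_encrypt argument is a placeholder for the raw block cipher under a fixed secret key (say, one AES-128 block encryption); it is an assumption of this sketch, not a real library call.

import struct

def ctr_keystream(block_encrypt, nonce, block_size=16):
    counter = 0
    while True:
        # Input block: nonce || zero padding || counter, one block long.
        counter_bytes = struct.pack('>Q', counter)
        padding = '\x00' * (block_size - len(nonce) - len(counter_bytes))
        yield block_encrypt(nonce + padding + counter_bytes)
        counter += 1

Each keystream block is then XORed with the corresponding plaintext (or ciphertext) block to produce the other one.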
6.13 Stream cipher bit flipping attacks
Stream ciphers, such as native stream ciphers or a block cipher in CTR mode, are also vulnerable to a bit flipping attack. It's similar to CBC bit flipping attacks in the sense that an attacker flips several bits in the ciphertext, and that causes some bits to be flipped in the plaintext.
This attack is actually much simpler to perform on stream ciphers than it is on CBC mode. First of all, a flipped ciphertext bit affects the exact same bit in the plaintext, not a bit in the following block. It only affects that bit; in the CBC bit flipping attacks, the plaintext of the modified block is scrambled. Since the attacker is modifying a sequence of bytes and not a sequence of blocks, they're not limited to a block size.
TODO: illustrate
This is yet another example of why authentication has to go hand in hand with encryption. If the message is properly authenticated, the recipient can simply reject the modified messages, and the attack is foiled.
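Using the xor_keystream helper sketched earlier, the effect is easy to see with a toy (and completely insecure) keystream; flipping a single ciphertext bit flips exactly that bit of the recovered plaintext:

keystream = '\x42' * 5
ciphertext = xor_keystream('hello', keystream)

# The attacker flips the lowest bit of the first ciphertext byte...
tampered = chr(ord(ciphertext[0]) ^ 0x01) + ciphertext[1:]

# ...and the receiver decrypts it to 'iello': only that one bit changed.
print xor_keystream(tampered, keystream)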
6.14 Authenticating modes of operation
There are other modes of operation that provide authentication as well as encryption at the same time. Since we haven't discussed authentication at all yet, we'll handle these later.
6.15 Remaining problem
We now have tools that will encrypt large streams of data using a small key. However, we haven't actually discussed how we're going to agree on that key. As noted in a previous chapter, to communicate between n people, we need n² key exchanges. While the key to be exchanged is a lot smaller now, the fundamental problem of the impossibly large number of key exchanges hasn't been solved yet. Next, we'll look at key exchange protocols: protocols that allow us to agree on a secret key over an insecure medium.
Additionally, we've seen that encryption isn't enough to provide security: without authentication, it's easy for attackers to modify the message, and in many flawed systems even decrypt messages. In a future chapter, we'll discuss how to authenticate messages, to prevent attackers from modifying them.
7
Key exchange
7.1 Description
Key exchange protocols attempt to solve a problem that, at first glance, seems impossible. Alice and Bob, who've never met before, have to agree on a secret value. The channel they use to communicate is insecure: we're assuming that everything they send across the channel is being eavesdropped on.
We'll demonstrate such a protocol here. Alice and Bob will end up having a shared secret, only communicating over the insecure channel. Despite Eve having literally all of the information Alice and Bob send to each other, she can't use any of that information to figure out their shared secret.
That protocol is called Diffie-Hellman, named after Whitfield Diffie and Martin Hellman, the two cryptographic pioneers who discovered it. They suggest calling the protocol Diffie-Hellman-Merkle key exchange, to honor the contributions of Ralph Merkle. While his contributions certainly deserve honoring, that term hasn't really caught on much. For the benefit of the reader we'll use the more common term.
Practical implementations of Diffie-Hellman rely on mathematical problems that are believed to be very complex to solve in the "wrong" direction, but easy to compute in the "right" direction. Understanding the mathematical implementation isn't necessary to understand the principle behind the protocol. Most people also find it a lot easier to understand without the mathematical complexity. So, we'll explain Diffie-Hellman in the abstract first, without any mathematical constructs. Afterwards, we'll look at two practical implementations.
7.2 Abstract Diffie-Hellman
In order to describe Diffie-Hellman, we'll use an analogy based on mixing colors. We can mix colors according to the following rules:

• It's very easy to mix two colors into a third color.
• Mixing two or more colors in a different order results in the same color.
• Mixing colors is one-way. It's impossible to determine if, let alone which, multiple colors were used to produce a given color. Even if you know it was mixed, and even if you know some of the colors used to produce it, you have no idea what the remaining color(s) were.

We'll demonstrate that with a mixing function like this one, we can produce a secret color only known by Alice and Bob. Later, we'll simply have to describe the concrete implementation of those functions to get a concrete key exchange scheme.
To illustrate why this remains secure in the face of eavesdroppers, we'll walk through an entire exchange with Eve, the eavesdropper, in the middle. Eve is listening to all of the messages sent across the network. We'll keep track of everything she knows and what she can compute, and end up seeing why Eve can't compute Alice and Bob's shared secret.
To start the protocol, Alice and Bob have to agree on a base color. They can communicate that across the network: it's okay if Eve hears. Typically, this base color is a fixed part of the protocol; Alice and Bob don't need to communicate it. After this step, Alice, Bob and Eve all have the same information: the base color.
Alice and Bob both pick a random color, and they mix it with the base color.
At the end of this step, Alice and Bob know their respective secret color, the mix of the secret color and the base color, and the base color itself. Everyone, including Eve, knows the base color.
They then send both of their mixed colors over the network. Eve sees both mixed colors: but she can't figure out what either of Alice and Bob's secret colors are. Even though she knows the base, she can't "un-mix" the colors sent over the network.¹
At the end of this step, Alice and Bob know the base, their respective secrets, their respective mixed colors, and each other's mixed colors. Eve knows the base color and both mixed colors.
Once Alice and Bob receive each other's mixed color, they add their own secret color to it. Since the order of the computation doesn't matter, they'll both end up with the same secret.
Eve can't perform that computation. She could do it using either of the secret colors, since she has both mixed colors, but she has neither secret.

¹ While this might seem like an easy operation with black-and-white approximations of color mixing, keep in mind that this is just a failure of the illustration: our assumption was that this was hard.
7.3 Diffie-Hellman with discrete logarithms
This section describes a practical implementation of the abstract Diffie-Hellman algorithm, based on the discrete logarithm problem. It is intended to provide some mathematical background, and requires modular arithmetic to understand. If you are unfamiliar with modular arithmetic, you can either skip this chapter, or first read the mathematical background appendix.
Discrete log Diffie-Hellman is based on the idea that computing y in the following equation is easy (at least for a computer):

y ≡ g^x (mod p)    (7.1)

However, computing x given y, g and p is believed to be very hard. This is called the discrete logarithm problem, because a similar operation without the modular arithmetic is called a logarithm.
This is just a concrete implementation of the abstract Diffie-Hellman process we discussed earlier. The common base color is a large prime p and the base g. The color mixing operation is the equation given above, where x is the input value and y is the resulting mixed value.
When Alice and Bob select their random numbers r_A and r_B, they mix them with the base to produce the mixed numbers m_A and m_B:

m_A = g^(r_A) (mod p)    (7.2)
m_B = g^(r_B) (mod p)    (7.3)

These numbers are sent across the network where Eve can see them. The premise of the discrete logarithm problem is that that's okay, because figuring out r in m = g^r (mod p) is supposedly very hard.
Once Alice and Bob have each other's mixed numbers, they add their own secret number to it. For example, Bob would compute:

s = (g^(r_A))^(r_B) (mod p)    (7.4)

While Alice's computation looks different, they get the same result, because (g^(r_A))^(r_B) = (g^(r_B))^(r_A) (mod p). This is the shared secret.
Because Eve doesn't have r_A or r_B, she can not perform the equivalent computation: she only has the base number g and the mixed numbers m_A = g^(r_A) (mod p) and m_B = g^(r_B) (mod p), which are useless to her. She needs either r_A or r_B (or both) to make the computation Alice and Bob do.
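A toy run of the protocol with numbers small enough to follow by hand may help; real implementations use a prime p of 2048 bits or more, and these particular values are purely illustrative.

p, g = 23, 5                  # public parameters: a small prime and a base

r_a, r_b = 6, 15              # Alice's and Bob's secret random numbers

m_a = pow(g, r_a, p)          # Alice's mixed number: g^(r_A) mod p
m_b = pow(g, r_b, p)          # Bob's mixed number: g^(r_B) mod p

# Each party combines the other's mixed number with their own secret;
# both arrive at the same shared secret, which Eve cannot compute from
# p, g, m_a and m_b alone.
assert pow(m_b, r_a, p) == pow(m_a, r_b, p) == 2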
TODO: Say something about active MITM attacks where the attacker picks smooth values to produce weak secrets?
7.4 Diffie-Hellman with elliptic curves
This section describes a practical implementation of the abstract Diffie-Hellman algorithm, based on the elliptic curve discrete logarithm problem. It is intended to provide some mathematical background, and requires a (very basic) understanding of the mathematics behind elliptic curve cryptography. If you are unfamiliar with elliptic curves, you can either skip this chapter, or first read the mathematical background appendix.
One of the benefits of the elliptic curve Diffie-Hellman variant is that the required key size is much, much smaller than the variant based on the discrete log problem. This is because the fastest algorithms for breaking the discrete log problem have a larger asymptotic complexity than their elliptic curve variants. For example, one of the fastest algorithms for attacking discrete log Diffie-Hellman, the function field sieve, has complexity:

O(exp(((64/9) · log n)^(1/3) · (log log n)^(2/3)))

On the other hand, the fastest algorithms that could be used to break the elliptic curve discrete log problem all have complexity:

O(√n)
Relatively speaking, that means that it's much harder to solve the elliptic curve problem than it is to solve the regular discrete log problem, using state of the art algorithms for both. The flip side of that is that for equivalent security levels, the elliptic curve algorithm needs much smaller key sizes [28] [23]²:

Security level in bits    Discrete log key bits    ECC key bits
56                        512                      112
80                        1024                     160
112                       2048                     224
128                       3072                     256
256                       15360                    512

² These figures are actually for the RSA problem versus the equivalent EC problem, but their security levels are sufficiently close to give you an idea.
7.5 Remaining problems
Using Diffie-Hellman, we can agree on shared secrets across an insecure Internet, safe from eavesdroppers. However, while an attacker may not be able to simply get the secret from eavesdropping, an active attacker can still break the system. If an attacker (Mallory) is in between Alice and Bob, they can still perform the Diffie-Hellman protocol twice: once with Alice, where the attacker pretends to be Bob, and once with Bob, where the attacker pretends to be Alice.
Alice and Bob will each have a shared secret, but they share it with Mallory rather than with each other. The attacker can then simply take all the messages they get from one person and send them to the other, they can look at the plaintext messages, remove messages, and they can also modify them in any way they choose.
To make matters worse, even if one of the two participants was somehow aware that this was going on, they would have no way to get the other party to believe them. After all: the attacker is the one with the shared secrets that check out, not the intended participant.
While Diffie-Hellman successfully produced a shared secret between sender and receiver, there are clearly some pieces of the puzzle still missing. We need tools that help us authenticate Alice to Bob and vice versa, and we need tools that help guarantee message integrity: that the messages the recipient receives are in fact the messages the sender intended to send.
8
Public-key encryption
8.1 Description
So far, we have only done secret-key encryption. Suppose that you could have a cryptosystem that didn't involve a single secret key, but instead had a key pair: one public key, which you freely distribute, and a private one, which you keep to yourself. This is called public-key encryption. People can encrypt information intended for you by using your public key. The information is then impossible to decipher without your private key.
For a long time, people thought this was impossible. However, starting in the 1970s, such algorithms started appearing. The first publicly available encryption scheme was produced by three cryptographers from MIT: Ron Rivest, Adi Shamir and Leonard Adleman. The algorithm they published is still the most common one today, and carries the first letters of their last names: RSA.
Public-key algorithms aren't limited to encryption. In fact, you've already seen a public-key algorithm in this book that isn't directly used for encryption. There are actually three related classes of public-key algorithms:

1. Key exchange algorithms, such as Diffie-Hellman, which allow you to agree on a shared secret across an insecure medium.
2. Encryption algorithms, such as the ones we'll discuss in this chapter, which allow people to encrypt without having to agree on a shared secret.
3. Signature algorithms, which we'll discuss in a later chapter, which allow you to sign any piece of information using your private key in a way that allows anyone else to easily verify it using your public key.
8.2 Why not use public-key encryption for
everything?
At face value, it seems that the existence of public-key encryption algorithms obsoletes all our previous secret-key encryption algorithms. We could just use public-key encryption for everything, avoiding all the added complexity of having to do key agreement for our symmetric algorithms.
By far the most important reason not to do this is performance. Compared to our speedy stream ciphers (native or otherwise), public-key encryption mechanisms are extremely slow. A single 2048-bit RSA encryption takes 0.29 megacycles, and decryption takes a whopping 11.12 megacycles. [16] To put this into comparison, symmetric key algorithms work on the order of 10 or so cycles per byte in either direction. In order to encrypt or decrypt 2048 bits, that means approximately 2,500 cycles.
There are a few other problems with most practical cryptosystems. For example, RSA can't encrypt anything larger than its modulus, which is generally less than or equal to 4096 bits, far smaller than the largest messages we'd like to send. Still, the most important reason is the speed argument given above.
8.3 RSA
As we already mentioned, RSA is one of the first practical public-key encryption schemes. It remains the most common one to this day.

Encryption and decryption

RSA encryption and decryption rely on modular arithmetic. You may want to review the modular arithmetic primer before continuing.
In order to generate a key, you pick two large prime numbers p and q. These numbers have to be picked at random, and in secret. You multiply them together to produce the modulus N, which is public. Then, you pick an encryption exponent e, which is also public. Usually, this value is either 3 or 65537; because those numbers have few 1s in their binary expansion, you can compute the exponentiation more efficiently. Put together, (N, e) is the public key. Anyone can use the public key to encrypt a message M into a ciphertext C:

C ≡ M^e (mod N)

The next problem is decryption. It turns out that there is a value d, the decryption exponent, that can turn C back into M. That value is fairly easy to compute assuming that you know p and q, which we do. Using d, you can decrypt the message like so:

M ≡ C^d (mod N)
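A toy round trip with tiny primes makes the two equations concrete. Real keys use primes that are hundreds of digits long, and real systems never encrypt raw messages without padding (see the OAEP section below); the numbers here are purely illustrative.

p, q = 61, 53
N = p * q                     # the public modulus, 3233
e = 17                        # the public encryption exponent
d = 2753                      # the decryption exponent: the inverse of e
                              # modulo (p - 1) * (q - 1), easy to compute
                              # if you know p and q

M = 65                        # the message, encoded as a number below N
C = pow(M, e, N)              # encryption: C = M^e mod N
assert pow(C, d, N) == M      # decryption: M = C^d mod N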
Breaking RSA

Like many cryptosystems, RSA relies on the presumed difficulty of a particular mathematical problem. For RSA, this is the RSA problem, specifically: to find the plaintext message M, given a ciphertext C and public key (N, e) in the equation:

C ≡ M^e (mod N)    (8.1)

The easiest way we know how to do that is to factor N back into p · q. Given p and q, you could just repeat the process that the legitimate owner of the key does during key generation in order to compute the private exponent d, and then you've won.
Fortunately, we don't have an algorithm that can factor such large numbers in reasonable time. Unfortunately, we also haven't proven that one doesn't exist. Even more unfortunate is that we have a theoretical algorithm, called Shor's algorithm, that would be able to factor such a number in reasonable time on a quantum computer. Right now, quantum computers are far from practical, but it does appear that if someone in the future manages to build one that's sufficiently large, RSA becomes ineffective.
Implementation pitfalls

Right now there are no known practical complete breaks against RSA. That's not to say that systems employing RSA aren't routinely broken. Like with most broken cryptosystems, there are plenty of cases where sound components, improperly applied, result in a useless system. For a more complete overview of the things that can go wrong with RSA implementations, please refer to [12] and [4]. In this book, we'll just highlight a few interesting ones.

PKCSv1.5 padding

Salt

Salt¹ is a provisioning system written in Python. In the author's opinion, it has one major flaw: it has a module named crypto.
Instead of reusing existing complete cryptosystems², it implements its own, using RSA and AES provided by a third party package.
For a long time, Salt used a public exponent of 1, which meant the encryption phase didn't actually do anything. While this issue has now been fixed, it only goes to show once again that you probably shouldn't implement your own cryptography.

¹ So, there's Salt the provisioning system, salts the things used in broken password stores, NaCl (pronounced "salt") the cryptography library, and NaCl which runs native code in some browsers, and probably a bunch I'm forgetting. Can we stop naming things after it?
² For a variety of reasons, which, even if they were sensible, are secondary to the tenet of not implementing your own cryptography.
OAEP
OAEP, short for optimal asymmetric encryption padding, is the state of the art in RSA padding. It was introduced by Mihir Bellare and Phillip Rogaway in 1995. [6] Its structure looks like this:
The thing that eventually gets encrypted is X‖Y, which is n bits long, where n is the number of bits of N, the RSA modulus. It takes a random block R that's k bits long, where k is a constant specified by the standard. The message is first padded with zeroes to be n - k bits long. If you look at the above ladder, everything on the left half is n - k bits long, and everything on the right half is k bits long. The random block R and zero-padded message M‖000… are combined using two trapdoor functions, G and H. A trapdoor function is a function that's very easy to compute in one direction and very hard to reverse. In practice, these are cryptographic hash functions; we'll see more about those later.
As you can tell from the diagram, G takes k bits and turns them into n - k bits, and H is the other way around, taking n - k bits and turning them into k bits.
The resulting blocks X and Y are concatenated, and the result is encrypted using the standard RSA encryption primitive, to produce the ciphertext.
To see how decryption works, we reverse all the steps. The recipient gets X‖Y when decrypting the message. They know k, since it is a fixed parameter of the protocol, so they can split up X‖Y into X (the first n - k bits) and Y (the last k bits).
In the previous diagram, the directions are for padding being applied. Reverse the arrows on the sides of the ladder, and you can see how to revert the padding:
TODO: reverse arrows
We want to get to M, which is in M‖000…. There's only one way to compute that, which is:

M‖000… = X ⊕ G(R)

Computing G(R) is a little harder: we first have to recover the random block R, which is

R = H(X) ⊕ Y

and then apply G to it.
As you can see, at least for some definitions of the functions H and G, we need all of X and all of Y (and hence the entire encrypted message) in order to learn anything about M. There are many functions that would be a good choice for H and G; they are based on cryptographic hash functions, which we'll discuss in more detail later in the book.
8.4 Elliptic curve cryptography
TODO: This
8.5 Remaining problem: unauthenticated encryption

Most public-key encryption schemes can only encrypt small chunks of data at a time, much smaller than the messages we want to be able to send. They are also generally quite slow, much slower than their symmetric counterparts. Therefore public-key cryptosystems are almost always used in conjunction with secret-key cryptosystems.
When we discussed stream ciphers, one of the remaining issues that we were facing was that we still had to exchange secret keys with a large number of people. With public-key cryptosystems such as public-key encryption and key exchange protocols, we've now seen two ways that we can solve that problem. That means that we can now communicate with anyone, using only public information, completely secure from eavesdroppers.
So far we've only been talking about encryption without any form of authentication. That means that while we can encrypt and decrypt messages, we cannot verify that the message is what the sender actually sent.
While unauthenticated encryption may provide secrecy, we've already seen that without authentication an active attacker can generally modify valid encrypted messages successfully, despite the fact that they don't necessarily know the corresponding plaintext. Accepting these messages can often lead to secret information being leaked, so not even the secrecy property is satisfied.
As a result it has become evident that we need ways to authenticate as well as encrypt our secret communications. This is done by adding extra information to the message that only the sender could have computed. Just like encryption, authentication comes in both private-key (symmetric) and public-key forms. Symmetric authentication schemes are typically called message authentication codes. Public-key authentication is typically called a signature.
First, we will introduce a new cryptographic primitive: hash functions. These can be used to produce both signature schemes as well as message authentication schemes. Unfortunately, they are also very often abused to produce entirely insecure systems.
9
Hash functions
9.1 Description
Hash functions are functions that take an input of indeterminate length and produce a fixed-length value, also known as a digest.
Simple hash functions have many applications. Hash tables, a common data structure, rely on them. These simple hash functions really only guarantee one thing: for two identical inputs, they'll produce an identical output. Importantly, there's no guarantee that two identical outputs imply that the inputs were the same¹. A good hash function is also quick to compute.
Since this is a book on cryptography, we're particularly interested in cryptographic hash functions. Those are hash functions with much stronger properties. For a cryptographic hash function, we want it to be impossibly hard to:

1. modify a message without changing the hash.
2. generate a message that has a given hash.
3. find two different messages with the same hash.

The first property implies that cryptographic hash functions will exhibit something known as the avalanche effect. Changing even a single bit in the input will produce an avalanche of changes through the entire digest: each bit of the digest will have a 50% chance of flipping.
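The avalanche effect is easy to observe. The two inputs below differ in a single bit ('n' is 0x6e, 'o' is 0x6f), yet the digests are completely unrelated; SHA-256 is used here simply as an example of a modern cryptographic hash function.

import hashlib

print hashlib.sha256('attack at dawn').hexdigest()
print hashlib.sha256('attack at dawo').hexdigest()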
The second property, which states that it should be difficult to find a message m that has a given hash value h, is called pre-image resistance. This makes a hash function a one-way function: it's very easy to compute a hash for a given message, but it's very hard to compute a message for a given hash.
The third property, about finding messages with the same hash value, comes in two flavors. In the first one, there's a given message m, and it should be difficult to find another message m' with the same hash value: that's called second pre-image resistance. The second one is stronger, stating that it should be hard to find any two messages m, m' that have the same hash value. This is called collision resistance. Because collision resistance is a stronger form of second pre-image resistance, they're sometimes also called weak and strong collision resistance.

¹ That would be impossible: there's only a finite amount of digests, since they're fixed size, but there's an infinite amount of inputs.
TODO: Maybe link to http://www.cs.ucdavis.edu/~rogaway/
papers/relates.pdf for further reading
9.2 MD5
TODO: Explain MD5
9.3 SHA-1
TODO: Explain SHA-1
9.4 SHA-2
TODO: Explain SHA-2
9.5 Keccak and SHA-3
TODO: Explain Keccak
TODO: Explain the parameter change debacle in SHA-3
9.6 BLAKE and BLAKE2
TODO: Explain BLAKE, BLAKE2
9.7 Password storage
One of the most common use cases for cryptographic hash functions, and unfortunately one which is also completely and utterly broken, is password storage.
Suppose you have a service where people log in using a username and a password. You'd have to store the password somewhere, so that next time the user logs in, you can verify the password they supplied.
Storing the password directly has several issues. Besides an obvious timing attack in the string comparison, if the password database were to be compromised, an attacker would be able to just go ahead and read all of the passwords. Since many users re-use passwords, that's a catastrophic failure. Most user databases also contain the users' e-mail addresses, so it would be very easy to hijack a bunch of your users' accounts that are unrelated to this service.
Hash functions to the rescue
An obvious approach would be to hash the password using a cryptographically secure hash function. Since the hash function is easy to compute, whenever the user provides their password, you can just compute the hash value of that, and compare it to what you stored in the database.
If an attacker were to steal the user database, they could only see the hash values, and not the actual passwords. Since the hash function is impossible for an attacker to invert, they wouldn't be able to turn those back into the original passwords. Or so people thought.
Rainbow tables
It turns out that this reasoning is flawed. The amount of passwords that people actually use is very limited. Even with very good password practices, they're strings somewhere between 10 and 20 characters, consisting mostly of things that you can type on common keyboards. In practice though, people use even worse passwords: things based on real words (password, swordfish), consisting of few symbols and few symbol types (1234), or with predictable modifications of the above (passw0rd).
To make matters worse, hash functions are the same everywhere. If a user re-uses the same password on two sites, and both of them hash the password using MD5, the values in the password database will be the same. It doesn't even have to be per-user: many passwords are extremely common (password), so many users will use the same one.
Keep in mind that a hash function is easy to evaluate. What if we simply try many of those passwords, creating huge tables mapping passwords to their hash values?
That's exactly what some people did, and the tables were just as effective as you'd expect them to be, completely breaking any vulnerable password store. Such tables are called rainbow tables. This is because they're essentially sorted lists of hash function outputs. Those outputs will be more or less randomly distributed. When written down in hexadecimal formats, this reminded some people of color specifications like the ones used in HTML, e.g. #52f211, which is lime green.
Salts
The reason rainbow tables were so incredibly effective was because everyone was using one of a handful of hash functions. The same password would result in the same hash everywhere.
This problem was generally solved by using salts. By mixing (appending or prepending) the password with some random value before hashing it, you could produce completely different hash values out of the same hash function. It effectively turns a hash function into a whole family of related hash functions, with virtually identical security and performance properties, except with completely different output values.
The salt value is stored next to the password hash in the database. When the user authenticates using the password, you just combine the salt with the password, hash it, and compare it against the stored hash.
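As a sketch, a salted password store might look like the following; note, as the rest of this section argues, that even a salted plain hash is not an acceptable way to store passwords, and a key derivation function should be used instead.

import hashlib
import os

def store(password):
    # A fresh random salt per user, stored alongside the hash.
    salt = os.urandom(20)
    return salt, hashlib.sha256(salt + password).digest()

def verify(password, salt, stored_digest):
    # A real system would also compare these in constant time.
    return hashlib.sha256(salt + password).digest() == stored_digest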
If you pick a random value that's large enough (say, one of 2^160 possible values), you've completely defeated ahead-of-time attacks like rainbow tables. In order to successfully mount a rainbow table attack, an attacker would have to have a separate table for each of those salt values. Since even a single table was usually quite large, storing a large amount of them would be impossible. Even if an attacker were able to store all that data, they'd still have to compute it. Computing a single table took a decent amount of time; computing 2^160 of them is impossible.
Many systems used a single salt for all users. While that prevented a rainbow table attack, it still allowed attackers to attack all passwords at once. Marginally more advanced (but still fundamentally broken) systems used a different salt for each user, which meant attackers could only attack one password at a time.
Perhaps the biggest problem with salts is that many programmers were suddenly convinced they were doing the right thing. They'd heard of broken password storage schemes, and they knew how to do it properly, so they ignored all talk about how a password database could be compromised. They weren't the ones storing passwords in plaintext, or forgetting to salt their hashes, or re-using salts for different users. It was all of those other people that didn't know what they were doing that had those problems. Unfortunately, that's not true. Perhaps that's why broken password storage schemes are still the norm.
Modern attacks on weak password systems
To a modern attack, salts quite simply don't matter. Modern attacks take advantage of the fact that the hash function being used is easy to compute. Using faster hardware, in particular video cards, we can simply enumerate all of the passwords, regardless of salt.
TODO: more concrete performance numbers about GPUs
Salts may make precomputed attacks impossible, but they do very little against an attacker that actually knows the salt.
So where do we go from here?
In order to protect passwords, you don't need a hash function, but a key derivation function. We'll deal with those in a following chapter.
9.8 Length extension attacks
In many hash functions, particularly the previous generations, the internal state kept by the hash function was used as the digest value. In some poorly engineered systems, that causes a critical flaw: if an attacker knows H(M_1), it's very simple to compute H(M_1‖M_2), without actually knowing the value of M_1. Since you know H(M_1), you know the state of the hash function after it's hashed M_1. You can use that to reconstruct the hash function, and ask it to hash more bytes. Setting the hash function's internal state to a known state you got from somewhere else (such as H(M_1)) is called fixation.
For most real-world hash functions, it's a little bit more complicated than that. They commonly have a padding step that an attacker needs to recreate. MD5 and SHA-1 have the same padding step. It's fairly simple, so we'll go through it (a small sketch of the resulting glue padding follows the list):

1. Add a 1 bit to the message.
2. Add zero bits until the length is 448 (mod 512).
3. Take the total length of the message, before padding, and add it as a 64-bit integer.
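A small sketch of that glue padding, following SHA-1's conventions (MD5 is identical except that the length is encoded little-endian); the only input the attacker needs is a guess at the total length, in bytes, of the original message, secret included.

import struct

def glue_padding(message_length):
    # A 1 bit (0x80), zero bytes until the length is 56 mod 64, then the
    # original length in bits as a big-endian 64-bit integer.
    padding = '\x80'
    padding += '\x00' * ((55 - message_length) % 64)
    padding += struct.pack('>Q', message_length * 8)
    return padding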
For the attacker to be able to compute H(M_1‖M_2) given H(M_1), the attacker needs to fake that padding as well. The attacker will actually compute H(M_1‖G‖M_2), where G is the glue padding, called that way because it glues the two messages together. The hard part is knowing the length of the message M_1.
In many systems, the attacker can actually make fairly educated guesses about the length of M_1, though. As an example, consider the common (broken) example of a secret-prefix authentication code. People send messages M_i, authenticated using A_i = H(S‖M_i), where S is a shared secret.
It's very easy for the recipient to compute the same function, and verify the code is correct. Any change to the message M_i will change the value of A_i drastically, thanks to the avalanche effect. Unfortunately, it's quite easy for attackers to forge messages. Since the authentication codes are usually sent together with the original message, the attacker knows the length of the original message. Then, the attacker only has to guess at the length of the secret, which they'll probably get right in a hundred or so tries. Contrast this with guessing the secret itself, which is impossible for any reasonably chosen secret.
There are secure authentication codes that can be designed using cryptographic hash functions: this one just isn't it. We'll see better ones in a later chapter.
Some hash functions, particularly newer ones such as the SHA-3 competition finalists, do not exhibit this property. The digest is computed from the internal state, instead of using the internal state directly.
This makes the SHA-3-era hash functions not only a bit more fool-proof, but also enables them to produce simpler schemes for message authentication. (We'll elaborate on those in a later chapter.) While length extension attacks only affected systems where cryptographic hash functions were being abused in the first place, there's something to be said for preventing them anyway. People will end up making mistakes, so we might as well mitigate where we can.
TODO: say why this prevents meet in the middle attacks?
9.9 Hash trees
Hash trees are trees² where each node is identified by a hash value, consisting of its contents and the hash value of its ancestor. The root node, not having an ancestor, simply hashes its own contents.
TODO: illustrate
This definition is very wide; practical hash trees are often more restricted. They might be binary trees³, or perhaps only leaf nodes carry data of their own, and parent nodes only carry derivative data. Particularly these restricted kinds are often called Merkle trees.
Systems like these or their variants are used by many systems, particularly distributed systems. Examples include distributed version control systems such as Git, digital currencies such as Bitcoin, distributed peer-to-peer networks like Bittorrent, and distributed databases such as Cassandra.

² Directed graphs, where each node except the root has exactly one ancestor.
³ Each non-leaf node has no more than two children.
9.10 Remaining issues
We've already illustrated that hash functions, by themselves, can't authenticate messages. Also, we've illustrated that hash functions can't be used to secure passwords. We'll tackle both of these problems in the following chapters.
While this chapter has focused heavily on what hash functions can't do, it can't be stressed enough that they are still incredibly important cryptographic primitives. They just happen to be commonly abused cryptographic primitives.
10
Message authentication codes
10.1 Description
A message authentication code (MAC) is a small bit of information that can be used to check the authenticity and the integrity of a message. These codes are often called tags. A MAC algorithm takes a message of arbitrary length and a secret key of fixed length, and produces the tag. The MAC algorithm also comes with a verification algorithm that takes a message, the key and a tag, and tells you if the tag was valid or not. (It is not always sufficient to just recompute a tag and check if they are the same; many secure MAC algorithms are randomized, and will produce different tags every time you apply them.)
Note that we say "message" here instead of "plaintext" or "ciphertext". This ambiguity is intentional. In this book we're mostly interested in MACs as a way to achieve authenticated encryption, so the message will always be a ciphertext. That said, there's nothing wrong with a MAC being applied to a plaintext message. In fact, we will be seeing examples of secure authenticated encryption schemes that explicitly allow for authenticated (but not encrypted) information to be sent along with the authenticated ciphertext.
Often, when you just want to talk about the authenticity and integrity of a particular plaintext message, it may be more practical to use a signing algorithm, which we'll talk about in a later chapter.
Secure MACs
We haven't quite defined yet exactly which properties we want from a secure MAC.
We will be defending against an active attacker. The attacker will be performing a chosen message attack. That means that the attacker will ask us the tag for any number of messages m_i, and we'll answer truthfully with the appropriate tags t_i.
The attacker will then attempt to produce an existential forgery, a fancy way of saying that they will produce some new valid combination of (m, t). The obvious target for the attacker is the ability to produce valid tags t' for new messages m' of their choosing. We will also consider the MAC insecure if an attacker can compute a new, different valid tag t' for a message m_i that we previously gave them a valid tag for.
Why does a MAC take a secret key?
If you've had to deal with verifying the integrity of a message before, you may have used checksums (like CRC32 or Adler32) or even cryptographic hashes (like the SHA family) in order to compute a checksum for the message (depending on the algorithm and who you're talking to, they may have called it "hash" or "digest", too).
Let's say that you're distributing a software package. You have some tarballs with source code in them, and maybe some binary packages for popular operating systems. Then you put some (cryptographically secure!) hashes right next to them, so that anyone who downloads them can verify the hashes and be confident that they downloaded what they think they downloaded.
Of course, this scheme is actually totally broken. Computing those hashes is something everyone can do. You're even relying on that fact for your users to be able to verify their download. That also means that an attacker that modified any of the downloads can just compute the hash again for the modified download, and save that value. A user downloading the modified file will compute its hash and compare it against the modified hash, and conclude that the download worked. The scheme provided no help whatsoever against an attacker modifying the download, either as stored, or in transit.
In order to do this securely, you would either apply signing algorithms to the binaries directly, or sign the digests, as long as the hash function used to produce the digest is secure against second-preimage attacks. The important difference is that producing a signature (using either a pre-shared key with your users, or, preferably, a public-key signing algorithm) is not something that an attacker can do. Only someone who has the secret keys can do that.
10.2 Combining MAC and message

As we've mentioned before, unauthenticated encryption is bad. That's why we introduced MACs. Of course, for a MAC to be useful, it has to make it to the recipient. Since we're explicitly talking about authenticated encryption now, we'll stop using the word "message" and instead use the less ambiguous "plaintext" and "ciphertext".
There are three common ways to combine a ciphertext with a MAC:

1. Authenticate and encrypt. You authenticate the plaintext, encrypt the plaintext, and concatenate the two: E(K_C, P) ‖ MAC(K_M, P). This is how SSH does it.
2. Authenticate, then encrypt. You take the plaintext, concatenate the MAC of the plaintext, and encrypt the whole: E(K_C, P ‖ MAC(K_M, P)). This is how TLS does it (usually).
3. Encrypt, then authenticate. You encrypt the plaintext, compute the MAC of that ciphertext, and concatenate the two: E(K_C, P) ‖ MAC(K_M, E(K_C, P)). This is how IPSec does it.
These were studied in depth in the landmark paper [26]. Out of all of these, encrypt-then-authenticate is unequivocally the best option. It's so unequivocally the best option that Moxie Marlinspike, a well-respected information security researcher, has a principle called "The Cryptographic Doom Principle" for any system that does not follow this pattern [29]. (More accurately, Moxie claims that any system that does anything before checking the MAC is doomed. Both the first and the second options require you to decrypt something before you can verify the authentication.)
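As a rough illustration of option 3, here is a sketch of encrypt-then-authenticate using HMAC for the MAC; the encrypt and decrypt functions and the two independent keys are placeholders standing in for whatever cipher and key management the application actually uses:

import hmac, hashlib

def seal(k_enc, k_mac, plaintext, encrypt):
    # encrypt() is a placeholder for a secure cipher (e.g. AES in CTR mode).
    ciphertext = encrypt(k_enc, plaintext)
    tag = hmac.new(k_mac, ciphertext, hashlib.sha256).digest()
    return ciphertext + tag

def unseal(k_enc, k_mac, sealed, decrypt):
    ciphertext, tag = sealed[:-32], sealed[-32:]
    expected = hmac.new(k_mac, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad MAC")   # check the MAC before doing anything else
    return decrypt(k_enc, ciphertext)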
Authenticate-then-encrypt
Authenticate-then-encrypt is a poor choice, but it's a subtle poor choice. It's still provably secure [26], but only under certain conditions.

At first sight, this scheme appears to work. Sure, you have to decrypt before you can do anything, but to many cryptographers, including the designers of TLS, this did not appear to pose a problem.

TODO: Explain Vaudenay CBC attack [36]
Authenticate-and-encrypt
Authenticate-and-encrypt has obvious problems. For example, since the plaintext is authenticated, and that MAC is part of the transmitted message, an attacker will be able to recognize that two identical messages sent independently using the same MAC key are in fact the same, essentially leading to a problem similar to what we saw with ECB mode.

TODO: Explain how this works in SSH (see Moxie's Doom article)
10.3 A naive attempt with hash functions
Many ways of constructing MACs involve hash functions. Perhaps one of the simplest ways you could imagine doing that is to just prefix the message with the secret key and hash the whole thing:

t = H(k ‖ m)

This scheme is most commonly called "Prefix-MAC", because it is a MAC algorithm that works by using the secret key as a prefix.

The cryptographically secure hash function H guarantees a few things that are important to us here:

• The tag t will be easy to compute; the hash function H itself is typically very fast. In many cases we can compute the common key part ahead of time, so we only have to hash the message itself.

• Given any number of tags, there is no way for an attacker to "invert" the hash function to recover k, which would allow them to forge arbitrary messages.

• Given any number of tags, there is no way for an attacker to "rewind" the hash function to recover H(k), which may allow them to forge almost arbitrary messages.

One small caveat: we're assuming that the secret key k has enough entropy. Otherwise, we have the same issue that we had for password storage using hash functions: an attacker could just try every single k until one of them matches. Once they've done that, they've almost certainly found the correct k. That's not really a failure of the MAC though: if your secret key contains so little entropy that it's feasible for an attacker to try all of them, you've already lost, no matter which MAC algorithm you pick.
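As a concrete sketch of this (as the next section shows, insecure) construction, using SHA-256; the key and message are made-up placeholders:

import hashlib

def prefix_mac(key, message):
    # t = H(k || m): hash the secret key followed by the message.
    return hashlib.sha256(key + message).digest()

tag = prefix_mac(b"a secret key", b"the message to authenticate")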
Breaking prefx-MAC
Despite being quite common, this MAC is actually completely insecure for most (cryptographically secure!) hash functions H, including SHA-2.

As we saw in the chapter on hash functions, many hash functions, such as MD5, SHA-0, SHA-1 and SHA-2, all pad the message with a predictable padding before producing the output digest. The output digest is the same thing as the internal state of the hash function. That's a problem: the attacker can use those properties to forge messages.

First, they use the digest as the internal state of the hash function. That state matches the state you get when you hash k ‖ m ‖ p, where k is the secret key, m is the message, and p is that predictable padding. Now, the attacker gets the hash function to consume some new bytes: the attacker's chosen message m'. The internal state of the hash function is now what you get when you feed it k ‖ m ‖ p ‖ m'. The attacker then tells the hash function to produce a digest. Again, the hash function appends padding, so the digest is computed over k ‖ m ‖ p ‖ m' plus that final padding. The attacker outputs that digest as the tag. That is exactly the same thing as what happens when you try to compute the tag for the message m ‖ p ‖ m' under the secret key k. So, the attacker has successfully forged a tag for a new message, and, by our definition, the MAC is insecure.

This attack is called a length extension attack, because you are extending a valid message. The padding in the middle p, which started out as the padding for the original message but has become just some data in the middle, is called glue padding, because it glues the original message m and the attacker's message m' together.
This attack might sound a little academic, and far from a practical problem. We may have proven that the MAC is insecure by our definition, but the only signatures the attacker can successfully forge are for modifications of messages that we sent and signed; specifically, the message we sent, followed by some binary junk, followed by something the attacker chooses. It turns out that for many systems, this is plenty to result in real breaks. Consider the following Python code that parses a sequence of key-value pairs that look like k1=v1&k2=v2&...:¹
def parse(s):
    pairs = s.split("&")
    parsed = {}
    for pair in pairs:
        key, value = pair.split("=")
        parsed[key] = value
    return parsed
Since the parsing function only remembers the last value for a given key (previous values in the dictionary are overwritten), an attacker can effectively control the parsed data entirely.

If you're thinking that that code has many issues; sure, it does. For example, it doesn't handle escaping correctly. But even if it did, that wouldn't really fix the length extension attack problem. Most parsing functions will happily live with that binary junk in the middle. Hopefully this convinces you that there is in fact a pretty good chance that an attacker can produce messages with valid tags that say something entirely different from what you intended.
¹ I realize there are briefer ways to write that function. I am trying to make it comprehensible to most programmers, not pleasing to advanced Pythonistas.
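To see why that parser plays into the attacker's hands, here is a small illustration; the query string and the "binary junk" are made up, and a real length extension attack would produce the junk as glue padding rather than choosing it:

original = "user=alice&role=user"
glue = "\x80" + "\x00" * 40      # stand-in for the glue padding
extension = "&role=admin"

forged = original + glue + extension
print(parse(forged))
# The junk just becomes part of the old "role" value, and the appended
# &role=admin overwrites that key: {'user': 'alice', 'role': 'admin'}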
This construction is actually secure with current (SHA-3-era) hash functions, such as Keccak and BLAKE(2). The specifications for these algorithms even recommend it as a secure and fast MAC. They use various techniques to foil length extension attacks: for example, BLAKE keeps track of the number of bits that have been hashed so far, while BLAKE2 has a finalization flag that marks a specific block as the last.
Variants
Issues with prefix-MAC have tempted people to come up with all sorts of clever variations. For example, why not add the key to the end instead of the beginning (t = H(m ‖ k), or "suffix-MAC", if you will)? Or maybe we should append the key to both ends for good measure (t = H(k ‖ m ‖ k), "sandwich-MAC", perhaps)?

For what it's worth, both of these are at least better than prefix-MAC, but both have serious issues. For example, a suffix-MAC system is more vulnerable to weaknesses in the underlying hash function; a successful collision attack breaks the MAC. Sandwich-MAC has other, more complex issues.

Cryptography has produced much stronger MACs, which we'll see in the next few sections. There are no good reasons not to use them.
10.4 HMAC
HMAC is a standard to produce a MAC with a cryptographic hash function as a parameter. It was introduced in 1996 in a paper by Bellare, Canetti and Krawczyk. Many protocols at the time implemented their own attempt at message authentication using hash functions. Most of these attempts failed. The goal of that paper specifically was to produce a provably secure MAC that didn't require anything beyond a secret key and a hash function.

One of the nice features of HMAC is that it has a fairly strong security proof. As long as the underlying hash function is a pseudorandom function, HMAC itself is also a pseudorandom function. The underlying hash function doesn't even have to be collision resistant for HMAC to be a secure MAC. [5] This proof was introduced after HMAC itself, and matched real-world observations: even though MD5 and to a lesser extent SHA-0 had serious collision attacks, HMAC constructions built from those hash functions still appeared to be entirely secure.

The biggest difference between HMAC and prefix-MAC or its variants is that the message passes through a hash function twice, and is combined with the key before each pass. Visually, HMAC looks like this:
The only surprising thing here perhaps are the two constants p_i (the inner padding, one hash function's block length worth of 0x36 bytes) and p_o (the outer padding, one block length worth of 0x5c bytes). These are necessary for the security proof of HMAC to work; their particular values aren't very important, as long as the two constants are different.

The two pads are XORed with the key before use. The result is either prepended to the original message (for the inner padding p_i) or to the intermediate hash output (for the outer padding p_o). Because they're prepended, they can be computed ahead of time, shaving a few cycles off the MAC computation time.
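The construction itself is short enough to sketch directly. This is a simplified version for a hash with a 64-byte block size, such as SHA-256; it omits the step where keys longer than one block are first hashed down:

import hashlib

BLOCK_SIZE = 64  # block size of SHA-256, in bytes

def hmac_sha256(key, message):
    key = key.ljust(BLOCK_SIZE, b"\x00")      # pad the key out to one block
    inner = bytes(b ^ 0x36 for b in key)      # key XOR inner padding p_i
    outer = bytes(b ^ 0x5c for b in key)      # key XOR outer padding p_o
    inner_hash = hashlib.sha256(inner + message).digest()
    return hashlib.sha256(outer + inner_hash).digest()

For keys no longer than one block, this produces the same output as Python's built-in hmac module used with SHA-256.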
10.5 One-time MACs
So far, we've always assumed that MAC functions can be used with a single key to produce secure MACs for a very large number of messages. By contrast, one-time MACs are MAC functions that can only securely be used once with a single key. That might sound like a silly idea, since we've already talked about regular secure MACs. An algorithm that only works once just seems objectively worse. However, they have several big advantages:

• They can be incredibly fast to evaluate, even for very large messages.

• They have a compelling security proof based on the information content of the tag.

• A construction exists to turn a one-time MAC into a secure multiple-use MAC, removing the principal problem.
A typical simple example of such a one-time MAC consists of a simple multiplication and addition modulo some large prime p. In this case, the secret key consists of two truly random numbers a and b, both between 1 and p.

t = m · a + b (mod p)

This simple example only works for one-block messages m, and some prime p slightly bigger than the biggest m. It can be extended to support bigger messages M consisting of blocks m_i by using a message-specific polynomial P:

t = P(M, a) + b (mod p), where P(M, a) = m_i · a^i + ... + m_1 · a
In many ways, a one-time MAC is to authentication what a one-time pad is to encryption. The security argument is similar: the attacker learns no information about the key or the message, because they are mixed irreversibly. This demonstrates that the MAC is secure against attackers trying to produce existential forgeries, even when that attacker has infinite computational power.

Also like a one-time pad, the security argument relies on two very important properties of the keys a, b:

• They have to be truly random.

• They have to be used at most once.
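Here is a tiny numeric sketch of the one-block scheme; the prime and the key values are arbitrarily chosen toy parameters:

p = 2**61 - 1            # a large prime, slightly bigger than any one-block message
a, b = 1234567890123, 9876543210987   # the secret key: truly random numbers in practice

def one_time_mac(m):
    # t = m * a + b (mod p); only safe for a single message per (a, b) pair
    return (m * a + b) % p

tag = one_time_mac(42424242)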
Re-using a and b
We'll illustrate that our example MAC is insecure if used to authenticate two messages m_1, m_2 with the same key (a, b):

t_1 = m_1 · a + b (mod p)
t_2 = m_2 · a + b (mod p)

An attacker can reconstruct a, b with some simple arithmetic, starting by subtracting the two equations:

t_1 - t_2 = (m_1 · a + b) - (m_2 · a + b) (mod p)

(remove parentheses)

t_1 - t_2 = m_1 · a + b - m_2 · a - b (mod p)

(b and -b cancel out)

t_1 - t_2 = m_1 · a - m_2 · a (mod p)

(factor out a)

t_1 - t_2 = a · (m_1 - m_2) (mod p)

(flip sides, multiply by the inverse of (m_1 - m_2))

a = (t_1 - t_2) · (m_1 - m_2)^(-1) (mod p)

The attacker now has a direct way of computing a, allowing them to plug it into either the equation for t_1 or t_2 to get b:
t_1 = m_1 · a + b (mod p)

(reorder terms)

b = t_1 - m_1 · a (mod p)
As you can see, as with one-time pads, re-using the key even once leads to a complete failure of the cryptosystem to preserve privacy or integrity, as the case may be. As a result, one-time MACs are a bit dangerous to use directly; but we'll see that with the appropriate construction, that issue can be remedied.
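The recovery is only a couple of lines of Python. This continues the toy parameters from the earlier sketch and uses two tags produced with the same (a, b); the modular inverse is computed with Fermat's little theorem since p is prime:

m1, m2 = 42424242, 31337
t1, t2 = one_time_mac(m1), one_time_mac(m2)

# a = (t1 - t2) * (m1 - m2)^-1  (mod p)
recovered_a = (t1 - t2) * pow((m1 - m2) % p, p - 2, p) % p
# b = t1 - m1 * a  (mod p)
recovered_b = (t1 - m1 * recovered_a) % p

assert (recovered_a, recovered_b) == (a, b)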
10.6 Carter-Wegman MAC
As we've already stated, the obvious problem with one-time MACs is their limited practicality. Fortunately, it turns out that there is a construction, called a Carter-Wegman MAC, that turns any secure one-time MAC into a secure many-time MAC while preserving most of the performance benefit.

The idea behind a Carter-Wegman MAC is that you can use a one-time MAC O to produce a tag for the bulk of the data, and then encrypt a nonce n with a pseudorandom function F, such as a block cipher, to protect that one-time tag:

CW((k_1, k_2), n, M) = F(k_1, n) ⊕ O(k_2, M)

Keep in mind that while Carter-Wegman MACs take two distinct keys k_1 and k_2, and Carter-Wegman MACs are related to one-time MACs, which also take two distinct keys, these keys are unrelated.
The Carter-Wegman MAC's k_2 is the only key passed to the fast one-time MAC O. If that fast one-time MAC is our earlier example, which takes two keys, that k_2 would have to be split up into those two keys.

You can tell how a Carter-Wegman MAC exploits the benefits of both kinds of MACs by considering the two terms of the equation separately. In F(k_1, n), F is just a regular pseudorandom function, such as a block cipher. It is quite slow by comparison to the one-time MAC. However, its input, the nonce, is very small. The unpredictable output of the block cipher masks the output of the one-time MAC. In the second term, O(k_2, M), the large input message M is only handled by the very fast one-time MAC O.
These constructions, in particular Poly1305-AES, currently represent some of the state of the art in MAC functions. The paper [11] and RFC [10] for an older, related MAC function called UMAC may also be good sources of extra background information, since they go into extensive detail on the hows and whys of a practical Carter-Wegman MAC.
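A schematic sketch of the construction in Python; the pseudorandom function F is faked here with a keyed hash purely for illustration (a real design would use a block cipher such as AES), and the one-time MAC is the multiply-and-add example from the previous section:

import hashlib

p = 2**61 - 1   # prime modulus for the toy one-time MAC

def F(k1, nonce):
    # Stand-in for a PRF such as a block cipher; not a real design choice.
    digest = hashlib.sha256(k1 + nonce).digest()
    return int.from_bytes(digest[:8], "big")

def carter_wegman(k1, k2, nonce, message_block):
    a, b = k2                                    # k2 splits into the one-time MAC's two keys
    one_time_tag = (message_block * a + b) % p   # O(k2, M)
    return F(k1, nonce) ^ one_time_tag           # F(k1, n) XOR O(k2, M)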
10.7 Authenticated encryption modes
So far, we've always clearly distinguished encryption from authentication, and explained the need for both. The majority of secure connections that are set up every day have that distinction as well: they treat encryption and authentication as fundamentally different steps.

Alternatively, we could make authentication a fundamental part of the mode of operation. After all, we've already seen that unauthenticated encryption is virtually never what you want; it is, at best, something you occasionally have to live with. It makes sense to use constructions that not only guarantee the privacy of an arbitrary stream, but also its integrity.

As we've already seen, many of the methods of composing authentication and encryption are inherently insecure. By doing that in a fixed, secure way such as a properly designed authenticated encryption mode, an application developer no longer has to make that choice, which means they also can't inadvertently make the wrong choice.
Authenticated Encryption with Associated Data (AEAD)

AEAD is a feature of certain modes of authenticated encryption. Such modes of operation are called AEAD modes. It starts with the premise that many messages actually consist of two parts:

• The actual content itself

• Metadata: data about the content

In many cases the metadata should be plaintext, but the content itself should be encrypted. The entire message should be authenticated: it should not be possible for an attacker to mess with the metadata and have the resulting message still be considered valid.

Consider an e-mail alternative as an example cryptosystem. The metadata about the content might contain the intended recipient. We definitely want to encrypt and authenticate the content itself, so that only the recipient can read it. The metadata, however, has to be in plaintext: the e-mail servers performing the message delivery have to know which recipient to send the message to.

Many systems would leave this metadata unauthenticated, allowing attackers to modify it. In our case, that looks like it may just lead to messages being delivered to the wrong inbox. That also means that an attacker can force e-mail to be delivered to the wrong person, or not delivered at all.

AEAD modes address this issue by providing a specified way to add metadata to encrypted content, so that the whole of the encrypted content and the metadata is authenticated, and not the two pieces separately:
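As an illustration of what using an AEAD mode looks like in practice, here is a sketch using AES-GCM (covered below) as exposed by the third-party cryptography package; the exact API shown is an assumption about that library, and the messages are made up:

import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=128)
aesgcm = AESGCM(key)
nonce = os.urandom(12)             # never reuse a nonce with the same key

ciphertext = aesgcm.encrypt(nonce, b"the secret content", b"recipient=bob")
# Decryption checks the tag over both the ciphertext and the associated data;
# tampering with either makes it raise an exception.
plaintext = aesgcm.decrypt(nonce, ciphertext, b"recipient=bob")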
10.8 OCB mode
OCB mode is an AEAD mode of operation. It is one of the earliest developed AEAD modes.

As you can see, most of this scheme looks quite similar to ECB mode. The name "offset codebook" (OCB) is quite similar to "electronic codebook", as well. OCB does not share the security issues ECB mode has, of course; there are several important differences, such as the offsets Δ_i introduced in each individual block encryption. Additionally, there is an authentication tag t, built from x, a simple checksum over the plaintext, as well as t_a, which authenticates the AEAD associated data. That associated data tag t_a is computed as follows:
This design has a number of interesting properties. For example, it is very fast: it only requires roughly one block cipher operation per encrypted or associated data block, as well as one additional block cipher operation for the final tag. The offsets (Δ_i) are also extremely easy to compute. The checksum x is just all of the plaintext blocks p_i XORed together. Finally, OCB mode is easy to compute in parallel; only the final authentication tag is dependent on all the preceding information.

OCB mode also comes with a built-in padding scheme: it behaves slightly differently when the plaintexts or authentication text is not exactly a multiple of the block size. This means that, unlike with PKCS#5/PKCS#7 padding, there isn't an entire block of wasted padding if the plaintext happens to be a multiple of the block size.

Despite having several interesting properties going for it, OCB mode has not received as much attention as some of the alternatives; one of the main reasons is that it is patent encumbered. Even though a number of patent licenses are available [33], including a free-of-charge one for open source software, this does not appear to have significantly impacted how much OCB mode is used in the field.
10.9 GCM mode
GCM mode is an AEAD mode with an unfortunate case of RAS syndrome (redundant acronym syndrome): GCM itself stands for "Galois Counter Mode". It is formalized in a NIST Special Publication [2] and roughly boils down to a combination of classical CTR mode with a Carter-Wegman MAC. That MAC can be used by itself as well, which is called GMAC.
Authentication
GCM mode (and by extension GMAC)
11
Signature algorithms
11.1 Description
A signature algorithm is the public-key equivalent of a message authentication code. It consists of three parts:

1. a key generation algorithm, which can be shared with other public-key algorithms

2. a signature generation algorithm

3. a signature verification algorithm

Signature algorithms can be built using encryption algorithms. Using the private key, we produce a value based on the message, usually using a cryptographic hash function. Anyone can use the public key to retrieve that value, compute what the value should be from the message, and compare the two to verify. The obvious difference between this and public-key encryption is that the private key is used to produce the signature and the public one to retrieve the original, which is the opposite of how it usually works.

The above explanation glosses over many important details. We'll discuss real schemes in more detail below.
11.2 RSA-based signatures
PKCS#1 v1.5
TODO (see #48)
PSS
TODO (see #49)
11.3 DSA
TODO: intro (see #50)
Parameter generation
TODO: explain parameter generation (see #51)
Signing a message
In order to sign a message, the signer picks a random k between 0 and q. Picking that k turns out to be a fairly sensitive and involved process, but we'll go into more detail on that later. With k chosen, they then compute the two parts of the signature r, s of the message m:

r = g^k (mod q)
s = k^(-1) · (H(m) + x · r) (mod q)

If either of these happen to be 0 (a rare event, with 1 in q odds, and q being a pretty large number), pick a different k.

TODO: Talk about k^(-1), the modular inverse (see #52)
Verifying a signature
Verifying the signature is a lot more complex. Given the message m and signature (r, s):

w = s^(-1) (mod q)
u_1 = w · H(m) (mod q)
u_2 = w · r (mod q)
v = (g^(u_1) · y^(u_2) (mod p)) (mod q)

If the signature is valid, that final result v will be equal to r, the first part of the signature.
The trouble with k
While there is nothing wrong with DSA done right, it's very easy to get it wrong. Furthermore, DSA is quite sensitive: even a small implementation mistake results in a broken scheme.

In particular, the choice of the signature parameter k is critical. The requirements for this number are among the strictest of all random numbers in cryptographic algorithms. For example, many algorithms require a nonce. A nonce just has to be unique: you can use it once, and then you can never use it again. It doesn't have to be secret. It doesn't even have to be unpredictable. A nonce can be implemented by a simple counter, or a monotonic clock. Many other algorithms, such as CBC mode, use an initialization vector. It doesn't have to be unique: it only has to be unpredictable. It also doesn't have to be secret: initialization vectors are typically tacked on to the ciphertext. DSA's requirements for the k value are a combination of all of these:

• It has to be unique.

• It has to be unpredictable.

• It has to be secret.

Muddle with any of these properties, and an attacker can probably retrieve your secret key, even with a modest amount of signatures. For example, an attacker can recover the secret key knowing only a few bits of k, plus a large amount of valid signatures. [32]

It turns out that many implementations of DSA don't even get the uniqueness part right, happily reusing k values. That allows a direct recovery of the secret key using basic arithmetic. Since this attack is much simpler to understand, very commonly applicable, and equally devastating, we'll discuss it in detail.
Suppose that an attacker sees multiple signatures (r_i, s_i), for different messages m_i, all with the same k. The attacker picks any two signatures (r_1, s_1) and (r_2, s_2) of messages m_1 and m_2 respectively. Writing down the equations for s_1 and s_2:

s_1 = k^(-1) · (H(m_1) + x · r_1) (mod q)
s_2 = k^(-1) · (H(m_2) + x · r_2) (mod q)
The attacker can simplify this further: r_1 and r_2 are equal. Following the definition:

r_i = g^k (mod q)

Since the signer is reusing k, and the value of r only depends on k, all r_i will be equal. Since the signer is using the same key, x is equal in the two equations as well.

Subtract the two s_i equations from each other, followed by some other arithmetic manipulations:
s_1 - s_2 = k^(-1) · (H(m_1) + x · r) - k^(-1) · (H(m_2) + x · r) (mod q)
          = k^(-1) · ((H(m_1) + x · r) - (H(m_2) + x · r)) (mod q)
          = k^(-1) · (H(m_1) + x · r - H(m_2) - x · r) (mod q)
          = k^(-1) · (H(m_1) - H(m_2)) (mod q)

Giving us the simple, direct solution for k:

k = (H(m_1) - H(m_2)) / (s_1 - s_2) (mod q)
The hash values H(m_1) and H(m_2) are easy to compute. They're not secret: the messages being signed are public. The two values s_1 and s_2 are part of the signatures the attacker saw. So, the attacker can compute k. That doesn't give them the private key x yet, though, or the ability to forge signatures.
Let's write the equation for s down again, but this time thinking of k as something we know, and x as the variable we're trying to solve for:

s = k^(-1) · (H(m) + x · r) (mod q)

All (r, s) that are valid signatures satisfy this equation, so we can just take any signature we saw. Solve for x with some algebra:

s · k = H(m) + x · r (mod q)
s · k - H(m) = x · r (mod q)
r^(-1) · (s · k - H(m)) = x (mod q)
Again, H(m) is public, plus the attacker needed it to compute k anyway. They've already computed k, and s is plucked straight from the signature. That just leaves us with r^(-1) (mod q) (read as: the modular inverse of r, modulo q), but that can be computed efficiently as well. (For more information, see the appendix on modular arithmetic; keep in mind that q is prime, so the modular inverse can be computed directly.) That means that the attacker, once they've discovered the k of any signature, can recover the private key directly.

So far, we've assumed that the broken signer would always use the same k. To make matters worse, a signer only has to re-use k once in any two signatures that the attacker can see for the attack to work. As we've seen, if k is repeated, the r_i values repeat as well. Since r_i is a part of the signature, it's very easy to see when the signer has made this mistake. So, even if reusing k is something the signer only does rarely (because their random number generator is broken, for example), doing it once is enough for the attacker to break the DSA scheme.

In short, reusing the k parameter of a DSA signing operation means an attacker recovers the private key.
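The arithmetic above translates almost directly into code. This sketch assumes we already have the public parameter q, the two message hashes, and two signatures that share an r (hypothetical inputs; no real DSA parameters are generated here):

def modinv(a, q):
    # q is prime, so Fermat's little theorem gives the modular inverse.
    return pow(a % q, q - 2, q)

def recover_key(q, h1, s1, h2, s2, r):
    """Recover k and the private key x from two DSA signatures that reused k."""
    k = (h1 - h2) * modinv(s1 - s2, q) % q   # k = (H(m1) - H(m2)) / (s1 - s2) mod q
    x = modinv(r, q) * (s1 * k - h1) % q     # x = r^-1 * (s*k - H(m)) mod q
    return k, x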
TODO: Debian http://rdist.root.org/2009/05/17/the-debian-pgp-disaster-that-almost-was/
11.4 ECDSA
TODO: explain (see #53)

As with regular DSA, the choice of k is extremely critical. There are attacks that manage to recover the signing key using a few thousand signatures when only a few bits of the nonce leak. [31]
11.5 Repudiable authenticators
Signatures like the ones we described above provide a property called non-repudiation. In short, it means that you can't later deny being the sender of the signed message. Anyone can verify that the signature was made using your private key, something only you could do.

That may not always be a useful feature; it may be more prudent to have a scheme where only the intended recipient can verify the signature. An obvious way to design such a scheme would be to make sure that the recipient (or, in fact, anyone else) could have computed an identical value.

Such messages can be repudiated; such a scheme is often called "deniable authentication". While it authenticates the sender to the intended recipient, the sender can later deny (to third parties) having sent the message. Equivalently, the recipient can't convince anyone else that the sender sent that particular message.
12
Key derivation functions
12.1 Description
A key derivation function is a function that derives one or more secret
values (the keys) from one secret value.
Many key derivation functions can also take a (usually optional) salt parameter. This parameter causes the key derivation function to not always return the same output keys for the same input secret. As with other cryptosystems, salts are fundamentally different from the secret input: salts generally do not have to be secret, and can be re-used.

Key derivation functions can be useful, for example, when a cryptographic protocol starts with a single secret value, such as a shared password or a secret derived using Diffie-Hellman key exchange, but requires multiple secret values to operate, such as encryption and MAC keys. Another use case of key derivation functions is in cryptographically secure random number generators, which we'll see in more detail in a following chapter, where they are used to extract randomness with high entropy density from many sources that each have low entropy density.
There are two main categories of key derivation functions, depending on the entropy content of the secret value, which determines how many different possible values the secret value can take.

If the secret value is a user-supplied password, for example, it typically contains very little entropy. There are very few values the password will take. As we've already established in a previous section on password storage, that means it is necessary that the key derivation function is hard to compute. That means it requires a non-trivial amount of computing resources, such as CPU cycles or memory. If the key derivation function were easy to compute, an attacker could simply enumerate all possible values of the shared secret, since there are few possibilities, and then compute the key derivation function for all of them. As we've seen in that previous section on password storage, this is how most modern attacks on password stores work. Using an appropriate key derivation function, such as scrypt, would prevent these attacks. In this chapter, we'll see scrypt, as well as other key derivation functions in this category.

On the other hand, the secret value could also have a high entropy content. For example, it could be a shared secret derived from a Diffie-Hellman key agreement protocol, or an API key consisting of cryptographically random bytes (we'll discuss cryptographically secure random number generation in the next chapter). In that case, it isn't necessary to have a key derivation function that's hard to compute: even if the key derivation function is trivial to compute, there are too many possible values the secret can take, so an attacker would not be able to enumerate them all. We'll see the best-of-breed of this kind of key derivation function, HKDF, in this chapter.
12.2 Password strength
TODO: NIST Special Publication 800-63
12.3 PBKDF2
12.4 bcrypt
12.5 scrypt
12.6 HKDF
The HMAC-based (Extract-and-Expand) Key Derivation Function (HKDF), defined in RFC 5869 [25] and explained in detail in a related paper [27], is a key derivation function designed for high-entropy inputs, such as shared secrets from a Diffie-Hellman key exchange. It is specifically not designed to be secure for low-entropy inputs such as passwords.

HKDF exists to give people an appropriate, off-the-shelf key derivation function. Previously, key derivation was often something that was done ad hoc for a particular standard. Usually these ad hoc solutions did not have the extra provisions HKDF does, such as salts or the optional info parameter (which we'll discuss later in this section); and that's only in the best case scenario where the KDF wasn't fundamentally broken to begin with.

HKDF is based on HMAC. Like HMAC, it is a generic construction that uses hash functions, and can be built using any cryptographically secure hash function you want.

HKDF consists of two phases. In the first phase, called the extraction phase, a fixed-length key is extracted from the input entropy. In the second phase, called the expansion phase, that key is used to produce a number of pseudorandom keys.
The extraction phase
The extraction phase is responsible for extracting a small amount of data with a high entropy content from a potentially large amount of data with a smaller entropy density.

The extraction phase just uses HMAC with a salt:

def extract(salt, data):
    return hmac(salt, data)

The salt value is optional. If the salt is not specified, a string of zeroes equal to the length of the hash function's output is used. While the salt is technically optional, the designers stress its importance, because it makes the independent uses of the KDF (for example, in different applications, or with different users) produce independent results. Even a fairly low-entropy salt can already contribute significantly to the security of the KDF. [25] [27]
The extraction phase explains why HKDF is not suitable for deriving keys from passwords. While the extraction phase is very good at concentrating entropy, it is not capable of amplifying entropy. It is designed for compacting a small amount of entropy spread out over a large amount of data into the same amount of entropy in a small amount of data, but is not designed for creating a set of keys that are difficult to compute in the face of a small amount of available entropy. There are also no provisions for making this phase computationally intensive. [25]

In some cases, it is possible to skip the extraction phase, if the shared secret already has all the right properties, for example, if it is a pseudorandom string of sufficient length and with sufficient entropy. However, sometimes this should not be done at all, for example when dealing with a Diffie-Hellman shared secret. The RFC goes into slightly more detail on the topic of whether or not to skip this step; but it is generally inadvisable. [25]
The expansion phase
In the expansion phase, the random data extracted from the inputs in the extraction phase is expanded into as much data as is required.

The expansion step is also quite simple: chunks of data are produced using HMAC, this time with the extracted secret, not with the public salt, until enough bytes are produced. The data being HMACed is the previous output (starting with an empty string), an info parameter (by default also the empty string), and a counter byte that counts which block is currently being produced.
def expand(key, info=""):
    """Expands the key, with optional info."""
    output = ""
    for byte in map(chr, range(256)):
        output = hmac(key, output + info + byte)
        yield output

def get_output(desired_length, key, info=""):
    """Collects output from the expansion step until enough
    has been collected; then returns the collected output.
    """
    outputs, current_length = [], 0
    for output in expand(key, info):
        outputs.append(output)
        current_length += len(output)
        if current_length >= desired_length:
            break
    else:
        raise RuntimeError("Desired length too long")
    return "".join(outputs)[:desired_length]
Like the salt in the extraction phase, the info parameter is entirely optional, but can actually greatly increase the security of the application. The info parameter is intended to contain some application-specific context in which the key derivation function is being used. Like the salt, it will cause the key derivation function to produce different values in different contexts, further increasing its security. For example, the info parameter may contain information about the user being dealt with, or the part of the protocol the KDF is being executed for. [25]
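Putting the two phases together, a complete HKDF call looks roughly like this; this is a sketch built on the extract and get_output helpers above, and the salt, shared secret and info string are placeholders:

def hkdf(salt, input_secret, info, desired_length):
    pseudorandom_key = extract(salt, input_secret)              # extraction phase
    return get_output(desired_length, pseudorandom_key, info)   # expansion phase

# e.g. derive independent encryption and MAC keys from one shared secret:
key_material = hkdf(salt, shared_secret, "my-protocol v1", 64)
encryption_key, mac_key = key_material[:32], key_material[32:]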
13
Random number generators
The generation of random numbers is too important to be left to chance.
Robert R. Coveyou
13.1 Introduction
Many cryptographic systems require random numbers. So far, we've just assumed that they're available and waved our hands vigorously around those parts. In this chapter, we'll go more in depth about the importance and mechanics of random numbers in cryptographic systems.

Producing random numbers is a fairly intricate process. What's worse is that, like so many cryptographic systems gone wrong, to the untrained eye getting it wrong looks exactly like getting it right.

There are three categories of random number generation that we'll consider separately:

• True random number generators

• Cryptographically secure pseudorandom number generators

• Pseudorandom number generators
13.2 True random number generators
Any one who considers arithmetical methods of producing random digits is, of course, in a state of sin.
John von Neumann

John von Neumann, father of the modern model of computing, made an obvious point. We can't expect to produce random numbers using predictable, deterministic arithmetic. We need a source of randomness that isn't a consequence of deterministic rules.

True random number generators get their randomness from physical processes. Historically, many systems have been used for producing such numbers. Systems like dice are still in common use today. However, for the amount of randomness we need for practical cryptographic algorithms, these are typically far too slow, and often quite unreliable.

We've since come up with more speedy and reliable sources of randomness. There are several categories of physical processes that are used for hardware random number generation:

• Quantum processes

• Thermal processes

• Oscillator drift

• Timing events

Keep in mind that not all of these options necessarily generate high-quality, truly random numbers. We'll elaborate further on how they can be applied successfully anyway.
Radioactive decay
One example of a quantum physical process used to produce random numbers is radioactive decay. We know that radioactive substances will slowly decay over time. It's impossible to know when the next atom will decay; that process is entirely random. Detecting when such a decay has occurred, however, is fairly easy. By measuring the time between individual decays, we can produce random numbers.
Shot noise
Shot noise is another quantum physical process used to produce ran-
dom numbers. Shot noise is based on the fact that light and electricity
are caused by the movement of indivisible little packets: photons in
the case of light, and electrons in the case of electricity.
Nyquist noise
An example of a thermal process used to produce random numbers is Nyquist noise. Nyquist noise is the noise that occurs from charge carriers (typically electrons) traveling through a medium with a certain resistance. That causes a tiny current to flow through the resistor (or, alternatively put, causes a tiny voltage difference across the resistor).

i = sqrt(4 · k_B · T · Δf / R)

v = sqrt(4 · k_B · T · Δf · R)

These formulas may seem a little scary to those who haven't seen the physics behind them before, but don't worry too much: understanding them isn't really necessary to go along with the reasoning. These formulas are for the root mean square. If you've never heard that term before, you can roughly pretend that means "average". Δf is the bandwidth, T is the temperature of the system in Kelvin, and k_B is Boltzmann's constant.
As you can see from the formula, Nyquist noise is thermal, or temperature-dependent. Fortunately, an attacker generally can't use that property to break the generator: the temperature at which it would become ineffective is so low that the system using it has probably already failed at that point.

By evaluating the formula, we can see that Nyquist noise is quite small. At room temperature with reasonable assumptions (10 kHz bandwidth and a 1 kΩ resistor), the Nyquist voltage is on the order of several hundred nanovolts. Even if you round up liberally to a microvolt (a thousand nanovolts), that's still a thousandth of a thousandth of a volt, and even a tiny AA battery produces 1.5 V.

While the formulas describe the root mean square, the value you can measure will be randomly distributed. By repeatedly measuring it, we can produce high-quality random numbers. For most practical applications, thermal noise numbers are quite high quality and relatively unbiased.
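Plugging in the numbers confirms the claim; a quick back-of-the-envelope computation in Python, using room temperature and the same assumed bandwidth and resistance:

from math import sqrt

k_B = 1.380649e-23   # Boltzmann's constant, J/K
T = 300.0            # room temperature, K
delta_f = 10e3       # 10 kHz bandwidth
R = 1e3              # 1 kilo-ohm resistor

v = sqrt(4 * k_B * T * delta_f * R)
print(v)             # roughly 4e-7 V, i.e. a few hundred nanovolts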
TODO: we've never actually explained the word entropy! The resistance an attacker perceives is necessary in a good definition!
TODO: explain synchronous stream ciphers as CSPRNGs
13.3 Yarrow
The Yarrow algorithm is a cryptographically secure pseudorandom number generator.

TODO: actually explain Yarrow

This algorithm is used as the CSPRNG for FreeBSD, and was inherited by Mac OS X. On both of these operating systems, it's used to implement /dev/random. Unlike on Linux, /dev/urandom is just an alias for /dev/random.
13.4 Blum Blum Shub
TODO: explain this, and why it's good (provable), but why we don't use it (slow)
13.5 Dual_EC_DRBG
Dual_EC_DRBG is a NIST standard for a cryptographically secure pseudorandom bit generator. It sparked a large amount of controversy: despite being put forth as an official, federal cryptographic standard, it quickly became evident that it wasn't very good.

Cryptanalysis eventually demonstrated that the standard could contain a back door hidden in the constants specified by the standard, potentially allowing an unspecified attacker to completely break the random number generator.

Several years afterwards, leaked documents suggested a backdoor in an unnamed NIST standard released in the same year as Dual_EC_DRBG, fueling the suspicions further. This led to an official recommendation from the standards body to stop using the standard, which was previously unheard of under such circumstances.
Background
For a long time, the official standards produced by NIST lacked good, modern cryptographically secure pseudorandom number generators. It had a meager choice, and the ones that had been standardized had several serious flaws.

NIST hoped to address this issue with a new publication called SP 800-90, that contained several new cryptographically secure pseudorandom number generators. This document specified a number of algorithms, based on different cryptographic primitives:

1. Cryptographic hash functions

2. HMAC

3. Block ciphers

4. Elliptic curves

Right off the bat, that last one jumps out. Using elliptic curves for random number generation was unusual. Standards like these are expected to be state-of-the-art, while still staying conservative. Elliptic curves had been considered before in an academic context, but that was a far cry from being suggested as a standard for common use.

There is a second reason elliptic curves seem strange. HMAC and block ciphers are obviously symmetric algorithms. Hash functions have their applications in asymmetric algorithms such as digital signatures, but aren't themselves asymmetric. Elliptic curves, on the other hand, are exclusively used for asymmetric algorithms: signatures, key exchange, encryption.
That said, the choice didn't come entirely out of the blue. A choice for a cryptographically secure pseudorandom number generator with a strong number-theoretical basis isn't unheard of: Blum Blum Shub (section 13.4) is a perfect example. Those generators are typically much slower than the alternatives. Dual_EC_DRBG, for example, is three orders of magnitude slower than its peers presented in the same standard. The idea is that the extra confidence inspired by the stronger mathematical guarantees is worth the performance penalty. For example, we're fairly confident that factoring numbers is hard, but we're a lot less sure about our hash functions and ciphers. RSA came out in 1977 and has stood the test of time quite well since then. DES came out two years later, and is now considered completely broken. MD4 and MD5 came out over a decade later, and are completely broken as well.

The problem is, though, that the standard didn't actually provide the security proof. The standard specifies the generator but then merely suggests that it would be at least as hard as solving the elliptic curve discrete log problem. Blum Blum Shub, by contrast, has a proof that shows that breaking it is at least as hard as solving the quadratic residuosity problem. The best algorithm we have for that is factoring numbers, which we're fairly sure is pretty hard.

The omission of the proof is a bit silly, because there's no reason you'd use a pseudorandom number generator as slow as Dual_EC_DRBG unless you had proof that you were getting something in return for the performance hit.

Cryptographers then later did the homework that NIST should have provided in the specification [34][14]. Those analyses quickly highlighted a few issues.
A quick overview of the algorithm
The algorithm consists of two parts:

1. Generating pseudorandom points on the elliptic curve, which are turned into the internal state of the generator;

2. Turning those points into pseudorandom bits.

We'll illustrate this graphically, with an illustration based on the work by Shumow and Ferguson, two cryptographers who highlighted some of the major issues with this algorithm:

Throughout the algorithm, φ is a function that takes a curve point and turns it into an integer. The algorithm needs two given points on the curve: P and Q. These are fixed, and defined in the specification. The algorithm has an internal state s. When producing a new block of bits, the algorithm turns s into a different value r using the function φ and elliptic curve scalar multiplication with P:

r = φ(sP)

That value, r, is used both for producing the output bits and updating the internal state of the generator. In order to produce the output bits, a different elliptic curve point, Q, is used. The output bits are produced by multiplying r with Q, and running the result through a transformation θ:

o = θ(φ(rQ))

In order to perform the state update, r is multiplied with P again, and the result is converted to an integer. That integer is used as the new state s:

s = φ(rP)
Issues and question marks
First of all, φ is extremely simple: it just takes the x-coordinate of the curve point, and discards the y-coordinate. That means it's quite easy for an attacker who sees the output value of φ to find points that could have produced that value. In itself, that's not necessarily a big deal; but, as we'll see, it's one factor that contributes to the possibility of a backdoor.

Another flaw was found in the way points are turned into pseudorandom bits. The function θ simply discards the 16 most significant bits. Previous designs discarded significantly more: for 256-bit curves such as these, they discarded somewhere in the range of 120 to 175 bits.

Failing to discard sufficient bits gave the generator a small bias. The next-bit property was violated, giving attackers a better than 50% chance of guessing the next bit correctly. Granted, that chance was only about one in a thousand better than 50%; but that's still unacceptable for what's supposed to be the state-of-the-art in cryptographically secure pseudorandom number generators.
Discarding only those 16 bits has another consequence. Because only 16 bits were discarded, we only have to guess 2^16 possibilities to find possible values of φ(rQ) that produced the output. That is a very small number: we can simply enumerate all of them. Those values are the outputs of φ, which as we saw just returns the x-coordinate of a point. Since we know it came from a point on the curve, we just have to check if our guess is a solution for the curve equation:

y² ≡ x³ + ax + b (mod p)

The constants a, b, p are specified by the curve. We've just guessed a value for x, leaving only one unknown, y. We can solve that quite efficiently. We compute the right-hand side and see if it's a perfect square: y² = q ≡ x³ + ax + b (mod p). If it is, A = (x, sqrt(q)) = (x, y) is a point on the curve. This gives us a number of possible points A, one of which is the rQ used to produce the output.
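That guess-and-check step is easy to express in code. This sketch uses hypothetical helper names and assumes the curve parameters a, b, p are available, with p ≡ 3 (mod 4) so the square root has a simple closed form (an assumption made for illustration; the real P-256 prime does satisfy it):

def candidate_points(x, a, b, p):
    """Return the curve points (if any) whose x-coordinate is x."""
    q = (x * x * x + a * x + b) % p
    # q is a perfect square mod p iff it is a quadratic residue (Euler's criterion).
    if pow(q, (p - 1) // 2, p) != 1:
        return []
    y = pow(q, (p + 1) // 4, p)   # square root, valid because p % 4 == 3
    return [(x, y), (x, p - y)]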
This isn't a big deal at face value. To find the state of the algorithm, an attacker needs to find r, so they can compute s. They still need to solve the elliptic curve discrete log problem to find r from rQ, given Q. We're assuming that that's hard.

Keep in mind that elliptic curves are primitives used for asymmetric encryption. That problem is expected to be hard to solve in general, but what if we have some extra information? What if there's a secret value e so that eQ = P?

Let's put ourselves in the shoes of an attacker knowing e. We repeat our math from earlier. One of those points A we just found is the rQ we're looking for. We can compute:

φ(eA) = φ(erQ) = φ(rP) (mod p)

That last step is a consequence of the special relationship between e, P and Q. That's pretty interesting, because φ(rP) is exactly the computation the algorithm does to compute s, the new state of the algorithm! That means that an attacker that knows e can, quite efficiently, compute the new state s from any output o, allowing them to predict all future values of the generator!

This assumes that the attacker knows which A is the right A. Because only 16 bits were discarded, there are only 16 bits left for us to guess. That gives us 2^16 candidate x-coordinates. Experimentally, we find that roughly half of the possible x-coordinates correspond to points on the curve, leaving us with 2^15 possible curve points A, one of which is rQ. That's a pretty small number for a bit of computer-aided arithmetic: plenty small for us to try all options. We can therefore say that an attacker that does know the secret value e most definitely can break the generator.
So, we've now shown that if there is a magical e for which eQ = P, and you can pick P and Q (and you don't have to explain where you got them from), then you can break the generator. How do you pick such values?

To demonstrate just how possible it is, the researchers started from the NIST curve's P and p values, but came up with their own Q'. They did this by starting with P, picking a random d (keeping it secret), and setting Q' = dP. The trick is that there's an efficient algorithm for computing e in eQ' = P if you know the d in Q' = dP. This is the e we need for our earlier attack. When they tried this out, they discovered that in all cases (that is, for many random d), seeing 32 bytes of output was enough to determine the state s.

All of this, of course, only demonstrates that it is possible for the specified values of P and Q to be special values with a secret back door. It doesn't provide any evidence that the actual values have a backdoor in them. However, given that the standard never actually explains how they got the magical value for Q, it doesn't really inspire a lot of confidence. Typically, cryptographic standards use "nothing-up-my-sleeve" numbers, such as the value of some constant such as π or the natural logarithm base, e.
If someone does know the backdoor, the consequences are obviously devastating. We've already argued for the necessity of cryptographically secure pseudorandom number generators: having a broken one essentially means that all cryptosystems that use this generator are completely and utterly defeated.

There are two suggested ways of fixing this particular algorithm:

• Make the function θ more complex to invert, rather than just discarding 16 bits. This makes it harder to find candidate points, and hence, harder to perform the attack. One obvious way would be to discard more bits. Another option would be to use a cryptographically secure hash, or a combination of both.

• Generate a random Q every time you start the algorithm, possibly by picking a random d and setting Q = dP. Of course, d has to be sufficiently large and truly random: if θ is unchanged, and there are only few values d can have, the attacker can just perform the above attack for all values of d.
Aftermath
TODO: Talk about RSA guys' comments + Snowden leaks
13.6 Mersenne Twister
Mersenne Twister is a very common pseudorandom number generator. It has many nice properties, such as high performance, a huge period¹ of 2^19937 − 1, and it passes all but the most demanding randomness tests. Despite all of these, it is not cryptographically secure.

¹ The period of a pseudorandom number generator is how many random numbers it produces before the entire sequence repeats.

To demonstrate this, we'll take a look at how the algorithm works. Fortunately, it's not very complex.

Internal structure

The standard Mersenne Twister algorithm operates on an internal state array S consisting of 624 unsigned 32-bit integers, and an index i
pointing to the current integer. It consists of three steps:

1. An optional initialization function, which produces an initial state from a small random value called a seed.

2. A state generation function, which produces a new state from the old state.

3. An extraction function, also called the tempering function, that produces a random number from the current element of the state (the element pointed at by the index i).

Whenever the extraction function is called, the index to the current integer is incremented. When all of the current elements of the state have been used to produce a number, the state generation function is called again. The state generation function is also called right before the first number is extracted.

So, to recap: the state is regenerated, then the extraction function goes over each of the elements in the state, until it runs out. This process repeats indefinitely.

TODO: illustrate
We'll look at each of the parts briefly. The exact workings of them are outside the scope of this book, but we'll look at them just long enough to get some insight into why Mersenne Twister is unsuitable as a cryptographically secure random number generator.

The initialization function

The initialization function creates an instance of Mersenne Twister's state array, from a small initial random number called a seed.
The array starts with the seed itself. Then, each next element is produced from a constant, the previous element, and the index of the new element. Elements are produced until there are 624 of them. Here's the Python source code:

def initialize_state(seed):
    state = [seed]
    for i in xrange(1, 624):
        prev = state[-1]
        elem = 0x6c078965 * (prev ^ (prev >> 30)) + i
        state.append(uint32(elem))
    return state
For those of you who haven't worked with Python or its bitwise operators:

• >> and << are right-shift and left-shift.

• & is binary AND: 0 & 0 = 0 & 1 = 1 & 0 = 0, and 1 & 1 = 1.

• ^ is binary XOR; ^= XORs and assigns the result to the name on the left-hand side, so x ^= k is the same thing as x = x ^ k.

REVIEW: Bitwise arithmetic appendix?
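The snippets in this section also rely on a uint32 helper that is not shown; a plausible definition (an assumption, matching how the code uses it) simply truncates a Python integer to its lowest 32 bits:

def uint32(n):
    # Keep only the lowest 32 bits, emulating unsigned 32-bit overflow.
    return n & 0xffffffff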
The state regeneration function

The state regeneration function takes the current state and produces a new state. It is called right before the first number is extracted, and every time all 624 elements of the state have been used up.

The Python source code for this function is fairly simple. Note that it modifies the state array in place, instead of returning a new one.

def regenerate(s):
    for i in xrange(624):
        y = s[i] & 0x80000000
        y += s[(i + 1) % 624] & 0x7fffffff
        z = s[(i + 397) % 624]
        s[i] = z ^ (y >> 1)
        if y % 2:
            s[i] ^= 0x9908b0df

The % in an expression like s[(i + n) % 624] means that the next element of the state is looked at, wrapping around to the start of the state array if there is no next element.
The tempering function

The tempering function is applied to the current element of the state before returning it as the produced random number. It's easier to just show the code instead of explaining how it works:

_TEMPER_MASK_1 = 0x9d2c5680
_TEMPER_MASK_2 = 0xefc60000

def temper(y):
    y ^= uint32(y >> 11)
    y ^= uint32((y << 7) & _TEMPER_MASK_1)
    y ^= uint32((y << 15) & _TEMPER_MASK_2)
    y ^= uint32(y >> 18)
    return y
It may not be obvious, especially if you're not used to binary arithmetic, but this function is bijective or one-to-one: each 32-bit integer input maps to exactly one output, and vice versa: for each 32-bit integer we get as an output, there was exactly one 32-bit integer it could have come from.

Because the tempering function is one-to-one, there is an inverse function: a function that gives you the untempered equivalent of a number. It may not be obvious to you how to construct that function unless you're a bitwise arithmetic wizard, but that's okay; in the worst case scenario we could still brute-force it. Suppose we just try every single 32-bit integer, and remember the result in a table. Then, when we get a result, we look it up in the table, and find the original. That table would have to be at least 2^32 · 32 bits in length, or about 17.18 GB; big, but not impossibly so.

Fortunately, there's a much simpler method to compute the inverse of the temper function. We'll see why that's interesting when we evaluate the cryptographic security of the Mersenne Twister in the next section. For those interested in the result, the untempering function looks like this:
def untemper(y):
    y ^= y >> 18
    y ^= ((y << 15) & _TEMPER_MASK_2)
    y = _undo_shift_2(y)
    y = _undo_shift_1(y)
    return y

def _undo_shift_2(y):
    t = y
    for _ in xrange(5):
        t <<= 7
        t = y ^ (t & _TEMPER_MASK_1)
    return t

def _undo_shift_1(y):
    t = y
    for _ in xrange(2):
        t >>= 11
        t ^= y
    return t
Cryptographic security
Remember that for cryptographic security, it has to be impossible to
predict future outputs or recover past outputs given present outputs.
The Mersenne Twister doesn't have that property.
It's clear that pseudorandom number generators, both those that are
cryptographically secure and those that aren't, are entirely defined by their
internal state. After all, they are deterministic algorithms: they're just
trying very hard to pretend not to be. Therefore, you could say that
the principal difference between cryptographically secure and ordinary
pseudorandom number generators is that the cryptographically secure
ones shouldn't leak information about their internal state, whereas it
doesn't matter for regular ones.
Remember that in the Mersenne Twister, a random number is produced
by taking the current element of the state, applying the tempering
function, and returning the result. We've also seen that the
tempering function has an inverse function. So, if I can see the output
of the algorithm and apply the inverse of the tempering function, I've
recovered one element out of the 624 in the state.
Suppose that I can see the outputs of the algorithm, and that we begin
at the start of the state, such as with a fresh instance of the algorithm.
That means I can clone the state by just having it produce 624 random
numbers.
Even if an attacker doesn't see all 624 numbers, they can often still
recreate future states, thanks to the simple relations between past states
and future states produced by the state regeneration function.
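To make this concrete, here is a minimal sketch of such a cloning attack in Python, reusing the untemper, regenerate and temper functions from this chapter. It assumes a hypothetical victim_random_uint32 function that returns the generator's outputs, and that we start observing at the beginning of a state block; a real attack would also have to account for the generator's position within the current block.

def clone_state(victim_random_uint32):
    # Collect 624 consecutive outputs and undo the tempering on each
    # one; that recovers the generator's entire internal state.
    return [untemper(victim_random_uint32()) for _ in xrange(624)]

def predict_next_block(state):
    # With the recovered state we can simply run the algorithm
    # forward ourselves and compute the victim's next 624 outputs.
    regenerate(state)
    return [temper(y) for y in state]

From that point on, every future output can be predicted in lockstep with the victim's generator.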
Again, this is not a weakness of the Mersenne Twister. It's designed
to be fast and have strong randomness properties. It is not designed to
be unpredictable, which is the principal property of a cryptographically
secure pseudorandom number generator.
Part III
Complete cryptosystems
14 SSL and TLS
14.1 Description
SSL, short for Secure Socket Layer, is a cryptographic protocol originally
introduced by Netscape Communications¹ for securing traffic on
the Web. The standard is now superseded by TLS (Transport Layer
Security), a standard published in RFCs by the IETF. The term SSL
is still commonly used, even when the speaker actually means a TLS
connection. From now on, this book will only use the term TLS, unless
we really mean the old SSL standard.

¹ For those too young to remember, Netscape is a company that used to make
browsers.

Its first and foremost goal[17] is to transport bytes securely, over
the Internet or any other insecure medium. It is a hybrid cryptosystem: it
uses both symmetric and asymmetric algorithms in unison. For example,
signature algorithms can be used to authenticate peers, while public
key algorithms can be used to negotiate shared secrets and authenticate
certificates.
On the symmetric side, stream ciphers (both native and using
modes of operation) are used to encrypt the actual data being trans-
mitted, and MACs are used to authenticate that data.
TLS is the world's most common cryptosystem, and hence probably
also the most studied. Over the years, many flaws have been discovered
in SSL and TLS, despite many of the world's top cryptographers
contributing to and examining the standard². As far as we know,
the current versions of TLS are secure, or at least can be configured
securely.
14.2 Handshakes
TODO: explain a modern TLS handshake
Downgrade attacks
SSL 2.0 made the mistake of not authenticating handshakes. This
made it easy to mount downgrade attacks. A downgrade attack is a
man-in-the-middle attack where an attacker modifies the handshake
messages that negotiate which ciphersuite is being used. That way, he
can force the clients to set up the connection using an insecure block
cipher, for example.
² In case I haven't driven this point home yet: it only goes to show that designing
cryptosystems is hard, and you probably shouldn't do it yourself.
Due to cryptographic export restrictions at the time, many ciphers
were only 40 or 56 bit. Even if the attacker couldn't break the best
encryption both client and server supported, he could probably break
the weakest, which is all that is necessary for a downgrade attack to
succeed.
This is one of the many reasons that there is an explicit RFC[35]
prohibiting new TLS implementations from having SSL v2.0 support.
14.3 Certificate authorities
TLS certificates can be used to authenticate peers, but how do we
authenticate the certificate? My bank may very well have a certificate
claiming to be that particular bank, but how do I know it's actually my
bank, and not just someone pretending to be my bank? Why should
I trust this particular certificate? As we've seen when we discussed
these algorithms, anyone can generate as many key pairs as they'd like.
There's nothing stopping someone from generating a key pair pretending
to be your bank.
When someone actually tries to use a certificate to impersonate a
bank, real browsers don't simply believe them. They notify the user that the
certificate is untrusted. They do this using the standard TLS trust
model of certificate authorities. TLS clients come with a list of trusted
certificate authorities, commonly shipped with your operating system
or your browser. These are special, trusted certificates that are carefully
guarded by their owners.
For a fee, these owners will use their certificate authority to sign
other certificates. The idea is that the certificate authority wouldn't
sign a certificate for Facebook or a bank or anyone else, unless you
could prove you're actually them.
When a TLS client connects to a server, that server provides a
certificate chain. Typically, their own certificate is signed by an
intermediary CA certificate, which is signed by another, and another,
and one that is signed by a trusted root certificate authority. Since
the client already has a copy of that root certificate, they can verify the
signature chain starting with the root.
Your fake certificate doesn't have a chain leading up to a trusted
root certificate, so the browser rejects it.
TODO: Explain why this is a total racket
14.4 Self-signed certificates
14.5 Client certificates
In TLS, certificates are usually only used to identify the server. This
satisfies a typical use case: users want to communicate securely with
their banks and e-mail providers, and the certificate authenticates the
service they're talking to. The service usually authenticates the user
using passwords, and, occasionally, two-factor authentication.
In the public-key schemes we've seen so far, all peers typically had
one or more key pairs of their own. There's no reason users can't have
their own certificates, and use them to authenticate to the server. The
TLS specification explicitly supports client certificates. This feature is
only rarely used, even though it clearly has very interesting security
benefits.
The main reason for that is probably rooted in the poor user experience.
There are no easy-to-use systems for non-technical people
that rely on client certificates. Since there are few such systems, even
tech-savvy people don't know about them, which means new systems
aren't created.
Client certificates are a great solution for when you control both
ends of the wire and want to securely authenticate both peers in a TLS
connection. By operating your own certificate authority, you can even
sign these client certificates to authenticate them.
14.6 Perfect forward secrecy
Historically, the most common way to agree on the pre-master secret is
for the client to select a random number and encrypt it, typically using
RSA. This has a few nice properties. For example, it means the server
can make do with less entropy: since the random bits are handed to
the server by the client, the server doesn't need to produce any cryptographically
random bits. It also makes the handshake slightly faster,
since there's no need for back-and-forth communication to agree on a
shared secret.
However, it has one major flaw. Suppose an attacker gets access to
the server's private key. Perhaps they managed to factor the modulus
of the RSA key, or perhaps they broke in and stole it, or perhaps they
used legal force to get the owner to hand over the key. Regardless of
how they acquired it, getting access to the key allows the attacker to
decrypt all past communication. The key allows them to decrypt the
encrypted pre-master secrets, which allows them to derive all of the
symmetric encryption keys, and therefore decrypt everything.
There are obvious alternatives to this scheme. We've already seen
Diffie-Hellman key exchange, allowing two peers to agree on secret
keys over an insecure medium. TLS allows for peers to agree on the
pre-master secret using a Diffie-Hellman exchange, either based on
discrete logs or elliptic curves.
Assuming both peers discard the keys after use like they're supposed
to, getting access to the secret keys wouldn't allow an attacker
to decrypt previous communication. That property is called perfect
forward secrecy. The term "perfect" is a little contested, but the term
"forward" means that communications can't be decrypted later if the
long-term keys (such as the server's private key) fall into the wrong
hands.
Of course, this is only true if Diffie-Hellman exchanges are secure.
If an attacker has a significant mathematical and computational
advantage over everyone else, such as an algorithm for solving the discrete
log problem more efficiently than thought possible, combined
with many data centers filled with number-crunching computers, it's
possible that they'll break the key exchange itself.
14.7 Session resumption
TODO: explain session resumption
14.8 Attacks
As with most attacks, attacks on TLS can usually be grouped into two
distinct categories:
1. Attacks on the protocol itself, such as subverting the CA mech-
anism;
2. Attacks on a particular implementation or cipher, such as crypt-
analytic attacks exploiting weaknesses in RC4, or timing attacks
in a particular AES implementation.
Unfortunately, SSL/TLS has had many successful attacks in both
categories. This section is particularly about the latter.
CRIME and BREACH
CRIME³ is an attack by the authors of BEAST. It's an innovative side
channel attack that relies on TLS compression leaking information
about secrets in the plaintext. In a related attack called BREACH⁴,
the attackers accomplish the same effect using HTTP compression.
That was predicted by the authors of the original paper, but the
BREACH authors were the first to demonstrate it as a practical attack.
The BREACH attack was more practically applicable, though:
HTTP compression is significantly more common than TLS compression.
Both of these rely on encryption of a compressed plaintext, and
their mechanisms are virtually identical: only the specific details
related to HTTP compression or TLS compression are relevant. The
largest difference is that with TLS compression, the entire stream can
be attacked; with HTTP compression, only the body is compressed,
so HTTP headers are safe. Since the attacks are otherwise extremely
similar, we'll just talk about how the attack works in the abstract, by
explaining how attackers can learn information about the plaintext if
it is compressed before encryption.

³ Compression Ratio Info-leak Made Easy
⁴ Browser Reconnaissance and Exfiltration via Adaptive Compression of Hypertext
The most common algorithm used to compress both HTTP and
TLS[21] is called DEFLATE. The exact mechanics of DEFLATE
aren't too important, but the important feature is that byte sequences
that occur more than once can be efficiently stored. When a byte sequence
recurs⁵, instead of recording the same sequence, a reference is
provided to the previous sequence: instead of repeating the secret, it
says "go back and look at the thing I wrote N bytes ago".
Suppose an attacker can control the plaintext. For example, the
attacker injects an invisible iframe⁶ or some Javascript code that fires
off many requests. The attacker needs some way to inject their guess
of the secret so that their guess occurs in the plaintext, such as the
query parameters⁷. Usually, they can prefix their guess with something
known. For example, if the CSRF token is:

<input type="hidden"
       name="csrf-token"
       value="TOKEN_VALUE_HERE">

they can prefix the guess with the known part of that.
Then, they make a bunch of guesses, byte by byte. When one of
their guesses is correct, the ciphertext will be just a little shorter. Once
they have that first byte, they go on to the next one, and so forth, until
they discover the entire secret.

⁵ Within limits; specifically within a sliding window, usually 32 kB big. Otherwise,
the pointers would grow bigger than the sequences they're meant to compress.
⁶ An iframe is a web page embedded within a page.
⁷ The key-value pairs in a URL after the question mark, e.g. the x=1&y=2 in
http://example.test/path?x=1&y=2.
This attack is particularly interesting for a number of reasons. Not
only is it a completely new class of attack, widely applicable to many
cryptosystems, but compressing the plaintext prior to encryption was
actively recommended by existing cryptographic literature. It doesn't
require any particularly advanced tools: you only need to convince the
user to make requests to a vulnerable website, and you only need to be
able to measure the size of the responses. It's also extremely effective:
the researchers that published BREACH report being able to extract
secrets, such as CSRF tokens, within one minute.
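The core of the length leak is easy to demonstrate. The following sketch uses Python's zlib module as a stand-in for the DEFLATE compression performed by TLS or HTTP compression; the page contents and token are made up for the example. A real attack observes ciphertext sizes instead, but stream ciphers and CTR mode do not hide the plaintext length, and it guesses byte by byte as described above rather than comparing whole tokens.

import zlib

# Hypothetical response body containing a secret CSRF token.
PAGE = b"<html><input name=csrf-token value=s3cr3tt0ken1234></html>"

def response_size(attacker_data):
    # The attacker-controlled data (say, a reflected query parameter)
    # is compressed together with the secret before being encrypted.
    return len(zlib.compress(PAGE + attacker_data))

# A guess that matches the secret is covered by a single back-reference,
# so the compressed (and therefore encrypted) response is smaller.
print(response_size(b"csrf-token value=s3cr3tt0ken1234"))  # correct guess
print(response_size(b"csrf-token value=q8w7e6r5t4y3u2i1"))  # wrong guess: larger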
In order to defend against CRIME, disable TLS compression.
This is generally done in most systems by default. In order to defend
against BREACH, there are a number of possible options:
• Don't allow the user to inject arbitrary data into the request.
• Don't put secrets in the response bodies.
• Regenerate secrets such as CSRF tokens liberally, for example,
  on each request.⁸
It's a bad idea to simply unconditionally turn off HTTP compression.
While it does successfully stop the attack, HTTP compression
is a critical tool for making the Web faster.

Web apps that consist of a static front-end (say, using HTML5,
JS, CSS) and that only operate using an API, say, JSON over REST,
are particularly easy to immunize against this attack. Just disable compression
on the channel that actually contains secrets. It makes things
slower, of course, but at least the majority of data can still be served
over a CDN.

⁸ Be careful not to drain your system of entropy: perhaps use longer secrets, generated
with a pseudorandom number generator, instead of using a random number
source meant for cryptographic use.
14.9 HSTS
HTTP Strict Transport Security (HSTS) is a way for web servers to
communicate that what they're saying should only ever be transferred
over a secure transport. In practice, the only secure transport that is
ever used for HTTP is TLS.
Using HSTS is quite simple; the web server just adds an extra
Strict-Transport-Security header to the response. The header
value contains a maximum age (max-age), which determines how long
into the future the browser can trust that this website will be HSTS-enabled.
This is typically a large value, such as a year. Browsers successfully
remembering that a particular host is HSTS-enabled is very
important to the effectiveness of the scheme, as we'll see in a bit. Optionally,
the HSTS header can include the includeSubDomains directive,
which details the scope of the HSTS policy. [20]
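As an illustration, a response header that enables HSTS for one year and extends the policy to subdomains might look like this (the exact max-age value is up to the site operator):

Strict-Transport-Security: max-age=31536000; includeSubDomains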
There are several things that a conforming web browser will do
when communicating with an HSTS-enabled website:
• Whenever there is any attempt to make any connection to this
  website, it will always be done over HTTPS. The browser does
  this completely by itself, before making the request to the website.
• If there is an issue setting up a TLS connection, the website will
  not be accessible, instead of simply displaying a warning.
Essentially, HSTS is a way for websites to communicate that they
only support secure transports. This helps protect the users against all
sorts of attacks, including both passive eavesdroppers (that were hoping
to see some credentials accidentally sent in plaintext) and active man-in-the-middle
attacks such as SSL stripping.
HSTS also defends against mistakes on the part of the web server.
For example, a web server might accidentally pull in some executable
code, such as some Javascript, over an insecure connection. An active
attacker that can intercept and modify that Javascript would then have
complete control over the (supposedly secure) web site.
As with many TLS improvements, HSTS is not a panacea: it is
just one tool in a very big toolbox of things we have to try to make
TLS more secure. HSTS only helps to ensure that TLS is actually
used; it does absolutely nothing to prevent attacks against TLS itself.
HSTS can suffer from a chicken-and-egg problem. If a browser
has never visited a particular HSTS-enabled website before, it's possible
that the browser doesn't know that that website is HSTS-enabled
yet. Therefore, the browser may still attempt a regular HTTP connection,
vulnerable to an SSL stripping attack. Some browsers have
attempted to mitigate this issue by shipping browsers pre-loaded
with a list of HSTS websites.
14.10 Certificate pinning
Certificate pinning is an idea that's very similar to HSTS, taken a little
further: instead of just remembering that a particular server promises
to support HTTPS, we'll remember information about their certificates
(in practice, we'll remember a hash of the public key). When we
connect to a server that we have some stored information about, we'll
verify their certificates, making it much harder for an impostor to pretend
to be the website we're connecting to using a different certificate.
Browsers originally implemented certificate pinning by shipping
with a list of certificates from large, high-profile websites. For
example, Google included whitelisted certificates for all of their
services in their Chrome browser.
14.11 Secure configurations
In this section, we are only talking about configuration options such
as which ciphers, TLS/SSL versions, et cetera. We're specifically not
talking about TLS configuration in the sense of trust models, key
management, et cetera.
There are several issues with configuring TLS securely:
1. Often, the defaults are unsafe, and people are unaware that they
should be changed.
2. The things that constitute a secure TLS configuration can
change rapidly, because cryptanalysis and practical attacks are
continuously improving.
3. Old clients that still need to be supported sometimes mean that
you have to hang on to broken configuration options.
A practical example of some of these points coming together is
the BEAST attack. That attack exploited weaknesses in CBC ciphersuites
in TLS v1.0, which were part of the default ciphersuite specifications
everywhere. Many people recommended defending against it
by switching to RC4. However, RC4 was already considered cryptographically
weak, and later cryptanalysis showed that RC4 was even more broken than
previously suspected. The attack had been known for years before being
practically exploited; it was already fixed in TLS v1.1 in 2006, years
before the BEAST paper was published. However, TLS v1.1 had
not seen wide adoption.
Therefore, good advice necessarily changes over time, and it's impossible
to keep such advice current in a persistent medium such as a book. Instead, you
should look at continuously updated third-party sources such as Qualys
SSL Labs. They provide tests for both SSL clients and servers, and extensive
advice on how to improve configurations.
That said, there are certainly some general things we want from a
TLS configuration.
TODO: say stuff we generally want from TLS configurations
TODO: http://tools.ietf.org/html/draft-agl-tls-chacha20poly1305-01
15 OpenPGP and GPG
15.1 Description
OpenPGP is an open standard that describes a method for encrypting
and signing messages. GPG is the most popular implementation of
that standard¹, available under a free software license.

¹ GPG 2 also implements S/MIME, which is unrelated to the OpenPGP standard.
This chapter only discusses OpenPGP.

Unlike TLS, which focuses on data in motion, OpenPGP focuses
on data at rest. A TLS session is active: bytes fly back and forth as
the peers set up the secure channel. An OpenPGP interaction is, by
comparison, static: the sender computes the entire message up front
using information shared ahead of time. In fact, OpenPGP doesn't
insist that anything is sent at all: for example, it can be used to sign
software releases.
Like TLS, OpenPGP is a hybrid cryptosystem. Users have key
pairs consisting of a public key and a private key. Public key algorithms
are used both for signing and encryption. Symmetric key algorithms
are used to encrypt the message body; the symmetric key itself
is protected using public-key encryption. This also makes it easy to
encrypt a message for multiple recipients: only the small symmetric key has
to be encrypted once per recipient, not the entire message.
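For example, with the gpg command-line tool, a single invocation can encrypt one message to several recipients; the message body is encrypted once, and only the session key is wrapped separately for each listed key (the addresses here are placeholders):

gpg --encrypt --recipient alice@example.org --recipient bob@example.org message.txt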
15.2 The web of trust
Earlier, we saw that TLS typically uses trusted root certificates to establish
that a particular peer is who they claim to be. OpenPGP does
not operate using such trusted roots. Instead, it relies on a system
called the Web of Trust: a friend-of-a-friend honor system that relies
on physical meetings where people verify identities.
The simplest case is a directly trusted key. If we meet up in person,
we can verify each other's identities. Perhaps we know each other, or
perhaps we'd check some form of identification. Then, we sign each
other's keys.
Because I know the key is yours, I know that you can read the
messages encrypted with it, and the other way around. Provided you
don't share your key, I know that only you can read those messages.
No-one can replace my copy of your key, because they wouldn't be
able to forge my signature on it.
There's a direct trust link between the two of us, and we can communicate
securely.
A slightly more complicated case is when a friend of yours would
like to send me a message. We've never met: they've never signed my
key, nor have I signed theirs. However, I have signed your key, and
vice versa. You've signed your friend's key, and vice versa. Your friend
can choose to leverage your assertion that I'm indeed the person in
possession of that key you signed, and use that to communicate with
me securely.
You might wonder how your friend would ever see the signatures that
you placed on my key. This is because keys and signatures are typically
uploaded to a network of key servers, making them freely available to
the world.
The above system can be extended to multiple layers of friends.
It relies in no small part on communities being linked by signatures,
which is why many community events include key signing parties,
where people sign each other's keys. For large events, such as international
programming conferences, this system is very effective. The
main weakness in this system are "islands" of trust: individuals or small
groups with no connections to the rest of the web.
Of course, this is only the default way to use OpenPGP. There's
nothing stopping you from shipping a particular public key with some
software, and using that to sign messages, just like you might want to
do with TLS.
16 Off-the-Record Messaging (OTR)
16.1 Description
Off-the-record (OTR) messaging is a protocol for securing instant
messaging communication between people[13]. It intends to be the
online equivalent of a private, real-life conversation. It encrypts messages,
preventing eavesdroppers from reading them. It also authenticates
peers to each other, so they know who they're talking to. Despite
authenticating peers, it is designed to be deniable: participants
can later deny to third parties anything they said to each other. It is
also designed to have perfect forward secrecy: even a compromise of a
long-term public key pair doesn't compromise any previous conversations.
The deniability and perfect forward secrecy properties are very different
from those of other systems such as OpenPGP. OpenPGP intentionally
guarantees non-repudiability. That's a great property if you're
signing software packages, talking on mailing lists or signing business
invoices, but the authors of OTR argue that those aren't desirable
properties for the online equivalent of one-on-one conversations.
Furthermore, OpenPGP's static model of communication makes the
constant key renegotiation needed to facilitate OTR's perfect forward secrecy
impossible.
OTR is typically configured opportunistically, which means that
it will attempt to secure any communication between two peers, if
both understand the protocol, without interfering with communication
where the other peer does not. The protocol is supported in many
different instant messaging clients, either directly or with a plugin.
Because it works over instant messages, it can be used across many
different instant messaging protocols.
A peer can signal that they would like to speak OTR with an explicit
message, called the OTR Query message. If the peer is just
willing to speak OTR but doesn't require it, they can optionally invisibly
add that information to a plaintext message. That happens with a
clever system of whitespace tags: a bunch of whitespace such as spaces
and tab characters are used to encode that information. An OTR-capable
client can interpret that tag and start an OTR conversation;
a client that isn't OTR-capable just displays some extra whitespace.
OTR uses many of the primitives we've seen so far:
• Symmetric key encryption (AES in CTR mode)
• Message authentication codes (HMAC with SHA-1)
• Diffie-Hellman key exchange
Authenticated key exchange (AKE)
TODO: Explain (https://otr.cypherpunks.ca/Protocol-v3-4.0.0.
html), #33
Data exchange
TODO: Explain (https://otr.cypherpunks.ca/Protocol-v3-4.0.0.
html), #33
Part IV
Appendices
A Modular arithmetic
Modular arithmetic is used for many public key cryptosystems, including
public-key encryption algorithms like RSA and key exchange
protocols like Diffie-Hellman.
Modular arithmetic is something most people actually already understand;
they just don't know it's called that. We can illustrate the
principles of modular arithmetic using a clock.
For simplicity's sake, our 12-hour clock only shows hours, not
minutes or seconds. Unlike real clocks, the hour hand always shows an
exact hour, such as 2 or 9, and is never halfway in between two hours.
A.1 Addition and subtraction
It obviously makes sense to add hours to our clock: if it's 2 o'clock now,
and you'd like to know what time it is five hours from now, you can
Figure A.1: A clock, pointing to 2.
add 5, and end up with 7, as you can see in figure A.2.
Figure A.2: 2 + 5 = 7, on the clock.
Similarly, we can subtract times. If it's 10 o'clock now, and you'd
like to know what time it was two hours ago, you subtract two and end
up with 8.
The weird part is when you cross the boundary at 12. As far as
the clock is concerned, there's no real difference between 12 and 0. If
it's 10 o'clock now, it'll be 2 o'clock in four hours. If it's 2 o'clock now,
it was 9 o'clock five hours ago.

Figure A.3: 10 − 2 = 8, on the clock.
This is an example of what's called modular arithmetic. The modulus,
in this case, is 12. We can write the above equations as:

(10 + 4) mod 12 = 2
(2 − 5) mod 12 = 9

In these equations, mod is an operator, giving the remainder
after division. When we are dealing with modular arithmetic, where
all operations are affected by the modulus instead of a single
operation, we'll write (mod 12) at the end of the equation:

10 + 4 ≡ 2 (mod 12)
2 − 5 ≡ 9 (mod 12)

This is read as "ten plus four is equivalent to two, modulo twelve"
and "two minus five is equivalent to nine, modulo twelve". That might
seem like a trivial notational hack now, but the difference will become
apparent once we start applying tricks for doing more complex
modular computations, like multiplication and exponentiation.
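If you want to play along in Python, the % operator computes exactly this remainder; for a positive modulus it also does the right thing for negative numbers, matching the clock example:

print((10 + 4) % 12)   # 2
print((2 - 5) % 12)    # 9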
A.2 Prime numbers
Prime numbers are wonderful kinds of numbers that come back in
many branches of mathematics. Anything I say about them probably
won't do them justice; but we're in a practical book about applied
cryptography, so we'll only see a few properties.
A prime number is a number that is divisible only by two numbers:
1 and itself. For example, 3 is a prime number, but 4 is not, because it
can be divided by 2.
Any number can be written as a product of prime factors: a bunch
of prime numbers multiplied together. That product is called a factorization.
For example, 30 can be factorized into 2, 3 and 5:

30 = 2 · 3 · 5
Sometimes, a prime number will occur more than once in a factorization.
For example, the factorization of 360 has 2 in it three times,
and 3 in it twice:

360 = 2³ · 3² · 5

The factorization of any prime number is just that prime number
itself.
Two numbers are called coprime when their greatest common divisor
is 1, or, to put it another way, when they don't share any prime
factors. Since the only prime factor a prime has is itself, a prime is
coprime to every number that isn't a multiple of it.
A.3 Multiplication
You might remember that you were first taught multiplication as repeated
addition:

n · x = x + x + … + x (n times)

Modular multiplication is no different. You can compute modular
multiplication by adding the numbers together, and taking the modulus
whenever the sum gets larger than the modulus. You can also just
do regular multiplication, and then take the modulus at the end.
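A quick sketch in Python of both approaches, showing that they agree:

total = 0
for _ in range(7):              # compute 7 * 8 as repeated addition,
    total = (total + 8) % 12    # reducing whenever we exceed the modulus

print(total)                    # 8
print((7 * 8) % 12)             # 8: multiply first, reduce at the end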
A.4 Division and modular inverses
Division is defined as the inverse of multiplication. So, if a · b ≡ c
(mod m), then c/b ≡ a (mod m).

For example, 5 · 6 ≡ 2 (mod 7); so 2/6 ≡ 5 (mod 7).
Usually, instead of using division directly, we'll multiply using
something called a modular inverse. The modular inverse of a is a
number that, when you multiply it with a, you get 1. This is just like
the inverse of a number in regular arithmetic: x · (1/x) = 1.
Like in regular arithmetic, not all numbers have modular inverses.
This is the equivalent of dividing by zero in regular arithmetic.
There are two algorithms that are used to compute modular inverses:
the extended Euclidean algorithm, and exponentiation via Euler's
theorem.
The extended Euclidean algorithm
TODO: explain, and how you can get modular inverses with it
Using Euler's theorem
Euler's theorem states that if two numbers a and n are coprime, then:

a^φ(n) ≡ 1 (mod n)

In that equation, φ is Euler's totient function, which counts how
many numbers smaller than its argument are coprime to it.
Multiplying both sides by a⁻¹, the multiplicative inverse of a, we get:

a^(φ(n)−1) ≡ a⁻¹ (mod n)
That gives us a direct formula for computing a⁻¹. Unfortunately,
it's still generally less interesting than using the extended Euclidean
algorithm, for two reasons:
1. It requires computing the totient function, which is generally
more complex than running the extended Euclidean algorithm
in the first place (unless you happen to know n's prime factors).
2. Modular exponentiation is computationally expensive.
One exception to that rule is for prime moduli. Since a prime p is
coprime to every number smaller than it, and since there are p − 1 numbers
smaller than p, φ(p) = p − 1. So, for a prime modulus p, the modular
inverse of a is simply:

a^(p−2) ≡ a⁻¹ (mod p)
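As a quick illustration in Python, sticking with the earlier example modulo the prime 7: the inverse of 6 turns out to be 6 itself, since 6 · 6 = 36 ≡ 1 (mod 7). Recent Python versions (3.8 and later) can also compute modular inverses directly with pow(a, -1, m).

p = 7
a = 6
inv = pow(a, p - 2, p)   # a**(p-2) is the inverse modulo a prime p
print(inv)               # 6
print((a * inv) % p)     # 1
print((2 * inv) % p)     # 5: the same answer as 2/6 = 5 (mod 7) above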
A.5 Exponentiation
Just like multiplication can be thought of as repeated addition, exponentiation
can be thought of as repeated multiplication:

a^n = a · a · … · a (n times)
Performing modular exponentiation
As with multiplication, it's possible to compute modular exponentiation
by performing regular exponentiation, and then taking the modulus
at the end. However, this is very inefficient, particularly for large
n: the product quickly becomes far too large.
Fortunately, it is possible to compute modular exponentiation
much more efficiently. This is done by splitting the problem up into
smaller sub-problems. For example, instead of computing 2^20 directly,
you could split it up:

2^20 = (2^10)²
2^10 is something you can compute on your hands: start at 2, which
is 2^1, and then keep multiplying by two. Every time you multiply by
two, the exponent goes up by 1, so by the time you've counted all your
fingers (assuming you have ten of them), you're done. The result is
1024. So:

2^20 ≡ (2^10 mod 15)² (mod 15)
     ≡ (1024 mod 15)² (mod 15)
     ≡ 4² (mod 15)
     ≡ 16 (mod 15)
     ≡ 1 (mod 15)
A particularly efficient way to do it on computers is splitting the
exponent up into a sum of powers of two. This is called binary exponentiation,
or exponentiation by squaring. Suppose we want to compute
3^209 (mod 19). First, we split up 209 into a sum of powers of
two. This process is essentially just writing 209 down in binary,
which would be 0b11010001. That's very practical if the computation
is being performed by a computer, because that's often how the
computer has the number stored in the first place.
209 = 1 · 2⁷ + 1 · 2⁶ + 0 · 2⁵ + 1 · 2⁴ + 0 · 2³ + 0 · 2² + 0 · 2¹ + 1 · 2⁰
    = 1 · 128 + 1 · 64 + 0 · 32 + 1 · 16 + 0 · 8 + 0 · 4 + 0 · 2 + 1 · 1
    = 128 + 64 + 16 + 1
We use that expansion into a sum of powers of two to rewrite the
equation:

3^209 = 3^(128+64+16+1) = 3^128 · 3^64 · 3^16 · 3^1
Now, we need to compute those individual powers of 3, with exponents 1, 16, 64
and 128. A nice property of this algorithm is that we don't actually
have to compute the big powers separately from scratch. We can use
previously computed smaller powers to compute the larger ones. For
example, we need both 3^128 (mod 19) and 3^64 (mod 19), but you can
write the former in terms of the latter:

3^128 mod 19 = (3^64 mod 19)² (mod 19)
Let's compute all the powers of 3 we need. For the sake of brevity,
we won't write these out entirely, but remember that all the tricks we've
already seen to compute these still apply:

3^16 ≡ 17 (mod 19)
3^64 ≡ (3^16)⁴ ≡ 17⁴ ≡ 16 (mod 19)
3^128 ≡ (3^64)² ≡ 16² ≡ 9 (mod 19)
Filling these back in to our old equation:

3^209 = 3^128 · 3^64 · 3^16 · 3^1 (mod 19)
      ≡ 9 · 16 · 17 · 3 (mod 19)
      ≡ 10 (mod 19)
This trick is particularly interesting when the exponent is a very
large number. That is the case in many cryptographic applications.
For example, in RSA decryption, the exponent is the private key d,
which is usually more than a thousand bits long. Keep in mind that
this method will still leak timing information, so it's only suitable for
offline computation. Modular exponentiation can also be computed
using a technique called a Montgomery ladder, which we'll see in the
next section.
Many programming languages provide access to specific modular
exponentiation functions. For example, in Python, pow(e, x, m)
performs efficient modular exponentiation. However, the expression
(e ** x) % m will still use the inefficient method.
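Here is a small sketch of the same square-and-multiply idea in Python, walking through the bits of the exponent from least to most significant. It produces the same answer as the built-in pow:

def modexp(base, exponent, modulus):
    result = 1
    square = base % modulus

    while exponent:
        if exponent & 1:
            # This bit of the exponent is set: multiply the current
            # power of the base into the result.
            result = (result * square) % modulus
        square = (square * square) % modulus
        exponent >>= 1

    return result

print(modexp(3, 209, 19))   # 10
print(pow(3, 209, 19))      # 10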
Timing-invariant computation using a Montgomery ladder
TODO: explain
A.6 Discrete logarithm
Just like subtraction is the inverse of addition, and division is the inverse
of multiplication, logarithms are the inverse of exponentiation.
In regular arithmetic, e^x = y if x = log_e y. The equivalent of this in
modular arithmetic is commonly called a discrete logarithm.
As with division, if you start from the definition as the inverse
of a different operator, it's easy to come up with examples. For example,
since 3^6 ≡ 9 (mod 15), we can define log_3 9 ≡ 6 (mod 15).
However, computing discrete logarithms is generally fairly hard, unlike
computing modular inverses. There is no formal proof that computing discrete
logarithms is complex; we just haven't found any efficient algorithms
to do it.
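The only fully generic method we have on classical computers is essentially trial and error, as in the sketch below; it takes a number of steps proportional to the modulus, which is fine for toy numbers but hopeless for the moduli of hundreds or thousands of bits used in practice. The small numbers here are just an example.

def discrete_log(base, target, modulus):
    # Try every exponent in turn until base ** x matches the target.
    power = 1
    for x in range(modulus):
        if power == target:
            return x
        power = (power * base) % modulus
    return None   # no solution exists

print(discrete_log(5, 8, 23))   # 6, because 5**6 = 15625 = 8 (mod 23)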
There is one theoretical algorithm for computing discrete logarithms
efficiently. However, it requires a quantum computer, which
is a fundamentally different kind of computer from the classical computers
we use today. While we can build such computers, we can only
build very small ones. The limited size of our quantum computers
strongly limits which problems we can solve. So far, they're much
more in the realm of the kind of arithmetic a child can do in their
head than ousting the top-of-the-line classical computers from the
performance throne.
The complexity of computing discrete logarithms, together with
the relative simplicity of computing its inverse, modular exponentiation,
is the basis for many public key cryptosystems. A common example
is the Diffie-Hellman key exchange protocol; the RSA encryption
primitive relies on the closely related difficulty of computing modular roots.
While cryptosystems based on the discrete logarithm problem are
currently considered secure with appropriate parameter choices, there
are certainly ways that could change in the future. For example:
• Theoretical breakthroughs in number theory could make discrete
  logarithms significantly easier to compute than we currently
  think.
• Technological breakthroughs in quantum computing could lead
  to large enough quantum computers.
• Technological breakthroughs in classical computing, as well as
  the continuous gradual increases in performance and decreases
  in cost, could increase the size of some problems that can be
  tackled using classical computers.
Discrete logarithm computation is tightly linked to the problem
of number factorization. They are still areas of active mathematical research;
the links between the two problems are still not thoroughly
understood. That said, there are many similarities between the two:
• Both are believed to be hard to compute on classical computers,
  but neither has a proof of that fact.
• They can both be efficiently computed on quantum computers
  using Shor's algorithm.
• Mathematical advances in one are typically quickly turned into
  mathematical advances in the other.
B Elliptic curves
Like modular arithmetic, elliptic curve arithmetic is used for many
public key cryptosystems. Many cryptosystems that traditionally work
with modular arithmetic, such as Diffie-Hellman and DSA, have an
elliptic curve counterpart.
Elliptic curves are curves with the following form:

y² = x³ + ax + b

This is the most common form when talking about elliptic curves
in general; there are several other forms, which mostly have applications
in cryptography, notably the Edwards form:

x² + y² = 1 + dx²y²
We can define addition of points on the curve.
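To make that concrete, here is a small Python sketch of the textbook chord-and-tangent addition rule for a curve y² = x³ + ax + b over a prime field, with None standing in for the point at infinity. The curve and points are a tiny example chosen for illustration only, not parameters you should use for real cryptography, and the code skips checks such as verifying that the inputs actually lie on the curve.

# Example curve: y^2 = x^3 + 2x + 2 over GF(17)
p, a = 17, 2

def inverse(x):
    # Modular inverse via Fermat's little theorem (p is prime).
    return pow(x, p - 2, p)

def add(P, Q):
    if P is None:                       # None plays the role of the
        return Q                        # point at infinity O
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                     # P + (-P) = O
    if P == Q:
        slope = (3 * x1 * x1 + a) * inverse(2 * y1) % p
    else:
        slope = (y2 - y1) * inverse(x2 - x1) % p
    x3 = (slope * slope - x1 - x2) % p
    y3 = (slope * (x1 - x3) - y1) % p
    return (x3, y3)

G = (5, 1)
print(add(G, G))           # doubling: (6, 3)
print(add(add(G, G), G))   # 3G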
TODO: Move the Abelian group thing somewhere else, since it
applies to our fields thing as well
All of this put together forms something called an Abelian group.
That's a scary-sounding mathematical term that almost everyone already
understands the basics of. Specifically, if you know how to
add integers (…, −2, −1, 0, 1, 2, …) together, you already know an
Abelian group. An Abelian group satisfies five properties:
1. If a and b are members of the Abelian group and ⋆ is the operator,
then a ⋆ b is also a member of that Abelian group. Indeed,
any two integers added together always give you another integer.
This property is called closure, or, we say that the group is closed
under addition (or whatever the name is of the operation we've
defined).
2. If a, b and c are members of the Abelian group, the order of
operations doesn't matter; to put it differently: we can move the
brackets around. In equation form: (a ⋆ b) ⋆ c = a ⋆ (b ⋆ c).
Indeed, the order in which you add integers together doesn't
matter; they will always sum up to the same value. This property
is called associativity, and the group is said to be associative.
3. There's exactly one identity element i, for which a ⋆ i = i ⋆ a = a.
For integer addition, that's zero: a + 0 = 0 + a = a for all a.
4. For each element a, there's exactly one inverse element b, for
which a ⋆ b = b ⋆ a = i, where i is the identity element. Indeed,
for integer addition, a + (−a) = (−a) + a = 0 for all a.
5. The order of the elements doesn't matter for the result of the operation.
For all elements a, b: a ⋆ b = b ⋆ a. This is known as
commutativity, and the group is said to be commutative.
The first four properties are called group properties and make
something a group; the last property is what makes a group Abelian.
We can see that our elliptic curve, with the point at infinity and
the addition operator, forms an Abelian group:
1. If P and Q are two points on the elliptic curve, then P + Q is
also always a point on the curve.
2. If P, Q, and R are all points on the curve, then P + (Q + R) =
(P + Q) + R, so elliptic curve addition is associative.
3. There's an identity element, our point at infinity O. For all
points on the curve P, P + O = O + P = P.
4. Each element has an inverse element. This is easiest explained
visually. TODO: Explain visually
5. The order of operations doesn't matter: P + Q = Q + P for all
P, Q on the curve.
B.1 The elliptic curve discrete log problem
TODO: explain fully
As with the regular discrete log problem, the elliptic curve discrete
log problem doesn't actually have a formal proof that the operation is
hard to perform: we just know that there is no publicly available algorithm
to do it efficiently. It's possible, however unlikely, that someone
has a magical algorithm that makes the problem easy, and that
would break elliptic curve cryptography completely. It's far more likely
that we will see a stream of continuous improvements which, coupled
with increased computing power, eventually eat away at the security of
the algorithm.
C Side-channel attacks
C.1 Timing attacks
AES cache timing
http://tau.ac.il/~tromer/papers/cache.pdf
Elliptic curve timing attacks
TODO: Explain why the Edwards form is great?
C.2 Power measurement attacks
TODO: Say something here.
Bibliography
[1] Specification for the Advanced Encryption Standard (AES).
Federal Information Processing Standards Publication 197,
2001. http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf.
[2] NIST special publication 800-38d: Recommendation for
block cipher modes of operation: Galois/Counter Mode
(GCM) and GMAC, November 2007. http://csrc.nist.gov/
publications/nistpubs/800-38D/SP-800-38D.pdf.
[3] Nadhem AlFardan, Dan Bernstein, Kenny Paterson, Bertram
Poettering, and Jacob Schuldt. On the security of RC4 in TLS
and WPA. http://www.isg.rhul.ac.uk/tls/.
[4] Ross Anderson and Serge Vaudenay. Minding your p's and q's.
In Advances in Cryptology - ASIACRYPT '96, LNCS 1163,
pages 26–35. Springer-Verlag, 1996. http://www.cl.cam.ac.uk/~rja14/Papers/psandqs.pdf.
[5] M. Bellare. New proofs for NMAC and HMAC: Security
without collision-resistance, 2006. http://cseweb.ucsd.edu/~mihir/papers/hmac-new.html.
[6] Mihir Bellare and Phillip Rogaway. Optimal Asymmetric Encryption -
How to encrypt with RSA. Advances in Cryptology -
EUROCRYPT '94 - Lecture Notes in Computer Science, 950, 1995.
http://www-cse.ucsd.edu/users/mihir/papers/oae.pdf.
[7] D. J. Bernstein. Snuffle 2005: the Salsa20 encryption function.
http://cr.yp.to/snuffle.html#speed.
[8] Alex Biryukov, Orr Dunkelman, Nathan Keller, Dmitry
Khovratovich, and Adi Shamir. Key recovery attacks of practical
complexity on AES variants with up to 10 rounds. Cryptology
ePrint Archive, Report 2009/374, 2009. http://eprint.iacr.
org/2009/374.
[9] Alex Biryukov and Dmitry Khovratovich. Related-key crypt-
analysis of the full AES-192 and AES-256. Cryptology ePrint
Archive, Report 2009/317, 2009. http://eprint.iacr.org/
2009/317.
[10] John Black, Shai Halevi, Hugo Krawczyk, Ted Krovetz, and
Phillip Rogaway. RFC 4418: UMAC: Message Authentica-
tion Code using Universal Hashing. https://www.ietf.org/
rfc/rfc4418.txt.
[11] John Black, Shai Halevi, Hugo Krawczyk, Ted Krovetz, and
Phillip Rogaway. UMAC: Fast and secure message authenti-
cation, 1999. http://www.cs.ucdavis.edu/~rogaway/papers/
umac-full.pdf.
[12] Dan Boneh. Twenty years of attacks on the RSA cryptosystem.
Notices of the AMS, 46:203–213, 1999. http://crypto.stanford.edu/dabo/papers/RSA-survey.pdf.
[13] Nikita Borisov, Ian Goldberg, and Eric Brewer. Off-the-record
communication, or, why not to use PGP. https://otr.cypherpunks.ca/otr-wpes.pdf.
[14] Daniel R. L. Brown and Kristian Gjøsteen. A security analysis
of the NIST SP 800-90 elliptic curve random number generator.
Cryptology ePrint Archive, Report 2007/048, 2007. http://eprint.iacr.org/2007/048.pdf.
[15] Joan Daemen and Vincent Rijmen. The design of Rijndael: AES -
the Advanced Encryption Standard. Springer-Verlag, 2002.
[16] Wei Dai. Crypto++ 5.6.0 benchmarks. http://www.cryptopp.
com/benchmarks.html.
[17] T. Dierks and E. Rescorla. RFC 5246: The transport layer security
(TLS) protocol, version 1.2. https://tools.ietf.org/html/rfc5246.
[18] Scott Fluhrer, Itsik Mantin, and Adi Shamir. Weaknesses in the
key scheduling algorithm of RC4. pages 1–24, 2001. http://www.wisdom.weizmann.ac.il/~itsik/RC4/Papers/Rc4_ksa.ps.
[19] SciEngines GmbH. Break DES in less than a single day,
2008. http://www.sciengines.com/company/news-a-events/
74-des-in-1-day.html.
[20] J. Hodges, C. Jackson, and A. Barth. RFC 6797: HTTP strict
transport security (HSTS). https://tools.ietf.org/html/rfc6797.
[21] S. Hollenbeck. RFC 3749: Transport layer security protocol
compression methods. https://tools.ietf.org/html/rfc3749.
[22] R. Housley. RFC 5652: Cryptographic message syntax (CMS).
https://tools.ietf.org/html/rfc5652#section-6.3.
[23] National Institute for Standards and Technology. SP 800-57:
Recommendation for key management - part 1: General (revised).
http://csrc.nist.gov/publications/nistpubs/800-57/sp800-57_part1_rev3_general.pdf.
[24] Andreas Klein. Attacks on the RC4 stream cipher. Des. Codes
Cryptography, 48(3):269–286, September 2008. http://cage.ugent.be/~klein/papers/RC4-en.pdf.
[25] H. Krawczyk and P. Eronen. RFC 5869: HMAC-based extract-and-expand
key derivation function (HKDF). https://tools.ietf.org/html/rfc5869.
[26] Hugo Krawczyk. The order of encryption and authentication
for protecting communications (or: How secure is SSL?), 2001.
http://www.iacr.org/archive/crypto2001/21390309.pdf.
[27] Hugo Krawczyk. Cryptographic extraction and key derivation:
The HKDF scheme. Cryptology ePrint Archive, Report
2010/264, 2010. http://eprint.iacr.org/2010/264.
[28] RSA Laboratories. What key size should be used? http:
//www.emc.com/emc-plus/rsa-labs/standards-initiatives/
key-size.htm.
[29] Moxie Marlinspike. The cryptographic doom principle, 2011.
http://www.thoughtcrime.org/blog/the-cryptographic-doom-principle/.
[30] Joshua Mason, Kathryn Watkins, Jason Eisner, and Adam Stubblefield.
A natural language approach to automated cryptanalysis
of two-time pads. In Proceedings of the 13th ACM Conference
on Computer and Communications Security, CCS '06, pages 235–244,
New York, NY, USA, 2006. ACM. http://www.cs.jhu.edu/~jason/papers/mason+al.ccs06.pdf.
[31] Elke De Mulder, Michael Hutter, Mark E. Marson, and Peter
Pearson. Using Bleichenbacher's solution to the hidden number
problem to attack nonce leaks in 384-bit ECDSA. Cryptology
ePrint Archive, Report 2013/346, 2013. http://eprint.iacr.org/2013/346.pdf.
[32] Phong Q. Nguyen and Igor E. Shparlinski. The insecurity of the
digital signature algorithm with partially known nonces. Journal
of Cryptology, 15:151–176, 2000. ftp://ftp.ens.fr/pub/dmi/users/pnguyen/PubDSA.ps.gz.
[33] Philip Rogaway. OCB - an authenticated-encryption scheme -
licensing. http://www.cs.ucdavis.edu/~rogaway/ocb/license.
htm.
[34] Berry Schoenmakers and Andrey Sidorenko. Cryptanalysis of
the dual elliptic curve pseudorandom generator, 2006. http://
www.cosic.esat.kuleuven.be/wissec2006/papers/21.pdf.
[35] S. Turner and T. Polk. RFC 6176: Prohibiting secure sock-
ets layer (SSL) version 2.0. https://tools.ietf.org/html/
rfc6176.
[36] Serge Vaudenay. Security flaws induced by CBC padding -
applications to SSL, IPSec, WTLS... http://www.iacr.org/cryptodb/archive/2002/EUROCRYPT/2850/2850.pdf.
Glossary
A | B | C | E | G | I | K | M | N | O | P | S
A
AEAD mode
Class of block cipher modes of operation that provides authen-
ticated encryption, as well as authenticating some unencrypted
associated data. 124, 125, 127, 214, 217, 218
asymmetric-key algorithm
See public-key algorithm. 214, 219
asymmetric-key encryption
See public-key encryption. 214
B
block cipher
Symmetric encryption algorithm that encrypts and decrypts
blocks of fixed size. 33, 214, 215
C
Carter-Wegman MAC
Reusable message authentication code scheme built from a one-time
MAC. Combines benefits of performance and ease of use.
122, 127, 214, 217
CBC mode
Cipher block chaining mode; common mode of operation where
the previous ciphertext block is XORed with the plaintext block
during encryption. Takes an initialization vector, which assumes
the role of the block before the first block. 49, 51, 55,
76, 214, 217
CTR mode
Counter mode; a nonce combined with a counter produces a
sequence of inputs to the block cipher; the resulting ciphertext
blocks are the keystream. 73, 74, 76, 214, 217
E
ECB mode
Electronic code book mode; mode of operation where plaintext
is separated into blocks that are encrypted separately under the
same key. The default mode in many cryptographic libraries,
despite many security issues. 42, 45, 48, 49, 113, 214
encryption oracle
An oracle that will encrypt some data. 45, 48, 214
G
GCM mode
Galois counter mode; AEAD mode combining CTR mode
with a Carter-Wegman MAC. 214, 217
GMAC
Message authentication code part of GCM mode used sepa-
rately. 127, 214
I
initialization vector
Data used to initialize some algorithms such as CBC mode.
Generally not required to be secret, but required to be unpre-
dictable. Compare nonce, salt. 50, 51, 75, 214, 218, 220, 222
K
key agreement
See key exchange. 214
key exchange
The process of exchanging keys across an insecure medium using
a particular cryptographic protocol. Typically designed to be
secure against eavesdroppers. Also known as key agreement.
191, 214, 217, 219
M
message authentication code
Small piece of information used to verify authenticity and in-
tegrity of a message. Often called a tag. 214, 216218
mode of operation
Generic construction that encrypts and decrypts streams, built
from a block cipher. 39, 49, 74, 214, 215
N
nonce
Number used once. Used in many cryptographic protocols.
Generally does not have to be secret or unpredictable, but does
have to be unique. Compare initialization vector, salt. 69, 74,
75, 132, 214, 216, 217, 220
O
OCB mode
Offset codebook mode; high-performance AEAD mode, unfortunately
encumbered by patents. 214
one-time MAC
Message authentication code that can only be used securely for
a single message. Main benefit is increased performance over
re-usable MACs. 214, 216
oracle
A black box that will perform some computation for you. 45,
214, 216
OTR messaging
Off-the-record messaging, a messaging protocol that intends to
mimic the properties of a real-life private conversation. Piggybacks
onto existing instant messaging protocols. 214
P
public-key algorithm
Algorithm that uses a pair of two related but distinct keys.
Also known as asymmetric-key algorithms. Examples include
public-key encryption and most key exchange protocols. 90,
214, 215
public-key encryption
Encryption using a pair of distinct keys for encryption and decryption.
Also known as asymmetric-key encryption. Contrast
with secret-key encryption. 89–91, 96, 130, 182, 191, 214, 215,
219, 220
S
salt
Random data that is added to a cryptographic primitive (usually
a one-way function such as a cryptographic hash function or a
key derivation function). Customizes such functions to produce
different outputs (provided the salt is different). Can be used
to prevent e.g. dictionary attacks. Typically does not have to be
secret, but secrecy may improve the security properties of the system.
Compare nonce, initialization vector. 103, 137, 139, 140,
214, 217, 218
secret-key encryption
Encryption that uses the same key for both encryption and de-
cryption. Also known as symmetric-key encryption. Contrast
with public-key encryption. 89, 90, 214, 219, 220
stream cipher
Symmetric encryption algorithm that encrypts streams of arbi-
trary size. 26, 49, 62, 63, 72, 74, 214
symmetric-key encryption
See secret-key encryption. 214
Acronyms
A | B | C | D | F | G | H | I | K | M | O | P
A
AEAD
Authenticated Encryption with Associated Data. 124, 125, 214
AES
Advanced Encryption Standard. 37, 186, 214
B
BEAST
Browser Exploit Against SSL/TLS. 51, 214
C
CBC
Cipher Block Chaining. 214
D
DES
Data Encryption Standard. 38, 151, 214
F
FIPS
Federal Information Processing Standards. 37, 38, 214
G
GCM
Galois Counter Mode. 214
H
HKDF
HMAC-based (Extract-and-Expand) Key Derivation Func-
tion. 139, 214
HSTS
HTTP Strict Transport Security. 176, 214
I
IV
initialization vector. 50, 75, 214
K
KDF
key derivation function. 214
M
MAC
message authentication code. 109, 214, 218
O
OCB
offset codebook. 125, 214
OTR
off-the-record. 185, 214
P
PRF
pseudorandom function. 214
PRP
pseudorandom permutation. 214