George Crypto Notes
George Kudrayvtsev
[email protected]
Contents

0 Preface
1 Introduction

I Symmetric Cryptography

2 Perfect Security
  2.1 Notation & Syntax
  2.2 One-Time Pads
    2.2.1 The Beauty of XOR
    2.2.2 Proving Security
3 Block Ciphers
  3.1 Modes of Operation
    3.1.1 ECB—Electronic Code Book
    3.1.2 CBC—Cipher-Block Chaining
    3.1.3 CBCC—Cipher-Block Chaining with Counter
    3.1.4 CTR—Randomized Counter Mode
    3.1.5 CTRC—Stateful Counter Mode
  3.2 Security Evaluation
    3.2.1 IND-CPA: Indistinguishability Under Chosen-Plaintext Attacks
    3.2.2 IND-CPA-cg: A Chosen Guess
    3.2.3 What Makes Block Ciphers Secure?
    3.2.4 Random Functions
    3.2.5 IND-CCA: Indistinguishability Under Chosen-Ciphertext Attacks
  3.3 Summary
5 Hash Functions
  5.1 Collision Resistance
  5.2 Building Hash Functions
  5.3 One-Way Functions
  5.4 Hash-Based MACs
6 Authenticated Encryption
  6.1 INT-CTXT: Integrity of Ciphertexts
  6.2 Generic Composite Schemes
    6.2.1 Encrypt-and-MAC
    6.2.2 MAC-then-encrypt
    6.2.3 Encrypt-then-MAC
    6.2.4 In Practice. . .
    6.2.5 Dedicated Authenticated Encryption
  6.3 AEAD: Associated Data
    6.3.1 GCM: Galois/Counter Mode
7 Stream Ciphers
  7.1 Generators
    7.1.1 PRGs for Encryption
  7.2 Evaluating PRGs
  7.3 Creating Stream Ciphers
    7.3.1 Forward Security
    7.3.2 Considerations

II Asymmetric Cryptography

9 Overview
  9.1 Notation
  9.2 Security Definitions
10 Number Theory
  10.1 Groups
  10.2 Modular Arithmetic
    10.2.1 Running Time
    10.2.2 Inverses
    10.2.3 Modular Exponentiation
  10.3 Groups for Cryptography
    10.3.1 Discrete Logarithm
    10.3.2 Constructing Cyclic Groups
  10.4 Modular Square Roots
    10.4.1 Square Groups
    10.4.2 Square Root Extraction
  10.5 Chinese Remainder Theorem
11 Encryption
  11.1 Recall: The Discrete Logarithm
    11.1.1 Formalization
    11.1.2 Difficulty
  11.2 Diffie-Hellman Key Exchange
  11.3 ElGamal Encryption
    11.3.1 Security: IND-CPA
    11.3.2 Security: IND-CCA
  11.4 Cramer-Shoup Encryption
  11.5 RSA Encryption
    11.5.1 Protocol
    11.5.2 Limitations
    11.5.3 Securing RSA
  11.6 Hybrid Encryption
  11.7 Multi-User Encryption
    11.7.1 Security Definitions
    11.7.2 Security Evaluation
  11.8 Scheme Variants
12 Digital Signatures
  12.1 Security Definitions
  12.2 RSA
    12.2.1 Plain RSA
    12.2.2 Full-Domain Hash RSA
    12.2.3 Probabilistic Signature Scheme
  12.3 ElGamal
  12.4 Digital Signature Algorithm
  12.5 Schnorr Signatures
  12.6 Scheme Variants
    12.6.1 Simple Multi-Signature Scheme
    12.6.2 Simple Blind Signature Scheme
  12.7 Signcryption
14 Epilogue
  15.1 OWFs to PRGs
    15.1.1 Extension by 1 Bit
    15.1.2 Arbitrary Extension
  15.2 PRGs to PRFs
  15.3 Conclusion
    15.3.1 References
16 Commitments
  16.1 Formalization
  16.2 Pedersen Commitments
    16.2.1 Proof of Properties
Preface
I read that Teddy Roosevelt once said, “Do what you can with
what you have where you are.” Of course, I doubt he was in the
tub when he said that.
— Bill Watterson, The Days are Just Packed
These are not official course materials; they were created throughout my time
taking the course, so there cannot be any guarantees about the correctness of the
content. You can refer to some official resources like Bellare & Rogaway’s notes
or Boneh & Shoup’s course for a better, deeper look at the concepts I cover here.
If you encounter typos; incorrect, misleading, or poorly-worded information; or
simply want to contribute a better explanation or extend a section, please raise
an issue on my notes’ GitHub repository.
Before we begin to dive into all things cryptography, I’ll enumerate a few things I do
in this notebook to elaborate on concepts:
• An item that is highlighted like this is a “term;” this is some vocabulary
that will be used and repeated regularly in subsequent sections. I try to cross-
reference these any time they come up again to link back to their first defined
usage; most mentions are available in the Index.
• An item that is highlighted like this is a “mathematical property;” such
properties are often used in subsequent sections and their understanding is
assumed there.
• An item in a maroon box, like. . .
I also sometimes include margin notes like the one here (which just links back here)
that reference content sources so you can easily explore the concepts further.
Introduction
Cryptography is hard.
— Anonymous
The purpose of a cryptographic scheme falls into three very distinct categories. A
common metaphor used to explain these concepts is a legal document.
• confidentiality ensures content secrecy—that it can’t be read without knowl-
edge of some secret. In our example, this would be like writing the document
in a language nobody except you and your recipient understand.
• authenticity guarantees content authorship—that its author can be irrefutably
proven. In our example, this is like signing the original document in pen (as-
suming, of course, your signature was impossible to forge).
• integrity guarantees content immutability—that it has not been changed. In
our example, this could be that you get an emailed copy of the signed document
to ensure that its language cannot be changed post-signing.
Note that even though all three of these properties can go hand-in-hand, they are
not mutually constitutive. You can have any of them without the others: you can
just get a copy of an unsigned document sent to you in plain English to ensure its
integrity later down the line.
cryptography = crypto (“secret”) + graphy (“writing”)
• Alice and Bob are the most common sender-recipient pairing. They
are generally acting in good faith and aren’t trying to break the cryp-
tographic scheme in question. If a third member is necessary, Carol
will enter the fray (for consistency of the allusion to Lewis Carroll’s
Alice in Wonderland ).
• Eve and Mallory are typically the two members trying to break the
scheme. Eve is a passive attacker (short for eavesdropper) that merely
observes messages between Alice and Bob, whereas malicious Mallory
is an active attacker who can capture, modify, and inject her own
messages into exchanges between other members.
You can check out the Wikipedia article on the topic for more historic trivia
and the full cast of characters.
PART I
Symmetric Cryptography
The notion of symmetric keys comes from the fact that both the sender and receiver
of encrypted information share the same secret key, K. This secret is the only thing
that separates a viable receiver from an attacker.
(Figure: Alice and Bob each hold the shared secret key K; Eve listens in on the channel between them.)
Symmetric key algorithms are often very efficient and supported by hardware, but
their fatal flaw lies in key distribution. If two parties need to share a secret without
anyone else knowing, how do they get it to each other without already having a secure
channel? That’s the job of Part II: asymmetric cryptography.
Contents
2 Perfect Security
3 Block Ciphers
5 Hash Functions
6 Authenticated Encryption
7 Stream Ciphers
Perfect Security
An encryption scheme defines the message space and three algorithms: (MsgSp, E, D, K).
The key generation algorithm often just pulls a random n-bit string from the entire
{0, 1}n bit space; to describe this action, we use the notation K ←$ KeySp. The encryption
algorithm is often randomized (taking random input in addition to (K, M)) and stateful.
We’ll see deeper examples of all of these shortly.
One-Time Pads
The simplest scheme with these pieces is the one-time pad (OTP), which encrypts by XORing the message with an equally-long random key:
E(K, M ) = M ⊕ K
D(K, C) = C ⊕ K
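As a minimal sketch of my own (not from the lecture notes), the one-time pad is a few lines of Python:

import os

def otp(key: bytes, msg: bytes) -> bytes:
    # The key must be uniformly random, at least as long as the message,
    # and never reused; decryption is the exact same XOR.
    assert len(key) >= len(msg)
    return bytes(k ^ m for k, m in zip(key, msg))

msg = b"attack at dawn"
key = os.urandom(len(msg))          # K drawn uniformly from {0,1}^n
ct = otp(key, msg)
assert otp(key, ct) == msg          # C XOR K = (M XOR K) XOR K = M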
Can we be a little more specific with this notion of “perfect encryption”? Intuitively, a
secure encryption scheme should reveal nothing to adversaries who have access to the
ciphertext. Formally, this notion is called being Shannon-secure (and is also referred
to as perfect security): the probability of a ciphertext occurring should be equal
for any two messages.
∀m1 , m2 ∈ MsgSp, ∀C :
Pr [E(K, m1 ) = C] = Pr [E(K, m2 ) = C] (2.1)
That is, the probability of a ciphertext C must be equally-likely for any two
messages that are run through E.
Note that this doesn’t just mean that a ciphertext occurs with equal probability
for a particular message, but rather that any message can map to any ciphertext
with equal probability. It’s often necessary but not sufficient to show that a specific
message maps to a ciphertext with equal probability under a given key; additionally,
it’s necessary to show that all ciphertexts can be produced by a particular message
(perhaps by varying the key).
Shannon security can also be expressed as a conditional probability, where every message
is equally probable given a ciphertext (i.e. the message is independent of the ciphertext):
∀m ∈ MsgSp, ∀C :
Pr [M = m | C] = Pr [M = m]
Are one-time pads Shannon-secure under these definitions? Yes, thanks to XOR.
x y x⊕y
1 1 0
1 0 1
0 1 1
0 0 0
Table 2.1: The truth table for XOR.
Suppose you have some c = 0 (where c ∈ {0, 1}1 ); what was the input bit m? Well
it could’ve been 1 and been XOR’d with 1 OR it could’ve been 0 and been XOR’d
with 0. . . Knowing c gives us no new information about the input: our guess is still
as good as random chance (1/2 = 50%).
Now suppose you know that c = 1; are your odds any better? In this case, m could’ve
been 1 and been XOR’d with 0 OR it could’ve been 0 and XOR’d with 1. . . Again,
we can’t do better than random chance.
By the very definition of being Shannon-secure, if we (as the attacker) can’t do better
than random chance when given a ciphertext, the scheme is perfectly secure.
Proof. We start by fixing an arbitrary n-bit ciphertext: C ∈ {0, 1}n . We also choose
a fixed n-bit message, m ∈ MsgSp. Then, what’s the probability that a randomly-
generated key k ∈ KeySp will encrypt that message to be that ciphertext? Namely,
what is
Pr [E(K, m) = C]
for our fixed m and C? In other words, how many keys can turn m into C?
By the definition of the OTP, we know that this can only be true for a single
key: K = m ⊕ C. Since every bit counts, and the probability of a single bit in
the key being “right” is 1/2:
Pr [E(K, m) = C] = Pr [K = m ⊕ C]
                 = 1/2 · 1/2 · . . . · 1/2    (n times)
                 = 1/2^n
Note that this is true ∀m ∈ MsgSp, which fulfills the requirement for perfect security!
Every message is equally likely to result in a particular ciphertext.
The problem with OTPs is that keys can only be used once. If we’re going to go
through the trouble of securely distributing OTPs,2 we could just exchange the mes-
sages themselves at that point in time. . .
Let’s look at what happens when we use the same key across two messages. From the
scheme itself, we know that Ci = K ⊕ Mi . Well if we also have Cj = K ⊕ Mj , then:
Ci ⊕ Cj = (K ⊕ Mi ) ⊕ (K ⊕ Mj )
= (K ⊕ K) ⊕ (Mi ⊕ Mj ) XOR is associative
= Mi ⊕ Mj a⊕a=0
Though this may seem like insignificant information, it actually can reveal quite a bit
about the inputs, and eventually the entire key if it’s reused enough times.
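A quick illustration of this (my own sketch): reusing a pad immediately leaks the XOR of the two plaintexts, which is devastating when either message is partially predictable.

import os

def otp(key, msg):
    return bytes(k ^ m for k, m in zip(key, msg))

key = os.urandom(16)
m_i = b"PAY ALICE $00100"
m_j = b"PAY ALICE $99999"
c_i, c_j = otp(key, m_i), otp(key, m_j)

leak = bytes(a ^ b for a, b in zip(c_i, c_j))          # equals m_i XOR m_j; no key needed
recovered = bytes(a ^ b for a, b in zip(leak, m_i))    # guessing m_i reveals m_j exactly
assert recovered == m_j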
An important corollary of perfect security is what’s known as the impossibility
result (also referred to as the optimality of the one-time pad when used in that
context):
|KeySp| ≥ |MsgSp|
2
One could envision a literal physical pad in which each page contained a unique bitstring; if two
people shared a copy of these pads, they could communicate securely until the bits were exhausted
(or someone else found the pad). Of course, if either of them lost track of where they were in the
pad, everything would be gibberish from then on. . .
We know for a fact, then, that at least one key exists that can craft C; thus if we pick
a key K ∈ KeySp at random, there’s a non-zero probability that we’d get C again:
Pr [E(K, m1 ) = C] > 0
Suppose then that there is a message m2 ∈ MsgSp which we can never get from decrypting C:
Pr [D(K, C) = m2] = 0   ∀K ∈ KeySp
But then Pr [E(K, m2) = C] = 0 while Pr [E(K, m1) = C] > 0, so
Pr [E(K, m1) = C] ≠ Pr [E(K, m2) = C]
which contradicts perfect security. Thus, our assumption is wrong: such an m2 cannot exist!
Meaning, for every message m2 there must be some K2 ∈ KeySp that decrypts C to it:
D(K2, C) = m2. Thus, it must be the case that there are at least as many keys as there are messages.
Ideally, we’d like to encrypt long messages using short keys, yet this theorem shows
that we cannot be perfectly-secure if we do so. Does that indicate the end of this
chapter? Thankfully not. If we operate under the assumption that our adversaries
are computationally-bounded, it’s okay to relax the security requirement and make
breaking our encryption schemes very, very unlikely. Though we won’t have perfect
secrecy, we can still do extremely well.
We will create cryptographic schemes that are computationally-secure under Kerckhoff’s
principle, which effectively states that everything about a scheme should be
publicly-available except for the secret key(s).
Block Ciphers
Formally, a block cipher is a function family that maps from a k-bit key
and an n-bit input string to an n-bit output string:
E : {0, 1}^k × {0, 1}^n → {0, 1}^n
Additionally, ∀K ∈ {0, 1}k, EK (·) is a permutation on {0, 1}n. This means its inverse
is well-defined; we denote it either as EK−1(·) or the much more intuitive DK (·).
In a similar vein, ciphertexts are unique, so ∀C ∈ {0, 1}n , there exists a single M
such that C = EK (M ).
3.1.1 ECB—Electronic Code Book
In ECB mode, each message block is encrypted independently:
C[i] = EK (M [i])
M [i] = DK (C[i])
(Figure: ECB mode—every message block is run through EK on its own.)
This mode of operation has a fatal flaw that greatly compromises its security: if two
message blocks are identical, the ciphertexts will be as well. Furthermore, encrypting
the same long message will result in the same long ciphertext. This mode of operation
is never used, but it’s useful to present here to highlight how we’ll fix these flaws in
later modes.
3.1.2 CBC—Cipher-Block Chaining
(Figure 3.2: CBC mode—each plaintext block is XOR’d with the previous ciphertext block, starting from a random IV, before being run through EK.)
Each message block is first chained via XOR with the previous ciphertext before being
run through the encryption algorithm. Similarly, the ciphertext is run through the
inverse then XOR’d with the previous ciphertext to decrypt. That is,
C[i] = EK (M [i] ⊕ C[i − 1])
M [i] = DK (C[i]) ⊕ C[i − 1]
(where the base case is C[0] = IV ).
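As a rough sketch of the chaining (mine, not the notes’), using AES from the third-party pycryptodome package as the block cipher EK:

import os
from Crypto.Cipher import AES      # pycryptodome; any secure block cipher would do

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(key, blocks):
    # `blocks` is a list of 16-byte message blocks M[1..m].
    E = AES.new(key, AES.MODE_ECB).encrypt
    iv = os.urandom(16)                       # C[0] = IV, sent in the clear
    prev, out = iv, [iv]
    for m in blocks:
        prev = E(xor(m, prev))                # C[i] = E_K(M[i] XOR C[i-1])
        out.append(prev)
    return out

def cbc_decrypt(key, cblocks):
    D = AES.new(key, AES.MODE_ECB).decrypt
    return [xor(D(c), prev)                   # M[i] = D_K(C[i]) XOR C[i-1]
            for prev, c in zip(cblocks, cblocks[1:])]

key = os.urandom(16)
msg = [b"sixteen byte blk", b"another 16 bytes"]
assert cbc_decrypt(key, cbc_encrypt(key, msg)) == msg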
The IV can be sent out in the clear, unencrypted, because it doesn’t contain any secret
information in-and-of itself. If Eve intercepts it, she can’t do anything useful with it;
if Mallory modifies it, the decrypted plaintext will be gibberish and the recipient will
know something is up.
However, if an initialization vector is repeated, there can be information leaked to
keen attackers about the underlying plaintext.
The downside of these two algorithms is not a property of security but rather of
performance. Because every block depends on the outcome of the previous block,
both encryption and decryption must be done in series. This is in contrast with. . .
3.1.4 CTR—Randomized Counter Mode
Unlike the chained modes above, counter mode does not strictly need to use a block
cipher as its fundamental primitive.1 Specifically, the encryption function does not
need to be invertible. Whereas before we used EK as a mapping from a k-bit key
and an n-bit string to an n-bit string (see Definition 2.2), we can now use a function
that instead maps an l-bit input to an L-bit output:
F : {0, 1}^k × {0, 1}^l → {0, 1}^L
(Figure 3.4: CTR mode—the counter values R + 1, R + 2, . . . are run through FK and the results are XOR’d with the message blocks.)
This is because both the encryption and decryption schemes use FK directly. They
rely on a randomly-generated value R as fuel, much like the IV in the CBC modes.2
Notice that to decrypt C[i] in Figure 3.4, one needs to first determine FK (R + i),
then XOR that with the ciphertext to get M [i]. The plaintext is never run through the
encryption algorithm at all; instead, FK (R + i) is used as a one-time pad for M [i].
That is,
C[i] = M [i] ⊕ FK (R + i)
M [i] = C[i] ⊕ FK (R + i)
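A minimal sketch of the mode (mine); here FK is just HMAC-SHA256 truncated to 16 bytes, a stand-in PRF chosen to keep the example dependency-free and to emphasize that no invertibility is required:

import hmac, hashlib, os

def F(K, x):
    # Stand-in PRF; it is *not* a permutation, and CTR mode doesn't care.
    return hmac.new(K, x, hashlib.sha256).digest()[:16]

def ctr_crypt(K, R, data):
    # Encryption and decryption are the same operation: XOR with F_K(R+1), F_K(R+2), ...
    out = bytearray()
    for i in range(0, len(data), 16):
        pad = F(K, ((R + i // 16 + 1) % 2**128).to_bytes(16, "big"))
        out += bytes(a ^ b for a, b in zip(data[i:i + 16], pad))
    return bytes(out)

K = os.urandom(16)
R = int.from_bytes(os.urandom(16), "big")     # random starting counter, sent in the clear
msg = b"counter mode handles any message length"
assert ctr_crypt(K, R, ctr_crypt(K, R, msg)) == msg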
Note that in all of these schemes, the only secret is K (F and E are likely standardized
and known).
1
In practice, though, FK will generally be a block cipher. Even though this property is noteworthy,
it does not offer any additional security properties.
2
In fact, I’m not sure why the lecture uses R here instead of IV, which would maintain consistency.
They are mathematically the same: both R and IV are pulled from {0, 1}n.
3.1.5 CTRC—Stateful Counter Mode
(Figure 3.5: CTRC mode—identical to CTR, except the counter is maintained as state across messages rather than being chosen at random each time.)
Though this informality is not useful enough to prove things about encryption schemes
we encounter, it’s enough to give us intuition on the formal definition ahead.
3
Any information except the length of the plaintexts; this knowledge is assumed to be public.
(Figure: the IND-CPA experiment—the adversary A submits message pairs (m0, m1) to the left-right oracle LR(m0, m1, b), which encrypts mb under EK(·) and hands the ciphertext back to A.)
The adversary does not know the value of b, and thus does not know which of the
messages was encrypted; it’s their goal to figure this out, given full access to the oracle.
We say that an encryption scheme is secure if the adversary’s ability to determine
which experiment the ciphertexts came from is no better than random chance.
long they are. We might be willing to make certain compromises of security if, for
example, the attacker needs a 2^512-length message to gain an advantage.
With our new formal definition of security under our belt, let’s take a crack at break-
ing the various Modes of Operation we defined. If we can provide an algorithm
that demonstrates a reasonable advantage for an adversary that requires reasonable
resources, we can show that a scheme is not secure under IND-CPA.
Analysis of ECB
This was clearly the simplest and weakest of the schemes we outlined. The lack of
randomness makes gaining an advantage trivial: which message was encrypted can be
determined by submitting one plaintext with repeating blocks and one without.
Algorithm 3.1: A simple algorithm for breaking the ECB block cipher mode.
C1 ∥ C2 ← EK (LR(0^2n, 0^n ∥ 1^n, b))
if C1 = C2 then
    return 0
end
return 1
The attack can be generalized to give the adversary perfect knowledge for any input
plaintext, and it leads to an important corollary.
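To see the flaw concretely (an illustration of mine, again leaning on pycryptodome’s AES as the block cipher):

import os
from Crypto.Cipher import AES

key = os.urandom(16)
ecb = AES.new(key, AES.MODE_ECB)

repeating = b"A" * 16 + b"A" * 16       # two identical plaintext blocks
distinct  = b"A" * 16 + b"B" * 16
c1, c2 = ecb.encrypt(repeating), ecb.encrypt(distinct)
assert c1[:16] == c1[16:]               # identical blocks leak straight through ECB
assert c2[:16] != c2[16:]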
Analysis of CBCC
Turns out, counters are far harder to “get right” relative to random initialization
vectors: their predictable nature means we can craft messages that are effectively
deterministic by replicating the counter state. Namely, if we pre-XOR our plaintext
with the counter, the first ciphertext block functions the same way as in ECB.
The first message lets us identify the counter value. The second message lets us craft
a “post-counter” message that will be equal to the third message.
the start and encoded within b. There is now only one experiment: if the attacker’s
guess matches (b′ = b), the experiment returns 1.
(Figure 3.7: the “chosen guess” variant of IND-CPA security—b ←$ {0, 1} is chosen up front, the adversary A queries the left-right oracle LR(m0, m1, ·) backed by EK(·), and eventually outputs a guess b′; the experiment returns 1 if b′ = b.)
A scheme SE is still only considered secure under the “chosen guess” variant
of IND-CPA if an adversary’s IND-CPA-cg advantage is small; this advantage is
now instead defined as:
Adv^ind-cpa-cg_SE (A) = 2 · Pr [the cg experiment returns 1] − 1
The two variants on attacker advantage in Definition 2.3 and the new Definition 2.4
can be proven equal.
Claim 3.1. Advind-cpa (A) = Advind-cpa-cg (A) for some encryption scheme SE.
Proof. The probability of the cg experiment returning 1 (that is, the attacker guessing
b′ = b correctly) can be expressed as conditional probabilities. Remember that
b ←$ {0, 1} with uniformly-random probability.
Pr [experiment-cg returns 1] = Pr [b = b′]
  = Pr [b = b′ | b = 0] · Pr [b = 0] + Pr [b = b′ | b = 1] · Pr [b = 1]
  = Pr [b′ = 0 | b = 0] · 1/2 + Pr [b′ = 1 | b = 1] · 1/2
  = 1/2 · Pr [b′ = 0 | b = 0] + 1/2 · (1 − Pr [b′ = 0 | b = 1])
  = 1/2 + 1/2 · (Pr [b′ = 0 | b = 0] − Pr [b′ = 0 | b = 1])
Notice the expression in parentheses: the difference between the probability of the
attacker guessing 0 correctly (that is, when it really is 0) and incorrectly. This is
exactly Definition 2.3: advantage under the normal IND-CPA definition! Thus:
Pr [exp-cg returns 1] = 1/2 + 1/2 · (Pr [b′ = 0 | b = 0] − Pr [b′ = 0 | b = 1])
                      = 1/2 + 1/2 · Adv^ind-cpa (A)
so that
2 · Pr [exp-cg returns 1] − 1 = Adv^ind-cpa (A)
Adv^ind-cpa-cg (A) = Adv^ind-cpa (A)    (3.1)
Modern block ciphers like AES use at least 128-bit keys (though 192 and
256-bit options are available) which is considered secure from exhaustive
search.
The now-outdated block cipher DES (invented in the 1970s) had a 56-bit key
space, and it had a particular property that could speed up exhaustive search
by a factor of two. This means exhaustive key-search on DES takes ≈ 2^54
operations, which took about 23 years on a 25MHz processor (fast at the
time of DES’ inception). By 1999, the key could be found in only 22 hours.
The improved triple-DES or 3DES block cipher used 112-bit keys, but it
too was abandoned in favor of AES for performance reasons: doing three
DES computations proved to be too slow for efficient practical use.
Obviously a block cipher is not necessarily secure just because exhaustive key-search
is not feasible. We now aim to define some measure of security for a block cipher.
Why can’t we just use IND-CPA? Well, a block cipher is deterministic by definition,
and as we saw in Theorem 3.1, a deterministic scheme cannot be IND-CPA secure. Thus
our definition is too strong! We need something weaker for block ciphers that still
lets us avoid all possible information leaks: nothing about the key, nothing about the
plaintexts (or some property of the plaintexts), etc. should be revealed.
We will say that a block cipher is secure if its output ciphertexts “look” random; more
precisely, it’d be secure if an attacker can’t differentiate its output from a random
function. Well. . . that requires a foray into random functions.
PRF Security
The security of a block cipher depends on whether or not an attacker can differentiate
between it and a random function. Like with IND-CPA, we have two experiments. In
the first experiment, the attacker gets the output of the block cipher E with a fixed
K ∈ KeySp; in the second, it’s a random function g chosen from the PRF matching
the domain and range of E.
(Figure: in Experiment 1 (“real”), A queries the block cipher EK; in Experiment 0 (“random”), A queries a random function g. In either case, A outputs a bit b.)
The attacker outputs their guess, b, which should be 1 if they think they’re being
fed outputs from the real block cipher and 0 if they think it’s random. Then, their
“advantage” is how much more often the attacker can guess correctly.
For AES, the PRF advantage is very small and it’s conjectured (not proven) to be
PRF secure. Specifically, for running time t and q queries,
Adv^prf_AES (A) ≤ (c · t / T_AES) · 2^−128 + q^2 · 2^−128    (3.2)
where the first term corresponds to exhaustive key-search and the second to the birthday paradox.
We will use this as an upper bound when calling a function F PRF secure.
The second term comes from an interesting attack that can be applied to all block
ciphers known as the birthday paradox. Recall that block ciphers are permutations,
so for distinct messages, you always get distinct ciphertexts. The attack is simple: if
you feed the PRF security oracle q distinct messages and get q distinct ciphertexts,
you output b = 1; otherwise, you output b = 0. The only way you get < q distinct
ciphertexts is from a g that isn’t one-to-one. The probability of this happening for a
specific pair of inputs is the probability of algorithm 3.4 picking the same bitstring for two xs, so 2^−L.
Suppose you’re at a house party with 50 other people. What’re the chances
that two people at that party share the same birthday? Turns out, it’s
really, really high: 97%, in fact!
The birthday paradox is the counterintuitive idea that despite the fact that
YOU are unlikely to share a birthday with any particular person, the chance of
SOME two people sharing a birthday is actually extremely high.
In the context of cryptography, this means that as the number of outputs
generated by a random function g increases, the probability of SOME two
inputs resolving to the same output increases much faster.
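A quick computation (mine) backs up the numbers used here and again in the hashing chapter:

def birthday_collision_prob(people, days=365):
    # P(at least one shared birthday) = 1 - P(all birthdays distinct)
    p_distinct = 1.0
    for i in range(people):
        p_distinct *= (days - i) / days
    return 1 - p_distinct

print(birthday_collision_prob(51))   # you plus 50 others: ~0.974
print(birthday_collision_prob(50))   # a 50-person party:  ~0.970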
if p =⇒ q
then ¬q =⇒ ¬p
With that in mind, let’s prove CTRC’s security. To be verbose, the statements we’re
aiming to prove are:
However, since we’re approaching this via the contrapositive, we’ll instead prove
Theorem 3.2. CTRC is a secure mode of operation if its underlying block cipher
is secure.
More formally, for any efficient adversary A, ∃B with similar efficiency such
that the IND-CPA advantage of A under CTRC mode is less than double the
PRF advantage of B under a secure block cipher F :
Adv^ind-cpa_CTRC (A) ≤ 2 · Adv^prf_F (B)
where, for an example of a secure block cipher such as F = AES, we know that any B’s
advantage will be very small (see (3.2)).
(Figure: the reduction—B answers A’s left-right oracle queries by running CTRC, but the underlying function calls are forwarded to B’s own PRF oracle, which is backed by either FK or a random function g; b ←$ {0, 1}, and A eventually outputs its guess b′.)
• Namely, B lets A make oracle queries to CTRC until it guesses b correctly. This
is valid because B still delegates to a PRF oracle which is choosing between a
random function g and the block cipher FK (where K is still secret) for the
actual block cipher; everything else is done exactly as described for CTRC in
Figure 3.5.
• This construction lets us leverage the fact that A knows how to break CTRC-
encrypted messages, but we don’t need to know how. For the pseudocode
describing this process, refer to algorithm 3.5.
Now let’s analyze B, expressing its PRF advantage over F in terms of A’s IND-CPA
advantage over CTRC.5 The ability for B to differentiate between F and some random
function g ∈ Func(`, L) depends entirely on A’s ability to differentiate between CTRC
with an actual block cipher F and a truly-random function g. Thus,
Adv^prf_F (B) = Pr [B → 1 in Exp^prf-1_F] − Pr [B → 1 in Exp^prf-0_F]                         (definition)
             = Pr [Exp^ind-cpa-cg_CTRC[F] → 1] − Pr [Exp^ind-cpa-cg_CTRC[g] → 1]              (B depends only on A)
             = (1/2 · Adv^ind-cpa_CTRC[F] (A) + 1/2) − (1/2 · Adv^ind-cpa_CTRC[g] (A) + 1/2)  (IND-CPA is equal to IND-CPA-cg via (3.1))
5
The syntax X → n means the adversary X outputs the value n, and the syntax Exp^n_m refers to
experiment n under some parameter or scheme m, for shorthand.
Notice that the inputs to g are all distinct points, and by definition of a truly-random
function its outputs are truly-random bitstrings. These are then XOR’d with mes-
sages. . . sound familiar? The outputs of g are distinct one-time pads and thus each
C[i] is Shannon-secure, meaning an advantage is simply impossible by definition.
The theorem claim can then be trivially massaged out:
Adv^prf_F (B) = (1/2 · Adv^ind-cpa_CTRC[F] (A) + 1/2) − (1/2 · Adv^ind-cpa_CTRC[g] (A) + 1/2)
             = 1/2 · Adv^ind-cpa_CTRC (A)        (the Adv^ind-cpa_CTRC[g] (A) term is 0)
2 · Adv^prf_F (B) = Adv^ind-cpa_CTRC (A)
Proving Security: CTR Recall that the difference between CTRC (which we
just proved was secure) and standard CTR is the use of a random IV rather than a
counter (see Figure 3.4). It’s also provably IND-CPA secure, but we’ll state its security
level without proof:6
Theorem 3.3. CTR is a secure mode of operation if its underlying block cipher is
secure. More formally, for any efficient adversary A, ∃B with similar efficiency
such that:
Adv^ind-cpa_CTR (A) ≤ 2 · Adv^prf_F (B) + µ_A^2 / (ℓ · 2^ℓ)
where µ is the total number of bits A sends to the oracle.
It’s still secure because ℓ ≥ 128 for secure block ciphers, making the extra term near-zero.
Proving bounds on security is very useful: we can see here that CTRC mode is
better than CTR mode because there is no additional term.
There is a similar theorem for CBC mode (see Figure 3.2), the last mode of operation
whose security we haven’t formalized.
Theorem 3.4. CBC is a secure mode of operation if its underlying block cipher is
secure. More formally, for any efficient adversary A, ∃B with similar efficiency
such that:
Adv^ind-cpa_CBC (A) ≤ 2 · Adv^prf_F (B) + µ_A^2 / (n^2 · 2^n)
where µ is the total number of bits A sends to the oracle.
We can see that n^2 > ℓ when comparing CBC to CTR, meaning the extra term will be
smaller for the same µ. Thus, CTRC is more secure than CBC, which in turn is more secure
than CTR. The extra term again comes from the birthday paradox.
6
Feel free to refer to the lecture video to see the proof. In essence, the fact that the value is chosen
randomly means it’s possible that, for enough Rs and messages, there will be overlap between some Ri + m
and Rj + n. This will result in identical “one-time pads,” though thankfully it occurs with a very
small probability (it’s related to the birthday paradox).
This isn’t a far-fetched possibility,7 and it has historic precedent in being a viable
attack vector. Since IND-CPA does not cover this vector, we need a stronger definition
of security: the attacker needs more power. With IND-CCA, the adversary A has
access to two oracles: the left-right encryption oracle, as before, and a decryption
oracle.
(Figure 3.8: the IND-CCA experiment—in addition to the left-right encryption oracle LR(m0, m1, b) backed by EK(·), the adversary A may submit ciphertexts C′ of its choosing to a decryption oracle DK(·) and receive the corresponding plaintexts M′.)
The only restriction on the attacker is that they cannot query the decryption oracle
on ciphertexts returned from the encryption oracle (obviously, that would make
determining b trivial); in Figure 3.8, this means C ≠ C′. As before, a scheme is
considered IND-CCA secure if an adversary’s advantage is small.
considered IND-CCA secure if an adversary’s advantage is small.
Note that since IND-CCA is stronger than IND-CPA, the former implies
the latter. This is trivially-provable by reduction, so we won’t show it here.
Unfortunately, none of our IND-CPA schemes are also secure under IND-CCA.
7
Imagine reverse-engineering an encrypted messaging service like iMessage to fully understand
its encryption scheme, and then controlling the data that gets sent to Apple’s servers to “skip” the
encryption step and control the ciphertext directly. If you control both endpoints, you can see
what the ciphertext decrypts to!
Analysis of CBC
Recall from Figure 3.2 the way that message construction works under CBC with
random initialization vectors.
Suppose we start by encrypting two distinct, two-block messages. They don’t have
to be the ones chosen here, but it makes the example easier. We pass these to the
left-right oracle:
IV ∥ c1 ∥ c2 ←$ EK (LR(0^2n, 1^2n))
From these ciphertexts alone, we’ve already shown that the adversary can’t determine
which of the input messages was encrypted. However, suppose we send just the first
chunk to the decryption oracle?
m = DK (IV ∥ c1)
This is legal since it’s not an exact match for any encryption oracle output. Since
our two blocks were identical, and c2 has no bearing on the decryption of IV ∥ c1
(again, refer to the visualization in Figure 3.2), the plaintext m will be all-zeros in
the left case and all-ones in the right case!
It should be fairly clear that this is an efficient attack, and that the adversary’s
advantage is optimal (exactly 1). For posterity,
Adv^ind-cca_CBC (A) = 1 − 0 = 1
The attack time t is the time to compare n bits, it requires qe = qd = 1 query to each
oracle, and message lengths of µe = 4n and µd = 2n. Thus, CBC is not IND-CCA
secure.
Almost identical proofs can be used to break both CTR and CTRC, our final bastions
of hope in the Modes of Operation we’ve covered.
Analysis of CBC: Anotha’ One (or, Kicking ‘em While They’re Down)
We can break CBC (and the others) in a different way. This is included here to jog
the imagination and offer an alternative way of thinking about showing insecurity
under IND-CCA.
In this attack, one-block messages will be sufficient:
IV ∥ c1 ←$ EK (LR(0^n, 1^n))
This time, there’s nothing to chop off. However, what if we try decrypting the
ciphertext with a flipped IV (every bit complemented)?
m = DK (¬IV ∥ c1)
Well, according to Figure 3.2, the output from the block cipher will be XOR’d with
the flipped IV, and thus result in a flipped message: m = ¬0^n = 1^n in the left case,
and m = ¬1^n = 0^n in the right case!
Again, this is trivially computationally-reasonable (in fact, it’s even more reasonable
than before) and breaks IND-CCA security.
Message Authentication Codes
Data privacy and confidentiality are not the only goals of cryptography, and a
good encryption method does not make any guarantees about anything beyond
confidentiality. In the one-time pad (which is perfectly secure), an active attacker
Mallory can modify the message in-flight to ensure that Alice receives something
other than what Bob sent:
(Figure: Bob sends C = K ⊕ M; Mallory intercepts it and forwards C′ = C ⊕ M′, so Alice decrypts and receives M ⊕ M′ instead of M.)
If Mallory knows that the first 8 bits of Bob’s message corresponds to the number of
dollars that Alice needs to send Bob (and she does, according to Kerckhoff’s principle),
such a manipulation will have catastrophic consequences for Alice’s bank account.
Clearly, we need a way for Alice to know that a message came from Bob himself.
Let’s discuss ways to ensure that the recipient of a message can validate that the
message came from the intended sender (authenticity) and was not modified on the
way (integrity).
A replay attack is one where an adversary uses valid messages from the
past that they captured to duplicate some action.
For example, imagine Bob sends an encrypted, authenticated message “You
owe my friend Mallory $5.” to Alice that everyone can see. Alice knows this
message came from Bob, so she pays her dues. Then, Mallory decides to
just. . . send Alice that message again! It’s again deemed valid, and Alice pays up a second time.
(Figure: the UF-CMA setting—the adversary A has access to a tagging oracle Mac(K, ·) and a verification oracle Vf(K, ·, ·), and must output a forgery (M, t).)
The latter part of the probability lets us ignore replay attacks and trivial
breaks of the scheme.
A Toy Example
Suppose we take a simple MAC scheme that prepends each message block with a
counter, runs this concatenation through a block cipher, and XORs all of the cipher-
texts (see Figure 4.1).
This can be broken easily if we realize that XORs can cancel each other out. Consider
tags for three pairs of messages and what they expand to:
T1 = Mac (X1 ∥ Y1) −→ EK (1 ∥ X1) ⊕ EK (2 ∥ Y1)
T2 = Mac (X1 ∥ Y2) −→ EK (1 ∥ X1) ⊕ EK (2 ∥ Y2)
1
This lone restriction on the adversary is exactly like the one for IND-CCA, where it’s trivial to get
a perfect advantage if you’re allowed to decrypt messages you’ve encrypted.
(Figure 4.1: a simple MAC algorithm—each counter-prepended block i ∥ M[i] is run through EK and the results are XOR’d together into the tag T.)
T3 = Mac (X2 ∥ Y1) −→ EK (1 ∥ X2) ⊕ EK (2 ∥ Y1)
If we combine these three tags, we can actually derive the tag for a new pair of messages!
T1 ⊕ T2 ⊕ T3 = EK (1 ∥ X1) ⊕ EK (2 ∥ Y1) ⊕ EK (1 ∥ X1) ⊕ EK (2 ∥ Y2) ⊕ EK (1 ∥ X2) ⊕ EK (2 ∥ Y1)
             = EK (1 ∥ X2) ⊕ EK (2 ∥ Y2)       (the duplicated terms cancel)
             = Mac (X2 ∥ Y2)
Since we haven’t queried the tagging algorithm with this particular message, it be-
comes a valid pairing that breaks the scheme. It’s also trivially a reasonable attack,
requiring only qt = 3 queries to the tagging algorithm, µ = 3 messages, and the time
it takes to perform 3 XORs (if we don’t count the internals of Mac).
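Here’s a small sanity check of the forgery (my own sketch; EK is just HMAC-SHA256 standing in for the block cipher, since the attack relies only on the XOR structure):

import hmac, hashlib, os

def E(K, block):                          # stand-in for E_K(counter || block)
    return hmac.new(K, block, hashlib.sha256).digest()

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def mac(K, X, Y):                         # Mac(X || Y) = E_K(1 || X) XOR E_K(2 || Y)
    return xor(E(K, b"\x01" + X), E(K, b"\x02" + Y))

K = os.urandom(16)
X1, X2, Y1, Y2 = b"X1", b"X2", b"Y1", b"Y2"
T1, T2, T3 = mac(K, X1, Y1), mac(K, X1, Y2), mac(K, X2, Y1)
forged = xor(xor(T1, T2), T3)             # we never queried Mac(X2 || Y2)...
assert forged == mac(K, X2, Y2)           # ...yet the forged tag verifies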
This means that any secure blockcipher (like AES) can be used as a MAC. However,
they only operate on short input messages. Can we extend our Modes of Operation
to allow MACs on arbitrary-length messages?
Enter CBC-MAC, which looks remarkably like CBC mode for encryption (see subsection 3.1.2)
but disregards all but the last output ciphertext. Given an n-bit block
cipher E : {0, 1}^k × {0, 1}^n → {0, 1}^n, the message space is MsgSp = {0, 1}^mn,
i.e. fixed m-block messages (obviously m ≥ 1).
(Figure: CBC-MAC—the message blocks are chained through EK exactly as in CBC mode, and only the final ciphertext block C[m] is kept as the tag.)
To reiterate, this scheme is secure under UF-CMA only for a fixed message length
across all messages. That is, we can’t send messages that are longer or shorter than
some predefined multiple of n bits.
Adv^uf-cma_CBC-MAC (A) ≤ Adv^prf_E (B) + (m^2 · q_A^2) / 2^(n−1)
(the last term is an artifact of the birthday paradox)
This is an important limitation, and it will be enlightening for the reader to determine
why variable-length messages break the CBC-MAC authentication scheme. There
are, however, ways to extend CBC-MAC to allow variable-length messages, such as
by prepending the length as the first message block.
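A rough sketch of fixed-length CBC-MAC (mine, with pycryptodome’s AES as EK):

import os, hmac
from Crypto.Cipher import AES

def cbc_mac(key, blocks):
    # Chain the 16-byte blocks through E_K and keep only the final output as the tag.
    E = AES.new(key, AES.MODE_ECB).encrypt
    tag = bytes(16)                        # chaining value starts at the all-zero block
    for m in blocks:
        tag = E(bytes(a ^ b for a, b in zip(m, tag)))
    return tag

key = os.urandom(16)
msg = [b"pay alice $5....", b"from bob's acct."]
tag = cbc_mac(key, msg)
# Verification just recomputes the tag and compares in constant time:
assert hmac.compare_digest(tag, cbc_mac(key, msg))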
Hash Functions
Realize you won’t master data structures until you are working
on a real-world problem and discover that a hash is the solution
to your performance woes.
— Robert Love
Some examples of modern hash functions include those in Table 5.1. They should
be pretty familiar: SHA-1 is used by git and SHA-3 is used by various cryptocur-
rencies like Ethereum.1 They are used as building blocks for encryption, hash-maps,
blockchains, key-derivation functions, password-storage mechanisms, and more.
1
Technically, Ethereum uses the Keccak-256 hash function, which is the pre-standardized version
of SHA-3. There are some interesting theories on the difference between the two: though
the standardized version changes a padding rule—allegedly to allow better variability in digest
lengths—its underlying algorithm was weakened to improve performance, casting doubts on its
general-purpose security.
Adv^cr_H (A) = Pr [Hk(x1) = Hk(x2)]   where x1 ≠ x2
Is H collision resistant?
Obviously not. It’s actually quite trivial to get the exact same digest, since x1 ⊕x1 = 0.
That is, we pass the same 128-bit block in twice:
Let x ←$ {0, 1}^128 and m = x ∥ x; then
Hk (m) = Aes(x) ⊕ Aes(x) = c ⊕ c = 0
Notice that this is extremely general-purpose, finding 2^128 messages that all collide
to the same value of zero.
(Figure 5.1: the Merkle-Damgård transform—starting from the all-zero chaining value 0^n, each message block is fed into the compression function h along with the previous chaining output.)
it looks like Figure 5.1: each “block” of the input message is concatenated with the
hashed version of its previous block, then hashed again.
The good news of this transform is the following: if the underlying compression function h is collision resistant, then so is the full hash function built from it.
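As a toy sketch of the transform (mine; the “compression function” is SHA-256 truncated to 16 bytes, purely to show the chaining and not a real design):

import hashlib

def h(chain: bytes, block: bytes) -> bytes:
    # Stand-in compression function h: 128-bit chain x 128-bit block -> 128-bit chain.
    return hashlib.sha256(chain + block).digest()[:16]

def merkle_damgard(blocks):
    chain = bytes(16)                      # start from the all-zero chaining value
    for block in blocks:
        chain = h(chain, block)            # fold each block into the running value
    return chain

print(merkle_damgard([b"block one.......", b"block two......."]).hex())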
Birthday Attacks Recall the birthday paradox: as the number of samples from
a range increases, the probability of any two of those samples being equal grows
rapidly (there’s a ~97% chance that two people at a 50-person party will have the
same birthday).
A hash function is regular if every range point has the same number of pre-images
(that is, if every output has the same number of possible inputs). For such a function,
the “birthday attack” finds a collision in ≈ 2^(n/2) trials. For a hash function that is not
regular, such an attack could succeed even sooner.
Thorough research into the modern hash functions (for which n ≥ 160, large-enough
to protect against birthday attacks) suggests that they are “close to regular.” 2 Thus,
we can safely use them as building blocks for Merkle-Damgård.
Attacks in Practice: SHAttered A collision for the SHA-1 hash was found in
February of 2017, breaking the hash function in practice after it was broken theoretically
in 2005: two PDFs resolved to the same digest. The attack took about 2^63
computations; this is roughly 100,000 times faster than a birthday attack.
Figure 5.2: The two PDFs in the SHAttered attack and their resulting, identical
digests. More details are available on the attack’s site (because no security attack
is complete without a trendy title and domain name).
Recall that the range of a function f over a domain D is the set of outputs it can produce:
R = {f (d) : d ∈ D}
2
Much like the conjecture that AES is PRF secure, this is thus far unproven. As we’ll see later,
neither are the security assumptions behind asymmetric cryptography (e.g. “factoring is hard”).
Overall, these conjectures on top of conjectures unfortunately do not inspire much confidence in
the overall state of security, yet it’s the best we can do.
Example Given f (x) = x (the diagonal line passing through the origin),
for what subset of the domain is f (x) > 0? Obviously, when x > 0.
This subset is called the preimage. Namely, given a subset of the range,
S ⊆ R, its preimage is the set of inputs that corresponds to it:
P = {x | f (x) ∈ S}
Adv^ow_H (A) = Pr [Hk(x′) = y]
Given our two security properties for a hash function, do either of them imply the
other? That is, are either of these true?
collision resistance =⇒ one-wayness
one-wayness =⇒ collision resistance
(Consider an h built from a one-way function g by simply ignoring the last bit of its input.)
Since g was one-way, h is also one-way. However, it’s obviously not collision resistant,
since we know that for any n-bit input m:
h(m1 m2 · · · mn−1 0) = h(m1 m2 · · · mn−1 1)
opad = 0x5C5C5C. . . (repeated B times)        ipad = 0x363636. . . (repeated B times)
The specific constants are chosen to simplify the proof of security, having no bearing
on the security itself.
(Figure: the HMAC construction—the inner hash chews through Ki ∥ M block-by-block to produce X, and an outer hash of Ko ∥ X yields HMACK(M), where Ki and Ko are the key XOR’d with ipad and opad, respectively.)
HMAC is easy to implement and fast to compute; it is a core part of many standardized
cryptographic constructs. It’s useful both as a message authentication code and
as a key-derivation function (which we’ll discuss later in asymmetric cryptography).
Theorem 5.2. HMAC is a PRF assuming that the underlying compression func-
tion H is a PRF.
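In practice you never hand-roll the construction; Python’s standard library exposes it directly (a usage sketch of mine):

import hmac, hashlib, os

key = os.urandom(32)
msg = b"wire $5 to alice"
tag = hmac.new(key, msg, hashlib.sha256).hexdigest()

# The receiver recomputes the tag and compares with a constant-time check:
assert hmac.compare_digest(tag, hmac.new(key, msg, hashlib.sha256).hexdigest())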
Authenticated Encryption
Adv^int-ctxt_SE (A) = Pr [A → C : DK(C) ≠ ⊥ and C wasn’t received from EK(·)]
We can build a secure authenticated encryption scheme by composing the basic
encryption and MAC schemes we’ve already seen (Bellare & Namprempre, ‘00).
Key Generation Keeping keys for confidentiality and integrity separate is incred-
ibly important. This is called the key separation principle: one should always use
distinct keys for distinct algorithms and distinct modes of operation. It’s possible to
do authenticated encryption without this, but it’s far more error-prone.1
1
The unique keys can still be derived from a single key via a pseudorandom generator, such as
by saying K1 = FK (0) and K2 = FK (1) for a PRF secure F . The main point is to keep them
separate beyond that.
Thus our composite key generation algorithm will generate two keys: Ke for encryp-
tion and Km for authentication.
K : Ke ←$ K0
    Km ←$ {0, 1}^k
    K := Ke ∥ Km
6.2.1 Encrypt-and-MAC
In this composite scheme, the plaintext is both encrypted and authenticated; the full
message is the concatenated ciphertext and tag.
6.2.2 MAC-then-encrypt
In this composite scheme, the plaintext is first tagged, then the concatenation of the
tag and the plaintext is encrypted.
How’s the security of this scheme? There’s no longer a deterministic component, so
it is IND-CPA secure; however, it does not guarantee integrity under INT-CTXT.
We can prove this by counterexample, exhibiting specific secure building blocks
that lead to valid forgeries.
2
Specifically, consider submitting two queries to the left-right oracle: LR(0n , 1n ) and LR(0n , 0n ).
The tags for the b = 0 case would match.
The counterexample for this is a little bizarre and worth exploring; it gives us insight
into how hard it truly is to achieve security under these rigorous definitions. We’ll
first define a new IND-CPA encryption scheme:
SE′′ = {K0, E′′, D′′}
Then, we’ll define SE′ as an encryption scheme that is also IND-CPA secure, that
uses SE′′, but enables trivial forgeries by appending an ignorable bit to the resulting
ciphertext:
SE′ = {K0, E′, D′}
E′K (M) = E′′K (M) ∥ 0
D′K (C ∥ b) = D′′K (C)
Obviously, now both C ∥ 0 and C ∥ 1 decrypt to the same plaintext, and this means
that an adversary can easily create forgeries. Weird, right? This example, silly
though it may be, is enough to demonstrate that MAC-then-encrypt cannot make
guarantees about INT-CTXT security in general.
6.2.3 Encrypt-then-MAC
In our last hope for a generally-secure scheme, we will encrypt the plaintext, then
add a tag based on the resulting ciphertext.
valid ones, and this lets the attacker learn secret information about the
scheme. Now, they can differentiate between an invalid tag and an invalid
ciphertext.
With this scheme, we get both security under IND-CPA and INT-CTXT, and by
Theorem 6.1, also under IND-CCA.
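A sketch of the composition (mine, reusing the HMAC-based CTR stand-in from earlier; note the two independent keys, per the key separation principle):

import hmac, hashlib, os

def F(K, x):                                         # stand-in PRF for CTR mode
    return hmac.new(K, x, hashlib.sha256).digest()[:16]

def ctr_crypt(K, R, data):
    out = bytearray()
    for i in range(0, len(data), 16):
        pad = F(K, ((R + i // 16 + 1) % 2**128).to_bytes(16, "big"))
        out += bytes(a ^ b for a, b in zip(data[i:i + 16], pad))
    return bytes(out)

def etm_encrypt(Ke, Km, msg):
    R = os.urandom(16)
    ct = R + ctr_crypt(Ke, int.from_bytes(R, "big"), msg)
    tag = hmac.new(Km, ct, hashlib.sha256).digest()  # MAC the *ciphertext*
    return ct + tag

def etm_decrypt(Ke, Km, blob):
    ct, tag = blob[:-32], blob[-32:]
    if not hmac.compare_digest(tag, hmac.new(Km, ct, hashlib.sha256).digest()):
        return None                                  # reject before touching the plaintext
    return ctr_crypt(Ke, int.from_bytes(ct[:16], "big"), ct[16:])

Ke, Km = os.urandom(16), os.urandom(16)
blob = etm_encrypt(Ke, Km, b"attack at dawn")
assert etm_decrypt(Ke, Km, blob) == b"attack at dawn"
tampered = blob[:-1] + bytes([blob[-1] ^ 1])
assert etm_decrypt(Ke, Km, tampered) is None         # any modification is rejected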
6.2.4 In Practice. . .
It’s important to remember that the above results hold in general ; that is, they hold
for arbitrary secure building blocks. That does not mean it’s impossible to craft a
specific AE scheme that holds under a generally-insecure composition method.
4
It was designed by Phillip Rogaway, one of the authors of the lecture notes on cryptography.
Stream Ciphers
This chapter introduces a paradigm shift in the way we’ve been constructing
ciphertexts. Rather than encrypting block-by-block using a specific mode of
operation, we’ll instead be encrypting bit-by-bit with a stream of gibberish. Previously,
we needed our input plaintext to be a multiple of the block size; now, we can truly
deal with arbitrary-length inputs without worrying about padding. This will actually
be reminiscent of one-time pads: a pseudorandom generator (or PRG) will
essentially be a function that outputs an infinitely-long one-time pad, and a stream
cipher will use that output to encrypt plaintexts.
7.1 Generators
In general, a stateful generator G begins with some initial state St=0 ←$ {0, 1}^n
called the seed, then uses the output of itself as input to its next run. The sequence
of outputs over time, X0 X1 X2 · · ·, should be pseudorandom for a pseudorandom
generator: reasonably unpredictable and tough to differentiate from true randomness.
(Figure: one step of a stateful generator—G takes the current state St and produces an output block Xt along with the next state St+1.)
(X0 X1 · · · Xm , St ) = G (S0 , m)
to signify running the generator m times with the starting state S0 , resulting in an
m-length output and a new state St. This construction is the backbone of all of the
instances where we’ve used ←$ previously to signify choosing a random value from a
set. Pseudorandom generators (PRGs) are used to craft initialization vectors, keys,
oracles, etc.
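A tiny sketch of such a generator (mine; HMAC-SHA256 over the state is a stand-in for a real PRF-based design):

import hmac, hashlib, os

class PRG:
    def __init__(self, seed: bytes):
        self.state = seed                                                     # S_0, the seed

    def next(self) -> bytes:
        out = hmac.new(self.state, b"output", hashlib.sha256).digest()        # X_t
        self.state = hmac.new(self.state, b"state", hashlib.sha256).digest()  # S_{t+1}
        return out

g = PRG(os.urandom(32))
stream = b"".join(g.next() for _ in range(4))        # X_0 X_1 X_2 X_3
print(stream.hex())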
Adv^indr_G (A) = Pr [b′ = 1 for Exp1] − Pr [b′ = 1 for Exp0]
(Figure: iterating the generator—S0 → S1 → S2 → S3 under G, emitting X1, X2, X3 along the way.)
Suppose an adversary somehow gets access to S2 . Obviously, they can now derive X3 ,
X4 , and so on, but can they compute X1 or X2 , though? A scheme that preserves
forward secrecy should say “no.”
The scheme presented in algorithm 7.1, though secure under INDR does not preserve
forward secrecy. Leaking any state (K, Vt ) lets the adversary construct the entire
chain of prior states if they have been capturing the entire history of generated X0..t
values.
Consider a simple forward-secure pseudorandom generator: regenerate the key anew
on every iteration.
(Figure: a forward-secure generator—at each step, EKt(0) becomes the next key Kt+1 and EKt(1) becomes the output Xt+1, after which the old key Kt is discarded.)
7.3.2 Considerations
To get an unpredictable PRG, you need an unpredictable key for the underlying PRF.
This is the seed, and it causes a bit of a chicken-and-egg problem. We need random
values to generate (pseudo)random values.
Entropy pools typically come from “random” events from the real world like keyboard
strokes, system events, even CPU temperature. Then, seeds can be pulled from this
entropy pool to seed PRGs.
Seeding is not exactly a cryptographic problem, but it’s an important consideration
when using PRGs and stream ciphers.
Common Implementation Mistakes
Primitives There are far more primitives that don’t work than ones that do. For
example, block ciphers with small block sizes or small key spaces are vulnerable
to exhaustive key-search attacks, not even to mention their vulnerability to the
birthday paradox. Always check NIST and recommendations from other standards
committees to ensure you’re using the most well-regarded primitives.
Security Bounds Recall that we proved that the CTR mode of operation (see
Figure 3.4) had the following adversarial advantage:
Adv^ind-cpa_CTR (A) ≤ Adv^prf_E (B) + q^2 / 2^(L+1)
Yet if we use constants that are far too low, this becomes easily achievable. The
WEP security protocol for WiFi networks used L = 24. With q = 4096 (trivial to
do), the advantage becomes 1/2! In other words, the IVs are far too short1 to provide
any semblance of security from a reasonably-resourced attacker.
Trifecta Just because you have achieved confidentiality, you have not necessarily
achieved integrity or authenticity. Not keeping these things in mind leads to situations
where false assumptions are made.
Security Proofs As we’ve seen, we often need to extend our security definitions
to encompass more sophisticated attacks (like IND-CCA over IND-CPA). Thus, even
using a provably-secure scheme does not absolve you of an attack surface. For example,
the Lucky 13 attack used a side-channel timing attack to break TLS. The security
definitions we’ve covered did not consider an attacker being able to tell the difference
between decryption and MAC verification failures, or how fragmented ciphertexts
(where the receiver doesn’t know the borders between ciphertexts) are handled.
1
It’s so easy to break WEP-secured WiFi networks; I did it as a kid with a $30 USB adapter and
15 minutes on Backtrack Linux.
PART II
Asymmetric Cryptography
This class of algorithms is built to solve the key distribution problem. Here,
secrets are only known to one party; instead, a key (pk, sk) is broken into two
mathematically-linked components. There is a public key that is broadcast to the
world, and a secret key (also called a private key) that must be kept secret.
Contents
9 Overview
10 Number Theory
11 Encryption
12 Digital Signatures
14 Epilogue
Overview
We need to translate some things over from the world of symmetric encryption
to proceed with our same level of rigor and analysis as before, this time applying
our security definitions to asymmetric encryption schemes.
9.1 Notation
An asymmetric encryption scheme is similarly defined by an encryption and decryp-
tion function pair as well as a key generation algorithm. Much like before, we denote
these as AE = (E, D, K).
The key is now broken into two components: the public key (shareable) and the
private key (secret). These are typically composed as: K = (pk, sk).
(Figure: asymmetric IND-CPA—the adversary A is given the public key pk and queries the left-right oracle LR(m0, m1, b), which returns encryptions of mb under Epk(·).)
Asymmetric IND-CCA For chosen ciphertext attacks, we keep the same restric-
tion as before: the attacker cannot query the decryption oracle with ciphertexts s/he
acquired from the encryption oracle.
(Figure: asymmetric IND-CCA—in addition to the public key pk and the left-right oracle backed by Epk(·), the adversary may query a decryption oracle Dsk(·) on ciphertexts C′ of its choosing and receive the plaintexts M′.)
Much like before, a scheme being IND-CCA implies it’s also IND-CPA (recall the
inverse direction of Theorem 6.1).
Adv^ind-cpa_AE (A) ≤ q · Adv^ind-cpa_AE (A′)
Essentially, this theorem states that a scheme that is secure against a single query
is just as secure against multiple queries because the factor of q does not have a
significant overall effect on the advantage.
Number Theory
Modular arithmetic and other ideas from number theory are the backbone of
asymmetric cryptography. Like the name implies, the foundational security
principles rely on the asymmetry of difficulty in mathematical operations. For
example, verifying that a number is prime is easy, yet factoring a product of primes is
hard.
The RSA and modular arithmetic discussions in this chapter are ripped
from my notes for Graduate Algorithms which also covers these topics; these
sections may not align perfectly with lectures in terms of overall structure.
Notation
• Z+ is the set of positive integers, {0, 1, . . .}.
• ZN is the set of positive integers up to N : {0, 1, . . . , N − 1}.
• Z∗N is the set of integers that are coprime with N , meaning their greatest com-
mon divisor is 1:
Z∗N = {x ∈ ZN : gcd(x, N ) = 1}
10.1 Groups
A group is just a set of numbers on which certain operations hold true. Let G be a
non-empty set and let · be some binary operation. Then, G is a group under said
operation if:
• closure: the result of the operation should stay within the set:
∀a, b ∈ G : a · b ∈ G
• associativity: grouping doesn’t matter: ∀a, b, c ∈ G : (a · b) · c = a · (b · c)
• identity: there should be some element in the set such that binary operations
on that element have no effect:
∀a ∈ G : a · 1 = 1 · a = a
The 1 here is a placeholder for the identity element; it doesn’t need to be the
actual positive integer 1.
• invertibility: for any value in the set, there should be another unique element
in the set such that their result is the identity element:
∀a ∈ G, ∃b ∈ G : a · b = b · a = 1
This latter element b is called the inverse of a.
For example, ZN is a group under addition modulo N , and Z∗N is a group under
multiplication modulo N . The order of a group is just its size.
Property 10.1. For a group G, if we let m = |G|, the order of the group, then:
∀a ∈ G : a^m = 1
∀a ∈ G, i ∈ Z : a^i = a^(i mod m)    (10.1)
Example These properties let us do some funky stuff with calculating seemingly-
impossible values. Suppose we're working under Z∗21:
Z∗21 = {1, 2, 4, 5, 8, 10, 11, 13, 16, 17, 19, 20}
Note that |Z∗21| = 12. What's 5^86 mod 21? Simple:
5^86 mod 21 = 5^(86 mod 12) mod 21
            = 5^2 mod 21 = 25 mod 21
            = 4
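As a quick sanity check, a couple of lines of Python (nothing beyond the built-in pow and math.gcd; the variable names are mine) reproduce this reduction:

from math import gcd

# Elements of Z*_21 are those coprime with 21; the group order is |Z*_21| = 12.
group = [x for x in range(1, 21) if gcd(x, 21) == 1]
order = len(group)                     # 12

# Property 10.1: a^i = a^(i mod m) inside the group, so 5^86 = 5^(86 mod 12) = 5^2 (mod 21).
assert pow(5, 86, 21) == pow(5, 86 % order, 21) == 4
print(pow(5, 86, 21))                  # 4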
(Recall the subgroup criterion: a non-empty subset S of a group is itself a subgroup when ∀x, y ∈ S : x · y^(−1) ∈ S. For example, the multiples of 3 form a subgroup of Z under addition, and its cosets partition the integers:
{. . . , −6, −3, 0, 3, 6, . . .}
{. . . , −5, −2, 1, 4, 7, . . .}
{. . . , −4, −1, 2, 5, 8, . . .})
This will become clearer with examples, but in essence we're looking for inverse-exponential
quantities, so an advantage of 2^(−k) is negligible.
10.2.2 Inverses
The multiplicative inverse of a number under a modulus is the value that makes
their product 1. That is, x is the multiplicative inverse of z if zx ≡ 1 (mod N ). We
then say x ≡ z −1 (mod N ).
Note that the multiplicative inverse does not always exist (in other words, ZN is not
a group under multiplication); if it does, though, it's unique. The inverse of x modulo N exists if and only
if gcd(x, N ) = 1, i.e. x and N are relatively prime (coprime). Recall Bézout's identity: for any x and n there exist integers a, b such that
ax + bn = gcd(x, n)
These coefficients can be found using the extended Euclidean algorithm and are crucial in
finding the multiplicative inverse. If we find that gcd(x, n) = 1, then we want to find
x^(−1). By the above identity, this means:
ax + bn = 1
ax + bn ≡ 1 (mod n)    taking mod n of both sides doesn't change the truth
ax ≡ 1 (mod n)         since bn mod n = 0
so a (reduced mod n) is exactly the inverse x^(−1) we were looking for.
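A minimal sketch of that procedure (the helper names are mine; Python 3.8+'s built-in pow(x, -1, n) does the same job):

def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y = g = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def mod_inverse(x, n):
    """Return x^-1 mod n, or raise if gcd(x, n) != 1."""
    g, a, _ = extended_gcd(x, n)
    if g != 1:
        raise ValueError("no inverse: gcd(x, n) != 1")
    return a % n

assert (mod_inverse(5, 21) * 5) % 21 == 1    # 17 * 5 = 85 ≡ 1 (mod 21)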
Then, we can multiply the correct powers of two to get x^y, so if y = 69, you would
use x^69 ≡ x^64 · x^4 · x^1 (mod N).
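And a sketch of the repeated-squaring idea itself (Python's built-in three-argument pow already implements exactly this, so the function below is only illustrative):

def fast_pow(x, y, N):
    """Compute x^y mod N by squaring, one bit of y at a time."""
    result = 1
    x %= N
    while y > 0:
        if y & 1:              # this bit of y is set: multiply in the current power
            result = (result * x) % N
        x = (x * x) % N        # square: x, x^2, x^4, x^8, ...
        y >>= 1
    return result

assert fast_pow(7, 69, 1000003) == pow(7, 69, 1000003)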
Group Elements The order of a finite group element, denoted o(g) for some
g ∈ G, is the smallest integer n ≥ 1 fulfilling g^n = 1 (the identity element).
For any group element g ∈ G, we can generate a subgroup of G easily:
⟨g⟩ = {g^0, g^1, . . . , g^(o(g)−1)}
Naturally, its order is o(g), the order of the element g. Since we established above that the order
of a subgroup divides the order of the group, the same is true for group elements. In
other words, ∀g ∈ G : |G| mod o(g) = 0.
We need to scale our security based on the state of the art for the groups in question: a
1024-bit prime p used with Z∗p is considered roughly as secure as a 160-bit prime q used as the order of an elliptic curve group.
Finding Generators
Thankfully, there are some simple cases that let us create such groups:
• If p is a prime number, then Z∗p is a cyclic group.
• If the order of any group G is prime, then G is cyclic.
• If the order of a group is prime, then every non-trivial element is a generator
(that is, every g ∈ G \ {1} where 1 is the identity element).
However, if G = Z∗p, then its order is p − 1, which isn't prime. Though it may be hard
to find a generator in general, it's easy if the prime factorization of p − 1 is known.
A prime p is called a safe prime if p − 1 = 2q, where q is also a prime. Safe primes
show up elsewhere because a product of two large primes is the hardest kind to factor;¹ here,
though, they're useful because the order of Z∗p factors into (2, q) when p is a safe prime.
Property 10.2. Given a safe prime p, the order of Z∗p can be factored into (2, q),
where q is a prime. Then, a group element g ∈ Z∗p is a generator if and only if
g^2 ≢ 1 (mod p) and g^q ≢ 1 (mod p).
Now, there is a useful fact that Z∗p will have q − 1 generators, so a simple randomized
algorithm that chooses g ←$ Z∗p \ {1} until g is a generator (checked by computing g^2
and g^q) will fail with only about 1/2 probability per attempt. This becomes negligible after enough runs
and will take two tries on average, letting us find generators quickly.
We just found a way to find a generator g in the group Z∗p; the end-goal is to work
over ⟨g⟩. Thus, our difficulty has transferred over to choosing safe primes.
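Putting Property 10.2 and the sampling loop together, a sketch with a toy safe prime might look like this (the function names and the tiny p = 23 are my own choices for illustration):

import secrets

def is_generator(g, p, q):
    """For a safe prime p = 2q + 1, g generates Z*_p iff g^2 != 1 and g^q != 1 (mod p)."""
    return pow(g, 2, p) != 1 and pow(g, q, p) != 1

def find_generator(p, q):
    """Sample g from Z*_p \\ {1} until the two checks pass (about two tries on average)."""
    while True:
        g = secrets.randbelow(p - 2) + 2     # uniform in {2, ..., p - 1}
        if is_generator(g, p, q):
            return g

p, q = 23, 11                                # toy safe prime: 23 = 2*11 + 1
g = find_generator(p, q)
assert len({pow(g, i, p) for i in range(p - 1)}) == p - 1   # <g> really is all of Z*_23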
Generating Primes
Because primes are dense—for an n-bit number, we’ll find a prime every n runs on
average—we can just generate random bitstrings until one of them is prime. Once we
have a prime, making sure it’s a safe prime does not add much complexity because
they are also dense.
¹ For example, factoring 4212253 (into 2903 · 1451) is much harder than factoring 5806 (into 2903 · 2)
because both of the former primes need around 11 bits to represent them.
Given this, how do we check for primality quickly? Fermat's little theorem gives us
a way to check for positive primality: if a randomly-chosen number r is prime, the
theorem holds. However, checking all r − 1 values against the theorem is not ideal.
Similarly, checking whether or not any value up to √r divides r is not ideal.
It will be faster to identify a number as being composite (non-prime), instead.
Namely, if the theorem doesn't hold, we should be able to find some specific z for
which z^(r−1) ≢ 1 (mod r). These are called Fermat witnesses, and every composite
number has at least one.
This "at least one" is the trivial Fermat witness: one where gcd(z, r) > 1.
Most composite numbers also have many non-trivial Fermat witnesses: ones where
gcd(z, r) = 1.
The composites without non-trivial Fermat witnesses are called Carmichael
numbers or "pseudoprimes." Thankfully, they are relatively rare compared to normal
composite numbers, so we can ignore them for our primality test.
The above property inspires a simple randomized algorithm for primality tests that
identifies prime numbers to a particular degree of certainty:
1. Choose z randomly: z ←$ {1, 2, . . . , r − 1}.
2. Check whether z^(r−1) ≡ 1 (mod r).
3. If it holds, say that r is (probably) prime. Otherwise, r is definitely composite.
Note that if r is prime, this will always confirm that. However, if r is composite (and
not a Carmichael number), this algorithm is correct at least half of the time by the above
property. To boost our chance of success and lower false positives (cases where r is
composite but the algorithm says it's prime), we choose z many times. With k runs,
we have at most a 1/2^k chance of a false positive.
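A sketch of steps 1–3 with k repetitions (real libraries use the stronger Miller–Rabin test hinted at by Property 10.4 below; this is only the plain Fermat version, and the names are mine):

import random

def fermat_is_probably_prime(r, k=32):
    """Report True if r passes k random Fermat checks; composites (other than
    Carmichael numbers) slip through with probability at most 2^-k."""
    if r < 4:
        return r in (2, 3)
    for _ in range(k):
        z = random.randrange(2, r - 1)     # skip the trivial bases 1 and r - 1
        if pow(z, r - 1, r) != 1:          # z is a Fermat witness: r is definitely composite
            return False
    return True

print(fermat_is_probably_prime(101))   # True  (101 is prime)
print(fermat_is_probably_prime(91))    # False (91 = 7 * 13)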
Property 10.4. Given a prime number p, the number 1 only has the trivial
square roots ±1 under its modulus. In other words, there is no other value z such
that z^2 ≡ 1 (mod p).
The above property lets us identify Carmichael numbers during the fast exponentia-
tion for 3/4ths of the choices of z, which we can use in the same way as before to check
primality to a particular degree of certainty.
5^2 = 25 ≡ 3 (mod 11)
6^2 = 36 ≡ 3 (mod 11)
Thus, the square root of 25 ≡ 3 is both 5 and 6 under modulo 11. Weird,
right? Well, not so much when you consider that −5 ≡ 6 (mod 11).
Again, not every value has a square root: for example, 28 doesn't under mod 11
(note that 28 mod 11 = 6). We can verify this by trying all possible values² under
the modulus:
1^2  = 1        (mod 11)
2^2  = 4        (mod 11)
3^2  = 9        (mod 11)
4^2  = 16 ≡ 5   (mod 11)
5^2  = 25 ≡ 3   (mod 11)
6^2  = 36 ≡ 3   (mod 11)
7^2  = 49 ≡ 5   (mod 11)
8^2  = 64 ≡ 9   (mod 11)
9^2  = 81 ≡ 4   (mod 11)
10^2 = 100 ≡ 1  (mod 11)
² Also, notice that no other values square to 25 ≡ 3 (mod 11), confirming
that there are only two roots under this modulus.
With that, we can define sets of squares (or quadratic residues) in a group as:
QR(Z∗p) = {a ∈ Z∗p : Jp(a) = 1}    (10.4)
Now previously, we defined the Legendre symbol as a simple indicator function (10.3);
conveniently, it can actually be computed for any prime p ≥ 3:
Jp(a) ≡ a^((p−1)/2) (mod p)    (10.5)
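Equation (10.5) is a one-liner with fast exponentiation; the quick check below confirms it against brute-forced squares mod 11 (variable names are mine):

def legendre(a, p):
    """Euler's criterion: a^((p-1)/2) mod p is 1 for squares, p - 1 (i.e. -1) for non-squares."""
    return pow(a, (p - 1) // 2, p)

p = 11
squares = {pow(x, 2, p) for x in range(1, p)}     # {1, 3, 4, 5, 9}
for a in range(1, p):
    assert (legendre(a, p) == 1) == (a in squares)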
The Legendre symbol has a multiplicativity property: Jp(ab) = Jp(a) · Jp(b) for any
a, b ∈ Z. It also has an inversion property: the Legendre symbol of a value's inverse
is the same as the value's. That is, Jp(a^(−1)) = Jp(a). Both of these apply only for
non-trivial primes: p ≥ 3.
The following are “bonus” sections not directly related to the lecture content.
1^2  = 1        (mod 13)
2^2  = 4        (mod 13)
3^2  = 9        (mod 13)
4^2  = 16 ≡ 3   (mod 13)
5^2  = 25 ≡ 12  (mod 13)
6^2  = 36 ≡ 10  (mod 13)
7^2  = 49 ≡ 10  (mod 13)
8^2  = 64 ≡ 12  (mod 13)
9^2  = 81 ≡ 3   (mod 13)
10^2 = 100 ≡ 9  (mod 13)
11^2 = 121 ≡ 4  (mod 13)
12^2 = 144 ≡ 1  (mod 13)
Looks like 5 and 8 are the roots of 25 ≡ 12 under mod 13. Thus, if we look for the roots
under the product 11 · 13 = 143, we will find exactly four values:³
5^2   = 25    ≡ 25 (mod 143)
60^2  = 3600  ≡ 25 (mod 143)
83^2  = 6889  ≡ 25 (mod 143)
138^2 = 19044 ≡ 25 (mod 143)
³ These were found with a simple Python one-liner (with P = 143 and v = 25):
filter(lambda i: (i**2) % P == v % P, range(P))
The key comes from the following fact: by knowing the roots under both 11 and 13
separately, it’s really easy to find them under 11 · 13 without iterating over the entire
space. To reiterate, our roots are 5, 6 (mod 11) and 5, 8 (mod 13). We can use the
Chinese remainder theorem to find the roots quickly under 13 · 11.
Finding Roots Efficiently In our case, we have the four roots under the respective
moduli, and we can use the CRT to find the four roots under the product. Namely,
we find ri for each pair of roots:
r1 ≡ 5 (mod 11), r1 ≡ 5 (mod 13)
r2 ≡ 5 (mod 11), r2 ≡ 8 (mod 13)
r3 ≡ 6 (mod 11), r3 ≡ 5 (mod 13)
r4 ≡ 6 (mod 11), r4 ≡ 8 (mod 13)
Finding each ri can be done very quickly using the extended Euclidean algorithm
in O((|n| + |m|)^2) time (where |x| represents the bit count of each prime), which is
much faster than the exhaustive search O(2^(|m|+|n|)) necessary without knowledge of 11
and 13. In this case, the four roots are 5, 60, 83, and 138 (in order of the ri's above).
Square root extraction of a product of primes pq is considered to be as
difficult as factoring it.
x ≡ a1 (mod n1 )
x ≡ a2 (mod n2 )
...
x ≡ ak (mod nk )
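A sketch of exactly this computation—recovering the four roots of 25 mod 143 from the roots modulo 11 and 13—using a two-modulus CRT helper (the helper is my own; pow(n1, -1, n2) needs Python 3.8+):

def crt(a1, n1, a2, n2):
    """Combine x ≡ a1 (mod n1) and x ≡ a2 (mod n2) into x mod n1*n2 (n1, n2 coprime)."""
    x = a1 + n1 * ((a2 - a1) * pow(n1, -1, n2) % n2)
    return x % (n1 * n2)

roots_mod_11 = (5, 6)      # z^2 ≡ 3  ≡ 25 (mod 11)
roots_mod_13 = (5, 8)      # z^2 ≡ 12 ≡ 25 (mod 13)

roots = sorted(crt(r1, 11, r2, 13) for r1 in roots_mod_11 for r2 in roots_mod_13)
print(roots)               # [5, 60, 83, 138]
assert all(pow(r, 2, 143) == 25 for r in roots)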
Encryption
Though these problems all appear different, they boil down to the same fact: if you
can solve the initial discrete log problem, you can solve all of them:
11.1.1 Formalization
In each case, suppose again that we’re given a cyclic group G, the order of the group
m = |G|, and a generator g. The adversary knows all of this (fixed) information.
Exp^dl_{G,g}(A):  x ←$ Zm
                  x′ = A(g^x)
                  if g^(x′) = g^x, A wins
As usual, we define the discrete log problem as being "hard" if any adversary's dl-
advantage is negligible with polynomial resources.
Exp^cdh_{G,g}(A):  x, y ←$ Zm
                   z = A(g^x, g^y)
                   if z = g^(xy), A wins
The cdh-advantage and difficulty of CDH is defined in the same way as DL.
Exp^ddh-1_{G,g}(A):  x, y ←$ Zm;  z = xy mod m;  d = A(g^x, g^y, g^z);  return d
Exp^ddh-0_{G,g}(A):  x, y ←$ Zm;  z ←$ Zm (the key difference);  d = A(g^x, g^y, g^z);  return d
The difficulty of DDH is defined in the usual way based on the ddh-advantage of any
adversary with polynomial resources.
11.1.2 Difficulty
Under the group of a prime Z∗p , DDH is solvable in polynomial time, while the others
are considered hard: the best-known algorithm is the general number field sieve whose
complexity we mentioned in (10.2).
In contrast, under elliptic curves, all three of the aforementioned problems are
harder than their Z∗p counterparts, with the best-known algorithms taking √p time,
where p is the prime order of the group.
DL Difficulty
Note that there is a far faster algorithm for breaking the DL problem, but it relies
on the prime factorization of the order of the group having only small factors. Namely, if we know the breakdown
such that
p − 1 = p1^α1 · p2^α2 · . . . · pn^αn
(where each pi is a prime), then the discrete log problem can be solved in
Σ_{i=1}^{n} αi · (√pi + |p|)
time. Thus, if we want the DL problem to stay difficult, at least one prime
factor needs to be large (e.g. the q of a safe prime) so that the √pi term stays exponential.
Let’s take a look at the algorithm for breaking the decisional DH problem under
the prime group Z∗p . Remember, the goal of breaking DDH essential comes down to
differentiating between g xy and a random g z6=xy .
The key lies in a fact we covered when discussing Groups: we can easily differentiate
squares and non-squares in Z∗p (see Property 10.5). There’s an efficient adversary who
can have a ddh-advantage of 1/2: the idea is to compute the Legendre symbols of the
inputs. Recall Equation 10.5 or more specifically Property 10.6: the Legendre symbol
of an exponent product must match the individual exponents.
Since g x or g y will be squares half of the time (by Property 10.5—even powers of g
are squares), and g xy can only be a square if this is the case, this check succeeds with
1/2 probability, since:
Pr[Exp^ddh-1_{G,g}(A) = 1] = 1
Pr[Exp^ddh-0_{G,g}(A) = 1] = 1/2
∴ Adv^ddh_{G,g}(A) = 1 − 1/2 = 1/2
The algorithm only needs two modular exponentiations, meaning it takes O(|p|^3) time.
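An empirical sketch of that distinguisher over a toy group (the tiny parameters, names, and Monte-Carlo check are my own additions):

import random

p = 23                       # toy safe prime, p = 2q + 1 with q = 11
g = 5                        # a generator of Z*_p

def is_square(a):
    """Euler's criterion / Legendre symbol check."""
    return pow(a, (p - 1) // 2, p) == 1

def ddh_guess(gx, gy, gz):
    """Output 1 ("looks real") iff the squareness of gz is consistent with gx, gy."""
    expected_square = is_square(gx) or is_square(gy)   # g^(xy) is a square iff x or y is even
    return 1 if is_square(gz) == expected_square else 0

real_hits = random_hits = 0
trials = 20000
for _ in range(trials):
    x, y, z = (random.randrange(1, p - 1) for _ in range(3))
    real_hits += ddh_guess(pow(g, x, p), pow(g, y, p), pow(g, x * y, p))
    random_hits += ddh_guess(pow(g, x, p), pow(g, y, p), pow(g, z, p))
print(real_hits / trials, random_hits / trials)   # ≈ 1.0 on real tuples, ≈ 0.5 on random ones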
Making DDH Safe Since the best-known efficient algorithm relies on squares, we
can modify the group in question to avoid the algorithm. Specifically, DDH is believed
to be difficult (i.e. a minimal ddh-advantage for any polynomial adversary) in QR(Z∗p)
where p = 2q + 1 is a safe prime.
3. Then, Alice sends Bob her public key and vice-versa.
There is good news, though: ElGamal encryption is IND-CPA secure for a group if
the DDH problem on the same group is hard. As a reminder, such groups include
prime-order subgroups of Z∗p or elliptic curve prime order groups.
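To make the scheme concrete, here is a minimal sketch of "textbook" ElGamal in Python over a toy group (the helper names, the tiny safe prime 467, and the generator 2 are my own choices for illustration; real deployments use a DDH-hard group as just described and far larger parameters, and pow with a negative exponent needs Python 3.8+):

import secrets

p, g = 467, 2                      # toy safe prime 467 = 2*233 + 1 and a generator of Z*_p

def keygen():
    x = secrets.randbelow(p - 2) + 1                 # secret key
    return pow(g, x, p), x                           # (pk, sk)

def encrypt(pk, m):
    y = secrets.randbelow(p - 2) + 1                 # one-time ("ephemeral") exponent
    return pow(g, y, p), (m * pow(pk, y, p)) % p     # (Y, m * pk^y)

def decrypt(sk, ct):
    Y, c = ct
    return (c * pow(Y, -sk, p)) % p                  # divide out the shared mask Y^sk = g^(xy)

pk, sk = keygen()
assert decrypt(sk, encrypt(pk, 299)) == 299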
¹ Recall that a group is defined by a set and a specific operation (see section 10.1 for a review),
which we have been denoting with a generic · that is not necessarily ordinary multiplication.
Proof. We proceed by showing that if ElGamal is not IND-CPA, then the DDH
problem is not hard (this is the contrapositive technique we used when proving the
IND-CCA security of the CBC mode of operation; see Theorem 3.2).
Assume A is an IND-CPA attacker for ElGamal. We will use A to construct a DDH
adversary B. Simply put, B will pass along the Diffie-Hellman tuple it receives as part
of the DDH challenge and let A differentiate between "real" and "random" tuples.
Recall that the DDH problem is to differentiate a value g^z where z is chosen randomly
from a g^z where z = xy mod m, when given g^x and g^y.
Our adversary B is given g, X = g^x, Y = g^y, Z = g^(z ≟ xy) and will use A as follows:
1. Flip a coin as an oracle, choosing b ←$ {0, 1}.
2. Run A with the ElGamal parameters g and public key X.
3. When A makes a query (with, say, (m0 , m1 )), return (Y, mb · Z), exactly as
specified by ElGamal encryption—Y here is the one-time public key of the
“sender” and mb · Z is the ciphertext.
4. Let d be A's result—its guess of b.
5. If d = b, return 1 (this is a real DDH tuple, Z = g^(xy)); otherwise, return 0 (this
is a random DDH tuple).
The rationale is that because Z is an invalid ciphertext construction in the case of
a random DDH tuple, A should fail at breaking the scheme. Let's
justify the ddh-advantage. By definition,
Adv^ddh_{G,g}(B) = Pr[Exp^ddh-1_{G,g}(B) → 1] − Pr[Exp^ddh-0_{G,g}(B) → 1]
Let’s break this down into its component parts. Notice that the first probability
(differentiating real tuples correctly) is simply dependent on A’s ability to break
ElGamal encryption under the IND-CPA-cg variant (in other words, B acts exactly
?
like an IND-CPA-cg oracle by choosing b randomly and comparing d = b):
h i
Pr ExpG, g (B) → 1 = Pr ExpEG
ddh-1 ind-cpa-cg
(A) → 1
1 1
= + Advind-cpa
EG (A) from the proof of
Definition 3.3
2 2
On the other side, we have B's chance of failing to differentiate random tuples. By
construction, B outputs 1 when A is correct; but because A receives a random group
element in place of a valid ciphertext, it can do no better than guess, so that probability is 1/2. Putting these together,
Adv^ddh_{G,g}(B) ≥ 1/2 + 1/2 · Adv^ind-cpa_EG(A) − 1/2
                ≥ 1/2 · Adv^ind-cpa_EG(A)
D_B(c) · m1^(−1) = c · m1^(−1) · (g^(ab))^(−1)
                 = m_b · m1^(−1) · g^(ab) · (g^(ab))^(−1)
                 = m_b · m1^(−1)
If b = 1 (the right message was chosen), then this simply evaluates to the identity
element 1! In the other case, it does not, so this is a sufficient check for always
correctly differentiating which plaintext was encrypted.
c = g1^x1 · g2^x2
d = g1^y1 · g2^y2
h = g1^z
Property 11.1. If the decisional Diffie-Hellman problem for the group G is hard
and H is a cryptographically-secure hash function, then Cramer-Shoup is IND-
CCA secure.
Despite its strong security, this scheme is not used in practice because far more
efficient IND-CCA secure asymmetric algorithms exist, though they do not rely on
the difficulty of the discrete logarithm problem.
This means that raising a message m to the d·e power simply returns the original
message. The RSA protocol thus works as follows. A user reveals their public key
to the world: the exponent e and the modulus N. To send them a message m, you
send c = m^e mod N. They can find your message by raising it to their private key
exponent, d:
c^d mod N = (m^e mod N)^d mod N
          = m^(ed) mod N
          = m mod N
This is secure because you cannot determine (p − 1)(q − 1) from the revealed N and e
without exhaustively enumerating all possibilities2 (i.e. “factoring is hard”); thus, if p
and q are large enough, it’s computationally infeasible for an adversary to factor N .
11.5.1 Protocol
With the math out of the way, here’s the full protocol. Note that pk = (e, N ) is the
public key information, and sk = (d, N ) is the private key information (where d is
the only real “secret,” but both values are needed).
2
Notice that if an adversary knew both N and ϕ (N ), they could use this information to form a
simple quadratic equation which can obviously be solved in polynomial time. If they knew ϕ (N )
and e, then d is just the modular inverse of e under mod ϕ (N ).
Sending Given an intended recipient's public key, (N, e), and a message m < N,
simply compute and send c = m^e mod N. This can be calculated quickly using fast
exponentiation (refer to subsection 10.2.3).
Receiving Given a received ciphertext, c, to find the original message simply cal-
culate c^d mod N = m (again, use fast exponentiation).
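As a sketch, the whole protocol fits in a few lines of Python with toy parameters (the tiny primes are the same ones used in an earlier footnote's factoring example; pow(e, -1, phi) needs Python 3.8+). This is "textbook" RSA with no padding, so it inherits all the limitations discussed next:

from math import gcd

p, q = 2903, 1451                     # toy primes only: real keys use primes of 1024+ bits
N = p * q
phi = (p - 1) * (q - 1)
e = 17
assert gcd(e, phi) == 1
d = pow(e, -1, phi)                   # private exponent

def rsa_encrypt(m):                   # anyone can do this with the public (N, e)
    return pow(m, e, N)

def rsa_decrypt(c):                   # only the holder of d can undo it
    return pow(c, d, N)

m = 123456
assert m < N and rsa_decrypt(rsa_encrypt(m)) == m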
11.5.2 Limitations
For this to work, the message must be small: m ∈ Z∗N . This is why asymmetric
cryptography is typically only used to exchange a secret symmetric key which is
then used for all other future messages. There are also a number of attacks on plain
RSA that must be kept in mind:
• We need to take care to choose m such that gcd(m, N) = 1. If this isn't
the case, the key identity m^(ed) ≡ m (mod N) still holds—albeit this time by
the Chinese remainder theorem rather than Euler's theorem—but now there's a
fatal flaw. If gcd(m, N) ≠ 1, then it's either p or q. If it's p, then gcd(m^e mod N, N) = p
and now N can easily be factored (and likewise if it's q).
• Similarly, m can't be too small, because then it's possible to have m^e < N—the
modulus has no effect and directly taking the e-th root will reveal the plaintext!
• Even though small e are acceptable, the private exponent d cannot be too small.
Specifically, if d < (1/3) · N^(1/4), then given the public key (N, e) one can efficiently
compute d.
• Another problem comes from sending the same message multiple times via
different public keys. The Chinese remainder theorem can be used to recover the
plaintext from the ciphertexts; this is known as Håstad's broadcast attack.
Letting e = 3, for example, the three ciphertexts are:
c1 ≡ m^3 (mod N1)
c2 ≡ m^3 (mod N2)
c3 ≡ m^3 (mod N3)
m^(de) = m · m^(de−1)
       = m · m^(k(p−1))
       ≡ m · (m^(p−1))^k (mod p)
       ≡ m · 1^k (mod p)          by Fermat's little theorem
∴ m^(de) ≡ m (mod p)
We’re almost there; notice what we’ve derived: Take a message, m, and
raise it to the power of e to “encrypt” it. Then, you can “decrypt” it and
get back m by raising it to the power of d.
a^ϕ(N) ≡ 1 (mod N)
a^((p−1)(q−1)) ≡ 1 (mod N)
The rationale for "encryption" is the same as before: take d, e such that
de ≡ 1 (mod ϕ(N)). Then,
m^(de) = m · m^(de−1)
       = m · m^((p−1)(q−1)k)            def. of mod
       ≡ m · (m^((p−1)(q−1)))^k (mod N)
       ≡ m · 1^k (mod N)                by Euler's theorem
∴ m^(de) ≡ m (mod N)
[Figure: the OAEP-style padding pipeline—M padded with zeroes, mixed with randomness via G and H, then fed into RSA_{N,e} to produce C.]
Algorithm:
  m = M ∥ {0 . . . 0}
  r ←$ {0, 1}^k
  left = G(r) ⊕ m
  right = H(left) ⊕ r
  return RSA_{N,e}(left ∥ right)
The functions G and H above are modeled as random oracles.
The random oracle model assumes that all parties involved must access an oracle
that acts as a truly-random function. This does not match reality, since hashes can
be computed locally without consulting an oracle; furthermore, the hash functions
used merely imitate pseudorandom functions. However, it's still a useful construction and
key to many security proofs.
For more on the random oracle model (which we’ll be leveraging many
times throughout this part of the text), take a glance at chapter 17 in the
Advanced Topics section.
Figure 11.2: A visualization of the IND-CPA adversary in the multi-user setting,
where A knows n public keys and must output their guess, b′, corresponding to
the left-right oracles' collective choice of b.
The adversarial scenario for the n-IND-CPA experiment is visualized in Figure 11.2
and should be relatively intuitive: instead of a single oracle and public key, there are
n oracles and n public keys. All of them use the same b (that is, if they choose b = 0,
they will always encrypt the left message), and the adversary's goal is to differentiate
between left and right messages to output b′, their guess.
Theorem 11.3. For an asymmetric encryption scheme AE, for any adversary
A there exists a similarly-efficient adversary B that only uses one query to the
left-right oracle such that:
Adv^n-ind-cpa_AE(A) ≤ n · qe · Adv^ind-cpa_AE(B)
where n is the number of users and qe is the number of queries made to the
encryption oracles.
This is obviously much stronger than the generic guarantees of the previous theorem.
Unfortunately, no such improvements exist for RSA. For Cramer-Shoup encryption,
a better bound exists but only drops one of the linear terms.
up from some central, trusted authority. With IBE, senders don't need the receiver's public
key to encrypt; instead, they can encrypt directly to an arbitrary string owned by the receiver
(such as an email address), and the receiver presents that string to a central authority to obtain the
corresponding secret key.
Nothing is free, of course, and this last point is its fundamental flaw: the central
authority must be extremely trustworthy and safe in order to store (or derive) secret keys
(rather than public keys, which are much more innocuous).
attribute-based encryption In this variant, the secret key and the ciphertext of a
message are associated with attributes about the recipient, such as their age,
title, etc. Decryption is only possible if the attributes of the recipient's key
match those of the ciphertext.
Of course, this implies a bit of a chicken-and-egg problem much like in key
distribution: how does the sender learn the supposedly-hidden attributes of the
recipient without anyone else knowing them, and doesn't that mean the sender can
decrypt any ciphertexts intended for that recipient, since they know that
secret information?
homomorphic encryption This is an active area of research, especially among folks
aiming to do machine learning on encrypted data. A homomorphic encryption
scheme allows mathematical operations to be done on encrypted data securely
while also guaranteeing that a corresponding operation is done on the underlying
plaintext. More specifically, the encryption of an input x can be turned into
the encryption of f(x).
Interestingly enough, RSA is homomorphic under multiplication. Consider two
messages encrypted under the same public key: their product is
c1 · c2 = m1^e · m2^e = (m1 · m2)^e (mod N),
which is exactly the encryption of m1 · m2.
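A quick check of that multiplicative property, reusing the toy parameters from the RSA sketch above (this is plain RSA; padded variants like OAEP deliberately destroy this homomorphism):

p, q, e = 2903, 1451, 17              # same toy parameters as the RSA sketch above
N, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)

m1, m2 = 1234, 5678
c1, c2 = pow(m1, e, N), pow(m2, e, N)

# The product of two ciphertexts decrypts to the product of the two plaintexts (mod N).
assert pow((c1 * c2) % N, d, N) == (m1 * m2) % N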
Digital Signatures
Notation Much like before, digital signature schemes are described by a tuple
describing the way to generate keys, sign messages, and verify them: DS = (K, Sign, Vf).
The sender runs the signing algorithm using their private key, and the receiver
runs the verifying algorithm using the sender's public key.
Correctness — For every message in the message space, and every key-pair that can
be generated, a signature can be correctly verified if (and only if) it was output by
the signing function. Formally,
∀m ∈ MsgSp and ∀(pk, sk) ←$ K :
s = Sign(m, sk) ⇐⇒ Vf(m, s, pk) = 1
The signing algorithm can be randomized or stateful, but doesn’t have to be for
security. The message space is typically all bit-strings: MsgSp = {0, 1}∗ .
adversary shouldn’t be able to create a MAC that verifies a message unless it was
received from the oracle.
For signatures, we want to ensure the same principle: an adversary should not be
able to craft a valid signature for a message (that is, one verifiable by the oracle’s
public key) without the oracle signing it with its secret key.
[Figure: the UF-CMA game—the adversary A, given pk and a signing oracle Sign(sk, ·), outputs a candidate forgery (m, s).]
The only difference is that we no longer need a verification oracle, since anyone with
the signature, message, and knowledge of the public key should be able to validate the
sender. Above, (m, s) is a message-signature pair such that m was never queried to
the signature oracle. The UF-CMA advantage is defined in the same way.
12.2 RSA
Fascinatingly, the math behind RSA allows it to be used nearly as-is for creating
digital signatures.
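Concretely, the "hash-then-sign" idea that the FDH-RSA scheme below formalizes can be sketched like so (toy parameters again; reducing SHA-256 into Z_N is a stand-in for a true full-domain hash, not the real construction, and all names are mine):

import hashlib

p, q, e = 2903, 1451, 17                      # same toy RSA parameters as before
N = p * q
d = pow(e, -1, (p - 1) * (q - 1))

def H(message: bytes) -> int:
    """Toy full-domain-ish hash: squash SHA-256 into Z_N (fine for a sketch, not for real FDH)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    return pow(H(message), d, N)              # "decrypt" the hash with the private exponent

def verify(message: bytes, s: int) -> bool:
    return pow(s, e, N) == H(message)         # "encrypt" the signature and compare

s = sign(b"hello")
assert verify(b"hello", s)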
All of these could be formalized into proper UF-CMA adversaries, but they are omit-
ted here for brevity; they should be very straightforward.
Theorem 12.1. The FDH-RSA scheme is UF-CMA secure under the random
oracle model.
Specifically, let Krsa be a key generating algorithm for RSA and DS be the FDH-
RSA signature scheme. Then, let F be a forging adversary making at most qH
queries to its hash oracle and at most qS queries to its signing oracle. Then, there
exists an adversary I with comparable resources such that
Adv^uf-cma_DS(F) ≤ (qS + qH + 1) · Adv^owf_Krsa(I)
where owf refers to the strength of RSA as a one-way function—the ability to find
m given m^e without knowing the trapdoor d.
The intuition behind this result is worth understanding since it uses the random
oracle model which we’ve largely only alluded to previously. The proof proceeds by
leveraging the contrapositive as we’ve seen before: if an adversary F exists that can
break UF-CMA security of FDH-RSA, then another adversary I can use F to break
RSA's one-wayness. Let's walk through how this happens.
Security Proof
In I’s case, it receives three values—the public key tuple (N, e) and the ciphertext
y = me mod N —and its goal is to find x. It models itself as the both the signing
and hash oracle for F. Given that F can crack UF-CMA security, it will return some
(M, H(M )d ), where M was never queried to the signing oracle. This means that we
know for a fact that F needs to make at least one query to the hash oracle (to hash its
forged M ); when this occurs, I will instead provide y as the hash value rather than
some “true” H(M ).1 The returned signature is then y d mod N , which is the original
message that was given as a challenge to I!
This devious plan has a little bit of resistance, though: it’s quite likely that F needs
to make legitimate queries to the signing and hashing oracles in order to form its
forgery M , yet I cannot emulate valid signatures! Or can it. . . ?
Recall the second trick we used to break UF-CMA for plain RSA: given an arbitrary
signature, we can derive its original message. Now, I will simply output random
signatures for queries to the signing oracle and track which messages resulted in which random
signatures. This way, it can respond with the same signature when given the same
message. Furthermore, this means it can respond with a consistent, legitimate-looking hash
for a message: for a message M, its signature should be H(M)^d mod N, so I sets
H(M) := s^e mod N, since then H(M)^d = s^(ed) ≡ s (where s is the randomly-chosen signature for M).
Implication
Because of the (qS + qH + 1) factor in the bound, one needs many more bits to ensure the same level of
security. For example, to achieve the same security as a 1024-bit RSA key, one needs
3700 bits for FDH-RSA (assuming the GNFS is the best algorithm). In practice, this
advice is rarely followed because despite the theoretical bound, no practical attacks
exist.
1
This is the key of the random oracle model: rather than computing the hash “locally” (i.e. using a
known hashing algorithm), the adversary F must instead consult the oracle which is controlled by
I, which doesn’t necessarily need to provide it with values consistent with the properties expected
by a scheme.
Theorem 12.2. The PSS scheme is UF-CMA secure under the random oracle
model with the following guarantees (where the parameters are the same as in
Theorem 12.1, and S is the number of bits of randomness):
Adv^uf-cma_PSS(F) ≤ Adv^owf_Krsa(I) + (qH − 1) · qS / 2^S
12.3 ElGamal
We can build a signature scheme from discrete log-based encryption schemes like
ElGamal just like we can with factoring schemes like RSA.
Recall that in the ElGamal scheme, we start with a group G = Z∗p = ⟨g⟩ where p
is a prime, and the secret key is just a random element x ←$ Z∗_(p−1). The public
key is then its corresponding exponentiation: pk = g^x mod p. Then, we also need
a bitstring-to-group hash function as before, H : {0, 1}* → Z∗_(p−1). With that, the
signing and verification algorithms are as follows:
Correctness This likely isn't immediately apparent, but when we recall that p − 1
is the order of Z∗p, it emerges cleanly:
pk^r · r^s ≡ g^(xr) · g^(ks) (mod p)
           ≡ g^(xr) · g^(ks mod (p−1)) (mod p)                       see (10.1)
           ≡ g^(xr) · g^(k(k^(−1)(m−xr)) mod (p−1)) (mod p)          now we can substitute for s
           ≡ g^(xr) · g^((m−xr) mod (p−1)) (mod p)                   cancel the inverse
           ≡ g^m (mod p)                                             combine exponents
Security The security of ElGamal signatures under UF-CMA has not been proven,
even when applying the additional random oracle assumption that the other schemes
made. There are proofs for variants of the scheme that are not used in practice, but
(apparently?) they are “close enough” to grant ElGamal legitimacy.
² Recall that the order of a group element a is the smallest integer n fulfilling a^n = 1 (see para-
graph 10.3). In this case, it means g^q ≡ 1 (mod p).
r ≟ v
g^k mod p ≟ (g^(u1) · pk^(u2) mod p) (mod q)
          ≡ (g^(mw mod q) · pk^(rw mod q) mod p) (mod q)
          ≡ (g^(m·s^(−1) mod q) · pk^(r·s^(−1) mod q) mod p) (mod q)
          ≡ (g^(m·s^(−1) mod q) · g^(x·r·s^(−1) mod q) mod p) (mod q)        pk = g^x mod p
          ≡ (g^((m·s^(−1) + x·r·s^(−1)) mod q) mod p) (mod q)                combine exponents
          ≡ (g^(s^(−1)(m + xr) mod q) mod p) (mod q)                         factor out s^(−1)
          ≡ (g^((k^(−1)(m + xr))^(−1)(m + xr) mod q) mod p) (mod q)          substitute s
          ≡ (g^((k^(−1))^(−1) mod q) mod p) (mod q)                          a(ab)^(−1) = b^(−1); here a = m + xr and b = k^(−1)
          ≡ (g^(k mod q) mod p) (mod q)
          ≡ g^k mod p (mod q)                                                since k ∈ Z∗q, mod q has no effect
This version of DSA works only with groups modulo a prime, but there is a version
called ECDSA designed for elliptic curves.
Security The security of DSA under UF-CMA was not proven until 2016 under the
hardness of discrete log and random oracle assumptions. The proof also confirmed
DSA’s superiority in terms of efficiency: a 320-bit signature has security on-par with
a 1024-bit signature in ElGamal.
Correctness Notice the odd difference in this scheme: many values (like R) are
not calculated under a modulus. This makes correctness trivial to verify:
R · pk^c = g^r · (g^x)^c = g^(xc+r) = g^s
Security The Schnorr signature scheme works on arbitrary groups as long as they
have a prime order. It has been proven to be secure under UF-CMA with the random
oracle and discrete log assumptions for modulo groups,3 and is as efficient as ECDSA
with a 160-bit elliptic curve group.
3
It’s worth noting that this security proof is pretty “loose.”
group signatures A group of users holds a single public key. Each user can anony-
mously sign messages on behalf of the group; their identity is hidden except
from the manager of the group who controls the joining and revocation of group
“members.” A similar variant called ring signatures drops the manager and
allows members to always be anonymous.
blind signatures This variant allows users to obtain signatures from a signer with-
out the signer knowing what it was that they signed. This may seem odd,
but it is actually very useful in many applications like password strengthening or
anonymizing digital currency (through a centralized bank, not cryptocurrency).
Concretely for our example, s* = H(M)^(x1+x2+x3). Our verification algorithm for multi-
signatures is:
Vf(M, s1, s2, s3, pk1, pk2, pk3) = Vddh(g, pk1 · pk2 · pk3, H(M), s*)
Neither M nor H(M ) was revealed to the signer at any point: r is a random number,
and a meaningful value multiplied by a random number still looks like a random
number. Further, this scheme can be proven to be unforgeable.
12.7 Signcryption
Our final asymmetric construction—a primitive called signcryption—will achieve all
three of our security goals: message confidentiality (contents are private), integrity
(contents are unchanged), and authenticity (contents are genuine).
To fulfill these requirements, a simple asymmetric scenario is not enough: now, both
the sender and the recipient must have public key pairs. As such, signcryption must
be considered in the multi-user setting. Our notions of integrity from symmetric
cryptography (see INT-CTXT) and privacy (IND-CCA and friends) transfer over in
a similar fashion.
Though additional attacks need to be considered (one called “identity fraud” that
wasn’t present in the separate encryption or signature worlds), the security result is
reminiscent of the one for Hybrid Encryption:
This new SUF-CMA security notion (S = strongly) is stronger than UF-CMA. It’s
satisfied by all practical schemes, and is equivalent to UF-CMA for deterministic
schemes. The encrypt-then-sign construction is exactly what it sounds like: first, we
do encryption (see schemes in the previous chapter), then sign the result.
To achieve security in the multi-user model, users need to take extra precautions: add
the public key of the sender to the message being encrypted, and add the public
key of the receiver to the message being signed.
Secret Sharing
In this final chapter, we'll discuss the various nuances in how the crucial information
that we've been relying upon in the previous chapters is distributed. This includes
things like authenticating public keys, securely sharing secrets among groups, and using
session keys. We'll also briefly discuss some potpourri topics like passwords and PGP.
communication channel with the CA, which can be achieved by any asymmetric
scheme since pkCA is known. Registration is then done as follows:
• The user sends (idU , pkU ) to the CA.
• The CA needs to verify that the user truly knows the associated secret key, so
it generates and sends a random challenge R for the user to sign.
• The user signs and sends the challenge, s = Sign(skU, R).
• Finally, the CA checks the validity of the signature: Vf(R, s, pkU) ≟ 1.
After the CA determines that this is a genuine user, it issues a certificate: a collection
of information about the user that is signed by the CA itself:
The user can obviously verify the validity of the certificate as a sanity check. Now,
they can present the certificate to anyone who requests their public key to show that
it is genuine and authentic: the recipient simply independently verifies the certificate
against pkCA .
13.1.2 Revocation
The certificate authority does more than act as an authentic repository of public
keys. It also handles key revocation: if a user’s key is compromised, rotated, or
otherwise no longer trusted before its expiration date, they can notify the CA who
will update its certificate revocation list (CRL) accordingly. This list is public, and
anyone verifying a certificate should ideally cross-reference it against this list.
Practically, though, users will instead download a copy of the CRL periodically. Nat-
urally, this human-dependent element has a flaw: between the time of key compro-
mise, revocation, and the updated CRL, the attacker can wreak havoc impersonating the
user, signing and encrypting malicious messages.
Shockingly, 8–20% of all issued certificates are revoked. This means CRLs can grow
very large and unwieldy; revocation is one of the biggest pain points of widespread
PKI adoption.
Protocol
Let’s describe the (t, n) scheme formally. We first choose p to be a large prime, and
our secret is any z ∈ Zp . Then,
• Choose t − 1 random elements: a1 , a2 , . . . , at−1 ∈ Zp .
• The secret will be denoted a0 = z.
• Now, view these elements as coefficients of a polynomial (a code sketch of both share generation and reconstruction follows below):
f(x) = a0 + a1·x + a2·x^2 + . . . + a_(t−1)·x^(t−1)
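Here is the sketch promised above: share generation plus Lagrange reconstruction at x = 0, over a Mersenne-prime field chosen purely for convenience (all names and the choice to evaluate shares at x = 1, . . . , n are mine):

import secrets

def make_shares(secret, t, n, p):
    """Evaluate a random degree-(t-1) polynomial with f(0) = secret at x = 1..n."""
    coeffs = [secret] + [secrets.randbelow(p) for _ in range(t - 1)]
    f = lambda x: sum(a * pow(x, i, p) for i, a in enumerate(coeffs)) % p
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares, p):
    """Lagrange-interpolate f(0) from any t shares (pairs (x_i, f(x_i)))."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % p              # product of (0 - x_j)
                den = (den * (xi - xj)) % p        # product of (x_i - x_j)
        total += yi * num * pow(den, -1, p)        # c_i * s_i
    return total % p

p, secret = 2**127 - 1, 123456789                  # a Mersenne prime as the field
shares = make_shares(secret, t=3, n=5, p=p)
assert reconstruct(shares[:3], p) == secret        # any 3 of the 5 shares suffice
assert reconstruct(shares[2:], p) == secret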
Weakness
Unfortunately, there’s a fundamental flaw in this scheme: if any parties cheat during
reconstruction, the true secret cannot be recovered and there’s no way to detect
cheating; verifiable secret sharing schemes are designed to get around this problem.
2
This is the same Shamir who is the “S” in RSA!
(Remember, you can usually click a term to jump back to its first usage.)
First, let's briefly recall how ElGamal works. If the public keys of the sender and
receiver are S = g^s and R = g^r, respectively, and m is the intended message, then
the sender sends the pair (v, c) = (S, m · R^s). The recipient then decrypts via the
inverse (v^r)^(−1), since c · (g^(rs))^(−1) = m.
For the thresholded case, we craft our secret using Shamir's scheme, above. Namely,
we denote s as our secret, create a polynomial, and distribute its n distinct evaluations
as the secret shards s1 through sn. Thus, Lagrange interpolation tells us that
s = Σ_{i=1}^{n} ci·si for some set of ci's.
The secret is broadcast by masking it as a public key: h = g^s. This is now the
entire group's public key, but they don't know s. Each member then commits to their
share of s by broadcasting hi = g^(si).
The group can now collaborate to decrypt secrets encrypted under s without ever learning
it. Specifically, consider the pair (v, z) = (g^r, m·h^r), some message m encrypted under
an arbitrary g^r for the group.
Each member Ai broadcasts wi = v^(si). Then, one of the servers³ (or some trusted third
party) can combine these pieces to recover the plaintext as follows:
m = z / ∏_{i=1}^{n} wi^(ci) = z / (w1^(c1) · w2^(c2) · . . . · wn^(cn))
Again, here the ci's are the coefficients from Lagrange interpolation. Notice that at
no point in time was it possible to derive the original secret s; only the "safe" value
h = g^s is ever derivable (see the correctness proof, below).
3
We’ll proceed by assuming that all of the members are trustworthy, but at this point, they could’ve
used a zero-knowledge proof to demonstrate that dlogg (hi ) = dlogv (wi ). This would prove to the
other participants that we used the share si rather than some other value to generate wi .
[Figure: Mallory in the middle—she replaces Bob's true public key B = g^b with B′ = g^(m2) (and similarly replaces Alice's).]
Now, Bob forms the secret g^(b·m1), and Alice forms the secret g^(a·m2). It seems like they
can't communicate effectively: attempting to decrypt Alice's messages with Bob's
secret would result in gibberish or failure! However, if Mallory continues to facilitate
communication, she can ensure that Alice and Bob receive valid ciphertexts while
⁴ Note that even though the DDH problem is not hard under Z∗p, it's still often used in practice.
This is because of a small modification that enables provable security of the secret: the true mutual
secret is instead the hashed g^(xy).
Property 13.1. Asymmetric key exchange schemes cannot be made secure against
active attackers when starting from scratch.
13.4 Passwords
Despite all of the fantastic, provably-secure cryptographic methods and schemes we’ve
studied in the last 112 pages, human-memorable passwords remain the weakest link
in many security architectures.
Dictionary attacks—brute-force methods that simply compare dictionary-like pass-
words against the hashes of a compromised server—are still very effective.⁵ Studies
show that despite efforts to complicate requirements (which are in and of them-
selves largely ineffective, see Figure 13.1), many passwords are still just words in the
dictionary.
The key to a secure future is simply to move away from passwords: the Fast IDentity Online
(FIDO) alliance is moving towards an ambitious idea of secure authentication based on secure
devices ("something you have") and biometrics ("something you are") rather than on
passwords ("something you know").
⁵ Note that "salting" a password hash—appending and storing a random value so that identical
passwords have different hashes, H(p ∥ r1) ≠ H(p ∥ r2)—only helps against mass cracking of
compromised passwords. Salts have no bearing on the time it takes to crack a specific user's
password.
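For illustration only, salted hashing might look like the sketch below—real systems should use a deliberately slow KDF (bcrypt, scrypt, Argon2) rather than a single SHA-256 call, and the function names here are mine:

import hashlib, secrets

def hash_password(password: str):
    """Store (salt, hash); identical passwords get different hashes thanks to the salt."""
    salt = secrets.token_bytes(16)
    digest = hashlib.sha256(password.encode() + salt).digest()   # H(p || r)
    return salt, digest

def check_password(password: str, salt: bytes, digest: bytes) -> bool:
    return hashlib.sha256(password.encode() + salt).digest() == digest

salt1, h1 = hash_password("hunter2")
salt2, h2 = hash_password("hunter2")
assert h1 != h2                                  # same password, different stored hashes
assert check_password("hunter2", salt1, h1)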
Figure 13.1: The classic XKCD comic demonstrating the futility of memorizing
“complex” passwords which have low entropy relative to the ease of long passwords
that are far harder to guess.
Epilogue
PART III
Advanced Topics
This part of the guide is dedicated to advanced topics in cryptography that I've
been researching myself recently out of interest. It has no relation to the official
course content. It's worth noting that log n in this section refers to the logarithm
base 2, so log 8 = log2 8 = 3.
Happy studying!
Contents
15 Secure Cryptographic Primitives
16 Commitments
17 Random Oracle Model
18 Zero-Knowledge Proofs
Secure Cryptographic Primitives
It’s worth noting that even though the transformations from one primitive to another
exist (and we’re about to go over them) and are efficient (i.e. polynomial-time),
they’re largely impractical and unweildy. Their purpose is to fundamentally prove
that security is possible: if OWFs exist, then PRPs exist, which means we can
build schemes that have extremely high levels of security. In practice, we craft specific
primitives for our needs. Instead of actually deriving pseudorandom permutations
from one-way functions, we directly assume that AES is a valid PRP (see the AES
security bounds described much earlier in (3.2)). Still, the following are important
derivations for a deepening our theoretical understanding of cryptography.
As always, this is a synthesis of other resources, so feel free to refer to the Refer-
ences themselves (organized in a subjective order of “easy-to-read”ness) for further,
alternative explanations.
What does "stretch 1" mean? It quite literally means extending the OWP to contain
one more bit of (pseudo)randomness:
S (n bits) =⇒ f(S) ∥ b(S)   (n bits ∥ 1 bit)
The first n output bits will be uniformly ("truly") random, while the latter appended
bit b(S) should "look" random.²
How can we acquire such a bit? It's entirely possible to have an OWF with a predictable
bit—consider the trivial construction of just appending the first bit to a true OWF:
f(x1, . . . , xn) = x1 ∥ g(x2, . . . , xn)
But this obviously can’t be done for all OWFs or all bits of its output. It’d be helpful
for us to first define which bits of the OWF are “hard” or “easy” to predict:
We say that the Boolean function b : {0, 1}* → {0, 1} is a hardcore predicate
of a function f : {0, 1}* → {0, 1}* if no efficient adversary, given f(x) for a random
x, can predict b(x) with probability non-negligibly better than 1/2.
1
This is a computational problem often discussed in the context of P ≠ NP; see my notes on the
topic for a deeper discussion if you're interested.
2
"Looking random" means being computationally indistinguishable from truly random output given f(S), as
we've discussed at length when talking about Random Functions.
To put it simply, a “hardcore bit” in an OWF’s output is one that cannot be reliably
predicted. Under the contrived f and g above, the first bit x1 would not be a hardcore
bit; the other bits provide a reliable source of unpredictability.
Proof. The full proof can be found online, [14, 13] but its basic idea is that a random
linear combination of the bits of x should be hard to compute. First, we can extend
the function to be
g(x, r) = (f(x), r) where |x| = |r|
Under this extension, g is still one-way. Then, we can derive a b that fits the definition
of a hardcore predicate as follows:
b(x, r) = ⟨x, r⟩ = Σ_{i=1}^{n} xi · ri mod 2
That is, we take the inner product modulo 2 for any bitstrings x, r ∈ {0, 1}^n. This
predicate will always be hardcore.
Theorem 15.2. Let f be a one-way permutation with the hardcore bit b. Then,
G(S) = f(S) ∥ b(S) is a secure pseudorandom generator.
Suppose there exists a PPT adversary A and a non-negligible E(n) such that
Pr_{S ←$ {0,1}^n}[A(G(S)) = 1] − Pr_{r ←$ {0,1}^(n+1)}[A(r) = 1] ≥ E(n)    (15.1)
That is, the adversary can reliably differentiate between random bits and bits gener-
ated by the PRG.
Claim. Pr_S[A(f(S) ∥ b(S)) = 1] − Pr_S[A(f(S) ∥ b̄(S)) = 1] ≥ 2 · E(n),
where b̄ := 1 − b (i.e. the flipped bit) and S ←$ {0, 1}^n as before (i.e. a random
n-bit string).
Proof. We prove this by first rewriting the distribution of random bit strings,
r ←$ {0, 1}^(n+1):
{r : r ←$ {0,1}^(n+1)} ≡ {r ← r1 ∥ r2 : r1 ←$ {0,1}^n, r2 ←$ {0,1}}
                       ≡ {r ← f(S) ∥ r2 : S ←$ {0,1}^n, r2 ←$ {0,1}}         (f is a permutation)
                       ≡ {r ← f(S) ∥ r2 : S ←$ {0,1}^n, r2 ←$ {b(S), b̄(S)}}
                       ≡ {r ← f(S) ∥ b(S)} or {r ← f(S) ∥ b̄(S)}, w/ prob. 1/2 each
Thus, the adversary's chance of a correct guess can be expressed as the sum of
the chance of the two individual guesses:
Pr_r[A(r) = 1] = 1/2 · Pr_S[A(f(S) ∥ b(S)) = 1] + 1/2 · Pr_S[A(f(S) ∥ b̄(S)) = 1]
E(n) ≤ Pr_S[A(G(S)) = 1] − 1/2 · Pr_S[A(f(S) ∥ b(S)) = 1] − 1/2 · Pr_S[A(f(S) ∥ b̄(S)) = 1]
     ≤ Pr_S[A(G(S)) = 1] − 1/2 · Pr_S[A(G(S)) = 1] − 1/2 · Pr_S[A(f(S) ∥ b̄(S)) = 1]
     ≤ 1/2 · Pr_S[A(f(S) ∥ b(S)) = 1] − 1/2 · Pr_S[A(f(S) ∥ b̄(S)) = 1]
     ≤ 1/2 · ( Pr_S[A(f(S) ∥ b(S)) = 1] − Pr_S[A(f(S) ∥ b̄(S)) = 1] )
With this claim proven, we can now construct an adversary B that can break the
hardcore bit b (that is, predict b(S) from f(S)). Given the input y = f(S) ∈ {0, 1}^n:
• Choose a random bit, c ←$ {0, 1}.
• Run A(y ∥ c), then:
  – If A outputs 1, we output c.
  – Otherwise, output the flipped bit c̄.
What’s the success rate of our constructed adversary B?
1
Pr [B(f (S)) = b(S)] = · Pr [B(f (S)) = b(S)) | c = b(S)] +
S 2 S,c
1
· Pr B(f (S)) = b(S)) | c = b(S) Law of Total
Probability
2 S,c
1
· Pr [A(f (S) k b(S)) = 1] + Pr A(f (S) k b(S)) = 0
=
2 S S
by con-
struction
1
= · Pr [A(f (S) k b(S)) = 1] + 1 − Pr A(f (S) k b(S)) = 1
2 S S
by
definition
1 1
= + · Pr [A(f (S) k b(S)) = 1] − Pr A(f (S) k b(S)) = 1
2 2 S S
1 by Claim
≥ +E 15.1
2
Thus, we’ve constructed an adversary B that can reliably break the hardcore bit b,
which contradicts the definition of b being a hardcore bit. By this contradiction, the
efficient adversary A cannot exist.
Whew. That was really something. If your eyes glazed over the minute you read
the word Proof., don’t worry. The important part was the theorem itself: we can
stretch a PRG by one bit using the hardcore bit that Goldreich-Levin has
guaranteed to exist.
[Figure: the iterated generator—each application of G maps the current state S_i to the next state S_(i+1) plus an output bit, i.e. S0 → S1 → · · · → S_ℓ while emitting b1, b2, . . .]
It should come as no surprise that this diagram is nearly identical to the stateful
generator visualized in chapter 7 when we discussed how PRGs are used. Our use of
Si makes sense now: it’s the state of the generator, and bi are obviously the output
bits.
We can also write G(s) → (s0 , s1 ) as the pair (G0 (s), G1 (s)). That is, when operating
on the input s, G0 is the “first half” of the output and G1 is the “second half.”
Claim 15.1. Under this construction, G can be viewed as a secure PRF in the
following form:
F : {0, 1}^ℓ × {0, 1} → {0, 1}^ℓ
where the first argument is the key, the second is the (single-bit) domain, and the
output is the ℓ-bit range.
In other words, F(s, 0) would treat s as the key and 0 as the input bit, mapping to
an ℓ-bit (pseudo)random bitstring: G0(s), and likewise for F(s, 1) = G1(s).
Obviously, having a single-bit domain is a severe limitation. The key question is
then: how do we generalize this to an arbitrary domain?
The answer can be found by looking outside: trees! Our construction will look some-
thing like this:
F(s, x1 x2 · · · xn) = G_(xn)(G_(xn−1)(. . . G_(x1)(s) . . .))
[Figure: the first level of the tree—the root seed S branches into S0 = G0(S) and S1 = G1(S), and each of those branches again, one level per input bit.]
The proof that this results in a secure PRF can be found in Ch. 4.6 of Boneh &
Shoup’s book. [3]
15.3 Conclusion
Almost every larger cryptographic construct in Part 1 relies on the existence of a
secure PRF, which we just showed how to create from nothing but a secure one-way
function. We have candidate functions that appear "hard to invert" (and we've shown
various bounds for them), but fundamentally proving that no algorithm can
achieve a non-negligible advantage in inverting them remains an open question.
To reiterate, in practice we don't use the transforms described above to go from OWF
to PRF, and instead rely on the security of AES to be a PRF (largely for efficiency
reasons), but the idea that if all else failed we could still construct secure primitives
is a reassuring one.
15.3.1 References
1. Eskandarian, S., Kogan, D., and Tramèr, F. Basic primitives; From OWFs to PRGs
and PRFs. In Topics in Cryptography. ch. 1–2. [Online]
2. Trevisan, L. One-way functions. In CS 276. ch. 11. [Online]
Commitments
Making commitments can be hard for humans, but it's actually pretty easy in cryp-
tography. It's a fundamental necessity in a lot of applications, such as zero-
knowledge proofs. In this (short) chapter, we'll formalize the notion of a commitment
and work through some implementations.
A commitment allows one to commit to a particular message without revealing it,
like putting it into a locked box. Once committed, they can announce the key to
unlock the box and reveal the hidden message. Obviously, it should be impossible to
reveal a different message than the one committed. Similarly, the “locked box” should
reveal nothing about the message without the secret. Let’s formalize this.
16.1 Formalization
Define a function commit : M × R 7→ C, that is, a function that maps a message
space and some randomness to some commitment space. Then, commit(m, r) → c
commits m with the secret r to the commitment c. We “open” the commitment by
revealing (m, r).
A good commitment scheme holds the following properties:
• hiding: Whoever sees c learns nothing about m.
Much like with perfect security (2.1), we'll use a statistical notion of hiding: the
commitment for m0 should be equally likely to have occurred for m1. Specifi-
cally,
∀m0, m1 ∈ M; r ←$ R :
Pr[commit(m0, r) = c] ≈ Pr[commit(m1, r) = c]
• binding: Whoever made c cannot later open it to a different message. That is, it
should be infeasible to find (m0, r0) ≠ (m1, r1) such that commit(m0, r0) = commit(m1, r1).
A couple of off-hand schemes that may come to mind but are not valid commitments
include:
• just using a plain hash function H(m). Since the message space is public, the
hiding property isn't preserved: someone could just enumerate all m ∈ M
and find the matching hash.
• just using AES directly by treating r as the key:
commit(m, r) = AES_r(m)
This is not a valid commitment scheme either, because despite fulfilling the
hiding property, it does not necessarily provide binding: you can have secure
schemes decrypt the same ciphertext to different messages depending on the
key.1
Claim 16.1. The Pedersen commitment scheme provides hiding because the com-
mitments are uniformly distributed in the group G.
1
https://crypto.stackexchange.com/a/72199
2
The discrete log between g and h should not be known, though I’m not entirely sure what that
means.
Proof. For any value m ∈ M, there is exactly one r such that c = g^m · h^r. First, define
c = g^a, h = g^b (this is valid by the definition of the generator g); then,
c = g^m · h^r =⇒ g^a = g^m · (g^b)^r = g^(m+br)
              =⇒ a = m + br
              =⇒ r = (a − m) / b
Claim 16.2. The Pedersen commitment scheme provides binding under the hard-
ness assumption of the discrete log problem in the group G.
That is, given h ∈ G, it's hard to find x such that h = g^x.
g^(m0) · h^(r0) = c = g^(m1) · h^(r1)
g^(m0) · (g^x)^(r0) = g^(m1) · (g^x)^(r1)
m0 + x·r0 = m1 + x·r1  =⇒  x = (m1 − m0) / (r0 − r1)
If A can break binding, by definition it would give us (m0 , r0 ) and (m1 , r1 ) that
generate identical commitments. Plugging them into the above relationship breaks
the discrete log challenge.
Thus by the contrapositive, if the discrete log problem is hard in the group G, the
Pedersen commitment scheme is binding.
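A toy sketch of the scheme (the group, and especially the way h is derived from a known exponent, are for illustration only—in practice h must be generated so that nobody knows log_g(h)):

import secrets

# Toy group: the order-q subgroup of squares in Z*_p for the safe prime p = 2q + 1.
p = 467
q = (p - 1) // 2
g = 4                          # 2^2, a generator of the subgroup of order q
h = pow(g, 123, p)             # DO NOT do this in practice: 123 is a known discrete log

def commit(m: int, r: int) -> int:
    return (pow(g, m, p) * pow(h, r, p)) % p

m = 42
r = secrets.randbelow(q)
c = commit(m, r)

# Opening: reveal (m, r) and let anyone recheck the commitment.
assert commit(m, r) == c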
Random Oracle Model
Our use of this security model has been prolific throughout the main text, but
we never really dove into its intricacies and implications. Its general gist was
basically buried in a footnote, summarizing the basic idea into a single sentence: in
the random oracle model, hash functions are replaced by a truly random oracle that
every party can only query—and whose outputs a security reduction is free to "program."
This is generally unrealistic in practice, but forms a (controversial) theoretical framework
for evaluating a scheme's security.
We build security models that are supposed to reflect reality; if we can prove some-
thing within that model (like a security bound), then it should hold true in the real
world. However, often we miss things in our model: consider the difference between
being IND-CCA secure and IND-CPA secure. We typically adjust models to better-
reflect the real world (often in response to the discovery of new security holes). Many
of our security proofs relied on the “standard model,” but others relied on the random
oracle model to prove security.
Fundamentally, as we already stated, the thing that differentiates the random oracle
model is that an oracle answers queries to hash function evaluations. In practice, we
replace the random oracle with a suitable hash function and call it secure. Otherwise,
imagining a cryptographic application calling out to an API server for hash function
evaluations feels. . . a little unrealistic.
We’ll prove that f is PRF secure under the random oracle model assuming the hard-
ness of the decisional Diffie-Hellman problem. Recall the DDH assumption: for a
cyclic group G of order q and a generator g,
{ (g, g^x, g^y, g^(xy)) : x, y ←$ Zq }   ≈   { (g, g^x, g^y, g^z) : x, y, z ←$ Zq }   (computationally indistinguishable)
Namely, the ??? above can be whatever B wants, and that’s exactly what B will
leverage to break DDH: B redefines the hash function:
H(m):
    α ←$ Zq
    return X^α

f(K, m):
    α ←$ Zq
    H(m) := X^α
    return Z^α
(Obviously, if you called H(m) before f (K, m), f would use the α from the H(m) call rather than
generating a new one.)
Notice that now, for some arbitrary n, f(K, n) = g^(zα) and H(n) = g^(xα). If we're given
z = xy, then evaluating f is equivalent to calculating g^(xyα), whereas if z is chosen
randomly, then evaluating f is equivalent to choosing a random group member, since
g^(zα) ≡ γ ←$ G.
17.2 Conclusion
The random oracle model is a heuristic model that often gives us simpler / faster
schemes than what we have in the standard model. Despite that, it’s possible to
craft some (contrived) schemes that are secure in the random oracle model but are
completely broken in the standard model regardless of the hash function being used.
The “trick” we used above in “programming” the oracle to return specific results
is controversial among cryptographers: such a model does not really reflect reality.
However, it’s undoubtedly a useful framework, and in some sense the best we can do
for certain schemes.
Remember that in practice, the oracle is just replaced with a specific hash function;
just don’t use SHA256! It (and any hash function reliant on the Merkle-Damgård
transform, actually) is vulnerable to length extension attacks (see this and this for
more on that).
17.2.1 References
1. Eskandarian, S., Kogan, D., and Tramèr, F. Commitments & the random oracle model.
In Topics in Cryptography. ch. 3. [Online]
2. Boneh, D., and Shoup, V. Random oracles: a useful heuristic. In A Graduate Course in
Applied Cryptography. January 2020, ch. 8.10.2, pp. 322—325. [Online]
Zero-Knowledge Proofs
The beauty of a zero-knowledge proof (or ZKP) is that one party can prove
something to another without revealing any concrete information about the actual
thing they’re proving. In other words, they prove knowledge of some fact without
revealing it.
¹ For a review of some of the problems within the NP-complete complexity class, I recommend both
the Computational Complexity chapter in my notes for Graduate Algorithms as well as Chapter 8
of the textbook, Algorithms, on which it’s based.
The key is that the prover can’t think of a valid solution on the spot without having
actually solved the problem already. It’s been shown, in fact, that any NP-complete
problem has a zero-knowledge proof formulation. [10]
In other words: the “knowledge” must be hard to compute but easy to verify.
This is what prevents “cheating” on Alice’s part.
18.1.2 Formalization
Knowledge proofs are pretty easy to define formally. Let’s refer to the prover-Alice
that actually has the knowledge as A*, and to the prover-Alice that is faking the
knowledge as Ã. In general, the prover is A and can be either of the Alices. Our
verifier, Bob, will just be B.
The “problem” in question is the predicate P(Q, S), where Q is a challenge query
proposed by B and S is the solution output by A.
Formally, then, a verifier should act as follows:
1. If A = A*, B should accept its proofs with overwhelming probability for all q
for which P(q, S) is satisfiable.
2. If A = Ã, B should accept its proofs with negligible probability for all q.
To put it simply, B should accept all valid proofs from A* and reject all proofs from
Ã.
² Note that problems which are NP-hard but not in NP don’t fall into this category: for those,
there’s no guarantee that a candidate solution can even be verified efficiently. The asymmetry of
NP-complete problems (hard to find a solution, easy to check one) is exactly what makes these
(non-)interactive knowledge proofs possible.
18.2 Zero-Knowledge
A knowledge proof might reveal details about what Alice knows. In the above example,
the secret information was obviously revealed to the verifier. However, wouldn’t
it be nice if we could avoid divulging the secrets while simultaneously proving that we
know them? That would be a zero-knowledge proof! In general, Bob should learn
nothing at all from Alice about her “truth” claim in a ZKP.
Alice must compute these roots quicker than Bob can brute-force them. Alice calculates
the partial roots modulo each prime using brute force on the factors (not their product),
then combines them into a root modulo N via the Chinese remainder theorem. This can
be done in polynomial time, whereas brute force takes time exponential in the number
of digits in N. Bob can validate the roots trivially.
Of course, Bob may be suspicious of Alice’s ability to provide roots for just these 4
numbers (after all, she may have gotten lucky: there are only 143 possible values).
Bob can then test Alice again, over and over until he’s satisfied of her knowledge.
More formally, though there may be some small chance that Alice can cheat and fake
her knowledge, with enough trials this chance gets closer and closer to zero. At no
point in time does Alice actually reveal p or q, yet Bob is satisfied that Alice knows
them simply because there’s no way for Alice to do this so efficiently without knowing
them.
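The specific modulus and challenge values were set up earlier in the example; as a generic illustration of the roots-then-CRT step, here is a Python sketch with toy primes of my own choosing, both congruent to 3 (mod 4) so that square roots modulo each prime have a closed form.

    import random

    # Toy primes of my own choosing (not the ones from the example); needs
    # Python 3.8+ for modular inverses via pow(x, -1, q).
    p, q = 11, 19
    N = p * q

    def sqrt_mod_prime(a, prime):
        """A square root of a modulo a prime ≡ 3 (mod 4), or None if none exists."""
        r = pow(a, (prime + 1) // 4, prime)
        return r if (r * r) % prime == a % prime else None

    def crt(rp, rq):
        """Combine x ≡ rp (mod p) and x ≡ rq (mod q) into x (mod N)."""
        return (rp + p * (((rq - rp) * pow(p, -1, q)) % q)) % N

    def alices_answer(c):
        rp, rq = sqrt_mod_prime(c % p, p), sqrt_mod_prime(c % q, q)
        if rp is None or rq is None:
            return None              # c is not a square modulo N
        return crt(rp, rq)

    # Bob's side: pick challenges that are squares mod N (so answers exist),
    # then check each answer with a single multiplication.
    challenges = [pow(random.randrange(2, N), 2, N) for _ in range(4)]
    for c in challenges:
        s = alices_answer(c)
        assert s is not None and (s * s) % N == c
    print("Bob accepts: Alice produced square roots modulo N =", N)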
Finding the coloring is hard,³ but verifying it is easy. However, it seems like Alice
would need to reveal the entire graph coloring to prove to Bob that it’s valid, no?
Fortunately not. Because we are making probabilistic guarantees, we can use a scheme
that reveals nothing about the true coloring of the graph yet also guarantees its
validity.
Alice first commits to a coloring using a commitment scheme like Pedersen commitments.
Bob then presents a simple challenge: “Show me the colors of edge e’s endpoints.”
Alice then reveals the colors she chose for that edge. What’s the probability of Alice
lying about having a valid coloring and getting away with it? Well, in the worst case
(for Bob), Alice’s graph coloring was perfectly colored except for a single edge. The
chance that Bob did not choose this invalid edge is pretty high: (E − 1)/E, where E is
the number of edges in the graph. For the above 8-edge graph, that’s an 87.5% chance
that a lying Alice slips by unnoticed.
How does Bob get more certainty? Run it again! Now, critically, Alice shuffles the
colors she used before presenting the colored graph; notice that the validity of the
above coloring does not change if we swap reds and greens, for example. We do the
challenge and reveal again, and this time there’s a ((E − 1)/E)^2 chance of cheating
successfully (a 76.6% chance for our example). That’s progress... In general, the
probability of successfully cheating after n runs is ((E − 1)/E)^n, so if we wanted, for
example, 99% certainty that Alice isn’t lying, we need ((E − 1)/E)^n ≤ 0.01, which for
our 8-edge graph means n ≥ 35.
So after 35 rounds (shuffle, challenge, reveal), Bob can be pretty dang sure that Alice
isn’t lying about her graph coloring, and since Alice always shuffled and did not reveal
a full coloring, the full solution is still a secret. (You can try out an interactive version
of the 3-color ZKP from MIT here.)
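Here is a small Python sketch of that arithmetic (it simulates only the challenge step, not the commitments): it computes the number of rounds needed for a given confidence and empirically checks that a cheater with exactly one bad edge survives all rounds with roughly the predicted probability.

    import math
    import random

    E = 8                   # number of edges in the example graph
    target = 0.01           # Bob wants at most a 1% chance of being fooled

    # Rounds needed so that ((E - 1) / E)^n <= target:
    rounds = math.ceil(math.log(target) / math.log((E - 1) / E))
    print("rounds needed:", rounds)        # 35 for an 8-edge graph

    # Empirical check: a cheating Alice whose committed coloring has exactly one
    # bad edge survives a round iff Bob challenges any of the other E - 1 edges.
    def cheater_survives(n_rounds):
        bad_edge = 0
        return all(random.randrange(E) != bad_edge for _ in range(n_rounds))

    trials = 100_000
    fooled = sum(cheater_survives(rounds) for _ in range(trials)) / trials
    print(f"fooling probability after {rounds} rounds: about {fooled:.4f}")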
18.2.3 Formalization
Formally, a zero-knowledge proof must satisfy the following key properties:
1. Completeness: For all valid queries or challenges to the problem that the
verifier Bob can present, Alice must respond with a valid solution. This is point
(1) from above. Colloquially, if Alice is honest, then she will eventually
convince Bob.
2. Soundness: For any invalid proof, the verifier must reject Alice’s solution
with probability ≥ 1/2; this is point (2). Colloquially, lying should be near-
impossible: Alice can only convince Bob if the statement is true.
3. Perfection (also called zero-knowledge-ness [yeah, seriously]): There must exist
a realistic (probabilistic polynomial-time) algorithm that can simulate knowledge
of the solution rather than actually knowing it.
Colloquially, Bob learns no information from Alice beyond the fact that
the statement is true. If such a simulator as described above (which doesn’t
actually know the solution) can exist, Bob wouldn’t know the difference and
thus learn nothing additional.
The last point is the oddest and bears a bit of clarification; perfection is such a weird
definition mostly because defining it formally is difficult.
Simply imagine that the prover doesn’t actually have a solution but picks things
randomly. Whenever the verifier catches them lying (i.e. reveals an invalid partial so-
lution), the prover turns back time, randomizes the solution until the lie is fixed, then
“resumes” the proof; the verifier then continues as if nothing happened. Critically,
the prover doesn’t actually have a solution, and thus it’s impossible for the verifier
to learn anything about it, ensuring that zero knowledge is revealed.
In essence, it must be possible for a false prover to get away with lying for every single
challenge (if they knew the challenges up front) without actually having a solution.⁴
This means that the verifier couldn’t learn anything by the simple fact that the prover
didn’t actually know anything.
⁴ In the literature, this is expressed as a “simulator” producing a “transcript” of an interaction with
the verifier, allowing it to internally retry as many times as it needs to fulfill challenges.
The Schnorr protocol is a specific type of sigma protocol; to put it succinctly, these
protocols only take a single “round” of commit / challenge / respond to sufficiently
prove something to the Verifier. Crucially, the challenge that the Verifier presents
must be chosen at random.
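As a concrete illustration, here is a minimal Python sketch of one commit / challenge / respond round of Schnorr’s identification protocol, with toy parameters and variable names of my own choosing: the Prover knows x with X = g^x, and the Verifier accepts iff g^z = t · X^c.

    import secrets

    # One round of Schnorr's identification protocol in the order-q subgroup of
    # Z_p^*. Tiny toy parameters, purely for illustration.
    p, q, g = 23, 11, 4

    x = secrets.randbelow(q)          # Prover's secret
    X = pow(g, x, p)                  # public value X = g^x

    r = secrets.randbelow(q)          # Prover: fresh randomness
    t = pow(g, r, p)                  # commit:   t = g^r

    c = secrets.randbelow(q)          # Verifier: random challenge c

    z = (r + c * x) % q               # respond:  z = r + c*x (mod q)

    # Verifier accepts iff g^z == t * X^c (mod p)
    assert pow(g, z, p) == (t * pow(X, c, p)) % p
    print("round accepted")

The check works because g^z = g^(r + cx) = t · X^c, while the fresh r acts as a one-time pad on c·x modulo q, so the response alone reveals nothing about x.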
18.4 Interactivity
It’s worth noting a key foundation of the ZKPs we’ve discussed thus far: they’re
interactive. The Prover and the Verifier enter several “rounds” of commit / challenge
/ respond to give the Verifier an arbitrary degree of certainty in the existence of the
Prover’s knowledge.
This limitation raises the question: can we avoid this interactivity and enable the
Prover to present a self-contained proof of knowledge?
Obviously since we’re asking this question, the answer is “yes.” This “yes” comes with
restrictions, though. In the standard model, this is only true for simple problems:
by “simple,” we mean problems in the BPP (bounded-error probabilistic polynomial
time, basically just P with randomness) complexity class. However, if we operate in
[Figure 18.1: The two types of ZKPs (since a ΣP is a subset of all interactive ZKPs).]
the random oracle model (now you see why chapter 17 came first), we can expand
our problem space.
In fact, we can convert any sigma protocol (ΣP) into a non-interactive zero-knowledge
proof (NIZK). Because the Verifier chooses challenges randomly and keeps no secret
state, we can essentially replace them with a random oracle. This is the idea behind
the Fiat-Shamir heuristic: the challenge c is derived from a hash function (modeled
as a random oracle),
\[
  c \leftarrow H(x, t) \in \mathbb{Z}_q
\]
where x is some input (like the graph in the 3-color problem) and t is the Prover’s
commitment (like a particular coloring).
Specifically, the Prover sends the verifier its commitment, t, its self-generated chal-
lenge, c (as above), and finally its response to the challenge, z. Then, the Verifier can
choose to either accept or reject the proof as a deterministic function of (x, t, c, z).
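Here is a minimal Python sketch of that transformation applied to the Schnorr round from earlier, with the public value X standing in for the input x and SHA-256 standing in for the random oracle; the parameters are toy values of my own choosing.

    import hashlib
    import secrets

    # Fiat-Shamir applied to the Schnorr round above: the challenge is derived
    # by hashing the statement and the commitment instead of asking a Verifier.
    p, q, g = 23, 11, 4               # same toy group as before

    def challenge(X, t):
        # stand-in for the random oracle H(x, t), reduced into Z_q
        data = f"{X}|{t}".encode()
        return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

    def prove(x):
        X = pow(g, x, p)
        r = secrets.randbelow(q)
        t = pow(g, r, p)              # commitment
        c = challenge(X, t)           # self-generated challenge
        z = (r + c * x) % q           # response
        return X, (t, c, z)

    def verify(X, proof):
        t, c, z = proof
        return c == challenge(X, t) and pow(g, z, p) == (t * pow(X, c, p)) % p

    X, proof = prove(secrets.randbelow(q))
    assert verify(X, proof)
    print("non-interactive proof verified")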
Of course, being security-minded, we should consider the possibility that the Prover’s
choice of (x, t) (and thus the resulting commitment) was maliciously crafted to prove
something that she doesn’t actually know. Just like before, though, we can increase
our confidence in the proof by including many generated commitments that follow a
random distribution.⁵
18.5 Succinctness
In the previous section, we discussed making a zero-knowledge proof not require
interactivity; minimizing communication rounds was the main goal. Now, our goal is
to minimize both communication and verifier complexity.
⁵ The notes I’ve been following don’t actually dive into this, but I believe this is one viable way to
alleviate this concern. Perhaps I’m misunderstanding something in the formal inner workings of
a NIZK (there’s a lot of detail I’m glossing over here for the sake of a more digestible overview).
Honestly, this definition doesn’t matter much, but it’s included here for the sake of
completeness and precision. In general, we can just say that a knowledge proof is
succinct if it doesn’t take very much information to communicate it.
With succinctness comes a new slew of acronyms. Specifically, a SNARG is a succinct,
non-interactive argument; a SNARK additionally includes a proof of knowledge; and
a zk-SNARK makes that proof zero-knowledge.
18.5.2 Construction
TODO: wow, things get wild here; try this link instead
18.6 References
1. Green, M. Zero Knowledge Proofs: An illustrated primer. In A Few Thoughts on Crypto-
graphic Engineering. November 2014
2. Barak, B. Zero Knowledge Proofs. In COS 433: Cryptography. November 2007, ch. 15
3. Boneh, D. Zero-Knowledge Proofs. In Notes on Cryptography. Stanford University, 2002
4. Ray, S. What are Zero Knowledge Proofs? In Towards Data Science. Medium, April 2019
5. Cossack Labs. Explain Like I’m 5: Zero Knowledge Proof, October 2017
6. Feige, U., Fiat, A., and Shamir, A. Zero-Knowledge Proofs of Identity. Journal of
Cryptology (1988), 77–94
7. Feige, U., and Shamir, A. Zero knowledge proofs of knowledge in two rounds. In Advances
in Cryptology, CRYPTO ‘89 (Berlin Heidelberg, 1990), G. Brassard, Ed., Springer-Verlag
8. Boneh, D., and Shoup, V. Schnorr’s identification protocol. In A Graduate Course in
Applied Cryptography. January 2020, ch. 19.1, pp. 724—731. [Online]
Multi-Party Computation
¹ More generally, we could even have two separate functions and have Alice learn f_A(x, y) while
Bob learns f_B(x, y), but that’s additional complexity we don’t need right now.
without revealing who the ads were shown to or who made purchases.
The security of a particular two-party computation scheme can be expressed in two
variants: in the first, we have semi-honest participants, where both Bob and Alice
promise to follow the protocol precisely (akin to an honest Verifier in zero-knowledge
proofs); in the second, we may have malicious participants, where Bob and/or Alice
can deviate from the protocol.
We’ll focus on the former version for simplicity, but the latter is also possible. We
need to ensure two properties of our interactive protocol:
• correctness: for every possible input, the protocol’s output must match the
function’s output. Formally, if we say that ⟨A(x), B(y)⟩ is an execution of the
protocol with x and y as defined above, then correctness is when the output of
⟨A(x), B(y)⟩ equals f(x, y) for every x and y.
• privacy: for both of the hidden inputs, there must exist “simulators” that
emulate what A and B each see in the protocol. This is similar to “zero-
knowledgeness” (see item 3), in that the existence of such a simulator proves
that neither participant learns anything valuable from the transcript of the
protocol.
Elliptic Curves
Dude, literally just read this incredible series of blog posts on this topic that
explain things far more elegantly than I ever could.
The fundamental beauty of an elliptic curve is that it’s a group, meaning it can
literally replace all of the instances of Zp that we’ve been using throughout the pre-
vious 141 pages. Elliptic curves are a drop-in replacement for modulus-based groups;
they increase security, reduce key size, and improve efficiency all in one go.
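As a tiny illustration of that group structure, here is a Python sketch of the addition law on a short Weierstrass curve y^2 = x^3 + ax + b over a small prime field; the curve and point are toy values of my own choosing, and scalar multiplication plays the role that modular exponentiation played before.

    # Group law on a short Weierstrass curve y^2 = x^3 + a*x + b over F_p,
    # with tiny toy parameters. None plays the role of the identity element.
    # Needs Python 3.8+ for modular inverses via pow(x, -1, p).
    p = 97
    a, b = 2, 3

    def add(P, Q):
        if P is None: return Q
        if Q is None: return P
        (x1, y1), (x2, y2) = P, Q
        if x1 == x2 and (y1 + y2) % p == 0:
            return None                                       # P + (-P) = identity
        if P == Q:
            lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p  # tangent slope
        else:
            lam = (y2 - y1) * pow(x2 - x1, -1, p) % p         # chord slope
        x3 = (lam * lam - x1 - x2) % p
        return (x3, (lam * (x1 - x3) - y1) % p)

    def mul(k, P):
        """Double-and-add: the elliptic-curve analogue of modular exponentiation."""
        R = None
        while k:
            if k & 1:
                R = add(R, P)
            P = add(P, P)
            k >>= 1
        return R

    G = (3, 6)      # a point on the curve: 6^2 = 3^3 + 2*3 + 3 (mod 97)
    assert (6 * 6) % p == (3 ** 3 + a * 3 + b) % p
    print("5G =", mul(5, G))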
Index of Terms
G
general number field sieve . . . 70, 79, 98
generator . . . 55, 116, 121
Goldreich-Goldwasser-Micali . . . 122
Goldreich-Levin . . . 118, 120
greatest common divisor . . . 65
group signatures . . . 103

H
hardcore predicate . . . 117, 118
hash function . . . 41, 90, 95, 97, 116, 124, 127
Hastad's broadcast attack . . . 89, 92
hiding . . . 124, 125
HMAC . . . 47
homomorphic encryption . . . 94, 97
homomorphic, additive . . . 125

I
identity-based encryption . . . 93
impossibility result . . . 14
IND-CCA . . . 33, 35, 37, 38, 49, 53, 60, 63, 83, 84, 86, 90–92, 104, 105, 127
IND-CCA advantage . . . 33
IND-CPA . . . 21, 22, 26, 27, 32, 33, 35, 49, 53, 60, 63, 78, 82–84, 90–92, 105, 127
IND-CPA advantage . . . 21, 23, 29
IND-CPA-cg . . . 23, 29, 83
IND-CPA-cg advantage . . . 24
INDR . . . 56
INDR advantage . . . 56
initialization vector . . . 17, 22, 34, 54–56
INT-CTXT . . . 50, 53, 104, 105
INT-CTXT advantage . . . 49
integrity . . . 8, 35, 36, 49, 50, 60, 95, 104, 114

K
Kerckhoff's principle . . . 15, 36
key distribution . . . 10, 61, 94, 102, 106
key separation principle . . . 50

M
MAC-then-encrypt . . . 50
man-in-the-middle attack . . . 112
Merkle tree . . . 114
Merkle-Damgård transform . . . 43, 44, 50, 129
message authentication code . . . 36, 47, 48, 50, 95
mode of operation . . . 17, 23, 25, 37, 43, 53, 55, 59, 77, 83
multi-party computation . . . 114, 139
multi-signatures . . . 102

N
negligible . . . 67, 67, 78, 118, 119, 122
NP-hard . . . 131

O
OCSP . . . 107
offset codebook . . . 53
one-time pad . . . 12, 19, 31, 36, 55
one-way . . . 46, 47, 88, 98, 116, 118, 122
ow-advantage . . . 46

P
Pedersen commitment . . . 125, 133
perfect security . . . 12, 36, 124
PGP . . . 106, 108
post-quantum cryptography . . . 114
PRF advantage . . . 27, 29
PRF secure . . . 27, 32, 35, 45, 50, 53, 127–129
prime factorization . . . 79
private key . . . 61, 63, 81, 87
private set intersection . . . 139
probabilistically-polynomial time . . . 118
prover . . . 130
pseudorandom function . . . 26, 31, 50, 57, 77, 91, 116, 117, 121, 127
pseudorandom generator . . . 50, 55, 121
PSS . . . 98, 101
public key . . . 61, 63, 81, 83, 87, 106, 111
public key infrastructure . . . 106

R
random function . . . 35, 91
random oracle . . . 91, 91, 92, 97–102, 127, 136
RC4 . . . 56
relatively prime . . . 90
replay attack . . . 37, 38
revocation . . . 107
ring signatures . . . 103
RSA . . . 77, 86, 96, 99, 104, 109, 116
RSA-OAEP . . . 90, 92

S
safe prime . . . 71, 71, 79, 80
Schnorr . . . 101, 135
Schnorr's protocol . . . 135, 135
searchable encryption . . . 94
secret key . . . 61
seed . . . 55, 58
session key . . . 106
Shamir's secret sharing . . . 109, 110
Shannon-secure . . . 12, 15, 20, 31, 36
sigma protocol . . . 135, 136
signature . . . 95, 102, 108
signcryption . . . 104, 105
SNARG . . . 137
SNARK . . . 137
socialist millionaire problem . . . 114
stream cipher . . . 55, 56
succinct . . . 137
SUF-CMA . . . 105
symmetric key . . . 10, 77, 88, 95

T
threshold signatures . . . 102, 110
timing attack . . . 52, 60
totient function . . . 65, 86, 90
trap door . . . 88, 97

U
UF-CMA . . . 38, 40, 49, 95–102, 105
UF-CMA advantage . . . 38, 96

V
verifiable secret sharing . . . 109
verifier . . . 130

W
web of trust . . . 108

X
X.509 . . . 108

Y
Yao's millionaire problem . . . 114, 139

Z
zero-knowledge proof . . . 110, 114, 124, 130, 135, 137, 140
zk-SNARK . . . 137