04 Crypto
Fall 2024
Cryptography
Tyler Bletsch
Duke University
4
Symmetric cryptography
• The primary method for providing confidentiality of data
transmitted (“in-flight”) or stored (“at-rest encryption”)
Given:
Plaintext p (arbitrary size)
Secret key k (fixed size)
Encryption function E
Decryption function D
Can produce ciphertext c:
c = E(p,k)
Can recover plaintext:
p = D(c,k)
5
How to attack cryptography
• Cryptanalysis – apply cleverness
▪ Exploit weaknesses in algorithm or manner of its use
▪ May leverage existing plaintext, ciphertext, or
pairs of each
▪ KEY ISSUE: Even if algorithm is “perfect” (unprovable),
you might use the algorithm incorrectly.
(This is why you don’t roll your own crypto)
7
XOR “encryption” demo
Plaintext: 'Hello'
Key : 'key'
            H        e        l        l        o
Plaintext : 01001000 01100101 01101100 01101100 01101111
            k        e        y        k        e        (key repeats)
Key       : 01101011 01100101 01111001 01101011 01100101
Ciphertext: 00100011 00000000 00010101 00000111 00001010   <- XOR result
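The demo above can be reproduced in a few lines of Python. This is a minimal sketch; the helper name `xor_bytes` is made up for illustration, and repeating-key XOR is shown only because it is easy to break, not as something to use.

```python
from itertools import cycle

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the key, repeating the key as needed."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

plaintext = b"Hello"
key = b"key"

ciphertext = xor_bytes(plaintext, key)
# Print the ciphertext as 8-bit binary groups, like the slide does.
print(" ".join(f"{b:08b}" for b in ciphertext))

# XOR is its own inverse, so the same function also decrypts.
recovered = xor_bytes(ciphertext, key)
print(recovered)
```

Because encryption and decryption are the same operation, anyone who learns the key stream can read everything.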
8
Types of cryptanalysis attacks
• Given the encryption algorithm and ciphertext under attack, attacks we can do:
9
Attacking XOR (1)
• Known plaintext attack:
▪ Given plaintext : 01001000 01100101 01101100 01101100 01101111
▪ Given ciphertext : 00100011 00000000 00010101 00000111 00001010
▪ XOR result : 01101011 01100101 01111001 01101011 01100101
^^ it's the key!!!
• Chosen plaintext attack:
▪ Chosen plaintext : 00000000 00000000 00000000 00000000 00000000
▪ Given ciphertext : 01101011 01100101 01111001 01101011 01100101
▪ XOR result : 01101011 01100101 01111001 01101011 01100101
^^ it's the key!!!
• Chosen ciphertext attack:
▪ Chosen ciphertext: 00000000 00000000 00000000 00000000 00000000
▪ Result plaintext : 01101011 01100101 01111001 01101011 01100101
▪ XOR result : 01101011 01100101 01111001 01101011 01100101
^^ it's the key!!!
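A short sketch of the known-plaintext and chosen-plaintext attacks above, using the byte values from this slide. The names `xor_pairwise` and `encryption_oracle` are hypothetical; the oracle stands in for "a system that encrypts whatever the attacker submits".

```python
from itertools import cycle

def xor_pairwise(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

# Known-plaintext attack: the attacker has one plaintext/ciphertext pair.
known_plaintext = b"Hello"
observed_ciphertext = bytes([0b00100011, 0b00000000, 0b00010101,
                             0b00000111, 0b00001010])
keystream = xor_pairwise(known_plaintext, observed_ciphertext)
print(keystream)  # b'keyke'

# Chosen-plaintext attack: the attacker submits all zeroes.
def encryption_oracle(data: bytes) -> bytes:
    secret_key = b"key"  # unknown to the attacker in the real scenario
    return bytes(d ^ k for d, k in zip(data, cycle(secret_key)))

leaked = encryption_oracle(bytes(5))  # five zero bytes
print(leaked)  # the key stream comes straight back out
```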
10
Attacking XOR (2)
• Ciphertext only attack:
▪ Ciphertext: 00100011 00000000 00010101 00000111 00001010
▪ "I assume the plaintext had ASCII text with lowercase letters, and in all such letters bit
6 is 1, but none of the ciphertext has bit 6 set, so I bet the key is most/all lower case
letters"
▪ "The second byte is all zeroes, which means the second byte of the key and plaintext
are equal"
▪ etc....
11
Symmetric ciphers in common use
Cipher     Key size (bits)   Block size (bits)   Year introduced
DES        56                64                  1975
3DES       112/168           64                  1995
Twofish    128/192/256       128                 1998
Serpent    128/192/256       128                 1998
Rijndael   128/192/256       128                 1998
(1975) Now ATMs can exist thanks to this Data Encryption Standard (DES)!
(1995) Ahh, the DES key is too small! It can be brute forced really fast! Hurry, duct tape three of them together!
(1998) Crap, now it’s too slow! Let’s have a bunch of algorithms come fight to become the Advanced Encryption Standard (AES)!
(2001) This guy (Rijndael) won and is now called AES.
• Triple DES (3DES) still around in legacy stuff like in financial systems (ATMs)
• AES dominates everywhere else.
• Implemented in hardware in modern CPUs – way faster than software
versions of other algorithms (by 5x or more!)
• Not sure what to use? Use AES
• Some people like to use the other AES finalists (Twofish, Serpent) or other
symmetric ciphers not listed here. That’s fine.
12
Okay, but what about that “block size” thing?
Problem?
13
Demonstrating the danger of ECB
• Electronic Codebook (ECB) is what you’d come up with naively:
“Just apply the key to each block”
• But this means that identical blocks give identical ciphertext, which
can be informative to an attacker...
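A minimal sketch of the ECB problem. Python's standard library has no AES, so this uses a toy 4-round Feistel construction (`toy_encrypt_block`, an illustrative stand-in that is NOT secure) in place of a real block cipher. The point is only that ECB maps equal plaintext blocks to equal ciphertext blocks.

```python
import hashlib

BLOCK = 16  # bytes per block, the same size AES uses

def toy_encrypt_block(block: bytes, key: bytes) -> bytes:
    """Toy 4-round Feistel cipher (NOT secure); stands in for AES."""
    left, right = block[:8], block[8:]
    for i in range(4):
        f = hashlib.sha256(key + bytes([i]) + right).digest()[:8]
        left, right = right, bytes(a ^ b for a, b in zip(left, f))
    return left + right

def ecb_encrypt(plaintext: bytes, key: bytes) -> list[bytes]:
    """ECB: encrypt every block independently with the same key."""
    assert len(plaintext) % BLOCK == 0, "this sketch skips padding"
    return [toy_encrypt_block(plaintext[i:i + BLOCK], key)
            for i in range(0, len(plaintext), BLOCK)]

key = b"demo key"
plaintext = b"ATTACK AT DAWN!!" * 2 + b"something else.."
blocks = ecb_encrypt(plaintext, key)

for block in blocks:
    print(block.hex())
# The first two lines printed are identical: the repetition leaks.
```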
14
☺ Figures from Wikipedia “Block cipher mode of operation”
Solution to the “ECB problem”
• Develop more sophisticated modes of operation for use with our
block cipher.
▪ We’ll see several of these – there’s tradeoffs to different techniques
15
Figure from https://www.researchgate.net/figure/Stream-cipher-diagram_fig2_318517979
Modes of operation: CBC
• Cipher Block Chaining (CBC):
▪ Each block of plaintext is XOR’d with previous block ciphertext
▪ Prevents patterns from being visible even in regular data
Encryption parallelizable? No
Decryption parallelizable? Yes
Random read access? Yes
16
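A sketch of CBC chaining, again with a toy Feistel cipher standing in for a real block cipher (the `toy_*` functions are illustrative and NOT secure; padding is skipped). It shows that repeated plaintext blocks no longer produce repeated ciphertext, and that any single block can be decrypted from just two ciphertext blocks.

```python
import hashlib
import secrets

BLOCK = 16

def _f(half: bytes, key: bytes, i: int) -> bytes:
    """Round function for the toy Feistel cipher."""
    return hashlib.sha256(key + bytes([i]) + half).digest()[:8]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(block: bytes, key: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, xor(left, _f(right, key, i))
    return left + right

def toy_decrypt_block(block: bytes, key: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = xor(right, _f(left, key, i)), left
    return left + right

def cbc_encrypt(plaintext: bytes, key: bytes, iv: bytes) -> list[bytes]:
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        # XOR with the previous ciphertext block (or the IV), then encrypt.
        prev = toy_encrypt_block(xor(plaintext[i:i + BLOCK], prev), key)
        out.append(prev)
    return out

def cbc_decrypt(blocks: list[bytes], key: bytes, iv: bytes) -> bytes:
    prev, out = iv, b""
    for c in blocks:
        out += xor(toy_decrypt_block(c, key), prev)
        prev = c
    return out

key = b"demo key"
iv = secrets.token_bytes(BLOCK)
plaintext = b"ATTACK AT DAWN!!" * 2
blocks = cbc_encrypt(plaintext, key, iv)

# Random read access: block 1 needs only ciphertext blocks 0 and 1.
second_block = xor(toy_decrypt_block(blocks[1], key), blocks[0])
```

Encryption has to run block by block because each step needs the previous ciphertext; decryption has all ciphertext blocks up front, which is why it can be parallelized.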
Figures from Wikipedia “Block cipher mode of operation”
More about the Initialization Vector
• The previous slide showed an “IV” (“Initialization Vector”) used to
start the chain (it’s XORed with the first block of plaintext).
Something like this is used in many modes.
▪ IV is random per-message; ensures first block of two ciphertexts don’t match
just because plaintexts match.
• The IV must be known to both the sender and receiver, typically not
a secret (often included in the communication).
• IV integrity is important: If an opponent is able to fool the receiver
into using a different value for IV, then the opponent is able to
invert selected bits in the first block of plaintext. Other attacks,
too...
▪ A more detailed discussion can be found here.
17
Modes of operation: CTR
• Counter (CTR):
▪ Encrypt an incrementing list of integers to make a keystream:
turns a block cipher into a stream cipher!
▪ Allows full parallelization and random access
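A sketch of the CTR idea. As an assumption for illustration, SHA-256 over key, nonce, and counter stands in for "encrypt the counter with the block cipher"; real CTR mode uses a block cipher such as AES for that step. The names `keystream_block` and `ctr_xor` are made up.

```python
import hashlib
import secrets

BLOCK = 32  # output size of the stand-in function (SHA-256)

def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    """Stand-in for E(nonce || counter, key); NOT a real block cipher."""
    return hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()

def ctr_xor(data: bytes, key: bytes, nonce: bytes, first_block: int = 0) -> bytes:
    """Encrypts or decrypts: XOR the data with the counter-derived key stream."""
    out = bytearray()
    for i in range(0, len(data), BLOCK):
        ks = keystream_block(key, nonce, first_block + i // BLOCK)
        out += bytes(d ^ k for d, k in zip(data[i:i + BLOCK], ks))
    return bytes(out)

key = secrets.token_bytes(16)
nonce = secrets.token_bytes(8)  # must never repeat under the same key
message = b"A" * 40 + b"B" * 40

ciphertext = ctr_xor(message, key, nonce)
roundtrip = ctr_xor(ciphertext, key, nonce)

# Random access: decrypt only the second block by starting the counter at 1.
middle = ctr_xor(ciphertext[BLOCK:2 * BLOCK], key, nonce, first_block=1)
```

Each key stream block depends only on the counter value, so blocks can be computed in any order or in parallel.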
19
Asymmetric (Public Key) Cryptography
20
The problem
• Problem with symmetric crypto:
• Solution:
What if the key to decrypt was different than the one to encrypt?
21
Asymmetric (Public Key) Cryptography
• Proposed by Diffie and Hellman in 1976
Martin Hellman
22
Asymmetric cryptography
• Public and private keys are mathematically related,
but the private key cannot feasibly be determined from the public key
• Far slower than symmetric encryption
(but there’s tricks to get around that – covered later)
Sender has:
Plaintext p (arbitrary size)
Recipient’s public key kpub (fixed size)
Encryption function E
Decryption function D
Can produce ciphertext c:
c = E(p,kpub)
Can recover plaintext:
Need recipient private key kpriv
p = D(c,kpriv)
Also works if you reverse the keys:
D(E(p,kpriv),kpub) == p
23
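[Figure: public-key encryption. Plaintext input X goes through the encryption algorithm (e.g., RSA) to give the transmitted ciphertext Y, which the decryption algorithm turns back into plaintext output X. Also shown: Alice's public key ring (Joy, Ted, Mike, Bob). Formula labels as extracted: Y = E[PUa, X], Y = E[PRb, X], X = D[PUb, Y].]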
25
Asymmetric crypto algorithms
• In symmetric crypto, the list of algorithms just differed in bit sizes
and implementation details
• Asymmetric algorithms differ in fundamental method of use
• Key algorithms:
▪ Diffie-Hellman: Just solves the problem of agreeing to a secret symmetric key
over an open communication channel. Doesn’t encrypt/decrypt or
authenticate on its own.
▪ DSS: Digital Signature Standard – Just able to provide authentication. Doesn’t
encrypt/decrypt on its own.
▪ RSA: The original general purpose asymmetric algorithm – able to do
encryption/decryption (shown 3 slides ago) and signatures (2 slides ago)
▪ Elliptic Curve (e.g., X25519): Uses different fundamental math than the above
(smaller keys, more efficient) but achieves the same goals
(encryption/decryption and signatures)
26
RSA Public-Key Encryption
• Developed by Rivest, Shamir & Adleman in 1977
▪ Best known and widely used public-key algorithm
• Uses exponentiation of integers modulo n, the product of two primes
• Given integers:
▪ Plaintext p
▪ Public key kpub = {e, n} (Known to sender)
▪ Private key kpriv = {d, n} (Known to receiver)
• Encrypt: c = p^e % n
• Decrypt: p = c^d % n
27
Where do you get the numbers:
Key generation
• Choose two distinct prime numbers p and q.
▪ Secret, random, similar in magnitude, chosen to make factoring hard.
• Get product n = pq
▪ Used in modulus in decrypt/encrypt. Length in bits is the “key length”. Part of
both keys.
• Compute φ = (p − 1)(q − 1)
• Choose an integer e that’s coprime with φ (no common factors)
▪ gcd(e, φ) = 1 and 1 < e < φ
▪ e being not very secret is okay; it’s often 2^16 + 1 = 65537
▪ e is part of the public key
• Determine d by solving de % φ = 1
▪ There’s an efficient algorithm for this, since we know φ
(but φ will be discarded later, and it’s not efficient to solve without it)
▪ d is part of the private key
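The key generation steps above, sketched with tiny primes. The function name `rsa_keygen` is made up, the primes are passed in rather than generated, and no primality checking is done; real keys use primes hundreds of digits long.

```python
from math import gcd

def rsa_keygen(p: int, q: int, e: int):
    """Derive a toy RSA key pair from two given primes (illustration only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if not (1 < e < phi and gcd(e, phi) == 1):
        raise ValueError("need 1 < e < phi and gcd(e, phi) == 1")
    # Solve d*e % phi == 1. pow with exponent -1 computes the modular
    # inverse (Python 3.8+); this is the efficient algorithm the slide mentions.
    d = pow(e, -1, phi)
    return (e, n), (d, n)

public_key, private_key = rsa_keygen(p=17, q=11, e=7)
print(public_key, private_key)  # (7, 187) (23, 187)
```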
Q: Hey this is a lot of math. Do I have to care?
A: Nah
28
RSA example
Encryption: plaintext 88 → 88^7 mod 187 = 11 (ciphertext), using kpub = {7, 187}
Decryption: ciphertext 11 → 11^23 mod 187 = 88 (plaintext), using kpriv = {23, 187}
• See? It works.
• In practice, all the numbers are muuuuuuuuch bigger.
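The example above can be checked directly with Python's three-argument `pow`, which does modular exponentiation. This is textbook RSA on toy numbers, with no padding, so it is for checking the arithmetic only.

```python
e, d, n = 7, 23, 187  # toy key pair from the example

plaintext = 88
ciphertext = pow(plaintext, e, n)   # encrypt with the public key
recovered = pow(ciphertext, d, n)   # decrypt with the private key
print(ciphertext, recovered)  # 11 88

# Reversing the keys also works, which is the basis for signatures.
reversed_roundtrip = pow(pow(plaintext, d, n), e, n)
```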
29
How long of a key do you need?
Or, How good are we at factoring RSA keys?
• RSA Factoring Challenge – cash prizes for factoring big n values
• Computing gets faster/cheaper, and algorithms are getting better
▪ 1024-bit keys are out there…
RSA number   Decimal digits   Binary digits   Factored on
RSA-100      100              330             Apr 1991
• Countermeasures (against timing attacks on RSA exponentiation):
▪ Constant exponentiation time: Ensure that all exponentiations take the same
amount of time before returning a result – simple but slows things down
▪ Random delay: Better performance, but attacker could do many
measurements and statistically tease out actual delay
▪ Blinding: Multiply the ciphertext by a random number before performing
exponentiation then divide out - prevents attacker from knowing the actual
ciphertext bits
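A sketch of the blinding countermeasure on toy numbers. One common way to do it, assumed here, is to multiply the ciphertext by r^e for a random r, exponentiate, then multiply by the inverse of r. The function name `blinded_decrypt` is made up for illustration.

```python
import secrets
from math import gcd

def blinded_decrypt(c: int, d: int, e: int, n: int) -> int:
    """RSA decryption where the exponentiation never sees the real ciphertext."""
    # Pick a random blinding factor r that has an inverse modulo n.
    while True:
        r = secrets.randbelow(n - 2) + 2  # 2 .. n-1
        if gcd(r, n) == 1:
            break
    blinded_c = (c * pow(r, e, n)) % n   # hide c behind r^e
    blinded_p = pow(blinded_c, d, n)     # equals p * r mod n
    return (blinded_p * pow(r, -1, n)) % n  # divide r back out

result = blinded_decrypt(c=11, d=23, e=7, n=187)
print(result)  # 88, whatever r was chosen
```

The timing of the private-key exponentiation now depends on a value the attacker does not know, so measurements no longer line up with the ciphertexts they chose.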
31
Diffie-Hellman Key Exchange
• RSA is so slow! I want to use fast symmetric crypto…
▪ Could use RSA to send a random secret key, but we can do better!
32
Diffie-Hellman in operation
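A minimal sketch of the textbook Diffie-Hellman exchange, with toy public parameters p = 23 and g = 5 chosen for illustration (real deployments use much larger groups or elliptic curves). Only A and B cross the wire; the secrets a and b never do.

```python
import secrets

p, g = 23, 5  # toy public parameters: prime modulus and generator

# Each side picks a private value and publishes g^secret mod p.
a = secrets.randbelow(p - 2) + 1  # Alice's secret, 1 .. p-2
b = secrets.randbelow(p - 2) + 1  # Bob's secret, 1 .. p-2
A = pow(g, a, p)  # sent Alice -> Bob in the clear
B = pow(g, b, p)  # sent Bob -> Alice in the clear

# Both sides arrive at g^(a*b) mod p without ever sending it.
alice_shared = pow(B, a, p)
bob_shared = pow(A, b, p)
print(alice_shared == bob_shared)  # True
```

On its own this does nothing to confirm who is on the other end, which is why the exchange needs authentication layered on top.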
34
“Digital Envelopes”: Reducing the amount of
asymmetric crypto you need to do
• Asymmetric crypto is more expensive than symmetric
[Figure: digital envelope between Alice and Bob. Labels: Message, E, Encrypted message, Random symmetric key, Ok: k1, Ok: k2.]
37
The Authenticity Problem
• Problem: who sent this message?
Hey bro this email’s for real
• Best practical solution: Sender includes some data that only they
could have created
▪ But how could only they have created it?
Because only they had the key to do so!
▪ Attacker: Get that key! Or fool you into validating against the wrong key!
38
Techniques to authenticate
Two broad approaches similar to crypto:
• Either way, confidentiality (from crypto) and authenticity (from the above)
are separate things.
▪ Confidentiality without authenticity? Secrets sent anonymously
▪ Authenticity without confidentiality? Public info from a trusted source
▪ Confidentiality + authenticity? Secure communication
39
MAC concept
[Figure: MAC concept. Labels: Message, MAC algorithm, MAC, Transmit, MAC algorithm, MAC, Compare.]
41
Cryptographic Hash Functions
A cryptographic hash function H(x) must:
• Eat data of any size and give fixed-length output
• Be easy to compute for any given input
• Be one-way (a.k.a. pre-image resistant):
Computationally infeasible to find x from H(x)
• Have weak collision resistance:
Given x, computationally infeasible to find y ≠ x such that H(x) = H(y)
• Have strong collision resistance:
Computationally infeasible to find any pair (x, y) with x ≠ y such that H(x) = H(y)
• Have the avalanche effect:
A small change to the input should totally change the output
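A quick sketch of two of these properties using SHA-256 from Python's `hashlib`: the output length is fixed regardless of input size, and changing one character of the input changes roughly half of the output bits.

```python
import hashlib

small = hashlib.sha256(b"hello").digest()
large = hashlib.sha256(b"x" * 1_000_000).digest()
print(len(small), len(large))  # both 32 bytes

# Avalanche effect: inputs differ in one character.
a = hashlib.sha256(b"hello").digest()
b = hashlib.sha256(b"hellp").digest()
differing_bits = bin(int.from_bytes(a, "big") ^ int.from_bytes(b, "big")).count("1")
print(f"{differing_bits} of 256 output bits differ")
```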
42
Common cryptographic hash functions
• MD5: Published 1992, compromised several ways, but it’s in enough
“how do i program webz” tutorials that novices keep using it
▪ Output size: 128 bits
• SHA-1: NIST standard published in 1995, minor weaknesses
published throughout the 2000s, broken in general in 2017.
Sometimes just called “SHA” which can be misleading. Don’t use.
▪ Output size: 160 bits
• SHA-2: NIST standard published in 2001. Still considered secure.
▪ Output size: a few choices between 224-512 bits
• SHA-3: NIST standard published in 2015. Radically different design;
thought of as a “fallback” if SHA-2 vulnerabilities are discovered.
▪ Output size: a few choices between 224-512 bits, plus “arbitrary size” option
• RIPEMD-160: From 1996, but not broken. Sometimes used for
performance reasons.
▪ Output size: 160 bits
43
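The output sizes listed above can be read off Python's `hashlib`. This sketch assumes Python 3.9+ for the `usedforsecurity=False` flag, which is there so the broken algorithms can still be instantiated on locked-down builds. RIPEMD-160 is left out because not every build ships it.

```python
import hashlib

sizes = {}
for name in ("md5", "sha1", "sha224", "sha256", "sha512", "sha3_256"):
    h = hashlib.new(name, b"", usedforsecurity=False)
    sizes[name] = h.digest_size * 8  # digest_size is in bytes
    print(f"{name:9s} {sizes[name]} bits")

# The "arbitrary size" option in SHA-3: SHAKE lets the caller pick the length.
shake_output = hashlib.shake_128(b"abc").digest(10)
print(len(shake_output), "bytes requested from SHAKE128")
```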
Ways of using a hash to authenticate
[Figure: three ways for Source A to authenticate a message to Destination B. In each, a hash H of the message travels with the message, and the receiver recomputes it and compares.]
(a) Using symmetric encryption (E and D with shared key K): Uses hash + symmetric crypto. Gives a MAC – sender and receiver need same key.
(b) Using public-key encryption (E with PRa, D with PUa): Uses hash + asymmetric crypto. Gives a Digital Signature! So important we’re going to dig deeper…
(c) Hash computed over the message together with a shared key K: Uses just a hash. Neat! Gives a MAC – sender and receiver need same key.
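A sketch of the hash-only MAC idea using HMAC from Python's standard library. HMAC is the standardized keyed-hash construction and is swapped in here in place of whatever exact construction the figure shows. The names `make_mac` and `verify_mac` are made up.

```python
import hashlib
import hmac
import secrets

def make_mac(key: bytes, message: bytes) -> bytes:
    """Sender: compute a tag over the message with the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_mac(key: bytes, message: bytes, tag: bytes) -> bool:
    """Receiver: recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_mac(key, message), tag)

shared_key = secrets.token_bytes(32)  # both sides must already hold this
message = b"Pay Bob $10"
tag = make_mac(shared_key, message)

ok = verify_mac(shared_key, message, tag)
tampered = verify_mac(shared_key, b"Pay Bob $1000", tag)
print(ok, tampered)  # True False
```

`hmac.compare_digest` is used instead of `==` so the comparison time does not leak how many leading bytes matched.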
45
Digital Signatures
• Digital signature provides (per NIST FIPS PUB 186-4):
▪ origin authentication,
▪ data integrity, and
▪ signatory non-repudiation (the notion that you can’t deny it was you that sent the message)
• Common algorithms:
▪ Digital Signature Algorithm (DSA)
▪ RSA Digital Signature Algorithm
▪ Elliptic Curve Digital Signature Algorithm (ECDSA)
(All based on asymmetric cryptography – public and private keys!)
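A toy sketch of hash-then-sign with RSA, reusing the tiny key pair {7, 187} / {23, 187} from the RSA example. Reducing the hash modulo n and skipping padding are simplifications assumed here for illustration; real RSA signatures use a padding scheme and a key far larger than the hash.

```python
import hashlib

E, D, N = 7, 23, 187  # toy public exponent, private exponent, modulus

def digest_mod_n(message: bytes) -> int:
    """Hash the message and squeeze it into the toy modulus (insecure)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Signer: apply the PRIVATE key to the hash."""
    return pow(digest_mod_n(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Anyone: apply the PUBLIC key and compare against a fresh hash."""
    return pow(signature, E, N) == digest_mod_n(message)

message = b"I owe Bob $10"
signature = sign(message)
good = verify(message, signature)
forged = verify(message, (signature + 1) % N)
print(good, forged)  # True False
```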
46
Digital Signature overview
47
The recursive problem of signatures
Alice can’t remotely prove to you that a given key is hers on her own.
48
Recurse! (1)
Proposed solution:
We have someone ELSE sign a message containing Alice’s key.
49
Recurse (2)
Proposed solution:
We have someone ELSE sign a message containing Bob’s key.
50
Recurse (3)
Proposed solution:
We have someone ELSE sign a message containing Clara’s key.
51
Recurse (4)
The base case
What about Don’s key?
We got it shipped to us just for this. We trust it implicitly.
52
Certificates and the chain of trust
• A certificate is a message that:
▪ Contains someone’s identity and their public key, and
▪ Is signed by someone else (usually*).
• Each message on the previous slide was a certificate, and everyone
signing a certificate was a certificate authority (CA).
• The entity that we trust implicitly is a root certificate authority.
• Together, they had this chain of trust:
You can now show the certificate to anyone who asks, and as long as
they trust your CA (either directly or recursively), they trust that the
key shown is yours.
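A sketch of walking a chain of trust, using the names from the preceding slides and assuming each person's key was signed by the next (Alice by Bob, Bob by Clara, Clara by Don). The `Cert` class and its `signature_ok` flag are hypothetical stand-ins; a real validator would verify each signature with the issuer's public key and also check expiry and revocation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cert:
    subject: str
    public_key: str
    issuer: str
    signature_ok: bool = True  # stand-in for a real signature check

def chain_is_trusted(subject: str, certs: dict[str, Cert],
                     trust_store: set[str], max_depth: int = 10) -> bool:
    """Follow issuer links until a trusted root is reached (or give up)."""
    for _ in range(max_depth):
        if subject in trust_store:  # base case: trusted implicitly
            return True
        cert = certs.get(subject)
        if cert is None or not cert.signature_ok:
            return False
        subject = cert.issuer  # recurse: now check whoever signed it
    return False

certs = {
    "Alice": Cert("Alice", "alice-pub", issuer="Bob"),
    "Bob": Cert("Bob", "bob-pub", issuer="Clara"),
    "Clara": Cert("Clara", "clara-pub", issuer="Don"),
    "Mallory": Cert("Mallory", "mallory-pub", issuer="Eve"),
}
trust_store = {"Don"}  # the root CA whose key was shipped to us

print(chain_is_trusted("Alice", certs, trust_store))    # True
print(chain_is_trusted("Mallory", certs, trust_store))  # False
```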
55
Certificates in practice: X.509
• Most certificates are in X.509 format specified in RFC 5280.
• Used in many contexts, including:
▪ IP security (IPSEC): Used in Virtual Private Networks (VPNs)
▪ S/MIME: Encrypted/authenticated email
▪ Secure sockets layer (SSL) and its successor Transport Layer Security (TLS)
• This includes HTTPS!
56
Public-Key Infrastructure (PKI)
• All of this certificate and chain-of-trust stuff is part of what’s called
Public-Key Infrastructure (PKI)
▪ “The set of hardware, software, people, policies, and procedures needed to
create, manage, store, distribute, and revoke digital certificates based on
asymmetric cryptography.” -- RFC 4949
57
Trust stores in practice
• Most chosen by OS or app vendor – major decision!
• Organization can change this – many companies add a private root
CA to all their machines so they can sign certificates internally
• If malware can add a root CA, they can have that CA sign *any*
malicious certificate, allowing man-in-the-middle attacks
• Some security software does this too so it can “inspect” encrypted
traffic for “bad stuff” (I think this is stupid and dangerous)
58
Random number generation
59
Wait, how do we make keys again?
• Symmetric crypto:
▪ Generate random bits.
• Asymmetric crypto:
▪ Choose two random prime numbers p and q, then do more stuff
60
Requirements for random number generator
• PRNG: Algorithm to do a bunch of math, update
internal state, and kick out a random number. Requirements:
▪ Uniform distribution: Frequency of occurrence of each of
the numbers should be approximately the same
• In binary, 0’s and 1’s occur with equal chance.
Plain PRNG: Good enough for games.
61
Dangers when seeding the PRNG
• All PRNGs take in a seed (initial state)
▪ Given the same seed, it will generate the same sequence
▪ For basic randomness (video game junk), you can seed with the current time
• Guaranteed unique sequence ☺
• For crypto purposes, current time is a super bad choice:
If I know when you made your keys, then I can figure out your keys
▪ Instead, feed in a large amount of external entropy (true randomness)
• IO device delays, mouse movements, keyboard timing
• Modern kernels are always recording that stuff into an entropy pool
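• Read from /dev/random: pulls from this pool; may block when the kernel judges the pool too empty (historically rate limited)
Read from /dev/urandom: never blocks; output comes from a kernel CSPRNG seeded from the pool, and is generally considered fine for crypto once the pool has been initialized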
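A small sketch of the seeding danger in Python. The seed value is an arbitrary timestamp-like number picked for illustration. `random` is a plain PRNG, while `secrets` draws from the operating system's randomness source and is the module intended for keys.

```python
import random
import secrets

# Same seed, same sequence: anyone who guesses the seed can replay it.
seed = 1_700_000_000  # e.g. a guessable "current time" in seconds
victim = random.Random(seed)
attacker = random.Random(seed)
victim_bytes = [victim.randrange(256) for _ in range(8)]
attacker_bytes = [attacker.randrange(256) for _ in range(8)]
print(victim_bytes == attacker_bytes)  # True

# For keys, ask the OS instead of seeding anything yourself.
key = secrets.token_bytes(32)  # 256 bits of key material
print(len(key))
```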
62
Random versus Pseudorandom
• What’s better than a Pseudo-Random Number Generator?
• A True Random Number Generator (TRNG)!
▪ Uses a nondeterministic source to produce randomness (e.g. via external
natural processes like temperature, radiation, leaky capacitors, etc.)
63
The quantum computing threat
to cryptography
• Quantum computing uses quantum mechanical properties like
superposition and entanglement to do computing
▪ Can perform algorithms in entirely better time domains!
O(2^n) might become O(n^2)!
• A problem for cryptography! Shor’s Algorithm
65
Practical crypto rules
67
Good idea / Bad idea
• Which of the following are okay?
▪ Use AES-256 ECB with a fixed, well-chosen IV
• WRONG: ECB reveals patterns in plaintext (penguin!), use CBC or other
• WRONG: The IV should be random, else a chosen plaintext can reveal information about other plaintexts; also,
ECB mode doesn’t use an IV!
▪ Expand a 17-character passphrase into a 256-bit AES key through repetition
• WRONG: Human-derived passwords are highly non-random and could allow for
cryptanalysis; use a key-derivation algorithm instead
▪ Use RSA to encrypt network communications
• WRONG: RSA is horribly slow, instead use RSA to encrypt (or Diffie-Hellman to
generate) a random secret key for symmetric crypto
▪ Use MD5 to store a password
(Note: We’ll cover password storage at length later when we cover Authentication.)
• WRONG: MD5 is broken
• WRONG: Use a salt to prevent pre-computed dictionaries
▪ Use a 256-bit SHA-2 hash with salt to store a password
• WRONG: Use a password key derivation function with a configurable iteration
count to dial in computation effort for attackers to infeasibility
68
Adapted from here.
“Top 10 Developer Crypto Mistakes”
Adapted from a post by Scott Contini here.
69
How to avoid problems like the above
Two choices:
1. Become a cryptography expert, deeply versed in every algorithm and every
caveat to its use. Hire auditors or fund and operate bug bounty programs to
inspect every use of cryptography you produce until your level of expertise
exceeds that of your opponents. Live in constant fear.
or
70
Examples of higher level libraries
Low-level versus high level, task by task:
• Low-level: Password hashing with salt, iteration count, etc. (e.g., iterated SHA-2 with secure RNG-generated salt)
▪ High level: At minimum, use something like PBKDF2. Even better, use a user management library that does this for you (for example, many web frameworks like Django and Meteor handle user authentication for you)
• Low-level: Secure a synchronous communication channel from eavesdropping (e.g., X.509 for authentication, DH for key exchange, AES for encryption)
▪ High level: Use Transport Layer Security (TLS), or even better, put your communication over HTTPS if possible.
• Low-level: Secure asynchronous communications like email from eavesdropping (e.g., RSA with a public key infrastructure including X.509 for key distribution and authentication, AES for encryption)
▪ High level: Use OpenPGP (or similar) via email or another transport. See also commercial solutions like Signal.
• Low-level: Store content on disk in encrypted form (e.g., AES-256 CBC with key derived from password using PBKDF2).
▪ High level: Use VeraCrypt, dm-crypt, BitLocker, etc. Even a passworded ZIP is better than doing it yourself.
71
If you find yourself needing to use crypto primitives yourself, check out “Crypto 101”.
Conclusion
72
Crypto basics summary
• Symmetric (secret key) cryptography
▪ c = Es(p,k)
▪ p = Ds(c,k)
Where: c = ciphertext, p = plaintext, k = secret key, Es = Encryption function (symmetric), Ds = Decryption function (symmetric)