Introduction To Cryptography
Dan Boneh
References
• https://cs155.stanford.edu/lectures/07-crypto.pptx
• https://www.cs.purdue.edu/homes/ninghui/courses/555_Spring12/handouts/555_Spring12_topic01.ppt
• https://www.cs.purdue.edu/homes/ninghui/courses/555_Spring12/handouts/555_Spring12_topic02.ppt
• https://cs.fit.edu/~mmahoney/cse4232/crypto.ppt
Goals of Cryptography
• The most fundamental problem cryptography addresses: ensuring secure
communication over an insecure medium
• What does secure communication mean?
– confidentiality (privacy, secrecy)
• only the intended recipient can see the communication
– integrity (authenticity)
• the communication is generated by the alleged sender
• What does insecure medium mean?
– Two possibilities:
• Passive attacker: the adversary can eavesdrop
• Active attacker: the adversary has full control over the
communication channel
Approaches to Secure Communication
• Steganography
– “covered writing”
– hides the existence of a message
– depends on secrecy of method
• Cryptography
– “hidden writing”
– hide the meaning of a message
– depends on secrecy of a short key, not method
Terms: Cryptography, cryptanalysis, and cryptology
• Cryptography
– Traditionally, designing algorithms/protocols
– Nowadays, often used as a synonym for cryptology
• Cryptanalysis
– Breaking algorithms/protocols
A Sample List of Other Goals in Modern Cryptography
History of Cryptography
• 2500+ years
• An ongoing battle between codemakers and codebreakers
• Driven by communication & computation technology
– paper and ink (until end of 19th century)
– cryptographic engine & telegram, radio
• Enigma machine, Purple machine used in WWII
– computers & digital communication
Major Events in History of Cryptography
• Mono-alphabetical ciphers (Before 1000 AD)
• Frequency analysis (Before 1000 AD)
• Cipher machines (early 1900’s)
• Shannon developed the theory of perfect secrecy and information-theoretic
security (around 1950)
• US adopts Data Encryption Standard in 1977
• Notion of public key cryptography and digital signatures introduced
(1970~1976)
• The study of cryptography becomes mainstream in the research
community (1976)
• Development of computational security and other theoretical foundation
of modern cryptography (1980’s)
Symmetric-key Encryption
• This was what cryptography was all about until the 1970s.
• Two parties (often called a sender and a receiver) share some
secret information called a key.
Basic Terminology for Encryption
• Plaintext - An original message, also referred to as
message
Notation for Symmetric-key Encryption
• A symmetric-key encryption scheme is comprised of three algorithms
– Gen the key generation algorithm
• The algorithm must be probabilistic/randomized
• Output: a key k
– Enc the encryption algorithm
• Input: key k, plaintext m
• Output: ciphertext c := Enc_k(m)
– Dec the decryption algorithm
• Input: key k, ciphertext c
• Output: plaintext m := Dec_k(c)
Requirement: ∀k ∀m [ Dec_k(Enc_k(m)) = m ]
Shift Cipher
• The Key Space K :
– [0 .. 25]
• Encryption given a key k:
– each letter in the plaintext P is replaced with the k-th letter
following it in the alphabet, wrapping around from Z to A
(shift right)
• Decryption given k:
– shift left
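A minimal Python sketch of the shift cipher, written in the Gen/Enc/Dec notation introduced earlier. The function names `gen`, `enc`, and `dec` are illustrative, and the sketch assumes an uppercase A to Z alphabet with all other characters passed through unchanged.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase  # 'A'..'Z'


def gen() -> int:
    """Gen: pick a key uniformly at random from the key space [0 .. 25]."""
    return secrets.randbelow(26)


def enc(k: int, m: str) -> str:
    """Enc: shift every letter of m right by k positions, wrapping around."""
    out = []
    for ch in m.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + k) % 26])
        else:
            out.append(ch)  # assumption: non-letters are left unchanged
    return "".join(out)


def dec(k: int, c: str) -> str:
    """Dec: shift left by k, which equals shifting right by (-k mod 26)."""
    return enc(-k % 26, c)
```

For example, `enc(3, "HELLO")` gives `"KHOOR"`, and `dec(k, enc(k, m))` returns `m` (in uppercase) for every key `k`, which is the correctness requirement for an encryption scheme. With only 26 keys, the cipher falls to exhaustive key search.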
Mono-alphabetic Substitution Cipher
• The key space: all permutations of Σ = {A, B, C, …, Z}
• Encryption given a key π:
– each letter X in the plaintext P is replaced with π(X)
• Decryption given a key π:
– each letter Y in the ciphertext C is replaced with π⁻¹(Y)
Example:
    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
π = B A D C Z H W Y G O Q X S V T R N M L K J I P F E U
BECAUSE → AZDBJLZ
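The substitution can be sketched in Python with a translation table. This is only an illustration: the names `PI`, `substitute_encrypt`, and `substitute_decrypt` are made up here, and `PI` is the example permutation from the table above, written as the images of A through Z.

```python
import string

ALPHABET = string.ascii_uppercase
# Example key: position i holds the image of the i-th letter of the alphabet.
PI = "BADCZHWYGOQXSVTRNMLKJIPFEU"

ENC_TABLE = str.maketrans(ALPHABET, PI)   # X -> pi(X)
DEC_TABLE = str.maketrans(PI, ALPHABET)   # Y -> pi^-1(Y)


def substitute_encrypt(plaintext: str) -> str:
    """Replace each letter X with pi(X); other characters are unchanged."""
    return plaintext.upper().translate(ENC_TABLE)


def substitute_decrypt(ciphertext: str) -> str:
    """Replace each letter Y with the inverse permutation applied to Y."""
    return ciphertext.upper().translate(DEC_TABLE)
```

For example, `substitute_encrypt("HELLO")` gives `"YZXXT"`, and decrypting that returns `"HELLO"`. The key space has 26! (roughly 4 × 10^26) permutations, so exhaustive search is impractical, yet the cipher is still weak for other reasons.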
Coding Break #1
• Build a Python function named “caesar_cipher” for
Caesar’s Cipher encryption
Strength of the Mono-alphabetic Substitution Cipher
• Basic ideas:
– Each language has certain features: frequency of
letters, or of groups of two or more letters.
– Substitution ciphers preserve the language
features.
• History of frequency analysis
– Discovered by the Arabs; the earliest known description is in a book by the
ninth-century scientist al-Kindi
– Rediscovered in, or introduced from the Arabs to, Europe during the Renaissance
• Frequency analysis made substitution ciphers insecure
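The first step of frequency analysis is counting letters. The sketch below is one possible way to do it; `letter_frequencies` is a hypothetical helper name, and the sketch assumes only the letters a to z matter.

```python
from collections import Counter


def letter_frequencies(text: str) -> dict:
    """Return the relative frequency of each letter a..z that occurs in text."""
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    total = len(letters)
    if total == 0:
        return {}
    counts = Counter(letters)
    return {ch: counts[ch] / total for ch in sorted(counts)}


def most_common_letters(text: str, top: int = 5) -> list:
    """Letters sorted from most to least frequent, truncated to `top` items."""
    freqs = letter_frequencies(text)
    return sorted(freqs, key=freqs.get, reverse=True)[:top]
```

Because a mono-alphabetic substitution maps each letter to a fixed substitute, the most frequent ciphertext letters are good guesses for the images of the most frequent plaintext letters.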
Frequency of Letters in English
[Figure: bar chart of letter frequencies in English, one bar per letter a to z; vertical axis from 0 to 14]
How to Defeat Frequency Analysis?
• Use larger blocks as the basis of substitution. Rather than
substituting one letter at a time, substitute 64 bits at a time,
or 128 bits.
– Leads to block ciphers such as DES & AES.
Coding Break #2
• Copy a block of English text from the Internet, such
as the following Wikipedia page:
https://en.wikipedia.org/wiki/Caesar_cipher
Random Variable
Definition
A discrete random variable, X, consists of a finite set $\mathcal{X}$ and
a probability distribution defined on $\mathcal{X}$. The probability that
the random variable X takes on the value x is denoted
Pr[X = x]; sometimes we abbreviate this to Pr[x] if the
random variable X is fixed. It must be that
$0 \le \Pr[x]$ for all $x \in \mathcal{X}$, and $\sum_{x \in \mathcal{X}} \Pr[x] = 1$.
Example of Random Variables
• Let random variable D1 denote the outcome of throwing one die
(with numbers 0 to 5 on the 6 sides) at random; then
D = {0,1,2,3,4,5} and Pr[D1=i] = 1/6 for 0 ≤ i ≤ 5
• Let random variable D2 denote the outcome of throwing a second
such die at random
• Let random variable S1 denote the sum of the two dice; then
S = {0,1,2,…,10}, and
Pr[S1=0] = Pr[S1=10] = 1/36
Pr[S1=1] = Pr[S1=9] = 2/36 = 1/18
…
• Let random variable S2 denote the sum of the two dice modulo 6.
What is the distribution of S2?
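These distributions can be checked by enumerating all 36 equally likely outcomes. The sketch below is one way to do so; the helper name `distribution` is illustrative and exact fractions are used to avoid rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

FACES = range(6)  # die faces 0..5, as in the example


def distribution(f):
    """Distribution of f(d1, d2) over two independent fair dice."""
    counts = Counter(f(d1, d2) for d1, d2 in product(FACES, repeat=2))
    return {value: Fraction(c, 36) for value, c in sorted(counts.items())}


S1 = distribution(lambda d1, d2: d1 + d2)        # sum of the two dice
S2 = distribution(lambda d1, d2: (d1 + d2) % 6)  # sum modulo 6

for value, prob in S1.items():
    print(f"Pr[S1={value}] = {prob}")
for value, prob in S2.items():
    print(f"Pr[S2={value}] = {prob}")
```

The enumeration shows that S1 peaks at 5 with probability 6/36, while S2 is uniform: every value 0 to 5 has probability 1/6.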
Relationships between Two Random Variables
Definitions
Assume X and Y are two random variables,
then we define:
- joint probability: Pr[x, y] is the probability that
X takes value x and Y takes value y.
- conditional probability: Pr[x|y] is the probability
that X takes on the value x given that Y takes
value y.
Pr[x|y] = Pr[x, y] / Pr[y]
- independent random variables: X and Y
are said to be independent if Pr[x, y] = Pr[x]Pr[y]
for all x ∈ X and all y ∈ Y.
Examples
• Joint probability of D1 and D2
for 0 ≤ i, j ≤ 5, Pr[D1=i, D2=j] = ?
• What is the conditional probability Pr[D1=i | D2=j] for
0 ≤ i, j ≤ 5?
• Are D1 and D2 independent?
Examples to think about after class
• What is the joint probability of D1 and S1?
• What is the joint probability of D2 and S2?
Bayes’ Theorem
If Pr[y] > 0 then
$$\Pr[x \mid y] = \frac{\Pr[x]\,\Pr[y \mid x]}{\Pr[y]}$$
Corollary
X and Y are independent random variables iff
Pr[x|y] = Pr[x] for all x ∈ X and all y ∈ Y.
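Bayes' theorem can be verified numerically on the two-dice example. The sketch below is illustrative only: `pr` and `pr_given` are hypothetical helpers that compute probabilities by counting outcomes, and the chosen events (first die shows 2, sum equals 4) are arbitrary.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (d1, d2) of two dice with faces 0..5.
OUTCOMES = list(product(range(6), repeat=2))


def pr(event):
    """Probability of an event, given as a predicate on (d1, d2)."""
    return Fraction(sum(1 for o in OUTCOMES if event(*o)), len(OUTCOMES))


def pr_given(event, condition):
    """Conditional probability Pr[event | condition] = Pr[both] / Pr[condition]."""
    both = pr(lambda d1, d2: event(d1, d2) and condition(d1, d2))
    return both / pr(condition)


def x(d1, d2):
    return d1 == 2        # the first die shows 2


def y(d1, d2):
    return d1 + d2 == 4   # the sum S1 equals 4


lhs = pr_given(x, y)                    # Pr[x | y]
rhs = pr(x) * pr_given(y, x) / pr(y)    # Pr[x] Pr[y | x] / Pr[y]
print(lhs, rhs)
```

Both sides equal 1/5. Since Pr[x] is 1/6, knowing the sum changes the probability of the first die, so D1 and S1 are not independent.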
Ways to Enhance the Substitution Cipher against Frequency
Analysis
• Using nulls
– e.g., using numbers from 1 to 99 as the ciphertext
alphabet, with some numbers representing nothing
and inserted randomly
• Deliberately misspell words
– e.g., “Thys haz thi ifekkt off diztaughting thi
ballans off frikwenseas”
• Homophonic substitution cipher
– each letter is replaced by a variety of substitutes
• These make frequency analysis more difficult, but not
impossible
Towards the Polyalphabetic Substitution Ciphers
Coding Break #3
• Build a Python function named “vigenere_cipher” for
Vigenère Cipher encryption
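One possible shape for such a function is sketched below. It assumes a non-empty key made of letters only, uppercases the text, and leaves non-letters unchanged without consuming a key letter; the `decrypt` flag is an addition for illustration.

```python
from itertools import cycle


def vigenere_cipher(text: str, key: str, decrypt: bool = False) -> str:
    """Shift each letter of text by the matching key letter (A=0 .. Z=25)."""
    sign = -1 if decrypt else 1
    # Repeat the key shifts for as long as the text requires.
    shifts = cycle(ord(k) - ord("A") for k in key.upper())
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            shifted = (ord(ch) - ord("A") + sign * next(shifts)) % 26
            out.append(chr(shifted + ord("A")))
        else:
            out.append(ch)  # assumption: non-letters pass through unchanged
    return "".join(out)
```

For example, `vigenere_cipher("thesunandthemaninthemoon", "KING")` gives `"DPRYEVNTNBUKWIAOXBUKWWBT"`, and calling the function again on that result with `decrypt=True` recovers the plaintext in uppercase.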
How to Find the Key Length?
• For Vigenere, as the length of the keyword
increases, the letter frequency shows less
English-like characteristics and becomes more
random.
• Two methods to find the key length:
– Kasiski test
– Index of coincidence (Friedman)
Kasiski Test
• Note: two identical segments of plaintext will be
encrypted to the same ciphertext if they occur in the
text at a distance Δ with Δ ≡ 0 (mod m), where m is the key
length.
• Algorithm:
– Search the ciphertext for pairs of identical
segments of length at least 3
– Record the distances between
the two segments: Δ1, Δ2, …
– m divides gcd(Δ1, Δ2, …)
https://youtu.be/asRbswE2hFY
Example of the Kasiski Test
Key        K I N G K I N G K I N G K I N G K I N G K I N G
Plaintext  t h e s u n a n d t h e m a n i n t h e m o o n
Ciphertext D P R Y E V N T N B U K W I A O X B U K W W B T
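The test can be sketched in a few lines of Python and run on the ciphertext from this example. The helper name `kasiski_distances` is hypothetical, and the sketch only records distances between consecutive occurrences of each repeated segment.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def kasiski_distances(ciphertext: str, seg_len: int = 3) -> list:
    """Distances between consecutive occurrences of each repeated segment."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - seg_len + 1):
        positions[ciphertext[i:i + seg_len]].append(i)
    distances = []
    for occurrences in positions.values():
        distances.extend(b - a for a, b in zip(occurrences, occurrences[1:]))
    return distances


ciphertext = "DPRYEVNTNBUKWIAOXBUKWWBT"
distances = kasiski_distances(ciphertext)
# The key length should divide the gcd of all distances (0 if none found).
candidate = reduce(gcd, distances, 0)
print(distances, candidate)
```

Here the segments BUK and UKW each repeat at distance 8, so the gcd is 8 and the key length must divide 8, which is consistent with the four-letter key KING. On short texts, accidental repetitions can add misleading distances, so the gcd is a hint rather than a proof.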
Index of Coincidence (Friedman)
Informally: measures the probability that two randomly chosen elements
of the n-letter string x are identical.
Definition:
Suppose x = x1x2…xn is a string of n alphabetic characters. Then
Ic(x), the index of coincidence, is
$$I_c(x) = \Pr[x_i = x_j]$$
where i and j are distinct positions chosen uniformly at random from [1..n].
https://youtu.be/kty-dCB4AAk
Index of Coincidence (cont.)
• Consider the plaintext x, and f0, f1, … f25 are the
frequencies with which A, B, … Z appear in x and
p0, p1, … p25 are the probabilities with which A, B,
… Z appear in x.
• That is pi = fi / n where n is the length of x
$$I_c(x) = \frac{\sum_{i=0}^{S} \binom{f_i}{2}}{\binom{n}{2}} = \frac{\sum_{i=0}^{S} f_i (f_i - 1)}{n(n-1)} \approx \frac{\sum_{i=0}^{S} f_i^2}{n^2} = \sum_{i=0}^{S} p_i^2$$
Index of Coincidence of English
• For English, S = 25 and pi can be estimated
$$I_c(x) \approx \sum_{i=0}^{25} p_i^2 \approx 0.065$$
Finding the Key Length
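Given y = y1y2…yn, assume m is the key length and write y vertically in an
m-row array, whose rows are the substrings $\mathbf{y}_1, \ldots, \mathbf{y}_m$:
$$\begin{array}{c|cccc}
\mathbf{y}_1 & y_1 & y_{m+1} & \cdots & y_{n-m+1} \\
\mathbf{y}_2 & y_2 & y_{m+2} & \cdots & y_{n-m+2} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\mathbf{y}_m & y_m & y_{2m} & \cdots & y_n
\end{array}$$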
Finding out the Key Length
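• If m is the key length, then each row of the array “looks like”
English text:
$$I_c(\mathbf{y}_j) \approx \sum_{i=0}^{25} p_i^2 \approx 0.065 \quad \text{for all } 1 \le j \le m$$
• If m is not the key length, then the rows look like random text:
$$I_c \approx \sum_{i=0}^{25} \left(\frac{1}{26}\right)^2 = 26 \times \frac{1}{26^2} = \frac{1}{26} \approx 0.038$$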
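A rough Python sketch of this key-length search follows. The function names are illustrative; `index_of_coincidence` uses the exact f_i(f_i − 1) / (n(n − 1)) form, and `column_iocs` computes it for each of the m rows of the array.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Probability that two distinct random positions hold the same letter."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))


def column_iocs(ciphertext: str, m: int) -> list:
    """Index of coincidence of each of the m rows y_1 .. y_m."""
    letters = [ch for ch in ciphertext.upper() if "A" <= ch <= "Z"]
    return [index_of_coincidence("".join(letters[r::m])) for r in range(m)]


def average_ioc(ciphertext: str, m: int) -> float:
    """Average row index of coincidence for a guessed key length m."""
    values = column_iocs(ciphertext, m)
    return sum(values) / len(values)
```

A typical use is to compute `average_ioc(ciphertext, m)` for m = 1, 2, 3, … and pick the smallest m whose average is close to 0.065 rather than 0.038. The estimate is only reliable when each row contains enough letters, so short ciphertexts give noisy values.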
Rotor Machines
• Basic idea: if the key in Vigenere cipher is very long, then the
attacks won’t work
• Implementation idea: multiple rounds of substitutions
• A machine consists of multiple cylinders
– Each character is encrypted by multiple cylinders
– Each cylinder has 26 states, at each state it is a substitution
cipher
– Each cylinder rotates to change states according to
different schedule
Rotor Machines
• An m-cylinder rotor machine has
– 26^m different substitution ciphers
• 26^3 = 17,576
• 26^4 = 456,976
• 26^5 = 11,881,376
Earliest Enigma Machine
• Uses 3 scramblers (rotors): 17,576
substitutions
History of the Enigma Machine
• Patented by Scherbius in 1918
• Widely used by the Germans from 1926 to the end of the Second
World War
• First successfully broken by Polish cryptanalysts in the 1930s by
exploiting the repetition of the message key
• Then broken by UK intelligence during WW II
The Imitation Game - Breaking the Enigma Code
Cryptographic Attacks
• Ciphertext only: attacker has only ciphertext.
• Known plaintext: attacker has plaintext and corresponding
ciphertext.
• Chosen plaintext: attacker can encrypt messages of his
choosing.
• Distinguishing attack: an attacker can distinguish your cipher
from an ideal cipher (random permutation).
Cryptography is not:
– The solution to all security problems
– Reliable unless implemented and used properly
– Something you should try to invent yourself
Definitions
• Cryptography = the science (art) of encryption
• Cryptanalysis = the science (art) of breaking encryption
• Cryptology = cryptography + cryptanalysis
Assignment 1