Lecture 3 Introduction To Cryptography
Introduction to Cryptography
ECEG 4192 1
Introduction
• Human beings need to communicate and share information,
but selectively; this gave rise to the coding of messages
• The art of cryptography is considered to be born along with
the art of writing
• Cryptography: combination of two Greek words: ‘kryptos’
meaning hidden or secret and ‘graphein’ meaning to write
– Def: The art and science of concealing messages to introduce
secrecy in information security
• Cryptanalysis: breaking of cryptographic systems
• Cryptology: cryptography + cryptanalysis
• Cryptosystem/cipher system: the implementation
infrastructure (cryptographic algorithms, keys)
Cryptosystem
• Components of a cryptosystem
• Plaintext: original message
• Ciphertext: scrambled version of the plaintext
• Key (encryption and decryption keys): a value known to the
sender and/or receiver
[Figure: Basic model of cipher system]
• Encryption algorithm: mathematical process that produces
a ciphertext for any given plaintext and encryption key
• Decryption algorithm: reverses the encryption algorithm
Cont’d
• A cryptosystem is a five-tuple (P,C,K,E,D), where
the following conditions are satisfied:
– P is a finite set of possible plaintexts
– C is a finite set of possible ciphertexts
– K, the keyspace, is a finite set of possible keys
– E, D: sets of encryption and decryption rules, respectively
– For each k ∈ K, there is an encryption rule e_k ∈ E
and a corresponding decryption rule d_k ∈ D.
• Each e_k: P → C and d_k: C → P are functions such that
d_k(e_k(x)) = x for each x ∈ P
Cont’d
• Two requirements for secure use of encryption
– We need a strong encryption algorithm. If an attacker knows
the algorithm and has access to one or more ciphertexts, the
attacker should be unable to decipher the ciphertext or figure
out the key
– For a shared key, the sender and receiver must obtain copies of
the key in a secure fashion. If someone discovers the key and
knows the algorithm, all communication using this key is
readable.
• N.B: we do not need to keep the algorithm secret; we
need to keep only the key secret.
Attacking an encryption system
• Objective: to recover the key in use rather than simply to
recover the plaintext of a single ciphertext
• Two general approaches
• Cryptanalysis: relies on the nature of the algorithm
and perhaps some knowledge of the characteristics of
the plaintext
– The stronger the algorithm, the harder the
cryptanalysis is
• Brute-force attack: attacker tries every possible key on a
piece of ciphertext until a plaintext that makes sense is
found.
– On average, half of all possible keys must be tried to achieve
success.
Types of cryptanalysis
Brute force attacks
• Focus on trying every possible key
Introduction
• Basic building blocks of all encryption techniques are
substitution and transposition
– Substitution: the letters of the plaintext are replaced by
other letters or by numbers or symbols.
– Transposition: the plaintext letters are rearranged by
some sort of permutation.
• We will see some common classical encryption
techniques, focusing mainly on the processing of alphabetic text
– Caesar Cipher
– Mono-alphabetic Ciphers
– Playfair Cipher
– Vigenere Cipher
– One Time Pad
i) Caesar cipher
• The earliest known, and the simplest, use of a substitution
cipher
• Involves replacing each letter of the alphabet with the letter
standing three places further down the alphabet
• Example:
plain: a b c d e f g h i j k l m n o p q r s t u v w x y z
cipher: d e f g h i j k l m n o p q r s t u v w x y z a b c
• Encrypt the plain text: meet me after the toga party
Cont’d
• Then the algorithm can be expressed as follows. For each
plaintext letter p, substitute the ciphertext letter C:
C = E(3, p) = (p + 3) mod 26
• A shift may be of any amount, so that the general Caesar
algorithm is
C = E(k, p) = (p + k) mod 26 , k=shift
• And the decryption algorithm of this cipher becomes
p = D(k, C) = (C - k) mod 26
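The two formulas above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `caesar_encrypt` and `caesar_decrypt` are made up for this example, letters are numbered a = 0 … z = 25, and non-letters are passed through unchanged (an assumption, since the slides do not say how spaces are handled).

```python
import string

ALPHABET = string.ascii_lowercase  # a = 0, b = 1, ..., z = 25


def caesar_encrypt(plaintext, k):
    """Apply C = (p + k) mod 26 to every letter; keep other characters."""
    out = []
    for ch in plaintext.lower():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + k) % 26].upper())
        else:
            out.append(ch)
    return "".join(out)


def caesar_decrypt(ciphertext, k):
    """Apply p = (C - k) mod 26 to every letter; keep other characters."""
    out = []
    for ch in ciphertext.lower():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) - k) % 26])
        else:
            out.append(ch)
    return "".join(out)


print(caesar_encrypt("meet me after the toga party", 3))
# PHHW PH DIWHU WKH WRJD SDUWB
```

Running it on the plaintext from the earlier slide with k = 3 gives the ciphertext shown in the comment; `caesar_decrypt` with the same k recovers the message.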
Breaking Caesar cipher
• Easy!
• What is the size of the key space? Only 26 for English
text.
• Brute force: There are only 26 keys to try
• Cryptanalysis: if the language is known, frequency analysis
of the letters can be used
• Caesar cipher is insecure with respect to the brute force
and cryptanalysis techniques
• Can be improved by increasing the key space
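To see how small a 26-key space is, the brute-force attack can be sketched as below. The helper names `shift_back` and `brute_force_caesar` are hypothetical, and the attacker is assumed to pick out the sensible plaintext by eye.

```python
import string

ALPHABET = string.ascii_lowercase


def shift_back(ciphertext, k):
    """Undo a Caesar shift of k on every letter; keep other characters."""
    out = []
    for ch in ciphertext.lower():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) - k) % 26])
        else:
            out.append(ch)
    return "".join(out)


def brute_force_caesar(ciphertext):
    """Return (key, candidate plaintext) for all 26 possible shifts."""
    return [(k, shift_back(ciphertext, k)) for k in range(26)]


for k, candidate in brute_force_caesar("PHHW PH DIWHU WKH WRJD SDUWB"):
    print(k, candidate)
```

Only the line for k = 3 reads as English, so the key is found after at most 26 trial decryptions.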
ii) Mono-alphabetic Ciphers
• Permutation: rearrangement of the letters
– For n elements, there are n! permutations ; the first element
can be chosen in one of n ways, the second in n-1 ways, the
third in n-2 ways
• For the 26 letters of the alphabet, there are 26! rearrangements
– E.g. ‘a’ can be coded in 26 different ways, ‘b’ in 25 ways, ‘c’ in 24
ways, and so on: 26 × 25 × 24 × … × 1 = 26!
• Each plaintext letter is mapped to an arbitrary, fixed letter of
the ciphertext alphabet
• Using this method, the number of mappings, hence the
number of keys, is 26!
• The key space is very large: about 4 × 10^26 keys
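A minimal sketch of a mono-alphabetic cipher follows, assuming the key is stored as a dictionary from plaintext letter to ciphertext letter. The names `random_key`, `mono_encrypt` and `mono_decrypt` are illustrative only, and Python's `random` module is used purely for demonstration (it is not a cryptographically secure generator).

```python
import math
import random
import string

ALPHABET = string.ascii_lowercase

# Size of the key space: the number of permutations of 26 letters.
print(math.factorial(26))  # 403291461126605635584000000, about 4 x 10^26


def random_key(rng):
    """Pick one of the 26! permutations as a plaintext -> ciphertext map."""
    letters = list(ALPHABET)
    rng.shuffle(letters)
    return dict(zip(ALPHABET, letters))


def mono_encrypt(plaintext, key):
    """Substitute every letter using the key; keep other characters."""
    return "".join(key.get(ch, ch) for ch in plaintext.lower()).upper()


def mono_decrypt(ciphertext, key):
    """Invert the key and substitute back."""
    inverse = {c: p for p, c in key.items()}
    return "".join(inverse.get(ch, ch) for ch in ciphertext.lower())


key = random_key(random.Random(2024))  # fixed seed only for repeatability
secret = mono_encrypt("meet me after the toga party", key)
print(secret, "->", mono_decrypt(secret, key))
```

Note that every ‘e’ in the plaintext becomes the same ciphertext letter, which is the weakness exploited on the following slides.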
Breaking mono-alphabetic Ciphers
• Brute force attack on mono-alphabetic cipher is difficult
as the key space is large
– For the faster computer we saw (10^12 decryptions/sec), the
average time required is 6.4 million years
• If the nature of the plaintext is known (e.g. uncompressed
English text), the frequencies of letters, digrams, etc. can be
exploited
– The frequency distribution of the letters in the ciphertext is
computed and compared with the standard distribution
– The most common letter in the ciphertext likely corresponds to
‘e’ in the plaintext, and the next most common letter to ‘t’.
This is more reliable for long messages, where the
relative frequencies of the letters approach the standard
frequency distribution
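The brute-force figure quoted above can be checked with a little arithmetic. The sketch below assumes a 365-day year and that, on average, half of the 26! keys are tried before the right one is found.

```python
import math

KEYS = math.factorial(26)            # size of the mono-alphabetic key space
RATE = 10**12                        # trial decryptions per second (from the slide)
SECONDS_PER_YEAR = 365 * 24 * 3600   # assumption: 365-day year

average_trials = KEYS / 2            # on average half of the keys are tried
years = average_trials / RATE / SECONDS_PER_YEAR
print(f"{years:.2e} years")          # on the order of 6.4 million years
```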
Cont’d
[Figure: Frequency distribution of English letters]
Cont’d
[Tables: Common digrams; Common trigrams]
Example
• Given the ciphertext
UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ
VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX
EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ
determine the plaintext, if a mono-alphabetic cipher is used.
– Determine the relative frequency of the letters in the ciphertext
• P = 13.33%, Z = 11.67%, etc. P or Z could correspond to ‘e’. Go with the
tentative assignment P → e
– Determine the frequency of the digrams, trigrams, common
words
• Among the most common digrams is ZW (3 times), which may correspond
to ‘th’, the most common English digram. From this we can assume
Z → t, W → h and keep the substitution if an acceptable skeleton of a
message appears
– Frequency of trigrams: ZWP → ‘the’, etc.
– Continued analysis of frequencies plus trial and error should
easily yield a solution
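The counting step of this example can be reproduced with a short script. It is only a sketch of the first stage of the attack (letter and digram counts); the trial-and-error substitution is still left to the analyst. The helper name `percent` is made up for this example.

```python
from collections import Counter

CIPHERTEXT = (
    "UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ"
    "VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX"
    "EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ"
)

letters = Counter(CIPHERTEXT)
# Overlapping pairs of adjacent letters.
digrams = Counter(a + b for a, b in zip(CIPHERTEXT, CIPHERTEXT[1:]))


def percent(count):
    """Relative frequency in percent, rounded to two decimals."""
    return round(100 * count / len(CIPHERTEXT), 2)


for letter, count in letters.most_common(5):
    print(letter, percent(count))

print("ZW occurs", digrams["ZW"], "times")
```

The two most frequent letters come out as P (13.33%) and Z (11.67%), matching the figures on the slide.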
Cont’d
• The limitation of mono-alphabetic cipher is not
the key size.
• The problem is that the ciphertext reflects the
frequency of the characters of the plaintext
– If the nature of the language of the plaintext is known
and is uncompressed, then the mono alphabetic
cipher can be easily broken
• Solution: make one plaintext letter not always
correspond to the same ciphertext letter
– Playfair cipher: multiple-letter encryption
– Poly-alphabetic ciphers
iii) Playfair cipher
• Operates on pairs of letters
• Treats digrams in the plaintext as single units and
translates these units into ciphertext digrams
• Adv: one letter in the plaintext does not always correspond to
the same letter in the ciphertext
– changing the frequency of occurrence
• Playfair algorithm
– Is based on the use of a 5 × 5 matrix of letters constructed
using a keyword.
– The keyword is the secret that the sender and receiver share
– Plaintext is encrypted two letters at a time
– Letters I and J count as one letter (because the 5 × 5 matrix has only 25 cells)
Cont’d
1. Fill in the letters of the keyword (avoiding duplicates) from left
to right and from top to bottom, and then fill in the remainder
of the matrix with the remaining letters in alphabetic order
2. Divide the plaintext into pairs of letters. If the two letters of a
pair are the same, separate them with a filler letter (e.g. X); if the
number of letters is odd, add a filler X at the end too.
– E.g: balloon becomes ba lx lo on (the padded plaintext must have an even number of letters)
3. Two plaintext letters that fall in the same row of the matrix are
each replaced by the letter to the right, with the first element of
the row circularly following the last.
4. Two plaintext letters that fall in the same column are each replaced
by the letter beneath, with the top element of the column
circularly following the last.
5. Otherwise, each plaintext letter in a pair is replaced by the letter
that lies in its own row and the column occupied by the other
plaintext letter.
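The five steps above can be sketched in Python as follows. This is one possible implementation, not a reference one: the function names are made up, J is folded into I, the filler letter is X, and the awkward case of a doubled X in the plaintext is deliberately not handled.

```python
import string


def build_square(keyword):
    """Step 1: keyword letters first (no duplicates), then the rest; J -> I."""
    square = []
    for ch in keyword.upper() + string.ascii_uppercase:
        ch = "I" if ch == "J" else ch
        if ch in string.ascii_uppercase and ch not in square:
            square.append(ch)
    return "".join(square)  # 25 letters, read row by row


def split_pairs(plaintext, filler="X"):
    """Step 2: split into digrams, padding doubled letters and odd length."""
    letters = ["I" if c == "J" else c
               for c in plaintext.upper() if c in string.ascii_uppercase]
    pairs, i = [], 0
    while i < len(letters):
        if i + 1 < len(letters) and letters[i + 1] != letters[i]:
            pairs.append(letters[i] + letters[i + 1])
            i += 2
        else:  # doubled letter or last single letter: pad with the filler
            pairs.append(letters[i] + filler)
            i += 1
    return pairs


def transform(pairs, square, step):
    """Steps 3-5; step=+1 encrypts (right/down), step=-1 decrypts (left/up)."""
    out = []
    for a, b in pairs:
        ra, ca = divmod(square.index(a), 5)
        rb, cb = divmod(square.index(b), 5)
        if ra == rb:      # same row: move along the row, wrapping around
            out.append(square[ra * 5 + (ca + step) % 5]
                       + square[rb * 5 + (cb + step) % 5])
        elif ca == cb:    # same column: move along the column, wrapping around
            out.append(square[((ra + step) % 5) * 5 + ca]
                       + square[((rb + step) % 5) * 5 + cb])
        else:             # rectangle: keep own row, take the other's column
            out.append(square[ra * 5 + cb] + square[rb * 5 + ca])
    return "".join(out)


def playfair_encrypt(plaintext, keyword):
    return transform(split_pairs(plaintext), build_square(keyword), +1)


def playfair_decrypt(ciphertext, keyword):
    pairs = [ciphertext[i:i + 2] for i in range(0, len(ciphertext), 2)]
    return transform(pairs, build_square(keyword), -1)


print(split_pairs("balloon"))                     # ['BA', 'LX', 'LO', 'ON']
print(playfair_encrypt("hide money", "tutorials"))
```

With the keyword "tutorials" and the message "hide money" the sketch produces QCEFNUMFZY, and `playfair_decrypt` on that ciphertext returns HIDEMONEYX (the trailing X is the filler).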
Cont’d
• Decrypting the Playfair cipher is as simple as doing the encryption
process in reverse (moving left or up)
• Receiver has the same key and can create the same matrix
• Using Playfair encryption, the same letter in the plaintext can be
transformed into different letters in the ciphertext.
– Hence the frequency of a given letter in the plaintext will
not be the same in the ciphertext
• Making frequency analysis of the single letters difficult.
– But each plaintext digram always maps to the same ciphertext
digram, so digram frequencies carry over to the ciphertext.
– Attackers can hence use the frequency of digrams to break the
cipher.
– Better than the mono-alphabetic cipher: there are 26 × 26 = 676 digrams,
and their relative frequencies vary much less than those of individual letters
Example
1. Encrypt the message “hide money” using a keyword of
“tutorials”
• Fill in the matrix (I and J share one cell):
T U O R I
A L S B C
D E F G H
K M N P Q
V W X Y Z
• Split the plaintext into pairs
– HI DE MO NE YX
• Then encrypt the digrams
– HI → QC, DE → EF, MO → NU, NE → MF, YX → ZY
• Hence ciphertext = QCEFNUMFZY
EX.
1. Encrypt the plaintext “hello” using the same keyword
2. Decrypt the ciphertext “QCIDGUFW” using the same
keyword given above
iv) Poly-alphabetic ciphers
• In mono-alphabetic ciphers, a given letter of the plaintext is
always mapped to the same letter in the ciphertext
• If no letter occurs more frequently than the others in the
ciphertext, then frequency analysis becomes difficult
• E.g: Vigenere Cipher, One Time Pad
a) Vigenere Cipher
– To encrypt a message, a key is needed that is as long as the
message, usually obtained by repeating the keyword
– Assume plaintext P = p_0, p_1, …, p_(n-1) and a key
K = k_0, k_1, …, k_(m-1), where m < n; the ciphertext C =
C_0, C_1, …, C_(n-1) is calculated as follows:
C = (p_0 + k_0) mod 26, (p_1 + k_1) mod 26, …, (p_(m-1) + k_(m-1)) mod 26, (p_m + k_0) mod 26, …
Cont’d
• Vigenere cipher(cont’d):
– A general equation of the encryption process is
C_i = (p_i + k_(i mod m)) mod 26
– Example:
key:        deceptivedeceptivedeceptive
plaintext:  wearediscoveredsaveyourself
ciphertext: ZICVTWQNGRZGVTWAVZHCQYGLMGJ
– E.g: the letter ‘e’ in the plaintext is mapped to different letters in
the ciphertext
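The Vigenere example above can be reproduced with the sketch below, which shifts each plaintext letter by the key letter at position i mod m (the same mod-26 addition as the Caesar formula). The function names are illustrative, and the input is assumed to contain letters only.

```python
def vigenere_encrypt(plaintext, keyword):
    """C_i = (p_i + k_(i mod m)) mod 26, with a = 0 ... z = 25."""
    out = []
    for i, ch in enumerate(plaintext.lower()):
        p = ord(ch) - ord("a")
        k = ord(keyword[i % len(keyword)].lower()) - ord("a")
        out.append(chr((p + k) % 26 + ord("A")))
    return "".join(out)


def vigenere_decrypt(ciphertext, keyword):
    """p_i = (C_i - k_(i mod m)) mod 26."""
    out = []
    for i, ch in enumerate(ciphertext.upper()):
        c = ord(ch) - ord("A")
        k = ord(keyword[i % len(keyword)].lower()) - ord("a")
        out.append(chr((c - k) % 26 + ord("a")))
    return "".join(out)


print(vigenere_encrypt("wearediscoveredsaveyourself", "deceptive"))
# ZICVTWQNGRZGVTWAVZHCQYGLMGJ
```

In the output, the ‘e’ at position 1 becomes I while the ‘e’ at position 4 becomes T, because they are combined with different key letters.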
Breaking Vigenere cipher
• Adv: there are multiple ciphertext letters for each plaintext letter,
so the letter-frequency statistics of the plaintext are obscured in the
ciphertext
• But breaking it is not hard! The key length can be found through
brute force or other means of analysis.
• For simplicity, let us assume the attacker knows the length of
the keyword, in this case 9 (‘deceptive’).
• Then the ciphertext is divided into groups of 9 letters
• The first letters of all groups are encrypted with the same key
letter. Collect these first letters; the most common letter among
them is likely to stand for ‘e’. If ‘H’ is the most common letter,
then ‘H’ − ‘e’ = ‘d’ (7 − 4 = 3), so we can assume ‘d’ to be the 1st
letter of the key
• Do the same for the 2nd, 3rd, … letters of the groups, and the key
used to encrypt the message can be easily discovered.
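The column-by-column attack described above can be sketched as follows, assuming the key length is already known and that the most frequent letter in every column stands for ‘e’. The function name `guess_key` is hypothetical. On short ciphertexts the guess for some columns will often be wrong, and the analyst then tries the next most likely letters.

```python
from collections import Counter


def guess_key(ciphertext, key_length, assumed_plain="e"):
    """Guess each key letter as (most common column letter - 'e') mod 26.

    Assumes len(ciphertext) >= key_length so that no column is empty.
    """
    key = []
    for j in range(key_length):
        # Letters at positions j, j + m, j + 2m, ... share one key letter.
        column = ciphertext.upper()[j::key_length]
        most_common = Counter(column).most_common(1)[0][0]
        shift = (ord(most_common) - ord(assumed_plain.upper())) % 26
        key.append(chr(shift + ord("A")))
    return "".join(key)


# If 'H' dominates a column, the key letter is 'H' - 'e' = 'd'.
print(guess_key("HHHXH", 1))  # D
```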
b) One Time Pad
• Instead of using a keyword and repeating it, choose a key
that is a random sequence of letters with the same length
as the plaintext
• In addition, the key is used to encrypt and decrypt a
single message, and then is discarded
• It produces random output that bears no statistical
relationship to the plaintext → unbreakable
• For a truly random key, all letters are equally likely at
every position of the ciphertext
• Breaking OTP is hard
– We can’t use frequency analysis (the ciphertext letter distribution is uniform)
– We can’t use brute force (there are many (billions of) plaintexts
that make sense)
Cont’d
• Example: consider a ciphertext
ANKYODKYUREPFJBYOJDSPLREYIUNOFDOIUERFPLUYTS
• Using brute force, we can try all the possible keys
ciphertext: ANKYODKYUREPFJBYOJDSPLREYIUNOFDOIUERFPLUYTS
key: pxlmvmsydofuyrvzwc tnlebnecvgdupahfzzlmnyih
plaintext: mr mustard with the candlestick in the hall
• And
ciphertext: ANKYODKYUREPFJBYOJDSPLREYIUNOFDOIUERFPLUYTS
key: mfugpmiydgaxgoufhklllmhsqdqogtewbqfgyovuhwt
plaintext: miss scarlet with the knife in the library
• Suppose the cryptanalyst found these two keys.
– How does he know which one is the correct one?
• If the key is random and we do an exhaustive search,
there is no way of identifying the correct key, as there will be
many plaintexts that make sense
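The point of this example can be demonstrated in code: for any ciphertext and any candidate plaintext of the same length, there is a pad that links them. The sketch below is a simplification that uses a 26-letter alphabet without spaces (the slide's key contains a space, which suggests a larger alphabet); the function names and the two sample messages are made up for illustration.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase


def random_pad(length):
    """Random key as long as the message, to be used only once."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def otp_encrypt(plaintext, pad):
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 26]
        for p, k in zip(plaintext, pad)
    ).upper()


def otp_decrypt(ciphertext, pad):
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % 26]
        for c, k in zip(ciphertext.lower(), pad)
    )


def pad_that_yields(ciphertext, candidate):
    """Pad under which the ciphertext decrypts to the chosen candidate."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(p)) % 26]
        for c, p in zip(ciphertext.lower(), candidate)
    )


message = "attackatdawn"
pad = random_pad(len(message))
ciphertext = otp_encrypt(message, pad)

fake_pad = pad_that_yields(ciphertext, "retreatatsix")
print(otp_decrypt(ciphertext, pad))       # attackatdawn
print(otp_decrypt(ciphertext, fake_pad))  # retreatatsix
```

Both pads are equally plausible to an attacker, so an exhaustive key search cannot tell which plaintext was actually sent.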
Cont’d
• The security of the one-time pad is entirely due to the
randomness of the key.
• But OTP has limitations as well.
– As the plain text gets larger, then it is hard to distribute or
share a key of equal size. E.g if we have 10GB plaintext, then
sharing 10GB key is very difficult.
– Creating a truly random key is not easy. If the key is not
random, the cipher is no longer unbreakable.
• Hence OTP has limited practical use
Transposition methods
• Is achieved by performing some sort of permutation on
the plaintext letters
• E.g: Rail fence technique
• In Rail fence, the plaintext is written down as a
sequence of diagonals and then read off as a sequence
of rows
• E.g: plaintext: meet me after the toga party
• With a rail fence of depth 2, the message is written as follows.
This scheme is trivial to cryptanalyze.
m e m a t r h t g p r y
 e t e f e t e o a a t
Cipher text: MEMATRHTGPRYETEFETEOAAT
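For depth 2, the rail fence reduces to taking the letters at even positions followed by the letters at odd positions. A minimal sketch, with made-up function names and spaces dropped before encryption:

```python
def rail_fence_encrypt(plaintext):
    """Depth-2 rail fence: top row (even positions), then bottom row (odd)."""
    letters = [c for c in plaintext.lower() if c.isalpha()]
    return "".join(letters[0::2] + letters[1::2]).upper()


def rail_fence_decrypt(ciphertext):
    """Interleave the two rows again."""
    n = len(ciphertext)
    top, bottom = ciphertext[:(n + 1) // 2], ciphertext[(n + 1) // 2:]
    out = []
    for i in range(n):
        out.append(top[i // 2] if i % 2 == 0 else bottom[i // 2])
    return "".join(out).lower()


print(rail_fence_encrypt("meet me after the toga party"))
# MEMATRHTGPRYETEFETEOAAT
```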
Cont’d
• A more complex scheme is
– To write the message in a rectangle, row by row, and read the
message off column by column,
– but in a permuted order of the columns.
– The order of the columns then becomes the key to the algorithm
• Example:
Key:        4 3 1 2 5 6 7
Plaintext:  a t t a c k p
            o s t p o n e
            d u n t i l t
            w o a m x y z
Ciphertext: TTNAAPTMTSUOAODWCOIXKNLYPETZ
• But frequency analysis is still possible: a pure transposition leaves
the letter frequencies of the plaintext unchanged
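The columnar example can be sketched as below, assuming the key lists, for each column from left to right, the order in which that column is read. The function name `columnar_encrypt` is illustrative.

```python
def columnar_encrypt(plaintext, key):
    """Write row by row, then read columns in the order given by the key."""
    width = len(key)
    rows = [plaintext[i:i + width] for i in range(0, len(plaintext), width)]
    # Column indices sorted by their key number: the column labelled 1 first.
    order = sorted(range(width), key=lambda col: key[col])
    return "".join(
        row[col] for col in order for row in rows if col < len(row)
    ).upper()


plaintext = "attackpostponeduntiltwoamxyz"
ciphertext = columnar_encrypt(plaintext, [4, 3, 1, 2, 5, 6, 7])
print(ciphertext)  # TTNAAPTMTSUOAODWCOIXKNLYPETZ

# The letters are only rearranged, so the letter counts are unchanged.
print(sorted(ciphertext.lower()) == sorted(plaintext))  # True
```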