
Chapter 3

Shannon’s Theory of Secrecy

3.1 Introduction to attack and security assumptions

After the introduction to some basic encryption schemes in the previous chapter, we will now try to explain the modern theory behind the design of cryptographic primitives. The starting point is to give a more thorough treatment of possible attack scenarios. We have a symmetric encryption system in mind, according to the model described in the previous chapter. We list a number of possible attacks.
Ciphertext-only attack: Eve (the enemy) is assumed to have access to the ciphertext c.
The target for Eve would be to try to recover the secret key k, the plaintext m, or possibly
some partial information about the plaintext. This is the weakest form of an attack that
we consider.
Known-plaintext attack: Eve knows the plaintext and the ciphertext. She tries to recover
the secret key k. Alternatively, she might know a part of the plaintext together with
the full ciphertext, and she tries to recover the unknown part of the plaintext (or just
some partial information about it). It should be noted that a known-plaintext attack is
a basic attack that all cryptographic primitives should be resistant against. For example,
Eve should not be able to recover the key from a known plaintext and the corresponding
ciphertext. It is very often the case in real applications that some data known to Eve is
encrypted. So the known-plaintext attack is not unrealistic at all.
Chosen-plaintext attack: Not only does Eve know the plaintext but she is able to choose
it herself. She gets the corresponding ciphertext. Her target is to recover the secret key k.
As before, we can also consider an attack when she chooses a part of the plaintext and tries
to recover some other unknown part of it. This attack scenario is of course less realistic
compared to the attacks mentioned before. However, there are several applications where this is a realistic attack. One such example is when Eve attacks a primitive implemented in a protected device (e.g. a smart card). She might have access to the device and can feed it arbitrary input and observe the output.
Chosen-ciphertext attack: Eve is assumed to have access to a decryption algorithm and
can feed it with an arbitrary ciphertext, observing the corresponding plaintext. Again
the prime target is to recover k.


We sometimes also consider related-key attacks. Here plaintexts have been encrypted by different keys that are almost identical (they may differ in only one bit). Eve tries to recover one of the keys. An important scenario in certain applications is the area of side-channel attacks. Here Eve has access to a device performing encryption or decryption and gets additional information from side channels. This could be how the device consumed power during its execution, the time it takes the device to perform different operations, etc. Eve is attacking the implementation of an algorithm and not the algorithm itself. However, the possibility of a secure implementation depends a lot on the algorithm itself.
The second topic to consider is what we mean by a secure primitive. As we have just described, the security is related to the attack scenario we consider. However, the security can be of different kinds.
Unconditional security: A cryptographic primitive is said to be unconditionally secure if we can prove that it cannot be broken even if Eve has infinite computational resources. This is the strongest kind of security we can achieve. Note that under this assumption Eve is allowed to do an exhaustive key search, i.e., exhaustively try all possible keys.
Computational security: A cryptographic primitive is said to be computationally secure if
we can prove that the best algorithm for breaking it requires at least T operations, where
T is some large fixed number. It is a very rare event that a cryptosystem can be proved
secure under this assumption.
Provable security: A cryptographic primitive is said to be provably secure if its security can be reduced to some well-studied problem. This means that breaking the primitive implies that we can solve the well-studied problem. For example, a primitive might have been proved secure provided that an integer n cannot be factored. The security has then been reduced to the factoring problem. As long as we cannot factor large integers, the primitive is secure. The way we reduce one computational problem to another is an interesting topic and we will come back to it later.
Heuristic security: A primitive is said to be heuristically secure if there is no known method of breaking it, but we cannot prove its security in any sense. Actually, most ciphers used today have security of this kind.
After this introduction of attack scenarios and the different kinds of security for a primitive, we move to the first study of cryptographic primitives. In 1949 Claude Shannon published his paper on secrecy systems entitled “Communication Theory of Secrecy Systems”. This was the first formal treatment of the secrecy problem in the open literature. For the first time a cipher could be proved to be secure.
It is important to note that Shannon considered only ciphertext-only attacks and only unconditional security, i.e., Eve is assumed to have infinite computing power. This combination is best studied through probability theory and information theory.

3.2 Information theory


First a few basics from probability theory. Consider an experiment or trial of some kind,
resulting in some outcome. The set of all possible outcomes is called the sample space of
the experiment, denoted Ω.

Let Ω be finite,
\[
\Omega = \{\omega_1, \omega_2, \ldots, \omega_n\}.
\]
The elements ω_i in Ω are called elementary events. An event E is simply any subset of Ω. We assume that the elementary events have an associated probability measure P(ω_i) such that 0 ≤ P(ω_i) ≤ 1 and \(\sum_{i=1}^{n} P(\omega_i) = 1\). The probability of an event E is given by
\[
P(E) = \sum_{\omega \in E} P(\omega).
\]

A discrete random variable X takes values from a finite set 𝒳. It is a mapping X : Ω → 𝒳. It has a probability distribution P(X), where the notation P(X = x) means the probability that the random variable X takes the value x ∈ 𝒳. The probability P(X = x) is given as
\[
P(X = x) = \sum_{\omega : X(\omega) = x} P(\omega).
\]
We use the notation P(x) rather than P(X = x) for convenience. Obviously P(x) ≥ 0 for all x ∈ 𝒳 and
\[
\sum_{x \in \mathcal{X}} P(x) = 1.
\]

Let E ⊂ 𝒳. Then X ∈ E is an event and its probability is calculated as
\[
P(X \in E) = \sum_{x \in E} P(x).
\]

Example 3.1. Let X denote the outcome when throwing a die. Then Ω = 𝒳 = {1, 2, 3, 4, 5, 6} and P(X = x) = 1/6 for all x ∈ 𝒳. Consider the event of throwing an even number, i.e., E = {2, 4, 6}. Its probability is
\[
P(X \in E) = \sum_{x \in E} P(x) = 1/2.
\]

Note that a random variable does not need to take on numerical values. Considering the example above, we can define a new random variable Y taking values from 𝒴 = {odd, even}. Still, Ω = {1, 2, 3, 4, 5, 6} and Y(1) = odd, Y(2) = even, ..., Y(6) = even, giving
\[
P(Y = \text{odd}) = \sum_{\omega : Y(\omega) = \text{odd}} P(\omega) = 1/2,
\]
etc.
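As a concrete illustration (a minimal sketch, not part of the original text), the die example can be written out in Python: the sample space is a dictionary of elementary events and their probabilities, and the random variables X and Y are mappings defined on it.

```python
from fractions import Fraction

# Sample space of a fair die: elementary events with probability 1/6 each.
omega = {w: Fraction(1, 6) for w in range(1, 7)}

# Random variables as mappings from the sample space.
X = lambda w: w                                   # the number shown
Y = lambda w: "even" if w % 2 == 0 else "odd"     # parity of the number

def prob(event):
    """P(E): sum of P(omega) over the elementary events where the event holds."""
    return sum(p for w, p in omega.items() if event(w))

print(prob(lambda w: X(w) in {2, 4, 6}))   # P(X in {2,4,6}) = 1/2
print(prob(lambda w: Y(w) == "odd"))       # P(Y = odd)      = 1/2
```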
A pair of random variables X, Y defined on the same sample space can be considered as
a single random variable, say Z = (X, Y ). The random variable Z takes values in X × Y,
with Z(ω) = (X(ω), Y (ω)). Usually, we want to keep the X and Y variables visible, so
we write P (X, Y ), etc., without introducing a new random variable.
A similar reasoning can be made for events E and F . We can consider the joint event G
that both E and F occur. Then G corresponds to the event G = E ∩ F . The notation
P (E, F ) is defined as P (E ∩ F ).
Two events E and F are said to be independent if

P (E, F ) = P (E)P (F ).

Correspondingly, two random variables X, Y are said to be independent random variables if
\[
P(X = x, Y = y) = P(X = x)P(Y = y), \quad \forall x \in \mathcal{X}, y \in \mathcal{Y}.
\]
Next we introduce conditional probabilities.
Definition 3.1. The conditional probability P(E|F) is defined as
\[
P(E|F) = \frac{P(E, F)}{P(F)},
\]
assuming P(F) ≠ 0.

For random variables we have a similar definition of P(X|Y) as
\[
P(X = x | Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)}.
\]

We will now introduce the concept of entropy, which is a measure of the uncertainty of a random variable.
Definition 3.2. The entropy H(X) of a discrete random variable X is defined as
\[
H(X) = -\sum_{x \in \mathcal{X}} P(x) \log P(x).
\]

The log is to the base 2 and entropy is expressed in bits. Also, we use the convention that 0 log 0 = 0, which is easily justified since x log x → 0 as x → 0.
Recall that the expectation E(F(X)) of a function F(X) is defined as
\[
E(F(X)) = \sum_{x \in \mathcal{X}} P(x) F(x).
\]

Clearly, an alternative way of expressing the entropy is


H(X) = E(− log P (X)).
We can see that the actual values of X are not used in the calculation of the entropy, only
its probability distribution. This means that the entropy of X can be calculated even if
X does not take on numerical values.
Entropy can be interpreted as a measure of the uncertainty about the outcome of the random variable. If X takes one value with probability 1 and all other values with probability 0, then the entropy is 0 bits. There is no uncertainty, since we know what value X will take.
If X takes on two possible values, both with probability 1/2, then the entropy is 1 bit. If X takes on four possible values, all with probability 1/4, then the entropy is 2 bits, and so on.
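To make the definition concrete, here is a small Python sketch (not part of the original text) that computes H(X) from a probability distribution, using the convention 0 log 0 = 0; it reproduces the values 0, 1, and 2 bits mentioned above.

```python
import math

def entropy(dist):
    """Entropy in bits of a probability distribution given as a list of P(x)."""
    # The convention 0*log(0) = 0 is handled by skipping zero probabilities.
    return -sum(p * math.log2(p) for p in dist if p > 0)

print(entropy([1.0, 0.0]))   # 0.0 bits: no uncertainty
print(entropy([0.5, 0.5]))   # 1.0 bit : two equally likely values
print(entropy([0.25] * 4))   # 2.0 bits: four equally likely values
```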
If we consider (X, Y) as one random variable we get
\[
H(X, Y) = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P(x, y) \log P(x, y).
\]
We need to define conditional entropy.



Definition 3.3. The conditional entropy H(X|Y) of a discrete random variable X, conditioned on another discrete random variable Y, is defined as
\[
H(X|Y) = -\sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P(x, y) \log P(x|y).
\]

Note that this is equivalently expressed as H(X|Y) = E(− log P(X|Y)). If P(y) ≠ 0 we can introduce the notation
\[
H(X|Y = y) = -\sum_{x \in \mathcal{X}} P(x|y) \log P(x|y).
\]
It is then straightforward to derive the expression
\[
H(X|Y) = \sum_{y \in \mathcal{Y}} P(y) H(X|Y = y).
\]
This may be a convenient way to calculate H(X|Y).
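As an illustration (again a sketch, not from the original text, with an arbitrarily chosen joint distribution), the following Python snippet computes H(X, Y) and H(X|Y), and checks the identity H(X|Y) = Σ_y P(y) H(X|Y = y).

```python
import math

def H(dist):
    """Entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

# Joint distribution P(X=x, Y=y) as a dictionary; an arbitrary example.
joint = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.5, (1, 1): 0.0}

# Marginal distribution P(y).
P_y = {}
for (x, y), p in joint.items():
    P_y[y] = P_y.get(y, 0.0) + p

# Direct definition: H(X|Y) = -sum_{x,y} P(x,y) log P(x|y).
H_X_given_Y = -sum(p * math.log2(p / P_y[y])
                   for (x, y), p in joint.items() if p > 0)

# Alternative route: H(X|Y) = sum_y P(y) H(X|Y=y).
H_alt = sum(P_y[y] * H([p / P_y[y] for (x, yy), p in joint.items() if yy == y])
            for y in P_y if P_y[y] > 0)

print(H(list(joint.values())))   # H(X,Y) = 1.5 bits
print(H_X_given_Y, H_alt)        # both ways give the same H(X|Y)
```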


The entropy function has several important properties.
Theorem 3.1. If X is a random variable taking values in the set 𝒳 = {x_1, x_2, ..., x_{|𝒳|}}, then
0 ≤ H(X) ≤ log|𝒳|.
Furthermore, H(X) = 0 if and only if P(x) = 1 for some x ∈ 𝒳; and H(X) = log|𝒳| if and only if P(x) = 1/|𝒳| for all x ∈ 𝒳.

A similar property holds for the conditional entropy.


Theorem 3.2. If X is a random variable taking values in the set 𝒳 = {x_1, x_2, ..., x_{|𝒳|}}, then
0 ≤ H(X|Y) ≤ log|𝒳|.
Furthermore, H(X|Y) = 0 if and only if, for every y, P(x|y) = 1 for some x ∈ 𝒳; and H(X|Y) = log|𝒳| if and only if, for every y, P(x|y) = 1/|𝒳| for all x ∈ 𝒳.

Also for H(X|Y = y) similar inequalities hold.


Consider a number of random variables X_1, X_2, ..., X_n. The following chain rule is often very useful.
Theorem 3.3.
\[
H(X_1 X_2 \cdots X_n) = H(X_1) + H(X_2|X_1) + H(X_3|X_1 X_2) + \cdots + H(X_n|X_1 X_2 \cdots X_{n-1}).
\]

In particular, H(XY ) = H(X) + H(Y |X) = H(Y ) + H(X|Y ).


Finally, the following inequality shows that the uncertainty of a random variable X can
never increase by knowledge of the outcome of another random variable.
Theorem 3.4.
H(X|Y ) ≤ H(X)
with equality if and only if X and Y are independent.

The inequality leads to the fact that

H(XY ) ≤ H(X) + H(Y ),

again with equality if and only if X and Y are independent.


Consider H(X) in the case |𝒳| = 2. There are two possible values, one with probability p and the other with probability 1 − p. This case is so common that the entropy function has received a notation of its own,

h(p) = −p log p − (1 − p) log(1 − p).
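A quick numeric check of the binary entropy function (a sketch, not part of the original text):

```python
import math

def h(p):
    """Binary entropy function h(p) = -p log p - (1-p) log(1-p), in bits."""
    if p in (0.0, 1.0):
        return 0.0           # convention 0 log 0 = 0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(h(0.5))    # 1.0 bit, the maximum
print(h(0.11))   # roughly 0.5 bits
```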

We will now introduce the concept of mutual information. To start with, we define the relative entropy D(P(X)||Q(X)) between two probability distributions P(X) and Q(X) as
\[
D(P(X)\|Q(X)) = \sum_{x \in \mathcal{X}} P(x) \log \frac{P(x)}{Q(x)}.
\]
This can equivalently be written as
\[
D(P(X)\|Q(X)) = E_P\!\left(\log \frac{P(X)}{Q(X)}\right).
\]
Note that the expectation is taken over P(X) and that in general D(P(X)||Q(X)) ≠ D(Q(X)||P(X)). The relative entropy is a measure of the distance between two distributions. It can be thought of as the inefficiency of assuming distribution Q(X) when the correct distribution is P(X).
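The asymmetry is easy to see numerically; below is a small sketch (not from the original text) with two arbitrary distributions P and Q.

```python
import math

def D(P, Q):
    """Relative entropy D(P||Q) in bits, for distributions given as lists."""
    return sum(p * math.log2(p / q) for p, q in zip(P, Q) if p > 0)

P = [0.5, 0.5]
Q = [0.9, 0.1]
print(D(P, Q), D(Q, P))   # different values: about 0.737 and 0.531 bits
```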
Definition 3.4. The mutual information I(X; Y) between random variables X and Y is defined as
\[
I(X; Y) = D(P(X, Y)\|P(X)P(Y)),
\]
or equivalently
\[
I(X; Y) = \sum_{x \in \mathcal{X}, y \in \mathcal{Y}} P(x, y) \log \frac{P(x, y)}{P(x)P(y)}.
\]

The mutual information I(X; Y ) measures the information (in bits) we receive about
the random variable X when observing the outcome of the random variable Y . But it
also describes the information we receive about the random variable Y when observing
the outcome of the random variable X. Hence the name mutual information and the
property I(X; Y ) = I(Y ; X). Some very important properties for the mutual information
are summarized in the following theorem.
Theorem 3.5.

I(X; Y ) = H(X) − H(X|Y ), (3.1)


I(X; Y ) = H(Y ) − H(Y |X), (3.2)
I(X; Y ) = H(X) + H(Y ) − H(X, Y ), (3.3)
I(X; Y ) = I(Y ; X), (3.4)
I(X; X) = H(X). (3.5)
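The identities in Theorem 3.5 are easy to verify numerically. The following Python sketch (not from the original text; the joint distribution is an arbitrary example) computes I(X; Y) directly from Definition 3.4 and via equation (3.3).

```python
import math

def H(dist):
    """Entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

# An arbitrary joint distribution P(X=x, Y=y).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

P_x = {x: sum(p for (xx, y), p in joint.items() if xx == x) for x in (0, 1)}
P_y = {y: sum(p for (x, yy), p in joint.items() if yy == y) for y in (0, 1)}

# Definition: I(X;Y) = sum P(x,y) log( P(x,y) / (P(x)P(y)) ).
I_def = sum(p * math.log2(p / (P_x[x] * P_y[y]))
            for (x, y), p in joint.items() if p > 0)

# Theorem 3.5, equation (3.3): I(X;Y) = H(X) + H(Y) - H(X,Y).
I_thm = H(list(P_x.values())) + H(list(P_y.values())) - H(list(joint.values()))

print(I_def, I_thm)   # both approximately 0.278 bits
```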

[Figure 3.1: Shannon's model of a secrecy system. Alice encrypts the plaintext m into the ciphertext c, which Bob decrypts back to m; the key K is produced by a key source and shared over a secure channel; Eve observes c.]

Finally we define the conditional mutual information I(X; Y|Z) in an analogous way,
\[
I(X; Y|Z) = D(P(X, Y|Z)\|P(X|Z)P(Y|Z)),
\]
where
\[
D(P(X|Z)\|Q(X|Z)) = \sum_{z \in \mathcal{Z}} P(z) \sum_{x \in \mathcal{X}} P(x|z) \log \frac{P(x|z)}{Q(x|z)}
\]
is the conditional relative entropy. Clearly,
\[
I(X; Y|Z) = H(X|Z) - H(X|Y, Z) = H(Y|Z) - H(Y|X, Z),
\]
etc. We end by giving the following inequality.

Theorem 3.6.
I(X; Y ) ≥ 0,
with equality if and only if X and Y are independent.

3.3 Shannon's theory of secrecy


Shannon's model of a secrecy system is essentially the model we saw in Chapter 2. The model appears again in Figure 3.1. We consider a given set of encryption functions, one for each key k, mapping a sequence of plaintext letters m = m_1, m_2, ..., with m_i ∈ M, to a sequence of ciphertext letters c = c_1, c_2, ..., with c_i ∈ C. If not otherwise stated, we assume that the plaintext and ciphertext letters are from the same alphabet.
Eve has access to the ciphertext c and her task is to obtain some information about either the transmitted message (plaintext) or the key k used by Alice and Bob. For a length N sequence, let M = (M_1, M_2, ..., M_N) be the random variable corresponding to the plaintext Alice is sending, and let C = (C_1, C_2, ..., C_N) be the random variable corresponding to the ciphertext Bob is receiving. Also, K is a random variable, taking values in the key space K, representing the chosen key.

We introduce the key entropy
\[
H(K) = -\sum_{k \in \mathcal{K}} P(k) \log P(k),
\]
and the message entropy
\[
H(M) = -\sum_{m \in \mathcal{M}^N} P(m) \log P(m).
\]

Clearly, we have
\[
H(K) \le \log |\mathcal{K}|,
\]
and
\[
H(M) \le N \log |\mathcal{M}|.
\]
The key entropy describes the uncertainty Eve faces regarding the unknown key a priori (i.e., without having observed the transmitted ciphertext). Similarly, the message entropy describes the uncertainty regarding the transmitted message. Now, Eve observes the transmitted ciphertext. The remaining uncertainty is described by the key equivocation
\[
H(K|C) = -\sum_{k \in \mathcal{K},\, c \in \mathcal{C}^N} P(k, c) \log P(k|c),
\]
and the message equivocation
\[
H(M|C) = -\sum_{m \in \mathcal{M}^N,\, c \in \mathcal{C}^N} P(m, c) \log P(m|c).
\]

From Theorem 3.4 we have


H(K|C) ≤ H(K),
and
H(M|C) ≤ H(M).
Our uncertainty about the key and the message can never increase by observing the
ciphertext.
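To make the key equivocation concrete, here is a small Python sketch (not part of the original text; the toy cipher and the message distribution are invented purely for illustration). It uses a shift cipher over Z_3 that reuses a single uniformly random key digit for a two-letter message, and computes H(K) and H(K|C) by brute-force enumeration. Observing the ciphertext reduces, but does not remove, the uncertainty about the key, exactly as H(K|C) ≤ H(K) predicts.

```python
import math
from itertools import product

def H(dist):
    """Entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

# Toy shift cipher over Z_3: c_i = (m_i + k) mod 3 with one key digit k reused
# for the whole message. Message letters are i.i.d. and non-uniform.
P_letter = {0: 0.5, 1: 0.3, 2: 0.2}
P_key = {k: 1/3 for k in range(3)}
N = 2                                   # message length

# Joint distribution P(k, c) over keys and ciphertexts.
P_kc = {}
for m in product(P_letter, repeat=N):
    p_m = math.prod(P_letter[x] for x in m)
    for k, p_k in P_key.items():
        c = tuple((x + k) % 3 for x in m)
        P_kc[(k, c)] = P_kc.get((k, c), 0.0) + p_m * p_k

# Marginal distribution P(c) of the ciphertext.
P_c = {}
for (k, c), p in P_kc.items():
    P_c[c] = P_c.get(c, 0.0) + p

# Key equivocation H(K|C) = -sum P(k,c) log P(k|c).
H_K_given_C = -sum(p * math.log2(p / P_c[c]) for (k, c), p in P_kc.items() if p > 0)

print(H(list(P_key.values())))   # H(K) = log 3, about 1.585 bits
print(H_K_given_C)               # H(K|C) is smaller: the ciphertext leaks key information
```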
A nonprobabilistic encryption scheme is an encryption function for which every plaintext
message is mapped to a unique ciphertext under a fixed key. Most encryption functions
are nonprobabilistic.
Theorem 3.7. For a nonprobabilistic encryption scheme we have

H(M|C) ≤ H(K|C).

Proof. The expression H(K, M|C) can be written as

H(K, M|C) = H(K|C) + H(M|C, K).

When key and ciphertext are given, the plaintext is uniquely determined, since we consider
only nonprobabilistic encryption schemes. So H(M|C, K) = 0.
Since H(M|C) ≤ H(K, M|C) we get H(M|C) ≤ H(K|C).

Let |M| = |C| = L. The maximum entropy of the considered alphabet, H0 = log L, is sometimes called the rate of the alphabet.
The actual entropy of the message source per alphabet symbol is denoted H_M and is given by
\[
H_M = \frac{H(M)}{N},
\]
for a length N message M = (M_1, M_2, ..., M_N).
There are two cases to consider. The first one is a memoryless source. By the chain rule,
\[
H(M) = H(M_1) + H(M_2|M_1) + \cdots + H(M_N|M_1, \ldots, M_{N-1}),
\]
and due to the memoryless property
\[
H(M) = H(M_1) + H(M_2) + \cdots + H(M_N) = N H(M_1).
\]
Finally, H_M = H(M_1), i.e., it is enough to find the uncertainty of a single message symbol.
The second case is when the source is not memoryless. Then we use the expression H_M = H∞, where H∞ is the entropy per letter defined by
\[
H_\infty = \lim_{N \to \infty} \frac{H(M_1, M_2, \ldots, M_N)}{N}.
\]
For English, the entropy per letter can be determined experimentally to be around H∞ = 1.5 bits.

Definition 3.5. We define the redundancy of a source, denoted D, to be
\[
D = H_0 - H_M.
\]

The redundancy is an important characterization of a source, since it describes how much a source can be compressed. For English, the redundancy is
\[
D = \log 26 - 1.5 \approx 3.2 \text{ bits}.
\]

This means that out of the 4.7 bits needed to represent a letter, only 1.5 bits are necessary for a unique representation. The remaining 3.2 bits can be removed. In more practical terms, a file of 1000 letters of English text would require 4700 bits if represented in a basic form (or 5000 bits if each possible letter corresponds to a 5-bit pattern). The above argument shows that 1500 bits are theoretically enough to represent the text (by clever and complex encoding). The remaining 3200 bits are the redundancy that “can be removed”.
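As a rough illustration (a sketch, not part of the original text; the sample string is arbitrary), one can estimate the first-order entropy of English from single-letter frequencies. Note that such a single-letter estimate is considerably higher than H∞ ≈ 1.5 bits, since it ignores the dependencies between letters.

```python
import math
from collections import Counter

def first_order_entropy(text):
    """Estimate per-letter entropy (in bits) from single-letter frequencies."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Any sample of English text will do; a pangram is used here only for brevity.
sample = "the quick brown fox jumps over the lazy dog " * 100
print(first_order_entropy(sample))   # single-letter estimate (about 4.5 bits for this sample)

H0 = math.log2(26)                   # rate of the alphabet, about 4.70 bits
print(H0 - 1.5)                      # redundancy D = H0 - H_M, about 3.2 bits
```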
One of Shannon's main results was to show that, due to the redundancy in a source, cryptosystems can be broken.

Theorem 3.8.
H(K|C) ≥ H(K) − ND.

Proof. Since H(K, M, C) = H(K, M) = H(K, C) (the ciphertext is uniquely determined by key and message, and the message is uniquely determined by key and ciphertext), and since the key is chosen independently of the message,
\[
H(K, M) = H(K) + H(M),
\]
we get on the one hand
\[
H(K, M) = H(K) + N H_M,
\]
and on the other hand
\[
H(K, C) = H(C) + H(K|C) \le N H_0 + H(K|C).
\]
Combining these expressions gives
\[
H(K|C) \ge N H_M + H(K) - N H_0 = H(K) - N D.
\]

From the theorem we see that when the length of the encrypted message exceeds H(K)/D, the right-hand side becomes zero or negative, i.e., the bound no longer guarantees any uncertainty about the key, and the uncertainty might be close to zero. This leads to the following definition.
Definition 3.6. The unicity distance denoted N0 is defined as

N0 = H(K)/D.

Let us comment briefly on this definition. If the message length is longer than N0, the inequality in Theorem 3.8 only gives H(K|C) ≥ 0 (or weaker), i.e., Eve's uncertainty about the key could be very small, but it does not necessarily have to be.
If, on the other hand, the message length is shorter than N0, then Eve faces some uncertainty about the key, i.e., there are many key values that could have generated the observed ciphertext. But this does not mean that there is an uncertainty about the
Even though it is hard to fully describe the meaning of unicity distance, from a practical
point of view it gives a rough borderline between the case when there are several possible
solutions and the case when there is only one possible solution for the key or the message.
Example 3.2. Consider a simple substitution cipher encrypting an English source. We have H(K) = log 26! ≈ 88.4 bits. Using D = 3.2 we get N0 = 88.4/3.2 ≈ 28 letters. The interpretation is that if we know more than 28 ciphertext letters there is probably a unique value of the key giving a meaningful message. All other key values give messages that are not meaningful.
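A quick numeric check of Example 3.2 (a sketch, not part of the original text), using the quantities defined above:

```python
import math

H0 = math.log2(26)                    # rate of the English alphabet, about 4.70 bits
H_M = 1.5                             # entropy per letter of English, from the text
D = H0 - H_M                          # redundancy, about 3.2 bits per letter

H_K = math.log2(math.factorial(26))   # key entropy of a simple substitution cipher
print(H_K)                            # about 88.4 bits
print(H_K / D)                        # unicity distance N0, about 28 letters
```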
Example 3.3. A Vernam cipher has key entropy H(K) = N H0, since the key is as long as the message and uniformly distributed. Since D < H0 we get N0 > N, i.e., the unicity distance is larger than the message length.

Having demonstrated the fact that redundancy helps Eve in her cryptanalytic attempts,
we come to two conclusions regarding services provided by a communication system.

• Compression should be done before encryption, and it improves the security of a cryptosystem.

• Channel coding (adding parity bits for correction/detection of errors) should be done after encryption.

So the correct order of services is first to compress the source, then to encrypt the resulting
data, and finally to perform channel coding.
We can strengthen the notion of an unbreakable system as follows.
Definition 3.7. A cryptosystem is said to have perfect secrecy if
\[
I(M; C) = 0.
\]

In a system with perfect secrecy, the plaintext and the ciphertext are independent.
When Eve observes the ciphertext, she obtains no information at all about the plain-
text (H(M|C) = H(M)). This is in some sense the strongest notion of security we can
hope for. Unfortunately, a cryptosystem with perfect secrecy has a severe drawback,
making it useless for practical purposes.
Theorem 3.9. For a cryptosystem with perfect secrecy we have
H(M) ≤ H(K).

Proof. From Theorems 3.7 and 3.4 we know that
\[
H(M|C) \le H(K|C) \le H(K).
\]
But the definition of perfect secrecy implies H(M|C) = H(M), so the theorem follows.

In a system with perfect secrecy we need the key size to be at least as large as the size of the plaintext (after compression). As most applications consider plaintexts of large size (like encrypting a file), this would lead to huge key sizes, which are practically impossible to handle.
We end this chapter with the following result.
Theorem 3.10. The Vernam cipher has perfect secrecy.

Proof. We prove the result for the binary alphabet. For the Vernam cipher we have
\[
C = M + K,
\]
where M = (M_1, M_2, ..., M_N) and K = (K_1, K_2, ..., K_N), with M_i, K_i ∈ F_2 for i = 1, 2, ..., N. Whereas M may have any distribution, K is uniformly distributed and independent of M. Hence P(c|m) = P(K = c − m) = 1/2^N for every m and c, and consequently P(c) = 1/2^N as well. Thus
\[
I(M; C) = \sum_{m} \sum_{c} P(m, c) \log \frac{P(c|m)}{P(c)} = \sum_{m} \sum_{c} P(m, c) \log \frac{1/2^N}{1/2^N} = 0.
\]

Recall our discussion in the introduction of the chapter. The Vernam cipher is proved secure in the strongest possible sense, but this is only valid under the ciphertext-only assumption.
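To illustrate the proof idea (a minimal sketch, not from the original text), the following Python snippet enumerates all keys of a short binary Vernam cipher and shows that every ciphertext occurs equally often for any fixed plaintext, i.e., P(c|m) = 1/2^N independently of m.

```python
from itertools import product

N = 3  # block length

def vernam_encrypt(m, k):
    """Vernam cipher over F_2: bitwise XOR of message and key."""
    return tuple(mi ^ ki for mi, ki in zip(m, k))

for m in [(0, 0, 0), (1, 0, 1)]:
    # With K uniform over all 2^N keys, count how often each ciphertext occurs.
    counts = {}
    for k in product((0, 1), repeat=N):
        c = vernam_encrypt(m, k)
        counts[c] = counts.get(c, 0) + 1
    # Every ciphertext occurs exactly once, so P(c|m) = 1/2^N for every m.
    print(m, set(counts.values()))   # prints {1} for both plaintexts
```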

3.4 Exercises
Exercise 3.1. Consider the sample space given by the outcome when throwing a die, so Ω = {1, 2, 3, 4, 5, 6}. Consider three different random variables X, Y, Z defined on Ω, where X is the received number and Y takes on two possible values, either “EVEN” or “ODD”, depending on whether an even or an odd number is received. Finally, Z also takes on two possible values, either “LOW” or “HIGH”, where “LOW” is obtained if the received number is at most 3.
a) Calculate
1. H(X)
2. H(Y)
3. H(Z)
4. H(XY)
5. H(XZ)
6. H(YZ)
7. H(XYZ)
b) Calculate
1. I(X;Y)
2. I(X;Z)
3. I(Y;Z)
4. I(X;YZ)
Exercise 3.2. Assume that X1, X2, X3 are three random variables such that P(X1 = x1, X2 = x2, X3 = x3) = 1/4 if (x1, x2, x3) ∈ {(000), (011), (101), (110)}, and zero otherwise. Calculate
a) H(X1 )
b) H(X1 X2 )
c) H(X2 |X1 )
d) H(X1 X2 X3 )
e) H(X3|X1 X2 )
f) H(X3 )
g) I(X1 ; X3 )
h) I(X1 X2 ; X3 )
Exercise 3.3. Assume that X1 , X2 , X3 are three random variables such that P (X1 =
x1 , X2 = x2 , X3 = x3 ) = 1/5 if (x1 , x2 , x3 ) ∈ {(000), (001), (010), (100), (111)}, and zero
otherwise. Calculate
a) H(X1 )
b) H(X2 )
c) H(X3 )
d) H(X2 |X1 )
e) H(X1X2 )
f) H(X3 |X1 X2 )
g) H(X1 X2 X3 )
h) H(X2 |X1 = 0)
i) H(X2|X1 = 1)

Exercise 3.4. Let X ∈ {rain, sunshine} be a produced weather forecast and let Y ∈ {rain, sunshine} be the actual weather. The joint distribution is P(X = rain, Y = rain) = 1/4, P(X = rain, Y = sunshine) = 1/2, P(X = sunshine, Y = rain) = 0, P(X = sunshine, Y = sunshine) = 1/4.
Examining this situation, you can see that the forecast is correct only with probability 0.5. By always predicting sunshine you can increase the probability of a correct weather forecast to 0.75. Use information-theoretic arguments to decide whether the latter is a better forecast.

Exercise 3.5. Continuing the previous exercise, assume instead that P (X = rain, Y =
sunshine) = 1/2 and P (X = sunshine, Y = rain) = 1/2, and zero otherwise. What can
you say about the forecast in this case?

Exercise 3.6. Assume that Bob has two coins, one genuine with heads and tails, and one with heads on both sides. He randomly selects one of the coins and tells us the outcome of two tosses with it. Determine how much information we get about which coin he selected from the outcome of the coin tosses.

Exercise 3.7. Prove that


I(M; C) ≥ H(M) − H(K),
i.e., a cryptosystem with small key size will leak a lot of information about the plaintext.

Exercise 3.8. Determine the unicity distance N0 as a function of n for a Vigenère cipher with period n encrypting English text. What can you say about the security if we encrypt twice using two Vigenère encryptions with different keys but the same period?

Exercise 3.9. Determine the unicity distance N0 for a Playfair cipher encrypting English
text. In a Playfair cipher, the alphabet size is 25 and the key is an arbitrary permutation
of the alphabet.
