
An introduction to Information Theory

Adrish Banerjee
Department of Electrical Engineering
Indian Institute of Technology Kanpur
Kanpur, Uttar Pradesh
India

July 18, 2016

Lecture #2B: Problem solving session-I

Conditional Entropy
Problem # 1: Give examples of jointly distributed random variables X and Y
such that
i) H(Y|X = x) < H(Y)
ii) H(Y|X = x) > H(Y)

Solution: Suppose that the random vector [X, Y, Z] is equally likely to take
any of the following four values: [0,0,0], [0,1,0], [1,0,0] and [1,0,1].
Then PX(0) = PX(1) = 1/2, so that H(X) = h(1/2) = 1 bit.
Note that PY|X(0|1) = 1, so that H(Y|X = 1) = 0.
Similarly, we have PY|X(0|0) = 1/2, so that H(Y|X = 0) = h(1/2) = 1 bit.
Since PY(1) = 1/4, we have H(Y) = h(1/4) = 0.811 bits. Thus we have
i) H(Y|X = 1) = 0 < H(Y) = 0.811 bits
ii) H(Y|X = 0) = 1 bit > H(Y) = 0.811 bits
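
The arithmetic above is easy to check mechanically. Below is a minimal Python
sketch (not part of the original lecture; the helper name `entropy` is ours)
that enumerates the four equally likely outcomes and computes H(Y), H(Y|X = 0)
and H(Y|X = 1) directly from the empirical distribution.

```python
from collections import Counter
from math import log2

# The four equally likely (X, Y, Z) outcomes from the solution above
outcomes = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 0, 1)]

def entropy(probs):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Marginal entropy H(Y)
py = Counter(y for _, y, _ in outcomes)
H_Y = entropy([c / len(outcomes) for c in py.values()])

# Conditional entropy H(Y | X = x) for each value of x
for x_val in (0, 1):
    rows = [y for x, y, _ in outcomes if x == x_val]
    py_given_x = Counter(rows)
    H_Y_given_x = entropy([c / len(rows) for c in py_given_x.values()])
    print(f"H(Y|X={x_val}) = {H_Y_given_x:.3f} bits")

print(f"H(Y)     = {H_Y:.3f} bits")
# Expected: H(Y|X=1) = 0.000 < H(Y) = 0.811 < H(Y|X=0) = 1.000
```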

Mutual Information
Problem # 2: Give examples of jointly distributed random variables X, Y and Z
such that
i) I(X; Y|Z) < I(X; Y)
ii) I(X; Y|Z) > I(X; Y)

i) Solution: Let X, Y and Z form a Markov chain X → Y → Z. By the chain rule
for mutual information,

I(X; Y, Z) = I(X; Z) + I(X; Y|Z)
           = I(X; Y) + I(X; Z|Y)

We note that I(X; Z|Y) = 0 by Markovity, and I(X; Z) ≥ 0. Thus,

I(X; Y|Z) ≤ I(X; Y),

with strict inequality whenever I(X; Z) > 0.

ii) Let X and Y be independent fair binary random variables, and let Z = X + Y.
Then I(X; Y) = 0, but
I(X; Y|Z) = H(X|Z) − H(X|Y, Z) = H(X|Z) = P(Z = 1) H(X|Z = 1) = 1/2 bit.
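
Part (ii) can also be verified numerically. The following Python sketch (an
illustration we add here; the helper names `marginal` and `mutual_information`
are ours) builds the joint distribution of (X, Y, Z) with Z = X + Y and
computes I(X; Y) and I(X; Y|Z).

```python
from itertools import product
from math import log2

# Joint distribution of (X, Y, Z): X, Y fair and independent, Z = X + Y
pxyz = {(x, y, x + y): 0.25 for x, y in product((0, 1), repeat=2)}

def marginal(p, idx):
    """Marginal distribution over the coordinates listed in idx."""
    out = {}
    for key, prob in p.items():
        k = tuple(key[i] for i in idx)
        out[k] = out.get(k, 0.0) + prob
    return out

def mutual_information(p, a, b):
    """Mutual information (bits) between coordinate groups a and b of p."""
    pa, pb, pab = marginal(p, a), marginal(p, b), marginal(p, a + b)
    return sum(q * log2(q / (pa[k[:len(a)]] * pb[k[len(a):]]))
               for k, q in pab.items() if q > 0)

I_XY = mutual_information(pxyz, (0,), (1,))

# I(X;Y|Z) = sum over z of P(Z=z) * I(X;Y | Z=z)
pz = marginal(pxyz, (2,))
I_XY_given_Z = 0.0
for (z,), pzval in pz.items():
    cond = {k: v / pzval for k, v in pxyz.items() if k[2] == z}
    I_XY_given_Z += pzval * mutual_information(cond, (0,), (1,))

print(f"I(X;Y)   = {I_XY:.3f} bits")          # 0.000
print(f"I(X;Y|Z) = {I_XY_given_Z:.3f} bits")  # 0.500
```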

Divergence
Problem # 3: Let PX(0) = PX(1) = 0.5, QX(0) = 0.25, QX(1) = 0.75, and
RX(0) = 0.2, RX(1) = 0.8. Show that the triangle inequality does not hold for
divergence, i.e.,
D(PX||RX) > D(PX||QX) + D(QX||RX)
Solution:
D(PX||QX) = 0.5 log(0.5/0.25) + 0.5 log(0.5/0.75) = 0.208 bits
D(QX||RX) = 0.25 log(0.25/0.2) + 0.75 log(0.75/0.8) = 0.011 bits
D(PX||RX) = 0.5 log(0.5/0.2) + 0.5 log(0.5/0.8) = 0.322 bits

Since 0.322 > 0.208 + 0.011 = 0.219, the triangle inequality is not satisfied.
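
These three values are straightforward to reproduce. Here is a short Python
check (added for illustration; the function name `kl` is ours) using base-2
logarithms.

```python
from math import log2

def kl(p, q):
    """D(p || q) in bits, for distributions given as lists of probabilities."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

P = [0.5, 0.5]
Q = [0.25, 0.75]
R = [0.2, 0.8]

d_pq, d_qr, d_pr = kl(P, Q), kl(Q, R), kl(P, R)
print(f"D(P||Q) = {d_pq:.3f}, D(Q||R) = {d_qr:.3f}, D(P||R) = {d_pr:.3f}")
print("Triangle inequality holds?", d_pr <= d_pq + d_qr)  # False: 0.322 > 0.219
```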

Mutual Information
Problem # 4: Consider a discrete memoryless channel with input X and output Y.
The input X takes values from a ternary set with equal probability, and it is
known that the probability of error for the system is p. Using Fano's lemma,
find a lower bound on the mutual information I(X; Y) as a function of p.
Solution: The mutual information can be written as

I(X; Y) = H(X) − H(X|Y)

By Fano's inequality,

H(X|Y) ≤ h(Pe) + Pe log(3 − 1) = h(p) + p

Thus,

I(X; Y) ≥ H(X) − h(p) − p = log 3 − h(p) − p

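The resulting bound log 3 − h(p) − p is easy to tabulate. The sketch below (an
added illustration, not from the slides; the function names are ours) evaluates
it for a few error probabilities.

```python
from math import log2

def h(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def fano_lower_bound(p):
    """Lower bound on I(X;Y) in bits, from Fano's inequality, ternary input."""
    return log2(3) - h(p) - p

for p in (0.0, 0.1, 0.25, 0.5):
    print(f"p = {p:.2f}: I(X;Y) >= {fano_lower_bound(p):.3f} bits")
```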

Concave Function
Problem # 5: Let (X, Y) ~ p(x, y) = p(x)p(y|x). Show that the mutual
information I(X; Y) is a concave function of p(x) for fixed p(y|x).
Solution: To prove this, we expand the mutual information:

I(X; Y) = H(Y) − H(Y|X) = H(Y) − Σ_x p(x) H(Y|X = x)

If p(y|x) is fixed, then p(y) is a linear function of p(x).
Hence H(Y), which is a concave function of p(y), is a concave function of p(x).
The second term, Σ_x p(x) H(Y|X = x), is linear in p(x), since each
H(Y|X = x) depends only on the fixed p(y|x). Hence, the difference is a
concave function of p(x).

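Concavity can also be observed numerically: for a fixed channel and any two
input distributions p1 and p2, I(X; Y) evaluated at a convex combination of p1
and p2 is at least the corresponding combination of the individual values. The
Python sketch below illustrates this; the 3×3 channel matrix W and the
distributions p1, p2 are arbitrary assumptions chosen only for the demo.

```python
import numpy as np

def mutual_information(px, W):
    """I(X;Y) in bits for input distribution px and channel matrix W[x, y]."""
    joint = px[:, None] * W               # p(x, y)
    py = joint.sum(axis=0)                # p(y)
    mask = joint > 0
    return float((joint[mask] * np.log2(joint[mask] /
                  (px[:, None] * py[None, :])[mask])).sum())

W = np.array([[0.8, 0.1, 0.1],            # assumed 3-input, 3-output channel
              [0.1, 0.8, 0.1],
              [0.25, 0.25, 0.5]])
p1 = np.array([0.7, 0.2, 0.1])
p2 = np.array([0.1, 0.3, 0.6])

for lam in (0.25, 0.5, 0.75):
    mix = lam * p1 + (1 - lam) * p2
    lhs = mutual_information(mix, W)
    rhs = lam * mutual_information(p1, W) + (1 - lam) * mutual_information(p2, W)
    print(f"lam={lam}: I(mixture)={lhs:.4f} >= {rhs:.4f} =", lhs >= rhs)
```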
