EE376A: Homework #1
Due on Thursday, January 21, 2016
You can hand in the homework either after class or deposit it, before 5 PM, in the EE376A
drawer of the class file cabinet on the second floor of the Packard Building.
1. Entropy of Hamming Code.
Consider information bits X1, X2, X3, X4 ∈ {0, 1} chosen uniformly at random, together with check bits X5, X6, X7 chosen to make the parity of the circles even.
[Figure: Venn diagram of three overlapping circles containing the seven bits X1, X2, X3, X4, X5, X6, X7.]
Thus, for example,
[Figure: the same diagram with a particular choice of information bits filled in and the resulting check bits shown.]
That is, 1011 becomes 1011010.
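For concreteness, here is a minimal Python sketch of the encoding. The circle memberships it uses are an assumption (they match the standard (7,4) Hamming layout and reproduce the example 1011 → 1011010); the Venn diagram above is the authoritative definition.

    # Sketch of the check-bit computation. The circle memberships below are an
    # assumption consistent with the example 1011 -> 1011010.
    def hamming_encode(x1, x2, x3, x4):
        x5 = x1 ^ x2 ^ x3  # parity over the circle assumed to contain X5
        x6 = x1 ^ x3 ^ x4  # parity over the circle assumed to contain X6
        x7 = x2 ^ x3 ^ x4  # parity over the circle assumed to contain X7
        return (x1, x2, x3, x4, x5, x6, x7)

    print(hamming_encode(1, 0, 1, 1))  # (1, 0, 1, 1, 0, 1, 0), i.e. 1011010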
(a) What is the entropy H(X1, X2, ..., X7)?
Now we make an error (or not) in one of the bits (or none). Let Y = X ⊕ e, where e is equally likely to be (1, 0, ..., 0), (0, 1, 0, ..., 0), ..., (0, 0, ..., 0, 1), or (0, 0, ..., 0), and e is independent of X.
(b) What is the entropy of Y?
(c) What is H(X|Y)?
(d) What is I(X; Y)?
2. Entropy of functions of a random variable.
Let X be a discrete random variable.
(a) Show that the entropy of a function of X is less than or equal to the entropy of
X by justifying the following steps:
(a) H(X, g(X)) = H(X) + H(g(X)|X)
(b)            = H(X).
(c) H(X, g(X)) = H(g(X)) + H(X|g(X))
(d)            ≥ H(g(X)).
Thus H(g(X)) ≤ H(X).
(A numerical sanity check of this conclusion appears after part (b).)
(b) Show that if Z = g(Y) then H(X|Y) ≤ H(X|Z).
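As a sanity check (not a proof) of the conclusion in part (a), one can evaluate both entropies numerically for a particular distribution; the pmf and the map g below are arbitrary illustrative choices.

    import numpy as np

    # Numerical illustration of H(g(X)) <= H(X); pmf and g are arbitrary.
    def entropy(p):
        p = np.asarray(p, dtype=float)
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    p_x = [0.5, 0.25, 0.125, 0.125]          # pmf of X on {0, 1, 2, 3}
    g = {0: 'a', 1: 'b', 2: 'b', 3: 'b'}     # a non-injective function of X

    # pmf of g(X): merge the probabilities of x values with the same image
    p_gx = {}
    for x, px in enumerate(p_x):
        p_gx[g[x]] = p_gx.get(g[x], 0.0) + px

    print(entropy(p_x), entropy(list(p_gx.values())))   # 1.75 >= 1.0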
3. Data Processing Inequality.
If X, Y, Z form a Markov triplet (X → Y → Z), show that:
(a) H(X|Y) = H(X|Y, Z) and H(Z|Y) = H(Z|X, Y)
(b) H(X|Y) ≤ H(X|Z)
(c) I(X; Y) ≥ I(X; Z) and I(Y; Z) ≥ I(X; Z)
(d) I(X; Z|Y) = 0
The following definition may be useful:
Definition 1: The conditional mutual information of random variables X and Y given
Z is defined by
I(X; Y|Z) = H(X|Z) − H(X|Y, Z)
          = ∑_{x,y,z} P(x, y, z) log [ P(x, y|z) / ( P(x|z) P(y|z) ) ]
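Definition 1 can be evaluated mechanically for any finite joint pmf. The sketch below does so for a pmf stored as a 3-D array and then checks, on an arbitrarily chosen Markov triplet, that I(X; Z|Y) = 0 as in part (d); all distributions used are illustrative.

    import numpy as np

    def cond_mutual_info(P):
        """I(X; Y | Z) for a joint pmf indexed as P[x, y, z] (Definition 1)."""
        P = np.asarray(P, dtype=float)
        Pz = P.sum(axis=(0, 1))     # P(z)
        Pxz = P.sum(axis=1)         # P(x, z)
        Pyz = P.sum(axis=0)         # P(y, z)
        total = 0.0
        for x, y, z in np.ndindex(*P.shape):
            if P[x, y, z] > 0:
                # P(x,y|z) / (P(x|z) P(y|z)) = P(x,y,z) P(z) / (P(x,z) P(y,z))
                ratio = P[x, y, z] * Pz[z] / (Pxz[x, z] * Pyz[y, z])
                total += P[x, y, z] * np.log2(ratio)
        return total

    # An illustrative Markov triplet X -> Y -> Z: P(x, y, z) = P(x) P(y|x) P(z|y).
    Px = np.array([0.3, 0.7])
    Py_x = np.array([[0.9, 0.1], [0.2, 0.8]])
    Pz_y = np.array([[0.6, 0.4], [0.5, 0.5]])
    P = Px[:, None, None] * Py_x[:, :, None] * Pz_y[None, :, :]

    # I(X; Z | Y): reorder the axes to (x, z, y) before applying the definition.
    print(cond_mutual_info(P.transpose(0, 2, 1)))   # ~ 0, consistent with part (d)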
4. Entropy of time to first success.
A fair coin is flipped until the first head occurs. Let X denote the number of flips
required.
(a) Find the entropy H(X) in bits. The following expressions may be useful:
∑_{n=1}^∞ r^n = r/(1 − r),    ∑_{n=1}^∞ n r^n = r/(1 − r)².
(A quick numerical check of these identities appears after this problem.)
(b) Find an "efficient" sequence of yes-no questions of the form "Is X contained in the set S?". Compare H(X) to the expected number of questions required to determine X.
(c) Let Y denote the number of flips until the second head appears. Thus, for example, Y = 5 if the second head appears on the 5th flip. Argue that H(Y ) =
H(X1 + X2 ) < H(X1 , X2 ) = 2H(X), and interpret in words.
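The two series identities quoted in part (a) can be sanity-checked numerically; r = 0.5 and the truncation at 200 terms below are arbitrary choices.

    # Numerical sanity check of the series identities from part (a).
    r, N = 0.5, 200
    s1 = sum(r**n for n in range(1, N + 1))
    s2 = sum(n * r**n for n in range(1, N + 1))
    print(s1, r / (1 - r))      # both ~ 1.0
    print(s2, r / (1 - r)**2)   # both ~ 2.0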
5. Example of joint entropy.
Let p(x, y) be given by

p(x, y) | Y = 0 | Y = 1
--------+-------+-------
X = 0   |  1/4  |  1/4
X = 1   |  1/2  |   0
Find
(a) H(X), H(Y ).
(b) H(X|Y ), H(Y |X).
(c) H(X, Y ).
(d) I(X; Y ).
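All of the requested quantities can be computed mechanically from a joint pmf. The sketch below shows one way to do so; the 2 × 2 matrix used is a placeholder example, not necessarily the table above.

    import numpy as np

    # Generic computation from a joint pmf P[x, y]; the matrix is a placeholder.
    P = np.array([[0.4, 0.1],
                  [0.2, 0.3]])

    def H(p):
        p = np.asarray(p, dtype=float).ravel()
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    H_XY = H(P)                       # H(X, Y)
    H_X, H_Y = H(P.sum(axis=1)), H(P.sum(axis=0))
    print(H_X, H_Y)                   # (a)
    print(H_XY - H_Y, H_XY - H_X)     # (b) H(X|Y), H(Y|X)
    print(H_XY)                       # (c)
    print(H_X + H_Y - H_XY)           # (d) I(X; Y)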
6. Infinite entropy.
This problem shows that the entropy of a discrete random variable can be infinite. Let A = ∑_{n=2}^∞ (n log² n)^{−1}. (It is easy to show that A is finite by bounding the infinite sum by the integral of (x log² x)^{−1}.) Show that the integer-valued random variable X distributed as P(X = n) = (A n log² n)^{−1} for n = 2, 3, ... has H(X) = +∞.
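A numerical illustration (not a proof): truncating this pmf at n ≤ N and renormalizing, the entropy keeps growing as N increases, though only roughly like log log N; the cutoff values below are arbitrary.

    import math

    # Entropy of the pmf proportional to 1/(n * log2(n)^2), truncated at n <= N
    # and renormalized; it grows without bound, but very slowly.
    def truncated_entropy(N):
        w = [1.0 / (n * math.log2(n) ** 2) for n in range(2, N + 1)]
        A = sum(w)                      # normalizer over n = 2..N only
        return -sum((x / A) * math.log2(x / A) for x in w)

    for N in (10**2, 10**4, 10**6):
        print(N, truncated_entropy(N))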
7. A measure of correlation.
Let X1 and X2 be identically distributed with positive entropy, but not necessarily
independent. Note that H(X1) = H(X2). Let

ρ = 1 − H(X2|X1) / H(X1).
(a) Show that 0 ≤ ρ ≤ 1.
(b) Show that I(X1; X2) = ρ H(X1).
(c) Show that ρ = 0 iff X1 is independent of X2.
(d) Show that ρ = 1 iff there exists a one-to-one function g such that X1 = g(X2) with probability one.
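A small numerical sketch of ρ for one particular joint distribution; the symmetric matrix below is an arbitrary choice (symmetry makes X1 and X2 identically distributed).

    import numpy as np

    # rho for an arbitrary symmetric joint pmf P[x1, x2]; numbers are illustrative.
    P = np.array([[0.4, 0.1],
                  [0.1, 0.4]])

    def H(p):
        p = np.asarray(p, dtype=float).ravel()
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    H_X1 = H(P.sum(axis=1))
    rho = 1 - (H(P) - H_X1) / H_X1     # 1 - H(X2|X1)/H(X1), with H(X2|X1) = H(X1, X2) - H(X1)
    print(rho)                          # lies in [0, 1], as part (a) asserts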
8. Two looks.
Here is a statement about pairwise independence and joint independence. Let X, Y1 ,
and Y2 be binary random variables. If I(X; Y1 ) = 0 and I(X; Y2 ) = 0, does it follow
that I(X; Y1 , Y2 ) = 0?
(a) Yes or no? Prove or provide a counterexample.
(b) If I(X; Y1 ) = 0 and I(X; Y2 ) = 0 in the above problem, does it follow that
I(Y1 ; Y2 ) = 0?
9. Markov's inequality for probabilities.
Let p(x) be a probability mass function. Prove, for all d ≥ 0,

P(p(X) ≤ d) · log(1/d) ≤ H(X).
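A quick numerical check of the statement for one arbitrary pmf and one arbitrary threshold d (an illustration, not a proof):

    import numpy as np

    # One-shot check of P(p(X) <= d) * log(1/d) <= H(X); pmf and d are arbitrary.
    p = np.array([0.5, 0.25, 0.125, 0.0625, 0.0625])
    d = 0.1
    H = -np.sum(p * np.log2(p))
    lhs = p[p <= d].sum() * np.log2(1 / d)
    print(lhs, H, lhs <= H)             # the last value should be True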
10. Smallest Typical Set.
We have a memoryless source U, i.e., U1, U2, ... are i.i.d. ∼ U, where U takes values in the finite alphabet 𝒰. Let u^n denote the n-tuple (u1, u2, ..., un) and p(u^n) be its probability, i.e.,

p(u^n) = ∏_{i=1}^n P_U(u_i).

Let ε > 0 and for every n let B^(n) ⊆ 𝒰^n be an arbitrary set of source sequences satisfying |B^(n)| ≤ 2^{n(H(U) − ε)}. Prove that:

lim_{n→∞} P(U^n ∈ B^(n)) = 0.
In words, the typical set A_ε^(n) (defined in class) is essentially the smallest (on an exponential scale) among the sets that have non-negligible probability.
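The claim can also be seen empirically. The sketch below takes a Bernoulli(q) source and, for each n, lets B^(n) be the 2^{n(H(U) − ε)} most probable sequences, which is the best possible choice of B^(n); even so, its total probability shrinks as n grows. The values q = 0.8 and ε = 0.1 are arbitrary.

    import math

    # Empirical illustration for a Bernoulli(q) source: even the most probable
    # 2^{n(H - eps)} sequences capture vanishing probability as n grows.
    q, eps = 0.8, 0.1                        # P(U = 1) = q; example values
    H = -(q * math.log2(q) + (1 - q) * math.log2(1 - q))

    def best_set_probability(n):
        budget = 2.0 ** (n * (H - eps))      # |B^(n)| <= 2^{n(H - eps)}
        prob = 0.0
        for k in range(n, -1, -1):           # k ones: most probable sequences first
            count = math.comb(n, k)
            p_seq = q**k * (1 - q) ** (n - k)
            take = min(count, budget)        # allow a fractional count at the boundary
            prob += take * p_seq
            budget -= take
            if budget <= 0:
                break
        return prob

    for n in (10, 50, 100, 200, 400):
        print(n, best_set_probability(n))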