Then,
\[
I(Z^n; W) = \sum_{i=1}^n I(Z_i; W \mid Z^{i-1}) = \sum_{j=1}^n I(Z_j; W \mid Z_{j+1}^n),
\]
where the two orderings $(Z_1, \ldots, Z_n)$ and $(Z_n, \ldots, Z_1)$ of $Z^n$ have been used.
Therefore, we have
\[
\begin{aligned}
\sum_{i=1}^n I(X_{i+1}^n; Y_i \mid Y^{i-1}, U)
&\overset{(a)}{=} \sum_{i=1}^n \sum_{j=i+1}^n I(X_j; Y_i \mid Y^{i-1}, X_{j+1}^n, U) \\
&\overset{(b)}{=} \sum_{j=2}^n \sum_{i=1}^{j-1} I(X_j; Y_i \mid Y^{i-1}, X_{j+1}^n, U) \\
&\overset{(c)}{=} \sum_{j=2}^n I(X_j; Y^{j-1} \mid X_{j+1}^n, U) \\
&\overset{(d)}{=} \sum_{j=1}^n I(X_j; Y^{j-1} \mid X_{j+1}^n, U) \\
&= \sum_{i=1}^n I(Y^{i-1}; X_i \mid X_{i+1}^n, U),
\end{aligned}
\]
where (a) and (c) follow from the chain rule of mutual information, (b) is obtained by switching the order of summation, and finally (d) follows from the fact that $Y^0 = \emptyset$.
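The identity just proved (the Csiszár sum identity) can be checked numerically. Below is a minimal sketch I am adding (not part of the original solution): it draws a random joint pmf for $(X^n, Y^n, U)$ with $n = 3$ and compares the two sums. The block length, alphabet sizes, and helper names `marg` and `cond_mi` are my own illustrative choices.

```python
# Check: sum_i I(X_{i+1}^n; Y_i | Y^{i-1}, U) == sum_i I(Y^{i-1}; X_i | X_{i+1}^n, U)
import itertools
import math
import random

n = 3                       # block length
AX, AY, AU = 2, 2, 2        # binary alphabets for X_i, Y_i, U

# random joint pmf over (x^n, y^n, u)
joint = {}
for xs in itertools.product(range(AX), repeat=n):
    for ys in itertools.product(range(AY), repeat=n):
        for u in range(AU):
            joint[(xs, ys, u)] = random.random()
total = sum(joint.values())
for k in joint:
    joint[k] /= total

def marg(p, keep):
    """Marginalize a pmf over outcomes; keep maps an outcome to the retained part."""
    out = {}
    for k, v in p.items():
        out[keep(k)] = out.get(keep(k), 0.0) + v
    return out

def cond_mi(p, a_of, b_of, c_of):
    """I(A; B | C) for a finite pmf p, with A, B, C given as functions of the outcome."""
    pabc = marg(p, lambda k: (a_of(k), b_of(k), c_of(k)))
    pac = marg(p, lambda k: (a_of(k), c_of(k)))
    pbc = marg(p, lambda k: (b_of(k), c_of(k)))
    pc = marg(p, lambda k: c_of(k))
    return sum(v * math.log2(v * pc[c] / (pac[(a, c)] * pbc[(b, c)]))
               for (a, b, c), v in pabc.items())

lhs = sum(cond_mi(joint,
                  lambda k, i=i: k[0][i + 1:],         # X_{i+1}^n
                  lambda k, i=i: k[1][i],              # Y_i
                  lambda k, i=i: (k[1][:i], k[2]))     # (Y^{i-1}, U)
          for i in range(n))
rhs = sum(cond_mi(joint,
                  lambda k, i=i: k[1][:i],             # Y^{i-1}
                  lambda k, i=i: k[0][i],              # X_i
                  lambda k, i=i: (k[0][i + 1:], k[2])) # (X_{i+1}^n, U)
          for i in range(n))
print(lhs, rhs)   # the two sums agree up to floating-point error
```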
.. Prove the properties of jointly typical sequences with the $\delta(\epsilon)$ constants explicitly specified.
Solution:
(a) If $(x^n, y^n) \in T_\epsilon^{(n)}(X, Y)$, then for all $(x, y) \in \mathcal{X} \times \mathcal{Y}$,
\[
|\pi(x, y \mid x^n, y^n) - p(x, y)| \le \epsilon\, p(x, y),
\]
or equivalently
\[
(1 - \epsilon)\, p(x, y) \le \pi(x, y \mid x^n, y^n) \le (1 + \epsilon)\, p(x, y),
\]
and therefore
\[
\log p(y^n \mid x^n) = \sum_{i=1}^n \log p(y_i \mid x_i) = n \sum_{(x,y)} \pi(x, y \mid x^n, y^n) \log p(y \mid x).
\]
Therefore,
\[
-\log p(y^n \mid x^n) = n\bigl(H(Y \mid X) \pm \delta(\epsilon)\bigr),
\]
where $\delta(\epsilon) = \epsilon H(Y \mid X)$; that is, $2^{-n(H(Y|X) + \delta(\epsilon))} \le p(y^n \mid x^n) \le 2^{-n(H(Y|X) - \delta(\epsilon))}$.
(c) First, we establish the upper bound on $|T_\epsilon^{(n)}(Y \mid x^n)|$. Since
\[
\sum_{y^n:\,(x^n, y^n) \in T_\epsilon^{(n)}} 2^{-n(H(Y|X) + \delta(\epsilon))} \le \sum_{y^n:\,(x^n, y^n) \in T_\epsilon^{(n)}} p(y^n \mid x^n) \le 1,
\]
we have $|T_\epsilon^{(n)}(Y \mid x^n)| \le 2^{n(H(Y|X) + \delta(\epsilon))}$.
For the lower bound, fix some $(x^n, y^n) \in T_\epsilon^{(n)}$ and let $k(x, y) = n\,\pi(x, y \mid x^n, y^n)$ denote the number of positions $i$ with $(x_i, y_i) = (x, y)$. Counting the sequences with the same conditional type as $y^n$ given $x^n$,
\[
\frac{1}{n} \log |T_\epsilon^{(n)}(Y \mid x^n)| \ge \frac{1 - \epsilon_n}{n} \sum_{x \in \mathcal{X}} \Bigl[\Bigl(\sum_{y \in \mathcal{Y}} k(x, y)\Bigr) \log \Bigl(\sum_{y \in \mathcal{Y}} k(x, y)\Bigr) - \sum_{y \in \mathcal{Y}} k(x, y) \log k(x, y)\Bigr]
= (1 - \epsilon_n) \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} \frac{k(x, y)}{n} \log \frac{\sum_{y' \in \mathcal{Y}} k(x, y')}{k(x, y)},
\]
where the inequality comes from Stirling's formula, for some $\epsilon_n$ such that $\epsilon_n \to 0$ as $n \to \infty$. Note that
\[
p(x, y) - \epsilon\, p(x, y) \le \frac{1}{n} k(x, y) \le p(x, y) + \epsilon\, p(x, y).
\]
Therefore,
\[
\frac{1}{n} \log |T_\epsilon^{(n)}(Y \mid x^n)| \ge (1 - \epsilon_n) \sum_{p(x,y) > 0} \bigl(p(x, y) - \epsilon p(x, y)\bigr) \log \frac{\sum_{y' \in \mathcal{Y}} \bigl(p(x, y') - \epsilon p(x, y')\bigr)}{p(x, y) + \epsilon p(x, y)}
\]
\[
= (1 - \epsilon_n) \sum_{p(x,y) > 0} \bigl(p(x, y) - \epsilon p(x, y)\bigr) \log \frac{p(x) - \epsilon p(x)}{p(x, y) + \epsilon p(x, y)}
\]
\[
= (1 - \epsilon_n) \Bigl[(1 - \epsilon) \sum_{p(x,y) > 0} p(x, y) \log \frac{p(x)}{p(x, y)} + (1 - \epsilon) \log \frac{1 - \epsilon}{1 + \epsilon}\Bigr]
\to H(Y \mid X) \quad \text{as } \epsilon \to 0 \text{ and } n \to \infty.
\]
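As a quick numerical illustration of part (c), here is a brute-force sketch I am adding (parameters are my own ad hoc choices): it enumerates $T_\epsilon^{(n)}(Y \mid x^n)$ for a small binary example and compares $(1/n)\log$ of its size with $H(Y\mid X)$. For such a small $n$ the two numbers are still far apart; the bounds are asymptotic.

```python
import itertools
import math

n, eps = 12, 0.2
p_y_given_x = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}
p_xy = {(x, y): 0.5 * p_y_given_x[x][y] for x in (0, 1) for y in (0, 1)}
x_seq = tuple([0] * (n // 2) + [1] * (n // 2))   # a typical x^n under Bern(1/2)

def jointly_typical(xs, ys):
    # |pi(x,y|x^n,y^n) - p(x,y)| <= eps * p(x,y) for all (x,y)
    for (x, y), p in p_xy.items():
        count = sum(1 for a, b in zip(xs, ys) if (a, b) == (x, y))
        if abs(count / n - p) > eps * p:
            return False
    return True

size = sum(jointly_typical(x_seq, ys)
           for ys in itertools.product((0, 1), repeat=n))
h_y_given_x = -sum(p * math.log2(p_y_given_x[x][y]) for (x, y), p in p_xy.items())
print("(1/n) log |T(Y|x^n)| =", math.log2(size) / n if size else float("-inf"))
print("H(Y|X)               =", h_y_given_x)
```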
(d) Consider
\[
\begin{aligned}
\mathrm{P}\{(\tilde X^n, \tilde Y^n) \in T_\epsilon^{(n)}(X, Y)\}
&= \sum_{(x^n, y^n) \in T_\epsilon^{(n)}} \prod_{i=1}^n p_X(x_i)\, p_Y(y_i) \\
&\le \sum_{(x^n, y^n) \in T_\epsilon^{(n)}} 2^{-n(H(X) - \delta(\epsilon))}\, 2^{-n(H(Y) - \delta(\epsilon))} \\
&\le 2^{n(H(X,Y) + \delta(\epsilon))}\, 2^{-n(H(X) - \delta(\epsilon))}\, 2^{-n(H(Y) - \delta(\epsilon))}
= 2^{-n(I(X;Y) - 3\delta(\epsilon))}.
\end{aligned}
\]
Solution:
(a) i. Since $(x^n, y^n) \in T_\epsilon^{(n)}(X, Y)$, we have $|\pi(x, y \mid x^n, y^n) - p(x, y)| \le \epsilon\, p(x, y)$ for all $(x, y) \in \mathcal{X} \times \mathcal{Y}$. Hence for every $x \in \mathcal{X}$,
\[
|\pi(x \mid x^n) - p(x)| \le \sum_{y \in \mathcal{Y}} |\pi(x, y \mid x^n, y^n) - p(x, y)| \le \sum_{y \in \mathcal{Y}} \epsilon\, p(x, y) = \epsilon\, p(x),
\]
that is, $x^n \in T_\epsilon^{(n)}(X)$. Similarly, we can prove $y^n \in T_\epsilon^{(n)}(Y)$.
ii. Since $x^n \in T_\epsilon^{(n)}(X)$, we have $|\pi(x \mid x^n) - p(x)| \le \epsilon\, p(x)$ for all $x \in \mathcal{X}$. By setting $g(x) = -\log p_X(x)$ in the typical average lemma, we have
\[
2^{-n(1+\epsilon)H(X)} \le p(x^n) \le 2^{-n(1-\epsilon)H(X)}.
\]
Likewise, $(1 - \epsilon) p(x, y) \le \pi(x, y \mid x^n, y^n) \le (1 + \epsilon) p(x, y)$ for all $(x, y) \in \mathcal{X} \times \mathcal{Y}$, so that
\[
1 = \sum_{(x^n, y^n) \in \mathcal{X}^n \times \mathcal{Y}^n} p(x^n, y^n) \ge \sum_{(x^n, y^n) \in T_\epsilon^{(n)}(X,Y)} p(x^n, y^n) \ge |T_\epsilon^{(n)}(X, Y)|\, 2^{-n(1+\epsilon)H(X,Y)},
\]
and similarly
\[
1 = \sum_{y^n \in \mathcal{Y}^n} p(y^n \mid x^n) \ge \sum_{y^n \in T_\epsilon^{(n)}(Y \mid x^n)} p(y^n \mid x^n) \ge |T_\epsilon^{(n)}(Y \mid x^n)|\, 2^{-n(H(Y|X) + \epsilon H(Y|X))},
\]
so that $|T_\epsilon^{(n)}(Y \mid x^n)| \le 2^{n(H(Y|X) + \epsilon H(Y|X))}$.
Let $x^n \in T_{\epsilon'}^{(n)}(X)$ and let $Y_i \sim p_{Y|X}(y_i \mid x_i)$, $i \in [1:n]$. By the law of large numbers and $Y^n \sim p(y^n \mid x^n) = \prod_{i=1}^n p_{Y|X}(y_i \mid x_i)$, for each $x$ the empirical pmf of the $Y_i$ in the positions $i$ where $x_i = x$ converges to $p(y \mid x)$. It means that for $n$ large enough, with high probability,
\[
\Bigl|\frac{\pi(x, y \mid x^n, Y^n)}{\pi(x \mid x^n)} - p(y \mid x)\Bigr| \le \epsilon''\, p(y \mid x).
\]
Since $|\pi(x \mid x^n) - p(x)| \le \epsilon' p(x)$, it follows that with high probability
\[
(1 - \epsilon')(1 - \epsilon'')\, p(x)\, p(y \mid x) \le \pi(x, y \mid x^n, Y^n) \le (1 + \epsilon')(1 + \epsilon'')\, p(x)\, p(y \mid x),
\]
that is,
\[
(1 - \epsilon)\, p(x, y) \le \pi(x, y \mid x^n, Y^n) \le (1 + \epsilon)\, p(x, y), \quad \text{where } \epsilon = \epsilon' + \epsilon'' + \epsilon'\epsilon'' > \epsilon'.
\]
Therefore
\[
\lim_{n \to \infty} \mathrm{P}\{(x^n, Y^n) \in T_\epsilon^{(n)}(X, Y)\} = 1.
\]
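A Monte Carlo sketch of this conclusion (added here, not part of the original solution; the pmf and parameters are my own choices): fix a typical $x^n$, draw $Y^n \sim \prod_i p(y_i \mid x_i)$, and estimate the probability of joint typicality. The estimate should be close to $1$ and approaches $1$ as $n$ grows.

```python
import random

n, eps, trials = 1000, 0.2, 500
p_y_given_x = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}
p_xy = {(x, y): 0.5 * p_y_given_x[x][y] for x in (0, 1) for y in (0, 1)}
x_seq = [0] * (n // 2) + [1] * (n // 2)          # exactly typical under Bern(1/2)

def jointly_typical(xs, ys):
    for (x, y), p in p_xy.items():
        count = sum(1 for a, b in zip(xs, ys) if (a, b) == (x, y))
        if abs(count / n - p) > eps * p:
            return False
    return True

hits = 0
for _ in range(trials):
    y_seq = [0 if random.random() < p_y_given_x[x][0] else 1 for x in x_seq]
    hits += jointly_typical(x_seq, y_seq)
print("estimated P{(x^n, Y^n) jointly typical} =", hits / trials)
```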
Solution:
(a) Consider
\[
H(X \mid Z) \le H(X, Y \mid Z) = H(Y \mid Z) + H(X \mid Y, Z) \le H(Y \mid Z) + H(X \mid Y).
\]
(b) Consider
\[
h(X + Y) \ge h(X + Y \mid Y) = h(X \mid Y) = h(X).
\]
(d) Since $Y_1 \to X_1 \to X_2 \to Y_2$ form a Markov chain,
\[
\begin{aligned}
I(X_1, X_2; Y_1, Y_2) &= H(Y_1, Y_2) - H(Y_1, Y_2 \mid X_1, X_2) \\
&= H(Y_1, Y_2) - H(Y_1 \mid X_1, X_2) - H(Y_2 \mid X_1, X_2, Y_1) \\
&= H(Y_1, Y_2) - H(Y_1 \mid X_1) - H(Y_2 \mid X_2) \\
&= I(X_1; Y_1) + I(X_2; Y_2) - I(Y_1; Y_2) \\
&\le I(X_1; Y_1) + I(X_2; Y_2).
\end{aligned}
\]
(e) Since $X_1$ and $X_2$ are independent,
\[
\begin{aligned}
I(X_1, X_2; Y_1, Y_2) &= H(X_1, X_2) - H(X_1, X_2 \mid Y_1, Y_2) \\
&= H(X_1) + H(X_2) - H(X_1 \mid Y_1, Y_2) - H(X_2 \mid X_1, Y_1, Y_2) \\
&= I(X_1; Y_1, Y_2) + I(X_2; X_1, Y_1, Y_2) \\
&\ge I(X_1; Y_1) + I(X_2; Y_2).
\end{aligned}
\]
(f) Since $a, b \ne 0$,
\[
\begin{aligned}
I(aX + Y; bX) &= h(aX + Y) - h(aX + Y \mid bX) \\
&= h(aX + Y) - h(Y) \\
&= \Bigl(h(aX + Y) + \log\frac{1}{|a|}\Bigr) - \Bigl(h(Y) + \log\frac{1}{|a|}\Bigr) \\
&= h\Bigl(X + \frac{Y}{a}\Bigr) - h\Bigl(\frac{Y}{a}\Bigr) \\
&= I\Bigl(X + \frac{Y}{a};\, X\Bigr).
\end{aligned}
\]
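As a quick sanity check of the discrete inequality in part (a), here is a small sketch I am adding (random pmfs, helper names `rand_pmf` and `cond_entropy` are mine):

```python
# Verify H(X|Z) <= H(Y|Z) + H(X|Y) on randomly generated joint pmfs of (X, Y, Z).
import itertools
import math
import random

def rand_pmf(shape):
    p = {k: random.random() for k in itertools.product(*(range(s) for s in shape))}
    z = sum(p.values())
    return {k: v / z for k, v in p.items()}

def cond_entropy(p, a_idx, b_idx):
    """H(A|B) for a pmf over tuples; A and B are given by index lists."""
    pab, pb = {}, {}
    for k, v in p.items():
        a = tuple(k[i] for i in a_idx)
        b = tuple(k[i] for i in b_idx)
        pab[(a, b)] = pab.get((a, b), 0.0) + v
        pb[b] = pb.get(b, 0.0) + v
    return -sum(v * math.log2(v / pb[b]) for (a, b), v in pab.items())

for _ in range(1000):
    p = rand_pmf((3, 3, 3))          # joint pmf of (X, Y, Z)
    hx_z = cond_entropy(p, [0], [2])
    hy_z = cond_entropy(p, [1], [2])
    hx_y = cond_entropy(p, [0], [1])
    assert hx_z <= hy_z + hx_y + 1e-9
print("H(X|Z) <= H(Y|Z) + H(X|Y) held on all random examples")
```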
.. Mrs. Gerber's Lemma. Let $H^{-1}\colon [0, 1] \to [0, 1/2]$ be the inverse of the binary entropy function, and let $a * p = a(1-p) + (1-a)p$ denote binary convolution. Let $Z \sim \mathrm{Bern}(p)$ be independent of the binary pair $(X, U)$ with $Y = X \oplus Z$, and for part (c) let $Z^n$ be an i.i.d. $\mathrm{Bern}(p)$ sequence independent of $(X^n, U)$ with $Y^n = X^n \oplus Z^n$.
(a) Show that $H(H^{-1}(u) * p)$ is convex in $u$ for every $p \in [0, 1]$.
(b) Use part (a) to prove the scalar MGL
\[
H^{-1}(H(Y \mid U)) \ge H^{-1}(H(X \mid U)) * p.
\]
(c) Use part (b) and induction to prove the vector MGL
\[
H^{-1}\Bigl(\frac{H(Y^n \mid U)}{n}\Bigr) \ge H^{-1}\Bigl(\frac{H(X^n \mid U)}{n}\Bigr) * p.
\]
Solution:
(b) We have the following chain of inequalities:
\[
\begin{aligned}
H(Y \mid U) &= H(X \oplus Z \mid U) \\
&= \mathrm{E}_U\bigl[H(X \oplus Z \mid U = u)\bigr] \\
&= \mathrm{E}_U\bigl[H\bigl(H^{-1}(H(X \mid U = u)) * p\bigr)\bigr] \\
&\ge H\bigl(H^{-1}(\mathrm{E}_U[H(X \mid U = u)]) * p\bigr) \\
&= H\bigl(H^{-1}(H(X \mid U)) * p\bigr),
\end{aligned}
\]
where the second equality follows from the definition of conditional entropy, the third equality follows from the fact that $Z \sim \mathrm{Bern}(p)$ is independent of $(X, U)$, and the inequality is obtained from the convexity of $H(H^{-1}(u) * p)$ in $u$ (part (a)) together with Jensen's inequality. Since $H^{-1}\colon [0, 1] \to [0, 1/2]$ is an increasing function, applying $H^{-1}$ to both sides of the above inequality gives
\[
H^{-1}(H(Y \mid U)) \ge H^{-1}(H(X \mid U)) * p.
\]
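Before turning to the vector case, here is a numerical spot check I am adding (not part of the original solution; `hb`, `hb_inv`, `conv` and the parameter $p = 0.11$ are my own choices): it tests the convexity of $u \mapsto H(H^{-1}(u) * p)$ on a grid and the scalar MGL on a random binary pair $(X, U)$.

```python
import math
import random

def hb(q):
    return 0.0 if q in (0.0, 1.0) else -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def hb_inv(u):                      # inverse of hb on [0, 1/2], by bisection
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if hb(mid) < u else (lo, mid)
    return (lo + hi) / 2

def conv(a, b):                     # binary convolution a * b
    return a * (1 - b) + (1 - a) * b

p = 0.11
f = lambda u: hb(conv(hb_inv(u), p))
vals = [f(i / 200) for i in range(201)]
# midpoint convexity on the grid: f(u_i) <= (f(u_{i-1}) + f(u_{i+1})) / 2
assert all(vals[i] <= (vals[i - 1] + vals[i + 1]) / 2 + 1e-9 for i in range(1, 200))

# scalar MGL: H^{-1}(H(Y|U)) >= H^{-1}(H(X|U)) * p, with Y = X xor Z, Z ~ Bern(p)
q_u = random.uniform(0.05, 0.95)                                   # P{U = 1}
q_x = {0: random.uniform(0.0, 1.0), 1: random.uniform(0.0, 1.0)}   # P{X = 1 | U = u}
h_x_given_u = (1 - q_u) * hb(q_x[0]) + q_u * hb(q_x[1])
h_y_given_u = (1 - q_u) * hb(conv(q_x[0], p)) + q_u * hb(conv(q_x[1], p))
assert hb_inv(h_y_given_u) >= conv(hb_inv(h_x_given_u), p) - 1e-6
print("convexity and scalar MGL checks passed")
```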
(c) We use induction to prove the inequality in part (c). The base case $n = 1$ follows from part (b). Assume the inequality holds for $n - 1$. Then we have the following chain of inequalities:
\[
\begin{aligned}
\frac{H(Y^n \mid U)}{n} &= \frac{n-1}{n} \cdot \frac{H(Y^{n-1} \mid U)}{n-1} + \frac{1}{n} H(Y_n \mid Y^{n-1}, U) \\
&\ge \frac{n-1}{n} H\Bigl(H^{-1}\Bigl(\frac{H(X^{n-1} \mid U)}{n-1}\Bigr) * p\Bigr) + \frac{1}{n} H(Y_n \mid Y^{n-1}, U) \\
&\ge \frac{n-1}{n} H\Bigl(H^{-1}\Bigl(\frac{H(X^{n-1} \mid U)}{n-1}\Bigr) * p\Bigr) + \frac{1}{n} H(X_n \oplus Z_n \mid Y^{n-1}, X^{n-1}, U) \\
&= \frac{n-1}{n} H\Bigl(H^{-1}\Bigl(\frac{H(X^{n-1} \mid U)}{n-1}\Bigr) * p\Bigr) + \frac{1}{n} H(X_n \oplus Z_n \mid X^{n-1}, U) \\
&\ge \frac{n-1}{n} H\Bigl(H^{-1}\Bigl(\frac{H(X^{n-1} \mid U)}{n-1}\Bigr) * p\Bigr) + \frac{1}{n} H\bigl(H^{-1}(H(X_n \mid X^{n-1}, U)) * p\bigr) \\
&\ge H\Bigl(H^{-1}\Bigl(\frac{H(X^n \mid U)}{n}\Bigr) * p\Bigr).
\end{aligned}
\]
Here the first inequality follows from the induction hypothesis; the second inequality follows from the fact that conditioning reduces entropy; the subsequent equality follows from the fact that $X_n \oplus Z_n$ is independent of $Y^{n-1}$ conditioned on $(X^{n-1}, U)$; the third inequality is obtained by part (b); and finally the last inequality follows from the fact that $H(H^{-1}(u) * p)$ is convex in $u$, since
\[
\frac{n-1}{n} \cdot \frac{H(X^{n-1} \mid U)}{n-1} + \frac{1}{n} H(X_n \mid X^{n-1}, U) = \frac{H(X^n \mid U)}{n}.
\]
Applying $H^{-1}$ to both sides of the above inequality, we have
\[
H^{-1}\Bigl(\frac{H(Y^n \mid U)}{n}\Bigr) \ge H^{-1}\Bigl(\frac{H(X^n \mid U)}{n}\Bigr) * p.
\]
Solution:
(a) Since $\hat{X}$ is a function of $Y$,
\[
h(X \mid Y) = h(X - \hat{X} \mid Y) \le h(X - \hat{X}) \le \frac{1}{2} \log\bigl((2\pi e)^n |K|\bigr),
\]
where $K$ is the covariance matrix of $X - \hat{X}$.
(b) Let $\hat{X}$ be the minimum mean square error linear estimator of $X$ given $Y$, i.e., $\hat{X} = K_{XY} K_Y^{-1} Y$ (assuming zero means for both $X$ and $Y$). Then
\[
h(X \mid Y) \le \frac{1}{2} \log\bigl((2\pi e)^n |K|\bigr) = \frac{1}{2} \log\bigl((2\pi e)^n |K_X - K_{XY} K_Y^{-1} K_{YX}|\bigr).
\]
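For jointly Gaussian $(X, Y)$ the bound in part (b) holds with equality, since then $h(X \mid Y) = h(X, Y) - h(Y)$ and the Schur complement $K_X - K_{XY} K_Y^{-1} K_{YX}$ is the conditional covariance. The following short sketch I am adding (random covariance, helper name `gauss_h` is mine) checks this numerically:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2                                   # dimensions of X and Y
A = rng.standard_normal((n + m, n + m))
K = A @ A.T + (n + m) * np.eye(n + m)         # a random positive definite covariance
KX, KY = K[:n, :n], K[n:, n:]
KXY = K[:n, n:]

def gauss_h(cov):                             # differential entropy of N(0, cov), in nats
    d = cov.shape[0]
    return 0.5 * np.log((2 * np.pi * np.e) ** d * np.linalg.det(cov))

h_x_given_y = gauss_h(K) - gauss_h(KY)                 # h(X, Y) - h(Y)
schur = KX - KXY @ np.linalg.inv(KY) @ KXY.T
print(h_x_given_y, gauss_h(schur))            # the two values agree
```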
(b) Using part (a) and the nonnegativity of the relative entropy, conclude that $h(X) \le \frac{1}{2}\log(2\pi e \sigma^2)$.
Solution:
(a) Let $X^* \sim \mathrm{N}(0, \sigma^2)$. Since $f_{X^*}(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\bigl(-\frac{x^2}{2\sigma^2}\bigr)$, we have
\[
-\int f_X(x) \log f_{X^*}(x)\, dx = \int f_X(x) \Bigl(\frac{1}{2}\log(2\pi\sigma^2) + \frac{x^2}{2\sigma^2}\log e\Bigr)\, dx
= -\int f_{X^*}(x) \log f_{X^*}(x)\, dx = h(X^*),
\]
where the second equality follows since $\int x^2 f_X(x)\, dx = \sigma^2 = \int x^2 f_{X^*}(x)\, dx$.
(b) By the nonnegativity of relative entropy,
\[
0 \le D(f_X \,\|\, f_{X^*}) = \int f_X(x) \log \frac{f_X(x)}{f_{X^*}(x)}\, dx = -h(X) - \int f_X(x) \log f_{X^*}(x)\, dx = h(X^*) - h(X),
\]
and therefore $h(X) \le h(X^*)$.
\[
h(X \mid Y) = h\bigl(X - \mathrm{E}(X \mid Y) \mid Y\bigr) \le h\bigl(X - \mathrm{E}(X \mid Y)\bigr) \le \frac{1}{2} \log\bigl((2\pi e)^n |K_{X|Y}|\bigr),
\]
with equality if $(X, Y)$ are jointly Gaussian, where the last inequality follows from the previous problem.
(b) Let $\hat{X} = aY$ be the linear MMSE estimator of $X$ given $Y$. Then
\[
\det(K) \le \prod_{i=1}^n K_{ii}.
\]
\[
2^{2 h(Y^n \mid U)/n} \ge 2^{2 h(X^n \mid U)/n} + 2^{2 h(Z^n \mid U)/n}.
\]
Solution:
(a) The Hessian matrix of $f(u, v) = \log(2^u + 2^v)$ is
\[
\nabla^2 f(u, v) =
\begin{pmatrix}
\dfrac{\partial^2 f}{\partial u^2} & \dfrac{\partial^2 f}{\partial u \partial v} \\[4pt]
\dfrac{\partial^2 f}{\partial v \partial u} & \dfrac{\partial^2 f}{\partial v^2}
\end{pmatrix}
= \frac{(\ln 2)\, 2^{u+v}}{(2^u + 2^v)^2}
\begin{pmatrix}
1 & -1 \\
-1 & 1
\end{pmatrix},
\]
which is positive semidefinite. Hence $f(u, v)$ is convex.
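A quick numerical check of this convexity, which I am adding here as a sketch (the sampling range is arbitrary):

```python
# Test midpoint convexity of f(u, v) = log2(2^u + 2^v) at random pairs of points
# (the Hessian above being PSD is equivalent to f being convex).
import math
import random

f = lambda u, v: math.log2(2 ** u + 2 ** v)

for _ in range(10000):
    u1, v1 = random.uniform(-10, 10), random.uniform(-10, 10)
    u2, v2 = random.uniform(-10, 10), random.uniform(-10, 10)
    mid = f((u1 + u2) / 2, (v1 + v2) / 2)
    assert mid <= (f(u1, v1) + f(u2, v2)) / 2 + 1e-9
print("midpoint convexity held at all sampled points")
```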
For the induction step, write
\[
\frac{2 h(Y^n)}{n} = \frac{n-1}{n} \cdot \frac{2 h(Y^{n-1})}{n-1} + \frac{1}{n}\, 2 h(Y_n \mid Y^{n-1})
\ge \frac{n-1}{n} \log\bigl(2^{2h(X^{n-1})/(n-1)} + 2^{2h(Z^{n-1})/(n-1)}\bigr) + \frac{1}{n}\, 2 h(Y_n \mid Y^{n-1}),
\]
where the inequality comes from the induction assumption. If we show that
\[
2 h(Y_n \mid Y^{n-1}) \ge \log\bigl(2^{2h(X_n \mid X^{n-1})} + 2^{2h(Z_n \mid Z^{n-1})}\bigr), \quad (\ast)
\]
then
\[
\frac{2 h(Y^n)}{n} \ge \frac{n-1}{n} f\Bigl(\frac{2h(X^{n-1})}{n-1}, \frac{2h(Z^{n-1})}{n-1}\Bigr) + \frac{1}{n} f\bigl(2h(X_n \mid X^{n-1}),\, 2h(Z_n \mid Z^{n-1})\bigr)
\ge \log\bigl(2^{2h(X^n)/n} + 2^{2h(Z^n)/n}\bigr),
\]
where the last inequality comes from the convexity of $f(u, v)$. Now it remains to show inequality $(\ast)$.
Now assume that for $n - 1$, equality holds in all of the above inequalities iff $X^{n-1}$ and $Z^{n-1}$ are Gaussian with $K_{X^{n-1}} = a K_{Z^{n-1}}$. We show that the same is true for $n$. First, we find the equality condition for each of the above inequalities. In particular, equality requires
\[
\mathrm{E}(Y_n \mid X^{n-1} + Z^{n-1}) = \mathrm{E}(X_n + Z_n \mid X^{n-1} + Z^{n-1}) = \mathrm{E}(X_n \mid X^{n-1}) + \mathrm{E}(Z_n \mid Z^{n-1}).
\]
Note that if $X^n$ and $Z^n$ are Gaussian with $K_{X^n} = a K_{Z^n}$, then all the conditions are satisfied.
To prove the necessity, we can see from the equality conditions in the above steps that $X^n$ and $Z^n$ must be Gaussian. To show $K_{X^n} = a K_{Z^n}$, write
\[
K_{X^n} = \begin{pmatrix} A_{(n-1)\times(n-1)} & B_{(n-1)\times 1} \\ B^{\mathsf T}_{1\times(n-1)} & C_{1\times 1} \end{pmatrix}, \qquad
K_{Z^n} = \begin{pmatrix} A'_{(n-1)\times(n-1)} & B'_{(n-1)\times 1} \\ B'^{\mathsf T}_{1\times(n-1)} & C'_{1\times 1} \end{pmatrix}.
\]
Then
\[
\mathrm{E}(X_n \mid X^{n-1}) = B^{\mathsf T} A^{-1} X^{n-1}, \qquad
\mathrm{E}(Z_n \mid Z^{n-1}) = B'^{\mathsf T} (A')^{-1} Z^{n-1},
\]
and from the equality constraint above and the fact that $A = a A'$ (the induction hypothesis), we can conclude that $B = a B'$. Lastly, from the equality condition we conclude that $C = a C'$, and hence $K_{X^n} = a K_{Z^n}$.
\[
H(X) = \lim_{n \to \infty} \frac{H(X^n)}{n}
\]
is well-defined.
(c) Show that for a continuous stationary ergodic process $Y = \{Y_i\}$,
\[
\frac{h(Y^n)}{n} \le \frac{h(Y^{n-1})}{n-1} \quad \text{for } n = 2, 3, \ldots.
\]
Solution:
(a) We first show that $H(X_n \mid X^{n-1})$ is nonincreasing in $n$. By stationarity, we have
\[
H(X_{n+1} \mid X^n) \le H(X_{n+1} \mid X_2^n) = H(X_n \mid X^{n-1}),
\]
where the inequality holds since conditioning reduces entropy. Now we have
\[
\begin{aligned}
\frac{H(X^n)}{n} &= \frac{1}{n} \sum_{i=1}^{n-1} H(X_i \mid X^{i-1}) + \frac{1}{n} H(X_n \mid X^{n-1}) \\
&\le \frac{1}{n} \sum_{i=1}^{n-1} H(X_i \mid X^{i-1}) + \frac{1}{n} \cdot \frac{1}{n-1} \sum_{i=1}^{n-1} H(X_i \mid X^{i-1}) \\
&= \frac{1}{n-1} \sum_{i=1}^{n-1} H(X_i \mid X^{i-1}) \\
&= \frac{H(X^{n-1})}{n-1},
\end{aligned}
\]
where the inequality follows since $H(X_n \mid X^{n-1}) \le H(X_i \mid X^{i-1})$ for every $i \le n - 1$.
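This monotonicity is easy to see numerically. The sketch below (mine, assuming a two-state stationary Markov chain with transition probabilities $a$ and $b$ of my choosing) uses the closed form $H(X^n) = H(X_1) + (n-1) H(X_2 \mid X_1)$:

```python
import math

a, b = 0.2, 0.4                      # P{X_{i+1}=1 | X_i=0}, P{X_{i+1}=0 | X_i=1}
pi1 = a / (a + b)                    # stationary P{X_i = 1}

def hb(q):
    return 0.0 if q in (0.0, 1.0) else -q * math.log2(q) - (1 - q) * math.log2(1 - q)

h1 = hb(pi1)                               # H(X_1)
h_cond = (1 - pi1) * hb(a) + pi1 * hb(b)   # H(X_2 | X_1)
rates = [(h1 + (n - 1) * h_cond) / n for n in range(1, 11)]
print(rates)                               # nonincreasing, converging to H(X_2|X_1)
```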
with equality if Z is Gaussian. Thus, Gaussian noise is the worst noise if the input
to the channel is Gaussian.
Solution: Since the (nonlinear) MMSE is upper bounded by the linear MMSE,
\[
\begin{aligned}
I(X; X + Z) &= h(X) - h(X \mid X + Z) \\
&\ge \frac{1}{2}\log(2\pi e P) - \frac{1}{2}\log\Bigl(2\pi e \frac{PQ}{P + Q}\Bigr) \\
&= h(X) - h(X \mid X + Z^*) \\
&= I(X; X + Z^*).
\end{aligned}
\]
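The middle step can be verified in closed form. A quick check I am adding (with arbitrary $P$ and $Q$): for $X \sim \mathrm{N}(0, P)$ and $Z^* \sim \mathrm{N}(0, Q)$ independent, $h(X) - h(X \mid X + Z^*) = \frac{1}{2}\log(1 + P/Q)$.

```python
import math

P, Q = 3.0, 1.5
h_x = 0.5 * math.log(2 * math.pi * math.e * P)
var_x_given_y = P - P * P / (P + Q)          # = PQ/(P+Q), the linear MMSE
h_x_given_y = 0.5 * math.log(2 * math.pi * math.e * var_x_given_y)
print(h_x - h_x_given_y, 0.5 * math.log(1 + P / Q))   # equal
```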
.. Variations on the joint typicality lemma. Let $(X, Y, Z) \sim p(x, y, z)$ and $0 < \epsilon' < \epsilon$. Prove the following statements.
(a) Let $(X^n, Y^n) \sim \prod_{i=1}^n p_{X,Y}(x_i, y_i)$ and $Z^n \sim \prod_{i=1}^n p_{Z|X}(z_i \mid x_i)$, conditionally independent of $Y^n$ given $X^n$. Then
\[
\mathrm{P}\{(X^n, Y^n, Z^n) \in T_\epsilon^{(n)}(X, Y, Z)\} \le 2^{-n(I(Y;Z|X) - \delta(\epsilon))}.
\]
(b) Let $(x^n, y^n) \in T_{\epsilon'}^{(n)}(X, Y)$ and let $Z^n$ be distributed according to
\[
p(z^n \mid x^n) =
\begin{cases}
\dfrac{\prod_{i=1}^n p_{Z|X}(z_i \mid x_i)}{\mathrm{P}\{Z^n \in T_{\epsilon'}^{(n)}(Z \mid x^n)\}} & \text{if } z^n \in T_{\epsilon'}^{(n)}(Z \mid x^n), \\[6pt]
0 & \text{otherwise.}
\end{cases}
\]
Then
\[
\mathrm{P}\{(x^n, y^n, Z^n) \in T_\epsilon^{(n)}(X, Y, Z)\} \ge 2^{-n(I(Y;Z|X) + \delta(\epsilon))}.
\]
Solution:
(a)
\[
\begin{aligned}
\mathrm{P}\{(X^n, Y^n, Z^n) \in T_\epsilon^{(n)}(X, Y, Z)\}
&= \sum_{(x^n, y^n, z^n) \in T_\epsilon^{(n)}(X,Y,Z)} p(x^n, y^n)\, p(z^n \mid x^n) \\
&\le |T_\epsilon^{(n)}(X, Y, Z)|\, 2^{-n(H(X,Y) - \delta(\epsilon))}\, 2^{-n(H(Z|X) - \delta(\epsilon))} \\
&\le 2^{n(H(X,Y,Z) + \delta(\epsilon))}\, 2^{-n(H(X,Y) - \delta(\epsilon))}\, 2^{-n(H(Z|X) - \delta(\epsilon))}
= 2^{-n(I(Y;Z|X) - 3\delta(\epsilon))}.
\end{aligned}
\]
(b)
\[
\begin{aligned}
\mathrm{P}\{(x^n, y^n, Z^n) \in T_\epsilon^{(n)}(X, Y, Z)\}
&= \sum_{z^n \in T_\epsilon^{(n)}(Z \mid x^n, y^n)} p(z^n \mid x^n) \\
&\le \frac{\sum_{z^n \in T_\epsilon^{(n)}(Z \mid x^n, y^n)} \prod_{i=1}^n p_{Z|X}(z_i \mid x_i)}{\mathrm{P}\{Z^n \in T_{\epsilon'}^{(n)}(Z \mid x^n)\}} \\
&\le \frac{2^{n(H(Z|X,Y) + \delta(\epsilon))}\, 2^{-n(H(Z|X) - \delta(\epsilon))}}{1 - \epsilon_n}
= \frac{2^{-n(I(Y;Z|X) - 2\delta(\epsilon))}}{1 - \epsilon_n},
\end{aligned}
\]
where $\epsilon_n \to 0$ as $n \to \infty$ by the conditional typicality lemma. Similarly, for the lower bound,
\[
\mathrm{P}\{(x^n, y^n, Z^n) \in T_\epsilon^{(n)}(X, Y, Z)\}
\ge \sum_{z^n \in T_{\epsilon'}^{(n)}(Z \mid x^n, y^n)} \prod_{i=1}^n p_{Z|X}(z_i \mid x_i)
\ge (1 - \epsilon_n)\, 2^{n(H(Z|X,Y) - \delta(\epsilon))}\, 2^{-n(H(Z|X) + \delta(\epsilon))}
= (1 - \epsilon_n)\, 2^{-n(I(Y;Z|X) + 2\delta(\epsilon))}.
\]
(a) Show that $|A_n| \le 2^{n(H(X,Y) + H(Y,Z) + H(X,Z) + \delta(\epsilon))/2}$. (Hint: First show that $|A_n| \le 2^{n(H(X,Y) + H(Z|Y) + \delta(\epsilon))}$.)
(b) Does a corresponding lower bound hold?
Solution:
(a) Following the hint, $|A_n| \le 2^{n(H(X,Y) + H(Z|Y) + 3\delta(\epsilon))}$ and, by symmetry, $|A_n| \le 2^{n(H(X,Z) + H(Y|Z) + 3\delta(\epsilon))}$. Therefore
\[
|A_n|^2 \le 2^{n(H(X,Y) + H(Z|Y) + 3\delta(\epsilon))}\, 2^{n(H(X,Z) + H(Y|Z) + 3\delta(\epsilon))}
= 2^{n(H(X,Y) + H(X,Z) + H(Z|Y) + H(Y|Z) + 6\delta(\epsilon))},
\]
so that
\[
|A_n| \le 2^{n(H(X,Y) + H(X,Z) + H(Z|Y) + H(Y|Z) + 6\delta(\epsilon))/2}
\le 2^{n(H(X,Y) + H(X,Z) + H(Y,Z) + 6\delta(\epsilon))/2}
= 2^{n(H(X,Y) + H(X,Z) + H(Y,Z) + \delta'(\epsilon))/2},
\]
where the second inequality holds since $H(Z|Y) + H(Y|Z) \le H(Y, Z)$.
(b) No; this bound is not tight. For random variables $X, Y, Z$ satisfying $X = Y = Z$, we have $|A_n| \le 2^{n(H(X) + \delta(\epsilon))}$ from the upper bound on the size of the typical set, while the bound in part (a) reduces to $|A_n| \le 2^{n(\frac{3}{2}H(X) + \delta(\epsilon))}$. Hence a corresponding lower bound of the form $|A_n| \ge 2^{n(\frac{3}{2}H(X) - \delta(\epsilon))}$ cannot hold whenever $H(X) > 0$.
which converges to $1/2$ as $n \to \infty$. Thus, the fact that $x^n \in T_\epsilon^{(n)}(X)$ and $Y^n \sim \prod_{i=1}^n p_{Y|X}(y_i \mid x_i)$ does not necessarily imply that $(x^n, Y^n) \in T_\epsilon^{(n)}(X, Y)$ with high probability.
Remark: This problem illustrates that in general we need $\epsilon > \epsilon'$ in the conditional typicality lemma.
Solution:
(a) Since $\pi(1 \mid x^n) = \frac{k}{n} = \frac{\frac{n}{2}(1 + \epsilon)}{n} = \frac{1}{2}(1 + \epsilon) = p_X(1)(1 + \epsilon)$, the sequence $x^n$ is typical. We can also assume without loss of generality that $x_i = 1$ for $i \in [1 : k]$. Then
\[
\mathrm{P}\{(x^n, Y^n) \in T_\epsilon^{(n)}(X, Y)\}
\le \mathrm{P}\Bigl\{\sum_{i=1}^{k} Y_i \le \frac{n}{4}(1 + \epsilon)\Bigr\}
= \mathrm{P}\Bigl\{\sum_{i=1}^{k} Y_i < \frac{k+1}{2}\Bigr\},
\]
and the right-hand side converges to $1/2$ as $n \to \infty$ by the central limit theorem.
i=1 2