
ECE458

INFORMATION THEORY AND CODING


LECTURE 6
Associate Prof. Fatma Newagy
[email protected]
LAST LECTURE TOPICS
• Discrete Memoryless Channel
• Channel Models

[email protected]
2
TODAY’S TOPICS
• Joint Entropy and Conditional Entropy
• Mutual Information
• Channel Capacity

JOINT ENTROPY AND CONDITIONAL ENTROPY

Definition: The joint entropy of a pair of discrete random variables X and Y is

H(X, Y) = -\sum_x \sum_y p(x, y) \log_2 p(x, y)

Definition: The conditional entropy of Y given a random variable X (averaged over X) is

H(Y | X) = -\sum_x \sum_y p(x, y) \log_2 p(y | x)
Theorem (Chain rule):

H(X, Y) = H(X) + H(Y | X)

Corollary:

H(X, Y) = H(Y) + H(X | Y)
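As a quick illustration (not part of the lecture slides), the following Python sketch computes H(X, Y) and H(Y | X) directly from a joint pmf and checks the chain rule numerically; the toy distribution p_xy is a made-up example:

```python
# A minimal sketch (not from the lecture): joint entropy, conditional entropy,
# and a numerical check of the chain rule. The joint pmf p_xy is a made-up example.
import math

p_xy = {                      # p(x, y) for a toy pair of binary random variables
    (0, 0): 0.500, (0, 1): 0.250,
    (1, 0): 0.125, (1, 1): 0.125,
}

# Marginal p(x) = sum_y p(x, y)
p_x = {}
for (x, _), p in p_xy.items():
    p_x[x] = p_x.get(x, 0.0) + p

# Joint entropy: H(X, Y) = -sum_{x,y} p(x, y) log2 p(x, y)
H_XY = -sum(p * math.log2(p) for p in p_xy.values() if p > 0)

# Conditional entropy: H(Y|X) = -sum_{x,y} p(x, y) log2 p(y|x), with p(y|x) = p(x, y) / p(x)
H_Y_given_X = -sum(p * math.log2(p / p_x[x]) for (x, _), p in p_xy.items() if p > 0)

# Source entropy: H(X) = -sum_x p(x) log2 p(x)
H_X = -sum(p * math.log2(p) for p in p_x.values() if p > 0)

# Chain rule: H(X, Y) = H(X) + H(Y|X); the two printed values should match.
print(H_XY, H_X + H_Y_given_X)
```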
MUTUAL INFORMATION

I(X, Y) is the amount of information that the random variable X contains about another random variable Y.

[Figure: Venn diagram of H(X) and H(Y) as overlapping circles; the non-overlapping parts are H(X|Y) and H(Y|X), and the intersection is I(X, Y).]
MUTUAL INFORMATION

Source entropy:       H(X) = -\sum_x p(x) \log_2 p(x)

Receiver entropy:     H(Y) = -\sum_y p(y) \log_2 p(y)

Conditional entropy:  H(X | Y) = -\sum_{x,y} p(x, y) \log_2 p(x | y)
                      H(Y | X) = -\sum_{x,y} p(x, y) \log_2 p(y | x)

Information transfer: I(X, Y) = \sum_{x,y} p(x, y) \log_2 \frac{p(x, y)}{p(x)\, p(y)}
                              = H(X) - H(X | Y)
                              = H(Y) - H(Y | X)
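The equivalence of these expressions can be verified numerically. The sketch below (illustrative only; the joint pmf is again a made-up example) computes I(X, Y) from the definition and via both conditional-entropy forms:

```python
# Illustrative sketch: three equivalent ways to compute I(X, Y).
# The joint pmf p_xy is a made-up example, not taken from the lecture.
import math

p_xy = {(0, 0): 0.4, (0, 1): 0.1,
        (1, 0): 0.1, (1, 1): 0.4}

p_x = {x: sum(p for (xx, _), p in p_xy.items() if xx == x) for x in (0, 1)}
p_y = {y: sum(p for (_, yy), p in p_xy.items() if yy == y) for y in (0, 1)}

def entropy(dist):
    """Entropy in bits of a pmf given as a dict of probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Direct definition: I(X, Y) = sum_{x,y} p(x, y) log2[ p(x, y) / (p(x) p(y)) ]
I_direct = sum(p * math.log2(p / (p_x[x] * p_y[y]))
               for (x, y), p in p_xy.items() if p > 0)

# Via the conditional entropies
H_X_given_Y = -sum(p * math.log2(p / p_y[y]) for (_, y), p in p_xy.items() if p > 0)
H_Y_given_X = -sum(p * math.log2(p / p_x[x]) for (x, _), p in p_xy.items() if p > 0)

# All three printed values should agree.
print(I_direct, entropy(p_x) - H_X_given_Y, entropy(p_y) - H_Y_given_X)
```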
[email protected]
8
H(Y|X)
SUMMARY
Quantity                      Definition

Source information            I(X_i) = -\log_2 P(X_i)

Received information          I(Y_j) = -\log_2 P(Y_j)

Mutual information            I(X_i, Y_j) = \log_2 \frac{P(X_i | Y_j)}{P(X_i)}

Average mutual information    I(X, Y) = \sum_X \sum_Y P(X_i, Y_j) \log_2 \frac{P(X_i | Y_j)}{P(X_i)}
                                       = \sum_X \sum_Y P(X_i, Y_j) \log_2 \frac{P(Y_j | X_i)}{P(Y_j)}
                                       = \sum_X \sum_Y P(X_i, Y_j) \log_2 \frac{P(X_i, Y_j)}{P(X_i) P(Y_j)}

Source entropy                H(X) = -\sum_X P(X_i) \log_2 P(X_i)

Destination entropy           H(Y) = -\sum_Y P(Y_j) \log_2 P(Y_j)

Conditional entropy           H(X | Y) = -\sum_X \sum_Y P(X_i, Y_j) \log_2 P(X_i | Y_j)

Conditional entropy           H(Y | X) = -\sum_X \sum_Y P(X_i, Y_j) \log_2 P(Y_j | X_i)
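As an illustrative aside (not part of the slides), the quantities in this table map directly onto short Python functions; distributions are represented here as plain dictionaries of probabilities:

```python
# Illustrative "summary in code" (not from the lecture): the quantities in the
# table above as small Python functions. Distributions are plain dictionaries:
# p_x[x] = P(x), p_xy[(x, y)] = P(x, y), p_x_given_y[(x, y)] = P(x | y).
import math

def self_information(p):
    """I(X_i) = -log2 P(X_i): information carried by a single outcome."""
    return -math.log2(p)

def entropy(p_x):
    """H(X) = -sum_x P(x) log2 P(x)."""
    return -sum(p * math.log2(p) for p in p_x.values() if p > 0)

def conditional_entropy(p_xy, p_x_given_y):
    """H(X | Y) = -sum_{x,y} P(x, y) log2 P(x | y)."""
    return -sum(p * math.log2(p_x_given_y[xy]) for xy, p in p_xy.items() if p > 0)

def average_mutual_information(p_xy, p_x, p_y):
    """I(X, Y) = sum_{x,y} P(x, y) log2 [ P(x, y) / (P(x) P(y)) ]."""
    return sum(p * math.log2(p / (p_x[x] * p_y[y]))
               for (x, y), p in p_xy.items() if p > 0)
```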
CHANNEL CAPACITY
• Channel capacity is the maximum mutual information between the channel input and output, where the maximum is taken over all possible input distributions:

C = \max_{P(X)} I(X, Y)

• That is, it is the maximum achievable information transfer over the channel.
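For a channel with a known transition matrix, this maximization can be approximated numerically. The sketch below (illustrative, not from the lecture) grid-searches over the input distribution of a binary-input channel; the transition probabilities are arbitrary example values:

```python
# Illustrative sketch: approximate C = max over P(X) of I(X, Y) by a grid search
# over the input prior. The transition matrix below is an arbitrary example
# (a binary asymmetric channel), not a channel from the lecture.
import math

# p_y_given_x[x][y] = P(Y = y | X = x)
p_y_given_x = [[0.9, 0.1],   # when X = 0
               [0.2, 0.8]]   # when X = 1

def mutual_information(p0):
    """I(X, Y) in bits when P(X = 0) = p0 for the channel above."""
    p_x = [p0, 1.0 - p0]
    p_y = [sum(p_x[x] * p_y_given_x[x][y] for x in range(2)) for y in range(2)]
    total = 0.0
    for x in range(2):
        for y in range(2):
            p_joint = p_x[x] * p_y_given_x[x][y]
            if p_joint > 0:
                total += p_joint * math.log2(p_y_given_x[x][y] / p_y[y])
    return total

# Grid search over P(X = 0); the best value approximates the capacity C.
best_p0, C = max(((k / 1000, mutual_information(k / 1000)) for k in range(1, 1000)),
                 key=lambda t: t[1])
print(f"C is approximately {C:.4f} bits/use, achieved near P(X=0) = {best_p0:.3f}")
```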
BINARY SYMMETRIC CHANNEL CAPACITY
P(Y  0 | X  0)
I ( X ; Y )  P( X  0) P(Y  0 | X  0) log  0 1-p 0
P(Y  0)
P(Y  1 | X  0)
P( X  0) P(Y  1 | X  0) log  p

[email protected]
P(Y  1)
Input Output
P(Y  0 | X  1)
P( X  1) P(Y  0 | X  1) log  p
P(Y  0)
P(Y  1 | X  1) 1 1
P( X  1) P(Y  1 | X  1) log
P(Y  1)
1-p
1 p
 P( X  0)(1  p ) log  1  p p 
(1  p ) P( X  0)  pP( X  1) P 
P( X  0) p log
p
  p 1  p 
pP( X  0)  (1  p ) P( X  1)
p
P( X  1) p log 
(1  p ) P( X  0)  pP( X  1)
1 p
P( X  1)(1  p ) log 11
pP( X  0)  (1  p ) P( X  1)
BINARY SYMMETRIC CHANNEL CAPACITY
Channel capacity is the maximum information transfer. For the BSC, I(X; Y) is maximized by the uniform input distribution:

\max I(X; Y) \text{ is achieved at } P(X=1) = P(X=0) = \tfrac{1}{2}

Substituting P(X=0) = P(X=1) = 1/2 (so that P(Y=0) = P(Y=1) = 1/2) into the expression for I(X; Y) above:

C = \max I(X; Y)
  = \tfrac{1}{2}(1-p) \log_2 \frac{1-p}{1/2} + \tfrac{1}{2} p \log_2 \frac{p}{1/2}
    + \tfrac{1}{2} p \log_2 \frac{p}{1/2} + \tfrac{1}{2}(1-p) \log_2 \frac{1-p}{1/2}
  = (1-p) \log_2 2(1-p) + p \log_2 2p
  = (1-p)[\log_2 2 + \log_2 (1-p)] + p[\log_2 2 + \log_2 p]
  = (1-p+p) + (1-p) \log_2 (1-p) + p \log_2 p
  = 1 - H(p)

where H(p) = -p \log_2 p - (1-p) \log_2 (1-p) is the binary entropy function.
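As a quick numerical illustration (not part of the slides), the closed-form result C = 1 - H(p) can be evaluated for a few crossover probabilities:

```python
# Illustrative sketch (not from the slides): evaluate C(p) = 1 - H(p) for a BSC.
import math

def binary_entropy(p):
    """H(p) = -p log2 p - (1 - p) log2 (1 - p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)

for p in (0.0, 0.1, 0.25, 0.5):
    print(f"p = {p:<4}  C = {bsc_capacity(p):.4f} bits/use")
# C = 1 bit at p = 0 (noiseless channel) and C = 0 at p = 0.5 (useless channel).
```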
EXAMPLE
The input source to a noisy communication channel is a random variable X over the four symbols a, b, c, d. The output from this channel is a random variable Y over these same four symbols. The joint distribution of these two random variables is as follows:

[Table: joint distribution P(X_i, Y_j) over the 16 symbol pairs.]
EXAMPLE
(a) Write down the marginal (individual) distribution for X and compute the marginal entropy H(X) in bits.

(b) Write down the marginal distribution for Y and compute the marginal entropy H(Y) in bits.

(c) What is the joint entropy H(X, Y) of the two random variables in bits?

(d) What is the conditional entropy H(Y | X) in bits?

(e) What is the mutual information I(X; Y) between the two random variables in bits?

(f) Provide a lower-bound estimate of the channel capacity C for this channel in bits.
EXAMPLE
(a) Using total probability: the marginal (individual) distribution for X is (1/4, 1/4, 1/4, 1/4).

What we have in the table are the joint probabilities; thus

\sum_i \sum_j p(x_i, y_j) = 1,   i, j \in \{a, b, c, d\}

p(x, y) = p(y | x) \, p(x) = p(x | y) \, p(y)    (1)

From the theorem of total probability, together with Eq. (1):

p(x) = \sum_j p(x | y_j) \, p(y_j) = \sum_j p(x, y_j)

i.e., p(x_a) = p(x_a, y_a) + p(x_a, y_b) + p(x_a, y_c) + p(x_a, y_d) = 1/8 + 1/16 + 1/32 + 1/32 = 1/4, and so on:
p(x_b) = p(x_c) = p(x_d) = 1/4.

The marginal entropy is H(X) = (-\tfrac{1}{4} \log_2 \tfrac{1}{4}) + 1/2 + 1/2 + 1/2 = 2 bits.
EXAMPLE
(b) The marginal distribution for Y is (1/2, 1/4, 1/8, 1/8), where e.g. 1/2 = 1/8 + 1/16 + 1/16 + 1/4.
H(Y) = 1/2 + 1/2 + 3/8 + 3/8 = 7/4 bits.

(c) Joint entropy H(X, Y):

H(X, Y) = -\sum_i \sum_j p(x_i, y_j) \log_2 p(x_i, y_j)

i.e., the sum runs over all 16 probabilities in the joint distribution (of which only 4 different non-zero values appear, with the following frequencies):

H(X, Y) = (1)(2/4) + (2)(3/8) + (6)(4/16) + (4)(5/32) = 1/2 + 3/4 + 3/2 + 5/8 = 27/8 bits.

(d) Conditional entropy:

H(Y | X) = H(X, Y) - H(X) = 27/8 - 16/8 = 11/8 bits.

Or, directly from the definition, using p(y_j | x_i) = p(x_i, y_j) / p(x_i):

H(Y | X) = -\sum_i \sum_j p(x_i, y_j) \log_2 p(y_j | x_i)
         = \sum_i p(x_i) \left[ -\sum_j p(y_j | x_i) \log_2 p(y_j | x_i) \right]
EXAMPLE
(e) Mutual information I(X; Y):
There are several alternative ways to obtain the answer:

I(X; Y) = H(Y) - H(Y | X) = 7/4 - 11/8 = 3/8 bits.

Or,

I(X; Y) = H(X) + H(Y) - H(X, Y) = 2 + 7/4 - 27/8 = (16 + 14 - 27)/8 = 3/8 bits.

(f) Channel capacity is the maximum, over all possible input distributions, of the mutual information that the channel establishes between the input and the output. So one lower-bound estimate is simply any particular measurement of the mutual information for this channel, such as the value computed above, which was only 3/8 bits.
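As a cross-check, the whole example can be reproduced numerically. Since the joint distribution table did not survive extraction, the matrix used in the sketch below is a reconstruction consistent with the marginals and entropy values stated in the solution and should be treated as an assumption:

```python
# Numerical check of the example. NOTE: the joint distribution table did not
# survive extraction, so the matrix below is a reconstruction that is merely
# consistent with the marginals and entropies stated in the solution.
from fractions import Fraction as F
import math

symbols = ["a", "b", "c", "d"]
# p_xy[i][j] = P(X = symbols[i], Y = symbols[j])   (assumed values)
p_xy = [
    [F(1, 8),  F(1, 16), F(1, 32), F(1, 32)],
    [F(1, 16), F(1, 8),  F(1, 32), F(1, 32)],
    [F(1, 16), F(1, 16), F(1, 16), F(1, 16)],
    [F(1, 4),  F(0),     F(0),     F(0)],
]

p_x = [sum(row) for row in p_xy]            # expected (1/4, 1/4, 1/4, 1/4)
p_y = [sum(col) for col in zip(*p_xy)]      # expected (1/2, 1/4, 1/8, 1/8)

def H(probs):
    """Entropy in bits of a list of probabilities."""
    return -sum(float(p) * math.log2(float(p)) for p in probs if p > 0)

H_X = H(p_x)                                # (a) 2 bits
H_Y = H(p_y)                                # (b) 7/4 = 1.75 bits
H_XY = H([p for row in p_xy for p in row])  # (c) 27/8 = 3.375 bits
H_Y_given_X = H_XY - H_X                    # (d) 11/8 = 1.375 bits (chain rule)
I_XY = H_X + H_Y - H_XY                     # (e) 3/8 = 0.375 bits

print(H_X, H_Y, H_XY, H_Y_given_X, I_XY)
# (f) I_XY = 0.375 bits is a lower bound on the channel capacity C.
```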
