Channel Capacity
1 Examples
1. Noiseless Binary Channel
In this case, any transmitted bit is received without error. Hence, one error-free bit can be transmitted per use of the channel, and the capacity is 1 bit. We can also calculate the information capacity as follows. Since the output equals the input, I(X; Y) = H(X), so
C = max_{p(x)} I(X; Y) = max_{p(x)} H(X) = 1 bit,
achieved by using p(x) = (1/2, 1/2).
2. Noisy Channel with Nonoverlapping Outputs
The channel appears to be noisy, but really is not. Even though the output of the channel is a random consequence of the input, the input can be determined from the output, and hence every transmitted bit can be recovered without error. The capacity of this channel is also 1 bit per transmission. We can also calculate the information capacity as follows. Since H(X|Y) = 0, again I(X; Y) = H(X) and
C = max_{p(x)} I(X; Y) = max_{p(x)} H(X) = 1 bit,
achieved by the uniform input distribution.
3. Noisy Typewriter
In this case the channel input is either received unchanged at the output with probability 1/2 or is transformed into the next letter with probability 1/2, as shown in Fig 3.
If the input has 26 symbols and we use every alternate input symbol, we can transmit one of 13 symbols
without error with each transmission. Hence, the capacity of this channel is log 13 bits per transmission.
We can also calculate the information capacity as follows. Since each input is carried to one of two equally likely outputs, H(Y|X) = 1 bit, and
C = max_{p(x)} I(X; Y) = max_{p(x)} H(Y) − 1 = log 26 − 1 = log 13,
which is achieved by using p(x) distributed uniformly over all the inputs.
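Numerically, log 13 ≈ 3.70 bits per transmission, compared with log 26 ≈ 4.70 bits for the corresponding noiseless 26-symbol channel: the noise costs exactly one bit per use.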
4. Binary Symmetric Channel
We bound the mutual information by
I(X; Y) = H(Y) − H(Y|X)
 = H(Y) − Σ_x p(x)H(Y|X = x)
 = H(Y) − H(p)
 ≤ 1 − H(p),
where the last inequality follows because Y is a binary random variable. Equality is achieved when the input distribution is uniform. Hence, the information capacity of a binary symmetric channel with parameter p is
C = 1 − H(p) bits.
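For example, a crossover probability of p = 0.1 gives C = 1 − H(0.1) ≈ 1 − 0.469 = 0.531 bits per transmission.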
5. Binary Erasure Channel
Figure 5: Binary Erasure Channel
In the binary erasure channel, a fraction α of the transmitted bits are erased (output e), and the receiver knows which bits have been erased (Fig. 5). We calculate the capacity as follows:
C = max_{p(x)} I(X; Y)
 = max_{p(x)} (H(Y) − H(Y|X))
 = max_{p(x)} H(Y) − H(α).
The first guess for the maximum of H(Y) would be log 3, but we cannot achieve this by any choice of input distribution p(x). Letting E be the event {Y = e} and π = Pr(X = 1), using the expansion H(Y) = H(Y, E) = H(E) + H(Y|E), we have
H(Y) = H(α) + (1 − α)H(π).
Hence
C = max_{p(x)} (H(α) + (1 − α)H(π) − H(α))
 = max_π (1 − α)H(π)
 = 1 − α,
achieved by π = 1/2.
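These closed forms are easy to check numerically. Below is a minimal sketch (ours, not part of the original notes), assuming numpy; the helper names mutual_information and capacity_grid are our own. It brute-forces max_{p(x)} I(X; Y) over binary input distributions by grid search rather than using a general algorithm such as Blahut-Arimoto.

import numpy as np

def binary_entropy(p):
    # H(p) in bits, with the convention 0 log 0 = 0
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def mutual_information(pi, Q):
    # I(X; Y) in bits for a binary-input channel:
    # pi = Pr(X = 1), Q[x, y] = p(y | x)
    px = np.array([1 - pi, pi])
    py = px @ Q  # output distribution p(y)
    h_y = -np.sum(py[py > 0] * np.log2(py[py > 0]))
    q = np.where(Q > 0, Q, 1.0)  # dummy 1s where Q = 0; those terms contribute 0
    h_y_given_x = -np.sum(px[:, None] * Q * np.log2(q))
    return h_y - h_y_given_x

def capacity_grid(Q, n=10001):
    # brute-force maximization over Pr(X = 1); adequate for binary-input channels
    grid = np.linspace(0.0, 1.0, n)
    vals = [mutual_information(pi, Q) for pi in grid]
    i = int(np.argmax(vals))
    return vals[i], grid[i]

p, alpha = 0.1, 0.25
bsc = np.array([[1 - p, p], [p, 1 - p]])
bec = np.array([[1 - alpha, alpha, 0.0], [0.0, alpha, 1 - alpha]])  # middle output = erasure
print(capacity_grid(bsc)[0], 1 - binary_entropy(p))  # both ~ 0.531
print(capacity_grid(bec)[0], 1 - alpha)              # both ~ 0.75

The same grid search applies unchanged to the Z-channel and the erasures-and-errors channel in the exercises below.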
2 Exercises
1. Additive noise channel.
Find the channel capacity of the discrete memoryless channel
Y = X + Z,
where Pr{Z = 0} = Pr{Z = a} = 1/2. The alphabet for X is X = {0, 1}. Assume that Z is independent of X. Observe that the channel capacity depends on the value of a.
Solution:
If a ≠ 0 and a ≠ ±1, the four possible outputs 0, 1, a, 1 + a are distinct, so X is determined by Y and C = max_{p(x)} H(X) = 1 bit. If a = 0, then Y = X and again C = 1 bit. If a = ±1, one output value (Y = 1 for a = 1, Y = 0 for a = −1) can be produced by either input, so the channel is equivalent to a binary erasure channel with erasure probability 1/2, and C = 1 − 1/2 = 1/2 bit.
2. Channels with memory.
Consider a binary symmetric channel in which the noise process has memory, i.e., the crossover events on different uses of the channel are dependent. Show that the capacity over n uses of this channel is at least n(1 − H(p)).
Solution:
The channel can be modeled as follows:
Yi = Xi ⊕ Zi,
where
Zi = 1 with probability p
     0 with probability 1 − p
and the Zi are not independent. Since Xi = Yi ⊕ Zi, we have
I(X1, X2, · · · , Xn; Y1, Y2, · · · , Yn)
 = H(X1, X2, · · · , Xn) − H(X1, X2, · · · , Xn|Y1, Y2, · · · , Yn)
 = H(X1, X2, · · · , Xn) − H(Z1, Z2, · · · , Zn|Y1, Y2, · · · , Yn)
 ≥ H(X1, X2, · · · , Xn) − H(Z1, Z2, · · · , Zn)
 ≥ H(X1, X2, · · · , Xn) − nH(p)
 = n − nH(p)   (1)
if X1, X2, · · · , Xn are chosen i.i.d. ∼ Bern(1/2). The capacity of the channel with memory over n uses of the channel is therefore
Cn = max_{p(x1,···,xn)} I(X1, · · · , Xn; Y1, · · · , Yn)
 ≥ n(1 − H(p))
 = nC.
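As an extreme illustration (ours, not part of the problem), suppose Z1 = Z2 = · · · = Zn, so the noise is completely dependent across uses. Then H(Z1, · · · , Zn) = H(p), and with X1, · · · , Xn i.i.d. ∼ Bern(1/2) the chain above gives I(X1, · · · , Xn; Y1, · · · , Yn) = n − H(p), which strictly exceeds n(1 − H(p)) for n ≥ 2: memory in the noise can only increase capacity.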
3. Channel capacity.
Consider the discrete memoryless channel
Y = X + Z (mod 11),
where
Z = 1 with probability 1/3
    2 with probability 1/3
    3 with probability 1/3
and X ∈ {0, 1, · · · , 10}. Assume that Z is independent of X.
(a) Find the capacity.
(b) What is the maximizing p∗(x)?
Solution:
In this case,
C = max_{p(x)} I(X; Y)
 = max_{p(x)} (H(Y) − H(Y|X))
 = max_{p(x)} H(Y) − log 3
 = log 11 − log 3,
which is attained when Y has a uniform distribution, which occurs (by symmetry) when X has a uniform distribution.
(a) The capacity of the channel is log(11/3) bits/transmission.
(b) The maximizing distribution is the uniform distribution p∗(x) = 1/11 for x ∈ {0, 1, · · · , 10}.
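Numerically, C = log(11/3) ≈ 1.87 bits per transmission, compared with log 11 ≈ 3.46 bits for the corresponding noiseless 11-symbol channel.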
4. Parallel channels.
Consider two discrete memoryless channels with capacities C1 and C2, used in parallel: the pair (x1, x2) is sent and the pair (y1, y2) is received, with p(y1, y2|x1, x2) = p(y1|x1)p(y2|x2). Find the capacity of the combined channel.
Solution:
Since the two channels act independently given their inputs,
I(X1, X2; Y1, Y2) = H(Y1, Y2) − H(Y1, Y2|X1, X2)
 = H(Y1, Y2) − H(Y1|X1) − H(Y2|X2)
 ≤ H(Y1) + H(Y2) − H(Y1|X1) − H(Y2|X2)
 = I(X1; Y1) + I(X2; Y2).
Equality holds if Y1 and Y2 are independent, namely, X1 and X2 are independent. Hence,
C = max_{p(x1,x2)} I(X1, X2; Y1, Y2)
 = C1 + C2
with equality iff p(x1, x2) = p∗(x1)p∗(x2), where p∗(x1) and p∗(x2) are the distributions that maximize C1 and C2 respectively.
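For example (an illustration of ours using the results above), a binary symmetric channel with crossover p in parallel with a binary erasure channel with erasure probability α gives C = (1 − H(p)) + (1 − α).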
5. The Z channel.
The Z-channel has binary input and output alphabets and transition probabilities p(y|x) given by the following matrix:
Q = [  1    0  ]
    [ 1/2  1/2 ] ,   x, y ∈ {0, 1}
Find the capacity of the Z-channel and the maximizing input probability distribution.
Solution:
First we express I(X; Y), the mutual information between the input and output of the Z-channel, as a function of x = Pr(X = 1):
I(X; Y) = H(Y) − H(Y|X) = H(x/2) − x,
since Pr(Y = 1) = x/2, H(Y|X = 0) = 0, and H(Y|X = 1) = 1 bit. Since I(X; Y) = 0 when x = 0 and x = 1, the maximum mutual information is obtained for some value of x such that 0 < x < 1.
Using elementary calculus, we determine that
d/dx I(X; Y) = (1/2) log((1 − x/2)/(x/2)) − 1,
which is equal to zero for x = 2/5. (It is reasonable that P r(X = 1) < 1/2 because X = 1 is the noisy
input to the channel.) So the capacity of the Z-channel in bits is H(1/5) − 2/5 = 0.722 − 0.4 = 0.322.
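As a check, H(1/5) = (1/5) log 5 + (4/5) log(5/4) ≈ 0.4644 + 0.2575 = 0.7219, and the grid-search sketch from the examples section, applied to Q = [[1, 0], [1/2, 1/2]], should peak at the same point, Pr(X = 1) = 0.4, with I(X; Y) ≈ 0.322 bits.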
6. Time-varying channels.
Consider a time-varying discrete memoryless channel. Let Y1, Y2, · · · , Yn be conditionally independent given X1, X2, · · · , Xn, with conditional distribution given by p(y|x) = ∏_{i=1}^n pi(yi|xi), where each pi(yi|xi) describes a binary symmetric channel with crossover probability pi. Let X = (X1, X2, · · · , Xn), Y = (Y1, Y2, · · · , Yn). Find max_{p(x)} I(X; Y).
Solution:
By the definition of the channel,
H(Y|X) = Σ_{i=1}^n H(Yi|Xi),
since Yi depends only on Xi and is conditionally independent of everything else. Continuing the series of inequalities, we have
I(X; Y) = H(Y) − Σ_{i=1}^n H(Yi|Xi)
 = Σ_{i=1}^n H(Yi|Y1, · · · , Yi−1) − Σ_{i=1}^n H(Yi|Xi)
 ≤ Σ_{i=1}^n H(Yi) − Σ_{i=1}^n H(Yi|Xi)
 ≤ Σ_{i=1}^n (1 − H(pi)),
since each Yi is binary, so that H(Yi) ≤ 1, and H(Yi|Xi) = H(pi). Both inequalities hold with equality when X1, X2, · · · , Xn are chosen i.i.d. ∼ Bern(1/2), so
max_{p(x)} I(X; Y) = Σ_{i=1}^n (1 − H(pi)).
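In particular, if pi = p for every i, this reduces to n(1 − H(p)), the n-use capacity of the ordinary binary symmetric channel.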
7. Unused symbols.
Show that the capacity of the channel with probability transition matrix
        [ 2/3  1/3   0  ]
Py|x =  [ 1/3  1/3  1/3 ]
        [  0   1/3  2/3 ]
is achieved by a distribution that places zero probability on one of the input symbols. What is the capacity of this channel?
Solution:
Let the probabilities of the three input symbols be p1, p2 and p3. Then the probabilities of the three output symbols can be easily calculated to be ((2/3)p1 + (1/3)p2, 1/3, (1/3)p2 + (2/3)p3), and therefore
I(X; Y) = H(Y) − H(Y|X)
 = H(Y) − (p1 + p3)H(1/3, 2/3) − p2 log 3
 ≤ log 3 − (p1 + p3)H(1/3, 2/3) − p2 log 3
 = (p1 + p3)(log 3 − H(1/3, 2/3)),
which is maximized if p1 + p3 is as large as possible (since log 3 > H(1/3, 2/3)). Therefore the maximizing distribution corresponds to p1 + p3 = 1, p1 = p3, and therefore (p1, p2, p3) = (1/2, 0, 1/2). We have the capacity of this channel as
C = log 3 − H(1/3, 2/3) = log 3 − (log 3 − 2/3) = 2/3 bits.
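As a check, with (p1, p2, p3) = (1/2, 0, 1/2) the output distribution is (1/3, 1/3, 1/3), so H(Y) = log 3 and H(Y|X) = H(1/3, 2/3), giving I(X; Y) = log 3 − H(1/3, 2/3) = 2/3 bits, confirming the capacity.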
Remark: The intuitive reason why p2 = 0 for the maximizing distribution is that, conditional on the input being 2, the output is uniformly distributed. The same uniform output distribution can be achieved without using the symbol 2 (by setting p1 = p3), so the symbol 2 does not add any information: it does not change the entropy of the output, and the conditional entropy H(Y|X = 2) is the maximum possible, i.e., log 3. Hence any positive probability for symbol 2 will only reduce the mutual information.
Note that not using a symbol is optimal only if the uniform output distribution can be achieved without
use of that symbol. For example, in the Z channel example above, both symbols are used, even though
one of them gives a conditionally uniform distribution on the output.
8. Erasures and errors in a binary channel.
Consider a binary channel in which each transmitted bit is received in error with probability ϵ, erased (Y = e) with probability α, and received correctly with probability 1 − α − ϵ.
(a) Find the capacity of this channel.
(b) Specialize to the case of the binary symmetric channel (α = 0).
(c) Specialize to the case of the binary erasure channel (ϵ = 0).
Solution:
(a) As with the examples in the text, we set the input distribution for the two inputs to be π and 1 − π.
Then
C = max_{p(x)} I(X; Y)
 = max_π (H(Y) − H(Y|X))
 = max_π H(Y) − H(1 − α − ϵ, α, ϵ).
As in the case of the erasure channel, the maximum value for H(Y) cannot be log 3, since the probability of the erasure symbol is α independent of the input distribution. Letting E be the event {Y = e},
H(Y) = H(E) + H(Y|E) = H(α) + (1 − α)H((π + ϵ − πα − 2πϵ)/(1 − α)) ≤ H(α) + (1 − α),
with equality iff (π + ϵ − πα − 2πϵ)/(1 − α) = 1/2, which can be achieved by setting π = 1/2.
Therefore the capacity of this channel is
C = H(α) + 1 − α − H(1 − α − ϵ, α, ϵ)
 = H(α) + 1 − α − (H(α) + (1 − α)H((1 − α − ϵ)/(1 − α), ϵ/(1 − α)))
 = (1 − α)(1 − H((1 − α − ϵ)/(1 − α), ϵ/(1 − α))).
(b) Setting α = 0 recovers the binary symmetric channel:
C = 1 − H(ϵ).
(c) Setting ϵ = 0 recovers the binary erasure channel:
C = 1 − α.
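For example (an illustration of ours), with α = 0.1 and ϵ = 0.05 the general formula gives C = 0.9 (1 − H(17/18, 1/18)) ≈ 0.9 × 0.690 ≈ 0.62 bits per transmission.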