A Mathematical Theory of Communication
By C. E. Shannon
Presented by Ling Shi
1
Part II: The Discrete Channel with Noise
2
2.1 Representation of a noisy discrete channel
• E = f(S, N): the received signal E is a function of the transmitted signal S and the noise N
• p(β, j | α, i) ⇒ p(j|i), where α, β are states of the channel, i is the sent symbol, and j is the received symbol
[Figure: channel block diagram with input x, output y, and noise source N]
3
2.2 Equivocation and channel capacity
(1) Objective: minimize the error caused by the noise
(2) Example: two binary channels at 1000 symbols per second with p0 = p1 = 1/2.

Slightly noisy channel:
  p(1|0) = 0.01, p(1|1) = 0.99, p(0|0) = 0.99, p(0|1) = 0.01
  H(x) = 1, H(x|y) = 0.081, R = H(x) − H(x|y) = 0.919

Completely noisy channel:
  p(1|0) = p(1|1) = p(0|0) = p(0|1) = 1/2
  H(x) = 1, H(x|y) = 1, R = 0
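As a check on the numbers above, the equivocation H(x|y) can be computed directly from the input distribution and the transition probabilities. A minimal Python sketch (the helper names are mine, not Shannon's):

```python
import math

def H(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def equivocation(p_x, channel):
    """H(x|y) for input distribution p_x and channel given as channel[x][y] = p(y|x)."""
    xs = range(len(p_x))
    ys = range(len(channel[0]))
    # output marginal p(y)
    p_y = [sum(p_x[x] * channel[x][y] for x in xs) for y in ys]
    h = 0.0
    for y in ys:
        if p_y[y] == 0:
            continue
        # posterior p(x|y) = p(x) p(y|x) / p(y)
        post = [p_x[x] * channel[x][y] / p_y[y] for x in xs]
        h += p_y[y] * H(post)
    return h

p_x = [0.5, 0.5]
noisy = [[0.99, 0.01], [0.01, 0.99]]   # 1% symbol errors
useless = [[0.5, 0.5], [0.5, 0.5]]     # output independent of input

print(round(equivocation(p_x, noisy), 3))            # ~0.081
print(round(H(p_x) - equivocation(p_x, noisy), 3))   # R ~ 0.919
print(H(p_x) - equivocation(p_x, useless))           # R = 0
```

The completely noisy channel gives H(x|y) = H(x) = 1, so the effective rate of transmission is zero, exactly as the slide states.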
(3) Theorem 10: If the correction channel has a capacity equal to H(x|y)
it is possible to so encode the correction data as to send it over this
channel and correct all but an arbitrarily small fraction ε of the errors. This
is not possible if the channel capacity is less than H(x|y).
4
2.2 Equivocation and channel capacity (cont’d)
(4) Remarks: H(x|y) is the amount of additional information that must be supplied
per second at the receiving point to correct the received message.
(5) Definition:
C = Max( H(x) – H(x|y) )
where the max is with respect to all possible information sources used as input
to the channel.
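The definition of C can be made concrete numerically. In this sketch a brute-force grid search over binary input distributions stands in for the maximization (my choice for illustration, not Shannon's method); for the 1%-error binary symmetric channel it recovers C ≈ 0.919 bits per symbol, matching the rate in the earlier example:

```python
import math

def mutual_information(p0, channel):
    """I(X;Y) = H(x) - H(x|y) in bits for binary input (p0, 1-p0)."""
    p_x = [p0, 1.0 - p0]
    ys = range(len(channel[0]))
    p_y = [sum(p_x[x] * channel[x][y] for x in (0, 1)) for y in ys]
    i = 0.0
    for x in (0, 1):
        for y in ys:
            if p_x[x] > 0 and channel[x][y] > 0:
                i += p_x[x] * channel[x][y] * math.log2(channel[x][y] / p_y[y])
    return i

# Binary symmetric channel with 1% crossover probability.
bsc = [[0.99, 0.01], [0.01, 0.99]]

# C = max over input distributions of H(x) - H(x|y);
# a coarse grid over p0 approximates the maximization.
capacity = max(mutual_information(p0 / 1000, bsc) for p0 in range(1, 1000))
print(round(capacity, 3))  # ~0.919, attained at p0 = 1/2
```

For a symmetric channel the maximum sits at the uniform input, which is why the example on the previous slide already achieves the capacity.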
[Figure: an observer compares the sent and received messages and supplies correction data to the receiving point over a correction channel]
5
2.3 Fundamental theorem for a discrete noisy channel
(1) Theorem 11: Let a discrete channel have the capacity C and a discrete
source the entropy per second H. If H ≤ C, there exists a coding system
such that the output of the source can be transmitted over the channel with
an arbitrarily small frequency of errors (or an arbitrarily small equivocation).
If H > C, it is possible to encode the source so that the equivocation is less
than H – C + ε where ε is arbitrarily small. There is no method of encoding
which gives an equivocation less than H – C.
[Figure: equivocation H(x|y) versus source entropy H(x); the attainable region lies on or above the line of slope 1.0 through H(x) = C]
6
2.3 Fundamental theorem for a discrete noisy channel (cont’d)
(2) Proof sketch: Assume R < C and let X be such that I(X;Y) achieves its
maximum, i.e., equals the channel capacity C. Choose M = 2^(RT) random
sequences from the 2^(H(x)T) highly probable input messages. Suppose x_i is
transmitted and y is received; a given y can be caused by about 2^(H(x|y)T)
possible inputs, which form the set S_y. For y to be decoded as some x_j
with j ≠ i, at least one x_j with j ≠ i must lie in S_y. Now we have

P(at least one x_j in S_y, j ≠ i) ≤ Σ_{j=1, j≠i}^{2^(RT)} P(x_j in S_y)
≈ 2^(RT) · 2^(H(x|y)T) / 2^(H(x)T) = 2^(−T(C−R)),

which goes to 0 as T → ∞ since R < C.
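Each term P(x_j in S_y) is roughly 2^(H(x|y)T) / 2^(H(x)T), so the union bound behaves like 2^(−T(C−R)) and vanishes as the block length T grows. A small numerical illustration (C is taken from the earlier binary example; the rate R = 0.8 is an arbitrary choice below capacity):

```python
# Decay of the random-coding union bound P(error) <= 2^(-T(C - R))
# for the 1%-error binary channel, C ~ 0.919 bits per symbol.
C, R = 0.919, 0.8          # any R < C works
for T in (10, 100, 1000):
    bound = 2.0 ** (-T * (C - R))
    print(T, bound)
```

With R above C the same expression grows without limit, which is why the argument only yields reliable transmission for rates below capacity.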
Verification: Case I: p = 1, β = 1, C = log 3; the result makes sense since the
channel is then noiseless with three possible symbols.
Case II: p = 0.5, β = 2, C = log 2; the result also makes sense since the
second and third symbols then cannot be distinguished at all and act together
like one symbol.
For other values of p, C varies between log 2 and log 3.
8
3.1 Sets and ensembles of functions
• A set of functions is merely a eg, sint (t + θ) is ergodic while asin(t+θ)
collections of functions, eg, fθ(t) = with a normally distributed and θ
sin(t+θ), where θ is in some set uniform is not ergodic
• An ensemble of functions is a set • We can obtain new ensembles from
of functions with a probability measure, existing ones, eg, gα(t) = Tfα(t) where
eg, P(θ) T is an operator, eg, T=d/dt
• An ensemble of functions fα(t) is • An operator T is called invariant if
stationary if the same ensembles gα(t+t1) = Tfα(t+t1) for all fα(t) and t1;
results when all the functions are Underinvariant operators, stationarity
shifted any fixed amount in time, eg, and ergodicity are preserved
fθ(t+t1) = sin (t + t1 + θ) = sin (t+φ) with
• If f(t) is band limited to 0 and W, then
φ distributed uniformly from 0 to 2π if θ
is uniformly distributed from 0 to 2π
sinπ(2Wt- n )
• An ensemble is ergodic if it is f(t) = ∑xn
π(2Wt-n)
stationary and there’s no subset of the
functions in the set with a probability
where xn=f(n/2W)
different from 0 and 1 which is
stationary.
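The cardinal-series reconstruction in the last bullet can be checked numerically. A sketch under assumed values (W = 1 and a shifted sinc pulse as the band-limited signal are my choices; truncating the infinite sum leaves only a small tail error):

```python
import math

def sinc(x):
    """sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

# A signal band-limited to [0, W] with W = 1: a shifted sinc pulse.
W = 1.0
f = lambda t: sinc(2 * W * t - 0.7)

def reconstruct(t, n_terms=2000):
    """Rebuild f(t) from its samples x_n = f(n/2W), taken every 1/(2W)
    seconds, via the cardinal series f(t) = sum_n x_n sinc(2Wt - n)."""
    return sum(f(n / (2 * W)) * sinc(2 * W * t - n)
               for n in range(-n_terms, n_terms + 1))

t = 0.123                            # an arbitrary off-sample instant
print(abs(reconstruct(t) - f(t)))    # small truncation error only
```

The series recovers the signal exactly in the limit; the residual here is purely from cutting the sum off at a finite number of samples.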
9
3.2 Entropy of a continuous distribution
10