A Mathematical Theory of Communication, by C. E. Shannon. Presented by Ling Shi.

The document summarizes key concepts from Shannon's "A Mathematical Theory of Communication": [1] It introduces the representation of a noisy discrete channel, including entropy, equivocation, and channel capacity; the channel capacity is the maximum rate at which information can be encoded and transmitted with arbitrarily small error. [2] It states the fundamental theorem for a discrete channel with noise: if the source's entropy is less than the channel capacity, its output can be transmitted with arbitrarily few errors; otherwise, the equivocation can be reduced to, but not below, the difference between the entropy and the capacity. [3] It works an example of calculating the capacity of a simple discrete channel by maximizing the mutual information between input and output subject to a constraint.


A Mathematical Theory of Communication
By C. E. Shannon
Presented by Ling Shi

Part II: The Discrete Channel with Noise

• Representation of a noisy discrete channel
• Equivocation and channel capacity
• Fundamental theorem for a discrete channel with noise
• Example of a discrete channel and its capacity
• Sets and ensembles of functions (from Part III)
• Entropy of a continuous distribution (from Part III)

2.1 Representation of a noisy discrete channel

• E = f(S, N): the received signal E is a function of the transmitted signal S and the noise N
• p(β, j | α, i) => p(j | i), where α, β are states of the channel, i is the sent symbol and j is the received symbol
[Figure: block diagram of the channel — input x enters the channel, noise N is added, output y is received]

• H(x): entropy of the input; H(y): entropy of the output
• H(y|x): entropy of the output when the input is known; H(x|y): entropy of the input when the output is known
• H(x,y) = H(x) + H(y|x) = H(y) + H(x|y)
  I(x,y) = H(x) – H(x|y) = H(y) – H(y|x)  (see the numeric sketch below)
[Figure: Venn-style diagram of H(x) and H(y) — left-only region H(x|y), overlap I(x,y), right-only region H(y|x)]
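Not from the slides: a minimal numeric sketch of these identities for a small hypothetical joint distribution p(x, y) of a binary input and output; the distribution and variable names are illustrative only.

import math

# Hypothetical joint distribution p(x, y) (rows: x, columns: y); numbers are illustrative.
p_xy = [[0.45, 0.05],
        [0.05, 0.45]]

def H(probs):
    # entropy in bits of a list of probabilities
    return -sum(p * math.log2(p) for p in probs if p > 0)

p_x = [sum(row) for row in p_xy]        # marginal distribution of x
p_y = [sum(col) for col in zip(*p_xy)]  # marginal distribution of y

H_x, H_y = H(p_x), H(p_y)
H_joint = H([p for row in p_xy for p in row])

# Conditional entropies computed directly from the definitions:
# H(y|x) = -sum p(x,y) log2[p(x,y)/p(x)], and H(x|y) likewise with p(y).
H_y_given_x = -sum(p_xy[i][j] * math.log2(p_xy[i][j] / p_x[i])
                   for i in range(2) for j in range(2) if p_xy[i][j] > 0)
H_x_given_y = -sum(p_xy[i][j] * math.log2(p_xy[i][j] / p_y[j])
                   for i in range(2) for j in range(2) if p_xy[i][j] > 0)

I_xy = H_x - H_x_given_y  # mutual information

# The identities on this slide hold numerically.
assert abs(H_joint - (H_x + H_y_given_x)) < 1e-9
assert abs(H_joint - (H_y + H_x_given_y)) < 1e-9
assert abs(I_xy - (H_y - H_y_given_x)) < 1e-9
print(f"H(x)={H_x:.3f}  H(x|y)={H_x_given_y:.3f}  I(x,y)={I_xy:.3f}")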

2.2 Equivocation and channel capacity
(1) Objective: minimize the errors caused by the noise.

(2) R = I(x,y) = H(x) – H(x|y), where H(x|y) is called the equivocation.

Example: two binary channels, each transmitting 1000 symbols per second with p0 = p1 = 1/2 (reproduced in the sketch below).
  Channel A (1% errors):  p(0|0) = p(1|1) = 0.99, p(1|0) = p(0|1) = 0.01;
                          H(x) = 1, H(x|y) ≈ 0.081, R ≈ 0.919 bits per symbol (919 bits per second).
  Channel B (pure noise): p(0|0) = p(0|1) = p(1|0) = p(1|1) = 1/2;
                          H(x) = 1, H(x|y) = 1, R = 0.
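Not from the slides: a short sketch that reproduces these numbers for a binary symmetric channel with equiprobable inputs; the helper names (bsc_rate, h2) are illustrative.

import math

def bsc_rate(crossover, symbols_per_second=1000):
    # Rate R = H(x) - H(x|y) for a binary symmetric channel with equiprobable
    # inputs, returned in bits per symbol and bits per second.
    def h2(p):  # binary entropy function
        return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    H_x = 1.0                    # equiprobable binary source: H(x) = 1 bit per symbol
    H_x_given_y = h2(crossover)  # equivocation of the symmetric channel with p0 = p1 = 1/2
    R = H_x - H_x_given_y
    return R, R * symbols_per_second

print(bsc_rate(0.01))  # roughly (0.919, 919): Channel A above
print(bsc_rate(0.5))   # (0.0, 0.0): Channel B, the pure-noise case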

(3) Theorem 10: If the correction channel has a capacity equal to H(x|y), it is possible to so encode the correction data as to send it over this channel and correct all but an arbitrarily small fraction ε of the errors. This is not possible if the channel capacity is less than H(x|y).

2.2 Equivocation and channel capacity (cont’d)
(4) Remarks: H(x|y) is the amount of additional information that must be supplied
per second at the receiving point to correct the received message.

(5) Definition:
C = Max( H(x) – H(x|y) )
where the max is with respect to all possible information sources used as input
to the channel.
[Figure: correction scheme — Source → Transmitter → Receiver → Correcting Device; an Observer compares the original message M with the received M’ and sends correction data over a separate channel to the Correcting Device, which recovers M]

2.3 Fundamental theorem for a discrete noisy channel
(1) Theorem 11: Let a discrete channel have the capacity C and a discrete source the entropy per second H. If H ≤ C, there exists a coding system such that the output of the source can be transmitted over the channel with an arbitrarily small frequency of errors (or an arbitrarily small equivocation). If H > C, it is possible to encode the source so that the equivocation is less than H – C + ε, where ε is arbitrarily small. There is no method of encoding which gives an equivocation less than H – C.

[Figure: attainable region in the (H(x), H(x|y)) plane — the equivocation can be held near zero for H(x) up to C, then rises with slope 1.0 for H(x) > C]
2.3 Fundamental theorem for a discrete noisy channel (cont’d)
(2) Proof sketch: Assume R < C and let the input X be such that I(X,Y) achieves its maximum, i.e., equals the channel capacity C. Choose M = 2^{RT} random sequences from the 2^{H(x)T} highly probable input messages of duration T. Suppose x_i is transmitted and y is received; for y to be decoded as some x_j with j ≠ i, there must be at least one x_j in S_y with j ≠ i, where S_y is the set of about 2^{H(x|y)T} inputs that can cause the output y. Now

  P(at least one x_j in S_y, j ≠ i) ≤ Σ_{j ≠ i} P(x_j in S_y).

There are 2^{TR} – 1 terms in the sum, and since P(x_j in S_y) = 2^{H(x|y)T} / 2^{H(x)T} = 2^{–TC}, therefore

  P(at least one x_j in S_y, j ≠ i) ≤ 2^{TR} · 2^{–TC} = 2^{T(R–C)} → 0 as T → ∞

(a random-coding simulation sketch follows below).

[Figure: fan diagram — the 2^{H(x)T} highly probable input messages and the 2^{H(y)T} highly probable output messages; each received y has about 2^{H(x|y)T} possible causes, which form the set S_y]
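Not from the slides: a hedged Monte Carlo sketch of the random-coding idea for a binary symmetric channel. It uses minimum-Hamming-distance decoding rather than Shannon's typical-set argument, and all parameters (crossover 0.05, rate 0.25, block lengths, trial count) are illustrative assumptions; the point is only that, with the rate held below capacity, the observed error rate falls as the block length T grows.

import math, random

def h2(p):
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def random_coding_error_rate(rate, T, crossover=0.05, trials=200, seed=0):
    # Random codebook of about 2^(rate*T) binary codewords of length T, sent over
    # a binary symmetric channel; decode to the codeword of minimum Hamming distance.
    rng = random.Random(seed)
    M = max(2, int(2 ** (rate * T)))  # number of messages, roughly 2^{RT}
    codebook = [[rng.randint(0, 1) for _ in range(T)] for _ in range(M)]
    errors = 0
    for _ in range(trials):
        i = rng.randrange(M)
        received = [bit ^ (rng.random() < crossover) for bit in codebook[i]]
        decoded = min(range(M),
                      key=lambda j: sum(a != b for a, b in zip(codebook[j], received)))
        errors += (decoded != i)
    return errors / trials

C = 1 - h2(0.05)   # capacity of this BSC, about 0.71 bits per symbol
R = 0.25           # a rate kept below C
for T in (8, 16, 24, 32):
    print(T, random_coding_error_rate(R, T))  # the observed error rate trends downward as T grows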
2.4 Example of a discrete channel and its capacity

Channel: three symbols; the first is transmitted without error, while each of the other two is received correctly with probability p and as the other one with probability q (p + q = 1). Let P be the probability of the first symbol and Q the probability of each of the other two.
[Figure: transition diagram — the first symbol maps to itself with probability 1; the second and third symbols map to themselves with probability p and to each other with probability q]

  α = –(p log p + q log q)     H(x) = –P log P – 2Q log Q     H(x|y) = 2Qα

Problem: maximize C = H(x) – H(x|y) subject to P + 2Q = 1.

Solution: U = –P log P – 2Q log Q – 2Qα + λ(P + 2Q)
  ∂U/∂P = –1 – log P + λ = 0
  ∂U/∂Q = –2 – 2 log Q – 2α + 2λ = 0
Hence P = β/(β + 2), Q = 1/(β + 2), and C = log((β + 2)/β), where β = e^α.

Verification:
  Case I: p = 1, β = 1, C = log 3; this makes sense since the channel is now noiseless with three possible symbols.
  Case II: p = 0.5, β = 2, C = log 2; this also makes sense since the second and third symbols now cannot be distinguished at all and act together like one symbol.
For other values of p, C varies between log 2 and log 3 (see the numeric check below).
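Not from the slides: a numeric sketch that checks the closed-form capacity C = log((β + 2)/β) against a direct maximization of H(x) – H(x|y) over P; natural logarithms are used, matching β = e^α, and the grid size is an illustrative choice.

import math

def capacity_closed_form(p):
    # C = log((beta + 2)/beta), beta = e^alpha, alpha = -(p log p + q log q), q = 1 - p.
    q = 1 - p
    alpha = -sum(v * math.log(v) for v in (p, q) if v > 0)
    beta = math.exp(alpha)
    return math.log((beta + 2) / beta)

def capacity_brute_force(p, steps=100000):
    # Directly maximize H(x) - H(x|y) = -P log P - 2Q log Q - 2Q*alpha over P,
    # with Q = (1 - P)/2, as a numerical cross-check.
    q = 1 - p
    alpha = -sum(v * math.log(v) for v in (p, q) if v > 0)
    best = 0.0
    for k in range(1, steps):
        P = k / steps
        Q = (1 - P) / 2
        best = max(best, -P * math.log(P) - 2 * Q * math.log(Q) - 2 * Q * alpha)
    return best

for p in (1.0, 0.5, 0.8):
    print(p, capacity_closed_form(p), capacity_brute_force(p))
# p = 1.0 gives log 3 ~ 1.099 and p = 0.5 gives log 2 ~ 0.693 (natural log, i.e. nats)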

3.1 Sets and ensembles of functions
• A set of functions is merely a collection of functions, e.g., f_θ(t) = sin(t + θ), where θ ranges over some set.
• An ensemble of functions is a set of functions with a probability measure, e.g., P(θ).
• An ensemble of functions f_α(t) is stationary if the same ensemble results when all the functions are shifted by any fixed amount in time, e.g., f_θ(t + t1) = sin(t + t1 + θ) = sin(t + φ) with φ distributed uniformly from 0 to 2π if θ is uniformly distributed from 0 to 2π.
• An ensemble is ergodic if it is stationary and there is no subset of the functions in the set, with a probability different from 0 and 1, which is stationary. E.g., sin(t + θ) with θ uniform is ergodic, while a·sin(t + θ) with a normally distributed and θ uniform is not ergodic.
• We can obtain new ensembles from existing ones, e.g., g_α(t) = T f_α(t), where T is an operator, e.g., T = d/dt.
• An operator T is called invariant if g_α(t + t1) = T f_α(t + t1) for all f_α(t) and t1; under invariant operators, stationarity and ergodicity are preserved.
• If f(t) is band-limited to the band from 0 to W, then (see the sketch below)

  f(t) = Σ_n x_n · sin π(2Wt – n) / [π(2Wt – n)],   where x_n = f(n/2W).
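Not from the slides: a minimal sketch of the sampling formula above, reconstructing an assumed band-limited test signal from its samples x_n = f(n/2W); the test signal, the bandwidth W = 4 Hz, and the truncation range are illustrative choices.

import math

W = 4.0   # assumed bandwidth in Hz: the test signal is limited to frequencies below W

def f(t):
    # illustrative band-limited test signal (components at 1 Hz and 3 Hz, both below W)
    return math.sin(2 * math.pi * 1.0 * t) + 0.5 * math.cos(2 * math.pi * 3.0 * t)

def sinc(u):
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)

def reconstruct(t, n_terms=2000):
    # f(t) = sum_n x_n * sinc(2Wt - n) with x_n = f(n / 2W), truncated to a finite sum
    return sum(f(n / (2 * W)) * sinc(2 * W * t - n)
               for n in range(-n_terms, n_terms + 1))

for t in (0.1, 0.37, 0.5):
    print(t, f(t), reconstruct(t))   # the reconstruction matches f(t) up to truncation error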
3.2 Entropy of a continuous distribution

• Definition:
  H(x) = –∫ p(x) log p(x) dx
  H = –∫···∫ p(x1, …, xn) log p(x1, …, xn) dx1 ··· dxn
  H(x,y) = –∫∫ p(x,y) log p(x,y) dx dy
  H(y|x) = –∫∫ p(x,y) log [p(x,y)/p(x)] dx dy
  H(x|y) = –∫∫ p(x,y) log [p(x,y)/p(y)] dx dy

• Properties:
  1. If x is limited to a certain volume, H(x) is maximized when p(x) is uniform over that volume.
  2. H(x,y) ≤ H(x) + H(y), with equality iff x and y are independent.
  3. If p'(y) = ∫ a(x,y) p(x) dx with ∫ a(x,y) dx = 1, ∫ a(x,y) dy = 1 and a(x,y) ≥ 0, then H(y) ≥ H(x).
  4. H(x,y) = H(x) + H(y|x) = H(y) + H(x|y).
  5. Let p(x) be a one-dimensional distribution. The form giving maximum entropy subject to the condition that the standard deviation σ(x) be fixed at σ is Gaussian.
  6. The entropy of a one-dimensional Gaussian distribution whose standard deviation is σ is H(x) = log √(2πe) σ.
  7. If x is limited to the half line (p(x) = 0 for x ≤ 0) and E[x] = a is fixed, then the maximum entropy occurs when p(x) = (1/a) exp(–x/a) and is equal to log(ea) (see the numeric check below).
  8. Entropy is relative to the coordinate system.
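Not from the slides: a short numeric check of properties 6 and 7, integrating –p(x) log p(x) for a Gaussian and an exponential density and comparing with log √(2πe)·σ and log(ea); the parameter values and integration grid are illustrative.

import math

def differential_entropy(p, lo, hi, steps=200000):
    # Numerically integrate -p(x) log p(x) over [lo, hi] (natural log, i.e. nats)
    dx = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        px = p(x)
        if px > 0:
            total -= px * math.log(px) * dx
    return total

sigma, a = 1.5, 2.0   # illustrative parameter choices

gaussian = lambda x: math.exp(-x * x / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
exponential = lambda x: math.exp(-x / a) / a

# Property 6: entropy of a Gaussian with standard deviation sigma is log( sqrt(2*pi*e) * sigma )
print(differential_entropy(gaussian, -20 * sigma, 20 * sigma),
      math.log(math.sqrt(2 * math.pi * math.e) * sigma))

# Property 7: maximum entropy on the half line with mean a is log(e * a), attained by (1/a) exp(-x/a)
print(differential_entropy(exponential, 0.0, 60 * a),
      math.log(math.e * a))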
