Information Theory: Chapter 7: Channel Capacity and Coding Theorem - Part II
May 3, 2013
Chapter 7 : Part I
Fano’s inequality
Channel capacity and channel models
Preview of channel coding theorem
Definitions
Jointly typical sequences
Sources
Outline
2 Zero-Error Codes
4 Coding Schemes
5 Feedback Capacity
Theorem
For a discrete memoryless channel, all rates below capacity C are achievable
Proof
A sketch of the standard random-coding argument: generate a (2^{nR}, n) codebook at random with each symbol drawn i.i.d. according to p(x), reveal the codebook to both sender and receiver, and decode by joint typicality. The analysis below bounds the probability of error averaged over messages and codebooks.
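Following the usual construction, the codebook can be written as a 2^{nR} \times n array whose entries are generated i.i.d. according to p(x), so that

\Pr(\mathcal{C}) = \prod_{w=1}^{2^{nR}} \prod_{i=1}^{n} p(x_i(w))

which is the \Pr(\mathcal{C}) appearing in (8) below.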
Probability of Error
The inner sum does not depend on w, so we can assume that the message W = 1 was sent:

\Pr(\mathcal{E}) = \sum_{\mathcal{C}} \Pr(\mathcal{C})\, \lambda_1(\mathcal{C}) = \Pr(\mathcal{E} \mid W = 1)    (8)
Let E_i = \{ (X^n(i), Y^n) \in A_\epsilon^{(n)} \}, \; i \in \{1, 2, \ldots, 2^{nR}\}. Then,
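Filling in the bound this leads to (the standard joint-typicality argument): E_1^c is the event that the transmitted codeword and the output are not jointly typical, and E_i for i \ge 2 is the event that a wrong codeword is jointly typical with the output, so by the union bound

\Pr(\mathcal{E} \mid W = 1) \le \Pr(E_1^c) + \sum_{i=2}^{2^{nR}} \Pr(E_i) \le \epsilon + (2^{nR} - 1)\, 2^{-n(I(X;Y) - 3\epsilon)} \le 2\epsilon

for n sufficiently large, provided R < I(X;Y) - 3\epsilon; choosing p(x) to achieve capacity gives achievability of all rates R < C.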
Random Codes
We keep half of the codewords and throw away the worst half; then the maximal error probability will be less than 4\epsilon
The codebook now has 2^{nR-1} codewords
The rate is R - \frac{1}{n} \to R as n \to \infty
Zero-Error Codes
Proof
We have the Markov chain W \to X^n(W) \to Y^n \to \hat{W}
For each n, let W be uniformly distributed over \{1, 2, \ldots, 2^{nR}\}; then
\Pr(\hat{W} \neq W) = P_e^{(n)} = \frac{1}{2^{nR}} \sum_i \lambda_i
Using the results we have
nR \overset{(a)}{=} H(W)    (27)
\overset{(b)}{=} H(W \mid \hat{W}) + I(W; \hat{W})    (28)
\overset{(c)}{\le} 1 + P_e^{(n)} nR + I(W; \hat{W})    (29)
\overset{(d)}{\le} 1 + P_e^{(n)} nR + I(X^n; Y^n)    (30)
\overset{(e)}{\le} 1 + P_e^{(n)} nR + nC    (31)
R \le P_e^{(n)} R + \frac{1}{n} + C    (32)
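The steps use the standard justifications: (a) W is uniform over the 2^{nR} messages, so H(W) = nR; (b) definition of mutual information; (c) Fano's inequality, H(W \mid \hat{W}) \le 1 + P_e^{(n)} nR; (d) the data-processing inequality for W \to X^n \to Y^n \to \hat{W}; (e) I(X^n; Y^n) \le nC for a memoryless channel used without feedback.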
By assumption P_e^{(n)} = \frac{1}{2^{nR}} \sum_i \lambda_i \to 0 as n \to \infty, so the first term vanishes
The same holds for the second term \frac{1}{n}, thus R \le C
However, if R > C, the average probability of error is bounded away
from 0
Channel capacity: a very clear dividing point
For R < C, P_e^{(n)} \to 0 exponentially, and for R > C, P_e^{(n)} \to 1
Coding Schemes
Main focus → codes that achieve low error probability and are simple to encode/decode
Objective → Introduce redundancy such that errors can be detected and
corrected
Simplest scheme → Repetition codes (a small sketch follows this list)
Encoding → Repeat the information multiple times
Decoding → Take the majority vote
Rate goes to zero with increasing block length
Error-correcting codes → parity check codes, Hamming codes
Redundant bits help in detecting/correcting errors
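A minimal sketch of the repetition code described above, in Python (the function names are my own, chosen for illustration):

import random

def repetition_encode(bits, r=3):
    # Repeat each information bit r times
    return [b for b in bits for _ in range(r)]

def repetition_decode(received, r=3):
    # Majority vote over each block of r received bits
    return [1 if sum(received[i:i + r]) > r // 2 else 0
            for i in range(0, len(received), r)]

def bsc(bits, p):
    # Binary symmetric channel: flip each bit with probability p
    return [b ^ (random.random() < p) for b in bits]

msg = [1, 0, 1, 1]
rx = bsc(repetition_encode(msg, r=5), p=0.1)
print(repetition_decode(rx, r=5))  # usually recovers [1, 0, 1, 1], but the rate is only 1/5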
Linear Codes
The generator matrix G and the parity-check matrix H should satisfy
G H^{T} = 0    (38)
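As an illustration of (38) for a systematic linear code (a sketch; the particular matrices below are my own choice and happen to give a (7,4) Hamming code): with G = [I_k \mid P] and H = [P^{T} \mid I_{n-k}], the product G H^{T} = P + P = 0 over GF(2).

import numpy as np

k, n = 4, 7
P = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]])

G = np.hstack([np.eye(k, dtype=int), P])         # generator matrix [I_k | P]
H = np.hstack([P.T, np.eye(n - k, dtype=int)])   # parity-check matrix [P^T | I_{n-k}]

print((G @ H.T) % 2)  # all-zero k x (n-k) matrix, i.e. G H^T = 0 over GF(2)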
[Diagram: a codeword of length n split into k message bits followed by n − k parity bits.]
Hamming Codes
Minimum distance is 3
If codeword c is corrupted in only one place, the received word will still be closer to c than to any other codeword
How can we discover the closest codeword without searching over all the codewords?
Let e_i be a vector with a 1 in the i-th position and zeros elsewhere
If the codeword is corrupted at position i, then the received vector can be written as r = c + e_i, and
H r = H(c + e_i) = H c + H e_i = H e_i    (41)
since Hc = 0 for every codeword. The syndrome Hr is therefore the i-th column of H, which identifies the error position (a small sketch follows the figure below).
[Venn-diagram illustration of the (7,4) Hamming code: data bits D1–D4 and parity bits P1–P3 placed in three overlapping circles, with the parity bits chosen so that the parity of each circle is even.]
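A minimal sketch of syndrome decoding as in (41), reusing the (7,4) matrices from the linear-code example above (illustrative code, not taken from the slides):

import numpy as np

P = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]])
G = np.hstack([np.eye(4, dtype=int), P])
H = np.hstack([P.T, np.eye(3, dtype=int)])

msg = np.array([1, 0, 1, 1])
c = msg @ G % 2                  # encode: codeword of length 7

r = c.copy()
r[5] ^= 1                        # corrupt a single position (here i = 5)

syndrome = H @ r % 2             # Hr = He_i = i-th column of H
i = next(j for j in range(7) if np.array_equal(H[:, j], syndrome))
r[i] ^= 1                        # flip the identified bit back
print(np.array_equal(r, c))      # True: the single error is corrected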
Other Codes
Feedback Capacity
[Block diagram of a channel with feedback: the message W enters the encoder, which forms X_i(W, Y^{i−1}) from the message and the past channel outputs; the channel p(y|x) produces Y_i; the decoder outputs the estimate Ŵ of the message.]
Feedback Capacity
Definition
The capacity with feedback, C_{FB}, of a discrete memoryless channel is the supremum of all rates achievable by feedback codes
Proof
Feedback does not increase the capacity of a discrete memoryless channel, i.e. C_{FB} = C. Applying Fano's inequality as in the converse above,
nR \le P_e^{(n)} nR + 1 + nC    (53)
R \le C, \quad \text{as } n \to \infty    (54)
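The step behind (53) that is specific to feedback is the bound I(W; Y^n) \le nC. A sketch of the standard argument:

I(W; Y^n) = H(Y^n) - H(Y^n \mid W) = H(Y^n) - \sum_{i=1}^{n} H(Y_i \mid Y^{i-1}, W)
= H(Y^n) - \sum_{i=1}^{n} H(Y_i \mid Y^{i-1}, W, X_i) = H(Y^n) - \sum_{i=1}^{n} H(Y_i \mid X_i)
\le \sum_{i=1}^{n} H(Y_i) - \sum_{i=1}^{n} H(Y_i \mid X_i) = \sum_{i=1}^{n} I(X_i; Y_i) \le nC

since X_i = X_i(W, Y^{i-1}) is a function of (W, Y^{i-1}) and, given X_i, the memoryless channel output Y_i is conditionally independent of (W, Y^{i-1}).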
Theorem
If
V_1, V_2, \ldots, V_n is a finite-alphabet stochastic process that satisfies the AEP, and
H(V) < C,
then there exists a source-channel code with \Pr(\hat{V}^n \neq V^n) \to 0
Conversely:
Theorem
For any stationary stochastic process, if
H(V) > C,
the probability of error is bounded away from zero
Proof of Theorem
From the AEP: |A_\epsilon^{(n)}| \le 2^{n(H(V)+\epsilon)}
The atypical sequences contribute at most \epsilon to the probability of error
2^{n(H(V)+\epsilon)} sequences → n(H(V) + \epsilon) bits suffice for indexing
Transmit with error probability less than \epsilon if R = H(V) + \epsilon < C
Total probability of error \le \Pr(V^n \notin A_\epsilon^{(n)}) + \Pr(g(Y^n) \neq V^n \mid V^n \in A_\epsilon^{(n)}) \le 2\epsilon
Hence, a sequence of source-channel codes can be constructed with low probability of error for n sufficiently large if H(V) < C
H(V) \overset{(a)}{\le} \frac{H(V_1, V_2, \ldots, V_n)}{n}    (56)
= \frac{H(V^n)}{n}    (57)
= \frac{1}{n} H(V^n \mid \hat{V}^n) + \frac{1}{n} I(V^n; \hat{V}^n)    (58)
\overset{(b)}{\le} \frac{1}{n} \left( 1 + \Pr(\hat{V}^n \neq V^n)\, n \log|\mathcal{V}| \right) + \frac{1}{n} I(V^n; \hat{V}^n)    (59)
\overset{(c)}{\le} \frac{1}{n} \left( 1 + \Pr(\hat{V}^n \neq V^n)\, n \log|\mathcal{V}| \right) + \frac{1}{n} I(X^n; Y^n)    (60)
\overset{(d)}{\le} \frac{1}{n} \left( 1 + \Pr(\hat{V}^n \neq V^n)\, n \log|\mathcal{V}| \right) + C    (61)
H(V) \le C \quad \text{as } n \to \infty    (62)
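The steps use the standard justifications: (a) for a stationary process H(V^n)/n is nonincreasing in n and converges to the entropy rate H(V); (b) Fano's inequality with |\mathcal{V}^n| = |\mathcal{V}|^n; (c) the data-processing inequality for V^n \to X^n \to Y^n \to \hat{V}^n; (d) I(X^n; Y^n) \le nC for the memoryless channel. As n \to \infty, the first term vanishes since \Pr(\hat{V}^n \neq V^n) \to 0 by assumption, giving (62).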
Summary of Theorem
www.liu.se