Lect 02
Point-to-point lossless compression system
[Figure: source Xn → Encoder → M → Decoder → X̂n]
∙ Examples:
If X ∼ Bern(p), then H(X) = −p log p − (1 − p) log(1 − p) = H(p) (binary entropy function)
If X ∼ Unif(X ), then H(X) = log |X |
In general H(X) ≤ log |X | (by Jensen’s inequality)
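∙ A quick numerical sketch of the binary entropy example (my own illustration, not from the notes; log base 2 throughout):

```python
import numpy as np

def binary_entropy(p):
    """H(p) = -p log2 p - (1 - p) log2 (1 - p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

# H(p) peaks at p = 1/2, where it equals log2|{0,1}| = 1 bit (the Jensen bound)
for p in (0.1, 0.5, 0.9):
    print(f"H({p}) = {binary_entropy(p):.4f} bits")
```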
/
∙ We need to review:
Conditional and joint entropy
The notion of typicality
/
Conditional and joint entropy
H(Xn) ≤ ∑_{i=1}^{n} H(Xi) (with equality if X1, X2, . . . , Xn are independent)
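∙ The inequality above can be checked numerically; a small sketch (my own, with a hypothetical joint pmf on {0,1}²):

```python
import numpy as np

def entropy(pmf):
    """Entropy in bits of a pmf given as an array of probabilities."""
    p = np.asarray(pmf, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Correlated pair (X1, X2): joint pmf on {0,1} x {0,1}
p_joint = np.array([[0.4, 0.1],
                    [0.1, 0.4]])
p_x1 = p_joint.sum(axis=1)   # marginal of X1
p_x2 = p_joint.sum(axis=0)   # marginal of X2

print("H(X1, X2)     =", entropy(p_joint))               # strictly smaller here
print("H(X1) + H(X2) =", entropy(p_x1) + entropy(p_x2))  # equality iff independent
```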
/
Typical sequences
/
Properties of typical sequences
[Figure: the typical set Tє(n)(X) inside the set of all Xn sequences; it contains |Tє(n)| ≐ 2^{nH(X)} typical xn]
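∙ A brute-force check of this size estimate for a short Bern(p) block (my own sketch; it uses the robust-typicality definition with the toy values p, n, є below):

```python
import itertools
import numpy as np

p, n, eps = 0.3, 16, 0.2
pmf = {0: 1 - p, 1: p}
H = -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def is_typical(xn):
    """|pi(a | x^n) - p(a)| <= eps * p(a) for every symbol a."""
    frac_ones = sum(xn) / n
    emp = {0: 1 - frac_ones, 1: frac_ones}
    return all(abs(emp[a] - pmf[a]) <= eps * pmf[a] for a in pmf)

count = sum(is_typical(xn) for xn in itertools.product((0, 1), repeat=n))
print(f"|T_eps^(n)| = {count}, 2^(nH(X)) = {2 ** (n * H):.0f}")
# log2 of the two counts agrees to first order in the exponent as n grows
```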
/
∙ Encoding:
Upon observing xn , send m(xn )
∙ Decoding:
Declare x̂ n = xn (m) for the unique xn (m) ∈ Tє(n)
∙ By Fano’s inequality,
H(Xn | X̂n) ≤ 1 + nPe(n) log |X| = n(1/n + Pe(n) log |X|) = nєn,
where єn → 0 as n → ∞ by assumption
∙ Hence as n → ∞, R ≥ H(X)
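∙ A toy end-to-end version of this scheme (my own sketch, not from the notes): enumerate the typical set once, send the index m(xn) of a typical source block, and declare an error otherwise.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
p, n, eps = 0.3, 14, 0.25

def is_typical(xn):
    f = sum(xn) / len(xn)
    return abs(f - p) <= eps * p and abs((1 - f) - (1 - p)) <= eps * (1 - p)

# Index the typical sequences; the index m(x^n) is what the encoder sends
typical = [xn for xn in itertools.product((0, 1), repeat=n) if is_typical(xn)]
index = {xn: m for m, xn in enumerate(typical)}
rate = np.log2(len(typical)) / n          # roughly H(X) bits/symbol vs 1 bit uncoded

xn = tuple((rng.random(n) < p).astype(int))
if xn in index:
    m = index[xn]            # encoding: send m(x^n)
    xn_hat = typical[m]      # decoding: the unique x^n(m) in the typical set
    print("recovered exactly:", xn_hat == xn, f" rate ~ {rate:.3f} bits/symbol")
else:
    print("atypical source block -> error (its probability vanishes as n grows)")
```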
/
Point-to-point communication system
[Figure: M → Encoder → Xn → p(y|x) → Yn → Decoder → M̂]
/
Examples
[Figures: binary symmetric channel BSC(p) and binary erasure channel BEC(p) with erasure symbol e]
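∙ A quick simulation sketch of these two channels (my own code; 'e' marks an erasure):

```python
import numpy as np

rng = np.random.default_rng(1)

def bsc(xn, p):
    """Binary symmetric channel: each bit is flipped with probability p."""
    xn = np.asarray(xn)
    return xn ^ (rng.random(xn.shape) < p)

def bec(xn, p):
    """Binary erasure channel: each bit is erased (-> 'e') with probability p."""
    return ['e' if rng.random() < p else int(x) for x in xn]

xn = rng.integers(0, 2, size=10)
print("x^n      :", xn.tolist())
print("BSC(0.1) :", bsc(xn, 0.1).tolist())
print("BEC(0.3) :", bec(xn, 0.3))
# For reference: C_BSC = 1 - H(p) and C_BEC = 1 - p bits per channel use
```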
/
Proving the channel coding theorem
/
/
Conditionally typical sequences
∙ Conditionally typical set: Tє(n)(Y | xn) = {yn : (xn, yn) ∈ Tє(n)(X, Y)}
∙ If xn ∈ Tє′(n)(X) and є > є′, then for n sufficiently large,
|Tє(n)(Y | xn)| ≐ 2^{nH(Y|X)}
/
[Figure: illustration of a jointly typical pair (xn, yn)]
/
Another illustration of joint typicality
[Figure: for each typical xn ∈ Tє(n)(X), the conditionally typical set Tє(n)(Y | xn) ⊂ Tє(n)(Y) has |Tє(n)(Y | xn)| ≐ 2^{nH(Y|X)} elements]
/
∙ If xn ∈ Tє(n)(X) and Ỹn ∼ ∏_{i=1}^{n} pY(ỹi), then for n sufficiently large,
P{(xn, Ỹn) ∈ Tє(n)(X, Y)} ≐ 2^{−nI(X;Y)}
/
Achievability proof of channel coding theorem
∙ For every R < maxp(x) I(X; Y) ∃ sequence of (2^{nR}, n) codes with limn→∞ Pe(n) = 0
∙ Key ideas: random coding and joint typicality decoding
∙ Codebook generation:
Fix p(x) that attains C = maxp(x) I(X; Y)
Independently generate 2^{nR} sequences xn(m) ∼ ∏_{i=1}^{n} pX(xi), m ∈ [1 : 2^{nR}]; hence
p(C) = ∏_{m=1}^{2^{nR}} ∏_{i=1}^{n} pX(xi(m))
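∙ The codebook-generation step in code (my own sketch for a binary input pmf Bern(q); n, R, q are arbitrary toy values):

```python
import numpy as np

rng = np.random.default_rng(2)

def generate_codebook(n, R, q, rng):
    """Draw 2^(nR) codewords x^n(m) with i.i.d. Bern(q) entries."""
    num_msgs = int(2 ** (n * R))
    return (rng.random((num_msgs, n)) < q).astype(int)   # row m is x^n(m+1)

n, R, q = 16, 0.25, 0.5
codebook = generate_codebook(n, R, q, rng)
print("number of messages 2^(nR) =", codebook.shape[0])
print("x^n(1) =", codebook[0].tolist())
```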
/
∙ For every R < maxp(x) I(X; Y) ∃ sequence of (2^{nR}, n) codes with limn→∞ Pe(n) = 0
∙ Key ideas: random coding and joint typicality decoding
∙ Encoding: C is revealed to both encoder and decoder
To send message m, transmit xn (m)
[Figure: the transmitted codeword xn(m) among the 2^{nR} codewords in Xn]
/
Achievability proof of channel coding theorem
∙ For every R < maxp(x) I(X; Y) ∃ sequence of (2^{nR}, n) codes with limn→∞ Pe(n) = 0
∙ Key ideas: random coding and joint typicality decoding
∙ Decoding:
Declare that m̂ is sent if it is the unique message such that (xn(m̂), yn) ∈ Tє(n)
Otherwise declare an error e
[Figure: received sequence yn jointly typical with the transmitted codeword xn(m)]
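∙ A toy decoder in code (my own sketch for a BSC(p) with Bern(1/2) codewords; it uses a simplified joint-typicality test that only checks whether the empirical crossover fraction between xn(m) and yn is close to p):

```python
import numpy as np

rng = np.random.default_rng(3)
n, R, p, eps = 64, 0.125, 0.1, 0.075      # toy blocklength, rate, crossover, slack
num_msgs = int(2 ** (n * R))

# Random codebook with X ~ Bern(1/2) (the capacity-achieving input for the BSC)
codebook = rng.integers(0, 2, size=(num_msgs, n))

def looks_jointly_typical(xn, yn):
    """Simplified test: empirical crossover fraction of (x^n, y^n) close to p."""
    return abs(np.mean(xn != yn) - p) <= eps

m = 5                                      # message to send
yn = codebook[m] ^ (rng.random(n) < p)     # transmit x^n(m) over the BSC(p)

candidates = [k for k in range(num_msgs) if looks_jointly_typical(codebook[k], yn)]
m_hat = candidates[0] if len(candidates) == 1 else None   # unique match, else error
print("sent m =", m, " decoded m_hat =", m_hat)
```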
/
[Figure: error analysis — the received Yn ∈ Tє(n)(Y) is jointly typical with the transmitted codeword Xn(1) and, with high probability, with no other codeword Xn(m)]
/
/
Application: Achievability using linear codes
/
∙ Chain rule: I(Xn; Y) = ∑_{i=1}^{n} I(Xi; Y | X^{i−1})
/
Converse proof of channel coding theorem
∙ Note that
nR = H(M) = I(M; M̂) + H(M | M̂)
∙ By Fano’s inequality,
H(M | M̂) ≤ 1 + Pe(n) nR = nєn,
where єn → 0 as n → ∞
∙ By the data processing inequality,
nR ≤ I(M; M̂) + nєn ≤ I(M; Yn) + nєn
/
∙ We have
nR ≤ I(M; Yn) + nєn
≤ ∑_{i=1}^{n} I(M, Y^{i−1}; Yi) + nєn
= ∑_{i=1}^{n} I(Xi, M, Y^{i−1}; Yi) + nєn     (Xi is a function of M)
= ∑_{i=1}^{n} I(Xi; Yi) + nєn     ((M, Y^{i−1}) → Xi → Yi forms a Markov chain)
≤ n maxp(x) I(X; Y) + nєn
∙ Hence, as n → ∞, R ≤ maxp(x) I(X; Y) = C
/
Channel coding with input cost
[Figure: M → Encoder → Xn → p(y|x) → Yn → Decoder → M̂, with input cost constraint ∑_{i=1}^{n} b(xi(m)) ≤ nB]
[Figure: the capacity–cost function C(B) versus B]
/
Proof of achievability
∙ Codebook generation:
Fix p(x) that attains C(B/(1 + є))
Independently generate 2^{nR} sequences xn(m) ∼ ∏_{i=1}^{n} pX(xi), m ∈ [1 : 2^{nR}]
∙ Encoding:
To send message m, transmit xn(m) if xn(m) ∈ Tє(n)
(by the typical average lemma, ∑_{i=1}^{n} b(xi(m)) ≤ nB)
Otherwise transmit a fixed sequence (x, . . . , x)
∙ Decoding:
Declare that m̂ is sent if it is the unique message such that (xn(m̂), yn) ∈ Tє(n)
Otherwise declare an error
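∙ A small sketch of the cost check in the encoding step (my own illustration, with a hypothetical cost b(x) = x on binary inputs and budget B):

```python
import numpy as np

rng = np.random.default_rng(4)
n, eps, B = 100, 0.1, 0.3
q = B / (1 + eps)                 # pick p_X = Bern(q) so that E[b(X)] = B/(1 + eps)

def typical(xn):
    """Robust typicality of x^n with respect to Bern(q)."""
    f = np.mean(xn)
    return abs(f - q) <= eps * q and abs((1 - f) - (1 - q)) <= eps * (1 - q)

xn = (rng.random(n) < q).astype(int)      # candidate codeword x^n(m)
cost = int(xn.sum())                      # b(x) = x, so the total cost counts the 1s
if typical(xn):
    # typical average lemma: (1/n) sum b(x_i) <= (1 + eps) E[b(X)] <= B
    print(f"transmit x^n(m): cost {cost} <= nB = {n * B:.0f}")
else:
    print("x^n(m) atypical -> transmit a fixed low-cost sequence instead")
```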
/
Proof of the converse
∙ Need to show: For any sequence of codes with Pe(n) → 0 and ∑_{i=1}^{n} b(xi(m)) ≤ nB,
R ≤ C(B) = max_{p(x): E(b(X))≤B} I(X; Y)
∙ Hence, as n → ∞, R ≤ C(B)
/
Gaussian channel
[Figure: Gaussian channel Y = gX + Z with channel gain g and additive Gaussian noise Z]
Differential entropy
/
Proof of the converse
∙ Mutual information extends to arbitrary random variables (Pinsker)
∙ Hence, by the converse proof for the DMC with cost,
C ≤ sup_{F(x): E(X²)≤P} I(X; Y)
/
Proof of achievability
∙ Extend proof for DMC with cost via discretization procedure (McEliece)
∙ First note that capacity is attained by X ∼ N(0, P)
∙ Let [X]j be a finite quantization of X with [X]j → X in distribution, E([X]j²) ≤ P
[Figure: X → quantizer → [X]j → gain g, noise Z → Yj → quantizer → [Yj]k]
∙ By achievability proof for DMC with cost, I([X]j; [Yj]k) is achievable for every j, k
∙ By weak convergence and the dominated convergence theorem (see NIT ..),
lim_{j→∞} lim_{k→∞} I([X]j; [Yj]k) = lim_{j→∞} I([X]j; Yj) = I(X; Y) = C(S)
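∙ A numerical sketch of this discretization idea (my own code; it quantizes X ∼ N(0, P) to j equal-probability points, keeps the output continuous, and compares I([X]j; Y) with the standard closed form C(S) = (1/2) log(1 + S), S = g²P, used here only as a sanity check):

```python
import numpy as np

g, P = 1.0, 4.0
C = 0.5 * np.log2(1 + g ** 2 * P)          # closed-form capacity, in bits

def quantize_gaussian(j):
    """Equal-probability j-point quantization of X ~ N(0, P) (bin conditional means)."""
    grid = np.linspace(-6, 6, 200001) * np.sqrt(P)
    pdf = np.exp(-grid ** 2 / (2 * P)); pdf /= pdf.sum()
    bins = np.minimum((np.cumsum(pdf) * j).astype(int), j - 1)
    x = np.array([np.average(grid[bins == b], weights=pdf[bins == b]) for b in range(j)])
    return x, np.full(j, 1.0 / j)          # points and pmf; E([X]_j^2) <= P

def mi_quantized_input(j):
    """I([X]_j; Y) = h(Y) - h(Z) in bits, where Y = g[X]_j + Z, Z ~ N(0, 1)."""
    x, w = quantize_gaussian(j)
    y = np.linspace(-20, 20, 4001)
    p_y_given_x = np.exp(-0.5 * (y[None, :] - g * x[:, None]) ** 2) / np.sqrt(2 * np.pi)
    p_y = w @ p_y_given_x                  # output density: a Gaussian mixture
    h_y = -np.sum(p_y * np.log2(p_y + 1e-300)) * (y[1] - y[0])
    return h_y - 0.5 * np.log2(2 * np.pi * np.e)

for j in (2, 4, 8, 32):
    print(f"j = {j:2d}: I([X]_j; Y) = {mi_quantized_input(j):.3f} bits")
print(f"C(S) = {C:.3f} bits")
```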
/
Gaussian vector (MIMO) channel
[Figure: Gaussian vector channel Y = GX + Z]
X: t-vector, Y: r-vector
G: channel gain matrix with gain Gjk from transmitter antenna k to receiver antenna j
Z ∼ N(0, Ir)
Theorem.
C = max_{F(x): E(XᵀX)≤P} I(X; Y) = max_{KX ⪰ 0: tr(KX)≤P} (1/2) log |G KX Gᵀ + Ir|
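∙ A numerical sketch of the theorem (my own code; G and P are arbitrary toy values): evaluate (1/2) log|G KX Gᵀ + Ir| for an isotropic KX and for the water-filling covariance, which attains the maximum.

```python
import numpy as np

def mimo_rate_bits(G, K_x):
    """(1/2) log2 |G K_X G^T + I_r| for the real Gaussian vector channel."""
    return 0.5 * np.log2(np.linalg.det(G @ K_x @ G.T + np.eye(G.shape[0])))

def waterfilling_covariance(G, P):
    """K_X maximizing the log-det under tr(K_X) <= P (water-filling over G's singular values)."""
    _, s, Vt = np.linalg.svd(G)
    inv_gains = 1.0 / s ** 2                         # levels 1/sigma_i^2 to fill above
    lo, hi = 0.0, P + inv_gains.max()                # bisection for the water level mu
    for _ in range(100):
        mu = 0.5 * (lo + hi)
        lo, hi = (mu, hi) if np.maximum(0.0, mu - inv_gains).sum() < P else (lo, mu)
    powers = np.maximum(0.0, mu - inv_gains)
    return Vt.T @ np.diag(powers) @ Vt

G = np.array([[1.0, 0.5],
              [0.2, 1.5]])                           # toy 2x2 gain matrix
P = 4.0
K_iso = (P / G.shape[1]) * np.eye(G.shape[1])        # uniform power across antennas
K_wf = waterfilling_covariance(G, P)
print("isotropic K_X :", mimo_rate_bits(G, K_iso), "bits")
print("water-filling :", mimo_rate_bits(G, K_wf), "bits  (= C)")
```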
/
Summary
/