
Lecture  Basic Information Theory

(Reading: NIT , ., .–., .)

∙ Lossless source coding


∙ Channel coding
∙ Channel coding with input cost
∙ Gaussian channel

© Copyright – Abbas El Gamal and Young-Han Kim

Point-to-point lossless compression system

[Figure: Source → X^n → Encoder → M → Decoder → X̂^n]

∙ Discrete (stationary) memoryless source (DMS) (X, p(x)) (or X in short)
  ▸ Generates an i.i.d. sequence X1, X2, . . . with Xi ∼ pX(xi)
  ▸ Example: a Bern(p) source X generates an i.i.d. Bern(p) sequence
∙ A (2^{nR}, n) lossless compression code:
  ▸ Encoder: an index m(x^n) ∈ [1 : 2^{nR}) = {1, 2, . . . , 2^{⌊nR⌋}} (R = rate in bits/source symbol)
  ▸ Decoder: an estimate x̂^n(m) ∈ X^n
Point-to-point lossless compression system

[Figure: Source → X^n → Encoder → M → Decoder → X̂^n]

∙ Probability of error: P_e^{(n)} = P{X̂^n ≠ X^n}
∙ R is achievable if ∃ a sequence of (2^{nR}, n) codes such that lim_{n→∞} P_e^{(n)} = 0
∙ Optimal lossless compression rate R*: infimum of all achievable R

Lossless source coding theorem (Shannon 1948)

      R* = H(X) = − ∑_{x∈X} p(x) log p(x)  bits/symbol  (entropy)

∙ Examples:
  ▸ If X ∼ Bern(p), then H(X) = −p log p − (1 − p) log(1 − p) = H(p) (binary entropy function)
  ▸ If X ∼ Unif(X), then H(X) = log |X|
  ▸ In general, H(X) ≤ log |X| (by Jensen's inequality)

Proving the lossless source coding theorem

∙ To prove this theorem, we need to establish:
  ▸ Achievability: If R > H(X), ∃ a sequence of (2^{nR}, n) codes such that lim_{n→∞} P_e^{(n)} = 0
  ▸ Converse: For any sequence of (2^{nR}, n) codes with lim_{n→∞} P_e^{(n)} = 0, R ≥ H(X)
∙ We need to review:
  ▸ Conditional and joint entropy
  ▸ The notion of typicality
Conditional and joint entropy

∙ Conditional entropy (equivocation): Let (X, Y) ∼ p(x, y), (x, y) ∈ X × Y

      H(Y|X) = ∑_{x∈X} H(Y|X = x) p(x)

  ▸ H(Y|X) ≤ H(Y) (with equality if X and Y are independent)
∙ Joint entropy: For (X, Y) ∼ p(x, y),

      H(X, Y) = H(X) + H(Y|X) = H(Y) + H(X|Y)

∙ Chain rule for entropy:

      H(X^n) = ∑_{i=1}^n H(Xi | X^{i−1})

  ▸ H(X^n) ≤ ∑_{i=1}^n H(Xi) (with equality if X1, X2, . . . , Xn are independent)
∙ Fano's inequality: If (X, Y) ∼ p(x, y) and Pe = P{X ≠ Y}, then

      H(X|Y) ≤ H(Pe) + Pe log |X| ≤ 1 + Pe log |X|

Typical sequences

∙ Empirical pmf (type) of x^n ∈ X^n:

      π(x|x^n) = |{i : xi = x}| / n   for x ∈ X
∙ Example: For xn = , π(|xn ) = / and π(|xn ) = /
∙ X , X , . . . , Xn i.i.d. with X ∼ p(x), then by the WLLN
π(x|X n ) → p(x) in probability for every x ∈ X

∙ Typical set (Orlitsky–Roche ): For X ∼ p(x) and є > ,


Tє(n) (X) = Tє(n) = 󶁁xn : 󵄨󵄨󵄨󵄨 π(x|xn ) − p(x)󵄨󵄨󵄨󵄨 ≤ єp(x) for all x ∈ X 󶁑

Typical average lemma


If xn ∈ Tє(n) (X) and g(x) ≥ , then
n

( − є) E(g(X)) ≤ 󵠈 g(xi ) ≤ ( + є) E(g(X))
n i=

 / 
Properties of typical sequences

[Figure: within X^n, the typical set T_є^{(n)}(X) has |T_є^{(n)}| ≐ 2^{nH(X)} elements, each typical x^n has p(x^n) ≐ 2^{−nH(X)}, and P{X^n ∈ T_є^{(n)}} → 1]

∙ For x^n ∈ T_є^{(n)}(X),  2^{−n(H(X)+δ(є))} ≤ ∏_{i=1}^n pX(xi) ≤ 2^{−n(H(X)−δ(є))}
∙ |T_є^{(n)}(X)| ≤ 2^{n(H(X)+δ(є))}
∙ If X^n ∼ ∏_{i=1}^n pX(xi), then P{X^n ∈ T_є^{(n)}(X)} → 1 (by the LLN)
∙ |T_є^{(n)}(X)| ≥ (1 − є) 2^{n(H(X)−δ(є))} = 2^{n(H(X)−δ′(є))} for n sufficiently large

Achievability proof of lossless source coding theorem

∙ If R > H(X), ∃ a sequence of (2^{nR}, n) codes with lim_{n→∞} P_e^{(n)} = 0
∙ Let R > H(X) + δ(є) so that |T_є^{(n)}| ≤ 2^{n(H(X)+δ(є))} < 2^{nR}
∙ Codebook:
  ▸ Assign a distinct index m(x^n) to each x^n ∈ T_є^{(n)}
  ▸ Assign m = 1 to all x^n ∉ T_є^{(n)}
  ▸ The codebook is revealed to both encoder and decoder
∙ Encoding:
  ▸ Upon observing x^n, send m(x^n)
∙ Decoding:
  ▸ Declare x̂^n = x^n(m) for the unique x^n(m) ∈ T_є^{(n)}
∙ Analysis of the probability of error:
  ▸ All typical sequences are correctly recovered
  ▸ Thus, lim_{n→∞} P_e^{(n)} = lim_{n→∞} P{X^n ∉ T_є^{(n)}} = 0, and every R > H(X) is achievable
Converse proof of lossless source coding theorem

∙ Given a sequence of (2^{nR}, n) codes with lim_{n→∞} P_e^{(n)} = 0, R ≥ H(X)
∙ For each code, let M = m(X^n) and X̂^n = x̂^n(M)
∙ Consider

      nR ≥ H(M)
         ≥ H(X̂^n)                                       (X̂^n is a function of M)
         = H(X^n, X̂^n) − H(X^n | X̂^n)
         = H(X^n) + H(X̂^n | X^n) − H(X^n | X̂^n)
         = H(X^n) − H(X^n | X̂^n)                        (X̂^n is a function of X^n)
         = nH(X) − H(X^n | X̂^n)

∙ By Fano's inequality,

      H(X^n | X̂^n) ≤ 1 + nP_e^{(n)} log |X| = n(1/n + P_e^{(n)} log |X|) = nє_n,

  where є_n → 0 as n → ∞ by assumption
∙ Hence, as n → ∞, R ≥ H(X)

Point-to-point communication system

[Figure: M → Encoder → X^n → p(y|x) → Y^n → Decoder → M̂]

∙ Discrete memoryless channel (DMC) (X, p(y|x), Y)
  ▸ Discrete: X and Y are finite
  ▸ Memoryless: p(yi | y^{i−1}, x^i, m) = pY|X(yi | xi), i ∈ [1 : n], i.e., (M, Y^{i−1}, X^{i−1}) → Xi → Yi
  ▸ Without feedback: p(y^n | x^n, m) = ∏_{i=1}^n pY|X(yi | xi)
∙ A (2^{nR}, n) code for the DMC:
  ▸ Message set [1 : 2^{nR}] = {1, 2, . . . , 2^{⌈nR⌉}}
  ▸ Encoder: a codeword x^n(m) for each m ∈ [1 : 2^{nR}];
      C = {x^n(1), x^n(2), . . . , x^n(2^{⌈nR⌉})} is the codebook
  ▸ Decoder: an estimate m̂(y^n) ∈ [1 : 2^{nR}] ∪ {e} for each y^n
Point-to-point communication system

[Figure: M → Encoder → X^n → p(y|x) → Y^n → Decoder → M̂]

∙ Assume that M ∼ Unif[1 : 2^{nR}]
∙ Average probability of error: P_e^{(n)} = P{M̂ ≠ M}
∙ R is achievable if ∃ a sequence of (2^{nR}, n) codes such that lim_{n→∞} P_e^{(n)} = 0
∙ Capacity C: supremum of all achievable rates (operational capacity)
∙ For (X, Y) ∼ p(x, y), define the mutual information as

      I(X; Y) = H(X) − H(X|Y) = H(Y) − H(Y|X) = H(X) + H(Y) − H(X, Y)

Channel coding theorem (Shannon 1948)

      C = max_{p(x)} I(X; Y)  bits/transmission  (information capacity)

Examples

∙ Binary symmetric channel (BSC(p)): C = 1 − H(p)

  [Figure: BSC transition diagram; Y = X ⊕ Z with Z ∼ Bern(p), i.e., 0 → 0 and 1 → 1 with probability 1 − p, crossed over with probability p]

∙ Binary erasure channel (BEC(p)): C = 1 − p

  [Figure: BEC transition diagram; each input is received correctly with probability 1 − p and erased (Y = e) with probability p]
Proving the channel coding theorem

∙ Achievability: For every R < C = max_{p(x)} I(X; Y), ∃ a sequence of (2^{nR}, n) codes with lim_{n→∞} P_e^{(n)} = 0
  ▸ We will use random coding and joint typicality decoding
∙ Converse: Given a sequence of (2^{nR}, n) codes with lim_{n→∞} P_e^{(n)} = 0, R ≤ C = max_{p(x)} I(X; Y)
  ▸ Need some properties of mutual information

Jointly typical sequences

∙ Joint type of (x^n, y^n) ∈ X^n × Y^n:

      π(x, y | x^n, y^n) = |{i : (xi, yi) = (x, y)}| / n   for (x, y) ∈ X × Y

∙ Jointly typical set: For (X, Y) ∼ p(x, y) and є > 0,

      T_є^{(n)}(X, Y) = T_є^{(n)}((X, Y))
                      = {(x^n, y^n) : |π(x, y | x^n, y^n) − p(x, y)| ≤ є p(x, y) for all (x, y)}

∙ If (x^n, y^n) ∈ T_є^{(n)}(X, Y) and p(x^n, y^n) = ∏_{i=1}^n pX,Y(xi, yi), then
  ▸ x^n ∈ T_є^{(n)}(X) and y^n ∈ T_є^{(n)}(Y)
  ▸ p(x^n) ≐ 2^{−nH(X)}, p(y^n) ≐ 2^{−nH(Y)}, and p(x^n, y^n) ≐ 2^{−nH(X,Y)}
  ▸ p(x^n | y^n) ≐ 2^{−nH(X|Y)} and p(y^n | x^n) ≐ 2^{−nH(Y|X)}
Conditionally typical sequences

∙ Conditionally typical set: T_є^{(n)}(Y|x^n) = {y^n : (x^n, y^n) ∈ T_є^{(n)}(X, Y)}
∙ |T_є^{(n)}(Y|x^n)| ≤ 2^{n(H(Y|X)+δ(є))}

Conditional typicality lemma

If x^n ∈ T_{є′}^{(n)}(X), Y^n ∼ ∏_{i=1}^n pY|X(yi | xi), and є > є′, then

      lim_{n→∞} P{(x^n, Y^n) ∈ T_є^{(n)}(X, Y)} = 1

∙ If x^n ∈ T_{є′}^{(n)}(X) and є > є′, then for n sufficiently large,

      |T_є^{(n)}(Y|x^n)| ≥ 2^{n(H(Y|X)−δ(є))}

∙ Let X ∼ p(x), Y = g(X), and x^n ∈ T_є^{(n)}(X). Then

      y^n ∈ T_є^{(n)}(Y|x^n) iff yi = g(xi), i ∈ [1 : n]

Illustration of joint typicality

[Figure: nested view of T_є^{(n)}(X) (size ≐ 2^{nH(X)}), T_є^{(n)}(Y) (size ≐ 2^{nH(Y)}), and T_є^{(n)}(X, Y) (size ≐ 2^{nH(X,Y)}), together with the conditionally typical sets T_є^{(n)}(Y|x^n) and T_є^{(n)}(X|y^n)]
Another illustration of joint typicality

[Figure: for each typical x^n ∈ T_є^{(n)}(X), the conditionally typical set T_є^{(n)}(Y|x^n) ⊂ T_є^{(n)}(Y) has size ≐ 2^{nH(Y|X)}]

Joint typicality lemma

Let (X, Y) ∼ p(x, y) and є > є′. Then for some δ(є) → 0 as є → 0:

∙ If x̃^n is arbitrary and Ỹ^n ∼ ∏_{i=1}^n pY(ỹi), then

      P{(x̃^n, Ỹ^n) ∈ T_є^{(n)}(X, Y)} ≤ 2^{−n(I(X;Y)−δ(є))}

∙ If x^n ∈ T_{є′}^{(n)} and Ỹ^n ∼ ∏_{i=1}^n pY(ỹi), then for n sufficiently large,

      P{(x^n, Ỹ^n) ∈ T_є^{(n)}(X, Y)} ≥ 2^{−n(I(X;Y)+δ(є))}

∙ Corollary: If (X̃^n, Ỹ^n) ∼ ∏_{i=1}^n pX(x̃i) pY(ỹi), then

      P{(X̃^n, Ỹ^n) ∈ T_є^{(n)}} ≐ 2^{−nI(X;Y)}
Achievability proof of channel coding theorem

∙ For every R < max_{p(x)} I(X; Y), ∃ a sequence of (2^{nR}, n) codes with lim_{n→∞} P_e^{(n)} = 0
∙ Key ideas: random coding and joint typicality decoding
∙ Codebook generation:
  ▸ Fix p(x) that attains C = max_{p(x)} I(X; Y)
  ▸ Independently generate 2^{nR} sequences x^n(m) ∼ ∏_{i=1}^n pX(xi), m ∈ [1 : 2^{nR}]; hence

      p(C) = ∏_{m=1}^{2^{nR}} ∏_{i=1}^n pX(xi(m))

Achievability proof of channel coding theorem

∙ For every R < max_{p(x)} I(X; Y), ∃ a sequence of (2^{nR}, n) codes with lim_{n→∞} P_e^{(n)} = 0
∙ Key ideas: random coding and joint typicality decoding
∙ Encoding: C is revealed to both encoder and decoder
  ▸ To send message m, transmit x^n(m)

[Figure: the codeword x^n(m) as a point in the space X^n]
Achievability proof of channel coding theorem

∙ For every R < max_{p(x)} I(X; Y), ∃ a sequence of (2^{nR}, n) codes with lim_{n→∞} P_e^{(n)} = 0
∙ Key ideas: random coding and joint typicality decoding
∙ Decoding:
  ▸ Declare that m̂ is sent if it is the unique message such that (x^n(m̂), y^n) ∈ T_є^{(n)}
  ▸ Otherwise declare an error e

[Figure: the codeword x^n(m) ∈ X^n and the received sequence y^n ∈ Y^n]

Analysis of the probability of error

∙ Consider P(E) averaged over codebooks
∙ Observe that P(E) = P(E|M = 1); hence assume that M = 1 is sent
∙ Error events:

      E1 = {(X^n(1), Y^n) ∉ T_є^{(n)}},
      E2 = {(X^n(m), Y^n) ∈ T_є^{(n)} for some m ≠ 1}

  By the union of events bound, P(E) = P(E1 ∪ E2) ≤ P(E1) + P(E2)
∙ By the LLN, P(E1) → 0 (as n → ∞)
∙ By the union of events bound and the joint typicality lemma,

      P(E2) ≤ ∑_{m=2}^{2^{nR}} P{(X^n(m), Y^n) ∈ T_є^{(n)}} ≤ 2^{nR} · 2^{−n(C−δ(є))} = 2^{−n(C−R−δ(є))},

  which → 0 as n → ∞ if R < C − δ(є)
∙ Hence, ∃ a sequence of (2^{nR}, n) codes with lim_{n→∞} P_e^{(n)} = 0 if R < C − δ(є)
Illustration of E2

[Figure: the correct codeword X^n(1) is jointly typical with Y^n ∈ T_є^{(n)}(Y), while each wrong codeword X^n(m) is generated independently of Y^n]

∙ Note that we only needed X^n(m), m ∈ [2 : 2^{nR}], to be pairwise independent of Y^n

“Little” packing lemma

∙ Let (X, Y) ∼ p(x, y)
∙ Let Ỹ^n ∼ ∏_{i=1}^n pY(ỹi)
∙ Let X^n(m) ∼ ∏_{i=1}^n pX(xi), m ∈ A, |A| ≤ 2^{nR}, be pairwise independent of Ỹ^n

“Little” packing lemma

There exists δ(є) → 0 as є → 0 such that

      lim_{n→∞} P{(X^n(m), Ỹ^n) ∈ T_є^{(n)} for some m ∈ A} = 0,

if R < I(X; Y) − δ(є)

∙ We will generalize this later (see the packing lemma in NIT)
Application: Achievability using linear codes

∙ Consider a BSC(p) and let m = (u1, u2, . . . , uk) ∈ {0, 1}^k (i.e., k = nR)
∙ Random linear codebook: generator matrix G with i.i.d. Bern(1/2) entries

      x^n = G u^k  (mod 2),  i.e.,  x_i = ∑_{j=1}^k g_{ij} u_j  (mod 2),  i ∈ [1 : n]

  ▸ X1(u^k), . . . , Xn(u^k) are i.i.d. Bern(1/2) for each u^k ≠ 0
  ▸ X^n(u^k) and X^n(ũ^k) are independent for each u^k ≠ ũ^k
∙ By the “little” packing lemma, P(E) → 0 if R < 1 − H(p) − δ(є)
∙ There exists a good sequence of linear codes
∙ There are now practical randomly generated linear codes (turbo, LDPC)

Properties of mutual information

∙ Nonnegativity:

      I(X; Y) = H(X) − H(X|Y) = H(Y) − H(Y|X) = H(X) + H(Y) − H(X, Y) ≥ 0

∙ Conditional mutual information:

      I(X; Y|Z) = ∑_{z∈Z} I(X; Y|Z = z) p(z) = H(X|Z) − H(X|Y, Z)

∙ Mutual information versus conditional mutual information:
  ▸ Conditional independence: If Z → X → Y form a Markov chain, then I(X; Y|Z) ≤ I(X; Y)
  ▸ Independence: If p(x, y, z) = p(z)p(x)p(y|x, z), then I(X; Y|Z) ≥ I(X; Y)
∙ Chain rule:

      I(X^n; Y) = ∑_{i=1}^n I(Xi; Y | X^{i−1})

∙ Data processing inequality: If X → Y → Z, then

      I(X; Z) ≤ I(X; Y),
      I(X; Z) ≤ I(Y; Z)
Converse proof of channel coding theorem

∙ Need to show: for any sequence of (2^{nR}, n) codes with P_e^{(n)} → 0, R ≤ C
∙ Each (2^{nR}, n) code induces the empirical pmf

      p(m, x^n, y^n, m̂) = 2^{−nR} p(x^n | m) ∏_{i=1}^n pY|X(yi | xi) p(m̂ | y^n)

∙ Note that

      nR = H(M) = I(M; M̂) + H(M|M̂)

∙ By Fano's inequality,

      H(M|M̂) ≤ 1 + P_e^{(n)} nR = nє_n,

  where є_n → 0 as n → ∞
∙ By the data processing inequality,

      nR ≤ I(M; M̂) + nє_n
         ≤ I(M; Y^n) + nє_n

Proof of the converse

∙ We have

      nR ≤ I(M; Y^n) + nє_n

∙ Now we need to show: I(M; Y^n) ≤ n max_{p(x)} I(X; Y)

      I(M; Y^n) = ∑_{i=1}^n I(M; Yi | Y^{i−1})
                ≤ ∑_{i=1}^n I(M, Y^{i−1}; Yi)
                = ∑_{i=1}^n I(Xi, M, Y^{i−1}; Yi)      (Xi is a function of M)
                = ∑_{i=1}^n I(Xi; Yi)                  ((M, Y^{i−1}) → Xi → Yi)
                ≤ n max_{p(x)} I(X; Y)
Channel coding with input cost

[Figure: M → Encoder → X^n → p(y|x) → Y^n → Decoder → M̂, with input cost constraint B]

∙ Cost b(x) ≥ 0 with b(x0) = 0 for some symbol x0
∙ Average cost constraint: ∑_{i=1}^n b(xi(m)) ≤ nB, m ∈ [1 : 2^{nR}]
∙ Define the capacity–cost function C(B) as

Capacity–cost function

      C(B) = max_{p(x): E(b(X))≤B} I(X; Y)

[Figure: C(B) plotted versus B, nondecreasing and concave]

Proof of achievability

∙ Codebook generation:
  ▸ Fix p(x) that attains C(B/(1 + є))
  ▸ Independently generate 2^{nR} sequences x^n(m) ∼ ∏_{i=1}^n pX(xi), m ∈ [1 : 2^{nR}]
∙ Encoding:
  ▸ To send message m, transmit x^n(m) if x^n(m) ∈ T_є^{(n)}
    (by the typical average lemma, ∑_{i=1}^n b(xi(m)) ≤ nB)
  ▸ Otherwise transmit (x0, . . . , x0)
∙ Decoding:
  ▸ Declare that m̂ is sent if it is the unique message such that (x^n(m̂), y^n) ∈ T_є^{(n)}
  ▸ Otherwise declare an error
∙ Analysis of the probability of error: read the corresponding analysis in NIT
Proof of the converse

∙ Need to show: for any sequence of codes with P_e^{(n)} → 0 and ∑_{i=1}^n b(xi(m)) ≤ nB,

      R ≤ C(B) = max_{p(x): E(b(X))≤B} I(X; Y)

∙ By Fano's inequality and the data processing inequality,

      nR ≤ ∑_{i=1}^n I(Xi; Yi) + nє_n
         ≤ ∑_{i=1}^n C(E[b(Xi)]) + nє_n               (by definition)
         ≤ nC((1/n) ∑_{i=1}^n E[b(Xi)]) + nє_n        (concavity of C(B))
         ≤ nC(B) + nє_n                               (monotonicity of C(B))

∙ Hence, as n → ∞, R ≤ C(B)

Gaussian channel

∙ Discrete-time additive white Gaussian noise channel

[Figure: Y = gX + Z, with channel gain g and additive noise Z]

  ▸ g: channel gain (path loss)
  ▸ Z ∼ N(0, N0/2) ({Zi}: WGN(N0/2) process, independent of M)
∙ Average power constraint: ∑_{i=1}^n x_i^2(m) ≤ nP for every m ∈ [1 : 2^{nR}]
∙ Assume N0/2 = 1 and label the received power g²P as S (SNR)

Theorem (Shannon 1948)

      C = (1/2) log(1 + S) = C(S)

∙ To prove this result we need differential entropy
Differential entropy

∙ Differential entropy of a continuous random variable X ∼ f(x) (pdf):

      h(X) = − ∫ f(x) log f(x) dx = − E_X(log f(X))

  ▸ Concave function of f(x) (but not necessarily nonnegative)
  ▸ Examples: h(Unif[a, b]) = log(b − a), h(N(μ, σ²)) = (1/2) log(2πeσ²)
  ▸ Translation: h(X + a) = h(X)
  ▸ Scaling: h(aX) = h(X) + log |a|
∙ Maximum differential entropy under an average power constraint:

      max_{f(x): E(X²)≤P} h(X) = (1/2) log(2πeP) = h(N(0, P))

  Thus, for any X ∼ f(x),

      h(X) = h(X − E(X)) ≤ (1/2) log(2πe Var(X)) ≤ (1/2) log(2πe E(X²))

Differential entropy

∙ Conditional differential entropy: If X ∼ F(x) and Y|{X = x} ∼ f(y|x),

      h(Y|X) = ∫ h(Y|X = x) dF(x) = − E_{X,Y}(log f(Y|X))

  ▸ h(Y|X) ≤ h(Y) (with equality if X and Y are independent)
∙ For continuous (X, Y) ∼ f(x, y),

      I(X; Y) = h(X) − h(X|Y) = h(Y) − h(Y|X) = h(X) + h(Y) − h(X, Y)

∙ If X ∼ p(x) is discrete and Y|{X = x} ∼ f(y|x) is continuous for each x,

      I(X; Y) = h(Y) − h(Y|X) = H(X) − H(X|Y)
Proof of the converse

∙ Mutual information extends to arbitrary random variables (Pinsker 1964)
∙ Hence, by the converse proof for the DMC with cost,

      C ≤ sup_{F(x): E(X²)≤P} I(X; Y)

∙ Now consider any X with E(X²) ≤ P; thus E(Y²) ≤ g²P + 1 = S + 1

      I(X; Y) = h(Y) − h(Y|X)
              = h(Y) − h(Y − gX|X)
              = h(Y) − h(Z|X)
              = h(Y) − h(Z)
              ≤ (1/2) log(2πe(1 + S)) − (1/2) log(2πe) = C(S)

∙ Finally, note that setting X ∼ N(0, P) gives I(X; Y) = C(S); hence

      C ≤ max_{F(x): E(X²)≤P} I(X; Y) = C(S)

Proof of achievability

∙ Extend the proof for the DMC with cost via a discretization procedure (McEliece 1977)
∙ First note that the capacity is attained by X ∼ N(0, P)
∙ Let [X]_j be a finite quantization of X with [X]_j → X in distribution and E([X]_j²) ≤ P

[Figure: X → [X]_j → (gain g, noise Z) → Y_j → [Y_j]_k]

∙ Let [Y_j]_k be a finite quantization of Y_j = g[X]_j + Z such that [Y_j]_k → Y_j in distribution
∙ By the achievability proof for the DMC with cost, I([X]_j; [Y_j]_k) is achievable for every j, k
∙ By weak convergence and the dominated convergence theorem (see NIT),

      lim_{j→∞} lim_{k→∞} I([X]_j; [Y_j]_k) = lim_{j→∞} I([X]_j; Y_j) = I(X; Y) = C(S)
Gaussian vector (MIMO) channel

∙ Discrete-time additive white Gaussian noise multiple-antenna channel

[Figure: Y = GX + Z, with channel gain matrix G and additive noise Z]

  ▸ X: t-vector, Y: r-vector
  ▸ G: channel gain matrix with gain Gjk from transmitter antenna k to receiver antenna j
  ▸ Z ∼ N(0, I_r)
∙ Average power constraint: ∑_{i=1}^n x^T(m, i) x(m, i) ≤ nP, m ∈ [1 : 2^{nR}]

Theorem

      C = max_{F(x): E(X^T X)≤P} I(X; Y) = max_{K_X ⪰ 0: tr(K_X)≤P} (1/2) log |G K_X G^T + I_r|

Summary

∙ Lossless source coding problem
∙ Discrete memoryless source
∙ Entropy is the limit on lossless source coding
∙ Proof of coding theorem: achievability and the converse
∙ Channel coding problem
∙ Discrete memoryless channel (DMC), e.g., BSC and BEC
∙ Information capacity is the limit on channel coding
  ▸ Random codebook generation
  ▸ Joint typicality decoding
  ▸ “Little” packing lemma
  ▸ Capacity with input cost
  ▸ Gaussian channel (discretization procedure)
References

McEliece, R. J. (1977). The Theory of Information and Coding. Addison-Wesley, Reading, MA.
Orlitsky, A. and Roche, J. R. (2001). Coding for computing. IEEE Trans. Inf. Theory, 47(3), 903–917.
Pinsker, M. S. (1964). Information and Information Stability of Random Variables and Processes. Holden-Day, San Francisco.
Shannon, C. E. (1948). A mathematical theory of communication. Bell Syst. Tech. J., 27(3), 379–423; 27(4), 623–656.