Introduction to Information Theory
channel capacity and models
A.J. Han Vinck
University of Essen
May 2009
This lecture

- Some models
- Channel capacity
- Shannon channel coding theorem
- Converse
some channel models

input X → channel P(y|x) → output Y

The channel is described by its transition probabilities P(y|x).

memoryless:
- output at time i depends only on input at time i
- input and output alphabet finite
Example: binary symmetric channel (BSC)

Input X → ⊕ (Error Source E) → Output Y = X ⊕ E

Transition probabilities:
P(Y = 1 | X = 0) = P(Y = 0 | X = 1) = p
P(Y = 0 | X = 0) = P(Y = 1 | X = 1) = 1-p

E is the binary error sequence s.t. P(1) = 1-P(0) = p
X is the binary information sequence
Y is the binary output sequence
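
As an illustration (not part of the original slides), a minimal Python sketch of the relation Y = X ⊕ E, assuming numpy is available; the crossover probability p is a free parameter:

import numpy as np

def bsc(x, p, seed=0):
    """Pass a binary sequence x through a BSC with crossover probability p."""
    rng = np.random.default_rng(seed)
    e = (rng.random(len(x)) < p).astype(int)   # error sequence E, P(E=1) = p
    return (x + e) % 2                         # Y = X xor E

x = np.array([0, 1, 1, 0, 1, 0, 0, 1])
y = bsc(x, p=0.1)
print("number of flipped bits:", int(np.sum(x != y)))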
from AWGN to BSC

The binary-input AWGN channel (signal amplitude ±A, noise variance σ²) followed by a hard-decision detector behaves as a BSC.

Homework: calculate the capacity as a function of A and σ².

Other models

Z-channel (optical):
0 (light on) → 0 with probability 1
1 (light off) → 1 with probability 1-p, 1 → 0 with probability p
P(X=0) = P0

Erasure channel (MAC):
0 → 0 with probability 1-e, 0 → E (erasure) with probability e
1 → 1 with probability 1-e, 1 → E (erasure) with probability e
P(X=0) = P0
Erasure with errors

0 → 0 with probability 1-p-e, 0 → E with probability e, 0 → 1 with probability p
1 → 1 with probability 1-p-e, 1 → E with probability e, 1 → 0 with probability p
burst error model (Gilbert-Elliott)

Random error channel; outputs independent
Error source: P(0) = 1 - P(1)

Burst error channel; outputs dependent
Error source:
P(0 | state = bad) = P(1 | state = bad) = 1/2
P(0 | state = good) = 1 - P(1 | state = good) = 0.999

State info: good or bad, with transition probabilities
Pgg (good → good), Pgb (good → bad), Pbg (bad → good), Pbb (bad → bad)
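
A small simulation sketch of this two-state error source (the error rates in the good and bad state are taken from the slide; the transition probabilities p_gb and p_bg are assumed example values):

import numpy as np

def gilbert_elliott(n, p_gb=0.01, p_bg=0.1, p_err_good=0.001, p_err_bad=0.5, seed=0):
    """Generate n error bits from a good/bad two-state Markov error source."""
    rng = np.random.default_rng(seed)
    errors = np.zeros(n, dtype=int)
    good = True
    for i in range(n):
        errors[i] = rng.random() < (p_err_good if good else p_err_bad)
        # state transition: p_gb = P(good -> bad), p_bg = P(bad -> good)
        if good:
            good = rng.random() >= p_gb
        else:
            good = rng.random() < p_bg
    return errors

e = gilbert_elliott(100000)
print("overall error rate:", e.mean())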
channel capacity:

I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)   (Shannon 1948)

X → channel → Y
H(X) at the input; H(X|Y) is the remaining uncertainty after observing Y

capacity:  C = max over P(x) of I(X;Y)

notes:
capacity depends on the input probabilities,
because the transition probabilities are fixed
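
A numeric sketch of this maximization problem (not from the slides): I(X;Y) = H(Y) - H(Y|X) for a given transition matrix and input distribution; only the input probabilities are free:

import numpy as np

def mutual_information(p_x, P_y_given_x):
    """I(X;Y) = H(Y) - H(Y|X) in bits; row i of P_y_given_x is P(y|x=i)."""
    p_x = np.asarray(p_x, dtype=float)
    P = np.asarray(P_y_given_x, dtype=float)
    p_y = p_x @ P                                      # output distribution
    H_y = -np.sum(p_y[p_y > 0] * np.log2(p_y[p_y > 0]))
    logP = np.where(P > 0, np.log2(np.where(P > 0, P, 1.0)), 0.0)
    H_y_given_x = -np.sum(p_x[:, None] * P * logP)
    return H_y - H_y_given_x

# BSC with p = 0.1 and uniform input: I = 1 - h(0.1) ≈ 0.531 bits
print(mutual_information([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]]))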
Practical communication system design

Code book: a message (one out of 2^k) selects a code word of length n,
which is sent over the channel; the received word (with errors) is mapped
by the decoder back to a message estimate.

There are 2^k code words of length n.
k is the number of information bits transmitted in n channel uses.
Channel capacity
Definition:
The rate R of a code is the ratio k/n, where
k is the number of information bits transmitted in n channel uses.

Shannon showed that:

for R ≤ C
encoding methods exist
with decoding error probability → 0
Encoding and decoding according to Shannon

Code: 2^k binary codewords, drawn at random with P(0) = P(1) = 1/2

Channel errors: P(0 → 1) = P(1 → 0) = p
i.e. # error sequences ≈ 2^(nh(p))

Decoder: search the space of 2^n binary sequences around the received
sequence for a codeword with ≈ np differences
decoding error probability

1. The number of errors t deviates too much from np: P( |t/n - p| > ε ) → 0 for n → ∞
   (law of large numbers)

2. More than one code word in the decoding region (codewords random):

   P(>1) ≤ (2^k - 1) · 2^(nh(p)) / 2^n ≈ 2^(-n(1 - h(p) - R)) = 2^(-n(CBSC - R)) → 0

   for R = k/n < 1 - h(p) and n → ∞
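
A quick numeric look (assumed parameters, not from the slides) at how the bound 2^(-n(CBSC - R)) behaves for a fixed rate below capacity:

import numpy as np

def h(p):
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

p, R = 0.1, 0.4                       # crossover probability and code rate
C = 1 - h(p)                          # C_BSC ≈ 0.531
for n in (100, 500, 1000):
    print(n, 2.0 ** (-n * (C - R)))   # bound on P(>1 codeword in the region)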
channel capacity: the BSC

I(X;Y) = H(Y) - H(Y|X)

The maximum of H(Y) = 1, since Y is binary.

H(Y|X) = P(X=0) h(p) + P(X=1) h(p) = h(p)

Conclusion: the capacity for the BSC is CBSC = 1 - h(p)

Homework: draw CBSC; what happens for p > 1/2?
channel capacity: the BSC

Explain the behaviour!

(Plot: channel capacity CBSC versus bit error probability p, 0 ≤ p ≤ 1;
the curve equals 1.0 at p = 0 and p = 1 and drops to 0 at p = 0.5.)
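
A few numeric values of CBSC = 1 - h(p), as a sketch for the homework (note the symmetry around p = 1/2: for p > 1/2, flipping the output recovers a useful channel):

import numpy as np

def h(p):
    p = np.clip(p, 1e-12, 1 - 1e-12)          # avoid log(0)
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

for p in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
    print(f"p = {p:4.2f}   C_BSC = {1 - h(p):.3f}")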
channel capacity: the Z-channel

Application in optical communications

0 (light on) → 0 with probability 1
1 (light off) → 1 with probability 1-p, 1 → 0 with probability p
P(X=0) = P0

H(Y) = h(P0 + p(1 - P0))
H(Y|X) = (1 - P0) h(p)

For capacity, maximize I(X;Y) over P0.
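
A grid-search sketch of this maximization (names assumed, not from the slides):

import numpy as np

def h(q):
    q = np.clip(q, 1e-12, 1 - 1e-12)
    return -q * np.log2(q) - (1 - q) * np.log2(1 - q)

def z_channel_capacity(p):
    P0 = np.linspace(0.001, 0.999, 999)
    I = h(P0 + p * (1 - P0)) - (1 - P0) * h(p)    # H(Y) - H(Y|X)
    return P0[np.argmax(I)], I.max()

P0_opt, C = z_channel_capacity(p=0.1)
print(f"optimal P(X=0) ≈ {P0_opt:.3f}, capacity ≈ {C:.3f} bits")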
channel capacity: the erasure channel

Application: cdma detection

0 → 0 with probability 1-e, 0 → E with probability e
1 → 1 with probability 1-e, 1 → E with probability e
P(X=0) = P0

I(X;Y) = H(X) - H(X|Y)
H(X) = h(P0)
H(X|Y) = e h(P0)

Thus Cerasure = 1 - e
(check!, draw and compare with BSC and Z)
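
A short numeric check of Cerasure = 1 - e, and a comparison with the BSC at the same parameter value (a sketch, not from the slides; it shows that erasures cost less capacity than errors):

import numpy as np

def h(q):
    q = np.clip(q, 1e-12, 1 - 1e-12)
    return -q * np.log2(q) - (1 - q) * np.log2(1 - q)

e = 0.2
P0 = np.linspace(0.01, 0.99, 99)
I = (1 - e) * h(P0)                    # I(X;Y) = H(X) - H(X|Y) = (1 - e) h(P0)
print("max I(X;Y) =", round(I.max(), 3), "  (= 1 - e =", 1 - e, ")")
print("compare C_BSC(0.2) =", round(1 - h(0.2), 3))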
Erasure with errors: calculate the capacity!

0 → 0 with probability 1-p-e, 0 → E with probability e, 0 → 1 with probability p
1 → 1 with probability 1-p-e, 1 → E with probability e, 1 → 0 with probability p
example

Consider the following ternary channel with inputs and outputs {0, 1, 2}:
0 → 0 and 2 → 2 without error; 1 → 0, 1, 2 each with probability 1/3.

For P(0) = P(2) = p, P(1) = 1-2p:

H(Y) = h(1/3 - 2p/3) + (2/3 + 2p/3);  H(Y|X) = (1-2p) log2 3

Q: maximize H(Y) - H(Y|X) as a function of p

Q: is this the capacity?

Hint, use the following: log2 x = ln x / ln 2; d ln x / dx = 1/x
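
A grid-search sketch (not a substitute for the analytic exercise) that evaluates I = H(Y) - H(Y|X) for the symmetric input family above:

import numpy as np

def h(q):
    q = np.clip(q, 1e-12, 1 - 1e-12)
    return -q * np.log2(q) - (1 - q) * np.log2(1 - q)

p = np.linspace(0.0, 0.5, 5001)                   # P(0) = P(2) = p, P(1) = 1 - 2p
I = h(1/3 - 2*p/3) + (2/3 + 2*p/3) - (1 - 2*p) * np.log2(3)
print(f"best p ≈ {p[np.argmax(I)]:.4f},  max I(X;Y) ≈ {I.max():.4f} bits")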


channel models: general diagram

Input alphabet X = {x1, x2, ..., xn}
Output alphabet Y = {y1, y2, ..., ym}
Transition probabilities Pj|i = PY|X(yj|xi)

The statistical behavior of the channel is completely defined by
the channel transition probabilities Pj|i = PY|X(yj|xi).

In general: calculating capacity needs more theory.
* clue:

I(X;Y) is concave (convex ∩) in the input probabilities,

i.e. finding a maximum is simple.
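
One standard way to exploit this concavity numerically is the Blahut-Arimoto algorithm; a minimal sketch (not part of the slides), for a transition matrix whose rows are P(y|x):

import numpy as np

def blahut_arimoto(P, n_iter=200):
    """Approximate the capacity (bits) of a DMC with transition matrix P (rows P(y|x))."""
    P = np.asarray(P, dtype=float)
    r = np.full(P.shape[0], 1.0 / P.shape[0])       # current input distribution
    for _ in range(n_iter):
        q = r @ P                                   # induced output distribution
        ratio = np.where(P > 0, P / q, 1.0)         # 0 log 0 treated as 0
        D = np.sum(P * np.log2(ratio), axis=1)      # D( P(.|x) || q ) per input x
        r = r * 2.0 ** D                            # multiplicative update
        r /= r.sum()
    q = r @ P
    ratio = np.where(P > 0, P / q, 1.0)
    D = np.sum(P * np.log2(ratio), axis=1)
    return float(np.sum(r * D)), r                  # capacity ≈ sum_x r(x) D(x) at the fixed point

C, r = blahut_arimoto([[0.9, 0.1], [0.1, 0.9]])     # BSC(0.1): expect 1 - h(0.1) ≈ 0.531
print(round(C, 3), r)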


Channel capacity: converse

For R > C the decoding error probability > 0

(Plot: Pe versus rate k/n; Pe is bounded away from 0 once k/n exceeds C.)

Converse: For a discrete memoryless channel

Xi → channel → Yi

I(X^n;Y^n) = H(Y^n) - Σi H(Yi|Xi) ≤ Σi H(Yi) - Σi H(Yi|Xi) = Σi I(Xi;Yi) ≤ nC

(sums over i = 1, ..., n)

Source generates one out of 2^k equiprobable messages:

m → source encoder → X^n → channel → Y^n → decoder → m'

Let Pe = probability that m' ≠ m

converse: R := k/n

k = H(M) = I(M;Y^n) + H(M|Y^n)
         ≤ I(X^n;Y^n) + 1 + k Pe     (X^n is a function of M; Fano)
         ≤ nC + 1 + k Pe

so that  Pe ≥ 1 - Cn/k - 1/k = 1 - C/R - 1/(nR)

Hence: for large n and R > C,
the probability of error Pe > 0.
We used the data processing theorem.
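
A quick numeric look (assumed values of C and R) at the resulting lower bound Pe ≥ 1 - C/R - 1/(nR):

C, R = 0.5, 0.6                                  # a capacity and a rate R > C
for n in (10, 100, 1000, 10000):
    print(n, round(max(0.0, 1 - C / R - 1 / (n * R)), 4))

For n → ∞ the bound tends to 1 - C/R > 0, so the error probability cannot vanish.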
Cascading of Channels

X → channel 1 → Y → channel 2 → Z
    I(X;Y)          I(Y;Z)

The overall transmission rate I(X;Z) for the cascade cannot
be larger than I(Y;Z), that is:

I(X;Z) ≤ I(Y;Z)
Appendix:

Assume:
a binary sequence with P(0) = 1 - P(1) = 1-p
t is the # of 1's in the sequence

Then, for n → ∞ and ε > 0, the weak law of large numbers gives
Probability ( |t/n - p| > ε ) → 0

i.e. we expect, with high probability, about pn 1's.


Appendix:

Consequence:

1. n(p-ε) < t < n(p+ε) with high probability

2. Σ from t = n(p-ε) to n(p+ε) of C(n,t) ≈ 2nε · C(n,pn) ≤ 2nε · 2^(nh(p))

3. lim (n → ∞) (1/n) log2 [ 2nε · C(n,pn) ] = h(p),
   where h(p) = -p log2 p - (1-p) log2 (1-p)

Homework: prove the approximation using ln N! ≈ N ln N for N large.
Or use the Stirling approximation: N! ≈ √(2πN) · N^N · e^(-N)
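
A numeric check (sketch) of the exponent in items 2 and 3: (1/n) log2 C(n, pn) approaches h(p):

import math

def h(p):
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p = 0.1
for n in (100, 1000, 10000):
    exponent = math.log2(math.comb(n, int(p * n))) / n     # (1/n) log2 C(n, pn)
    print(n, round(exponent, 4), "   h(p) =", round(h(p), 4))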


Binary Entropy: h(p) = -p log2 p - (1-p) log2 (1-p)

Note: h(p) = h(1-p)

(Plot: h(p) versus p for 0 ≤ p ≤ 1; h rises from 0 at p = 0 to its maximum 1 at p = 0.5 and falls back to 0 at p = 1.)
Capacity for Additive White Gaussian Noise

Input X → + (Noise) → Output Y

Cap = sup over p(x) of [ H(Y) - H(Noise) ],  subject to σx² ≤ S/2W,
where W is the (single-sided) bandwidth.

Input X is Gaussian with power spectral density (psd) ≤ S/2W;
Noise is Gaussian with psd σ²noise;
Output Y is Gaussian with psd σy² = S/2W + σ²noise.

For Gaussian channels: σy² = σx² + σ²noise


X → + (Noise) → Y

Cap = ½ log2( 2πe (σx² + σ²noise) ) - ½ log2( 2πe σ²noise )   bits/transmission

    = ½ log2( (σ²noise + σx²) / σ²noise )   bits/transmission

Cap = W log2( (σ²noise + S/2W) / σ²noise )   bits/sec

using, for Gaussian Z:  p(z) = (1/√(2π σz²)) e^(-z²/2σz²);  H(Z) = ½ log2( 2πe σz² ) bits
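
A small sketch (names assumed) evaluating the last formula, Cap = W log2( (σ²noise + S/2W) / σ²noise ) bits/sec:

import math

def awgn_capacity(W, S, sigma2_noise):
    """Cap = W * log2( (sigma2_noise + S/(2W)) / sigma2_noise ) bits per second."""
    return W * math.log2((sigma2_noise + S / (2 * W)) / sigma2_noise)

W = 1e6                    # 1 MHz single-sided bandwidth
sigma2 = 1.0               # noise psd level
S = 100 * 2 * W * sigma2   # signal power chosen so that (S/2W)/sigma2 = 100 (20 dB)
print(awgn_capacity(W, S, sigma2) / 1e6, "Mbit/s")   # ≈ W log2(101) ≈ 6.66 Mbit/s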
Middleton type of burst channel model

A binary channel (0 → 0, 1 → 1) whose transition probability P(0) is itself selected at random:
choose channel k with probability Q(k); channel k has transition probability p(k).
Fritzman model:

multiple states G and only one state B

Closer to an actual real-world channel.

State diagram: G1 ... Gn → B (transition probability 1-p)
Error probability 0 in the G states; error probability h in state B.
Interleaving: from bursty to random

message → encoder → interleaver → (bursty) channel → interleaver^-1 → decoder

The decoder then sees "random errors".

Note: interleaving brings encoding and decoding delay.

Homework: compare block and convolutional interleaving w.r.t. delay.
Interleaving: block

Channel models are difficult to derive:
- burst definition?
- random and burst errors?
For practical reasons: convert burst errors into random errors.

Write the code bits in row-wise, transmit column-wise:

1 0 1 0 1
0 1 0 0 0
0 0 0 1 0
1 0 0 1 1
1 1 0 0 1
De-Interleaving: block

Write the received bits in column-wise, read out row-wise
(e marks positions hit by an error burst during transmission):

1 0 1 e 1
0 1 e e 0
0 0 e 1 0
1 0 e 1 1
1 1 e 0 1

The burst is spread over the rows: each row now contains only 1 or 2 errors.
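
A sketch of the row/column block interleaver from this example (array shape assumed 5 × 5):

import numpy as np

def block_interleave(bits, rows, cols):
    """Write row-wise, transmit column-wise."""
    return np.asarray(bits).reshape(rows, cols).T.flatten()

def block_deinterleave(bits, rows, cols):
    """Write column-wise, read out row-wise."""
    return np.asarray(bits).reshape(cols, rows).T.flatten()

data = np.arange(25)                            # stand-in for 25 code bits
received = block_deinterleave(block_interleave(data, 5, 5), 5, 5)
print(np.array_equal(received, data))           # True: the two operations invert each other

A burst of consecutive errors on the channel hits consecutive positions of the column-wise stream and is therefore spread over different rows after de-interleaving, exactly as in the example above.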
Interleaving: convolutional

input sequence 0: no delay
input sequence 1: delay of b elements
...
input sequence m-1: delay of (m-1)b elements

Example: b = 5, m = 3