
Introduction to Information Theory: Channel Capacity and Models
A.J. Han Vinck
University of Duisburg-Essen
May 2012

This lecture

Some models
Channel capacity
Shannon channel coding theorem
Converse

some channel models

Input X  -->  P(y|x)  -->  Output Y

The channel is described by its transition probabilities P(y|x).
Memoryless:
- the output at time i depends only on the input at time i
- input and output alphabets are finite

Example: binary symmetric channel (BSC)

The error source produces the binary error sequence E with P(1) = 1 - P(0) = p.
X is the binary information (input) sequence; the output is Y = X ⊕ E.
Transition probabilities: 0 -> 0 and 1 -> 1 with probability 1-p; 0 -> 1 and 1 -> 0 with probability p.

from AWGN to BSC

(Figure: hard decisions turn the AWGN channel into a BSC.)

Homework: calculate the capacity as a function of A and σ².
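
A minimal numerical sketch for this homework (added here, not from the slides), assuming antipodal signalling with amplitudes ±A, noise variance σ², and hard-decision (threshold) detection; under these assumptions the crossover probability is p = Q(A/σ) and the capacity is that of the resulting BSC:

```python
import math

def h(p):
    """Binary entropy in bits; h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def bsc_capacity_from_awgn(A, sigma2):
    """Capacity of the BSC obtained by hard decisions on +/-A signalling in noise of variance sigma2."""
    p = q_func(A / math.sqrt(sigma2))   # crossover probability under the assumed model
    return 1 - h(p)

for A in (0.5, 1.0, 2.0):
    print(f"A = {A:.1f}, sigma^2 = 1:  C = {bsc_capacity_from_awgn(A, 1.0):.3f} bit/channel use")
```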

Other models

Z-channel (optical): input 0 ("light on") is received as 0 with probability 1; input 1 ("light off") is received as 1 with probability 1-p and as 0 with probability p. P(X=0) = P0.

Erasure channel (MAC): each input, 0 or 1, is received correctly with probability 1-e and erased (output E) with probability e. P(X=0) = P0.

Erasure with errors: each binary input is received correctly with probability 1-p-e, erased with probability e, and inverted with probability p.

burst error model (Gilbert-Elliot)

Random error channel: the outputs of the error source are independent,
P(0) = 1 - P(1).

Burst error channel: the outputs of the error source depend on a state (good or bad) that follows a two-state Markov chain with transition probabilities Pgg, Pgb, Pbg, Pbb:
P(0 | state = bad) = P(1 | state = bad) = 1/2
P(0 | state = good) = 1 - P(1 | state = good) = 0.999
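
A small simulation sketch of this two-state error source (added for illustration). The per-state error probabilities follow the slide (0.001 in the good state, 1/2 in the bad state); the transition probabilities p_gb and p_bg are arbitrary illustrative values, not from the lecture:

```python
import random

def gilbert_elliot_errors(n, p_gb=0.01, p_bg=0.1, seed=1):
    """Binary error sequence from a two-state (good/bad) Markov error source.

    Per-state error probabilities follow the slide: 0.001 (good), 0.5 (bad).
    p_gb = P(good -> bad) and p_bg = P(bad -> good) are illustrative values only.
    """
    rng = random.Random(seed)
    state = "good"
    errors = []
    for _ in range(n):
        p_err = 0.001 if state == "good" else 0.5
        errors.append(1 if rng.random() < p_err else 0)
        # state transition of the Markov chain
        if state == "good" and rng.random() < p_gb:
            state = "bad"
        elif state == "bad" and rng.random() < p_bg:
            state = "good"
    return errors

e = gilbert_elliot_errors(100_000)
print("average error rate:", sum(e) / len(e))
```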

channel capacity:

I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)   (Shannon 1948)

notes:
capacity = max over P(x) of I(X;Y)
the capacity depends on the input probabilities P(x), because the transition probabilities are fixed by the channel

Practical communication system design

message (one of 2^k) --> encoder (code book) --> code word --> channel (with errors) --> receive --> decoder (code book) --> estimate

There are 2^k code words of length n.
k is the number of information bits transmitted in n channel uses.

Channel capacity

Definition:
The rate R of a code is the ratio k/n, where k is the number of information bits transmitted in n channel uses.

Shannon showed that:
for R <= C, encoding methods exist with decoding error probability that tends to 0 as the block length grows.

Encoding and decoding according to Shannon

Code: 2^k binary codewords, chosen at random with P(0) = P(1) = 1/2.
Channel errors: P(0 -> 1) = P(1 -> 0) = p,
i.e. the number of typical error sequences is about 2^{n h(p)}.
Decoder: search around the received sequence (in the space of 2^n binary sequences) for a codeword with about np differences.

decoding error probability

1. For t errors: P(|t/n - p| > ε) -> 0 for n -> ∞ (law of large numbers).

2. More than one codeword in the decoding region (codewords random):

   P(≥ 1 other codeword) ≤ (2^k - 1) · 2^{n h(p)} / 2^n ≈ 2^{-n(1 - h(p) - R)} = 2^{-n(C_BSC - R)} -> 0

   for R = k/n < 1 - h(p) and n -> ∞.

channel capacity: the BSC

I(X;Y) = H(Y) - H(Y|X)

The maximum of H(Y) is 1, since Y is binary.
H(Y|X) = P(X=0) h(p) + P(X=1) h(p) = h(p)

Conclusion: the capacity for the BSC is C_BSC = 1 - h(p).

Homework: draw C_BSC as a function of p; what happens for p > 1/2?

channel capacity: the BSC

(Figure: channel capacity C_BSC versus bit error probability p, both axes from 0 to 1.)

Explain the behaviour!
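
A short sketch (not from the slides) that tabulates C_BSC = 1 - h(p) and makes the behaviour of the plotted curve, including its symmetry around p = 1/2, easy to see:

```python
import math

def h(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# C_BSC = 1 - h(p): equals 1 at p = 0 and p = 1, and 0 at p = 1/2
for p in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
    print(f"p = {p:4.2f}   C_BSC = {1 - h(p):.3f}")
```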

channel capacity: the Z-channel

Application in optical communications.

Channel: input 0 ("light on") is received as 0 with probability 1; input 1 ("light off") is received as 1 with probability 1-p and as 0 with probability p. P(X=0) = P0.

H(Y) = h(P0 + p(1 - P0))
H(Y|X) = (1 - P0) h(p)

For the capacity, maximize I(X;Y) = H(Y) - H(Y|X) over P0.
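
A brute-force numerical sketch of this maximization over P0 (added here), using the two entropy expressions from the slide:

```python
import math

def h(x):
    """Binary entropy in bits."""
    return 0.0 if x in (0.0, 1.0) else -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def z_mutual_information(P0, p):
    """I(X;Y) = H(Y) - H(Y|X) for the Z-channel, using the slide's expressions."""
    return h(P0 + p * (1 - P0)) - (1 - P0) * h(p)

def z_capacity(p, steps=10_000):
    """Brute-force maximization of I(X;Y) over the input probability P0."""
    return max(z_mutual_information(i / steps, p) for i in range(steps + 1))

print(f"Z-channel capacity for p = 0.1: {z_capacity(0.1):.4f} bit/channel use")
```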

channel capacity: the erasure channel

Application: CDMA detection.

Channel: each input, 0 or 1, is received correctly with probability 1-e and erased (output E) with probability e. P(X=0) = P0.

I(X;Y) = H(X) - H(X|Y)
H(X) = h(P0)
H(X|Y) = e · h(P0)

Thus C_erasure = 1 - e.
(Check! Draw and compare with the BSC and the Z-channel.)
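
A tiny sketch for the comparison exercise (my addition), evaluating C_erasure = 1 - e and C_BSC = 1 - h(p) at the same parameter value; for these values erasures cost less capacity than errors of the same probability:

```python
import math

def h(q):
    """Binary entropy in bits."""
    return 0.0 if q in (0.0, 1.0) else -q * math.log2(q) - (1 - q) * math.log2(1 - q)

# the same value q is used as erasure probability e and as BSC crossover probability p
for q in (0.01, 0.05, 0.10, 0.20):
    print(f"q = {q:4.2f}   C_erasure = {1 - q:.3f}   C_BSC = {1 - h(q):.3f}")
```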

Capacity and coding for the erasure channel

Code: 2^k binary codewords, chosen at random with P(0) = P(1) = 1/2.
Channel erasures: P(0 -> E) = P(1 -> E) = e.
Decoder: search around the received sequence (in the space of 2^n binary sequences) for a codeword with about ne differences, i.e. one that agrees in the non-erased positions.

decoding error probability

1. For t erasures: P(|t/n - e| > ε) -> 0 for n -> ∞ (law of large numbers).

2. More than one candidate codeword agrees in the n(1-e) positions that remain after about ne positions are erased (codewords random):

   P(≥ 1 other codeword) ≤ (2^k - 1) · 2^{-n(1-e)} ≈ 2^{-n(1 - e - R)} = 2^{-n(C_erasure - R)} -> 0

   for R = k/n < 1 - e and n -> ∞.

Erasure with errors: calculate the capacity!

Channel: each binary input is received correctly with probability 1-p-e, erased with probability e, and inverted with probability p.

example

Consider the following channel: inputs {0, 1, 2}; inputs 0 and 2 are received without error, while input 1 is received as 0, 1 or 2, each with probability 1/3.

For P(0) = P(2) = p, P(1) = 1 - 2p:

H(Y) = h(1/3 - 2p/3) + (2/3 + 2p/3);   H(Y|X) = (1 - 2p) log2(3)

Q: maximize H(Y) - H(Y|X) as a function of p.
Q: is this the capacity?
Hint: use log2(x) = ln(x) / ln(2) and d ln(x) / dx = 1/x.
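
A quick numerical evaluation of the expression to be maximized over p in [0, 1/2] (added here; useful for checking the analytic answer):

```python
import math

def h(x):
    """Binary entropy in bits."""
    return 0.0 if x in (0.0, 1.0) else -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def mutual_info(p):
    """H(Y) - H(Y|X) with the expressions from the slide, for P(0) = P(2) = p."""
    HY = h(1/3 - 2*p/3) + (2/3 + 2*p/3)
    HYgX = (1 - 2*p) * math.log2(3)
    return HY - HYgX

best_p = max((i / 10_000 for i in range(5_001)), key=mutual_info)
print(f"maximum ≈ {mutual_info(best_p):.4f} bit at p ≈ {best_p:.3f}")
```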

channel models: general diagram

Input alphabet X = {x1, x2, ..., xn}
Output alphabet Y = {y1, y2, ..., ym}
Transition probabilities Pj|i = PY|X(yj | xi), drawn as edges P1|1, P2|1, P1|2, P2|2, ..., Pm|n from every input xi to every output yj.

calculating capacity needs more theory

The statistical behavior of the channel is completely defined by the channel transition probabilities Pj|i = PY|X(yj | xi).

* clue:
I(X;Y) is concave (convex ∩) in the input probabilities,
i.e. finding the maximum is simple.
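
For a general discrete memoryless channel the maximization is usually carried out numerically. Below is a minimal sketch of the standard Blahut-Arimoto iteration (my addition, not part of the lecture), checked against the slide's BSC result C = 1 - h(p):

```python
import numpy as np

def blahut_arimoto(P, iterations=200):
    """Capacity (bits/use) of a DMC with transition matrix P[i, j] = P(y_j | x_i)."""
    n_in = P.shape[0]
    r = np.full(n_in, 1.0 / n_in)                      # input distribution, start uniform
    for _ in range(iterations):
        q = r[:, None] * P                             # r(x) P(y|x)
        q = q / (q.sum(axis=0, keepdims=True) + 1e-300)  # posterior q(x|y)
        # update: r(x) proportional to exp( sum_y P(y|x) ln q(x|y) )
        log_r = np.sum(P * np.log(q + 1e-300), axis=1)
        r = np.exp(log_r - log_r.max())
        r = r / r.sum()
    # mutual information I(X;Y) at the final input distribution r
    p_y = r @ P
    terms = np.where(P > 0, P * np.log2(np.where(P > 0, P, 1.0) / (p_y + 1e-300)), 0.0)
    return float(np.sum(r[:, None] * terms))

# sanity check against the slide's BSC result: C = 1 - h(0.1) ≈ 0.531
P_bsc = np.array([[0.9, 0.1],
                  [0.1, 0.9]])
print(round(blahut_arimoto(P_bsc), 4))
```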

Channel capacity: converse

For R > C the decoding error probability > 0.

(Figure: Pe versus the rate k/n, with Pe > 0 for k/n > C.)

Converse:

For a discrete memoryless channel (inputs Xi, outputs Yi):

I(X^n; Y^n) = H(Y^n) - Σ_{i=1}^{n} H(Yi | Xi)
            ≤ Σ_{i=1}^{n} H(Yi) - Σ_{i=1}^{n} H(Yi | Xi)
            = Σ_{i=1}^{n} I(Xi; Yi) ≤ nC

The source generates one out of 2^k equiprobable messages m.

source --> encoder --> X^n --> channel --> Y^n --> decoder --> estimate m̂

Let Pe = probability that m̂ ≠ m.

converse

R := k/n for any code.

k = H(M) = I(M; Y^n) + H(M | Y^n)
  ≤ I(X^n; Y^n) + H(M | Y^n)        (X^n is a function of M)
  ≤ I(X^n; Y^n) + H(M | M̂)          (M̂ is a function of Y^n)
  ≤ I(X^n; Y^n) + h(Pe) + Pe · k    (Fano inequality)
  ≤ nC + 1 + k · Pe

Hence Pe ≥ 1 - C/R - 1/(nR):
for large n and R > C, the probability of error Pe > 0.
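
A tiny numerical illustration (added here) of this lower bound, for an arbitrary example capacity and block length; below C the bound is vacuous, above C it forces Pe away from 0:

```python
def pe_lower_bound(R, C, n):
    """Converse bound from the slide: Pe >= 1 - C/R - 1/(nR)."""
    return max(0.0, 1 - C / R - 1 / (n * R))

C = 0.5       # example capacity in bit/use (illustrative)
n = 1_000     # block length (illustrative)
for R in (0.4, 0.5, 0.6, 0.8, 1.0):
    print(f"R = {R:.1f}   Pe >= {pe_lower_bound(R, C, n):.3f}")
```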

Appendix:
Assume a binary sequence of length n with P(0) = 1 - P(1) = 1 - p, and let t be the number of 1s in the sequence.
Then, for n -> ∞ and any ε > 0, the weak law of large numbers gives
P(|t/n - p| > ε) -> 0,
i.e. with high probability we expect about pn 1s.

Appendix:
Consequence:

1. n(p-ε) < t < n(p+ε) with high probability.

2. The number of such sequences:
   Σ_{t=n(p-ε)}^{n(p+ε)} C(n, t) ≈ 2εn · C(n, pn) ≈ 2εn · 2^{n h(p)}

3. lim_{n->∞} (1/n) log2 [ 2εn · C(n, pn) ] = h(p)

   where C(n, t) is the binomial coefficient and
   h(p) = -p log2(p) - (1-p) log2(1-p).

Homework: prove the approximation using ln N! ~ N lnN for N large.


Or use the Stirling approximation:

N ! 2 N N e

N N
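
A quick numerical check (added here) that (1/n) log2 C(n, pn) indeed approaches h(p):

```python
import math

def h(p):
    """Binary entropy in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p = 0.1
for n in (100, 1_000, 10_000, 100_000):
    t = round(p * n)
    per_symbol = math.log2(math.comb(n, t)) / n   # (1/n) log2 C(n, pn)
    print(f"n = {n:6d}   (1/n) log2 C(n, pn) = {per_symbol:.4f}   h(p) = {h(p):.4f}")
```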

Binary Entropy:

h(p) = -p log2(p) - (1-p) log2(1-p)

Note: h(p) = h(1-p)

(Figure: h(p) versus p on [0, 1]; the curve is symmetric about p = 1/2, where it reaches its maximum value 1.)

Capacity for Additive White Gaussian Noise

Input X --> (+ Noise) --> Output Y

Cap = sup over p(x) with σx² ≤ S/2W of [ H(Y) - H(Noise) ]

W is the (single-sided) bandwidth.
Input X is Gaussian with power spectral density (psd) S/2W.
Noise is Gaussian with psd σ²noise.
Output Y is Gaussian with σy² = S/2W + σ²noise.
For Gaussian channels: σy² = σx² + σ²noise.

For a Gaussian Z with p(z) = (1/√(2πσz²)) exp(-z²/(2σz²)):  H(Z) = ½ log2(2πe σz²) bits.

Cap = ½ log2(2πe(σx² + σ²noise)) - ½ log2(2πe σ²noise)
    = ½ log2( (σ²noise + σx²) / σ²noise ) bits/transmission

Cap = W log2( (σ²noise + S/2W) / σ²noise ) bits/sec
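
A small sketch (added here) evaluating the bits/sec formula; the bandwidth, noise psd and signal powers below are illustrative values, not from the lecture:

```python
import math

def awgn_capacity_bits_per_sec(S, W, noise_psd):
    """C = W log2( (noise_psd + S/2W) / noise_psd ), following the slide's notation."""
    return W * math.log2((noise_psd + S / (2 * W)) / noise_psd)

# illustrative numbers: 1 MHz bandwidth, unit noise psd, varying signal power
W = 1e6
for S in (1e5, 1e6, 1e7):
    print(f"S = {S:10.0f}   C = {awgn_capacity_bits_per_sec(S, W, 1.0):12.0f} bit/s")
```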

Middleton type of burst channel model

Select channel k with probability Q(k); channel k has transition probability p(k).
(Figure: a bank of binary channels, channel 1, channel 2, ..., with a selector that picks channel k for each transmission.)

Fritzman model:
multiple good states G1, ..., Gn and only one bad state B; closer to an actual real-world channel.
In the good states the error probability is 0; in the bad state B the error probability is h.

Interleaving: from bursty to random

message --> encoder --> interleaver --> channel (bursty) --> de-interleaver --> decoder --> message

The de-interleaver spreads the channel's burst errors, so the decoder sees approximately random errors.
Note: interleaving brings encoding and decoding delay.
Homework: compare block and convolutional interleaving w.r.t. delay.

Interleaving: block

Channel models are difficult to derive:
- burst definition?
- random and burst errors?
For practical reasons: convert burst errors into random errors.

Write the code bits into an array row-wise; transmit column-wise.

De-Interleaving: block

Write the received bits into the array column-wise; read out row-wise.
A channel burst of length at most the number of rows is spread over the rows, so each row contains at most one error.
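
A minimal sketch of this block interleaving and de-interleaving (added here; the 4 x 6 array size is chosen only for illustration). A burst of 4 channel errors ends up as one error per row:

```python
def block_interleave(bits, rows, cols):
    """Write row-wise into a rows x cols array, read out (transmit) column-wise."""
    assert len(bits) == rows * cols
    table = [bits[r * cols:(r + 1) * cols] for r in range(rows)]
    return [table[r][c] for c in range(cols) for r in range(rows)]

def block_deinterleave(bits, rows, cols):
    """Write column-wise, read out row-wise (inverse of block_interleave)."""
    assert len(bits) == rows * cols
    table = [[None] * cols for _ in range(rows)]
    i = 0
    for c in range(cols):
        for r in range(rows):
            table[r][c] = bits[i]
            i += 1
    return [b for row in table for b in row]

data = list(range(24))              # stand-in for 24 code bits (4 rows x 6 columns)
tx = block_interleave(data, 4, 6)
tx[8:12] = ["X"] * 4                # burst of length 4 on the channel
rx = block_deinterleave(tx, 4, 6)
print(rx)                           # each group of 6 (one row) contains exactly one "X"
```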

Interleaving: convolutional

The input is split over m branches: input sequence 0 gets no delay, input sequence 1 a delay of b elements, ..., input sequence m-1 a delay of (m-1)b elements.

Example: b = 5, m = 3.
