
1

Lecture Notes on Source and Channel Coding


Prof. Sabah Badri-Hoeher
Faculty of Electrical Engineering and Computer Science
University of Applied Sciences Bremen
[email protected]
Summer Term 2011
2
Contents
Introduction
Channel Coding
Introduction to channel coding
Block codes: Denition and properties
Systematic codes
Hamming distance, minimum distance, Hamming weight
Linear block code
Decoding spheres, error detection, error correction
Hamming codes
Hard-decision and soft-decision decoding, MAP and ML decoding
3
Contents
Binary symmetric channel (BSC) and AWGN channel
Cyclic codes, CRC code, Reed-Solomon and BCH codes
Convolutional codes: Denition and properties
Decoding of convolutional Codes, Viterbi algorithm
Source Coding
Introduction to source coding
Coding of discrete sources
Human coding
Runlength coding
Lempel-Ziv coding
4
References
R. Veldhuis, Introduction to Source Coding, Prentice Hall, UK, 1993.
P.M. Gray, Source Coding Theory, Kluwer Academic Publishers, 1998.
M. Bossert, Channel Coding for Telecommunications, John Wiley & Sons, 1999.
S. Benedetto and E. Biglieri, Principles of Digital Transmission with Wireless Applications, Kluwer Academic Publishers, New York, 1999.
J.G. Proakis, Digital Communications, 3rd edition, McGraw-Hill, New York, 1995.
5
Channel coding
Introduction of channel coding
[Block diagram: source → source encoder → channel encoder → modulator → channel → demodulator → channel decoder → source decoder → sink]
Channel coding provides techniques for digital transmission of data from a transmitter to a receiver with a minimum number of errors.
Redundancy is included in the transmitted data so that errors due to the transmission can be detected and/or corrected at the receiver.
6
Channel coding
Tasks of channel coding are
error detection
error correction
error concealment
Without channel coding, robust transmission via noisy channels as well as reliable storage is not possible.
Therefore, channel coding is applied in many different applications.
Examples of digital transmission systems:
Mobile radio systems (GSM, GPRS, EDGE, UMTS)
data modems, internet
Digital satellite communications (DAB, XM)
7
Channel coding
Examples of digital storage systems:
Compact disc (CD), digital versatile disc (DVD)
Digital audio tape (DAT)
Hard disc, magnetic storage systems
Due to source coding and channel coding
Flexibility: Trade-off between power efficiency and bandwidth efficiency
Regenerability: Satellite on-board processing, repeaters in sea cables
High quality: due to error correction
Digital recording media: RAM, discs, tapes, etc.
Multimedia integration of speech, audio, data, video, etc.
8
Channel coding
Fundamental principles of channel coding
Forward error correction (FEC):
In forward error correction schemes there is no feedback from the channel decoder to the channel encoder.
Automatic repeat request (ARQ):
In automatic repeat request schemes there is feedback from the channel decoder to the channel encoder.
For example, a code word may be repeated until the channel decoder does not detect any error. Alternatively, additional parity bits may be transmitted until the channel decoder does not detect any error. The additional decoding delay is not tolerable in all transmission schemes, for example in real-time speech transmission.
Our focus in this lecture is on FEC techniques.
9
Channel coding
Fundamental notions of channel coding
Message:
A message may be ideas, data, facts, lies, nonsense, etc. Messages may or may not be known at the receiver in advance.
Information:
Information consists of data or facts which are not known to the receiver in advance. Information does not contain redundancy.
Symbol:
Messages and information are represented by symbols, which are known to all communication users.
(Examples: characters, numbers, bits)
10
Channel coding
Code:
A code is a set of words over a well-defined symbol alphabet.
(Examples: English words, German words, ASCII code, genetic code)
Redundancy:
If, on average, more symbols are used to represent a message than are actually needed for a perfect reconstruction, then the message contains redundancy besides information. (Example: the German Duden contains approximately 115 000 words, whereas the number of possible words of length 8 is 26^8 ≈ 2.1 · 10^11.)
Redundancy helps to protect information against transmission errors.
(Example: "Source and channel doding makes fun"; the deliberate typo is easily corrected thanks to the redundancy of natural language.)
11
Channel coding
Block codes: Definition and properties
Definition:
An (n, k)_q block code C maps an info word u = [u_0, u_1, ..., u_{k−1}] of length k onto a code word x := [x_0, x_1, ..., x_{n−1}] of length n, with n ≥ k.
The info word u = [u_0, u_1, ..., u_{k−1}] denotes a sequence of k info symbols.
The info symbols u_i, i = 0, ..., k−1, are defined over the alphabet {0, 1, ..., q−1}, where q is the cardinality (number of elements) of the symbol alphabet.
The code symbols x_i, i = 0, ..., n−1, are assumed to be from the same alphabet {0, 1, ..., q−1}.
The assignment of code words to info words is
unambiguous and reversible: for each code word there is exactly one info word
time invariant: the mapping rule does not change in time
Each info word affects only one code word.
12
Channel coding
[Figure: the encoder maps the info word u = [u_0, u_1, ..., u_{k−1}] onto the code word x = [x_0, x_1, ..., x_{n−1}]]
u_i ∈ {0, 1, ..., q−1}, 0 ≤ i ≤ k−1, i.e., u ∈ {0, 1, ..., q−1}^k
x_i ∈ {0, 1, ..., q−1}, 0 ≤ i ≤ n−1, i.e., x ∈ {0, 1, ..., q−1}^n
13
Channel coding
Definition: A code C is the set of all q^k code words.
Since n symbols are needed in order to transmit k info symbols, where n > k, the code contains redundancy, because only q^k of the q^n possible symbol combinations are valid code words.
This redundancy is used for error detection, error correction, or error concealment by the receiver.
The ratio
R := k/n < 1
is called the code rate. The larger the code rate, the smaller the redundancy for a given code word length n.
Trade-off between bandwidth efficiency and power efficiency.
The transmitted (possibly erroneous or noisy) code words are denoted as received words y. For hard-decision decoding, y_i ∈ {0, 1, ..., q−1}, i = 0, 1, ..., n−1, by definition.
14
Channel coding
Systematic codes
Definition:
A code is called systematic if the mapping between info words and code words is such that the info words are explicitly contained in the code words.
The n − k remaining symbols are called parity check symbols (q = 2: parity check bits).
Example 1:
(3, 2)_2 single parity check (SPC) code
(q = 2, i.e., one symbol corresponds to one bit)
Parity check equation: u_0 ⊕ u_1 ⊕ x_2 = 0 (⊕: modulo-q addition)
Code: C = {[000], [011], [101], [110]}
15
Channel coding
Info word u = [u_0, u_1]   Code word x = [x_0, x_1, x_2]
[00]                       [000]
[01]                       [011]
[10]                       [101]
[11]                       [110]
Example 2:
(3, 2)_256 single parity check (SPC) code
(q = 2^8 = 256, i.e., one symbol corresponds to one byte)
Parity check equation: u_0 ⊕ u_1 ⊕ x_2 = 0 (⊕: modulo-q addition)
Code: C = {[0,0,0], [0,1,255], ..., [255,255,2]} (q^k = 256^2 = 65536 code words)
16
Channel coding
Info word u = [u_0, u_1]   Code word x = [x_0, x_1, x_2]
[0,0]                      [0,0,0]
[0,1]                      [0,1,255]
[0,2]                      [0,2,254]
...                        ...
[0,255]                    [0,255,1]
[1,0]                      [1,0,255]
...                        ...
[255,255]                  [255,255,2]
17
Channel coding
Hamming distance
Definition:
The Hamming distance d_H(x, y) is the number of positions in which the components of x and y differ, where x ∈ {0, 1, ..., q−1}^n and y ∈ {0, 1, ..., q−1}^n are words of length n.
x and y may be code words or received words, respectively.
Example 1 (q = 2): The Hamming distance between the code words
x = [000] and y = [110] is d_H(x, y) = 2
x = [011] and y = [101] is d_H(x, y) = 2
Example 2 (q = 3):
x = [012] and y = [021] is d_H(x, y) = 2
x = [012] and y = [120] is d_H(x, y) = 3
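The Hamming distance can be computed directly from the definition; a minimal sketch in Python (function name and example words are illustrative, taken from the examples above):

```python
def hamming_distance(x, y):
    """Number of positions in which the words x and y differ."""
    assert len(x) == len(y)
    return sum(1 for a, b in zip(x, y) if a != b)

# Examples from above (q = 2 and q = 3)
print(hamming_distance([0, 0, 0], [1, 1, 0]))  # 2
print(hamming_distance([0, 1, 2], [1, 2, 0]))  # 3
```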
18
Channel coding
Minimum distance
Definition:
The minimum distance d_min of an (n, k) block code C is the minimum Hamming distance between all pairs of distinct code words:
d_min := min { d_H(x, y) | x, y ∈ C, x ≠ y }
Example:
Let x = [00000], y = [01010], and z = [11111] be code words of a block code.
d_H(x, y) = 2, d_H(x, z) = 5, and d_H(y, z) = 3, hence d_min = 2.
Notation:
A detailed notation of a block code C is given by (n, k, d_min)_q.
Example:
A binary single parity check code is an (n, n−1, 2)_2 block code.
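For small codes, d_min can be found by brute force over all pairs of distinct code words; a sketch (the code C below is the example code of this slide):

```python
from itertools import combinations

def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))

def minimum_distance(code):
    """Minimum Hamming distance over all pairs of distinct code words."""
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))

C = [[0, 0, 0, 0, 0], [0, 1, 0, 1, 0], [1, 1, 1, 1, 1]]
print(minimum_distance(C))  # 2
```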
19
Channel coding
Hamming weight
Definition:
If the all-zeros code word exists, the Hamming weight w_H(x) is the number of components of the code word x which are different from zero.
Properties:
The Hamming distance is a metric with the following properties:
d_H(x, y) = d_H(y, x)
0 ≤ d_H(x, y) ≤ n
d_H(x, y) = 0 ⇔ x = y
d_H(x, y) ≤ d_H(x, z) + d_H(z, y)  (triangle inequality)
20
Channel coding
If addition and subtraction are defined over the symbol alphabet, the following properties hold for the Hamming weight:
w_H(x) = w_H(−x)
0 ≤ w_H(x) ≤ n
w_H(x) = 0 ⇔ x = 0 = [0, 0, ..., 0]
w_H(x ⊕ y) ≤ w_H(x) + w_H(y)
21
Channel coding
Linear block codes
Definition:
An (n, k, d_min)_q block code C is called linear if for any two code words the component-wise modulo-q sum yields a valid code word:
x, y ∈ C ⇒ x ⊕ y ∈ C
Additionally, for non-binary codes (q ≠ 2):
x ∈ C, a ∈ {0, 1, ..., q−1} ⇒ a · x ∈ C
Example:
C = {[000], [011], [101], [110]} is linear
C = {[000], [011], [110], [111]} is non-linear
Linear block codes are an important subset of block codes.
All block codes which are important in practice (such as repetition codes, single parity check codes, Hamming codes, BCH codes, Reed-Solomon codes) are linear.
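For a binary code, linearity can be tested exhaustively by checking that the modulo-2 sum of every pair of code words is again a code word; a sketch for q = 2 (the two example codes are the ones above):

```python
from itertools import product

def is_linear_binary(code):
    """Check that x XOR y is a code word for all pairs of code words (q = 2)."""
    codeset = {tuple(c) for c in code}
    for x, y in product(codeset, repeat=2):
        s = tuple((a + b) % 2 for a, b in zip(x, y))
        if s not in codeset:
            return False
    return True

print(is_linear_binary([(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]))  # True
print(is_linear_binary([(0, 0, 0), (0, 1, 1), (1, 1, 0), (1, 1, 1)]))  # False
```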
22
Channel coding
Hamming distance of linear block codes
Theorem:
The minimum distance of a linear (n, k, d_min)_q block code C is
d_min = min { w_H(x) | x ∈ C, x ≠ 0 }
Sketch of a proof: d_H(x, y) = w_H(x − y), provided subtraction is defined over the symbol alphabet; due to linearity, x − y is again a code word.
Hence, an advantage of linear block codes is that for the calculation of d_min not all q^k (q^k − 1) ordered pairs have to be considered, but only the q^k − 1 nonzero code words.
The property of linearity may be used for efficient encoding and decoding of linear block codes.
23
Channel coding
Decoding spheres
Definition:
A decoding sphere K_r(x) of radius r around a word x is the set of all words y which have a Hamming distance d_H(x, y) ≤ r:
K_r(x) := { y | d_H(x, y) ≤ r }
Example:
(3, 1, 3)_2 repetition code C = {[000], [111]}
K_1([000]) = {[000], [100], [010], [001]}
K_1([111]) = {[111], [011], [101], [110]}
K_2([000]) = {[000], [100], [010], [001], [110], [101], [011]}
K_2([111]) = {[111], [011], [101], [110], [001], [010], [100]}
K_3([000]) = K_3([111]) = {0, 1}^3
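The decoding spheres of such small codes can be enumerated directly; a sketch (binary case shown, q is a parameter):

```python
from itertools import product

def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))

def decoding_sphere(x, r, q=2):
    """All words y of length len(x) over {0, ..., q-1} with d_H(x, y) <= r."""
    return [y for y in product(range(q), repeat=len(x))
            if hamming_distance(x, y) <= r]

print(decoding_sphere((0, 0, 0), 1))
# [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)]
```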
24
Channel coding
Error detection
The larger the minimum distance d_min, the more errors can be detected and/or corrected. This important fact will now be demonstrated for hard-decision decoding.
Theorem:
An (n, k, d_min)_q block code C is able to detect
t′ = d_min − 1
symbol errors for sure. Although each symbol consists of log_2(q) bits, for q > 2 this does not mean that (d_min − 1) · log_2(q) bit errors are detected for sure.
Example:
Single parity check code: each SPC code is able to detect one error for sure.
Proof of the error detection theorem:
25
Channel coding
Each sphere of radius d_min − 1 around a code word contains no other code word.
If at most d_min − 1 errors occur, the received word lies within the decoding sphere around the transmitted code word.
Since this decoding sphere contains no further code word, a confusion with another valid code word is not possible.
Therefore, an (n, k, d_min)_q block code can detect t′ = d_min − 1 symbol errors for sure.
Error correction
Theorem:
An (n, k, d_min)_q block code C is able to correct
t = ⌊(d_min − 1)/2⌋
symbol errors for sure. Although each symbol consists of log_2(q) bits, for q > 2 this does not mean that ⌊(d_min − 1)/2⌋ · log_2(q) bit errors are corrected for sure.
26
Channel coding
Example:
Repetition code: a (3, 1, 3)_2 code is able to correct one error for sure.
Proof of the error correction theorem:
Now we consider decoding spheres of radius t around each code word.
Decoding spheres of a code with minimum distance d_min ≥ 2t + 1 are disjoint.
Therefore, an (n, k, d_min)_q block code can correct t symbol errors for sure if 2t + 1 ≤ d_min, i.e., t ≤ ⌊(d_min − 1)/2⌋.
27
Channel coding
Hamming codes
Definition:
(n, k, d_min)_q = (n, n−r, 3)_q Hamming codes of order r are defined as follows:
n = (q^r − 1)/(q − 1) = 1 + q + q^2 + ... + q^{r−1},  k = n − r
Hamming codes exist for all orders r ≥ 2.
All Hamming codes are linear and systematic and have a minimum distance d_min = 3, i.e., t = 1 symbol error can be corrected for sure.
Example q = 2:
r    n          k              R        d_min
2    3          1              1/3      3
3    7          4              4/7      3
4    15         11             11/15    3
r    2^r − 1    2^r − r − 1    k/n      3
28
Channel coding
Example: The (7, 4, 3)_2 Hamming code consists of 16 code words of length 7:
u x
[0000] [0000000]
[0001] [0001111]
[0010] [0010110]
[0011] [0011001]
[0100] [0100101]
[0101] [0101010]
[0110] [0110011]
[0111] [0111100]
29
Channel coding
u x
[1000] [1000011]
[1001] [1001100]
[1010] [1010101]
[1011] [1011010]
[1100] [1100110]
[1101] [1101001]
[1110] [1110000]
[1111] [1111111]
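Since the mapping table above is systematic, a generator matrix G = [I_4 | P] can be read off from the rows for the unit-weight info words; the following sketch regenerates all 16 code words from that matrix (the matrix is derived from the table above, not given explicitly on the slides):

```python
import itertools

# Generator matrix read off from the table above: rows for [1000], [0100], [0010], [0001]
G = [
    [1, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1],
]

def encode(u, G):
    """Code word x = u * G over GF(2)."""
    n = len(G[0])
    return [sum(u[i] * G[i][j] for i in range(len(u))) % 2 for j in range(n)]

for u in itertools.product([0, 1], repeat=4):
    print(list(u), encode(u, G))
```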
30
Channel coding
Hard-decision and soft-decision decoding
Hard-decision (HD) decoding (hard-input decoding):
Let y := x ⊕ e, where the error components e_i, i = 0, 1, ..., n−1, are over the symbol alphabet and the addition is done modulo q.
For hard-decision decoding we apply the rule: select x̂ ∈ C such that
d_H(y, x̂) ≤ d_H(y, x) for all x ∈ C
Soft-decision (SD) decoding (soft-input decoding):
Let y := x + n, where n_i ∈ IR, i = 0, 1, ..., n−1, and the addition is defined over IR.
For soft-decision decoding we apply the rule: select x̂ ∈ C such that
||y − x̂|| ≤ ||y − x|| for all x ∈ C
31
Channel coding
Example of hard-decision decoding
Consider the (7, 4, 3)_2 Hamming code with the following assumptions:
Info word: u = [0010]
Code word: x = [0010110]
Error word: e = [0010000]
Received word: y = [0000110]
Decoded code word: x̂ = [0010110] (*)
Decoded info word: û = [0010]
(*) This code word has the smallest Hamming distance with respect to the received word: d_H(y, x̂) = 1, d_H(y, x) > 1 for all x ≠ x̂. According to this example, we recognize that for systematic codes the parity check symbols should not be discarded before decoding takes place.
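Hard-decision minimum-distance decoding can be sketched as an exhaustive search over all code words; the generator matrix below is the one read off from the code table earlier:

```python
from itertools import product

G = [[1, 0, 0, 0, 0, 1, 1],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1, 0],
     [0, 0, 0, 1, 1, 1, 1]]   # generator matrix of the (7,4,3)_2 Hamming code above

def encode(u):
    return tuple(sum(u[i] * G[i][j] for i in range(4)) % 2 for j in range(7))

def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))

def hd_decode(y):
    """Return (info word, code word) minimizing d_H(y, x) over all code words."""
    best = min(product([0, 1], repeat=4),
               key=lambda u: hamming_distance(y, encode(u)))
    return best, encode(best)

y = (0, 0, 0, 0, 1, 1, 0)      # received word of the example
print(hd_decode(y))            # ((0, 0, 1, 0), (0, 0, 1, 0, 1, 1, 0))
```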
32
Channel coding
Example of soft-decision decoding
Example 1: Consider again the (7, 4, 3)_2 Hamming code, with binary antipodal modulation (0 → +1, 1 → −1).
In soft-decision decoding, usually the squared Euclidean distance is used.
Info word: u = [0010]
Transmitted sequence: x = [+1.0, +1.0, −1.0, +1.0, −1.0, −1.0, +1.0]
Noise sequence: n = [+0.2, −0.3, +1.1, +0.1, −0.1, −0.2, +0.3]
Received sequence: y = [+1.2, +0.7, +0.1, +1.1, −1.1, −1.2, +1.3]
Decoded code word: x̂ = [+1.0, +1.0, −1.0, +1.0, −1.0, −1.0, +1.0] (*)
Decoded info word: û = [0010]
(*) This modulated code word has the smallest squared Euclidean distance with respect to the received sequence. Due to soft decoding, often more than t′ = d_min − 1 errors can be detected and more than t = ⌊(d_min − 1)/2⌋ errors can be corrected.
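Soft-decision decoding replaces the Hamming distance by the squared Euclidean distance between the received sequence and the modulated code words; a sketch reusing the (7,4,3)_2 Hamming code and the received sequence of this example:

```python
from itertools import product

G = [[1, 0, 0, 0, 0, 1, 1],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1, 0],
     [0, 0, 0, 1, 1, 1, 1]]

def encode(u):
    return [sum(u[i] * G[i][j] for i in range(4)) % 2 for j in range(7)]

def modulate(x):
    return [1.0 if bit == 0 else -1.0 for bit in x]     # 0 -> +1, 1 -> -1

def sq_euclidean(y, s):
    return sum((a - b) ** 2 for a, b in zip(y, s))

def sd_decode(y):
    """Select the info word whose modulated code word is closest to y."""
    return min(product([0, 1], repeat=4),
               key=lambda u: sq_euclidean(y, modulate(encode(u))))

y = [+1.2, +0.7, +0.1, +1.1, -1.1, -1.2, +1.3]   # received sequence of the example
print(sd_decode(y))                              # (0, 0, 1, 0)
```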
33
Channel coding
Error probability
Definition I (word error probability):
The word error probability is by definition the probability that the decoded info word deviates from the transmitted info word in at least one symbol:
P_w := P(û ≠ u) = P(x̂ ≠ x)
Definition II (symbol error probability):
The symbol error probability is by definition the average probability of a symbol error:
P_s := (1/n) Σ_{i=0}^{n−1} P(û_i ≠ u_i)
Remarks:
For binary codes (q = 2), the symbol error probability P_s is equal to the bit error probability P_b.
Since the number of symbol errors per word error is between 1 and n, P_w/n ≤ P_s ≤ P_w holds.
34
Channel coding
Decoding principles for block codes
Let x̃ denote hypotheses for the code words of a block code C and let y be the received word. Furthermore, let P(x̃|y) be the conditional probability of x̃ given y.
Definition I (maximum a posteriori (MAP) decoding):
The MAP decoding rule is: select the code word x̂ such that for the given received word y
HD & SD: P(x̂|y) ≥ P(x̃|y) for all x̃ ∈ C
Definition II (maximum-likelihood (ML) decoding):
The ML rule is: select the code word x̂ such that for the given received word y
HD: P(y|x̂) ≥ P(y|x̃) for all x̃ ∈ C
SD: p(y|x̂) ≥ p(y|x̃) for all x̃ ∈ C
In both rules, all q^k possible code words x̃ are compared with the received word y, at least conceptually. The most probable code word (MAP rule) or the code word nearest to the received word (ML rule) is finally selected.
35
Channel coding
Theorem:
MAP and ML decoding are identical if all q^k code words are equiprobable.
Proof:
According to Bayes' rule, P(x|y) = P(y|x) P(x)/P(y) or P(x|y) = p(y|x) P(x)/p(y). Since the denominator on the right-hand side is irrelevant with respect to the maximization, both decoding rules differ only in the use of the a-priori information P(x). Therefore, for equiprobable code words both decoding rules are identical. q.e.d.
Definition III (bounded minimum distance (BMD) decoding):
Consider decoding spheres of radius r ≤ ⌊(d_min − 1)/2⌋. Only those received words y which lie within a decoding sphere are decoded; in this case, the code word at the center of the corresponding decoding sphere is selected. For all received words y outside the decoding spheres, an erasure is declared.
Definition IV (perfect codes):
A code is called perfect if the ML and the BMD decoding principles are identical.
36
Channel coding
Binary symmetric channel (BSC)
A so-called binary symmetric channel (BSC) models independent bit errors e_i:
y_i = x_i ⊕ e_i,  i = 0, ..., n−1
where x_i, e_i, y_i ∈ {0, 1}. The corresponding bit error probability is denoted as P_BSC.
[Figure: BSC transition diagram; 0 → 0 and 1 → 1 with probability 1 − P_BSC, 0 → 1 and 1 → 0 with probability P_BSC]
37
Channel coding
Word error probability for the BSC
Now we consider a coded transmission system with hard-decision decoding, using the example of a BSC with error probability P_BSC.
Theorem:
For linear (n, k, d_min)_q block codes, the word error probability for ML decoding can be approximated by the upper bound
P_w ≤ 1 − Σ_{r=0}^{t} (n choose r) P_BSC^r (1 − P_BSC)^{n−r}
For perfect codes (e.g., for binary Hamming codes) this bound is exact.
In this case ML and BMD decoding are identical.
38
Channel coding
Proof:
P_w = 1 − P(correct decoding)
Assumption: decoding is correct if at most t errors occur:
P_w = 1 − P(w_H(e) ≤ t)
This corresponds to bounded minimum distance (BMD) decoding; ML decoding is at least as good, therefore
P_w ≤ 1 − Σ_{r=0}^{t} P(w_H(e) = r)
The number of errors in a word of length n is binomially distributed:
P(w_H(e) = r) = (n choose r) P_BSC^r (1 − P_BSC)^{n−r}
q.e.d.
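The bound is easy to evaluate numerically; a sketch for the (7, 4, 3)_2 Hamming code (t = 1), where the value is exact because the code is perfect:

```python
from math import comb

def word_error_bound(n, t, p_bsc):
    """P_w <= 1 - sum_{r=0}^{t} C(n, r) p^r (1-p)^(n-r)."""
    return 1.0 - sum(comb(n, r) * p_bsc**r * (1 - p_bsc)**(n - r)
                     for r in range(t + 1))

# (7, 4, 3)_2 Hamming code, t = 1, on a BSC with P_BSC = 0.01
print(word_error_bound(7, 1, 0.01))   # about 2.0e-3
```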
39
Channel coding
Additive white Gaussian noise (AWGN) channel
In the additive white Gaussian noise (AWGN) channel model, zero-mean white Gaussian noise is linearly added to the channel input value x_i:
y_i = x_i + n_i,  i = 0, 1, ..., n−1
where x_i, n_i, y_i ∈ IR. The channel inputs x_i and the noise samples n_i are statistically independent.
[Figure: adder producing y_i = x_i + n_i]
40
Channel coding
The noise n_i ∈ IR is:
additive: y_i = x_i + n_i, with x_i, y_i ∈ IR, E{x_i²} = 1, e.g. x_i ∈ {+1, −1}
white: E{n_i n_j} = 0 for i ≠ j
Gaussian distributed: p(n_i) = 1/√(2πσ²) · exp(−n_i²/(2σ²)), i.e., p(y_i|x_i) = 1/√(2πσ²) · exp(−(y_i − x_i)²/(2σ²))
with zero mean μ = E{n_i} = 0
and variance σ² = E{n_i²} = 1/(2 E_s/N_0)
where E_s is the energy per symbol
and N_0 is the single-sided noise power density: N_0 = k T_eff (k is here the Boltzmann constant).
E_s/N_0 is the so-called signal-to-noise ratio (SNR).
The signal-to-noise ratio per info bit is E_b/N_0 = R^{−1} · E_s/N_0 (R is the code rate).
(Note that white, Gaussian distributed random variables are statistically independent.)
41
Channel coding
Bit error probability for the AWGN channel
For an uncoded transmission system (R = 1) with binary, identically distributed symbols x_i ∈ {+1, −1}, the bit error probability P_b of the AWGN channel can be calculated as follows:
P_b = (1/2) ∫_{−∞}^{0} p(y_i|x_i = +1) dy_i + (1/2) ∫_{0}^{∞} p(y_i|x_i = −1) dy_i = ∫_{−∞}^{0} p(y_i|x_i = +1) dy_i
With
p(n_i) = 1/√(2πσ²) · exp(−n_i²/(2σ²)),  σ² = N_0/(2E_s)
it follows that
p(y_i|x_i) = 1/√(2πσ²) · exp(−(y_i − x_i)²/(2σ²))
After substitution we obtain the final result
P_b = 1/√(2πσ²) ∫_{−∞}^{0} exp(−(y_i − 1)²/(2σ²)) dy_i = (1/2) erfc(√(E_s/N_0)),  where E_s = E_b
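The closed-form result can be checked numerically; a short sketch using erfc from Python's math module:

```python
from math import erfc, sqrt

def ber_uncoded_awgn(ebn0_db):
    """P_b = 0.5 * erfc(sqrt(E_b/N_0)) for uncoded antipodal transmission."""
    ebn0 = 10 ** (ebn0_db / 10.0)
    return 0.5 * erfc(sqrt(ebn0))

for snr_db in (0.0, 4.0, 8.0):
    print(snr_db, ber_uncoded_awgn(snr_db))
# roughly 7.9e-2, 1.25e-2, 2e-4
```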
42
Channel coding
Word error probability for the AWGN channel
Now we consider a coded transmission system with soft-decision decoding, using the example of an AWGN channel model with signal-to-noise ratio E_s/N_0.
Theorem: For an (n, k, d_min)_q block code and antipodal transmission (x_i ∈ {+1, −1}) with ML decoding, the word error probability can be approximated by a lower and an upper bound:
(1/2) erfc(√(d_min E_s/N_0)) ≤ P_w ≤ (1/2) Σ_{d=d_min}^{n} a_d erfc(√(d E_s/N_0))
where E_s/N_0 = R E_b/N_0 and a_d is the number of code words with Hamming distance d with respect to the transmitted code word.
For linear codes, a_d equals the number of code words with Hamming weight d, and the bound holds independently of which of the q^k code words was transmitted.
Example: a_{d_min} = a_3 = 7 for the (7, 4, 3)_2 Hamming code.
Asymptotically, i.e. for large E_s/N_0, the bounds become tight.
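For the (7, 4, 3)_2 Hamming code the full weight distribution can be read off the code table given earlier (a_3 = 7, a_4 = 7, a_7 = 1), so both bounds can be evaluated directly; a sketch:

```python
from math import erfc, sqrt

def pw_bounds_hamming74(ebn0_db):
    """Lower/upper bound on P_w for the (7,4,3)_2 Hamming code, SD ML decoding."""
    R = 4.0 / 7.0
    esn0 = R * 10 ** (ebn0_db / 10.0)       # E_s/N_0 = R * E_b/N_0
    weights = {3: 7, 4: 7, 7: 1}            # a_d for d >= d_min, read off the code table
    lower = 0.5 * erfc(sqrt(3 * esn0))
    upper = 0.5 * sum(a * erfc(sqrt(d * esn0)) for d, a in weights.items())
    return lower, upper

print(pw_bounds_hamming74(6.0))
```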
43
Channel coding
Coding gain
Definition (coding gain):
The coding gain is the difference (in decibel) between the signal-to-noise ratio required for uncoded transmission and for coded transmission, respectively, given the same bit error probability (e.g., 10^−4).
The asymptotic coding gain (asymptotic means: for large SNR) for the AWGN channel model and soft-decision decoding is given by:
G_asy := 10 log_10(d_min · R) dB
With repetition codes, no coding gain is possible on the AWGN channel model:
d_min · R = n · 1/n = 1 and a_{d_min} = 1
Desirable is a large minimum distance d_min at a high code rate R, as well as a small number of code words with Hamming weight d_min.
44
Channel coding
[Figure: bit error probability versus E_b/N_0 in dB (0 to 10 dB, 10^0 down to 10^−5). Code: (7,4) Hamming code. Channel: AWGN. Decoder: hard/soft. Curves: uncoded transmission, hard-decision ML decoding, soft-decision ML decoding; annotated gap: 1.8 dB.]
45
Channel coding
Matrix description of block codes
Theorem:
For each linear (n, k, d_min)_q block code C a generator (k × n) matrix G exists such that the q^k code words can be written as
x := u · G,
where u = [u_0, u_1, ..., u_{k−1}] is a (1 × k) info word and x = [x_0, x_1, ..., x_{n−1}] is the corresponding (1 × n) code word.
For systematic linear block codes the generator matrix can be written as
G = [I_k | P]
where I_k is the (k × k) identity matrix and P is a (k × (n − k)) matrix which generates the parity check symbols.
46
Channel coding
Theorem (parity check matrix):
An ((n − k) × n) matrix H exists such that
x · H^T = 0
if and only if x is a code word of C.
The all-zero word on the right-hand side has (n − k) elements (zeros).
The matrix H is called parity check matrix.
For systematic linear block codes the parity check matrix can be written as
H = [−P^T | I_{n−k}]   (for q = 2: H = [P^T | I_{n−k}])
where I_{n−k} is the ((n − k) × (n − k)) identity matrix and P^T is an ((n − k) × k) matrix representing the parity check symbols.
47
Channel coding
Definition:
The syndrome s of an (n, k, d_min)_q block code is defined as follows:
s := y · H^T
The n − k components of s are zero if y is a code word.
Note that s = y · H^T = (x ⊕ e) · H^T = x · H^T ⊕ e · H^T = e · H^T.
Syndrome decoding (see the sketch below):
Construct a syndrome table (with q^{n−k} rows):
ê = arg min { w_H(e) | e · H^T = s }
For each received word y compute the syndrome s = y · H^T.
Look up the syndrome s in the table and obtain ê.
Compute x̂ = y − ê.
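A sketch of syndrome decoding for the binary (7,4,3)_2 Hamming code; the parity check matrix below is H = [P^T | I_3] for the generator matrix read off the code table earlier:

```python
from itertools import product

H = [[0, 1, 1, 1, 1, 0, 0],     # parity check matrix of the (7,4,3)_2 Hamming code,
     [1, 0, 1, 1, 0, 1, 0],     # H = [P^T | I_3] for the generator matrix used above
     [1, 1, 0, 1, 0, 0, 1]]

def syndrome(y):
    return tuple(sum(y[j] * row[j] for j in range(7)) % 2 for row in H)

# Syndrome table: for each syndrome, keep the error pattern of minimum Hamming weight
table = {}
for e in sorted(product([0, 1], repeat=7), key=sum):
    table.setdefault(syndrome(e), e)

def syndrome_decode(y):
    e_hat = table[syndrome(y)]
    return tuple((a + b) % 2 for a, b in zip(y, e_hat))

y = (0, 0, 0, 0, 1, 1, 0)        # received word from the hard-decision example
print(syndrome_decode(y))        # (0, 0, 1, 0, 1, 1, 0)
```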
48
Channel coding
Cyclic block codes
Definition:
A linear (n, k, d_min)_q block code C is called cyclic if each cyclic shift of a code word yields a valid code word:
[x_0, x_1, ..., x_{n−1}] ∈ C ⇒ [x_{n−1}, x_0, ..., x_{n−2}] ∈ C
Remark:
The (7, 4, 3)_2 Hamming code given above is not a cyclic block code:
[0001111] is a code word, but the word [1111000] obtained by shifting the code word [0001111] four times to the left is not a code word.
However, by adding rows and columns of the generator matrix one can always obtain a cyclic Hamming code.
It is not important whether we shift to the left or to the right.
49
Channel coding
Info polynomial and code polynomial
Each code word of an arbitrary block code C can be described by a code polynomial of degree at most n − 1:
x(D) := x_0 + x_1 D + ... + x_{n−2} D^{n−2} + x_{n−1} D^{n−1}
where [x_0, ..., x_{n−1}] represents the n code symbols, x_i ∈ {0, ..., q−1}, i = 0, ..., n−1.
Example:
The code word [110100] (n = 6) can be represented by the code polynomial x(D) = 1 + D + D^3 of degree 3.
Accordingly, we define an info polynomial of degree at most k − 1:
u(D) := u_0 + u_1 D + ... + u_{k−1} D^{k−1}
where [u_0, ..., u_{k−1}] represents the k info symbols, u_i ∈ {0, 1, ..., q−1}, i = 0, ..., k−1.
50
Channel coding
Generator polynomial
Theorem:
Let u(D) be an info polynomial of degree at most k − 1 and g(D) be a so-called generator polynomial of degree n − k:
g(D) := 1 + g_1 D + ... + g_{n−k−1} D^{n−k−1} + 1 · D^{n−k}
where g_i ∈ {0, ..., q−1}, i = 1, ..., n − k − 1.
The product u(D) · g(D) is a polynomial of degree at most n − 1 and corresponds to a code word of a linear (n, k, d_min)_q block code C:
x(D) = u(D) · g(D)
The code C is not necessarily cyclic.
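Encoding by polynomial multiplication can be sketched in a few lines for the binary case; the generator polynomial 1 + D + D^3 used below is the well-known generator of a cyclic (7,4) Hamming code, chosen here as an illustration and not taken from the slides:

```python
def poly_mul_gf2(a, b):
    """Multiply two binary polynomials given as coefficient lists [c0, c1, ...]."""
    result = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            result[i + j] = (result[i + j] + ai * bj) % 2
    return result

g = [1, 1, 0, 1]            # g(D) = 1 + D + D^3 (illustrative cyclic (7,4) Hamming generator)
u = [1, 0, 1, 1]            # u(D) = 1 + D^2 + D^3
x = poly_mul_gf2(u, g)
print(x)                    # [1, 1, 1, 1, 1, 1, 1] = code polynomial coefficients [x0, ..., x6]
```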
51
Channel coding
Parity check polynomial
Theorem:
Let g(D) be a generator polynomial of degree n − k of a linear (n, k, d_min)_q block code C. Then
C is cyclic ⇔ g(D) is a divisor of D^n − 1
Therefore a polynomial h(D) := h_0 + h_1 D + ... + h_{k−1} D^{k−1} + 1 · D^k of degree k exists such that g(D) · h(D) = D^n − 1, where h_i ∈ {0, ..., q−1}, i = 0, ..., k−1.
The polynomial h(D) is called parity check polynomial.
52
Channel coding
Golay code
An example of a linear, cyclic block code is the (23, 12, 7)_2 Golay code.
Its generator polynomial is
g(D) = D^11 + D^9 + D^7 + D^6 + D^5 + D + 1
The corresponding parity check polynomial is
h(D) = D^12 + D^10 + D^7 + D^4 + D^3 + D^2 + D + 1
Proof:
(D^23 + 1) : (D^11 + D^9 + D^7 + D^6 + D^5 + D + 1) = D^12 + D^10 + D^7 + D^4 + D^3 + D^2 + D + 1
Remarks:
1. For binary codes (q = 2), D^n − 1 = D^n + 1.
2. The (23, 12, 7)_2 Golay code is a perfect code.
53
Channel coding
Definition of CRC code: A cyclic (2^r − 1, 2^r − r − 2, 4)_2 code is called a cyclic redundancy check (CRC) code if the generator polynomial is of the form
g(D) := (1 + D) · p(D)
where p(D) is a primitive polynomial of degree r ≥ 3.
Examples of primitive polynomials:
Degree   Primitive polynomial p(D)
1        D + 1
2        D^2 + D + 1
3        D^3 + D + 1
4        D^4 + D + 1
5        D^5 + D^2 + 1
6        D^6 + D + 1
7        D^7 + D + 1
8        D^8 + D^6 + D^5 + D^4 + 1
54
Channel coding
The CCITT has standardized the following CRC generator polynomials, among others, for applications in the open systems interconnection (OSI) data link layer:
D^16 + D^12 + D^5 + 1 = (D + 1)(D^15 + D^14 + D^13 + D^12 + D^4 + D^3 + D^2 + D + 1)
D^12 + D^11 + D^3 + D^2 + D + 1 = (D + 1)(D^11 + D^2 + 1)
D^8 + D^2 + D + 1 = (D + 1)(D^7 + D^6 + D^5 + D^4 + D^3 + D^2 + 1)
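A CRC is computed by appending n − k zeros to the info bits and taking the remainder of the division by g(D); a minimal bitwise sketch over GF(2) (the generator below is the 8-bit CCITT polynomial D^8 + D^2 + D + 1 listed above; the message is an arbitrary example):

```python
def crc_remainder(bits, gen):
    """Remainder of bits(D) * D^(len(gen)-1) divided by gen(D), MSB first, over GF(2)."""
    work = list(bits) + [0] * (len(gen) - 1)     # append n-k zeros
    for i in range(len(bits)):
        if work[i] == 1:
            for j, gbit in enumerate(gen):
                work[i + j] ^= gbit
    return work[-(len(gen) - 1):]                # the parity (CRC) bits

gen = [1, 0, 0, 0, 0, 0, 1, 1, 1]   # D^8 + D^2 + D + 1, MSB first
msg = [1, 0, 1, 1, 0, 0, 1]         # arbitrary example message
crc = crc_remainder(msg, gen)
print(crc)
# A received word msg + crc leaves remainder zero if no error occurred:
print(crc_remainder(msg + crc, gen) == [0] * 8)   # True
```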
55
Channel coding
Circuit for the generation of systematic, cyclic block codes with generator polynomial g(D)
[Figure: linear feedback shift register with delay elements D and feedback taps g_1, g_2, ..., g_{n−k−1}; input: info polynomial u(D), output: code polynomial x(D)]
56
RS Codes and BCH Codes
Reed-Solomon (RS) codes and Bose-Chaudhuri-Hocquenghem (BCH) codes were invented around 1960.
They are among the most powerful block codes.
RS codes and BCH codes can be designed analytically.
The minimum distance is a design parameter of RS codes.
BCH codes can be interpreted as binary RS codes.
BCH codes are more suitable for the correction of single errors.
RS codes are more suitable for the correction of burst errors.
57
Channel coding
Reed-Solomon codes
RS codes are characterized by the following parameters:
n = p^m − 1
n − k = d_min − 1 = 2t
d_min = 2t + 1
q = p^m
where m and t are arbitrary positive integers and p is a prime number.
Often, p = 2 and m = 8 are chosen (q = 256), i.e., one symbol corresponds to one byte.
Examples:
1. (255, 127, 129)_256 RS code
This R ≈ 1/2 code can correct 64 byte errors for sure.
The number of code words is q^k ≈ 10^306.
58
Channel coding
2. (255, 239, 17)_256 RS code
This R ≈ 0.94 code can correct 8 byte errors for sure.
This code is often used as an outer code in concatenated coding systems.
59
Block Codes
Bose-Chaudhuri-Hocquenghem codes
BCH codes are characterized by the following parameters:
n = 2^m′ − 1
n − k ≤ m′ · t
d_min ≥ 2t + 1
q = 2,
where m′ (m′ ≥ 3) and t are arbitrary positive integers.
The code can be designed for a given minimum distance d_min.
The corresponding generator polynomials are tabulated.
60
Bit Error Probability for Binary R ≈ 1/2 BCH Codes [figure: BER curves not reproduced]
61
Bit Error Probability for Binary R ≈ 3/4 BCH Codes [figure: BER curves not reproduced]
62
Further Classes of Block Codes
Reed-Muller codes
Goppa codes
Simplex codes
Fire codes
Walsh codes
. . .
63
Channel coding
Convolutional codes
Convolutional codes encode the info bits continuously.
The ratio between the number of info bits and the number of code bits is called the code rate R.
In practical systems, the information is transmitted block-wise rather than continuously.
In this lecture only binary convolutional codes are treated.
The number of info bits per block is denoted as K, i.e., the index before the encoder is 0 ≤ k ≤ K − 1.
The number of code bits per block is denoted as N, i.e., the index after the encoder is 0 ≤ n ≤ N − 1.
64
Channel coding
Shift register representation of a binary, non-recursive R = 1/2 convolutional encoder with 4 states
[Figure: shift register with input u_k, delay elements storing u_{k−1} and u_{k−2}, and modulo-2 adders producing the code bits x_{1,k} and x_{2,k}]
Memory length ν = 2
Number of states S := 2^ν
65
Channel coding
State diagram of a binary, non-recursive R = 1/2 convolutional encoder with 4 states
[Figure: state diagram with states (u_{k−2}, u_{k−1}) ∈ {00, 01, 10, 11}; solid transitions for info bit u_k = 0, dashed transitions for u_k = 1; each transition is labeled with the code bits x_{1,k} x_{2,k}]
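A sketch of this 4-state encoder in code, assuming the generator polynomials g_1(D) = 1 + D² and g_2(D) = 1 + D + D² (octal taps 5, 7), consistent with the generator matrix given later for the modified state diagram:

```python
def conv_encode(info_bits, taps=((1, 0, 1), (1, 1, 1))):
    """Non-recursive R = 1/2 convolutional encoder with memory 2.

    taps are the coefficients (g_{i,0}, g_{i,1}, g_{i,2}) of g_1(D) = 1 + D^2 and
    g_2(D) = 1 + D + D^2 (octal 5, 7), assumed here for the 4-state example.
    Two zero tail bits terminate the trellis in the all-zero state.
    """
    state = [0, 0]                        # [u_{k-1}, u_{k-2}]
    out = []
    for u in list(info_bits) + [0, 0]:    # info bits followed by tail bits
        window = [u] + state
        for g in taps:
            out.append(sum(b * c for b, c in zip(window, g)) % 2)
        state = [u, state[0]]
    return out

print(conv_encode([1, 0, 1, 1]))
# [1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1]
```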
66
Channel coding
Trellis segment of a binary, non-recursive R = 1/2 convolutional encoder with 4 states
[Figure: one trellis segment showing the transitions from the previous state (u_{k−2}, u_{k−1}) to the consecutive state (u_{k−1}, u_k); solid branches for info bit u_k = 0, dashed branches for u_k = 1, labeled with the code bits x_{2,k} x_{1,k}]
67
Channel coding
Trellis diagram of a binary, non-recursive R = 1/2 convolutional encoder with 4 states
[Figure: trellis diagram over k = 0, ..., 8 with states 00, 01, 10, 11; solid branches for u_k = 0, dashed branches for u_k = 1, labeled with the code bits]
68
Channel coding
Terminated trellis diagram of a binary, non-recursive R = 7/8 · 1/2 convolutional encoder with 4 states
[Figure: terminated trellis diagram starting in state A and ending in state Z (both the all-zero state); termination with tail bits reduces the overall rate below 1/2]
69
Channel coding
Decoding of convolutional codes
A trellis diagram is a so-called directed graph.
The optimal decoder in the sense of maximum-likelihood decoding searches for the most probable sequence among all possible sequences within the trellis.
Hard-decision decoding:
The path with the smallest Hamming distance with respect to the received sequence is selected.
Soft-decision decoding:
The path with the smallest squared Euclidean distance with respect to the received sequence is selected.
This is an optimization problem: which path is the best among all possible paths in the trellis, where the path costs correspond to Hamming or squared Euclidean distances?
⇒ Viterbi algorithm
70
Channel coding
Viterbi algorithm
1. Initialization
Initialize all path metrics Λ_{−1}, where Λ_{−1} = 0 for the initial state and Λ_{−1} = ∞ for all other states.
2. Computation of the branch metrics
Compute the branch metrics λ_k^j, 0 ≤ k ≤ K − 1, for all paths.
Example: the squared Euclidean branch metric
λ_k^j = Σ_i (y_{i,k} − x_{i,k}^j)²,  0 ≤ k ≤ K − 1
where x_{i,k}^j are the encoded bits corresponding to the j-th path and y_{i,k} ∈ {0, 1} (hard decision) or y_{i,k} ∈ IR (soft decision) are the received values.
71
Channel coding
3. Add-compare-select operation
Add branch metrics: Λ_k^j = Λ_{k−1}^j + λ_k^j and Λ_k^{j′} = Λ_{k−1}^{j′} + λ_k^{j′}
Compare the path metrics Λ_k^j and Λ_k^{j′} of the paths merging in each state.
Select the best path: only the path with the smaller metric (the survivor) is kept.
4. Back search
If in a terminated trellis diagram the final state is reached, only the ML path remains. This path is traced backward (back search or trace-back) in order to obtain the decoded info bits û_k.
A code sketch of the complete algorithm is given below.
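A compact sketch of the Viterbi algorithm for the 4-state R = 1/2 example encoder (taps 5, 7, as assumed above) with the squared Euclidean branch metric; hard-decision decoding is obtained by feeding in hard ±1 values or by replacing the metric with the Hamming distance:

```python
def viterbi_decode(received, taps=((1, 0, 1), (1, 1, 1))):
    """Viterbi decoder for the terminated 4-state R = 1/2 example code (taps 5, 7).

    received: real values, two per trellis step, for antipodal mapping 0 -> +1, 1 -> -1.
    The last two trellis steps carry the zero tail bits. Returns the decoded info bits.
    """
    n_steps = len(received) // 2
    states = [(a, b) for a in (0, 1) for b in (0, 1)]      # (u_{k-1}, u_{k-2})
    INF = float("inf")
    metric = {s: (0.0 if s == (0, 0) else INF) for s in states}
    paths = {s: [] for s in states}

    for k in range(n_steps):
        y = received[2 * k: 2 * k + 2]
        new_metric = {s: INF for s in states}
        new_paths = {}
        for s in states:
            if metric[s] == INF:
                continue
            for u in (0, 1):
                window = (u, s[0], s[1])
                x = [sum(w * g for w, g in zip(window, gi)) % 2 for gi in taps]
                mod = [1.0 if bit == 0 else -1.0 for bit in x]   # antipodal mapping
                branch = sum((yi - xi) ** 2 for yi, xi in zip(y, mod))
                cand = metric[s] + branch
                nxt = (u, s[0])
                if cand < new_metric[nxt]:                       # add-compare-select
                    new_metric[nxt] = cand
                    new_paths[nxt] = paths[s] + [u]
        metric, paths = new_metric, new_paths

    return paths[(0, 0)][:-2]   # trace back from the all-zero end state, drop tail bits

# Noiseless check against the encoder sketch above: encode [1,0,1,1], map 0 -> +1, 1 -> -1
code_bits = [1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1]
rx = [1.0 if b == 0 else -1.0 for b in code_bits]
print(viterbi_decode(rx))       # [1, 0, 1, 1]
```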
72
Channel coding
Theorem:
For the AWGN channel model the squared Euclidean branch metric λ_k is optimal.
Proof:
For the AWGN channel model y_n = x_n + n_n, 0 ≤ n ≤ N − 1, where
p_{Y|X}(y_n|x_n) = 1/√(2πσ²) · exp(−(y_n − x_n)²/(2σ²)),  σ² = 1/(2E_s/N_0)
Therefore
p_{Y|X}(y|x) = 1/(2πσ²)^{N/2} · Π_{n=0}^{N−1} exp(−(y_n − x_n)²/(2σ²))
Hence
−ln(p_{Y|X}(y|x)) = (N/2) ln(2πσ²) + Σ_{n=0}^{N−1} (y_n − x_n)²/(2σ²) = a + b · Σ_{n=0}^{N−1} (y_n − x_n)²
where a and b > 0 are constant factors.
73
Channel coding
Viterbi algorithm (continued)
The optimal sequence in the sense of maximum-likelihood sequence estimation (MLSE) is
û_MLSE = arg max_{x̃} p_{Y|X̃}(y|x̃)
       = arg max_{x̃} ln(p_{Y|X̃}(y|x̃))
       = arg min_{x̃} [ −ln(p_{Y|X̃}(y|x̃)) ]
       = arg min_{x̃} [ a + b Σ_{n=0}^{N−1} (y_n − x̃_n)² ] = arg min_{x̃} Σ_{n=0}^{N−1} λ_n
       = arg min_{x̃} Σ_{k=0}^{K−1} Σ_i (y_{i,k} − x̃_{i,k})² = arg min_{x̃} Σ_{k=0}^{K−1} λ_k
Therefore, λ_k = Σ_i (y_{i,k} − x̃_{i,k})² is the desired branch metric.
74
Channel coding
Distance properties
Convolutional codes are linear, since the code bits are obtained by a linear operation from the info bits. Due to the linearity, the all-zeros sequence can be assumed as transmitted without loss of generality.
Definition of an error path:
An error path is a path which deviates from the all-zeros sequence for the first time in the k_1-th trellis segment and merges with the all-zeros sequence again in the k_2-th trellis segment, where k_2 > k_1.
The error correction capability of convolutional codes is related to the Hamming weight of the error paths, not to their lengths.
Definition of the free distance:
The free distance d_free is equal to the minimum Hamming weight of all error paths.
75
Channel coding
Rate-1/2 convolutional codes with maximum free distance
ν    Taps (octal)    d_free    Applications
2    5, 7            5
3    15, 17          6
4    23, 35          7         GSM
5    53, 75          8
6    133, 171        10        DAB, DVB, Satcom
Optimization of convolutional codes is done by computer search.
76
Channel coding
Rate-1/3 convolutional codes with maximum free distance
ν    Taps (octal)       d_free    Applications
2    5, 7, 7            8
3    13, 15, 17         10
4    25, 33, 37         12
5    47, 53, 75         13
6    133, 145, 171      14
77
Channel coding
Distance spectrum
The number of error paths with Hamming distance d with respect to the all-zeros sequence, d ≥ d_free, is denoted as a_d.
The corresponding total number of info bits equal to one on these error paths is denoted as c_d.
The list of values a_d versus d and c_d versus d is called the distance spectrum.
Example: rate-1/2 convolutional code with ν = 2
d    a_d                c_d
5    1                  1
6    2                  4
7    4                  12
8    8                  32
9    16                 80
d    2^{d−d_free}       (d − d_free + 1) · 2^{d−d_free}
78
Channel coding
Bit error probability for ML decoding
For binary antipodal transmission via an AWGN channel, the bit error probability for soft-decision ML decoding (or MAP decoding without a-priori information) can be lower- and upper-bounded as follows:
(1/2) erfc(√(d_free R E_b/N_0)) ≤ P_b ≤ (1/2) Σ_{d=d_free}^{∞} c_d erfc(√(d R E_b/N_0))
where R E_b = E_s.
The lower bound takes only the error event with the smallest Hamming distance into account, whereas the upper bound takes all error events into account (union bound).
For large E_b/N_0 both curves merge.
79
Channel coding
Bounds on the bit error probability for the rate-1/2 convolutional code with 64 states
[Figure: bit error probability versus E_b/N_0 in dB (0 to 10 dB, 10^−1 down to 10^−5). Code: rate-1/2, ν = 6. Channel: AWGN. Decoder: Viterbi algorithm. Curves: uncoded system, upper bound, lower bound; asymptotic coding gain: 7 dB.]
80
Channel coding
Definition of the coding gain for convolutional codes
The coding gain is the difference in decibel between the E_b/N_0 required for uncoded transmission and the E_b/N_0 required for coded transmission in order to obtain the same bit error probability.
For convolutional codes, the AWGN channel, and ML decoding, the asymptotic coding gain (i.e., for E_b/N_0 → ∞) is
G_asy = 10 log_10(d_free · R) dB
Examples:
The asymptotic coding gain of R = 1/2 convolutional codes with maximum free distance is
4 dB for ν = 2
5.44 dB for ν = 4
7 dB for ν = 6
81
Channel coding
Bounds on the bit error probability for rate-1/2 convolutional codes with memory lengths ν = 2, ..., 6
[Figure: bounds on the bit error probability versus E_b/N_0 in dB (0 to 12 dB, 10^0 down to 10^−10). R = 1/2 convolutional codes, AWGN channel, ML decoding. Curves: uncoded, ν = 2, ν = 4, ν = 6.]
82
Channel coding
Bounds on the bit error probability for rate-1/3 convolutional codes with memory lengths ν = 2, ..., 6
[Figure: bounds on the bit error probability versus E_b/N_0 in dB (0 to 12 dB, 10^0 down to 10^−10). R = 1/3 convolutional codes, AWGN channel, ML decoding. Curves: uncoded, ν = 2, ν = 4, ν = 6.]
83
Channel coding
Polynomial representation of convolutional codes
Definition of generator polynomials:
Non-recursive R = 1/n convolutional encoders can be described by the generator polynomials
g_i(D) = Σ_{j=0}^{ν} g_{i,j} D^j,  i = 1, ..., n
where g_{i,j} = 0 if the corresponding modulo-2 addition does not exist and g_{i,j} = 1 if it does.
Accordingly, the memory length is
ν = max_{1 ≤ i ≤ n} deg g_i(D)
Definition of the info polynomial:
The info polynomial (which may be of infinite length) is defined as follows:
u(D) = Σ_{k=0}^{∞} u_k D^k,  u_k ∈ {0, 1}
84
Channel coding
where u_0, u_1, ... are the info bits.
Encoding corresponds to a multiplication of the polynomials u(D) and g_i(D):
x_i(D) = u(D) · g_i(D)  for 1 ≤ i ≤ n
or, equivalently,
[x_1(D), ..., x_n(D)] = u(D) · [g_1(D), ..., g_n(D)] = u(D) · G(D)
Definition of the generator matrix:
G(D) := [g_1(D), ..., g_n(D)] is often dubbed the generator matrix.
The set of all code words can be written as
C = { u(D) · G(D) | u(D) = Σ_k u_k D^k, u_k ∈ {0, 1} }
Correspondingly, convolutional codes are linear.
85
Modified State Diagram
Example: non-recursive R = 1/2 convolutional encoder with 2^ν = 4 states (generator matrix G(D) = [1 + D², 1 + D + D²])
[Figure: state diagram and modified state diagram with states 00, 01, 10, 11; solid transitions for u_k = 0, dashed transitions for u_k = 1, labeled with the code bits; in the modified state diagram the all-zero state is split into a start state and an end state]
The computation of the distance spectrum can be done by means of the modified state diagram.
86
Channel coding
Catastrophic convolutional encoders
As an example, we consider the R = 1/2 convolutional code with generator matrix G(D) = [1 + D, 1 + D²], i.e., the taps (6, 5)_8.
The info sequence u^(0) = [0, 0, ..., 0] corresponds to the code sequence x^(0) = [0, 0, 0, 0, 0, ..., 0].
The info sequence u^(1) = [1, 1, ..., 1] corresponds to the code sequence x^(1) = [1, 1, 0, 1, 0, ..., 0].
An important observation is that two info sequences with infinite Hamming distance exist whose corresponding code sequences differ in only a finite number of bits.
Definition of catastrophic convolutional encoders:
A convolutional encoder is called catastrophic if two info sequences with infinite Hamming distance exist whose corresponding code sequences have a finite Hamming distance.
In the modified state diagram, catastrophic encoders are characterized by loops without any distance gain.
87
Recursive Convolutional Encoders
For each non-recursive convolutional encoder, a corresponding recursive convolutional encoder generating a code with the same free distance d_free can be constructed.
Example: recursive, systematic R = 1/2 convolutional encoder with 4 states
[Figure: recursive systematic encoder with input u_k, feedback shift register contents a_{k−1}, a_{k−2}, and outputs x_{1,k}, x_{2,k}]
Signal representation:
(1) x_{1,k} = u_k
(2) a_k = u_k ⊕ a_{k−2}
(3) x_{2,k} = a_k ⊕ a_{k−1} ⊕ a_{k−2}
Polynomial representation:
(1) x_1(D) = u(D), therefore g_1(D) = 1
(2) a(D) = u(D) + D² a(D)
(3) x_2(D) = a(D) + D a(D) + D² a(D)
(2) yields a(D) = u(D)/(1 + D²). Insertion into (3) yields:
x_2(D) = (1 + D + D²)/(1 + D²) · u(D), i.e., g_2(D) = (1 + D + D²)/(1 + D²)
88
The Compact Disc
Design of an audio CD
Sampling of the audio signal
Channel coding
Interleaving
Modulation
89
The Compact Disc
The audio CD was introduced on the market in 1982 and is the first mass product employing channel coding.
The high quality is essentially based on error correction and error concealment.
Data is stored in the form of
holes (pits) → scattering of the laser beam
plain surfaces (lands) → reflection of the laser beam
Spiral track of about 5 km (!) length
Length of a pit or a land, respectively: about 0.3 µm
Track width: about 0.6 µm
Distance between tracks: about 1 µm
Scanning speed: about 1.2 m/s
The scattered or reflected laser beam is evaluated with a photodiode.
90
The Compact Disc
Data rate of the channel bits: 4.3218 Mbit/s
Playtime 74 min ⇒ about 1.9 · 10^10 bits are stored (!)
A channel bit corresponds to a track length of about 0.3 µm
Scratches typically cause burst errors
Material defects typically cause single errors
Hence, the channel decoder must be able to correct single and burst errors
44.1 kHz sampling rate (20 kHz audio bandwidth)
16-bit A/D conversion ⇒ 2 · 16 · 44.1 · 10^3 = 1.4112 Mbit/s info data rate
6 samples from both stereo channels form an info word
8 bits are combined into one symbol (i.e., q = 2^8)
This results in an info word length of k = 2 · 6 · 16/8 = 24 symbols
91
The Compact Disc
Given a (255, 251, 5)_256 RS code, the following codes result by means of shortening:
(28, 24, 5)_256 RS code C_o
(32, 28, 5)_256 RS code C_i
(Shortening means suppressing info symbols)
Since RS codes are maximum-distance separable, the minimum distance of the shortened codes is also d_min = 5.
The total rate of the serially concatenated code is R = R_o · R_i = 24/32 = 3/4.
The inner decoder is able to correct two single symbol errors (e.g., material defects).
In order to convert burst errors into single errors, a convolutional deinterleaver (N = 28, J = 4, 8 bits/symbol) is used.
The single symbol errors (after deinterleaving) are corrected by the outer decoder.
Interleaver and deinterleaver each store N(N−1)J/2 = 1512 symbols (12096 bits).
92
The Compact Disc
In order to ease clock synchronization, a so-called eight-to-fourteen modulation (EFM) is used: between two consecutive ones, at least two and at most ten zeros must occur.
A one causes a transition from a pit to a land or vice versa.
Due to three coupling bits it is ensured that also in a continuous data stream at least two and at most ten zeros occur between two consecutive ones.
Additionally, 24 + 3 sync bits are inserted.
Altogether, 588 channel bits are generated per 192 info bits
⇒ the effective rate is (only) R_eff = 0.3265, although the code rate is R = 3/4.
In contrast to sampling, coding, modulation, and interleaving, the decoder has not been standardized.
Possible quality improvement: joint EFM demodulation and channel decoding.
93
Digital Transmission System
[Block diagram: source → source encoder → encryption → channel encoder → modulator → physical channel → demodulator → channel decoder → decryption → source decoder → sink; transmitter in the upper branch, receiver in the lower branch; info word u, code word x, received word y, decoded info word û]
94
Shannon's Information Theory
Claude E. Shannon (1948)
Source coding: Data compression
Cryptology: Data encryption
Channel coding: Error detection/correction/concealment
Separation theorem:
Source coding, encryption, and channel coding may be separated without information
loss (note that the separation theorem holds for very long data sequences only)
95
Examples of Source Coding, Encryption, and Channel Coding
Source coding:
1. Example: characters A-Z encoded with ⌈log_2(26)⌉ = 5 bits, no data compression
2. Example: characters A-Z encoded with data compression (e.g., Huffman algorithm)
No data compression: A → [00000]    With data compression: A → [11]
                     B → [00001]                           B → [001]
                     C → [00010]                           C → [0110]
                     ...                                   ...
Encryption:
Example: add a key word to each source code word modulo 2,
e.g., source code word [0110] (= C), key word [1010] (random sequence)
⇒ info word [0110] ⊕ [1010] = [1100]
Channel coding:
Example: (2,1) repetition code with code rate R = 1/2,
e.g., info word [1100] → code word [11 11 00 00]
96
Examples for Source Coding Techniques
Application      Rate without compression                    Rate with compression        Technique
Speech coding    64 kbit/s (8 kHz × 8 bit/sample)            13 ... 7 kbit/s              CELP
                                                             4.8 ... 2.4 kbit/s           vocoder
Audio coding     1.536 Mbit/s (2 × 48 kHz × 16 bit/sample)   256 ... 128 kbit/s           MPEG-1
                                                             96 kbit/s                    MPEG-2 AAC
Image coding     8 bit/pixel                                 0.25 ... 1.25 bit/pixel      JPEG
Video coding     625 Mbit/s (HDTV)                           24 Mbit/s                    MPEG-2
                 163.9 Mbit/s (SDTV)                         6 Mbit/s
Text             compression factor 3:1 ... 10:1                                          Lempel-Ziv
97
Fundamental Questions of Information Theory
Consider a discrete memoryless source. What is the minimum number of bits per source symbol, R, after lossless source encoding?
Answer: the entropy H.
On average, each symbol of a discrete-time source can be represented (and recovered with an arbitrarily small error) by R bits/source symbol if R ≥ H, but not if R < H.
What is the maximum number of bits per channel symbol, R, after channel encoding?
Answer: the channel capacity C.
On average, R randomly generated bits/channel symbol can be transmitted via a noisy channel with arbitrarily low error probability if R ≤ C, but not if R > C.
Symbols should not be transmitted individually. Instead, the channel encoder should map the info bits onto the coded symbols so that each info bit influences as many coded symbols as possible.
98
Source coding
Assume that a code word of length W_i is assigned to source symbol q_i.
The probability of q_i is p_i. The average code word length W̄ is given by
W̄(Q) = Σ_{i=1}^{L} p_i W_i
The lowest average bit rate is achieved with the code that gives the smallest W̄(Q). The lower bound is given by Shannon's first source coding theorem.
Theorem (Shannon's source coding theorem): Given the constraint that n → ∞, it is necessary and sufficient that lossless source encoding is done on average with
H(Q) = −Σ_{i=1}^{L} p_Q(q^(i)) log_2 p_Q(q^(i))
bits per source symbol.
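Both quantities are straightforward to compute; a sketch with an illustrative dyadic source (the probabilities below are not from the slides):

```python
from math import log2

def entropy(probs):
    """H(Q) = -sum p_i log2 p_i in bits per source symbol."""
    return -sum(p * log2(p) for p in probs if p > 0)

def average_length(probs, lengths):
    """Average code word length W = sum p_i W_i."""
    return sum(p * w for p, w in zip(probs, lengths))

p = [0.5, 0.25, 0.125, 0.125]              # illustrative dyadic probabilities
print(entropy(p))                          # 1.75 bits/symbol
print(average_length(p, [1, 2, 3, 3]))     # 1.75, meets the lower bound H(Q)
```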
99
Source coding
Optimal source coding satisfies:
H(Q) ≤ W̄(Q) ≤ H(Q) + 1
Remarks:
A code is only useful in a transmission system if every message can be uniquely decoded.
A sufficient, but not necessary, condition for a code to be uniquely decodable is the prefix condition.
The prefix condition states that for no two code words C_i and C_j can a binary sequence S be found such that C_i S = C_j, i.e., no code word is a prefix of another code word.
Example:
Code words 1111, 1110, 110, 10, 00, 010, 0110, and 0111.
100
Source coding
Huffman coding
The Huffman code is an optimal binary prefix-condition code. This means that there is no other uniquely decodable binary code with a smaller average code word length.
Huffman codes use a code tree constructed as follows (a code sketch follows the worked example below):
1. The source symbols are sorted in order of decreasing probabilities.
2. The two active nodes with the smallest probabilities are connected to the same node. The upper branch of each node is assigned a 1 and the lower branch a 0.
3. The resulting probability of each new node is obtained by adding the probabilities of the two active nodes (see 2.).
4. The resulting probability is treated as a symbol probability in the next coding step.
5. The code tree is complete when the last resulting probability is equal to one.
101
Source coding
Example: coding of a source with 8 symbols (A, B, C, ..., H)
[Figure: Huffman code tree for the probabilities below; intermediate node probabilities 0.04, 0.09, 0.16, 0.26, 0.4, 0.6, 1.0]
Symbol              A     B     C     D     E     F     G      H
p_i                 0.34  0.24  0.14  0.12  0.07  0.05  0.03   0.01
Code word           11    01    101   100   000   0011  00101  00100
Code word length    2     2     3     3     3     4     5      5
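A minimal sketch of the tree construction using a heap (heapq from the standard library); it reproduces the code word lengths of the table above, although the individual code words may differ because ties and the 0/1 branch assignment are arbitrary:

```python
import heapq

def huffman_code(probs):
    """Return a dict symbol -> binary code word (as a string)."""
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)      # two smallest probabilities
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "1" + w for s, w in c1.items()}
        merged.update({s: "0" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

probs = {"A": 0.34, "B": 0.24, "C": 0.14, "D": 0.12,
         "E": 0.07, "F": 0.05, "G": 0.03, "H": 0.01}
code = huffman_code(probs)
print({s: len(w) for s, w in code.items()})   # lengths 2, 2, 3, 3, 3, 4, 5, 5
```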
102
Source coding
Calculate the average code word length W̄.
Calculate the entropy of the source.
Is the redundancy of the source completely removed?
Remarks:
Huffman coding is often used in data compression.
Huffman encoding and decoding are usually done by using lookup tables.
For larger blocks of symbols these tables become prohibitively large.
Another disadvantage of Huffman coding is that its performance is sensitive to changes in the signal statistics.
If the statistics change and the code is not adapted, the bit rate will increase and may even exceed log_2(L) bits per symbol.
103
Source coding
Run-length coding
Run-length coding is useful if long subsequences, or runs, of the same symbol occur.
This is the case, for instance, if the probability density function of the input of a quantizer shows a sharp peak at zero.
Long sequences of zeros can then be expected at the output of the quantizer.
The idea of run-length coding is to detect runs of the same symbol and to assign to each run one code word that indicates its length.
An extensive statistical analysis of run-length coding is difficult.
Run-length coding is not used exclusively for sequences of independent symbols; it can also give good results if the symbols in a sequence are dependent.
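A generic run-length encoder simply replaces each run by a (symbol, length) pair; a minimal sketch of this generic idea (not the specific JPEG scheme described next):

```python
def runlength_encode(sequence):
    """Replace each run of identical symbols by a (symbol, run length) pair."""
    runs = []
    for s in sequence:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1
        else:
            runs.append([s, 1])
    return [tuple(r) for r in runs]

print(runlength_encode([0, 0, 0, 0, 5, 0, 0, 3, 3, 0]))
# [(0, 4), (5, 1), (0, 2), (3, 2), (0, 1)]
```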
104
Source coding
Example of run-length coding: Run-length coding is used, for example, in JPEG for encoding the AC coefficients. After quantization, many symbols in the sequence of AC coefficients are equal to zero. Towards the end of the sequence of AC coefficients the number of zeros is large.
For the run-length coding the following steps are performed:
1. The number of zeros between two non-zero AC coefficients (called the run length) is transmitted.
2. In JPEG the run length is between 0 and 15.
3. Non-zero AC coefficients are divided into categories.
4. From the categories of the AC coefficients and the run lengths, new data symbols are built.
105
Source coding
5. Two important symbols are
ZRL: denotes a run length of 15 followed by a 0 symbol
EOB: denotes the end of the block, i.e., it follows the last non-zero AC coefficient. All zeros after the EOB symbol are not transmitted.
106
Source coding
Lempel-Ziv coding
Lempel-Ziv coding is a universal coding scheme.
This means that it adapts to the signal statistics and therefore can be used without measuring statistics and designing codes accordingly in advance.
It is suitable for sources producing independent symbols as well as for sources that produce dependent symbols.
It is often employed in data compression algorithms used in computers to store data on a disk.
Many variations of the Lempel-Ziv algorithm exist.
In the following, the Lempel-Ziv (LZ78) algorithm is described:
1. The source sequence is divided into subsequences which are as short as possible and which did not occur before.
107
Source coding
Three parameters are introduced:
m: the total number of subsequences
Suffix: the last symbol of each subsequence
Prefix: the remaining symbols of each subsequence
2. The position of each prefix is encoded.
⌈log_2 m⌉ bits are needed in order to encode the position and 1 bit for the suffix, i.e., m · (1 + ⌈log_2 m⌉) bits are needed in total for a source sequence.
Example: Let [10110100010] be a source sequence of binary symbols.
How many bits do we need to encode one position?
What is the encoded sequence?
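A sketch of the LZ78 parsing step for this example sequence: it splits the source sequence into the shortest subsequences not seen before and represents each as a (prefix position, suffix) pair, where position 0 denotes the empty prefix. For this sequence there are m = 6 subsequences, so ⌈log_2 6⌉ = 3 bits suffice per prefix position.

```python
def lz78_parse(bits):
    """Split bits into the shortest, previously unseen subsequences (LZ78 step 1)
    and return (prefix position, suffix) pairs; position 0 is the empty prefix."""
    dictionary = {"": 0}
    phrases = []
    current = ""
    for b in bits:
        candidate = current + b
        if candidate in dictionary:
            current = candidate                 # keep extending the current phrase
        else:
            phrases.append((dictionary[current], b))
            dictionary[candidate] = len(dictionary)
            current = ""
    if current:                                 # trailing phrase already in the dictionary
        phrases.append((dictionary[current[:-1]], current[-1]))
    return phrases

print(lz78_parse("10110100010"))
# [(0, '1'), (0, '0'), (1, '1'), (2, '1'), (2, '0'), (4, '0')]
```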
