et4-030
Lecture Notes
January 2007
1) Introduction
2) Mathematical Basics
7) Convolutional Codes
8) Iterative Decoding
i. Turbo Codes
ii. LDPC Codes
9) Applications
Abbreviations
ARQ Automatic Repeat Request
ASCII American Standard Code for Information Interchange
AWGN Additive White Gaussian Noise
BCH Bose, Chaudhuri & Hocquenghem
BER Bit Error Rate
BPSK Binary Phase-Shift Keying
BSC Binary Symmetric Channel
BVD Big Viterbi Decoder
CCSDS Consultative Committee for Space Data Systems
CD Compact Disc
CRC Cyclic Redundancy Check
dB Decibel
EFM Eight-to-Fourteen Modulation
ESA European Space Agency
FEC Forward Error Correction
GF(q) Galois Field of order q
GSM Global System for Mobile Communications
LDPC Low-Density Parity-Check
LLR Log-Likelihood Ratio
MAP Maximum a Posteriori
ML Maximum Likelihood
NASA National Aeronautics and Space Administration
RS Reed-Solomon
RSC Recursive Systematic Convolutional
SISO Soft-Input Soft-Output
SNR Signal-to-Noise Ratio
SOVA Soft-Output Viterbi Algorithm
TDMA Time Division Multiple Access
UMTS Universal Mobile Telecommunications System
VA Viterbi Algorithm
Index
Automatic Repeat Request 1.20
BCH Code 4.16
Binary Asymmetric Channel 1.9
Binary Error-Erasure Channel 1.10
Binary Symmetric Channel 1.8
Block Code 3.3
Burst Error 1.11
Chase Decoding 6.12
Code Performance 1.21
Code Polynomial 4.3
Code Rate 3.3, 7.5
Codeword 3.3
Coding Gain 1.21
Coding Threshold 1.21
Communications Block Diagram 1.5, 6.3
Compact Disc 9.15
Concatenated Code 5.11
Constraint Length 7.5
Convolutional Code 7.3
Coset 3.13
Coset Leader 3.13
Covering Radius 9.22
Cyclic Code 4.7
Dual Code 3.9
Dual Space 2.21
Eight-to-Fourteen Modulation 9.17
Error Correction 1.15
Error Detection 1.17
Even-Weight Code 3.4
Extension 3.20
Extrinsic Information 8.14
Factorization 4.9
Fano Metric 7.17
Field 2.4
Finite Field 2.5
Football Pool 9.20
Forward Error Correction 1.20
Free Distance 7.7
Galois Field 2.5
Generator Matrix 3.7, 7.5
Generator Polynomial 4.3
Gilbert Model 1.12
Group 2.3
GSM 9.11
Hamming Code 3.11
Hamming Distance 3.5
Information Polynomial 4.3, 7.4
Information Vector 3.7
Inner Code 5.11
Interleaving 3.22
Irreducible Polynomial 2.9
Linear Block Code 3.4
Log-Likelihood Algebra 8.21
Log-Likelihood Ratio 8.21
Low-Density Parity-Check (LDPC) Code 8.29
Maximum a Posteriori Decoder 8.19
Memory 7.3
Minimal Polynomial 2.16
Outer Code 5.11
Parity-Check Equation 3.9
Parity-Check Matrix 3.9, 8.30
Parity-Check Polynomial 4.11
Planetary Standard Code 9.6
Polynomial Code 4.3
Product Code 5.3
Puncturing 3.19
Random Error 1.11
Recursive Systematic Convolutional 8.17
Reed-Solomon Code 4.20
Repetition Code 1.13
Runlength-Limited Sequences 9.16
Sequential Decoding 7.17
Shannon Limit 1.21
Shift Register Implementation 4.14, 7.4
Shortening 3.18
Soft-Decision Decoding 6.1
Soft-Output Viterbi Algorithm 8.19
Space and Satellite Communications 9.3
Stack Algorithm 7.19
Standard Matrix 3.15
State Diagram 7.7
Subspace 2.19
Syndrome Decoding 3.12, 4.13
Systematic Encoding 3.8, 4.5
Tanner Graph 8.35
Tree 7.15
Trellis 7.8
Turbo Code 8.16
Vector Space 2.18
Viterbi Decoding 7.11
Weight 3.5
Bibliography
Books
[Bos99] M. Bossert,
Channel Coding for Telecommunications,
Wiley, 1999, ISBN 0-471-98277-6
[Bur01] A. Burr,
Modulation and Coding for Wireless Communications,
Pearson Education, 2001, ISBN 0-201-39857-5
[Swe02] P. Sweeney,
Error Control Coding: From Theory to Practice,
Wiley, 2002, ISBN 0-470-84356-X
Key Papers
[BC60] R.C. Bose & D.K. Ray-Chaudhuri, “On a class of error correcting binary
group codes”, Information and Control, vol. 3, pp. 68-79, March 1960.
[Eli55] P. Elias, “Coding for Noisy Channels”, IRE Convention Record, vol. 3,
pt. 4, pp. 37-47, 1955.
[For73] G.D. Forney, Jr., “The Viterbi Algorithm”, Proceedings of the IEEE, vol.
61, pp. 268-278, March 1973.
[Ham50] R. Hamming, “Error Detecting and Error Correcting Codes”, Bell System
Technical Journal, vol. 29, pp. 147-160, 1950.
[RS60] I.S. Reed & G. Solomon, “Polynomial Codes over Certain Finite Fields”,
Journal of the Society for Industrial and Applied Mathematics, vol. 8, pp.
300-304, 1960.
et4-030 Error-Correcting Codes, Part 1: Introduction -1.1-
et4-030 Part 1
Error-Correcting Codes:
Introduction
Introduction 1
This part starts with some practical information concerning the course organization, and
then continues with a general introduction to the field of error-correcting codes.
et4-030 Error-Correcting Codes, Part 1: Introduction -1.2-
Focus
Introduction 2
The goal of this course is to provide a broad overview of error correction techniques.
Such techniques are applied in order to protect information against errors which may
occur during transmission or storage. The emphasis will be on the basic trade-offs
between efficiency, reliability, and complexity. Most results are presented without proofs
(the interested reader may consult books from the bibliography).
et4-030 Error-Correcting Codes, Part 1: Introduction -1.5-
[Communications block diagram: Agent 1 → Input Device → Data Reduction → Source Coding → Encryption → Channel Coding → Modulation → Channel → Demodulation → Channel Decoding → Decryption → Source Decoding → Data Reconstruction → Output Device → Agent 2]
Introduction 5
In this course we focus on channel coding aspects. Hence, the transmitting agent, input
device, data reduction, source coding, and encryption are considered as one entity: the
source. Similarly, at the receiving side, the decryption, source decoding, data
reconstruction, output device, and receiving agent are considered as one entity as well: the
sink or destination. The modulation and demodulation functionalities may be considered
as a part of the channel. Such a (discrete) channel transports symbols, mostly bits, rather
than waveforms.
Throughout these lecture notes we mostly speak of “transmitting” and “receiving” of data,
indicating that most applications of channel coding techniques are in telecommunications.
However, these techniques may also be applied to data storage applications, e.g.,
magnetic or optical disks. Instead of transporting data from one place to the other,
storage may be considered as transport through time.
et4-030 Error-Correcting Codes, Part 1: Introduction -1.7-
[Encoding and transmission example: code C = {00000, 01011, 10101, 11110}; codeword x = 01011 is sent over the channel and y = 00011 is received]
Introduction 7
A source generates a message u, mostly consisting of bits. In the example the message
contains two bits, so there are 2^2 = 4 possible messages, for example representing the wind
directions:
00 NORTH, 01 EAST, 10 SOUTH, 11 WEST
The message is encoded into a codeword x, in this example by adding three bits to u,
leading to a sequence of five bits, according to a code C as indicated. Next, these bits are
transported over the channel, and the five bit estimates collected in the received word y
are delivered at the receiving side. In the example, the estimate of the second bit is not
correct, for example due to noise added to the transmitted waveform in the channel. The
decoder needs to produce a message estimate v, based on its knowledge of the code C and
the received word y. Here, the decoder chooses the codeword 01011, which resembles y
the most, and thus v is 01, which is indeed the original message. Hence, the error has
been corrected.
Note that without using a code, any error would immediately lead to an incorrect wind
direction. So using the code makes the system more reliable, at the expense of
transmitting more bits. However, if too many errors occur, also the coded system may
fail. For example, when receiving 00010 (errors in second and fifth bit), the closest
codeword is 00000, and thus the message estimate would be 00 (NORTH) rather than 01
(EAST).
et4-030 Error-Correcting Codes, Part 1: Introduction -1.8-
[Binary symmetric channel: inputs 0/1, outputs 0/1; crossover probability p, probability of correct reception 1−p]
Introduction 8
In order to be able to evaluate the performance of coded systems, we often assume the
channel to behave according to a certain model. A very simple and popular model is the
binary symmetric channel, for which the probability of a transmitted 1 to be received as 0
equals the probability of a transmitted 0 to be received as 1. This channel bit error
probability is denoted by p (0<p<1/2). Of course, the probability of correct reception is
then 1-p. The value of p depends on the underlying modulation method and noise type.
For example, for binary phase-shift keying (BPSK), with bit energy E, and additive white
Gaussian noise (AWGN), with variance N0/2, we have
p = Q(√(2E/N0)),
where Q(z) is the famous Gaussian tail function.
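For illustration, this crossover probability can be evaluated in a few lines of Python, using the relation Q(z) = erfc(z/√2)/2 (taking the energy-to-noise ratio in dB is an assumption of this sketch, not part of the notes):

import math

def q_func(z):
    # Gaussian tail function Q(z) = 0.5 * erfc(z / sqrt(2))
    return 0.5 * math.erfc(z / math.sqrt(2))

def bsc_crossover(e_n0_db):
    # Channel bit error probability p = Q(sqrt(2*E/N0)) for BPSK on AWGN,
    # with E/N0 given in dB
    e_n0 = 10 ** (e_n0_db / 10)
    return q_func(math.sqrt(2 * e_n0))

print(bsc_crossover(9.6))   # roughly 1e-5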
et4-030 Error-Correcting Codes, Part 1: Introduction -1.9-
[Binary asymmetric channel diagram]
Introduction 9
[Binary error-erasure channel diagram: two inputs (0 and 1), three outputs (0, erasure, 1)]
Introduction 10
Instead of always making a decision between zero and one at the receiver side, we can
also prefer to declare the received bit as an erasure. For example, this may be done when
the demodulator has a tough job in deciding from the received waveform whether it
represents a zero or a one. This leads to the channel model shown above, with two inputs
and three outputs.
et4-030 Error-Correcting Codes, Part 1: Introduction -1.11-
Random/Burst Errors
memoryless channels: random errors
example
transmitted: 01010000011101010100111001100010010
received: 01010010011101010100110001101010010
Introduction 11
In a channel without memory, the noise influences each of the transmitted symbols
independently from the previous or next symbols. This leads to so-called random errors.
A typical example is shown above.
While burst errors occur very frequently in practice, most emphasis in the design of
channel codes is on the correction of random errors. This apparent discrepancy is solved
by a technique called interleaving, which will be discussed in Part 3.
et4-030 Error-Correcting Codes, Part 1: Introduction -1.12-
Gilbert Model
Introduction 12
The Gilbert model is a simple way to model burst errors, possibly in combination with
random errors. It consists of two states: a good one and a bad one. Both states are
represented by a binary symmetric channel (BSC). In the good state the channel bit error
probability is very close to 0, while in the bad state it is considerably larger. The model is
completed by transition probabilities between the states.
et4-030 Error-Correcting Codes, Part 1: Introduction -1.13-
[Majority decoding of the length-3 repetition code over a BSC (p = 0.01): Pcorrect = 0.9997]
This simple example illustrates the trade-off between efficiency and reliability. If we want
to send a 1-bit message over a binary symmetric channel for which 1% of the bits is
received in error, then the probability that an incorrect message arrives at the destination
is 0.01 in case no coding is applied. By using a code we can improve this reliability. Here
we apply the simple coding concept of just repeating the information symbol two more
times, thus leading to the codewords 000 and 111. In general, a binary repetition code of
length n consists of two codewords: the all-zero and the all-one sequences of length n.
At the receiver one of the eight possible binary vectors of length 3 comes out of the
demodulator. If x=000=0 was transmitted, then the probabilities of receiving y are
P(y=000|x=0) = (1-p)^3 ≈ 0.9703,
P(y=100|x=0) = P(y=010|x=0) = P(y=001|x=0) = p(1-p)^2 ≈ 0.0098,
P(y=110|x=0) = P(y=101|x=0) = P(y=011|x=0) = p^2(1-p) = 9.9×10^-5,
P(y=111|x=0) = p^3 = 10^-6.
Similar expressions can be derived for x=111. Because of the symmetry we may assume
without loss of generality that x=0 in the error analysis. The obvious decoding rule is to
take a majority decision over the three received symbols. Hence, when x=0, a decoding
error occurs if and only if y is 110, 101, 011, or 111, and so the probability of a decoding
error is about 0.0003. This is considerably lower than the error probability 0.01 in case of
no coding. The improvement comes at the price of transmitting three times as many bits.
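These numbers are easy to verify numerically; a small Python check with p = 0.01:

p = 0.01
p_correct = (1 - p)**3 + 3 * p * (1 - p)**2   # zero or one error: majority vote is right
p_error = 3 * p**2 * (1 - p) + p**3           # two or three errors: majority vote is wrong
print(p_correct, p_error)                     # 0.999702 0.000298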
et4-030 Error-Correcting Codes, Part 1: Introduction -1.14-
This example shows that just repeating the message a number of times is not the most
efficient thing to do. Here, the messages consist of three bits. In Code 1 protection is
provided by sending a message three times, so nine bits are to be transmitted. In Code 2
we add three bits to the message: if the message is u = (u1,u2,u3), then the codeword is
x = (u1, u2, u3, u2+u3, u1+u3, u1+u2).
Applying the decoding rule of choosing the codeword which resembles the received word
y the most, both codes can correct one error, because of the fact that any two different
codewords from the same code differ in at least three positions. However, neither of the
codes guarantees the correction of two errors. Suppose we use Code 1, transmit 0, and
receive 100100000. Indeed the decoded codeword is 100100100 and we have a decoding
error. Similarly for Code 2: if we transmit 0 and receive 000110, the decoded codeword is
001110. Both codes are called single error-correcting codes.
In conclusion, both codes have similar error correction properties, but Code 2 is much
more efficient: only six bits rather than nine have to be transmitted or stored per three-bit
message.
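A minimal sketch of Code 2 with minimum-Hamming-distance decoding (the function names are chosen here only for illustration) reproduces the two-error example:

from itertools import product

def encode2(u):
    # Code 2: append the three parities u2+u3, u1+u3, u1+u2 (mod 2) to the message
    u1, u2, u3 = u
    return (u1, u2, u3, (u2 + u3) % 2, (u1 + u3) % 2, (u1 + u2) % 2)

code2 = {u: encode2(u) for u in product((0, 1), repeat=3)}

def decode2(y):
    # choose the message whose codeword resembles y the most (minimum Hamming distance)
    return min(code2, key=lambda u: sum(a != b for a, b in zip(code2[u], y)))

print(decode2((0, 0, 0, 1, 1, 0)))   # (0, 0, 1): two channel errors cause a decoding error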
et4-030 Error-Correcting Codes, Part 1: Introduction -1.15-
Error Correction
[Disjoint spheres of radius t around the codewords of C]
Introduction 15
In the above plot the centers of the circles represent codewords from code C of length n.
The circles themselves represent all vectors of length n which differ from the center
codeword in at most t positions. Using the decoding rule of choosing the codeword which
resembles the received word the most, the code C is able to correct up to t errors if and
only if all these circles are disjoint.
et4-030 Error-Correcting Codes, Part 1: Introduction -1.16-
[The eight binary vectors of length 3, grouped into the two radius-1 decoding spheres around the codewords 000 and 111]
Introduction 16
Here the general error correction principle is illustrated for the repetition code of length 3.
As discussed before, this is a single error-correcting code.
et4-030 Error-Correcting Codes, Part 1: Introduction -1.17-
Error Detection
[Spheres of radius s around the codewords of C; each sphere contains only one codeword (its center)]
Introduction 17
In error detection, the decoder merely checks whether the received word y is a codeword
or not. If y is a codeword, then the receiver assumes that no errors have occurred, else it
knows for sure that something is wrong ("error(s) have been detected"). Again, the centers
of the circles represent codewords from code C of length n. The circles themselves
represent all vectors of length n which differ from the center codeword in at most s
positions. The code C is able to detect up to s errors if and only if all circles contain only
one codeword (the center!).
In case error(s) have been detected, the receiver has various options for how to proceed. If
there is a return channel to the sender, the receiver may ask for a retransmission. If there
is no return channel, the receiver may simply ignore the message and do an interpolation
using previous and subsequent messages (for example in audio or video applications).
In fact, for one and the same code, a decoder may choose to use a detection mode or a
correction mode, or even a combination of the two. For applications which require
extremely low bit error probabilities, such as storage or transmission of sensitive data, the
detection mode may be preferred, using Automatic Repeat Request (ARQ) protocols.
Such protocols work with error detection mechanisms, acknowledgements, and
retransmissions.
et4-030 Error-Correcting Codes, Part 1: Introduction -1.18-
[Error detection with the repetition code {000, 111}: the spheres of radius 2 around the codewords 000 and 111 each contain only one codeword]
Introduction 18
Here we show how the repetition code {000,111} of length 3 is used in an error detection
mode. If 000 or 111 is received, then it is assumed that transmission has been error-free.
If any other binary vector of length 3 is received, then errors are detected. Hence, one or
two errors are always detected, while three errors are not detected. Therefore, we call this
a double error-detecting code. Remember that the same code was single error-correcting.
et4-030 Error-Correcting Codes, Part 1: Introduction -1.19-
[ARQ with the length-3 repetition code over a BSC (p = 0.01): if 000 is received, decide 0; if 111 is received, decide 1; any other received word leads to a request for a retransmission]
Introduction 19
We revisit the binary repetition code example, but this time we do the decoding in an
Automatic Repeat Request (ARQ) mode. Again, either 000 or 111 is transmitted. If 000
or 111 is received, then the decoder assumes that no errors occurred during transmission.
If any other sequence of length 3 is received, the decoder is absolutely sure that errors did
occur during transmission, and it asks for a retransmission. Since
P(y=000|x=0) = (1-p)^3 ≈ 0.9703,
P(y=100|x=0) = P(y=010|x=0) = P(y=001|x=0) = p(1-p)^2 ≈ 0.0098,
P(y=110|x=0) = P(y=101|x=0) = P(y=011|x=0) = p^2(1-p) = 9.9×10^-5,
P(y=111|x=0) = p^3 = 10^-6,
and since P(y|x=1)=P(y+1|x=0), we have that
Pcorrect = (1-p)^3 ≈ 0.9703,
Pretr = 3p(1-p)^2 + 3p^2(1-p) = 0.0297,
Perror = p^3 = 10^-6,
after one transmission.
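Again these probabilities are easy to reproduce numerically (p = 0.01):

p = 0.01
p_correct = (1 - p)**3                        # received word equals the transmitted codeword
p_retr = 3*p*(1 - p)**2 + 3*p**2*(1 - p)      # any non-codeword received: retransmission
p_error = p**3                                # all three bits flipped: undetected error
print(p_correct, p_retr, p_error)             # 0.970299 0.0297 1e-06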
et4-030 Error-Correcting Codes, Part 1: Introduction -1.20-
ARQ FEC
retransmissions no retransmissions
Introduction 20
As demonstrated by the repetition code example, it is clear that a code can be decoded in
an Automatic Repeat Request (ARQ) mode or in a Forward Error Correction (FEC) mode. The
former is detection-based, while the latter is correction-based. When comparing the two,
first of all note that ARQ requires a return channel, while FEC does not. Since there are
no retransmissions in FEC, the total time required to transmit a message is fixed. In case
of ARQ, the possible retransmissions cause an extra delay and a variable transmission
time.
Since in ARQ the errors only have to be detected (and not corrected), the decoding
complexity (i.e., the number of operations/calculations to be performed in the decoding
process) is typically much lower than for FEC. Furthermore, as seen in the example, ARQ
leads to much lower bit error probabilities.
Because of the listed properties, it is no surprise that ARQ techniques are mainly
implemented in text and data applications, while FEC techniques can be found in real-
time applications like voice, audio, and video. A further discussion of ARQ is beyond the
scope of these lecture notes; the interested reader is referred to the literature, e.g., [RC99].
et4-030 Error-Correcting Codes, Part 1: Introduction -1.21-
Code Performance
• Bit error probability Pe as a function of Eb/N0;
Example: BPSK modulation, AWGN channel,
then Pe = Q(√(2Eb/N0)), with Q(z) = (1/√(2π)) ∫_z^∞ e^(−x^2/2) dx
Introduction 21
The reliability of digital communication systems is often expressed in terms of bit error
probability Pe, i.e., the probability that a bit delivered to the sink is different from the
corresponding bit generated by the source. This probability depends on the modulation
and coding methods, on the behavior of the channel, and on the received signal-to-noise
ratio (SNR). For example, for BPSK modulation on the AWGN channel, Pe =
Q(√(2Eb/N0)), where Eb is the energy per information bit, N0 is the noise power spectral
density, and Q(z) is the Gaussian tail function.
When coding is applied, less energy may be required in order to achieve a certain bit error
probability than without coding. The difference in Eb/N0, expressed in decibels (dB), is
called the coding gain. A coded system does not always perform better than an uncoded
system. When Eb/N0 is below a certain coding threshold, which may depend on the
modulation and coding methods, the performance of the coded system may actually be
worse (in spite of the added redundancy).
The ultimate goal for a communication systems engineer is to get close to the Shannon
limit. In his famous 1948 paper [Sha48], Shannon has shown that error-free
communication on an AWGN channel is possible at an Eb/N0 of only –1.59 dB.
Unfortunately, Shannon’s proof is not constructive and allowed coding and modulation
methods of infinite length/complexity. However, using the recently developed turbo
coding methods (see [BGT93] and Part 8), it is possible to get very close to the Shannon
limit at a reasonable and practicable complexity.
et4-030 Error-Correcting Codes, Part 1: Introduction -1.22-
[Performance plot: bit error probability Pe versus Eb/N0 (in dB) for BPSK without coding and for BPSK with (63,51) BCH coding; the indicated coding gains are 2.3 dB at Pe = 10^-5 and 3.1 dB at Pe = 10^-9, and the two curves cross near Eb/N0 = 3.7 dB]
Introduction 22
This example illustrates the issues concerning code performance. One curve shows the bit
error probability as a function of Eb/N0 for BPSK modulation on the AWGN channel,
without the use of an error-correcting code. The other curve shows the performance using
a certain BCH code (which is explained in Part 4).
In order to achieve a bit error probability of 10^-5, the option without coding requires Eb/N0
= 9.6 dB, while the option with coding requires only Eb/N0 = 7.3 dB. Hence, the coding
gain is 2.3 dB. Similarly, the coding gain at Pe = 10^-9 turns out to be 3.1 dB. The
asymptotic coding gain (at Eb/N0 approaching infinity) for this example is 3.9 dB.
Further, note that for an Eb/N0 below 3.7 dB, applying this BCH code is useless, since it
performs worse than the no coding option.
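The 9.6 dB figure for the uncoded option can be checked with a short Python scan, assuming the BPSK expression Pe = Q(√(2Eb/N0)) from the previous slide:

import math

def pe_uncoded(eb_n0_db):
    # Uncoded BPSK on AWGN: Pe = Q(sqrt(2*Eb/N0)) = 0.5*erfc(sqrt(Eb/N0))
    eb_n0 = 10 ** (eb_n0_db / 10)
    return 0.5 * math.erfc(math.sqrt(eb_n0))

x = 0.0
while pe_uncoded(x) > 1e-5:   # scan for the Eb/N0 where Pe drops to 1e-5
    x += 0.01
print(round(x, 2))            # about 9.6 dB, matching the figure quoted above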
et4-030 Error-Correcting Codes, Part 1: Introduction -1.23-
Course Overview
Mathematical Basics (Part 2)
Block Codes: Info Block → Code Block (Parts 3, 4, 5, 6)
Convolutional Codes: Info Stream → Code Stream (Part 7)
• Created by students
George Mathijssen and Arjen van Rhijn
Introduction 24
Several students have developed demonstrators to show the usefulness and working of
error-correcting codes. In the one indicated above, it is shown how coding can be used to
deliver images of quite a good quality to a user, in spite of rather poor channel
conditions.
Other demos may be found in the Assignments Section on Blackboard.
et4-030 Error-Correcting Codes, Part 1: Introduction -1.25-
[Exercise: Pretr and Perror]
Introduction 25
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 1: Introduction -1.26-
Decoding Exercise
For the code (see Slide 1.7)
00000
01011
10101
11110
decode the following received words:
a) y = 10101
b) y = 01110
c) y = 01100
Introduction 26
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 2: Mathematical Basics -2.1-
et4-030 Part 2
Mathematical Basics
Mathematical Basics 1
Contents
• Group
• Finite Field GF(q)
• Polynomials over GF(q)
• Construction of GF(q)
• Properties of GF(q)
• Vector Space
Mathematical Basics 2
As can be seen from the Table of Contents, the emphasis in this part is on finite fields,
also called Galois Fields (GF). These are sets with a finite number of elements on which
two operations are defined, satisfying certain rules. A Galois Field with q elements is
denoted by GF(q).
et4-030 Error-Correcting Codes, Part 2: Mathematical Basics -2.3-
Group
Set G with operation ∗ such that
(i) ∀ a,b ∈ G: a∗b ∈ G
(ii) ∀ a,b,c ∈ G: a∗(b∗c) = (a∗b)∗c
(iii) ∃ e ∈ G: a = a∗e = e∗a  ∀ a ∈ G
(iv) ∀ a ∈ G ∃ a^-1 ∈ G: a∗a^-1 = a^-1∗a = e
Mathematical Basics 3
Examples of groups:
• The integer numbers with operation + (addition): identity element is 0, inverse of
5 is –5, etc.
• Numbers {0, 1, 2, 3} with addition modulo 4: identity element is 0, inverse of 0
is 0, inverse of 1 is 3, inverse of 2 is 2, inverse of 3 is 1.
Note that the set of integer numbers with operation × (multiplication) is not a group: there
is an identity element 1, but what would be the inverse of for example the element 5? In
conclusion, a set can only be said to be a group or not to be a group if the operation is
specified: the integer numbers are a group under the operation addition, but not a group
under the operation multiplication.
et4-030 Error-Correcting Codes, Part 2: Mathematical Basics -2.4-
Field
Set F with operations + and × such that
(i) F, + is an Abelian Group with identity element e=0
(ii) F \ {0}, × is an Abelian Group
(iii) ∀ a,b,c ∈ F: a × (b + c) = a × b + a × c
Examples:
• the real numbers with the usual + and × operations
• {0,1} with the usual binary + and × operations
Mathematical Basics 4
A set F on which two operations, let’s say an addition + and a multiplication ×, are
defined is called a field (in Dutch known as lichaam) if it satisfies requirements (i) to (iii).
The first requirement says that F under the operation + must be an Abelian Group. The
second requirement states that F, exclusive the identity element under addition, should be
an Abelian group under multiplication. Finally, the third requirement is known as the
distributive law.
Examples of fields:
• The real numbers with the usual + and × operations: identity element is 0 under
addition and 1 under multiplication, additive inverse of 5 is –5, multiplicative
inverse of 5 is 1/5, etc.
• Numbers {0, 1} with addition and multiplication modulo 2: identity element is 0
under addition and 1 under multiplication, additive inverse of 0 is 0, additive
inverse of 1 is 1, and multiplicative inverse of 1 is also 1.
et4-030 Error-Correcting Codes, Part 2: Mathematical Basics -2.5-
Finite Field
Field with a finite number of elements
• Number of elements: order
• Order is always a power of a prime number: q = p^m
• Field of order q: GF(q) (“Galois Field”)
• The order of element a (≠ 0): smallest n > 0 such that a^n = 1
• A primitive element of GF(q): element a of order q−1;
such an element generates GF(q): 0, a^1, a^2, …, a^(q-1) = 1
• GF(p), p prime: {0,1,…,p-1} with calculations mod p
Mathematical Basics 5
In the context of error-correcting codes we are mainly interested in finite fields. In such a
field the number of elements, called the order of the field, is finite. It can be shown that
the order of a field is always a power of a prime number. Remember that a prime number
is an integer number, which is at least equal to 2 and which is only divisible by 1 and by
itself (2, 3, 5, 7, 11, etc.). Hence, there exist fields of order 2, 3, 4 = 2^2, 5, 7, 8 = 2^3, 9 = 3^2, 11,
etc., but not of order 6, 10, 12, etc. A field of order q is often denoted as GF(q), where GF
stands for Galois Field, after the French mathematician Évariste Galois.
Besides the order of the field itself, we may also speak of the order of an element of the
field (except for the additive identity element 0). This is defined as the smallest positive
number n such that the element raised to the power n gives the multiplicative identity
element 1. The powers of an element of order q−1 generate all elements in GF(q) except
0. Therefore, an element of order q-1 is called a primitive element.
In case q=p is prime GF(p) is easily constructed by just taking the numbers 0, 1, …, p-1
and doing the additions and multiplications modulo p. See the next page for the p=7
example. The construction of GF(q) in case q is not prime is more complicated and will
be discussed later.
et4-030 Error-Correcting Codes, Part 2: Mathematical Basics -2.6-
Note that GF(7) can be generated by the elements 3 or 5, but not by 1, 2, 4, and 6:
• 2^1=2, 2^2=4, 2^3=1, 2^4=2, 2^5=4, 2^6=1, 2^7=2, 2^8=4, 2^9=1, … (hence only the elements 1, 2,
and 4 are powers of 2 in GF(7))
• 3^1=3, 3^2=2, 3^3=6, 3^4=4, 3^5=5, 3^6=1 (hence all non-zero elements are powers of 3 in
GF(7))
• etc.
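A small Python sketch that computes the order of every nonzero element of GF(7) confirms this:

q = 7
for a in range(1, q):
    powers, x = [], 1
    for _ in range(q - 1):
        x = (x * a) % q       # successive powers of a modulo 7
        powers.append(x)
    order = powers.index(1) + 1
    print(a, "has order", order, "(primitive)" if order == q - 1 else "")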
et4-030 Error-Correcting Codes, Part 2: Mathematical Basics -2.7-
Example: GF(7)
u(x) = x^2 + 5x + 2, v(x) = 2x + 5
u(x) + v(x) = x^2
u(x) − v(x) = x^2 + 3x + 4
u(x) v(x) = 2x^3 + x^2 + x + 3
Mathematical Basics 7
Just like polynomials over the real (or complex) numbers, we can also define polynomials
over GF(q). The highest power of x with a non-zero coefficient is called the degree of the
polynomial (u(x) in the example above is of degree 2, v(x) of degree 1, u(x)v(x) of degree
2+1=3). When adding, subtracting, multiplying, or dividing polynomials, all operations
are done in GF(q). Note that in GF(2) addition and subtraction are the same.
et4-030 Error-Correcting Codes, Part 2: Mathematical Basics -2.8-
Polynomial Division
u(x) / v(x):
u(x) = q(x) v(x) + r(x)
with degree r(x) < degree v(x)
Example: GF(7)
u(x) = x^2 + 5x + 2, v(x) = 2x + 5
u(x) = (4x+3) v(x) + 1
Hence, u(x) is not divisible by v(x)
Mathematical Basics 8
Explanation: first we concentrate on the terms with the largest power of x in u(x) (i.e., x^2)
and v(x) (i.e., 2x). The first term in q(x) is chosen such that this term multiplied with 2x
gives x^2. Hence this term must be 4x, which we multiply by v(x) and subtract from u(x),
which gives 6x+2. Etc.
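The same long division can be carried out programmatically; the following sketch (coefficients stored lowest-degree first, a representation chosen only for this illustration) reproduces the quotient 4x+3 and remainder 1:

P = 7                                   # we work over GF(7)

def poly_divmod(u, v):
    # Long division of u(x) by v(x) over GF(P); returns (quotient, remainder).
    r = u[:]                            # remainder, reduced step by step
    q = [0] * (len(u) - len(v) + 1)
    inv_lead = pow(v[-1], P - 2, P)     # inverse of the leading coefficient of v (P prime)
    for i in range(len(u) - len(v), -1, -1):
        q[i] = (r[i + len(v) - 1] * inv_lead) % P
        for j, c in enumerate(v):
            r[i + j] = (r[i + j] - q[i] * c) % P
    return q, r

u = [2, 5, 1]      # x^2 + 5x + 2
v = [5, 2]         # 2x + 5
print(poly_divmod(u, v))   # ([3, 4], [1, 0, 0]): quotient 4x + 3, remainder 1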
et4-030 Error-Correcting Codes, Part 2: Mathematical Basics -2.9-
Irreducibility
Example: GF(2)
x^2 + x + 1 is irreducible
x^3 + x + 1 is irreducible
x^2 + 1 is reducible (x^2 + 1 = (x+1)(x+1))
Mathematical Basics 9
Just like in number theory any positive integer can be uniquely written as a product of
prime numbers, polynomials can be written as the product of irreducible
polynomials. The latter are polynomials which cannot be further decomposed into
polynomials of smaller degree. Such a decomposition into irreducible factors may also be
used to determine all possible divisors.
Number example:
60 = 2 × 2 × 3 × 5
Hence the divisors of 60 are 1, 2, 3, 5, 2×2=4, 2×3=6, 2×5=10, 3×5=15,
2×2×3=12, 2×2×5=20, 2×3×5=30, 2×2×3×5=60
Addition and multiplication tables modulo 4 (note that 2 has no inverse under multiplication!):

+ | 0 1 2 3        × | 0 1 2 3
0 | 0 1 2 3        0 | 0 0 0 0
1 | 1 2 3 0        1 | 0 1 2 3
2 | 2 3 0 1        2 | 0 2 0 2
3 | 3 0 1 2        3 | 0 3 2 1
Recall that in case q is prime GF(q) is easily constructed by just taking the numbers 0, 1,
…, q-1 and doing the additions and multiplications modulo q. From the example above it
becomes clear that this simple approach does not work in case q is not prime. In general,
if q = p^m, with p prime and m ≥ 2, the element p would not have a multiplicative inverse
under this approach, since there is no number b such that bp is equivalent to 1 modulo p^m.
Therefore, we need to come up with another construction method for GF(q) in case q = p^m
with m ≥ 2.
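For p = 2 and m = 2 this is a one-line check:

print([b for b in range(4) if (2 * b) % 4 == 1])   # []: 2 has no multiplicative inverse mod 4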
et4-030 Error-Correcting Codes, Part 2: Mathematical Basics -2.11-
The construction method for GF(q) presented here shows strong similarities with the
construction of complex number from real numbers. Remember that the real polynomial
x^2+1 has no roots in the real numbers, i.e., there is no real value x which satisfies the
equation x^2+1 = 0. Then, the imaginary number i was defined as a root of this polynomial:
i^2+1 = 0, or i^2 = −1, or i = √(−1). Then the complex numbers were defined as numbers of the
form a+bi, with a and b being real numbers. In this way the real numbers were
extended to the complex numbers.
In a similar way the field GF(p) with p prime can be extended to a field GF(p^m) with m ≥ 2.
Let q = p^m. We use an irreducible polynomial p(x) over GF(p) which is a divisor of x^(q-1) − 1,
but not of x^j − 1 for any j < q−1. Such a polynomial is said to be primitive. Now just like
we defined the imaginary number i, define α as a root of p(x), i.e., p(α) = 0. The elements
of GF(q) are defined in the format
b0 + b1α + b2α^2 + … + bm-1α^(m-1)
where the bk are elements from GF(p). Alternatively, the elements may be represented as
the m-tuple (b0, b1, b2, …, bm-1). It can be shown that the sequence 0, α^1, α^2, α^3, …, α^(q-1)
generates exactly the same set of elements. Hence, GF(p^m) can be represented by an m-
tuple representation, which we use for the operation `addition’, and by a power (of α)
representation, which we use for the operation `multiplication’. Under these rules all the
requirements of a field are satisfied. The identity element under addition is (0,0,…,0) (in
m-tuple representation) or just 0 (in power representation), while the identity element
under multiplication is (1,0,…,0) or just 1.
et4-030 Error-Correcting Codes, Part 2: Mathematical Basics -2.12-
Power and 2-tuple representations of GF(4):
0 ↔ (0,0),  α^1 = α ↔ (0,1),  α^2 = 1+α ↔ (1,1),  α^3 = 1 ↔ (1,0)

+   | 0    α    α^2  1          ×   | 0    α    α^2  1
0   | 0    α    α^2  1          0   | 0    0    0    0
α   | α    0    1    α^2        α   | 0    α^2  1    α
α^2 | α^2  1    0    α          α^2 | 0    1    α    α^2
1   | 1    α^2  α    0          1   | 0    α    α^2  1
Mathematical Basics 12
As an example we consider the case q=4, i.e., p=2 and m=2. The only binary primitive
polynomial of degree 2 is x^2+x+1, since the other candidates are all reducible (x^2 = x·x,
x^2+1 = (x+1)(x+1), and x^2+x = x(x+1)). Hence, α^2+α+1 = 0, and so α^2 = −1−α = 1+α. In power
representation, the elements of GF(4) are 0, α^1 = α, α^2 = 1+α, and
α^3 = α^1·α^2 = α(1+α) = α+α^2 = α+1+α = 1. Thus the 2-tuple representations of these four
elements are (0,0), (0,1), (1,1), and (1,0), respectively. The addition and multiplication
tables are as shown above. For example:
α + α^2 = (0,1) + (1,1) = (1,0) = 1,
α^2 × α^2 = α^4 = α^1 × α^3 = α × 1 = α.
It can easily be checked that all the field requirements are met.
et4-030 Error-Correcting Codes, Part 2: Mathematical Basics -2.13-
Mathematical Basics 13
As another example we consider the case q=8, i.e., p=2 and m=3. There are two possible
binary primitive polynomials of degree 3: x^3+x+1 and x^3+x^2+1. Either one can be used to
construct GF(8); here we choose x^3+x+1. Hence, α^3+α+1 = 0, and so α^3 = −1−α = 1+α. In
power representation, the elements of GF(8) are 0, α^1 = α, α^2, α^3 = 1+α,
α^4 = α^1·α^3 = α(1+α) = α+α^2, α^5 = α^1·α^4 = α(α+α^2) = α^2+α^3 = 1+α+α^2,
α^6 = α^1·α^5 = α(1+α+α^2) = α+α^2+α^3 = 1+α^2, and α^7 = α^1·α^6 = α(1+α^2) = α+α^3 = 1. Thus the 3-tuple
representations of these eight elements are (0,0,0), (0,1,0), (0,0,1), (1,1,0), (0,1,1), (1,1,1),
(1,0,1), and (1,0,0), respectively. Addition and multiplication are done using the 3-tuple
representation and the power representation, respectively. For example:
α + α^2 = (0,1,0) + (0,0,1) = (0,1,1) = α^4,
α^5 × α^6 = α^11 = α^4 × α^7 = α^4 × 1 = α^4.
Again, it can easily be checked that all the field requirements are met.
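The power/3-tuple correspondence can be regenerated from the rule α^3 = 1 + α; a small sketch, with tuples (b0, b1, b2) standing for b0 + b1·α + b2·α^2:

elem, table = (0, 1, 0), {}          # start from alpha^1 = (0, 1, 0)
for i in range(1, 8):
    table[i] = elem
    b0, b1, b2 = elem
    # multiply by alpha: shift coefficients up and reduce alpha^3 -> 1 + alpha
    elem = (b2, (b0 + b2) % 2, b1)
print(table)                          # alpha^7 comes out as (1, 0, 0) = 1

# the addition example from above: alpha + alpha^2 = (0,1,1) = alpha^4
s = tuple((a + b) % 2 for a, b in zip(table[1], table[2]))
print(s == table[4])                  # True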
et4-030 Error-Correcting Codes, Part 2: Mathematical Basics -2.14-
Mathematical Basics 14
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 2: Mathematical Basics -2.15-
Calculations in GF(q)
GF(q) = GF(p^m): 0, α, α^2, α^3, …, α^(q-1) = 1
∀ β ∈ GF(q): β = b0 + b1α + b2α^2 + … + bm-1α^(m-1), or (b0, b1, b2, …, bm-1) with bi ∈ GF(p)
multiplication: α^i · α^j = α^(i+j)
addition: according to the m-tuple representation
Example, GF(4) (p=2, m=2):
0 ↔ (0,0),  α^1 = α ↔ (0,1),  α^2 = 1+α ↔ (1,1),  α^3 = 1 ↔ (1,0)
α^2 · α^2 = α^4 = α
α + α^2 = (0,1) + (1,1) = (1,0) = 1
Mathematical Basics 15
As mentioned before, multiplications in GF(q) are done using the power representation,
while additions are done using the m-tuple representation.
et4-030 Error-Correcting Codes, Part 2: Mathematical Basics -2.16-
Minimal Polynomial
Mathematical Basics 16
For any β ∈ GF(p^m), the minimal polynomial ϕ(x) is defined as the polynomial over GF(p)
of smallest degree such that β is a root of this polynomial, i.e., ϕ(β) = 0. These minimal
polynomials play an important role in the construction of BCH codes, as discussed in Part
4. The above theorem provides an easy way to derive the minimal polynomial.
As an example, let’s consider the element β = α^3 from GF(8), where GF(8) was
constructed using the primitive polynomial p(x) = x^3+x+1. We look for a binary (since
p=2) polynomial of smallest degree such that α^3 is a root. Note that (α^3)^2 = α^6,
(α^3)^4 = α^12 = α^5, and (α^3)^8 = α^24 = α^3. Hence, E=8 and e=3, and thus
ϕ(x) = (x−α^3)(x−α^6)(x−α^5)
     = x^3 − (α^3+α^6+α^5) x^2 + (α^2+α+α^4) x − 1
     = x^3 + x^2 + 1.
β              e    ϕ(x)
0              1    x
1              1    x − 1 = x + 1
α, α^2, α^4    3    (x+α)(x+α^2)(x+α^4) = x^3 + (α+α^2+α^4)x^2 + (α^3+α^5+α^6)x + α^7 = x^3 + x + 1
α^3, α^6, α^5  3    (x+α^3)(x+α^6)(x+α^5) = x^3 + (α^3+α^6+α^5)x^2 + (α^2+α+α^4)x + α^14 = x^3 + x^2 + 1
Mathematical Basics 17
For all other elements in GF(8), the minimal polynomials ϕ(x) can be derived in the same
way as for α3. Note that due to the fact that GF(8) is an extension field of GF(2),
subtractions may be replaced by additions.
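The grouping of the elements into the root sets of the table can be reproduced by computing the sets of exponents {i, 2i, 4i, …} modulo 7 (often called cyclotomic cosets); a short sketch:

n, p = 7, 2                      # exponents of alpha live modulo 7; conjugates are beta^(2^i)
seen = set()
for i in range(n):
    if i in seen:
        continue
    coset, j = [], i
    while j not in coset:
        coset.append(j)
        j = (j * p) % n
    seen.update(coset)
    print(coset)                 # [0], [1, 2, 4], [3, 6, 5]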
et4-030 Error-Correcting Codes, Part 2: Mathematical Basics -2.18-
Vector Space
Let V be a vector set with an addition operation;
Let F be a field;
V is a vector space over F if
(i) V is an Abelian Group
(ii) ∀ a∈F, v∈V: av ∈ V
(iii) ∀ a,b ∈ F, u,v ∈ V: a(u+v) = au+av, (a+b)v = av+bv
(iv) ∀ a,b ∈ F, v ∈ V: (ab)v = a(bv)
(v) ∀ v∈V: 1v=v
Mathematical Basics 18
The codes that we consider in this course are often represented as subspaces of vector
spaces over a certain (mostly binary) field. Therefore, we now formally define vector
spaces and their subspaces.
A vector set V is a vector space over a field F if it satisfies the five above requirements.
Examples:
• F consists of the real numbers (F=R), and V is formed by all vectors of length n with
real coordinates ⇒ V = R^n
• F consists of the binary numbers (F=GF(2)), and V is formed by all vectors of length n
with binary coordinates ⇒ V = (GF(2))^n
et4-030 Error-Correcting Codes, Part 2: Mathematical Basics -2.19-
Subspace
A subset S of V is a subspace of V if
∀ a,b ∈F, u,v∈S: au+bv ∈ S
[Example: V = (GF(2))^3 = {000, 001, …, 111}; the subset S = {000, 110, 101, 011} is a subspace of V]
Mathematical Basics 19
Subspace Basis
Mathematical Basics 20
Any subspace S can be generated by a set of k linearly independent basis vectors v1, v2, …,
vk, i.e., any vector in S can be written as a linear combination of v1, v2, …, vk:
a1v1 + a2v2 + … + akvk
The number k is called the dimension of the subspace. A k-dimensional subspace contains
|F|^k elements, where |F| is the order of the field F.
The set of basis vectors is not necessarily unique. In the example, also (1,0,1) and (1,1,0)
could serve as a basis of the subspace. For the vector space (GF(2))n itself, the set
(1,0,0,…,0,0), (0,1,0,…,0,0), …, (0,0,0,…,0,1)
serves as a basis, but also
(1,0,0,…,0,0), (1,1,0,…,0,0), …, (1,1,1,…,1,1)
or any other set of n independent binary vectors may serve as such.
et4-030 Error-Correcting Codes, Part 2: Mathematical Basics -2.21-
Dual Space
Mathematical Basics 21
Example: consider (GF(2))^5, which contains 2^5 = 32 vectors; let the 2-dimensional
subspace S be generated by the basis (1,0,1,0,1) and (0,1,0,1,1), i.e.,
S = {(0,0,0,0,0), (1,0,1,0,1), (0,1,0,1,1), (1,1,1,1,0)}.
Then the dual space is of dimension 5-2=3:
S ⊥ = {(0,0,0,0,0), (1,0,1,0,0), (0,1,0,1,0), (1,1,1,1,0),
(1,1,0,0,1), (0,1,1,0,1), (1,0,0,1,1), (0,0,1,1,1)}.
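The dual space of this example can be found by brute force; a short sketch:

from itertools import product

basis = [(1, 0, 1, 0, 1), (0, 1, 0, 1, 1)]
dual = [v for v in product((0, 1), repeat=5)
        if all(sum(a * b for a, b in zip(v, g)) % 2 == 0 for g in basis)]
print(len(dual))   # 8 = 2^(5-2)
print(dual)        # the eight vectors listed above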
et4-030 Error-Correcting Codes, Part 2: Mathematical Basics -2.22-
Exercise
Mathematical Basics 22
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.1-
et4-030 Part 3
Block Codes I:
Fundamentals
In the area of error-correcting codes we distinguish block codes and convolutional codes.
Parts 3 to 6 are devoted to block codes, Part 7 to convolutional codes.
In this Part 3 the fundamentals of block coding are presented.
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.2-
Contents
A block code C is a subset of Vn, which is the n-dimensional vector space over an
alphabet A of size q. Throughout these lecture notes we assume A is the binary alphabet
GF(2) (i.e., q=2), unless explicitly stated otherwise. The elements of the code C are called
codewords, and n is called the length of the code. The number of codewords, also called
the cardinality or size of the code, is denoted by |C|.
Messages u generated by the source are one-to-one mapped to codewords x∈C. Hence, |C|
is also the number of messages. The code rate R is defined by
R = (log_q |C|) / n.
Since |C| is at most q^n, the code rate is a real number between 0 and 1. The closer it is
to 1, the more efficiently the code represents the messages.
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.4-
If C is a subspace of Vn of dimension k,
then C is called a linear (n,k) block code
Thus Code Rate R = (log_q q^k) / n = k / n
[Example: the eight codewords of length 4 of the even-weight code, among them 0110 and 1111]
Block Codes I: Fundamentals 4
If C is not only a subset of Vn, but also a k-dimensional subspace, then C is called a
linear (n,k) block code. Hence, for any linear block code C,
x1 ∈ C, x2 ∈ C ⇒ a1x1+a2x2 ∈ C
for any a1 and a2 from the alphabet A. The number of codewords in a linear code is
|C| = q^k,
and so the code rate is
R = k/n.
Since k represents the length of the message generated by the source and n represents the
length of the codeword to be transmitted over the channel (or stored on the disk), the code
rate R equals the ratio between these two lengths, which can indeed be considered as the
natural efficiency measure.
In the example, the first three bits of each of the eight codewords of length four can be
considered as the message generated by the source. The fourth bit is the additional bit for
error protection, which is chosen such that the total number of ones in the codeword is
even. This procedure of adding a single parity bit to a k-bit message can be done for any
k, leading to a (k+1,k) even-weight block code. Such a code is able to detect a single
error: if one error occurs, then the received number of ones will be odd, so the receiver
knows something is wrong (but it does not know what). In the famous ASCII code, where
128 (keyboard) characters (such as A, a, 1, 2, ?, %, etc.) are represented by seven bits
each, this procedure may be used to provide single error detection. Hence, 8-bit ASCII
does not represent a character set of size 2^8 = 256, but a character set of size 2^7 = 128 with
an error detection feature.
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.5-
Hamming Distance
a = (a1, a2, …, an)
d(a,b) = |{ i : ai ≠ bi }|
Example: a = 001101111000100, d(a,b) = 6
The Hamming distance between two vectors of the same length is defined as the number
of positions in which these two vectors differ. This metric is named after Richard W.
Hamming, who pioneered error-correcting codes in the 1950’s [Ham50]. The Hamming
distance between a vector a and the all-zero vector 0 is called the weight w(a) of the
vector a.
The Hamming distance d of a code C is defined as the smallest Hamming distance
between any two different codewords. For linear codes this equals the minimum weight of
all codewords except the all-zero codeword.
The Hamming distance d is an important parameter of a code C, since knowing it we can
immediately derive the error protection capabilities of the code:
C can correct t errors ⇔ d ≥ 2t+1;
C can detect s errors ⇔ d ≥ s+1.
Hence, a code with Hamming distance d=9 can correct four errors or detect eight errors.
Linear block codes of length n, dimension k, and Hamming distance d may be denoted as
(n,k,d) codes. For example, the family of even-weight codes introduced on the previous
page are (k+1,k,2) codes. The one explicitly shown on the slide is a (4,3,2) code and the
one used in ASCII an (8,7,2) code.
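For reference, the two definitions in Python form:

def hamming_distance(a, b):
    # number of positions in which a and b differ
    return sum(x != y for x, y in zip(a, b))

def weight(a):
    # Hamming distance to the all-zero vector (for 0/1 coordinates)
    return sum(1 for x in a if x != 0)

print(hamming_distance("0110", "1111"), weight((1, 0, 1, 1)))   # 2 3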
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.6-
The binary linear code in this example has messages of length k=2, which are mapped to
codewords of length n=11. Hence, the code rate is very low, only 2/11=0.18. But on the
other hand, the code’s Hamming distance d=7 is rather large. Note that also larger
distances between codewords may appear, e.g., d(c2,c3)=8. The code can be used to
correct up to (7-1)/2=3 errors, or to detect up to 7-1=6 errors.
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.7-
Generator Matrix
Linear (n,k) code C: basis vectors b1, b2, …,bk
information vector u
generator matrix G x=uG
codeword x
Since a linear (n,k) code C is a k-dimensional subspace of a vector space Vn, the
codewords of C can be generated by taking all linear combinations of k basis vectors of
the subspace. The k x n matrix which contains these basis vectors as rows is called a
generator matrix G of the code C. An n-bit codeword x can be generated by multiplying a
k-bit vector u representing a message with the generator matrix G.
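A minimal encoding sketch over GF(2), using the (5,2) generator matrix that also appears on the following slides:

G = [[1, 0, 1, 0, 1],
     [0, 1, 0, 1, 1]]

def encode(u, G):
    # x = u * G over GF(2)
    n = len(G[0])
    return tuple(sum(u[i] * G[i][j] for i in range(len(u))) % 2 for j in range(n))

for u in ((0, 0), (1, 0), (0, 1), (1, 1)):
    print(u, "->", encode(u, G))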
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.8-
Systematic Codes
G = ( Ik | P ): a k × n matrix; the first k symbols of a codeword are the information symbols, the last n−k the check symbols
Example: q=2, k=2, n=8
G = ⎡ 1 0 1 1 1 1 0 0 ⎤
    ⎣ 0 1 1 1 0 0 1 1 ⎦
x1 = ( 0 0 ) G = ( 0 0 0 0 0 0 0 0 )
x2 = ( 1 0 ) G = ( 1 0 1 1 1 1 0 0 )
x3 = ( 0 1 ) G = ( 0 1 1 1 0 0 1 1 )
x4 = ( 1 1 ) G = ( 1 1 0 0 1 1 1 1 )
d = 5, R = 2/8 = 1/4
Block Codes I: Fundamentals 8
Since a code may have many constellations of basis vectors, it may also have many
generator matrices. Through elementary row and column operations, a generator matrix
may always be put into the format shown above, where Ik is the k x k identity matrix with
ones on main diagonal and zeroes elsewhere. The advantage of this format is that the
message vector u appears at the first k positions of the codeword x generated by x=uG.
Such a format is called systematic, and the first k symbols are called information symbols,
while the last n-k symbols are called check symbols.
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.9-
Parity-Check Matrix
If Generator Matrix G = ( Ik | P )
then Parity-Check Matrix H = ( -P^T | In-k )
Property: x ∈ C ⇔ x H^T = 0 (useful in decoding!)
Example: q=2, k=2, n=5
G = ⎡ 1 0 1 0 1 ⎤        H = ⎡ 1 0 1 0 0 ⎤
    ⎣ 0 1 0 1 1 ⎦            ⎢ 0 1 0 1 0 ⎥
                             ⎣ 1 1 0 0 1 ⎦
x1 = ( 0 0 ) G = ( 0 0 0 0 0 )    x1 H^T = ( 0 0 0 )
x2 = ( 1 0 ) G = ( 1 0 1 0 1 )    x2 H^T = ( 0 0 0 )
x3 = ( 0 1 ) G = ( 0 1 0 1 1 )    x3 H^T = ( 0 0 0 )
x4 = ( 1 1 ) G = ( 1 1 1 1 0 )    x4 H^T = ( 0 0 0 )
N.B. parity-check equations of this code: x3 = x1, x4 = x2, x5 = x1 + x2 (codeword coordinates)
Block Codes I: Fundamentals 9
The dual space of the linear code (n,k) code C in the vector space Vn is a linear (n,n-k)
code denoted by C⊥. This code is called the dual code of C. If G as indicated above is a
generator matrix of C, then the matrix H as indicated above is a generator matrix of C⊥.
The matrix H is called a parity-check matrix of C, and because of the dual property the
codewords of C are characterized by
x ∈ C ⇔ x HT = 0.
This property is particularly useful in decoding, as we will see later.
Thus a code can be determined not only by a generator matrix, but also by a parity-check
matrix. The n−k equations
x1·hi1 + x2·hi2 + … + xn·hin = 0,  for i = 1, 2, …, n−k,
are called parity-check equations. These determine how the n−k check
symbols are determined from the k information symbols.
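The property x H^T = 0 can be checked directly for the (5,2) example above; a short sketch:

H = [[1, 0, 1, 0, 0],
     [0, 1, 0, 1, 0],
     [1, 1, 0, 0, 1]]
codewords = [(0, 0, 0, 0, 0), (1, 0, 1, 0, 1), (0, 1, 0, 1, 1), (1, 1, 1, 1, 0)]

def syndrome(y, H):
    # y * H^T over GF(2): one bit per parity-check equation
    return tuple(sum(y[j] * row[j] for j in range(len(y))) % 2 for row in H)

print([syndrome(x, H) for x in codewords])   # all (0, 0, 0)
print(syndrome((1, 0, 0, 0, 1), H))          # (1, 0, 0): not a codeword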
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.10-
Parity-Check Exercise
G = ⎡1 0 1 0 1 0 1 1⎤
⎢⎣ 0 1 0 1 0 1 1 1⎥⎦
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.11-
Hamming Codes
H = ( all 2^m − 1 possible columns ≠ 0^T ),  m ≥ 2
n = 2^m − 1
k = 2^m − 1 − m
d = 3
Family of single error correcting codes!
Example: m = 3
H = ⎡ 1 1 1 0 1 0 0 ⎤        G = ⎡ 1 0 0 0 1 1 1 ⎤
    ⎢ 1 1 0 1 0 1 0 ⎥            ⎢ 0 1 0 0 1 1 0 ⎥
    ⎣ 1 0 1 1 0 0 1 ⎦            ⎢ 0 0 1 0 1 0 1 ⎥
                                 ⎣ 0 0 0 1 0 1 1 ⎦
Block Codes I: Fundamentals 11
The class of binary Hamming codes is one of the oldest (over fifty years!) families of
error-correcting codes [Ham50]. They are most easily defined from the parity-check
matrix.
The equation xH^T = 0 can be considered as a condition that a number of columns in H add
up to 0. If all columns in H are different and unequal to the all-zero column, then the
weight of x in order to satisfy this equation is either equal to zero or at least equal to
three. Hence, the Hamming distance of the code is at least equal to three, and the code is
single error-correcting or double error-detecting.
If the height of the columns in the H matrix is m, which is an integer at least equal to two,
then the maximum number of columns satisfying the requirements is 2^m − 1. The H matrix
with this maximum number of columns leads to a (2^m−1, 2^m−1−m, 3) code, called a
Hamming code. A generator matrix G can be easily derived from H. For m=2 we obtain
the (3,1,3) repetition code, for m=3 the (7,4,3) code shown above, and for m=4,5,6,… we
obtain (15,11,3), (31,26,3), (63,57,3), … codes similarly.
Also, the concept of binary Hamming codes can easily be extended to larger alphabets.
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.12-
Syndrome Decoding
x → channel (error vector e) → y = x + e
Syndrome s = y H^T
N.B. s = y H^T = (x+e) H^T = x H^T + e H^T = 0 + e H^T = e H^T
Example:
H = ⎡ 1 0 1 0 0 ⎤
    ⎢ 0 1 0 1 0 ⎥
    ⎣ 1 1 0 0 1 ⎦
y = 10001,  s = 100
We look for the error vector e with syndrome s and smallest weight!
Block Codes I: Fundamentals 12
When a codeword x is transmitted, the received vector y can be considered as the sum of
x and an error vector e. In maximum likelihood (ML) decoding, the decoder looks for the
most probable codeword x upon receipt of y, or equivalently, it looks for the most
probable error vector e. When transmitting over a binary symmetric channel, then
P(ei=0) = 1−p and P(ei=1) = p for all i, and thus P(e=a) = (1−p)^(n−w(a)) p^(w(a)). Since p<0.5, P(e=a) is
a decreasing function of w(a). So in ML decoding one should look for the error vector e
of minimum weight such that y-e is in the code. For small codes, a brute-force exhaustive
search would do the job, but for larger codes this would be computationally infeasible.
Fortunately, we may make use of the nice algebraic structures of linear codes in order to
reduce the number of computations and comparisons. One such technique is known as
syndrome decoding.
The syndrome s of a received vector y is defined as s = yH^T, which is a vector of length n−
k. Remember that a vector x is a codeword if and only if xH^T = 0. Hence, the syndrome
eH^T of the error vector e (which is unknown to the decoder) is equal to the syndrome of
the received vector y=x+e (which is known to the receiver).
In an error detection mode, the decoder decides that errors have occurred if s≠0, and that
no errors occurred if s=0. Note that the former decision is valid for 100%, while the latter
is not, since it may happen that errors do occur but the syndrome is still equal to 0 (in case
the errors transform one codeword into another).
In an error correction mode, the decoder looks for a vector of length n with syndrome s
and minimal weight among all vectors with syndrome s. This vector is the ML estimate of
the error vector. This is further explored on the next slides.
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.13-
Coset
Coset of a code C:  C + a = { x + a | x ∈ C }
Standard Matrix
G = ⎡ 1 0 1 0 1 ⎤        H = ⎡ 1 0 1 0 0 ⎤
    ⎣ 0 1 0 1 1 ⎦            ⎢ 0 1 0 1 0 ⎥
                             ⎣ 1 1 0 0 1 ⎦
 s
000   00000  10101  01011  11110
100   00100  10001  01111  11010
010   00010  10111  01001  11100
001   00001  10100  01010  11111
110   00110  10011  01101  11000
101   10000  00101  11011  01110
011   01000  11101  00011  10110
111   10010  00111  11001  01100

y = 10001,  s = 100,  e' = 00100,  x' = 10101,  u' = 10
Block Codes I: Fundamentals 14
A standard matrix of a code is a q^(n-k) by q^k matrix. Its entries contain all vectors of length
n, in such a way that each row contains a coset. Hence, each row can be related to a
syndrome. The row corresponding to the 0 syndrome contains the codewords. It is
customary that the first entry of each row contains the coset leader. The other entries in
the row can be obtained by adding the coset leader to the codeword in the same column.
In the above example the first row contains the four codewords 00000, 10101, 01011, and
11110. The coset leader of the second row, which corresponds to the syndrome 100, can
be found by observing that (100)^T is the third column in the parity-check matrix H. Hence
(00100)H^T = 100, and thus 00100 serves as coset leader. The other three entries in the
second row can be found by adding this coset leader to the codeword (from the first row)
in the same column. The other rows are obtained similarly. Note that the choice of the
coset leader is not always unique. In the last row, also 01100 could have been chosen
instead of 10010. However, it would have led to the same coset (only in a different
order).
In correction mode, the decoder computes the syndrome s=yHT, and chooses the
corresponding coset leader as an estimate e’ for the error vector, and thus x’=y-e’ as the
estimate for the transmitted codeword. The information symbols of x’ form the message
estimate u’. Note that in the above example a single error is always corrected (as may be
expected due to the code’s Hamming distance d=3), since all five vectors of length five
and weight one appear as a coset leader. Furthermore, in the implementation chosen here,
double errors are corrected if and only if these occur either in the third and the fourth
position, or in the first and the fourth.
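A minimal sketch of this coset-leader (syndrome) decoding procedure for the (5,2) example; the leader table is built here by brute force over all 32 vectors:

from itertools import product

H = [[1, 0, 1, 0, 0],
     [0, 1, 0, 1, 0],
     [1, 1, 0, 0, 1]]

def syndrome(y):
    return tuple(sum(y[j] * row[j] for j in range(5)) % 2 for row in H)

# for each syndrome keep the first (lowest-weight) vector producing it: the coset leader
leader = {}
for e in sorted(product((0, 1), repeat=5), key=sum):
    leader.setdefault(syndrome(e), e)

y = (1, 0, 0, 0, 1)
e = leader[syndrome(y)]
x = tuple((a + b) % 2 for a, b in zip(y, e))
print(e, x, x[:2])   # (0,0,1,0,0) (1,0,1,0,1) (1,0), as in the example above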
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.15-
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.16-
1) Compute Syndrome s = y H^T
2) If s = 0, then x' = y
3) If s ≠ 0, then s equals a column of H, say column i; invert the i-th bit of y to obtain x'
It is clear that in particular in case k is large and n-k is rather small, the syndrome
decoding technique may lead to a great reduction in computational effort compared to a
brute-force method. Instead of comparing the received word y to all 2^k possible
codewords in order to find the closest one, now only the syndrome is computed and the
coset leader corresponding to this syndrome is subtracted from the received word y.
The coset leader corresponding to a syndrome may be obtained by a table look-up. For
Hamming codes, the coset leader corresponding to any syndrome ≠0 can be immediately
derived from the parity-check matrix H, since this matrix contains all columns of height
n-k, except the all-zero column. This observation leads to the decoding algorithm
presented above and illustrated on the next slide.
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.17-
y = 1000001, s = y H^T = 110 (the second column of H)
e’ = 0100000
x’ = y + e’ = 1100001
u’ = 1100
Block Codes I: Fundamentals 17
A (2^m−1, 2^m−1−m, 3) binary Hamming code consists of the words obtained by taking all
linear combinations of the 2^m−1−m rows in the generator matrix. For m=3, we have a
(7,4,3) code, and the 2^4 = 16 codewords are:
0000000 1000111 0100110 1100001
0010101 1010010 0110011 1110100
0001011 1001100 0101101 1101010
0011110 1011001 0111000 1111111
If the received word is y=1000001, then decoding could be performed by calculating the
Hamming distances between 1000001 and all 16 codewords, leading to the conclusion
that 1100001 at Hamming distance 1 is closest.
The simple syndrome decoding technique illustrated above leads to the same conclusion.
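A sketch of this shortcut: the syndrome of a single error equals the corresponding column of H, so a table look-up is not needed:

H = [[1, 1, 1, 0, 1, 0, 0],
     [1, 1, 0, 1, 0, 1, 0],
     [1, 0, 1, 1, 0, 0, 1]]

def hamming_decode(y):
    # compute the syndrome; if it is nonzero, flip the bit whose H column matches it
    s = [sum(y[j] * row[j] for j in range(7)) % 2 for row in H]
    x = list(y)
    if any(s):
        columns = [[row[j] for row in H] for j in range(7)]
        x[columns.index(s)] ^= 1
    return tuple(x), tuple(x[:4])        # codeword estimate x', message estimate u'

print(hamming_decode((1, 0, 0, 0, 0, 0, 1)))   # ((1,1,0,0,0,0,1), (1,1,0,0))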
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.18-
Shortening
G = ⎡ 1 0 0 0 1 1 1 ⎤        G' = ⎡ 1 0 1 0 1 ⎤
    ⎢ 0 1 0 0 1 1 0 ⎥             ⎣ 0 1 0 1 1 ⎦
    ⎢ 0 0 1 0 1 0 1 ⎥
    ⎣ 0 0 0 1 0 1 1 ⎦
We have seen that the length of Hamming codes is of the form 2^m − 1 (i.e., 7, 15, 31,
63,…). However, it may well be that our application demands a length which does not
belong to this sequence, e.g., 48. We can still use the Hamming codes for such
applications by applying techniques like shortening, puncturing, or extension.
Any (n,k,d) code C can be shortened to a (n-a,k-a,d) code, where a is an integer such that
1≤a≤k-1. This can be achieved by using only k-a information symbols in the encoding
process (rather than k). The other a information symbols are fixed at the value 0 and thus
do not have to be transmitted, which reduces the code length from n to n-a. Since the
removed information symbols did not contribute to the distance (they were all equal to 0),
the Hamming distance of the shortened code is still (at least) equal to d.
Note that shortening is only useful up to a certain maximum value of a. For example, the
shortening process of the (7,4,3) Hamming code leads to (6,3,3), (5,2,3), and (4,1,3)
codes. However, the last one is not a good choice, since the (3,1,3) repetition code has the
same dimension and distance, but a smaller length (and thus a higher efficiency).
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.19-
Puncturing
G = ⎡ 1 0 0 0 1 1 1 ⎤        G' = ⎡ 1 0 0 0 1 1 ⎤
    ⎢ 0 1 0 0 1 1 0 ⎥             ⎢ 0 1 0 0 1 1 ⎥
    ⎢ 0 0 1 0 1 0 1 ⎥             ⎢ 0 0 1 0 1 0 ⎥
    ⎣ 0 0 0 1 0 1 1 ⎦             ⎣ 0 0 0 1 0 1 ⎦
Any (n,k,d) code C can be punctured to a (n-a,k,d-a) code, where a is an integer such that
1≤a≤d-1. This can be achieved by deleting a of the parity bits. In general, a punctured
code has an increased code rate and a reduced error-control capability.
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.20-
Extension
G = ⎡ 1 0 0 0 1 1 1 ⎤        G' = ⎡ 1 0 0 0 1 1 1 0 ⎤
    ⎢ 0 1 0 0 1 1 0 ⎥             ⎢ 0 1 0 0 1 1 0 1 ⎥
    ⎢ 0 0 1 0 1 0 1 ⎥             ⎢ 0 0 1 0 1 0 1 1 ⎥
    ⎣ 0 0 0 1 0 1 1 ⎦             ⎣ 0 0 0 1 0 1 1 1 ⎦
Any binary (n,k,d) code C for which the Hamming distance d is odd can be extended to a
(n+1,k,d+1) code, by extending each row in generator matrix G of C with one bit, such
that the number of ones in each new row is even. Since a sum of vectors of even weight is
again a vector of even weight, all codewords in the extended code are of even weight.
Hence the Hamming distance of the new code is even, and thus at least equal to d+1
(remember we assumed d to be odd!).
Special case: (k,k,1) code → (k+1,k,2) code, i.e., adding a single parity bit to a message,
such that one error can be detected. As mentioned before, this concept may be applied to
the famous ASCII code (k=7).
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.21-
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.22-
Interleaving
Errors are often of a bursty/cluster nature
In certain storage media, the errors tend to occur in bursts. For example, think of
scratches on a Compact Disc. Examples of error-bursts also occur in wireless
communications networks. In such networks messages are often corrupted by long bursts
of (intentional or non-intentional) interference. On the other hand, most error-correcting
codes have been designed to handle random errors. A way to deal with this discrepancy
is a technique called interleaving. The interleaver is a device that rearranges and spreads
out the ordering of a symbol sequence in a deterministic way. The de-interleaver is a
device which, by means of an inverse permutation, restores the symbol sequence to its
original order. As we will see, interleaving enables the correction of (long) error-bursts
using simple codes for random error correction, but at the expense of an increased delay
and a need for buffering facilities at the transmitter and the receiver.
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.23-
Before describing interleaving in more detail, let’s first consider an example which shows
the need for such a technique. Suppose we want to transmit five messages of four bits
each, which have been encoded using the (7,4,3) Hamming code. Because of the code’s
Hamming distance being equal to 3, only one error per 7-bit codeword can be corrected.
Hence, if the total transmitted string of 35 bits is hit by a burst error affecting one or more
codewords in two or more positions, we are left with uncorrectable errors.
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.24-
(Slide figure: the five 7-bit codewords are written row-wise into a 5×7 array, which is read
out and transmitted column by column; the symbol in row r and column c is thus the
(5(c−1)+r)-th transmitted bit.)
At the receiver side the incoming bits from the channel are collected column-wise in a
similar array, so there is again a need for buffering facilities. Next, each row is decoded
by a decoder for the (n,k,d) code in use. Assuming no random errors occur, any burst up
to length s×t can be corrected using this block interleaving technique, where s is the
interleaving depth and t=⎣(d-1)/2⎦ (the error correction capability of the code).
In the example above s=5 and t=1. Note that only 15 check bits are used to protect 20
information bits against a burst of length up to 5. In comparison, an ordinary linear code
of dimension 20 with Hamming distance d=11 (which is thus able to correct up to 5
random errors) would require more than 20 check bits.
The choice of an appropriate value for the interleaving depth s is a matter of trade-off and
depends on the nature of the channel and the application. Increasing s does increase the
burst error correction capability, but also the delay and need for buffering facilities. In
particular the delay may be a problem for real-time applications, which puts an upper
limit on the interleaving depth.
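The block interleaving operation itself is easily expressed in Python; the sketch below
(an illustration, not part of the original notes) writes s codewords row-wise into an array and
reads it out column-wise, and the de-interleaver applies the inverse permutation.

```python
def interleave(codewords):
    """Write the codewords (all of equal length n) row-wise into an
    s x n array and read the array out column by column."""
    s, n = len(codewords), len(codewords[0])
    return [codewords[r][c] for c in range(n) for r in range(s)]

def deinterleave(stream, s, n):
    """Inverse permutation: collect the received symbols column-wise
    and return the s rows, each to be decoded by the row code."""
    return [[stream[c * s + r] for c in range(n)] for r in range(s)]

# a burst of length s*t in the transmitted stream hits each row in at
# most t positions, so a t-error-correcting row code can deal with it
```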
et4-030 Error-Correcting Codes, Part 3: Block Codes I: Fundamentals -3.26-
Interleaving Exercise
G = ⎡ 1 0 1 0 1 0 1 1 ⎤
    ⎣ 0 1 0 1 0 1 1 1 ⎦
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.1-
et4-030 Part 4
In this part we consider cyclic codes, which are linear codes with the additional property
that a cyclic shift of any codeword gives again a codeword. This property may be
particularly useful in the decoding process.
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.2-
Contents
In the previous parts we described block codes using vectors and matrices. Here we
introduce an alternative description by polynomials.
Next, the cyclic property is explained, and encoding and decoding operations are
considered.
Finally, we discuss two important families of cyclic codes which have found wide-spread
use in practical communication and storage systems: BCH Codes and Reed-Solomon
Codes.
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.3-
Information Polynomial:
u(x) = u_0 + u_1·x + … + u_{k-1}·x^(k-1)   with u_i ∈ GF(q)
Generator Polynomial:
g(x) = g_0 + g_1·x + … + g_{n-k}·x^(n-k)   with g_i ∈ GF(q), g_{n-k} = 1
Code Polynomial:
c(x) = u(x)•g(x) = c_0 + c_1·x + … + c_{n-1}·x^(n-1)   with c_i ∈ GF(q)
As seen in the previous part, the block coding concept may be described in a
vector/matrix fashion. A message u represented by a q-ary vector of length k is mapped to
a codeword represented by a q-ary vector x of length n via a generator matrix G:
x= uG.
In a polynomial code representation, the k message symbols appear as the coefficients u0,
u1, …, uk-1 of a q-ary information polynomial u(x) of degree at most k-1. The role of the
generator matrix is now performed by a generator polynomial g(x), which is a q-ary
polynomial of degree n-k. The code polynomial c(x) is obtained by multiplying the
information polynomial u(x) and the generator polynomial g(x):
c(x) = u(x) • g(x).
Hence, the code polynomial is of degree at most (k-1)+(n-k)=n-1, and the code symbols to
be transmitted over the channel are c0, c1, …, cn-1.
N.B. In a vector representation a word a of length b is mostly denoted by (a1, a2, ..., ab),
while the corresponding polynomial representation is characterized by the coefficients a0,
a1, ..., ab-1. Note that there is a small shift in the subscripts, but that the number of
symbols is the same.
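For binary codes the encoding rule c(x) = u(x)•g(x) is just a GF(2) polynomial
multiplication, as in the following Python sketch (an illustration, not part of the original
notes), with polynomials stored as coefficient lists [a_0, a_1, …].

```python
def gf2_poly_mul(u, g):
    """Multiply two binary polynomials given as coefficient lists
    (lowest degree first); all arithmetic is modulo 2."""
    c = [0] * (len(u) + len(g) - 1)
    for i, ui in enumerate(u):
        for j, gj in enumerate(g):
            c[i + j] ^= ui & gj
    return c

# (8,3,4) code with g(x) = 1 + x + x^4 + x^5
g = [1, 1, 0, 0, 1, 1]
u = [0, 1, 1]                      # u(x) = x + x^2
print(gf2_poly_mul(u, g))          # [0,1,0,1,0,1,0,1], i.e. x + x^3 + x^5 + x^7
```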
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.4-
In this example the message length k=3, the code length n=8, and the Hamming distance
d=4. The messages and codewords are represented in their vector representation as well as
their polynomial representation. A generator matrix for this code is
⎡1 1 0 0 1 1 0 0 ⎤
G = ⎢0 1 1 0 0 1 1 0 ⎥
⎢⎣0 0 1 1 0 0 1 1⎥⎦
For example:
u=(0,1,1) leads to the codeword (0,1,1)•G = (0,1,0,1,0,1,0,1).
Similarly, in polynomial representation:
u(x) = x+x^2 leads to the codeword (x+x^2) • g(x) = x+x^3+x^5+x^7.
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.5-
Systematic Encoding
Encoding procedure c(x) = u(x)•g(x) is not systematic!
Systematic (in last k positions) encoding procedure:
1) Multiply u(x) by x^(n-k)
2) Divide u(x)•x^(n-k) by g(x):
   u(x)•x^(n-k) = q(x)•g(x) + r(x)
   with degree r(x) < degree g(x) = n-k
3) c(x) = u(x)•x^(n-k) - r(x)
N.B. c(x) obtained in this way is indeed a code polynomial:
c(x) = u(x)•x^(n-k) - r(x)
     = u(x)•x^(n-k) + q(x)•g(x) - u(x)•x^(n-k) = q(x)•g(x)
Note that in general the simple encoding procedure c(x) = u(x) • g(x) does not lead to a
systematic representation, i.e., the message does not appear as an integral part of the
codeword. For implementation purposes it is often desirable for a code to be systematic.
After the decoding algorithm has found the most likely codeword, the message estimate
can then be obtained by just removing the check symbols from that codeword.
The procedure shown above delivers a codeword which is systematic in the last k
positions, i.e.,
c_{j+n-k} = u_j for j = 0, 1, …, k-1.
Note that it does not change the list of codeword polynomials itself, but only the mapping
of the information polynomials to the code polynomials.
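The systematic encoding procedure can be sketched along the same lines (Python, binary
case, illustration only, not part of the original notes): multiply u(x) by x^(n-k), take the
remainder after division by g(x), and add this remainder (over GF(2), subtraction equals
addition).

```python
def gf2_poly_mod(a, g):
    """Remainder of the binary polynomial a(x) after division by g(x);
    both are coefficient lists, lowest degree first, with g[-1] = 1."""
    a = a[:]                                  # work on a copy
    for i in range(len(a) - 1, len(g) - 2, -1):
        if a[i]:                              # cancel the leading term
            for j, gj in enumerate(g):
                a[i - len(g) + 1 + j] ^= gj
    return a[:len(g) - 1]

def encode_systematic(u, g, n):
    """c(x) = u(x)*x^(n-k) + (u(x)*x^(n-k) mod g(x)); the message ends
    up in the last k positions of the codeword."""
    nk = len(g) - 1                           # n - k = degree of g(x)
    shifted = [0] * nk + u                    # u(x) * x^(n-k)
    r = gf2_poly_mod(shifted, g)
    return [(r[i] if i < nk else 0) ^ shifted[i] for i in range(n)]

g = [1, 1, 0, 0, 1, 1]                        # g(x) = 1 + x + x^4 + x^5
print(encode_systematic([0, 1, 0], g, 8))     # [1,0,1,0,1,0,1,0]
```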
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.6-
u     u(x)       c(x)                            x
000   0          0                               00000000
100   1          1+x+x^4+x^5                     11001100
010   x          1+x^2+x^4+x^6                   10101010
110   1+x        x+x^2+x^5+x^6                   01100110
001   x^2        1+x^3+x^4+x^7                   10011001
101   1+x^2      x+x^3+x^5+x^7                   01010101
011   x+x^2      x^2+x^3+x^6+x^7                 00110011
111   1+x+x^2    1+x+x^2+x^3+x^4+x^5+x^6+x^7     11111111
Here the systematic encoding procedure from the previous slide is applied on the binary
(8,3,4) code. Again, both the matrix and polynomial representations are shown. Note that
indeed the message appears at the end of the codeword (c5 = u0, c6 = u1, c7 = u2). The
corresponding generator matrix reads
⎡1 1 0 0 1 1 0 0⎤
G = ⎢1 0 1 0 1 0 1 0⎥
⎢⎣1 0 0 1 1 0 0 1⎥⎦
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.7-
Cyclic Codes
Definition: A linear code C is cyclic,
if c_0 + c_1·x + … + c_{n-1}·x^(n-1) ∈ C implies c_{n-1} + c_0·x + … + c_{n-2}·x^(n-1) ∈ C
Theorem:
a code C with length n and generator polynomial g(x) is cyclic
⇔
g(x) is a divisor of x^n − 1
Cyclic codes are linear codes with the additional property that a cyclic shift of any
codeword gives again a codeword. It can easily be checked “by hand” that the binary
(8,3,4) code from the previous slide is cyclic: 00000000 shifts to itself, 11001100 shifts to
01100110, etc. For codes of larger size this would be a very tedious job. Fortunately,
there is a very easy criterion to check whether a code is cyclic or not, as indicated in the
above theorem. Indeed, the example code is cyclic, since g(x) divides 1+x^8 (note that
x^n − 1 = 1+x^n in the binary case).
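The divisibility criterion in the theorem is easy to check mechanically; the following small
Python sketch (an illustration, not part of the original notes) verifies that
g(x) = 1+x+x^4+x^5 divides 1+x^8, confirming that the (8,3,4) example code is cyclic.

```python
def gf2_mod(a, g):
    """Remainder of binary polynomial a(x) modulo g(x); polynomials are
    packed into integers (bit i = coefficient of x^i)."""
    dg = g.bit_length() - 1
    while a.bit_length() - 1 >= dg:
        a ^= g << (a.bit_length() - 1 - dg)
    return a

g = 0b110011                       # g(x) = 1 + x + x^4 + x^5
print(gf2_mod((1 << 8) | 1, g))    # 0  ->  g(x) divides x^8 + 1, so the code is cyclic
```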
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.8-
The binary code with n = 3 and g(x) = x is not cyclic, since x does not divide 1+x^3
(indeed, 1+x^3 = x^2•g(x) + 1).

Factorizations of 1+x^n over GF(2):
1+x^3 = (1+x)(1+x+x^2)
1+x^4 = (1+x)(1+x)(1+x)(1+x)
1+x^5 = (1+x)(1+x+x^2+x^3+x^4)
1+x^6 = (1+x)(1+x)(1+x+x^2)(1+x+x^2)
1+x^7 = (1+x)(1+x+x^3)(1+x^2+x^3)
…

Cyclic codes with n = 7:
g(x)                                                     k   d
1+x                                                      6   2
1+x+x^3                                                  4   3
1+x^2+x^3                                                4   3
(1+x)(1+x+x^3) = 1+x^2+x^3+x^4                           3   4
(1+x)(1+x^2+x^3) = 1+x+x^2+x^4                           3   4
(1+x+x^3)(1+x^2+x^3) = 1+x+x^2+x^3+x^4+x^5+x^6           1   7
Above, the factorization of xn-1 for n=3,4,5,6,7 is given in the binary case. Furthermore, the
generator polynomials of all binary codes of length 7 have been listed. The dimension k,
which is the number of information bits, can be found from the generator polynomial, since
its degree equals n-k. Finding the Hamming distance d is in general more difficult; this will
be further discussed when considering BCH and Reed-Solomon codes. For the small codes
considered here in the n=7 example, the Hamming distance can be found by listing all
codewords and observing the smallest nonzero weight. The found cyclic codes of length 7
have all been met before in these lecture notes in one way or another:
• (7,6,2) code: even-weight code, which contains all binary vectors of length 7 of
even weight;
• (7,4,3) code: Hamming code;
• (7,3,4) code: first shortening the (7,4,3) Hamming code to a (6,3,3) code,
and then extending it to a (7,3,4) code;
• (7,1,7) code: repetition code, containing only the two words
0000000 & 1111111.
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.10-
Here the binary (7,3,4) code with generator polynomial g(x)=1+x+x2+x4 is listed. Note
that the eight codewords indeed satisfy the cyclic property. The first generator matrix
follows immediately from the generator polynomial g(x): the first row corresponds to g(x)
itself, while the second and third row are shifted versions over one and two positions,
respectively. The second generator matrix is systematic in the last three positions, and can
be obtained from the first matrix by adding the first row to the third row.
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.11-
Parity-Check Polynomial
For a cyclic code with length n and generator polynomial g(x), the
parity check polynomial is defined by
h(x) = (x^n − 1) / g(x) = h_0 + h_1·x + … + h_k·x^k

    ⎡ h_k  h_{k-1}  h_{k-2}  …   h_0    0     0    …    0  ⎤
H = ⎢  0    h_k     h_{k-1}  …   h_1   h_0    0    …    0  ⎥
    ⎢  …     …        …      …    …     …     …    …    …  ⎥
    ⎣  0     0        0      …    0    h_k  h_{k-1} …  h_0 ⎦
For a cyclic code C, a parity-check polynomial h(x) can be defined as indicated. Note that
this is always possible, since g(x) is a divisor of x^n − 1. Since the degree of g(x) is n-k, the
degree of h(x) is k (and thus h_k ≠ 0).
Generator and parity-check matrices for a cyclic code can be derived from the generator
and parity-check polynomials as indicated. For the generator matrix G this is rather
obvious: g(x), x·g(x), …, x^(k-1)·g(x) are k independent basis polynomials. For the parity-check
matrix H, we consider the code polynomial c(x) = u(x)•g(x). If c(x) is multiplied by h(x)
we get
c(x)•h(x) = u(x)•g(x)•h(x) = u(x)•(x^n − 1) = -u(x) + x^n·u(x).
Since u(x) is of degree at most k-1, the powers x^k, x^(k+1), …, x^(n-1) do not appear in c(x)•h(x).
Hence,
∑_{i=0}^{k} h_i·c_{n-i-j} = 0     for j = 1, 2, …, n-k,
which are exactly the parity-check equations corresponding to the rows of H.
Here, the parity-check polynomial and matrix are considered for the binary (7,3,4) code C
with g(x) = 1+x+x^2+x^4. The parity-check polynomial reads
h(x) = (1+x^7) / (1+x+x^2+x^4) = 1+x+x^3.
The parity-check matrix H has been derived following the rule explained on the previous
slide. Indeed, it can be checked that a vector x of length 7 is a codeword if and only if
xHT=0.
The dual code is a binary (7,4,3) code with H as the generator matrix, or
x^k·h(x^(-1)) = x^3·(1 + x^(-1) + x^(-3)) = 1+x^2+x^3
as generator polynomial.
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.13-
Syndrome Decoding
Binary code with generator polynomial g(x);
transmit code polynomial c(x); receive y(x)=c(x)+e(x)
Decoding:
1) Divide y(x) by g(x), call remainder polynomial s(x)
(“syndrome”)
2) Look for polynomial ê(x) with syndrome s(x) and
minimal weight, and take
y(x) - ê(x)
as the estimate for the transmitted codeword
Similarly to the case of code description by vectors and matrices, we can also consider
syndrome decoding in case of polynomial codes. Upon receipt of a polynomial y(x),
which is the sum of the transmitted code polynomial c(x) and the error polynomial e(x),
the syndrome s(x) is defined as the remainder polynomial after division of y(x) by the
code’s generator polynomial g(x). Since c(x) is a multiple of g(x), division of e(x) by g(x)
leads to the same remainder polynomial s(x). Hence, in the decoding process, we should
look for a polynomial with syndrome s(x) and minimum weight, and take that polynomial
as the estimate for the error polynomial. Subtraction of this estimate for the error
polynomial from the received polynomial leads to the estimate for the transmitted
codeword.
For the code on the previous slide, receipt of y(x) = 1+x^2+x^6 results in the syndrome
s(x) = x+x^2+x^3. The polynomial of smallest weight with this syndrome is x^5, and thus the
estimated transmitted codeword is 1+x^2+x^5+x^6.
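A brute-force version of this procedure for the (7,3,4) example is sketched below in Python
(an illustration only, not part of the original notes); polynomials are packed into integers
with bit i holding the coefficient of x^i.

```python
from itertools import combinations

def poly_mod(a, g):
    """Remainder of a(x) modulo g(x), both packed into integers."""
    dg = g.bit_length() - 1
    while a.bit_length() - 1 >= dg:
        a ^= g << (a.bit_length() - 1 - dg)
    return a

def syndrome_decode(y, g, n):
    """Return the estimate y - e, where e is a lowest-weight polynomial
    with the same syndrome as y (brute force over weights 0, 1, 2, ...)."""
    s = poly_mod(y, g)
    for w in range(n + 1):
        for pos in combinations(range(n), w):
            e = sum(1 << p for p in pos)
            if poly_mod(e, g) == s:
                return y ^ e
    return y

g = 0b10111                            # g(x) = 1 + x + x^2 + x^4, the (7,3,4) code
y = 0b1000101                          # y(x) = 1 + x^2 + x^6
print(bin(syndrome_decode(y, g, 7)))   # 0b1100101, i.e. 1 + x^2 + x^5 + x^6
```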
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.14-
(Slide figure: shift-register circuit with n-k sections and feedback taps g_0, g_1, …, g_{n-k-1},
dividing the incoming sequence by g(x).)
When computing the syndrome s(x), the n-k sections of this circuit are initially set to 0,
and then the received bits are fed into the circuit in the order y_{n-1}, y_{n-2}, …, y_1, y_0. After
these n cycles the contents of the n-k sections are the syndrome coefficients s_0, s_1, …, s_{n-k-1}.
For the binary (7,3,4) code with g(x) = 1+x+x^2+x^4, the syndrome for the received word
y(x) = 1+x^2+x^6 is s(x) = x+x^2+x^3. Indeed, the shift register circuit with 4 sections and
g_0=1, g_1=1, g_2=1, and g_3=0, gives
cycle:                      1 (y_6=1)  2 (y_5=0)  3 (y_4=0)  4 (y_3=0)  5 (y_2=1)  6 (y_1=0)  7 (y_0=1)
contents (s_0 s_1 s_2 s_3): 1 0 0 0    0 1 0 0    0 0 1 0    0 0 0 1    0 1 1 0    0 0 1 1    0 1 1 1
so that the final contents 0 1 1 1 indeed correspond to s(x) = x+x^2+x^3.
Further, check that shifting one more time (with input 0) gives 1 1 0 1 (i.e., s(x) = 1+x+x^3)
as the content of the sections, which is the syndrome of the shifted version of y(x), i.e., of
1+x+x^3. This cyclic property can be very useful in order to realize an efficient decoder
implementation.
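The behaviour of the divide-by-g(x) shift-register circuit can be simulated in a few lines of
Python; the sketch below (an illustration, following the circuit convention described above,
not part of the original notes) reproduces the register contents for the (7,3,4) example.

```python
def lfsr_syndrome(bits_high_to_low, g):
    """Simulate the divide-by-g(x) circuit: the n-k register sections
    start at 0, the received bits enter in the order y_{n-1},...,y_0,
    and after n cycles the register holds s_0,...,s_{n-k-1}."""
    s = [0] * (len(g) - 1)                # g = [g_0, ..., g_{n-k}], g_{n-k} = 1
    for bit in bits_high_to_low:
        fb = s[-1]                        # feedback from the last section
        for i in range(len(s) - 1, 0, -1):
            s[i] = s[i - 1] ^ (g[i] & fb)
        s[0] = bit ^ (g[0] & fb)
    return s

g = [1, 1, 1, 0, 1]                       # g(x) = 1 + x + x^2 + x^4
y = [1, 0, 0, 0, 1, 0, 1]                 # y_6, ..., y_0 for y(x) = 1 + x^2 + x^6
print(lfsr_syndrome(y, g))                # [0, 1, 1, 1]  ->  s(x) = x + x^2 + x^3
print(lfsr_syndrome(y + [0], g))          # one extra shift: [1, 1, 0, 1]  ->  1 + x + x^3
```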
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.15-
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.16-
length n = 2^m − 1
dimension k ≥ 2^m − 1 − m·t
distance d ≥ 2t + 1
The BCH codes, proposed in [Hoc59] and [BC60], form an important family of cyclic
codes. Here, we only consider binary BCH codes, but the concept can easily be extended
to other finite fields.
The codes are characterized by the two parameters m, which determines the length of the
code, and t, which determines a lower bound on the code’s Hamming distance. The generator
polynomial g(x) is defined to be the least common multiple of the minimal polynomials
(see Part 2) of α, α^2, …, α^(2t), where α is a primitive element of the field GF(2^m). Since the
generator polynomial g(x) is a product of binary polynomials, g(x) is binary itself. The
binary BCH code with this generator polynomial g(x) and length n = 2^m − 1 can be shown to
have a dimension k of at least 2^m − 1 − mt, and a Hamming distance d of at least 2t+1 (the
latter value is called the designed distance). The actual dimension and/or distance may be
higher, as shown in examples on the next slides.
BCH codes form a flexible family of codes, where the parameters m and t can be chosen
to satisfy (as well as possible) the application’s efficiency and reliability requirements.
Increasing the parameter t leads to a higher reliability, at the price of a lower efficiency
(code rate). Algebraic encoding and decoding algorithms have been proposed over the
years, which allow very fast implementations. These algorithms may be based on
syndrome techniques and may exploit the cyclic structure, but the further details are
beyond the scope of these lecture notes. The interested reader may consult the books in
the bibliography.
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.17-
t = 1:   g(x) = 1+x+x^3                                          →  (7,4,3)
t = 2:   g(x) = (1+x+x^3)(1+x^2+x^3) = 1+x+x^2+x^3+x^4+x^5+x^6   →  (7,1,7)
Here we consider the case m=3, i.e., codes of length 7. Hence, we need the field GF(8) in
order to derive the generator polynomials. When GF(8) is constructed from the primitive
polynomial 1+x+x3, then the minimal polynomials of α, α2, α3, etc. are as indicated.
In case t=1, the generator polynomial is the least common multiple of the minimal
polynomials of α and α2. Since both these elements have the same minimal polynomial
1+x+x3, this polynomial serves as the generator polynomial. Since the degree is 3, the
dimension k=7-3=4. Since t=1, the code’s Hamming distance is at least equal to 3.
Observing that the weight of the generator polynomial (which is a code polynomial
itself!) is 3, we conclude that the code’s Hamming distance is exactly 3.
In case t=2, the generator polynomial is the least common multiple of the minimal
polynomials of α, α2, α3, and α4. Since α, α2, and α4 have the same minimal polynomial
1+x+x3, and α3 has 1+x2+x3 as minimal polynomial, (1+x+x3)(1+x2+x3) =
1+x+x2+x3+x4+x5+x6 serves as the generator polynomial. Since the degree is 6, the
dimension k=7-6=1. Since t=2, the code’s Hamming distance is at least equal to 5.
However, note that this is in fact the repetition code of length 7, and thus the actual
Hamming distance is 7. This is due to the fact that the minimal polynomial 1+x2+x3 of α3
is also the minimal polynomial of α5 and α6, and so the choice t=3 leads to the same code
as the choice t=2.
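The statement that α, α^2, and α^4 are roots of 1+x+x^3 can also be checked numerically.
The Python sketch below (an illustration, not part of the original notes) represents GF(8)
elements as 3-bit integers and reduces products with the primitive polynomial 1+x+x^3.

```python
def gf8_mul(a, b):
    """Multiply two GF(8) elements (3-bit integers, bit i = coefficient
    of alpha^i), reducing with the primitive polynomial x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:            # degree reached 3: reduce by x^3 + x + 1
            a ^= 0b1011
    return r

def gf8_pow(a, e):
    r = 1
    for _ in range(e):
        r = gf8_mul(r, a)
    return r

alpha = 0b010                      # the primitive element alpha
for i in (1, 2, 4):                # alpha, alpha^2, alpha^4 share 1 + x + x^3
    x = gf8_pow(alpha, i)
    value = 1 ^ x ^ gf8_pow(x, 3)  # evaluate g(x) = 1 + x + x^3 at x
    print(i, value)                # each prints 0: x is a root of g(x)
```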
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.18-
Here we consider the case m=4, i.e., codes of length 15. Hence, we need the field GF(16)
in order to derive the generator polynomials. When GF(16) is constructed from the
primitive polynomial 1+x+x4, then the minimal polynomials of α, α2, α3, etc. and the
generator polynomials g(x) for the different values of t are as indicated. Note that the
cases t=4,5,6,7 all lead to the repetition code of length 15.
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.19-
n = 7  (m = 3):   (7,4,3)   (7,1,7)
n = 15 (m = 4):   (15,11,3)  (15,7,5)  (15,5,7)
n = 31 (m = 5):   (31,26,3)  (31,21,5)  (31,16,7)  (31,11,11)  (31,6,15)
n = 63 (m = 6):   (63,51,5)  (63,45,7)  (63,39,9)  (63,36,11)  (63,30,13)  (63,24,15)
                  (63,18,21)  (63,16,23)  (63,10,27)  (63,7,31)
In the tables we list the (n,k,d) parameters for m=3,4,5,6, i.e., n=7,15,31,63. For some of
these codes we also list the coding gain G (in dB), which is achieved when the codes are
applied in a communication system with phase shift-keying modulation and an AWGN
channel:
(n,k,d) code   rate k/n   G in dB (Pe=10^-5)   G in dB (Pe=10^-9)   G in dB (asymptotic)
(31,26,3)      0.84       1.7                  2.0                  2.2
(31,21,5)      0.68       2.2                  2.7                  3.1
(63,57,3)      0.90       1.7                  2.2                  2.6
(63,51,5)      0.81       2.3                  3.1                  3.9
(63,45,7)      0.71       2.9                  3.7                  4.6
Note the advantage of long codes: the (63,45,7) code has both a higher code rate and a
higher coding gain than the (31,21,5) code. However, longer codes usually have a higher
complexity.
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.20-
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.21-
alphabet GF(2^m)
length n = 2^m − 1
dimension k = 2^m − 1 − 2t
distance d = 2t + 1
The Reed-Solomon codes, proposed in [RS60], form another important family of cyclic
codes. For example, these are used in the compact disc system to provide high-quality
music even when there are small scratches on the disc. Reed-Solomon codes are non-
binary codes, which may be defined over any finite field GF(q). However, since most
modern applications make use of binary data, RS codes over GF(2m) are of greatest
interest to us.
Like BCH codes, the codes are characterized by the two parameters m, which determines
the length of the code, and t, which determines the code’s Hamming distance. The
generator polynomial g(x) is defined to be the product of (x+α), (x+α^2), …, (x+α^(2t)), where
α is a primitive element of the field GF(2^m). In contrast to BCH codes, the generator
polynomial is not necessarily binary. The RS code over the alphabet GF(2^m) with this
generator polynomial g(x) and length n = 2^m − 1 has a dimension k = 2^m − 1 − 2t (since the
degree of the generator polynomial is 2t), and a Hamming distance d of 2t+1.
RS codes form a flexible family of codes, where the parameters m and t can be chosen to
satisfy (as well as possible) the application’s efficiency and reliability requirements. As
will be shown, the codes are particularly suited for the correction of burst errors. As for
BCH codes, algebraic encoding and decoding algorithms have been proposed over the
years, which allow very fast implementations. These algorithms may be based on
syndrome techniques and may exploit the cyclic structure, but the further details are
beyond the scope of these lecture notes. The interested reader may consult the books in
the bibliography.
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.22-
t = 3:   g(x) = (x+α)(x+α^2)···(x+α^6) = 1+x+x^2+x^3+x^4+x^5+x^6   →  (7,1,7)
Here we consider the case m=3, i.e., codes of length 7 over the alphabet GF(8) . When
GF(8) is constructed from the primitive polynomial 1+x+x3, we obtain the field table
(with power representation and 3-tuple representation) as indicated.
For t=1,2,3 we get (7,5,3), (7,3,5), and (7,1,7) codes, respectively. Note that these codes are
over GF(8), and so the sizes of the codes (i.e., the numbers of codewords) are 8^5 = 32768,
8^3 = 512, and 8^1 = 8, respectively. The codewords of the (7,1,7) code are (in vector notation):
( 0, 0, 0, 0, 0, 0, 0)
( α, α, α, α, α, α, α)
(α2, α2, α2, α2, α2, α2, α2)
(α3, α3, α3, α3, α3, α3, α3)
(α4, α4, α4, α4, α4, α4, α4)
(α5, α5, α5, α5, α5, α5, α5)
(α6, α6, α6, α6, α6, α6, α6)
( 1, 1, 1, 1, 1, 1, 1)
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.23-
G = ⎡ α^3  α    1   α^3   1    0    0 ⎤
    ⎢  0   α^3  α    1   α^3   1    0 ⎥
    ⎣  0    0   α^3  α    1   α^3   1 ⎦

GF(8), constructed from 1+x+x^3 (power and 3-tuple representation):
0               (0,0,0)
α^1 = α         (0,1,0)
α^2             (0,0,1)
α^3 = 1+α       (1,1,0)
α^4 = α+α^2     (0,1,1)
α^5 = 1+α+α^2   (1,1,1)
α^6 = 1+α^2     (1,0,1)
α^7 = 1         (1,0,0)

N.B. The number of codewords is 8^3 = 512.

Example: info symbols α^4 α^2 0   ⇒   codeword 1 0 α^6 α^6 1 α^2 0

Binary representation using the GF(8) table:
Example: info bits 011 001 000   ⇒   codeword 100 000 101 101 100 001 000
We now have a closer look at the (7,3,5) RS code over GF(8). A generator matrix can be
obtained from the generator polynomial g(x): the first row corresponds to g(x) itself, while
the second and third row are shifted versions over one and two positions, respectively.
The messages consist of 3 symbols from GF(8). Since each symbol can also be
represented by a unique binary 3-tuple, the message may also be represented by 9 bits.
Similarly, the codewords consist of 7 symbols, or 7×3 = 21 bits.
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.24-
The generator matrix may also be represented in a binary fashion. First, we replace all
symbols from GF(2^m) in the generator matrix by their m-tuple equivalent. For our m=3
example, this means that all symbols are replaced by 3-tuples. Further, note that for each
row not only the row itself is a codeword, but also α times this row, α^2 times this row, α^3
times this row, etc. In the binary representation, the first m rows in this sequence are
independent, while the others can be obtained as linear combinations of the first m. In
the slide example, note that the fourth one equals the binary sum of the first and the
second. In conclusion, each row r of the generator matrix over GF(2^m) leads to m rows
r, αr, α^2r, …, α^(m-1)r in the generator matrix over GF(2).
In general, a (2^m−1, 2^m−1−2t) RS code has a (2^m−1−2t) × (2^m−1) generator matrix over
GF(2^m), which may also be represented as an m(2^m−1−2t) × m(2^m−1) binary generator
matrix.
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.25-
(Slide figure: a binary burst superimposed on a row of m-bit symbols; a burst of length
m(t−1)+1 is always confined to at most t symbols, whereas a burst of length m(t−1)+2 that
starts at the last bit of a symbol spills over into t+1 symbols.)
Burst of length m(t-1)+2 is not guaranteed to be corrected!
Since the Hamming distance between m-tuples representing different symbols from
GF(2m) is at least equal to one, the Hamming distance of the binary representation of an
RS code is (at least) equal to 2t+1. Hence, the code is able to correct t random errors.
However, the popularity of RS codes is mainly due to their burst error correcting
capabilities. Observe that it does make a difference whether only one bit error hits a
symbol or up to m bit errors: both events lead to a wrong GF(2m) symbol. Since any
binary burst error of length up to m(t-1)+1 affects at most t GF(2m) symbols, it will be
corrected. A burst of length m(t-1)+2 may affect t+1 symbols if its first error bit hits the
last bit of a symbol, in which case it is not guaranteed to be corrected. In conclusion, any
binary burst error of length up to m(t-1)+1 is guaranteed to be corrected, while a burst of
length m(t-1)+j (with j=2,3,..,m) is corrected if it starts in the (≤m+1-j)th bit of a symbol.
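The counting argument above is easily checked; the following small Python fragment
(an illustration only, not part of the original notes; the function name symbols_hit is ours)
computes how many m-bit symbols are touched by a binary burst of length L that starts at
bit position p within a symbol.

```python
def symbols_hit(L, p, m):
    """Number of m-bit symbols touched by a burst of L consecutive bits
    that starts at bit position p (0 <= p < m) inside a symbol."""
    return (p + L - 1) // m + 1

m, t = 3, 2                                           # e.g. the (7,3,5) RS code over GF(8)
L = m * (t - 1) + 1                                   # = 4
print(max(symbols_hit(L, p, m) for p in range(m)))    # 2: always at most t symbols
print(symbols_hit(m * (t - 1) + 2, m - 1, m))         # 3: one symbol too many
```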
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.26-
Encoding:   codeword             1    0   α^6  α^6   1   α^2   0
            ⇒ transmitted bits  100  000  101  101  100  001  000
Channel:    received bits       100  011  110  101  100  001  000
            ⇒ received symbols   1   α^4  α^3  α^6   1   α^2   0
Decoding:   codeword             1    0   α^6  α^6   1   α^2   0
            ⇒ info symbols      α^4  α^2   0
            ⇒ info bits         011  001  000
Here the burst error correction capability of a Reed-Solomon code is illustrated for the
(7,3,5) code over GF(8). Because of the Hamming distance of 5, up to two symbol errors
can be corrected. In the above example a binary burst of length 5 affects the second and
the third of the seven codeword symbols and is thus corrected. Note that a burst of length
5 may also affect three symbols, in which case it is not corrected.
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.27-
Encoding:   codeword             1    0   α^6  α^6   1   α^2   0
            ⇒ transmitted bits  100  000  101  101  100  001  000
Channel:    received bits       000  000  101  101  000  000  000
            ⇒ received symbols   0    0   α^6  α^6   0    0    0
Decoding:   codeword             0    0    0    0    0    0    0
            ⇒ info symbols       0    0    0
            ⇒ info bits         000  000  000
Through the same (7,3,5) code, it is now illustrated that Reed-Solomon codes are not
particularly suited for random error correction. Only 3 of the 21 transmitted bits are in
error, but since they are spread they affect 3 of the 7 symbols. Since the code’s Hamming
distance is 5, the errors cannot be corrected.
et4-030 Error-Correcting Codes, Part 4: Block Codes II: Cyclic Codes -4.28-
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 5: Block Codes III: Co-operating Codes -5.1-
et4-030 Part 5
In this part we show how two (or more) block codes can be combined into new block
codes, which have good error correction capabilities for combinations of random and
burst errors. The new (long) codes make use of the encoding and decoding algorithms of
the (short) constituent codes, so the complexity can be held rather low.
et4-030 Error-Correcting Codes, Part 5: Block Codes III: Co-operating Codes -5.2-
Outline
Product Codes:
co-operating codes over the same alphabet, for
example two binary codes
Concatenated Codes:
co-operating codes over different alphabets, for
example a binary code and a code over GF(256)
Product Codes
Product codes use two (or more) q-ary block codes. The constituent codes may or may not
be identical. We assume they are systematic in the first positions. The length, dimension,
and Hamming distance of the product code are equal to the products of the corresponding
parameters of the constituent codes.
et4-030 Error-Correcting Codes, Part 5: Block Codes III: Co-operating Codes -5.4-
Codeword Set-up
(Slide figure: the codeword array. The columns are divided into k1 information columns and
n1-k1 check columns, the rows into k2 information rows and n2-k2 check rows; the upper left
part holds the information symbols, the upper right part the C1 checks, the lower left part the
C2 checks, and the lower right part the checks on checks.)
A codeword of the product code C1⊗C2 is a rectangular structure (“array”) of n2×n1 q-ary
symbols. We divide this array into four parts.
The upper left corner of k2×k1 symbols contains the k=k1k2 information symbols.
The upper right corner of k2×(n1-k1) symbols contains check symbols according to the
“row-code” C1. Each row from the information part is considered as an information vector
of length k1. The n1-k1 check bits generated by the (systematic) C1 encoding process are
put into the corresponding row of the upper right part.
The lower left corner of (n2-k2)×k1 symbols contains check symbols according to the
“column-code” C2. Each column from the information part is considered as an
information vector of length k2. The n2-k2 check bits generated by the (systematic) C2
encoding process are put into the corresponding column of the lower left part.
The lower right corner of (n2-k2)×(n1-k1) symbols contains check symbols on the check
symbols. Each row from the lower left part is considered as an information vector of
length k1, and the n1-k1 check bits generated by the (systematic) C1 encoding process are
put into the corresponding row of the lower right part. Alternatively, each column from
the upper right part may be considered as an information vector of length k2, and the n2-k2
check bits generated by the (systematic) C2 encoding process are put into the
corresponding column of the lower right part. Both approaches lead to the same sub-array
of (n2-k2)×(n1-k1) symbols.
Note that once the information symbols have been generated (by the source), all other
symbols are fixed. Hence, the number of codewords is q^k.
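As an illustration (not part of the original notes), the following Python sketch builds a
product-code codeword for the example of the next slides, with the (7,4,3) Hamming code as
row code C1 and the (8,2,5) code as column code C2, both systematic in their first positions;
numpy is used for the mod-2 matrix products.

```python
import numpy as np

# systematic generator matrices of the constituent codes
G1 = np.array([[1,0,0,0,1,1,1],           # row code C1: (7,4,3) Hamming code
               [0,1,0,0,1,1,0],
               [0,0,1,0,1,0,1],
               [0,0,0,1,0,1,1]])
G2 = np.array([[1,0,1,1,1,1,0,0],         # column code C2: (8,2,5)
               [0,1,1,1,0,0,1,1]])

def product_encode(info):
    """info is a k2 x k1 array; returns the n2 x n1 codeword array."""
    rows = info @ G1 % 2                  # encode every row with C1
    return (rows.T @ G2 % 2).T            # encode every column with C2

info = np.array([[1,1,0,0],
                 [0,1,1,0]])
print(product_encode(info))
# first two rows: 1100001 and 0110011, the codeword shown on the slide
```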
et4-030 Error-Correcting Codes, Part 5: Block Codes III: Co-operating Codes -5.5-
C1 ⊗ C2 :  length n1n2 = 56
           dimension k1k2 = 8
           distance d1d2 = 15
           alphabet GF(2)
As an example we consider the binary product codes with the (7,4,3) code C1 and the
(8,2,5) code C2 as constituent row and column codes, respectively. The length of the
product code is 7×8=56, i.e., the number of symbols in the array representing a codeword.
The dimension of the product code is 4×2=8, i.e., the number of symbols in the sub-array
representing the information symbols. Hence the code rate is only 8/56=1/7=0.14. Finally,
the Hamming distance of the product code is 3×5=15, i.e., two arrays representing
different codewords differ in at least 15 positions.
et4-030 Error-Correcting Codes, Part 5: Block Codes III: Co-operating Codes -5.6-
G1 = ⎡ 1 0 0 0 1 1 1 ⎤        G2 = ⎡ 1 0 1 1 1 1 0 0 ⎤
     ⎢ 0 1 0 0 1 1 0 ⎥             ⎣ 0 1 1 1 0 0 1 1 ⎦
     ⎢ 0 0 1 0 1 0 1 ⎥
     ⎣ 0 0 0 1 0 1 1 ⎦

Example codeword:
1100 001
0110 011
1010 010
1010 010
1100 001
1100 001
0110 011
0110 011
Since the example product code has eight information bits, it contains 256 codewords.
One of these is shown above. Note that each row in the 8×7 array is a codeword of C1,
while each column is a codeword of C2.
Another codeword from this product code reads
1000 111
0000 000
1000 111
1000 111
1000 111
1000 111
0000 000
0000 000
Note that this codeword and the one on the slide differ in 27 positions, i.e., the Hamming
distance between these two codewords is 27.
et4-030 Error-Correcting Codes, Part 5: Block Codes III: Co-operating Codes -5.7-
Properties:
• Low complexity!
• Correction of combinations of
random and burst errors
After row-wise transmission, the n1n2 received symbols are put into an array of the
same size as the codewords. A simple decoding procedure for product codes reads as
follows. First, decode the n2 rows of this array using a decoding algorithm for code C1,
and put the decoding results again in an n2×n1array. Then, decode the n1 columns of this
new array using a decoding algorithm for code C2, and put the decoding results in an
n2×n1array, of which the upper left k2×k1 sub-array gives the estimates for the information
symbols.
Note that this procedure is of low complexity, since all the decoding operations are on the
(small) constituent codes C1 and C2. Also, it is very much suited to correcting a
combination of random and burst errors, since the random errors will mostly be corrected
by the row decoder, while the uncorrectable burst errors in the rows may be dealt with by
the column decoder, as will be shown in the next example.
et4-030 Error-Correcting Codes, Part 5: Block Codes III: Co-operating Codes -5.8-
For convenience, we assume the transmitted codeword is the all-zero array, i.e., the errors
are in the bits which are received as a one. The left array denotes the received bits, which
show a burst of errors in the second and third row, and some scattered random errors in
the other rows. After row decoding, as indicated in the middle array, the random errors
have been corrected, while the second and third row are incorrect due to the fact that the
number of errors in these rows exceeds the error-correcting capability of the row decoder.
Hence, at most two errors remain per column, which can be corrected by the column
decoder (since d2=5). This gives a completely correct decoding result, as shown in the
right array.
Generalizing this reasoning, it is clear that this product code, with the proposed decoding
algorithm, can correct any error pattern in which six rows contain at most one error, while
the remaining two rows may contain any number of errors.
et4-030 Error-Correcting Codes, Part 5: Block Codes III: Co-operating Codes -5.9-
Again, we assume the transmitted codeword is the all-zero array, i.e., the errors are the
bits which are received as a one. The left array denotes the received bits, which show two
errors in the first, third and fourth row, and no errors in the other rows. Since the
Hamming distance of the row decoder is only 3, the first, third and fourth row are still in
error after row decoding, see the middle array. Hence, the column decoder is faced with
three errors in some of the columns, which is beyond its error correcting capability. This
gives an incorrect decoding result, as shown in the right array.
Note that in this case only six errors lead to an incorrect decoding result, although this
number is below half of the code’s Hamming distance 15. This is due to the sub-
optimality of the decoding algorithm. Another algorithm which compares the received
array with all possible codewords of the product code, and then chooses the closest (in
Hamming distance sense) one as the decoding result, would lead to the correct decoding
result in this case. However, such an algorithm is usually much more complex.
et4-030 Error-Correcting Codes, Part 5: Block Codes III: Co-operating Codes -5.10-
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 5: Block Codes III: Co-operating Codes -5.11-
Concatenated Codes
Inner Code C1:           Outer Code C2:
  length n                 length N
  dimension k              dimension K
  distance d               distance D
  alphabet GF(q)           alphabet GF(q^k)
In concatenated coding schemes, codes over different alphabets are combined. The
discussion here is restricted to two codes, but extension to more codes is obvious.
The two codes are denoted as inner and outer code. The inner code C1 is an (n,k,d) block
code over GF(q). The outer code C2 is an (N,K,D) block code over GF(q^k). Note that the
alphabet of the outer code follows from the alphabet and the dimension of the inner code.
The concatenated code is defined over the same (mostly binary) alphabet as the inner
code. Its length, dimension, and distance are the products of the corresponding parameters
of the inner and outer code.
et4-030 Error-Correcting Codes, Part 5: Block Codes III: Co-operating Codes -5.12-
(Slide figure: block diagram of the concatenated coding scheme: outer encoder, inner
encoder, channel, inner decoder, and outer decoder.)
C1 :  length n = 6               G1 = ⎡ 1 0 0 0 1 1 ⎤
      dimension k = 3                 ⎢ 0 1 0 1 0 1 ⎥
      distance d = 3                  ⎣ 0 0 1 1 1 0 ⎦
      alphabet GF(2)

C2 :  length N = 7               G2 = ⎡ α^3  α    1   α^3   1    0    0 ⎤
      dimension K = 3                 ⎢  0   α^3  α    1   α^3   1    0 ⎥
      distance D = 5                  ⎣  0    0   α^3  α    1   α^3   1 ⎦
      alphabet GF(8)

Concatenated Code:  length nN = 42
                    dimension kK = 9
                    distance dD = 15
                    alphabet GF(2)
In the example we choose a binary (6,3,3) inner code, which is in fact a shortened version
of the (7,4,3) Hamming code. Since the inner code is binary and of dimension 3, the outer
code should be over GF(23)=GF(8). Let’s take the (7,3,5) Reed-Solomon code as outer
code.
The resulting concatenated code is thus a binary (42,9,15) code. Hence, the code rate is
9/42=0.21.
et4-030 Error-Correcting Codes, Part 5: Block Codes III: Co-operating Codes -5.14-
In the encoding process, 9 information bits are transformed into 3 symbols from GF(8),
which are encoded using the Reed-Solomon code. The 7 resulting symbols are each
considered as binary 3-tuples, which are encoded using the shortened Hamming code. The
final result is a (3+4)×(3+3)=42 bit string.
et4-030 Error-Correcting Codes, Part 5: Block Codes III: Co-operating Codes -5.15-
The received string of 42 bits is first parsed into 7 strings of six bits, which are decoded
according to the shortened Hamming code. The resulting 7 binary 3-tuples are considered
as symbols from GF(8), which are collected into a vector to be decoded by the Reed-
Solomon decoder. This leads to three symbols from GF(8), which can be transformed into
3 bits each, so 9 bits in total.
Since the shortened Hamming code has Hamming distance 3, single errors in a 6-bit sub-
string are corrected at the inner decoding stage. Remaining errors are corrected at the
outer decoding stage, if they affect at most two symbols (since the Reed-Solomon code
has Hamming distance 5). In the example on the slide, we see that the three scattered
errors are corrected at the inner decoding stage, while the burst error affecting the
third and the fourth symbols is corrected at the outer decoding stage.
et4-030 Error-Correcting Codes, Part 5: Block Codes III: Co-operating Codes -5.16-
• Low complexity:
the concatenated code may be long, but all encoding
and decoding operations are on rather short codes
As for product codes, the conclusion for concatenated codes is that they provide good
correction capabilities for combined random and burst errors, at a rather low complexity.
et4-030 Error-Correcting Codes, Part 5: Block Codes III: Co-operating Codes -5.17-
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 5: Block Codes III: Co-operating Codes -5.18-
Picture Exercise
We consider binary pictures in an 8×11 format,
where each of the 88 pixels can be white (=0) or
black (=1). The images are transmitted over a
Binary Symmetric Channel with error probability
10^-3.
Each column is considered as an element from GF(256), and the
resulting 11 symbols are encoded using a (15,11,5) code.
1) How many codewords does this code have?
2) Calculate the probability that a decoded picture is not identical
to the transmitted picture.
3) In order to provide extra protection, each symbol is encoded by a
binary (12,8,3) code before transmission. Describe the error
correction capabilities of this concatenated coding scheme.
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.1-
et4-030 Part 6
In this part we introduce soft-decision decoding algorithms for binary block codes. Such
decoding algorithms take into consideration the reliability of the received bits, which
leads to a better performance, possibly at the expense of a higher complexity.
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.2-
Outline
Communication System
(Slide figure: communication block diagram. The binary source delivers information bits u
to the binary encoder, which outputs encoded bits x; the data modulator maps x to the
transmitted waveform x(t). The channel delivers the received waveform y(t) to the data
demodulator, which passes the received bits y and the reliability information α to the binary
decoder; the decoder delivers the estimated information bits v to the binary sink.)
We consider the above communication system using binary block codes. Bits generated
by the source are collected into an information vector u of length k. The bits are encoded
using an (n,k,d) code into a codeword x of length n. This bit string is mapped to a
waveform x(t) uniquely representing x. The modulation can be on a bit-by-bit basis (e.g.,
BPSK), but also higher-order modulation methods may be used (e.g., M-PSK, QAM).
After transmission, a distorted version y(t) of x(t) is received. This received waveform is
filtered, and the data demodulator delivers estimates yi for the transmitted bits xi to the
decoder. These yi are called hard decisions. The decoding algorithms discussed so far in
these lecture notes are based on these hard decisions. The new aspect in the soft-decision
decoding approach is that the demodulator also provides reliability information αi to the
decoder, which indicates the demodulator’s confidence in the estimate yi. Based on the
received binary vector y=(y1,y2,…,yn) and the associated reliability vector
α=(α1,α2,…,αn), the decoder determines a binary vector v, serving as the estimate for the
message u. The reliability information may lead to a better performance, i.e., P(v≠u) for
soft-decision decoding is lower than P(v≠u) for hard-decision decoding.
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.4-
Example Set-up
AWGN Channel
The ideas behind soft-decision decoding will be further explained by treating a simple
example:
• AWGN channel,
• BPSK modulation,
• binary (5,2,3) code.
Of course, soft-decision decoding may also be applied to other (more complicated)
channels, modulation methods, and codes. However, the basic principles remain the same.
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.5-
(Slide figure: the two conditional probability density functions of the channel output ρ,
centered at the BPSK values χ = +1 and χ = −1:)
p(ρ|χ) = (1/√(πN_0)) · e^(−(ρ−χ)²/N_0)
A graphical representation is shown above. The width of the curves is determined by the
noise power: the higher the noise power, the wider the curves.
Further assuming that zeros and ones are equally likely (P(x=0)=P(x=1)=0.5), the hard
decision y reads
y = 0 if ρ ≥ 0,
y = 1 if ρ < 0.
Mostly we take the reliability value
α = |ρ|,
but other choices may also be used. In general, the higher the value of α, the more reliable
the estimate y.
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.6-
Example 1:
• Generated codeword: 1 0 1 0 1
• BPSK Modulation: -1 +1 -1 +1 -1
• Channel output: -1.2 -0.2 -1.0 +0.8 -0.6
• Demodulator output: 1 1 1 0 1
• Decoder output: 1 0 1 0 1
For the example code, suppose that the message to be transmitted is “east”, which is
represented as u=(1,0). Hence, the codeword is x=(1,0,1,0,1), leading to the BPSK
representation χ=(-1,+1,-1,+1,-1). The received real values are represented in the vector
ρ=(-1.2, -0.2, -1.0, +0.8, -0.6), leading to the received binary vector y=(1,1,1,0,1) with
reliability vector α=(1.2, 0.2, 1.0, 0.8, 0.6). Note that an error occurred in the second
position.
A hard-decision decoder only makes use of the hard-decision vector y. It looks for the
codeword at closest Hamming distance from y. In this case we have
d(y,(0,0,0,0,0)) = 4,   d(y,(1,0,1,0,1)) = 1,   d(y,(0,1,0,1,1)) = 3,   d(y,(1,1,1,1,0)) = 2,
and thus the transmitted codeword (1,0,1,0,1) is the decoding result, leading to the correct
message “east”.
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.7-
Example 2:
• Generated codeword: 1 0 1 0 1
• BPSK Modulation: -1 +1 -1 +1 -1
• Channel output: -1.2 +0.9 +0.1 +0.7 +0.2
• Demodulator output: 1 0 0 0 0
• Decoder output: 0 0 0 0 0
For the same example code, again suppose that the message to be transmitted is “east”,
which is represented as u=(1,0). Hence, the codeword is still x=(1,0,1,0,1), leading to the
BPSK representation χ=(-1,+1,-1,+1,-1). This time the received real values vector is ρ=(-
1.2, +0.9, +0.1, +0.7, +0.2), leading to the received binary vector y=(1,0,0,0,0) with
reliability vector α=(1.2, 0.9, 0.1, 0.7, 0.2). Note that errors occurred in the third and fifth
position.
For hard-decision decoding, we compute the Hamming distances between the possible
codewords and y:
d(y,(0,0,0,0,0)) = 1,   d(y,(1,0,1,0,1)) = 2,   d(y,(0,1,0,1,1)) = 4,   d(y,(1,1,1,1,0)) = 3.
Hence, in this case the decoding result is the codeword (0,0,0,0,0) (representing “north”),
and not the transmitted codeword (1,0,1,0,1) (representing “east”). The two errors are
beyond the error correcting capability of the code, since its Hamming distance is only 3.
Still, a soft-decision decoding approach gives the correct message, as will be seen on the
next slide.
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.8-
Next, we consider Example 2 once again, but now using soft decisions. Instead of
calculating the Hamming distances between the hard decisions y and the four possible
codewords, we calculate the Euclidean distances between the real received values ρ and the
BPSK (±1) representations of the four possible codewords. For example, the Euclidean
distance between ρ=(-1.2, +0.9, +0.1, +0.7, +0.2) and the vector (+1,+1,+1,+1,+1)
representing the all-zero codeword is
√((−1.2−1)² + (0.9−1)² + (0.1−1)² + (0.7−1)² + (0.2−1)²) ≈ 2.5.
Note from the table that the transmitted codeword (1,0,1,0,1) is closest to the channel
output in the Euclidean distance sense (but not in the Hamming distance sense). Hence,
soft-decision decoding leads to the correct decoding result (“east”), while hard-decision
decoding leads to an incorrect decoding result (“north”).
For the AWGN channel and BPSK modulation, minimum Euclidean distance decoding is
equivalent to maximum-likelihood (ML) decoding. In ML decoding, we look for the
codeword x maximizing P(x|ρ). Since P(ρ|x)=P(ρ)P(x|ρ)/P(x), we may also maximize
P(ρ|x) = ∏_{i=1}^{n} (1/√(πN_0)) · e^(−(ρ_i − χ_i)²/N_0) = (1/√(πN_0))^n · e^(−∑_{i=1}^{n}(ρ_i − χ_i)²/N_0)
in case all codewords are equally likely (which is mostly the case). Indeed, note from the
formula that maximizing P(ρ|x) is equivalent to minimizing the Euclidean distance
between the received real vector ρ and the BPSK representation χ of codeword x.
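For the small (5,2,3) example code, this minimum-Euclidean-distance (ML) decoder can be
written out exhaustively; the Python sketch below (an illustration only, not part of the
original notes) reproduces the numbers of Example 2.

```python
codewords = [(0,0,0,0,0), (1,0,1,0,1), (0,1,0,1,1), (1,1,1,1,0)]

def ml_decode(rho):
    """Return the codeword whose BPSK image (0 -> +1, 1 -> -1) is
    closest to the received vector rho in Euclidean distance."""
    def dist(c):
        return sum((r - (1 - 2 * b)) ** 2 for r, b in zip(rho, c)) ** 0.5
    return min(codewords, key=dist)

rho = (-1.2, 0.9, 0.1, 0.7, 0.2)
print(ml_decode(rho))    # (1, 0, 1, 0, 1): distance 1.7, versus 2.5 for the all-zero word
```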
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.9-
Derive
a) the hard-decision decoding result,
b) the maximum-likelihood decoding result.
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.11-
When comparing hard and soft decision decoding, the latter performs better, i.e., the bit
error probability at the same SNR is lower, or, equivalently, the same bit error probability
is achieved at a lower SNR (typically 2 to 3 dB gain on AWGN channels). However, the
soft-decision decoding techniques (based on Euclidean distance calculations) are mostly
much more complex to implement than hard-decision decoding techniques, for which fast
algebraic (e.g. syndrome based) algorithms have been developed.
In order to have the best of both worlds, many hybrid decoding schemes trying to achieve
good performance at a low complexity have been proposed over the years. In most of
these the core of the decoder is still based on algebraic techniques (to have a low
complexity), while in the pre- and/or post-processing stages the reliability information
plays an important role (to increase the performance).
An important example of such hybrid decoders is the class of Chase decoders [Cha72],
which is further explored on the next slides.
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.12-
The Chase algorithms can be applied to any binary (n,k,d) block code. As stated before, they
combine the virtues of the algebraic and the probabilistic decoding approaches. Although
it is a rather old technique (the original paper [Cha72] dates from 1972), it is still very
relevant. These days, it is also used in iterative decoding of product codes, a process
better known as “block turbo codes” (see, e.g., “Near-Optimum Decoding of Product
Codes: Block Turbo Codes”, R.M. Pyndiah, IEEE Transactions on Communications,
vol. 46, no. 8, pp. 1003-1010, August 1998).
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.13-
The basic idea behind the Chase decoding principle is to use a conventional binary
(algebraic) decoder in a multi-trial fashion. In each of the l trials, some of the received bits
in the hard-decision vector y are inverted. The inversions are determined by a test pattern t,
which is a binary vector of length n, in which the ones indicate the positions to be inverted.
The resulting vector y’=y+t is decoded by the conventional binary decoder, which either
delivers a codeword x’ at Hamming distance at most ⎣(d-1)/2⎦ from y’, or gives a decoding
failure (if it cannot find a codeword at Hamming distance ⎣(d-1)/2⎦ or less). If an x’ has
been found, the error pattern with respect to the received vector y is calculated as e’=x’+y.
The above is done for all l test patterns t from the test set T. This results in at most l
codeword estimates x’, of which the one with ±1 representation χ’ at closest Euclidean
distance from ρ is chosen as the final decoder output. In case all l trials result in a
decoding failure, y itself is taken as the decoding result.
Since the Euclidean distance between a ±1 vector χ representing a codeword x and a real
vector ρ is
∑_{i=1}^{n} (χ_i − ρ_i)² = ∑_{i=1}^{n} ((χ_i)² − 2χ_iρ_i + (ρ_i)²) = ∑_{i=1}^{n} (1 + (ρ_i)² − 2χ_iρ_i),
we may also maximize the correlation ∑ χ_iρ_i instead of minimizing the Euclidean distance.
With the reliability values α_i = |ρ_i|, this can be further simplified to minimizing the
analog weight
∑_{i: x_i ≠ y_i} α_i ,
since
∑_{i=1}^{n} χ_iρ_i = ∑_{i: x_i = y_i} α_i − ∑_{i: x_i ≠ y_i} α_i = ∑_{i=1}^{n} α_i − 2 ∑_{i: x_i ≠ y_i} α_i .
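The complete multi-trial procedure can be summarised in a short Python sketch (an
illustration only, not part of the original notes; for this toy (5,2,3) code the role of the
algebraic bounded-distance decoder is played by a brute-force search over the four
codewords).

```python
codewords = [(0,0,0,0,0), (1,0,1,0,1), (0,1,0,1,1), (1,1,1,1,0)]
d = 3

def bounded_distance_decode(y):
    """Stand-in for the algebraic decoder: return a codeword within
    Hamming distance (d-1)//2 of y, or None (decoding failure)."""
    for c in codewords:
        if sum(a != b for a, b in zip(c, y)) <= (d - 1) // 2:
            return c
    return None

def chase_decode(rho, test_patterns):
    y = tuple(1 if r < 0 else 0 for r in rho)          # hard decisions
    alpha = [abs(r) for r in rho]                      # reliabilities
    best, best_weight = y, float("inf")
    for t in test_patterns:
        y_trial = tuple(a ^ b for a, b in zip(y, t))   # invert selected bits
        x = bounded_distance_decode(y_trial)
        if x is None:
            continue
        analog_weight = sum(a for a, xi, yi in zip(alpha, x, y) if xi != yi)
        if analog_weight < best_weight:
            best, best_weight = x, analog_weight
    return best

rho = (-1.2, 0.9, 0.1, 0.7, 0.2)
tests = [(0,0,0,0,0), (0,0,1,0,0)]     # 2 trials: no inversion / invert least reliable bit
print(chase_decode(rho, tests))        # (1, 0, 1, 0, 1)
```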
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.14-
Example 2 Revisited
Example (n=5, k=2, d=3) code, with codewords:
0 0 0 0 0     1 0 1 0 1     0 1 0 1 1     1 1 1 1 0
Example 2 revisited:
• Generated codeword: 1 0 1 0 1
• BPSK Modulation: -1 +1 -1 +1 -1
• Channel output: -1.2 +0.9 +0.1 +0.7 +0.2
• Demodulator output: 1 0 0 0 0
with reliability 1.2 0.9 0.1 0.7 0.2
• Trial 1: decode 1 0 0 0 0 ⇒ 0 0 0 0 0
• Trial 2: decode 1 0 1 0 0 ⇒ 1 0 1 0 1
• Ed1=2.5, Ed2=1.7 ⇒ output: 1 0 1 0 1
We revisit the example with the binary (5,2,3) code and received values ρ=(-1.2, +0.9,
+0.1, +0.7, +0.2). We apply a 2-trial Chase decoder, which does not invert any symbol in
the first trial, and which inverts the most unreliable symbol in the second trial. From the
signs of the coordinates of ρ we have y=(1,0,0,0,0), and thus the first trial (test pattern
t=(0,0,0,0,0)) gives x’=(0,0,0,0,0), which is the codeword at closest Hamming distance
from y. In the second trial, we invert the third coordinate of y, since this is the smallest
coordinate in α=(1.2, 0.9, 0.1, 0.7, 0.2). Hence, t=(0,0,1,0,0), and we decode
y’=y+t=(1,0,1,0,0), giving x’=(1,0,1,0,1) as the codeword at closest Hamming distance. For
both decoding results (0,0,0,0,0) and (1,0,1,0,1), we consider the ±1 representations, which
are (+1,+1,+1,+1,+1) and (-1,+1,-1,+1,-1) respectively. Next, we calculate the Euclidean
distance between ρ and these two ±1 vectors, giving 2.5 and 1.7, respectively.
Alternatively, we could also compute the analog weights, which are 1.2 and 0.1+0.2=0.3,
respectively. Either way, the decoder output is the codeword obtained in the second trial,
i.e., (1,0,1,0,1).
Note that this 2-trial method gives the same result as the exhaustive search from Slide 6.8.
Here, the reduction in complexity is not that impressive due to the small code parameters.
However, for large codes, considerable reductions can be obtained.
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.15-
In his original paper [Cha72], Chase has proposed three algorithms, i.e., three ways to
choose the test set T. The test sets are given on the slide, except for “Chase 3” in case the
Hamming distance d is even, for which there is no inversion in the first trial, inversion of
the least reliable bit in the second trial, inversion of the three least reliable bits in the third
trial, …, and inversion of the d-1 least reliable bits in trial d/2+1. Note the decreasing
complexity going from “Chase 1” to “Chase 2” to “Chase 3”. For “Chase 3” the
complexity grows only linearly with the Hamming distance d.
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.16-
Here the test sets for “Chase 1”, “Chase 2”, and “Chase 3” are shown in case of a (15,5,7)
code. Note the decreasing sizes of the test sets. In order to be able to show the “Chase 2”
and “Chase 3” test patterns, it is assumed without loss of generality that the symbols in
the received word are ordered according to increasing reliability. For example, the test
pattern (1,1,0,0,…,0) indicates that in the corresponding trial the two least reliable bits in
y are inverted. For convenience, the test pattern (1,1,…,1,0,0,…,0) starting with i ones
and ending with n-i zeros is denoted by ti.
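For completeness, the “Chase 2” and “Chase 3” test sets can be generated as in the Python
sketch below (an illustration only, not part of the original notes; it assumes, as above, that the
bit positions are ordered from least to most reliable, and it uses the common definition of
“Chase 2” as all 2^⌊d/2⌋ patterns confined to the ⌊d/2⌋ least reliable positions).

```python
from itertools import product

def chase2_test_set(n, d):
    """All 2^(d//2) patterns confined to the d//2 least reliable positions."""
    half = d // 2
    return [bits + (0,) * (n - half) for bits in product((0, 1), repeat=half)]

def t(i, n):
    """Pattern t_i: invert the i least reliable positions."""
    return (1,) * i + (0,) * (n - i)

def chase3_test_set(n, d):
    """t_0, t_2, t_4, ... (odd d) or t_0, t_1, t_3, ... (even d), up to t_{d-1}."""
    start = 2 if d % 2 else 1
    return [t(0, n)] + [t(i, n) for i in range(start, d, 2)]

print(len(chase2_test_set(15, 7)))   # 8 test patterns for a (15,5,7) code
print(chase3_test_set(15, 7)[:2])    # t_0 and t_2
```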
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.17-
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.18-
Chase 2, 3 Performance
(Plot: bit error probability Pe versus Eb/N0 (dB), from 2 to 5 dB, for a hard-decision decoder,
a “Chase 3” decoder, and a “Chase 2” decoder; AWGN channel, BPSK, (15,5,7) code.)
In this plot we show the performance (bit error probability Pe as a function of Eb/N0 (in
dB)) of a hard-decision decoder, a “Chase 3” decoder, and a “Chase 2” decoder, all for a
binary (15,5,7) BCH code, BPSK modulation, and an AWGN channel. Note that both
Chase decoders significantly improve upon the hard-decision decoder.
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.19-
Asymptotic Results
Let γ = (Eb/N0)·(k/n).
Asymptotically (i.e., at high γ), we have the following
probabilities for an incorrect decoding result:
Hard-Decision Decoder:          e^(−γ(⎡d/2⎤ + o(γ)))
Chase Decoders 1, 2, and 3:     e^(−γ(d + o(γ)))
Maximum Likelihood Decoder:     e^(−γ(d + o(γ)))
⇒ Chase decoding as good as ML decoding
at high SNR, while hard-decision
decoding loses about 3 dB
Let γ be defined as the product of Eb/N0 and the code rate R=k/n, i.e., γ is the amount of
energy per transmitted channel bit Eb•k/n divided by the noise power spectral density N0.
At large γ, i.e., high SNR, the probabilities of an incorrect decoding result (i.e., the
message generated by the source differs from the message delivered to the user) are as
indicated on the slide for a hard-decision decoder, the Chase decoders, and an ML
decoder. Here o(γ) is defined as a function that goes to zero as γ→∞.
Note that asymptotically all three Chase decoders perform as well as an ML decoder.
Further note that the exponents in the expressions for hard-decision decoding and ML
decoding differ by a factor of d/⎡d/2⎤, which is (roughly) equal to 2 (for even or large odd
Hamming distance d). Hence, we need (about) twice the amount of energy in order to let
a hard-decision decoder have the same probability of incorrect decoding as an ML
decoder. In other words, the loss is about 3 dB.
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.20-
The “Chase 3” algorithm uses only ⎣d/2⎦+1 trials and still performs quite well,
asymptotically even as well as an ML decoder. The research question we may ask
ourselves is whether the number of trials may be further reduced while maintaining an
acceptable performance. Suppose that we allow only l<d/2 trials. How should we choose
the l test patterns in such a limited-trial Chase decoder? Since 2001, several M.Sc.
students at Delft University of Technology have investigated this and related research
problems. Here, we briefly report on their results.
Giampiero Arico has proposed test sets for l trials, where l is any integer in between 1 and
d/2. Surprisingly, it turned out that some of his test sets performed better than the “Chase
3” algorithm, in spite of the fact that the number of trials is lower! He also showed that
asymptotically optimal performance is possible in only about d/4 trials, i.e., at half the
complexity of the “Chase 3” algorithm! The key to these results is the observation that it
is better to choose test patterns ti with i having the same parity as the Hamming distance
d, rather than the opposite parity (as in “Chase 3”). Details can be found in this paper
(available on Blackboard):
G. Arico and J.H. Weber, “Limited-Trial Chase Decoding”,
IEEE Transactions on Information Theory, vol. 49, no. 11,
pp. 2972-2975, November 2003.
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.21-
LT-Chase Performance
(Plot: bit error probability Pe versus Eb/N0 (dB), from 4 to 7 dB, for hard-decision decoding,
the test sets {t0, t2} and {t0, t3}, and “Chase 3”; AWGN channel, BPSK, (15,7,5) code.)
Here we show simulation results for the performance of limited-trial Chase decoders in a
communication system with an AWGN channel, BPSK modulation, and a (15,7,5) BCH
code. Hard-decision decoding can be considered as using only one trial with test pattern
t0. Extending this with a second trial in which the two most unreliable received bits are
inverted (test pattern t2) gives a performance improvement, as may be expected. Further
extension with a third trial in which the four most unreliable bits are inverted (test pattern
t4) gives a (small) extra improvement. Note that this 3-trial algorithm is the one known as
“Chase 3” (test set {t0, t2, t4}). Surprisingly, the 2-trial algorithm with test set {t0, t3}
performs better than “Chase 3”!
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.22-
Another M.Sc. student, Fred Kossen, has investigated the performance of (limited-
trial) Chase decoders on the Rayleigh fading channel. One of his conclusions was that
optimal test set design should take into account the nature of the channel, since he
observed that a test set which performs better than another test set on the AWGN
channel may perform worse than that other test set on the Rayleigh fading channel.
Details can be found in this paper (available on Blackboard):
F. Kossen and J.H. Weber,
“Performance Analysis of Limited-Trial Chase Decoders”,
Proceedings of the 23rd Symposium on Information Theory
in the Benelux, Louvain-la-Neuve, Belgium,
pp. 79-86, May 29-31, 2002.
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.23-
LT-Chase Performance
(Rayleigh, BPSK, (15,7,5) Code)
[Figure: bit error probability Pe versus Eb/N0 in dB (7 to 12 dB) for hard-decision decoding, Chase 3, and the 2-trial test set {t0, t5}.]
Here we show simulation results for the performance of limited-trial Chase decoders in a
communication system with a Rayleigh fading channel, BPSK modulation, and a (15,7,5)
BCH code. Hard-decision decoding can be considered as using only one trial with test
pattern t0. As for the AWGN channel, “Chase 3” (test set {t0, t2, t4}) gives a considerable
improvement. The 2-trial algorithm with test set {t0, t5} performs even better than “Chase
3”, although it uses one trial less! Further, the 2-trial algorithm with test set {t0, t5}
also performs a little better than the 2-trial algorithm with test set {t0, t3} (not shown),
which was the best performing 2-trial test set on the AWGN channel for this code.
et4-030 Error-Correcting Codes, Part 6: Block Codes IV: Soft-Decision Decoding -6.24-
In the Chase decoders considered so far the test set was fixed. However, we may also
choose a test set based on the received reliability values, i.e., we make the decoding
dynamic. Carlos Barrios Vicente has proposed such a dynamic Chase decoding algorithm.
His main idea is to split an Arico test set into two equal parts. The resulting dynamic
decoder performs better than a static decoder using the same number of trials. In
particular, Barrios Vicente has shown that asymptotically optimal performance is possible
with a dynamic algorithm using only about d/8 trials. Note that this is a complexity
reduction by a factor of four compared to “Chase 3”, and a factor of two compared to
Arico. Further details can be found in this paper (available on Blackboard):
C. Barrios Vicente and J.H. Weber, "Dynamic Chase decoding algorithm",
Proceedings IEEE Information Theory Workshop, Paris, France,
pp. 312-315, March 31 - April 4, 2003.
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.1-
et4-030 Part 7
Convolutional Codes
Convolutional Codes 1
The codes discussed in Parts 3 to 6 were of the block type: codewords were formed by
adding a block of n-k check symbols to a block of k information symbols. These codes
could be described by vectors/matrices or polynomials over GF(q).
In this part we introduce codes of the convolutional type [Eli55]. One feature of this
coding type is that the information symbols are considered as a stream rather than a block.
In principle this stream can be of infinite length, but in practice it is always finite, of
course. Still, the input stream is typically much longer than an information block in the
block coding concept. For every information symbol, a convolutional code generates
corresponding code symbols, which are to be transmitted over the channel. The values of
the code symbols may also be influenced by one or more preceding information symbols.
The encoder output may thus be considered as the convolution of the input and the
impulse response of the encoder.
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.2-
Outline
Viterbi Decoding
Sequential Decoding
Convolutional Codes 2
First, the basics of convolutional codes are introduced. Like a block code, a convolutional
code can be described by a generator matrix, though the entries are in this case
polynomials rather than scalars. Implementation can be achieved by use of shift registers.
Alternatively, a convolutional code may also be described by a state diagram or a trellis.
Next, we consider two decoding methods for convolutional codes. Viterbi decoding is a
maximum likelihood method. In cases where this is too complex for implementation,
sequential decoding methods may be considered.
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.3-
Convolutional Codes
Convolutional Codes 3
In these lecture notes we restrict ourselves to binary convolutional codes, though the
concept can easily be extended to larger alphabets. A generator matrix G=(gij) is a k×n
matrix, in which each of the entries gij= gij(x) (0≤i≤k-1, 0≤j≤n-1) is a binary polynomial.
The highest degree among all polynomials gij(x) is denoted by M, which is called the
memory of the convolutional code.
Implementation can be achieved by a shift register with k inputs and n outputs, as will be
illustrated on the next slide. Mostly k and n are very small, not only in the examples
shown in these lecture notes, but also for codes used in practice. Often k=1 and n=2 or 3.
The information bit stream is represented by I=(I0(x),I1(x),…,Ik-1(x)), where Ii(x)=ui,0+
ui,1x+ ui,2x2+… The coefficients ui,m are the bits generated by the source. The code bit
stream is represented by C=(C0(x),C1(x),…,Cn-1(x)), where Cj(x)=cj,0+ cj,1x+ cj,2x2+…,
which is obtained by multiplying the information stream I and the generator matrix G:
C = I G.
The coefficients cj,m are the code bits to be transmitted over the channel. Note that
cj,m = ∑i=0..k−1 ∑s=0..M ui,m−s gi,j,s ,
where gi,j,s denotes the coefficient of x^s in gi,j(x).
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.4-
input 1 1 1 0 0 0 ...
output 111 001 100 011 101 000 ...
input 1 0 1 0 0 0 ...
output 111 110 010 110 101 000 ...
Convolutional Codes 4
In this example the generator matrix G=(1+x+x2, 1+x, 1+x2). Since the matrix has one row
and three columns it follows that k=1 and n=3. Since the highest degree among the
polynomials is two we have memory M=2. This code can be implemented by a shift register
with k=1 input, n=3 outputs, and M=2 memory cells. In some books/publications an extra
cell is included containing the current input bit (indicated by the dotted box), but this is not
really necessary. The composition of an output bit follows from the corresponding
polynomial in the generator matrix. For example, the second output corresponds to the
polynomial 1+x, which means that it is formed by the binary sum of the current and the
previous input bit.
The bits u0, u1, u2,… generated by the source are contained in the information polynomial
I=I0(x)=u0+ u1x+ u2x2+… If the information polynomial is 1+x+x2, then
C = (1+x+x2) G = (1+x2+x4, 1+x3, 1+x+x3+x4)
= (1,1,1) + (0,0,1)x + (1,0,0)x2 + (0,1,1)x3 + (1,0,1)x4
resulting in the code bit string as indicated on the slide. Note that the code bit string resulting
from the information polynomial 1+x2 differs in seven positions from the previous string
(while the information strings differ in only one position). As for block codes, a large
distance between code strings means a high error detection and/or correction capability.
Check that the indicated output strings also follow from the shift register representation, with
zeros as the initial contents of the two memory cells.
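The shift register encoding can also be checked with a small Python sketch (our own helper names, not part of the notes), which reproduces the two output strings shown on the slide:

# generator matrix G = (1+x+x^2, 1+x, 1+x^2), given as coefficient tuples (x^0, x^1, x^2)
GENERATORS = [(1, 1, 1), (1, 1, 0), (1, 0, 1)]

def conv_encode(bits, memory=2):
    """Encode an information bit list and terminate with 'memory' zero bits."""
    state = [0] * memory                              # contents of the memory cells
    out = []
    for u in list(bits) + [0] * memory:
        window = [u] + state                          # current bit, previous bit, bit before that
        out.append(''.join(str(sum(g * b for g, b in zip(gen, window)) % 2)
                           for gen in GENERATORS))
        state = [u] + state[:-1]                      # shift the register
    return ' '.join(out)

print(conv_encode([1, 1, 1, 0]))   # 111 001 100 011 101 000
print(conv_encode([1, 0, 1, 0]))   # 111 110 010 110 101 000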
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.5-
Definitions
k inputs, n outputs

Generator Matrix:
        ⎛ g0,0     g0,1     ...  g0,n−1   ⎞
   G =  ⎜ g1,0     g1,1     ...  g1,n−1   ⎟
        ⎜ ...      ...      ...  ...      ⎟
        ⎝ gk−1,0   gk−1,1   ...  gk−1,n−1 ⎠

where the gi,j are polynomials over GF(2)
As indicated before, a binary convolutional code can be defined by a k×n generator matrix
in which all entries are binary polynomials. The highest degree among these polynomials
is denoted as the memory M, while M+1 is called the constraint length K. The latter
definition refers to the observation that the value of a certain input bit not only influences its
own encoding step, but also the next M encoding steps. In the example from the previous
slide (M=2, so K=3), note that the difference in the second position of the two input
strings not only leads to differences in the corresponding three output bits, but also to
differences in the next M=2 substrings of n=3 bits each.
Since every k input bits generated by the source lead to n output bits to be transmitted
over the channel, the code rate of the convolutional code can be defined by R=k/n. The
example code is thus a rate 1/3 code.
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.6-
Scalar Matrix
Generator matrix G = (1+x+x2, 1+x, 1+x2)
                   = (1,1,1) + (1,1,0) x + (1,0,1) x2

Scalar matrix:
   ⎛ 111 110 101 000 000 000 ... ⎞
   ⎜ 000 111 110 101 000 000 ... ⎟
   ⎜ 000 000 111 110 101 000 ... ⎟
   ⎜ 000 000 000 111 110 101 ... ⎟
   ⎝ ... ... ... ... ... ...     ⎠

# rows kL ⇒ # columns n(L+M)

Example (k=1, n=3, M=2, L=3):
   ⎛ 111 110 101 000 000 ⎞
   ⎜ 000 111 110 101 000 ⎟
   ⎝ 000 000 111 110 101 ⎠
Convolutional Codes 6
By decomposing the generator matrix according to the powers of x, a scalar matrix can be
obtained as follows. Let the first k×n submatrix contain the coefficients of x0, the next k×n
submatrix the coefficients of x1, etc. The concatenation of these submatrices forms the
first row of the scalar matrix. Each next row is formed by shifting the previous row over
n positions to the right, and substituting a k×n all-zero matrix at the beginning of the row.
For example, the code bit string for the information string 1,1,1,0,0,0,0,… is obtained by
the addition of the first three rows, while the code bit string for the information string
1,0,1,0,0,0,0,… is obtained by the addition of the first row and the third row (check the
results with the results obtained from the earlier representations).
In principle, the scalar matrix is of infinite size. In practice, the number of information
bits is finite, and the number of rows is fixed at kL. From the composition of the matrix
we observe that all entries beyond the first n(L+M) columns are zero and may thus be
omitted. This leaves a kL×n(L+M) matrix, and thus an actual code rate of kL/(n(L+M)).
Usually L>>M, in which case this code rate is close to k/n. In the slide example we choose
a rather small value L=3, for obvious reasons.
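The scalar matrix description is easy to verify numerically. The following sketch (our own code, using numpy) builds the kL×n(L+M) matrix for the example with L=3 and reproduces the code bit string for the information string 1,0,1:

import numpy as np

n, M, L = 3, 2, 3
blocks = [[1, 1, 1], [1, 1, 0], [1, 0, 1]]               # coefficient sub-blocks of G for x^0, x^1, x^2
row0 = np.array(sum(blocks, []) + [0] * (n * (L - 1)))   # first row, length n(L+M) = 15

G_scalar = np.array([np.roll(row0, n * i) for i in range(L)])   # each next row shifted by n
u = np.array([1, 0, 1])                                         # information bits
c = (u @ G_scalar) % 2
print(''.join(map(str, c)))   # 111110010110101, as obtained from the earlier representations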
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.7-
State Diagram
State: “Contents of Memory Cells”
[State diagram: states 00, 10, 01, 11; solid arrows for input 0, dotted arrows for input 1; the three output bits xxx are shown along each arrow.]
Convolutional Codes 7
A convolutional code can also be represented by a state diagram (finite state machine).
The state refers to the contents of the memory cells in the shift register representations.
For the example codes there are two memory cells, and thus four states: 00 (initial state),
10, 01, and 11. For each state, the input bit may be either 0 (indicated by a solid arrow) or
1 (dotted arrow). In any case, the encoder produces three output bits (indicated along the
arrows) and goes to the same or another state. For example, we consider the situation that
the encoder is in state 10. If the input is 0, the output is (0+1+0,0+1,0+0)=(1,1,0), and the
new state is 01. If the input is 1, the output is (1+1+0,1+1,1+0)=(0,0,1), and the new state
is 11. Similarly for the other three states.
Encoding can now be considered as a walk through the state diagram, starting from the
all-zero state. For example, again considering the information string 1,1,1,0,0,0,0,…, we
start in 00, follow the dotted arrow (output 111) leading to state 10, follow the dotted
arrow (output 001) leading to state 11, follow the dotted arrow (output 100) leading to
state 11, follow the solid arrow (output 011) leading to state 01, etc.
The minimum Hamming distance between any two different code strings is called the
(free) distance of the convolutional code. Note that if all the input bits are zero, also all
the output bits are zero. As for linear block codes, the distance of a convolutional code
equals the smallest non-zero codeword weight. Since any codeword is represented by a
walk through the state diagram, we find the distance by looking for a path of minimum
weight starting and ending in the all-zero state, but leaving the all-zero state at least once.
For the example code, we easily observe that the distance is 3+2+2=7, and thus three
errors can be corrected.
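The search for the minimum-weight path can also be automated. The sketch below (our own code) runs a shortest-path search over the state diagram of the example code and indeed returns free distance 7:

import heapq

def step(state, u):
    """One encoding step of G = (1+x+x^2, 1+x, 1+x^2); state = (u_{t-1}, u_{t-2})."""
    s1, s2 = state
    out = ((u + s1 + s2) % 2, (u + s1) % 2, (u + s2) % 2)
    return sum(out), (u, s1)                       # (output weight, new state)

def free_distance():
    w0, s0 = step((0, 0), 1)                       # leave the all-zero state with input 1
    heap, settled = [(w0, s0)], {}
    while heap:
        w, s = heapq.heappop(heap)
        if s == (0, 0):
            return w                               # first (lightest) return to the all-zero state
        if settled.get(s, 10**9) <= w:
            continue
        settled[s] = w
        for u in (0, 1):
            dw, ns = step(s, u)
            heapq.heappush(heap, (w + dw, ns))

print(free_distance())   # 7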
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.8-
Trellis
Trellis: “State Diagram in Time”
[Figure: trellis with the four states 00, 10, 01, 11 repeated over time; solid arrows for input 0, dotted arrows for input 1, the three output bits shown along each arrow.]
Convolutional Codes 8
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.9-
Trellis (L=3)
Trellis ends at “time” t=L+M
N.B. In practice L>>M
[Figure: terminated trellis for L=3 and M=2, ending in the all-zero state at t=L+M=5.]
When terminating the trellis at t=L, we further exploit the memory of the system by
running M more cycles with input 0. In this way, we always end in the all-zero state at
t=L+M. Remember that L is usually much larger than M. However, in the example code
we have M=2 and L=3. Check that the input string 1,0,1 leads from the initial state 00 to
the states 10, 01, 10, 01, 00, giving output 111, 110, 010, 110, 101.
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.10-
Convolutional Codes 10
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.11-
[Figure: trellis in which each branch between time t and t+1 is labelled with the Hamming weight of the binary sum of its encoder output and the corresponding received bits.]
Convolutional Codes 11
The Viterbi decoding algorithm for convolutional codes [Vit67] can be easiest described
using the trellis representation. Instead of putting the encoder output at the arrows
between the states from time t to time t+1, we put the (Hamming) weight of the binary
sum vector of this encoder output and the corresponding received bits. For example, for
the solid arrow from state 00 at t=0 and state 00 at t=1, the weight is 2, since
110+000=110. Similarly, for the dotted arrow from state 00 at t=0 and state 10 at t=1, the
weight is 1, since 110+111=001.
If the channel bit error probability is below ½, then the maximum likelihood estimation
for the transmitted bit string is the shortest path (in terms of Hamming weights) between
the all-zero state at t=0 and the all-zero state at t=L+M. In order to find this we could
calculate the weights of all 2^(kL) possible paths. However, since L is large, this is infeasible.
Fortunately, Viterbi has proposed a method whose complexity increases only
linearly with L (rather than exponentially).
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.12-
The key to the Viterbi decoding algorithm is the idea illustrated above. Let P be the
shortest path between the all-zero state at t=0 and the all-zero state at t=L+M. If this path
is in a certain state X at a certain time t=i, then the subpath P1 of P must be the shortest
path between the all-zero state at t=0 and state X at t=i. In order to prove this, suppose
that there exists a shorter path Q between the all-zero state at t=0 and state X at t=i. Then
the path (Q,P2) between the all-zero state at t=0 and the all-zero state at t=L+M would be
shorter than P, which gives a contradiction.
In the Viterbi algorithm, we determine the shortest path between the all-zero state at t=0
and state X at t=i for all X and i. In order to do so, we do not have to work our way from
the start over and over again, but we make use of the results already obtained for t=i-1. If
we know the shortest paths for all states at t=i-1, then the shortest path to state X at t=i
can be found by considering all incoming arrows. For each such arrow A, add the
weight of A to the weight of the shortest path P’ to the starting state of A. The (P’,A) with
lowest weight is selected as the shortest path to X at t=i. Starting at the all-zero state at
t=0, and repeating this procedure for all states for i=1,2,…,L+M, the final decoding result
is the path appearing for the all-zero state at t=L+M.
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.13-
[Figure: Viterbi decoding of the received string 110,110,110,010,101; for every state and time instant the weight and the input bits of the shortest path so far are shown; the final decoding result is the path 10100, of weight 3, ending in state 00 at t=5.]
Convolutional Codes 13
Here, the general Viterbi decoding procedure is illustrated for the example code when the
received bit string is 110,110,110,010,101. As an example, we determine the path to state
11 at t=3. Note that there are two incoming arrows: one from state 10 with weight 3, and
one from state 11 with weight 1. Both are dotted, meaning that they represent an input bit
equal to 1. Since the shortest path to state 10 at t=2 was determined to be 01 (of weight 3),
and since the shortest path to state 11 at t=2 was determined to be 11 (of weight 4), our
two path options are 011 (of weight 3+3=6) and 111 (of weight 4+1=5). Hence, the result
is 111 (of weight 5) as the shortest path from 00 at t=0 to state 11 at t=3.
The final decoding result is the path 10100 from state 00 at t=0 to state 00 at t=5. Of this
path only the first L=3 bits 101 are of interest, since the last M=2 bits are 0 by default.
Note that the code string corresponding to the information bits 101 is
111,110,010,110,101, which differs indeed in three positions from the received string
110,110,110,010,101. Also, for this small example it can be easily checked that the other
2^3−1=7 code strings differ in at least four positions from the received string. Hence, the
Viterbi algorithm indeed delivers the code string which is closest to the received string in
Hamming distance sense.
So far we have discussed the Viterbi decoding algorithm in a hard-decision mode.
However, replacing the Hamming weight labels by analog weight labels, the Viterbi
algorithm may also be used to perform soft-decision decoding.
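For the example code, the complete hard-decision Viterbi decoder fits in a few lines of Python. The sketch below (our own code, with states written as pairs (u_{t-1}, u_{t-2})) decodes the received string 110,110,110,010,101 to the information bits 101 found above:

def step(state, u):
    """One encoding step of G = (1+x+x^2, 1+x, 1+x^2)."""
    s1, s2 = state
    out = ((u + s1 + s2) % 2, (u + s1) % 2, (u + s2) % 2)
    return out, (u, s1)

def viterbi(received, L, M=2):
    paths = {(0, 0): (0, [])}                        # state -> (weight, input bits so far)
    for t, r in enumerate(received):
        nxt = {}
        for state, (w, bits) in paths.items():
            for u in ((0, 1) if t < L else (0,)):    # after t = L only the termination zeros
                out, ns = step(state, u)
                dw = sum(a != b for a, b in zip(out, r))
                if ns not in nxt or w + dw < nxt[ns][0]:
                    nxt[ns] = (w + dw, bits + [u])   # keep only the shortest path per state
        paths = nxt
    return paths[(0, 0)][1][:L]                      # drop the M termination zeros

received = [(1,1,0), (1,1,0), (1,1,0), (0,1,0), (1,0,1)]
print(viterbi(received, L=3))                        # [1, 0, 1]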
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.14-
• Created by student Xander Burgerhout
• Demonstration showing
  • Convolutional encoding
  • Channel simulation
  • Viterbi decoding
Convolutional Codes 14
Tree Representation
input 0: “up”
input 1: “down”
[Figure: code tree; example path for input 101…]
Convolutional Codes 15
The Viterbi algorithm provides a very strong maximum likelihood decoding mechanism
for convolutional codes. However, note that the complexity of the Viterbi algorithm is
growing exponentially with the memory M. Since a large value of M may be attractive
(for example to create a large free distance), there is still a need for other (sub-optimal)
decoding algorithms as well. One such class is formed by the sequential decoding
algorithms, which do not always deliver the maximum likelihood code sequence, but are
on average less complex than the Viterbi algorithm.
In order to describe the sequential decoding algorithms, we introduce (again) another
representation of convolutional codes. In the tree representation the encoding process
starts at the root (at the left side). If the input bit is a zero we move up in the tree, else we
move down. As in a trellis, we label the branches of the tree with the corresponding
output bits.
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.16-
[Figure: code tree of the example code for L=2 and M=2, with the output bits along the branches; the path for the example input 01 gives output 000 111 110 101.]
Convolutional Codes 16
Here we illustrate the tree representation for our example code. After the termination at
t=L=2, we continue for M=2 more time units with default input zero (represented
horizontally and not “up”). Note that there are 2^L=4 possible code strings of length 12,
each represented by a path in the tree starting in the root at t=0 at the left side and ending
at t=L+M=4 at the right side.
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.17-
Here, we have shown the Fano metrics for all nodes in the example code tree, in case the
received sequence y is 001,001,011,101 and the BSC has a bit error probability p=1/16.
Since the code rate R=1/3 and log2 P(yj) = −1, we get

µF(xi) = ∑j=1..ni ( log2 [ P(yj | xij) / P(yj) ] − R ) = 2ni/3 + ∑j=1..ni log2 P(yj | xij).
Further, note that log2 P(yj|xij) is log2(15/16)≈−0.1 if the received bit yj equals the
corresponding xij, and log2(1/16)=−4 otherwise. Hence, per branch of three bits we have
four possibilities for the metric contribution 2 + ∑ log2 P(yj|xij):
   0 disagreements: 2 + 3·(−0.1) ≈ +1.7
   1 disagreement:  2 + 2·(−0.1) − 4 ≈ −2.2
   2 disagreements: 2 + 1·(−0.1) − 8 ≈ −6.1
   3 disagreements: 2 − 12 = −10.0
By using this table, all node metrics in the tree can be easily calculated.
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.19-
Stack Algorithm
For the example on the previous slide, we have calculated the Fano metric for all nodes in
the tree. However, remember that in practice L is very large, and it is not feasible (even
for fast computers) to evaluate all 2^L paths. In sequential decoding algorithms, only a
(small) part of the tree is explored. Above, the stack algorithm is described, which is
illustrated on the next slide for the example code.
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.20-
Convolutional Codes 20
Here we illustrate the working of the stack algorithm for the example code. Remember
that all path segments start in the root, that a 0 means “up” and a 1 means “down”. The
Fano metric of the path segment follows between brackets. Each row below shows the stack
contents after one step, with the best path segment x0 on top.
x0 x1 x2 x3
- (0)
0 (-2.2) 1 (-6.1)
00 (-4.4) 1 (-6.1) 01 (-8.3)
1 (-6.1) 01 (-8.3) 000 (-10.5)
11 (-4.4) 01 (-8.3) 000 (-10.5) 10 (-16.1)
110 (-2.7) 01 (-8.3) 000 (-10.5) 10 (-16.1)
1100 (-1.0) 01 (-8.3) 000 (-10.5) 10 (-16.1)
Here the algorithm stops, since x0 is a complete path. The first L=2 bits of x0 form the
decoding result. Note that some parts of the tree have not been explored. Due to the small
example size the achieved complexity reduction is only very modest, but for larger L the
reduction may be enormous.
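The stack search can be sketched with a priority queue; the code below (our own helper names) uses the per-branch Fano metric derived earlier (p=1/16, R=1/3) and reproduces the decoding result 11 for the received sequence 001,001,011,101:

import heapq, math

def step(state, u):
    """One encoding step of G = (1+x+x^2, 1+x, 1+x^2)."""
    s1, s2 = state
    out = ((u + s1 + s2) % 2, (u + s1) % 2, (u + s2) % 2)
    return out, (u, s1)

def fano(out, r, p=1/16, R=1/3):
    """Fano metric contribution of one branch of n = 3 bits (log2 P(yj) = -1)."""
    return sum(math.log2(1 - p if o == y else p) + 1 - R for o, y in zip(out, r))

def stack_decode(received, L, M=2):
    heap = [(0.0, [], (0, 0))]                       # (-metric, input bits, state); best path on top
    while True:
        neg, bits, state = heapq.heappop(heap)
        if len(bits) == L + M:
            return bits[:L]                          # a complete path is on top of the stack
        t = len(bits)
        for u in ((0, 1) if t < L else (0,)):        # forced zeros after t = L
            out, ns = step(state, u)
            heapq.heappush(heap, (neg - fano(out, received[t]), bits + [u], ns))

received = [(0,0,1), (0,0,1), (0,1,1), (1,0,1)]
print(stack_decode(received, L=2))                   # [1, 1]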
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.21-
Optimal Sub-optimal
(Maximum Likelihood)
Convolutional Codes 21
Since a sequential decoding algorithm explores only a part of the tree, it may happen that
it does not deliver the maximum likelihood code string. Also, the route through the tree
may vary. In case of no or only a few errors, it will follow the original path straight from
the root. In case of many errors, it may start exploring other parts of the tree over and over
again. Hence, the required decoding time and storage capacity may vary a lot from one
sequence to the other. Still, for large values of M, the sequential decoding approach is a
good alternative for Viterbi decoding.
et4-030 Error-Correcting Codes, Part 7: Convolutional Codes -7.22-
Decoding Exercise
Convolutional Codes 22
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.1-
et4-030 Part 8
Iterative Decoding
Iterative Decoding 1
Most modern coding systems make use of iterative decoding techniques, which allow
very good performance (close to the Shannon limit) at a moderate complexity. In this
part, some basic features of iterative decoding are presented, in particular by introducing
the two most popular present-day schemes: turbo codes and low-density parity-check
(LDPC) codes.
The proposal of turbo codes [BGT93] caused a true revolution in the coding community
during the 1990’s. The success of the iterative decoding process associated with these
codes also caused a revival of LDPC codes, which were already proposed by Gallager
[Gal62] in the 1960’s.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.2-
Outline
Turbo Codes
– Basics, Examples & Demo
Iterative Decoding 2
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.3-
Example Set-Up
[Block diagram: SOURCE → ENCODING with the (7,4,3) Hamming Code → channel → DECODING → DESTINATION]
In the example we use for introducing the iterative decoding concept, 4 information bits
are encoded into 7 code bits through the binary (7,4,3) Hamming code. The 7 bits are
transmitted over a binary erasure channel, i.e., some of the transmitted bits may be erased.
The decoder will try to retrieve the erased bits and to deliver the message to the
destination.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.4-
Parity-Check Matrix:
1011100
H= 0101110
0010111
Iterative Decoding 4
The (7,4,3) Hamming code (see Part 3) is a binary block code of length 7, dimension 4,
and Hamming distance 3. A possible parity-check matrix H is given above. The code is
formed by the 2^4=16 binary vectors of length 7 which are orthogonal to all three rows of
H.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.5-
[Figure: binary erasure channel; a transmitted 0 or 1 is received correctly with probability 1−ε and erased (received as *) with probability ε.]
Iterative Decoding 5
The binary erasure channel can be considered as a special case of the binary error-erasure
channel introduced in Part 1. It is assumed that errors do not occur, i.e., a transmitted 0
will not be received as a 1 and a transmitted 1 will not be received as a 0. A transmitted
bit may be erased though, denoted at the receiver side by the symbol *. The probability
that a transmitted bit is erased is denoted by ε, where 0<ε<1.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.6-
Exhaustive Decoding
Codewords: 0000000 1101000 0110100 0011010
0001101 1000110 0100011 1010001
0010111 1001011 1100101 1110010
0111001 1011100 0101110 1111111
Iterative Decoding 6
An exhaustive decoding strategy for a code on the binary erasure channel is to compare
the received vector (consisting of symbols 0, 1, and/or *) with all codewords, and to list
those codewords which match the received vector on all non-erased positions. Since the
channel model does not allow of any errors, the originally transmitted codeword is always
included in this list. Decoding is said to be successful if and only if the list contains
exactly one codeword.
For linear block codes, successful decoding is achieved if and only if the set of erased
positions does not contain the support of a non-zero codeword, where the support of a
vector x is {i: xi≠0}. Hence, a linear code of Hamming distance d is capable of
successfully correcting any received vector containing at most d-1 erasures, since the size
of the support of any non-zero codeword is at least equal to d.
Example: Suppose we receive ****100, i.e., the first four positions are erased. There are
two codewords in the (7,4,3) Hamming code which end in 100: 0110100 and 1011100.
Hence, decoding is unsuccessful.
Example: Suppose we receive *1*1001. There is only one codeword in the (7,4,3)
Hamming code which matches this vector in all non-erased positions: 0111001. Hence,
decoding is successful, which was to be expected, since the number of erasures (i.e., 2) is
lower than the code’s Hamming distance (i.e., 3).
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.7-
Iterative Decoding 7
The examples shown above confirm that vectors with at most two erasures are
successfully decoded (see 1) and 2)). Vectors with more erasures may be successfully
decoded as well (see 3)), but may also lead to the ambiguous situation of having more
than one candidate codeword (see 4)). In case 4), the set {1,2,4} of erased positions is the
support of the non-zero codeword 1101000.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.8-
Iterative Decoding 8
A full evaluation of all possible erasure patterns leads to the following conclusions for the
(7,4,3) Hamming code :
• all erasure sets of size 2 or less are successfully decoded, as argued before;
• the 7 erasure sets of size 3 which correspond to the support of a codeword of weight 3
cannot be corrected unambiguously; the other 35-7=28 erasure sets of size 3 are
successfully decoded, since they do not contain the support of a non-zero codeword;
• all erasure sets of size 4 or more contain the support of at least one non-zero codeword
(try to prove this!), and thus cannot be corrected unambiguously.
The exhaustive decoding strategy gives the best possible result, i.e., it always finds all
possible candidate codewords, and thus provides the originally transmitted codeword as
its output whenever this is the unique codeword matching the received vector on all non-
erased positions. However, for large codes, the exhaustive decoding strategy may be too
computationally intensive. An attractive sub-optimal alternative may be an iterative
decoding strategy, which will be presented next, again for the (7,4,3) Hamming code.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.9-
A: x1+x3+x4+x5 = 0
B: x2+x4+x5+x6 = 0
C: x3+x5+x6+x7 = 0
Iterative Decoding 9
Since a binary vector x=(x1, x2, x3, x4, x5, x6, x7) is a codeword if and only if xHT=0, it
follows that any codeword satisfies the three parity-check equations A, B, and C indicated
above. Equation A is said to check on x1, x3, x4, and x5. If exactly one of these four
symbols is erased, it can be retrieved from this equation. For example, if the received
vector is **010*0, it follows from equation A that x1=1 (since x3=0, x4=1, and x5=0).
Note that this result has been achieved without checking all possible codewords. A similar
reasoning holds for parity-check equations B and C.
If an equation checks on two or more erased positions, none of the erased positions can be
retrieved immediately, since there are multiple solutions for the erased symbols to satisfy
the equation. In the example just started, applying equation B on 1*010*0 gives the
options x2=1 and x6=0 or x2=0 and x6=1. Hence, no conclusions can be drawn from
equation B.
However, at a later stage, when all but one of these erased positions have been retrieved
from other equations, the equation under consideration may become useful again to
retrieve the one remaining erasure. In our example, equation C gives x6=0, and then by
returning to equation B we obtain x2=1. The transmitted codeword 1101000 has thus been
retrieved by iteratively using equations A, B, and C.
In general, we could keep on using the parity-check equations iteratively, until none of
them checks on exactly one erased symbol. In this case, either all erased symbols have
been retrieved (successful decoding), or the decoder is stuck (unsuccessful decoding).
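This iterative procedure is easily expressed in Python. The sketch below (our own code) uses the three check equations A, B, and C of the (7,4,3) Hamming code and reproduces the two situations just described:

CHECKS = [(1, 3, 4, 5), (2, 4, 5, 6), (3, 5, 6, 7)]    # positions checked by A, B, C

def iterative_erasure_decode(r):
    """r: string over {'0', '1', '*'} of length 7; returns the completed word or None if stuck."""
    x = list(r)
    progress = True
    while progress:
        progress = False
        for check in CHECKS:
            erased = [i for i in check if x[i - 1] == '*']
            if len(erased) == 1:                        # exactly one unknown: solve for it
                s = sum(int(x[i - 1]) for i in check if x[i - 1] != '*') % 2
                x[erased[0] - 1] = str(s)
                progress = True
    return ''.join(x) if '*' not in x else None

print(iterative_erasure_decode("**010*0"))   # 1101000
print(iterative_erasure_decode("**0*000"))   # None: the decoder is stuck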
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.10-
A: x1+x3+x4+x5 = 0
B: x2+x4+x5+x6 = 0
C: x3+x5+x6+x7 = 0
[Venn diagram for the received vector 0*000*1: x1=0, x2=* (retrieved as 1), x3=0, x4=0, x5=0, x6=* (retrieved as 1), x7=1.]
Any pattern of one or two erasures can be corrected!
Iterative Decoding 10
In this and the subsequent examples on the next pages, each of the three parity-check
equations A, B, and C is denoted by a circle (Venn Diagram). For each of the three
circles, the sum of the four symbols contained in the circle should be zero in order to
satisfy the requirements of a codeword. This property will be used to retrieve the erased
symbols, as discussed before.
In the example on this slide, the received vector is 0*000*1. Applying the iterative
decoding procedure gives:
Iteration   A      B      C
    1       -      -      x6=1
    2       -      x2=1
Hence, the transmitted codeword has been retrieved, as expected (since there were only
two erasures).
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.11-
A: x1+x3+x4+x5 = 0
B: x2+x4+x5+x6 = 0
C: x3+x5+x6+x7 = 0
[Venn diagram for the received vector 0**00*1: x1=0, x2=* (retrieved as 1), x3=* (retrieved as 0), x4=0, x5=0, x6=* (retrieved as 1), x7=1.]
Some patterns of three erasures may be corrected as well!
Iterative Decoding 11
In the example on this slide, the received vector is 0**00*1. Applying the iterative
decoding procedure gives:
Iteration   A      B      C
    1       x3=0   -      x6=1
    2       -      x2=1
Hence, the three erasures have been corrected and the transmitted codeword has been
retrieved.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.12-
A: x1+x3+x4+x5 = 0
B: x2+x4+x5+x6 = 0
C: x3+x5+x6+x7 = 0
[Venn diagram for the received vector **0*000: x1=*, x2=*, x3=0, x4=*, x5=0, x6=0, x7=0.]
This pattern of three erasures is not corrected. Two possible solutions: x1=x2=x4=0 or x1=x2=x4=1.
Iterative Decoding 12
In the example on this slide, the received vector is **0*000. Trying to apply the iterative
decoding procedure does not succeed, since it is stuck from the very beginning. Equations
A and B both check on two erased symbols, while equation C checks on none. Hence,
decoding is unsuccessful. Note that also the optimal exhaustive decoder is unsuccessful in
this case, since there are two candidate codewords which match the received vector in all
non-erased positions: 0000000 and 1101000.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.13-
A: x1+x3+x4+x5 = 0
B: x2+x4+x5+x6 = 0
C: x3+x5+x6+x7 = 0
[Venn diagram for the received vector 00***00: x1=0, x2=0, x3=*, x4=*, x5=*, x6=0, x7=0.]
In the example on this slide, the received vector is 00***00. Again, trying to apply the
iterative decoding procedure does not succeed. Equation A checks on three erased
symbols, while equations B and C both check on two erased symbols. Hence, the iterative
decoding procedure is unsuccessful. However, in this case, the optimal exhaustive
decoder is successful, since there is only one codeword which matches the received vector
in all non-erased positions: 0000000.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.14-
Iterative Decoding 14
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.15-
A full evaluation of all erasure patterns leads to the table shown above. Like the optimal
exhaustive decoder, the iterative decoder corrects all patterns with at most two erasures
(see Example 1) and no patterns with four or more erasures. For the 35 patterns with
exactly three erasures the situation is as follows:
• 25 are corrected by both the exhaustive decoder and the iterative decoder (see Example
2);
• 7 are corrected by neither the exhaustive decoder nor the iterative decoder (see Example
3); these correspond to the supports of the 7 codewords of weight 3;
• 3 are corrected by the exhaustive decoder, but are not corrected by the iterative decoder
(see Example 4).
The case of the (7,4,3) Hamming code studied on the preceding pages clearly illustrates
the general notion that sub-optimal iterative decoders may form a good alternative to
optimal exhaustive decoders, by offering a performance which is almost as good at a far
lower complexity.
In the remainder of Part 8, two classes of codes with iterative decoding are introduced:
first turbo codes, and next LDPC codes. We will assume that transmission is over the
AWGN channel, thus allowing the use of soft values in the decoding process.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.16-
Iterative Decoding 16
Turbo codes were introduced in 1993 by Berrou, Glavieux, and Thitimajshima [BGT93].
The proposed turbo coding scheme consisted of known components, the new aspect was
mainly in the way these components co-operated. An essential feature is the iterative
decoding process. The sensational result was that performance close to the Shannon limit
is possible at modest complexity.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.17-
input 1 1 1 0 0 0 ...
output 11 10 10 00 01 00 ...
input 1 0 1 0 0 0 ...
output 11 01 11 00 00 00 ...
Iterative Decoding 17
In general, the convolutional codes as introduced in Part 7 are not systematic, i.e., the
string of information bits does not appear as a substring of the code string. This is mostly
not a big problem, since both the Viterbi decoding algorithm and the sequential decoding
algorithms directly deliver the information bits. However, in a turbo coding scheme, it is
required that the codes are in a systematic format. Mostly, recursive systematic
convolutional (RSC) codes are used, an example of which is shown above. Clearly, the
code is systematic, since every first output bit equals the corresponding information bit.
Hence, the substring formed by the bits in the odd positions of the output string equals the
input string. The recursive aspect is in the feedback loop at the bottom of the scheme,
where the contents of the second memory cell is added to the new input bit. The code rate
is ½, since there are still two output bits for every input bit. By puncturing techniques,
this code rate may be increased (at the expense of a poorer performance). For example, a
systematic rate 2/3 code is obtained when omitting the second output bit every other
encoding step.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.18-
[Figure: turbo encoder; the information bits from the source go directly to the output, to the encoder of code C1, and (via an interleaver) to the encoder of code C2; the information bits and the check bits of C1 and C2 are mapped 0 → +1, 1 → −1 and fed to the modulator (e.g., BPSK).]
Iterative Decoding 18
At the output of a standard turbo encoder three bit streams are merged:
1) Information bits generated by the data source;
2) Check bits generated by feeding the information bits to the systematic encoder of a
code C1; this may be either a block or convolutional code;
3) Check bits generated by feeding a permuted version of the information bits to the
systematic encoder of a code C2; often code C2 is the same as code C1, but it may
also be different (asymmetric turbo code); the permutation of the information bits is
done by an interleaver.
As usual, after encoding, the bits are modulated (e.g., BPSK) and transmitted over a noisy
channel (e.g., AWGN).
If the code rates of C1 and C2 are R1 and R2, respectively, then the code rate of the turbo
code is
(R1R2) / (R1+ R2- R1R2).
For example, if both C1 and C2 are rate ½ codes, then the turbo code rate is 1/3. Indeed,
for every bit generated by the source, three bits are delivered: the information bit
itself, one check bit from C1, and one check bit from C2. Higher code rates may be
obtained by applying puncturing techniques. For example, alternate omission of the
C1 and C2 check bits increases the turbo code rate from 1/3 to ½.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.19-
[Figure: iterative turbo decoder; a SISO decoder for C1 (MAP or SOVA) processes the received values of the information bits and the C1 check bits together with a priori information; its extrinsic output is interleaved and used as a priori input to a SISO decoder for C2 (MAP or SOVA), which also uses the interleaved information bits and the C2 check bits; the extrinsic output of the C2 decoder is de-interleaved and fed back to the C1 decoder; after the last iteration the soft outputs are de-interleaved and hard decisions are taken.]
Iterative Decoding 19
At the input of the turbo decoder we have the soft values corresponding to the information
bits, the check bits of code C1, and the check bits of code C2. First the information bit
and C1 check bit related values are fed into a decoder for C1 which is capable of handling
soft information. Such a soft-input/soft-output (SISO) decoder can be implemented using
a maximum a posteriori (MAP) rule or a soft-output Viterbi algorithm (SOVA) (see, e.g.,
[VY00]). The additional information on a certain information bit coming from the bits
surrounding it is called the extrinsic information gained in the C1 decoding process. This
extrinsic information is permuted (by the same interleaver as used in the encoder) and
then used as a priori information in the SISO C2 decoder, which also uses the soft values
related to the C2 check bits and the (interleaved) information bits coming straight from
the channel as inputs. Similarly to the C1 decoder, the C2 decoder produces extrinsic
information, which is de-interleaved and then used as extra input in a second round of C1
SISO decoding. This iterative decoding process can continue for many rounds and
explains the terminology “turbo”. In spite of the name turbo coding, the turbo aspect is
actually in the decoding part.
After a number of iterations the process is terminated. This number may be either fixed or
based on some stopping criterion (checking whether progress is still being made when
going from one iteration to the next). The soft values for the information bits resulting
from the last C2 decoding step are de-interleaved, and a threshold rule gives the binary
hard decisions.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.20-
Iterative Decoding 20
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.21-
Log-Likelihood Algebra
0 → +1 1 → -1
⊕ +1 -1
+1 +1 -1
-1 -1 +1
In order to explain the SISO decoders in some more detail, we first introduce some log-
likelihood algebra. Let the binary 0 be represented by the real number +1 and the binary 1
by the real number –1. Hence, the binary addition can be achieved by a real
multiplication.
The log-likelihood ratio (LLR) L(u) of a binary random variable u is defined as the
logarithm of the quotient of the probability that u equals 0 (or +1) and the probability
that u equals 1 (or –1). If the former probability is larger than the latter, L(u) is positive,
else L(u) is negative. Hence the sign of L(u) can be used as a hard decision, while the
absolute value |L(u)| is an indication of the reliability of the hard decision.
It can be shown that the LLR of the binary sum of two binary random variables is
approximately given by the formula at the bottom of the slide:
L(u1 ⊕ u2) ≈ sign(L(u1)) · sign(L(u2)) · min( |L(u1)|, |L(u2)| ).
More on Log-Likelihood algebra can be found in the paper:
J. Hagenauer et al., “Iterative decoding of binary block and convolutional codes”, IEEE
Transactions on Information Theory, vol. 42, pp. 429-445, March 1996.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.22-
Scheme, using the (n=3, k=2, d=2) parity-check code: each of the two upper rows contains two information bits i and their check bit c, and the bottom row contains the two column check bits c.

(1) code bits      (2) ±1 representation      (3) received LLRs
 0 0 0               +1 +1 +1                   +0.5 +1.5 +1.0
 0 1 1               +1 -1 -1                   +4.0 +1.0 -1.5
 0 1                 +1 -1                      +2.0 -2.5
Iterative Decoding 22
The iterative decoding process will now be illustrated by a simple example, in which
C1=C2 is the (3,2,2) even-weight code. A source generates four information bits at a time
(here 0001). First, the first two are encoded, i.e., a check bit is generated which is 0 if the
two bits are equal and 1 otherwise. Similarly, a check bit is generated for the last two bits.
This C1 encoding is represented horizontally in the scheme above, where i represent an
information bit and c a check bit. Next, we permute the four information bits, and encode
the first and the third bit using code C2, and the second and the fourth also using code C2.
This C2 encoding is represented vertically in the scheme above. As an overall result, we
have four information and four check bits (as indicated in scheme (1)), and thus a rate ½
code.
Next, the bits are put into their ±1 representation (see scheme (2)). During transmission
noise is added to these values, leading to the received LLR values presented in Scheme
(3). Note that taking hard decision based on the sign of the received LLR values would
lead to an error in the last information bit.
All codewords (horizontally and vertically) from the constituent (3,2,2) code are of the
form (c1,c2,c3)=(u1,u2,u1⊕u2). Hence, in order to gain information on bit c1, we can
consider the corresponding LLR L(c1), but also the LLR of c3⊕c2(=c1!). This L(c3⊕c2) is
the extrinsic information on c1. Similarly, L(c3⊕c1) is the extrinsic information on c2.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.23-
Scheme, using the (n=3, k=2, d=2) parity-check code, as on the previous slide.
[Slide shows scheme (3) received, scheme (4) extrinsic horizontal, scheme (5) extrinsic vertical, and scheme (6) output #1.]
Iterative Decoding 23
First, the extrinsic information following from C1 (“horizontal”) is given in scheme (4).
For example, for the first information bit, the approximation formula from log-likelihood
algebra gives +min{1.0,1.5}=+1.0. Next, this horizontal extrinsic information is added to
the received LLR values of the information bits (scheme (3)):
From this we calculate the extrinsic information from C2 (“vertical”), given in scheme
(5). For example, for the first information bit, the approximation formula from log-
likelihood algebra gives +min{2.0,3.0}=+2.0. After the complete first iteration, the soft
output values corresponding to an information bit are the sum of the received value
(scheme (3)), the horizontal extrinsic information (scheme (4)), and the vertical extrinsic
information (scheme (5)). The result is given in scheme (6). Note that already after this
first iteration the error in the last information bit has been corrected due to the extrinsic
information.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.24-
Scheme, using the (n=3, k=2, d=2) parity-check code, as before.
[Slide shows scheme (3) received, scheme (7) extrinsic horizontal, scheme (8) extrinsic vertical, and scheme (9) output #2.]
Iterative Decoding 24
At the start of the second iteration, the extrinsic information following from C2
(“vertical”) in the first iteration (scheme (5)) is added to the received LLR values of the
information bits (scheme (3)):
+0.5+2.0=+2.5 +1.5+0.5=+2.0 +1.0
+4.0+1.5=+5.5 +1.0-2.0=-1.0 -1.5
+2.0 -2.5
From this we calculate new extrinsic information from C1 (“horizontal”, scheme (7)). For
example, for the third information bit (the first of the second row), the approximation
formula from log-likelihood algebra gives +min{1.5,1.0}=+1.0.
Next, the new extrinsic information following from C1 (scheme (7))
is added to the received LLR values (scheme (3)):
+0.5+1.0=+1.5 +1.5+1.0=+2.5 +1.0
+4.0+1.0=+5.0 +1.0-1.5=-0.5 -1.5
+2.0 -2.5
From this we calculate new extrinsic information from C2 (“vertical”, scheme (8)).
After the complete second iteration, the soft output values corresponding to an
information bit are the sum of the received value (scheme (3)), the new horizontal
extrinsic information (scheme (7)), and the new vertical extrinsic information (scheme
(8)). The result is given in scheme (9). Note that the hard decisions are still the same as
after the first iteration (scheme(6)), but showing even more confidence (higher absolute
values). Similarly for next iterations.
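The two iterations can be reproduced with a short Python sketch (our own code), using the min-approximation of the LLR of a binary sum and the received LLR values of scheme (3):

def soft_xor(a, b):
    """Approximate LLR of the binary sum of two variables with LLRs a and b."""
    sign = (1 if a >= 0 else -1) * (1 if b >= 0 else -1)
    return sign * min(abs(a), abs(b))

info  = [[+0.5, +1.5], [+4.0, +1.0]]    # received LLRs of the 2x2 information bits
h_chk = [+1.0, -1.5]                    # received LLRs of the horizontal (row) check bits
v_chk = [+2.0, -2.5]                    # received LLRs of the vertical (column) check bits

ext_v = [[0.0, 0.0], [0.0, 0.0]]        # vertical extrinsic information, initially absent
for it in (1, 2):
    # horizontal step: extrinsic from the other bit in the same row and the row check bit
    ext_h = [[soft_xor(info[i][1 - j] + ext_v[i][1 - j], h_chk[i]) for j in (0, 1)]
             for i in (0, 1)]
    # vertical step: extrinsic from the other bit in the same column and the column check bit
    ext_v = [[soft_xor(info[1 - i][j] + ext_h[1 - i][j], v_chk[j]) for j in (0, 1)]
             for i in (0, 1)]
    out = [[info[i][j] + ext_h[i][j] + ext_v[i][j] for j in (0, 1)] for i in (0, 1)]
    print(f"output #{it}: {out}")
# output #1: [[3.5, 2.5], [4.5, -2.5]]   output #2: [[3.5, 3.0], [6.5, -3.0]]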
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.25-
Iterative Decoding 25
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.26-
Iterative Decoding 26
This demo, developed at the former KPN Research Lab in Leidschendam (now
incorporated in TNO ICT in Delft), shows the iterative turbo decoding process. Two types
of turbo coding schemes are considered.
The first one is similar to the example considered before, replacing the (3,2,2) code by a
Hamming code.
The other one, applied in the Universal Mobile Telecommunications System (UMTS),
uses a recursive systematic convolutional code with memory M=3 as constituent code.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.27-
Iterative Decoding 27
Here we show the performance of a turbo coding scheme based on the (63,57,3)
Hamming code. The code rate is (57×57)/(57×57+2×57×6) = 0.826. The performance for
a code of such high rate is very good. The vertical line at 2.3 dB indicates the Shannon
bound (virtually error-free communication) for binary codes at a rate of 0.826. Note that
the performance curve first has the typical “waterfall” shape, but tends to be more
horizontal from about 3.5 dB. This phenomenon is common to turbo codes, which show
an “error floor” at higher signal-to-noise ratios.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.28-
Iterative Decoding 28
Here we show the performance of the rate 1/3 turbo coding scheme based on the UMTS
code and an interleaver of size 5114. The vertical line at –0.5 dB indicates the Shannon
bound (virtually error-free communication) for binary codes at a rate of 1/3. Note that the
performance is even better than on the previous slide (at the expense of a lower code
rate).
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.29-
Iterative Decoding 29
Besides turbo codes, low-density parity-check (LDPC) codes form another class of codes
approaching the Shannon limit. The concept of LDPC codes was already proposed by Gallager
in the early 1960’s [Gal62]. Unfortunately, Gallager’s remarkable discovery was mostly
ignored by coding researchers for almost 20 years, until Tanner’s work in 1981, in which he
provided a new interpretation of LDPC codes from a graphical point of view. Tanner’s work
was also ignored for another 14 years, until the late 1990’s when some coding researchers
began to investigate codes on graphs with iterative decoding, mainly because of the success of
turbo codes.
Long LDPC codes with iterative decoding based on belief propagation have been shown to
achieve an error performance only a fraction of a decibel away from the Shannon limit. This
discovery makes the LDPC codes strong competitors with turbo codes for error control in
many communication and digital storage systems where high reliability is required. LDPC
codes have some advantages over turbo codes. They do not require a long interleaver.
Furthermore, the error floor (the phenomenon that the error performance curve starts to show a
horizontal behavior) tends to occur at much lower BER values.
We provide an introduction on the basic concepts of LDPC codes. For a detailed discussion
and many suggestions for further reading we refer to Chapter 17 in [LC04].
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.30-
Iterative Decoding 30
[Slide: a regular LDPC code is defined by four properties of its parity-check matrix H; the last of these is that H contains only a small number of ones — Low Density!]
Iterative Decoding 31
A block code is called a regular LDPC code if its parity-check matrix H satisfies the above
properties. The first two properties simply say that H has constant row and column weights.
The third property also implies that no two rows of H have more than one 1 in common. The
fourth property explains the name of low-density parity-check codes: the parity-check matrix H
has a small density of ones (i.e., H is sparse).
If the first two properties are not satisfied, but H is still sparse, the code is said to be an
irregular LDPC code.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.32-
r=ρ/n=λ/J
Iterative Decoding 32
For a regular LDPC code, the total number of ones in H can be calculated as the number
of rows (J) times the number of ones per row (ρ), or as the number of columns (n) times
the number of ones per column (λ). Hence the total number of ones in H is
ρJ= λn.
Since the total number of entries in H is
Jn,
the density r satisfies both
r = ρJ / Jn = ρ / n
and
r = λn / Jn = λ / J.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.33-
      v1 v2 v3 v4 v5 v6 v7
    ⎡ 1  1  0  1  0  0  0 ⎤  s1
    ⎢ 0  1  1  0  1  0  0 ⎥  s2
    ⎢ 0  0  1  1  0  1  0 ⎥  s3
H = ⎢ 0  0  0  1  1  0  1 ⎥  s4
    ⎢ 1  0  0  0  1  1  0 ⎥  s5
    ⎢ 0  1  0  0  0  1  1 ⎥  s6
    ⎣ 1  0  1  0  0  0  1 ⎦  s7
Iterative Decoding 33
In the above example, the rank of the H-matrix is 4. Hence, the code could also be represented
by the parity-check matrix consisting of just the first four rows of H, which clearly are linearly
independent. Note that this submatrix leads to an irregular LDPC code, while the given H (in
which both the row and the column weight are 3) leads to a regular LDPC code.
The density is equal to 3/7. In fact, this is not really a low-density parity-check code. In order
to have densities close to zero, which is required to achieve good performance, the codes must
be rather long. The efficient decoding methods to be discussed later can easily handle parity-
check matrices with millions of entries. For obvious reasons, we restrict ourselves to small
examples in these lecture notes in order to illustrate the basic concepts.
The columns in H are associated with the n code bits, here denoted by vi, while the rows in H
represent J check sums, here denoted by sj.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.34-
Iterative Decoding 34
The encoding of a message u (of k bits) using a parity-check matrix H (rather than a
generator matrix) can be established as shown above: writing a codeword as x=(p,u) and
splitting H accordingly as H=(A B), the condition xHT=0 gives ApT=BuT, from which p follows.
In order to compute the check sequence p (of n-k bits) properly, the matrix H should have
n-k linearly independent rows, and the square matrix A (formed by the first n-k columns of H)
should be invertible. Hence, for LDPC codes, it may be required to omit
some rows from the parity-check matrix (such that n-k independent rows remain), and to
swap some columns (such that the first n-k columns are linearly independent). At a later
stage, the reversed swapping mechanism should be applied, of course.
In the example for the (7,3,4) code, the last three rows from the original parity-check
matrix introduced on the previous slide are omitted (they will be used in the decoding
process though). No column swapping is required, since the first four columns are
independent already.
For large LDPC codes, this method may be too computationally intensive. Other encoding
techniques with a lower complexity (e.g., linear in the code length) have been proposed
(not to be discussed here).
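As an illustration of this encoding method, the sketch below (our own code; the Gaussian-elimination helper is not from these notes) computes the check bits of the (7,3,4) code from the first four rows of its parity-check matrix, with A taken as the square matrix formed by the first four columns:

import numpy as np

H4 = np.array([[1,1,0,1,0,0,0],      # the first four rows of H (checks s1, ..., s4)
               [0,1,1,0,1,0,0],
               [0,0,1,1,0,1,0],
               [0,0,0,1,1,0,1]])

def solve_gf2(A, b):
    """Solve A x = b over GF(2) by Gauss-Jordan elimination (A assumed invertible)."""
    A, b = A.copy(), b.copy()
    m = len(b)
    for col in range(m):
        pivot = col + int(np.argmax(A[col:, col]))    # a row with a 1 in this column
        A[[col, pivot]] = A[[pivot, col]]
        b[[col, pivot]] = b[[pivot, col]]
        for row in range(m):
            if row != col and A[row, col]:
                A[row] ^= A[col]                      # eliminate the 1's in this column
                b[row] ^= b[col]
    return b

u = np.array([1, 0, 1])                               # message bits (positions 5, 6, 7)
p = solve_gf2(H4[:, :4], (H4[:, 4:] @ u) % 2)         # parity bits p with A p^T = B u^T
x = np.concatenate([p, u])
print(x, (H4 @ x) % 2)                                # codeword 1100101, all four checks zero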
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.35-
[Tanner graph of the (7,3,4) code: variable nodes v1, …, v7 at the top, check nodes s1, …, s7 at the bottom; a cycle of length 6 is indicated in bold.]
A graph consists of a set of vertices (or nodes), denoted by V = {V1,V2, . . . }, and a set of edges, denoted by E=
{E1,E2, ... }, such that each edge Em is identified with an unordered pair (Vi, Vj) of vertices. Such a graph is
denoted by (V, E). The vertices Vi and Vj associated with edge Em are called the end vertices of Em. A graph is
most commonly represented by a diagram in which the vertices are represented as points and each edge as a line
joining its end vertices. With this graphical representation, the two end vertices of an edge are said to be
connected by the edge, and the edge is said to be incident with its end vertices. The number of edges that are
incident with a vertex Vi is called the degree of vertex Vi.
A graph (V, E) is called a bipartite graph if its vertex set V can be partitioned into two disjoint subsets VA and VB
such that every edge in E joins a vertex in VA with a vertex in VB, and no two vertices in either VA or VB are
connected.
An LDPC code (or any linear block code) with parity-check matrix H can be represented by a bipartite graph as
follows. The first set of nodes VA={v1,v2,…,vn} consists of n nodes that represent the n code bits. These are
called code-bit vertices or variable nodes. The second set of nodes VB={s1,s2,…,sJ} consists of J nodes that
represent the J parity-check sums (or equations) following from H. These are called check-sum vertices or check
nodes. A variable node vi is connected to a check node sj if and only if the code bit vi is contained in (or checked
by) the parity-check sum sj. This kind of graph is known as a Tanner graph, named after its inventor.
A path in a graph is defined as a finite alternating sequence of vertices and edges, beginning and ending with
vertices, such that each edge is incident with the vertices preceding and following it, and no edge appears more
than once. The number of edges in a path is called the length of the path. It is possible for a path to begin and
end in the same vertex. Such a closed path is called a cycle. The length of the shortest cycle in a graph is called
the girth of the graph. Note that the length of any cycle in a bipartite graph is even.
In order to have good decoding performance, it is important that the Tanner graph of an LDPC code (or any linear
block code) does not contain cycles of short lengths. Especially, cycles of length 4 should be avoided. In the above
graph, a cycle of length 6 is indicated in bold.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.36-
[Tanner graph of the (7,4,3) Hamming code: variable nodes x1, …, x7 with received values 0 0 * * * 0 0, connected to the check nodes A, B, and C.]
Iterative Decoding 36
Above, the Tanner graph representation of the (7,4,3) Hamming code presented earlier in
this part is given. The three parity-check equations are represented by the three lower
nodes. A binary vector x of length 7 is a codeword if and only if the binary sum of the
inputs of all three check sums is zero. On the binary erasure channel. This property can be
used to try retrieving erased symbols, as illustrated through the circle diagrams before.
However, for the received sequence 00***00 indicated above, the iterative decoder is
stuck, since each check node has at least two erasures among its inputs (x3, x4, and x5 for
equation A, x4 and x5 for equation B, and x3 and x5 for equation C).
We will use this example to show that adding more check nodes to the Tanner graph may
improve performance.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.37-
Iterative Decoding 37
Besides the three check equations A, B, and C, the codewords of the dual code also give
rise to four more check equations, indicated by D, E, F, and G. By adding equation G to
the Tanner graph, the received sequence 00***00 can be corrected via iterative decoding.
Since x5 is the only erased symbol appearing in equation G, it can be concluded that
x5=x1+x2+x7=0. The bits x4 and x3 then follow from B and C, respectively.
Note that the addition of check nodes should be done in a careful way. If equation F
would have been added rather than equation G, the decoder would still be stuck. In fact, it
can be shown that the choice of adding G is optimal, since it recovers all three correctable
erasure sets of size three which could not be solved by iterative decoding based on merely
A, B, and C. Adding D, E, or F rather than G would leave one of these three cases
unsolved.
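To make the example concrete, the following Python sketch performs this iterative erasure recovery ("peeling"): any check equation with exactly one erased input determines that bit, and the procedure is repeated until nothing changes. The check equations used below are an assumption reconstructed from the description above (A involves x1, x3, x4, x5; B involves x2, x4, x5, x6; C involves x3, x5, x6, x7; the dual-code equation G involves x1, x2, x5, x7); the slides may label the equations differently.

# Sketch: iterative ("peeling") erasure decoding on a Tanner graph.
checks = {
    "A": [0, 2, 3, 4],   # x1 + x3 + x4 + x5 = 0   (0-based bit positions; assumed)
    "B": [1, 3, 4, 5],   # x2 + x4 + x5 + x6 = 0
    "C": [2, 4, 5, 6],   # x3 + x5 + x6 + x7 = 0
    "G": [0, 1, 4, 6],   # x1 + x2 + x5 + x7 = 0   (extra equation from the dual code)
}

def peel(received, checks):
    """received: list of 0, 1 or None (erasure). Repeatedly use any check
    equation with exactly one erased input to recover that bit."""
    x = list(received)
    progress = True
    while progress and None in x:
        progress = False
        for pos in checks.values():
            erased = [i for i in pos if x[i] is None]
            if len(erased) == 1:
                i = erased[0]
                x[i] = sum(x[k] for k in pos if k != i) % 2
                progress = True
    return x

y = [0, 0, None, None, None, 0, 0]               # received sequence 0 0 * * * 0 0
print(peel(y, {k: checks[k] for k in "ABC"}))    # stuck: erasures remain
print(peel(y, checks))                           # with G added: all erasures recovered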
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.38-
Iterative Decoding 38
Several ways for constructing parity-check matrices of LDPC codes have been proposed in
coding literature. One way is to make use of certain well-known and well-studied mathematical
structures, such as finite geometries and combinatorial designs. In the above example, the
projective plane PG(2,2) is shown, which consists of seven points 1, 2, 3, 4, 5, 6, and 7, and
seven lines {1,2,4}, {2,3,5}, {3,4,6}, {4,5,7}, {1,5,6}, {2,6,7}, and {1,3,7} (note that the circle
is considered to be a line as well). Any two lines meet in exactly one point. The incidence
matrix, having an entry one if and only if the point associated with the column is on the line
associated with the row, leads to the parity-check matrix of the (7,3,4) LDPC code introduced
before.
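The following Python sketch builds this incidence matrix from the seven lines and checks, via Gaussian elimination over GF(2), that every row and every column has weight 3 and that the rank is 4, so that the code has dimension k = 7 - 4 = 3:

# Sketch: incidence matrix of PG(2,2) and its rank over GF(2).
lines = [{1, 2, 4}, {2, 3, 5}, {3, 4, 6}, {4, 5, 7}, {1, 5, 6}, {2, 6, 7}, {1, 3, 7}]
H = [[1 if p in line else 0 for p in range(1, 8)] for line in lines]

def gf2_rank(rows):
    m = [row[:] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rank, len(m)) if m[i][col] == 1), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(len(m)):
            if i != rank and m[i][col] == 1:
                m[i] = [a ^ b for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank

r = gf2_rank(H)
print("row weight:", sum(H[0]), " column weight:", sum(row[0] for row in H))
print("rank over GF(2):", r, " dimension k =", 7 - r)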
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.39-
The general form of the construction:

H = [ H1 ]
    [ H2 ]
    [ ...]
    [ Hλ ]

where H1 consists of n/ρ rows of length n, row i containing ρ consecutive ones in positions (i-1)ρ+1 through iρ and zeros elsewhere, and each of H2, ..., Hλ is a column permutation of H1.
Iterative Decoding 39
Iterative Decoding 40
The above is an example of an H matrix with n=20, ρ=4, and λ=3. The horizontal lines
separate H1 and its two permutations H2 and H3. Clearly, the tenth row of H is the sum of
the first nine rows, and the fifteenth row is the sum of rows one through five and rows
eleven through fourteen. Thus rows ten and fifteen are linearly dependent on the other
rows. It can be shown that the remaining thirteen rows are linearly independent. Hence,
the rank of H is 13, the dimension of the LDPC code is k=20-13=7, and the code rate is
7/20=0.35. Furthermore, it can be shown that the code’s distance is 6.
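The following Python sketch implements this construction. The column permutations are chosen at random here; the particular permutations used in the slide example, which lead to rank 13 and hence k=7, are not reproduced.

import random

def gallager_H(n, rho, lam, seed=1):
    """Sketch of the construction above: H1 has n/rho rows, row i having rho
    consecutive ones; the remaining lam-1 blocks are column permutations of H1."""
    assert n % rho == 0
    H1 = [[1 if i * rho <= c < (i + 1) * rho else 0 for c in range(n)]
          for i in range(n // rho)]
    rng = random.Random(seed)
    H = [row[:] for row in H1]
    for _ in range(lam - 1):
        perm = list(range(n))
        rng.shuffle(perm)
        H += [[row[perm[c]] for c in range(n)] for row in H1]
    return H

H = gallager_H(20, 4, 3)
print(len(H), "rows; row weight =", sum(H[0]), "; column weight =", sum(r[0] for r in H))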
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.41-
These steps are repeated until all parity-check sums are zero or a
preset number of iterations is reached.
Iterative Decoding 41
Various methods for decoding of LDPC codes have been proposed in coding literature. One
class of (hard-decision) decoding algorithms is known as bit-flipping. A simple version is
presented through the above algorithm.
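A minimal Python sketch of such a bit-flipping decoder is given below. It follows the behaviour of the worked example on the next slide: in each round all parity-check sums are computed, for each bit the number of failed checks containing it is counted, and all bits attaining the largest count are flipped; other variants (e.g., flipping only a single bit per round) exist. The small parity-check matrix in the demonstration is an assumption, not the code from the slides.

def bit_flip_decode(H, y, max_iter=20):
    """Hard-decision bit-flipping: repeatedly flip the bit(s) involved in the
    largest number of unsatisfied parity checks."""
    x = list(y)
    for _ in range(max_iter):
        s = [sum(h * b for h, b in zip(row, x)) % 2 for row in H]
        if not any(s):
            break                                    # all parity-check sums are zero
        # f[i] = number of failed checks that involve bit i
        f = [sum(sj for row, sj in zip(H, s) if row[i] == 1) for i in range(len(x))]
        worst = max(f)
        x = [b ^ 1 if fi == worst else b for b, fi in zip(x, f)]
    return x

H = [[1, 0, 1, 1, 1, 0, 0],     # example parity-check matrix (assumed for illustration)
     [0, 1, 0, 1, 1, 1, 0],
     [0, 0, 1, 0, 1, 1, 1]]
print(bit_flip_decode(H, [0, 0, 0, 1, 0, 0, 0]))    # single error -> all-zero codeword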
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.42-
y =1 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
f1 =2 2 1 2 2 2 2 1 1 2 3 2 0 0 2 1 0 1 1 1
y =1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
f2 =2 1 1 1 2 2 1 1 0 1 0 1 0 0 1 0 0 1 0 1
y =0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
f3 =1 1 0 0 1 3 1 1 0 1 0 1 0 0 0 0 1 1 0 0
y =0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Iterative Decoding 42
The bit-flipping decoding algorithm is illustrated through the (20,7,6) LDPC code. We
assume that the transmitted codeword x is the all-zero sequence, and that errors occur in
the first, the fifth, and the eleventh position, leading to the received sequence y. The
fifteen check sums associated with this sequence are represented by s1. Note from f1 that
the eleventh bit is involved in three failed parity-check equations, which is more than any
other bit. Hence, the eleventh bit in y is flipped. Since the new check-sum sequence s2
still contains non-zero entries, the procedure is continued, and we note from f2 that the
first, fifth, and sixth bits are involved in two failed parity-check equations. Hence, these
bits are flipped, and s3 is computed. We note from f3 that the sixth bit is involved in three
failed parity-check equations. Hence, the sixth bit is flipped again, resulting in the all-zero
sequence for y, leading to the situation that all fifteen check-sums are satisfied (see s4).
Hence, the decoding algorithm is terminated, and the all-zero sequence is given as the
output. This equals the transmitted sequence x, and thus we can conclude that the three
errors have been corrected by the bit-flipping algorithm.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.43-
Bit-Flipping Exercise
Iterative Decoding 43
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.44-
Iterative Decoding 44
Iterative algorithms based on belief propagation are extremely efficient for decoding
LDPC codes. They form a subclass of message passing algorithms. The messages passed
along the edges of the code’s Tanner graph may be probabilities or (log) likelihood ratios.
A popular version for decoding LDPC codes is also known under the name sum-product
algorithm (SPA). It is a symbol-by-symbol soft-in, soft-out algorithm, which processes
the received symbols iteratively to improve the reliability of each decoded code symbol.
The computed reliability measures of code symbols at the end of each decoding iteration
are used as input for the next iteration. The decoding iteration process continues until a
certain stopping condition is satisfied. Then, based on the computed reliability measures
of code symbols, hard decisions are made.
Here, we present a simple version based on log-likelihood calculations, which is
illustrated using the (7,3,4) LDPC code.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.45-
[Slide figure: message from variable node vi to check node sj — a real number equal to the sum of all vi-inputs (the channel value and the most recent messages from check nodes to vi, absent in the first iteration), except the one coming from sj.]
Iterative Decoding 45
The messages from variable nodes to check nodes are real numbers which are calculated
as follows. The message from vi to sj is the sum of Li, which is the log likelihood value of
the ith bit merely based on the observation from the channel, and the received values from
the check nodes connected to vi, except the value originating from the check node sj under
consideration. In the first iteration, no information from the check nodes is available, so vi
submits just Li to all its connected check nodes.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.46-
[Slide figure: message from check node sj to variable node vi — a real number whose absolute value is the minimum of the absolute values of all sj-inputs (except the one from vi) and whose sign is the product of the signs of all sj-inputs (except the one from vi).]
Iterative Decoding 46
The messages from check nodes to variable nodes are also real numbers. They are
calculated using log-likelihood algebra, as introduced already in the context of turbo
codes. The message from sj to vi is calculated as follows. Consider all received values by
sj, except the one from the variable node vi under consideration. The sign of the generated
number is the product of the signs of these values, while its absolute value is the
minimum among their absolute values. For example, if the received values by sj
(excluding the one from vi) are +0.9, -0.4, +0.3, -1.2, +1.0, and -0.8, then the output from
sj to vi is -0.3.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.47-
Iterative Decoding 47
After each iteration (which is a cycle of updates from variable nodes to check nodes and
vice versa), the n code bits ci can be estimated by calculating for each vi the sum Σi of all
its inputs, i.e., the sum of Li and the received values by vi from all its connected check
nodes. If Σi is positive the bit estimate is zero, otherwise it is one. The absolute value of
Σi is an indicator for the confidence in the estimate.
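The following Python sketch combines the three update rules just described into a complete min-sum decoder. Stopping as soon as the tentative hard decisions satisfy all parity checks is one common choice of stopping condition; the parity-check matrix and channel values in the demonstration are assumptions for illustration only.

def min_sum_decode(H, L, n_iter=10):
    """Min-sum message passing on the Tanner graph of H.
    L[i] is the channel log-likelihood value of bit i (positive favours 0)."""
    m, n = len(H), len(H[0])
    checks_of = [[j for j in range(m) if H[j][i]] for i in range(n)]
    vars_of = [[i for i in range(n) if H[j][i]] for j in range(m)]
    v2c = {(j, i): L[i] for j in range(m) for i in vars_of[j]}   # first iteration: just Li
    for _ in range(n_iter):
        # check -> variable: sign product and minimum magnitude of the other inputs
        c2v = {}
        for j in range(m):
            for i in vars_of[j]:
                others = [v2c[(j, k)] for k in vars_of[j] if k != i]
                sign = -1 if sum(1 for v in others if v < 0) % 2 else 1
                c2v[(j, i)] = sign * min(abs(v) for v in others)
        # per-bit totals (the Sigma_i values) and tentative hard decisions
        total = [L[i] + sum(c2v[(j, i)] for j in checks_of[i]) for i in range(n)]
        hard = [0 if t >= 0 else 1 for t in total]
        if all(sum(H[j][i] * hard[i] for i in range(n)) % 2 == 0 for j in range(m)):
            break
        # variable -> check: everything at v_i except the message that came from s_j
        for i in range(n):
            for j in checks_of[i]:
                v2c[(j, i)] = total[i] - c2v[(j, i)]
    return hard

H = [[1, 0, 1, 1, 1, 0, 0],      # example parity-check matrix (an assumption)
     [0, 1, 0, 1, 1, 1, 0],
     [0, 0, 1, 0, 1, 1, 1]]
L = [1.2, 0.8, -0.2, 0.9, 1.1, 0.7, 1.0]   # the third value is slightly negative
print(min_sum_decode(H, L))                 # -> [0, 0, 0, 0, 0, 0, 0]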
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.48-
[Slide figure: Tanner graph of the (7,3,4) LDPC code with variable nodes v1, ..., v7 and check nodes s1, ..., s7.]
The belief propagation principle is illustrated through the (7,3,4) LDPC code with the above
Tanner graph. BPSK modulation and transmission over the AWGN channel are assumed.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.49-
Iterative Decoding 49
Iterative Decoding 50
In the second half of the first iteration, each check node sends belief information to all its
directly connected variable nodes. For example, the value sent from s4 to v7 is computed
by considering all inputs to s4, except the one from v7 itself. Note that s4 has received +0.8
from v4 and +1.0 from v5. Hence, the value delivered from s4 to v7 is
(+1) x (+1) x min{0.8,1.0} = +0.8.
Similarly, all other entries can be calculated.
After the first iteration, each variable node vi has a value Σi which is the sum of Li and
three received values from the directly connected check nodes. For example,
Σ7 = -0.3 +0.8 -0.1 +0.3 = +0.7.
The sign of Σi indicates the tentative estimate for the code bit ci (a plus gives a zero, a
minus gives a one), while the absolute value | Σi | gives an indication for the confidence in
this estimate. Note that the four terms composing Σi, which can all be considered as
estimates for ci, come from different sources. For i=7, -0.3 comes from v7 itself, +0.8
(from s4) originates from v4 and v5, -0.1 (from s6) originates from v2 and v6, and +0.3
(from s7) originates from v1 and v3. Hence, at least initially, the inputs are independent,
thus giving bad values the opportunity to be repaired by good values. This underlines
the importance of avoiding short cycles in the Tanner graph.
In the first iteration, the number of errors has been reduced from two to one. On one hand,
the errors in the last two positions have been corrected. On the other hand, these errors
have propagated to a (small) error in the third bit (which was not in error at the start of the
decoding process!).
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.51-
Iterative Decoding 51
In the first half of the second iteration, the variable nodes send updates to the check
nodes, based on the received information generated in the second half of the first
iteration. For example, v4 sends the value +0.7 to s4, which is the sum of L4=+0.8, the
received value +0.1 from s1, and the received value -0.2 from s2. Note that this value can
also be obtained by subtracting the received value from s4, i.e., -0.3, from Σ4=+0.4. The
other values in the above table can be obtained similarly.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.52-
Iterative Decoding 52
In the second half of the second iteration the check nodes send messages to the variable
nodes based on the received values in the first half of this iteration. For example, the
value sent from s4 to v4 is computed by considering all inputs to s4, except the one from v4
itself. Note that s4 has received +0.9 from v5 and -0.1 from v7. Hence, the value delivered
from s4 to v4 is
(+1) x (-1) x min{0.9, 0.1} = -0.1.
Similarly, all other entries can be calculated.
After the second iteration, the new Σi values are the sums of the Li of the variable node
under consideration and the three received values from the directly connected check
nodes. For example,
Σ4 = +0.8 +0.4 +0.1 -0.1 = +1.2.
Note that the value of Σ4 after the first iteration was +0.4. Hence, the bit estimation is still
a zero (since Σ4>0), but the confidence has increased considerably.
After the second iteration, the estimated bit sequence is the all-zero sequence, which is a
codeword. All errors have been corrected.
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.53-
Iterative Decoding 53
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 8: Iterative Decoding -8.54-
LDPC Demo
Iterative Decoding 54
et4-030 Error-Correcting Codes, Part 9: Applications -9.1-
et4-030 Part 9
Applications
Applications 1
Contents
Applications 2
We start this part with an (historical) overview of the coding techniques used in space and
satellite communications.
Next, we briefly describe error protection in two of the most important present-day
products on the consumer market: GSM and CD.
Finally, the reader may try to earn a lot of money by applying codes for football pools.
et4-030 Error-Correcting Codes, Part 9: Applications -9.3-
Many early applications of coding were developed for deep-space and satellite
communication systems as exploited by institutes like the National Aeronautics and
Space Administration (NASA). The deep-space channel turned out to be the perfect link
to first demonstrate the power of coding, for the following reasons:
• It can be modeled almost exactly as a memoryless AWGN channel. Hence, the
theoretical (Shannon!) and simulation studies conducted for the AWGN channel carried
over almost exactly into practice.
• Remember that adding check bits to messages leads to an expansion of the required
bandwidth. Fortunately, plenty of bandwidth is available on the deep-space channel, and
thus also low-rate codes can be used.
• A deep-space mission is a very expensive undertaking, and thus the additional cost of
developing and implementing (complex) encoding and decoding solutions can be
tolerated. Furthermore, each dB gained by coding techniques results in big overall
financial savings in transmitting and receiving equipment.
et4-030 Error-Correcting Codes, Part 9: Applications -9.4-
Block Code:
• Each pixel has one of 64 gray levels: k=6
• Pixels encoded using a binary (32,6,16)
Reed-Muller code
• Low code rate 6/32 = 0.19
• Soft-decision decoding
• Using this code with BPSK modulation:
BER 10^-5 at Eb/N0 = 6.4 dB
(gain of 3.2 dB over uncoded BPSK)
Applications 4
During the Mariner and Viking missions to Mars around 1970, gray-valued pictures were
sent to earth. Each pixel (picture element) could have one of 64 possible gray values, i.e.,
each pixel could be represented by six bits (k=6). These six bits were encoded using a
binary (32,6,16) Reed-Muller code (see e.g. Section 3.6 in [RC99]). Note the low code
rate of 6/32=0.19, which was not a big problem since plenty of bandwidth was available.
The Hamming distance of 16 is very high for a code of length 32. Using BPSK
modulation and full soft-decision maximum-likelihood decoding (using unquantized
demodulator outputs), a bit error probability of 10^-5 was achieved at Eb/N0 of 6.4 dB,
which is 3.2 dB less than the 9.6 dB required for uncoded BPSK.
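For readers who wish to verify the code parameters, the Python sketch below builds a first-order Reed-Muller code of length 32 in the standard way (the all-ones row plus the five rows formed by the binary digits of the column index) and checks that it has dimension 6 and minimum distance 16; the generator matrix actually used on Mariner may differ by a coordinate permutation.

from itertools import product

n, m = 32, 5
G = [[1] * n] + [[(col >> b) & 1 for col in range(n)] for b in range(m)]

weights = []
for msg in product([0, 1], repeat=m + 1):
    cw = [sum(u * g for u, g in zip(msg, col)) % 2 for col in zip(*G)]
    if any(msg):
        weights.append(sum(cw))          # weight of a nonzero codeword
print("k =", len(G), " n =", n, " minimum distance =", min(weights))   # 6, 32, 16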
et4-030 Error-Correcting Codes, Part 9: Applications -9.5-
Convolutional Code:
• Rate 1/2
• High constraint length K=32
• Free distance dfree=21
• Sequential decoding
• Using this code with BPSK modulation:
BER 10^-5 at Eb/N0 = 2.7 dB
(gain of 6.9 dB over uncoded BPSK)
Applications 5
On the Pioneer 10 and 11 missions to Jupiter and Saturn (ca. 1972), a rate ½ convolutional
code was used, defined by two generator polynomials, each of degree 31. Hence, the memory M=31
and the constraint length K=32. Because of the very large memory, maximum-likelihood
Viterbi decoding could not be applied. Sub-optimal sequential decoding was used instead.
A bit error probability of 10^-5 was achieved at Eb/N0 of 2.7 dB, which is 6.9 dB less than the
9.6 dB required for uncoded BPSK, at the expense of a doubling in bandwidth
requirements. Note that this code improves upon the Reed-Muller code used on the Mariner
mission in both code rate and coding gain.
et4-030 Error-Correcting Codes, Part 9: Applications -9.6-
Convolutional Code:
• Rate 1/2
• Constraint length K=7
• Free distance dfree=10
• Viterbi decoding
• Using this code with BPSK modulation:
BER 10^-5 at Eb/N0 = 4.5 dB
(gain of 5.1 dB over uncoded BPSK)
Applications 6
Another rate ½ convolutional code was adopted as NASA/ESA Planetary Standard Code
by the Consultative Committee for Space Data Systems (CCSDS). It has been used
(possibly in combination with other codes) on the Voyager missions to Jupiter and Saturn
(1977), but has also been employed in numerous other applications, including satellite
communication and cellular telephony, and has become a de facto industry standard.
Its generator matrix is
G = (1 + x + x^3 + x^4 + x^6, 1 + x^3 + x^4 + x^5 + x^6),
and so its memory M=6 and its constraint length K=7. Because of the rather small
memory, maximum-likelihood Viterbi decoding can be applied. A bit error probability of
10^-5 is achieved at Eb/N0 of 4.5 dB, which is 5.1 dB less than the 9.6 dB required for
uncoded BPSK, at the expense of a doubling in bandwidth requirements. Note that the
improvement is less than for the Pioneer code presented on the previous slide, due to the
short constraint length. However, its implementation is much simpler, and it does not suffer
from the long buffering delays characteristic of sequential decoding.
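A minimal Python sketch of the corresponding shift-register encoder is given below; the input/output bit ordering and the termination with six zero bits are assumptions made for illustration.

G1 = [0, 1, 3, 4, 6]      # exponents of 1 + x + x^3 + x^4 + x^6
G2 = [0, 3, 4, 5, 6]      # exponents of 1 + x^3 + x^4 + x^5 + x^6

def conv_encode(bits):
    state = [0] * 6                       # memory M = 6, constraint length K = 7
    out = []
    for b in list(bits) + [0] * 6:        # six zero bits terminate the trellis
        window = [b] + state              # window[d] is the input d time steps ago
        out.append(sum(window[d] for d in G1) % 2)
        out.append(sum(window[d] for d in G2) % 2)
        state = [b] + state[:-1]
    return out

print(conv_encode([1, 0, 1, 1]))          # 2*(4+6) = 20 output bits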
et4-030 Error-Correcting Codes, Part 9: Applications -9.7-
The CCSDS standard concatenation scheme uses the Planetary Standard Code described on
the previous slide as inner code, and a (255,223,33) Reed-Solomon code over GF(256) as
outer code. Since the outer code is over GF(256)=GF(2^8), it works on the basis of 8-bit
symbols (‘bytes’). In order to break up possibly long bursts of errors from the inner
decoder into separate blocks for the outer decoder, thus making them easier to decode, a
symbol interleaver is inserted between the outer and inner encoder, and the corresponding
symbol de-interleaver between the inner decoder and outer decoder. The code rate of the
concatenated code equals the product of the code rates of the inner code and the outer
code, i.e., it is (1/2)×(223/255)=0.44. A bit error probability of 10^-5 is achieved at Eb/N0 of
2.5 dB, which is 7.1 dB less than the 9.6 dB required for uncoded BPSK. Note that both
the code rate and the coding gain are close to that of the “Pioneer” codes. Thus short
constraint length convolutional codes with Viterbi decoding in concatenated coding
schemes can be considered as alternatives to long constraint length codes with sequential
decoding.
The concatenated CCSDS scheme (or variations on it) was used for example in some of
the Voyager missions, but also in the 1998 DirecTV satellite system.
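The Python sketch below illustrates the idea with a simple row-by-row write, column-by-column read block interleaver and reproduces the concatenated code rate; the actual CCSDS interleaver depth and structure are not reproduced here.

def block_interleave(symbols, rows, cols):
    """Write the symbols row by row into a rows x cols array, read them out
    column by column (a simple block interleaver)."""
    assert len(symbols) == rows * cols
    return [symbols[r * cols + c] for c in range(cols) for r in range(rows)]

print(block_interleave(list(range(12)), 3, 4))
print("concatenated code rate:", (1 / 2) * (223 / 255))    # about 0.44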
et4-030 Error-Correcting Codes, Part 9: Applications -9.8-
Current/Future Missions
Turbo Code:
• Rate ¼, …, ½ (using puncturing)
• Simple short block or convolutional
constituent codes (small distance)
• Large interleaver, iterative decoding
• Performance close to capacity is possible!
Example: rate ½ turbo code, interleaver of
size 2^16, 18 iterations of decoding:
BER 10^-5 at Eb/N0 = 0.7 dB
Applications 9
The recently discovered turbo codes (see Part 8) are the major candidates to be used in
current or future missions. These have an excellent performance versus complexity trade-
off. Several rates can be achieved using puncturing techniques. Note that the example
code indicated on the slide has double the code rate of the Galileo code and still a slightly
higher coding gain.
et4-030 Error-Correcting Codes, Part 9: Applications -9.10-
[Slide figure: spectral efficiency versus Eb/N0 (in dB) for the schemes discussed (Mariner, Galileo/BVD, Pioneer, Planetary Standard Code (PSC), CCSDS, Turbo), together with uncoded BPSK and the Shannon bound.]
Applications 10
Here we give an overview of the AWGN performances of the codes considered on the
preceding slides. The curve at the left side is the Shannon capacity bound. Since we use
BPSK modulation, the spectral efficiency of each of the coding schemes equals the code
rate. The marked points indicate the required Eb/N0 in order to achieve a bit error
probability of 10^-5. Note that for a completely fair comparison, the (decoding)
complexity should also be taken into account.
Applications 11
The Global System for Mobile communications (GSM) is the most popular digital cellular
mobile communication system these days. In GSM, speech and data services use a variety
of (error control) coding techniques, as indicated above. Both block and convolutional
codes are applied. The Cyclic Redundancy Check (CRC) code is of the block type and is
used for error detection. Other codes of the block type, used for error correction, are the
BCH codes (see Part 4) and Fire codes (for burst error correction).
GSM offers speech and data services in various modes. On the next slide, we focus on the
error control techniques used in the full-rate speech mode.
et4-030 Error-Correcting Codes, Part 9: Applications -9.12-
[Slide figure: GSM full-rate speech channel coding — Class 1a (50 bits + 3 CRC bits) and Class 1b (132 bits + 4 tail bits) are encoded by a rate ½, constraint length 5 convolutional code into 378 coded bits (Class 1); the 78 Class 2 bits receive no protection.]
Applications 12
A 13 kbit/s full rate speech coder produces a block of 260 bits every 20 ms. These 260
bits are split into three sensitivity classes:
• Class 2 contains 78 bits which are rather insensitive to errors and are unprotected;
• Class 1a contains the 50 most significant bits, which are initially protected by three
CRC check bits;
• Class 1b contains the remaining 132 bits.
The 50+3+132 Class 1 bits are encoded with a rate ½ convolutional code with generator
matrix
G = (1 + x^3 + x^4, 1 + x + x^3 + x^4).
Since the memory of the code equals 4, four additional bits for trellis termination are
added. Hence, Class 1 is finally represented by 2×(50+3+132+4) = 378 coded bits.
Together with the 78 unprotected Class 2 bits, this leads to a rate of (378+78) bits / 20 ms =
22.8 kbit/s.
Together with seven other users, the channel bits are interleaved in groups of 456/8=57 bits
and spread over eight TDMA bursts, which are transmitted over the mobile channel.
After de-interleaving, a Viterbi decoder accepts 378 values and outputs 185 bits from the
terminated 16-state trellis. If the check with the CRC is positive, the 50 plus 132 decoded
bits, together with the 78 unprotected bits are delivered to the speech decoder. A negative
CRC usually triggers a concealment procedure in the speech decoding algorithm, rather
than accepting any errors in the 50 most significant bits.
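The bit budget of the full-rate speech mode can be reproduced with a few lines of Python:

class1a, crc, class1b, tail = 50, 3, 132, 4
class1_coded = 2 * (class1a + crc + class1b + tail)      # rate 1/2 convolutional code
class2 = 78
print(class1_coded, "coded Class 1 bits")                # 378
print((class1_coded + class2) / 20e-3 / 1e3, "kbit/s")   # 22.8 kbit/s per user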
et4-030 Error-Correcting Codes, Part 9: Applications -9.13-
Implementation Exercise
Applications 13
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 9: Applications -9.14-
Applications 14
Applications 15
The analog audio signal is sampled at 44.1 kHz and quantized with 2×16=32 bits/sample.
This leads to an audio bit stream of
44.1 × 32 kbit/s = 1.41 Mbit/s.
The audio bits are organized in frames, where one frame represents six samples. A group
of 8 bits is considered to be one symbol from GF(2^8)=GF(256). Hence, each frame
contains 6×32=192 audio bits, i.e., 192/8=24 audio symbols.
The 24 audio symbols are considered as information symbols for a (shortened) (28,24,5)
Reed-Solomon code over GF(256). Then, cross-interleaving (a variation on the
interleaving principle discussed in Part 3) is applied to depth 28 (over various frames),
and the resulting 28 symbols in a column of the interleaving scheme are encoded using a
(shortened) (32,28,5) Reed-Solomon code, again over GF(256). One “Control & Display”
symbol, containing information on the music stored on the disc, is added per frame.
Hence, we have 32+1=33 data symbols in the frame at this stage.
Next, the 8 bits from each symbol are mapped to a sequence of 14 bits according to the
Eight-to-Fourteen Modulation (EFM) code, which will be further explained on the next
slides. Furthermore, three merging bits are inserted after each EFM codeword. Finally, a
(fixed) synchronization pattern of 24 bits, also followed by three merging bits, is added to
the frame. Hence, the number of channel bits is 33 ×(14+3) + (24+3) = 588 per frame, i.e.,
588/6=98 channel bits per sample, and thus the resulting channel bit rate is
44.1 × 98 kbit/s = 4.32 Mbit/s.
Note that the channel bit rate is three times as high as the audio bit rate.
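The frame and bit-rate bookkeeping above can be reproduced with a few lines of Python:

sample_rate = 44.1e3                  # sampling frequency; each sample is 2 x 16 bits (stereo)
audio_bits_per_sample = 2 * 16
audio_rate = sample_rate * audio_bits_per_sample                   # 1.41 Mbit/s

audio_symbols = 6 * audio_bits_per_sample // 8                     # 24 symbols per frame
after_first_rs = 28                   # (28,24,5) Reed-Solomon code
after_second_rs = 32                  # (32,28,5) Reed-Solomon code (after cross-interleaving)
data_symbols = after_second_rs + 1    # plus one Control & Display symbol
channel_bits = data_symbols * (14 + 3) + (24 + 3)                  # EFM + merging + sync
channel_rate = sample_rate * channel_bits / 6                      # 98 channel bits per sample
print(channel_bits, "channel bits per frame;",
      round(channel_rate / 1e6, 2), "Mbit/s vs", round(audio_rate / 1e6, 2), "Mbit/s audio")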
et4-030 Error-Correcting Codes, Part 9: Applications -9.16-
Runlength-Limited Sequences
(d,k) constrained sequence:
binary sequence with at least d and at most k zeros
between two consecutive ones
Example (d=1, k=3):
0010001001000101010
[Slide figure: the corresponding waveform, with “+” denoting lands and “-” denoting pits.]
Information is stored on the disc along a spiral track of two levels: pits and lands. A one
is represented by a transition (from a pit to a land or vice versa), while a zero is
represented by a non-transition (staying at the current level). Due to various reasons, the
number of zeros between two subsequent ones (transitions) should not be too small (e.g.,
in order to avoid intersymbol interference), but also not too large (e.g., for
synchronization purposes). Therefore, (d,k) constrained sequences have been introduced,
which have also found wide-spread use in other optical or magnetic storage systems. Note
that in this context d and k do not denote the Hamming distance and dimension of a block
code, but the minimum and maximum number of zeros between two consecutive ones.
If a land is represented by a + and a pit by a –, then a binary (d,k) constrained sequence
appears as a sequence of + and – runs, where the length of each run is at least d+1 and at
most k+1. This is called a (d+1,k+1) runlength-limited sequence.
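A (d,k) constraint is easy to verify mechanically; the Python sketch below checks the example sequence from the slide above.

def satisfies_dk(bits, d, k):
    """True if every pair of consecutive ones is separated by at least d
    and at most k zeros."""
    ones = [i for i, b in enumerate(bits) if b == 1]
    gaps = [j - i - 1 for i, j in zip(ones, ones[1:])]   # zeros between consecutive ones
    return all(d <= g <= k for g in gaps)

seq = [int(c) for c in "0010001001000101010"]            # the (1,3) example above
print(satisfies_dk(seq, 1, 3))                            # True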
et4-030 Error-Correcting Codes, Part 9: Applications -9.17-
Example:
01100100 01111010
EFM EFM
01000100100010 000 10010000000010
Merging bits
Applications 17
In the design of the compact disc system, the choice has been made to use (2,10)
constrained sequences, i.e., any two subsequent ones are separated by at least two and at
most ten zeros. Each 8-bit symbol can be uniquely mapped to a (2,10) constrained
sequence of length 14, since there exist more than 256 such sequences. This mapping
has been standardized in the EFM coding table.
The cascading of two EFM sequences may violate the (2,10) constraint, as in the example
shown. Therefore, three merging bits are inserted between two subsequent EFM sequences,
which are chosen in such a way that the constraint is still satisfied. The choice of the
merging bits is not necessarily unique, though it is in the example shown.
Note that also the synchronization pattern added in the encoding process satisfies the
(2,10) constraint. Hence, the resulting overall sequence recorded on the disc is (2,10)
constrained, i.e., (3,11) runlength-limited.
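The Python sketch below illustrates the selection of merging bits for the example above by simply trying all eight possibilities; the real EFM merging rules additionally take the low-frequency (DC) content of the recorded signal into account, which is ignored here.

def ok_2_10(bits):
    """True if consecutive ones are separated by at least 2 and at most 10 zeros."""
    ones = [i for i, b in enumerate(bits) if b == 1]
    return all(2 <= j - i - 1 <= 10 for i, j in zip(ones, ones[1:]))

def valid_merging_bits(left, right):
    """All 3-bit patterns that keep the concatenation (2,10) constrained."""
    candidates = [[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)]
    return [m for m in candidates if ok_2_10(left + m + right)]

left  = [int(c) for c in "01000100100010"]
right = [int(c) for c in "10010000000010"]
print(valid_merging_bits(left, right))        # [[0, 0, 0]] — unique for this example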
et4-030 Error-Correcting Codes, Part 9: Applications -9.18-
Applications 18
The CD decoding process is not completely standardized. In fact, it may be optimized per
equipment manufacturer. The procedure mentioned above is one way of doing it, but
other options are possible. In particular, the decoding of the Reed-Solomon code leaves
room for choices.
As we know, any code of Hamming distance d is capable of correcting up to ⎣(d-1)/2⎦
errors. So the (32,28,5) code could be used to correct up to 2 errors. However, in the indicated
procedure, it only corrects up to one error. In case no codeword at distance ≤1 is found
(so it is detected that at least 2 errors occurred), all 28 information symbols are marked as
erasures. Note that some of the error correction capability has been sacrificed for an
increased error detection capability.
After de-interleaving, the next decoder (of the (28,24,5) code) may be faced with some
erasures. In general, an error-correcting code of Hamming distance d can be used for the
correction of t errors and e erasures if and only if the Hamming distance d>2t+e. Hence,
the (28,24,5) Reed-Solomon code can be used to correct two errors and no erasures, or
one error and two erasures, or no errors and four erasures.
If the previous step fails (for example if the decoder is faced with five erasures), then
interpolation methods (using preceding and/or next music samples) may be used as the
last resort.
The error correction encoding and decoding procedures as described here enable a very
high music quality, in spite of scratches or other damage on the disc.
et4-030 Error-Correcting Codes, Part 9: Applications -9.19-
Applications 19
Solution can be found on Blackboard. However, it is strongly recommended that you first
try to solve it yourself!
et4-030 Error-Correcting Codes, Part 9: Applications -9.20-
Football Pools
Football Pools: for n matches, predict whether
a) the home-team will win (coded by a 1);
b) the visiting team will win (coded by a 2);
c) the match will end in a draw (coded by a 0).
Prizes:
First prize: n correct predictions
Second prize: n-1 correct predictions
Third prize: n-2 correct predictions
Applications 20
As a last application, we consider the football pool problem. In such pools, one needs to
predict the results (only on the win/loss/draw level, not the detailed score) of a number (n)
of football matches. Since there are three possible results per match, a single match result
can be represented by a symbol from the ternary alphabet GF(3), and a complete
prediction can be represented by a ternary vector of length n. A common value in the
Dutch football pool system is n=13.
Per prediction vector one pays a certain fixed amount of money to the bookmaker. The
first prize is shared among those who predicted all n matches correctly. Consolation
prizes may be given to those who made only one or two errors. The whole game is a
mixture of football knowledge and luck.
et4-030 Error-Correcting Codes, Part 9: Applications -9.21-
Question:
How many prediction vectors (of length n=4) should be
generated in order to be absolutely sure to win a first or second
prize?
Applications 21
Here we consider the football pool example with n=4. The research question we ask
ourselves is how many prediction vectors should be generated in order to be sure to win a
first or second prize, i.e., to have at least one prediction vector which differs from the
actual results in at most one match. Note that there are in principle 3^n = 3^4 = 81 possible
outcomes. A trivial solution is to generate 3^3 = 27 prediction vectors, which contain all
possible combinations for the first three matches, and a random result for the fourth
match. However, the guarantee of winning a first or second prize can also be achieved
using fewer prediction vectors, as is shown on the next slide.
et4-030 Error-Correcting Codes, Part 9: Applications -9.22-
G = ( 1 0 1 1 )
    ( 0 1 1 2 )
The 3^2 = 9 codewords are
0000 1011 2022 0112 1120 2101 0221 1202 2210
Applications 22
Using only the nine prediction vectors shown above, winning a first or second prize is
guaranteed. Each prediction vector is a codeword from the ternary (4,2,3) code. For a
codeword c, let Sc(t) be the set of ternary vectors of length 4 which are at Hamming
distance at most t to c. For example, for c=0000=0 we have S0(1)={0000, 1000, 2000,
0100, 0200, 0010, 0020, 0001, 0002}. For this code, all Sc(1) contain nine vectors and are
disjoint (due to the fact that the Hamming distances between the codewords are at least
equal 3). Since ∑c |Sc(1)|=9×9=81, we can thus conclude that each of the possible 81
ternary vectors of length 4 is at distance ≤1 from a codeword. Hence, nine prediction
vectors are indeed sufficient in order to guarantee a first or second prize. From this
reasoning it also follows that this guarantee cannot be given by any combination using
fewer than nine prediction vectors.
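The covering property can also be verified by brute force; the Python sketch below generates the nine codewords from G and checks that every ternary vector of length 4 lies within Hamming distance one of some codeword.

from itertools import product

G = [[1, 0, 1, 1], [0, 1, 1, 2]]
codewords = [[(a * g1 + b * g2) % 3 for g1, g2 in zip(*G)]
             for a in range(3) for b in range(3)]

def hamming(u, v):
    return sum(x != y for x, y in zip(u, v))

covered = all(any(hamming(v, c) <= 1 for c in codewords)
              for v in product(range(3), repeat=4))
print(len(codewords), "codewords; every possible outcome covered:", covered)   # 9, True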
Generalization of the football problems leads to studying the covering radius of a code,
which is defined as the largest Hamming distance between a vector from the n-
dimensional vector space and its closest codeword. Note that when codes are used for
error correction of up to t errors, we try to maximize the number of codewords in an n-
dimensional vector space such that all Sc(t) are disjoint. In applications of the football
pool type, we try to minimize the number of codewords in an n-dimensional vector space
while achieving a certain covering radius t, i.e., ∪c Sc(t) covers the complete n-
dimensional vector space.