Improved Error Control Techniques for Data Transmission

Steven Robert Marple

Doctor of Philosophy
University of Lancaster
March 2000
To my parents, Rob and Gill Marple.
Contents

Acknowledgements
Declaration
Abstract
1 Introduction
2 Background
  2.2.1 Overview
  2.3.1 Overview
  2.4 Trellises
  2.4.1 Introduction
  2.4.2 Definitions
  2.4.3 Properties
  3.1 Introduction
  3.4.1 Introduction
  3.5.1 Introduction
4 Trellis Decoding
  4.1.1 Introduction
  4.2.1 Introduction
  5.3.4 Modification of the Channel Metrics with the SOVA Metric
References
Index
List of Figures

5.4 Example trellis with metric differences for traceback SOVA.
6.12 Simulation results for RS(7, 5, 3) and (2, 1, 7) over a trellis of depth 28.
6.13 Simulation results for RS(7, 5, 3) and (2, 1, 7) over a trellis of depth 35.
6.14 Simulation results for RS(7, 5, 3) and (2, 1, 7) over a trellis of depth 42.

List of Tables

6.2 Comparison of VA and SOVA decoding complexity for RS(7, 3, 5).
6.3 Comparison of VA and SOVA decoding complexity for RS(7, 5, 3).
6.8 Complexity versus subtrellises decoded for TSD of RS(7, 3, 5).
6.9 Complexity versus subtrellises decoded for TSD of RS(7, 5, 3).
6.13 Complexity for SOVA decoding of the convolutional (2, 1, 7) code.
Acknowledgements

… for his guidance, support and encouragement. Without this help I could not have …

I would also like to thank my colleagues, both past and present, in the Communi…

Finally, I would like to thank my parents and Judith for their support.
Declaration
This Thesis and the work described in it are my own, except where stated otherwise.
The work was carried out at the University of Lancaster between January 1994 and
December 1999. No part of this Thesis has been submitted for the award of a higher
degree, either at the University of Lancaster or elsewhere. Some parts of this Thesis
have appeared in publications, which are listed below and in the References.
… Relevant Thesis sections: 4.1.2, 4.1.4, 4.2.5, 4.3, 6.1.2, 6.2 and 6.4.

… July 1995. Relevant Thesis sections: 4.1.2, 4.1.4, 4.2.5, 4.3, 6.1.2, 6.2 and 6.4.

… John Wiley & Sons, New York, London, Sydney, 1997. ISBN 0 86380 221 4. Relevant Thesis sections: 4.1.2, 4.1.4, 4.2.5, 4.3, 6.1.2, 6.2 and 6.4.

… (SSK) and its application in CDMA systems. In ITEC 96, Leeds, UK, April 1996.
Symbols and Abbreviations

B branch profile
BCH Bose-Chaudhuri-Hocquenghem
BM Berlekamp-Massey [decoding]
C code
d minimum distance
e error vector
e error symbol
Eb bit energy
G generator matrix
GF Galois field
H parity-check matrix
HD hard decision
L label profile
LL log likelihood
ML maximum-likelihood
N state profile
RM Reed-Muller
RS Reed-Solomon
S syndrome vector
SD soft decision
T trellis
u dataword vector
u dataword symbol
v codeword vector
v codeword symbol
VA Viterbi algorithm
Δ discrepancy
Λ(x) error-locator polynomial
λ(x) correction polynomial for Λ(x)
Ω(x) error-evaluator polynomial
ω(x) correction polynomial for Ω(x)
Abstract

Error control coding is frequently used to minimise the errors which occur naturally in the transmission and storage of digital data. Many methods for decoding such codes already exist. The choice falls mainly into two areas: hard-decision algebraic decoding …

The work presented in this Thesis is intended to provide practical decoding algorithms … obtained by using the Viterbi algorithm over a suitable trellis. Two-stage decoding … performance may be achieved with a complexity lower than the Viterbi algorithm. … to apply SOVA to multi-level codes are given. The use of SOVA in a satellite downlink … improvement in coding gain for only a 20% increase in decoding complexity, are presented.

SOVA was also used to improve the decoding performance when applied to an RS product code. Several different decoding methods were evaluated, including cascade decoding, and a method where the rows and columns were decoded alternately.

… decoding complexity for trellis-based and algebraic decoders. With this technique the decoding complexity of all the algorithms implemented is compared. Also included …
Chapter 1

Introduction
The twentieth century has seen an explosion in the use and availability of commu-
nication systems. They are now placed in and on many devices which were not even
invented one hundred years ago. Such widespread use has placed high demands on
engineers. Mobile telephones are expected to operate for long periods and with clear
reception. Digital television and weather images from space are expected to be clear of
speckles. Music from compact discs must be free of clicks and pops which frequently
troubled the vinyl records which they are now quickly replacing. As the storage sizes of computer memories and disks increase, their access times plummet. As these technologies are reliant upon computer hardware, which is still following Moore's 'law',¹ the rapid increase in technology looks set to continue. The uniting factor in all of these diverse applications is that they use error control coding to protect valuable digital data.
¹ Moore, co-founder of Intel, suggested that the number of transistors on integrated circuits for computers doubles approximately every 18–24 months [Moore, 1965].
In its early days error control coding could only be afforded by the ‘super-rich’—
the military and organisations such as NASA. Even so, the codes used then (often
Reed-Muller codes) are considered by today’s standards as weak and simple to decode.
Today, error control coding is widespread and cheap. Probably most popular are the
Reed-Solomon codes. They are, however, a two-edged sword: they provide greater error protection but are also many orders of magnitude more difficult to decode. Although …
The twenty-first century will surely see an increase in the use of error control
coding as current technologies are miniaturised further, and new ones invented. Thus
the demand for fast and efficient decoding algorithms will only increase. This Thesis
presents new work which is aimed at both improving upon hard-decision decoding …
Chapter two introduces the background topics used in this work. Included are
the theory and important properties of linear and cyclic block codes. Attributes of
convolutional codes are discussed. The concept of concatenated codes and important
definitions regarding trellis diagrams and trellis decoding are introduced. The channel … The Berlekamp-Massey, Euclidean and high-speed step-by-step algorithms are explained, both math-
ematically and with the aid of examples. Chapter four details trellis construction tech-
niques, for both syndrome (BCJR/Wolf) and coset trellises. The Viterbi algorithm is
described and an example decoding used to illustrate the procedure. A novel low-…ror control performance with low complexity. The soft-output Viterbi algorithm is …; the decoding example used in Chapter four is extended to give a clear demonstration of how SOVA operates. As this Thesis includes both algebraic and combinatorial (trellis) decoders, a unified practical method, by which all the decoders implemented may be compared, was sought.
How this was achieved is explained. Results of all the simulated systems are given. … The decoding complexity for the Viterbi algorithm was analysed in a mathematical manner, applicable to all
linear codes (and also separable non-linear codes). Also, the analysis is expanded
for the soft-output Viterbi algorithm. Following this, decoding complexity for all the
simulated codes is given, using the same set of benchmarks. Decoding performance
is not forgotten, and Chapter six also includes decoding performance curves for all
the decoding algorithms demonstrated. Where possible the same channel model was
used.
Concluding remarks are made in Chapter seven. The unified approach to decoding complexity allows comparisons to be made regarding the complexity of the various
systems. Where appropriate, comparisons of the decoding performance are also made.
The improved performance of the weather satellite image distribution system is dis-
cussed, and commercial benefits of increased coding gain are highlighted. Finally, the
References, and for the benefit of the reader, a citation index and general index are provided.
Chapter 2

Background
In a block code the message symbols are sectioned into blocks of fixed length, k, before being passed to the encoder. The input block or dataword contains k data symbols over an alphabet of size q; the output block or codeword contains n code symbols, also over an alphabet of size q. For the case q = 2 the code is described as binary. Block codes may be divided into two categories, linear block codes and non-linear block codes.
For a useful code the datawords u must form a one-to-one mapping with the set of $q^k$ possible input sequences. For an (n, k) linear code C the codewords v must form a k-dimensional subspace of the n-dimensional codespace over the field GF(q) [Lin and Costello, 1983, p. 52], i.e., the dimension of the code is k. Since the codewords
are restricted to a k-dimensional subspace of the codespace the linear sum of any two
codewords is also restricted to the k-dimensional subspace and must therefore also be
a codeword.
A codeword is formed from a linear combination of the dataword symbols with a set of k linearly independent vectors $g_i$:

$$v = u_0 g_0 + u_1 g_1 + \cdots + u_{k-1} g_{k-1} \tag{2.1}$$

The vectors $g_i$ may be written as the rows of a matrix:

$$G = \begin{bmatrix} g_0 \\ g_1 \\ \vdots \\ g_{k-1} \end{bmatrix} = \begin{bmatrix} g_{0,0} & g_{0,1} & \cdots & g_{0,n-1} \\ g_{1,0} & g_{1,1} & \cdots & g_{1,n-1} \\ \vdots & \vdots & \ddots & \vdots \\ g_{k-1,0} & g_{k-1,1} & \cdots & g_{k-1,n-1} \end{bmatrix} \tag{2.2}$$
$$v = uG = (u_0, u_1, \ldots, u_{k-1}) \begin{bmatrix} g_0 \\ g_1 \\ \vdots \\ g_{k-1} \end{bmatrix} = u_0 g_0 + u_1 g_1 + \cdots + u_{k-1} g_{k-1} \tag{2.3}$$
From (2.3) it can be seen that the matrix G generates codewords of C given a dataword, and is known as the generator matrix. If the encoding procedure of a linear block code preserves the input sequence u within the output sequence v, the code is systematic. This useful property is especially important for array codes, and can be identified in the generator matrix: by column reordering, the generator matrix of a systematic code can be written as
$$G = [\, I_k \mid P \,] \tag{2.4}$$
where $I_k$ is the $k \times k$ identity matrix and P represents the parity checks. An important point to note is that every valid codeword is a linear combination of the rows of the generator matrix. The importance of this will become apparent when the decoding of a received codeword which contains errors is considered.
It is useful to be able to express the code in a manner which highlights the parity checks. This is done with the parity-check matrix, H, which satisfies

$$GH^T = 0 \tag{2.5}$$

so that every codeword satisfies

$$vH^T = 0 \tag{2.6}$$
Consider the case of the parity checks when the received codeword, r, is in error. The
parity check values, or syndromes, are non-zero. The syndrome vector, S, is defined
by
$$S = rH^T \tag{2.7}$$

where $S = [S_1, S_2, \ldots, S_{2t}]$.
For the case when the received codeword is correct (i.e., r = v) Equation (2.7) reduces to zero. Writing r = v + e,

$$S = rH^T = (v + e)H^T = vH^T + eH^T = eH^T \tag{2.8}$$

since $vH^T = 0$ (Equation 2.6). From Equation (2.8) it is clear that the syndrome is dependent only upon the error pattern, e, and not the transmitted codeword, v.
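This property is easily demonstrated in software. The sketch below is a GF(2) illustration of my own (the (7, 4) Hamming code and its parity-check matrix are my choices, not taken from the text): the received word r = v + e and the error pattern e alone produce the same syndrome.

```python
# A minimal sketch: S = rH^T = eH^T over GF(2) for a (7,4) Hamming code.
H = [[1, 1, 0, 1, 1, 0, 0],     # rows of a parity-check matrix H = [P^T | I]
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]

def syndrome(word):
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in H]

v = [1, 0, 1, 1, 0, 1, 0]       # a valid codeword: syndrome(v) == [0, 0, 0]
e = [0, 0, 1, 0, 0, 0, 0]       # a single-symbol error pattern
r = [(a + b) % 2 for a, b in zip(v, e)]

print(syndrome(r))              # [0, 1, 1] ...
print(syndrome(e))              # ... identical: S depends only on e
```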
For simple codes error correction can be achieved by a table lookup of the syndromes and a GF(q) subtraction of the corresponding error value. This method is not practical for useful codes as the table size is too large to store (e.g., for RS(255, 223, 33) a full syndrome table would need $q^{n-k} = 256^{32} \approx 10^{77}$ entries).
The Singleton bound [Singleton, 1964] provides an upper limit on the minimum distance, $d_{\min}$, between codewords in a linear block code:

$$d_{\min} \le n - k + 1 \tag{2.9}$$

It thus provides an important goal good codes should attain. Codes which satisfy the Singleton bound with equality are termed maximum distance separable (MDS).
Array codes were introduced by Elias [Elias, 1954]. They are constructed from lin-
ear component codes in two or more dimensions. The simplest array code is a two-
dimensional code with an $(n_1, k_1, d_1)$ vertical code, $C_1$, and an $(n_2, k_2, d_2)$ horizontal code, $C_2$. The resulting code, C, is an $(n_1 n_2,\, k_1 k_2,\, d_1 d_2)$ code (Figure 2.1). The term product code is sometimes applied to array codes; the two terms are interchangeable.
Theorem 2.1 The minimum weight of the product of two codes is the product of the minimum weights of the component codes.
Proof 2.1 [Elias, 1954]. The minimum weights of the component codes, $C_1$ and $C_2$, are $d_1$ and $d_2$. Each column containing a non-zero element must have at least $d_1$ non-zero elements, and each row containing a non-zero element must have at least $d_2$ non-zero elements. Therefore if the product code C contains any non-zero elements it must contain at least $d_1$ non-zero rows and $d_2$ non-zero columns. The minimum number of non-zero elements is therefore $d_1 d_2$.

(Figure 2.1: construction of a two-dimensional array code from an $(n_1, k_1, d_1)$ column code and an $(n_2, k_2, d_2)$ row code.)
Encoding and decoding are greatly simplified when the component codes are systematic. For a two-dimensional product code with systematic component codes the … the simpler case as the Tensor product of the component codes [Wolf, 1965]. Higher dimensions are possible. As both row and column codes are linear it is not important whether the row or column encoding is performed first; the checks-on-checks will be identical in either case [Peterson and Weldon, 1972, p. 132]. Similarly, the decoding may be performed in either order, as the sketch below illustrates.
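The order-independence is shown by the following sketch, my own example built from (3, 2, 2) single-parity-check component codes (not taken from the thesis).

```python
# A sketch: a two-dimensional array code from (3,2,2) single-parity-check codes;
# encoding rows first or columns first gives the same checks-on-checks corner.
data = [[1, 0],
        [1, 1]]

def spc(row):                       # systematic (3,2) single-parity-check encode
    return list(row) + [sum(row) % 2]

def encode_rows(block):
    return [spc(r) for r in block]

def encode_cols(block):
    return [list(col) for col in zip(*[spc(c) for c in zip(*block)])]

rows_then_cols = encode_cols(encode_rows(data))
cols_then_rows = encode_rows(encode_cols(data))
assert rows_then_cols == cols_then_rows
print(rows_then_cols)               # 3x3 codeword of the (9, 4, 4) product code
```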
With canonical ordering the codeword is written as an $n_1 \times n_2$ array:

$$v = \begin{bmatrix} v_{0,0} & v_{0,1} & \cdots & v_{0,n_2-1} \\ v_{1,0} & v_{1,1} & \cdots & v_{1,n_2-1} \\ \vdots & \vdots & \ddots & \vdots \\ v_{n_1-1,0} & v_{n_1-1,1} & \cdots & v_{n_1-1,n_2-1} \end{bmatrix} \tag{2.10}$$

or, numbering the symbols row by row with a single subscript,

$$v = \begin{bmatrix} v_0 & v_1 & \cdots & v_{n_2-1} \\ v_{n_2} & v_{n_2+1} & \cdots & v_{2n_2-1} \\ \vdots & \vdots & \ddots & \vdots \\ v_{(n_1-1)n_2} & v_{(n_1-1)n_2+1} & \cdots & v_{n_1 n_2-1} \end{bmatrix} \tag{2.11}$$
For the case when $n_1$ and $n_2$ are relatively prime there is (by the Chinese remainder theorem) a unique integer l in the range $0 \le l \le n_1 n_2 - 1$ for the pair (i, j) such that $l \equiv i \pmod{n_1}$ and $l \equiv j \pmod{n_2}$ [Burton and Weldon, 1965]. This mapping of i and j to l is known as cyclic ordering. The reasoning behind the name is clearly seen from the example below for $n_1 = 3$, $n_2 = 5$:

$$v = \begin{bmatrix} v_0 & v_6 & v_{12} & v_3 & v_9 \\ v_{10} & v_1 & v_7 & v_{13} & v_4 \\ v_5 & v_{11} & v_2 & v_8 & v_{14} \end{bmatrix} \tag{2.12}$$
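A few lines of code (a sketch of my own) reproduce the arrangement of (2.12) directly from the two congruences.

```python
# A sketch of the cyclic ordering of (2.12) for n1 = 3, n2 = 5:
# symbol v_l is placed at row i = l mod n1, column j = l mod n2.
n1, n2 = 3, 5
grid = [[None] * n2 for _ in range(n1)]
for l in range(n1 * n2):
    grid[l % n1][l % n2] = f"v{l}"
for row in grid:
    print(row)      # first row: v0 v6 v12 v3 v9, as in (2.12)
```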
If $C_1$ and $C_2$ are cyclic codes and $n_1$ and $n_2$ are relatively prime then the product $C = C_1 \otimes C_2$ is also cyclic; however, not all cyclic codes are array codes [MacWilliams and Sloane, 1978].
Many decoding algorithms for array codes exist, both algebraic and trellis-oriented.
The simplest method for decoding a canonically-ordered array code is to cascade de-
code the component codes one at a time, decoding the row code, C2 , and then the col-
umn code C1 . Particularly for memoryless channels, the cascade decoding algorithm
is inefficient. There exist error patterns which the code is capable of correcting but which the algorithm fails to correct. If the minimum distances, $d_1$ and $d_2$, of the component codes are odd then there are error patterns of weight $(d_1 + 1)(d_2 + 1)/4$ which are …
As the name suggests, GACs are a generalisation of the array codes introduced by Elias. Unlike standard array codes, GACs may have different component codes along a given dimension, with the restriction that the code length is invariant. The total … rows and columns, respectively. The total number of information symbols is given by $k = \sum_{p=1}^{n_1} k_p$, where $k_p$ is the number of information symbols in the p-th row [Honary and …].

The technique provides a simple design procedure for constructing many different block codes: BCH, Hamming, Golay, RM etc. [Honary and Markarian, 1993a,b; …] … straightforward manner.
Cyclic codes are an important subclass of linear block codes, not only because many prominent codes are cyclic (e.g., BCH, RS), but also because they are used in the construction of other error-correcting codes (e.g., Kerdock and Preparata codes). The inherent algebraic structure of cyclic codes allows the formation of many practical decoding algorithms.
A code C is cyclic if it is linear and every cyclic shift of every codeword v is also a codeword of C, where

$$v = (v_0, v_1, \ldots, v_{n-1}) \tag{2.13}$$

and $v_0, v_1, \ldots, v_{n-1}$ are the symbols in v. The notation v(x) will be used to denote a code polynomial:

$$v(x) = v_0 + v_1 x + v_2 x^2 + \cdots + v_{n-1} x^{n-1} \tag{2.14}$$
It can be seen that the maximum degree of v(x) is n − 1. Algebraically, v(x) is defined modulo

$$x^n - 1 \tag{2.15}$$

so that multiplication by x produces a cyclic shift:

$$\begin{aligned} x \cdot v(x) &= v_0 x + v_1 x^2 + v_2 x^3 + \cdots + v_{n-2} x^{n-1} + v_{n-1} x^n \\ &= v_{n-1} + v_0 x + v_1 x^2 + v_2 x^3 + \cdots + v_{n-2} x^{n-1} \end{aligned} \tag{2.16}$$
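In software the reduction modulo $x^n - 1$ is simply a rotation, as this small sketch of mine (for a binary example) shows.

```python
# A sketch of (2.16): multiplying a code polynomial by x modulo x^n - 1
# cyclically shifts the codeword one place.
def cyclic_shift(v):
    # v holds (v0, v1, ..., v_{n-1}); x*v(x) mod (x^n - 1) wraps v_{n-1} to v_0
    return [v[-1]] + v[:-1]

v = [1, 0, 1, 1, 0, 0, 0]          # v(x) = 1 + x^2 + x^3 over GF(2), n = 7
print(cyclic_shift(v))             # [0, 1, 0, 1, 1, 0, 0] = x + x^3 + x^4
```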
It can be shown [Wicker, 1994, p. 101; Wilson, 1996, pp. 443–444] that there exists a unique polynomial, g(x), known as the generator polynomial, such that every code polynomial can be expressed as

$$v(x) = u(x)\,g(x) \tag{2.17}$$
A generator matrix for a cyclic code follows directly from g(x):

$$G = \begin{bmatrix} g(x) \\ x\,g(x) \\ x^2 g(x) \\ \vdots \\ x^{k-1} g(x) \end{bmatrix} = \begin{bmatrix} g_0 & g_1 & \cdots & g_r & 0 & \cdots & 0 & 0 \\ 0 & g_0 & g_1 & \cdots & g_r & 0 & \cdots & 0 \\ \vdots & & \ddots & & & \ddots & & \vdots \\ 0 & \cdots & 0 & g_0 & g_1 & \cdots & g_r & 0 \\ 0 & 0 & \cdots & 0 & g_0 & g_1 & \cdots & g_r \end{bmatrix} \tag{2.18}$$
The encoding procedure given in (2.17) does not produce a systematic codeword. Normally, systematic codewords are preferred as they simplify the decoding procedure. The normal method by which systematic codewords are generated is [Michelson …]

$$v(x) = \left( x^{n-k}\, u(x) \bmod g(x) \right) + x^{n-k}\, u(x) \tag{2.19}$$

It can be seen that this does result in a valid codeword: the remainder when v(x) is divided by g(x) is zero. A short sketch of this encoding rule is given below.
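The sketch is my own illustration of (2.19) for a small binary cyclic code (the (7, 4) Hamming code with $g(x) = 1 + x + x^3$, my choice); the thesis' own RS worked example appears in Chapter 3.

```python
# A sketch of systematic cyclic encoding (2.19) over GF(2), g(x) = 1 + x + x^3.
g = [1, 1, 0, 1]                        # g0..g3, lowest degree first
n, k = 7, 4

def poly_mod(a, g):
    a = list(a)
    for i in range(len(a) - 1, len(g) - 2, -1):
        if a[i]:                        # cancel the leading term with a shift of g
            for j, gj in enumerate(g):
                a[i - len(g) + 1 + j] ^= gj
    return a[:len(g) - 1]

def encode(u):                          # v(x) = (x^{n-k} u(x) mod g(x)) + x^{n-k} u(x)
    shifted = [0] * (n - k) + u
    return poly_mod(shifted, g) + u

v = encode([1, 0, 1, 1])
print(v, poly_mod(v, g))                # remainder is all-zero: g(x) divides v(x)
```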
Syndrome Polynomial

For cyclic codes (e.g., BCH and RS) the calculation of the syndromes can be performed more efficiently by using the cyclic properties of the code. Though the syn…, it is formed as

$$S(x) = r(x) \bmod g(x) \tag{2.20}$$

where

$$S(x) = S_1 + S_2 x + S_3 x^2 + \cdots + S_{2t} x^{2t-1} \tag{2.21}$$

Thus the syndrome polynomial may be defined as the remainder when an erroneous codeword is divided by the generator polynomial, g(x). The proof is given in [Peterson and Weldon, 1972, p. 231]. For the case of the received codeword being equal to the transmitted codeword, v(x), the syndrome polynomial is zero since valid codewords are exactly divisible by the generator polynomial (2.17).
It can be shown that the syndrome polynomial is dependent upon the error polynomial, e(x), and not the transmitted codeword, v(x), by substituting r(x) = v(x) + e(x) into (2.20), whence $S(x) = [v(x) + e(x)] \bmod g(x) = e(x) \bmod g(x)$, since v(x) is exactly divisible by g(x).
BCH codes were discovered by Hocquenghem [Hocquenghem, 1959] in 1959 and by Bose and Ray-Chaudhuri [Bose and Ray-Chaudhuri, 1960a,b] in 1960. BCH codes are a generalisation of the cyclic Hamming codes for correcting multiple errors. Peterson [Peterson, 1960] proved that BCH codes are cyclic. Gorenstein and Zierler [Gorenstein and Zierler, 1961] generalised the BCH
codes for non-binary alphabets of size $p^m$. Their wide choice of block lengths, code rates and symbol alphabets, coupled with efficient decoding algorithms, has made … In general, establishing the minimum distance of a code requires a computer search of all the non-zero codewords to find the minimum-weight codeword and thus the minimum distance of the code. For BCH codes this procedure is not required: the BCH bound places a lower limit on the minimum distance of the code. An … 1987].
Theorem 2.2 Let C be a q-ary (n, k) cyclic code with generator polynomial g(x). Let m be the multiplicative order of q mod n. (GF($q^m$) is thus the smallest extension field of GF(q) which contains a primitive n-th root of unity.) Let $\alpha$ be a primitive n-th root of unity. Select g(x) to be a minimal-degree polynomial in GF(q)[x], where GF(q)[x] denotes the collection of all polynomials $a_0 + a_1 x + a_2 x^2 + \cdots$ of arbitrary degree with coefficients $\{a_i\}$ in the finite field GF(q) [Wicker, 1994, p. 40], such that

$$g(\alpha^b) = g(\alpha^{b+1}) = g(\alpha^{b+2}) = \cdots = g(\alpha^{b+\delta-2}) = 0 \tag{2.23}$$

for some integers $b \ge 0$ and $\delta \ge 1$. The roots of the generator polynomial g(x) are $\delta - 1$ consecutive powers of $\alpha$. The code C defined by g(x) has minimum distance $d \ge \delta$.
Proof of Theorem 2.2 can be found in [MacWilliams and Sloane, 1978; Peterson and
Weldon, 1972; Wicker, 1994]. The BCH bound can be used to produce a BCH code
with a given design distance. However, since the weight distributions of most BCH
codes are not known the actual minimum distance may be greater than the design
distance.
2.1. BLOCK CODES 22
Generator Polynomial

The generator polynomial for a BCH code has 2t roots, which are consecutive powers of $\alpha$:

$$g(x) = \prod_{j=0}^{2t-1} (x - \alpha^{b+j}) \tag{2.24}$$

For the case when b = 1 the code is termed narrow sense [MacWilliams and Sloane, 1978, p. 203].
Reed-Solomon codes were discovered by Reed and Solomon in 1960 and are a spe-
cial subclass of non-binary BCH codes [Reed and Solomon, 1960]. RS codes exhibit
additional properties to BCH codes which make them very much more powerful than
BCH codes. Their powerful error-correcting abilities have made them possibly the
most important codes. RS codes are multi-level, therefore $\log_2 q$ binary bits are commonly mapped to one RS symbol. This process provides some burst-error correction.
They have many and widespread applications which include the compact disc, satellite
Reed-Solomon codes are cyclic and so profit from the many useful characteristics
cyclic codes offer. They are normally generated in systematic form using (2.19).
Proof 2.2 Let C be an (n, k) RS code. The Singleton bound (Section 2.1.3) gives an upper bound of $d \le n - k + 1$ for all (n, k) codes. The BCH bound provides a lower bound of $n - k + 1$, since the generator polynomial (2.24) has $n - k$ consecutive roots and so $\delta = n - k + 1$. Hence

$$d = n - k + 1 \tag{2.25}$$

Theorem 2.3 and its proof are important for two reasons. Firstly it shows that RS codes can be designed such that their designed minimum distance is always the actual minimum distance (unlike BCH codes). Secondly, RS codes satisfy the Singleton bound with equality and are therefore maximum distance separable (MDS). … in a systematic representation. This follows from the MDS property; the proof is given in …
2.2.1 Overview

The name convolutional code was coined by Elias [Elias, 1955] to describe a code whose output sequence is a linear mapping of an input sequence: a discrete-time, finite-alphabet convolution of the input with the encoder's impulse response [Wilson, 1996, p. 551]. Such codes are sometimes termed trellis codes, but this name is misleading because block codes may also be represented by trellises. The concept of …
A convolutional code over GF(q) is usually described by the parameters (n, k, K), where k is the number of q-ary symbols (simultaneously) input to the encoder and n is the number of q-ary symbols (simultaneously) output from the encoder. As with a block code, the rate is given by k/n. The constraint length of the code, K, is defined as the number of consecutive symbols in the output stream affected by any input symbol.
2.3.1 Overview

Concatenation is a powerful technique for creating error-correcting codes by combining two (or more) codes sequentially. The primary reason for using concatenated codes is to achieve a low
error rate with an overall implementation complexity which is less than that which
would be required by a single decoding operation [Sklar, 1988, p. 365]. Figure 2.3
shows a concatenated coding scheme. The data stream is first encoded with the outer
code (in this case an RS block code). The output of the outer code is then re-encoded
with the inner code (here a convolutional code) before transmission. At the receiver
the decoding order must be reversed, and so the inner code is decoded first. Any errors
from the output of the inner decoder are likely to appear as bursts, hence it is usual
to include an interleaver and de-interleaver between the inner and outer codes. The
purpose of the interleaver is to rearrange the symbols so that errors do not occur in
bursts but are spread through several outer codewords to allow correct decoding.
Convolutional codes are a natural choice for the inner code. With a suitable Viterbi decoder SD information can be used for maximum performance. RS codes are
frequently used for the outer code. They are powerful and when combined with a
(Figure 2.3: a concatenated coding scheme: outer RS and inner convolutional encoders with interleaving (I) before transmission over a Gaussian channel with coherent BPSK, and de-interleaving (D) between the inner and outer decoders.)
binary convolutional code the burst-error performance of the RS code helps minimise
errors. There are, however, many possible configurations for a concatenated coding
scheme and flexibility is one of its many advantages. For example the compact disc
coding system uses a concatenated system based upon RS(32; 28; 5) and RS(28; 24; 5)
shortened RS codes.
The rate of a concatenated code is the product of the rates of its component codes. Consider a concatenated code C with an $(n_1, k_1, d_1)$ inner code $C_1$ and an $(n_2, k_2, d_2)$ outer code $C_2$. An input sequence of $k_1 k_2$ symbols is passed to the outer encoder, whose output is $k_1 n_2$ symbols. This new block is passed to the inner encoder, which results in $n_1 n_2$ output symbols. The rate is thus $\frac{k_1 k_2}{n_1 n_2}$.
… p. 279]. The proof is the same as for an array code; see Proof 2.1.
2.4 Trellises
2.4.1 Introduction

A trellis may be viewed as a directed graph [Muder, 1988], with one start point (the root) and one end point (the goal). The horizontal axis of a trellis diagram indicates the passage of time. The trellis can …

2.4.2 Definitions

Nodes of the graph represent possible encoder states, $S_{i,t}$, where i is the state number and t is the time index. The nodes are decomposed into a union of disjoint subsets, called vertices or levels. The levels are numbered $0, 1, \ldots, N_c$ ($N_c \le n + 1$) and the t-th level consists of $N_t$ nodes, $S_{1,t}, S_{2,t}, \ldots, S_{N_t,t}$.
Between states in adjacent levels, $S_{i,t}$ and $S_{j,t+1}$, there may be directed branches (also called edges), $B(S_{i,t} \to S_{j,t+1})$, which indicate possible changes in state. Branches are only permitted to connect adjacent levels. The set of branches between level t − 1 and t is called the t-th depth. In some texts the set of branches at a given depth is termed a stage [Wicker, 1994, p. 292]. To prevent confusion (e.g., with “two-stage decoding”) such terminology is avoided in this work. Associated with every branch is a label denoting the output (or code) symbol(s) given when that branch is taken, and a branch metric, $B_m$, which indicates the likelihood of a given branch being selected. For certain trellises an additional input (or data) label may exist. Its purpose is to allow the same trellis structure to be used for both encoding and decoding operations. The code label is an $L_t$-dimensional vector of q-ary symbols, $(l_1, l_2, \ldots, l_{L_t})$. The code label associated with the branch $B(S_{i,t} \to S_{j,t+1})$ is denoted by $L(S_{i,t}, S_{j,t+1})$.
Using the notation introduced above the root can be more precisely defined as $S_{1,0}$, and the goal as $S_{1,N_c}$. A path is a continuous sequence of branches, and is denoted by $P(S_{i,t} \to S_{j,t+1} \to S_{k,t+2} \to \cdots \to S_{z,t+\delta})$. The term partial path is sometimes used to denote a sequence of branches for which decoding is incomplete, thus the sequence starts from the root but does not reach the goal. For certain codes (generally …

Frequently these trellises are shown with multiple roots and goals (see Figure 4.7 for an example). Though this does not strictly match the definition of a trellis, the … between the codewords of the code C and all paths between $S_{1,0}$ and $S_{1,N_c}$ (i.e., all $P(S_{1,0} \to \cdots \to S_{1,N_c})$).

Let $N(t) = [N_0, N_1, \ldots, N_{N_c}]$ be the state profile and $B(t) = [B_1, B_2, \ldots, B_{N_c}]$ be the branch profile, where $N_i$ is the number of states at the i-th level and $B_j$ is the number of branches at the j-th depth [Forney and Trott, 1993; Honary …]. Let $L(t) = [L_1, L_2, \ldots, L_{N_c}]$ be the label size profile of the trellis, where $L_j$ is the number of symbols used for labelling the j-th depth.
Figure 2.4 shows a trellis for the (7, 4, 3) Hamming code annotated with the definitions described above (root, goal, states, branches and branch labels of the form data/code, over levels 0–4 and depths 1–4). The state profile, branch profile and code label size profile of the trellis are:

$$N(t) = [N_0, N_1, N_2, N_3, N_4] = [1, 4, 4, 4, 1] \tag{2.26}$$

$$B(t) = [B_1, B_2, B_3, B_4] = [4, 8, 8, 4] \tag{2.27}$$

$$L(t) = [L_1, L_2, L_3, L_4] = [2, 2, 2, 1] \tag{2.28}$$
2.4.3 Properties

Proper A trellis where all the branches (edges) leaving any state (vertex) have distinct labels. Unless otherwise stated all references to a trellis will imply a proper trellis.

Observable A trellis with a one-to-one mapping between all codewords of the code C and all paths between $S_{1,0}$ and $S_{1,N_c}$ (i.e., all $P(S_{1,0} \to \cdots \to S_{1,N_c})$). Since an unobservable trellis contains more than one path through the trellis for at least one codeword, this may cause difficulties for encoders and for sub-… methods.

Minimal trellis Many definitions of a minimal trellis exist due to varying interpretations of minimality. The definition used within this text will be that given by Muder [Muder, 1988]: the trellis T is a minimal trellis of the code C if for every …
State-Oriented form The trellis state number is directly correlated with the encoder
state.
A discrete memoryless channel features discrete input and output alphabets. The set … The output symbol depends only on the input symbol, not on the existence of any previous errors (hence the channel has no memory). For an input sequence $U = (u_1, u_2, \ldots, u_N)$ and output sequence $V = (v_1, v_2, \ldots, v_N)$ the channel transition probability factorises as

$$p(V \mid U) = \prod_{i=1}^{N} p(v_i \mid u_i) \tag{2.29}$$
A binary symmetric channel is a special case of the discrete memoryless channel. The input alphabet size is 2, containing the binary elements “0” and “1”. In addition, the …

The channel transition probabilities (2.30), (2.31) give the probability that a transmitted symbol is received in error or correctly:

$$p(0 \mid 1) = p(1 \mid 0) = P_s \tag{2.30}$$

$$p(0 \mid 0) = p(1 \mid 1) = 1 - P_s \tag{2.31}$$

(Figure: transition diagram of the binary symmetric channel.) The demodulator gives no indication of how well a symbol is received; it merely outputs a “0” or “1”. This type of output is termed hard decision.
An upper bound on the probability of a message (block) error, $P_M$, for a t-error-correcting code operating over a BSC can be obtained by considering the probability of more than t symbol errors occurring:

$$P_M \le \sum_{i=t+1}^{n} \binom{n}{i} P_s^{\,i} \, (1 - P_s)^{n-i} \tag{2.32}$$
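For illustration, a short evaluation of this bound follows (a sketch of my own; the code parameters are arbitrary examples, not results from the text).

```python
# A sketch: block-error bound (2.32) for a t-error-correcting code on a BSC.
from math import comb

def block_error_bound(n, t, Ps):
    return sum(comb(n, i) * Ps**i * (1 - Ps)**(n - i) for i in range(t + 1, n + 1))

print(block_error_bound(7, 1, 1e-2))    # e.g. a (7, 4, 3) Hamming code, Ps = 0.01
```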
The binary symmetric erasure channel may be viewed as a special case of the DMC,
or as an extension of the BSC. Like the BSC the input alphabet size is 2, containing
the binary elements. However the output alphabet size is increased to 3, and contains
“0”, “1” and an erasure (denoted by “?”). For times when the demodulator is not able
to clearly identify a “0” or “1” it may signal its uncertainty by outputting an erasure.
The decoder is then aware that the symbol is unreliable. The symmetric conditional probabilities are illustrated below.

(Figure: transition diagram of the binary symmetric erasure channel, showing the transition probabilities p(0|0), p(1|1), p(?|0) and p(?|1).)
In many cases channels are not discrete but feature a continuous output alphabet over the range $(-\infty, +\infty)$. An AWGN channel is an example of such a case. The output is the input with broadband Gaussian noise added. The channel contains no memory (as defined in Section 2.5.1). This type of channel is an accurate channel model of …

White Gaussian noise is a random process, with a zero mean and a Gaussian PDF with variance $\sigma^2$. The power spectral density is flat over all frequencies $(-\infty \le f \le +\infty)$. The channel corrupts the transmitted signal with noise. The probability density function, y, of the noise value, x, is Gaussian and in the frequency domain the noise is white:

$$y = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{1}{2} \cdot \frac{x^2}{\sigma^2} \right) \tag{2.36}$$
While channels with continuous output alphabets are a natural phenomenon, they are impossible to use directly with SD decoding (storing a soft value exactly would require infinite precision). AWGN channels are therefore frequently 'approximated' by a channel with a fixed number of noise values. Such a channel is termed a quantised additive white Gaussian noise channel. Like the standard AWGN channel it contains no memory. The quantised AWGN channel is discussed further in Section 5.1.2, where the relative merits (or metrics) of each quantisation level are calculated and their variation with $E_b/N_0$ …
Chapter 3

Codes
3.1 Introduction

While RS codes are constructed by a few well-defined methods (Section 2.1.8), a large … those which are algebraic and those which are combinatorial in origin. Algebraic decoders … to find the lowest weight error word. The fundamental method by which most RS algebraic decoders operate is by attempting to solve the key equation. However, not all algorithms use this approach, notable exceptions being Peterson's direct method [Peterson, 1960] …
… find the codeword which most closely matches the received word. Whilst many algebraic decoders can be adapted for error-and-erasures decoding, it is true to say that they are unable to make maximum use of soft-decision information in the way that combinatorial decoders can. Most combinatorial decoding algorithms are trellis-based (e.g., the Fano algorithm and the Viterbi algorithm). Combinatorial decoders are discussed in Chapter 4.
In 1967 Berlekamp introduced an iterative method for decoding binary BCH codes [Berlekamp, 1967]. In Peterson's method the decoding complexity is proportional to the square of the errors corrected, while for Berlekamp's algorithm the decoding complexity increases linearly with the number of errors corrected [Wicker, 1994, p. 211]. Thus Berlekamp's algorithm is much more suited for decoding long block codes where many errors may … algorithm [Massey, 1969]. The algorithm is now commonly called the Berlekamp-Massey algorithm. … showed how Euclid's algorithm, for finding the greatest common divisor of two integers or polynomials, can be used to solve the key equation and decode BCH and RS
This Chapter begins with a discussion of the key equation (Section 3.2). The syn… the error values can be calculated is given, along with its proof. Euclidean decoding is described in Section 3.3, and illustrated with an example. Section 3.4 describes the Berlekamp-Massey algorithm. The same decoding example is repeated for the case of … decoding example is used to illustrate the algorithm. All the algorithms are described using the same form of the key equation, and have been generalised for the case when $b \neq 1$.
Many common algebraic decoding algorithms for parity-check block codes start by testing if the syndromes¹ (Section 2.1.2) of the received codeword are non-zero (i.e., if the parity checks fail). If the syndrome vector or polynomial is non-zero the received …

¹ Note that these syndromes are not the same as the syndrome described in Section 2.1.6.
$$S_j = r(\alpha^{j+b-1}) = \sum_{i=0}^{n-1} r_i \,(\alpha^{j+b-1})^i = \sum_{i=0}^{n-1} (v_i + e_i)(\alpha^{j+b-1})^i = \sum_{i=0}^{n-1} e_i \,\alpha^{i(j+b-1)} \tag{3.1}$$

where $j = 1, 2, \ldots, 2t$.
Let the received codeword contain $\nu$ ($\nu \le t$) correctable errors. Let the locations of the errors be given by time indices $i_1, i_2, \ldots, i_\nu$. For each symbol in error define an error-locator, $X_l$, such that

$$X_l = \alpha^{i_l} \tag{3.2}$$

Noting that only symbols received in error contribute to the syndrome values, it is possible to rewrite (3.1) as

$$S_j = \sum_{l=1}^{\nu} e_{i_l} X_l^{\,j+b-1} \tag{3.3}$$

The error-locator polynomial, $\Lambda(x)$, is defined as a polynomial whose inverse roots are the error locators [Wicker, 1994, p. 205], i.e., the inverses of the error locations are the roots of $\Lambda(x)$:

$$\Lambda(x) = \prod_{l=1}^{\nu} (1 - X_l x) \tag{3.4}$$

From (3.4) it can be seen that $\deg \Lambda(x) = \nu$, and that for no errors ($\nu = 0$), $\Lambda(x) = 1$. For binary codes it is sufficient to find the location of an error, since the error value is always 1. However for RS and other multi-level codes it is necessary
to find the value of each error in addition to the location of each error. It is therefore … whose evaluation at an error location gives the value of the error. It is defined as follows.² The syndrome polynomial is an infinite degree polynomial; however, only the first 2t coefficients of x are known:

$$1 + S(x) = 1 + \sum_{j=1}^{\infty} S_j x^j = 1 + \sum_{l=1}^{\nu} e_{i_l} \left( \sum_{j=1}^{\infty} X_l^{\,j+b-1} x^j \right) \tag{3.5}$$

The summation $\sum_{j=1}^{\infty} X_l^{\,j+b-1} x^j$ can be simplified to a rational expression by noting it is a geometric series with sum $S_\infty = \frac{a}{1 - r}$:

$$\sum_{j=1}^{\infty} X_l^{\,j+b-1} x^j = X_l^{\,b-1} \sum_{j=1}^{\infty} (X_l x)^j = \frac{X_l^{\,b}\, x}{1 - X_l x} \tag{3.6}$$

giving

$$1 + S(x) = 1 + \sum_{l=1}^{\nu} e_{i_l} \frac{X_l^{\,b}\, x}{1 - X_l x} \tag{3.7}$$

Multiplying both sides of (3.7) by $\Lambda(x)$ (Equation 3.4) produces the definition of the error-evaluator polynomial, $\Omega(x)$:

$$\Omega(x) = \Lambda(x)\,[1 + S(x)] = \Lambda(x) + \sum_{l=1}^{\nu} \left[ e_{i_l} X_l^{\,b}\, x \prod_{\substack{i=1 \\ i \neq l}}^{\nu} (1 - X_i x) \right] \tag{3.8}$$

As the decoder is only able to calculate the first 2t coefficients of S(x), S(x) is unknown, though the decoder does know the value of $S(x) \bmod x^{2t+1}$. Thus the key

² This is similar to [Wicker, 1994, p. 221], but has been generalised for the case when $b \neq 1$.
equation is

$$\Omega(x) \equiv \Lambda(x)\,[1 + S(x)] \pmod{x^{2t+1}} \tag{3.9}$$
An alternative definition of the key equation sometimes found [Clark and Cain, 1981] is

$$\Lambda(x)\,S_{alt}(x) \equiv \Omega_{alt}(x) \pmod{x^{2t}} \tag{3.10}$$
While the error-locator polynomial is defined in the same way (3.4), it is important to note that the syndrome polynomial, $S_{alt}(x)$, in (3.10) is not the same as defined in (3.5); they are related by $S(x) = x\,S_{alt}(x)$. As the error-evaluator is also defined in a different manner, the initial conditions for Euclidean and Berlekamp-Massey decoding also differ, as does the equation for calculating the error values from $\Lambda(x)$ and $\Omega_{alt}(x)$. After the solution to the key equation has been found the error locations and values must still be found. Future references to the key equation will be to (3.9) only.
The error-locator polynomial has roots $\{X_1^{-1}, X_2^{-1}, \ldots, X_\nu^{-1}\}$, corresponding to error locations $\{i_1, i_2, \ldots, i_\nu\}$. The roots may be found by exhaustive substitution or Chien search [Chien, 1964]. The roots must be unique since an error can only occur in one position once. By definition of $\Lambda(x)$ (3.4), $\{X_1^{-1}, X_2^{-1}, \ldots, X_\nu^{-1}\} \in \mathrm{GF}(q)$. This implies $\deg \Lambda(x) = \nu$. If any of these conditions are not met the decoder should abort error-correction and declare a decoder failure to indicate that the codeword contains an uncorrectable number of errors. The reason why the decoder is able to identify a codeword containing more than t errors is that BCH and RS codes are not perfect [Wicker, 1994, p. 76] and there exist words in the codespace which are greater than t Hamming distance from the nearest codeword.
From the Forney algorithm [Forney, 1965] the error values $e_{i_l}$, $l = 1, 2, \ldots, \nu$, are given by

$$e_{i_l} = -\frac{X_l^{\,2-b}\,\Omega(X_l^{-1})}{\Lambda'(X_l^{-1})} \tag{3.11}$$

where $X_l = \alpha^{i_l}$ is the error-locator (Section 3.2.2). (Over the fields of characteristic 2 used throughout this work the sign may be dropped.)
Proof 3.1 The proof given here is similar to [Wicker, 1994, p. 222], but has been extended for the case when $b \neq 1$. The formal derivative of (3.4) is

$$\Lambda'(x) = \sum_{l=1}^{\nu} \left[ -X_l \prod_{\substack{j=1 \\ j \neq l}}^{\nu} (1 - X_j x) \right] \tag{3.12}$$

At the error location $X_l$ the error-locator polynomial is $\Lambda(x) = 0$, with the root $x = X_l^{-1}$. Therefore substitute $X_l^{-1}$ for x in (3.12); every term containing the factor $(1 - X_l X_l^{-1})$ vanishes, leaving

$$\Lambda'(X_l^{-1}) = -X_l \prod_{\substack{j=1 \\ j \neq l}}^{\nu} (1 - X_j X_l^{-1}) \tag{3.13}$$

Similarly, substitute $X_l^{-1}$ for x in the error-evaluator polynomial (3.8):

$$\Omega(X_l^{-1}) = e_{i_l} X_l^{\,b} X_l^{-1} \prod_{\substack{i=1 \\ i \neq l}}^{\nu} (1 - X_i X_l^{-1}) \tag{3.14}$$

Dividing (3.14) by (3.13), the products cancel:

$$\frac{\Omega(X_l^{-1})}{\Lambda'(X_l^{-1})} = \frac{e_{i_l} X_l^{\,b-1} \prod_{i \neq l} (1 - X_i X_l^{-1})}{-X_l \prod_{j \neq l} (1 - X_j X_l^{-1})} = -e_{i_l} X_l^{\,b-2} \tag{3.15}$$

so that

$$e_{i_l} = -\frac{X_l^{\,2-b}\,\Omega(X_l^{-1})}{\Lambda'(X_l^{-1})} \tag{3.16}$$
For narrow sense RS or BCH codes ($b = 1$) the error values are given by [Wicker, 1994, p. 222]

$$e_{i_l} = -\frac{X_l\,\Omega(X_l^{-1})}{\Lambda'(X_l^{-1})} \tag{3.17}$$
Euclid's algorithm is a recursive method for finding the greatest common divisor of two elements of a Euclidean domain. Such … domains include the integers and polynomials whose coefficients are elements of a field.

1. Let a and b represent two integers or polynomials, where $a > b$ if they are integers, or $\deg a \ge \deg b$ if they are polynomials.

2. Set $r^{(-1)} = a$ and $r^{(0)} = b$, and let $i = 1$.

3. If $r^{(i-1)} \neq 0$ then define $r^{(i)}$ by

$$r^{(i)} = r^{(i-2)} - q^{(i)}\, r^{(i-1)} \tag{3.18}$$

where $q^{(i)}$ is the quotient obtained by dividing $r^{(i-2)}$ by $r^{(i-1)}$; increment i and repeat this step. The last non-zero remainder is gcd(a, b).
Example 3.1 Calculate the greatest common divisor for a = 93 and b = 33.

The values of $r^{(i)}$ and $q^{(i)}$ for each iteration i are given in Table 3.1. The algorithm terminates when the remainder reaches zero, giving gcd(93, 33) = 3.

i    q(i)   r(i)
-1   --     93
0    --     33
1    2      27     (93 = 2 x 33 + 27)
2    1      6      (33 = 1 x 27 + 6)
3    4      3      (27 = 4 x 6 + 3)
4    2      0      (6 = 2 x 3 + 0)

Table 3.1: Solution to Example 3.1.
The extended version of Euclid's algorithm is used to find two values, s and t, in the Euclidean domain such that

$$\gcd(a, b) = sa + tb \tag{3.19}$$

Since they are linearly related the same recursive algorithm can be used to find $s^{(i)}$ and $t^{(i)}$. That is

$$s^{(i)} = s^{(i-2)} - q^{(i)}\, s^{(i-1)} \tag{3.20}$$

$$t^{(i)} = t^{(i-2)} - q^{(i)}\, t^{(i-1)} \tag{3.21}$$

The initial conditions are as follows:

$$r^{(-1)} = a = s^{(-1)} a + t^{(-1)} b \tag{3.22}$$
$$s^{(-1)} = 1 \tag{3.23}$$
$$t^{(-1)} = 0 \tag{3.24}$$
$$r^{(0)} = b = s^{(0)} a + t^{(0)} b \tag{3.25}$$
$$s^{(0)} = 0 \tag{3.26}$$
$$t^{(0)} = 1 \tag{3.27}$$

A short sketch of these recursions is given below.
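The sketch is my own illustration for integers; the polynomial form used later replaces integer division by polynomial division but follows the same recursions.

```python
# A sketch: extended Euclid, maintaining r = s*a + t*b throughout.
def extended_euclid(a, b):
    r0, r1 = a, b                 # r^(-1), r^(0)   (3.22), (3.25)
    s0, s1 = 1, 0                 # s^(-1), s^(0)   (3.23), (3.26)
    t0, t1 = 0, 1                 # t^(-1), t^(0)   (3.24), (3.27)
    while r1 != 0:
        q = r0 // r1              # quotient q^(i)
        r0, r1 = r1, r0 - q * r1  # (3.18)
        s0, s1 = s1, s0 - q * s1  # (3.20)
        t0, t1 = t1, t0 - q * t1  # (3.21)
    return r0, s0, t0             # gcd(a, b) = s*a + t*b   (3.19)

g, s, t = extended_euclid(93, 33)
print(g, s, t)                    # 3, 5, -14: gcd as in Example 3.1, 3 = 5*93 - 14*33
```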
The applicability of Euclid's algorithm for solving the key equation can be shown by rewriting (3.9),

$$\Omega(x) \equiv \Lambda(x)\,[1 + S(x)] \pmod{x^{2t+1}} \tag{3.28}$$

as

$$\Theta(x)\, x^{2t+1} + \Lambda(x)\,[1 + S(x)] = \Omega(x) \tag{3.29}$$

for some polynomial $\Theta(x)$, which is not needed by the decoder. Comparing (3.29) with (3.19) gives the correspondences

$$a \leftrightarrow x^{2t+1} \tag{3.30}$$
$$b \leftrightarrow [1 + S(x)] \tag{3.31}$$
$$s(x) \leftrightarrow \Theta(x) \tag{3.32}$$
$$t(x) \leftrightarrow \Lambda(x) \tag{3.33}$$
$$\gcd(a, b) \leftrightarrow \Omega(x) \tag{3.34}$$

Therefore the extended form of Euclid's algorithm can be used to solve the key equation:

1. Set the initial conditions

$$r^{(-1)}(x) = x^{2t+1} \tag{3.35}$$
$$r^{(0)}(x) = 1 + S(x) \tag{3.36}$$
$$t^{(-1)}(x) = 0 \tag{3.37}$$
$$t^{(0)}(x) = 1 \tag{3.38}$$

2. Let $i = 1$.
3. If $r^{(i-1)}(x) \neq 0$ then define

$$r^{(i)}(x) = r^{(i-2)}(x) - q^{(i)}(x)\, r^{(i-1)}(x) \tag{3.39}$$

4. Define

$$t^{(i)}(x) = t^{(i-2)}(x) - q^{(i)}(x)\, t^{(i-1)}(x) \tag{3.40}$$

5. If $\deg r^{(i)}(x) > t$ then increment i and go to step 3.

6. $\Lambda(x) = t^{(i)}(x)$ and $\Omega(x) = r^{(i)}(x)$. Find the roots of $\Lambda(x)$ and determine the error locations.

7. Find the error magnitudes using the Forney algorithm (3.11) and correct the errors.
Example 3.3 The code RS(15, 9, 7) is a triple error-correcting code over GF(16). Let the primitive polynomial for GF(16) be $1 + x + x^4$. Table 3.3 gives the elements of the field, noting that

$$\alpha^{15} = 1 \tag{3.41}$$

Let the sense of the code be b = 3. The generator polynomial, g(x), of the code is, from (2.24),

$$g(x) = (x - \alpha^3)(x - \alpha^4)(x - \alpha^5)(x - \alpha^6)(x - \alpha^7)(x - \alpha^8) \tag{3.42}$$
Element   Polynomial representation (x^3 x^2 x 1)
0          0 0 0 0
1          0 0 0 1
a          0 0 1 0
a^2        0 1 0 0
a^3        1 0 0 0
a^4        0 0 1 1
a^5        0 1 1 0
a^6        1 1 0 0
a^7        1 0 1 1
a^8        0 1 0 1
a^9        1 0 1 0
a^10       0 1 1 1
a^11       1 1 1 0
a^12       1 1 1 1
a^13       1 1 0 1
a^14       1 0 0 1

Table 3.3: Galois field elements for GF(16).
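This field arithmetic is mechanical to reproduce in software. The sketch below is my own illustration (not the thesis' implementation): it builds exp/log tables for GF(16) from the primitive polynomial $1 + x + x^4$, reprinting Table 3.3, and expands (3.42).

```python
# A sketch: GF(16) tables from p(x) = 1 + x + x^4, then the generator
# polynomial g(x) = prod (x - a^j), j = 3..8, for RS(15, 9, 7) with b = 3.
EXP = [0] * 30
e = 1
for i in range(15):
    EXP[i] = EXP[i + 15] = e            # a^i as a 4-bit polynomial
    e <<= 1
    if e & 0x10:
        e ^= 0x13                       # reduce: x^4 = x + 1
LOG = {EXP[i]: i for i in range(15)}

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

for i in range(15):                     # polynomial representations, as Table 3.3
    print(f"a^{i:2d} = {EXP[i]:04b}")

g = [1]                                 # coefficients, lowest degree first
for j in range(3, 9):                   # roots a^3 ... a^8
    root, ng = EXP[j], [0] * (len(g) + 1)
    for i, c in enumerate(g):
        ng[i + 1] ^= c                  # the x * g(x) part
        ng[i] ^= mul(root, c)           # the (-a^j) * g(x) part (- = + here)
    g = ng
print(["0" if c == 0 else f"a^{LOG[c]}" for c in g])   # g_0 ... g_6
```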
Let the dataword be

$$u(x) = \alpha^3 x^6 \tag{3.43}$$

Encoding with (2.19) gives the transmitted codeword

$$v(x) = \alpha^7 + \alpha^4 x + \alpha^{12} x^2 + \alpha^4 x^3 + \alpha^{11} x^4 + \alpha^9 x^5 + \alpha^3 x^{12} \tag{3.44}$$
After transmission the received codeword, r(x), is corrupted with errors, e(x), where

$$r(x) = v(x) + e(x) \tag{3.45}$$

Let

$$e(x) = \alpha^{13} + \alpha^{11} x + \alpha^3 x^{12} \tag{3.46}$$

therefore

$$r(x) = \alpha^5 + \alpha^{13} x + \alpha^{12} x^2 + \alpha^4 x^3 + \alpha^{11} x^4 + \alpha^9 x^5 \tag{3.47}$$

(the error in the $x^{12}$ position cancels the transmitted symbol: $\alpha^3 + \alpha^3 = 0$).
The syndromes are calculated from (3.1):

$$S_j = r(\alpha^{j+2}) = \sum_{i=0}^{5} r_i\,\alpha^{(j+2)i}, \qquad j = 1, 2, \ldots, 6 \tag{3.48}$$

and are shown in Table 3.4. From (2.21) and Table 3.4,

$$[1 + S(x)] = 1 + \alpha^{11} x + \alpha^{10} x^3 + \alpha^3 x^4 + \alpha^9 x^5 + \alpha^2 x^6 \tag{3.49}$$
S_j   Value
S_1   a^11
S_2   0
S_3   a^10
S_4   a^3
S_5   a^9
S_6   a^2

Table 3.4: Syndrome values.
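As a check, the following sketch (my own code, with the same GF(16) helper tables as earlier) evaluates (3.48) by Horner's rule and reproduces the entries of Table 3.4.

```python
# A sketch: syndromes S_j = r(a^{j+b-1}) with b = 3 for the r(x) of (3.47).
EXP = [0] * 30
e = 1
for i in range(15):
    EXP[i] = EXP[i + 15] = e
    e <<= 1
    if e & 0x10:
        e ^= 0x13
LOG = {EXP[i]: i for i in range(15)}

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def poly_eval(p, x):                    # Horner's rule, coefficients low-first
    acc = 0
    for c in reversed(p):
        acc = mul(acc, x) ^ c
    return acc

r = [EXP[5], EXP[13], EXP[12], EXP[4], EXP[11], EXP[9]]   # (3.47)
for j in range(1, 7):
    s = poly_eval(r, EXP[j + 2])        # a^{j+b-1}, b = 3
    print(f"S_{j} =", "0" if s == 0 else f"a^{LOG[s]}")   # matches Table 3.4
```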
The initial conditions for the Euclidean algorithm are $r^{(-1)}(x) = x^7$, $r^{(0)}(x) = [1 + S(x)] = 1 + \alpha^{11} x + \alpha^{10} x^3 + \alpha^3 x^4 + \alpha^9 x^5 + \alpha^2 x^6$, $t^{(-1)}(x) = 0$ and $t^{(0)}(x) = 1$. The values of $r^{(i)}(x)$, $t^{(i)}(x)$ and the quotient $q^{(i)}(x)$ are given in Table 3.5 for each iteration, i, of the algorithm.
The solution to the key equation is found after 3 iterations; the error-locator and error-evaluator are

$$\Lambda(x) = \alpha^5 + \alpha^{11} x + \alpha^3 x^3 \tag{3.50}$$

$$\Omega(x) = \alpha^5 + \alpha^6 x + \alpha^7 x^2 + \alpha^{14} x^3 \tag{3.51}$$
The roots of the error-locator polynomial can be found by either exhaustive substitution or by a Chien search [Chien, 1964]. The roots are 1, $\alpha^3$ and $\alpha^{14}$; their inverses give the error locators $X_l = 1$, $\alpha^{12}$ and $\alpha$, i.e., errors at positions $x^0$, $x^{12}$ and $x^1$.

All that remains is to calculate the value of the errors at the error locations and subtract the error polynomial from the received codeword. As the code is not narrow sense the error values must be found from (3.11). The formal derivative of $\Lambda(x)$ is
$$\Lambda'(x) = \left( \alpha^5 + \alpha^{11} x + \alpha^3 x^3 \right)' = \alpha^{11} + 3\,\alpha^3 x^2 = \alpha^{11} + \alpha^3 x^2 \tag{3.52}$$

(in GF($2^4$) the integer multiplier 3 reduces to 1).
i    t^(i)(x)                    r^(i)(x)                                  q^(i)(x)
-1   0                           x^7                                       --
0    1                           1 + S(x)                                  --
1    a^5 + a^13 x                a^5 + a^12 x + a^9 x^2 + x^3 + a^7 x^5    a^5 + a^13 x
2    a^9 + a^8 x^2               a^9 + a^5 x + a^8 x^2 + a^12 x^4          a^2 + a^10 x
3    a^5 + a^11 x + a^3 x^3      a^5 + a^6 x + a^7 x^2 + a^14 x^3          a^10 x

Table 3.5: Solution of the key equation using Euclid's algorithm.
Substituting into (3.11), the error values are

$$e_{i_l} = \frac{X_l^{-1}\,\Omega(X_l^{-1})}{\Lambda'(X_l^{-1})} = \frac{X_l^{-1}\left[ \alpha^5 + \alpha^6 X_l^{-1} + \alpha^7 X_l^{-2} + \alpha^{14} X_l^{-3} \right]}{\alpha^{11} + \alpha^3 X_l^{-2}} = \begin{cases} \alpha^{13} & \text{for } X_l = 1, \\ \alpha^{11} & \text{for } X_l = \alpha, \\ \alpha^{3} & \text{for } X_l = \alpha^{12} \end{cases} \tag{3.53}$$

(with b = 3, $X_l^{2-b} = X_l^{-1}$). Hence

$$e(x) = \alpha^{13} + \alpha^{11} x + \alpha^3 x^{12} \tag{3.54}$$

which agrees with (3.46), and the codeword is recovered as $v(x) = r(x) - e(x)$.
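The root search and error evaluation are equally compact in code. The sketch below is my own illustration (GF(16) helpers as in the earlier sketches): it finds the roots of (3.50) by exhaustive substitution and applies (3.11) with b = 3, recovering (3.54); over GF($2^4$) the sign in (3.11) can be ignored.

```python
# A sketch: exhaustive (Chien-style) root search plus Forney evaluation, b = 3.
EXP = [0] * 30
e = 1
for i in range(15):
    EXP[i] = EXP[i + 15] = e
    e <<= 1
    if e & 0x10:
        e ^= 0x13
LOG = {EXP[i]: i for i in range(15)}

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def inv(a):
    return EXP[(15 - LOG[a]) % 15]

def ev(p, x):
    acc = 0
    for c in reversed(p):
        acc = mul(acc, x) ^ c
    return acc

Lam  = [EXP[5], EXP[11], 0, EXP[3]]          # (3.50)
Omg  = [EXP[5], EXP[6], EXP[7], EXP[14]]     # (3.51)
dLam = [EXP[11], 0, EXP[3]]                  # (3.52)

for i in range(15):                          # try every candidate locator X = a^i
    X = EXP[i]
    if ev(Lam, inv(X)) == 0:                 # X^-1 is a root of Lambda(x)
        val = mul(mul(inv(X), ev(Omg, inv(X))), inv(ev(dLam, inv(X))))
        print(f"error at x^{i}: value a^{LOG[val]}")   # x^0: a^13, x^1: a^11, x^12: a^3
```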
3.4.1 Introduction

Massey [Massey, 1969] recognised that the problem of finding the minimum-degree solution to the key equation is the same as finding the minimum-length feedback shift-register (FSR) which generates the first 2t terms of the syndrome polynomial, S(x). The Berlekamp-Massey algorithm may be used for decoding binary and multi-level codes.

If the test is successful (i.e., there is a discrepancy of 0) continue until the test …

If the test fails use the discrepancy to modify the connections to the FSR so that … (iii) the length of the FSR is increased by the smallest possible amount (to find the …
The connections to the FSR produce the error-locator polynomial, $\Lambda(x)$. Since $\Omega(x)$ can be found from $\Lambda(x)\,[1 + S(x)]$ it follows that $\Omega^{(i)}(x)$ obeys the same recursive relationship as $\Lambda^{(i)}(x)$. Therefore a second FSR can be simultaneously constructed to find $\Omega(x)$. Both $\Lambda(x)$ and $\Omega(x)$ share the same discrepancy but need their own correction polynomials ($\lambda(x)$ and $\omega(x)$ respectively).
The steps below show the procedure for Berlekamp-Massey decoding [Berlekamp, 1967; Massey, 1969] of an arbitrary BCH or RS code. This follows from [Wicker, 1994, p. 219], but with adaptations to work for any value of b, and to generate both $\Lambda(x)$ and $\Omega(x)$.

1. Set the initial conditions:

$$i = 0 \tag{3.56}$$
$$L = 0 \tag{3.57}$$
$$\Lambda^{(0)}(x) = 1 \tag{3.58}$$
$$\lambda(x) = x \tag{3.59}$$
$$\Omega^{(0)}(x) = 1 \tag{3.60}$$
$$\omega(x) = 0 \tag{3.61}$$

2. Calculate the syndromes $S_1, \ldots, S_{2t}$ from (3.1).

3. Let $i = i + 1$. Calculate the discrepancy $\Delta^{(i)}$ by subtracting the i-th output of the FSR defined by $\Lambda^{(i-1)}(x)$ from the i-th syndrome:

$$\Delta^{(i)} = S_i - \sum_{j=1}^{L} \Lambda_j^{(i-1)} S_{i-j} \tag{3.62}$$

4. If $\Delta^{(i)} = 0$ go to step 8.

5. Modify the two FSRs:

$$\Lambda^{(i)}(x) = \Lambda^{(i-1)}(x) - \Delta^{(i)} \lambda(x) \tag{3.63}$$
$$\Omega^{(i)}(x) = \Omega^{(i-1)}(x) - \Delta^{(i)} \omega(x) \tag{3.64}$$

6. If $2L \ge i$ go to step 8.

7. Update the FSR length and the correction polynomials:

$$L = i - L \tag{3.65}$$
$$\lambda(x) = \frac{\Lambda^{(i-1)}(x)}{\Delta^{(i)}} \tag{3.66}$$
$$\omega(x) = \frac{\Omega^{(i-1)}(x)}{\Delta^{(i)}} \tag{3.67}$$

8. Let $\lambda(x) = x\,\lambda(x)$ and $\omega(x) = x\,\omega(x)$.
9. Check if all the syndrome values have been used. If i < 2t, then go to step 3.
10. Determine the roots of $\Lambda(x) = \Lambda^{(2t)}(x)$.
11. If the roots are distinct and lie in GF(q) then calculate the error magnitudes
(using Equation 3.11) for each error location; then $v(x) = r(x) - e(x)$. STOP.
12. If the roots are not distinct, or do not lie in GF(q), then the calculated error-locator does not agree with its definition (3.4); declare a decoder failure. STOP.

Note that it is trivial to implement a decoder which can correct $t_c \le t$ errors and detect … to $2t_c$, and the condition in step 9 becomes $i < 2t_c$. It is also a simple matter to add error-and-erasure decoding to the BM algorithm [Wicker, 1994]. A short sketch of the algorithm is given below.
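The sketch is a compact rendering of steps 1–9 of my own (not the thesis' implementation), run on the syndromes of Table 3.4; it returns $\Lambda(x)$ and $\Omega(x)$ up to the constant factor discussed in Example 3.4.

```python
# A sketch: Berlekamp-Massey over GF(16) for the syndromes of Table 3.4.
EXP = [0] * 30
e = 1
for i in range(15):
    EXP[i] = EXP[i + 15] = e
    e <<= 1
    if e & 0x10:
        e ^= 0x13
LOG = {EXP[i]: i for i in range(15)}

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def inv(a):
    return EXP[(15 - LOG[a]) % 15]

def add(p, q):                        # addition = subtraction in GF(2^m)
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return [a ^ b for a, b in zip(p, q)]

S = [None, EXP[11], 0, EXP[10], EXP[3], EXP[9], EXP[2]]   # S_1 .. S_6
Lam, lam, Omg, omg, L = [1], [0, 1], [1], [], 0           # (3.56)-(3.61)

for i in range(1, 7):                                     # steps 3-9
    d = S[i]                                              # discrepancy (3.62)
    for j in range(1, L + 1):
        if j < len(Lam):
            d ^= mul(Lam[j], S[i - j])
    if d != 0:
        newLam = add(Lam, [mul(d, c) for c in lam])       # (3.63)
        newOmg = add(Omg, [mul(d, c) for c in omg])       # (3.64)
        if 2 * L < i:                                     # length change needed
            L, lam, omg = i - L, [mul(inv(d), c) for c in Lam], [mul(inv(d), c) for c in Omg]
        Lam, Omg = newLam, newOmg
    lam, omg = [0] + lam, [0] + omg                       # multiply by x

print("Lambda:", ["0" if c == 0 else f"a^{LOG[c]}" for c in Lam])  # 1 + a^6 x + a^13 x^3
print("Omega: ", ["0" if c == 0 else f"a^{LOG[c]}" for c in Omg])  # 1 + a x + a^2 x^2 + a^9 x^3
```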
Example 3.4 Let the code and received codeword be as defined in Example 3.3. Recall that the received polynomial is $r(x) = \alpha^5 + \alpha^{13} x + \alpha^{12} x^2 + \alpha^4 x^3 + \alpha^{11} x^4 + \alpha^9 x^5$. The values of the algorithm variables for each iteration i are given in Table 3.6.

i   Delta   Lambda^(i)(x)            L   lambda(x)                     Omega^(i)(x)                     omega(x)
1   a^11    1 + a^11 x               1   a^4 x                         1                                a^4 x
2   a^7     1                        1   a^4 x^2                       1 + a^11 x                       a^4 x^2
3   a^10    1 + a^14 x^2             2   a^5 x                         1 + a^11 x + a^14 x^2            a^5 x + a x^2
4   a^3     1 + a^8 x + a^14 x^2     2   a^5 x^2                       1 + a^7 x + a^9 x^2              a^5 x^2 + a x^3
5   a^11    1 + a^8 x + a^7 x^2      3   a^4 x + a^12 x^2 + a^3 x^3    1 + a^7 x + a^3 x^2 + a^12 x^3   a^4 x + a^11 x^2 + a^13 x^3
6   a^10    1 + a^6 x + a^13 x^3     3   a^4 x^2 + a^12 x^3 + a^3 x^4  1 + a x + a^2 x^2 + a^9 x^3      a^4 x^2 + a^11 x^3 + a^13 x^4

Table 3.6: Solution of the key equation using the Berlekamp-Massey algorithm.

The final error-locator and error-evaluator, $\Lambda^{(6)}(x) = 1 + \alpha^6 x + \alpha^{13} x^3$ and $\Omega^{(6)}(x) = 1 + \alpha x + \alpha^2 x^2 + \alpha^9 x^3$, are the polynomials of (3.50) and (3.51) divided by the constant $\alpha^5$. The common constant factor is removed in the evaluation of (3.11). Hence the error values and locations are identical to those found in Example 3.3.
3.5.1 Introduction

The step-by-step algorithm differs considerably in its approach from that of the Euclidean and Berlekamp-Massey algorithms, which both solve the key equation to compute the error values and their locations. Step-by-step tests each possible error location and error value individually, without solving the key equation. The original step-by-step algorithm was introduced by Massey [Massey, 1965] in 1965. However, in this section the modification proposed by Wei and Wei [Wei and Wei, 1993] is described.

The basic algorithm is as follows. Suppose a method exists to calculate the error weight of a received codeword without decoding it. First the initial error weight of the received codeword is found. A trial change is then made to a symbol and the error weight recalculated: if the error weight has increased then the original symbol is correct. If the error weight has decreased then the new symbol is correct. Otherwise the symbol is in error but the new symbol is incorrect, and another value should be tried until the correct one is found. The process is repeated for all information symbols. It is not necessary to correct the parity symbols, though if desired they …
For the case where the received codeword contains t − 1 or fewer errors, it is nec-… with an incorrect one while searching for the first error location. However, if the full … t − 1 and t + 1 errors.
A method is therefore required to determine the error-weight of a received codeword. This can be achieved from the interdependence of the syndromes of RS codes, using Theorem 9.9 of [Peterson and Weldon, 1972]. Define the syndrome matrix

$$N_j = \begin{bmatrix} S_1 & S_2 & \cdots & S_j \\ S_2 & S_3 & \cdots & S_{j+1} \\ \vdots & \vdots & \ddots & \vdots \\ S_j & S_{j+1} & \cdots & S_{2j-1} \end{bmatrix} \tag{3.70}$$

where $j = 1, 2, \ldots$. The determinant $\det(N_j)$ is non-zero if the weight of e(x) equals j, and zero if the weight is less than j. If the weight of e(x) is greater than j, the result is not determinable. The proof is …

$N_{t+1}$ contains the syndrome $S_{2t+1}$, which is not available to the decoder. However, if $\det(N_t)$ is zero, then $\det(N_{t+1})$ is independent of $S_{2t+1}$ and can be calculated. Thus the problem of the unknown syndrome is avoided: $S_{2t+1}$ is replaced by zero. Note that the result of $\det(N_{t+1})$ is only valid when $\det(N_t)$ is zero.
For the purpose of calculating the error weight only the singularity of the syndrome matrix is important; the actual value, if non-zero, is discarded. Consequently the results may be expressed as binary decision bits:

$$h_j = \begin{cases} 1 & \text{if } \det(N_j) = 0, \\ 0 & \text{if } \det(N_j) \neq 0 \end{cases} \qquad j = 0, 1, \ldots, t \tag{3.71}$$

$$h_{t+1} = \begin{cases} 1 & \text{if } \det(N_{t+1}) = 0, \\ 0 & \text{if } \det(N_{t+1}) \neq 0 \end{cases} \tag{3.72}$$

As the error weight is dependent on all the t + 1 binary decision bits it is useful to combine them into a decision vector, D:

$$D = (h_1, h_2, \ldots, h_{t+1}) \tag{3.73}$$
From (3.70) the decision vector identifies the error weight as follows, where X denotes an arbitrary bit and a superscript denotes repetition:

weight 0: $D \in \mathcal{D}_0 = \{(1^{t+1})\}$ (3.74)
weight 1: $D \in \mathcal{D}_1 = \{(0, 1^{t})\}$ (3.75)
weight w, $2 \le w \le t-1$: $D \in \mathcal{D}_w = \{(X^{w-1}, 0, 1^{t-w+1})\}$ (3.76)
weight t: $D \in \mathcal{D}_t = \{(X^{t-1}, 0, X)\}$ (3.77)
weight t + 1: $D \in \mathcal{D}_{t+1} = \{(X^{t-1}, 0, X), (X^{t-1}, 1, 0)\}$ (3.78)

It is not possible to distinguish t and t + 1 errors in all cases. To account for this it is …
Wei and Wei use a concept of updated syndrome values. Since the changes to the received codeword are known, it is possible to fully calculate the syndrome values only once and from there on keep a running total. This concept is important in reducing the complexity of the algorithm. For a code with sense b = 1, the syndromes for a trial codeword with a trial error value $\gamma$ at location $x^0$ can be computed as

$$S_i^1 = S_i^0 + \gamma \qquad \text{where } \gamma = \alpha^j,\; j \in \{0, 1, \ldots, n-1\} \tag{3.79}$$
The above equation will now be generalised for codes other than narrow sense ($b > 1$), and for trial locations other than $x^0$. Let the received codeword be represented by

$$r(x)^0 = r_{n-1} x^{n-1} + r_{n-2} x^{n-2} + \cdots + r_1 x + r_0 \tag{3.80}$$

From (3.1) the syndromes are

$$S_j = \sum_{i=0}^{n-1} r_i\, \alpha^{i(j+b-1)} \tag{3.81}$$

that is,

$$S_1^0 = r_{n-1} \alpha^{(n-1)b} + r_{n-2} \alpha^{(n-2)b} + \cdots + r_2 \alpha^{2b} + r_1 \alpha^{b} + r_0 \tag{3.82}$$

$$S_2^0 = r_{n-1} \alpha^{(n-1)(b+1)} + r_{n-2} \alpha^{(n-2)(b+1)} + \cdots + r_2 \alpha^{2(b+1)} + r_1 \alpha^{(b+1)} + r_0 \tag{3.83}$$

$$\vdots$$

$$S_{2t}^0 = r_{n-1} \alpha^{(n-1)(b+2t-1)} + r_{n-2} \alpha^{(n-2)(b+2t-1)} + \cdots + r_2 \alpha^{2(b+2t-1)} + r_1 \alpha^{(b+2t-1)} + r_0 \tag{3.84}$$
The implementation described in [Wei and Wei, 1993] cyclically shifts the codeword before correcting each symbol at the $x^0$ location. As RS codes are cyclic, any cyclic shift of a valid codeword is also a valid codeword. Hence cyclic shifts do not alter the error weight of the received word. To calculate the error weight, cyclic shifts may be avoided by instead placing the trial error value directly at the test position p:

$$r(x)^1 = r(x)^0 + \gamma x^p = r_{n-1} x^{n-1} + r_{n-2} x^{n-2} + \cdots + r_1 x + r_0 + \gamma x^p \tag{3.85}$$

where $\gamma = \alpha^j$, $j \in \{0, 1, \ldots, n-1\}$ and $p \in \{0, 1, \ldots, n-1\}$.
The updated syndromes are then

$$S_1^1 = r_{n-1} \alpha^{(n-1)b} + r_{n-2} \alpha^{(n-2)b} + \cdots + r_1 \alpha^{b} + r_0 + \gamma \alpha^{pb} \tag{3.86}$$

$$S_2^1 = r_{n-1} \alpha^{(n-1)(b+1)} + \cdots + r_1 \alpha^{(b+1)} + r_0 + \gamma \alpha^{p(b+1)} \tag{3.87}$$

$$\vdots$$

$$S_{2t}^1 = r_{n-1} \alpha^{(n-1)(b+2t-1)} + \cdots + r_1 \alpha^{(b+2t-1)} + r_0 + \gamma \alpha^{p(b+2t-1)} \tag{3.88}$$
In general,

$$S_i^1 = S_i^0 + \gamma\, \alpha^{p(i+b-1)} \qquad i = 1, 2, \ldots, 2t \tag{3.89}$$

where $\gamma = \alpha^j$, $j \in \{0, 1, \ldots, n-1\}$ and $p \in \{0, 1, \ldots, n-1\}$.
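The saving is easy to confirm numerically. In this sketch of my own (values from Example 3.3; the trial position and value are arbitrary choices) a full recomputation and the running update of (3.89) agree.

```python
# A sketch: verify S_i^1 = S_i^0 + gamma * a^{p(i+b-1)} for RS(15,9,7), b = 3.
EXP = [0] * 30
e = 1
for i in range(15):
    EXP[i] = EXP[i + 15] = e
    e <<= 1
    if e & 0x10:
        e ^= 0x13
LOG = {EXP[i]: i for i in range(15)}

def mul(a, b):
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def poly_eval(p, x):
    acc = 0
    for c in reversed(p):
        acc = mul(acc, x) ^ c
    return acc

b, t = 3, 3
r = [EXP[5], EXP[13], EXP[12], EXP[4], EXP[11], EXP[9]] + [0] * 9   # (3.47)
p, gamma = 9, EXP[7]                     # hypothetical trial location and value

S0 = [poly_eval(r, EXP[(i + b - 1) % 15]) for i in range(1, 2 * t + 1)]
r[p] ^= gamma                            # trial codeword r(x) + gamma x^p
S1 = [poly_eval(r, EXP[(i + b - 1) % 15]) for i in range(1, 2 * t + 1)]

for i in range(1, 2 * t + 1):            # running update reproduces S1 exactly
    assert S1[i - 1] == S0[i - 1] ^ mul(gamma, EXP[(p * (i + b - 1)) % 15])
print("updated syndromes agree with (3.89)")
```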
The algorithm described below is a slightly modified version of that given in [Wei and Wei, 1993]. In particular, the received codeword is not cyclically shifted; instead the position of the symbol under test moves. For the high-level implementation tested (Section 6.3.2) this approach was more efficient. Annotations to the algorithm are shown indented below each step.

1. Calculate the syndromes, $S_i^0$, of the received word and the initial decision vector, $D^0$.

2. Let p = n − k.
   Only the information symbols require correction; the parity symbols occupy the least significant positions.

3. Let j = 0.
4. Let $\gamma = \alpha^j$, temporarily add $\gamma x^p$ to the received word, then obtain the updated syndromes from (3.89) and the new decision vector, $D^1$.

5. …
   The number of errors has increased, therefore the original symbol value is correct.

6. If $D^0 \in \mathcal{D}_t$ and $D^1 \in \{(X^{t-1}, 1, 0)\}$, then go to step 10.
   The temporarily-changed received codeword contains t + 1 errors. Therefore the original codeword must have contained t errors. Hence the original symbol value is correct.

7. …
   The number of errors has decreased by one. Therefore the temporary value of $\gamma$ is correct, so …

8. If $D^0 \notin \{\mathcal{D}_0, \mathcal{D}_1, \ldots, \mathcal{D}_{t-1}\}$ and $D^1 \in \mathcal{D}_{t-1}$, then add $\gamma x^p$ to the received word.

9. …

10. …

11. All the k symbols have been checked and corrected. The decoded information …
Example 3.5 Let the code and received codeword be as defined in Example 3.3. The code has … For the key decoding stages, Table 3.7 shows the active decoding position, p, the trial error value, $\gamma$, and the before and after error weights, $D^0$ and $D^1$ respectively. Also shown is the trial codeword. In this Example errors in the parity symbols are not corrected, so the first decoding position is p = 6, i.e., $x^6$. The decoder has correctly established that three symbols are in error. At location $x^6$ successive elements from GF(16) are tried in turn as the possible error value. For $\gamma = \alpha^3$ the trial codeword has increased to distance 4 from the nearest correct codeword, thus the decoder is able to identify that the original symbol value was correct. This pattern is repeated up to p = 11. Note that for p = 9 the decoder cannot detect an increase in the error-weight. This means that for every possible error pattern at location $x^9$ the distance to the nearest valid codeword is only 2t, whereas for other locations the distance is 2t or 2t + 1. As the decoder has not detected any decrease in error weight the original symbol must be correct, provided the original error weight was no more than t.
At location $x^{12}$ the decoder has found and corrected an error: the error value is $\alpha^3$, signalled by the decrease in error weight. Two more errors remain; the decoder
searches the remaining two locations but the errors are not found since they are in the
parity symbols. The algorithm required 52 attempts to correct the single error in the
information symbols.
Chapter 4
Trellis Decoding
4.1.1 Introduction
Techniques for designing block code trellises have been investigated ever since they
were first proposed for error-correction in 1974 [Bahl et al., 1974]. Wolf [Wolf, 1978]
has shown that such a trellis, known as a syndrome trellis (also called a Wolf or BCJR trellis), may be constructed for any linear block code. McEliece [McEliece, 1994] proved syndrome trellises are minimal and pro-
posed a technique to obtain an optimal reordering of the generator matrix of the code.
In 1988 Forney [Forney, 1988b] introduced the concept of a coset code and its
coset trellis. These trellises have a regular structure, composed of a number of identi-
cal subtrellises which differ only in the labelling of the trellis branches. This impor-
tant advancement in trellis design allows a reduction in both the decoder complexity
Muder [Muder, 1988] proved that coset trellises are minimal and that the number of
states in the trellis diagram can be minimised by an appropriate reordering of the sym-
bols in the codeword. A vast amount of literature has accumulated on the design of a
minimal trellis for a given block code [Berger and Be’ery, 1993; Honary and Markar-
ian, 1993a,b; Honary et al., 1993; Kasami et al., 1993a,b; Kschischang and Sorokine,
1995; Wu et al., 1994; Zyablov and Sidorenko, 1993]. Optimal reorderings have been
found for certain binary codes [Berger and Be’ery, 1993; Forney, 1988b; Honary et
al., 1995b; Kasami et al., 1993a,b]. The general solution to this problem, and its
extension for non-binary block codes, is however unsolved and remains a complex
analytical task.
… a simplification used for trellis construction, but it is not a requirement. Indeed, for some codes (e.g., non-linear codes) state-oriented form is not possible. Both syndrome … “to a situation where both channels are used each unit of time”. A similar product exists for two (or more) trellises which are combined in such a manner that they share the same time indexes. This product is known as the Shannon product of trellises.

Proposition 4.1 Consider two codes C′ and C″ with the same label profile (and, it follows, the same code length n). Let T′ be a trellis of the code C′ and T″ be a trellis of the code C″. Then a trellis for the sum code C = C′ + C″ is given by the Shannon product

$$T = T' \star T'' \tag{4.1}$$
The sum C = C′ + C″ produces the set of all possible sums v′ + v″. Let v′ + v″ ∈ C. Associated with v′ is a path P′ with labels $(l_1', l_2', \ldots, l_n')$, while associated with v″ is a path P″ with labels $(l_1'', l_2'', \ldots, l_n'')$. By definition, in T′ there is a path with labels $(l_1', l_2', \ldots, l_n')$ which is the path for codeword v′ (v′ ∈ C′). Likewise, there exists in T″ a path P″ with labels $(l_1'', l_2'', \ldots, l_n'')$ which is the path for codeword v″ (v″ ∈ C″).
For trellises which are labelled with both data and code symbols the Shannon product can be extended, and performed on each set of labels, where code labels are …

A code C having a sum structure can be constructed from the (linear) sum of its subcodes. There are many codes having the property of a sum structure, e.g., RS, RM and the Nordstrom-Robinson code. Alternatively, the trellis for any such code can be …
In proving syndrome trellises are minimal, McEliece [McEliece, 1994] showed that the maximum number of states in a syndrome trellis can be estimated using Wolf's bound:

$$N_{max} \leq \min\left(q^{k},\ q^{(n-k)}\right) \qquad (4.3)$$

The minimum number of states at the $i$-th level of the minimal trellis can be obtained from

$$N_i = \frac{q^{k}}{q^{k_{past}}\, q^{k_{future}}}, \qquad i = 0, 1, \ldots, n \qquad (4.4)$$

where $k_{past}$ and $k_{future}$ are the dimensions of the past and future subcodes of $C$ at level $i$.
Theorem 4.1 The maximum number of states in the minimal syndrome trellis of an MDS code is

$$N_{max} = \min\left(q^{k},\ q^{n-k}\right) \qquad (4.9)$$

Proof 4.2 To prove Theorem 4.1 consider the $i$-th and $(i+1)$-th vertices of the trellis such that $i = (n-1)/2$ and $i+1 = (n+1)/2$. Consider also the two different types of code:

$$\frac{n}{2} < d - 1 \qquad (4.10)$$

and

$$\frac{n+1}{2} < d \qquad (4.11)$$

For such codes the past and future subcodes at the central vertex are trivial ($k_{past} = k_{future} = 0$), so that

$$N_{max} = \frac{q^{k}}{q^{0} q^{0}} = q^{k} \qquad (4.12)$$
To obtain a syndrome trellis for an RS$(n, k, d)$ code over GF$(q)$, let $u = (u_1, u_2, \ldots, u_k)$ be the dataword and let $G$ be the generator matrix of the RS code in cyclic form [Vardy and Be'ery, 1991]. $G$ is given in the format:

$$G = \begin{bmatrix} g_1 \\ g_2 \\ \vdots \\ g_k \end{bmatrix} \qquad (4.13)$$
The code is formed as the sum of $k$ component codes,

$$C = \sum_{j=1}^{k} C_j \qquad (4.14)$$

where each component code $C_j$ consists of the codewords

$$v_j = u_j g_j \qquad (4.15)$$

$$u_j = 0, 1, \ldots, q-1 \qquad (4.16)$$

Each component trellis has $n + 1$ vertices and the number of states in the $t$-th vertex is defined as follows [McEliece, 1994]:

$$N_0 = N_n = 1 \qquad (4.17)$$

$$N_t = \begin{cases} q & \text{if } g_j^{t} \neq 0 \text{ and } g_j^{t+1} \neq 0 \\ 1 & \text{for all other cases} \end{cases} \qquad (4.18)$$

where $t = 0, 1, \ldots, n$ and $g_j^{t}$ is the $t$-th element of $g_j$.
$$N_0 = N_n = 1, \qquad N_t = q^{m} \qquad (4.19)$$

Proof 4.4 In order to prove Theorem 4.2 it is necessary to show that the maximum number of states in the designed trellis is given according to Theorem 4.1. Thus calculate the maximum number of states,

$$N_t^{max} = \min\left(q^{k},\ q^{n-k}\right) \qquad (4.20)$$

and from Theorem 4.1 it follows that the designed trellis is minimal.
Example 4.1 Design the syndrome trellis for the narrow-sense RS(7, 5, 3) code with symbols from GF(8). The generator polynomial is

$$g(x) = (x - \alpha)(x - \alpha^2) = x^2 + \alpha^4 x + \alpha^3 \qquad (4.21)$$
and the generator matrix in cyclic form is

$$G = \begin{bmatrix} g_1 \\ g_2 \\ g_3 \\ g_4 \\ g_5 \end{bmatrix} = \begin{bmatrix} \alpha^3 & \alpha^4 & 1 & 0 & 0 & 0 & 0 \\ 0 & \alpha^3 & \alpha^4 & 1 & 0 & 0 & 0 \\ 0 & 0 & \alpha^3 & \alpha^4 & 1 & 0 & 0 \\ 0 & 0 & 0 & \alpha^3 & \alpha^4 & 1 & 0 \\ 0 & 0 & 0 & 0 & \alpha^3 & \alpha^4 & 1 \end{bmatrix} \qquad (4.22)$$

The code is the sum of five component codes,

$$C = \sum_{j=1}^{5} C_j \qquad (4.23)$$

where

$$C_1 = u_1[g_1] = (\alpha^3 u_1, \alpha^4 u_1, u_1, 0, 0, 0, 0) \qquad (4.24)$$

$$C_2 = u_2[g_2] = (0, \alpha^3 u_2, \alpha^4 u_2, u_2, 0, 0, 0) \qquad (4.25)$$

$$\vdots$$

$$C_5 = u_5[g_5] = (0, 0, 0, 0, \alpha^3 u_5, \alpha^4 u_5, u_5) \qquad (4.26)$$
The component trellises have a very simple structure (as shown in Figure 4.1), with the following state profiles:

[Figure 4.1: The five component trellises $T_1$ to $T_5$.]

$$N_1(t) = (1, 8, 8, 1, 1, 1, 1, 1) \qquad (4.27)$$
$$N_2(t) = (1, 1, 8, 8, 1, 1, 1, 1) \qquad (4.28)$$
$$N_3(t) = (1, 1, 1, 8, 8, 1, 1, 1) \qquad (4.29)$$
$$N_4(t) = (1, 1, 1, 1, 8, 8, 1, 1) \qquad (4.30)$$
$$N_5(t) = (1, 1, 1, 1, 1, 8, 8, 1) \qquad (4.31)$$
Applying the procedure outlined above, the Shannon product of the trellises, $T = T_1 \star T_2 \star T_3 \star T_4 \star T_5$, has the following state profile:

$$N(t) = (1, 8, 64, 64, 64, 64, 8, 1) \qquad (4.32)$$

and the overall syndrome trellis for the RS(7, 5, 3) code is shown in Figure 4.2. Similar trellises can be obtained using the techniques described in [Wolf, 1978]. However, the technique described here allows one to label the designed trellis with both information and encoded symbols, and is easier to implement. Using the technique proposed in [Forney, 1988b; Zyablov and Sidorenko, 1993] it is easy to show that the minimal trellis for RS(7, 5, 3) must have 64 states; the designed trellis is isomorphic to the minimal trellis of the code. Therefore the designed trellis is a minimal trellis.
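The state-profile calculation lends itself to a short sketch. The code below is illustrative (it is not the Thesis software described later); it assumes that the Shannon product has, at each depth, the product of the component state counts, an assumption consistent with the 64-state figure quoted above.

#include <cstddef>
#include <cstdio>
#include <vector>

int main() {
    // State profiles N_j(t) of the five component trellises, (4.27)-(4.31).
    const std::vector<std::vector<long>> component = {
        {1, 8, 8, 1, 1, 1, 1, 1},
        {1, 1, 8, 8, 1, 1, 1, 1},
        {1, 1, 1, 8, 8, 1, 1, 1},
        {1, 1, 1, 1, 8, 8, 1, 1},
        {1, 1, 1, 1, 1, 8, 8, 1},
    };

    // The Shannon product shares time indexes, so the state count of
    // T = T1 * T2 * ... at depth t is the product of the component counts.
    std::vector<long> profile(component[0].size(), 1);
    for (const auto& n : component)
        for (std::size_t t = 0; t < profile.size(); ++t)
            profile[t] *= n[t];

    for (long n : profile) std::printf("%ld ", n);   // prints: 1 8 64 64 64 64 8 1
    std::printf("\n");
    return 0;
}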
A coset trellis contains a set of parallel subtrellises. Its highly regular structure enables the storage requirements to be reduced, since each subtrellis is identical in structure and only the branch labels differ. Moreover, the labels differ by a fixed offset which is the value of the coset leader. Decoding algorithms which are able to take advantage of this regular structure (e.g., two-stage decoding) are of lower complexity than the Viterbi algorithm (Section 6.4.2). Alternatively, a suitable decoder can take advantage of the parallel subtrellises to perform most of the decoding operations in parallel and thus reduce the decoding time.

The Shannon product of trellises (Section 4.1.2) can be applied to the design of coset trellises as follows. First, calculate the state profile of the minimal syndrome trellis for the RS$(n, k, d)$ code:
$$N_{synd} = [N_0, N_1, \ldots, N_n] \qquad (4.33)$$

The state profile can be obtained from the calculation of the minimal number of states for every splitting point of the trellis (4.4). From the calculated $N_{synd}$ choose splitting points which have the same number of states, and define the state and label profiles of the coset trellis:

$$N_{coset} = [1, N_1, N_2, \ldots, N_{N_c - 1}, 1] \qquad (4.34)$$

$$L_{coset} = [l_1, l_2, \ldots, l_{N_c - 1}] \qquad (4.35)$$
where $N_c$ is the number of columns (vertices) in the desired coset trellis. Since all splitting points are chosen with the same number of states,

$$N_i = N_j \qquad (4.36)$$

for $i, j = 1, 2, \ldots, N_c - 1$ (4.37), and in general

$$l_i \neq l_j \qquad (4.38)$$

for $i, j = 1, 2, \ldots, N_c - 1$ (4.39).
At the next stage represent the generator matrix $G$ in the following format:

$$G = \begin{bmatrix} g_1 \\ g_2 \\ \vdots \\ g_k \end{bmatrix} = \begin{bmatrix} G_1 & G_2 & \cdots & G_{N_c - 1} \end{bmatrix} \qquad (4.40)$$

Each row $g_i$ of $G$ is used to design the trellis diagram of an $(n, 1, d)$ code over GF$(q)$ with the label size profile given as in (4.35), and the overall trellis diagram can be obtained as the Shannon product of these component trellises.
Example 4.2 Design a coset trellis for RS(7, 3, 5) with symbols taken from GF(8). The generator matrix in cyclic form is
$$G = \begin{bmatrix} \alpha^3 & \alpha & 1 & \alpha^3 & 1 & 0 & 0 \\ 0 & \alpha^3 & \alpha & 1 & \alpha^3 & 1 & 0 \\ 0 & 0 & \alpha^3 & \alpha & 1 & \alpha^3 & 1 \end{bmatrix} \qquad (4.42)$$
Following the procedure outlined above, the state profile of the trellis is obtained as $N_{synd} = [N_0, N_1, \ldots, N_7]$, where

$$N_0 = \frac{q^3}{q^0 q^3} = 1 \qquad (4.43)$$
$$N_1 = \frac{q^3}{q^0 q^2} = 8 \qquad (4.44)$$
$$N_2 = \frac{q^3}{q^0 q^1} = 64 \qquad (4.45)$$
$$N_3 = \frac{q^3}{q^0 q^0} = 512 \qquad (4.46)$$
$$N_4 = \frac{q^3}{q^0 q^0} = 512 \qquad (4.47)$$
$$N_5 = \frac{q^3}{q^1 q^0} = 64 \qquad (4.48)$$
$$N_6 = \frac{q^3}{q^2 q^0} = 8 \qquad (4.49)$$
$$N_7 = \frac{q^3}{q^3 q^0} = 1 \qquad (4.50)$$

and $N_{synd} = [1, 8, 64, 512, 512, 64, 8, 1]$. It is apparent that for a given RS(7, 3, 5) code several splitting-point selections are possible.
Three possible solutions, each with different state and label profiles, are given below:

(i)
$$N = [1, 8, 8, 1] \qquad (4.51)$$
$$L = [1, 5, 1] \qquad (4.52)$$

(ii)
$$N = [1, 64, 64, 1] \qquad (4.53)$$
$$L = [2, 3, 2] \qquad (4.54)$$

(iii)
$$N = [1, 512, 512, 1] \qquad (4.55)$$
$$L = [3, 1, 3] \qquad (4.56)$$
The generator matrix is partitioned according to the chosen label profile as

$$G = \begin{bmatrix} G_1 & G_2 & G_3 \end{bmatrix} = \begin{bmatrix} \alpha^3 & \alpha & 1 & \alpha^3 & 1 & 0 & 0 \\ 0 & \alpha^3 & \alpha & 1 & \alpha^3 & 1 & 0 \\ 0 & 0 & \alpha^3 & \alpha & 1 & \alpha^3 & 1 \end{bmatrix} \qquad (4.57)$$
The overall trellis diagram, $T$, can be obtained as the Shannon product of three trellises; the component trellises are shown in Figure 4.4 and the resulting diagram is shown in Figure 4.5. As follows from Figure 4.5, the minimal coset trellis for RS(7, 3, 5) consists of 8 identical, parallel subtrellises which differ only in their branch labels.
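The splitting-point search can also be sketched in a few lines. The code below is illustrative rather than the Thesis software: it assumes the MDS property, so that the past and future subcode dimensions at level $i$ are $\max(0, i - (n-k))$ and $\max(0, k - i)$; this reproduces $N_{synd} = [1, 8, 64, 512, 512, 64, 8, 1]$ above, and then lists interior level pairs with equal state counts as candidate splitting points.

#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

// State profile of the minimal syndrome trellis, equation (4.4),
// specialised here to an MDS code such as RS(n, k, d) (an assumption).
std::vector<long> syndromeProfile(int n, int k, int q) {
    std::vector<long> N(n + 1);
    for (int i = 0; i <= n; ++i) {
        int kPast = std::max(0, i - (n - k));   // dimension of the past subcode
        int kFuture = std::max(0, k - i);       // dimension of the future subcode
        N[i] = std::lround(std::pow(q, k - kPast - kFuture));
    }
    return N;
}

int main() {
    const int n = 7, k = 3, q = 8;
    std::vector<long> N = syndromeProfile(n, k, q);
    for (long s : N) std::printf("%ld ", s);    // prints: 1 8 64 512 512 64 8 1
    std::printf("\n");

    // Interior levels with equal state counts are candidate splitting points,
    // e.g. levels {1, 6} give N = [1, 8, 8, 1] with label profile L = [1, 5, 1].
    for (int i = 1; i < n; ++i)
        for (int j = i + 1; j < n; ++j)
            if (N[i] == N[j])
                std::printf("split at levels %d and %d: %ld states\n", i, j, N[i]);
    return 0;
}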
4.2.1 Introduction
The aim of a trellis decoder is to choose the best path through the trellis, either by maximising the similarity or minimising the difference between the received sequence and one of the codewords. Depending upon which metrics are in use, 'best' may mean the largest or smallest path metric. If the trellis is labelled with both data and code symbols the decoding algorithm can usually be configured to output either data or code symbols. This is true for the Viterbi and soft-output Viterbi algorithms, and two-stage decoding (Section 4.2.3). While the dataword is normally the required output, some product code decoding algorithms may require the most likely codeword (Section 5.3.4).
Trellises for block codes are fixed in length, while the length of a convolutional trellis is related to the message length. Since this can result in exceedingly long trellises, convolutional decoding algorithms normally truncate the trellis early to reduce the decoding delay and memory to a finite, known value. For this reason convolutional code trellises are frequently seen with multiple start and end points, reflecting the fact that the decoding sequence may begin and end at any state. Heller and Jacobs have shown that the length of the truncated trellis should be 4 or 5 times the code constraint length, by which time it can be assumed that all surviving paths have merged with the ML path [Heller and Jacobs, 1971]. Forney's more conservative result [Forney, 1970] indicates a somewhat longer truncation depth.

[Figure 4.4: The three component trellises $T_1$, $T_2$ and $T_3$ for the RS(7, 3, 5) coset trellis.]
The Viterbi algorithm (VA) was proposed by Viterbi [Viterbi, 1967] in 1967. It was later shown [Omura, 1969] that the VA provides a ML de-
coding solution for convolutional codes, and has since been used for ML decoding of
block codes also. It reduces the computational load by taking advantage of the trellis
structure. It calculates a series of path metrics which are a measure of the similarity
(or difference) between the received sequence and the possible transmitted sequences.
The VA eliminates paths which cannot possibly form part of the ML path. This is
performed when two or more branches enter a node; the partial path having the best
metric is chosen to become the surviving path. This continues until the end of the
trellis is reached and a surviving path selected. The VA is usually implemented in one of two forms, register-exchange or traceback.

The Viterbi algorithm is most easily explained with the aid of an example decoding.

Example 4.3 Viterbi decoding of the (2, 1, 3) convolutional code. An encoder for this code is given in Figure 2.2. The trellis (see Figure 4.7) contains
four states at each level. Suppose the uncoded data sequence was

$$u = [\ldots, 0, 1, 0, 0, 1, 1, 1, 0, \ldots] \qquad (4.58)$$

and that the encoder shown in Figure 2.2 was in state $S_3$. The encoded sequence, $x$, was transmitted over a discrete symmetric channel as shown in Figure 4.6, and the receiver assigns one of four values to each received symbol. The '0' and '1' indicate the reception of a good signal, while '0̄' and '1̄' indicate reception of a weaker signal. For the channel shown in Figure 4.6 log likelihood functions [Wicker, 1994, p. 294] are used to compute the set of bit metrics used in the decoding process, given in Table 4.1.

                     received symbol
                   0     0̄     1̄     1
    required  0    5     4     2     0
    symbol    1    0     2     4     5

Table 4.1: Channel metrics for the Viterbi decoding example.
[Figure 4.6: The discrete symmetric channel, showing the transition probabilities $p(r \mid v)$ from the transmitted symbols 0 and 1 to the received symbols 0, 0̄, 1̄ and 1.]
Figure 4.7 shows the (truncated) trellis diagram for the code (2, 1, 3). At each level every branch is labelled with its data and code symbols, and the value in parentheses is the SD metric for that particular branch. The value above or below each node is the state metric, which is a measure of the likelihood of any state being part of the transmitted sequence. The state metrics can be found recursively from the sum of an input branch and its preceding state metric. In this Example the metrics are a measure of similarity, so the best path is the one with the largest metric.

Trellis decoding starts at state $S_{0,1}$, that is state 0 at time $t = 1$. The best path back to $t = 0$ is $P(S_{0,1} \to S_{2,0})$ and is indicated by a solid black line. The decoding metric for state $S_{0,1}$ is 8. This process is repeated for all other states at time $t = 1$. Moving on to $t = 2$, state $S_{0,2}$ has a choice of two paths, one of which is $P(S_{0,2} \to S_{2,1} \to S_{3,0})$.
[Figure 4.7: The truncated trellis for the (2, 1, 3) code, annotated with branch metrics (in parentheses) and state metrics, for the received sequence r = 01 10 10 11 01 01 10 01. Thick solid lines: the ML path; solid lines: selected paths; dashed lines: discarded paths.]
The process is repeated up to $t = 8$. If any two paths have the same metric one is chosen arbitrarily. Note that if each state stores its own metric it is not necessary to recompute the partial path metrics at each decoding step.
The ML path, denoted by thick, solid lines, can be seen by starting at the state at time $t = 8$ with the best metric (i.e., $S_{2,8}$) and tracing back along the best path. The data symbol output from the decoder is the data label on the earliest branch of the ML path, i.e., $B(S_{2,1} \to S_{3,0})$. Therefore the decoder output is '0', which is in agreement with the first data symbol of $x$. Subsequent decoding attempts will output more recent symbols in the trellis. The trace-back can be avoided by keeping track of the output data associated with each surviving path.

It is important that the path metrics for the states at time $t = 1$ are not lost. These states will become the earliest states in the next decoding attempt. Only the relative differences between the state metrics matter, so a constant may be subtracted from all of them to prevent overflow. Therefore on the next decoding attempt the state metrics for $S_{0,0}$, $S_{1,0}$, $S_{2,0}$ and $S_{3,0}$ take the values computed for the states at $t = 1$ in the previous attempt.

Note that the surviving path from each state at time $t = 8$ has merged with the ML path by $t = 1$. Therefore, regardless of which node had been chosen the correct data for $t = 0$ would have resulted. This indicates that the truncated trellis was (just!) long enough so that all possible paths had merged with the ML path. In practice a trellis of depth 8 for a code of constraint length 3 is not sufficiently long to reliably ensure all possible paths merge with the ML path. A flowchart showing the decoding stages is given in Figure 4.8.
[Figure 4.8: Flowchart of the Viterbi algorithm. (a) Start: $t = 1$, $i = 0$. (b) Calculate the path metric for each entering branch, $P_m = B_m(S_{i,t} \to S_{j,t-1}) + P_m(S_{j,t-1})$. (c) Store the best metric as $P_m(S_{i,t})$. (d)-(f) If $t = 1$ store the output data for the selected branch, otherwise copy the output data from $S_{j,t-1}$. (g)-(h) Increment $i$ and repeat until $i = 2^{K-1} - 1$. (i)-(j) Increment $t$ and repeat until the trellis depth $\delta$ is reached. (k) Choose the state $P_m(S_{i,\delta})$ with the best metric. (l) Output its data label and stop.]
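The add-compare-select recursion at the heart of the flowchart can be written compactly. The following sketch is illustrative rather than the Thesis implementation; it assumes similarity metrics (larger is better, as in Table 4.1) and a caller-supplied branch metric function.

#include <cstdio>
#include <limits>
#include <vector>

// Sentinel returned by branchMetric when no branch connects two states.
const long absent = std::numeric_limits<long>::min();

// One add-compare-select (ACS) step, boxes (b)-(c) of the flowchart.
// stateMetric holds the metrics at depth t-1 and is updated to depth t;
// survivor[j] records the best predecessor of state j for traceback.
template <typename BranchMetric>
void acsStep(std::vector<long>& stateMetric, std::vector<int>& survivor,
             BranchMetric branchMetric) {
    const int numStates = (int)stateMetric.size();
    std::vector<long> next(numStates, absent);
    survivor.assign(numStates, -1);
    for (int to = 0; to < numStates; ++to)
        for (int from = 0; from < numStates; ++from) {
            long bm = branchMetric(from, to);
            if (bm == absent) continue;            // no such branch
            long m = stateMetric[from] + bm;       // add
            if (m > next[to]) {                    // compare
                next[to] = m;                      // select
                survivor[to] = from;
            }
        }
    stateMetric.swap(next);
}

int main() {
    // Toy two-state example with invented branch metrics.
    std::vector<long> metric = {0, 0};
    std::vector<int> survivor;
    acsStep(metric, survivor,
            [](int from, int to) -> long { return from == to ? 5 : 2; });
    std::printf("state 0: metric %ld from state %d\n", metric[0], survivor[0]);
    return 0;
}

Repeating acsStep once per trellis section and tracing the survivor arrays back from the best final state yields the ML path of the example above.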
It has been shown [Omura, 1969] that SDMLD of RS codes can be achieved by use of the Viterbi algorithm [Viterbi, 1967] over a suitable trellis. For short RS codes Viterbi decoding is practical: the designed coset trellises are isomorphic to the minimal trellis, and are thus themselves minimal (Section 4.1.4). However, for long RS codes Viterbi decoding becomes prohibitively complex, and it is therefore necessary to use a different decoding method. The algebraic techniques described in Chapter 3 are well-known but, unlike trellis decoding, are unable to take advantage of soft-decision information.

Two-stage decoding exploits the fact that the trellis for a code with an inherent sum structure may be decomposed into its component trellises, $T'$ and $T''$. It should be noted that the technique is not constrained to a decomposition into only two component trellises. If the decoding complexity of trellises $T'$ and $T''$ is $\chi'$ and $\chi''$ respectively, then by decoding on the two component trellises the complexity is reduced to $\chi' + \chi''$. The storage requirement is reduced in much the same way. Hence both major hurdles to trellis decoding of long codes are lowered. The two stages are:

1. Identify the most likely subtrellis(es) from the received sequence.

2. Apply the Viterbi decoding algorithm only to the subtrellis indicated at step 1.
If the overall trellis is viewed as the Shannon product of two trellises, $T'$ and $T''$, with corresponding codes $C'$ and $C''$, then codewords from $C''$ can be viewed as coset leaders. Unlike most reduced search algorithms, the paths to be decoded are decided before trellis decoding (proper) begins, i.e., at the end of stage one, when the most likely subtrellis(es) have been identified. The usual behaviour of reduced search algorithms (e.g., [Shin and Sweeney, 1994] or [Aguado and Farrell, 1998]) is to select the candidate paths as decoding progresses.

Reed-Muller codes are highly regular, a feature which can be used to good effect in trellis decoding. Two-stage decoding is able to make good use of their regular structure, and was first employed by Wu et al., who found the decoding performance was only 0.2-0.5 dB away from SDMLD [Wu et al., 1994]. For two-stage decoding of Reed-Muller codes a trellis can be used to identify which subtrellis(es) to decode. However, RM codes have few subtrellises, and the subtrellis decision can instead be made algebraically (using 'soft' Galois field algebra (Section 4.3) if soft information is available).
Consider the code generated from a generalised array code (Section 2.1.5). The construction of the code can be found in [Honary et al., 1995a]. The code $C$ is the sum of the two subcodes

$$C_1 = \begin{bmatrix} u_1 & p_1 \\ u_2 & p_2 \\ u_3 & p_3 \\ u_4 & p_4 \end{bmatrix} \qquad (4.61)$$

$$C_2 = \begin{bmatrix} 0 & u_4 \\ 0 & u_4 \\ 0 & u_4 \\ 0 & u_4 \end{bmatrix} \qquad (4.62)$$

$$C = \begin{bmatrix} u_1 & u_1 \oplus u_4 \\ u_2 & u_2 \oplus u_4 \\ u_3 & u_3 \oplus u_4 \\ p_4 & p_4 \oplus u_4 \end{bmatrix} = \begin{bmatrix} v_1 & v_2 \\ v_3 & v_4 \\ v_5 & v_6 \\ v_7 & v_8 \end{bmatrix} \qquad (4.63)$$

where

$$p_j = u_j, \quad j \in \{1, 2, 3\}, \qquad p_4 = u_1 \oplus u_2 \oplus u_3 \qquad (4.64)$$
Addition over GF(2) is denoted by '$\oplus$'. The trellis for this code is shown in Figure 4.9. From (4.63) it can be seen that the cosets of the RM code are generated by $C_1$ and the coset leaders by $C_2$.
The first stage of the decoding process is to identify in which subtrellis the codeword lies. This is achieved by decoding $C_2$ to find the value of $u_4$. At this stage the values of the other data symbols, $u_1$, $u_2$ and $u_3$, are not known. However, four independent predictions of the value of $u_4$ can be made from the four rows in $C$. If $u_4 = 0$ then the left and right columns of a row should have the same value, and if $u_4 = 1$ then the columns should have opposite values. This can be shown by rearranging (4.63):
$$\hat{u}_4 = \begin{Bmatrix} (v_1 \oplus v_2), \\ (v_3 \oplus v_4), \\ (v_5 \oplus v_6), \\ (v_7 \oplus v_8) \end{Bmatrix} \qquad (4.65)$$

where $\hat{u}_4$ is the set of symbol predictors for $u_4$. If hard-decision values of the received symbols are used, the predictions can be combined with a majority vote; with soft decisions the predictions are evaluated using 'soft' Galois field arithmetic (Section 4.3) to preserve as much information as possible. Having found $u_4$ the appropriate subtrellis can be decoded.
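For this binary GAC the hard-decision first stage reduces to a majority vote over the four predictors of (4.65). A minimal sketch follows, with illustrative names; the tie value of -1 is a convention assumed here, not part of the original method.

#include <cstdio>

// Hard-decision prediction of u4 from the received bits v1..v8 of (4.63):
// each row predicts u4 = v_left XOR v_right, and a majority vote decides.
int predictU4(const int v[8]) {
    int ones = 0;
    for (int row = 0; row < 4; ++row)
        ones += v[2 * row] ^ v[2 * row + 1];   // one predictor per row
    if (ones == 2) return -1;                  // 2:2 tie: defer to soft information
    return ones > 2 ? 1 : 0;                   // majority vote
}

int main() {
    const int v[8] = {1, 1, 0, 0, 1, 1, 0, 1}; // example received bits, one in error
    std::printf("u4 = %d\n", predictU4(v));    // three predictors say 0, one says 1
    return 0;
}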
[Figure 4.9: The trellis for the generalised array code of (4.63), labelled with data and code symbols.]

Although the coset leader is the direct output of the $T''$ trellis, it is not possible in the case of RS codes to simply pass the received codeword through $T''$; the received codeword contains contributions from all of the component codes, and so
an algebraic method is used to predict which subtrellis to decode. The subtrellis prediction can be improved with the inclusion of SD information, such as by using 'soft' Galois field arithmetic. In general no fewer than $k$ received symbols can be used to predict $u_j$, since there are $k$ unknowns (the $k$ information symbols). Any $k$ symbols can be chosen, and coefficients found such that

$$c_{1j} v_1 + c_{2j} v_2 + \ldots + c_{kj} v_k = f_{1j} u_1 + f_{2j} u_2 + \ldots + f_{kj} u_k \qquad (4.66)$$

where

$$f_{ij} = \begin{cases} 0 & \text{for } i \neq j \\ 1 & \text{for } i = j \end{cases} \qquad (4.67)$$
From the generator matrix $G$ form a $k \times k$ matrix, $\Phi$, from the $k$ columns which relate to the encoded symbols in set $S$. That is, for symbols $\{s_1, s_2, \ldots, s_k\}$ form $\Phi$ from columns $s_1, s_2, \ldots, s_k$ of $G$. The coefficients are then the solution of

$$\begin{bmatrix} \phi_{11} & \phi_{12} & \cdots & \phi_{1k} \\ \phi_{21} & \phi_{22} & \cdots & \phi_{2k} \\ \vdots & & & \vdots \\ \phi_{k1} & \phi_{k2} & \cdots & \phi_{kk} \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_k \end{bmatrix} = \begin{bmatrix} f_{1j} \\ f_{2j} \\ \vdots \\ f_{kj} \end{bmatrix} \qquad (4.68)$$
Earlier it was stated that almost any $k$ symbols could be used. It is important that only a minimum set of received symbols is used to predict a given information symbol. A minimum set, $S_{min}$, is defined such that no subset of $S_{min}$ exists from which the prediction can be made. Using a minimum set maximises the reliability of the prediction, since for small $P_s$ the probability of a prediction using an incorrect symbol is proportional to both $P_s$ and the number of symbols used. It should be noted that the calculation of the coefficients is a design operation, while the evaluation of the weighted sums (4.66) is a decoding operation. From the requirement of minimum sets it can be shown that the number of predictions is maximised when $C''$ is formed from the 'central' component codes, i.e.,
$$C' = \begin{cases} C_1 + C_2 + \ldots + C_{\frac{k+3}{4}} + C_{\frac{3k+5}{4}} + C_{\frac{3k+9}{4}} + \ldots + C_k & \text{where } \frac{k+1}{2} \text{ is odd} \\[1ex] C_1 + C_2 + \ldots + C_{\frac{k+1}{4}} + C_{\frac{3k+3}{4}} + C_{\frac{3k+7}{4}} + \ldots + C_k & \text{where } \frac{k+1}{2} \text{ is even} \end{cases} \qquad (4.70)$$

$$C'' = \begin{cases} C_{\frac{k+7}{4}} + C_{\frac{k+11}{4}} + \ldots + C_{\frac{3k+1}{4}} & \text{where } \frac{k+1}{2} \text{ is odd} \\[1ex] C_{\frac{k+5}{4}} + C_{\frac{k+9}{4}} + \ldots + C_{\frac{3k-1}{4}} & \text{where } \frac{k+1}{2} \text{ is even} \end{cases} \qquad (4.71)$$
Each subtrellis can be obtained by adding the appropriate codeword from $C''$ (i.e., a coset leader of $C$) to the coset of $C$ containing the all-zeros codeword. The subtrellis is then decoded with the Viterbi algorithm. An improvement in performance can be obtained by decoding more than one subtrellis. The subtrellises decoded are chosen on the basis of the highest confidences from the output of stage one. The final output is the one with highest confidence from the output of stage two.
Example 4.5 Design the symbol predictors for two-stage decoding of the RS(7, 3, 5) code.
Let the trellis be designed according to Example 4.2. Thus the generator matrix is

$$G = \begin{bmatrix} g_1^T \\ g_2^T \\ g_3^T \end{bmatrix} = \begin{bmatrix} \alpha^3 & \alpha & 1 & \alpha^3 & 1 & 0 & 0 \\ 0 & \alpha^3 & \alpha & 1 & \alpha^3 & 1 & 0 \\ 0 & 0 & \alpha^3 & \alpha & 1 & \alpha^3 & 1 \end{bmatrix} \qquad (4.72)$$
The component trellises are shown in Figure 4.4, and the complete trellis in Figure 4.5. From (4.70) and (4.71) it is apparent that the selection of the subtrellises should be based upon the 'central' code (i.e., $C_2$) to maximise the number of predictions available (and thus ensure the best possible performance). Although it is possible to use isomorphic trellises where the subtrellis decision is based upon $C_1$ or $C_3$, this will result in fewer available predictions.

The symbols of $C_2$ are dependent upon $u_2$ alone, therefore only a single symbol predictor, $\hat{u}_2$, is required. Whilst in general the minimum number of received symbols required to predict $C_2$ is $k = 3$, there exist two minimum sets requiring only 2 received symbols, $\{v_1, v_2\}$ and $\{v_6, v_7\}$.

For the minimum set $S = \{v_1, v_2\}$ the calculation of the symbol predictor proceeds as follows. The weighted sum of the received symbols is (from Equation 4.66)

$$c_1 v_1 + c_2 v_2 = f_1 u_1 + f_2 u_2 + f_3 u_3 = u_2 \qquad (4.73)$$
The coefficients $c_1$ and $c_2$ can be found by taking columns 1 and 2 from the generator matrix:

$$\begin{bmatrix} \alpha^3 & \alpha \\ 0 & \alpha^3 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \qquad (4.74)$$

$$c_1 = \alpha^2 \qquad (4.75)$$

$$c_2 = \alpha^4 \qquad (4.76)$$
However, the transmitted symbols $\{v_1, v_2, \ldots, v_n\}$ are not known by the receiver. Instead the received symbols $\{r_1, r_2, \ldots, r_n\}$ must be used. As they may be subject to errors the calculation is weakened to a prediction of the value of $u_2$. However, as 27 independent predictions are available, the overall prediction is much less sensitive to errors. The full set of 27 symbol predictors is given in Table 4.2.

Having calculated the symbol predictors, consider the decoding of the all-zeros dataword, $u = [0\ 0\ 0]$. The transmitted codeword is given by

$$v = u \cdot G = [0\ 0\ 0] \begin{bmatrix} \alpha^3 & \alpha & 1 & \alpha^3 & 1 & 0 & 0 \\ 0 & \alpha^3 & \alpha & 1 & \alpha^3 & 1 & 0 \\ 0 & 0 & \alpha^3 & \alpha & 1 & \alpha^3 & 1 \end{bmatrix} = [0\ 0\ 0\ 0\ 0\ 0\ 0] \qquad (4.77)$$
For simplicity consider only hard-decision decoding of $C_2$. Let the received codeword be $r = v + e$, where $e = [0\ \alpha^6\ 0\ 0\ 0\ 0\ 0]$:

$$r = [r_1\ r_2\ r_3\ r_4\ r_5\ r_6\ r_7] = [0\ \alpha^6\ 0\ 0\ 0\ 0\ 0] \qquad (4.78)$$

The prediction of the value of $u_2$, and thus of which subtrellis to decode, is obtained by substituting (4.78) into the symbol predictors (Table 4.2) and choosing the most likely value. The results of evaluating the symbol predictors are given in Table 4.3. It can be seen that the most likely value of $u_2$ is zero, and therefore the subtrellis to be decoded is that corresponding to $u_2 = 0$. When several subtrellises are to be decoded, the predictor results identify the most likely candidates. The chosen subtrellis(es) are decoded with normal Viterbi decoding (Section 4.2.2).
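The predictor evaluation can be checked numerically. The sketch below is illustrative: it assumes GF(8) is generated by the primitive polynomial $x^3 + x + 1$ (an assumption consistent with (4.21)) and evaluates the $\{v_1, v_2\}$ predictor $\hat{u}_2 = \alpha^2 r_1 + \alpha^4 r_2$ for the received word of (4.78).

#include <cstdio>

// GF(8) generated by x^3 + x + 1 (assumed); elements in polynomial form.
const int expTab[7] = {1, 2, 4, 3, 6, 7, 5};   // expTab[i] = alpha^i
int logTab[8];                                 // inverse of expTab, logTab[0] unused

int gfMul(int a, int b) {
    if (a == 0 || b == 0) return 0;
    return expTab[(logTab[a] + logTab[b]) % 7];
}
int gfAdd(int a, int b) { return a ^ b; }      // characteristic-2 addition

int main() {
    for (int i = 0; i < 7; ++i) logTab[expTab[i]] = i;

    // Received word of (4.78): r1 = 0, r2 = alpha^6 (the error symbol).
    const int r1 = 0, r2 = expTab[6];

    // Predictor from the minimum set {v1, v2}: u2_hat = alpha^2 r1 + alpha^4 r2.
    int u2hat = gfAdd(gfMul(expTab[2], r1), gfMul(expTab[4], r2));
    std::printf("u2_hat = alpha^%d\n", logTab[u2hat]);   // prints alpha^3
    return 0;
}

As expected, the predictor that uses the corrupted symbol $r_2$ returns $\alpha^3$ rather than 0; the predictors that avoid $r_2$ return 0, so the majority decision still selects the correct subtrellis.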
Reference has been made to 'soft' Galois field arithmetic. In this Section an explanation of the technique is given. When GF arithmetic is performed on hard-decision values, the useful soft information is discarded. By including the confidence of every possible value of the two operands, all $q^2$ possible outcomes of an addition can be evaluated. There are $q$ different results ($0, 1, \alpha, \ldots, \alpha^{q-2}$) and each result occurs $q$ times. Therefore, each of the $q$ output values has an associated probability which is the sum of $q$ probabilities. A similar approach can also be used for multiplication, although there is also the complication that multiplication by zero always yields zero.
Example 4.7 The addition of two GF(4) values using soft GF arithmetic.

For the elements $\{0, 1, \alpha, \alpha^2\}$ let the confidences of $a$ be $\{0.75, 0.10, 0.10, 0.05\}$ respectively, and for $b$ $\{0.25, 0.25, 0.50, 0.00\}$ respectively. Consider the result $a \oplus b = 0$. There are four ways by which this outcome may be achieved: $0 \oplus 0$, $1 \oplus 1$, $\alpha \oplus \alpha$ and $\alpha^2 \oplus \alpha^2$. The probability that $a \oplus b = 0$ is given by

$$p(a \oplus b = 0) = p(a{=}0)\,p(b{=}0) + p(a{=}1)\,p(b{=}1) + p(a{=}\alpha)\,p(b{=}\alpha) + p(a{=}\alpha^2)\,p(b{=}\alpha^2) \qquad (4.79)$$
Table 4.4 illustrates how the output confidences for all outcomes are computed; the most likely result is $a \oplus b = \alpha$.

    Element   Computation                                      Total
    0         0.1875    0.0250    0.0500    0.0000             0.2625
              (0+0)     (1+1)     (α+α)     (α²+α²)
    1         0.1875    0.0250    0.0000    0.0250             0.2375
              (0+1)     (1+0)     (α+α²)    (α²+α)
    α         0.3750    0.0000    0.0250    0.0125             0.4125
              (0+α)     (1+α²)    (α+0)     (α²+1)
    α²        0.0000    0.0500    0.0250    0.0125             0.0875
              (0+α²)    (1+α)     (α+1)     (α²+0)

Table 4.4: Output confidences for the soft addition of a and b.
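Example 4.7 is mechanical enough to express directly in code. The sketch below is illustrative; it represents GF(4) elements in polynomial form so that addition is XOR, and accumulates the $q^2 = 16$ outcome probabilities.

#include <cstdio>

// Soft addition over GF(4) = {0, 1, a, a^2}, represented as integers 0..3
// in polynomial form (1 = 01, a = 10, a^2 = 11); addition is then XOR.
int main() {
    const double pa[4] = {0.75, 0.10, 0.10, 0.05};  // confidences of a
    const double pb[4] = {0.25, 0.25, 0.50, 0.00};  // confidences of b

    double out[4] = {0, 0, 0, 0};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            out[i ^ j] += pa[i] * pb[j];            // accumulate onto the GF sum

    const char* name[4] = {"0", "1", "a", "a^2"};
    for (int k = 0; k < 4; ++k)
        std::printf("p(a+b = %-3s) = %.4f\n", name[k], out[k]);
    // Prints 0.2625, 0.2375, 0.4125, 0.0875, matching Table 4.4.
    return 0;
}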
4.4 Discussion

In this Chapter techniques for constructing minimal trellises have been demonstrated. For low-rate codes this is served by coset trellises, while for high-rate codes syndrome trellises should be used. The Shannon product of trellises is important not only for trellis construction but also for the decomposition of trellises into simpler forms. Two-stage decoding of RS codes is a new method which can take advantage of simplifications offered by the regular coset trellis structure. To improve the decoding performance of TSD, a new procedure for including soft information in the evaluation of Galois field algebra was presented. The decoding performance and complexity of both Viterbi and two-stage decoding has been measured by computer simulation; the results are presented in Chapter 6.
Chapter 5
Codes
5.1.1 Introduction

The basic concepts of concatenated coding were introduced in Section 2.3. In such a system the inner decoder is able to take advantage of any SD information from the channel. For maximum performance the outer decoder requires SD information from the inner decoder. Traditional decoders are unable to fulfill this requirement. Two algorithms which can provide the required soft outputs are SOVA [Hagenauer and Hoeher, 1989] and MAP [Bahl et al., 1974]. SOVA was identified as the more suitable for this work.
Massey stated that convolutional codes should be used as the first stage of decoding because they can easily accept soft decisions and channel state information [Massey, 1984]. Many concatenated coding schemes exist which do just that, for the very reason Massey gave. When the inner Viterbi decoder makes a wrong decision it typically produces a burst of errors [Hagenauer et al., 1994, p. 243]. Reed-Solomon codes are well known for their burst error correction capability when $\log_2 q_{RS}$ binary bits are mapped into one RS symbol. This combination of an inner convolutional code with an outer RS code provides very good performance and is used by NASA and ESA for space communications [Dai, 1995; Wicker, 1994]. While the binary to multi-level mapping provides burst error correction it is a mixed blessing, as the bit error probabilities must somehow be transformed into symbol error probabilities. For an RS trellis decoder this can be achieved by combining the bit metrics to form symbol metrics.
Section 5.2 of this Chapter describes the soft output Viterbi algorithm. The Viterbi algorithm is modified so that it outputs reliability information in addition to the most likely symbol(s). Product codes can be viewed as a type of concatenated code, with the row and column codes forming the inner and outer codes. Section 5.3 describes various algorithms for decoding product codes.
It can be shown [Gallager, 1968; Wozencraft and Jacobs, 1965] that for any channel, if all input sequences are equally likely, the decoder which minimises the error probability is one which compares the conditional probabilities (or likelihood functions), $p(r \mid v)$, of the received sequence, $r$, and all possible transmitted sequences, $v$, and selects the maximum, $v'$ [Viterbi, 1971]. Such a decoder is termed maximum likelihood. For

$$r = r_t, r_{t+1}, r_{t+2}, \ldots \qquad (5.1)$$

$$v' = v'_t, v'_{t+1}, v'_{t+2}, \ldots \qquad (5.2)$$

transmitted over a memoryless channel, the likelihood function factorises as

$$p(r \mid v') = p(r_t \mid v'_t)\, p(r_{t+1} \mid v'_{t+1})\, p(r_{t+2} \mid v'_{t+2}) \cdots \qquad (5.3)$$
For most channels the inputs to the receiver are real values and thus require infinite precision. This is not possible and some loss of precision must be accepted by quantising the received signal to a finite number of values. Simulation studies [Heller and Jacobs, 1971] have shown that 8-level quantisation resulted in only 0.25 dB reduction in coding gain with respect to the unquantised case, much less than the gains offered by soft-decision decoding.
The transition probabilities may then be computed by considering the area under the PDF for each quantisation level. In (5.3) it can be seen that the conditional probability that sequence $r$ was received is dependent upon multiplication operations. For a practical decoder a reduction in complexity can be achieved by using logarithms. Equation (5.3) may then be rewritten without multiplications:

$$\log p(r \mid v') = \log p(r_t \mid v'_t) + \log p(r_{t+1} \mid v'_{t+1}) + \log p(r_{t+2} \mid v'_{t+2}) + \cdots \qquad (5.4)$$

Since $\log p(r \mid v')$ increases monotonically with $p(r \mid v')$ the decoder is able to maximise $\log p(r \mid v')$ instead of $p(r \mid v')$ with the same result. Logarithms of any base may be used, the only difference being a scaling factor. Unless stated otherwise natural logarithms are used.
Integer arithmetic is typically several times faster than floating point arithmetic and is often preferred for reasons of both speed and reduced complexity of the hardware required. The log likelihood functions can be transformed into integer log likelihood metrics by

$$\ell(r_i \mid v'_i) = \left\langle a \log p(r_i \mid v'_i) + b \right\rangle \qquad (5.5)$$

where $a$ and $b$ are real numbers chosen to scale the LL functions into a suitable range, and $\langle x \rangle$ denotes the closest integer to $x$. Rounding errors can be minimised by choosing appropriate values for $a$ and $b$. When $a$ is positive the decoder should select $v'_i$ to maximise $\ell(r_i \mid v'_i)$ and when $a$ is negative $v'_i$ is selected to minimise $\ell(r_i \mid v'_i)$.
Consider the generation of integer LL metrics for a coherently-demodulated BPSK channel with 8 quantisation levels.

Let the received bit energy in noiseless conditions be $E_b$ and let the levels have an equal spacing of $\frac{1}{2}\sqrt{E_b}$. For simplicity assume unity bit energy; let a "0" be represented by $-1$ and a "1" by $+1$. Thus the 7 transition points are $-1.5$, $-1.0$, $-0.5$, $0.0$, $+0.5$, $+1.0$ and $+1.5$. If "0" was transmitted it could be received in any one of the 8 levels, and the probability of each level being received is (generally) different and dependent upon $E_b/N_0$, the signalling scheme used and the noise PDF. Figure 5.1 shows the 7 transition levels, and the signalling values with the superimposed Gaussian PDF (2.36).
The area under the standard normal PDF is

$$A = \int \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{x^2}{2} \right) dx \qquad (5.6)$$

Thus the probability of receiving "0" in any region is given by evaluating the integral in (5.6) between the limits of the quantisation level, with the limits normalised by the standard deviation of the noise. For coherently demodulated BPSK with unity bit energy

$$\sigma = \sqrt{\frac{N_0}{2 E_b}} \qquad (5.7)$$
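The construction of Table 5.1 can be reproduced from (5.5)-(5.7). The sketch below is illustrative: it evaluates the Gaussian integral with the standard erfc function, takes $E_b/N_0 = -3$ dB to match the table, and scales the log probabilities linearly onto the integer range 0 to 15 as described above.

#include <cmath>
#include <cstdio>

// P(X < z) for a standard normal random variable.
double Phi(double z) { return 0.5 * std::erfc(-z / std::sqrt(2.0)); }

int main() {
    const double EbN0 = std::pow(10.0, -3.0 / 10.0);     // -3 dB
    const double sigma = std::sqrt(1.0 / (2.0 * EbN0));  // (5.7) with Eb = 1
    const double mean0 = -1.0;                           // "0" transmitted as -1
    const double inf = 1e9;

    double logp[8];
    for (int level = 0; level < 8; ++level) {
        double lo = (level == 0) ? -inf : -1.5 + 0.5 * (level - 1);
        double hi = (level == 7) ? +inf : -1.5 + 0.5 * level;
        double p = Phi((hi - mean0) / sigma) - Phi((lo - mean0) / sigma);
        logp[level] = std::log(p);                       // (5.6) between the limits
    }

    // Equation (5.5): choose a and b so the metrics span 0..15, then round.
    double a = 15.0 / (logp[0] - logp[7]), b = -a * logp[7];
    for (int level = 0; level < 8; ++level)
        std::printf("level %d: p(r|0)=%.4f  metric=%d\n", level,
                    std::exp(logp[level]),
                    (int)std::lround(a * logp[level] + b));
    return 0;
}

Running this reproduces the p(r|0) column and the integer "0" metrics of Table 5.1; the metrics for "1" follow by symmetry.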
[Figure 5.1: The ±1 signalling values and the 7 transition points, with the superimposed Gaussian PDFs for "0" and "1" (probability versus received signal).]
For this channel the scaling factor evaluates to 3.8642. The transition probabilities and LL metrics for "1" are computed in the same way. The transition probabilities are given in Table 5.1, where the symmetry of the channel is reflected in the symmetry of the LL metrics. The metrics are shown graphically in Figure 5.2. In the area of indecision (received signal ≈ 0) the metrics are approximately equal, while further from zero the metrics display an increasingly strong bias to "0" or "1". In Figure 5.1 it can be seen that there is a disproportionately large probability of receiving a signal in the outermost quantisation levels. This reflects the poor SNR of the channel, as the probability of receiving a high confidence value is significant even when the opposite symbol was transmitted.

The LL metrics are dependent upon $E_b/N_0$, but the receiver can only estimate this value, so it is important to examine the sensitivity of the metrics to variation in $E_b/N_0$. Figure 5.3 shows the LL metrics for symbol "0" against $E_b/N_0$ over the range $-6$ dB to $+6$ dB. The metrics are scaled to fit the range 0 to 15 using the method outlined above. The dotted lines show the metrics truncated to integer values as would be used in a hardware implementation of SOVA. For the region where the noise dominates the signal ($E_b/N_0 < 0$ dB) the metrics are sensitive to changes in $E_b/N_0$. In such noisy conditions coding would not be used because uncoded operation results in fewer errors. For $E_b/N_0 > 2$ dB there is very little change in the scaled LL values; indeed, for the integer metrics there is no change. This shows that for the area of interest the metrics are not particularly sensitive to errors in the estimate of $E_b/N_0$.
    level  lower  upper  p(r|0)   p(r|1)   log p(r|0)  log p(r|1)  scaled "0"  scaled "1"  LL "0"  LL "1"
    0      -inf   -1.5   0.3083   0.0062   -1.18       -5.09       15.00        0.00       15       0
    1      -1.5   -1.0   0.1917   0.0165   -1.65       -4.11       13.18        3.77       13       4
    2      -1.0   -0.5   0.1917   0.0440   -1.65       -3.12       13.18        7.53       13       8
    3      -0.5   +0.0   0.1500   0.0918   -1.90       -2.39       12.24       10.36       12      10
    4      +0.0   +0.5   0.0918   0.1500   -2.39       -1.90       10.36       12.24       10      12
    5      +0.5   +1.0   0.0440   0.1917   -3.12       -1.65        7.53       13.18        8      13
    6      +1.0   +1.5   0.0165   0.1917   -4.11       -1.65        3.77       13.18        4      13
    7      +1.5   +inf   0.0062   0.3083   -5.09       -1.18        0.00       15.00        0      15

Table 5.1: Transition probabilities and LL metrics for a coherently-demodulated BPSK channel at $E_b/N_0 = -3$ dB.
[Figure 5.2: LL metrics for coherently-demodulated BPSK at $-3$ dB (LL metric versus received signal).]
[Figure 5.3: Scaled LL metrics for symbol "0" versus $E_b/N_0$ (dB), one curve per quantisation level, from $-\infty \to -1.5$ up to $+1.5 \to +\infty$.]
5.2.1 Introduction

As its name suggests the soft output Viterbi algorithm [Hagenauer and Hoeher, 1989; Hagenauer et al., 1994, 1996] is a modification to the 'standard' Viterbi algorithm [Viterbi, 1967]. It can decode using soft or hard decision information and provides a single reliability measure for its output sequence, which is the closest codeword. Alternatively, if the trellis is labelled with both data and code symbols then the output can be the most likely dataword together with its reliability.

One of the main areas in which SOVA has been used is for iterative or turbo decoding.² An immense amount of literature has recently been written on turbo decoding, but the work involving SOVA described in this Chapter is intended for the purpose of concatenated decoding and therefore iterative methods have not been applied.
5.2.2 Differences Between the Standard and Soft Output Viterbi Algorithms
The soft output Viterbi algorithm differs from the standard model by keeping track
of the reliability of its decisions. However, only decisions which affect the outcome
are considered, that is decisions which lie along the surviving path. Decisions which
² Turbo decoding is also known as turbo coding; this is misleading since the 'turbo' (feedback) analogy applies to the decoding, not the codes themselves [Hagenauer et al., 1996].
affect other paths are discarded at the same time the path is discarded. Consider the trellis segment for a binary code shown in Figure 5.4.

[Figure 5.4: Example trellis with metric differences $\Delta_{j,t}$ for traceback SOVA; the difference in path metrics at state $S_{2,3}$ is marked.]

For each trellis state, $S_{j,t}$, the Viterbi algorithm chooses the branch $B(S_{i,t-1} \to S_{j,t})$ to select the best partial path metric, $P_m(S_{i,t-1}) + B_m(S_{i,t-1} \to S_{j,t})$.
The metric difference between the surviving and the discarded path at this state is

$$\Delta_{j,t} = M_{j,t} - M' \qquad (5.8)$$

where $M_{j,t}$ is the metric of the surviving path and $M'$ that of the discarded path.
The probability that the decision made at this point was correct is given by [Hagenauer and Hoeher, 1989]

$$p(\text{correct}) = \frac{e^{M_{j,t}}}{e^{M_{j,t}} + e^{M'}} \qquad (5.10)$$

$$= \frac{e^{\Delta_{j,t}}}{1 + e^{\Delta_{j,t}}} \qquad (5.11)$$

Therefore the log likelihood ratio of this binary path decision is $\Delta_{j,t}$, because

$$\log \frac{p(\text{correct})}{1 - p(\text{correct})} = \Delta_{j,t} \qquad (5.12)$$
This shows that when two paths merge and either would give rise to the same output, the decision is irrelevant and its reliability is infinite; otherwise the reliability of the decision is given by the difference in the partial path metrics of the surviving and discarded paths. The reliability of the output sequence is given by the product of the reliabilities for the decisions affecting the output (equivalent to the sum of log likelihood metrics). Hagenauer shows that the sum can be approximated by the smallest log likelihood decision reliability of the terms [Hagenauer, 1995]. While the channel information gives some indication as to the most likely transmitted sequence, the additional reliability information gleaned from the decoding process is termed the extrinsic information.
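Equations (5.10)-(5.12) translate directly into code. A small illustrative sketch, assuming natural-logarithm metrics as stated earlier:

#include <cmath>
#include <cstdio>

int main() {
    // Metric difference between the surviving and the discarded path, (5.8);
    // the value 5 matches the 24 - 19 decision in the worked example below.
    double delta = 5.0;

    double pCorrect = std::exp(delta) / (1.0 + std::exp(delta));  // (5.11)
    double llr = std::log(pCorrect / (1.0 - pCorrect));           // (5.12)

    std::printf("p(correct) = %.4f, LLR = %.1f\n", pCorrect, llr);
    return 0;   // the LLR equals delta, confirming (5.12)
}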
It is important to note at this point that the behaviour of the algorithm differs between convolutional and block trellises. This is due to the differing ways in which the two types of trellis are used. A convolutional trellis grows with the message length, which would incur a considerable delay in the decoding process. For this reason a truncated trellis (typically 4 to 6 times the constraint length) is decoded once per $k$ output symbols, whereas a block code trellis is decoded once for the entire codeword. This affects the algorithm in two ways. Firstly, for convolutional decoding independent passes through the trellis are made for each $k$ output symbols. Usually $k = 1$ and thus each symbol has a unique metric. This cannot be done for the fixed length block code trellis. The ML sequence is the codeword, and thus all output symbols in the codeword must share the same reliability metric. While the MAP algorithm [Hagenauer et al., 1996] is able to provide reliability metrics for each encoded symbol, this is not helpful for concatenated schemes where metrics for the reliability of output data symbols are sought. (MAP is also much more complex.) The second point to note is that when decoding over a truncated convolutional trellis the output is the data label(s) on the first branch, whereas for the block code trellises described in Chapter 4 the output is the sequence of data labels (or occasionally code labels) found on all branches forming the ML path. This means all decisions along the ML path of a block code trellis are important, since they all will affect the outcome. Conversely, for convolutional decoding fewer decisions affect the output, because the depth of the truncated trellis is chosen with the assumption that all surviving paths have merged with the ML path before the output branch.
The differences between the standard and soft output Viterbi algorithms are most easily seen with the aid of a decoding example. They are also illustrated in Figure 5.5, where the flowchart given in Chapter 4 (Figure 4.8) is expanded to include the additional SOVA operations.
[Figure 5.5: Flowchart of the soft output Viterbi algorithm. The Viterbi flowchart of Figure 4.8 is extended so that, alongside each path metric calculation $P_m = B_m(S_{i,t} \to S_{j,t-1}) + P_m(S_{j,t-1})$, the metric difference $\Delta_{i,t} = |P_m^0 - P_m^1|$ is calculated and stored.]
Example 4.3 will now be extended to include SD outputs. Figure 5.6 shows the annotated trellis. Every decision-making state, i.e., those for which time $t > 0$, is now annotated with two values: the partial path metric to time $t = 0$ (as in Example 4.3) and the new LL reliability of the decision made at that node (in parentheses).

As before, decoding starts at state $S_{0,1}$ and the best path back to $t = 0$ is indicated by a solid black line. The decoding metric for state $S_{0,1}$ is 8. Since both paths give the same output data, "1", it is clear that no bit error would have occurred had the discarded path been selected. Therefore the LL reliability of this decision is $\infty$, even should that node be on the ML path. Where the outputs differ, the reliability metric is the difference between the metrics of the selected path and the discarded path, e.g., $24 - 19 = 5$. Should the decision metric be zero this shows that the choice of best path was tied; therefore the value of the data output was dependent upon an arbitrary decision and should not be relied upon.
The reliability of the output is found by tracing back along the ML path and taking the minimum of all the decision metrics on the ML path. Thus the data output is "0" with reliability 7 (from state $S_{2,3}$). Trace-back can be avoided if each state keeps track of the (minimum) reliability metric.

It was noted in Example 4.3 that the trellis was just long enough for the surviving paths to merge with the ML path. This is seen by tracing the paths back in time in Figure 5.6.
[Figure 5.6: The trellis of Figure 4.7 annotated for SOVA decoding of r = 01 10 10 11 01 01 10 01. Each state carries its partial path metric and, in parentheses, the LL reliability of the decision at that node ($\infty$ where both entering paths give the same output). Thick solid lines: the ML path; solid lines: selected paths; dashed lines: discarded paths.]
In Section 6.5 SOVA is used to improve the performance of a satellite link using a standard concatenated coding system. Unlike the iterative decoding described in [Hagenauer et al., 1996], where the extrinsic information is used as the a priori information to the next iteration, here the extrinsic information is used as the a priori information to the outer decoder.

SOVA considers only the weakest decision made, and thus considers only the ML path and the best discarded path. Thus SOVA is analogous to a MAP decoder with 2 codewords [Bahl et al., 1974]. The SOVA algorithm described above has been extended to work over non-binary convolutional code trellises (and also binary block code trellises). At each state the extended algorithm considers only the best and next-best paths, ignoring the metrics on the other discarded paths. As before, if the output data would be identical the reliability of the decision is $\infty$, otherwise it is the difference in metrics of the best and next-best paths; should an arbitrary decision be made the decision metric is zero. Thus at the least reliable node on the ML path the extended SOVA algorithm decides between two codewords, whilst in principle further paths could be considered.
5.3.1 Introduction
Product codes can be considered a form of concatenated codes, where the row and
column codes form the inner and outer codes. If constructed with linear codes then
the encoding and decoding order is not important (Section 2.1.4) and may be reversed
if desired. The decoding of row and column codewords can even be alternated [Bate
et al., 1986; Farrell et al., 1986]. This is in contrast with a true concatenated coding
scheme (Section 2.3) where decoding order must be the reverse of the encoding order.
Several possible methods exist: the Fano [Fano, 1963], stack [Jelinek, 1969; Zigangirov, 1966], Chase [Chase, 1972] and Viterbi [Viterbi, 1967] algorithms, and also MAP [Bahl et al., 1974; Hagenauer and Hoeher, 1989]. Equally desirable are soft outputs, so that the second decoder of the concatenated system can perform optimally. It is of course also important that the algorithm can be efficiently implemented, either in hardware or software. For this reason the extended SOVA described in Section 5.2.3 was selected. The MAP algorithm with its symbol-by-symbol reliability metrics is also applicable, but the complexity of SOVA is considerably lower for only minor degradation in performance [Hagenauer et al., 1996]. Any appropriate block code trellis may be used. The block code trellises used in [Hagenauer et al., 1996] are based upon the parity-check matrix ($H$) and are thus irregular in structure; thus the GAC construction methods based upon the generator matrix ($G$) described in Chapter 4 are used here. RS codes were chosen as the block codes; they are MDS and so provide the greatest possible distance for a given $n$ and $k$. This is important for a product code as the overall distance is the product of the distances of the subcodes.
For some product code decoding algorithms it is required that the subcode codewords are systematic. One such algorithm is the alternating row/column method described in Section 5.3.7; other algorithms may terminate early when all codewords containing data symbols have been decoded. More precisely, the data symbols should have a one-to-one correspondence with $k$ code symbols; the actual order of the symbols is not important, nor do they need to be consecutive (such a symbol reordering is easily accommodated).

Consider the RS(7, 5, 3) trellis constructed in Example 4.1. The generator matrix (4.22) is not systematic as it is not in reduced-echelon form, nor can it be rearranged to be. Only $u_5$ is unchanged after the data word $u$ is multiplied by the generator matrix. Table 5.2 shows a few of the 32768 codewords from the RS(7, 5, 3) trellis. For clarity the symbols are given in decimal form. It clearly shows the lack of a one-to-one correspondence between the data symbols and the code symbols ($u_5$ excepted). RS codes are invertible (Section 2.1.8) and therefore any $k$ symbols can be chosen as data symbols; the remaining $n - k$ symbols form the parity checks. Since the trellis is labelled with independent data and code labels it is possible to re-map the data symbols on the trellis to match the first $k$ code symbols. Table 5.3 shows the same trellis after re-mapping.
    dataword           codeword
    u1 u2 u3 u4 u5     v1 v2 v3 v4 v5 v6 v7
    0  0  0  0  0      0  0  0  0  0  0  0
    0  0  0  0  1      0  0  0  0  4  5  1
    0  0  0  0  2      0  0  0  0  5  6  2
    ...
    0  0  7  7  7      0  0  6  3  2  5  7
    0  1  0  0  0      4  5  0  1  0  0  0
    0  1  0  0  1      4  5  0  1  4  5  1
    ...
    7  7  7  5  6      3  6  6  2  7  2  6
    7  7  7  5  7      3  6  6  2  4  7  7
    7  7  7  6  0      3  6  3  2  1  6  0
    ...

Table 5.2: Sample data and codewords of the non-systematic RS(7, 5, 3) trellis.
    dataword           codeword
    u1 u2 u3 u4 u5     v1 v2 v3 v4 v5 v6 v7
    0  0  0  0  0      0  0  0  0  0  0  0
    0  0  0  0  1      0  0  0  0  1  2  5
    0  0  0  0  2      0  0  0  0  2  3  6
    ...
    0  0  7  7  7      0  0  7  7  7  3  4
    0  1  0  0  0      0  1  0  0  0  1  1
    0  1  0  0  1      0  1  0  0  1  4  6
    ...
    7  7  7  5  6      7  7  7  5  6  3  0
    7  7  7  5  7      7  7  7  5  7  0  6
    7  7  7  6  0      7  7  7  6  0  2  4
    ...

Table 5.3: Sample data and codewords of the systematic RS(7, 5, 3) trellis.
Many different methods for decoding product codes exist. It is useful to define the standard decoder against which the other methods are compared. For this purpose cascade decoding (Section 2.1.4, p. 15) is used as the reference method. It is expected that the channel can supply soft-decision information to the row decoder, hence the first decoding stage is soft-decision. However the standard decoder does not make use of advanced techniques such as SOVA (Section 5.2), so it is unable to supply soft information to the column decoder (second stage). The column decoding stage is therefore hard-decision (Figure 5.7).
The SOVA decoder detailed in Section 5.2 was configured to produce two outputs, a codeword and a metric indicating the reliability of the chosen codeword. For non-cascade decoding algorithms the full codeword symbols are required. For cascade decoding only the dataword is strictly required, however this is easily obtained from the first $k$ symbols of the systematic codeword. In order for the SOVA metric to influence later decodings its value must be somehow incorporated into the buffer storing the received channel metrics. The method used for this was common to both product code decoding schemes using SOVA (Sections 5.3.5 and 5.3.7) and so will be described here.

Remembering that SOVA on a block code trellis outputs only one metric (Section 5.2.2) necessitates the assumption that the metric applies equally to all symbols.
[Figure 5.7: Cascade decoding of a product code: soft-decision row decoding followed by hard-decision column decoding recovers the $k_1 \times k_2$ information matrix.]
5.3. REED-SOLOMON PRODUCT CODES 139
Nor does SOVA indicate the next most likely value for a symbol from a non-binary
alphabet. Therefore it must be assumed that the discarded values for each symbol are
equally improbable. Plainly this is not the case but without more information no better
In all cases the received symbol metrics (channel metrics) were stored in a buffer
of the same dimensions as the transmitted codeword (n1 n2 ). When decoding a row/
column the corresponding row/column metrics were passed to the SOVA decoder.
After decoding it is not desirable to completely replace the channel information with
the SOVA metric as it applies to only a subset of all possible symbol values. SOVA
may also have incorrectly decoded the received codeword. Instead, the SOVA metric
is accumulated to the M buffered metrics for the symbol values which correspond to
those selected by SOVA (where M =n 1 for rows and M =n 2 for columns). Thus the
Example 5.3 Consider a (9; 4; 4) product code, whose subcodes are (3; 2; 2) parity
check codes over GF(4). Using SOVA decode the first row code and modify the
symbol metrics.
Let the LL metrics for the symbols of row 1 of the product code be

              column 1     column 2     column 3
    0  : 19    0  : 23    0  : 23
    1  : 28    1  : 28    1  : 25
    α  : 4     α  : 8     α  : 14
    α² : 13    α² : 13    α² : 16                (5.13)

(rows 2 and 3 are omitted). SOVA indicates that the best codeword is "110", with a reliability of 3. Therefore add 3 to the metrics corresponding to a row 1 codeword "110". The new metrics (marked *) are

    0  : 19    0  : 23    0  : 26*
    1  : 31*   1  : 31*   1  : 25
    α  : 4     α  : 8     α  : 14
    α² : 13    α² : 13    α² : 16                (5.14)
It can be seen that if the reliability of the decision made by SOVA is zero, i.e., the decision was arbitrary, then no change is made to the metrics. Conversely, a positive decision by SOVA will result in a large modification of the LL metrics. Adding the same value to the metrics of all symbols obeys the stated assumption that all symbols are equally reliable; not modifying the discarded symbol values maintains the second assumption that the discarded symbol values are equally improbable. At no point is the a priori channel information discarded; rather it is modified with the extrinsic information from SOVA.
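The metric modification of Example 5.3 amounts to a single accumulation per symbol. An illustrative sketch reproducing (5.13)-(5.14):

#include <cstdio>

int main() {
    // LL metric buffer for row 1 of the (9,4,4) product code: metric[col][sym],
    // with GF(4) symbols indexed {0, 1, a, a^2}, as in (5.13).
    int metric[3][4] = {{19, 28, 4, 13},
                        {23, 28, 8, 13},
                        {23, 25, 14, 16}};

    int decoded[3] = {1, 1, 0};   // SOVA output: codeword "110"
    int reliability = 3;          // SOVA output: reliability metric

    // Accumulate the SOVA reliability onto the metrics of the symbols it
    // selected; the discarded symbol values are left untouched, as in (5.14).
    for (int col = 0; col < 3; ++col)
        metric[col][decoded[col]] += reliability;

    for (int col = 0; col < 3; ++col)
        std::printf("col %d:  0:%d  1:%d  a:%d  a^2:%d\n", col + 1,
                    metric[col][0], metric[col][1], metric[col][2], metric[col][3]);
    return 0;   // prints the updated metrics 31, 31 and 26 of (5.14)
}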
A logical extension of the cascade decoder is to apply SOVA decoding to the first decoding stage. The second stage is therefore able to use SD information. In other studies [Hagenauer et al., 1994; Marple, 1998] SOVA provided a performance improvement in similar configurations.

The received symbol metrics are stored in a buffer of the same dimensions as the transmitted codeword ($n_1 \times n_2$). The rows are decoded using the extended SOVA algorithm detailed in Section 5.2.3, taking the soft information from the buffer. After decoding each row the SOVA metric is used to modify the metrics stored in the buffer (Section 5.3.4). The second stage, decoding columns, then follows. The Viterbi decoder is able to make use of SD information from the first stage, along with the a priori channel information.
Alternative decoding strategies exist which aim to minimise the number of errors. Bate et al. considered various methods based upon decoding rows and columns alternately [Bate et al., 1986]. The row/column subcodes were decoded in decreasing order of confidence. The algorithms investigated for subcode decoding were hard decision decoding, soft decision decoding using successive erasures decoding [Chase, 1972], and combined soft/hard decision decoding. In [Bate et al., 1986] the method used to compute the confidence of a received codeword was
$$C = \sum_{i=1}^{M \log_2 q} \left| \log \frac{p(r_i \mid 0)}{p(r_i \mid 1)} \right| \qquad (5.15)$$

where

$$M = \begin{cases} n_1 & \text{for rows} \\ n_2 & \text{for columns} \end{cases} \qquad (5.16)$$

and $p(r_i \mid v)$ are the channel conditional probabilities. In terms of the integer LL metrics this becomes

$$C' = \sum_{i=1}^{M \log_2 q} \left| \ell_{0i} - \ell_{1i} \right| \qquad (5.17)$$

where $\ell_{0i}$ is the LL metric for symbol "0" for the $i$-th bit and $\ell_{1i}$ is the metric for symbol "1" for the $i$-th bit. In other words, the confidences can be computed by summing the absolute differences between the LL metrics for "0" and "1". In the case of LL ratios which have been mapped to integer values, $C'$ will be related to $C$ by $C' \simeq aC$ (allowing for rounding errors), where $a$ is a scaling factor (Equation 5.5).
The alternating row/column decoder in [Bate et al., 1986] was adapted to use SOVA. The channel modelled was binary, though only symbol metrics were available to the decoders (Section 6.6.1). This required a small change to the method used for calculating the received codeword confidence. The symbols represented by the best and worst LL metrics are bit inverses of each other. Therefore (5.17) may be rewritten as

$$C' = \sum_{i=1}^{M} \left| \ell_{best_i} - \ell_{worst_i} \right| \qquad (5.18)$$

where $\ell_{best_i}$ is the metric corresponding to the most likely value for the $i$-th symbol and $\ell_{worst_i}$ that of the least likely value.
The received symbol metrics are stored in a buffer of the same dimensions as the
transmitted codeword (n1 by n2 ). From these buffered values the row and column
codeword confidences are computed with (5.18) and sorted into order of decreasing
confidence. The row codeword with the highest probability of being correct is decoded
first.
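The confidence ordering of (5.18) is a sum of metric spreads followed by a sort. An illustrative sketch, in which the metric values for rows 2 and 3 are invented for the demonstration:

#include <algorithm>
#include <cstdio>
#include <vector>

// Codeword confidence, equation (5.18): the sum over the M symbols of the
// difference between the best and worst LL metrics for that symbol.
long confidence(const std::vector<std::vector<int>>& symbolMetrics) {
    long c = 0;
    for (const auto& m : symbolMetrics) {
        auto mm = std::minmax_element(m.begin(), m.end());
        c += *mm.second - *mm.first;
    }
    return c;
}

int main() {
    // Three row codewords of a toy product code, each with 3 GF(4) symbols.
    std::vector<std::vector<std::vector<int>>> rows = {
        {{19, 28, 4, 13}, {23, 28, 8, 13}, {23, 25, 14, 16}},   // row 1 of (5.13)
        {{10, 11, 12, 9}, {15, 15, 14, 15}, {8, 9, 10, 11}},    // invented
        {{0, 30, 2, 1},   {1, 29, 0, 2},   {28, 0, 3, 1}},      // invented
    };

    std::vector<int> order = {0, 1, 2};
    std::sort(order.begin(), order.end(), [&](int a, int b) {
        return confidence(rows[a]) > confidence(rows[b]);       // most confident first
    });
    for (int r : order)
        std::printf("row %d (C' = %ld)\n", r + 1, confidence(rows[r]));
    return 0;
}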
After decoding one row codeword the next most likely column codeword is decoded. This process is repeated until all row and column codewords containing data symbols are decoded. By this method each data symbol is decoded twice, once with a row decoding and once with a column decoding. The decodings may occur in either order. However after both decodings the data symbol's metrics cannot be changed. Therefore after the second decoding the value of the symbol (as decided by the SOVA codeword output) is copied to the output buffer.

Decoding may terminate early if the remaining row and column codewords do not contain data symbols, since their decoding will not affect the output data. With this procedure the average decoding delay and computation are reduced. If the product code is not square it may be advantageous to decode row and column codewords in some ratio other than 1 : 1, so that the row and column decodings finish together.

Bate et al. [Bate et al., 1986] recomputed the row and column subcode confidences after each row and column iteration. Re-sorting the confidences can be numerically expensive; even the best sorting algorithms require of the order of several times $N \log_2 N$ operations [Press et al., 1992, p. 329]. The performance of the algorithm described above was tested with and without the re-sorting of codeword confidences.
Example 5.4 The decoding of an arbitrary product code with the alternating row/column algorithm.
Figure 5.8 shows the initial decoding stages. Hatching denotes codewords which have been decoded, whilst shading indicates information symbols which have been copied to the output buffer.

(a) In this case the best row codeword is within the information section of the product code.

(b) The best column decoding is also within the information section of the product code. One data symbol has been decoded twice and is copied to the output buffer.

(c) The second best row decoding does not affect any data symbols of the product code.

(d) The second best column decoding intersects two completed row decodings. Only the intersecting data symbol is copied to the output buffer.

(e) The third row decoding intersects two previously decoded columns.

This process continues until all data symbols have been decoded twice and copied into the output buffer.
Although it was stated earlier that SOVA was applied for concatenated coding techniques, it was noticed how readily iterative decoding may be applied to the decoder described in Section 5.3.7. After decoding, the channel metric buffer contains the channel metrics plus the extrinsic information from SOVA, ready to be the a priori information for the next iteration. Hence repeated decodings on the same channel metric buffer implement a form of iterative decoding.

Research is a product of an inquisitive mind (and vice versa); given the ease with which iterative decoding may be applied and the additional increase in coding gain possible ($> 1.5$ dB for just 4 iterations with a BCH(64, 51, 6) × BCH(64, 51, 6) product code [Pyndiah, 1998]), the temptation to test this decoder in an iterative fashion could not be resisted. Some encouraging introductory results are given in Section 6.6.4.
Chapter 6
All the decoders discussed in detail in this Thesis were implemented in C++ [Stroustrup, 1997]. This allowed rapid development and code reuse of important components such as GF and polynomial arithmetic, channel models and trellis decoders. The trellises themselves were described by a definition file, and the trellis diagrams in this Thesis were computer-generated from the trellis definition files.

It should be noted that all the simulations described in this Chapter have been performed in full, using random data and random errors. For trellis decoding full
implementations of the trellis were used. This is important since it allows real implementations to be accurately represented. Simulations which assume transmission of the all-zeros codeword and/or matched-filters over a limited subset of codewords are useful performance tools but cannot be used in real systems, where the full set of codewords may be transmitted and the trellis diagram is used to exploit the structure of the code.
One simple measure of decoder complexity is the number of codewords decoded in a given time. This is only accurate if all the decoders in the trial are implemented equally well. Subsequent comparisons can only be made by using the same hardware, which may not be appropriate for all algorithms, and which may not always be available. A more formal method is the use of O-notation.¹ While O-notation is a helpful tool for algorithm designers its use is not without problems [Sedgewick, 1988, pp. 71-76]. It is a worst-case bound; the constants $c_0$ and $N_0$ are unknown and may be large. Without knowledge of the constants $c_0$ and $N_0$ only limited conclusions can be drawn.
A more practical method was chosen instead, that of counting the number of important mathematical operations. Assigning a cost to each operation is not an easy task: the relative execution time of algebraic operations is dependent upon the hardware selected but, by choosing the appropriate parameters, the method can be ap-

¹ A function $g(N)$ is said to be $O(f(N))$ if there exist constants $c_0$ and $N_0$ such that $g(N) < c_0 f(N)$ for all $N > N_0$ [Sedgewick, 1988, p. 72].
plied to any implementation. To enable valid comparisons each mathematical operation (add ($+$), subtract ($-$), multiply ($\times$), divide ($\div$) and compare ($=$)) was assigned a cost in terms of the number of CPU cycles required for its execution. Table 6.1 shows the assumed costs of integer, floating-point (real) and Galois field operations, based upon the AT&T DSP32C digital signal processor. The DSP32C does not contain instructions for integer multiply, integer divide or floating-point divide. However, these instructions were not needed by any of the decoders implemented.
It was assumed that the GF arithmetic would be implemented using the same polynomial basis representation as used elsewhere in this Thesis. In the polynomial basis, addition and subtraction are identical and can be performed with an exclusive-OR logical operation. For the polynomial basis, multiplication and division are most easily implemented by table look-up, which is the method used in Table 6.1. To enable comparisons between different codes it is useful to define a normalised complexity, $\chi_{bit}$, which is the total decoding complexity divided by the number of bits decoded.
    data              Operation
    type      +     -     ×     ÷     =
    int       1     1     n/a   n/a   3
    float     2     2     2     n/a   6
    GF        1     1     5     5     3

Table 6.1: Cost in CPU cycles of each mathematical operation.
6.2.1 Introduction

The complexity of a trellis has previously been measured by various means: number of states [Muder, 1988], number of vertices [Kasami et al., 1993a,b] and number of edges [McEliece, 1996]. However, none of these methods allow trellis decoding complexity to be compared with that of other decoding techniques. Such a comparison was made possible by the operation counts presented in [Honary, Markarian, and Marple, 1995c, 1996, 1997]. Therefore, the trellis decoding complexity can be compared directly with algebraic decoding using the method described in Section 6.1.2. It should be noted that for integer and floating-point numbers the term comparison includes the tests $>$, $<$, $\geq$ and $\leq$ in addition to equality ($=$) and their logical inverses.

The analysis is first performed for the Viterbi algorithm, and then later extended to include SOVA. In the analysis presented here it is also assumed that log likelihood metrics are used to avoid multiplication operations (p. 119); a similar analysis is possible for Euclidean distance metrics, which were used in [Honary, Markarian, and Marple, 1995c, 1996, 1997]. Following these
assumptions the complexity can be calculated with the steps shown below. Note that
only algebraic operations involving the log likelihood metrics are counted; other operations
are designated overheads. The overheads, which are very dependent on the actual
implementation, are excluded from the measure of complexity. One important consequence is that memory accesses to the stored metrics
can be made for free. However, no assumption is made as to the type of the log likelihood
metrics (integer or floating-point).
Branch Labels
Consider a trellis with a branch profile B(t) = [B_1, B_2, ..., B_{N_c}] and a (code)
label profile L(t) = [L_1, L_2, ..., L_{N_c}].
To calculate the metric associated with one branch at depth j requires the addition of
L_j log likelihood metrics, a process which needs L_j - 1 additions. This is repeated for
the B_j branches at each depth, so the total number of additions is

    N_+ = \sum_{j=1}^{N_c} B_j (L_j - 1)    (6.1)
Note that trellises for simpler codes, such as RM and single error-correcting Ham-
ming codes, may 'share' branch labels. At a given depth, j, more than one branch
may be labelled with the same code symbols, i.e., L(S_{i,t}, S_{j,t+1}) = L(S_{i',t}, S_{j',t+1}) where
L_t > 1 and at least one of the inequalities i != i', j != j' holds (i.e., the start and/or end
vertices must differ). An example of such a trellis is Figure 2.4, where for instance
L(S_{1,1}, S_{1,2}) = L(S_{2,1}, S_{2,2}) = [0, 0]. The branch metrics associated with such branches
will always be identical, so it is possible to optimise decoding by saving the tempo-
rary result to memory after the first calculation. Note that this saving is irrelevant for
branches with a code label size of 1; from (6.1) the complexity of evaluating such a
branch is zero. This optimisation is not possible with RS codes and so will not be
considered further.
State Metrics
The state metric is the metric for the best partial path from the root to the state in
question. Consider the trellis section given in Figure 6.1. Let the number of branches
entering state S_{j,t} be N_b. The first partial path metric can be calculated as the sum of
the state metric from the preceding level, S_{i,t-1}, and the metric of the branch linking
the two states. This requires one addition. The process is repeated for the remaining
N_b - 1 branches. The best metric is stored as the state metric; finding the best metric
requires N_b - 1 comparisons. For a linear trellis the number of branches entering each
state S_{j,t} is the same for all j = 1, 2, ..., N_t states at depth t. Therefore N_b is given
by

    N_b = B_t / N_t    (6.2)
The total number of comparisons over all depths is therefore

    N_= = \sum_{t=1}^{N_c} (N_b - 1) N_t
        = \sum_{t=1}^{N_c} N_t (B_t / N_t - 1)
        = \sum_{t=1}^{N_c} (B_t - N_t)    (6.3)
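For concreteness, a minimal sketch of the add-compare-select step being counted here is given below. It performs exactly the N_b additions and N_b - 1 comparisons per state described above; the use of double metrics and "larger is better" ordering are assumptions, since the text leaves the metric type open.

    #include <cstddef>
    #include <vector>

    // One add-compare-select step: combines the predecessor state metrics
    // with the metrics of the branches joining them to this state.
    double stateMetric(const std::vector<double>& predecessor,
                       const std::vector<double>& branch) {
        double best = predecessor[0] + branch[0];            // first addition
        for (std::size_t i = 1; i < predecessor.size(); ++i) {
            double candidate = predecessor[i] + branch[i];   // Nb - 1 further additions
            if (candidate > best) best = candidate;          // Nb - 1 comparisons
        }
        return best;
    }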
For rectangular non-linear codes the same technique can be applied, except that N_b
may not be constant over all vertices at a given depth, requiring an extra summation
over the vertices at each depth.
In some cases it is possible to optimise the state metric calculation. For trellises
which are never truncated the state metric of the root is always zero, therefore no
addition is required in the calculation of the state metrics at depth 1. Generally this is
not true for convolutional code trellises (see Example 4.3), since the first states in the
trellis will not have a zero metric. The number of additions required to calculate the
state metrics is

    N_+ = \sum_{t=t_s}^{N_c} N_b N_t = \sum_{t=t_s}^{N_c} B_t    (6.4)
where t_s = 2 for block code trellises and t_s = 1 for truncated (convolutional) trellises.
Total Complexity
The total complexity for trellis decoding using the VA is obtained by combining (6.1),
(6.3) and (6.4):

    N_+^{VA} = \sum_{j=1}^{N_c} B_j (L_j - 1) + \sum_{t=t_s}^{N_c} B_t    (6.5)

    N_=^{VA} = \sum_{t=1}^{N_c} (B_t - N_t)    (6.6)
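Equations (6.5) and (6.6) are straightforward to evaluate mechanically. The sketch below does so from the branch, label and state profiles; the function and type names are illustrative assumptions.

    #include <cstddef>
    #include <vector>

    struct VaOps { long additions = 0, comparisons = 0; };

    // B, L and N are the branch, label and state profiles of the text,
    // indexed here from zero; ts is 2 for block codes and 1 for truncated
    // (convolutional) trellises.
    VaOps vaComplexity(const std::vector<long>& B, const std::vector<long>& L,
                       const std::vector<long>& N, int ts) {
        VaOps c;
        for (std::size_t t = 0; t < B.size(); ++t) {
            c.additions += B[t] * (L[t] - 1);      // branch metrics, (6.1)
            if (static_cast<int>(t) + 1 >= ts)     // depth t+1 in the text's numbering
                c.additions += B[t];               // state metrics, (6.4)
            c.comparisons += B[t] - N[t];          // state selection, (6.6)
        }
        return c;
    }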
Using the same technique as above it is possible to calculate the additional complexity
for SOVA decoding. The calculation of the branch metrics is unchanged. During the
calculation of the state metric the reliability metric must also be determined. After
finding the best partial path from the N_b possible choices the next best path must be
found, i.e., the best of the remaining N_b - 1 possibilities. However, by arranging
the selection as a binary tree, fewer than \lceil\log_2 N_b\rceil additional comparisons are needed [Knuth, 1973].3
A decision-making state is one defined by N_b > 1.

3. For linear codes where q is an integer power of 2, N_b is always an integer power of 2. This
restriction is met for the majority of useful codes.
Finally, SOVA must store the difference between the best and next best paths,
requiring one subtraction per decision-making state. SOVA must also test whether the output
data from the best and next best paths are the same. If not, the difference between the
best and next best paths must be calculated and stored, which requires one subtraction
per decision-making state. For the case that both paths would result in the same output
data, infinity is stored. The comparison of output data is declared to be part of the overheads for
two reasons. Firstly, it is an integer comparison, not necessarily the same type of
comparison as may be used for comparing two log likelihood metrics. Secondly, for
block code trellises the output data is always different, thus the comparison always
fails and can be eliminated. Therefore the additional numbers of comparisons and
subtractions are given by
subtractions is given by
8
>
>
Nc >
X <0 if N t =B
N= =
t
= >
>
>
:N t log2 Nb
t 1
1
otherwise
8 (6.7)
>
X
>
Nc >
<0 if N t =B
=
t
>
> Bt
=
t 1>
:N t log2 1 otherwise
Nt
8
>
>
Nc >
X <0 if N t =B
=
t
N (6.8)
>
t =1 >
>
:N t otherwise
The total complexity of decoding a trellis using SOVA is therefore given by combining
the VA complexity with (6.7) and (6.8):
    N_+^{SOVA} = \sum_{j=1}^{N_c} B_j (L_j - 1) + \sum_{t=t_s}^{N_c} B_t    (6.9)

    N_=^{SOVA} = \sum_{t=1}^{N_c} \begin{cases} 0 & \text{if } N_t = B_t \\ B_t + N_t (\lceil \log_2 (B_t/N_t) \rceil - 2) & \text{otherwise} \end{cases}    (6.10)

    N_-^{SOVA} = \sum_{t=1}^{N_c} \begin{cases} 0 & \text{if } N_t = B_t \\ N_t & \text{otherwise} \end{cases}    (6.11)
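The additional SOVA terms (6.7) and (6.8) can be computed in the same mechanical way, as this companion sketch to the VA one above illustrates; again the names are illustrative.

    #include <cmath>
    #include <cstddef>
    #include <vector>

    struct SovaExtraOps { long comparisons = 0, subtractions = 0; };

    // Additional SOVA work over the VA, following (6.7) and (6.8).
    SovaExtraOps sovaExtra(const std::vector<long>& B, const std::vector<long>& N) {
        SovaExtraOps e;
        for (std::size_t t = 0; t < B.size(); ++t) {
            if (N[t] == B[t]) continue;                 // no decision made here
            long nb = B[t] / N[t];                      // Nb = Bt / Nt, (6.2)
            long extra = static_cast<long>(std::ceil(std::log2(double(nb)))) - 1;
            e.comparisons += N[t] * extra;              // next-best search, (6.7)
            e.subtractions += N[t];                     // metric differences, (6.8)
        }
        return e;
    }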
Example 6.1 The RS(7; 3; 5) trellis decoding complexities using the VA and SOVA are
calculated. The trellis (Figure 4.5) is decoded using integer LL metrics. Properties of the
trellis are as below:

    N_c = 3    (6.12)
    L(t) = [2, 3, 2]    (6.15)
    t_s = 2    (6.16)

Using the operation costs given in Table 6.1, the relative complexities are shown
in Table 6.2.
Table 6.2: Comparison of VA and SOVA decoding complexity for RS(7; 3; 5).
Example 6.2 Similar to Example 6.1, calculate the VA and SOVA decoding complex-
ity for the RS(7; 5; 3) trellis (Figure 4.2). Properties of the trellis are as below:

    N_c = 7    (6.17)
    L(t) = [1, 1, 1, 1, 1, 1, 1]    (6.20)
    t_s = 2    (6.21)
Table 6.3: Comparison of VA and SOVA decoding complexity for RS(7; 5; 3).
6.3 A Comparison of Algebraic Decoders

The HDMLD bound is given by (2.32). The probability of bit error can be converted from the probability of sym-
bol error over an M-ary orthogonal signal set by (adapted from [Sklar, 1988]):

    P_b = \frac{M}{2(M - 1)} P_s    (6.22)
Figure 6.2 shows the algebraic decoding performance for the RS(7; 5; 3) code over a coherently-demodulated BPSK channel. The BM BER
is about half the bound. (The bound assumes that an incorrect decoding will result
in all bits being erroneous, whereas on average half are correct.) The Euclidean decoder
differs only in its behaviour in the case of decoder failures; its performance is slightly worse but still within the
HDMLD bound.
The decoder complexity was measured by using the C++ Galois field class to count the
number of +, -, x, / and = (compare) operations. The decoders were presented with
the same t + 1 sets of codewords, where each set contained 0, 1, ..., t errors.
Figure 6.2: Algebraic decoding performance for RS(7; 5; 3) over a coherently-demodulated BPSK channel.
The average decoding complexity for one codeword is shown for RS(7; 3; 5) (Table 6.4),
RS(7; 5; 3) (Table 6.5), RS(63; 55; 9) (Table 6.6) and RS(255; 223; 33) (Table 6.7).4
For the case of no errors the three decoders have identical complexity,
since they are all syndrome-based algorithms where the first step is the syndrome
calculation, for which they share a common method (3.1). On finding no errors no
further processing is required.
It can be seen that HSSBS decoding is the least efficient of the three algebraic de-
coders implemented. For codes over small alphabets (Tables 6.4 and 6.5) the perfor-
mance is tolerable. With the chosen complexity criteria HSSBS is particularly heavily
penalised for its high use of multiplications. With a different choice of basis for the
GF arithmetic, one in which multiplication could be implemented more
simply than by a table look-up, its complexity performance would improve. Whatever
basis is chosen, the implementation complexities of addition and subtraction are likely
to be the same.5 However, combining the number of additions with subtractions, and multiplications with
divisions, reveals that HSSBS will always be more complex than either Euclidean or
Berlekamp-Massey decoding.
For codes over large alphabets the situation worsens dramatically (see Table 6.6):
there is an exponentially increasing number of possible error values for a trial-and-
error method to search. It was not possible to include complexity results for HSSBS
in Table 6.7. Though HSSBS is an improvement over the original step-by-step algorithm
[Massey, 1965], it is not well suited to decoding multi-level codes over large alphabets.
4. In the tables the values are printed with limited precision, but the complexity is based upon the full
numerical precision.
5. Exactly equal for fields of characteristic 2.
    errors     +       -       x       /       =     total complexity   bit complexity
    0        24.53    0.00   21.03    0.00    0.00       129.69            14.41
    1        33.21    9.69   28.66    5.00    8.41       236.44            26.27
    2        58.87   21.46   49.12   12.21   25.66       463.91            51.55

    (a) Berlekamp-Massey.

    errors     +       -       x       /       =     total complexity   bit complexity
    0        24.53    0.00   21.03    0.00    0.01       129.72            14.41
    1        40.52   23.00   48.13    8.00   14.89       388.84            43.20
    2        72.21   42.87   73.91   16.14   51.03       718.42            79.82

    (b) Euclidean.

    errors     +       -       x       /       =     total complexity   bit complexity
    0        24.53    0.00   21.03    0.00    0.01       129.72            14.41
    1        50.44   19.60   90.64    0.00   15.11       568.57            63.17
    2        92.14   47.32  194.38    0.00   51.32      1265.33           140.59

    (c) High-speed step-by-step.

    Table 6.4: Average decoding complexity for one codeword of RS(7; 3; 5).
    errors     +       -       x       /       =     total complexity   bit complexity
    0        12.22    0.00   10.48    0.00    0.00        64.61             4.31
    1        18.84    7.73   16.10    5.00    8.47       157.47            10.50

    (a) Berlekamp-Massey.

    errors     +       -       x       /       =     total complexity   bit complexity
    0        12.22    0.00   10.48    0.00    0.00        64.61             4.31
    1        22.11   15.00   25.51    8.00    8.88       231.28            15.42

    (b) Euclidean.

    errors     +       -       x       /       =     total complexity   bit complexity
    0        12.22    0.00   10.48    0.00    0.00        64.61             4.31
    1        41.28   15.21   69.34    0.00   31.05       496.34            33.09

    (c) High-speed step-by-step.

    Table 6.5: Average decoding complexity for one codeword of RS(7; 5; 3).
    errors     +       -       x       /       =     total complexity   bit complexity
    0         498      0      490      0       6         2962              8.98
    1         506     14      498      5      13         3072              9.31
    2         609     26      572     13      62         3743             11.34
    3         685     45      634     24     142         4445             13.47
    4         866     70      774     38     287         5855             17.74

    (a) Berlekamp-Massey.

    errors     +       -       x       /       =     total complexity   bit complexity
    0         498      0      490      0       6         2962              8.98
    1         522     39      537      8      32         3385             10.26
    2         653     87      639     17     132         4414             13.38
    3         759    132      732     27     292         5562             16.85
    4         961    173      893     38     536         7393             22.40

    (b) Euclidean.

    errors     +       -       x       /       =     total complexity   bit complexity
    0         498      0      490      0       6         2962              8.98
    1        2912   4204    11474      0     276        65314            197.92
    2        4467   6895    18513      0     719       106084            321.47
    3        6219   9929    26449      0    1358       152470            462.03
    4       20149  34070    89589      0    3546       512802           1553.95

    (c) High-speed step-by-step.

    Table 6.6: Average decoding complexity for one codeword of RS(63; 55; 9).
    errors     +       -       x       /       =     total complexity   bit complexity
    0        8115      0     8086      0      50        48696             27.30
    1        8159     38     8126      5      58        49022             27.48
    2        8475     51     8354     13     166        50858             28.51
    3        8875     70     8653     24     394        53508             29.99
    4        9417     95     9074     38     766        57368             32.16
    5       10303    126     9782     55    1344        63647             35.68
    6       11096    163    10441     75    2088        70103             39.30
    7       12260    205    11425     98    3049        79227             44.41
    8       12912    254    12005    124    4127        86189             48.31
    9       13740    309    12728    153    5358        94526             52.99
    10      13877    369    12879    184    6626        99441             55.74
    11      16338    435    15012    218    8279       117763             66.01
    12      17019    509    15657    257   10033       127199             71.30
    13      18689    585    17150    296   12029       142591             79.93
    14      19479    670    17872    341   14164       153709             86.16
    15      21750    760    19934    387   16587       173877             97.46
    16      23987    855    21966    435   19294       194727            109.15

    (a) Berlekamp-Massey.

    errors     +       -       x       /       =     total complexity   bit complexity
    0        8115      0     8086      0      50        48696             27.30
    1        8222    135     8285      8     149        50265             28.18
    2        8687    328     8663     17     524        53985             30.26
    3        9234    492     9110     27    1190        58979             33.06
    4        9920    655     9674     38    2165        65632             36.79
    5       10937    810    10514     50    3496        75054             42.07
    6       11857    966    11302     63    5139        85064             47.68
    7       13148   1126    12413     77    7143        98154             55.02
    8       13915   1279    13108     92    9393       109375             61.31
    9       14851   1427    13940    108   11915       122260             68.53
    10      15097   1580    14200    125   14594       132083             74.04
    11      17656   1722    16429    143   17762       155523             87.18
    12      18403   1850    17144    161   21104       170090             95.34
    13      20184   2011    18747    181   24804       191251            107.20
    14      21064   2157    19559    203   28736       208238            116.73
    15      23401   2291    21689    224   33021       234320            131.35
    16      25699   2417    23781    247   37649       261204            146.41

    (b) Euclidean.

    Table 6.7: Average decoding complexity for one codeword of RS(255; 223; 33).
For these reasons further work on HSSBS was not pursued.
The average decoding complexity is dependent upon the number of errors,
which in turn is dependent on the modulation scheme, the type of noise and the ra-
tio E_b/N_0. Figure 6.3 compares the decoding complexities as a function of E_b/N_0 for
AWGN. At high values of E_b/N_0, where received symbol errors are uncommon, the
complexity is almost constant and is dominated by the cost of the syndrome calcula-
tion.
6.4 Two-Stage Decoding

6.4.1 Decoding Performance

The decoding performance of TSD has been evaluated by computer simulation for
RS(7; 3; 5) and RS(7; 5; 3) codes. For each case the performance of both HD and
SD subtrellis prediction was measured. The simulated modulation scheme was non-
coherently demodulated 8FSK over an AWGN channel, unlike the other simulations
in this chapter, which used coherently-demodulated BPSK.
Figure 6.4 shows the performance of RS(7; 3; 5) with HD choice of subtrellis. The
performance approaches that of maximum-likelihood decoding, but with a complexity lower than SDMLD. To obtain an appreciable coding gain the
number of subtrellises decoded cannot be greatly reduced. (It should be noted however that a less noisy channel is required for HDMLD
to give any coding gain.)
Figure 6.3: Algebraic decoding complexity for RS(7; 5; 3) over a coherently-demodulated BPSK channel.
Figure 6.5 shows the performance when SD information
is incorporated into the subtrellis selection, by using "soft GF" algebra. The error-rate
performance is then much closer to optimum.

Similarly, Figures 6.6 and 6.7 show the performance of RS(7; 5; 3) for HD and
SD choice of subtrellis respectively. For this code subsets of the trellis are selected
by predicting information symbols, and more than q subtrellises are required for good performance. (With q subtrellises there may not
be the opportunity to try a second symbol value for the more confident prediction.)
6.4.2 Decoding Complexity

Two-stage decoding has both algebraic and combinatorial operations. The first stage
is algebraic, and a count of its GF addition and multiplication operations is easily obtained. Since the subtrellis predic-
tors are known at design time, the prediction process can easily be optimised to remove
redundant operations. It is assumed that the trellis will be decoded using the log likelihood values, which may be expressed
as integers. For a coset trellis, such as Figure 4.5, the decoding implementation and the calculation of its
complexity follow the method of Section 6.2.
Although the two-stage decoding simulations did not use LL metrics, the complex-
ity analysis presented below is for the case of LL metrics; this enables the complexity
to be compared directly with that of the other decoders in this chapter.

Table 6.8 shows the complexity to decode the RS(7; 3; 5) coset trellis (Figure 4.5),
for the case of HD subtrellis prediction. Note that selecting the best subtrellis from the
N_st decoded subtrellises requires N_st - 1 comparisons; Table 6.8 includes these additional
comparisons. The bit complexity to decode the full minimal RS(7; 3; 5) trellis
corresponds to the N_st = 8 entry. The RS(7; 5; 3) trellis is more difficult to analyse because the trellis does not contain independent subtrellises. Prior
knowledge (or prediction) of information symbols can, however, be used to limit de-
coding to only a subset of the trellis. The trellis vertices at depth 3 can be thought of
as storing the values of u_2 and u_3. From the generator matrix (4.22) it can be seen that

    v_4 = u_2 \alpha^4 + u_3 \alpha^3 + u_4    (6.23)

The new information, responsible for the trellis branching, is u_4, while the information
from past subcodes (u_2 and u_3) was effectively stored in the vertex number. When the
    N_st    +      -    x    /     =      stage 2       TSD      TSD
                                          complexity    total    per bit
    1      272     0    0    0    63         461         848      94.22
    2      544     0    0    0   127         925        1312     145.78
    3      816     0    0    0   191        1389        1776     197.33
    4     1088     0    0    0   255        1853        2240     248.89
    5     1360     0    0    0   319        2317        2704     300.44
    6     1632     0    0    0   383        2781        3168     352.00
    7     1904     0    0    0   447        3245        3632     403.56
    8     2176     0    0    0   511        3709        4096     455.11

    Table 6.8: Complexity versus number of subtrellises decoded for TSD of RS(7; 3; 5).
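The entries of Table 6.8 can be reproduced mechanically. In the sketch below the per-subtrellis counts (272 additions and 63 comparisons) and the fixed first-stage prediction cost (387) are inferred from the table itself rather than taken from the original decoder, and the integer operation costs are those of Table 6.1.

    #include <cstdio>

    int main() {
        const long addsPerSub = 272, cmpsPerSub = 63;  // per subtrellis (assumed)
        const long stage1 = 387;          // HD prediction cost, inferred from Table 6.8
        const double bitsPerCodeword = 9; // k = 3 GF(8) symbols = 9 bits
        for (int nst = 1; nst <= 8; ++nst) {
            long cmps = cmpsPerSub * nst + (nst - 1);      // + best-subtrellis selection
            long stage2 = addsPerSub * nst * 1 + cmps * 3; // integer costs: + = 1, cmp = 3
            long total = stage1 + stage2;
            std::printf("Nst=%d  stage2=%ld  total=%ld  per bit=%.2f\n",
                        nst, stage2, total, total / bitsPerCodeword);
        }
        return 0;
    }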
values for u_2 and u_3 are known (or can be predicted) only those paths which pass
through the relevant vertex at depth 3 need be decoded. This technique is similar
to others in the literature, but with the aim of reducing decoder complexity instead of increasing the error-correction
capability.

Figure 6.8 shows the possible trellis paths for the case u_2 = 0 and u_3 = 0, while the
possible paths for the case u_2 = 1 and u_3 = 2 are shown in Figure 6.9. Some branches
are common whatever values of u_2 and u_3 are selected. This is true for branches
at depths 1, 6 and 7, which require decoding only once. Figure 6.10 highlights the
common branches. The subtrellis prediction dominates the first-stage
complexity: the numbers of GF additions required for the prediction of u_2 and u_3 are 36
and 40 respectively, while the numbers of multiplications are 39 and 44 respectively.
Table 6.9 shows the decoding complexity using the worst-case analysis for selected
values of N_st.
    N_st    +      -    x    /      =        stage 2       TSD       TSD
                                             complexity    total     per bit
    1       97     0    0    0    28.88       183.62       674.62     44.98
    2      122     0    0    0    51.75       277.25       768.25     51.22
    8      272     0    0    0   189.00       839.00      1330.00     88.67
    16     472     0    0    0   372.00      1588.00      2079.00    138.60
    24     672     0    0    0   555.00      2337.00      2828.00    188.53
    32     872     0    0    0   738.00      3086.00      3577.00    238.47
    44    1172     0    0    0  1012.50      4209.50      4700.50    313.37
    56    1472     0    0    0  1287.00      5333.00      5824.00    388.27
    64    1672     0    0    0  1470.00      6082.00      6573.00    438.20

    Table 6.9: Complexity versus number of subtrellises decoded for TSD of RS(7; 5; 3).
Figure 6.4: Two-stage decoding of RS(7; 3; 5), with HD choice of subtrellis, over 8-FSK channel.
Figure 6.5: Two-stage decoding of RS(7; 3; 5), with SD choice of subtrellis, over 8-FSK channel.
Figure 6.6: Two-stage decoding of RS(7; 5; 3), with HD choice of subtrellis, over 8-FSK channel.
Figure 6.7: Two-stage decoding of RS(7; 5; 3), with SD choice of subtrellis, over 8-FSK channel.
Figure 6.10: Subsets of RS(7; 5; 3) trellis for u_2 = {0, 1}, u_3 = {0, 2} (branches in trellis 1 only, in trellis 2 only, and common branches).
6.5 SOVA Applied to the Meteosat II Satellite System

6.5.1 Introduction
The European Space Agency (ESA) is preparing a second generation of satellites,
Meteosat II, which will be used for meteorological purposes. The satellites will
retransmit processed data to paying end users. In its simplest form the Meteosat II-Earth retransmission channel
can be viewed as a concatenated coding system, with an RS(255; 223; 33) outer code
and a (2; 1; 7) convolutional inner code, as shown in Figure 2.3. An interleaver of depth
d = 4 is used between the inner and outer codes. The proposed system also includes en-
cryption, compression, randomisation and synchronisation; none of these affect the
performance of the error control coding and can be ignored. This concatenated code
follows the CCSDS recommendation.
Potential users are spread over a wide geographical area, and users at the edges of the
defined service area are those most likely to have difficulty in reception. There is
of course a trade-off between the cost of increasing the transmitter power, increasing
the complexity of the users' receiving equipment, and the defined geographical limits
within which satisfactory reception can be expected. If a means can be found to improve
the channel error-rate for users in marginal locations, greater commercial benefits exist,
either by enlarging the service area or by reducing the transmitter power. An investigation
was therefore made into the benefits of applying SOVA to improve the system performance.
Replacing the inner Viterbi decoder with a soft-output Viterbi decoder allowed SD
decoding of the outer (RS) code. The original system is denoted by SD-HD and the
modified system by SD-SD.

The performance of SOVA in the Meteosat II-Earth high/low rate user station link
was evaluated by computer simulation. The simulated system is shown
in Figure 2.3 and the specifications are given in Table 6.10. The differences between
the proposed and simulated systems are described below.
    Convolutional code:    n                         2 bits
                           k                         1 bit
                           K                         7
                           generator polynomials     g1(x) = 1 + x + x^2 + x^3 + x^6
                                                     g2(x) = 1 + x^2 + x^3 + x^5 + x^6
                           trellis length            28, 35 or 42

    RS code:               n                         7 symbols
                           k                         5 symbols
                           d                         3 symbols
                           g(x)                      1 + \alpha^4 x + \alpha^3 x^2
                           GF size                   8
                           GF primitive polynomial   1 + x + x^3
                           symbol width              3 bits

    Block interleaver:     depth                     4
                           width                     7 RS symbols (21 bits)

    Modulation/            type                      BPSK
    demodulation:          channel                   AWGN
                           demodulation              coherent
                           quantisation resolution   3 bits

    Table 6.10: Specifications of the simulated Meteosat II system.
Modulation/Demodulation
The simulated system modelled the transmission of equally-likely random data over a
BPSK channel with AWGN and coherent demodulation. The system proposed by ESA
is switchable between BPSK and QPSK modulation. However, BPSK and QPSK
have identical BER performance [Sklar, 1988, p. 172], so only BPSK modulation was
simulated.
Synchronisation
The simulated system assumed synchronisation. This assumption can be made since
without synchronisation no error control coding can be applied. Whilst the error con-
trol decoders are important for ensuring and maintaining synchronisation, modelling
its acquisition was not required for the comparisons made here.
Data randomisation
Data randomisation is recommended for synchronisation-related reasons [Dai, 1995, p. 3.2-18].
It was not included in the simulated system since synchronisation was assumed.
Convolutional code

The convolutional code used for the computer simulation was the (2; 1; 7) code spec-
ified by the CCSDS. The decoder used was either a SOVA decoder or a conventional
VA trellis decoder, both operated in the trace-back implementation. Decoding was per-
formed over trellises of depths 28 (4K), 35 (5K) and 42 (6K). A single section of the
trellis is shown in Figure 6.11. The quantisation resolution modelled was 8 levels, the
maximum resolution given by the frame synchroniser output [Dai, 1995, p. 3.2-13].
Greater resolution would provide little extra performance: simulation studies [Heller
and Jacobs, 1971] have shown that 8-level quantisation results in only a 0.25 dB reduction
in coding gain relative to unquantised decoding.
RS code

To measure the coding gain SOVA can produce it is necessary that the RS decoder used
is capable of SDMLD. Failing to use such a decoder would not produce an independent
measure of the coding gain possible by introducing SOVA, but instead a combination
of the gain from SOVA and the loss from the sub-optimal decoder.

For the simulation the RS(255; 223; 33) code over GF(256) was replaced by the
RS(7; 5; 3) code over GF(8). This code was chosen because it is of a similar rate to
RS(255; 223; 33) and readily decoded with the VA over the trellis shown in Figure 4.2.
The trellis was labelled with the binary mapping of the RS symbols (using the polyno-
mial representation), thus avoiding the need to map symbols and symbol reliabilities
from binary to GF(8). The RS Viterbi decoder operated in trace-back mode. For HD
decoding of the RS code a Berlekamp-Massey decoder was used. The width
of the interleaver was reduced from 255 8-bit symbols to 7 3-bit symbols to conform
with the change of outer (RS) code. The interleaver was arranged to operate on the
RS symbols, not on individual bits, and thereby preserved the burst-error correction
properties of the concatenated system.
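A sketch of the symbol-oriented block interleaver described above is given below. The row-in/column-out arrangement is the conventional one and is an assumption, as the text does not specify the read/write order.

    #include <array>
    #include <cstdint>

    constexpr int kDepth = 4;  // interleaver depth (Table 6.10)
    constexpr int kWidth = 7;  // one RS(7; 5; 3) codeword per row

    using Symbol = uint8_t;    // one GF(8) symbol (3 bits)
    using Block  = std::array<Symbol, kDepth * kWidth>;

    // Write codewords in by rows, read channel symbols out by columns, so a
    // burst of up to kDepth consecutive channel symbols lands in distinct
    // RS codewords.
    Block interleave(const Block& in) {
        Block out{};
        for (int r = 0; r < kDepth; ++r)
            for (int c = 0; c < kWidth; ++c)
                out[c * kDepth + r] = in[r * kWidth + c];
        return out;
    }

    Block deinterleave(const Block& in) {
        Block out{};
        for (int r = 0; r < kDepth; ++r)
            for (int c = 0; c < kWidth; ++c)
                out[r * kWidth + c] = in[c * kDepth + r];
        return out;
    }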
Results

Results from the simulated system are presented over the E_b/N_0 range 0–7 dB. The
CCSDS coding standard does not specify the path storage to be used for a Viterbi de-
coder (nor even the decoding method to use!); it is assumed that the trellis length will
be in the range 4K to 6K [Heller and Jacobs, 1971]. Results are given for convolutional
trellis depths of 28, 35 and 42. Simulation time limited the measurability above 4.5 dB,
requiring some results to be extrapolated; on the graphs, dotted
lines indicate results obtained from extrapolated data. The uncoded curve was calcu-
lated by theoretical means, and is in agreement with the simulated uncoded curve (not
shown). The probability of bit error for uncoded data transmitted with BPSK over an
AWGN channel is
    P_b = Q\left(\sqrt{2E_b/N_0}\right)    (6.24)
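Equation (6.24) is easily evaluated with the complementary error function, since Q(x) = erfc(x / sqrt(2)) / 2; a minimal sketch follows.

    #include <cmath>

    double qFunction(double x) { return 0.5 * std::erfc(x / std::sqrt(2.0)); }

    // Pb for uncoded coherent BPSK on AWGN, with Eb/N0 given in dB.
    double uncodedBpskBer(double ebN0dB) {
        double ebN0 = std::pow(10.0, ebN0dB / 10.0);
        return qFunction(std::sqrt(2.0 * ebN0));
    }
    // e.g. uncodedBpskBer(9.6) is roughly 1e-5.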
Figure 6.12 shows the performance of both SOVA and VA decoding for a con-
volutional trellis of depth 28. At a BER of 10^-5 SOVA decoding provides a coding
gain increase of approximately 0.9 dB. The "break-point" at which coding becomes
beneficial is also lowered. For a trellis of depth 35 (Figure 6.13)
SOVA displays a coding gain over the VA of 1.1 dB, while the break-point is reduced
further. For a trellis of depth 42 (Figure 6.14)
the difference in coding gain has risen to 1.4 dB at a BER of 10^-5, and the break-point is
lower still.
The total system complexity was calculated from the sum of the complexities of each
of the component decoders. Only operations directly associated with decoding were
included; additional work, such as that performed by the de-interleaver, was termed
overheads because the exact complexity is very dependent on the hardware or
software implementation.
Figure 6.12: Simulation results for RS(7; 5; 3) and (2; 1; 7) over a trellis of depth 28.
Figure 6.13: Simulation results for RS(7; 5; 3) and (2; 1; 7) over a trellis of depth 35.
Figure 6.14: Simulation results for RS(7; 5; 3) and (2; 1; 7) over a trellis of depth 42.
The decoding complexity for the inner (2; 1; 7) convolutional code was calculated
using the procedure given in Section 6.2. Table 6.12 details the VA complexity for
trellises truncated to depths 28, 35 and 42. Similarly, the decoding complexity using
SOVA is shown in Table 6.13. Each decoding operation produces just one bit
of information.
The complexity for VA and SOVA decoding of the outer RS(7; 5; 3) code was calculated
in Example 6.2. A BM decoder was used for HD decoding of RS(7; 5; 3), for which
the complexity is given in Figure 6.3; the complexity of the BM decoder is dependent
upon the channel error rate.
The complexity of decoding the outer RS(7; 5; 3) code with the VA can be obtained
from Table 6.3, while the complexity for the original BM decoder can be calculated
from Figures 6.2 and 6.3. At a BER of 10^-4 the bit complexity of the BM decoder
is 4.52. Each outer decoding operation results in 15 binary bits of data. On average,
n_2/k_2 = 7/5 inner decoding operations are required for every (binary) data bit output by the
concatenated system. The total complexity, for both the original system (SD-HD) and
the modified system (SD-SD), can then be calculated.
    trellis length     +      -    x    /     =      total    bit complexity
    28               3808     0    0    0   1792     9184         9184
    35               4760     0    0    0   2240    11480        11480
    42               5712     0    0    0   2688    13776        13776

    Table 6.12: Complexity for VA decoding of the convolutional (2; 1; 7) code.
    trellis length     +      -     x    /     =      total    bit complexity
    28               3808   1792    0    0   1792    10976        10976
    35               4760   2240    0    0   2240    13720        13720
    42               5712   2688    0    0   2688    16464        16464

    Table 6.13: Complexity for SOVA decoding of the convolutional (2; 1; 7) code.
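The bookkeeping for the total system complexity can be summarised as below. The additive combination (7/5 inner decodings per data bit plus the outer decoder's bit complexity) and the example figures are taken from the surrounding discussion; the function name is an illustrative assumption.

    // Combines inner and outer decoder costs per (binary) data bit.
    double systemBitComplexity(double innerBitComplexity,
                               double outerBitComplexity) {
        const double innerDecodesPerDataBit = 7.0 / 5.0;  // n2 / k2
        return innerDecodesPerDataBit * innerBitComplexity + outerBitComplexity;
    }
    // e.g. the original SD-HD system with a depth-28 trellis and the BM
    // figure quoted above: systemBitComplexity(9184, 4.52), roughly 12862
    // operations per data bit.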
6.6 RS Product Code Decoding

The simulated channel was a binary phase-shift keying channel with additive white Gaussian noise, transmitting
equally-likely data. At the demodulator the soft outputs were quantised to 8 levels.
The product code decoders operate on RS symbols, not binary bits. Unlike the SOVA system described in Section 6.5 it was
therefore not possible to use a binary-mapped RS trellis. However, the same binary
mapping and combination of LL metrics which would have been performed by the
trellis was instead performed at the output of the demodulator. This mapping is shown
in Table 6.15.
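Although Table 6.15 itself is not reproduced here, the mapping it describes, combining per-bit LL metrics into per-symbol LL metrics for GF(8), can be sketched as follows. The bit ordering within a symbol is an assumption.

    #include <array>

    // bitLL[i][b] = log likelihood that transmitted bit i of the symbol is b.
    using BitLL = std::array<std::array<double, 2>, 3>;

    // Sum the three per-bit LL metrics for each of the q = 8 candidate
    // GF(8) symbol values.
    std::array<double, 8> symbolLL(const BitLL& bitLL) {
        std::array<double, 8> ll{};
        for (int s = 0; s < 8; ++s)
            for (int i = 0; i < 3; ++i)
                ll[s] += bitLL[i][(s >> i) & 1];  // bit i of symbol value s
        return ll;
    }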
The RS decoders were presented with symbols over GF(q); for each symbol there
were q LL metrics. Other than for the implementation changes described above, this
channel is identical to that used by the SOVA system (Section 6.5.2), and comparisons
between the two systems will be made later. Likewise, the simulated system as-
sumed synchronisation.
Decoding Performance

Four different variations on the cascade decoder were implemented. The first and most
basic, cascade HD-HD, used HD decoding of both rows and columns; cascade SD-HD
introduced SD (trellis) decoding of the rows.
By introducing advanced techniques such as SOVA (Section 5.2) the column de-
coding stage may also use SD information and reap the benefits by reducing the BER
further still. Two variations of the cascade SD-SD decoder were implemented. In the
first (cascade SD-SDa), the column decoding used only the soft output from SOVA
(the extrinsic information), whilst the second (cascade SD-SDb) used the sum of the SOVA metric and
the channel state information (i.e., extrinsic + channel state information). As cascade
SD-SDa does not make use of all the information available it performed worse than
cascade SD-SDb, but it is shown in Figure 6.15 to indicate the relative gains due to
each source of information.
By taking advantage of the extrinsic information available from SOVA and the
channel state information, the cascade SD-SDb decoder is able to perform best of all
the cascade decoders implemented. It has a coding gain of > 1 dB over cascade SD-HD
at a BER of 10^-6 (Figure 6.15), and a larger gain still with respect to cascade HD-HD.
Decoding Complexity

The cascade decoding complexity was measured by combining the decoding com-
plexities of the BM, VA and SOVA decoders. The complexity values are quoted for
a fixed E_b/N_0, since the BM decoding complexity is a function of the input E_b/N_0 ratio. BM decoding was chosen because it is the least com-
plex HD decoder implemented in this work. Table 6.16 details the complexity for the
cascade decoders described above. Only operations performed by the row or column
decoders are included in Table 6.16; other operations are deemed overheads and are
not included. The complexities of SD-SDa and SD-SDb are almost identical: SD-SDb
is slightly more complex because it requires extra additions to sum the channel state
information with the SOVA metrics.
Figure 6.15: Decoding performance of the cascade decoders (HD-HD, SD-HD, SD-SDa and SD-SDb) for the RS product code.
Decoding Performance

Two variants of the alternating row-column decoder (Section 5.3.7) were implement-
ed and evaluated: firstly with just an initial sort of the row and column codeword
confidences (i.e., no re-sort), and secondly with the row/column codeword confidences re-
calculated and re-sorted after each row/column decoding. Both variants used SOVA,
not the successive erasures decoding used in [Bate et al., 1986]. In Figure 6.16 the
ARC decoders are compared with the 'standard' cascade SD-HD decoder. It can be seen that the ARC decoders provide an extra
0.8 dB coding gain relative to the standard cascade SD-HD decoder. Interestingly, no
difference was found between the two variants, suggesting
that recalculation and re-sorting of the row and column codeword confidences is not
necessary.
Decoding Complexity

The decoding complexity may be computed by considering the average number of row
and column decodings required. For the RS(7; 5; 3)
row/column codes the bit complexity is 1168.00, not significantly different from that
of the cascade SD-SD decoders.
Figure 6.16: Decoding performance of the alternating row-column (ARC) decoders compared with cascade SD-HD.
In Figure 6.17 the decoding performance of ARC decoding is compared for the cases
of one and four iterations. Again, no benefit is found from re-sorting the row and
column codeword confidences after each decoding. It can be seen that the effect of
four iterations is to provide an extra 1.1 dB coding gain. The decoding complexity
is simply i times greater, i.e., for the decoder in Figure 6.17 with 4 iterations the bit
complexity is 4672.
Figure 6.17: Iterative alternating row-column decoding performance for one and four iterations.
Chapter 7
While McEliece has also introduced a similar method [McEliece, 1996], the re-
sults of my work were first published in 1995 (initially for Euclidean metrics).
Subsequently the work was extended to LL metrics and presented in the form
given in this Thesis. Furthermore, Section 6.2 details optimisations not given in [McEliece, 1996].

The computer simulation of the new decoding algorithms measured both de-
coding performance and decoding complexity.
7.2 Decoding Complexity

In Section 6.1.2 a technique was introduced to measure and compare decoding com-
plexity. When no errors were present the difference in complexity between
the Berlekamp-Massey and Euclidean decoders was small; when correcting errors the
Berlekamp-Massey decoder was the less complex, and [1994, p. 225] and [MacWilliams and Sloane, 1978, p. 369] cite the fact that the Ber-
lekamp-Massey algorithm is the more efficient.

The measurement of trellis decoding complexity, and the method
for its computation (Section 6.2), has been a very successful approach for compar-
ing trellis and algebraic decoding techniques. In particular, without such a method it
would not have been possible to ascertain what complexity benefits actually existed
in two-stage decoding, nor to make the comparisons between the Meteosat II system
and the product code algorithms, as they are considerably different in their approach.
A similar complexity measure has been published by McEliece [McEliece,
1996]. However, his published results apply to the BCJR trellis only, while the method
described in Section 6.2 is applicable to any linear code trellis. With trivial modifica-
tion (as suggested) the work in this Thesis can also be applied to rectangular non-linear
codes. McEliece also applies equal weighting to the operations "addition" and "taking
the minimum", whereas it was shown (Table 6.1) that such operations are not neces-
sarily of equal cost.
It would appear from the literature that no work has been carried out on the subject
of reducing trellis (decoding) complexity by using the concept of shared branch la-
bels. McEliece [1996] notes:

    It can happen that different codewords will produce common edges, i.e.,
    edges with the same values of init(e), fin(e) and lambda(e).1 Such "shared"
    edges are only counted once in the trellis.

It is this sharing of edges that allows any code (be it linear or non-linear2) to produce
the optimum trellis; it is not the same as the
label sharing described on p. 153, where the start and/or end nodes differ. It was
stated that codes such as RM and single error-correcting Hamming codes could share
labels (which is apparent by inspection of their trellises) but that it is not possible for
RS codes. No proof was offered for either statement, and this is an area where further
work is required.

1. McEliece uses the notation init(e), fin(e) and lambda(e) to refer to the start state, end state and labelling,
respectively, of a trellis branch (edge).
2. Also termed rectangularity.
7.3 High-Speed Step-by-Step Decoding

Example 3.5 highlights the tortuous route by which HSSBS decoding sometimes op-
erates, particularly when compared with the efficiency with which both the Euclidean and
Berlekamp-Massey algorithms perform the same task (Examples 3.3 and 3.4 respec-
tively). HSSBS decoding is very inefficient at correcting the t-th error because of its
trial-and-error search; many more decoding steps
would have been required to correct all errors (instead of 52). For the sake of complete-
ness, further study into the trade-off between decoding all symbols and decoding only
those in error may be worthwhile, but HSSBS is
not (and probably never will be) an efficient algorithm for decoding multi-level codes
over large alphabets.
7.4 Two-Stage Decoding

The premise of two-stage decoding was that the decoding complexity could be reduced
from the product of the two stages' complexities to their sum, with only minor loss of performance. For the case of RS codes it
was shown that a trellis-based system for the selection of second-stage subtrellises was
not possible. This unfortunately increased the complexity of the first decoding stage,
but complexity savings are still possible (Section 6.4.2). For the case of HD subtrellis
prediction it also reduced the performance. A trellis-based prediction method would seem a more
promising direction.
It is suggested that for two-stage decoding of RS codes future work might be di-
rected towards the use of the max-sum algorithm [Forney, 1997]. The subtrellis pre-
diction can then employ Tanner graphs [Tanner, 1981] as a means of preserving
the soft-decision information. A further refinement would be an adaptive number
of subtrellises, instead of the fixed but adjustable number currently decoded. Such a
scheme would require a termination condition. Aguado and Farrell employed the Fano
metric [Fano, 1963] for their reduced search trellis decoding algorithm, while Shin and
Sweeney used their own reference path metric to discard candidate paths. A third
option for deciding when to stop decoding subtrellises might be the SOVA reliability
metric. Whatever metric were to be employed, the advantages of such a scheme could
be significant.
7.5 Comparison of Decoders for Concatenated Codes

For the case of the modified Meteosat II system it can be seen that SOVA provides
a gain of 1.1–1.8 dB, dependent upon the depth of the trellis decoded for the convo-
lutional code. The commercial benefits of such a coding gain are considerable. For example, the geostationary satellite can transmit
a signal 1.8 dB weaker, which requires less power and thus smaller solar cells and
batteries; the weight saved translates to cheaper launch costs for the satellite. Al-
ternatively, the receiving dish may be reduced in area by 33% (so that a 0.8 m dish
can be used where a 1 m dish was previously required). Another option is to increase
the defined service area and accrue greater revenue from an enlarged number of users.
For the system simulated, the increase in decoding complexity to achieve such a large
gain was very modest indeed, only 22%. An alternative comparison is that SD-SD
decoding over a convolutional trellis of length 28 has approximately the same com-
plexity as SD-HD decoding over a trellis of length 35, but performs better.
The simulated transmission channel for both the Meteosat II satellite system and
the RS product code system was the same, so the results may be compared directly. The best of the Meteosat II and RS product code decoders
are compared in Figure 7.1.

Firstly, it can be seen that the performance increase in applying SOVA to the (mod-
ified) Meteosat II system is larger than for the product code, where the increase is only just over 1 dB. This may be explained by two factors.
The inner code of the Meteosat II system is stronger (but more complex to decode).
Also, the outer RS(7; 5; 3) code of the modified Meteosat II system was allocated a
SOVA reliability metric for each input bit. For the cascade decoder it was necessary to make
the assumptions that the SOVA metric applied equally to all symbols, and that the
discarded values were equally unlikely (p. 137). This led to only 7 SOVA reliability
metrics being available, which were 'recycled' for each of the 5 column decodings.
It is interesting to note that the Meteosat II decoder performs better than the RS
product code, for both the SD-HD and SD-SD cases. Massey's assertion (p. 117) that
convolutional codes should be used as the first stage of decoding has been shown to
be correct for the cases simulated. Such an increase in performance comes at the price
of higher complexity, which must not be forgotten: the Meteosat II SD-SD decoder
shown in Figure 7.1 is 20 times more complex than one iteration of the ARC SD-SD
decoder.
7.6 Iterative Decoding of Product Codes

The brief foray into iterative decoding (Sections 5.3.8 and 6.6.4) has provided some
excellent results. Without any additional work the alternating row-column decoder
was instructed to execute four decodings instead of one. At a BER of 10^-5 the coding
gain increased by > 1.1 dB. While the complexity also increased by a factor of four,
the bit complexity was still less than that of the modified Meteosat II system, even with four
iterations.

The increase in coding gain was less than that found in [Pyndiah, 1998], where
a BCH(64; 51; 6) x BCH(64; 51; 6) code provided an increase of > 1.5 dB for just 4
iterations. There are several reasons for this, not just the change of code or the increase
in code alphabet size. Pyndiah used a weighting factor, alpha(m), to scale the amount of
extrinsic information included by each iteration, m, where

    \alpha(m) = [0.0, 0.2, 0.3, 0.5, 0.7, 0.9, 1.0, 1.0]    (7.1)
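A sketch of how such a weighting schedule might be applied is given below. The variable names and the exact update form (channel value plus weighted extrinsic information) follow Pyndiah's general scheme rather than the decoder of Section 5.3.8.

    #include <cstddef>
    #include <vector>

    // Weighting schedule of (7.1); m counts half-iterations from zero.
    const double kAlpha[8] = {0.0, 0.2, 0.3, 0.5, 0.7, 0.9, 1.0, 1.0};

    // Soft input to the next decoding: channel value plus weighted extrinsic.
    std::vector<double> weightedInput(const std::vector<double>& channel,
                                      const std::vector<double>& extrinsic,
                                      std::size_t m) {
        std::vector<double> soft(channel.size());
        for (std::size_t i = 0; i < channel.size(); ++i)
            soft[i] = channel[i] + kAlpha[m] * extrinsic[i];
        return soft;
    }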
Figure 7.1: Comparison of concatenated decoding algorithms: cascade HD-HD, cascade SD-HD, cascade SD-SDb, ARC (no re-sort, 1 and 4 iterations) and Meteosat II SD-HD/SD-SD (trellis depth = 42).
Such a scheme ensures the extrinsic information is added slowly, helping the
decoding process to converge upon the correct solution. Adding a weighting factor to
the iterative decoding described in Section 5.3.8 would be a trivial extension. It should
also be remembered that the soft-output Viterbi algorithm employed is not entirely
optimum at generating its reliability metric. (However, the selection of the output
codeword is optimum.) There is neither any indication as to the next-best symbol nor
its reliability; nonetheless, SOVA remains a
very useful and efficient decoding algorithm for this purpose. Interesting further work
on this topic would be to simulate the BCH(64; 51; 6) x BCH(64; 51; 6) code used by
Pyndiah with the iterative decoder described in Section 5.3.8. This is at the limit of
what is practical with trellis decoding.
References

L. E. Aguado and P. G. Farrell. On hybrid stack decoding algorithms for block codes.

L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv. Optimal decoding of linear codes for minimizing symbol error rate. IEEE Transactions on Information Theory, 20(2):284–287, March 1974.

S. D. Bate, B. K. Honary, and P. G. Farrell. Soft and hard decision decoding of product codes for communication systems. In Systems Science, volume 12, pages 79–85, 1986.

Y. Berger and Y. Be'ery. Bounds on the trellis size of linear block codes. IEEE Transactions on Information Theory, 1993.

D. A. Chase. A class of algorithms for decoding block codes with channel measurement information. IEEE Transactions on Information Theory, 18(1):170–182, January 1972.

27/10/95.

1954.

P. Elias. Coding for noisy channels. IRE Convention Record, 3(4):37–47, 1955.

R. M. Fano. A heuristic discussion of probabilistic decoding. IEEE Transactions on Information Theory, 9(2):64–74, April 1963.

G. D. Forney, Jr. Coset codes—part II: Binary lattices and related codes. IEEE Transactions on Information Theory, 34(5):1152–1187, September 1988.

G. D. Forney, Jr. and M. D. Trott. The dynamics of group codes: State spaces, trellis diagrams and canonical encoders. IEEE Transactions on Information Theory, 39(5):1491–1513, September 1993.

June 1961.

J. Hagenauer and P. Hoeher. A Viterbi algorithm with soft-decision outputs and its applications. In Proceedings of IEEE GLOBECOM '89, Dallas, TX, November 1989.

Reed-Solomon Codes and Their Applications, pages 242–271. IEEE Press, New York.

J. Hagenauer, E. Offer, and L. Papke. Iterative decoding of binary block and convolutional codes. IEEE Transactions on Information Theory, 42(2):429–445, March 1996.

J. A. Heller and I. W. Jacobs. Viterbi decoding for satellite and space communication. IEEE Transactions on Communication Technology, 19(5):835–848, October 1971.

B. Honary and G. Markarian. New simple encoder and trellis decoder for Golay codes.

B. Honary, G. Markarian, and M. Darnell. Trellis decoding for linear block codes. In P. G. Farrell, editor, Codes and Ciphers. IMA Press, 1995b. ISBN 0905091035.

In Proceedings of the International Symposium on Information Theory and Its Applications, pages 282–285, Victoria, BC, Canada, September 1996.

Techniques and Applications Series, pages 133–147. John Wiley & Sons, New York, London, Sydney, 1997.

B. Honary, G. S. Markarian, and P. G. Farrell. Generalised array codes and their trellis structure.

F. Jelinek. A fast sequential decoding algorithm using a stack. IBM Journal of Research and Development, 13(6):675–685, November 1969.

May 1993a.

T. Kasami, T. Takata, T. Fujiwara, and S. Lin. On the optimum bit orders with respect to the state complexity of trellis diagrams for binary linear codes. IEEE Transactions on Information Theory, 1993.

D. E. Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching. Addison-Wesley, 1973.

S. Lin and D. J. Costello, Jr. Error Control Coding: Fundamentals and Applications. Prentice-Hall, Englewood Cliffs, NJ, 1983.

F. J. MacWilliams and N. J. A. Sloane. The Theory of Error-Correcting Codes. North-Holland, Amsterdam, 1978. ISBN 0-444-85193-3.

J. L. Massey. Step-by-step decoding of the Bose-Chaudhuri-Hocquenghem codes. IEEE Transactions on Information Theory, 11(4):580–585, October 1965.

J. L. Massey. The how and why of channel coding. In Proceedings of the 1984 Zurich Seminar on Digital Communications, 1984.

R. J. McEliece. Finite Fields for Computer Scientists and Engineers. Kluwer Academic Publishers, 1987.

Norway, 1994.

R. J. McEliece. On the BCJR trellis for linear block codes. IEEE Transactions on Information Theory, 42(4):1072–1092, July 1996.

nications. John Wiley & Sons, New York, London, Sydney, 1985. ISBN 0-471-88074-4.

museum/25anniv/hof/moore.htm.

D. J. Muder. Minimal trellises for block codes. IEEE Transactions on Information Theory, 34(5):1049–1053, September 1988.

W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press. nr/book.html.

R. M. Pyndiah. Near-optimum decoding of product codes: Block turbo codes. IEEE Transactions on Communications, 46(8):1003–1010, August 1998.

I. S. Reed and G. Solomon. Polynomial codes over certain finite fields. Journal of the Society for Industrial and Applied Mathematics, 8(2):300–304, June 1960.

1994.

R. Sedgewick. Algorithms. Addison-Wesley, 2nd edition, 1988.

V. Sidorenko, G. Markarian, and B. Honary. Code trellises and the Shannon Product.

V. Sidorenko, G. Markarian, and B. Honary. Minimal trellis design for linear codes.

B. Sklar. Digital Communications: Fundamentals and Applications. Prentice-Hall, Englewood Cliffs, NJ, 1988. ISBN 0-13-212713-X.

D. Slepian. Some further theory of group codes. Bell System Technical Journal, 39:1219–1252, 1960.

Y. Sugiyama, M. Kasahara, S. Hirasawa, and T. Namekawa. A method for solving key equation for decoding Goppa codes. Information and Control, 27(1):87–99, January 1975.

R. M. Tanner. A recursive approach to low complexity codes. IEEE Transactions on Information Theory, 27(5):533–547, September 1981.

A. Vardy and Y. Be'ery. Bit level soft decision decoding of Reed-Solomon codes. IEEE Transactions on Communications, 1991.

April 1967.

1971.

S. Wicker. Error Control Systems for Digital Communication and Storage. Prentice-Hall, Englewood Cliffs, NJ, 1995.

J. K. Wolf. On codes derivable from the tensor product of check matrices. IEEE Transactions on Information Theory, 1965.

J. Wu, S. Lin, T. Kasami, T. Fujiwara, and T. Takata. An upper bound on the effective error coefficient of two-stage decoding. IEEE Transactions on Communications, February/March/April 1994.

August 1993.
The generosity of Leslie Lamport in making LaTeX 2e freely available, and also of Donald
Knuth for TeX, is acknowledged. The success of TeX and LaTeX 2e is due to the many
volunteers in the TeX community who have contributed numerous support packages
for free.