Error Control Coding
Symbol   Code word
s1       0
s2       10
s3       110
s4       111
Clearly, the code efficiency is 100% and L = 1.75 bits/sym = H(S). The sequence s3s4s1 will
then correspond to 1101110. Suppose a one-bit error occurs so that the received sequence is 0101110.
This will be decoded as s1s2s4s1, which is altogether different from the transmitted sequence. Thus,
although the coding provides 100% efficiency in the light of Shannon's theorem, it suffers a major
disadvantage. Another disadvantage of a variable length code lies in the fact that output data rates
measured over short time periods will fluctuate widely. To avoid this problem, buffers of large length
will be needed both at the encoder and at the decoder to store the variable rate bit stream if a fixed
output rate is to be maintained.
Some of the above difficulties can be resolved by using codes of fixed length. For
example, suppose the codes for the example cited are modified to 000, 100, 110, and 111. Observe that even
if there is a one-bit error, it affects only one block, and that the output data rate will not fluctuate.
The encoder/decoder structure using fixed length code words will be very simple compared to the
complexity of those for the variable length codes.
Hereafter, by block codes we shall mean fixed length codes only. Since, as discussed
above, single bit errors lead to single block errors, we can devise means to detect and correct these
errors at the receiver. Notice that the price to be paid for the efficient handling and easy
manipulation of the codes is reduced efficiency and hence increased redundancy.
In general, whatever scheme is adopted for the transmission of digital/analog information, the
probability of error is a function of the signal-to-noise power ratio at the input of the receiver and the data
rate. However, constraints like the maximum signal power and the bandwidth of the channel (mainly
governmental regulations on public channels) may make it impossible to arrive at a signaling scheme
which will yield an acceptable probability of error for a given application. The answer to this problem
is then the use of error control coding, also known as channel coding. In brief, error control
coding is the calculated addition of redundancy. The block diagram of a typical data transmission
system is shown in Fig. 6.1.
The information source can be either a person or a machine (e.g. a digital computer). The source
output, which is to be communicated to the destination, can be either a continuous waveform or a
sequence of discrete symbols. The source encoder transforms the source output into a sequence of
binary digits, the information sequence u. If the source output happens to be continuous, this involves
A-D conversion as well. The source encoder is ideally designed such that (i) the number of bits per
unit time (bit rate, rb) required to represent the source output is minimized, and (ii) the source output can
be uniquely reconstructed from the information sequence u.
Error control for data integrity may be exercised by means of forward error correction
(FEC), wherein the decoder performs error correction operations on the received information
according to the schemes devised for the purpose. There is, however, another major approach known
as Automatic Repeat Request (ARQ), in which a re-transmission of the ambiguous information is
effected. In ARQ, error correction is not done at all.
The redundancy introduced is used only for error detection and, upon detection, the receiver
requests a repeat transmission, which necessitates the use of a return path (feedback channel).
In summary, channel coding refers to a class of signal transformations designed to improve
the performance of communication systems by enabling the transmitted signals to better withstand the
effects of various channel impairments such as noise, fading and jamming. The main objective of error
control coding is to reduce the probability of error, or to reduce the required Eb/N0, at the cost of expending more
bandwidth than would otherwise be necessary. Channel coding is a very popular way of providing
performance improvement. Use of VLSI technology has made it possible to provide as much as
8 dB performance improvement through coding, at much lower cost than through other methods
such as high power transmitters or larger antennas.
We will briefly discuss in this chapter the channel encoder and decoder strategies, our major
interest being in the design and implementation of the channel encoder/decoder pair to achieve fast
transmission of information over a noisy channel, reliable communication of information and
reduction of the implementation cost of the equipment.
6.3 Types of Codes:
There are mainly two types of error control coding schemes: block codes and convolutional
codes, which can take care of either type of errors mentioned above.
In a block code, the information sequence is divided into message blocks of k bits each,
represented by a binary k-tuple u = (u1, u2, …, uk), and each block is called a message. The symbol u,
here, is used to denote a k-bit message rather than the entire information sequence. The encoder
then transforms u into an n-tuple v = (v1, v2, …, vn). Here v represents an encoded block rather than
the entire encoded sequence. The blocks are independent of each other.
The encoder of a convolutional code also accepts k-bit blocks of the information sequence u
and produces an n-symbol block v. Here u and v are used to denote sequences of blocks rather than a
single block. Further, each encoded block depends not only on the present k-bit message block but
also on the m previous blocks. Hence the encoder has a memory of order m. Since the encoder has
memory, implementation requires sequential logic circuits.
If the code word with n bits is to be transmitted in no more time than is required for the
transmission of the k information bits, and if τb and τc are the bit durations in the input (uncoded) and
output (coded) words, then it is necessary that

n τc = k τb

We define the rate of the code (also called rate efficiency) by

Rc = k / n

Accordingly, with fb = 1/τb and fc = 1/τc, we have fb/fc = τc/τb = k/n = Rc, i.e. fc = fb/Rc.
(Recall the Shannon-Hartley capacity formula: C = B log2(1 + S/N).)
(i) Error detection: single parity check coding. Consider the (4, 3) even parity check code.
Message   Code word (even parity bit appended)
000       0000
001       0011
010       0101
011       0110
100       1001
101       1010
110       1100
111       1111
An error is detected whenever an odd number (one or three) of the four bits is received in error.
Hence the probability of detecting an error is

pd = 4C1 p(1 − p)^3 + 4C3 p^3(1 − p), where 4C1 = 4 and 4C3 = 4

Expanding we get pd = 4p − 12p^2 + 16p^3 − 8p^4.
Substituting the value of p (= 8 × 10^−3, the modem error probability) we get:

pd = 32 × 10^−3 − 768 × 10^−6 + 8192 × 10^−9 − 32768 × 10^−12 ≈ 0.03124, which is much greater than 10^−3.
However, a decoding failure results if the decoder does not indicate any error when an error indeed has
occurred. This happens when two or four errors occur. Hence the probability of no detection, pnd,
is given by:

pnd = P(X = 2) + P(X = 4) = 4C2 p^2(1 − p)^2 + 4C4 p^4(1 − p)^0 = 6p^2 − 12p^3 + 7p^4

With p = 8 × 10^−3 this gives pnd ≈ 3.78 × 10^−4.
(ii) Error correction: the triplets 000 and 111 are transmitted whenever 0 and 1 are input,
respectively. A majority logic decoding, as shown below, is employed, assuming only single errors.
Received triplet          Output message
000, 001, 010, 100        0
011, 101, 110, 111        1

A decoding error results if two or three of the three bits are received in error. Hence the
probability of a decoding error is

pde = 3C2 p^2(1 − p) + 3C3 p^3(1 − p)^0 = 3p^2 − 2p^3

which, for p = 8 × 10^−3, gives pde ≈ 1.91 × 10^−4. The probability of no detection is

pnd = P(all 3 bits in error) = p^3 = 512 × 10^−9 << pde!
In general, observe that the probability of no detection, pnd, is very much smaller than the probability of a decoding error, pde.
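Both examples can be cross-checked by brute-force enumeration of error patterns over the BSC. A minimal Python sketch (the code and its names are ours, not the text's), assuming the modem error probability p = 8 × 10^−3 used above:

    from itertools import product

    p = 8e-3  # modem bit error probability used in the text

    def pattern_prob(e):
        """Probability that a BSC produces exactly the error pattern e."""
        return p ** sum(e) * (1 - p) ** (len(e) - sum(e))

    # (4,3) even parity check code: an error pattern is detected
    # if and only if it has odd weight (it violates the parity check).
    pd = sum(pattern_prob(e) for e in product((0, 1), repeat=4)
             if sum(e) % 2 == 1)
    pnd = sum(pattern_prob(e) for e in product((0, 1), repeat=4)
              if sum(e) % 2 == 0 and sum(e) > 0)

    # (3,1) repetition code with majority logic decoding: a decoding
    # error occurs if and only if two or more of the three bits flip.
    pde = sum(pattern_prob(e) for e in product((0, 1), repeat=3)
              if sum(e) >= 2)

    print(pd, pnd, pde)  # ~0.03124, ~3.78e-4, ~1.91e-4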
The preceding examples illustrate the following aspects of error control coding. Note that in
both examples, without error control coding, the probability of error of the modem is 8 × 10^−3.
1. It is possible to detect and correct errors by adding extra bits (the check bits) to the message
sequence. Because of this, not all sequences constitute bona fide messages.
2. It is not possible to detect and correct all errors.
3. Addition of check bits reduces the effective data rate through the channel.
4. Since the probability of no detection is always very much smaller than the decoding error
probability, it appears that error detection schemes, which do not reduce the rate efficiency
as much as the error correcting schemes do, are well suited for our application. However, error
detection schemes must always be paired with ARQ techniques; when the speed of communication
becomes a major concern, forward error correction (FEC) using error correcting schemes would be
desirable.
The binary alphabet {0, 1} is called a field of two elements (a binary field) and is denoted by
GF(2). (Notice that ⊕ represents the EX-OR operation and · represents the AND operation.) Further,
in binary arithmetic, −X = X and X − Y = X ⊕ Y. Similarly, for 3-valued variables, modulo-3
arithmetic can be specified as shown in Fig 6.4. However, for brevity, while representing polynomials
involving binary addition we use + instead of ⊕, and there shall be no confusion about such usage.
Polynomials f(X) with 1 or 0 as the coefficients can be manipulated using the above
relations. The arithmetic of GF(2^m) can be derived using a polynomial p(X) of degree m with binary
coefficients, and a new variable α, called the primitive element, such that p(α) = 0. When p(X) is
irreducible (i.e. it has no factor of degree less than m and greater than 0; for example X3 + X2 + 1, X3 + X + 1,
X4 + X3 + 1, X5 + X2 + 1 etc. are irreducible polynomials, whereas f(X) = X4 + X3 + X2 + 1 is not, as
f(1) = 0 and hence it has a factor X + 1), then p(X) is said to be a primitive polynomial.
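The factor test just used (f(1) = 0 over GF(2) implies the factor X + 1) extends to a brute-force irreducibility check. A small sketch of ours, representing polynomials as integer bit masks (low bit = coefficient of X^0, a convention we adopt, not the text's):

    def gf2_mod(a: int, b: int) -> int:
        """Remainder of polynomial a modulo b, coefficients in GF(2).
        Bit-mask convention: X^3 + X + 1 -> 0b1011."""
        while a.bit_length() >= b.bit_length():
            a ^= b << (a.bit_length() - b.bit_length())
        return a

    def is_irreducible(p: int) -> bool:
        """Trial division by every polynomial of smaller degree."""
        deg = p.bit_length() - 1
        return all(gf2_mod(p, d) != 0 for d in range(2, 1 << deg))

    print(is_irreducible(0b1011))   # X^3 + X + 1        -> True
    print(is_irreducible(0b1101))   # X^3 + X^2 + 1      -> True
    print(is_irreducible(0b11101))  # X^4 + X^3 + X^2 + 1 -> False (factor X + 1)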
If Vn represents the vector space of all n-tuples, then a subset S of Vn is called a subspace if (i)
the all-zero vector is in S and (ii) the sum of any two vectors in S is also a vector in S. To be more
specific, a block code is said to be linear if the following is satisfied: if v1 and v2 are any two code
words of length n of the block code, then v1 ⊕ v2 is also a code word of length n of the block code.
Example 6.1: Linear block code with k = 3 and n = 6.
Observe the linearity property: with v3 = (010 101) and v4 = (100 011), v3 ⊕ v4 = (110 110) = v7.
Remember that n represents the word length of the code words and k represents the number of
information digits; hence the block code is referred to as an (n, k) block code.
Thus, by the definition of a linear block code, it follows that if g1, g2, …, gk are k linearly
independent code words, then every code vector v of our code is a linear combination of these code words,
i.e.

v = u1 g1 ⊕ u2 g2 ⊕ … ⊕ uk gk        (6.1)

where uj = 0 or 1, 1 ≤ j ≤ k.

Eq (6.1) can be arranged in matrix form by noting that each gj is an n-tuple, i.e.

gj = (gj1, gj2, …, gjn)        (6.2)

Thus we have

v = u G        (6.3)

where

u = (u1, u2, …, uk)        (6.4)

represents the data vector, and

      | g1 |   | g11 g12 … g1n |
  G = | g2 | = | g21 g22 … g2n |        (6.5)
      | ⋮  |   | ⋮             |
      | gk |   | gk1 gk2 … gkn |

is called the generator matrix.
Notice that any k linearly independent code words of an (n, k) linear code can be used to form
a generator matrix for the code. Thus it follows that an (n, k) linear code is completely specified by
the k rows of the generator matrix. Hence the encoder needs only to store the k rows of G and form linear
combinations of these rows based on the input message u.
Example 6.2: The (6, 3) linear code of Example 6.1 has the following generator matrix:

      | g1 |   | 1 0 0 0 1 1 |
  G = | g2 | = | 0 1 0 1 0 1 |
      | g3 |   | 0 0 1 1 1 0 |

If u = m5 (say) is the message to be coded, i.e. u = (011), we have

v = u G = 0·g1 + 1·g2 + 1·g3
  = (0,0,0,0,0,0) + (0,1,0,1,0,1) + (0,0,1,1,1,0) = (0, 1, 1, 0, 1, 1)

Thus v = (0 1 1 0 1 1). v can be computed simply by adding those rows of G which correspond to
the locations of the 1's of u.
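As a quick sanity check, the matrix encoding above takes only a few lines of Python. A minimal sketch (the function name is ours):

    import numpy as np

    # Generator matrix of the (6, 3) code of Example 6.2
    G = np.array([[1, 0, 0, 0, 1, 1],
                  [0, 1, 0, 1, 0, 1],
                  [0, 0, 1, 1, 1, 0]])

    def encode(u):
        """Encode message u as v = u.G, with arithmetic in GF(2)."""
        return np.mod(np.array(u) @ G, 2)

    print(encode([0, 1, 1]))  # -> [0 1 1 0 1 1], as computed by hand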
6.5.2 Systematic Block Codes (Group Property):
A desirable property of linear block codes is the systematic structure. Here a code word is
divided into two parts: the message part and the redundant part. If either the first k digits or the last k
digits of the code word correspond to the message part, then we say that the code is a systematic
block code. We shall consider systematic codes as depicted in Fig. 6.5.
v1 = u1, v2 = u2, v3 = u3, …, vk = uk        (6.6a)

vk+1 = u1 p11 ⊕ u2 p21 ⊕ u3 p31 ⊕ … ⊕ uk pk1
vk+2 = u1 p12 ⊕ u2 p22 ⊕ u3 p32 ⊕ … ⊕ uk pk2
⋮
vn = u1 p1,n−k ⊕ u2 p2,n−k ⊕ … ⊕ uk pk,n−k        (6.6b)

In matrix form:

                            | 1 0 0 … 0   p11 p12 … p1,n−k |
(v1 v2 … vn) = (u1 u2 … uk) | 0 1 0 … 0   p21 p22 … p2,n−k |        (6.7)
                            | ⋮                             |
                            | 0 0 0 … 1   pk1 pk2 … pk,n−k |

i.e., v = u G        (6.8)

where G = [Ik  P]        (6.9)

and

      | p11   p12   …  p1,n−k |
  P = | p21   p22   …  p2,n−k |        (6.9a)
      | ⋮                     |
      | pk,1  pk,2  …  pk,n−k |
This matrix H is called a parity check matrix of the code. Its dimension is (n − k) × n.
If the generator matrix has a systematic format, the parity check matrix takes the following form:
                   | p11     p21     …  pk1      1 0 0 … 0 |
  H = [PT  In−k] = | p12     p22     …  pk2      0 1 0 … 0 |        (6.10)
                   | ⋮                                     |
                   | p1,n−k  p2,n−k  …  pk,n−k   0 0 0 … 1 |
To see that G·HT = O, note that the i-th row of G is (0 … 0 1 0 … 0, pi,1, pi,2, …, pi,n−k), with the
single 1 in the i-th position, while the j-th column of HT is (p1,j, p2,j, …, pk,j, 0 … 0 1 0 … 0)T, with
the single 1 in the (k + j)-th position. The inner product of the i-th row of G and the j-th column
of HT is therefore

pij + pij = 0 (as the pij are either 0 or 1 and in modulo-2 arithmetic X + X = 0)

This implies simply that:

G·HT = Ok×(n−k)        (6.11)
Now HT = [P ; In−k] (P stacked above In−k), and a systematic code word has the form
v = (u1, u2, …, uk, p1, p2, …, pn−k), where pi = (u1 p1,i + u2 p2,i + u3 p3,i + … + uk pk,i) are the parity
bits found from Eq (6.6b). Hence

v·HT = [u1 p11 + u2 p21 + … + uk pk1 + p1,  u1 p12 + u2 p22 + … + uk pk2 + p2,  …,
        u1 p1,n−k + u2 p2,n−k + … + uk pk,n−k + pn−k]
     = [p1 + p1, p2 + p2, …, pn−k + pn−k]
     = [0, 0, …, 0]

Thus v·HT = O. This statement implies that an n-tuple v is a code word generated by G if and only
if

v·HT = O

Since v = u G, this means that u G HT = O. If this is to be true for an arbitrary message vector u,
then it implies G HT = Ok×(n−k).
Example 6.3:
Consider the generator matrix of Example 6.2; the corresponding parity check matrix is

      | 0 1 1 1 0 0 |
  H = | 1 0 1 0 1 0 |
      | 1 1 0 0 0 1 |
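One can confirm Eq (6.11) numerically for this pair of matrices. A short sketch continuing the Python convention used earlier:

    import numpy as np

    G = np.array([[1, 0, 0, 0, 1, 1],
                  [0, 1, 0, 1, 0, 1],
                  [0, 0, 1, 1, 1, 0]])

    H = np.array([[0, 1, 1, 1, 0, 0],
                  [1, 0, 1, 0, 1, 0],
                  [1, 1, 0, 0, 0, 1]])

    # G.H^T must vanish over GF(2) -- Eq (6.11)
    print(np.mod(G @ H.T, 2))  # -> 3x3 all-zero matrix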
If v is the transmitted code vector and r the received vector, then

e = r ⊕ v = (e1, e2, …, en)        (6.12)

is an n-tuple, where ej = 1 if rj ≠ vj and ej = 0 if rj = vj. This n-tuple is called the error vector or
error pattern. The 1s in e are the transmission errors caused by the channel noise. Hence from
Eq (6.12) it follows:

r = v ⊕ e        (6.12a)
Observe that the receiver does not know either v or e. Accordingly, on reception of r, the
decoder must first identify whether there are any transmission errors and then either take action to locate
and correct these errors (FEC: forward error correction) or make a request for retransmission
(ARQ). When r is received, the decoder computes the following (n − k)-tuple:

s = r·HT = (s1, s2, …, sn−k)        (6.13)
It then follows from Eq (6.9a) that s = 0 if and only if r is a code word, and s ≠ 0 if r is not a
code word. This vector s is called the syndrome (a term used in medical science referring to the
collection of all symptoms characterizing a disease). Thus if s = 0, the receiver accepts r as a valid
code word. Notice that there are possibilities of errors remaining undetected, which happens when e is identical
to a nonzero code word. In this case r is the sum of two code words, which, according to our linearity
property, is again a code word. This type of error pattern is referred to as an undetectable error
pattern. Since there are 2^k − 1 nonzero code words, it follows that there are 2^k − 1 such error patterns as
well. Hence, when an undetectable error pattern occurs, the decoder makes a decoding error.
Eq. (6.13) can be expanded as below:

s1 = r1 p11 + r2 p21 + … + rk pk1 + rk+1
s2 = r1 p12 + r2 p22 + … + rk pk2 + rk+2
⋮
sn−k = r1 p1,n−k + r2 p2,n−k + … + rk pk,n−k + rn        (6.14)

A careful examination of Eq. (6.14) reveals the following point: the syndrome is simply the vector
sum of the received parity digits (rk+1, rk+2, …, rn) and the parity check digits recomputed from the
received information digits (r1, r2, …, rk).
Example 6.4:
We shall compute the syndrome for the (6, 3) systematic code of Example 6.2. We have
s = (s1, s2, s3) = (r1, r2, r3, r4, r5, r6)·HT, i.e.

s1 = r2 + r3 + r4
s2 = r1 + r3 + r5
s3 = r1 + r2 + r6

Also, since r = v ⊕ e,

s = r·HT = (v ⊕ e)·HT = e·HT        (6.15)

as v·HT = O. Eq. (6.15) indicates that the syndrome depends only on the error pattern and not on the
transmitted code word v. For a linear systematic code, then, we have the following relationship
between the syndrome digits and the error digits:
s1 = e1 p11 + e2 p21 + … + ek pk,1 + ek+1
⋮
sn−k = e1 p1,n−k + e2 p2,n−k + … + ek pk,n−k + en        (6.16)
For example, if r = (0 1 1 1 0 1) is received and the syndrome identifies the error pattern
e = (0 0 1 0 0 0), the decoder output is

v* = r ⊕ e = (0 1 1 1 0 1) ⊕ (0 0 1 0 0 0) = (0 1 0 1 0 1)

Notice that v* indeed is the actual transmitted code word.
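The whole receive-side chain (syndrome, error location, correction) for this (6, 3) code fits in a few lines. A sketch under the same conventions as the earlier snippets, assuming at most a single bit error:

    import numpy as np

    H = np.array([[0, 1, 1, 1, 0, 0],
                  [1, 0, 1, 0, 1, 0],
                  [1, 1, 0, 0, 0, 1]])

    def correct_single_error(r):
        """Correct at most one bit error using the syndrome s = r.H^T.
        A single error in position j yields s equal to column j of H."""
        r = np.array(r)
        s = np.mod(r @ H.T, 2)
        if not s.any():
            return r                      # zero syndrome: accept r
        for j in range(H.shape[1]):
            if np.array_equal(s, H[:, j]):
                r[j] ^= 1                 # flip the erroneous bit
                return r
        raise ValueError("uncorrectable error pattern")

    print(correct_single_error([0, 1, 1, 1, 0, 1]))  # -> [0 1 0 1 0 1]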
The concept of distance between code words and single error correcting codes was first
developed by R. W. Hamming. Let the n-tuples

α = (α1, α2, …, αn), β = (β1, β2, …, βn)

be two code words. The Hamming distance d(α, β) between such a pair of code vectors is defined as
the number of positions in which they differ. Alternatively, using modulo-2 arithmetic, we have

d(α, β) = Σ (αj ⊕ βj), the sum running over j = 1 to n        (6.17)

(Notice that Σ represents the usual decimal summation and ⊕ is the modulo-2 sum, the EX-OR
function.)
The Hamming weight ω(α) of a code vector α is defined as the number of nonzero
elements in the code vector. Equivalently, the Hamming weight of a code vector is the distance
between the code vector and the all-zero code vector.
Example 6.6: Let

α = (0 1 1 1 0 1), β = (1 0 1 0 1 1)

Notice that the two vectors differ in 4 positions and hence d(α, β) = 4. Using Eq (6.17) we find

d(α, β) = (0 ⊕ 1) + (1 ⊕ 0) + (1 ⊕ 1) + (1 ⊕ 0) + (0 ⊕ 1) + (1 ⊕ 1)
        = 1 + 1 + 0 + 1 + 1 + 0 = 4        (6.18)

Further, α ⊕ β = (1 1 0 1 1 0), so that

ω(α ⊕ β) = 4 = d(α, β)        (6.19)

If γ = (1 0 1 0 1 0), we have d(α, β) = 4; d(β, γ) = 1; d(α, γ) = 5.
Notice that the above three distances satisfy the triangle inequality:

d(α, β) + d(β, γ) = 5 = d(α, γ)
d(α, γ) + d(γ, β) = 6 > d(α, β)
d(α, β) + d(α, γ) = 9 > d(β, γ)
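These definitions translate directly into code. A minimal sketch (function names ours):

    def hamming_distance(a, b):
        """Number of positions in which two equal-length tuples differ."""
        return sum(x ^ y for x, y in zip(a, b))

    def hamming_weight(a):
        """Number of nonzero elements; the distance from the zero vector."""
        return sum(a)

    alpha = (0, 1, 1, 1, 0, 1)
    beta  = (1, 0, 1, 0, 1, 1)
    gamma = (1, 0, 1, 0, 1, 0)

    print(hamming_distance(alpha, beta))                              # 4
    print(hamming_weight(tuple(x ^ y for x, y in zip(alpha, beta))))  # 4
    # Triangle inequality: d(a,b) + d(b,g) >= d(a,g)
    assert (hamming_distance(alpha, beta) + hamming_distance(beta, gamma)
            >= hamming_distance(alpha, gamma))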
Similarly, the minimum distance of a linear block code C may be mathematically
represented as below:

dmin = Min {d(α, β) : α, β ∈ C, α ≠ β}        (6.20)
     = Min {ω(α ⊕ β) : α, β ∈ C, α ≠ β}
     = Min {ω(v) : v ∈ C, v ≠ 0}        (6.21)

That is, dmin = ωmin. The parameter ωmin is called the minimum weight of the linear
code C. The minimum distance of a code, dmin, is related to the parity check matrix, H, of the code in
a fundamental way. Suppose v is a code word. Then from Eq. (6.9a) we have:

0 = v·HT = v1h1 ⊕ v2h2 ⊕ … ⊕ vnhn

Here h1, h2, …, hn represent the columns of the H matrix. Let vj1, vj2, …, vjl be the l nonzero
components of v, i.e. vj1 = vj2 = … = vjl = 1. Then it follows that:

hj1 ⊕ hj2 ⊕ … ⊕ hjl = OT        (6.22)
That is, if v is a code vector of Hamming weight l, then there exist l columns of H such
that the vector sum of these columns is equal to the zero vector. Suppose we form a binary n-tuple
of weight l, viz. x = (x1, x2, …, xn), whose nonzero components are xj1, xj2, …, xjl. Consider the
product:

x·HT = x1h1 ⊕ x2h2 ⊕ … ⊕ xnhn = xj1hj1 ⊕ xj2hj2 ⊕ … ⊕ xjlhjl = hj1 ⊕ hj2 ⊕ … ⊕ hjl

If Eq. (6.22) holds, it follows that x·HT = O and hence x is a code vector. Therefore, we conclude
that if there are l columns of the H matrix whose vector sum is the zero vector, then there exists a
code vector of Hamming weight l.
From the above discussions, it follows that:

i) If no (d − 1) or fewer columns of H add to OT, the all-zero column vector, the code has a
minimum weight of at least d.

ii) The minimum weight (or the minimum distance) of a linear block code C is the smallest
number of columns of H that sum to the all-zero column vector.
For the H matrix of Example 6.3, i.e.

      | 0 1 1 1 0 0 |
  H = | 1 0 1 0 1 0 |
      | 1 1 0 0 0 1 |

notice that all columns of H are nonzero and distinct. Hence no two or fewer columns sum to the
zero vector, and the minimum weight of the code is at least 3. Further, notice that the 1st, 2nd and
3rd columns sum to OT. Thus the minimum weight of the code is 3. We see that the minimum weight
of the code is indeed 3 from the table of Example 6.1.
Clearly, if the code words used are {000, 101, 110, 011}, the Hamming distance between any two
words is 2. Notice that any single error in a received word locates it on a vertex of the cube which
is not a code word, and so single errors may be recognized. The code word pairs with Hamming
distance 3 are: (000, 111), (100, 011), (101, 010) and (001, 110). If the code word (000) is received
as (100), (010) or (001), observe that these are nearer to (000) than to (111). Hence the decision is made
that the transmitted word is (000).
Suppose an (n, k) linear block code is required to detect and correct all error patterns (over a
BSC) whose Hamming weight ω(e) ≤ t. That is, if we transmit a code vector α and the received vector
is β = α ⊕ e, we want the decoder output to be α, subject to the condition ω(e) ≤ t.
Further, assume that the 2^k code vectors are transmitted with equal probability. The best decision
for the decoder then is to pick the code vector nearest to the received vector β, i.e. the one for which the
Hamming distance d(α, β) is smallest. With such a strategy the decoder will be
able to detect and correct all error patterns of Hamming weight ω(e) ≤ t provided that the minimum
distance of the code is such that:

dmin ≥ (2t + 1)        (6.23)

dmin being either odd or even, let t be a positive integer such that:

2t + 1 ≤ dmin ≤ 2t + 2        (6.24)
Suppose γ is any other code word of the code. Then the Hamming distances among
α, β and γ satisfy the triangle inequality:

d(α, β) + d(β, γ) ≥ d(α, γ)        (6.25)

Suppose an error pattern of t′ errors occurs during transmission of α. Then the received vector β
differs from α in t′ places and hence d(α, β) = t′. Since α and γ are code vectors, it follows from
Eq. (6.24) that

d(α, γ) ≥ dmin ≥ 2t + 1        (6.26)

Combining Eq. (6.25) and (6.26) with the fact that d(α, β) = t′, it follows that:

d(β, γ) ≥ 2t + 1 − t′        (6.27)

Hence if t′ ≤ t, then:

d(β, γ) > t        (6.28)

Eq (6.28) says that if an error pattern of t or fewer errors occurs, the received vector β is
closer (in Hamming distance) to the transmitted code vector α than to any other code vector γ of the
code. For a BSC, this means P(β|α) > P(β|γ) for α ≠ γ. Thus, based on the maximum likelihood
decoding scheme, β is decoded as α, which indeed is the actual transmitted code word; this
results in correct decoding and thus the errors are corrected.
On the contrary, the code is not capable of correcting error patterns of weight l > t. To show
this we proceed as below. Suppose

d(α, γ) = dmin

and let e1 and e2 be two error patterns such that:

i) e1 ⊕ e2 = α ⊕ γ
ii) e1 and e2 have no nonzero components in common places.

Clearly, then,

ω(e1) + ω(e2) = ω(α ⊕ γ) = d(α, γ) = dmin        (6.29)

Suppose α is the transmitted code vector and is corrupted by the error pattern e1. Then the received
vector is:

β = α ⊕ e1        (6.30)

and

d(α, β) = ω(α ⊕ β) = ω(e1)        (6.31)

d(γ, β) = ω(γ ⊕ β) = ω(γ ⊕ α ⊕ e1) = ω(e2)        (6.32)

If the error pattern e1 contains more than t errors, i.e. ω(e1) > t, then, since 2t + 1 ≤ dmin ≤ 2t + 2, it
follows that

ω(e2) = dmin − ω(e1) ≤ t + 1        (6.33)

and hence

d(γ, β) ≤ d(α, β)        (6.34)
This inequality says that there exists an error pattern of l > t errors which results in a received
vector closer to an incorrect code vector than to the transmitted one; i.e., based on the maximum
likelihood decoding scheme, a decoding error will be committed.
To make the point clear, we shall give yet another illustration. The code vectors and the
received vectors may be represented as points in an n-dimensional space. Suppose we construct two
spheres, each of equal radius t, around the points that represent the code vectors α and γ. Further, let
these two spheres be mutually exclusive, or disjoint, as shown in Fig. 6.11(a).
For this condition to be satisfied, we then require d(α, γ) ≥ 2t + 1. In such a case, if d(α, β) ≤ t,
it is clear that the decoder will pick α as the transmitted vector.
On the other hand, if d(α, γ) ≤ 2t, the two spheres around α and γ intersect, and if β is located as in
Fig. 6.11(b) and α is the transmitted code vector, it follows that even if d(α, β) ≤ t, β is as close to
γ as it is to α. The decoder may now pick γ as the transmitted vector, which is wrong. Thus it is
clear that an (n, k) linear block code has the power to correct all error patterns of weight t or
less if and only if d(α, γ) ≥ 2t + 1 for all α and γ. However, since the smallest distance between any
pair of code words is the minimum distance of the code, dmin guarantees correction of all error
patterns of weight

t ≤ ⌊(dmin − 1)/2⌋        (6.35)

where ⌊(dmin − 1)/2⌋ denotes the largest integer no greater than (dmin − 1)/2. The parameter t is
called the random error correcting capability of the code, and
the code is referred to as a t-error correcting code. The (6, 3) code of Example 6.1 has a minimum
distance of 3, and from Eq. (6.35) it follows that t = 1, which means it is a single error correcting
(SEC) code. It is capable of correcting any single error over a block of six digits.
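Eq. (6.35) is easy to check by enumerating the code. A sketch that generates all 2^k code words of the (6, 3) code and finds dmin (helper style as before; names ours):

    from itertools import product
    import numpy as np

    G = np.array([[1, 0, 0, 0, 1, 1],
                  [0, 1, 0, 1, 0, 1],
                  [0, 0, 1, 1, 1, 0]])

    # All 2^3 code words of the (6, 3) code
    codewords = [tuple(np.mod(np.array(u) @ G, 2))
                 for u in product((0, 1), repeat=3)]

    # For a linear code, dmin = minimum weight over nonzero code words
    dmin = min(sum(c) for c in codewords if any(c))
    t = (dmin - 1) // 2
    print(dmin, t)  # -> 3 1: a single error correcting code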
For an (n, k) linear code, observe that there are 2^(n−k) syndromes, including the all-zero
syndrome, and each syndrome corresponds to a specific error pattern. If j is the number of error
locations in the n-dimensional error pattern e, we find that, in general, there are nCj error patterns
of weight j. It then follows that the total number of error patterns of weight t or less is the sum of
nCj for j = 0 to t, where t is the random error correcting capability of the code. For all these error
patterns to have distinct syndromes we require

2^(n−k) ≥ nC0 + nC1 + … + nCt        (6.36)

Eq (6.36) is usually referred to as the Hamming bound. A binary code for which the Hamming
bound turns out to be an equality is called a perfect code.
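A quick numerical check of Eq. (6.36), applied here to the (6, 3) code (t = 1) and to the (7, 4) Hamming code discussed later (a sketch; the function name is ours):

    from math import comb

    def hamming_bound(n, k, t):
        """Return (number of syndromes, number of error patterns of
        weight <= t). The Hamming bound (6.36) requires the first to be
        >= the second; equality characterizes a perfect code."""
        return 2 ** (n - k), sum(comb(n, j) for j in range(t + 1))

    print(hamming_bound(6, 3, 1))  # (8, 7): bound satisfied, not perfect
    print(hamming_bound(7, 4, 1))  # (8, 8): equality -> a perfect code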
ej = e ⊕ vj, j = 1, 2, …, 2^k        (6.37)

The set of vectors {ej, j = 1, 2, …, 2^k} so defined is called a co-set of the code. That is, a
co-set contains exactly 2^k elements that differ at most by a code vector. It then follows that there are
2^(n−k) co-sets for an (n, k) linear block code. Post-multiplying Eq (6.37) by HT, we find

ej·HT = e·HT ⊕ vj·HT = e·HT        (6.38)

Notice that the RHS of Eq (6.38) is independent of the index j, as for any code word the term
vj·HT = 0. From Eq (6.38) it is clear that all error patterns that differ at most by a code word have
the same syndrome. That is, each co-set is characterized by a unique syndrome.
Since the received vector r may be any of the 2^n n-tuples, no matter what the transmitted code
word was, observe that we can use Eq (6.38) to partition the 2^n n-tuples into 2^k disjoint sets
and try to identify the received vector. This will be done by preparing what is called the standard
array. The steps involved are as below:

Step 1: Place the 2^k code vectors of the code in a row, with the all-zero vector
v1 = (0, 0, 0, …, 0) = O as the first (left-most) element.

Step 2: From among the remaining (2^n − 2^k) n-tuples, e2 is chosen and placed below the
all-zero vector v1. The second row can now be formed by placing (e2 ⊕ vj),
j = 2, 3, …, 2^k, under vj.

Step 3: Now take an unused n-tuple e3 and complete the 3rd row as in Step 2.

Step 4: Continue the process until all the n-tuples are used.
Since all the code vectors vj are distinct, the vectors in any row of the array are also
distinct. For, if two n-tuples in the l-th row were identical, say el ⊕ vj = el ⊕ vm with j ≠ m, we should have
vj = vm, which is impossible. Thus it follows that no two n-tuples in the same row of a standard
array are identical.
Next, let us suppose that an n-tuple appears in both the l-th row and the m-th row. Then for some
j1 and j2 this implies el ⊕ vj1 = em ⊕ vj2, which then implies el = em ⊕ (vj2 ⊕ vj1) (remember that
X ⊕ X = 0 in modulo-2 arithmetic), or el = em ⊕ vj3 for some j3. Since, by the property of linear block
codes, vj3 is also a code word, this implies, by the construction rules given, that el must appear in the
m-th row, which is a contradiction of our steps, as the first element of the m-th row is em and it is an
unused vector in the previous rows. This clearly demonstrates another important property of the
array: every n-tuple appears in one and only one row.
From the above discussions it is clear that there are 2^(n−k) disjoint rows or co-sets in the standard
array, and each row or co-set consists of 2^k distinct entries. The first n-tuple of each co-set (i.e., the
entry in the first column) is called the co-set leader. Notice that any element of a co-set can be
used as its co-set leader; this does not change the elements of the co-set, it results simply in a
permutation.
Suppose DjT is the j-th column of the standard array. Then it follows that

Dj = {vj, e2 ⊕ vj, e3 ⊕ vj, …, e2^(n−k) ⊕ vj}        (6.39)

where vj is a code vector and e2, e3, …, e2^(n−k) are the co-set leaders.

The 2^k disjoint columns D1T, D2T, …, D2^kT can now be used for decoding of the code. If vj is
the transmitted code word over a noisy channel, it follows from Eq (6.39) that the received vector r is
in DjT if the error pattern caused by the channel is a co-set leader. If this is the case, r will be decoded
correctly as vj. If not, an erroneous decoding will result; for, any error pattern e which is not a co-set
leader must be in some co-set and under some nonzero code vector, say in the i-th co-set and under
vs ≠ 0, i.e. e = ei ⊕ vs. Then it follows that the received vector is

r = vj ⊕ e = ei ⊕ (vj ⊕ vs) = ei ⊕ vm, where vm = vj ⊕ vs

Thus the received vector is in DmT and it will be decoded as vm, and a decoding error has been
committed. Hence it is explicitly clear that correct decoding is possible if and only if the error
pattern caused by the channel is a co-set leader. Accordingly, the 2^(n−k) co-set leaders (including the
all-zero vector) are called the correctable error patterns, and it follows that every (n, k) linear block
code is capable of correcting 2^(n−k) error patterns.
So, from the above discussion, it follows that in order to minimize the probability of a
decoding error, the most likely error patterns should be chosen as co-set leaders. For a
BSC, an error pattern of smaller weight is more probable than one of larger weight. Accordingly,
when forming a standard array, error patterns of smallest weight should be chosen as co-set leaders.
Then the decoding based on the standard array will be the minimum distance decoding (the
maximum likelihood decoding). This can be demonstrated as below.
Suppose a received vector r is found in the j-th column and l-th row of the array. Then r will be
decoded as vj. We have

d(r, vj) = ω(r ⊕ vj) = ω(el ⊕ vj ⊕ vj) = ω(el)

where we have assumed that vj indeed is the transmitted code word. Let vs be any other code word, other
than vj. Then

d(r, vs) = ω(r ⊕ vs) = ω(el ⊕ vj ⊕ vs) = ω(el ⊕ vi)

since, vj and vs being code words, vi = vj ⊕ vs is also a code word of the code. Since el and (el ⊕ vi) are in
the same co-set, and el has been chosen as the co-set leader with the smallest weight, it follows that
ω(el) ≤ ω(el ⊕ vi) and hence d(r, vj) ≤ d(r, vs). Thus the received vector is decoded into the closest
code vector. Hence, if each co-set leader is chosen to have minimum weight in its co-set, the standard
array decoding results in minimum distance decoding, i.e. maximum likelihood decoding.
Suppose a0, a1, a2, …, an denote the number of co-set leaders with weights 0, 1, 2, …, n. This
set of numbers is called the weight distribution of the co-set leaders. Since a decoding error will
occur if and only if the error pattern is not a co-set leader, the probability of a decoding error for a
BSC with error probability (transition probability) p is given by

P(E) = 1 − Σ aj p^j (1 − p)^(n−j), the sum running over j = 0 to n        (6.40)
Example 6.8:
For the (6, 3) linear block code of Example 6.1, the standard array, along with the syndrome
table, is as below:

The weight distribution of the co-set leaders in the array shown is a0 = 1, a1 = 6, a2 = 1, a3 = a4 = a5
= a6 = 0. From Eq (6.40) it then follows:

P(E) = 1 − [(1 − p)^6 + 6p(1 − p)^5 + p^2(1 − p)^4]

With p = 10^−2, we have P(E) = 1.3643879 × 10^−3.
A received vector (010 001) will be decoded as (010 101), and a received vector (100 110) will be
decoded as (110 110).
We have seen in Eq. (6.38) that each co-set is characterized by a unique syndrome, i.e. there is
a one-to-one correspondence between a co-set leader (a correctable error pattern) and a syndrome.
This relationship can be used to prepare a decoding table made up of the 2^(n−k) co-set
leaders and their corresponding syndromes. This table is either stored or wired in the receiver. The
following are the steps in decoding:

Step 1: Compute the syndrome s = r·HT.
Step 2: Locate the co-set leader ej whose syndrome is s. Then ej is assumed to be the error
pattern caused by the channel.
Step 3: Decode the received vector r into the code vector v = r ⊕ ej.

This decoding scheme is called syndrome decoding or table look-up decoding.
Observe that this decoding scheme is applicable to any linear (n, k) code; the code need not necessarily
be systematic.
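The three steps map directly onto a dictionary keyed by syndrome. A sketch for the (6, 3) code, building the table from the minimum-weight co-set leaders (names and construction ours):

    from itertools import product
    import numpy as np

    G = np.array([[1, 0, 0, 0, 1, 1],
                  [0, 1, 0, 1, 0, 1],
                  [0, 0, 1, 1, 1, 0]])
    H = np.array([[0, 1, 1, 1, 0, 0],
                  [1, 0, 1, 0, 1, 0],
                  [1, 1, 0, 0, 0, 1]])

    def syndrome(x):
        return tuple(np.mod(np.array(x) @ H.T, 2))

    # Decoding table: syndrome -> co-set leader. Enumerating error
    # patterns in order of increasing weight keeps the lightest
    # (most probable) pattern for each syndrome.
    table = {}
    for e in sorted(product((0, 1), repeat=6), key=sum):
        table.setdefault(syndrome(e), np.array(e))

    def decode(r):
        """Step 1: syndrome; Step 2: table look-up; Step 3: correct."""
        return np.mod(np.array(r) + table[syndrome(r)], 2)

    print(decode([0, 1, 0, 0, 0, 1]))  # -> [0 1 0 1 0 1], cf. Example 6.8
    print(decode([1, 0, 0, 1, 1, 0]))  # -> [1 1 0 1 1 0], cf. Example 6.8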
Comments:
1) Notice that for all correctable single error patterns the syndrome will be identical to a
column of the H matrix, and it indicates that the received vector is in error in the position
corresponding to that column.
For example, if the received vector is (010001), then the syndrome is (100). This is identical
with the 4th column of the H matrix, and hence the 4th position of the received vector is in error;
the corrected vector is (010101). Similarly, for a received vector (100110), the syndrome is (101),
and this is identical with the second column of the H matrix. Thus the second position of the
received vector is in error and the corrected vector is (110110).
2) A table can be prepared relating the error locations and the syndromes. By suitable
combinatorial circuits, data recovery can be achieved. For the (6, 3) systematic linear code we have
the following table for r = (r1 r2 r3 r4 r5 r6).
For example, suppose we require a single error correcting code for k = 4 message bits. With
m = n − k parity bits, the Hamming condition 2^m ≥ m + k + 1 gives, for m = 2, k ≤ 1; for m = 3, k ≤ 4.
Thus we require 3 parity check symbols, and the length of the code is 2^3 − 1 = 7. This results in the
(7, 4) Hamming code.
The parity check matrix for the (7, 4) linear systematic Hamming code is then H = [Q : Im], with the
corresponding generator matrix G = [I(2^m − m − 1) : QT].

In the non-systematic form, the bit positions that are powers of 2 carry the parity digits and the
remaining positions carry the message digits:

p1 p2 m1 p3 m2 m3 m4 p4 m5 m6 m7 m8 m9 m10 m11 p5 m12 …
where p1, p2, p3, … are the parity digits and m1, m2, m3, … are the message digits. For example, let us
consider the non-systematic (7, 4) Hamming code, in which each parity bit checks the following bit
positions:

p1 checks positions 1, 3, 5, 7, 9, 11, 13, 15, …
p2 checks positions 2, 3, 6, 7, 10, 11, 14, 15, …
p3 checks positions 4, 5, 6, 7, 12, 13, 14, 15, …
It can be verified that (7, 4), (15, 11), (31, 26), (63, 57), … are all single error correcting
Hamming codes and are regarded as quite useful.
An important property of the Hamming codes is that they satisfy the condition of Eq. (6.36)
with the equality sign, with t = 1. This means that Hamming codes are single error correcting
binary perfect codes. This can also be verified from Eq. (6.35).
We may delete any l columns from the parity check matrix H of a Hamming code, resulting
in the reduction of the dimension of the H matrix to m × (2^m − l − 1). Using this new matrix as the parity
check matrix, we obtain a shortened Hamming code with the following parameters:

Code length: n = 2^m − l − 1
Number of message bits: k = 2^m − m − l − 1
Number of parity bits: n − k = m
Minimum distance: dmin ≥ 3
Notice that if the deletion of the columns of the H matrix is done properly, we may obtain a Hamming code
with dmin = 4. For example, if we delete from the sub-matrix Q all the columns of even weight, we
obtain an m × 2^(m−1) matrix

H′ = [Q′ : Im]

where Q′ contains (2^(m−1) − m) columns of odd weight. Clearly no three columns add to zero, as all
columns have odd weight. However, for a column in Q′, there exist three columns in Im such that the four
columns add to zero. Thus the shortened Hamming code with H′ as the parity check matrix has
minimum distance exactly 4.
The distance-4 shortened Hamming codes can be used for correcting all single error patterns while
simultaneously detecting all double error patterns. Notice that when a single error occurs the
syndrome contains an odd number of ones, while for double errors it contains an even (nonzero) number of
ones. Accordingly, the decoding can be accomplished in the following manner:

(1) If s = 0, assume no error occurred.
(2) If s contains an odd number of ones, assume a single error has occurred. The single error pattern
pertaining to this syndrome is added to the received vector for error correction.
(3) If s is nonzero and contains an even number of ones, an uncorrectable error pattern has been detected.
Alternatively, the SEC Hamming codes may be made to detect double errors by adding an extra
parity check in the (n + 1)-th position. Thus (8, 4), (16, 11), etc. codes have dmin = 4 and correct single
errors with detection of double errors.
CHAPTER 7
CYCLIC CODES
"Binary cyclic codes form a sub class of linear block codes. Majority of important linear
block codes that are known to-date are either cyclic codes or closely related to cyclic codes. Cyclic
codes are attractive for two reasons: First, encoding and syndrome calculations can be easily
implemented using simple shift registers with feed back connections. Second, they posses well
defined mathematical structure that permits the design of higher-order error correcting codes.
A binary code is said to be "cyclic" if it satisfies:
1. Linearity property: the sum of two code words is also a code word.
2. Cyclic property: any cyclic shift of a code word is also a code word.

The second property can be easily understood from Fig. 7.1. Instead of writing the code as a
row vector, we have represented it along a circle. The direction of traverse may be either clockwise or
counter-clockwise (right shift or left shift).
For example, if we move in a counter-clockwise direction, then starting at A the code word is
110001100, while if we start at B it would be 011001100. Clearly, the two code words are related in
that one is obtained from the other by a cyclic shift.
Thus, if

v = (v0, v1, v2, …, vn−1)        (7.1)

is a code vector, then the code vector read from B in the CW direction, obtained by a one-bit cyclic
right shift:

v(1) = (vn−1, v0, v1, v2, …, vn−3, vn−2)        (7.2)

is also a code vector. In this way, the n-tuples obtained by successive cyclic right shifts:

v(2) = (vn−2, vn−1, v0, v1, …, vn−3)        (7.3a)
v(3) = (vn−3, vn−2, vn−1, v0, …, vn−4)        (7.3b)
⋮
v(i) = (vn−i, vn−i+1, …, vn−1, v0, v1, …, vn−i−1)        (7.3c)

are all code vectors. This property of cyclic codes enables us to treat the elements of each code vector
as the coefficients of a polynomial of degree (n − 1).
This is the property that is extremely useful in the analysis and implementation of these codes.
Thus we write the "code polynomial" V(X) for the code in Eq (7.1) as a vector polynomial:

V(X) = v0 + v1X + v2X2 + v3X3 + … + vi−1Xi−1 + … + vn−3Xn−3 + vn−2Xn−2 + vn−1Xn−1        (7.4)

Notice that the coefficients of the polynomial are either '0' or '1' (binary codes), i.e. they belong to
GF(2) as discussed in Sec 6.7.1.
Each power of X in V(X) represents a one-bit cyclic shift in time. Therefore, multiplication of V(X)
by X may be viewed as a cyclic shift or rotation to the right, subject to the condition Xn = 1. This
condition (i) restores X·V(X) to degree (n − 1), and (ii) implies that the right-most bit is fed back at the
left. This special form of multiplication is called "multiplication modulo Xn + 1". Thus, for a single
shift, we have

X·V(X) = v0X + v1X2 + v2X3 + … + vn−2Xn−1 + vn−1Xn
       = vn−1 + v0X + v1X2 + … + vn−2Xn−1 + vn−1(Xn + 1)     (adding vn−1 + vn−1 = 0, binary arithmetic)

so that

V(1)(X) = vn−1 + v0X + v1X2 + … + vn−2Xn−1

is the remainder obtained by dividing X·V(X) by Xn + 1. (Remember: X mod Y means the remainder
obtained after dividing X by Y.) Thus V(1)(X) is the code polynomial for v(1). We can continue in this
way to arrive at a general format:
Xi V(X) = q(X)(Xn + 1) + V(i)(X)        (7.5)

where V(i)(X) is the remainder and q(X) the quotient resulting from dividing Xi V(X) by (Xn + 1), i.e.

V(i)(X) = Xi V(X) mod (Xn + 1)        (7.6)

If the message polynomial is

U(X) = u0 + u1X + u2X2 + … + uk−1Xk−1        (7.7)

then the code polynomial may be obtained as

V(X) = U(X) g(X)        (7.8)

where g(X), the generator polynomial, is a polynomial of degree (n − k) that is a factor of Xn + 1.        (7.9)

The 2^k code vectors corresponding to these code polynomials form a linear (n, k) code. We have then, from the theorem,

g(X) = 1 + g1X + g2X2 + … + gn−k−1Xn−k−1 + Xn−k        (7.10)

As the generator polynomial

g(X) = g0 + g1X + g2X2 + … + gn−kXn−k        (7.11)

is a polynomial of minimum degree among the code polynomials, it follows that g0 = gn−k = 1 always, and the
remaining coefficients may be either '0' or '1'. Performing the multiplication said in Eq (7.8) we have:
U(X) g(X) = u0 g(X) + u1 X g(X) + … + uk−1 Xk−1 g(X)        (7.12)

Suppose u0 = 1 and u1 = u2 = … = uk−1 = 0. Then from Eq (7.8) it follows that g(X) is a code word polynomial
of degree (n − k). This is treated as a 'basis' code polynomial (all rows of the G matrix of a block
code, being linearly independent, are also valid code vectors and form basis vectors of the code).
Therefore, from the cyclic property, Xi g(X) is also a code polynomial. Moreover, from the linearity
property, a linear combination of code polynomials is also a code polynomial. It follows, therefore,
that any multiple of g(X) as shown in Eq (7.12) is a code polynomial. Conversely, any binary
polynomial of degree (n − 1) or less is a code polynomial if and only if it is a multiple of g(X). The code
words generated using Eq (7.8) are in non-systematic form. Non-systematic cyclic codes can be
generated by simple binary multiplication circuits using shift registers.
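In software, Eq (7.8) is just carry-less polynomial multiplication. A sketch using the integer bit-mask convention introduced earlier (low bit = coefficient of X^0); it reproduces the (7, 4) result worked out in Example 7.2 below:

    def gf2_mul(a: int, b: int) -> int:
        """Carry-less product of two polynomials over GF(2) (bit masks)."""
        result = 0
        while b:
            if b & 1:
                result ^= a      # XOR in a shifted copy of a
            a <<= 1
            b >>= 1
        return result

    g = 0b1011   # g(X) = 1 + X + X^3
    u = 0b1101   # U(X) = 1 + X^2 + X^3, i.e. u = (1 0 1 1)
    v = gf2_mul(u, g)
    print(bin(v))  # -> 0b1111111, i.e. v = (1 1 1 1 1 1 1)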
In this book we have described cyclic codes with the right-shift operation. The left-shift version can
be obtained by simply re-writing the polynomials. Thus, for left-shift operations, the message and code
polynomials take the following form:

U(X) = u0Xk−1 + u1Xk−2 + … + uk−2X + uk−1        (7.13a)
V(X) = v0Xn−1 + v1Xn−2 + … + vn−2X + vn−1        (7.13b)

with the other polynomials of interest ((7.13c), (7.13d)) re-written in the same highest-power-first
manner, and the generator polynomial becomes

g(X) = g0Xn−k + g1Xn−k−1 + … + gn−k−1X + gn−k        (7.14)
where the ai are either '0' or '1', and the right-most bit in the sequence (a0, a1, a2, …, an−1) is transmitted
first in any operation. The product of the two polynomials A(X) and B(X) yields:

C(X) = A(X) B(X)
     = (a0 + a1X + a2X2 + … + an−1Xn−1)(b0 + b1X + b2X2 + … + bm−1Xm−1)
     = a0b0 + (a1b0 + a0b1)X + (a0b2 + b0a2 + a1b1)X2 + … + (an−2bm−1 + an−1bm−2)Xn+m−3 + an−1bm−1Xn+m−2

This product may be realized with the circuits of Fig 7.2(a) or (b), where A(X) is the input and the
coefficients of B(X) are given as weighting-factor connections to the mod-2 adders. A '0' indicates
no connection, while a '1' indicates a connection. Since higher-order coefficients are sent first, the
highest-order coefficient an−1bm−1 of the product polynomial is obtained first at the output of
Fig 7.2(a). Then the coefficient of Xn+m−3 is obtained as the sum {an−2bm−1 + an−1bm−2}, the first
term directly and the second term through the shift register SR1. Lower-order coefficients are then
generated through the successive SRs and mod-2 adders. After (n + m − 2) shifts, the SRs contain
{0, 0, …, 0, a0, a1} and the output is (a0b1 + a1b0), which is the coefficient of X. After (n + m − 1)
shifts, the SRs contain (0, 0, …, 0, a0) and the output is a0b0. The product is now complete and the
contents of the SRs become (0, 0, …, 0). Fig 7.2(b) performs the multiplication in a similar way,
but the arrangement of the SRs and the ordering of the coefficients are different (reverse order!). This
modification helps to combine two multiplication operations into one, as shown in Fig 7.2(c).
From the above description it is clear that a non-systematic cyclic code may be generated using (n − k)
shift registers. The following examples illustrate the concepts described so far.
Example 7.2: Consider the generation of a (7, 4) cyclic code. Here (n − k) = (7 − 4) = 3, and we have to
find a generator polynomial of degree 3 which is a factor of Xn + 1 = X7 + 1.
To find the factors of degree 3, divide X7 + 1 by X3 + aX2 + bX + 1, where 'a' and 'b' are binary
numbers, to get the remainder abX2 + (1 + a + b)X + (a + b + ab + 1). The only condition for the remainder
to be zero is a + b = 1, which means either a = 1, b = 0 or a = 0, b = 1. Thus we have two possible
polynomials of degree 3, namely

g1(X) = X3 + X2 + 1 and g2(X) = X3 + X + 1
In fact, X7 + 1 can be factored as:

(X7 + 1) = (X + 1)(X3 + X2 + 1)(X3 + X + 1)

Thus the selection of a 'good' generator polynomial is a major problem in the design of cyclic
codes. No clear-cut procedures are available; usually computer search procedures are followed.
Let us choose g(X) = X3 + X + 1 as the generator polynomial. The encoding circuits are
shown in Fig 7.4(a) and (b).
We have, for the message u = (1 0 1 1), i.e. U(X) = 1 + X2 + X3:

V(X) = U(X) g(X) = (1 + X2 + X3)(1 + X + X3)
     = 1 + X2 + X3 + X + X3 + X4 + X3 + X5 + X6
     = 1 + X + X2 + X3 + X4 + X5 + X6     (because X3 + X3 = 0)

=> v = (1 1 1 1 1 1 1)
The multiplication operation performed by the circuit of Fig 7.4(a) is listed in the table below, step
by step. After the fourth shift, 000 is introduced to flush the registers. As seen from the tabulation, the
product polynomial is:

V(X) = 1 + X + X2 + X3 + X4 + X5 + X6

and hence the output code vector is v = (1 1 1 1 1 1 1), as obtained by direct multiplication. The reader
can verify the operation of the circuit in Fig 7.4(b) in the same manner. Thus the multiplication
circuits of Fig 7.4 can be used for the generation of non-systematic cyclic codes.
Step-by-step operation of the circuit of Fig 7.4(a):

Input queue   Bit shifted in   SR1  SR2  SR3   Output
0001011       -                0    0    0     -
000101        1                1    0    0     1
00010         1                1    1    0     1
0001          0                0    1    1     1
000           1                1    0    1     1
00            0                0    1    0     1
0             0                0    0    1     1
-             0                0    0    0     1
The corresponding tabulation for the circuit of Fig 7.4(b):

Input queue   Bit shifted in   SR1  SR2  SR3   Output
0001011       -                0    0    0     -
000101        1                1    0    0     0
00010         1                1    1    0     0
0001          0                0    1    1     0
000           1                0    1    1     1
00            0                1    1    1     1
0             0                1    0    1     1
-             0                1    0    0     1
The quotient coefficients will be available only after the fourth shift, as the first three shifts
result in entering the first 3 bits into the shift registers, and in each of these shifts the output of the last
register, SR3, is zero.
The quotient coefficients serially presented at the output are seen to be (1 1 1 1), and hence the
quotient polynomial is Q(X) = 1 + X + X2 + X3. The remainder coefficients are (1 0 0), and the
remainder polynomial is R(X) = 1. The polynomial division steps are listed below.
Division Table for Example 7.3 (dividing X6 + X5 + X3 by g(X) = X3 + X + 1):

X6 + X5 + X3
X6 + X4 + X3          (X3 · g(X))
------------
     X5 + X4
     X5 + X3 + X2     (X2 · g(X))
     -----------
          X4 + X3 + X2
          X4 + X2 + X      (X · g(X))
          ------------
               X3 + X
               X3 + X + 1  (1 · g(X))
               ----------
                        1  (remainder)

Quotient: Q(X) = X3 + X2 + X + 1; remainder: R(X) = 1.

In the systematic form, the message digits occupy the last k positions of the code word:

v = (p0, p1, …, pn−k−1, u0, u1, …, uk−1)        (7.15)

so that the code polynomial is

V(X) = p0 + p1X + … + pn−k−1Xn−k−1 + u0Xn−k + u1Xn−k+1 + … + uk−1Xn−1        (7.16)
     = P(X) + Xn−kU(X)        (7.17)
Since the code polynomial is a multiple of the generator polynomial, we can write:

V(X) = P(X) + Xn−kU(X) = Q(X) g(X)        (7.18)

i.e.

Xn−kU(X)/g(X) = Q(X) + P(X)/g(X)        (7.19)

Thus division of Xn−kU(X) by g(X) gives us the quotient polynomial Q(X) and the
remainder polynomial P(X). Therefore, to obtain the cyclic codes in systematic form, we
determine the remainder polynomial P(X) after dividing Xn−kU(X) by g(X). This division process
can be easily achieved by noting that "multiplication by Xn−k amounts to shifting the sequence by
(n − k) bits". Specifically, in the circuit of Fig 7.5(a), if the input A(X) is applied to the mod-2 adder
after the (n − k)-th shift register, the result is the division of Xn−kA(X) by B(X).
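In software, the same systematic encoding is a single polynomial-remainder computation. A sketch reusing the gf2_mod helper from the earlier snippet, applied to the (7, 4) code with g(X) = 1 + X + X3 and the message u = (1 0 1 1) of the worked example below:

    def gf2_mod(a: int, b: int) -> int:
        """Remainder of polynomial a modulo b over GF(2) (bit masks)."""
        while a.bit_length() >= b.bit_length():
            a ^= b << (a.bit_length() - b.bit_length())
        return a

    def encode_systematic(u: int, g: int) -> int:
        """Systematic cyclic encoding: V(X) = P(X) + X^(n-k).U(X),
        where P(X) = remainder of X^(n-k).U(X) divided by g(X)."""
        shift = g.bit_length() - 1      # n - k = degree of g(X)
        shifted = u << shift            # X^(n-k).U(X)
        return shifted ^ gf2_mod(shifted, g)

    g = 0b1011                          # g(X) = 1 + X + X^3
    u = 0b1101                          # U(X) = 1 + X^2 + X^3, u = (1 0 1 1)
    v = encode_systematic(u, g)
    print(format(v, '07b')[::-1])       # low order first: 1001011 = (100 1011)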
Accordingly, we have the following scheme to generate systematic cyclic codes. The
generator polynomial is written as:

g(X) = 1 + g1X + g2X2 + … + gn−k−1Xn−k−1 + Xn−k        (7.20)

The circuit of Fig 7.8 does the job of dividing Xn−kU(X) by g(X). The following steps describe the
encoding operation.
The encoder circuit for the problem on hand is shown in Fig 7.9. The operational steps are as follows:

Shift number   Output
0              -
1              1
2              1
3              0
4              1

After the fourth shift, GATE is turned OFF, switch S is moved to position 2, and the parity bits
contained in the register are shifted to the output. The output code vector is v = (100 1011), which
agrees with the direct hand calculation.
7.6 Syndrome Calculation, Error Detection and Error Correction:

Suppose the code vector v = (v0, v1, v2, …, vn−1) is transmitted over a noisy channel. Hence the
received vector may be a corrupted version of the transmitted code vector. Let the received
vector be r = (r0, r1, r2, …, rn−1). The received vector may not be any one of the 2^k valid code vectors.
The function of the decoder is to determine the transmitted code vector based on the received vector.
The decoder, as in the case of linear block codes, first computes the syndrome to check
whether or not the received vector is a valid code vector. In the case of cyclic codes, if the
syndrome is zero, then the received code word polynomial is divisible by the generator
polynomial. If the syndrome is non-zero, the received word contains transmission errors and needs
error correction. Let the received vector be represented by the polynomial
R(X) = r0 + r1X + r2X2 + … + rn−1Xn−1

Let A(X) be the quotient and S(X) the remainder polynomials resulting from the division of
R(X) by g(X), i.e.

R(X)/g(X) = A(X) + S(X)/g(X)        (7.21)

The remainder S(X) is a polynomial of degree (n − k − 1) or less. It is called the "syndrome polynomial".
If E(X) is the polynomial representing the error pattern caused by the channel, then we have:

R(X) = V(X) + E(X)        (7.22)

and it follows, as V(X) = U(X) g(X), that:

E(X) = [A(X) + U(X)] g(X) + S(X)        (7.23)

That is, the syndrome of R(X) is equal to the remainder resulting from dividing the error pattern by
the generator polynomial, and the syndrome contains information about the error pattern, which can
be used for error correction. Hence syndrome calculation can be accomplished using the divider circuits
discussed in Sec 7.4, Fig 7.5. A syndrome calculator is shown in Fig 7.10.
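Computing S(X) in software is one call to the GF(2) remainder helper. A sketch continuing the bit-mask convention (the single-bit error below is our own illustrative choice):

    def gf2_mod(a: int, b: int) -> int:
        """Remainder of polynomial a modulo b over GF(2) (bit masks)."""
        while a.bit_length() >= b.bit_length():
            a ^= b << (a.bit_length() - b.bit_length())
        return a

    g = 0b1011                  # g(X) = 1 + X + X^3, (7, 4) code
    v = 0b1101001               # valid code word from the earlier example
    print(gf2_mod(v, g))        # -> 0: zero syndrome for a code word

    r = v ^ (1 << 4)            # flip one bit: E(X) = X^4
    print(bin(gf2_mod(r, g)))   # nonzero syndrome = X^4 mod g(X)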
1. With GATE-2 ON and GATE-1 OFF, the received vector is shifted into the register.
2. After the entire received vector is shifted into the register, the contents of the register will be
the syndrome, which can be shifted out of the register by turning GATE-1 ON and GATE-2
OFF. The circuit is then ready for processing the next received vector.
Cyclic codes are extremely well suited for error detection: they can be designed to detect
many combinations of likely errors, and the implementation of error-detecting and error-correcting
circuits is practical and simple. Error detection can be achieved by adding an
additional R-S flip-flop to the syndrome calculator. If the syndrome is nonzero, the flip-flop sets and
provides an indication of error. Because of this ease of implementation, virtually all error-detecting
codes in use are cyclic codes. If we are interested in error correction, then the decoder must be
capable of determining the error pattern E(X) from the syndrome S(X) and adding it to R(X) to
determine the transmitted V(X). The scheme shown in Fig 7.11 may be employed for the
purpose. The error correction procedure consists of the following steps:
Step 1: The received data is shifted into the buffer register and syndrome register with switches
SIN closed and SOUT open; error correction is then performed with SIN open and SOUT
closed.

Step 2: After the syndrome for the received code word is calculated and placed in the
syndrome register, the contents are read into the error detector. The detector is a
combinatorial circuit designed to output a '1' if and only if the syndrome corresponds
to a correctable error pattern with an error at the highest-order position Xn−1. That is, if
the detector output is a '1', then the received digit at the right-most stage of the buffer
register is assumed to be in error and will be corrected. If the detector output is '0', then
the received digit at the right-most stage of the buffer is assumed to be correct. Thus
the detector output is the estimated error value for the digit coming out of the buffer
register.

Step 3: In the third step, the syndrome register and the buffer register are shifted right once. If
the first received digit is in error, the detector output will be '1', which is used for error
correction. The output of the detector is also fed back to the syndrome register to modify
the syndrome. This results in a new syndrome corresponding to the altered received
code word shifted to the right by one place.

Step 4: The new syndrome is now used to check whether the second received digit, which
is now at the right-most position, is an erroneous digit. If so, it is corrected, a new
syndrome is calculated as in Step 3, and the procedure is repeated.

Step 5: The decoder operates on the received data digit by digit until the entire
received code word is shifted out of the buffer.
At the end of the decoding operation, that is, after the received code word is shifted out of the
buffer, all those errors corresponding to correctable error patterns will have been corrected, and the
syndrome register will contain all zeros. If the syndrome register does not contain all zeros, this
means that an uncorrectable error pattern has been detected. The decoding schemes described in Fig
7.10 and Fig 7.11 can be used for any cyclic code. However, their practicality depends on the
complexity of the combinational logic circuits of the error detector. In fact, there are special classes
of cyclic codes for which the decoder can be realized by simpler circuits. However, the price paid for
such simplicity is a reduction of code efficiency for a given block size.
7.7 Bose-Chaudhuri-Hocquenghem (BCH) Codes:

One of the major considerations in the design of optimum codes is to make the block size
n smallest for a given size k of the message block so as to obtain a desirable value of dmin; or, for
a given code length n and efficiency k/n, one may wish to design codes with the largest dmin. That
means we are on the lookout for codes that have the best error correcting capabilities. The
BCH codes, as a class, are one of the most important and powerful error-correcting cyclic codes
known. The most common BCH codes are characterized as follows: for any
positive integer m ≥ 3 and t < (2^m − 1)/2, there exists a binary BCH code (called a 'primitive' BCH
code) with the following parameters:
Block length: n = 2^m − 1
Number of message bits: k ≥ n − mt
Minimum distance: dmin ≥ 2t + 1
Clearly, BCH codes are "t-error correcting codes": they can detect and correct up to t
random errors per code word. The Hamming SEC codes can also be described as BCH codes. The
BCH codes are the best known codes among those with block lengths of a few hundred or less.
The major advantage of these codes lies in the flexibility in the choice of code parameters, viz. block
length and code rate. The parameters of some useful BCH codes are given below, along with
the generator polynomials for block lengths up to 31.
NOTE: The higher-order coefficients of the generator polynomial are at the left. For example, if we are
interested in constructing a (15, 7) BCH code, from the table we have (111 010 001) for the
coefficients of the generator polynomial, and hence

g(X) = 1 + X4 + X6 + X7 + X8
n    k    t    Generator polynomial
7    4    1    1 011
15   11   1    10 011
15   7    2    111 010 001
15   5    3    10 100 110 111
31   26   1    100 101
31   21   2    11 101 101 001
31   16   3    1 000 111 110 101 111
31   11   5    101 100 010 011 011 010 101
31   6    7    11 001 011 011 110 101 000 100 111
For further higher-order codes, the reader can refer to Shu Lin and Costello Jr. The alphabet of
a BCH code for n = (2^m − 1) may be represented as the set of elements of an appropriate Galois field
GF(2^m) whose primitive element is α. The generator polynomial of the t-error correcting BCH code
is the least common multiple (LCM) of M1(X), M2(X), …, M2t(X), where Mi(X) is the minimum
polynomial of αi, i = 1, 2, …, 2t. For further details of the procedure and discussions the reader can
refer to J. Das et al.
There are several iterative procedures available for decoding of BCH codes. The majority of them
can be programmed on a general-purpose digital computer, which in many practical applications forms
an integral part of a data communication network. Clearly, in such systems software implementation
of the algorithms has several advantages over hardware implementation.
7.8 Cyclic Redundancy Check (CRC) Codes:

Cyclic redundancy check codes are extremely well suited for error detection. The two important
reasons for this statement are: (1) they can be designed to detect many combinations of likely errors;
(2) the implementation of both encoding and error-detecting circuits is practical. Accordingly,
virtually all error-detecting codes used in practice are of the CRC type. In an n-bit received word, if a
contiguous sequence of b bits in which the first and the last bits and any number of intermediate
bits are received in error, then we say a CRC "error burst" of length b has occurred. Such an error
burst may also include an end-shifted (wrap-around) version of the contiguous sequence.
In any event, binary (n, k) CRC codes are capable of detecting the following error patterns:
1. All CRC error bursts of length (n − k) or less.
2. A fraction 1 − 2^−(n−k−1) of CRC error bursts of length (n − k + 1).
3. A fraction 1 − 2^−(n−k) of CRC error bursts of length greater than (n − k + 1).
4. All combinations of (dmin − 1) or fewer errors.
5. All error patterns with an odd number of errors, if the generator polynomial
g(X) has an even number of nonzero coefficients.
The generator polynomials of three CRC codes, internationally accepted as standards, are listed below.
All three contain (1 + X) as a prime factor. The CRC-12 code is used when the character length is
6 bits; the others are used for 8-bit characters.

CRC-12 code: g(X) = 1 + X + X2 + X3 + X11 + X12
CRC-16 code: g(X) = 1 + X2 + X15 + X16
CRC-CCITT code: g(X) = 1 + X5 + X12 + X16

(Expansion of CCITT: "Comité Consultatif International Téléphonique et Télégraphique", a
Geneva-based organization made up of telephone companies from all over the world.)
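As an illustration, the CRC-CCITT polynomial can be plugged into the same GF(2) remainder helper to append and verify a checksum. A sketch (the message value and framing are our own; practical CRC-CCITT implementations also fix details such as the initial register value and bit ordering, which are not covered here):

    def gf2_mod(a: int, b: int) -> int:
        """Remainder of polynomial a modulo b over GF(2) (bit masks)."""
        while a.bit_length() >= b.bit_length():
            a ^= b << (a.bit_length() - b.bit_length())
        return a

    G_CCITT = (1 << 16) | (1 << 12) | (1 << 5) | 1  # 1 + X^5 + X^12 + X^16

    def crc_append(msg: int) -> int:
        """Append 16 CRC bits: transmit X^16.M(X) plus the remainder."""
        shifted = msg << 16
        return shifted | gf2_mod(shifted, G_CCITT)

    def crc_check(word: int) -> bool:
        """A received word is accepted iff it is divisible by g(X)."""
        return gf2_mod(word, G_CCITT) == 0

    tx = crc_append(0b110100111010101)   # arbitrary example message
    print(crc_check(tx))                 # True: no errors
    print(crc_check(tx ^ (1 << 7)))      # False: single-bit error detected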
Maximum length codes have the parameters:

Block length: n = 2^m − 1
Number of message bits: k = m
Minimum distance: dmin = 2^(m−1)

Maximum length codes are generated by polynomials of the form

g(X) = (1 + Xn)/p(X)

where p(X) is a primitive polynomial of degree m. Notice that any cyclic
code generated by a primitive polynomial is a Hamming code with dmin = 3. It follows then that the
maximum length codes are the 'duals' of the Hamming codes. These codes are also referred to as
'pseudo-noise (PN) codes' or "simplex codes". Their error
correcting capabilities, for most interesting values of code length and efficiency, are much inferior to
BCH codes. The main advantage is that the decoding can be performed using simple circuits. The
concepts are illustrated here with two examples.
Consider a (7, 3) simplex code, which is the dual of the (7, 4) Hamming code. Here dmin = 4 and t = 1.
This code is generated by G and the corresponding parity check matrix H given below:

      | 1 0 1 1 1 0 0 |
  G = | 1 1 1 0 0 1 0 |
      | 0 1 1 1 0 0 1 |
      | 1 0 0 0 1 1 0 |
  H = | 0 1 0 0 0 1 1 |
      | 0 0 1 0 1 1 1 |
      | 0 0 0 1 1 0 1 |
The error vector e = (e0, e1, e2, e3, e4, e5, e6) is checked by forming the syndromes:

s0 = e0 + e4 + e5;        s1 = e1 + e5 + e6;
s2 = e2 + e4 + e5 + e6;   s3 = e3 + e4 + e6
Thus all the bits are checked by successive shifts and the corrected V(X) is reloaded into the buffer. It is
possible to correct single errors by using only two of the check sums. However, by using three check
sums, the decoder also corrects some double error patterns. The decoder will correct all single errors
and detect all double error patterns if the decision is made on the basis of:

(i) A1 = A2 = A3 = 1 for single errors;
(ii) one or more checks failing for double errors.
We have devised the majority logic decoder assuming it is a block code. However, we should
not forget that it is also a cyclic code with generator polynomial

g(X) = 1 + X2 + X3 + X4

One could therefore generate the syndromes at the decoder by using a divider circuit, as already
discussed. An alternative format for the decoder is shown in Fig 7.17. Successive bits are checked for
single errors in the block. The feedback shown is optional: it will be needed if it is
desired to correct some double error patterns.
s0 = e0 + e3 + (e5 + e6)
s1 = e1 + e3 + e4 + e5
s2 = e2 + e4 + (e5 + e6) = e2 + e5 + (e4 + e6)
Check sum A1 = (s0 + s1) = e0 + e1 + (e4 + e6)
It is seen that s0 and s2 are orthogonal on B1 = (e5 + e6), as both of them provide a check on this sum. Similarly, A1 and s2 are orthogonal on B2 = (e4 + e6). Further, B1 and B2 are orthogonal on e6. Therefore it is clear that a two-step majority vote will locate an error on e6. The corresponding decoder is shown in Fig 7.18, where the second-level majority logic circuit gives the correction signal, and the stored R(X) is corrected as the bits are read out from the buffer.
Correct decoding is achieved if t ≤ d/2 = 1 error (d = number of steps of majority vote). The circuit provides a majority vote of '1' when the syndrome state is {1 0 1}. The basic principles of both types of decoders, however, are the same. Detailed discussions on the general principles of majority logic decoding may be found in Shu Lin and Costello Jr., J. Das et al. and other standard books on error control coding. The idea of this section was only to introduce the reader to the concept of majority logic decoding.
The Hamming codes (2^m − 1, 2^m − m − 1), m any integer, are majority logic decodable. The (15, 7) BCH code with t = 2 is 1-step majority logic decodable. Reed-Muller codes, maximum length (simplex) codes, difference-set codes and a sub-class of convolutional codes are other examples of majority logic decodable codes.
The (23, 12) Golay code is a perfect binary code with dmin = 7 and t = 3; it satisfies the Hamming bound with equality, since

2^(23 − 12) = 2^11 = 2048 = C(23,0) + C(23,1) + C(23,2) + C(23,3) = 1 + 23 + 253 + 1771

The code has been used in many practical systems. The generator polynomial for the code is obtained from the relation (X^23 + 1) = (X + 1) g1(X) g2(X), where:

g1(X) = 1 + X^2 + X^4 + X^5 + X^6 + X^10 + X^11
g2(X) = 1 + X + X^5 + X^6 + X^7 + X^9 + X^11

Either g1(X) or g2(X) may be used as the generator polynomial of the code.
CHAPTER 8
CONVOLUTIONAL CODES
In block codes, a block of n digits generated by the encoder in a particular time unit depends only on the block of k data digits within that time unit. Such codes can be generated by combinatorial logic circuits. In a convolutional code, the block of n digits generated by the encoder in a time unit depends not only on the block of k data digits within that time unit, but also on the preceding m input blocks. An (n, k, m) convolutional code can be implemented with a k-input, n-output sequential circuit with input memory m. Generally, k and n are small integers with k < n, but the memory order m must be made large to achieve low error probabilities. In the important special case when k = 1, the information sequence is not divided into blocks and can be processed continuously.
Similar to block codes, convolutional codes can be designed to either detect or correct errors.
However, since the data are usually re-transmitted in blocks, block codes are better suited for error
detection and convolutional codes are mainly used for error correction.
Convolutional codes were first introduced by Elias in 1955 as an alternative to block codes. This was followed later by Wozencraft, Massey, Fano, Viterbi, Omura and others. A detailed discussion and survey of the application of convolutional codes to practical communication channels can be found in Shu Lin & Costello Jr., J. Das et al. and other standard books on error control coding.
To facilitate easy understanding, we follow the popular methods of representing convolutional encoders, starting with a connection pictorial (needed for all descriptions), followed by connection vectors.
At each input bit time, one bit is shifted into the left-most stage and the bits already present in the registers are shifted one position to the right. The output switch (commutator/MUX) samples the output of each X-OR gate and forms the code symbol pairs for the bits introduced. The final code is obtained after flushing the encoder with m zeros, where m is the memory order (in Fig 8.1, m = 2). The sequence of operations performed by the encoder of Fig 8.1 for an input sequence u = (1 0 1) is illustrated in Fig 8.2.
From Fig 8.2, the encoding procedure can be understood clearly. Initially the registers are in
Re-set mode i.e. (0, 0). At the first time unit the input bit is 1. This bit enters the first register and
pushes out its previous content namely 0 as shown, which will now enter the second register and
pushes out its previous content. All these bits as indicated are passed on to the X-OR gates and the
output pair (1, 1) is obtained. The same steps are repeated until time unit 4, where zeros are
introduced to clear the register contents producing two more output pairs. At time unit 6, if an
additional 0 is introduced the encoder is re-set and the output pair (0, 0) obtained. However, this
step is not absolutely necessary as the next bit, whatever it is, will flush out the content of the second
register. The 0 and the 1 indicated at the output of the second register at time unit 5 now vanish.
Hence after (L+m) = 3 + 2 = 5 time units, the output sequence will read v = (11, 10, 00, 10, 11).
(Note: L = length of the input sequence). This then is the code word produced by the encoder. It is
very important to remember that Left most symbols represent earliest transmission.
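The stepwise operation of Fig 8.2 can be condensed into a short routine. The sketch below is ours; the connection vectors g(1) = (1, 1, 1) and g(2) = (1, 0, 1) are an assumption inferred from the output pairs quoted above, since Fig 8.1 itself is not reproduced here:

# Sketch of the shift-register encoder of Fig 8.1 (m = 2), assuming the
# connection vectors g1 = (1,1,1) and g2 = (1,0,1) inferred from the text.

def conv_encode(u, g1=(1, 1, 1), g2=(1, 0, 1)):
    m = len(g1) - 1                      # memory order
    bits = list(u) + [0] * m             # append m zeros to flush the encoder
    reg = [0] * m                        # shift register, initially reset
    out = []
    for b in bits:
        window = [b] + reg               # current input followed by memory
        v1 = sum(x * g for x, g in zip(window, g1)) % 2
        v2 = sum(x * g for x, g in zip(window, g2)) % 2
        out.append((v1, v2))             # commutator samples both adders
        reg = [b] + reg[:-1]             # shift right by one position
    return out

print(conv_encode([1, 0, 1]))  # -> [(1,1), (1,0), (0,0), (1,0), (1,1)]

Running it on u = (1 0 1) reproduces v = (11, 10, 00, 10, 11), including the two pairs generated while flushing with m = 2 zeros.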
As already mentioned, convolutional codes are intended for the purpose of error correction. However, they suffer from the problem of choosing connections to yield good distance properties. The selection of connections is indeed very complicated and has not been solved in general; still, good codes have been developed by computer search techniques for all constraint lengths less than 20. Another point to be noted is that convolutional codes do not have any particular block size; they can be periodically truncated. They only require m zeros to be appended to the end of the input sequence, for the purpose of clearing (flushing, re-setting) the data bits out of the encoding shift registers. These added zeros carry no information but have the effect of reducing the code rate below (k/n). To keep the code rate close to (k/n), the truncation period is generally made as long as practical.
The encoding procedure as depicted pictorially in Fig 8.2 is rather tedious. We can instead approach the encoder in terms of its impulse response or generator sequences, which merely represent the response of the encoder to a single '1' input bit.

8.2
The encoder for a (2, 1, 3) code is shown in Fig 8.3. Here the encoder consists of an m = 3 stage shift register, n = 2 modulo-2 adders (X-OR gates) and a multiplexer for serializing the encoder outputs. Notice that modulo-2 addition is a linear operation, and it follows that all convolutional encoders can be implemented using a linear feed-forward shift register circuit.
The information sequence u = (u1, u2, u3, ...) enters the encoder one bit at a time, starting from u1. As the name implies, a convolutional encoder operates by performing convolutions on the information sequence. Specifically, the encoder output sequences, in this case v(1) = (v1(1), v2(1), v3(1), ...) and v(2) = (v1(2), v2(2), v3(2), ...), are obtained by the discrete convolution of the information sequence with the encoder 'impulse responses'. The impulse responses are obtained by determining the output sequences of the encoder produced by the input sequence u = (1, 0, 0, 0, ...). The impulse responses so defined are called the 'generator sequences' of the code. Since the encoder has an m-time-unit memory, the impulse responses can last at most (m + 1) time units (that is, a total of (m + 1) shifts are necessary for a message bit to enter the shift register and finally come out) and are written as g(1) = (g1(1), g2(1), ..., gm+1(1)) and g(2) = (g1(2), g2(2), ..., gm+1(2)). The encoding equations are then:
v(1) = u * g(1) ..... (8.1a)
v(2) = u * g(2) ..... (8.1b)

where * denotes discrete convolution. The convolution operation implies that, for all l ≥ 1:

vl(j) = Σ (i = 0 to m) ul-i · gi+1(j) = ul g1(j) + ul-1 g2(j) + ... + ul-m gm+1(j) ..... (8.2)
for j = 1, 2, where ul-i = 0 for all l ≤ i, and all operations are modulo-2. Hence, for the encoder of Fig 8.3, we have:

vl(1) = ul + ul-2 + ul-3
vl(2) = ul + ul-1 + ul-2 + ul-3
This can be easily verified by direct inspection of the encoding circuit. After encoding, the two output sequences are multiplexed into a single sequence, called the 'code word', for transmission over the channel. The code word is given by:

v = (v1(1) v1(2), v2(1) v2(2), v3(1) v3(2), ...)

Example 8.1:
For the encoder of Fig 8.3, let u = (1 0 1 1 1). The interleaved impulse response (the output produced by a single '1') is (11, 01, 11, 11), and the code word can be obtained by superposing one shifted copy of this impulse response for every '1' in the input:
INPUT    OUTPUT
  1      11 01 11 11
  0      00 00 00 00                         (one branch word shifted)
  1            11 01 11 11                   (two branch words shifted)
  1               11 01 11 11                (three branch words shifted)
  1                  11 01 11 11             (four branch words shifted)

Modulo-2 sum:  11 01 00 01 01 01 00 11
The modulo-2 sum represents the same sequence as obtained before, and there is no confusion with respect to indices and suffixes. This very simple approach, superposition or linear addition of shifted impulse responses, demonstrates that convolutional codes are linear codes, just like block codes and cyclic codes. It also permits us to define a 'generator matrix' for the convolutional encoder. Remember that interlacing the generator sequences gives the overall impulse response, and hence they are used as the rows of the matrix. The number of rows equals the number of information digits, and therefore the matrix that results is semi-infinite. The second and subsequent rows of the matrix are merely shifted versions of the first row; they are each shifted with respect to the previous row by one 'branch word'. If the information sequence u has a finite length L, then G has L rows and n(m + L) columns (or (m + L) branch-word columns), and v has a length of n(m + L), i.e. (m + L) branch words. Each branch word is of length n.
Thus the generator matrix G, for encoders of the type shown in Fig 8.3, is written as:

G = [ g1(1)g1(2)  g2(1)g2(2)  g3(1)g3(2)  g4(1)g4(2)
                  g1(1)g1(2)  g2(1)g2(2)  g3(1)g3(2)  g4(1)g4(2)
                              g1(1)g1(2)  g2(1)g2(2)  g3(1)g3(2)  g4(1)g4(2)
                                          .....                             ] ..... (8.3)

where the blank entries are all zeros, and the encoding equations in matrix form are:

v = u G ..... (8.4)
Example 8.2:
For the information sequence of Example 8.1, the G matrix has 5 rows and 2(3 + 5) = 16 columns, and we have:

G = [ 1 1 0 1 1 1 1 1 0 0 0 0 0 0 0 0
      0 0 1 1 0 1 1 1 1 1 0 0 0 0 0 0
      0 0 0 0 1 1 0 1 1 1 1 1 0 0 0 0
      0 0 0 0 0 0 1 1 0 1 1 1 1 1 0 0
      0 0 0 0 0 0 0 0 1 1 0 1 1 1 1 1 ]

Performing the multiplication v = u G as per Eq (8.4), we get v = (11, 01, 00, 01, 01, 01, 00, 11), the same as before.
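The structure of Eq (8.3) also suggests a direct construction: every row of G is the interleaved impulse response shifted right by one branch word. A small sketch (names ours) that builds G and checks v = u G for this example:

# Sketch: building the generator matrix of Eq (8.3) for the (2,1,3) encoder
# and checking v = uG for Example 8.2.

g1, g2 = (1, 0, 1, 1), (1, 1, 1, 1)                 # impulse responses of Fig 8.3
row0 = [b for pair in zip(g1, g2) for b in pair]    # interleaved: 11 01 11 11

def generator_matrix(L, n=2, m=3):
    cols = n * (L + m)
    G = []
    for i in range(L):                   # each row shifted by one branch word
        row = [0] * (n * i) + row0 + [0] * (cols - n * i - len(row0))
        G.append(row)
    return G

u = [1, 0, 1, 1, 1]
G = generator_matrix(len(u))
v = [sum(ui * gij for ui, gij in zip(u, col)) % 2 for col in zip(*G)]
print(v)   # -> 1101 0001 0101 0011, i.e. (11, 01, 00, 01, 01, 01, 00, 11)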
As a second example of a convolutional encoder, consider the (3, 2, 1) encoder shown in Fig 8.4. Here, as k = 2, the encoder consists of two m = 1 stage shift registers, together with n = 3 modulo-2 adders and two multiplexers. The information sequence enters the encoder k = 2 bits at a time and can be written as u = (u1(1) u1(2), u2(1) u2(2), u3(1) u3(2), ...) or as two separate input sequences:

u(1) = (u1(1), u2(1), u3(1), ...) and u(2) = (u1(2), u2(2), u3(2), ...).
There are three generator sequences corresponding to each input sequence. Letting gi(j) = (gi,1(j), gi,2(j), ..., gi,m+1(j)) represent the generator sequence corresponding to input i and output j, the generator sequences of the encoder of Fig 8.4 are:

g1(1) = (1 1), g1(2) = (1 0), g1(3) = (1 0)
g2(1) = (0 1), g2(2) = (1 1), g2(3) = (0 0)

and the encoding equations are:

v(1) = u(1) * g1(1) + u(2) * g2(1) ..... (8.5a)
v(2) = u(1) * g1(2) + u(2) * g2(2) ..... (8.5b)
v(3) = u(1) * g1(3) + u(2) * g2(3) ..... (8.5c)

After multiplexing, the code word is given by:

v = (v1(1) v1(2) v1(3), v2(1) v2(2) v2(3), v3(1) v3(2) v3(3), ...)

Example 8.3:
Suppose u = (1 1, 0 1, 1 0). Hence u(1) = (1 0 1) and u(2) = (1 1 0). Then

v(1) = (1 0 1) * (1 1) + (1 1 0) * (0 1) = (1 1 1 1) + (0 1 1 0) = (1 0 0 1)
v(2) = (1 0 1) * (1 0) + (1 1 0) * (1 1) = (1 0 1 0) + (1 0 1 0) = (0 0 0 0)
v(3) = (1 0 1) * (1 0) + (1 1 0) * (0 0) = (1 0 1 0)

and v = (1 0 1, 0 0 0, 0 0 1, 1 0 0).
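Since Eqs (8.5a-c) are ordinary modulo-2 convolutions, the computation above can be checked mechanically. A short sketch (ours), using the generator sequences reconstructed above for the Fig 8.4 encoder:

# Sketch: verifying Example 8.3 by discrete convolution (all arithmetic mod 2).

def conv(u, g):
    out = [0] * (len(u) + len(g) - 1)
    for i, ui in enumerate(u):
        for j, gj in enumerate(g):
            out[i + j] ^= ui & gj
    return out

def vec_add(a, b):
    return [x ^ y for x, y in zip(a, b)]

u1, u2 = [1, 0, 1], [1, 1, 0]
# generator sequences of the Fig 8.4 encoder (as reconstructed above)
v1 = vec_add(conv(u1, [1, 1]), conv(u2, [0, 1]))   # Eq (8.5a)
v2 = vec_add(conv(u1, [1, 0]), conv(u2, [1, 1]))   # Eq (8.5b)
v3 = vec_add(conv(u1, [1, 0]), conv(u2, [0, 0]))   # Eq (8.5c)
print(list(zip(v1, v2, v3)))   # -> [(1,0,1), (0,0,0), (0,0,1), (1,0,0)]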
The generator matrix for a (3, 2, m) code can be written as:

G = [ g1,1(1)g1,1(2)g1,1(3)  g1,2(1)g1,2(2)g1,2(3)  ...  g1,m+1(1)g1,m+1(2)g1,m+1(3)
      g2,1(1)g2,1(2)g2,1(3)  g2,2(1)g2,2(2)g2,2(3)  ...  g2,m+1(1)g2,m+1(2)g2,m+1(3)
                             g1,1(1)g1,1(2)g1,1(3)  ...
                             g2,1(1)g2,1(2)g2,1(3)  ...
                                                    .....                          ] ..... (8.6)
The encoding equations in matrix form are again given by v = u G. Observe that each set of k = 2 rows of G is identical to the preceding set of rows, but shifted n = 3 places, or one branch word, to the right.
Example 8.4:
For Example 8.3, we have:

G = [ 1 1 1, 1 0 0
      0 1 0, 1 1 0
             1 1 1, 1 0 0
             0 1 0, 1 1 0
                    1 1 1, 1 0 0
                    0 1 0, 1 1 0 ]

(Remember that the blank places in the matrix are all zeros.)
Performing the matrix multiplication, v = u G, we get: v = (101,000,001,100), again agreeing
with our previous computation using discrete convolution.
This second example clearly demonstrates the complexities involved in describing the code when the number of input sequences is increased beyond k = 1. In this case, although the encoder contains k shift registers, they need not all have the same length. If Ki is the length of the i-th shift register, then we define the encoder memory order m by:
m = max Ki, 1 ≤ i ≤ k ..... (8.7)
Since each information bit remains in the encoder up to (m + 1) time units and during each
time unit it can affect any of the n-encoder outputs (which depends on the shift register connections)
it follows that "the maximum number of encoder outputs that can be affected by a single
information bit" is
nA = n(m + 1) ..... (8.8)
nA is called the 'constraint length' of the code. For example, the constraint lengths of the encoders of Figures 8.3, 8.4 and 8.5 are 8, 6 and 12 respectively. Some authors (for example, Simon Haykin) define the constraint length as the number of shifts over which a single message bit can influence the encoder output; in an encoder with an m-stage shift register, the memory of the encoder equals m message bits, and the constraint length in this sense is (m + 1). However, we shall adopt the definition given in Eq (8.8).
The number of shifts over which a single message bit can influence the encoder output is usually denoted by K. The encoders of Figures 8.3, 8.4 and 8.5 have K = 4, 2 and 3 respectively. The encoder of Fig 8.3 will accordingly be labeled as a rate 1/2, K = 4 convolutional encoder. The term K also signifies the number of branch words in the encoder's impulse response.
Turning back, in the general case of an (n, k, m) code, the generator matrix can be put in the
form:
G = [ G1 G2 G3 ... Gm Gm+1
         G1 G2 ... Gm-1 Gm Gm+1
            G1 ... Gm-2 Gm-1 Gm Gm+1
                   .....             ] ..... (8.9)
in which each Gi is a (k × n) sub-matrix given by:

Gi = [ g1,i(1) g1,i(2) ... g1,i(n)
       g2,i(1) g2,i(2) ... g2,i(n)
       ...........................
       gk,i(1) gk,i(2) ... gk,i(n) ] ..... (8.10)
Notice that each set of k rows of G is identical to the previous set of rows, but shifted n places to the right. For an information sequence u = (u1, u2, ...), where ui = (ui(1), ui(2), ..., ui(k)), the code word is v = (v1, v2, ...), where vj = (vj(1), vj(2), ..., vj(n)) and v = u G. Since the code word is a linear combination of rows of the G matrix, it follows that an (n, k, m) convolutional code is a linear code.
Since the convolutional encoder generates n encoded bits for each k message bits, we define R = k/n as the 'code rate'. However, an information sequence of finite length L is encoded into a code word of length n(L + m), where the final nm outputs are generated after the last non-zero information block has entered the encoder. That is, an information sequence is terminated with all-zero blocks in order to clear the encoder memory. (To appreciate this fact, examine the calculations of vl(j) for Examples 8.1 and 8.3.) The terminating sequence of m zeros is called the 'tail of the message'. Viewing the convolutional code as a linear block code with generator matrix G, the block code rate is given by kL/n(L + m), the ratio of the number of message bits to the length of the code word. If L >> m, then L/(L + m) ≈ 1, and the rate of a convolutional code and its rate when viewed as a block code are approximately the same. In fact, this is the normal mode of operation for convolutional codes, and accordingly we shall not distinguish between the two. On the contrary, if L were small, the effective rate of transmission would indeed be kL/n(L + m), below the code rate by a fractional amount:
transmission indeed is kL/n (L + m) and will be below the block code rate by a fractional amount:
k / n kL / n( L m )
m
..
(8.11)
k/n
L m
which is called the 'fractional rate loss'. Therefore, in order to keep the fractional rate loss at a minimum (near zero), L is always assumed to be much larger than m. For the information sequence of Example 8.1, we have L = 5, m = 3 and fractional rate loss = 3/8 = 37.5%. If L is made 1000, the fractional rate loss is only 3/1003 ≈ 0.3%.
8.3
In any linear system, we know that the time domain operation involving the convolution
integral can be replaced by the more convenient transform domain operation, involving polynomial
multiplication. Since a convolutional encoder can be viewed as a 'linear time invariant finite state
machine, we may simplify computation of the adder outputs by applying appropriate transformation.
As is done in cyclic codes, each 'sequence in the encoding equations can' be replaced by a
corresponding polynomial and the convolution operation replaced by polynomial multiplication. For
example, for a (2, 1, m) code, the encoding equations become:
v(1)(X) = u(X) g(1)(X) ..... (8.12a)
v(2)(X) = u(X) g(2)(X) ..... (8.12b)

where g(1)(X) and g(2)(X) are the generator polynomials of the code, and all operations are modulo-2. After multiplexing, the code word becomes:
v(X) = v(1)(X^2) + X v(2)(X^2) ..... (8.13)
The indeterminate 'X' can be regarded as a unit-delay operator, the power of X defining the
number of time units by which the associated bit is delayed with respect to the initial bit in the
sequence.
Example 8.5:
For the (2, 1, 3) encoder of Fig 8.3, the impulse responses were g(1) = (1, 0, 1, 1) and g(2) = (1, 1, 1, 1). The generator polynomials are g(1)(X) = 1 + X^2 + X^3 and g(2)(X) = 1 + X + X^2 + X^3.
For the information sequence u = (1, 0, 1, 1, 1), the information polynomial is u(X) = 1 + X^2 + X^3 + X^4. The two code polynomials are then:

v(1)(X) = u(X) g(1)(X) = (1 + X^2 + X^3 + X^4)(1 + X^2 + X^3) = 1 + X^7
v(2)(X) = u(X) g(2)(X) = (1 + X^2 + X^3 + X^4)(1 + X + X^2 + X^3) = 1 + X + X^3 + X^4 + X^5 + X^7

From the polynomials so obtained we can immediately write:

v(1) = (1 0 0 0 0 0 0 1) and v(2) = (1 1 0 1 1 1 0 1)

Pairing the components we then get the code word v = (11, 01, 00, 01, 01, 01, 00, 11).
We may use the multiplexing technique of Eq (8.13) and write:

v(1)(X^2) = 1 + X^14
v(2)(X^2) = 1 + X^2 + X^6 + X^8 + X^10 + X^14;  X v(2)(X^2) = X + X^3 + X^7 + X^9 + X^11 + X^15

and the code polynomial is:

v(X) = v(1)(X^2) + X v(2)(X^2) = 1 + X + X^3 + X^7 + X^9 + X^11 + X^14 + X^15
Hence the code word is: v = (1 1, 0 1, 0 0, 0 1, 0 1, 0 1, 0 0, 1 1); this is exactly the same as
obtained earlier.
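The polynomial arithmetic of this example maps naturally onto carry-less integer operations, with bit i of an integer holding the coefficient of X^i. A sketch (helper names ours) that reproduces the computation:

# Sketch: transform-domain encoding of Example 8.5 using integers as
# GF(2) polynomials (bit i of the integer = coefficient of X^i).

def gf2_mul(a: int, b: int) -> int:
    """Multiply two GF(2) polynomials (carry-less multiplication)."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        b >>= 1
    return p

def spread(p: int) -> int:
    """Map v(X) -> v(X^2) by spacing the bits two apart."""
    out, i = 0, 0
    while p:
        if p & 1:
            out |= 1 << (2 * i)
        p >>= 1
        i += 1
    return out

u  = 0b11101          # u(X)  = 1 + X^2 + X^3 + X^4  (LSB = X^0)
g1 = 0b1101           # g1(X) = 1 + X^2 + X^3
g2 = 0b1111           # g2(X) = 1 + X + X^2 + X^3

v1, v2 = gf2_mul(u, g1), gf2_mul(u, g2)
v = spread(v1) ^ (spread(v2) << 1)        # v(X) = v1(X^2) + X v2(X^2)
print(f"{v1:08b} {v2:08b} {v:016b}")

Reading the 16 bits of v LSB-first in pairs gives (11, 01, 00, 01, 01, 01, 00, 11), matching the multiplexed code word above.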
The generator polynomials of an encoder can be determined directly from its circuit diagram. Specifically, the coefficient of X^l is a '1' if there is a connection from the l-th shift register stage to the input of the adder of interest, and a '0' otherwise. Since the last stage of the shift register in an (n, 1) code must be connected to at least one output, at least one generator polynomial must have a degree equal to the shift register length m, i.e.:

m = max deg g(j)(X), 1 ≤ j ≤ n ..... (8.14)
In an (n, k) code, where k > 1, there are n-generator polynomials for each of the k-inputs,
each set representing the connections from one of the shift registers to the n-outputs. Hence, the
length Kl of the l-th shift register is given by:

Kl = max deg gl(j)(X), 1 ≤ j ≤ n, for each 1 ≤ l ≤ k ..... (8.15)
where gl(j)(X) is the generator polynomial relating the l-th input to the j-th output, and the encoder memory order m is:

m = max Kl = max deg gl(j)(X), taken over 1 ≤ l ≤ k and 1 ≤ j ≤ n ..... (8.16)
Since the encoder is a linear system, with u(l)(X) representing the l-th input sequence and v(j)(X) representing the j-th output sequence, the generator polynomial gl(j)(X) can be regarded as the 'encoder transfer function' relating input l to output j. For the k-input, n-output linear system there are a total of kn transfer functions, which can be arranged in a (k × n) 'transfer function matrix':
G(X) = [ g1(1)(X)  g1(2)(X)  ...  g1(n)(X)
         g2(1)(X)  g2(2)(X)  ...  g2(n)(X)
         ...............................
         gk(1)(X)  gk(2)(X)  ...  gk(n)(X) ] ..... (8.17)
Using the transfer function matrix, the encoding equations for an (n, k, m) code can be expressed as
V(X) = U(X) G(X) ..... (8.18)
where U(X) = [u(1)(X), u(2)(X), ..., u(k)(X)] is the k-vector representing the information polynomials, and V(X) = [v(1)(X), v(2)(X), ..., v(n)(X)] is the n-vector representing the encoded sequences. After multiplexing, the code word becomes:

v(X) = v(1)(X^n) + X v(2)(X^n) + X^2 v(3)(X^n) + ... + X^(n-1) v(n)(X^n) ..... (8.19)
Example 8.6:
For the encoder of Fig 8.4, we have:

g1(1)(X) = 1 + X,  g1(2)(X) = 1,      g1(3)(X) = 1
g2(1)(X) = X,      g2(2)(X) = 1 + X,  g2(3)(X) = 0

G(X) = [ 1+X    1   1
          X    1+X  0 ]

For the information sequences u(1) = (1 0 1) and u(2) = (1 1 0), the information polynomials are:

u(1)(X) = 1 + X^2 and u(2)(X) = 1 + X

Then

V(X) = [1 + X^2, 1 + X] G(X) = [1 + X^3, 0, 1 + X^2]

so that v(1)(X) = 1 + X^3, v(2)(X) = 0 and v(3)(X) = 1 + X^2. After multiplexing as per Eq (8.19):

v(X) = v(1)(X^3) + X v(2)(X^3) + X^2 v(3)(X^3) = 1 + X^2 + X^8 + X^9

i.e. v = (1 0 1, 0 0 0, 0 0 1, 1 0 0), in agreement with Example 8.3.
The i-th shift register contains Ki previous information bits. Defining K = K1 + K2 + ... + Kk as the total encoder memory (m represents the memory order, which we have defined as the maximum length of any shift register), the encoder state at time unit l, when the encoder inputs are {ul(1), ul(2), ..., ul(k)}, is the binary K-tuple of previous inputs:

{ul-1(1), ul-2(1), ..., ul-K1(1);  ul-1(2), ul-2(2), ..., ul-K2(2);  ...;  ul-1(k), ul-2(k), ..., ul-Kk(k)}

and there are a total of 2^K different possible states. For an (n, 1, m) code, K = K1 = m, and the encoder state at time unit l is simply {ul-1, ul-2, ..., ul-m}.
Each new block of k inputs causes a transition to a new state. Hence there are 2^k branches leaving each state, one corresponding to each possible input block. For an (n, 1, m) code there are only two branches leaving each state. On the state diagram, each branch is labeled with the k inputs causing the transition and the n corresponding outputs. The state diagram for the convolutional encoder of Fig 8.3 is shown in Fig 8.10. A state table is often more helpful while drawing the state diagram, and is as shown below.
State table for the (2, 1, 3) encoder of Fig 8.3

State:                S0   S1   S2   S3   S4   S5   S6   S7
Binary description:  000  100  010  110  001  101  011  111
Recall (or observe from Fig 8.3) that the two output sequences are:

vl(1) = ul + ul-2 + ul-3 and vl(2) = ul + ul-1 + ul-2 + ul-3
Till the reader gains some experience, it is advisable to first prepare a transition table using the output equations and then transfer the data onto the state diagram. Such a table is as shown below:
State transition table for the (2, 1, 3) encoder of Fig 8.3

Current state (binary)   Input   Next state (binary)   Output
S0 (000)                   0       S0 (000)             0 0
S0 (000)                   1       S1 (100)             1 1
S1 (100)                   0       S2 (010)             0 1
S1 (100)                   1       S3 (110)             1 0
S2 (010)                   0       S4 (001)             1 1
S2 (010)                   1       S5 (101)             0 0
S3 (110)                   0       S6 (011)             1 0
S3 (110)                   1       S7 (111)             0 1
S4 (001)                   0       S0 (000)             1 1
S4 (001)                   1       S1 (100)             0 0
S5 (101)                   0       S2 (010)             1 0
S5 (101)                   1       S3 (110)             0 1
S6 (011)                   0       S4 (001)             0 0
S6 (011)                   1       S5 (101)             1 1
S7 (111)                   0       S6 (011)             0 1
S7 (111)                   1       S7 (111)             1 0
For example, if the shift registers were in state S5, whose binary description is 101, an input 1 causes this state to change over to the new state S3, whose binary description is 110, while producing an output (0 1). Observe that on the state diagram the inputs causing a transition are shown first, followed by the corresponding outputs.
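A transition table like the one above can also be enumerated programmatically, which is a convenient check against hand errors. A sketch (ours) for the (2, 1, 3) encoder, with the state taken as (ul-1, ul-2, ul-3):

# Sketch: enumerating the state transition table of the (2, 1, 3) encoder.

for s in range(8):
    u1, u2, u3 = (s >> 2) & 1, (s >> 1) & 1, s & 1   # (u_{l-1}, u_{l-2}, u_{l-3})
    for u in (0, 1):
        v1 = (u + u2 + u3) % 2              # vl(1) = ul + ul-2 + ul-3
        v2 = (u + u1 + u2 + u3) % 2         # vl(2) = ul + ul-1 + ul-2 + ul-3
        nxt = (u, u1, u2)                   # register shifts right by one
        print(f"state {u1}{u2}{u3}, input {u} -> "
              f"next {nxt[0]}{nxt[1]}{nxt[2]}, output {v1}{v2}")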
Tree and Trellis Diagrams:
Let us now consider other graphical means of portraying convolutional codes. The state
diagram can be re-drawn as a 'Tree graph'. The convention followed is: If the input is a '0', then the
upper path is followed and if the input is a '1', then the lower path is followed. A vertical line is called
a 'Node' and a horizontal line is called 'Branch'. The output code words for each input bit are shown
on the branches. The encoder output for any information sequence can be traced through the tree
paths. The tree graph for the (2, 1, 2) encoder of Fig 8.15 is shown in Fig 8.18. The state transition
table can be conveniently used in constructing the tree graph.
Following the procedure just described, we find that the encoded sequence for an information sequence (1 0 0 1 1) is (11, 10, 11, 11, 01), which agrees with the first 5 pairs of bits of the actual encoded sequence. Since the encoder has a memory m = 2, we require two more bits to clear and re-set it. Hence, to obtain the complete code sequence corresponding to an information sequence of length kL, the tree graph is to be extended by m time units. This extended part is called the 'tail of the tree', and the 2^kL right-most nodes are called the 'terminal nodes' of the tree. The extended tree diagram for the (2, 1, 2) encoder, for the information sequence (1 0 0 1 1), is shown in Fig 8.19, and the complete encoded sequence is (11, 10, 11, 11, 01, 01, 11).
At this juncture, a very important clue for the student in drawing tree diagrams neatly and correctly, without wasting time, appears pertinent. As the length L of the input sequence increases, the number of right-most nodes increases as 2^L. Hence, for a specified sequence length L, compute 2^L. Mark 2^L equally spaced points at the right-most portion of your page, leaving space to complete the m tail branches. Join two points at a time to obtain 2^(L-1) nodes. Repeat the procedure until you get only one node at the left-most portion of your page. The procedure is illustrated diagrammatically in Fig 8.20 for L = 3. Once you get the tree structure, you can fill in the needed information by either looking back at the state transition table or working it out logically.
From Fig 8.18, observe that the tree becomes 'repetitive' after the first three branches. Beyond
the third branch, the nodes labeled S0 are identical and so are all the other pairs of nodes that are
identically labeled. Since the encoder has a memory m = 2, it follows that when the third information
bit enters the encoder, the first message bit is shifted out of the register. Consequently, after the third
branch the information sequences (000u3u4---) and (100u3u4---) generate the same code symbols and
the pair of nodes labeled S0 may be joined together. The same logic holds for the other nodes.
Accordingly, we may collapse the tree graph of Fig 8.18 into a new form of Fig 8.21 called a
"Trellis". It is so called because Trellis is a tree like structure with re-merging branches (You will
have seen the trusses and trellis used in building construction).
The trellis diagram contains (L + m + 1) time units or levels (or depths), and these are labeled from 0 to (L + m) (0 to 7 for the case L = 5 for the encoder of Fig 8.15), as shown in Fig 8.21. The convention followed in drawing the trellis is that 'a code branch produced by an input 0 is drawn as a solid line, while that produced by an input 1 is shown by dashed lines'. The code words produced by the transitions are also indicated on the diagram. Each input sequence corresponds to a specific path through the trellis. The reader can readily verify that the encoder output corresponding to the sequence u = (1 0 0 1 1) is indeed v = (11, 10, 11, 11, 01, 01, 11), the path followed being as shown in Fig 8.22.
8.6
P(r|v) = Π (i = 1 to N) p(ri|vi) ..... (8.37)

ln P(r|v) = Σ (i = 1 to N) ln p(ri|vi) ..... (8.38)

For a BSC with transition probability p, let

p(ri|vi) = p, if ri ≠ vi
p(ri|vi) = (1 − p), if ri = vi

Further, let the received vector r differ from the transmitted vector v in exactly d positions. Notice that the number d is nothing but the Hamming distance between the vectors r and v. Then Eq (8.38) can be re-formulated as:

ln P(r|v) = d ln p + (N − d) ln(1 − p)
          = d ln [p/(1 − p)] + N ln(1 − p) ..... (8.39)

or ln P(r|v) = A d + B ..... (8.40)

where A = ln [p/(1 − p)] and B = N ln(1 − p) are constants. For p < 1/2, A is negative, so maximizing ln P(r|v) is equivalent to minimizing the Hamming distance d; for a BSC, the maximum likelihood decoder therefore reduces to a minimum distance decoder.
M(r|v) = Σ (i = 1 to N) M(ri|vi) ..... (8.41)

where M(ri|vi) = ln p(ri|vi) is the bit metric. A partial path metric for the first j branches of a path can be expressed as:

M([r|v]j) = Σ (i = 1 to j) M(ri|vi) ..... (8.42)
Suppose that the maximum likelihood path is eliminated by the algorithm at time unit j, as shown in Fig 8.25. This implies that the partial path metric of the survivor exceeds that of the maximum likelihood path at this point. Now, if the remaining portion of the maximum likelihood path is appended onto the survivor at time unit j, then the total metric of this path will exceed the total metric of the maximum likelihood path. But this contradicts the definition of the maximum likelihood path as the path with the largest metric. Hence the maximum likelihood path cannot be eliminated by the algorithm; it must be the final survivor, and it follows that

M(r|v') ≥ M(r|v) for every v ≠ v', where v' denotes the final survivor.

Thus it is clear that the Viterbi algorithm is optimum, in the sense that it always finds the maximum likelihood path through the trellis. From an implementation point of view, however, it would be very inconvenient to deal with fractional numbers. Accordingly, the bit metric M(ri|vi) = ln p(ri|vi) can be replaced by C2 [ln p(ri|vi) + C1], where C1 is any real number and C2 is any positive real number, so that the metric can be expressed as an integer. Notice that a path v which maximizes M(r|v) = Σ (i = 1 to N) ln p(ri|vi) also maximizes Σ (i = 1 to N) C2 [ln p(ri|vi) + C1].
Therefore, the modified metrics can be used without affecting the performance of the Viterbi algorithm. Observe that we can always choose C1 to make the smallest metric zero, and C2 can then be chosen so that all the other metrics are approximated by nearest integers. Accordingly, there can be many sets of integer metrics for a given DMC, depending on the choice of C2. The performance of the Viterbi algorithm becomes slightly sub-optimal due to the use of metrics approximated by nearest integers; however, the degradation in performance is typically very low.
Notice that the final m branches in any trellis path always correspond to '0' inputs and are hence not considered part of the information sequence.
As already mentioned, the MLD reduces to a 'minimum distance decoder' for a BSC (see Eq 8.40). Hence the distances can be reckoned as metrics, and the algorithm must now find the path through the trellis with the smallest metric (i.e. the path closest to r in Hamming distance). The details of the algorithm are exactly the same, except that the Hamming distance replaces the log-likelihood function as the metric, and the survivor at each state is the path with the smallest metric. The following example illustrates the concept.
Example 8.14:
Suppose the vector r = (01, 10, 10, 11, 01, 01, 11) is received through a BSC from the encoder of Fig 8.15. The path traced by the decoder is shown in Fig 8.29 as dark lines.
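For readers who wish to experiment, the following is a compact hard-decision Viterbi sketch (our own illustration, not the decoder of Fig 8.29) for the (2, 1, 2) encoder with g(1) = (1 1 1) and g(2) = (1 0 1), run on the received vector of this example. It keeps, at each state, the survivor with the smallest accumulated Hamming distance:

# Sketch: hard-decision Viterbi decoding for the (2, 1, 2) encoder of Fig 8.15.
# State = (u_{l-1}, u_{l-2});  v1 = u + u_{l-1} + u_{l-2},  v2 = u + u_{l-2}.

def step(state, u):
    a, b = state
    return ((u + a + b) % 2, (u + b) % 2), (u, a)   # (output pair, next state)

def viterbi(r_pairs):
    INF = 10 ** 9
    states = [(0, 0), (0, 1), (1, 0), (1, 1)]
    metric = {s: (0 if s == (0, 0) else INF) for s in states}   # start in S0
    paths = {s: [] for s in states}
    for r in r_pairs:
        new_metric = {s: INF for s in states}
        new_paths = {s: [] for s in states}
        for s in states:
            for u in (0, 1):
                out, nxt = step(s, u)
                d = (out[0] ^ r[0]) + (out[1] ^ r[1])   # Hamming distance
                if metric[s] + d < new_metric[nxt]:     # keep the survivor
                    new_metric[nxt] = metric[s] + d
                    new_paths[nxt] = paths[s] + [u]
        metric, paths = new_metric, new_paths
    return paths[(0, 0)][:-2]    # terminate in S0 and drop the two tail bits

r = [(0, 1), (1, 0), (1, 0), (1, 1), (0, 1), (0, 1), (1, 1)]
print(viterbi(r))                # -> [1, 0, 0, 1, 1]

The decoder returns u = (1 0 0 1 1), i.e. it recovers the transmitted path v = (11, 10, 11, 11, 01, 01, 11) in spite of the two channel errors in r.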
CHAPTER 9
The coding and decoding schemes discussed so far are designed to combat random or independent errors. We have assumed, in other words, that the channel is memoryless. However, practical channels have memory and hence exhibit mutually dependent signal transmission impairments. In a fading channel, such impairment is felt particularly when the fading varies slowly compared to one symbol duration. Multi-path impairment involves signal arrivals at the receiver over two or more paths of different lengths, with the effect that the signals arrive out of phase with each other and the cumulative received signal is distorted. High-frequency (HF) and tropospheric propagation radio channels suffer from such phenomena. Further, some channels suffer from switching noise and other burst noise (for example, telephone channels, or channels disturbed by pulse jamming; impulse noise in the communication channel causes transmission errors to cluster into bursts). All of these time-correlated impairments result in statistical dependence among successive symbol transmissions. The disturbances tend to cause errors that occur in bursts rather than as isolated events.
Once the channel is assumed to have memory, the errors that occur can no longer be characterized as single, randomly distributed errors whose occurrence is independent from bit to bit. The majority of the codes discussed so far, whether block, cyclic or convolutional, are designed to combat such random or independent errors; they are, in general, not efficient for correcting burst errors. The channel memory causes degradation in the error performance.
Many coding schemes have been proposed for channels with memory. The greatest problem faced is the difficulty in obtaining accurate models of the frequently time-varying statistics of such channels. We shall briefly discuss some of the basic ideas regarding such codes (a detailed discussion of burst error correcting codes is beyond the scope of this book). We start with the definition of the burst length b and the requirements on an (n, k) code to correct error bursts. An error burst of length b is defined as a sequence of error symbols confined to b consecutive bit positions, of which the first and the last are non-zero.

For example, the error vector (0 0 1 0 1 0 1 1 0 0 1 1 0 0) is a burst of length b = 10, and the error vector (0 0 1 0 0 0 1 1 0 1 0 0) is a burst of length b = 8. A code that is capable of correcting all burst errors of length b or less is called a b-burst-error-correcting code, or the code is said to have a burst error correcting capability of b. Usually, for proper decoding, the b-symbol bursts must be separated by a guard space of g symbols. Let us confine ourselves, for the present, to the construction of an (n, k) code for given n and b with as small a redundancy (n − k) as possible. One can then make the following observations.
Start with a code vector V that is an error burst of length 2b or less. This code vector may be expressed as a linear combination (vector sum) of two vectors V1 and V2, each a burst of length b or less. Therefore, in the standard array of the code, both V1 and V2 must lie in the same coset. Further, if one of these is assumed to be the coset leader (i.e. a correctable error pattern), then the other vector, which is in the same coset, turns out to be an un-correctable error pattern. Hence such a code will not be able to correct all error bursts of length b or less. Thus we have established the following assertion: for an (n, k) linear code to correct all error bursts of length b or less, no burst of length 2b or less can be a code word; it can further be shown that this requires the number of parity check digits to satisfy

(n − k) ≥ 2b ..... (9.1)
From Eq (9.1) it follows that the burst-error-correcting capability of an (n, k) code is at most (n − k)/2. That is, the upper bound on the burst-error-correcting capability of an (n, k) linear code is governed by:

b ≤ (n − k)/2 ..... (9.2)

This bound is known as the Reiger bound, and it is used to define the burst-correcting efficiency z of an (n, k) code as:

z = 2b / (n − k) ..... (9.3)
Whereas most useful random error correcting codes have been devised using analytical techniques, for the reasons mentioned at the beginning of this section the best burst-error correcting codes have had to be found through computer-aided search procedures. A short list of high-rate burst-error-correcting cyclic codes found by computer search is given in Table 9.1.
If the code is needed only for detecting error bursts of length b, then the number of check bits must satisfy:

(n − k) ≥ b ..... (9.4)
Some of the famous block/cyclic and convolutional codes designed for correcting burst errors are the Burton, Fire, R-S, Berlekamp-Preparata-Massey, Iwadare and adaptive Gallager codes. Of these, Fire codes have been used extensively in practice. A detailed discussion of these codes is available in Shu Lin et al. and J. Das et al. (refer Bibliography).
A block interleaver of degree λ (code words written row-wise into a (λ × n) array and transmitted column-wise) has the following characteristics:

1) Any burst of λ or fewer contiguous channel symbol errors results in isolated errors at the de-interleaver output, separated from each other by at least n symbols.

2) Any burst of λq errors, where q > 1, results in output bursts from the de-interleaver of not more than ⌈q⌉ symbol errors. Each output burst is separated from the other bursts by not less than n − ⌊q⌋ symbols. The notation ⌈q⌉ means the smallest integer not less than q, and ⌊q⌋ means the largest integer not greater than q.

3) A periodic sequence of single errors spaced λ symbols apart results in a single burst of errors of length n at the de-interleaver output.

4) The interleaver/de-interleaver end-to-end delay is approximately 2λn symbol time units, since the (λ × n) array has to be (mostly) filled at the transmitter, and again at the receiver, before decoding begins. To be precise, the minimum end-to-end delay is (2λn − 2n + 2) symbol time units. This does not include any channel propagation delay.

5) The memory requirement, clearly, is λn symbols at each location (interleaver and de-interleaver). However, since a (λ × n) array needs to be (mostly) filled before it can be read out, a memory of 2λn symbols is generally implemented at each location, to allow the emptying of one (λ × n) array while the other is being filled, and vice versa.
Finally, a note about the simplest possible implementation aspect: if the original code is cyclic, then the interleaved code is also cyclic. If the original code has a generator polynomial g(X), the interleaved code will have the generator polynomial g(X^λ). Hence encoding and decoding can be done using shift registers, as was done for cyclic codes. The modification at the decoder for the interleaved code is done by replacing each shift register stage of the original decoder by λ stages, without changing the other connections. This modification allows the decoder to look at successive rows of the code array on successive decoder cycles. It then follows that if the decoder for the original cyclic code is simple, so will it be for the interleaved code. The interleaving technique is indeed an effective tool for deriving long, powerful codes from short optimal codes.
Example 9.1:
Let us consider an interleaver with n = 4 and λ = 6. The corresponding (6 × 4) array is shown in Fig 9.2(a), with the symbols numbered in the sequence of transmission. Fig 9.2(b) shows a burst of λ = 6 symbol errors in the channel stream and the corresponding de-interleaved sequence. Observe that in the de-interleaved sequence, each code word does not have more than one error; the smallest separation between symbols in error is n = 4.
Next, with q = 1.5, λq = 9. Fig 9.2(c) illustrates an example of a 9-symbol error burst. After de-interleaving at the receiver, the encircled symbols are in error. It is seen that the error bursts consist of no more than ⌈1.5⌉ = 2 contiguous symbols per code word, and they are separated by at least n − ⌊1.5⌋ = 4 − 1 = 3 symbols.
Fig 9.2(d) illustrates a sequence of single errors spaced λ = 6 symbols apart. After de-interleaving at the receiver, it is seen that the de-interleaved sequence has a single error burst of length n = 4 symbols. The minimum end-to-end delay due to the interleaver and de-interleaver is (2λn − 2n + 2) = 42 symbol time units. Storage of λn = 24 symbols is required at each end of the channel; as said earlier, storage for 2λn = 48 symbols would generally be implemented.
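The read/write discipline of the block interleaver is simple enough to model directly. The sketch below (symbol labels and function names ours) writes λ = 6 code words of length n = 4 as rows, transmits column-wise, corrupts a burst of 6 channel symbols, and shows that each de-interleaved code word carries at most one error:

# Sketch: (lam x n) block interleaver - code words are written in as rows,
# the channel stream is read out by columns.

def interleave(codewords, lam, n):
    assert len(codewords) == lam and all(len(c) == n for c in codewords)
    return [codewords[i][j] for j in range(n) for i in range(lam)]

def deinterleave(stream, lam, n):
    rows = [[None] * n for _ in range(lam)]
    for k, sym in enumerate(stream):
        rows[k % lam][k // lam] = sym
    return rows

lam, n = 6, 4
cw = [[f"c{i}{j}" for j in range(n)] for i in range(lam)]
tx = interleave(cw, lam, n)
# Hit the channel stream with a burst of lam = 6 symbol errors:
for k in range(5, 11):
    tx[k] = "ERR"
rx = deinterleave(tx, lam, n)
for row in rx:
    print(row)      # every code word carries at most one 'ERR' symbol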
Example 9.2: Interleaver for a BCH code.
Consider a (15, 7) BCH code generated by g(X) = 1 + X + X^2 + X^4 + X^8. For this code dmin = 5 and t = ⌊(dmin − 1)/2⌋ = 2. With λ = 5, we can construct a (75, 35) interleaved code with a burst error correcting capability of b = λt = 10. The arrangement of code words, similar to Example 9.1, is shown in Fig 9.3. A 35-bit message block is divided into five 7-bit message blocks, and five code words of length 15 are generated using g(X). These code words are arranged as the 5 rows of a (5 × 15) matrix, and the columns of the matrix are transmitted in the sequence shown, forming a 75-bit long code vector.
Each row is a 15-bit code word (bit positions in error are shown in parentheses):

1    6   11   16   21   26   31  (36)  41   46   51   56   61  (66)  71
2    7   12   17   22   27  (32) (37)  42   47   52   57   62   67   72
3    8   13   18   23   28  (33) (38)  43   48   53   58   63   68   73
4   (9)  14   19   24   29  (34)  39   44   49   54   59   64   69   74
5   10   15   20   25   30  (35)  40   45   50   55   60   65  (70)  75

Fig 9.3 Block interleaver for a (15, 7) BCH code.
To illustrate the burst and random error correcting capabilities of this code, we have put the bit positions 9, 32 to 38, 66 and 70 in parentheses, indicating that errors occurred in these positions. The de-interleaver feeds the rows of Fig 9.3 to the decoder. Clearly, each row has a maximum of two errors, and the (15, 7) BCH code from which the rows were constructed is capable of correcting up to two errors per row. Hence the error pattern shown in parentheses in the figure can be corrected. The isolated errors in bit positions 9, 66 and 70 may be thought of as random errors, while the cluster of errors in bit positions 32 to 38 constitutes a burst error.
9.1.2 Convolutional Interleaving:
Convolutional interleavers are somewhat simpler and more effective compared to block interleavers. A (b × n) periodic (convolutional) interleaver is shown in Fig 9.4. The code symbols are shifted sequentially into the bank of n shift registers, each successive register introducing an additional delay of b; i.e., the successive symbols of a code word are delayed by {0, b, 2b, ..., (n−1)b} symbol units respectively. Because of this, the symbols of one code word are placed at distances of b symbol units in the channel stream, and a burst of length b, separated by a guard space of (n−1)b symbol units, affects only one symbol per code word. In the receiver, the code words are reassembled through complementary delay units and decoded to correct the single errors so generated. If the burst length l satisfies b < l ≤ 2b, then the (n, k) code should be capable of correcting two errors per code word.
As illustrated in Example 9.2, if one uses a (15, 7) BCH code with t = 2, then a burst of length 2b can be corrected with a guard space of (n−1)b = 14b. This 14-to-2 guard space to burst length ratio is too large, and hence codes with smaller values of n are preferable. Convolutional codes with interleaving may also be used. The important advantage of the convolutional interleaver over the block interleaver is that, with convolutional interleaving, the end-to-end delay is (n−1)b symbol units and the memory required at each end of the channel is b(n−1)/2. That is, there is a reduction of one-half in delay and memory over the block interleaving requirements.
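A behavioral sketch of the (b × n) convolutional interleaver/de-interleaver pair is given below (ours, not the circuit of Fig 9.4). Each branch is modeled as a FIFO that advances once per commutator visit; branch i of the interleaver holds i·b cells and the corresponding de-interleaver branch holds (n−1−i)·b cells, so every symbol experiences the same total delay. Note that with this register-per-visit bookkeeping the end-to-end delay works out to (n−1)·b·n symbol periods; the figure of (n−1)b quoted above counts the delay in units of branch shifts:

# Sketch: (b x n) convolutional interleaver / de-interleaver pair.

from collections import deque

def make_branches(n, b, complementary=False):
    # Branch i is a FIFO of i*b cells (or (n-1-i)*b at the de-interleaver).
    return [deque([0] * ((n - 1 - i if complementary else i) * b))
            for i in range(n)]

def push(branches, i, sym):
    branch = branches[i]
    if not branch:                 # the zero-delay branch is a through connection
        return sym
    branch.append(sym)
    return branch.popleft()

n, b = 4, 2
ilv = make_branches(n, b)
dlv = make_branches(n, b, complementary=True)
# Pad the source with zeros to flush the delay lines at the end.
src = list(range(1, 25)) + [0] * ((n - 1) * b * n)
rx = [push(dlv, k % n, push(ilv, k % n, s)) for k, s in enumerate(src)]
print(rx[(n - 1) * b * n:])        # the original stream 1..24, reconstructed

After the initial flush of zeros, the original symbol stream re-emerges intact, confirming that the complementary delays at the two ends equalize the total delay for every branch.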