Error Control Coding: 7.1 Block Codes
A simple example is the single parity check code, which adds one more bit, called the parity bit, to the sequence. The parity bit is chosen so that the number of ones in the resulting sequence is even. If an odd number of errors occur during the transmission of the sequence, the received sequence will contain an odd number of ones, and the receiver will know that errors have occurred. Error control coding is a generalization of this idea. By introducing some redundancy into the data to be transmitted, the transmitter enables the receiver to detect or correct some of the errors.
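To make the parity check concrete, here is a minimal Python sketch (not from the original notes; the function names are illustrative):

    # Even-parity example: append a parity bit so the total number of ones is even.
    def add_parity(bits):
        return bits + [sum(bits) % 2]

    def check_parity(received):
        # The check passes if the received sequence has an even number of ones.
        return sum(received) % 2 == 0

    tx = add_parity([1, 0, 1, 1])               # -> [1, 0, 1, 1, 1]
    rx = tx[:]
    rx[2] ^= 1                                  # introduce a single bit error
    print(check_parity(tx), check_parity(rx))   # True False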
An (n, k) block code maps each block of k data bits into a block of n coded bits; the n − k additional bits are check bits.

E.g., (4,1) repetition code: 1 data bit, 3 check bits:

0 → 0000,  1 → 1111   (7.1)

The 2 possible codewords, 0000 and 1111, differ in 4 positions. For error detection only, the code can detect 3 errors. For error correction, it can correct 1 error; it can also simultaneously detect 2 errors.

E.g., (5,2) code: 2 data bits, 3 check bits:

c1 = d1,  c2 = d2,  c3 = d1,  c4 = d2,  c5 = d1 + d2  (mod 2)   (7.2)

The 4 possible codewords, 00000, 01011, 10101, 11110, differ in at least 3 positions. For error detection only, the code can detect 2 errors. For error correction, it can correct 1 error.
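The codeword list and the pairwise distances are easy to verify; a small Python sketch using the parity equations of (7.2):

    from itertools import product

    # (5,2) code: c = (d1, d2, d1, d2, d1+d2) with modulo-2 arithmetic, as in (7.2).
    def encode52(d1, d2):
        return (d1, d2, d1, d2, (d1 + d2) % 2)

    codewords = [encode52(d1, d2) for d1, d2 in product((0, 1), repeat=2)]
    print(codewords)   # 00000, 01011, 10101, 11110

    # Pairwise Hamming distances: the minimum is 3.
    dists = [sum(a != b for a, b in zip(c1, c2))
             for i, c1 in enumerate(codewords) for c2 in codewords[i+1:]]
    print(min(dists))  # 3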
E.g., (7,4) Hamming code: 4 data bits, 3 check bits:

c1 = d1,  c2 = d2,  c3 = d3,  c4 = d4,
c5 = d1 + d2 + d3,  c6 = d2 + d3 + d4,  c7 = d1 + d2 + d4  (mod 2)   (7.3)

16 possible codewords:

0000000  0001011  0010110  0011101
0100111  0101100  0110001  0111010
1000101  1001110  1010011  1011000
1100010  1101001  1110100  1111111   (7.4)

Any two codewords differ in at least 3 positions. For error detection only, the code can detect 2 errors. For error correction, it can correct 1 error.

In general, for error detection only, a code can detect up to d_min − 1 errors, where d_min is the minimum distance of the code,

d_min = min over codewords c ≠ c' of d(c, c'),

and d(c, c') is the Hamming distance between c and c', i.e., the number of positions in which they differ.
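A brute-force check of the minimum distance, as a Python sketch (the encoding function follows the equations in (7.3)):

    from itertools import product

    # (7,4) Hamming code: c5 = d1+d2+d3, c6 = d2+d3+d4, c7 = d1+d2+d4 (mod 2).
    def encode74(d):
        d1, d2, d3, d4 = d
        return (d1, d2, d3, d4,
                (d1 + d2 + d3) % 2,
                (d2 + d3 + d4) % 2,
                (d1 + d2 + d4) % 2)

    cws = [encode74(d) for d in product((0, 1), repeat=4)]
    d_min = min(sum(a != b for a, b in zip(c1, c2))
                for i, c1 in enumerate(cws) for c2 in cws[i+1:])
    print(len(cws), d_min)   # 16 codewords, minimum distance 3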
A code can correct t or fewer errors if

d_min ≥ 2t + 1.   (7.5)

E.g., (4,1) repetition code: d_min = 4, so it can correct 1 error. A code can simultaneously correct t errors and detect e errors, where

e ≥ t,   (7.6)

if

d_min ≥ t + e + 1.   (7.7)

E.g., (4,1) repetition code: correct 1 error and detect up to 2 errors.

Definition: A block code is linear if any linear combination of two codewords is also a codeword. In the binary case, this is equivalent to the condition that the sum of any two codewords is also a codeword, where summation is defined by componentwise modulo-2 addition. (We will limit our discussion to binary linear codes.)

Definition: An (n, k) binary linear block code is a set of 2^k codewords of length n that forms a k-dimensional subspace of the space of all binary n-tuples. (Concisely, an (n, k) binary linear block code is a k-dimensional subspace of the vector space of binary n-tuples over the field GF(2).)

For a linear code, the sum of two codewords is another codeword, so the distance between two codewords equals the weight of (i.e., the number of ones in) their sum:

d(c, c') = w(c + c').   (7.8)

Since the all-zero vector is a codeword, taking the minimum over all pairs of codewords gives

d_min = min over nonzero codewords c of w(c).   (7.9)

Therefore, the minimum distance of a linear code equals the minimum weight of its nonzero codewords. (E.g., for the (5,2) code, the minimum weight of the nonzero codewords is 3, so d_min = 3.)
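Both facts are easy to verify numerically; a Python sketch for the (5,2) code (illustrative only):

    codewords = {(0,0,0,0,0), (0,1,0,1,1), (1,0,1,0,1), (1,1,1,1,0)}

    # Linearity: the componentwise mod-2 sum of any two codewords is a codeword.
    assert all(tuple((a + b) % 2 for a, b in zip(c1, c2)) in codewords
               for c1 in codewords for c2 in codewords)

    # d_min equals the minimum weight over nonzero codewords, cf. (7.9).
    print(min(sum(c) for c in codewords if any(c)))   # 3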
7.1.3 Encoding
Consider the (7,4) Hamming code. The encoding rule is

c1 = d1,  c2 = d2,  c3 = d3,  c4 = d4,
c5 = d1 + d2 + d3,  c6 = d2 + d3 + d4,  c7 = d1 + d2 + d4  (mod 2)   (7.10)

In matrix form,

c = dG,   (7.11)

where

G = [ 1 0 0 0 1 0 1 ]
    [ 0 1 0 0 1 1 1 ]  =  [ I_4  P ].   (7.12)
    [ 0 0 1 0 1 1 0 ]
    [ 0 0 0 1 0 1 1 ]

We call the matrix G the generator matrix, where I_4 is the 4 × 4 identity matrix. Any codeword c = dG is a linear combination of the rows of G, so encoding is a linear mapping from the data vectors onto the codewords. Linear algebra then tells us that the set of codewords is the row space of G. E.g., (4,1) repetition code: G = [1 1 1 1]. Notice that the row space of a matrix is unchanged by elementary row operations. Therefore, if we perform an elementary row operation on G, the code (i.e., the set of all codewords) remains unchanged. Only the mapping between the data vectors and the codewords may be changed.
E.g., adding the first row of G to the second row gives

G' = [ 1 0 0 0 1 0 1 ]
     [ 1 1 0 0 0 1 0 ]   (7.13)
     [ 0 0 1 0 1 1 0 ]
     [ 0 0 0 1 0 1 1 ]

It is easy to see that the set of all codewords remains unchanged. On the other hand, the data vector 0100 is now mapped to 1100010 instead of 0100111.
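Both points are easy to check numerically; a Python sketch with the G of (7.12) and the G' of (7.13):

    from itertools import product

    G  = [[1,0,0,0,1,0,1],
          [0,1,0,0,1,1,1],
          [0,0,1,0,1,1,0],
          [0,0,0,1,0,1,1]]
    G2 = [row[:] for row in G]
    G2[1] = [(a + b) % 2 for a, b in zip(G[0], G[1])]   # add row 1 to row 2

    def encode(d, M):
        # c = dM over GF(2): each coded bit is a mod-2 inner product.
        return tuple(sum(d[i] * M[i][j] for i in range(4)) % 2 for j in range(7))

    all_d = list(product((0, 1), repeat=4))
    # The code (the set of codewords) is unchanged by the row operation...
    assert {encode(d, G) for d in all_d} == {encode(d, G2) for d in all_d}
    # ...but the data-to-codeword mapping changes:
    print(encode((0,1,0,0), G))    # (0, 1, 0, 0, 1, 1, 1)
    print(encode((0,1,0,0), G2))   # (1, 1, 0, 0, 0, 1, 0)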
7.1.4 Optimal Demodulation

Suppose the coded bits are transmitted by antipodal (BPSK) signaling: in each bit interval of duration T, we transmit −p(t) to represent 0 and p(t) to represent 1, where p(t) is a unit-energy pulse confined to [0, T). The transmitted signal is

s(t) = Σ_j (2c_j − 1) p(t − (j − 1)T),   (7.14)

where c_j is the jth coded bit. The signal is contaminated by AWGN in the channel. The received signal is given by

r(t) = s(t) + n(t),   (7.15)
where n(t) represents AWGN. At the receiver, we want to obtain the data sequence from the received signal. Notice that we can consider the two time intervals [0, 7T) and [7T, 14T) separately: the two intervals correspond to the two codewords, and whether we can correctly determine the data in one codeword has nothing to do with whether we can do so in the other codeword. We will consider finding the data in the interval [0, 7T). Notice that a coded bit may carry information about several data bits. For example, the 1st, 5th, and 7th coded bits all contain information about the 1st data bit.
Figure 7.1: Receiver for coded signal

To optimally determine the data bits, the seven coded bits (i.e., the whole interval [0, 7T)) must be considered together. We look at the problem from the point of view of M-ary signaling: the 16 possible codewords give 16 possible symbols. The problem reduces to the demodulation of a 16-ary communication system. The optimal decision rule is the minimum distance decision rule: find m such that

∫_0^{7T} [r(t) − s_m(t)]^2 dt   (7.16)

is minimized, where s_m(t), for m = 1, 2, ..., 16, are the 16 possible transmitted signals. Since the transmitted symbols are of equal energy, we can equivalently maximize the correlation, i.e., find m such that

∫_0^{7T} r(t) s_m(t) dt   (7.17)

is maximized. To do so, we can use a bank of 16 matched filters. More conveniently, we can express the signals with a set of basis functions and build a bank of matched filters for the basis functions only. A convenient (although not minimal) set of basis functions is p(t − jT) for j = 0, 1, ..., 6. Each one is a time-shifted version of another; therefore, we only need one matched filter. The optimal receiver is shown in Figure 7.1. Notice that to optimally determine 4 data bits, we need to record 7 outputs of the matched filter and calculate 16 linear combinations of these outputs according to the 16 possible codewords. The data corresponding to the codeword that gives the largest linear combination is the demodulated data.

It may be considered too complicated to perform optimal demodulation. Very often, a sub-optimal approach is used instead: the coded bits are demodulated separately, and then a decoding algorithm is used to recover the data bits from the (possibly wrong) demodulated sequence. This approach of data recovery, in which a decision is made on each coded bit before decoding, is called hard-decision decoding. In contrast, the optimal approach, where decisions are not made on individual coded bits, is called soft-decision decoding.
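After matched filtering, the correlation (7.17) reduces to an inner product between the 7 sampled filter outputs and the antipodal pattern (2c_j − 1) of each codeword. A minimal Python sketch of this soft-decision rule (the matched-filter outputs y are illustrative values):

    from itertools import product

    def encode74(d):
        d1, d2, d3, d4 = d
        return (d1, d2, d3, d4, (d1+d2+d3) % 2, (d2+d3+d4) % 2, (d1+d2+d4) % 2)

    def soft_decode(y):
        # Maximize the correlation between the 7 matched-filter outputs y and
        # the antipodal pattern (2c - 1) of each of the 16 codewords.
        return max(product((0, 1), repeat=4),
                   key=lambda d: sum((2*c - 1) * yj
                                     for c, yj in zip(encode74(d), y)))

    # Noisy observations of the codeword 1000101 (pattern +1 -1 -1 -1 +1 -1 +1):
    y = [0.9, -1.2, -0.8, -1.1, 0.7, -0.4, 1.3]
    print(soft_decode(y))   # (1, 0, 0, 0)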
7.1.5 Decoding
We consider hard-decision decoding. As we have seen before, an error control code can be used for error detection as well.

Error Detection

Consider the (7,4) Hamming code. Let r = (r1, r2, ..., r7) be the received vector. If no error occurs, r should be a codeword of the form

r = (d1, d2, d3, d4, d1 + d2 + d3, d2 + d3 + d4, d1 + d2 + d4).   (7.18)

To check if any error occurs, we check if these relationships still hold. Equivalently, we check if the following relationships hold:

r1 + r2 + r3 + r5 = 0
r2 + r3 + r4 + r6 = 0
r1 + r2 + r4 + r7 = 0   (7.19)

In matrix form, we check if

r H^T = 0,   (7.20)

where

H = [ 1 1 1 0 1 0 0 ]
    [ 0 1 1 1 0 1 0 ]   (7.21)
    [ 1 1 0 1 0 0 1 ]

is called the parity check matrix. If

r H^T ≠ 0,   (7.22)

we know that errors have occurred during transmission.
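Error detection is just the computation of the syndrome rH^T; a Python sketch with the H of (7.21):

    H = [[1,1,1,0,1,0,0],
         [0,1,1,1,0,1,0],
         [1,1,0,1,0,0,1]]

    def syndrome(r, H):
        # s = r H^T over GF(2); an all-zero syndrome means no error is detected.
        return [sum(ri * hi for ri, hi in zip(r, row)) % 2 for row in H]

    c = [1,0,0,0,1,0,1]        # a codeword
    print(syndrome(c, H))      # [0, 0, 0]
    r = c[:]; r[3] ^= 1        # single bit error in position 4
    print(syndrome(r, H))      # [0, 1, 1]: nonzero, error detected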
Error Correction

Although we have performed a decision on each individual coded bit, we would still like to minimize the distance between the received vector and the decided codeword. Notice that after hard decision on each coded bit, minimizing the Euclidean distance is equivalent to minimizing the Hamming distance for binary codes.

Standard array

We construct a decision algorithm as follows. Since there are only a finite number of possible received vectors, we consider each possible received vector and determine the codeword that is closest to it. If we tabulate the results, the decision can simply be accomplished by table lookup. The table we are going to construct is called the standard array. We use the (5,2) code as an example. First, the codewords of the code are listed in a row:
00000  01011  10101  11110   (7.23)
Notice that if the received vector is any one of these codewords, the decided codeword will simply be itself. Then we consider vectors not in the first row. We pick one with the smallest weight, e.g., 00001. Notice that it has the interpretation of being a possible received vector with the smallest distance to 00000. We use it to form the second row by adding it to each element of the first row:
00001  01010  10100  11111   (7.24)
Notice that by construction, each vector in the second row is close to the corresponding vector in the first row. (Actually, it is equal to the corresponding codeword in the first row plus the error vector 00001.) Therefore, if the received vector is any one of these vectors, the decided codeword will be the corresponding codeword in the first row. We repeat the process until all vectors are exhausted:

1. Pick a vector with the smallest weight that is not already in the array.
2. Use it to form the next row by adding it to each codeword of the first row.

For the (5,2) code, a standard array is given by
00000  01011  10101  11110
00001  01010  10100  11111
00010  01001  10111  11100
00100  01111  10001  11010
01000  00011  11101  10110
10000  11011  00101  01110
00110  01101  10011  11000
01100  00111  11001  10010   (7.25)
Given any received vector, the decided codeword is the codeword at the top of the same column of the standard array. Notice that the correctable error vectors are the ones in the first column. (The coset leaders of the last two rows are not unique; any vector of smallest weight not already in the array may be chosen.) Other error vectors are not correctable by this code, and the received vector would be incorrectly modified by the algorithm.

Syndrome table

The standard array approach works well for small codes. However, for larger codes, it is undesirable to store all possible received vectors. One way to reduce the memory requirement is to store a syndrome table instead.
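The construction above can be expressed compactly in Python (a sketch; ties among equal-weight vectors are broken in lexicographic order):

    from itertools import product

    codewords = [(0,0,0,0,0), (0,1,0,1,1), (1,0,1,0,1), (1,1,1,1,0)]

    def standard_array(codewords, n=5):
        used, rows = set(), []
        # Visit vectors in order of increasing weight so that each new row
        # starts from a coset leader of minimum weight.
        for leader in sorted(product((0, 1), repeat=n), key=sum):
            if leader in used:
                continue
            row = [tuple((l + c) % 2 for l, c in zip(leader, cw))
                   for cw in codewords]
            rows.append(row)
            used.update(row)
        return rows

    for row in standard_array(codewords):
        print(row)   # reproduces the 8 rows of (7.25)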
We again consider the (5,2) code as our example. Consider any received vector r. It can be decomposed as

r = c + e,   (7.26)

where c is the transmitted codeword and e is the error vector. If we can determine the error vector, we can recover the transmitted codeword by

c = r + e,   (7.27)

i.e., error correction is done. Since all vectors in the same row of the standard array have the same error vector, and since we only need the error vector for error correction, we may be able to reduce the standard array to just one column. Of course, the problem is to find e from r. Consider the parity check matrix. For the (5,2) code,

H = [ 1 0 1 0 0 ]
    [ 0 1 0 1 0 ]   (7.28)
    [ 1 1 0 0 1 ]

Consider the product

r H^T = (c + e) H^T = c H^T + e H^T = e H^T.   (7.29)
The product rH^T is called the syndrome. It depends only on the error vector, not on the transmitted codeword. We tabulate the correctable error vectors against the syndromes to give the syndrome table. For the (5,2) code,

Error vector   Syndrome
00000          000
00001          001
00010          010
00100          100
01000          011
10000          101
00110          110
01100          111   (7.30)

To perform error correction, we

1. determine the syndrome from the received vector,
2. look up the syndrome table to get the error vector,
3. add the error vector to the received vector.
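Putting the three steps together for the (5,2) code (a Python sketch using the H of (7.28) and the coset leaders of (7.25)):

    H = [[1,0,1,0,0],
         [0,1,0,1,0],
         [1,1,0,0,1]]

    def syndrome(v, H):
        return tuple(sum(a * b for a, b in zip(v, row)) % 2 for row in H)

    # Syndrome table: correctable error vector indexed by its syndrome.
    leaders = [(0,0,0,0,0), (0,0,0,0,1), (0,0,0,1,0), (0,0,1,0,0),
               (0,1,0,0,0), (1,0,0,0,0), (0,0,1,1,0), (0,1,1,0,0)]
    table = {syndrome(e, H): e for e in leaders}

    def correct(r):
        e = table[syndrome(r, H)]
        return tuple((ri + ei) % 2 for ri, ei in zip(r, e))

    print(correct((1,0,1,1,1)))   # 10111 -> 10101: error in bit 4 corrected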
(7,4) Hamming code

For the Hamming code, the correctable error vectors are all the weight-one (single bit error) error vectors:
1000000, 0100000, 0010000, 0001000, 0000100, 0000010, 0000001.   (7.31)
The syndrome table can be viewed as embedded inside the parity check matrix H. If the error vector is 1000000 (only the first bit is in error), we get the first column of H as the syndrome. If the error vector is 0100000 (only the second bit is in error), we get the second column of H as the syndrome, and so on. Therefore, the error correction algorithm reduces to

1. use the parity check matrix to compute the syndrome,
2. match the syndrome with a column of H to locate the error,
3. correct the error.
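For the Hamming code this lookup is a search through the columns of H; a Python sketch:

    H = [[1,1,1,0,1,0,0],
         [0,1,1,1,0,1,0],
         [1,1,0,1,0,0,1]]

    def correct_hamming(r):
        s = [sum(ri * hi for ri, hi in zip(r, row)) % 2 for row in H]
        if any(s):
            # Match the syndrome with a column of H to locate the single error.
            cols = [[H[i][j] for i in range(3)] for j in range(7)]
            j = cols.index(s)
            r = r[:j] + [r[j] ^ 1] + r[j+1:]
        return r

    r = [1,0,0,0,1,0,1]; r[5] ^= 1   # corrupt bit 6 of a codeword
    print(correct_hamming(r))        # [1, 0, 0, 0, 1, 0, 1]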
7.2 Convolutional Codes

In convolutional codes, memories are introduced. Each block of coded bits depends not only on the current data bits but also on the previous data bits. We consider a specific example: a rate 1/2 code generated by the shift register shown in Figure 7.2. For each input data bit, two coded bits are generated and are read alternately.

Figure 7.2: Convolutional encoder for the rate 1/2 code
For example, if the input sequence is 110101100, then the output sequence is 111010000100101011. One way to describe the code is to specify how the output bits depend on the contents of the shift register. The first output is connected to the first stage and the third stage of the shift register. We denote this fact by saying that the generator sequence of the first output is g1 = (101). Similarly, the second output is connected to all three stages, so its generator sequence is g2 = (111).
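The encoder is easy to simulate; a Python sketch that reproduces the example output sequence (g1 = 101, g2 = 111):

    def conv_encode(bits, g1=(1,0,1), g2=(1,1,1)):
        s1, s2 = 0, 0                      # shift-register contents (previous two bits)
        out = []
        for u in bits:
            window = (u, s1, s2)           # current bit plus register contents
            out.append(sum(a*b for a, b in zip(window, g1)) % 2)  # first output
            out.append(sum(a*b for a, b in zip(window, g2)) % 2)  # second output
            s1, s2 = u, s1                 # shift the register
        return out

    data = [1,1,0,1,0,1,1,0,0]
    print(''.join(map(str, conv_encode(data))))   # 111010000100101011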
Figure 7.3: Filter viewpoint of the encoder

We can view the encoder as two filters operating on the same input sequence, as shown in Figure 7.3. With this viewpoint, g1 is the impulse response of the first filter and g2 is the impulse response of the second filter. Of course, the first filter output is the (modulo-2) convolution of the input sequence with g1, and the second filter output is the convolution of the input sequence with g2. The output of the encoder is obtained by reading the outputs of these convolutions alternately.

The memory of the code resides in the contents of the shift register. Two previous data bits, together with the newly input data bit, determine the two output coded bits. Since two previous bits are remembered, the encoder has 4 states, and the convolutional code can also be described by a state diagram as shown in Figure 7.4. Each box represents a state. The digits in each box represent the previous data bits residing in the shift register. Each connecting line represents a state transition due to a newly input data bit. For example, consider the line labeled 1/11 from state 00 to state 10. It tells us that at state 00 with an input 1, the output will be 11 and the new state will be 10.

The convolutional code can also be represented by a trellis diagram, which is an extension of the state diagram showing the passage of time. The states are shown as dots in a column, while each column represents a point in time. The possible transitions are shown by the branches. An input of 0 is represented by a solid line while an input of 1 is represented by a dashed line. The labels of the branches represent the outputs of the code.
Figure 7.4: State diagram of the convolutional code (branch labels: input/output, e.g., 1/11)

[Trellis diagram of the convolutional code: solid branches for input 0, dashed branches for input 1; branch labels give the output bits]
Convolutional codes can be decoded by the minimum distance decision rule. The resulting algorithm is a form of Viterbi's algorithm.
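A minimal hard-decision Viterbi decoder for this code, as a Python sketch (it keeps, for each of the 4 states, the path of minimum Hamming distance to the received sequence; names are ours):

    def viterbi_decode(received, g1=(1,0,1), g2=(1,1,1)):
        # received: flat list of hard-decision coded bits, 2 per data bit.
        def outputs(state, u):             # state = (previous bit, bit before that)
            w = (u,) + state
            return (sum(a*b for a, b in zip(w, g1)) % 2,
                    sum(a*b for a, b in zip(w, g2)) % 2)

        paths = {(0, 0): (0, [])}          # state -> (distance, decoded bits so far)
        for t in range(0, len(received), 2):
            r = received[t:t+2]
            new_paths = {}
            for state, (dist, bits) in paths.items():
                for u in (0, 1):
                    d = dist + sum(a != b for a, b in zip(outputs(state, u), r))
                    nxt = (u, state[0])
                    # Keep only the survivor (smaller distance) into each state.
                    if nxt not in new_paths or d < new_paths[nxt][0]:
                        new_paths[nxt] = (d, bits + [u])
            paths = new_paths
        return min(paths.values())[1]      # bits of the minimum-distance path

    rx = [int(b) for b in '111010000100101011']
    rx[3] ^= 1                             # one channel bit error
    print(viterbi_decode(rx))              # [1, 1, 0, 1, 0, 1, 1, 0, 0]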