Digital Communication Systems by Simon Haykin

10.8 Maximum Likelihood Decoding of Convolutional Codes

Table 10.6 Summary of the Viterbi algorithm

The Viterbi algorithm is a maximum likelihood decoder, which is optimal for any
discrete memoryless channel. It proceeds in three basic steps. In computational terms,
the so-called add–compare–select (ACS) operation in Step 2 is at the heart of the
Viterbi algorithm.
Initialization
Set the metric of the all-zero state of the trellis to zero.
Computation Step 1: time-unit j
Start the computation at some time-unit j and determine the metric for the path that
enters each state of the trellis. Hence, identify the survivor and store the metric for each
one of the states.
Computation Step 2: time-unit j + 1
For the next time-unit j + 1, determine the metrics for all 2^{K–1} paths that enter a state,
where K is the constraint length of the convolutional encoder; hence do the following:
a. Add the metrics entering the state to the metric of the survivor at the preceding
time-unit j;

b. Compare the metrics of all 2 paths entering the state;
c. Select the survivor with the largest metric, store it along with its metric, and
discard all other paths in the trellis.
Computation Step 3: continuation of the search to convergence
Repeat Step 2 for time-unit j < L + L′, where L is the length of the message sequence
and L′ is the length of the termination sequence.
Stop the computation once the time-unit j = L + L′ is reached.
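To make the add–compare–select recursion concrete, the following is a minimal Python sketch of hard-decision Viterbi decoding for a rate-1/2, constraint-length K = 3 code. The generator sequences (1,1,1) and (1,0,1) are an assumption made here for illustration (they yield the free distance dfree = 5 derived later in this section); the branch metric is the Hamming distance, and minimizing it is equivalent to maximizing the likelihood over a binary symmetric channel with crossover probability p < 1/2.

```python
# Sketch of hard-decision Viterbi decoding for a rate-1/2, K = 3 convolutional
# code; the generators (1,1,1) and (1,0,1) are assumed for illustration.

def viterbi_decode(received_bits, num_message_bits):
    G = [(1, 1, 1), (1, 0, 1)]                  # assumed generator sequences
    num_states = 4                               # 2^(K-1) states for K = 3

    def branch(state, u):
        # The state holds the two most recent past input bits (m1 newer, m2 older).
        m1, m2 = state >> 1, state & 1
        regs = (u, m1, m2)
        out = tuple(sum(gi * ri for gi, ri in zip(g, regs)) % 2 for g in G)
        return (u << 1) | m1, out                # next state, two output bits

    INF = float('inf')
    metrics = [0] + [INF] * (num_states - 1)     # initialization: all-zero state = 0
    paths = [[] for _ in range(num_states)]

    for i in range(0, len(received_bits), 2):    # one time-unit per received bit pair
        r = received_bits[i:i + 2]
        new_metrics = [INF] * num_states
        new_paths = [None] * num_states
        for s in range(num_states):
            if metrics[s] == INF:
                continue
            for u in (0, 1):                     # add-compare-select (ACS)
                ns, out = branch(s, u)
                m = metrics[s] + sum(a != b for a, b in zip(out, r))
                if m < new_metrics[ns]:          # keep only the survivor
                    new_metrics[ns] = m
                    new_paths[ns] = paths[s] + [u]
        metrics, paths = new_metrics, new_paths

    # Terminated code: read off the survivor that ends in the all-zero state.
    return paths[0][:num_message_bits]
```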

EXAMPLE 5 Correct Decoding of Received All-Zero Sequence


Suppose that the encoder of Figure 10.13 generates an all-zero sequence that is sent over a
binary symmetric channel and that the received sequence is (0100010000…). There are
two errors in the received sequence due to noise in the channel: one in the second bit and
the other in the sixth bit. We wish to show that this double-error pattern is correctable
through the application of the Viterbi decoding algorithm.
In Figure 10.18 we show the results of applying the algorithm for time-unit j = 1, 2, 3,
4, 5. We see that for j = 2 there are (for the first time) four paths, one for each of the four
states of the encoder. The figure also includes the metric of each path for each level in the
computation.
On the left side of Figure 10.18, for time-unit j = 3 we show the paths entering each of
the states, together with their individual metrics. On the right side of the figure we show the
four survivors that result from application of the algorithm for time-unit j = 3, 4, 5.
Examining the four survivors in the figure for j = 5, we see that the all-zero path has the
smallest metric and will remain the path of smallest metric from this point forward. This
clearly shows that the all-zero sequence is indeed the maximum likelihood choice of the
Viterbi decoding algorithm, which agrees exactly with the transmitted sequence.
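Continuing with the viterbi_decode sketch given after Table 10.6 (and its assumed generators), the result of Example 5 can be reproduced as follows; the message is taken here to be three bits followed by two termination zeros, so that five time-units are decoded.

```python
# Example 5: received sequence 01 00 01 00 00 with two channel errors.
received = [0, 1, 0, 0, 0, 1, 0, 0, 0, 0]
print(viterbi_decode(received, num_message_bits=3))   # [0, 0, 0]: all-zero message recovered
```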

Figure 10.18 Illustrating steps in the Viterbi algorithm for Example 5: the trellis paths, their metrics, and the resulting survivors at time-units j = 1, 2, 3, 4, 5 for the received sequence 01 00 01 00 00.

EXAMPLE 6 Incorrect Decoding of Received All-Zero Sequence


Suppose next that the received sequence is (1100010000…), which contains three errors
compared with the transmitted all-zero sequence; two of the errors are adjacent to each
other and the third is some distance away.
In Figure 10.19, we show the results of applying the Viterbi decoding algorithm for
levels j = 1, 2, 3, 4. We see that in this second example of Viterbi decoding the correct
path has been eliminated by time-unit j = 3. Clearly, a triple-error pattern is uncorrectable
by the Viterbi algorithm when applied to a convolutional code of rate 1/2 and constraint
length K = 3. The exception to this rule is a triple-error pattern spread over a time
span longer than one constraint length, in which case it is likely to be correctable.
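Using the same viterbi_decode sketch (with its assumed generators), the breakdown in Example 6 shows up directly: the survivor ending in the all-zero state is no longer the all-zero path.

```python
# Example 6: received sequence 11 00 01 00 00 with three channel errors.
received = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
print(viterbi_decode(received, num_message_bits=3))   # not all zeros ([1, 0, 0] with the assumed generators)
```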

Figure 10.19 Illustrating breakdown of the Viterbi algorithm in Example 6: the trellis paths and their metrics at time-units j = 1, 2, 3, 4 for the received sequence 11 00 01 00.

What Have We Learned from Examples 5 and 6?


In Example 5 there were two errors in the received sequence, whereas in Example 6 there
were three errors, two of which were in adjacent symbols and the third one was some
distance away. In both examples the encoder used to generate the transmitted sequence
was the same. The difference between the two examples was attributed to the fact that the
number of errors in Example 6 was beyond the error-correcting capability of the
maximum likelihood decoding algorithm, which is the next topic for discussion.

Free Distance of a Convolutional Code


The performance of a convolutional code depends not only on the decoding algorithm
used but also on the distance properties of the code. In this context, the most important
single measure of a convolutional code’s ability to combat errors due to channel noise is
the free distance of the code, denoted by dfree; it is defined as follows:
The free distance of a convolutional code is given by the minimum Hamming
distance between any two codewords in the code.
A convolutional code with free distance dfree can, therefore, correct t errors if, and only if,
dfree is greater than 2t.
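To anticipate the result derived below for the encoder of Figure 10.13, suppose dfree = 5; then dfree > 2t holds for t = 2 but fails for t = 3, so that

$$
t = \left\lfloor \frac{d_{\mathrm{free}} - 1}{2} \right\rfloor = \left\lfloor \frac{5 - 1}{2} \right\rfloor = 2
$$

suitably spaced channel errors are correctable, consistent with Examples 5 and 6.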
The free distance can be obtained quite simply from the state graph of the convolutional
encoder. Consider, for example, Figure 10.16b, which shows the state graph of the encoder
of Figure 10.13. Any nonzero code sequence corresponds to a complete path beginning
and ending at the 00 state (i.e., node a). We thus find it useful to split this node in the
manner shown in the modified state graph of Figure 10.20, which may be viewed as a
signal-flow graph with a single input and single output.
A signal-flow graph consists of nodes and directed branches; it operates by the
following set of rules:
1. A branch multiplies the signal at its input node by the transmittance characterizing
that branch.
2. A node with incoming branches sums the signals produced by all of those branches.

Figure 10.20 Modified state graph of the convolutional encoder, drawn as a signal-flow graph with input node a0, output node a1, and intermediate nodes b, c, d; each branch is labeled with its transmittance in the dummy variables D and L.



3. The signal at a node is applied equally to all the branches outgoing from that node.
4. The transfer function of the graph is the ratio of the output signal to the input signal.
Returning to the signal-flow graph of Figure 10.20, the exponent of D on a branch in this
graph describes the Hamming weight of the encoder output corresponding to that branch;
the symbol D used here should not be confused with the unit-delay variable in Section
10.6 and the symbol L used herein should not be confused with the length of the message
sequence. The exponent of L is always equal to one, since the length of each branch is one.
Let T(D,L) denote the transfer function of the signal-flow graph, with D and L playing the
role of dummy variables. For the example of Figure 10.20, we may readily use rules 1, 2,
and 3 to obtain the following input-output relations:

$$
\left.
\begin{aligned}
b &= D^{2}L\,a_{0} + L\,c \\
c &= DL\,b + DL\,d \\
d &= DL\,b + DL\,d \\
a_{1} &= D^{2}L\,c
\end{aligned}
\right\} \qquad (10.58)
$$

where a0, b, c, d, and a1 denote the node signals of the graph. Solving the system of four
equations in (10.58) for the ratio a1/a0, we obtain the transfer function

$$
T(D,L) = \frac{D^{5}L^{3}}{1 - DL(1 + L)} \qquad (10.59)
$$
Using the binomial expansion, we may equivalently express T(D,L) as follows:

$$
T(D,L) = D^{5}L^{3}\left[1 - DL(1 + L)\right]^{-1}
       = D^{5}L^{3}\sum_{i=0}^{\infty}\left[DL(1 + L)\right]^{i}
$$

Setting L = 1 in this formula, we thus get the distance transfer function expressed in the
form of a power series as follows:

$$
T(D,1) = D^{5} + 2D^{6} + 4D^{7} + \cdots \qquad (10.60)
$$
Since the free distance is the minimum Hamming distance between any two codewords in
the code and the distance transfer function T(D,1) enumerates the number of codewords
that are a given distance apart, it follows that the exponent of the first term in the
expansion of T(D,1) in (10.60) defines the free distance. Thus, on the basis of this
equation, the convolutional code of Figure 10.13 has the free distance dfree = 5.
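For readers who want to verify the algebra, the following is a small SymPy sketch (an illustrative check, not part of the text) that solves the node equations (10.58) and expands the resulting transfer function as a power series.

```python
# Verify (10.58)-(10.60): solve the signal-flow-graph node equations symbolically.
import sympy as sp

D, L, a0, a1, b, c, d = sp.symbols('D L a0 a1 b c d')

eqs = [
    sp.Eq(b,  D**2 * L * a0 + L * c),   # node b
    sp.Eq(c,  D * L * b + D * L * d),   # node c
    sp.Eq(d,  D * L * b + D * L * d),   # node d
    sp.Eq(a1, D**2 * L * c),            # output node a1
]

sol = sp.solve(eqs, [b, c, d, a1], dict=True)[0]
T = sp.simplify(sol[a1] / a0)
print(T)                                 # equivalent to D**5*L**3/(1 - D*L*(1 + L)), eq. (10.59)

T1 = T.subs(L, 1)
print(sp.series(T1, D, 0, 8))            # D**5 + 2*D**6 + 4*D**7 + O(D**8), eq. (10.60)
```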
This result indicates that up to two errors in the received sequence are correctable, as
two or fewer transmission errors will cause the received sequence to be at most at a
Hamming distance of 2 from the transmitted sequence but at least at a Hamming distance
of 3 from any other code sequence in the code. In other words, in spite of the presence of
any pair of transmission errors, the received sequence remains closer to the transmitted
sequence than any other possible code sequence. However, this statement is no longer true
if there are three or more closely spaced transmission errors in the received sequence. The
observations made here reconfirm the results reported earlier in Examples 5 and 6.
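The free distance can also be found numerically as the minimum output weight over all nonzero paths that leave the all-zero state and later return to it. The sketch below (again assuming the (1,1,1)/(1,0,1) generators) does this with Dijkstra's algorithm on the state graph.

```python
# Compute d_free as the minimum-weight nonzero path from state 0 back to state 0.
import heapq

def branch(state, u):
    m1, m2 = state >> 1, state & 1
    v1 = (u + m1 + m2) % 2              # generator (1, 1, 1), assumed
    v2 = (u + m2) % 2                   # generator (1, 0, 1), assumed
    return (u << 1) | m1, v1 + v2       # next state, Hamming weight of branch output

def free_distance():
    start, w0 = branch(0, 1)            # the first branch must carry the nonzero input 1
    best = {start: w0}
    heap = [(w0, start)]
    while heap:
        w, s = heapq.heappop(heap)
        if s == 0:
            return w                    # first return to the all-zero state is minimal
        if w > best.get(s, float('inf')):
            continue                    # stale heap entry
        for u in (0, 1):
            ns, bw = branch(s, u)
            if w + bw < best.get(ns, float('inf')):
                best[ns] = w + bw
                heapq.heappush(heap, (w + bw, ns))

print(free_distance())                  # 5, in agreement with (10.60)
```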

Asymptotic Coding Gain


The transfer function of the encoder’s state graph, modified in a manner similar to that
illustrated in Figure 10.20, may be used to evaluate a bound on the BER for a given
decoding scheme; details of this evaluation are, however, beyond the scope of our present
discussion.8 Here, we simply summarize the results for two special channels, namely the
binary symmetric channel and the binary-input AWGN channel, assuming the use of
binary PSK with coherent detection.
1. Binary symmetric channel.
The binary symmetric channel may be modeled as an AWGN channel with binary PSK as the modulation in the transmitter, followed by hard-decision demodulation in the receiver. The transition probability p of the binary symmetric channel is then equal to the BER for the uncoded binary PSK system. From Chapter 7 we recall that, for large values of Eb/N0, denoting the ratio of signal energy per bit to noise power spectral density, the BER for binary PSK without coding is dominated by the exponential factor exp(–Eb/N0). On the other hand, the BER for the same modulation scheme with convolutional coding is dominated by the exponential factor exp(–dfree r Eb/2N0), where r is the code rate and dfree is the free distance of the convolutional code. Therefore, as a figure of merit for measuring the improvement in error performance made by the use of coding with hard-decision decoding, we may set aside the common factor Eb/N0 and use the remaining exponent to define the asymptotic coding gain (in decibels) as follows:

$$
G_a = 10\log_{10}\!\left(\frac{d_{\mathrm{free}}\,r}{2}\right)\ \mathrm{dB} \qquad (10.61)
$$
2. Binary-input AWGN channel.
Consider next the case of a memoryless binary-input AWGN channel with no output quantization (i.e., the output amplitude lies in the interval (–∞, ∞)). For this channel, theory shows that for large values of Eb/N0 the BER for binary PSK with convolutional coding is dominated by the exponential factor exp(–dfree r Eb/N0), where the parameters are as previously defined. Accordingly, in this second case, we find that the asymptotic coding gain is defined by

$$
G_a = 10\log_{10}\!\left(d_{\mathrm{free}}\,r\right)\ \mathrm{dB} \qquad (10.62)
$$

Comparing (10.61) and (10.62) for cases 1 and 2, respectively, we see that the asymptotic
coding gain for the binary-input AWGN channel is greater than that for the binary
symmetric channel by 3 dB. In other words, for large Eb/N0, the transmitter for a binary
symmetric channel must generate an additional 3 dB of signal energy (or power) over that
for a binary-input AWGN channel if we are to achieve the same error performance.
Clearly, there is an advantage to be gained by using an unquantized demodulator output in
place of making hard decisions. This improvement in performance, however, is attained at
the cost of increased decoder complexity due to the requirement for accepting analog
inputs.
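As a quick numerical illustration of (10.61) and (10.62), the following sketch evaluates both asymptotic coding gains for the code considered here (dfree = 5, r = 1/2) and confirms the 3 dB gap.

```python
# Asymptotic coding gains for d_free = 5, r = 1/2.
import math

d_free, r = 5, 0.5
Ga_hard = 10 * math.log10(d_free * r / 2)   # eq. (10.61), hard decisions (binary symmetric channel)
Ga_soft = 10 * math.log10(d_free * r)       # eq. (10.62), unquantized binary-input AWGN channel

print(f"hard: {Ga_hard:.2f} dB, soft: {Ga_soft:.2f} dB, gap: {Ga_soft - Ga_hard:.2f} dB")
# hard: 0.97 dB, soft: 3.98 dB, gap: 3.01 dB
```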
It turns out that the asymptotic coding gain for a binary-input AWGN channel is
approximated to within about 0.25 dB by a binary-input Q-ary-output discrete memoryless
channel with the number of representation levels Q = 8. This means that, for practical
