Viterbi Algorithm

This document summarizes an article from 2002 that describes the Viterbi algorithm and its applications to digital data transmission. It provides background on how intersymbol interference impacts high-speed digital transmission over voice channels. It then explains how the Viterbi algorithm can be used for optimal detection of transmitted symbol sequences by considering the entire received signal rather than just values at symbol centers. However, the computational complexity of a straightforward approach grows exponentially with sequence length. The article then provides a simple example to illustrate the principles of dynamic programming, which the Viterbi algorithm is based on. It aims to introduce readers without specialized knowledge of coding or data transmission to the Viterbi algorithm.

The Viterbi Algorithm Applied to Digital Data Transmission
Article in IEEE Communications Magazine · June 2002
DOI: 10.1109/MCOM.2002.1006969 · Source: IEEE Xplore
Author: Jeremiah F. Hayes, Concordia University, Montreal


THE VITERBI ALGORITHM APPLIED TO DIGITAL DATA TRANSMISSION

Jeremiah F. Hayes

Originally published in IEEE Communications Magazine, March 1975, Volume 13, Number 2

AUTHOR’S INTRODUCTION
When I wrote the tutorial on the Viterbi algorithm (VA), I was a member of the data theory group at Bell Labs, whose main work was voiceband modems. State of the art was the 9600 bps modem, which was about the size of a VCR and cost $10,000. (At the time, the rule of thumb on cost was a dollar per bps.) In that generation of modems, the only error-control measure was automatic adaptive equalizers, which combated intersymbol interference (ISI). The application of coding theory was at a low ebb and had no place in the modem. Our interest in the Viterbi algorithm was in its ability to deal with ISI. Perhaps the VA could help us improve the rate to 14,400 bps on dial-up connections. Dared we dream of 19,200 bps in some distant future? The Shannon capacity of voiceband lines was estimated at something in the neighborhood of 25,000 bps.

In 1975 it was unimaginable that modems would be a small component of a computer weighing less than five pounds and would reach rates in excess of 56,000 bps. The Shannon capacity of telephone channels has increased as the telephone network improved through the deployment of modern digital technology and optical fiber. Digital technology has also allowed the implementation of modulation and coding techniques that have made a quantum leap in the state of the art of digital transmission.

To my mind the opening shot of the revolution in modulation and coding was the work of Ungerboeck on channel trellis coding. This was the successful combination of modulation and coding ("codulation") that had been talked about for some time, where signals were designed according to Euclidean rather than Hamming distance. A key element of this technique was the Viterbi algorithm, already used for convolutional codes. This made the Viterbi algorithm a standard component of tens of millions of high-speed modems, and firmly established the value of research in coding technologies. The symbol "VA" is ubiquitous in the block diagrams of modern receivers.

Essentially, the VA finds a path through any Markov graph, that is, a sequence of states governed by a Markov chain. The many practical applications of the VA go well beyond convolutional decoding and channel trellis decoding. It is also used for fading communication channels, partial response channels in recording systems, optical character recognition, and voice recognition. The model is so fundamental that one would expect ever-widening application. Recently, it was applied to DNA sequence analysis.

The interesting but still not widely used algorithm described in 1975 has proved to be a key building block of the modern information infrastructure.

I would like to close by saluting the role of the editor of the magazine at that time, Steve Weinstein, who was also a member of the data theory group. He solicited the article and provided encouragement and much needed criticism while I was writing it. The final draft would not have been what it was without his work.

26 · IEEE Communications Magazine 50th Anniversary Commemorative Issue / May 2002


Since its development by Richard Bellman in the 1940s, dynamic programming has found wide application in control and circuit theory.¹ It is only recently that dynamic programming, in the form of the Viterbi algorithm, has been applied to communication problems. The first application was to the decoding of convolutional codes [3]. Subsequently, the technique was extended to the detection of data signals transmitted over linear models of voiceband channels [4-7]. Earlier work on optimum sequential detection of data signals was done by Chang and Hancock [8].

[FIGURE 1. a) Single pulse; b) pulse train.]

This paper describes, through systematic derivations and numerical examples, what the Viterbi algorithm is and how it works.² The reader is presumed to have a good background in basic communication theory but no specialized knowledge of data transmission or dynamic programming.

INTERSYMBOL INTERFERENCE

A major impairment encountered in the high-speed transmission of digital data over voice frequency lines is intersymbol interference. Consider the situation depicted in Fig. 1a, in which a single pulse is transmitted over a relatively narrowband channel, resulting in the pulse being smeared in time at the output. A sequence of pulses, such as in pulse amplitude modulation systems, suffers intersymbol interference when the energy from one pulse spills over into adjacent symbol intervals so as to interfere with the detection of these adjacent pulses (Fig. 1b). Thus, a sample at the center of a symbol interval is a weighted sum of amplitudes of pulses in several adjacent intervals. This effect, combined with random noise, leads to error.

The current practice is to minimize the effect of intersymbol interference by channel equalization [9], which adjusts the pulse shape so that it does not interfere with neighboring pulses at pulse centers (Fig. 1a and b). Although this approach is effective in many cases, minimizing the effect of intersymbol interference in this way is inherently suboptimum, since even the interference contains information about the symbols that were transmitted. In theory, when the channel causes a time dispersion of signal energy, the whole received signal rather than center values should be used to detect any symbol or group of symbols. Heretofore, the obstacle to optimum detection of a whole sequence of pulses has been computational complexity. The number of computations required in a straightforward approach grows exponentially with the length of the transmitted sequence. Furthermore, computation cannot begin until the entire sequence has been received.

The significance of the dynamic programming approach is that the number of computations required for optimum detection grows only linearly with the length of the transmitted sequence, and hence computations can be carried out while the sequence is being received. Although this approach reduces computational complexity, it is still, in most cases of practical interest, beyond the capability of present-day processors. This difficulty will, perhaps, be overcome with the growth of computer technology. Moreover, there are suboptimum implementations that may yield performance close to the optimum.

¹ A number of texts on the subject have been written. See, for example, [1] and [2].
² For a treatment of the Viterbi algorithm from a different point of view we recommend [21].
³ For a somewhat more complex example, see [2, ch. 1].

[FIGURE 2. A pedestrian example.]

EXAMPLE OF DYNAMIC PROGRAMMING

Dynamic programming is essentially a computational procedure for finding an optimum path or trajectory. The following rather pedestrian example³ will serve to illustrate its basic principles. A certain Professor X walks each day from his office in the EE building to the faculty lounge for lunch (Fig. 2). Between the two buildings lie two small streams, christened by some campus wag as the Publish and the Perish. Each stream runs north to south and is crossed by two footbridges. In our example, these bridges shall be designated by the stream they cross and by the appellation north or south. One day our scholarly friend decides to find the shortest path to the faculty lounge. He could, of course, simply calculate the length of all possible paths and choose the shortest. However, sensing that a general principle is involved, Professor X eschews the brute force approach. He first writes down the



distances from his office to the two bridges across the Publish. He then postulates that the optimum path is via the north bridge across the Perish. Under this assumption, he calculates the minimum path from his office to this bridge by comparing the two paths over the Publish. The same procedure is repeated for the south bridge across the Perish. At this point, the professor notes that for the purpose of further calculations, he need only keep track of the shortest path to the north bridge on the Perish and the shortest path to the south bridge on the Perish. In observing this simplification, the good professor has hit upon the basic principle of dynamic programming, the principle of optimality. The optimum total path must lie along the optimum path from his office to either the north or south bridge across the Perish. The final step is a comparison of the total distances to the lounge via the north and south bridge across the Perish. The step-by-step procedure followed by Professor X is shown in Table 1 (note the distances in Fig. 2).

TABLE 1. Comparison of total distances to the faculty lounge.

Minimum distance from office to:   Publish   Perish   Faculty lounge
via north bridge                   0.5       1.2      1.4
via south bridge                   0.8       1.0      1.3

In carrying out these calculations, six additions are necessary. Brute force enumeration would require eight additions. Now, if Professor X had to cross N streams, each with two bridges, dynamic programming would require 4(N − 1) + 2 additions, whereas straight enumeration would require (N − 1)2^N additions. Notice the difference between linear and exponential growth with N here.

This example illustrates forward dynamic programming, since the computation proceeds from the starting point of the journey. The computation could just as well have been carried out from the faculty lounge working backward, illustrating backward dynamic programming.⁴ The principle of optimality applies to both; the optimum total path must lie along an optimum subpath from the beginning or end to any intermediate point. This principle, applied to finite dimensional problems, gives rise to systematic and efficient algorithms for calculating optimum paths.

BASEBAND SIGNAL MODEL

We shall now relate this general mathematical theory to the reception of digital data signals. Let us first consider a mathematical model of a baseband signal, i.e., a signal not modulated onto a carrier.⁵ Let the sequence of numbers, called information symbols, to be transmitted be denoted a₁, ..., a_N. N, the number of information symbols, is large but finite. It is assumed that these symbols are independent and can each assume L equally probable values. These symbols amplitude modulate a train of pulses occurring at intervals T to produce the transmitted waveform

s(t) = \sum_{i=1}^{N} a_i p(t - iT)    (1)

where p(t) is the transmitted pulse and the symbol rate is 1/T Bd. The bit rate over the baseband channel is (1/T) log₂ L bits/s.

The output of the baseband channel is written

y(t) = \sum_{i=1}^{N} a_i h(t - iT) + n(t)    (2)

where n(t) is white Gaussian noise with double-sided power density spectrum N₀/2 W/Hz and where h(t) is the convolution of p(t) with the impulse response of the baseband channel. In the following derivation we shall, for simplicity, refer to h(t) as the channel impulse response.

A key assumption is that h(t) has finite duration mT (as suggested in Fig. 1). This assumption has two consequences. First, all elements of the N symbol sequence are received in the finite interval 0 ≤ t ≤ τ, (N + m)T < τ < ∞. The second consequence bears upon a term that arises in the sequel. We make the definition

r_{i-j} \triangleq \int_0^{\tau} h(t - iT)\,h(t - jT)\,dt.    (3a)

Now the finite memory of the channel implies that

r_{i-j} = 0 \quad \text{for } |i - j| > m.    (3b)

We shall refer to m as the memory of the channel in units of T.

MAXIMUM LIKELIHOOD SEQUENCE ESTIMATION

Our objective is to operate on the received signal y(t), 0 ≤ t ≤ τ, so as to produce an estimate a₁*, a₂*, ..., a_N* of the sequence of transmitted symbols a₁, a₂, ..., a_N. Given that y(t) is perturbed by additive noise, we cannot reproduce the transmitted sequence with certainty. Rather, we seek to minimize the probability of sequence error,⁶ i.e., the probability that a₁*, a₂*, ..., a_N* is different from a₁, a₂, ..., a_N. Under our assumptions on the transmitted sequence, maximum likelihood sequence estimation (MLSE) produces this minimum error probability.

In order to define MLSE, first define the probability density functional

p[\,y(t),\ 0 \le t \le \tau \mid a_1 = \hat{a}_1,\ a_2 = \hat{a}_2,\ \ldots,\ a_N = \hat{a}_N\,]

as the probability that y(t), 0 ≤ t ≤ τ, is received under the assumption that the transmitted symbols are â₁, â₂, ..., â_N. Notice that for a particular received signal there are L^N values of this quantity, since there are L^N sequences â₁, ..., â_N. In MLSE we estimate the transmitted sequence to be the sequence that maximizes this likelihood.⁷ Ostensibly, L^N calculations are required to find this maximum. The virtue of the Viterbi algorithm is that the number of calculations necessary for MLSE grows linearly with N rather than exponentially.

⁴ For our example, the distinction between backward and forward is trivial. However, there are problems that naturally fit one or the other. We shall see one shortly.
⁵ The derivation of the Viterbi algorithm presented in the sequel is due to Ungerboeck [10], who also considered the passband case.
⁶ In a later section, the distinction between sequence error and bit error is discussed.
⁷ For a discussion of decision rules, see [11].
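Professor X's forward calculation generalizes directly. A minimal sketch in Python: the function is generic, while the particular leg lengths below are hypothetical stand-ins for Fig. 2 (not reproduced here), chosen so that the running minima agree with the entries of Table 1.

```python
# Forward dynamic programming over a row of streams, each crossed by two
# bridges (index 0 = north, 1 = south). Only the shortest distance to each
# bridge is carried forward -- the principle of optimality.

def shortest_route(office_to_first, crossings, last_to_lounge):
    """office_to_first[s]: office -> bridge s of the first stream.
    crossings[k][s][t]: bridge s of stream k -> bridge t of stream k + 1.
    last_to_lounge[s]: bridge s of the last stream -> faculty lounge."""
    best = list(office_to_first)
    for hop in crossings:
        best = [min(best[s] + hop[s][t] for s in (0, 1)) for t in (0, 1)]
    return min(best[s] + last_to_lounge[s] for s in (0, 1))

# Hypothetical Fig. 2 legs: office -> Publish bridges (0.5 N, 0.8 S);
# Publish -> Perish legs chosen so the running minima become 1.2 (N) and
# 1.0 (S); Perish -> lounge legs (0.2 N, 0.3 S) give the totals 1.4 and 1.3.
route = shortest_route([0.5, 0.8], [[[0.7, 0.6], [0.4, 0.2]]], [0.2, 0.3])
```

With N streams the loop performs 4(N − 1) + 2 additions, the count quoted in the text.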



We now derive an expression for the likelihood that shows explicitly the calculations that are necessary. From Eq. (2), if a₁, a₂, ..., a_N are assumed to have particular values â₁, â₂, ..., â_N, then y(t) is a Gaussian process with mean

\sum_{i=1}^{N} \hat{a}_i h(t - iT).

A straightforward derivation⁸ allows us to write

p[\,y(t),\ 0 \le t \le \tau \mid a_1 = \hat{a}_1, \ldots, a_N = \hat{a}_N\,] = K \exp\Big\{ -\frac{1}{N_0} \int_0^{\tau} \Big[ y(t) - \sum_{i=1}^{N} \hat{a}_i h(t - iT) \Big]^2 dt \Big\}    (4)

where K is a constant independent of y(t) and the sequence â₁, ..., â_N. In finding the optimum sequence, we are interested not in absolute values of the likelihood, but in relative values for different sets of â₁, â₂, ..., â_N. This observation allows the problem to be simplified quite apart from the Viterbi algorithm. The likelihood is maximized by choosing the set â₁, â₂, ..., â_N which minimizes the integral

\int_0^{\tau} \Big[ y(t) - \sum_{i=1}^{N} \hat{a}_i h(t - iT) \Big]^2 dt.

We now expand the quadratic term under the integral sign, resulting in three terms, one of which is

\int_0^{\tau} y^2(t)\,dt.

This energy term is independent of â₁, ..., â_N and may be ignored in making comparisons. The objective function to be minimized is then

D = -2 \sum_{i=1}^{N} \hat{a}_i Z_i + \sum_{i=1}^{N} \sum_{j=1}^{N} \hat{a}_i \hat{a}_j r_{i-j}    (5)

where

Z_i \triangleq \int_0^{\tau} y(t)\,h(t - iT)\,dt, \quad i = 1, 2, \ldots, N.

(Recall that r_{i-j} was defined in Eq. (3).) The two terms in Eq. (5) embody the two kinds of information at our disposal. The terms Z_i can be viewed as the sampled outputs of the filter matched to the channel impulse response h(t). All that we need to know about the received signal y(t) is in these samples. The term r_{i-j} indicates our knowledge of the memory in the channel (see Eq. (3b)).

THE VITERBI ALGORITHM

Although Eq. (5) reflects considerable simplification, a brute force approach to its minimization requires L^N calculations. Like Professor X, we shall eschew this approach. By suitable definitions, the problem can be cast as that of optimum path selection. By use of the principle of optimality, the optimum path can be found.

We now begin a series of manipulations that lead to the desired result. First we decompose the objective function. From Eq. (5) we have

D = -2 \sum_{i=1}^{N} \hat{a}_i Z_i + \sum_{i=1}^{N} \sum_{j=1}^{N} \hat{a}_i \hat{a}_j r_{i-j}
  = -2 \sum_{i=1}^{N-1} \hat{a}_i Z_i + \sum_{i=1}^{N-1} \sum_{j=1}^{N-1} \hat{a}_i \hat{a}_j r_{i-j} - 2 \hat{a}_N Z_N + 2 \hat{a}_N \sum_{i=N-m}^{N-1} \hat{a}_i r_{N-i} + \hat{a}_N^2 r_0    (6)

Notice that there is a replication in Eq. (6) in that the first two terms on the RHS are similar in form to the LHS. We shall continue this replication in the course of developing the algorithm. Notice also that the last three terms of Eq. (6) are a function only of â_{N−m}, â_{N−m+1}, ..., â_N and not of the rest of the possible transmitted sequence. Also, these terms depend only upon one output of the matched filter, Z_N.

By the use of appropriate definitions, the objective function can be put in a more suitable form. We define the set of state vectors

\sigma_k \triangleq \{\hat{a}_{k-m+1}, \hat{a}_{k-m+2}, \ldots, \hat{a}_k\}, \quad k = m, m+1, \ldots, N.

Note that σ_k contains all the data symbols, except for â_{k+1}, that will determine y_{k+1}. There is a one-to-one correspondence between a sequence of state vectors σ_m, σ_{m+1}, ..., σ_N and an estimated sequence of transmitted symbols â₁, â₂, ..., â_N, although it is apparent that the set of state vectors has much redundancy. The problem of choosing an optimum sequence from the set â₁, ..., â_N can therefore be recast as that of choosing an optimum σ_m, ..., σ_N. Estimating the optimum sequence of states σ_m, σ_{m+1}, ..., σ_N can be viewed as optimum path selection through a lattice representing the states. In Fig. 3 such a lattice is shown for the case L = 2 and m = 3. The dotted lines indicate the transitions that can be made from one state to another.

[FIGURE 3. Lattice diagram. L = 2, m = 3.]

⁸ The random process y(t) is approximated by a sequence of Karhunen-Loève expansions. All of the coefficients in these expansions are Gaussian random variables. A limit of these expansions yields Eq. (4). For details, see [12] and [13].



For example, state σ₆ = {â₄, â₅, â₆} = {−1, +1, −1} can only have the predecessor states σ₅ = {−1, −1, +1} or σ₅ = {+1, −1, +1}, i.e., states with the same â₄ and â₅. In order to formulate the problem as optimum path selection, we must now derive suitable distance measures.

[FIGURE 4. Lattice diagram.]

We proceed by defining the quantities

U(Z_1,\ldots,Z_k;\sigma_m,\ldots,\sigma_k) \triangleq -2\sum_{i=1}^{k}\hat{a}_i Z_i + \sum_{i=1}^{k}\sum_{j=1}^{k}\hat{a}_i\hat{a}_j r_{i-j}    (8a)

and

V(Z_k;\sigma_{k-1},\sigma_k) \triangleq -2\hat{a}_k Z_k + 2\hat{a}_k \sum_{i=k-m}^{k-1} \hat{a}_i r_{k-i} + \hat{a}_k^2 r_0, \quad k = m+1,\ldots,N.    (8b)

In terms of these definitions, we can write

U(Z_1,\ldots,Z_k;\sigma_m,\ldots,\sigma_k) = U(Z_1,\ldots,Z_{k-1};\sigma_m,\ldots,\sigma_{k-1}) + V(Z_k;\sigma_{k-1},\sigma_k).    (9)

The problem of finding the set of states that minimizes the objective function (5) can be succinctly written

\min_{\sigma_m,\ldots,\sigma_N} U(Z_1,\ldots,Z_N;\sigma_m,\ldots,\sigma_N).

Expanding upon this notation, we write

\min_{\sigma_m,\ldots,\sigma_N} U(Z_1,\ldots,Z_N;\sigma_m,\ldots,\sigma_N) = \min_{\sigma_N}\Big[\min_{\sigma_m,\ldots,\sigma_{N-1}\mid\sigma_N} U(Z_1,\ldots,Z_N;\sigma_m,\ldots,\sigma_N)\Big].    (10)

Equation (10) indicates that the minimization is carried out in two steps:
1) With σ_N held fixed to one of its L^m values, minimize over σ_m, ..., σ_{N−1} (L^{N−1} comparisons, since (N − 1) data estimates go into this set of state vectors).
2) With the values obtained in 1), minimize over σ_N (L^m comparisons).

This merely specifies the order of a brute force approach. The next step breaks the minimization into three operations and introduces the definitions of Eq. (8):

\min_{\sigma_m,\ldots,\sigma_N} U = \min_{\sigma_N}\;\min_{\sigma_{N-1}\mid\sigma_N}\;\min_{\sigma_m,\ldots,\sigma_{N-2}\mid\sigma_{N-1},\sigma_N} \big[\,U(Z_1,\ldots,Z_{N-1};\sigma_m,\ldots,\sigma_{N-1}) + V(Z_N;\sigma_{N-1},\sigma_N)\,\big].    (11)

Up to this point we have merely manipulated the problem of finding the set σ_m, ..., σ_N which maximizes p[y(t), 0 ≤ t ≤ τ | a₁ = â₁, a₂ = â₂, ..., a_N = â_N] into an entirely equivalent problem, as stated in Eq. (11). As we have noted earlier, there is a one-to-one mapping between the state set σ_m, ..., σ_N and the symbol set â₁, â₂, ..., â_N. In connection with the quantities in Eq. (11), we make two observations that are crucial to the derivation of the algorithm:

\min_{\sigma_m,\ldots,\sigma_{N-2}\mid\sigma_{N-1},\sigma_N} V(Z_N;\sigma_{N-1},\sigma_N) = V(Z_N;\sigma_{N-1},\sigma_N).    (12a)

This simply says that if σ_{N−1} and σ_N are fixed, no variation of the states σ_m, ..., σ_{N−2} can change the value of V(Z_N; σ_{N−1}, σ_N). Likewise,

\min_{\sigma_m,\ldots,\sigma_{N-2}\mid\sigma_{N-1},\sigma_N} U(Z_1,\ldots,Z_{N-1};\sigma_m,\ldots,\sigma_{N-1}) = \min_{\sigma_m,\ldots,\sigma_{N-2}\mid\sigma_{N-1}} U(Z_1,\ldots,Z_{N-1};\sigma_m,\ldots,\sigma_{N-1}).    (12b)

If σ_{N−1} is fixed, fixing σ_N has no bearing on the value of U(Z₁, ..., Z_{N−1}; σ_m, ..., σ_{N−1}). Combining Eqs. (10), (11), and (12) yields

\min_{\sigma_m,\ldots,\sigma_N} U = \min_{\sigma_N}\;\min_{\sigma_{N-1}\mid\sigma_N} \Big[\,V(Z_N;\sigma_{N-1},\sigma_N) + \min_{\sigma_m,\ldots,\sigma_{N-2}\mid\sigma_{N-1}} U(Z_1,\ldots,Z_{N-1};\sigma_m,\ldots,\sigma_{N-1})\,\Big].    (13)

By the same steps that led to Eq. (13), we can decompose the second term on the RHS of Eq. (13). We find

\min_{\sigma_m,\ldots,\sigma_{N-2}\mid\sigma_{N-1}} U(Z_1,\ldots,Z_{N-1};\sigma_m,\ldots,\sigma_{N-1}) = \min_{\sigma_{N-2}\mid\sigma_{N-1}} \Big[\,V(Z_{N-1};\sigma_{N-2},\sigma_{N-1}) + \min_{\sigma_m,\ldots,\sigma_{N-3}\mid\sigma_{N-2}} U(Z_1,\ldots,Z_{N-2};\sigma_m,\ldots,\sigma_{N-2})\,\Big].

This decomposition can be continued. It is easiest to express this with some additional notation. We write

F(\sigma_{k+1}) \triangleq \min_{\sigma_k\mid\sigma_{k+1}} \big[\,V(Z_{k+1};\sigma_k,\sigma_{k+1}) + F(\sigma_k)\,\big], \quad k = m,\ldots,N-1    (14)

where

F(\sigma_m) \triangleq U(Z_1,\ldots,Z_m;\sigma_m)

and

F(\sigma_k) \triangleq \min_{\sigma_m,\ldots,\sigma_{k-1}\mid\sigma_k} U(Z_1,\ldots,Z_k;\sigma_m,\ldots,\sigma_k), \quad k = m+1,\ldots,N-1.


Although not indicated explicitly, F(σ_k) is still dependent on Z₁, ..., Z_k.

Equations (13) and (14) embody the Viterbi algorithm. One begins by computing F(σ_m) for each of the L^m values of the state σ_m (see Eq. (8a)). Then, employing Eq. (14), L^m values of F(σ_{m+1}) are found. Recall that in connection with Fig. 3 we said that σ_{m+1} had L possible predecessor states, so that the minimization over σ_m given σ_{m+1} requires L comparisons for each of the L^m possible values of σ_{m+1}. The quantity F(σ_{m+1}) is the "minimum distance" to a particular value of state σ_{m+1}. From the principle of optimality, the optimum total path through the state lattice, as in Fig. 3, must lie on one of the paths to each of the L^m possible realizations of state σ_{m+1}. In the same way that F(σ_{m+1}) is computed, so are the successive quantities F(σ_{m+2}), F(σ_{m+3}), ..., F(σ_N). We end with L^m possible paths through the lattice, each ending with a different value of σ_N. The final step is choosing the path for which F(σ_N) is minimum.

It should be emphasized that L^m is related to the number of levels per transmitted symbol and the memory of the channel, but not to N, the number of symbols transmitted. Hence, we have shown that the Viterbi algorithm requires a fixed number of computations per symbol independent of the number of the symbols received.

EXAMPLE

In order to illustrate the Viterbi algorithm, we go through an example step by step. Let a system with the characteristics L = 2 (data values ±1), m = 2, r₀ = 1, r_{±1} = 0.5, and r_{±2} = −0.25 be given. We consider a sequence seven symbols long with the successive outputs of the matched filter being Z₁ = 1.5, Z₂ = 2.0, Z₃ = 0.5, Z₄ = 1.0, Z₅ = −1.5, Z₆ = −3.0, and Z₇ = 0.5. The steps required to detect the transmitted sequence are as follows:
1) For each of the L^m = 4 states σ₂ = {â₁, â₂}, calculate U(Z₁, Z₂; σ₂) according to Eq. (8a). These values, along with those obtained in the succeeding steps of the algorithm, are shown in the lattice diagram of Fig. 4.
2) Apply Eq. (14) to find the optimum path to each of the four states σ₃ = {â₂, â₃}. For example, let σ₃ = {−1, +1}. As shown in Fig. 4, the predecessor states are {±1, −1}. From {−1, −1} the distance is +11.5, and from {+1, −1} it is +2.5. Thus, the optimum path to {−1, +1} is via {+1, −1}. This is indicated in Fig. 4 by the solid line.
3) Repeat step 2) for each of the succeeding states σ₄, σ₅, σ₆, and σ₇, obtaining four paths through the lattice.
4) Optimize over the final state σ₇. In Fig. 4 we see that the optimum path passes through σ₇ = {−1, +1}. The optimum path is the one that leads through the succession of states {+1, +1}, {+1, −1}, {−1, +1}, {+1, −1}, {−1, −1}, {−1, +1}, indicating the symbol sequence â₁ = +1, â₂ = +1, â₃ = −1, â₄ = +1, â₅ = −1, â₆ = −1, and â₇ = +1.

[FIGURE 5. Lattice diagram: optimum sequence detection.]

MERGES

From the foregoing we see that it is not necessary to wait until the entire sequence Z₁, Z₂, ..., Z_N has been received before making calculations. Equation (14) shows that the quantity F(σ_k) is updated by V(Z_{k+1}; σ_k, σ_{k+1}), which is a function of the most recent output Z_{k+1} of the matched filter.

Examination of Fig. 4 also discloses that it is not necessary to wait until the entire output of the matched filter Z₁, ..., Z_N has been received before making decisions. Consider the four paths leading to state σ₄. There is a merge in that each of these paths passes through the state σ₂ = {+1, +1}. Whatever happens from this point on does not change anything before the merge, so we can immediately make the decision â₁ = +1, â₂ = +1. In Fig. 4 it happens that merges take place in the minimum time, i.e., the minimum number of steps required to go from any value of one state to any value of another. This minimum time is equal to the memory, which in this case is m = 2. The assumption of a different set of channel characteristics and matched filter outputs could lead to the lattice diagram shown in Fig. 5. Here the merge does not take place in minimum time. It is not until state σ₅ has been reached that one can make decisions. Merging is a random phenomenon, and it is possible that, in an unfortunate set of circumstances, no decisions can be made until the end of the entire sequence.


IMPLEMENTATION

The complexity of the Viterbi algorithm can be assessed by computing the amount of memory and the number of arithmetic operations required to implement it. From the foregoing it is evident that both of these quantities depend directly upon the number of possibilities L^m for the state vector at any given time. In a not unlikely case, L = 4 and m = 6, implying 1,048,576 possibilities. Clearly, for the dynamic programming technique to be feasible with present technology, the number of states must be reduced. Two techniques for reducing the number of states have been considered in the literature. Qureshi and Newhall [15] propose a prefiltering technique that reduces the spread of the signal pulses. The combined effect of the prefilter and the channel yields a reduction of the memory m. This approach is inherently suboptimum, since the additive noise is enhanced. Vermeulen and Hellman [16] study a state reduction technique in which only the most probable states are retained as the algorithm is carried out. Both of these techniques show promising results under certain circumstances. (See also [22].)

PERFORMANCE

From the properties of maximum likelihood estimation, we are assured that the probability of sequence error, i.e., mistaking one sequence for another, is minimized. In his paper on the Viterbi algorithm [7], Forney⁹ presented upper and lower bounds on the probability of bit error for large signal-to-noise ratio when maximum likelihood sequence estimation is used.

Minimizing the probability of sequence error corresponds roughly to the minimization of the probability of block error in digital communications. Iterative algorithms similar to the foregoing have been developed for minimizing the probability of incorrectly detecting individual symbols [19]. Optimum bit-by-bit detection and optimum sequence detection are not synonymous.¹⁰ In a sense, optimum sequence detection considers all erroneous sequences to be equally bad. Thus, for low signal-to-noise situations, errors may lead to detected sequences that are very far from the true sequence. However, for high signal-to-noise ratio, it is conjectured that both techniques would give about the same performance. The comparison of these two techniques is a subject that merits further attention.

CONCLUSION

We have studied the application of the Viterbi algorithm to the detection of digital signals. The Viterbi algorithm is one of several nonlinear approaches to signal detection. We recommend the survey article by Lucky [20] summarizing the work in this area.

Another survey paper, by Forney [21], places the Viterbi algorithm in a wider context than digital transmission over narrowband channels. The Viterbi algorithm is generally applicable to the problem of detecting a discrete-time finite-state Markov process immersed in additive memoryless noise. To date, the Viterbi algorithm has been applied to decoding convolutional codes, FSK, text recognition, and magnetic tape recording. Doubtlessly, more applications will be found as work in this area progresses.

ACKNOWLEDGMENT

The author would like to express his appreciation to R. D. Gitlin and F. R. Magee for many illuminating discussions, and to S. B. Weinstein for many helpful suggestions on the manuscript.

⁹ In a recent paper, Foschini [17] made further contributions to this aspect of Forney's work. Also, Anderson and Foschini [18] have described how to calculate these bounds.
¹⁰ For a summary of work on bit-by-bit detection, see the survey paper by Lucky [20].

REFERENCES

[1] R. E. Bellman, Dynamic Programming, Princeton, NJ: Princeton Univ. Press, 1957.
[2] S. E. Dreyfus, Dynamic Programming and the Calculus of Variations, New York and London: Academic, 1965.
[3] A. J. Viterbi, "Error Bounds for Convolutional Codes and an Asymptotically Optimum Decoding Algorithm," IEEE Trans. Info. Theory, vol. IT-13, Apr. 1967, pp. 260-69.
[4] H. Kobayashi, "Application of Probabilistic Decoding to Digital Magnetic Recording Systems," IBM J. Res. Develop., vol. 15, Jan. 1971, pp. 64-74.
[5] H. Kobayashi, "Correlative Level Coding and Maximum-Likelihood Decoding," IEEE Trans. Info. Theory, vol. IT-17, Sept. 1971, pp. 586-94.
[6] J. K. Omura, "Optimal Receiver Design for Convolutional Codes and Channels with Memory via Control Theoretical Concepts," Inform. Sci., vol. 3, July 1971, pp. 243-66.
[7] G. D. Forney, Jr., "Maximum Likelihood Sequence Estimation of Digital Sequences in the Presence of Intersymbol Interference," IEEE Trans. Info. Theory, vol. IT-18, May 1972, pp. 363-78.
[8] R. W. Chang and J. C. Hancock, "On Receiver Structures for Channels Having Memory," IEEE Trans. Info. Theory, vol. IT-12, Oct. 1966, pp. 463-68.
[9] R. W. Lucky, J. Salz, and E. J. Weldon, Jr., Principles of Data Communications, New York: McGraw-Hill, 1968, ch. 6.
[10] G. Ungerboeck, "Adaptive Maximum-Likelihood Receiver for Carrier Modulated Data-Transmission Systems," IEEE Trans. Commun., vol. COM-22, May 1974, pp. 624-36.
[11] G. L. Turin, Notes on Digital Communication, New York: Van Nostrand Reinhold, 1969, ch. 2.
[12] Ibid., ch. 3.
[13] W. B. Davenport, Jr. and W. L. Root, An Introduction to the Theory of Random Signals and Noise, New York: McGraw-Hill, 1958.
[14] L. K. Mackechnie, "Receivers for Channels with Intersymbol Interference" (abstract), presented at the IEEE Int'l. Symp. Info. Theory, 1972.
[15] S. U. H. Qureshi and E. E. Newhall, "An Adaptive Receiver for Data Transmission over Time-Dispersive Channels," IEEE Trans. Info. Theory, vol. IT-19, July 1973, pp. 448-51.
[16] F. L. Vermeulen and M. E. Hellman, "Reduced State Viterbi Decoding for Channels with Intersymbol Interference" (abstract only), presented at the IEEE Int'l. Symp. Info. Theory, 1972.
[17] G. J. Foschini, "Performance Bound for Maximum Likelihood Reception of Digital Data," IEEE Trans. Info. Theory, to be published.
[18] R. R. Anderson and G. J. Foschini, "The Minimum Distance for MLSE Digital Data Systems of Limited Complexity," IEEE Trans. Info. Theory, to be published.
[19] K. Abend and B. D. Fritchman, "Statistical Detection for Communications Channels with Intersymbol Interference," Proc. IEEE, vol. 58, May 1970, pp. 779-85.
[20] R. W. Lucky, "A Survey of the Communication Theory Literature, 1968-1973," IEEE Trans. Info. Theory, vol. IT-19, Nov. 1973, pp. 725-39.
[21] G. D. Forney, Jr., "The Viterbi Algorithm," Proc. IEEE, vol. 61, Mar. 1973, pp. 268-78.
[22] D. D. Falconer and F. R. Magee, Jr., "Adaptive Channel Memory Truncation for Maximum Likelihood Sequence Estimation," Bell Syst. Tech. J., vol. 52, Nov. 1973, pp. 1541-61.


