A Turbo Codes Tutorial
William E. Ryan
Abstract. We give a tutorial exposition of turbo codes, a simple derivation for the performance of turbo codes, and a straightforward presentation of the iterative decoding algorithm. The derivations of both the performance estimate and the modified BCJR decoding algorithm are novel. The treatment is intended to be a launching point for further study in the field and, significantly, to provide sufficient information for the design of computer simulations.

… description of the algorithm here. This paper borrows from some of the most prominent publications in the field [4]-[9], sometimes adding details that were omitted in those works. However, the general presentation and some of the derivations are novel. Our goal is a self-contained, simple introduction to turbo codes for those already knowledgeable in the fields of algebraic and trellis codes.

The paper is organized as follows. In the next section we present the structure of the encoder, which leads to an estimate of its performance. The subsequent section then describes the iterative algorithm used to decode these codes. The treatment in each of these sections is meant to be sufficiently detailed so that one may with reasonable ease design a computer simulation of the encoder and decoder.

II. The Encoder and Its Performance

Fig. 1 depicts a standard turbo encoder. As seen in the figure, a turbo encoder consists of two binary rate 1/2 convolutional encoders separated by an N-bit interleaver or permuter, together with an optional puncturing mechanism. Clearly, without the puncturer, the encoder is rate 1/3, mapping N data bits to 3N code bits. We observe …

Observe that the code sequence corresponding to the encoder input u(D) for the former (non-recursive) code is u(D)G_NR(D) = [u(D)g1(D)  u(D)g2(D)], and that the identical code sequence is produced in the recursive code by the sequence u'(D) = u(D)g1(D), since in this case the code sequence is u(D)g1(D)G_R(D) = u(D)G_NR(D). Here, we loosely call the pair of polynomials u(D)G_NR(D) a code sequence, although the actual code sequence is derived from this polynomial pair in the usual way.

Observe that, for the recursive encoder, the code sequence will be of finite weight if and only if the input sequence is divisible by g1(D). We have the following immediate corollaries of this fact, which we shall use later.
Corollary 1. A weight-one input will produce an infinite weight output (for such an input is never divisible by a polynomial g1(D)).

Corollary 2. For any non-trivial g1(D), there exists a family of weight-two inputs of the form D^j (1 + D^q), j ≥ 0, which produce finite weight outputs, i.e., which are divisible by g1(D). When g1(D) is a primitive polynomial of degree m, then q = 2^m − 1; more generally, q is the length of the pseudorandom sequence generated by g1(D).

In the context of the code's trellis, Corollary 1 says that a weight-one input will create a path that diverges from the all-zeros path, but never remerges. Corollary 2 says that there will always exist a trellis path that diverges and remerges later which corresponds to a weight-two data sequence.

Example 1. Consider the code with generator matrix

G(D) = [ 1   g2(D)/g1(D) ] …

The weight-two input u(D) = 1 + D^15 produces the finite-length code sequence (1 + D^15, 1 + D + D^2 + D^3 + D^5 + D^7 + D^8 + D^11 + …). In addition to elaborating on Corollary 2, this example serves to demonstrate … specifying such encoders.
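Corollaries 1 and 2, and the divisibility behind Example 1, are easy to check numerically. Below is a minimal sketch in Python (the helper is our own; a polynomial a(D) is stored as an integer whose bit i is the coefficient of D^i), using the primitive feedback polynomial g1(D) = 1 + D + D^4 that appears later in the paper, for which m = 4 and q = 2^4 − 1 = 15:

```python
def gf2_mod(a: int, b: int) -> int:
    """Remainder of a(D) mod b(D) over GF(2); bit i holds the D^i coefficient."""
    db = b.bit_length() - 1                   # degree of b(D)
    while a and a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)   # cancel the leading term of a(D)
    return a

g1 = 0b10011                                  # g1(D) = 1 + D + D^4, primitive

# Corollary 1: weight-one inputs D^j are never divisible by g1(D).
assert all(gf2_mod(1 << j, g1) != 0 for j in range(64))

# Corollary 2: every member of the family D^j (1 + D^15) is divisible by g1(D).
assert all(gf2_mod((1 | 1 << 15) << j, g1) == 0 for j in range(20))

# A weight-two input whose exponent gap is not a multiple of 15 is not divisible.
assert gf2_mod(1 | 1 << 5, g1) != 0
```

In trellis terms, the second assertion exhibits the diverge-and-remerge paths of Corollary 2, while the first confirms that a lone 1 opens a path that never remerges.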
Also important is that N be selected quite large, and we shall assume N ≥ 1000 hereafter. The importance of these … as well as any other, provided N is large.

C. The Puncturer

While for deep space applications low-rate codes are appropriate, in other situations, such as satellite communications, a rate of 1/2 or higher is preferred. The role of the turbo code puncturer is identical to that of its convolutional code counterpart: … all even parity bits from the top encoder and all odd parity bits from the bottom one.

… A maximum-likelihood (ML) sequence decoder would be far too complex for a turbo code due to the presence of the permuter. However, the suboptimum iterative decoding algorithm to be described there offers near-ML performance. Hence, we shall now estimate the performance of an ML decoder (analysis of the iterative decoder is much more difficult).

Armed with the above descriptions of the components of the turbo encoder of Fig. 1, it is easy to conclude that it is linear since its components are linear. The constituent codes are certainly linear, and the permuter is linear since it may be modeled by a permutation matrix. Further, the puncturer does not affect linearity since all codewords … Now consider the all-zeros codeword (the 0 codeword) and the k-th codeword, for some k ∈ {1, 2, ..., 2^N − 1}. The bit error rate for this two-codeword situation would then be … From a union bound argument, we may write

P_b ≤ Σ_{k=1}^{2^N − 1} P_b(k|0) = Σ_{k=1}^{2^N − 1} (w_k/N) Q( √(2 r d_k E_b/N_0) ).   (1)
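For simulation work, the Q-function appearing in (1) is conveniently computed from the complementary error function via Q(x) = erfc(x/√2)/2. A small Python sketch of one term of the union bound follows; the parameter values in the example call are illustrative only, not taken from the paper:

```python
from math import erfc, sqrt

def Q(x: float) -> float:
    """Gaussian tail probability, Q(x) = erfc(x / sqrt(2)) / 2."""
    return 0.5 * erfc(x / sqrt(2.0))

def pairwise_term(w_k: int, d_k: int, N: int, r: float, EbN0: float) -> float:
    """One term (w_k / N) Q(sqrt(2 r d_k Eb/N0)) of the union bound (1)."""
    return (w_k / N) * Q(sqrt(2.0 * r * d_k * EbN0))

# Illustrative values: N = 1000, r = 1/2, a codeword of weight d_k = 12
# produced by a data word of weight w_k = 2, at Eb/N0 = 2 (about 3 dB).
term = pairwise_term(2, 12, 1000, 0.5, 2.0)
assert 0.0 < term < 2 / 1000    # each term is bounded by w_k/N since Q <= 1/2
```

Summing such terms over a code's low-weight codewords gives the truncated bound developed next.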
… sum is over the different weight-w inputs, and d_{wv} is the weight of the v-th codeword produced by a weight-w input.

Consider now the first few terms in the outer summation of (1).

w = 1: From Corollary 1 and associated discussion above, weight-one inputs will produce only large weight codewords at both constituent encoder outputs since the … codeword weight, so that the w = 1 terms in (1) will be negligible.

w = 2: Of the weight-two encoder inputs, only a fraction will be divisible by g1(D) (i.e., yield remergent paths) and, of these, only certain ones will yield the smallest weight, d^{CC}_{2,min}, at a constituent encoder output (here, CC denotes "constituent code"). … Given a weight-d^{CC}_{2,min} codeword at the first encoder's output, it is unlikely that the permuted input, u'(D), seen by the second encoder will also correspond to a weight-d^{CC}_{2,min} codeword (much less, be divisible by g1(D)). We can be sure, however, that there will be some minimum-weight turbo codeword produced by a w = 2 input, and that this minimum weight can be bounded as

d^{TC}_{2,min} ≥ 2 d^{CC}_{2,min} − 2.

Denoting the number of weight-two inputs which produce weight-d^{TC}_{2,min} turbo codewords by n_2, for w = 2 the inner sum in (1) may be approximated as

Σ_v (2/N) Q( √(2 r d_{2v} E_b/N_0) ) ≃ (2 n_2/N) Q( √(2 r d^{TC}_{2,min} E_b/N_0) ).   (2)

Similarly, for w = 3, the inner sum in (1) may be approximated as

Σ_v (3/N) Q( √(2 r d_{3v} E_b/N_0) ) ≃ (3 n_3/N) Q( √(2 r d^{TC}_{3,min} E_b/N_0) ),   (3)

where n_3 and d^{TC}_{3,min} are obviously defined. While n_3 is clearly dependent on the interleaver, we can make some comments on its size relative to n_2 for a "randomly generated" interleaver. Although there are (N − 2)/3 times as many w = 3 terms in the inner summation of (1) as there are w = 2 terms, we can expect the number of weight-three terms divisible by g1(D) to be of the order of the number of weight-two terms divisible by g1(D). Thus, most of the N-choose-3 terms in (1) can be removed from consideration for this reason. Moreover, given a weight-three encoder input divisible by g1(D) (for example), it becomes very unlikely that the permuted input u'(D) seen by the second encoder will also be divisible by g1(D). For example, suppose u(D) = g1(D) = 1 + D + D^4. Then the permuter output will be a multiple of g1(D) if the three input 1's become the j-th, (j+1)-th, and (j+4)-th bits out of the permuter, for some j.¹ If we imagine that the permuter acts in a purely random fashion, so that the probability that one of the 1's lands in a given position is 1/N, a given permuter output pattern occurs with probability 2!/N². Thus, we would expect the number of weight-three inputs, n_3, resulting in remergent paths in both encoders to be much less than n_2,²

n_3 << n_2,

so that the w = 3 contribution … is negligible relative to that for w = 2.

w ≥ 4: Again we can approximate the inner sum in (1) for w = 4 in the same manner as in (2) and (3). Still we would like to make some comments on its size for the "random" interleaver. A weight-four input might appear to the first encoder as a weight-three input concatenated some time later with a weight-one input, leading to a non-remergent path in the trellis and, hence, a negligible term in the inner sum in (1). It might also appear as a concatenation of two weight-two inputs, … divisible by g1(D) at the second encoder. Thus, we may expect³ n_4 << n_2, so that the w ≥ 4 terms are negligible in (4).

To summarize, the bound in (1) can be approximated as

P_b ≃ Σ_w (w n_w/N) Q( √(2 r d^{TC}_{w,min} E_b/N_0) ),   (4)

where n_w and d^{TC}_{w,min} are functions of the particular interleaver employed. From our discussion above, we might expect that the w = 2 term dominates for a randomly generated interleaver, although it is easy to find interleavers with n_2 = 0, as seen in the example to follow.

¹ This is not the only weight-three pattern divisible by g1(D); g1²(D) = 1 + D² + D^8 is another one, but this too has probability 3!/N³ of occurring.

² Because our argument assumes a "purely random" permuter, the inequality n_3 << n_2 has to be interpreted probabilistically. Thus, it is more accurate to write E{n_3} << E{n_2}, where the expectation is over all interleavers. Alternatively, for the average interleaver, we would expect n_3 << n_2; thus if n_2 = 5, say, we would expect n_3 = 0.

³ The value of 1/N³ derives from the fact that ideally a particular divisible output pattern occurs with probability 4!/N⁴, but there will be approximately N shifted versions of that pattern, each divisible by g1(D).
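Once an interleaver's low-weight spectrum {(n_w, d^{TC}_{w,min})} is known, the estimate (4) is a one-line computation. A sketch in Python, using for illustration the spectra quoted for the two interleavers of Example 2 below (n_3 = 1, d^{TC}_{3,min} = 9 for the random interleaver; n_2 = 1, d^{TC}_{2,min} = 12 and n_3 = 4, d^{TC}_{3,min} = 15 for the hand-modified one):

```python
from math import erfc, sqrt

def Q(x: float) -> float:
    """Gaussian tail probability, Q(x) = erfc(x / sqrt(2)) / 2."""
    return 0.5 * erfc(x / sqrt(2.0))

def pb_estimate(spectrum, N, r, EbN0_dB):
    """Approximation (4): Pb ~ sum_w (w n_w / N) Q(sqrt(2 r d_w,min Eb/N0)).
    spectrum maps each dominant input weight w to the pair (n_w, d_w_min)."""
    EbN0 = 10.0 ** (EbN0_dB / 10.0)
    return sum(w * n / N * Q(sqrt(2.0 * r * d * EbN0))
               for w, (n, d) in spectrum.items())

N, r = 1000, 0.5
intlvr1 = {3: (1, 9)}                  # random interleaver of Example 2
intlvr2 = {2: (1, 12), 3: (4, 15)}    # hand-modified interleaver of Example 2

# The modified interleaver's larger minimum distances lower the estimate.
assert pb_estimate(intlvr2, N, r, 3.0) < pb_estimate(intlvr1, N, r, 3.0)
```

Note the explicit 1/N factor in each term: doubling the interleaver length halves the estimate, which is the interleaver gain effect noted in the text.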
In any case, we observe that P_b decreases with N, so that the error rate can be reduced simply by increasing the interleaver length. This effect is called interleaver gain (or permuter gain) and demonstrates the necessity of large permuters. Finally, we note that recursive encoders are crucial elements of a turbo code since, for non-recursive encoders, division by g1(D) (non-remergent trellis paths) would not be an issue and (4) would not hold (although (1) still would).

Example 2. We consider the performance of a rate 1/2, (31, 33) turbo code for two different interleavers of size N = 1000. We start first with an interleaver that was randomly generated. We found for this particular interleaver n_2 = 0 and n_3 = 1, with d^{TC}_{3,min} = 9, so that the w = 3 term dominates in (4). Interestingly, the interleaver input corresponding to this dominant error event was D^{168}(1 + D^5 + D^{10}), which produces the interleaver output D^{88}(1 + D^{15} + D^{848}), where of course both polynomials are divisible by g1(D) = 1 + D + D^4. Figure 3 compares the estimate (4) with simulated performance for several iterations of the iterative decoding algorithm detailed in the next section; the estimate is seen to be close to the simulated values. The interleaver was then modified by hand to improve the weight spectrum of the code. It was a simple matter to attain n_2 = 1 with d^{TC}_{2,min} = 12 and n_3 = 4 with d^{TC}_{3,min} = 15 for this second interleaver, so that the w = 2 term now dominates in (4). The simulated and estimated performance curves for this second interleaver …

III. The Decoder

Consider first an ML decoder for a rate 1/2 convolutional code (recursive or not), and assume a data word of length N. The ML decoder compares all 2^N code sequences to the noisy received sequence, choosing in favor of the codeword with the best correlation metric. Clearly, the complexity of such an algorithm is exorbitant. Fortunately, as we know, such a brute-force approach is simplified greatly by Viterbi's algorithm, which permits a systematic elimination of candidate code sequences (in the first step, 2^{N−1} are eliminated, then another 2^{N−2}, and so on). Unfortunately, we have no such luck with turbo codes, for the presence of the permuter immensely complicates the structure of a turbo code trellis, making these codes look more like block codes.

Just prior to the discovery of turbo codes, there was much interest in suboptimal decoding schemes involving multiple (usually two) decoders operating cooperatively and iteratively. Most of the focus was on a type of Viterbi decoder which provides soft-output (or reliability) information to a companion soft-output Viterbi decoder for use in a subsequent decoding [10]. Also receiving some attention was the symbol-by-symbol maximum a posteriori (MAP) algorithm of Bahl, et al. [11], published over 20 years ago. It was this latter algorithm, often called the BCJR algorithm, that Berrou, et al. [1], utilized in the iterative decoding of turbo codes. We will discuss in this section the BCJR algorithm employed by each constituent decoder, but we refer the reader to [11] for the derivations of some of the results.

We first discuss a modified version of the BCJR algorithm for performing symbol-by-symbol MAP decoding. We then show how this algorithm is incorporated into an iterative decoder employing two BCJR-MAP decoders. We shall require the following definitions:

- E1 is a notation for encoder 1
- D1 is a notation for decoder 1
- D2 is a notation for decoder 2
- m is the constituent encoder memory
- S is the set of all 2^m constituent encoder states
- y_a^b = (y_a, y_{a+1}, ..., y_b)
- y = y_1^N = (y_1, y_2, ..., y_N) is the noisy received codeword

In the symbol-by-symbol MAP decoder, the decoder decides u_k = +1 if P(u_k = +1 | y) > P(u_k = −1 | y), and it decides u_k = −1 otherwise. More succinctly, the decision û_k is given by

û_k = sign[L(u_k)],

where L(u_k) is the log a posteriori probability (LAPP) ratio defined as

L(u_k) ≜ log( P(u_k = +1 | y) / P(u_k = −1 | y) ).

Incorporating the code's trellis, this may be written as

L(u_k) = log( Σ_{S+} p(s_{k−1} = s', s_k = s, y)/p(y) / Σ_{S−} p(s_{k−1} = s', s_k = s, y)/p(y) ),   (5)
where S+ is the set of ordered pairs (s', s) corresponding to all state transitions (s_{k−1} = s') → (s_k = s) caused by data input u_k = +1, and S− is similarly defined for u_k = −1.

Observe we may cancel p(y) in (5), which means we require only an algorithm for computing p(s', s, y) = p(s_{k−1} = s', s_k = s, y). The BCJR algorithm [11] for doing this is

p(s', s, y) = α_{k−1}(s') · γ_k(s', s) · β_k(s),   (6)

where α_k(s) ≜ p(s_k = s, y_1^k) is computed recursively as

α_k(s) = Σ_{s'∈S} α_{k−1}(s') γ_k(s', s)   (7)

with initial conditions

α_0(0) = 1 and α_0(s ≠ 0) = 0.   (8)

The probabilities γ_k(s', s) are defined as

γ_k(s', s) ≜ p(s_k = s, y_k | s_{k−1} = s')   (9)

and will be discussed further below. The probabilities β_k(s) ≜ p(y_{k+1}^N | s_k = s) in (6) are computed in a "backward" recursion as

β_{k−1}(s') = Σ_{s∈S} β_k(s) γ_k(s', s)   (10)

with boundary conditions

β_N(0) = 1 and β_N(s ≠ 0) = 0.   (11)

(The encoder is expected to end in state 0 after N input bits, implying that the last m input bits, called termination bits, are so selected.)

Unfortunately, cancelling the divisor p(y) in (5) leads to a numerically unstable algorithm. We can include division by p(y) in the BCJR algorithm by defining the modified probabilities

α̃_k(s) = α_k(s)/p(y_1^k)

and

β̃_k(s) = β_k(s)/p(y_{k+1}^N | y_1^k),

so that

p(s', s | y) p(y_k | y_1^{k−1}) = α̃_{k−1}(s') · γ_k(s', s) · β̃_k(s).   (12)

Note since p(y_1^k) = Σ_{s∈S} α_k(s), the values α̃_k(s) may be computed from {α_k(s) : s ∈ S} via

α̃_k(s) = α_k(s) / Σ_{s∈S} α_k(s).   (13)

Given the definition of α̃_k(s), we can use (7) in (13) to obtain a recursion involving only the α̃_k(s):

α̃_k(s) = Σ_{s'} α_{k−1}(s') γ_k(s', s) / Σ_s Σ_{s'} α_{k−1}(s') γ_k(s', s)
        = Σ_{s'} α̃_{k−1}(s') γ_k(s', s) / Σ_s Σ_{s'} α̃_{k−1}(s') γ_k(s', s),   (14)

where the second equality follows by dividing the numerator and the denominator by p(y_1^{k−1}). Similarly, writing

p(y_k^N | y_1^{k−1}) = p(y_1^N)/p(y_1^{k−1}) = p(y_k | y_1^{k−1}) · p(y_{k+1}^N | y_1^k)

and noting from (7) and the definition of α̃_k(s) that

p(y_k | y_1^{k−1}) = Σ_s Σ_{s'} α̃_{k−1}(s') γ_k(s', s),

so that dividing (10) by this equation yields

β̃_{k−1}(s') = Σ_s β̃_k(s) γ_k(s', s) / Σ_s Σ_{s'} α̃_{k−1}(s') γ_k(s', s).   (15)

In summary, the modified BCJR-MAP algorithm involves computing the LAPP ratio L(u_k) by combining (5) and (12) to obtain

L(u_k) = log( Σ_{S+} α̃_{k−1}(s') · γ_k(s', s) · β̃_k(s) / Σ_{S−} α̃_{k−1}(s') · γ_k(s', s) · β̃_k(s) ),   (16)

where the α̃'s and β̃'s are computed recursively via (14) and (15), respectively. Clearly the α̃_k(s) and β̃_k(s) share the same boundary conditions as their counterparts, as given in (8) and (11). Computation of the probabilities γ_k(s', s) will be discussed shortly. On the topic of stability, we should point out also that the algorithm given here works in software, but a hardware implementation would require the log-domain version of the algorithm [12], [14]. In fact, most software implementations … decoding algorithms [1], [13], [4].

B. Iterative MAP Decoding

From Bayes' rule, the LAPP ratio for an arbitrary MAP decoder can be written as

L(u_k) = log( P(y | u_k = +1) / P(y | u_k = −1) ) + log( P(u_k = +1) / P(u_k = −1) ).
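The normalized recursions (14) and (15) are simple to implement. The Python sketch below runs them on a toy two-state trellis of our own choosing, in which every transition s' → s is allowed and arbitrary positive numbers stand in for the branch metrics γ_k(s', s); it demonstrates only the normalization mechanics, namely that each α̃_k is a probability distribution over states and that both recursions share the per-step normalizer Σ_s Σ_{s'} α̃_{k−1}(s') γ_k(s', s):

```python
import random

STATES = (0, 1)   # toy two-state trellis; every ordered pair (s', s) allowed

def forward(gammas):
    """Normalized forward recursion (14). gammas[k][(sp, s)] plays the role
    of gamma_k(s', s); returns [alpha_tilde_0, ..., alpha_tilde_K]."""
    a = {0: 1.0, 1: 0.0}                  # boundary condition (8)
    out = [a]
    for g in gammas:
        unnorm = {s: sum(a[sp] * g[(sp, s)] for sp in STATES) for s in STATES}
        z = sum(unnorm.values())          # plays the role of p(y_k | y_1^{k-1})
        a = {s: unnorm[s] / z for s in STATES}
        out.append(a)
    return out

def backward(gammas, alphas):
    """Normalized backward recursion (15), reusing the forward normalizer
    sum_s sum_s' alpha_tilde_{k-1}(s') gamma_k(s', s)."""
    K = len(gammas)
    betas = [None] * (K + 1)
    betas[K] = {0: 1.0, 1: 0.0}           # boundary condition (11)
    for k in range(K - 1, -1, -1):
        g, a, bn = gammas[k], alphas[k], betas[k + 1]
        z = sum(a[sp] * g[(sp, s)] for sp in STATES for s in STATES)
        betas[k] = {sp: sum(bn[s] * g[(sp, s)] for s in STATES) / z
                    for sp in STATES}
    return betas

random.seed(0)
K = 8
gammas = [{(sp, s): random.uniform(0.1, 1.0) for sp in STATES for s in STATES}
          for _ in range(K)]
alphas = forward(gammas)
betas = backward(gammas, alphas)
assert all(abs(sum(a.values()) - 1.0) < 1e-9 for a in alphas)
```

In a real decoder the γ's come from the received symbols and the a priori information, as derived next; the recursions themselves are unchanged.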
… information for each u_k from D2, which serves as a priori information. Similarly, D2 receives extrinsic information from D1, and the decoding iteration proceeds as D1 → D2 → D1 → D2 → …, with the previous decoder passing extrinsic information to the next one in each iteration except for the first. The idea behind extrinsic information is that D2 provides soft information to D1 for each u_k, using only information not available to D1 (i.e., E2 parity); D1 does likewise for D2.

We now consider computation of the probabilities γ_k(s', s) needed in (16). We first observe that γ_k(s', s) may be written as

γ_k(s', s) = P(s | s') p(y_k | s', s) = P(u_k) p(y_k | u_k),

where the event u_k corresponds to the event s' → s. Defining

L^e(u_k) ≜ log( P(u_k = +1) / P(u_k = −1) ),

observe that we may write

P(u_k) = [ exp(−L^e(u_k)/2) / (1 + exp(−L^e(u_k))) ] · exp[u_k L^e(u_k)/2]
       = A_k exp[u_k L^e(u_k)/2],   (17)

where the first equality follows since, with P+ ≜ P(u_k = +1) and P− ≜ P(u_k = −1), the bracketed factor equals √(P−/P+)/(1 + P−/P+) = √(P+ P−), so that the product equals P+ when u_k = +1 and P− when u_k = −1. Also, for the AWGN channel,

p(y_k | u_k) ∝ exp[ −(y_k^s − u_k)²/(2σ²) − (y_k^p − x_k^p)²/(2σ²) ]
            = exp[ −((y_k^s)² + u_k² + (y_k^p)² + (x_k^p)²)/(2σ²) ] · exp[ (u_k y_k^s + x_k^p y_k^p)/σ² ]
            = B_k exp[ (u_k y_k^s + x_k^p y_k^p)/σ² ],

so that

γ_k(s', s) ∝ A_k B_k exp[u_k L^e(u_k)/2] exp[ (u_k y_k^s + x_k^p y_k^p)/σ² ].   (18)

Now since γ_k(s', s) appears in the numerator (where u_k = +1) and denominator (where u_k = −1) of (16), the factor A_k B_k will cancel, as it is independent of u_k. Also, since we assume transmission of the symbols ±1 over the AWGN channel, we have σ² = N_0/(2E_c), so that L_c ≜ 2/σ² = 4E_c/N_0, where E_c = r E_b is the energy per channel bit. From (18), we then have

γ_k(s', s) ∼ exp[ u_k (L^e(u_k) + L_c y_k^s)/2 + L_c y_k^p x_k^p/2 ]
          ≜ exp[ u_k (L^e(u_k) + L_c y_k^s)/2 ] · γ^e_k(s', s),   (19)

where γ^e_k(s', s) ≜ exp[ L_c y_k^p x_k^p / 2 ]. Inserting (19) into (16) then gives

L(u_k) = log( Σ_{S+} α̃_{k−1}(s') · γ^e_k(s', s) · β̃_k(s) · C_k / Σ_{S−} α̃_{k−1}(s') · γ^e_k(s', s) · β̃_k(s) · C_k )
       = L_c y_k^s + L^e(u_k) + log( Σ_{S+} α̃_{k−1}(s') · γ^e_k(s', s) · β̃_k(s) / Σ_{S−} α̃_{k−1}(s') · γ^e_k(s', s) · β̃_k(s) ),   (20)

where C_k ≜ exp[ u_k (L^e(u_k) + L_c y_k^s)/2 ]. The second equality follows since C_k(u_k = +1) and C_k(u_k = −1) can be factored out of the summations in the numerator and denominator, respectively. The first term in (20) is sometimes called the channel value, the second term represents the a priori information about u_k, and the third term represents extrinsic information that can be passed on to a subsequent decoder. Thus, for example, on any given iteration, D1 computes

L^{(1)}(u_k) = L_c y_k^s + L^e_{21}(u_k) + L^e_{12}(u_k),

where L^e_{21}(u_k) is extrinsic information received from D2 and the last term is used as extrinsic information from D1 to D2.

C. Pseudo-Code for the Iterative Decoder

We do not give pseudo-code for the encoder here since this is much more straightforward. However, it must be emphasized that at least E1 must be terminated correctly; that is, the N-bit information word to be encoded must force E1 to the zero state by the N-th bit.

The pseudo-code given below for iterative decoding of a turbo code follows directly from the development above. Implicit is the fact that each decoder must have full knowledge of the trellis of the constituent encoders. For example, each decoder must have a table (array) containing the input bits and parity bits for all possible state transitions s' → s.
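The identity (17) is easy to sanity-check numerically. The snippet below (the helper name is ours) confirms that A_k exp[u_k L^e(u_k)/2] reproduces P+ and P− exactly when L^e(u_k) = log(P+/P−):

```python
from math import exp, log

def prior_from_llr(u: int, L: float) -> float:
    """Right-hand side of (17): A_k * exp(u * L / 2), with u in {+1, -1}
    and L the a priori log-likelihood ratio L^e(u_k)."""
    A = exp(-L / 2.0) / (1.0 + exp(-L))
    return A * exp(u * L / 2.0)

for p_plus in (0.1, 0.5, 0.9):
    L = log(p_plus / (1.0 - p_plus))      # L^e(u_k) = log(P+ / P-)
    assert abs(prior_from_llr(+1, L) - p_plus) < 1e-12
    assert abs(prior_from_llr(-1, L) - (1.0 - p_plus)) < 1e-12
```

This is the mechanism by which a single scalar LLR carries the a priori (extrinsic) probabilities between the two decoders.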
Also required are permutation and de-permutation functions (arrays), since D1 and D2 will be sharing reliability information about each u_k, but D2's information is permuted relative to D1. We denote these arrays by P[·] and Pinv[·], respectively. … need not be performed at the encoder.

===== Initialization =====
D1:
- α̃_0^{(1)}(s) = 1 for s = 0
              = 0 for s ≠ 0
- β̃_N^{(1)}(s) = 1 for s = 0
              = 0 for s ≠ 0
- L^e_{21}(u_k) = 0 for k = 1, 2, ..., N
D2:
- α̃_0^{(2)}(s) = 1 for s = 0
              = 0 for s ≠ 0
- β̃_N^{(2)}(s) = α̃_N^{(2)}(s) (set once α̃_N^{(2)}(s) has been computed in the first iteration)
- L^e_{12}(u_k) is to be determined from D1 after the first half-iteration
=======================

===== The n-th iteration =====
D1:
for k = 1 : N
- get y_k = (y_k^s, y_k^{1p}), where y_k^{1p} is a noisy version of the E1 parity
- compute γ_k(s', s) from (19) for all allowable state transitions s' → s (u_k in (19) is set to the value of the encoder input which caused the transition s' → s; L^e(u_k) is in this case L^e_{21}(u_{Pinv[k]}), the de-permuted extrinsic information from the previous D2 iteration)
- compute α̃_k^{(1)}(s) for all s using (14)
end
for k = N : −1 : 2
- compute β̃_{k−1}^{(1)}(s) for all s using (15)
end
for k = 1 : N
- compute
  L^e_{12}(u_k) = log( Σ_{S+} α̃_{k−1}^{(1)}(s') · γ^e_k(s', s) · β̃_k^{(1)}(s) / Σ_{S−} α̃_{k−1}^{(1)}(s') · γ^e_k(s', s) · β̃_k^{(1)}(s) )
end
D2:
for k = 1 : N
- compute γ_k(s', s) from (19) for all allowable state transitions s' → s (u_k in (19) is set to the value of the encoder input which caused the transition s' → s; L^e(u_k) is L^e_{12}(u_{P[k]}), the permuted extrinsic information from the previous D1 iteration; y_k^s is the permuted systematic value, y^s_{P[k]})
- compute α̃_k^{(2)}(s) for all s using (14)
end
for k = N : −1 : 2
- compute β̃_{k−1}^{(2)}(s) for all s using (15)
end
for k = 1 : N
- compute
  L^e_{21}(u_k) = log( Σ_{S+} α̃_{k−1}^{(2)}(s') · γ^e_k(s', s) · β̃_k^{(2)}(s) / Σ_{S−} α̃_{k−1}^{(2)}(s') · γ^e_k(s', s) · β̃_k^{(2)}(s) )
end
==========================

===== After the last iteration =====
for k = 1 : N
- compute
  L^{(1)}(u_k) = L_c y_k^s + L^e_{21}(u_{Pinv[k]}) + L^e_{12}(u_k)
- decide u_k = +1 if L^{(1)}(u_k) > 0; else decide u_k = −1
end
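To make the pseudo-code concrete, here is a minimal, self-contained Python sketch of a single constituent MAP (modified BCJR) pass for a toy two-state recursive encoder, an accumulator with parity transfer function 1/(1 + D); this trellis is our own illustrative choice, not the (31, 33) code of Example 2. It follows the same steps as the pseudo-code: branch metrics from (19) with zero a priori input, forward recursion (14), backward recursion (15) with the unterminated-trellis convention used above for D2, and LAPP ratios (16):

```python
from math import exp, log

STATES = (0, 1)

def encode(bits):
    """Toy two-state RSC (an accumulator): from state s, input bit u gives
    next state s ^ u, systematic symbol 2u - 1, parity symbol 2(s ^ u) - 1."""
    s, sys_out, par_out = 0, [], []
    for u in bits:
        s ^= u
        sys_out.append(2 * u - 1)
        par_out.append(2 * s - 1)
    return sys_out, par_out

def map_decode(ys, yp, Lc=2.0, Le=None):
    """One constituent MAP (modified BCJR) pass, following the pseudo-code:
    gammas from (19), forward (14), backward (15), LAPP ratios (16)."""
    N = len(ys)
    if Le is None:
        Le = [0.0] * N                    # zero a priori information
    def gamma(k, sp, s):
        u = 2 * (sp ^ s) - 1              # input symbol driving sp -> s
        xp = 2 * s - 1                    # parity symbol (new accumulator state)
        return exp(0.5 * u * (Le[k] + Lc * ys[k]) + 0.5 * Lc * yp[k] * xp)
    # forward recursion (14)
    alphas = [{0: 1.0, 1: 0.0}]
    for k in range(N):
        a = alphas[-1]
        un = {s: sum(a[sp] * gamma(k, sp, s) for sp in STATES) for s in STATES}
        z = sum(un.values())
        alphas.append({s: un[s] / z for s in STATES})
    # backward recursion (15); trellis left unterminated, so beta_N = alpha_N
    betas = [None] * (N + 1)
    betas[N] = dict(alphas[N])
    for k in range(N - 1, -1, -1):
        a, bn = alphas[k], betas[k + 1]
        z = sum(a[sp] * gamma(k, sp, s) for sp in STATES for s in STATES)
        betas[k] = {sp: sum(bn[s] * gamma(k, sp, s) for s in STATES) / z
                    for sp in STATES}
    # LAPP ratio (16): numerator over S+ (u = +1), denominator over S- (u = -1)
    L = []
    for k in range(N):
        num = sum(alphas[k][sp] * gamma(k, sp, s) * betas[k + 1][s]
                  for sp in STATES for s in STATES if sp ^ s)
        den = sum(alphas[k][sp] * gamma(k, sp, s) * betas[k + 1][s]
                  for sp in STATES for s in STATES if not sp ^ s)
        L.append(log(num / den))
    return L

bits = [1, 0, 1, 1, 0, 0, 1, 0]
L = map_decode(*encode(bits))             # noiseless channel: y = x
assert [1 if l > 0 else 0 for l in L] == bits
```

A full turbo decoder would instantiate two such decoders for the two constituent encoders and iterate, feeding each decoder's extrinsic term (the third term of (20)) to the other through the P[·] and Pinv[·] arrays, exactly as in the pseudo-code above.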
The author would like to thank Omer Acikel of New Mexico State University for help with Example 2, and Esko …

References

… "Channel coding with multilevel/phase signals," IEEE Trans. Inf. Theory, pp. 55-67, Jan. 1982.
… Mar. 1996.
… May 1996.
… 1995.
[Fig. 3: simulated bit error rate P_b versus E_b/N_0 (0.5 to 3 dB) after 15 iterations, together with the curves "bnd for intlvr1" and "bnd for intlvr2"; the simulated performance approaches that predicted by (4).]
Fig. 1. Diagram of a standard turbo encoder with two identical recursive systematic encoders (RSC's). (The diagram shows the data u entering RSC 1, specified by g1(D) and g2(D), directly and RSC 2 through the N-bit interleaver, with the parity outputs x^{1p}, x^{2p} feeding the puncturing mechanism.)

Fig. 4. Diagram of iterative (turbo) decoder which uses two MAP decoders operating cooperatively. L^e_{12} is "soft" or extrinsic information from D1 to D2, and L^e_{21} is extrinsic information from D2 to D1. (The diagram shows MAP Decoder 1 and MAP Decoder 2 connected through N-bit interleavers and a de-interleaver, with inputs y^s, y^{1p}, y^{2p}.)