A Turbo Codes Tutorial
William E. Ryan
Abstract. We give a tutorial exposition of turbo codes, a simple derivation for the performance of turbo codes, and a straightforward presentation of the iterative decoding algorithm. The derivations of both the performance estimate and the modified BCJR decoding algorithm are novel. The treatment is intended to be a launching point for further study in the field and, significantly, to provide sufficient information for the design of computer simulations.

… description of the algorithm here. This paper borrows from some of the most prominent publications in the field [4]-[9], sometimes adding details that were omitted in those works. However, the general presentation and some of the derivations are novel. Our goal is a self-contained, simple introduction to turbo codes for those already knowledgeable in the fields of algebraic and trellis codes.

The paper is organized as follows. In the next section we present the structure of the encoder, which leads to an estimate of its performance. The subsequent section then describes the iterative algorithm used to decode these codes. The treatment in each of these sections is meant to be sufficiently detailed so that one may with reasonable ease design a computer simulation of the encoder and decoder.

II. The Encoder and Its Performance

Fig. 1 depicts a standard turbo encoder. As seen in the figure, a turbo encoder consists of two binary rate 1/2 convolutional encoders separated by an N-bit interleaver or permuter, together with an optional puncturing mechanism. Clearly, without the puncturer, the encoder is rate 1/3, mapping N data bits to 3N code bits. We observe …

Observe that the code sequence corresponding to the encoder input u(D) for the former (non-recursive) code is u(D)G_NR(D) = [u(D)g1(D)  u(D)g2(D)], and that the identical code sequence is produced in the recursive code by the sequence u'(D) = u(D)g1(D), since in this case the code sequence is u(D)g1(D)G_R(D) = u(D)G_NR(D). Here, we loosely call the pair of polynomials u(D)G_NR(D) a code sequence, although the actual code sequence is derived from this polynomial pair in the usual way.

Observe that, for the recursive encoder, the code sequence will be of finite weight if and only if the input sequence is divisible by g1(D). We have the following immediate corollaries of this fact, which we shall use later.
Corollary 1. A weight-one input will produce an infinite weight output (for such an input is never divisible by a polynomial g1(D)).

Corollary 2. For any non-trivial g1(D), there exists a family of weight-two inputs of the form D^j (1 + D^q), j ≥ 0, which produce finite weight outputs, i.e., which are divisible by g1(D). When g1(D) is a primitive polynomial of degree m, then q = 2^m − 1; more generally, q is the length of the pseudorandom sequence generated by g1(D).

In the context of the code's trellis, Corollary 1 says that a weight-one input will create a path that diverges from the all-zeros path, but never remerges. Corollary 2 says that there will always exist a trellis path that diverges and remerges later which corresponds to a weight-two data sequence.

Example 1. Consider the code with generator matrix

G(D) = [ 1   g2(D)/g1(D) ] …

The weight-two input u(D) = 1 + D^15 produces the finite-length code sequence (1 + D^15, 1 + D + D^2 + D^3 + D^5 + D^7 + D^8 + D^11 + …). In addition to elaborating on Corollary 2, this example serves to demonstrate … specifying such encoders.
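Corollaries 1 and 2, and the divisibility behind Example 1, are easy to check numerically. Below is a minimal sketch in Python (the helper is our own; a polynomial a(D) is stored as an integer whose bit i is the coefficient of D^i), using the primitive feedback polynomial g1(D) = 1 + D + D^4 that appears later in the paper, for which m = 4 and q = 2^4 − 1 = 15:

```python
def gf2_mod(a: int, b: int) -> int:
    """Remainder of a(D) mod b(D) over GF(2); bit i holds the D^i coefficient."""
    db = b.bit_length() - 1                   # degree of b(D)
    while a and a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)   # cancel the leading term of a(D)
    return a

g1 = 0b10011                                  # g1(D) = 1 + D + D^4, primitive

# Corollary 1: weight-one inputs D^j are never divisible by g1(D).
assert all(gf2_mod(1 << j, g1) != 0 for j in range(64))

# Corollary 2: every member of the family D^j (1 + D^15) is divisible by g1(D).
assert all(gf2_mod((1 | 1 << 15) << j, g1) == 0 for j in range(20))

# A weight-two input whose exponent gap is not a multiple of 15 is not divisible.
assert gf2_mod(1 | 1 << 5, g1) != 0
```

In trellis terms, the second assertion exhibits the diverge-and-remerge paths of Corollary 2, while the first confirms that a lone 1 opens a path that never remerges.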
Also important is that N be selected quite large, and we shall assume N ≥ 1000 hereafter. The importance of these … as well as any other, provided N is large.

C. The Puncturer

While for deep space applications low-rate codes are appropriate, in other situations, such as satellite communications, a rate of 1/2 or higher is preferred. The role of the turbo code puncturer is identical to that of its convolutional code counterpart: … all even parity bits from the top encoder and all odd parity bits from the bottom one.

… A maximum-likelihood (ML) sequence decoder would be far too complex for a turbo code due to the presence of the permuter. However, the suboptimum iterative decoding algorithm to be described there offers near-ML performance. Hence, we shall now estimate the performance of an ML decoder (analysis of the iterative decoder is much more difficult).

Armed with the above descriptions of the components of the turbo encoder of Fig. 1, it is easy to conclude that it is linear since its components are linear. The constituent codes are certainly linear, and the permuter is linear since it may be modeled by a permutation matrix. Further, the puncturer does not affect linearity since all codewords … Now consider the all-zeros codeword (the 0 codeword) and the k-th codeword, for some k ∈ {1, 2, ..., 2^N − 1}. The bit error rate for this two-codeword situation would then be … From a union bound argument, we may write

P_b ≤ Σ_{k=1}^{2^N − 1} P_b(k|0) = Σ_{k=1}^{2^N − 1} (w_k/N) Q( √(2 r d_k E_b/N_0) ).   (1)
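For simulation work, the Q-function appearing in (1) is conveniently computed from the complementary error function via Q(x) = erfc(x/√2)/2. A small Python sketch of one term of the union bound follows; the parameter values in the example call are illustrative only, not taken from the paper:

```python
from math import erfc, sqrt

def Q(x: float) -> float:
    """Gaussian tail probability, Q(x) = erfc(x / sqrt(2)) / 2."""
    return 0.5 * erfc(x / sqrt(2.0))

def pairwise_term(w_k: int, d_k: int, N: int, r: float, EbN0: float) -> float:
    """One term (w_k / N) Q(sqrt(2 r d_k Eb/N0)) of the union bound (1)."""
    return (w_k / N) * Q(sqrt(2.0 * r * d_k * EbN0))

# Illustrative values: N = 1000, r = 1/2, a codeword of weight d_k = 12
# produced by a data word of weight w_k = 2, at Eb/N0 = 2 (about 3 dB).
term = pairwise_term(2, 12, 1000, 0.5, 2.0)
assert 0.0 < term < 2 / 1000    # each term is bounded by w_k/N since Q <= 1/2
```

Summing such terms over a code's low-weight codewords gives the truncated bound developed next.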
… sum is over the different weight-w inputs, and d_{wv} is the weight of the v-th codeword produced by a weight-w input.

Consider now the first few terms in the outer summation of (1).

w = 1: From Corollary 1 and associated discussion above, weight-one inputs will produce only large weight codewords at both constituent encoder outputs since the … codeword weight, so that the w = 1 terms in (1) will be negligible.

w = 2: Of the weight-two encoder inputs, only a fraction will be divisible by g1(D) (i.e., yield remergent paths) and, of these, only certain ones will yield the smallest weight, d^{CC}_{2,min}, at a constituent encoder output (here, CC denotes "constituent code"). … Given a weight-d^{CC}_{2,min} codeword at the first encoder's output, it is unlikely that the permuted input, u'(D), seen by the second encoder will also correspond to a weight-d^{CC}_{2,min} codeword (much less, be divisible by g1(D)). We can be sure, however, that there will be some minimum-weight turbo codeword produced by a w = 2 input, and that this minimum weight can be bounded as

d^{TC}_{2,min} ≥ 2 d^{CC}_{2,min} − 2.

Denoting the number of weight-two inputs which produce weight-d^{TC}_{2,min} turbo codewords by n_2, for w = 2 the inner sum in (1) may be approximated as

Σ_v (2/N) Q( √(2 r d_{2v} E_b/N_0) ) ≃ (2 n_2/N) Q( √(2 r d^{TC}_{2,min} E_b/N_0) ).   (2)

Similarly, for w = 3, the inner sum in (1) may be approximated as

Σ_v (3/N) Q( √(2 r d_{3v} E_b/N_0) ) ≃ (3 n_3/N) Q( √(2 r d^{TC}_{3,min} E_b/N_0) ),   (3)

where n_3 and d^{TC}_{3,min} are obviously defined. While n_3 is clearly dependent on the interleaver, we can make some comments on its size relative to n_2 for a "randomly generated" interleaver. Although there are (N − 2)/3 times as many w = 3 terms in the inner summation of (1) as there are w = 2 terms, we can expect the number of weight-three terms divisible by g1(D) to be of the order of the number of weight-two terms divisible by g1(D). Thus, most of the N-choose-3 terms in (1) can be removed from consideration for this reason. Moreover, given a weight-three encoder input divisible by g1(D) (for example), it becomes very unlikely that the permuted input u'(D) seen by the second encoder will also be divisible by g1(D). For example, suppose u(D) = g1(D) = 1 + D + D^4. Then the permuter output will be a multiple of g1(D) if the three input 1's become the j-th, (j+1)-th, and (j+4)-th bits out of the permuter, for some j.¹ If we imagine that the permuter acts in a purely random fashion, so that the probability that one of the 1's lands in a given position is 1/N, a given permuter output pattern occurs with probability 2!/N². Thus, we would expect the number of weight-three inputs, n_3, resulting in remergent paths in both encoders to be much less than n_2,²

n_3 << n_2,

so that the w = 3 contribution … is negligible relative to that for w = 2.

w ≥ 4: Again we can approximate the inner sum in (1) for w = 4 in the same manner as in (2) and (3). Still we would like to make some comments on its size for the "random" interleaver. A weight-four input might appear to the first encoder as a weight-three input concatenated some time later with a weight-one input, leading to a non-remergent path in the trellis and, hence, a negligible term in the inner sum in (1). It might also appear as a concatenation of two weight-two inputs, … divisible by g1(D) at the second encoder. Thus, we may expect³ n_4 << n_2, so that the w ≥ 4 terms are negligible in (4).

To summarize, the bound in (1) can be approximated as

P_b ≃ Σ_w (w n_w/N) Q( √(2 r d^{TC}_{w,min} E_b/N_0) ),   (4)

where n_w and d^{TC}_{w,min} are functions of the particular interleaver employed. From our discussion above, we might expect that the w = 2 term dominates for a randomly generated interleaver, although it is easy to find interleavers with n_2 = 0, as seen in the example to follow.

¹ This is not the only weight-three pattern divisible by g1(D); g1²(D) = 1 + D² + D^8 is another one, but this too has probability 3!/N³ of occurring.

² Because our argument assumes a "purely random" permuter, the inequality n_3 << n_2 has to be interpreted probabilistically. Thus, it is more accurate to write E{n_3} << E{n_2}, where the expectation is over all interleavers. Alternatively, for the average interleaver, we would expect n_3 << n_2; thus if n_2 = 5, say, we would expect n_3 = 0.

³ The value of 1/N³ derives from the fact that ideally a particular divisible output pattern occurs with probability 4!/N⁴, but there will be approximately N shifted versions of that pattern, each divisible by g1(D).
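Once an interleaver's low-weight spectrum {(n_w, d^{TC}_{w,min})} is known, the estimate (4) is a one-line computation. A sketch in Python, using for illustration the spectra quoted for the two interleavers of Example 2 below (n_3 = 1, d^{TC}_{3,min} = 9 for the random interleaver; n_2 = 1, d^{TC}_{2,min} = 12 and n_3 = 4, d^{TC}_{3,min} = 15 for the hand-modified one):

```python
from math import erfc, sqrt

def Q(x: float) -> float:
    """Gaussian tail probability, Q(x) = erfc(x / sqrt(2)) / 2."""
    return 0.5 * erfc(x / sqrt(2.0))

def pb_estimate(spectrum, N, r, EbN0_dB):
    """Approximation (4): Pb ~ sum_w (w n_w / N) Q(sqrt(2 r d_w,min Eb/N0)).
    spectrum maps each dominant input weight w to the pair (n_w, d_w_min)."""
    EbN0 = 10.0 ** (EbN0_dB / 10.0)
    return sum(w * n / N * Q(sqrt(2.0 * r * d * EbN0))
               for w, (n, d) in spectrum.items())

N, r = 1000, 0.5
intlvr1 = {3: (1, 9)}                  # random interleaver of Example 2
intlvr2 = {2: (1, 12), 3: (4, 15)}    # hand-modified interleaver of Example 2

# The modified interleaver's larger minimum distances lower the estimate.
assert pb_estimate(intlvr2, N, r, 3.0) < pb_estimate(intlvr1, N, r, 3.0)
```

Note the explicit 1/N factor in each term: doubling the interleaver length halves the estimate, which is the interleaver gain effect noted in the text.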
In any case, we observe that P_b decreases with N, so that the error rate can be reduced simply by increasing the interleaver length. This effect is called interleaver gain (or permuter gain) and demonstrates the necessity of large permuters. Finally, we note that recursive encoders are crucial elements of a turbo code since, for non-recursive encoders, division by g1(D) (non-remergent trellis paths) would not be an issue and (4) would not hold (although (1) still would).

Example 2. We consider the performance of a rate 1/2, (31, 33) turbo code for two different interleavers of size N = 1000. We start first with an interleaver that was randomly generated. We found for this particular interleaver n_2 = 0 and n_3 = 1, with d^{TC}_{3,min} = 9, so that the w = 3 term dominates in (4). Interestingly, the interleaver input corresponding to this dominant error event was D^{168}(1 + D^5 + D^{10}), which produces the interleaver output D^{88}(1 + D^{15} + D^{848}), where of course both polynomials are divisible by g1(D) = 1 + D + D^4. Figure 3 compares the estimate (4) with simulated performance for several iterations of the iterative decoding algorithm detailed in the next section; the estimate is seen to be close to the simulated values. The interleaver was then modified by hand to improve the weight spectrum of the code. It was a simple matter to attain n_2 = 1 with d^{TC}_{2,min} = 12 and n_3 = 4 with d^{TC}_{3,min} = 15 for this second interleaver, so that the w = 2 term now dominates in (4). The simulated and estimated performance curves for this second interleaver …

III. The Decoder

Consider first an ML decoder for a rate 1/2 convolutional code (recursive or not), and assume a data word of length N. The ML decoder compares all 2^N code sequences to the noisy received sequence, choosing in favor of the codeword with the best correlation metric. Clearly, the complexity of such an algorithm is exorbitant. Fortunately, as we know, such a brute-force approach is simplified greatly by Viterbi's algorithm, which permits a systematic elimination of candidate code sequences (in the first step, 2^{N−1} are eliminated, then another 2^{N−2}, and so on). Unfortunately, we have no such luck with turbo codes, for the presence of the permuter immensely complicates the structure of a turbo code trellis, making these codes look more like block codes.

Just prior to the discovery of turbo codes, there was much interest in suboptimal decoding schemes involving multiple (usually two) decoders operating cooperatively and iteratively. Most of the focus was on a type of Viterbi decoder which provides soft-output (or reliability) information to a companion soft-output Viterbi decoder for use in a subsequent decoding [10]. Also receiving some attention was the symbol-by-symbol maximum a posteriori (MAP) algorithm of Bahl, et al. [11], published over 20 years ago. It was this latter algorithm, often called the BCJR algorithm, that Berrou, et al. [1], utilized in the iterative decoding of turbo codes. We will discuss in this section the BCJR algorithm employed by each constituent decoder, but we refer the reader to [11] for the derivations of some of the results.

We first discuss a modified version of the BCJR algorithm for performing symbol-by-symbol MAP decoding. We then show how this algorithm is incorporated into an iterative decoder employing two BCJR-MAP decoders. We shall require the following definitions:

- E1 is a notation for encoder 1
- D1 is a notation for decoder 1
- D2 is a notation for decoder 2
- m is the constituent encoder memory
- S is the set of all 2^m constituent encoder states
- y_a^b = (y_a, y_{a+1}, ..., y_b)
- y = y_1^N = (y_1, y_2, ..., y_N) is the noisy received codeword

In the symbol-by-symbol MAP decoder, the decoder decides u_k = +1 if P(u_k = +1 | y) > P(u_k = −1 | y), and it decides u_k = −1 otherwise. More succinctly, the decision û_k is given by

û_k = sign[L(u_k)],

where L(u_k) is the log a posteriori probability (LAPP) ratio defined as

L(u_k) ≜ log( P(u_k = +1 | y) / P(u_k = −1 | y) ).

Incorporating the code's trellis, this may be written as

L(u_k) = log( Σ_{S+} p(s_{k−1} = s', s_k = s, y)/p(y) / Σ_{S−} p(s_{k−1} = s', s_k = s, y)/p(y) ),   (5)
where S+ is the set of ordered pairs (s', s) corresponding to all state transitions (s_{k−1} = s') → (s_k = s) caused by data input u_k = +1, and S− is similarly defined for u_k = −1.

Observe we may cancel p(y) in (5), which means we require only an algorithm for computing p(s', s, y) = p(s_{k−1} = s', s_k = s, y). The BCJR algorithm [11] for doing this is

p(s', s, y) = α_{k−1}(s') · γ_k(s', s) · β_k(s),   (6)

where α_k(s) ≜ p(s_k = s, y_1^k) is computed recursively as

α_k(s) = Σ_{s'∈S} α_{k−1}(s') γ_k(s', s)   (7)

with initial conditions

α_0(0) = 1 and α_0(s ≠ 0) = 0.   (8)

The probabilities γ_k(s', s) are defined as

γ_k(s', s) ≜ p(s_k = s, y_k | s_{k−1} = s')   (9)

and will be discussed further below. The probabilities β_k(s) ≜ p(y_{k+1}^N | s_k = s) in (6) are computed in a "backward" recursion as

β_{k−1}(s') = Σ_{s∈S} β_k(s) γ_k(s', s)   (10)

with boundary conditions

β_N(0) = 1 and β_N(s ≠ 0) = 0.   (11)

(The encoder is expected to end in state 0 after N input bits, implying that the last m input bits, called termination bits, are so selected.)

Unfortunately, cancelling the divisor p(y) in (5) leads to a numerically unstable algorithm. We can include division by p(y) in the BCJR algorithm by defining the modified probabilities

α̃_k(s) = α_k(s)/p(y_1^k)

and

β̃_k(s) = β_k(s)/p(y_{k+1}^N | y_1^k),

so that

p(s', s | y) p(y_k | y_1^{k−1}) = α̃_{k−1}(s') · γ_k(s', s) · β̃_k(s).   (12)

Note since p(y_1^k) = Σ_{s∈S} α_k(s), the values α̃_k(s) may be computed from {α_k(s) : s ∈ S} via

α̃_k(s) = α_k(s) / Σ_{s∈S} α_k(s).   (13)

Given the definition of α̃_k(s), we can use (7) in (13) to obtain a recursion involving only the α̃_k(s):

α̃_k(s) = Σ_{s'} α_{k−1}(s') γ_k(s', s) / Σ_s Σ_{s'} α_{k−1}(s') γ_k(s', s)
        = Σ_{s'} α̃_{k−1}(s') γ_k(s', s) / Σ_s Σ_{s'} α̃_{k−1}(s') γ_k(s', s),   (14)

where the second equality follows by dividing the numerator and the denominator by p(y_1^{k−1}). Similarly, writing

p(y_k^N | y_1^{k−1}) = p(y_1^N)/p(y_1^{k−1}) = p(y_k | y_1^{k−1}) · p(y_{k+1}^N | y_1^k)

and noting from (7) and the definition of α̃_k(s) that

p(y_k | y_1^{k−1}) = Σ_s Σ_{s'} α̃_{k−1}(s') γ_k(s', s),

so that dividing (10) by this equation yields

β̃_{k−1}(s') = Σ_s β̃_k(s) γ_k(s', s) / Σ_s Σ_{s'} α̃_{k−1}(s') γ_k(s', s).   (15)

In summary, the modified BCJR-MAP algorithm involves computing the LAPP ratio L(u_k) by combining (5) and (12) to obtain

L(u_k) = log( Σ_{S+} α̃_{k−1}(s') · γ_k(s', s) · β̃_k(s) / Σ_{S−} α̃_{k−1}(s') · γ_k(s', s) · β̃_k(s) ),   (16)

where the α̃'s and β̃'s are computed recursively via (14) and (15), respectively. Clearly the α̃_k(s) and β̃_k(s) share the same boundary conditions as their counterparts, as given in (8) and (11). Computation of the probabilities γ_k(s', s) will be discussed shortly. On the topic of stability, we should point out also that the algorithm given here works in software, but a hardware implementation would require the log-domain version of the algorithm [12], [14]. In fact, most software implementations … decoding algorithms [1], [13], [4].

B. Iterative MAP Decoding

From Bayes' rule, the LAPP ratio for an arbitrary MAP decoder can be written as

L(u_k) = log( P(y | u_k = +1) / P(y | u_k = −1) ) + log( P(u_k = +1) / P(u_k = −1) ).
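The normalized recursions (14) and (15) are simple to implement. The Python sketch below runs them on a toy two-state trellis of our own choosing, in which every transition s' → s is allowed and arbitrary positive numbers stand in for the branch metrics γ_k(s', s); it demonstrates only the normalization mechanics, namely that each α̃_k is a probability distribution over states and that both recursions share the per-step normalizer Σ_s Σ_{s'} α̃_{k−1}(s') γ_k(s', s):

```python
import random

STATES = (0, 1)   # toy two-state trellis; every ordered pair (s', s) allowed

def forward(gammas):
    """Normalized forward recursion (14). gammas[k][(sp, s)] plays the role
    of gamma_k(s', s); returns [alpha_tilde_0, ..., alpha_tilde_K]."""
    a = {0: 1.0, 1: 0.0}                  # boundary condition (8)
    out = [a]
    for g in gammas:
        unnorm = {s: sum(a[sp] * g[(sp, s)] for sp in STATES) for s in STATES}
        z = sum(unnorm.values())          # plays the role of p(y_k | y_1^{k-1})
        a = {s: unnorm[s] / z for s in STATES}
        out.append(a)
    return out

def backward(gammas, alphas):
    """Normalized backward recursion (15), reusing the forward normalizer
    sum_s sum_s' alpha_tilde_{k-1}(s') gamma_k(s', s)."""
    K = len(gammas)
    betas = [None] * (K + 1)
    betas[K] = {0: 1.0, 1: 0.0}           # boundary condition (11)
    for k in range(K - 1, -1, -1):
        g, a, bn = gammas[k], alphas[k], betas[k + 1]
        z = sum(a[sp] * g[(sp, s)] for sp in STATES for s in STATES)
        betas[k] = {sp: sum(bn[s] * g[(sp, s)] for s in STATES) / z
                    for sp in STATES}
    return betas

random.seed(0)
K = 8
gammas = [{(sp, s): random.uniform(0.1, 1.0) for sp in STATES for s in STATES}
          for _ in range(K)]
alphas = forward(gammas)
betas = backward(gammas, alphas)
assert all(abs(sum(a.values()) - 1.0) < 1e-9 for a in alphas)
```

In a real decoder the γ's come from the received symbols and the a priori information, as derived next; the recursions themselves are unchanged.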
… information for each u_k from D2, which serves as a priori information. Similarly, D2 receives extrinsic information from D1, and the decoding iteration proceeds as D1 → D2 → D1 → D2 → …, with the previous decoder passing extrinsic information to the next one in each iteration except for the first. The idea behind extrinsic information is that D2 provides soft information to D1 for each u_k, using only information not available to D1 (i.e., E2 parity); D1 does likewise for D2.

We now consider computation of the probabilities γ_k(s', s) needed in (16). We first observe that γ_k(s', s) may be written as

γ_k(s', s) = P(s | s') p(y_k | s', s) = P(u_k) p(y_k | u_k),

where the event u_k corresponds to the event s' → s. Defining

L^e(u_k) ≜ log( P(u_k = +1) / P(u_k = −1) ),

observe that we may write

P(u_k) = [ exp(−L^e(u_k)/2) / (1 + exp(−L^e(u_k))) ] · exp[u_k L^e(u_k)/2]
       = A_k exp[u_k L^e(u_k)/2],   (17)

where the first equality follows since, with P+ ≜ P(u_k = +1) and P− ≜ P(u_k = −1), the bracketed factor equals √(P−/P+)/(1 + P−/P+) = √(P+ P−), so that the product equals P+ when u_k = +1 and P− when u_k = −1. Also, for the AWGN channel,

p(y_k | u_k) ∝ exp[ −(y_k^s − u_k)²/(2σ²) − (y_k^p − x_k^p)²/(2σ²) ]
            = exp[ −((y_k^s)² + u_k² + (y_k^p)² + (x_k^p)²)/(2σ²) ] · exp[ (u_k y_k^s + x_k^p y_k^p)/σ² ]
            = B_k exp[ (u_k y_k^s + x_k^p y_k^p)/σ² ],

so that

γ_k(s', s) ∝ A_k B_k exp[u_k L^e(u_k)/2] exp[ (u_k y_k^s + x_k^p y_k^p)/σ² ].   (18)

Now since γ_k(s', s) appears in the numerator (where u_k = +1) and denominator (where u_k = −1) of (16), the factor A_k B_k will cancel, as it is independent of u_k. Also, since we assume transmission of the symbols ±1 over the AWGN channel, we have σ² = N_0/(2E_c), so that L_c ≜ 2/σ² = 4E_c/N_0, where E_c = r E_b is the energy per channel bit. From (18), we then have

γ_k(s', s) ∼ exp[ u_k (L^e(u_k) + L_c y_k^s)/2 + L_c y_k^p x_k^p/2 ]
          ≜ exp[ u_k (L^e(u_k) + L_c y_k^s)/2 ] · γ^e_k(s', s),   (19)

where γ^e_k(s', s) ≜ exp[ L_c y_k^p x_k^p / 2 ]. Inserting (19) into (16) then gives

L(u_k) = log( Σ_{S+} α̃_{k−1}(s') · γ^e_k(s', s) · β̃_k(s) · C_k / Σ_{S−} α̃_{k−1}(s') · γ^e_k(s', s) · β̃_k(s) · C_k )
       = L_c y_k^s + L^e(u_k) + log( Σ_{S+} α̃_{k−1}(s') · γ^e_k(s', s) · β̃_k(s) / Σ_{S−} α̃_{k−1}(s') · γ^e_k(s', s) · β̃_k(s) ),   (20)

where C_k ≜ exp[ u_k (L^e(u_k) + L_c y_k^s)/2 ]. The second equality follows since C_k(u_k = +1) and C_k(u_k = −1) can be factored out of the summations in the numerator and denominator, respectively. The first term in (20) is sometimes called the channel value, the second term represents the a priori information about u_k, and the third term represents extrinsic information that can be passed on to a subsequent decoder. Thus, for example, on any given iteration, D1 computes

L^{(1)}(u_k) = L_c y_k^s + L^e_{21}(u_k) + L^e_{12}(u_k),

where L^e_{21}(u_k) is extrinsic information received from D2 and the last term is used as extrinsic information from D1 to D2.

C. Pseudo-Code for the Iterative Decoder

We do not give pseudo-code for the encoder here since this is much more straightforward. However, it must be emphasized that at least E1 must be terminated correctly; that is, the N-bit information word to be encoded must force E1 to the zero state by the N-th bit.

The pseudo-code given below for iterative decoding of a turbo code follows directly from the development above. Implicit is the fact that each decoder must have full knowledge of the trellis of the constituent encoders. For example, each decoder must have a table (array) containing the input bits and parity bits for all possible state transitions s' → s.
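The identity (17) is easy to sanity-check numerically. The snippet below (the helper name is ours) confirms that A_k exp[u_k L^e(u_k)/2] reproduces P+ and P− exactly when L^e(u_k) = log(P+/P−):

```python
from math import exp, log

def prior_from_llr(u: int, L: float) -> float:
    """Right-hand side of (17): A_k * exp(u * L / 2), with u in {+1, -1}
    and L the a priori log-likelihood ratio L^e(u_k)."""
    A = exp(-L / 2.0) / (1.0 + exp(-L))
    return A * exp(u * L / 2.0)

for p_plus in (0.1, 0.5, 0.9):
    L = log(p_plus / (1.0 - p_plus))      # L^e(u_k) = log(P+ / P-)
    assert abs(prior_from_llr(+1, L) - p_plus) < 1e-12
    assert abs(prior_from_llr(-1, L) - (1.0 - p_plus)) < 1e-12
```

This is the mechanism by which a single scalar LLR carries the a priori (extrinsic) probabilities between the two decoders.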
Also required are permutation and de-permutation functions (arrays), since D1 and D2 will be sharing reliability information about each u_k, but D2's information is permuted relative to D1. We denote these arrays by P[·] and Pinv[·], respectively. … need not be performed at the encoder.

===== Initialization =====
D1:
- α̃_0^{(1)}(s) = 1 for s = 0
              = 0 for s ≠ 0
- β̃_N^{(1)}(s) = 1 for s = 0
              = 0 for s ≠ 0
- L^e_{21}(u_k) = 0 for k = 1, 2, ..., N
D2:
- α̃_0^{(2)}(s) = 1 for s = 0
              = 0 for s ≠ 0
- β̃_N^{(2)}(s) = α̃_N^{(2)}(s) (set once α̃_N^{(2)}(s) has been computed in the first iteration)
- L^e_{12}(u_k) is to be determined from D1 after the first half-iteration
=======================

===== The n-th iteration =====
D1:
for k = 1 : N
- get y_k = (y_k^s, y_k^{1p}), where y_k^{1p} is a noisy version of the E1 parity
- compute γ_k(s', s) from (19) for all allowable state transitions s' → s (u_k in (19) is set to the value of the encoder input which caused the transition s' → s; L^e(u_k) is in this case L^e_{21}(u_{Pinv[k]}), the de-permuted extrinsic information from the previous D2 iteration)
- compute α̃_k^{(1)}(s) for all s using (14)
end
for k = N : −1 : 2
- compute β̃_{k−1}^{(1)}(s) for all s using (15)
end
for k = 1 : N
- compute
  L^e_{12}(u_k) = log( Σ_{S+} α̃_{k−1}^{(1)}(s') · γ^e_k(s', s) · β̃_k^{(1)}(s) / Σ_{S−} α̃_{k−1}^{(1)}(s') · γ^e_k(s', s) · β̃_k^{(1)}(s) )
end
D2:
for k = 1 : N
- compute γ_k(s', s) from (19) for all allowable state transitions s' → s (u_k in (19) is set to the value of the encoder input which caused the transition s' → s; L^e(u_k) is L^e_{12}(u_{P[k]}), the permuted extrinsic information from the previous D1 iteration; y_k^s is the permuted systematic value, y^s_{P[k]})
- compute α̃_k^{(2)}(s) for all s using (14)
end
for k = N : −1 : 2
- compute β̃_{k−1}^{(2)}(s) for all s using (15)
end
for k = 1 : N
- compute
  L^e_{21}(u_k) = log( Σ_{S+} α̃_{k−1}^{(2)}(s') · γ^e_k(s', s) · β̃_k^{(2)}(s) / Σ_{S−} α̃_{k−1}^{(2)}(s') · γ^e_k(s', s) · β̃_k^{(2)}(s) )
end
==========================

===== After the last iteration =====
for k = 1 : N
- compute
  L^{(1)}(u_k) = L_c y_k^s + L^e_{21}(u_{Pinv[k]}) + L^e_{12}(u_k)
- decide u_k = +1 if L^{(1)}(u_k) > 0; else decide u_k = −1
end
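To make the pseudo-code concrete, here is a minimal, self-contained Python sketch of a single constituent MAP (modified BCJR) pass for a toy two-state recursive encoder, an accumulator with parity transfer function 1/(1 + D); this trellis is our own illustrative choice, not the (31, 33) code of Example 2. It follows the same steps as the pseudo-code: branch metrics from (19) with zero a priori input, forward recursion (14), backward recursion (15) with the unterminated-trellis convention used above for D2, and LAPP ratios (16):

```python
from math import exp, log

STATES = (0, 1)

def encode(bits):
    """Toy two-state RSC (an accumulator): from state s, input bit u gives
    next state s ^ u, systematic symbol 2u - 1, parity symbol 2(s ^ u) - 1."""
    s, sys_out, par_out = 0, [], []
    for u in bits:
        s ^= u
        sys_out.append(2 * u - 1)
        par_out.append(2 * s - 1)
    return sys_out, par_out

def map_decode(ys, yp, Lc=2.0, Le=None):
    """One constituent MAP (modified BCJR) pass, following the pseudo-code:
    gammas from (19), forward (14), backward (15), LAPP ratios (16)."""
    N = len(ys)
    if Le is None:
        Le = [0.0] * N                    # zero a priori information
    def gamma(k, sp, s):
        u = 2 * (sp ^ s) - 1              # input symbol driving sp -> s
        xp = 2 * s - 1                    # parity symbol (new accumulator state)
        return exp(0.5 * u * (Le[k] + Lc * ys[k]) + 0.5 * Lc * yp[k] * xp)
    # forward recursion (14)
    alphas = [{0: 1.0, 1: 0.0}]
    for k in range(N):
        a = alphas[-1]
        un = {s: sum(a[sp] * gamma(k, sp, s) for sp in STATES) for s in STATES}
        z = sum(un.values())
        alphas.append({s: un[s] / z for s in STATES})
    # backward recursion (15); trellis left unterminated, so beta_N = alpha_N
    betas = [None] * (N + 1)
    betas[N] = dict(alphas[N])
    for k in range(N - 1, -1, -1):
        a, bn = alphas[k], betas[k + 1]
        z = sum(a[sp] * gamma(k, sp, s) for sp in STATES for s in STATES)
        betas[k] = {sp: sum(bn[s] * gamma(k, sp, s) for s in STATES) / z
                    for sp in STATES}
    # LAPP ratio (16): numerator over S+ (u = +1), denominator over S- (u = -1)
    L = []
    for k in range(N):
        num = sum(alphas[k][sp] * gamma(k, sp, s) * betas[k + 1][s]
                  for sp in STATES for s in STATES if sp ^ s)
        den = sum(alphas[k][sp] * gamma(k, sp, s) * betas[k + 1][s]
                  for sp in STATES for s in STATES if not sp ^ s)
        L.append(log(num / den))
    return L

bits = [1, 0, 1, 1, 0, 0, 1, 0]
L = map_decode(*encode(bits))             # noiseless channel: y = x
assert [1 if l > 0 else 0 for l in L] == bits
```

A full turbo decoder would instantiate two such decoders for the two constituent encoders and iterate, feeding each decoder's extrinsic term (the third term of (20)) to the other through the P[·] and Pinv[·] arrays, exactly as in the pseudo-code above.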
The author would like to thank Omer Acikel of New Mexico State University for help with Example 2, and Esko …

References

… "Channel coding with multilevel/phase signals," IEEE Trans. Inf. Theory, pp. 55-67, Jan. 1982.
… Mar. 1996.
… May 1996.
… 1995.
[Fig. 3: simulated bit error rate P_b versus E_b/N_0 (0.5 to 3 dB) after 15 iterations, together with the curves "bnd for intlvr1" and "bnd for intlvr2"; the simulated performance approaches that predicted by (4).]
Fig. 1. Diagram of a standard turbo encoder with two identical recursive systematic encoders (RSC's). (The diagram shows the data u entering RSC 1, specified by g1(D) and g2(D), directly and RSC 2 through the N-bit interleaver, with the parity outputs x^{1p}, x^{2p} feeding the puncturing mechanism.)

Fig. 4. Diagram of iterative (turbo) decoder which uses two MAP decoders operating cooperatively. L^e_{12} is "soft" or extrinsic information from D1 to D2, and L^e_{21} is extrinsic information from D2 to D1. (The diagram shows MAP Decoder 1 and MAP Decoder 2 connected through N-bit interleavers and a de-interleaver, with inputs y^s, y^{1p}, y^{2p}.)