y_t^{(j)} = \sum_{l=0}^{m} x_{t-l}\, g_l^{(j)}.   (1)
Each coded output sequence y^{(j)} in a rate 1/n code is the convolution of the input sequence x and the impulse response g^{(j)},

y^{(j)} = x * g^{(j)}.   (2)
In vector form, this is expressed

y^{(j)} = \sum_{i=0}^{k-1} x^{(i)} * g_i^{(j)},   (3)
which can be developed thus

y_t^{(j)} = \sum_{i=0}^{k-1} \left[ \sum_{l=0}^{m_i - 1} x_{t-l}^{(i)}\, g_{i,l}^{(j)} \right].   (4)
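As a small illustration of (2) (a toy sketch added here, with arbitrary generator taps rather than any of the codes treated later), each output stream of a rate-1/n encoder can be computed directly as a mod-2 convolution in Python:

    import numpy as np

    x = np.array([1, 0, 1, 1, 0, 1])                 # input sequence x
    g = [np.array([1, 1, 1]), np.array([1, 0, 1])]   # impulse responses g^(1), g^(2)
    y = [np.convolve(x, gj) % 2 for gj in g]         # y^(j) = x * g^(j) over GF(2), as in (2)
    for j, yj in enumerate(y, start=1):
        print(f"y^({j}) = {yj}")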
We can express these forms as a matrix multiplication operation, thus providing a generator matrix similar to that developed for block codes. In fact, the primary difference arises from the fact that the input sequence is not necessarily bounded in length, and thus the generator and parity-check matrices for convolutional codes are semi-infinite. However, herein we introduce the G and H matrices as equivalent to a tail-biting convolutional code having finite length. Therefore, the generator and parity-check matrices will be as follows [4]:
G = \begin{bmatrix}
G_m     &         &        &        & G_0    & G_1    & \cdots & G_{m-1} \\
G_{m-1} & G_m     &        &        &        & G_0    & \cdots & G_{m-2} \\
G_{m-2} & G_{m-1} & G_m    &        &        &        & \ddots & \vdots  \\
\vdots  &         &        & \ddots &        &        &        &         \\
G_1     & G_2     & \cdots & G_m    &        &        &        & G_0     \\
G_0     & G_1     & \cdots & G_m    &        &        &        &         \\
        & G_0     & G_1    & \cdots & G_m    &        &        &         \\
        &         &        &        & \ddots &        &        &         \\
        &         &        &        & G_0    & G_1    & \cdots & G_m
\end{bmatrix}   (5)
where
G_l = \begin{bmatrix}
g_{1,l}^{(1)} & g_{1,l}^{(2)} & \cdots & g_{1,l}^{(n)} \\
g_{2,l}^{(1)} & g_{2,l}^{(2)} & \cdots & g_{2,l}^{(n)} \\
\vdots        & \vdots        & \ddots & \vdots        \\
g_{k,l}^{(1)} & g_{k,l}^{(2)} & \cdots & g_{k,l}^{(n)}
\end{bmatrix}   (6)
Note that each block of k rows in the G matrix is a circular
shift by n positions of the previous such block. In general,
the parity-check matrix of a rate k/n tail-biting convolutional
code with constraint length m is
H = \begin{bmatrix}
P_0^T\,|\,I     & P_m^T\,|\,0     & P_{m-1}^T\,|\,0 & \cdots          & P_1^T\,|\,0 \\
P_1^T\,|\,0     & P_0^T\,|\,I     & P_m^T\,|\,0     & \cdots          & P_2^T\,|\,0 \\
\vdots          &                 & \ddots          &                 & \vdots      \\
P_{m-1}^T\,|\,0 & \cdots          & P_1^T\,|\,0     & P_0^T\,|\,I     & P_m^T\,|\,0 \\
P_m^T\,|\,0     & P_{m-1}^T\,|\,0 & \cdots          & P_1^T\,|\,0     & P_0^T\,|\,I \\
                & \ddots          &                 &                 & \vdots      \\
P_m^T\,|\,0     & P_{m-1}^T\,|\,0 & \cdots          & P_1^T\,|\,0     & P_0^T\,|\,I
\end{bmatrix}   (7)
where I is the k × k identity matrix, 0 is the k × k all-zero matrix, and P_i, i = 0, 1, ..., m, is a k × (n − k) matrix whose entries are
P_i = \begin{bmatrix}
g_{1,i}^{(k+1)} & g_{1,i}^{(k+2)} & \cdots & g_{1,i}^{(n)} \\
g_{2,i}^{(k+1)} & g_{2,i}^{(k+2)} & \cdots & g_{2,i}^{(n)} \\
\vdots          & \vdots          & \ddots & \vdots        \\
g_{k,i}^{(k+1)} & g_{k,i}^{(k+2)} & \cdots & g_{k,i}^{(n)}
\end{bmatrix}   (8)
Here, g_{p,i}^{(j)} is equal to 1 or 0 corresponding to whether or not the i-th stage of the shift register for input p contributes to output j (i = 0, 1, ..., m; j = k+1, k+2, ..., n; p = 1, 2, ..., k). Since the last m bits serve as the starting state and are also fed into the encoder, there is an end-around-shift phenomenon for the last m columns of H.
Example 2: Consider the previous encoder shown in Example 1, and assume that a block of k = 6 information bits is encoded. Then the tail-biting construction gives a binary (12, 6) code with generator and parity-check matrices,
G = \begin{bmatrix}
11 & 01 & 10 & 11 & 00 & 00 \\
00 & 11 & 01 & 10 & 11 & 00 \\
00 & 00 & 11 & 01 & 10 & 11 \\
11 & 00 & 00 & 11 & 01 & 10 \\
10 & 11 & 00 & 00 & 11 & 01 \\
01 & 10 & 11 & 00 & 00 & 11
\end{bmatrix}   (9)
and
H = \begin{bmatrix}
11 & 00 & 00 & 11 & 01 & 10 \\
10 & 11 & 00 & 00 & 11 & 01 \\
01 & 10 & 11 & 00 & 00 & 11 \\
11 & 01 & 10 & 11 & 00 & 00 \\
00 & 11 & 01 & 10 & 11 & 00 \\
00 & 00 & 11 & 01 & 10 & 11
\end{bmatrix}   (10)
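As a quick sanity check (a small sketch added here, not part of the original example), one can verify numerically that (9) and (10) form a valid generator/parity-check pair over GF(2), i.e., that G H^T = 0 (mod 2):

    import numpy as np

    rows_G = ["11 01 10 11 00 00", "00 11 01 10 11 00", "00 00 11 01 10 11",
              "11 00 00 11 01 10", "10 11 00 00 11 01", "01 10 11 00 00 11"]
    rows_H = ["11 00 00 11 01 10", "10 11 00 00 11 01", "01 10 11 00 00 11",
              "11 01 10 11 00 00", "00 11 01 10 11 00", "00 00 11 01 10 11"]
    G = np.array([[int(b) for b in r.replace(" ", "")] for r in rows_G])
    H = np.array([[int(b) for b in r.replace(" ", "")] for r in rows_H])
    assert not ((G @ H.T) % 2).any()   # every codeword xG satisfies all parity checks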
B. Degree distribution of Tanner graph for tail-biting convolutional codes
Looking at the H matrix of a tail-biting convolutional code, we can notice that it is similar to the H matrix of an irregular LDPC code, where the number of non-zero elements is not a fixed number per row and column. Our goal is to represent the tail-biting convolutional codes through Tanner graphs in order to decode them using the BP algorithm. Therefore, it is important to obtain the degree distribution of the Tanner graph, which describes the number of edges into the bit and check nodes in irregular LDPC codes. The fraction of edges which are connected to degree-i bit nodes is denoted \lambda_i, and the fraction of edges which are connected to degree-i check nodes is denoted \rho_i. The functions
\lambda(x) = \lambda_1 x + \lambda_2 x^2 + \cdots + \lambda_i x^{i-1} + \cdots   (11)

\rho(x) = \rho_1 x + \rho_2 x^2 + \cdots + \rho_i x^{i-1} + \cdots   (12)
are defined to describe the degree distributions.
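As an illustration (a sketch added here, using the standard edge-perspective definitions above rather than any particular code from the paper), the coefficients \lambda_i and \rho_i can be read off any parity-check matrix by counting column and row weights:

    import numpy as np

    def edge_degree_distributions(H):
        """Return {degree: fraction of edges} for bit nodes (lambda) and check nodes (rho)."""
        H = np.asarray(H)
        E = H.sum()                    # total number of Tanner-graph edges
        col_w = H.sum(axis=0)          # bit-node (column) degrees
        row_w = H.sum(axis=1)          # check-node (row) degrees
        lam = {int(d): float((col_w == d).sum() * d / E) for d in np.unique(col_w) if d > 0}
        rho = {int(d): float((row_w == d).sum() * d / E) for d in np.unique(row_w) if d > 0}
        return lam, rho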
III. TURBO CODES
To replace the traditional decoders of turbo codes by the
BP decoder, we have to obtain the parity-check matrix for
the turbo code as was done in the previous section for the
tail-biting convolutional codes.
A. Parity check matrix for turbo codes
Let us consider a recursive systematic convolutional (RSC) code C_0 of rate R = 1/2. It has two generator polynomials g_1(X) and g_2(X) of degree v + 1, where v is the memory of the encoder. Let u(X) be the input of the encoder and x_1(X) and x_2(X) its outputs. We consider this code as a block code obtained from the zero-tail truncation of the RSC. Using the parity-check matrix H of the RSC, we can do some column permutations and rewrite the H matrix as H_new, where H_new = [H_1 H_2]. As mentioned before, we consider a conventional turbo code C, resulting from the parallel concatenation of two identical RSC codes C_0 and whose common inputs are separated by an interleaver of length N, represented by matrix M of size N × N with exactly one nonzero element per row and column. It is well known that the superior performance of turbo codes is primarily due to the interleaver, i.e., due to the cycle structure of the Tanner graph [17]. Hence, the parity-check matrix of the whole turbo code is (for a detailed proof, the reader is referred to [18]):
H_{turbo} = \begin{bmatrix}
H_2     & H_1 & 0   \\
H_2 M^T & 0   & H_1
\end{bmatrix}.   (13)
Example 3: Let us now consider the special case of an RSC code C_0 of rate R = 1/2 whose input u(X) has a finite degree N − 1 (i.e., the input vector has size N). Its parity-check matrix H can now be written as an N × 2N matrix over GF(2) whose coefficients are fixed by its generator. The first (respectively second) N × N part of H consists of shifted rows representing the coefficients of g_2(X) (respectively g_1(X)). For example, choosing g_1 = 101, g_2 = 111 and N = 8, we have:
H = \left[\begin{array}{c|c}
11100000 & 10100000 \\
01110000 & 01010000 \\
00111000 & 00101000 \\
00011100 & 00010100 \\
00001110 & 00001010 \\
00000111 & 00000101 \\
00000011 & 00000010 \\
00000001 & 00000001
\end{array}\right]
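The two N × N parts of this matrix can also be generated programmatically. The short sketch below (our own helper, following the naming H_new = [H_1 H_2] used above, with H_1 carrying the g_2(X) coefficients and H_2 the g_1(X) coefficients) builds each part as a banded upper-triangular matrix:

    import numpy as np

    def shifted_block(coeffs, N):
        # N x N block whose row i carries `coeffs`, shifted right by i and truncated at column N
        B = np.zeros((N, N), dtype=int)
        for i in range(N):
            for d, c in enumerate(coeffs):
                if c and i + d < N:
                    B[i, i + d] = 1
        return B

    N = 8
    H1 = shifted_block([1, 1, 1], N)   # coefficients of g2(X) = 1 + X + X^2
    H2 = shifted_block([1, 0, 1], N)   # coefficients of g1(X) = 1 + X^2
    H = np.hstack([H1, H2])            # H = [H1 | H2], reproducing the matrix above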
Note that the number of non-zero elements per row and per column in the diagonal sub-matrices H_1 and H_2 is upper bounded by L = v + 1, the constraint length of the constituent codes, which is always very small in comparison to the length of the interleaver. Also, the interleaver does not
change the weights of the sub-matrix H_2 M^T. As with the tail-biting convolutional code, the H matrix for a turbo code can also be seen as the H matrix of an irregular LDPC code, since the weight of non-zero elements per row and column is not strictly constant, but always very small compared to the size of the parity-check matrix. Then, the parity-check matrix for the mentioned turbo code in our example will be as follows:
H = \begin{bmatrix}
10100000 & 11100000 & 00000000 \\
01010000 & 01110000 & 00000000 \\
00101000 & 00111000 & 00000000 \\
00010100 & 00011100 & 00000000 \\
00001010 & 00001110 & 00000000 \\
00000101 & 00000111 & 00000000 \\
00000010 & 00000011 & 00000000 \\
00000001 & 00000001 & 00000000 \\
00001001 & 00000000 & 11100000 \\
01010000 & 00000000 & 01110000 \\
00000101 & 00000000 & 00111000 \\
10010000 & 00000000 & 00011100 \\
00000110 & 00000000 & 00001110 \\
10100000 & 00000000 & 00000111 \\
00000010 & 00000000 & 00000011 \\
00100000 & 00000000 & 00000001
\end{bmatrix}
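For completeness, (13) can be assembled directly from such blocks. In the sketch below, the interleaver M is a hypothetical random permutation chosen only to make the example runnable; the interleaver behind the matrix printed above is not specified in the text:

    import numpy as np

    N = 8
    H1 = sum(np.eye(N, k=d, dtype=int) for d in (0, 1, 2))   # g2 = 111 block
    H2 = sum(np.eye(N, k=d, dtype=int) for d in (0, 2))      # g1 = 101 block
    Z = np.zeros((N, N), dtype=int)
    perm = np.random.default_rng(0).permutation(N)           # hypothetical interleaver
    M = np.eye(N, dtype=int)[perm]                           # one nonzero entry per row and column
    H_turbo = np.block([[H2, H1, Z],
                        [(H2 @ M.T) % 2, Z, H1]])
    print(H_turbo.shape)                                     # (16, 24), i.e. a 2N x 3N matrix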
Following (11) and (12) provided in the previous section, the degree distribution of this turbo code is given by

\lambda(x) = 0.291667x + 0.083333x^2 + 0.25x^3 + 0.166667x^4 + 0.208333x^5   (14)

\rho(x) = 0.083333x^2 + 0.125x^3 + 0.033333x^4 + 0.208333x^5 + 0.25x^6   (15)
IV. WIMAX AND LTE CODING STRUCTURE
To address the low and high rate requirements of LTE, the 3rd Generation Partnership Project (3GPP) working group undertook a rigorous study of advanced channel coding candidates such as tail-biting convolutional and turbo codes for low and high data rates, respectively. We investigate here the application of the BP decoder to the turbo code proposed for LTE systems. Meanwhile, a rate-1/2, memory-6 tail-biting convolutional code has been adopted in the WiMAX (802.16e) system because of its best minimum distance and smallest number of minimum-weight codewords for payloads larger than 32 bits; it is used for both the frame control header (FCH) and data channels. In fact, we will focus here on the FCH, which has much shorter payload sizes (12 and 24 bits), as shown in the next subsection.
A. Tail-biting convolutional code in 802.16e
Here, we briefly describe the WiMAX frame control header structure. In the WiMAX Orthogonal Frequency Division Multiplexing (OFDM) physical layer, the payload size of the frame control header is either 24 bits or 12 bits, and the smallest unit for generic data packet transmission is one subchannel. A subchannel consists of 48 QPSK symbols (96 coded bits). At a code rate of 1/2, one subchannel translates to 48 bits as the smallest information block size. Currently, the FCH payload bits are repeated to meet the minimum number (48) of encoder information bits. The generator polynomials for the rate-1/2 WiMAX tail-biting convolutional code are given by g_1 = (1011011) and g_2 = (1111001) in binary notation. According to [19], these generator polynomials have the best d_min (minimum distance) and n_dmin (number of codewords with weight d_min) for payload sizes of 33 bits or more and for some payload sizes between 25 and 33 bits, under the constraint of memory size m = 6 and code rate 1/2.
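To make the tail-biting operation concrete, the following encoding sketch (our own illustration; the tap ordering of g_1 = 1011011 and g_2 = 1111001, with the leftmost tap applied to the current input, is an assumption of the sketch) preloads the shift register with the last m = 6 information bits so that the encoder starts and ends in the same state:

    G1 = [1, 0, 1, 1, 0, 1, 1]      # g1 = 1011011
    G2 = [1, 1, 1, 1, 0, 0, 1]      # g2 = 1111001
    m = 6                           # encoder memory

    def tb_encode(info_bits):
        state = list(reversed(info_bits[-m:]))   # preload register with the message tail
        coded = []
        for b in info_bits:
            window = [b] + state                 # x_t, x_(t-1), ..., x_(t-m)
            coded.append(sum(w * g for w, g in zip(window, G1)) % 2)
            coded.append(sum(w * g for w, g in zip(window, G2)) % 2)
            state = [b] + state[:-1]             # shift the register
        return coded

    print(tb_encode([1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]))   # 12 info bits -> 24 coded bits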
B. Turbo code in LTE system
The 3GPP turbo code is a systematic parallel concatenated
convolutional code (PCCC) with two 8-state constituent en-
coders and one turbo code internal interleaver. Each con-
stituent encoder is independently terminated by tail bits.
For an input block size of K bits, the output of a turbo encoder consists of three length-K streams, corresponding to the systematic and two parity bit streams (referred to as the Systematic, Parity 1, and Parity 2 streams in the following), respectively, as well as 12 tail bits due to trellis
termination. Thus, the actual mother code rate is slightly lower
than 1/3. In LTE, the tail bits are multiplexed to the end of the
three streams, whose lengths are hence increased to (K + 4)
bits each [5]. The transfer function of the 8-state constituent
code for the PCCC is:
G(D) = \left[ 1, \; \frac{g_1(D)}{g_0(D)} \right],   (16)
where
g_0(D) = 1 + D^2 + D^3,   (17)

g_1(D) = 1 + D + D^3.   (18)
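A minimal sketch of one such constituent encoder follows (our own illustration of the recursion implied by (16)-(18), with the register starting in the all-zero state as described in the next paragraph; trellis termination and the tail bits are omitted):

    def rsc_encode(bits):
        # Rate-1/2 RSC with feedback g0(D) = 1 + D^2 + D^3 and feedforward g1(D) = 1 + D + D^3
        s1 = s2 = s3 = 0                   # shift-register contents (D, D^2, D^3 taps)
        systematic, parity = [], []
        for c in bits:
            fb = (c + s2 + s3) % 2         # feedback defined by g0(D)
            z = (fb + s1 + s3) % 2         # parity output defined by g1(D)
            systematic.append(c)
            parity.append(z)
            s1, s2, s3 = fb, s1, s2        # shift the register
        return systematic, parity

    print(rsc_encode([1, 0, 1, 1, 0, 1, 0, 0]))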
The initial value of the shift registers of the 8-state constituent encoders will be all zeros when starting to encode the input bits. The output from the turbo encoder is d_k^{(0)} = x_k, d_k^{(1)} = z_k, and d_k^{(2)} = z'_k for k = 0, 1, 2, ..., K−1. If the code block to be encoded is the 0-th code block and the number of filler bits is greater than zero, i.e., F > 0, then the encoder will set c_k = 0, k = 0, ..., (F−1) at its input and will set d_k^{(0)} = <NULL>, k = 0, ..., (F−1) and d_k^{(1)} = <NULL>, k = 0, ..., (F−1) at its output [5]. The bits input to the turbo encoder are denoted by c_0, c_1, c_2, c_3, ..., c_{K−1}, and the bits output from the first and second 8-state constituent encoders are denoted by z_0, z_1, z_2, z_3, ..., z_{K−1} and z'_0, z'_1, z'_2, z'_3, ..., z'_{K−1}, respectively. The bits output from the turbo code internal interleaver are denoted by c'_0, c'_1, ..., c'_{K−1}, and these bits are to be the input to the second 8-state constituent encoder.
V. SIMULATION RESULTS
Considering the previous example of the tail-biting convolu-
tional code in WiMAX systems and binary transmission over
an AWGN channel, the BP algorithm as in [4] is compared
with the maximum-likelihood (ML) Viterbi type algorithm to
decode the same tail-biting convolutional code [9, 12]. To
determine by simulation the maximum decoding performance
capability of each algorithm, at least 300 codeword errors are
detected at each SNR value. Figure 1 shows a performance
comparison between the two mentioned decoding algorithms
for a payload size of 24 bits. Note that the maximum number of iterations for the BP algorithm is set to 30. The sim-
ulation results show that the proposed BP algorithm exhibits
a slight performance penalty with respect to the ML Viterbi
type algorithm. However, since the BP decoder is less complex
than this traditional decoder and enables a unied decoding
approach, this loss in BER performance is deemed acceptable.
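For concreteness, a flooding-schedule sum-product (BP) decoder of the kind used in these comparisons can be sketched as follows; this is a minimal illustrative implementation operating on any binary H matrix (for example, the 6 x 12 matrix of (10)), with function and variable names of our own choosing, and not the exact code used to generate the figures:

    import numpy as np

    def bp_decode(H, llr, max_iters=30):
        # Sum-product decoding: H is an (m, n) 0/1 array, llr holds the n channel LLRs
        # (positive values favour bit 0). Returns a length-n hard-decision vector.
        m, n = H.shape
        chk_nbrs = [np.flatnonzero(H[i]) for i in range(m)]    # bit nodes attached to each check
        v2c = {(i, j): float(llr[j]) for i in range(m) for j in chk_nbrs[i]}
        hard = (np.asarray(llr) < 0).astype(int)
        for _ in range(max_iters):
            # Horizontal (check-node) step: tanh rule with leave-one-out products
            c2v = {}
            for i in range(m):
                t = {j: np.tanh(0.5 * v2c[(i, j)]) for j in chk_nbrs[i]}
                for j in chk_nbrs[i]:
                    prod = 1.0
                    for jp in chk_nbrs[i]:
                        if jp != j:
                            prod *= t[jp]
                    c2v[(i, j)] = 2.0 * np.arctanh(np.clip(prod, -0.999999, 0.999999))
            # Vertical (bit-node) step: total LLRs, tentative decision, syndrome check
            total = np.array(llr, dtype=float)
            for (i, j), msg in c2v.items():
                total[j] += msg
            hard = (total < 0).astype(int)
            if not np.any((H @ hard) % 2):
                break                                          # all parity checks satisfied
            for (i, j) in v2c:
                v2c[(i, j)] = total[j] - c2v[(i, j)]
        return hard

The limit of 30 iterations in the sketch matches the iteration limit quoted above.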
Figure 1. Comparisons of BER (versus Eb/No in dB) for the length-24, rate-1/2 tail-biting convolutional code, using the BP decoder and the ML Viterbi decoder.
In addition, a comparison between the same short length
code using the BP and ML Viterbi algorithms has been
performed in Figure 2. In this case, a loss of 1.85 dB or less
in FER compared with the traditional decoder is observed.
Figure 2. Comparisons of FER (versus Eb/No in dB) for the length-24, rate-1/2 tail-biting convolutional code, using the BP decoder and the ML Viterbi decoder.
In Figure 3, we report simulation results for the AWGN
channel for the LTE turbo code that was studied in the previous
section. When compared to the traditional MAP and SOVA
decoders [11, 12], the BP algorithm is about 1.7 dB worse at
a BER value of 10^{-2}. Also, as we obtained a general form for
the parity-check matrices of tail-biting convolutional and turbo
codes, then we can enhance the performance by investigating
other decoding algorithms which are also applicable for LDPC
codes.
Figure 3. Comparisons of BER (versus Eb/No in dB) for the length-40, rate-1/3 turbo code, using the BP, MAP, and SOVA decoders.
For further research, we propose exploring alternatives to the flooding schedule usually adopted for LDPC codes to enhance the BER performance.
VI. COMPLEXITY COMPARISON
A direct comparison between the complexity of different
decoding algorithms is implementation dependent. Starting
with the traditional decoders for turbo codes, the MAP process
computes the log-likelihood for all paths in the trellis. The
MAP algorithm estimates the metric for both received binary
zero and a received binary one, then compares them to
determine the best overall estimate. The SOVA process only
considers two paths of the trellis per step: the best path with
a data bit of zero and the best path with a data bit of one.
In addition, it utilizes the difference of the log-likelihood
function for each of these paths. However, SOVA is the less complex of the two algorithms in terms of the number of
calculations [14]. Finally, for the BP algorithm, the decoding
complexity per iteration grows linearly with the number of
edges (the number of messages passed per iteration is twice
the number of edges in the graph E). Moreover, one can argue
that the complexity of the operations at the variable and check
nodes frequently scales linearly with
E = \sum_{i=1}^{d_v^{\max}} n\, v_i\, i = \sum_{j=1}^{d_c^{\max}} r\, c_j\, j   (19)
Following the notation of Luby et al. [13], consider a Tanner graph with n left nodes and r right nodes, where v_i = n_i/n represents the fraction of left nodes of degree i > 0 and d_v (resp. d_c) is the variable node degree (resp. check node degree). Also, c_j = r_j/r is defined to be the fraction of right nodes of degree j > 1.
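As a small numerical check of (19) (a toy example added here, not one of the codes from the paper), the edge count obtained by summing the entries of H agrees with the degree-weighted sums over left and right nodes:

    import numpy as np

    H = np.array([[1, 1, 0, 1, 0, 0],
                  [0, 1, 1, 0, 1, 0],
                  [1, 0, 0, 1, 0, 1]])       # toy parity-check matrix
    r, n = H.shape                           # r right (check) nodes, n left (bit) nodes
    E = int(H.sum())                         # one Tanner-graph edge per nonzero entry
    col_deg, row_deg = H.sum(axis=0), H.sum(axis=1)
    E_left  = sum(n * ((col_deg == i).sum() / n) * i for i in range(1, col_deg.max() + 1))
    E_right = sum(r * ((row_deg == j).sum() / r) * j for j in range(1, row_deg.max() + 1))
    print(E, int(E_left), int(E_right))      # all three values coincide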
The complexity comparisons of the various decoding al-
gorithms are shown in Table I where k is the number of
systematic bits and v is the memory order of the encoder. The
table gives the operations per iteration for MAP, SOVA, and
BP decoding for the horizontal (H) and vertical (V) step. Note
Table I
DECODER COMPLEXITY COMPARISONS

         MAP                 SOVA                BP
         2·2^k·2^v + 6       2·2^k·2^v + 9       (H): (3 d_vmax − 4)·M·2^2   (V): 2 d_vmax·M
         5·2^k·2^v + 8       2^k·2^v             (H): (3 d_vmax − 4)·M·2^2   (V): 2 d_cmax·d_vmax·M
  max    0                   2·2^v − 1           (H): 0                      (V): 0
that, for the BP algorithm, the complexity per information bit
is [20]
K = \frac{1}{\sum_i \lambda_i/i \, - \, \sum_i \rho_i/i} \int_{p_t}^{p_0} \frac{dp}{p \log\!\left( p \, / \, \sum_i \lambda_i f_i(p) \right)}.   (20)
Considering example 3, Figure 4 shows a comparison between
these mentioned algorithms in terms of the number of oper-
ations required in the implementations. In comparison with
MAP and SOVA decoders, BP exhibits the lowest implemen-
tation complexity over all the required operations.
Figure 4. Comparison of MAP, SOVA, and BP decoders in terms of number
of operations.
VII. CONCLUSION
In this paper, the feasibility of decoding arbitrary tail-
biting convolutional and turbo codes using the BP algorithm
was demonstrated. Using this algorithm to decode the tail-
biting convolutional code in WiMAX systems speeds up the
error correction convergence and reduces the decoding com-
putational complexity with respect to the ML-Viterbi-based
algorithm. In addition, the BP algorithm is a non-trellis-based, forward-only algorithm with only an initial decoding delay, thus avoiding the intermediate decoding delays that usually accompany the traditional MAP and SOVA components in LTE turbo decoders. However, with respect to the traditional decoders for turbo codes, the BP algorithm is about 1.7 dB worse at a BER value of 10^{-2}. This is because the nonzero
element distribution in the parity-check matrix is not random
enough. Also, there are a number of short cycles in the
corresponding Tanner graphs. Finally, as an extended work,
we propose the BP decoder for these codes in a combined
architecture which is advantageous over a solution based on
two separate decoders due to efficient reuse of computational
hardware and memory resources for both decoders. In fact,
since the traditional turbo decoders (based on MAP and
SOVA components) have a higher complexity, the observed
loss in performance with BP is more than compensated by
a drastically lower implementation complexity. Moreover, the
low decoding complexity of the BP decoder brings about end-
to-end efficiency since both encoding and decoding can be
performed with relatively low hardware complexity.
REFERENCES
[1] C. Berrou, A. Glavieux, and P. Thitimajshima, Near Shannon limit
error-correcting coding and decoding: Turbo codes, IEEE Intl. Conf.
on Commun., vol. 2, Geneva, Switzerland, pp. 1064-1070, 1993.
[2] R. M. Tanner, A recursive approach to low complexity codes, IEEE
Trans. Inform. Theory, vol. 27, no. 5, pp. 533-547, 1981.
[3] G. D. Forney, Jr., Codes on graphs: normal realizations, IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 520-548, 2001.
[4] H. H. Ma and J. K. Wolf, On Tail Biting Convolutional Codes, IEEE
Trans. On Commun., vol. 34, no. 2, pp. 104-111, 1986.
[5] 3GPP TS 45.003, 3rd Generation Partnership Project; Technical Specification Group GSM/EDGE Radio Access Network; Channel Coding (Release 7), February, 2007.
[6] IEEE Std 802.16-2004, IEEE Standard for Local and Metropolitan Area
Networks Part 16: Air Interface for Fixed Broadband Wireless Access
Systems, October, 2004.
[7] IEEE Std P802.16e/D10, IEEE Standard for Local and Metropolitan
Area Networks Part 16: Air Interface for Fixed and Mobile Broadband
Wireless Access Systems, August, 2005.
[8] D. J. C. MacKay, Good error-correcting codes based on very sparse
matrices, IEEE Trans. Inform. Theory, vol. 45, no. 2, pp. 399-431,
1999.
[9] N. Wiberg, Codes and Decoding on General Graphs, Linkoping Studies in Science and Technology, Dissertation No. 440, Linkoping University, Linkoping, Sweden, 1996.
[10] R. J. McEliece, D. MacKay, and J.-Fu Cheng, Turbo Decoding as an Instance of Pearl's "Belief Propagation" Algorithm, IEEE Trans. On Commun., vol. 16, no. 2, pp. 140-152, 1998.
[11] G. Colavolpe, Design and performance of turbo Gallager codes, IEEE
Trans. On Commun., vol. 52, no. 11, pp. 1901-1908, 2004.
[12] T. T. Chen, and S-He Tsai, Reduced-Complexity Wrap-Around Viterbi
Algorithms for Decoding Tail-Biting Convolutional Codes, 14th Euro-
pean Wireless Conference, Jun. 2008.
[13] M. G. Luby, M. Mitzenmacher, M. A. Shokrollahi, and D. A. Spielman,
Improved LDPC Codes Using Irregular Graphs, IEEE Trans. Inform.
Theory, vol. 47, no. 2, pp. 585-598, 2001.
[14] A. Giulietti, Turbo Codes: Desirable and Designable, Kluwer Academic Publishers, ISBN: 1-4020-7660-6, 2004.
[15] P. Elias, Coding for Noisy Channels, IRE Conv. Rec., vol. 3, pt. 4, pp. 37-46, 1955.
[16] A. J. Viterbi, Convolutional Codes and their Performance in Communication Systems, IEEE Trans. On Commun., vol. 19, no. 5, pp. 751-772, 1971.
[17] C. Poulliat, D. Declercq, and T. Lestable, Efficient Decoding of Turbo Codes with Nonbinary Belief Propagation, EURASIP Journal on Wireless Communications and Networking, vol. 2008, no. 473613, 2008.
[18] A. Refaey, S. Roy, and P. Fortier, On the Application of BP Decoding to Convolutional and Turbo Codes, Asilomar Conference on Signals, Systems, and Computers, Nov. 2009.
[19] P. Stahl, J. B. Anderson, and R. Johannesson, Optimal and near-optimal
encoders for short and moderate-length tail-biting trellises, IEEE Trans.
Inform. Theory, vol. 45, no. 7, pp. 2562-2571, 1999.
[20] W. Yu, M. Ardakani, B. Smith, and F. Kschischang, Complexity-optimized low-density parity-check codes for Gallager decoding algorithm B, IEEE International Symposium on Information Theory (ISIT), Sep. 2005.