
DIGITAL COMMUNICATIONS

Block 3

Channel Coding

Francisco J. Escribano, 2017-18


OUTLINE

Fundamentals of error control

Binary linear block codes

Multiple-error correction block codes

Convolutional codes

Turbo codes

Low Density Parity Check codes

Coded modulations

Conclusions

References
2
Fundamentals of error control

Error control:
– detection (ARQ -Automatic Repeat reQuest- schemes)
– correction (FEC -Forward Error Correction- schemes)

  bn → [ channel encoder ] → cn → [ channel ] → rn / λn → [ channel error corrector / channel error detector ] → bn

Channel model:
– discrete inputs,
– discrete (hard, rn) or continuous (soft, λn) outputs,
– memoryless. 3
Fundamentals of error control

Enabling detection/correction:
– Adding redundancy to the information: for every k bits,
transmit n, n>k.


Shannon's theorem (1948):
1) If R < C = max_{pX} [ I(X;Y) ], then for any ε>0 there is an n (with R=k/n constant) such that Pb < ε.

2) If a bit error probability Pb is acceptable, rates R < R(Pb) = C/(1−H(Pb)) are achievable (H(·) is the binary entropy function).

3) For any Pb, rates greater than R(Pb) are not achievable.


Problem: Shannon's theorem is not constructive.

4
Fundamentals of error control

Added redundancy should be structured redundancy.

This relies on sound algebraic & geometrical basis.

Our initial approach:
– Algebra over the Galois Field of order 2, GF(2)={0,1}.
– GF(2) is a proper field, and GF(2)m is a vector space of dim. m.
– Dot product · :logical AND. Sum +(-) : logical XOR.
– Scalar product: b, d ∈ GF(2)m
b·dT=b1·d1+...+bm·dm
– Product by scalars: a ∈ GF(2), b ∈ GF(2)m
a·b=(a·b1..a·bm)
– It is also possible to define a matrix algebra over GF(2).
5
Fundamentals of error control

Given a vector b ∈ GF(2)m, its binary weight is w(b)=number of
1's in b.

It is possible to define a distance over the vector space GF(2)m,
called Hamming distance:
dH(b,d)=w(b+d); b, d ∈ GF(2)m

Hamming distance is a proper distance and accounts for the
number of differing positions between vectors.

Geometrical view (figure): binary words such as (1011), (1010), (1110), (0110) pictured as points, with edges joining words at Hamming distance 1.
6
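The GF(2) operations and the Hamming metric above are easy to check numerically. A minimal Python sketch (plain 0/1 integer lists; the example vectors are illustrative choices, not taken from the figure):

```python
# GF(2) vector arithmetic and Hamming distance, as defined above.
# Sum is XOR, product is AND; vectors are lists of 0/1.

def gf2_add(b, d):
    """Component-wise sum over GF(2) (logical XOR)."""
    return [x ^ y for x, y in zip(b, d)]

def gf2_dot(b, d):
    """Scalar product b·d^T over GF(2) (AND, then XOR-accumulate)."""
    s = 0
    for x, y in zip(b, d):
        s ^= (x & y)
    return s

def weight(b):
    """Binary weight w(b): number of 1's."""
    return sum(b)

def hamming(b, d):
    """Hamming distance d_H(b,d) = w(b+d)."""
    return weight(gf2_add(b, d))

b = [1, 0, 1, 1]
d = [1, 0, 1, 0]
print(gf2_add(b, d))   # [0, 0, 0, 1]
print(gf2_dot(b, d))   # 1·1 + 0·0 + 1·1 + 1·0 = 0 over GF(2)
print(hamming(b, d))   # 1 -> the vectors differ in one position
```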
Fundamentals of error control

A given encoder produces n output bits for each k input bits:
– R=k/n<1 is the rate of the code.

The information rate decreases by a factor R when using a code.
R'b=R·Rb < Rb (bit/s)
● If used jointly with a modulation with spectral efficiency η=Rb/B
(bit/s/Hz), the efficiency decreases by R
η'=R·η < η (bit/s/Hz)
● In terms of limited Pb, the achievable Eb/N0 region under
AWGN is lower bounded by (the so-called Shannon limit):
  Eb/N0 (dB) ⩾ 10·log10( (2^η' − 1) / η' )
7
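As a quick numeric check of the bound above, a short Python sketch (the η' values are arbitrary illustrative choices):

```python
import math

def shannon_limit_db(eta_prime):
    """Minimum Eb/N0 (dB) needed at spectral efficiency eta' (bit/s/Hz)."""
    return 10 * math.log10((2 ** eta_prime - 1) / eta_prime)

for eta in (0.5, 1.0, 2.0, 4.0):
    print(f"eta' = {eta}: Eb/N0 >= {shannon_limit_db(eta):.2f} dB")

# eta' -> 0 recovers the ultimate limit ln(2), i.e. about -1.59 dB:
print(10 * math.log10(math.log(2)))
```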
Fundamentals of error control

Recall our playground:

Source: http://www.comtechefdata.com/technologies/fec/ldpc
8
Fundamentals of error control
● How a channel code can improve Pb (BER in statistical terms).
  – (Figure: BER curves with and without coding; the coding gain is the reduction Δ(Eb/N0) measured at a target BER, e.g. Pb = 10^−7.)

● Cost: loss in resources (spectral efficiency, power, processing time).
Binary linear block codes

12
Binary linear block codes

An (n,k) linear block code (LBC) is a subspace C(n,k) < GF(2)n
with dim(C(n,k))=k.

● C(n,k) contains 2k vectors c=(c1…cn).


R=k/n is the rate of the LBC.


n-k is the redundancy of the LBC
– we would only need vectors with k components to
specify the same amount of information.
13
Binary linear block codes

Recall vector theory:
– A basis for C(n,k) has k vectors over GF(2)n
– C(n,k) is the kernel of a linear function LF: GF(2)n → GF(2)n-k.

● c ∈ C(n,k) can be specified either as:
– c=b1·g1+...+bk·gk, where {gj}j=1,...,k is the basis set, and (b1...bk) are its coordinates over it;
– or as the c such that the scalar products c·hiT are null, where matrix {hi}i=1,...,n-k represents the linear function LF. C(n,k) is the null subspace of this matrix.
14
Binary linear block codes

Arranging in matrix form, an LBC C(n,k) can be specified by
– G=[gij]i=1,...,k, j=1,...,n, and c=b·G, b ∈ GF(2)k.

– H=[hij]i=1,...,n-k, j=1,...,n, and c·HT=0.


G is a k×n generator matrix of the LBC C(n,k).


H is a (n-k)×n parity-check matrix of the LBC C(n,k).
– In another approach, it can be shown that the rows in H stand for
linearly independent parity-check equations.
– The row rank of H for an LBC should be n-k.

● Note that gj ∈ C(n,k), and so G·HT=0.
15
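As an illustration of the relations above, a short Python/numpy sketch with the (7,4) Hamming code in systematic form (this particular P is a standard textbook choice, not necessarily the one used later in the exercises):

```python
import numpy as np

# Systematic (7,4) Hamming code: G = [I4 | P], H = [P^T | I3], arithmetic mod 2.
P = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]])
G = np.hstack([np.eye(4, dtype=int), P])       # 4x7 generator matrix
H = np.hstack([P.T, np.eye(3, dtype=int)])     # 3x7 parity-check matrix

print((G @ H.T) % 2)        # all-zero 4x3 matrix: G·H^T = 0

b = np.array([1, 0, 1, 1])  # information block
c = (b @ G) % 2             # codeword; first 4 bits equal b (systematic encoder)
print(c, (c @ H.T) % 2)     # valid codeword -> zero syndrome
```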
Binary linear block codes

The encoder is given by G
– Note that a number of different G generate the same LBC

  b=(b1...bk) → [ LBC encoder, G ] → c=(c1...cn)=b·G

For any input information block with length k, it yields a
codeword with length n.
● An encoder is systematic if b is contained in c=(b1...bk | ck+1...cn),
so that ck+1...cn are the n-k parity bits.
– Systematicity is a property of the encoder, not of the LBC
C(n,k) itself.
– GS=[Ik | P] is a systematic generator matrix. 16
Binary linear block codes

How to obtain G from H or H from G.
– G rows are k vectors linearly independent over GF(2)n
– H rows are n-k vectors linearly independent over GF(2)n
– They are related through G·HT=0 (a)

(a) does not yield a sufficient set of equations, given H or G.
– A number of vector sets comply with it (basis sets are not unique).

Given G, put it in systematic form by combining rows (the code will be the
same, but the encoding does change).
– If GS=[Ik | P], then HS=[PT | In-k] complies with (a).

Conversely, given H, put it in systematic form by combining rows.
– If HS=[In-k | P], then GS=[PT | Ik] complies with (a).

Parity check submatrix P can be on the left or on the right side (but on
opposite sides of H and G simultaneously for a given LBC).
17
Binary linear block codes

Note that, by taking 2k vectors out of the 2n available, we are spacing the binary words apart.

The minimum Hamming distance between input words is
  dmin( GF(2)k ) = min{ dH(bi,bj) : bi ≠ bj ∈ GF(2)k } = 1

Recall that we have added n−k redundancy bits, so that
  dmin( C(n,k) ) = min{ dH(ci,cj) : ci ≠ cj ∈ C(n,k) } > 1

  dmin( C(n,k) ) ⩽ n−k+1 (Singleton bound)


18
Binary linear block codes

The channel model corresponds to a BSC (binary symmetric channel)

  c=(c1...cn) → [ Modulator → AWGN channel → Hard demodulator ] → r=(r1...rn)
  This chain is equivalent to a BSC(p) with transition probabilities P(0→0)=P(1→1)=1−p and P(0→1)=P(1→0)=p.
● p=P(ci≠ri ) is the bit error probability of the modulation in AWGN.
19
Binary linear block codes
● The received word is r=c+e, where P(ei=1)=p.
– e is the error vector introduced by the noisy channel
– w(e) is the number of errors in r wrt original word c
– Each specific error pattern e with w(e)=t has probability p^t·(1−p)^(n−t), because the channel is memoryless


At the receiver side, we can compute the so-called syndrome vector
s=(s1...sn-k) as s=r·HT=(c+e)·HT=c·HT+e·HT=e·HT.

  r=(r1...rn) → [ Channel decoder, H ] → s=(s1...sn-k)=r·HT
r ∈ C(n,k) ⇔ s=0.
20
Binary linear block codes
Two possibilities at the receiver side:

a) Error detection (ARQ schemes):
– If s≠0, there are errors, so ask for retransmission.

b) Error correction (FEC schemes):
– Decode an estimated ĉ ∈ C(n,k), so that dH(ĉ,r) is the minimum
over all codewords in C(n,k) (closest neighbor decoding).
– ĉ is the most probable word under the assumption that p is small
(otherwise, the decoding fails).
  (Figure: transmitted codeword c and neighbouring codewords ĉ1, ĉ2; a received word r1=c+e1 with small e1 is decoded back to c (OK), while r2=c+e2 may fall closer to another codeword.)
21
Binary linear block codes

Detection and correction capabilities (worst case) of an LBC with
dmin(C(n,k)).
– a) It can detect error events e with binary weight up to
w(e)|max,det=d=dmin(C(n,k))-1

– b) It can correct error events e with binary weight up to


w(e)|max,corr=t=⎣(dmin(C(n,k))-1)/2⎦


It is possible to implement a joint strategy:
– A dmin(C(n,k))=4 code can simultaneously correct all error
patterns with w(e)=1, and detect all error patterns with w(e)=2.

22
Binary linear block codes
● The minimum distance dmin(C(n,k)) is a property of the set of
codewords in C(n,k), independent from the encoding (G).

● As the code is linear, dH(ci,cj)=dH(ci+cj,cj+cj)=dH(ci+cj,0).

– ci, cj, ci+cj, 0 ∈ C(n,k)

● dmin(C(n,k))=min{w(c) | c ∈ C(n,k), c≠0}


– i.e., corresponds to the minimum word weight over all codewords
different from the null codeword.
● dmin(C(n,k)) can be calculated from H:
– It is the minimum number of different columns of H adding to 0.
– Equivalently, it is one more than the largest number d such that every set of d columns of H is linearly independent.
23
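Since dmin(C(n,k)) equals the minimum nonzero codeword weight, it can be found by brute force for small codes. A sketch reusing the systematic (7,4) Hamming generator matrix from the earlier example (expected result: dmin = 3):

```python
import itertools
import numpy as np

P = np.array([[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]])
G = np.hstack([np.eye(4, dtype=int), P])   # (7,4) Hamming generator matrix

def dmin(G):
    """Minimum distance = minimum weight over all nonzero codewords (linear code)."""
    k = G.shape[0]
    best = G.shape[1]
    for bits in itertools.product((0, 1), repeat=k):
        if any(bits):
            c = (np.array(bits) @ G) % 2
            best = min(best, int(c.sum()))
    return best

print(dmin(G))   # 3 -> detects up to 2 errors, corrects 1
```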
Binary linear block codes

Detection limits: probability of undetected errors?
– Note that an LBC contains 2k codewords, and the received
word corresponds to any of the 2n possibilities in GF(2)n.
– An LBC detects up to 2n-2k error patterns.

An undetected error occurs if r=c+e with e≠0 ∈ C(n,k)
– In this case, r·HT=e·HT=0.
  Pu(E) = Σ_{i=dmin}^{n} Ai · p^i · (1−p)^(n−i)

– Ai is the number of codewords in C(n,k) with weight i: the set {Ai} is called the weight spectrum of the LBC.
24
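Given the weight spectrum {Ai}, Pu(E) is a direct sum. A small Python sketch for the (7,4) Hamming code, whose nonzero weights are A3=7, A4=7, A7=1:

```python
def p_undetected(A, n, p):
    """Pu(E) = sum_{i>=dmin} Ai * p^i * (1-p)^(n-i); A is a dict {weight: multiplicity}."""
    return sum(Ai * p**i * (1 - p)**(n - i) for i, Ai in A.items() if i > 0)

A_hamming74 = {3: 7, 4: 7, 7: 1}   # weight spectrum of the (7,4) Hamming code
for p in (1e-2, 1e-3, 1e-4):
    print(p, p_undetected(A_hamming74, 7, p))
```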
Binary linear block codes

On correction, an LBC considers syndrome s=r·HT.
– Assume correction capabilities up to w(e)=t, and EC to be
the set of correctable error patterns.
– A syndrome table associates a unique si over the 2n-k
possibilities to a unique error pattern ei ∈ EC with w(ei)≤t.
– If si=r·HT, decode ĉ=r+ei.
– Given the knowledge about the encoder G, estimate
^ ^
information vector b such that b·G=ĉ.
● If the number of correctable errors is #(EC)<2n-k, there are 2n-k-
#(EC) syndromes usable in detection, but not in correction.
– At most, an LBC may correct 2n-k error patterns.
25
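A minimal syndrome-table decoder sketch for the same (7,4) Hamming code (t = 1, so EC contains the all-zero pattern plus the 7 single-error patterns, and all 2^(n−k) = 8 syndromes are used for correction):

```python
import numpy as np

P = np.array([[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]])
G = np.hstack([np.eye(4, dtype=int), P])
H = np.hstack([P.T, np.eye(3, dtype=int)])

# Syndrome table: map each correctable error pattern (weight <= 1) to its syndrome.
table = {}
for pos in range(-1, 7):                 # -1 means "no error"
    e = np.zeros(7, dtype=int)
    if pos >= 0:
        e[pos] = 1
    table[tuple((e @ H.T) % 2)] = e

b = np.array([1, 0, 1, 1])
c = (b @ G) % 2
r = c.copy()
r[5] ^= 1                                # one channel error
s = tuple((r @ H.T) % 2)
c_hat = (r + table[s]) % 2               # decode: add back the estimated error pattern
print(np.array_equal(c_hat, c))          # True
```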
Binary linear block codes

A w(e)≤t error correcting LBC has a probability of
correcting erroneously bounded by
  P(E) ⩽ Σ_{i=t+1}^{n} C(n,i) · p^i · (1−p)^(n−i)
– This is an upper bound, since, for example, not all the
codewords are separated by the minimum distance of the
code.

● Calculating the resulting P'b of an LBC is not an easy task,


and it depends heavily on how the encoding is made
through G.

LBC codes are mainly used in detection tasks (ARQ).
26
Binary linear block codes

Observe that both coding & decoding can be performed with
low complexity hardware (combinational logic: gates).

Examples of LBC
– Repetition codes
– Single parity check codes
– Hamming codes
– Cyclic redundancy codes
– Reed-Muller codes
– Golay codes
– Product codes
– Interleaved codes

Some of them will be examined through exercises. 27
Binary linear block codes
● An example of FEC performance: RHam=4/7, RGol=1/2.

28
Multiple-error correction block
codes

29
Multiple-error-correction block codes

There are a number of more powerful block codes, based on higher
order Galois fields.
– They use symbols over GF(s), with s>2.
– For prime s the operations are defined mod s; for s=p^r they are defined through polynomial arithmetic modulo an irreducible polynomial.
– An (n,k) linear block code with symbols from GF(s) is again a k-
dimensional subspace of the vector space GF(s)n.

They are used mainly for correction, and have applications in
channels or systems where error bursts are frequent, e.g.
– Storage systems (erasure channel).
– Communication channels with deep fades.

They are frequently used in concatenation with other codes that fail in
short bursts when correcting (e.g. convolutional codes).

We are going to introduce two important instances of such broad
class of codes: Reed-Solomon and BCH codes.
30
Multiple-error-correction block codes
● An (n,k,t)q Reed-Solomon (R-S) code is defined as a mapping
from GF(q)k↔GF(q)n, with the following parameters:
– Block length n=q-1 symbols, for input block length of k<n
symbols.
– Alphabet size q=pr, where p is a prime number.
● Minimum distance of the code dmin=n-k+1 (it achieves the
Singleton bound for linear block codes).
– It can correct up to t symbol errors, dmin=2t+1.

This is a whole class of codes.
– For any q=pr, n=q-1 and k=q-1-2t, there exists a Reed-
Solomon code meeting all these criteria.
31
Multiple-error-correction block codes

The algebra in GF(q), q=p^r and p prime, is more involved:
– Elements are built as powers of an abstract entity called primitive root of unity α.
– For any b ∊ GF(q), except 0, there exists an integer u such that α^u = b.
– α^i, i=1,...,q−1, spans GF(q), except 0.
– b ∊ GF(q) can also be written as a polynomial in α, using properties such as α^r = α + 1 (for the chosen primitive polynomial) and α^i + α^i = 0 (characteristic 2).
– Addition, multiplication, vector operations, etc. in this domain are done according to such properties.
32
Multiple-error-correction block codes

The R-S code is built with the help of polynomial algebra.

The message
  s = (s1 ... sk) ∈ GF(q)^k
is mapped to a polynomial with given coefficients
  p(a) = Σ_{i=1}^{k} zi·a^(i−1) ;  a, zi ∈ GF(q)

To get the corresponding codeword, the encoding function
works evaluating the polynomial at n distinct given points

c=( p( a1) ... p(a n )) ∈ GF (q)n


Note that all the operations are performed over GF(q). 33
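To make the evaluation-map encoding concrete without the full GF(p^r) machinery, here is a toy Python sketch over the prime field GF(7) (q = 7, n = q−1 = 6, k = 3, so dmin = 4 and t = 1); the message and evaluation points are arbitrary illustrative choices:

```python
q = 7                     # prime field GF(7): ordinary integer arithmetic mod 7
n, k = q - 1, 3           # R-S parameters: n = q-1, dmin = n-k+1 = 4, t = 1

def rs_encode(z, points):
    """Evaluate p(a) = z1 + z2*a + ... + zk*a^(k-1) at the n given points."""
    return [sum(zi * pow(a, i, q) for i, zi in enumerate(z)) % q for a in points]

alpha = 3                                              # 3 is a primitive root mod 7
points = [pow(alpha, j, q) for j in range(1, n + 1)]   # alpha^1..alpha^n span GF(7)\{0}
z = [2, 5, 1]                                          # message symbols = coefficients
c = rs_encode(z, points)
print(points)                                          # [3, 2, 6, 4, 5, 1]
print(c)                                               # 6 codeword symbols in GF(7)
```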
Multiple-error-correction block codes

We can rewrite the encoding in a more familiar form
  c = z·A ,  z = (z1 ... zk)
where A is the transpose of a Vandermonde matrix with structure

      [ 1          1          ⋯   1         ]
  A = [ a1         a2         ⋯   an        ]
      [ a1^2       a2^2       ⋯   an^2      ]
      [ ⋮          ⋮          ⋱   ⋮         ]
      [ a1^(k−1)   a2^(k−1)   ⋯   an^(k−1)  ]
Multiple-error-correction block codes

The number of polynomials in GF(q) of degree less than k
is clearly qk, exactly the possible number of messages.
– This guarantees that each information word can be
mapped to a unique codeword by choosing a
convenient mapping s↔z.

● It is only required that a1,...,an are distinct points in GF(q) (these are the points where the polynomial p(a) is evaluated to build the codeword).
– The points can be chosen to meet certain properties,
– or the polynomial can be chosen in a given way.
35
Multiple-error-correction block codes

One way to build the encoding framework for R-S codes
consists in choosing a1,..,an as n distinct points in GF(q)
and build p(a) by forcing the condition
p ( ai )=s i ∀ i=1,. .. , k

The polynomial is characterized as the only polynomial of
degree less than k that meets the above mentioned
condition, and can be found by using known algebraic
methods (Lagrange interpolation).

The codeword is given as
c s=( s 1 ... s k p ( ak +1) ... p ( a n ) )
This is an instance of R-S systematic encoding.
36
Multiple-error-correction block codes

In another possible construction, the polynomial is given by the mapping z = s
  p(a) = Σ_{i=1}^{k} si·a^(i−1) ,  a ∈ GF(q)

And the points in GF(q) are chosen to meet certain
convenient properties.
– Let α be a primitive root of GF(q). This means
that, for any b ∊ GF(q), except 0, there exists an
integer u / αu=b (mod q).
– aj=αj, j=1,...,q-1 (this spans GF(q), except 0).
37
Multiple-error-correction block codes

Now we can rewrite
c=s⋅A , s=( s 1 ... s k )
where this time A is the transpose of a Vandermonde
matrix with structure

      [ 1            1             ⋯   1             ]
  A = [ α            α^2           ⋯   α^n           ]
      [ (α)^2        (α^2)^2       ⋯   (α^n)^2       ]
      [ ⋮            ⋮             ⋱   ⋮             ]
      [ (α)^(k−1)    (α^2)^(k−1)   ⋯   (α^n)^(k−1)   ]
Multiple-error-correction block codes

In this last case, it can be demonstrated that the parity-
check matrix of the resulting R-S code is
      [ 1   α        α^2            ⋯   α^(q−2)          ]
  H = [ 1   α^2      (α^2)^2        ⋯   (α^2)^(q−2)      ]
      [ 1   α^3      (α^3)^2        ⋯   (α^3)^(q−2)      ]
      [ ⋮   ⋮        ⋮              ⋱   ⋮                ]
      [ 1   α^(2t)   (α^(2t))^2     ⋯   (α^(2t))^(q−2)   ]
– It can correct t or fewer random symbol errors over a span
of n=q-1 symbols.
39
Multiple-error-correction block codes

The weight distribution (spectrum) of R-S codes has closed form (with n = q−1 and dmin = 2t+1):
  Ai = C(q−1, i) · Σ_{j=0}^{i−2t−1} (−1)^j · C(i, j) · ( q^(i−2t−j) − 1 ) ,  2t+1 ⩽ i ⩽ q−1

We can derive bounds for the probability of undetected errors for a symmetric DMC with q-ary input/output alphabets, and probability of correct reception 1−ε. For 0 ⩽ ε ⩽ (q−1)/q and 0 < λ < t:
  Pu(E) < q^(−2t)                                          (before any correction step)
  Pu(E,λ) < q^(−2t) · Σ_{h=0}^{λ} C(q−1, h)·(q−1)^h        (if used first to correct λ errors)
40
Multiple-error-correction block codes

Another kind of linear block code defined over higher-order Galois fields is the class of BCH codes.
– Named after Raj Bose, D. K. Ray-Chaudhuri and Alexis Hocquenghem.
● An (n,k,t)q BCH code is again defined over GF(q), where
q=pr, and p is prime.

We have m, n, q, d=2t+1, l, so that
– 2≤d≤n. l will be considered later.
– gcd(n,q)=1 (“gcd” → greatest common divisor)
– m is the multiplicative order of q modulo n; m is thus
the smallest integer meeting qm=1 (mod n).
– t is the number of errors that may be corrected.
41
Multiple-error-correction block codes

Let α be a primitive n-th root of 1 in GF(q^m); this means α^n = 1, and n is the smallest positive integer with this property.
● Let mi(x) be the minimal polynomial over GF(q) of α^i, ∀ i.
– This is the monic polynomial of least degree having αi as a
root.
– Monic → the coefficient of the highest power of x is 1.

Then, a BCH code is defined by a so-called generator
polynomial
g ( x )=lcm ( m l ( x ) ... ml +d −2 ( x ) )
where “lcm” stands for least common multiple. Its degree is at
most (d-1)m=2mt.
42
Multiple-error-correction block codes

The encoding with a generator polynomial is done by building a polynomial containing the information symbols
  s = (s1 ... sk) ∈ GF(q)^k → s(x) = Σ_{i=1}^{k} si·x^(i−1)

Then
  c(x) = Σ_{i=1}^{n} ci·x^(i−1) = s(x)·g(x) ,  ci ∈ GF(q) → c = (c1 ... cn)

We may do systematic encoding as
  cs(x) = x^(n−k)·s(x) + [ x^(n−k)·s(x) mod g(x) ]
          (systematic symbols)   (redundancy symbols)
43
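A sketch of the systematic construction above in the binary case (GF(2)), using g(x) = 1 + x + x^3, which generates the cyclic (7,4) Hamming code; polynomials are coefficient lists, lowest degree first:

```python
# Systematic encoding c_s(x) = x^(n-k)*s(x) + [x^(n-k)*s(x) mod g(x)] over GF(2).
g = [1, 1, 0, 1]          # g(x) = 1 + x + x^3, degree n-k = 3
n, k = 7, 4

def poly_mod(a, g):
    """Remainder of a(x) divided by g(x) over GF(2)."""
    a = a[:]
    for i in range(len(a) - 1, len(g) - 2, -1):   # cancel terms from the top down
        if a[i]:
            for j, gj in enumerate(g):
                a[i - (len(g) - 1) + j] ^= gj
    return a[:len(g) - 1]

def encode_systematic(s):
    shifted = [0] * (n - k) + s               # x^(n-k) * s(x)
    parity = poly_mod(shifted, g)             # redundancy symbols
    return parity + s                         # low degrees: parity, high degrees: data

s = [1, 0, 1, 1]
c = encode_systematic(s)
print(c)                                      # 7-bit codeword, last 4 bits are the data
print(poly_mod(c, g))                         # [0, 0, 0]: a codeword is a multiple of g(x)
```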
Multiple-error-correction block codes

The case with l=1, and n=qm-1 is called a primitive BCH
code.
– The number of parity check symbols is n-k≤(d-1)m=2mt.
– The minimum distance is dmin≥d=2t+1.

If m=1, then we have a Reed-Solomon code (of the “primitive
root” kind)!
– R-S codes can be seen as a subclass of BCH codes.
– In this case, it can be verified that the R-S code may be defined by means of a generator polynomial, in the form
  g(x) = (x−α)(x−α^2)...(x−α^(2t)) = g1 + g2·x + g3·x^2 + ... + g_2t·x^(2t−1) + x^(2t)
44
Multiple-error-correction block codes

The parity check matrix for a primitive BCH code over GF(q^m) is

      [ 1   α         α^2             ⋯   α^(n−1)           ]
  H = [ 1   α^2       (α^2)^2         ⋯   (α^2)^(n−1)       ]
      [ 1   α^3       (α^3)^2         ⋯   (α^3)^(n−1)       ]
      [ ⋮   ⋮         ⋮               ⋱   ⋮                 ]
      [ 1   α^(d−1)   (α^(d−1))^2     ⋯   (α^(d−1))^(n−1)   ]
– This code can correct t or fewer random symbol errors, d=2t+1,
over a span of n=qm-1 symbol positions.
45
Multiple-error-correction block codes

Why can these codes correct error bursts in the channel?

  b=(b1...b_{k·log2(q)}) → [ map bits to q-ary symbols ] → s=(s1...sk) → [ BCH/R-S encoder ] → c=(c1...cn) → [ map q-ary symbols to bits ] → cb
  cb → [ other channel encoder/decoder, modulator/demodulator, medium access technique ≈ BSC(p), p=Pb ] → rb=cb+eb , e.g. a burst eb=(...1111...)
  rb → [ map bits to q-ary symbols ] → r=(r1...rn) → [ BCH/R-S decoder ] → s'=(s'1...s'k) → [ map q-ary symbols to bits ] → b'=(b'1...b'_{k·log2(q)})

If the bit error burst falls within a single symbol ri in GF(q), or spans at most t symbols in GF(q) within word r, it can be corrected!
Multiple-error-correction block codes

Note that algebra is now far more involved and more
complex than with binary LBC.
– This is part of the price to pay to get better data integrity
protection.
– Another logical price to pay is the reduction in data rate by R: k and n are measured in symbols but, as the mapping and demapping are performed from GF(2) to GF(q) and vice versa, the end-to-end effective data rate is again
  R'b = Rb·(k/n)

But there is still something to do to get the best from R-S
and BCH codes: decoding is substantially more complex!
50
Multiple-error-correction block codes

We are going to see a simple instance of decoding for
these families of nonbinary LBC.

We address the general BCH case, as R-S can be seen as
an instance of the former.
  r(x) = c(x) + e(x) is the received word
  e(x) = e_{j1}·x^(j1) + ... + e_{jν}·x^(jν) ,  0 ⩽ j1 < ... < jν ⩽ n−1

Correction can be performed by identifying the pairs
  ( x^(ji) , e_{ji} )  ∀ i=1,...,ν

For this, we can resort to the syndrome
  (S1 ... S2t) , where Si = r(α^i) = e(α^i) ∈ GF(q^m)
51
Multiple-error-correction block codes

We can build a set of equations
  Sl = δ1·β1^l + ... + δν·βν^l ,  l = 1,...,2t ,  with δi = e_{ji} and βi = α^(ji)

Based on this, a BCH or an R-S code may be decoded in 4 steps:
1. Compute the syndrome vector.
2. Determine the so-called error-location polynomial.
3. Determine the so-called error-value evaluator.
4. Evaluate the error-location numbers (ji) and error values (e_{ji}), and perform correction.

The error-location polynomial is defined as
  σ(x) = (1−β1·x)...(1−βν·x) = Σ_{l=0}^{ν} σl·x^l ,  where σ0 = 1
52
Multiple-error-correction block codes

We can find the error-location polynomial with the help of the
Berlekamp's algorithm, using the syndrome vector.
– It works iteratively, in 2t steps.
– Details can be found in the references.

Once determined, its roots can be found by substituting the elements of GF(q^m) cyclically in σ(x).
– If σ(α^i) = 0, then α^(−i) = α^(q^m−i−1) is an error-location number.
– The errors are thus located at such positions q^m−i−1.

On the other hand, the error-value evaluator is defined as
  Z0(x) = Σ_{l=1}^{ν} δl·βl·∏_{i=1, i≠l}^{ν} (1−βi·x)

53
Multiple-error-correction block codes

It can be shown that the error-value evaluator can be calculated as a function of known quantities
  Z0(x) = S1 + (S2+σ1·S1)·x + (S3+σ1·S2+σ2·S1)·x^2 + ... + (Sν+σ1·Sν−1+...+σν−1·S1)·x^(ν−1)

After some algebra, the error values are determined as
  δk = −Z0(βk^(−1)) / σ'(βk^(−1))
– Where the denominator is the derivative of σ(x).

With the error values and the error locations, the error vector is estimated and correction may be performed
  ê(x) = Σ_{i=1}^{ν} δi·x^(ji)  →  ĉ(x) = r(x) − ê(x)
54
Multiple-error-correction block codes

BCH and R-S codes may be very powerful, but the amount
of algebra required is very high.
– The supporting theory is very complex, and designing
and analyzing these codes require mastering algebra
and geometry over finite-size fields.
– All the operations are to be understood in GF(q) or
GF(qm), when corresponding.


Nonbinary LBC of the kind described are usually employed
in sophisticated FEC strategies for specific channels.
– Binary LBC are more usual in ARQ strategies.

55
Multiple-error-correction block codes
● Example BPSK+AWGN: RHam=246/255, RGol=1/2, RR-S=155/255.

56
Convolutional codes

57
Convolutional codes

A binary convolutional code (CC) is another kind of linear
channel code class.

The encoding can be described in terms of a finite state
machine (FSM).
– A CC can eventually produce sequences of infinite length.
– A CC encoder has memory. General structure:
  The k input streams feed a MEMORY block (ml bits for the l-th input); a backward logic (feedback, not mandatory) and a forward logic produce the n output streams (coded bits); a systematic output is also possible (not mandatory).
Convolutional codes

The memory is organized as a shift register.
– Number of positions for input l: memory ml.

– ml=νl is the constraint length of the l-th input/register.


– The register effects step by step delays / shifts on the input: recall
discrete LTI systems theory.


A CC encoder produces sequences, not just blocks of data.
– Sequence-based properties vs. block-based properties.

  Shift register for the l-th input: at instant i it holds d_i^(l), d_{i−1}^(l), d_{i−2}^(l), ..., d_{i−ml}^(l), and these values feed both the forward and the backward logic.
Convolutional codes

Both forward and backward logic is boolean logic.
– Very easy: each operation adds up (XOR) a number of memory
positions, from each of the k inputs.
  The j-th output at instant i takes inputs from all the k registers (the backward logic has the same structure):
  ci^(j) = Σ_{l=1}^{k} Σ_{p=0}^{ml} g^(j)_{l,p} · d^(l)_{i−p}

● g^(j)_{l,p}, p=0,...,ml, is 1 when the p-th register position for the l-th input is added to get the j-th output.
60
Convolutional codes

Parameters of a CC so far:
– k input streams
– n output streams
– k shift registers with length ml each, l=1,...,k

– νl=ml is the constraint length of the l-th register

– m=maxl{νl} is the memory order of the code

– ν=ν1+...+νk is the overall constraint length of the code


A CC is denoted as (n,k,ν).
– As usual, its rate is R=k/n, where k and n take normally
small values for a convolutional code.
61
Convolutional codes

The backward / forward logic may be specified in the form
of generator sequences.
– These sequences are the impulse responses of each output j wrt each input l.
  gl^(j) = ( g^(j)_{l,0} , ... , g^(j)_{l,ml} )

Observe that:
– gl^(j) = (1,0,...,0) connects the l-th input directly to the j-th output.
– gl^(j) = (0,...,1(q-th),...,0) just delays the l-th input to the j-th output by q time steps.
62
Convolutional codes

Given the presence of the shift register, the generator
sequences are better denoted as generator polynomials
  gl^(j) = ( g^(j)_{l,0} , ... , g^(j)_{l,ml} ) ≡ gl^(j)(D) = Σ_{q=0}^{ml} g^(j)_{l,q}·D^q

We can thus write, for example
  gl^(j) = (1,0,...,0)            ≡ gl^(j)(D) = 1
  gl^(j) = (0,...,1(q-th),...,0)  ≡ gl^(j)(D) = D^q
  gl^(j) = (1,1,0,...,0)          ≡ gl^(j)(D) = 1 + D
63
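The D-transform notation can be checked directly: each output stream is the GF(2) polynomial product b(D)·g^(j)(D). A Python sketch for a single-input CC with the illustrative generators g^(1)(D) = 1 + D + D^2 and g^(2)(D) = 1 + D^2:

```python
def gf2_poly_mul(a, b):
    """Product of two GF(2) polynomials given as coefficient lists (lowest power first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[i + j] ^= bj
    return out

b_D  = [1, 0, 1, 1]              # b(D) = 1 + D^2 + D^3 (input stream)
g1_D = [1, 1, 1]                 # g^(1)(D) = 1 + D + D^2
g2_D = [1, 0, 1]                 # g^(2)(D) = 1 + D^2

c1_D = gf2_poly_mul(b_D, g1_D)   # first output stream c^(1)(D) = b(D)·g^(1)(D)
c2_D = gf2_poly_mul(b_D, g2_D)   # second output stream c^(2)(D) = b(D)·g^(2)(D)
print(c1_D)                      # [1, 1, 0, 0, 0, 1] -> 1 + D + D^5
print(c2_D)                      # [1, 0, 0, 1, 1, 1] -> 1 + D^3 + D^4 + D^5
```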
Convolutional codes

As all operations involved are linear, a binary CC is linear
and the sequences produced constitute CC codewords.

A feedforward CC (without backward logic - feedback) can be denoted in matrix form as

         [ g1^(1)(D)   g1^(2)(D)   ⋯   g1^(n)(D) ]
  G(D) = [ g2^(1)(D)   g2^(2)(D)   ⋯   g2^(n)(D) ]
         [ ⋮           ⋮           ⋱   ⋮         ]
         [ gk^(1)(D)   gk^(2)(D)   ⋯   gk^(n)(D) ]
64
Convolutional codes

If each input has a feedback logic given as
  gl^(0)(D) = Σ_{q=0}^{ml} g^(0)_{l,q}·D^q
the code is denoted as

         [ g1^(1)(D)/g1^(0)(D)   g1^(2)(D)/g1^(0)(D)   ⋯   g1^(n)(D)/g1^(0)(D) ]
  G(D) = [ g2^(1)(D)/g2^(0)(D)   g2^(2)(D)/g2^(0)(D)   ⋯   g2^(n)(D)/g2^(0)(D) ]
         [ ⋮                     ⋮                     ⋱   ⋮                   ]
         [ gk^(1)(D)/gk^(0)(D)   gk^(2)(D)/gk^(0)(D)   ⋯   gk^(n)(D)/gk^(0)(D) ]
65
Convolutional codes

We can generalize the concept of parity-check matrix H(D).
– An (n,k,ν) CC is fully specified by G(D) or H(D).

Based on the matrix description, there is a good deal of linear tools for design, analysis and evaluation of a given CC.

A regular CC can be described as a (canonical) all-feedforward CC or through an equivalent feedback (recursive) CC.
– Note that a recursive CC can be seen as an IIR filter in GF(2).

Even though k and n could be very small, a CC has a very rich
algebraic structure.
– This has to do with the constraint length of the CC.
– Each output bit is related to the present and past inputs via
powerful algebraic methods.
66
Convolutional codes

Given G(D), a CC can be classified as:
– Systematic and feedforward.
– Systematic and recursive (RSC).
– Non-systematic and feedforward (NSC).
– Non-systematic and recursive.

RSC is a popular class of CC, because it may provide an
infinite output for a finite-weight input (IIR behavior).

Each NSC can be converted straightforwardly to a RSC
with similar error correcting properties.

CC encoders are easy to implement with standard
hardware: shift registers + combinational logic.
67
Convolutional codes

As with the case of nonbinary LBC, we may use the
polynomial representation to perform coding & decoding.
– But now we have encoding with memory, spanning over
theoretically infinite length sequences → not practical.

  c(D) = b(D)·G(D)
  b(D) = ( b^(1)(D) ... b^(k)(D) ) ;  b^(i)(D) = Σ_j b_j^(i)·D^j , where b_j^(i) is the i-th input bit stream
  c(D) = ( c^(1)(D) ... c^(n)(D) ) ;  c^(l)(D) = Σ_h c_h^(l)·D^h , where c_h^(l) is the l-th output bit stream
68
Convolutional codes

We do not need to look very deep into the algebraic details
of G(D) and H(D) to study:
– Coding
– Decoding
– Error correcting capabilities

A CC encoder is a FSM!

  The ν memory positions store a content (among 2^ν possible ones) at instant i−1: the coder is said to be at state s(i−1). The k input bits determine the shifting of the registers, so the ν memory positions store a new content at instant i: the coder is said to be at state s(i). And we get n related output bits.
Convolutional codes

The finite-state behavior of the CC can be captured by the
concept of trellis.
– For any starting state, we have 2k possible edges
leading to a corresponding set of ending states.

  Each branch goes from a starting state ss=s(i−1), s=1,...,2^ν, to an ending state se=s(i), e=1,...,2^ν, labelled with input bi=(bi,1...bi,k) and output ci=(ci,1...ci,n).
70
Convolutional codes

The trellis illustrates the encoding process in 2 axes:
– X-axis: time / Y-axis: states

Example for a (2,1,3) CC (figure): states s1,...,s8 drawn at instants i−1, i, i+1; from each state there is one branch per input bit (0 or 1), labelled with its 2 output bits (e.g. output 00, output 01). Memory: the same input gives different outputs depending on the state.
– For a finite-size input data sequence, a CC can be forced to finish at a known state (often 0) by adding terminating (dummy) bits.
– Note that one section (e.g. i−1 → i) fully specifies the CC.
Convolutional codes

The trellis description allows us
– To build the encoder
– To build the decoder
– To get the properties of the code


The encoder: the k input bits enter the shift registers (clocked by CLK), and a combinational logic derived from G(D) ↔ H(D) produces the n output bits; the register contents move from state ss=s(i−1) to state se=s(i), with input bi=(bi,1...bi,k) and output ci=(ci,1...ci,n).
73
Convolutional codes

As usual, decoding is far more complicated than encoding
– Long sequences
– Memory: dependence with past states

In fact, CC were already well known before there existed a
practical good method to decode them: the Viterbi algorithm.
– It is a Maximum Likelihood Sequence Estimation (MLSE)
algorithm with many applications.

Issue: for a length N >> n sequence at the receiver side
– There are 2^ν·2^(N·k/n) paths through the trellis to match with the received data.
– Even if the coder starting state is known (often 0), there are still 2^(N·k/n) paths to walk through in a brute force approach.
74
Convolutional codes

Viterbi algorithm setup: input bi → output ci(s(i−1),bi); start s(i−1) → end s(i)(s(i−1),bi).
Key facts:
● The encoding corresponds to a Markov chain model: P(s(i))=P(s(i)|s(i−1))·P(s(i−1)).
● The total likelihood P(r|b) can be factorized as a product of probabilities.
● Given s(i−1) → s(i), P(ri|s(i),s(i−1)) depends only on the channel kind (AWGN, BSC...).
● A transition from s(i−1) to s(i) (linked in the trellis) depends on the probability of bi: P(s(i)|s(i−1))=2^−k if the source is iid.
● P(s(i)|s(i−1))=0 if the states are not linked in the
75
Convolutional codes

The total likelihood can be recursively calculated as:
  P(r|b) = P(s(0)) · ∏_{i=1}^{N/n} P(ri | s(i), s(i−1)) · P(s(i) | s(i−1))

In the BSC(p), the observation (branch) metric would be related to:
  P(ri | s(i), s(i−1)) = P(ri | ci) → w(ri + ci) = dH(ri, ci)

Maximum likelihood (ML) criterion:
  b̂ = arg max_{b} [ P(r|b) ]
Convolutional codes

We know that the brute force approach to the ML criterion is at least O(2^(N·k/n)).


The Viterbi algorithm works recursively from 1 to N/n on the
basis that
– Many paths can be pruned out (transition probability=0).
– During forward recursion, we only keep the paths with highest
probability: the path probability goes easily to 0 from the moment
a term metric ⨯ transition probability is very small.
– When recursion reaches i=N/n, the surviving path guarantees the
ML criterion (optimal for ML sequence estimation!).


The Viterbi algorithm complexity goes down to O(N·2^(2ν)).
77
Convolutional codes

The algorithm recursive rule is
  Vj(0) = P(s(0)=sj) ;
  Vj(i) = P(ri | s(i)=sj, s(i−1)=smax) · max_{s(i−1)=sl} { P(s(i)=sj | s(i−1)=sl) · Vl(i−1) }
  – Vj(i−1) is the probability of the most probable state sequence corresponding to the i−1 previous observations; smax is the maximizing previous state.

● {Vj(i)} stores the most probable state sequence wrt observation r (figure: at each trellis step only the MAX-metric branch entering each state survives).
● Note that we may better work with logs: products ↔ additions. The criterion remains the same.
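To make the recursion concrete, here is a hard-decision Viterbi sketch in Python for a small (2,1,2) CC with generators 1+D+D^2 and 1+D^2 (an illustrative code, not the (2,1,3) one in the figures); the branch metric is the Hamming distance, i.e. the log-domain rule above for a BSC:

```python
# Hard-decision Viterbi sketch for the (2,1,2) feedforward CC with
# g^(1) = (1,1,1) and g^(2) = (1,0,1); terminated trellis (starts/ends at state 0).
G = [(1, 1, 1), (1, 0, 1)]
NU = 2                                    # overall constraint length -> 2**NU states
STATES = [(a, b) for a in (0, 1) for b in (0, 1)]

def branch(state, b):
    """Output bits and next state for input bit b from state (d_{i-1}, d_{i-2})."""
    window = (b,) + state
    out = [sum(g * w for g, w in zip(gj, window)) % 2 for gj in G]
    return out, (b, state[0])

def cc_encode(bits):
    state, out = (0, 0), []
    for b in list(bits) + [0] * NU:       # NU tail bits force the zero end state
        o, state = branch(state, b)
        out += o
    return out

def viterbi_decode(r):
    INF = float('inf')
    metric = {s: (0 if s == (0, 0) else INF) for s in STATES}
    paths = {s: [] for s in STATES}
    for i in range(len(r) // 2):
        ri = r[2 * i:2 * i + 2]
        new_metric = {s: INF for s in STATES}
        new_paths = {s: [] for s in STATES}
        for s in STATES:
            if metric[s] == INF:
                continue
            for b in (0, 1):              # one branch per possible input bit
                ci, nxt = branch(s, b)
                m = metric[s] + sum(x != y for x, y in zip(ci, ri))  # Hamming metric
                if m < new_metric[nxt]:   # keep only the survivor into each state
                    new_metric[nxt], new_paths[nxt] = m, paths[s] + [b]
        metric, paths = new_metric, new_paths
    return paths[(0, 0)][:-NU]            # survivor ending at state 0, tail removed

msg = [1, 0, 1, 1, 0, 0, 1]
r = cc_encode(msg)
r[3] ^= 1                                 # one channel error
print(viterbi_decode(r) == msg)           # True: the error is corrected
```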
Convolutional codes

Note that we have considered the algorithm when the
demodulator yields hard outputs
– ri is a vector of n estimated bits (BSC(p) equivalent channel).

In AWGN, we can do better to decode a CC
– We can provide soft (probabilistic) estimations for the
observation metric.
– For an iid source, we can easily get an observation transition
metric based on the probability of each bi,l=0,1, l=1,...,k,
associated to a possible transition.
– There is a gain of around 2 dB in Eb/N0.
– LBC decoders can also accept soft inputs (non syndrome-based
decoders).
– We will examine an example of soft decoding of CC in the lab.
81
Convolutional codes

We are now familiar with the encoder and the decoder
– Encoder: FSM (registers, combinational logic).
– Decoder: Viterbi algorithm (for practical reasons,
suboptimal adaptations are usually employed).


But what about performance?


First...
– CC are mainly intended for FEC, not for ARQ schemes.
– In a long sequence (=CC codeword), the probability of
having at least one error is very high...
– And... are we going to retransmit the whole sequence?
82
Convolutional codes

Given that we truncate the sequence to N bits and CC is linear
– We may analyze the system as an equivalent (N,N·k/n)
LBC.
– But... equivalent matrices G and H would not be practical.


Remember the FSM: we can locate error loops in the trellis (figure: a correct path b and an erroneous path b+e diverging at instant i and remerging at instant i+3).
Convolutional codes

The same error loop may occur irrespective of s(i−1) and b (figure: the same diverging/remerging path b+e drawn from a different starting state between instants i and i+3).
Convolutional codes

Examining the minimal length loops and taking into account this uniform error property we can get dmin of a CC.
– For a CC emitting finite-duration coded blocks forced to end at the 0 state, dmin is called dfree.
– dfree is also dmin for convolutionally coded sequences of infinite duration.

We can draw a lot of information by building an encoder state
diagram: error loops, codeword weight spectrum...

Diagram of a (2,1,3) CC,


from Lin & Costello (2004).

85
Convolutional codes

With a fair amount of algebra, related to FSM, modified encoder state diagrams and so on, it is possible to get an upper bound for optimal MLSE decoding (BPSK in AWGN, soft demodulation):
  Pb ⩽ Σ_{d=dfree}^{∞} Bd·erfc( √( d·R·Eb/N0 ) )

Bd is the total number of nonzero information bits associated with CC codewords of weight d, divided by the number of information bits k per unit time... A lot of algebra behind...

For the BSC(p), the bound can be calculated as
  Pb ⩽ Σ_{d=dfree}^{∞} Bd·( 2·√(p·(1−p)) )^d
86
Convolutional codes

There are easier, suboptimal ways to decode a CC, and
performance will vary accordingly.

A CC may be punctured to match other rates higher than R=k/n,
but the resulting equivalent CC is clearly weaker.
– Performance-rate trade-off.
– Puncturing is a very usual tool that provides flexibility to the
usage of CC's in practice.

  k → [ native convolutional encoder ] → n → [ puncturing algorithm (prune bits) ] → n' < n
87
Convolutional codes

Examples with BPSK+AWGN using ML bounds.

88
Turbo codes

89
Turbo codes

Canonically, Turbo Codes (TC) are Parallel Concatenated Convolutional Codes (PCCC).
– b (k input streams) feeds CC1 directly and feeds CC2 through a block marked "?" (we will see this is a key element); the outputs are combined into c = c1 ∪ c2, n = n1+n2 output streams, rate R = k/(n1+n2).

Coding concatenation has been known and employed for decades, but TC added a joint efficient decoding algorithm.
– An example of concatenated coding with independent decoding is the use of ARQ + FEC hybrid strategies (CRC/R-S + CC).
Turbo codes

We have seen that standard CC decoding with Viterbi algorithm
relied on MLSE criterion.
– This is optimal when binary data at CC input is iid.


For CC, we also have decoders that provide probabilistic (soft)
outputs.
– They convert a priori soft values + channel output soft
estimations into updated a posteriori soft values.
– They are optimal from the Maximum A Posteriori (MAP)
criterion point of view.
– They are called Soft Input-Soft Output (SISO) decoders.

92
Turbo codes

What's in a SISO (for a CC)?
– Inputs: r, the soft demodulated values from the channel, and the a priori probabilities (APR), e.g. P(bi=b) = 1/2.
– Output: the a posteriori probabilities (APP) P(bi=b|r), updated with the channel information.

Note that the SISO works on a bit by bit basis, but produces a sequence of APP's.
Turbo codes

The algorithm inside the SISO is some suboptimal version of the MAP BCJR algorithm.
– BCJR computes the APP values through a forward-backward dynamics → it works over finite length data blocks, not over (potentially) infinite length sequences (like pure CCs).
– BCJR works on a trellis: recall transition metrics, transition probabilities and so on.
– Assume the block length is N: the trellis starts at s(0) and ends at s(N).

  αi(j) = P( s(i)=sj , r1,...,ri )           FORWARD term
  βi(j) = P( ri+1,...,rN | s(i)=sj )         BACKWARD term
  γi(j,k) = P( ri , s(i)=sj | s(i−1)=sk )    TRANSITION term
  (remember: ri has n components for an (n,k,ν) CC)
Turbo codes

BCJR algorithm in action:
– Forward step, i = 1,...,N:
  α0(j) = P(s(0)=sj) ;  αi(j) = Σ_{k=1}^{2^ν} αi−1(k)·γi(k,j)
– Backward step, i = N−1,...,0:
  βN(j) = P(s(N)=sj) ;  βi(j) = Σ_{k=1}^{2^ν} βi+1(k)·γi+1(j,k)
– Compute the joint probability sequence, i = 1,...,N:
  P( s(i−1)=sj , s(i)=sk , r ) = βi(k)·γi(j,k)·αi−1(j)
102
Turbo codes

Finally, the APP's can be calculated as:
  P(bi=b|r) = (1/p(r)) · Σ_{s(i−1)→s(i), bi=b} P( s(i−1)=sj , s(i)=sk , r )

Decision criterion based on these APP's (its modulus is the reliability of the decision):
  log( P(bi=1|r) / P(bi=0|r) ) = log[ Σ_{s(i−1)→s(i), bi=1} P(s(i−1)=sj, s(i)=sk, r) / Σ_{s(i−1)→s(i), bi=0} P(s(i−1)=sj, s(i)=sk, r) ]
  > 0 → b̂i = 1 ;  < 0 → b̂i = 0
Turbo codes

● How do we get γi(j,l)?

This probability takes into account
– The restrictions of the trellis (CC).
– The estimations from the channel.

  γi(j,l) = P( ri , s(i)=sj | s(i−1)=sl ) = p( ri | s(i)=sj , s(i−1)=sl ) · P( s(i)=sj | s(i−1)=sl )

– P( s(i)=sj | s(i−1)=sl ) = 0 if the transition is not possible, = 1/2^k if it is (binary trellis, k inputs).
– In AWGN, for unipolar ci,m:
  p( ri | s(i)=sj , s(i−1)=sl ) = 1/(2πσ²)^(n/2) · exp( −Σ_{m=1}^{n} (ri,m − ci,m)² / (2σ²) )
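A minimal numeric sketch of the γ computation above (unipolar 0/1 symbols in AWGN, k = 1 input bit; the received samples and the noise variance are arbitrary illustrative values):

```python
import math

def gamma(ri, ci, sigma2, transition_possible, k=1):
    """Branch metric gamma_i(j,l) = p(ri | transition) * P(transition)."""
    if not transition_possible:
        return 0.0
    n = len(ri)
    dist2 = sum((r - c) ** 2 for r, c in zip(ri, ci))
    p_channel = math.exp(-dist2 / (2 * sigma2)) / (2 * math.pi * sigma2) ** (n / 2)
    return p_channel * (1 / 2 ** k)      # binary trellis: P(transition) = 1/2^k

ri = [0.9, 0.2]                          # soft demodulated samples (unipolar signalling)
print(gamma(ri, [1, 0], sigma2=0.25, transition_possible=True))
print(gamma(ri, [0, 1], sigma2=0.25, transition_possible=True))
print(gamma(ri, [1, 1], sigma2=0.25, transition_possible=False))   # 0.0
```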
Turbo codes

Idea: what about feeding APP values as APR values for another decoder whose coder had the same inputs?
– The APP's P(bi=b|r1) from the SISO for CC1 are fed as APR's to the SISO for CC2, which combines them with its own channel observations r2 to produce P(bi=b|r2).
– This will only be helpful under some conditions.
Turbo codes

APP's from the first SISO used as APR's for the second SISO may increase the updated APP's reliability iff
– APR's are uncorrelated wrt the channel estimations for the second decoder.
– This is achieved by permuting the input data for each encoder.

  b (k input streams) → CC1 ;  b → Π (INTERLEAVER / permutor) → d → CC2 ;  c = c1 ∪ c2 , n = n1+n2 output streams, rate R = k/(n1+n2)
Turbo codes

The interleaver preserves the data (b), but changes its
position within the second stream (d).
– Note that this compels the TC to work with blocks of
N=size(Π) bits.
– The decoder has to know the specific interleaver used
at the encoder.
  b = (b1 b2 b3 b4 ... bN) → Π → d ,  with di = bπ(i) ,  i.e. dπ−1(i) = bi
112
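A short sketch of the permutation relations di = bπ(i) (the permutation here is randomly drawn, just for illustration):

```python
import random

N = 8
b = [random.randint(0, 1) for _ in range(N)]
pi = list(range(N))
random.shuffle(pi)                  # the interleaver permutation (known at the decoder)

d = [b[pi[i]] for i in range(N)]    # interleaving: d_i = b_pi(i)

b_rec = [0] * N                     # deinterleaving: b_pi(i) = d_i
for i in range(N):
    b_rec[pi[i]] = d[i]

print(b == b_rec)                   # True: the data is preserved, only reordered
```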
Turbo codes

The mentioned process is applied iteratively (l = 1,...).
– Iterative decoder → this may be a drawback, since it adds latency (delay).

  r1, r2 from channel → SISO 1 → APP1(l) → Π → APR2(l) → SISO 2 → APP2(l) → Π^−1 → APR1(l+1) → SISO 1 → ...

– The initial APR1(l=0) is taken with P(bi=b) = 1/2.
– Note the feedback connection: it is the same principle as in the turbo engines (that's why they are called "turbo"!).
Turbo codes

It is better to work with log-probability values, in general denominated LLR in the literature (though they are strictly not so).

The SISO 1 output LLR for the i-th bit and l-th iteration can be factorized as
  L1^i(l) = log( P(bi=1|r,l) / P(bi=0|r,l) ) = La1^i(l) + Le1^i(l)
– La1^i(l) is the input APR value from the previous SISO.
– Le1^i(l) is the so-called extrinsic output value (the only term interchanged).

Decoder structure: r1, r2 come from the channel; SISO 1 and SISO 2 exchange only the extrinsic values, Le1(l) → Π → La2(l) and Le2(l) → Π^−1 → La1(l+1); the final decision is taken on the APP's after Π^−1.
Turbo codes

The interleaver prevents the same error loop from happening at both decoder stages: they may cooperate successfully.
– Figure: an error path b+e diverges from b between instants i and i+3 in the 1st CC trellis; 10...01 is a typical error loop for an RSC CC.
– After interleaving, the involved positions are spread to π(i), π(i+1), π(i+2), π(i+3), so the error loop may be broken in the 2nd CC trellis.
Turbo codes

When the interleaver is adequately chosen and the CC's employed
are RSC, the typical BER behavior is

– Note the two distinct zones: waterfall region / error floor. 121
Turbo codes

Analysis of TC is a very complex task: interleaving!

The location of the waterfall region can be analyzed by the so-called density evolution method
– Based on the exchange of mutual information between SISO blocks.

The error floor can be lower bounded by the minimum Hamming distance of the TC
– Contrary to CC's, TC relies on reducing multiplicities rather than just trying to increase minimum distance.

  Pb|floor > (wmin·Mmin / N) · erfc( √( dmin·R·Eb/N0 ) )    (BPSK in AWGN, soft demodulation)

– wmin: Hamming weight of the error with minimum distance; Mmin: error multiplicity (low value!!); the division by N is the interleaver gain (only if recursive CC's!!).
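A quick numeric sketch of the floor bound above (BPSK in AWGN); the values of wmin, Mmin, dmin, N and R below are arbitrary placeholders, not those of any particular standardized TC:

```python
import math

def tc_error_floor(w_min, M_min, N, d_min, R, ebn0_db):
    """Lower bound on the TC error floor: (w_min*M_min/N) * erfc(sqrt(d_min*R*Eb/N0))."""
    ebn0 = 10 ** (ebn0_db / 10)
    return (w_min * M_min / N) * math.erfc(math.sqrt(d_min * R * ebn0))

# Illustrative placeholder values: interleaver size N, TC minimum distance d_min, etc.
for ebn0_db in (1.0, 2.0, 3.0):
    print(ebn0_db, tc_error_floor(w_min=2, M_min=1, N=1024, d_min=10, R=1/3, ebn0_db=ebn0_db))
```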
Turbo codes

Examples of 3G TC. Note that TC's are intended for FEC...

127
Low Density Parity Check codes

128
Low Density Parity Check codes

LDPC codes are just another kind of channel codes derived
from less complex ones.
– While TC's were initially an extension of CC systems,
LDPC codes are an extension of the concept of binary
LBC, but they are not exactly our known LBC.


Formally, an LDPC code is an LBC whose parity check matrix is
large and sparse.
– Almost all matrix elements are 0!!!!!!!!!!
– Very often, the LDPC parity check matrices are randomly
generated, subject to some constraints on sparsity...
– Recall that LBC relied on extreme powerful algebra related
to carefully and well chosen matrix structures.
129
Low Density Parity Check codes

Formally, a (ρ,γ)-regular LDPC code is defined as the null space of a J×n parity check matrix H that meets these constraints:
a) Each row contains ρ 1's.
b) Each column contains γ 1's.
c) λ, the number of 1's in common between any two
columns, is 0 or 1.
d) ρ and γ are small compared with n and J.


These properties give name to this class of codes: their
matrices have a low density of 1's.

The density r of H is defined as r=ρ/n=γ/J. 130
Low Density Parity Check codes

Example of a (4,3)-regular LDPC parity check matrix (15×20):

      1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
      0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0
      0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0
      0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0
      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1
      1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0
      0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 0
  H = 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0
      0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0
      0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1
      1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0
      0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0
      0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0
      0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0
      0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1

– This H defines a (20,7) LBC!!!
– Density r = 4/20 = 3/15 = 0.2 → sparse!
– λ = 0, 1
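The (ρ,γ)-regularity, the density and the actual code dimension of the example above can be checked programmatically. A Python/numpy sketch that rebuilds H from the positions of its 1's and computes its GF(2) row rank (expected: ρ=4, γ=3, r=0.2, rank 13 → the (20,7) LBC stated above):

```python
import numpy as np

# 1-positions (0-based columns) of each row of the 15x20 example matrix above.
rows = [
    [0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19],
    [0, 4, 8, 12], [1, 5, 9, 16], [2, 6, 13, 17], [3, 10, 14, 18], [7, 11, 15, 19],
    [0, 5, 11, 17], [1, 6, 10, 15], [2, 7, 12, 18], [3, 8, 13, 16], [4, 9, 14, 19],
]
J, n = len(rows), 20
H = np.zeros((J, n), dtype=int)
for j, cols in enumerate(rows):
    H[j, cols] = 1

def gf2_rank(M):
    """Row rank over GF(2) by Gaussian elimination."""
    M = M.copy() % 2
    rank = 0
    for col in range(M.shape[1]):
        pivot = next((r for r in range(rank, M.shape[0]) if M[r, col]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]
        for r in range(M.shape[0]):
            if r != rank and M[r, col]:
                M[r] ^= M[rank]
        rank += 1
    return rank

print(H.sum(axis=1))       # row weights: all equal to rho = 4
print(H.sum(axis=0))       # column weights: all equal to gamma = 3
print(H.sum() / H.size)    # density r = 0.2
print(gf2_rank(H))         # expected 13 -> k = 20 - 13 = 7, the (20,7) LBC of the slide
```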
Low Density Parity Check codes

Note that the J rows of H are not necessarily linearly
independent over GF(2).
– To determine the dimension k of the code, it is mandatory
to find the row rank of H = n-k < J.
– That's the reason why in the previous example H defined a
(20,7) LBC instead of a (20,5) LBC as could be expected!


The construction of large H for LDPC with high rates and good properties is a complex subject.
– Some methods rely on smaller Hi used as building blocks, plus random permutations or combinatorial manipulations; resulting matrices with bad properties are discarded.
– Other methods rely on finite geometries and a lot of algebra.
Low Density Parity Check codes

LDPC codes yield performance equal to or even better than
that of TCs, but without the problem of their relatively high error
floor.
– Both LDPC codes and TCs are capacity-approaching
codes.


As in the case of TCs, their interest is in part related to the
fact that
– The encoding may be easily done, under some constraints
(even if H is large, the low density of 1's may help reduce
the complexity of the encoder).
– At the decoder side, there are powerful algorithms that can
take full advantage of the properties of the LDPC code.
138
Low Density Parity Check codes

Encoding of LDPC is a bit tricky.

One may build the equivalent full-row-rank matrix H by Gaussian
elimination, and then Hs=[In-k | P] → Gs=[PT | Ik].
– Nevertheless, P is not usually sparse,
and length n is in practice too large to
make this framework practical.
– Encoding using the generator matrix Gs is
done with complexity O(n²) (a toy sketch follows below).
– Encoding can be performed with lower
complexity by using iterative algorithms
that take advantage of the parity-check
structure of H: e.g., info bits are distributed through a
structure with a lattice of simple encoders
reproducing “local” parity-check equations.
[Figure (source: Wikipedia): encoding with the generator matrix Gs]
140
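As a concrete illustration of the O(n²) generator-matrix route, here is a hedged toy sketch (Python/NumPy, assumed): given an H already reduced to systematic form Hs=[In-k | P] (in general this may require column swaps, omitted here), it builds Gs=[PT | Ik] and encodes c = u·Gs mod 2. The small (7,4) Hs below is only an illustrative assumption, not the LDPC matrix of the previous example:

import numpy as np

def encode_systematic(Hs, u):
    nk, n = Hs.shape                     # n-k parity checks, n code bits
    k = n - nk
    P = Hs[:, nk:]                       # Hs = [I_{n-k} | P]
    Gs = np.hstack([P.T, np.eye(k, dtype=np.uint8)])   # Gs = [P^T | I_k]
    assert not np.any((Gs @ Hs.T) % 2)   # sanity check: Gs . Hs^T = 0
    return (u @ Gs) % 2

# toy usage with a (7,4) systematic parity-check matrix (illustrative only)
Hs = np.array([[1,0,0, 1,1,0,1],
               [0,1,0, 1,0,1,1],
               [0,0,1, 0,1,1,1]], dtype=np.uint8)
u  = np.array([1,0,1,1], dtype=np.uint8)
c  = encode_systematic(Hs, u)
print(c, (Hs @ c) % 2)                   # the syndrome should be all zeros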
Low Density Parity Check codes

There are several algorithms to decode LDPC codes.
– Hard decoding.
– Soft decoding.
– Mixed approaches.


We are going to examine two important instances thereof:
– Majority-logic (MLG) decoding; hard decoding, the simplest
one (lowest complexity).
– Sum-product algorithm (SPA); soft decoding, best error
performance (but high complexity!).


Key concepts: Tanner graphs & belief propagation.
141
Low Density Parity Check codes

MLG decoding: hard decoding; r=c+e → received word.
– The simplest instance of MLG decoding is the decoding of
a repetition code by the rule “choose 0 if 0's are dominant,
1 if otherwise”.


Given a (ρ,γ)-regular LDPC code, for every bit position
i=1,...,n, there is a set of γ rows
A_i = { h_1^(i), ⋯ , h_γ^(i) }

that have a 1 in position i, and do not have any other


common 1 position among them...
142
Low Density Parity Check codes

We can form the set of syndrome equations
S_i = { s_j^(i) = r·h_j^(i)T = e·h_j^(i)T , h_j^(i) ∈ A_i , j = 1,⋯,γ }

● Si gives a set of γ checksums orthogonal on ei.

● ei is decoded as 1 if the majority of the checksums give 1;


0 in the opposite case.


Repeating this for all i, we estimate ê, and ĉ=r+ê.
– Correct decoding of ei is guaranteed if there are less than
γ/2 errors in e.
143
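A minimal sketch (Python/NumPy, assumed; not code from the deck) of one-step MLG decoding as just described: for every bit position the γ checksums that involve it are computed from r, and the bit is flipped when a majority of them equal 1:

import numpy as np

def mlg_decode(H, r):
    H = np.asarray(H, dtype=np.uint8)
    r = np.asarray(r, dtype=np.uint8)
    s = (H @ r) % 2                        # all checksums of the received word
    e_hat = np.zeros_like(r)
    for i in range(H.shape[1]):
        checks = np.flatnonzero(H[:, i])   # the gamma rows that check bit i
        # majority vote on e_i: flip the bit if most of its checksums fail
        e_hat[i] = 1 if s[checks].sum() > len(checks) / 2 else 0
    return (r + e_hat) % 2                 # c_hat = r + e_hat

# usage: with the (4,3)-regular example matrix (gamma = 3), any single error in
# r = c + e should be corrected, since 1 < gamma/2 = 1.5.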
Low Density Parity Check codes

Tanner graphs. Example for a (7,3) LBC.

   c1   c2   c3   c4   c5   c6   c7      ← variable nodes or code-bit vertices

   +    +    +    +    +    +    +       ← check nodes or check-sum vertices
   s1   s2   s3   s4   s5   s6   s7

It is a bipartite graph with interesting properties for decoding.
– A variable node is connected to a check node iff the
corresponding code bit is checked by the corresponding
parity-sum equation.
– The absence of short loops is necessary for iterative decoding.
144
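The Tanner graph is fully determined by H. A small sketch (plain Python, an assumed choice) of how the neighborhoods N(ci) and N(sj) used by the decoders are read directly off the matrix:

def tanner_neighborhoods(H):
    # N_c[i]: check nodes connected to variable node c_i (rows with a 1 in column i)
    # N_s[j]: variable nodes connected to check node s_j (columns with a 1 in row j)
    J, n = len(H), len(H[0])
    N_c = [[j for j in range(J) if H[j][i]] for i in range(n)]
    N_s = [[i for i in range(n) if H[j][i]] for j in range(J)]
    return N_c, N_s

# usage: for a (rho,gamma)-regular code every N_c[i] has gamma entries and every
# N_s[j] has rho entries; e.g. with the 15x20 example, len(N_c[0]) == 3 and
# len(N_s[0]) == 4.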
Low Density Parity Check codes

Based on the Tanner graph of an LDPC code, it is possible
to make iterative soft decoding (SPA).

SPA is performed by belief propagation (which is an
instance of a message passing algorithm).

   c1   c2   c3   c4   c5   c6   c7

   +    +    +    +    +    +    +
   s1   s2   s3   s4   s5   s6   s7

– “Messages” (soft values) are passed to and from
related variable and check nodes.
– This process, applied iteratively and under some rules,
yields P(ci | λ), where λ are the soft values from the channel.
– N(c5) denotes the check-node neighbors of variable node c5;
similarly, N(s7) denotes the variable-node neighbors of check node s7.
148
Low Density Parity Check codes
● If we get P(ci | λ), we have an estimation of the codeword sent ĉ.

The decoding aims at calculating this through the marginalization

    P(ci | λ) = ∑_{c' : c'i = ci} P(c' | λ)

The brute-force approach is impractical for LDPC, hence the iterative solution
through SPA. Messages interchanged at step l:

From variable node to check node:

    μ^(l)_{ci→sj}(ci = c) = α^(l)_{i,j} · P(ci = c | λi) · ∏_{sk ∈ N(ci), sk ≠ sj} μ^(l-1)_{sk→ci}(ci = c)

From check node to variable node:

    μ^(l)_{sj→ci}(ci = c) = ∑_{c : ck ∈ N(sj), ck ≠ ci} P(sj = 0 | ci = c, c) · ∏_{ck ∈ N(sj), ck ≠ ci} μ^(l)_{ck→sj}(c'k = ck)

156
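These message updates are exactly what an SPA decoder iterates. Below is a compact sketch (Python/NumPy, assumed; not the deck's code) written in the log domain (LLRs and the tanh rule), which is the usual, numerically friendlier implementation of the same probability-domain messages; llr_ch is assumed to carry the channel values as LLRs, log P(ci=0|λi)/P(ci=1|λi):

import numpy as np

def spa_decode(H, llr_ch, max_iter=50):
    H = np.asarray(H, dtype=np.uint8)
    llr_ch = np.asarray(llr_ch, dtype=float)
    rows, cols = np.nonzero(H)               # edge list of the Tanner graph
    m_v2c = llr_ch[cols].copy()              # variable-to-check messages (init: channel LLRs)
    m_c2v = np.zeros_like(m_v2c)             # check-to-variable messages
    c_hat = (llr_ch < 0).astype(np.uint8)    # fallback hard decision

    for _ in range(max_iter):
        # check-node update (tanh rule), edge by edge for clarity
        t = np.tanh(m_v2c / 2.0)
        for e in range(len(rows)):
            others = (rows == rows[e]) & (np.arange(len(rows)) != e)
            prod = np.clip(np.prod(t[others]), -0.999999, 0.999999)
            m_c2v[e] = 2.0 * np.arctanh(prod)
        # a posteriori LLRs and tentative hard decision
        llr_post = llr_ch.copy()
        np.add.at(llr_post, cols, m_c2v)
        c_hat = (llr_post < 0).astype(np.uint8)
        if not np.any((H.astype(int) @ c_hat) % 2):   # stop when c_hat . H^T = 0
            break
        # variable-node update: extrinsic information on every edge
        m_v2c = llr_post[cols] - m_c2v
    return c_hat

# usage (assuming BPSK 0 -> +1, 1 -> -1 over AWGN): llr_ch = 2*r/sigma**2, with r
# the received samples; spa_decode(H, llr_ch) returns the tentative codeword.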
Low Density Parity Check codes

Note that:
– α^(l)_{i,j} is a normalization constant.
– P(ci = c | λi) plugs into the LDPC SPA the values from the
channel → it is the APR info.
– N(ci) and N(sj) are the neighborhoods of the variable nodes
and check nodes.
– The APP value at step l is obtained with another normalization
constant β^(l)_i:

    P^(l)(ci = c | λ) = β^(l)_i · P(ci = c | λi) · ∏_{sj ∈ N(ci)} μ^(l)_{sj→ci}(ci = c)

Based on the final probabilities P^(l)(ci = c | λ), a candidate ĉ is
chosen and ĉ·HT is tested. If it is 0, the information word is
decoded (a short sketch of this decision step follows below).
159
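The final decision step can be summarized in a few lines (Python/NumPy, assumed; a sketch, not the deck's code): take the bitwise hard decision from the APP values and test the syndrome ĉ·HT:

import numpy as np

def decide_and_check(H, p1_app):
    # p1_app[i] is the a posteriori probability P(ci = 1 | lambda) after l steps
    c_hat = (np.asarray(p1_app) > 0.5).astype(np.uint8)     # bitwise MAP decision
    syndrome = (np.asarray(H, dtype=int) @ c_hat) % 2        # c_hat . H^T
    return c_hat, not np.any(syndrome)                       # True -> valid codeword

# usage: if the returned flag is True the decoder stops and delivers the info bits;
# otherwise the message passing continues up to a maximum number of iterations.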
Low Density Parity Check codes

There is also the possibility to analyze their performance.
– They also have error floors that may be characterized on
the basis of their minimum distance.
– Good performance is related to sparsity (it breaks error
recurrences along the parity check equations).

Analysis techniques are complex.
– They require taking into account the Tanner graph
structure, nature of their loops, and so on.
– It is possible to draw bounds for the BER, and give design
and evaluation criteria thereon.
– Analysis depends heavily on the nature of the LDPC:
regular, irregular..., and constitutes a field of very active
research.
161
Low Density Parity Check codes

LDPC BER performance examples (DVB-S2 standard).
[Performance figures not reproduced: BER curves for short frames (n=16200)
and long frames (n=64800)]
162
Coded modulations

165
Coded modulations

We have considered up to this point channel coding and
decoding isolated from the modulation process.
– Codewords feed any kind of modulator.
– Symbols go through a channel (medium).
– The info recovered from received modulated symbols is fed
to the suitable channel decoder

As hard decisions.

As soft values (probabilistic estimations).
– The abstractions of BSC(p) (hard demodulation) or soft values
from AWGN ( ∝ exp[-|ri-sj|²/(2σ²)] ), and the like for
other cases, are enough for such an approach.

Note that there are other important channel kinds not
considered so far.
166
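As a small illustration of the "soft values from AWGN" abstraction, a hedged sketch (Python/NumPy, assumed) of BPSK soft demodulation with the mapping 0→+1, 1→-1 (an assumption of this example): the likelihoods ∝ exp[-|ri-sj|²/(2σ²)] collapse into the LLR 2r/σ², which is what a soft-input channel decoder consumes:

import numpy as np

def bpsk_llr(r, sigma2):
    # LLR(ci) = log p(r|ci=0)/p(r|ci=1), with p(r|s) proportional to exp(-|r-s|^2/(2*sigma2))
    return 2.0 * np.asarray(r, dtype=float) / sigma2

# hard demodulation would keep only sign(r) (the BSC(p) abstraction); the soft
# values above preserve the reliability information for the decoder, e.g.
# bpsk_llr(np.array([0.9, -0.1, 1.4]), sigma2=0.5) -> array([ 3.6, -0.4,  5.6])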
Coded modulations

Coded modulations are systems where channel coding
and modulation are treated as a whole.
– Joint coding/modulation.
– Joint decoding/demodulation.


This offers potential advantages (recall the improvements
made when the demodulator outputs more elaborate
information: soft values vs. hard decisions).
– We combine gains in BER with spectral efficiency!


As a drawback, the systems become more complex.
– More difficult to design and analyze.
167
Coded modulations

TCM (trellis coded modulation).
– Normally, it combines a CC encoder and the modulation
symbol mapper.

[Trellis diagram between instants i-1, i and i+1, with states s1…s8;
the branches are labeled with the mapped output symbols (e.g. mj, mk)]
168
Coded modulations

If the modulation symbol mapper is well matched to the CC
trellis, and the decoder is accordingly designed to take
advantage of it,
– TCM provides high spectral efficiency.
– TCM can be robust in AWGN channels, and against fading and
multipath effects.

In the 80's, TCM became the standard for telephone line data
modems.
– No other system could provide better performance over the
twisted pair cable before the introduction of DMT and ADSL.

However, the flexibility of providing separate channel coding
and modulation subsystems is still preferred nowadays.
– Under the concept of Adaptive Coding & Modulation (ACM).
169
Coded modulations

Another possibility for coded modulation, evolved from TCM and
from the concatenated coding & iterative decoding framework, is
Bit-Interleaved Coded Modulation (BICM).
– What if we use an interleaver between the channel coder
(normally a CC) and the modulation symbol mapper?

[Block diagram: CC encoder → interleaver Π → symbol mapper]

– A soft demodulator can also accept APR values and update its
soft outputs as APP's in an iterative process!

[Block diagram: the soft demapper takes the channel-corrupted
outputs plus APR values (interleaved, from the CC SISO) and
produces APP values (to the interleaver and CC SISO)]
170
Coded modulations

Like TCM, BICM shows especially good behavior (even better!)
– In channels where spectral efficiency is required.
– In dispersive channels (multipath, fading).
– Iterative decoding yields a steep waterfall region.
– Being a serially concatenated system, the error floor is very
low (in contrast to parallel concatenated systems).


BICM has already found applications in standards such as
DVB-T2.


The drawback is the higher latency and complexity of the
decoding.
171
Coded modulations

Examples of BICM.

172
Conclusions

173
Conclusions

Channel coding is a key enabling factor for modern digital
communications.
– All standards at the PHY level include one or more channel
coding methods.
– Modulation & coding are usually considered together to
guarantee a final performance level.

Error control comes out in two main flavors: ARQ and FEC.
– Nevertheless, hybrid strategies are becoming more and more
popular (HARQ).

A lot of research has been done, with successful results, to
approach the promises of Shannon's noisy-channel coding
theorem.

Concatenation and iterative (soft) decoding are pushing the
results towards channel capacity.
174
Conclusions

Long-standing trends point towards the development of codes
endowed with less rich algebraic structure, relying more on
statistical / probabilistic grounds.

New outstanding proposals are being made, relying more intensively
on randomness, as hinted at by Shannon's proofs.
– New capacity-achieving alternatives, like polar codes and
fountain codes, are steadily reaching the market, with
promising prospects.
– Nonetheless, their processing needs make them more suitable
for higher layers than the PHY.

Though we are already approaching the limits of the Gaussian
channel, there are still many challenges.
– In general, the wireless channel poses problems unresolved
from the point of view of capacity calculation & exploitation
through channel coding.
175
References

S. Lin, D. Costello, ERROR CONTROL CODING, Prentice
Hall, 2004.

S. B. Wicker, ERROR CONTROL SYSTEMS FOR
DIGITAL COMMUNICATION AND STORAGE, Prentice
Hall, 1995.

J. M. Cioffi, DIGITAL COMMUNICATIONS - CODING
(course), Stanford University, 2010. [Online] Available:
http://web.stanford.edu/group/cioffi/book

176
