Block 3
Channel Coding
[Figure: channel coding chain. The information bits bn enter the channel encoder, the coded bits cn go through the channel, and the received values rn (hard) or λn (soft) feed the channel error detector / corrector, which outputs the estimated bits.]
● Channel model:
– discrete inputs,
– discrete (hard, rn) or continuous (soft, λn) outputs,
– memoryless.
Fundamentals of error control
●
Enabling detection/correction:
– Adding redundancy to the information: for every k bits,
transmit n, n>k.
● Shannon's theorem (1948):
1) If R < C = max_{pX} [ I(X;Y) ], then for any ε>0 there is an n (with R=k/n kept constant) such that Pb < ε.
2) If a bit error probability Pb is tolerated, rates up to R(Pb) = C / (1 − Hb(Pb)) are achievable (Hb being the binary entropy function).
3) For any Pb, rates greater than R(Pb) are not achievable.
● Problem: Shannon's theorem is not constructive.
Fundamentals of error control
●
Added redundancy should be structured redundancy.
● This relies on a sound algebraic & geometrical basis.
●
Our initial approach:
– Algebra over the Galois Field of order 2, GF(2)={0,1}.
– GF(2) is a proper field, and GF(2)m is a vector space of dimension m.
– Product (·): logical AND. Sum (+, same as −): logical XOR.
– Scalar product: b, d ∈ GF(2)m
b·dT = b1·d1 + ... + bm·dm
– Product by scalars: a ∈ GF(2), b ∈ GF(2)m
a·b = (a·b1 ... a·bm)
– It is also possible to define a matrix algebra over GF(2).
Fundamentals of error control
●
Given a vector b ∈ GF(2)m, its binary weight is w(b)=number of
1's in b.
●
It is possible to define a distance over vector field GF(2)m,
called Hamming distance:
dH(b,d)=w(b+d); b, d ∈ GF(2)m
●
Hamming distance is a proper distance and accounts for the
number of differing positions between vectors.
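As a quick illustration, these GF(2) operations map directly onto bitwise XOR and AND; a minimal Python sketch (the example vectors are arbitrary) computing binary weight and Hamming distance:

```python
import numpy as np

def weight(b):
    """Binary weight w(b): number of 1's in the GF(2) vector b."""
    return int(np.sum(b))

def hamming_distance(b, d):
    """dH(b, d) = w(b + d), where + is XOR over GF(2)."""
    return weight(np.bitwise_xor(b, d))

b = np.array([1, 0, 1, 1])
d = np.array([1, 1, 1, 0])
print(weight(b))               # 3
print(hamming_distance(b, d))  # 2 (they differ in two positions)
```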
● Geometrical view: [figure: codewords drawn as vertices of the binary hypercube, e.g. (1011), (1010), (1110), (0110)].
Fundamentals of error control
●
A given encoder produces n output bits for each k input bits:
– R=k/n<1 is the rate of the code.
● The information rate decreases by a factor R when using a code:
R'b = R·Rb < Rb (bit/s)
● If used jointly with a modulation of spectral efficiency η=Rb/B (bit/s/Hz), the efficiency also decreases by the factor R:
η' = R·η < η (bit/s/Hz)
● In terms of limited Pb, the achievable Eb/N0 region under AWGN is lower bounded by (the so-called Shannon limit):
$$\frac{E_b}{N_0}\,(\mathrm{dB})\;\geq\;10\log_{10}\!\left(\frac{1}{\eta'}\left(2^{\eta'}-1\right)\right)$$
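A quick numeric check of this bound (a minimal sketch; the example η' values are arbitrary):

```python
import math

def shannon_limit_db(eta_prime):
    """Minimum Eb/N0 (dB) for reliable transmission at spectral efficiency eta' (bit/s/Hz) in AWGN."""
    return 10 * math.log10((2 ** eta_prime - 1) / eta_prime)

# Example: uncoded efficiency eta = 2 bit/s/Hz with a rate R = 1/2 code -> eta' = 1
for eta_p in (0.5, 1.0, 2.0):
    print(f"eta' = {eta_p}: Eb/N0 >= {shannon_limit_db(eta_p):.2f} dB")
# As eta' -> 0 the bound tends to 10*log10(ln 2) = -1.59 dB (the ultimate Shannon limit).
```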
Fundamentals of error control
●
Recall our playground:
Source: http://www.comtechefdata.com/technologies/fec/ldpc
Fundamentals of error control
● How a channel code can improve Pb (BER in statistical terms): at a fixed target BER, the coded system needs a lower Eb/N0; the horizontal shift Δ(Eb/N0) between the coded and uncoded curves is the coding gain.
● Cost: loss in resources (spectral efficiency, power, processing time).
Binary linear block codes
Binary linear block codes
●
An (n,k) linear block code (LBC) is a subspace C(n,k) < GF(2)n
with dim(C(n,k))=k.
●
R=k/n is the rate of the LBC.
●
n-k is the redundancy of the LBC
– we would only need vectors with k components to
specify the same amount of information.
Binary linear block codes
● Recall vector space theory:
– A basis for C(n,k) has k vectors over GF(2)n.
– C(n,k) is the kernel of a linear function LF: GF(2)n → GF(2)n-k.
● c ∈ C(n,k) can be specified in two equivalent ways:
– c = b1·g1 + ... + bk·gk, where {gj}j=1,...,k is the basis set and (b1...bk) are its coordinates over it;
– c·HT = 0, in terms of the parity-check matrix introduced below.
●
G is a k×n generator matrix of the LBC C(n,k).
●
H is a (n-k)×n parity-check matrix of the LBC C(n,k).
– From another viewpoint, it can be shown that the rows in H stand for linearly independent parity-check equations.
– The row rank of H for an LBC should be n-k.
● For any input information block of length k, the encoder yields a codeword of length n.
● An encoder is systematic if b is contained in c=(b1...bk | ck+1...cn),
so that ck+1...cn are the n-k parity bits.
– Systematicity is a property of the encoder, not of the LBC
C(n,k) itself.
– GS=[Ik | P] is a systematic generator matrix.
Binary linear block codes
●
How to obtain G from H or H from G.
– G rows are k vectors linearly independent over GF(2)n
– H rows are n-k vectors linearly independent over GF(2)n
– They are related through G·HT=0 (a)
●
(a) does not yield a sufficient set of equations, given H or G.
– A number of vector sets comply with it (basis sets are not unique).
●
Given G, put it in systematic form by combining rows (the code will be the
same, but the encoding does change).
– If GS=[Ik | P], then HS=[PT | In-k] complies with (a).
●
Conversely, given H, put it in systematic form by combining rows.
– If HS=[In-k | P], then GS=[PT | Ik] complies with (a).
●
Parity check submatrix P can be on the left or on the right side (but on
opposite sides of H and G simultaneously for a given LBC).
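As a sketch of these relations (the parity submatrix P below is that of a (7,4) Hamming code, chosen only as an example), one can build GS and HS and check property (a) numerically:

```python
import numpy as np

# Parity submatrix P (k x (n-k)) of the (7,4) Hamming code, used here as an example.
P = np.array([[1, 1, 0],
              [0, 1, 1],
              [1, 1, 1],
              [1, 0, 1]])
k, r = P.shape            # k = 4 information bits, r = n - k = 3 parity bits
n = k + r

G_s = np.hstack([np.eye(k, dtype=int), P])      # GS = [Ik | P]
H_s = np.hstack([P.T, np.eye(r, dtype=int)])    # HS = [PT | In-k]

# Property (a): G·HT = 0 over GF(2)
print((G_s @ H_s.T) % 2)   # all-zero k x (n-k) matrix
```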
Binary linear block codes
● Note that, by taking only 2^k vectors out of the 2^n available, we are spacing the binary words apart.
● The minimum Hamming distance between input words is
$$d_{min}\left(GF(2)^k\right)=\min_{\mathbf b_i\neq\mathbf b_j}\left\{d_H(\mathbf b_i,\mathbf b_j)\,\middle|\,\mathbf b_i,\mathbf b_j\in GF(2)^k\right\}=1$$
● Recall that we have added n−k redundancy bits, so that
$$d_{min}\left(C(n,k)\right)=\min_{\mathbf c_i\neq\mathbf c_j}\left\{d_H(\mathbf c_i,\mathbf c_j)\,\middle|\,\mathbf c_i,\mathbf c_j\in C(n,k)\right\}>1$$
[Figure: BSC(p). The codeword c=(c1...cn) enters the channel and r=(r1...rn) is received; 0→0 and 1→1 with probability 1−p, 0→1 and 1→0 with probability p.]
● p=P(ci≠ri) is the bit error probability of the modulation in AWGN.
Binary linear block codes
● The received word is r = c + e, where P(ei=1) = p.
– e is the error vector introduced by the noisy channel.
– w(e) is the number of errors in r wrt the original word c.
– The probability of a given error pattern e with w(e)=t is p^t·(1−p)^(n−t), because the channel is memoryless.
● At the receiver side, we can compute the so-called syndrome vector s=(s1...sn-k) as s = r·HT = (c+e)·HT = c·HT + e·HT = e·HT.
● r ∈ C(n,k) ⇔ s=0.
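Continuing the same illustrative (7,4) Hamming example, a short sketch of syndrome-based error detection:

```python
import numpy as np

P = np.array([[1, 1, 0], [0, 1, 1], [1, 1, 1], [1, 0, 1]])   # example parity submatrix
G = np.hstack([np.eye(4, dtype=int), P])                      # systematic G = [I4 | P]
H = np.hstack([P.T, np.eye(3, dtype=int)])                    # H = [PT | I3]

b = np.array([1, 0, 1, 1])            # information block (k = 4)
c = (b @ G) % 2                        # codeword (n = 7)

e = np.zeros(7, dtype=int); e[2] = 1   # single error in position 3
r = (c + e) % 2                        # received word

s = (r @ H.T) % 2                      # syndrome s = e·HT
print(s)                               # nonzero -> error detected (ARQ would ask for retransmission)
print((c @ H.T) % 2)                   # a valid codeword gives the all-zero syndrome
```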
Binary linear block codes
Two possibilities at the receiver side:
●
a) Error detection (ARQ schemes):
– If s≠0, there are errors, so ask for retransmission.
●
b) Error correction (FEC schemes):
– Decode an estimated ĉ ∈ C(n,k), so that dH(ĉ,r) is the minimum
over all codewords in C(n,k) (closest neighbor decoding).
– ĉ is the most probable word under the assumption that p is small
(otherwise, the decoding fails).
[Figure: geometrical view of closest-neighbor decoding. A received word r1 = c + e1 stays closest to the transmitted codeword c and is decoded correctly (ĉ1 = c, OK), while a heavier error e2 takes r2 closer to a different codeword ĉ2, so decoding fails.]
Binary linear block codes
● Detection and correction capabilities (worst case) of an LBC with dmin(C(n,k)):
– a) It can detect error events e with binary weight up to w(e)|max,det = dmin(C(n,k)) − 1.
– b) It can correct error events e with binary weight up to w(e)|max,corr = ⌊(dmin(C(n,k)) − 1)/2⌋.
● It is possible to implement a joint strategy:
– A dmin(C(n,k))=4 code can simultaneously correct all error patterns with w(e)=1 and detect all error patterns with w(e)=2.
Binary linear block codes
● The minimum distance dmin(C(n,k)) is a property of the set of
codewords in C(n,k), independent from the encoding (G).
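Since dmin is a property of the codeword set alone, for small codes it can be found by brute-force enumeration; a sketch with the same illustrative (7,4) code:

```python
import numpy as np
from itertools import product

P = np.array([[1, 1, 0], [0, 1, 1], [1, 1, 1], [1, 0, 1]])
G = np.hstack([np.eye(4, dtype=int), P])

# Enumerate all 2^k codewords; for a linear code dmin equals the minimum nonzero weight.
codewords = [(np.array(b) @ G) % 2 for b in product([0, 1], repeat=4)]
d_min = min(int(c.sum()) for c in codewords if c.any())
print(d_min)   # 3 for the (7,4) Hamming code -> corrects 1 error, detects 2
```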
Multiple-error-correction block codes
Multiple-error-correction block codes
●
There are a number of more powerful block codes, based on higher
order Galois fields.
– They use symbols over GF(s), with s>2.
– Now the operations are defined as mod s.
– An (n,k) linear block code with symbols from GF(s) is again a k-
dimensional subspace of the vector space GF(s)n.
●
They are used mainly for correction, and have applications in channels or systems where error bursts are frequent, e.g.
– Storage systems (erasure channel).
– Communication channels with deep fades.
● They are frequently used in concatenation with other codes whose correction failures produce short error bursts (e.g. convolutional codes).
●
We are going to introduce two important instances of such broad
class of codes: Reed-Solomon and BCH codes.
Multiple-error-correction block codes
● An (n,k,t)q Reed-Solomon (R-S) code is defined as a mapping
from GF(q)k↔GF(q)n, with the following parameters:
– Block length n=q-1 symbols, for input block length of k<n
symbols.
– Alphabet size q=pr, where p is a prime number.
● Minimum distance of the code dmin=n-k+1 (it achieves the
Singleton bound for linear block codes).
– It can correct up to t symbol errors, dmin=2t+1.
●
This is a whole class of codes.
– For any q=pr, n=q-1 and k=q-1-2t, there exists a Reed-
Solomon code meeting all these criteria.
Multiple-error-correction block codes
● The algebra in GF(q), q=p^r and p prime, is more involved:
– Elements are built as powers of an abstract entity called the primitive root of unity α.
– For any b ∈ GF(q), except 0, there exists an integer u such that α^u = b.
– α^i, i=1,...,q−1, spans GF(q), except 0.
– b ∈ GF(q) can also be written as a polynomial in α, using the field properties.
● The R-S encoding can be viewed as the evaluation of the message polynomial at n points a1,...,an of GF(q), i.e. through the k×n matrix
$$A=\begin{pmatrix}1 & 1 & \cdots & 1\\ a_1 & a_2 & \cdots & a_n\\ a_1^2 & a_2^2 & \cdots & a_n^2\\ \vdots & \vdots & \ddots & \vdots\\ a_1^{k-1} & a_2^{k-1} & \cdots & a_n^{k-1}\end{pmatrix}$$
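To make the "powers of α" construction concrete, here is a small sketch that builds GF(2^3)=GF(8) from the primitive polynomial x^3 + x + 1 (this particular polynomial is an assumption for the example, not taken from the slides); each nonzero element appears both as a power of α and as a 3-bit polynomial in α:

```python
# Build GF(8) as powers of alpha, with alpha^3 = alpha + 1 (primitive polynomial x^3 + x + 1).
# Elements are stored as 3-bit integers: bit i is the coefficient of alpha^i.
def gf8_powers():
    elems, x = [], 1              # alpha^0 = 1
    for _ in range(7):            # alpha^0 ... alpha^6 span all nonzero elements
        elems.append(x)
        x <<= 1                   # multiply by alpha
        if x & 0b1000:            # a degree-3 term appears: reduce using alpha^3 = alpha + 1
            x = (x ^ 0b1000) ^ 0b011
    return elems

for i, e in enumerate(gf8_powers()):
    print(f"alpha^{i} = {e:03b}")   # e.g. alpha^3 = 011 (= alpha + 1)
```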
Multiple-error-correction block codes
● The number of polynomials in GF(q) of degree less than k is clearly q^k, exactly the possible number of messages.
– This guarantees that each information word can be mapped to a unique codeword by choosing a convenient mapping s↔z.
● With the primitive root α, the evaluation points can be taken as a_j = α^j, so that
$$A=\begin{pmatrix}1 & 1 & \cdots & 1\\ \alpha & \alpha^2 & \cdots & \alpha^n\\ (\alpha)^2 & (\alpha^2)^2 & \cdots & (\alpha^n)^2\\ \vdots & \vdots & \ddots & \vdots\\ (\alpha)^{k-1} & (\alpha^2)^{k-1} & \cdots & (\alpha^n)^{k-1}\end{pmatrix}$$
Multiple-error-correction block codes
● In this last case, it can be demonstrated that the parity-check matrix of the resulting R-S code is
$$H=\begin{pmatrix}1 & \alpha & \alpha^2 & \cdots & \alpha^{(q-2)}\\ 1 & \alpha^2 & (\alpha^2)^2 & \cdots & (\alpha^2)^{(q-2)}\\ 1 & \alpha^3 & (\alpha^3)^2 & \cdots & (\alpha^3)^{(q-2)}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 1 & \alpha^{2t} & (\alpha^{2t})^2 & \cdots & (\alpha^{2t})^{(q-2)}\end{pmatrix}$$
– It can correct t or fewer random symbol errors over a span of n = q−1 symbols.
Multiple-error-correction block codes
● The weight distribution (spectrum) of R-S codes has a closed form; for an (n,k,t)_q R-S code (n = q−1, dmin = 2t+1):
$$A_i=\binom{q-1}{i}\sum_{j=0}^{i-2t-1}(-1)^j\binom{i}{j}\left(q^{\,i-2t-j}-1\right),\qquad 2t+1\leq i\leq q-1$$
● We can derive bounds for the probability of undetected errors for a symmetric DMC with q-ary input/output alphabets and probability of correct reception 1−ε. For 0 ≤ ε ≤ (q−1)/q and 0 < λ < t:
$$P_u(E)<q^{-2t}\qquad\text{(before any correction step)}$$
$$P_u(E,\lambda)<q^{-2t}\sum_{h=0}^{\lambda}\binom{q-1}{h}(q-1)^h\qquad\text{(if used first to correct }\lambda\text{ errors)}$$
Multiple-error-correction block codes
● Another kind of linear block code defined over higher-order Galois fields is the class of BCH codes.
– Named after Raj Bose, D. K. Ray-Chaudhuri and A. Hocquenghem.
● An (n,k,t)_q BCH code is again defined over GF(q), where q=p^r and p is prime.
● We have m, n, q, d=2t+1, l, so that
– 2 ≤ d ≤ n (l will be considered later).
– gcd(n,q)=1 (“gcd” → greatest common divisor).
– m is the multiplicative order of q modulo n; m is thus the smallest integer meeting q^m ≡ 1 (mod n).
– t is the number of errors that may be corrected.
Multiple-error-correction block codes
● Let α be a primitive n-th root of 1 in GF(q^m); this means α^n = 1 in GF(q^m), with n the smallest such exponent.
● Let m_i(x) be the minimal polynomial over GF(q) of α^i, ∀ i.
– This is the monic polynomial of least degree having α^i as a root.
– Monic → the coefficient of the highest power of x is 1.
● Then, a BCH code is defined by a so-called generator polynomial
$$g(x)=\mathrm{lcm}\left(m_l(x),\ldots,m_{l+d-2}(x)\right)$$
where “lcm” stands for least common multiple. Its degree is at most (d−1)·m = 2mt.
Multiple-error-correction block codes
● The encoding with a generator polynomial is done by building a polynomial containing the information symbols
$$\mathbf s=(s_1\ldots s_k)\in GF(q)^k\;\rightarrow\;s(x)=\sum_{i=1}^{k}s_i\,x^{i-1}$$
● Then
$$c(x)=\sum_{i=1}^{n}c_i\,x^{i-1}=s(x)\cdot g(x),\quad c_i\in GF(q)\;\rightarrow\;\mathbf c=(c_1\ldots c_n)$$
● We may do systematic encoding as
$$c_s(x)=\underbrace{x^{\,n-k}\cdot s(x)}_{\text{systematic symbols}}+\underbrace{\left[x^{\,n-k}\cdot s(x)\right]\bmod g(x)}_{\text{redundancy symbols}}$$
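For the binary special case (GF(2) coefficients), the systematic rule above is exactly a CRC-style division; a small sketch with an assumed generator polynomial (g(x) = x^3 + x + 1, chosen only for illustration):

```python
def gf2_poly_mod(dividend, divisor):
    """Remainder of GF(2) polynomial division; polynomials are ints (bit i = coeff of x^i)."""
    dlen = divisor.bit_length()
    while dividend.bit_length() >= dlen:
        dividend ^= divisor << (dividend.bit_length() - dlen)
    return dividend

def systematic_encode(s, g, n_minus_k):
    """c_s(x) = x^(n-k)·s(x) + [x^(n-k)·s(x) mod g(x)], over GF(2)."""
    shifted = s << n_minus_k
    return shifted | gf2_poly_mod(shifted, g)

g = 0b1011             # g(x) = x^3 + x + 1  (illustrative choice)
s = 0b1101             # s(x) = x^3 + x^2 + 1 (k = 4 information bits)
c = systematic_encode(s, g, 3)
print(f"{c:07b}")           # 7-bit codeword: 4 systematic bits followed by 3 parity bits
print(gf2_poly_mod(c, g))   # 0 -> c(x) is a multiple of g(x), as required
```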
Multiple-error-correction block codes
● The case with l=1 and n=q^m−1 is called a primitive BCH code.
– The number of parity check symbols is n−k ≤ (d−1)·m = 2mt.
– The minimum distance is dmin ≥ d = 2t+1.
● If m=1, then we have a Reed-Solomon code (of the “primitive root” kind)!
– R-S codes can be seen as a subclass of BCH codes.
– In this case, it can be verified that the R-S code may be defined by means of a generator polynomial, in the form
$$g(x)=(x-\alpha)(x-\alpha^2)\cdots(x-\alpha^{2t})=g_1+g_2x+g_3x^2+\ldots+g_{2t}x^{2t-1}+x^{2t}$$
Multiple-error-correction block codes
● The parity check matrix for a primitive BCH code over GF(q^m) is
$$H=\begin{pmatrix}1 & \alpha & \alpha^2 & \cdots & \alpha^{(n-1)}\\ 1 & \alpha^2 & (\alpha^2)^2 & \cdots & (\alpha^2)^{(n-1)}\\ 1 & \alpha^3 & (\alpha^3)^2 & \cdots & (\alpha^3)^{(n-1)}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 1 & \alpha^{(d-1)} & (\alpha^{(d-1)})^2 & \cdots & (\alpha^{(d-1)})^{(n-1)}\end{pmatrix}$$
– This code can correct t or fewer random symbol errors, d=2t+1, over a span of n = q^m−1 symbol positions.
Multiple-error-correction block codes
● Why can these codes correct error bursts in the channel?
[Figure: bits b=(b1...bk·log2(q)) are mapped to q-ary symbols s=(s1...sk), encoded by the BCH/R-S encoder into c=(c1...cn), mapped back to bits cb and sent over a BSC(p) with p=Pb, an abstraction of any other channel encoder/decoder, modulator/demodulator and medium access technique; the receiver gets rb=cb+eb, maps the bits to q-ary symbols r=(r1...rn), decodes them into s'=(s'1...s'k) and demaps to b'=(b'1...b'k·log2(q)).]
– A bit error burst eb=(...1111...) that falls within a single symbol ri in GF(q), or that spans at most t symbols in GF(q) within the word r, can be corrected!
Multiple-error-correction block codes
● Note that the algebra is now far more involved and more complex than with binary LBC.
– This is part of the price to pay to get better data integrity protection.
– The other logical price to pay is the reduction in data rate, R. Here k and n are measured in symbols but, as the mapping and demapping are performed from GF(2) to GF(q) and vice versa, the end-to-end effective data rate is again
$$R'_b=R_b\cdot\frac{k}{n}$$
● But there is still something to do to get the best from R-S and BCH codes: decoding is substantially more complex!
Multiple-error-correction block codes
● We are going to see a simple instance of decoding for these families of nonbinary LBC.
● We address the general BCH case, as R-S can be seen as an instance of the former.
$$r(x)=c(x)+e(x)\ \text{is the received codeword},\qquad e(x)=e_{j_1}x^{j_1}+\ldots+e_{j_\nu}x^{j_\nu},\quad 0\leq j_1\leq\ldots\leq j_\nu\leq n-1$$
● Correction can be performed by identifying the pairs
$$\left(x^{j_i},\,e_{j_i}\right)\quad\forall\,i=1,\ldots,\nu$$
● For this, we can resort to the syndrome
$$\left(S_1\ldots S_{2t}\right),\ \text{where}\ S_i=r(\alpha^i)=e(\alpha^i)\in GF(q^m)$$
Multiple-error-correction block codes
● We can build a set of equations
$$S_l=\delta_1\beta_1^{\,l}+\ldots+\delta_\nu\beta_\nu^{\,l},\quad l=1,\ldots,2t,\qquad\text{with }\delta_i=e_{j_i},\ \beta_i=\alpha^{j_i}$$
● Based on this, a BCH or an R-S code may be decoded in 4 steps:
1. Compute the syndrome vector.
2. Determine the so-called error-location polynomial.
3. Determine the so-called error-value evaluator.
4. Evaluate the error-location numbers (j_i) and error values (e_{j_i}), and perform the correction.
● The error-location polynomial is defined as
$$\sigma(x)=(1-\beta_1x)\cdots(1-\beta_\nu x)=\sum_{l=0}^{\nu}\sigma_l x^l,\ \text{where}\ \sigma_0=1$$
Multiple-error-correction block codes
● We can find the error-location polynomial with the help of Berlekamp's algorithm, using the syndrome vector.
– It works iteratively, in 2t steps.
– Details can be found in the references.
● Once determined, its roots can be found by substituting the elements of GF(q^m) cyclically in σ(x).
– If σ(α^i)=0, then α^(−i)=α^(q^m−i−1) is an error-location number.
– The errors are thus located at such positions q^m−i−1.
● On the other hand, the error-value evaluator is defined as
$$Z_0(x)=\sum_{l=1}^{\nu}\delta_l\beta_l\prod_{i=1,\,i\neq l}^{\nu}\left(1-\beta_i x\right)$$
Multiple-error-correction block codes
● It can be shown that the error-value evaluator can be calculated as a function of known quantities:
$$Z_0(x)=S_1+(S_2+\sigma_1S_1)x+(S_3+\sigma_1S_2+\sigma_2S_1)x^2+\ldots+(S_\nu+\sigma_1S_{\nu-1}+\ldots+\sigma_{\nu-1}S_1)x^{\nu-1}$$
● After some algebra, the error values are determined as
$$\delta_k=\frac{-Z_0\left(\beta_k^{-1}\right)}{\sigma'\left(\beta_k^{-1}\right)}$$
● Nonbinary LBC of the kind described are usually employed in sophisticated FEC strategies for specific channels.
– Binary LBC are more usual in ARQ strategies.
Multiple-error-correction block codes
● Example BPSK+AWGN: RHam=246/255, RGol=1/2, RR-S=155/255.
Convolutional codes
Convolutional codes
● A binary convolutional code (CC) is another kind of linear channel code class.
● The encoding can be described in terms of a finite state machine (FSM).
– A CC can eventually produce sequences of infinite length.
– A CC encoder has memory. General structure:
[Figure: general CC encoder. The k input streams feed a MEMORY block (ml bits for the l-th input); a forward logic produces the n output streams (coded bits); a backward (feedback) logic and a systematic output may be present but are not mandatory.]
Convolutional codes
● The memory is organized as a shift register.
– Number of positions for input l: memory ml.
● A CC encoder produces sequences, not just blocks of data.
– Sequence-based properties vs. block-based properties.
[Figure: the l-th input stream enters a shift register with ml positions; at instant i they hold d_i^(l), d_{i−1}^(l), d_{i−2}^(l), ..., d_{i−ml}^(l), which feed the forward and backward logic.]
Convolutional codes
● Both the forward and the backward logic are boolean logic.
– Very easy: each operation adds up (XOR) a number of memory positions, from each of the k inputs; at instant i the j-th output collects contributions from all the k registers:
$$c_i^{(j)}=\sum_{l=1}^{k}\sum_{q=0}^{m_l}g_{l,q}^{(j)}\cdot d_{i-q}^{(l)}$$
● g_{l,p}^{(j)}, p=0,...,ml, is 1 when the p-th register position for the l-th input is added to get the j-th output.
Convolutional codes
● Parameters of a CC so far:
– k input streams
– n output streams
– k shift registers with length ml each, l=1,...,k
● A CC is denoted as (n,k,ν), where ν = Σl ml is the total encoder memory.
– As usual, its rate is R=k/n, where k and n normally take small values for a convolutional code.
Convolutional codes
● The backward / forward logic may be specified in the form of generator sequences.
– These sequences are the impulse responses of each output j wrt each input l:
$$\mathbf g_l^{(j)}=\left(g_{l,0}^{(j)},\ldots,g_{l,m_l}^{(j)}\right)$$
● Observe that:
– g_l^(j) = (1,0,...,0) connects the l-th input directly to the j-th output.
– g_l^(j) = (0,...,1(q-th),...,0) just delays the l-th input to the j-th output by q time steps.
Convolutional codes
● Given the presence of the shift register, the generator sequences are better denoted as generator polynomials
$$\mathbf g_l^{(j)}=\left(g_{l,0}^{(j)},\ldots,g_{l,m_l}^{(j)}\right)\equiv g_l^{(j)}(D)=\sum_{q=0}^{m_l}g_{l,q}^{(j)}\cdot D^q$$
● We can thus write, for example,
$$g_l^{(j)}=(1,0,\ldots,0)\equiv g_l^{(j)}(D)=1,\qquad g_l^{(j)}=(0,\ldots,1(q^{th}),\ldots,0)\equiv g_l^{(j)}(D)=D^q$$
● All the generator polynomials can be arranged in the k×n matrix
$$G(D)=\begin{pmatrix}g_1^{(1)}(D) & g_1^{(2)}(D) & \cdots & g_1^{(n)}(D)\\ g_2^{(1)}(D) & g_2^{(2)}(D) & \cdots & g_2^{(n)}(D)\\ \vdots & \vdots & \ddots & \vdots\\ g_k^{(1)}(D) & g_k^{(2)}(D) & \cdots & g_k^{(n)}(D)\end{pmatrix}$$
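A minimal feedforward encoder sketch for a rate-1/2 CC; the generator polynomials G(D) = [1+D+D², 1+D²] (the classical (7,5) octal pair) are chosen here only as an example:

```python
def cc_encode(bits, gens=((1, 1, 1), (1, 0, 1))):
    """Rate-1/n feedforward convolutional encoder.
    gens[j][q] is g_{1,q}^{(j)}: 1 if register position q feeds output j."""
    m = len(gens[0]) - 1          # encoder memory
    state = [0] * m               # shift register: d_{i-1}, ..., d_{i-m}
    out = []
    for d in bits:
        window = [d] + state      # d_i, d_{i-1}, ..., d_{i-m}
        for g in gens:            # one XOR sum per output stream
            out.append(sum(gq & dq for gq, dq in zip(g, window)) % 2)
        state = window[:-1]       # shift the register
    return out

# The two trailing 0's terminate the encoder back in the all-zero state.
print(cc_encode([1, 0, 1, 1, 0, 0]))   # 12 coded bits for 6 input bits (R = 1/2)
```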
Convolutional codes
● If each input has a feedback logic given as
$$g_l^{(0)}(D)=\sum_{q=0}^{m_l}g_{l,q}^{(0)}\cdot D^q$$
the code is described by the rational (recursive) generator matrix
$$G(D)=\begin{pmatrix}\dfrac{g_1^{(1)}(D)}{g_1^{(0)}(D)} & \dfrac{g_1^{(2)}(D)}{g_1^{(0)}(D)} & \cdots & \dfrac{g_1^{(n)}(D)}{g_1^{(0)}(D)}\\[2ex] \dfrac{g_2^{(1)}(D)}{g_2^{(0)}(D)} & \dfrac{g_2^{(2)}(D)}{g_2^{(0)}(D)} & \cdots & \dfrac{g_2^{(n)}(D)}{g_2^{(0)}(D)}\\ \vdots & \vdots & \ddots & \vdots\\ \dfrac{g_k^{(1)}(D)}{g_k^{(0)}(D)} & \dfrac{g_k^{(2)}(D)}{g_k^{(0)}(D)} & \cdots & \dfrac{g_k^{(n)}(D)}{g_k^{(0)}(D)}\end{pmatrix}$$
Convolutional codes
●
We can generalize the concept of parity-check matrix H(D).
– An (n,k,ν) CC is fully specified by G(D) or H(D).
● Based on the matrix description, there are plenty of linear tools for the design, analysis and evaluation of a given CC.
●
A regular CC can be described as a (canonical) all-feedforward
CC and through an equivalent feedback (recursive) CC.
– Note that a recursive CC can be seen as an IIR filter in GF(2).
●
Even though k and n could be very small, a CC has a very rich
algebraic structure.
– This has to do with the constraint length of the CC.
– Each output bit is related to the present and past inputs via
powerful algebraic methods.
Convolutional codes
● Given G(D), a CC can be classified as:
– Systematic and feedforward.
– Systematic and recursive (RSC).
– Non-systematic and feedforward (NSC).
– Non-systematic and recursive.
● RSC is a popular class of CC, because it may provide an infinite output for a finite-weight input (IIR behavior).
● Each NSC can be converted straightforwardly to an RSC with similar error correcting properties.
● CC encoders are easy to implement with standard hardware: shift registers + combinational logic.
Convolutional codes
● As with the case of nonbinary LBC, we may use the polynomial representation to perform coding & decoding.
– But now we have encoding with memory, spanning theoretically infinite length sequences → not practical.
$$\mathbf c(D)=\mathbf b(D)\cdot G(D),\qquad\mathbf b(D)=\left(b^{(1)}(D)\ldots b^{(k)}(D)\right);\quad b^{(i)}(D)=\sum_{j\geq 0}b_j^{(i)}D^j$$
where b_j^(i) is the i-th input bit stream.
Convolutional codes
● We do not need to look very deep into the algebraic details of G(D) and H(D) to study:
– Coding
– Decoding
– Error correcting capabilities
● A CC encoder is an FSM!
[Figure: FSM view of the encoder: starting state ss=s^(i−1), s=1,...,2^ν; input bi=(bi,1...bi,k); output ci=(ci,1...ci,n); ending state se=s^(i), e=1,...,2^ν.]
Convolutional codes
● The trellis illustrates the encoding process on 2 axes:
– X-axis: time / Y-axis: states.
● Example for a (2,1,3) CC:
[Figure: trellis sections between instants i−1, i and i+1 with the 2^3=8 states s1...s8; from each state one branch corresponds to input 0 and one to input 1, each labelled with the corresponding output bits (e.g. s1 → s1 with output 00). Because of the memory, the same input can produce different outputs depending on the state.]
– For a finite-size input data sequence, a CC can be forced to finish at a known state (often 0) by adding terminating (dummy) bits.
– Note that one trellis section (e.g. i−1 → i) fully specifies the CC.
Convolutional codes
● The trellis description allows us
– To build the encoder
– To build the decoder
– To get the properties of the code
● The encoder:
[Figure: the k input bits bi=(bi,1...bi,k) enter the registers (clocked by CLK); the combinational logic, specified by G(D)↔H(D), produces the n output bits ci=(ci,1...ci,n); the registers move from state ss=s^(i−1) to se=s^(i), s,e=1,...,2^ν.]
Convolutional codes
● As usual, decoding is far more complicated than encoding
– Long sequences
– Memory: dependence on past states
● In fact, CC were already well known before a good practical method to decode them existed: the Viterbi algorithm.
– It is a Maximum Likelihood Sequence Estimation (MLSE) algorithm with many applications.
● Issue: for a length N >> n sequence at the receiver side
– There are 2^ν·2^(N·k/n) paths through the trellis to match with the received data.
– Even if the coder starting state is known (often 0), there are still 2^(N·k/n) paths to walk through in a brute force approach.
Convolutional codes
● Viterbi algorithm setup.
[Figure: one trellis section between instants i−1 and i, states s1...s8; the input bi produces the output ci(s^(i−1),bi) and moves the encoder from the starting state s^(i−1) to the ending state s^(i)(s^(i−1),bi); the received data at that instant is ri.]
Key facts:
● The encoding corresponds to a Markov chain model: P(s^(i)) = P(s^(i)|s^(i−1))·P(s^(i−1)).
● The total likelihood P(r|b) can be factorized as a product of probabilities.
● Given s^(i−1)→s^(i), P(ri|s^(i),s^(i−1)) depends only on the channel kind (AWGN, BSC...).
● The transition from s^(i−1) to s^(i) (linked in the trellis) depends on the probability of bi: P(s^(i)|s^(i−1)) = 2^(−k) if the source is iid.
● P(s^(i)|s^(i−1)) = 0 if the states are not linked in the trellis (finite state machine: deterministic).
Convolutional codes
● The total likelihood can be recursively calculated as:
$$P(\mathbf r\mid\mathbf b)=P\!\left(s^{(0)}\right)\cdot\prod_{i=1}^{N/n}P\!\left(\mathbf r_i\mid s^{(i)},s^{(i-1)}\right)\cdot P\!\left(s^{(i)}\mid s^{(i-1)}\right)$$
● In the BSC(p), the observation (branch) metric would be related to:
$$P\!\left(\mathbf r_i\mid s^{(i)},s^{(i-1)}\right)=P(\mathbf r_i\mid\mathbf c_i)\ \rightarrow\ w(\mathbf r_i+\mathbf c_i)=d_H(\mathbf r_i,\mathbf c_i)$$
● Maximum likelihood (ML) criterion:
$$\hat{\mathbf b}=\arg\max_{\{\mathbf b\}}\left[P(\mathbf r\mid\mathbf b)\right]$$
Convolutional codes
● We know that the brute force approach to the ML criterion is at least O(2^(N·k/n)).
● The Viterbi algorithm works recursively from 1 to N/n on the basis that
– Many paths can be pruned out (transition probability = 0).
– During the forward recursion, we only keep, for each state, the path with the highest probability: the path probability quickly goes to 0 as soon as one term metric × transition probability is very small.
– When the recursion reaches i=N/n, the surviving path guarantees the ML criterion (optimal for ML sequence estimation!).
● The Viterbi algorithm complexity goes down to O(N·2^(2ν)).
Convolutional codes
● The algorithm's recursive rule is
$$V_j^{(0)}=P\!\left(s^{(0)}=s_j\right);\qquad V_j^{(i)}=\max_{k}\left\{V_k^{(i-1)}\cdot P\!\left(\mathbf r_i\mid s^{(i)}=s_j,s^{(i-1)}=s_k\right)\cdot P\!\left(s^{(i)}=s_j\mid s^{(i-1)}=s_k\right)\right\}$$
– V_k^(i−1) is the probability of the most probable state sequence corresponding to the i−1 previous observations (ending at state s_k).
[Figure: trellis section; at each state of instant i, the MAX selects the best incoming branch and only that surviving path is kept.]
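A compact hard-decision Viterbi sketch for the same example rate-1/2 code used above (G(D) = [1+D+D², 1+D²]); the branch metric is the Hamming distance dH(ri,ci), and the input is assumed to be terminated in the all-zero state:

```python
def viterbi_decode(rx, gens=((1, 1, 1), (1, 0, 1))):
    """Hard-decision Viterbi decoding of a terminated rate-1/n feedforward CC."""
    n, m = len(gens), len(gens[0]) - 1
    states = list(range(2 ** m))                    # state = register contents as an integer

    def step(state, d):                             # one trellis branch: outputs and next state
        window = [d] + [(state >> q) & 1 for q in range(m)]   # d_i, d_{i-1}, ..., d_{i-m}
        out = [sum(g[q] & window[q] for q in range(m + 1)) % 2 for g in gens]
        nxt = ((state << 1) | d) & ((1 << m) - 1)
        return out, nxt

    INF = float("inf")
    metric = {s: (0 if s == 0 else INF) for s in states}      # start in state 0
    paths = {s: [] for s in states}
    for i in range(0, len(rx), n):
        ri = rx[i:i + n]
        new_metric, new_paths = {s: INF for s in states}, {s: None for s in states}
        for s in states:
            if metric[s] == INF:
                continue
            for d in (0, 1):
                out, nxt = step(s, d)
                cand = metric[s] + sum(a ^ b for a, b in zip(ri, out))  # Hamming branch metric
                if cand < new_metric[nxt]:                              # keep the survivor
                    new_metric[nxt], new_paths[nxt] = cand, paths[s] + [d]
        metric, paths = new_metric, new_paths
    return paths[0]                                  # terminated code: best path ending in state 0

coded = [1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1]         # cc_encode([1, 0, 1, 1, 0, 0]) from above
coded[2] ^= 1                                        # flip one bit to simulate a channel error
print(viterbi_decode(coded))                          # [1, 0, 1, 1, 0, 0] -> the error is corrected
```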
Convolutional codes
●
Note that we have considered the algorithm when the
demodulator yields hard outputs
– ri is a vector of n estimated bits (BSC(p) equivalent channel).
●
In AWGN, we can do better to decode a CC
– We can provide soft (probabilistic) estimations for the
observation metric.
– For an iid source, we can easily get an observation transition
metric based on the probability of each bi,l=0,1, l=1,...,k,
associated to a possible transition.
– There is a gain of around 2 dB in Eb/N0.
– LBC decoders can also accept soft inputs (non syndrome-based
decoders).
– We will examine an example of soft decoding of CC in the lab.
Convolutional codes
●
We are now familiar with the encoder and the decoder
– Encoder: FSM (registers, combinational logic).
– Decoder: Viterbi algorithm (for practical reasons,
suboptimal adaptations are usually employed).
●
But what about performance?
●
First...
– CC are mainly intended for FEC, not for ARQ schemes.
– In a long sequence (=CC codeword), the probability of
having at least one error is very high...
– And... are we going to retransmit the whole sequence?
Convolutional codes
● Given that we truncate the sequence to N bits and the CC is linear
– We may analyze the system as an equivalent (N, N·k/n) LBC.
– But... the equivalent matrices G and H would not be practical.
● Remember the FSM: we can locate error loops in the trellis.
[Figure: the correct path b and an erroneous path b+e diverge at instant i and re-merge at instant i+3 in the trellis: an error loop.]
Convolutional codes
● Examining the minimal-length loops and taking into account this uniform error property, we can get the dmin of a CC.
– For a CC emitting finite-duration coded blocks forced to end at the 0 state, dmin is called dfree.
Convolutional codes
● With a fair amount of algebra, related to the FSM, modified encoder state diagrams and so on, it is possible to get an upper bound for optimal MLSE decoding (BPSK in AWGN, soft demodulation):
$$P_b\leq\sum_{d=d_{free}}^{\infty}B_d\cdot\mathrm{erfc}\!\left(\sqrt{dR\frac{E_b}{N_0}}\right)$$
[Figure: puncturing. A native convolutional encoder (rate k/n) is followed by a puncturing algorithm that prunes bits, leaving n' < n output bits per k input bits and thus a higher rate.]
Convolutional codes
● Examples with BPSK+AWGN using ML bounds.
Turbo codes
Turbo codes
● Canonically, Turbo Codes (TC) are Parallel Concatenated Convolutional Codes (PCCC).
[Figure: the k input streams b feed CC1 directly and CC2 through a block marked “?” (identified later as the interleaver); the outputs are combined into c = c1 ∪ c2, with n = n1+n2 output streams and rate R = k/(n1+n2).]
● Coding concatenation has been known and employed for decades, but TC added a joint efficient decoding algorithm.
– An example of concatenated coding with independent decoding is the use of hybrid ARQ + FEC strategies (CRC/R-S + CC).
Turbo codes
● For CC, we also have decoders that provide probabilistic (soft) outputs.
– They convert a priori soft values + channel output soft estimations into updated a posteriori soft values.
– They are optimal from the Maximum A Posteriori (MAP) criterion point of view.
– They are called Soft Input-Soft Output (SISO) decoders.
Turbo codes
● What's in a SISO?
[Figure: a SISO block (for a CC) takes r, the soft demodulated values from the channel, together with the a priori probabilities (APR), initially P(bi=b)=1/2, and outputs the a posteriori probabilities (APP) P(bi=b|r), updated with the channel information; the sketches show the probability density function of bi before (flat) and after (peaked) the SISO.]
● Note that the SISO works on a bit by bit basis, but produces a sequence of APP's.
Turbo codes
● The algorithm inside the SISO is some suboptimal version of the MAP BCJR algorithm.
– BCJR computes the APP values through a forward-backward recursion → it works over finite length data blocks, not over (potentially) infinite length sequences (like pure CCs).
– BCJR works on a trellis: recall transition metrics, transition probabilities and so on.
– Assume the block length is N: the trellis starts at s^(0) and ends at s^(N).
$$\alpha_i(j)=P\!\left(s^{(i)}=s_j,\mathbf r_1,\cdots,\mathbf r_i\right)\qquad\text{FORWARD term}$$
$$\beta_i(j)=P\!\left(\mathbf r_{i+1},\cdots,\mathbf r_N\mid s^{(i)}=s_j\right)\qquad\text{BACKWARD term}$$
$$\gamma_i(j,k)=P\!\left(\mathbf r_i,s^{(i)}=s_k\mid s^{(i-1)}=s_j\right)\qquad\text{TRANSITION term}$$
(Remember: each r_i has n components for an (n,k,ν) CC.)
Turbo codes
● BCJR algorithm in action:
– Forward step, i=1,...,N:
$$\alpha_0(j)=P\!\left(s^{(0)}=s_j\right);\qquad\alpha_i(j)=\sum_{k=1}^{2^\nu}\alpha_{i-1}(k)\cdot\gamma_i(k,j)$$
– Backward step, i=N,...,1 (with β_N(j)=1 for the known final state and 0 otherwise):
$$\beta_{i-1}(j)=\sum_{k=1}^{2^\nu}\beta_i(k)\cdot\gamma_i(j,k)$$
– Combining both:
$$P\!\left(s^{(i-1)}=s_j,s^{(i)}=s_k,\mathbf r\right)=\beta_i(k)\cdot\gamma_i(j,k)\cdot\alpha_{i-1}(j)$$
Turbo codes
● Finally, the APP's can be calculated as:
$$P(b_i=b\mid\mathbf r)=\frac{1}{p(\mathbf r)}\cdot\sum_{\substack{s^{(i-1)}\rightarrow s^{(i)}\\ b_i=b}}P\!\left(s^{(i-1)}=s_j,s^{(i)}=s_k,\mathbf r\right)$$
● Decision criterion based on these APP's:
$$\log\!\left(\frac{P(b_i=1\mid\mathbf r)}{P(b_i=0\mid\mathbf r)}\right)=\log\!\left(\frac{\sum_{s^{(i-1)}\rightarrow s^{(i)}:\,b_i=1}P\!\left(s^{(i-1)}=s_j,s^{(i)}=s_k,\mathbf r\right)}{\sum_{s^{(i-1)}\rightarrow s^{(i)}:\,b_i=0}P\!\left(s^{(i-1)}=s_j,s^{(i)}=s_k,\mathbf r\right)}\right)\ \gtrless\ 0\ \Rightarrow\ \hat b_i=1\ /\ \hat b_i=0$$
– The modulus of this log-ratio is the reliability of the decision.
Turbo codes
● How do we get γi(j,l)?
● This probability takes into account
– The restrictions of the trellis (CC).
– The estimations from the channel.
● In AWGN (for unipolar ci,m):
$$\gamma_i(j,l)=\frac{1}{(2\pi\sigma^2)^{n/2}}\,e^{-\frac{1}{2\sigma^2}\sum_{m=1}^{n}(r_{i,m}-c_{i,m})^2}\cdot P\!\left(s^{(i)}\mid s^{(i-1)}\right)$$
with P(s^(i)|s^(i−1)) = 0 if the transition is not possible, and = 1/2^k if it is possible (binary trellis, k inputs).
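A tiny numeric sketch of this transition metric (the received values, noise variance and branch outputs below are made-up illustration values):

```python
import math

def gamma(r, c, sigma2, n_inputs=1, possible=True):
    """AWGN transition metric: Gaussian likelihood of r given the branch output c,
    times the a priori transition probability (1/2^k if the branch exists, else 0)."""
    if not possible:
        return 0.0
    sq = sum((rm - cm) ** 2 for rm, cm in zip(r, c))
    likelihood = math.exp(-sq / (2 * sigma2)) / (2 * math.pi * sigma2) ** (len(r) / 2)
    return likelihood / 2 ** n_inputs

r_i = (0.9, 0.2)                     # soft demodulated values (unipolar signalling assumed)
print(gamma(r_i, (1, 0), 0.25))      # a branch whose output bits are close to r_i ...
print(gamma(r_i, (0, 1), 0.25))      # ... gets a larger metric than a distant one
```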
Turbo codes
● Idea: what about feeding the APP values as APR values to another decoder whose encoder had the same inputs?
[Figure: the APP's P(bi=b|r1) produced by the SISO for CC1 are fed as a priori values to the SISO for CC2, which also receives r2 from the channel and outputs updated APP's P(bi=b|r2).]
– This will improve the estimates only under some conditions.
Turbo codes
● APP's from the first SISO used as APR's for the second SISO may increase the updated APP's reliability iff
– the APR's are uncorrelated wrt the channel estimations for the second decoder.
– This is achieved by permuting the input data for each encoder.
[Figure: the input data b goes through an INTERLEAVER (permutor), producing d, which feeds CC2.]
Turbo codes
● The interleaver preserves the data (b), but changes its position within the second stream (d):
$$d_i=b_{\pi(i)}\quad\Longleftrightarrow\quad d_{\pi^{-1}(i)}=b_i$$
– Note that this compels the TC to work with blocks of N = size(Π) bits.
– The decoder has to know the specific interleaver used at the encoder.
[Figure: each bit bi of b1 b2 b3 b4 ... bN is moved to position π⁻¹(i) of the interleaved stream d.]
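A minimal interleaver sketch (the pseudo-random permutation below is only an example; real TC interleavers are carefully designed):

```python
import random

N = 8
random.seed(1)                       # fixed seed so the example is reproducible
pi = list(range(N))
random.shuffle(pi)                   # permutation pi

b = [1, 0, 1, 1, 0, 0, 1, 0]
d = [b[pi[i]] for i in range(N)]     # interleaving: d_i = b_{pi(i)}

b_back = [0] * N
for i in range(N):
    b_back[pi[i]] = d[i]             # deinterleaving: b_{pi(i)} = d_i
assert b_back == b
print(pi, d)
```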
Turbo codes
● The mentioned process is applied iteratively (l=1,...).
– Iterative decoder → this may be a drawback, since it adds latency (delay).
[Figure: turbo decoder loop. r1 and r2 come from the channel; SISO 1 produces APP1(l), which goes through Π and becomes APR2(l) for SISO 2; SISO 2 produces APP2(l), which goes through Π⁻¹ and becomes APR1(l+1) for SISO 1; the initial APR1(l=0) is taken with P(bi=b)=1/2; the final decision is taken after Π⁻¹.]
– La1(l) is the input APR value coming from the previous SISO.
– Le1(l) is the so-called extrinsic output value (the only term interchanged between SISOs).
Turbo codes
● The interleaver prevents the same error loop from happening at both decoder stages: they may cooperate successfully.
[Figure: an error loop (correct path b, erroneous path b+e) between instants i and i+3 in the trellis of the 1st CC; after interleaving, the involved positions π(i), π(i+1), π(i+2), π(i+3) are scattered over the trellis of the 2nd CC. A pattern like 10...01 is a typical error loop for an RSC CC.]
– Note the two distinct zones in the resulting BER curves: waterfall region / error floor.
Turbo codes
● Analysis of TC is a very complex task: interleaving!
● The location of the waterfall region can be analyzed by the so-called density evolution method
– Based on the exchange of mutual information between SISO blocks.
● The error floor can be lower bounded by the minimum Hamming distance of the TC
– Contrary to CC's, TC relies on reducing multiplicities rather than just trying to increase the minimum distance.
$$P_b^{floor}>\frac{w_{min}\cdot M_{min}}{N}\cdot\mathrm{erfc}\!\left(\sqrt{d_{min}R\frac{E_b}{N_0}}\right)\qquad\text{(BPSK in AWGN, soft demodulation)}$$
– w_min: Hamming weight of the error with minimum distance.
– M_min: error multiplicity (a low value!!).
– 1/N: interleaver gain (only if recursive CC's are used!!).
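A quick numeric evaluation of this bound (all the parameter values below are made up solely to show the calculation and the effect of the interleaver size N):

```python
import math

def pb_floor(w_min, M_min, N, d_min, R, ebn0_db):
    """Lower bound on the TC error floor for BPSK in AWGN with soft demodulation."""
    ebn0 = 10 ** (ebn0_db / 10)
    return (w_min * M_min / N) * math.erfc(math.sqrt(d_min * R * ebn0))

# Illustrative values: dmin = 6 with multiplicity 1 and input weight 2, interleaver size 1024, R = 1/3
for ebn0_db in (1.0, 2.0, 3.0):
    print(ebn0_db, pb_floor(w_min=2, M_min=1, N=1024, d_min=6, R=1/3, ebn0_db=ebn0_db))
```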
Turbo codes
● Examples of 3G TC. Note that TC's are intended for FEC...
Low Density Parity Check codes
Low Density Parity Check codes
● LDPC codes are just another kind of channel codes derived from less complex ones.
– While TC's were initially an extension of CC systems, LDPC codes are an extension of the concept of binary LBC, but they are not exactly our known LBC.
● Formally, an LDPC code is an LBC whose parity check matrix is large and sparse.
– Almost all matrix elements are 0!
– Very often, the LDPC parity check matrices are randomly generated, subject to some constraints on sparsity...
– Recall that LBC relied on extremely powerful algebra tied to carefully chosen matrix structures.
Low Density Parity Check codes
● Formally, a (ρ,γ)-regular LDPC code is defined as the null space of a J×n parity check matrix H that meets these constraints:
a) Each row contains ρ 1's.
b) Each column contains γ 1's.
c) λ, the number of 1's in common between any two columns, is 0 or 1.
d) ρ and γ are small compared with n and J.
● These properties give name to this class of codes: their matrices have a low density of 1's.
● The density r of H is defined as r = ρ/n = γ/J.
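A small sketch that checks these constraints on a candidate matrix (usable, for instance, on the example H of the next slide):

```python
import numpy as np

def check_regular_ldpc(H):
    """Check (rho, gamma)-regularity and return (rho, gamma, lambda_max, density)."""
    H = np.asarray(H)
    J, n = H.shape
    row_w, col_w = H.sum(axis=1), H.sum(axis=0)
    rho, gamma = int(row_w[0]), int(col_w[0])
    assert np.all(row_w == rho) and np.all(col_w == gamma), "not a regular matrix"
    overlaps = H.T @ H                 # lambda: max number of 1's shared by any two columns
    np.fill_diagonal(overlaps, 0)
    return rho, gamma, int(overlaps.max()), rho / n

# Tiny toy matrix (not a useful code, just to exercise the function): rho=2, gamma=1
H_toy = np.array([[1, 1, 0, 0],
                  [0, 0, 1, 1]])
print(check_regular_ldpc(H_toy))   # (2, 1, 1, 0.5)
```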
Low Density Parity Check codes
● Example of a (4,3)-regular LDPC parity check matrix (15×20):
H =
1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1
1 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0
0 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 0
0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0
0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0
0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0 1
1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0
0 1 0 0 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0
0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0
0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0
0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
– This 15×20 H defines a (20,7) LBC!!! (its row rank is 13 = n−k, not 15).
– Density r = 4/20 = 3/15 = 0.2 → sparse!
– λ = 0, 1.
Low Density Parity Check codes
● Note that the J rows of H are not necessarily linearly independent over GF(2).
– To determine the dimension k of the code, it is mandatory to find the row rank of H = n−k < J.
– That's the reason why in the previous example H defined a (20,7) LBC instead of the (20,5) LBC that could be expected!
● The construction of large H for LDPC with high rates and good properties is a complex subject.
– Some methods rely on smaller Hi used as building blocks, plus random permutations or combinatorial manipulations; resulting matrices with bad properties are discarded.
– Other methods rely on finite geometries and a lot of algebra.
Low Density Parity Check codes
●
LDPC codes yield performances equal or even better than
TC's, but without the problem of their relatively high error
floor.
– Both LDPC codes and TC's are capacity approaching
codes.
●
As in the case of TC, their interest is in part related to the
fact that
– The encoding may be easily done, under some constraints
(even if H is large, the low density of 1's may help reducing
the complexity of the encoder).
– At the decoder side, there are powerful algorithms that can
take full advantage of the properties of the LDPC code.
Low Density Parity Check codes
● Encoding of LDPC is a bit tricky.
● One may build the equivalent full-row-rank matrix H by Gaussian elimination, and then Hs=[In-k | P] → Gs=[PT | Ik].
– Nevertheless, P is not usually sparse, and the length n is in practice too large to make this framework practical.
– Encoding using the generator matrix Gs is done with complexity O(n2).
– Encoding can be performed with lower complexity by using iterative algorithms that take advantage of the parity-check structure of H; e.g. the info bits are distributed through a structure with a lattice of simple encoders reproducing “local” parity-check equations.
[Figure source: Wikipedia]
Low Density Parity Check codes
●
There are several algorithms to decode LDPC codes.
– Hard decoding.
– Soft decoding.
– Mixed approaches.
●
We are going to examine two important instances thereof:
– Majority-logic (MLG) decoding; hard decoding, the simplest
one (lowest complexity).
– Sum-product algorithm (SPA); soft decoding, best error
performance (but high complexity!).
●
Key concepts: Tanner graphs & belief propagation.
Low Density Parity Check codes
● MLG decoding: hard decoding; r = c + e → received word.
– The simplest instance of MLG decoding is the decoding of a repetition code by the rule “choose 0 if 0's are dominant, 1 otherwise”.
● Given a (ρ,γ)-regular LDPC code, for every bit position i=1,...,n, there is a set of γ rows
$$A_i=\left\{\mathbf h_1^{(i)},\cdots,\mathbf h_\gamma^{(i)}\right\}$$
that all check bit i and, thanks to λ ≤ 1, share no other bit position: the γ parity sums they define on r are orthogonal on bit i.
– ê_i is estimated as 1 if a majority of those γ check sums fail, and as 0 otherwise.
● Repeating this for all i, we estimate ê, and ĉ = r + ê.
– Correct decoding of ei is guaranteed if there are fewer than γ/2 errors in e.
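A one-step majority-logic sketch (hard decisions); the row positions below reproduce the (4,3)-regular example matrix of the previous slides:

```python
import numpy as np

def mlg_decode(r, H):
    """One-step majority-logic decoding: flip bit i if more than half of the
    gamma parity checks involving bit i are violated."""
    syndrome = (H @ r) % 2                # one value per parity-check equation
    gamma = H.sum(axis=0)                 # number of checks on each bit
    failed = H.T @ syndrome               # failed checks touching each bit
    e_hat = (failed > gamma / 2).astype(int)
    return (r + e_hat) % 2                # c_hat = r + e_hat

# The (4,3)-regular example matrix, given by the positions of the 1's in each row.
rows = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16], [17, 18, 19, 20],
        [1, 5, 9, 13], [2, 6, 10, 17], [3, 7, 14, 18], [4, 11, 15, 19], [8, 12, 16, 20],
        [1, 6, 12, 18], [2, 7, 11, 16], [3, 8, 13, 19], [4, 9, 14, 17], [5, 10, 15, 20]]
H = np.zeros((15, 20), dtype=int)
for j, pos in enumerate(rows):
    H[j, [p - 1 for p in pos]] = 1

r = np.zeros(20, dtype=int)
r[4] = 1                                  # all-zero codeword hit by a single error
print(mlg_decode(r, H))                   # the error is corrected: all zeros again
```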
Low Density Parity Check codes
● Tanner graphs. Example for a (7,3) LBC.
[Figure: bipartite graph with the variable nodes or code-bit vertices c1...c7 on top and the check nodes or check-sum vertices s1...s7 (drawn as “+”) at the bottom, joined by edges.]
● It is a bipartite graph with interesting properties for decoding.
– A variable node is connected to a check node iff the corresponding code bit is checked by the corresponding parity sum equation.
Low Density Parity Check codes
● Based on the Tanner graph of an LDPC code, it is possible to make iterative soft decoding (SPA).
● SPA is performed by belief propagation (which is an instance of a message passing algorithm).
[Figure: on the Tanner graph (variable nodes c1...c7, check nodes s1...s7), “messages” (soft values) are passed to and from related variable and check nodes, e.g. those in the neighborhood N(s7) of check node s7.]
Low Density Parity Check codes
● If we get P(ci | λ), we have an estimation of the codeword sent ĉ.
● The decoding aims at calculating this through the marginalization
$$P(c_i\mid\boldsymbol\lambda)=\sum_{\mathbf c':c'_i=c_i}P(\mathbf c'\mid\boldsymbol\lambda)$$
● The brute-force approach for LDPC is impractical, hence the iterative solution through SPA. Messages interchanged at step l:
$$\mu_{c_i\rightarrow s_j}^{(l)}(c_i=c)=\alpha_{i,j}^{(l)}\cdot P(c_i=c\mid\lambda_i)\cdot\prod_{\substack{s_k\in N(c_i)\\ s_k\neq s_j}}\mu_{s_k\rightarrow c_i}^{(l-1)}(c_i=c)\qquad\text{(from variable node to check node)}$$
$$\mu_{s_j\rightarrow c_i}^{(l)}(c_i=c)=\sum_{\mathbf c\setminus c_i:\,c_k\in N(s_j)}P(s_j=0\mid c_i=c,\mathbf c)\cdot\prod_{\substack{c_k\in N(s_j)\\ c_k\neq c_i}}\mu_{c_k\rightarrow s_j}^{(l)}(c'_k=c_k)\qquad\text{(from check node to variable node)}$$
Low Density Parity Check codes
● Note that:
– α(l)i,j is a normalization constant.
– P(ci=c|λi) plugs the values from the channel into the LDPC SPA → it is the APR info.
– N(ci) and N(sj) are the neighborhoods of variable nodes and check nodes.
– The APP value at step l is
$$P^{(l)}(c_i=c\mid\boldsymbol\lambda)=\beta_i^{(l)}\cdot P(c_i=c\mid\lambda_i)\cdot\prod_{s_j\in N(c_i)}\mu_{s_j\rightarrow c_i}^{(l)}(c_i=c)$$
with β(l)i another normalization constant.
● Based on the final probabilities P(l)(ci=c|λ), a candidate ĉ is chosen and ĉ·HT is tested. If 0, the information word is decoded.
Low Density Parity Check codes
●
There is also the possibility to analyze their performance.
– They have also error floors that may be characterized, on
the basis of their minimum distance.
– Good performance is related with sparsity (it breaks error
recurrences along parity check equations).
●
Analysis techniques are complex.
– They require taking into account the Tanner graph
structure, nature of their loops, and so on.
– It is possible to draw bounds for the BER, and give design
and evaluation criteria thereon.
– Analysis depends heavily on the nature of the LDPC:
regular, irregular..., and constitutes a field of very active
research.
Low Density Parity Check codes
● LDPC BER performance examples (DVB-S2 standard): short frames, n=16200, and long frames, n=64800.
Coded modulations
Coded modulations
●
We have considered up to this point channel coding and
decoding isolated from the modulation process.
– Codewords feed any kind of modulator.
– Symbols go through a channel (medium).
– The info recovered from received modulated symbols is fed
to the suitable channel decoder
●
As hard decisions.
●
As soft values (probabilistic estimations).
– The abstractions of BSC(p) (hard demodulation) or soft
values from AWGN ( ⋉ exp[-|ri-sj|2/(2σ2)] ) -and the like for
other cases- are enough for such an approach.
●
Note that there are other important channel kinds not
considered so far.
166
Coded modulations
●
Coded modulations are systems where channel coding
and modulation are treated as a whole.
– Joint coding/modulation.
– Joint decoding/demodulation.
●
This offers potential advantages (recall the improvements
made when the demodulator outputs more elaborated
information -soft values vs. hard decisions).
– We combine gains in BER with spectral efficiency!
●
As a drawback, the systems become more complex.
– More difficult to design and analyze.
Coded modulations
● TCM (trellis coded modulation).
– Normally, it combines a CC encoder and the modulation symbol mapper.
[Figure: trellis between instants i−1, i and i+1 with states s1...s8; each branch is labelled directly with a modulation symbol (output mj, output mk) instead of coded bits.]
Coded modulations
●
If the modulation symbol mapper is well matched to the CC
trellis, and the decoder is accordingly designed to take
advantage of it,
– TCM provides high spectral efficiency.
– TCM can be robust in AWGN channels, and against fading and
multipath effects.
● In the 80's, TCM became the standard for telephone line data modems.
– No other system could provide better performance over the twisted pair cable before the introduction of DMT and ADSL.
● However, the flexibility of providing separate channel coding and modulation subsystems is still preferred nowadays.
– Under the concept of Adaptive Coding & Modulation (ACM).
Coded modulations
● Another possibility of coded modulation, evolved from TCM and from the concatenated coding & iterative decoding framework, is Bit-Interleaved Coded Modulation (BICM).
– What about using an interleaver between the channel coder (normally a CC) and the modulation symbol mapper?
[Figure: CC encoder followed by a bit interleaver Π, before the symbol mapper.]
● BICM has already found applications in standards such as DVB-T2.
● The drawback is the higher latency and complexity of the decoding.
Coded modulations
● Examples of BICM.
Conclusions
Conclusions
●
Channel coding is a key enabling factor for modern digital
communications.
– All standards at the PHY level include one or more channel
coding methods.
– Modulation & coding are usually considered together to
guarantee a final performance level.
●
Error control comes out in two main flavors: ARQ and FEC.
– Nevertheless, hybrid strategies are becoming more and more
popular (HARQ).
●
A lot of research has been made, with successful results, to
approach Shannon's promises from the noisy-channel coding
theorem.
●
Concatenation and iterative (soft) decoding are pushing the
results towards channel capacity.
Conclusions
● Long-standing trends point towards the development of codes endowed with a less rich algebraic structure, relying more on statistical / probabilistic grounds.
● New outstanding proposals are being made, relying more intensively on randomness, as hinted by Shannon's demonstrations.
– New capacity-achieving alternatives, like polar codes and fountain codes, are step by step reaching the market, with promising prospects.
– Nonetheless, their processing needs make them more suitable for higher layers than the PHY.
● Though we are already approaching the limits of the Gaussian channel, there are still many challenges.
– In general, the wireless channel poses problems unresolved from the point of view of capacity calculation & exploitation through channel coding.
References
●
S. Lin, D. Costello, ERROR CONTROL CODING, Prentice
Hall, 2004.
●
S. B. Wicker, ERROR CONTROL SYSTEMS FOR
DIGITAL COMMUNICATION AND STORAGE, Prentice
Hall, 1995.
●
J. M. Cioffi, DIGITAL COMMUNICATIONS - CODING
(course), Stanford University, 2010. [Online] Available:
http://web.stanford.edu/group/cioffi/book