Lecture “Channel Coding”
Chapter 2: Channel Coding Principles
Prof. Dr.-Ing. Antonia Wachter-Zeh
Outline of this Chapter
1. Transmission Model
2. Channel Models:
BSC & QSC, BEC & QEC, AWGN
3. Decoding Principles:
MAP, ML, symbol-by-symbol MAP
4. Definition of the Hamming Metric
5. Decoding Principles in the Hamming Metric:
error detection, erasure correction, unique error correction, nearest codeword decoding, list decoding
6. Decoding Results and Error Probability
Antonia Wachter-Zeh (TUM) 2
Transmission Model
[Block diagram: source → u → encoder → c → channel (+) → r → decoder → ĉ → sink (û)]
• The information symbols u0 , . . . , uk−1 are from a finite alphabet A and they are statistically independent
and equi-probable.
• The encoder maps the k information symbols on a valid codeword of length n (one-to-one mapping).
• We consider only additive discrete memoryless channels (DMCs): P(r|c) = ∏_{i=0}^{n−1} P(ri |ci )
• The additive DMC summarizes filtering, modulation, demodulation, sampling and quantization (D/A and
A/D conversion).
Channel Models: BSC
Definition: Binary Symmetric Channel (BSC)
The memoryless BSC has input and output alphabet {0, 1}. For every pair of symbols (r , c) ∈ {0, 1}2 ,
we have:
P(r |c) = { 1 − p   if r = c
         { p       if r ≠ c,
where p, with 0 ≤ p ≤ 1, is called the crossover probability.
• The BSC flips each bit with probability p
• If p = 0, or p = 1 and this is known: reliable communication (for p = 1, simply invert every bit)
• If p = 1/2: output is statistically independent of input
[Illustration: see blackboard]
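The bit-flipping behaviour of the BSC is easy to simulate. A minimal Python sketch (not part of the lecture; the function name `bsc` is our own):

```python
import random

def bsc(codeword, p, rng=random):
    """Pass a binary word through a BSC: flip each bit
    independently with crossover probability p."""
    return [c ^ (rng.random() < p) for c in codeword]
```

For p = 0 the word passes unchanged; for p = 1 every bit is flipped deterministically, so the channel is again perfectly predictable, matching the remark above.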
Channel Models: QSC
Definition: q-ary Symmetric Channel (QSC)
The memoryless QSC has an input and output alphabet A of size q. For every pair of symbols (r , c) ∈ A2 ,
we have:
P(r |c) = { 1 − p        if r = c
         { p/(q − 1)    if r = c′, for every c′ ∈ A \ {c},
where p, with 0 ≤ p ≤ 1, is called the crossover probability.
• The QSC changes each symbol with probability p
• If p = 0: reliable communication
• If p = (q − 1)/q: output is statistically independent of input
[Illustration: see blackboard]
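The QSC can be simulated the same way; a sketch assuming symbols 0, …, q − 1 (the function name `qsc` is our own):

```python
import random

def qsc(word, p, q, rng=random):
    """Pass a q-ary word (symbols 0..q-1) through a QSC: with
    probability p replace a symbol by one of the other q-1 symbols,
    each chosen with probability p/(q-1)."""
    out = []
    for c in word:
        if rng.random() < p:
            out.append(rng.choice([s for s in range(q) if s != c]))
        else:
            out.append(c)
    return out
```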
Channel Models: BEC & QEC
Definition: Binary Erasure Channel (BEC)
The memoryless BEC has input alphabet A = F2 = {0, 1} and output alphabet {0, 1, ~}, where ~ denotes
an erasure. For every pair of symbols (r , c) ∈ {0, 1, ~} × F2 , we have:
P(r |c) = { 1 − ε   if r = c
         { ε       if r = ~
         { 0       if r ≠ c, r ≠ ~,
where ε, with 0 ≤ ε ≤ 1, is called the erasure probability.
QEC: generalization to q-ary alphabets with same P(r |c).
[Illustration: see blackboard]
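A minimal simulation sketch of the BEC (our own names; `'~'` stands in for the erasure symbol ~):

```python
import random

ERASURE = '~'  # stands in for the erasure symbol

def bec(codeword, eps, rng=random):
    """Pass a binary word through a BEC: erase each bit
    independently with erasure probability eps."""
    return [ERASURE if rng.random() < eps else c for c in codeword]
```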
Channel Models: AWGN Channel
Definition: Additive White Gaussian Noise (AWGN) Channel
The AWGN channel is an additive channel that adds to each input symbol ci a noise value ei ∈ R. The
noise value ei follows a normal distribution with mean µ = 0, and the noise process has constant power
spectral density (uniform over all frequencies, hence “white”).
• In this lecture: only discrete channels.
=⇒ The decoder obtains a vector with elements from the same finite alphabet as the encoder outputs
(and maybe erasures).
=⇒ This is called hard decision decoding.
• If the decoder obtains an “analog” value (e.g., from the AWGN channel), this value can be used to improve
the decoding performance, since some values are more reliable than others.
=⇒ This is called soft decision decoding.
[Illustration: see blackboard]
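The difference between hard and soft decision can be made concrete with a BPSK mapping over the AWGN channel (a sketch under our own modelling assumptions; the lecture itself restricts attention to discrete channels):

```python
import random

def awgn_bpsk(bits, sigma, rng=random):
    """Map bits to BPSK symbols (0 -> +1, 1 -> -1) and add Gaussian
    noise with mean 0 and standard deviation sigma. The returned
    real values are the 'soft' channel outputs."""
    return [(1 - 2 * b) + rng.gauss(0, sigma) for b in bits]

def hard_decision(soft):
    """Quantize each soft value back to a bit (hard decision):
    positive -> 0, non-positive -> 1. The reliability information
    |soft value| is thrown away."""
    return [0 if y > 0 else 1 for y in soft]
```

A soft-decision decoder would work on the real values directly instead of quantizing them first.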
Decoding
[Block diagram: source → u → encoder → c → channel (adds e) → r → decoder → ĉ → sink (û)]
• The channel decoder is the main challenge in channel coding.
• Due to the one-to-one mapping of u to c, once we have ĉ, we also know û at the sink.
Remark:
Vectors as a = (a0 , a1 , . . . , an−1 ) denote row vectors.
Column vectors are denoted by aT .
Decoding Principles: MAP
Task of all decoders: given the channel output r = (r0 , . . . , rn−1 ), guess the channel input (= a codeword)
c = (c0 , . . . , cn−1 ).
• Here: given a received word r, we decide for the most likely codeword ĉ from a given code C:
ĉ = arg max_{c∈C} P(c|r) = arg max_{c∈C} P(r|c)·P(c) / P(r)    (using Bayes’ rule)
• Since P(r) is fixed during one decoding, we get the following MAP decoding rule.
Definition: Maximum A-Posteriori (MAP) Decoding
Given the received word r, a MAP decoder outputs:
ĉ = arg max_{c∈C} P(r|c)·P(c).
Decoding Principles: ML
• Given a channel output word r, we decide for the most likely codeword from a given code C:
ĉ = arg max_{c∈C} P(c|r) = arg max_{c∈C} P(r|c)·P(c) / P(r)
• P(r) is fixed.
• Assume that all codewords are equi-probable or P(c) is unknown.
Definition: Maximum Likelihood (ML) Decoding
Given the received word r, an ML decoder outputs:
ĉ = arg max_{c∈C} P(r|c).
=⇒ MAP and ML (if P(c) = const.) decoders choose the most likely codeword and therefore minimize the
block error probability.
Drawback: large complexity (we have to go through all codewords) =⇒ not practical!
[Examples for MAP and ML decoders: see blackboard]
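On a BSC with p < 1/2 we have P(r|c) = p^d (1 − p)^{n−d} with d = d(r, c), so ML decoding reduces to searching for the codeword at minimum Hamming distance. A brute-force sketch with a toy repetition code (names are our own); it also illustrates the complexity drawback, since it loops over the whole code:

```python
def hamming_dist(a, b):
    """Number of positions in which a and b differ."""
    return sum(x != y for x, y in zip(a, b))

def ml_decode_bsc(r, code):
    """Brute-force ML decoding over a BSC with p < 1/2:
    P(r|c) = p^d (1-p)^(n-d) is maximal for the codeword with the
    smallest Hamming distance d to r, so we minimize the distance."""
    return min(code, key=lambda c: hamming_dist(r, c))

code = [(0, 0, 0), (1, 1, 1)]  # length-3 repetition code
```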
Decoding Principles: Symbol-by-Symbol MAP
• The MAP decoding rule can also be applied to each symbol instead of the whole word.
• To decide for the symbol ĉi , ∀i ∈ {0, . . . , n − 1}, we decide on the most likely symbol by summing up the
probabilities of all codewords with a 0 or a 1 at position i.
Definition: (Binary) Symbol-by-Symbol MAP Decoding
Given the received word r, a symbol-by-symbol MAP decoder outputs ĉ = (ĉ0 , . . . , ĉn−1 ), where
ĉi = { 0   if Σ_{c∈C: ci =0} P(r|c)·P(c) ≥ Σ_{c∈C: ci =1} P(r|c)·P(c)
     { 1   otherwise,
for i = 0, . . . , n − 1.
However, the output word ĉ is not necessarily a valid codeword!
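A brute-force sketch of this rule for a BSC with equiprobable codewords (our own toy example; the even-weight code below also shows that the output need not be a codeword):

```python
def sxs_map_bsc(r, code, p):
    """Binary symbol-by-symbol MAP decoding over a BSC, assuming
    equiprobable codewords: for each position i, compare the summed
    likelihoods of all codewords with a 0 vs. a 1 at position i."""
    n = len(r)

    def likelihood(c):
        d = sum(x != y for x, y in zip(r, c))
        return p ** d * (1 - p) ** (n - d)

    c_hat = []
    for i in range(n):
        s0 = sum(likelihood(c) for c in code if c[i] == 0)
        s1 = sum(likelihood(c) for c in code if c[i] == 1)
        c_hat.append(0 if s0 >= s1 else 1)
    return c_hat

# all even-weight words of length 3:
code = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
# for r = (1, 1, 1) and p = 0.1 the decoder outputs [1, 1, 1],
# which has odd weight and is therefore not a codeword
```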
Hamming Weight & Distance
Definition: Hamming Weight
The Hamming weight wt(a) of a vector a = (a0 , a1 , . . . , an−1 ) is the number of its non-zero entries:
wt(a) ≜ |{i : ai ≠ 0, i = 0, . . . , n − 1}|.
Definition: Hamming Distance
The Hamming distance d(a, b) between two words a, b is the number of coordinates where a and b are
different:
d(a, b) ≜ |{i : ai ≠ bi , i = 0, . . . , n − 1}|.
• d(a, b) = wt(a − b)
• A code with minimum Hamming distance d is a set of words where any two codewords have Hamming
distance at least d.
[Examples: see blackboard]
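Both definitions translate directly into code; a small sketch (function names are our own) that also checks the identity d(a, b) = wt(a − b), where subtraction over F2 is XOR:

```python
def wt(a):
    """Hamming weight: number of non-zero entries of a."""
    return sum(x != 0 for x in a)

def dist(a, b):
    """Hamming distance: number of positions where a and b differ."""
    return sum(x != y for x, y in zip(a, b))

# over F2, subtraction is XOR, so d(a, b) = wt(a - b) becomes:
a, b = [0, 1, 1, 0], [1, 1, 0, 0]
assert dist(a, b) == wt([x ^ y for x, y in zip(a, b)])
```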
Decoding Principles: Error Detection
Theorem: Error Detection
If c ∈ C, where C is a code of minimum distance d, is transmitted and the channel adds an error with
wt(e) ≤ d − 1, then the decoder can always detect whether an error occurred or not.
[Illustration: codewords c(1), c(2), c(3) with received word r inside the sphere of radius d − 1 around c(1)]
However, (usually) we cannot correct the error!
Decoding Principles: Erasure Correction
Theorem: Erasure Correction
If c ∈ C, where C is a code of minimum distance d, is transmitted and the channel erases at most d − 1
symbols (i.e., we know the locations), then the decoder can always correct the erasures.
• Any two codewords differ in at least d symbols
• If we erase d − 1 fixed positions in all codewords, any two codewords still differ in at least one symbol
=⇒ c can still be determined
[Illustration: codewords c(1), c(2), c(3) and received word r]
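The argument above can be checked directly: keep only the non-erased positions and see which codewords still match. A sketch with a toy repetition code of minimum distance d = 3 (names are our own):

```python
ERASURE = '~'

def correct_erasures(r, code):
    """Return all codewords that agree with r on every non-erased
    position. If at most d-1 symbols are erased, exactly one
    codeword survives, so the erasures are corrected."""
    known = [i for i, x in enumerate(r) if x != ERASURE]
    return [c for c in code if all(c[i] == r[i] for i in known)]

code = [(0, 0, 0), (1, 1, 1)]  # repetition code, d = 3
# erasing d - 1 = 2 positions still leaves a unique match
```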
Decoding Principles: Unique Error-Correction
Theorem: Unique Decoding
If c ∈ C, where C is a code of minimum distance d, is transmitted and the channel adds errors with
wt(e) ≤ ⌊(d − 1)/2⌋, then the decoder can always correct the errors.
• Reason: The decoding spheres of radius ⌊(d − 1)/2⌋ around the codewords do not overlap.
• Unique decoding is also called Bounded Minimum Distance (BMD) Decoding.
[Illustration: non-overlapping spheres of radius ⌊(d − 1)/2⌋ around codewords c(1), c(2), c(3); received word r]
Decoding Principles: Nearest Codeword Decoding
• A nearest codeword decoder decides for the “closest” codeword, i.e., one within smallest Hamming
distance of r.
• If more than one codeword is at the same distance, it randomly decides for one of them.
• If fewer errors are more likely than more errors (e.g., in a BSC with p < 1/2), then a nearest codeword
decoder is an ML decoder.
[Illustration: codewords c(1), c(2), c(3) and received word r]
Decoding Principles: List Decoding
Definition: τ-List Decoder
Given a received word r, a τ-list decoder returns all codewords around r within radius τ.
• The size of the output list is called the list size.
• The maximum list size depends on τ.
• If τ exceeds a certain value, the list size can grow exponentially in the code length (such a decoder has a
huge complexity and is practically not feasible).
[Illustration: codewords c(1), c(2), c(3) and received word r]
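A brute-force τ-list decoder is a one-liner; a sketch with a toy code of our own where the list has size greater than one:

```python
def list_decode(r, code, tau):
    """tau-list decoder: return every codeword within Hamming
    radius tau of the received word r."""
    return [c for c in code
            if sum(x != y for x, y in zip(r, c)) <= tau]

code = [(0, 0, 0, 0), (1, 1, 1, 1), (0, 0, 1, 1)]
# r = (0, 0, 0, 1) lies at distance 1 from two codewords,
# so a 1-list decoder returns a list of size 2
```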
Decoding Results
The decoder outputs a word ĉ for which one of the following three cases holds.
• Correct decoding: The decoder output ĉ equals the channel input.
• Miscorrection: The decoder output ĉ is a valid codeword, but not the channel input. This case cannot be
detected at the receiver side.
(Its probability will be denoted by Perr )
• Decoding failure: The decoder output is not a valid codeword. The receiver therefore detects that the
output is erroneous.
(Its probability will be denoted by Pfail )
Error Probability
• Block error probability: Pblock (c) = Perr (c) + Pfail (c) = Σ_{r: Dec(r)≠c} P(r|c)
Calculation for the BSC:

Pblock (c) ≤ Σ_{i=⌊(d−1)/2⌋+1}^{n} (n choose i) · p^i · (1 − p)^{n−i}

Practically, some errors of weight beyond ⌊(d − 1)/2⌋ might be correctable.
• Symbol error probability (codeword): Psym (ci ) = Σ_{r: ĉi ≠ci } P(r|c)
• Symbol error probability (information): Psym (ui ) = Σ_{r: ûi ≠ui } P(r|c)
Remark
The symbol (bit) error probability depends on the explicit mapping between information and codewords,
the block error probability does not.
[Example: see blackboard]
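The BSC bound above can be evaluated numerically; a small sketch (function name is our own):

```python
from math import comb

def bsc_block_error_bound(n, d, p):
    """Upper bound on the block error probability of a BMD decoder
    over a BSC: sum the probabilities of all error patterns whose
    weight exceeds the unique-decoding radius floor((d-1)/2)."""
    t = (d - 1) // 2
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(t + 1, n + 1))

# e.g. the length-3 repetition code (d = 3) over a BSC with p = 0.1:
# bound = 3 * 0.1**2 * 0.9 + 0.1**3 = 0.028
```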