T5 Detection
1 Detection
We will work with a discrete-time representation of the digital data transmission. We will assume that the waveforms
ϕi(t) used to modulate the transmitted signal X(t) are shift-orthonormal for some symbol period T and that they use some bandwidth W, and we characterize the transmission rate of a given scheme in terms of symbols per channel use. Then, a
PAM signal carries one symbol per (real-valued) channel use, and a QAM signal one symbol per (complex-valued)
channel use.
Problem 1.1. Given a shift-orthonormal waveform ϕ(t), we transmit the PAM signal
$$X(t) = \sum_{i=1}^{n} X_i\,\phi(t - iT),$$
where X1 , . . . , X n are the output of the binary-to-symbol mapper given by the following rule
$$X_j = \begin{cases} +1 & (D_{2j-1}, D_{2j}) = (0,0) \\ +\tfrac{1}{2} & (D_{2j-1}, D_{2j}) = (1,0) \\ -\tfrac{1}{2} & (D_{2j-1}, D_{2j}) = (0,1) \\ -1 & (D_{2j-1}, D_{2j}) = (1,1) \end{cases} \tag{1.1}$$
for j = 1, 2, . . .. In this case, we take tuples of two bits to generate a real number, that is, two bits per symbol. At the
output of an ideal channel, we receive the signal Y(t) = ϕ(t − T) − ϕ(t − 2T) + ½ϕ(t − 3T) + ½ϕ(t − 4T). Find the
decoded sequence of bits D1 , . . . , D k and the transmission rate in bits per channel use.
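To make a mapping rule such as (1.1) concrete, the following minimal Python sketch (the function names are ours, not part of the notes) implements the binary-to-symbol mapper and its inverse demapper, assuming an ideal channel.

```python
# Minimal sketch of the binary-to-symbol mapper (1.1) and its inverse demapper.
# The function names and structure are illustrative, not prescribed by the notes.

MAPPER = {(0, 0): +1.0, (1, 0): +0.5, (0, 1): -0.5, (1, 1): -1.0}
DEMAPPER = {symbol: bits for bits, symbol in MAPPER.items()}

def bits_to_symbols(bits):
    """Map a bit sequence (even length) to 4-PAM symbols, two bits per symbol."""
    return [MAPPER[(bits[2 * j], bits[2 * j + 1])] for j in range(len(bits) // 2)]

def symbols_to_bits(symbols):
    """Recover the bit sequence from noiselessly received 4-PAM symbols."""
    bits = []
    for x in symbols:
        bits.extend(DEMAPPER[x])
    return bits

# Example: the two operations are inverses of each other over a perfect channel.
assert symbols_to_bits(bits_to_symbols([0, 0, 1, 1, 1, 0])) == [0, 0, 1, 1, 1, 0]
```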
Problem 1.2. Given a shift-orthonormal set of waveforms {ϕ1 , ϕ2 }, we transmit the QAM signal
$$X(t) = \sum_{i=1}^{n} \left( B_i\,\phi_1(t - iT) + C_i\,\phi_2(t - iT) \right).$$
At the output of a perfect channel, we receive the signal Y(t) = ϕ1 (t − T) + ϕ2 (t − T) − ϕ1 (t − 2T) + ϕ2 (t − 2T).
Find the decoded sequence of bits D1 , . . . , D k and the transmission rate in bits per (complex-valued) channel use.
In Problems 1.1 and 1.2, we have the perfect-channel assumption Y(t) = X(t), and we may transmit at the highest rate R = r bits per channel use, where r is the number of bits carried by each symbol, and recover the transmitted data sequence completely error-free. In practice, however, we will have Y(t) ≠ X(t). A new signal processing block, called the detector, is required to estimate the transmitted symbols, as shown in Figure 1.1.
Definition 1.1. A detector is a function gn(⋅) that estimates the transmitted symbols X1, . . . , Xn based on the demodulated symbols Y1, . . . , Yn:
$$(\hat{X}_1, \dots, \hat{X}_n) = g_n(Y_1, \dots, Y_n). \tag{1.3}$$
With some abuse of notation, we denote by X̂1, . . . , X̂n the estimated, decoded or detected symbols at the output of the detector, and by D̂1, . . . , D̂k the estimated, decoded or detected bits at the output of the demapper.
Figure 1.1: Receiver chain: the received signal Y(t) enters the demodulator, which produces Y1, . . . , Yn; the detector outputs X̂1, . . . , X̂n; and the symbol-to-bit demapper outputs D̂1, . . . , D̂k.
A measure of the reliability of a digital communication system is given in terms of the error probability.
Definition 1.2. The average symbol error probability is defined as
$$P_s = \frac{1}{n} \sum_{i=1}^{n} \Pr[\hat{X}_i \neq X_i]. \tag{1.4}$$
Error events in the detection process may be due to interference from other communication systems, imprecision in the synchronization, and other sources of noise. In the sequel, we will study the rather simple but informative characterization of noise by means of the additive white Gaussian noise (AWGN) model.
Recall that for baseband digital communications such as PAM where ϕ i (t) = ϕ(t − iT) is a shift-orthonormal
waveform, the received symbols Yi are given by Yi = X i + Z i , where Z i ∼ N (0, σ 2 ). For passband digital com-
munications such as QAM where we simultaneously receive two real-valued symbols (real and imaginary part)
represented as a complex-valued symbol, the received complex-valued symbol Yi is also given by Yi = Xi + Zi,
where now Z i ∼ NC (0, σ 2 ).
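As an illustration, a minimal simulation of this discrete-time AWGN channel might look as follows; we assume NumPy is available and that Z ~ N_C(0, σ²) has independent real and imaginary parts of variance σ²/2 each.

```python
import numpy as np

rng = np.random.default_rng(0)

def awgn_real(x, sigma2):
    """Real-valued AWGN channel: Y_i = X_i + Z_i with Z_i ~ N(0, sigma2)."""
    x = np.asarray(x, dtype=float)
    return x + rng.normal(scale=np.sqrt(sigma2), size=x.shape)

def awgn_complex(x, sigma2):
    """Complex AWGN channel: Z_i ~ N_C(0, sigma2), each part with variance sigma2 / 2."""
    x = np.asarray(x, dtype=complex)
    noise = rng.normal(scale=np.sqrt(sigma2 / 2), size=x.shape) \
          + 1j * rng.normal(scale=np.sqrt(sigma2 / 2), size=x.shape)
    return x + noise

# Example: pass a short 2-PAM sequence and a QPSK sequence through the channel.
y_pam = awgn_real([+1, -1, +1], sigma2=0.1)
y_qam = awgn_complex([1 + 1j, -1 - 1j], sigma2=0.1)
```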
Since the discrete-time AWGN channel is memoryless, we will assume that the output of the binary-to-symbol
mapper X1 , . . . , X n is also a sequence of i.i.d. random variables taking values over a constellation X with prob-
ability distribution Q(x). As a consequence, the received symbols Y1 , . . . , Yn will also be i.i.d. random variables
taking values over R or C with probability distribution P(y) = ∑x∈X W(y∣x)Q(x). Also, the average symbol error
probability (1.4) simplifies to
Ps = Pr[ X̂ ≠ X]. (1.5)
The average signal-to-noise ratio (SNR) of the discrete-time AWGN channel is defined as
$$\mathrm{SNR} = \frac{E_s}{\sigma^2}, \tag{1.6}$$
where σ² is the white Gaussian noise variance and Es is the average symbol energy of the constellation, given by
$$E_s = \sum_{x \in \mathcal{X}} Q(x)\,|x|^2. \tag{1.7}$$
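For instance, (1.6) and (1.7) can be evaluated numerically as in the short sketch below; the constellation and input distribution are only examples.

```python
def average_symbol_energy(constellation, Q):
    """Average symbol energy E_s = sum_x Q(x) |x|^2, as in (1.7)."""
    return sum(q * abs(x) ** 2 for x, q in zip(constellation, Q))

def snr(constellation, Q, sigma2):
    """Average signal-to-noise ratio E_s / sigma^2, as in (1.6)."""
    return average_symbol_energy(constellation, Q) / sigma2

# Example: QPSK with a non-uniform input distribution.
qpsk = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
Q = [0.1, 0.1, 0.7, 0.1]
print(average_symbol_energy(qpsk, Q))   # 2.0, since every QPSK point has |x|^2 = 2
print(snr(qpsk, Q, sigma2=0.5))         # 4.0
```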
1.2 Detector
We have characterized all the random variables involved in the discrete-time AWGN channel, namely, X1 , . . . , X n ,
Z1, . . . , Zn and Y1, . . . , Yn. Under the memoryless assumption for X, all these sequences are i.i.d., and the detector will also process each symbol independently. From Definition 1.1, this implies that a detector is a function
g(⋅) such that
X̂ = g(Y). (1.8)
Two important decision rules in detection theory that make use of the information about the channel transition
probability W and the channel input probability Q are the MAP and the ML detectors.
Definition 1.3. The maximum a posteriori (MAP) decision rule is given by
$$\hat{x}_{\mathrm{MAP}} = \arg\max_{x \in \mathcal{X}} \; Q(x)\,W(y \mid x). \tag{1.9}$$
Definition 1.4. The maximum likelihood (ML) decision rule is given by
$$\hat{x}_{\mathrm{ML}} = \arg\max_{x \in \mathcal{X}} \; W(y \mid x). \tag{1.10}$$
We first note that the ML and MAP rules are equivalent when Q(x) is the uniform distribution, i.e., when the symbols in the constellation are equiprobable with probability Q(x) = 1/∣X∣. A second remark is that, for the AWGN channel, the ML decision rule can be simplified to the minimum distance rule:
Definition 1.5. The minimum distance (MD) decision rule is given by
$$\hat{x}_{\mathrm{MD}} = \arg\min_{x \in \mathcal{X}} \; |y - x|^2. \tag{1.11}$$
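A minimal sketch of the three decision rules for a finite constellation over the AWGN channel is given below; the Gaussian likelihood assumes the complex-noise convention Z ~ N_C(0, σ²), and the function names and example values are ours.

```python
import math

def likelihood(y, x, sigma2):
    """AWGN transition density W(y|x) for complex noise Z ~ N_C(0, sigma2)."""
    return math.exp(-abs(y - x) ** 2 / sigma2) / (math.pi * sigma2)

def map_rule(y, constellation, Q, sigma2):
    """Maximum a posteriori: maximize Q(x) W(y|x)."""
    return max(constellation, key=lambda x: Q[x] * likelihood(y, x, sigma2))

def ml_rule(y, constellation, sigma2):
    """Maximum likelihood: maximize W(y|x)."""
    return max(constellation, key=lambda x: likelihood(y, x, sigma2))

def md_rule(y, constellation):
    """Minimum distance: minimize |y - x|^2."""
    return min(constellation, key=lambda x: abs(y - x) ** 2)

# Example: BPSK with a skewed prior; for an observation near the decision boundary,
# MAP picks the more probable symbol while ML and MD pick the closest one.
bpsk = [+1.0, -1.0]
Q = {+1.0: 0.2, -1.0: 0.8}
y = 0.05
print(map_rule(y, bpsk, Q, sigma2=1.0), ml_rule(y, bpsk, sigma2=1.0), md_rule(y, bpsk))
```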
Problem 1.3. Show that the ML and the MD decision rules are equivalent for the AWGN channel.
Problem 1.4. Consider a QPSK constellation X = {+1 + j, −1 + j, −1 − j, +1 − j} and channel input distribution
Q(x) = {0.1, 0.1, 0.7, 0.1}. Assume that we receive the signal Y(t) = 0.1 ⋅ ϕ1 (t − T) + 0.1 ⋅ ϕ2 (t − T). What is
the detected symbol x̂ according to the MAP, ML and MD rules?
In the previous examples, we have used a binary-to-symbol mapper that works at the highest possible rate of r =
log2(∣X∣) bits per symbol. When these symbols are modulated using PAM or QAM, this is equivalent to having R = log2(∣X∣) bits per channel use. The reader may convince themselves that, for any value of SNR, the average symbol
error probability Ps as in (1.5) cannot be made arbitrarily small by increasing n. If we add extra structure to the
binary-to-symbol mapper, we may reduce the error probability.
Problem 1.5. For a given set {ϕ1 , ϕ2 } of shift-orthonormal waveforms, we transmit the QAM signal
$$X(t) = \sum_{i=1}^{n} \left( B_i\,\phi_1(t - iT) + C_i\,\phi_2(t - iT) \right),$$
where the complex-valued symbols X i = B i + jC i belong to a QPSK constellation X = {+1 + j, −1 + j, −1 − j, +1 − j}.
At the transmitter side, the following bit-to-symbol mapper is used
$$(X_{3i-2}, X_{3i-1}, X_{3i}) = \begin{cases} (+1+j,\; +1+j,\; -1+j) & (D_{2i-1}, D_{2i}) = (0,0) \\ (+1+j,\; -1+j,\; -1+j) & (D_{2i-1}, D_{2i}) = (0,1) \\ (-1-j,\; +1-j,\; +1-j) & (D_{2i-1}, D_{2i}) = (1,0) \\ (-1-j,\; -1-j,\; +1-j) & (D_{2i-1}, D_{2i}) = (1,1). \end{cases} \tag{1.12}$$
At the output of a perfect channel, we receive the signal Y(t) that is demodulated to the received symbols (Y1 , Y2 , Y3 ) =
(−1 − j, +1 − j, +1 − j). Find the decoded sequence of bits D1 , . . . , D k and the transmission rate in bits per (complex-
valued) channel use.
Problem 1.6. We transmit the QAM signal X(t) = ∑ni=1 (B i ϕ1 (t − iT) + C i ϕ2 (t − iT)), where {ϕ1 , ϕ2 } is a shift-
orthonormal set of waveforms and the complex-valued symbols X i = B i + jC i belong to a BPSK constellation
X = {+1, −1}. At the transmitter side, the following bit-to-symbol mapper is used
$$X_i = \begin{cases} +1 & D_1 = 0 \\ -1 & D_1 = 1 \end{cases} \tag{1.13}$$
for all i = 1, . . . , n. At the output of a noisy channel, we receive a signal Y(t) that is demodulated to the received
symbols (Y1 , Y2 , Y3 ) = (0.1 − 0.6 j, −0.7 + j, 1 − j). Find the decoded sequence of bits D1 , . . . , D k using the minimum
distance rule (1.11) and the transmission rate in bits per (complex-valued) channel use. Assume that the probability of making a decision error with this modulation over a noisy channel under the minimum distance detector, when x was transmitted, is given by
$$\Pr[\hat{X} \neq x \mid X = x] = \epsilon, \qquad x \in \{+1, -1\}. \tag{1.14}$$
What is the average symbol error probability (1.4)?
In Problems 1.5 and 1.6, the transmission rates are respectively R = 2/3 and R = 1/3, instead of R = log2(∣X∣). These are examples of coded binary-to-symbol mappings of rate R < log2(∣X∣). An important result from information theory
is that, for a given channel W and input distribution Q, there exists a code of rate R bits per channel use such that
the error probability Ps → 0 as n → ∞ if and only if the rate is below the celebrated mutual information (capacity)
of the channel, that is
R < I(Q; W), (1.15)
where I(Q; W) is the mutual information of channel W with input distribution Q.
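As a sketch of how the right-hand side of (1.15) can be evaluated, the following Monte Carlo estimate of I(Q;W) for equiprobable 2-PAM over the real AWGN channel is one possibility; the estimator and the chosen sample size are ours, not part of the notes.

```python
import numpy as np

def mi_2pam_awgn(snr, n_samples=200_000, seed=0):
    """Monte Carlo estimate of I(Q;W) in bits per channel use for equiprobable
    2-PAM {+sqrt(Es), -sqrt(Es)} over the real AWGN channel with Es/sigma^2 = snr."""
    rng = np.random.default_rng(seed)
    es, sigma2 = snr, 1.0                      # fix sigma^2 = 1 and scale the symbol energy
    x = np.sqrt(es) * rng.choice([-1.0, 1.0], size=n_samples)
    y = x + rng.normal(scale=np.sqrt(sigma2), size=n_samples)
    w = lambda y, x: np.exp(-(y - x) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)
    p_y = 0.5 * w(y, np.sqrt(es)) + 0.5 * w(y, -np.sqrt(es))
    return float(np.mean(np.log2(w(y, x) / p_y)))

# Example: the mutual information saturates at 1 bit per channel use for large SNR.
for snr_value in (0.25, 1.0, 4.0, 16.0):
    print(snr_value, round(mi_2pam_awgn(snr_value), 3))
```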
We will next argue that one should consider a demodulator as well, a block that transforms the channel output into a form suitable for channel decoding, but which in general is going to be different from the detector. As context, let us consider the transmission of a repetition code of rate R = 1/r, where r is an integer, over a binary-input AWGN channel, that is, a channel where the modulation is 2-PAM (or BPSK) and the noise is additive complex Gaussian.
Let D denote a single bit to be sent over the channel and (C1 , C2 , . . . , Cr ) the output of the repetition code. In this
simple case, C1 = C2 = ⋯ = Cr = D. The encoded bits are mapped onto modulation symbols according to the
mapping rule:
$$X_i(C_i) = \begin{cases} -\sqrt{E_s}, & \text{if } C_i = 0, \\ +\sqrt{E_s}, & \text{if } C_i = 1, \end{cases} \tag{2.1}$$
where Es is the average symbol energy. In the channel output Yi = X i +Z i , for i = 1, . . . , r, the input X i is corrupted by
samples Zi of additive complex Gaussian noise of variance σ². The detector is a function that operates symbol
by symbol, producing X̂ i (Yi ). With no extra effort, we may subsume the inverse mapping operation of (2.1) into
the detector, so that the detector output is actually Ĉ i (Yi ). For a minimum-distance detector, we have the following
rule:
$$\hat{C} = \arg\min_{C \in \{0,1\}} \; |Y - X(C)|^2. \tag{2.2}$$
We now characterize the probability that Ĉ differs from the transmitted encoded bit C in a way that will easily extend
to consideration of the channel code. Denoting by C̄ the binary complement of C, that is the decision when we have
a detector error, i. e., Ĉ ≠ C, the error condition can be expressed as
$$|X(C) + Z - X(\bar{C})|^2 < |X(C) + Z - X(C)|^2, \tag{2.4}$$
where we used that Y = X(C) + Z. Expanding the squared modulus on the left-hand side, we obtain
$$|X(C) - X(\bar{C})|^2 + 2\,\mathrm{re}\{(X(C) - X(\bar{C}))\,Z^*\} < 0. \tag{2.5}$$
Now, since X(C̄) = −X(C), regardless of the value of C, and X is real-valued, the condition is further simplified to
$$|2X(C)|^2 + 4\,X(C)\,\mathrm{re}(Z^*) < 0. \tag{2.6}$$
Since re(Z∗) has a real-valued Gaussian density of mean zero and variance ½σ², and X(C) ∈ {±√Es}, the product X(C) re(Z∗) is a real-valued Gaussian of mean zero and variance ½Es σ². The condition in (2.6) is thus equivalent to this Gaussian being lower than −Es, and the detector error probability is thus given by:
$$\Pr\{\hat{C} \neq C\} = \frac{1}{\sqrt{\pi E_s \sigma^2}} \int_{-\infty}^{-E_s} e^{-\frac{t^2}{E_s \sigma^2}}\, dt. \tag{2.7}$$
With the change of variable u = −t/√(Es σ²), we obtain the closed-form expression
$$\Pr\{\hat{C} \neq C\} = \frac{1}{2}\,\mathrm{erfc}\!\left(\sqrt{\frac{E_s}{\sigma^2}}\right). \tag{2.8}$$
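The closed-form expression (2.8) can be checked against a direct simulation of the symbol-by-symbol detector; the sketch below assumes the same complex-noise convention used in the derivation, and the sample size and SNR point are arbitrary.

```python
import numpy as np
from math import erfc, sqrt

def detector_error_rate(es, sigma2, n=500_000, seed=0):
    """Simulate the symbol-by-symbol minimum-distance detector of (2.2)
    for the mapping (2.1) with complex noise Z ~ N_C(0, sigma2)."""
    rng = np.random.default_rng(seed)
    c = rng.integers(0, 2, size=n)
    x = np.where(c == 0, -sqrt(es), +sqrt(es))
    z = rng.normal(scale=sqrt(sigma2 / 2), size=n) \
      + 1j * rng.normal(scale=sqrt(sigma2 / 2), size=n)
    y = x + z
    c_hat = (np.abs(y - sqrt(es)) ** 2 < np.abs(y + sqrt(es)) ** 2).astype(int)
    return float(np.mean(c_hat != c))

es, sigma2 = 1.0, 0.5
print(detector_error_rate(es, sigma2))     # simulated error rate
print(0.5 * erfc(sqrt(es / sigma2)))       # closed-form expression (2.8)
```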
The decoder must now estimate the transmitted bit D from the sequence (Ĉ1, . . . , Ĉr). A reasonable choice is to take a decision according to majority voting, that is, choosing D̂ = 0 if there are more zeros than ones in the sequence of detected bits, and D̂ = 1 in the opposite case, where there are more ones than zeros; ties for r even are broken with a random decision. In our study of Mutual Information, we observed that this majority-rule decision is valid for p < ½, a condition verified by (2.8). There, we analyzed the error probability Pe = Pr{D̂ ≠ D} of a repetition code of rate R = 1/r, where r is an integer, transmitted over the bit-flipping channel with flip probability p, and obtained the following:
$$P_e = \begin{cases} \dfrac{1}{2}\dbinom{r}{r/2} p^{r/2}(1-p)^{r/2} + \displaystyle\sum_{k=r/2+1}^{r} \dbinom{r}{k} p^k (1-p)^{r-k} & r \text{ even} \\[2ex] \displaystyle\sum_{k=\lceil r/2 \rceil}^{r} \dbinom{r}{k} p^k (1-p)^{r-k} & r \text{ odd}. \end{cases} \tag{2.9}$$
In particular, for r = 2, one has Pe = p = ½ erfc(√(Es/σ²)).
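For reference, (2.9) can be evaluated directly; the helper below is a plain transcription of the formula (the function name is ours).

```python
from math import comb, ceil

def majority_vote_error(p, r):
    """Error probability (2.9) of majority voting over a bit-flipping channel with
    crossover probability p, with random tie-breaking for even r."""
    if r % 2 == 0:
        tie = 0.5 * comb(r, r // 2) * p ** (r // 2) * (1 - p) ** (r // 2)
        return tie + sum(comb(r, k) * p ** k * (1 - p) ** (r - k)
                         for k in range(r // 2 + 1, r + 1))
    return sum(comb(r, k) * p ** k * (1 - p) ** (r - k)
               for k in range(ceil(r / 2), r + 1))

# Sanity check: for r = 1 and r = 2 the expression reduces to p, as stated above.
print(majority_vote_error(0.1, 1), majority_vote_error(0.1, 2), majority_vote_error(0.1, 3))
```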
However, an alternative decision rule is to make a decision on D̂ directly from the sequence (Y1, . . . , Yr), without first detecting the modulation symbols or the corresponding encoded bits. In this case, the minimum-distance rule (2.2) generalizes to
$$\hat{D} = \arg\min_{D \in \{0,1\}} \; \|\mathbf{Y} - \mathbf{X}(D)\|^2, \tag{2.10}$$
where ∥Y − X(D)∥² denotes the squared r-dimensional Euclidean distance between the vectors Y = (Y1, . . . , Yr) and X(D) = (X1(D), . . . , Xr(D)). This distance is given by
$$\|\mathbf{Y} - \mathbf{X}(D)\|^2 = \sum_{i=1}^{r} |Y_i - X_i(D)|^2. \tag{2.11}$$
Denoting by D̄ the binary complement of D, that is, the decision when we have a channel decoding error, i.e., D̂ ≠ D, the error condition can be expressed as
$$\|\mathbf{X}(D) + \mathbf{Z} - \mathbf{X}(\bar{D})\|^2 < \|\mathbf{X}(D) + \mathbf{Z} - \mathbf{X}(D)\|^2. \tag{2.12}$$
Problem 2.2. Redo the analysis leading from (2.4) to the detector error probability estimate (2.8) to find that
$$P_e = \Pr\{\hat{D} \neq D\} = \frac{1}{2}\,\mathrm{erfc}\!\left(\sqrt{r\,\frac{E_s}{\sigma^2}}\right). \tag{2.13}$$
∎
For r = 2, the decoder that makes a decision based upon the detected symbols has a probability of error ½ erfc(√(Es/σ²)), whereas the decoder that makes a decision based on the Euclidean distance between the received vector and the transmitted sequence has the lower error probability Pe = ½ erfc(√(2Es/σ²)) in (2.13); this is a consequence of the fact that erfc(x) is a decreasing function of the variable x. Interestingly, the error probability is the same as that obtained by doubling the signal energy. A detailed comparison is considered in the following problem.
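A short numerical illustration of this comparison for r = 2 (the SNR value below is arbitrary, chosen only for illustration):

```python
from math import erfc, sqrt

es_over_sigma2 = 2.0   # an arbitrary SNR point
hard = 0.5 * erfc(sqrt(es_over_sigma2))       # detect-then-vote with r = 2: P_e = p
soft = 0.5 * erfc(sqrt(2 * es_over_sigma2))   # vector minimum-distance rule, (2.13) with r = 2
print(hard, soft)   # the soft rule behaves as if the symbol energy had been doubled
```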
Problem 2.3. Plot as a function of the signal-to-noise ratio Es /σ 2 the error probabilities in (2.9) and (2.13) for r = 2,
r = 3, and r = 4, and compare the different curves. The plot is best done in log-log scale, where the error probability
is represented in logarithmic scale and the signal-to-noise ratio is expressed in decibels. What is the lowest signal-
to-noise ratio compatible with reliable communication (i. e. transmission below capacity) for each value of r? ∎
Problem 2.4. Compare the mutual information of 2-PAM over the Gaussian channel with signal-to-noise ratio Es/σ² to the capacity of a bit-flipping channel with flip probability p = ½ erfc(√(Es/σ²)), and relate what you find to the previous problem. ∎
Taking into account the results from the previous two problems leads to the observation that one also has to consider a
demodulator that does not make a decision on the transmitted symbols, but rather keeps probabilistic information
about the likelihood of having sent one symbol or another, based on the observed channel output. The formalization
of this idea is left to the reader. We end with a review of how the minimum-Euclidean-distance decoder appears in
the context of receiver design.
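Before moving on, one common way to make this idea concrete is to keep, for each channel output, the log-likelihood ratio (LLR) of the two possible encoded bits rather than a hard decision; the sketch below, for the mapping (2.1) over the complex AWGN channel, uses our own names and sign convention.

```python
import numpy as np

def bit_llrs(y, es, sigma2):
    """Per-symbol log-likelihood ratios log W(y|C=1) - log W(y|C=0) for the mapping (2.1),
    with C=0 -> -sqrt(Es), C=1 -> +sqrt(Es) and complex noise Z ~ N_C(0, sigma2)."""
    y = np.asarray(y, dtype=complex)
    return 4.0 * np.sqrt(es) * y.real / sigma2

# Example: summing the LLRs of a repetition code and deciding on the sign of the sum
# reproduces the minimum-distance decision (2.10) without intermediate hard decisions.
llrs = bit_llrs([0.3 - 0.2j, -0.1 + 0.4j, 0.5 + 0.1j], es=1.0, sigma2=0.5)
d_hat = int(np.sum(llrs) > 0)
```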
3 Maximum a posteriori receiver
For a repetition code of rate 1/r, the Gaussian channel receiver considered in Section 2 estimated the transmit-
ted bit as that whose transmitted sequence (X1 , . . . , Xr ) was closest in Euclidean distance to the channel output
(Y1 , . . . , Yr ). While this operation lends itself to actual implementation in a practical system, it obscures the reason
why one should consider such a decoder, other than its ease of implementation and of description. As we saw in
the first part of the course, the underlying mathematical operation behind this minimum-distance decoding is the
Maximum a Posteriori principle.
According to the Maximum a Posteriori principle, one selects the bit value whose a posteriori probability is largest for a given channel output. In the problem of Section 2, Eq. (2.10) stems from the application of this rule:
$$\hat{D} = \arg\max_{D \in \{0,1\}} P(D \mid Y_1, \dots, Y_r), \tag{3.1}$$
where P(D∣Y1, . . . , Yr) is the a posteriori probability of D given the output (Y1, . . . , Yr). We verify the equivalence of both rules by first noting that P(D∣Y1, . . . , Yr) can be rewritten by using Bayes' rule as
$$P(D \mid Y_1, \dots, Y_r) = \frac{P(Y_1, \dots, Y_r \mid D)\,P(D)}{P(Y)} \tag{3.2}$$
$$= \frac{P(Y_1, \dots, Y_r \mid X_1(D), \dots, X_r(D))\,P(D)}{P(Y)} \tag{3.3}$$
$$= \frac{W(Y_1 \mid X_1(D)) \cdots W(Y_r \mid X_r(D))\,P(D)}{P(Y)}, \tag{3.4}$$
where we explicitly introduced the channel transition probability W(y∣x). Under the further assumption that P(D) = ½ (more on this point will follow later), and observing that P(Y) does not depend on D, so that its specific value does not affect the decision, we note that the Maximum a Posteriori decision rule is nothing but the Maximum Likelihood criterion, namely
$$\hat{D} = \arg\max_{D \in \{0,1\}} P(Y_1, \dots, Y_r \mid D) \tag{3.5}$$
$$= \arg\max_{D \in \{0,1\}} \prod_{i=1}^{r} W(Y_i \mid X_i(D)), \tag{3.6}$$
where the first equation is the general definition and the second its application to the repetition code.
Problem 3.1. Use the exact form of the channel transition probability for the Gaussian channel to verify the equiva-
lence between (3.5) and (2.10). You may use the fact that monotonic transformations of the likelihood, e. g. rescaling
or taking logarithms, do not affect the decision on which bit is selected.
How does the decision rule change if we consider a real-valued channel rather than a complex-valued channel? ∎
Problem 3.2. Use the exact form of the channel transition probability for the bit-flipping channel with flip probability p to verify the equivalence between (3.5) and a majority-vote decoder when p < ½ and a minority-vote decoder when p > ½. Argue that for p = ½, a random decoder that completely ignores the channel output is the Maximum a Posteriori decoder. ∎
Problem 3.3. Use the exact form of the channel transition probability for the bit-erasing channel with erasure probability p to describe
the decoding rule (3.5) in this channel. Can you find a simple description in words of how this decoding rule
operates? ∎
Problem 3.4. For a binary-input additive exponential noise channel, where the inputs are {0, 2A} with probability ½ each, and the channel transition probability is given by the formula
$$W(y \mid x) = \begin{cases} e^{-(y-x)}, & y \geq x, \\ 0, & y < x, \end{cases} \tag{3.7}$$
describe the decoding rule (3.5) in this channel. Can you find a simple description in words of how this decoding
rule operates? How does the answer change if Pr{D = 0} = ε0 ? ∎