Turbo Equalization: An Overview

Article in IEEE Transactions on Information Theory · March 2011


DOI: 10.1109/TIT.2010.2096033 · Source: IEEE Xplore

Michael Tüchler (Rheinmetall Air Defence AG) and Andrew C. Singer (University of Illinois, Urbana-Champaign)

Abstract

Turbo codes and the iterative algorithm for decoding them sparked a new era in the theory and practice of error
control codes. Turbo equalization followed as a natural extension to this development, as an iterative technique for
detection and decoding of data that has been both protected with forward error correction and transmitted over a
channel with intersymbol interference (ISI). In this paper, we review the turbo equalization approach to coded data
transmission over ISI channels, with an emphasis on the basic ideas, some of the practical details, and many of
the research directions that have arisen from this offshoot of the original turbo decoding algorithm, introduced by
Douillard, et al. The subsequent relaxation of the maximum a posteriori (MAP) equalization algorithm to include
linear and other simpler receivers sparked a decade and a half of iterative algorithms spanning research problems
from trellis coded modulation to underwater acoustic communications.

I. TURBO EQUALIZATION OVERVIEW

In this paper, we discuss the turbo equalization approach to coded data transmission over intersymbol interference
(ISI) channels. Our emphasis is on the basic ideas and some of the practical details involved in making use of this
offshoot of turbo-decoding that has sparked a number of interesting new research directions in the fifteen years
since its discovery. We begin with a high-level overview of turbo equalization and turn our attention to a digital
communication system, whose transmitter is depicted in Fig. 1. The components in this system are typical of most
practical digital communication links and are essential to the application of turbo equalization in the receiver.

[Block diagram: a_n → Encoder → b_n → Interleaver → c_n → Mapper → x_n → Channel → y_n]

Fig. 1. Transmitter configuration of a digital communication system.

April 25, 2010 DRAFT



The receiver for this system must estimate the data that was transmitted, making use of knowledge of the channel
together with the available redundancy introduced to protect the data, in the form of forward error correction
(FEC). Mitigating the effects of an inter-symbol interference (ISI) channel on the transmitted data is generally
called equalization or detection, while the subsequent problem of recovering the data from the equalized symbols,
making use of the FEC, is called decoding. For complexity reasons, these problems have typically been considered
separately, with limited interaction between the two blocks. As such, substantial performance degradation can be
induced. The main contribution of much of the work in turbo equalization to date has been to enable feasible
approaches to jointly addressing the equalization and decoding tasks. As a result, the performance gap between an
optimal joint equalization and decoding and that achievable through systems with practical complexity has been
narrowed in a manner similar to that of near Shannon-limit communications using turbo codes [1].
This paper is organized as follows: Section I begins with an overview of turbo equalization and shows some of
the many applications and extensions to the original formulation of [2] that have arisen over the last decade and a
half. For ease of description, Section II provides an overview of turbo equalization as applied to systems using BPSK
modulation, and Section III applies these developments to more general QAM alphabets. Implementation issues
including low-complexity approximations of the receiver structures are described in Section IV. The performance
enhancing benefits of precoding are discussed in Section V. EXIT chart analysis is described in Section VI as a
tool for approximate analysis of the convergence of turbo equalization and related algorithms.
An equalizer produces estimates of the transmitted symbols and, for complexity reasons, often consists of linear
processing of the received signal and possibly past symbol estimates. The parameters of these equalizers can be
optimized using a variety of optimization criteria, such as zero forcing (ZF) or minimum mean squared error
(MMSE) estimation [3, 4]. Optimal methods for minimizing the bit error rate (BER) or the sequence error rate
(SER) are highly nonlinear and are based on maximum likelihood (ML) estimation, which turns into maximum
a-posteriori probability (MAP) estimation in the presence of a-priori information about the transmitted data. Efficient
algorithms exist for MAP/ML sequence estimation, such as the Viterbi algorithm [3, 5, 6], and for MAP/ML symbol
estimation, such as the BCJR algorithm [7]. However, the complexity of such methods often remains significantly
higher than that of linear methods. This is true in particular for channels with large delay spread (memory) $M_h$ or
given a large size $|S|$ of the signal alphabet, since the receiver complexity is of order $O(|S|^{M_h})$ per symbol.
State-of-the-art systems for a variety of communication channels employ convolutional codes and ML equalizers together
with an interleaver after the encoder and a deinterleaver before the decoder [8, 9]. Interleaving shuffles symbols
within a given time frame or block of data in attempts to decorrelate error events introduced, or unresolved, by the
equalizer between neighboring symbols. These error bursts are hard to mitigate, for example, using a convolutional
decoder alone.
A number of iterative receiver algorithms repeat the equalization and decoding tasks on the same set of received
data, where feedback information from the decoder is incorporated into the equalization process. This method is
the basis of turbo equalization and is based on decoding methods for turbo codes [10–12]. This approach has
been adapted to various communication tasks, ranging from detection and decoding of trellis coded modulation
(TCM) [13, 14] and code division multiple access (CDMA) [15] to underwater acoustic communications [16–19]
and optical fiber communications [20–22]. Over the last decade and a half, this joint, iterative approach has been
applied to a vast array of problems that involve the use of an error control code to aid in the processing of another
related task in the receive chain of a digital communication link. Turbo equalization systems were first proposed in
[2], by noting that the transmitter in Fig. 1 can be viewed as employing a serially concatenated convolutional code
(turbo code), where the inner code is taken over the reals (by means of the ISI channel), and further developed
in a number of articles, including [23–26]. In some of these systems, MAP-based techniques, as well as a Viterbi
algorithm producing soft output information [27], were used for both equalization and decoding [2, 23]. The more
complex BCJR algorithm [7] was implemented in [23]. These tasks are similar enough in structure, that in [28], an
architecture was developed such that the same circuitry could be used for both, after which other high-throughput
circuit implementations have also appeared [29, 30].
The approaches in [25, 26, 31–33] address a major shortcoming of the classical turbo equalization scheme [2, 23,
24], which is the exponentially increasing complexity of the equalizer for channels with long memory or given a
large signal alphabet. These approaches replace the MAP equalizer with a linear or a decision feedback equalizer. The approach
in [31], and later [34], replaces the MAP equalizer in the turbo equalization framework with a soft interference canceler,
obtained using a least-mean-square (LMS) based update algorithm. An added benefit of this approach is its ability
to track changes in the channel over time. However, this comes at the expense of using a single linear filter that
does not adjust to the soft-feedback from the decoder, a property that the MMSE-based linear methods possess [25,
26]. In [35], an extending window version of the sliding-window MMSE approach in [25, 26] exploits the added
structure to further reduce complexity.
Another common technique to reduce the complexity of the MAP equalizer is to reduce the number of states
in the underlying trellis, which was applied to turbo equalization in [36]. In [37], MAP complexity is reduced by
exploring only the most promising paths in the trellis, exploiting properties of the channel in this selection. In [38],
the MAP complexity is reduced by replacing the BCJR algorithm with an approximation based on Markov chain
Monte Carlo methods, and in [39], the BCJR algorithm is replaced by an approach based on simulated annealing.
A local search over a suitably defined objective function is used to approximate MAP detection in [40].
Reduced complexity turbo equalization methods have been applied to magnetic recording channels, where in
[41] a soft decision feedback equalizer is used and in [42] 2D MMSE turbo equalization is used along with LDPC
codes. The special case of 2D separable ISI was considered in [43] and in [44] overlapping tracks are considered.
The original work of [15] pioneered a turbo equalization-like approach to multiuser detection in CDMA using
linear receivers in place of the MAP detector. A number of extensions of this work followed, including a reduced-
state trellis search approach to mitigating ISI and multiple access interference (MAI) in [45]. For mitigating intercell
and intersymbol interference, a distributed turbo equalization approach was explored in [46]. For multiple-access
links that include ISI, MAI, and multiple antennas, a variational inference approach was used to approximate MAP
detection in [47]. When such MIMO channels are rapidly fading, improved convergence speed was achieved through
the use of linear dispersion codes in [48].


Based on the multi-user formulation of the problem, a natural extension to multiple-input/multiple-output (MIMO)
channels arose, and a host of results were developed in this area, including methods for suppressing multi-user
interference [49–51] and for accounting for channel estimation errors [52]. Results from a number of so-called
turbo-BLAST experimental systems were also published, e.g., [53]. Extensions to trellis-coded modulation [54] and BICM
[55], frequency domain-based formulations using single-carrier transmission with and without cyclic prefix
[56–60], and MIMO OFDM [61, 62] were also developed. The increase in problem dimensionality lends itself
naturally to complexity reduction methods through rank reduction [63] and a variety of other MAP approximation
methods.
One application area in which turbo equalization methods have provided substantial gains over systematically
separate equalization and decoding is that of underwater acoustic communications [17, 19]. The long delay-spread
of the channel (several tens to hundreds of symbol periods) makes MAP-based methods prohibitive and poses real
challenges for MMSE-based linear methods as well. In addition, the underwater acoustic environment is often rapidly
time-varying, such that explicit channel estimation and tracking must be employed [64]. A variety of methods for
incorporating soft information for iteratively estimating the channel have been developed, such as those based on a
Kalman formulation [65, 66]. The relatively long delay-spread of the channel makes OFDM-based methods appear
attractive; however, this gives rise to inter-carrier interference (ICI) due to the time variation and, once again, a
linear channel matrix arises, for which ISI is replaced by ICI. A variety of joint ICI-mitigation/decoding methods
have been developed for performing turbo equalization over such time-varying channels using OFDM [67–69].

II. COMMUNICATION SYSTEMS USING BPSK MODULATION

A. System model

This section defines a specific configuration of the transmitter of Fig. 1. In particular, the real-valued BPSK
signal alphabet $S = \{+1, -1\}$ and an ISI channel with real-valued impulse response $h[n]$ are considered. Based on
these results, algorithms for arbitrary complex-valued signal alphabets $S$ and channel impulse responses $h[n]$ are
given in Sec. III.
A data source produces independent and uniformly distributed (IUD) data bits $a_n$. The sequence $\mathbf{a} = (a_0\ a_1\ \dots\ a_{K-1})$
of $K$ data bits is protected by a memory $M_c = 2$ convolutional FEC code defined by the generator polynomials
$g_0(D) = 1+D^2$ and $g_1(D) = 1+D+D^2$. The encoding operation yielding the $N$ code bits $\mathbf{b} = (b_0\ b_1\ \dots\ b_{N-1})$ is
carried out systematically including trellis termination. It follows that the FEC code rate is equal to

$$R_{fec} = \frac{K}{N} = \frac{K}{2(K+2)} = \frac{1}{2+4/K}.$$

Figure 2 depicts a trellis for this encoder with the branch labeling $a_n / b_{2n} b_{2n+1}$. A corresponding state-space
model utilizing the two-dimensional state variable $\boldsymbol{\theta}_n$ is given by

$$\boldsymbol{\theta}_{n+1} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \boldsymbol{\theta}_n + \begin{pmatrix} 1 \\ 0 \end{pmatrix} a_n, \qquad \begin{pmatrix} b_{2n} \\ b_{2n+1} \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix} \boldsymbol{\theta}_n + \begin{pmatrix} 1 \\ 1 \end{pmatrix} a_n, \qquad n = 0, 1, \dots, K+1.$$


The initial state $\boldsymbol{\theta}_0$ is $\mathbf{0}_2$ and $a_K = a_{K+1} = 0$ (termination bits). The code bits $b_n$ are permuted using an S-random
interleaver [11] to the $N$ bits $\mathbf{c} = (c_0\ c_1\ \dots\ c_{N-1})$, which are modulated to the symbols $\mathbf{x} = (x_0\ x_1\ \dots\ x_{L-1})^T$ using
BPSK modulation. It follows that the transmitted symbols $x_n$ are given by

$$x_n = \begin{cases} +1, & c_n = 0, \\ -1, & c_n = 1. \end{cases}$$
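As a minimal sketch, the mapping above can be written compactly as $x_n = 1 - 2c_n$. The function names `bpsk_map` and `bpsk_demap` are hypothetical and chosen only for illustration; they do not appear in the original work.

```python
import numpy as np

def bpsk_map(c):
    """Map code bits c_n in {0, 1} to symbols x_n in {+1, -1} (0 -> +1, 1 -> -1)."""
    return 1 - 2 * np.asarray(c, dtype=int)

def bpsk_demap(x_hat):
    """Hard demapping: choose the bit whose symbol is closest to the real-valued estimate."""
    return (np.asarray(x_hat, dtype=float) < 0).astype(int)

c = np.array([0, 1, 1, 0])
x = bpsk_map(c)   # array([ 1, -1, -1,  1])
```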

The system code rate $R$ is given by

$$R = \log_2 |S| \cdot R_{fec} = R_{fec} = \frac{1}{2+4/K}.$$
The additive white Gaussian noise (AWGN) ISI channel model is given by

$$y_n = w_n + \sum_{i=0}^{M_h} h_i x_{n-i}, \quad n = 0, 1, \dots, L-1, \quad \text{or} \quad \mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{w}, \tag{1}$$

where $\mathbf{y} = (y_0\ y_1\ \dots\ y_{L-1})^T$ and $\mathbf{w} = (w_0\ w_1\ \dots\ w_{L-1})^T$ for $L = N$. The channel coefficients $h_i$ and the noise
samples $w_n$ are assumed to be real-valued, i.e., the PDF $p(w_n)$ is given by $\mathcal{N}_R(0, \sigma^2)$, where $\mathcal{N}_R(\mu, \sigma^2)$ denotes

$$p(x) = \exp\!\left(-(x-\mu)^2/(2\sigma^2)\right) / \sqrt{2\pi\sigma^2}.$$

The $L \times L$ system matrix $\mathbf{H}$ has two different structures depending on how the undefined symbols $x_{-1}, x_{-2}, \dots, x_{-M_h}$
in (1) are treated. They are 0 under the termination assumption or they are given by $x_{L-1}, x_{L-2}, \dots, x_{L-M_h}$ under
the periodic extension assumption.
To illustrate performance, the length-3 unit power channel defined by the coefficients h0 = 0.407, h1 = 0.815,
and h2 = 0.407 is used as an example channel. This channel has memory Mh = 2. Its system matrix H is given by

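Termination assumption:

$$\mathbf{H} = \begin{pmatrix}
h_0 & 0 & 0 & 0 & 0 & \cdots & 0 \\
h_1 & h_0 & 0 & 0 & 0 & \cdots & 0 \\
h_2 & h_1 & h_0 & 0 & 0 & \cdots & 0 \\
 & \ddots & \ddots & \ddots & & & \\
0 & 0 & \cdots & 0 & h_2 & h_1 & h_0
\end{pmatrix},$$

periodic extension assumption:

$$\mathbf{H} = \begin{pmatrix}
h_0 & 0 & 0 & 0 & 0 & \cdots & h_2 & h_1 \\
h_1 & h_0 & 0 & 0 & 0 & \cdots & 0 & h_2 \\
h_2 & h_1 & h_0 & 0 & 0 & \cdots & 0 & 0 \\
 & \ddots & \ddots & \ddots & & & & \\
0 & 0 & \cdots & 0 & 0 & h_2 & h_1 & h_0
\end{pmatrix}. \tag{2}$$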
Unless otherwise specified in the sequel, the termination assumption will be applied. The periodic extension
assumption is applied in Sec. IV to derive low-complexity equalization algorithms based on MMSE estimation
in the frequency domain.
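The two structures of the system matrix can be illustrated with a short NumPy sketch. The helper name `channel_matrix`, the block length, the noise level, and the random seed are assumptions made for this illustration only.

```python
import numpy as np

def channel_matrix(h, L, periodic=False):
    """Return the L x L system matrix H of (1) for the impulse response h."""
    H = np.zeros((L, L))
    for n in range(L):
        for i, hi in enumerate(h):
            k = n - i                  # column index of the symbol x_{n-i}
            if k >= 0:
                H[n, k] += hi
            elif periodic:
                H[n, k + L] += hi      # x_{-m} is replaced by x_{L-m}
            # termination assumption: x_{-m} = 0, so nothing is added
    return H

h = [0.407, 0.815, 0.407]              # example channel of this section
L = 8
rng = np.random.default_rng(0)
x = 1 - 2 * rng.integers(0, 2, size=L) # IUD BPSK symbols
sigma = 0.5
w = sigma * rng.standard_normal(L)     # real-valued AWGN samples
H_term = channel_matrix(h, L)
H_circ = channel_matrix(h, L, periodic=True)
y = H_term @ x + w                     # channel model (1)
```

Under the termination assumption the product `H_term @ x` coincides with the first L samples of the linear convolution of x with h, whereas `H_circ` is a circulant matrix.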
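Figure 3 depicts a trellis with the branch labeling $x_n / v_n$ describing the length-3 example channel. The corresponding
state-space model is given by

$$\boldsymbol{\theta}_{n+1} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \boldsymbol{\theta}_n + \begin{pmatrix} 1 \\ 0 \end{pmatrix} x_n, \qquad v_n = (h_1\ h_2)\, \boldsymbol{\theta}_n + h_0 x_n, \qquad y_n = v_n + w_n, \qquad n = 0, 1, \dots, L-1,$$

where the initial state $\boldsymbol{\theta}_0$ equals $\mathbf{0}_2$ under the termination assumption.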
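The state-space model can be checked numerically with the following sketch, which runs the recursion for a short symbol sequence. The function name `noiseless_output` and the test sequence are assumptions for illustration.

```python
import numpy as np

h0, h1, h2 = 0.407, 0.815, 0.407
A = np.array([[0.0, 0.0], [1.0, 0.0]])   # shifts x_{n-1} into the second state entry
B = np.array([1.0, 0.0])                 # stores x_n in the first state entry
C = np.array([h1, h2])

def noiseless_output(x):
    """Run the state-space recursion and return v_n for n = 0, ..., L-1."""
    theta = np.zeros(2)                  # theta_0 = 0_2 (termination assumption)
    v = []
    for xn in x:
        v.append(C @ theta + h0 * xn)    # v_n = (h1 h2) theta_n + h0 x_n
        theta = A @ theta + B * xn       # theta_{n+1} = (x_n, x_{n-1})
    return np.array(v)

x = np.array([1.0, -1.0, -1.0, 1.0, 1.0, 1.0])
v = noiseless_output(x)
```

Here the first output is h0 = 0.407 and the last output, reached after three consecutive +1 symbols, is h0 + h1 + h2 = 1.629, which corresponds to the branch label 1/1.63 in Fig. 3.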


[Figure: trellis with states s0 to s3 over the time steps n = 0, 1, ..., K+1; branches are labeled a_n / b_2n b_2n+1.]

Fig. 2. Trellis for systematic encoding of a convolutional code given by the generator polynomials $g_0(D) = 1+D^2$ and $g_1(D) = 1+D+D^2$.

[Figure: trellis with states s0 to s3 over the time steps n = 0, 1, ..., L; branches are labeled x_n / v_n.]

Fig. 3. Trellis for a length-3 ISI channel given by the coefficients $h_0 = 0.407$, $h_1 = 0.815$, and $h_2 = 0.407$ under the termination assumption.

The transmitter configuration defined in this section produces IUD symbols $x_n$ from the signal alphabet $S = \{+1, -1\}$,
which satisfy the power constraint $E\{|x_n|^2\} = P = 1$. The largest rate $R$ for reliable transmission over
the example channel is the constrained IUD capacity $C_{c,iud}(S)$ [70], which is a function of the signal-to-noise ratio
(SNR) $P/\sigma^2$ at the transmitter output. Since the system rate $R$ approaches 0.5 for increasing $L$, it can be shown
that a minimum SNR of 1.4 dB is required to achieve $C_{c,iud}(S)$ larger than $R = 0.5$ in the limit of $L \to \infty$.

B. Optimal detection

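The BEP-optimal decoder computes estimates $\hat{a}_n$ of the data bits $a_n$ minimizing the bit error probability (BEP)
$\Pr\{a_n \neq \hat{a}_n\}$:

$$\hat{a}_n = \operatorname*{argmax}_{a \in F_2} P(a_n = a|\mathbf{y}), \quad \text{for } n = 0, 1, \dots, K-1. \tag{3}$$

The posterior probabilities $P(a_n|\mathbf{y})$ follow from marginalizing the sequence-based posterior probability
$P(\mathbf{a}|\mathbf{y})$ over all sequences $\mathbf{a}$ with the given value of $a_n$:

$$P(a_n = a|\mathbf{y}) = \sum_{\mathbf{a} \in F_2^K :\, a_n = a} P(\mathbf{a}|\mathbf{y}) = \frac{1}{p(\mathbf{y})} \sum_{\mathbf{a} \in F_2^K :\, a_n = a} p(\mathbf{y}|\mathbf{a}) \cdot P(\mathbf{a}). \tag{4}$$

The IUD assumption on the bits $a_n$ yields that $P(\mathbf{a})$ factors into $\prod_{n=0}^{K-1} P(a_n)$.
When binary random variables are concerned, it is convenient to work with log-likelihood ratios (LLRs) rather
than probabilities [71]. Consider the conditional LLR $L(a_n|\mathbf{y})$ of $a_n$ given $\mathbf{y}$:

$$L(a_n|\mathbf{y}) = \ln \frac{P(a_n = 0|\mathbf{y})}{P(a_n = 1|\mathbf{y})}. \tag{5}$$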


[Figure: transmitter chain (a_n → Encoder → b_n → Interleaver → c_n → Mapper → x_n → Channel → y_n) next to two receiver chains (y_n → Equalizer/Detector → Demapper → Deinterleaver → Decoder → â_n). The left receiver passes hard decisions x̂_n, ĉ_n, b̂_n; the right receiver passes soft decisions s(x_n), s(c_n), s(b_n).]

Fig. 4. Receiver structure for separate equalization and decoding.

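The decoding rule (3) can be rewritten as

$$\hat{a}_n = \begin{cases} 0, & L(a_n|\mathbf{y}) \geq 0, \\ 1, & L(a_n|\mathbf{y}) < 0, \end{cases} \quad \text{for } n = 0, 1, \dots, K-1. \tag{6}$$

From (4) it follows that $L(a_n|\mathbf{y})$ can be decomposed into an extrinsic and an a-priori LLR:

$$\begin{aligned}
L(a_n|\mathbf{y}) &= \ln \frac{\sum_{\mathbf{a}: a_n = 0} p(\mathbf{y}|\mathbf{a}) \prod_{k=0}^{K-1} P(a_k)}{\sum_{\mathbf{a}: a_n = 1} p(\mathbf{y}|\mathbf{a}) \prod_{k=0}^{K-1} P(a_k)} \\
&= \underbrace{\ln \frac{\sum_{\mathbf{a}: a_n = 0} p(\mathbf{y}|\mathbf{a}) \prod_{k=0: k \neq n}^{K-1} P(a_k)}{\sum_{\mathbf{a}: a_n = 1} p(\mathbf{y}|\mathbf{a}) \prod_{k=0: k \neq n}^{K-1} P(a_k)}}_{L_{ext}(a_n|\mathbf{y})} + \underbrace{\ln \frac{P(a_n = 0)}{P(a_n = 1)}}_{L(a_n)} \\
&= L_{ext}(a_n|\mathbf{y}) + L(a_n).
\end{aligned} \tag{7}$$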


The LLR $L_{ext}(a_n|\mathbf{y})$ represents information about $a_n$ contained in $\mathbf{y}$ and $P(a_k)$ for all $k \neq n$, which is the so-called
extrinsic information or the extrinsic LLR, respectively. It is added to the a-priori LLR $L(a_n)$, which represents
the available a-priori information about $a_n$. Extrinsic LLRs play a crucial role in turbo equalization. Unfortunately,
the BEP-optimal decoder (6) is computationally intractable for large $K$, since its complexity is of order $O(2^K)$.
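The decomposition (7) and the exponential cost of exhaustive evaluation can be illustrated with a brute-force sketch. To keep $p(\mathbf{y}|\mathbf{a})$ simple, the sketch makes a toy assumption that is not the system of this paper: the bits are sent without FEC coding, i.e., $x_n = 1 - 2a_n$ is passed directly through the example ISI channel. The function name `brute_force_llrs` and all numerical values are likewise chosen for illustration only.

```python
import itertools
import numpy as np

def brute_force_llrs(y, h, sigma, prior1):
    """Return (L_post, L_ext, L_prior) per bit by enumerating all 2^K bit sequences.

    Toy assumption: a_n is mapped directly to x_n = 1 - 2*a_n (no FEC code) and sent
    over the ISI channel (1) under the termination assumption; prior1[n] = P(a_n = 1).
    """
    y = np.asarray(y, dtype=float)
    prior1 = np.asarray(prior1, dtype=float)
    K = len(y)
    num = np.zeros(K)   # accumulates p(y|a) P(a) over sequences with a_n = 0
    den = np.zeros(K)   # accumulates p(y|a) P(a) over sequences with a_n = 1
    for bits in itertools.product((0, 1), repeat=K):
        a = np.array(bits)
        v = np.convolve(1 - 2 * a, h)[:K]                     # noiseless channel output
        lik = np.exp(-np.sum((y - v) ** 2) / (2 * sigma ** 2))
        p_a = np.prod(np.where(a == 1, prior1, 1 - prior1))   # P(a) under independence
        num += lik * p_a * (a == 0)
        den += lik * p_a * (a == 1)
    L_post = np.log(num / den)                                # LLR as in (5)
    L_prior = np.log((1 - prior1) / prior1)                   # a-priori LLR L(a_n)
    return L_post, L_post - L_prior, L_prior                  # (7): L_ext = L_post - L_prior

rng = np.random.default_rng(0)
h = [0.407, 0.815, 0.407]
a = rng.integers(0, 2, size=6)
y = np.convolve(1 - 2 * a, h)[:6] + 0.7 * rng.standard_normal(6)
L_post, L_ext, L_prior = brute_force_llrs(y, h, 0.7, np.full(6, 0.5))
a_hat = (L_post < 0).astype(int)                              # decision rule (6)
```

Changing the prior of a single bit changes its posterior LLR but leaves its extrinsic LLR untouched, which is the property that the iterative schemes discussed later rely on. The loop visits $2^K$ sequences, so this approach is only feasible for very small $K$.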

C. Separate equalization and decoding

A standard approach to reducing the computational burden of the receiver is to split the detection problem into
the two subproblems equalization (detection) and decoding. This strategy is employed in the two receiver structures
depicted in Fig. 4. The difference between them is that the receiver on the left communicates estimates x̂n , ĉn , and
b̂n from the same alphabet as xn , cn , and bn , respectively, from the equalizer to the decoder whereas the receiver
on the right uses so-called soft information s(xn ), s(cn ), and s(bn ).
Two common but distinct families of algorithms for the subproblem of equalization are those based on trellis
methods and those using linear filters. Typical trellis-based approaches are symbol-based or sequence-based
MAP/ML detection. A symbol-based MAP detector computes estimates $\hat{x}_n$ of the symbols $x_n$ as follows:

$$\hat{x}_n = \operatorname*{argmax}_{x \in S} P(x_n = x|\mathbf{y}).$$


This can be done efficiently using the BCJR algorithm [7] as long as the ISI channel has a trellis with a sufficiently
small number of states. An overview of BCJR-based turbo equalization is given in [72]. Table I applies this
description to the example ISI channel defined in Sec. II-A, where $\mathbf{1}_L$ denotes a length-$L$ column vector containing
all ones and $\odot$ denotes element-wise multiplication. The entries $\gamma_n(s_i, s_j) = 0.5 \cdot p(y_n|v_n = v_{i,j})$ of $\boldsymbol{\Gamma}_n$ and
the initial vectors $\mathbf{f}_0$ and $\mathbf{b}_L$ follow from (1) and the ISI channel trellis in Fig. 3. The scaling of $\gamma_n(s_i, s_j)$
by $0.5/\sqrt{2\pi\sigma^2}$ has been neglected because of the normalization applied in the forward and backward recursions.
Practical implementations of the BCJR algorithm most often store the logarithms of the quantities in Table I [12, 73].
In this case, it is possible to use the terms $-|y_n - v_{i,j}|^2/(2\sigma^2)$ directly in $\ln \boldsymbol{\Gamma}_n$ without the need for exponentiation.

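TABLE I
ALGORITHM FOR COMPUTING THE POSTERIOR PROBABILITIES $P(x_n|\mathbf{y})$ FOR THE ISI CHANNEL MODEL (1).

INPUT

$$\boldsymbol{\Gamma}_n = \begin{pmatrix}
\exp\!\left(-\frac{|y_n - v_{0,0}|^2}{2\sigma^2}\right) & 0 & \exp\!\left(-\frac{|y_n - v_{2,0}|^2}{2\sigma^2}\right) & 0 \\
\exp\!\left(-\frac{|y_n - v_{0,1}|^2}{2\sigma^2}\right) & 0 & \exp\!\left(-\frac{|y_n - v_{2,1}|^2}{2\sigma^2}\right) & 0 \\
0 & \exp\!\left(-\frac{|y_n - v_{1,2}|^2}{2\sigma^2}\right) & 0 & \exp\!\left(-\frac{|y_n - v_{3,2}|^2}{2\sigma^2}\right) \\
0 & \exp\!\left(-\frac{|y_n - v_{1,3}|^2}{2\sigma^2}\right) & 0 & \exp\!\left(-\frac{|y_n - v_{3,3}|^2}{2\sigma^2}\right)
\end{pmatrix}$$

for $n = 0, 1, \dots, L-1$ (the labels $v_{i,j}$ follow from the trellis in Fig. 3)

$$\mathbf{U}(+1) = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad \mathbf{U}(-1) = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix}$$

INITIALIZATION
  $\mathbf{f}_0 = \mathbf{b}_L = (1\ 1\ 1\ 1)^T$

BCJR ALGORITHM
  FOR $n = 0$ TO $L-1$ DO
    $\mathbf{f}_{n+1} = \boldsymbol{\Gamma}_n \mathbf{f}_n$ (forward recursion)
    $\mathbf{f}_{n+1} = \mathbf{f}_{n+1} / (\mathbf{f}_{n+1}^T \mathbf{1}_4)$ (normalization)
  END
  FOR $n = L-1$ TO $1$ DO
    $\mathbf{b}_n = \boldsymbol{\Gamma}_n^T \mathbf{b}_{n+1}$ (backward recursion)
    $\mathbf{b}_n = \mathbf{b}_n / (\mathbf{b}_n^T \mathbf{1}_4)$ (normalization)
  END

OUTPUT
  $P(x_n = x|\mathbf{y}) = \mathbf{b}_{n+1}^T (\mathbf{U}(x) \odot \boldsymbol{\Gamma}_n) \mathbf{f}_n / (\mathbf{b}_{n+1}^T \boldsymbol{\Gamma}_n \mathbf{f}_n)$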
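The recursions of Table I translate almost line by line into NumPy. The sketch below is one possible reading, not a reference implementation: the state numbering (state index = [x_{n-1} = -1] + 2 [x_{n-2} = -1]) is an assumption chosen to reproduce the sparsity pattern of $\boldsymbol{\Gamma}_n$ and $\mathbf{U}(x)$ in Table I, the optional `prior` argument is an addition that defaults to the IUD value 0.5, and the function name `bcjr_equalizer` is hypothetical.

```python
import numpy as np

SYM = np.array([+1.0, -1.0])   # symbol index k = 0 -> +1, k = 1 -> -1

def bcjr_equalizer(y, h, sigma, prior=None):
    """Return P[n, k] = P(x_n = SYM[k] | y) for BPSK over a memory-2 ISI channel."""
    h0, h1, h2 = h
    L = len(y)
    prior = np.full((L, 2), 0.5) if prior is None else np.asarray(prior, dtype=float)
    Gamma = np.zeros((L, 4, 4))           # Gamma[n, j, i]: weight of branch i -> j at time n
    U = np.zeros((2, 4, 4))               # U[k, j, i] = 1 if branch i -> j carries SYM[k]
    for i in range(4):
        x1, x2 = SYM[i & 1], SYM[i >> 1]  # state i encodes (x_{n-1}, x_{n-2})
        for k in range(2):
            j = k + 2 * (i & 1)           # next state encodes (x_n, x_{n-1})
            U[k, j, i] = 1.0
            for n in range(L):
                # termination assumption: x_{-1} = x_{-2} = 0
                v = h0 * SYM[k] + (h1 * x1 if n >= 1 else 0.0) + (h2 * x2 if n >= 2 else 0.0)
                Gamma[n, j, i] = prior[n, k] * np.exp(-(y[n] - v) ** 2 / (2 * sigma ** 2))
    f = np.ones((L + 1, 4))               # f_0 = (1 1 1 1)^T
    b = np.ones((L + 1, 4))               # b_L = (1 1 1 1)^T
    for n in range(L):                    # forward recursion with normalization
        f[n + 1] = Gamma[n] @ f[n]
        f[n + 1] /= f[n + 1].sum()
    for n in range(L - 1, 0, -1):         # backward recursion with normalization
        b[n] = Gamma[n].T @ b[n + 1]
        b[n] /= b[n].sum()
    P = np.zeros((L, 2))
    for n in range(L):                    # output step of Table I
        for k in range(2):
            P[n, k] = b[n + 1] @ (U[k] * Gamma[n]) @ f[n]
        P[n] /= P[n].sum()
    return P

rng = np.random.default_rng(0)
h, sigma = (0.407, 0.815, 0.407), 0.8
x = rng.choice(SYM, size=10)
y = np.convolve(x, h)[:10] + sigma * rng.standard_normal(10)
P = bcjr_equalizer(y, h, sigma)
x_map = SYM[P.argmax(axis=1)]             # symbol-wise MAP decisions
```

The sketch works with probabilities directly and may underflow for long blocks or very small noise variance; as noted in the text, practical implementations usually operate on logarithms instead.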

A sequence-based MAP detector computes an estimate $\hat{\mathbf{x}}$ of the sequence $\mathbf{x}$ as follows:

$$\hat{\mathbf{x}} = \operatorname*{argmax}_{\mathbf{x} \in S^L} P(\mathbf{x}|\mathbf{y}).$$

This can be done efficiently using the Viterbi algorithm [6, 74].
The computational complexity of the trellis-based approaches is determined by the number $|S|^{M_h}$ of trellis states,
which grows exponentially with the memory $M_h$ of the ISI channel and the size $|S|$ of the signal alphabet. Note
that the trellis-based implementation of MAP detection (as opposed to exhaustive search) in Table I is possible only
because the symbols $x_n$ are assumed to be independent, i.e., $P(\mathbf{x})$ factors into $\prod_{n=0}^{L-1} P(x_n)$. Certain structured
dependencies on the symbols $x_n$ can also lead to a trellis decomposition such as in TCM [13, 14], but this would
lead to a trellis with more than $|S|^{M_h}$ states.


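Linear filter-based approaches perform only simple operations on the received symbols, which can be described
with matrix operations on the sequence $\mathbf{y}$. The matrix formulation (1) of the ISI channel model immediately
suggests multiplying the received symbols $\mathbf{y}$ with an (approximate) inverse matrix $\mathbf{H}^{-1}$, yielding an estimate $\hat{\mathbf{x}}$ of
the sequence $\mathbf{x}$. This so-called zero forcing (ZF) equalizer [74] attempts to invert the channel and may suffer from
noise enhancement, which can be severe if $\mathbf{H}$ is ill-conditioned. Noise enhancement can be avoided using (linear)
minimum mean square error (MMSE) estimation [75]. The estimate $\hat{\mathbf{x}}$ of $\mathbf{x}$ minimizing the MSE $E\{\|\hat{\mathbf{x}} - \mathbf{x}\|^2\}$ is
given by

$$\hat{\mathbf{x}} = E\{\mathbf{x}\} + \mathrm{Cov}\{\mathbf{x}, \mathbf{y}\}\, \mathrm{Cov}\{\mathbf{y}, \mathbf{y}\}^{-1} (\mathbf{y} - E\{\mathbf{y}\}),$$

where $\mathrm{Cov}\{\mathbf{x}, \mathbf{y}\} = E\{\mathbf{x}\mathbf{y}^H\} - E\{\mathbf{x}\}E\{\mathbf{y}\}^H$. Since each entry in the noise sequence $\mathbf{w}$ is IUD with $\mathcal{N}_R(0, \sigma^2)$, it
follows that $\mathrm{Cov}\{\mathbf{w}, \mathbf{w}\}$ equals $\sigma^2 \mathbf{I}_L$. From the IUD assumption on the symbols $x_n$, it follows that $E\{\mathbf{x}\} = \mathbf{0}_L$ and
$\mathrm{Cov}\{\mathbf{x}, \mathbf{x}\} = \mathbf{I}_L$, so that $E\{\mathbf{y}\} = \mathbf{0}_L$, $\mathrm{Cov}\{\mathbf{x}, \mathbf{y}\} = \mathbf{H}^T$, and $\mathrm{Cov}\{\mathbf{y}, \mathbf{y}\} = \sigma^2 \mathbf{I}_L + \mathbf{H}\mathbf{H}^T$. The estimate $\hat{\mathbf{x}}$
is therefore given by

$$\hat{\mathbf{x}} = \mathbf{H}^T (\sigma^2 \mathbf{I}_L + \mathbf{H}\mathbf{H}^T)^{-1} \mathbf{y}. \tag{8}$$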

Since the symbol estimates x̂n in x̂ are most often not from the signal alphabet S, they are usually mapped to that
symbol from S at closest (Euclidean) distance to x̂n . Note that in a practical implementation of (8), only a small
window of received symbols yn rather than the complete sequence y is considered for complexity reasons.
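A block-based sketch of the ZF equalizer and of the MMSE estimate (8) is given below. It processes the complete sequence at once rather than a sliding window, and the helper names `isi_matrix`, `zf_equalize`, and `mmse_equalize`, as well as the block length and noise level, are assumptions for illustration.

```python
import numpy as np

def isi_matrix(h, L):
    """Lower-triangular Toeplitz system matrix H of (1) under the termination assumption."""
    H = np.zeros((L, L))
    for i, hi in enumerate(h):
        H += hi * np.eye(L, k=-i)      # h_i on the i-th subdiagonal
    return H

def zf_equalize(y, H):
    """Zero forcing: solve H x = y, i.e., apply the inverse channel matrix."""
    return np.linalg.solve(H, y)

def mmse_equalize(y, H, sigma):
    """Linear MMSE estimate (8) for zero-mean, unit-variance IUD symbols."""
    L = H.shape[0]
    return H.T @ np.linalg.solve(sigma ** 2 * np.eye(L) + H @ H.T, y)

rng = np.random.default_rng(0)
L, sigma = 16, 0.3
H = isi_matrix([0.407, 0.815, 0.407], L)
x = rng.choice([1.0, -1.0], size=L)
y = H @ x + sigma * rng.standard_normal(L)
x_zf = zf_equalize(y, H)
x_mmse = mmse_equalize(y, H, sigma)
x_hard = np.where(x_mmse >= 0, 1.0, -1.0)   # map to the closest symbol of S
```

Using `np.linalg.solve` instead of forming the matrix inverse explicitly is a numerical design choice of this sketch; it evaluates the same expression as (8).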
The equalizer can often provide more information to the decoder than the hard decisions x̂n at the cost of
additional storage, processing, and communication, such as probabilities that xn takes on a particular symbol
from S. The principle of using probabilities (soft information) rather than hard-decisions is often referred to as soft
processing or soft decoding. This is the difference between the two receiver approaches depicted in Fig. 4. A natural
choice for the soft information s(xn ) about the transmitted symbols xn are the posterior probabilities P (xn |y),
which are a “side product” of the symbol-based MAP detector. Similarly, the less complex sequence-based MAP
detector (Viterbi equalizer) can produce approximations of P (xn |y) using, e.g., the soft-output Viterbi algorithm
(SOVA) [27].
For filter-based methods, extracting soft information $s(x_n)$ is more involved. A common approach is to assume that
the estimation error, $e_n = \hat{x}_n - x_n$, is Gaussian distributed [15], i.e., the PDF $p(e_n)$ is given by $\mathcal{N}_R(E\{e_n\}, \mathrm{Var}\{e_n\})$.
For linear MMSE estimation, the mean and the covariance matrix of the sequence $\mathbf{e} = (e_0\ e_1\ \dots\ e_{L-1})^T$ of estimation
errors $e_n$ are given by

$$E\{\mathbf{e}\} = \mathbf{0}_L, \qquad \mathrm{Cov}\{\mathbf{e}, \mathbf{e}\} = \mathbf{I}_L - \mathbf{H}^T (\sigma^2 \mathbf{I}_L + \mathbf{H}\mathbf{H}^T)^{-1} \mathbf{H}.$$

The soft information $s(x_n)$ follows from sampling $p(e_n)$ at the values $e_n = \hat{x}_n - x$ for each $x \in S$,

$$s(x_n = x) = \kappa \cdot p(e_n) \quad \text{for } e_n = \hat{x}_n - x, \tag{9}$$

where the normalization constant $\kappa$ ensures that $\sum_{x \in S} s(x_n = x) = 1$. This scheme to compute soft information
$s(x_n)$ from a filter output can be applied to other filter-based equalization algorithms as well.
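The computation of (9) from the MMSE filter output can be sketched as follows. The function name `mmse_soft_output` and the numerical values are assumptions for illustration; the sketch evaluates the Gaussian exponent in the log domain, which only affects numerical robustness.

```python
import numpy as np

def mmse_soft_output(y, H, sigma, alphabet=(+1.0, -1.0)):
    """Return (x_hat, s) with s[n, k] = s(x_n = alphabet[k]) following (9)."""
    L = H.shape[0]
    G = H.T @ np.linalg.inv(sigma ** 2 * np.eye(L) + H @ H.T)  # MMSE filter matrix of (8)
    x_hat = G @ y
    var_e = np.diag(np.eye(L) - G @ H)                          # Var{e_n} from Cov{e, e}
    e = x_hat[:, None] - np.asarray(alphabet)[None, :]          # e_n = x_hat_n - x
    log_p = -e ** 2 / (2 * var_e[:, None])                      # Gaussian exponent, E{e_n} = 0
    log_p -= log_p.max(axis=1, keepdims=True)                   # guard against underflow
    s = np.exp(log_p)
    return x_hat, s / s.sum(axis=1, keepdims=True)              # kappa normalizes each row

rng = np.random.default_rng(0)
L, sigma = 12, 0.5
H = sum(hi * np.eye(L, k=-i) for i, hi in enumerate([0.407, 0.815, 0.407]))
x = rng.choice([1.0, -1.0], size=L)
y = H @ x + sigma * rng.standard_normal(L)
x_hat, s = mmse_soft_output(y, H, sigma)
```

Because both hypotheses share the same error variance for a given n, the larger of the two probabilities always belongs to the symbol closest to the filter output.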


Since the decoder in Fig. 4 operates on the code bits $c_n$, the next step for the receiver algorithm is to compute
estimates $\hat{c}_n$ or soft information $s(c_n)$. The mapping from $|S| = 2^q$ probabilities $s(x_n)$ to $2q$ probabilities $s(c_n)$ is
commonly referred to as soft demapping. The demapping operation required here is quite simple, since $q = 1$ and
$\tilde{c}_n = c_n$ holds:

$$s(c_n = 0) = s(x_n = +1), \qquad s(c_n = 1) = s(x_n = -1).$$

After deinterleaving ĉn to b̂n or s(cn ) to s(bn ), respectively, the decoder can compute estimates ân of the transmitted
data bits.
The BEP-optimal MAP decoder given the soft information $s(b_n)$ is defined as follows:

$$\hat{a}_n = \operatorname*{argmax}_{a \in F_2} P(a_n = a|s(b_0), s(b_1), \dots, s(b_{N-1})). \tag{10}$$

The posterior probabilities P (an |s(b0 ), s(b1 ), ..., s(bN −1 )) can be computed efficiently using the BCJR algorithm
as long as the FEC code has a trellis with a sufficiently small number of states. Table II applies the BCJR algorithm
to the example code defined in Sec. II-A. The entries γn (si , sj ) of Γn and the initial vectors f0 and bK+2 follow
from the code trellis in Fig. 2.

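TABLE II
ALGORITHM FOR COMPUTING THE POSTERIOR PROBABILITIES $P(a_n|s(b_0), s(b_1), \dots, s(b_{N-1}))$ FOR THE MEMORY-2 CONVOLUTIONAL
CODE DEFINED IN SEC. II-A.

INPUT

$$\boldsymbol{\Gamma}_n = \begin{pmatrix}
s(b_{2n} = 0)\, s(b_{2n+1} = 0) & 0 & s(b_{2n} = 1)\, s(b_{2n+1} = 1) & 0 \\
s(b_{2n} = 1)\, s(b_{2n+1} = 1) & 0 & s(b_{2n} = 0)\, s(b_{2n+1} = 0) & 0 \\
0 & s(b_{2n} = 0)\, s(b_{2n+1} = 1) & 0 & s(b_{2n} = 1)\, s(b_{2n+1} = 0) \\
0 & s(b_{2n} = 1)\, s(b_{2n+1} = 0) & 0 & s(b_{2n} = 0)\, s(b_{2n+1} = 1)
\end{pmatrix}$$

for $n = 0, 1, \dots, K+1$

$$\mathbf{U}(0) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad \mathbf{U}(1) = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix}$$

INITIALIZATION
  $\mathbf{f}_0 = \mathbf{b}_{K+2} = (1\ 0\ 0\ 0)^T$

BCJR ALGORITHM
  FOR $n = 0$ TO $K+1$ DO
    $\mathbf{f}_{n+1} = \boldsymbol{\Gamma}_n \mathbf{f}_n$ (forward recursion)
    $\mathbf{f}_{n+1} = \mathbf{f}_{n+1} / (\mathbf{f}_{n+1}^T \mathbf{1}_4)$ (normalization)
  END
  FOR $n = K+1$ TO $1$ DO
    $\mathbf{b}_n = \boldsymbol{\Gamma}_n^T \mathbf{b}_{n+1}$ (backward recursion)
    $\mathbf{b}_n = \mathbf{b}_n / (\mathbf{b}_n^T \mathbf{1}_4)$ (normalization)
  END

OUTPUT
  $P(a_n = a|s(b_0), s(b_1), \dots, s(b_{N-1})) = \mathbf{b}_{n+1}^T (\mathbf{U}(a) \odot \boldsymbol{\Gamma}_n) \mathbf{f}_n / (\mathbf{b}_{n+1}^T \boldsymbol{\Gamma}_n \mathbf{f}_n)$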
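Table II can be sketched in NumPy in the same way as the equalizer. The transition tables `NEXT` and `OUT` below are read off the non-zero entries of $\boldsymbol{\Gamma}_n$ and $\mathbf{U}(a)$ in Table II; their names, the function name `bcjr_decoder`, and the random soft inputs are assumptions for illustration.

```python
import numpy as np

# Trellis read off Table II: for state i and input bit a,
# NEXT[i][a] is the next state and OUT[i][a] = (b_2n, b_2n+1) the branch output.
NEXT = [(0, 1), (2, 3), (1, 0), (3, 2)]
OUT = [((0, 0), (1, 1)), ((0, 1), (1, 0)), ((0, 0), (1, 1)), ((0, 1), (1, 0))]

def bcjr_decoder(s1):
    """s1[m] = s(b_m = 1); return P[n, a] = P(a_n = a | s(b_0), ..., s(b_{N-1}))."""
    s1 = np.asarray(s1, dtype=float)
    s = np.stack([1.0 - s1, s1], axis=1)      # s[m, b] = s(b_m = b)
    T = len(s1) // 2                          # K + 2 trellis sections
    Gamma = np.zeros((T, 4, 4))               # Gamma[n, j, i]: weight of branch i -> j
    U = np.zeros((2, 4, 4))                   # U[a, j, i] = 1 if branch i -> j carries input a
    for i in range(4):
        for a in range(2):
            j = NEXT[i][a]
            b0, b1 = OUT[i][a]
            U[a, j, i] = 1.0
            Gamma[:, j, i] = s[0::2, b0] * s[1::2, b1]
    f = np.zeros((T + 1, 4))
    f[0, 0] = 1.0                             # f_0 = (1 0 0 0)^T
    b = np.zeros((T + 1, 4))
    b[T, 0] = 1.0                             # b_{K+2} = (1 0 0 0)^T
    for n in range(T):                        # forward recursion with normalization
        f[n + 1] = Gamma[n] @ f[n]
        f[n + 1] /= f[n + 1].sum()
    for n in range(T - 1, 0, -1):             # backward recursion with normalization
        b[n] = Gamma[n].T @ b[n + 1]
        b[n] /= b[n].sum()
    P = np.zeros((T, 2))
    for n in range(T):                        # output step of Table II
        for a in range(2):
            P[n, a] = b[n + 1] @ (U[a] * Gamma[n]) @ f[n]
        P[n] /= P[n].sum()
    return P

rng = np.random.default_rng(0)
s1 = rng.uniform(0.05, 0.95, size=2 * 6)      # arbitrary soft inputs for K + 2 = 6 sections
P = bcjr_decoder(s1)
a_hat = P[:4].argmax(axis=1)                  # decisions (10) for the K = 4 data bits
```

The sketch returns posteriors for all K + 2 trellis sections, including the two termination sections; only the first K rows correspond to data bits.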

D. Joint equalization and decoding

A receiver algorithm where the equalizer is aware of the underlying code is often termed joint equalization and
decoding. The exact implementation of this algorithm is intractable for a general code and interleaver, and as such,
a feasible, yet suboptimal, alternative must be sought.


Consider the following approach. Note first that the equalizer may apply the BCJR algorithm when $P(\mathbf{x})$ factors
into $\prod_{n=0}^{L-1} P(x_n)$, which holds for independent symbols $x_n$. The symbol probabilities $P(x_n)$ can be different for
each symbol value $x_n \in S$ and each time step $n$. Initially, the equalizer imposes the IUD assumption on $x_n$ and
sets $P(x_n)$ to $1/|S|$ for all $x_n \in S$ while computing the soft information $s(x_n)$. The receiver algorithm continues
by computing $s(c_n)$ and $s(b_n)$ followed by decoding. At this stage, the decoder may compute a new version of the
soft information $s(b_n)$,

$$s'(b_n) = P(b_n|s(b_0), s(b_1), \dots, s(b_{N-1})).$$

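The reliability of $s'(b_n)$ compared to $s(b_n)$ generally improves because of the redundancy introduced during FEC
encoding, i.e., the values in $s'(b_n)$ are on average closer to 0 and 1 than those in $s(b_n)$. The posterior probabilities
$s'(b_n)$ can be computed with the symbol-based MAP decoder. In fact, to compute $s'(b_n)$ for the example code
from Sec. II-A, only the last line of the BCJR algorithm described in Table II need be updated to

$$s'(b_{2n}) = P(b_{2n} = b|s(b_0), s(b_1), \dots, s(b_{N-1})) = \mathbf{b}_{n+1}^T (\mathbf{V}_0(b) \odot \boldsymbol{\Gamma}_n) \mathbf{f}_n / (\mathbf{b}_{n+1}^T \boldsymbol{\Gamma}_n \mathbf{f}_n),$$

$$s'(b_{2n+1}) = P(b_{2n+1} = b|s(b_0), s(b_1), \dots, s(b_{N-1})) = \mathbf{b}_{n+1}^T (\mathbf{V}_1(b) \odot \boldsymbol{\Gamma}_n) \mathbf{f}_n / (\mathbf{b}_{n+1}^T \boldsymbol{\Gamma}_n \mathbf{f}_n),$$

where

$$\mathbf{V}_0(0) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad
\mathbf{V}_0(1) = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \quad
\mathbf{V}_1(0) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \quad
\mathbf{V}_1(1) = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$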


In this example, it turns out that $s(a_n)$ is equal to $s'(b_{2n})$ or $\mathbf{U}(u) = \mathbf{V}_0(u)$, respectively, since the chosen encoding
function is systematic. After interleaving $s'(b_n)$ to $s'(c_n)$, it makes sense to use the new soft information $s'(c_n)$
to guide the equalizer with new symbol probabilities $s'(x_n)$ or soft information, respectively, given the improved
knowledge about the bits $c_n$. The mapping from $2N$ probabilities $s'(c_n)$ to $L \, 2^q$ probabilities $s'(x_n)$ given a $2^q$-ary
signal alphabet $S$ is commonly referred to as soft mapping. The soft mapping required here is quite simple:
$s'(x_n = +1) = s'(c_n = 0)$ and $s'(x_n = -1) = s'(c_n = 1)$.
With the soft information s′ (xn ) at hand, it is natural to repeat the equalization step yielding updated soft
information s(xn ). Incorporating s′ (xn ) into the trellis-based equalization algorithms (symbol- or sequence-based
MAP detection) is straight-forward. Only the transition matrices Γn in the initialization step of the BCJR algorithm
described in Table I need be updated to

$$\boldsymbol{\Gamma}_n[j, i] = \begin{cases} \exp\!\left(-\frac{|y_n - v_{i,j}|^2}{2\sigma^2}\right) s'(x_n = x_{i,j}), & (i, j) \text{ corresponds to a valid trellis branch}, \\ 0, & \text{otherwise}, \end{cases}$$

for $i, j = 0, 1, 2, 3$, where $v_{i,j}$ and $x_{i,j}$ follow from the trellis in Fig. 3 (bottom trellis), e.g., $v_{0,0} = 1.63$ and
$x_{0,0} = +1$. Also the linear MMSE estimator (8) can take advantage of the probabilities $s'(x_n)$ by recomputing the
symbol statistics $E\{x_n\}$ and $\mathrm{Cov}\{x_n, x_n\}$. This approach is derived and analyzed in detail in Sec. II-E.
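For the BPSK alphabet, the symbol statistics mentioned above follow directly from the fed-back probabilities, since $E\{x_n\} = s'(x_n = +1) - s'(x_n = -1)$ and $E\{x_n^2\} = 1$. The following sketch only computes these two quantities; how they enter the estimator is not shown here, and the function name `bpsk_symbol_statistics` is hypothetical.

```python
import numpy as np

def bpsk_symbol_statistics(s_plus):
    """Mean and variance of x_n in {+1, -1} given s_plus[n] = s'(x_n = +1)."""
    s_plus = np.asarray(s_plus, dtype=float)
    mean = 2.0 * s_plus - 1.0     # (+1) * s'(x_n = +1) + (-1) * s'(x_n = -1)
    var = 1.0 - mean ** 2         # E{x_n^2} - E{x_n}^2 with E{x_n^2} = 1
    return mean, var

mean, var = bpsk_symbol_statistics([0.5, 0.9, 1.0])
```

With no feedback (probability 0.5) the statistics reduce to zero mean and unit variance, the values implied by the IUD assumption; with perfect feedback the variance vanishes.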
The updated probabilities s(xn ) are again soft demapped to s(cn ) followed by deinterleaving and decoding.
This amounts to (suboptimal) joint equalization and decoding, since the equalizer incorporates knowledge about
the underlying code. The result is an iterative receiver algorithm, which recomputes the soft information s(xn ),
s(cn ), s(bn ) (equalizer to decoder) and s′ (bn ), s′ (cn ), s′ (xn ) (decoder to equalizer) by iterating equalization and
decoding tasks, passing soft information between them. The BER performance of the receiver indeed improves with
the iterations, but not significantly. In contrast, the improvement can be tremendous when the following information
is fed back from the decoder to the equalizer:

$$
s'_{\mathrm{ext}}(b_n) = P\big(b_n \mid s(b_0), s(b_1), \ldots, s(b_{n-1}), s(b_{n+1}), \ldots, s(b_{N-1})\big).
$$

This quantity is the extrinsic soft information about bn contained in s(b0 ), s(b1 ), ... s(bN −1 ) except s(bn ). Similarly,
the equalizer should communicate extrinsic soft information sext (cn ) to the decoder computed using the observation
y and all s′ext (c0 ), s′ext (c1 ), ..., s′ext (cN −1 ) except s′ext (cn ). This approach to passing extrinsic soft information
between constituent algorithms was first proposed by [10] in the context of decoding PCCCs (turbo codes) and has
been extended to various concatenated communication systems such as coded data transmission over ISI channels
[2, 23, 25, 31], where it is called turbo equalization.
Since the code bits bn are from a binary alphabet, it is often more convenient to replace the two probabilities
s(bn = 0) and s(bn = 1) by the LLR
$$
\lambda(b_n) = \ln\frac{s(b_n = 0)}{s(b_n = 1)}.
$$
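
The conversion between the two probabilities and the LLR can be sketched in a few lines of Python. The function names are hypothetical, and the branch on the sign of the LLR is merely one common way to avoid overflow in the exponential.

```python
import math

def llr_from_probs(p0, p1):
    """lambda(b) = ln(s(b=0) / s(b=1)) for strictly positive probabilities."""
    return math.log(p0) - math.log(p1)

def probs_from_llr(llr):
    """Invert the LLR; returns (s(b=0), s(b=1)), which sum to one."""
    if llr >= 0.0:
        e = math.exp(-llr)            # e <= 1, cannot overflow
        return 1.0 / (1.0 + e), e / (1.0 + e)
    e = math.exp(llr)                 # e < 1, cannot overflow
    return e / (1.0 + e), 1.0 / (1.0 + e)
```

An LLR of zero corresponds to the uniform distribution (0.5, 0.5), and a large positive LLR indicates that the bit is very likely 0.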
Using the LLRs λ(bn ) for decoding seems to be counterintuitive, since they must be incorporated into the matrices
Γn in Table II, which requires several exponentiation operations. However, when implementing the BCJR algorithm
with the logarithms of fn , bn , and Γn , it is much easier to use λ(bn ) in ln Γn [12, 73]. Consequently, instead of
the extrinsic soft information s′ext (bn ), it is also more practical to consider the extrinsic LLR
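$$
\lambda'_{\mathrm{ext}}(b_n) = \ln\frac{s'_{\mathrm{ext}}(b_n = 0)}{s'_{\mathrm{ext}}(b_n = 1)}
= \ln\frac{P\big(b_n = 0 \mid s(b_0), s(b_1), \ldots, s(b_{n-1}), s(b_{n+1}), \ldots, s(b_{N-1})\big)}{P\big(b_n = 1 \mid s(b_0), s(b_1), \ldots, s(b_{n-1}), s(b_{n+1}), \ldots, s(b_{N-1})\big)}.
$$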
If the soft information s′ (bn ) is a posterior probability, i.e. s′ (bn ) = P (bn |s(b0 ), s(b1 ), ..., s(bN −1 )), it is possible
to apply the decomposition (7) to the LLR λ′ (bn ) corresponding to s′ (bn ):
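$$
\begin{aligned}
\lambda'_{\mathrm{ext}}(b_n) &= \ln\frac{P\big(b_n = 0 \mid s(b_0), s(b_1), \ldots, s(b_{N-1})\big)}{P\big(b_n = 1 \mid s(b_0), s(b_1), \ldots, s(b_{N-1})\big)} - \ln\frac{s(b_n = 0)}{s(b_n = 1)}\\
&= \underbrace{\ln\frac{s'(b_n = 0)}{s'(b_n = 1)}}_{\lambda'(b_n)} - \underbrace{\ln\frac{s(b_n = 0)}{s(b_n = 1)}}_{\lambda(b_n)}\\
&= \lambda'(b_n) - \lambda(b_n).
\end{aligned}
$$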
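
The decomposition can be checked numerically on a toy code. The sketch below is not the BCJR decoder of Table II: it assumes a length-3 single parity-check code (chosen only because its codewords can be enumerated by brute force) and computes the posterior and the extrinsic LLR of one bit by summing over all codewords. The names `spc_codewords` and `bit_llr` are hypothetical.

```python
import itertools
import math

def spc_codewords(n):
    """All length-n binary words with even parity (single parity-check code)."""
    return [w for w in itertools.product((0, 1), repeat=n) if sum(w) % 2 == 0]

def bit_llr(codewords, llrs, n, extrinsic):
    """LLR of bit n given the input LLRs, by enumerating all codewords.

    Each codeword is weighted by prod_k s(b_k = w_k), which is proportional
    to exp(-w_k * llr_k). With extrinsic=True the factor of bit n itself is
    left out, so the result does not depend on llrs[n].
    """
    num = den = 0.0
    for w in codewords:
        log_weight = -sum(w[k] * llrs[k] for k in range(len(w))
                          if not (extrinsic and k == n))
        if w[n] == 0:
            num += math.exp(log_weight)
        else:
            den += math.exp(log_weight)
    return math.log(num / den)
```

For any input LLRs, `bit_llr(..., extrinsic=True)` agrees with `bit_llr(..., extrinsic=False) - llrs[n]` up to rounding, which is the relation λ′ext(bn) = λ′(bn) − λ(bn) derived above.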
For example, the extrinsic LLRs λ′ext (bn ) for the memory-2 convolutional code defined in Sec. II-A can be computed
using the BCJR algorithm described in Table II as follows:
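$$
\lambda'_{\mathrm{ext}}(b_{2n+i}) = \ln\frac{\mathbf{b}_{n+1}^T(\mathbf{V}_i(0) \odot \boldsymbol{\Gamma}_n)\mathbf{f}_n}{\mathbf{b}_{n+1}^T(\mathbf{V}_i(1) \odot \boldsymbol{\Gamma}_n)\mathbf{f}_n} - \lambda(b_{2n+i}), \qquad i = 0, 1, \quad n = 0, 1, \ldots, K+1.
$$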
However, this formula may produce incorrect extrinsic LLRs λ′ext (bn ) in a practical implementation due to numerical
problems in computing the difference λ′ (bn )−λ(bn ). Instead, λ′ext (bn ) should be computed directly as follows:
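$$
\lambda'_{\mathrm{ext}}(b_{2n+i}) = \ln\frac{\mathbf{b}_{n+1}^T(\mathbf{V}_i(0) \odot \boldsymbol{\Gamma}_{\mathrm{ext},i,n})\mathbf{f}_n}{\mathbf{b}_{n+1}^T(\mathbf{V}_i(1) \odot \boldsymbol{\Gamma}_{\mathrm{ext},i,n})\mathbf{f}_n},
$$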


where the Γext,i,n are extrinsic transition matrices, which do not depend on the input LLR λext (b2n+i ) while
computing λ′ext (b2n+i ). For example, the extrinsic transition matrix Γext,0,n corresponding to λ′ext (b2n ) for the
memory-2 convolutional code defined in Sec. II-A is given by
 
$$
\boldsymbol{\Gamma}_{\mathrm{ext},0,n} =
\begin{pmatrix}
s(b_{2n+1}=0) & 0 & s(b_{2n+1}=1) & 0\\
s(b_{2n+1}=1) & 0 & s(b_{2n+1}=0) & 0\\
0 & s(b_{2n+1}=1) & 0 & s(b_{2n+1}=0)\\
0 & s(b_{2n+1}=0) & 0 & s(b_{2n+1}=1)
\end{pmatrix}.
$$
Similarly, the extrinsic LLR λext (cn ) is defined corresponding to sext (cn ):
$$
\lambda_{\mathrm{ext}}(c_n) = \ln\frac{s_{\mathrm{ext}}(c_n = 0)}{s_{\mathrm{ext}}(c_n = 1)}. \tag{11}
$$
For example, the extrinsic LLRs λext (cn ) for the ISI channel model defined in Sec. II-A can be computed using
the BCJR algorithm described in Table II as follows:
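$$
\begin{aligned}
\lambda_{\mathrm{ext}}(c_n) &= \ln\frac{s(x_n = +1)}{s(x_n = -1)}\bigg|_{\text{without using } \lambda'_{\mathrm{ext}}(c_n)}
= \ln\frac{P(x_n = +1 \mid \mathbf{y})}{P(x_n = -1 \mid \mathbf{y})}\bigg|_{\text{without using } \lambda'_{\mathrm{ext}}(c_n)}\\
&= \ln\frac{\mathbf{b}_{n+1}^T(\mathbf{U}(+1) \odot \boldsymbol{\Gamma}_n)\mathbf{f}_n}{\mathbf{b}_{n+1}^T(\mathbf{U}(-1) \odot \boldsymbol{\Gamma}_n)\mathbf{f}_n}\bigg|_{\text{without using } \lambda'_{\mathrm{ext}}(c_n)}
= \ln\frac{\mathbf{b}_{n+1}^T(\mathbf{U}(+1) \odot \boldsymbol{\Gamma}_n)\mathbf{f}_n}{\mathbf{b}_{n+1}^T(\mathbf{U}(-1) \odot \boldsymbol{\Gamma}_n)\mathbf{f}_n} - \lambda'_{\mathrm{ext}}(c_n).
\end{aligned}
$$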

The last line follows from using that s′ext (cn = 0) can be factored out of bTn+1 (U(+1)⊙Γn )fn and that s′ext (cn = 1)
can be factored out of bTn+1 (U(−1)⊙Γn )fn . However, this relationship holds only for symbol-based MAP detection.
In a practical implementation, λext (cn ) should be computed directly with an extrinsic transition matrix Γext,n to
avoid numerical problems:
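$$
\lambda_{\mathrm{ext}}(c_n) = \ln\frac{\mathbf{b}_{n+1}^T(\mathbf{U}(+1) \odot \boldsymbol{\Gamma}_{\mathrm{ext},n})\mathbf{f}_n}{\mathbf{b}_{n+1}^T(\mathbf{U}(-1) \odot \boldsymbol{\Gamma}_{\mathrm{ext},n})\mathbf{f}_n},
$$
where $\boldsymbol{\Gamma}_{\mathrm{ext},n}[j,i] = \exp(-|y_n - v_{i,j}|^2/(2\sigma^2))$ if $(i\;j)$ corresponds to a valid trellis branch and $\boldsymbol{\Gamma}_{\mathrm{ext},n}[j,i] = 0$
otherwise. The basic steps of turbo equalization are summarized in Fig. 5.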
The BER performance of turbo equalization using symbol-based MAP detection as an equalization algorithm is
exhibited in Fig. 6 for block length K = 510. Using the equalizer and the decoder once corresponds to separate
equalization and decoding. Applying the steps outlined in Fig. 5 results in a performance gain. After one additional
equalization-decoding step, a so-called iteration, more than 3 dB SNR $P/\sigma^2$ are gained at a BER of $10^{-5}$. More
iterations do not improve the performance significantly and instead, a limit seems to exist, which is identical to
the BER performance of the soft decoder (10) without ISI in the channel (dotted line in Fig. 6). In this case, the
channel model (1) simplifies to yn = xn + wn such that the extrinsic LLRs λext (cn ) are given by
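$$
\lambda_{\mathrm{ext}}(c_n) = \ln\frac{P(x_n = +1 \mid y_n)}{P(x_n = -1 \mid y_n)}\bigg|_{\text{without using } \lambda'_{\mathrm{ext}}(c_n)}
= \ln\frac{p(y_n \mid x_n = +1)\, s'(x_n = +1)}{p(y_n \mid x_n = -1)\, s'(x_n = -1)}\bigg|_{\text{without using } \lambda'_{\mathrm{ext}}(c_n)}.
$$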

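The PDF $p(y_n \mid x_n)$ is equal to $\mathcal{N}_{\mathbb{R}}(x_n, \sigma^2)$. However, $s'(x_n)$ follows directly from $\lambda'_{\mathrm{ext}}(c_n)$ and must be discarded
while computing $\lambda_{\mathrm{ext}}(c_n)$, which yields
$$
\lambda_{\mathrm{ext}}(c_n) = \ln\frac{p(y_n \mid x_n = +1)}{p(y_n \mid x_n = -1)} = \ln\frac{\exp(-(y_n - 1)^2/(2\sigma^2))}{\exp(-(y_n + 1)^2/(2\sigma^2))} = \frac{2 y_n}{\sigma^2}.
$$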


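[Block diagram. Transmitter: data bits a_n → Encoder → b_n → Interleaver → c_n → Mapper → x_n → Channel → y_n. Receiver: y_n → Equalizer/Detector → s(x_n) → Demapper → λext(c_n) → Deinterleaver → λext(b_n) → Decoder → â_n, with the feedback path Decoder → λ′ext(b_n) → Interleaver → λ′ext(c_n) → Mapper → s′(x_n) → Equalizer/Detector.]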

Turbo equalization
(1) initialization: set λ′ext (cn ) to 0 for all n = 0, 1, ..., N −1
(2) equalization: compute λext (cn ) for all n from y
and λ′ext (c0 ), ..., λ′ext (cN −1 ) except λ′ext (cn )
(3) deinterleaving: permute λext (cn ) to λext (bn ) for all n
(4) termination: subject to a suitable termination criterion, go to (5)
or compute ân for all n = 0, 1, ..., K −1 from λext (b0 ), ..., λext (bN −1 ) and stop
(5) decoding: compute λ′ext (bn ) for all n from λext (b0 ), ..., λext (bN −1 ) except λext (bn )
(6) interleaving: permute λ′ext (bn ) to λ′ext (cn ) for all n
(7) go to (2)

Fig. 5. Receiver structure and basic steps of the turbo equalization algorithm.
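
The steps of Fig. 5 can be sketched as a short Python loop. This is only a skeleton under stated assumptions: `equalizer` and `decoder` are hypothetical callables supplied by the user, `perm[k]` is assumed to be the interleaved position of code bit k (so that c[perm[k]] = b[k]), and the termination criterion is simply a fixed number of iterations.

```python
def turbo_equalize(y, equalizer, decoder, perm, num_iterations):
    """Skeleton of the turbo equalization loop of Fig. 5.

    equalizer(y, llr_c_prior) -> list of extrinsic LLRs lambda_ext(c_n)
    decoder(llr_b_ext)        -> (data bit estimates, extrinsic LLRs lambda'_ext(b_n))
    num_iterations = 0 corresponds to separate equalization and decoding.
    """
    n = len(perm)
    llr_c_prior = [0.0] * n                                   # (1) initialization
    for iteration in range(num_iterations + 1):
        llr_c_ext = equalizer(y, llr_c_prior)                 # (2) equalization
        llr_b_ext = [llr_c_ext[perm[k]] for k in range(n)]    # (3) deinterleaving
        a_hat, llr_b_prior = decoder(llr_b_ext)               # (5) decoding
        if iteration == num_iterations:                       # (4) termination
            return a_hat
        llr_c_prior = [0.0] * n                               # (6) interleaving
        for k in range(n):
            llr_c_prior[perm[k]] = llr_b_prior[k]
```

The sketch calls the decoder once per pass and uses its hard decisions only in the final pass; a practical receiver might instead stop early, for example once the decisions no longer change.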

[Plot: data bit error rate (logarithmic axis) versus P/σ² in dB, with curves for separate equalization and decoding and after 1, 2, and 8 iterations, and a marker for the capacity limit of the ISI channel.]

Fig. 6. Performance of turbo equalization for the communication system described in Sec. II-A using symbol-based MAP detection. The
BER performance (solid lines) is plotted for separate equalization and decoding as well as after one, two, or eight iterations. The dotted line
corresponds to the BER performance of the FEC decoder when no ISI is introduced in the channel. The dashed line is a lower bound on the
BER performance of any decoder. The considered blocklength is K = 510 (N = L = 1024). An S-random interleaver with S=16 is applied.

As such, performing several equalization-decoding tasks (iterations) given an ISI-free channel does not improve the
LLRs λext (cn ), i.e., a single decoding task is sufficient to achieve the BER performance depicted in Fig. 6. This
finding does not hold for higher-order signal alphabets as shown in Sec. III.
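
The closed form for the ISI-free BPSK case is easy to verify numerically. The following sketch, with hypothetical function names, compares 2y/σ² against the log-ratio of the two Gaussian likelihoods.

```python
def llr_awgn_bpsk(y, sigma2):
    """Closed form lambda_ext(c_n) = 2*y_n / sigma^2 for y_n = x_n + w_n."""
    return 2.0 * y / sigma2

def llr_from_likelihoods(y, sigma2):
    """ln p(y|x=+1) - ln p(y|x=-1); the common scaling of the PDFs cancels."""
    log_p_plus = -(y - 1.0) ** 2 / (2.0 * sigma2)
    log_p_minus = -(y + 1.0) ** 2 / (2.0 * sigma2)
    return log_p_plus - log_p_minus
```

Both functions depend on the observation only, not on any feedback from the decoder, which is why further iterations cannot change these LLRs in the BPSK case.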
The lower bound on the BER performance, derived from properties of the ISI channel and the code, also shows
that the communication system defined in Sec. II-A is not well designed, since even the best possible decoder is
still 4.5 dB SNR $P/\sigma^2$ away from the channel capacity at a BER of $10^{-5}$. A remedy to this problem could be
to increase the minimum distance $d_{\min,\mathrm{fec}}$ of the convolutional code, but this results in a decoding complexity
increase. Using precoding is a more elegant way to improve the distance spectrum. Such communication systems
together with a turbo equalization receiver are studied in Sec. V and VI.
It is interesting to consider the block length K required to approach the performance of the BEP optimal receiver,
or whether alternative approaches to turbo equalization exist with better performance. These questions are hard to
answer and the graphical descriptions of coded data transmission over an ISI channel introduced in [76, 77] suggest
other iterative receiver approaches that may perform as well as turbo equalization (or even better).
The interleaver serves to improve the distance spectrum of the communication system, which yields the lower
bound (dashed line) in Fig. 6. It also decorrelates error bursts or, say, dependencies between neighboring samples
of the soft information λext (cn ) and λ′ext (bn ) produced by the equalizer and the decoder, respectively. Recall that
the BCJR algorithm applied for detection and decoding assumes that the input symbols to the state-space model are
independent, i.e., the input symbols xn for the ISI channel as well as the soft information s′ (xn ) about them are
assumed independent. The same holds for the probability information about the output of a state-space model, i.e.,
the soft information λ(bn ) about the code bits bn must be independent. The interleaver cannot remove dependencies
in between the permuted bits or symbols, but it can reduce local dependencies.
The minimal required S parameter of the interleaver in a communication system applying turbo equalization can
vary tremendously depending on the system configuration. For example, large memory in the channel or the FEC
code usually requires a larger S and, thus, a larger block length for turbo equalization to work effectively.
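
The construction of an S-random interleaver is not spelled out here. A minimal sketch, assuming the common definition that any two positions at most S apart are mapped to values more than S apart, is a greedy random search with restarts; the function name and its parameters are hypothetical.

```python
import random

def s_random_interleaver(n, s, seed=0, max_restarts=1000):
    """Greedy construction of a length-n permutation with spread s.

    Assumed property: for positions i < j with j - i <= s,
    |perm[i] - perm[j]| > s.
    """
    rng = random.Random(seed)
    for _ in range(max_restarts):
        remaining = list(range(n))
        rng.shuffle(remaining)
        perm = []
        while remaining:
            recent = perm[max(0, len(perm) - s):]   # the last s chosen values
            for idx, cand in enumerate(remaining):
                if all(abs(cand - p) > s for p in recent):
                    perm.append(remaining.pop(idx))
                    break
            else:
                break                               # dead end: restart
        if len(perm) == n:
            return perm
    raise RuntimeError("no S-random permutation found; try a smaller s")
```

The search tends to succeed quickly when S is well below sqrt(n/2) and becomes slow or fails as S approaches that value, which is one way to see why a larger S calls for a larger block length.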
Another issue of interest is the number of required iterations to achieve desirable BER performance. As a practical
rule, longer block lengths require more iterations to approach the performance of the BEP-optimal decoder. In a
practical implementation, the number of iterations is often constrained by the computational complexity allowed
and the delay of the receiver algorithm. For more information about this topic, the reader is referred to [11, 71, 78].

E. SISO equalization based on linear MMSE estimation

Figure 7 shows how the linear MMSE estimator introduced in Sec. II-C can be integrated into the turbo
equalization setup defined in Fig. 5. It is called a soft-in soft-out (SISO) equalizer because it outputs and processes
soft information. Clearly, the symbol-based MAP detector used in the previous section is a SISO equalizer as well.

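[Block diagram: the observations y_n enter the MMSE estimator, whose output x̂_n is mapped to λext(c_n); the input LLRs λ′ext(c_n) are mapped to E{x_n} and Cov{x_n, x_n}, which feed the estimator.]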

Fig. 7. A SISO equalizer based on linear MMSE estimation.


The first step is to incorporate the extrinsic LLRs λ′ext (cn ) into the equalization process. This is done by
mapping the LLRs λ′ext (cn ) to probabilities s′ (xn ) followed by mapping them to new statistics µn = E{xn } and
vn = Cov{xn , xn } of the symbols xn . For BPSK modulation, this mapping is as follows:
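$$
\begin{aligned}
\mu_n &= \sum_{x \in S} x \cdot s'(x_n = x) = s'(x_n = +1) - s'(x_n = -1)\\
&= s'(c_n = 0) - s'(c_n = 1) = \frac{1}{1 + e^{-\lambda'_{\mathrm{ext}}(c_n)}} - \frac{e^{-\lambda'_{\mathrm{ext}}(c_n)}}{1 + e^{-\lambda'_{\mathrm{ext}}(c_n)}} = \tanh(\lambda'_{\mathrm{ext}}(c_n)/2), \qquad (12)\\
v_n &= \sum_{x \in S} |x - E\{x_n\}|^2 \cdot s'(x_n = x) = 1 - |\mu_n|^2.
\end{aligned}
$$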
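
The mapping (12) amounts to two lines of Python. The function name is hypothetical; the block only restates the formulas above for BPSK.

```python
import math

def bpsk_symbol_statistics(llr):
    """Map lambda'_ext(c_n) to (mu_n, v_n) for BPSK, following (12)."""
    mu = math.tanh(llr / 2.0)      # mean E{x_n}
    v = 1.0 - mu * mu              # variance Cov{x_n, x_n}
    return mu, v
```

A zero LLR gives mean 0 and variance 1 (no prior knowledge), while an LLR of large magnitude drives the mean toward ±1 and the variance toward 0.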

A second step is to map the symbol estimates x̂n produced from the linear MMSE estimator to the extrinsic LLRs
λext (cn ). Combining (9) with (11) yields the following rule to compute λext (cn ):
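$$
\lambda_{\mathrm{ext}}(c_n) = \ln\frac{s_{\mathrm{ext}}(c_n = 0)}{s_{\mathrm{ext}}(c_n = 1)} = \ln\frac{s(x_n = +1)}{s(x_n = -1)}\bigg|_{\text{without using } \lambda'_{\mathrm{ext}}(c_n)}, \tag{13}
$$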

where s(xn = x) is the value of the PDF p(en ) evaluated at en = x̂n − x. The extrinsic LLR λext (cn ) should
not depend on λ′ext (cn ) and, consequently, neither should the estimate x̂n depend on λ′ext (cn ), which affects the
derivation of the estimation algorithm.
In contrast to Sec. II-C, the linear MMSE estimator considered here processes a length-W window of observations
$\mathbf{y}_n = (y_{n-W_2}\; y_{n-W_2+1} \cdots y_{n+W_1})^T$, $W = W_1 + W_2 + 1$, rather than the complete sequence $\mathbf{y}$ to compute the estimate
$\hat{x}_n$. The solution in (8) produces all estimates $\hat{x}_n$ at once, but this requires the solution of an $L \times L$ system of
equations (complexity $O(L^3)$) and to linearly process the sequence $\mathbf{y}$ (complexity $O(L^2)$). In the remainder of this
section, the system matrix H is constructed under the termination assumption as shown on the left side of (2).
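The estimate $\hat{x}_n$ of $x_n$ minimizing the MSE $E\{|\hat{x}_n - x_n|^2\}$ is given by
$$
\hat{x}_n = E\{x_n\} + \mathrm{Cov}\{x_n, \mathbf{y}_n\}\, \mathrm{Cov}\{\mathbf{y}_n, \mathbf{y}_n\}^{-1} (\mathbf{y}_n - E\{\mathbf{y}_n\}). \tag{14}
$$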

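From the ISI channel model (1) it follows that (14) is given by
$$
\hat{x}_n = \mu_n + v_n \mathbf{h}_n^T \boldsymbol{\Sigma}_n^{-1} (\mathbf{y}_n - \mathbf{H}_n \boldsymbol{\mu}_n), \qquad \boldsymbol{\Sigma}_n = \sigma^2 \mathbf{I}_W + \mathbf{H}_n \mathbf{V}_n \mathbf{H}_n^T, \tag{15}
$$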

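where $\mathbf{H}_n$ is the $W \times (W + M_h)$ submatrix $\mathbf{H}[n-W_2 : n+W_1,\; n-W_2-M_h : n+W_1]$ of the system matrix $\mathbf{H}$ and
$$
\begin{aligned}
\boldsymbol{\mu}_n &= (\mu_{n-M_h-W_2}\;\; \mu_{n-M_h-W_2+1} \;\cdots\; \mu_{n+W_1})^T,\\
\mathbf{V}_n &= \mathrm{Diag}\{v_{n-M_h-W_2}\;\; v_{n-M_h-W_2+1} \;\cdots\; v_{n+W_1}\},\\
\mathbf{h}_n &= \mathbf{H}_n[0 : W-1,\; W_2 + M_h],
\end{aligned}
$$
i.e., $\mathbf{h}_n$ is the $(W_2 + M_h)$-th column of $\mathbf{H}_n$. The submatrix $\mathbf{H}_n$ is given by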
 
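$$
\mathbf{H}_n =
\begin{pmatrix}
h_{M_h} & h_{M_h-1} & \cdots & h_0 & 0 & \cdots & 0\\
0 & h_{M_h} & h_{M_h-1} & \cdots & h_0 & \cdots & 0\\
\vdots & & \ddots & & & \ddots & \vdots\\
0 & \cdots & 0 & h_{M_h} & h_{M_h-1} & \cdots & h_0
\end{pmatrix}
$$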


for the time indices n = W2 , W2 +1..., L−W1 −1. For simplicity, Hn is assumed to have the same structure for
the remaining time indices. Although Hn is time-invariant, the time index is kept to make clear that the following
equalization algorithms apply to time-variant ISI channels as well.
Under the termination assumption, the statistics µn and vn are set to 0 and 1, respectively, for time indices n
outside the range 0, 1, ..., L−1, which assumes that no information about the corresponding symbol xn is available
and the IUD assumption is applied.
The estimate x̂n calculated in (15) depends on λ′ext (cn ) via µn and vn . In order that x̂n be independent from
λ′ext (cn ), this particular LLR is set to 0 while computing x̂n . This choice corresponds to an IUD assumption made
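on the particular symbol $x_n$. It follows that $\mu_n$ and $v_n$ should be replaced by 0 and 1, respectively, while computing
$\hat{x}_n$, which changes (15) to
$$
\hat{x}_n = \mathbf{h}_n^T (\boldsymbol{\Sigma}_n + (1 - v_n)\, \mathbf{h}_n \mathbf{h}_n^T)^{-1} (\mathbf{y}_n - \mathbf{H}_n \boldsymbol{\mu}_n + \mu_n \mathbf{h}_n).
$$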

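Using Woodbury's identity, the expression $\mathbf{h}_n^T (\boldsymbol{\Sigma}_n + (1 - v_n)\, \mathbf{h}_n \mathbf{h}_n^T)^{-1}$ can be simplified as follows:
$$
\begin{aligned}
\mathbf{h}_n^T (\boldsymbol{\Sigma}_n + (1 - v_n)\, \mathbf{h}_n \mathbf{h}_n^T)^{-1}
&= \mathbf{h}_n^T \left(\boldsymbol{\Sigma}_n^{-1} - \boldsymbol{\Sigma}_n^{-1} \mathbf{h}_n \mathbf{h}_n^T \boldsymbol{\Sigma}_n^{-1} / ((1 - v_n)^{-1} + \mathbf{h}_n^T \boldsymbol{\Sigma}_n^{-1} \mathbf{h}_n)\right)\\
&= \mathbf{f}_n^T - (\mathbf{f}_n^T \mathbf{h}_n) \cdot \mathbf{f}_n^T / ((1 - v_n)^{-1} + \mathbf{f}_n^T \mathbf{h}_n) \quad \text{with } \mathbf{f}_n = \boldsymbol{\Sigma}_n^{-1} \mathbf{h}_n,\\
&= \mathbf{f}_n^T / (1 + (1 - v_n)\, \mathbf{f}_n^T \mathbf{h}_n).
\end{aligned}
$$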

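The estimates $\hat{x}_n$ are now given by
$$
\hat{x}_n = \frac{\mathbf{f}_n^T (\mathbf{y}_n - \mathbf{H}_n \boldsymbol{\mu}_n) + \mu_n s_n}{1 + (1 - v_n)\, s_n}, \qquad s_n = \mathbf{f}_n^T \mathbf{h}_n = \mathbf{h}_n^T \boldsymbol{\Sigma}_n^{-1} \mathbf{h}_n. \tag{16}
$$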
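To calculate the extrinsic LLRs $\lambda_{\mathrm{ext}}(c_n)$, the mean and the variance of the estimation error $e_n = \hat{x}_n - x_n$ are required. The mean is $E\{e_n\} = 0$, and the variance, under the constraint that $v_n$ is replaced by 1, is
$$
\begin{aligned}
\mathrm{Var}\{e_n\} &= 1 - \mathbf{h}_n^T (\boldsymbol{\Sigma}_n + (1 - v_n)\, \mathbf{h}_n \mathbf{h}_n^T)^{-1} \mathbf{h}_n\\
&= 1 - \mathbf{f}_n^T \mathbf{h}_n / (1 + (1 - v_n)\, \mathbf{f}_n^T \mathbf{h}_n) = 1 - s_n / (1 + (1 - v_n)\, s_n), \qquad (17)
\end{aligned}
$$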
where again Woodbury’s identity was applied. Under the Gaussian assumption on the distribution of the estimation
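error $e_n$, the PDF $p(e_n)$ is given by $\mathcal{N}_{\mathbb{R}}(0,\, 1 - s_n/(1 + (1 - v_n)\, s_n))$. Finally, from (13) the extrinsic LLRs $\lambda_{\mathrm{ext}}(c_n)$
for BPSK modulation without precoding are given by:
$$
\begin{aligned}
\lambda_{\mathrm{ext}}(c_n) &= \ln\frac{\exp(-(\hat{x}_n - 1)^2 / (2\, \mathrm{Var}\{e_n\}))}{\exp(-(\hat{x}_n + 1)^2 / (2\, \mathrm{Var}\{e_n\}))} = \frac{2 \hat{x}_n}{1 - s_n / (1 + (1 - v_n)\, s_n)}\\
&= \frac{2 (\mathbf{f}_n^T (\mathbf{y}_n - \mathbf{H}_n \boldsymbol{\mu}_n) + \mu_n s_n)}{1 - v_n s_n}, \qquad \mathbf{f}_n = (\sigma^2 \mathbf{I}_W + \mathbf{H}_n \mathbf{V}_n \mathbf{H}_n^T)^{-1} \mathbf{h}_n, \quad s_n = \mathbf{f}_n^T \mathbf{h}_n. \qquad (18)
\end{aligned}
$$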
The complete SISO equalization algorithm using linear MMSE estimation is shown in Table III.
When the input LLRs λ′ext (cn ) are 0 for all time steps n, e.g., for the initial equalization step in Fig. 5 or for
usual linear MMSE estimation without incorporating prior knowledge about the symbols xn , the means µn are
equal to 0 and the variances vn are equal to 1 for all n. It follows that the coefficient vectors fn are given by
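$(\sigma^2 \mathbf{I}_W + \mathbf{H}_n \mathbf{H}_n^T)^{-1} \mathbf{h}_n$ for all n, i.e., they are time-invariant and equal to the common linear MMSE equalizer [3].
The estimates $\hat{x}_n$ are in this case given by $\mathbf{f}_n^T \mathbf{y}_n$ and the extrinsic LLRs $\lambda_{\mathrm{ext}}(c_n)$ are given by $2 \mathbf{f}_n^T \mathbf{y}_n / (1 - s_n)$.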


TABLE III
SISO EQUALIZATION ALGORITHM BASED ON LINEAR MMSE ESTIMATION .

INPUT
extrinsic LLRs λ′ext (cn ) for n = 0, 1, ..., N −1
INITIALIZATION
compute µn = tanh(λ′ext (cn )/2) and vn = 1 − |µn |2 for n = 0, 1, ..., N −1
LINEAR MMSE ESTIMATION
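FOR n = 0 TO N − 1 DO
  $\mathbf{f}_n = (\sigma^2 \mathbf{I}_W + \mathbf{H}_n \mathbf{V}_n \mathbf{H}_n^T)^{-1} \mathbf{h}_n$
  $s_n = \mathbf{f}_n^T \mathbf{h}_n$
  $\lambda_{\mathrm{ext}}(c_n) = 2 (\mathbf{f}_n^T (\mathbf{y}_n - \mathbf{H}_n \boldsymbol{\mu}_n) + \mu_n s_n) / (1 - v_n s_n)$
END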
OUTPUT
extrinsic LLRs λext (cn ) for n = 0, 1, ..., N −1
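
Table III translates almost line by line into NumPy. The sketch below is a minimal illustration for real-valued BPSK, not a reference implementation. Its edge handling is an assumption: observations outside 0, ..., L−1 are dropped from the window (so the window is shorter near the block edges), and symbols outside this range get mean 0 and variance 1. The function name and argument order are hypothetical.

```python
import numpy as np

def siso_mmse_equalizer(y, h, sigma2, llr_prior, w1, w2):
    """Extrinsic LLRs lambda_ext(c_n), n = 0..L-1, following Table III (BPSK)."""
    y = np.asarray(y, dtype=float)
    h = np.asarray(h, dtype=float)
    L, mh = len(y), len(h) - 1
    W = w1 + w2 + 1
    # H_n: W x (W + Mh) banded matrix, row k = (0 .. 0 h_Mh ... h_0 0 .. 0)
    H = np.zeros((W, W + mh))
    for k in range(W):
        H[k, k:k + mh + 1] = h[::-1]
    mu = np.tanh(np.asarray(llr_prior, dtype=float) / 2.0)
    v = 1.0 - mu ** 2
    # Symbols outside 0..L-1: mean 0, variance 1 (IUD assumption)
    mu_pad = np.concatenate([np.zeros(w2 + mh), mu, np.zeros(w1)])
    v_pad = np.concatenate([np.ones(w2 + mh), v, np.ones(w1)])
    llr_ext = np.empty(L)
    for n in range(L):
        # Assumption: keep only window rows whose observation actually exists
        rows = [k for k in range(W) if 0 <= n - w2 + k < L]
        Hn = H[rows]
        hn = Hn[:, w2 + mh]                      # column multiplying x_n
        y_win = y[[n - w2 + k for k in rows]]
        mu_win = mu_pad[n:n + W + mh]            # mu_{n-Mh-W2} ... mu_{n+W1}
        v_win = v_pad[n:n + W + mh]
        cov = sigma2 * np.eye(len(rows)) + (Hn * v_win) @ Hn.T
        f = np.linalg.solve(cov, hn)             # f_n = Sigma_n^{-1} h_n
        s = f @ hn
        llr_ext[n] = 2.0 * (f @ (y_win - Hn @ mu_win) + mu[n] * s) / (1.0 - v[n] * s)
    return llr_ext
```

A quick sanity check is the ISI-free case h = (1) with W1 = W2 = 0: the output reduces to 2y_n/σ² regardless of the input LLRs, as expected for extrinsic information. Solving one W × W system per symbol is the costly part of this direct form.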

[Plot: data bit error rate (logarithmic axis) versus P/σ² in dB, with curves for separate equalization and decoding and after 1, 2, and 8 iterations, and a marker for the capacity limit of the ISI channel.]

Fig. 8. Performance of turbo equalization for the communication system described in Sec. II-A using linear MMSE estimation. The BER
performance (solid lines) is plotted for separate equalization and decoding as well as after 1, 2, or 8 iterations. The dotted line corresponds to
the BER performance of the FEC decoder when no ISI is introduced in the channel. The dashed line is a lower bound on the BER performance
of any decoder. The blocklength considered is K = 510 (N = L = 1024). An S-random interleaver with S=16 is applied.

The BER performance of turbo equalization using linear MMSE estimation as equalization algorithm is shown
in Fig. 8 for the same block length K = 510 as in Fig. 6. Using the equalizer and the decoder once corresponds
to separate equalization and decoding. Applying the steps outlined in Fig. 5 while computing the LLRs λext (cn )
according to (18) results in a performance gain even larger than that for turbo equalization based on symbol-based
MAP detection shown in Fig. 6. After one iteration, 8 dB less SNR $P/\sigma^2$ is required to achieve a BER of $10^{-5}$.
As in Fig. 6, the performance does not improve significantly using more iterations and, instead, approaches the
BER performance of the soft decoder with an ISI-free channel (dotted line in Fig. 6). The BER performance of
turbo equalization using the linear MMSE estimator is in general worse than that using the symbol-based MAP
detector for turbo equalization for any SNR and any number of iterations, because the locally BEP-optimal MAP
detector produces more reliable extrinsic LLRs λext (cn ). However, comparing Figs. 6 and 8 reveals that the
performance gap narrows tremendously over the iterations and is negligible after 8 iterations.


The applicability of decision-feedback equalization for turbo equalization was investigated in [25]. However, the
results were not promising as this amounts to replacing soft information s′ (xn ) by a quantized value, which is
inferior to the soft information itself. As such, the linear MMSE approach is, in fact, a decision feedback equalizer
that employs soft decisions from the decoder, rather than hard decisions from the output of the equalizer or decoder.
The BER performance of turbo equalization is also affected by the Gaussian assumption made on the distribution
of the estimation error
$$
e_n = \hat{x}_n - x_n = \frac{\mathbf{f}_n^T (\mathbf{y}_n - \mathbf{H}_n \boldsymbol{\mu}_n) + \mu_n s_n}{1 + (1 - v_n)\, s_n} - x_n. \tag{19}
$$
Recall that this assumption allows computation of the extrinsic LLRs λext (cn ) efficiently. However, these LLRs are
incorrect if the true PDF p(en ) is not Gaussian. This unwanted property is called inconsistency in Section VI. Using
the exact PDF p(en ) for each time index n to accurately calculate the equalizer output LLRs λext (cn ) is impractical.
Fortunately, the Gaussian approximation NR (E{en }, Var{en }) of p(en ) given by NR (0, 1−sn /(1+(1−vn ) sn ))
is usually close to p(en ), although this observation does not hold for higher-order signal alphabets as shown in
Sec. III.
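Input LLRs $\lambda'_{\mathrm{ext}}(c_n)$ with large magnitude imply that $\mu_n = x_n$ and $v_n = 0$ hold for all n. The coefficient vector
$\mathbf{f}_n = \boldsymbol{\Sigma}_n^{-1} \mathbf{h}_n$ and the term $s_n = \mathbf{f}_n^T \mathbf{h}_n$ are in this case given by
$$
\mathbf{f}_n = (\sigma^2 \mathbf{I}_W)^{-1} \mathbf{h}_n = \mathbf{h}_n / \sigma^2 \quad \text{and} \quad s_n = \mathbf{f}_n^T \mathbf{h}_n = P_h / \sigma^2,
$$
where $P_h = \sum_{i=0}^{M_h} |h_i|^2$ is the power of the ISI channel. The estimates $\hat{x}_n$ are given by
$$
\begin{aligned}
\hat{x}_n &= \frac{\mathbf{h}_n^T (\mathbf{y}_n - \mathbf{H}_n \boldsymbol{\mu}_n) / \sigma^2 + \mu_n \cdot P_h / \sigma^2}{1 + P_h / \sigma^2}\\
&= \frac{\mathbf{h}_n^T (\mathbf{y}_n - \mathbf{H}_n (\mu_{n-M_h-W_2} \;\ldots\; \mu_{n-1} \;\; 0 \;\; \mu_{n+1} \;\ldots\; \mu_{n+W_1})^T)}{\sigma^2 + P_h}. \qquad (20)
\end{aligned}
$$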
The estimation error $e_n$ is given by
$$
e_n = \frac{\mathbf{h}_n^T \mathbf{w}_n + P_h \cdot x_n}{\sigma^2 + P_h} - x_n = \frac{\mathbf{h}_n^T \mathbf{w}_n - \sigma^2 \cdot x_n}{\sigma^2 + P_h}.
$$
The PDF p(en ) is the superposition of the two conditional PDFs p(en |xn = x), which are given by NR (−xσ 2 /(σ 2+
Ph ), Ph σ 2 /(σ 2 +Ph )), weighted with s′ (xn = x) = 0.5:

p(en ) = κ · exp(−|en |2 /(2Ph σ 2 /(σ 2 +Ph ))) · cosh(en /(2Ph )).

The constant κ assures that p(en ) is a PDF. This distribution closely matches the Gaussian approximation NR (0, 1−
sn /(1+(1−vn ) sn )) given by NR (0, σ 2 /(σ 2 +Ph )).
The performance results in Figs. 6 and 8 indicate that the linear MMSE estimator is a viable alternative to the
symbol-based MAP detector, since the BER performance is nearly identical after a few iterations. However, this
result should be interpreted with care, since the required number of iterations and the required blocklength K
to achieve this similarity may be beyond the specifications of a given application. For separate equalization and
decoding and within early iterations, there is still a considerable performance gap between the two approaches.


III. COMMUNICATION SYSTEMS USING PSK OR QAM MODULATION

A. System model

In this section, the transmitted symbols $x_n$ are chosen from a $2^q$-ary QAM signal alphabet $S$, i.e., they are
in general complex-valued. The q-tuples are directly mapped to a symbol $x_n$ using a Gray mapping function.
The ISI channel model is identical to (1) except that the channel coefficients $h_i$ are complex-valued and the noise
samples $w_n$ are circularly symmetric complex Gaussian distributed with variance $\sigma^2$:
$$
y_n = w_n + \sum_{i=0}^{M_h} h_i x_{n-i}, \quad n = 0, 1, \ldots, L-1, \quad \text{or} \quad \mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{w}. \tag{21}
$$

The PDF $p(w_n)$ is therefore given by $\mathcal{N}_{\mathbb{C}}(0, \sigma^2)$, where $\mathcal{N}_{\mathbb{C}}(\mu, \sigma^2)$ denotes
$$
p(x) = \exp\!\left(-|x - \mu|^2 / \sigma^2\right) / (\pi \sigma^2).
$$

The same length-3 unit power example channel as in Sec. II-A is used to illustrate the BER performance of the
algorithms described here. The system matrix H is constructed as in (2).
The same memory-($M_c = 2$) convolutional code as in Sec. II-A is applied such that the system code rate R is
given by
$$
R = \log_2 |S| \cdot R_{\mathrm{fec}} = q \cdot R_{\mathrm{fec}} = q \cdot \frac{1}{2 + 2/K}.
$$
The signal alphabets $S$ can be chosen to satisfy the power constraint $E\{|x_n|^2\} = P = 1$ under the IUD assumption
on the symbols xn . BER performance results are obtained using the 8-PSK alphabet depicted in Fig. 9. This choice
is made for simplicity, i.e., the following derivations apply to any memory-Mh ISI channel with complex-valued
coefficients hi and any finite signal alphabet S.

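[Constellation diagram: eight points on the unit circle. Counterclockwise from +1, the bit-triple labels are 111, 110, 010, 000, 100, 101, 001, 011.]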

Fig. 9. An 8-PSK signal alphabet with a Gray mapping between the bit-triple cn = (c3n c3n+1 c3n+2 ) and the symbol xn .

B. SISO equalization based on symbol-based MAP detection

The BEP-optimal decoder (6) is the same as in Sec. II-B. The approaches for separate equalization and decoding
considered in Sec. II-C are special cases of the turbo equalization algorithm (initial equalization step) derived in
Sec. II-D. The focus of this section is on the second step in Fig. 5, which is to compute the extrinsic LLR λext (cn )
from the observations y and all input LLRs λ′ext (c0 ), ..., λ′ext (cN −1 ) except λ′ext (cn ).


We first consider MAP detection as in Table I. The trellis describing a memory-$M_h$ ISI channel with input symbols
$x_n$ from a signal alphabet $S$ of size $2^q$ has $2^{qM_h}$ states. The entries $\gamma_n(i,j) = p(y_n \mid v_n = v_{i,j}) \cdot s'(x_n = x_{i,j})$ of the
$2^{qM_h} \times 2^{qM_h}$ matrices $\boldsymbol{\Gamma}_n$ are given by
$$
\boldsymbol{\Gamma}_n[j,i] =
\begin{cases}
\kappa \cdot \exp\!\left(-\dfrac{|y_n - v_{i,j}|^2}{\sigma^2}\right) s'(x_n = x_{i,j}), & (i\;j) \text{ corresponds to a valid trellis branch},\\[4pt]
0, & \text{otherwise},
\end{cases}
$$

for $i, j = 0, 1, \ldots, 2^{qM_h} - 1$, where $v_{i,j}$ and $x_{i,j}$ follow from the ISI channel trellis and $\kappa = 1/(\pi\sigma^2)$. The scaling
with $\kappa$ can be neglected if a normalization step is applied in the forward and backward recursion of the BCJR
algorithm.
Each symbol $x_n$ depends on exactly q code bits $\mathbf{c}_n = (c_{qn} \ldots c_{qn+q-1})$, such that a symbol probability $s'(x_n)$ is the
product of q extrinsic code bit probabilities, i.e., $s'(x_n = x) = \prod_{k=0}^{q-1} s'_{\mathrm{ext}}(c_{qn+k} = m_k)$ with $x = \mathrm{map}(m_0, \ldots, m_{q-1})$.
For example, the probability $s'(x_n = -1)$ for the 8-PSK alphabet in Fig. 9 is given by $s'_{\mathrm{ext}}(c_{3n} = 1)\, s'_{\mathrm{ext}}(c_{3n+1} = 0)\, s'_{\mathrm{ext}}(c_{3n+2} = 0)$. The soft mapping can be rewritten in terms of extrinsic LLRs $\lambda'_{\mathrm{ext}}(c_n)$:
$$
s'(x_n = x) = \prod_{k=0}^{q-1} \frac{\exp(-m_k \cdot \lambda'_{\mathrm{ext}}(c_{qn+k}))}{1 + \exp(-\lambda'_{\mathrm{ext}}(c_{qn+k}))}, \qquad x = \mathrm{map}(m_0, \ldots, m_{q-1}). \tag{22}
$$
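
The soft mapping (22) can be illustrated for 8-PSK with a short Python sketch. The label list is an assumed reading of Fig. 9 (counterclockwise from +1); it agrees with the example s′(xn = −1) above, where −1 carries the label 100. The function name and the choice to index symbols by i, meaning the symbol exp(ȷiπ/4), are hypothetical.

```python
import math

# Assumed Gray labels of the symbols exp(j*i*pi/4), i = 0..7, read from Fig. 9
LABELS_8PSK = ["111", "110", "010", "000", "100", "101", "001", "011"]

def symbol_probabilities(llrs, labels=LABELS_8PSK):
    """Eq. (22): map the q extrinsic bit LLRs of one symbol to s'(x_n = x).

    Returns a dict {symbol index i: probability}. The plain exp() is kept
    for readability; very negative LLRs would need a more careful form.
    """
    probs = {}
    for i, label in enumerate(labels):
        p = 1.0
        for bit, llr in zip(label, llrs):
            p *= math.exp(-int(bit) * llr) / (1.0 + math.exp(-llr))
        probs[i] = p
    return probs
```

With all input LLRs equal to zero, every symbol receives probability 1/8, which is the IUD starting point of the first equalization step.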

Practical implementations of the BCJR algorithm most often store the quantity ln Γn , whose entries are easily
computed from the LLRs λ′ext (cn ),

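$$
\ln \boldsymbol{\Gamma}_n[j,i] =
\begin{cases}
-\dfrac{|y_n - v_{i,j}|^2}{\sigma^2} - \displaystyle\sum_{k=0}^{q-1} m_k \cdot \lambda'_{\mathrm{ext}}(c_{qn+k}), & (i\;j) \text{ corresponds to a valid trellis branch},\\[4pt]
-\infty, & \text{otherwise},
\end{cases}
$$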

where xi,j = map(m0 , ..., mq−1 ) and all unnecessary normalization factors are omitted.
The soft demapping from the posterior probabilities P (xn |y) to the extrinsic LLRs λext (cqn+k ) is performed by
marginalizing cqn+k in the corresponding joint probability P (cn |y):
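$$
\begin{aligned}
\lambda_{\mathrm{ext}}(c_{qn+k}) &= \ln\frac{s_{\mathrm{ext}}(c_{qn+k} = 0)}{s_{\mathrm{ext}}(c_{qn+k} = 1)}
= \ln\frac{\sum_{\mathbf{m} \in \mathbb{F}_2^q : m_k = 0} P(x_n = \mathrm{map}(\mathbf{m}) \mid \mathbf{y})}{\sum_{\mathbf{m} \in \mathbb{F}_2^q : m_k = 1} P(x_n = \mathrm{map}(\mathbf{m}) \mid \mathbf{y})}\bigg|_{\text{without using } \lambda'_{\mathrm{ext}}(c_{qn+k})}\\
&= \ln\frac{\sum_{\mathbf{m} \in \mathbb{F}_2^q : m_k = 0} \mathbf{b}_{n+1}^T (\mathbf{U}(\mathrm{map}(\mathbf{m})) \odot \boldsymbol{\Gamma}_n) \mathbf{f}_n}{\sum_{\mathbf{m} \in \mathbb{F}_2^q : m_k = 1} \mathbf{b}_{n+1}^T (\mathbf{U}(\mathrm{map}(\mathbf{m})) \odot \boldsymbol{\Gamma}_n) \mathbf{f}_n}\bigg|_{\text{without using } \lambda'_{\mathrm{ext}}(c_{qn+k})}
\end{aligned}
$$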

with m = (m0 , m1 , ..., mq−1 ), where U(u) follows from the trellis describing the ISI channel and fn and bn are
part of the BCJR algorithm. For example, to compute $\lambda_{\mathrm{ext}}(c_{3n})$ for the 8-PSK alphabet in Fig. 9, the summation in
the numerator is over the symbol values $e^{\jmath i \pi/4}$ for i = 2, 3, 6, 7, and the summation in the denominator is over the
symbol values $e^{\jmath i \pi/4}$ for i = 0, 1, 4, 5. To ensure that $\lambda_{\mathrm{ext}}(c_{qn+k})$ does not depend on $\lambda'_{\mathrm{ext}}(c_{qn+k})$, we observe
that computing $\lambda_{\mathrm{ext}}(c_{qn+k})$ with the extrinsic transition matrix
$$
\boldsymbol{\Gamma}_{\mathrm{ext},n}[j,i] =
\begin{cases}
\exp(-|y_n - v_{i,j}|^2 / \sigma^2), & (i\;j) \text{ corresponds to a valid trellis branch},\\
0, & \text{otherwise},
\end{cases}
$$


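and
$$
\lambda_{\mathrm{ext}}(c_{qn+k}) = \ln\frac{\sum_{\mathbf{m} \in \mathbb{F}_2^q : m_k = 0} \mathbf{b}_{n+1}^T (\mathbf{U}(\mathrm{map}(\mathbf{m})) \odot \boldsymbol{\Gamma}_{\mathrm{ext},n}) \mathbf{f}_n \prod_{j=0, j \neq k}^{q-1} \exp(-m_j \cdot \lambda'_{\mathrm{ext}}(c_{qn+j}))}{\sum_{\mathbf{m} \in \mathbb{F}_2^q : m_k = 1} \mathbf{b}_{n+1}^T (\mathbf{U}(\mathrm{map}(\mathbf{m})) \odot \boldsymbol{\Gamma}_{\mathrm{ext},n}) \mathbf{f}_n \prod_{j=0, j \neq k}^{q-1} \exp(-m_j \cdot \lambda'_{\mathrm{ext}}(c_{qn+j}))},
$$
ensures that $\lambda_{\mathrm{ext}}(c_{qn+k})$ does not depend on $\lambda'_{\mathrm{ext}}(c_{qn+k})$. This relationship follows from (22) and from the fact that the
product $\prod_{j=0}^{q-1} \exp(-m_j \cdot \lambda'_{\mathrm{ext}}(c_{qn+j}))$ can be factored out of the expression $\mathbf{b}_{n+1}^T (\mathbf{U}(\mathrm{map}(\mathbf{m})) \odot \boldsymbol{\Gamma}_n) \mathbf{f}_n$.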
The system rate R given the 8-PSK signal alphabet in Fig. 9 equals $R = q \cdot R_{\mathrm{fec}} = 1.5$ bits per channel use
for large block lengths L. A minimum SNR $P/\sigma^2$ of 4.7 dB is required in order that the constrained IUD capacity
$C_{c,\mathrm{iud}}(S)$ for 8-PSK is larger than R = 1.5. Thus, the capacity limit for arbitrarily reliable data transmission using
the given system configuration is 4.7 dB SNR.

[Plot: data bit error rate (logarithmic axis) versus P/σ² in dB, with curves for separate equalization and decoding and after 1, 2, and 8 iterations, and a marker for the capacity limit of the ISI channel.]

Fig. 10. Performance of turbo equalization for the communication system described in Sec. III-A using symbol-based MAP detection. The
BER performance (solid lines) is plotted for separate equalization and decoding as well as after 1, 2, or 8 iterations. The dotted line corresponds
to the BER performance of bit-interleaved coded modulation given an ISI-free channel after 8 demapping-decoding iterations. The considered
block length is K = 510 (N = 1024 and L = 342). An S-random interleaver with S=16 is applied.

The BER performance of turbo equalization using the symbol-based MAP detector as equalization algorithm is
shown in Fig. 10 for the same block length K = 510 as in the previously shown simulations in Figs. 6 and 8.
Applying the steps outlined in Fig. 5 results in a performance improvement in the first iteration, since 1.5 dB SNR
$P/\sigma^2$ is gained at a BER of $10^{-5}$. As in Sec. II-D, more iterations do not improve the performance significantly
and, instead, the BER performance of the soft decoder (10) without ISI (dotted line in Fig. 10) is a lower bound.
In this case, the channel model (21) simplifies to $y_n = x_n + w_n$ such that the extrinsic LLRs $\lambda_{\mathrm{ext}}(c_n)$ are given by
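$$
\begin{aligned}
\lambda_{\mathrm{ext}}(c_{qn+k}) &= \ln\frac{\sum_{\mathbf{m} \in \mathbb{F}_2^q : m_k = 0} P(x_n = \mathrm{map}(\mathbf{m}) \mid y_n)}{\sum_{\mathbf{m} \in \mathbb{F}_2^q : m_k = 1} P(x_n = \mathrm{map}(\mathbf{m}) \mid y_n)}\bigg|_{\text{without using } \lambda'_{\mathrm{ext}}(c_{qn+k})}\\
&= \ln\frac{\sum_{\mathbf{m} \in \mathbb{F}_2^q : m_k = 0} p(y_n \mid x_n = \mathrm{map}(\mathbf{m})) \prod_{j=0, j \neq k}^{q-1} \exp(-m_j \cdot \lambda'_{\mathrm{ext}}(c_{qn+j}))}{\sum_{\mathbf{m} \in \mathbb{F}_2^q : m_k = 1} p(y_n \mid x_n = \mathrm{map}(\mathbf{m})) \prod_{j=0, j \neq k}^{q-1} \exp(-m_j \cdot \lambda'_{\mathrm{ext}}(c_{qn+j}))}. \qquad (23)
\end{aligned}
$$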

The PDF p(yn |xn ) is equal to NC (xn , σ 2 ). In contrast to the finding in Sec. II-D obtained for BPSK modulation,
performing equalization-decoding tasks (iterations) given an ISI-free channel and a general signal alphabet does
improve the LLRs λext (cn ). In fact, the BER performance depicted in Fig. 10 for the ISI-free case was obtained
after 8 iterations, even though the improvement over the iterations is small as explained in Section VI. For the
QPSK signal alphabet it is zero, because the terms in (23) containing the LLRs λ′ext (c2n+j ) disappear. Consider
for example the LLRs λext (c2n ) with even indices:
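$$
\begin{aligned}
\lambda_{\mathrm{ext}}(c_{2n}) &= \ln\frac{\sum_{(m_0,m_1) \in \mathbb{F}_2^2 : m_0 = 0} p(y_n \mid x_n = \mathrm{map}(m_0, m_1)) \exp(-m_1 \cdot \lambda'_{\mathrm{ext}}(c_{2n+1}))}{\sum_{(m_0,m_1) \in \mathbb{F}_2^2 : m_0 = 1} p(y_n \mid x_n = \mathrm{map}(m_0, m_1)) \exp(-m_1 \cdot \lambda'_{\mathrm{ext}}(c_{2n+1}))}\\
&= \ln\frac{\exp(-|y_n - \kappa(1 + \jmath)|^2/\sigma^2) + \exp(-|y_n - \kappa(1 - \jmath)|^2/\sigma^2) \exp(-\lambda'_{\mathrm{ext}}(c_{2n+1}))}{\exp(-|y_n - \kappa(-1 + \jmath)|^2/\sigma^2) + \exp(-|y_n - \kappa(-1 - \jmath)|^2/\sigma^2) \exp(-\lambda'_{\mathrm{ext}}(c_{2n+1}))}.
\end{aligned}
$$
The last line simplifies to
$$
\lambda_{\mathrm{ext}}(c_{2n}) = \frac{4 \kappa\, \mathrm{Re}\{y_n\}}{\sigma^2}, \qquad \kappa = 1/\sqrt{2},
$$
since the following expression can be factored out in both the numerator and the denominator:
$$
\exp(-(\mathrm{Im}\{y_n\} - \kappa)^2/\sigma^2) + \exp(-(\mathrm{Im}\{y_n\} + \kappa)^2/\sigma^2) \exp(-\lambda'_{\mathrm{ext}}(c_{2n+1})).
$$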
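
The soft demapping rule (23) and the QPSK simplification can be checked with a short Python sketch. The function name and the dictionary-based constellation format are assumptions; the QPSK mapping used in the usage note is the one implied by the equation above, with map(0,0) = κ(1+ȷ), map(0,1) = κ(1−ȷ), map(1,0) = κ(−1+ȷ), and map(1,1) = κ(−1−ȷ).

```python
import math

def soft_demap(y, sigma2, prior_llrs, constellation):
    """Eq. (23): extrinsic LLRs of the q bits of one symbol, ISI-free channel.

    y             : complex observation y_n = x_n + w_n
    prior_llrs    : list of q input LLRs lambda'_ext(c_{qn+j})
    constellation : dict {bit tuple m: complex symbol map(m)}
    """
    q = len(prior_llrs)
    out = []
    for k in range(q):
        num = den = 0.0
        for m, x in constellation.items():
            # log-likelihood plus the priors of all bits except bit k
            log_w = -abs(y - x) ** 2 / sigma2
            log_w -= sum(m[j] * prior_llrs[j] for j in range(q) if j != k)
            if m[k] == 0:
                num += math.exp(log_w)
            else:
                den += math.exp(log_w)
        out.append(math.log(num / den))
    return out
```

For the QPSK mapping above, the first output equals 4κ Re{y_n}/σ² whatever value is passed for the other bit's LLR, which is the sense in which iterating brings no gain for QPSK on an ISI-free channel.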

The equalization step given an ISI-free channel is merely a soft demapping operation. A communication system
consisting of a FEC code, an interleaver, and a mapper, which transmits data over an (ISI-free) AWGN channel is
called bit-interleaved coded modulation (BICM) in the literature [79]. The explanations at the end of Sec. II-D and
the results in [79] show that the BER performance of BICM is the same as that of the BEP-optimal decoder (6)
for ideal interleaving. Since the performance of turbo equalization closely approaches that of BICM as shown in
Fig. 10, there is again the satisfying property that turbo equalization achieves the performance of the BEP-optimal
decoder for the communication system. However, the BER performance of BICM is far away from the capacity
limit, which calls for additional system design, e.g., including precoding or the use of another mapping function
map(·) [80].

C. SISO equalization based on MMSE estimation

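It is also natural to extend the SISO equalizer based on linear MMSE equalization derived in Sec. II-E to higher-order signal alphabets. As in (14), only a length-W window $\mathbf{y}_n$ of observations $y_n$ is considered to compute the
MMSE-optimal estimate $\hat{x}_n$ of $x_n$ minimizing the MSE $E\{|\hat{x}_n - x_n|^2\}$. The MMSE-optimal estimate $\hat{x}_n$ of the
complex-valued symbol $x_n$ is not a linear, but a widely linear combination of the complex-valued observations $\mathbf{y}_n$:
$$
\hat{x}_n = E\{x_n\} + \mathbf{u}_n (\mathbf{y}_n - E\{\mathbf{y}_n\}) + \mathbf{v}_n (\mathbf{y}_n - E\{\mathbf{y}_n\})^*, \tag{24}
$$
where
$$
\begin{aligned}
\mathbf{u}_n &= (\mathrm{Cov}\{x_n, \mathbf{y}_n\} - \mathrm{Cov}\{x_n, \mathbf{y}_n^*\} (\mathrm{Cov}\{\mathbf{y}_n, \mathbf{y}_n\}^{-1})^* \mathrm{Cov}\{\mathbf{y}_n, \mathbf{y}_n^*\}^*)\, \mathbf{R}_n,\\
\mathbf{v}_n &= (\mathrm{Cov}\{x_n, \mathbf{y}_n^*\} - \mathrm{Cov}\{x_n, \mathbf{y}_n\}\, \mathrm{Cov}\{\mathbf{y}_n, \mathbf{y}_n\}^{-1} \mathrm{Cov}\{\mathbf{y}_n, \mathbf{y}_n^*\})\, \mathbf{R}_n^*,\\
\mathbf{R}_n &= (\mathrm{Cov}\{\mathbf{y}_n, \mathbf{y}_n\} - \mathrm{Cov}\{\mathbf{y}_n, \mathbf{y}_n^*\} (\mathrm{Cov}\{\mathbf{y}_n, \mathbf{y}_n\}^{-1})^* \mathrm{Cov}\{\mathbf{y}_n, \mathbf{y}_n^*\}^*)^{-1}.
\end{aligned}
$$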


It is also possible to consider the real-valued estimate (x̂n,R x̂n,I ) of (Re{xn } Im{xn }), which linearly depends
on the 2W real-valued observations Re{yn } and Im{yn }, as in [81]. However, it turns out in Sec. IV that low-
complexity approximate implementations of (24) are easier to derive based on the description in ℂ, i.e., in the complex domain.

April 25, 2010 DRAFT



The statistics µn = E{xn } and vn = Var{xn } of the transmitted symbols xn given an arbitrary signal alphabet
S are given by

$$\mu_n = \sum_{x\in S} x \cdot s'(x_n = x) \quad\text{and}\quad v_n = \sum_{x\in S} |x-\mu_n|^2 \cdot s'(x_n = x). \tag{25}$$
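As an illustration of (25), the following minimal sketch computes the symbol PMF from the bit LLRs and then the mean, variance, and pseudo-variance (the latter as defined in (26)). The function names and the natural-binary 8PSK labeling `psk8` are assumptions made for this example only; the labeling actually used in Fig. 9 may differ. The LLR convention assumed is λ = ln P(c = 0)/P(c = 1), matching the product formula for s′(xn = x) in Table IV.

```python
import cmath
import math
from itertools import product


def symbol_pmf(llrs, mapping):
    """Return {symbol: probability} from q bit LLRs (lambda = ln P(c=0)/P(c=1))."""
    q = len(llrs)
    pmf = {}
    for m in product((0, 1), repeat=q):
        p = 1.0
        for m_k, lam in zip(m, llrs):
            # P(c=0) = 1/(1+exp(-lam)), P(c=1) = exp(-lam)/(1+exp(-lam))
            p *= math.exp(-m_k * lam) / (1.0 + math.exp(-lam))
        pmf[mapping[m]] = pmf.get(mapping[m], 0.0) + p
    return pmf


def symbol_stats(pmf):
    """Mean, variance and pseudo-variance of a symbol with the given PMF."""
    mu = sum(x * p for x, p in pmf.items())
    v = sum(abs(x - mu) ** 2 * p for x, p in pmf.items())
    psi = sum((x - mu) ** 2 * p for x, p in pmf.items())
    return mu, v, psi


# Hypothetical labeling: bit triple -> 8PSK point in natural binary order.
psk8 = {
    m: cmath.exp(2j * math.pi * (4 * m[0] + 2 * m[1] + m[2]) / 8)
    for m in product((0, 1), repeat=3)
}

mu0, v0, psi0 = symbol_stats(symbol_pmf([0.0, 0.0, 0.0], psk8))   # no prior knowledge
mu1, v1, psi1 = symbol_stats(symbol_pmf([1.5, -0.7, 0.3], psk8))  # some prior knowledge
```

With all-zero LLRs the PMF is uniform, so the statistics reduce to those of IUD symbols; for any PSK alphabet the variance obeys v = 1 − |µ|².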

A special property of PSK alphabets such as the 8PSK alphabet depicted in Fig. 9 or the BPSK alphabet {+1, −1}
is that vn = 1 − |µn |2 holds. Recall that the soft information s′ (xn ) is a symbol probability that xn takes on a
value from S. The pseudo-covariances Cov{xn , yn∗ } and Cov{yn , yn∗ } are evaluated using the ISI channel model
(21)
Cov{xn , yn∗ } = Cov{xn , x∗n }hTn and Cov{yn , yn∗ } = Hn Cov{xn , x∗n }HTn = Hn Ψn HTn .

From the independence assumption on the symbols xn, it follows that Ψn = Cov{xn , x∗n } is a diagonal matrix with
the pseudo-variances

$$\psi_n = \mathrm{Cov}\{x_n, x_n^*\} = \sum_{x\in S} (x-\mu_n)^2 \cdot s'(x_n = x) \tag{26}$$

on the main diagonal. Combining all results, the estimates x̂n are given by

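$$\hat{x}_n = \mu_n + \mathbf{u}_n(\mathbf{y}_n - \mathbf{H}_n\boldsymbol{\mu}_n) + \mathbf{v}_n(\mathbf{y}_n - \mathbf{H}_n\boldsymbol{\mu}_n)^*, \tag{27}$$

where $\mathbf{u}_n = (v_n\mathbf{h}_n^H - \psi_n\mathbf{h}_n^T(\boldsymbol{\Sigma}_n^{-1})^*(\mathbf{H}_n\boldsymbol{\Psi}_n\mathbf{H}_n^T)^*)\,\mathbf{R}_n$, $\mathbf{v}_n = (\psi_n\mathbf{h}_n^T - v_n\mathbf{h}_n^H\boldsymbol{\Sigma}_n^{-1}\mathbf{H}_n\boldsymbol{\Psi}_n\mathbf{H}_n^T)\,\mathbf{R}_n^*$, and $\mathbf{R}_n = (\boldsymbol{\Sigma}_n - \mathbf{H}_n\boldsymbol{\Psi}_n\mathbf{H}_n^T(\boldsymbol{\Sigma}_n^{-1})^*(\mathbf{H}_n\boldsymbol{\Psi}_n\mathbf{H}_n^T)^*)^{-1}$. Here, the scalar $v_n$ is the symbol variance and the bold $\mathbf{v}_n$ is the coefficient vector.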
IUD symbols xn, i.e., symbols for which s(xn ) is a uniform PMF, are a special case, which holds in the initial equalization step
of turbo equalization or for separate equalization and decoding. In this case, µn = 0 and vn = 1 holds for all n
given many signal alphabets. Moreover, the pseudo-variances ψn vanish for all n, i.e., the symbols xn are circularly
symmetric, except for the BPSK alphabet. The corresponding constraint on the signal alphabet is $\sum_{x\in S} x^2 = 0$.
The coefficient vector $\mathbf{v}_n$ vanishes if ψn = 0 holds for all n such that the estimation problem simplifies greatly to

$$\hat{x}_n = \mathbf{h}_n^H(\sigma^2\mathbf{I}_W + \mathbf{H}_n\mathbf{H}_n^H)^{-1}\mathbf{y}_n. \tag{28}$$

This solution is similar to that in (8) for MMSE estimation of real-valued parameters.
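The constraint that the sum of x² over the alphabet vanishes can be checked numerically. The sketch below is illustrative only; the unit-energy normalizations and the helper name `pseudo_variance_uniform` are assumptions of this example, not definitions from the text.

```python
import cmath
import math


def pseudo_variance_uniform(alphabet):
    """Pseudo-variance of a zero-mean, equiprobable alphabet: (1/|S|) * sum of x^2."""
    return sum(x * x for x in alphabet) / len(alphabet)


# Example alphabets, each normalized to unit average energy (assumed normalization).
bpsk = [1.0, -1.0]
qpsk = [(a + 1j * b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]
psk8 = [cmath.exp(2j * math.pi * k / 8) for k in range(8)]
qam16 = [(a + 1j * b) / math.sqrt(10) for a in (-3, -1, 1, 3) for b in (-3, -1, 1, 3)]

psi = {name: pseudo_variance_uniform(s)
       for name, s in [("bpsk", bpsk), ("qpsk", qpsk), ("psk8", psk8), ("qam16", qam16)]}
```

Only BPSK keeps a nonzero pseudo-variance under the uniform PMF, which is why the widely linear term cannot be dropped for it even in the initial equalization step.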
Given a 2q -ary signal alphabet S, q extrinsic LLRs λext (cn ) must be produced per estimate x̂n . To this end,
all equalizer input LLRs except λ′ext (cn ) can be used. Using the standard assumption that the estimation error
en = x̂n − xn is Gaussian distributed, the estimate x̂n can be written as x̂n = en + xn , where the distribution
p(en ) is in general a complex Gaussian PDF with mean µ = E{en }, variance σ² = Var{en }, and pseudo-variance
ψ = Cov{en , e∗n }:

$$p(e_n) = \exp\!\left(-\frac{|e_n-\mu|^2}{\sigma^2-|\psi|^2/\sigma^2} + \mathrm{Re}\!\left\{\frac{(e_n^*-\mu^*)^2\cdot\psi/\sigma^2}{\sigma^2-|\psi|^2/\sigma^2}\right\}\right) \Big/ \sqrt{\pi^2\cdot(\sigma^4-|\psi|^2)}.$$

Such a PDF is denoted NC (µ, σ², ψ) in the sequel. The relationship x̂n = en + xn is similar to the ISI
channel model (21) in the ISI-free case, i.e., yn = xn + wn , where wn is distributed with NC (0, σ²). It follows that


the method (23) for generating the extrinsic output LLRs λext (cn ) of a symbol-based MAP detector in the ISI-free
case can be applied here as well,

$$\lambda_{\mathrm{ext}}(c_{qn+k}) = \ln\frac{\sum_{\mathbf{m}\in\mathbb{F}_2^q:\,m_k=0} P(x_n=\mathrm{map}(\mathbf{m})\,|\,\hat{x}_n)}{\sum_{\mathbf{m}\in\mathbb{F}_2^q:\,m_k=1} P(x_n=\mathrm{map}(\mathbf{m})\,|\,\hat{x}_n)}\Bigg|_{\text{without using } \lambda'_{\mathrm{ext}}(c_{qn+k})}$$

$$= \ln\frac{\sum_{\mathbf{m}\in\mathbb{F}_2^q:\,m_k=0} p(\hat{x}_n\,|\,x_n=\mathrm{map}(\mathbf{m}))\prod_{j=0:\,j\neq k}^{q-1}\exp(-m_j\cdot\lambda'_{\mathrm{ext}}(c_{qn+j}))}{\sum_{\mathbf{m}\in\mathbb{F}_2^q:\,m_k=1} p(\hat{x}_n\,|\,x_n=\mathrm{map}(\mathbf{m}))\prod_{j=0:\,j\neq k}^{q-1}\exp(-m_j\cdot\lambda'_{\mathrm{ext}}(c_{qn+j}))},$$

where p(x̂n |xn ) is given by NC (xn +E{en }, Var{en }, Cov{en , e∗n }) and m = (m0 , m1 , ..., mq−1 ). An advantage
of using P (xn |x̂n ) is the low complexity of computing x̂n , which is polynomial in the window length W as shown
in Sec. IV, compared to the exponential order $O(2^{qM_h})$ for computing P (xn |y).
The estimate x̂n must not depend on λ′ext (cqn+k ) while computing λext (cqn+k ) to assure that λext (cqn+k ) is a
valid extrinsic LLR. Since x̂n depends on the q LLRs λ′ext (cqn+k ), k = 0, ..., q −1, via µn , vn , and ψn , it seems
that x̂n needs to be recomputed q times for each λext (cqn+k ), k = 0, ..., q−1, with different statistics µn , vn , and
ψn obtained by setting the corresponding input LLR λ′ext (cqn+k ) to 0. To avoid this costly recalculation, only one
estimate x̂n for each λext (cqn+k ), k = 0, ..., q−1, is computed while imposing the IUD assumption on the particular
symbol xn . It follows that µn , vn , and ψn should be replaced by 0, 1, and 0, respectively, while computing x̂n ,
which changes (27) to
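$$\hat{x}_n = \mathbf{u}_n(\mathbf{y}_n - \mathbf{H}_n\boldsymbol{\mu}_n + \mu_n\mathbf{h}_n) + \mathbf{v}_n(\mathbf{y}_n - \mathbf{H}_n\boldsymbol{\mu}_n + \mu_n\mathbf{h}_n)^*, \tag{29}$$

where $\mathbf{u}_n = \mathbf{h}_n^H\mathbf{R}_n$, $\mathbf{v}_n = -\mathbf{h}_n^H\tilde{\boldsymbol{\Sigma}}_n^{-1}\mathbf{H}_n\tilde{\boldsymbol{\Psi}}_n\mathbf{H}_n^T\mathbf{R}_n^*$, and $\mathbf{R}_n = (\tilde{\boldsymbol{\Sigma}}_n - \mathbf{H}_n\tilde{\boldsymbol{\Psi}}_n\mathbf{H}_n^T(\tilde{\boldsymbol{\Sigma}}_n^{-1})^*(\mathbf{H}_n\tilde{\boldsymbol{\Psi}}_n\mathbf{H}_n^T)^*)^{-1}$ using the
updated matrices $\tilde{\boldsymbol{\Sigma}}_n = \boldsymbol{\Sigma}_n + (1-v_n)\,\mathbf{h}_n\mathbf{h}_n^H$ and $\tilde{\boldsymbol{\Psi}}_n = \boldsymbol{\Psi}_n - \psi_n\mathbf{h}_n\mathbf{h}_n^T$.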

The statistics E{en }, Var{en }, and Cov{en , e∗n } of the estimation error en under the constraints that vn and ψn
be replaced by 1 and 0, respectively, yield E{en } = 0, and

Var{en } = 1 − un hn and Cov{en , e∗n } = vn hn .

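The PDF p(en ) is, thus, given by NC (0, 1−un hn , vn hn ) under the Gaussian assumption, i.e.,

$$p(e_n) = \exp\!\left(-\frac{|e_n|^2 + \mathrm{Re}\{(e_n^*)^2\,\mathrm{Cov}\{e_n,e_n^*\}/\mathrm{Var}\{e_n\}\}}{\mathrm{Var}\{e_n\} - |\mathrm{Cov}\{e_n,e_n^*\}|^2/\mathrm{Var}\{e_n\}}\right) \Big/ \left(\pi\sqrt{\mathrm{Var}\{e_n\}^2 - |\mathrm{Cov}\{e_n,e_n^*\}|^2}\right). \tag{30}$$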
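The extrinsic LLRs λext (cn ) are given by

$$\lambda_{\mathrm{ext}}(c_{qn+k}) = \ln\frac{\sum_{\mathbf{m}\in\mathbb{F}_2^q:\,m_k=0}\exp\!\left(-\frac{|\hat{x}_n-\mathrm{map}(\mathbf{m})|^2+\mathrm{Re}\{(\hat{x}_n^*-\mathrm{map}(\mathbf{m})^*)^2\kappa_2\}}{\kappa_1}\right)\prod_{j\neq k}\exp(-m_j\lambda'_{\mathrm{ext}}(c_{qn+j}))}{\sum_{\mathbf{m}\in\mathbb{F}_2^q:\,m_k=1}\exp\!\left(-\frac{|\hat{x}_n-\mathrm{map}(\mathbf{m})|^2+\mathrm{Re}\{(\hat{x}_n^*-\mathrm{map}(\mathbf{m})^*)^2\kappa_2\}}{\kappa_1}\right)\prod_{j\neq k}\exp(-m_j\lambda'_{\mathrm{ext}}(c_{qn+j}))}, \tag{31}$$

where $\kappa_1 = 1-\mathbf{u}_n\mathbf{h}_n-|\mathbf{v}_n\mathbf{h}_n|^2/(1-\mathbf{u}_n\mathbf{h}_n)$, $\kappa_2 = \mathbf{v}_n\mathbf{h}_n/(1-\mathbf{u}_n\mathbf{h}_n)$, and m = (m0 , ..., mq−1 ). The complete
SISO equalization algorithm based on widely linear MMSE estimation is shown in Table IV.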
When the input LLRs λ′ext (cn ) are 0 for all n, e.g., for the initial equalization step in Fig. 5 or for usual MMSE
estimation without incorporating prior knowledge about the symbols xn , all means µn and pseudo-variances ψn are
0 and all variances vn are 1. As shown in (28), the estimates x̂n are given by $\mathbf{u}_n\mathbf{y}_n$ with $\mathbf{u}_n = \mathbf{h}_n^H(\sigma^2\mathbf{I}_W+\mathbf{H}_n\mathbf{H}_n^H)^{-1}$
under this condition. The statistics of the estimation error en simplify to E{en } = 0, Var{en } = 1 − un hn , and


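TABLE IV
SISO EQUALIZATION ALGORITHM BASED ON WIDELY LINEAR MMSE ESTIMATION.

INPUT
    extrinsic LLRs $\lambda'_{\mathrm{ext}}(c_n)$ for $n = 0, 1, ..., N-1$
INITIALIZATION
    compute $L = \lceil N/q \rceil$ given a $2^q$-ary signal alphabet $S$
    compute $\mu_n = \sum_{x\in S} x \cdot s'(x_n = x)$, $v_n = \sum_{x\in S} |x-\mu_n|^2 \cdot s'(x_n = x)$, and
    $\psi_n = \sum_{x\in S} (x - E\{x_n\})^2 \cdot s'(x_n = x)$ for $n = 0, 1, ..., L-1$, where
    $s'(x_n = x) = \prod_{k=0}^{q-1} \frac{\exp(-m_k\cdot\lambda'_{\mathrm{ext}}(c_{qn+k}))}{1+\exp(-\lambda'_{\mathrm{ext}}(c_{qn+k}))}$, $x = \mathrm{map}(m_0, ..., m_{q-1})$
WIDELY LINEAR MMSE ESTIMATION
    FOR n = 0 TO L − 1 DO
        $\tilde{\boldsymbol{\Sigma}}_n = \sigma^2\mathbf{I}_W + \mathbf{H}_n\mathbf{V}_n\mathbf{H}_n^H + (1-v_n)\,\mathbf{h}_n\mathbf{h}_n^H$
        $\tilde{\boldsymbol{\Psi}}_n = \mathbf{H}_n\boldsymbol{\Psi}_n\mathbf{H}_n^T - \psi_n\mathbf{h}_n\mathbf{h}_n^T$
        $\mathbf{R}_n = (\tilde{\boldsymbol{\Sigma}}_n - \mathbf{H}_n\tilde{\boldsymbol{\Psi}}_n\mathbf{H}_n^T(\tilde{\boldsymbol{\Sigma}}_n^{-1})^*(\mathbf{H}_n\tilde{\boldsymbol{\Psi}}_n\mathbf{H}_n^T)^*)^{-1}$
        $\mathbf{u}_n = \mathbf{h}_n^H\mathbf{R}_n$
        $\mathbf{v}_n = -\mathbf{h}_n^H\tilde{\boldsymbol{\Sigma}}_n^{-1}\mathbf{H}_n\tilde{\boldsymbol{\Psi}}_n\mathbf{H}_n^T\mathbf{R}_n^*$
        $\hat{x}_n = \mathbf{u}_n(\mathbf{y}_n-\mathbf{H}_n\boldsymbol{\mu}_n+\mu_n\mathbf{h}_n) + \mathbf{v}_n(\mathbf{y}_n-\mathbf{H}_n\boldsymbol{\mu}_n+\mu_n\mathbf{h}_n)^*$
        $\kappa_1 = 1-\mathbf{u}_n\mathbf{h}_n-|\mathbf{v}_n\mathbf{h}_n|^2/(1-\mathbf{u}_n\mathbf{h}_n)$
        $\kappa_2 = \mathbf{v}_n\mathbf{h}_n/(1-\mathbf{u}_n\mathbf{h}_n)$
        FOR k = 0 TO q − 1 DO
            $\lambda_{\mathrm{ext}}(c_{qn+k}) = \ln\frac{\sum_{\mathbf{m}\in\mathbb{F}_2^q:\,m_k=0}\exp(-(|\hat{x}_n-\mathrm{map}(\mathbf{m})|^2+\mathrm{Re}\{(\hat{x}_n^*-\mathrm{map}(\mathbf{m})^*)^2\kappa_2\})/\kappa_1)\prod_{j\neq k}\exp(-m_j\lambda'_{\mathrm{ext}}(c_{qn+j}))}{\sum_{\mathbf{m}\in\mathbb{F}_2^q:\,m_k=1}\exp(-(|\hat{x}_n-\mathrm{map}(\mathbf{m})|^2+\mathrm{Re}\{(\hat{x}_n^*-\mathrm{map}(\mathbf{m})^*)^2\kappa_2\})/\kappa_1)\prod_{j\neq k}\exp(-m_j\lambda'_{\mathrm{ext}}(c_{qn+j}))}$
            where $\mathbf{m} = (m_0, ..., m_{q-1})$
        END
    END
OUTPUT
    extrinsic LLRs $\lambda_{\mathrm{ext}}(c_n)$ for $n = 0, 1, ..., N-1$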

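Cov{en , e∗n } = 0, i.e., the estimation error is circularly symmetric and distributed with NC (0, Var{en }) under the
Gaussian assumption, which simplifies the computation of the extrinsic LLRs λext (cn ) to

$$\lambda_{\mathrm{ext}}(c_{qn+k}) = \ln\frac{\sum_{\mathbf{m}\in\mathbb{F}_2^q:\,m_k=0}\exp\!\left(-\frac{|\mathbf{u}_n\mathbf{y}_n-\mathrm{map}(\mathbf{m})|^2}{1-\mathbf{u}_n\mathbf{h}_n}\right)\prod_{j\neq k}\exp(-m_j\lambda'_{\mathrm{ext}}(c_{qn+j}))}{\sum_{\mathbf{m}\in\mathbb{F}_2^q:\,m_k=1}\exp\!\left(-\frac{|\mathbf{u}_n\mathbf{y}_n-\mathrm{map}(\mathbf{m})|^2}{1-\mathbf{u}_n\mathbf{h}_n}\right)\prod_{j\neq k}\exp(-m_j\lambda'_{\mathrm{ext}}(c_{qn+j}))},$$

with m = (m0 , ..., mq−1 ).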
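The simplified LLR computation for a circularly symmetric Gaussian estimation error can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `extrinsic_llrs`, the argument `err_var` (standing in for Var{en}), and the QPSK labeling `qpsk` are assumptions of this example. A direct evaluation of the exponentials is used for clarity; a practical implementation would work in the log domain.

```python
import math
from itertools import product


def extrinsic_llrs(x_hat, err_var, prior_llrs, mapping):
    """Extrinsic LLRs of the q bits of one symbol from its estimate x_hat.

    The prior LLR of bit k is deliberately left out when computing output k.
    """
    q = len(prior_llrs)
    out = []
    for k in range(q):
        num = den = 0.0
        for m in product((0, 1), repeat=q):
            metric = math.exp(-abs(x_hat - mapping[m]) ** 2 / err_var)
            for j in range(q):
                if j != k:
                    metric *= math.exp(-m[j] * prior_llrs[j])
            if m[k] == 0:
                num += metric
            else:
                den += metric
        out.append(math.log(num / den))
    return out


# Hypothetical QPSK labeling: bit 0 selects the real part, bit 1 the imaginary part.
kappa = 1.0 / math.sqrt(2.0)
qpsk = {m: kappa * ((1 - 2 * m[0]) + 1j * (1 - 2 * m[1])) for m in product((0, 1), repeat=2)}

llr_flat = extrinsic_llrs(0.3 - 0.2j, 0.5, [0.0, 0.0], qpsk)
llr_with_prior = extrinsic_llrs(0.3 - 0.2j, 0.5, [5.0, 0.0], qpsk)
```

For this labeling the two bits separate, so each output reduces to 4κ·Re{x̂}/Var or 4κ·Im{x̂}/Var, and changing the prior on bit 0 leaves the output for bit 0 unchanged, as required of an extrinsic LLR.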


The BER performance of turbo equalization using the widely linear MMSE estimator is exhibited in Fig. 11 for
the same block length K = 510 as in Fig. 10. The system model and the 8PSK signal alphabet were specified in
Sec. III-A and the window parameters W = 11, W1 = 6, and W2 = 4 are similar to the choice in Sec. II-E. Applying
the steps outlined in Fig. 5 results in the expected performance increase lasting over a few iterations. The BER
performance of turbo equalization using the symbol-based MAP detector depicted in Fig. 10 is not attained. This is
not surprising since the widely linear MMSE estimator is not calculating the correct soft information P (xn |y) but
the approximation P (xn |x̂n ). However, in contrast to the BER performance of the linear MMSE estimator (BPSK
modulation) depicted in Fig. 8 and that of the symbol-based MAP detector (8PSK modulation) depicted in Fig. 10,
the BER performance of BICM over the ISI-free channel yn = xn + wn is not attained. The obvious gap in Fig. 11
between the left-most solid line (BER after 8 iterations) and the dotted line (BICM performance after 8 iterations)
narrows for larger block lengths K, but remains nonvanishing.


[Figure 11: data bit error rate (logarithmic axis, 10^0 down to 10^−4) versus P/σ² in dB (axis ticks 4.7, 6, 8, ..., 16). Curves: separate equalization and decoding, and after 1, 2, and 8 iterations. The capacity limit of the ISI channel is marked.]

Fig. 11. Performance of turbo equalization for the communication system described in Sec. III-A using widely linear MMSE estimation. The
BER performance (solid lines) is plotted for separate equalization and decoding as well as after 1, 2, or 8 iterations. The dotted line corresponds
to the BER performance of bit-interleaved coded modulation given an ISI-free channel after 8 demapping-decoding iterations. The considered
block length is K = 510 (N = 1024 and L = 342). An S-random interleaver with S=16 is applied.

It turns out that the Gaussian assumption made on the estimation error en = x̂n − xn , which is, in particular
for higher-order signal alphabets, not accurate [82], is the limiting factor in this case. The extrinsic LLRs λext (cn )
are incorrect (or inconsistent as defined in Section VI) if the true PDF p(en ) is not Gaussian. The inconsistency
apparent in the derivation for the linear MMSE estimator used for BPSK modulation in Sec. II-E does not cause
a significant performance degradation as shown in Fig. 8, due to the good match between p(en ) and the Gaussian
approximation. In contrast, the degradation is significant in Fig. 11. A remedy to this problem could be a more
accurate modeling of p(en ) as proposed in [82] at the expense that the calculation of the extrinsic LLRs λext (cn )
becomes more complex.

IV. IMPLEMENTATION ISSUES

The focus of this section is on the computational complexity of the MMSE estimation algorithm. The computa-
tional complexity is measured per SISO equalization task, which is to produce N extrinsic LLRs λext (cn ) from N
input LLRs λ′ext (cn ) and L observations yn . This task has to be repeated for each iteration of the turbo equalization
algorithm. The complexity order depends on the following system parameters: the size |S| of the signal alphabet S,
the memory Mh of the ISI channel, the estimator window length W , and the number L of symbols xn transmitted
over the channel.

A. Trellis-based approaches

SISO equalization based on symbol-based MAP detection can be implemented on a trellis with $|S|^{M_h}$ states
as shown in Table I (for BPSK modulation) and in Sec. III-A for arbitrary signal alphabets. It follows that the
computational complexity of a SISO equalization task is of order $O(L \cdot |S|^{M_h})$. The same holds for trellis-based
sequence-based MAP detection using the Viterbi algorithm [6, 74]. The exponential increase of the required number
of operations in the channel memory Mh , in particular for signal alphabets of large size |S|, is a serious problem
in a practical implementation. As a remedy, numerous approximations of both symbol- and sequence-based MAP
detection exist, which aim to use an alternative trellis describing the ISI channel with fewer states [36].
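To give a feeling for the exponential growth, the short sketch below evaluates the state count and the order-of-magnitude operation count for two alphabets. The helper names and the channel memory of 4 are assumptions chosen for illustration; constant factors per trellis branch are ignored.

```python
def trellis_states(alphabet_size, memory):
    """Number of trellis states |S|^Mh."""
    return alphabet_size ** memory


def trellis_ops(num_symbols, alphabet_size, memory):
    """Order-of-magnitude count L * |S|^Mh (constant factors ignored)."""
    return num_symbols * trellis_states(alphabet_size, memory)


M_H = 4  # assumed channel memory for this example
states_bpsk = trellis_states(2, M_H)
states_8psk = trellis_states(8, M_H)
growth = states_8psk // states_bpsk
```

Moving from BPSK to 8PSK at the same memory multiplies the state count by 4^Mh, which is the motivation for the reduced-complexity estimators discussed next.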

B. Linear MMSE estimation of BPSK-modulated symbols

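SISO equalization of BPSK-modulated symbols xn based on linear MMSE estimation is described in Table III.
The complexity order of the SISO equalization task, which is to calculate the N = L output LLRs λext (cn ), is
governed by the inversion of the time-varying W × W matrix $\boldsymbol{\Sigma}_n = \sigma^2\mathbf{I}_W + \mathbf{H}_n\mathbf{V}_n\mathbf{H}_n^T$ to calculate the coefficient
vector $\mathbf{f}_n = \boldsymbol{\Sigma}_n^{-1}\mathbf{h}_n$ for each time step n. This operation is of order $O(W^3)$ such that the complexity of the entire
SISO equalization task is of order $O(L \cdot W^3)$. However, the structured time-dependence of the matrix Σn admits
a fast recursive solution for computing fn with complexity $O(W^2)$ per time step. The time-recursive update algorithm
introduced in [26] uses the following partitioning scheme:

$$\boldsymbol{\Sigma}_n = \begin{bmatrix} s_{\mathrm{old}} & \mathbf{s}_{\mathrm{old}}^T \\ \mathbf{s}_{\mathrm{old}} & \mathbf{S}_{\mathrm{old}} \end{bmatrix} \quad\text{and}\quad \boldsymbol{\Sigma}_{n+1} = \begin{bmatrix} \mathbf{S}_{\mathrm{new}} & \mathbf{s}_{\mathrm{new}} \\ \mathbf{s}_{\mathrm{new}}^T & s_{\mathrm{new}} \end{bmatrix}, \tag{32}$$

where $\mathbf{S}_{\mathrm{old}}$, $\mathbf{S}_{\mathrm{new}}$ are (W −1)×(W −1) matrices, $\mathbf{s}_{\mathrm{old}}$, $\mathbf{s}_{\mathrm{new}}$ are length-(W −1) column vectors, and $s_{\mathrm{old}}$, $s_{\mathrm{new}}$
are scalars. The subscript old denotes quantities at time step n and new denotes quantities at time step n+1. A
similar partitioning scheme is introduced for the inverses of Σn and Σn+1 :

$$\boldsymbol{\Sigma}_n^{-1} = \mathbf{U}_n = \begin{bmatrix} u_{\mathrm{old}} & \mathbf{u}_{\mathrm{old}}^T \\ \mathbf{u}_{\mathrm{old}} & \mathbf{U}_{\mathrm{old}} \end{bmatrix} \quad\text{and}\quad \boldsymbol{\Sigma}_{n+1}^{-1} = \mathbf{U}_{n+1} = \begin{bmatrix} \mathbf{U}_{\mathrm{new}} & \mathbf{u}_{\mathrm{new}} \\ \mathbf{u}_{\mathrm{new}}^T & u_{\mathrm{new}} \end{bmatrix}. \tag{33}$$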
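The efficient recursive implementation of the equalization algorithm in Table III arises from noting that the
submatrices $\mathbf{S}_{\mathrm{old}}$ and $\mathbf{S}_{\mathrm{new}}$ are identical:

$$\mathbf{S}_{\mathrm{old}} = \mathbf{S}_{\mathrm{new}} = \sigma^2\mathbf{I}_{W-1} + \mathbf{H}'_n\,\mathrm{Diag}\{v_{n-M_h-W_2+1} \cdots v_{n+W_1}\}\,\mathbf{H}'^{\,T}_n,$$

where H′n is the (W −1) × (W +Mh −1) submatrix Hn [1 : W −1, 1 : W +Mh −1] of Hn , i.e., H′n is the submatrix
H[n−W2+1 : n+W1 , n−W2−Mh+1 : n+W1 ] of the system matrix H. The proposed recursive algorithm first computes
$\mathbf{S}_{\mathrm{old}}^{-1}$ from $\mathbf{U}_n$, sets $\mathbf{S}_{\mathrm{new}}^{-1} = \mathbf{S}_{\mathrm{old}}^{-1}$, and computes $\mathbf{U}_{n+1}$ from $\mathbf{S}_{\mathrm{new}}^{-1}$.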

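The inverse $\mathbf{S}_{\mathrm{old}}^{-1}$ of the submatrix $\mathbf{S}_{\mathrm{old}}$ of Σn is expressed in terms of components of Un by solving $\boldsymbol{\Sigma}_n\mathbf{U}_n = \mathbf{I}_W$
using (32) and (33):

$$\mathbf{S}_{\mathrm{old}}\mathbf{U}_{\mathrm{old}} + \mathbf{s}_{\mathrm{old}}\mathbf{u}_{\mathrm{old}}^T = \mathbf{I}_{W-1},$$

$$\mathbf{S}_{\mathrm{old}}\mathbf{u}_{\mathrm{old}} + \mathbf{s}_{\mathrm{old}}u_{\mathrm{old}} = \mathbf{0}_{W-1}, \tag{34}$$

$$\rightarrow\ \mathbf{S}_{\mathrm{old}}^{-1} = \mathbf{U}_{\mathrm{old}} - \mathbf{u}_{\mathrm{old}}\mathbf{u}_{\mathrm{old}}^T/u_{\mathrm{old}}.$$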

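By solving $\boldsymbol{\Sigma}_{n+1}\mathbf{U}_{n+1} = \mathbf{I}_W$, the quantities $\mathbf{U}_{\mathrm{new}}$, $\mathbf{u}_{\mathrm{new}}$, and $u_{\mathrm{new}}$ are expressed in terms of $\mathbf{S}_{\mathrm{new}}$, $\mathbf{s}_{\mathrm{new}}$, and
$s_{\mathrm{new}}$:

$$\mathbf{s}'_{\mathrm{new}} = \mathbf{S}_{\mathrm{new}}^{-1}\mathbf{s}_{\mathrm{new}},$$

$$u_{\mathrm{new}} = 1/(s_{\mathrm{new}} - \mathbf{s}_{\mathrm{new}}^T\mathbf{s}'_{\mathrm{new}}),$$

$$\mathbf{u}_{\mathrm{new}} = -u_{\mathrm{new}}\,\mathbf{s}'_{\mathrm{new}},$$

$$\mathbf{U}_{\mathrm{new}} = \mathbf{S}_{\mathrm{new}}^{-1} + u_{\mathrm{new}}\,\mathbf{s}'_{\mathrm{new}}\mathbf{s}'^{\,T}_{\mathrm{new}},$$

where the equations are ordered to optimize the computation by using the intermediate vector $\mathbf{s}'_{\mathrm{new}}$. The matrix
$\mathbf{S}_{\mathrm{new}}^{-1}$ is equal to $\mathbf{S}_{\mathrm{old}}^{-1}$ and the quantities $\mathbf{s}_{\mathrm{new}}$ and $s_{\mathrm{new}}$ are computed using the definition of Σn and its partitioning
in (32):

$$\begin{bmatrix} \mathbf{s}_{\mathrm{new}} \\ s_{\mathrm{new}} \end{bmatrix} = \begin{bmatrix} \mathbf{0}_{W-1} \\ \sigma^2 \end{bmatrix} + \mathbf{H}_{n+1}\mathbf{V}_{n+1}\mathbf{H}_{n+1}^T \begin{bmatrix} \mathbf{0}_{W-1} \\ 1 \end{bmatrix}. \tag{35}$$

Assembling $\mathbf{U}_{n+1}$ from $\mathbf{U}_{\mathrm{new}}$, $\mathbf{u}_{\mathrm{new}}$, and $u_{\mathrm{new}}$ completes the recursive algorithm.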
To bootstrap the time-recursive update algorithm, an initialization of Un at the starting time step n = 0 is required,
e.g., by computing $\mathbf{U}_0 = (\sigma^2\mathbf{I}_W + \mathbf{H}_0\mathbf{V}_0\mathbf{H}_0^T)^{-1}$. This operation can be trivial when the block of transmitted symbols
xn starts with a preamble of at least W +Mh symbols known to the receiver, which yields $\mathbf{V}_0 = \mathbf{0}_{W+M_h}$ and, thus,
$\mathbf{U}_0 = \mathbf{I}_W/\sigma^2$.
The complete SISO equalization based on linear MMSE estimation of BPSK-modulated symbols is summarized
in Table V. The most demanding operations to compute the output LLRs λext (cn ) are matrix-vector multiplications
such as $\mathbf{S}_{\mathrm{new}}^{-1}\mathbf{s}_{\mathrm{new}}$ and outer vector products such as $\mathbf{s}'_{\mathrm{new}}\mathbf{s}'^{\,T}_{\mathrm{new}}$, which are both of order $O(W^2)$. The complexity
of the complete SISO equalization algorithm is therefore $O(L \cdot W^2)$.
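One step of the recursive update can be sketched and checked against a direct inversion. This is a minimal illustration under assumptions: the function name `slide_inverse`, the 3-tap channel, the window length, and the random variances are chosen for the example only, and real-valued (BPSK) quantities are assumed so that transposes suffice.

```python
import numpy as np


def slide_inverse(U_n, new_col):
    """Return inv(Sigma_{n+1}) from U_n = inv(Sigma_n) and the last column of Sigma_{n+1}."""
    u_old = U_n[0, 0]                    # scalar in the top-left corner, cf. (33)
    u_vec_old = U_n[1:, 0]               # length W-1 vector
    U_old = U_n[1:, 1:]                  # (W-1) x (W-1) block
    S_inv = U_old - np.outer(u_vec_old, u_vec_old) / u_old   # inverse of shared block, cf. (34)
    s_vec, s_scalar = new_col[:-1], new_col[-1]
    s_prime = S_inv @ s_vec
    u_new = 1.0 / (s_scalar - s_vec @ s_prime)
    u_vec_new = -u_new * s_prime
    U_new = S_inv + u_new * np.outer(s_prime, s_prime)
    W = U_n.shape[0]
    out = np.empty((W, W))
    out[:-1, :-1] = U_new
    out[:-1, -1] = u_vec_new
    out[-1, :-1] = u_vec_new
    out[-1, -1] = u_new
    return out


# Demo with assumed values: two consecutive windows cut from one larger covariance matrix.
sigma2, W, Mh = 0.1, 5, 2
h = np.array([0.407, 0.815, 0.407])
rng = np.random.default_rng(0)
H_big = np.zeros((W + 1, W + 1 + Mh))
for i in range(W + 1):
    H_big[i, i:i + Mh + 1] = h[::-1]
v = rng.uniform(0.1, 1.0, size=W + 1 + Mh)
G = sigma2 * np.eye(W + 1) + H_big @ np.diag(v) @ H_big.T
Sigma_n, Sigma_next = G[:W, :W], G[1:, 1:]
U_next = slide_inverse(np.linalg.inv(Sigma_n), Sigma_next[:, -1])
```

Every operation inside `slide_inverse` is a matrix-vector or outer product, so the cost per time step is quadratic in W, whereas the reference inversion used for the check is cubic.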

C. Approximate linear MMSE estimation of BPSK-modulated symbols

A further complexity reduction is possible by using a time invariant coefficient vector f to compute the estimates
x̂n , i.e., the time varying coefficient vector fn in (16) is replaced by the length-W vector f being constant for all
n = 0, 1, ..., L−1,
x̂n = κ · (f T (yn −Hn µ n ) + µn f T hn ), (36)

where κ is a constant. Recall that κ is equal to 1/(1+(1−vn ) fnT hn ) in (16), where it assures that vn is replaced by 1
while computing x̂n as shown in Sec. II-E. The complexity order of computing the estimates x̂n via (36) is O(W ),
and the results in Fig. 12 show that this approach has an excellent complexity/performance trade-off. Clearly, using
(36) makes sense only if the ISI channel impulse response is time invariant, i.e., Hn must be constant over time.
Of crucial interest is the actual choice for f . In [25], it was proposed to set f equal to the MMSE-optimal solution
in case all input LLRs λ′ext (cn ) are 0, e.g., for the initial equalization step. The coefficient vector f is given by
(σ 2 IW +Hn HTn )−1 hn under this condition. While this choice yields a good BER performance of turbo equalization
for the first few iterations, the performance degrades significantly compared to the MMSE-optimal approach in
Table III. The reason for this behavior is explained in [25].


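TABLE V
ORDER $O(L \cdot W^2)$ SISO EQUALIZATION ALGORITHM BASED ON LINEAR MMSE ESTIMATION OF BPSK-MODULATED SYMBOLS. THE
OPERATOR := DENOTES THAT THE VALUE OF THE LEFT-HAND ARGUMENT IS REPLACED BY THAT OF THE RIGHT-HAND ARGUMENT.

INPUT
    extrinsic LLRs $\lambda'_{\mathrm{ext}}(c_n)$ for $n = 0, 1, ..., N-1$
INITIALIZATION
    compute $\mu_n = \tanh(\lambda'_{\mathrm{ext}}(c_n)/2)$ and $v_n = 1 - |\mu_n|^2$ for $n = 0, 1, ..., N-1$
    compute $\mathbf{R} = (\sigma^2\mathbf{I}_W + \mathbf{H}_0\mathbf{V}_0\mathbf{H}_0^T)^{-1}$
LINEAR MMSE ESTIMATION
    FOR n = 0 TO N − 1 DO
        $\mathbf{f} = \mathbf{R}\mathbf{h}_n$
        $s = \mathbf{f}^T\mathbf{h}_n$
        $\lambda_{\mathrm{ext}}(c_n) = 2\cdot(\mathbf{f}^T(\mathbf{y}_n-\mathbf{H}_n\boldsymbol{\mu}_n) + \mu_n s)/(1 - v_n s)$
        IF n < (L − 1) THEN
            $\begin{bmatrix} u & \mathbf{u}^T \\ \mathbf{u} & \mathbf{U} \end{bmatrix} := \mathbf{R}$
            $\mathbf{U} := \mathbf{U} - \mathbf{u}\mathbf{u}^T/u$
            $\begin{bmatrix} \mathbf{u} \\ u \end{bmatrix} := \begin{bmatrix} \mathbf{0}_{W-1} \\ \sigma^2 \end{bmatrix} + \mathbf{H}_{n+1}\mathbf{V}_{n+1}\mathbf{H}_{n+1}^T\begin{bmatrix} \mathbf{0}_{W-1} \\ 1 \end{bmatrix}$
            $\mathbf{u}' := \mathbf{U}\mathbf{u}$
            $u := 1/(u - \mathbf{u}^T\mathbf{u}')$
            $\mathbf{u} := -u\,\mathbf{u}'$
            $\mathbf{U} := \mathbf{U} + u\,\mathbf{u}'\mathbf{u}'^{\,T}$
            $\mathbf{R} := \begin{bmatrix} \mathbf{U} & \mathbf{u} \\ \mathbf{u}^T & u \end{bmatrix}$
        END
    END
OUTPUT
    extrinsic LLRs $\lambda_{\mathrm{ext}}(c_n)$ for $n = 0, 1, ..., N-1$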

Another method, also proposed in [25], is to set f equal to the MMSE-optimal solution in case all input LLRs
λ′ext (cn ) have a large magnitude such that vn = 0 holds for all n, e.g., after convergence of turbo equalization.
The coefficient vector f is given by hn /σ² under this condition, which is the matched filter to the ISI channel
impulse response already derived in (20). Performing turbo equalization based on this equalization algorithm has
a severe convergence problem, i.e., a large SNR P/σ² is required to achieve performance improvements over the
iterations [25]. This is not surprising, since the input LLRs λ′ext (cn ) are initially 0, a condition for which the
choice f = hn /σ² is certainly not suited.
A possible remedy to cure the weaknesses of both proposals is to start with f = (σ 2 IW +Hn HTn )−1 hn for the
first few iterations, where the input LLRs λ′ext (cn ) are usually of small magnitude. Subject to a quality criterion
evaluating the output LLRs λext (cn ), the SISO equalizer switches to f = hn /σ 2 , which is suitable for input LLRs
with large magnitude. This hybrid equalization approach is explained in detail in [25].
A more powerful solution for f was derived in [26], which includes the above two choices as special cases:

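$$\mathbf{f} = (\sigma^2\mathbf{I}_W + \mathbf{H}_n\bar{\mathbf{V}}\mathbf{H}_n^T)^{-1}\mathbf{h}_n, \tag{37}$$

where $\bar{v} = \frac{1}{L}\sum_{n=0}^{L-1} v_n$ is the average variance of the symbols xn and $\bar{\mathbf{V}} = \frac{1}{L}\sum_{n=0}^{L-1}\mathbf{V}_n$. This choice minimizes
the average MSE $\frac{1}{L}\sum_{n=0}^{L-1} E\{|\hat{x}_n - x_n|^2\}$ if κ is chosen suitably, which can be shown using the orthogonality
principle. The choice $\kappa = 1/(1 + (1-\bar{v})\,\mathbf{f}^T\mathbf{h}_n)$ yields $\frac{1}{L}\sum_{n=0}^{L-1} E\{\mathbf{y}_n(\hat{x}_n - x_n)\} = \mathbf{0}_W$. The solution (37) is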


MMSE-optimal, i.e., it minimizes the MSE E{|x̂n −xn |2 }, whenever all vn are constant, e.g., in case all vn are 1
(initial equalization step) or all vn are 0 (convergence of turbo equalization).
The term V̄ can be approximated with v̄IMh +W by neglecting the boundary effects for time indices n smaller
than Mh +W2 and larger than L−W1 −1, which simplifies (37) to

f = (σ 2 IW + v̄Hn HTn )−1 hn . (38)

Using this coefficient vector, the estimates x̂n are given by

x̂n = κ · (f T (yn −Hn µ n ) + µn f T hn ), κ = 1/(1+(1−v̄) f T hn ).

The statistics E{en } and Var{en } of the estimation error en = x̂n − xn are given by

E{en } = E{x̂n } = κ · (f T (E{yn }−Hn µ n ) + µn f T hn ) = 0,

Var{en } = Var{x̂n } + Var{xn } − 2 Cov{x̂n , xn }

= κ2 f T Cov{yn , yn }f + 1 − 2 κ f T Cov{yn , xn }

= κ2 f T (σ 2 IW + Hn Vn HTn + (1−vn )hn hTn )f + 1 − 2 κ f T hn ,


recalling that E{xn } = 0 and Var{xn } = 1 holds. Using the Gaussian assumption on p(en ) and (13), the extrinsic
LLRs λext (cn ) can be obtained as follows:

$$\lambda_{\mathrm{ext}}(c_n) = \ln\frac{\exp(-(\hat{x}_n - 1)^2/(2\,\mathrm{Var}\{e_n\}))}{\exp(-(\hat{x}_n + 1)^2/(2\,\mathrm{Var}\{e_n\}))} = \frac{2\hat{x}_n}{\mathrm{Var}\{e_n\}}.$$
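A simplified way to compute λext (cn ) is to approximate p(en ) with a Gaussian distribution with time invariant
statistics. The mean of this distribution is set to 0, since E{en } is always 0. The variance $\overline{\mathrm{Var}}\{e_n\}$ of this distribution
is set to the average estimation error variance over all time indices n = 0, 1, ..., L−1:

$$\overline{\mathrm{Var}}\{e_n\} = \frac{1}{L}\sum_{n=0}^{L-1}\mathrm{Var}\{e_n\}$$

$$= \kappa^2\,\mathbf{f}^T(\sigma^2\mathbf{I}_W + \mathbf{H}_n\bar{\mathbf{V}}\mathbf{H}_n^T + (1-\bar{v})\,\mathbf{h}_n\mathbf{h}_n^T)\,\mathbf{f} + 1 - 2\kappa\,\mathbf{f}^T\mathbf{h}_n$$

$$= \kappa s + 1 - 2\kappa s = 1 - \kappa s, \qquad s = \mathbf{f}^T\mathbf{h}_n,$$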

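where the last line follows from the approximation $\bar{\mathbf{V}} \approx \bar{v}\,\mathbf{I}_{M_h+W}$. The extrinsic LLRs λext (cn ) are then given by

$$\lambda_{\mathrm{ext}}(c_n) = \ln\frac{\exp(-(\hat{x}_n - 1)^2/(2\,\overline{\mathrm{Var}}\{e_n\}))}{\exp(-(\hat{x}_n + 1)^2/(2\,\overline{\mathrm{Var}}\{e_n\}))} = \frac{2\hat{x}_n}{\overline{\mathrm{Var}}\{e_n\}} = \frac{2\kappa\,(\mathbf{f}^T(\mathbf{y}_n-\mathbf{H}_n\boldsymbol{\mu}_n) + \mu_n s)}{1-\kappa s} = \frac{2\,(\mathbf{f}^T(\mathbf{y}_n-\mathbf{H}_n\boldsymbol{\mu}_n) + \mu_n s)}{1-\bar{v}s}. \tag{39}$$

Note that this simplification of the LLR calculation assigns the same reliability to each output LLR λext (cn ).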
The complete SISO equalization based on approximate linear MMSE estimation of BPSK-modulated symbols is
summarized in Table VI. The complexity order of this SISO equalization algorithm is O(L · W ).
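The steps of this approximate estimator can be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the function name `approx_lmmse_bpsk`, the zero padding at the block borders, the 3-tap test channel, and the noise level are all choices made for the example. The window convention assumed is that the observation window for time n spans indices n−W2 to n+W1.

```python
import numpy as np


def approx_lmmse_bpsk(y, h, prior_llrs, sigma2, W1, W2):
    """Time-invariant approximate linear MMSE SISO equalizer for BPSK (real-valued model)."""
    h = np.asarray(h, dtype=float)
    Mh, W = len(h) - 1, W1 + W2 + 1
    mu = np.tanh(np.asarray(prior_llrs, dtype=float) / 2.0)   # symbol means
    v_bar = np.mean(1.0 - mu ** 2)                            # average symbol variance
    H = np.zeros((W, W + Mh))                                 # banded window channel matrix
    for i in range(W):
        H[i, i:i + Mh + 1] = h[::-1]
    h_n = H[:, W2 + Mh]                                       # column that multiplies x_n
    f = np.linalg.solve(sigma2 * np.eye(W) + v_bar * H @ H.T, h_n)
    s = f @ h_n
    kappa = 2.0 / (1.0 - v_bar * s)
    resid = y - np.convolve(mu, h)[:len(y)]                   # y minus the expected ISI
    padded = np.concatenate([np.zeros(W2), resid, np.zeros(W1)])  # assumed border handling
    filtered = np.correlate(padded, f, mode="valid")          # f^T applied to each window
    return kappa * (filtered + mu * s)


# Demo with assumed values: near-perfect priors should reproduce the transmitted signs.
rng = np.random.default_rng(0)
x = rng.choice([-1.0, 1.0], size=200)
h = np.array([0.407, 0.815, 0.407])
sigma2 = 0.01
y = np.convolve(x, h)[:len(x)] + np.sqrt(sigma2) * rng.standard_normal(len(x))
llr_out = approx_lmmse_bpsk(y, h, 8.0 * x, sigma2, W1=6, W2=4)
```

The filter f is solved for once per block, so the per-symbol work is a length-W inner product, in line with the stated O(L · W) order.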
The BER performance of turbo equalization using approximate linear MMSE estimation as equalization algorithm
is exhibited in Fig. 12 for the same block length K = 510 as in Fig. 8, which depicts the performance of
turbo equalization using exact linear MMSE estimation. Comparing the two figures indicates that the performance


TABLE VI
ORDER $O(L \cdot W)$ SISO EQUALIZATION ALGORITHM BASED ON APPROXIMATE LINEAR MMSE ESTIMATION OF BPSK-MODULATED
SYMBOLS.

INPUT
    extrinsic LLRs $\lambda'_{\mathrm{ext}}(c_n)$ for $n = 0, 1, ..., N-1$
INITIALIZATION
    compute $\mu_n = \tanh(\lambda'_{\mathrm{ext}}(c_n)/2)$ and $v_n = 1 - |\mu_n|^2$ for $n = 0, 1, ..., N-1$
    compute $\bar{v} = \frac{1}{L}\sum_{n=0}^{L-1} v_n$
    compute $\mathbf{f} = (\sigma^2\mathbf{I}_W + \bar{v}\,\mathbf{H}_0\mathbf{H}_0^T)^{-1}\mathbf{h}_0$
    compute $s = \mathbf{f}^T\mathbf{h}_0$
    compute $\kappa = 2/(1 - \bar{v}s)$
APPROXIMATE LINEAR MMSE ESTIMATION
    FOR n = 0 TO N − 1 DO
        $\lambda_{\mathrm{ext}}(c_n) = \kappa\cdot(\mathbf{f}^T(\mathbf{y}_n-\mathbf{H}_n\boldsymbol{\mu}_n) + \mu_n\cdot s)$
    END
OUTPUT
    extrinsic LLRs $\lambda_{\mathrm{ext}}(c_n)$ for $n = 0, 1, ..., N-1$

degradation due to the imposed approximations is small. Both estimation algorithms attain the BER performance
of the FEC decoder when no ISI is introduced in the channel. Recall that SISO equalization based on exact linear
MMSE estimation has the complexity order O(L · W 2 ) as shown in Table V, whereas SISO equalization based on
approximate linear MMSE estimation has the complexity order O(L · W ).

[Figure 12: data bit error rate (logarithmic axis, 10^0 down to 10^−4) versus P/σ² in dB (axis ticks 0, 1.4, 2, 4, ..., 12). Curves: separate equalization and decoding, and after 1, 2, and 8 iterations. The capacity limit of the ISI channel is marked.]

Fig. 12. Performance of turbo equalization for the communication system described in Sec. II-A using approximate linear MMSE estimation
of BPSK-modulated symbols. The BER performance (solid lines) is plotted for separate equalization and decoding as well as after 1, 2, or 8
iterations. The dotted line corresponds to the BER performance of the FEC decoder when no ISI is introduced in the channel. The dashed line
is a lower bound on the BER performance of any decoder. The considered blocklength is K = 510 (N = L = 1024). An S-random interleaver
with S=16 is applied.

D. Approximate linear MMSE estimation in the frequency domain

Consider the following linear MMSE estimator,

x̂ = E{x} + Cov{x, y} Cov{y, y}−1 (y − E{y}), (40)


which, when x̂n does not depend on the corresponding symbol statistics µn and vn , yields

x̂n = hTn (σ 2 IL +HVHT +(1−vn ) hn hTn )−1 (y−H µ +µn hn ), (41)

where µ = (µ0 µ1 ... µL−1 )T , V = (v0 v1 ... vL−1 )T , and hn = H[0 : L−1, n] is (here) the n-th column of H. By
considering the following approximation, which is similar to that in Sec. IV-C, i.e., it minimizes the average MSE
$\frac{1}{L}\sum_{n=0}^{L-1} E\{(\hat{x}_n - x_n)^2\}$:

$$\hat{x}_n = \mathbf{h}_n^T(\boldsymbol{\Sigma} + (1-\bar{v})\,\mathbf{h}_n\mathbf{h}_n^T)^{-1}(\mathbf{y}-\mathbf{H}\boldsymbol{\mu}+\mu_n\mathbf{h}_n) = \frac{\mathbf{h}_n^T\boldsymbol{\Sigma}^{-1}(\mathbf{y}-\mathbf{H}\boldsymbol{\mu}+\mu_n\mathbf{h}_n)}{1+(1-\bar{v})\,\mathbf{h}_n^T\boldsymbol{\Sigma}^{-1}\mathbf{h}_n},$$

where $\boldsymbol{\Sigma} = \sigma^2\mathbf{I}_L + \bar{v}\,\mathbf{H}\mathbf{H}^T$, we see that the term $\mathbf{h}_n^T\boldsymbol{\Sigma}^{-1}\mathbf{h}_n$ is constant for all n given a time invariant ISI
channel and a system matrix H constructed under a periodic extension. Using a cyclic prefix for transmission, i.e.,
(xL−2 xL−1 x0 x1 ... xL−1 )T , H becomes a circulant matrix. The subsequence (xL−2 . . . xL−1 )T transmitted prior
to x is the so-called cyclic prefix [83], which in general consists of the Mh last symbols of x given a memory-Mh
ISI channel. Using the constant term $s = \mathbf{h}_n^T\boldsymbol{\Sigma}^{-1}\mathbf{h}_n$, the estimates x̂n are given by

x̂n = (hTn Σ−1 (y−H µ )+µn s)/(1+(1−v̄) s),

which can be combined to


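$$\hat{\mathbf{x}} = (\mathbf{H}^T\boldsymbol{\Sigma}^{-1}(\mathbf{y}-\mathbf{H}\boldsymbol{\mu}) + \boldsymbol{\mu}\,s)/(1+(1-\bar{v})\,s). \tag{42}$$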

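The complexity of this calculation can be reduced to O(L · ln L) by calculating (42) efficiently in the frequency
domain, since H is a circulant matrix, i.e., its columns are cyclic shifts of the first column h0 = (h0 h1 ... hMh 0 ... 0)T .
It follows that H is equal to $\mathbf{F}^{-1}\mathrm{Diag}\{\mathbf{h}_F\}\mathbf{F}$, where F is the L × L DFT matrix and

$$\mathbf{h}_F = (h_{F,0}\ h_{F,1}\ ...\ h_{F,L-1})^T = \mathbf{F}\mathbf{h}_0$$

is the spectrum of h0 . Applying this relationship to Σ−1 yields

$$\boldsymbol{\Sigma}^{-1} = (\sigma^2\mathbf{I}_L + \bar{v}\,\mathbf{F}^{-1}\mathrm{Diag}\{\mathbf{h}_F\}\mathbf{F}\,(\mathbf{F}^{-1}\mathrm{Diag}\{\mathbf{h}_F\}\mathbf{F})^H)^{-1}$$

$$= (\sigma^2\mathbf{I}_L + \bar{v}\,\mathbf{F}^{-1}\mathrm{Diag}\{\mathbf{h}_F\}\mathrm{Diag}\{\mathbf{h}_F\}^*\mathbf{F})^{-1}$$

$$= \mathbf{F}^{-1}(\sigma^2\mathbf{I}_L + \bar{v}\,\mathrm{Diag}\{\mathbf{h}_F\}\mathrm{Diag}\{\mathbf{h}_F\}^*)^{-1}\mathbf{F},$$

where HT was replaced by HH in order that the properties (F−1 )H = F/L and FH = L · F−1 of the DFT matrix
can be applied. The estimate vector x̂ is finally given by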
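$$\hat{\mathbf{x}} = \frac{\mathbf{F}^{-1}\mathrm{Diag}\{\mathbf{h}_F\}^*(\sigma^2\mathbf{I}_L + \bar{v}\,\mathrm{Diag}\{\mathbf{h}_F\}\mathrm{Diag}\{\mathbf{h}_F\}^*)^{-1}(\mathbf{F}\mathbf{y} - \mathrm{Diag}\{\mathbf{h}_F\}\mathbf{F}\boldsymbol{\mu}) + \boldsymbol{\mu}\,s}{1+(1-\bar{v})\,s}. \tag{43}$$

It will be exact whenever all symbol variances vn are identical, e.g., in the initial equalization step in turbo
equalization, where all vn are 1. The constant $s = \mathbf{h}_n^T\boldsymbol{\Sigma}^{-1}\mathbf{h}_n$ can be calculated using $\mathbf{h}_0 = \mathbf{F}^{-1}\mathrm{Diag}\{\mathbf{h}_F\}\mathbf{1}_L$:

$$s = \frac{1}{L}\,\mathbf{1}_L^T\mathrm{Diag}\{\mathbf{h}_F\}^*(\sigma^2\mathbf{I}_L + \bar{v}\,\mathrm{Diag}\{\mathbf{h}_F\}\mathrm{Diag}\{\mathbf{h}_F\}^*)^{-1}\mathrm{Diag}\{\mathbf{h}_F\}\mathbf{1}_L = \frac{1}{L}\sum_{n=0}^{L-1}\frac{h_{F,n}h_{F,n}^*}{\sigma^2 + \bar{v}\,h_{F,n}h_{F,n}^*},$$

where the factor 1/L stems from the property (F−1 )H F−1 = IL /L of the unnormalized DFT matrix.
The complexity of the actual estimation part is only O(L) and that of the complete equalization algorithm is
O(L · ln L), using the fast Fourier transform [84]. The extrinsic LLRs λext (cn ) are calculated as in Sec. IV-C. A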


TABLE VII
ORDER $O(L \cdot \ln L)$ SISO EQUALIZATION ALGORITHM BASED ON APPROXIMATE LINEAR MMSE ESTIMATION OF BPSK-MODULATED
SYMBOLS IN THE FREQUENCY DOMAIN.

INPUT
    extrinsic LLRs $\lambda'_{\mathrm{ext}}(c_n)$ for $n = 0, 1, ..., N-1$
INITIALIZATION
    compute $\mu_n = \tanh(\lambda'_{\mathrm{ext}}(c_n)/2)$ and $v_n = 1 - |\mu_n|^2$ for $n = 0, 1, ..., N-1$
    compute $\bar{v} = \frac{1}{L}\sum_{n=0}^{L-1} v_n$
    compute $(h_{F,0}\ h_{F,1}\ ...\ h_{F,L-1})^T = \mathbf{F}(h_0\ h_1\ ...\ h_{M_h}\ 0\ ...\ 0)^T$
    compute $s = \frac{1}{L}\sum_{n=0}^{L-1} |h_{F,n}|^2/(\sigma^2 + \bar{v}\,|h_{F,n}|^2)$
    compute $\kappa = 2/(1 - \bar{v}s)$
APPROXIMATE LINEAR MMSE ESTIMATION
    $(\mu_{F,0}\ ...\ \mu_{F,L-1})^T = \mathbf{F}(\mu_0\ ...\ \mu_{L-1})^T$
    $(y_{F,0}\ ...\ y_{F,L-1})^T = \mathbf{F}(y_0\ ...\ y_{L-1})^T$
    FOR n = 0 TO N − 1 DO
        $\hat{x}_{F,n} = (h_{F,n}^*\,y_{F,n} - |h_{F,n}|^2\mu_{F,n})/(\sigma^2 + \bar{v}\,|h_{F,n}|^2)$
    END
    $(\hat{x}_0\ ...\ \hat{x}_{L-1})^T = \mathbf{F}^{-1}(\hat{x}_{F,0}\ ...\ \hat{x}_{F,L-1})^T$
    FOR n = 0 TO N − 1 DO
        $\lambda_{\mathrm{ext}}(c_n) = \kappa\cdot(\hat{x}_n + \mu_n\cdot s)$
    END
OUTPUT
    extrinsic LLRs $\lambda_{\mathrm{ext}}(c_n)$ for $n = 0, 1, ..., N-1$

detailed study of the complexity of estimation algorithms working either in the time or in the frequency domain
has been carried out in [85].
The complete SISO equalization algorithm based on approximate linear MMSE estimation of BPSK-modulated
symbols in the frequency domain is summarized in Table VII. The BER performance of turbo equalization using
this estimation algorithm is essentially identical to that in Sec. IV-C, where approximate linear MMSE estimation of BPSK-
modulated symbols in the time domain was applied. The frequency domain approach can perform slightly better
than the time domain counterpart in Sec. IV-C, since the entire received sequence y is used to estimate each symbol
xn . However, the improvement is usually small as long as the window parameter W is chosen sufficiently long.
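The frequency-domain procedure can be sketched with an FFT and compared with a dense evaluation of the matrix expression (42) for a circulant channel matrix. This is a minimal illustration under assumptions: the function names, the test channel, and the random inputs are chosen for the example, and NumPy's unnormalized forward FFT is assumed to play the role of F, which is why the constant s carries a 1/L factor (computed here as a mean). The output adds the term µn·s in accordance with (39) and (42).

```python
import numpy as np


def fd_lmmse_bpsk(y, h, prior_llrs, sigma2):
    """Approximate linear MMSE SISO equalizer for BPSK, evaluated with FFTs."""
    L = len(y)
    mu = np.tanh(np.asarray(prior_llrs, dtype=float) / 2.0)
    v_bar = np.mean(1.0 - mu ** 2)
    h_F = np.fft.fft(h, n=L)                        # spectrum of the zero-padded taps
    gain = np.abs(h_F) ** 2
    s = np.mean(gain / (sigma2 + v_bar * gain))     # 1/L factor for the unnormalized FFT
    x_F = (np.conj(h_F) * np.fft.fft(y) - gain * np.fft.fft(mu)) / (sigma2 + v_bar * gain)
    x_tilde = np.fft.ifft(x_F).real
    return 2.0 * (x_tilde + mu * s) / (1.0 - v_bar * s)


def dense_reference(y, h, prior_llrs, sigma2):
    """Same quantity from the L x L circulant matrix, for checking only (cubic cost)."""
    L = len(y)
    mu = np.tanh(np.asarray(prior_llrs, dtype=float) / 2.0)
    v_bar = np.mean(1.0 - mu ** 2)
    h0 = np.zeros(L)
    h0[:len(h)] = h
    H = np.column_stack([np.roll(h0, n) for n in range(L)])   # column n is h0 shifted by n
    Sigma = sigma2 * np.eye(L) + v_bar * H @ H.T
    s = H[:, 0] @ np.linalg.solve(Sigma, H[:, 0])
    x_tilde = H.T @ np.linalg.solve(Sigma, y - H @ mu)
    return 2.0 * (x_tilde + mu * s) / (1.0 - v_bar * s)


# Demo with assumed values.
rng = np.random.default_rng(0)
L = 16
h = np.array([0.407, 0.815, 0.407])
y = rng.standard_normal(L)
prior = rng.normal(scale=2.0, size=L)
llr_fft = fd_lmmse_bpsk(y, h, prior, 0.2)
llr_ref = dense_reference(y, h, prior, 0.2)
```

The two routes agree because a circulant matrix acts as a circular convolution, which the FFT diagonalizes; only the FFT route scales as L · ln L.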

E. Linear MMSE estimation of PSK- or QAM-modulated symbols

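Another approximation is to simply discard the pseudo-variances ψn while computing x̂n , i.e., the transmitted
symbols xn are assumed to be circularly symmetric. By setting ψn to 0 for all n, the calculation of the estimates
x̂n in (29) simplifies greatly, since $\boldsymbol{\Psi}_n$ and $\mathbf{v}_n$ vanish for all n and $\mathbf{u}_n$ is given by $\mathbf{h}_n^H\tilde{\boldsymbol{\Sigma}}_n^{-1}$:

$$\hat{x}_n = \mathbf{h}_n^H\tilde{\boldsymbol{\Sigma}}_n^{-1}(\mathbf{y}_n-\mathbf{H}_n\boldsymbol{\mu}_n+\mu_n\mathbf{h}_n), \tag{44}$$

where $\tilde{\boldsymbol{\Sigma}}_n = \boldsymbol{\Sigma}_n + (1-v_n)\,\mathbf{h}_n\mathbf{h}_n^H$ and $\boldsymbol{\Sigma}_n = \sigma^2\mathbf{I}_W + \mathbf{H}_n\mathbf{V}_n\mathbf{H}_n^H$. The expression $\tilde{\boldsymbol{\Sigma}}_n^{-1}\mathbf{h}_n$ is equal to
$\boldsymbol{\Sigma}_n^{-1}\mathbf{h}_n/(1+(1-v_n)\,\mathbf{h}_n^H\boldsymbol{\Sigma}_n^{-1}\mathbf{h}_n)$, which can be shown using Woodbury's identity. It follows that the estimates x̂n can be computed
as follows:

$$\hat{x}_n = \frac{\mathbf{f}_n^H(\mathbf{y}_n-\mathbf{H}_n\boldsymbol{\mu}_n) + \mu_n\cdot s_n}{1+(1-v_n)\,s_n}, \qquad \mathbf{f}_n = \boldsymbol{\Sigma}_n^{-1}\mathbf{h}_n, \qquad s_n = \mathbf{f}_n^H\mathbf{h}_n.$$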
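The rank-one shortcut obtained from Woodbury's identity is easy to verify numerically. The sketch below is illustrative only: the function name `rank_one_shortcut` and the random Hermitian positive definite matrix standing in for Σn are assumptions of this example.

```python
import numpy as np


def rank_one_shortcut(Sigma, h, v):
    """Return inv(Sigma + (1 - v) h h^H) h without forming or inverting the updated matrix."""
    f = np.linalg.solve(Sigma, h)      # f = inv(Sigma) h
    s = (h.conj() @ f).real            # s = h^H inv(Sigma) h, real for Hermitian Sigma
    return f / (1.0 + (1.0 - v) * s)


# Demo with assumed values: a random Hermitian positive definite stand-in for Sigma_n.
rng = np.random.default_rng(1)
W = 6
A = rng.standard_normal((W, W + 3)) + 1j * rng.standard_normal((W, W + 3))
Sigma = 0.5 * np.eye(W) + A @ A.conj().T
h = rng.standard_normal(W) + 1j * rng.standard_normal(W)
v = 0.3
Sigma_tilde = Sigma + (1.0 - v) * np.outer(h, h.conj())
direct = np.linalg.solve(Sigma_tilde, h)
shortcut = rank_one_shortcut(Sigma, h, v)
```

Because only the inverse of Σn is needed, the same recursive inverse update as in the BPSK case can be reused, with transposes replaced by Hermitian transposes.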


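TABLE VIII
ORDER $O(L \cdot W^2)$ SISO EQUALIZATION ALGORITHM BASED ON LINEAR MMSE ESTIMATION OF PSK- OR QAM-MODULATED SYMBOLS.
THE OPERATOR := DENOTES THAT THE VALUE OF THE LEFT-HAND ARGUMENT IS REPLACED BY THAT OF THE RIGHT-HAND ARGUMENT.

INPUT
    extrinsic LLRs $\lambda'_{\mathrm{ext}}(c_n)$ for $n = 0, 1, ..., N-1$
INITIALIZATION
    compute $L = \lceil N/q \rceil$ given a $2^q$-ary signal alphabet $S$
    compute $\mu_n = \sum_{x\in S} x \cdot s'(x_n = x)$ and $v_n = \sum_{x\in S} |x-\mu_n|^2 \cdot s'(x_n = x)$ for $n = 0, 1, ..., L-1$,
    where $s'(x_n = x) = \prod_{k=0}^{q-1} \frac{\exp(-m_k\cdot\lambda'_{\mathrm{ext}}(c_{qn+k}))}{1+\exp(-\lambda'_{\mathrm{ext}}(c_{qn+k}))}$, $x = \mathrm{map}(m_0, ..., m_{q-1})$
    compute $\mathbf{R} = (\sigma^2\mathbf{I}_W + \mathbf{H}_0\mathbf{V}_0\mathbf{H}_0^H)^{-1}$
LINEAR MMSE ESTIMATION
    FOR n = 0 TO L − 1 DO
        $\mathbf{f} = \mathbf{R}\mathbf{h}_n$
        $s = \mathbf{f}^H\mathbf{h}_n$
        $\kappa = 1/(1+(1-v_n)\,s)$
        $\hat{x} = \kappa\cdot(\mathbf{f}^H(\mathbf{y}_n-\mathbf{H}_n\boldsymbol{\mu}_n) + \mu_n\cdot s)$
        FOR k = 0 TO q − 1 DO
            $\lambda_{\mathrm{ext}}(c_{qn+k}) = \ln\frac{\sum_{\mathbf{m}\in\mathbb{F}_2^q:\,m_k=0}\exp(-|\hat{x}-\mathrm{map}(\mathbf{m})|^2/(1-\kappa\cdot s))\prod_{j\neq k}\exp(-m_j\lambda'_{\mathrm{ext}}(c_{qn+j}))}{\sum_{\mathbf{m}\in\mathbb{F}_2^q:\,m_k=1}\exp(-|\hat{x}-\mathrm{map}(\mathbf{m})|^2/(1-\kappa\cdot s))\prod_{j\neq k}\exp(-m_j\lambda'_{\mathrm{ext}}(c_{qn+j}))}$
            where $\mathbf{m} = (m_0, ..., m_{q-1})$
        END
        IF n < (L − 1) THEN
            $\begin{bmatrix} u & \mathbf{u}^H \\ \mathbf{u} & \mathbf{U} \end{bmatrix} := \mathbf{R}$
            $\mathbf{U} := \mathbf{U} - \mathbf{u}\mathbf{u}^H/u$
            $\begin{bmatrix} \mathbf{u} \\ u \end{bmatrix} := \begin{bmatrix} \mathbf{0}_{W-1} \\ \sigma^2 \end{bmatrix} + \mathbf{H}_{n+1}\mathbf{V}_{n+1}\mathbf{H}_{n+1}^H\begin{bmatrix} \mathbf{0}_{W-1} \\ 1 \end{bmatrix}$
            $\mathbf{u}' := \mathbf{U}\mathbf{u}$
            $u := 1/(u - \mathbf{u}^H\mathbf{u}')$
            $\mathbf{u} := -u\,\mathbf{u}'$
            $\mathbf{U} := \mathbf{U} + u\,\mathbf{u}'\mathbf{u}'^{\,H}$
            $\mathbf{R} := \begin{bmatrix} \mathbf{U} & \mathbf{u} \\ \mathbf{u}^H & u \end{bmatrix}$
        END
    END
OUTPUT
    extrinsic LLRs $\lambda_{\mathrm{ext}}(c_n)$ for $n = 0, 1, ..., N-1$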

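The computation of the extrinsic LLRs λext (cn ) can be shown to be

$$\lambda_{\mathrm{ext}}(c_{qn+k}) = \ln\frac{\sum_{\mathbf{m}\in\mathbb{F}_2^q:\,m_k=0}\exp\!\left(-\frac{|\hat{x}_n-\mathrm{map}(\mathbf{m})|^2}{1-s_n/(1+(1-v_n)\,s_n)}\right)\prod_{j\neq k}\exp(-m_j\lambda'_{\mathrm{ext}}(c_{qn+j}))}{\sum_{\mathbf{m}\in\mathbb{F}_2^q:\,m_k=1}\exp\!\left(-\frac{|\hat{x}_n-\mathrm{map}(\mathbf{m})|^2}{1-s_n/(1+(1-v_n)\,s_n)}\right)\prod_{j\neq k}\exp(-m_j\lambda'_{\mathrm{ext}}(c_{qn+j}))},$$

where m = (m0 , ..., mq−1 ). The complete order $O(L \cdot W^2)$ SISO equalization algorithm based on linear MMSE
estimation of PSK- or QAM-modulated symbols is summarized in Table VIII.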
The BER performance of turbo equalization using the linear MMSE estimation of PSK- or QAM-modulated
symbols is studied at the end of the next section, which introduces an even simpler MMSE estimation algorithm.


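TABLE IX
ORDER $O(L \cdot W)$ SISO EQUALIZATION ALGORITHM BASED ON APPROXIMATE LINEAR MMSE ESTIMATION OF PSK- OR
QAM-MODULATED SYMBOLS.

INPUT
    extrinsic LLRs $\lambda'_{\mathrm{ext}}(c_n)$ for $n = 0, 1, ..., N-1$
INITIALIZATION
    compute $L = \lceil N/q \rceil$ given a $2^q$-ary signal alphabet $S$
    compute $\mu_n = \sum_{x\in S} x \cdot s'(x_n = x)$ and $v_n = \sum_{x\in S} |x-\mu_n|^2 \cdot s'(x_n = x)$ for $n = 0, 1, ..., L-1$,
    where $s'(x_n = x) = \prod_{k=0}^{q-1} \frac{\exp(-m_k\cdot\lambda'_{\mathrm{ext}}(c_{qn+k}))}{1+\exp(-\lambda'_{\mathrm{ext}}(c_{qn+k}))}$, $x = \mathrm{map}(m_0, ..., m_{q-1})$
    compute $\bar{v} = \frac{1}{L}\sum_{n=0}^{L-1} v_n$
    compute $\mathbf{f} = (\sigma^2\mathbf{I}_W + \bar{v}\,\mathbf{H}_0\mathbf{H}_0^H)^{-1}\mathbf{h}_0$
    compute $s = \mathbf{f}^H\mathbf{h}_0$
    compute $\kappa = 1/(1+(1-\bar{v})\,s)$
APPROXIMATE LINEAR MMSE ESTIMATION
    FOR n = 0 TO L − 1 DO
        $\hat{x}_n = \kappa\cdot(\mathbf{f}^H(\mathbf{y}_n-\mathbf{H}_n\boldsymbol{\mu}_n) + \mu_n\cdot s)$
        FOR k = 0 TO q − 1 DO
            $\lambda_{\mathrm{ext}}(c_{qn+k}) = \ln\frac{\sum_{\mathbf{m}\in\mathbb{F}_2^q:\,m_k=0}\exp(-|\hat{x}_n-\mathrm{map}(\mathbf{m})|^2/(1-\kappa s))\prod_{j\neq k}\exp(-m_j\lambda'_{\mathrm{ext}}(c_{qn+j}))}{\sum_{\mathbf{m}\in\mathbb{F}_2^q:\,m_k=1}\exp(-|\hat{x}_n-\mathrm{map}(\mathbf{m})|^2/(1-\kappa s))\prod_{j\neq k}\exp(-m_j\lambda'_{\mathrm{ext}}(c_{qn+j}))}$
            where $\mathbf{m} = (m_0, ..., m_{q-1})$
        END
    END
OUTPUT
    extrinsic LLRs $\lambda_{\mathrm{ext}}(c_n)$ for $n = 0, 1, ..., N-1$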

F. Approximate linear MMSE estimation of PSK- or QAM-modulated symbols

An approximate linear MMSE estimation algorithm for PSK- or QAM-modulated symbols follows from using a
time-invariant coefficient vector to compute the estimates $\hat{x}_n$,

$$\hat{x}_n = \kappa \cdot (\mathbf{f}^H (\mathbf{y}_n - \mathbf{H}_n \boldsymbol{\mu}_n) + \mu_n s), \qquad \mathbf{f} = (\sigma^2 \mathbf{I}_W + \bar{v}\, \mathbf{H}_n \mathbf{H}_n^H)^{-1} \mathbf{h}_n, \tag{45}$$

where $s = \mathbf{f}^H \mathbf{h}_n$ and $\kappa = 1/(1 + (1 - \bar{v})\, s)$, as in Table IX. This solution has a complexity order of $O(W)$ per time step $n$, similar
to the approximate linear MMSE estimation algorithm for BPSK-modulated symbols summarized in Table VI. It
minimizes the average MSE $\frac{1}{L} \cdot \sum_{n=0}^{L-1} E\{|\hat{x}_n - x_n|^2\}$ provided that the average pseudo-variance $\bar{\psi} = \frac{1}{L} \cdot \sum_{n=0}^{L-1} \psi_n$
vanishes. For computing the extrinsic LLRs $\lambda_{\mathrm{ext}}(c_n)$, the same approximate calculation of the estimation error
statistics as in Sec. IV-C is applied, i.e., the PDF $p(e_n)$ is approximated with a Gaussian distribution with
time-invariant mean $E\{e_n\} = 0$, variance $\mathrm{Var}\{e_n\} = 1 - \kappa s$, and pseudo-variance $\mathrm{Cov}\{e_n, e_n^*\} = \frac{1}{L} \cdot \sum_{n=0}^{L-1} \mathrm{Cov}\{e_n, e_n^*\}$.
The latter is again ignored, which yields the following formula to calculate the LLRs $\lambda_{\mathrm{ext}}(c_n)$,
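$$\lambda_{\mathrm{ext}}(c_{qn+k}) = \ln \frac{\sum_{m \in \mathbb{F}_2^q : m_k = 0} \exp\!\left(-\frac{|\hat{x}_n - \mathrm{map}(m)|^2}{1 - \kappa s}\right) \prod_{j \neq k} \exp(-m_j \lambda'_{\mathrm{ext}}(c_{qn+j}))}{\sum_{m \in \mathbb{F}_2^q : m_k = 1} \exp\!\left(-\frac{|\hat{x}_n - \mathrm{map}(m)|^2}{1 - \kappa s}\right) \prod_{j \neq k} \exp(-m_j \lambda'_{\mathrm{ext}}(c_{qn+j}))},$$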

where m = (m0 , ..., mq−1 ). The complete order O(L · W ) SISO equalization algorithm based on approximate
linear MMSE estimation of PSK- or QAM-modulated symbols is summarized in Table IX.
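
The initialization step of Table IX (symbol means, variances, and their average) can be sketched as follows. This is only an illustrative sketch: the function name `soft_symbol_stats` and the Gray-mapped QPSK dictionary `QPSK_GRAY` are assumptions made for the example and are not taken from the paper, which leaves the mapping map(·) general.

```python
import itertools
import numpy as np

# Hypothetical unit-energy QPSK alphabet with Gray labeling (an assumption;
# the paper only requires some mapping from q-tuples of bits to symbols).
QPSK_GRAY = {
    (0, 0): (1 + 1j) / np.sqrt(2),
    (0, 1): (1 - 1j) / np.sqrt(2),
    (1, 0): (-1 + 1j) / np.sqrt(2),
    (1, 1): (-1 - 1j) / np.sqrt(2),
}

def soft_symbol_stats(llrs, constellation):
    """Return (mu, v): mean and variance of each symbol given a priori LLRs.

    llrs[n, k] plays the role of lambda'_ext(c_{qn+k}) = ln P(c=0)/P(c=1).
    """
    llrs = np.asarray(llrs, dtype=float)
    num_symbols, q = llrs.shape
    # P(c=1) = exp(-llr) / (1 + exp(-llr)), written with tanh for stability.
    p1 = 0.5 * (1.0 - np.tanh(llrs / 2.0))
    p0 = 1.0 - p1
    mu = np.zeros(num_symbols, dtype=complex)
    second_moment = np.zeros(num_symbols)
    for m in itertools.product((0, 1), repeat=q):
        x = constellation[m]
        prob = np.ones(num_symbols)          # s'(x_n = x) as a product over bits
        for k, m_k in enumerate(m):
            prob *= p1[:, k] if m_k else p0[:, k]
        mu += prob * x
        second_moment += prob * abs(x) ** 2
    v = second_moment - np.abs(mu) ** 2      # equals sum |x - mu|^2 s'(x_n = x)
    return mu, v
```

The time-invariant quantity $\bar{v}$ used in (45) would then simply be `v.mean()`. With all-zero LLRs the sketch returns zero means and unit variances for this alphabet, and with very reliable LLRs the variance collapses to zero.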
The BER performance of turbo equalization based on (approximate) linear MMSE estimation of PSK- or QAM-
modulated symbols is exhibited in Fig. 13 for the same block length K = 510 as in Fig. 10, which depicts


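TABLE X
ORDER $O(L \cdot \ln L)$ SISO EQUALIZATION ALGORITHM BASED ON APPROXIMATE LINEAR MMSE ESTIMATION OF PSK- OR
QAM-MODULATED SYMBOLS IN THE FREQUENCY DOMAIN.

INPUT
    extrinsic LLRs $\lambda'_{\mathrm{ext}}(c_n)$ for $n = 0, 1, \ldots, N-1$
INITIALIZATION
    compute $L = \lceil N/q \rceil$ given a $2^q$-ary signal alphabet $\mathcal{S}$
    compute $\mu_n = \sum_{x \in \mathcal{S}} x \cdot s'(x_n = x)$ and $v_n = \sum_{x \in \mathcal{S}} |x - \mu_n|^2 \cdot s'(x_n = x)$ for $n = 0, 1, \ldots, L-1$,
        where $s'(x_n = x) = \prod_{k=0}^{q-1} \frac{\exp(-m_k \cdot \lambda'_{\mathrm{ext}}(c_{qn+k}))}{1 + \exp(-\lambda'_{\mathrm{ext}}(c_{qn+k}))}$, $x = \mathrm{map}(m_0, \ldots, m_{q-1})$
    compute $\bar{v} = \frac{1}{L} \cdot \sum_{n=0}^{L-1} v_n$
    compute $(h_{F,0}\ h_{F,1}\ \ldots\ h_{F,L-1})^T = \mathbf{F}\,(h_0\ h_1\ \ldots\ h_{M_h}\ 0\ \ldots\ 0)^T$
    compute $s = \sum_{n=0}^{L-1} |h_{F,n}|^2/(\sigma^2 + \bar{v}\, |h_{F,n}|^2)$
    compute $\kappa = 1/(1 + (1 - \bar{v})\, s)$
APPROXIMATE LINEAR MMSE ESTIMATION
    $(\mu_{F,0}\ \ldots\ \mu_{F,L-1})^T = \mathbf{F}\,(\mu_0\ \ldots\ \mu_{L-1})^T$
    $(y_{F,0}\ \ldots\ y_{F,L-1})^T = \mathbf{F}\,(y_0\ \ldots\ y_{L-1})^T$
    FOR $n = 0$ TO $L-1$ DO
        $\hat{x}_{F,n} = \kappa \cdot (h_{F,n}^* y_{F,n} - |h_{F,n}|^2 \mu_{F,n})/(\sigma^2 + \bar{v}\, |h_{F,n}|^2)$
    END
    $(\hat{x}_0\ \ldots\ \hat{x}_{L-1})^T = \mathbf{F}^{-1}(\hat{x}_{F,0}\ \ldots\ \hat{x}_{F,L-1})^T$
    FOR $n = 0$ TO $L-1$ DO
        FOR $k = 0$ TO $q-1$ DO
            $\lambda_{\mathrm{ext}}(c_{qn+k}) = \ln \frac{\sum_{m \in \mathbb{F}_2^q : m_k = 0} \exp(-|\hat{x}_n - \mathrm{map}(m)|^2/(1 - \kappa s)) \prod_{j \neq k} \exp(-m_j \lambda'_{\mathrm{ext}}(c_{qn+j}))}{\sum_{m \in \mathbb{F}_2^q : m_k = 1} \exp(-|\hat{x}_n - \mathrm{map}(m)|^2/(1 - \kappa s)) \prod_{j \neq k} \exp(-m_j \lambda'_{\mathrm{ext}}(c_{qn+j}))}$
            where $m = (m_0, \ldots, m_{q-1})$
        END
    END
OUTPUT
    extrinsic LLRs $\lambda_{\mathrm{ext}}(c_n)$ for $n = 0, 1, \ldots, N-1$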

the performance of turbo equalization using exact MMSE estimation. Both the order $O(L \cdot W^2)$ linear MMSE
algorithm summarized in Table VIII and the order $O(L \cdot W)$ approximate linear MMSE algorithm summarized in
Table IX are considered. Comparing the two figures indicates that the performance degradation due to the imposed
approximations is small, but not negligible. For separate equalization and decoding, all three solutions are identical
and, thus, yield the same BER performance.

V. PRECODING

We now consider the use of precoders in the transmitter of Fig. 1, i.e., where q-tuples cn of code bits cn are
precoded to the q-tuples c̃n using a memory-Mp state-space model. Without precoding, the BER performance of ML
decoding with perfect interleaving over an ISI channel is lower bounded by the ML
decoding performance of the FEC decoder given an ISI-free channel at the same SNR. This lower bound can be
seen in Figs. 6, 8, 10, 11, 12, and 13. Clearly, the decoding performance is bounded away from capacity limits.
Precoding is an elegant way to improve the spectrum of the transmitter such that the lower bound on the BER
decoding performance can be beaten. This section focuses on the implementation of turbo equalization in the
receiver given a transmitter using precoding.
[Figure 13: data bit error rate versus P/σ² in dB; curves for separate equalization and decoding and after 1, 2, and 8 iterations, with the capacity limit of the ISI channel marked.]

Fig. 13. Performance of turbo equalization for the communication system described in Sec. III-A using (approximate) linear MMSE estimation.
The BER performance is plotted for separate equalization and decoding as well as after 1, 2, or 8 iterations, where the solid lines correspond
to the order $O(L \cdot W^2)$ linear MMSE algorithm summarized in Table VIII and the dashed lines with crosses correspond to the order
$O(L \cdot W)$ approximate linear MMSE algorithm summarized in Table IX. The dotted line corresponds to the BER performance of bit-interleaved
coded modulation given an ISI-free channel after 8 demapping-decoding iterations. The considered blocklength is K = 510 (N = 1024 and
L = 342). An S-random interleaver with S=16 is applied.

Using precoding, $x_n$ is affected by $\mathbf{c}_n$ and all previous tuples $\mathbf{c}_i$, $i = 0, 1, \ldots, n-1$. Conversely, $P(c_n|\mathbf{y})$ must be
computed taking into account the entire sequence $\mathbf{c}$, i.e.,

$$P(c_n = c\,|\,\mathbf{y}) = \sum_{\mathbf{m} \in \mathbb{F}_2^N : m_n = c} P(\mathbf{c} = \mathbf{m}\,|\,\mathbf{y}), \qquad \mathbf{m} = (m_0, m_1, \ldots, m_{N-1}). \tag{46}$$

Given a state-space model of the precoder, P (cn |y) can be obtained using the BCJR algorithm on the combined
trellis of the precoder state-space model and the ISI channel state space model.
For the following precoder state-space model, where the memory Mp is restricted to be qMh and the parameter
matrices A and B are given by
 
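$$\mathbf{A} = \begin{pmatrix} \mathbf{A}_1 & \mathbf{A}_2 & \mathbf{A}_3 & \cdots & \mathbf{A}_{M_h} \\ \mathbf{I}_q & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{I}_q & \mathbf{0} & \cdots & \mathbf{0} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} \end{pmatrix} \quad \text{and} \quad \mathbf{B} = \begin{pmatrix} \mathbf{I}_q \\ \mathbf{0} \\ \vdots \\ \mathbf{0} \end{pmatrix}, \tag{47}$$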
where the terms Ai are arbitrary q × q matrices with binary numbers as elements, the state-space models of the
channel and precoder can be combined to
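$$\begin{aligned}
&\text{system input:} && \text{q-tuples } \mathbf{c}_n \text{ for } n = 0, 1, \ldots, L-1, \\
&\text{system function:} && \boldsymbol{\theta}_{n+1} = \begin{pmatrix} \mathbf{A}_1 & \mathbf{A}_2 & \mathbf{A}_3 & \cdots & \mathbf{A}_{M_h} \\ \mathbf{I}_q & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{I}_q & \mathbf{0} & \cdots & \mathbf{0} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \cdots & \mathbf{0} \end{pmatrix} \boldsymbol{\theta}_n + \begin{pmatrix} \mathbf{I}_q \\ \mathbf{0} \\ \vdots \\ \mathbf{0} \end{pmatrix} \mathbf{c}_n^T, \\
&\text{output function:} && v_n = (h_1\ h_2\ \ldots\ h_{M_h})\, \mathrm{map}(\boldsymbol{\theta}_n) + h_0\, \mathrm{map}(x_n), \\
&\text{observation:} && y_n = v_n + w_n,
\end{aligned} \tag{48}$$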

[Figure 14: trellis diagram with the four states s0, s1, s2, and s3 drawn over the time steps n = 0, 1, 2, 3, ..., L−1, L; each branch carries an input/output label such as 0/0.407 or 1/−0.407.]

Fig. 14. Combined trellis of a length-3 ISI channel given by the coefficients h0 = 0.407, h1 = 0.815, and h2 = 0.407 and the precoder
c̃n = cn + c̃n−1 under the termination assumption. Note that compared to the trellis in Fig. 3, only the input xn in the labels xn / vn has
changed.

where $\mathrm{map}(\boldsymbol{\theta}_n)$ is the element-wise mapping from the length-$qM_h$ state vector $\boldsymbol{\theta}_n$ to $M_h$ symbols from $\mathcal{S}$. The
corresponding trellis has $2^{qM_h}$ states as in the case of no precoding. This holds as well for any precoder with memory
$M_p$ less than $qM_h$, whose parameters A and B can be brought into the form (47). Although the constraints in
(47) limit the number of possible precoders considerably, the remaining choices turn out to be powerful enough to
improve the BER of turbo equalization tremendously.
The extrinsic LLRs λext (cn ) are computed as in Sec. III-B using the combined precoder-channel trellis:
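$$\lambda_{\mathrm{ext}}(c_{qn+k}) = \ln \frac{\sum_{\mathbf{m} \in \mathbb{F}_2^q : m_k = 0} \mathbf{b}_{n+1}^T (\mathbf{U}(\mathbf{m}) \odot \boldsymbol{\Gamma}_{\mathrm{ext},n})\, \mathbf{f}_n \prod_{j=0:\, j \neq k}^{q-1} \exp(-m_j \cdot \lambda'_{\mathrm{ext}}(c_{qn+j}))}{\sum_{\mathbf{m} \in \mathbb{F}_2^q : m_k = 1} \mathbf{b}_{n+1}^T (\mathbf{U}(\mathbf{m}) \odot \boldsymbol{\Gamma}_{\mathrm{ext},n})\, \mathbf{f}_n \prod_{j=0:\, j \neq k}^{q-1} \exp(-m_j \cdot \lambda'_{\mathrm{ext}}(c_{qn+j}))},$$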

with m = (m0 , m1 , ..., mq−1 ), where fn and bn are part of the BCJR algorithm and U(m) returns 1 for all
trellis branches with input label m and 0 otherwise. For example, to compute λext (c3n ) for the 8-PSK alphabet
in Fig. 9, the summation in the numerator is over the 3-tuples (000), (001), (010), (011) and the summation in the
denominator is over the 3-tuples (100), (101), (110), (111). The extrinsic transition matrices Γext,n are defined as
in Sec. III-B.
Consider the system configuration in Sec. II-A, which uses BPSK modulation and a length-3 channel given by
the coefficients h0 = 0.407, h1 = 0.815, and h2 = 0.407. In addition, the memory-1 precoder

θn+1 = θn + cn , c̃n = θn + cn , n = 0, 1, ..., N −1,

is applied, i.e., $M_p = 1$, A = B = C = 1, and D = 1. The initial state $\theta_0$ is 0. This particular precoding rule
can be rewritten as $\tilde{c}_n = c_n + \tilde{c}_{n-1}$ with $\tilde{c}_{-1} = 0$. The parameters A and B satisfy the constraint (47), i.e., the
combined precoder-channel trellis has only $2^{M_h} = 4$ states and is derived from the state-space model:

$$\boldsymbol{\theta}_{n+1} = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix} \boldsymbol{\theta}_n + \begin{pmatrix} 1 \\ 0 \end{pmatrix} c_n, \qquad v_n = (h_1\ h_2)\, \mathrm{map}(\boldsymbol{\theta}_n) + h_0\, \mathrm{map}(x_n), \qquad y_n = v_n + w_n. \tag{49}$$

Figure 14 depicts the corresponding trellis with the branch labeling $c_n/v_n$ and Figure 15 depicts the state machines
of the precoder and the ISI channel. The top plot shows both machines separately, such that the combined precoder-
channel trellis has $2^3 = 8$ states due to the three memory elements in the system. The bottom plot shows an identical
state machine corresponding to (49), which requires only two memory elements. This state reduction is possible
because the precoder satisfies (47).

[Figure 15: block diagrams of two state machines, each with input cn, BPSK mapping 0 → +1, 1 → −1 to xn, channel taps h0, h1, h2, additive noise wn, and output yn; the top diagram additionally shows the precoder output c̃n.]

Fig. 15. Two equivalent state machines describing the precoder c̃n = cn + c̃n−1 followed by a memory-2 ISI channel.
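
As a small illustration of the memory-1 precoder discussed above, the sketch below implements both the recursion c̃n = cn + c̃n−1 (addition over GF(2)) and the state-space form θn+1 = θn + cn, c̃n = θn + cn, and then computes the noiseless channel output vn. The function names are hypothetical, and the channel is assumed to start from an all-zero memory, which is consistent with the first branch labels of Fig. 14.

```python
import numpy as np

def precode_recursive(c):
    """c~_n = c_n XOR c~_{n-1}, with c~_{-1} = 0."""
    out = np.zeros(len(c), dtype=int)
    prev = 0
    for n, bit in enumerate(c):
        prev = int(bit) ^ prev
        out[n] = prev
    return out

def precode_state_space(c):
    """theta_{n+1} = theta_n + c_n and c~_n = theta_n + c_n over GF(2), theta_0 = 0."""
    theta = 0
    out = []
    for bit in c:
        out.append(theta ^ int(bit))   # output equation (C = D = 1)
        theta = theta ^ int(bit)       # state update (A = B = 1)
    return np.array(out, dtype=int)

def noiseless_channel_output(c_tilde, h=(0.407, 0.815, 0.407)):
    """BPSK-map the precoded bits (0 -> +1, 1 -> -1) and apply the ISI channel.

    Assumes zero channel memory before n = 0 (an assumption of this sketch).
    """
    x = 1 - 2 * np.asarray(c_tilde)
    return np.convolve(x, h)[: len(x)]
```

Both precoder forms produce the running XOR (accumulator) of the input bits, so `precode_recursive(c)` equals `np.cumsum(c) % 2`.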
The BER performance of turbo equalization using the precoder c̃n = cn + c̃n−1 and symbol-based MAP detection
as equalization algorithm is exhibited in Fig. 16 [72] for the same block length K = 510 as in Fig. 6, which depicts
the performance of turbo equalization using symbol-based MAP detection without precoding. Whereas the BER
performance of separate equalization and decoding is worse than that of the system using no precoding shown in Fig. 6,
the performance bound of systems using no precoding is already beaten after one iteration.

[Figure 16: data bit error rate versus P/σ² in dB; curves for separate equalization and decoding and after 1, 2, and 8 iterations, with the capacity limit of the ISI channel marked.]

Fig. 16. Performance of turbo equalization for the communication system described in Sec. II-A using the precoder c̃n = cn + c̃n−1 and
symbol-based MAP detection. The BER performance (solid lines) is plotted for separate equalization and decoding as well as after 1, 2, or 8
iterations. The dotted line corresponds to the BER performance of the FEC decoder when no ISI is introduced in the channel. The dashed line
is a lower bound on the BER performance of any decoder when no precoding is used in the transmitter. The considered blocklength is K = 510
(N = L = 1024). An S-random interleaver with S=16 is applied.

Another difference to systems without precoding is the strong correlation between the block length K and
the number of iterations over which performance gains are achieved. This effect has two explanations. First, the
spectrum-enhancing property of precoding is in fact only a necessary condition to achieve improvements in the distance
spectrum. The distance enhancement is due to the interleaver, which permutes the ones of the low-weight code
words b of the FEC code far apart, such that large weights are accumulated in c̃ = cM after precoding. An
increasing code length N improves this effect, since the interleaver can permute larger groups of consecutive
bits farther apart. Second, the bit permutation exerted by the interleaver also helps to fulfill the independence assumption
made on the input LLRs to the equalizer and decoder. A system without precoding takes advantage only of the
"information decorrelation" purpose of interleaving, which explains why the BER performance ceases
to improve over the iterations. A system with a carefully designed precoder takes a two-fold advantage of the increasing block length
K, since both the distance spectrum improves and the interleaver can better decorrelate the input LLRs.
The BER performance of turbo equalization using the precoder c̃n = cn + c̃n−1 and symbol-based MAP detection
as equalization algorithm is exhibited in Fig. 17 for the block length K = 25018. Compared to Fig. 16, nearly
error-free performance is achieved at 2.6 dB SNR P/σ² after 20 iterations. This is only 1.2 dB away from the
capacity limit. The code optimization algorithms in [86, 87] narrow this gap even further.

[Figure 17: data bit error rate versus P/σ² in dB; curves for separate equalization and decoding and after 1, 2, 8, and 20 iterations, with the capacity limit of the ISI channel marked.]

Fig. 17. Performance of turbo equalization for the communication system described in Sec. II-A using the precoder c̃n = cn + c̃n−1 and
symbol-based MAP detection. The BER performance (solid lines) is plotted for separate equalization and decoding as well as after 1, 2, 8, or
20 iterations. The dotted line corresponds to the BER performance of the FEC decoder when no ISI is introduced in the channel. The dashed
line is a lower bound on the BER performance of any decoder when no precoding is used in the transmitter. The considered blocklength is
K = 25018 (N = L = 50040). An S-random interleaver with S=40 is applied.

It is natural to replace the symbol-based MAP detector by an MMSE estimator in case the combined precoder-
channel trellis has too many states. Using precoding, the extrinsic LLRs λext (cn ) cannot be obtained from s(xn )
using soft demapping as in Secs. II-E and III-C, since a particular code bit tuple cn depends on all symbols
xn . We note that λext (cn ) must be computed by marginalizing cn in the joint probabilities P (c|x̂) with x̂ =
(x̂0 x̂1 ... x̂L−1 )T . This can be done efficiently with the BCJR algorithm using the precoder trellis, where the
transition matrices Γn and Γext,n are constructed with the corresponding probabilities s(xn ).
However, the BCJR algorithm is derived based on the assumption that the observations x̂n of the output c̃n
or xn = map(c̃n ), respectively, of the precoder state-space model are conditionally independent given xn , e.g.,


p(x̂n , x̂k |xn , xk ) = p(x̂n |xn )p(x̂k |xk ). This is not the case with the estimates x̂n produced by the MMSE estimator.
A remedy to this problem is to place an interleaver between the precoder output c̃ and the mapper to x. The
resulting cascade FEC code - interleaver - precoder - interleaver - mapping is a three-fold concatenation, for which
the turbo equalization algorithm in Fig. 5 must take the form of other iterative decoders for three-fold concatenated
systems, as in [88–92].

VI. EXIT CHART ANALYSIS

This section describes methods used to analyze the convergence behavior of iterative receiver algorithms, for
which turbo equalization is one example. Consider the BER performance of turbo equalization in Fig. 17. There
is nearly no improvement with iteration until the SNR reaches 3 dB, after which the receiver is nearly error-free.
One successful approach to explaining and predicting this behavior is via density evolution [93, 94], through which
a design algorithm [94] and convergence thresholds [93] for LDPC codes have been derived. Other approaches
include assumptions on the densities, such as that they are Gaussian [95–97], or single-parameter characterizations
of them [98–101]. Among the latter are the extrinsic information transfer (EXIT) charts [99, 102], which have been
applied successfully to various concatenated systems including turbo equalization [25], iterative demapping and
demodulation [80, 103], iterative decoding [104], and iterative multi-user detection [105, 106].

A. Convergence prediction using EXIT charts

At the receiver, the code bits bn , n = 0, 1, ..., N − 1, generated by the FEC encoder are often assumed to be
equally likely to take on the values 0 or 1. Thus, the bit bn can be modeled as a realization of a binary random
variable Bn with uniform PMF P (bn = 0) = P (bn = 1) = 1/2. The interleaved code bit cn is modeled as a realization
of the binary random variable Cn = Bπ(n) , where π(·) is a permutation function. Similarly, the extrinsic LLRs
λext (cn ) produced by the equalizer and the extrinsic LLRs λ′ext (bn ) produced by the decoder can be modeled
as realizations of the random variables Λn and Λ′n , respectively. They are described with the conditional PDFs
p(λn |cn ) and p(λ′n |bn ), respectively, which both vary with iteration.
The analysis of the turbo equalization algorithm based on the PDFs p(λn |cn ) and p(λ′n |bn ) can be simplified
by making some assumptions. These assumptions render the analysis approximate, but their widespread use and
reported benefits in the literature, for example in [98–101, 107–110], give credibility to their application. First,
all LLRs λext (cn ) and λ′ext (bn ) are assumed to be distributed identically, i.e., p(λn |cn ) and p(λ′n |bn ) are constant
for all n. Thus, the time index can be discarded yielding the representative random variables B, C, Λ, and Λ′ .
Second, only a single parameter of the PDFs p(λ|c) or p(λ′ |b) is observed. A survey of possible choices for this
parameter is given in [104].
The most popular measure in the literature is the mutual information I(Λ; C) between the assumed random
variables Λ and C, given by

$$I(\Lambda; C) = \frac{1}{2} \sum_{c \in \{0,1\}} \int_{-\infty}^{\infty} p(\lambda|c) \log_2 \frac{2\, p(\lambda|c)}{p(\lambda|c=0) + p(\lambda|c=1)}\, d\lambda. \tag{50}$$


The mutual information I(Λ′ ; B) is defined similarly on the PDF p(λ′ |b). First suggested in [99, 109], the mutual
information measure has been widely used due to its robustness and accuracy as reported in [104]. The original
problem now simplifies from tracking the evolution of 2N functions over the real axis, to tracking the evolution
of two real-valued parameters per iteration.
The evolution of I(Λ; C) and I(Λ′ ; B) over iteration is the so-called trajectory of the turbo equalization algorithm
and can be depicted in a chart with two axes labeled with I(Λ; C) and I(Λ′ ; B). The following derivations focus
on I(Λ; C). The integral in (50) can be evaluated numerically using a histogram of a sufficient number of LLRs
λext (cn ), which are obtained by transmitting a large number of sequences x and observing the produced LLRs
λext (cn ) after each iteration of turbo equalization [99].
Calculating I(Λ; C) becomes much simpler if the PDF p(λ|c) is both symmetric and consistent [98]. The latter
property is also called exponential symmetry in the literature [94, 111]. The conditional PDF p(λ|c) is symmetric,
if it satisfies: p(λ|c = 0) = p(−λ|c = 1). Conditions for this property to hold are discussed in [94, 98].
The PDF p(λ|c) is consistent when the extrinsic LLRs λext(cn) arise from valid probability measures [94, 98,
111], in which case it satisfies p(λ|c = 0)/p(λ|c = 1) = exp(λ). Combining the symmetry and the consistency property yields

$$\frac{p(\lambda|c=0)}{p(-\lambda|c=0)} = \exp(\lambda) \quad \text{and} \quad \frac{p(\lambda|c=1)}{p(-\lambda|c=1)} = \exp(-\lambda). \tag{51}$$

A Gaussian PDF $\mathcal{N}_R(\mu, \sigma^2)$ with mean µ and variance σ² is consistent if and only if 2|µ| = σ² holds. It turns out
that the PDFs p(λ′|b) and p(λ|c) describing the LLRs between equalizer and decoder are most often not consistent.
Apart from the technical use of the consistency condition to simplify the analyses carried out in [93, 94, 111], it
appears that turbo equalization performs quite well despite non-consistent distributions.
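
The Gaussian consistency condition can be checked with a short calculation, sketched here under the assumption that the symmetric pair of conditional PDFs is $p(\lambda|c=0) = \mathcal{N}_R(\mu, \sigma^2)$ and $p(\lambda|c=1) = \mathcal{N}_R(-\mu, \sigma^2)$ with $\mu > 0$:

$$\frac{p(\lambda|c=0)}{p(\lambda|c=1)} = \exp\!\left(\frac{-(\lambda-\mu)^2 + (\lambda+\mu)^2}{2\sigma^2}\right) = \exp\!\left(\frac{2\mu\lambda}{\sigma^2}\right).$$

This ratio equals $\exp(\lambda)$ for all $\lambda$ exactly when $2\mu = \sigma^2$. Since the mean is $-\mu$ for $c = 1$, the condition reads $2|\mu| = \sigma^2$ for either conditional PDF.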
When the PDF p(λ|c) satisfies both the symmetry and the consistency constraint, the calculation of I(Λ; C)
simplifies greatly to

$$\begin{aligned} I(\Lambda; C) &= 1 - \int_{-\infty}^{\infty} p(\lambda|c=0) \cdot \log_2(1 + \exp(-\lambda))\, d\lambda \\ &= E\{1 - \log_2(1 + \exp(-\lambda)) \,|\, c = 0\}. \end{aligned} \tag{52}$$

Likewise, I(Λ; C) can be computed as E{1 − log2(1 + exp(λ)) | c = 1}. Either of these expectations can be approximated
arbitrarily closely by averaging the expression log2(1 + exp(−λext(cn))) or log2(1 + exp(λext(cn))) over many
realizations λext(cn) of Λ. Under the assumption that all PDFs p(λn|cn) are identical to p(λ|c) and that the LLRs
λext(cn) are independent realizations of Λ, the averaging can be performed over all λext(cn) for n = 0, 1, ..., N−1
(time average) rather than over many transmissions of sequences x, i.e.,

$$I(\Lambda; C) \approx 1 - \frac{1}{N} \sum_{n=0}^{N-1} \log_2(1 + \exp(-x_n \cdot \lambda_{\mathrm{ext}}(c_n))), \quad \text{where } x_n = \begin{cases} +1, & c_n = 0, \\ -1, & c_n = 1. \end{cases} \tag{53}$$
It is now straightforward to compute the trajectory of the turbo equalization algorithm. This trajectory becomes
more stable with increasing N or with more blocks of LLRs λext(cn) averaged, respectively. Computing I(Λ; C) and
I(Λ′; B) via histograms or via (53) requires knowledge of the code bits bn or cn, respectively, i.e., the trajectory
cannot be computed online in a receiver; however, online approximations exist [112].
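
The time average (53) translates almost directly into code. The following is a minimal sketch; the function name is hypothetical, and `np.logaddexp` is used only to evaluate log(1 + exp(·)) without overflow for large LLR magnitudes.

```python
import numpy as np

def mutual_information_time_average(llrs, bits):
    """Estimate I(Lambda; C) via (53) from LLRs and the known code bits.

    llrs[n] plays the role of lambda_ext(c_n); bits[n] is c_n in {0, 1}.
    The estimate assumes symmetric, consistent LLR distributions.
    """
    llrs = np.asarray(llrs, dtype=float)
    x = 1.0 - 2.0 * np.asarray(bits)               # c = 0 -> +1, c = 1 -> -1
    # ln(1 + exp(-x * llr)), computed stably, then converted to base 2
    log_terms = np.logaddexp(0.0, -x * llrs) / np.log(2.0)
    return 1.0 - np.mean(log_terms)
```

All-zero LLRs give an estimate of 0, and LLRs with large magnitude and the correct sign give an estimate close to 1, matching the two extremal cases of the measure.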


One advantage of EXIT chart analysis is its ability to predict the behavior of turbo equalization without simulation.
This is possible by analyzing both the equalizer and decoder separately via their transfer functions, i.e., the trajectory
is predicted by precomputing the measures I(Λ; C) and I(Λ′ ; B) at the output of equalizer and decoder, respectively,
for any constellation of input LLRs. The input LLRs are artificially generated using the symmetric, consistent
Gaussian distribution

  
+1,
1 −(λ−x · γ/2) 2 c = 0,
g(λ|c) = √ · exp , where x= (54)
2πγ 2γ 
−1, c = 1.

This Gaussian model for the true PDFs p(λ|c) and p(λ′ |b) appears to be realistic [98, 99]. Each γ ∈ [0, ∞) is
assigned the mutual information
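$$\begin{aligned} I_{in}(\gamma) &= \frac{1}{2} \sum_{c \in \{0,1\}} \int_{-\infty}^{\infty} g(\lambda|c) \log_2 \frac{2\, g(\lambda|c)}{g(\lambda|0) + g(\lambda|1)}\, d\lambda \\ &= 1 - \int_{-\infty}^{\infty} g(\lambda|0) \log_2(1 + \exp(-\lambda))\, d\lambda, \end{aligned}$$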

with Iin (0) = 0 (no a-priori information) and Iin (γ → ∞) = 1 (exact knowledge of the associated bit) as extremal
values. Suitable approximations of Iin (γ) are presented in [99].
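
For illustration, Iin(γ) can also be evaluated by direct numerical integration of the second expression above. The sketch below uses a plain trapezoidal rule on a grid of ±12 standard deviations around the mean γ/2; the function name, grid width, and number of grid points are arbitrary choices made for this example and are not taken from [99].

```python
import numpy as np

def i_in(gamma, num_points=20001):
    """Numerically evaluate I_in(gamma) = 1 - E{log2(1 + exp(-lambda))},
    with lambda ~ N(gamma/2, gamma), i.e., g(lambda | c = 0) from (54)."""
    if gamma <= 0.0:
        return 0.0                                  # no a priori information
    sigma = np.sqrt(gamma)
    lam = np.linspace(gamma / 2 - 12 * sigma, gamma / 2 + 12 * sigma, num_points)
    pdf = np.exp(-(lam - gamma / 2) ** 2 / (2 * gamma)) / np.sqrt(2 * np.pi * gamma)
    integrand = pdf * np.logaddexp(0.0, -lam) / np.log(2.0)
    step = lam[1] - lam[0]
    integral = np.sum(0.5 * (integrand[:-1] + integrand[1:])) * step  # trapezoid
    return 1.0 - integral
```

The resulting function increases monotonically from 0 towards 1 with growing γ, so a set of γ values that samples Iin uniformly can be found by simple bisection.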
The transfer function of the equalizer is computed as follows: First, N code bits cn are computed from a
randomly chosen data sequence a and transmitted over the ISI channel. Running turbo equalization on the received
sequence y would yield the measures I(Λ; C) and I(Λ′; B) for each iteration. Here, instead, N input LLRs λ′ext(cn)
are generated as IID realizations of Λ distributed with g(λ|c = 0) given by $\mathcal{N}_R(\gamma/2, \gamma)$ for several variances γ.
Flipping the signs of all LLRs λ′ext(cn) corresponding to cn = 1 ensures that these LLRs are correctly distributed
with g(λ|c = 1) given by $\mathcal{N}_R(-\gamma/2, \gamma)$. The variances are usually chosen to sample the range [0, 1] of Iin(γ)
uniformly. For the extremal case Iin(γ → ∞) = 1, all LLRs λ′ext(cn) have infinite magnitude and their sign yields
the correct code bit cn. For the extremal case Iin(γ = 0) = 0, all LLRs λ′ext(cn) are 0. For each Iin(γ), the
equalizer produces N output LLRs λext(cn) from N input LLRs and the observation y. The output LLRs are used
to estimate the mutual information Iout = I(Λ; C) at the equalizer output via (53), which yields the transfer function
Iout = TE(Iin). The transfer function Iout = TD(Iin) of the decoder is computed similarly.
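
The procedure for obtaining one point of a transfer function can be sketched as follows. The names are hypothetical, and `siso` is a placeholder for any SISO component (for the equalizer, a callable that already holds the observation y and maps input LLRs to extrinsic output LLRs); no particular equalizer is implemented here.

```python
import numpy as np

def generate_apriori_llrs(bits, gamma, rng):
    """Draw artificial input LLRs: N(+gamma/2, gamma) where c = 0 and,
    after the sign flip, N(-gamma/2, gamma) where c = 1."""
    bits = np.asarray(bits)
    llrs = gamma / 2.0 + np.sqrt(gamma) * rng.standard_normal(bits.shape)
    return np.where(bits == 1, -llrs, llrs)

def time_average_mi(llrs, bits):
    """Mutual information estimate via the time average (53)."""
    x = 1.0 - 2.0 * np.asarray(bits)
    return 1.0 - np.mean(np.logaddexp(0.0, -x * np.asarray(llrs))) / np.log(2.0)

def transfer_point(siso, bits, gamma, rng):
    """Return one (I_in, I_out) pair of the transfer function of `siso`."""
    llr_in = generate_apriori_llrs(bits, gamma, rng)
    llr_out = siso(llr_in)                # hypothetical SISO component
    return time_average_mi(llr_in, bits), time_average_mi(llr_out, bits)
```

Sweeping `gamma` over a range of values and collecting the returned pairs yields a sampled version of the transfer function; a seeded generator such as `np.random.default_rng(0)` keeps the sketch reproducible.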
The analysis based on these transfer functions is accurate when the equalizer and the decoder output the same
Iout when fed with LLRs distributed with the true PDF p(λ|c) and p(λ′|b) or the Gaussian PDF g(λ|c) at the
same Iin. Note also that the input LLRs λ′ext(cn) (equalizer) and λext(bn) (decoder) are independent when they are
artificially generated using g(λ|c). This is not true of λ′ext(cn) and λext(bn) when turbo equalization runs in a
receiver, but is plausible for a large interleaver length N, at least for several iterations, in a neighborhood around
λ′ext(cn) and λext(bn). It follows that EXIT chart analysis using the transfer functions becomes less accurate
as the number of iterations increases, an effect that is exacerbated by a decreasing block size N.
Figure 18 depicts transfer functions Iout = TD (Iin ) of a symbol-based MAP decoder for some rate Rf ec = 1/2
codes. Among them is the memory-Mc = 2 code used in Section ??. Also included are transfer functions in case
the FEC code words b are punctured using the puncturing pattern p = (1110)T . The corresponding code rate is

[Figure 18: decoder output mutual information Iout = TD(Iin) versus Iin, both ranging from 0 to 1; curves for G=[7 5] (memory 2), G=[7 5] punctured to rate 2/3, G=[62 56] (memory 4), G=[62 56] punctured to rate 2/3, and G=[634 564] (memory 6).]

Fig. 18. Transfer functions Iout = TD (Iin ) of a symbol-based MAP decoder for several convolutional FEC codes with rate Rf ec = 1/2 and
Rf ec = 2/3.

[Figure 19: equalizer output mutual information Iout = TE(Iin) versus Iin; curves for symbol-based MAP detection, exact linear MMSE estimation, and approximate linear MMSE estimation.]

Fig. 19. Transfer functions Iout = TE (Iin ) of a SISO equalizer for a communication system using BPSK modulation, no precoding, a
memory-2 ISI channel with coefficients h0 = 0.407, h1 = 0.815, and h2 = 0.407. The SNR P/σ 2 is set to 4 dB.

$R_{fec} = 2/3$. To obtain the transfer functions, $N = 10^7$ code bits bn encoded from $K = R_{fec} \cdot 10^7$ randomly chosen
data bits an and $N = 10^7$ corresponding decoder input LLRs λext(bn) distributed with p(λ|b) given a preset γ were
used. The output mutual information Iout was computed via (53).
Figure 19 depicts transfer functions Iout = TE (Iin ) of a SISO equalizer based on symbol-based MAP detection
or linear MMSE equalization. The system configuration is the same as in Sec. II-A, i.e., BPSK modulation without
precoding is applied and the ISI channel has memory Mh = 2 and the coefficients h0 = 0.407, h1 = 0.815,


and h2 = 0.407. The SNR P/σ 2 is set to 4 dB and the window parameters for the linear MMSE estimator are
(W = 11, W1 = 4, W2 = 6). Both the exact implementation from Table V and the approximate version from Table VI
are considered. To obtain the transfer functions, $10^7$ randomly chosen bits cn were modulated to xn ∈ {+1, −1},
which in turn were transmitted over the ISI channel. The $10^7$ equalizer input LLRs λ′ext(cn) were generated as
outlined above. The output mutual information Iout was computed via (53).

B. Convergence analysis of turbo equalization

EXIT chart analysis attempts to predict I(Λ; C) and I(Λ′; B) by taking the output Iout from the equalizer transfer
function Iout = TE(Iin) as input of the decoder transfer function Iout = TD(Iin), whose output is used in turn as
input of the equalizer transfer function. To distinguish between equalizer and decoder, the mutual informations Iin
and Iout are augmented with the superscripts E (equalizer) and D (decoder), respectively.
The iterative process starts with an initial equalization task, where λ′ext(cn) = 0 holds for all n, i.e., $I_{in}^E = 0$. Next,
the output LLRs λext(cn) described by $I_{out}^E = I_{in}^D$ are fed into the decoder yielding LLRs λ′ext(bn) described by
$I_{out}^D = I_{in}^E$, which are fed back to the equalizer and so forth. This procedure is described with a single receiver
EXIT chart combining Figs. 18 and 19 as shown in Fig. 20, where the decoder transfer function is flipped along
the $I_{in}^D = I_{out}^D$ line. The iteration process is a trace between the transfer curves of the two receiver components, the
so-called trajectory of the turbo equalization algorithm. The receiver EXIT chart in Fig. 20 is drawn at 4 dB SNR
P/σ², where either symbol-based MAP detection or exact and approximate linear MMSE estimation are applied.
The convolutional code considered has memory 2 and the generator matrix [7 5]. The block length is K = 512
(N = L = 1024) and an S-random interleaver with S=16 is applied. The parameters I(Λ; C) and I(Λ′; B) were
computed via (53) by averaging over 10000 blocks.
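
The alternation between the two transfer functions described above can be sketched in a few lines. The helper names and the two toy curves used in the usage note are made up for illustration; they are not the measured curves of Figs. 18 to 20.

```python
import numpy as np

def interpolated(i_in_samples, i_out_samples):
    """Turn a sampled transfer function into a callable via linear interpolation."""
    return lambda i: float(np.interp(i, i_in_samples, i_out_samples))

def exit_trajectory(t_e, t_d, num_iterations):
    """Predict the trajectory by alternating equalizer and decoder transfer functions.

    Returns a list of (I_in^E, I_out^E, I_out^D) triples, one per iteration,
    starting from I_in^E = 0 (all input LLRs equal to zero).
    """
    points = []
    i_in_e = 0.0
    for _ in range(num_iterations):
        i_out_e = t_e(i_in_e)        # equalization: I_out^E = T_E(I_in^E)
        i_out_d = t_d(i_out_e)       # decoding: I_in^D = I_out^E, I_out^D = T_D(I_in^D)
        points.append((i_in_e, i_out_e, i_out_d))
        i_in_e = i_out_d             # feedback: I_in^E = I_out^D
    return points

# Toy curves (assumptions for this sketch only).
toy_t_e = lambda i: 0.5 + 0.3 * i
toy_t_d = lambda i: i ** 2
```

With these toy curves, `exit_trajectory(toy_t_e, toy_t_d, 20)` climbs monotonically and stalls where the two curves intersect, at a decoder output of about 0.375, which mimics a receiver that stops improving before reaching a decoder output of 1.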
The turbo equalization system using symbol-based MAP detection requires 3 iterations to nearly reach $I_{out}^D = 1$ at
the decoder output, which corresponds to a zero BER. Using either implementation of the linear MMSE estimator,
the same performance is achieved, but after 4 to 5 iterations (for clarity, the traces are not shown in Fig. 20).
Figure 20 also reveals that all SISO equalization algorithms achieve the same value $I_{out}^E < 1$ for perfect input
LLRs λ′ext(cn), which corresponds to $I_{in}^E = 1$. Recall from Section I that the SISO equalizer, whose
output LLR λext(cn) must not depend on λ′ext(cn), produces LLRs which correspond to an equivalent ISI-free
channel when all input LLRs λ′ext(cn) have infinite magnitude. It follows that the ISI disturbing a received symbol
yn is removed completely, since all $x_{n'}$, $n' \neq n$, are known. As derived in (20), the estimate at the equalizer output
becomes in this case

$$\hat{x}_n = (P_h \cdot x_n + w_n)/(P_h + \sigma^2),$$

where $P_h = \sum_{i=0}^{M_h} |h_i|^2$. The corresponding output LLR λext(cn) follows from (18):

$$\lambda_{\mathrm{ext}}(c_n) = 2 P_h \hat{x}_n/\sigma^2. \tag{55}$$

It is shown in [87] that a symbol-based MAP detector produces the same output LLRs λext(cn) in this case. The
maximally attainable $I_{out}^E$ in Fig. 20 corresponding to the LLRs (55) is $I_{out}^E \leq 0.8$, which is already shown in
Fig. 19 for $I_{in}^E = 1$. It follows that the MAP decoder receives LLRs λext(bn) with at most $I_{in}^D = 0.8$, which is the
matched filter bound for the turbo equalization system, i.e., the asymptotic performance of the receiver is at best the
performance of the MAP decoder alone for transmission over the ISI-free AWGN channel yn = xn + wn (ISI-free
case). This can be overcome using precoding as shown next.

[Figure 20: receiver EXIT chart with $I_{in}^E$ and $I_{out}^D$ on the horizontal axis and $I_{out}^E$ and $I_{in}^D$ on the vertical axis; equalizer curves for symbol-based MAP detection, exact linear MMSE estimation, and approximate linear MMSE estimation, the flipped curve of the symbol-based MAP decoder with G=[7 5], and a trajectory whose steps are marked as equalization or decoding.]

Fig. 20. Receiver EXIT chart of a communication system using a convolutional FEC code (rate $R_{fec} = 1/2$, generator polynomial $(1 + D + D^2 \;\; 1 + D^2)$, systematic encoding), BPSK modulation, and no precoding. The communication channel is a memory-2 ISI channel with
coefficients h0 = 0.407, h1 = 0.815, and h2 = 0.407. The SNR P/σ² is set to 4 dB.
The receiver EXIT chart also helps to choose a suitable FEC code for the communication system. Figure 21
depicts the receiver EXIT chart at 2 dB SNR P/σ² using symbol-based MAP detection and MAP decoding for two
different FEC codes with generators [7 5] and [634 564]. The iterative receiver converges to a larger value of $I_{out}^D$
using the FEC code with generator [634 564]. Moreover, a lower error floor using that FEC code can be achieved
for higher SNRs, since the transfer function TD(Iin) approaches $I_{out}^D = 1$ faster.
A similar problem, i.e., to find convolutional codes with a good trade-off between early convergence (at low SNRs)
and a small error floor after convergence, exists in turbo coding. Here, techniques such as choosing a generator with
feed-forward polynomials of degree much larger than that of the feedback polynomial [113] or systematic doping
[114] have been tested. They can be applied in principle to turbo equalization as well. A more powerful solution
to this "curve-fitting" problem is derived in [87] based on irregular codes. Table XI gives SNR thresholds in dB for
which there is a tunnel between the transfer function TE(Iin) of the equalizer and that of the MAP decoder for the
code [634 564]. If such a tunnel exists, turbo equalization can converge to a large value for $I_{out}^D$ constrained by the
upper bound for $I_{out}^E$. However, this may require many iterations and, thus, a large block length K. The thresholds
were obtained by generating equalizer transfer functions at varying SNRs until the transfer function touched or
intersected the flipped decoder transfer function at $I_{out}^D < 0.5$.
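
A threshold search of this kind can be sketched as follows, assuming the transfer functions are available as callables. The tunnel test checks that one equalization-decoding round trip strictly increases the mutual information everywhere on a grid below the chosen limit, which is equivalent to the two curves not touching there. All names, and the toy family of equalizer curves in the usage note, are hypothetical.

```python
import numpy as np

def tunnel_is_open(t_e, t_d, limit=0.5, num_points=501):
    """True if T_D(T_E(I)) > I for all grid points I in [0, limit]."""
    grid = np.linspace(0.0, limit, num_points)
    round_trip = np.array([t_d(t_e(i)) for i in grid])
    return bool(np.all(round_trip > grid))

def snr_threshold(make_t_e, t_d, snrs_db, limit=0.5):
    """Return the smallest SNR in snrs_db for which the tunnel is open, else None.

    make_t_e(snr_db) must return the equalizer transfer function at that SNR.
    """
    for snr_db in sorted(snrs_db):
        if tunnel_is_open(make_t_e(snr_db), t_d, limit):
            return snr_db
    return None

# Toy model (an assumption): the equalizer curve shifts upward with the SNR.
toy_make_t_e = lambda snr_db: (lambda i: 0.4 + 0.1 * snr_db + 0.3 * i)
toy_t_d = lambda i: i ** 2
```

For this toy model, `snr_threshold(toy_make_t_e, toy_t_d, np.arange(0.0, 4.01, 0.2))` returns the first grid value at which the curves separate below 0.5. With measured transfer functions, the SNR grid spacing limits the resolution of the threshold.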

[Figure 21: receiver EXIT chart with $I_{in}^E$ and $I_{out}^D$ on the horizontal axis and $I_{out}^E$ and $I_{in}^D$ on the vertical axis; curves for symbol-based MAP detection and for the symbol-based MAP decoder with G=[7 5] and G=[634 564].]

Fig. 21. Receiver EXIT chart of the communication system described in Fig. 20 using symbol-based MAP detection and two different FEC
codes at 2 dB SNR P/σ 2 .

TABLE XI
SNR CONVERGENCE THRESHOLDS FOR TURBO EQUALIZATION GIVEN THE COMMUNICATION SYSTEM DESCRIBED IN FIG. 20 FOR VARIOUS
SISO EQUALIZERS.

SISO equalization algorithm              SNR P/σ² threshold in dB
symbol-based MAP detection               1.4
exact linear MMSE estimation             1.6
approximate linear MMSE estimation       1.8

Figure 22 depicts the receiver EXIT chart of turbo equalization using the system configuration from Figs. 16
and 17, i.e., BPSK modulation with the precoder $\tilde{c}_n = c_n + \tilde{c}_{n-1}$ is applied in the transmitter and symbol-based
MAP detection is applied in the receiver. The most significant difference between the equalizer transfer functions
$T_E(I_{in})$ shown in Figs. 22 and 19 is that $I_{out}^E = 1$ is achieved for perfect input LLRs $\lambda'_{ext}(c_n)$ corresponding to
$I_{in}^E = 1$.
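
The precoder recursion can be illustrated with a short sketch. It assumes that the addition in $\tilde{c}_n = c_n + \tilde{c}_{n-1}$ is modulo 2 (the code bits are binary), that the recursion starts from an all-zero state, and that BPSK maps bit 0 to +1 and bit 1 to −1; the function names are hypothetical and none of these conventions is confirmed by the text above.

```python
def precode(bits, initial=0):
    """Accumulator precoder: each output bit is the XOR (modulo-2 sum)
    of the current code bit and the previous precoded bit."""
    precoded, prev = [], initial
    for c in bits:
        prev ^= c
        precoded.append(prev)
    return precoded


def undo_precode(precoded, initial=0):
    """Inverse mapping: recover c_n as the XOR of adjacent precoded bits."""
    bits, prev = [], initial
    for ct in precoded:
        bits.append(ct ^ prev)
        prev = ct
    return bits


def bpsk(bits):
    """Assumed BPSK mapping: bit 0 -> +1, bit 1 -> -1."""
    return [1 - 2 * b for b in bits]


code_bits = [1, 0, 1, 1, 0, 0, 1]
precoded_bits = precode(code_bits)
tx_symbols = bpsk(precoded_bits)
print(precoded_bits, tx_symbols)
```

Because the precoder is recursive, every precoded bit depends on all earlier code bits, which is the property that makes the combination of precoder and ISI channel behave like a recursive inner code.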

VII. CONCLUSIONS

This paper provides an overview of turbo equalization, including some of its basic concepts, applications, and
tools for analysis. Beginning with the simple observation of [2] that an intersymbol-interference channel could
be viewed as a rate-1 convolutional code over the reals, turbo equalization was simply an extension of the turbo
decoding algorithm applied to a coded data transmission over a frequency selective channel. By considering the
channel itself as an inner-code of a serially concatenated turbo code, the problems of equalization and decoding
became inextricably linked.
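
The view of an ISI channel as a rate-1 convolutional code over the reals can be made concrete with a minimal sketch: the channel is a shift register with real-valued taps that emits one output per input symbol. The tap values, the symbol sequence, and the function name `isi_channel` below are hypothetical choices made for illustration only.

```python
import random


def isi_channel(symbols, taps, noise_std=0.0, seed=0):
    """Pass real-valued symbols through an FIR (ISI) channel plus Gaussian noise.

    Output n is sum_k taps[k] * symbols[n - k] + noise, with a zero initial
    state: one real output per input symbol, i.e., a rate-1 mapping.
    """
    rng = random.Random(seed)
    received = []
    for n in range(len(symbols)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * symbols[n - k]
        received.append(acc + rng.gauss(0.0, noise_std))
    return received


# Hypothetical two-tap channel and BPSK symbols, with the noise switched off.
taps = [1.0, 0.5]
symbols = [1, -1, -1, 1]
noiseless = isi_channel(symbols, taps)
print(noiseless)
```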
[Figure 22: EXIT chart. Horizontal axis: $I_{in}^E$ and $I_{out}^D$; vertical axis: $I_{out}^E$ and $I_{in}^D$. Curves: symbol-based MAP detection; symbol-based MAP decoder, G=[7 5].]

Fig. 22. Receiver EXIT chart of the communication system described in Fig. 20 applying the precoder $\tilde{c}_n = c_n + \tilde{c}_{n-1}$ in the transmitter
and symbol-based MAP detection in the receiver. The SNR $P/\sigma^2$ is set to 2.4 dB.

Two critical components are the communication of extrinsic information between the equalizer and the decoder and
the use of interleaving, which justifies the independence assumptions made at the equalizer and the decoder and
thereby keeps their operation simple. As such, turbo equalization requires that both the equalizer and the decoder
be capable of processing and outputting soft information, i.e., they need to have SISO capability. While symbol-based
MAP detection and MAP decoding are natural SISO algorithms, Koetter’s work and that of others showed that MMSE
estimation (and a host of other serially concatenated tasks) can be used to construct a SISO algorithm as well.
An important observation is that the turbo equalization algorithm using either symbol-based MAP detection or
linear MMSE estimation often achieves the BER performance of ML decoding for the complete communication
link after convergence, i.e., turbo equalization has performance that approaches that of exact joint equalization
and decoding. However, when the ISI channel has finite memory, the BER is limited by a lower bound governed by
the minimum distance $d_{min,fec}$ of the FEC code, which keeps the performance bounded away from the capacity limit.
Precoding provides a simple means for overcoming this performance limitation and in certain cases can be designed
without an increase in receiver complexity.
The EXIT chart was also discussed as a semi-analytical tool for predicting the convergence properties of the
turbo equalization algorithm. This analysis describes the soft information communicated between the equalizer
and the decoder in the receiver through a single parameter - the mutual information between the LLRs and the
corresponding code bits in the transmitter. The trajectory of the turbo equalization algorithm, i.e., the evolution
of mutual information over iterations, can be predicted with the equalizer and the decoder transfer functions. This
approximate analysis employs an independent Gaussian approximation for the probability density of the LLRs
passed between the equalizer and the decoder and assumes an infinite block length with perfect interleaving.
Through this approximate analysis, achievable thresholds in SNR for which turbo equalization will converge
can be predicted, and a carefully designed system applying turbo equalization can approach the information-theoretic
performance limits imposed by the communication channel. Moreover, applying MMSE estimation or
other complexity reduction methods often leads to only minor degradation of this ultimate performance, which is
evidenced by a slightly increased SNR threshold for which turbo equalization converges.
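
The trajectory prediction summarized above amounts to alternately applying the two transfer functions. The sketch below assumes hypothetical closed-form curves (`open_equalizer`, `blocked_equalizer`, `toy_decoder`) in place of measured transfer functions, and it inherits the idealizations of the analysis itself (Gaussian LLR densities, infinite block length, perfect interleaving).

```python
def predict_trajectory(t_e, t_d, iterations):
    """Alternate the equalizer and decoder transfer functions, starting
    without prior information, and record (I_out^E, I_out^D) per iteration."""
    i_in_e = 0.0
    path = []
    for _ in range(iterations):
        i_out_e = t_e(i_in_e)   # equalizer output becomes the decoder input
        i_out_d = t_d(i_out_e)  # decoder output becomes the next equalizer input
        path.append((i_out_e, i_out_d))
        i_in_e = i_out_d
    return path


def open_equalizer(i_in):
    """Hypothetical equalizer curve that leaves a tunnel open."""
    return 0.6 + 0.4 * i_in


def blocked_equalizer(i_in):
    """Hypothetical equalizer curve at a lower SNR; it crosses the decoder curve."""
    return 0.4 + 0.5 * i_in


def toy_decoder(i_in):
    """Hypothetical decoder transfer function T_D(I_in)."""
    return i_in ** 2


open_path = predict_trajectory(open_equalizer, toy_decoder, 50)
blocked_path = predict_trajectory(blocked_equalizer, toy_decoder, 50)
print(open_path[-1], blocked_path[-1])
```

With the open tunnel the predicted decoder output approaches 1, whereas the blocked case stalls at the first intersection of the two curves, which mirrors the role of the SNR threshold in the discussion above.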
We see that the turbo equalization approach is tightly connected to recent work in a variety of areas, including
turbo codes, LDPC codes, and iterative detection and decoding algorithms. Just as turbo equalization grew out of
the original turbo decoding algorithm, there are numerous extensions of the basic turbo equalization approach in
which a wide variety of signal processing tasks have been incorporated into the joint estimation process. There are
many further ways to explore the turbo equalization approach, and we refer the interested reader
to the references cited and the references contained therein. We hope that this overview of the general concepts
and salient features enables further exploration into this topic that we find both challenging and fascinating.

ACKNOWLEDGMENT

This paper contains material that appeared in the doctoral dissertation of M. Tüchler [70]. This work was supported
in part by the Department of the Navy, Office of Naval Research, under grants ONR MURI N00014-07-1-0738 and
ONR N00014-07-1-0311, by the National Science Foundation under grant NSF CCF 07-29092, and by the
Gigascale System Research Center (GSRC), one of five research centers funded under the Focus Center Research
Program (FCRP), a Semiconductor Research Corporation program.

REFERENCES

[1] C. Berrou and A. Glavieux, “Near optimum error correcting coding and decoding: Turbo codes,” IEEE Transactions on Communications,
vol. 44, no. 10, pp. 1261–1271, October 1996.
[2] C. Douillard, M. Jezequel, and C. Berrou, “Iterative correction of inter-symbol interference: Turbo equalization,” European Transactions
on Telecommunications, vol. 6, no. 5, pp. 507–511, September-October 1995.
[3] J. Proakis, Digital Communications, 3rd Ed. New York, USA: McGraw-Hill, 1995.
[4] S. Haykin, Communication Systems, 3rd Ed. Canada: Wiley & Sons, 1994.
[5] S. Lin and J. Costello, Error Control Coding. Englewood Cliffs, USA: Prentice Hall, 1983.
[6] G. Forney, “Maximum-likelihood estimation of digital sequences in the presence of intersymbol interference,” IEEE Transactions on
Information Theory, vol. 18, no. 3, pp. 363–378, May 1972.
[7] L. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear codes for minimizing symbol error rate,” IEEE Transactions on
Information Theory, vol. 20, pp. 284–287, March 1974.
[8] Y. Li, B. Vucetic, and Y. Sato, “Optimum soft-output detection for channels with intersymbol interference,” IEEE Transactions on
Information Theory, vol. 41, no. 3, pp. 704–713, May 1995.
[9] W. Koch and A. Baier, “Optimum and sub-optimum detection of coded data disturbed by time-varying intersymbol interference,” in
Proc. IEEE Global Telecommunications Conference, December 1990, pp. 1679–1684.
[10] C. Berrou, A. Glavieux, and P. Thitimajshima, “Near Shannon limit error-correcting coding and decoding: Turbo codes,” in Proc. IEEE
International Conference on Communications (ICC), May 1993.
[11] C. Heegard and S. Wicker, Turbo Coding. Boston, USA: Kluwer Academic Publishing, 1999.
[12] J. Hagenauer, “The turbo principle: Tutorial introduction and state of the art,” in Proc. International Symposium on Turbo codes and
Related Topics, Brest, France, September 1997, pp. 1–11.
[13] S. Benedetto, D. Divsalar, G. Montorsi, and F. Pollara, “Serial concatenated trellis coded modulation with iterative decoding: design and
performance,” in Proc. IEEE Global Telecommunications Conference, November 1997.
[14] S. Benedetto and G. Montorsi, “Unveiling Turbo codes: some results on parallel concatenated coding schemes,” IEEE Transactions on
Information Theory, vol. 42, no. 2, pp. 409–428, March 1996.
[15] X. Wang and H. Poor, “Iterative (turbo) soft interference cancellation and decoding for coded CDMA,” IEEE Transactions on
Communications, vol. 47, no. 7, pp. 1046–1061, July 1999.
[16] A. Singer, J. Nelson, and S. Kozat, “Signal processing for underwater acoustic communications,” IEEE Communications Magazine,
vol. 47, no. 1, pp. 90–96, January 2009.
[17] J. Sifferlen, H. Song, W. Hodgkiss, W. Kuperman, and M. Stevenson, “An iterative equalization and decoding approach for underwater
acoustic communication,” IEEE J. Oceanic Engr., pp. 182–197, 2008.
[18] J. Choi, R. Drost, A. Singer, and J. Preisig, “Iterative multi-channel equalization and decoding for high frequency underwater acoustic
communications,” Proc. 2008 IEEE Sensor Array and Multichannel Signal Processing Workshop, pp. 127–130, July 2008.
[19] R. Otnes and T. H. Eggen, “Underwater acoustic communications: Long-term test of turbo equalization in shallow water,” IEEE J. Oceanic
Eng., vol. 33, no. 3, pp. 321–334, July 2008.
[20] F. Xu, M.-A. Khalighi, and S. Bourennane, “Coded PPM and multipulse PPM and iterative detection for free-space optical links,” IEEE/OSA
Journal of Optical Communications and Networking, vol. 1, no. 5, pp. 404–415, October 2009.
[21] H. Haunstein, T. Schorr, A. Zottmann, W. Sauer-Greff, and R. Urbansky, “Performance comparison of MLSE and iterative equalization in
FEC systems for PMD channels with respect to implementation complexity,” Journal of Lightwave Technology, vol. 24, no. 11, pp. 4047–4054,
November 2006.
[22] H. Haunstein, W. Sauer-Greff, A. Dittrich, K. Sticht, and R. Urbansky, “Principles for electronic equalization of polarization-mode
dispersion,” Journal of Lightwave Technology, vol. 22, no. 4, pp. 1169–1182, April 2004.
[23] G. Bauch and V. Franz, “A comparison of soft-in/soft-out algorithms for ‘Turbo detection’,” in Proc. International Conference on
Telecommunications (ICT), June 1998, pp. 259–263.
[24] A. Anastasopoulos and K. Chugg, “Iterative equalization and decoding for TCM for frequency-selective fading channels,” in Proc. Allerton
Conference on Communication, Control, and Computing, Monticello, USA, vol. 1, November 1997, pp. 177–181.
[25] M. Tüchler, R. Koetter, and A. Singer, “Turbo equalization: principles and new results,” IEEE Transactions on Communications, vol. 50,
no. 5, pp. 754–767, May 2002.
[26] M. Tüchler, A. Singer, and R. Koetter, “Minimum mean squared error equalization using priors,” IEEE Transactions on Signal Processing,
vol. 50, no. 3, pp. 673–683, March 2002.
[27] J. Hagenauer and P. Hoeher, “A Viterbi algorithm with soft-decision outputs and its applications,” in Proc. IEEE Global Telecommuni-
cations Conference, 1989, pp. 1680–1686.
[28] S. Lee, N. Shanbhag, and A. Singer, “A 285-MHz MAP decoder in 0.18µm CMOS,” IEEE Journal of Solid-State Circuits, vol. 40, no. 8, pp.
1718–1725, August 2005.
[29] S.-J. Lee, N. Shanbhag, and A. Singer, “Area-efficient high-throughput MAP decoder architectures,” IEEE Transactions on Very Large
Scale Integration (VLSI) Systems, vol. 13, no. 8, pp. 921–933, August 2005.
[30] R. Ratnayake, A. Kavcic, and G.-Y. Wei, “A high-throughput maximum a posteriori probability detector,” IEEE Journal of Solid-State
Circuits, vol. 43, no. 8, pp. 1846–1858, August 2008.
[31] A. Glavieux, C. Laot, and J. Labat, “Turbo equalization over a frequency selective channel,” in Proc. International Symposium on Turbo
codes and Related Topics, Brest, France, September 1997, pp. 96–102.
[32] D. Raphaeli and A. Saguy, “Linear equalizers for Turbo equalization: A new optimization criterion for determining the equalizer taps,”
in Proc. International Symposium on Turbo codes and Related Topics, Brest, France, September 2000, pp. 371–374.
[33] Z. Wu and J. Cioffi, “Turbo decision aided equalization for magnetic recording channels,” in Proc. IEEE Global Telecommunications
Conference, December 1999, pp. 733–738.
[34] R. Lopes and J. Barry, “The soft-feedback equalizer for turbo equalization of highly dispersive channels,” IEEE Transactions on
Communications, vol. 54, no. 5, pp. 783 – 788, May 2006.
[35] L. Liu and L. Ping, “An extending window MMSE turbo equalization algorithm,” IEEE Signal Processing Letters, vol. 11, no. 11, pp. 891–894,
November 2004.
[36] A. Berthet, R. Visoz, and P. Tortelier, “Sub-optimal Turbo-detection for coded 8-PSK signals over ISI channels with application to EDGE
advanced mobile system,” in Proc. IEEE Vehicular Technology Conference (VTC), September 2000.
[37] D. Fertonani, A. Barbieri, and G. Colavolpe, “Reduced-complexity BCJR algorithm for turbo equalization,” IEEE Transactions on
Communications, vol. 55, no. 12, pp. 2279–2287, December 2007.
[38] R. Peng, R.-R. Chen, and B. Farhang-Boroujeny, “Markov chain Monte Carlo detectors for channels with intersymbol interference,” IEEE
Transactions on Signal Processing, vol. 58, no. 4, pp. 2206–2217, April 2010.
[39] Z. Qin, K. Cai, and X. Zou, “A reduced-complexity iterative receiver based on simulated annealing for coded partial-response channels,”
IEEE Transactions on Magnetics, vol. 43, no. 6, pp. 2265–2267, June 2007.
[40] Z. Qin and K. C. Teh, “Reduced-complexity turbo equalization for coded intersymbol interference channels based on local search
algorithms,” IEEE Transactions on Vehicular Technology, vol. 57, no. 1, pp. 630 – 635, January 2008.
[41] P. Supnithi, R. Lopes, and S. McLaughlin, “Reduced-complexity turbo equalization for high-density magnetic recording systems,” IEEE
Transactions on Magnetics, vol. 39, no. 5, pp. 2585 – 2587, September 2003.
[42] N. Singla, J. O’Sullivan, R. Indeck, and Y. Wu, “Iterative decoding and equalization for 2-D recording channels,” IEEE Transactions on
Magnetics, vol. 38, pp. 2328–2330, September 2002.
[43] Y. Wu, J. O’Sullivan, N. Singla, and R. Indeck, “Iterative detection and decoding for separable two-dimensional intersymbol interference,”
IEEE Transactions on Magnetics, vol. 39, no. 4, pp. 2115 – 2120, July 2003.
[44] N. Singla, J. O’Sullivan, C. Miller, and R. Indeck, “Decoding for magnetic recording media with overlapping tracks,” IEEE Transactions
on Magnetics, vol. 41, no. 10, pp. 2968 – 2970, 2005.
[45] H. Li and H. Poor, “Reduced complexity joint iterative equalization and multiuser detection in dispersive DS-CDMA channels,” IEEE
Transactions on Wireless Communications, vol. 4, no. 3, pp. 1234–1243, May 2005.
[46] T. Mayer and C. Kuhn, “Distributed turbo equalization for intercell and intersymbol interference cancellation,” in Proc. IEEE Global
Telecommunications Conference, 2006, pp. 1–5.
[47] D. Lin and T. J. Lim, “A variational inference framework for soft-in soft-out detection in multiple-access channels,” IEEE Transactions
on Information Theory, vol. 55, no. 5, pp. 2345 – 2364, May 2009.
[48] P. Xiao, J. Wu, M. Sellathurai, T. Ratnarajah, and E. Strom, “Iterative multiuser detection and decoding for DS-CDMA system with space-time
linear dispersion,” IEEE Transactions on Vehicular Technology, vol. 58, no. 5, pp. 2343–2353, June 2009.
[49] T. Abe, S. Tomisato, and T. Matsumoto, “A MIMO turbo equalizer for frequency-selective channels with unknown interference,” IEEE
Transactions on Vehicular Technology, vol. 52, no. 3, pp. 476–482, May 2003.
[50] M. Koca and B. Levy, “Broadband beamforming for joint interference cancellation and turbo equalization,” IEEE Transactions on Wireless
Communications, vol. 4, no. 5, pp. 2244 – 2255, September 2005.
[51] T. Ait-Idir, S. Saoudi, and N. Naja, “Space-time turbo equalization with successive interference cancellation for frequency-selective MIMO
channels,” IEEE Transactions on Vehicular Technology, vol. 57, no. 5, pp. 2766–2778, September 2008.
[52] S.-S. Sadough, M.-A. Khalighi, and P. Duhamel, “Improved iterative MIMO signal detection accounting for channel-estimation errors,”
IEEE Transactions on Vehicular Technology, vol. 58, no. 7, pp. 3154–3167, September 2009.
[53] M. Sellathurai and S. Haykin, “T-BLAST for wireless communications: first experimental results,” IEEE Transactions on Vehicular
Technology, vol. 52, no. 3, pp. 530–535, May 2003.
[54] M. Koca and B. Levy, “Turbo space-time equalization of TCM for broadband wireless channels,” IEEE Transactions on Wireless
Communications, vol. 3, no. 1, pp. 50–59, January 2004.
[55] K. Kansanen, C. Schneider, T. Matsumoto, and R. Thoma, “Multilevel-coded QAM with MIMO turbo-equalization in broadband
single-carrier signaling,” IEEE Transactions on Vehicular Technology, vol. 54, no. 3, pp. 954–966, May 2005.
[56] S. Ahmed, T. Ratnarajah, M. Sellathurai, and C. Cowan, “Reduced-complexity iterative equalization for severe time-dispersive MIMO
channels,” IEEE Transactions on Vehicular Technology, vol. 57, no. 1, pp. 594–600, January 2008.
[57] R. Visoz, A. O. Berthet, and S. Chtourou, “Frequency-domain block turbo-equalization for single-carrier transmission over MIMO broadband
wireless channel,” IEEE Transactions on Communications, vol. 54, pp. 2144–2149, December 2006.
[58] F. Pancaldi and G. Vitetta, “Block channel equalization in the frequency domain,” IEEE Transactions on Communications, vol. 53, no. 3,
pp. 463–471, March 2005.
[59] H. Liu and P. Schniter, “Iterative frequency-domain channel estimation and equalization for single-carrier transmissions without cyclic-
prefix,” IEEE Transactions on Wireless Communications, vol. 7, no. 10, pp. 3686–3691, October 2008.
[60] C. Laot, R. Le Bidan, and D. Leroux, “Low-complexity MMSE turbo equalization: a possible solution for EDGE,” IEEE Transactions
on Wireless Communications, vol. 4, no. 3, pp. 965–974, May 2005.
[61] D. N. Liu and M. P. Fitz, “Low complexity affine MMSE detector for iterative detection-decoding MIMO-OFDM systems,” IEEE Transactions
on Communications, vol. 56, no. 1, pp. 150–158, 2008.
[62] S. Ahmed, T. Ratnarajah, M. Sellathurai, and C. Cowan, “Iterative receivers for MIMO-OFDM and their convergence behavior,” IEEE
Transactions on Vehicular Technology, vol. 58, no. 1, pp. 461–468, January 2009.
[63] Y. Sun, V. Tripathi, and M. L. Honig, “Adaptive turbo reduced-rank equalization for MIMO channels,” IEEE Transactions on Wireless
Communications, vol. 4, pp. 2789–2800, 2005.
[64] R. Otnes and M. Tüchler, “Iterative channel estimation for turbo equalization of time-varying frequency-selective channels,” IEEE
Transactions on Wireless Communications, vol. 3, no. 6, pp. 1918–1923, November 2004.
[65] S. Song, A. Singer, and K.-M. Sung, “Soft input channel estimation for turbo equalization,” IEEE Transactions on Signal Processing,
vol. 52, no. 10, pp. 2885 – 2894, October 2004.
[66] X. Li and T. F. Wong, “Turbo equalization with nonlinear Kalman filtering for time-varying frequency-selective fading channels,” IEEE
Transactions on Wireless Communications, vol. 6, no. 2, pp. 691–700, February 2007.
[67] K. Fang, L. Rugini, and G. Leus, “Low-complexity block turbo equalization for OFDM systems in time-varying channels,” IEEE Transactions
on Signal Processing, vol. 56, no. 11, pp. 5555–5566, 2008.
[68] Q. Yu and S. Lambotharan, “Iterative (turbo) estimation and detection techniques for frequency-selective channels with multiple frequency
offsets,” IEEE Signal Processing Letters, vol. 14, no. 4, pp. 236 – 239, April 2007.
[69] S. Das and P. Schniter, “Max-SINR ISI/ICI-shaping multicarrier communication over the doubly dispersive channel,” IEEE Transactions on
Signal Processing, vol. 55, no. 12, pp. 5782–5795, December 2007.
[70] M. Tüchler, “Turbo equalization,” Ph.D. dissertation, Munich University of Technology, Munich, Germany, December 2003.
[71] J. Hagenauer, E. Offer, and L. Papke, “Iterative decoding of binary block and convolutional codes,” IEEE Transactions on Information
Theory, vol. 42, no. 3, pp. 429–445, March 1996.
[72] R. Koetter, A. Singer, and M. Tüchler, “Turbo equalization,” IEEE Signal Processing Magazine, vol. 21, no. 1, January 2004.
[73] P. Robertson, P. Hoeher, and E. Villebrun, “Optimal and sub-optimal maximum a posteriori algorithms suitable for turbo decoding,”
European Transactions on Telecommunications, vol. 8, 1997.
[74] J. Proakis and M. Salehi, Communication Systems Engineering. Upper Saddle River, USA: Prentice Hall, 1994.
[75] H. Poor, An Introduction to Signal Detection and Estimation, 2nd Ed. New York, USA: Springer Verlag, 1994.
[76] M. Tüchler, R. Koetter, and A. Singer, “Graphical models for coded data transmission over inter-symbol interference channels,” in
Proc. IEEE/ITG Conference on Source and Channel Coding, January 2004.
[77] ——, “Graphical models for coded data transmission over inter-symbol interference channels,” European Transactions on Telecommuni-
cations, vol. 15, no. 4, July-August 2004.
[78] M. Moher and T. Gulliver, “Cross-entropy and iterative decoding,” IEEE Transactions on Information Theory, vol. 44, no. 7, pp. 3097–3104,
November 1998.
[79] G. Caire, G. Taricco, and E. Biglieri, “Bit-interleaved coded modulation,” IEEE Transactions on Information Theory, vol. 44, no. 3, pp.
927–946, May 1998.
[80] S. ten Brink, J. Speidel, and R. Yan, “Iterative demapping and decoding for multilevel modulation,” Proc. IEEE Global Communications
Conference (Globecom), pp. 579–584, November 1998.
[81] S. Jiang, L. Ping, H. Sun, and C. S. Leung, “Modified LMMSE turbo equalization,” IEEE Communications Letters, vol. 8, no. 3, pp. 174–176,
March 2004.
[82] R. Otnes, “On soft-output demapping in filter-based soft-in/soft-out equalizers,” in Proc. International Symposium on Turbo codes and
Related Topics, Brest, France, September 2003.
[83] R. V. Nee and R. Prasad, OFDM for Wireless Multimedia Communications. Artech House, 2000.
[84] R. Ramirez, The FFT: Fundamentals and Concepts. Upper Saddle River, USA: Prentice Hall, 1984.
[85] M. Tüchler and J. Hagenauer, “Turbo equalization using frequency domain equalizers,” in Proc. Allerton Conference on Communication,
Control, and Computing, Monticello, USA, Oct 2000, pp. 1234–1243.
[86] M. Tüchler, “Design of serially concatenated systems depending on the block length,” in Proc. IEEE International Conference on
Communications (ICC), May 2003.
[87] ——, “Design of serially concatenated systems depending on the block length,” IEEE Transactions on Communications, vol. 52, no. 2,
pp. 209–218, February 2004.
[88] ——, “Convergence prediction of iterative decoding of three-fold concatenated systems,” in Proc. IEEE Global Communications
Conference (Globecom), November 2002.
[89] R. Ramamurthy and W. Ryan, “Convolutional double accumulate codes (or double turbo DPSK),” IEEE Communications Letters, vol. 5,
no. 4, pp. 157–159, April 2001.
[90] D. Raphaeli and Y. Zarai, “Combined turbo equalization and turbo decoding,” IEEE Communications Letters, vol. 2, no. 4, pp. 107–109,
April 1998.
[91] S. ten Brink, “Code doping for triggering iterative decoding convergence,” in Proc. International Symposium on Information Theory
(ISIT), October 2001, p. 235.
[92] M. Toegel, W. Pusch, and H. Weinrichter, “Combined serially concatenated codes and turbo-equalization,” in Proc. International
Symposium on Turbo codes and Related Topics, Brest, France, September 2000, pp. 375–378.
[93] T. Richardson and R. Urbanke, “The capacity of low density parity-check codes under message passing decoding,” IEEE Transactions
on Information Theory, vol. 47, no. 2, pp. 599–618, February 2001.
[94] ——, “Design of capacity-approaching low density parity-check codes,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp.
619–637, February 2001.
[95] S. Chung, T. Richardson, and R. Urbanke, “Analysis of sum-product decoding of low-density-parity-check codes using a Gaussian
approximation,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 657–670, February 2001.
[96] N. Sellami, A. Roumy, and I. Fijalkow, “A proof of convergence of the MAP turbo-detector to the AWGN case,” IEEE Transactions on
Signal Processing, vol. 56, no. 4, pp. 1548–1561, April 2008.
[97] K. Bhattad and K. Narayanan, “An MSE-based transfer chart for analyzing iterative decoding schemes using a Gaussian approximation,”
IEEE Transactions on Information Theory, vol. 53, no. 1, pp. 22–38, January 2007.
[98] H. E. Gamal and A. Hammons, “Analyzing the Turbo decoder using the Gaussian approximation,” IEEE Transactions on Information
Theory, vol. 47, no. 2, pp. 671–686, February 2001.
[99] S. ten Brink, “Convergence behaviour of iteratively decoded parallel concatenated codes,” IEEE Transactions on Communications, vol. 49,
no. 10, pp. 1727–1737, Oct 2001.
[100] D. Divsalar, S. Dolinar, and F. Pollara, “Serial Turbo trellis coded modulation with rate-1 inner code,” in Proc. International Symposium
on Information Theory (ISIT), June 2000, p. 194.
[101] P. Alexander, A. Grant, and M. Reed, “Iterative detection and code-division multiple-access with error control coding,” European
Transactions on Telecommunications, vol. 9, no. 5, pp. 419–425, September-October 1998.
[102] S. ten Brink, “Code characteristic matching for iterative decoding of serially concatenated codes,” Annals of Telecommunications, vol. 56,
no. 7-8, pp. 394–408, April 2001.
[103] B. Scanavino, G. Montorsi, and S. Benedetto, “Convergence properties of iterative decoders working at bit and symbol level,” in
Proc. IEEE Global Communications Conference (Globecom), November 2001, pp. 1037–1041.
[104] M. Tüchler, S. ten Brink, and J. Hagenauer, “Measures for tracing convergence of iterative decoding algorithms,” in Proc. IEEE/ITG
Conference on Source and Channel Coding, January 2002, pp. 53–60.
[105] K. Li and X. Wang, “EXIT chart analysis of turbo multiuser detection,” IEEE Transactions on Wireless Communications, vol. 4, no. 1,
pp. 300–311, January 2005.
[106] A. Dejonghe and L. Vandendorpe, “Bit-interleaved turbo equalization over static frequency-selective channels: constellation mapping
impact,” IEEE Transactions on Communications, vol. 52, no. 12, pp. 2061 – 2065, 2004.
[107] P. Robertson, “Illuminating the structure of code and decoder of parallel concatenated recursive systematic (turbo) codes,” in Proc. IEEE
Global Communications Conference (Globecom), San Francisco, USA, December 1994, pp. 1298–1303.
[108] M. Reed and C. Schlegel, “An iterative receiver for the partial response channel,” in Proc. International Symposium on Information
Theory (ISIT), August 1998, p. 63.
[109] S. ten Brink, “Convergence of iterative decoding,” Electronics Letters, vol. 35, no. 10, pp. 806–808, May 1999.
[110] K. Narayanan, “Effect of precoding on the convergence of turbo equalization for partial response channels,” IEEE Journal on Selected
Areas in Communications, vol. 19, no. 4, pp. 686–698, April 2001.
[111] P. Hoeher, U. Sorger, and I. Land, “Log-likelihood values and Monte Carlo simulation - some fundamental results,” in Proc. International
Symposium on Turbo codes and Related Topics, Brest, France, September 2000, pp. 43–46.
[112] I. Land, P. Hoeher, and S. Gligorevic, “Computation of symbol-wise mutual information in transmission systems with LogAPP decoders
and applications to EXIT charts,” in Proc. IEEE/ITG Conference on Source and Channel Coding, January 2004.
[113] P. Massey, O. Takeshita, and D. Costello, “Contradicting a myth: good Turbo codes with large memory order,” in Proc. International
Symposium on Information Theory (ISIT), 2000, p. 122.
[114] S. ten Brink, “Iterative decoding trajectories of parallel concatenated codes,” in Proc. IEEE/ITG Conference on Source and Channel
Coding, January 2000, pp. 75–80.