(Kuo, 2008) Precoding Techniques For Digital Communication Systems (Springer) (324s)

This document is a comprehensive review of precoding techniques for digital communication systems, focusing on improving data rates, link quality, and user capacity. It covers various methods including Tomlinson-Harashima precoding, Trellis precoding, and techniques for MIMO and multiuser OFDM systems, highlighting their applications and advantages. The book is intended for graduate students, engineers, and researchers familiar with digital communication concepts, aiming to enhance understanding and encourage further research in this field.
Precoding Techniques for Digital Communication Systems
C.-C. Jay Kuo · Shang-Ho Tsai · Layla Tadjpour ·
Yu-Hao Chang

C.-C. Jay Kuo
Department of Electrical Engineering
EEB 440
Hughes Aircraft Electrical Engineering Building
3740 McClintock Ave.
Los Angeles, CA 90089

Layla Tadjpour
908 N. Verdugo Rd.
Glendale, CA 91206

Shang-Ho Tsai
Department of Electrical Engineering
R734 E5 Building
National Chiao Tung University
Taiwan, R.O.C.

Yu-Hao Chang
1 Dusing Rd.
Hsinchu Science Park
Hsin-Chu
Taiwan, 30078, R.O.C.

ISBN: 978-0-387-71768-5 e-ISBN: 978-0-387-71769-2


DOI: 10.1007/978-0-387-71769-2

Library of Congress Control Number: 2008926091


© 2008 Springer Science+Business Media, LLC
All rights reserved. This work may not be translated or copied in whole or in part without the written
permission of the publisher (Springer Science+Business Media, LLC., 233 Spring Street, New York,
NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in
connection with any form of information storage and retrieval, electronic adaptation, computer software,
or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are
not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to
proprietary rights.

Printed on acid-free paper

springer.com
Preface

During the past two decades, many communication techniques have been
developed to achieve goals such as higher data rates, more robust link
quality, and greater user capacity under more demanding channel conditions.
The best known include CDMA, OFDM, MIMO, multiuser OFDM, and UWB systems.
Each of these systems has its own strengths, but each also introduces
drawbacks that limit system performance. The conventional way to overcome
these drawbacks is to place most of the computational effort at the receiver
and keep the transmitter design much simpler than the receiver. However, by
shifting a reasonable amount of computation to the transmitter, the receiver
design can be greatly simplified. For instance, multiaccess interference
(MAI) has long been considered a limit on the performance of multiuser
systems. Popular solutions for mitigating MAI include multiuser detection
(MUD) and sophisticated signal processing for interference cancellation such
as PIC or SIC. However, those solutions impose a heavy burden on the
receiver. In this case, precoding offers a good way to achieve simple
transceiver designs, as discussed later in this book.
This book is intended to provide a comprehensive review of precoding
techniques for digital communications systems from a signal processing per-
spective. The variety of selected precoding techniques and their applications
makes this book quite different from other texts about precoding techniques
in digital communication engineering.
In the first part of the book, we review the principles of precoding for
channels with intersymbol interference (ISI), such as Tomlinson–Harashima
precoding and trellis precoding. We also show how the widely used OFDM
systems can be treated as a special case of precoding and introduce
precoding schemes for OFDM systems. Furthermore, it is well known that
the performance of code division multiple access (CDMA) systems is limited
by MAI. As the number of users increases, a lightweight receiver may not
be able to combat MAI efficiently. Thus, we introduce various existing
precoding techniques that reduce the interference level while keeping
the receiver design simple. Finally, we devote a whole chapter to the issue
of precoding for multiple input multiple output (MIMO) channels. In MIMO
systems, the use of TH precoding at the transmitter will increase the capacity
of a BLAST MIMO system. Precoding techniques can also use the channel
state information to optimally assign resources such as power and bits over
multiple antennas, or to facilitate the design of space-time codes with
maximum diversity/coding gain. We will review the joint design of linear
precoders and decoders and linear precoding techniques for space-time coded
systems, among other techniques. MIMO precoding with partial channel
information is also included.
The second part of the book presents recent state-of-the-art precoding
techniques that originated from several projects and research activities
conducted by the authors in the field of multiuser OFDM transmission and
ultra-wideband (UWB) radio systems. For multiuser OFDM systems, we show that,
by properly designing the transceiver and the orthogonal codes, the MAI
induced by various sources such as multipath, time and frequency offsets, and
the Doppler effect can be completely eliminated or reduced to a negligible
amount for certain active users. Since some active users enjoy an MAI-free or
nearly MAI-free property, the computational complexity of MUD or
sophisticated signal processing for interference cancellation can be
significantly reduced. In the UWB channel, the signal power is spread over a
great number of multipath components, which makes collecting the received
signal power a challenging problem at the receiver. We show that a channel
phase precoding technique that concentrates the signal power at the desired
receiver output can greatly reduce the receiver complexity compared with the
conventional Rake receiver.
The book is written for graduate students, practicing engineers in the
telecommunications industry, and researchers in academia who are already
familiar with technical concepts such as probability, digital communication
systems, and estimation theory. We hope that the book will contribute to a
better understanding of the value of precoding techniques for digital
communication systems and will motivate further investigation in this
exciting research area.
The authors would like to thank the anonymous reviewers for their con-
structive suggestions. C.-C. J. Kuo would like to thank his parents, his wife
Terri and daughter Allison for their encouragement and support for years.
Y.-H. Chang would like to thank his parents and wife Sophia for their sup-
port and encouragement during the preparation of this book. L. Tadjpour
is grateful to her parents and sisters for their support and encouragement
throughout this project. S.-H. Tsai would like to thank his parents and his
wife Janet for their endless understanding and support during the time he
devoted to writing this book, and his son Lawrence for his lovely smile.
National Chiao Tung University, Shang-Ho Tsai
University of Southern California, Layla Tadjpour
University of Southern California, Yu-Hao Chang
University of Southern California, C.-C. Jay Kuo
February 2008
Contents

List of Symbols

Part I Precoded Systems Overview

1 Introduction
1.1 Precoding for ISI Gaussian Channels
1.2 Precoding for CDMA Systems
1.3 Precoding for MIMO Channels
1.4 Precoding for Multiuser OFDM Systems
1.5 Precoding for UWB Systems

2 Precoding Techniques in ISI Channel
2.1 Equalizers for ISI Cancellation
2.2 Tomlinson–Harashima Precoding (THP)
2.2.1 Performance of TH Precoding
2.2.2 Combined Precoding and Coding
2.3 Trellis Precoding
2.3.1 Trellis Shaping
2.3.2 Principles of Trellis Precoding
2.3.3 Performance of Trellis Precoding
2.4 Multirate Representations for OFDM Systems
2.4.1 Multirate Fundamentals
2.4.2 Multirate Representation for OFDM Systems
2.4.3 OFDM Systems with Cyclic Prefix
2.4.4 OFDM Systems with Zero Padding
2.4.5 OFDM Systems with Transmitter Knows Channel Information
2.5 Precoding for OFDM Systems
2.5.1 Single Carrier System with Cyclic Prefix (SC-CP)
2.5.2 Single Carrier System with Zero Padding (SC-ZP)

3 Precoding Techniques in Multiple Access Channels
3.1 System Model
3.2 Transmit Matched Filter
3.3 Transmit Zero-Forcing Filter
3.4 Transmit Wiener Filter
3.5 Appendix
3.5.1 Derivation of Tx-MF
3.5.2 Derivation of Tx-ZF with Minimum Output Power Constraint
3.5.3 Derivation of Tx-Wiener

4 Precoding Techniques for MIMO Channels
4.1 Review of MIMO Systems
4.2 Tomlinson-Harashima precoding (THP) for MIMO systems
4.3 Joint Design of Linear Precoder and Decoder
4.3.1 Generalized Weighted MMSE Design
4.3.2 Maximum Information Rate Design
4.3.3 QoS-Based Design
4.3.4 (Unweighted) MMSE Design
4.3.5 Equal Error Design
4.3.6 Maximum SNR-Based Design
4.3.7 Unified Framework with Convex Optimization
4.4 Precoder in MIMO Space-Time Code Systems
4.4.1 Linear Precoder for Space-Time Coded System with Fading Correlation
4.4.2 Linear Constellation Precoding (LCP) for Space-Time Codes
4.5 Precoding Techniques for the Limited Feedback Channel Capacity
4.5.1 Precoding with Channel Statistics Knowledge
4.5.2 Unitary Precoding
4.5.3 System model and the Optimal Precoder for Unitary Precoded OSTBC Systems
4.5.4 Codebook Construction for the Unitary Precoding

Part II Future Communication Systems with Precoding

5 Precoded Multiuser (PMU)-OFDM System
5.1 Introduction
5.2 System Model and Its Properties
5.2.1 Approximately MAI-Free Property
5.2.2 Approximately MAI-Free Property: Quantitative Analysis for Hadamard-Walsh Code
5.3 PMU-OFDM System in Time Offset Environment
5.3.1 Time Asynchronism Analysis
5.3.2 Code Design for MAI Mitigation
5.4 PMU-OFDM System in Frequency Offset Environment
5.4.1 Analysis of CFO Effects
5.4.2 Analysis of Other User’s CFO Effect
5.4.3 Analysis of Self-CFO Effect
5.4.4 Overall CFO Estimation and Compensation
5.4.5 Code Priority in CFO Environment
5.5 PMU-OFDM System in Time-Varying Channel Environment
5.5.1 Time-Varying Rayleigh Fading Channel Model
5.5.2 Analysis of PMU-OFDM Under the Doppler Effect
5.5.3 Doppler MAI
5.5.4 Analysis of Doppler ICI and Symbol Distortion
5.5.5 Codeword Priority Schemes for ICI Cancellation
5.5.6 Channel Estimation in Fast Time-Varying Channel

6 MAI-Free MC-CDMA System
6.1 Introduction
6.2 System Model and Its Properties
6.2.1 MAI Analysis over Frequency-Selective Fading
6.2.2 Channel Estimation Under MAI-Free Condition
6.2.3 Proposed Code Design in the Presence of CFO
6.2.4 Practical Considerations on Applicability of the Proposed Scheme
6.3 MAI-Free MC-CDMA with CFO Using Hadamard-Walsh Codes

7 Simplified Multiuser Detection for MC-CDMA with Carrier Interferometry Codes with CFO
7.1 Orthogonal Carrier Interferometry Codes for MAI-free MC-CDMA with CFO
7.2 Complexity Reduction in PIC MUD Detection
7.2.1 Derivation of BEP Assuming Gaussian Model for Residual Interference
7.2.2 Derivation of BEP Using Non-Gaussian Model for Residual Interference
7.3 Complexity Reduction in ML MUD Detection
7.3.1 ML-MUD in Multipath Fading Channel
7.3.2 ML-MUD in Multipath Fading Channel with CFO
7.3.3 Viterbi Algorithm for Tail Biting Trellis (TBT)
7.3.4 Upper Bound on Minimum Error Probability
7.4 Complexity Reduction in Decorrelating MUD Detection
7.4.1 Error Probability for Decorrelating MUD
7.5 Channel and CFO Estimation

8 Ultra-Wideband (UWB) Precoding System Design Using Channel Phase
8.1 Introduction
8.2 System Model and Features
8.2.1 System Model
8.2.2 Features of CPP-UWB System
8.3 Performance Analysis of CPP-UWB Systems
8.3.1 Channel Power Concentration of Phase Precoding
8.3.2 Comparison Between TRP and CPP Schemes
8.4 Phase Estimation and Performance Analysis with Estimated Phase Information
8.4.1 Channel Phase Estimation Algorithm
8.4.2 Performance Analysis with Estimated Phase
8.5 Codeword Length Optimization (CLO) in an ISI Channel
8.5.1 Problem Statement
8.5.2 Fast Search Algorithm for Optimal Code Length
8.6 Consideration of FCC Power Spectral Mask

9 Conclusion and Future Trend
9.1 Conclusion
9.1.1 Chapter 2
9.1.2 Chapter 3
9.1.3 Chapter 4
9.1.4 Chapter 5
9.1.5 Chapter 6
9.1.6 Chapter 7
9.1.7 Chapter 8
9.2 Future Research Trend
9.2.1 Precoding with Partial Channel Information
9.2.2 Combined Precoding and MUD for Multiuser Communications
9.2.3 Other Code Scheme to Achieve More MAI-Free User Number

References

Index
List of Symbols

≈                       Approximately equal to
∗                       Time-domain convolution operator
⊗                       Kronecker product
A^†                     Hermitian (conjugate transpose) of A
A^t                     Transpose of A
MAI_{i←j}[k]            The MAI from user j to user i at subcarrier k
[x]_{↓M}                Signal x down-sampled by M
[x]_{↑M}                Signal x up-sampled by M
E{x}                    Expectation of x
CN(0, 1)                Complex circular symmetric Gaussian distribution
σ_x^2                   Variance of x
J_n                     Bessel function of the first kind of order n
Re{x}                   Real part of x
F                       Fourier matrix
s_k                     Spreading code of user k
w(t)                    Unit power chip waveform, defined by $\int_0^{T_c} |w(t)|^2 \, dt = 1$
δ_D(t)                  Dirac delta function
I_N                     Identity matrix of size N × N
e_N^{(i)}               The ith column of I_N
X^‡                     Pseudo inverse of matrix X
||x||                   Vector 2-norm
||F||_F                 Frobenius norm of matrix F
⌈x⌉                     Ceiling function of x
F{·}                    Fourier transform operator
diag(x_1 x_2 ··· x_N)   N × N diagonal matrix with diagonal elements x_1, x_2, ···, x_N
T_c                     Chip interval
x(n)                    x in time domain
X(k)                    x in frequency domain
L                       Multipath length
h(n)                    Channel impulse response
n and t                 Time indices
T                       Symbol period or number of users
k                       Discrete frequency domain index
x̂                       Detected symbol, or estimated value of x
R                       Correlation matrix
F_c                     Carrier frequency
F_s                     Sampling frequency
V                       Velocity
τ, D, or d              Delay
f_D                     Doppler frequency
M_t                     Number of transmit antennas
M_r                     Number of receive antennas
H                       MIMO channel matrix
E_b                     Bit energy
ε                       Normalized carrier frequency offset (CFO)
τ                       Time offset
ν                       Length of cyclic prefix
Part I Precoded Systems Overview

1 Introduction

Wireless communications have enjoyed exponential growth in the last few
decades thanks to the development of reliable solid-state radio frequency
hardware in the 1970s. Various paging, cordless, cellular, and personal
communication standards have been developed for wireless systems throughout
the world. The next generation of wireless systems will provide an end-to-end
communication system where voice, data, and streamed multimedia can be served
to users on an “anytime, anywhere” basis at hundreds of Mbits/s. For instance,
the current IEEE 802.11n standard (MIMO-OFDM) supports data rates of up to
600 Mbps in the physical (PHY) layer in a quasi-static environment. The IEEE
802.16m standard (also MIMO-OFDM), which is still under development, aims to
provide a gross data rate greater than 100 Mbps for mobile applications.
Data throughput is one of the most important performance indicators for
communication systems. Before the 1990s, the multipath effect was considered
the main obstacle to high-throughput transmission [108]. This can be
explained simply in the time domain. Because of the multipath effect, a guard
interval may have to be inserted between transmitted symbols to prevent
inter-symbol interference (ISI), and the guard interval should be longer than
the channel delay spread. However, inserting a guard interval limits the data
throughput. This limitation can also be explained in the frequency domain.
The multipath effect in the time domain leads to frequency-selective fading
in the frequency domain. Hence, if the signal occupies the whole channel
bandwidth, it experiences frequency-selective fading and the performance
degrades significantly. To avoid frequency-selective fading, we may transmit a
signal with a narrow bandwidth (narrowband communications). In narrowband
communication systems, such as IS-95 (CDMA), the signal bandwidth is far less
than the channel coherence bandwidth. Hence, although the channel exhibits
frequency-selective fading, the narrowband signal does not experience severe
frequency-selective fading, since it occupies only a small portion of the
channel bandwidth. However, transmitting a signal with a narrow bandwidth
implies that the data throughput cannot be high.
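
The time-domain argument can be made concrete with a small calculation. The sketch below estimates the fraction of airtime left for data when every symbol is followed by a guard interval as long as the delay spread, the smallest value that still avoids ISI in this simple model. The function name and the numbers (a 1 microsecond delay spread and two symbol durations) are illustrative assumptions, not values from the text.

```python
def guard_interval_efficiency(symbol_duration_s, delay_spread_s):
    """Fraction of time carrying data when a guard interval equal to the
    channel delay spread follows every symbol (illustrative model)."""
    guard_s = delay_spread_s  # shortest guard interval that still avoids ISI
    return symbol_duration_s / (symbol_duration_s + guard_s)

delay_spread = 1e-6  # assumed delay spread: 1 microsecond
short_symbol = guard_interval_efficiency(0.5e-6, delay_spread)  # high symbol rate
long_symbol = guard_interval_efficiency(10e-6, delay_spread)    # low symbol rate
print(f"short symbols: {short_symbol:.2f}, long symbols: {long_symbol:.2f}")
```

With these assumed numbers, short symbols lose about two thirds of the airtime to the guard interval, while long symbols lose less than a tenth. This is the throughput limitation described above: raising the symbol rate on a single carrier makes the guard interval dominate.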

C.-C.J. Kuo et al., Precoding Techniques for Digital Communication Systems,
DOI: 10.1007/978-0-387-71769-2_1, © Springer Science+Business Media, LLC 2008

To overcome the multipath effect and achieve high-throughput transmission,
channel equalization or precoding techniques can be used. The original
principle of precoding is that, if the transmitter knows the channel
information, the transmit signal can be designed so that the ISI at the
receiver is greatly mitigated. For instance, Tomlinson-Harashima precoding can
be regarded as moving the feedback part of a decision feedback equalizer (DFE)
to the transmitter to avoid the error propagation problem. Unlike
error-correcting coding, which operates in a Galois field, precoding deals
with symbols in the complex field and hence can help to rescue
constellation-mapped symbols from impairments such as frequency-selective
fading [115].
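
The idea of moving the DFE feedback filter to the transmitter can be sketched in a few lines. The example below is a minimal, noise-free illustration of Tomlinson-Harashima precoding for real-valued 4-PAM symbols, assuming a monic causal channel whose taps are known at the transmitter. The function names, the channel taps, and the data sequence are hypothetical choices made for illustration only.

```python
def mod_2m(value, m):
    """Fold a real value into the interval [-m, m)."""
    return (value + m) % (2 * m) - m

def th_precode(symbols, channel, m):
    """Tomlinson-Harashima precoding for a monic causal ISI channel.

    channel[0] must be 1; channel[1:] are the trailing ISI taps.
    """
    sent = []
    for k, a in enumerate(symbols):
        # ISI that the already transmitted samples will cause at time k.
        isi = sum(channel[j] * sent[k - j]
                  for j in range(1, len(channel)) if k - j >= 0)
        sent.append(mod_2m(a - isi, m))  # modulo keeps the transmit signal bounded
    return sent

def channel_output(sent, channel):
    """Noise-free ISI channel: linear convolution truncated to len(sent)."""
    return [sum(channel[j] * sent[k - j]
                for j in range(len(channel)) if k - j >= 0)
            for k in range(len(sent))]

M = 4                                  # 4-PAM: symbols in {-3, -1, 1, 3}
h = [1.0, 0.5, -0.2]                   # assumed channel taps, known at the transmitter
data = [3, -1, 1, 3, -3, -3, 1, -1]

tx = th_precode(data, h, M)
rx = channel_output(tx, h)
recovered = [round(mod_2m(y, M)) for y in rx]  # memoryless modulo receiver
print(recovered)
```

Because the feedback loop runs on the transmitter's own error-free samples, there are no wrong decisions to propagate, and the receiver reduces to a modulo operation followed by a slicer. The modulo step is what distinguishes this from plain channel inversion: it keeps every transmitted sample inside [-M, M).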
An effective way to overcome the multipath effect is to use orthogonal
frequency division multiplexing (OFDM). OFDM has been widely used in both
wireline and wireless communications. While the concept of OFDM has been known
since 1966 [19], it was not employed in standard systems until the 1990s, when
advances in digital signal processing (DSP) and VLSI technology made effective
OFDM implementation possible with low-cost fast Fourier transform (FFT) chips.
When used in wired environments, OFDM is also called discrete multitone (DMT)
modulation, the technique used in xDSL (digital subscriber line). The primary
advantage of OFDM is its ability to combat the inter-symbol interference (ISI)
caused by frequency-selective fading with a simple transceiver structure, and
it has proven this ability over the past decade. In 1993, DSL adopted DMT,
making it the first successful commercial product to use OFDM instead of an
equalization-based technique. In 1995 and 1997, ETSI adopted OFDM in the DAB
(Digital Audio Broadcasting) and DVB-T (Digital Video Broadcasting-Terrestrial)
systems, respectively. Later, in 1999, the IEEE 802.11a standard used OFDM and
gave Wi-Fi applications a peak data rate of up to 54 Mbps. From 2002 to 2007,
OFDM was adopted for other standards such as the IEEE 802.16x family (Wi-MAX)
and IEEE 802.11n (MIMO Wi-Fi). In fact, an OFDM system can also be regarded as
a special case of precoding. In [5], the channel information of the
multicarrier system was used to design the transmitting and receiving filter
banks so that the ISI is eliminated, a scheme called “vector coding.” Vector
coding may be regarded as a linear precoding technique, and it needs feedback
from the receiver to the transmitter since it assumes that the transmitter
knows the channel information. Current OFDM systems do not use the channel
information to design the transmitting and receiving filter banks. Instead,
they use DFT and IDFT filter banks for the transceiver design. As a result,
the transmitter does not need to know the channel information. This kind of
channel-independent OFDM scheme was generalized as a precoding scheme in [76]
and [115].
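
The channel-independent nature of the DFT/IDFT transceiver can be checked numerically. The sketch below is a minimal illustration, assuming a noise-free channel whose impulse response is no longer than the cyclic prefix plus one sample. It uses a naive DFT from the standard library rather than an FFT, and the channel taps, block size, and QPSK symbols are arbitrary illustrative values. Note that the transmitter never uses `h`; only the receiver applies a one-tap equalizer per subcarrier.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(N^2) DFT; the inverse includes the 1/N scaling."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
               for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out

def convolve(x, h):
    """Linear convolution modelling the multipath channel (no noise)."""
    y = [0j] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

N, CP = 8, 2                           # subcarriers and cyclic prefix length
h = [1.0, 0.4, 0.2]                    # assumed channel, length <= CP + 1
X = [1+1j, 1-1j, -1+1j, -1-1j, 1+1j, -1-1j, 1-1j, -1+1j]  # QPSK symbols

x = dft(X, inverse=True)               # IDFT at the transmitter
tx = x[-CP:] + x                       # prepend the cyclic prefix
rx = convolve(tx, h)[CP:CP + N]        # channel, then discard the prefix
Y = dft(rx)                            # DFT at the receiver
H = dft(h + [0.0] * (N - len(h)))      # channel frequency response
X_hat = [Y[k] / H[k] for k in range(N)]  # one-tap equalizer per subcarrier
print(max(abs(a - b) for a, b in zip(X_hat, X)) < 1e-9)
```

The cyclic prefix turns the linear convolution with the channel into a circular one over each block, so the DFT diagonalizes it and each subcarrier sees a single complex gain. That is why the simple transceiver structure mentioned above suffices.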
The increasing need for fast and reliable wireless communication links has
led to systems with multiple antennas at both the transmitter and the
receiver. Multiple input multiple output (MIMO) systems can significantly
increase the capacity and hence achieve higher transmission rates than
one-sided array links. The well-known Shannon theorem for the capacity of
bandlimited Gaussian channels shows that there is a fundamental limit (the
channel capacity) on the data rate over such channels. With advances in
communication theory and the growth of sophisticated signal processing and
computation techniques, the possibility of achieving this fundamental limit
seems higher than ever before. In MIMO systems, if the channel information is
known to the transmitter, precoding can be used to further improve the system
performance based on various design criteria, such as maximizing the capacity
or minimizing the mean square error. In the current wireless standards,
precoding (or beamforming) is adopted as an optional feature of IEEE 802.11n
and the IEEE 802.16 family, where full or partial channel information is
required to implement this feature. When the channel varies rapidly, full
channel information may not be available at the transmitter. In this case,
research has shown that MIMO precoding with “partial channel” information can
still achieve satisfactory performance. This concept also motivated us to use
partial channel information to perform precoding for UWB communication
systems, since the channel impulse response of such systems is long and full
channel information may not be available at the transmitter (see Chapter 8).
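
As a toy illustration of precoding with full channel information, the sketch below applies a zero-forcing (channel inversion) precoder to a 2 × 2 MIMO channel. Zero forcing is chosen here only because it is the simplest criterion to show. It is not one of the capacity or mean-square-error designs named above, and the sketch ignores noise and the transmit power constraint. The channel matrix, the symbols, and the helper names are assumptions made for illustration.

```python
def inv2(h):
    """Inverse of a 2x2 complex matrix given as nested lists."""
    (a, b), (c, d) = h
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    """Matrix-vector product for nested-list matrices."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

H = [[0.9 + 0.1j, 0.3 - 0.2j],         # assumed 2x2 channel, known at the transmitter
     [0.2 + 0.4j, 1.1 - 0.3j]]
s = [1 + 1j, -1 + 1j]                  # one QPSK symbol per stream

x = matvec(inv2(H), s)                 # zero-forcing precoder: x = H^{-1} s
y = matvec(H, x)                       # noise-free channel output
print(max(abs(yi - si) for yi, si in zip(y, s)) < 1e-9)
```

Each receive antenna sees its own symbol with no inter-stream interference, so no joint detection is needed at the receiver. The price is that the transmit power depends on how well conditioned the channel matrix is.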
As mentioned above, precoding schemes may be divided into three categories
according to the availability of channel information at the transmitter:
1. The transmit side has full channel information.
2. The transmit side has only partial channel information.
3. The transmit side does not have any channel information.
It is intuitive that a precoding scheme with full channel information can
achieve better performance than the other two. However, full channel
information is sometimes unavailable at the transmitter because of rapid
channel variation (the feedback does not arrive in time) or a long delay
spread (too much feedback information). An example in which full or even
partial channel information may not be available is the uplink transmission
in multiuser communication systems. In this case, the signals from different
users lead to multiaccess interference (MAI). To eliminate MAI using a
precoding technique with full channel information, a mobile station may need
to know the channel information of all other mobile stations, which is
impractical when the number of users is large. Although multiuser detection
(MUD) can be used at the receiver to eliminate MAI, it demands high
computational complexity, especially when the number of users is large. This
motivated us to pursue another research direction, and we found that, by
properly designing the transceiver and the multiaccess orthogonal code scheme,
an MAI-free or nearly MAI-free property can be achieved without channel
information at the transmit side. Since this precoding scheme is channel
independent, the transceiver design can be greatly simplified.
This book deals with precoding techniques in digital communication systems.
Different precoding techniques can be employed in the transmitter to improve
the performance of wireline or wireless systems with affordable complexity.
In the first part, we provide an overview of several important existing
precoding techniques. From this tutorial overview, readers can understand the
evolution of precoding and the principles of various precoding schemes. In
the second part, we discuss the application of state-of-the-art precoding
schemes to current and emerging communication systems to combat impairments
such as ISI, MAI, and the Doppler effect. We introduce the proposed MAI-free
or nearly MAI-free multiuser systems, which combine CDMA and OFDM techniques.
Unlike multiuser systems without precoding, which may place a serious
complexity burden on the receiver to deal with MAI, the proposed MAI-free
multiuser systems balance the complexity between the transmitter and the
receiver. Consequently, a simple transceiver design is achievable. Different
chapters are dedicated to different applications of precoding, as outlined
below.

1.1 Precoding for ISI Gaussian Channels

In Chapter 2, we discuss precoding techniques to eliminate ISI and achieve
high throughput. To approach capacity, the transmission band must be expanded
to the entire usable bandwidth of the channel, which leads to inter-symbol
interference (ISI). To mitigate the effect of ISI, equalization can be
employed at the receiver or, alternatively, when the channel state information
is available at the transmitter, the interfering symbols can be subtracted
from the transmitted signal with precoding techniques. Such a precoder was
independently proposed by Tomlinson [131] and by Harashima and Miyakawa [84]
and is called the Tomlinson–Harashima (TH) precoder. Tomlinson–Harashima
precoding has been used in bandlimited telephone line modems to support data
rates of 19.2 kbits/s or above, just short of the capacity of 20 kbits/s.
Theoretically, it has been shown that, with TH precoding and for high
signal-to-noise ratio (SNR) channels, the capacity of any bandlimited
Gaussian channel can be approached as closely as the capacity of an ideal
Gaussian channel. TH precoding can be combined naturally with channel coding
and also with trellis shaping, which provides additional gain. In particular,
trellis precoding, a combination of trellis shaping, trellis coded modulation
(TCM), and TH precoding, yields a very powerful scheme with remarkable
performance.
OFDM is an effective technique to combat ISI. Representing OFDM systems
with multirate filter bank structures gives us more insight
into how to design the precoder and postcoder to meet different design criteria,
since there are many well-developed results for multirate systems. Also,
with the multirate representation, the widely adopted OFDM systems can be
slightly modified to become other types of OFDM systems. For instance, the
cyclic prefix (CP)-inserted OFDM can be modified to zero padding (ZP)-inserted
OFDM to reduce the transmit power. It is also interesting to note
that precoding can also be added to current OFDM systems. One example
is the channel-independent single carrier with cyclic prefix (SC-CP) system
[113], where a DFT precoder is added in the transmitter and hence the
whole system becomes a single carrier system. The SC-CP system enjoys a much lower
peak-to-average power ratio (PAPR) than the widely adopted OFDM system.
Furthermore, this system has been proven to achieve the minimum bit error
rate (BER) with certain modulation schemes [78]. Since the transmitter of
such a precoding scheme does not need channel information in SC-CP systems,
the transceiver design can be greatly simplified.
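
The PAPR claim above can be illustrated with a small numerical sketch. It assumes a unitary DFT/IDFT pair, omits the cyclic prefix (which does not change the peak-to-average ratio of the block body), and uses a deliberately worst-case input for OFDM; the function names `dft`, `idft`, and `papr` are illustrative and do not come from the book.

```python
import cmath
import math


def idft(X):
    """Unitary inverse DFT (plain O(N^2) sum, adequate for a small demo)."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n))
            / math.sqrt(n) for t in range(n)]


def dft(x):
    """Unitary forward DFT, used here as the channel-independent precoder."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            / math.sqrt(n) for k in range(n)]


def papr(x):
    """Peak-to-average power ratio of one transmitted block (linear scale)."""
    powers = [abs(v) ** 2 for v in x]
    return max(powers) / (sum(powers) / len(powers))


# Assumed worst case for OFDM: all subcarriers carry the same QPSK symbol.
symbols = [1 + 1j] * 8

ofdm_block = idft(symbols)        # conventional OFDM transmitter
sc_block = idft(dft(symbols))     # DFT precoder followed by the OFDM IDFT

print("OFDM PAPR :", papr(ofdm_block))
print("SC-CP PAPR:", papr(sc_block))
```

With the DFT precoder in place, the IDFT simply undoes it, so the transmitted block is the constant-modulus symbol sequence itself; without it, identical subcarrier symbols add coherently into a single large peak.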

1.2 Precoding for CDMA Systems


In Chapter 3, we discuss different precoding techniques in the direct-sequence
(DS) code division multiple access (CDMA) channel. As the number of co-
channel users increases, the decoding performance suffers due to the increased
level of multiple access interference (MAI). Even though several receiver-based
MAI suppression schemes, such as multiuser detection [144], minimum mean
square error (MMSE) receiver [83], and decorrelator [81], improve the receiver
performance at the expense of high computational complexity and knowledge
of the transmitted signals of all co-channel users, they are only suitable for the
uplink, rather than the downlink channel. This is because the uplink receiver,
namely the base station, is more likely to provide high computational power
and to acquire the channel knowledge of all users than its downlink counterpart,
namely the mobile unit. If the downlink channel information of each user
is acquired by the base station based on the channel reciprocity assumption in
a time division duplex (TDD) system, it is straightforward to consider
different precoding techniques for the downlink channel. Here, three different
precoding schemes for the CDMA downlink channel, namely the transmit
matched filter (Tx-MF), the transmit zero-forcing filter (Tx-ZF), and the
transmit Wiener filter (Tx-Wiener), are derived, and their performance and
design trade-offs are discussed.

1.3 Precoding for MIMO Channels


Multiple input multiple output (MIMO) and space-time processing techniques
are by far the most promising existing wireless technologies. MIMO technology
can provide channel capacity gain (multiplexing gain), if multiple independent
data streams are sent simultaneously and in the same frequency band over
multiple transmit antennas and recovered at the receiver via appropriate signal
processing techniques, e.g. the V-BLAST scheme. This is usually called Spatial
Division Multiplexing (SDM).
MIMO systems also offer potential spatial diversity that can be exploited
by space-time codes (e.g., the Alamouti code), among other techniques. A
space-time code is essentially a clever way to map data streams across time
and space.
When the full channel state information is available at the transmitter,
precoding can be used to improve the performance of a MIMO system in a variety
of ways, as follows:

• TH precoding can be used in a spatial division multiplexing MIMO system to
subtract the interfering symbols from the transmitted bits.
• Transmit and receive processing (linear precoding and decoding) can use
the CSI to optimally allocate resources such as power and data rates.
• Precoding can also be combined with space-time codes to maximize the
diversity and coding gain. In this particular form, the linear constellation
precoding technique does not even require knowledge of the channel
state information.
The MIMO precoding techniques are based on the assumption that the real-
time channel knowledge at the transmitter is available. However, in a time
varying channel environment where the channel information is updated often,
the ideal channel information assumption is usually weakened by insufficient
feedback channel capacity. Later in this chapter, two different precoding concepts
are introduced to reduce the feedback overhead. The first precoder design
exploits the channel statistics, namely either the channel mean or the channel
covariance, since they are relatively stable compared to the current channel
information and, therefore, fewer updates are required. Given that either the channel
mean or the channel covariance matrix is available at the transmitter, the precoder
is constructed by specifying a proper input covariance matrix at the
precoder output so that the information rate is maximized. It is found that
the maximum rate can be achieved by either the beamforming scheme or spa-
tial multiplexing scheme, depending on the quality of feedback information.
The other precoder design based on incomplete channel information is called
unitary precoding. The idea of unitary precoding is as follows. A set of dis-
crete codewords, i.e., codebook, is constructed off-line first and this codebook
knowledge is shared at both sides of the link. Later, the receiver selects the
best codeword from the codebook and sends the corresponding code index to
the transmitter for precoding. Obviously, the size of codebook, which deter-
mines the quality of precoder, is limited by the number of feedback bits. The
codebook design that minimizes the distortion due to discrete representation
of the codeword is the key to this scheme and is detailed as well.
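
The feedback loop described above (shared codebook, receiver-side selection, index-only feedback) can be sketched for the simplest single-stream case. Everything specific here is an assumption made for illustration: the DFT-style codebook, the selection metric |sum_m h_m f_m|^2 for a receiver with one antenna, and the names `dft_codebook` and `select_codeword`. The book's actual codebook design is given in the corresponding chapter and is not reproduced here.

```python
import cmath
import math


def dft_codebook(n_tx, n_bits):
    """Hypothetical codebook: 2**n_bits unit-norm beamforming vectors."""
    size = 2 ** n_bits
    return [[cmath.exp(2j * math.pi * k * m / size) / math.sqrt(n_tx)
             for m in range(n_tx)]
            for k in range(size)]


def select_codeword(h, codebook):
    """Receiver side: pick the index whose codeword maximizes the effective
    channel gain |sum_m h[m] * f[m]|**2 (assumed model y = (h . f) s + noise)."""
    gains = [abs(sum(hm * fm for hm, fm in zip(h, f))) ** 2 for f in codebook]
    best = max(range(len(codebook)), key=gains.__getitem__)
    return best, gains[best]


# Both ends of the link build the same codebook off-line.
codebook = dft_codebook(n_tx=2, n_bits=2)

# The receiver knows the channel, selects a codeword, and feeds back only
# the index, which costs n_bits = 2 bits per update.
h = [1, 1j]
index, gain = select_codeword(h, codebook)
print("feedback index:", index, "effective gain:", gain)
```

The codebook size grows as 2 to the number of feedback bits, which is the trade-off between precoder quality and feedback overhead noted in the text.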

1.4 Precoding for Multiuser OFDM Systems

In Chapters 5–7, we introduce precoding techniques for multiuser OFDM
systems to eliminate the MAI. Multiuser OFDM systems, including multicarrier
code division multiple access (MC-CDMA) and orthogonal frequency division
multiple access (OFDMA), have been developed to meet the need for wireless
multiaccess. Multiuser OFDM systems generally suffer from multiaccess
interference (MAI), which is caused by various environmental effects such
as multipath, timing asynchronism, carrier frequency offset (CFO), or the
Doppler spread. By using precoding techniques, the MAI caused by the above
environmental effects can be greatly mitigated or completely eliminated for a
group of users.
In Chapter 5, we introduce a new family of multiuser OFDM transceivers,
called precoded multiuser (PMU)-OFDM, which is approximately MAI-free
at a low implementation cost. A code selection scheme based on
Hadamard-Walsh codes is proposed for the PMU-OFDM system so that
the system remains approximately MAI-free in an environment with time
asynchronism, CFO, or Doppler effects. That is, we
use only the M/2 symmetric or the M/2 anti-symmetric codewords of the
M Hadamard-Walsh codes. When the number of users does not exceed M/2, all
users can enjoy the nearly zero MAI property. When the number of users
exceeds M/2, MUD techniques can be used to further suppress the MAI. It is
worthwhile to emphasize that even in a fully loaded system, the complexity
of MUD is reduced since each user only needs to deal with M/2
interferers instead of M interferers.
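
The split of the M Hadamard-Walsh codewords into M/2 symmetric and M/2 anti-symmetric ones can be checked directly. The sketch below assumes the Sylvester construction of the Hadamard matrix and defines a codeword as symmetric when it equals its own reversal and anti-symmetric when it equals the negated reversal; the helper names are illustrative.

```python
def hadamard_walsh(m):
    """Sylvester construction of an m x m Hadamard matrix (m a power of 2)."""
    if m < 1 or m & (m - 1):
        raise ValueError("m must be a power of 2")
    rows = [[1]]
    while len(rows) < m:
        # H_2n = [[H_n, H_n], [H_n, -H_n]]
        rows = [r + r for r in rows] + [r + [-v for v in r] for r in rows]
    return rows


def split_by_symmetry(code):
    """Return the row indices of the symmetric and anti-symmetric codewords."""
    symmetric, antisymmetric = [], []
    for idx, row in enumerate(code):
        reversed_row = row[::-1]
        if reversed_row == row:
            symmetric.append(idx)
        elif reversed_row == [-v for v in row]:
            antisymmetric.append(idx)
    return symmetric, antisymmetric


M = 8
code = hadamard_walsh(M)
symmetric, antisymmetric = split_by_symmetry(code)
print("symmetric rows     :", symmetric)
print("anti-symmetric rows:", antisymmetric)
```

Under this construction every codeword falls into exactly one of the two groups, so a system that uses only one group supports at most M/2 users, which matches the user limit stated above.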
Moreover, we propose a code priority scheme based on the proposed code
selection so that PMU-OFDM can be more robust to the CFO effect. That is,
we rank the codewords according to their CFO robustness and form a code
priority. This code priority can be used in three different ways. First, since
not all the users are active simultaneously, the code priority suggests that we
should assign the higher-priority codewords to the first connected users. Second,
in a serious CFO environment, we can support fewer
users based on the code priority to remain nearly MAI-free in this hostile
environment. Third, in practical situations, individual users may have different
CFOs. In this case, we should assign the codewords with higher priorities
to the users who have larger CFOs and the codewords with lower priorities
to the users who have smaller CFOs so that the overall performance
will not degrade significantly. Furthermore, we evaluate the PMU-OFDM system
in time-varying channel environments. We found that the PMU-OFDM
system with the proposed code selection scheme works robustly in such
hostile channel environments.
It is known that MC-CDMA systems suffer from MAI when the channel
exhibits frequency-selective fading. In Chapter 6, we will introduce a
Hadamard-Walsh code based MC-CDMA system that achieves zero MAI over
frequency-selective fading channels. In particular, we will use appropriately chosen subsets
of Hadamard-Walsh code as codewords. For a multipath channel of length L,
we partition a Hadamard-Walsh code of size N into G groups, where G is
a power of 2 with G ≥ L. We will show that any of the G subsets yields
an MAI-free system. It is also shown that the MAI-free property allows us to
estimate the channel of each user separately, so the system can perform channel
estimation much more easily. Owing to the MAI-free property, every user
can enjoy a channel diversity gain of order L to improve the bit error performance.
Furthermore, the system has the additional advantage that it is robust
to CFO in a multipath environment. That is, by partitioning the codewords
into subsets, the number of supportable MAI-free users with Hadamard-Walsh
codes is $1 + \log_2(N/G)$ in a CFO environment.
In Chapter 7, we show that the number of MAI-free users in a CFO environment
can be increased by partitioning the orthogonal carrier interferometry
(CI) codewords [92]. Since some users can achieve the MAI-free property, in a
fully-loaded MC-CDMA system, the complexity for MAI suppression can be
greatly reduced. We will use existing interference suppression techniques to
achieve fully-loaded user capacity in MC-CDMA with CFO and show that the
receiver complexity is indeed simplified by adopting the proposed precoding
scheme in the MC-CDMA system.

1.5 Precoding for UWB Systems

In Chapter 8, we study the precoding system design in UWB impulse radio
(IR) systems. The UWB-IR system enjoys several advantages, such as
an accurate ranging capability due to its fine multipath resolution and excellent
fading immunity due to the great number of multipath components found in
its channel response. However, from a communication perspective, it is
also important for the UWB receiver to collect sufficient signal power, which
is spread among those many paths. The traditional Rake receiver used in the
multipath CDMA channel is not applicable in UWB channels since it requires a huge
number of Rake fingers, and the associated cost is high. Time-reversal
precoding (TRP), which shifts the complexity of multipath combining
from the receiver to the transmitter, was recently proposed to reduce the
receiver hardware cost. By encoding the transmitted symbol with the time-reversed
channel information, all the multipath components are automatically
concentrated after a certain delay, and a simple matched filter (MF) can be
applied to collect enough channel power. However, the deployment of TRP is limited by its
high feedback overhead for those many multipath components. To reduce the
feedback overhead while retaining the advantage of signal power focusing,
channel phase precoding UWB (CPP-UWB) is discussed in this chapter.
The CPP transmitter, which encodes the data symbol with the time-reversed
phase information alone, demands only phase information feedback rather
than complete channel information feedback. Since each phase component
in the UWB channel is either +1 or −1 and is represented by one bit, the
amount of feedback information is greatly reduced. Owing to the concentrated
signal power in the equivalent channel response after precoding, the ISI effect
is somewhat mitigated for a fixed symbol interval. In other words,
the symbol interval of the CPP-UWB system can be further shortened to speed
up the data transmission rate for a given noise margin. For a given symbol
interval shorter than the channel response, we can adjust the phase codeword
length so that the output signal-to-interference power ratio (SIR) is maximized.
Although the optimal codeword length can be found via exhaustive
search, the computational complexity is high, especially when the number of
channel taps is large. A fast search algorithm is also developed to determine
the optimal codeword length with less computation than
the exhaustive search algorithm.
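
The power-focusing effect of TRP and CPP can be sketched with a toy real-valued channel. The tap values, the absence of transmit power normalization, and the names `trp_prefilter` and `cpp_prefilter` are assumptions made for this illustration only. The equivalent channel seen by the receiver is the convolution of the prefilter with the channel.

```python
def convolve(a, b):
    """Full linear convolution of two real sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def sign(v):
    return 1.0 if v >= 0 else -1.0


def trp_prefilter(h):
    """Time-reversal precoding: needs every tap value at the transmitter."""
    return list(reversed(h))


def cpp_prefilter(h):
    """Channel phase precoding: needs only the sign (one bit) of every tap."""
    return [sign(v) for v in reversed(h)]


h = [0.8, -0.5, 0.3, -0.2, 0.1]          # hypothetical real UWB channel taps
g_trp = convolve(trp_prefilter(h), h)    # equivalent channel with TRP
g_cpp = convolve(cpp_prefilter(h), h)    # equivalent channel with CPP
peak_index = len(h) - 1                  # delay at which all paths line up

print("TRP peak:", g_trp[peak_index])    # sum of squared taps
print("CPP peak:", g_cpp[peak_index])    # sum of tap magnitudes
```

In both cases all paths add with the same sign at delay L − 1, so a single sample captures the focused energy, while CPP needs only one feedback bit per tap.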
2
Precoding Techniques in ISI Channel

One of the most critical issues in communication systems is the multipath
effect. The multipath effect causes inter-symbol interference (ISI) and hence
limits the transmission speed. In this chapter, we will review several techniques
used to combat ISI. These include channel equalization, Tomlinson–Harashima
precoding, and trellis precoding. Moreover, it is known that OFDM
systems can compensate for ISI using a simple transceiver scheme. Since the
multirate representation of OFDM systems gives more insight into
the analysis, we will introduce the multirate representation of several OFDM
systems. Further, a channel-independent precoding scheme for OFDM systems
will also be developed based on the multirate representation.

2.1 Equalizers for ISI Cancellation

Communication channels are usually bandwidth limited. In order to approach
capacity, the transmission band must be expanded to the entire usable
bandwidth of the channel. This inevitably leads to severe distortion
of the signal spectrum at the band edges and to inter-symbol interference (ISI).
One way to reduce ISI is to use equalizers in the receiver. Three different
equalizers, namely the linear equalizer (LE), the decision feedback equalizer (DFE),
and the maximum likelihood sequence estimation (MLSE) equalizer, have been
proposed.
The zero-forcing (ZF) DFE is depicted in Fig. 2.1 for a channel with impulse
response $h(D)$. It includes a feedforward block $f(D)$ whose task is to
guarantee white noise at the decision device and a causal minimum-phase
end-to-end impulse response. The causality ensures that each decision symbol
$x_i$ is disturbed only by symbols with index $j < i$. Assuming all
previous estimates are correct (ideal DFE assumption), i.e., $\hat{x}_i = x_i$, we can
eliminate the tail $\sum_{j<i} h_j x_{i-j}$, or equivalently in the D-transform $(h(D) - 1)x(D)$,
by subtraction as

C.-C.J. Kuo et al., Precoding Techniques for Digital Communication Systems,
DOI: 10.1007/978-0-387-71769-2_2, © Springer Science+Business Media, LLC 2008

Fig. 2.1. Block diagram of ZF-DFE.

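$$ r'(D) = r(D) - \bigl(h(D)f(D) - 1\bigr)\hat{x}(D). \tag{2.1} $$

Since
$$ r(D) = h(D)f(D)x(D) + w(D)f(D), \tag{2.2} $$
we have, under the ideal DFE assumption $\hat{x}(D) = x(D)$,
$$ r'(D) = x(D) + w(D)f(D), \tag{2.3} $$
where the filtered noise $w(D)f(D)$ is white by the design of $f(D)$.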
We see that the ISI is completely cancelled and only white noise exists. The
output signal to noise ratio is then given by

$$ \mathrm{SNR}_{\text{ZF-DFE}} = S_x / S_n, \tag{2.4} $$

where $S_x$ and $S_n$ are the average energies of the input symbols $x_i$ and the noise,
respectively. Note that for a ZF linear equalizer, the feedback section of Fig. 2.1
does not exist and $f(D) = 1/h(D)$. Then,

$$ \mathrm{SNR}_{\text{ZF-LE}} = \frac{S_x}{S_n \, \|1/h\|^2}, $$

where

$$ \|1/h\|^2 = \int_{-1/2}^{1/2} \frac{1}{|H(f)|^2} \, df $$

is the noise enhancement factor. Therefore, $\mathrm{SNR}_{\text{ZF-DFE}} > \mathrm{SNR}_{\text{ZF-LE}}$, i.e.,
the ZF-DFE improves on $\mathrm{SNR}_{\text{ZF-LE}}$ by the noise enhancement factor. However,
the ideal DFE assumption may not be practical. In fact, one of the
main shortcomings of DFE is error propagation. Moreover, since reliable
detected decisions are only available after a delay, channel coding cannot be
combined with DFE in a straightforward manner.
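
As a numerical illustration of the noise enhancement factor, the sketch below evaluates the integral of $1/|H(f)|^2$ over one period with a midpoint rule. The two-tap channel $h(D) = 1 + 0.5D$ is a hypothetical example, not one taken from the book; for a channel of the form $1 + aD$ with $|a| < 1$ the integral has the closed form $1/(1 - a^2)$, which gives a convenient check.

```python
import cmath
import math


def noise_enhancement(h, n_points=4096):
    """Midpoint-rule estimate of the integral of 1/|H(f)|^2 over [-1/2, 1/2],
    where H(f) = sum_l h[l] * exp(-j 2 pi f l)."""
    total = 0.0
    for k in range(n_points):
        f = -0.5 + (k + 0.5) / n_points
        H = sum(hl * cmath.exp(-2j * math.pi * f * l) for l, hl in enumerate(h))
        total += 1.0 / abs(H) ** 2
    return total / n_points


h = [1.0, 0.5]                       # hypothetical channel h(D) = 1 + 0.5 D
factor = noise_enhancement(h)
loss_db = 10 * math.log10(factor)    # SNR penalty of ZF-LE relative to ideal ZF-DFE

print("noise enhancement factor:", factor)
print("ZF-LE penalty in dB     :", loss_db)
```

As the channel zero moves toward the unit circle (here, as the second tap approaches 1 in magnitude), the factor grows without bound, which is the regime where the linear equalizer performs poorly.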
ISI cancellation can also be performed by the optimum maximum likelihood
sequence estimation (MLSE) equalizer. Suppose the transmitted sequence $x(D)$
is drawn from an M-point constellation set. If the channel impulse response has
finite length L, the channel can be modeled as a shift register of length L with
M-state memory. The Viterbi algorithm can be applied to optimally decode the
received sequence $r(D)$ in the presence of noise. Although the output SNR
of the MLSE equalizer can approach the effective SNR of the matched filter
bound, $\mathrm{SNR}_{\text{MFB}} = S_x \|h\|^2 / S_n$, the complexity of the Viterbi algorithm grows
exponentially with the channel length (as $M^L$).
An alternative to equalization in the receiver is to change the
signal format at the transmitter by using channel state information so that the
effect of ISI is reduced or cancelled. This technique is called precoding. In this
chapter, we review the principles of the best-known precoding techniques,
namely Tomlinson–Harashima (TH) precoding and trellis precoding.

2.2 Tomlinson–Harashima Precoding (THP)


The Tomlinson–Harashima precoding (THP) technique was invented independently
by Tomlinson in the United Kingdom [131] and by Harashima and Miyakawa in Japan [84].
TH precoding is closely tied to the signal set. Figure 2.2 shows the block diagram
of a transceiver in an AWGN channel. We assume the discrete-time channel
impulse response is given by $h_l$. Let us assume, without loss of generality, that
$h_0 = 1$. The sampled channel outputs are given by

$$ r_i = \sum_{l=0}^{L-1} h_l x_{i-l}. \tag{2.5} $$

We see that the symbols $x_i, x_{i-1}, \ldots, x_{i-L+1}$ interfere with each other.
If the transmitter knows the channel impulse response, the inter-symbol
interference (ISI) effect can be overcome by a precoder with a transfer function
equal to the inverse of the transfer function of the channel, as shown in
Fig. 2.3(a).
However, when channel transfer function value is close to zero, the output
of this precoder may increase or diverge to infinity.
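
The growth of the inverse-filter output can be seen in a toy case. The sketch below assumes the hypothetical channel $h(D) = 1 + D$, whose transfer function is zero at $f = 1/2$, and drives the inverse filter $1/h(D)$ with an input concentrated at that frequency; the function names are illustrative.

```python
def inverse_filter_precoder(x, h):
    """Linear precoder z(D) = x(D) / h(D) with h[0] = 1, computed recursively:
    z_i = x_i - sum_{l >= 1} h[l] * z_{i-l}."""
    z = []
    for i, xi in enumerate(x):
        tail = sum(h[l] * z[i - l] for l in range(1, len(h)) if i - l >= 0)
        z.append(xi - tail)
    return z


def channel_output(z, h):
    """Noise-free channel: r_i = sum_l h[l] * z_{i-l}."""
    return [sum(h[l] * z[i - l] for l in range(len(h)) if i - l >= 0)
            for i in range(len(z))]


h = [1, 1]                              # spectral null at f = 1/2
x = [(-1) ** i for i in range(10)]      # alternating input sitting on the null
z = inverse_filter_precoder(x, h)

print("precoder output:", z)                      # magnitude keeps growing
print("channel output :", channel_output(z, h))   # ISI-free, equals x
```

The channel output is free of ISI, but the transmitted amplitude grows with every symbol, so the required transmit power is unbounded. This is the problem the nonlinear block described next is meant to solve.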
To overcome this issue, a nonlinear block with transfer function $T(D)$
is inserted before the $h^{-1}(D)$ block, as shown in Fig. 2.3(b). The nonlinear
function makes the sequence $z(D)$ peak limited, i.e.,

$$ z_{\min} \le z(D) \le z_{\max}, \tag{2.6} $$

for some $z_{\min}$ and $z_{\max}$ and all $D$. Assume $z_{\min} = -z_{\max}$. Then, one possible
construction of $T$ is as follows [84]:

$$ y(D) = x(D) - 2P z_{\max}, \tag{2.7} $$

where $P$ is an integer. Since $z(D) = y(D) - v(D)$, Eq. (2.6) gives the condition
that $P$ must satisfy to limit the peak value of the sequence $z(D)$:

$$ (2P - 1) z_{\max} \le x(D) - v(D) \le (2P + 1) z_{\max}. \tag{2.8} $$

Fig. 2.2. Communication over AWGN channel.



Fig. 2.3. (a) Block diagram of a precoder for ISI channel; (b) block diagram of
Tomlinson–Harashima precoder [84] (© IEEE).

Therefore, the value of $2P z_{\max}$ is actually the output of the quantization
operation $Q$ applied to $x(D) - v(D)$, as shown in Fig. 2.4. An implementation of
$T$ based on the above equations is shown in Fig. 2.5.
In the receiver, the estimate of $y(D)$ is obtained as

$$ \hat{y}(D) = y(D) + w(D), $$

where $w(D)$ is additive white Gaussian noise. After that, the inverse transformation
$T^{-1}$ computes the estimate of the input sequence $x(D)$ from

Fig. 2.4. Input–output characteristic of quantizer Q [84] (© IEEE).

Fig. 2.5. Implementation of transmitter of TH precoding [84] (© IEEE).

$$ \hat{x}(D) = \hat{y}(D) + 2P z_{\max}. \tag{2.9} $$

We assume the information sequence is also peak limited, i.e.,

$$ x_{\min} \le x(D) \le x_{\max}, $$

for some $x_{\min}$ and $x_{\max}$. Therefore,

$$ x_{\min} \le \hat{y}(D) + 2P z_{\max} \le x_{\max}, \tag{2.10} $$

or equivalently,

$$ x_{\min} - 2P z_{\max} \le \hat{y}(D) \le x_{\max} - 2P z_{\max}. \tag{2.11} $$

Assuming $x_{\max} - x_{\min} < 2 z_{\max}$, we have $x_{\min} > x_{\max} - 2 z_{\max}$ and
$x_{\max} < x_{\min} + 2 z_{\max}$. Therefore, we can rewrite Eq. (2.11) as

$$ (2P' - 1) z_{\max} + x_{\max} - z_{\max} \le \hat{y}(D) \le (2P' + 1) z_{\max} + x_{\min} + z_{\max}, \tag{2.12} $$

where $P' = -P$. Let

$$ x_{\max} - z_{\max} \le d \le x_{\min} + z_{\max}. \tag{2.13} $$

Then, by comparing Eqs. (2.12) and (2.13), we conclude

$$ (2P' - 1) z_{\max} \le \hat{y}(D) - d \le (2P' + 1) z_{\max}. \tag{2.14} $$

We see that the term $2P' z_{\max}$ can be obtained from $\hat{y}(D) - d$ as the output
of the quantizer $Q$ shown in Fig. 2.4, the same quantizer that was used in the transmitter.
From Eq. (2.9), $\hat{x}(D)$ can be obtained from

$$ \hat{x}(D) = \hat{y}(D) - 2P' z_{\max}, \tag{2.15} $$

where $\hat{y}(D)$ varies according to

$$ (2P' - 1) z_{\max} + d \le \hat{y}(D) \le (2P' + 1) z_{\max} + d, \tag{2.16} $$

for all values of $P'$. The input–output characteristic of $T^{-1}$ is shown
in Fig. 2.6. An implementation of the $T^{-1}$ function is shown in Fig. 2.7.
Fig. 2.6. Input–output characteristics of inverse transformation $T^{-1}$ [84] (© IEEE).

Fig. 2.7. An implementation of inverse transformation $T^{-1}$ [84] (© IEEE).

For the special case where the information sequence is binary, i.e., $x(D)$ can
take either 0 or 1, we can set $z_{\min} = -1$, $z_{\max} = 1$, and $d = 1/2$. Then, from
Fig. 2.7, it is clear that the original binary sequence can be restored by

$$ \hat{x}(D) = \begin{cases} 0, & \hat{y}(D) \text{ even}, \\ 1, & \hat{y}(D) \text{ odd}. \end{cases} \tag{2.17} $$

In other words, $T^{-1}$ reduces to

$$ \hat{x}(D) = \hat{y}(D) \bmod 2. \tag{2.18} $$

Moreover, for the general case where the information sequence takes on values
from the M-level alphabet $0, 1, \ldots, (M - 1)$, the $T^{-1}$ function can be constructed by
letting $z_{\max} = -z_{\min} = M/2$ and $d = (M - 1)/2$, so that

$$ \hat{x}(D) = \hat{y}(D) \bmod M. \tag{2.19} $$

Equivalently, if a 2M-level PAM signal set consists of points taking the values
$\pm 0, \pm 1, \ldots, \pm(M - 1)$, $T^{-1}$ is equal to

$$ \hat{x}(D) = \hat{y}(D) \bmod 2M. \tag{2.20} $$

Note that the modulo-2M block can be used in the transmitter as the
nonlinear $T$ transformation, as demonstrated in [131] and shown in Fig. 2.8.
In fact, the modulo-2M operation reduces the transmit power by constraining
the transmitted symbols to lie within $[-M, M]$.
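
The precoder and its inverse can be sketched for the M-level alphabet $0, 1, \ldots, M-1$ with $z_{\max} = M/2$, following Eqs. (2.7), (2.8), and (2.19). The sketch makes several assumptions: the channel is noise free, the tap values are hypothetical, the feedback term $v$ is taken to be the ISI tail $\sum_{l \ge 1} h_l z_{i-l}$ of the already-transmitted samples, and the function names are illustrative.

```python
import math


def th_precode(x, h, M):
    """TH precoder for the alphabet {0, ..., M-1}, with z_max = M / 2 and h[0] = 1.
    Chooses the integer P of Eq. (2.8) so that z = x - v - P*M is peak limited."""
    z = []
    for i, xi in enumerate(x):
        v = sum(h[l] * z[i - l] for l in range(1, len(h)) if i - l >= 0)
        P = math.floor((xi - v + M / 2) / M)   # quantizer output is 2*P*z_max = P*M
        z.append(xi - v - P * M)
    return z


def channel(z, h):
    """Noise-free ISI channel: y_i = sum_l h[l] * z_{i-l}."""
    return [sum(h[l] * z[i - l] for l in range(len(h)) if i - l >= 0)
            for i in range(len(z))]


def th_decode(y_hat, M):
    """Inverse transformation of Eq. (2.19): x_hat = y_hat mod M."""
    return [round(v) % M for v in y_hat]


M = 4
h = [1.0, 1.5, -0.5]                          # hypothetical channel taps
x = [0, 3, 1, 2, 3, 3, 0, 1, 2, 0, 3, 1]      # information symbols

z = th_precode(x, h, M)       # transmitted sequence, confined to [-M/2, M/2)
y_hat = channel(z, h)         # equals x - P*M sample by sample
x_hat = th_decode(y_hat, M)

print("transmitted:", z)
print("recovered  :", x_hat)
```

The channel itself undoes the subtraction of the ISI tail, so the receiver only has to remove the multiple of M added by the quantizer. In contrast to the plain inverse filter, the transmitted samples stay bounded regardless of the channel.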

2.2.1 Performance of TH Precoding


Fig. 2.8. Alternative representation of TH precoding [84] (© IEEE).

In 1972, Price [104] showed that, for high signal-to-noise ratio, the gap (in
dB) between the performance of uncoded PAM with ideal (no feedback
error) DFE or Tomlinson–Harashima precoding over any bandlimited AWGN
channel and the channel capacity at a given $\Pr(E)$ approaches a constant value.
Consider a bandlimited Gaussian channel with bandwidth $W$. Assume
the transmitted modulated signal is QAM with an $M \times M$ signal set. The
transmitted power is constant for all frequencies within bandwidth $W$ and
zero elsewhere. Suppose either ideal zero-forcing (ZF) DFE or TH precoding
is used. Define the normalized signal-to-noise ratio of an arbitrary transmission
scheme with a data rate of $R$ bits per two dimensions as

$$ \mathrm{SNR}_{\text{norm}} = \mathrm{SNR}_{\text{DFE}} / (2^R - 1), \tag{2.21} $$

where $\mathrm{SNR}_{\text{DFE}} = S_x / S_n$ is the output signal-to-noise power ratio of
an ideal DFE. According to Price's results, at high SNR, the performance
of all uncoded QAM signals over strictly bandlimited channels, regardless of
channel characteristics (including the ideal AWGN channel), is the same with
ZF-DFE or TH precoding. Thus, for high signal-to-noise ratio, the capacity
of any bandlimited channel is approximated by [144]

$$ C \approx \log_2(1 + \mathrm{SNR}_{\text{DFE}}). \tag{2.22} $$

Therefore,
$$ \mathrm{SNR}_{\text{norm}} \approx (2^C - 1)/(2^R - 1). \tag{2.23} $$
Since $R < C$, $\mathrm{SNR}_{\text{norm}} > 1$. Now, suppose the $M \times M$ square QAM scheme
has minimum distance $d_{\min}$. Since

$$ S_x = (M^2 - 1)\, d_{\min}^2 / 6, \tag{2.24} $$

and
$$ R = \log_2(M^2), \tag{2.25} $$
and the noise is Gaussian, at high SNR the symbol error probability can be
approximated by [44]

$$ \Pr(E) \approx 4\,Q\!\left[\left(\frac{d_{\min}^2}{2 S_n}\right)^{1/2}\right] = 4\,Q\!\left[(3\,\mathrm{SNR}_{\text{norm}})^{1/2}\right], \tag{2.26} $$

where $Q(y) = \int_y^{\infty} f_X(x)\,dx$ and $f_X(x)$ is the Gaussian probability density
function with mean zero and variance one. The plot of $\Pr(E)$ versus $\mathrm{SNR}_{\text{norm}}$
is given in Fig. 2.9. It is seen that the gap between the Shannon capacity
limit and the performance of uncoded $M \times M$ QAM at $\Pr(E) = 10^{-6}$ is about
9 dB. The key concept here is that the dB gap between uncoded QAM and
capacity is approximately the same over all strictly bandlimited high
signal-to-noise ratio channels, including the ideal AWGN channel.

Fig. 2.9. Symbol error probability of uncoded QAM with precoding [44] (© IEEE).
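
The 9-dB figure can be reproduced from Eq. (2.26). Since capacity corresponds to a normalized SNR of 1 (0 dB), the gap at a target error probability is simply the normalized SNR, in dB, at which the approximation reaches that target. The sketch below assumes the standard relation between the Gaussian tail function and the complementary error function, $Q(y) = \tfrac{1}{2}\operatorname{erfc}(y/\sqrt{2})$, and uses a bisection search; the function names are illustrative.

```python
import math


def q_function(y):
    """Tail probability of a zero-mean, unit-variance Gaussian."""
    return 0.5 * math.erfc(y / math.sqrt(2))


def symbol_error_probability(snr_norm_db):
    """Eq. (2.26): Pr(E) ~ 4 Q(sqrt(3 * SNR_norm)) for uncoded square QAM."""
    snr_norm = 10 ** (snr_norm_db / 10)
    return 4 * q_function(math.sqrt(3 * snr_norm))


def gap_to_capacity_db(target, lo=0.0, hi=20.0):
    """Bisection on SNR_norm in dB; Pr(E) decreases monotonically with SNR_norm."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if symbol_error_probability(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


gap = gap_to_capacity_db(1e-6)
print("gap to capacity at Pr(E) = 1e-6: %.2f dB" % gap)
```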
Tomlinson–Harashima precoding can be regarded as an attractive alternative
to the ZF-DFE when the channel state information is available at the
transmitter since, as shown by Price's result, it has the same performance
as ZF-DFE for high SNR values over any ISI channel. It is noteworthy that TH
precoding can be viewed as a DFE whose feedback section has been moved to the
transmitter. Therefore, it does not have the disadvantages of DFE mentioned
before. That is, since the equalization is performed at the transmitter, the
transmitted signals are perfectly known and the error propagation phenomenon
never occurs. Also, no immediate decision is required, and hence channel
coding can be combined easily with TH precoding.

2.2.2 Combined Precoding and Coding

The 9-dB SNR gap between the performance of uncoded QAM with
Tomlinson–Harashima precoding over any bandlimited AWGN channel and the channel
capacity at a given $\Pr(E)$, observed in Fig. 2.9, can be reduced through coding
and shaping gains.
Precoding can be combined with coding to obtain additional coding gain.
The combination can be performed in a very natural way. Coded modulation
symbols $\{x_i\}$ are sent into the TH precoder, which performs the same
precoding operation as before to produce the symbols $\{z_i\}$. For example, trellis
coded modulation symbols can be used in conjunction with TH precoding.
The received symbols are the trellis coded symbols plus the additive Gaussian
noise. The regular Viterbi algorithm can be used to find the closest sequence
to the received sequence, which can then be reduced modulo 2M.
TH precoding combined with coded modulation schemes can achieve the
same coding gain over the partial response channel $(1 - D)$ [43] or an arbitrary
channel response [15] and [35] as is achieved over the ideal channel, with about
the same decoding complexity.

2.3 Trellis Precoding

It was shown in [29] that spherical lattice codes can achieve capacity at high
SNR on ideal channels. Later, it was shown that the gain of a lattice code
over uncoded QAM is due to both a coding gain and a shaping gain. The
maximum shaping gain is shown to be about 1.53 dB. This means that a coding
gain of about 7.5 dB is required to close the 9-dB gap to channel capacity. Since
the complexity of spherical lattice codes is high due to prohibitive memory
requirements, Forney and Eyuboglu [36] proposed trellis precoding, a combination
of trellis shaping [42], trellis coded modulation, and precoding,
to leverage both coding and shaping gains. They argued that after the first
3–4 dB of coding gain, it is much easier to obtain the next 1 dB with shaping
than with coding.
The concepts of lattices and cosets are important in the discussion of trellis
coded modulation (TCM) and trellis precoding. Here, we briefly review the basic
concepts. A lattice, denoted by $\Lambda$, consists of N-dimensional vectors
and is a subset of $\mathbb{R}^N$, where $\mathbb{R}$ denotes the set of real numbers. The simplest
example of a lattice is $\mathbb{Z}$, the set of integers. $\mathbb{Z}^2$ and $\mathbb{Z}^N$ are the two-dimensional
and N-dimensional integer lattices, respectively. If we rotate each point of a lattice,
e.g., $\mathbb{Z}^2$, we obtain a new lattice denoted by $R\mathbb{Z}^2$. Also, we can scale each
point of $\mathbb{Z}^2$ by a factor of $k$, where $k$ is any integer, and obtain another lattice
$k\mathbb{Z}^2$. A subset of a lattice that is itself a lattice is called a sublattice. Cosets,
which are translates of a sublattice, are also subsets of the lattice,
but they must be disjoint. Thus, a sublattice can have many cosets.
The two-dimensional lattice is the general form behind constellation sets
used in digital communications such as M-PSK, M-QAM, etc. Higher-dimensional
lattices have proven to be very useful in TCM. We assume that $A$ is
a conventional two-dimensional $M \times M$ square constellation. In lattice notation,
$A$ is the set of points from the two-dimensional half-integer lattice
$\Lambda = \mathbb{Z}^2 + (0.5)^2$, i.e., the integer lattice shifted by $(0.5, 0.5)$, that fall within
the two-dimensional boundary square region $R = R_1 \times R_2$, where $R_1$ is the
half-open interval $R_1 = (-M/2, M/2]$.
Figure 2.10 shows a $16 \times 16$ square two-dimensional constellation.
In the following sections, we first review the trellis shaping as described
in [42] in Section 2.3.1. Then, in Section 2.3.2, we describe the principles of
trellis precoding.

Fig. 2.10. Square 16 × 16 two-dimensional constellation.

2.3.1 Trellis Shaping

In TH precoding, the transmitted signals are uniformly distributed within
a square region. Briefly, the goal of shaping is to reduce the average signal
energy. This can be done by making the probability distribution of the signal set
more like a Gaussian distribution than a uniform distribution.
Usually an $M \times M$ two-dimensional square constellation, such as the $16 \times 16$
two-dimensional constellation shown in Fig. 2.10, is considered as the baseline
constellation. For such a constellation with minimum squared distance
$d_{\min}^2 = 1$ [42],

$$ R = \log_2(M^2) \ \text{bits per two dimensions}, \tag{2.27} $$

$$ S_x = (M^2 - 1)/6 = (2^R - 1)/6, \tag{2.28} $$

and
$$ |A| = M^2 = 2^R, \tag{2.29} $$
where $|A|$ is the size of the constellation. The baseline average energy for a
conventional $M \times M$ two-dimensional constellation with $d_{\min}^2 = 1$, data
rate $R$, and average energy $S_x$ is defined as

$$ S_{x0}(R) = 2^R / 6. \tag{2.30} $$

The shaping gain $\gamma_s$ is defined as

$$ \gamma_s = S_{x0}(R) / S_x. \tag{2.31} $$

The shaping constellation expansion ratio (CER) for a two-dimensional
constellation is defined as
$$ \mathrm{CER} = |A| / 2^R. \tag{2.32} $$

Thus, the shaping gain of a square two-dimensional constellation is approximately
1 (0 dB) when $R$ is large, while the CER is exactly 1 (0 dB).
It was shown in [45] that $S_{x0}(R)$ is a good approximation to the average energy
of any N-dimensional square constellation with $d_{\min}^2 = 1$. Thus, the shaping
gain of any large constellation in the form of an N-cube is approximately 1 (0 dB).
It can be shown that an N-sphere constellation has a shaping gain of 0.2 dB in
two dimensions, 0.98 dB in 16 dimensions, and 1.10 dB in 24 dimensions, and
eventually reaches the limit $\pi e/6$ (1.53 dB) as N approaches infinity [45].
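
These shaping gains can be reproduced numerically. The sketch below uses Eqs. (2.28), (2.30), and (2.31) for the square constellation. For the N-sphere it uses the closed form $\gamma_s = \pi (N + 2) / \bigl(12\, \Gamma(N/2 + 1)^{2/N}\bigr)$, which is not stated in the text and is assumed here as the standard expression for the shaping gain of an N-sphere over an N-cube of equal volume. It matches the three values quoted above to within rounding.

```python
import math


def to_db(ratio):
    return 10 * math.log10(ratio)


def square_shaping_gain(R):
    """Eq. (2.31) for an M x M square constellation: S_x0(R) / S_x."""
    return (2 ** R / 6) / ((2 ** R - 1) / 6)


def sphere_shaping_gain(n):
    """Assumed closed form: pi * (n + 2) / (12 * Gamma(n/2 + 1) ** (2/n)).
    lgamma keeps the computation stable for large n."""
    return math.pi * (n + 2) / (12 * math.exp(2 * math.lgamma(n / 2 + 1) / n))


ultimate_limit_db = to_db(math.pi * math.e / 6)

print("square, R = 8 : %.3f dB" % to_db(square_shaping_gain(8)))
for n in (2, 16, 24):
    print("%2d-sphere     : %.3f dB" % (n, to_db(sphere_shaping_gain(n))))
print("limit pi*e/6  : %.3f dB" % ultimate_limit_db)
```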
To obtain a shaping gain, $S_x$ must be reduced below $S_{x0}$. One way to
do this is through sign bit shaping. Consider the $16 \times 16$ constellation shown in
Fig. 2.10 as the baseline. Each coordinate takes values from the
16-point PAM constellation $\{\pm 1/2, \pm 3/2, \ldots, \pm 15/2\}$. In two's-complement
notation, the most significant bit, $t$, is called the sign bit. The remaining bits are
called the least significant bits. For any of the 64 values of the least significant
bits, we can form a square of side 8 whose vertices are the 4 constellation
points corresponding to the 4 possible sign bit combinations. Such a four-point
set is called an equivalence class of points. An example of an equivalence
class is shown in Fig. 2.11(a).
The goal of sign bit shaping is to change the sign bit values such that
the average energy is reduced, hence a shaping gain is obtained. Thus the
shaping operation will choose the point with the least energy in each
class, which will be one of the 64 least-energy points of Fig. 2.11(b).

Fig. 2.11. (a) Equivalence class of 4 points with the same least significant bits for a
square 16 × 16 two-dimensional constellation; (b) 64-point constellation consisting
of least energy points in each equivalence class with no restriction on the sign bits;
(c) 144-point constellation consisting of least energy points in each equivalence class
if the parity of the sign bits is preserved [© [42] IEEE].
Since the sign bits carry no information, R = 6. Assuming all points are
equiprobable, the average energy is reduced to $S_x$ = 63/6 = 10.5 (10.21 dB).
Therefore, the shaping gain $\gamma_s$ is equal to 64/63, which is equal to that
of a conventional 8 × 8 constellation.
If we demand that the shaping operation change both sign bits together, then
we have the 144-point constellation shown in Fig. 2.11(c). The reduced
energy is $S_x$ = 129/6 = 21.5 (13.32 dB). Here, the sign bits carry one bit
of information, hence R is 7 bits per two dimensions. However, we still do not
have a shaping gain since the constellation shape is square
($\gamma_s = 128/129 \approx 1$).
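The two memoryless cases above can be checked by brute force. The sketch below stores each coordinate c as the odd integer 2c, so that flipping the two's-complement sign bit is a shift by 16 (that is, by 8 in the original units). The helper names are illustrative.

```python
def flip(k):
    """Flip the sign bit of a coordinate stored as the odd integer k = 2c."""
    return k - 16 if k > 0 else k + 16

def energy(p):
    """Energy of a point given in doubled coordinates."""
    return (p[0] ** 2 + p[1] ** 2) / 4

COORDS = range(-15, 16, 2)  # 2c for c in {-15/2, ..., 15/2}

def unrestricted_shaping():
    """Each sign bit chosen freely: least-energy point of every 4-point class."""
    chosen = {(min(k1, flip(k1), key=abs), min(k2, flip(k2), key=abs))
              for k1 in COORDS for k2 in COORDS}
    return len(chosen), sum(energy(p) for p in chosen) / len(chosen)

def parity_preserving_shaping():
    """Both sign bits flipped together: least-energy point of every 2-point class."""
    classes, points, total = set(), set(), 0.0
    for k1 in COORDS:
        for k2 in COORDS:
            pair = frozenset({(k1, k2), (flip(k1), flip(k2))})
            if pair in classes:
                continue
            classes.add(pair)
            e_min = min(energy(p) for p in pair)
            total += e_min
            # Tied points both stay in the constellation.
            points.update(p for p in pair if energy(p) == e_min)
    return len(points), total / len(classes)

if __name__ == "__main__":
    print(unrestricted_shaping())       # (number of points, average energy)
    print(parity_preserving_shaping())
```

In the parity-preserving case there are 128 two-point classes. The classes whose two members have equal energy contribute both points to the constellation, which is why it has 144 points rather than 128.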
To obtain a shaping gain ($\gamma_s > 1$), we can use a binary rate-1/2
convolutional code as shown in Fig. 2.12. The convolutional code is called the
shaping code $C_s$, whose task is to modify the sign bit sequence t(D), which
is a sequence of binary 2-tuples, to $t'(D) = t(D) \oplus c_s(D)$, where $c_s(D)$ is
any output sequence of the convolutional code and ⊕ denotes the modulo-2 sum
operation [42].
Suppose a sequence of two-dimensional constellation points is denoted by
a(D), with $a_i = (a_{i1}, a_{i2})$. Then, by modifying t(D) to t'(D), we modify
the sequence a(D) to x(D) such that the average energy $S_x = E\{\|x\|^2\}$ is
minimized. This is done by the Viterbi algorithm through a trellis search of the
convolutional code $C_s$.
A trellis diagram is a way of visualizing all of the encoder's state transitions
over time. Each state transition from time i to time i + 1 corresponds to a
branch and is associated with the encoder output $c_{s,i} = (c_{s,i1}, c_{s,i2})$.
Given the constellation point $a_i = (a_{i1}, a_{i2})$, the modified output symbol
$x_i$ is the constellation point $a_i$ with its sign bits modified by $c_{s,i}$, the
binary 2-tuple at time i. If we assign to each branch labeled by $c_{s,i}$ the branch
metric $\|x_i\|^2$, the Viterbi algorithm can search for a minimum-weight path
through the trellis diagram of $C_s$. This search is equivalent to a search for the
$c_s(D) \in C_s$ that results in the modified output sequence x(D) with the least
average energy. The complexity of the shaping is mainly determined by that of the
Viterbi decoder, which grows with the number of trellis states, i.e.,
exponentially with the constraint length of the shaping code.
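The trellis search described above can be sketched as follows. The text does not specify the shaping code, so the 4-state rate-1/2 code with generators $(1 + D + D^2,\ 1 + D^2)$ is assumed here purely for illustration. The search also runs over the whole block rather than a finite decoding window, and coordinates are stored as odd integers 2c.

```python
def flip(k):
    """Flip the sign bit of a coordinate stored as the odd integer k = 2c."""
    return k - 16 if k > 0 else k + 16

def apply_sign_bits(point, c):
    """XOR the 2-tuple c onto the two sign bits of a constellation point."""
    return tuple(flip(k) if bit else k for k, bit in zip(point, c))

def energy(point):
    return (point[0] ** 2 + point[1] ** 2) / 4

def encoder_step(state, u):
    """One step of the assumed shaping code; state = (u[i-1], u[i-2])."""
    s1, s2 = state
    return (u, s1), (u ^ s1 ^ s2, u ^ s2)

def viterbi_shape(points):
    """Return (shaped sequence x, total energy) minimizing sum ||x_i||^2."""
    inf = float("inf")
    states = [(a, b) for a in (0, 1) for b in (0, 1)]
    metric = {s: (0.0 if s == (0, 0) else inf) for s in states}
    paths = {(0, 0): []}
    for a in points:
        new_metric = {s: inf for s in states}
        new_paths = {}
        for s in states:
            if metric[s] == inf:
                continue  # state not reachable yet
            for u in (0, 1):
                nxt, c = encoder_step(s, u)
                x = apply_sign_bits(a, c)
                m = metric[s] + energy(x)   # branch metric ||x_i||^2
                if m < new_metric[nxt]:     # keep the survivor
                    new_metric[nxt] = m
                    new_paths[nxt] = paths[s] + [x]
        metric, paths = new_metric, new_paths
    best = min(metric, key=metric.get)
    return paths[best], metric[best]

if __name__ == "__main__":
    a_seq = [(15, -13), (7, 9), (-15, 15), (1, -1), (11, 11), (-9, 13)]
    x_seq, e = viterbi_shape(a_seq)
    print(sum(energy(p) for p in a_seq), "->", e)
```

Because the all-zero code sequence is always a valid path, the shaped energy can never exceed the unshaped energy.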

Fig. 2.12. Sign-bit shaping using the rate-1/2 convolutional code as the shaping
code and square 16 × 16 constellation [© [42] IEEE].

At the receiver, if the uncoded sequence x(D) is sent through a noisy channel,
the received sequence x̂(D) can be detected using conventional symbol-by-symbol
detection techniques. From x̂(D), the receiver can recover an estimate
t̂'(D) of the modified sign bits. The modified sign bit sequence
$t' = t \oplus c_s$ is an element of the set $C_s \oplus t$, which is a coset of the
group code $C_s$.
To limit the error propagation associated with the shaping code $C_s$
acting on t'(D), the original two-tuple sign bit sequence t(D) can be generated
from a syndrome sequence s(D) for the shaping code $C_s$ at the transmitter.
For a rate-k/n binary convolutional code with generator matrix G, a
syndrome-former is specified by an n × (n − k) transfer function matrix $H^T$ with
rank n − k such that
$$G H^T = 0. \tag{2.33}$$
For any code sequence $c_s(D) = b(D)G$, we have
$$c_s(D) H^T = 0. \tag{2.34}$$
Also, for any n-tuple sequence t(D) that does not belong to $C_s$,
$t(D)H^T = s(D) \neq 0$. Since $t'(D) = t(D) \oplus c_s(D)$, we have
$$t'(D) H^T = (t(D) \oplus c_s(D)) H^T = t(D) H^T = s(D). \tag{2.35}$$
Therefore, regardless of the choice of $c_s(D)$, the receiver can recover the
syndrome sequence s(D) from t̂'(D), provided t̂'(D) = t'(D). Moreover, even if
there is an error in the estimate t̂'(D), the resulting error propagation
is limited, since it is known that for any linear, time-invariant
convolutional code the syndrome-former can always be chosen to be feedback-free.
As shown in Fig. 2.12, from the syndrome sequence s(D), the sign bit
sequence can be generated as
$$t(D) = s(D)(H^{-1})^T, \tag{2.36}$$
where $(H^{-1})^T$ is an (n − k) × n left inverse matrix for $H^T$ and is called the
coset representative generator.
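Equations (2.33) to (2.36) can be verified with polynomial arithmetic over GF(2). The sketch assumes the rate-1/2 code $G = [1 + D + D^2,\ 1 + D^2]$, which is not specified in the text. For this code, $H^T = [1 + D^2,\ 1 + D + D^2]^T$ and $(H^{-1})^T = [1 + D,\ D]$ are one valid feedback-free choice. Polynomials are stored as integers whose bit i is the coefficient of $D^i$.

```python
def gf2_mul(a, b):
    """Carry-less product of two GF(2) polynomials stored as bit masks."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

# Assumed example code (illustrative, not from the text).
G = (0b111, 0b101)        # generator row [1+D+D^2, 1+D^2]
H_T = (0b101, 0b111)      # syndrome-former column, G * H_T = 0
H_INV_T = (0b11, 0b10)    # left inverse row [1+D, D] of H_T

def shaping_codeword(u):
    """c_s(D) = u(D) G."""
    return (gf2_mul(u, G[0]), gf2_mul(u, G[1]))

def syndrome(t):
    """s(D) = t(D) H^T for a 2-tuple sequence t = (t1(D), t2(D))."""
    return gf2_mul(t[0], H_T[0]) ^ gf2_mul(t[1], H_T[1])

def coset_representative(s):
    """t(D) = s(D) (H^{-1})^T, Eq. (2.36)."""
    return (gf2_mul(s, H_INV_T[0]), gf2_mul(s, H_INV_T[1]))

if __name__ == "__main__":
    s = 0b1011001                       # arbitrary syndrome (data) sequence
    t = coset_representative(s)
    c = shaping_codeword(0b110101)      # arbitrary shaping code sequence
    t_mod = (t[0] ^ c[0], t[1] ^ c[1])  # t'(D) = t(D) xor c_s(D)
    print(syndrome(t_mod) == s)         # Eq. (2.35)
```

Since both $H^T$ and $(H^{-1})^T$ are polynomial here, a single error in the detected sign bits affects only a bounded number of recovered syndrome bits.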
The least significant bits, if sent uncoded over the channel, can be mapped
directly into the estimates of the transmitted bits as shown in Fig. 2.12. The
least significant bits of the detected symbols x̂(D) determine an estimate of the
six corresponding input bits b̂(D). The estimated sign bits t̂'(D) are passed
through the syndrome-former $H^T$ to produce an estimate of the syndrome
sequence ŝ(D).
Figure 2.12 shows sign bit shaping using a 256-point constellation and
a rate-1/2 convolutional shaping code $C_s$, with data rate R = 7 bits per two
dimensions. It was shown in [42] that the shaping gain of such sign bit shaping
depends on the decoding window size of the Viterbi algorithm (VA). As the window
approaches 20 symbols, the shaping gain increases to 0.9 dB. The shaping
operation also changes the distribution of the signal set from uniform toward
Gaussian.
Sign bit shaping can be combined with any coding based on Ungerboeck's
mapping by set partitioning [138]. According to Ungerboeck's set partitioning
rule, the subsets are determined by the least significant bits. Since the sign
bits do not change the partitioning of constellation points, the coding and
shaping are completely compatible with each other.

Fig. 2.13. Binary coded modulation shaping using the rate-$k_s/n_s$ convolutional
code as the shaping code and square 16 × 16 constellation [© [36] IEEE].
In fact, sign bit shaping is only a special case of the more general trellis
shaping depicted in Fig. 2.13. Here the constellation set A is N-dimensional.
Trellis shaping is used in conjunction with a binary lattice-type channel trellis
code $\mathbb{C}_c(\Lambda_c/\Lambda'_c; C_c)$, which comprises the top section of
Fig. 2.13. Trellis coded modulation [138] performs both coding and modulation
without an increase in bandwidth. A trellis code is composed of a binary
rate-$k_c/n_c$ convolutional code $C_c$ and $n_u$ uncoded bits. The encoder $G_c$
generates an $n_c$-tuple code sequence $y(D) = b(D)G_c$ (labeled y(D) in
Fig. 2.13) from the $k_c$-tuple input sequence b(D). Each $y_i$ selects one of the
$2^{n_c}$ cosets into which the constellation is partitioned. In lattice notation,
each $y_i$ selects one of the $2^{n_c} = |\Lambda_c/\Lambda'_c|$ cosets of a sublattice
$\Lambda'_c$ in a translate $\Lambda_c + a$ of an N-dimensional binary lattice
$\Lambda_c$ [42]. The $n_u$ uncoded bits d(D) determine the coordinates of points
within each of these subsets. This set partitioning proposed by Ungerboeck
guarantees that signals with the largest Euclidean distance have fewer bit
differences.
The bottom section of Fig. 2.13 is trellis shaping, which consists of a binary
rate-$k_s/n_s$ shaping convolutional code $C_s$, a partition of a region R of
N-dimensional space into $2^{n_s}$ subregions R(z), and a shaping decoder.
From what was said before, we know that t'(D) must fall in the same
coset $C_s \oplus t$ as the initial sequence t(D). The shaping decoder may select
the code sequence $c_s(D)$ to be any sequence that lies in the code $C_s$.
Using the Viterbi algorithm, it selects the label sequence
$t'(D) = t(D) \oplus c_s(D)$ that optimizes a desired characteristic, usually the
average transmit energy, of the transmitted sequence x(D).
The sequence x(D) is then sent through the noisy channel. The receiver
can use a conventional decoder for $C_c$ to obtain the estimate of the sequence
$x_i$, denoted by $\hat{x}_i$. The complexity and performance of this decoder are
not affected by the shaping operation. The coded information bits can be
recovered with the inverse $G_c^{-1}$ of the generator matrix $G_c$ of the code
$C_c$. To obtain the shaping bits, the received modified sign bits must be
extracted from the subregion R(z) and then sent through the syndrome-former
$H_s^T$ to obtain the syndrome sequence $\hat{s}(D) = \hat{t}'(D)H_s^T$. If the
estimate is correct, i.e., x̂(D) = x(D), then ŝ(D) = s(D). Again, $H_s^T$ can be
chosen to be feedback-free, so that an occasional error causes only limited
error propagation.

2.3.2 Principles of Trellis Precoding

In short, trellis precoding is a combination of trellis shaping and precoding.
Precoding is done to cancel the ISI effect when the channel is not ideal, i.e.,
$h(D) \neq 1$, where h(D) is any monic, causal, minimum-phase channel transfer
function, with a unique inverse g(D) = 1/h(D).
For trellis precoding, we require the boundary region R of the M × M
square constellation A to be a fundamental region of a lattice $\Lambda_s$, called
the precoding lattice. Let us denote the N-dimensional infinite array set by S.
Then, another property of the precoding lattice can be stated as follows: any
element of the precoding lattice added to any point in a coding subset of S
results in another point in the same coding subset.
In trellis precoding, the information sequence is mapped into a sequence
x(D). A sequence p(D) of symbols $p_i$ from the precoding lattice
$\Lambda_s = M\mathbb{Z}^2$ is then subtracted from x(D). In other words, a modulo
device subtracts integer multiples of M (or 2M) from each symbol. The resulting
sequence x(D) − p(D) is still a code sequence in $C_c$. Finally, the transmitted
sequence z(D) is obtained by filtering x(D) − p(D) through g(D) = 1/h(D).
Therefore, the received sequence r(D) is

$$r(D) = [(x(D) - p(D))\,g(D)]\,h(D) + w(D) = x(D) - p(D) + w(D),$$

where w(D) is a white Gaussian noise sequence. The sequence $\{p_i\}$ can be
chosen according to various criteria. In TH precoding, $\{p_i\}$ is determined by
a simple memoryless modulo-M operation.
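The memoryless modulo-M choice of $\{p_i\}$ can be illustrated with a one-dimensional TH precoder. The channel taps and the 4-PAM alphabet below are arbitrary illustrative values, the channel is noise-free, and exact rational arithmetic is used so that the equality checks are not affected by rounding.

```python
import math
from fractions import Fraction

def mod_M(f, M):
    """Reduce f to the interval (-M/2, M/2] by subtracting a multiple of M."""
    return f - M * math.ceil(Fraction(f) / M - Fraction(1, 2))

def th_precode(x, h, M):
    """z_j = x_j - sum_{k>=1} h_k z_{j-k} - p_j, with p_j fixed by the modulo."""
    z = []
    for j, xj in enumerate(x):
        isi = sum(h[k] * z[j - k] for k in range(1, len(h)) if j - k >= 0)
        z.append(mod_M(xj - isi, M))
    return z

def channel(z, h):
    """Noise-free monic FIR channel: r(D) = z(D) h(D)."""
    return [sum(h[k] * z[j - k] for k in range(len(h)) if j - k >= 0)
            for j in range(len(z))]

if __name__ == "__main__":
    M = 4
    x = [Fraction(v, 2) for v in (3, -1, 1, -3, 3, 3, -1, 1)]  # 4-PAM, d_min = 1
    h = [Fraction(1), Fraction(1, 2), Fraction(-1, 4)]         # assumed monic h(D)
    z = th_precode(x, h, M)
    r = channel(z, h)                                          # equals x - p
    print([mod_M(v, M) for v in r] == x)
```

The receiver applies the same modulo operation to r(D) = x(D) − p(D), which removes p(D) because every $p_i$ is a multiple of M.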
Figure 2.14 shows a trellis precoder composed of trellis shaping and TH
precoding. The important block in trellis precoding is the shaping decoder, which
selects not only the shaping code sequence $c_s(D)$ from $C_s$ but also the
precoding lattice sequence p(D). The shaping decoder searches through all
possible code sequences $c_s(D)$ in $C_s$ and precoding lattice sequences p(D) to
optimize a desired property of the transmitted sequence z(D), such as minimizing
the average transmit energy $S_z$. In our case, the shaping decoder scans all
possible superstates $\mathbf{s}_i = (s_i, \mathbf{p}_{i-1})$, where $s_i$ is a state of
the convolutional code $C_s$ and $\mathbf{p}_{i-1} = (p_{i-1}, p_{i-2}, \ldots)$.

Fig. 2.14. Block diagram of trellis precoding transmitter [© [36] IEEE].

The number of possible superstates is infinite, since each symbol of p(D) can be
any point from $\Lambda_s$, which is an infinite set. Since the optimum decoder
resembles an MLSE of trellis-coded signals in the ISI channel, the VA-like
reduced-state sequence estimation (RSSE) techniques developed for combined
decoding and equalization [38] have been suggested as the decoder for trellis
precoding. The complexity of RSSE for trellis codes can range between that of the
encoder trellis and that of the ML super-trellis. In the case in which the decoder
trellis is equal to the encoder trellis, RSSE reduces to what is called parallel
decision feedback decoding (PDFD).
In PDFD, the superstate is reduced to $(s_i)$. The VA searches the trellis
diagram of the shaping code $C_s$ recursively to find a shaping sequence $c_s(D)$
for which the transmit sequence z(D) has the least energy
$$\|z(D)\|^2 = \|[x(D) - p(D)]\,g(D)\|^2. \tag{2.37}$$
For each state s of the trellis, the VA stores
$\mathbf{z}_{i-1}(s) = [z_{i-1}(s), z_{i-2}(s), \ldots]$ and then computes the path
metric as

$$\Gamma_i = \sum_{j<i} |z_j|^2, \tag{2.38}$$

where

$$|z_j|^2 = \Big| x_j - \Big(\sum_{k \ge 1} z_{j-k}(s)\,h_k\Big) - p_j \Big|^2, \tag{2.39}$$

and $h_k$ are the coefficients of h(D).


For each branch, the VA then selects the element $p_j \in M\mathbb{Z}^2$ that
minimizes $|z_j|^2$. This can be done in two steps. First, the VA computes

$$f_j = x_j - \sum_{k \ge 1} z_{j-k}(s)\,h_k. \tag{2.40}$$

In other words, the VA subtracts the ISI from $x_j$ using the stored symbols
$z_{j-k}(s)$. Then, the coordinates of $f_j$ are reduced to the interval
(−M/2, M/2] with modulo-M operations.
The VA otherwise operates in the normal way, i.e., for each state, it
selects the surviving path among the merging paths. The only constraint is that
the final sequence $c_s(D)$ must be a legitimate code sequence in the shaping
code $C_s$.
From the above discussion, we can see that the TH precoder is incorporated
into the branch metric computations, and there is one precoder for each state
of the trellis. Since PDFD does not make use of the component $\mathbf{p}_{i-1}$
of the superstate $\mathbf{s}_i = (s_i, \mathbf{p}_{i-1})$, it does not minimize the
average transmit energy. In [37], it was shown that the shaping gain can be
increased by expanded trellises that take into account some part of the omitted
component $\mathbf{p}_{i-1}$. However, the above decoding technique offers the
best performance/complexity trade-off among the decoders of the RSSE family.

2.3.3 Performance of Trellis Precoding

Recall that TH precoding (or ZF-DFE) has a 9-dB gap to channel capacity
over any bandlimited or ideal channel, as shown in Fig. 2.9. Suppose there is
a coded modulation with coding gain $\gamma_c$ and shaping gain $\gamma_s$. Then,
the error probability is given by

$$\Pr(E) \approx B\,Q\big[(3\gamma_c\gamma_s\,\mathrm{SNR}_{\mathrm{norm}})^{1/2}\big], \tag{2.41}$$

where B is the error coefficient of the scheme and $\mathrm{SNR}_{\mathrm{norm}}$ is as
defined in Eq. (2.23). Figure 2.9 shows the performance that would be obtained
with a coding gain of 5 dB and a shaping gain of approximately 1 dB. In other
words, the gap between channel capacity and the performance of coded modulation
is narrowed to 3 dB. The same improvement over the performance of uncoded
modulation can be obtained by a combination of coding, shaping, and precoding on
any bandlimited, high-SNR Gaussian channel.
We saw that trellis precoding is a combination of trellis coded modulation,
trellis shaping, and precoding. Trellis coded modulation can achieve 5–6 dB
coding gain. The shaping gain in trellis precoding is determined not only by
the shaping code and the shaping decoder, but also by the channel response
h(D).
Reference [36] showed that a two-dimensional 4-state Ungerboeck code used as the
shaping trellis code, together with a PDFD decoder, can provide a modest shaping
gain $\gamma_s$ of 0.6 to 1 dB. These shaping gains were found to be independent
of R at moderate to high data rates. Also, it was found that, while the shaping
gain is insensitive to the channel coefficients for small decoding delays, it
starts to decrease as the decoding delay increases.
A maximum reduction of 0.3 dB in shaping gain was observed on severely
distorted channels. It is possible to achieve a shaping gain close to the
ultimate shaping gain πe/6 (1.53 dB) with more complex trellis precoding.
Therefore, with trellis precoding, the SNR gap of about 9 dB (at
$\Pr(E) \approx 10^{-6}$) between capacity and uncoded modulation can be reduced
by approximately 7 dB over any strictly bandlimited high-SNR Gaussian channel.
In other words, with trellis precoding, the channel capacity of ISI channels at
high SNRs can be approached as closely as the channel capacity of ideal channels.

2.4 Multirate Representations for OFDM Systems

Multirate and filterbank design are very useful techniques for digital signal
processing. By representing communication systems using multirate/filterbank
forms, we can gain more insight into the systems since many important re-
sults can directly be derived from the existing multirate theories. This will
greatly simplify the analysis. In this section, we introduce the basic multirate
techniques [140] and represent the OFDM systems using multirate forms [76].

2.4.1 Multirate Fundamentals

Down-sample/Decimation. x(n) is said to be down-sampled by M, resulting
in the output y(n), if

$$y(n) = [x(n)]_{\downarrow M} = x(Mn). \tag{2.42}$$

That is, y(n) is obtained by keeping every Mth sample of x(n).
The relationship between input and output for decimation in the z domain is
given by

$$Y(z) = \frac{1}{M}\sum_{m=0}^{M-1} X\big(z^{1/M} e^{-j2\pi m/M}\big). \tag{2.43}$$

From Eq. (2.43), the decimated signal is a stretched and scaled version of
the original signal in the frequency domain, summed with its M − 1 shifted
copies. For convenience, we will use the block diagram shown in Fig. 2.15 to
denote a sequence being decimated by M.
Up-sample/Interpolation. x(n) is said to be up-sampled by M, resulting
in the output y(n), if

$$y(n) = [x(n)]_{\uparrow M} = \begin{cases} x(n/M), & \text{if } n \text{ is a multiple of } M, \\ 0, & \text{otherwise.} \end{cases} \tag{2.44}$$

That is, y(n) is obtained by inserting M − 1 zeros between every two consecutive
samples of x(n). The relationship between input and output for interpolation in
the z domain is given by
$$Y(z) = X(z^M). \tag{2.45}$$
The interpolated signal is a compressed version of the original signal in the
frequency domain. We will use the block diagram shown in Fig. 2.16 to
denote a sequence being interpolated by M.
Noble Identities. In many cases, the decimator and interpolator are
cascaded with other LTI systems. For instance, to relax the design requirements
of the low-pass filter in a digital-to-analog converter (DAC), we usually
interpolate the digital signal and then filter the interpolated signal before
passing it to the DAC. On the other hand, we filter and decimate the digital
signal after the analog-to-digital converter (ADC). In such cases, the Noble
Identities together with the polyphase decomposition (introduced later) provide
a good way to reduce the computational cost. Referring to Fig. 2.17, the Noble
Identities provide rules for interchanging interpolation/decimation with a
system function. The interchanging rule for decimation can be expressed as

$$Y(z) = \big[X(z)H(z^M)\big]_{\downarrow M} = H(z)\,[X(z)]_{\downarrow M}. \tag{2.46}$$

Fig. 2.15. Block diagram of decimation.

Fig. 2.16. Block diagram of interpolation.

Fig. 2.17. Noble Identities: (a) interchanging decimator with system function, (b)
interchanging interpolator with system function.

The interchanging rule for interpolation can be expressed as

$$Y(z) = H(z^M)\,[X(z)]_{\uparrow M} = [X(z)H(z)]_{\uparrow M}. \tag{2.47}$$
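Both Noble Identities, Eqs. (2.46) and (2.47), can be confirmed on finite causal sequences. The sketch below uses plain lists as FIR sequences. The helper names are illustrative, and `expand_filter` forms the coefficients of $H(z^M)$.

```python
def convolve(a, b):
    """Linear convolution of two finite causal sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def downsample(x, M):
    """y(n) = x(Mn), Eq. (2.42)."""
    return x[::M]

def upsample(x, M):
    """Insert M - 1 zeros after every sample, Eq. (2.44)."""
    y = []
    for v in x:
        y.append(v)
        y.extend([0] * (M - 1))
    return y

def expand_filter(h, M):
    """Coefficients of H(z^M): taps of h spaced M samples apart."""
    g = upsample(h, M)
    return g[:len(g) - (M - 1)]

if __name__ == "__main__":
    x, h, M = [1, 2, 3, 4, 5, 6, 7], [1, -2, 3], 2
    # Eq. (2.46): filtering with H(z^M) then decimating equals decimating then H(z).
    print(downsample(convolve(x, expand_filter(h, M)), M) == convolve(downsample(x, M), h))
    # Eq. (2.47): interpolating then H(z^M) equals H(z) then interpolating.
    print(convolve(upsample(x, M), expand_filter(h, M)) == upsample(convolve(x, h), M))
```

In both cases the right-hand form runs the filter at the lower sampling rate, which is the source of the computational saving mentioned above.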

Polyphase Identity. If a system first interpolates its input by M, passes
the interpolated signal through a system function, and then decimates
the output by M, the system can be simplified using the Polyphase Identity
illustrated in Fig. 2.18. It can be expressed as

$$Y(z) = \big[H(z)[X(z)]_{\uparrow M}\big]_{\downarrow M} = [H(z)]_{\downarrow M}\,X(z). \tag{2.48}$$

Polyphase Decompositions. A given system function can be
decomposed into polyphase representations. Referring to Fig. 2.19, there are two
types of polyphase decompositions. Type I decomposes the system function
into polyphase components with delay elements, i.e.,

$$F(z) = \sum_{k=0}^{N-1} z^{-k} G_k(z^N). \tag{2.49}$$

For instance, if
$F(z) = a_0 + a_1 z^{-1} + a_2 z^{-2} + a_3 z^{-3} + a_4 z^{-4} + a_5 z^{-5} + a_6 z^{-6}$
and N = 3, we have
$F(z) = (a_0 + a_3 z^{-3} + a_6 z^{-6}) + (a_1 + a_4 z^{-3})z^{-1} + (a_2 + a_5 z^{-3})z^{-2}$.
In this case, $G_0(z^3) = a_0 + a_3 z^{-3} + a_6 z^{-6}$,
$G_1(z^3) = a_1 + a_4 z^{-3}$, and $G_2(z^3) = a_2 + a_5 z^{-3}$. Type II decomposes
the system function into polyphase components with advance elements, i.e.,

$$H(z) = \sum_{k=0}^{N-1} z^{k} S_k(z^N). \tag{2.50}$$

Fig. 2.18. Polyphase Identity.

Fig. 2.19. Polyphase decomposition: (a) Type I (with delay elements), (b) Type II
(with advance elements).

Continuing the above example, let H(z) = F(z). Then, for N = 3,
$H(z) = (a_0 + a_3 z^{-3} + a_6 z^{-6}) + (a_2 z^{-3} + a_5 z^{-6})z + (a_1 z^{-3} + a_4 z^{-6})z^{2}$.
In this case, $S_0(z^3) = a_0 + a_3 z^{-3} + a_6 z^{-6}$,
$S_1(z^3) = a_2 z^{-3} + a_5 z^{-6}$, and $S_2(z^3) = a_1 z^{-3} + a_4 z^{-6}$.
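The two decompositions of the 7-tap example can be reproduced by index slicing. In this sketch a causal FIR filter is a list of coefficients, each returned component lists the coefficients of $z^0, z^{-1}, \ldots$ of $G_k(z)$ or $S_k(z)$, and the function names are illustrative.

```python
def type1_polyphase(f, N):
    """Type I components: g_k(n) = f(k + nN), as in Eq. (2.49)."""
    return [f[k::N] for k in range(N)]

def type2_polyphase(h, N):
    """Type II components: s_k(n) = h(nN - k), as in Eq. (2.50)."""
    components = []
    for k in range(N):
        n_max = (len(h) - 1 + k) // N
        components.append([h[n * N - k] if 0 <= n * N - k < len(h) else 0
                           for n in range(n_max + 1)])
    return components

if __name__ == "__main__":
    a = [10, 11, 12, 13, 14, 15, 16]   # stands for a_0, ..., a_6
    print(type1_polyphase(a, 3))       # G_0, G_1, G_2
    print(type2_polyphase(a, 3))       # S_0, S_1, S_2 (leading 0 = no z^0 term)
```

The leading zero in $S_1$ and $S_2$ reflects the extra delay $z^{-3}$ that compensates for the advance $z^k$ in the causal example above.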

2.4.2 Multirate Representation for OFDM Systems

A multirate representation for communication systems with parallel symbol
transmission is shown in Fig. 2.20 [76]. We will explain later that the OFDM
systems can be represented using this model. In addition, several popular
precoded OFDM systems can be extended based on this model as well. More
detailed treatment and its extension for this subsection can be found in [100].
In communication systems, we usually choose N > M to represent the
insertion of redundancy to eliminate interference. More specifically, if the
channel order is ν, we usually let N ≥ M + ν. In this chapter, we assume
N = M + ν.
To facilitate the analysis, we express the transmitter, channel, and
receiver using polyphase decompositions. Referring to Fig. 2.20, consider one
of the filters in the transmitter filter bank, i.e., the mth transmitter filter,
$F_m(z)$. Using the Type I polyphase decomposition, $F_m(z)$ can be
decomposed as

$$F_m(z) = \sum_{k=0}^{N-1} z^{-k} G_{k,m}(z^N), \quad m = 0, 1, \ldots, M-1, \tag{2.51}$$

where

$$G_{k,m}(z) = \sum_{n=-\infty}^{\infty} g_{k,m}(n) z^{-n} \quad \text{with} \quad g_{k,m}(n) \equiv f_m(k + nN). \tag{2.52}$$

The decomposition of Eq. (2.51) is shown in Fig. 2.21.

Fig. 2.20. A multirate representation for OFDM systems.


Similarly, let us consider the mth receiving filter in the receiver filter bank,
$H_m(z)$. Using the Type II polyphase decomposition, $H_m(z)$ can be
decomposed as

$$H_m(z) = \sum_{k=0}^{N-1} z^{k} S_{m,k}(z^N), \quad m = 0, 1, \ldots, M-1, \tag{2.53}$$

where

$$S_{m,k}(z) = \sum_{n=-\infty}^{\infty} s_{m,k}(n) z^{-n} \quad \text{with} \quad s_{m,k}(n) \equiv h_m(-k + nN). \tag{2.54}$$

The decomposition of Eq. (2.53) is shown in Fig. 2.22.


From Figs. 2.21 and 2.22, we can redraw the transmitting and receiving
filter banks as on the left-hand sides of Figs. 2.23 and 2.24. Since interpolation
and symbol delay are linear operations, we can move the final summation over
all filters in front of the interpolation and symbol delay operations,
and redraw the left-hand side of Fig. 2.23 as the right-hand side of the same
figure. Similarly, we can redraw the left-hand side of Fig. 2.24 as the
right-hand side of the same figure. Referring to the right-hand side of
Fig. 2.23, let

$$\mathbf{x}(n) = (x_0(n)\ \ x_1(n)\ \cdots\ x_{M-1}(n))^t$$

and

$$\mathbf{v}(n) = (v_0(n)\ \ v_1(n)\ \cdots\ v_{N-1}(n))^t.$$

We can represent the relationship of x(n) and v(n) in matrix form in the
z-domain, i.e.,

Fig. 2.21. Type I polyphase decomposition for $F_m(z)$.

Fig. 2.22. Type II polyphase decomposition for $H_m(z)$.

Fig. 2.23. Polyphase decomposition for transmitting filter banks $F_m(z)$,
0 ≤ m ≤ M − 1, and its simplified representation.

$$\mathbf{V}(z) = \mathbf{G}(z)\mathbf{X}(z), \tag{2.55}$$

where G(z) is of size N × M and is given by

$$\mathbf{G}(z) = \begin{pmatrix}
G_{0,0}(z) & G_{0,1}(z) & \cdots & G_{0,M-1}(z) \\
G_{1,0}(z) & G_{1,1}(z) & \cdots & G_{1,M-1}(z) \\
\vdots & \vdots & \ddots & \vdots \\
G_{N-1,0}(z) & G_{N-1,1}(z) & \cdots & G_{N-1,M-1}(z)
\end{pmatrix}. \tag{2.56}$$

Note that the ith column corresponds to the ith transmitting filter. Also,
the component $G_{j,i}(z)$ is the jth polyphase component of the ith transmitting
filter. Similarly, from the right-hand side of Fig. 2.24, let

$$\mathbf{y}(n) = (y_0(n)\ \ y_1(n)\ \cdots\ y_{M-1}(n))^t$$

and

$$\mathbf{w}(n) = (w_0(n)\ \ w_1(n)\ \cdots\ w_{N-1}(n))^t.$$

We can represent the relationship of y(n) and w(n) in the z-domain, i.e.,

$$\mathbf{Y}(z) = \mathbf{S}(z)\mathbf{W}(z), \tag{2.57}$$

where S(z) is of size M × N as follows:

$$\mathbf{S}(z) = \begin{pmatrix}
S_{0,0}(z) & S_{0,1}(z) & \cdots & S_{0,N-1}(z) \\
S_{1,0}(z) & S_{1,1}(z) & \cdots & S_{1,N-1}(z) \\
\vdots & \vdots & \ddots & \vdots \\
S_{M-1,0}(z) & S_{M-1,1}(z) & \cdots & S_{M-1,N-1}(z)
\end{pmatrix}. \tag{2.58}$$

Fig. 2.24. Polyphase decomposition for receiving filter banks $H_m(z)$,
0 ≤ m ≤ M − 1, and its simplified representation.

Again, notice that the ith row corresponds to the ith receiving filter. Also,
the component $S_{i,j}(z)$ is the jth polyphase component of the ith receiving
filter. From the discussion above, we can redraw Fig. 2.20 as in Fig. 2.25. Now we
would like to discuss how to formulate the channel matrix C(z) in Fig. 2.25.
Polyphase Representation of the Channel. By using the Type I polyphase
representation, we can decompose the channel as

$$C(z) = C_0(z^N) + C_1(z^N)z^{-1} + \cdots + C_{N-1}(z^N)z^{-(N-1)},$$

where $C_k(z)$ is the kth polyphase component of C(z). The shaded area in
Fig. 2.25 is an N × N system. Each path can be described as a cascade of an
interpolator, delays, C(z), advances, and a decimator. The (m, n) path is shown
in Fig. 2.26(a). Using the Polyphase Identity, we can redraw Fig. 2.26(a) as in
Fig. 2.26(b) when n ≥ m, and as in Fig. 2.26(c) when n < m. Although
the decimator and interpolator are time-varying building blocks, the
interconnection in Fig. 2.26(a) happens to be time-invariant and circuits (a)
and (b) are equivalent, where $C_0(z)$ is the 0th polyphase component of the
transfer function C(z). Thus the transfer matrix C(z) is given by

$$\mathbf{C}(z) = \begin{pmatrix}
C_0(z) & z^{-1}C_{N-1}(z) & \cdots & z^{-1}C_1(z) \\
C_1(z) & C_0(z) & \ddots & \vdots \\
\vdots & \ddots & \ddots & z^{-1}C_{N-1}(z) \\
C_{N-1}(z) & C_{N-2}(z) & \cdots & C_0(z)
\end{pmatrix}. \tag{2.59}$$

This matrix falls into the category of the so-called pseudo-circulant matrices
[140]. Let the channel be a finite impulse response (FIR) filter of order ν
(ν < N), and let the kth channel tap be $c_k$. The polyphase components of C(z)
are then given by

$$C_k(z) = \begin{cases} c_k, & 0 \le k \le \nu, \\ 0, & \text{otherwise.} \end{cases}$$

Fig. 2.25. Polyphase representation for the OFDM systems.

Fig. 2.26. (a) General block diagram for every path (from $v_n(n)$ to $w_m(n)$). When
n ≥ m, (a) can be redrawn as in (b); when n < m, (a) can be redrawn as in (c).
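We can partition C(z) into a constant matrix $\mathbf{C}_L$ and a matrix
$\mathbf{C}_R(z)$, in particular
$$\mathbf{C}(z) = [\,\mathbf{C}_L \,|\, \mathbf{C}_R(z)\,], \tag{2.60}$$
where $\mathbf{C}_L$ is an N × M matrix and $\mathbf{C}_R(z)$ is an N × ν matrix given,
respectively, by

$$\mathbf{C}_L = \begin{pmatrix}
c_0 & 0 & \cdots & 0 \\
c_1 & c_0 & \ddots & \vdots \\
\vdots & c_1 & \ddots & 0 \\
c_\nu & \vdots & \ddots & c_0 \\
0 & c_\nu & & c_1 \\
\vdots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & c_\nu
\end{pmatrix}
\quad \text{and} \quad
\mathbf{C}_R(z) = \begin{pmatrix}
z^{-1}c_\nu & z^{-1}c_{\nu-1} & \cdots & z^{-1}c_1 \\
0 & z^{-1}c_\nu & \ddots & \vdots \\
\vdots & \ddots & \ddots & z^{-1}c_{\nu-1} \\
0 & \cdots & 0 & z^{-1}c_\nu \\
0 & \cdots & \cdots & 0 \\
\vdots & & & \vdots \\
c_0 & 0 & \cdots & 0 \\
c_1 & c_0 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
c_{\nu-1} & c_{\nu-2} & \cdots & c_0
\end{pmatrix}. \tag{2.61}$$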
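The structure of Eqs. (2.59) to (2.61) can be made concrete by building the pseudo-circulant matrix for a short FIR channel. In this sketch C(z) is stored as two constant matrices, the coefficient of $z^0$ and the coefficient of $z^{-1}$. The function name and the example taps are illustrative.

```python
def pseudo_circulant(c, N):
    """Return (C0, C1) with C(z) = C0 + z^{-1} C1 for an FIR channel c of order nu < N.

    Entry (r, col) is c[r - col] when r >= col, and z^{-1} c[N + r - col] otherwise,
    with c[k] taken as 0 for k > nu, following Eq. (2.59).
    """
    nu = len(c) - 1
    assert nu < N
    C0 = [[0] * N for _ in range(N)]
    C1 = [[0] * N for _ in range(N)]
    for r in range(N):
        for col in range(N):
            d = r - col
            if d >= 0:
                if d <= nu:
                    C0[r][col] = c[d]
            elif N + d <= nu:
                C1[r][col] = c[N + d]
    return C0, C1

if __name__ == "__main__":
    c, M = [1, 2, 3], 4                  # nu = 2, so N = M + nu = 6
    C0, C1 = pseudo_circulant(c, M + len(c) - 1)
    for row0, row1 in zip(C0, C1):
        print(row0, row1)
```

The first M columns of the $z^{-1}$ part are all zero, so the left block $\mathbf{C}_L$ is constant. Only the last ν columns, which form $\mathbf{C}_R(z)$, carry inter-block terms.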

Based on the discussion above, the relationship between Y(z) and X(z) is
given by
$$\mathbf{Y}(z) = \mathbf{S}(z)\mathbf{C}(z)\mathbf{G}(z)\mathbf{X}(z) + \mathbf{N}, \tag{2.62}$$
where N is the noise vector. Now we will represent OFDM systems using the
above model. Since OFDM systems use block transmission, the maximum
length of the filters in the transmitting filter bank $F_m(z)$ and the receiving
filter bank $H_m(z)$ cannot exceed N. Otherwise, the current transmission block
x(n) will be affected by the blocks at other time instances k, where k ≠ n, which
is not block transmission. Given the maximum length N for all filters, from
Eqs. (2.51) and (2.53), the matrices G(z) and S(z) become constant matrices.
We can simplify the input and output relationship as

$$\mathbf{Y}(z) = \mathbf{S}\,\mathbf{C}(z)\,\mathbf{G}\,\mathbf{X}(z) + \mathbf{N}. \tag{2.63}$$

From (2.63), we still need to design S and G to avoid inter-block interference
(IBI) due to C(z). The two popular methods are inserting zeros and inserting a
cyclic prefix. With a cyclic prefix or zero padding in the matrix G and
redundancy removal in S, although the channel matrix C(z) contains
terms in z, the current output block will only be affected by the current
input block, and hence X(z) and Y(z) become constant vectors X and Y, as
we will see in the following subsections.

2.4.3 OFDM Systems with Cyclic Prefix

Now consider the OFDM system with cyclic prefix, which is widely adopted
in current communications standards such as DVB-T, IEEE 802.11a/g/n, and
IEEE 802.16x. At the transmitter side, due to the appended cyclic prefix of
length ν, the constant transmitting matrix $\mathbf{G}_{cp}$ is given by [77]

$$\mathbf{G}_{cp} = \begin{pmatrix} \mathbf{0} \ \ \mathbf{I}_\nu \\ \mathbf{I}_M \end{pmatrix} \mathbf{F}^\dagger, \tag{2.64}$$

where F is the M × M DFT matrix with the component at the nth row and
mth column given by

$$[\mathbf{F}]_{n,m} = \frac{1}{\sqrt{M}}\, e^{-j\frac{2\pi}{M} nm}.$$

Consider zero forcing (ZF) equalization at the receiver side. Due to the removal
of the CP, the constant receiving matrix $\mathbf{S}_{cp}$ is given by

$$\mathbf{S}_{cp} = \mathbf{\Lambda}^{-1}\mathbf{F}\,\big(\mathbf{0} \ \ \mathbf{I}_M\big), \tag{2.65}$$

where Λ is a diagonal matrix with diagonal elements $\lambda_k$, 0 ≤ k ≤ M − 1, and
$\lambda_k$ is the kth element of the M-point DFT of the channel coefficients, i.e.,

$$\lambda_k = \sum_{n=0}^{\nu} c_n\, e^{-j\frac{2\pi}{M} nk}. \tag{2.66}$$

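From Eqs. (2.61) and (2.65), we find that $\mathbf{S}_{cp}\mathbf{C}(z)$ removes
the upper ν rows of C(z) and is given by

$$\mathbf{S}_{cp}\mathbf{C}(z) = \mathbf{\Lambda}^{-1}\mathbf{F}\mathbf{C}_{cp}, \tag{2.67}$$

where $\mathbf{C}_{cp}$ is an M × N matrix given by

$$\mathbf{C}_{cp} = \begin{pmatrix}
c_\nu & c_{\nu-1} & \cdots & c_0 & 0 & \cdots & 0 \\
0 & c_\nu & \ddots & & c_0 & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & & \ddots & 0 \\
0 & \cdots & 0 & c_\nu & \cdots & c_1 & c_0
\end{pmatrix}. \tag{2.68}$$

From Eqs. (2.67) and (2.64), the overall system function
$\mathbf{S}_{cp}\mathbf{C}(z)\mathbf{G}_{cp}$ can be written as
$$\mathbf{S}_{cp}\mathbf{C}(z)\mathbf{G}_{cp} = \mathbf{\Lambda}^{-1}\mathbf{F}\mathbf{C}_{cir}\mathbf{F}^\dagger, \tag{2.69}$$
where $\mathbf{C}_{cir}$ is an M × M circulant matrix given by

$$\mathbf{C}_{cir} = \begin{pmatrix}
c_0 & 0 & \cdots & 0 & c_\nu & \cdots & c_1 \\
c_1 & c_0 & \ddots & & \ddots & \ddots & \vdots \\
\vdots & \ddots & \ddots & \ddots & & \ddots & c_\nu \\
c_\nu & & \ddots & \ddots & \ddots & & 0 \\
0 & c_\nu & & \ddots & \ddots & \ddots & \vdots \\
\vdots & \ddots & \ddots & & \ddots & c_0 & 0 \\
0 & \cdots & 0 & c_\nu & \cdots & c_1 & c_0
\end{pmatrix}. \tag{2.70}$$

The effect of the matrix
$\begin{pmatrix} \mathbf{0} \ \ \mathbf{I}_\nu \\ \mathbf{I}_M \end{pmatrix}$ on
$\mathbf{C}_{cp}$ is to add the first ν columns to the last ν columns and then
eliminate the first ν columns. Since $\mathbf{F}\mathbf{C}_{cir}\mathbf{F}^\dagger$
is a diagonal matrix whose diagonal elements are the M-point DFT of
$(c_0, c_1, \ldots, c_\nu)$ [47], from Eqs. (2.63) and (2.69), the current Y(z)
depends only on the current X(z). Hence there is no need to use z here. The two
vectors become constant vectors, and the relationship between X and Y can be
expressed as

$$\mathbf{Y} = \mathbf{X} + \mathbf{\Lambda}^{-1}\mathbf{N}. \tag{2.71}$$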
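The chain in Eqs. (2.64) to (2.71) can be traced for a single noise-free block. The sketch below uses a direct O(M²) DFT instead of an FFT to stay self-contained. The channel taps and the symbol block are arbitrary illustrative values, and the previous block is assumed to be zero, since its tail would fall inside the discarded prefix anyway.

```python
import cmath

def dft(x, inverse=False):
    """Unitary M-point DFT (F) or its inverse (F^dagger)."""
    M = len(x)
    sign = 1 if inverse else -1
    return [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * n * k / M) for n in range(M))
            / M ** 0.5 for k in range(M)]

def cp_ofdm_zf(X, c):
    """Send one block X through channel c with a cyclic prefix and ZF equalization."""
    M, nu = len(X), len(c) - 1
    u = dft(X, inverse=True)                 # F^dagger X
    v = u[M - nu:] + u                       # prepend cyclic prefix, Eq. (2.64)
    w = [sum(c[k] * v[n - k] for k in range(nu + 1) if n - k >= 0)
         for n in range(M + nu)]             # channel, noise-free
    r = w[nu:]                               # drop the first nu samples, Eq. (2.65)
    lam = [sum(c[n] * cmath.exp(-2j * cmath.pi * n * k / M) for n in range(nu + 1))
           for k in range(M)]                # Eq. (2.66)
    R = dft(r)
    return [R[k] / lam[k] for k in range(M)]  # multiply by Lambda^{-1}

if __name__ == "__main__":
    X = [1, -1, 1j, -1j, 1, 1, -1, -1]
    Y = cp_ofdm_zf(X, [1, 0.5, 0.2])
    print(max(abs(y - x) for x, y in zip(X, Y)))  # small numerical residue
```

After the prefix is removed, the linear convolution acts as a circular one on the block, which is what makes the one-tap division by $\lambda_k$ sufficient.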

When the channel has zeros near the unit circle, the subchannels correspond-
ing to these zeros have serious fading. In this case, λk are small in these
subchannels. As a result, λ−1k are large and tend to enhance noise. Thus it
leads to relatively high bit error probability (BEP) in these subchannels. To
2.4 Multirate Representations for OFDM Systems 39

avoid significant noise enhancement, we may use minimum mean square error
(MMSE) technique. That is, define the averaged SNR

γ = Es /N0 , (2.72)

where Es is the averaged transmitted power and N0 is the noise variance. We


can choose the receiving filter bank such that

Scp C(z) = ΓFCcp , (2.73)

where Γ is a diagonal matrix with its elements given by

$$
[\boldsymbol{\Gamma}]_{k,k} = \frac{\gamma\lambda_k^*}{1 + \gamma|\lambda_k|^2}. \tag{2.74}
$$

When γ ≫ 1, the MMSE equalization in Eq. (2.73) reduces to the ZF
equalization in Eq. (2.67), since $[\boldsymbol{\Gamma}]_{k,k}$ approaches
$1/\lambda_k$ on every subchannel with $\gamma|\lambda_k|^2 \gg 1$. Please note
that both the ZF and MMSE equalizations described are called frequency domain
equalization (FEQ) since they perform equalization in the frequency domain.
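To make the difference between the two FEQ choices concrete, the following minimal sketch evaluates the ZF coefficient 1/λk and the MMSE coefficient of Eq. (2.74) in plain Python. The function names are illustrative and not taken from the text.

```python
def zf_coeff(lam):
    """ZF FEQ coefficient 1/lambda_k (unbounded as lambda_k -> 0)."""
    return 1 / lam

def mmse_coeff(lam, gamma):
    """MMSE FEQ coefficient of Eq. (2.74); gamma = Es/N0 is the averaged SNR."""
    return gamma * lam.conjugate() / (1 + gamma * abs(lam) ** 2)
```

For a deeply faded subchannel (small |λk|) the MMSE coefficient stays small instead of growing like 1/λk, which is how it avoids the noise enhancement of ZF; for large γ|λk|² the two coefficients nearly coincide.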

2.4.4 OFDM Systems with Zero Padding

Zero padding is an alternative way to avoid IBI. Instead of using a cyclic
prefix, it inserts ν zeros at the end of every transmitted block. If ν + 1 ≥ L,
where L is the multipath length, the zero padded OFDM systems are free from IBI.
Because the padded samples carry no energy, the zero padded OFDM systems
require less transmit power than cyclic-prefix OFDM systems. However, the zero
padded OFDM systems need to perform extra manipulations at the receiver side.
Fortunately, such manipulations only involve additions, so the computational
complexity is comparable with that of cyclic-prefix OFDM systems, as we will
mention later.
Referring to Fig. 2.25, due to the insertion of zeros, the transmit matrix
is given by

$$
\mathbf{G}_{zp} = \begin{pmatrix} \mathbf{I}_M \\ \mathbf{0} \end{pmatrix}\mathbf{F}^\dagger. \tag{2.75}
$$
The channel matrix remains the same as that in Eq. (2.59). Hence, from
Eqs. (2.75) and (2.59), C(z)Gzp is given by

C(z)Gzp = CL F† , (2.76)

where CL is an N × M matrix as given in Eq. (2.61). From (2.76), the z term


disappears and hence there is no IBI in this system. Let the receiving matrix
be Szp , the relationship between input and output blocks can be expressed as

Y = Szp CL F† X. (2.77)

From Eq. (2.77), for zero forcing reconstruction, we can choose Szp as the
pseudo inverse of CL F† , i.e.,

Szp = F(C†L CL )−1 C†L . (2.78)


If we have the information of the averaged SNR as defined in Eq. (2.72), the
performance can be further improved by choosing

$$
\mathbf{S}_{zp} = \mathbf{F}(\mathbf{C}_L^\dagger\mathbf{C}_L + \gamma^{-1}\mathbf{I}_M)^{-1}\mathbf{C}_L^\dagger, \tag{2.79}
$$

which is the MMSE receiver. However, the receivers implemented according
to Eqs. (2.78) and (2.79) demand matrix multiplication and inversion, and this
leads to great computational complexity. To reduce the extra computational
complexity, we can first manipulate the received symbols so that the channel
matrix CL becomes circulant, as shown in Eq. (2.70). After this manipulation,
the signal is passed through a DFT matrix F. The input and output relationship
can again be characterized by Eq. (2.71). To achieve this, let the receiving
matrix be

$$
\mathbf{S}_{zp} = \mathbf{F}\boldsymbol{\Phi}, \tag{2.80}
$$
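where Φ is an M × N permutation matrix given by

$$
\boldsymbol{\Phi} = \begin{pmatrix} \mathbf{I}_M & \begin{matrix} \mathbf{I}_\nu \\ \mathbf{0} \end{matrix} \end{pmatrix}. \tag{2.81}
$$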
Note that given an N × M matrix A, the operation ΦA simply adds
the last ν rows to the first ν rows and then eliminates the last ν rows. Hence,
we have

$$
\mathbf{S}_{zp}\mathbf{C}(z)\mathbf{G}_{zp} = \mathbf{F}\boldsymbol{\Phi}\mathbf{C}_L\mathbf{F}^\dagger = \mathbf{F}\mathbf{C}_{cir}\mathbf{F}^\dagger, \tag{2.82}
$$

which, once the one-tap FEQ $\boldsymbol{\Lambda}^{-1}$ is applied, again leads
to the same input and output relationship as that in Eq. (2.71). Note that
according to Eq. (2.80), although the OFDM systems with zero padding require
extra computations for Φ at the receiver side, this matrix is a permutation
matrix and can be implemented by additions. Hence, OFDM with zero padding can
use less transmitted power to achieve the same performance as OFDM with cyclic
prefix, while both systems can be implemented with comparable computational cost.
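The action of Φ described above is an overlap-add: the tail of the received zero-padded block is folded back onto its head. A minimal sketch in plain Python (illustrative function names, noise ignored) shows that this turns the linear convolution with the channel into the circular convolution represented by Ccir:

```python
def linear_conv(x, h):
    """Linear convolution; a zero-padded block of M samples yields M + nu outputs."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def circular_conv(x, h):
    """M-point circular convolution, i.e. multiplication by C_cir."""
    M = len(x)
    return [sum(h[l] * x[(n - l) % M] for l in range(len(h))) for n in range(M)]

def overlap_add(received, M):
    """Apply Phi: add the last nu samples to the first nu, then keep M samples."""
    folded = list(received[:M])
    for i, tail in enumerate(received[M:]):
        folded[i] += tail
    return folded
```

Only ν additions per block are needed, which matches the remark that the extra receiver work of zero padding is small.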

2.4.5 OFDM Systems with Transmitter Knows Channel Information

The transceiver filter banks discussed above can actually be regarded as
precoded communication systems. More specifically, we can regard the trans-
mitting matrices Gcp and Gzp as precoding/prefiltering that achieves the ISI-free
property. The DFT-based OFDM systems discussed above assume that only
the receiver knows the channel information. If the transmitter knows the chan-
nel information, we can design non-DFT-based OFDM systems. For example,
in [115], optimal transceiver designs based on the maximum output SNR and
minimum mean square error criteria were derived. In [76] and [77], a transceiver
that achieves the optimal bit rate is considered. Based on the optimal bit rate,
the transceiver that can achieve minimum transmit power is developed.
2.5 Precoding for OFDM Systems 41

2.5 Precoding for OFDM Systems

Although the OFDM systems introduced in the previous section can successfully
overcome the ISI effect, the subchannels with serious fading will significantly
degrade system performance. More specifically, the subchannel with the most
serious fading will have the largest BEP. As a result, it dominates the over-
all system performance. In wireline communication systems, e.g. DMT-xDSL
systems, such problems can be overcome by bit or power allocation. How-
ever, this demands feedback of channel information or SNR information for
all subchannels and may be somewhat infeasible in a rapidly changing channel
environment.
This problem may also be overcome by adding a proper precoder in the
transmitter and the corresponding postcoder in the receiver. In wireless com-
munication systems, however, channel may vary rapidly, and it is preferable
that the precoder is independent of channel so that there is no need to use
feedback. In this section, we will introduce a channel independent precoder to
mitigate BEP. More detailed treatment and its extension for this subsection
can be found in [75].
An OFDM system with precoder is shown in Fig. 2.27, where P is an
M × M unitary matrix. At the receiver side, P† is placed after S so that
the whole system can achieve perfect reconstruction property when there is
no noise. In the following subsections, we will let the matrix P be the DFT
matrix F, and thus the OFDM systems become single carrier systems.

2.5.1 Single Carrier System with Cyclic Prefix (SC-CP)

Fig. 2.27. Block diagram for a general OFDM system with precoder.

Fig. 2.28. Block diagram for a single carrier system with cyclic prefix.

Referring to Fig. 2.27, let the precoder P be an M × M DFT matrix and the
transmitting matrix Gcp be that described in Eq. (2.64). In this case, the whole
system becomes a single carrier system with cyclic prefix. Figure 2.28 shows the
block diagram of the single carrier system with cyclic prefix. The SC-CP
system has the following advantages when compared with the conventional
OFDM systems introduced in Sections 2.4.3 and 2.4.4.
1. Mitigation for High PAPR. It is well known that one of the critical
issues of conventional OFDM systems is their high PAPR (peak-to-average
power ratio). Due to the use of the IDFT in the transmitter, the
dynamic range of the IDFT output increases, and hence the peak value
increases. If we let the IDFT matrix be unitary, the average power of the
IDFT output remains the same as that of the IDFT input. As a result,
the PAPR increases. It is intuitive that the PAPR increases as the number
of subchannels increases. In the transceiver of DVB-T, the number of
subchannels is up to 8192 and hence the system has a large PAPR value.
High PAPR increases the design effort for analog circuits such as the power
amplifier, since the demand for linearity in systems with high PAPR is
much higher than in systems with low PAPR. System performance will
degrade significantly due to high PAPR. This drawback, however, can be
overcome by using SC-CP systems. The reason is that there is no need to
use the IDFT at the transmitter side. Hence, the peak value of the transmit
signal depends only on the modulation scheme. Take IEEE
802.11a/g/n, for instance, where the highest modulation scheme is 64-QAM. The
corresponding PAPR for 64-QAM is around 1.63, which is much smaller
than the PAPR in conventional OFDM systems, which is in general
greater than 10 dB.
2. Simplification of DAC. Since there is no IDFT in the transmitter, a
DAC with a low bit width may be sufficient. For instance,
for 64-QAM, the constellation has only 8 different levels on each of the
I and Q axes, i.e. (−7, −5, −3, −1, +1, +3, +5, +7). In such a
situation, we may only require a 3-bit DAC to represent the constellation
levels, which again greatly simplifies the design of analog circuits.
3. Equalized BEP for All Subchannels. Now let us discuss the most
important characteristic of the SC-CP system: it has equal BEP for all
subchannels. Refer to the receiving noise path
of the SC-CP system in Fig. 2.29 and assume that the receiving noise
qi is complex white Gaussian with zero mean and variance N0. Because the
noise samples are uncorrelated, the correlation matrix of the noise
vector is given by

$$
\mathbf{R}_q = E\left[\mathbf{q}\mathbf{q}^\dagger\right] = N_0\mathbf{I}_M. \tag{2.83}
$$

Fig. 2.29. Receiving noise path of a single carrier system with cyclic prefix.

Let the output noise vector be

$$
\mathbf{e} = (e_0 \;\; e_1 \;\; \cdots \;\; e_{M-1}).
$$

Now we would like to obtain the correlation matrix of e. Assume that we
are using the zero forcing equalization. In this case, the FEQ coefficients
are

$$
f_k = 1/\lambda_k, \quad 0 \le k \le M-1.
$$

Hence, the output noise vector is given by

$$
\mathbf{e} = \mathbf{F}^\dagger\boldsymbol{\Lambda}^{-1}\mathbf{F}\mathbf{q}. \tag{2.84}
$$

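Using Eqs. (2.83) and (2.84), we have the correlation matrix of the output
noise vector given by

$$
\mathbf{R}_e = E\left[\mathbf{e}\mathbf{e}^\dagger\right]
= \mathbf{F}^\dagger\boldsymbol{\Lambda}^{-1}\mathbf{F}\,E\left[\mathbf{q}\mathbf{q}^\dagger\right]\mathbf{F}^\dagger(\boldsymbol{\Lambda}^{-1})^\dagger\mathbf{F}
= N_0\,\mathbf{F}^\dagger(\boldsymbol{\Lambda}\boldsymbol{\Lambda}^\dagger)^{-1}\mathbf{F}. \tag{2.85}
$$

To obtain the output noise variance, we only need to obtain the diagonal
elements of Re since the diagonal elements are the output noise variances.
From Eq. (2.85), since $(\boldsymbol{\Lambda}\boldsymbol{\Lambda}^\dagger)^{-1}$ is
diagonal with its diagonal elements being
$(1/|\lambda_0|^2 \;\; 1/|\lambda_1|^2 \;\; \cdots \;\; 1/|\lambda_{M-1}|^2)$,
$\mathbf{R}_e = N_0\mathbf{F}^\dagger(\boldsymbol{\Lambda}\boldsymbol{\Lambda}^\dagger)^{-1}\mathbf{F}$
is a circulant matrix and its diagonal elements are $N_0$ times the 0th output
component of the IDFT of
$(1/|\lambda_0|^2 \;\; 1/|\lambda_1|^2 \;\; \cdots \;\; 1/|\lambda_{M-1}|^2)$, i.e.,

$$
[\mathbf{R}_e]_{k,k} = \frac{N_0}{M}\sum_{m=0}^{M-1}\frac{1}{|\lambda_m|^2}, \quad \text{for all } k. \tag{2.86}
$$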

From Eqs. (2.85) and (2.86), the variance of output noise for every sub-
channel is the same, i.e.,

$$
\sigma_{e_k}^2 = E\left[|e_k|^2\right] = [\mathbf{R}_e]_{k,k} = \frac{N_0}{M}\sum_{m=0}^{M-1}\frac{1}{|\lambda_m|^2}. \tag{2.87}
$$

Assume the transmitter power is Es. Due to the use of zero forcing equal-
ization and from Eq. (2.87), the output SNR for all subchannels is the same
and is given by

$$
SNR_k = \frac{E_s M}{N_0\sum_{m=0}^{M-1} 1/|\lambda_m|^2}, \quad \text{for all } k. \tag{2.88}
$$

Since the output SNR is the same (independent of k), all subchannels
have the same BEP in this situation.
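As a numerical illustration of Eq. (2.88), the sketch below compares the single SC-CP output SNR with the per-subchannel SNRs of conventional OFDM under ZF equalization. The OFDM expression γ|λk|² is not stated explicitly in the text; it is assumed here from Eq. (2.71), where the noise on subchannel k is scaled by 1/λk. Function names are illustrative.

```python
def ofdm_zf_snrs(lams, gamma):
    """Per-subchannel SNR of conventional OFDM with ZF FEQ (assumed: gamma*|lambda_k|^2)."""
    return [gamma * abs(lam) ** 2 for lam in lams]

def sc_cp_zf_snr(lams, gamma):
    """Common output SNR of SC-CP with ZF FEQ, Eq. (2.88), with gamma = Es/N0."""
    M = len(lams)
    return gamma * M / sum(1 / abs(lam) ** 2 for lam in lams)
```

The SC-CP value is the harmonic mean of the OFDM values, so it always lies between the worst and the best OFDM subchannel SNR and is pulled towards the worst one.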
The BEP result for SC-CP is very different from that of the conventional
OFDM systems. In conventional OFDM systems, the BEPs of the individ-
ual subchannels are in general different. Thus, the performance of the
subchannel with the most serious fading will dominate the overall performance.
Hence, we usually use channel coding to overcome the problem of subchannels in
deep fading. On the contrary, SC-CP systems have the same BEP for all subchan-
nels. If the common SNR of SC-CP is good, there may be no need
to use channel coding. If the common SNR of SC-CP is bad, how-
ever, channel coding may not be able to rescue the system performance
because channel codes such as convolutional codes are designed to correct
localized errors rather than errors that affect the whole block. Observing
Eq. (2.88), if any λk is small (deep fading), the summation term
$\sum_{m=0}^{M-1} 1/|\lambda_m|^2$ is large. Hence, the SNR values of all
subchannels are equally bad. To overcome this problem, we can use the MMSE
equalization instead of the zero forcing equalization, as described below.
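MMSE Equalization for SC-CP. Let the equalization coefficients be

$$
f_k = \frac{\gamma\lambda_k^*}{1 + \gamma|\lambda_k|^2}, \quad 0 \le k \le M-1. \tag{2.89}
$$

Let Γ be the diagonal matrix consisting of fk, i.e., Γ = diag(f0, f1 · · · fM−1).
In this case, the input and output relationship is given by

$$
\mathbf{Y} = \mathbf{F}^\dagger\boldsymbol{\Gamma}\mathbf{F}\mathbf{C}_{cir}\mathbf{X} + \mathbf{F}^\dagger\boldsymbol{\Gamma}\mathbf{F}\mathbf{q}, \tag{2.90}
$$

where the first term is contributed by the transmitted symbols and the
second term is contributed by noise. Since Ccir = F†ΛF, Eq. (2.90) can be
rewritten as

$$
\mathbf{Y} = \mathbf{F}^\dagger\boldsymbol{\Gamma}\boldsymbol{\Lambda}\mathbf{F}\mathbf{X} + \mathbf{F}^\dagger\boldsymbol{\Gamma}\mathbf{F}\mathbf{q}
= \mathbf{X} + \underbrace{\mathbf{F}^\dagger(\boldsymbol{\Gamma}\boldsymbol{\Lambda} - \mathbf{I}_M)\mathbf{F}\mathbf{X} + \mathbf{F}^\dagger\boldsymbol{\Gamma}\mathbf{F}\mathbf{q}}_{\mathbf{e}}. \tag{2.91}
$$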

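Note that the output noise vector in MMSE equalization contains the contri-
bution from both noise and transmitted symbols. Let us obtain the output
noise variance as follows. Assuming that the transmitted symbol vector X and the
noise vector q are uncorrelated, and using Eqs. (2.83) and (2.91), we have

$$
\mathbf{R}_e = \mathbf{F}^\dagger\underbrace{\left[E_s(\boldsymbol{\Gamma}\boldsymbol{\Lambda} - \mathbf{I}_M)(\boldsymbol{\Gamma}\boldsymbol{\Lambda} - \mathbf{I}_M)^\dagger + N_0\boldsymbol{\Gamma}\boldsymbol{\Gamma}^\dagger\right]}_{\mathbf{D}}\mathbf{F}. \tag{2.92}
$$

Since D in Eq. (2.92) is a diagonal matrix, Re is again a circulant matrix.
Next let us obtain the diagonal elements of D, which are denoted by dk, where
0 ≤ k ≤ M − 1. From Eqs. (2.89) and (2.92), we have

$$
\begin{aligned}
d_k &= E_s\left[|\lambda_k|^2\left|\frac{\gamma\lambda_k^*}{1+\gamma|\lambda_k|^2}\right|^2 - 2\,\frac{\gamma|\lambda_k|^2}{1+\gamma|\lambda_k|^2} + 1\right] + N_0\left|\frac{\gamma\lambda_k^*}{1+\gamma|\lambda_k|^2}\right|^2 \\
&= \frac{E_s + N_0\gamma^2|\lambda_k|^2}{(1+\gamma|\lambda_k|^2)^2}.
\end{aligned} \tag{2.93}
$$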
Since γ = Es/N0, we can rewrite Eq. (2.93) as

$$
d_k = E_s\,\frac{1+\gamma|\lambda_k|^2}{(1+\gamma|\lambda_k|^2)^2} = \frac{E_s}{1+\gamma|\lambda_k|^2}. \tag{2.94}
$$

The diagonal elements of Re are given by the 0th component of the IDFT of
(d0 d1 · · · dM−1). Thus, we have

$$
[\mathbf{R}_e]_{k,k} = \frac{E_s}{M}\sum_{m=0}^{M-1}\frac{1}{1+\gamma|\lambda_m|^2}. \tag{2.95}
$$

From Eq. (2.95), the SNR is the same in all subchannels and is given
by

$$
SNR_k = \frac{M}{\sum_{m=0}^{M-1}\dfrac{1}{1+\gamma|\lambda_m|^2}}. \tag{2.96}
$$

Comparing Eq. (2.96) with Eq. (2.88), even if some subchannels have deep fad-
ing, the MMSE equalization will not lead to a bad SNR value for all subchannels.
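The comparison between Eqs. (2.88) and (2.96) can be tried on a channel with one deeply faded subchannel. This is a minimal sketch with illustrative function names; the example values of λk and γ used in the test below are arbitrary assumptions, not taken from the text.

```python
def sc_cp_zf_snr(lams, gamma):
    """Output SNR of SC-CP with ZF equalization, Eq. (2.88)."""
    M = len(lams)
    return gamma * M / sum(1 / abs(lam) ** 2 for lam in lams)

def sc_cp_mmse_snr(lams, gamma):
    """Output SNR of SC-CP with MMSE equalization, Eq. (2.96)."""
    M = len(lams)
    return M / sum(1 / (1 + gamma * abs(lam) ** 2) for lam in lams)
```

Because 1/(1 + γ|λ|²) is bounded by 1 while 1/(γ|λ|²) is unbounded, a single small λk can dominate the ZF sum but contributes at most 1 to the MMSE sum.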

2.5.2 Single Carrier System with Zero Padding (SC-ZP)

Another method for single carrier systems to avoid IBI is zero padding, which
is similar to the OFDM systems with zero padding discussed in Section 2.4.4
except that now we have a precoder. Referring to Fig. 2.28, SC-ZP is the same
as SC-CP except that adding the CP is replaced by zero padding. Hence, let
P = F and Gzp be the same as that in Eq. (2.75). For zero forcing and efficient
implementation, Szp can be chosen as Λ−1 FΦ, i.e., the receiving matrix in
Eq. (2.80) followed by the FEQ coefficients 1/λk. Hence, we have the
input and output relationship given by

Y = P† Szp C(z)Gzp PX = X + F† Λ−1 Fq,

which leads to the same results as that of SC-CP with zero forcing equalization
in Eqs. (2.83–2.88).
If we keep all conditions unchanged except that Szp = ΓFΦ, with Γ as in
Eq. (2.89), it becomes SC-ZP with MMSE equalization. In this case, the same
results as in Eqs. (2.90–2.96) can be obtained. Similar to OFDM systems,
SC-ZP can use lower transmitter power to achieve the same performance as
SC-CP with a few extra additions in the receiver.
We can also use the pseudo inverse solution with zero forcing to recon-
struct the transmitted symbols. This can be done by keeping the transmitter
unchanged and letting Szp be chosen as that in Eq. (2.78), i.e., the pseudo
inverse receiver. In addition, let P† = F†. In this case, the input and
output relationship is given by

Y = X + (C†L CL )−1 C†L q.

For the pseudo inverse solution with the MMSE receiver, we can choose Szp as that
in Eq. (2.79) and P† = F†. The input and output relationship in this case can
be expressed as

Y = (C†L CL + γ −1 IM )−1 C†L CL X + (C†L CL + γ −1 IM )−1 C†L q.

For further reading, a channel independent OFDM precoder to combat
subchannel nulls was proposed in [153]. A linear precoder for OFDM to achieve
maximum diversity gain was considered in [149]. The SC-CP system was first
proposed in [113] to investigate the possible application for digital terrestrial
TV. In [78], the authors proved that SC-CP achieves the minimum BEP for a
moderate SNR range over QPSK modulation. The result is further extended
to higher constellation cases in [75].
3 Precoding Techniques in Multiple Access Channels

The direct-sequence code division multiple access (DS-CDMA) system accom-
modates more than one user in a single frequency band simultaneously by the
use of spreading codes. For example, the data symbol of the kth user in the
DS-CDMA system is first modulated onto a unique spreading code and then
transmitted. If code information is available at the receiver, the kth receiver
employs a matched filter (MF), which simply matches the received signal with
the code sequence while rejecting the multiple access interference (MAI) for
symbol decoding.
The MAI immunity of the MF receiver depends on both the spreading
code design and the channel condition. In a synchronous frequency flat fading
channel, the orthogonality between different code channels can be achieved
by employing Hadamard-Walsh codes, which suggests that the MF receiver
can completely remove MAI at its output. However, when the sig-
nal bandwidth is greater than the channel coherence bandwidth, the transmit
signal undergoes a frequency selective fading channel, and multiple copies of
the transmit signal coming from different propagation paths will be observed
at different time delays at the receiver. This channel, known as a multipath
channel, not only destroys the code orthogonality but also introduces over-
lapped transmit signals at the receiver. As a result, the system performance
of the MF receiver degrades due to the presence of interference, namely, MAI,
inter-chip interference (ICI), and possibly inter-symbol interference (ISI) if a
proper number of guard chips is not inserted between any two consecutive
transmit symbols.
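The loss of code orthogonality under multipath can be seen with a small experiment. The sketch below builds Hadamard-Walsh codes with the Sylvester construction (an assumption made here for illustration; the text does not specify a construction) and correlates two codes with and without a one-chip delay. Function names are illustrative.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix; n must be a power of two."""
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H

def correlate(a, b, shift=0):
    """Aperiodic correlation of code a with code b delayed by `shift` chips."""
    return sum(a[i] * b[i - shift] for i in range(len(a))
               if 0 <= i - shift < len(b))
```

With chip-synchronous reception (shift 0) distinct rows are orthogonal, so the MF output contains no MAI; a delayed copy arriving over a second path generally has a non-zero correlation with the other codes, which is the interference mechanism described above.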
The DS-CDMA performance in the multipath channel can be improved by
employing other receiver design schemes, such as the Rake receiver [105], the
decorrelator [81], the minimum mean square error (MMSE) receiver [83] or
the multiuser detector [144]. These schemes, which are complex in design and
consume more power for symbol decoding than the MF receiver, are more
likely to be implemented at the base station than at the mobile handset
because of the handset's size and battery constraints. Consequently, the
performance of the downlink DS-CDMA system is worse than that of its uplink
counterpart.

C.-C.J. Kuo et al., Precoding Techniques for Digital Communication Systems,
DOI: 10.1007/978-0-387-71769-2_3, © Springer Science+Business Media, LLC 2008

In order to improve the downlink performance while maintaining a simple
receiver design, the idea of transmit precoding, which shifts the decoding
complexity from the receiver to the transmitter, has been proposed in the
literature and is attracting increasing attention. Given that the channel
information of all users is known to the transmitter, the transmitter can
pre-process the original transmit data according to the channel condition and
the transmit signals of all co-channel users such that the resultant MF output
contains either more signal power or less interference as compared to the
system without precoding.
The precoder design necessitates the channel knowledge at the transmitter.
If the channel varies slowly, the channel information can either be acquired
via a feedback link from the receiver to the transmitter or be estimated
at the transmitter by exploiting the channel reciprocity property, which is
inherent in the time division duplex (TDD) system [12, 24, 25, 147].
However, the channel responses estimated at the two ends may not be
exactly the same due to imperfect RF circuits [96, 137]. Therefore, additional
calibration is necessary. In this chapter, we do not consider the problem of
channel information mismatch and simply assume that the channel knowledge
at the transmitter is ideal. Next, three different linear precoding schemes
proposed in the previous literature, namely the transmit matched filter (Tx-MF),
the transmit zero-forcing filter (Tx-ZF), and the transmit Wiener filter
(Tx-Wiener), are reviewed in the following sections.

3.1 System Model

The block diagram of a generic downlink DS-CDMA system with precoding
is given in Fig. 3.1, where the base station and each mobile user employ only
one transmit antenna and one receive antenna, respectively. The binary symbol
$b_k(i) \in \{+1, -1\}$, which takes both values with equal probability and
denotes the ith data symbol for the kth user,
is modeled as an independent, identically distributed (i.i.d.) random variable
for all i and k. The data symbol is first spread by the unit-power, N-chip
spreading code, sk, given as

$$
\mathbf{s}_k = [s_{k,0}, \cdots, s_{k,N-1}]^t, \quad \mathbf{s}_k^\dagger\mathbf{s}_k = 1, \tag{3.1}
$$

where $s_{k,i} \in \{+1/\sqrt{N}, -1/\sqrt{N}\}$, ∀k, i. The spread signal for the
kth user, xk(t), is represented as

$$
x_k(t) = \sum_{i=-\infty}^{\infty}\sum_{j=0}^{N-1} s_{k,j}\, b_k(i)\, w(t - jT_c - iT_s). \tag{3.2}
$$

Fig. 3.1. The block diagram of the precoded SISO TDD DS-CDMA system.

The precoder for the kth user pk(t), which is an Lp-tap finite impulse response
(FIR) filter, is described as

$$
p_k(t) = \sum_{i=0}^{L_p-1} p_{k,i}\,\delta_D(t - iT_c), \tag{3.3}
$$

where pk,i, which denotes the ith tap coefficient, is designed according to
different criteria. The spread signal xk(t) is first passed through the corre-
sponding precoder pk(t) before transmission and the output of the precoder
pk(t) is

$$
y_k(t) = x_k(t) * p_k(t). \tag{3.4}
$$

The transmit signal y(t), which is the sum of all K prefilter output signals
yk(t), ∀k = 1, · · · , K, is given as

$$
y(t) = \sum_{k=1}^{K} y_k(t). \tag{3.5}
$$

Please note that we omit the carrier frequency in y(t) since we represent our
transmit signal in the equivalent baseband.
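At chip rate, Eqs. (3.2)–(3.5) amount to spreading each symbol by the user's code, filtering with the user's precoder taps, and summing over users. The following is a minimal chip-level sketch; it assumes one sample per chip, so the chip waveform w(t) is not modeled, and all function names are illustrative.

```python
def spread(bits, code):
    """Chip sequence of Eq. (3.2): each symbol multiplies the whole N-chip code."""
    return [b * c for b in bits for c in code]

def fir(x, taps):
    """Chip-spaced FIR filtering, i.e. the convolution of Eq. (3.4)."""
    y = [0] * (len(x) + len(taps) - 1)
    for i, xi in enumerate(x):
        for j, tj in enumerate(taps):
            y[i + j] += xi * tj
    return y

def transmit_signal(bits_per_user, codes, precoders):
    """Sum of all precoded user signals, Eq. (3.5)."""
    streams = [fir(spread(b, s), p)
               for b, s, p in zip(bits_per_user, codes, precoders)]
    length = max(len(s) for s in streams)
    return [sum(s[n] for s in streams if n < len(s)) for n in range(length)]
```

With the trivial precoder [1] for every user the output reduces to conventional downlink DS-CDMA; an Lp-tap precoder lengthens each user's stream by Lp − 1 chips.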
Here we adopt the block fading channel model, i.e., the channel coefficients
remain unchanged during a block of transmit symbols and then change in-
dependently from block to block. Also, channels of different users are realized
independently. Let hk(t) be the channel response for the kth user, which is mod-
eled as an L-tap FIR filter given as

$$
h_k(t) = \sum_{i=0}^{L-1} h_{k,i}\,\delta_D(t - iT_c), \tag{3.6}
$$

where the ith channel tap, hk,i, is described as a complex Gaussian random
variable with zero mean and variance $\sigma^2$, i.e.,

$$
h_{k,i} \sim \mathcal{CN}(0, \sigma^2). \tag{3.7}
$$

After passing through the channel, the transmit signal is distorted by the
multipath channel and contaminated by the noise, and the received signal at
the kth receiver is

$$
r_k(t) = h_k(t) * y(t) + n_k(t), \tag{3.8}
$$


where nk(t) is a zero mean, circularly complex Gaussian process with power
spectral density equal to $\sigma_n^2$. The kth mobile device performs chip wave-
form matching and then takes one sample per chip interval. As a result, the
equivalent discrete representation for the received signal rk(t) becomes

$$
\begin{aligned}
\mathbf{r}_k(i) &= [r_{k,0}(i), \cdots, r_{k,L+L_p+N-3}(i)]^t \\
&= \sum_{j=1}^{K}\mathbf{S}_j\bar{\mathbf{h}}_{j,k}\, b_j(i) + \mathbf{I}_k(i) + \mathbf{n}_k(i) \\
&= \sum_{j=1}^{K}\mathbf{S}_j\mathbf{H}_k\mathbf{p}_j\, b_j(i) + \mathbf{I}_k(i) + \mathbf{n}_k(i),
\end{aligned} \tag{3.9}
$$

where

$$
\bar{\mathbf{h}}_{j,k} = [\bar{h}_{j,k,0}, \cdots, \bar{h}_{j,k,L+L_p-2}]^t = \mathbf{H}_k\mathbf{p}_j \tag{3.10}
$$

denotes the effective channel response for the signal bj(i) to the kth receiver,

$$
\mathbf{I}_k(i) = [I_{k,0}(i), \cdots, I_{k,L+L_p+N-3}(i)]^t \tag{3.11}
$$

and

$$
\mathbf{n}_k(i) = [n_{k,0}(i), \cdots, n_{k,L+L_p+N-3}(i)]^t \tag{3.12}
$$

denote the corresponding interference and noise vectors for the kth user,
respectively,

$$
\mathbf{p}_j = [p_{j,0}, \cdots, p_{j,L_p-1}]^t \tag{3.13}
$$

is an Lp × 1 precoding vector for the jth user, and

$$
\mathbf{S}_j =
\begin{bmatrix}
s_{j,0} & & \\
\vdots & \ddots & \\
s_{j,N-1} & & s_{j,0} \\
 & \ddots & \vdots \\
 & & s_{j,N-1}
\end{bmatrix}_{(L+L_p+N-2)\times(L+L_p-1)} \tag{3.14}
$$

and

$$
\mathbf{H}_j =
\begin{bmatrix}
h_{j,0} & & \\
\vdots & \ddots & \\
h_{j,L-1} & & h_{j,0} \\
 & \ddots & \vdots \\
 & & h_{j,L-1}
\end{bmatrix}_{(L+L_p-1)\times L_p} \tag{3.15}
$$


are two Toeplitz matrices for the convolution operation. In addition, every
element in nk (i) is modeled as an i.i.d. circularly complex Gaussian random
variable and is denoted as

nk,j (i) ∼ CN (0, σn2 ). (3.16)
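The role of the Toeplitz matrices in Eqs. (3.14) and (3.15) is that a matrix-vector product reproduces a convolution, e.g. Hk pj equals the convolution of the channel taps with the precoder taps as in Eq. (3.10). A minimal sketch with illustrative function names:

```python
def convolution_matrix(h, n_cols):
    """Toeplitz matrix of size (len(h) + n_cols - 1) x n_cols, as in Eq. (3.15)."""
    L = len(h)
    return [[h[i - j] if 0 <= i - j < L else 0 for j in range(n_cols)]
            for i in range(L + n_cols - 1)]

def matvec(A, x):
    """Plain matrix-vector product."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def linear_conv(x, h):
    """Direct linear convolution, used here only as a reference."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y
```

The same helper builds Sj of Eq. (3.14) when called with the N code chips and L + Lp − 1 columns.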

To simplify the following analysis, we assume that the noise power is the same
for all users. Here we restrict each receiver to a simple MF receiver,1
which first synchronizes to the Lp th effective channel tap in
$\bar{\mathbf{h}}_{k,k}$ and then performs
the despreading operation to get the decision statistic zk(i), i.e.,

$$
z_k(i) = \mathbf{f}_k^\dagger\mathbf{r}_k(i), \tag{3.17}
$$

where

$$
\mathbf{f}_k = [\underbrace{0, \cdots, 0}_{1\times(L_p-1)}, s_{k,0}, \cdots, s_{k,N-1}, \underbrace{0, \cdots, 0}_{1\times(L-1)}]^t. \tag{3.18}
$$

By substituting (3.9) into (3.17), we can simplify zk(i) as

$$
\begin{aligned}
z_k(i) &= \mathbf{f}_k^\dagger\mathbf{r}_k(i) \\
&= \mathbf{f}_k^\dagger\left(\sum_{j=1}^{K}\mathbf{S}_j\mathbf{H}_k\mathbf{p}_j\, b_j(i) + \mathbf{I}_k(i) + \mathbf{n}_k(i)\right) \\
&= \sum_{j=1}^{K}\mathbf{r}_{j,k}^{(0),\dagger}\mathbf{H}_k\mathbf{p}_j\, b_j(i) + I_k(i) + n_k(i),
\end{aligned} \tag{3.19}
$$

where $I_k(i) = \mathbf{f}_k^\dagger\mathbf{I}_k(i)$,

$$
n_k(i) = \mathbf{f}_k^\dagger\mathbf{n}_k(i) \sim \mathcal{CN}(0, \sigma_n^2), \tag{3.20}
$$

$$
\mathbf{r}_{j,k}^{(0)} = [r_{j,k}(L_p-1), \cdots, r_{j,k}(0), \cdots, r_{j,k}(-L+1)]^t, \tag{3.21}
$$

and

$$
r_{j,k}(d) = \sum_{i=0}^{N-1} s_{j,i+d}\, s_{k,i} \tag{3.22}
$$

measures the cross-correlation between sj and sk. Please note that the vari-
ables, vectors, and matrices shown in (3.19) are functions of the precoder
length Lp, which we omit for notational simplicity. The kth receiver
estimates the ith transmit symbol as

$$
\bar{b}_k(i) = \mathrm{sign}\left\{\Re\{z_k(i)\}\right\}. \tag{3.23}
$$
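The cross-correlation of Eq. (3.22) and the decision rule of Eq. (3.23) are simple enough to write out directly. In this minimal sketch, chips with an index outside 0, …, N − 1 are treated as zero, and a decision statistic with zero real part is mapped to +1; both conventions are assumptions made here, since the text does not state them. Function names are illustrative.

```python
def cross_correlation(s_j, s_k, d):
    """r_{j,k}(d) of Eq. (3.22); chips outside the code are assumed to be zero."""
    return sum(s_j[i + d] * s_k[i] for i in range(len(s_k))
               if 0 <= i + d < len(s_j))

def mf_decision(z):
    """Symbol decision of Eq. (3.23): sign of the real part of z_k(i)."""
    return 1 if z.real >= 0 else -1
```

For a unit-power code the zero-lag autocorrelation equals one, so the entry r_{k,k}(0) of Eq. (3.21) passes the desired symbol with unit gain.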


1
It is worthwhile to point out that a better system performance can be
achieved by combining more than one channel tap in $\bar{h}_{k,k}(t)$. However,
this idea, which necessitates a Rake receiver, is complicated for mobile devices
and hence is not considered here.

3.2 Transmit Matched Filter


The frequency selective fading channel by nature leads to multiple propagation
paths from the transmitter to the receiver, i.e., the signal traveling along each
path experiences an independent path gain and arrives at the receiver with a
different time delay. The multipath channel provides additional frequency
diversity, which protects the received signal from deep fading since it is less
likely that all the channel gains are in deep fading at the same time. Yet the
conventional MF receiver, which synchronizes to one path only, fails to collect
enough of the signal power scattered in the channel for symbol decoding. As a
result, the performance of the MF receiver in the multipath channel is not good.
In order to exploit this multipath diversity, the Rake receiver with several
fingers, each of which is synchronized to an individual path, is usually adopted
in the frequency selective fading channel [105]. Apparently, there is a trade-off
between the system performance and the number of Rake fingers. The required
number of fingers could be large, especially in some extremely frequency se-
lective fading environments, such as the ultra-wideband (UWB) channel [151].
Since the receiver cost increases dramatically as the number of fingers grows, the
Rake receiver is not attractive for low cost mobile system design.
The idea of shifting the complexity of multipath combining from the mobile
receiver to the base station, called Pre-Rake, which utilizes the time-reversed
channel impulse response as its precoder, was first proposed for the single-
input single-output (SISO) TDD-DS-CDMA system by Esmailzadeh et al. in
[34] and later generalized to the multi-input single-output (MISO) TDD-DS-
CDMA system by Choi et al. in [24] to exploit additional transmit diversity
at the base station. A similar idea, called time-reversal prefilter (TRP), was
first proposed for underwater acoustic signal processing in [33, 40, 51] and
later for wireless communication scenarios in [68, 94, 123]. In fact, both
TRP and Pre-Rake can be viewed as special cases of the transmit matched
filter (Tx-MF), which will be derived next.
Usually, the precoder length is less than or equal to that of the channel response,
i.e., Lp ≤ L. Let us now consider the special case Lp = L. The reason for
choosing Lp = L will become clear later in this section. Given Lp = L, the
decision statistic for the ith transmit symbol of the kth user, zk(i) in (3.19),
can be represented as

$$
\begin{aligned}
z_k(i) &= \sum_{j=1}^{K}\mathbf{r}_{j,k}^{(0),\dagger}\mathbf{H}_k\mathbf{p}_j\, b_j(i) + I_k(i) + n_k(i) \\
&= \mathbf{r}_{k,k}^{(0),\dagger}\mathbf{H}_k\mathbf{p}_k\, b_k(i) + \sum_{j=1;\, j\neq k}^{K}\mathbf{r}_{j,k}^{(0),\dagger}\mathbf{H}_k\mathbf{p}_j\, b_j(i) + I_k(i) + n_k(i).
\end{aligned} \tag{3.24}
$$
The first term on the right-hand side of (3.24) denotes the desired signal for
the kth user. The goal of Tx-MF is to maximize the desired signal power in
zk(i) subject to the limited transmit power Ek. Hence, the precoder design
problem can be formulated as

$$
\mathbf{p}_k^{(Tx\text{-}MF)} = \arg\max_{\mathbf{p}_k} E\left[\left|\mathbf{r}_{k,k}^{(0),\dagger}\mathbf{H}_k\mathbf{p}_k\, b_k(i)\right|^2\right]
\quad \text{s.t.} \quad E\left[\left|\mathbf{S}_k\mathbf{p}_k\, b_k(i)\right|^2\right] = E_k. \tag{3.25}
$$

Please note that Sk above is an (Lp + N − 1) × Lp Toeplitz matrix, whose first
column is

$$
[s_{k,0}, \cdots, s_{k,N-1}, \underbrace{0, \cdots, 0}_{1\times(L_p-1)}]^t. \tag{3.26}
$$
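The above constrained optimization problem is solved in Section 3.5.1, and
we have the Tx-MF for the kth user as

$$
\mathbf{p}_k^{(Tx\text{-}MF)} = \sqrt{\frac{E_k}{\mathbf{r}_{k,k}^{(0),\dagger}\mathbf{H}_k\mathbf{R}_{S_k}^{-1}\mathbf{H}_k^\dagger\mathbf{r}_{k,k}^{(0)}}}\;\mathbf{R}_{S_k}^{-1}\mathbf{H}_k^\dagger\mathbf{r}_{k,k}^{(0)}, \tag{3.27}
$$

where $\mathbf{R}_{S_k}$ is defined in (3.50). It is worthwhile to comment that a
different optimization criterion for Tx-MF, for example,

$$
\mathbf{p}_k^{(Tx\text{-}MF)} = \arg\max_{\mathbf{p}_k} E\left[\left(\Re\left\{\mathbf{r}_{k,k}^{(0),\dagger}\mathbf{H}_k\mathbf{p}_k\, b_k(i)\right\}\right)^2\right]
\quad \text{s.t.} \quad E\left[\left|\mathbf{S}_k\mathbf{p}_k\, b_k(i)\right|^2\right] = E_k, \tag{3.28}
$$

is also considered in [24] since the decoding performance is only related to the
real part of the desired signal. However, it is easy to show that both (3.25)
and (3.28) lead to the same solution.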
Let us consider a special case when the autocorrelation property of sk is
ideal, i.e.,

$$
r_{k,k}(i) = \begin{cases} 1, & i = 0, \\ 0, & \text{elsewhere.} \end{cases} \tag{3.29}
$$

Therefore, $\mathbf{R}_{S_k}$ and $\mathbf{r}_{k,k}^{(0)}$ become

$$
\mathbf{R}_{S_k} = \mathbf{I}_{2L-1} \tag{3.30}
$$

and

$$
\mathbf{r}_{k,k}^{(0)} = [\underbrace{0, \cdots, 0}_{L-1}, 1, \underbrace{0, \cdots, 0}_{L-1}]^t. \tag{3.31}
$$

By substituting (3.31) and (3.30) into (3.27), the Tx-MF becomes

$$
\mathbf{p}_k^{(Tx\text{-}MF)} = \frac{\sqrt{E_k}}{\|\bar{\mathbf{h}}_k\|}\,\bar{\mathbf{h}}_k, \tag{3.32}
$$

where

$$
\bar{\mathbf{h}}_k = [h_{k,L-1}, \cdots, h_{k,0}]^t \tag{3.33}
$$

is the reversed-order channel tap vector, and ||x|| denotes the 2-norm of vector
x. Equation (3.32) is the well-known Pre-Rake (or TRP), which is a special
case of Tx-MF when the autocorrelation property of the spreading code sk
is perfect. In other words, the Pre-Rake or TRP, which utilizes the time-
order reversed channel impulse response as its precoder, does not take the
spreading code structure into account in its precoder design. In addition, the
MF receiver synchronizes to the channel tap $\bar{h}_{k,k,L-1}$, whose power is

$$
\left|\bar{h}_{k,k,L-1}\right|^2 = \left(\mathbf{r}_{k,k}^{(0),\dagger}\mathbf{H}_k\mathbf{p}_k^{(Tx\text{-}MF)}\right)^2 = \sum_{l=0}^{L-1}|h_{k,l}|^2, \tag{3.34}
$$

which suggests that the full multipath diversity is achieved at the MF receiver
output. This is because all the channel taps are combined coherently after
some delay, and we denote $\bar{h}_{k,k,L-1}$ as the peak channel tap in
$\bar{\mathbf{h}}_{k,k}$. On the
contrary, the off-peak taps $\bar{h}_{k,k,j}$, ∀j ≠ L − 1, are much
weaker than $\bar{h}_{k,k,L-1}$ since the original channel taps in hk are combined
incoherently. Obviously, the precoder length Lp also affects the concentrated
peak signal power in $\bar{\mathbf{h}}_{k,k}$. In order to achieve the full
diversity, we let Lp = L.
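The peak-forming behaviour of the Pre-Rake can be reproduced with a few lines of Python. This is a minimal sketch under two assumptions that go beyond the printed equations: the precoder taps are the complex conjugates of the time-reversed channel taps (for real taps this is exactly Eq. (3.32)), and the spreading code is ignored, as in the ideal-autocorrelation case. Function names are illustrative.

```python
import math

def pre_rake(h, energy=1.0):
    """Time-reversed (and conjugated, an assumption for complex taps) channel,
    scaled so that the precoder has total energy `energy`, cf. Eq. (3.32)."""
    norm = math.sqrt(sum(abs(t) ** 2 for t in h))
    return [math.sqrt(energy) * t.conjugate() / norm for t in reversed(h)]

def linear_conv(x, h):
    """Effective channel: channel taps convolved with precoder taps."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y
```

For a channel with L taps, `linear_conv(h, pre_rake(h))` has 2L − 1 taps and its tap with index L − 1 equals the 2-norm of the channel (for unit energy), so its power is the sum of all |h_{k,l}|², in line with Eq. (3.34).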
Example 3.1: Rather than considering the autocorrelation function of a
specific spreading code, a SISO Pre-Rake transmit filter example is given
in Fig. 3.2. We assume that all the channel taps are real so that they can
be plotted in a two-dimensional coordinate system. $h_k(t)$, $p_k(t)$, and
$\bar{h}_{k,k}(t)$ are the
responses of the original channel, the precoder, and the resultant channel for
the kth user, respectively. In addition, the total power in $h_k(t)$ is normalized
to one and the inter-arrival time between two consecutive taps is set as 1 ms.
An obvious peak signal is present in $\bar{h}_{k,k}(t)$ at 4 ms, which is the maximum
tap delay of $h_k(t)$.

Fig. 3.2. The example of Pre-Rake transmit prefilter for the kth user. (Panels:
$h_k(t)$, $p_k(t)$, and $\bar{h}_{kk}(t)$; axes: amplitude versus time in ms.)

Example 3.2: In this example, we demonstrate the performance
improvement obtained from additional Tx-MF precoding. The downlink
CDMA channel contains 10 users, each of which utilizes a unique Gold se-
quence of 31 chips in length. The channel response of every user consists of
5 multipath components, which are realized independently for different
paths and different users. Furthermore, the length of the Tx-MF precoder is
fixed to that of the channel response. The BEP curve corresponding to the
Tx-MF is plotted in Fig. 3.3 at different SNR values. The performance curve
of the conventional MF, which synchronizes to the strongest channel tap
for decoding, is also provided as the performance benchmark. It is observed
that the gain from additional Tx-MF precoding is at least 4 dB and it
becomes more obvious as the signal power goes up.
The prefilter response depends on the current channel realization between
the transmitter and the designated receiver. Therefore, the transmitted sig-
nal power will be concentrated at the desired receiver only, rather than at other
receivers in the system, since the channel responses of different users are
independent. Consequently, the MAI in the Pre-Rake system is somewhat re-
duced as compared with that of the system without Tx-MF. However,
the Tx-MF scheme, which does not consider the transmit signals and the
downlink channel responses of other users in the system, may not suppress
MAI efficiently at the receiver output. Two different precoder design schemes,
namely Tx-ZF and Tx-Wiener, which remove or suppress the possible inter-
ference, are reviewed next.

Fig. 3.3. Performance gap between the conventional MF without precoding and
Tx-MF. (Curves: MF w/o precoding and Tx-MF; axes: BEP versus SNR in dB.)

3.3 Transmit Zero-Forcing Filter

Although Tx-MF provided in the previous section achieves the maximal peak
signal power at the desired receiver output, it may cause a strong interference
to other co-channel users in the system since only the channel knowledge of
the desired user is considered for the precoder design. In fact, a better system
performance can be achieved by jointly designing all K precoders such that
not only the peak signal power is maximized but also the interference is erased
or minimized.
The transmit zero-forcing filter (Tx-ZF) is designed to decorrelate all the transmit signals such that the signal at every receiver output is free of interference. It is derived as follows. Let us assume that every precoder is an $L_p$-tap FIR filter and that both $L_p$ and $L$ are much smaller than $N$, so that the inter-symbol interference (ISI) is dominated by the previous and the next transmit symbols only. Please note that the scheme discussed here can be generalized to different values of $N$, $L_p$, and $L$ as well. If the kth MF receiver synchronizes to the $L_p$th effective channel tap $\bar{h}_{k,k,L_p-1}$ and performs signal despreading, the decision statistic for the ith transmit symbol in (3.24) can be represented as

$$ z_k(i) = \sum_{l=-1}^{1} \sum_{j=1}^{K} r_{j,k}^{(l),\dagger} H_k p_j b_j(i+l) + n_k(i), \tag{3.35} $$

where $r_{j,k}^{(0)}$ is given in (3.21) and

$$ r_{j,k}^{(-1)} = [0, \cdots, 0, r_{j,k}(N-1), \cdots, r_{j,k}(N-L+1)]^t, \tag{3.36} $$

and

$$ r_{j,k}^{(+1)} = [r_{j,k}(L_p-1-N), \cdots, r_{j,k}(1-N), 0, \cdots, 0]^t \tag{3.37} $$

denote the weight vectors for signals coming from the previous and the following symbols, respectively. By substituting (3.21), (3.36), and (3.37) into (3.35), $z_k(i)$ becomes

$$
\begin{aligned}
z_k(i) &= \sum_{j=1}^{K} \sum_{l=-1}^{1} r_{j,k}^{(l),\dagger} H_k p_j b_j(i+l) + n_k(i) \\
&= r_{k,k}^{(0),\dagger} H_k p_k b_k(i) \\
&\quad + r_{k,k}^{(-1),\dagger} H_k p_k b_k(i-1) + r_{k,k}^{(+1),\dagger} H_k p_k b_k(i+1) \\
&\quad + \sum_{j=1, j \neq k}^{K} \sum_{l=-1}^{1} r_{j,k}^{(l),\dagger} H_k p_j b_j(i+l) + n_k(i),
\end{aligned}
\tag{3.38}
$$

where the second and the third terms at the right-hand side of (3.38) are the pre-cursor and post-cursor ISI, respectively, and the fourth term is the MAI caused by all the other precoders $p_j$, $\forall j \neq k$. In other words, the precoder $p_k$ not only contributes ISI to its own receiver, but also causes MAI, described as

$$ \sum_{l=-1}^{1} r_{k,j}^{(l),\dagger} H_j p_k b_k(i+l), $$

to the ith decision statistic of user j, $z_j(i)$.


Based on a scheme similar to the one proposed in [12], we construct the Tx-ZF for the kth user, $p_k^{(Tx-ZF)}$, which is a symbol-wise FIR filter that removes the total interference caused by the transmit signal of user k while being subject to the total transmit power constraint, as follows:

$$ p_k^{(Tx-ZF)} = \arg\max_{p_k} \; r_{k,k}^{(0),\dagger} H_k p_k \tag{3.39} $$

subject to

$$ r_{k,j}^{(l),\dagger} H_j p_k = 0, \quad \forall\, 1 \le j \le K \text{ and } l = -1, 0, 1, \text{ except } (j,l) = (k,0), $$
$$ E\{ \| S_k p_k b_k(i) \|^2 \} = p_k^{\dagger} S_k^{\dagger} S_k p_k = E_k, $$

where $S_k$ is the same as the one given in (3.25). The Tx-ZF design problem in (3.39) can be solved in two steps: we first design the precoder to satisfy the zero-forcing constraint while maintaining unit channel gain for the desired signal, and then adjust the transmit power to satisfy the second constraint. Consider the following linear system equation,

$$ \mathcal{H}_k \bar{p}_k = e_{3K}^{(3(k-1)+2)}, \tag{3.40} $$
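where

$$ \mathcal{H}_k = \begin{bmatrix} \mathcal{H}_{k,1} \\ \vdots \\ \mathcal{H}_{k,k} \\ \vdots \\ \mathcal{H}_{k,K} \end{bmatrix}_{3K \times L_p}, \tag{3.41} $$

$$ \mathcal{H}_{k,j} = \begin{bmatrix} r_{k,j}^{(-1),\dagger} H_j \\ r_{k,j}^{(0),\dagger} H_j \\ r_{k,j}^{(+1),\dagger} H_j \end{bmatrix}_{3 \times L_p}. \tag{3.42} $$

Here the stacked matrix is written as $\mathcal{H}_k$ to distinguish it from the channel matrix $H_k$. The solution $\bar{p}_k$ in (3.40) is found as

$$ \bar{p}_k = \mathcal{H}_k^{\ddagger} e_{3K}^{(3(k-1)+2)}. \tag{3.43} $$

Note that since the dimension of $\mathcal{H}_k$ is $3K \times L_p$, an exact solution $\bar{p}_k$ is guaranteed if $L_p$ is greater than or equal to $3K$, provided that $\mathcal{H}_k$ has full row rank. In other words, the interference can be completely removed in this case. Otherwise, we need to resort to the least-squares solution instead, in which case the output of the MF receiver is not free of interference. Next, we enforce the transmit power constraint in the Tx-ZF design so that we have

$$ p_k^{(Tx-ZF)} = \frac{\sqrt{E_k}}{\| S_k \bar{p}_k \|} \, \bar{p}_k. \tag{3.44} $$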
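
The two-step procedure of (3.40) to (3.44) can be illustrated numerically. The following is a sketch under stated assumptions, not the authors' implementation: the stacked matrix of (3.41) and the matrix $S_k$ are replaced by random stand-ins (`H_stack`, `S_k`), and the sizes `K`, `Lp` are hypothetical values chosen so that $L_p \ge 3K$.

```python
import numpy as np

rng = np.random.default_rng(1)
K, Lp = 3, 12        # hypothetical sizes with Lp >= 3K
E_k = 1.0            # transmit power budget of user k
k = 2                # user index, 1-based as in the text

# Random stand-ins for the stacked matrix of (3.41) and for S_k of (3.25).
H_stack = rng.standard_normal((3 * K, Lp)) + 1j * rng.standard_normal((3 * K, Lp))
S_k = rng.standard_normal((20, Lp))

# Unit vector with a one in (1-based) position 3(k-1)+2: unit gain for the
# desired symbol, zero for every ISI and MAI term.
e = np.zeros(3 * K)
e[3 * (k - 1) + 2 - 1] = 1.0

# Step 1: zero-forcing solution via the pseudo-inverse, as in (3.43).
p_bar = np.linalg.pinv(H_stack) @ e

# Step 2: rescale to meet the power constraint, as in (3.44).
p_zf = np.sqrt(E_k) / np.linalg.norm(S_k @ p_bar) * p_bar

print("ZF residual:", np.linalg.norm(H_stack @ p_bar - e))
print("transmit power:", np.linalg.norm(S_k @ p_zf) ** 2)
```

After the rescaling in step 2 the interference terms remain zero, but the gain of the desired signal is no longer one; it is scaled by the same factor as the precoder.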
Another Tx-ZF design criterion is to maintain the channel gain for the desired signal while suppressing all the interference at the K receiver outputs with the minimum transmit power. Mathematically, we can have

$$ p_k^{(Tx-ZF)} = \arg\min_{p_k} \; p_k^{\dagger} S_k^{\dagger} S_k p_k \quad \text{s.t.} \quad \mathcal{H}_k p_k = e_{3K}^{(3(k-1)+2)}, \tag{3.45} $$

where we consider equal gain for all K desired signals without loss of generality. The solution to this alternative Tx-ZF design is provided in Section 3.5.2. In fact, the solution provided in Section 3.5.2 also minimizes the mean-square-error (MSE) between $b_k(i)$ and $z_k(i)$, $\forall\, 1 \le k \le K$. This is because the noise term at the MF filter output is independent of the precoder design and we can always design a proper Tx-ZF precoder to cancel all the interference beforehand. The Tx-ZF design using the MMSE criterion was also proposed in [147].
It is noted, however, that the zero-forcing precoder removes all the ISI and MAI at the expense of reduced desired signal power, since part of the transmit power is used to cancel interference. Conversely, more power is required to maintain the same power level of the desired signal as compared to the system without Tx-ZF. Therefore, when the transmit power limitation is imposed, the MMSE design criterion no longer leads to the Tx-ZF precoder.
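[Figure: block diagram in which the symbols $b_1(i), \ldots, b_K(i)$ pass through the prefilters $p_1(t), \ldots, p_K(t)$, are sent by the transmitter (TX-1) over the channels $h_1(t), \ldots, h_K(t)$, and are detected by the receivers MF1, ..., MFK, which output $\hat{b}_1(i), \ldots, \hat{b}_K(i)$.]
Fig. 3.4. The block diagram of the generalized SISO Tx-ZF DS-CDMA system in [12] (© IEEE).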

The conventional Tx-ZF processes the transmit signal after spreading. However, the enforced spreading code structure limits the flexibility of its precoder design. In fact, a more generalized precoder scheme is to prefilter the original transmit symbol $b_k(i)$, $\forall k$, directly without the use of a spreading code, while the kth receiver employs a known code $s_k$ for signal despreading [12]. The corresponding system block diagram is given in Fig. 3.4. It was shown in [12] that this generalized precoder is more flexible and achieves even better performance than the conventional precoder with the additional signal spreading operation at the transmitter.

3.4 Transmit Wiener Filter


In the previous sections, we introduced two precoder design ideas, namely Tx-MF and Tx-ZF. On the one hand, Tx-ZF, which focuses on interference suppression alone, fails to provide better system performance than Tx-MF when the noise power is strong. On the other hand, when the noise power is weak, Tx-MF, which focuses on the desired signal power without taking the interference structure into account in its precoder design, does not outperform Tx-ZF. In this section, we introduce another linear precoder design concept, called the transmit Wiener filter (Tx-Wiener), which balances noise suppression and interference cancelation.
We adopt the same system model and notation introduced in Sections 3.2 and 3.3. We now modify the receiver structure slightly by placing one amplifier immediately before the signal despreading operator, as shown in Fig. 3.5, where the gain of the receiver amplifier is $\alpha$ and is assumed to be the same for all K receivers. Again, consider the case when the Tx-Wiener precoder is an $L_p$-tap FIR filter and the kth MF receiver synchronizes to $\bar{h}_{k,k,L_p-1}$ for the received signal despreading. Thus, the output of the kth MF receiver for the ith transmit symbol, $z_k(i)$, is described as

$$ z_k(i) = \sum_{l=-1}^{1} \sum_{j=1}^{K} \alpha\, r_{j,k}^{(l),\dagger} H_k p_j b_j(i+l) + \alpha\, n_k(i). \tag{3.46} $$

[Figure: receiver chain consisting of the amplifier $\alpha$ followed by MFk, which outputs $\hat{b}_k(i)$.]
Fig. 3.5. The block diagram of the modified Rx structure for the SISO Tx-Wiener DS-CDMA system.

The Tx-Wiener $p_k^{(Tx-Wiener)}$, $\forall\, 1 \le k \le K$, and the receiver gain $\alpha^{(Tx-Wiener)}$ are jointly adjusted to minimize the sum of the K output MSEs subject to the total precoder power constraint $E_p$, i.e.,

$$
\left\{ p_1^{(Tx-Wiener)}, \cdots, p_K^{(Tx-Wiener)}, \alpha^{(Tx-Wiener)} \right\}
= \arg\min_{p_1, \cdots, p_K, \alpha} \sum_{k=1}^{K} E\left\{ | b_k(i) - z_k(i) |^2 \right\}
\quad \text{s.t.} \quad \sum_{k=1}^{K} p_k^{\dagger} S_k^{\dagger} S_k p_k = E_p.
\tag{3.47}
$$
The Tx-Wiener filter and the optimal gain are derived in Section 3.5.3. Recall that the receiver-based Wiener filter converges to the receiver-based ZF and MF when the noise power approaches zero or infinity, respectively. In fact, a similar property can be found in Tx-Wiener as well. When the noise power approaches infinity, the Tx-Wiener for the kth user in (3.76) converges to

$$
\lim_{\sigma_n^2 \to \infty} p_k^{(Tx-Wiener)}
= \sqrt{ \frac{E_p}{ \sum_{k=1}^{K} r_{k,k}^{(0),\dagger} H_k R_{S_k}^{-1} H_k^{\dagger} r_{k,k}^{(0)} } } \; R_{S_k}^{-1} H_k^{\dagger} r_{k,k}^{(0)},
\tag{3.48}
$$

which is the same as the Tx-MF filter shown in (3.27) up to a different scaling factor. The scaling differs because the two systems are derived under different constraints, namely the individual transmit power limit for Tx-MF and the total transmit power limit for Tx-Wiener. Conversely, when the noise power becomes infinitesimal, the Tx-Wiener is the same as the Tx-ZF shown in (3.59) up to a different scaling factor. This interesting behavior of the Tx-Wiener filter was first reported by Joham et al. in [61].
Example 3.3: The performance of different precoding schemes used to combat interference, such as Tx-ZF and Tx-MMSE, is provided in this example. The system parameters are the same as those in the previous example and the result is shown in Fig. 3.6. Please note that the curves for both Tx-MF and the conventional MF without precoding are plotted and serve as references. It is clear that suppressing interfering signals improves the system performance. However, Tx-ZF with fixed transmit power fails to provide better decoding performance than Tx-MF when the SNR is less than 10 dB. This is because it reduces the interference power at the cost of its output signal power.
[Figure: BEP (logarithmic axis, 10^0 down to 10^-4) versus SNR (0 to 20 dB) for five curves: "MF w/o precoding", "Tx-MF", "Tx-ZF w/ power constraint", "Tx-ZF w/ minimum transmit power", and "Tx-MMSE".]
Fig. 3.6. Performance of different precoding schemes.

3.5 Appendix

3.5.1 Derivation of Tx-MF

The solution to (3.25) is given here. Let us first simplify the constrained optimization problem as

$$
p_k^{(Tx-MF)} = \arg\max_{p_k} \left\{ p_k^{\dagger} H_k^{\dagger} r_{k,k}^{(0)} r_{k,k}^{(0),\dagger} H_k p_k \right\}
\quad \text{s.t.} \quad \left\{ p_k^{\dagger} R_{S_k} p_k = E_k \right\},
\tag{3.49}
$$

where

$$ R_{S_k} = S_k^{\dagger} S_k = R_{S_k}^{1/2,\dagger} R_{S_k}^{1/2} \tag{3.50} $$

is an $L_p \times L_p$ full-rank matrix. Since $R_{S_k}^{-1/2}$ is also a full-rank matrix, we can have the following transformation between $p_k$ and $w_k$:

$$ p_k = R_{S_k}^{-1/2} w_k. \tag{3.51} $$

By substituting (3.51) into (3.49) and performing some manipulation, (3.49) becomes

$$
w_k^{(opt)} = \arg\max_{w_k} \left\{ w_k^{\dagger} R_{S_k}^{-1/2,\dagger} H_k^{\dagger} r_{k,k}^{(0)} r_{k,k}^{(0),\dagger} H_k R_{S_k}^{-1/2} w_k \right\}
\quad \text{s.t.} \quad \left\{ w_k^{\dagger} w_k = E_k \right\},
\tag{3.52}
$$

where the problem is transformed into finding the optimal $w_k$ under an energy constraint on $w_k$. It is easy to show that the best $w_k$ satisfying (3.52) is

$$
w_k^{(opt)} = \sqrt{ \frac{E_k}{ r_{k,k}^{(0),\dagger} H_k R_{S_k}^{-1} H_k^{\dagger} r_{k,k}^{(0)} } } \; R_{S_k}^{-1/2,\dagger} H_k^{\dagger} r_{k,k}^{(0)}.
\tag{3.53}
$$

Therefore, the solution $p_k^{(Tx-MF)}$ is acquired by

$$
p_k^{(Tx-MF)} = R_{S_k}^{-1/2} w_k^{(opt)}
= \sqrt{ \frac{E_k}{ r_{k,k}^{(0),\dagger} H_k R_{S_k}^{-1} H_k^{\dagger} r_{k,k}^{(0)} } } \; R_{S_k}^{-1} H_k^{\dagger} r_{k,k}^{(0)}.
\tag{3.54}
$$
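
As a sanity check of the closed form (3.54), the sketch below evaluates it on random data and compares the resulting gain with that of randomly drawn precoders rescaled to the same power. The vector `a` (standing in for $H_k^{\dagger} r_{k,k}^{(0)}$), the matrix `S_k`, and the sizes are hypothetical stand-ins, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(2)
Lp, E_k = 5, 2.0

# Hypothetical stand-ins: a = H_k^dagger r_{k,k}^{(0)} and R_S = S_k^dagger S_k.
a = rng.standard_normal(Lp) + 1j * rng.standard_normal(Lp)
S_k = rng.standard_normal((12, Lp)) + 1j * rng.standard_normal((12, Lp))
R_S = S_k.conj().T @ S_k

# Closed form (3.54): p = sqrt(E_k / (a^dagger R_S^{-1} a)) * R_S^{-1} a.
Rinv_a = np.linalg.solve(R_S, a)
p_mf = np.sqrt(E_k / np.real(a.conj() @ Rinv_a)) * Rinv_a

def gain(p):
    """Objective of (3.49): |a^dagger p|^2."""
    return np.abs(a.conj() @ p) ** 2

def power(p):
    """Constraint function of (3.49): p^dagger R_S p."""
    return np.real(p.conj() @ R_S @ p)

# Random precoders, each rescaled to satisfy the same power constraint.
trials = []
for _ in range(1000):
    q = rng.standard_normal(Lp) + 1j * rng.standard_normal(Lp)
    q *= np.sqrt(E_k / power(q))
    trials.append(gain(q))
best_random = max(trials)

print("power of p_mf:", power(p_mf))
print("gain of p_mf:", gain(p_mf), "best random gain:", best_random)
```

That no feasible random precoder exceeds the closed-form gain is consistent with the Cauchy-Schwarz argument behind (3.53).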

3.5.2 Derivation of Tx-ZF with Minimum Output Power Constraint

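The Tx-ZF design problem in (3.45) can be rewritten as

$$
p_k^{(Tx-ZF)} = \arg\min_{w_k} \; w_k^{\dagger} w_k \quad \text{s.t.} \quad \mathcal{H}_k R_{S_k}^{-1/2} w_k = e_{3K}^{(3(k-1)+2)},
\tag{3.55}
$$

where $p_k$ in (3.45) is replaced by $R_{S_k}^{-1/2} w_k$ and $R_{S_k}$ is given in (3.50). Generally speaking, we can first find the optimal $w_k^{(opt)}$ by using the pseudo-inverse of $\mathcal{H}_k R_{S_k}^{-1/2}$ (denoted below by the superscript $\dagger$ applied to the bracketed product), i.e.,

$$ w_k^{(opt)} = \left( \mathcal{H}_k R_{S_k}^{-1/2} \right)^{\dagger} e_{3K}^{(3(k-1)+2)}, \tag{3.56} $$

and then acquire $p_k^{(Tx-ZF)}$ as

$$ p_k^{(Tx-ZF)} = R_{S_k}^{-1/2} w_k^{(opt)} = R_{S_k}^{-1/2} \left( \mathcal{H}_k R_{S_k}^{-1/2} \right)^{\dagger} e_{3K}^{(3(k-1)+2)}. \tag{3.57} $$

Consider the special case when $\mathrm{rank}\{ \mathcal{H}_k R_{S_k}^{-1/2} \} = L_p$, for which

$$
\left( \mathcal{H}_k R_{S_k}^{-1/2} \right)^{\dagger}
= \left( R_{S_k}^{-1/2,\dagger} \mathcal{H}_k^{\dagger} \mathcal{H}_k R_{S_k}^{-1/2} \right)^{-1} R_{S_k}^{-1/2,\dagger} \mathcal{H}_k^{\dagger}.
\tag{3.58}
$$

Therefore, it can be shown that

$$
p_k^{(Tx-ZF)} = R_{S_k}^{-1/2}
\left( \sum_{j=1}^{K} \sum_{l=-1}^{1} R_{S_k}^{-1/2,\dagger} H_j^{\dagger} r_{k,j}^{(l)} r_{k,j}^{(l),\dagger} H_j R_{S_k}^{-1/2} \right)^{-1}
R_{S_k}^{-1/2,\dagger} H_k^{\dagger} r_{k,k}^{(0)}.
\tag{3.59}
$$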

3.5.3 Derivation of Tx-Wiener


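Let us first rewrite $z_k(i)$ in (3.46) as

$$
\begin{aligned}
z_k(i) &= \sum_{l=-1}^{1} \sum_{j=1}^{K} \alpha\, r_{j,k}^{(l),\dagger} H_k p_j b_j(i+l) + \alpha\, n_k(i) \\
&= \alpha \sum_{l=-1}^{1} b(i+l)^{\dagger} R_k^{(l),\dagger} \mathbb{H}_k\, p + \alpha\, n_k(i),
\end{aligned}
\tag{3.60}
$$

where

$$ b(i) = [b_1(i), \cdots, b_K(i)]^t, \tag{3.61} $$
$$ R_k^{(l)} = \mathrm{diag}\left[ r_{1,k}^{(l)}, \cdots, r_{K,k}^{(l)} \right], \tag{3.62} $$
$$ \mathbb{H}_k = \mathrm{diag}\left[ H_k, \cdots, H_k \right], \tag{3.63} $$
$$ p = [p_1^t, \cdots, p_K^t]^t. \tag{3.64} $$

The block-diagonal matrix in (3.63) is written as $\mathbb{H}_k$ to distinguish it from the channel matrix $H_k$. By substituting (3.60) into (3.47) and performing some manipulations, the sum of the K MSEs becomes

$$
\begin{aligned}
\sum_{k=1}^{K} E\left\{ | b_k(i) - z_k(i) |^2 \right\}
&= K + \alpha^2 \sum_{k=1}^{K} \sum_{l=-1}^{1} p^{\dagger} \mathbb{H}_k^{\dagger} R_k^{(l)} R_k^{(l),\dagger} \mathbb{H}_k\, p \\
&\quad - \alpha \sum_{k=1}^{K} \left( e_K^{(k),\dagger} R_k^{(0),\dagger} \mathbb{H}_k\, p + p^{\dagger} \mathbb{H}_k^{\dagger} R_k^{(0)} e_K^{(k)} \right) + \alpha^2 K \sigma_n^2.
\end{aligned}
\tag{3.65}
$$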
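Next, we let

$$ p = \frac{1}{\alpha} R_S^{-1/2} w, \tag{3.66} $$

where

$$ w = [w_1^t, \cdots, w_K^t]^t, \tag{3.67} $$
$$ R_S = \mathrm{diag}\left[ R_{S_1}, \cdots, R_{S_K} \right], \tag{3.68} $$

and $R_{S_k}$ is defined in (3.50). By substituting (3.65) and (3.66) into (3.47), the constrained optimization is reformulated as

$$
\begin{aligned}
\left\{ w^{(opt)}, \alpha^{(Tx-Wiener)} \right\}
= \arg\min_{w, \alpha} \Bigg\{ & K + \sum_{k=1}^{K} \sum_{l=-1}^{1} w^{\dagger} R_S^{-1/2,\dagger} \mathbb{H}_k^{\dagger} R_k^{(l)} R_k^{(l),\dagger} \mathbb{H}_k R_S^{-1/2} w \\
& - \sum_{k=1}^{K} \left( e_K^{(k),\dagger} R_k^{(0),\dagger} \mathbb{H}_k R_S^{-1/2} w + w^{\dagger} R_S^{-1/2,\dagger} \mathbb{H}_k^{\dagger} R_k^{(0)} e_K^{(k)} \right) + \alpha^2 K \sigma_n^2 \Bigg\}
\end{aligned}
\tag{3.69}
$$

where $\mathbb{H}_k = \mathrm{diag}[H_k, \cdots, H_k]$ is the block-diagonal matrix of (3.63),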
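subject to

$$ \frac{w^{\dagger} w}{\alpha^2} = E_p. \tag{3.70} $$

Equation (3.70) implies that

$$ \alpha^2 = \frac{w^{\dagger} w}{E_p}. \tag{3.71} $$

In order to simplify the constrained optimization problem, we substitute (3.71) into (3.69) to get rid of the power constraint, i.e.,

$$
\begin{aligned}
w^{(opt)} = \arg\min_{w} \Bigg\{ & K + \sum_{k=1}^{K} \sum_{l=-1}^{1} w^{\dagger} R_S^{-1/2,\dagger} \mathbb{H}_k^{\dagger} R_k^{(l)} R_k^{(l),\dagger} \mathbb{H}_k R_S^{-1/2} w \\
& - \sum_{k=1}^{K} \left( e_K^{(k),\dagger} R_k^{(0),\dagger} \mathbb{H}_k R_S^{-1/2} w + w^{\dagger} R_S^{-1/2,\dagger} \mathbb{H}_k^{\dagger} R_k^{(0)} e_K^{(k)} \right) + \frac{K \sigma_n^2}{E_p} w^{\dagger} w \Bigg\},
\end{aligned}
\tag{3.72}
$$

where $\mathbb{H}_k = \mathrm{diag}[H_k, \cdots, H_k]$ is the block-diagonal matrix of (3.63).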

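The solution to the above unconstrained optimization problem can be shown to be

$$
\begin{aligned}
w^{(opt)} &= [w_1^{(opt),t}, \cdots, w_K^{(opt),t}]^t \\
&= \left( \sum_{k=1}^{K} \sum_{l=-1}^{1} R_S^{-1/2,\dagger} \mathbb{H}_k^{\dagger} R_k^{(l)} R_k^{(l),\dagger} \mathbb{H}_k R_S^{-1/2} + \frac{K \sigma_n^2}{E_p} I_{K L_p} \right)^{-1}
\sum_{k=1}^{K} R_S^{-1/2,\dagger} \mathbb{H}_k^{\dagger} R_k^{(0)} e_K^{(k)},
\end{aligned}
\tag{3.73}
$$

where $\mathbb{H}_k = \mathrm{diag}[H_k, \cdots, H_k]$ is the block-diagonal matrix of (3.63).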

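Because the matrix operations in (3.73) are block-wise in nature, $w_k^{(opt)}$ can be individually represented as

$$
w_k^{(opt)} = \left( \sum_{j=1}^{K} \sum_{l=-1}^{1} R_{S_k}^{-1/2,\dagger} H_k^{\dagger} r_{j,k}^{(l)} r_{j,k}^{(l),\dagger} H_k R_{S_k}^{-1/2} + \frac{K \sigma_n^2}{E_p} I_{L_p} \right)^{-1}
R_{S_k}^{-1/2,\dagger} H_k^{\dagger} r_{k,k}^{(0)}.
\tag{3.74}
$$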

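Hence, we can have

$$
\alpha^{(Tx-Wiener)} = \sqrt{ \frac{ w^{(opt),\dagger} w^{(opt)} }{ E_p } }
= \sqrt{ \frac{ \sum_{k=1}^{K} w_k^{(opt),\dagger} w_k^{(opt)} }{ E_p } }
\tag{3.75}
$$

and

$$
p^{(Tx-Wiener)} = \left[ p_1^{(Tx-Wiener),t}, \cdots, p_K^{(Tx-Wiener),t} \right]^t
= \sqrt{ \frac{ E_p }{ \sum_{k=1}^{K} w_k^{(opt),\dagger} w_k^{(opt)} } } \; R_S^{-1/2} w^{(opt)},
\tag{3.76}
$$

where

$$
p_k^{(Tx-Wiener)} = \sqrt{ \frac{ E_p }{ \sum_{k=1}^{K} w_k^{(opt),\dagger} w_k^{(opt)} } } \; R_{S_k}^{-1/2} w_k^{(opt)}, \quad \forall k = 1, \cdots, K.
\tag{3.77}
$$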
4
Precoding Techniques for MIMO Channels

4.1 Review of MIMO Systems

Multiple input multiple output (MIMO) systems have been among the most desirable candidates for the next generation of high data rate wireless communications. MIMO systems can offer spatial multiplexing gain and greatly increase the capacity of channels by independently sending streams of data across multiple antennas. The Bell Laboratories Layered Space-Time (BLAST) system is the most prominent example of this capacity-achieving scheme. MIMO systems can also provide diversity and coding gain by using space-time codes that map input symbols across time and space.
A block diagram of a MIMO system with either space-time coding or multiplexing is shown in Fig. 4.1. Suppose there are $M_t$ transmit antennas and $M_r$ receive antennas. The MIMO system can be described by

$$ y = \sqrt{\rho}\, H x + w, \tag{4.1} $$

where $x \in \mathbb{C}^{M_t}$ is the transmit symbol, $y \in \mathbb{C}^{M_r}$ is the received symbol, $w \in \mathbb{C}^{M_r}$ is a circularly symmetric complex Gaussian noise vector and $H \in \mathbb{C}^{M_r \times M_t}$ is the channel matrix whose entries $h_{i,j}$ are the complex gains of the transmission path from the jth transmit antenna to the ith receive antenna. Unless otherwise stated, the entries of $H$ are considered statistically independent across space. $\rho$ is the signal-to-noise ratio. When $H$ is a random matrix and ergodic, we assume its entries are i.i.d. complex circularly symmetric Gaussian with zero mean and unit variance. That is, each entry of $H$ has uniform phase and Rayleigh magnitude.

Fig. 4.1. Block diagram of a MIMO system

C.-C.J. Kuo et al., Precoding Techniques for Digital Communication Systems, DOI: 10.1007/978-0-387-71769-2_4, © Springer Science+Business Media, LLC 2008
The capacity of an ergodic channel is achieved by maximizing the mutual information $I(x; y, H)$ with respect to the distribution of $x$. It is shown that the capacity is achieved by the transmitted signal $x \sim \mathcal{CN}(0, \Sigma_x)$ and is given by [129]

$$ C = E[\log_2 \det(I + \rho H \Sigma_x H^H)], \tag{4.2} $$

where $\Sigma_x = E(x x^H)$ is the input covariance matrix.
If the channel state information (CSI) is known at both the transmitter and the receiver, $H$ can be considered deterministic. In this case the channel capacity is given by

$$ C = \max_{\Sigma_x,\; \mathrm{tr}(\Sigma_x) \le 1} \log_2 \det(I + \rho H \Sigma_x H^H). \tag{4.3} $$
Singular value decomposition (SVD) of $H$ yields

$$ H = U \Lambda V^H, \tag{4.4} $$

where $U \in \mathbb{C}^{M_r \times M_r}$ and $V \in \mathbb{C}^{M_t \times M_t}$ are unitary, $\Lambda \in \mathbb{C}^{M_r \times M_t}$, and its diagonal elements (singular values) are the square roots of the eigenvalues of $H H^H$. If we multiply both sides of (4.1) by $U^H$, we obtain

$$ \tilde{y} = \sqrt{\rho}\, \Lambda \tilde{x} + \tilde{w}, \tag{4.5} $$

where $\tilde{y} = U^H y$, $\tilde{x} = V^H x$ and $\tilde{w} = U^H w$. Since the rank of $H$ is at most $m = \min(M_t, M_r)$, there are at most $m$ nonzero singular values, denoted by $\lambda_i$, $i = 1, 2, \ldots, m$. Then, we can rewrite (4.5) as

$$
\tilde{y}_i =
\begin{cases}
\sqrt{\rho}\, \lambda_i \tilde{x}_i + \tilde{w}_i, & i = 1, \ldots, m \\
0, & i = m+1, \ldots, M_r
\end{cases}
\tag{4.6}
$$

where we assumed $M_r > M_t$. We see that the SVD decouples the channel into a set of $m$ parallel independent subchannels, known as eigen subchannels, with eigenmodes $\lambda_i$.
The constrained optimization problem of (4.3) is simplified by using the SVD of $H$. The capacity is given by

$$ C = \sum_{i=1}^{m} \left( \log_2 (\rho \mu \lambda_i) \right)^+, \tag{4.7} $$

where $(\cdot)^+ = \max(0, \cdot)$ and $\mu$ is chosen to meet the power constraint

$$ \sum_{i=1}^{m} \left( \mu - \frac{1}{\rho \lambda_i} \right)^+ = 1. \tag{4.8} $$

We see that the input power is given by

$$ E\{ |\tilde{x}_i|^2 \} = \left( \mu - \frac{1}{\rho \lambda_i} \right)^+. \tag{4.9} $$

Based on the above formulas, more power must be allocated to eigen subchannels with stronger eigenmodes. This type of power allocation policy is known as the water-pouring policy.
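
The water level $\mu$ in (4.8) can be found by a short search over the number of active subchannels. The following is a minimal sketch; the function name `water_pouring`, the example gains, and the choice $\rho = 1$ are illustrative assumptions, and `gains` plays the role of the $\lambda_i$ as they enter (4.7) to (4.9).

```python
import numpy as np

def water_pouring(gains, rho, total_power=1.0):
    """Return the water level mu and per-subchannel powers satisfying (4.8)-(4.9)."""
    g = np.asarray(gains, dtype=float)
    order = np.argsort(-g)                 # strongest subchannel first
    inv = 1.0 / (rho * g[order])           # "floor heights" 1 / (rho * lambda_i)
    for k in range(len(g), 0, -1):         # try k active subchannels, weakest dropped first
        mu = (total_power + inv[:k].sum()) / k
        if mu > inv[k - 1]:                # weakest active subchannel still gets power
            break
    power = np.zeros_like(g)
    power[order[:k]] = mu - inv[:k]
    return mu, power

# Hypothetical example: three subchannels, one of them very weak.
gains = [2.0, 1.0, 0.1]
rho = 1.0
mu, power = water_pouring(gains, rho)
capacity = np.sum(np.log2(1.0 + rho * np.asarray(gains) * power))

print("water level:", mu)
print("powers:", power)
print("capacity (bits):", capacity)
```

In this example the weakest subchannel lies above the water level and receives no power, which is the $(\cdot)^+$ clipping in (4.8) at work.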
Although spatial multiplexing and space-time coding schemes do not need channel state information (CSI) at the transmitter, in a number of applications CSI is available at the transmitter. In such scenarios, precoding can optimally assign resources such as power and bits over multiple antennas using the channel state information. Precoding can also help us design better space-time codes. In this chapter, we explore various precoding techniques for MIMO systems. First, in Section 4.2, we consider the use of the TH precoder for a MIMO system. Then, in Section 4.3, we show how to jointly design the optimum linear precoder and decoder under certain criteria. We review linear precoders with space-time codes in Section 4.4. Finally, in Section 4.5, we explain precoding techniques in the presence of incomplete channel information.

4.2 Tomlinson-Harashima precoding (THP) for MIMO systems

In Chapter 2 we reviewed TH precoding for ISI channels. In this section, we show that TH precoding can easily be applied to a BLAST MIMO system [152]. In diagonal BLAST (D-BLAST), each codeword is transmitted using different antennas at different times. The vertical BLAST (V-BLAST) architecture, however, places separate codewords on different antennas. Each of the $M_r$ antennas receives a combination of $M_t$ independent symbol streams. Recovering the independent data streams sent over multiple transmit antennas is essential to obtain the multiplexing (capacity) gain. The decoder then attempts to separate each independent symbol from the others by canceling the interfering signals successively, similar to the successive cancellation technique used in multiuser detection.
There exists a duality between canceling ISI in single input single output (SISO) systems and canceling interference in BLAST-based MIMO systems. Therefore, similar to SISO ISI channels, one can use linear equalization or nonlinear equalization, e.g., DFE, to separate the data streams spatially in MIMO systems.
Fig. 4.2 shows the matrix DFE for a MIMO system. The MIMO system model is represented by (4.1). For simplicity, we assume that the numbers of transmit and receive antennas are the same and denoted by $M$. The feedforward $M \times M$ matrix $F$ whitens the noise and guarantees causality, as in the case of the scalar DFE for the ISI channel. Therefore $B = FH$ has to be lower triangular, where $B$ and $H$ are also $M \times M$ matrices.

Fig. 4.2. Block diagram of Matrix ZF-DFE [152] (© IEEE).

$F$ can also be chosen such that the main diagonal elements of $B$ are one. The feedback matrix, $B - I$, cancels the interference caused by already detected symbols. However, the matrix DFE suffers from the same shortcomings as the scalar DFE for the ISI channel, namely error propagation and the difficulty of combining it with coding schemes. When the channel state information is available, the feedback section of the DFE can be moved to the transmitter so that linear pre-equalization of the cascade $B = FH$ can separate the signals from different transmit antennas.
However, the above pre-equalization technique can increase the transmit energy significantly. Therefore, a nonlinear function must be used to limit the transmit energy. As shown in Fig. 4.3, this can be done by a modulo device, as explained for TH precoding for ISI channels in Chapter 2.
Ignoring the modulo device, the cascade $B = FH$ is lower triangular and hence the precoder output symbols $z_i$, $i = 1, 2, \ldots, M$, are successively generated from the data symbols $x_i \in \mathcal{A}$ as

$$ z_i = x_i - \sum_{j=1}^{i-1} b_{ij} z_j. \tag{4.10} $$

Suppose each symbol belongs to a square $M_c \times M_c$ constellation $\mathcal{A}$. Each coordinate of a point in the constellation can take values from $\{\pm 1, \pm 3, \ldots, \pm (M_c - 1)\}$. The constellation is bounded by the square region of width $2 M_c$. The function of the modulo-$2M_c$ device of THP is to force the transmit symbols $z_i$, $i = 1, 2, \ldots, M_t$, into the boundary region of $\mathcal{A}$. In other words,

$$ z_i = x_i + p_i - \sum_{j=1}^{i-1} b_{ij} z_j, \tag{4.11} $$

where $p_i \in \{ 2 M_c (p_I + j p_Q);\; p_I, p_Q \in \mathbb{Z} \}$.

Fig. 4.3. Block diagram of TH precoding for MIMO channel [152] (© IEEE).

The channel matrix $H$ can be factorized by the QL factorization technique as

$$ H = F^H S, \tag{4.12} $$

where $S$ is a lower triangular matrix and $F$ is the unitary feedforward matrix as defined before (preserving the whiteness of the noise). By multiplying the feedforward matrix with a scaling matrix $G = \mathrm{diag}(s_{11}^{-1}, \ldots, s_{MM}^{-1})$, we can keep the main diagonal elements of $B$ equal to 1. Hence, $B = GFH = GS$.
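
The chain (4.10) to (4.12) can be traced end to end in a noise-free sketch. The code below is an illustration under stated assumptions rather than the implementation of [152]: the QL factorization is obtained from NumPy's QR routine applied to the column-reversed channel, the helper names `ql` and `modulo` are hypothetical, and $M = 4$, $M_c = 4$ are example values.

```python
import numpy as np

def modulo(v, Mc):
    """Reduce real and imaginary parts of v into the interval [-Mc, Mc)."""
    v = np.asarray(v, dtype=complex)
    re = v.real - 2 * Mc * np.floor((v.real + Mc) / (2 * Mc))
    im = v.imag - 2 * Mc * np.floor((v.imag + Mc) / (2 * Mc))
    return re + 1j * im

def ql(H):
    """QL factorization H = Q @ L (L lower triangular) via QR of reversed columns."""
    Q, R = np.linalg.qr(H[:, ::-1])
    return Q[:, ::-1], R[::-1, ::-1]

rng = np.random.default_rng(3)
M, Mc = 4, 4
H = (rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))) / np.sqrt(2)

Q, S = ql(H)
F = Q.conj().T                       # unitary feedforward matrix, H = F^H S as in (4.12)
G = np.diag(1.0 / np.diag(S))        # scaling matrix
B = G @ S                            # unit-diagonal lower triangular matrix, B = G F H

# Data symbols from the square constellation with coordinates +-1, +-3.
levels = np.arange(-(Mc - 1), Mc, 2)
x = rng.choice(levels, M) + 1j * rng.choice(levels, M)

# Transmitter: successive interference pre-subtraction with modulo, (4.10)-(4.11).
z = np.zeros(M, dtype=complex)
for i in range(M):
    z[i] = modulo(x[i] - np.dot(B[i, :i], z[:i]), Mc)

# Noise-free channel, feedforward filter, scaling, and receiver-side modulo.
y = H @ z
r = G @ F @ y
x_hat = modulo(r, Mc)

print("transmitted:", x)
print("recovered:  ", np.round(x_hat, 6))
```

Without noise, the receiver-side modulo removes the offsets $p_i$ exactly, so the data symbols are recovered while every transmit symbol stays inside the square of width $2M_c$.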
The received symbols $r_i$, $i = 1, 2, \ldots, M_r$, at the slicer are corrupted only by noise, thanks to the pre-equalization of the cascade $B = FH$, and can be written as

$$ r_i = v_i + w_i, \tag{4.13} $$

where $v_i = x_i + p_i$ is the effective data symbol, $w_i = \tilde{w}_i / s_{ii}$ is the scaled noise, and $\tilde{w} = Fw$. We see that the MIMO channel is decomposed into $M$ parallel AWGN channels with noise variance $\sigma_n^2 / |s_{ii}|^2$, $i = 1, \ldots, M$, where $\sigma_n^2$ is the variance of $\tilde{w}_i$.
The received symbols $r_i$, $i = 1, 2, \ldots, M$, are first modulo reduced into the boundary region of the signal constellation $\mathcal{A}$. Then, the estimates $\hat{x}_i$ of the symbols for each of the $M_t$ transmit antennas are recovered by a conventional slicer, as shown in Fig. 4.3.
Eqs. (4.4) and (4.5) show that, by the singular value decomposition of the channel and by applying $V$ at the transmitter and $U$ at the receiver, the MIMO channel can be transformed into $M$ independent parallel eigen subchannels. Assuming high SNR ($\rho \gg 1$), the capacity-achieving power distribution approaches a uniform one, i.e., $\Sigma_x = E(x x^H) = I$. Hence, by (4.3), the capacity of the MIMO channel is given by

$$
C_{SVD} = \log \det(I + \rho H \Sigma_x H^H)
\approx \log \det(\rho H H^H) = \log \det(\rho \Lambda^H) = \log\Big( \prod_{i=1}^{M} \rho \lambda_i \Big),
\tag{4.14}
$$

where $\lambda_i$ and $\rho$ are as defined before and we have assumed that the rank of $H$ is $M$.
As we showed above, a MIMO system with THP (or with error-free decision feedback equalization) results in parallel and independent AWGN channels with variance $\sigma_n^2 / |s_{ii}|^2$, $i = 1, \ldots, M$. Hence the capacity of THP MIMO for high SNR values is

$$ C_{THP} = \sum_{i=1}^{M} \log(1 + \rho |s_{ii}|^2) \approx \sum_{i=1}^{M} \log(\rho |s_{ii}|^2). \tag{4.15} $$

But from (4.12),

$$ H^H H = S^H F F^H S = S^H S. \tag{4.16} $$

Since $S$ is triangular, $\prod_{i=1}^{M} |\lambda_i|^2 = \det(H^H H) = \det(S^H S) = \prod_{i=1}^{M} |s_{ii}|^2$. Thus, for $\rho \gg 1$,

$$ C_{THP} = \log\Big( \prod_{i=1}^{M} \rho \lambda_i \Big) = C_{SVD}. \tag{4.17} $$
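
The determinant identity behind (4.16) and (4.17) can be checked numerically. The sketch below assumes a randomly drawn $4 \times 4$ complex Gaussian channel and obtains the triangular factor from a QR factorization of the column-reversed matrix; it compares the product of $|s_{ii}|^2$ with the product of the squared singular values of $H$ on a log scale.

```python
import numpy as np

rng = np.random.default_rng(4)
M = 4
H = (rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))) / np.sqrt(2)

# Lower triangular factor S of the QL factorization H = F^H S,
# obtained from a QR factorization of the column-reversed channel.
Q, R = np.linalg.qr(H[:, ::-1])
S = R[::-1, ::-1]

sing_vals = np.linalg.svd(H, compute_uv=False)

# Sum of log2 |s_ii|^2 versus sum of log2 of the squared singular values.
sum_log_s = np.sum(np.log2(np.abs(np.diag(S)) ** 2))
sum_log_sv = np.sum(np.log2(sing_vals ** 2))

# Both should equal log2 det(H^H H).
sign, logdet = np.linalg.slogdet(H.conj().T @ H)
logdet_bits = logdet / np.log(2)

print(sum_log_s, sum_log_sv, logdet_bits)
```

Because the two sums agree for every channel realization, the high-SNR approximations of the THP and SVD capacities differ only through the terms neglected when $1 + \rho(\cdot)$ is replaced by $\rho(\cdot)$.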

In other words, precoding is asymptotically an optimal way of channel equalization. Fig. 4.4 [152] shows the capacity (in bits per M-transmit vector) versus SNR in dB for a MIMO system with $M = 4$. The entries of the channel matrix were considered i.i.d. complex Gaussian and the results were averaged over a large number of channel realizations. The capacities achieved by MIMO TH precoding and by SVD were plotted for a flat (uniform) power distribution and also for the water-pouring power allocation policy. We see that for high SNR, the water-pouring power allocation merges into a flat one and the capacity achieved by TH precoding approaches the capacity of the MIMO channel.
Symbol error rates for this MIMO system are depicted in Fig. 4.5. The performance of the SVD and pre-equalization schemes both suffer from small singular values, which dominate the error rate. The THP system shows an advantage over linear pre-equalization, the SVD scheme and also V-BLAST with non-ideal DFE. However, with respect to V-BLAST with genie-aided (feedback error-free) DFE, THP suffers a small loss of performance.

Fig. 4.4. Achievable rates for SVD-based equalization and for MIMO precoding [152] (© IEEE).

Fig. 4.5. Average symbol error rates for different equalization techniques [152] (© IEEE).

4.3 Joint Design of Linear Precoder and Decoder

A MIMO communication system model with linear precoder and decoder is shown in Fig. 4.6. The input bit streams are coded and modulated, and the resulting symbol streams are then passed through the linear precoder. The linear precoder is a matrix $F$ that can add redundancy to the input symbol streams to improve system performance. The output of the precoder is then sent to the channel through $M_t$ transmit antennas. $M_r$ receive antennas collect the signals at the receiver and pass them to a decoder matrix $G$, which removes any redundancy that has been introduced by the precoder. For a single carrier system in a flat fading channel the system equation is

$$ \hat{x} = GHFx + Gw, \tag{4.18} $$

where $x$ is the $K \times 1$ transmitted vector, $\hat{x}$ is the $K \times 1$ received vector, $F$ is the $M_t \times K$ precoder matrix, $G$ is the $K \times M_r$ decoder matrix and $H$ and $w$ are as defined before. In the following sections, unless otherwise stated, we assume

$$ E(x x^H) = I; \quad E(w w^H) = I; \quad E(x w^H) = 0. \tag{4.19} $$

We see that the precoder $F$ adds a redundancy of $M_t - K$ across space. For simplicity of the analysis, we assume $K = \mathrm{rank}(H)$.

Fig. 4.6. Block diagram of linear precoder/decoder for MIMO channel [112] (© IEEE).

In Section 4.3.1, we show that if the minimization of any weighted sum of symbol estimation errors of all subchannels is chosen as the desired criterion, the linear precoder and decoder decouple the channel into parallel eigen subchannels. By selecting appropriate error weights, we can maximize the information rate or minimize the sum of error rates. We can also design precoders/decoders for quality-of-service (QoS) based, equal error rate (among all subchannels) applications.

4.3.1 Generalized Weighted MMSE Design


We want to design the $F$ and $G$ matrices so as to minimize any weighted combination of symbol estimation errors. Define the $K \times 1$ error vector as

$$ e = x - \hat{x} = x - (GHFx + Gw). \tag{4.20} $$

The error covariance matrix, or mean square error (MSE) matrix, is defined as

$$
\begin{aligned}
\mathrm{MSE} &= E\{ e e^H \} \\
&= E[(x - GHFx - Gw)(x - GHFx - Gw)^H] \\
&= (GHF - I)(GHF - I)^H + G R_{nn} G^H,
\end{aligned}
\tag{4.21}
$$

where we have used the assumption given in Eq. (4.19).
Considering the $K \times K$ diagonal positive definite weight matrix $W_e$, the minimization problem can be formulated as

$$ \min_{G, F} E\{ \| W_e^{1/2} e \|^2 \} \quad \text{subject to} \quad \mathrm{tr}(F F^H) \le p_0, \tag{4.22} $$

where $p_0$ is the total power available and the expectation $E$ is performed with respect to the distribution of $x$ and $w$. Note that

$$
\begin{aligned}
E\{ \| W_e^{1/2} e \|^2 \} &= E\{ \mathrm{tr}(W_e^{1/2} e e^H W_e^{H/2}) \} \\
&= \mathrm{tr}(W_e^{1/2} E\{ e e^H \} W_e^{H/2}) \\
&= \mathrm{tr}(W_e E\{ e e^H \}).
\end{aligned}
\tag{4.23}
$$

The method of Lagrange duality and the Karush-Kuhn-Tucker (KKT) conditions can be used to solve the optimization problem in (4.22) as follows. We first form the Lagrangian using Eqs. (4.21) and (4.23):

$$ L(\mu, G, F) = \mathrm{tr}[W_e (GHF - I)(GHF - I)^H + W_e G R_{nn} G^H] + \mu [\mathrm{tr}(F F^H) - p_0]. \tag{4.24} $$

It can be shown that $G$ and $F$ are optimal if they satisfy the following conditions [112]:

$$ \nabla_G L(\mu, G, F) = 0, \tag{4.25} $$
$$ \nabla_F L(\mu, G, F) = 0, \tag{4.26} $$
$$ \mu \ge 0; \quad \mathrm{tr}(F F^H) - p_0 \le 0, \tag{4.27} $$
$$ \mu [\mathrm{tr}(F F^H) - p_0] = 0. \tag{4.28} $$

Substituting (4.24) in (4.25) and (4.26), we obtain the following formulas for $F$ and $G$:

$$ HF = H F F^H H^H G^H + R_{nn} G^H, \tag{4.29} $$
$$ W_e G H = F^H H^H G^H W_e G H + \mu F^H. \tag{4.30} $$
By solving the above equations the optimum precoder and decoder can be obtained as shown in [112]. Let us define the following eigenvalue decomposition (EVD):

$$ H^H R_{nn}^{-1} H = (V \; \tilde{V}) \begin{pmatrix} \Lambda & 0 \\ 0 & \tilde{\Lambda} \end{pmatrix} (V \; \tilde{V})^H, \tag{4.31} $$

where $V$ is an $M_t \times K$ orthogonal matrix whose columns constitute a basis for the range space of $H^H R_{nn}^{-1} H$. $\tilde{V}$ is an $M_t \times (M_t - K)$ orthogonal matrix forming a basis for the null space of $H^H R_{nn}^{-1} H$. $\Lambda$ is a diagonal matrix containing the $K$ nonzero eigenvalues $\{\lambda_i\}_{i=1}^{K}$ arranged in decreasing order from top-left to bottom-right, and $\tilde{\Lambda}$ contains the zero eigenvalues. Note that we assume $K = \mathrm{rank}(H^H R_{nn}^{-1} H) = \mathrm{rank}(H)$. Then, the following theorem presents the optimum $F$ and $G$ matrices [112].
Theorem 1: The optimum $F$ and $G$ matrices can be found from the following equations:

$$ F = V \Phi_f, \tag{4.32} $$
$$ G = \Phi_g V^H H^H R_{nn}^{-1}, \tag{4.33} $$

where $\Phi_f$ and $\Phi_g$ are $K \times K$ diagonal matrices with non-negative elements on the diagonal and are given by

$$ \Phi_f = \left( \mu^{-1/2} \Lambda^{-1/2} W_e^{1/2} - \Lambda^{-1} \right)_+^{1/2}, \tag{4.34} $$
$$ \Phi_g = \left( \mu^{1/2} \Lambda^{-1/2} W_e^{-1/2} - \mu \Lambda^{-1} W_e^{-1} \right)_+^{1/2} \Lambda^{-1/2}, \tag{4.35} $$

where $(\cdot)_+$ means that negative values are replaced with zero.
Proof: The proof can be found in [112].
Let $\Lambda = \mathrm{diag}([\lambda_1, \lambda_2, \ldots, \lambda_K])$ and $W_e = \mathrm{diag}([w_1, w_2, \ldots, w_K])$. Suppose $k \le K$ subchannels are used for transmission. From (4.34) and the power constraint $\mathrm{tr}(\Phi_f^2) = p_0$, we have

$$ \mathrm{tr}(\Phi_f^2) = \mu^{-1/2} \sum_{i=1}^{k} \lambda_i^{-1/2} w_i^{1/2} - \sum_{i=1}^{k} \lambda_i^{-1} = p_0. \tag{4.36} $$

Therefore, we can obtain the following expression for $\mu$:

$$ \mu^{1/2} = \frac{ \sum_{i=1}^{k} \lambda_i^{-1/2} w_i^{1/2} }{ p_0 + \sum_{i=1}^{k} \lambda_i^{-1} }. \tag{4.37} $$

Let $\rho_k = \lambda_k w_k$ and let them be ordered in a decreasing manner: $\rho_1 \ge \ldots \ge \rho_k \ge \ldots \ge \rho_K$. To compute $\Phi_f = \mathrm{diag}([\phi_{f,1}, \phi_{f,2}, \ldots, \phi_{f,K}])$ optimally and ensure that it is positive semi-definite, we can compute $\mu$ according to (4.37) for each $k$ iteratively, starting with $k = K$ and decreasing $k$ until $\mu \le \rho_k$, so that $\phi_{f,i} \ge 0$ for every subchannel in use.
From Eqs. (4.32) and (4.33), we have

$$ GHF = \Phi_g \Lambda \Phi_f. \tag{4.38} $$

In other words, the optimum precoder $F$ and decoder $G$ diagonalize the channel matrix $H$ into eigen subchannels (eigenmodes) for any set of error weights. Fig. 4.7 shows the diagonalization of the channel by the precoder/decoder.
We can design the linear precoder/decoder according to different design criteria, such as maximizing the information rate, minimizing the sum of symbol estimation errors, and so on, by choosing an appropriate error weight matrix $W_e$. In the following, we explore a variety of these design criteria.
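
Theorem 1 and the search for $\mu$ can be put together in a short numerical sketch. It assumes white noise ($R_{nn} = I$), a square channel of full rank (so $K = M_t$ and there is no null-space part), and unit error weights; the function name `weighted_mmse_design` and all numerical values are illustrative assumptions rather than part of [112].

```python
import numpy as np

def weighted_mmse_design(H, w_e, p0):
    """Sketch of (4.32)-(4.37), assuming R_nn = I and H of full column rank."""
    lam, V = np.linalg.eigh(H.conj().T @ H)       # eigenvalues in ascending order
    lam, V = lam[::-1], V[:, ::-1]                # decreasing order, as in (4.31)
    K = lam.size
    order = np.argsort(-(lam * w_e))              # sort by rho_k = lambda_k * w_k
    for k in range(K, 0, -1):                     # start with all K subchannels in use
        act = order[:k]
        sqrt_mu = (np.sum(np.sqrt(w_e[act] / lam[act]))
                   / (p0 + np.sum(1.0 / lam[act])))          # (4.37)
        if sqrt_mu ** 2 <= np.min(lam[act] * w_e[act]):      # mu <= rho_k
            break
    mu = sqrt_mu ** 2
    phi_f = np.zeros(K)
    phi_g = np.zeros(K)
    f2 = np.sqrt(w_e[act] / lam[act]) / sqrt_mu - 1.0 / lam[act]
    g2 = sqrt_mu / np.sqrt(lam[act] * w_e[act]) - mu / (lam[act] * w_e[act])
    phi_f[act] = np.sqrt(np.maximum(f2, 0.0))                 # (4.34)
    phi_g[act] = np.sqrt(np.maximum(g2, 0.0)) / np.sqrt(lam[act])  # (4.35)
    F = V @ np.diag(phi_f)                                    # (4.32)
    G = np.diag(phi_g) @ V.conj().T @ H.conj().T              # (4.33) with R_nn = I
    return F, G, lam, phi_f, phi_g

rng = np.random.default_rng(5)
M = 4
H = (rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))) / np.sqrt(2)
w_e = np.ones(M)      # unit error weights (plain sum-MSE criterion)
p0 = 4.0              # total transmit power

F, G, lam, phi_f, phi_g = weighted_mmse_design(H, w_e, p0)
D = G @ H @ F         # should be diagonal, equal to Phi_g Lambda Phi_f

print("transmit power:", np.trace(F @ F.conj().T).real)
print("diag(GHF):", np.round(np.diag(D).real, 4))
```

The printed cascade is diagonal up to rounding, which is the decoupling into eigen subchannels stated in (4.38); a different choice of `w_e` only changes how the power is split across those subchannels.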

4.3.2 Maximum Information Rate Design


Fig. 4.7. Optimum linear precoder/decoder: decomposition of channels into eigen subchannels [[112] © IEEE].

We saw in Section 4.1 that the water-pouring power allocation policy maximizes
the channel capacity of MIMO systems. Rewriting Eq. (4.9) in matrix
format and replacing $E\{|x_i|^2\}$ with the diagonal elements of $\Phi_f^2$, where
$\mathrm{tr}(FF^H) = \mathrm{tr}(\Phi_f^2)$, we obtain
$$\Phi_f = \left(\frac{I}{\mu^{1/2}} - \Lambda^{-1}\right)_+^{1/2} \tag{4.39}$$
The above equation is obtained by choosing We = Λ in Eq. (4.34). It is
shown in [111] that the optimum precoder and decoder that maximize the
information rate are given by (4.32) and (4.33) with Φf obtained from (4.39).
In other words, the maximum information rate design is just a special case
of the generalized weighted MMSE design. The total capacity (information rate) is
given by
$$C = \sum_{i=1}^{K} R_i = \sum_{i=1}^{K}\log_2(1 + \phi_{f,i}^2\lambda_i) \tag{4.40}$$
where $R_i$ is the information rate of the ith subchannel and $\phi_{f,i}$ is the ith
element on the diagonal of Φf. Note also that since Φg does not appear in the
above expression for the maximum information rate, we can select Φg to be
an arbitrary full-rank diagonal matrix.
According to the water-pouring solution, stronger subchannels must be
assigned higher rates. The choice of We = Λ tells us that they also need to be
heavily weighted. For example, in adaptive modulation systems, more power
and higher-order modulation are used on subchannels with higher gains to
improve the total data rate.
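
As a rough numerical illustration of (4.39) and (4.40), the following Python sketch computes the water-pouring powers and compares the resulting rate with a uniform allocation. The function names and the eigenvalues are hypothetical; `level` plays the role of the water level μ^(-1/2).

```python
import math

def water_filling(lams, p0):
    """Powers phi_{f,i}^2 = (level - 1/lambda_i)_+ summing to p0, cf. (4.39)."""
    order = sorted(range(len(lams)), key=lambda i: lams[i], reverse=True)
    for k in range(len(order), 0, -1):
        active = order[:k]
        # Water level that spends exactly p0 on the k strongest eigenmodes.
        level = (p0 + sum(1.0 / lams[i] for i in active)) / k
        if level >= 1.0 / lams[active[-1]]:
            break
    powers = [0.0] * len(lams)
    for i in active:
        powers[i] = level - 1.0 / lams[i]
    return powers

def capacity(lams, powers):
    """Total information rate (4.40) in bits per channel use."""
    return sum(math.log2(1.0 + p * lam) for p, lam in zip(powers, lams))

lams = [4.0, 1.0, 0.01]            # hypothetical eigenvalues of H^H Rnn^-1 H
wf = water_filling(lams, p0=1.0)
uniform = [1.0 / 3] * 3
print(wf, capacity(lams, wf), capacity(lams, uniform))
```

With these assumed numbers the weakest eigenmode receives no power, and the water-pouring rate exceeds the rate of the uniform allocation.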

4.3.3 QoS-Based Design

In multimedia applications, different types of information (video, audio, data,
etc.) with different SNR requirements for successful transmission need to be
sent simultaneously on different subchannels. For these quality-of-service (QoS)
applications, it is necessary to design the system such that different subchannels
have different SNRs. Let us define the SNR matrix as
$$\mathrm{SNR} = F^H H^H G^H (G R_{nn} G^H)^{-1} G H F \tag{4.41}$$
where we have used the assumption Rxx = I. Using the optimum F and G
from Eqs. (4.32) and (4.33), the SNR matrix simplifies to
$$\mathrm{SNR} = \Phi_f^2\Lambda = \left(W_e^{1/2}\mu^{-1/2}\Lambda^{1/2} - I\right)_+ = \gamma D \tag{4.42}$$
where $D = \mathrm{diag}([d_1, d_2, ..., d_K])$ is a diagonal matrix of relative SNRs across
subchannels, with $\sum_{i=1}^{K} d_i = 1$, and γ > 0 is a scalar. Solving this equation
for We, we obtain
$$W_e^{1/2} = (I + \gamma D)\Lambda^{-1/2}\mu^{1/2} \tag{4.43}$$
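Since μ is a function of We, given by (4.37) with k = K as
$$\mu^{1/2} = \frac{\mathrm{Tr}(\Lambda^{-1/2}W_e^{1/2})}{\mathrm{Tr}(\Lambda^{-1}) + p_0} \tag{4.44}$$
we can substitute the expression for We from (4.43) into (4.44) to obtain
$$\mu^{1/2} = \frac{\mathrm{Tr}(\Lambda^{-1/2}(I + \gamma D)\Lambda^{-1/2}\mu^{1/2})}{\mathrm{Tr}(\Lambda^{-1}) + p_0} = \frac{\mu^{1/2}\left(\mathrm{Tr}(\Lambda^{-1}) + \gamma\,\mathrm{Tr}(\Lambda^{-1}D)\right)}{\mathrm{Tr}(\Lambda^{-1}) + p_0} \tag{4.45}$$
Therefore, $\gamma = p_0/\mathrm{tr}(\Lambda^{-1}D)$. Inserting this into (4.43), we obtain
$$W_e^{1/2} = \mu^{1/2}\left(\frac{p_0}{\mathrm{Tr}(\Lambda^{-1}D)}D + I\right)\Lambda^{-1/2} \tag{4.46}$$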

Substituting (4.46) into (4.34) and (4.35), we obtain the following results for
the optimum precoder F and decoder G given in (4.32) and (4.33):
$$\Phi_f = \gamma^{1/2}D^{1/2}\Lambda^{-1/2} \tag{4.47}$$
$$\Phi_g = \gamma^{1/2}D^{1/2}\Lambda^{-1/2}(\gamma D + I)^{-1} \tag{4.48}$$
$$\gamma = \frac{p_0}{\mathrm{tr}(D\Lambda^{-1})} \tag{4.49}$$
We can also simplify the expression for $E\{ee^H\}$ given in (4.21), using the
optimum F and G, to
$$E\{ee^H\} = W_e^{-1/2}\Lambda^{-1/2}\mu^{1/2} \tag{4.50}$$
From (4.42) and (4.50), we see that
$$\mathrm{SNR} = E\{ee^H\}^{-1} - I \tag{4.51}$$
Hence, equal SNRs on all subchannels imply equal MSEs, regardless of the
choice of error weights.
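
The closed-form QoS solution (4.47)-(4.49) is easy to check numerically. The sketch below is illustrative only: the function name `qos_design`, the eigenvalues and the relative SNR profile `d` are assumed values, not taken from the text.

```python
import math

def qos_design(lams, d, p0):
    """Diagonal entries of Phi_f and Phi_g for the QoS design, plus subchannel SNRs."""
    gamma = p0 / sum(di / lam for di, lam in zip(d, lams))              # (4.49)
    phi_f = [math.sqrt(gamma * di / lam) for di, lam in zip(d, lams)]   # (4.47)
    phi_g = [pf / (gamma * di + 1.0) for pf, di in zip(phi_f, d)]       # (4.48)
    snr = [pf ** 2 * lam for pf, lam in zip(phi_f, lams)]               # (4.42)
    return gamma, phi_f, phi_g, snr

lams = [2.0, 1.0, 0.5]     # hypothetical eigenvalues
d = [0.5, 0.3, 0.2]        # hypothetical relative SNRs, summing to 1
gamma, phi_f, phi_g, snr = qos_design(lams, d, p0=10.0)
print(gamma, snr)
```

The subchannel SNRs come out as γ·d_i, so their ratios follow the requested profile while the total transmit power stays at p0.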

4.3.4 (Unweighted) MMSE Design

If we let We = I, then the optimal precoder and decoder derived in Eqs.
(4.32) and (4.33) minimize the sum of symbol estimation errors across all
subchannels. The precoder is given by
$$F = V\Phi_f \tag{4.52}$$
$$\Phi_f = \left(\mu^{-1/2}\Lambda^{-1/2} - \Lambda^{-1}\right)_+^{1/2} \tag{4.53}$$
While this approach can improve the performance of MIMO systems, it does
not guarantee that the MSE on each individual subchannel is minimized or that
its SNR is maximized. The power allocation policy in the MMSE design allocates
no power to a subchannel if its gain is below a certain threshold. Among the
remaining subchannels, more power is allocated to the weakest subchannels, i.e.,
subchannels with the smallest eigenmodes.
An alternative way of joint MMSE design was illustrated in [117], in which
the optimization is done in two steps. First, tr(MSE(F, G)) is minimized with
respect to G while F is held fixed. In other words, we first solve
$$\min_{G}\ \mathrm{MSE}(F, G) \tag{4.54}$$
Setting the gradient of $\mathrm{MSE}(F, G) = E\{ee^H\}$ with respect to the conjugate of G
to zero, we write
$$\nabla_{G^*}E\{ee^H\} = GHFF^HH^H - F^HH^H + GR_{nn} = 0 \tag{4.55}$$
Therefore, the optimum G is obtained as
$$G^* = F^HH^H(HFF^HH^H + R_{nn})^{-1} \tag{4.56}$$
It is shown that the optimum $G^*$ is the same as the MMSE (Wiener) receiver
[64].
Substituting the above expression for $G^*$ into (4.21), we get
$$\begin{aligned}
E\{ee^H\} &= (G^*HF - I)(G^*HF - I)^H + G^*R_{nn}G^{*H} \\
&= G^*(HFF^HH^H + R_{nn})G^{*H} + I - G^*HF - F^HH^HG^{*H} \\
&= F^HH^H((HFF^HH^H + R_{nn})^H)^{-1}HF + I - F^HH^H(HFF^HH^H + R_{nn})^{-1}HF \\
&\quad - F^HH^H((HFF^HH^H + R_{nn})^H)^{-1}HF \\
&= I - F^HH^H(HFF^HH^H + R_{nn})^{-1}HF
\end{aligned} \tag{4.57}$$
Using the matrix inversion lemma
$$(A + BCD)^{-1} = A^{-1} - A^{-1}B(DA^{-1}B + C^{-1})^{-1}DA^{-1}$$
we obtain
$$\mathrm{MSE}(F) = (I + F^HH^HR_{nn}^{-1}HF)^{-1} \tag{4.58}$$
Here MSE(F) is minimum in the sense that
$$\mathrm{MSE}(F) = \mathrm{MSE}(F, G^*) \le \mathrm{MSE}(F, G) \tag{4.59}$$
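
As a worked step that is not spelled out in the text, the matrix inversion lemma can be applied with the assumed identification A = I, B = F^H H^H, C = Rnn^{-1} and D = HF. This shows why the error covariance obtained with the Wiener receiver (4.56) collapses to the compact form (4.58):

```latex
\begin{aligned}
(I + F^H H^H R_{nn}^{-1} H F)^{-1}
  &= (A + BCD)^{-1}\Big|_{A = I,\; B = F^H H^H,\; C = R_{nn}^{-1},\; D = HF} \\
  &= I - F^H H^H \left(HF\,F^H H^H + R_{nn}\right)^{-1} HF .
\end{aligned}
```

Here the inner term DA^{-1}B + C^{-1} becomes HFF^H H^H + Rnn, which is exactly the matrix inverted in the Wiener receiver.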
The second step is the more difficult task of minimizing tr(MSE(F)) subject to the
power constraint. It is proven in [117] that the optimum F is the
same as in (4.52) and (4.53). Another method is to minimize the determinant of the
error covariance matrix (after optimizing over G) under the power constraint, i.e.,
$$\min_{F}\ \det(\mathrm{MSE}(F)), \qquad \mathrm{tr}(FF^H) = p_0 \tag{4.60}$$
As shown in [117], the results are
$$F = V\Phi_f \tag{4.61}$$
$$\Phi_f = \left(\eta^{-1/2}I - \Lambda^{-1}\right)_+^{1/2} \tag{4.62}$$
where
$$\eta^{1/2} = \frac{k}{p_0 + \sum_{i=1}^{k}\lambda_i^{-1}}.$$

4.3.5 Equal Error Design

Some applications require reliable transmission of K symbols using an identical
modulation and coding scheme and a fixed rate. This means that all K
subchannels must have equal error rates and hence equal SNRs. This
requirement can be satisfied simply by choosing D = I in Eqs. (4.47)-(4.49)
of the QoS design to yield
$$\Phi_f = \gamma^{1/2}\Lambda^{-1/2} \tag{4.63}$$
$$\Phi_g = \gamma^{1/2}\Lambda^{-1/2}(\gamma + 1)^{-1} \tag{4.64}$$
$$\gamma = \frac{p_0}{\mathrm{tr}(\Lambda^{-1})} \tag{4.65}$$
The power allocation policy for the equal-error design assigns more power to
subchannels with lower gains and less power to subchannels with
stronger eigenmodes, so that the subchannel SNRs and MSEs are equal. Furthermore,
no subchannel is dropped, regardless of the channel realization. It
can be shown that $GHF = \frac{\gamma}{\gamma + 1}I$. In other words, the optimum precoder
and decoder of the equal-error design transform the MIMO channel into a
scaled identity matrix.
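
The scaled-identity property can be verified on the diagonal entries alone, since all matrices involved are diagonal in the eigenbasis. The following is a minimal sketch with hypothetical eigenvalues; the function name `equal_error_design` is an assumption made for illustration.

```python
import math

def equal_error_design(lams, p0):
    """Equal-error design: returns gamma, diag(Phi_f) and the diagonal of GHF."""
    gamma = p0 / sum(1.0 / lam for lam in lams)          # (4.65)
    phi_f = [math.sqrt(gamma / lam) for lam in lams]     # (4.63)
    phi_g = [pf / (gamma + 1.0) for pf in phi_f]         # (4.64)
    # End-to-end gain of each eigen subchannel: phi_g * lambda * phi_f.
    gains = [pg * lam * pf for pg, lam, pf in zip(phi_g, lams, phi_f)]
    return gamma, phi_f, gains

lams = [4.0, 1.0, 0.25]    # hypothetical eigenvalues, strongest first
gamma, phi_f, gains = equal_error_design(lams, p0=3.0)
print(gamma, [pf ** 2 for pf in phi_f], gains)
```

Every end-to-end gain equals γ/(γ + 1), and the weakest eigenmode receives the largest share of the power.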

4.3.6 Maximum SNR-Based Design

Sometimes, minimizing the bit error probability is preferred. However, this
optimization problem is hard to deal with since it is rarely solvable in closed
form. Instead, an indirect way of reducing the probability of error is to maximize
the minimum distance between hypotheses. Since the minimum eigenvalue
$\lambda_{\min}(\mathrm{SNR}(F, G))$ provides a lower bound on the minimum distance between
the hypotheses for the maximum likelihood (ML) detector (provided that the
noise is Gaussian and the symbols are i.i.d.), it was suggested in [117] to use
the minimum eigenvalue of the SNR matrix (cf. (4.42)) as a sensible measure
related to the probability of error. The corresponding optimization problem is
$$\arg\max_{G,F}\ \lambda_{\min}(\mathrm{SNR}(F, G)), \qquad \mathrm{tr}(FF^H) = p_0 \tag{4.66}$$
and the solutions are given by [117]
$$F = V\Phi_f \tag{4.67}$$
where Φf is a diagonal K × K matrix with diagonal entries
$$|\Phi_{ii}|^2 = \frac{p_0}{\sum_{j=1}^{K}\lambda_{jj}^{-1}}\,\lambda_{ii}^{-1} \tag{4.68}$$
and
$$G = \Phi_g V^H H^H R_{nn}^{-1} \tag{4.69}$$
The above solution leads to
$$\mathrm{SNR} = \Phi_f^H\Lambda\Phi_f = \frac{p_0}{\sum_{j=1}^{K}\lambda_{jj}^{-1}}\,I. \tag{4.70}$$

Fig. 4.8. Comparison of equal-error and MMSE design BER performance [[112] © IEEE].


Fig. 4.8 compares the performance of the equal-error and MMSE designs
for an Mt = Mr = 4 spatial multiplexing system. K = 3 streams of data are sent
over the channel with QAM modulation. As we see from the figure, although
the equal-error and MMSE designs have the same total average BER performance,
the subchannel BER performances of the 3 streams are different.
Fig. 4.9 shows the performance of the QoS-based design for a 3 × 3 MIMO
spatial multiplexing system. One audio stream and one video stream are sent
into the channel. The optimal precoder and decoder are obtained for each
channel realization. The figure shows that the video stream is provided with a
5-dB higher received SNR.

4.3.7 Unified Framework with Convex Optimization

Fig. 4.9. QoS-design SNR performance [[112] © IEEE].

The joint linear precoder/decoder design is in general a complicated non-convex
problem. Optimization can be done with respect to various criteria.
As we saw previously for some specific criteria, the linear precoder/decoder
optimization decouples the MIMO channel into parallel subchannels if the
criterion is the minimization of the weighted sum of the MSEs of all subchannels.
It is of great interest to retain this diagonalized structure for other criteria,
such as minimization of the maximum or average BER. In [99], a unified
framework was developed for multicarrier MIMO systems which generalizes
the existing results. Instead of dealing with each design criterion separately,
the minimization of an arbitrary objective function $f_0(\{\mathrm{MSE}_i\})$ of the MSEs of all
subchannels was considered, where $\mathrm{MSE}_i$ is the MSE of the
ith spatial subchannel (objective functions of the SNRs and of the BERs are
easily expressed in terms of the MSEs). The objective function $f_0$ must be chosen
reasonably, such that it is increasing in each of its arguments while the
rest are held fixed.
Two families of objective functions are considered here that embody all
the above and other reasonable criteria: Schur-concave and Schur-convex functions,
which arise in majorization theory.
For any $x \in \mathbb{R}^n$, let $x_{[1]} \ge ... \ge x_{[n]}$ denote the components of x in
descending order. Also, let $x, y \in \mathbb{R}^n$. We say that vector x is majorized by vector
y, and represent it by x ≺ y, if
$$\sum_{i=1}^{k} x_{[i]} \le \sum_{i=1}^{k} y_{[i]}, \quad 1 \le k \le n - 1, \qquad \sum_{i=1}^{n} x_{[i]} = \sum_{i=1}^{n} y_{[i]} \tag{4.71}$$

Using the above definitions, we can define Schur-concave and Schur-convex


functions. A real valued function f defined on a set A ⊆ Rn is said to be
Schur-convex on A if

x ≺ y on A ⇒ f (x) ≤ f (y), (4.72)

Similarly, f is said to be Schur-concave on A if

x ≺ y on A ⇒ f (x) ≥ f (y), (4.73)

From the above definitions, we see that, if f is Schur-convex on A, then −f


is Schur-concave on A and vice versa.
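
To make the definitions concrete, the majorization test (4.71) can be coded directly. The sketch below is only illustrative; the helper name `is_majorized` and the tolerance are assumptions. It also checks, on one example, that the maximum behaves as a Schur-convex function and the minimum as a Schur-concave one.

```python
def is_majorized(x, y, tol=1e-12):
    """Return True if x is majorized by y (x ≺ y) in the sense of (4.71)."""
    if len(x) != len(y):
        return False
    xs = sorted(x, reverse=True)
    ys = sorted(y, reverse=True)
    partial_x = 0.0
    partial_y = 0.0
    # Partial sums of the k largest components, for k = 1, ..., n-1.
    for k in range(len(xs) - 1):
        partial_x += xs[k]
        partial_y += ys[k]
        if partial_x > partial_y + tol:
            return False
    # The total sums must coincide.
    return abs(sum(xs) - sum(ys)) <= tol

x = [1.0, 1.0, 1.0]    # evenly spread vector
y = [3.0, 0.0, 0.0]    # concentrated vector with the same sum
print(is_majorized(x, y), is_majorized(y, x))
print(max(x) <= max(y), min(x) >= min(y))
```

The evenly spread vector is majorized by the concentrated one, not the other way round.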
Similar to the procedure in subsection 4.3.4, we first derive the optimum
decoder G∗ assuming the linear precoder F is fixed and we obtain Eqs. (4.56)
and (4.58). From (4.58), the ith diagonal element of the MSE matrix is given by
$$\mathrm{MSE}_{ii} = \frac{1}{1 + f_i^H H^H R_{nn,i}^{-1} H f_i} \tag{4.74}$$
where $f_i$ is the ith column of matrix F.
Since many objective functions are expressed as functions of the
signal-to-interference-plus-noise ratio (SINR), we shall express the MSE as a function
of the SINR. It is shown in [99] that the ith diagonal element of the SINR can be
upper bounded by
$$[\mathrm{SINR}]_i \le f_i^H H^H R_{nn,i}^{-1} H f_i \tag{4.75}$$
Therefore, the SINR can be expressed as a function of the MSE as
$$[\mathrm{SINR}]_i = \frac{1}{\mathrm{MSE}_i} - 1 \tag{4.76}$$
Under the Gaussian noise assumption, the symbol error probability can also
be related to the SINR as
$$P_e(\mathrm{SINR}) = \alpha\,Q\!\left(\sqrt{\beta\,\mathrm{SINR}}\right) \tag{4.77}$$
where α and β are scalars that depend on the modulation scheme. Using the
Chernoff upper bound, we can approximate the symbol error probability for
high SINR values as
$$P_e \approx (1/2)\,\alpha\,e^{-\frac{\beta}{2}\mathrm{SINR}} \tag{4.78}$$
The BER can be approximately obtained from the symbol error probability as
$$\mathrm{BER} \approx P_e/\log_2(M) \tag{4.79}$$
where M is the constellation size. Both the exact BER function and the Chernoff
upper bound are convex decreasing functions of the SINR. In addition, for BER
less than $2 \times 10^{-2}$, both functions are convex increasing functions of the
MSE. Therefore, for practical purposes, we can treat the exact BER and
the Chernoff upper bound as convex functions of the MSE.
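
The chain MSE → SINR → symbol error probability → BER in (4.76)-(4.79) can be sketched as follows. The constants α = 2, β = 1 and M = 4 are an assumption (a common nearest-neighbour approximation for 4-QAM), not values given in the text, and the function names are hypothetical.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def symbol_error(sinr, alpha, beta):
    """Symbol error probability (4.77)."""
    return alpha * q_function(math.sqrt(beta * sinr))

def chernoff(sinr, alpha, beta):
    """Chernoff-type approximation (4.78)."""
    return 0.5 * alpha * math.exp(-0.5 * beta * sinr)

def ber_from_mse(mse, alpha, beta, M):
    """BER as a function of the MSE via (4.76), (4.77) and (4.79)."""
    sinr = 1.0 / mse - 1.0
    return symbol_error(sinr, alpha, beta) / math.log2(M)

alpha, beta, M = 2.0, 1.0, 4     # assumed constants for 4-QAM
for mse in (0.05, 0.1, 0.2):
    print(mse, ber_from_mse(mse, alpha, beta, M))
```

The BER grows with the MSE, and (4.78) stays above (4.77) for every non-negative SINR because Q(x) ≤ (1/2)exp(−x²/2).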
From the above discussion, we conclude that it suffices to focus on objective
functions of the MSEs without loss of generality. The following theorem presents
the optimum linear precoder for any objective function of the MSEs.
Theorem 2: Consider the following constrained convex optimization problem
$$\min_{F}\ f_0(\mathrm{diag}(\mathrm{MSE}(F))) \quad \text{subject to} \quad \mathrm{tr}(FF^H) \le p_0 \tag{4.80}$$
where $\mathrm{MSE}(F) = (I + F^HH^HR_{nn}^{-1}HF)^{-1}$. Without loss of generality, it is
assumed that the diagonal elements of the MSE matrix are in decreasing order.
$f_0 : \mathbb{R}^K \to \mathbb{R}$ is an arbitrary objective function that is increasing with respect to
each variable. Also, we assume $K < v = \mathrm{rank}(H^HR_{nn}^{-1}H) = \mathrm{rank}(H)$.
If $f_0$ is Schur-concave,
$$F = V\Phi_f \tag{4.81}$$
If $f_0$ is Schur-convex,
$$F = V\Phi_f U^H \tag{4.82}$$
where V is an Mt × K matrix with orthonormal columns consisting of the eigenvectors
corresponding to the K largest eigenvalues $\{\lambda_i\}_{i=1}^{K}$ of $H^HR_{nn}^{-1}H$, Φf is a K × K
diagonal matrix with diagonal elements $\phi_{f,i}$, i = 1, 2, ..., K, and U
is a unitary matrix such that $\mathrm{MSE}(F) = (I + F^HH^HR_{nn}^{-1}HF)^{-1}$ has identical
diagonal elements.
Proof: See [99].
For Schur-concave objective functions, both GHF and MSE(F) are fully
diagonalized. In this case, the corresponding MSEs and SINRs for i =
1, 2, ..., K follow from (4.76) as
$$\mathrm{MSE}_i = \frac{1}{1 + \phi_{f,i}^2\lambda_i} \tag{4.83}$$
$$\mathrm{SINR}_i = \phi_{f,i}^2\lambda_i \tag{4.84}$$
where $\lambda_i$ are the K largest eigenvalues of $H^HR_{nn}^{-1}H$ in increasing order, $\phi_{f,i}^2$
represents the allocated power and $w_i$ is the white noise.
For Schur-convex objective functions, GHF is diagonalized only up to a
specific rotation of the data symbols. The MSE matrix is non-diagonal with equal
diagonal elements, given by
$$\mathrm{MSE}_i = \frac{1}{K}\mathrm{tr}(\mathrm{MSE}(F)) = \frac{1}{K}\sum_{j=1}^{K}\frac{1}{1 + \phi_{f,j}^2\lambda_j} \tag{4.85}$$
Similarly, using (4.76), the SINRs are given by
$$\mathrm{SINR}_i = \frac{K}{\sum_{j=1}^{K}\frac{1}{1 + \phi_{f,j}^2\lambda_j}} - 1 \tag{4.86}$$

We see that in both cases of Schur-concave and Schur-convex objective func-


tions, the original complicated matrix expressions for MSE have been reduced
to simple scalar expressions.
Now we can consider different design criteria using the optimum linear
decoder derived in (4.56) and the unified framework discussed in Theorem
2. We will show that a great variety of useful objective functions are either
Schur-concave or Schur-convex and thus the above Theorem can be applied
to simplify the design.

MSE-Based Design

We can select the objective function to be the weighted sum of the MSEs,
$$f_0(\{\mathrm{MSE}_i\}) = \sum_{i=1}^{K} w_{e,i}\,\mathrm{MSE}_i, \tag{4.87}$$
or we can choose to minimize the weighted geometric mean of the MSEs by
choosing the objective function as
$$f_0(\{\mathrm{MSE}_i\}) = \prod_{i=1}^{K}(\mathrm{MSE}_i)^{w_{e,i}} \tag{4.88}$$

It is shown in [99] that both of the above objective functions are minimized
when the weights are in increasing order ($w_{e,i} \le w_{e,i+1}$), and they are then
Schur-concave functions. Therefore, by Theorem 2, the diagonal structure is
optimal and the MSEs are given by (4.83). The resulting scalar constrained
convex problem for the weighted sum of the MSEs is
$$\min_{\{\phi_{f,i}^2\}}\ \sum_{i=1}^{K} w_{e,i}\frac{1}{1 + \phi_{f,i}^2\lambda_i} \quad \text{subject to} \quad \sum_{i=1}^{K}\phi_{f,i}^2 \le p_0 \tag{4.89}$$
and the solution obtained from the KKT optimality conditions is
$$\phi_{f,i} = \left(\mu^{-1/2}w_{e,i}^{1/2}\lambda_i^{-1/2} - \lambda_i^{-1}\right)_+^{1/2} \tag{4.90}$$
where $\mu^{-1/2}$ is the water level chosen to satisfy the power constraint with
equality. Note that the above convex optimization problem and its solution are
nothing but the scalar version of (4.23) and (4.34).
Similarly, we can form the scalar convex optimization problem for the objective
function (4.88) with the same power constraint and obtain the solution
from the KKT optimality conditions as
$$\phi_{f,i} = \left(\mu^{-1}w_{e,i} - \lambda_i^{-1}\right)_+^{1/2} \tag{4.91}$$
where $\mu^{-1}$ is the water level chosen to satisfy the power constraint with equality.
We see that when $w_{e,i} = 1$, (4.91) becomes the classical capacity-achieving
water-filling solution.
Another MSE-based criterion is the minimization of the determinant of
the MSE matrix. Using the fact that X ≥ Y ⇒ det X ≥ det Y, it follows that
det(MSE(F)) is minimized for the choice of (4.56). Moreover, the determinant
of $\mathrm{MSE}(F) = (I + F^HH^HR_{nn}^{-1}HF)^{-1}$ does not change if F is post-multiplied
by a unitary matrix. Therefore, we can always choose a unitary (rotation)
matrix such that MSE(F) is diagonal, and then
$$\det(\mathrm{MSE}(F)) = \prod_{i=1}^{K}[\mathrm{MSE}(F)]_{ii} \tag{4.92}$$
Also, the mutual information is given by
$$\max_{\Sigma_x}\ I = \log\det(I + R_{nn}^{-1}H\Sigma_xH^H) \tag{4.93}$$
where $\Sigma_x = FF^H$. Using the fact that det(I + XY) = det(I + YX), the mutual
information can be stated as I = − log det(MSE(F)). Therefore, the maximization
of the mutual information is equivalent to the minimization of det(MSE(F))
and, by (4.92), is the same as the minimization of the unweighted product of
the MSEs. The solution to all these criteria is given by the classical
capacity-achieving water-filling power allocation, i.e., (4.91) with $w_{e,i} = 1$:
$$\phi_{f,i} = \left(\mu^{-1} - \lambda_i^{-1}\right)_+^{1/2} \tag{4.94}$$
Since the average BER performance is dominated by the symbols with
the highest MSE, the next reasonable criterion is the minimization of the
maximum of the MSEs. In other words, the objective function is
$$f_0(\{\mathrm{MSE}_i\}) = \max_i\{\mathrm{MSE}_i\} \tag{4.95}$$
The above function is Schur-convex [99]. Thus, the MSE matrix of the optimal
solution is not diagonal. However, it is possible to make the diagonal elements
of the MSE matrix identical by means of the optimal rotation matrix. The solution
to the resulting scalar convex problem can be written as
$$\phi_{f,i} = \left(\mu^{1/2}\lambda_i^{-1/2} - \lambda_i^{-1}\right)_+^{1/2} \tag{4.96}$$
where $\{\mu^{1/2}\}$ are multiple water levels chosen to satisfy the optimization
constraints.

SNR-Based Criteria

An objective function of the SINRs is related to one of the MSEs by
$$\tilde{f}_0(\{\mathrm{SINR}_i\}) = f_0(\{\mathrm{MSE}_i\}).$$
One such useful objective function is the weighted geometric mean of the
SINRs,
$$\tilde{f}_0(\{\mathrm{SINR}_i\}) = -\prod_{i=1}^{K}(\mathrm{SINR}_i)^{w_{e,i}} \tag{4.97}$$
which can be expressed as a function of the MSEs using (4.76):
$$f_0(\{\mathrm{MSE}_i\}) = \tilde{f}_0(\{\mathrm{MSE}_i^{-1} - 1\}) = -\prod_{i=1}^{K}(\mathrm{MSE}_i^{-1} - 1)^{w_{e,i}} \tag{4.98}$$
The function $f_0(\{x_i\}) = -\prod_{i=1}^{K}(x_i^{-1} - 1)^{w_{e,i}}$ can be shown to be minimized when the
weights are in increasing order and is then a Schur-concave function. Thus, the
objective function of (4.97) is Schur-concave when $\mathrm{MSE}_i < 0.5$, a condition that is
easily satisfied. Hence, by Theorem 2, the diagonalized structure is optimal and the
SINR is given by (4.84). The solution to the constrained convex optimization
problem is
$$\phi_{f,i} = \left(\frac{w_{e,i}}{\sum_{j=1}^{K}w_{e,j}}\,p_0\right)^{1/2} \tag{4.99}$$
Note that if $w_{e,i} = 1$, the solution is $\phi_{f,i} = (p_0/K)^{1/2}$, i.e., uniform power
allocation.
There are other SINR-based criteria discussed in [99]. In particular, it
is shown that the maximization of $\prod_{i=1}^{K}(1 + \mathrm{SINR}_i)$ is equivalent to the
minimization of the determinant of the MSE matrix and also to the maximization of
the mutual information, both discussed before, with the solution given by the
capacity-achieving water-pouring expression (4.94). Also, the maximization of the
minimum SINR is equivalent to the minimization of the maximum MSE treated
before.

BER-Based Criterion

The minimization of the maximum of the BERs is equivalent to the maximization
of the minimum of the SINRs and to the minimization of the maximum
of the MSEs if all subchannels use the same constellation. Hence, its
solution is given by (4.96). Here, we only consider the minimization of the average BER.
Let us first consider the minimization of the arithmetic mean of the BERs. The
objective function is
$$\tilde{f}_0(\{\mathrm{BER}_i\}) = \sum_{i=1}^{K}\mathrm{BER}_i \tag{4.100}$$
which can be expressed as a function of the MSEs using (4.76), (4.77) and
(4.79) as
$$f_0(\{\mathrm{MSE}_i\}) = \sum_{i=1}^{K}\mathrm{BER}(\mathrm{MSE}_i^{-1} - 1) \tag{4.101}$$
The function $f_0(\{x_i\}) = \sum_{i=1}^{K}\mathrm{BER}(x_i^{-1} - 1)$ (assuming $\theta \ge x_i > 0$ for
sufficiently small θ such that $\mathrm{BER}(x_i^{-1} - 1) \le 2 \times 10^{-2}$) is a Schur-convex function
[99]. Therefore, by Theorem 2, the optimal solution has a non-diagonal MSE
matrix with diagonal elements given by (4.85), which have to be minimized.
The scalarized convex optimization problem is given by
$$\begin{aligned}
\min_{t_i,\,\phi_i^2}\ & \sum_{i=1}^{K}\alpha_i\,Q\!\left(\sqrt{\beta_i(t_i^{-1} - 1)}\right) \\
\text{subject to}\ & \theta \ge t_i \ge \frac{1}{K}\sum_{j=1}^{K}\frac{1}{1 + \lambda_j\phi_j^2}, \\
& \sum_{i=1}^{K}\phi_i^2 \le p_0
\end{aligned} \tag{4.102}$$
Unfortunately, this problem does not have a closed-form solution, and one has
to resort to iterative methods such as interior-point methods [99].

4.4 Precoder in MIMO Space-Time Code Systems


Space-time (ST) coding can effectively exploit the spatial diversity offered
by MIMO systems by appropriately mapping data streams across time and
space. In this section, we show how precoding can be used in conjunction
with space-time codes. First, we consider how to design a linear precoder for
a space-time coded MIMO system in a channel with fading correlation. Next, in
Section 4.4.2, we show how to design linear and unitary precoders in order to
maximize diversity and coding gain.

4.4.1 Linear Precoder for Space-Time Coded System with Fading


Correlation

So far, we have assumed that there is no spatial correlation among the transmit
and receive antennas. In other words, the elements of the MIMO channel matrix
fade independently. However, in many practical downlink scenarios, there
may be high correlation between base station (BS) antennas since the BS
antennas are typically placed high above the ground and see no local scatterers.
As a result, the columns of the H matrix are correlated. However, the received
signal at the mobile is a linear combination of several multipaths reflected
from several local scatterers, leading to uncorrelated fading across the receive
antennas. Hence, the rows of H are uncorrelated. Studies show that fading
correlation reduces MIMO channel capacity and system performance [120]. In
this section, we show that a linear precoder that knows only the transmit
antenna correlation matrix can be employed to improve the performance.
Suppose that, in such an environment, the MIMO flat fading channel can be written
as
$$H = H_wR_a^{1/2} \tag{4.103}$$
where $R_a^{1/2}$ is the square root of the Mt × Mt transmit antenna correlation matrix
$R_a$ and $H_w$ is an Mr × Mt i.i.d. complex matrix.
We will assume a block-fading model wherein $H_w$ remains constant over
the entire period spanning a space-time codeword and then changes
independently for the next space-time block.
Fig. 4.10 shows the block diagram of the MIMO system with space-time
code and precoder. Let the length of the space-time codeword be N time symbols.
At time instant n, the space-time encoder takes in a set of input bits and
creates a K × 1 output code symbol vector
$$c(n) = [c_1(n), c_2(n), ..., c_K(n)]^T$$
where K ≤ Mt. A time sequence of N code symbol vectors forms a K × N
space-time codeword,
$$x(t) = [c(Nt), c(Nt - 1), ..., c(Nt - N + 1)].$$
The K × N space-time codeword is then processed by the Mt × K precoder
matrix F to produce an Mt × N output matrix, which is then sent over Mt antennas.
Note that the precoder F adds redundancy of Mt − K and may provide
additional diversity gain. The system equation is given by
$$y = HFx + w \tag{4.104}$$
where x is the K × N codeword matrix, y is the Mr × N received signal matrix
and w is the Mr × N noise matrix. We assume $R_{nn} = \sigma^2I$. Substituting (4.103)
into (4.104) yields
$$y = H_wR_a^{1/2}Fx + w \tag{4.105}$$

Fig. 4.10. Linear precoder and space-time code in correlated fading channel [[110] © IEEE].

Let $x_k(t)$ be the K × N transmitted space-time codeword at time t. At the
receiver, maximum likelihood (ML) detection is performed. If the ML
decoder chooses the nearest distinct K × N codeword $x_l(t)$ instead of $x_k(t)$,
the pairwise error matrix can be written as $\Delta x(k, l, t) = x_k(t) - x_l(t)$. An
upper bound on the pairwise error probability (PEP) is given by [128]
$$P(x_k(t) \to x_l(t)) \le \frac{1}{\left(\lambda_{gm}\frac{1}{4\sigma^2}\right)^{vM_r}} \tag{4.106}$$
where v is the rank of the matrix $\Delta x(k, l, t)\Delta x^H(k, l, t)$ and $\lambda_{gm}$ stands for
the geometric mean of the v nonzero eigenvalues $\{\lambda_i\}_{i=1}^{v}$ of
$\Delta x(k, l, t)\Delta x^H(k, l, t)$, i.e., $\lambda_{gm} = \left(\prod_{i=1}^{v}\lambda_i\right)^{1/v}$.
The diversity gain is defined as
$$G_d = M_r\min_{x_k(t),\,x_l(t)}\mathrm{rank}\left(\Delta x(k, l, t)\Delta x^H(k, l, t)\right) \tag{4.107}$$
As seen from (4.106), the diversity gain determines the slope of the upper
bound on the pairwise error probability versus SNR on a log-log scale. Maximum
diversity is obtained if the matrix $\Delta x(k, l, t)$ is full rank for all distinct k, l.
The coding gain is the minimum of the normalized product of the nonzero
eigenvalues of
$$\Delta x(k, l, t)\Delta x^H(k, l, t)$$
among all distinct pairs $(x_k(t), x_l(t))$. In other words, the coding gain can
be written as
$$G_c = \min\left(\prod_{i=1}^{v}\lambda_i\right)^{1/v} \tag{4.108}$$
For a given diversity gain, the coding gain measures the saving in SNR of the
space-time code.
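
The roles of the two gains in the bound (4.106) can be seen in a small numerical sketch. The function names and the eigenvalues below are hypothetical and chosen only for illustration; `eigs` stands for the v nonzero eigenvalues of the error matrix product for one codeword pair.

```python
import math

def geometric_mean(eigs):
    """Geometric mean of the nonzero eigenvalues (lambda_gm in (4.106))."""
    return math.exp(sum(math.log(e) for e in eigs) / len(eigs))

def pep_upper_bound(eigs, sigma2, Mr):
    """Upper bound (4.106) for one codeword pair and Mr receive antennas."""
    v = len(eigs)                      # rank of the error matrix product
    return (geometric_mean(eigs) / (4.0 * sigma2)) ** (-v * Mr)

eigs = [4.0, 4.0]                      # hypothetical eigenvalues, v = 2
b_low_snr = pep_upper_bound(eigs, sigma2=0.1, Mr=2)
b_high_snr = pep_upper_bound(eigs, sigma2=0.01, Mr=2)
# A 10 dB drop in noise power lowers the bound by v * Mr decades.
print(b_low_snr, b_high_snr, math.log10(b_low_snr / b_high_snr))
```

The slope in decades per 10 dB equals v·Mr, while a larger geometric mean shifts the whole curve without changing its slope.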
We can design an optimal linear precoder for such a known space-time
encoder. With the inclusion of the linear precoder F, the effective minimum
distance error matrix becomes $\widetilde{\Delta x} = F\Delta x$. Similar to the proof steps in [128],
the PEP can be upper bounded by
$$P(x_k(t) \to x_l(t)) \le e^{-d_{\min}^2(t)/2} \tag{4.109}$$
where
$$d_{\min}^2 = (1/\sigma^2)\left\|H_w(t)R_a^{1/2}\widetilde{\Delta x}\right\|_F^2 = (1/\sigma^2)\left\|H_w(t)R_a^{1/2}F\Delta x\right\|_F^2 \tag{4.110}$$
and $\|\cdot\|_F$ is the Frobenius norm. Let
$$D = (1/\sigma^2)R_a^{1/2}F\Delta x\Delta x^HF^HR_a^{H/2}$$

Define the eigenvalue decomposition (EVD) of D as $D = V_d\Lambda_dV_d^H$, where
$V_d$ is an Mt × Mt unitary matrix whose columns are the eigenvectors of
D and $\Lambda_d$ is an Mt × Mt diagonal matrix whose diagonal elements
$\{\lambda_{d,i}\}$, i = 1, 2, ..., Mt, are the eigenvalues of D.
An upper bound on the average PEP can be obtained by taking the
expectation of the PEP with respect to $H_w$. Thus, the average PEP is bounded
as
$$P(x_k(t) \to x_l(t)) \le \left(\prod_{i=1}^{M_t}\frac{1}{1 + \lambda_{d,i}}\right)^{M_r} \tag{4.111}$$
Since $\prod_{i=1}^{M_t}(1 + \lambda_{d,i}) = \det(I + D)$, we have
$$P(x_k(t) \to x_l(t)) \le \left(\det(I + D)\right)^{-M_r} = \left(\det\left(I + (1/\sigma^2)R_a^{1/2}F\Delta x\Delta x^HF^HR_a^{H/2}\right)\right)^{-M_r} \tag{4.112}$$
Now we can find the optimal F that minimizes this bound on the average PEP by
solving the following optimization problem:
$$\max_{F}\ J = \det\left(I + (1/\sigma^2)R_a^{1/2}F\Delta x\Delta x^HF^HR_a^{H/2}\right) \quad \text{subject to} \quad \mathrm{tr}(FF^H) = p_0 \tag{4.113}$$

where $p_0$ is the total transmit power across the Mt transmit antennas. We
initially assume that rank(Ra) = K for simplicity. Let us define the EVD
$\Delta x\Delta x^H = V_x\Lambda_xV_x^H$, where $V_x$ is the K × K orthonormal eigenmatrix and
$\Lambda_x$ is the K × K diagonal matrix of eigenvalues $\lambda_{x,i}$, i = 1, 2, ..., K. Let us
also define the singular value decomposition (SVD)
$$R_a^{1/2} = (U_r\ \ \tilde{U}_r)\begin{pmatrix}\Lambda_r & 0 \\ 0 & \tilde{\Lambda}_r\end{pmatrix}(V_r\ \ \tilde{V}_r)^H \tag{4.114}$$
where $U_r$ and $V_r$ are Mt × K matrices whose orthonormal columns form a basis for
the range space of $R_a^{1/2}$, and $\tilde{V}_r$ and $\tilde{U}_r$ are Mt × (Mt − K) matrices whose
orthonormal columns constitute a basis for the null space of $R_a^{1/2}$. $\Lambda_r$ is a diagonal
matrix containing the K nonzero eigenvalues $\{\lambda_{r,i}\}_{i=1}^{K}$ arranged in decreasing order
from top-left to bottom-right, and $\tilde{\Lambda}_r$ contains the zero eigenvalues.
It is shown in [110] that the solution to the above optimization problem
is given by
$$F = V_r\Phi_fV_x^H, \qquad \Phi_f = \left(\gamma I - \Lambda_r^{-2}\Lambda_x^{-1}\right)_+^{1/2} \tag{4.115}$$
where Φf is a K × K matrix. We see that the power allocation on the eigenmodes
of $R_a^{1/2}$ follows a water-pouring policy and depends on the eigenvalues
of $\Delta x\Delta x^H$ and $R_a^{1/2}$. Note that when rank(Ra) = K, the precoder
should allocate power on the strongest K eigenvectors of the transmit antenna
correlation matrix since, otherwise, the cost function is not maximized.
Typically orthogonal space-time codes designed for i.i.d channels have
ΔxΔxH = βI. Thus, Λx = βI and Vx = I, where β is a scalar. In this case,
the rotation matrix Vr ensures that the optimal precoder allocates power only
to the eigenmodes of Ra .
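
The water-pouring allocation in (4.115) can be sketched numerically by a bisection search on the water level γ. This is an illustrative sketch only: the function name `precoder_powers` and the example values are assumptions, `lam_r` holds the diagonal of Λr and `lam_x` the diagonal of Λx (all ones here, as for an orthogonal code with β = 1).

```python
def precoder_powers(lam_r, lam_x, p0, iters=100):
    """Squared diagonal of Phi_f in (4.115): (gamma - 1/(lam_r^2 lam_x))_+ with sum p0."""
    floors = [1.0 / (r * r * x) for r, x in zip(lam_r, lam_x)]
    lo, hi = min(floors), max(floors) + p0   # the water level lies in [lo, hi]
    for _ in range(iters):
        gamma = 0.5 * (lo + hi)
        used = sum(max(gamma - f, 0.0) for f in floors)
        if used > p0:
            hi = gamma    # too much power spent: lower the level
        else:
            lo = gamma    # power left over: raise the level
    return [max(gamma - f, 0.0) for f in floors]

# Hypothetical correlated channel: the second eigenmode is much weaker.
low_power = precoder_powers([1.0, 0.5], [1.0, 1.0], p0=2.0)
high_power = precoder_powers([1.0, 0.5], [1.0, 1.0], p0=5.0)
print(low_power, high_power)
```

With a small power budget only the strongest eigenmode is used; with a larger budget the weaker eigenmode is switched on as well.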
Fig. 4.11 [110] shows the advantage of precoding for a rate-3/4 space-time
coded system with Mt equal to 2 and 3, Mr = 2 and K = 3. The antenna
correlation coefficient is $r_{ij}/(r_{ii}r_{jj})^{1/2} = 0.7$, where $r_{ij}$ is the (i, j)th element of
Ra. For a BER of $10^{-2}$, the precoding gain is 4.7 dB over a non-precoded
system.

4.4.2 Linear Constellation Precoding (LCP) for Space-Time Codes

As we discussed before, space-time (ST) coding is a powerful technique that
can offer both spatial diversity and coding gain. The most common examples of ST
codes are ST trellis codes and ST block codes with orthogonal design. ST trellis
codes enjoy maximum diversity and large coding gains, but they are difficult
to use with large constellation sizes because their decoding complexity
grows exponentially with the transmission rate. Space-time orthogonal design
(ST-OD) codes, on the other hand, offer maximum transmit diversity
with affordable low-complexity decoders. However, when used with complex
constellations and more than two transmit antennas (Mt > 2), the
transmission rates of ST-OD codes are reduced.

Fig. 4.11. Precoding gain for rate 3/4 space-time code and antenna correlation function 0.7 [[110] © IEEE].

An alternative transmit diversity scheme that does not sacrifice rate is
linear constellation precoding (LCP), shown in Fig. 4.12. The coded and
modulated symbol stream from a normalized constellation C is first parsed
into Mt × 1 signal vectors x and then linearly precoded by an Mt × Mt matrix
F. The precoded block is then sent to the space-time code mapper, which maps
it to an Mt × Mt code matrix s that is sent over Mt antennas during Mt time
intervals. Specifically, the (i, j)th entry $s_{i,j} = u_{i,j}f_j^Tx$ is transmitted through
the ith antenna at the jth time interval, where $u_{i,j}$ denotes the (i, j)th entry
of a unitary matrix U, and the vector $f_j^T$ denotes the jth row of F. Defining
$D_x = \mathrm{diag}(f_1^Tx, ..., f_{M_t}^Tx)$, we can write the Mt × Mt transmitted ST-LCP
code matrix as
$$s = UD_x \tag{4.116}$$
The received signal at the receive antennas can be expressed as
$$y = Hs + w = HUD_x + w \tag{4.117}$$
where y is the Mr × Mt received signal matrix and w is the Mr × Mt noise matrix
whose entries $[w]_{i,j}$ are circularly symmetric complex Gaussian noise. At the receiver
end, we rely on y to detect s with the maximum likelihood (ML) detection
algorithm. We assume that the channel coefficients are known only to the
receiver.
Consider the pairwise error matrix $s_k - s_l$, where $s = U\,\mathrm{diag}(f_1^Tx, ..., f_{M_t}^Tx)$.
From (4.106), and using the facts that U is full rank and $D_x$ is diagonal, the
diversity gain for the LCP system is given by $M_r|N_{\Delta x}|$, where $N_{\Delta x}$ is given
by
$$N_{\Delta x} = \{m : |f_m^T(x_k - x_l)|^2 \ne 0\} \tag{4.118}$$
and $|N_{\Delta x}|$ denotes the cardinality of $N_{\Delta x}$. The maximum diversity gain
MtMr is achieved if the following maximum diversity condition holds:
$$f_m^T(x_k - x_l) \ne 0 \quad \forall\, m \in [1, M_t],\ \forall\, x_k \ne x_l \in C^{M_t} \tag{4.119}$$

Fig. 4.12. Linear constellation precoder for space-time MIMO system [[154] © IEEE].

Since $f_m^T(x_k - x_l)$ is the mth coordinate of the precoded vector $F(x_k - x_l)$,
we conclude from the above condition that, to achieve maximum diversity, the
Mt × 1 output vectors $Fx_k$ of distinct inputs must differ in all Mt coordinates.
From (4.108) and the fact that $\lambda_m = |f_m^T(x_k - x_l)|^2$ for m = 1, 2, ..., Mt,
we can express the coding gain of the LCP matrix when Gd = MtMr as
$$G_c = \min_{x_k \ne x_l}\prod_{m=1}^{M_t}\left|f_m^T(x_k - x_l)\right|^{2/M_t} \tag{4.120}$$

We will design the linear constellation precoder F such that it guarantees
maximum diversity and high coding gain. We shall not impose any constraint
on F except the following power constraint
$$\mathrm{tr}(FF^H) = M_t \tag{4.121}$$
which ensures that the total transmit energy over Mt time intervals is
$E\{\|Fx\|^2\} = E\{\|x\|^2\} = M_t$.

Existence of Diversity-Maximizer Unitary Precoder

Unitary constellation precoding has certain advantages over non-unitary LCP.
A unitary LCP preserves distances among the Mt-dimensional constellation points.
A non-unitary LCP, on the other hand, moves some pairs of constellation points
closer to or farther from each other.
Theorem 3: If the constellation size is finite, there always exists at least one
unitary precoder F satisfying (4.119), i.e., achieving the maximum diversity
gain MtMr.
Proof: From (4.119), it suffices to show that there exists an Mt × Mt unitary
matrix F such that each Mt × 1 output vector $F(x_k - x_l)$ is nonzero in
all its Mt coordinates for all distinct $x_k$ and $x_l$. For a detailed proof, see [154].

Coding-Gain-Maximizer Unitary Precoder

Now, we look for a unitary $F$ that maximizes the coding gain among the diversity-maximizing unitary precoders. The optimum unitary precoder can be found by solving the following optimization problem

$$F_{opt} = \arg\max_{F} \min_{x_k \neq x_l} \prod_{m=1}^{M_t} |f_m^T (x_k - x_l)|^{2/M_t} \quad \text{subject to} \quad \mathrm{tr}(F F^H) = M_t \tag{4.122}$$

Finding $F_{opt}$ involves a multidimensional nonlinear optimization over the $M_t^2$ complex entries of $F$. However, knowing that $F$ is unitary, we can parameterize it using $M_t^2$ real parameters that take values in finite intervals.

LCP Design Based on Parameterization

Let us consider a real orthogonal precoder with $M_t = 2$,

$$F_\Phi = \begin{bmatrix} \cos\Phi & \sin\Phi \\ -\sin\Phi & \cos\Phi \end{bmatrix} \tag{4.123}$$

where $-\pi/2 \le \Phi \le \pi/2$. For a suitable choice of $\Phi$, this precoder rotates every constellation point so that each rotated point differs from all other rotated points in both coordinates. Therefore, it is a diversity-maximizing precoder.
Another example is a 2 × 2 complex unitary matrix, which can be parameterized as

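$$F_{\Psi,\Phi} = D\, U_{\Psi,\Phi}, \tag{4.124}$$

where

$$U_{\Psi,\Phi} = \begin{bmatrix} \cos\Psi & e^{-j\Phi}\sin\Psi \\ -e^{j\Phi}\sin\Psi & \cos\Psi \end{bmatrix} \tag{4.125}$$

and $D$ is a 2 × 2 diagonal unitary matrix, with $-\pi \le \Psi \le \pi$ and $-\pi/2 \le \Phi \le \pi/2$.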


Any $M_t \times M_t$ unitary matrix can be written as

$$F = D \prod_{1 \le k \le M_t - 1,\ k+1 \le l \le M_t} G_{kl}(\Psi_{kl}, \Phi_{kl}) \tag{4.126}$$

where $D$ is an $M_t \times M_t$ diagonal unitary matrix, $-\pi \le \Psi_{kl} \le \pi$, $-\pi/2 \le \Phi_{kl} \le \pi/2$, and $G_{kl}(\Psi_{kl}, \Phi_{kl})$ is a complex Givens matrix, i.e., an identity matrix $I_{M_t}$ with the $(k,k)$th, $(l,l)$th, $(k,l)$th and $(l,k)$th entries replaced by $\cos\Psi_{kl}$, $\cos\Psi_{kl}$, $e^{-j\Phi_{kl}}\sin\Psi_{kl}$ and $-e^{j\Phi_{kl}}\sin\Psi_{kl}$, respectively.
Since multiplication by a diagonal unitary matrix preserves product distances, $D$ can be ignored in the optimization problem (4.122). Thus, the number of parameters to be optimized is $M_t (M_t - 1)$, namely the angles of the complex Givens matrices. These $M_t (M_t - 1)$ parameters are hard to compute when $M_t$ or the constellation size is large. An algebraic design of the LCP can be the solution.
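
As a rough illustration of the parameterization approach for $M_t = 2$, the sketch below evaluates the coding gain (4.120) of the real rotation precoder (4.123) on a grid of angles. It is only a sketch under stated assumptions: a BPSK alphabet {−1, +1} is assumed to keep the search small, the function names are hypothetical, and a brute-force grid search stands in for whatever optimizer an actual design would use.

```python
import itertools
import numpy as np

def rotation_precoder(phi):
    """Real orthogonal 2 x 2 precoder F_phi of (4.123)."""
    return np.array([[np.cos(phi), np.sin(phi)],
                     [-np.sin(phi), np.cos(phi)]])

def coding_gain(F, alphabet):
    """min over x_k != x_l of prod_m |f_m^T (x_k - x_l)|^(2/Mt), cf. (4.120)."""
    Mt = F.shape[0]
    points = [np.asarray(p) for p in itertools.product(alphabet, repeat=Mt)]
    gain = np.inf
    for xk, xl in itertools.combinations(points, 2):
        coords = F.T @ (xk - xl)  # m-th entry is f_m^T (x_k - x_l)
        gain = min(gain, np.prod(np.abs(coords)) ** (2.0 / Mt))
    return gain

def best_rotation(alphabet, n_grid=181):
    """Brute-force grid search over -pi/2 <= phi <= pi/2 (assumed optimizer)."""
    grid = np.linspace(-np.pi / 2, np.pi / 2, n_grid)
    gains = [coding_gain(rotation_precoder(phi), alphabet) for phi in grid]
    best = int(np.argmax(gains))
    return grid[best], gains[best]

BPSK = (-1.0, 1.0)  # assumed alphabet, for illustration only
phi_best, gain_best = best_rotation(BPSK)
```

With $\Phi = 0$ the precoder is the identity and the coding gain is zero, because some difference vectors have a zero coordinate; a nonzero rotation removes those zeros, which is the diversity-maximizing property described above.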

Algebraic Construction of LCP: method 1 (LCP-1)

Let $\mathbb{Z}[j]$ be the ring of Gaussian integers, whose elements are of the form $p + jq$ with $p, q \in \mathbb{Z}$. Let also $\mathbb{Q}(j)$ be the smallest subfield of $\mathbb{C}$ that includes both $\mathbb{Q}$ and $j$. Suppose the minimal polynomial of $\alpha$ over the subfield $\mathbb{Q}(j)$ is denoted by $m_{\alpha,\mathbb{Q}(j)}(x)$, and let $\{\alpha_m\}_{m=1}^{M_t}$ be its roots.
Then the linear constellation precoder can be constructed as follows

$$F = \frac{1}{\gamma} \begin{bmatrix} 1 & \alpha_1 & \cdots & \alpha_1^{M_t-1} \\ 1 & \alpha_2 & \cdots & \alpha_2^{M_t-1} \\ \vdots & \vdots & & \vdots \\ 1 & \alpha_{M_t} & \cdots & \alpha_{M_t}^{M_t-1} \end{bmatrix} \tag{4.127}$$

From algebraic number theory, it can be shown [154] that the first row of $F$ forms a basis for $\mathbb{Q}(j)$, i.e., $f_1^T (x_k - x_l) \in \mathbb{Q}(j)$. It follows that for all $(x_k - x_l)$ with entries in $\mathbb{Z}[j]$, $f_1^T (x_k - x_l)$ is a root of a monic polynomial with coefficients in $\mathbb{Z}[j]$. Defining $\eta_m(\alpha) = \alpha_m$ for $m \in [1, M_t]$, we have $\eta_m(f_1^T (x_k - x_l)) = f_m^T (x_k - x_l)$. The product $\prod_{m=1}^{M_t} \eta_m(f_1^T (x_k - x_l))$ is called the relative norm of $f_1^T (x_k - x_l)$, which is equivalent to the definition of the product distance in (4.120). It can be shown that if $f_1^T (x_k - x_l)$ is in $\mathbb{Q}(j)$ and is also a root of a monic polynomial with coefficients in $\mathbb{Z}[j]$, then its relative norm is in $\mathbb{Z}[j]$. Therefore, for $x_k \neq x_l$, the minimum product distance is nonzero and belongs to $\mathbb{Z}[j]$, so its magnitude is at least one.
After taking into account the constant $1/\gamma$ and the energy normalization, the coding gain is equal to $4d^2/(\gamma^2 E_s)$, where $d$ and $E_s$ depend on the constellation.
If the constellation belongs to $\mathbb{Z}[j]$ and is normalized by $\sqrt{E_s}$, the coding gain is lower bounded as [154]

$$G_c \ge \frac{1}{\gamma^2 E_s} \tag{4.128}$$

Also, for QAM and PAM constellations whose points are normalized by $E_s$ and have a minimum distance equal to $2d$, the maximum coding gain is lower and upper bounded by

$$(\ln 2)\, \frac{4d^2}{M_t E_s} \le G_c^{max} \le \frac{4d^2}{M_t E_s} \tag{4.129}$$

Let us define $S = \{M_t : M_t = \deg(m_{\alpha,\mathbb{Q}(j)}(x)),\ \alpha = e^{j2\pi/P},\ P \in \mathbb{N}\}$. It can be proved that odd integers $M_t > 1$ do not belong to $S$, while $S$ does contain even integers such as the powers of two. It is shown in [154] that LCP-1 can achieve the upper bound in (4.129) only if $M_t \in S$.

Algebraic Construction of LCP: method 2 (LCP-2)

LCP-1 does not guarantee a unitary precoder matrix. As mentioned before, a unitary precoder is often preferred over a non-unitary one. A unitary precoder for any value of $M_t$ can be constructed as

$$F = F_{M_t}^T\, \mathrm{diag}(1, \alpha, \ldots, \alpha^{M_t-1}), \quad \alpha = e^{j2\pi/P} \tag{4.130}$$

where $F_{M_t}$ is the $M_t$-point inverse fast Fourier transform (IFFT) matrix whose $(i,j)$th entry is given by $\frac{1}{\sqrt{M_t}} e^{j\frac{2\pi}{M_t}(i-1)(j-1)}$. A lower bound for the coding gain of LCP-2 is given by [154]

$$G_c \ge \left(\frac{1}{H M_t}\right)^{2(\chi-1)} \left(\frac{1}{M_t E_s}\right) \tag{4.131}$$

where $\chi = \sum_{i=1}^{I} D_i / M_t$ and $H = \max_{x_k, x_l \in C} \|\sqrt{E_s}\,(x_k - x_l)\|$. Here $I$ is the number of distinct minimal polynomials $p_i(x)$ of $\beta_m = \alpha e^{j2\pi(m-1)/M_t}$, $m = 1, \ldots, M_t$, over $\mathbb{Q}(j)$, and $D_i$ is the degree of $p_i(x)$, $i = 1, \ldots, I$, with $D_i \ge M_t$.
To design an LCP with a large coding gain, the number of distinct minimal polynomials of the $\beta_m$ and their degrees must be as small as possible, so that $\chi$ is small [154].
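
The LCP-2 construction in (4.130) is simple enough to check numerically. The following is a minimal sketch, assuming NumPy; the function name `lcp2_precoder` and the value $P = 16$ are illustrative choices, not values taken from [154]. It builds $F$ from the IFFT matrix and a diagonal phase matrix and confirms that the result is unitary and meets the power constraint (4.121).

```python
import numpy as np

def lcp2_precoder(Mt, P):
    """F = F_Mt^T diag(1, alpha, ..., alpha^(Mt-1)), alpha = exp(j 2 pi / P), cf. (4.130)."""
    idx = np.arange(Mt)
    # Mt-point IFFT matrix with unit-norm columns: entry (i, j) = exp(j 2 pi i j / Mt) / sqrt(Mt)
    ifft_matrix = np.exp(2j * np.pi * np.outer(idx, idx) / Mt) / np.sqrt(Mt)
    alpha = np.exp(2j * np.pi / P)
    return ifft_matrix.T @ np.diag(alpha ** idx)

F = lcp2_precoder(Mt=4, P=16)  # P = 16 is an arbitrary illustrative choice
is_unitary = np.allclose(F @ F.conj().T, np.eye(4))
total_power = np.trace(F @ F.conj().T).real  # should equal Mt
```

Because both factors are unitary for every $M_t$ and $P$, the product is unitary as well; the choice of $P$ only affects the coding gain.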
It turns out that the space-time code with a unitary constellation precoder can achieve higher average mutual information than orthogonal-design space-time codes for $M_t > 2$ and large SNR [154].
The PEP performance of ST-LCP and ST-OD is compared via simulation, and the results are plotted in Fig. 4.13 and Fig. 4.14. In both figures, $M_t = 3, 4$. For $M_t = 4$, the complex precoders $F$ were constructed according to LCP-1. For $M_t = 3$, the parameterization method was used. Rate-3/4 ST-OD codes were constructed according to [127] and were compared with rate-1 ST-LCP codes. A 64-QAM constellation was used for ST-LCP to maintain the same spectral efficiency of 6 bits/s/Hz that ST-OD achieves with a 256-QAM constellation. The SNR gain of ST-LCP is less than 1 dB in Fig. 4.13; it increases to 3 dB when $M_r = 2$ in Fig. 4.14. We see that ST-LCP outperforms ST-OD in all cases.

Fig. 4.13. BER performance of ST-OD (256-QAM) versus ST-LCP (64-QAM), with Mt = 3, 4, Mr = 1 [154] (© IEEE).

Fig. 4.14. BER performance of ST-OD (256-QAM) versus ST-LCP (64-QAM), with Mt = 3, 4, Mr = 2 [154] (© IEEE).

4.5 Precoding Techniques for the Limited Feedback Channel Capacity

Precoding in the MIMO channel enables the transmitter to adjust its transmit signal pattern so that the strongest channel mode is exploited to enhance the system performance, provided that the channel state information is known to the transmitter. In time division duplexing (TDD) communication systems, where the uplink and downlink channels share the same frequency band at different times, the channel can be estimated by the transmitter thanks to channel reciprocity, as long as the channel state is unchanged. In frequency division duplexing (FDD) systems, on the contrary, different frequency bands are allocated to the downlink and uplink channels, and the channel responses in the two bands are not the same. Hence, a feedback channel is necessary to deliver the channel knowledge estimated at the receiver back to the transmitter.
For a time-varying channel, the channel knowledge at the transmitter should be updated every time the channel changes. The feedback overhead, which increases linearly with the product of the antenna numbers at both ends and the channel delay spread, could be large. Therefore, precoder design using full channel information is not practical for a system with limited feedback channel capacity. A precoding scheme that demands less feedback capacity while still enjoying the precoding gain is preferable. Two precoding schemes, based on either channel statistic knowledge [145] or the idea called unitary precoding [80], were proposed to reduce the feedback overhead and will be reviewed next.

4.5.1 Precoding with Channel Statistics Knowledge

Since real-time channel information may not be available at the transmitter due to either fast channel variation or insufficient feedback capacity, channel statistic knowledge, namely the channel mean or covariance, which varies less frequently, is more likely to be delivered to the transmitter. Precoder design with either the channel mean or the covariance matrix available at the transmitter was studied in [91, 145, 59] from the channel capacity point of view. For a specified input covariance matrix, the channel capacity is achieved by the use of vector Gaussian symbols [144]. Therefore, the goal of the precoder design here is to determine the input covariance matrix that maximizes the mutual information when the channel statistics are known to the transmitter [91, 145, 59]. Some related research results are reviewed next.
The downlink multiple-input single-output (MISO) scenario considered in [91, 145] is used as an example to illustrate this design idea.¹ The block diagram of the MISO precoder is given in Fig. 4.15, where $M_t$ transmit antennas are separated far enough at the base station to exploit spatial diversity and a single receive antenna is employed at the mobile unit due to its size and power limitations.

[Figure: the input data stream passes through the precoder and is sent from antennas Tx-1 to Tx-Mt over the channel gains $h_1, \ldots, h_{M_t}$; the signal received at Rx-1 with noise $n$ is decoded into the output data stream; channel statistic information is supplied to the precoder.]

Fig. 4.15. The block diagram of precoder design using channel statistic knowledge [145] (© IEEE).

¹ Based on the similar design concept in [91, 145], Jafar and Goldsmith later extended the precoder design to the MIMO channel in [59].

Furthermore, we assume that only one data link is established at a time and that multiple access is achieved via either time division multiple access (TDMA) or frequency division multiple access (FDMA). Let $x$ be the $M_t \times 1$ input signal vector at the transmitter. The baseband received signal in the flat fading channel is described as

y = x† h + n, (4.132)

where $n$ is a zero-mean circularly symmetric complex Gaussian noise sample whose power is equal to $\sigma_n^2$ and $h$ is an $M_t \times 1$ channel vector. Let $h$ be a circularly symmetric Gaussian random vector as well, whose probability density function (PDF) is completely specified by its mean $\nu$ and covariance $\Sigma$ and is denoted as $h \sim \mathcal{CN}(\nu, \Sigma)$.
Let $Q$ be the input covariance matrix of $x$. Conditioned on the channel realization $h$, the maximum mutual information is given as

$$I(x; y \mid h) = \log\left(1 + \frac{h^\dagger Q h}{\sigma_n^2}\right). \tag{4.133}$$

The precoder design problem is formulated as

$$Q_o = \arg\max_{Q} E_h\{I(x; y \mid h)\} \quad \text{s.t.} \quad E\{x^\dagger x\} = \mathrm{trace}\{Q\} = P, \tag{4.134}$$

where the average output signal power is limited to $P$. Let the eigen-decomposition of the matrix $Q_o$ be

$$Q_o = U_{Q_o} \Lambda_{Q_o} U_{Q_o}^\dagger, \tag{4.135}$$

where

$$U_{Q_o} = [u_{Q_o,1}, \cdots, u_{Q_o,M_t}], \tag{4.136}$$

$$\Lambda_{Q_o} = \mathrm{diag}(\lambda_{Q_o,1}, \cdots, \lambda_{Q_o,M_t}), \tag{4.137}$$

and

$$\lambda_{Q_o,1} \ge \lambda_{Q_o,2} \ge \cdots \ge \lambda_{Q_o,M_t}. \tag{4.138}$$

Please note that the $i$th eigenvector $u_{Q_o,i}$ specifies the direction of the $i$th transmit signal, while the $i$th diagonal element $\lambda_{Q_o,i}$ determines the corresponding power emitted along $u_{Q_o,i}$. Later, we design the optimal input covariance matrix in different scenarios by specifying its eigenvalues and eigenvectors.
Before discussing the precoding algorithm using channel statistic information, let us consider two special cases. First, when the transmitter knows nothing about the channel, the expected value of the mutual information is upper bounded as

$$E_h\{I(x; y \mid h)\} \le E_h\left\{\log\left(1 + \frac{\|h\|^2 P}{M_t \sigma_n^2}\right)\right\}, \tag{4.139}$$

where $\|x\|$ is the 2-norm of the vector $x$ [91]. Please note that equality holds in the above bound if and only if

$$Q_o = \frac{P}{M_t} I_{M_t}. \tag{4.140}$$
As (4.140) implies, the channel capacity is achieved by transmitting $M_t$ signals in $M_t$ orthogonal directions with equal power, i.e., there is no directional preference for the transmit signal. In what follows, we refer to it as the diversity transmission scheme. Second, when ideal channel knowledge is available at the transmitter, we can develop the following upper bound [91]

$$E_h\{I(x; y \mid h)\} \le E_h\left\{\log\left(1 + \frac{\|h\|^2 P}{\sigma_n^2}\right)\right\}, \tag{4.141}$$

where equality holds if and only if

$$Q_o = P\, \frac{h h^\dagger}{\|h\|^2}. \tag{4.142}$$

This result implies that when the channel state information is perfectly known at the transmitter, the beamforming scheme, which combines the $M_t$ channel gains coherently at the receiver, is able to achieve capacity. Comparing these two capacity-achieving schemes, the beamforming scheme not only provides a higher channel capacity but also requires less complexity for symbol decoding than the diversity transmission scheme. However, it demands that instantaneous channel knowledge be available at the transmitter. Delivering real-time channel information from the receiver to the transmitter may not be practical in a time-varying channel due to feedback delay. Furthermore, the channel information estimated at the receiver is not perfect either.
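
The two special cases can be compared numerically for a single channel realization. The sketch below is illustrative only: it assumes NumPy, uses a base-2 logarithm (the text leaves the base of the logarithm unspecified), and the function names and the values of $M_t$, $P$ and $\sigma_n^2$ are hypothetical. It evaluates (4.133) for the diversity covariance (4.140) and the beamforming covariance (4.142).

```python
import numpy as np

def mutual_information(h, Q, noise_power):
    """log2(1 + h^H Q h / sigma_n^2), i.e. (4.133) with an assumed base-2 logarithm."""
    return np.log2(1.0 + np.real(h.conj() @ Q @ h) / noise_power)

def diversity_covariance(Mt, P):
    """Q = (P / Mt) I, cf. (4.140)."""
    return (P / Mt) * np.eye(Mt)

def beamforming_covariance(h, P):
    """Q = P h h^H / ||h||^2, cf. (4.142)."""
    return P * np.outer(h, h.conj()) / np.linalg.norm(h) ** 2

rng = np.random.default_rng(0)
Mt, P, noise_power = 4, 1.0, 0.1  # illustrative values
h = (rng.standard_normal(Mt) + 1j * rng.standard_normal(Mt)) / np.sqrt(2)

rate_div = mutual_information(h, diversity_covariance(Mt, P), noise_power)
rate_bf = mutual_information(h, beamforming_covariance(h, P), noise_power)
```

For every realization the beamforming rate is at least the diversity rate, since the received signal power is $P\|h\|^2$ in one case and $P\|h\|^2/M_t$ in the other; this is the factor $M_t$ that separates the arguments of the logarithms in (4.139) and (4.141).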
Generally speaking, the solution to the constrained optimization problem in (4.134) for arbitrary $\nu$ and $\Sigma$ is difficult to find. In what follows, we consider the precoder design in (4.134) for two special cases, which depend on the time duration of interest [145, 59]. First, when the time duration of interest is short, the channel can be tracked at the transmitter based on the fed-back information [145]. The channel knowledge at the transmitter can be modeled as

$$h \sim \mathcal{CN}(\nu, \sigma_e^2 I), \tag{4.143}$$

which is a non-zero-mean circularly symmetric complex Gaussian vector. The mean $\nu$ denotes the channel response available at the transmitter, while the matrix $\sigma_e^2 I$ denotes the covariance of the estimation error, with error power equal to $\sigma_e^2$. Here we adopt the terminology of [145] and call it channel mean feedback from now on. If channel mean feedback is possible, the optimal transmission scheme given in [145] is stated in Theorem 4.
Theorem 4: (Visotsky and Madhow [145]) Given that channel mean feedback is feasible, the eigenvectors and the eigenvalues of the optimal input covariance matrix $Q_o$ satisfy the following conditions:

1. For the eigenvectors of $Q_o$, we have $u_{Q_o,1} = \nu/\|\nu\|$, and $u_{Q_o,2}, \cdots, u_{Q_o,M_t}$ form an arbitrary orthonormal basis whose span is perpendicular to $\nu$.
2. For the eigenvalues of $Q_o$, we have $\lambda_{Q_o,1} = \lambda_o$ and $\lambda_{Q_o,2} = \lambda_{Q_o,3} = \cdots = \lambda_{Q_o,M_t} = (P - \lambda_o)/(M_t - 1)$.

Proof: The proof of the above theorem is omitted here. Interested readers are referred to Sec. III in [145] for the detailed treatment.
When channel mean feedback is available, the best transmission scheme, which achieves the highest mutual information, is to transmit the signal along the direction of $\nu$ with power equal to $\lambda_o$. If $\lambda_o < P$, the residual power is distributed equally among the other $M_t - 1$ orthogonal directions. This implies that we switch from the beamforming scheme (Rank($Q_o$) = 1) to diversity transmission (Rank($Q_o$) = $M_t$). Although closed-form solutions for the transmit powers in the different directions, $\lambda_{Q_o,1}, \cdots, \lambda_{Q_o,M_t}$, are not given here, they can be computed by a numerical scheme such as the projected gradient descent algorithm [9], as pointed out in [145].
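
The structure in Theorem 4 reduces the design to a single scalar $\lambda_o$. The sketch below is one possible illustration, assuming NumPy; it is not the projected gradient method of [9, 145]. A coarse grid search over $\lambda_o$ with a Monte Carlo estimate of the average rate is swapped in for simplicity, the logarithm is taken to base 2, and all function names and parameter values are hypothetical.

```python
import numpy as np

def mean_feedback_covariance(nu, lam, P):
    """Q with eigenvalue lam along nu and (P - lam)/(Mt - 1) on the orthogonal complement."""
    Mt = nu.size
    u = nu / np.linalg.norm(nu)
    proj = np.outer(u, u.conj())                 # projector onto the direction of nu
    rest = (P - lam) / (Mt - 1)
    return lam * proj + rest * (np.eye(Mt) - proj)

def expected_rate(Q, nu, err_var, noise_power, n_samples=2000, seed=0):
    """Monte Carlo estimate of E_h{log2(1 + h^H Q h / sigma_n^2)} for h ~ CN(nu, err_var I)."""
    rng = np.random.default_rng(seed)
    shape = (n_samples, nu.size)
    e = np.sqrt(err_var / 2) * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))
    h = nu + e
    quad = np.real(np.einsum('ni,ij,nj->n', h.conj(), Q, h))  # h_n^H Q h_n for each sample
    return float(np.mean(np.log2(1.0 + quad / noise_power)))

def best_lambda(nu, err_var, P, noise_power, n_grid=21):
    """Grid search between equal power (lam = P/Mt) and pure beamforming (lam = P)."""
    grid = np.linspace(P / nu.size, P, n_grid)
    rates = [expected_rate(mean_feedback_covariance(nu, lam, P), nu, err_var, noise_power)
             for lam in grid]
    return float(grid[int(np.argmax(rates))])

nu = np.array([1.0 + 0.5j, -0.3j, 0.8])  # illustrative channel mean
lam_o = best_lambda(nu, err_var=0.1, P=1.0, noise_power=0.1)
```

The lower end of the grid, $\lambda_o = P/M_t$, gives the diversity scheme and the upper end, $\lambda_o = P$, gives pure beamforming, so the search covers the switch between the two schemes described above.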
Another channel model is suitable for a fast time-varying channel or a long channel observation period. In this case, tracking the channel mean is not practical, and the mean can be treated as zero. However, the channel covariance matrix $\Sigma$, which is determined by the relative geometry between the transmitter and the receiver [120], is more stable [145, 59]. As a result, the receiver can gradually pass the channel covariance matrix to the transmitter; this scheme is called channel covariance feedback in the sequel. Mathematically, the knowledge of the channel at the transmitter can be modeled as

$$h \sim \mathcal{CN}(0, \Sigma). \tag{4.144}$$

For channel covariance feedback, Visotsky and Madhow provide the optimal transmission scheme in [145], which is stated in the following theorem.
Theorem 5: (Visotsky and Madhow [145]) For $h \sim \mathcal{CN}(0, \Sigma)$ and channel covariance feedback at the transmitter, the best transmission scheme is to deliver signals along the eigenvector directions of $\Sigma$, i.e.,

$$U_{Q_o} = U_\Sigma, \tag{4.145}$$

where $U_\Sigma$ denotes the matrix of eigenvectors of $\Sigma$. To completely specify the input covariance matrix, the transmit power along each direction should also be given. Since different directions correspond to different channel gains, i.e., different eigenvalues of $\Sigma$, the transmit power along each direction can be determined by applying the water-filling principle. That is, the better the channel is, the more power we should assign.
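
To make the structure of Theorem 5 concrete, the following sketch builds an input covariance matrix whose eigenvectors are those of $\Sigma$ for a given power vector. It assumes NumPy, the names are hypothetical, and the power vectors are supplied by hand rather than computed by water filling. As a simple sanity measure it reports the average received SNR $\mathrm{tr}(Q\Sigma)/\sigma_n^2$, which follows from $E\{h^\dagger Q h\} = \mathrm{tr}(Q\Sigma)$; this is not the information rate itself.

```python
import numpy as np

def covariance_feedback_Q(Sigma, powers):
    """Q = U_Sigma diag(powers) U_Sigma^H; powers[0] is assigned to the strongest direction."""
    _, U = np.linalg.eigh(Sigma)   # eigh returns eigenvalues in ascending order
    U = U[:, ::-1]                 # reorder so the strongest eigen-direction comes first
    return U @ np.diag(powers) @ U.conj().T

def average_snr(Q, Sigma, noise_power):
    """E{h^H Q h} / sigma_n^2 = tr(Q Sigma) / sigma_n^2 for h ~ CN(0, Sigma)."""
    return float(np.real(np.trace(Q @ Sigma)) / noise_power)

Sigma = np.diag([2.0, 1.0, 1.0])   # illustrative covariance with eigenvalue spread 2
P, noise_power = 1.0, 1.0
Q_div = covariance_feedback_Q(Sigma, [P / 3, P / 3, P / 3])  # equal power
Q_bf = covariance_feedback_Q(Sigma, [P, 0.0, 0.0])           # all power on strongest mode

snr_div = average_snr(Q_div, Sigma, noise_power)
snr_bf = average_snr(Q_bf, Sigma, noise_power)
```

Any covariance matrix built this way commutes with $\Sigma$, which is the condition $U_{Q_o} = U_\Sigma$ in (4.145).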

Example 1: Channel Mean Feedback


Here we borrow the numerical results from Sec. IV in [145] to compare the information rates of different transmission schemes in a system with two transmit antennas. The channel model we consider is the first-order autoregressive channel model, which is described as

$$h(t) = a\,h(t-1) + w(t), \tag{4.146}$$

where $a$ is the forgetting factor and $w(t) \sim \mathcal{CN}(0, \sigma_w^2)$ is an i.i.d. Gaussian vector. Due to the feedback delay $d$, the channel feedback function is

$$f(t) = h(t - d). \tag{4.147}$$

Based on (4.146) and (4.147), the knowledge of the channel at the transmitter can be shown to be

$$h(t) \sim \mathcal{CN}(\nu, \sigma_n^2 I), \tag{4.148}$$

where $\nu = a^d h(t-d)$ and $\sigma_n^2 = \sigma_w^2 (1 - a^{2d})/(1 - a^2)$. Please note that the value of $a^d$ in (4.148) determines the quality of the channel information: the greater $a^d$ is, the better the channel information becomes. Three transmission schemes are compared: the diversity scheme (i.e., $Q = \frac{P}{M_t} I_{M_t}$), the beamforming scheme (i.e., $Q = P \frac{\nu\nu^\dagger}{\|\nu\|^2}$), and the optimal scheme introduced in Theorem 4. The results in Fig. 4.16 and Fig. 4.17 show the information rate of the different schemes for two levels of feedback quality, namely $a^d = 0.9$ and $a^d = 0.3$. It is observed from Fig. 4.16 and Fig. 4.17 that when the transmitter has a better understanding of the current channel, beamforming achieves the highest information rate while the diversity scheme is 2 dB away from optimal. However, when the channel uncertainty at the precoder increases, the diversity transmission scheme outperforms beamforming since the beamforming direction is not very accurate.

Fig. 4.16. Information rate of different transmission schemes in the channel mean feedback case, $a^d = 0.9$ [145] (© IEEE).

Fig. 4.17. Information rate of different transmission schemes in the channel mean feedback case, $a^d = 0.3$ [145] (© IEEE).
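
The error variance quoted for (4.148) can be recovered by unrolling the recursion (4.146) $d$ times, which gives $h(t) = a^d h(t-d) + \sum_{i=0}^{d-1} a^i w(t-i)$. The short sketch below, with hypothetical function names and a real-valued $a$ assumed, sums the variances of the independent noise terms and compares the result with the closed form.

```python
def error_variance_sum(a, d, sigma_w2):
    """Variance of sum_{i=0}^{d-1} a^i w(t-i) per channel coefficient (real a assumed)."""
    return sigma_w2 * sum(a ** (2 * i) for i in range(d))

def error_variance_closed_form(a, d, sigma_w2):
    """sigma_n^2 = sigma_w^2 (1 - a^(2d)) / (1 - a^2), as stated after (4.148)."""
    return sigma_w2 * (1.0 - a ** (2 * d)) / (1.0 - a ** 2)

# Illustrative values: forgetting factor 0.9, delay of 4 samples, unit-free noise power 0.5.
v_sum = error_variance_sum(0.9, 4, 0.5)
v_closed = error_variance_closed_form(0.9, 4, 0.5)
```

The two values agree because the sum is a finite geometric series in $a^2$. As $a^d$ approaches one the variance shrinks toward zero, which matches the remark that a larger $a^d$ means better channel information.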
Example 2: Channel Covariance Feedback
The numerical results for different transmission schemes based on channel covariance feedback in [145] are given here. Three transmit antennas are deployed to deliver the information symbols using different transmission methods, namely, the diversity scheme (i.e., $Q = \frac{P}{M_t} I_{M_t}$), the beamforming scheme, which sends the information along the direction of the strongest eigenvector, and the optimal scheme given in Theorem 5. To demonstrate how the eigenvalue spread of the channel covariance matrix affects the achieved information rate, we consider two cases, $\sigma_1 = \sigma_2 = \sigma_3$ and $\sigma_1/\sigma_2 = \sigma_1/\sigma_3 = 2$, where $\sigma_i$ denotes the $i$th eigenvalue of $\Sigma$. The results are shown in Fig. 4.18 and Fig. 4.19. As these figures suggest, when the eigenvalues of $\Sigma$ are all equal, the diversity scheme is optimal. On the contrary, when the eigenvalue spread of $\Sigma$ increases, beamforming provides almost the same performance as the optimum scheme.
From Examples 1 and 2 we observe that the beamforming scheme is the optimal transmission scheme when either the quality of the fed-back channel mean improves or the eigenvalue spread of the fed-back covariance matrix grows. As mentioned earlier, the beamforming scheme, which transmits only one data symbol at a time, requires lower decoding complexity than the diversity transmission scheme, where multiple data symbols must be separated at the receiver. In [145], Visotsky and Madhow demonstrated the optimality of beamforming via simulation only. Later, based on the same concepts of channel mean and covariance feedback, Jafar and Goldsmith generalized the precoder design in [145] to the MIMO channel and provided the necessary and sufficient condition for the optimality of beamforming in [59]. Interested readers are referred to [59] for a detailed treatment of this topic.

Fig. 4.18. Information rate of different transmission schemes in the channel covariance feedback case, $\sigma_1 = \sigma_2 = \sigma_3$ [145] (© IEEE).

Fig. 4.19. Information rate of different transmission schemes in the channel covariance feedback case, $\sigma_1/\sigma_2 = \sigma_1/\sigma_3 = 2$ [145] (© IEEE).

4.5.2 Unitary Precoding


The previously introduced precoding scheme is suitable for fast time-varying systems. However, when the channel coherence time is long enough for the receiver to pass some channel knowledge to the other end of the link, a precoder designed with channel statistics only fails to adapt its transmitted signal to the current channel state, and its performance degrades as a result. Since the channel knowledge is always quantized before being passed to the transmitter, the quantization level poses a design trade-off between precision and feedback overhead. In addition, when quantized channel information is applied to the precoder design, it is hard to determine the quantization level required for a given system performance specification. Next, we review another precoding scheme, called unitary precoding, which quantizes the precoder rather than the channel information and provides more flexibility than direct channel quantization in a feedback-limited channel.
A codebook containing $N$ codewords is first constructed for the unitary precoding system. Since these $N$ codewords are independent of the current channel state information, they can be computed off-line and made available at both ends of the communication link. Based on the estimated channel knowledge, the receiver selects a codeword from the codebook according to a certain criterion and then sends the corresponding codeword index to the transmitter for symbol precoding. It is worthwhile to point out that the codebook size $N$ is limited by the number of bits per feedback, $l$, i.e.,

$$N \le 2^l. \tag{4.149}$$

Obviously, the feedback quantity $l$ provides a trade-off between the required feedback capacity and the resulting system performance. Two questions follow immediately from this unitary precoding scheme: 1) what is the criterion for codeword selection? and 2) how should the codebook be constructed? Next, we borrow the results for the precoded OSTBC system in [80] to answer these two questions.²
² Even though the unitary precoding idea can be applied to other communication systems, such as transmit beamforming [79] and spatial multiplexing systems [80], the codeword selection and codebook construction processes are in fact quite similar to those of the precoded OSTBC system [80]. Only their design criteria are different.

4.5.3 System Model and the Optimal Precoder for Unitary Precoded OSTBC Systems

The purpose of OSTBC is to increase the diversity gain of the MIMO channel so that the probability of a deep channel fade is low if the channel gains of different paths are independent [3, 127]. However, full-rate OSTBC exists only for certain numbers of transmit antennas [127]. For an arbitrary number of transmit antennas, we can apply an additional precoding matrix to achieve the full-rate code while enjoying additional array gain [62, 80].
The block diagram of the precoded OSTBC system is given in Fig. 4.20. Let us consider an $M_r \times M_t$ MIMO system in a block fading channel, where the number of the transmit antennas is less than that of the receive antennas. The channel gain between each transmit-receive antenna pair is assumed to be an i.i.d. complex Gaussian random variable whose probability density function (PDF) is $\mathcal{CN}(0, 1)$. The $M$ ($M < M_t$) data symbols are mapped onto the $M_t$ transmit antennas via the precoding matrix $P_i$, which is specified by the receiver according to the current channel realization. As a result, the received signal $Y$ corresponding to one OSTBC symbol can be formulated as

$$Y = \sqrt{\frac{\rho}{M}}\, H P_i X + N, \tag{4.150}$$

where $\rho$ denotes the signal-to-noise power ratio,

$$X = [x_1, \cdots, x_T] \tag{4.151}$$

is an $M \times T$ space-time coded symbol matrix whose $t$th column $x_t$ specifies the OSTBC encoder output at time $t$, and $N$ is an $M_r \times T$ noise sample matrix whose $(i, t)$th element $n_{i,t}$, the noise sample at the $i$th receive antenna at time instant $t$, is modeled as an i.i.d. complex Gaussian random variable, i.e., $n_{i,t} \sim \mathcal{CN}(0, 1)$.

[Figure: the input data stream enters the OSTBC encoder, whose output $X$ is precoded by $P_i$ and sent from antennas Tx-1 to Tx-Mt; the signals at receive antennas Rx-1 to Rx-Mr, with noise $n_{1,t}, \ldots, n_{M_r,t}$, enter the ML decoder; the codeword selection scheme at the receiver feeds the codeword index back to the codeword mapping at the transmitter.]

Fig. 4.20. The block diagram of the precoded OSTBC system using unitary precoding.

If the maximum-likelihood (ML) decoding scheme is used to detect the transmitted symbols from $Y$, the conditional symbol error probability decreases exponentially as the Frobenius norm of the effective channel increases, i.e.,

$$\Pr(\text{error} \mid H) = \exp\left(-\gamma \|H P_i\|_F^2\right), \tag{4.152}$$

where $\gamma$ is a constant independent of $H$ [69]. In order to minimize the symbol error rate (SER), $P_i$ is selected to maximize the minimum distance between two OSTBC codewords. Mathematically, we have

$$P_i = \arg\max_{i \in \{1,\cdots,N\}} \min_{l \neq m} \|H P_i (X_l - X_m)\|_F = \arg\max_{i \in \{1,\cdots,N\}} \|H P_i\|_F, \tag{4.153}$$

where the second equality comes from the fact that both $X_l$ and $X_m$ are orthogonal. Hence, the precoder is chosen such that the Frobenius norm of the equivalent channel response is maximized. Next, we introduce the optimal unitary codeword, $P_{opt}$, for the case in which the capacity of the feedback channel is not limited [80]. Please note that even though $P_{opt}$, which demands much more feedback capacity, is not practical in a feedback-limited system, it does provide insight into our codebook design.
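
The selection rule in (4.153) amounts to a search over the codebook at the receiver. The sketch below illustrates it under loud assumptions: NumPy is assumed, the function names are hypothetical, and the codebook is drawn at random (orthonormal columns from a reduced QR factorization) purely as a placeholder. It is not the Grassmannian codebook design of [80].

```python
import numpy as np

def random_codebook(Mt, M, N, seed=0):
    """Placeholder codebook: N random Mt x M matrices with orthonormal columns."""
    rng = np.random.default_rng(seed)
    book = []
    for _ in range(N):
        G = rng.standard_normal((Mt, M)) + 1j * rng.standard_normal((Mt, M))
        Q, _ = np.linalg.qr(G)   # reduced QR: Q is Mt x M with orthonormal columns
        book.append(Q)
    return book

def select_codeword(H, codebook):
    """Index i maximizing ||H P_i||_F, cf. (4.153)."""
    norms = [np.linalg.norm(H @ P, 'fro') for P in codebook]
    return int(np.argmax(norms))

Mr, Mt, M, N = 2, 4, 2, 8   # illustrative sizes
codebook = random_codebook(Mt, M, N)
rng = np.random.default_rng(1)
H = (rng.standard_normal((Mr, Mt)) + 1j * rng.standard_normal((Mr, Mt))) / np.sqrt(2)

index = select_codeword(H, codebook)
feedback_bits = int(np.ceil(np.log2(N)))  # l bits suffice when N <= 2^l, cf. (4.149)
```

Only `index` has to be fed back, which costs `feedback_bits` bits per channel update regardless of the antenna numbers.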
Let the singular value decomposition (SVD) of an $m \times n$ ($m > n$) matrix $A$ be

$$A = \Sigma_A U_A V_A^\dagger, \tag{4.154}$$

where $\Sigma_A$ and $V_A$ are unitary matrices of size $m \times m$ and $n \times n$, respectively, and $U_A$ of size $m \times n$ has the $n$ singular values $\lambda_{A,1}, \cdots, \lambda_{A,n}$ on its main diagonal and zeros elsewhere. Please note that the $n$ singular values are arranged in decreasing order, i.e.,

$$\lambda_{A,1} \ge \cdots \ge \lambda_{A,n}. \tag{4.155}$$

In addition, the peak power constraint is enforced in the precoder design, i.e.,

$$\max_{x} \frac{\|Px\|}{\|x\|} \le 1, \tag{4.156}$$

where $x \in \mathbb{C}^{M}$. Eq. (4.156) implies that

$$\lambda_{P,1} \le 1. \tag{4.157}$$

By applying the SVD to both $H$ and $P$, we can bound $\|HP\|_F$ as [80]

$$\begin{aligned}
\|HP\|_F &= \|\Sigma_H U_H V_H^\dagger \Sigma_P U_P V_P^\dagger\|_F \\
&= \|U_H V_H^\dagger \Sigma_P U_P\|_F \\
&\le \|U_H V_H^\dagger \Sigma_P [I_M \ \ 0]^T\|_F \\
&\le \sqrt{\sum_{i=1}^{M} \lambda_{U_H V_H^\dagger \Sigma_P \Sigma_P^\dagger V_H U_H^\dagger,\, i}} \\
&= \|\bar{U}_H\|_F,
\end{aligned} \tag{4.158}$$

where $0$ is the zero matrix of size $M \times (M_t - M)$ and $\bar{U}_H$ is composed of the first $M$ column vectors of $U_H$. The property that the Frobenius norm of a matrix is unchanged after multiplication by a unitary matrix is applied to obtain the second equality in (4.158), while the first inequality in (4.158) follows from the upper bound on the $M$ singular values of $P$. In order to achieve the upper bound in (4.158), $P_{opt}$ can be designed as

$$P_{opt} = \bar{V}_H. \tag{4.159}$$

The optimal codeword $P_{opt}$ is composed of $M$ orthonormal basis vectors in $\mathbb{C}^{M_t}$ and is denoted as $P_{opt} \in U(M_t, M)$, where $U(M_t, M)$ denotes the set of all possible collections of $M$ orthonormal vectors in $\mathbb{C}^{M_t}$. Please note that $P_{opt}$ given in (4.159) is not unique since

$$\|H P_{opt}\|_F = \|H P_{opt} U\|_F, \tag{4.160}$$

as long as $U$ is an $M \times M$ unitary matrix. Given the peak emitted power constraint in (4.156), the optimal precoder emits maximum power in $M$ orthogonal directions. In fact, it is also shown in [80] that for any $P \in U(M_t, M)$, allocating unequal power along the $M$ precoding vectors cannot maximize the Frobenius norm of $HP$. When a different constraint, such as limited total output power, is applied to the precoder design, the optimal precoder adjusts its signal power in different directions based on the corresponding channel gains [117, 112]. However, since the output power in each direction is limited by the linear region of the power amplifier at the transmitter, the output power of each of the $M$ beams is finite, and a total output power constraint may lead to output power back-off in this case. Therefore, the discussion of codeword design for unitary precoding is limited to $U(M_t, M)$.
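
The optimal precoder (4.159) and the invariance (4.160) can be checked numerically. The following is a small sketch assuming NumPy; the function name and the dimensions are illustrative. Note that `numpy.linalg.svd` names its factors differently from (4.154): it returns `U, s, Vh` with `H = U @ diag(s) @ Vh`, so the right singular vectors used for $\bar{V}_H$ are the first $M$ columns of `Vh.conj().T`.

```python
import numpy as np

def optimal_precoder(H, M):
    """First M right singular vectors of H (cf. (4.159)) and the singular values of H."""
    _, s, Vh = np.linalg.svd(H)       # NumPy convention: H = U diag(s) Vh
    return Vh.conj().T[:, :M], s

rng = np.random.default_rng(1)
Mr, Mt, M = 2, 4, 2                   # illustrative sizes
H = (rng.standard_normal((Mr, Mt)) + 1j * rng.standard_normal((Mr, Mt))) / np.sqrt(2)

P_opt, s = optimal_precoder(H, M)
gain_opt = np.linalg.norm(H @ P_opt, 'fro') ** 2   # equals the sum of the M largest s_i^2

# An arbitrary precoder with orthonormal columns, for comparison.
G = rng.standard_normal((Mt, M)) + 1j * rng.standard_normal((Mt, M))
P_rand, _ = np.linalg.qr(G)
gain_rand = np.linalg.norm(H @ P_rand, 'fro') ** 2

# Right-multiplying by an M x M unitary matrix leaves the norm unchanged, cf. (4.160).
U_rot, _ = np.linalg.qr(rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M)))
gain_rot = np.linalg.norm(H @ P_opt @ U_rot, 'fro') ** 2
```

The random precoder never beats `P_opt`, and `gain_rot` matches `gain_opt`, which is why only the subspace spanned by the codeword columns matters for the codebook design.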
Let us consider the special case in which each column vector of $P$ is selected from the columns of $I_{M_t}$. This corresponds to the antenna subset selection proposed in [46], where the receiver specifies the best $M$ out of $M_t$ antennas for signal transmission and passes the subset index to the transmitter. Please note that it takes

$$\left\lceil \log_2 \frac{M_t!}{M!\,(M_t - M)!} \right\rceil \tag{4.161}$$

bits to specify the antenna subset pattern. Even though both the antenna subset selection and the unitary precoding schemes achieve full diversity in the MIMO channel [46, 80], the antenna subset selection scheme has less design flexibility since each element of the precoder is either 1 or 0. Consequently, its performance is worse than that of unitary precoding, as we will show later in the simulation.
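
Antenna subset selection can be viewed as unitary precoding with a codebook made of column subsets of the identity matrix. The sketch below, with hypothetical function names and NumPy assumed, enumerates that codebook and evaluates the bit count in (4.161).

```python
import itertools
import math
import numpy as np

def antenna_subset_codebook(Mt, M):
    """All Mt x M matrices formed by choosing M distinct columns of the identity I_Mt."""
    identity = np.eye(Mt)
    return [identity[:, list(subset)]
            for subset in itertools.combinations(range(Mt), M)]

def subset_index_bits(Mt, M):
    """ceil(log2(Mt! / (M! (Mt - M)!))), cf. (4.161)."""
    return math.ceil(math.log2(math.comb(Mt, M)))

codebook = antenna_subset_codebook(4, 2)   # illustrative sizes Mt = 4, M = 2
bits = subset_index_bits(4, 2)
```

For $M_t = 4$ and $M = 2$ there are six subsets, so three feedback bits are needed, while two of the eight possible index values remain unused.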

4.5.4 Codebook Construction for the Unitary Precoding

Here, we will demonstrate that the codebook construction for the unitary pre-
coding system can be related to the subspace packing problem in the Grass-
mannian manifold as follows.
Recall that the $N$ codewords in the unitary precoding codebook are designed off-line and are hence independent of the channel realization and of $P_{opt}$ as well. First, we can rewrite the codeword selection criterion in (4.153) as

$$P_i^* = \arg\max_{i \in \{1,\cdots,N\}} \|H P_i\|_F^2 \tag{4.162}$$

$$\phantom{P_i^*} = \arg\min_{i \in \{1,\cdots,N\}} \left(\|H P_{opt}\|_F^2 - \|H P_i\|_F^2\right), \tag{4.163}$$

where

$$\|H P_{opt}\|_F^2 - \|H P_i\|_F^2 \tag{4.164}$$

can be treated as the distortion due to a non-ideal codeword $P_i$. Love et al. [80] show that for a given channel realization $H$, the minimum distortion can be upper bounded as

$$\min_{i \in \{1,\cdots,N\}} \left(\|H P_{opt}\|_F^2 - \|H P_i\|_F^2\right) \le \lambda_{H,1}^2 \min_{i \in \{1,\cdots,N\}} \frac{1}{2} \|\bar{V}_H \bar{V}_H^\dagger - P_i P_i^\dagger\|_F^2. \tag{4.165}$$

For a given codebook, its performance can be evaluated via its average distortion, i.e.,

$$E_H\left\{\min_{i \in \{1,\cdots,N\}} \left(\|H P_{opt}\|_F^2 - \|H P_i\|_F^2\right)\right\} \tag{4.166}$$

and its upper bound immediately follows as

$$E_H\left\{\min_{i \in \{1,\cdots,N\}} \left(\|H P_{opt}\|_F^2 - \|H P_i\|_F^2\right)\right\} \le E_H\left\{\lambda_{H,1}^2\right\} E_H\left\{\min_{i \in \{1,\cdots,N\}} \frac{1}{2} \|\bar{V}_H \bar{V}_H^\dagger - P_i P_i^\dagger\|_F^2\right\}, \tag{4.167}$$
where $\lambda_{H,1}^2$ and $\bar{V}_H$ of $H$ are independently distributed. From (4.167) we learn that the upper bound on the average distortion is controlled by both the average value of $\lambda_{H,1}^2$ and the distribution of the first $M$ column vectors of $V_H$. Since the first term on the right-hand side is independent of the codebook design, we construct the $N$ unitary codewords so that the second term on the right-hand side is minimized. Recall that the $N$ codewords are limited to a subset of $U(M_t, M)$ and that the column space of codeword $P_i$, denoted $\mathcal{P}_i$, is an $M$-dimensional subspace of $\mathbb{C}^{M_t}$. Let $G(M_t, M)$ be the Grassmann manifold, which contains the column spaces of all matrices in $U(M_t, M)$. The closeness of the two column spaces $\mathcal{P}_i$ and $\mathcal{P}_j$ of $P_i$ and $P_j$ can be evaluated by their chordal distance, which is defined as

$$d(\mathcal{P}_i, \mathcal{P}_j) = \frac{1}{\sqrt{2}} \|P_i P_i^\dagger - P_j P_j^\dagger\|_F. \tag{4.168}$$
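
The chordal distance (4.168) and the minimum distance of a codebook are straightforward to compute. The sketch below assumes NumPy; the function names and the three-codeword toy codebook are illustrative and are not a codebook from [80].

```python
import itertools
import numpy as np

def chordal_distance(Pi, Pj):
    """d = ||Pi Pi^H - Pj Pj^H||_F / sqrt(2), cf. (4.168)."""
    diff = Pi @ Pi.conj().T - Pj @ Pj.conj().T
    return np.linalg.norm(diff, 'fro') / np.sqrt(2)

def min_chordal_distance(codebook):
    """Smallest pairwise chordal distance among the codewords."""
    return min(chordal_distance(Pi, Pj)
               for Pi, Pj in itertools.combinations(codebook, 2))

I4 = np.eye(4)
P1 = I4[:, [0, 1]]
P2 = I4[:, [2, 3]]          # column space orthogonal to that of P1
P3 = I4[:, [0, 2]]          # shares one direction with P1 and one with P2
swap = np.array([[0.0, 1.0], [1.0, 0.0]])   # a 2 x 2 unitary (permutation) matrix

d_orthogonal = chordal_distance(P1, P2)
d_same_space = chordal_distance(P1, P1 @ swap)
d_min = min_chordal_distance([P1, P2, P3])
```

The distance depends only on the projectors $P_i P_i^\dagger$, so two codewords that differ by a right unitary factor are at distance zero; this is consistent with the non-uniqueness of $P_{opt}$ noted in (4.160).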
Let $d_{\min}$ denote the minimum distance between two different column spaces
from two different codewords and $B_i$ be the set of all subspaces whose chordal
distance to $\mathcal{P}_i$ is less than $d_{\min}/2$, i.e.,
$$B_i = \left\{\mathcal{P}_U \in G(M_t, M) \,\middle|\, d(\mathcal{P}_U, \mathcal{P}_i) < \frac{d_{\min}}{2}\right\}. \tag{4.169}$$

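It can be shown easily that $B_i$ and $B_j$ do not overlap if $i \neq j$. These $N$ disjoint
open balls cover a portion of $U(M_t, M)$, and the density of this packing is
measured by the probability that the column space of $\mathbf{P}_{\mathrm{opt}}$, denoted by
$\mathcal{P}_{\mathrm{opt}}$, lies in one of these balls [80], i.e.,
$$\Pr\left(\mathcal{P}_{\mathrm{opt}} \in \bigcup_{i=1}^{N} B_i\right). \tag{4.170}$$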

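In [7], Barg and Nogin have shown that when $M_t$ is large, the probability
defined above can be approximated by
$$\Pr\left(\mathcal{P}_{\mathrm{opt}} \in \bigcup_{i=1}^{N} B_i\right) \approx N\left(\frac{\delta}{2\sqrt{M}}\right)^{2M_tM + o(M_t)}, \tag{4.171}$$
where $\delta$ denotes the minimum chordal distance $d_{\min}$ of the codebook.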

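Using (4.168) and (4.171), we obtain the following simplification [80]:
$$E_H\left\{\min_{i\in\{1,\cdots,N\}} \frac{1}{2}\left\|\bar{\mathbf{V}}_H\bar{\mathbf{V}}_H^{\dagger} - \mathbf{P}_i\mathbf{P}_i^{\dagger}\right\|_F^2\right\} = E_H\left\{\min_{i\in\{1,\cdots,N\}} d(\mathcal{P}_{\mathrm{opt}}, \mathcal{P}_i)^2\right\}$$
$$\le \Pr\left(\mathcal{P}_{\mathrm{opt}} \in \bigcup_{i=1}^{N} B_i\right)\left(\frac{d_{\min}}{2}\right)^2 + \left(1 - \Pr\left(\mathcal{P}_{\mathrm{opt}} \in \bigcup_{i=1}^{N} B_i\right)\right)\cdot M$$
$$\approx M + N\left(\frac{\delta}{2\sqrt{M}}\right)^{2M_tM + o(M_t)}\left(\frac{1}{4}d_{\min}^2 - M\right). \tag{4.172}$$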

It is shown in [80] that the upper bound developed in (4.172) decreases as
$d_{\min}$ increases in the general situation. This implies that we should design
our codebook so that the minimum chordal distance between two codewords
is maximized. As pointed out in [80], the noncoherent space-time modulation
design using the Fourier-based signal processing scheme in [53] can be applied to
find a good subspace packing with a large minimum distance. Even though this
scheme demands high computational complexity for codebook construction,
especially when Mt , Mr and M are large, it requires less memory to store
all N codewords thanks to its unique code structure. Interested readers are
referred to [80] and [53] for more discussion.
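
As a rough illustration of the design criterion above, the following Python sketch evaluates the chordal distance in (4.168) and the minimum distance of a codebook. The codebook used here is a hypothetical antenna-selection codebook (pairs of columns of the 4 × 4 identity matrix), chosen only because its distances are easy to check by hand; it is not the Fourier-based construction of [53], and all function names are assumptions introduced for this example.

```python
import math
from itertools import combinations


def projector(P):
    """Return P P^dagger for an Mt x M codeword stored as a list of rows."""
    mt, m = len(P), len(P[0])
    return [[sum(P[a][c] * P[b][c].conjugate() for c in range(m))
             for b in range(mt)] for a in range(mt)]


def chordal_distance(Pi, Pj):
    """Chordal distance of (4.168): ||Pi Pi^dagger - Pj Pj^dagger||_F / sqrt(2)."""
    A, B = projector(Pi), projector(Pj)
    fro_sq = sum(abs(A[a][b] - B[a][b]) ** 2
                 for a in range(len(A)) for b in range(len(A)))
    return math.sqrt(fro_sq / 2.0)


def selection_codeword(cols, mt):
    """Hypothetical codeword: the identity-matrix columns listed in cols."""
    return [[complex(1.0) if r == c else complex(0.0) for c in cols]
            for r in range(mt)]


Mt, M = 4, 2
codebook = [selection_codeword(cols, Mt) for cols in combinations(range(Mt), M)]

# Minimum chordal distance over all pairs of distinct codewords.
d_min = min(chordal_distance(a, b) for a, b in combinations(codebook, 2))
print(len(codebook), d_min)
```

In this toy codebook, two subsets that share one antenna are at distance 1, while two disjoint subsets are at distance $\sqrt{2} = \sqrt{M}$, so a codebook design that maximizes the minimum distance would try to avoid the former kind of pair.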

Example 3: Unitary precoded OSTBC systems


Here we borrow the simulation result in [80] to compare the performance
of different diversity utilization schemes in a MIMO channel. The system
parameters are Mt = 4, Mr = 2, M = 2, and the signal constellation
is 4-QAM. To illustrate the performance gain due to the increase of codebook
size, two feedback quantities, 3 bits and 6 bits, are considered in the unitary
precoding. Both codebooks are found by the scheme proposed in [53]. The SER
curves corresponding to the different systems are plotted as a function of SNR
in Fig. 4.21. The antenna subset selection scheme for 2 transmit antennas
[46] and the 2 × 2 OSTBC without precoding are also plotted for reference
in Fig. 4.21. Note that it takes 3 bits to specify the selected
antenna subset. It is observed that the use of precoding not only provides
more than 2 dB gain over the OSTBC system without precoding at an SER
of $10^{-2}$, but also achieves a higher diversity gain. In addition, a larger
codebook provides better SER performance at the cost of more feedback
overhead. If the feedback channel capacity is limited to 3 bits, applying the
unitary precoding algorithm renders a little extra performance gain over
antenna subset selection.

Fig. 4.21. SER for different precoding schemes. Mt = 4, Mr = 2, M = 2 and the
signal constellation is 4-QAM [80] (© IEEE).
Part II

Future Communication Systems with Precoding

5 Precoded Multiuser (PMU)-OFDM System

5.1 Introduction
To transmit data simultaneously for multiple users with a shared common
channel, the multiuser OFDM technology offers an attractive solution, which is
the main focus of this research. Generally speaking, there exist two families of
multiuser OFDM systems, i.e., multicarrier code division multiple access
(MC-CDMA) [1, 48, 49, 67, 155] and orthogonal frequency division multiple
access (OFDMA) [57, 66, 114] as detailed below.
When compared with the CDMA technology, MC-CDMA inherits the ad-
vantages of multicarrier systems in combating ISI caused by frequency selec-
tive fading channels. MC-CDMA systems can be further divided into two types
[49]. In the first type, one bit is transmitted per time slot. The transmitted
bit is spread into several chips, which are allocated to different subchannels.
The number of subchannels equals the number of chips [155]. This type
of MC-CDMA system can exploit the full frequency diversity gain when
maximum ratio combining (MRC) [49, 108] is used at the receiver. In the
second type, several bits are converted from serial to parallel and then each
bit is spread into several chips. The chips corresponding to the same bit are
allocated to the same subchannel [67]; this is often called the MC-DS CDMA
system. Two more generalized MC-CDMA systems were proposed in [1, 48, 49].
In the first scheme [48, 49], each S/P converted bit is spread into several chips
and then each subcarrier is modulated with one chip, where the frequency sep-
aration corresponding to each bit is maximized to achieve frequency diversity.
In the second scheme [1], similarly, each S/P converted bit is spread into sev-
eral chips. Then, the chips corresponding to the same symbol are modulated
in successive subcarriers. Although MC-CDMA systems spread symbols using
orthogonal codes to ensure orthogonality, when used in uplink transmission,
i.e., from the mobile station (MS) to the base station (BS), orthogonality may
be destroyed at the receiver due to frequency selective fading, thus leading to
multiaccess interference (MAI). The MAI problem cannot be solved by
increasing the transmit power, since increasing the transmit power for one user
will also increase the interference for other users. To suppress MAI,
sophisticated multiuser detection (MUD) [144] and signal processing techniques have
been proposed at the receiver end [30, 49]. Furthermore, due to MAI, carrier
frequency offset (CFO) estimation and compensation are much more complicated in
MC-CDMA systems [31, 126].

C.-C.J. Kuo et al., Precoding Techniques for Digital Communication Systems,
DOI: 10.1007/978-0-387-71769-2_5, © Springer Science+Business Media, LLC 2008
In contrast, OFDMA is MAI-free when the time and the frequency are
well synchronized in the system. It was originally proposed for the cable TV
application [114]. Currently, it has been included in the IEEE 802.16a standard
for the fixed wireless metropolitan area network (WMAN) [57, 66]. However,
similar to conventional OFDM, OFDMA systems are sensitive to frequency
asynchronism, i.e., the carrier frequency offset (CFO) problem [87, 101]. The
CFO effects result from the oscillator mismatch in the transceiver pair and/or
the Doppler effect due to mobile users. Research on accurate CFO estimation
and compensation has received a lot of attention in the design of practical
OFDM systems [87, 118, 141]. However, different CFOs of multiple users in
OFDMA systems make the CFO estimation much more difficult than that of
a single user OFDM system. This is because of the fact that, when a user has a
CFO, the CFO not only causes the performance degradation of this user (the
self-CFO effect) but also results in MAI for others [135, 156]. Then, OFDMA
systems are no longer MAI-free in the presence of CFO. The CFO estimation problem
for OFDMA systems has been extensively studied, e.g., [6, 88, 106, 107, 141].
An edge sidelobe suppressor was proposed in [156] to mitigate the CFO effect
of an OFDMA system. However, most solutions demand extra complexity
at the receiver. The CFO effect is recognized as one of the main technical
challenges that limit the mobility of OFDM systems.
Moreover, in the uplink transmission, it is difficult to guarantee that all
users’ signals are aligned at the receiver and this leads to time asynchronism.
Time asynchronism will lead to MAI in OFDMA as well [97]. Although the
timing mismatch problem can be handled using a sufficiently long cyclic pre-
fix to cover time offsets, this solution increases redundancy and decreases the
actual data rate. Like the frequency offset issue, some research has been con-
ducted using sophisticated signal processing to estimate time and frequency
offsets, e.g. [6, 88, 142]. Due to MAI, time offsets cannot be well compen-
sated in the receiver [88]. Offsets have to be estimated by the receiver and
sent back to every user via feedback so that each user can compensate the
offsets at the transmitter. In the IEEE 802.16 standard, this is done using a
feedback mechanism called ranging [57]. Although the time offset problem can
be solved using feedback, the solution imposes a higher computational load
on the system. In addition, it may not guarantee that all users are perfectly
synchronized. If some users fail to synchronize with the base station, MAI will
occur and the system performance will degrade.
Furthermore, OFDM-based systems have been designed for applications
with little mobility. However, mobile OFDM technology has recently attracted
a lot of attention for three reasons. First, it is desirable to provide high-quality
broadband services in a mobile environment when we consider the next
generation wireless communication system. Second, emerging wireless
communication systems are expected to lie in higher spectral bands so that they
are more sensitive to the physical movement of users and their surroundings.
Third, more subchannels are needed to enhance OFDM bandwidth efficiency,
which implies the use of a longer OFDM block. Then, the effective channel
variation rate over one OFDM block increases. Rapid variation of the channel
over one OFDM symbol destroys the orthogonality among subcarriers, and
results in ICI and MAI in multiuser OFDM systems.
Based on the above discussion, a technique that is robust to both time and
frequency offsets as well as Doppler effect is desirable. That is, the MAI caused
by these effects can be reduced to a negligible amount so that the multiuser
system behaves like a “single-user” system. However, little research has been
done in the design of a multiuser OFDM system that is inherently robust to
these effects. In this chapter, we introduce the PMU-OFDM system, which can
significantly reduce the MAI due to the above factors. As a result, there is no need
to use sophisticated signal processing or multiuser detection to overcome the
MAI problem. Therefore, an algorithm for a single-user OFDM system, which
is much simpler than that for a multiuser OFDM system, can be used in this
novel multiuser OFDM system. The PMU-OFDM system shows that, by proper
transceiver design with precoding, the MAI issue in multiuser systems can be
overcome without increasing receiver complexity.

5.2 System Model and Its Properties


The block diagram of the PMU-OFDM system with T users in uplink direc-
tion, i.e., from mobile station (MS) to base station (BS), is shown in Fig. 5.1.
Let the input of the ith user (1 ≤ i ≤ T ) be an N × 1 vector xi , which con-
tains N modulation symbols. The transmitter has 4 stages. At the first stage,
each symbol in vector xi is repeated M times to form a new vector yi of size
N M × 1 as

yi [m + kM ] = xi [k], 0 ≤ k ≤ N − 1 and 0 ≤ m ≤ M − 1. (5.1)

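At the second stage, $\mathbf{y}_i$ is passed through an $NM \times NM$ diagonal matrix $\mathbf{W}_i$
with its diagonal elements drawn from an $M \times M$ unitary matrix $\mathbf{D}$ ($\mathbf{D}^{\dagger}\mathbf{D} = M\mathbf{I}$).
Let the column vectors of $\mathbf{D}$ be $\mathbf{d}_1, \mathbf{d}_2, \cdots, \mathbf{d}_M$. Then, $\mathbf{W}_i$ is obtained
by repeating $\mathbf{d}_i$ $N$ times along the diagonal, i.e., $\mathbf{W}_i = \mathrm{diag}\left(\mathbf{d}_i^t\ \mathbf{d}_i^t\ \cdots\ \mathbf{d}_i^t\right)$. For
instance, let M = 2, N = 4 and $\mathbf{D}$ be the Hadamard matrix, whose columns
form a set of Hadamard-Walsh codes. We have
$$\mathbf{W}_1 = \mathrm{diag}(+1, +1, +1, +1, +1, +1, +1, +1)$$
and
$$\mathbf{W}_2 = \mathrm{diag}(+1, -1, +1, -1, +1, -1, +1, -1).$$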

Fig. 5.1. The block diagram of the proposed system [135] (© IEEE).

Let $w_i[l]$ be the $l$th diagonal element of $\mathbf{W}_i$, $0 \le l \le NM - 1$. As $\mathbf{D}^{\dagger}\mathbf{D} = M\mathbf{I}$,
these diagonal elements satisfy the following property:
$$\sum_{m=0}^{M-1} w_i[m + kM]\, w_j^{*}[m + kM] = \begin{cases} M, & i = j \\ 0, & i \neq j \end{cases}, \tag{5.2}$$

for all k with 0 ≤ k ≤ N − 1. After passing through the diagonal matrix, the
lth component of the orthogonally coded output vector zi is given by

zi [l] = wi [l]yi [l], 0 ≤ l ≤ N M − 1. (5.3)

At the third stage, each coded vector is passed through the N M -point unitary
inverse discrete Fourier transform (IDFT) matrix. Finally, at the fourth stage,
each transformed vector is converted from parallel to serial and a cyclic
prefix (CP) of length ν = L − 1 is added, where L is the maximum multipath
length, which includes the delay spread.
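
The four transmitter stages described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than a reference implementation: the function names are hypothetical, the IDFT is evaluated directly (an FFT would be used in practice), and the input symbols and channel length are arbitrary example values.

```python
import cmath


def repeat_symbols(x, M):
    """Stage 1, Eq. (5.1): y[m + kM] = x[k] for 0 <= m <= M - 1."""
    return [xk for xk in x for _ in range(M)]


def apply_code(y, d):
    """Stage 2, Eq. (5.3): z[l] = w[l] y[l], where w repeats the codeword d."""
    M = len(d)
    return [d[l % M] * y[l] for l in range(len(y))]


def unitary_idft(z):
    """Stage 3: NM-point unitary IDFT, evaluated directly in O(n^2)."""
    n = len(z)
    return [sum(z[l] * cmath.exp(2j * cmath.pi * l * t / n) for l in range(n)) / n ** 0.5
            for t in range(n)]


def add_cyclic_prefix(s, nu):
    """Stage 4: prepend the last nu samples of the block as the cyclic prefix."""
    return s[len(s) - nu:] + s


def pmu_ofdm_transmit(x, d, L):
    """Run the four stages for one user with codeword d and CP length L - 1."""
    y = repeat_symbols(x, len(d))
    z = apply_code(y, d)
    s = unitary_idft(z)
    return add_cyclic_prefix(s, L - 1)


x = [1, -1, 1, 1]   # N = 4 example BPSK symbols (illustrative values)
d2 = [1, -1]        # second Hadamard-Walsh codeword for M = 2
tx = pmu_ofdm_transmit(x, d2, L=3)
print(len(tx))      # NM + (L - 1) samples
```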
As for the channel, we assume a tapped delay line model with
uniform delay, which is commonly used in both wired and wireless
communications. The PMU-OFDM system is suitable for both wired and wireless
environments. For the wired environment, the channel path is the equivalent
channel equalized by the time-domain equalizer (TEQ) [27]. Moreover, the
crosstalk interference that occurs in most wired applications can be regarded as
MAI [28]. For convenience, we will let $h_i(n)$ be the channel path of user $i$, and
$\lambda_i[l]$ be the $l$th element of the $NM$-point DFT of $h_i(n)$, where $0 \le l \le NM - 1$,
throughout the whole chapter.

At the receiver side, the receiver removes the CP and passes each block of
size $NM$ through the unitary discrete Fourier transform (DFT) matrix. Let $\hat{\mathbf{z}}$
denote the DFT output vector. For the detection of the symbols transmitted by
the $i$th user, $\hat{\mathbf{z}}$ is multiplied by $\mathbf{W}_i^{*}$ and then averaged. Let $\hat{\mathbf{y}}_i$ be the
output of $\mathbf{W}_i^{*}$ and $\hat{\mathbf{x}}_i$ be the averaged output. Then, the $k$th element of $\hat{\mathbf{x}}_i$ is given by
$$\hat{x}_i[k] = \frac{1}{M}\sum_{m=0}^{M-1} \hat{y}_i[m + kM], \quad 0 \le k \le N - 1. \tag{5.4}$$
From Fig. 5.1, we can rewrite Eq. (5.4) as
$$\hat{x}_i[k] = \frac{1}{M}\sum_{m=0}^{M-1} \hat{z}[m + kM]\, w_i^{*}[m + kM], \quad 0 \le k \le N - 1. \tag{5.5}$$

In the final stage, $\hat{\mathbf{x}}_i$ is passed through frequency equalization (FEQ) and is
then ready for detection. The operation of FEQ will be introduced in the following
section.
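
The despreading and averaging step of Eq. (5.5) can be illustrated with a small frequency-domain sketch. It assumes an ideal channel (every frequency gain equal to one) and no noise, so that the DFT output is simply the sum of the users' coded vectors; the helper names and the example symbols are assumptions made for this illustration.

```python
def hadamard(M):
    """Sylvester construction of the M x M Hadamard matrix (M a power of 2)."""
    H = [[1]]
    while len(H) < M:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H


def spread(x, d):
    """Eqs. (5.1) and (5.3) combined: z[m + kM] = d[m] x[k]."""
    return [d[m] * xk for xk in x for m in range(len(d))]


def despread_and_average(z_hat, d):
    """Eq. (5.5): x_hat[k] = (1/M) sum_m z_hat[m + kM] conj(d[m])."""
    M = len(d)
    N = len(z_hat) // M
    return [sum(z_hat[m + k * M] * d[m].conjugate() for m in range(M)) / M
            for k in range(N)]


M = 2
codes = hadamard(M)
x1 = [1, -1, 1, 1]      # user 1 symbols (illustrative)
x2 = [-1, -1, 1, -1]    # user 2 symbols (illustrative)

# Ideal channel, no noise: the received vector is the sum of both coded vectors.
z_hat = [a + b for a, b in zip(spread(x1, codes[0]), spread(x2, codes[1]))]

x1_hat = despread_and_average(z_hat, codes[0])
x2_hat = despread_and_average(z_hat, codes[1])
print(x1_hat, x2_hat)
```

Under these idealized assumptions each user's symbols are recovered without interference, which is the orthogonality property of Eq. (5.2) at work; with a frequency-selective channel the recovery is only approximate.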

5.2.1 Approximately MAI-Free Property

The PMU-OFDM system has the approximately MAI-free property, which is
described as follows [135]. According to [10, 27], the $l$th element of $\hat{\mathbf{z}}$ is given
by
$$\hat{z}[l] = \sum_{i=1}^{T} \lambda_i[l]\, z_i[l] + e[l], \quad 0 \le l \le NM - 1. \tag{5.6}$$

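From Eqs. (5.3) and (5.6), the $l$th element of $\hat{\mathbf{y}}_j$ can be expressed as
$$\hat{y}_j[l] = \sum_{i=1}^{T} \lambda_i[l]\, y_i[l]\, w_i[l]\, w_j^{*}[l] + e[l]\, w_j^{*}[l]. \tag{5.7}$$
Let $l = m + kM$. From Eqs. (5.4) and (5.7), the $k$th element of $\hat{\mathbf{x}}_j$ is given
by
$$\hat{x}_j[k] = \frac{1}{M}\sum_{m=0}^{M-1} \lambda_j[m + kM]\, y_j[m + kM]\, w_j[m + kM]\, w_j^{*}[m + kM]$$
$$\qquad + \frac{1}{M}\sum_{i=1, i\neq j}^{T}\sum_{m=0}^{M-1} \lambda_i[m + kM]\, y_i[m + kM]\, w_i[m + kM]\, w_j^{*}[m + kM]$$
$$\qquad + \frac{1}{M}\sum_{m=0}^{M-1} e[m + kM]\, w_j^{*}[m + kM], \tag{5.8}$$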

where the first term is the desired signal, the second term is the MAI from
other users, and the third term is the additive noise. When $N$ is sufficiently
larger than the multipath length $L$, the coherence bandwidth is large, i.e.,
the correlation between subchannels is large. In this case, the frequency response
of adjacent subchannels varies only a little. Therefore, we have the following
approximation
$$\lambda_i[m + kM] \approx \tilde{\lambda}_i[k], \quad 0 \le m \le M - 1, \ 0 \le k \le N - 1, \tag{5.9}$$
where $\tilde{\lambda}_i[k]$ is the $k$th component of the averaged frequency gain, i.e.,
$$\tilde{\lambda}_i[k] = \frac{1}{M}\sum_{m=0}^{M-1} \lambda_i[m + kM].$$

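Using Eqs. (5.1) and (5.9), we can rewrite Eq. (5.8) as
$$\hat{x}_j[k] \approx \tilde{\lambda}_j[k]\, x_j[k]\, \frac{1}{M}\sum_{m=0}^{M-1} w_j[m + kM]\, w_j^{*}[m + kM]$$
$$\qquad + \frac{1}{M}\sum_{i=1, i\neq j}^{T} \tilde{\lambda}_i[k]\, x_i[k] \sum_{m=0}^{M-1} w_i[m + kM]\, w_j^{*}[m + kM]$$
$$\qquad + \frac{1}{M}\sum_{m=0}^{M-1} e[m + kM]\, w_j^{*}[m + kM]$$
$$= \tilde{\lambda}_j[k]\, x_j[k] + \frac{1}{M}\sum_{m=0}^{M-1} e[m + kM]\, w_j^{*}[m + kM]. \tag{5.10}$$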

As observed from Eq. (5.10), the MAI is approximately zero. Hence, if there is
no channel noise, we can approximately reconstruct $x_i[k]$ by multiplying $\hat{x}_i[k]$
by $\left(\tilde{\lambda}_i[k]\right)^{-1}$. Similar to OFDM systems, this one-tap gain multiplication is
what is usually called frequency equalization (FEQ). Since the proposed system
is approximately MAI-free, its capacity increases as the transmit power increases
in the SNR range of interest. This result is very different from that
of conventional MC-CDMA systems, in which increasing the transmit power
of one user will also increase the MAI for other users.
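
To give a feel for how the MAI term in Eq. (5.8) shrinks as N grows, the following sketch evaluates it exactly for one interfering user and averages its power over all k. The channel taps, the code pair and the unit-power symbols are assumptions chosen for illustration, and the function names are hypothetical; the sketch does not rely on the approximation in Eq. (5.9).

```python
import cmath


def channel_gains(h, n_points):
    """n_points-point DFT of the channel taps h, i.e. the gains lambda_i[l]."""
    return [sum(h[n] * cmath.exp(-2j * cmath.pi * n * l / n_points)
                for n in range(len(h)))
            for l in range(n_points)]


def mean_mai_power(h_i, d_i, d_j, N):
    """Average over k of |MAI_{j<-i}[k]|^2 from Eq. (5.8), with x_i[k] = 1."""
    M = len(d_i)
    lam = channel_gains(h_i, N * M)
    total = 0.0
    for k in range(N):
        mai = sum(lam[m + k * M] * d_i[m] * d_j[m].conjugate()
                  for m in range(M)) / M
        total += abs(mai) ** 2
    return total / N


h = [1.0, 0.5j, -0.3, 0.2]   # hypothetical 4-tap channel (L = 4)
d_i = [1, 1, 1, 1]           # Hadamard-Walsh codeword of the interfering user
d_j = [1, 1, -1, -1]         # Hadamard-Walsh codeword of the target user

p64 = mean_mai_power(h, d_i, d_j, 64)
p256 = mean_mai_power(h, d_i, d_j, 256)
print(p64, p256, p64 / p256)
```

For this particular channel and code pair, quadrupling N reduces the MAI power by roughly a factor of 16, which is in line with the claim that the MAI becomes negligible when N is much larger than L.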
Although PMU-OFDM uses orthogonal codes to distinguish different
users, the approximate zero MAI property makes it significantly different from
conventional MC-CDMA systems in many aspects. Instead, it has more sim-
ilarities to OFDMA as explained below.
1. MAI-free. The PMU-OFDM system is approximately MAI-free
when N is much larger than the multipath length L. That is, for a fixed
L, as N increases, the system will have less MAI. Hence, by increasing
N , PMU-OFDM can accommodate more users with negligible MAI. This
MAI-free property is similar to that of the OFDMA system, which is
MAI-free when time and frequency are well synchronized. Moreover, in an OFDMA
system, the block duration is in general much longer than the multipath
length. Such systems are usually used in WLAN or WMAN applications
[66]. On the other hand, CDMA or MC-CDMA systems are usually used
in the cellular phone system, where co-channel interference is the major
concern.
2. Detection. Due to MAI, sophisticated multiuser detection (MUD) may be
involved in MC-CDMA so that detection of individual symbols is depen-
dent [49]. In the PMU-OFDM and OFDMA system, there is no need of
using MUD, and detection of individual symbols is independent.
3. Loading. PMU-OFDM can achieve approximately MAI-free when N is
sufficiently large. Hence, by increasing N , the system can be fully loaded
with negligible MAI. Also the OFDMA system can be fully loaded with no
MAI. In contrast, the number of supportable users in MC-CDMA systems
is much less than the spreading factor M due to MAI [1, 48, 49].
4. Frequency equalization. As shown in Fig. 5.1, the PMU-OFDM system
performs “frequency equalization” [27] instead of “combining” before sym-
bol detection. This stands in contrast with MC-CDMA systems, where
“combining” is usually used before detection [1, 48, 49]. There are several
different combining methods such as MRC (maximum ratio combining),
EGC (equal gain combining) and ORC (orthogonality restoring combin-
ing) in MC-CDMA systems [49]. The combining techniques multiply every
spread chip by a weighted gain and then sum up M chips before detec-
tion. Hence, there are N M gain multiplications for MC-CDMA systems
[1, 48, 49]. On the other hand, from Eq. (5.4), the PMU-OFDM system
performs summation before the gain multiplication, i.e., frequency
equalization. Thus, it only needs N gain multiplications.
5. Hadamard-Walsh code in the uplink transmission. In the uplink transmis-
sion, it is difficult to guarantee that every user transmits his/her signal
simultaneously, i.e., there is a time offset between users. This will lead
to timing mismatch among users. If the Hadamard-Walsh code is used
in the uplink transmission in conventional CDMA systems, a small tim-
ing mismatch among users will result in great MAI even if the channel
is perfect. Therefore, Hadamard-Walsh code is usually used in downlink
transmission but seldom used in the uplink transmission unless the tim-
ing mismatch problem can be well resolved by some other mechanism. In
conventional CDMA or MC-CDMA systems, quasi-orthogonal codes that
have less cross correlation such as the Gold code or the Kasami code are
usually used to mitigate the timing mismatch problem. In contrast, since
the PMU-OFDM system is robust to timing mismatch [133], we can adopt
the low complexity Hadamard-Walsh code in the uplink transmission, i.e.,
letting D be the Hadamard matrix.
6. Frequency diversity. Although PMU-OFDM may lose the frequency di-
versity when compared with MC-CDMA, it can achieve the full loading
capacity while maintaining the MAI-free property. The diversity gain of
PMU-OFDM is close to that of OFDMA.

5.2.2 Approximately MAI-Free Property: Quantitative Analysis for Hadamard-Walsh Code

It is easy to explain the approximately MAI-free property of the PMU-OFDM
system using the approximation in Eq. (5.9). However, it may not be easy to
see the relationship between the MAI and $N$. That is, although we know the MAI
will decrease as $N$ increases, Eq. (5.9) does not quantify the rate of decrease. In
this section, we show that if all $M$ Hadamard-Walsh codewords are used, the
MAI power will decrease at a rate proportional to $N^{-2}$, i.e., a 6 dB decrease
when $N$ doubles. Moreover, if only $M/2$ symmetric or anti-symmetric
codewords of the $M$ Hadamard-Walsh codes are used, the MAI power will decrease
at a rate proportional to $N^{-4}$, i.e., a 12 dB decrease when $N$ doubles.
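
To make the quoted decibel figures explicit, the following short calculation (a worked illustration, not part of the derivation that follows) evaluates the change in MAI power when $N$ is doubled under an $N^{-2}$ law and under an $N^{-4}$ law:

$$10\log_{10}\frac{(2N)^{-2}}{N^{-2}} = -20\log_{10}2 \approx -6.02\ \text{dB}, \qquad 10\log_{10}\frac{(2N)^{-4}}{N^{-4}} = -40\log_{10}2 \approx -12.04\ \text{dB}.$$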
Using the definition of the DFT, we can express $\lambda_i[l]$ as
$$\lambda_i[l] = \sum_{n=0}^{L-1} h_i(n)\, e^{-j\frac{2\pi}{NM}nl}. \tag{5.11}$$
Let $l = m + kM$, $0 \le k \le N - 1$ and $0 \le m \le M - 1$. Eq. (5.11) can be
rewritten as
$$\lambda_i[m + kM] = \sum_{n=0}^{L-1} h_i(n)\, e^{-j\frac{2\pi}{NM}n(m + kM)}. \tag{5.12}$$

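Using the Taylor series representation, i.e.,
$$e^{-j\theta} = 1 - j\theta - \frac{1}{2!}\theta^2 + j\frac{1}{3!}\theta^3 + \frac{1}{4!}\theta^4 - j\frac{1}{5!}\theta^5 - \cdots,$$
we can rewrite Eq. (5.12) as
$$\lambda_i[m + kM] = \sum_{n=0}^{L-1} h_i(n)\, e^{-j\frac{2\pi}{N}kn}\left(\left[1 - \frac{1}{2!}\left(\frac{2\pi}{NM}mn\right)^2 + \cdots\right] + j\left[\frac{2\pi}{NM}mn - \frac{1}{3!}\left(\frac{2\pi}{NM}mn\right)^3 + \cdots\right]\right). \tag{5.13}$$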

Since the maximum value of $n$ is $L - 1$ and the maximum value of $m$ is $M - 1$,
the maximum value of $\frac{2\pi}{NM}mn$ is less than $\frac{2\pi}{N}(L - 1)$. When $N \gg L$,
$\frac{2\pi}{N}(L - 1)$ is small. In this case, we may use the second-order approximation
of $e^{j\frac{2\pi}{NM}mn}$. That is, we approximate $\lambda_i[m + kM]$ by
$$\lambda_i[m + kM] \approx \sum_{n=0}^{L-1} h_i(n)\, e^{-j\frac{2\pi}{N}kn}\left(1 - \frac{1}{2!}\left(\frac{2\pi}{NM}mn\right)^2 + j\frac{2\pi}{NM}mn\right). \tag{5.14}$$
The Hadamard-Walsh code is orthogonal and $\sum_{m=0}^{M-1} w_i[m]\, w_j^{*}[m] = 0$ for
$i \neq j$. Also, from Eqs. (5.8) and (5.14), we have the following approximation
$$\mathrm{MAI}_{j\leftarrow i}[k] \approx x_i[k]\, \frac{1}{M}\sum_{n=0}^{L-1} h_i(n)\, e^{-j\frac{2\pi}{N}kn}\, \phi_{i,j}(n), \tag{5.15}$$
where
$$\phi_{i,j}(n) = \sum_{m=0}^{M-1}\left(-\frac{1}{2!}\left(\frac{2\pi}{NM}mn\right)^2 + j\frac{2\pi}{NM}mn\right) w_i[m]\, w_j^{*}[m].$$
Now, we would like to evaluate $E\left\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\right\}$. Let us assume that the
transmitted symbol $x_i[k]$ and the channel coefficients $h_i(n)$ are uncorrelated.
Furthermore, $E\left\{x_i[k]\, x_i^{*}[k']\right\} = 0$ for $k \neq k'$. Let $\sigma_{x_i}^2 \triangleq E\left\{|x_i[k]|^2\right\}$ be the
averaged transmitted power. Assume that the channel coefficients have the
same averaged power, which is defined as $\sigma_{h_i}^2 = E\left\{|h_i(n)|^2\right\}$. It is worth
emphasizing that the equal power assumption for channel coefficients will lead
to a pessimistic MAI result. In practical situations, channel coefficients usually
decay exponentially. As a result, the actual MAI should be much smaller.
We will mention later how to extend the result obtained under the i.i.d. assumption
to the practical case. Now let us first introduce several lemmas and a proposition
to explain the pessimistic result.
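Lemma 5.1: When all $M$ Hadamard-Walsh codewords are used, the maximum
value of the MAI from user $i$ to user $j$, $\max_{i,j} E\left\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\right\}$, can
be approximated by
$$\sigma_{x_i}^2 \sigma_{h_i}^2\left[\frac{1}{N^2}\frac{\pi^2}{4}\left(1 - \frac{1}{M}\right)^2\sum_{n=0}^{L-1} n^2 + \frac{1}{N^4}\frac{\pi^4}{4}\left(1 - \frac{1}{M}\right)^2\sum_{n=0}^{L-1} n^4\right], \tag{5.16}$$
and the maximum value occurs when $w_i[m]$ and $w_j[m]$ satisfy the following
condition
$$w_i[m]\, w_j[m] = \begin{cases} +1, & 0 \le m \le M/2 - 1 \\ -1, & M/2 \le m \le M - 1 \end{cases}. \tag{5.17}$$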
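Proof. From Eq. (5.15), we have
$$E\left\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\right\} \approx \frac{1}{M^2}\sigma_{x_i}^2\sum_{n=0}^{L-1}\sum_{n'=0}^{L-1} E\left\{h_i(n)\, h_i^{*}(n')\right\} e^{-j\frac{2\pi}{N}k(n - n')}\, |\phi_{i,j}(n)|^2$$
$$= \frac{1}{M^2}\sigma_{x_i}^2\sigma_{h_i}^2\sum_{n=0}^{L-1} |\phi_{i,j}(n)|^2. \tag{5.18}$$
From Eq. (5.18), maximizing $E\left\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\right\}$ is equivalent to maximizing
$|\phi_{i,j}(n)|^2$ for all $n$. Thus, let us look at the term $|\phi_{i,j}(n)|^2$. From Eq. (5.15),
$\phi_{i,j}(n)$ can be rearranged as
$$\phi_{i,j}(n) = \Re\left\{\phi_{i,j}(n)\right\} + j\,\Im\left\{\phi_{i,j}(n)\right\}, \tag{5.19}$$
where
$$\Re\left\{\phi_{i,j}(n)\right\} = -\frac{1}{2!}\left(\frac{2\pi}{NM}\right)^2 n^2\sum_{m=0}^{M-1} w_i[m]\, w_j[m]\, m^2$$
and
$$\Im\left\{\phi_{i,j}(n)\right\} = \frac{2\pi}{NM}\, n\sum_{m=0}^{M-1} w_i[m]\, w_j[m]\, m.$$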
Since
$$|\phi_{i,j}(n)|^2 = \left|\Re\left\{\phi_{i,j}(n)\right\}\right|^2 + \left|\Im\left\{\phi_{i,j}(n)\right\}\right|^2,$$
maximizing $|\phi_{i,j}(n)|^2$ is equivalent to maximizing both $\left|\sum_{m=0}^{M-1} w_i[m]\, w_j[m]\, m\right|$
and $\left|\sum_{m=0}^{M-1} w_i[m]\, w_j[m]\, m^2\right|$. According to [65], the product of two arbitrary
codewords of the $M$ Hadamard-Walsh codes is again one of the $M$ codewords,
i.e., $w_i[m]\, w_j[m]$, $0 \le m \le M - 1$, is one of the codewords of the $M$ Hadamard-Walsh
code. Since $m$ and $m^2$ are monotonically increasing functions for $m \ge 0$,
$\left|\sum_{m=0}^{M-1} w_i[m]\, w_j[m]\, m\right|$ and $\left|\sum_{m=0}^{M-1} w_i[m]\, w_j[m]\, m^2\right|$ are maximized if the
number of successive $+1$s and $-1$s of $w_i[m]\, w_j[m]$ is maximized, which occurs
when $w_i[m]$ and $w_j[m]$ satisfy the condition in Eq. (5.17). Hence, we have
$$\left|\sum_{m=0}^{M-1} w_i[m]\, w_j[m]\, m^2\right| = \left|\sum_{m=0}^{M-1} m^2 - 2\sum_{m=0}^{M/2-1} m^2\right| = \frac{1}{4}M^3\left(1 - \frac{1}{M}\right). \tag{5.20}$$
Similarly, it can be shown that
$$\left|\sum_{m=0}^{M-1} w_i[m]\, w_j[m]\, m\right| = \left|\sum_{m=0}^{M-1} m - 2\sum_{m=0}^{M/2-1} m\right| = \frac{1}{4}M^2\left(1 - \frac{1}{M}\right). \tag{5.21}$$
Thus, from Eqs. (5.18), (5.19), (5.20) and (5.21), $\max_{i,j} E\left\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\right\}$
can be approximated as in Eq. (5.16). ∎
As given in Eq. (5.16), the maximum value of $E\left\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\right\}$ depends
on two terms. One is proportional to $N^{-2}$ and the other is proportional to $N^{-4}$.
As $N$ grows, the term proportional to $N^{-2}$ dominates the performance.
Hence, the maximum value of $E\left\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\right\}$ decreases in the order of
$O(N^{-2})$. Note that when all $M$ codewords are used, every target user has
the dominating MAI term, i.e., the maximum value of $E\left\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\right\}$.
For instance, let $M = 16$ and let user $i$ use $w_1$ and user $j$ use $w_9$. Then,
$w_i[m]\, w_j[m]$ satisfies Eq. (5.17) and the dominating MAI occurs. Now, if user $i$
uses $w_2$ and user $j$ uses $w_{10}$, again $w_i[m]\, w_j[m]$ satisfies Eq. (5.17) and the
dominating MAI occurs. Consider the overall MAI power for user $j$, i.e.,
$\sum_{i=1, i\neq j}^{M} E\left\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\right\}$. Since the maximum value of $E\left\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\right\}$
is the dominating term of $\sum_{i=1, i\neq j}^{M} E\left\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\right\}$, we may regard the
overall MAI power as decreasing in the order of $O(N^{-2})$. This result explains the
approximately MAI-free property of the PMU-OFDM system. That is, when
all $M$ codewords are used, the overall MAI power decreases as $N$ increases in
the order of $O(N^{-2})$.
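
The code properties used in the proof of Lemma 5.1 can be checked numerically. The sketch below assumes that the Hadamard-Walsh codewords are the rows of the Sylvester-type Hadamard matrix in its natural ordering, with 1-based labels $w_1, \ldots, w_M$; the variable names are hypothetical. It checks that the codeword set is closed under element-wise products, that the pairs ($w_1$, $w_9$) and ($w_2$, $w_{10}$) satisfy Eq. (5.17), and that the second-moment sum matches the closed form in Eq. (5.20).

```python
def hadamard(M):
    """Sylvester construction of the M x M Hadamard matrix (M a power of 2)."""
    H = [[1]]
    while len(H) < M:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H


M = 16
codes = hadamard(M)
code_set = {tuple(c) for c in codes}

# Closure: the element-wise product of any two codewords is again a codeword.
closed = all(tuple(a * b for a, b in zip(ci, cj)) in code_set
             for ci in codes for cj in codes)

# Worst-case pattern of Eq. (5.17): +1 on the first half, -1 on the second half.
pattern_5_17 = [1] * (M // 2) + [-1] * (M // 2)
product_1_9 = [a * b for a, b in zip(codes[0], codes[8])]    # w_1 and w_9
product_2_10 = [a * b for a, b in zip(codes[1], codes[9])]   # w_2 and w_10

# Left-hand side and closed form of Eq. (5.20).
second_moment = abs(sum(p * m * m for m, p in enumerate(product_1_9)))
closed_form_5_20 = M ** 3 * (1 - 1 / M) / 4

print(closed, product_1_9 == pattern_5_17, second_moment, closed_form_5_20)
```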
According to [50], the set of M Hadamard-Walsh codes can be divided
into two groups of M/2 codewords. One is the set of symmetric (even) codes
satisfying
wi [m] = wi [M − 1 − m], 0 ≤ m ≤ M/2 − 1. (5.22)

The other is the set of anti-symmetric (odd) codes:

wi [m] = −wi [M − 1 − m], 0 ≤ m ≤ M/2 − 1. (5.23)

Now, we consider the use of M/2 symmetric or anti-symmetric codewords [135].
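
The split into symmetric and anti-symmetric codewords in Eqs. (5.22) and (5.23) can be reproduced with a short sketch. It assumes the Sylvester-type Hadamard matrix in natural ordering with 1-based codeword labels; the helper names are introduced only for this example.

```python
def hadamard(M):
    """Sylvester construction of the M x M Hadamard matrix (M a power of 2)."""
    H = [[1]]
    while len(H) < M:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H


def is_symmetric(w):
    """Eq. (5.22): w[m] = w[M - 1 - m] for 0 <= m <= M/2 - 1."""
    return all(w[m] == w[-1 - m] for m in range(len(w) // 2))


def is_antisymmetric(w):
    """Eq. (5.23): w[m] = -w[M - 1 - m] for 0 <= m <= M/2 - 1."""
    return all(w[m] == -w[-1 - m] for m in range(len(w) // 2))


M = 16
codes = hadamard(M)

# 1-based labels of the codewords in each group.
symmetric = [i + 1 for i, w in enumerate(codes) if is_symmetric(w)]
antisymmetric = [i + 1 for i, w in enumerate(codes) if is_antisymmetric(w)]
print(symmetric)
print(antisymmetric)
```

Under this ordering each group contains exactly M/2 = 8 codewords, so restricting the users to one group corresponds to the half-loaded configuration analyzed next.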

Lemma 5.2: Suppose that only $M/2$ symmetric or anti-symmetric codewords
of the $M$ Hadamard-Walsh codes are used. The product of any two codewords
from the $M/2$ symmetric or anti-symmetric codewords, i.e., $w_i[m]\, w_j[m]$, is a
symmetric codeword in the $M$ codewords. Moreover, $w_i[m]$ and $w_j[m]$ satisfy the
following property
$$\sum_{m=0}^{M/2-1} w_i[m]\, w_j[m] = \begin{cases} M/2, & i = j \\ 0, & i \neq j \end{cases}. \tag{5.24}$$

Proof. From Eqs. (5.22) and (5.23), it is obvious that the product of any two
codewords from either the symmetric or the anti-symmetric codes is symmetric. That
is, $w_i[m]\, w_j[m]$ is symmetric and satisfies $w_i[m]\, w_j[m] = w_i[M - 1 - m]\, w_j[M - 1 - m]$,
$0 \le m \le M/2 - 1$. Since the product of any two codewords is again a
codeword [65], $w_i[m]\, w_j[m]$ is a symmetric codeword in the $M$ Hadamard-Walsh
codes.
Now let us prove Eq. (5.24). The whole summation in Eq. (5.2) can be
divided into two terms, i.e.,
$$\sum_{m=0}^{M-1} w_i[m]\, w_j[m] = \sum_{u=0}^{M/2-1} w_i[u]\, w_j[u] + \sum_{v=M/2}^{M-1} w_i[v]\, w_j[v]. \tag{5.25}$$
According to Eq. (5.22) or Eq. (5.23), when only $M/2$ symmetric or
anti-symmetric codewords of the $M$ Hadamard-Walsh codes are used, the second
summation term in Eq. (5.25) is given by
$$\sum_{v=M/2}^{M-1} w_i[v]\, w_j[v] = \sum_{v=M/2}^{M-1} w_i[M - 1 - v]\, w_j[M - 1 - v] = \sum_{v'=M/2-1}^{0} w_i[v']\, w_j[v']. \tag{5.26}$$
From Eqs. (5.25) and (5.26), we prove the property in Eq. (5.24). ∎
Lemma 5.3: Suppose that only $M/2$ symmetric or anti-symmetric codewords
of the $M$ Hadamard-Walsh codes are used. For $i \neq j$, we have the following property
$$\sum_{m=0}^{M-1} w_i[m]\, w_j[m]\, m = 0. \tag{5.27}$$

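Proof. $\sum_{m=0}^{M-1} w_i[m]\, w_j[m]\, m$ can be divided into two terms as
$$\sum_{m=0}^{M-1} w_i[m]\, w_j[m]\, m = \sum_{u=0}^{M/2-1} w_i[u]\, w_j[u]\, u + \sum_{v=M/2}^{M-1} w_i[v]\, w_j[v]\, v. \tag{5.28}$$
Letting $v = M - 1 - u$ and using either Eq. (5.22) or Eq. (5.23), $\sum_{m=0}^{M-1} w_i[m]\, w_j[m]\, m$
can be manipulated as
$$\sum_{m=0}^{M-1} w_i[m]\, w_j[m]\, m = \sum_{u=0}^{M/2-1} w_i[u]\, w_j[u]\, u + \sum_{u=M/2-1}^{0} w_i[M - 1 - u]\, w_j[M - 1 - u]\, (M - 1 - u)$$
$$= (M - 1)\sum_{u=0}^{M/2-1} w_i[u]\, w_j[u]$$
$$= 0, \tag{5.29}$$
where the last equality follows from Eq. (5.24) for $i \neq j$. ∎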
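
Lemmas 5.2 and 5.3 can be checked by brute force for a small code size. The sketch below assumes the Sylvester-type Hadamard matrix with M = 16 and tests Eq. (5.24) and Eq. (5.27) separately within the symmetric group and within the anti-symmetric group; all function and variable names are assumptions made for this illustration.

```python
def hadamard(M):
    """Sylvester construction of the M x M Hadamard matrix (M a power of 2)."""
    H = [[1]]
    while len(H) < M:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H


def half_sum(wi, wj):
    """Left-hand side of Eq. (5.24): sum over the first M/2 positions."""
    half = len(wi) // 2
    return sum(a * b for a, b in zip(wi[:half], wj[:half]))


def first_moment(wi, wj):
    """Left-hand side of Eq. (5.27): sum of w_i[m] w_j[m] m over all m."""
    return sum(a * b * m for m, (a, b) in enumerate(zip(wi, wj)))


M = 16
codes = hadamard(M)
sym = [w for w in codes if all(w[m] == w[-1 - m] for m in range(M // 2))]
anti = [w for w in codes if all(w[m] == -w[-1 - m] for m in range(M // 2))]

lemma_5_2_holds = True
lemma_5_3_holds = True
for group in (sym, anti):
    for a, wi in enumerate(group):
        for b, wj in enumerate(group):
            expected_half = M // 2 if a == b else 0
            lemma_5_2_holds = lemma_5_2_holds and half_sum(wi, wj) == expected_half
            if a != b:
                lemma_5_3_holds = lemma_5_3_holds and first_moment(wi, wj) == 0

print(lemma_5_2_holds, lemma_5_3_holds)
```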

From Lemma 5.3, the MAI term in Eq. (5.15) becomes
$$\mathrm{MAI}_{j\leftarrow i}[k] \approx x_i[k]\, \frac{1}{M}\sum_{n=0}^{L-1} h_i(n)\, e^{-j\frac{2\pi}{N}kn}\sum_{m=0}^{M-1}\frac{-1}{2!}\left(\frac{2\pi}{NM}mn\right)^2 w_i[m]\, w_j[m]. \tag{5.30}$$
Hence, we have the following proposition.
Proposition 5.1: Suppose that only $M/2$ symmetric or anti-symmetric
codewords of the $M$ Hadamard-Walsh codes are used. The maximum value of
$E\left\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\right\}$ can be approximated by
$$\max_{i,j} E\left\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\right\} \approx \sigma_{x_i}^2\sigma_{h_i}^2\left[\frac{1}{N^4}\frac{\pi^4}{64}\sum_{n=0}^{L-1} n^4\right], \tag{5.31}$$

which occurs when $w_i[m]$ and $w_j[m]$ satisfy the following condition:
$$w_i[m]\, w_j[m] = \begin{cases} +1, & 0 \le m \le M/4 - 1 \ \text{or} \ 3M/4 \le m \le M - 1 \\ -1, & M/4 \le m \le 3M/4 - 1 \end{cases}. \tag{5.32}$$
Proof. According to Eq. (5.30), if the proposed code selection scheme is
used, the imaginary part of $\phi_{i,j}(n)$ in Eq. (5.19) disappears. Following
the same argument as in the proof of Lemma 5.1, we know that maximizing
$E\left\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\right\}$ is equivalent to maximizing $\left|\sum_{m=0}^{M-1} w_i[m]\, w_j[m]\, m^2\right|$. Hence,
the maximum $E\left\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\right\}$ occurs when the number of successive $+1$s
or $-1$s is maximized. Since $w_i[m]\, w_j[m]$ is one of the symmetric codewords,
the last $M/2$ elements can be obtained from the first $M/2$ elements. Also,
from Eq. (5.24), we know that $\sum_{m=0}^{M/2-1} w_i[m]\, w_j[m] = 0$ for $i \neq j$. Hence, the
codeword product $w_i[m]\, w_j[m]$ will have the largest number of successive $+1$s
or $-1$s if the condition in Eq. (5.32) is met. Given that the codeword product
$w_i[m]\, w_j[m]$ satisfies Eq. (5.32), we have
$$\left|\sum_{m=0}^{M-1} w_i[m]\, w_j[m]\, m^2\right| = \left|\sum_{m=0}^{M-1} m^2 - 2\sum_{m=M/4}^{3M/4-1} m^2\right| = \frac{1}{16}M^3. \tag{5.33}$$
Based on Eqs. (5.19), (5.30) and (5.33), the maximum value of
$E\left\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\right\}$ can be approximated as in Eq. (5.31). ∎
As shown in Eq. (5.31), when only $M/2$ symmetric or anti-symmetric
codewords are used, the maximum value of $E\left\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\right\}$ decreases in the
order of $O(N^{-4})$. Compared with Eq. (5.16), where the MAI term decreases
in the order of $O(N^{-2})$, the code design in Proposition 5.1 enables
the system to approach the MAI-free condition at a much faster rate as $N$ increases. Let
us give an example to illustrate this point.
Example 5.1: MAI Decreasing Rate
In this example, we show that the theoretical result derived using the Taylor
approximation is close to the simulation result. Moreover, it is demonstrated
that the use of only $M/2$ symmetric or anti-symmetric codewords allows the
system to approach the MAI-free condition at a faster rate. For the theoretical result,
Eqs. (5.16) and (5.31) are used to obtain the maximum MAI power for fully- and
half-loaded systems, respectively. In the simulation, the Monte Carlo method and
the term $\mathrm{MAI}_{j\leftarrow i}[k]$ in Eq. (5.8) are used to run more than 2500 different
channel realizations. That is, the simulated MAI power is obtained by the
following average
$$\frac{1}{N}\sum_{k=0}^{N-1} |\mathrm{MAI}_{j\leftarrow i}[k]|^2,$$
for more than 2500 channels. Let $M = 16$ and the modulation be BPSK, i.e.,
$\sigma_{x_i}^2 = 1$. The multipath length is $L = 4$ and the coefficients of the channel are
i.i.d. complex Gaussian random variables with unit variance, i.e., $\sigma_{h_i}^2 = 1$.
Let us consider the maximum MAI power from user i to user j. The
maximum MAI occurs if wi [m]wj [m] satisfies Eq. (5.17) for the fully-loaded
case. If symmetric or anti-symmetric codewords are in use, it occurs when
wi [m]wj [m] satisfies Eq. (5.32). The maximum MAI power as a function of
N for theoretical and simulated results are shown in Fig. 5.2, where the MAI
power is obtained for N from 8 to 256 in theory while the simulated MAI power
is plotted for N = 8, 16, 32, 64, 128, and 256. We see from this figure, that
theoretical and simulation results are close to each other. This confirms the
assumption that the second order Taylor series expansion in Eq. (5.14) is good
enough for the MAI analysis in the proposed system. Moreover, we see that
the use of symmetric or anti-symmetric codewords enables the system to be
MAI-free at a faster rate than a fully-loaded system. This result corroborates
our derivation in Lemma 5.1 and Proposition 5.1.
Now, we would like to demonstrate that the maximum MAI power occurs
when $w_i[m]$ and $w_j[m]$ satisfy Eq. (5.17) in the fully-loaded case, and
when they satisfy Eq. (5.32) if symmetric or anti-symmetric codewords are used.

Fig. 5.2. The maximum MAI power, $\max E\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\}$ (dB), as a function of N for theoretical and simulated
results with L = 4. Curves: theoretical and simulation results for the full codeword set and for the half code selection.

For the fully-loaded system, there are M = 16 possible combinations for
$w_i[m]w_j[m]$, and this codeword product is again one of the original codewords.
Hence, we will simulate the MAI power for these 16 possible combinations of
wi [m]wj [m] (in Kronecker ordering [8]) and number them as #1 to #16. Note
that since #1 is the all-one code, it denotes the desired signal power instead
of MAI. The MAI power for the 16 possible combinations of wi [m]wj [m] are
shown in Fig. 5.3. Note that since wi [m]wj [m] is either symmetric or anti-
symmetric, we use squared and circled curves respectively, to identify them.
From the figure, we see that the maximum MAI power is the curve with
number #9, which is the same curve as the circled points in Fig. 5.2, since
the codeword #9 satisfies the condition in Eq. (5.17). This corroborates our
derivation.
Consider the use of symmetric or anti-symmetric codewords. With this code
design, the codeword product, $w_i[m]w_j[m]$, is a symmetric codeword according
to Lemma 5.2. Hence, the MAI power for M/2 = 8 different combinations of
$w_i[m]w_j[m]$ can be represented by the 8 squared curves in Fig. 5.3. From
the figure, we see that, with the code design, the maximum MAI power is
the curve with number #13, which is the same curve as the squared points in
Fig. 5.2. Note that codeword #13 satisfies the condition in Eq. (5.32), which
corroborates our theoretical derivation.

Fig. 5.3. The maximum MAI power as a function of N for symmetric and anti-
symmetric wi [m]wj [m] with L = 4.

Now let us consider a more practical channel model in which the channel
coefficients do not have the same averaged power. It is still reasonable to
assume that the channel coefficients are uncorrelated, with $E\{h_i(n)h_i^*(n')\} = 0$
for $n \neq n'$. Since the channel coefficient power may differ from tap to tap,
let the averaged channel power of tap n be $\sigma_{h_i}^2(n)$. In
this case, Eq. (5.18) should be rewritten as
\[
E\left\{\left|\mathrm{MAI}_{j\leftarrow i}[k]\right|^2\right\} \approx \frac{1}{M^2}\,\sigma_{x_i}^2 \sum_{n=0}^{L-1} \sigma_{h_i}^2(n)\,\left|\phi_{i,j}(n)\right|^2. \tag{5.34}
\]

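Since $\sigma_{h_i}^2(n)$ is a constant for a specific n, using a derivation similar to that in
Eqs. (5.18)–(5.33), the maximum value of the MAI from user i to user j,
$\max_{i,j} E\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\}$, in Lemma 5.1 should be rewritten. It can be
approximated by
\[
\sigma_{x_i}^2\left[\frac{1}{N^2}\,\frac{\pi^2}{4}\left(1-\frac{1}{M}\right)^{2}\sum_{n=0}^{L-1}\sigma_{h_i}^2(n)\,n^2
+\frac{1}{N^4}\,\frac{\pi^4}{4}\left(1-\frac{1}{M}\right)^{2}\sum_{n=0}^{L-1}\sigma_{h_i}^2(n)\,n^4\right]. \tag{5.35}
\]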
Compared to Lemma 5.1, since $\sigma_{h_i}^2(n)$ usually decays exponentially, the
result in Eq. (5.35) leads to a much smaller MAI than that in Eq. (5.16) for
fully-loaded PMU-OFDM with the Hadamard-Walsh code. Similarly, we should
rewrite the maximum value of $E\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\}$ in Proposition 5.1. It can
be approximated by
\[
\max_{i,j} E\left\{\left|\mathrm{MAI}_{j\leftarrow i}[k]\right|^2\right\} \approx \sigma_{x_i}^2\left[\frac{1}{N^4}\,\frac{\pi^4}{64}\sum_{n=0}^{L-1}\sigma_{h_i}^2(n)\,n^4\right]. \tag{5.36}
\]
Again, the result in Eq. (5.36) leads to a much smaller MAI than that in
Eq. (5.30) for half-loaded PMU-OFDM with the Hadamard-Walsh code.

5.3 PMU-OFDM System in Time Offset Environment


In this section, we will show that the PMU-OFDM system is robust to the time
offset if N is sufficiently large. Furthermore, we will derive an expression for
the MAI effect due to the time offset in PMU-OFDM. Based on the expression,
we show that the MAI effect caused by the time offset can be reduced to a
negligible amount so that the multiuser system behaves like a “single-user”
system if M/2 symmetric or anti-symmetric codewords of the M Hadamard-Walsh
codes are used for each user in the prefiltering stage of the system.

5.3.1 Time Asynchronism Analysis


In the following analysis, we assume that the CP length is ν = L − 1, where
L is the maximum length of the discrete channel considered. In this situation,
Fig. 5.4. Illustration of the time offset effect. The OFDM blocks of users i, j, and k, each preceded by a corrupted CP, start at time indices $n_i$, $n_j$, and $n_k$; the block with correct timing for user j is marked.

whenever one user has a non-zero time offset, this user will cause not only
symbol distortion to himself/herself, but also MAI to all other users.
Referring to Fig. 5.4, let user j be the target user, so that $n_j$ is the
correct block extraction time for user j. Let user i have a time offset of
$\tau_i = n_i - n_j$ with respect to the jth user. Then, we say user i is timing advanced
if $\tau_i < 0$ and timing delayed if $\tau_i > 0$. Since the MAI increases as $|\tau_i|$
increases, we will consider the more severe case in which $|\tau_i|$ is larger than ν. Later
we will show that the derivations also apply to the case $|\tau_i| \le \nu$. Without loss
of generality, suppose the receiver views ŝ(0), · · · , ŝ(M − 1) as one OFDM
block, where ŝ(n) is as depicted in Fig. 5.5. For presentational convenience,
we will add superscript (+) or (−) to denote, respectively, the data/channel
in the previous and next block. For instance, $x_i^{(+)}[k]$ is the kth symbol of
$x_i[k]$ in the previous block. Let the ν corrupted CP of the current block be

Fig. 5.5. The block diagram of the proposed system.



$[\tilde{p}_i(0) \cdots \tilde{p}_i(\nu-1)]$ and the received noise vector be $\hat{e}$. Referring to Fig. 5.5,
the received vector is $\hat{s} = \sum_{i=1}^{T}\hat{p}_i + \hat{e}$. For $\tau_i < 0$, $\hat{p}_i$ is given by
\[
\Big[\, p_i^{(-)}(NM-|\tau_i|+\nu) \cdots p_i^{(-)}(NM-1) \;\Big|\; \tilde{p}_i(0) \cdots \tilde{p}_i(\nu-1) \;\Big|\; p_i(0) \cdots p_i(NM-1-|\tau_i|) \,\Big], \tag{5.37}
\]

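where
\[
\tilde{p}_i(n) = \sum_{m=0}^{n} h_i(m)\,s_i\big((n-m)+NM-\nu\big) + \sum_{m=n+1}^{\nu} h_i^{(-)}(m)\,s_i^{(-)}\big((n-m)+NM\big). \tag{5.38}
\]
For $\tau_i > 0$, $\hat{p}_i$ is given by
\[
\Big[\, p_i(\tau_i) \cdots p_i(NM-1) \;\Big|\; \tilde{p}_i^{(+)}(0) \cdots \tilde{p}_i^{(+)}(\nu-1) \;\Big|\; p_i^{(+)}(0) \cdots p_i^{(+)}(\tau_i-\nu-1) \,\Big], \tag{5.39}
\]
where
\[
\tilde{p}_i^{(+)}(n) = \sum_{m=0}^{n} h_i^{(+)}(m)\,s_i^{(+)}\big((n-m)+NM-\nu\big) + \sum_{m=n+1}^{\nu} h_i(m)\,s_i\big((n-m)+NM\big). \tag{5.40}
\]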
After DFT, the mixed signal from all the users is given by
\[
\hat{z}[l] = \sum_{i=1}^{T} r_i[l] + e[l], \qquad 0 \le l \le NM-1, \tag{5.41}
\]
where
\[
r_i[l] = \frac{1}{\sqrt{NM}}\sum_{n=0}^{NM-1} \hat{p}_i(n)\,e^{-j\frac{2\pi}{NM}nl}, \tag{5.42}
\]
and e[l] is the received noise after DFT. Now consider the symbol detection
for the jth user. From Eqs. (5.128) and (5.41), and let l = v + kM , where
0 ≤ v ≤ M −1, 0 ≤ k ≤ N −1, the kth element of x̂j under time asynchronism
is given by

\[
\hat{x}_j[k] = \frac{1}{M}\sum_{v=0}^{M-1} r_j[v+kM]\,w_j^*[v] + \sum_{i=1,\,i\neq j}^{T}\mathrm{MAI}_{j\leftarrow i}[k] + \frac{1}{M}\sum_{v=0}^{M-1} e[v+kM]\,w_j^*[v], \tag{5.43}
\]
where the first term comes from the jth user, and
\[
\mathrm{MAI}_{j\leftarrow i}[k] = \frac{1}{M}\sum_{v=0}^{M-1} r_i[v+kM]\,w_j^*[v]
\]

is the MAI of user j due to the time asynchronism of user i. Based on


Eq. (5.43), once the overall MAI from all the other users is sufficiently smaller
than the desired signal, the time offset of user j (sampled as integer) can
be easily estimated and then compensated using the algorithms developed
for single-user OFDM systems, e.g., [118, 141]. Moreover, as mentioned in
the introductory section, there are several additional advantages when MAI
is negligible.
Let us proceed to the details of $\mathrm{MAI}_{j\leftarrow i}[k]$ and identify the dominating
MAI from user i to user j. We consider $\tau_i < 0$. The derivation for $\tau_i > 0$ is
similar. For notational convenience, we will use $(n)_{NM}$ to denote n modulo
NM. From Eqs. (5.37), (5.42) and (5.43), and letting l = v + kM, where 0 ≤ v ≤
M − 1, 0 ≤ k ≤ N − 1, the MAI of user i contributing to user j is given by
\[
\mathrm{MAI}_{j\leftarrow i}[k] = \mathrm{MAI}^{(0)}_{j\leftarrow i}[k] + \mathrm{MAI}^{(1)}_{j\leftarrow i}[k] + \mathrm{MAI}^{(2)}_{j\leftarrow i}[k], \tag{5.44}
\]

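where $\mathrm{MAI}^{(0)}_{j\leftarrow i}[k]$ is the MAI due to the current block of user i given by
\[
\mathrm{MAI}^{(0)}_{j\leftarrow i}[k] = \frac{1}{M}\sum_{v=0}^{M-1}\left[\frac{1}{\sqrt{NM}}\sum_{n=|\tau_i|}^{NM-1} p_i(n-|\tau_i|)\,e^{-j\frac{2\pi}{NM}(v+kM)n}\right] w_j^*[v]
= \mathrm{MAI}^{(00)}_{j\leftarrow i}[k] - \mathrm{MAI}^{(01)}_{j\leftarrow i}[k], \tag{5.45}
\]
where $\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]$ is given by
\[
\frac{1}{M}\sum_{v=0}^{M-1}\left[\frac{1}{\sqrt{NM}}\sum_{n=0}^{NM-1} p_i\big((n-|\tau_i|)_{NM}\big)\,e^{-j\frac{2\pi}{NM}(v+kM)n}\right] w_j^*[v],
\]
and $\mathrm{MAI}^{(01)}_{j\leftarrow i}[k]$ is given by
\[
\frac{1}{M}\sum_{v=0}^{M-1}\left[\frac{1}{\sqrt{NM}}\sum_{n=0}^{|\tau_i|-1} p_i\big((n-|\tau_i|)_{NM}\big)\,e^{-j\frac{2\pi}{NM}(v+kM)n}\right] w_j^*[v].
\]
$\mathrm{MAI}^{(1)}_{j\leftarrow i}[k]$ is the MAI due to the previous block of the ith user given by
\[
\frac{1}{M}\sum_{v=0}^{M-1}\left[\frac{1}{\sqrt{NM}}\sum_{n=0}^{|\tau_i|-\nu-1} p_i^{(-)}(n+NM-|\tau_i|+\nu)\,e^{-j\frac{2\pi}{NM}(v+kM)n}\right] w_j^*[v], \tag{5.46}
\]
and $\mathrm{MAI}^{(2)}_{j\leftarrow i}[k]$ is the MAI due to the corrupted CP of user i given by
\[
\frac{1}{M}\sum_{v=0}^{M-1}\left[\frac{1}{\sqrt{NM}}\sum_{n=|\tau_i|-\nu}^{|\tau_i|-1} \tilde{p}_i(n-|\tau_i|+\nu)\,e^{-j\frac{2\pi}{NM}(v+kM)n}\right] w_j^*[v]. \tag{5.47}
\]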
Note that when $|\tau_i| < \nu$, $\mathrm{MAI}^{(1)}_{j\leftarrow i}[k]$ as given in Eq. (5.46) is zero and
$\mathrm{MAI}^{(2)}_{j\leftarrow i}[k]$ in Eq. (5.47) will sum up terms only for n ≥ 0. When the timing
mismatch $\tau_i$ is significantly smaller than NM, $\mathrm{MAI}^{(0)}_{j\leftarrow i}[k]$ is the dominating
MAI as compared with $\mathrm{MAI}^{(1)}_{j\leftarrow i}[k]$ and $\mathrm{MAI}^{(2)}_{j\leftarrow i}[k]$ since the current block
contributes $NM - |\tau_i|$ symbols. This number is greater than that from the
previous block or the corrupted CP. Moreover, when $|\tau_i| \ll NM$, $\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]$
is much greater than $\mathrm{MAI}^{(01)}_{j\leftarrow i}[k]$ since $\mathrm{MAI}^{(01)}_{j\leftarrow i}[k]$ is only a small fraction
of $\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]$ according to Eq. (5.45). Hence, if we can greatly suppress
$\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]$, the MAI due to time asynchronism can be greatly reduced. The
suppression of $\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]$ is considered in the next section. Before moving to
the next section, let us look at an example.
Example 5.2: Dominating MAI in Time Offset Environment
Here, we would like to show that $\mathrm{MAI}^{(0)}_{j\leftarrow i}[k]$ is the dominating MAI term
over $\mathrm{MAI}^{(1)}_{j\leftarrow i}[k]$ and $\mathrm{MAI}^{(2)}_{j\leftarrow i}[k]$ for both $\tau_i < 0$ and $\tau_i > 0$. The simulation
was conducted with the following setting. We consider the performance in the
uplink direction, where each user may have a different time offset and channel
fading. The channel and the time offset are assumed to be quasi-invariant in
the sense that it remains unchanged in one block duration. Simulations are
conducted with the following parameter setting throughout this section. M =
16 and the BPSK modulation is used. For the PMU-OFDM, the Hadamard-
Walsh code is used. For every individual user, the Monte Carlo method is
used to run more than 500,000 symbols. We consider the worst time offset
situation. That is, except the target user who is assumed to have correct
timing, the time offsets of all the other users are randomly assigned to be
either +τ or −τ . All T users are the target user in turn. For instance, as
shown in Fig. 5.4, the correct timing for target user j is nj = 0. Other users
will have a time offset either +τ or −τ with respect to nj . Moreover, the CP
length ν = L − 1 is added. In this situation, any non-zero timing mismatch will
lead to MAI. We will evaluate the averaged total MAI values of $\mathrm{MAI}^{(0)}_{j\leftarrow i}[k]$
and $\mathrm{MAI}^{(1)}_{j\leftarrow i}[k] + \mathrm{MAI}^{(2)}_{j\leftarrow i}[k]$. The averaged total MAI value of the dominating
MAI is obtained via averaging the value
\[
\frac{1}{T}\sum_{j=1}^{T}\frac{1}{N}\sum_{k=0}^{N-1}\left|\sum_{i=1,\,i\neq j}^{T}\mathrm{MAI}^{(0)}_{j\leftarrow i}[k]\right|^{2}
\]
over more than 500,000T symbols, that is, the total MAI per symbol of the
T individual users, for all T users. Similarly, the averaged total MAI value of
the non-dominating MAI is obtained via averaging the value
\[
\frac{1}{T}\sum_{j=1}^{T}\frac{1}{N}\sum_{k=0}^{N-1}\left|\sum_{i=1,\,i\neq j}^{T}\left(\mathrm{MAI}^{(1)}_{j\leftarrow i}[k]+\mathrm{MAI}^{(2)}_{j\leftarrow i}[k]\right)\right|^{2}
\]
over more than 500,000T symbols.

Fig. 5.6. The MAI effect, shown as the averaged total MAI power (dB), is plotted as a function of the time offset τ when full M
Hadamard-Walsh codewords are used. Curves: dominating and non-dominating MAI for N = 64 and N = 128.

Let us consider a flat fading channel whose coefficients are complex
Gaussian random variables. Let T = M = 16, i.e., the fully-loaded case.
The averaged dominating and non-dominating MAI power values are plotted
as a function of the time offset for N = 64 and 128 in Fig. 5.6. The
dominating MAI values are represented by dashed curves. As shown in this
figure, the dominating MAI is much greater than the non-dominating MAI by
around 20–30 dB for N = 64, and by around 22–32 dB for N = 128. This result
confirms that $\mathrm{MAI}^{(0)}_{j\leftarrow i}[k]$ is indeed the dominating MAI over $\mathrm{MAI}^{(1)}_{j\leftarrow i}[k]$
and $\mathrm{MAI}^{(2)}_{j\leftarrow i}[k]$. Moreover, as N increases from 64 to 128, the dominating
MAI decreases by around 6–8 dB, and the non-dominating MAI decreases by
around 9–10 dB. This corroborates the theoretical result; namely, both the
dominating MAI and the non-dominating MAI decrease as N increases.

Asymptotic Behavior with Hadamard-Walsh Code

Now, we will argue that if the Hadamard-Walsh code is used, all
the MAI terms will be approximately zero when N is sufficiently large. That
is, even in a time asynchronous environment, the PMU-OFDM can still be
approximately MAI-free for a sufficiently large N. Consider $\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]$ first.
Using the definition of DFT for $p_i(n)$,
\[
p_i(n) = \frac{1}{\sqrt{NM}}\sum_{l=0}^{NM-1} z_i[l]\,\lambda_i[l]\,e^{j\frac{2\pi}{NM}nl}
= \frac{1}{\sqrt{NM}}\sum_{f=0}^{N-1} x_i[f]\,e^{j\frac{2\pi}{N}fn}\sum_{u=0}^{M-1}\lambda_i[u+fM]\,w_i[u]\,e^{j\frac{2\pi}{NM}un}, \tag{5.48}
\]

and from Eq. (5.45), and the time shift property of DFT [98], we have
\[
\mathrm{MAI}^{(00)}_{j\leftarrow i}[k] = \frac{1}{M}\sum_{v=0}^{M-1} x_i[k]\,e^{-j\frac{2\pi}{N}k|\tau_i|}\,\lambda_i[v+kM]\,w_i[v]\,w_j[v]\,e^{-j\frac{2\pi}{NM}v|\tau_i|}. \tag{5.49}
\]
Using the approximation in Eq. (5.9) and $\tau_i < 0$, we have
\[
\mathrm{MAI}^{(00)}_{j\leftarrow i}[k] \approx \frac{1}{M}\,\tilde{\lambda}_i[k]\,x_i[k]\,e^{j\frac{2\pi}{N}k\tau_i}\sum_{v=0}^{M-1} w_i[v]\,w_j[v]\,e^{j\frac{2\pi}{NM}v\tau_i}. \tag{5.50}
\]
Since the maximum value of v is M − 1, the term $e^{-j\frac{2\pi}{NM}v|\tau_i|}$ in Eq. (5.50)
is approximately 1 if $N \gg |\tau_i|$. This approximation becomes more accurate
as N increases. Since $\sum_{v=0}^{M-1} w_i[v]w_j[v] = 0$ according to Eq. (5.2),
$\mathrm{MAI}^{(00)}_{j\leftarrow i}[k] \approx 0$ for sufficiently large N. Now, consider $\mathrm{MAI}^{(01)}_{j\leftarrow i}[k]$. From
Eq. (5.45), since $p_i((n-|\tau_i|)_{NM}) = p_i(n+NM-|\tau_i|)$ for $0 \le n \le |\tau_i|-1$,
$\mathrm{MAI}^{(01)}_{j\leftarrow i}[k]$ is given by
\[
\frac{1}{M}\sum_{v=0}^{M-1}\left[\frac{1}{\sqrt{NM}}\sum_{n=0}^{|\tau_i|-1} p_i(n+NM-|\tau_i|)\,e^{-j\frac{2\pi}{NM}(v+kM)n}\right] w_j^*[v]. \tag{5.51}
\]

Using Eq. (5.48) and the approximation in Eq. (5.9), we can rewrite Eq. (5.51)
as
\[
\mathrm{MAI}^{(01)}_{j\leftarrow i}[k] \approx \frac{1}{NM^2}\sum_{n=0}^{|\tau_i|-1}\sum_{f=0}^{N-1} x_i[f]\,\tilde{\lambda}_i[f]\,e^{j\frac{2\pi}{N}\left(f(n-|\tau_i|)-kn\right)}\,\delta^{(01)}_{i,j}(\tau_i), \tag{5.52}
\]
where $\delta^{(01)}_{i,j}(\tau_i) = \sum_{u=0,v=0}^{M-1} w_i[u]\,w_j[v]\,e^{j\frac{2\pi}{NM}\left(u(n-|\tau_i|)-vn\right)}$. For a fixed n, if N is
sufficiently large, $e^{j\frac{2\pi}{NM}\left(u(n-|\tau_i|)-vn\right)}$ is approximately 1 for all possible
combinations of u and v. For instance, if $n = |\tau_i| - 1$, the maximum value of
$u(n-|\tau_i|)-vn$ is 0, i.e., when u = 0, v = 0, and the minimum value of $u(n-|\tau_i|)-vn$
is −2(M − 1), i.e., when u = v = M − 1. Moreover, when the Hadamard-Walsh
code is used, $\sum_{u=0,v=0}^{M-1} w_i[u]w_j[v] = \sum_{u=0}^{M-1} w_i[u]\sum_{v=0}^{M-1} w_j[v] = 0$ for $i \neq j$.
Hence, $\delta^{(01)}_{i,j}(\tau_i)$ in Eq. (5.52) is approximately zero and thus $\mathrm{MAI}^{(01)}_{j\leftarrow i}[k] \approx 0$.
Next, consider $\mathrm{MAI}^{(1)}_{j\leftarrow i}[k]$. From Eqs. (5.46), (5.48) and the approximation
in Eq. (5.9),
\[
\mathrm{MAI}^{(1)}_{j\leftarrow i}[k] \approx \frac{1}{NM^2}\sum_{n=0}^{|\tau_i|-\nu-1}\sum_{f=0}^{N-1} x_i^{(-)}[f]\,\tilde{\lambda}_i^{(-)}[f]\,e^{j\frac{2\pi}{N}\left(f(n-|\tau_i|+\nu)-kn\right)}\,\delta^{(1)}_{i,j}(\tau_i), \tag{5.53}
\]
where $\delta^{(1)}_{i,j}(\tau_i) = \sum_{u=0,v=0}^{M-1} w_i[u]\,w_j[v]\,e^{j\frac{2\pi}{NM}\left(u(n-|\tau_i|+\nu)-vn\right)}$. Using an argument
similar to that for $\delta^{(01)}_{i,j}(\tau_i)$, we know that $\delta^{(1)}_{i,j}(\tau_i)$ is also approximately zero
for large N. Finally, consider $\mathrm{MAI}^{(2)}_{j\leftarrow i}[k]$. From Eqs. (5.38), (5.47) and the
fact that $s_i(n)$ is the NM-point IDFT of $z_i[l]$, $\mathrm{MAI}^{(2)}_{j\leftarrow i}[k]$ is given by

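\[
\frac{1}{NM^2}\sum_{n=0}^{\nu-1}\sum_{f=0}^{N-1}\Bigg[\, x_i[f]\sum_{m=0}^{n} h_i(m)\,e^{j\frac{2\pi}{N}\left(f(n-m-\nu)-k(n+|\tau_i|-\nu)\right)}\,\delta^{(20)}_{i,j}(\tau_i)
+ x_i^{(-)}[f]\sum_{m=n+1}^{\nu} h_i^{(-)}(m)\,e^{j\frac{2\pi}{N}\left(f(n-m)-k(n+|\tau_i|-\nu)\right)}\,\delta^{(21)}_{i,j}(\tau_i)\Bigg], \tag{5.54}
\]
where $\delta^{(20)}_{i,j}(\tau_i)$ and $\delta^{(21)}_{i,j}(\tau_i)$ are given by
\[
\delta^{(20)}_{i,j}(\tau_i) = \sum_{u=0,v=0}^{M-1} w_i[u]\,w_j[v]\,e^{j\frac{2\pi}{NM}\left(u(n-m-\nu)-v(n+|\tau_i|-\nu)\right)}
\]
and
\[
\delta^{(21)}_{i,j}(\tau_i) = \sum_{u=0,v=0}^{M-1} w_i[u]\,w_j[v]\,e^{j\frac{2\pi}{NM}\left(u(n-m)-v(n+|\tau_i|-\nu)\right)}.
\]
Similarly, $\delta^{(20)}_{i,j}(\tau_i)$ and $\delta^{(21)}_{i,j}(\tau_i)$ are also approximately zero for large N. Thus,
for sufficiently large N, PMU-OFDM can still be approximately MAI-free.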
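The asymptotic argument can be illustrated numerically. The following minimal sketch evaluates $\delta^{(01)}_{i,j}(\tau_i)$ from Eq. (5.52) for growing N. It assumes a Sylvester (Kronecker-ordered) Hadamard-Walsh matrix; the helper names `hadamard` and `delta_01`, the codeword pair, and the values of n and $|\tau_i|$ are illustrative choices, not settings taken from the text.

```python
import cmath
import math

def hadamard(M):
    """Sylvester (Kronecker-ordered) Hadamard matrix; M is assumed to be a power of two."""
    H = [[1]]
    while len(H) < M:
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def delta_01(wi, wj, n, tau_abs, N):
    """Double sum over u, v of w_i[u] w_j[v] exp(j 2*pi/(N M) (u (n - |tau|) - v n))."""
    M = len(wi)
    theta = 2 * math.pi / (N * M)
    return sum(
        wi[u] * wj[v] * cmath.exp(1j * theta * (u * (n - tau_abs) - v * n))
        for u in range(M)
        for v in range(M)
    )

M = 16
H = hadamard(M)
wi, wj = H[1], H[2]      # two distinct codewords (illustrative choice)
n, tau_abs = 3, 4        # illustrative sample index and |tau_i|
sizes = (64, 256, 1024)
magnitudes = [abs(delta_01(wi, wj, n, tau_abs, N)) for N in sizes]
for N, mag in zip(sizes, magnitudes):
    print(N, mag)
```

Under these assumptions the magnitude shrinks steadily as N grows, in line with the claim that the term vanishes for sufficiently large N.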

5.3.2 Code Design for MAI Mitigation


In the previous section, we have explained that $\mathrm{MAI}^{(0)}_{j\leftarrow i}[k]$ is the major MAI.
Moreover, $\mathrm{MAI}^{(0)}_{j\leftarrow i}[k]$ is divided into $\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]$ and $\mathrm{MAI}^{(01)}_{j\leftarrow i}[k]$, and the
first term is the dominating term. Hence, if we can suppress $\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]$, the
MAI due to time asynchronism will be reduced greatly. In this section, we will
demonstrate a code design which can greatly suppress $\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]$. Based on
Eq. (5.50), let us define a function $\phi_{i,j}(\tau_i)$ given by
\[
\phi_{i,j}(\tau_i) = \sum_{m=0}^{M-1} w_i[m]\,w_j[m]\,e^{j\frac{2\pi}{NM}m\tau_i}. \tag{5.55}
\]
If the code is properly designed such that $\phi_{i,j}(\tau_i) \approx 0$ for an arbitrary combination
of i and j, the dominating MAI can be made approximately zero. The
following derivation is similar to that in Section 5.2.2 except that now the MAI
is due to time offset. Let us further manipulate $\phi_{i,j}(\tau_i)$ as follows. Express the
exponential term in Eq. (5.55) using the Taylor series representation, i.e.,
\[
e^{j\frac{2\pi}{NM}m\tau_i} = 1 + j\,\frac{2\pi}{NM}m\tau_i - \frac{1}{2!}\left(\frac{2\pi}{NM}m\tau_i\right)^{2} - j\,\frac{1}{3!}\left(\frac{2\pi}{NM}m\tau_i\right)^{3} + \cdots.
\]

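Since the maximum value of m is M − 1, the maximum value of $\frac{2\pi}{NM}m\tau_i$ is less than
$2\pi\tau_i/N$. When $\tau_i \ll N$, we may use the second-order approximation of
$e^{j\frac{2\pi}{NM}m\tau_i}$, which leads to
\[
\phi_{i,j}(\tau_i) \approx \sum_{m=0}^{M-1} w_i[m]\,w_j[m]\left[1 - \frac{1}{2!}\left(\frac{2\pi}{NM}m\tau_i\right)^{2} + j\,\frac{2\pi}{NM}m\tau_i\right]
= \sum_{m=0}^{M-1} w_i[m]\,w_j[m]\left[-\frac{1}{2!}\left(\frac{2\pi}{NM}m\tau_i\right)^{2} + j\,\frac{2\pi}{NM}m\tau_i\right]
\triangleq \tilde{\phi}_{i,j}(\tau_i), \tag{5.56}
\]
where we have used $\sum_{m=0}^{M-1} w_i[m]\,w_j^*[m] = 0$ for $i \neq j$. From Eqs. (5.55) and
(5.56), we can rewrite Eq. (5.50) as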

\[
\mathrm{MAI}^{(00)}_{j\leftarrow i}[k] \approx \frac{1}{M}\,\tilde{\lambda}_i[k]\,x_i[k]\,e^{j\frac{2\pi}{N}k\tau_i}\,\tilde{\phi}_{i,j}(\tau_i). \tag{5.57}
\]
Now, we would like to evaluate $E\{|\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]|^2\}$, the averaged MAI power
from user i to user j. Let us assume the averaged transmitted power is the
same for all subchannels and $E\{x_i[k]\,x_i^*[k']\} = 0$ for $k \neq k'$. Also, assume the
averaged channel power is the same for different taps and $E\{h_i[n]\,h_i^*[n']\} = 0$
for $n \neq n'$. Again, according to the same argument as in Section 5.2.2, the equal
power assumption of channel coefficients will lead to a pessimistic result. Let
$\sigma_{x_i}^2$ be the averaged transmitted power, and $\sigma_{h_i}^2$ be the averaged channel power
defined, respectively, by $\sigma_{x_i}^2 = E\{|x_i[k]|^2\}$ and $\sigma_{h_i}^2 = E\{|h_i(n)|^2\}$. Then, we
have the following lemma.

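Lemma 5.4: When all the M Hadamard-Walsh codewords are used, i.e., in the fully-loaded
case, the maximum value of the MAI from user i to user j, denoted
by $\max_{i,j} E\{|\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]|^2\}$, can be approximated by
\[
\sigma_{x_i}^2\,\sigma_{h_i}^2\left[\left(\frac{\tau_i}{N}\right)^{2}\frac{\pi^2}{4}\left(1-\frac{1}{M}\right)^{2} + \left(\frac{\tau_i}{N}\right)^{4}\frac{\pi^4}{4}\left(1-\frac{1}{M}\right)^{2}\right], \tag{5.58}
\]
and the maximum value occurs when $w_i[m]$ and $w_j[m]$ satisfy Eq. (5.17).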
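Proof. From Eq. (5.57) and using that
\[
\tilde{\lambda}_i[k] = \sum_{n=0}^{L-1} h_i(n)\,e^{-j\frac{2\pi}{N}nk},
\]
$E\{|\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]|^2\}$ can be approximated by
\[
\frac{1}{M^2}\,\sigma_{x_i}^2\sum_{n=0}^{L-1}\sum_{n'=0}^{L-1} E\{h_i(n)\,h_i^*(n')\}\,e^{-j\frac{2\pi}{N}k(n-n')}\left|\tilde{\phi}_{i,j}(\tau_i)\right|^{2}
= \frac{L}{M^2}\,\sigma_{x_i}^2\,\sigma_{h_i}^2\left|\tilde{\phi}_{i,j}(\tau_i)\right|^{2}. \tag{5.59}
\]
From Eq. (5.59), maximizing $E\{|\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]|^2\}$ is equivalent to maximizing
$|\tilde{\phi}_{i,j}(\tau_i)|^2$. From Eq. (5.56), $\tilde{\phi}_{i,j}(\tau_i)$ can be rearranged as
\[
\tilde{\phi}_{i,j}(\tau_i) = \Re\{\tilde{\phi}_{i,j}(\tau_i)\} + j\,\Im\{\tilde{\phi}_{i,j}(\tau_i)\}, \tag{5.60}
\]
where
\[
\Re\{\tilde{\phi}_{i,j}(\tau_i)\} = -\frac{1}{2!}\left(\frac{2\pi}{NM}\right)^{2}\tau_i^{2}\sum_{m=0}^{M-1} w_i[m]\,w_j[m]\,m^{2}
\]
and
\[
\Im\{\tilde{\phi}_{i,j}(\tau_i)\} = \frac{2\pi}{NM}\,\tau_i\sum_{m=0}^{M-1} w_i[m]\,w_j[m]\,m.
\]
Since
\[
\left|\tilde{\phi}_{i,j}(\tau_i)\right|^{2} = \left|\Re\{\tilde{\phi}_{i,j}(\tau_i)\}\right|^{2} + \left|\Im\{\tilde{\phi}_{i,j}(\tau_i)\}\right|^{2},
\]
maximizing $|\tilde{\phi}_{i,j}(\tau_i)|^2$ is equivalent to maximizing both $\left|\sum_{m=0}^{M-1} w_i[m]w_j[m]\,m\right|$
and $\left|\sum_{m=0}^{M-1} w_i[m]w_j[m]\,m^2\right|$. According to [65], the product of two arbitrary
distinct Hadamard-Walsh codewords is a non-all-one Hadamard-Walsh codeword. Note
that each non-all-one Hadamard-Walsh codeword has an equal number
of +1 and −1. Since m and $m^2$ are monotonically increasing functions for
m ≥ 0, $\left|\sum_{m=0}^{M-1} w_i[m]w_j[m]\,m\right|$ and $\left|\sum_{m=0}^{M-1} w_i[m]w_j[m]\,m^2\right|$ are maximized
if $w_i[m]w_j[m]$ are of the same sign for 0 ≤ m ≤ M/2 − 1. In this case,
we have
\[
\left|\sum_{m=0}^{M-1} w_i[m]\,w_j[m]\,m^{2}\right| = \sum_{m=M/2}^{M-1} m^{2} - \sum_{m=0}^{M/2-1} m^{2} = \frac{1}{4}M^{3}\left(1-\frac{1}{M}\right). \tag{5.61}
\]
Similarly, it can be shown
\[
\left|\sum_{m=0}^{M-1} w_i[m]\,w_j[m]\,m\right| = \sum_{m=M/2}^{M-1} m - \sum_{m=0}^{M/2-1} m = \frac{1}{4}M^{2}\left(1-\frac{1}{M}\right). \tag{5.62}
\]
From Eqs. (5.59), (5.60), (5.61) and (5.62), we proved the approximation in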
Eq. (5.58).
Observe from Eq. (5.58) that the maximum value of $E\{|\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]|^2\}$
depends on two terms. One is proportional to $1/N^2$ and the other is
proportional to $1/N^4$. When N grows, the term which is proportional to $1/N^2$
will dominate the performance. Hence, we may regard the maximum value
of $E\{|\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]|^2\}$ as decreasing at a rate proportional to $1/N^2$. Note that
when all the M codewords are used, every target user will unavoidably
have interference from a certain user that attains the maximum given in
Eq. (5.16). For instance, let M = 16 and let user i use $w_1$ and user j
use $w_9$. Then, $w_i[k]w_j[k]$ satisfies Eq. (5.17) and the dominating MAI
occurs. Now, if user i uses $w_2$ and user j uses $w_{10}$, again $w_i[k]w_j[k]$ satisfies
Eq. (5.17) and the dominating MAI occurs. Consider the overall MAI power
for user j, i.e., $\sum_{i=1,\,i\neq j}^{M} E\{|\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]|^2\}$. Since the maximum value of
$E\{|\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]|^2\}$ is the dominating term of $\sum_{i=1,\,i\neq j}^{M} E\{|\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]|^2\}$,
we may regard the overall MAI power as decreasing in an order of $1/N^2$.
Therefore, when all the M codewords are used and N is sufficiently large
(for the approximation in Eq. (5.9)), the overall MAI power of the
PMU-OFDM due to time asynchronism decreases in an order of $1/N^2$. From
Eq. (5.16), the term proportional to $1/N^2$ dominates the performance, which
is contributed by $\Im\{\tilde{\phi}_{i,j}(\tau_i)\}$ in Eq. (5.60). Hence, if we can constrain
$\Im\{\tilde{\phi}_{i,j}(\tau_i)\} = 0$, the MAI due to time offset can be greatly reduced. This
goal can be achieved by properly selecting codewords from the Hadamard-Walsh
code.

Now, we consider the use of M/2 symmetric or anti-symmetric codewords
[135]. From Lemma 5.3, we can rewrite Eq. (5.57) as
\[
\mathrm{MAI}^{(00)}_{j\leftarrow i}[k] \approx \frac{1}{M}\,\tilde{\lambda}_i[k]\,x_i[k]\,e^{j\frac{2\pi}{N}k\tau_i}\sum_{m=0}^{M-1}\frac{-1}{2!}\left(\frac{2\pi}{NM}m\tau_i\right)^{2} w_i[m]\,w_j[m]. \tag{5.63}
\]
Hence, we have the following proposition.

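Proposition 5.2: Suppose that only symmetric (anti-symmetric) Hadamard-Walsh
codewords are used. Then the maximum value of $E\{|\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]|^2\}$ can be
approximated by
\[
\max_{i,j} E\left\{\left|\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]\right|^{2}\right\} \approx L\,\sigma_{x_i}^2\,\sigma_{h_i}^2\left[\left(\frac{\tau_i}{N}\right)^{4}\frac{\pi^4}{64}\right], \tag{5.64}
\]
and this occurs when $w_i[m]$ and $w_j[m]$ satisfy Eq. (5.32).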
Proof. According to Eq. (5.63), if the code selection mentioned in
Proposition 5.2 is used, the imaginary part of $\tilde{\phi}_{i,j}(\tau_i)$ in Eq. (5.60) disappears.
Following the same argument as in the proof of Lemma 5.4, we know that
$E\{|\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]|^2\}$ is at its largest when the value $\left|\sum_{m=0}^{M-1} w_i[m]w_j[m]\,m^2\right|$
is at its largest. According to [135], if only symmetric or anti-symmetric
codewords are used, the codeword product $w_i[m]w_j[m]$ is again a symmetric
non-all-one Hadamard-Walsh codeword. Hence, the last M/2 elements of
$w_i[m]w_j[m]$ can be obtained from its first M/2 elements. Therefore, both the
first and the last M/2 elements have an equal number of +1 and −1. Since $m^2$
is a monotonically increasing function for m ≥ 0, $\left|\sum_{m=0}^{M/2-1} w_i[m]w_j[m]\,m^2\right|$
is maximized if $w_i[m]w_j[m]$ are of the same sign for 0 ≤ m ≤ M/4 − 1, and
$\left|\sum_{m=M/2}^{M-1} w_i[m]w_j[m]\,m^2\right|$ is maximized if $w_i[m]w_j[m]$ are of the same sign
for M/2 ≤ m ≤ 3M/4 − 1, which is the condition in Eq. (5.32). Given that
the codeword product $w_i[m]w_j[m]$ satisfies Eq. (5.32), we have
\[
\sum_{m=0}^{M-1} w_i[m]\,w_j[m]\,m^{2} = \sum_{m=0}^{M-1} m^{2} - 2\sum_{m=M/4}^{3M/4-1} m^{2} = \frac{1}{16}M^{3}. \tag{5.65}
\]
From Eqs. (5.60), (5.63) and (5.65), the maximum value of $E\{|\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]|^2\}$
can be approximated as that in Eq. (5.64).
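The worst-case codeword products named in the two proofs can be checked by brute force. The sketch below is a minimal illustration under stated assumptions: the Hadamard-Walsh matrix is built by the Sylvester (Kronecker-ordered) construction, and a codeword is taken to be symmetric when w[m] = w[M − 1 − m]. The helper names (`hadamard`, `is_symmetric`, `weighted_sum`) are hypothetical. It searches for the product codeword that maximizes $\left|\sum_m w_i[m]w_j[m]\,m^2\right|$ in the fully-loaded case and within the symmetric codewords.

```python
def hadamard(M):
    """Sylvester (Kronecker-ordered) Hadamard matrix; M is assumed to be a power of two."""
    H = [[1]]
    while len(H) < M:
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def is_symmetric(w):
    """Assumed definition: w[m] == w[M - 1 - m] for every m."""
    return all(w[m] == w[-1 - m] for m in range(len(w)))

def weighted_sum(w, power):
    """|sum_m w[m] * m**power|, the quantity maximized in the proofs."""
    return abs(sum(c * m ** power for m, c in enumerate(w)))

M = 16
H = hadamard(M)

# Fully loaded: any non-all-one codeword can appear as a product w_i[m] w_j[m].
worst_full = max(range(1, M), key=lambda r: weighted_sum(H[r], 2))

# Symmetric (or anti-symmetric) selection: the product is a symmetric,
# non-all-one codeword.
symmetric_rows = [r for r in range(1, M) if is_symmetric(H[r])]
worst_half = max(symmetric_rows, key=lambda r: weighted_sum(H[r], 2))

# Report 1-based codeword numbers, as used in Fig. 5.3.
print(worst_full + 1, weighted_sum(H[worst_full], 2))
print(worst_half + 1, weighted_sum(H[worst_half], 2))
```

With M = 16, the search returns the codeword numbers quoted in Example 5.1 (#9 and #13), and the two maxima agree with the closed forms in Eqs. (5.61) and (5.65).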
Observing Eq. (5.64), we see that when symmetric (anti-symmetric)
codewords are used, the maximum value of $E\{|\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]|^2\}$ decreases in
an order of $1/N^4$ for sufficiently large N. We can compare this result to the
fully-loaded case, in which the MAI power decreases in an order of $1/N^2$. Thus, using
symmetric (anti-symmetric) codewords enables the system to reduce MAI at a
much faster rate as N increases in a time asynchronous environment. Although
we use the assumption $|\tau_i|/N \ll 1$ in the derivation, simulations demonstrate
that the approximation is still accurate for moderate $|\tau_i|/N$.
Example 5.3. In this example, we will show that the second-order
approximation used in Eq. (5.56) holds for $|\tau_i|/N$ as large as 1/4. Moreover, we
demonstrate that using symmetric (anti-symmetric) Hadamard-Walsh
codewords enables the system to eliminate MAI due to time asynchronism much
faster than the fully-loaded case.
Let us consider the flat fading channel, i.e., L = 1. In this case, the
approximations in Eqs. (5.9) and (5.50) become equalities. Hence, from Eqs. (5.50)
and (5.55), we have
\[
E\left\{\left|\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]\right|^{2}\right\} = \frac{1}{M^2}\,\sigma_{x_i}^2\,\sigma_{h_i}^2\,\left|\phi_{i,j}(\tau_i)\right|^{2}. \tag{5.66}
\]
We will compare the approximated maximum interference given in Eqs. (5.16)
(fully-loaded) and (5.64) (half-loaded) to the exact quantity given in Eq. (5.66).
Let us consider the case that user i has a serious time offset τi = 16. Let
M = 16, σx2i = 1 and σh2 i = 1. Let us consider the maximum MAI power from
user i to user j, which occurs when wi [m]wj [m] satisfies Eq. (5.17) for the
fully-loaded case, and occurs when wi [m]wj [m] satisfies Eq. (5.32) when sym-
metric (anti-symmetric) codewords are used. The maximum MAI power with
and without approximation as functions of N for fully- and half-loaded sys-
tems are shown in Fig. 5.7. Approximated maximum MAI power is plotted for
all integer N from 32 to 256 and exact maximum MAI power is obtained for
N = 32, 64, 128, and 256. From the figure, we observe that the approxima-
tions are very accurate for both fully- and half-loaded cases when N ≥ 64.
Since τi = 16, the approximated results become accurate when the ratio
τi /N ≤ 1/4.
Moreover, we see that symmetric (anti-symmetric) codewords enable the
system to reduce MAI at a much faster rate than the fully-loaded system under
a time asynchronous environment. We see that in the fully-loaded case, dou-
bling N will decrease MAI power by 6 dB while in half-loaded case, doubling
N will decrease MAI power by 12 dB. This result corroborates our derivation
in Lemma 5.4 and Proposition 5.2. That is, using symmetric (anti-symmetric)
codewords, the maximum MAI power due to time asynchronism decreases at
a much faster rate proportional to 1/N 4 than the fully-loaded system whose
rate is proportional to only 1/N 2 .
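The comparison in this example can be sketched in a few lines for the flat fading case (L = 1, unit powers). The code below is an illustration under stated assumptions, not the authors' simulation: it uses a Sylvester (Kronecker-ordered) Hadamard-Walsh matrix, takes rows 8 and 12 (codewords #9 and #13) as the worst-case codeword products for the fully- and half-loaded cases, and evaluates the exact power from Eqs. (5.55) and (5.66) against the second-order approximation of Eq. (5.56). All function names are hypothetical.

```python
import cmath
import math

def hadamard(M):
    """Sylvester (Kronecker-ordered) Hadamard matrix; M is assumed to be a power of two."""
    H = [[1]]
    while len(H) < M:
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def phi_exact(c, tau, N):
    """Eq. (5.55) for a codeword product c[m] = w_i[m] w_j[m]."""
    M = len(c)
    return sum(c[m] * cmath.exp(2j * math.pi * m * tau / (N * M)) for m in range(M))

def phi_second_order(c, tau, N):
    """Second-order Taylor approximation of Eq. (5.56)."""
    M = len(c)
    total = 0j
    for m in range(M):
        t = 2 * math.pi * m * tau / (N * M)
        total += c[m] * (-0.5 * t * t + 1j * t)
    return total

def power_db(phi, M):
    """MAI power of Eq. (5.66) in dB, with unit signal and channel power."""
    return 10 * math.log10(abs(phi) ** 2 / M ** 2)

M, tau = 16, 16
H = hadamard(M)
products = {"full": H[8], "half": H[12]}   # assumed worst-case products (#9, #13)

results = {}
for N in (32, 64, 128, 256):
    for name, c in products.items():
        exact = power_db(phi_exact(c, tau, N), M)
        approx = power_db(phi_second_order(c, tau, N), M)
        results[(name, N)] = (exact, approx)
        print(N, name, round(exact, 1), round(approx, 1))
```

In this sketch the half-loaded approximation coincides with the closed form of Eq. (5.64), and doubling N lowers it by about 12 dB, whereas the fully-loaded approximation drops by roughly 6 dB per doubling.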
It is interesting to note that in Eqs. (5.16) and (5.64), the ratio $N/\tau_i$
determines the approximated maximum dominating MAI $E\{|\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]|^2\}$
due to time offset. This result allows us to determine N according to the
maximum allowed time offset $\tau_i$. For instance, from Fig. 5.7, we see that in
the half-loaded case, $\max_{i,j} E\{|\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]|^2\}$ is −34 dB for N = 128 and
$\tau_i = 16$. Now, if the system uses a stricter design parameter for the time offset,
e.g., $\tau_i$ decreases from 16 to 8, then N can be decreased from 128 to 64 while
the MAI power is maintained at −34 dB in this half-loaded situation. Therefore,
this result gives us an explicit way to determine N in a time asynchronous
environment.

Fig. 5.7. Term $\max_{i,j} E\{|\mathrm{MAI}^{(00)}_{j\leftarrow i}[k]|^2\}$ with and without approximation as a
function of N for fully- and half-loaded cases.
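The design rule described above amounts to solving Eq. (5.64) for N. The following minimal sketch does so; the function names `half_loaded_bound` and `min_block_size` and their default arguments are hypothetical, and the result is a real-valued bound that a designer would then round up to a practical block size.

```python
import math

def half_loaded_bound(tau, N, L=1, sigma_x2=1.0, sigma_h2=1.0):
    """Right-hand side of Eq. (5.64)."""
    return L * sigma_x2 * sigma_h2 * (tau / N) ** 4 * math.pi ** 4 / 64

def min_block_size(tau, target_power, L=1, sigma_x2=1.0, sigma_h2=1.0):
    """Smallest real-valued N for which Eq. (5.64) does not exceed target_power."""
    scale = L * sigma_x2 * sigma_h2 * math.pi ** 4 / 64
    return abs(tau) * (scale / target_power) ** 0.25

# Reference point from the example: N = 128 and tau = 16 in the half-loaded case.
target = half_loaded_bound(16, 128)
print(10 * math.log10(target))                              # MAI level in dB
print(min_block_size(16, target), min_block_size(8, target))
```

Because Eq. (5.64) depends on $\tau_i$ and N only through their ratio, halving the allowed time offset halves the required N for the same MAI level.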
Based on Proposition 5.2, it is easy to derive the following corollary.
Corollary 7.1: Suppose that only M/2 symmetric or anti-symmetric
codewords of M Hadamard-Walsh codes are used. For $i \neq j$, we have the following
properties:
\[
\sum_{u=0,\,v=0}^{\frac{M}{2}-1} w_i[u]\,w_j^*[v] = \sum_{u=\frac{M}{2},\,v=\frac{M}{2}}^{M-1} w_i[u]\,w_j^*[v]
= \sum_{u=0}^{\frac{M}{2}-1}\sum_{v=\frac{M}{2}}^{M-1} w_i[u]\,w_j^*[v] = \sum_{u=\frac{M}{2}}^{M-1}\sum_{v=0}^{\frac{M}{2}-1} w_i[u]\,w_j^*[v] = 0. \tag{5.67}
\]
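Eq. (5.67) can be checked directly for M = 16. The sketch below is illustrative and rests on stated assumptions: a Sylvester (Kronecker-ordered) Hadamard-Walsh matrix, and the definitions w[m] = w[M − 1 − m] for symmetric and w[m] = −w[M − 1 − m] for anti-symmetric codewords. The helper names are hypothetical. Since the codewords are real, each double sum factors into a product of half-length sums.

```python
def hadamard(M):
    """Sylvester (Kronecker-ordered) Hadamard matrix; M is assumed to be a power of two."""
    H = [[1]]
    while len(H) < M:
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def is_symmetric(w):
    return all(w[m] == w[-1 - m] for m in range(len(w)))

def is_antisymmetric(w):
    return all(w[m] == -w[-1 - m] for m in range(len(w)))

def quadrant_sums(wi, wj):
    """The four double sums of Eq. (5.67), each a product of half-length sums."""
    half = len(wi) // 2
    lo_i, hi_i = sum(wi[:half]), sum(wi[half:])
    lo_j, hi_j = sum(wj[:half]), sum(wj[half:])
    return (lo_i * lo_j, hi_i * hi_j, lo_i * hi_j, hi_i * lo_j)

def corollary_holds(codebook):
    """True if all four sums vanish for every pair of distinct codewords."""
    return all(
        quadrant_sums(wi, wj) == (0, 0, 0, 0)
        for a, wi in enumerate(codebook)
        for b, wj in enumerate(codebook)
        if a != b
    )

M = 16
H = hadamard(M)
symmetric = [w for w in H if is_symmetric(w)]
antisymmetric = [w for w in H if is_antisymmetric(w)]
print(len(symmetric), len(antisymmetric))
print(corollary_holds(symmetric), corollary_holds(antisymmetric), corollary_holds(H))
```

Under these assumptions the property holds within each half-size codebook but not for the full set of M codewords, which is consistent with restricting the corollary to the symmetric or anti-symmetric selection.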

The code design procedure as stated in Proposition 5.2 also reduces the
three MAI terms $\mathrm{MAI}^{(01)}_{j\leftarrow i}[k]$, $\mathrm{MAI}^{(1)}_{j\leftarrow i}[k]$, and $\mathrm{MAI}^{(2)}_{j\leftarrow i}[k]$. This can
be explained as follows. Using Eq. (5.67), $\delta_1$, $\delta_2$, $\delta_3$, and $\delta_4$ in Eqs. (5.52),
(5.53), and (5.54) can be further reduced since the phase variation in the
exponential for all possible u and v is halved, so the exponential
term is closer to 1 than in the fully-loaded system. Take $\delta_1$ in
Eq. (5.52) as an example. $\delta_1$ can be divided into four terms as
\[
\delta_1 = \sum_{u=0,\,v=0}^{\frac{M}{2}-1}\phi_{i,j}(u,v) + \sum_{u=\frac{M}{2},\,v=\frac{M}{2}}^{M-1}\phi_{i,j}(u,v)
+ \sum_{u=0,\,v=\frac{M}{2}}^{\frac{M}{2}-1,\,M-1}\phi_{i,j}(u,v) + \sum_{u=\frac{M}{2},\,v=0}^{M-1,\,\frac{M}{2}-1}\phi_{i,j}(u,v), \tag{5.68}
\]
where $\phi_{i,j}(u,v) = w_i[u]\,w_j^*[v]\,e^{j\frac{2\pi}{NM}\left(u(n-|\tau_i|)-vn\right)}$. If symmetric or anti-symmetric
Hadamard-Walsh codes are used, each of the four terms of $\delta_1$ is approximately 0
according to Eq. (5.67). Hence, $\delta_1$ is approximately 0.
However, it is worthwhile to comment that, when compared to the fully-loaded
case, the use of the proposed code scheme leads to a more accurate
approximation and hence a smaller $\delta_1$. For example, as mentioned earlier, when
$n = |\tau_i| - 1$ in the fully-loaded system, the maximum and minimum values of
$u(n-|\tau_i|)-vn$ are 0 and −2(M − 1), respectively. Hence, the maximum phase
difference is $\frac{2\pi}{NM}\cdot 2(M-1)$. On the other hand, if the proposed code is used
for $n = |\tau_i| - 1$, the maximum and minimum values of $u(n-|\tau_i|)-vn$ are 0
(when u = v = 0) and −(M − 1) (when u = v = M/2), respectively. Thus,
the maximum phase difference is $\frac{2\pi}{NM}(M-1)$, which is one half of that of
the fully-loaded case. It can be easily verified that the phase differences of the
latter three terms in Eq. (5.68) are $\frac{2\pi}{NM}M$, $\frac{2\pi}{NM}M$ and $\frac{2\pi}{NM}(M-2)$,
respectively. Since the phase differences are near one half of those of the
fully-loaded system, the use of the proposed code leads to a more accurate
approximation and a smaller $\delta_1$. Please also note that the use of anti-symmetric codewords
in a flat channel will result in $\mathrm{MAI}^{(1)}_{j\leftarrow i}[k] = \mathrm{MAI}^{(2)}_{j\leftarrow i}[k] = 0$ when $\tau_i = -1$,
which can be easily verified by Eqs. (5.46) and (5.47). This is also true for
$\tau_i = +1$.
The number of users is decreased by one half when only symmetric or anti-
symmetric Hadamard-Walsh codewords are used. This price is compensated
by several attractive gains. The system can continue to be approximately
MAI-free in the presence of time offset. It has been shown that the same
code design can greatly mitigate the MAI caused by the frequency offset
[135], which is a serious problem that limits the mobility of the OFDMA
system [6]. Therefore, the use of symmetric or anti-symmetric Hadamard-
Walsh codewords enables the PMU-OFDM to be more robust in both a time
and frequency asynchronous environment.
Since the MAI due to time asynchronism becomes negligible, the timing
estimation for every individual user becomes much easier, i.e., estimation algo-
rithm used in single-user environment may be applied here without worrying

about the MAI. Moreover, timing misalignment among individual users can
be compensated at the receiver end. For instance, the receiver can extract
each individual user's OFDM block at the corresponding correct timing for
detection. Note that even if the timing misalignment for all the other users still
exists, this misalignment will only cause negligible MAI and will not degrade
the system bit error probability performance. Moreover, since the requirement
for accurate timing has been greatly relaxed, a much simpler synchronization
mechanism can be adopted for the transceiver.
These results stand in contrast to those of the OFDMA system, where a
minor timing mismatch causes significant MAI [97]. As mentioned in [142],
the timing asynchronism of OFDMA cannot be solved at the receiver end
alone, and a feedback mechanism is required. If some users fail to be
well synchronized, performance degrades greatly and sophisticated
multiuser estimation, e.g., [88, 142], is needed to acquire the time offsets
of the users, which adds complexity.

Example 5.4: Comparison of MAI Suppression

We will also show that the dominating MAI due to time offset, i.e.,
$\mathrm{MAI}^{(0)}_{j\leftarrow i}[k]$, can be greatly reduced if M/2 symmetric
or anti-symmetric codewords of the M Hadamard-Walsh codewords are used.
Moreover, the experimental results show that the non-dominating MAI,
$\mathrm{MAI}^{(1)}_{j\leftarrow i}[k] + \mathrm{MAI}^{(2)}_{j\leftarrow i}[k]$,
can also be greatly reduced by using symmetric or anti-symmetric codewords.
Let the simulation parameters be the same as those in Example 5.2. First,
let N = 64.
The number of users decreases from T = M = 16 to T = M/2 = 8, i.e.,
a half-loaded system. For the comparison purpose, we consider another half-
loaded code scheme that uses the first M/2 codewords of M Hadamard-Walsh
codes to serve as a benchmark. The performance is shown in Fig. 5.8. Note
that the dominating MAI using symmetric codewords (dashed-diamond) and
the dominating MAI using anti-symmetric codewords (dashed-triangular) are
overlapping. Compared with Fig. 5.6 with N = 64, the use of symmetric
or anti-symmetric codewords can significantly reduce the dominating MAI
by a range of 14–37 dB. On the other hand, the use of the first M/2 code-
words only improves the dominating MAI by around 5–6 dB. We also see
that the use of symmetric or anti-symmetric codewords can greatly reduce
the non-dominating MAI. The results confirm that both dominating and non-
dominating MAIs can be greatly reduced using the proposed code scheme.
Note that even if the maximum time offset level is 15 in this example, since
N = 64 > 4 × 15, the averaged total MAI can still be suppressed below
−25 dB, which is much smaller than the transmit power of 0 dB. This result
confirms the claim in Section 5.3.2 that significant MAI suppression can be
achieved at a more relaxed value of N .
Figure 5.9 shows the performance for N = 128. Comparing it with
the result of N = 128 in Fig. 5.6, we see that the use of symmetric or
anti-symmetric codewords can significantly reduce the dominating MAI by
20–43 dB. The improvement for N = 128 is 6 dB better than that for N = 64,
Fig. 5.8. The MAI effect is plotted as a function of the time offset in a half-loaded
system for different code schemes with N = 64. (Plot: averaged total MAI power
in dB versus time offset τ from 0 to 15; curves show the dominating and
non-dominating MAI for the first-half, anti-symmetric, and symmetric codewords.)

where the MAI reduction is 14–37 dB. This can be explained by the fact that,
when N increases, the approximation in Eq. (5.56) becomes more accurate.
Hence, better MAI suppression can be achieved.
Although the use of anti-symmetric codewords can lead to smaller non-
dominating MAI than symmetric codewords, the performance is actually
determined by the reduced dominating MAI. Hence, symmetric and anti-
symmetric codewords give rise to similar performance. Thus, in the following
discussion, we use symmetric codewords only to demonstrate the performance
of the proposed code scheme.
From the discussion in Section 5.2.1, we know that the PMU-OFDM sys-
tem has many characteristics similar to OFDMA. Thus, in the following two
examples, we compare the performance of PMU-OFDM and OFDMA under a
time asynchronous environment. Let the parameters remain the same as that
in Example 5.2. Every user in these two systems transmits N symbols and
the size of DFT/IDFT is the same, i.e., N M . Since the two systems transmit
N symbols per block and add the CP of the same length ν, their actual data
rates are the same. We consider both fully and half-loaded situations. For
the fully loaded OFDMA system, each user occupies N subchannels which
are maximally separated [114], i.e., user u is assigned subchannels indexed
by (u − 1) + kM , 1 ≤ u ≤ M , and 0 ≤ k ≤ N − 1. For the half-loaded
Fig. 5.9. The MAI effect is plotted as a function of the time offset in a half-loaded
system for different code schemes with N = 128. (Plot: averaged total MAI power
in dB versus time offset τ from 0 to 15; curves show the dominating and
non-dominating MAI for the even-indexed, anti-symmetric, and symmetric codewords.)

OFDMA system, user u is assigned subchannels indexed by 2(u − 1) + kM,
with 1 ≤ u ≤ M/2 and 0 ≤ k ≤ N − 1. The remaining NM/2 subchannels
are used as guard bands. 
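
The two OFDMA allocation rules above can be sketched in a few lines. This is only an illustration of the index formulas (u − 1) + kM and 2(u − 1) + kM; the function name `ofdma_subchannels` and the sizes M = 16, N = 4 are assumptions chosen for the sketch, not the simulation settings of the examples.

```python
def ofdma_subchannels(u, M, N, half_loaded=False):
    """Subchannel indices of user u (1-based) under the rules quoted above."""
    step = 2 if half_loaded else 1
    return [step * (u - 1) + k * M for k in range(N)]


M, N = 16, 4  # illustrative sizes only

full = [ofdma_subchannels(u, M, N) for u in range(1, M + 1)]
half = [ofdma_subchannels(u, M, N, half_loaded=True) for u in range(1, M // 2 + 1)]

used_full = sorted(i for user in full for i in user)
used_half = sorted(i for user in half for i in user)
# Subchannels left idle in the half-loaded case act as guard bands.
guard = sorted(set(range(N * M)) - set(used_half))

print(len(used_full), len(used_half), len(guard))
```

In the fully loaded case every one of the NM subchannels is used exactly once; in the half-loaded case only the even-indexed subchannels carry data, so each active subchannel is flanked by idle ones.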
Example 5.5: Performance Comparison in a Flat Fading Channel
We first evaluate the MAI power, which is the summation of
$\mathrm{MAI}^{(0)}_{j\leftarrow i}[k]$, $\mathrm{MAI}^{(1)}_{j\leftarrow i}[k]$, and
$\mathrm{MAI}^{(2)}_{j\leftarrow i}[k]$. As shown in Fig. 5.5, we consider MAI at the
detection stage, i.e., after FEQ. The parameters remain the same as that in
Example 7.4, i.e., N = 128. The averaged MAI after FEQ for PMU-OFDM
and OFDMA in fully- (dashed curves) and half-loaded (solid curves) situ-
ations are shown in Fig. 5.10. For the fully-loaded situation, PMU-OFDM
outperforms OFDMA when the time offset level τ ≤ 5, but its performance is
worse than OFDMA as τ > 5 in a flat channel. For the half-loaded case, we see
the use of symmetric codewords can greatly reduce MAI by around 20–43 dB
as compared with the fully loaded case. On the other hand, the MAI perfor-
mance of OFDMA is only slightly improved from the fully loaded system to
the half-loaded system. Consequently, PMU-OFDM outperforms OFDMA by
around 14–48 dB in the half-loaded situation.
Next, we consider the case where every user, except for the target user,
has a time offset of |τi | = 13 in the two systems. All T users serve as the
target user in turn. We would like to evaluate the bit error probability (BEP)
Fig. 5.10. The MAI performance comparison between PMU-OFDM and OFDMA
in a flat fading channel. (Plot: averaged total MAI power in dB versus time
offset τ from 0 to 15; curves: PMU-OFDM fully-loaded, OFDMA fully-loaded,
OFDMA half-loaded, and PMU-OFDM half-loaded with even codewords.)

performance when there is no feedback in Fig. 5.11. For comparison,
we also show the curve of OFDMA without time offset as a benchmark.
We see that, in a serious timing mismatch environment such as the one
specified in this example, PMU-OFDM with the proposed code scheme can achieve
performance comparable to that of OFDMA without time offset. However, the performance
of OFDMA degrades significantly due to time asynchronism. 
Example 5.6: Performance Comparison in a Multipath Environment
In this example, we examine the time asynchronous effect in a multipath
(or frequency-selective) fading channel. The number of multipaths, L, is set
to L = 4 while the other parameters remain the same as those given in
Example 5.5. The channel coefficients are i.i.d. complex Gaussian random
variables with unit variance. The comparison of the MAI power for PMU-
OFDM and OFDMA is given in Fig. 5.12. In a fully loaded system, OFDMA
has less MAI than PMU-OFDM. However, in a half-loaded system, PMU-
OFDM outperforms OFDMA by around 13–25 dB due to the use of the pro-
posed code scheme. The lower MAI value of PMU-OFDM enables the system
to estimate time offset more accurately than OFDMA.
Figure 5.13 gives the BEP comparison between the two systems with
|τi | = 13 in a half-loaded system. We see that PMU-OFDM with the proposed
code design does not have a significant performance floor in the presence of
Fig. 5.11. The BEP performance comparison between PMU-OFDM and OFDMA
in a flat channel with time offset level |τj | = 13. (Plot: bit error probability
versus SNR Eb/N0 from 10 to 30 dB; curves: PMU-OFDM with first M/2 codewords,
OFDMA half-loaded, PMU-OFDM half-loaded with even codewords, and half-loaded
OFDMA without time offset.)

serious time asynchronism and frequency-selective fading. For comparison, we
also plot PMU-OFDM with the first M/2 codewords. We see that its performance
is close to that of the half-loaded OFDMA system, which is far worse than that
of the proposed code scheme. This shows the importance of proper code
design in a time asynchronous environment for the PMU-OFDM system.

5.4 PMU-OFDM System in Frequency Offset Environment

In this section, we evaluate the CFO effect of the PMU-OFDM system and
derive analytical results for MAI caused by CFO as well as self-CFO impair-
ments. Based on the analytical results, we present a code selection scheme to
mitigate MAI by choosing proper orthogonal codes. Again, if we use only the
M/2 symmetric or the M/2 anti-symmetric codewords of the M Hadamard-
Walsh codes, MAI can be greatly reduced to a negligible amount. Moreover,
based on this code selection, we show that if a proper code priority is used,
PMU-OFDM is more robust to CFO. We will explain the code priority concept
in this section.
Fig. 5.12. The MAI comparison between PMU-OFDM and OFDMA in a frequency-
selective fading channel with L = 4. (Plot: averaged total MAI power in dB
versus time offset τ from 0 to 15; curves: PMU-OFDM fully-loaded, OFDMA
fully-loaded, OFDMA half-loaded, and PMU-OFDM half-loaded with even codewords.)

5.4.1 Analysis of CFO Effects

In the PMU-OFDM system, the overall CFO effect consists of two parts. One
is the MAI caused by CFOs of other users. The other is the symbol distortion
and the inter carrier interference (ICI) due to the user’s own CFO. They will
be analyzed separately in this section.
Referring to Fig. 5.1, consider the lth element of the received vector after
DFT in a CFO environment, i.e.,

$$
\hat{z}[l] = \sum_{i=1}^{T} r_i[l] + e[l], \quad 0 \le l \le NM - 1, \tag{5.69}
$$

where $r_i[l]$ is the attenuated symbol of $z_i[l]$. The attenuation is caused by
channel fading and the CFO effect. Suppose the ith user has a normalized CFO
$\epsilon_i$, which is the actual CFO normalized by 1/(NM) of the overall bandwidth,
with $-0.5 \le \epsilon_i \le 0.5$. Then $r_i[l]$ in Eq. (5.69) can be expressed as [87]

$$
r_i[l] = r_i^{(0)}[l] + r_i^{(1)}[l], \tag{5.70}
$$
Fig. 5.13. The BEP comparison between PMU-OFDM and OFDMA in a frequency-
selective channel with L = 4 and time offset level |τj | = 13. (Plot: bit error
probability versus SNR Eb/N0 from 10 to 30 dB; curves: PMU-OFDM half-loaded
with first M/2 codewords, OFDMA half-loaded, PMU-OFDM half-loaded with even
codewords, and half-loaded OFDMA without time offset.)

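where

$$
r_i^{(0)}[l] = \alpha_i \lambda_i[l] z_i[l],
$$

$$
r_i^{(1)}[l] = \beta_i \sum_{m=0,\, m \ne l}^{NM-1} \lambda_i[m] z_i[m]\,
\frac{e^{-j\pi \frac{m-l}{NM}}}{NM \sin\frac{\pi(m-l+\epsilon_i)}{NM}},
$$

and $\alpha_i$ and $\beta_i$ are given by

$$
\alpha_i = \frac{\sin \pi\epsilon_i}{NM \sin\frac{\pi\epsilon_i}{NM}}\,
e^{j\pi\epsilon_i \frac{NM-1}{NM}}
\quad\text{and}\quad
\beta_i = \sin(\pi\epsilon_i)\, e^{j\pi\epsilon_i \frac{NM-1}{NM}}. \tag{5.71}
$$

Here $r_i^{(0)}[l]$ is the distorted symbol and $r_i^{(1)}[l]$ is the ICI caused by the
CFO. Note that when there is no CFO, $r_i[l]$ equals $\lambda_i[l] z_i[l]$ as defined in
Eq. (5.6). From Eqs. (5.128) and (5.69), we see that $\hat{x}_j[k]$ under CFO is given by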

$$
\hat{x}_j[k] =
\underbrace{\frac{1}{M}\sum_{v=0}^{M-1} r_j[v+kM]\, w_j^*[v+kM]}_{s_j[k]}
+ \sum_{i=1,\, i \ne j}^{T}
\underbrace{\frac{1}{M}\sum_{v=0}^{M-1} r_i[v+kM]\, w_j^*[v+kM]}_{\mathrm{MAI}_{j\leftarrow i}[k]}
+ \frac{1}{M}\sum_{v=0}^{M-1} e[v+kM]\, w_j^*[v+kM], \tag{5.72}
$$

where $\mathrm{MAI}_{j\leftarrow i}[k]$ is the interference due to user i. In the following,
$\mathrm{MAI}_{j\leftarrow i}[k]$ and $s_j[k]$ are considered separately.
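
As a numerical sanity check on the coefficients of Eqs. (5.70) and (5.71), the following sketch evaluates the gain on the desired subcarrier and the ICI weights for offsets d = m − l = 1, ..., NM − 1, and confirms that their squared magnitudes add up to one, i.e., a CFO redistributes energy across subcarriers without creating or destroying it. The function name `cfo_coefficients` and the values ε = 0.3 and NM = 64 are assumptions made for illustration; the sketch also relies on the magnitude of the ICI weight being periodic in d with period NM, and it requires ε ≠ 0.

```python
import cmath
import math


def cfo_coefficients(eps, size):
    """Return (alpha, ICI weights for d = 1..size-1), with size standing for N*M."""
    phase = cmath.exp(1j * math.pi * eps * (size - 1) / size)
    alpha = math.sin(math.pi * eps) / (size * math.sin(math.pi * eps / size)) * phase
    beta = math.sin(math.pi * eps) * phase
    ici = [
        beta * cmath.exp(-1j * math.pi * d / size)
        / (size * math.sin(math.pi * (d + eps) / size))
        for d in range(1, size)
    ]
    return alpha, ici


alpha, ici = cfo_coefficients(0.3, 64)
desired_power = abs(alpha) ** 2
total_power = desired_power + sum(abs(c) ** 2 for c in ici)
print(desired_power, total_power)
```

With these values roughly three quarters of the energy stays on the desired subcarrier and the rest leaks into ICI.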

5.4.2 Analysis of Other User’s CFO Effect

Let us consider the MAI of the kth symbol of user j due to user i, i.e.,
$\mathrm{MAI}_{j\leftarrow i}[k]$ in Eq. (5.72). From Eqs. (5.70) and (5.72), we have

$$
\mathrm{MAI}_{j\leftarrow i}[k] = A_{j\leftarrow i}[k] + B_{j\leftarrow i}[k], \tag{5.73}
$$

where

$$
A_{j\leftarrow i}[k] = \frac{1}{M}\sum_{v=0}^{M-1} r_i^{(0)}[v+kM]\, w_j^*[v+kM] \tag{5.74}
$$

and

$$
B_{j\leftarrow i}[k] = \frac{1}{M}\sum_{v=0}^{M-1} r_i^{(1)}[v+kM]\, w_j^*[v+kM]. \tag{5.75}
$$

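Using Eqs. (5.1), (5.3) and the approximation in Eq. (5.9), we have

$$
A_{j\leftarrow i}[k] \approx \frac{\alpha_i}{M}\,\tilde{\lambda}_i[k]\, x_i[k]
\sum_{v=0}^{M-1} w_i[v+kM]\, w_j^*[v+kM] = 0. \tag{5.76}
$$

Therefore, the interference term $A_{j\leftarrow i}[k]$ is approximately zero. Thus, only
the term $B_{j\leftarrow i}[k]$ is of concern. The term $B_{j\leftarrow i}[k]$ can be rearranged as

$$
B_{j\leftarrow i}[k] = \frac{\beta_i}{M}\sum_{v=0}^{M-1}\;
\sum_{m=0,\, m \ne v+kM}^{NM-1} \lambda_i[m]\, y_i[m]\,
\frac{e^{-j\pi\frac{m-v-kM}{NM}}}{NM \sin\frac{\pi(m-v-kM+\epsilon_i)}{NM}}\,
w_i[m]\, w_j^*[v]. \tag{5.77}
$$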

Letting m = u + fM, Eq. (5.77) can be manipulated as

$$
B_{j\leftarrow i}[k] = \frac{\beta_i}{M}\sum_{v=0}^{M-1}\sum_{f=0}^{N-1}\;
\sum_{u=0,\, u \ne v+(k-f)M}^{M-1} \lambda_i[u+fM]\, y_i[u+fM]
\cdot \frac{e^{-j\pi\frac{u-v-(k-f)M}{NM}}}{NM \sin\frac{\pi(u-v-(k-f)M+\epsilon_i)}{NM}}\,
w_i[u]\, w_j^*[v]. \tag{5.78}
$$

Using Eq. (5.1) and the approximation in Eq. (5.9), we have

$$
B_{j\leftarrow i}[k] \approx \frac{\beta_i}{M}\sum_{f=0}^{N-1} \tilde{\lambda}_i[f]\, x_i[f]
\sum_{v=0}^{M-1}\;\sum_{u=0,\, u \ne v+(k-f)M}^{M-1}
\frac{e^{-j\pi\frac{u-v-(k-f)M}{NM}}}{NM \sin\frac{\pi(u-v-(k-f)M+\epsilon_i)}{NM}}\,
w_i[u]\, w_j^*[v]. \tag{5.79}
$$

Dominating MAI Due to Others’ CFOs

We argue below that the term of f = k is the dominating MAI in Eq. (5.79).
Since $u \ne v + (k-f)M$, we have

$$
\min_{u,v,k,f}\left|\sin\frac{\pi(u-v-(k-f)M+\epsilon_i)}{NM}\right| =
\begin{cases}
\left|\sin\frac{\pi(-1+\epsilon_i)}{NM}\right|, & \epsilon_i > 0, \\
\left|\sin\frac{\pi(1+\epsilon_i)}{NM}\right|, & \epsilon_i < 0.
\end{cases}
$$

When f = k, there are M − 1 pairs of (u, v) that make the sine function in
Eq. (5.79) equal $\sin(\pi(1+\epsilon_i)/NM)$ and M − 1 pairs of (u, v) that make it
equal $\sin(\pi(-1+\epsilon_i)/NM)$. On the other hand, when f = k + 1 or
f = k − 1, only a single pair of (u, v) attains one of these two values:
with the sign convention of Eq. (5.79), f = k + 1 yields one pair for which the
sine function equals $\sin(\pi(1+\epsilon_i)/NM)$, and f = k − 1 yields one pair
for which it equals $\sin(\pi(-1+\epsilon_i)/NM)$. Hence, the MAI term of f = k contributes
the most to Eq. (5.79). If we can find ways to reduce the term of f = k, the
MAI can be greatly reduced. Intuitively speaking, the MAI in the kth symbol
of the target user is most seriously affected by the kth symbols of other users.
The farther the distance of other users’ symbols from the kth symbol, the less
impact they will make. 
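
The pair counts quoted in this argument can be confirmed by brute force. The sketch below counts, for d = f − k, the pairs (u, v) whose offset u − v − (k − f)M = u − v + dM equals +1 or −1, which are the offsets that bring the sine in the denominator of Eq. (5.79) closest to zero. The helper name `count_extreme_pairs` and the choice M = 16 are assumptions made for illustration.

```python
def count_extreme_pairs(M, d):
    """Count (u, v), 0 <= u, v < M, with u - v + d*M equal to +1 and to -1."""
    plus = sum(1 for u in range(M) for v in range(M) if u - v + d * M == 1)
    minus = sum(1 for u in range(M) for v in range(M) if u - v + d * M == -1)
    return plus, minus


M = 16
for d in (-1, 0, 1):
    print(d, count_extreme_pairs(M, d))
```

For d = 0 (f = k) there are M − 1 pairs of each kind, while each neighboring symbol (d = ±1) contributes a single pair, which is why the f = k term dominates.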
Since the MAI term of f = k in Eq. (5.79) is the dominating MAI, we will
rearrange this term into a form that helps us gain insight into how to reduce it.
Let us extract the MAI term of f = k in Eq. (5.79) and denote it by
$B^{(0)}_{j\leftarrow i}[k]$. We have

$$
B_{j\leftarrow i}[k] = B^{(0)}_{j\leftarrow i}[k] + B^{(1)}_{j\leftarrow i}[k], \tag{5.80}
$$

where

$$
B^{(0)}_{j\leftarrow i}[k] \approx \frac{\beta_i}{M}\,\tilde{\lambda}_i[k]\, x_i[k]
\sum_{v=0}^{M-1}\;\sum_{u=0,\, u \ne v}^{M-1}
\frac{e^{-j\pi\frac{u-v}{NM}}}{NM \sin\frac{\pi(u-v+\epsilon_i)}{NM}}\,
w_i[u]\, w_j^*[v] \tag{5.81}
$$

and

$$
B^{(1)}_{j\leftarrow i}[k] = B_{j\leftarrow i}[k] - B^{(0)}_{j\leftarrow i}[k]. \tag{5.82}
$$

Referring to Eq. (5.81), let

$$
g(p) = \frac{e^{-j\pi\frac{p}{NM}}}{NM \sin\frac{\pi(p+\epsilon_i)}{NM}},
\quad -(M-1) \le p \le M-1,\ p \ne 0.
$$

When N is sufficiently large, the denominator of g(p) is
approximately an odd function of p and the numerator is nearly constant for
all possible p with −M + 1 ≤ p ≤ M − 1, p ≠ 0. Hence, we can approximate
g(p) as an odd function of p, i.e., g(p) ≈ −g(−p). Using the equality

$$
\sum_{v=0}^{M-1}\;\sum_{u=0,\, u \ne v}^{M-1} g(u-v)\, w_i[u]\, w_j^*[v]
= \sum_{p=1}^{M-1}\left\{ g(p)\sum_{q=0}^{M-1-p} w_i[p+q]\, w_j^*[q]
+ g(-p)\sum_{q=0}^{M-1-p} w_i[q]\, w_j^*[p+q] \right\},
$$

and the approximation g(p) ≈ −g(−p), we can rewrite Eq. (5.81) as

$$
B^{(0)}_{j\leftarrow i}[k] \approx \frac{\beta_i}{M}\,\tilde{\lambda}_i[k]\, x_i[k]
\sum_{p=1}^{M-1} g(p)
\underbrace{\sum_{q=0}^{M-1-p}\left( w_i[p+q]\, w_j^*[q] - w_i[q]\, w_j^*[p+q] \right)}_{O}.
\tag{5.83}
$$
As given in Eq. (5.83), the quantity O is determined by the property of orthog-
onal codewords. If O = 0, the dominating MAI term of f = k in Eq. (5.79) is
approximately zero. One way to achieve this is the use of only M/2 of the M
Hadamard-Walsh codes, which are either symmetric or anti-symmetric.
Proposition 5.3: Suppose only the M/2 symmetric or the M/2 anti-symmetric
codewords of the M Hadamard-Walsh codes are used. Then O = 0 and
thus $B^{(0)}_{j\leftarrow i}[k] \approx 0$.
Proof. When symmetric codewords are used, from Eq. (5.22) and since the
Hadamard-Walsh code is real, we have

$$
\sum_{q=0}^{M-1-p} w_i[p+q]\, w_j^*[q]
= \sum_{q=0}^{M-1-p} w_i[M-1-(p+q)]\, w_j[M-1-q]. \tag{5.84}
$$

Let $q' = M - 1 - p - q$. We can rewrite Eq. (5.84) as

$$
\sum_{q=0}^{M-1-p} w_i[p+q]\, w_j^*[q]
= \sum_{q'=M-1-p}^{0} w_i[q']\, w_j[p+q']
= \sum_{q=0}^{M-1-p} w_i[q]\, w_j^*[p+q]. \tag{5.85}
$$

Thus, O in Eq. (5.83) is zero. As for the set of anti-symmetric codewords, from
Eq. (5.23), we have the same equality as given in Eq. (5.84) again. This leads
to Eq. (5.85). Therefore, the use of anti-symmetric codewords also results in
O = 0.
Let us give a simple example for codeword selection. Suppose M = 8 and
D8 is an 8 × 8 Hadamard matrix with column vectors d1 , d2 , . . . , d8 . We can
either choose column vectors {d1 , d4 , d6 , d7 }, which are symmetric, or column
vectors {d2 , d3 , d5 , d8 }, which are anti-symmetric, as the four codewords for
four users.
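
The codeword selection and the claim O = 0 can be checked numerically. The sketch below assumes that D8 is the Sylvester-type Hadamard matrix (the text does not state the construction, although this choice reproduces the index sets quoted above); the helper names `hadamard`, `split_codewords`, `big_o`, and `worst_o` are introduced here for illustration only.

```python
def hadamard(m):
    """Sylvester-type Hadamard matrix of order m (m a power of two)."""
    h = [[1]]
    while len(h) < m:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def split_codewords(m):
    """Columns of the matrix plus 1-based indices of symmetric/anti-symmetric ones."""
    h = hadamard(m)
    cols = [[h[r][c] for r in range(m)] for c in range(m)]
    sym = [c + 1 for c, w in enumerate(cols) if w == w[::-1]]
    anti = [c + 1 for c, w in enumerate(cols) if w == [-x for x in w[::-1]]]
    return cols, sym, anti


def big_o(wi, wj, p):
    """Inner sum O of Eq. (5.83) for real codewords wi, wj and lag p."""
    m = len(wi)
    return sum(wi[p + q] * wj[q] - wi[q] * wj[p + q] for q in range(m - p))


cols, sym, anti = split_codewords(8)
print(sym, anti)  # [1, 4, 6, 7] [2, 3, 5, 8]


def worst_o(indices):
    """Largest |O| over all codeword pairs in the set and all lags p."""
    return max(abs(big_o(cols[a - 1], cols[b - 1], p))
               for a in indices for b in indices for p in range(1, 8))


print(worst_o(sym), worst_o(anti), worst_o(range(1, 9)))
```

Within either half of the code set O vanishes for every pair and every lag, whereas the full set contains pairs with nonzero O (for example d1 and d2 at p = 1).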
If the proposed code selection is used, the dominating MAI $B^{(0)}_{j\leftarrow i}[k]$
can be reduced to a negligible amount. Therefore, the MAI terms of f ≠ k in
Eq. (5.82) become the main MAI impairment. For convenience, with code
selection, let us call it the residual MAI. Next, we will investigate the residual
MAI. From Eqs. (5.79) and (5.81), we have

$$
B^{(1)}_{j\leftarrow i}[k] \approx \frac{\beta_i}{M}\sum_{f=0,\, f \ne k}^{N-1}
\tilde{\lambda}_i[f]\, x_i[f] \sum_{v=0}^{M-1}\sum_{u=0}^{M-1}
\frac{e^{-j\pi\frac{u-v-(k-f)M}{NM}}}{NM \sin\frac{\pi(u-v-(k-f)M+\epsilon_i)}{NM}}\,
w_i[u]\, w_j^*[v]. \tag{5.86}
$$

Let l = f − k. For fixed k, with −k ≤ l ≤ N − 1 − k and l ≠ 0,
$B^{(1)}_{j\leftarrow i}[k]$ in Eq. (5.86) can be written as

$$
B^{(1)}_{j\leftarrow i}[k] \approx \frac{\beta_i}{M}\sum_{l=-k,\, l \ne 0}^{N-1-k}
\tilde{\lambda}_i[k+l]\, x_i[k+l]
\underbrace{\sum_{v=0}^{M-1}\sum_{u=0}^{M-1}
\frac{e^{-j\pi\frac{u-v+lM}{NM}}}{NM \sin\frac{\pi(u-v+lM+\epsilon_i)}{NM}}\,
w_i[u]\, w_j^*[v]}_{\zeta}. \tag{5.87}
$$

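Let

$$
f(p, l) = \frac{e^{-j\pi\frac{p}{NM}}}{NM \sin\frac{\pi(p+lM+\epsilon_i)}{NM}}.
$$

We have

$$
\zeta = e^{-j\pi\frac{l}{N}} \sum_{v=0}^{M-1}\sum_{u=0}^{M-1} f(u-v, l)\, w_i[u]\, w_j^*[v]
= e^{-j\pi\frac{l}{N}} \sum_{p=1}^{M-1}\left\{ f(p, l)\sum_{q=0}^{M-1-p} w_i[p+q]\, w_j^*[q]
+ f(-p, l)\sum_{q=0}^{M-1-p} w_i[q]\, w_j^*[p+q] \right\}. \tag{5.88}
$$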
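
As a reading aid (a restatement of steps the derivation leaves implicit, not an additional result), the two lines of Eq. (5.88) follow from splitting the exponential of Eq. (5.87) and from noting that the zero-lag term p = u − v = 0 vanishes because distinct Hadamard-Walsh codewords are orthogonal over one period:

$$
e^{-j\pi\frac{u-v+lM}{NM}} = e^{-j\pi\frac{l}{N}}\, e^{-j\pi\frac{u-v}{NM}},
\qquad
f(0, l)\sum_{q=0}^{M-1} w_i[q]\, w_j^*[q] = 0 \quad (i \ne j).
$$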

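Using Eqs. (5.85) and (5.88), we can rewrite Eq. (5.87) as

$$
B^{(1)}_{j\leftarrow i}[k] \approx \frac{\beta_i}{M}\sum_{l=-k,\, l \ne 0}^{N-1-k}
e^{-j\pi\frac{l}{N}}\,\tilde{\lambda}_i[k+l]\, x_i[k+l]
\cdot \sum_{p=1}^{M-1}\left[f(p, l) + f(-p, l)\right]
\sum_{q=0}^{M-1-p} w_i[q]\, w_j^*[p+q]. \tag{5.89}
$$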

Assume that $\tilde{\lambda}_i[k]$ and $x_i[k]$ are uncorrelated for all k, and that
$x_i[k]$ and $x_i[k']$ are uncorrelated for k ≠ k'. It can be shown that
$E\{|B^{(1)}_{j\leftarrow i}[k]|^2\}$ can be approximated by [135]

$$
\frac{|\beta_i|^2}{M^2}\,\sigma^2_{\lambda_i}\sigma^2_{x_i}
\sum_{l=1}^{N-1}\left|\sum_{p=1}^{M-1}\left[f(p, l) + f(-p, l)\right]
\sum_{q=0}^{M-1-p} w_i[q]\, w_j^*[p+q]\right|^2, \tag{5.90}
$$

where $\sigma^2_{\lambda_i}$ is the averaged channel gain, and $\sigma^2_{x_i}$ is the
averaged symbol power of user i defined by

$$
\sigma^2_{\lambda_i} = E\left\{|\tilde{\lambda}_i[k]|^2\right\}
\quad\text{and}\quad
\sigma^2_{x_i} = E\left\{|x_i[k]|^2\right\}, \quad 0 \le k \le N-1. \tag{5.91}
$$
Example 5.7: Assume $\sigma^2_{\lambda_i} = \sigma^2_{x_i} = 1$. Let us first consider the CFO
case of $\epsilon_i = 0.3$. This CFO level may be regarded as a serious one. Let
M = 16 and N = 4. Figure 5.14 shows the total residual MAI, i.e.,
$\sum_{i=1,\, i \ne j}^{T} E\{|B^{(1)}_{j\leftarrow i}[k]|^2\}$, as a function of the user
index, where the summation accounts for T = M/2 users in this system. As shown in
Fig. 5.14, the worst total residual MAI is about −18.5 dB, which is much
smaller than the transmit power of 0 dB. Since the residual MAI
$B^{(1)}_{j\leftarrow i}[k]$ is relatively small, it will make channel estimation and
CFO estimation much more accurate. Let us consider the averaged value of the total
residual MAI for all users, i.e.,
$\frac{1}{T}\sum_{j=1}^{T}\sum_{i=1,\, i \ne j}^{T} E\{|B^{(1)}_{j\leftarrow i}[k]|^2\}$.
Figure 5.15 shows the averaged value of the total residual MAI as a function of CFO
for different M with symmetric codewords. The performance with anti-symmetric
codewords is similar. From Fig. 5.15, we see that, as M increases, the
averaged total residual MAI decreases. This result means that increasing M
helps reduce $B^{(1)}_{j\leftarrow i}[k]$. Note that since there are T = M/2 users
in this system, increasing M also increases the number of users. It
implies that an increase in the number of users can help reduce the residual MAI
for a fixed CFO.

Fig. 5.14. Example 5.7: $\sum_{i=1,\, i \ne j}^{T} E\{|B^{(1)}_{j\leftarrow i}[k]|^2\}$ (dB) as a
function of user index j with M = 16 for symmetric and anti-symmetric codewords
[[135] © IEEE].

Fig. 5.15. Example 5.7:
$\frac{1}{T}\sum_{j=1}^{T}\sum_{i=1,\, i \ne j}^{T} E\{|B^{(1)}_{j\leftarrow i}[k]|^2\}$ (dB)
as a function of CFO ε for M = 8, 16, 32, and 64 [[135] © IEEE].

Although the codeword selection given in Proposition 5.3 decreases the
number of users from M to M/2, it reduces the dominating MAI greatly and
the system is approximately MAI-free in the presence of CFOs. Thus, every
user only has to tackle his/her own CFO problem without worrying about
the CFOs of other users. This is very different from conventional multiaccess
OFDM systems, where sophisticated signal processing is used to solve the
multiuser CFO problem [6, 31, 88, 142]. 

Example 5.8: Suppression of Dominating MAI Due to CFO

Here, we would like to show by simulation that $B^{(0)}_{j\leftarrow i}[k]$ defined in
Eq. (5.81) is the dominating MAI term in Eq. (5.79), and that it can be greatly
reduced using only M/2 symmetric or anti-symmetric codewords of the M
Hadamard-Walsh codewords.
We consider the performance in the uplink direction so that every user has
a different CFO and different channel fading. We assume the channel and the CFO
are quasi-invariant in the sense that they remain unchanged within one block
duration. Simulations are conducted with the following parameter setting
throughout this section: M = 16 and BPSK modulation is used. The channel
coefficients are i.i.d. (independent and identically distributed) complex Gaussian
random variables with unit variance. For every individual user, the Monte
Carlo method is used to run more than 500,000 symbols. We consider the worst
CFO situation. That is, the CFO value of each user is randomly assigned to
be either +ε or −ε.
Let the channel be flat and N = 4. Then $A_{j\leftarrow i}[k] = 0$ and
$\mathrm{MAI}_{j\leftarrow i}[k] = B_{j\leftarrow i}[k]$ according to Eqs. (5.73) and (5.76),
where the equalities are exact because the channel is flat. Following the notation
of Eq. (5.80), we define the MAI from the kth symbol of user i to user j as
$\mathrm{MAI}^{(0)}_{j\leftarrow i}[k]$, and the MAI from all the other symbols of user i as
$\mathrm{MAI}^{(1)}_{j\leftarrow i}[k]$, i.e.,
$\mathrm{MAI}^{(1)}_{j\leftarrow i}[k] = \mathrm{MAI}_{j\leftarrow i}[k] - \mathrm{MAI}^{(0)}_{j\leftarrow i}[k]$.
Let the number of users be T = M = 16, i.e., a fully-loaded system. The total MAI
for user j from the kth symbols of all other users, denoted by
$\overline{\mathrm{MAI}}^{(0)}_j$, is calculated as follows. For the kth symbol of a
target user, we accumulate the MAI contributed by the kth symbols of the other
15 users. The procedure is repeated and the MAI power is then averaged over k
from 0 to N − 1. That is, $\overline{\mathrm{MAI}}^{(0)}_j$ is obtained by averaging the value
$\frac{1}{N}\sum_{k=0}^{N-1}\left|\sum_{i=1,\, i \ne j}^{T} \mathrm{MAI}^{(0)}_{j\leftarrow i}[k]\right|^2$
over more than 500,000 symbols. Similarly, the total MAI from all the other
symbols (f ≠ k) of all the other users, denoted by $\overline{\mathrm{MAI}}^{(1)}_j$, is
obtained by averaging the value
$\frac{1}{N}\sum_{k=0}^{N-1}\left|\sum_{i=1,\, i \ne j}^{T} \mathrm{MAI}^{(1)}_{j\leftarrow i}[k]\right|^2$
over more than 500,000 symbols.
The total MAI is plotted as a function of the normalized CFO value in
Fig. 5.16, where $\overline{\mathrm{MAI}}^{(0)}_j$ and $\overline{\mathrm{MAI}}^{(1)}_j$ of the 16
users are shown by 16 solid and 16 dashed curves, respectively. The solid bold
curve in Fig. 5.16, denoted by $\overline{\mathrm{MAI}}^{(0)}$, is the average of the 16
solid curves, i.e.,
$\overline{\mathrm{MAI}}^{(0)} = \frac{1}{T}\sum_{j=1}^{T} \overline{\mathrm{MAI}}^{(0)}_j$.
Similarly, the dashed bold curve, denoted by $\overline{\mathrm{MAI}}^{(1)}$, is obtained by
averaging the 16 dashed curves, i.e.,
$\overline{\mathrm{MAI}}^{(1)} = \frac{1}{T}\sum_{j=1}^{T} \overline{\mathrm{MAI}}^{(1)}_j$.
From this figure, we see that $\overline{\mathrm{MAI}}^{(0)}$ is around 10–11 dB larger than
$\overline{\mathrm{MAI}}^{(1)}$. Hence, for each user, the MAI at the
kth symbol is mostly contributed by the kth symbols of other users. This
confirms the derived theoretical result that $B^{(0)}_{j\leftarrow i}[k]$ in Eq. (5.81) is
the dominating MAI in Eq. (5.79). In the following, we call
$B^{(0)}_{j\leftarrow i}[k]$ the "dominating MAI" and $B^{(1)}_{j\leftarrow i}[k]$ the
"residual MAI" for short.

Fig. 5.16. Example 5.8: The MAI effect as a function of the CFO when the full M
Hadamard-Walsh codewords are used [[135] © IEEE].

Next, we demonstrate that the dominating MAI can be greatly reduced
using only M/2 symmetric codewords of the M Hadamard-Walsh codes. Let
the number of users decrease from T = M = 16 to T = M/2 = 8, with only
the M/2 = 8 symmetric codewords being used. The performance is shown in
Fig. 5.17(a). Compared with Fig. 5.16, the residual MAI decreases by around
4–5 dB because the number of users decreases from 16 to 8. In contrast, the
dominating MAI is greatly reduced by 12–47 dB. Note that the simulation
result of the residual MAI is consistent with the theoretical result in Fig. 5.15.
Moreover, we see that, with symmetric codewords, the dominating MAI is
even smaller than the residual MAI when the CFO is less than 0.35.
Figure 5.17(b) shows the performance using M/2 anti-symmetric code-
words. Compared with Fig. 5.17(a), we see that the performance of the set
of anti-symmetric codewords is similar to that of the set of symmetric code-
words. The result confirms that the dominating MAI can be greatly reduced
using the proposed code selection scheme.

5.4.3 Analysis of Self-CFO Effect


Fig. 5.17. Example 5.8: The MAI effect as a function of the CFO (a) when only
M/2 symmetric Hadamard-Walsh codewords are used and (b) when only M/2 anti-
symmetric Hadamard-Walsh codewords are used [[135] © IEEE].

In this subsection, we examine the impairment caused by self-CFO. For user
j, the self-CFO impairment means the symbol distortion and the interference
caused by his/her own CFO $\epsilon_j$. From Eqs. (5.70) and (5.72), we have

$$
s_j[k] = C_j[k] + D_j[k], \tag{5.92}
$$

where $C_j[k]$ is the distorted symbol of $x_j[k]$ due to self-CFO given by

$$
\begin{aligned}
C_j[k] &= \frac{1}{M}\sum_{v=0}^{M-1} r_j^{(0)}[v+kM]\, w_j^*[v+kM] \\
&= \frac{\alpha_j}{M}\sum_{v=0}^{M-1} \lambda_j[v+kM]\, y_j[v+kM]\, w_j[v+kM]\, w_j^*[v+kM] \\
&\approx \frac{\alpha_j}{M}\,\tilde{\lambda}_j[k]\, x_j[k]\sum_{v=0}^{M-1} w_j[v+kM]\, w_j^*[v+kM] \\
&= \alpha_j \tilde{\lambda}_j[k]\, x_j[k],
\end{aligned} \tag{5.93}
$$

and $D_j[k]$ is the interference caused by $x_j[f]$, 0 ≤ f ≤ N − 1,

$$
D_j[k] = \frac{1}{M}\sum_{v=0}^{M-1} r_j^{(1)}[v+kM]\, w_j^*[v+kM]. \tag{5.94}
$$

Using the same procedure of deriving Eqs. (5.75), (5.77), (5.78), and (5.79),
we have

$$
D_j[k] \approx \frac{\beta_j}{M}\sum_{f=0}^{N-1} \tilde{\lambda}_j[f]\, x_j[f]
\sum_{v=0}^{M-1}\;\sum_{u=0,\, u \ne v+(k-f)M}^{M-1}
\frac{e^{-j\pi\frac{u-v-(k-f)M}{NM}}}{NM \sin\frac{\pi(u-v-(k-f)M+\epsilon_j)}{NM}}\,
w_j[u]\, w_j^*[v]. \tag{5.95}
$$
Using the same procedure of deriving Eqs. (5.81) and (5.83), the term of f = k
in Eq. (5.95), which is the interference caused by the kth symbol itself, can
be written as

$$
D_j^{(0)}[k] \approx \frac{\beta_j}{M}\,\tilde{\lambda}_j[k]\, x_j[k]
\sum_{p=1}^{M-1} g(p)
\underbrace{\sum_{q=0}^{M-1-p}\left( w_j[p+q]\, w_j^*[q] - w_j[q]\, w_j^*[p+q] \right)}_{O}
= 0, \tag{5.96}
$$

where O = 0 because $w_j[m] = w_j[m+kM]$, 0 ≤ m ≤ M − 1 and 0 ≤ k ≤ N − 1.
So we have $D_j^{(0)}[k] \approx 0$.

Next, consider the terms of f ≠ k in Eq. (5.95), which are the ICI caused
by all other symbols except for the kth symbol. Using the same procedure of
deriving Eqs. (5.86), (5.87), (5.88) and (5.89), we have

$$
D_j^{(1)}[k] \approx \frac{\beta_j}{M}\sum_{l=-k,\, l \ne 0}^{N-1-k}
e^{-j\pi\frac{l}{N}}\,\tilde{\lambda}_j[k+l]\, x_j[k+l]
\cdot \sum_{p=1}^{M-1}\left[f(p, l) + f(-p, l)\right]
\sum_{q=0}^{M-1-p} w_j[q]\, w_j^*[p+q]. \tag{5.97}
$$

From Eq. (5.97), it can be shown that [136]

$$
E\left\{\left|D_j^{(1)}[k]\right|^2\right\} \approx
\frac{|\beta_j|^2}{M^2}\,\sigma^2_{\lambda_j}\sigma^2_{x_j}
\sum_{l=1}^{N-1}\left|\sum_{p=1}^{M-1}\left[f(p, l) + f(-p, l)\right]
\sum_{q=0}^{M-1-p} w_j[q]\, w_j^*[p+q]\right|^2. \tag{5.98}
$$

Example 5.9: The environment setting is the same as that stated in
Example 5.7. First, let us consider $E\{|D_j^{(1)}[k]|^2\}$ for each individual user,
which is plotted as a function of user index in Fig. 5.18. We see that the worst
performance is around −6 dB for the user with codeword d1, i.e., the all-one
code. Other codewords have performance smaller than −17 dB. Since the ICI
using all other symmetric codewords is below −27 dB (except d1), this
result suggests the use of symmetric codewords, excluding the all-one code, to
obtain a smaller ICI. According to Eq. (5.93), the distorted symbol due to
self-CFO has the mean square value $E\{|C_j[k]|^2\} = -1.3$ dB
when $|\epsilon_j| = 0.3$ and $\sigma^2_{\lambda_i} = \sigma^2_{x_i} = 1$. Except for d1, the
amount $E\{|D_j^{(1)}[k]|^2\}$ of other users is much smaller than −1.3 dB. Moreover,
the −1.3 dB of the distorted symbol is still relatively large as compared with
the residual MAI, which is below −18 dB in Example 5.7.
Since the residual MAI and ICI are both smaller than the distorted symbol
due to the self-CFO, we can estimate $\alpha_j$ accurately and compensate it at the
receiver end without demanding a feedback mechanism. We will discuss this
in more detail in the next section. The averaged ICI for all users as a function
of CFO for different M, i.e., $\frac{1}{T}\sum_{j=1}^{T} E\{|D_j^{(1)}[k]|^2\}$, is shown
in Fig. 5.19.
Since d1 is excluded, there are M/2 − 1 users in the system with symmetric
codewords. We see that symmetric codewords with d1 excluded have a smaller
averaged ICI than anti-symmetric codewords. From the figure, we see results
similar to Fig. 5.15. That is, the increase of M helps reduce ICI for a fixed
CFO.
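
The −1.3 dB figure quoted for the distorted symbol can be reproduced from Eqs. (5.71) and (5.93): with unit channel gain and symbol power, the mean square value of the distorted symbol reduces to the squared magnitude of the coefficient α. The sketch below evaluates it for ε = 0.3 and NM = 64 (M = 16, N = 4, as in Example 5.7); the function name `distortion_db` is an assumption made for illustration.

```python
import math


def distortion_db(eps, size):
    """|alpha|^2 of Eq. (5.71) in dB, with size standing for N*M."""
    magnitude = math.sin(math.pi * eps) / (size * math.sin(math.pi * eps / size))
    return 20 * math.log10(magnitude)


level = distortion_db(0.3, 64)
print(level)  # close to -1.3 dB
```

Because NM sin(πε/NM) is nearly πε for large NM, the result depends only weakly on the block size and is essentially the squared sinc of the normalized CFO.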

Fig. 5.18. Example 5.9: $E\{|D_j^{(1)}[k]|^2\}$ (dB) as a function of user index j with
M = 16 for symmetric and anti-symmetric codewords [[135] © IEEE].

Fig. 5.19. Example 5.9: $\frac{1}{T}\sum_{j=1}^{T} E\{|D_j^{(1)}[k]|^2\}$ (dB) as a function
of CFO ε for M = 16, 32, and 64, where the all-one code is excluded for the set of
symmetric codewords [[135] © IEEE].


5.4.4 Overall CFO Estimation and Compensation


Here, let us consider the overall CFO effect. For convenience, let us rewrite
Eq. (5.72) as

$$
\hat{x}_j[k] \approx \alpha_j \tilde{\lambda}_j[k]\, x_j[k] + D_j[k]
+ \sum_{i=1,\, i \ne j}^{T} \mathrm{MAI}_{j\leftarrow i}[k]
+ \frac{1}{M}\sum_{m=0}^{M-1} e[m+kM]\, w_j^*[m], \tag{5.99}
$$

where the first term is the desired signal, the second term is the self-CFO
interference, the third term is the MAI from other users, and the fourth term
is the additive noise. Following the discussion in Sections 5.4.2 and 5.4.3,
impairments due to MAI and self-CFO interference (second term of Eq. (5.99))
are negligible if only M/2 symmetric or anti-symmetric codewords of the M
Hadamard-Walsh codes are used. The desired first term of Eq. (5.99) is much
larger than all other terms. Thus, we can estimate the self-CFO value $\epsilon_j$ of
each user accurately with the algorithms for single-user OFDM systems developed
in [87] and [118, 141]. Once $\epsilon_j$ is estimated, we can multiply the lth received
symbol of user j by $e^{-j2\pi \frac{l\epsilon_j}{NM}}$ to remove the self-CFO effect, before
passing the symbols through the DFT matrix. The penalty is that a separate DFT is
needed for each user.
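
The compensation step can be illustrated with a noise-free, single-user sketch: a CFO rotates the lth time-domain sample by a phase proportional to l, and multiplying by the opposite rotation before the DFT restores the transmitted symbols. Everything here is an assumption made for illustration, including the block size 16 (standing in for NM), the BPSK pattern, the perfectly known ε, and the plain O(n²) DFT helper; precoding, the channel, and the other users are left out.

```python
import cmath
import math


def dft(x, inverse=False):
    """Plain O(n^2) DFT; the inverse includes the 1/n scaling."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * math.pi * t * k / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


size = 16    # stands in for N*M
eps = 0.3    # normalized CFO, assumed perfectly estimated
symbols = [1, 1, -1, 1, -1, -1, 1, -1, 1, 1, 1, -1, -1, 1, -1, -1]

tx = dft(symbols, inverse=True)
# CFO: the l-th sample is rotated by exp(+j*2*pi*l*eps/size).
rx = [s * cmath.exp(2j * math.pi * l * eps / size) for l, s in enumerate(tx)]
# Compensation: multiply the l-th sample by exp(-j*2*pi*l*eps/size) before the DFT.
fixed = [r * cmath.exp(-2j * math.pi * l * eps / size) for l, r in enumerate(rx)]

err_before = max(abs(a - b) for a, b in zip(dft(rx), symbols))
err_after = max(abs(a - b) for a, b in zip(dft(fixed), symbols))
print(err_before, err_after)
```

Since the rotation depends on each user's own ε, the multiplication cannot be shared across users, which is the reason a separate DFT per user is needed.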
In this case, sophisticated MUD techniques and the feedback mechanism
are not needed at the receiver end. On the other hand, if the MAI is not
reduced, the CFO estimation for each user could be difficult and MUD tech-
niques are needed, which imposes a heavy computational burden on the re-
ceiver. For example, CFOs of other users will cause the MAI to any target
user in OFDMA systems and signal processing techniques are often used to
estimate the CFO of this target user [6, 88, 142]. Furthermore, since OFDMA
is not approximately MAI-free in the CFO environment, a
feedback mechanism is required after CFO estimation so that every user
can compensate his/her own CFO at the transmitter end [142].
In the following two examples, we would like to compare the CFO effect
on the proposed system, the OFDMA system, and two MC-CDMA systems
over a flat channel. Let the simulation parameters remain the same as that
in Example 5.8. Among the two MC-CDMA schemes, one is with subcarriers
uniformly allocated (called MC-CDMA/U for short) [48, 49] while the other
is with subcarriers successively allocated (called MC-CDMA/S for short) [1].
Every user in these four systems will transmit N symbols and the DFT/IDFT
size is the same, i.e., N M . Since all four systems transmit N symbols per block
and add the CP of the same length L, their actual data rates are the same.
We consider both fully-loaded and half-loaded situations. In a fully-loaded
situation, the Hadamard-Walsh code is used in the proposed PMU-OFDM
and the two MC-CDMA systems. For
OFDMA, each user occupies N subchannels which are maximally separated
[114], i.e., user u will be assigned subchannels indexed by (u − 1) + kM,
1 ≤ u ≤ M and 0 ≤ k ≤ N − 1. In a half-loaded case, M/2 symmetric
codewords of the Hadamard-Walsh code (code selection scheme) are used in the
proposed system and the two MC-CDMA systems. For OFDMA, the uth user
will be assigned subchannels indexed by 2(u − 1) + kM, 1 ≤ u ≤ M/2, and
0 ≤ k ≤ N − 1. The remaining NM/2 subchannels are used as guard bands.
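
The OFDMA subchannel allocation described above can be sketched in a few lines. The function name `ofdma_subchannels` and the `stride` parameter are hypothetical (stride 1 reproduces the fully-loaded rule, stride 2 the half-loaded rule), and the sizes N = 4, M = 16 are chosen only to keep the example small.

```python
def ofdma_subchannels(u, N, M, stride=1):
    """Subchannel indices stride*(u-1) + k*M, k = 0..N-1, for user u (1-based)."""
    n_users = M // stride
    if not 1 <= u <= n_users:
        raise ValueError("user index out of range for this loading")
    return [stride * (u - 1) + k * M for k in range(N)]

N, M = 4, 16  # assumed toy sizes

# Fully loaded: M users, every one of the N*M subchannels is used exactly once.
full = [ofdma_subchannels(u, N, M, stride=1) for u in range(1, M + 1)]

# Half loaded: M/2 users on even offsets; the other N*M/2 subchannels are guards.
half = [ofdma_subchannels(u, N, M, stride=2) for u in range(1, M // 2 + 1)]
used_half = sorted(i for user in half for i in user)
guards_half = sorted(set(range(N * M)) - set(used_half))
```

In both cases each user's N subchannels are M apart, which is what "maximally separated" means here.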

Example 5.10: Comparison of CFO Effect on the Proposed System, OFDMA and MC-CDMA
Let us first evaluate the MAI in the detection stage, i.e., MAI after frequency
equalization. In the absence of MAI and channel noise, the received sym-
bols for detection are still BPSK symbols with either +1 or −1. For the
proposed system and the OFDMA system, frequency equalization is used.
For the two MC-CDMA systems, the orthogonality restoring combining (ORC)
scheme is used to achieve equalization [49]. To distinguish it from the MAI
before equalization, we denote the MAI after equalization by $\widetilde{\mathrm{MAI}}_{j\leftarrow i}[k]$.
For instance, in the proposed system, $\widetilde{\mathrm{MAI}}_{j\leftarrow i}[k] = \mathrm{MAI}_{j\leftarrow i}[k]/\tilde{\lambda}_j[k]$. The
averaged total MAI after equalization is obtained by averaging the value
$\frac{1}{T}\sum_{j=1}^{T}\frac{1}{N}\sum_{k=0}^{N-1}\bigl|\sum_{i=1,\,i\neq j}^{T}\widetilde{\mathrm{MAI}}_{j\leftarrow i}[k]\bigr|^2$ over 500,000 symbols. Figure 5.20

[Figure: MAI power (dB) versus CFO ε for OFDMA, MC-CDMA/U, MC-CDMA/S and the proposed system, fully-loaded and half-loaded.]
Fig. 5.20. Example 5.10: The MAI comparison among the proposed system, the
OFDMA system, and two MC-CDMA systems in a flat fading environment [[135] © IEEE].

shows the averaged total MAI after equalization as a function of CFO for the
four systems in fully-loaded and half-loaded situations. Note that the
performance of the proposed system and that of MC-CDMA/S are the same for the
flat channel, so the two curves overlap. In the fully-loaded case,
MC-CDMA/U outperforms the other three systems. When the number of users
decreases from 16 to 8, the MAI of the proposed system is greatly reduced
by 15–16 dB and, consequently, the proposed system outperforms OFDMA by
10–11 dB. Recall that in Example 5.8, the dominating MAI of the proposed
system is larger than the residual MAI by around 10–11 dB in the fully-loaded
situation. Using code selection, the dominating MAI is reduced to an amount
that is even smaller than the residual MAI. This explains why the proposed
system performs similarly to OFDMA in a fully-loaded situation while
it outperforms OFDMA by 10–11 dB in a half-loaded situation with code
selection.
The self-CFO impairment of the proposed system with symmetric code-
words is shown in Fig. 5.21. To compute the self-CFO impairment after fre-
quency equalization, we accumulate the symbol distortion and the interference
for the kth symbol of a target user due to his/her own CFO. This procedure
is repeated and, then, the impairment power is averaged for k, 0 ≤ k ≤ N − 1.
The self-CFO impairments of the 8 users in the proposed system are indicated
by the 8 solid curves in Fig. 5.21. Since the self-CFO impairments of
individual users are similar, these curves overlap. The bold-circled curve in
Fig. 5.21 is the averaged value of the 8 solid curves. By comparing this figure
with Fig. 5.20, we see that the self-CFO impairment is the main impairment
in the proposed system. Since the MAI is relatively small, we can estimate
$\alpha_j$ or $\epsilon_j$ accurately and then compensate the self-CFO effect as discussed in
Section 5.4.4. It is worthwhile to point out that, although not shown here, the
self-CFO impairment of OFDMA is very similar to that given in this figure.

Fig. 5.21. Example 5.10: The self-CFO impairment of the proposed system [[135] © IEEE].

Now, assume that each user has a normalized CFO of $|\epsilon_j| = 0.1$ and can
accurately estimate the self-CFO in these four systems. We would like to find
out the bit error probability when there is no feedback. The BEPs for the four
systems are shown in Fig. 5.22. For a fully-loaded situation, the BEP curves
of the proposed system and MC-CDMA/S are overlapping. Moreover, for a
half-loaded situation, the BEP curves of the proposed system, MC-CDMA/S,
and MC-CDMA/U are overlapping. We see that the BEP performance has a
similar trend as that of the MAI performance in Fig. 5.20.
Example 5.11: The CFO Effect in a Multipath Environment
In this example, we examine the CFO effect when the channel exhibits
multipath frequency-selective fading. The number of paths is assumed
to be L = 4 and N = 64, while the other parameters remain the same as

[Figure: bit error probability versus Eb/N0 for OFDMA, MC-CDMA/U, MC-CDMA/S and the proposed system, fully-loaded and half-loaded.]
Fig. 5.22. Example 5.10: The BEP comparison among the proposed system,
OFDMA and two MC-CDMA systems in a flat channel with a normalized CFO
$|\epsilon_j| = 0.1$ [[135] © IEEE].

[Figure: MAI power (dB) versus CFO ε for OFDMA, MC-CDMA/U, MC-CDMA/S and the proposed system, fully-loaded and half-loaded.]
Fig. 5.23. Example 5.11: The MAI comparison among the proposed system, the
OFDMA system and two MC-CDMA systems in a multipath fading environment
[[135] © IEEE].

those given in Example 5.10. The MAI performance of the four systems is
shown in Fig. 5.23. In a fully-loaded situation with the CFO smaller than
0.05, OFDMA has less MAI than the proposed system because OFDMA is
completely MAI-free when frequency and time are well synchronized. How-
ever, as CFO grows, the proposed system slightly outperforms OFDMA sys-
tem. In a half-loaded situation, the proposed system outperforms OFDMA
by around 10 dB due to the use of the code selection. The low MAI value
of the proposed system with code selection is also beneficial to CFO
estimation.
Let us now turn to the comparison with MC-CDMA. The MAI performance curves
for the two MC-CDMA systems with ORC are not good, since ORC amplifies
the MAI power in subcarriers with serious fading [49]. In a multipath
environment, it is likely that the frequency-selective fading contains several
zeros, which cause huge MAI in the two MC-CDMA systems. However, since a
large MAI value may only lead to errors in certain bits, MAI may not
be a good performance measure for this case. Instead, the BEP may provide
a more valuable measure for fair comparison. As MC-CDMA systems with
MRC outperform those with ORC in a multipath environment, we will use
MRC for the two MC-CDMA systems in the following discussion.

[Figure: bit error rate versus Eb/N0 for OFDMA, MC-CDMA/U, MC-CDMA/S, the proposed system, and MC-CDMA/S without the proposed code selection.]
Fig. 5.24. Example 5.11: The BEP comparison among the proposed system,
OFDMA and two MC-CDMA systems with L = 4 and $|\epsilon_j| = 0.1$ in a half-loaded
situation [[135] © IEEE].

Figure 5.24 shows the BEP comparison among the four systems with
$|\epsilon_j| = 0.1$ in a half-loaded situation. We see that MC-CDMA/S with the
proposed code selection outperforms OFDMA. Note that if code selection
is not used in MC-CDMA/S and, for example, the first M/2 codewords of the M
Hadamard-Walsh codes are used instead, the performance, shown by the star curve, is
much worse than that with code selection. This result shows the advantage of
using code selection in certain conventional MC-CDMA systems. Furthermore,
although the half-loaded MC-CDMA/U has the worst performance in a multipath
environment, this scheme can achieve a better frequency diversity order
and can perform better when the number of users is small [48]. Finally, we
observe that the proposed system with code selection outperforms OFDMA
and the two MC-CDMA systems in a half-loaded situation with multipath
fading.

5.4.5 Code Priority in CFO Environment

In this section, we show that the performance can be further improved in a
CFO environment if a proper code priority is developed based on the proposed
code selection scheme [134]. It was demonstrated in the last section that
PMU-OFDM is robust to CFO if only M/2 symmetric or anti-symmetric codewords
of the M Hadamard-Walsh codes are used. That is, using this code scheme, the
dominating MAI due to CFO can be reduced to a negligible amount. Under
this situation, the system performance is determined by the residual MAI. Since
the use of different Hadamard-Walsh codewords causes different residual
MAI for individual users in a CFO environment, we can further improve
the performance using a code priority scheme. In this section, we extend the
result in the last section and propose a code priority scheme for PMU-OFDM
to improve the system performance in a CFO environment.
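Following the results in Section 5.4.2, it can be easily shown that

\[
E\left\{|A_{j\leftarrow i}[k]|^2\right\} = \frac{|\alpha_i|^2}{M^2}\,\sigma_{x_i}^2\sigma_{h_i}^2 \sum_{n=0}^{L-1}\left|\sum_{v=0}^{M-1} w_i[v]\,w_j^{*}[v]\,e^{-j\frac{2\pi}{NM}vn}\right|^2, \tag{5.100}
\]

\[
E\left\{\left|B^{(0)}_{j\leftarrow i}[k]\right|^2\right\} = \frac{|\beta_i|^2}{M^2}\,\sigma_{x_i}^2\sigma_{h_i}^2 \sum_{n=0}^{L-1}\left|\sum_{\substack{v=0,\,u=0\\ u\neq v}}^{M-1} \frac{e^{-j\pi\frac{u-v}{NM}}}{NM\sin\frac{\pi(u-v+\epsilon_i)}{NM}}\, w_i[v]\,w_j^{*}[v]\,e^{-j\frac{2\pi}{NM}un}\right|^2, \tag{5.101}
\]

and

\[
E\left\{\left|B^{(1)}_{j\leftarrow i}[k]\right|^2\right\} = \frac{|\beta_i|^2}{M^2}\,\sigma_{x_i}^2\sigma_{h_i}^2 \sum_{\substack{f=0\\ f\neq k}}^{N-1}\sum_{n=0}^{L-1}\left|\sum_{v=0,\,u=0}^{M-1} \frac{e^{-j\pi\frac{u-v-(k-f)M}{NM}}}{NM\sin\frac{\pi(u-v-(k-f)M+\epsilon_i)}{NM}}\, w_i[v]\,w_j^{*}[v]\,e^{-j\frac{2\pi}{NM}un}\right|^2. \tag{5.102}
\]
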
It was shown in the last section that, if only M/2 symmetric or anti-symmetric
codewords are used, $B^{(0)}_{j\leftarrow i}[k]$ can be reduced to be less than $B^{(1)}_{j\leftarrow i}[k]$. In this
situation, $B^{(1)}_{j\leftarrow i}[k]$ will determine the system performance. Let us consider an
example below.
Example 5.12: MAI Suppression in a CFO Environment
In this example, we evaluate $A_{j\leftarrow i}[k]$, $B^{(0)}_{j\leftarrow i}[k]$, and $B^{(1)}_{j\leftarrow i}[k]$. Let $\sigma_{x_i}^2 = \sigma_{h_i}^2 = 1$,
M = 16, and L = 4. First, let us consider the fully-loaded case.
Figure 5.25 shows the three MAI terms, i.e., $\frac{1}{T}\sum_{j=1}^{T}\sum_{i=1,\,i\neq j}^{T} E\{|A_{j\leftarrow i}[k]|^2\}$,
$\frac{1}{T}\sum_{j=1}^{T}\sum_{i=1,\,i\neq j}^{T} E\{|B^{(0)}_{j\leftarrow i}[k]|^2\}$, and $\frac{1}{T}\sum_{j=1}^{T}\sum_{i=1,\,i\neq j}^{T} E\{|B^{(1)}_{j\leftarrow i}[k]|^2\}$, as
functions of CFO for different N. We see that $\frac{1}{T}\sum_{j=1}^{T}\sum_{i=1,\,i\neq j}^{T} E\{|A_{j\leftarrow i}[k]|^2\}$
decreases at a rate of $O(N^{-2})$. This is not surprising since $A_{j\leftarrow i}[k] = \alpha_i \mathrm{MAI}^{(0)}_{j\leftarrow i}[k]$.
When N is sufficiently large, e.g., N = 64, $B^{(0)}_{j\leftarrow i}[k]$ is the
dominating MAI over $A_{j\leftarrow i}[k]$ and $B^{(1)}_{j\leftarrow i}[k]$ for the CFO range of our interest.

Fig. 5.25. The MAI terms are plotted as functions of CFO for the fully loaded case.

Next, let us consider the use of only M/2 symmetric codewords; the
performance curves are given in Fig. 5.26. We see that $\frac{1}{T}\sum_{j=1}^{T}\sum_{i=1,\,i\neq j}^{T} E\{|A_{j\leftarrow i}[k]|^2\}$
decreases at a rate of $O(N^{-4})$. Moreover, the dominating MAI $B^{(0)}_{j\leftarrow i}[k]$ is
greatly reduced and is even less than the residual MAI $B^{(1)}_{j\leftarrow i}[k]$. That is,
the system performance is now determined by the residual MAI. From the
figure, the residual MAI is not affected by N, so increasing N will no
longer improve the performance.

Fig. 5.26. The MAI terms are plotted as functions of CFO for the half-loaded case
with M/2 symmetric codes.

According to the discussion in the last section, it is clear that, to further
improve the performance, we need to reduce $B^{(1)}_{j\leftarrow i}[k]$. One way to reduce the
averaged total residual MAI, $\frac{1}{T}\sum_{j=1}^{T}\sum_{i=1,\,i\neq j}^{T} E\{|B^{(1)}_{j\leftarrow i}[k]|^2\}$, is to increase
M. An extra benefit of this method is that it also increases the number of
users allowed. Now, we are interested in whether there is another way to reduce
$B^{(1)}_{j\leftarrow i}[k]$. To answer this question, let us examine $B^{(1)}_{j\leftarrow i}[k]$. For convenience,
we repeat the result in the last section here. If the symmetric or anti-symmetric
codewords are used, $E\{|B^{(1)}_{j\leftarrow i}[k]|^2\}$ can be approximated by

\[
\frac{|\beta_i|^2}{M^2}\,\sigma_{\lambda_i}^2\sigma_{x_i}^2 \sum_{l=1}^{N-1}\left|\sum_{p=1}^{M-1}\left[f(p,l)+f(-p,l)\right]\sum_{q=0}^{M-1-p} w_i[q]\,w_j[p+q]\right|^2, \tag{5.103}
\]

where $f(p,l) = \dfrac{e^{-j\pi\frac{p}{NM}}}{NM\sin\frac{\pi(p+lM+\epsilon_i)}{NM}}$. It can be seen from Eq. (5.103) that the
residual MAI is determined by the cross-correlation of the assigned Hadamard-Walsh
codewords, i.e., $\sum_{q=0}^{M-1-p} w_i[q]\,w_j[p+q]$.

Lemma 5.5: Suppose only M/2 symmetric or anti-symmetric codewords of
M Hadamard-Walsh codes are used. We have the following property:

\[
\sum_{p=1}^{M-1}\sum_{q=0}^{M-1-p} w_i[q]\,w_j[p+q] = \sum_{p=1}^{M-1}\sum_{q=0}^{M-1-p} w_i[p+q]\,w_j[q] = 0, \quad i \neq j. \tag{5.104}
\]

Proof. As shown in [135], if M/2 symmetric or anti-symmetric codewords of
M Hadamard-Walsh codes are used, we have

\[
\sum_{q=0}^{M-1-p} w_i[q]\,w_j[p+q] = \sum_{q=0}^{M-1-p} w_i[p+q]\,w_j[q], \quad i \neq j. \tag{5.105}
\]

Consider the following equality

\[
\sum_{v=0}^{M-1}\sum_{\substack{u=0\\ u\neq v}}^{M-1} w_i[u]\,w_j[v] = \sum_{p=1}^{M-1}\sum_{q=0}^{M-1-p}\left\{w_i[q]\,w_j[p+q] + w_i[p+q]\,w_j[q]\right\}. \tag{5.106}
\]

For u = v, since

\[
\sum_{v=0}^{M-1} w_i[v]\,w_j[v] = \sum_{q=0}^{M-1-0}\left\{w_i[q]\,w_j[0+q] + w_i[0+q]\,w_j[q]\right\} = 0,
\]

we have

\[
\sum_{v=0}^{M-1}\sum_{u=0}^{M-1} w_i[u]\,w_j[v] = \sum_{p=1}^{M-1}\sum_{q=0}^{M-1-p}\left\{w_i[q]\,w_j[p+q] + w_i[p+q]\,w_j[q]\right\}. \tag{5.107}
\]

From Eq. (5.107), since $\sum_{v=0}^{M-1}\sum_{u=0}^{M-1} w_i[u]\,w_j[v] = \sum_{v=0}^{M-1} w_j[v]\sum_{u=0}^{M-1} w_i[u] = 0$, we know that

\[
\sum_{p=1}^{M-1}\sum_{q=0}^{M-1-p}\left\{w_i[q]\,w_j[p+q] + w_i[p+q]\,w_j[q]\right\} = 0. \tag{5.108}
\]

From Eqs. (5.105) and (5.108), Eq. (5.104) follows.
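
Lemma 5.5 can be checked numerically for a small case. The sketch below assumes that the Hadamard-Walsh codewords are the rows of a Sylvester-type (Kronecker-product) Hadamard matrix and that a symmetric codeword satisfies w[m] = w[M−1−m]; the helper names `hadamard` and `lagged_sum` are illustrative only.

```python
import numpy as np

def hadamard(M):
    """Sylvester construction of an M x M Hadamard matrix (M a power of two)."""
    H = np.array([[1]])
    while H.shape[0] < M:
        H = np.kron(np.array([[1, 1], [1, -1]]), H)
    return H

def lagged_sum(wi, wj):
    """Left-hand side of Eq. (5.104): sum over p = 1..M-1 of sum_q wi[q]*wj[p+q]."""
    M = len(wi)
    return sum(int(np.dot(wi[:M - p], wj[p:])) for p in range(1, M))

M = 16
H = hadamard(M)
symmetric = [w for w in H if np.array_equal(w, w[::-1])]
antisymmetric = [w for w in H if np.array_equal(w, -w[::-1])]

def worst_case(codewords):
    """Largest |lagged_sum| over all ordered pairs of distinct codewords."""
    return max(abs(lagged_sum(wi, wj))
               for a, wi in enumerate(codewords)
               for b, wj in enumerate(codewords) if a != b)

worst_symmetric = worst_case(symmetric)
worst_antisymmetric = worst_case(antisymmetric)
```

Both worst-case values come out as zero for M = 16, in agreement with Eq. (5.104).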
From Eq. (5.104), if symmetric or anti-symmetric codewords are used, the
value $\sum_{q=0}^{M-1-p} w_i[q]\,w_j[p+q]$ for two given codewords $w_i[m]$ and $w_j[m]$ must
take both positive and negative values so that the summation over all p is zero.
Since the maximum value of p is M − 1, if N is sufficiently large, $e^{-j\pi\frac{p}{NM}} \approx 1$.
In this case,
$f(p,l) + f(-p,l) \approx \dfrac{1}{NM\sin\frac{\pi(p+lM+\epsilon_i)}{NM}} + \dfrac{1}{NM\sin\frac{\pi(-p+lM+\epsilon_i)}{NM}}$, which
is a monotonically decreasing function of p. Thus, referring to Eq. (5.103),
a proper code priority is able to reduce the residual MAI when not all M/2
users are active. Intuitively, a "good" code priority here should let
$\sum_{q=0}^{M-1-p} w_i[q]\,w_j[p+q]$ have the most zero values, or the fewest successive
positive or negative values, so that the absolute-and-squared term in Eq. (5.103)
can be efficiently cancelled out after summation over all p. For instance, let
M = 8 and suppose symmetric codewords are used. Consider the code priority that
assigns the first two users the codewords

(+1 +1 +1 +1 +1 +1 +1 +1) and (+1 +1 −1 −1 −1 −1 +1 +1).

Then, $\sum_{q=0}^{M-1-p} w_i[q]\,w_j[p+q]$ is (−1 −2 −1 0 +1 +2 +1) for 1 ≤ p ≤ M − 1.
The alternative code priority is to assign the first two users the codewords

(+1 −1 −1 +1 +1 −1 −1 +1) and (+1 −1 +1 −1 −1 +1 −1 +1).

Then, $\sum_{q=0}^{M-1-p} w_i[q]\,w_j[p+q]$ is (−1 +2 −1 0 +1 −2 +1) for 1 ≤ p ≤ M − 1.
Obviously, the latter code priority can more effectively cancel out the residual
MAI than the former one.
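
The two M = 8 cross-correlation sequences quoted above can be reproduced with a short script. The function names are hypothetical, and the "longest run of equal signs" is only one informal way to quantify the "fewest successive positive or negative values" criterion; it is not a metric defined in the text.

```python
def lagged_correlation(wi, wj):
    """Return sum_{q=0}^{M-1-p} wi[q]*wj[p+q] for p = 1..M-1."""
    M = len(wi)
    return [sum(wi[q] * wj[p + q] for q in range(M - p)) for p in range(1, M)]

def longest_same_sign_run(values):
    """Length of the longest run of consecutive values sharing the same nonzero sign."""
    best = run = prev = 0
    for v in values:
        sign = (v > 0) - (v < 0)
        run = run + 1 if (sign != 0 and sign == prev) else (1 if sign != 0 else 0)
        prev = sign
        best = max(best, run)
    return best

first_pair = ([1] * 8, [1, 1, -1, -1, -1, -1, 1, 1])
second_pair = ([1, -1, -1, 1, 1, -1, -1, 1], [1, -1, 1, -1, -1, 1, -1, 1])

corr_first = lagged_correlation(*first_pair)
corr_second = lagged_correlation(*second_pair)
```

The first pair gives runs of three equal signs, while the second pair alternates in sign, which is why its terms cancel more effectively against the slowly varying weights f(p, l) + f(−p, l).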
According to [8], each codeword of the M Hadamard-Walsh codes has a
unique zero-crossing number as its identifier. Moreover, symmetric codewords
have an even number of zero-crossings while anti-symmetric codewords have an
odd number of zero-crossings. From Eqs. (5.103) and (5.104) and the above discussion,
we have the following proposition.
Proposition 5.4: Suppose only M/2 symmetric or anti-symmetric codewords
of M Hadamard-Walsh codes are used. The maximum value of $E\{|B^{(1)}_{j\leftarrow i}[k]|^2\}$
can be approximated by

\[
\max_{i,j} E\left\{\left|B^{(1)}_{j\leftarrow i}[k]\right|^2\right\} = \frac{|\beta_i|^2}{M^2}\,\sigma_{\lambda_i}^2\sigma_{x_i}^2 \sum_{l=1}^{N-1}\left|\sum_{p=1}^{M-1}\left[f(p,l)+f(-p,l)\right] r(p)\right|^2, \tag{5.109}
\]

where

\[
r(p) = \begin{cases} -p, & 1 \le p \le M/4, \\ p - 8, & M/4 + 1 \le p \le 3M/4, \\ -p + 16, & 3M/4 + 1 \le p \le M - 1. \end{cases} \tag{5.110}
\]

For symmetric codes, the maximum value occurs when the two codewords,
which have zero and two crossings, are used. For anti-symmetric codewords,
the maximum value occurs when the two codewords, which have one and three
crossings, are used.
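
As a sanity check on Eq. (5.110), the sketch below counts zero-crossings for M = 16 and compares r(p) with the lagged cross-correlation of the codewords having zero and two crossings. It assumes a Sylvester-type Hadamard matrix as the code set; the constants 8 and 16 in `r` are taken from Eq. (5.110) as printed, which corresponds to M = 16.

```python
import numpy as np

def hadamard(M):
    """Sylvester construction of an M x M Hadamard matrix (M a power of two)."""
    H = np.array([[1]])
    while H.shape[0] < M:
        H = np.kron(np.array([[1, 1], [1, -1]]), H)
    return H

def zero_crossings(w):
    """Number of sign changes between adjacent chips of a codeword."""
    return int(np.sum(w[:-1] != w[1:]))

def r(p, M=16):
    """Piecewise function of Eq. (5.110), as printed for M = 16."""
    if p <= M // 4:
        return -p
    if p <= 3 * M // 4:
        return p - 8
    return -p + 16

M = 16
H = hadamard(M)
by_crossings = {zero_crossings(w): w for w in H}   # crossing count -> codeword

w0, w2 = by_crossings[0], by_crossings[2]
corr = [int(np.dot(w0[:M - p], w2[p:])) for p in range(1, M)]
```

Under this assumption every codeword has a distinct crossing count, the codewords with even counts are exactly the symmetric ones, and `corr` matches r(p) for 1 ≤ p ≤ 15.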
Example 5.13: MAI Suppression with Code Priority
Let the parameter setting remain the same as in Example 5.12 and the
normalized CFO level be $\epsilon_j = 0.2$. Let $\phi_{i,j}(T) = \sum_{i=1,\,i\neq j}^{T} E\{|B^{(1)}_{j\leftarrow i}[k]|^2\}$.
The relationship between the number of zero-crossings and the codeword
indices (in Kronecker ordering [8]) for symmetric codewords and anti-symmetric
codewords is tabulated in Table 5.1(a) and (b), respectively. We see that,
when M increases from 16 to 32, the performance of codewords with the same
number of crossings does not degrade significantly. The reason is that the newly added
codewords have more crossings and tend to cause smaller MAI to other
codewords. Take the symmetric codeword with 10 crossings for instance: this codeword
degrades the most, i.e., by 2.9 dB. However, since increasing M adds
codewords with more crossings, which tend to cause much less
MAI, the overall performance is actually improved.

Table 5.1. The MAI of assigned codewords indexed by the zero-crossing numbers
for (a) M/2 symmetric codewords, (b) M/2 anti-symmetric codewords, for M = 16
(column 3) and M = 32 (column 5) [[134] © IEEE].

(a) Symmetric codewords

No. of zero   Code index   φ_j(8)   Code index   φ_j(16)
crossings     (M = 16)     (dB)     (M = 32)     (dB)
 0             1           −15.6     1           −14.6
 2            13           −16.8    25           −15.8
 4             7           −21      13           −19.4
 6            11           −19.6    21           −18.3
 8             4           −26.4     7           −23.8
10            16           −27.3    31           −24.4
12             6           −24.6    11           −22.4
14            10           −23.4    19           −21.5
16            n/a          n/a       4           −29.3
18            n/a          n/a      28           −29.6
20            n/a          n/a      16           −30.2
22            n/a          n/a      24           −30.1
24            n/a          n/a       6           −27.5
26            n/a          n/a      30           −27.9
28            n/a          n/a      10           −26.3
30            n/a          n/a      18           −25.7

(b) Anti-symmetric codewords

No. of zero   Code index   φ_j(8)   Code index   φ_j(16)
crossings     (M = 16)     (dB)     (M = 32)     (dB)
 1             9           −15.8    17           −14.9
 3             5           −16.5     9           −15.5
 5            15           −21.7    29           −19.9
 7             3           −19.2     5           −17.9
 9            12           −26.7    23           −24.0
11             8           −27.2    15           −24.3
13            14           −25.1    27           −22.7
15             2           −23.0     3           −21.2
17            n/a          n/a      20           −29.4
19            n/a          n/a      12           −29.6
21            n/a          n/a      32           −30.2
23            n/a          n/a       8           −30.1
25            n/a          n/a      22           −27.6
27            n/a          n/a      14           −27.9
29            n/a          n/a      26           −26.6
31            n/a          n/a       2           −25.4

Let us now examine the relationship between $E\{|B^{(1)}_{j\leftarrow i}[k]|^2\}$ and the number of crossings.
This is shown in Table 5.2. We see that $E\{|B^{(1)}_{j\leftarrow i}[k]|^2\}$ is symmetric in the
sense that $E\{|B^{(1)}_{j\leftarrow i}[k]|^2\} = E\{|B^{(1)}_{i\leftarrow j}[k]|^2\}$. Note that if we sum up a
whole row or column, we get $\phi_{i,j}(T)$ in the third column of Table 5.1(a).
We see from Table 5.2 that, for symmetric codewords, the pair with 0 and 2
crossings leads to the maximum total residual MAI. This result is a direct
consequence of Proposition 5.4.
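
The row-sum remark can be checked against the published numbers. The snippet below copies the first two rows of Table 5.2 and adds the entries in linear scale; the only assumption is that "sum" means a power sum before converting back to dB. The variable names are illustrative.

```python
import math

# Rows of Table 5.2 (dB) for the symmetric codewords with 0 and 2 crossings.
row_db = {
    0: [-18.5, -24.5, -22.4, -31.5, -33.2, -28.9, -27.1],
    2: [-18.5, -27.3, -26.1, -33.2, -34.0, -31.2, -30.2],
}

# Third column of Table 5.1(a) for the same two codewords (dB).
table_5_1_db = {0: -15.6, 2: -16.8}

def total_db(levels_db):
    """Add powers given in dB and return the total in dB."""
    return 10 * math.log10(sum(10 ** (x / 10) for x in levels_db))

totals = {c: total_db(levels) for c, levels in row_db.items()}
```

The totals agree with Table 5.1(a) to within the one-decimal rounding of the tabulated values.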
Moreover, we see that different codewords cause different levels of
MAI to other users. Thus, as shown in Table 5.1, different users receive a
different total MAI level. From Table 5.1, we observe that, as M increases, the
new codewords, i.e., 8–14 crossings for symmetric codes and 9–15 crossings for
anti-symmetric codes, cause less MAI than the previous codewords, i.e.,
0–6 crossings for symmetric codes and 1–7 crossings for anti-symmetric codes.
Hence, for an M/2-user system with M ≥ 8, the M/4 codewords with more
crossings should be assigned to the first M/4 active users. The M/4 codewords

Table 5.2. $E\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\}$ (dB) listed as a function of codewords in terms
of crossing numbers [[134] © IEEE].

Crossings      0       2       4       6       8      10      12      14
 0            n/a    −18.5   −24.5   −22.4   −31.5   −33.2   −28.9   −27.1
 2           −18.5    n/a    −27.3   −26.1   −33.2   −34.0   −31.2   −30.2
 4           −24.5   −27.3    n/a    −29.6   −35.2   −35.7   −33.8   −33.2
 6           −22.4   −26.1   −29.6    n/a    −34.8   −35.4   −33.2   −32.4
 8           −31.5   −33.2   −35.2   −34.8    n/a    −39.2   −38.2   −37.7
10           −33.2   −34.0   −35.7   −35.4   −39.2    n/a    −38.3   −38.1
12           −28.9   −31.2   −33.8   −33.2   −38.0   −38.3    n/a    −36.3
14           −27.1   −30.2   −33.2   −32.4   −37.7   −38.1   −36.3    n/a

with fewer crossings are further divided into two sets of M/8 codewords each;
the M/8 codewords with more crossings have higher priority to be assigned
to users when the number of active users exceeds M/4. This procedure
continues until the divided set has only one codeword.
Take the M = 32 case for instance. As shown in Table 5.1, codewords
can be divided into the following five sets.
1. The first code set with 0–1 crossing (i.e., 0 for symmetric codes, and 1 for
anti-symmetric codes);
2. the second code set with 2–3 crossings;
3. the third code set with 4–7 crossings;
4. the fourth code set with 8–15 crossings; and
5. the fifth code set with 16–31 crossings.
When the index of the code set is higher, the caused MAI is smaller. Thus, we
should assign codewords from a higher indexed set to connected users with a
higher priority so that the overall performance is better than with a
random code assignment. Note that the above scheme is only a coarse code
priority. That is, within the same code set, codewords with more crossings do
not necessarily have a higher priority than codewords with fewer crossings.
Take the fifth code set for instance: the codeword with 20 crossings causes
less MAI than the codeword with 30 crossings.
A fine code priority can be obtained off-line using the closed form of
$E\{|B^{(1)}_{j\leftarrow i}[k]|^2\}$ in Eq. (5.102) to construct Tables 5.1 and 5.2. Based on
these two tables, we can easily determine the fine code priority. For instance,
when M = 16 for symmetric codes, the fine code priority is

\[
\text{Fine Code Priority}: \; w_{16},\, w_4,\, w_6,\, w_{10},\, w_7,\, w_{11},\, w_{13},\, w_1. \tag{5.111}
\]

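
One plausible reading of how Eq. (5.111) follows from Table 5.1(a) is a simple sort of the M = 16 symmetric codewords by their total received MAI, lowest first. The sketch below applies that reading to the tabulated values; it is an interpretation for illustration, not a procedure stated explicitly in the text.

```python
# Total MAI phi_j(8) in dB for the M = 16 symmetric codewords,
# keyed by code index (third column of Table 5.1(a)).
phi_db = {
    1: -15.6, 13: -16.8, 7: -21.0, 11: -19.6,
    4: -26.4, 16: -27.3, 6: -24.6, 10: -23.4,
}

# Codewords with the lowest total MAI get the highest priority.
fine_priority = sorted(phi_db, key=phi_db.get)
```

With these values the sort returns the indices 16, 4, 6, 10, 7, 11, 13, 1, i.e., the order given in Eq. (5.111).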
Let M = 16. Figure 5.27 shows $\sum_{i=1,\,i\neq j}^{T} E\{|B^{(1)}_{j\leftarrow i}[k]|^2\}$, 1 ≤ j ≤ 8, as
a function of CFO for symmetric codes. We see from this figure that the
performance rank of different codewords is independent of CFO. In other
words, the fine code priority is not changed by the CFO value. Hence, we only
need to determine the fine code priority once, and it then works for different
CFO environments.

[Figure: MAI power (dB) versus CFO ε, one curve per symmetric codeword with 0, 2, 4, 6, 8, 10, 12 and 14 zero-crossings.]
Fig. 5.27. $\frac{1}{T}\sum_{j=1}^{T}\sum_{i=1,\,i\neq j}^{T} E\{|B^{(1)}_{j\leftarrow i}[k]|^2\}$ as a function of CFO for different
codewords [[134] © IEEE].

Let M = 16 and N = 64. We consider a quarterly loaded system with
symmetric codes. That is, only codewords with 14, 12, 10, and 8 crossings
are used. Figure 5.28 shows the individual MAI terms as a function of
CFO for N = 64. Note that the 4 curves $\sum_{i=1,\,i\neq j}^{T} E\{|A_{j\leftarrow i}[k]|^2\}$ overlap
with $\frac{1}{T}\sum_{j=1}^{T}\sum_{i=1,\,i\neq j}^{T} E\{|A_{j\leftarrow i}[k]|^2\}$. Comparing the result
with Fig. 5.26, we see that the two MAI terms, $\frac{1}{T}\sum_{j=1}^{T}\sum_{i=1,\,i\neq j}^{T} E\{|B^{(0)}_{j\leftarrow i}[k]|^2\}$ and
$\frac{1}{T}\sum_{j=1}^{T}\sum_{i=1,\,i\neq j}^{T} E\{|B^{(1)}_{j\leftarrow i}[k]|^2\}$, have been reduced simultaneously so that
increasing N from 16 to 64 can further improve the system performance
in a CFO environment. This result is different from the half-loaded case in
Example 5.12.

Fig. 5.28. MAI terms are plotted as functions of CFO for the quarterly loaded case
with code priority.

In an environment with serious CFO mismatch, the system designer may
consider this quarterly loaded scheme so that every user is nearly MAI-free
even in such a bad CFO environment. One scenario is as follows. When the user
speed is high enough to cause a large CFO, it may be impractical to use a
feedback mechanism to compensate the CFO at the transmitter side [88, 142].
Hence, it is desirable to compensate the CFO effect at the receiver
end while keeping the MAI too small to cause significant performance
degradation [135]. In this situation, the use of a quarterly loaded PMU-OFDM
system is a good trade-off since it makes the system more robust to the
CFO effect.

Example 5.14: MAI Suppression in Quarterly Loaded Systems
We compare the performance of quarterly loaded PMU-OFDM and quarterly
loaded OFDMA systems in this example. Let the parameter setting remain
the same as in Example 5.13. In a quarterly loaded case, the four codewords
with 14, 12, 10, and 8 crossings are used in the PMU-OFDM system. For
OFDMA, the uth user will be assigned subchannels indexed by 4(u − 1) + kM,
1 ≤ u ≤ M/4, and 0 ≤ k ≤ N − 1. The remaining 3NM/4 subchannels are
used as guard bands.
We first evaluate the MAI in the detection stage, i.e., MAI after frequency
equalization. Figure 5.29 shows the averaged total MAI after equalization
as a function of CFO for the two systems in the half- and quarterly loaded
situations. We see that the half-loaded PMU-OFDM even outperforms the
quarterly loaded OFDMA when the CFO is less than 0.375. Consequently, the
quarterly loaded PMU-OFDM outperforms quarterly loaded OFDMA by around
8–20 dB. We also see that if the code priority scheme is used in the quarterly
loaded PMU-OFDM, the system is more robust to the CFO effect. For
instance, at CFO=0.15, the quarterly loaded PMU-OFDM with code priority
can outperform the half-loaded PMU-OFDM by 14 dB.

[Figure: MAI power (dB) versus CFO ε for OFDMA and PMU-OFDM, half-loaded and quarterly loaded.]
Fig. 5.29. MAI comparison between PMU-OFDM and OFDMA in half- and
quarterly-loaded cases [[134] © IEEE].

Now, consider the bit error probability performance of both systems. Let
the CFO value be 0.2, which can be considered a serious CFO
environment. For instance, in a wireless broadband application [66, 88], let the
sampling frequency be 4 MHz and the carrier frequency be 4 GHz. Since there
are NM = 1024 subchannels, for CFO=0.2, it can be shown that the user
speed is around 211 km/h [108]. Note that it has also been verified that the
channel coherence time is around 542 μs (see Eq. (5.40.c) in [108]), which is
still greater than twice the OFDM-block duration of 256 μs. Hence,
the assumption that the channel is quasi-static may still be valid in this
situation [108].
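
The numbers in this paragraph can be reproduced with a few lines of arithmetic. The sketch assumes that the CFO is caused entirely by the maximum Doppler shift, that the speed of light is 3 × 10^8 m/s, that the cyclic prefix is ignored in the block duration, and that the coherence-time rule of thumb is $T_c \approx \sqrt{9/(16\pi)}/f_m \approx 0.423/f_m$; the last item is an assumption about what Eq. (5.40.c) of [108] states.

```python
import math

fs = 4e6          # sampling frequency (Hz)
fc = 4e9          # carrier frequency (Hz)
n_sub = 1024      # N*M subchannels
eps = 0.2         # normalized CFO
c = 3e8           # speed of light (m/s), rounded

subcarrier_spacing = fs / n_sub               # 3906.25 Hz
f_offset = eps * subcarrier_spacing           # 781.25 Hz, treated as Doppler shift
speed_kmh = f_offset * c / fc * 3.6           # user speed in km/h

block_duration = n_sub / fs                   # one OFDM block without CP (s)
coherence_time = math.sqrt(9 / (16 * math.pi)) / f_offset   # assumed rule of thumb (s)
```

Under these assumptions the speed is about 211 km/h, the block lasts 256 μs, and the coherence time is roughly 0.54 ms, slightly more than two block durations.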
[Figure: bit error probability versus SNR (Eb/N0) for OFDMA and PMU-OFDM, half-loaded and quarterly loaded.]
Fig. 5.30. The BEP comparison between PMU-OFDM and OFDMA in half-loaded
and quarterly loaded cases ($|\epsilon_j| = 0.2$) [[134] © IEEE].

Figure 5.30 shows the BEP as a function of signal-to-noise ratio (SNR).
Comparing it with Fig. 5.29, we see that the BEP performance in Fig. 5.30
can be roughly predicted from the MAI curves in Fig. 5.29. Take half-loaded
PMU-OFDM for instance: the performance floor occurs when SNR is around
15 dB. This is reasonable since we see from Fig. 5.29 that the MAI power is
around −18 dB at CFO=0.2. When SNR=18 dB for BPSK symbols, which
have 0 dB power, the noise power is around −18 dB. Hence, if we
include the MAI of −18 dB, the total power of noise plus interference is
−15 dB. Hence, we see from Fig. 5.30 that the BEP of the half-loaded
PMU-OFDM at SNR=18 dB is roughly the same as that of the quarterly loaded
PMU-OFDM at SNR=15 dB, where its MAI power is around −31 dB and
is sufficiently smaller than the noise power of −15 dB. As SNR increases, the
noise will become less than −18 dB. However, the MAI remains −18 dB for
half-loaded PMU-OFDM if CFO is fixed at 0.2. In this case, the performance
floor will occur at SNR=18 dB. The same evaluation applies to the OFDMA
system. Furthermore, we see that quarterly loaded PMU-OFDM with code
priority makes the system more robust to this serious CFO environment. In
fact, due to the use of symmetric or anti-symmetric codes, half-loaded
PMU-OFDM outperforms quarterly loaded OFDMA.
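
The noise-plus-interference arithmetic used in this example is a power sum in linear scale. A minimal sketch, assuming the noise and the MAI are uncorrelated so that their powers add (the helper name `db_sum` is illustrative):

```python
import math

def db_sum(*levels_db):
    """Add powers given in dB (relative to the 0 dB symbol power), return dB."""
    return 10 * math.log10(sum(10 ** (x / 10) for x in levels_db))

# Half-loaded PMU-OFDM at SNR = 18 dB: noise -18 dB plus MAI -18 dB.
half_loaded = db_sum(-18, -18)

# Quarterly loaded PMU-OFDM at SNR = 15 dB: noise -15 dB plus MAI -31 dB.
quarter_loaded = db_sum(-15, -31)
```

Both totals land close to −15 dB, which is why the two operating points give roughly the same BEP.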
Example 5.15: Codeword Assignment with Code Priority
Let the parameter setting remain the same as that in Example 5.14. First, we
evaluate the performance of different code priority schemes. The proposed fine
code priority is given in Eq. (5.111), which is the code priority that results in
the minimum $E\{|B^{(1)}_{j\leftarrow i}[k]|^2\}$ according to Tables 5.1 and 5.2. Another code
priority used as a benchmark is given by
Priority Scheme for Comparison: $w_1, w_{13}, w_7, w_{11}, w_4, w_{16}, w_6, w_{10}$,
which assigns codewords in order of crossings from the fewest to the most, i.e., from 0
to 14.
Figure 5.31 shows the MAI as a function of the number of active users T
for these two schemes. The dashed curves are for CFO=0.05 while the solid
curves are for CFO=0.2. We see that the proposed code priority greatly
outperforms the benchmark code priority. The benchmark scheme assigns the
first two users the codewords having 0 and 2 crossings, which cause
the most serious MAI in a CFO environment. Hence, we see that, as T
increases, the performance of the benchmark priority becomes slightly better.
Moreover, we see that the benchmark priority with CFO=0.05 performs even
worse than the proposed priority with CFO=0.2 when T < 5. These results
highlight the fact that a good code priority design can further enhance the
system performance while a bad design may leave the performance nearly
unchanged even if the number of users is decreased to 2.
In future communication systems, it is desirable that systems are robust
to different CFO levels. For instance, some users move at a high speed
so that they have large CFOs while some users move at a low speed so that
they have small CFOs. In this case, code priority can be used in the following
way. That is, we assign codewords of higher priorities to users who have larger
CFOs and codewords with lower priorities to users who have smaller CFOs.
Since the MAI effect is symmetric in the sense that $E\{|\mathrm{MAI}_{j\leftarrow i}[k]|^2\} = E\{|\mathrm{MAI}_{i\leftarrow j}[k]|^2\}$,
if user i has the most serious CFO while it is assigned the
highest priority codeword, this codeword is most robust to CFO and it only
causes a small MAI to other users.
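
The assignment rule described above amounts to sorting users by CFO magnitude and handing out codewords in priority order. The sketch below uses the M = 16 fine code priority of Eq. (5.111); the function name `assign_by_cfo` and the per-user CFO values are hypothetical inputs chosen for illustration.

```python
# Fine code priority for M = 16 symmetric codewords, Eq. (5.111), highest first.
FINE_PRIORITY = [16, 4, 6, 10, 7, 11, 13, 1]

def assign_by_cfo(cfos):
    """Give the highest-priority codeword to the user with the largest |CFO|."""
    if len(cfos) > len(FINE_PRIORITY):
        raise ValueError("more users than available codewords")
    order = sorted(range(len(cfos)), key=lambda u: -abs(cfos[u]))  # stable sort
    assignment = [None] * len(cfos)
    for rank, user in enumerate(order):
        assignment[user] = FINE_PRIORITY[rank]
    return assignment

# Hypothetical normalized CFOs of eight users (half-loaded system).
cfos = [0.02, 0.02, 0.03, 0.03, 0.05, 0.05, 0.1, 0.1]
codeword_index = assign_by_cfo(cfos)
```

Users with equal CFOs keep their original relative order, so ties are resolved deterministically.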

[Figure: MAI power (dB) versus the number of users T, for the proposed code priority and for the priority assigning fewer crossings first, at CFO = 0.05 and CFO = 0.2.]
Fig. 5.31. MAI is plotted as a function of the number of active users for different
code priority schemes [[134] © IEEE].

Example 5.16: Codeword Priority for Different CFOs


In this example, we are interested in seeing what happens if we assign higher
priority codewords to users who have larger CFOs and the lower priority
codewords to users who have lower CFOs in the PMU-OFDM system. This
code assignment is intuitive since higher priority codewords are more robust to
CFO and they will not cause significant MAI to other users even in a serious
CFO environment, e.g., please see Table 5.2. The lower priority codewords
tend to cause great MAI to other users. However, if they are assigned to users
with lower CFOs, the caused MAI is not as significant due to small CFOs.
Consider a half-loaded system with four different CFO levels: 0.02, 0.03,
0.05, and 0.1. For PMU-OFDM, w1 and w13 are with CFO=0.02, w11
and w7 are with CFO=0.03, w10 and w6 are with CFO=0.05, and w4 and
w16 are with CFO=0.1. For OFDMA, the first and second users, i.e., u = 1
and 2 in Example 5.14, are with CFO=0.02, the third and the fourth users
are with CFO=0.03, the fifth and the sixth users are with CFO=0.05, and the
seventh and the eighth users are with CFO=0.1. Figure 5.32 shows the BEP
curves of these two systems. We see that the use of code priority for different
CFO levels enables the PMU-OFDM to continue to outperform OFDMA.
This result shows a great advantage of PMU-OFDM with code priority in a
practical situation.

Fig. 5.32. The BEP comparison between PMU-OFDM and OFDMA in an
environment with various CFOs [[134] © IEEE]. (Axes: bit error probability
versus SNR, $E_b/N_0$; curves: OFDMA and PMU-OFDM.)

5.5 PMU-OFDM System in Time-Varying Channel Environment

In this section, we investigate the performance of PMU-OFDM in the
presence of time-varying channels and compare it with that of an OFDMA
system in the same environment.

5.5.1 Time-Varying Rayleigh Fading Channel Model


The discrete-time impulse response of a time-variant channel, h(τ; n), is defined
as the response of the channel at time n to an impulse applied at time n − τ.
It can be modeled as [60]

\[ h(\tau; n) = \sum_{d=0}^{L-1} \beta(d)\, e^{-j2\pi\phi_d(n)}\, \delta(\tau - d), \tag{5.112} \]

where β(d) is the Rayleigh-distributed real envelope of the dth channel gain and
$\phi_d(n)$ is the uniformly distributed phase of the dth multipath component.
The time-varying nature of the channel can be mathematically
modeled by treating h(τ; n) as a wide-sense-stationary (WSS) random process
in n with autocorrelation

R(τ1 , τ2 , α) = E{h∗ (τ1 , n)h(τ2 , n + α)}. (5.113)

For most multipath channels, the attenuation and the phase shift associated
with different delays are assumed uncorrelated. This uncorrelated scattering
(US) leads to the following autocorrelation function [60, 105]

R(τ1 , τ2 , α) = R(τ1 , α)δ(τ1 − τ2 ). (5.114)

The above equation describes the autocorrelation function for the multipath
channel under the wide-sense-stationary and uncorrelated scattering
assumptions. It is often known as the WSSUS model for the multipath fading
channel. By applying the Fourier transform to the autocorrelation function
$R(\tau, \alpha) = E\{h^*(\tau, n)h(\tau, n + \alpha)\}$, we can obtain the scattering function of the
channel as

\[ S(\tau, f) = \mathcal{F}\{R(\tau, \alpha)\} = \int_{-\infty}^{\infty} R(\tau, \alpha)\, e^{-j2\pi f\alpha}\, d\alpha. \tag{5.115} \]

Based on the above scattering function, we can derive the multipath intensity
profile (or the power-delay profile), which is defined as

p(τ ) = R(τ, 0) = E{|h(τ, n)|2 }. (5.116)

It is the average received power expressed as a function of delay. The two most
common power-delay profiles are uniform and exponential. Every multipath
component has the same power in the uniform power-delay profile whereas
multipath components decay exponentially with delay in the exponential pro-
file.
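
The two profiles can be sketched numerically as follows. This is only an illustration: the function names, the unit-total-power normalization, and the decay constant are assumptions made here and are not taken from the text.

```python
import math

def uniform_pdp(num_taps):
    """Uniform profile: every multipath component has the same average power."""
    return [1.0 / num_taps] * num_taps

def exponential_pdp(num_taps, decay):
    """Exponential profile: tap power falls as exp(-d / decay).

    The profile is normalized to unit total power (an assumption of this sketch).
    """
    raw = [math.exp(-d / decay) for d in range(num_taps)]
    total = sum(raw)
    return [p / total for p in raw]

L = 4  # number of taps, matching the L used in later examples
print("uniform    :", uniform_pdp(L))
print("exponential:", exponential_pdp(L, decay=1.5))
```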
Another useful function to characterize the time-varying nature of the
channel is the Doppler power spectrum, which can be derived from the
scattering function as

\[ S_D(f) = \int_{-\infty}^{\infty} S(\tau, f)\, d\tau. \tag{5.117} \]

The Doppler spread, $f_D$, provides a measure of the spectral broadening caused
by the change of the mobile channel over time. It is defined as the range of
frequencies over which the received Doppler spectrum is essentially non-zero.
When a sinusoidal tone of frequency $f_c$ is transmitted over a time-variant
channel, the Doppler spectrum of the received signal will have components
in the range from $f_c - f_d$ to $f_c + f_d$, where $f_d$ is called the Doppler shift.
It is a function of the relative velocity of the mobile terminal and the angle
θ between the motion direction of the mobile terminal and the direction of
arrival of scattered waves. Mathematically, $f_d$ can be written as

\[ f_d = \frac{V}{c} f_c \cos\theta, \tag{5.118} \]

where $f_c$ is the carrier frequency, V is the velocity of the mobile terminal, and
c is the speed of light.
To model the Doppler phenomenon, it is often assumed that there are
many multipath components, each of which has a different delay but the same
Doppler spectrum (Fig. 5.33). Each multipath component is actually made up
of a large number of simultaneously arriving unresolvable multipath compo-
nents, having uniformly distributed angles of arrival at the receive antenna.
This channel model was suggested by Jakes. The classical Jakes Doppler
spectrum has the form [60]

\[ S_D(f) = \frac{1}{\sqrt{1 - (f/f_D)^2}}, \quad -f_D \le f \le f_D, \tag{5.119} \]

where $f_D = f_c V / c$ is the maximum Doppler shift.


Fig. 5.33. Complex baseband equivalent model of Jakes Doppler spectrum.

The time-varying channel impulse response of Eq. (5.112) can be written
as

\[ h(n; \tau) = \sum_{d=0}^{L-1} g(n; d)\, \delta(\tau - d), \tag{5.120} \]

where $g(n; d) = \beta_d e^{-j2\pi\Phi_d(n)}$ is a complex Gaussian random variable with
zero mean and variance $\sigma_d^2$. Then, we have

\[ E[g(n; d)\, g^*(m; d')] = R(d, n - m)\, \delta(d - d'). \tag{5.121} \]

It turns out that we can factor the autocorrelation function as the product of
two one-variable functions of delay and time [105]

\[ R(d, n - m) = \phi(n - m)\, \sigma_d^2, \]

where $\sigma_d^2$ is the variance of the dth tap and $\phi(n - m)$ is the time-autocorrelation
function. The Jakes model for the time-autocorrelation function
[60] can be written as

\[ \phi(n - m) = J_0(2\pi f_D T_s (n - m)), \tag{5.122} \]


where $J_0(x)$ is the zeroth order Bessel function of the first kind, $f_D$ is the
maximum Doppler frequency, and $T_s$ is the sampling period.
In Eq. (5.112), $\phi_d(n)$ is the Doppler frequency for channel path d. It is
the only time-varying parameter in this channel model. The Doppler spread
effect is caused by the difference in Doppler frequencies $\phi_d(n)$ for different
channel paths. On the other hand, if $\phi_d(n)$ is the same for all channel paths,
we have only the Doppler shift effect. Then, the channel impulse response can
be written as

\[ h(\tau; n) = e^{-j2\pi\phi(n)} \sum_{d=0}^{L-1} \beta_d\, \delta(\tau - d) = e^{-j2\pi\phi(n)}\, h(\tau). \tag{5.123} \]


In other words, it is equivalent to a time-invariant channel multiplied by
a common phase shift component. When there is a carrier frequency offset
(CFO) due to the Doppler shift or a mismatch between the transceiver's oscillators,
Eq. (5.112), with φ(n) replaced by the normalized CFO value, provides the
appropriate channel model.
We will prove that PMU-OFDM is approximately MAI-free in a time-varying
channel when symmetric or anti-symmetric Hadamard-Walsh codewords are
used, as it is in a CFO channel. We will also show that the number
of sign changes in a Hadamard-Walsh codeword has a central effect on the
ICI caused by the time-varying channel in a PMU-OFDM system.

5.5.2 Analysis of PMU-OFDM Under the Doppler Effect


We can divide the overall Doppler effect in a mobile environment into the
Doppler MAI effect and the self-Doppler effect. The MAI effect is the interference
from symbols of other users, while the self-Doppler effect is the interference
from neighboring subcarriers of the same user (i.e., Doppler ICI), which also
results in symbol distortion.

5.5.3 Doppler MAI


In this section, we analyze the performance of PMU-OFDM and derive its
average MAI in a Doppler environment. As we saw before, in PMU-OFDM
each symbol in the input of user j, $x_j$, is repeated M times to form $y_j$. After
passing through $W_j$, the resultant NM × 1 vector $z_j$ can be written as

\[ z_j[l] = w_j[l]\, y_j[l], \quad 0 \le l \le NM - 1, \tag{5.124} \]

where l = v + kM is the frequency index. Next, each coded vector is passed
through the NM-point inverse DFT (IDFT) matrix. Then, following the
parallel-to-serial conversion, a cyclic prefix (CP) of length ν is inserted at the
beginning of each OFDM symbol. The jth transmitted signal over a block is thus given by

\[ s_j(n) = \frac{1}{NM} \sum_{\Omega=0}^{NM-1} w_j[\Omega]\, y_j[\Omega]\, e^{j\frac{2\pi}{NM}\Omega n}, \quad -\nu \le n \le NM, \tag{5.125} \]

where $w_j[\Omega]$, Ω = 0, 1, ..., NM − 1, is the jth user's Hadamard-Walsh codeword.
These symbols are fed to the multiple access channel. We consider the
uplink scenario where each user experiences a different channel, under the
assumption of synchronous communication.
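
The repetition and codeword weighting of Eq. (5.124) can be illustrated with a short sketch. It assumes an ideal channel in which the IDFT, channel, and DFT cancel exactly, so only the frequency-domain spreading and the despreading average remain. The helper names (`hadamard`, `spread`, `despread`) and the small sizes M = 4, N = 3 are illustrative choices, not part of the text.

```python
def hadamard(order):
    """Sylvester construction of a Hadamard matrix; order must be a power of two."""
    rows = [[1]]
    while len(rows) < order:
        rows = [r + r for r in rows] + [r + [-x for x in r] for r in rows]
    return rows

def spread(symbols, codeword):
    """Repeat each symbol M times and weight it: z[v + k*M] = w[v] * x[k]."""
    return [w * x for x in symbols for w in codeword]

def despread(received, codeword):
    """Average the codeword-weighted subcarriers of each symbol (ideal channel)."""
    M = len(codeword)
    N = len(received) // M
    return [sum(codeword[v] * received[v + k * M] for v in range(M)) / M
            for k in range(N)]

M, N = 4, 3
codes = hadamard(M)
user_symbols = {0: [1, -1, 1], 2: [-1, -1, 1]}  # two active users (hypothetical data)

# Superpose the spread signals of all active users on the same NM subcarriers.
superposed = [0] * (N * M)
for user, symbols in user_symbols.items():
    z = spread(symbols, codes[user])
    superposed = [a + b for a, b in zip(superposed, z)]

recovered = {user: despread(superposed, codes[user]) for user in user_symbols}
print(recovered)
```

Because the codewords are orthogonal, each user recovers its own symbols without MAI in this idealized setting; the analysis that follows quantifies what is lost once the channel varies in time.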
If the cyclic prefix duration is chosen such that ν ≥ L, the received signal
after removing the cyclic prefix is given by

\[ r(n) = \sum_{i=1}^{T} \sum_{d=0}^{L-1} g_i(n; d)\, s_i(n - d) + e(n), \tag{5.126} \]

where e(n) is the discrete-time additive white Gaussian noise, $g_i(n; d)$ is the
complex channel coefficient of user i, and T is the number of multiple access
users. The output of the DFT can be expressed as

\[ r[l] = \sum_{n=0}^{NM-1} r(n)\, e^{-j\frac{2\pi}{NM} nl}, \quad 0 \le l \le NM - 1. \tag{5.127} \]

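The detected symbol for user j is then given by

\[ \hat{x}_j[k] = \frac{1}{M} \sum_{v=0}^{M-1} w_j[v + kM]\, \frac{1}{NM} \sum_{i=1}^{T} \sum_{n,\Omega=0}^{NM-1} y_i[\Omega]\, w_i[\Omega] \sum_{d=0}^{L-1} g_i(n; d) \cdot \left\{ e^{j\frac{2\pi}{NM}\Omega(n-d)}\, e^{-j\frac{2\pi}{NM} n(v+kM)} \right\} + \hat{e}[k], \tag{5.128} \]

where

\[ \hat{e}[k] = \frac{1}{M} \sum_{v=0}^{M-1} e[v + kM]\, w_j[v + kM], \tag{5.129} \]

and

\[ e[v + kM] = \sum_{n=0}^{NM-1} e(n)\, e^{-j\frac{2\pi}{NM} n(v+kM)}. \tag{5.130} \]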

By Eq. (5.1), we get

yi [Ω] = yi [u + f M ] = xi [f ], (5.131)

where Ω = u + fM, with 0 ≤ u ≤ M − 1 and 0 ≤ f ≤ N − 1. Since all
symbols of an individual user use the same Hadamard-Walsh code, we have
$w_i[u + fM] = w_i[u]$ and $w_j[v + kM] = w_j[v]$. By defining

\[ \zeta_{i,j}(n, d) = \frac{1}{M} \sum_{v=0}^{M-1} \sum_{u=0}^{M-1} w_i[u]\, w_j[v]\, e^{-j\frac{2\pi}{NM}(v-u)n}\, e^{-j\frac{2\pi}{NM} ud}, \tag{5.132} \]

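and using Eq. (5.131), we can express Eq. (5.128) in the following form

\[ \hat{x}_j[k] = \sum_{i=1}^{T} \sum_{f=0}^{N-1} x_i[f]\, \frac{1}{NM} \sum_{n=0}^{NM-1} \sum_{d=0}^{L-1} g_i(n; d)\, \zeta_{i,j}(n, d)\, e^{-j\frac{2\pi}{NM}(fM)d}\, e^{-j\frac{2\pi}{N}(k-f)n} + \hat{e}[k]. \tag{5.133} \]

From the above equation, the MAI to the kth symbol of user j due to user
i, denoted by $MAI_{j\leftarrow i}[k]$, is given by

\[ MAI_{j\leftarrow i}[k] = \sum_{f=0}^{N-1} x_i[f]\, \frac{1}{NM} \sum_{n=0}^{NM-1} \sum_{d=0}^{L-1} \left\{ g_i(n; d)\, \zeta_{i,j}(n, d)\, e^{-j\frac{2\pi}{NM}(fM)d}\, e^{-j\frac{2\pi}{N}(k-f)n} \right\}. \tag{5.134} \]

The averaged power of $MAI_{j\leftarrow i}[k]$ is $E\{|MAI_{j\leftarrow i}[k]|^2\}$. Let

\[ \zeta_{i,j}(n, d) = \eta_{i,j}(n)\, e^{-j\frac{2\pi}{NM} ud}, \]

where

\[ \eta_{i,j}(n) = \frac{1}{M} \sum_{u,v=0}^{M-1} w_i[u]\, w_j[v]\, e^{-j\frac{2\pi}{NM}(v-u)n}. \tag{5.135} \]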

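We assume that $x_i[f]$ and $g_i(n; d)$ are uncorrelated, and also that the cross correlation
between different symbols $x_i[f]$ is zero; i.e., $E\{x_i[f]\, x_i^*[f']\} = \sigma_{x_i}^2 \delta(f - f')$. Then, by Eqs. (5.121)
and (5.122), the averaged power of $MAI_{j\leftarrow i}[k]$ can be found as

\[ E\{|MAI_{j\leftarrow i}[k]|^2\} = L \left( \frac{\sigma_d \sigma_{x_i}}{NM} \right)^2 \sum_{f=0}^{N-1} \sum_{n,m=0}^{NM-1} J_0(2\pi f_D T_s (n - m)) \cdot \left\{ \eta_{i,j}(n)\, \eta_{i,j}^*(m)\, e^{-j\frac{2\pi}{N}(k-f)(n-m)} \right\}. \tag{5.136} \]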

Similar to the previous section, we divide the MAI into two terms

\[ MAI_{j\leftarrow i}[k] = MAI^{(0)}_{j\leftarrow i}[k] + MAI^{(1)}_{j\leftarrow i}[k]. \tag{5.137} \]

$MAI^{(0)}_{j\leftarrow i}[k]$ is obtained by letting f = k in Eq. (5.134) and is the
interference contributed from the kth symbol of user i to the kth symbol of user
j. $MAI^{(1)}_{j\leftarrow i}[k]$ is the interference contributed from all the symbols f ≠ k of
user i to the kth symbol of user j.
From Eq. (5.136), the averaged power of $MAI^{(0)}_{j\leftarrow i}[k]$ is given by

\[ E\{|MAI^{(0)}_{j\leftarrow i}[k]|^2\} = L \left( \frac{\sigma_d \sigma_{x_i}}{NM} \right)^2 \sum_{n,m=0}^{NM-1} \left\{ J_0(2\pi f_D T_s (n - m))\, \eta_{i,j}(n)\, \eta_{i,j}^*(m) \right\}. \tag{5.138} \]

Example 5.17: Dominating and Residual MAI in Time-Varying Flat Fading Channel

In this example, we show by simulation that $\overline{MAI}_j^{(0)}$ is indeed the dominating
MAI in a Doppler spread environment with a flat fading channel. Figure 5.34
shows the simulation results for $MAI^{(0)}_{j\leftarrow i}[k]$ and $MAI^{(1)}_{j\leftarrow i}[k]$ versus the normalized
Doppler frequency. The simulation parameters are N = 4, M = 16, L = 1,
$f_c$ = 4 GHz and sampling frequency $F_s = 1/T_s$ = 2 MHz. The maximum
Doppler frequency is related to the mobile speed via $f_D = f_c V / c$, where c is
the speed of light and V is the mobile speed. Channel coefficients are allowed
to change during one OFDM block, and they have the Rayleigh distribution
with unit variance at the same path but different time indices.

Fig. 5.34. The dominating and the residual MAI curves as a function of the
normalized Doppler frequency [[125] © IEEE]. (Axes: MAI power in dB versus
$f_D T_s$; solid curves: MAI from the kth symbol, f = k; dashed curves: MAI from
all the other symbols, f ≠ k.)

The average dominating and residual MAI for user j from all other users, denoted by
$\overline{MAI}_j^{(0)}$ and $\overline{MAI}_j^{(1)}$, respectively, are calculated as explained in Example 5.8.
The total MAI is plotted as a function of the maximum normalized
Doppler frequency in Fig. 5.34, where $\overline{MAI}_j^{(0)}$ and $\overline{MAI}_j^{(1)}$ of all 16 users
are shown by 16 solid curves and 16 dashed curves, respectively. The solid
bold curve in Fig. 5.34, denoted by $\overline{MAI}^{(0)}$, is the average value of the 16 solid
curves and is obtained by $\frac{1}{T}\sum_{j=1}^{T} \overline{MAI}_j^{(0)}$. Similarly, the dashed bold curve,
denoted by $\overline{MAI}^{(1)}$, is the average value of the 16 dashed curves and is obtained by
$\frac{1}{T}\sum_{j=1}^{T} \overline{MAI}_j^{(1)}$. We see from this figure that the average $\overline{MAI}^{(0)}$ is about
10 dB more than the average $\overline{MAI}^{(1)}$. Thus, we can view $\overline{MAI}^{(0)}$ and $\overline{MAI}^{(1)}$
as the dominating MAI and the residual MAI, respectively, as they are in the CFO
environment.
Similar to the MAI caused by CFO, the dominating MAI due to the
Doppler spread can be reduced to a negligible amount if only the M/2 symmetric
or the M/2 anti-symmetric codewords of the M Hadamard-Walsh codewords are
used, as the following proposition demonstrates.
Proposition 5.5: Suppose that only the M/2 symmetric or the M/2 anti-symmetric
codewords of M Hadamard-Walsh codewords are used. Then
$MAI^{(0)}_{j\leftarrow i}[k]$ is approximately zero if the normalized Doppler frequency is less
than 1/(2πNM).

Proof. From Lemma 5.5, $\sum_{p=1}^{M-1} \sum_{q=0}^{M-1-p} w_i[p + q]\, w_j[q] = 0$, i ≠ j. Then,
we can rewrite Eq. (5.138) such that it includes this term. To do so, again we
can use the equality

\[ \sum_{u=0, u \ne v}^{M-1} \sum_{v=0}^{M-1} \alpha(u - v)\, w_i[u]\, w_j[v] = \sum_{p=1}^{M-1} \left\{ \alpha(p) \sum_{q=0}^{M-1-p} w_i[p + q]\, w_j[q] + \alpha(-p) \sum_{q=0}^{M-1-p} w_i[q]\, w_j[p + q] \right\}, \tag{5.139} \]

where α(·) can be any function. Note that if i ≠ j in the above equation, we
have

\[ \sum_{u=0}^{M-1} \sum_{v=0}^{M-1} \alpha(u - v)\, w_i[u]\, w_j[v] = \sum_{u=0, u \ne v}^{M-1} \sum_{v=0}^{M-1} \alpha(u - v)\, w_i[u]\, w_j[v]. \tag{5.140} \]

Let $\alpha(u - v) = e^{-j\frac{2\pi}{NM}(v-u)n}$. Using the properties of symmetric and anti-symmetric
codewords given in Eqs. (5.85) and by using Eqs. (5.139), (5.140),
$\eta_{i,j}(n)$ can be rewritten as

\[ \eta_{i,j}(n) = \frac{2}{M} \sum_{p=1}^{M-1} \sum_{q=0}^{M-1-p} w_i[p + q]\, w_j[q] \cos\left( \frac{2\pi}{NM} pn \right), \quad i \ne j. \tag{5.141} \]

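By substituting the above equation in Eq. (5.138) and making some
rearrangement, we can obtain the averaged power of $MAI^{(0)}_{j\leftarrow i}[k]$ as

\[ E\{|MAI^{(0)}_{j\leftarrow i}[k]|^2\} = L \left( \frac{2\sigma_d \sigma_{x_i}}{NM^2} \right)^2 \sum_{p=1}^{M-1} \sum_{q=0}^{M-1-p} w_i[p + q]\, w_j[q] \cdot \left\{ \sum_{r=1}^{M-1} \sum_{s=0}^{M-1-r} w_i[r + s]\, w_j[s]\, \rho(p, r) \right\}, \tag{5.142} \]

where

\[ \rho(p, r) = \sum_{n,m=0}^{NM-1} J_0(2\pi f_D T_s (n - m)) \cos\left( \frac{2\pi}{NM} pn \right) \cos\left( \frac{2\pi}{NM} rm \right). \tag{5.143} \]

Now, we need to show that ρ(p, r) is a constant function so that

\[ E\{|MAI^{(0)}_{j\leftarrow i}[k]|^2\} \approx 0, \quad \text{for } f_D T_s < \frac{1}{2\pi NM}. \]
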
The zeroth order Bessel function of the first kind can be expanded by the
following power series [2]

\[ J_0(z) = 1 - \frac{\frac{1}{4}z^2}{(1!)^2} + \frac{(\frac{1}{4}z^2)^2}{(2!)^2} - \frac{(\frac{1}{4}z^2)^3}{(3!)^2} + \cdots. \tag{5.144} \]

As shown in Fig. 5.35, when $0 < f_D T_s < \frac{1}{2\pi NM}$ (or equivalently
$0 < 2\pi f_D T_s (n - m) < 1$), we can approximate $J_0(z)$ for 0 < z < 1 by
$1 - \frac{\frac{1}{4}z^2}{(1!)^2}$. By substituting this approximation for $J_0(2\pi f_D T_s (n - m))$ with
0 ≤ n − m ≤ NM, we obtain

\[ J_0(2\pi f_D T_s (n - m)) \approx 1 - \pi^2 (f_D T_s)^2 (n - m)^2, \quad f_D T_s < \frac{1}{2\pi NM}. \tag{5.145} \]
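
The quality of the two-term approximation can be checked with a short sketch that sums the power series of Eq. (5.144) directly. The function names and the number of retained terms are illustrative assumptions.

```python
def bessel_j0_series(z, terms=20):
    """Power series of Eq. (5.144): sum over k of (-1)^k (z^2/4)^k / (k!)^2."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        # Ratio of consecutive series terms: -(z^2/4) / (k+1)^2.
        term *= -(z * z / 4.0) / ((k + 1) ** 2)
    return total

def j0_quadratic(z):
    """Two-term truncation used for Eq. (5.145)."""
    return 1.0 - z * z / 4.0

for z in (0.1, 0.5, 1.0):
    exact = bessel_j0_series(z)
    approx = j0_quadratic(z)
    print(f"z={z:.1f}  J0={exact:.6f}  approx={approx:.6f}  error={exact - approx:.2e}")
```

Since the series alternates with decreasing terms for 0 < z < 1, the truncation error is bounded by the first omitted term, $(z^2/4)^2/4$, which is at most 1/64 on this interval.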
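Based on the above approximation, we can express ρ(p, r) in Eq. (5.142) as

\[ \begin{aligned} \rho(p, r) \approx{} & \sum_{n=0}^{NM-1} \cos\left( \frac{2\pi}{NM} pn \right) \sum_{m=0}^{NM-1} \cos\left( \frac{2\pi}{NM} rm \right) \\ & - (\pi f_D T_s)^2 \sum_{n=0}^{NM-1} n^2 \cos\left( \frac{2\pi}{NM} pn \right) \sum_{m=0}^{NM-1} \cos\left( \frac{2\pi}{NM} rm \right) \\ & - (\pi f_D T_s)^2 \sum_{n=0}^{NM-1} \cos\left( \frac{2\pi}{NM} pn \right) \sum_{m=0}^{NM-1} m^2 \cos\left( \frac{2\pi}{NM} rm \right) \\ & + 2(\pi f_D T_s)^2 \sum_{n=0}^{NM-1} n \cos\left( \frac{2\pi}{NM} pn \right) \sum_{m=0}^{NM-1} m \cos\left( \frac{2\pi}{NM} rm \right). \end{aligned} \tag{5.146} \]

Fig. 5.35. Approximation of $J_0(z)$ by $1 - \frac{\frac{1}{4}z^2}{(1!)^2}$ for z < 1 [[125] © IEEE].

It can easily be shown that $\sum_{n=0}^{NM-1} \cos(\frac{2\pi}{NM} pn) = \sum_{m=0}^{NM-1} \cos(\frac{2\pi}{NM} rm) = 0$.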

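Also, we know from [108] that

\[ \sum_{k=0}^{n-1} k \cos(kx) = \frac{n \sin\left( \frac{2n-1}{2} x \right)}{2 \sin\left( \frac{x}{2} \right)} - \frac{1 - \cos(nx)}{4 \sin^2\left( \frac{x}{2} \right)}, \quad x \ne 0. \tag{5.147} \]

Therefore, for p = 1, 2, ..., M − 1, and r = 1, 2, ..., M − 1, we get

\[ \begin{aligned} \sum_{m=0}^{NM-1} m \cos\left( \frac{2\pi}{NM} rm \right) = \sum_{n=0}^{NM-1} n \cos\left( \frac{2\pi}{NM} pn \right) &= \frac{NM \sin\left( \frac{2NM-1}{2} \frac{2\pi}{NM} p \right)}{2 \sin\left( \frac{\pi}{NM} p \right)} - \frac{1 - \cos(2\pi p)}{4 \sin^2\left( \frac{\pi}{NM} p \right)} \\ &= \frac{NM \sin\left( 2\pi p - \frac{p\pi}{NM} \right)}{2 \sin\left( \frac{p\pi}{NM} \right)} = -\frac{NM}{2}. \end{aligned} \tag{5.148} \]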
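
The closed form in Eq. (5.148) is easy to confirm numerically. The sketch below evaluates the weighted cosine sum directly for the parameters of Example 5.17 (N = 4, M = 16); the function name is an illustrative choice.

```python
import math

def weighted_cosine_sum(N, M, p):
    """Left-hand side of Eq. (5.148): sum of n*cos(2*pi*p*n/(N*M)) for n = 0..N*M-1."""
    NM = N * M
    return sum(n * math.cos(2.0 * math.pi * p * n / NM) for n in range(NM))

N, M = 4, 16
deviations = [abs(weighted_cosine_sum(N, M, p) + N * M / 2) for p in range(1, M)]
print("largest deviation from -NM/2:", max(deviations))
```

The deviation is at the level of floating-point rounding for every p between 1 and M − 1, which is what makes ρ(p, r) independent of p and r in the next step.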

By plugging the above expression in Eq. (5.146), we obtain

\[ \rho(p, r) \approx 2\pi^2 (f_D T_s)^2 \left( \frac{NM}{2} \right)^2, \]

which is independent of p and r. By substituting this approximate value of
ρ(p, r) back into Eq. (5.142), we obtain

\[ E\{|MAI^{(0)}_{j\leftarrow i}[k]|^2\} \approx 2L \left( \frac{\pi f_D T_s \sigma_d \sigma_{x_i}}{M} \right)^2 \sum_{p=1}^{M-1} \sum_{q=0}^{M-1-p} w_i[p + q]\, w_j[q] \cdot \left\{ \sum_{r=1}^{M-1} \sum_{s=0}^{M-1-r} w_i[r + s]\, w_j[s] \right\}. \tag{5.149} \]

It was shown in Lemma 5.5 that $\sum_{p=1}^{M-1} \sum_{q=0}^{M-1-p} w_i[p + q]\, w_j[q] = 0$ if the
symmetric or anti-symmetric Hadamard-Walsh codewords are used. Therefore,
the dominating MAI is approximately reduced to

\[ E\{|MAI^{(0)}_{j\leftarrow i}[k]|^2\} \approx 0, \quad \text{if } f_D T_s < \frac{1}{2\pi NM}. \tag{5.150} \]
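
The codeword property that the proof relies on can be checked by brute force. The sketch below assumes the Hadamard-Walsh codewords are the rows of a Sylvester-type Hadamard matrix (the row ordering used in the book may differ) and evaluates the double sum of Lemma 5.5 for every pair of distinct codewords within the symmetric set and within the anti-symmetric set. All function names are illustrative.

```python
def hadamard(order):
    """Sylvester construction of a Hadamard matrix; order must be a power of two."""
    rows = [[1]]
    while len(rows) < order:
        rows = [r + r for r in rows] + [r + [-x for x in r] for r in rows]
    return rows

def is_symmetric(w):
    return w == w[::-1]

def is_antisymmetric(w):
    return w == [-x for x in reversed(w)]

def lagged_cross_sum(wi, wj):
    """Double sum over p = 1..M-1 and q = 0..M-1-p of wi[p+q] * wj[q]."""
    M = len(wi)
    return sum(wi[p + q] * wj[q] for p in range(1, M) for q in range(M - p))

def max_cross(group):
    """Largest magnitude of the double sum over all ordered pairs of distinct codewords."""
    return max(abs(lagged_cross_sum(a, b)) for a in group for b in group if a is not b)

M = 16
codes = hadamard(M)
symmetric = [w for w in codes if is_symmetric(w)]
antisymmetric = [w for w in codes if is_antisymmetric(w)]

print(len(symmetric), len(antisymmetric))
print(max_cross(symmetric), max_cross(antisymmetric))
```

Under this construction the M = 16 rows split evenly into eight symmetric and eight anti-symmetric codewords, and the double sum vanishes for every pair drawn from the same set.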

Example 5.18: Suppression of the Dominating MAI in Time-Varying Flat Fading
Channel with Code Selection Scheme

We consider the same set of parameters as in Example 5.17 and show how
employing only the M/2 symmetric or the M/2 anti-symmetric Hadamard-Walsh
codewords can suppress $\overline{MAI}^{(0)}$. Figure 5.36 shows that choosing only the M/2
symmetric or M/2 anti-symmetric Hadamard-Walsh codewords greatly
suppresses the dominating MAI. The simulation parameters were the same as
those adopted for Fig. 5.34.
By comparing results in this figure and Fig. 5.34, we see that the
dominating MAI is reduced by 35–58 dB. In contrast, the residual MAI is only
decreased by 5–6 dB due to the decreased user number from 16 to 8. The
results confirm that the dominating MAI can be greatly reduced using only
M/2 symmetric or M/2 anti-symmetric codewords so that the total MAI,
which is equal to $\overline{MAI}^{(0)} + \overline{MAI}^{(1)}$, is reduced considerably.
In a practical mobile environment, $f_D T_s < \frac{1}{2\pi NM}$ covers a wide
range of normalized Doppler frequencies.
Let us consider an example. If N = 64 and M = 16, PMU-OFDM is
approximately MAI-free when the normalized maximum Doppler frequency is less than
$1.5 \times 10^{-4}$. For the carrier frequency $f_c$ = 4 GHz and the sampling frequency
$F_s = 1/T_s$ = 2 MHz, this maximum Doppler frequency is equivalent to a mobile
speed of 81 km/h.
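
The arithmetic behind these figures can be reproduced with a short sketch. It assumes the rounded value c = 3 × 10^8 m/s and uses $f_D = f_c V / c$ from Example 5.17; the function names are illustrative.

```python
import math

C_LIGHT = 3.0e8  # speed of light in m/s (rounded value, an assumption of this sketch)

def max_normalized_doppler(N, M):
    """Bound of Proposition 5.5 on f_D * T_s: 1 / (2*pi*N*M)."""
    return 1.0 / (2.0 * math.pi * N * M)

def speed_kmh(normalized_doppler, fs_hz, fc_hz):
    """Mobile speed for a given f_D * T_s, using f_D = f_c * V / c."""
    f_d = normalized_doppler * fs_hz   # maximum Doppler frequency in Hz
    return f_d * C_LIGHT / fc_hz * 3.6  # convert m/s to km/h

bound = max_normalized_doppler(N=64, M=16)
print(f"bound on f_D*T_s: {bound:.3e}")
print(f"speed at 1.5e-4 : {speed_kmh(1.5e-4, fs_hz=2e6, fc_hz=4e9):.1f} km/h")
print(f"speed at 1.0e-4 : {speed_kmh(1.0e-4, fs_hz=2e6, fc_hz=4e9):.1f} km/h")
```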

5.5.4 Analysis of Doppler ICI and Symbol Distortion

We studied the interference caused by other users' Doppler effect, namely the
Doppler MAI, in the previous section. Here, we investigate the impairments caused
by a user's own Doppler effect: symbol distortion and self-ICI.
If we assume there is no active user except for user j, then by letting i = j
in Eq. (5.133), the kth detected symbol of user j is given by

\[ \hat{x}_j[k] = \sum_{f=0}^{N-1} x_j[f]\, \frac{1}{NM} \sum_{n=0}^{NM-1} \sum_{d=0}^{L-1} g_j(n; d)\, \zeta_{j,j}(n, d)\, e^{-j\frac{2\pi}{NM}(fM)d}\, e^{-j\frac{2\pi}{N}(k-f)n} + \hat{e}[k]. \tag{5.151} \]

Fig. 5.36. The dominating and residual MAI versus the normalized Doppler
frequency when only M/2 symmetric or anti-symmetric Hadamard-Walsh codewords
are used [[125] © IEEE]. (Axes: MAI power in dB versus $f_D T_s$.)

We can rewrite Eq. (5.151) as

\[ \hat{x}_j[k] = x_j[k]\, H_j[k] + ICI_j^{(0)}[k] + ICI_j^{(1)}[k] + \hat{e}[k], \tag{5.152} \]

where $H_j[k]$, $ICI_j^{(0)}[k]$, and $ICI_j^{(1)}[k]$ are discussed below. The distortion
factor $H_j[k]$ is obtained by putting f = k and u = v in $\zeta_{j,j}(n, d)$ in Eq. (5.151),
i.e.,

\[ H_j[k] = \frac{1}{NM} \sum_{n=0}^{NM-1} \sum_{d=0}^{L-1} g_j(n; d)\, \frac{1}{M} \sum_{v=0}^{M-1} e^{-j\frac{2\pi}{NM}(v+kM)d}, \tag{5.153} \]

for 0 ≤ v ≤ M − 1 and 0 ≤ d ≤ L − 1.
The term $ICI_j^{(0)}[k]$ in Eq. (5.152) is the interference from subcarriers with
f = k and u ≠ v to the desired subcarrier f = k; namely,

\[ ICI_j^{(0)}[k] = x_j[k]\, \frac{1}{NM} \sum_{n=0}^{NM-1} \sum_{d=0}^{L-1} g_j(n; d) \cdot \left\{ \{\zeta_{j,j}(n, d)|_{u \ne v}\}\, e^{-j\frac{2\pi}{NM}(kM)d} \right\}, \tag{5.154} \]

and $ICI_j^{(1)}[k]$ in Eq. (5.152) is the sum of all interferences from subcarriers
f ≠ k, i.e.,

\[ ICI_j^{(1)}[k] = \sum_{f=0, f \ne k}^{N-1} x_j[f]\, \frac{1}{NM} \sum_{n=0}^{NM-1} \sum_{d=0}^{L-1} g_j(n; d) \cdot \left\{ \zeta_{j,j}(n, d)\, e^{-j\frac{2\pi}{NM}(fM)d}\, e^{-j\frac{2\pi}{N}(k-f)n} \right\}. \tag{5.155} \]

The averaged power of $ICI_j^{(0)}[k]$ can be obtained in a manner similar to that
used in deriving the averaged MAI power in Eq. (5.136) as

\[ E\{|ICI_j^{(0)}[k]|^2\} = L \left( \frac{\sigma_d \sigma_{x_j}}{NM} \right)^2 \sum_{n,m=0}^{NM-1} J_0(2\pi f_D T_s (n - m)) \cdot \left\{ \{\eta_{j,j}(n)|_{u \ne v}\} \{\eta_{j,j}^*(m)|_{u \ne v}\} \right\}. \tag{5.156} \]

We can rewrite Eq. (5.139) for i = j as

\[ \sum_{u,v=0, u \ne v}^{M-1} \alpha(u - v)\, w_j[u]\, w_j[v] = \sum_{p=1}^{M-1} \left\{ \{\alpha(p) + \alpha(-p)\} \sum_{q=0}^{M-1-p} w_j[p + q]\, w_j[q] \right\}. \]

Let $\alpha(u - v) = e^{-j\frac{2\pi}{NM}(v-u)n}$. Then,

\[ \eta_{j,j}(n)|_{u \ne v} = \frac{2}{M} \sum_{p=1}^{M-1} \sum_{q=0}^{M-1-p} w_j[p + q]\, w_j[q] \cos\left( \frac{2\pi}{NM} pn \right). \tag{5.157} \]

By substituting the above equation in Eq. (5.156) and using the approximate
formula Eq. (5.145) for the Bessel function and Eq. (5.148), we have

\[ E\{|ICI_j^{(0)}[k]|^2\} \approx 2L \left( \frac{\pi f_D T_s \sigma_d \sigma_{x_j}}{M} \right)^2 \sum_{p=1}^{M-1} \sum_{q=0}^{M-1-p} w_j[p + q]\, w_j[q] \cdot \left\{ \sum_{r=1}^{M-1} \sum_{s=0}^{M-1-r} w_j[r + s]\, w_j[s] \right\}. \tag{5.158} \]

To find the double summation term in Eq. (5.158), we note

\[ \sum_{p=1}^{M-1} \sum_{q=0}^{M-1-p} w_j[p + q]\, w_j[q] = \frac{1}{2} \sum_{u,v=0, u \ne v}^{M-1} w_j[u]\, w_j[v] = \frac{1}{2} \sum_{u=0}^{M-1} w_j[u] \sum_{v=0}^{M-1} w_j[v] - \frac{1}{2} \sum_{u=0}^{M-1} w_j^2[u]. \]

Since $\sum_{u=0}^{M-1} w_j[u]$ is equal to M for the all-one codeword and is 0 for all other
codewords, we have

\[ \sum_{p=1}^{M-1} \sum_{q=0}^{M-1-p} w_j[p + q]\, w_j[q] = \begin{cases} M(M - 1)/2, & \text{all-one code}, \\ -M/2, & \text{otherwise}. \end{cases} \tag{5.159} \]
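
Equation (5.159) can be verified by direct enumeration. The sketch below again assumes Sylvester-type Hadamard rows as the codewords and compares the double sum with the two predicted values for every codeword of length M = 16; the names are illustrative.

```python
def hadamard(order):
    """Sylvester construction of a Hadamard matrix; order must be a power of two."""
    rows = [[1]]
    while len(rows) < order:
        rows = [r + r for r in rows] + [r + [-x for x in r] for r in rows]
    return rows

def lagged_auto_sum(w):
    """Double sum over p = 1..M-1 and q = 0..M-1-p of w[p+q] * w[q]."""
    M = len(w)
    return sum(w[p + q] * w[q] for p in range(1, M) for q in range(M - p))

M = 16
results = []
for w in hadamard(M):
    is_all_one = all(x == 1 for x in w)
    # Values predicted by Eq. (5.159).
    predicted = M * (M - 1) // 2 if is_all_one else -(M // 2)
    results.append((lagged_auto_sum(w), predicted))

print(all(actual == predicted for actual, predicted in results))
```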

From Eqs. (5.158) and (5.159),

\[ E\{|ICI_j^{(0)}[k]|^2\} \approx \frac{L}{2} (\pi f_D T_s \sigma_d \sigma_{x_j})^2, \]

when the all-one codeword is not used and $f_D T_s < 1/(2\pi NM)$. On the other
hand, if the all-one codeword is used, we have

\[ E\{|ICI_j^{(0)}[k]|^2\} \approx \frac{L}{2} (\pi f_D T_s \sigma_d \sigma_{x_j} (M - 1))^2. \]

Also, if $f_D T_s < 1/(2\pi NM)$, then $E\{|ICI_j^{(0)}[k]|^2\} < \frac{L \sigma_d^2 \sigma_{x_j}^2}{8N^2}$ for the all-one
codeword and $E\{|ICI_j^{(0)}[k]|^2\} < \frac{L \sigma_d^2 \sigma_{x_j}^2}{8N^2 M^2}$ for other codewords.
The averaged power of $ICI_j^{(1)}[k]$ can also be computed as

\[ E\{|ICI_j^{(1)}[k]|^2\} = L \left( \frac{\sigma_d \sigma_{x_j}}{NM} \right)^2 \sum_{f=0, f \ne k}^{N-1} \sum_{n,m=0}^{NM-1} J_0(2\pi f_D T_s (n - m)) \cdot \left\{ \eta_{j,j}(n)\, \eta_{j,j}^*(m)\, e^{-j\frac{2\pi}{N}(k-f)(n-m)} \right\}. \tag{5.160} \]

As compared with the averaged power of $ICI_j^{(1)}[k]$, the averaged power of
$ICI_j^{(0)}[k]$ is negligible for all codewords for practical values of N and L. This
can be clearly explained by the following example.
Example 5.19: Doppler ICI
Let us consider an example. Let N = 64, L = 4, and M = 16. The maximum
normalized Doppler frequency was chosen to be $10^{-4}$, which corresponds to a user speed
of 54 km/h. Also, $\sigma_d^2 = \sigma_{x_j}^2 = 1$. Then, $E\{|ICI_j^{(0)}[k]|^2\} \approx -40.5$ dB for the
all-one codeword, and $E\{|ICI_j^{(0)}[k]|^2\} \approx -64$ dB for other codewords.
Using Eq. (5.160) and the same set of parameters, we can compute the
value of $E\{|ICI_j^{(1)}[k]|^2\}$ to be about −10 dB for the user with the all-one
codeword, and between −25 dB and −50 dB for all other users.
Clearly, $E\{|ICI_j^{(1)}[k]|^2\}$ is much larger than the corresponding values of
$E\{|ICI_j^{(0)}[k]|^2\}$. Since $ICI_j^{(1)}[k]$ is the dominant ICI, we only need to
consider this ICI effect and ignore $ICI_j^{(0)}[k]$.
Figure 5.37 shows the average ICI power for each individual Hadamard-Walsh
codeword used in PMU-OFDM. For the kth symbol of the target user,
we accumulate the ICI from all other symbols f ≠ k of the same user. The
ICI power is then averaged over 0 ≤ k ≤ N − 1; namely,
$\overline{ICI}_j^{(1)} = \frac{1}{N} \sum_{k=0}^{N-1} |ICI_j^{(1)}[k]|^2$.
We see that different users experience different amounts of ICI. In particular,
the user that employs the all-one codeword suffers more ICI than all
others by 15–40 dB. Intuitively, if a codeword has more sign changes, the main
lobes of interfering subcarriers may cancel each other so that the ICI power is
decreased. This is similar to the self-ICI cancellation technique used to
mitigate ICI in OFDM [157]. Since the all-one codeword belongs to the set of
symmetric Hadamard-Walsh codewords, we recommend choosing the set of
anti-symmetric Hadamard-Walsh codewords, since they are approximately
MAI-free and have a lower average ICI power at the same time.

Fig. 5.37. The $ICI^{(1)}$ power as a function of user index for symmetric and
anti-symmetric codewords with $f_D T_s = 1 \times 10^{-4}$ [[125] © IEEE]. (Axes: ICI
power in dB versus user index.)

5.5.5 Codeword Priority Schemes for ICI Cancellation


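Since $ICI_j^{(1)}[k]$ is the dominant ICI due to the Doppler spread effect, we would
like to further investigate this term in this section. It is found that codewords
with a higher number of sign changes tend to lead to a smaller ICI value.
This can be explained as follows. Based on the combination of u and v values,
the expression $\eta_{j,j}(n)\eta_{j,j}^*(m)$ in Eq. (5.160) can be written as the sum of four
terms:

\[ \begin{aligned} \eta_{j,j}(n)\, \eta_{j,j}^*(m) ={} & \{\eta_{j,j}(n)|_{u=v}\}\{\eta_{j,j}^*(m)|_{u=v}\} + \{\eta_{j,j}(n)|_{u=v}\}\{\eta_{j,j}^*(m)|_{u \ne v}\} \\ & + \{\eta_{j,j}(n)|_{u \ne v}\}\{\eta_{j,j}^*(m)|_{u=v}\} + \{\eta_{j,j}(n)|_{u \ne v}\}\{\eta_{j,j}^*(m)|_{u \ne v}\}. \end{aligned} \]

Let $r_{j,j}[p] = \sum_{q=0}^{M-1-p} w_j[p + q]\, w_j[q]$, 1 ≤ p ≤ M − 1. Since $\{\eta_{j,j}(n)|_{u=v}\} = 1$,
and by using Eq. (5.157), $E\{|ICI_j^{(1)}[k]|^2\}$ can be written as

\[ \begin{aligned} E\{|ICI_j^{(1)}[k]|^2\} = L \left( \frac{\sigma_d \sigma_{x_j}}{NM} \right)^2 \sum_{f=0, f \ne k}^{N-1} \Bigg\{ \sum_{n,m=0}^{NM-1} & J_0(2\pi f_D T_s (n - m))\, e^{-j\frac{2\pi}{N}(k-f)(n-m)} \\ & \cdot \Bigg[ 1 + \frac{2}{M} \sum_{p=1}^{M-1} r_{j,j}[p] \cos\left( \frac{2\pi}{NM} pn \right) + \frac{2}{M} \sum_{p=1}^{M-1} r_{j,j}[p] \cos\left( \frac{2\pi}{NM} pm \right) \\ & + \frac{4}{M^2} \sum_{p=1}^{M-1} r_{j,j}[p] \cos\left( \frac{2\pi}{NM} pn \right) \sum_{p'=1}^{M-1} r_{j,j}[p'] \cos\left( \frac{2\pi}{NM} p'm \right) \Bigg] \Bigg\}. \end{aligned} \]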

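Define

\[ \beta(p, k - f) = \sum_{n=0}^{NM-1} n\, e^{-j\frac{2\pi}{N}(k-f)n} \cos\left( \frac{2\pi}{NM} pn \right), \]

and

\[ \gamma(k - f) = \sum_{n=0}^{NM-1} n\, e^{-j\frac{2\pi}{N}(k-f)n}. \]

Since

\[ \sum_{n=0}^{NM-1} e^{-j\frac{2\pi}{N}(k-f)n} = \sum_{n=0}^{NM-1} e^{-j\frac{2\pi}{N}(k-f)n} \cos\left( \frac{2\pi}{NM} pn \right) = 0, \]

for p = 1, 2, ..., M − 1, and k − f = −2N, ..., −1, 1, ..., 2N, and by using the
approximate formula Eq. (5.145), we can express the average ICI power for
$0 < f_D T_s < \frac{1}{2\pi NM}$ by

\[ \begin{aligned} E\{|ICI_j^{(1)}[k]|^2\} \approx 2L \left( \frac{\pi (f_D T_s) \sigma_d \sigma_{x_j}}{NM} \right)^2 \sum_{f=0, f \ne k}^{N-1} \Bigg\{ & |\gamma(k - f)|^2 + \frac{4}{M} \Re\Big\{ \gamma^*(k - f) \sum_{p=1}^{M-1} r_{j,j}[p]\, \beta(p, k - f) \Big\} \\ & + \frac{4}{M^2} \Big| \sum_{p=1}^{M-1} r_{j,j}[p]\, \beta(p, k - f) \Big|^2 \Bigg\}, \end{aligned} \tag{5.161} \]

where $\Re\{\cdot\}$ denotes the real part. It was proven in [125] that the real part of
β(p, k − f) is equal to −NM/2. Based on this result, we can rewrite the ICI
averaged power as

\[ \begin{aligned} E\{|ICI_j^{(1)}[k]|^2\} \approx 2L \left( \frac{\pi (f_D T_s) \sigma_d \sigma_{x_j}}{NM} \right)^2 \sum_{f=0, f \ne k}^{N-1} \Bigg\{ & |\gamma(k - f)|^2 + \frac{4}{M} \Re\Big\{ \gamma^*(k - f) \Big( ø + j \sum_{p=1}^{M-1} r_{j,j}[p]\, \Im\{\beta(p, k - f)\} \Big) \Big\} \\ & + \frac{4}{M^2} \Big| ø + j \sum_{p=1}^{M-1} r_{j,j}[p]\, \Im\{\beta(p, k - f)\} \Big|^2 \Bigg\}, \end{aligned} \tag{5.162} \]

where $\Im\{\cdot\}$ is the imaginary part, j in front of the sums denotes the imaginary
unit, and ø can be obtained by Eq. (5.159) as

\[ ø = \begin{cases} -\dfrac{(M - 1) M^2 N}{4}, & \text{all-one codeword}, \\ \dfrac{M^2 N}{4}, & \text{otherwise}. \end{cases} \]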

The quantity $r_{j,j}[p] = \sum_{q=0}^{M-1-p} w_j[p + q]\, w_j[q]$, 1 ≤ p ≤ M − 1, will appear
again. $r_{j,j}[p]$ can be interpreted as the autocorrelation of codeword j. Since
$\Im\{\beta(p, k - f)\}$ is a monotonically decreasing or increasing function of p for
given k − f as shown in [125], we can characterize the ICI values qualitatively
by $r_{j,j}[p]$ based on Eq. (5.162).
If $r_{j,j}[p]$, 1 ≤ p ≤ M − 1, has a sufficient number of sign changes, the
term $\sum_{p=1}^{M-1} r_{j,j}[p]\, \Im\{\beta(p, k - f)\}$ in Eq. (5.162) is likely to be cancelled out
after the summation over all p, which leads to a smaller ICI value. Since
$r_{j,j}[p] = \sum_{q=0}^{M-1-p} w_j[p + q]\, w_j[q]$, for 1 ≤ p ≤ M − 1, there is a relation
between the number of sign changes in the codewords and the number of sign
changes in $r_{j,j}[p]$.
Let us consider an example for codewords of size M = 8.

• For the all-one codeword (1 1 1 1 1 1 1 1), $r_{1,1}[p]$ = (7 6 5 4 3 2 1),
which has no sign change.

• For the second codeword (1 -1 1 -1 1 -1 1 -1) with seven sign changes,
$r_{2,2}[p]$ = (-7 6 -5 4 -3 2 -1), which has six sign changes.

• For the 7th codeword (1 1 -1 -1 -1 -1 1 1), $r_{7,7}[p]$ = (3 -2 -3 -4 -1 2 1);
both the codeword and $r_{7,7}[p]$ have two sign changes.
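
The three cases above can be reproduced with a few lines of code. The sketch computes the codeword autocorrelation for 1 ≤ p ≤ M − 1 and counts sign changes; the function names and dictionary labels are illustrative.

```python
def autocorrelation(w):
    """r[p] = sum over q = 0..M-1-p of w[p+q] * w[q], for p = 1..M-1."""
    M = len(w)
    return [sum(w[p + q] * w[q] for q in range(M - p)) for p in range(1, M)]

def sign_changes(seq):
    """Number of adjacent pairs with opposite signs (no zeros occur in these examples)."""
    return sum(1 for a, b in zip(seq, seq[1:]) if a * b < 0)

examples = {
    "all-one": [1, 1, 1, 1, 1, 1, 1, 1],
    "second":  [1, -1, 1, -1, 1, -1, 1, -1],
    "seventh": [1, 1, -1, -1, -1, -1, 1, 1],
}

for name, w in examples.items():
    r = autocorrelation(w)
    print(f"{name:8s} codeword changes={sign_changes(w)}  r={r}  r changes={sign_changes(r)}")
```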

With the above observation, we can adopt the following rule of thumb for
codeword selection in a PMU-OFDM system: “To give a higher priority to
codewords that have a higher number of sign-changes”. This result is similar
to PMU-OFDM in the presence of a pure CFO environment [134].
Example 5.19: Code Priority for ICI in Doppler Environment
In this example, we corroborate the code priority analysis given above for
ICI power of PMU-OFDM system in Doppler environment via computer
simulation.
The ICI power for 16 codewords is shown in Table 5.3, where the
corresponding number of sign changes is also provided. System parameters are
chosen to be N = 64, M = 16, and L = 4. The maximum normalized Doppler
frequency and the SNR value were fixed at $10^{-4}$ and 30 dB, respectively.
We see that as a general trend, codewords with more sign changes have
less ICI. This observation verifies the theoretical result stated above.
Example 5.20: Performance Comparison of PMU-OFDM to
OFDMA in Doppler Environment
Here, we compare the performance of the PMU-OFDM and the OFDMA
systems via computer simulations. For fair comparison, we keep the size of
IDFT/DFT of both systems the same, i.e., N M , and consider both fully
loaded and half-loaded situations. To simulate the half-loaded PMU-OFDM
system, the set of anti-symmetric (or symmetric) Hadamard-Walsh codewords
is used. The subchannels for the fully loaded and half-loaded OFDMA systems
are assigned in the same manner as in Examples 5.10 and 5.11 for the half-loaded
OFDMA.

Table 5.3. The ICI averaged power in dB for 16 users [[125] © IEEE].

# sign changes                           0      2      4      6      8     10     12     14
ICI of symmetric codewords (dB)      −10.6  −32.5  −40.2  −38.4  −49.4  −49.9  −46.9  −45.8

# sign changes                           1      3      5      7      9     11     13     15
ICI of anti-symmetric codewords (dB) −24.8  −30.5  −40.5  −37.1  −49.6  −49.8  −47.2  −45.0

Fig. 5.38. Comparison of the MAI power and the ICI power as a function of the
normalized Doppler frequency [[125] © IEEE].

First, let us see how the MAI and ICI of half-loaded PMU-OFDM com-
pares with that of OFDMA as functions of the normalized Doppler frequency.
Let N = 64, M = 16, and L = 4. The ICI power and the MAI power for
OFDMA and PMU-OFDM are depicted versus maximum normalized Doppler
frequency as in Fig. 5.38. The average ICI power is the averaged value of eight
multiple access users (i.e., T = 8). The first observation is that the second
largest amount of interference is the ICI of PMU-OFDM with only symmet-
ric codewords. The reason should be obvious from our previous discussion in
Section 5.5.5. That is, the all-one codeword in the set of symmetric codewords
has a high ICI value as compared to the ICI or the MAI of all other users.
On the other hand, the ICI value of PMU-OFDM with the set of anti-
symmetric Hadamard-Walsh codewords is about 4 dB more than that of
OFDMA. The inferior ICI performance of PMU-OFDM compared to that
of OFDMA is explained by noting the subcarriers allocation we adopted for
OFDMA. That is, subcarriers allocated to a particular user in OFDMA are
spread uniformly across the available bandwidth. Hence, ICI results in less
impairment in OFDMA than in a single-user OFDM.
However, for OFDMA, the MAI power is significantly higher than that
of PMU-OFDM. As shown in Fig. 5.38, the MAI value of PMU-OFDM with
either symmetric or anti-symmetric Hadamard-Walsh codewords is about
10 dB less than that of OFDMA. Thus, we expect PMU-OFDM
with only anti-symmetric codewords to outperform OFDMA in the bit error
probability.
Next, we consider the BEP performance of six systems: fully-loaded
PMU-OFDM, fully-loaded OFDMA, half-loaded PMU-OFDM with symmetric
codewords, half-loaded PMU-OFDM with anti-symmetric codewords,
PMU-OFDM with symmetric codewords excluding the all-one codeword, and
half-loaded OFDMA.
Under the setting of N = 64, M = 16, L = 4, and $E_b/N_0$ = 30 dB, simulation
results on the BEP are shown in Fig. 5.39, where the BEP performance is
plotted as a function of the normalized Doppler frequency for the above systems.
For this example, we assume that perfect channel knowledge is available
at the receiver. A frequency equalizer can be used to compensate the symbol
distortion effect, i.e., the detected symbol x̂j [k] is multiplied by (Hj [k])−1 .
Thus, both systems only suffer from MAI and ICI.
We see that fully-loaded PMU-OFDM and fully-loaded OFDMA have
comparable performance for $f_D T_s \ge 10^{-4}$. For $f_D T_s < 10^{-4}$,
where $f_D T_s = 10^{-4}$ corresponds to a mobile speed of 54 km/h with
respect to our chosen parameters, OFDMA outperforms PMU-OFDM. This is
no surprise since, as we saw in previous sections of this chapter, OFDMA
is MAI-free if time or frequency asynchronism is negligible.

[Figure 5.39: bit error probability (log scale, 10^-4 to 10^0) versus
normalized Doppler frequency f_D T_s (0.5 x 10^-4 to 3 x 10^-4). Curves:
fully-loaded PMU-OFDM; fully-loaded OFDMA; half-loaded OFDMA; half-loaded
PMU-OFDM with even codes; half-loaded PMU-OFDM with odd codes; PMU-OFDM
with even codes, without all-one code.]
Fig. 5.39. The BEP comparison for PMU-OFDM and OFDMA as a function of
the normalized Doppler frequency [[125] © IEEE].
For half-loaded systems, we see that PMU-OFDM with anti-symmetric
codewords results in much better BEP performance than PMU-OFDM with
symmetric codewords and OFDMA in the Doppler environment with the
maximum normalized Doppler frequency ranging from $1 \times 10^{-5}$ to
$3 \times 10^{-4}$. This Doppler frequency range corresponds to a mobile
speed between 5.4 km/h and 162 km/h. As explained before, the poorer
performance of half-loaded PMU-OFDM with symmetric codewords is due to
the high ICI of the user with the all-one codeword. Thus, if we exclude the
all-one codeword, PMU-OFDM with seven symmetric codewords outperforms
all other systems as shown in Fig. 5.39.
Finally, we examine how PMU-OFDM and OFDMA perform when the
number of active users varies from 1 to T = M. The codeword priority scheme
for PMU-OFDM is stated below. When the system load is less than 50%, we
use codewords with a higher number of sign changes among the M/2
anti-symmetric codewords. When the system load is more than 50%, we choose
all M/2 anti-symmetric codewords plus a symmetric codeword pre-determined
to have a low average ICI value based on the results given in Table 5.3. The
procedure is repeated until all 16 codewords are selected. The BEP results
are plotted as functions of the user number for PMU-OFDM and OFDMA in
Fig. 5.40.
From Fig. 5.40, we see that PMU-OFDM significantly outperforms
OFDMA in a lightly loaded (50% or less) system. When the system reaches
full loading, the BEP performance of PMU-OFDM and OFDMA becomes
comparable.
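The first half of the codeword priority scheme can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: it assumes the Sylvester (Kronecker) ordering of the Hadamard-Walsh codewords and takes "anti-symmetric" to mean w[k] = -w[M-1-k]; the helper names are hypothetical. The ranking of the symmetric codewords depends on the ICI values of Table 5.3 and is therefore not reproduced.

```python
def hadamard(n):
    """Sylvester construction H_n = [[H, H], [H, -H]], n a power of 2."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def sign_changes(w):
    """Number of sign changes between adjacent chips of codeword w."""
    return sum(1 for a, b in zip(w, w[1:]) if a != b)


def is_antisymmetric(w):
    """Assumed definition: w[k] == -w[M-1-k] for every k."""
    return all(w[k] == -w[len(w) - 1 - k] for k in range(len(w)))


M = 16
H = hadamard(M)
codewords = [[H[k][i] for k in range(M)] for i in range(M)]  # columns of H

# Anti-symmetric codewords ranked by decreasing number of sign changes:
# under light load, the first entries of `priority` would be assigned first.
anti = [i for i in range(M) if is_antisymmetric(codewords[i])]
priority = sorted(anti, key=lambda i: sign_changes(codewords[i]), reverse=True)

for i in priority:
    print("codeword", i, "sign changes:", sign_changes(codewords[i]))
```

Under these assumptions, exactly M/2 = 8 codewords are anti-symmetric, and the alternating codeword (15 sign changes) receives the highest priority.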

5.5.6 Channel Estimation in Fast Time-Varying Channel

In the previous example, we assumed perfect channel knowledge at the
receiver. However, in practice, we only have access to estimates of the
channel coefficients.
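From Eq. (5.153), if $NM \gg vd$, then $e^{-j\frac{2\pi}{NM}vd} \approx 1$
and $\frac{1}{M}\sum_{v=0}^{M-1} e^{-j\frac{2\pi}{NM}vd} \approx 1$. Note that
since 0 ≤ v ≤ M − 1 and 0 ≤ d ≤ L − 1, $NM \gg vd$ is equivalent
to $N \gg L - 1$. Therefore, $H_j[k]$ can be written as

\[ H_j[k] \approx \frac{1}{NM} \sum_{n=0}^{NM-1} \sum_{d=0}^{L-1} g_j(n;d)\, e^{-j\frac{2\pi}{N}kd}. \tag{5.163} \]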

That is, for every time index n and any user j, we take the DFT of channel
coefficient $g_j(n;d)$ over path d with frequency index k. The result is then
averaged over one OFDM symbol to yield $H_j[k]$.

[Figure 5.40: bit error probability (log scale, 10^-5 to 10^-2) versus
number of users (2 to 16). Curves: OFDMA; PMU-OFDM.]
Fig. 5.40. The BEP performance comparison as a function of the user number
for PMU-OFDM and OFDMA, where the codeword priority scheme is adopted for
PMU-OFDM [[125] © IEEE].

When $N \gg L - 1$, another interpretation of $H_j[k]$ is


\[ H_j[k] \approx \sum_{d=0}^{L-1} g_j^{avg}(d)\, e^{-j\frac{2\pi}{N}kd}, \tag{5.164} \]

where $g_j^{avg}(d) = \frac{1}{NM}\sum_{n=0}^{NM-1} g_j(n;d)$ is the average of
the dth channel tap over one OFDM block. Therefore, $H_j[k]$ represents the
DFT of $g_j^{avg}(d)$. Often, the number of subcarriers is much larger than
the channel length, and $H_j[k]$ can be computed from Eq. (5.163) or (5.164).
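The equivalence of the two interpretations in Eqs. (5.163) and (5.164) follows from the linearity of the DFT, and can be checked numerically. The sketch below is illustrative only: the random tap values, the block length NM, and the function names are assumptions made for the demonstration.

```python
import cmath
import random


def gain_time_averaged_dft(g, n_sub, k):
    """Eq. (5.163): DFT over the paths d at every time index n, then average."""
    total = 0j
    for taps in g:
        for d, tap in enumerate(taps):
            total += tap * cmath.exp(-2j * cmath.pi * k * d / n_sub)
    return total / len(g)


def gain_dft_of_averaged_taps(g, n_sub, k):
    """Eq. (5.164): average every tap over the block, then take the DFT."""
    num_taps = len(g[0])
    g_avg = [sum(taps[d] for taps in g) / len(g) for d in range(num_taps)]
    return sum(g_avg[d] * cmath.exp(-2j * cmath.pi * k * d / n_sub)
               for d in range(num_taps))


random.seed(0)
N, M, L = 64, 16, 4
# g[n][d]: hypothetical time-varying tap d at time index n, 0 <= n <= NM - 1
g = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(L)]
     for _ in range(N * M)]

max_diff = max(abs(gain_time_averaged_dft(g, N, k) -
                   gain_dft_of_averaged_taps(g, N, k)) for k in range(N))
print("largest difference over all subchannels:", max_diff)
```

The two computations agree up to floating-point rounding; the second form needs only one L-point sum per subchannel once the averaged taps are available.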
In Example 5.19, we ideally mitigated the symbol distortion effect in the
receiver using a frequency domain equalizer whose one-tap gain for user j is
set to $(H_j[k])^{-1}$. In practice, we must obtain an estimate of
$(H_j[k])^{-1}$.
To this end, channel estimation for time-varying channels must be
performed. When the channel is time-varying within an OFDM block, the
preamble-based training method may not work well. Periodic insertion of
training symbols during transmission of every block has been suggested for
OFDM in time-varying channels. It was shown in [93] that the best set of
frequency domain pilot tones are those which are equally spaced. We adopt
this technique to estimate the fast fading channel. Let P be the number of
equally spaced pilot tones at subchannels Λ[k] = k × N/P, for 0 ≤ k ≤ P − 1.
An estimate of $H_j[k]$ can be obtained at pilot tones via

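\[ \hat{H}_j[\Lambda[k]] = \frac{\hat{x}_j[\Lambda[k]]}{x_j[\Lambda[k]]} + \frac{\mathrm{ICI}^{(0)}[\Lambda[k]] + \mathrm{ICI}^{(1)}[\Lambda[k]] + \hat{e}[\Lambda[k]]}{x_j[\Lambda[k]]}. \tag{5.165} \]

Then, the estimate of $g_j^{avg}(d)$ is obtained through an IDFT of length P as

\[ \hat{g}_j^{avg}(d) = \frac{1}{P} \sum_{k=0}^{P-1} \hat{H}_j[\Lambda[k]]\, e^{j\frac{2\pi dk}{P}}. \tag{5.166} \]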
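The IDFT step of Eq. (5.166) can be illustrated with a small noise-free sketch. It assumes ideal pilot estimates (no ICI and no noise term in Eq. (5.165)), hypothetical tap values, and P ≥ L; the function names are illustrative and not taken from the text.

```python
import cmath


def channel_gain(taps, n_sub, k):
    """H[k]: N-point DFT of the (averaged) channel taps at subchannel k."""
    return sum(tap * cmath.exp(-2j * cmath.pi * k * d / n_sub)
               for d, tap in enumerate(taps))


def estimate_taps(h_pilots):
    """Eq. (5.166): length-P IDFT of the channel estimates at the pilot tones."""
    p = len(h_pilots)
    return [sum(h_pilots[k] * cmath.exp(2j * cmath.pi * d * k / p)
                for k in range(p)) / p
            for d in range(p)]


N, P = 64, 8
g_avg = [0.9 + 0.1j, -0.3 + 0.4j, 0.2 - 0.2j, 0.05 + 0.1j]  # hypothetical, L = 4

pilot_tones = [k * N // P for k in range(P)]      # Lambda[k] = k * N / P
h_pilots = [channel_gain(g_avg, N, tone) for tone in pilot_tones]
g_hat = estimate_taps(h_pilots)

for d, value in enumerate(g_hat):
    print(d, value)
```

With equally spaced pilots, the first L outputs reproduce the taps and the remaining P − L outputs are zero. In a Doppler environment the ICI and noise terms of Eq. (5.165) perturb these values.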

More sophisticated algorithms were suggested to improve the channel
estimation performance in a Doppler environment. For example, two ICI
mitigation techniques were proposed in [90] to improve the channel estimation
performance in the presence of the Doppler effect. Other channel estimation
methods for OFDM in time-varying channels were reported in [73] and [71].

[Figure 5.41: bit error probability (log scale, 10^-4 to 10^0) versus
Eb/N0 (5 to 30). Curves: fully-loaded PMU-OFDM; fully-loaded OFDMA;
half-loaded OFDMA with channel estimation (ch. est.); half-loaded OFDMA;
half-loaded PMU-OFDM with even codes; half-loaded PMU-OFDM with odd codes
and ch. est.; half-loaded PMU-OFDM with odd codes; PMU-OFDM with even
codes, without all-one code.]
Fig. 5.41. The BEP comparison for PMU-OFDM and OFDMA as a function of
the SNR value [[125] © IEEE].

Example 5.21: Performance Comparison of PMU-OFDM to OFDMA
with Non-Perfect Channel Estimation in Doppler Environment
We adopt N = 64 and L = 4, so that the channel length is much shorter
than the number of subcarriers and $H_j[k]$ can be computed from Eq. (5.163)
or (5.164). We consider the same six scenarios given in Fig. 5.39 in
Example 5.19 and two more scenarios with channel estimation. The channel
estimation was performed using Eqs. (5.165) and (5.166). We used eight pilots
in every OFDM block. The BEP results of all eight scenarios are plotted in
Fig. 5.41 as a function of the SNR value, $E_b/N_0$, with the maximum
normalized Doppler frequency fixed at $10^{-4}$.
For fully-loaded cases, the BEP curves of PMU-OFDM and OFDMA are
close to each other. For half-loaded cases, the BEP curves of PMU-OFDM with
symmetric codewords and OFDMA are also close to each other, while
PMU-OFDM with anti-symmetric codewords outperforms the above four cases
considerably.
Excluding the all-one codeword, PMU-OFDM with seven symmetric
codewords performs much better than half-loaded PMU-OFDM with all eight
symmetric codewords due to the high ICI value of the user with the all-one
codeword. Also, we observe from Fig. 5.41 that the channel estimation penalty
is about 1.5–2.0 dB for $E_b/N_0 \le 20$ dB. However, the performance gap
between half-loaded PMU-OFDM with anti-symmetric codewords and half-loaded
OFDMA remains the same with realistic channel estimation.
6
MAI-Free MC-CDMA System

6.1 Introduction
Multicarrier CDMA (MC-CDMA) has been proposed as a promising
multiaccess technique. MC-CDMA systems can be divided into two types [49].
For the first type, one symbol is transmitted per time slot. The input symbol
is spread into several chips, which are then allocated to different subchannels.
The number of subchannels is equal to the number of chips [26, 155]. For the
second type, a vector of symbols is formed via the serial-to-parallel conversion,
and each symbol is spread into several chips. The chips corresponding to the
same symbol are allocated to the same subchannel [67], which is often called
MC-DS CDMA. When compared with conventional CDMA systems, MC-
CDMA can combat inter-symbol-interference (ISI) more effectively. Moreover,
the frequency diversity gain can be fully exploited if the maximum ratio comb-
ing (MRC) technique [49, 109] is used at the receiver in MC-CDMA systems.
Despite the above advantages, the performance of MC-CDMA systems is still
limited by MAI in a multipath environment. Even though MAI can be reduced
by MUD [144] and other signal processing [49] techniques, the diversity gain
provided by multipath channels could be sacrificed since the received chips
are no longer optimally combined under MRC. Furthermore, channel status
information is needed for MRC and MUD. In a multiuser environment, mul-
tiuser channel estimation is more complicated and its accuracy degrades as the
number of users increases, which will in turn degrade the system performance.
In this chapter, we approach the MAI reduction problem for MC-CDMA
systems from another angle. That is, we investigate a novel way to select a
set of “good” spreading codes so as to completely eliminate the MAI effect
while keeping the transceiver structure simple and the computational bur-
den low. Code design based on Hadamard-Walsh code is proposed to achieve
the MAI-free property in a synchronous MC-CDMA system [26, 155]. More
specifically, let N and L denote, respectively, the spreading factor and the
multipath length. The $N = 2^{n_s}$ Hadamard-Walsh codewords are partitioned
judiciously into G subsets, where $G = 2^{n_g}$ with $n_s > n_g \ge 1$ and G ≥ L.

C.-C.J. Kuo et al., Precoding Techniques for Digital Communication Systems,
DOI: 10.1007/978-0-387-71769-2_6, © Springer Science+Business Media, LLC 2008

Then we can obtain an MAI-free system and each user can fully exploit the
diversity gain provided by the multipath channel using any subset of code-
words in frequency-selective channels. The number of supportable MAI-free
users in each codeword subset is N/G. We also show a procedure to estimate
the channel information for individual users under an MAI-free environment.
Moreover, we consider the performance of the proposed Hadamard-Walsh code
based scheme in a carrier frequency offset (CFO) environment. It is shown
that this code scheme can reduce the CFO-induced MAI effect to a negligible
amount at the CFO levels of interest. Some Hadamard-Walsh codewords can
even achieve the MAI-free property in a CFO environment. Finally, based on
the theoretical requirements for MAI-free MC-CDMA, we propose a subset of
Hadamard-Walsh codes that are completely MAI-free even in the presence of
CFO. We show that, by partitioning those codewords into subsets, codewords
in some particular subsets will be MAI-free in a CFO environment and the
number of supportable MAI-free users with HW codes is $1 + \log_2(N/G)$.

6.2 System Model and Its Properties

The block diagram of the MC-CDMA system in the uplink direction is shown
in Fig. 6.1, where the desired signal path demonstrates a signal transmitted
by user i and detected at the receiver side for user i. On the other hand, we
call the signal transmitted by user i and detected for user j MAI. The system
transmits one data symbol in one time slot. Suppose that there are T users.
Let the symbol from user i be $x_i$. In the first stage, $x_i$ is spread by N
chips to form an N × 1 vector, denoted by $y_i$. Let the kth element of $y_i$
be $y_i[k]$. The relation between $y_i[k]$ and $x_i$ is given by

yi [k] = wi [k]xi , 0 ≤ k ≤ N − 1, (6.1)

where wi [k] is the kth element of the ith orthogonal code. Note that we con-
sider the short code scenario here, where the spreading code for a target
user is the same for any time slot. After spreading, yi is passed through the
N × N IDFT matrix. Then, the output is parallel-to-serial (P/S) converted
and the cyclic prefix (CP) of length L−1 is added to combat the inter-symbol-
interference (ISI), where L is the considered maximum delay spread.

[Figure 6.1: block diagram. The symbol x_i is spread by w_i[0], ...,
w_i[N−1] into chips y_i[0], ..., y_i[N−1], which pass through the IDFT and
the channel h_i(n). At the receiver, the DFT outputs ŷ[0], ..., ŷ[N−1] are
multiplied by w_i*[k] and λ_i*[k] to give ẑ_i[0], ..., ẑ_i[N−1], which are
summed to form x̂_i.]
Fig. 6.1. The block diagram of an MC-CDMA system [[136] © IEEE].

At the receiver side, the receiver removes CP and passes each block of size
N through the N × N DFT matrix. Since there are T users, the kth element
of the DFT output ŷ can be written as [10, 27]


\[ \hat{y}[k] = \sum_{j=0}^{T-1} \lambda_j[k]\, y_j[k] + e[k], \tag{6.2} \]

where λj [k] is the kth component of the N -point DFT of user j’s channel
impulse response, and e[k] is the received noise after DFT. Based on ŷ, we will
detect symbols for T users. As shown in Fig. 6.1, to detect symbols transmitted
by the ith user, ŷ is multiplied by wi∗ [k] and frequency gain λ∗i [k]. The channel
information hj (n) or λj [k] of every user is assumed to be known to the receiver.
The estimation of channel information under an MAI-free environment will
be described in Section 6.2.2. After being multiplied with the frequency gains,
N chips are summed up to form reconstructed symbol x̂i . Using Eqs. (6.1)
and (6.2), x̂i is given by


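\[ \hat{x}_i = \underbrace{x_i \sum_{k=0}^{N-1} |\lambda_i[k]|^2}_{\text{multipath effect}} + \sum_{j=0,\, j \ne i}^{T-1} \underbrace{x_j \sum_{k=0}^{N-1} \lambda_i^*[k]\, w_i^*[k]\, \lambda_j[k]\, w_j[k]}_{\mathrm{MAI}_{i \leftarrow j}} + \sum_{k=0}^{N-1} \lambda_i^*[k]\, w_i^*[k]\, e[k], \tag{6.3} \]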

where $\mathrm{MAI}_{i \leftarrow j}$ denotes the MAI from user j to user i.
Note that, when the channel noise e[k] is AWGN, the process from $\hat{y}[k]$
to $\hat{x}_i$ is called the maximum ratio combining (MRC) technique [49],
which ensures the minimum bit error probability for detected symbols [108],
or the maximum achievable diversity gain provided by multipath channels [105].
For any target user i, if $\mathrm{MAI}_{i \leftarrow j} = 0$, the
reconstructed symbol $\hat{x}_i$ will be affected only by his/her own
transmitted symbol $x_i$ and the corresponding channel response
$\lambda_i[k]$. Thus, this allows the system to use some simple detection
schemes without involving multiuser detection. When the channel has flat
fading, $\lambda_i[k]$ and $\lambda_j[k]$ are independent of k and
$\mathrm{MAI}_{i \leftarrow j} = 0$ if orthogonal codes such as the
Hadamard-Walsh codes are used. However, in practical situations, the channel
environment is usually frequency-selective and the orthogonality of
orthogonal codes will be lost under MRC.
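The contrast between flat and frequency-selective fading can be made concrete by evaluating the MAI term of Eq. (6.3) for one pair of codewords. The following is a minimal sketch with hypothetical real-valued channel taps and N = 8; the helper names are illustrative, and the symbol x_j is set to 1.

```python
import cmath


def hadamard(n):
    """Sylvester construction H_n = [[H, H], [H, -H]], n a power of 2."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def freq_response(taps, n):
    """lambda[k]: unnormalized N-point DFT of the channel impulse response."""
    return [sum(tap * cmath.exp(-2j * cmath.pi * k * d / n)
                for d, tap in enumerate(taps)) for k in range(n)]


def mai_coefficient(w_i, w_j, lam_i, lam_j):
    """MAI from user j to user i in Eq. (6.3), with x_j = 1 and real codes."""
    return sum(li.conjugate() * a * lj * b
               for li, a, lj, b in zip(lam_i, w_i, lam_j, w_j))


N = 8
H = hadamard(N)
w1 = [H[k][1] for k in range(N)]
w5 = [H[k][5] for k in range(N)]

# Flat fading (one tap per user): the codes stay orthogonal under MRC.
flat = mai_coefficient(w1, w5, freq_response([1.0], N), freq_response([0.8], N))

# Two-tap channels: lambda[k] varies with k and the orthogonality is lost.
selective = mai_coefficient(w1, w5, freq_response([1.0, 0.5], N),
                            freq_response([0.8, -0.3], N))

print("flat fading |MAI|:", abs(flat))
print("frequency-selective |MAI|:", abs(selective))
```

The codewords w1 and w5 are chosen on purpose from different halves of the Hadamard matrix, a pair for which the code design of the following sections does not promise zero MAI.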
The system model can also be used in downlink transmission, where the
signal for every user experiences the same fading. In this situation, the
MAI-free property can be achieved using ORC [49], i.e., the combining gain is
$\lambda^{-1}[k]$ instead of $\lambda^*[k]$ in Eq. (6.3) (the subscript
disappears since every user experiences the same channel in downlink
transmission). However, for subchannels with serious fading, ORC tends to
amplify the noise in these subchannels. Thus, the performance will degrade
dramatically. That is, the use of ORC may lead to the loss of the diversity
gain from multipath channels. In the following sections, we will design
$w_i[k]$ such that $\mathrm{MAI}_{i \leftarrow j} = 0$ under the multipath
environment. Moreover, the proposed code design allows MRC to be used in
both uplink and downlink transmissions. Thus, a full diversity gain from the
multipath channel can be achieved.

6.2.1 MAI Analysis over Frequency-Selective Fading

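Let F be the N × N DFT matrix with the element at the kth row and the nth
column given by $[\mathbf{F}]_{k,n} = \frac{1}{\sqrt{N}} e^{-j\frac{2\pi}{N}kn}$,
and let the maximum length of the channel impulse response be L, i.e.,
$h_i(n) = 0$ for n > L − 1. The MAI term in Eq. (6.3) can be expressed using
matrix representation as

\[ \mathrm{MAI}_{i \leftarrow j} = x_j\, \mathbf{h}_i^{\dagger} \underbrace{\mathbf{F}_0^{\dagger} \mathbf{W}_i^{*} \mathbf{W}_j \mathbf{F}_0}_{\mathbf{A}_{i,j}} \mathbf{h}_j, \tag{6.4} \]

where

\[ \mathbf{h}_i = \left( h_i(0) \;\; h_i(1) \;\; \cdots \;\; h_i(L-1) \right)^t, \qquad \mathbf{F}_0 = \mathbf{F} \begin{pmatrix} \mathbf{I}_L \\ \mathbf{0} \end{pmatrix}_{N \times L}, \]

and

\[ \mathbf{W}_i = \mathrm{diag}\left( w_i[0] \;\; \cdots \;\; w_i[N-1] \right). \]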
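To have zero MAI for a frequency-selective fading channel, we need to have
$\mathrm{MAI}_{i \leftarrow j} = 0$ for all nonzero $\mathbf{h}_i$ and
$\mathbf{h}_j$. This means that $\mathbf{A}_{i,j}$ in Eq. (6.4) should be the
L × L zero matrix for all i ≠ j. Define
$\mathbf{R}_{i,j} = \mathbf{W}_i^{*} \mathbf{W}_j$, where $\mathbf{R}_{i,j}$
is diagonal, i.e.,

\[ \mathbf{R}_{i,j} = \mathrm{diag}\left( r_{i,j}[0] \;\; \cdots \;\; r_{i,j}[N-1] \right), \]

with

\[ r_{i,j}[k] = w_i^{*}[k]\, w_j[k]. \]

We can rewrite $\mathbf{A}_{i,j}$ as

\[ \mathbf{A}_{i,j} = \begin{pmatrix} \mathbf{I}_L & \mathbf{0} \end{pmatrix} \underbrace{\mathbf{F}^{\dagger} \mathbf{R}_{i,j} \mathbf{F}}_{\mathbf{B}_{i,j}} \begin{pmatrix} \mathbf{I}_L \\ \mathbf{0} \end{pmatrix}. \tag{6.5} \]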

Since $\mathbf{R}_{i,j}$ is diagonal, it is well known that $\mathbf{B}_{i,j}$
is a circulant matrix [47]. That is, the first column of $\mathbf{B}_{i,j}$,
$(b_{i,j}(0) \cdots b_{i,j}(N-1))^t$, is the N-point IDFT of $\mathbf{r}_{i,j}$,
where $\mathbf{r}_{i,j} = (r_{i,j}[0] \cdots r_{i,j}[N-1])^t$. Matrix
$\mathbf{A}_{i,j}$ is the L × L upper left submatrix of $\mathbf{B}_{i,j}$, i.e.,

\[ \mathbf{A}_{i,j} = \begin{pmatrix} b_{i,j}(0) & b_{i,j}(N-1) & \cdots & b_{i,j}(N-L+1) \\ b_{i,j}(1) & b_{i,j}(0) & & \vdots \\ \vdots & & \ddots & \vdots \\ b_{i,j}(L-1) & \cdots & & b_{i,j}(0) \end{pmatrix}. \tag{6.6} \]

To have $\mathbf{A}_{i,j} = \mathbf{0}$ means that
$b_{i,j}(0) = \cdots = b_{i,j}(L-1) = 0$ and
$b_{i,j}(N-L+1) = \cdots = b_{i,j}(N-1) = 0$. That is, the first L samples and
the last L − 1 samples of the IDFT of $\mathbf{r}_{i,j}$ are zeros. Hence, to
achieve the MAI-free property for an arbitrary channel, the following
conditions should be satisfied:

\[ \begin{cases} b_{i,j}(n) = 0, & 0 \le n \le L-1, \\ b_{i,j}(N-n) = 0, & 1 \le n \le L-1. \end{cases} \tag{6.7} \]
Lemma 6.1: Suppose the channel length is L and the spreading gain is N.
To achieve the MAI-free property, N should be greater than or equal to 2L.
Proof: From Eq. (6.7), 2L − 1 samples of the IDFT of $\mathbf{r}_{i,j}$ must
be zero, so the codewords need at least 2L − 1 elements. However, if
N = 2L − 1, all N samples of the IDFT are zero, which forces
$\mathbf{r}_{i,j}$, and hence all elements of the codewords, to be zero.
Therefore, N ≥ 2L. □
Note that Lemma 6.1 holds for both real and complex code design. In what
follows, we show how to achieve the MAI-free conditions in Eq. (6.7) using the
Hadamard-Walsh codes. Before proceeding, let us recall a well known property
of the Hadamard matrix [8] as follows.
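An N × N Hadamard matrix $\mathbf{H}_N$ with $N = 2^p$, p = 1, 2, ..., can be
recursively defined using the Hadamard matrix of order 2, i.e.,

\[ \mathbf{H}_N = \mathbf{H}_2 \otimes \mathbf{H}_{N/2} = \begin{pmatrix} \mathbf{H}_{N/2} & \mathbf{H}_{N/2} \\ \mathbf{H}_{N/2} & -\mathbf{H}_{N/2} \end{pmatrix}, \tag{6.8} \]

where ⊗ is the Kronecker product [8, 54] and

\[ \mathbf{H}_2 = \begin{pmatrix} +1 & +1 \\ +1 & -1 \end{pmatrix}. \]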

Our proposed code scheme is stated below. Suppose $N = 2^{n_s}$ and
$G = 2^{n_g}$, where both $n_s$ and $n_g$ are integers, and $n_s > n_g \ge 1$.
The columns of an N × N Hadamard matrix $\mathbf{H}_N$ form the N
Hadamard-Walsh codes. We divide the N codewords into G subsets. Each subset
has N/G codewords. That is, the gth subset, denoted by $\mathcal{G}_g$, has
codewords $\{ \mathbf{w}_{\frac{N}{G}g}, \cdots, \mathbf{w}_{\frac{N}{G}(g+1)-1} \}$,
where $\mathbf{w}_i$ is the ith column of $\mathbf{H}_N$ and 0 ≤ g ≤ G − 1.
For instance, let N = 8 and G = 2. Then, $\mathcal{G}_0$ contains codewords
$\{\mathbf{w}_0, \mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}$ and
$\mathcal{G}_1$ contains codewords
$\{\mathbf{w}_4, \mathbf{w}_5, \mathbf{w}_6, \mathbf{w}_7\}$.
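The recursive construction of Eq. (6.8) and the partition into subsets can be written out as a short Python sketch. The function names `hadamard` and `partition` are illustrative only; the example reproduces the N = 8, G = 2 case given above.

```python
def hadamard(n):
    """Eq. (6.8): H_n = [[H_{n/2}, H_{n/2}], [H_{n/2}, -H_{n/2}]], n a power of 2."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def partition(n, g):
    """Codeword indices of the subsets G_0, ..., G_{g-1}, each of size n/g."""
    size = n // g
    return [list(range(s * size, (s + 1) * size)) for s in range(g)]


N, G = 8, 2
H8 = hadamard(N)
subsets = partition(N, G)

for row in H8:
    print(" ".join(f"{v:+d}" for v in row))
print(subsets)
```

Column i of `H8` is the codeword w_i. Since the matrix is symmetric, rows and columns can be used interchangeably.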
Lemma 6.2: Let $\mathbf{r}_{i,j}$ be an N × 1 vector whose kth element is
$r_{i,j}[k] = w_i^{*}[k]\, w_j[k]$. For distinct $\mathbf{w}_i$ and
$\mathbf{w}_j$ that belong to the same subset, $\mathbf{r}_{i,j}$ is equal to
one of the codewords in $\mathcal{G}_0$ excluding codeword $\mathbf{w}_0$.
Proof: Let us first prove that for $\mathbf{w}_i$ and
$\mathbf{w}_j \in \mathcal{G}_0$, $\mathbf{r}_{i,j}$ is again a codeword
within $\mathcal{G}_0$. According to Eq. (6.8), the N/G × N/G upper left
submatrix of $\mathbf{H}_N$ is an N/G × N/G Hadamard matrix. Thus, the
product of any two columns of this submatrix is again a column of this
submatrix (see [65]). The codewords in subset 0 are the first N/G columns of
$\mathbf{H}_N$, which are obtained by repeating the N/G × N/G submatrix G
times. Hence, for $\mathbf{w}_i$ and $\mathbf{w}_j$ in subset 0,
$\mathbf{r}_{i,j}$ is a codeword in subset 0.
Now, let us consider ri,j for wi and wj that are in the same subset other
than subset 0. Recall that wi [k] is the kth element of the ith codeword. It can
also be used to denote the kth element of the ith column of HN . According
to Eq. (6.8), for 0 ≤ i ≤ N/2 − 1, we have the following property

\[ \begin{cases} w_i[k] = w_{i+N/2}[k], & 0 \le k \le N/2 - 1, \\ w_i[k] = -w_{i+N/2}[k], & N/2 \le k \le N - 1. \end{cases} \tag{6.9} \]

We see from Eq. (6.9) that the product of any two columns in the last half
N/2 columns is equal to that of the two corresponding columns in the first
half N/2 columns, i.e.,

wi [k]wj [k] = wi+N/2 [k]wj+N/2 [k], 0 ≤ i, j ≤ N/2 − 1. (6.10)

Suppose that we divide the N codewords into two sets, denoted by $S_0$ and
$S_1$, respectively. The first N/2 codewords form $S_0$ while the last N/2
codewords form $S_1$. As proved at the beginning of the proof, for
$\mathbf{w}_i$ and $\mathbf{w}_j$ in $S_0$, $\mathbf{r}_{i,j}$ is again a
codeword in $S_0$. For $\mathbf{w}_i$ and $\mathbf{w}_j$ in $S_1$,
$\mathbf{r}_{i,j}$ is equal to a codeword in $S_0$ based on Eq. (6.10). Using
a similar procedure, we can divide $S_0$ into two sets, $S_{00}$ and $S_{01}$.
Thus, for $\mathbf{w}_i$ and $\mathbf{w}_j$ in $S_{00}$, $\mathbf{r}_{i,j}$ is
a codeword in $S_{00}$. Now, we prove that for $\mathbf{w}_i$ and
$\mathbf{w}_j$ in $S_{01}$, $\mathbf{r}_{i,j}$ is a codeword in $S_{00}$. From
Eq. (6.8), for 0 ≤ i ≤ N/4 − 1, we have the following property

\[ \begin{cases} w_i[k] = w_{i+N/4}[k], & 0 \le k \le N/4 - 1 \text{ or } N/2 \le k \le 3N/4 - 1, \\ w_i[k] = -w_{i+N/4}[k], & N/4 \le k \le N/2 - 1 \text{ or } 3N/4 \le k \le N - 1. \end{cases} \tag{6.11} \]
We see from Eq. (6.11) that the product of any two columns in the second
quarter is equal to the product of the two corresponding columns in the first
quarter, i.e.,

wi [k]wj [k] = wi+N/4 [k]wj+N/4 [k], 0 ≤ i, j ≤ N/4 − 1. (6.12)

From Eq. (6.12), for $\mathbf{w}_i$ and $\mathbf{w}_j$ in $S_{01}$,
$\mathbf{r}_{i,j}$ is again a codeword in $S_{00}$. Similarly, we can divide
$S_1$ into two sets, i.e., $S_{10}$ and $S_{11}$, and show that for
$\mathbf{w}_i$ and $\mathbf{w}_j$ in either $S_{10}$ or $S_{11}$,
$\mathbf{r}_{i,j}$ is again a codeword in $S_{00}$. Using the same procedure,
we can continue to divide the codewords until we have G subsets, and show
that for $\mathbf{w}_i$ and $\mathbf{w}_j$ in the same subset,
$\mathbf{r}_{i,j}$ is a codeword of subset 0. □
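Lemma 6.2 can be verified by brute force for a small case. The sketch below uses N = 16 and G = 4 (an illustrative choice) and relies on the Sylvester ordering of Eq. (6.8); the helper names are hypothetical. It also checks a well-known property of this ordering that is not stated in the text: the chip-wise product of columns i and j is column i XOR j.

```python
def hadamard(n):
    """Sylvester construction of Eq. (6.8), n a power of 2."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


N, G = 16, 4
size = N // G
H = hadamard(N)
codewords = [[H[k][i] for k in range(N)] for i in range(N)]  # w_i = column i


def product_index(i, j):
    """Index of the codeword equal to the chip-wise product r_{i,j} = w_i * w_j."""
    r = [a * b for a, b in zip(codewords[i], codewords[j])]
    return codewords.index(r)


# Every pair of distinct codewords taken from the same subset.
pairs = [(i, j) for s in range(G)
         for i in range(s * size, (s + 1) * size)
         for j in range(s * size, (s + 1) * size) if i != j]

lemma_holds = all(1 <= product_index(i, j) <= size - 1 for i, j in pairs)
xor_rule_holds = all(product_index(i, j) == (i ^ j) for i, j in pairs)
print(lemma_holds, xor_rule_holds)
```

Because two indices in the same subset share their leading bits, i XOR j always falls between 1 and N/G − 1, which is the content of the lemma.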
Lemma 6.3: Let $\tilde{w}_i(n)$, 0 ≤ n ≤ N − 1 and 1 ≤ i ≤ N/G − 1, be the
N-point IDFT of the codewords in $\mathcal{G}_0$ excluding $\mathbf{w}_0$.
Then, $\tilde{w}_i(n)$ has the following property:

\[ \begin{cases} \tilde{w}_i(n) = 0, & 0 \le n \le G-1, \\ \tilde{w}_i(N-n) = 0, & 1 \le n \le G-1. \end{cases} \tag{6.13} \]
Proof: For n = 0, it is easy to see that
$\tilde{w}_i(0) = \frac{1}{N}\sum_{k=0}^{N-1} w_i[k] = 0$ since there are an
equal number of +1 and −1 for any codeword except $\mathbf{w}_0$. For n ≠ 0,
since $\tilde{w}_i(n)$ is the IDFT of the codewords in $\mathcal{G}_0$, we have

\[ \tilde{w}_i(n) = \frac{1}{N} \sum_{m=0}^{N-1} w_i[m]\, e^{j\frac{2\pi}{N}mn}. \tag{6.14} \]

Let m = k + gN/G, 0 ≤ k ≤ N/G − 1, 0 ≤ g ≤ G − 1. We can rewrite Eq. (6.14)
as

\[ \tilde{w}_i(n) = \frac{1}{N} \sum_{g=0}^{G-1} \sum_{k=0}^{N/G-1} w_i[k + gN/G]\, e^{j\frac{2\pi}{N}(k + gN/G)n}. \tag{6.15} \]

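Since codewords $w_i[k]$ in $\mathcal{G}_0$ are the first N/G columns of
$\mathbf{H}_N$, they are formed by repeating the upper left N/G × N/G
submatrix of $\mathbf{H}_N$ G times. Hence, $w_i[k] = w_i[k + gN/G]$,
0 ≤ k ≤ N/G − 1, 0 ≤ g ≤ G − 1. Noting that
$e^{j\frac{2\pi}{N}(k + gN/G)n} = e^{j\frac{2\pi}{N}kn}\, e^{j\frac{2\pi}{G}gn}$,
we can rewrite Eq. (6.15) as

\[ \tilde{w}_i(n) = \frac{1}{N} \sum_{k=0}^{N/G-1} w_i[k]\, e^{j\frac{2\pi}{N}kn}\, a_n, \tag{6.16} \]

where

\[ a_n = \sum_{g=0}^{G-1} e^{j\frac{2\pi}{G}gn} = \begin{cases} G, & n = cG \text{ with } c = 0, \pm 1, \pm 2, \cdots, \\ 0, & \text{otherwise.} \end{cases} \]

Therefore, we obtain

\[ \tilde{w}_i(n) = \begin{cases} \frac{G}{N} \sum_{k=0}^{N/G-1} w_i[k]\, e^{j\frac{2\pi}{N}kn}, & n = cG \text{ with } c = 0, \pm 1, \pm 2, \cdots, \\ 0, & \text{otherwise.} \end{cases} \tag{6.17} \]

From Eq. (6.17) and $\tilde{w}_i(0) = 0$, we are led to Eq. (6.13). □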
From Eq. (6.13) and Lemma 6.2, we have the following property

\[ \begin{cases} b_{i,j}(n) = 0, & 0 \le n \le G-1, \\ b_{i,j}(N-n) = 0, & 1 \le n \le G-1, \end{cases} \tag{6.18} \]

where $b_{i,j}(n)$ denotes the nth element of the IDFT of $\mathbf{r}_{i,j}$
within the same subset.
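The zero pattern of Eq. (6.13) can be confirmed numerically. The sketch below takes N = 16 and G = 4 as an illustrative case, computes the IDFT of the codewords in the first subset (excluding the all-one codeword), and also shows a codeword from outside that subset which violates the pattern; the helper names are hypothetical.

```python
import cmath


def hadamard(n):
    """Sylvester construction of Eq. (6.8), n a power of 2."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def idft(x):
    """N-point IDFT with the 1/N normalization, as in Eq. (6.14)."""
    n = len(x)
    return [sum(x[m] * cmath.exp(2j * cmath.pi * m * k / n) for m in range(n)) / n
            for k in range(n)]


N, G = 16, 4
H = hadamard(N)
codewords = [[H[k][i] for k in range(N)] for i in range(N)]

# Positions that Eq. (6.13) requires to be zero: n = 0..G-1 and n = N-G+1..N-1.
zero_positions = list(range(G)) + [N - n for n in range(1, G)]


def has_zero_pattern(i, tol=1e-9):
    w_tilde = idft(codewords[i])
    return all(abs(w_tilde[n]) < tol for n in zero_positions)


in_subset_0 = [has_zero_pattern(i) for i in range(1, N // G)]   # w_1, w_2, w_3
outside = has_zero_pattern(N // G)                              # w_4, not in subset 0
print(in_subset_0, outside)
```

Combined with Lemma 6.2, the same zero pattern applies to the IDFT of every product r_{i,j} formed within one subset, which is Eq. (6.18).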

Example 6.1: DFT of Hadamard-Walsh Codewords

Let us give an example to illustrate Lemma 6.3. Let N = 16. The N-point
DFTs of the N Hadamard-Walsh codewords are shown in Fig. 6.2. From this
figure, we see that, except for the all-one codeword, the IDFT of any
codeword has a zero at n = 0. If G = 8, we have eight subsets and each
subset has two codewords. From Lemma 6.2, for $\mathbf{w}_i$ and
$\mathbf{w}_j$ in the same subset, $\mathbf{r}_{i,j}$ is equal to
$\mathbf{w}_1$. From the figure, the first eight elements of
$\tilde{w}_1(n)$ are zeros. If G = 4, then we have four subsets and each
subset has four codewords. Again from Lemma 6.2, for $\mathbf{w}_i$ and
$\mathbf{w}_j$ in the same subset, $\mathbf{r}_{i,j}$ is equal to either
$\mathbf{w}_1$, $\mathbf{w}_2$, or $\mathbf{w}_3$. From the figure, the first
four elements of $\tilde{w}_1(n)$, $\tilde{w}_2(n)$, and $\tilde{w}_3(n)$ are
zeros. □

[Figure 6.2: 3-D plot of magnitude versus user index i and time index n.]
Fig. 6.2. $|\tilde{w}_i(n)|$ as a function of user index i and time index n
with N = 16 [[136] © IEEE].

Based on the above discussion, we have established one of the main results
of this work as stated below.

Theorem 6.1: Let the channel length be L. We divide the N Hadamard-Walsh
codewords into G subsets with G ≥ L, where N and G are powers of 2 and each
subset consists of N/G codewords. Then, using any one of the G subsets of
codewords, the corresponding MC-CDMA system is completely MAI-free.
Note that Theorem 6.1 holds for arbitrary multipath coefficients. Moreover,
the maximum number of MAI-free users, T, in each subset depends on the
spreading gain N and multipath length L. Hence, the system can be designed
accordingly. Different applications may have different concerns. We
describe two application scenarios below.

Application Scenario 1. In cellular systems, frequency reuse for different
cells is an important issue since improper frequency reuse will lead to
significant co-channel interference [108]. The proposed scheme divides the
codewords into several subsets to achieve the MAI-free property. It is
intuitive to use distinct subsets of codewords in neighboring cells to reduce
co-channel interference. Let us give an example to illustrate this point.
Let G = L = 4. Thus, the orthogonal codes are divided into 4 subsets, i.e.,
subsets 0, 1, 2 and 3. Figure 6.3 gives an example of frequency planning
using the proposed code scheme. For a larger L, G should be increased
accordingly to be MAI-free. In this situation, we have more subsets, and the
distance among the same subset in reuse can be increased to reduce
co-channel interference further.

Application Scenario 2. In wireless local area network (WLAN) applications,
the distance among cells is not as close as that in cellular systems.
Hence, co-channel interference may not be a major concern. According to
Theorem 6.1, the maximum number of users that a cell can support while
maintaining the MAI-free property depends on N/G and hence the multipath
number L. Thus, a smaller value of L or G enables the system to support more
users in one cell. In this situation, N should be much larger than L to
support more users. For a fixed sampling frequency, this can be done by
increasing the OFDM-block duration. Hence, if the complexity is ignored, N
can be as large as possible as long as the duration of one block does not
exceed the channel coherence time. Generally speaking, this concept stands
in contrast with that in a conventional MC-CDMA system, where N should be
chosen to be close to L so that subchannels have less correlation and a more
random signature waveform. However, as stated in Theorem 6.1, when the
proposed code design is used, the system is completely MAI-free so that we
can choose N that is much larger than L to support more users in WLAN
applications.

[Figure 6.3: layout of cells, each labeled with one of the codeword groups
0, 1, 2 and 3.]
Fig. 6.3. An example of frequency reuse using the proposed code scheme with
G = L = 4 [[136] © IEEE].

Example 6.2: Illustration of the MAI-Free Property

In this example, we show that MC-CDMA is MAI-free with the proposed
code scheme. We consider the performance in the uplink direction. Also,
the Hadamard-Walsh codewords are generated using the Kronecker product
in Eq. (6.8), and the codeword indices follow this ordering. The
simulation was conducted under the following setting: N = 64, G = L = 2
or 4. The transmit power had a unit variance. The taps of the channel were
i.i.d. random variables with a unit variance. We evaluate
$\mathrm{MAI}_{i \leftarrow j}$ as given in Eq. (6.3). For L = 2, one
realization of $|\mathrm{MAI}_{i \leftarrow j}|$ as a function of user
indices i and j is shown in Fig. 6.4(a). As shown in the figure, there are
two zones where the MAI is zero, i.e., the zone with codewords from 1 to 32,
and the zone with codewords from 33 to 64. The peak values appear along
the diagonal since they correspond to the reconstructed desired signal power
for each user. Thus, the system is MAI-free if either one of the two subsets
of codewords is in use. For L = 4, the performance is shown in Fig. 6.4(b).
We see four zones where the MAI is equal to zero. Hence, if we use any
one of these four subsets, we can achieve an MAI-free system. These results
corroborate the claim in Theorem 6.1. □
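An experiment of the same kind can be reproduced at a smaller scale with the sketch below. It is not the simulation used for Fig. 6.4: it assumes N = 16 and G = L = 4 to keep the run time short, draws complex Gaussian taps with a fixed seed, sets every symbol x_j to 1, and uses hypothetical helper names.

```python
import cmath
import random


def hadamard(n):
    """Sylvester construction of Eq. (6.8), n a power of 2."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def freq_response(taps, n):
    """lambda[k]: unnormalized N-point DFT of the channel impulse response."""
    return [sum(tap * cmath.exp(-2j * cmath.pi * k * d / n)
                for d, tap in enumerate(taps)) for k in range(n)]


random.seed(0)
N, G, L = 16, 4, 4
size = N // G
H = hadamard(N)
codewords = [[H[k][i] for k in range(N)] for i in range(N)]

# One independent L-tap uplink channel per user (assumed complex Gaussian taps).
channels = [[complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(L)]
            for _ in range(N)]
lam = [freq_response(taps, N) for taps in channels]


def mai_magnitude(i, j):
    """|MAI| from user j to user i in Eq. (6.3), with x_j = 1."""
    return abs(sum(lam[i][k].conjugate() * codewords[i][k] *
                   lam[j][k] * codewords[j][k] for k in range(N)))


within = max(mai_magnitude(i, j) for i in range(N) for j in range(N)
             if i != j and i // size == j // size)
across = max(mai_magnitude(i, j) for i in range(N) for j in range(N)
             if i // size != j // size)
print("largest |MAI| inside a subset:", within)
print("largest |MAI| between subsets:", across)
```

Inside each subset the MAI vanishes up to rounding error, while pairs drawn from different subsets generally interfere, which mirrors the zero zones described for Fig. 6.4.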

Example 6.3: Illustration of the Diversity Gain

In this example, we would like to show that, when the proposed code scheme is
used, every user can achieve a low bit error probability that reflects the
diversity gain L. The BPSK modulation and Hadamard-Walsh codes of N = 16 were
adopted. The uplink channel coefficients were i.i.d. complex Gaussian random
variables of unit variance. For each individual user, the Monte Carlo method
was run for more than 250,000 symbols. The bit error probabilities of two
systems are shown in Fig. 6.5. The solid curve is obtained from a system
with flat fading, i.e., L = 1, with all N codewords used. The dashed curve
results from a system of multipath length L = 2 with the proposed
N/2 Hadamard-Walsh codes in $\mathcal{G}_0$. Since there is no MAI,
simulation results are consistent with the theoretical results in [3] and
[105].
We see that, when L grows from 1 to 2 with the proposed code scheme,
the bit error probability improves dramatically due to the increase of the
diversity order. Actually, the dashed curve is the same as that for a system
with L = 1 and two receive antennas with MRC [3]. That is, a diversity order
of 2 is achieved via code design in the frequency domain rather than the space
domain (see p. 777 in [105]). This example also explains the interplay between
the diversity order and the number of users allowed. That is, when L grows,
we need to divide N codewords into more subsets to achieve the MAI-free
property. Hence, fewer users can be supported within each cell. However,
these users can enjoy a higher diversity order as L increases.
Note that frequency diversity is inherent in MC-CDMA systems. However,
without a proper code design, the system has MAI that will degrade the BEP
performance as the number of users increases. Under this situation, MAI will
dominate system performance and increasing diversity gain alone may not
necessarily improve overall performance [49]. If the proposed code design is
used together with $M_r$ receive antennas, a diversity order of $LM_r$ can be
achieved for each individual user. □

[Figure 6.4: two 3-D plots of the magnitude of MAI_{i←j} versus user indices
i and j (0 to 60); panels (a) and (b).]
Fig. 6.4. $|\mathrm{MAI}_{i \leftarrow j}|$ as a function of user indices i
and j with N = 64: (a) L = 2 and (b) L = 4 [[136] © IEEE].

[Figure 6.5: bit error probability (log scale, 10^-4 to 10^-1) versus Eb/N0
(0 to 30). Curves: L = 1 with all N codewords; L = 2 with the first N/2
codewords.]
Fig. 6.5. The bit error probability as a function of $E_b/N_0$ to illustrate
the diversity order of the proposed code scheme [[136] © IEEE].

6.2.2 Channel Estimation Under MAI-Free Condition

In the last section, we assumed that the channel information $\lambda_i[k]$
for every user is known to the receiver. Without accurate channel
information, neither ORC nor MRC can be performed at the receiver end. For
non-MAI-free schemes, channel information is needed for the MUD-based
technique in the receiver. If channel information is not available, it has to
be estimated by some techniques [132]. For uplink transmission, every user
experiences a different fading. Thus, multiuser channel estimation is
required if the system is not MAI-free. For downlink transmission, although
the mixed signal of all users from the base station experiences the same
channel fading, orthogonality of users' codes may be destroyed as a result of
frequency-selective fading. Unless the base station uses the same training
sequence $x_i$ and spreading code $w_i[k]$ for every user at the same time
slot, it would be difficult for an individual user to acquire his/her own
downlink channel information without extra signal processing techniques.
However, this reduces the system flexibility since all users have to be
coordinated for training with the same signature waveform at the same
time slot. Thus, it is desirable to design a system where channel estimation
is conducted under an MAI-free environment. In this section, we will show
that the channel information can be obtained in an MAI-free environment
if the proposed code scheme is used. Thus, there is no need to do multiuser
estimation in the uplink direction and the training procedure is more flexible
in the downlink transmission.
To get λi [k] is equivalent to obtaining its time domain impulse response
hi (n), 0 ≤ n ≤ L − 1. We will show how to obtain every user’s hi (n) without
worrying about MAI. Again, the result derived here is for the more general
uplink case. It can be adapted for the downlink case as well. Referring to
Fig. 6.1 and from Eq. (6.2), if the real Hadamard-Walsh code is used, the
N × 1 chip vector of user i before gain combining is


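$$\hat{\mathbf{z}}_i = x_i \mathbf{F}_0 \mathbf{h}_i + \sum_{j=0,\, j\neq i}^{T-1} x_j \mathbf{W}_i \mathbf{W}_j \mathbf{F}_0 \mathbf{h}_j + \mathbf{W}_i \mathbf{e}, \tag{6.19}$$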

where e is the noise vector after DFT. Taking the N -point IDFT of ẑi in
Eq. (6.19), we have

  
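$$\mathbf{F}^{\dagger}\hat{\mathbf{z}}_i = x_i \begin{pmatrix}\mathbf{I}_L\\ \mathbf{0}\end{pmatrix}\mathbf{h}_i + \sum_{j=0,\, j\neq i}^{T-1} x_j \underbrace{\mathbf{F}^{\dagger}\mathbf{W}_i\mathbf{W}_j\mathbf{F}_0\mathbf{h}_j}_{\mathbf{c}_{i,j}} + \mathbf{F}^{\dagger}\mathbf{W}_i\mathbf{e}, \tag{6.20}$$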

where the second term is the interference term from other users. Since the
channel path is of length L, if the first L elements of ci,j are zeros for all hj ,
we can obtain channel hi without worrying about the interference from other
users.

Theorem 6.2: Suppose that the channel length is equal to L and the code
scheme as stated in Theorem 6.1 is used, where G ≥ L. Then, if we use any
one subset of codewords in the MC-CDMA system, the first L elements of
ci,j are zeros. As a result, we can estimate the channel hi in a completely
MAI-free environment. That is,

zi (n) = xi hi (n) + ẽi (n), 0 ≤ n ≤ L − 1, (6.21)

where zi (n) is the nth element of F† ẑi and ẽi (n) is the nth element of F† Wi e.

Proof: Let us express the DFT matrix F as ( F0 F1 ). Then, ci,j in Eq. (6.20)
can be manipulated as
   
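$$\mathbf{c}_{i,j} = \begin{pmatrix}\mathbf{F}_0^{\dagger}\\ \mathbf{F}_1^{\dagger}\end{pmatrix}\mathbf{W}_i\mathbf{W}_j\mathbf{F}_0\mathbf{h}_j = \begin{pmatrix}\mathbf{F}_0^{\dagger}\mathbf{R}_{i,j}\mathbf{F}_0\\ \mathbf{F}_1^{\dagger}\mathbf{R}_{i,j}\mathbf{F}_0\end{pmatrix}\mathbf{h}_j. \tag{6.22}$$
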
From the discussion in Section 6.2.1, F†0 Ri,j F0 = 0 if any one subset of
codewords are used. Hence, the first L elements of ci,j are zeros, and we get
Eq. (6.21). 
According to Eq. (6.21), if xi is a known training symbol, we can obtain
hi (n), 0 ≤ n ≤ L−1, without worrying about the interference from symbols of
other users. That is, channel estimation can be done in a completely MAI-free
environment.
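
The estimation step in Theorem 6.2 can be illustrated with a small noise-free sketch. It assumes N = 16, L = G = 2, a Sylvester-ordered Hadamard-Walsh matrix whose first N/G columns form G0, BPSK training symbols, and an unnormalized DFT (so the IDFT is divided by N rather than using the unitary F of the text). The names `hadamard` and `estimate_channels` are illustrative and do not come from the source.

```python
import cmath
import random


def hadamard(n):
    """Sylvester-ordered Hadamard-Walsh matrix of size n (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def estimate_channels(N=16, L=2, users=range(8), seed=0):
    """Return (true, estimated) channel taps for every active user, noise-free."""
    rng = random.Random(seed)
    H = hadamard(N)
    w = {j: H[j] for j in users}  # codeword w_j (H is symmetric)
    h = {j: [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(L)]
         for j in users}
    x = {j: rng.choice([1, -1]) for j in users}  # known training symbols
    # Frequency response lambda_j[k] of each user's L-tap channel.
    lam = {j: [sum(h[j][n] * cmath.exp(-2j * cmath.pi * n * k / N)
                   for n in range(L)) for k in range(N)] for j in users}
    # Received chip on subcarrier k: superposition of all active users.
    y = [sum(lam[j][k] * w[j][k] * x[j] for j in users) for k in range(N)]
    est = {}
    for i in users:
        z_hat = [w[i][k] * y[k] for k in range(N)]  # multiply by w_i
        # First L IDFT samples, divided by the known training symbol x_i.
        est[i] = [sum(z_hat[k] * cmath.exp(2j * cmath.pi * n * k / N)
                      for k in range(N)) / (N * x[i]) for n in range(L)]
    return h, est


true_taps, estimated_taps = estimate_channels()
```

With all eight users of G0 transmitting simultaneously, each user's first L IDFT samples reproduce its own taps up to rounding error, which is the content of Eq. (6.21) without the noise term.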

Discussion on System Parameters and Performance Tradeoff. From
the discussion above, when the number of users increases, we may increase
the spreading gain N or decrease the number of partitioned subsets G to
accommodate more users. The dynamic adjustment of parameters N and G
is an interesting problem, which is under our current investigation. When the
system is heavily loaded in the sense that the number of active users is
approaching N/L, the proposed code design provides a set of optimal codes for
the system in terms of MAI reduction and multipath diversity.
Another tradeoff results from the change of the multipath length L. Under
the condition G = L, if L becomes larger (or smaller), the number of allowed
users decreases (or increases). For a fixed N , since the diversity gain of a user
is equal to L, there exists a tradeoff between the number of users and the
diversity gain [105].
Finally, it is interesting to examine the case where the number of active
users exceeds N/L. Under this scenario, to get an MAI-free system, MUD can
be used in the uplink direction while the ORC scheme [49] can be performed
in the downlink direction with the penalty that the full diversity gain may be
lost. It is worthwhile to emphasize that, since the Hadamard-Walsh code
inherently has a group MAI-free property in an MC-CDMA system, the
complexity of MUD in this case can be greatly reduced.

6.2.3 Proposed Code Design in the Presence of CFO

In this section, we consider the CFO effect and show that it can be han-
dled by the use of the proposed code design. In particular, we show that the
MAI due to the CFO effect can be reduced to zero or a negligible amount.
Consider the kth chip of the received vector after DFT in a CFO environ-
ment, i.e.,

$$\hat{y}[k] = \sum_{j=0}^{T-1} r_j[k] + e[k], \tag{6.23}$$

where e[k] is the received noise after DFT, and rj [k] is the received signal
due to channel fading and the CFO effect. Suppose the jth user has a
normalized CFO εj , which is the actual CFO normalized by 1/N of the
overall bandwidth, with −0.5 ≤ εj ≤ 0.5. Then rj [k] in Eq. (6.23) can be
expressed by [87]:
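$$r_j[k] = \frac{1}{\sqrt{N}}\sum_{n=0}^{N-1}\left\{\frac{1}{\sqrt{N}}\sum_{m=0}^{N-1}\lambda_j[m]\,y_j[m]\,e^{j\frac{2\pi}{N}nm}\right\} e^{j\frac{2\pi}{N}n\epsilon_j}\,e^{-j\frac{2\pi}{N}nk}$$

$$\phantom{r_j[k]} = \underbrace{\alpha_j\lambda_j[k]\,y_j[k]}_{r_j^{(0)}[k]} + \underbrace{\beta_j\sum_{m=0,\,m\neq k}^{N-1}\lambda_j[m]\,y_j[m]\,\frac{e^{-j\pi\frac{m-k}{N}}}{N\sin\frac{\pi(m-k+\epsilon_j)}{N}}}_{r_j^{(1)}[k]}, \tag{6.24}$$

where αj and βj are given by

$$\alpha_j = \frac{\sin(\pi\epsilon_j)}{N\sin\frac{\pi\epsilon_j}{N}}\,e^{j\pi\epsilon_j\frac{N-1}{N}} \quad\text{and}\quad \beta_j = \sin(\pi\epsilon_j)\,e^{j\pi\epsilon_j\frac{N-1}{N}}. \tag{6.25}$$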
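
As a numerical cross-check of Eqs. (6.24) and (6.25), the sketch below compares the closed-form coefficients with the direct sum over the time index n. The function names and the test values N = 16 and ε = 0.1 are assumptions made for illustration only.

```python
import cmath
import math


def cfo_weight_direct(d, eps, N):
    """(1/N) * sum_n exp(j*2*pi*n*(d + eps)/N): weight of subcarrier m = k + d in r_j[k]."""
    return sum(cmath.exp(2j * math.pi * n * (d + eps) / N) for n in range(N)) / N


def alpha(eps, N):
    """Scaling of the desired chip, first expression in Eq. (6.25)."""
    if eps == 0:
        return 1.0 + 0j
    return (math.sin(math.pi * eps) / (N * math.sin(math.pi * eps / N))
            * cmath.exp(1j * math.pi * eps * (N - 1) / N))


def beta(eps, N):
    """Common factor of the ICI terms, second expression in Eq. (6.25)."""
    return math.sin(math.pi * eps) * cmath.exp(1j * math.pi * eps * (N - 1) / N)


def ici_weight(d, eps, N):
    """Closed-form ICI weight for d = m - k != 0, second term of Eq. (6.24)."""
    return (beta(eps, N) * cmath.exp(-1j * math.pi * d / N)
            / (N * math.sin(math.pi * (d + eps) / N)))


N, eps = 16, 0.1
worst_gap = max(abs(ici_weight(d, eps, N) - cfo_weight_direct(d, eps, N))
                for d in range(-(N - 1), N) if d != 0)
```

The direct sum and the closed form agree to rounding error for every offset d = m − k, and `alpha` matches the d = 0 weight.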

The first term of Eq. (6.24) is the distorted chip and the second term is the ICI
caused by the CFO. Note that, when there is no CFO, rj [k] equals λj [k]yj [k]
as in Eq. (6.2). From Eqs. (6.3) and (6.23), if the real Hadamard-Walsh code
is used, we see that x̂i under CFO is given by

$$\hat{x}_i = \underbrace{\sum_{k=0}^{N-1} r_i[k]\lambda_i^{*}[k]w_i[k]}_{s_i} + \sum_{j=0,\, j\neq i}^{T-1}\underbrace{\sum_{k=0}^{N-1} r_j[k]\lambda_i^{*}[k]w_i[k]}_{\widehat{\mathrm{MAI}}_{i\leftarrow j}} + \sum_{k=0}^{N-1} e[k]\lambda_i^{*}[k]w_i[k], \tag{6.26}$$

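where si is the desired signal and $\widehat{\mathrm{MAI}}_{i\leftarrow j}$ is the MAI of user i due to the
jth user's CFO. Using Eqs. (6.24) and (6.26), it can be shown that the MAI
term is given by

$$\widehat{\mathrm{MAI}}_{i\leftarrow j} = \mathrm{MAI}^{(0)}_{i\leftarrow j} + \mathrm{MAI}^{(1)}_{i\leftarrow j}, \tag{6.27}$$

where

$$\mathrm{MAI}^{(0)}_{i\leftarrow j} = \sum_{k=0}^{N-1} r_j^{(0)}[k]\lambda_i^{*}[k]w_i[k] = \alpha_j x_j \sum_{k=0}^{N-1}\lambda_j[k]w_j[k]\lambda_i^{*}[k]w_i[k] \tag{6.28}$$

and

$$\mathrm{MAI}^{(1)}_{i\leftarrow j} = \sum_{k=0}^{N-1} r_j^{(1)}[k]\lambda_i^{*}[k]w_i[k] = \beta_j x_j \eta_j, \tag{6.29}$$

where

$$\eta_j = \sum_{k=0}^{N-1}\ \sum_{m=0,\, m\neq k}^{N-1}\lambda_j[m]w_j[m]\,\frac{e^{-j\pi\frac{m-k}{N}}}{N\sin\frac{\pi(m-k+\epsilon_j)}{N}}\,\lambda_i^{*}[k]w_i[k].$$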
Note that if there is no CFO for user j, i.e., αj = 1 and βj = 0, then
$\mathrm{MAI}^{(1)}_{i\leftarrow j} = 0$ and $\mathrm{MAI}^{(0)}_{i\leftarrow j}$ is equal to the MAI term defined in Eq. (6.3). This
gives us an intuition that $\mathrm{MAI}^{(0)}_{i\leftarrow j}$ is the dominating MAI term when the CFO
is small. Hence, if we can find a way that makes $\mathrm{MAI}^{(0)}_{i\leftarrow j} = 0$, the MAI due to
the CFO can be reduced to a negligible amount. According to Eqs. (6.3), (6.28)
and Theorem 6.1, we have the following Lemma.
Lemma 6.4: Let the channel length be L and the code scheme as stated in
Theorem 6.1 be used. Then, if we use any one of the G subsets of codewords
for the MC-CDMA system with G ≥ L, the dominating MAI term $\mathrm{MAI}^{(0)}_{i\leftarrow j}$
in Eq. (6.28) is zero.
Now, let us look at the other interference term $\mathrm{MAI}^{(1)}_{i\leftarrow j}$, which is called the
"residual MAI" for convenience. Define

$$g_j(p) = \frac{e^{-j\pi\frac{p}{N}}}{N\sin\frac{\pi(p+\epsilon_j)}{N}}.$$

Then, we have the following Lemma.
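Lemma 6.5: Let the channel length be L and the code scheme as stated in
Theorem 6.1 be used. Then, if we use any one of the G subsets of codewords
for the MC-CDMA system with G ≥ L, the residual MAI term $\mathrm{MAI}^{(1)}_{i\leftarrow j}$ in
Eq. (6.29) becomes

$$\mathrm{MAI}^{(1)}_{i\leftarrow j} = \beta_j x_j \sum_{p=1}^{N-1} g_j(-p)\left\{\left(\mathbf{h}_i^{(p)}\right)^{\dagger}\underbrace{\mathbf{F}_0^{\dagger}\mathbf{W}_i^{(p)}\mathbf{W}_j\mathbf{F}_0}_{\mathbf{D}_{i,j}^{(p)}}\mathbf{h}_j\right\}, \tag{6.30}$$

where

$$\mathbf{W}_i^{(p)} = \mathrm{diag}\left(w_i[p]\ \cdots\ w_i[N-1]\ \ w_i[0]\ \cdots\ w_i[p-1]\right) \tag{6.31}$$

and

$$\mathbf{h}_i^{(p)} = \left(h_i(0)e^{-j\frac{2\pi}{N}0p}\ \ \ h_i(1)e^{-j\frac{2\pi}{N}1p}\ \ \cdots\ \ h_i(L-1)e^{-j\frac{2\pi}{N}(L-1)p}\right). \tag{6.32}$$

Proof: The proof can be found in [136].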


Since the MAI due to the CFO is divided into two terms, i.e., the
dominating MAI in Eq. (6.28) and the residual MAI in Eq. (6.29), if we can make
both terms equal zero, the system can be MAI-free under a CFO environment.
From Lemma 6.4, $\mathrm{MAI}^{(0)}_{i\leftarrow j} = 0$ with the proposed code. Thus, our goal now
is to find a way to make $\mathrm{MAI}^{(1)}_{i\leftarrow j} = 0$. Let us further manipulate $\mathbf{D}_{i,j}^{(p)}$ in
Eq. (6.30) as

$$\mathbf{D}_{i,j}^{(p)} = \begin{pmatrix}\mathbf{I}_L & \mathbf{0}\end{pmatrix}\mathbf{F}^{\dagger}\mathbf{W}_i^{(p)}\mathbf{W}_j\mathbf{F}\begin{pmatrix}\mathbf{I}_L\\ \mathbf{0}\end{pmatrix}. \tag{6.33}$$

According to Eqs. (6.30) and (6.33), if $\mathbf{D}_{i,j}^{(p)} = \mathbf{0}$ for all 1 ≤ p ≤ N − 1, we
have $\mathrm{MAI}^{(1)}_{i\leftarrow j} = 0$.

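Theorem 6.3: Suppose the codeword set G0 is used. Then the two codewords w0
and w1 have a zero $\mathrm{MAI}^{(1)}_{i\leftarrow j}$ term. That is,
$\sum_{j=0,\, j\neq 0}^{T-1}\mathrm{MAI}^{(1)}_{0\leftarrow j} = 0$ and $\sum_{j=0,\, j\neq 1}^{T-1}\mathrm{MAI}^{(1)}_{1\leftarrow j} = 0$.

Proof. For the all-one code w0 , we have

$$\mathbf{W}_0^{(p)}\mathbf{W}_j = \mathbf{W}_j, \qquad 1 \le p \le N-1.$$

Hence, we have

$$\mathbf{D}_{0,j}^{(p)} = \begin{pmatrix}\mathbf{I}_L & \mathbf{0}\end{pmatrix}\mathbf{F}^{\dagger}\mathbf{W}_j\mathbf{F}\begin{pmatrix}\mathbf{I}_L\\ \mathbf{0}\end{pmatrix}, \qquad 1 \le p \le N-1.$$

Since Wj is a diagonal matrix whose diagonal elements are drawn from a
codeword in G0 , it follows from the discussion in Section 6.2.1 that
$\mathbf{D}_{0,j}^{(p)} = \mathbf{0}$ for all p. Hence, $\sum_{j=0,\, j\neq 0}^{T-1}\mathrm{MAI}^{(1)}_{0\leftarrow j} = 0$.
For w1 , which is a codeword with a sign change for every consecutive code
symbol, i.e., w1 = (+1 − 1 + 1 − 1 · · · )t , its circular shift is either w1 or
(−1 + 1 − 1 + 1 · · · )t = −w1 . Hence, we have

$$\mathbf{W}_1^{(p)}\mathbf{W}_j = \begin{cases}\mathbf{W}_1\mathbf{W}_j, & p \text{ even},\\ -\mathbf{W}_1\mathbf{W}_j, & p \text{ odd},\end{cases} \qquad 1 \le p \le N-1.$$

Since the diagonal of W1 Wj is again a codeword in G0 , it follows from the
discussion in Section 6.2.1 that $\mathbf{D}_{1,j}^{(p)} = \mathbf{0}$ for all p. Hence,
$\sum_{j=0,\, j\neq 1}^{T-1}\mathrm{MAI}^{(1)}_{1\leftarrow j} = 0$.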
Theorem 6.3 suggests using the 0th codeword set so that there are two
codewords that preserve the MAI-free property under the CFO environment.
Since codewords w0 and w1 are completely MAI-free under the CFO
environment when G0 is used, we can use them as training sequences to estimate
the channel and/or CFO for each user. That is, in the uplink direction, every
user uses these two codewords in turn to acquire his/her own channel and/or
CFO information. In the downlink direction, one of these two codewords can
be reserved as the pilot signal for CFO estimation. In this case, any
single-user CFO estimation algorithm (e.g., the one given in [87]) can be applied
while sophisticated MUD or signal processing techniques can be avoided. This
result stands in contrast with the CFO estimation for GO-MC-CDMA
systems [158], where multiuser estimation is demanded to acquire accurate CFO
information.
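
Theorem 6.3 can be checked numerically through the matrices in Eq. (6.33). The sketch below assumes N = 16, L = 2, a Sylvester-ordered Hadamard-Walsh matrix with G0 given by its first eight columns, and a unitary DFT matrix F; the helper names are illustrative only.

```python
import cmath


def hadamard(n):
    """Sylvester-ordered Hadamard-Walsh matrix of size n (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def d_matrix(w_i, w_j, p, L):
    """Upper-left L x L block of F^dagger W_i^(p) W_j F, as in Eq. (6.33)."""
    N = len(w_i)
    shifted = w_i[p:] + w_i[:p]  # diagonal of W_i^(p), Eq. (6.31)
    r = [a * b for a, b in zip(shifted, w_j)]
    # For a unitary DFT matrix, entry (a, b) is (1/N) sum_k r[k] e^{j 2 pi k (a-b)/N}.
    return [[sum(r[k] * cmath.exp(2j * cmath.pi * k * (a - b) / N)
                 for k in range(N)) / N for b in range(L)] for a in range(L)]


def max_abs(matrix):
    """Largest entry magnitude of a matrix given as a list of rows."""
    return max(abs(v) for row in matrix for v in row)


H = hadamard(16)
worst = max(max_abs(d_matrix(H[i], H[j], p, 2))
            for i in (0, 1) for j in range(8) if j != i for p in range(1, 16))
counterexample = max_abs(d_matrix(H[2], H[3], 1, 2))
```

Here `worst` stays at rounding-error level for w0 and w1 against every other codeword of G0, whereas the pair (w2, w3) with shift p = 1 gives a nonzero block, so the residual MAI does not vanish for arbitrary pairs within G0.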
Example 6.4: MAI in the Presence of CFO
In this example, we demonstrate that the dominating MAI due to the CFO
effect can be completely eliminated by the use of the proposed code design.
The system configuration was the same as that in Example 6.3 with
multipath length L = 2. Since the simulation was conducted in the uplink
direction, every user has his/her own CFO value. Let us consider the worst
case, where every user is randomly assigned a CFO of either +ε or −ε.
According to Eq. (6.3), when there is no CFO, the desired signal will be scaled by
$\sum_{k=0}^{N-1}|\lambda_i[k]|^2$. Thus, we normalize the MAI by $\sum_{k=0}^{N-1}|\lambda_i[k]|^2$ for fair
comparison. The dominating total MAI of user i, denoted by $\mathrm{MAI}_i^{(0)}$, is obtained
by averaging

$$\left|\frac{1}{\sum_{k=0}^{N-1}|\lambda_i[k]|^2}\sum_{j=0,\, j\neq i}^{T-1}\mathrm{MAI}^{(0)}_{i\leftarrow j}\right|^2$$

for more than 250,000 symbols. Similarly, the residual total MAI, denoted by
$\mathrm{MAI}_i^{(1)}$, is obtained by averaging

$$\left|\frac{1}{\sum_{k=0}^{N-1}|\lambda_i[k]|^2}\sum_{j=0,\, j\neq i}^{T-1}\mathrm{MAI}^{(1)}_{i\leftarrow j}\right|^2$$

for more than 250,000 symbols. To illustrate the MAI effect more clearly, we
did not add noise in this example.
First, let us consider the fully-loaded case, i.e., T = N = 16. The
slim-triangular curves in Fig. 6.6 show the dominating and the residual total MAI
of each individual user. The bold-diamond curve, denoted by $\mathrm{MAI}^{(0)}$, is the
averaged dominating total MAI for the 16 slim-triangular curves of $\mathrm{MAI}_i^{(0)}$.
Note that the 16 slim-triangular curves of $\mathrm{MAI}_i^{(0)}$ are tightly clustered and
thus overlap with the bold-diamond curve. Similarly, the bold-square curve,
denoted by $\mathrm{MAI}^{(1)}$, is the averaged residual total MAI for the 16 slim-circle
curves of $\mathrm{MAI}_i^{(1)}$. We see that $\mathrm{MAI}^{(0)}$ is larger than $\mathrm{MAI}^{(1)}$ by 5–32 dB.
This confirms that the dominating MAI term defined in Eq. (6.28) is
indeed the key CFO-induced MAI impairment.
Now, we consider several half-loaded schemes. First, we examine Shi and
Latva-aho's scheme [119] for a half-loaded system, i.e., w0 , w1 , w6 , w7 , w10 ,
w11 , w12 , and w13 . The MAI performance is shown in Fig. 6.7. By comparing
Fig. 6.7 with Fig. 6.6, we see that both the dominating MAI and the
residual MAI decrease by only about 3 dB, which shows a reasonable but not
satisfactory MAI reduction as the number of users decreases to half in the
system.

Fig. 6.6. The dominating and the residual MAI as a function of CFO in a
fully-loaded situation (© [136] IEEE).

Fig. 6.7. The dominating and the residual MAI as a function of CFO in a
half-loaded situation with Shi and Latva-aho's scheme (© [136] IEEE).

Next, let us consider the proposed code selection schemes in the half-loaded
case. Since L = 2, we divide the codewords into two subsets. G0 contains the
first N/2 codewords and G1 contains the last N/2 codewords. The performance
is shown in Fig. 6.8(a) and (b), respectively. Note that the dominating MAI
term $\mathrm{MAI}_i^{(0)}$ is equal to zero so that it is not shown here. Moreover, there
are only six curves in Fig. 6.8(a) for $\mathrm{MAI}_i^{(1)}$, since the two codewords w0 and
w1 are completely MAI-free under the CFO environment.
By examining Fig. 6.6 and Fig. 6.8(a) and (b), we see that the dominating
MAI can be completely eliminated by the proposed code scheme. In this case,
the residual MAI will determine the system performance. Furthermore, the
residual MAI decreases by around 5 dB. These results show that the MAI due
to the CFO effect can be greatly reduced using the proposed code schemes.
Moreover, if the codewords of G0 are used, users of w0 and w1 can still have
zero MAI under the CFO environment.

Fig. 6.8. MAI reduction via the proposed code schemes using codewords in (a) G0
and (b) G1 (© [136] IEEE).

Example 6.5: BEP in the Presence of CFO


In this example, we consider the bit error probability (BEP) performance in
the presence of the CFO effect for several code schemes of MC-CDMA and
the GO-MC-CDMA scheme [158]. The parameter setting remains the same as
that in Example 6.4. For MC-CDMA, we consider the proposed schemes G0
and G1 , Shi and Latva-aho's scheme [119], and the even-indexed codewords,
i.e., w0 , w2 , · · · , wN −2 . For GO-MC-CDMA, since L = 2, we divide 16
subcarriers into eight groups, where each group can support two users. Since the
system is half-loaded, each group has exactly one user so that the system is
MAI-free when there is no CFO. We assume that every scheme can accurately
estimate each individual user's CFO. With G0 , this can be achieved using
estimation algorithms for single-user OFDM systems since there are two MAI-free
codewords even in a multiuser CFO environment. In contrast, other schemes
need to use multiuser CFO estimation. The CFO effect is compensated at the
receiver without any feedback. Figure 6.9 shows the BEP as a function of the
CFO with SNR (= Eb /N0 ) fixed at 15 dB for different schemes. It is clear
that the proposed code schemes G0 and G1 significantly outperform Shi and
Latva-aho's scheme and the set of even-indexed codewords. They also slightly
outperform GO-MC-CDMA. We see from this figure that codeword set
G0 slightly outperforms codeword set G1 . This is because w0 and w1 are
free from MAI in the presence of CFO.

[Figure 6.9 plot: bit error probability (log scale, 10^−4 to 10^−1) versus CFO ε (0 to 0.3). Curves: codewords with even indices; Shi and Latva-aho scheme; GO-MC-CDMA; G1; G0.]

Fig. 6.9. The BEP as a function of CFO (with fixed Eb /N0 = 15 dB) (© [136] IEEE).

Example 6.6: CFO Estimation with a Single-User Algorithm


It was shown in Example 6.5 that the GO-MC-CDMA system can achieve
comparable performance with the proposed code scheme in a CFO environ-
ment. However, this result is obtained under the assumption that every user’s
CFO can be estimated accurately. For the MC-CDMA system with codeword
set G0 , users with codewords w0 and w1 do not have MAI from others in a
CFO environment and, consequently, accurate CFO can be estimated if each
user adopts these codewords to estimate his/her own CFO in turn. In this
case, estimation algorithms developed for single-user OFDM can be used for
the proposed scheme. In contrast, we need more sophisticated estimation al-
gorithms for multiuser OFDM systems for GO-MC-CDMA since none of the
users in GO-MC-CDMA is free from MAI in a CFO environment. In this ex-
ample, we will evaluate the CFO estimation error for the proposed system and
the GO-MC-CDMA system when the single-user CFO estimation algorithm
given in [87] is used.
The parameter setting remains the same as that in Example 6.5. Since
both MC-CDMA and GO-MC-CDMA spread one symbol into several chips,
the detection output is actually one symbol. Hence, we only need two symbols
for CFO estimation. Referring to Fig. 6.1, we denote the two successive output
symbols, i.e., the current and the next ones, by $\hat{x}_i$ and $\hat{x}_i'$. The CFO estimate
can be obtained via [87]

$$\hat{\epsilon}_i = \frac{1}{2\pi}\tan^{-1}\left[\Im\{\hat{x}_i^{*}\hat{x}_i'\}\,/\,\Re\{\hat{x}_i^{*}\hat{x}_i'\}\right],$$

where ℜ{x} and ℑ{x} are the real and the imaginary parts of x. For the
proposed scheme, all eight codewords in G0 are active. For the GO-MC-CDMA,
each user occupies one of the eight groups and there are eight users [158].
Without loss of generality, w0 is used for CFO estimation in the proposed
scheme. For GO-MC-CDMA, the user who occupies the 0th and the 8th
subcarriers with the all-one codeword is used for CFO estimation. The Monte
Carlo method is run for more than 20,000 realizations. The estimation
mean square errors, i.e., $E\{|\epsilon_i - \hat{\epsilon}_i|^2\}$, as a function of CFO for both
systems are shown in Fig. 6.10. We see that the estimation error in
GO-MC-CDMA increases as the CFO value becomes larger. This is because the
CFO-induced MAI increases as the CFO value increases, which deteriorates
estimation accuracy. On the other hand, the estimation error in the proposed
MC-CDMA system with G0 remains constant in a multiuser CFO
environment. This shows that the use of codeword set G0 gives a better CFO
estimation result in a multiuser environment.
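
A minimal sketch of the two-symbol estimator used in Example 6.6 is given below. It assumes the same training symbol is detected twice, that the second detection is rotated by exp(j2πε) relative to the first (consecutive blocks of N samples with no guard interval), and that |ε| < 0.5. The function name `estimate_cfo` is illustrative, and `math.atan2` is used instead of a plain arctangent so that the quadrant is resolved.

```python
import cmath
import math


def estimate_cfo(x_cur, x_next):
    """Normalized CFO from the phase of conj(x_cur) * x_next."""
    c = x_cur.conjugate() * x_next
    return math.atan2(c.imag, c.real) / (2 * math.pi)


# Noise-free illustration with a hypothetical detected symbol.
eps_true = 0.1
x_cur = 0.8 - 0.3j
x_next = x_cur * cmath.exp(2j * math.pi * eps_true)
eps_hat = estimate_cfo(x_cur, x_next)
```

In the noise-free, MAI-free case the estimate coincides with the true offset. Any MAI that leaks into the two detected symbols perturbs the phase of their product, which is the mechanism behind the degradation reported for GO-MC-CDMA.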
Example 6.7: Code Priority of MC-CDMA
In this example, we consider a code priority scheme for a fully-loaded MC-
CDMA system using the proposed code scheme. The system setting is the
same as that in Example 6.5 except that SNR is set to 18 dB while CFO is set
to zero. The Monte Carlo method is used with more than 10,000,000 symbols
for all users in the simulation.
[Figure 6.10 plot: mean square error (log scale, 10^−5 to 10^−1) versus CFO ε (0 to 0.3). Curves: GO-MC-CDMA at Eb/N0 = 20 dB and 30 dB; proposed scheme at Eb/N0 = 20 dB and 30 dB.]

Fig. 6.10. The mean squared error of CFO estimation as a function of CFO for the
proposed and the GO-MC-CDMA schemes (© [136] IEEE).

We first consider two code priority schemes that assign codewords
according to the following order:

Priority Scheme I: w0 , w8 , w1 , w9 , w2 , w10 , w3 , w11 ,
w4 , w12 , w5 , w13 , w6 , w14 , w7 , w15 . (6.34)

Priority Scheme II: w0 , w9 , w2 , w11 , w4 , w13 , w6 , w15 ,
w1 , w8 , w3 , w10 , w5 , w12 , w7 , w14 . (6.35)

Scheme I assigns the next user an even-indexed (or odd-indexed) codeword in
G1 whenever the current user is assigned an even-indexed (or odd-indexed)
codeword in G0 . Since even-indexed (or odd-indexed) codewords in G1 cause
more serious MAI to even-indexed (or odd-indexed) codewords, the first code
priority has poor performance. It is adopted as a performance benchmark.
Scheme II assigns even- and odd-indexed codewords from G0 and G1
alternately for the first eight users. It serves as another performance benchmark.
Furthermore, we also implement the code priority scheme proposed by Shi
and Latva-aho in [119]. Since this scheme only considers a system up to the
half-loaded situation, its performance curve is plotted up to eight users.
Finally, we consider the proposed code priority scheme, where we first
assign eight codewords in G0 to the first eight active users. When the number
of active users exceeds eight, we will use codewords in G1 . One such code
priority scheme can be written as

Proposed Priority Scheme: w0 , w1 , w2 , w3 , w4 , w5 , w6 , w7 ,
w8 , w9 , w10 , w11 , w12 , w13 , w14 , w15 . (6.36)

If there is no CFO, the system is MAI-free using only eight codewords in either
G0 or G1 according to Theorem 6.1. The order of the first eight codewords
can be changed arbitrarily. Also, we can assign codewords all from G1 first
and then from G0 .
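
The effect of the assignment order can be seen with a short sketch that counts how many users each priority list serves before it mixes the two subsets. It assumes N = 16, G = 2, no CFO, and that subset Gg holds the codewords with indices gN/G to (g + 1)N/G − 1; the function name is illustrative.

```python
def mai_free_prefix(order, N=16, G=2):
    """Largest T such that the first T codewords of `order` lie in one subset G_g."""
    size = N // G
    first_subset = order[0] // size
    count = 0
    for index in order:
        if index // size != first_subset:
            break
        count += 1
    return count


scheme_1 = [0, 8, 1, 9, 2, 10, 3, 11, 4, 12, 5, 13, 6, 14, 7, 15]   # Eq. (6.34)
scheme_2 = [0, 9, 2, 11, 4, 13, 6, 15, 1, 8, 3, 10, 5, 12, 7, 14]   # Eq. (6.35)
proposed = list(range(16))                                          # Eq. (6.36)

prefix_lengths = [mai_free_prefix(s) for s in (scheme_1, scheme_2, proposed)]
```

Under these assumptions only the proposed order keeps all active users inside a single subset up to the half-loaded point; Schemes I and II leave G0 with the second user.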
The bit error probability is plotted as a function of the number of active
users for the above four code priority schemes in Fig. 6.11. Scheme I has
the worst performance as expected. The performance of the proposed code
priority stays the same when the number of active users is smaller than 9 due
to the MAI-free property. When T exceeds 8, the performance of the proposed
scheme degrades dramatically. However, its performance remains at least as
good as Schemes I and II. We also see that Shi and Latva-aho's scheme
has the same performance as the proposed code priority when the number
of users is 1, 2, 3, or 5. This is reasonable since the codewords for these user
numbers fall in the set G0 and, hence, they are free from MAI. However,
for other numbers of users, its performance is worse than the proposed code
priority scheme. Moreover, in Shi and Latva-aho's priority scheme, if the number
of active users changes, some users will need to change their codewords, which
complicates the actual deployment of this scheme.

[Figure 6.11 plot: bit error probability (log scale, 10^−5 to 10^−1) versus T, the number of users (0 to 16). Curves: code priority #1; Shi and Latva-aho code priority; code priority #2; proposed code priority.]

Fig. 6.11. The BEP as a function of the number of users with SNR = 18 dB (© [136]
IEEE).

6.2.4 Practical Considerations on Applicability of the Proposed Scheme

The proposed scheme is applicable to both the uplink and the downlink in a multiuser
system. The basic assumption of having a synchronous channel holds in the
downlink. As to the uplink, we may consider a quasi-synchronous channel,
where the time offset is within one chip. Such a channel holds in the up-
link direction for a micro cell, e.g., see pp. 1179–1195 in [121]. In practice,
quasi-synchronism can be achieved by the use of the GPS. Therefore, there
are several systems or code designs based on this assumption; e.g., group-
orthogonal (GO)-MC-CDMA system [158], the LV (Lagrange/Vandermonde)
code [116], the code scheme for MC-CDMA in [119], and the LAS (large area
synchronized) code in [122]. Even if the system is not perfectly synchronized,
time delay can still be included in the channel impulse response. In this case,
we may have larger L and, hence, the number of supportable users to meet
the MAI-free property decreases.
As presented above, there is a close connection between the multipath
length, the spreading gain (i.e., the number of subcarriers) and the number of
users that can be supported by the proposed technique. In particular, in order
to achieve a zero MAI, the system load has to be significantly reduced for a
larger multipath length. This is a potential disadvantage for the proposed
scheme. In practice, under a reasonable sampling frequency, the multipath
length is in general moderate. For example, in an outdoor environment, the
most commonly used multipath duration is around 1–3 μs [108]. For the IS-95
standard, the chip rate is 1.2288 Mchips/s in the uplink direction. Hence, the
resolvable multipath length is around 1–3 taps. In an indoor environment, the
maximum multipath duration for an office building is around 0.27 μs. If we
take the sampling frequency of WLAN of 20 MHz as an example, the resolvable
multipath length is around 5 taps. However, the indoor multipath duration is
in general under 0.1 μs. In this case, the resolvable multipath length is around
2–3 taps. It is worthwhile to point out that under a fixed multipath L and
DFT/IDFT size N , the GO-MC-CDMA system [158] without MUD supports
exactly the same number of MAI-free users as the proposed system. Although
the number of MAI-free users in both systems decreases as channel length L
increases, every MAI-free user in these systems can enjoy an increased channel
diversity order L.
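
The tap counts quoted above follow from multiplying the multipath duration by the sampling (or chip) rate. The sketch below reproduces that arithmetic; the function name is illustrative, and treating the product as the approximate number of resolvable taps (possibly plus one for the zero-delay path) is an assumption rather than a rule stated in the text.

```python
def delay_spread_in_samples(delay_spread_s, sample_rate_hz):
    """Multipath duration expressed in sample (or chip) periods."""
    return delay_spread_s * sample_rate_hz


outdoor_low = delay_spread_in_samples(1e-6, 1.2288e6)    # IS-95, 1 microsecond
outdoor_high = delay_spread_in_samples(3e-6, 1.2288e6)   # IS-95, 3 microseconds
office_max = delay_spread_in_samples(0.27e-6, 20e6)      # WLAN, 0.27 microseconds
indoor_typical = delay_spread_in_samples(0.1e-6, 20e6)   # WLAN, 0.1 microseconds
```

The four products are roughly 1.2, 3.7, 5.4 and 2.0 sample periods, in line with the tap counts given in the paragraph above.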
As commented by Chen in [23] ". . . all existing CDMA systems fail to offer
satisfactory performance and capacity, which is usually far less than half of the
processing gain of CDMA systems." Hence, the choice of the spreading code
to reduce MAI can be a direction for the design of next-generation CDMA
systems. We approach the MAI reduction problem from a similar viewpoint.
That is, for fixed channel conditions, we attempt to select a subset of
codewords that can lead to an MAI-free system and hence provide a high data rate
with simple transceiver design. This design concept stands in contrast with
that of conventional CDMA systems. For instance, as the number of users
increases in IS-95, the achievable data rate decreases in order to support the
fully-loaded user capacity. If a higher data rate is desired in IS-95, we need to
use the more sophisticated MUD, which will increase the transceiver burden.
However, since the Hadamard-Walsh code inherently has several codeword sets
that can achieve MAI-free operation in MC-CDMA, the complexity of MUD using
the Hadamard-Walsh code is much less than that using other code schemes.

6.3 MAI-Free MC-CDMA with CFO Using Hadamard-Walsh Codes

In this section, we show how to select a subset of Hadamard-Walsh codes for
MC-CDMA to completely eliminate MAI under any CFO level.
Let w[k], 0 ≤ k ≤ N − 1, be any codeword of length N and G < N be any
integer that divides N . For convenience, we define periodic and antiperiodic
codewords below. A codeword is said to be periodic with period N/G if

$$w[((k + gN/G))_N] = w[k], \tag{6.37}$$

where 0 ≤ k ≤ N − 1, 0 ≤ g ≤ G − 1, and $((\cdot))_N$ denotes the modulo-N
operation. A codeword is said to be antiperiodic with antiperiod N/G if

$$w[((k + N/G))_N] = -w[k]. \tag{6.38}$$

For example, codeword (1 -1 1 -1) is periodic with period 4/2 = 2 while
codeword (1 -1 -1 1) is antiperiodic with antiperiod 4/2 = 2.
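
The two definitions translate directly into a short check. The sketch below uses illustrative function names and tests only a shift by one period, which implies Eq. (6.37) for every multiple g.

```python
def is_periodic(w, period):
    """True if w[(k + period) mod N] == w[k] for all k, cf. Eq. (6.37)."""
    n = len(w)
    return all(w[(k + period) % n] == w[k] for k in range(n))


def is_antiperiodic(w, antiperiod):
    """True if w[(k + antiperiod) mod N] == -w[k] for all k, cf. Eq. (6.38)."""
    n = len(w)
    return all(w[(k + antiperiod) % n] == -w[k] for k in range(n))


first_example = is_periodic([1, -1, 1, -1], 2)        # periodic codeword from the text
second_example = is_antiperiodic([1, -1, -1, 1], 2)   # antiperiodic codeword from the text
```

Both examples from the text pass their respective checks, and the second codeword is not periodic with period 2.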
Lemma 6.6: Let wi = (wi [0], · · ·, wi [N − 1])t be any periodic or antiperiodic
codeword with period or antiperiod N/G. Then, $\mathbf{w}_i^{(p)}$ is periodic or
antiperiodic if wi is periodic or antiperiodic, respectively, with the same period or
antiperiod N/G.
Proof: By the definition of $\mathbf{w}_i^{(p)}$, we have

$$w_i^{(p)}[k] = w_i[((N - p + k))_N]. \tag{6.39}$$

If wi is periodic with period N/G, we have

$$w_i^{(p)}[((k + gN/G))_N] = w_i[((N - p + k + gN/G))_N] = w_i[((N - p + k))_N] = w_i^{(p)}[k].$$

Similarly, if wi is antiperiodic with antiperiod N/G, we have

$$w_i^{(p)}[((k + N/G))_N] = w_i[((N - p + k + N/G))_N] = -w_i[((N - p + k))_N] = -w_i^{(p)}[k],$$

for 0 ≤ g ≤ G − 1 and 0 ≤ k ≤ N − 1.
Consider the N × N Hadamard-Walsh matrix, HN . As described in
Section 6.2.1, we can form the subset $G_0 = \{\mathbf{w}_0, \mathbf{w}_1, \cdots, \mathbf{w}_{N/G-1}\}$ for a channel
with maximum length L. We can further divide codewords in G0 into two
disjoint subsets of equal size as

$$G_{00} = \{\mathbf{w}_0, \mathbf{w}_1, \cdots, \mathbf{w}_{\frac{N}{2G}-1}\} \quad\text{and}\quad G_{01} = \{\mathbf{w}_{\frac{N}{2G}}, \mathbf{w}_{\frac{N}{2G}+1}, \cdots, \mathbf{w}_{\frac{N}{G}-1}\}.$$

Then, we have the following properties.

Lemma 6.7: For any codeword wi in G01 and any codeword wj in G00 , we
have $\widehat{\mathrm{MAI}}_{i\leftarrow j} = 0$.
Proof: We need to show

$$\widehat{\mathrm{MAI}}_{i\leftarrow j} = \mathrm{MAI}^{(0)}_{i\leftarrow j} + \mathrm{MAI}^{(1)}_{i\leftarrow j} = 0.$$

Define

$$r_{i,j}^{(p)}[k] = w_i^{(p)}[k]\,w_j^{*}[k], \qquad k = 0, 1, \cdots, N-1.$$

By Theorem 6.1, if wi and wj are two distinct codewords in any one of the
disjoint subsets Gg , g = 0, 1, ..., G−1, then $\mathrm{MAI}_{i\leftarrow j} = 0$ in a frequency
selective channel. Since G00 and G01 are disjoint subsets of G0 , and
$\mathrm{MAI}^{(0)}_{i\leftarrow j} = \alpha_j \mathrm{MAI}_{i\leftarrow j}$ by Eq. (6.28), we conclude $\mathrm{MAI}^{(0)}_{i\leftarrow j} = 0$.
To have $\mathrm{MAI}^{(1)}_{i\leftarrow j} = 0$, $\mathbf{D}_{i,j}^{(p)}$ must be the L × L zero matrix for all i ≠ j
and p = 0, 1, ..., N − 1. But $\mathbf{D}_{i,j}^{(p)}$ is the L × L upper left submatrix of matrix
$\mathbf{C}_{i,j}^{(p)} = \mathbf{F}^{\dagger}\mathbf{W}_i^{(p)}\mathbf{W}_j\mathbf{F}$, which is a circulant matrix [47]. That is, the first column
of $\mathbf{C}_{i,j}^{(p)}$, $(c_{i,j}^{(p)}(0) \cdots c_{i,j}^{(p)}(N-1))^t$, is the N -point IDFT of $r_{i,j}^{(p)}$. To have
$\mathbf{D}_{i,j}^{(p)} = \mathbf{0}$, the first L samples and the last L − 1 samples of the IDFT of
$r_{i,j}^{(p)}$ must be zero. By taking the IDFT of $r_{i,j}^{(p)}$, we have

$$r_{i,j}^{(p)}(n) = \frac{1}{N}\sum_{m=0}^{N-1} r_{i,j}^{(p)}[m]\,e^{j\frac{2\pi}{N}mn}. \tag{6.40}$$

Let m = k + gN/G, 0 ≤ k ≤ N/G − 1 and 0 ≤ g ≤ G − 1. We can rewrite
Eq. (6.40) as

$$r_{i,j}^{(p)}(n) = \frac{1}{N}\sum_{k=0}^{N/G-1}\sum_{g=0}^{G-1} r_{i,j}^{(p)}[k + gN/G]\,e^{j\frac{2\pi}{N}(k+gN/G)n}. \tag{6.41}$$

Since codewords wi and wj belong to G0 , they are among the first N/G
columns of the Hadamard matrix and are formed by repeating the upper left
N/G × N/G submatrix of HN , G times. Hence, they are periodic with period
N/G. By Lemma 6.6, $\mathbf{w}_i^{(p)}$ is also periodic with period N/G. Since the
product of two periodic functions whose periods are the same is another periodic
function with the same period, we have

$$r_{i,j}^{(p)}[k + gN/G] = r_{i,j}^{(p)}[k]. \tag{6.42}$$

Then, we can rewrite Eq. (6.41) as

$$r_{i,j}^{(p)}(n) = \frac{1}{N}\sum_{k=0}^{N/G-1} r_{i,j}^{(p)}[k]\,e^{j\frac{2\pi}{N}kn}\sum_{g=0}^{G-1} e^{j\frac{2\pi}{G}gn}, \tag{6.43}$$

where 0 ≤ k ≤ N/G − 1 and 0 ≤ g ≤ G − 1. Since

$$\sum_{g=0}^{G-1} e^{j\frac{2\pi}{G}gn} = \begin{cases} G, & n = 0, \pm G, \cdots\\ 0, & \text{otherwise},\end{cases} \tag{6.44}$$

we have

$$r_{i,j}^{(p)}(n) = \begin{cases}\dfrac{G}{N}\displaystyle\sum_{k=0}^{N/G-1} r_{i,j}^{(p)}[k]\,e^{j\frac{2\pi}{N}kn}, & n = 0, \pm G, \cdots\\ 0, & \text{otherwise}.\end{cases} \tag{6.45}$$
To prove $r_{i,j}^{(p)}(0) = 0$, we need to show that $r_{i,j}^{(p)}$ has an equal number of +1
and −1. In general, $r_{i,j}^{(p)}$ does not belong to the Hadamard matrix whose
codewords have an equal number of +1 and −1. For example, for N = 8,
$\mathbf{w}_8^{(1)} \cdot \mathbf{w}_7$ has two −1's and six 1's, where · denotes the component-wise vector
product. However, if wi ∈ G01 and wj ∈ G00 , $r_{i,j}^{(p)}$ does have an equal number
of +1 and −1 as shown below. According to Eq. (6.8), codewords in G00 are
the first N/2G columns of HN and are obtained by repeating the N/2G × N/2G
submatrix 2G times. Hence, any codeword wj ∈ G00 is periodic with period
N/2G. Similarly, we can show that any codeword wi ∈ G01 is antiperiodic
with antiperiod N/2G. By Lemma 6.6, $\mathbf{w}_i^{(p)}$ is also antiperiodic with the
same antiperiod. Therefore, for 0 ≤ k ≤ N − 1, we have

$$w_i^{(p)}[((k + N/2G))_N]\,w_j[((k + N/2G))_N] = -w_i[((N - p + k))_N]\,w_j[k] = -w_i^{(p)}[k]\,w_j[k].$$

Hence, $r_{i,j}^{(p)}$ is antiperiodic with antiperiod N/2G. It can be easily shown
that any antiperiodic code has an equal number of +1 and −1. Thus,

$$r_{i,j}^{(p)}(0) = \frac{1}{N}\sum_{k=0}^{N-1} r_{i,j}^{(p)}[k] = 0.$$
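
The two facts used in this proof, that the IDFT of the shifted product vanishes outside multiples of G and that its zeroth sample also vanishes when wi ∈ G01 and wj ∈ G00, can be checked numerically. The sketch assumes N = 16, G = 2 and a Sylvester-ordered Hadamard-Walsh matrix; the helper names are illustrative.

```python
import cmath


def hadamard(n):
    """Sylvester-ordered Hadamard-Walsh matrix of size n (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def idft(r):
    """N-point IDFT with the 1/N scaling of Eq. (6.40)."""
    N = len(r)
    return [sum(r[m] * cmath.exp(2j * cmath.pi * m * n / N) for m in range(N)) / N
            for n in range(N)]


def shifted_product(w_i, w_j, p):
    """r^(p)[k] = w_i[(N - p + k) mod N] * w_j[k], using the shift of Eq. (6.39)."""
    N = len(w_i)
    return [w_i[(N - p + k) % N] * w_j[k] for k in range(N)]


def nonzero_bins(r, tol=1e-9):
    """Indices n at which the IDFT of r is numerically nonzero."""
    return {n for n, v in enumerate(idft(r)) if abs(v) > tol}


N, G = 16, 2
H = hadamard(N)
bins = set()
for i in range(4, 8):          # codewords of G01
    for j in range(0, 4):      # codewords of G00
        for p in range(N):
            bins |= nonzero_bins(shifted_product(H[i], H[j], p))
```

For every such pair and every shift p, the collected set `bins` contains only nonzero multiples of G, so the samples at n = 0, 1 and N − 1 that enter the 2 × 2 block $\mathbf{D}_{i,j}^{(p)}$ are all zero.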

Let us consider an example for N = 16 and L = 2. By choosing G = 2, we have

G00 = {w0 , w1 , w2 , w3 }

and

G01 = {w4 , w5 , w6 , w7 }.

By Lemma 6.7, any codeword chosen from G01 achieves zero MAI with respect
to any codeword from G00 . On the other hand, we will not necessarily have
an MAI-free system when both codewords are chosen from G00 (or both from
G01 ). For instance, $\widehat{\mathrm{MAI}}_{2\leftarrow 3} \neq 0$ and $\widehat{\mathrm{MAI}}_{4\leftarrow 5} \neq 0$ for N = 16 and L = 2.
Note that Lemma 6.7 does not identify a subset of codewords that can be
assigned to all active users while keeping the system MAI-free. This choice will
be examined in the following theorem. In particular, we want to determine
such an MAI-free set from subsets of G0 and specify the number of codewords
in the resulting MAI-free set.
Theorem 6.4: For a channel of length L and G = 2q ≥ L, there are 1 +
log2 (N/G) codewords from N HW codes that will lead to an MAI-free MC-
CDMA system under any CFO level.
Proof: We form subsets G0 , G00 , and G01 as described above. To build an
MAI-free subset, we must choose only one of the codes from G01 . To determine
the remaining codes from G00 , we divide G00 into two subsets G000 and G001 ,
each with N/4G codes, by following the same procedure. Then, we can choose
one from G001 , since it can be proved by arguments similar to that in Lemma
6.2 that any codeword from G001 is MAI-free from any codeword in G000 . By
repeating this procedure, we can obtain an MAI-free set. Since the division
of a subset generates one codeword to be included in the MAI-free codeword
set in each stage, we have log2 (N/G) codes in the MAI-free set. Furthermore,
each of the two subsets has only one codeword in the last stage, we can add
both codewords (w0 and w1 ) to the MAI-free set. Thus, the total number of
MAI-free codewords is 1 + log2 (N/G). 
Consider the previous example, where N = 16 and L = G = 2. By Lemma
6.2, we can choose one codeword, say w4 , from

G01 = {w4 , w5 , w6 , w7 }

for the MAI-free set. We can divide

G00 = {w0 , w1 , w2 , w3 }

into two subsets, i.e., G000 = {w0 , w1 } and G001 = {w2 , w3 }. Then, we
include either w2 or w3 from G001 in the MAI-free set. If we divide G000 further
into two more subsets, {w0 } and {w1 }, we can include both w0 and w1 in
the MAI-free set by Lemma 6.2. Therefore, there will be 1 + log2 (16/2) = 4
MAI-free codewords.
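
The halving procedure in the proof of Theorem 6.4 can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes that G0 holds the first N/G codewords (indices 0 to N/G − 1), that N/G is a power of two, and that the first codeword of each upper half is the one selected; the function name `mai_free_hw_indices` is hypothetical and does not come from the text.

```python
def mai_free_hw_indices(N, G):
    """Return one MAI-free index set built by repeated halving (sketch).

    Assumptions (not from the text): G0 holds codewords 0 .. N/G - 1,
    and N/G is a power of two.
    """
    group = list(range(N // G))      # indices of the codewords in G0
    chosen = []
    while len(group) > 2:
        half = len(group) // 2
        lower, upper = group[:half], group[half:]
        chosen.append(upper[0])      # pick a single codeword from the upper half
        group = lower                # keep splitting the lower half
    chosen.extend(group)             # last stage: keep the remaining codewords (w0, w1)
    return sorted(chosen)


indices = mai_free_hw_indices(16, 2)
print(indices)
```

For N = 16 and G = 2 the sketch selects w0 , w1 , w2 and w4 , which is one of the two MAI-free sets observed in Example 6.8.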

Example 6.8: MAI-Free MC-CDMA with Hadamard-Walsh Codes


In this example, we corroborate the theoretical results derived in Lemma 6.7 and
Theorem 6.4 with computer simulations. Let N = 16 and L = 2, with the CFO
value fixed to be ±0.1; that is, every user was randomly assigned a CFO value
of either 0.1 or −0.1. The MAI power was normalized by
$\sum_{k=0}^{N-1} |\lambda_i[k]|^2$ since the desired signal was scaled by the
same amount.
The values of MAI_{i←j} power for all HW codewords in G0 are tabulated
in Table 6.1. When the MAI value is below −290 dB, it is equivalent to zero
numerically. We have several interesting observations from Table 6.1. First,
users with codewords w0 and w1 are MAI-free with respect to all other users.
This result is not surprising since it was already shown in the proof of Lemma
6.6. Second, there is no MAI between any two users if one uses a codeword
from G01 = {w4 , w5 , w6 , w7 } while the other uses one from
G00 = {w0 , w1 , w2 , w3 }. This result validates Lemma 6.7. Third, if both
users use codewords from G01 (or both from G00 ), MAI may not be zero. For
example, we see MAI_{4←6} = −27.6 dB and MAI_{5←7} = −45.0 dB. Finally, we
observe that {w0 , w1 , w2 , w4 } and {w0 , w1 , w3 , w7 } are two subsets of
MAI-free codewords in the presence of CFO. This confirms our claim that there
are 1 + log2(16/2) = 4 users that are mutually MAI-free.

Table 6.1. MAI_{i←j} power (dB) as a function of Hadamard-Walsh codewords in G0 [[124] © IEEE].

        w0      w1      w2      w3      w4      w5      w6      w7
w0      ×       −324    −324    −325    −325    −328    −326    −328
w1      −325    ×       −325    −324    −328    −324    −328    −326
w2      −325    −325    ×       −30.1   −326    −328    −324    −327
w3      −325    −325    −30.1   ×       −328    −326    −328    −324
w4      −324    −329    −326    −328    ×       −33.1   −27.6   −53.6
w5      −328    −324    −328    −326    −33.1   ×       −54.25  −45.0
w6      −326    −329    −324    −328    −27.6   −54.2   ×       −33.1
w7      −328    −326    −328    −323    −53.6   −45.0   −33.1   ×

7
Simplified Multiuser Detection for MC-CDMA
with Carrier Interferometry Codes with CFO

In the previous chapter, we demonstrated that by partitioning Hadamard-Walsh
(HW) codewords into subsets, codewords in some particular subsets allow
MAI-free communication in a CFO environment. For a system with a spreading
gain N and a maximum channel length L, the number of supportable
MAI-free users with HW codes is 1 + log2(N/G), where G is a power of 2 and
L ≤ G < N . However, the number of such MAI-free users is relatively small
in most practical multipath channels. Consequently, to have a fully-loaded
MC-CDMA system, one may resort to multiuser detection techniques.
Multiuser detection (MUD) techniques have been developed to mitigate
MAI. The optimum multiuser detection scheme in [144] maximizes the like-
lihood functions of N users jointly by choosing the bits that minimize the
mean square error (MSE) between the estimated received signal and the ac-
tual received composite signal. The optimum MUD scheme improves the user
capacity gain substantially at the cost of a high computational complexity,
which grows exponentially with the number of users.
Suboptimum linear MUD techniques, such as the decorrelating detector
and the minimum mean-square error (MMSE) detector, have been designed
to lower the complexity of the optimum MUD while keeping the user capacity
gain as close to that of the optimum MUD as possible. The decorrelating
detector does not require knowledge of the received amplitudes, whereas the
MMSE detector does.
One other desirable feature of the decorrelating detector is that it can be
decentralized such that every user can be detected independently from other
users [144]. For N users, both techniques demand the inversion of an N × N
matrix, which requires O(N^3) computations with Gaussian elimination and
O(N^2) computations with fast algorithms.
There is another class of MAI suppression techniques, known as interfer-
ence cancellation (IC), whose complexity grows linearly with the user number
[4]. The IC techniques attempt to remove interference after hard decoding.
They can be classified into two types: parallel interference cancellation (PIC)
and successive interference cancellation (SIC). The PIC detector was
introduced to CDMA systems as a “multistage detector” in [143] and it was shown
to have low complexity and good performance. The PIC receiver [32, 89]
processes all N users simultaneously by cancelling their interference after all of
them have been decoded independently. The SIC receiver decodes users
successively in several stages with interference being cancelled at each stage [89].
The SIC receiver has a lower complexity than the PIC receiver at the cost of
higher latency.

C.-C.J. Kuo et al., Precoding Techniques for Digital Communication Systems,
DOI: 10.1007/978-0-387-71769-2_7, © Springer Science+Business Media, LLC 2008

Considerable effort has been devoted to reducing the complexity of
multiuser detectors. Cai et al. [158] proposed to assign a set of subcarriers
to a group of users while preserving the frequency diversity of MC-CDMA as
much as possible. With this design, MAI is only present among users in the
same group so that it can be suppressed via simplified MUD techniques. A new
ML-MUD scheme called sphere decoding was proposed for MC-CDMA, whose
complexity is a polynomial function of the user number [14]. It applies lattice
sphere decoding to the received signal that is modeled as multidimensional
lattice packing points. However, when the user number is large, the sphere
decoding ML algorithm is cumbersome to perform. Moreover, neither of these
techniques has been shown to be effective in the presence of CFO.
In asynchronous DS-CDMA where the size of cross-correlation matrix is
quite large, suboptimal approaches to implement the decorrelating detectors
have been proposed which are based on partitioning the long received sequence
into blocks of data that have more manageable size [63, 150]. A linear filter
implementation for the decorrelating detector was proposed in [82] where the
filter coefficients depend on the cross-correlations.
In this chapter, we show how the set of orthogonal carrier interferometry
(CI) codewords [92] can increase the number of MAI-free users to N/G, for a
multipath channel of length L and spreading gain N , where G is a power of 2
and L ≤ G < N , and in a CFO environment. It is worthwhile mentioning that
two sets of orthogonal CI codewords were introduced in [92] to increase user
capacity of MC-CDMA from N to 2N with negligible performance degra-
dation in a multipath fading channel. CI codes were also used as training
sequences for channel estimation to decouple the inter-antenna interference in
a CFO-free MIMO-OFDM system [72].
We also show that the use of CI codes leads to simplified multiuser
detection techniques when employed by a fully-loaded MC-CDMA system in a CFO
environment. We first use the PIC receiver to suppress MAI and evaluate the
performance of PIC in MC-CDMA with CI and HW codes over a multipath
Rayleigh fading channel with CFO. Using CI codewords, the complexity of
fully-loaded PIC (i.e., N active users) is linearly proportional to the channel
length, L, rather than the user number, N . Since N is in general much
larger than L, the complexity is substantially reduced. Next, we show that,
by exploiting the sparsity of the cross-correlation matrix of MC-CDMA with
CI codes, we can lower the complexity of the ML detector so that it
grows exponentially with the channel multipath length instead of the number
of active users. To be more specific, the complexity of the ML detector for a
fully-loaded MC-CDMA system with BPSK signals, spreading gain N and
multipath length L in a CFO environment is in the order of O(2^(2L−1)) (instead
of O(2^N), where N is the number of active users). This results in a significant
complexity reduction in practical situations. Finally, we show that the
cross-correlation matrix for MC-CDMA with N − L + 1 users can be transformed
to a band matrix so that the complexity of the decorrelating detector is
reduced to O(2(N − L + 1)(L − 1)^2). This is far less than the complexity of the
decorrelating detector for a general dense matrix of the same dimension.
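
To get a feeling for these orders of magnitude, the short Python sketch below evaluates the complexity expressions quoted above for N = 16 and L = 2. It only plugs numbers into the stated big-O expressions, ignoring constant factors, and it uses the cube of the matrix dimension as a stand-in for the dense decorrelator; the function name `detector_costs` and the dictionary keys are illustrative choices, not part of the text.

```python
def detector_costs(N, L):
    """Evaluate the order-of-magnitude expressions quoted in the text (sketch)."""
    users_band = N - L + 1                                   # users in the band-matrix case
    return {
        "ml_general": 2 ** N,                                # O(2^N) for N active users
        "ml_ci": 2 ** (2 * L - 1),                           # O(2^(2L-1)) with CI codes
        "decorrelator_dense": users_band ** 3,               # dense inversion, same dimension
        "decorrelator_band": 2 * users_band * (L - 1) ** 2,  # band-matrix expression
    }


costs = detector_costs(16, 2)
for name, value in costs.items():
    print(name, value)
```

With these values the ML search shrinks from 65536 to 8 candidate evaluations, and the decorrelator expression drops from 3375 to 30.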
Although MC-CDMA with CI codes (CI-MC-CDMA) and PIC has been
considered before [130], a detailed BEP analysis of PIC in the presence of CFO
is not yet available in the literature. Here, we analyze the bit error probability
(BEP) of fully-loaded MC-CDMA with a single-stage PIC using CI and HW
codes.
Also, we derive an upper bound for the minimum error probability MUD
for MC-CDMA in a CFO environment and compare it with simulated BEP
results for ML detector. We also develop an iterative channel/CFO estima-
tion scheme with orthogonal MAI-free HW or CI codewords, and adopt it in
computer simulation.
In the following, we first study the carrier interferometry codes and their
properties for MC-CDMA in the presence of CFO. Then, single-stage PIC
is described and the theoretical BEP for an MC-CDMA system with single-stage
PIC is analyzed in Section 7.2, along with some examples demonstrating the
performance of multiple-stage PIC. We describe the ML detector for MC-CDMA
with and without CFO in Section 7.3. The sparsity of the cross-correlation
matrix of CI codes, along with its associated tail-biting trellis, is studied in
that section. Also, an upper bound on the minimum error probability is
derived for MC-CDMA in a CFO environment. In Section 7.4, we discuss the
reduced-complexity decorrelating detector using a proper subset of CI codes
that results in a band cross-correlation matrix. Finally, simple channel and
CFO estimation techniques are proposed, and the performance of PIC with
the proposed channel/CFO estimation techniques is discussed in Section 7.5.

7.1 Orthogonal Carrier Interferometry Codes for MAI-free MC-CDMA with CFO

In this section, we briefly review the system model of MC-CDMA in a CFO
environment that was described in detail in Chapter 6. The kth component
of the DFT output, ŷ, can be written as

\[ \hat{y}[k] = \sum_{j=0}^{T-1} r_j[k] + e[k], \tag{7.1} \]

where e[k] is the DFT of the additive noise, and r_j[k] is the received signal
contributed by the jth user due to the channel fading and the CFO effects and
is given by Eq. (6.24) as

\[ r_j[k] = \alpha_j \lambda_j[k] y_j[k] + \beta_j \sum_{m=0,\, m \neq k}^{N-1} \lambda_j[m] y_j[m] g_j(m-k), \tag{7.2} \]

where λ_j[m] is the mth component of the N-point DFT of the channel impulse
response of user j, α_j and β_j are given by Eqs. (6.25) in Chapter 6, and

\[ g_j(m-k) = \frac{e^{-j\pi \frac{m-k}{N}}}{N \sin \frac{\pi (m-k+\epsilon_j)}{N}} \]

was also defined in Chapter 6. When there is no CFO (i.e., ε_j = 0),
r_j[k] = λ_j[k] y_j[k]. Since β_j g_j(0) = α_j and by Eqs. (6.1) and (7.2), we have

\[ \hat{\mathbf{y}} = \mathbf{C}\mathbf{x} + \mathbf{e}, \tag{7.3} \]

where the element of C in the ith row and the jth column is

\[ C(i,j) = \beta_j \sum_{m=0}^{N-1} g_j(m-i) \lambda_j[m] w_j[m], \tag{7.4} \]

and

\[ \hat{\mathbf{y}} = (\hat{y}[0], \hat{y}[1], \ldots, \hat{y}[N-1])^T, \]

and

\[ \mathbf{e} = (e[0], e[1], \ldots, e[N-1])^T \]

is a circularly symmetric complex Gaussian random vector with zero mean and
covariance σ^2 I.
To detect the transmitted symbols, one option is to use single-user detection
techniques such as maximum ratio combining (MRC). As shown in Chapter 6,
the MRC detector detects the ith transmitted symbol as

\[ \hat{z}_i = \sum_{k=0}^{N-1} \hat{y}[k] \lambda_i^*[k] w_i^*[k] = s_i + \sum_{j=0,\, j \neq i}^{T-1} \mathrm{MAI}_{i \leftarrow j} + \hat{e}_i, \tag{7.5} \]

where $\hat{e}_i = \sum_{k=0}^{N-1} e[k] \lambda_i^*[k] w_i^*[k]$ and s_i can be written as

\[ s_i = \sum_{k=0}^{N-1} r_i[k] \lambda_i^*[k] w_i^*[k] = \beta_i x_i \sum_{k=0}^{N-1} \sum_{m=0}^{N-1} g_i(m-k) \lambda_i[m] y_i[m] \lambda_i^*[k] w_i^*[k]. \tag{7.6} \]


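We saw in Chapter 6 that the MAI for an MC-CDMA system in the presence of
CFO is given by

\[ \mathrm{MAI}_{i \leftarrow j} = \mathrm{MAI}^{(0)}_{i \leftarrow j} + \mathrm{MAI}^{(1)}_{i \leftarrow j}, \tag{7.7} \]

where

\[ \mathrm{MAI}^{(0)}_{i \leftarrow j} = \alpha_j x_j \sum_{k=0}^{N-1} \lambda_j[k] w_j[k] \lambda_i^*[k] w_i^*[k], \tag{7.8} \]

and

\[ \mathrm{MAI}^{(1)}_{i \leftarrow j} = \beta_j x_j \sum_{k=0}^{N-1} \left( \sum_{m=0,\, m \neq k}^{N-1} \lambda_j[m] y_j[m] \frac{e^{-j\pi \frac{m-k}{N}}}{N \sin \frac{\pi (m-k+\epsilon_j)}{N}} \right) \lambda_i^*[k] w_i^*[k]. \tag{7.9} \]

In this chapter we will find it more convenient to combine Eqs. (7.8) and (7.9)
into

\[ \mathrm{MAI}_{i \leftarrow j} = \beta_j x_j \sum_{k=0}^{N-1} \sum_{m=0}^{N-1} g_j(m-k) \lambda_j[m] y_j[m] \lambda_i^*[k] w_i^*[k]. \tag{7.10} \]
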

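Equation (7.8) can be expressed in matrix form, as shown in Chapter 6:

\[ \mathrm{MAI}^{(0)}_{i \leftarrow j} = \alpha_j x_j \mathbf{h}_i^{\dagger} \underbrace{\mathbf{F}_0^{\dagger} \mathbf{W}_i^{\dagger} \mathbf{W}_j \mathbf{F}_0}_{\mathbf{A}_{ij}} \mathbf{h}_j, \tag{7.11} \]

where

\[ \mathbf{F}_0 = \mathbf{F} \begin{pmatrix} \mathbf{I}_L \\ \mathbf{0} \end{pmatrix}_{N \times L}, \qquad \mathbf{h}_i = (h_i(0) \cdots h_i(L-1))^T, \]

\[ \mathbf{W}_i = \mathrm{diag}(w_i[0] \; w_i[1] \; \cdots \; w_i[N-1]), \]

I_L is an L × L identity matrix, and F is the N × N DFT matrix
whose element at the kth row and the nth column is
$[\mathbf{F}]_{k,n} = \frac{1}{\sqrt{N}} e^{-j \frac{2\pi}{N} kn}$.

It was also shown in Chapter 6 that $\mathrm{MAI}^{(1)}_{i \leftarrow j}$ given by (7.9) can be
rewritten as

\[ \mathrm{MAI}^{(1)}_{i \leftarrow j} = \beta_j x_j \sum_{p=1}^{N-1} g_j(-p) (\mathbf{h}_i^{(p)})^{\dagger} \underbrace{\mathbf{F}_0^{\dagger} (\mathbf{W}_i^{(p)})^{\dagger} \mathbf{W}_j \mathbf{F}_0}_{\mathbf{C}_{ij}^{(p)}} \mathbf{h}_j, \tag{7.12} \]

where

\[ g_j(p) = \frac{e^{-j\pi \frac{p}{N}}}{N \sin \frac{\pi (p+\epsilon_j)}{N}}, \tag{7.13} \]

\[ \mathbf{W}_i^{(p)} = \mathrm{diag}(w_i[p] \cdots w_i[N-1] \; w_i[0] \cdots w_i[p-1]), \tag{7.14} \]

and

\[ \mathbf{h}_i^{(p)} = \left( h_i(0) e^{-j \frac{2\pi 0 p}{N}} \cdots h_i(L-1) e^{-j \frac{2\pi (L-1) p}{N}} \right)^T. \tag{7.15} \]
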
We can combine Eqs. (7.11) and (7.12) into one equation as

\[ \mathrm{MAI}_{i \leftarrow j} = \beta_j x_j \sum_{p=0}^{N-1} g_j(-p) (\mathbf{h}_i^{(p)})^{\dagger} \underbrace{\mathbf{F}_0^{\dagger} (\mathbf{W}_i^{(p)})^{\dagger} \mathbf{W}_j \mathbf{F}_0}_{\mathbf{D}_{ij}^{(p)}} \mathbf{h}_j, \tag{7.16} \]

where, for p = 0, β_j g_j(0) = α_j, $\mathbf{W}_i^{(0)} = \mathbf{W}_i$, and $\mathbf{h}_i^{(0)} = \mathbf{h}_i$.
Since a relatively small number of users can be MAI-free in a channel with
CFO using HW codes, we look for other codes to address this issue. In this
section, we study the carrier interferometry (CI) codes of size N , which are of
the following form

\[ w_i[k] = e^{j \frac{2\pi}{N} k i}, \quad k, i = 0, 1, \cdots, N-1. \tag{7.17} \]
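
As a quick numerical illustration, the following Python sketch generates CI codewords of the form w_i[k] = exp(j2πki/N) and checks that distinct codewords are orthogonal. It uses only the standard library; the helper names `ci_codeword` and `inner` are hypothetical, and the choice N = 8 is arbitrary.

```python
import cmath
import math


def ci_codeword(N, i):
    """CI codeword w_i[k] = exp(j*2*pi*k*i/N), k = 0..N-1."""
    return [cmath.exp(2j * math.pi * k * i / N) for k in range(N)]


def inner(u, v):
    """Inner product sum_k u[k] * conj(v[k])."""
    return sum(a * b.conjugate() for a, b in zip(u, v))


N = 8
codes = [ci_codeword(N, i) for i in range(N)]
gram = [[inner(codes[i], codes[j]) for j in range(N)] for i in range(N)]

# Largest off-diagonal magnitude; it is zero up to rounding for orthogonal codes.
max_cross = max(abs(gram[i][j]) for i in range(N) for j in range(N) if i != j)
print(max_cross)
```

The diagonal entries of the Gram matrix equal N and the off-diagonal entries vanish up to rounding, which is the orthogonality the codes rely on in the CFO-free case.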

Then, the MAI-free property of this code can be stated as follows.

Theorem 7.1: Let the channel length be L and G = 2^q ≥ L. There exist
N/G carrier interferometry codewords such that the corresponding MC-CDMA
system is MAI-free in a CFO environment.
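Proof: Define

\[ r_{i,j}^{(p)}[k] = w_i^{(p)}[k] \, w_j^*[k], \quad k = 0, 1, \cdots, N-1. \]

To have zero MAI in a frequency-selective channel with CFO, we demand

\[ \mathrm{MAI}_{i \leftarrow j} = 0. \]

As shown in Eq. (7.16), we can define

\[ \mathbf{D}_{ij}^{(p)} = (\mathbf{I}_L \;\; \mathbf{0}) \, \mathbf{F}^{\dagger} (\mathbf{W}_i^{(p)})^{\dagger} \mathbf{W}_j \mathbf{F} \begin{pmatrix} \mathbf{I}_L \\ \mathbf{0} \end{pmatrix}. \]

Thus, $\mathbf{D}_{ij}^{(p)}$ must be a zero matrix of dimension L × L for all i ≠ j to achieve
an MAI-free system in a CFO environment. Similar to the proof of Lemma 6.7 in
Chapter 6, we can define the matrix $\mathbf{C}_{i,j}^{(p)} = \mathbf{F}^{\dagger} (\mathbf{W}_i^{(p)})^{\dagger} \mathbf{W}_j \mathbf{F}$ and note that $\mathbf{D}_{i,j}^{(p)}$ is the
L × L upper-left submatrix of $\mathbf{C}_{i,j}^{(p)}$, which is a circulant matrix [47].
Thus, the first column of $\mathbf{C}_{i,j}^{(p)}$, $(c_{i,j}^{(p)}(0) \cdots c_{i,j}^{(p)}(N-1))^T$, is the N-point IDFT
of $r_{i,j}^{(p)}$. To have $\mathbf{D}_{i,j}^{(p)} = \mathbf{0}$, the first L samples and the last L − 1 samples of
the IDFT of $r_{i,j}^{(p)}$ must be zero.
Now, consider two codewords with indices i and i′. By taking the IDFT
of $r_{i,i'}^{(p)}$, we have

\[ r_{i,i'}^{(p)}(n) = \frac{1}{\sqrt{N}} \sum_{m=0}^{N-1} r_{i,i'}^{(p)}[m] \, e^{j \frac{2\pi}{N} mn}. \tag{7.18} \]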
Let m = k + gN/G, 0 ≤ k ≤ N/G − 1 and 0 ≤ g ≤ G − 1. We can rewrite
Eq. (7.18) as

\[ r_{i,i'}^{(p)}(n) = \frac{1}{\sqrt{N}} \sum_{k=0}^{N/G-1} \sum_{g=0}^{G-1} r_{i,i'}^{(p)}[k + gN/G] \, e^{j \frac{2\pi}{N} (k + gN/G) n}. \tag{7.19} \]

Since $w_i^{(p)}[k] = e^{j \frac{2\pi}{N} (N-p+k) i}$, and

\[ r_{i,i'}^{(p)}[k + gN/G] = w_i^{(p)}[((k + gN/G))_N] \, w_{i'}^*[((k + gN/G))_N], \]

we have

\[ r_{i,i'}^{(p)}[k + gN/G] = e^{j \frac{2\pi}{N} (N-p+k+gN/G) i} \, e^{-j \frac{2\pi}{N} (k+gN/G) i'} = e^{j \frac{2\pi}{N} (N-p+k) i} \, e^{-j \frac{2\pi}{N} k i'} \, e^{j \frac{2\pi}{G} g (i-i')}. \tag{7.20} \]

If i − i′ = mG, where m can be any nonzero integer, we get
$e^{j \frac{2\pi}{G} g (i-i')} = e^{j 2\pi m g} = 1$. Then, we have

\[ r_{i,i'}^{(p)}[k + gN/G] = e^{j \frac{2\pi}{N} (N-p+k) i} \, e^{-j \frac{2\pi}{N} k i'} = r_{i,i'}^{(p)}[k]. \tag{7.21} \]

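Then, we can rewrite Eq. (7.19) as

\[ r_{i,i'}^{(p)}(n) = \frac{1}{\sqrt{N}} \sum_{k=0}^{N/G-1} r_{i,i'}^{(p)}[k] \, e^{j \frac{2\pi}{N} kn} \sum_{g=0}^{G-1} e^{j \frac{2\pi}{G} gn}, \tag{7.22} \]

where 0 ≤ k ≤ N/G − 1 and 0 ≤ g ≤ G − 1. Since

\[ \sum_{g=0}^{G-1} e^{j \frac{2\pi}{G} gn} = \begin{cases} G, & n = 0, \pm G, \cdots \\ 0, & \text{otherwise,} \end{cases} \tag{7.23} \]

we have

\[ r_{i,i'}^{(p)}(n) = \begin{cases} \dfrac{G}{\sqrt{N}} \displaystyle\sum_{k=0}^{N/G-1} r_{i,i'}^{(p)}[k] \, e^{j \frac{2\pi}{N} kn}, & n = 0, \pm G, \cdots \\ 0, & \text{otherwise.} \end{cases} \]

Furthermore, for i ≠ i′, we have

\[ r_{i,i'}^{(p)}(0) = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} r_{i,i'}^{(p)}[k] = \sum_{k=0}^{N-1} e^{j \frac{2\pi (i-i')}{N} (N-p+k)} = 0. \tag{7.24} \]

Thus, MAI_{i←i′} = 0. Since there are N/G codewords such that i − i′ = mG,
m = 1, 2, · · · , the total number of MAI-free codewords from N exponential
codes is N/G. □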
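
The condition used in the proof can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it takes w_i^(p)[k] = exp(j2π(N − p + k)i/N) as in the proof, forms the N-point IDFT of r[k] = w_i^(p)[k] · conj(w_i′[k]), and tests whether the first L and the last L − 1 samples vanish for every shift p. The function names `idft_of_r` and `guard_samples_vanish` are hypothetical.

```python
import cmath
import math


def idft_of_r(N, i, ip, p):
    """N-point IDFT (1/sqrt(N) scaling) of r[k] = w_i^(p)[k] * conj(w_ip[k])."""
    samples = []
    for n in range(N):
        acc = 0j
        for k in range(N):
            r_k = (cmath.exp(2j * math.pi * (N - p + k) * i / N)
                   * cmath.exp(-2j * math.pi * k * ip / N))
            acc += r_k * cmath.exp(2j * math.pi * k * n / N)
        samples.append(acc / math.sqrt(N))
    return samples


def guard_samples_vanish(N, L, i, ip, tol=1e-9):
    """True if the first L and last L-1 IDFT samples are zero for all shifts p."""
    guard = list(range(L)) + list(range(N - L + 1, N))
    for p in range(N):
        samples = idft_of_r(N, i, ip, p)
        if any(abs(samples[n]) > tol for n in guard):
            return False
    return True


N, L, G = 16, 2, 2
mai_free_set = list(range(0, N, G))          # indices 0, G, 2G, ...
all_pairs_ok = all(
    guard_samples_vanish(N, L, i, ip)
    for i in mai_free_set for ip in mai_free_set if i != ip
)
adjacent_ok = guard_samples_vanish(N, L, 0, 1)   # neighbouring indices
print(all_pairs_ok, adjacent_ok)
```

For N = 16 and L = G = 2, every pair of even-indexed codewords passes the test, whereas the neighbouring pair (w0 , w1 ) does not.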

The MAI-free property of CI codes in a CFO environment can be expected


since a multiuser system with the CI spreading codes in the frequency domain
is equivalent to a TDMA system in the time domain. Besides being codewords
in an MC-CDMA system, the CI codes are especially valuable as training
sequences in a MIMO-OFDM system. For instance, it was shown in [72] that
such codes can decouple inter-antenna interference in a CFO-free channel.
Owing to Theorem 7.1, we can use CI codes as training sequences for multi-
user MIMO-OFDM systems in a CFO environment to eliminate both inter-
antenna interference and MAI.
Theorem 7.2: For two CI codewords with indices i and i′, i, i′ = 0, 1, · · · , N − 1,
we have MAI_{i←i′} = 0 in a CFO environment if $((|i - i'|))_{N-(L-1)} \geq L$,
where L is the channel length.
Proof: For p = 0, 1, · · · , N − 1, we have

\[ r_{i,i'}^{(p)}(n) = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} e^{j \frac{2\pi}{N} (N-p+k) i} \, e^{-j \frac{2\pi}{N} i' k} \, e^{j \frac{2\pi}{N} kn} = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} e^{j \frac{2\pi}{N} (i-i'+n) k} \, e^{-j \frac{2\pi}{N} p i} = \begin{cases} \sqrt{N} \, e^{-j \frac{2\pi}{N} p i}, & ((i - i' + n))_N = 0, \\ 0, & \text{otherwise.} \end{cases} \]

To meet the condition $((|i - i'|))_{N-(L-1)} \geq L$, |i − i′| should take values from
{L, L + 1, · · · , N − L}. Consider the case i − i′ ≥ 0. When i − i′ is equal to a
value in this set, $r_{i,i'}^{(p)}(n)$ can be nonzero only for n = N − L, N − L − 1, · · · , L,
by the above equation. On the other hand, if i − i′ ≤ 0, then $r_{i,i'}^{(p)}(n)$ can be
nonzero only for n = N + L, N + L + 1, · · · , 2N − L. Since

\[ r_{i,i'}^{(p)}(n + N) = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} e^{j \frac{2\pi}{N} (N-p+k) i} \, e^{-j \frac{2\pi}{N} i' k} \, e^{j \frac{2\pi}{N} k (n+N)} = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} e^{j \frac{2\pi}{N} (N-p+k) i} \, e^{-j \frac{2\pi}{N} i' k} \, e^{j \frac{2\pi}{N} kn} = r_{i,i'}^{(p)}(n), \]

$r_{i,i'}^{(p)}(n)$ is periodic with period N . We conclude that $r_{i,i'}^{(p)}(n)$ can be nonzero
only for n = L, L + 1, · · · , N − L, while
$r_{i,i'}^{(p)}(0) = r_{i,i'}^{(p)}(1) = \cdots = r_{i,i'}^{(p)}(L-1) = 0$ and
$r_{i,i'}^{(p)}(N-L+1) = r_{i,i'}^{(p)}(N-L+2) = \cdots = r_{i,i'}^{(p)}(N-1) = 0$, for $((|i - i'|))_{N-(L-1)} \geq L$.
Therefore MAI_{i←i′} = 0. □
Example 7.1: MAI-Free with CI Codewords
Let N = 16 and L = 2, with the CFO value fixed to be ±0.1. That is, every user
was randomly assigned a CFO value of either 0.1 or −0.1. The MAI power in
dB between users with different codewords is shown in Table 7.1.
The MAI power was normalized by $\sum_{k=0}^{N-1} |\lambda_i[k]|^2$ since the desired signal
was scaled by the same amount. According to Theorem 7.1 in Section 7.1,
users with even indexed codewords, i.e., {w0 , w2 , w4 , w6 , w8 , w10 , w12 , w14 },
are mutually MAI-free. This is illustrated in the top half of Table 7.1. We also
observe that users whose codeword indices differ by more than L − 1
have zero MAI, as proved by Theorem 7.2; the exceptions are wrap-around pairs
such as w15 and w0 , which do not satisfy the condition of Theorem 7.2.

Table 7.1. MAI_{i←j} power (dB) as a function of CI codewords [[124] © IEEE].

        w0      w2      w4      w6      w8      w10     w12     w14
w0      ×       −317    −306    −303    −309    −301    −294    −295
w2      −317    ×       −317    −302    −308    −309    −297    −291
w4      −306    −317    ×       −306    −306    −302    −298    −297
w6      −303    −302    −306    ×       −311    −302    −298    −297
w8      −310    −308    −305    −311    ×       −305    −298    −299
w10     −300    −308    −303    −302    −304    ×       −296    −293
w12     −295    −297    −298    −303    −298    −297    ×       −302
w14     −296    −291    −297    −301    −299    −293    −302    ×
w1      −18.6   −19.0   −309    −302    −311    −304    −296    −292
w3      −306    −18.4   −12.9   −306    −306    −301    −300    −293
w5      −305    −309    −16.2   −19.2   −304    −304    −297    −304
w7      −309    −299    −301    −19.6   −16.3   −299    −298    −308
w9      −304    −304    −306    −302    −16.6   −15.3   −296    −292
w11     −293    −304    −304    −297    −302    −18.2   −11.1   −296
w13     −297    −303    −306    −305    −307    −300    −19.9   −19.0
w15     −15.3   −297    −298    −303    −307    −302    −300    −17.6

Corollary: For an MC-CDMA system in a CFO environment consisting of N
active users with CI codewords, if L ≤ N/2, each user has only 2(L − 1)
interfering users.

Proof: The condition L ≤ N/2 guarantees that there are i and i′ satisfying
$((|i - i'|))_{N-(L-1)} \geq L$. For every codeword with index i, there are L − 1
codewords i′ on one side of i and L − 1 codewords i″ on the other side of i
(counted cyclically) for which the condition of Theorem 7.2 fails, i.e.,
$((|i - i'|))_{N-(L-1)} < L$ and $((|i'' - i|))_{N-(L-1)} < L$. Therefore, the number
of interferers is 2(L − 1).
The Corollary leads to an important result. In practice, N >> L; that
is, the spreading gain is much greater than the channel length. Thus, a user
has to combat far fewer interferers than it would if
codes other than CI codes were used. On the other hand, a user with a CI
codeword still experiences a considerable amount of MAI from its nonzero
interferers in a fully loaded MC-CDMA system. Fortunately, interference
cancellation can be employed to suppress MAI. In the next section, we will
consider the parallel interference cancellation (PIC) technique for this purpose
and show that its complexity can be reduced by the use of CI codewords since
there are only 2(L − 1) instead of N − 1 interferers.
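
The interferer count in the Corollary can be illustrated with a few lines of Python. The sketch treats a pair (i, i′) as interfering whenever it fails the sufficient condition of Theorem 7.2, i.e., whenever |i − i′| modulo N − (L − 1) is smaller than L; this reading of the condition, and the function name `interferers`, are assumptions made for the illustration.

```python
def interferers(N, L, i):
    """Indices i' for which the pair (i, i') fails the Theorem 7.2 condition."""
    modulus = N - (L - 1)
    return [ip for ip in range(N)
            if ip != i and (abs(i - ip) % modulus) < L]


N, L = 16, 2
counts = [len(interferers(N, L, i)) for i in range(N)]
print(interferers(N, L, 0), counts)
```

For N = 16 and L = 2, every user is left with 2(L − 1) = 2 interferers; user 0, for instance, interferes with users 1 and 15, in line with the w1 and w15 rows of Table 7.1.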

7.2 Complexity Reduction in PIC MUD Detection

In this section, we evaluate the performance of a fully-loaded MC-CDMA
system with BPSK modulation and parallel interference cancellation (PIC).
It will be shown that the complexity of PIC can be reduced substantially if
the CI codewords are used.
The receiver for the ith user in an MC-CDMA system with PIC is depicted
in Fig. 7.1. First, initial bit estimates for all users are derived from the
single-user detection (SUD) receivers, which are basically the same as the one
depicted in Fig. 6.1 in Chapter 6. We call this stage 0 of the PIC detector and
denote the detected symbols by $\hat{x}_i^0$, i = 1, · · · , N .
Let $\mathbf{F}_0 \mathbf{h}_j = \boldsymbol{\lambda}_j$ and $\mathbf{F}_0 \mathbf{h}_i^{(p)} = \boldsymbol{\lambda}_i^{(p)}$. We can express the detected symbol
for user i at the 0th stage of PIC as

\[ \hat{x}_i^0 = \sum_{k=0}^{N-1} |\lambda_i[k]|^2 x_i + \sum_{j=1,\, j \neq i}^{T} x_j \gamma_{i,j} + \hat{e}_i[k], \tag{7.25} \]

where

\[ \gamma_{i,j} = \beta_j \sum_{p=0}^{N-1} g_j(-p) (\boldsymbol{\lambda}_i^{(p)})^{\dagger} (\mathbf{W}_i^{(p)})^{\dagger} \mathbf{W}_j \boldsymbol{\lambda}_j, \]

and $\hat{e}_i[k] = \sum_{k=0}^{N-1} e[k] \lambda_i^*[k] w_i^*[k]$.

Fig. 7.1. The MC-CDMA receiver with single-stage PIC for the ith user.


Tentative hard decisions are made on $\hat{x}_j^0$, j = 1, 2, · · · , T , j ≠ i, to produce
initial bit estimates; namely,

\[ \mathrm{sgn}[\mathrm{Re}\{\hat{x}_j^0\}]. \]

These bits are then spread into multiple chips by their corresponding
codewords and scaled by channel gains before passing through the IDFT matrix
and the parallel-to-serial converter. To take the CFO effect into account, each
signal is multiplied by $e^{j \frac{2\pi}{N} n \epsilon_j}$, n = 0, 1, · · · , N − 1. Here, it is assumed that
the exact knowledge of channel gains and CFO values for all users is available
in the receiver. Later, simulation results with imperfect channel estimation
will also be given. These estimated signals add up to form the MAI estimate
for desired user i, which can be subtracted from its received signal ŷ.
The resultant signal is used by the receiver of user i to produce a new
detected symbol at stage 1 of the PIC detector, denoted by $\hat{x}_i^1$, as

\[ \hat{x}_i^1 = \hat{x}_i^0 - \frac{1}{N} \sum_{j=1,\, j \neq i}^{T} \sum_{k,k'=0}^{N-1} \mathrm{sgn}[\mathrm{Re}\{\hat{x}_j^0\}] \, w_j[k'] \lambda_j[k'] \left( \sum_{n=0}^{N-1} e^{j \frac{2\pi}{N} n (k'-k+\epsilon_j)} \right) w_i^*[k] \lambda_i^*[k] + \hat{e}_i[k]. \tag{7.26} \]

Alternatively, $\hat{x}_i^1$ can be expressed as

\[ \hat{x}_i^1 = \hat{x}_i^0 - \sum_{j=1,\, j \neq i}^{T} \mathrm{sgn}[\mathrm{Re}\{\hat{x}_j^0\}] \, \beta_j \sum_{p=0}^{N-1} g_j(-p) (\boldsymbol{\lambda}_i^{(p)})^{\dagger} (\mathbf{W}_i^{(p)})^{\dagger} \mathbf{W}_j \boldsymbol{\lambda}_j + \hat{e}_i[k]. \tag{7.27} \]

If a correct decision is made on a particular interferer's bit, the interference
from that user to the ith user can be completely cancelled. On the other
hand, if an incorrect decision is made, the interference from that user will be
enhanced rather than cancelled.
By substituting $\hat{x}_i^0$ from Eq. (7.25) in Eq. (7.27), we obtain

\[ \hat{x}_i^1 = x_i \sum_{k=0}^{N-1} |\lambda_i[k]|^2 + \sum_{j=1,\, j \neq i}^{T} \gamma_{i,j} \left( x_j - \mathrm{sgn}[\mathrm{Re}\{\hat{x}_j^0\}] \right) + \hat{e}_i[k]. \tag{7.28} \]

Assuming x_j is a BPSK modulated symbol, $(x_j - \mathrm{sgn}[\mathrm{Re}\{\hat{x}_j^0\}])$ is a
three-valued random variable (0, 2, −2) whose magnitude indicates whether or not
the tentative decision on the jth user's bit was correct at the previous stage.
For higher-order modulation schemes such as M -ary PAM and M -ary PSK,
$(x_j - \mathrm{sgn}[\mathrm{Re}\{\hat{x}_j^0\}])$ can take several values, which makes the BEP analysis
more difficult.
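
The cancellation step of Eqs. (7.27) and (7.28) can be illustrated with a toy example. The sketch below is a simplification under stated assumptions: the coupling coefficients γ_{i,j} are given directly as a small hypothetical matrix (rather than computed from channel, codewords and CFO), the signal energy is normalized to one, and noise is omitted. The names `pic_stage`, `gamma` and `energy` are illustrative only.

```python
def sign(v):
    """Hard decision on a real value (ties resolved to +1)."""
    return 1 if v >= 0 else -1


def pic_stage(x0_hat, gamma):
    """Single PIC stage: subtract the interference rebuilt from hard decisions."""
    T = len(x0_hat)
    decisions = [sign(v.real) for v in x0_hat]
    return [
        x0_hat[i] - sum(gamma[i][j] * decisions[j] for j in range(T) if j != i)
        for i in range(T)
    ]


# Toy setup (all values hypothetical): three BPSK users, unit signal energy,
# identical coupling 0.2 between every pair of users, no noise.
x = [1, -1, 1]
energy = 1.0
gamma = [[0.0 if i == j else 0.2 for j in range(3)] for i in range(3)]

# Stage-0 statistics: desired term plus interference from the other users.
x0 = [complex(energy * x[i] + sum(gamma[i][j] * x[j] for j in range(3) if j != i))
      for i in range(3)]
x1 = pic_stage(x0, gamma)
print(x0, x1)
```

Because all stage-0 decisions are correct in this toy case, the factor (x_j − sgn[Re{x̂_j^0}]) is zero for every interferer and the stage-1 output reduces to the desired term alone.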

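We know that λ_i is a multivariate Gaussian random vector with zero mean
and covariance matrix R whose elements are given by

\[ [\mathbf{R}]_{k,k'} = E\{\lambda_i[k] \lambda_i^*[k']\} = \sum_{l=0}^{L-1} \sum_{l'=0}^{L-1} E\{h_i(l) h_i^*(l')\} \, e^{-j \frac{2\pi}{N} kl} \, e^{j \frac{2\pi}{N} k'l'}. \tag{7.29} \]

We assume a uniform model for the multipath intensity (power-delay) profile,
i.e., the channel coefficients have the same power at each tap. Also, the typical
wide-sense stationary uncorrelated scattering (WSSUS) channel model [60] is
adopted. Then, we have

\[ E\{h_i(l) h_i^*(l')\} = \sigma_{h_i}^2 \delta(l - l'), \]

and

\[ [\mathbf{R}]_{k,k'} = \sigma_{h_i}^2 \sum_{l=0}^{L-1} e^{-j \frac{2\pi}{N} (k-k') l} = \begin{cases} \sigma_{h_i}^2 L, & k = k', \\ \sigma_{h_i}^2 \dfrac{1 - e^{-j 2\pi (k-k') L/N}}{1 - e^{-j 2\pi (k-k')/N}}, & k \neq k'. \end{cases} \tag{7.30} \]

Since γ_{i,j} is a linear transform of λ_j given λ_i, γ_{i,j}|λ_i is a circularly symmetric
complex Gaussian random variable with zero mean and variance

\[ \mathrm{var}[\gamma_{i,j} | \boldsymbol{\lambda}_i] = |\beta_j|^2 \sum_{p,p'=0}^{N-1} (\mathbf{W}_i^{(p)} \boldsymbol{\lambda}_i^{(p)})^{\dagger} \mathbf{W}_j \mathbf{R}_j \mathbf{W}_j^{\dagger} (\mathbf{W}_i^{(p')} \boldsymbol{\lambda}_i^{(p')}) \, g_j(-p) \, g_j^*(-p'). \tag{7.31} \]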
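
The closed form in Eq. (7.30) is a finite geometric series, and it can be checked against the direct sum over the L taps. The following Python sketch does this for one arbitrary setting (N = 16, L = 3, unit tap power); the function names and parameter values are illustrative assumptions.

```python
import cmath
import math


def cov_direct(N, L, k, kp, tap_power=1.0):
    """[R]_{k,k'} as the direct sum over the L equal-power taps."""
    return tap_power * sum(
        cmath.exp(-2j * math.pi * (k - kp) * l / N) for l in range(L)
    )


def cov_closed_form(N, L, k, kp, tap_power=1.0):
    """[R]_{k,k'} from the geometric-series closed form of Eq. (7.30)."""
    if (k - kp) % N == 0:
        return complex(tap_power * L)
    num = 1 - cmath.exp(-2j * math.pi * (k - kp) * L / N)
    den = 1 - cmath.exp(-2j * math.pi * (k - kp) / N)
    return tap_power * num / den


N, L = 16, 3
max_gap = max(
    abs(cov_direct(N, L, k, kp) - cov_closed_form(N, L, k, kp))
    for k in range(N) for kp in range(N)
)
print(max_gap)
```

The two expressions agree up to rounding for every pair (k, k′), and the diagonal entries equal the tap power times L.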
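The probability of error for user i at stage 0 of the PIC detector, using
Eq. (7.25), can be written as

\[ \begin{aligned} P[\mathrm{sgn}[\mathrm{Re}\{\hat{x}_i^0\}] \neq x_i] &= \tfrac{1}{2} P[\mathrm{Re}\{\hat{x}_i^0\} < 0 \,|\, x_i = +1] + \tfrac{1}{2} P[\mathrm{Re}\{\hat{x}_i^0\} > 0 \,|\, x_i = -1] \\ &= P[\mathrm{Re}\{\hat{x}_i^0\} > 0 \,|\, x_i = -1] \\ &= P\left[ \mathrm{Re}\left\{ \hat{z}_k + \sum_{j=1,\, j \neq i}^{T} \gamma_{i,j} x_j \right\} > \sum_{k=0}^{N-1} |\lambda_i[k]|^2 \right]. \end{aligned} \tag{7.32} \]

Due to the presence of the x_j terms, $\sum_{j=1,\, j \neq i}^{T} x_j \gamma_{i,j}$ in Eq. (7.25) is not
Gaussian for given λ_i. However, if it is conditioned on all possible x_j , j ≠ i,
we will have a collection of Gaussian random variables that can be
approximated by the Gaussian distribution [144]. Hence, we see that
$I_{0i} = \mathrm{Re}\{\hat{e}[k] + \sum_{j=1,\, j \neq i}^{T} \gamma_{i,j} x_j\} - \sum_{k=0}^{N-1} |\lambda_i[k]|^2$ in the above equation is
conditionally Gaussian given λ_i. By using $\Delta x_i^0$ to denote $(x_i - \mathrm{sgn}[\mathrm{Re}\{\hat{x}_i^0\}])$
and noting that $E\{\mathrm{Re}^2\{z[k]\}\} = \frac{1}{2}\sigma^2$, the probability of error conditioned
on λ_i is simply given by

\[ P[\Delta x_i^0 \neq 0 \,|\, \boldsymbol{\lambda}_i] = Q\left( \frac{\sum_{k=0}^{N-1} |\lambda_i[k]|^2}{\sqrt{\sum_{j=1,\, j \neq i}^{T} \sigma_{\gamma_{i,j}}^2 + \sigma^2/2}} \right), \tag{7.33} \]

where

\[ \sigma_{\gamma_{i,j}}^2 = E\{\mathrm{Re}^2\{\gamma_{i,j} | \boldsymbol{\lambda}_i\}\} = \tfrac{1}{2} \mathrm{var}[\gamma_{i,j} | \boldsymbol{\lambda}_i]. \]

Then, the BEP can be obtained by

\[ P[\Delta x_i^0 \neq 0] = \int_0^{\infty} Q\left( \frac{\sum_{k=0}^{N-1} |\lambda_i[k]|^2}{\sqrt{\sum_{j=1,\, j \neq i}^{T} \sigma_{\gamma_{i,j}}^2 + \sigma^2/2}} \right) P[\boldsymbol{\lambda}_i] \, d\boldsymbol{\lambda}_i. \tag{7.34} \]

This integral can be calculated by the Monte Carlo method [102].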
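
A minimal Python sketch of such a Monte Carlo evaluation is given below. It is an illustration under simplifying assumptions rather than the procedure used for the results in this chapter: the channel taps are i.i.d. complex Gaussian with unit variance, the DFT carries the 1/√N scaling of the matrix F, and the interference variance is replaced by a fixed hypothetical constant `interference_var` instead of being computed from Eq. (7.31) for every channel draw. All function and parameter names are assumptions.

```python
import cmath
import math
import random


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def channel_dft(N, taps):
    """N-point DFT (1/sqrt(N) scaling) of the zero-padded channel taps."""
    return [
        sum(h * cmath.exp(-2j * math.pi * k * l / N) for l, h in enumerate(taps))
        / math.sqrt(N)
        for k in range(N)
    ]


def monte_carlo_bep(N, L, interference_var, noise_var, trials=2000, seed=0):
    """Average Q(energy / sqrt(interference_var + noise_var/2)) over channel draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Unit-variance complex Gaussian taps (variance 1/2 per real dimension).
        taps = [complex(rng.gauss(0, math.sqrt(0.5)), rng.gauss(0, math.sqrt(0.5)))
                for _ in range(L)]
        energy = sum(abs(v) ** 2 for v in channel_dft(N, taps))
        total += q_function(energy / math.sqrt(interference_var + noise_var / 2))
    return total / trials


bep_low_noise = monte_carlo_bep(16, 2, interference_var=0.1, noise_var=0.5)
bep_high_noise = monte_carlo_bep(16, 2, interference_var=0.1, noise_var=2.0)
print(bep_low_noise, bep_high_noise)
```

Each trial draws one channel realization, evaluates the Q function for that realization, and the average over all trials approximates the integral; a faithful implementation would recompute the interference variances for every draw.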


We are now ready to derive the BEP of the detected symbol after stage 1
of PIC. From Eq. (7.28), we have

\[ P[\mathrm{Re}\{\hat{x}_i^1\} > 0 \,|\, x_i = -1] = P\left[ \mathrm{Re}\left\{ \hat{z}_k + \sum_{j=1,\, j \neq i}^{T} \gamma_{i,j} \Delta x_j^0 \right\} > \sum_{k=0}^{N-1} |\lambda_i[k]|^2 \right]. \tag{7.35} \]

Obtaining a closed form for the BEP is difficult since the $\Delta x_j^0$'s are dependent
given λ_i. Following the arguments in [32] for the BEP derivation of DS-CDMA
with PIC, the $\Delta x_j^0$'s are actually not strongly dependent. For example, consider
$\Delta x_j^0$ and $\Delta x_{j'}^0$. There are only two terms (i.e., $x_{j'} \gamma_{j,j'}$ in $\Delta x_j^0$ and $x_j \gamma_{j',j}$
in $\Delta x_{j'}^0$) that are dependent. Since, given λ_i, γ_{i,j} and γ_{i,j′} are independent,
we can assume with reasonable precision that the terms $\gamma_{i,j} \Delta x_j^0$, j = 1, 2, . . . , T ,
j ≠ i, are independent. Using the above simplifying assumption, we shall
derive two formulas for the BEP based on two probabilistic models for the residual
interference $\sum_{j=1,\, j \neq i}^{T} \gamma_{i,j} \Delta x_j^0$ in the next two subsections.

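7.2.1 Derivation of BEP Assuming Gaussian Model for Residual Interference

Using the above simplifying assumptions, we can assume that the distribution of
the total residual interference (after cancellation) converges to the Gaussian
distribution for a sufficiently large number, T , of users. Under this assumption
and by noting

\[ E\{(\Delta x_j^0)^2 \, \mathrm{Re}^2\{\gamma_{i,j}\} \,|\, \boldsymbol{\lambda}_i\} = 4 P[\Delta x_j^0 \neq 0] \, E\{\mathrm{Re}^2\{\gamma_{i,j}\} \,|\, \boldsymbol{\lambda}_i\} \tag{7.36} \]

and

\[ E^2\{\Delta x_j^0 \, \mathrm{Re}\{\gamma_{i,j}\} \,|\, \boldsymbol{\lambda}_i\} = E\{\Delta x_j^0 \, \mathrm{Re}\{\gamma_{i,j}\} \, \Delta x_{j'}^0 \, \mathrm{Re}\{\gamma_{i,j'}\} \,|\, \boldsymbol{\lambda}_i\} = 0, \tag{7.37} \]

we can derive the conditional BEP as

\[ P[\mathrm{error} \,|\, \boldsymbol{\lambda}_i] = Q\left( \frac{\sum_{k=0}^{N-1} |\lambda_i[k]|^2}{\sqrt{\sum_{j=1,\, j \neq i}^{T} 4 P[\Delta x_j^0 \neq 0] \, \sigma_{\gamma_{i,j}}^2 + \sigma^2/2}} \right), \tag{7.38} \]

and the BEP is given by

\[ P[\mathrm{error}] = \int_0^{\infty} Q\left( \frac{\sum_{k=0}^{N-1} |\lambda_i[k]|^2}{\sqrt{\sum_{j=1,\, j \neq i}^{T} 4 P[\Delta x_j^0 \neq 0] \, \sigma_{\gamma_{i,j}}^2 + \sigma^2/2}} \right) P[\boldsymbol{\lambda}_i] \, d\boldsymbol{\lambda}_i. \tag{7.39} \]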

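7.2.2 Derivation of BEP Using Non-Gaussian Model for Residual Interference

The assumption of Gaussian residual interference does not hold when the number
of interfering users is not sufficiently large. For example, as proved in the
previous section, every user encounters only 2(L − 1) nonzero interference
terms if the CI codewords are used. Thus, for small values of L, the Gaussian
assumption for residual interference is not reasonable. The distribution of the
residual interference must then be derived explicitly, which leads to a new
BEP formula.
In the following, we first derive the BEP formula for a system with only 2
interferers. Then, the result is extended to T interferers.
Suppose users i′ and i″ are interfering users for user i. The detected
received symbol for user i is given by

\[ \hat{x}_i^1 = x_i \sum_{k=0}^{N-1} |\lambda_i[k]|^2 + \Delta x_{i'}^0 \gamma_{i,i'} + \Delta x_{i''}^0 \gamma_{i,i''} + \hat{e}[k]. \tag{7.40} \]

Let $\Delta x_{i'}^0 \gamma_{i,i'} = U_1$, $\Delta x_{i''}^0 \gamma_{i,i''} = U_2$, and $\hat{e}[k] = Z$. Then, the probability
density function of U_1 conditioned on λ_i, denoted by $f_{U_1}(u_1 | \boldsymbol{\lambda}_i)$, can be obtained
by

\[ \begin{aligned} f_{U_1}(u_1 | \boldsymbol{\lambda}_i) &= f_{U_1}(u_1 | \Delta x_{i'}^0 = 2; \boldsymbol{\lambda}_i) P[\Delta x_{i'}^0 = 2 | \boldsymbol{\lambda}_i] \\ &\quad + f_{U_1}(u_1 | \Delta x_{i'}^0 = -2; \boldsymbol{\lambda}_i) P[\Delta x_{i'}^0 = -2 | \boldsymbol{\lambda}_i] \\ &\quad + f_{U_1}(u_1 | \Delta x_{i'}^0 = 0; \boldsymbol{\lambda}_i) P[\Delta x_{i'}^0 = 0 | \boldsymbol{\lambda}_i] \\ &= f_{U_1}(2\gamma_{i,i'} | \boldsymbol{\lambda}_i) P[\Delta x_{i'}^0 = 2 | \boldsymbol{\lambda}_i] + f_{U_1}(-2\gamma_{i,i'} | \boldsymbol{\lambda}_i) P[\Delta x_{i'}^0 = -2 | \boldsymbol{\lambda}_i] \\ &\quad + \delta(u_1) P[\Delta x_{i'}^0 = 0 | \boldsymbol{\lambda}_i]. \end{aligned} \tag{7.41} \]

We use N (a, b) to denote the Gaussian distribution, where a and b are the
mean and variance, respectively. Since

\[ f_{U_1}(2\gamma_{i,i'} | \boldsymbol{\lambda}_i) = f_{U_1}(-2\gamma_{i,i'} | \boldsymbol{\lambda}_i) = \mathcal{N}(0, 4\sigma_{\gamma_{i,i'}}^2), \]

we obtain

\[ f_{U_1}(u_1 | \boldsymbol{\lambda}_i) = P[\Delta x_{i'}^0 \neq 0 | \boldsymbol{\lambda}_i] \, \mathcal{N}(0, 4\sigma_{\gamma_{i,i'}}^2) + (1 - P[\Delta x_{i'}^0 \neq 0 | \boldsymbol{\lambda}_i]) \, \delta(u_1), \]

where

\[ P[\Delta x_{i'}^0 \neq 0 | \boldsymbol{\lambda}_i] = Q\left( \frac{\sum_{k=0}^{N-1} |\lambda_{i'}[k]|^2}{\sqrt{\sum_{j=1,\, j \neq i'}^{T} \sigma_{\gamma_{i',j}}^2 + \sigma^2/2}} \right). \]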
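Similarly,

\[ f_{U_2}(u_2 | \boldsymbol{\lambda}_i) = P[\Delta x_{i''}^0 \neq 0 | \boldsymbol{\lambda}_i] \, \mathcal{N}(0, 4\sigma_{\gamma_{i,i''}}^2) + (1 - P[\Delta x_{i''}^0 \neq 0 | \boldsymbol{\lambda}_i]) \, \delta(u_2). \]

Again, $\Delta x_{i'}^0$ and $\Delta x_{i''}^0$ are assumed to be independent. Thus, U_1 and U_2
are independent and, given the independence among Z, U_1 and U_2 , the
pdf of their sum can be obtained by $f_{U_1}(u_1 | \boldsymbol{\lambda}_i) * f_{U_2}(u_2 | \boldsymbol{\lambda}_i) * f_Z(z | \boldsymbol{\lambda}_i)$, where
∗ denotes the convolution operation. Writing $P' = P[\Delta x_{i'}^0 \neq 0 | \boldsymbol{\lambda}_i]$ and
$P'' = P[\Delta x_{i''}^0 \neq 0 | \boldsymbol{\lambda}_i]$ for brevity, we have

\[ \begin{aligned} f_V(v | \boldsymbol{\lambda}_i) &= f_{U_1}(u_1 | \boldsymbol{\lambda}_i) * f_{U_2}(u_2 | \boldsymbol{\lambda}_i) * f_Z(z | \boldsymbol{\lambda}_i) \\ &= P' P'' \, \mathcal{N}(0, 4\sigma_{\gamma_{i,i'}}^2 + 4\sigma_{\gamma_{i,i''}}^2 + \sigma^2/2) \\ &\quad + P' (1 - P'') \, \mathcal{N}(0, 4\sigma_{\gamma_{i,i'}}^2 + \sigma^2/2) \\ &\quad + P'' (1 - P') \, \mathcal{N}(0, 4\sigma_{\gamma_{i,i''}}^2 + \sigma^2/2) \\ &\quad + (1 - P')(1 - P'') \, \mathcal{N}(0, \sigma^2/2). \end{aligned} \tag{7.42} \]

Thus, the conditional BEP after one stage of the PIC detector, with CI codewords
and L = 2, is given by

\[ \begin{aligned} P[\mathrm{error} | \boldsymbol{\lambda}_i] &= P' P'' \, Q\left( \frac{\sum_{k=0}^{N-1} |\lambda_i[k]|^2}{\sqrt{4(\sigma_{\gamma_{i,i'}}^2 + \sigma_{\gamma_{i,i''}}^2) + \sigma^2/2}} \right) \\ &\quad + (1 - P')(1 - P'') \, Q\left( \frac{\sum_{k=0}^{N-1} |\lambda_i[k]|^2}{\sqrt{\sigma^2/2}} \right) \\ &\quad + P' (1 - P'') \, Q\left( \frac{\sum_{k=0}^{N-1} |\lambda_i[k]|^2}{\sqrt{4\sigma_{\gamma_{i,i'}}^2 + \sigma^2/2}} \right) \\ &\quad + (1 - P') P'' \, Q\left( \frac{\sum_{k=0}^{N-1} |\lambda_i[k]|^2}{\sqrt{4\sigma_{\gamma_{i,i''}}^2 + \sigma^2/2}} \right). \end{aligned} \tag{7.43} \]
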

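Next, we extend the above result to any number, I, of interferers. Let the set
of all interfering users' indices for user i be S_i. Then we have

\[ \hat{x}_i^1 = x_i \sum_{k=0}^{N-1} |\lambda_i[k]|^2 + \Delta x_{S_i[1]}^0 \gamma_{i,S_i[1]} + \Delta x_{S_i[2]}^0 \gamma_{i,S_i[2]} + \ldots + \Delta x_{S_i[I]}^0 \gamma_{i,S_i[I]} + \hat{e}_i[k], \tag{7.44} \]

and the conditional BEP can be obtained as

\[ \begin{aligned} P[\mathrm{error} | \boldsymbol{\lambda}_i] = \sum_{r_1=0}^{1} \sum_{r_2=0}^{1} \cdots \sum_{r_I=0}^{1} \Bigg\{ & P[\Delta x_{S_i[1]}^0 \neq 0 | \boldsymbol{\lambda}_i]^{r_1} \cdots P[\Delta x_{S_i[I]}^0 \neq 0 | \boldsymbol{\lambda}_i]^{r_I} \\ & \times (1 - P[\Delta x_{S_i[1]}^0 \neq 0 | \boldsymbol{\lambda}_i])^{1-r_1} \cdots (1 - P[\Delta x_{S_i[I]}^0 \neq 0 | \boldsymbol{\lambda}_i])^{1-r_I} \\ & \times Q\left( \frac{\sum_{k=0}^{N-1} |\lambda_i[k]|^2}{\sqrt{4(r_1 \sigma_{\gamma_{i,S_i[1]}}^2 + \ldots + r_I \sigma_{\gamma_{i,S_i[I]}}^2) + \sigma^2/2}} \right) \Bigg\}, \end{aligned} \tag{7.45} \]

and

\[ P[\mathrm{error}] = \int_0^{\infty} P[\mathrm{error} | \boldsymbol{\lambda}_i] \, P[\boldsymbol{\lambda}_i] \, d\boldsymbol{\lambda}_i. \tag{7.46} \]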
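
The sum over all error patterns in the conditional BEP above can be evaluated by direct enumeration, as the following Python sketch shows. It is a sketch under stated assumptions: the signal energy, the per-interferer stage-0 error probabilities and the variances are passed in as plain numbers (the example values are hypothetical), and the function name `conditional_bep` is illustrative.

```python
import math
from itertools import product


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def conditional_bep(energy, p_wrong, var_gamma, noise_var):
    """Mixture BEP: enumerate which interferers were decided wrongly at stage 0.

    energy    : sum_k |lambda_i[k]|^2
    p_wrong   : stage-0 error probability of each interferer
    var_gamma : per-interferer variance sigma^2_gamma
    noise_var : sigma^2
    """
    total = 0.0
    for pattern in product((0, 1), repeat=len(p_wrong)):
        weight = 1.0
        residual_var = noise_var / 2
        for r, p, v in zip(pattern, p_wrong, var_gamma):
            weight *= p if r else (1 - p)
            residual_var += 4 * r * v     # a wrong decision leaves a +/-2 gamma residual
        total += weight * q_function(energy / math.sqrt(residual_var))
    return total


# Hypothetical numbers for two interferers (the L = 2 case with CI codewords).
bep_two = conditional_bep(1.5, [0.05, 0.1], [0.02, 0.03], 0.4)
print(bep_two)
```

For I interferers the enumeration visits 2^I patterns, which stays small with CI codewords because I = 2(L − 1); with two interferers it reproduces the four terms of the L = 2 expression.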

Example 7.2: Theoretical vs Simulated BEP of Fully-Loaded MC-


CDMA with Single Stage PIC
In this example, we conduct the Monte Carlo simulation to evaluate the BEP
performance of MC-CDMA with PIC in the presence of CFO and to cor-
roborate theoretical results derived above. We consider both analytical and
simulated BEP performance.
In the simulation, channel taps were generated as independent and identically distributed (i.i.d.) random variables of unit variance. Every user was randomly assigned a CFO value of either 0.1 or −0.1. As before, the Monte Carlo integration method [102] was used to compute the analytical BEP. That is, random variables λi[k], k = 0, 1, · · · , N − 1, are generated K times by taking the DFT of complex Gaussian distributed channel taps. The computer-generated samples of λi[k], k = 0, 1, · · · , N − 1, are then substituted into the Q function, and the sum over all trials is divided by K.
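The Monte Carlo integration described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the DFT is taken without normalization, each tap has unit variance, and the conditional BEP is passed in as a hypothetical callable `cond_bep` that maps the channel energy (the sum of |λi[k]|² over k) to a probability. None of these names or scalings are taken from the text.

```python
import cmath
import math
import random


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def channel_energy(num_taps, n_sub, rng):
    """Draw i.i.d. complex Gaussian taps and return sum_k |lambda[k]|^2."""
    std = math.sqrt(0.5)  # unit variance per complex tap
    taps = [complex(rng.gauss(0.0, std), rng.gauss(0.0, std))
            for _ in range(num_taps)]
    energy = 0.0
    for k in range(n_sub):
        # lambda[k]: N-point DFT of the taps (unnormalized, an assumption)
        lam = sum(h * cmath.exp(-2j * math.pi * k * l / n_sub)
                  for l, h in enumerate(taps))
        energy += abs(lam) ** 2
    return energy


def monte_carlo_bep(cond_bep, num_taps, n_sub, trials, seed=0):
    """Average the conditional BEP over `trials` channel realizations."""
    rng = random.Random(seed)
    total = sum(cond_bep(channel_energy(num_taps, n_sub, rng))
                for _ in range(trials))
    return total / trials


def interference_free_bep(energy, sigma2=100.0):
    """Example conditional BEP: the interference-free Q term of Eq. (7.43)."""
    return q_func(energy / math.sqrt(sigma2 / 2.0))
```

For instance, `monte_carlo_bep(interference_free_bep, 2, 16, 1000)` averages the interference-free term over K = 1000 channel draws with L = 2 and N = 16.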
Figure 7.2 depicts analytical and simulated BEP results for a fully-loaded MC-CDMA system with CI and HW codewords as a function of the SNR value, Eb/N0, under the setting of N = 16, L = 2, T = 16, and CFO = ±0.1. To shorten simulation time, only the BEP of the first user was compared. Since L = 2, we can use Eqs. (7.43) and (7.46) as the analytical BEP expressions of MC-CDMA with CI codewords. For this case, we observe a close agreement between the analytical BEP expression and simulation results. However, if the BEP is computed from the approximate expression (7.39), analytical and simulation results do not agree closely, particularly in the high SNR regime, as shown in Fig. 7.2 for HW codewords. The disagreement is due to the assumption of an analytical Gaussian model for the total residual user interference in Eq. (7.40), whereas this simplifying assumption is not present in computer simulation.
The analysis of the performance of multiple-stage PIC is even more complicated than the performance analysis given in this section. We shall evaluate the performance of multiple-stage PIC with both HW and CI codes by computer simulation in the next example.

Fig. 7.2. Analytical and simulated BEP versus Eb/N0 with N = 16, L = 2 and CFO = ±0.1. (Plot of bit error probability versus Eb/N0 in dB; curves: analytical and simulated results for CI and HW codes, without PIC and with one-stage PIC.)

Example 7.3: Multistage PIC and Effect of Limit Cycle Points


In this example, we use two-stage PIC to illustrate the significant performance improvement of PIC and the effect of convergence to limit cycle points, which is described below. A point at the qth stage of PIC is expressed as x̂^q = (x̂^q_1, x̂^q_2, · · · , x̂^q_T). A limit cycle point is a point that is periodic with a period equal to a certain number of iterations. For example, for a limit cycle point with period 2, we have x̂^{q+2} = x̂^q.
The simulation was performed under two sets of parameters: (1) N = 16, L = 2 and (2) N = 32, L = 4. We set CFO = ±0.1 and the user number T = 16. In Figs. 7.3 and 7.4, BEP results are given for both sets of parameters. For HW codewords, we see that two-stage PIC can enhance the BEP performance significantly. However, the performance gain obtained by the second stage of PIC is not significant for CI codewords. We see from Figs. 7.2, 7.3 and 7.4 that the performance of single-stage and two-stage PIC is better with HW codewords.
It is known that the performance of a multistage PIC detector depends on the cross-correlation of codewords [13, 144]. In general, multistage PIC may or may not converge to the optimum (jointly maximum likelihood) solution [144]. It was shown in [13] that, for a CDMA system employing random spreading sequences, the poor performance of multistage PIC is mostly due to the existence of limit cycle points with a period of 2. This often happens when the number of users approaches the spreading factor. We observed in computer simulation that the large number of limit cycle points accounts for the worse performance of PIC with CI codewords. For example, for N = 32, L = 4 and T = 16, the ratio of the number of limit cycle points to the total number of simulated samples is 7/3125 for HW codewords, while this ratio becomes 52/3125 for CI codewords.

Fig. 7.3. BEP versus the SNR with N = 16, L = 2, CFO = ±0.1. (Plot of bit error probability versus Eb/N0 in dB; curves: zero-stage, one-stage, two-stage, and LCM PIC for CI and HW codewords.)

Due to the periodic nature of limit cycle points, they can be detected easily. On the other hand, if a fixed point is reached, it is difficult to determine whether it is optimal or non-optimal. Three limit cycle mitigation (LCM) techniques were proposed in [13] to improve the poor convergence behavior of PIC. The simplest one among the three was employed in this work. That is, if a limit cycle point is detected, the receiver only attempts to cancel the interference from users whose bit estimates are fixed during multiple PIC iterations, and it does not attempt to cancel the interference from those users with toggling bit estimates.
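The detection step and the selection rule of this LCM technique can be illustrated with a short Python sketch. It is a simplified reading of the description above, not the algorithm of [13] itself: `history` is a hypothetical list holding the bit-estimate vector produced by each PIC stage, and only period-2 cycles are checked.

```python
def is_period2_limit_cycle(history):
    """True if the last three PIC stages satisfy x^(q+2) == x^q != x^(q+1)."""
    if len(history) < 3:
        return False
    return history[-1] == history[-3] and history[-1] != history[-2]


def stable_users(history):
    """Users whose bit estimates did not toggle between the last two stages.

    Under the LCM rule described above, only the interference from these
    users would be cancelled in the next stage.
    """
    last, prev = history[-1], history[-2]
    return [u for u, (a, b) in enumerate(zip(last, prev)) if a == b]


# Example: users 1 and 2 toggle between stages, users 0 and 3 are stable.
history = [(1, -1, 1, 1), (1, 1, -1, 1), (1, -1, 1, 1)]
cycle_detected = is_period2_limit_cycle(history)
cancel_set = stable_users(history)
```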
With this technique, the performance of PIC greatly improves for both HW and CI codewords as shown in Figs. 7.3 and 7.4, where the PIC detector with the limit cycle mitigation (LCM) algorithm is called LCM PIC. The BEP performance of MC-CDMA with HW codewords is still superior to that of CI codewords. However, multistage PIC for MC-CDMA with CI codewords is much less complex than that with HW codewords, since each PIC stage for any user with a CI codeword has to deal with only 2(L − 1) nonzero interferers, as stated in the Corollary of Theorem 7.2 in Section 7.1, while a user with an HW codeword must deal with all N − 1 interferers. We also observe that increasing N from 16 to 32 improves the BEP performance for both codewords. The reason is that the number of mutually zero MAI terms increases as the subcarrier number increases for a fixed number of users.

Fig. 7.4. BEP versus the SNR with N = 32, L = 4, CFO = ±0.1. (Plot of bit error probability versus Eb/N0 in dB; curves: zero-stage, one-stage, two-stage, and LCM PIC for CI and HW codewords.)

Example 7.4: The Effect of User Number on the Performance
of PIC
Figure 7.5 shows the BEP results versus the user number with N = 32, L = 4, CFO = ±0.3 and Eb/N0 = 10 dB. Based on the N and L values, there are eight MAI-free users with CI codewords and four MAI-free users with HW codewords. Thus, CI codewords outperform HW codewords when the number of active users is between 6 and 8, as demonstrated in Fig. 7.5. However, as the number of users exceeds 8, the performances of both systems become comparable. Figure 7.5 also shows the performance of PIC with the LCM algorithm for both CI and HW codewords. The two systems have comparable performance with PIC when the number of users is between 6 and 8, but HW outperforms CI as the number of users increases. However, for both codes, the PIC detector greatly improves the performance of lightly or heavily loaded MC-CDMA.

Fig. 7.5. BEP versus the active user number with N = 32, L = 4, CFO = ±0.3, and Eb/N0 = 10 dB. (Plot of bit error probability versus number of users; curves: zero-stage PIC and LCM PIC for CI and HW codewords.)

7.3 Complexity Reduction in ML MUD Detection


We consider the ML detection based on the received signal given in Eq. (7.3) in this section. It is assumed that the receiver has perfect knowledge of channel coefficients and CFO values. We will examine the multipath and the CFO effects separately. By Theorem 7.2, we know that, for two CI codewords with indices i and i′, where i, i′ = 0, 1, · · · , N − 1, MAI_{i←i′} = 0 in a CFO environment if ((|i − i′|))_{N−(L−1)} ≥ L, where ((n))_N denotes n modulo N and L is the channel length. We show in this section how the above property of CI codes can reduce the complexity of the ML-MUD receiver significantly.

7.3.1 ML-MUD in Multipath Fading Channel


For a given transmitted signal x, we would like to maximize the likelihood of the received signal. From Eq. (7.3) and since noise n is Gaussian, the ML estimate can be written as
\[
\hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \|\hat{\mathbf{y}} - \mathbf{C}\mathbf{x}\|^2. \tag{7.47}
\]
By expanding the right hand side of (7.47) and noting that ||ŷ||² is independent of x, we can reformulate the optimization problem as
\[
\hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \Omega(\mathbf{x}),
\]
where
\[
\Omega(\mathbf{x}) = \|\mathbf{C}\mathbf{x}\|^2 - 2\Re\{(\mathbf{C}\mathbf{x}, \hat{\mathbf{y}})\},
\]
and where C is defined in Eq. (7.4) in a CFO environment. If there is no CFO, Eq. (7.4) can be written as
\[
\mathbf{C}(i, j) = w_j[i]\,\lambda_j[i].
\]

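Thus, we have
\[
(\mathbf{C}\mathbf{x}, \hat{\mathbf{y}}) = \mathbf{x}^{\dagger}(\mathbf{C}^{\dagger}\hat{\mathbf{y}}) = \sum_{j=0}^{T-1} x_j \sum_{k=0}^{N-1} \hat{y}_j[k]\, w_j^{*}[k]\, \lambda_j^{*}[k].
\]
Note that \(\sum_{k=0}^{N-1} \hat{y}_j[k]\, w_j^{*}[k]\, \lambda_j^{*}[k]\) is actually the estimate of the input signal for user j obtained by MRC (i.e., ẑ_j). On the other hand,
\[
\|\mathbf{C}\mathbf{x}\|^2 = \mathbf{x}^{\dagger} \mathbf{H}^{(0)} \mathbf{x},
\]
where H^{(0)} = C†C. Then, the ML optimization problem is equivalent to minimizing
\[
\Omega(\mathbf{x}) = \mathbf{x}^{\dagger} \mathbf{H}^{(0)} \mathbf{x} - 2\Re\{(\mathbf{x}^{\dagger}, \hat{\mathbf{z}})\}
\]
with respect to x, where ẑ is the output vector of MRC with its ith element given in (7.5). In the absence of CFO, αj = 1 for any user j and the MAI from user j to desired user i is equal to MAI^{(0)}_{i←j}. Then, we have from Eq. (7.8) that
\[
\mathrm{MAI}^{(0)}_{i \leftarrow j} = x_j \sum_{k=0}^{N-1} \lambda_j[k]\, w_j[k]\, \lambda_i^{*}[k]\, w_i^{*}[k].
\]
It can be easily shown that H^{(0)}(i, j) = MAI^{(0)}_{i←j}/x_j. In fact, H^{(0)} can be viewed as the cross-correlation channel matrix.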

7.3.2 ML-MUD in Multipath Fading Channel with CFO

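We rewrite Eq. (7.5) in vector format as
\[
\hat{\mathbf{z}} = \mathbf{H}\mathbf{x} + \hat{\mathbf{e}}, \tag{7.48}
\]
where the (i, j)th entry of H is equal to
\[
\mathbf{H}(i, j) =
\begin{cases}
s_i, & i = j, \\
\mathrm{MAI}_{i \leftarrow j} / x_j, & i \neq j.
\end{cases}
\]
The ML detector has the following form
\[
\hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \Omega(\mathbf{x}), \tag{7.49}
\]
where
\[
\Omega(\mathbf{x}) = \|\hat{\mathbf{z}} - \mathbf{H}\mathbf{x}\|^2 = \sum_{i=0}^{T-1} |\hat{z}_i - \mathbf{h}_i \mathbf{x}|^2, \tag{7.50}
\]
and where h_i is the ith row of H.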

7.3.3 Viterbi Algorithm for Tail Biting Trellis (TBT)

We know from Theorem 7.2 and its Corollary that for T = N active users with
CI codewords and L ≤ N/2, each user has only 2(L − 1) (instead of N − 1)
interfering users. Both H(0) and H are sparse matrices so that ML-MUD can
be performed with a much lower complexity. As shown in Fig. 7.6 for N = 16,
T = 16, and L = 2, the nonzero elements (indicated by black squares) of H
(or H(0) for the case without CFO) are concentrated along the three diagonal
lines. Elements in the off-diagonal region with |i − j| ≥ L are all equal to zero
except for two corners.
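The sparsity pattern just described can be reproduced directly from the MAI-free condition of Theorem 7.2 as quoted at the start of this section. The Python sketch below is illustrative only: the function names are hypothetical, and it takes the condition ((|i − i′|)) modulo N − (L − 1) ≥ L at face value.

```python
def has_mai(i, j, n, l):
    """True if users i and j are NOT MAI-free under the quoted condition."""
    return (abs(i - j) % (n - (l - 1))) < l


def nonzero_columns(i, n, l):
    """Column indices of the nonzero entries in row i of H."""
    return sorted(j for j in range(n) if j == i or has_mai(i, j, n, l))


def pattern(n, l):
    """Text rendering of the matrix: '#' for nonzero entries, '.' for zeros."""
    rows = []
    for i in range(n):
        cols = set(nonzero_columns(i, n, l))
        rows.append("".join("#" if j in cols else "." for j in range(n)))
    return rows


# N = T = 16, L = 2: three diagonals plus the two corner entries.
for row in pattern(16, 2):
    print(row)
```

Each row then has 2(L − 1) off-diagonal nonzero entries, in line with the Corollary of Theorem 7.2.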

Fig. 7.6. The cross-correlation matrix of CI codewords with N = T = 16 and L = 2.

The well-known Viterbi algorithm (VA) can be used to solve the ML optimization problem. Generally speaking, its complexity is proportional to the number of states. The number of transitions per stage also has to be taken into account, since the complexity is proportional to this number. The number of state transitions per stage is referred to as the trellis size. It turns out that, regardless of how we define the states, the complexity of ML MUD is O(2^T) for a general non-sparse cross-correlation channel matrix and BPSK modulation. On the other hand, by exploiting the sparsity of H, we can show that the complexity of VA grows exponentially with 2L − 1. For example, for N = T = 16 and L = 2 given in Fig. 7.6, Eq. (7.50) can be written as
\[
\Omega(\mathbf{x}) = \sum_{i=0}^{N-1} \left| \hat{z}_i - \mathbf{H}(i, ((i-1))_N)\, x_{((i-1))_N} - \mathbf{H}(i, i)\, x_i - \mathbf{H}(i, ((i+1))_N)\, x_{((i+1))_N} \right|^2.
\]
Thus, if a proper trellis is defined, the state at each stage can be reduced to 2(L − 1) = 2 symbols for this example (that is, 2^{2(L−1)} = 4 state values under BPSK). The trellis construction for the sparse cross-correlation channel matrix of MC-CDMA with CI codewords is explained below.
The structure of H (or H(0) for the case without CFO) implies a trellis
that is defined on a circulant time axis (or called the tail biting trellis (TBT)
[16]). TBT was defined and discussed for error correcting codes in [16]. TBT
also arises in the context of maximum likelihood (ML) detection in overloaded
array processing [52]. A method for trellis construction for a similar matrix
structure was proposed in [52] as explained below.
We denote the state of the trellis at stage i and the state space at the ith stage by s[i] and S_i, respectively. We use U[i] to denote the column indices of nonzero elements on the ith row of H. For the example given in Fig. 7.6, we have U[0] = {0, 1, 15} and U[15] = {0, 14, 15}. The ith state is defined as [52]
\[
s[i] = \{x_u \mid u \in U[((i-1))_N] \cap U[i]\}. \tag{7.51}
\]
Using the above definition, we obtain
\[
s[i] \cup s[((i+1))_N] = \{U[((i-1))_N] \cap U[i]\} \cup \{U[i] \cap U[((i+1))_N]\} = U[i]. \tag{7.52}
\]
In other words, state sequence {s[i]} for TBT is defined such that, during the
ith stage of the Viterbi algorithm recursion, U [i] corresponds to symbol indices
in both s[i] and s[i + 1]. From the sequence of states defined by Eq. (7.51),
we can construct the trellis by listing state values at stage i and connect the
valid state transition from stage i to stage i + 1.
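The state definition of Eq. (7.51) and the identity of Eq. (7.52) can be checked with a small Python sketch. It assumes the banded-circulant pattern of Fig. 7.6, in which row i has nonzero entries in the columns within cyclic distance L − 1 of i. The names `support` and `state_indices` are hypothetical.

```python
def support(i, n, l):
    """U[i]: columns within cyclic distance L-1 of i (assumed pattern)."""
    return {(i + d) % n for d in range(-(l - 1), l)}


def state_indices(i, n, l):
    """Indices of the symbols that make up s[i], following Eq. (7.51)."""
    return support((i - 1) % n, n, l) & support(i, n, l)


n, l = 16, 2
# s[0] consists of x_15 and x_0, the state where the trellis "bites its tail".
first_state = state_indices(0, n, l)
# Eq. (7.52): consecutive states together cover exactly U[i].
covers = all(
    state_indices(i, n, l) | state_indices((i + 1) % n, n, l) == support(i, n, l)
    for i in range(n)
)
```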
By defining μ = 2(L − 1), we can determine the ith state from Eq. (7.51) by
\[
s[i] = (x_{((i-\mu^{-}))_N}, \cdots, x_i, \cdots, x_{((i+\mu^{+}))_N}), \tag{7.53}
\]
where \(\mu^{-} = \lceil (\mu - 1)/2 \rceil\) and \(\mu^{+} = \lfloor (\mu - 1)/2 \rfloor\). Note that μ is the number of interferers for each user in the MC-CDMA system with CI codes and μ + 1 is the bandwidth of the cross-correlation channel matrix. For example, the state definition for the cross-correlation channel matrix given in Fig. 7.6 is
\[
s[i] = (x_{((i-1))_N}, x_i), \quad 0 \le i \le N - 1.
\]

Fig. 7.7. TBT for the case with N = T = 16 and L = 2. (The stages are labeled with the states (x15, x0), (x0, x1), (x1, x2), ..., (x14, x15), (x15, x0).)


With the state sequence for the example in Fig. 7.6, we construct the corre-
sponding trellis in Fig. 7.7, where the value of each state is placed at the top
of each stage. Note that the state is tail biting in the sense that it starts and
ends with the same state, i.e., (x15 , x0 ).
Given these state and trellis definitions, we have to solve the ML optimization problem, which is equivalent to finding a closed path with the minimum cost through the TBT. A closed path around a TBT is a path that starts and ends with the same state. In other words, it is identified by a sequence of state indices for which j_s[i] = j_s[i + N], 0 ≤ i ≤ N_stgs − N.
Since the initial state is not actually defined in the TBT, finding the optimum ML solution requires a modified version of the VA, which is explained below. The actual ML path for the TBT can be determined in two steps. First, an arbitrary state value is selected as the initial state and the optimum closed path is determined by running the VA. Then, the first step is repeated for every possible value of the initial state. The optimum path is then chosen to be the one with the minimum cost. However, this algorithm has a complexity proportional to O(2^{2μ}) [16]. For example, the TBT depicted in Fig. 7.7 requires 4 calls of the Viterbi algorithm.
Some approximate ML algorithms with less complexity and satisfactory
results for decoding TBT were discussed in [16, 52]. These algorithms initialize
all metrics at states in initial stage S0 to zero, decode with VA starting at
S0 and go around the TBT. Some of these algorithms require that, after each
run of VA, the resultant winning path should be checked to see if the starting
state is the same as the ending state (i.e., if the winning path is closed) [148].
A less complicated approximate ML algorithm, called the Iterative Tail Biting
Viterbi Algorithm (ITB-VA), was proposed in [52] that applies VA iteratively
around a TBT multiple times without excluding paths that are not closed.
This approach is taken in our work.
We define N_stgs = N_round N, where N_round > 1 is a real number. N_round and N_stgs are in fact the desired numbers of iterations and stages around the TBT, respectively. We can extend the received symbol sequence ẑ[i] (for the case with CFO) and state sequence s[i] periodically as
\[
\hat{z}[i] = \hat{z}[((i))_N], \quad s[i] = s[((i))_N], \quad 0 \le i \le N_{\text{stgs}}.
\]
Thus, a periodically extended TBT is constructed such that its ith stage is identical to its ((i))_N th stage for every i. Figure 7.8 shows such an extended TBT for the example given in Fig. 7.6 with N_round = 1.5.

Fig. 7.8. The periodically extended TBT for the case with N = T = 16, L = 2 and N_round = 1.5. (The stages are labeled (x15, x0), (x0, x1), ..., (x14, x15), (x15, x0), followed by a second partial pass from (x0, x1) to (x7, x8).)

After going around the TBT N_round times, the optimum path is chosen and the estimated sequence is translated into a sequence of symbol estimates, x̂_0, . . . , x̂_{N−1}. To do this, we recall that VA has the property that all surviving paths merge after a certain stage δ [74]. Thus, the optimum surviving path at the ith stage is used to estimate the (i − δ)th information bit, where δ is called the traceback depth or the truncation length [74]. A more detailed description of the ITB-VA algorithm can be found in [52]. As discussed before, the number of state transitions per stage determines the complexity of VA. By assuming that all components of sparse matrix H (or H^{(0)}) can be precomputed and that their computational complexity is negligible as compared to the complexity of ITB-VA, the complexity of ITB-VA for MC-CDMA with CI codes and BPSK modulation is O(2^{2L−1}) at each stage. Since there are N_round N stages, where N_round is usually less than 2 [16], the total complexity for all stages is O((N_round N) 2^{2L−1}), which is far less than the complexity of a conventional ML MUD (i.e., O(N 2^N)).
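To give a feel for the size of this reduction, the two expressions can be evaluated for the running example. The parameter choice N = 16, L = 2 and N_round = 1.5 is only an illustration, and the results are order-of-magnitude operation counts rather than exact ones:
\[
(N_{\text{round}} N)\, 2^{2L-1} = (1.5 \times 16) \times 2^{3} = 192,
\qquad
N\, 2^{N} = 16 \times 2^{16} = 1\,048\,576,
\]
so the trellis-based detector is smaller by a factor of roughly 5 × 10^3 for this setting.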

7.3.4 Upper Bound on Minimum Error Probability


It is difficult to obtain a closed-form solution of the minimum error probability
for MC-CDMA. However, we can derive its upper bound for BPSK transmit-
ted symbols. We take a similar approach to the procedure for synchronous
CDMA in the AWGN channel [144] and fading channel [11] and extend it to
synchronous MC-CDMA in a CFO environment.
We define E_i to be the set of error vectors that affect the ith user in the form of
\[
E_i = \{\Delta \in \{-1, 0, 1\}^T,\ \Delta_i \neq 0\},
\]
where Δ_i = x_i − x̂_i. The set of errors that are compatible with transmitted vector x ∈ {−1, 1}^T is denoted by
\[
\begin{aligned}
A(\mathbf{x}) &= \{\Delta \in E,\ \Delta_i = x_i \text{ or } 0\} \\
&= \{\Delta \in E,\ 2\Delta - \mathbf{x} \in \{-1, 1\}^T\},
\end{aligned}
\]
where \(E = \cup_{i=1}^{T} E_i\) is the set of nonzero error vectors. The probability of error for the ith user, denoted by P_i(e), in a fading channel is given by the corresponding bit error probability in the AWGN channel when conditioned on fading coefficients. The minimum error probability in the AWGN channel is given by
\[
P\left[ \bigcup_{\Delta \in E_i} \left\{ \mathbf{x} - 2\Delta = \arg\min_{\mathbf{b}} \Omega(\mathbf{b}),\ \Delta \in A_i(\mathbf{x}) \right\} \right].
\]
Applying the union bound over E_i, we have
\[
P_i(\text{error}) \le \sum_{\Delta \in E_i} P\{\Omega(\mathbf{x} - 2\Delta) \le \Omega(\mathbf{x}),\ \Delta \in A_i(\mathbf{x})\}, \tag{7.54}
\]
where we have used the fact that, if x − 2Δ is the most likely vector, it is more likely than x.
It can be easily shown that, when no CFO is present, we have
\[
\begin{aligned}
\Omega(\mathbf{x} - 2\Delta) - \Omega(\mathbf{x}) &= 4\Re\{\Delta^T \hat{\mathbf{z}}\} + 4\Delta^T \mathbf{H}^{(0)} \Delta - 2\mathbf{x}^T \mathbf{H}^{(0)} \Delta - 2\Delta^T \mathbf{H}^{(0)} \mathbf{x} \\
&= 4\Delta^T \mathbf{H}^{(0)} \Delta + 4\Re\{\Delta^T \hat{\mathbf{e}}\},
\end{aligned}
\]
where the second equality is true since ẑ = H^{(0)}x + ê. We see that this event depends on noise ê only, while Δ ∈ A(x) depends on x only. Thus, we conclude that these two events are independent.
Extending Eq. (7.54) to the fading channel, we can express the error probability as
\[
P_i(\text{error} \mid \mathbf{H}^{(0)}) \le \sum_{\Delta \in E_i} \Big\{ P\{\Omega(\mathbf{x} - 2\Delta) - \Omega(\mathbf{x}) \le 0 \mid \mathbf{H}^{(0)}\} \times P\{\Delta \in A(\mathbf{x})\} \Big\}, \tag{7.55}
\]
which follows from the fact that the admissibility of Δ is independent of H^{(0)}. For equally likely transmitted bits, we have
\[
P\{\Delta \in A(\mathbf{x})\} = \prod_{i=0}^{T-1} P\{(x_i - \Delta_i)\Delta_i = 0\} = 2^{-w(\Delta)},
\]
where \(w(\Delta) = \sum_{i=0}^{T-1} |\Delta_i|\). To compute P{Ω(x − 2Δ) − Ω(x) ≤ 0 | H^{(0)}}, we note that, since ê is a proper (circularly symmetric) complex Gaussian random vector with zero mean and covariance σ²H^{(0)},
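\[
E\{(\Re\{\Delta^T \hat{\mathbf{e}}\})^2\} = \frac{1}{2} E\{\Delta^T \hat{\mathbf{e}} \hat{\mathbf{e}}^{\dagger} \Delta\} = \frac{1}{2} \sigma^2 \Delta^T \mathbf{H}^{(0)} \Delta.
\]
Thus, the error probability for user i is bounded by
\[
P_i(\text{error} \mid \mathbf{H}^{(0)}) \le \sum_{\Delta \in E_i} 2^{-w(\Delta)}\, Q\!\left( \frac{\sqrt{2 \Delta^T \mathbf{H}^{(0)} \Delta}}{\sigma} \right).
\]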
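For a small number of users, the bound without CFO can be evaluated by direct enumeration of E_i. The Python sketch below is illustrative only: it assumes a real symmetric positive semidefinite H^{(0)} supplied as a list of lists, and the name `union_bound` is hypothetical. Its cost grows as 3^T, so it is meant for small T.

```python
import itertools
import math


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def union_bound(h0, sigma, user):
    """Sum of 2^(-w(D)) * Q(sqrt(2 D^T H0 D) / sigma) over D in E_user."""
    t = len(h0)
    total = 0.0
    for delta in itertools.product((-1, 0, 1), repeat=t):
        if delta[user] == 0:
            continue  # E_i only contains vectors with a nonzero ith entry
        weight = sum(abs(d) for d in delta)  # w(Delta)
        quad = sum(delta[a] * h0[a][b] * delta[b]
                   for a in range(t) for b in range(t))
        quad = max(quad, 0.0)  # guard against tiny negative rounding errors
        total += 2.0 ** (-weight) * q_func(math.sqrt(2.0 * quad) / sigma)
    return total
```

With H^{(0)} equal to the 2 × 2 identity, the bound for either user reduces to Q(√2/σ) + Q(2/σ), which is a convenient sanity check.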

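When CFO is present, we have
\[
\Omega(\mathbf{x} - 2\Delta) - \Omega(\mathbf{x}) = 4\Delta^T \mathbf{H}' \Delta + 4\Re\{\Delta^T \mathbf{e}'\},
\]
where H′ = H†H and e′ = H†ê. By taking a similar approach, we can show that the upper bound for the error probability is
\[
P_i(\text{error} \mid \mathbf{H}) \le \sum_{\Delta \in E_i} 2^{-w(\Delta)}\, Q\!\left( \frac{\sqrt{2}\, \Delta^T \mathbf{H}' \Delta}{\sigma \sqrt{\Delta^T \mathbf{H}^{\dagger} \mathbf{H}^{(0)} \mathbf{H} \Delta}} \right).
\]
The unconditional BEP for user i can be obtained by
\[
P_i(\text{error}) = \int_0^{\infty} P_i(e \mid \mathbf{H})\, P\{\mathbf{H}\}\, d\mathbf{H}. \tag{7.56}
\]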

This integral can be calculated using the Monte Carlo method [102].
The upper bound for synchronous DS-CDMA in the AWGN channel was
made tighter in [144] by eliminating the so-called decomposable error vectors
(i.e., Δ ∈ Ei that meet certain criteria) from the summation. This result was
extended to the fading channel case in [11] by expressing channel matrix H as
A† RA where A and R only contain fading coefficients and cross-correlation
coefficients, respectively, and whereby allowing the set of indecomposable er-
rors to be independent of fading [11]. However, we are not able to separate
fading coefficients and cross-correlation in our system to make the bound
tighter.
Example 7.5: ML MUD BEP Versus Minimum Probability
of Error
Figure 7.9 shows the upper bound to BEP as a function of the SNR value, Eb/N0, under the setting of N = 8, L = 2, T = 8 for both zero CFO and CFO = ±0.3 cases. The upper bound curves in each case are plotted against their corresponding simulated BEP. To obtain the simulated BEP, we set N_round = 1.5 and δ = 3. To shorten simulation and computation time, only the BEP for the first user was computed. We see that the upper bound is not very tight, particularly for the system with CFO. The reason is that the decomposable error sequences could not be identified and discarded in the presence of fading channels, unlike the asynchronous DS-CDMA case in [11]. It is also clear from the figure that ML performs better in the absence of CFO.

Fig. 7.9. Upper bound and simulated BER with N = 10, L = 2, CFO = 0, ±0.3. (Plot of bit error probability versus Eb/N0 in dB; curves: no MUD, upper bound, and ML MUD, each for CFO = 0 and CFO = ±0.3.)

Example 7.6: ML MUD Performance


Figure 7.10 shows significant performance improvement of the ML detector,
where performance of MC-CDMA with ML-MUD is compared to MC-CDMA
with the single user MRC detection. The system parameters were N = 16,
L = 2, and T = 16. As compared with Fig. 7.9 with N = 8, we see that
ML performs better since there were more pairwise MAI-free users. Separate
simulations were performed to acquire the BEP performance for CF O = 0
and CF O = ±0.3. We see that the BEP achieved by ML for both systems
is very low when SNR is close to 10 dB. Again, we see that the ML detector
performs better without CFO.

7.4 Complexity Reduction in Decorrelating MUD Detection

A band matrix is a matrix whose nonzero elements are confined to a diagonal band comprising the main diagonal and several sub-diagonals. For band matrix A with [A]_{i,j} = 0 if i − j > m_l or j − i > m_u, integers m_l and m_u are called the lower and upper bandwidths, respectively, and m = m_l + m_u + 1 is the total bandwidth.

Fig. 7.10. BEP versus SNR with N = 16, L = 2. (Plot of bit error probability versus Eb/N0 in dB; curves: no MUD and ML MUD, each for CFO = 0 and CFO = ±0.3.)

Channel matrix H (or H^{(0)} for the case without CFO) can be converted to a band matrix by reducing the number of users to N − (L − 1) and employing CI codes w_i[k] = e^{j2πki/N}, k = 0, 1, . . . , N − 1, with the set of indices i being either {0, 1, . . . , N − L} or {L, L + 1, . . . , N}. The bandwidth of the resulting band matrix is (L − 1) + (L − 1) + 1 = 2L − 1. To give an example, for a channel of length L = 3, its cross-correlation matrix with N = 16 can be transformed into a band matrix of size N − (L − 1) = 14 and bandwidth 2L − 1 = 5, as shown in Fig. 7.11, by omitting the first two CI codes, w_0[k] = e^{j2π·0·k/16} and w_1[k] = e^{j2π·1·k/16}, k = 0, 1, · · · , 15.

Fig. 7.11. Conversion of a channel matrix into a band matrix with N = 16, L = 3 by reducing the user number to 14.

Band matrices are usually stored by recording the diagonals in the band, while the rest is simply set to zero. For example, for the following matrix of size 6 × 6 with m_l = m_u = 1:
\[
\begin{bmatrix}
11 & 12 & 0 & 0 & 0 & 0 \\
21 & 22 & 23 & 0 & 0 & 0 \\
0 & 32 & 33 & 34 & 0 & 0 \\
0 & 0 & 43 & 44 & 45 & 0 \\
0 & 0 & 0 & 54 & 55 & 56 \\
0 & 0 & 0 & 0 & 65 & 66
\end{bmatrix}
\tag{7.57}
\]
we can store it compactly as the following 6 × 3 matrix:
\[
\begin{bmatrix}
0 & 11 & 12 \\
21 & 22 & 23 \\
32 & 33 & 34 \\
43 & 44 & 45 \\
54 & 55 & 56 \\
65 & 66 & 0
\end{bmatrix}
\tag{7.58}
\]
A banded matrix system can be solved by LU decomposition faster and with less storage space than a general dense matrix of the same dimension.
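The compact layout of (7.58) can be produced from the full matrix of (7.57) with a few lines of Python. This is a minimal sketch of the row-wise storage shown above. The function name `to_band_storage` is hypothetical, and numerical libraries may use a different (for example, diagonal-wise) layout.

```python
def to_band_storage(a, ml, mu):
    """Store an n x n band matrix as n rows of ml + mu + 1 entries.

    Entry a[i][j] inside the band lands in column j - i + ml of row i.
    Positions that fall outside the matrix are left as zero.
    """
    n = len(a)
    band = [[0] * (ml + mu + 1) for _ in range(n)]
    for i in range(n):
        for j in range(max(0, i - ml), min(n, i + mu + 1)):
            band[i][j - i + ml] = a[i][j]
    return band


full = [
    [11, 12, 0, 0, 0, 0],
    [21, 22, 23, 0, 0, 0],
    [0, 32, 33, 34, 0, 0],
    [0, 0, 43, 44, 45, 0],
    [0, 0, 0, 54, 55, 56],
    [0, 0, 0, 0, 65, 66],
]
compact = to_band_storage(full, 1, 1)
```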
Consider an MC-CDMA system with a spreading gain of N in which N_b = N − (L − 1) users employ codewords with indices i = L, L + 1, . . . , N (or 0, 1, . . . , N − L) in a CFO environment. By using MRC in the receiver and disregarding the additive noise vector n̂ in Eq. (7.59), the received signal ẑ becomes
\[
\hat{\mathbf{z}} = \mathbf{H}_b \mathbf{x}, \tag{7.59}
\]
where H_b is the cross-correlation matrix. The above equation is in fact a linear system involving the complex band matrix H_b, which can be solved by the Gaussian elimination algorithm with partial pivoting.
The Gaussian elimination algorithm first factors Hb into the product of
an upper triangular matrix U and a lower triangular matrix L; namely,

Hb = LU. (7.60)

Next, the solution of the system Hb x = ẑ can be rewritten by

L(Ux) = ẑ, (7.61)

which demands forward and backward substitutions. The total number of operations required to solve H_b x = ẑ depends upon the amount of pivoting required. Generally speaking, if N_b ≫ m_l + m_u, the number of operations required by the factorization in Eq. (7.60) is O(N_b m_l (m_l + m_u)), while the number of operations required by solving x in Eq. (7.61) by forward/backward substitutions is about O(N_b (2m_l + m_u)) [85].
Note that, for the cross-correlation matrix H_b of our interest, the lower and the upper bandwidths are both equal to L − 1. Hence, the complexity of the factorization process and the solution process is equal to O(2(N − L + 1)(L − 1)^2) and O(3(N − L + 1)(L − 1)), respectively. In contrast, to solve a general linear system of equations with Gaussian elimination with a dense matrix of size (N − L + 1) × (N − L + 1), it demands O((N − L + 1)^3/3) for the LU factorization and O((N − L + 1)^2) for forward and backward substitutions. The complexity of matrix inversion with fast algorithms is O((N − L + 1)^2). Thus, the complexity of the decorrelating MUD detection for MC-CDMA has been reduced considerably with CI codes in practical channel scenarios, where N ≫ 2(L − 1)^2.
In the absence of CFO, H_b = C†C is a Hermitian positive definite banded matrix with m = m_l = m_u = L − 1, for which there is an even faster Gaussian elimination algorithm for solving the linear system. The total number of operations is approximately equal to O((N − m)m^2/2 − m^3/3) for the LU factorization and O(2(N − m)m − m^2) for the forward and the backward substitutions [85]. In contrast, the complexity of solving a general dense matrix of the same size is O((N − L + 1)^3/6) for the LU factorization and O((N − L + 1)^2) for the forward and the backward substitutions, and the complexity of matrix inversion with fast algorithms is O((N − L + 1)^2).
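The source of the savings can be seen in a compact Python sketch of banded Gaussian elimination. Note the simplifications, which are assumptions of this sketch rather than the procedure described above: no pivoting is performed (so it is only suitable for well-conditioned systems, such as diagonally dominant ones), and the matrix is held as a dense list of lists for readability instead of in band storage. The name `solve_banded` is hypothetical.

```python
def solve_banded(a, b, ml, mu):
    """Solve a x = b for a band matrix with lower/upper bandwidths ml, mu.

    Elimination and back substitution only touch entries inside the band,
    so the work grows linearly in n for fixed ml and mu.
    """
    n = len(b)
    a = [row[:] for row in a]  # work on copies
    y = list(b)
    # Forward elimination: only ml rows below the pivot are affected.
    for k in range(n - 1):
        for i in range(k + 1, min(k + ml, n - 1) + 1):
            factor = a[i][k] / a[k][k]
            for j in range(k, min(k + mu, n - 1) + 1):
                a[i][j] -= factor * a[k][j]
            y[i] -= factor * y[k]
    # Back substitution: each row has at most mu entries right of the diagonal.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = y[i] - sum(a[i][j] * x[j]
                       for j in range(i + 1, min(i + mu, n - 1) + 1))
        x[i] = s / a[i][i]
    return x
```

The same code works with complex entries, since Python's arithmetic operators handle `complex` values directly.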

7.4.1 Error Probability for Decorrelating MUD

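The detected symbol for the above MUD detection technique is given by
\[
\hat{\mathbf{x}} = \mathbf{x} + \mathbf{H}_b^{-1} \hat{\mathbf{e}}, \tag{7.62}
\]
where ê is an (N − L + 1) × 1 Gaussian random vector with zero mean and covariance σ²H^{(0)}, where H^{(0)} = C†C. Hence, we have
\[
\begin{aligned}
E\{(\mathbf{H}_b^{-1} \hat{\mathbf{e}})(\mathbf{H}_b^{-1} \hat{\mathbf{e}})^{\dagger}\} &= E\{\mathbf{H}_b^{-1} \hat{\mathbf{e}} \hat{\mathbf{e}}^{\dagger} \mathbf{H}_b^{-1\dagger}\} \\
&= \sigma^2 \mathbf{H}_b^{-1} \mathbf{H}^{(0)} \mathbf{H}_b^{-1\dagger}.
\end{aligned}
\tag{7.63}
\]
Let \(\mathbf{H}_n = \mathbf{H}_b^{-1} \mathbf{H}^{(0)} \mathbf{H}_b^{-1\dagger}\). Since H_b^{-1}ê is a proper complex random vector, we have
\[
E\{\Re\{(\mathbf{H}_b^{-1} \hat{\mathbf{e}})(\mathbf{H}_b^{-1} \hat{\mathbf{e}})^{\dagger}\}\} = \frac{1}{2} E\{(\mathbf{H}_b^{-1} \hat{\mathbf{e}})(\mathbf{H}_b^{-1} \hat{\mathbf{e}})^{\dagger}\} = \frac{1}{2} \sigma^2 \mathbf{H}_n.
\]
Then, under BPSK modulation, the conditional bit error probability for the ith user is equal to
\[
P_i(\text{error}) = E\left\{ Q\!\left( \sqrt{\frac{2}{\sigma^2 \mathbf{H}_n(i, i)}} \right) \right\}.
\]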

Fig. 7.12. Analytical and simulated BER performance of the decorrelating detector with the reduced complexity Gaussian elimination algorithm for MC-CDMA with CFO = ±0.3. (Plot of bit error probability versus Eb/N0 in dB; curves: analytical and simulated BEP for N = 21, L = 4, K = 18, and simulated BEP for N = 16, L = 3, K = 14.)

Example 7.7: Decorrelating Detector Performance


Figure 7.12 shows the theoretical and simulated BEP as a function of the
SNR value, Eb /N0 , in the presence of CFO, which is uniformly distributed
between −0.5 and 0.5, with a decorrelating detector with N = 21, L = 4,
T = 18 for MC-CDMA. To shorten the simulation time, only the BEP for the
first user was computed. We see that simulated and analytical BEP results
are in good agreement. The average BEP performance of a MC-CDMA with
N = 16, L = 3, T = 14 is also shown in Fig. 7.12.

Example 7.8: Performance Comparison of PIC, ML, and Decorrelating Detectors
Figure 7.13 compares the BEP performance of PIC, ML, and decorrelating
detectors with the MRC detector as the benchmark. As expected, the optimum
ML MUD detector greatly outperforms all other detectors. We also observe
that the decorrelating detector with N − L + 1 = 15 users outperforms the
PIC detector when Eb /N0 > 15 dB.

Fig. 7.13. Comparison of MUD techniques for N = 16 and L = 2 and CFO = 0. (Plot of bit error probability versus Eb/N0 in dB; curves: no MUD, decorrelating with 16 users, decorrelating with 15 users, two-stage PIC, and optimum maximum likelihood detection.)

7.5 Channel and CFO Estimation

The performance of single-stage PIC was analyzed in Section 7.2 under the assumption that perfect knowledge of channel coefficients and CFO values for each user is available in the receiver. In practice, channel and CFO estimation must be performed using training sequences. Many channel and CFO estimation algorithms have been proposed for single user OFDM systems with a quasi-static channel. A maximum likelihood CFO estimation algorithm was proposed in [87]. Pilot-based channel estimation techniques for OFDM were discussed in [17, 55, 71]. These CFO/channel estimation algorithms can be extended to multiuser OFDM and MC-CDMA to avoid the use of complex multiuser estimation schemes. However, the performance of such CFO or channel estimation techniques can be degraded by MAI.
In this section, we propose simple estimation techniques using the HW codewords proposed in Chapter 6 or the CI codewords introduced in Section 7.1. Since users with these codewords do not have MAI from other users in a CFO environment, more accurate CFO or channel coefficients can be estimated if groups of users employ these codewords in turn. In other words, only log2(N/G) + 1 users with HW codewords or N/G users with CI codewords can be active simultaneously to send out pilot symbols.

The CFO estimation algorithm for OFDM given in [87] is based on the repetition of data symbol x_i and a comparison of the phases of successive received symbols. In MC-CDMA, the detection output is actually one symbol rather than an N × 1 OFDM symbol. If we denote two successive output symbols by x̂_i and x̂′_i, the CFO can be estimated by [87]
\[
\hat{\epsilon}_i = \frac{1}{2\pi} \tan^{-1}\!\left[ \frac{\Im\{\hat{x}_i^{*}\, \hat{x}'_i\}}{\Re\{\hat{x}_i^{*}\, \hat{x}'_i\}} \right]. \tag{7.64}
\]
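A phase-comparison estimator of this kind can be sketched in Python as follows. The sketch rests on two assumptions that the text does not state explicitly: the second output symbol equals the first one rotated by exp(j2πε), and the estimate is normalized by 2π. It uses the two-argument arctangent (via `cmath.phase`) in place of the plain inverse tangent, which resolves the quadrant. The name `estimate_cfo` is hypothetical.

```python
import cmath
import math


def estimate_cfo(x_first, x_second):
    """Estimate the normalized CFO from two successive detector outputs."""
    # The phase of conj(x_first) * x_second is the rotation between symbols.
    return cmath.phase(x_first.conjugate() * x_second) / (2.0 * math.pi)


# Noise-free example with a normalized CFO of 0.1.
true_cfo = 0.1
first = 1.0 + 0.5j
second = first * cmath.exp(2j * math.pi * true_cfo)
estimate = estimate_cfo(first, second)
```

Under these assumptions the estimate is unambiguous only for |ε| < 0.5.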

Channel estimation is performed by transmitting pilot symbols. The channel


is assumed to be quasi-static. The CFO estimation should be performed before
the channel estimation so that each user can compensate for his/her own CFO
in the receiver. Thus, the ICI and the self-distortion factors caused by CFO
are not present during the pilot transmission. Let ai denote the pilot symbol
for user i, the received signal after going through the DFT and Wi matrices
becomes


T
z[k] = ai wi [k]wi∗ [k]λi [k] + aj wj [k]wi∗ [k]λj [k] + z[k]wi∗ [k], (7.65)
j=1,j =i

[Plot: bit error probability versus $E_b/N_0$ (dB), 0 to 20 dB; curves for zero-stage PIC and LCM PIC with CI and HW codewords, each with and without channel/CFO estimation.]

Fig. 7.14. The effect of channel/CFO estimation on the BEP performance, with
N = 16, L = 2, CFO = ±0.1 dB.

for $k = 0, 1, \cdots, N-1$. To extract $h_i(0), h_i(1), \cdots, h_i(L-1)$ from the above
equation, MRC is not useful. Instead, we do the following

$$\hat{h}_i(l) = \frac{1}{\sqrt{N}}\,\frac{\sum_{k=0}^{N-1} z[k]\, e^{j\frac{2\pi}{N}kl}}{a_i}, \quad l = 0, 1, \cdots, L-1. \tag{7.66}$$
It is reasonable to perform the above procedure multiple times and then av-
erage the estimated channel coefficient so that the variance of the estimator
will decrease. Note that the CFO estimation task discussed above requires
the knowledge of the channel impulse response, while the channel estimation
algorithm needs to know the estimated CFO values to get rid of the ICI and
the self-distortion factors caused by CFO.
To address this problem, one possible solution is to perform the CFO esti-
mation and the channel estimation iteratively; namely, first, the CFO values
are estimated without the knowledge of the channel impulse response and
the estimated CFO values are used to estimate the channel impulse response.
In the second iteration, the estimated channel impulse response is employed
to estimate the CFO values which in turn are used to update the previous
channel estimation results. In the following example, we shall evaluate the
performance of the above iterative estimation techniques via simulations.

[Plot: bit error probability versus $E_b/N_0$ (dB), 0 to 20 dB; curves for zero-stage PIC and LCM PIC with CI and HW codewords, each with and without channel/CFO estimation.]

Fig. 7.15. The effect of channel/CFO estimation on the BEP performance, with
N = 32, L = 4, CFO = ±0.1 dB.

Example 7.9: The Effect of Channel/CFO Estimation on the Performance of PIC
In this example, we investigate the effect of PIC to illustrate the significant
performance advantage of PIC. The simulation was performed under two sets
of parameters: (1) N = 16, L = 2 and (2) N = 32, L = 4, respectively. We set
CFO = ±0.1. Figures 7.14 and 7.15 show the effect of iterative channel and
CFO estimations on the BEP performance, where the PIC detector employed
the cyclic limit mitigation technique, and the channel and CFO estimation
methods were explained in Section 7.5. There are 32 pilot symbols sent for
channel estimation and the resultant estimated channel coefficients were av-
eraged to yield the estimate of channel impulse response. We see from these
figures that the performance loss of MC-CDMA without PIC due to the chan-
nel/CFO estimation is negligible as the performance loss of the PIC detectors
is less than 2.5 dB for Eb /N0 < 10 dB.
8
Ultra-Wideband (UWB) Precoding System
Design Using Channel Phase

8.1 Introduction
The ultra-wideband (UWB) communication system, which conveys its data
symbol by a set of carrierless pulse waveforms, is also known as UWB impulse
radio. The narrow pulse, which is of the order of nanoseconds, leads to remark-
able multipath resolution at the receiver, i.e., signals coming from different
paths can be differentiated easily if their inter-arrival time is greater than one
pulse duration. As a result, tens or even hundreds of multipath components
are usually found in an indoor environment. Since different paths experience
independent channel gains and the probability that all paths suffer from deep
fading simultaneously is low, these many multipath components can be
exploited to combat deep signal fading efficiently. However, to acquire sufficient
signal power for symbol decoding, one method is to employ a large number of
Rake fingers at the receiver [151]. The receiver design of this kind is not only
expensive but also consumes a lot of battery power to decode the transmit-
ted symbol. Therefore, it is not favorable for the mobile application, where
hardware complexity and power consumption are the major concerns.
The idea called time-reversal prefilter (TRP) or Pre-Rake was recently ap-
plied to reduce the UWB receiver complexity in [58, 123, 139] and is described
as follows. Given that the channel response is available at the transmitter, the
TRP passes the original transmit signal through one prefilter whose impulse
response is the same as the order-reversed channel response. Therefore, the
equivalent channel response after precoding becomes the autocorrelation
of the original channel. All the multipath components are constructively
combined after a certain delay, and the resulting sample is referred to as the
peak received signal hereafter. As a result, the receiver with fewer fingers can enjoy full multipath
diversity. The TRP transmitter requires the complete channel response, which
is estimated and then passed to the transmitter by the receiver in a frequency
division duplex (FDD) system. Owing to a large number of channel taps in
the channel, the overhead of the channel information feedback is large and,
hence, the TRP scheme is not attractive for real-world implementation.
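The autocorrelation property of TRP can be checked numerically. The following is a minimal sketch for a real-valued (carrierless) channel; the four tap values are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical real-valued channel taps (illustration only).
h = np.array([0.9, -0.5, 0.3, -0.1])

prefilter = h[::-1]                        # order-reversed channel response
equivalent = np.convolve(h, prefilter)     # channel seen after TRP precoding
autocorr = np.correlate(h, h, mode="full") # autocorrelation of the channel

peak_index = int(np.argmax(np.abs(equivalent)))
print(peak_index, equivalent[peak_index])  # peak sits at index L-1
```

The peak collects the full channel energy (the sum of squared taps), which is why a receiver with few fingers can still enjoy full multipath diversity.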

C.-C.J. Kuo et al., Precoding Techniques for Digital Communication Systems,


DOI: 10.1007/978-0-387-71769-2_8, © Springer Science+Business Media, LLC 2008
In this chapter, we introduce a new UWB precoding architecture called
channel phase precoded UWB (CPP-UWB) to overcome the drawback of
TRP. The CPP transmitter convolves the transmit signal with the time-
reversed phase only, rather than both phase and amplitude information, for
symbol precoding. For the carrierless UWB channel response, the phase is
equal to either +1 or −1, which corresponds to the sign of each tap. As
compared to the amplitude information, which is represented by several bits
in practice, the phase information is described by one single bit. Therefore, the
capacity of the feedback channel is saved. If the feedback phase information
is accurate, there will be a strong channel tap in the resultant channel since
all the channel taps are coherently combined at the receiver. For those off-
peak received signals, their power is much weaker than the peak signal due
to arbitrary combining of multipath. By taking advantage of the concentrated
signal power, the CPP receiver simply takes one sample at the peak for symbol
decision.
Please notice that a similar idea called “delay tuning,” which concentrates
the received signal power by adjusting the delay and phase of the time-hopped
signal properly, was also proposed in [95]. Yet two main differences between
the delay tuning and our proposed CPP technique are given as follows. First,
the multipath arrival is described as a random variable in [95]. However, the
channel model with a fixed inter-arrival time is verified based on the channel
sounding experiment in [18]. The random path arrival assumption in [95]
requires more complicated schemes for both channel estimation and received
signal power focusing. Second, the concentrated signal power at the receiver
implies less intersymbol interference (ISI) at the receiver output. Thus, the
data rate of CPP-UWB can be improved by reducing the symbol interval
without much performance degradation. On the contrary, the delay tuning
system, which employs a fixed-length time-hopping code, does not take advantage
of the concentrated signal power to raise its data rate.
The received signal power concentration of CPP systems depends on the
accuracy of the estimated phase information, and a training-based
phase estimation scheme is used to identify the current phase knowledge. For
a given number of training symbols, the lower bound on the output signal-to-
noise power ratio (SNR) is derived when the data symbol is encoded by the
estimated phase. Please keep in mind that the use of training symbols causes
data rate loss and a system designer can minimize the training overhead based
on the derived lower bound.
Due to the concentrated signal power at the CPP receiver, the ISI level can
be kept tolerable when the symbol interval is less than the channel response
length. However, the ISI power gradually dominates the system performance
when the data rate increases. Under such circumstance, the CPP system per-
formance can be saved by suppressing the residual ISI at the receiver. Here,
one ISI mitigation scheme, say, codeword length optimization (CLO), is de-
veloped to suppress interfering signals.
The CLO problem in CPP-UWB systems is motivated by the following
observation. Employing a longer codeword, which demands more feedback
channel phase components, may not only improve the concentrated peak
power but also amplify the off-peak signal power. When the symbol interval
is less than the channel duration, the output signal-to-interference power
ratio (SIR) can deteriorate as a consequence. Hence, a longer code does not
necessarily perform better than a shorter one. The design goal of CLO is to
maximize the output SIR by adjusting the codeword length. Since the best
codeword length is usually less than or equal to the length of the original
channel response, CLO helps to further reduce the feedback burden.
However, it is worthwhile to point out that the closed-form solution to optimal
code size is not possible to find since the problem itself is highly nonlinear
in nature. Even though we can apply an exhaustive search algorithm to find
the best code length, this scheme also requires very high computational power
due to a huge number of channel taps in the UWB channel. To save the com-
putational complexity, a fast search algorithm for the optimal code length is
also proposed. Even though the fast algorithm maximizes the output SIR,
instead of the output signal to interference plus noise power ratio (SINR), the
difference between the resulting output SINR and the maximal one is small
even when the noise power is strong and it converges to the maximum SINR
as noise power goes down.
The organization of this chapter is detailed as follows. The system model
of CPP-UWB system is given in Section 8.2 followed by the performance
comparison between different precoding systems in Section 8.3. The lower
bound on the output SNR when the data symbol is encoded by the estimated
channel phase is derived in Section 8.4. Next, the CLO problem is formulated
and a fast search algorithm for solving CLO is derived in Section 8.5. Finally,
the power spectral mask of the proposed CPP system is derived and the
related implementation issue is discussed in Section 8.6.

8.2 System Model and Features

8.2.1 System Model

Here we consider a carrierless, pulse-based UWB communication system,
where the emitted pulse waveform is carefully designed so that the output
power spectral density (PSD) satisfies the US Federal Communications
Commission (FCC) requirement in [39]. The tap-delay-line (TDL) channel model
from Chao and Scholtz [22] is adopted to simplify our discussion and it is
described as

$$h(t) = \sum_{i=0}^{L-1} h_i\,\delta(t - i\Delta) = \sum_{i=0}^{L-1} p_i\alpha_i\,\delta(t - i\Delta), \tag{8.1}$$

where $h_i = p_i\alpha_i$ is the channel gain of the ith path and Δ denotes the duration
of the pulse waveform, which is set as the minimum multipath resolution in the
time domain. In a baseband channel, the channel gain of the ith signal path $h_i$ is a
real number and is composed of two random variables: phase $p_i$ and amplitude
$\alpha_i$. The phase $p_i$, which equals either +1 or −1 with equal probability,
depends on the number of reflections along the transmission path. The
amplitude $\alpha_i$ is modeled as a Rayleigh random variable with probability
density function (PDF)

$$f_{\alpha_i}(x) = \frac{x}{\sigma_i^2}\, e^{-x^2/2\sigma_i^2}. \tag{8.2}$$

Also, the second moment of $\alpha_i$, i.e., the power of $h_i$, decreases as its tap index
goes up. Mathematically, we have

$$E\{h_i^2\} = E\{\alpha_i^2\} = 2\sigma_i^2 = \Omega\gamma^{i}, \tag{8.3}$$

where Ω is the power of the first tap and $\gamma = e^{-\Delta/\Gamma}$ is a positive number
less than 1, since Γ, which controls the rate of power decay, is greater than Δ.
The effective channel response length is determined by the value of Γ, i.e.,
channel taps whose power is less than $\Omega\gamma^{L-1}$ are ignored. Four different values
of Γ, which are suitable for four different channel models, namely, channel
models 1 to 4 (CM 1–4) in [41], are also specified in [25]. Please note that the
inter-arrival time between two consecutive multipath components is fixed in
our model. This model is suitable for environments where the multipath
is dense, since multipath components in the same time-domain grid can be
viewed as one effective channel tap [18, 86].
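To make the channel model in (8.1)–(8.3) concrete, the sketch below draws random channel realizations with equiprobable ±1 phases and Rayleigh amplitudes whose power decays as $\Omega\gamma^{i}$. The function name `generate_channel` and its arguments are hypothetical; the default Ω = 1 is an assumption for illustration.

```python
import numpy as np

def generate_channel(num_taps, delta, decay, omega=1.0,
                     num_realizations=1, rng=None):
    """Draw phases p_i, amplitudes alpha_i and return the tap power profile."""
    rng = np.random.default_rng() if rng is None else rng
    gamma = np.exp(-delta / decay)                    # gamma = exp(-Delta/Gamma)
    tap_power = omega * gamma ** np.arange(num_taps)  # E{h_i^2} = Omega*gamma^i
    sigma = np.sqrt(tap_power / 2.0)                  # 2*sigma_i^2 = Omega*gamma^i
    shape = (num_realizations, num_taps)
    alpha = rng.rayleigh(scale=sigma, size=shape)     # Rayleigh amplitudes
    phase = rng.choice([-1.0, 1.0], size=shape)       # +1 or -1, equiprobable
    return phase, alpha, tap_power

# Example with the CM3-like values used later in the chapter.
phase, alpha, tap_power = generate_channel(180, 0.7, 20.5, num_realizations=4)
h = phase * alpha                                     # h_i = p_i * alpha_i
print(h.shape, tap_power[:3])
```

Averaging `alpha**2` over many realizations should approach `tap_power`, which is a quick sanity check on the Rayleigh scale parameter.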
The block diagram of the CPP-UWB system is shown in Fig. 8.1, where the
receiver sends the L-bit estimated channel phase knowledge to the transmitter
for symbol precoding. Once the transmitter acquires the fed-back phase
information, it synthesizes the codeword $\mathbf{c}^{(L)}$ with the reversed phase information
as

$$\mathbf{c}^{(L)} = [c_0^{(L)}, \cdots, c_{L-1}^{(L)}]^t = \frac{1}{\sqrt{L}}[\hat{p}_{L-1}, \cdots, \hat{p}_0]^t, \tag{8.4}$$

where $\sqrt{L}$ in the denominator is used to normalize the total transmitted power
per symbol. Next, the BPSK symbol, b(i), is modulated onto the codeword
$\mathbf{c}^{(L)}$ and the pulse waveform $w_s(t)$. Mathematically, the transmitted signal
becomes

$$x_s(t) = \sum_{i=-\infty}^{\infty} b(i) \sum_{j=0}^{L-1} c_j^{(L)}\, w_s(t - j\Delta - iT), \tag{8.5}$$

where the symbol interval T is set as an integer multiple of Δ, i.e., $T = M\Delta$,
where $M \in \mathbb{N}$.

Fig. 8.1. The block diagram of the CPP-UWB system [© [20] IEEE]
The signal arriving at the receiver is distorted by the multipath channel and
contaminated by additive white Gaussian noise (AWGN). After received
pulse waveform matching and chip sampling, the discrete received signal can
be formulated as

$$\mathbf{r}^{(L)}(i) = [r_0^{(L)}(i), \cdots, r_{2L-2}^{(L)}(i)]^t = \mathbf{H}^{(L)}\mathbf{c}^{(L)} b(i) + \mathbf{I}^{(L)}(i) + \mathbf{n}^{(L)}(i), \tag{8.6}$$

where $\mathbf{H}^{(L)}$ is a $(2L-1) \times L$ Toeplitz matrix whose first column contains
$\mathbf{h} = [h_0, h_1, \cdots, h_{L-1}]^t$ from the first to the Lth elements and zeros
elsewhere, $\mathbf{I}^{(L)}(i) = [I_0^{(L)}(i), \cdots, I_{2L-2}^{(L)}(i)]^t$ is the ISI vector, and
$\mathbf{n}^{(L)}(i) = [n_0^{(L)}(i), \cdots, n_{2L-2}^{(L)}(i)]^t$ is the AWGN vector whose mean is zero and
covariance matrix is $(N_0/2)\mathbf{I}_{2L-1}$. It is worthwhile to comment that the equivalent
channel response spans 2L − 1 taps. If the symbol interval is less than L chips,
the received signal $\mathbf{r}^{(L)}(i)$ contains ISI since $\mathbf{I}^{(L)}(i)$ is nonzero.
Let $\mathbf{h}^{(L)}$ denote the equivalent channel response after precoding, i.e.,

$$\mathbf{h}^{(L)} = \mathbf{H}^{(L)}\mathbf{c}^{(L)} = [h_0^{(L)}, \cdots, h_{2L-2}^{(L)}]^t, \tag{8.7}$$

where

$$h_i^{(L)} = \begin{cases} \dfrac{1}{\sqrt{L}} \displaystyle\sum_{j=0}^{i} \hat{p}_{L-1+j-i}\, p_j \alpha_j, & 0 \le i \le L-1, \\[2ex] \dfrac{1}{\sqrt{L}} \displaystyle\sum_{j=0}^{2L-2-i} p_{i+j-L+1}\, \hat{p}_j\, \alpha_{i+j-L+1}, & L \le i \le 2L-2. \end{cases} \tag{8.8}$$

If the fed-back phase information is accurate, i.e., $\hat{p}_i = p_i$ $\forall i \in \{0, \cdots, L-1\}$,
we have

$$h_{L-1}^{(L)}\Big|_{\hat{p}_i = p_i} = \frac{1}{\sqrt{L}} \sum_{j=0}^{L-1} \alpha_j, \tag{8.9}$$

where all the path gains are summed after some delay. As compared to other
taps in $\mathbf{h}^{(L)}$, the amplitude of $h_{L-1}^{(L)}$ is much stronger due to the coherent
combination of all multipath components. By taking advantage of the
concentrated signal power, a single-finger rake receiver (i.e., a matched filter
(MF)) can be applied to decode the transmit symbol as

$$\hat{b}(i) = \mathrm{sign}\left(r_{L-1}^{(L)}(i)\right). \tag{8.10}$$

In addition, the focused signal power allows the symbol interval to be reduced
to improve the data rate without much ISI penalty.
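The coherent combining in (8.9) can be illustrated with a short numerical sketch. It builds the codeword of (8.4) from perfect phase knowledge, forms the equivalent channel of (8.7) by convolution (equivalent to multiplying by the Toeplitz matrix), and reads the tap at index L − 1. The tap values and the helper name `cpp_equivalent_channel` are hypothetical.

```python
import numpy as np

def cpp_equivalent_channel(h, phase_estimate):
    """Equivalent channel h^(L) = H^(L) c^(L) for a phase-only codeword."""
    L = len(h)
    codeword = phase_estimate[::-1] / np.sqrt(L)  # (8.4): reversed phases / sqrt(L)
    return np.convolve(h, codeword)               # length 2L-1, as in (8.7)

# Hypothetical 4-tap channel with perfect phase feedback.
phase = np.array([1.0, -1.0, -1.0, 1.0])
alpha = np.array([0.8, 0.6, 0.3, 0.1])
h = phase * alpha

h_eq = cpp_equivalent_channel(h, phase)
peak = h_eq[len(h) - 1]                           # tap L-1, see (8.9)
print(peak, np.sum(alpha) / np.sqrt(len(h)))      # both equal sum(alpha)/sqrt(L)

# Noise-free, ISI-free decision (8.10) for a transmitted bit b = -1.
b_hat = np.sign(peak * -1.0)
```

Every off-peak tap is a signed partial sum of the amplitudes, so its magnitude cannot exceed the peak, which is what allows a single-sample receiver.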

8.2.2 Features of CPP-UWB System

The CPP-UWB system contains several interesting features, which are de-
tailed next.
Low Cost Receiver Design and Low Bit Rate Feedback Channel
Capacity

The TRP transmitter requires full channel information, which includes both
phase and amplitude of every channel tap. Usually, a high-resolution analog-
to-digital converter (ADC) is necessary at the TRP receiver to resolve each
channel gain; for example, a 10-bit ADC is employed at the TRP receiver in
[56]. The higher-resolution ADC also produces more bits for every channel
estimate and thus demands more channel capacity for information feedback.
On the contrary, the CPP transmitter demands only the knowledge of channel
phase, which can be identified by a 1-bit ADC. Therefore, the receiver design
employing a lower resolution ADC is cheaper while the required feedback
channel capacity is lower.

High Data Transmission Rate

If the phase knowledge at the transmitter is perfect, the phase precoding
changes the power profile of the original UWB channel so that a majority of
power is focused at one tap while the rest of power is distributed among those
off-peak taps. As compared to the system without precoding, the concentrated
power profile lessens the ISI effect at the receiver output for a fixed symbol
interval. In other words, the reduced off-peak signal power allows a high data
rate throughput by shortening the symbol interval for a fixed noise margin.

Secure Data Transmission

The channel measurement result in [103] implies that the spatial correlation
between two responses measured by two different receive antennas separated
by more than 10 in. does not exceed 0.1. Hence, the channel responses at dif-
ferent locations are almost independent. Since the returned phase information
is location dependent, the peak power is achievable only at a specific place
while the signal power is spread elsewhere. Thanks to the scattered signal
power, it is hard for eavesdroppers to acquire the transmitted data symbol if
they are far away from the desired receiver. Even if the phase knowledge is
wiretapped by third-party users during the phase information feedback
stage, they have to combat serious ISI during the decoding process.
Consequently, the CPP technique enhances the transmission security by the nature of
its design.
It is worthwhile to mention that the TRP scheme also supports a high
rate transmission and a secure communication link. In fact, with more chan-
nel knowledge involved in the precoding process, TRP can achieve a higher
data rate and a more secure communication link than CPP. However, its huge
feedback overhead and expensive ADC receiver design impede its deployment
in practice.
8.3 Performance Analysis of CPP-UWB Systems


Since the CPP scheme utilizes only partial channel knowledge in the precoding
process, the concentrated signal power may not be as high as that of TRP. In this
section, we analyze the focused signal power generated by the phase codeword
and then compare it to other TRP schemes, such as partial Pre-Rake (PPR)
[57]. Please note that, to simplify our discussion, the
fed-back phase information is assumed to be perfect.

8.3.1 Channel Power Concentration of Phase Precoding

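Given the ideal phase knowledge at the transmitter, from (8.8), the averaged
peak power at the receiver can be calculated as

$$\bar{P}_{h_{L-1}^{(L)}} \equiv E\left\{\left(h_{L-1}^{(L)} b(i)\right)^2\right\} = E\left\{\left(\frac{1}{\sqrt{L}}\sum_{j=0}^{L-1}\alpha_j\right)^2\right\} = \frac{1}{L}\, E\left\{\sum_{j=0}^{L-1}\alpha_j^2 + \sum_{l,m=0;\, l\neq m}^{L-1}\alpha_l\alpha_m\right\}. \tag{8.11}$$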

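Because the channel gain is assumed to be independent for different paths, and
the first and the second moments of the Rayleigh random variable $\alpha_j$ equal
$\sqrt{\pi\Omega\gamma^{j}}/2$ and $\Omega\gamma^{j}$, respectively, the averaged power in (8.11) can be further
simplified as

$$\begin{aligned} \bar{P}_{h_{L-1}^{(L)}} &= \frac{\Omega(1-\gamma^{L})}{L(1-\gamma)} + \frac{\Omega\pi}{4L}\left(\sum_{l,m=0;\, l\neq m}^{L-1}\gamma^{(l+m)/2}\right) \\ &= \frac{\Omega(1-\gamma^{L})}{L(1-\gamma)} + \frac{\Omega\pi}{4L}\left(\left(\sum_{l=0}^{L-1}\gamma^{l/2}\right)^{2} - \sum_{l=0}^{L-1}\gamma^{l}\right) \\ &= \left(1-\frac{\pi}{4}\right)\frac{\Omega(1-\gamma^{L})}{L(1-\gamma)} + \frac{\Omega\pi}{4L}\left(\frac{1-\gamma^{L/2}}{1-\gamma^{1/2}}\right)^{2}. \end{aligned} \tag{8.12}$$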
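Let ς denote the ratio between the second and the first terms at the right-hand
side of (8.12), i.e.,

$$\varsigma = \frac{\dfrac{\Omega\pi}{4L}\left(\dfrac{1-\gamma^{L/2}}{1-\gamma^{1/2}}\right)^{2}}{\left(1-\dfrac{\pi}{4}\right)\dfrac{\Omega(1-\gamma^{L})}{L(1-\gamma)}}. \tag{8.13}$$

After some manipulations, it can be shown that ς can be bounded from below
as

$$\varsigma \ge \frac{\pi}{4-\pi}\,\frac{2\Gamma}{\Delta}\,\frac{1-e^{-\eta/2}}{1+e^{-\eta/2}}, \tag{8.14}$$

where η is defined as$^{1}$

$$\eta = \frac{\Delta L}{\Gamma}. \tag{8.15}$$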
For a typical value of η (e.g., η = 6.146), we have the following relationship

ς ≥ 50 ⇐⇒ Γ/Δ ≥ 6.8. (8.16)

Since the pulse duration Δ is usually small compared to the
decay time constant Γ, the second term at the right-hand side of (8.12)
contributes most of the power for $\bar{P}_{h_{L-1}^{(L)}}$. Therefore, $\bar{P}_{h_{L-1}^{(L)}}$ is approximated as

$$\bar{P}_{h_{L-1}^{(L)}} \approx \frac{\Omega\pi}{4L}\left(\frac{1-\gamma^{L/2}}{1-\gamma^{1/2}}\right)^{2}. \tag{8.17}$$

Please note that the total channel power in the UWB channel is computed as

$$\bar{P}_{chl} = \sum_{j=0}^{L-1} E\{h_j^2\} = \sum_{j=0}^{L-1}\Omega\gamma^{j} = \Omega\,\frac{1-\gamma^{L}}{1-\gamma}. \tag{8.18}$$

Let χ, which is defined as the ratio between $\bar{P}_{h_{L-1}^{(L)}}$ and $\bar{P}_{chl}$, be the power
degradation factor due to incomplete channel information usage in the
precoding process. We can have

$$\begin{aligned} \chi = \frac{\bar{P}_{h_{L-1}^{(L)}}}{\bar{P}_{chl}} &\approx \frac{\pi}{4L}\,\frac{(1+\gamma^{1/2})(1-\gamma^{L/2})}{(1+\gamma^{L/2})(1-\gamma^{1/2})} \\ &= \frac{\pi\Delta}{4\Gamma\eta}\,\frac{(1-e^{-\eta/2})(1+e^{-\Delta/2\Gamma})}{(1+e^{-\eta/2})(1-e^{-\Delta/2\Gamma})}, \end{aligned} \tag{8.19}$$

where L is substituted by Γη/Δ to get the last equation. Recall that since
Δ/2Γ is usually a small number, we can obtain the following approximation

$$1 + e^{-\Delta/2\Gamma} \approx 2 \quad \text{and} \quad 1 - e^{-\Delta/2\Gamma} \approx \Delta/2\Gamma, \tag{8.20}$$

where the higher order terms in the Taylor series expansion of $e^{-\Delta/2\Gamma}$ are
ignored. From (8.20) and (8.19), the degradation factor χ in (8.19) can be
further simplified as

$^{1}$ It is worthwhile to point out that the power of the last channel tap $E\{h_{L-1}^2\}$ satisfies

$$E\{h_{L-1}^2\} = \Omega\gamma^{L-1} \ge \Omega\gamma^{L} = \Omega e^{-\eta},$$

where L and γ are replaced by Γη/Δ and $e^{-\Delta/\Gamma}$, respectively, to arrive at the last
equality. The variable η governs the effective channel length since channel taps
whose power is less than or equal to $\Omega e^{-\eta}$ are ignored.

$$\chi \approx \frac{\pi(1-e^{-\eta/2})}{\eta(1+e^{-\eta/2})}. \tag{8.21}$$

As (8.21) suggests, the degradation factor χ is also controlled by η. For a
larger value of η, i.e., a longer channel duration, the power degradation factor
χ could be even lower. This is because the power of each tap shrinks with
respect to its tap index while we normalize the phase codeword by $\sqrt{L}$ in its
denominator.
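As a quick numerical check of (8.21), the sketch below evaluates the approximate degradation factor and compares it with the exact ratio of (8.12) to (8.18). The parameter values Δ = 0.7 ns, Γ = 20.5 ns and L = 180 are the ones used in the examples of this chapter; the function names are hypothetical.

```python
import math

def degradation_exact(L, delta, decay):
    """Ratio of the peak power (8.12) to the total channel power (8.18)."""
    gamma = math.exp(-delta / decay)
    total = (1 - gamma ** L) / (1 - gamma)                  # (8.18) with Omega = 1
    peak = ((1 - math.pi / 4) * total / L
            + math.pi / (4 * L)
            * ((1 - gamma ** (L / 2)) / (1 - math.sqrt(gamma))) ** 2)
    return peak / total

def degradation_approx(eta):
    """Closed-form approximation (8.21)."""
    return math.pi * (1 - math.exp(-eta / 2)) / (eta * (1 + math.exp(-eta / 2)))

L, delta, decay = 180, 0.7, 20.5
eta = delta * L / decay                                     # (8.15)
chi_approx = degradation_approx(eta)
chi_exact = degradation_exact(L, delta, decay)
print(eta, chi_approx, 10 * math.log10(chi_approx), chi_exact)
```

For these values the approximation gives roughly 0.466, i.e., about −3.3 dB, and the exact ratio differs from it only in the third decimal place.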

8.3.2 Comparison Between TRP and CPP Schemes

The TRP scheme that demands a very high feedback overhead to deliver full
channel knowledge back to the transmitter is not practical. Two Pre-Rake
schemes, namely, partial Pre-Rake (PPR) and selective Pre-Rake (SPR), are
proposed to save the feedback channel capacity of TRP in [58]. It is shown in
[21] that, given the channel model in Section 8.2, the PPR technique concentrates
the highest channel power on average for a fixed number of fed-back
channel taps. Since both CPP and PPR transmitters utilize partial channel
information, namely the channel phase in CPP and the first several channel
taps in PPR, it is interesting to compare their required feedback quantity
when both precoders generate the same amount of peak power at their
receivers.
Assume ideal phase information is available at the transmitter. The peak
power generated by an l-chip phase codeword is

$$\bar{P}_{h_{l-1}^{(l)}} = \bar{P}_{h_{L-1}^{(L)}}\Big|_{L=l}. \tag{8.22}$$

Consider the channel model specified in Section 8.2. The signal power
concentrated at the receiver end when only the first $\bar{L}$ channel taps are given to the
PPR transmitter is shown as [21]

$$\bar{P}_{PPR}(\bar{L}) = \Omega\,\frac{1-\gamma^{\bar{L}}}{1-\gamma}. \tag{8.23}$$

Therefore, the number of channel taps $\bar{L}$ necessary to produce the same peak
power as $\bar{P}_{h_{l-1}^{(l)}}$ can be found by solving the following equation

$$\bar{P}_{PPR}(\bar{L}) = \bar{P}_{h_{l-1}^{(l)}}. \tag{8.24}$$

Substituting (8.12) with L = l and (8.23) into (8.24) gives $\gamma^{\bar{L}} = 1-\rho$, so that

$$\bar{L} = \left\lceil -\frac{\Gamma}{\Delta}\ln(1-\rho) \right\rceil, \tag{8.25}$$

where

$$\rho = \frac{1}{l}\left(1-\frac{\pi}{4}\right)(1-\gamma^{l}) + \frac{\pi}{4l}\left(\frac{1-\gamma^{l/2}}{1-\gamma^{1/2}}\right)^{2}(1-\gamma). \tag{8.26}$$
Please note that we adopt the peak power as our criterion to compare the
feedback overhead of both PPR and CPP systems in (8.25). Therefore, their
BEP performance should be roughly the same when the received signal is ISI-
free. However, both systems may have different output SINR when the symbol
interval M is less than L. A closed-form relationship between $\bar{L}$ and l may
not be possible if both systems are evaluated at the same output SINR level.
This is because the output SINR is a highly nonlinear function of both $\bar{L}$
and l. In fact, when the ISI power is not large, as shown in Example 8.1, the
gap between their BEP curves is small and (8.25) is still valid.
Example 8.1: Comparison Between CPP-UWB and PPR Systems
In this example, we compare the feedback overhead between CPP-UWB and
PPR systems when both precoding schemes accumulate the same amount
of power at the peak. The system parameters are chosen as Δ = 0.7 ns,
Γ = 20.5 ns (CM3), and L = 180. Two different values of symbol intervals,
say, M = 30 and 60, which correspond to data rate equal to 47.6 and 23.8
Mbps, respectively, are considered. Furthermore, the result shown here is the
average of 1000 channel realizations. Let the number of fed-back channel
phases l be the same as the symbol interval in chips. The numbers of
feedback channel taps for PPR computed by (8.25) are $\bar{L}$ = 30 and 21 for
M = 60 and 30, respectively. If a 10-bit ADC is utilized at the PPR receiver
[56], this equals 300 (30 × 10) and 210 (21 × 10) bits per channel feedback. On
the other hand, CPP-UWB requires only 60 and 30 bits per feedback, which
is much smaller than that of PPR.

[Plot: BEP versus SNR (dB), 0 to 20 dB; legend entries: CPP-UWB/CLO (M=60), PPR (L̄=21, M=60), CPP-UWB/CLO (M=30), PPR (L̄=30, M=30).]

Fig. 8.2. BEP performance between CPP-UWB and PPR at different noise power
[© [80] IEEE].
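The feedback comparison above can be reproduced with a few lines. This is a minimal sketch of (8.25) and (8.26), assuming the ceiling and the minus sign that follow from solving (8.24); the function name `ppr_taps_for_same_peak` is hypothetical, and Δ = 0.7 ns, Γ = 20.5 ns are the values of Example 8.1.

```python
import math

def ppr_taps_for_same_peak(l, delta, decay):
    """Number of PPR taps matching the peak power of an l-chip phase codeword."""
    gamma = math.exp(-delta / decay)
    rho = ((1 - math.pi / 4) * (1 - gamma ** l) / l
           + math.pi / (4 * l)
           * ((1 - gamma ** (l / 2)) / (1 - math.sqrt(gamma))) ** 2
           * (1 - gamma))                                    # (8.26)
    return math.ceil(-(decay / delta) * math.log(1 - rho))   # (8.25)

adc_bits = 10                                                # ADC resolution from [56]
for l in (60, 30):
    taps = ppr_taps_for_same_peak(l, 0.7, 20.5)
    print(f"l = {l}: PPR needs {taps} taps = {taps * adc_bits} bits, CPP needs {l} bits")
```

With these parameters the sketch yields 30 taps for l = 60 and 21 taps for l = 30, matching the feedback counts quoted in the example.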
Next, let us consider their decoding performance. The corresponding BEP
performance at different input SNR are plotted in Fig. 8.2. It is observed from
Fig. 8.2 that the BEP gap between two systems is small for two different data
rates considered in this example. Therefore, the feedback overhead comparison
based on the concentrated signal power is valid even when ISI occurs.

8.4 Phase Estimation and Performance Analysis with Estimated Phase Information
The decoding performance of CPP systems relies on the correctness of the
returned phase information. In this section, a training based phase estimation
scheme, which is simple yet practical, is applied to identify the current phase
knowledge. When the CPP transmitter utilizes the phase estimate acquired
by N training symbols, the corresponding SNR lower bound is derived to
evaluate its system performance.

8.4.1 Channel Phase Estimation Algorithm


The phase estimation scheme using training symbols is described in the
following steps.
1. After channel synchronization is achieved, N channel sounding pulses $b_t(i)$,
   $i = 0, \cdots, N-1$, whose pattern is also known at the receiver, are
   emitted by the transmitter. The interval between two consecutive pulses
   is long enough so that the responses of different pulses are separated.
2. The receiver performs pulse waveform matching and samples the received
   signal every Δ seconds to digitize N channel responses. In order to
   minimize the noise distortion, all received signals are first demodulated
   and then averaged before phase estimation. Mathematically, with
   $\mathbf{r}_t(l) = b_t(l)\mathbf{h} + \mathbf{n}_t(l)$ and $b_t^2(l) = 1$, the averaged response becomes

   $$\hat{\mathbf{r}}_t = \frac{1}{N}\sum_{l=0}^{N-1} b_t(l)\,\mathbf{r}_t(l) = \frac{1}{N}\sum_{l=0}^{N-1} b_t(l)\left(b_t(l)\mathbf{h} + \mathbf{n}_t(l)\right) = \mathbf{h} + \hat{\mathbf{n}}_t, \tag{8.27}$$

   where

   $$\hat{\mathbf{n}}_t = [\hat{n}_{t,0}, \cdots, \hat{n}_{t,L-1}]^t = \frac{1}{N}\sum_{l=0}^{N-1} b_t(l)\,\mathbf{n}_t(l) \tag{8.28}$$

   and $\mathbf{n}_t(l) \sim \mathcal{N}(\mathbf{0}, (N_0/2)\mathbf{I}_L)$ is the AWGN vector corresponding
   to the lth training symbol. Also, it can be shown easily that
   $\hat{\mathbf{n}}_t \sim \mathcal{N}(\mathbf{0}, (N_0/2N)\mathbf{I}_L)$.
3. The channel phase is then measured by the sign of every tap in the
   averaged response $\hat{\mathbf{r}}_t$. Thus, the channel phase estimate is given as

   $$\hat{\mathbf{p}} = \mathrm{sign}\{\hat{\mathbf{r}}_t\}. \tag{8.29}$$
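The three steps above can be sketched as follows. This is a minimal simulation under the assumption that each training response is $b_t(l)\mathbf{h}$ plus white Gaussian noise of variance $N_0/2$ per tap; the function name `estimate_phase` and the example channel are hypothetical.

```python
import numpy as np

def estimate_phase(h, training_bits, noise_psd, rng):
    """Training-based sign estimate of each channel tap."""
    num_train, num_taps = len(training_bits), len(h)
    noise = rng.normal(0.0, np.sqrt(noise_psd / 2.0), size=(num_train, num_taps))
    received = training_bits[:, None] * h[None, :] + noise   # r_t(l) = b_t(l) h + n_t(l)
    # Demodulate with the known bits, then average over the N responses (8.27).
    averaged = (training_bits[:, None] * received).mean(axis=0)
    return np.sign(averaged)                                  # phase estimate (8.29)

rng = np.random.default_rng(0)
h = np.array([0.8, -0.6, 0.3, -0.1])                          # hypothetical taps
bits = rng.choice([-1.0, 1.0], size=32)                       # known sounding pattern
p_hat = estimate_phase(h, bits, noise_psd=0.05, rng=rng)
print(p_hat, np.sign(h))
```

Averaging N responses reduces the noise variance per tap to $N_0/2N$, so weak taps are the ones most likely to be assigned the wrong sign when N is small.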

8.4.2 Performance Analysis with Estimated Phase


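When the data symbol is precoded with the phase estimate $\hat{\mathbf{p}}$, the output
SNR of CPP-UWB can be bounded from below as stated in the following
proposition.
Proposition 8.1. Let N be the number of training symbols used in the
CPP-UWB system. The output SNR $\bar{\nu}^{(L)}$ satisfies

$$\begin{aligned} \bar{\nu}^{(L)} &= \frac{E\left\{\left(h_{L-1}^{(L)} b(i)\right)^2\right\}}{E\left\{\left(n_{L-1}^{(L)}(i)\right)^2\right\}} = \frac{2E\left\{\left(h_{L-1}^{(L)}\right)^2\right\}}{N_0} \\ &\ge \frac{2}{LN_0}\Bigg\{\Omega\,\frac{1-\gamma^{L}}{1-\gamma} + \sum_{i,j=0;\, i\neq j}^{L-1}\left[\frac{\sqrt{\pi\Omega}}{2}\gamma^{i/2} - \frac{\sqrt{2\pi}}{\Omega\gamma^{i}}\left(\frac{\Omega N_0\gamma^{i}}{2N_0 + 2N\Omega\gamma^{i}}\right)^{3/2}\right] \\ &\qquad\qquad \cdot \left[\frac{\sqrt{\pi\Omega}}{2}\gamma^{j/2} - \frac{\sqrt{2\pi}}{\Omega\gamma^{j}}\left(\frac{\Omega N_0\gamma^{j}}{2N_0 + 2N\Omega\gamma^{j}}\right)^{3/2}\right]\Bigg\}. \end{aligned} \tag{8.30}$$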

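Proof: By (8.8), the averaged peak power can be simplified as

$$\begin{aligned} E\left\{\left(h_{L-1}^{(L)} b(i)\right)^2\right\} &= E\left\{\left(\frac{1}{\sqrt{L}}\sum_{j=0}^{L-1}\hat{p}_j p_j \alpha_j b(i)\right)^2\right\} \\ &= \frac{1}{L}\sum_{j=0}^{L-1}\Omega\gamma^{j} + \frac{1}{L}\sum_{j,m=0;\, j\neq m}^{L-1} E\{p_j\hat{p}_j\alpha_j\}\, E\{p_m\hat{p}_m\alpha_m\}. \end{aligned} \tag{8.31}$$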
Please note that whether the ith phase estimate is correct depends on the
magnitude of $\alpha_i$ and the noise power of the ith element of $\hat{\mathbf{n}}_t$. Conditioned on
one channel realization, the probability of a correct phase estimate is

$$\begin{aligned} Pr\{\hat{p}_i = p_i \mid \alpha_i\} &= Pr\{\hat{p}_i = 1 \mid \alpha_i, p_i = 1\}\,Pr\{p_i = 1\} + Pr\{\hat{p}_i = -1 \mid \alpha_i, p_i = -1\}\,Pr\{p_i = -1\} \\ &= Pr\{p_i\alpha_i + \hat{n}_{t,i} > 0 \mid \alpha_i, p_i = 1\}\,Pr\{p_i = 1\} + Pr\{p_i\alpha_i + \hat{n}_{t,i} < 0 \mid \alpha_i, p_i = -1\}\,Pr\{p_i = -1\} \\ &= 1 - Q\left(\sqrt{2N\alpha_i^2/N_0}\right). \end{aligned} \tag{8.32}$$

Similarly, we can have

$$Pr\{\hat{p}_i \neq p_i \mid \alpha_i\} = Q\left(\sqrt{2N\alpha_i^2/N_0}\right). \tag{8.33}$$

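From (8.32) and (8.33), the expected value of $\hat{p}_i p_i \alpha_i$ conditioned on the
channel gain $\alpha_i$ is computed as

$$E\{\hat{p}_i p_i \alpha_i \mid \alpha_i\} = \alpha_i\,Pr\{\hat{p}_i = p_i\} - \alpha_i\,Pr\{\hat{p}_i \neq p_i\} = \alpha_i - 2\alpha_i\,Q\left(\sqrt{2N\alpha_i^2/N_0}\right). \tag{8.34}$$

By averaging over the distribution of $\alpha_i$, we have the unconditional expected
value as

$$\begin{aligned} E\{\hat{p}_i p_i \alpha_i\} &= E_{\alpha_i}\{E\{\hat{p}_i p_i \alpha_i \mid \alpha_i\}\} \\ &= E_{\alpha_i}\{\alpha_i\} - 2E_{\alpha_i}\left\{\alpha_i\,Q\left(\sqrt{2N\alpha_i^2/N_0}\right)\right\} \\ &\ge \frac{\sqrt{\pi\Omega}}{2}\gamma^{i/2} - \int_0^{\infty} e^{-Nx^2/N_0}\, x\, f_{\alpha_i}(x)\,dx \\ &= \frac{\sqrt{\pi\Omega}}{2}\gamma^{i/2} - \frac{\sqrt{2\pi}}{\Omega\gamma^{i}}\left(\frac{\Omega N_0\gamma^{i}}{2N_0 + 2N\Omega\gamma^{i}}\right)^{3/2}, \end{aligned} \tag{8.35}$$

where the inequality is obtained by substituting the Q function with its upper
bound in [146], i.e.,

$$Q(x) \le \frac{1}{2}e^{-x^2/2}, \tag{8.36}$$

and several mathematical manipulations are applied to the right-hand side of
the inequality to get the final result. By replacing $E\{\hat{p}_i p_i \alpha_i\}$ in (8.31) with its
lower bound derived in (8.35) and dividing (8.31) by $N_0/2$, we have completed
the proof of Proposition 8.1.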
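The bound of Proposition 8.1 is straightforward to evaluate numerically. The sketch below computes the right-hand side of (8.30) and, for reference, the output SNR with ideal phase knowledge obtained from (8.12). The function names and the values Ω = 1 and N0 = 0.1 are assumptions made for illustration.

```python
import numpy as np

def snr_lower_bound(L, N, omega, gamma, n0):
    """Right-hand side of (8.30) for N training symbols."""
    p = omega * gamma ** np.arange(L)                  # tap powers Omega*gamma^i
    mean_alpha = np.sqrt(np.pi * p) / 2.0              # E{alpha_i}
    penalty = np.sqrt(2 * np.pi) / p * (p * n0 / (2 * n0 + 2 * N * p)) ** 1.5
    b = mean_alpha - penalty                           # per-tap bound (8.35)
    cross = b.sum() ** 2 - (b ** 2).sum()              # sum over i != j of b_i*b_j
    return 2.0 / (L * n0) * (p.sum() + cross)

def snr_ideal(L, omega, gamma, n0):
    """Output SNR with perfect phase knowledge, from (8.12)."""
    p = omega * gamma ** np.arange(L)
    m = np.sqrt(np.pi * p) / 2.0
    return 2.0 / (L * n0) * (p.sum() + m.sum() ** 2 - (m ** 2).sum())

gamma = np.exp(-0.7 / 20.5)
ideal_db = 10 * np.log10(snr_ideal(180, 1.0, gamma, 0.1))
for n_train in (20, 100, 200):
    bound_db = 10 * np.log10(snr_lower_bound(180, n_train, 1.0, gamma, 0.1))
    print(n_train, round(bound_db, 2), round(ideal_db, 2))
```

Because the penalty term in (8.35) shrinks as N grows, the bound increases monotonically with the number of training symbols and stays below the ideal-phase value.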
An example for the proposed phase estimation scheme is shown next.
Example 8.2: Effect of Proposed Phase Estimation Algorithm
In this example, we demonstrate the impact of training overhead on the
performance of the phase estimation algorithm. The system parameters
remain the same as those in the previous example, except that we let the symbol
interval be greater than L so that ISI is absent. Three input SNR values (10,
15, and 20 dB) are considered and the corresponding output SNR for different
numbers of training symbols N is shown in Fig. 8.3. The output SNR curves,
which correspond to the ideal phase and the derived lower bound, respectively,
are also plotted for reference. As Fig. 8.3 suggests, when the input
SNR is fixed, the use of more training symbols improves the output SNR at
the cost of higher overhead and degraded data rate. The derived lower bound
gives the worst-case output SNR for a given number of training symbols.
Therefore, we can vary the number of training symbols to save the data
rate. In addition, the bound becomes tight as either N or the input SNR increases
since the gap between Q(x) and its upper bound becomes small as x goes up.

[Plot: output SNR (dB) versus the number of training symbols (20 to 200) for input SNR = 10, 15, and 20 dB.]

Fig. 8.3. The output SNR vs the number of training symbols [© [80] IEEE].
Furthermore, the degradation between input SNR and ideal output SNR
is around 3.3 dB. With (8.15), the value of η in this example is computed as

$$\eta = \frac{180 \times 0.7}{20.5} \approx 6.146. \tag{8.37}$$

By substituting (8.37) into (8.21), we have the peak power degradation as

$$\chi \approx \frac{\pi(1-e^{-\eta/2})}{\eta(1+e^{-\eta/2})}\Bigg|_{\eta=6.146} = 0.4659 = -3.3\ \text{dB}, \tag{8.38}$$

which corroborates the observation we have in Fig. 8.3.


The derived lower bound provides a means to predict the system performance
for a fixed number of training symbols. Therefore, the system designer
can properly adjust the number of training symbols based on the specified
performance requirement to avoid excess training overhead. By observing (8.32),
we learn that the probability of correct phase estimation asymptotically
approaches 1 as N increases. This implies that not only is the proposed phase
estimation scheme unbiased, but the corresponding mean-square-error
(MSE) of the phase estimate also vanishes as long as the number of training symbols
is large enough.

8.5 Codeword Length Optimization (CLO) in an ISI Channel

8.5.1 Problem Statement

Before we discuss the codeword length optimization (CLO) problem, let us
first consider the output signal-to-interference power ratio (SIR) of different
phase code sizes as in the following example.
Example 8.3: Output Signal-to-Interference Power Ratio (SIR) vs
Different Codeword Size
Here we compare the output signal power and SIR of CPP systems with
different codeword lengths. The system variables are chosen as
L = 180, M = 30, Δ = 0.7 ns, and Γ = 20.5 ns (CM3). Two different
phase code sizes, namely l = 180 and l = 30, are considered, and their
resultant channel responses are drawn in Fig. 8.4. The solid stars and
circles in the figure denote the corresponding desired signals and interference,

Fig. 8.4. The received signal power (dB) vs. time (×0.7 ns) for different
codeword lengths [© [80] IEEE]. Top panel: codeword length = 180
(peak = −3.8 dB, SIR = 12.3 dB). Bottom panel: codeword length = 30
(peak = −4.2 dB, SIR = 15.7 dB).

respectively. It is observed from Fig. 8.4 that using more phase information,
which combines more channel taps at the receiver output, can produce
more peak signal power but also more interference from the neighboring
symbols. Therefore, the SIR generated by a longer phase code is actually
worse.
Although the focused signal power allows us to shrink the symbol interval
and raise the data transmission rate without much ISI, more phase
knowledge at the transmitter may not reduce the symbol error rate at the
receiver, as Example 8.3 suggests. For a fixed symbol interval, there exists
an optimal codeword length that yields the highest output SIR at the CPP
receiver output. Furthermore, since the optimal codeword length is
usually less than L, knowing the best codeword size also reduces the
feedback burden.
Our CLO problem is based on the assumption that the channel duration L is
an integer multiple of the symbol interval M, i.e., L = KM, where K is a
positive integer. This assumption is not as restrictive as it first appears,
since we can always truncate or zero-pad the original channel response to
satisfy this requirement when the amplitude of the last channel tap becomes
very small. We will show with an example later in this section that the
system bit-error probability (BEP) of this altered channel is almost
indistinguishable from that of the original channel.
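When a data symbol is encoded by an l-chip long codeword $\mathbf{c}^{(l)}$, i.e.,

\[ \mathbf{c}^{(l)} = [c_0^{(l)}, \cdots, c_{l-1}^{(l)}]^t = \frac{1}{\sqrt{l}} [p_{l-1}, \cdots, p_0]^t, \tag{8.39} \]

the corresponding received signal is written as

\[ \begin{aligned}
\mathbf{r}^{(l)}(i) &= [r_0^{(l)}(i), \cdots, r_{L+l-2}^{(l)}(i)]^t \\
&= \mathbf{H}^{(l)} \mathbf{c}^{(l)} b(i) + \mathbf{I}^{(l)}(i) + \mathbf{n}^{(l)}(i) \\
&= \mathbf{h}^{(l)} b(i) + \mathbf{I}^{(l)}(i) + \mathbf{n}^{(l)}(i),
\end{aligned} \]

where $\mathbf{H}^{(l)}$ is the (L+l−1)×l Toeplitz matrix, $\mathbf{I}^{(l)}(i)$ and $\mathbf{n}^{(l)}(i)$ are the
interference and AWGN vectors, respectively,
$\mathbf{h}^{(l)} = \mathbf{H}^{(l)} \mathbf{c}^{(l)} = [h_0^{(l)}, \cdots, h_{L+l-2}^{(l)}]^t$,
and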
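\[ h_i^{(l)} = \begin{cases}
\dfrac{1}{\sqrt{l}} \sum_{j=0}^{i} p_{l-1+j-i}\, p_j\, \alpha_j, & 0 \le i \le l-1, \\[2ex]
\dfrac{1}{\sqrt{l}} \sum_{j=0}^{l-1} p_j\, p_{j+i-l+1}\, \alpha_{j+i-l+1}, & l \le i \le L-1, \\[2ex]
\dfrac{1}{\sqrt{l}} \sum_{j=0}^{l-i+L-2} p_j\, p_{j+i-l+1}\, \alpha_{j+i-l+1}, & L \le i \le L+l-2.
\end{cases} \tag{8.40} \]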

The codeword $\mathbf{c}^{(l)}$ leads to the coherent combination of the first l channel
taps, and the peak signal power occurs at $h_{l-1}^{(l)}$. The average output SIR at
$r_{l-1}^{(l)}$ is equal to

\[ \bar{\nu}^{(l)} = E\left\{ \left( h_{l-1}^{(l)} b(i) \right)^2 \right\} \times \left\{ \sum_{j=1}^{\lfloor (L-1)/M \rfloor} E\left\{ \left( h_{l+jM-1}^{(l)} b(i-j) \right)^2 \right\} + \sum_{j=1}^{\lfloor (l-1)/M \rfloor} E\left\{ \left( h_{l-jM-1}^{(l)} b(i+j) \right)^2 \right\} \right\}^{-1}. \tag{8.41} \]
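
The tap expression (8.40) is a discrete convolution of the channel with the phase codeword, which the following sketch makes concrete. It assumes that channel tap k equals p_k α_k with polarity p_k = ±1 and amplitude α_k ≥ 0; the function name, the toy amplitudes, and the sizes L = 12, l = 5 are illustrative choices, not values from the text.

```python
import math
import random

def effective_response(alpha, p, l):
    """Convolve the channel taps p[k]*alpha[k] with the l-chip phase codeword."""
    L = len(alpha)
    # Codeword of (8.39): c_m = p_{l-1-m} / sqrt(l)
    code = [p[l - 1 - m] / math.sqrt(l) for m in range(l)]
    h = [0.0] * (L + l - 1)
    for m in range(l):
        for k in range(L):
            h[m + k] += code[m] * p[k] * alpha[k]
    return h

random.seed(0)
L, l = 12, 5
alpha = [random.random() * 0.8 ** k for k in range(L)]  # toy nonnegative amplitudes
p = [random.choice((-1, 1)) for _ in range(L)]          # toy tap polarities
h = effective_response(alpha, p, l)

# At index l-1 the polarities cancel and the first l amplitudes add coherently.
peak = sum(alpha[:l]) / math.sqrt(l)
print(abs(h[l - 1] - peak) < 1e-12)
```

All other entries of h are sums with mixed signs, which is why the entries spaced M chips away from index l−1 appear in (8.41) as interference.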

The problem of codeword length optimization is then formulated as

\[ L_{\mathrm{opt}} = \arg\max_{0 < l \le L} \bar{\nu}^{(l)}, \tag{8.42} \]

where a closed-form solution for $L_{\mathrm{opt}}$ is difficult to obtain owing to the
nonlinear nature of the problem. Although we can apply an exhaustive search
that tests all possible values of l and picks the one with the highest output
SIR, this approach is computationally intensive and thus unattractive.
Next, a fast search algorithm is presented that solves the above
optimization problem with lower computational complexity.
Please note that our discussion of the optimized codeword length is pri-
marily limited to the high data rate scenario, i.e., T ≤ LΔ. Since the transmit
power per symbol in CPP systems is always normalized to one and the power
of each tap decreases exponentially with respect to its tap index, a longer code
may not guarantee a higher peak power. Hence, the optimal codeword length
that generates the highest output SNR can be found in the ISI-free case, too.
The CLO problem for the low data rate case is however not as interesting as
that for the high data rate case.

8.5.2 Fast Search Algorithm for Optimal Code Length


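Let us first rewrite the output SIR by substituting (8.40) into (8.41) and
performing some manipulations as

\[ \bar{\nu}^{(l)} = \frac{E\left\{ \left( \sum_{j=0}^{l-1} \alpha_j \right)^2 \right\}}{\sum_{j=0}^{l-1} I^{(j)}}, \]

where

\[ \begin{aligned}
I^{(j)} &\equiv \sum_{m=1}^{\lfloor j/M \rfloor} E\left\{ \left( \alpha_{j-mM}\, p_j\, p_{j-mM}\, b(i+m) \right)^2 \right\} + \sum_{n=1}^{\lfloor (L-j)/M \rfloor} E\left\{ \left( \alpha_{j+nM}\, p_j\, p_{j+nM}\, b(i-n) \right)^2 \right\} \\
&= \sum_{m=1}^{\lfloor j/M \rfloor} E\left\{ \alpha_{j-mM}^2 \right\} + \sum_{n=1}^{\lfloor (L-j)/M \rfloor} E\left\{ \alpha_{j+nM}^2 \right\} \\
&= \sum_{m=1}^{\lfloor j/M \rfloor} \Omega\gamma^{j-mM} + \sum_{n=1}^{\lfloor (L-j)/M \rfloor} \Omega\gamma^{j+nM}
\end{aligned} \]

is the normalized interference power generated by adding the jth channel
tap at the peak, and $\sum_{n=1}^{\lfloor (L-j)/M \rfloor} \Omega\gamma^{j+nM}$ and $\sum_{m=1}^{\lfloor j/M \rfloor} \Omega\gamma^{j-mM}$ are the ISI
power caused by the previous and following data symbols with respect to b(i),
respectively.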
The variable $\beta_j$, which measures the ratio between the average power of the
jth path and $I^{(j)}$, is defined as

\[ \beta_j \equiv \frac{E\{\alpha_j^2\}}{I^{(j)}} = \frac{\Omega\gamma^j}{I^{(j)}} = \frac{\Omega\gamma^j}{\sum_{m=1}^{\lfloor j/M \rfloor} \Omega\gamma^{j-mM} + \sum_{n=1}^{\lfloor (L-j)/M \rfloor} \Omega\gamma^{j+nM}} = \frac{1}{\sum_{m=1}^{\lfloor j/M \rfloor} \gamma^{-mM} + \sum_{n=1}^{\lfloor (L-j)/M \rfloor} \gamma^{nM}}. \tag{8.43} \]

We get from (8.43) that

\[ \beta_0 = \cdots = \beta_{M-1} > \beta_M = \cdots = \beta_{2M-1} > \cdots > \beta_{(K-1)M} = \cdots = \beta_{KM-1}, \tag{8.44} \]

which provides a way to separate all L tap components into K disjoint groups
so that elements in the same group have the same β value. For example,
group 1 has the 0th to the (M − 1)th elements, group 2 has the Mth to the
(2M − 1)th elements, and so on.
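
The grouping in (8.44) can be illustrated numerically. The sketch below makes two assumptions that are not stated in this section: the tap-power decay factor is taken as γ = exp(−Δ/Γ), and the sum over previous symbols stops at the last existing tap (index L − 1), i.e., its upper limit is ⌊(L − 1 − j)/M⌋. The function name `beta` is illustrative, and the code requires K = L/M ≥ 2.

```python
import math

def beta(j, L, M, gamma):
    """beta_j as in (8.43); taps beyond index L-1 are treated as absent (assumption)."""
    following = sum(gamma ** (-m * M) for m in range(1, j // M + 1))
    previous = sum(gamma ** (n * M) for n in range(1, (L - 1 - j) // M + 1))
    return 1.0 / (following + previous)

L, M = 180, 30
gamma = math.exp(-0.7 / 20.5)   # assumed decay factor per chip
betas = [beta(j, L, M, gamma) for j in range(L)]

# One representative value per group of M consecutive taps
group_values = [betas[g * M] for g in range(L // M)]
print([round(v, 4) for v in group_values])
```

Under these assumptions, every tap within a group of M consecutive indices shares one β value, and the group values decrease from the first group to the last.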
The following lemma will be needed in deriving the fast search algorithm.
Lemma 8.1. If $S_1$, $S_2$, $I_1$ and $I_2$ are all positive numbers, then

\[ \frac{S_2}{I_2} < \frac{S_1}{I_1} \iff \frac{S_2}{I_2} < \frac{S_1 + S_2}{I_1 + I_2} < \frac{S_1}{I_1}. \]
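
As a quick sanity check (not a proof), the equivalence in Lemma 8.1 can be tested exhaustively over small positive integers with exact rational arithmetic. The helper name `mediant_between` is illustrative.

```python
from fractions import Fraction

def mediant_between(s1, i1, s2, i2):
    """True when S2/I2 < (S1+S2)/(I1+I2) < S1/I1, evaluated exactly."""
    low, high = Fraction(s2, i2), Fraction(s1, i1)
    middle = Fraction(s1 + s2, i1 + i2)
    return low < middle < high

# Both sides of the equivalence must agree for every tested combination.
values = range(1, 7)
agree = all(
    (Fraction(s2, i2) < Fraction(s1, i1)) == mediant_between(s1, i1, s2, i2)
    for s1 in values for i1 in values for s2 in values for i2 in values
)
print(agree)
```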

The proof of the above Lemma is straightforward and thus omitted here.
When the codeword length is not greater than M , we can determine the best
code length based on the following Proposition.

Proposition 8.2. When 0 < l ≤ M , M = arg max0<l≤M ν̄ (l) .

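Proof: Let k denote a codeword length less than or equal to M. The output
SIR of the codeword $\mathbf{c}^{(k)}$, $\bar{\nu}^{(k)}$, is first simplified as

\[ \begin{aligned}
\bar{\nu}^{(k)} &= \frac{E\left\{ \left( \sum_{j=0}^{k-1} \alpha_j \right)^2 \right\}}{\sum_{j=0}^{k-1} I^{(j)}} = \frac{E\left\{ \sum_{j=0}^{k-1} \alpha_j^2 + \sum_{i,j=0;\, i \ne j}^{k-1} \alpha_i \alpha_j \right\}}{I^{(0)} \sum_{j=0}^{k-1} d^{2j}} \\
&= \frac{\Omega \left( \sum_{j=0}^{k-1} d^{2j} + \frac{\pi}{2} \sum_{i=1}^{k-1} \sum_{j=1}^{i} d^{i+j-1} \right)}{I^{(0)} \sum_{j=0}^{k-1} d^{2j}} \\
&= \beta_0 + \frac{\pi}{2} \beta_0 \frac{\sum_{i=1}^{k-1} \sum_{j=1}^{i} d^{i+j-1}}{\sum_{j=0}^{k-1} d^{2j}} = \beta_0 + \frac{\pi}{2} \beta_0\, g(k),
\end{aligned} \tag{8.45} \]

where $d = \gamma^{1/2} < 1$ and $g(k) = \left( \sum_{i=1}^{k-1} \sum_{j=1}^{i} d^{i+j-1} \right) / \left( \sum_{j=0}^{k-1} d^{2j} \right) > 0\ \forall k$. It
is shown in part A of the Appendix in [20] that g(k) is a monotonically
increasing function of k for 1 ≤ k ≤ M. Therefore, we can conclude that the
maximum SIR must occur at k = M. ∎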
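
The monotonicity of g(k) used in this proof can be spot-checked numerically. The sketch below evaluates g(k) directly from its definition for one assumed value of d, namely d = γ^{1/2} with γ = exp(−0.7/20.5); that choice of γ is an assumption of this example, and the check covers only k = 1, ..., 30.

```python
import math

def g(k, d):
    """g(k) from the proof of Proposition 8.2, evaluated by direct summation."""
    numerator = sum(d ** (i + j - 1) for i in range(1, k) for j in range(1, i + 1))
    denominator = sum(d ** (2 * j) for j in range(k))
    return numerator / denominator

d = math.sqrt(math.exp(-0.7 / 20.5))   # assumed decay factor, d = gamma ** 0.5
values = [g(k, d) for k in range(1, 31)]
increasing = all(a < b for a, b in zip(values, values[1:]))
print(increasing)
```

Note that the double sum is empty for k = 1, so this direct evaluation gives g(1) = 0 and the sequence grows from there.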

Next, consider the case when the code length l exceeds M and is decomposed as

\[ l = kM + \bar{l}, \tag{8.46} \]

where $k \equiv \lceil (l-M)/M \rceil$ and $\bar{l} \equiv l - kM$. Then, the output SIR can be
reformulated as

\[ \bar{\nu}^{(l)} = \frac{S(l)}{I(l)} = \frac{S(kM) + \Delta S(\bar{l})}{I(kM) + \Delta I(\bar{l})}, \tag{8.47} \]

where $S(kM) \equiv E\left\{ \left( \sum_{j=0}^{kM-1} \alpha_j \right)^2 \right\}$ and $I(kM) \equiv \sum_{j=0}^{kM-1} I^{(j)}$ are the signal
power and interference power obtained by combining the first kM channel
taps, and
$\Delta S(\bar{l}) \equiv S(kM+\bar{l}) - S(kM) = E\left\{ \sum_{j=kM}^{l-1} \alpha_j^2 + 2 \sum_{i=0}^{kM-1} \sum_{j=kM}^{l-1} \alpha_i \alpha_j \right\}$
and $\Delta I(\bar{l}) \equiv \sum_{j=kM}^{l-1} I^{(j)}$ are the increases in signal and interference power
due to the extension of the code length from kM to $kM + \bar{l}$, respectively. The
next proposition provides an upper bound on the output SIR when l is greater
than M.

Proposition 8.3. Let the codeword length l be given by (8.46). The output
SIR, ν̄ (l) , is upper bounded by either ν̄ (kM) or ν̄ ((k+1)M) .

Proof: We establish this Proposition by showing that the following two
statements hold.
1. If $\bar{\nu}^{((k+1)M)} \ge \bar{\nu}^{(kM)}$, then $\max_{kM \le l \le (k+1)M} \bar{\nu}^{(l)} = \bar{\nu}^{((k+1)M)}$.
2. If $\bar{\nu}^{(kM)} \ge \bar{\nu}^{((k+1)M)}$, then $\max_{kM \le l \le (k+1)M} \bar{\nu}^{(l)} = \bar{\nu}^{(kM)}$.

The proof of the first statement under the assumption $\bar{\nu}^{((k+1)M)} \ge \bar{\nu}^{(kM)}$ is
given next. If $\frac{\Delta S(\bar{l})}{\Delta I(\bar{l})} \le \frac{S(kM)}{I(kM)}$, we have

\[ \bar{\nu}^{(l)} = \frac{S(kM) + \Delta S(\bar{l})}{I(kM) + \Delta I(\bar{l})} \le \frac{S(kM)}{I(kM)} = \bar{\nu}^{(kM)} \le \bar{\nu}^{((k+1)M)} \tag{8.48} \]

from Lemma 8.1. Otherwise, consider the case where $\frac{\Delta S(\bar{l})}{\Delta I(\bar{l})} \ge \frac{S(kM)}{I(kM)}$. It is
shown in Lemma 2 of [20] that $\frac{\Delta S(\bar{l})}{\Delta I(\bar{l})}$ is an increasing function with respect
to $\bar{l}$. Therefore, we have $\frac{\Delta S(\bar{l})}{\Delta I(\bar{l})} \le \frac{\Delta S(M)}{\Delta I(M)}$, and then use Lemma 8.1 to get

\[ \bar{\nu}^{(l)} = \frac{S(kM) + \Delta S(\bar{l})}{I(kM) + \Delta I(\bar{l})} \le \frac{S(kM) + \Delta S(M)}{I(kM) + \Delta I(M)} = \bar{\nu}^{((k+1)M)}. \tag{8.49} \]

The first statement is then proved based on (8.48) and (8.49). The second
statement can be proved similarly, and its proof is omitted here. ∎

From Proposition 8.3, the maximum output SIR must be bounded by
$\bar{\nu}^{(kM)}$, k = 1, · · · , K, i.e.,

\[ \max_{0 < l \le L} \bar{\nu}^{(l)} \le \max_{0 < k \le K} \bar{\nu}^{(kM)}. \tag{8.50} \]

Thus, we obtain the fast search algorithm that is given in the following propo-
sition.

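Proposition 8.4. A fast search algorithm that identifies the optimal code
length $L_{\mathrm{opt}}$ can be written as

\[ L_{\mathrm{opt}} = \arg\max_{0 < l \le L} \bar{\nu}^{(l)} = M \cdot \arg\max_{0 < k \le K} \bar{\nu}^{(kM)}, \tag{8.51} \]

and the corresponding maximum output SIR is

\[ \bar{\nu}^{(L_{\mathrm{opt}})} = \frac{E\left\{ \left( \sum_{j=0}^{L_{\mathrm{opt}}-1} \alpha_j \right)^2 \right\}}{\sum_{j=0}^{L_{\mathrm{opt}}-1} I^{(j)}}. \tag{8.52} \]

As compared with the exhaustive search algorithm in (8.42), the fast algorithm
reduces the number of candidate lengths to be tested by a factor of M.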
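
The following sketch compares the exhaustive search of (8.42) with the fast search of (8.51) on the average SIR. It is an illustration under stated assumptions rather than the authors' implementation: the decay factor is taken as γ = exp(−Δ/Γ), the signal power uses the closed form appearing in (8.45), taps beyond index L − 1 are treated as absent in the interference sums, Ω is set to 1, and M < L is required so that the interference is nonzero. All function names are hypothetical.

```python
import math

def signal_power(l, gamma):
    """E[(sum_{j<l} alpha_j)^2] in the form used in (8.45), with Omega = 1."""
    d = math.sqrt(gamma)
    a = sum(gamma ** j for j in range(l))   # sum of d^(2j)
    b = sum(d ** j for j in range(l))       # sum of d^j
    # (pi/2) * sum_{i>j} d^(i+j) equals (pi/4) * (b^2 - a)
    return a + (math.pi / 4.0) * (b * b - a)

def tap_interference(j, L, M, gamma):
    """I^(j): power of the taps spaced multiples of M chips away from tap j."""
    following = sum(gamma ** (j - m * M) for m in range(1, j // M + 1))
    previous = sum(gamma ** (j + n * M) for n in range(1, (L - 1 - j) // M + 1))
    return following + previous

def sir(l, L, M, gamma):
    interference = sum(tap_interference(j, L, M, gamma) for j in range(l))
    return signal_power(l, gamma) / interference

def exhaustive_search(L, M, gamma):
    # Test every codeword length 1..L, as in (8.42)
    return max(range(1, L + 1), key=lambda l: sir(l, L, M, gamma))

def fast_search(L, M, gamma):
    # Test only the K multiples of M, as in (8.51)
    return M * max(range(1, L // M + 1), key=lambda k: sir(k * M, L, M, gamma))

gamma = math.exp(-0.7 / 20.5)   # assumed decay factor per chip
best_fast_30 = fast_search(180, 30, gamma)
best_full_30 = exhaustive_search(180, 30, gamma)
best_fast_60 = fast_search(180, 60, gamma)
sir_full_code_db = 10.0 * math.log10(sir(180, 180, 30, gamma))
print(best_fast_30, best_full_30, best_fast_60)
```

With L = 180 and M = 30, the fast search evaluates 6 candidate lengths instead of 180.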

Another possible criterion for CLO is to consider both AWGN and ISI
jointly. That is, we can maximize the output SINR, i.e.,

\[ \hat{L}_{\mathrm{opt}} = \arg\max_{0 < l \le L} \left( \frac{E\left\{ \left( \sum_{j=0}^{l-1} \alpha_j \right)^2 \right\}}{l \cdot (N_0/2) + \sum_{j=0}^{l-1} I^{(j)}} \right). \tag{8.53} \]

The fast search algorithm in Proposition 8.4 may not give the maximum
output SINR, especially in a low SNR environment. Nevertheless, as suggested
in the following example, the performance gap between the two criteria
specified in (8.51) and (8.53) is small even in the presence of high noise power.
Example 8.4: Residual ISI Suppression of CPP-UWB Systems
In this example, we show that the performance of CPP-UWB systems can
be improved by adjusting the codeword length according to the fast search
algorithm proposed in Section 8.5. The system parameters remain the same
as those in Example 8.1. To simplify our discussion, the feedback phase
information is assumed to be perfect. The BEP curves of the CPP-UWB system
with and without CLO are drawn in Fig. 8.5. In addition, the performance
curves of the CPP-UWB system with the alternative codeword length
optimization criterion given in (8.53) are presented as a benchmark. According

Fig. 8.5. The BEP performance improvement with different ISI suppression schemes
at different data rates [© [80] IEEE]. BEP is plotted against SNR (dB) for
CPP-UWB, CPP-UWB/CLO, and CPP-UWB/SINR max, each with M = 60 and M = 30.

to our simulation results, the optimal codeword sizes found by the proposed
fast search algorithm for M = 30 and 60 are l = 30 and 60, respectively. It
is observed from Fig. 8.5 that the fast algorithm provides an additional 2 dB
gain at a BEP of 10^{-3} as compared with the conventional CPP-UWB system
using full phase knowledge. For the low data rate case, the difference in
decoding performance between the two codeword length selection criteria is
very small. For the high data rate case, in contrast, the performance gap
between the two phase code designs shrinks as the signal power increases.
To better illustrate this idea, let us consider the output SINR of CPP-UWB
systems at different codeword lengths. Here we fix the symbol interval
at M = 30, and the corresponding output SINR is plotted as a function of
the code length under different input SNR values in Fig. 8.6. The lower six
curves represent the cases where the input SNR ranges from 0 to 25 dB in
steps of 5 dB, while the top one denotes the case when the noise power is
very weak, i.e., the output SIR. The circle and triangle marks on each curve
denote the maximum output SIR and SINR, respectively. From Fig. 8.6, the
codeword lengths determined by the two criteria converge when the input
SNR is greater than or equal to 20 dB. Even though the two design criteria
require different amounts of phase information when the SNR is less than
20 dB, the output SINR gap between the two phase code sizes is indeed small.
In the next example, we will validate the previous claim about the channel
length used in the development of our fast search algorithm.

Fig. 8.6. Output SINR (dB) with different codeword lengths at different input
SNR values, from SNR = 0 dB to SNR = 25 dB and SNR = ∞, with the
SINR-maximized and SIR-maximized codeword lengths marked [© [80] IEEE].

Example 8.5: Effect of Channel Length Approximation and Comparison of
Fast and Exhaustive Search for CLO
In the previous example, we assumed that the channel response length L is
equal to 180 chips. However, in a real situation, the channel response length
may not be an integer multiple of the symbol interval. As claimed earlier in
Section 8.5, we can always pad zeros or truncate the channel tail to meet
this channel response length requirement. In this example, we validate this
claim by comparing CPP-UWB systems under different channel length
assumptions. Let us consider a real channel whose duration varies
between 170 and 190 chips. Under this circumstance, the optimal codeword
size that provides the maximum output SIR can be found by an exhaustive
search algorithm. The corresponding BEP curve is plotted in Fig. 8.7, where
the curve for the ideal channel length is also shown for comparison. As
Fig. 8.7 suggests, the performance curves of these two channel models are
almost indistinguishable. This result implies that modifying the length of
the UWB channel when the power of the taps at the channel tail is small
enough will not alter the system performance much. Furthermore, the
computational complexity of CLO can be reduced by applying the fast search
algorithm.

Fig. 8.7. The BEP performance comparison between two systems using the approx-
imated and the real channels, where Δ = 0.7 ns, Γ = 20.5 ns (CM3) [© [80] IEEE].
BEP is plotted against SNR (dB) for CPP-UWB/CLO Approx. (M = 30) and
CPP-UWB/CLO (M = 30).

8.6 Consideration of FCC Power Spectral Mask

The US Federal Communications Commission (FCC) assigns to the UWB signal
a huge frequency band that overlaps with those used by existing narrow-band
radio services. In order to avoid excess in-band interference from UWB
transmitters, the FCC also enforces a mask on the output power spectrum that
limits the maximum power of UWB radio in different frequency bands,
as shown in Fig. 8.8. Since precoding changes the power spectral density (PSD)
of the transmitted signal, we derive the PSD of the CPP signal and address
the related implementation issues when the power mask is applied.
To simplify our discussion, we assume that the symbol interval is shorter
than the channel response and that the channel length is an integer multiple
of the symbol interval, i.e., L = KM with K being a positive integer. If the
energy per pulse is equal to $E_s$, the transmitted signal is expressed as

\[ x_s(t) = \sqrt{\frac{E_s}{L}} \sum_{l=-\infty}^{\infty} \sum_{k=0}^{K-1} \sum_{j=0}^{M-1} b(l-k)\, p_{L-1-kM-j}\, w_s(t - lM\Delta - j\Delta) = w_s(t) \otimes x_1(t), \tag{8.54} \]

Fig. 8.8. The power spectral mask of FCC for UWB transmitters in [39]. The UWB
EIRP emission level (dBm) is plotted against frequency (MHz) for the indoor
limit and the Part 15 limit.

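where

\[ x_1(t) = \sqrt{\frac{E_s}{L}} \sum_{l=-\infty}^{\infty} \sum_{k=0}^{K-1} \sum_{j=0}^{M-1} b(l-k)\, p_{L-1-kM-j}\, \delta(t - lM\Delta - j\Delta). \tag{8.55} \]

Let $T_f$ be an integer multiple of the symbol interval, i.e., $T_f = T\alpha$, where α is
a positive integer. The truncated $x_1(t)$ signal is defined as

\[ x_1^{(T_f)}(t) = \begin{cases} x_1(t), & -T_f - \epsilon \le t \le T_f - \epsilon \\ 0, & \text{elsewhere} \end{cases}, \tag{8.56} \]

where $0 < \epsilon \ll 1$. Then, $x_1^{(T_f)}(t)$ can be explicitly expressed as

\[ x_1^{(T_f)}(t) = \sqrt{\frac{E_s}{L}} \sum_{l=-\alpha}^{\alpha-1} \sum_{k=0}^{K-1} \sum_{j=0}^{M-1} b(l-k)\, p_{L-1-kM-j}\, \delta(t - lM\Delta - j\Delta). \tag{8.57} \]

The time-averaged autocorrelation function of $x_1^{(T_f)}(t)$ is equal to

\[ \frac{1}{2T_f} E\left\{ x_1^{(T_f)}(t_1)\, x_1^{(T_f)}(t_2) \right\} = \frac{E_s}{2TL\alpha}\, 2\alpha MK\, \delta(t_1 - t_2) = \frac{E_s}{T} \delta(\tau), \tag{8.58} \]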

where $\tau = t_1 - t_2$. Consequently, the PSD of $x_1(t)$ is computed as

\[ S_{x_1}(f) = \lim_{T_f \to \infty} \mathcal{F}\left\{ \frac{1}{2T_f} E\left\{ x_1^{(T_f)}(t_1)\, x_1^{(T_f)}(t_2) \right\} \right\} = \lim_{\alpha \to \infty} \mathcal{F}\left\{ \frac{E_s}{T} \delta(\tau) \right\} = \frac{E_s}{T}. \tag{8.59} \]

Let the Fourier transform of $w_s(t)$ be $W_s(f)$. The PSD of $x_s(t)$ is then
computed as

\[ S_{x_s}(f) = S_{x_1}(f)\, |W_s(f)|^2 = \frac{E_s}{T} |W_s(f)|^2, \tag{8.60} \]

which is proportional to both $|W_s(f)|^2$ and the inverse of T, i.e., the data
rate. Furthermore, the same conclusion as in (8.60) can be drawn when the
data symbol is encoded by the code found by the fast search algorithm in
Section 8.5.
For a fixed data rate system, we learn from (8.60) that the maximum
power per data symbol in CPP-UWB is bounded by the FCC mask.
A better system performance can be achieved by properly designing the pulse
waveform $w_s(t)$ so that as much power as possible can be pumped into the
wireless channel without violating the power spectral mask constraint. The
pulse waveform design problem is interesting; however, it is beyond the
scope of this chapter and will not be covered here. Interested
readers are referred to [70] and references therein for a detailed treatment
of this topic. On the other hand, for a fixed pulse waveform, the output power
of CPP transmitters should be lowered when the symbol interval is shortened.
Consequently, the received signal power decreases as well. To keep the same
SNR level as the low data rate system, more attention should be paid to
noise figure reduction in the hardware implementation of the receiver.
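
The scaling implied by (8.60) can be made concrete with a small calculation. Since the PSD equals (E_s/T)|W_s(f)|^2, keeping the PSD unchanged for a fixed pulse shape requires the energy per symbol E_s to scale in proportion to T. The sketch below computes the resulting change in dB; the function name is illustrative, and the symbol intervals of 60 and 30 chips are borrowed from Example 8.4 purely as sample values.

```python
import math

def energy_change_db(old_interval, new_interval):
    """Change in allowed energy per symbol (dB) when T changes and the PSD is held fixed."""
    # From (8.60): PSD is proportional to Es / T, so Es must scale with T.
    return 10.0 * math.log10(new_interval / old_interval)

change = energy_change_db(60, 30)   # halving the symbol interval
print(round(change, 2))
```

Halving the symbol interval therefore costs about 3 dB in energy per symbol, which is the loss that the receiver noise figure would have to absorb to keep the same SNR.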
9 Conclusion and Future Trend

9.1 Conclusion
9.1.1 Chapter 2

In Chapter 2, we reviewed the principle of precoding for ISI channels. First,
Tomlinson–Harashima (TH) precoding was explained in detail. It was shown
that at high SNR, TH precoding can approach the capacity of any
bandwidth-limited Gaussian channel as closely as the capacity of an ideal
Gaussian channel can be approached. Moreover, we also reviewed trellis
shaping and trellis precoding and showed that with trellis precoding, the SNR
gap of about 9 dB (at Pr(E) ≈ 10^{-6}) between capacity and uncoded
modulation can be reduced by approximately 7 dB over any strictly
bandlimited high SNR Gaussian channel. We also reviewed the basics of
multirate signal processing and represented OFDM systems in matrix form,
which can also be regarded as transceivers with precoding and postcoding.
As a result, CP and ZP inserted OFDM systems can be easily represented, and
different receivers, e.g., ZF and MMSE, can be designed accordingly. We also
introduced the SC-CP system, which can be regarded as an OFDM system with
channel-independent precoding.

9.1.2 Chapter 3

In Chapter 3, several precoding techniques, namely Tx-MF, Tx-ZF, and
Tx-Wiener, are introduced to improve the performance of downlink DS-CDMA
systems while keeping the receiver design simple. The Tx-MF precoding
automatically combines all the multipath gains of the desired user's
channel after some delay and, thus, a simple MF receiver is sufficient to
acquire full multipath diversity. Since Tx-MF fails to suppress interference
efficiently, its performance degrades as the number of users increases. By
exploiting the knowledge of transmit signals from all the other users at the
base station, Tx-ZF, which eradicates all the interference at the desired
receiver output, can be applied. However, Tx-ZF cancels MAI at the cost of
reducing its available signal dimension. As a result, the overall signal
power at the receiver output is reduced. Finally, the Tx-Wiener precoding is
also proposed to strike a balance between signal power gain and interference
suppression.

C.-C.J. Kuo et al., Precoding Techniques for Digital Communication Systems,
DOI: 10.1007/978-0-387-71769-2_9, © Springer Science+Business Media, LLC 2008

9.1.3 Chapter 4

In Chapter 4, precoding for MIMO channels was discussed. We demonstrated
that TH precoding can easily be applied to a BLAST MIMO system. At high
SNR, the capacity achieved by TH precoding approaches the capacity of the
MIMO channel.
A linear precoder can also be designed jointly with a linear decoder to
minimize a desired criterion. In particular, if the criterion is to minimize
the weighted sum of symbol estimation errors over all subchannels, the linear
precoder and decoder decouple the MIMO channel into parallel eigen
subchannels. By selecting appropriate error weights, we can maximize the
information rate and minimize the sum of error rates. The joint linear
precoder/decoder design is in general a complicated non-convex problem, and
optimization can be done with respect to various criteria. By adopting a
unified framework, the linear precoder and decoder can still diagonalize the
channel if other criteria, such as minimization of the maximum or average
BER, are chosen. Hence, the optimization problems are simplified greatly.
Furthermore, precoding was used in conjunction with space-time codes.
It was shown that it is possible to design linear precoder in a channel with
fading correlation for a space-time MIMO system or to design linear and
unitary precoder in order to maximize diversity and coding gain.
Finally, since the complete channel information may not be always avail-
able at the MIMO transmitter due to channel variation, two MIMO precoder
designs using incomplete channel information, i.e., channel statistics precoding
and unitary precoding, are also provided. It is found that when either channel
mean or channel covariance matrix is known to the transmitter, either trans-
mit beamforming or spatial multiplexing achieves the highest information rate
depending on the quality of the returned information. The unitary precoding
is deployed over the precoded OSTBC system, where the codeword is selected
to minimize the symbol error rate. The discrete codeword design can be re-
lated to a subspace packing problem in the Grassmannian manifold, where the
minimum chordal distance between any two codewords is maximized. Compared
with the 2 × 2 OSTBC system, unitary precoding with 3 feedback bits in the
4 × 2 MIMO system provides more than 3 dB gain at a BER of 10^{-3}, as
simulation results suggest.

9.1.4 Chapter 5

In Chapter 5, we proposed an approximately MAI-free multiaccess OFDM
system, called the PMU-OFDM system. In the PMU-OFDM system, every
user receives negligible MAI and behaves as in a "single-user" OFDM
system. As in the OFDM system, each individual user in PMU-OFDM transmits
N parallel symbols. When N is sufficiently large, PMU-OFDM has the
approximate MAI-free property. Moreover, we proposed a code selection scheme
using Hadamard-Walsh codes to relax the requirement of a large value
of N while the system can still achieve the approximate MAI-free property.
More specifically, in a fully-loaded situation, i.e., without code selection,
the MAI power decreases at a rate of O(N^{-2}), while in a half-loaded
situation with code selection it decreases at a rate of O(N^{-4}).
Furthermore, PMU-OFDM with code selection was shown to operate robustly in
time-asynchronous, frequency-asynchronous, and rapidly time-varying
environments. Based on the code selection scheme, we showed that a proper
code priority can further enhance the performance of PMU-OFDM in a
frequency-asynchronous and rapidly time-varying environment. Since PMU-OFDM
can solve the MAI and asynchronism issues using a simple code selection
scheme, it does not demand the sophisticated multiuser detection or signal
processing techniques that are commonly adopted in multiuser OFDM systems.
When a fully-loaded user capacity is required and hence MUD is needed, the
use of code selection in PMU-OFDM can reduce the complexity of MUD since
each user only needs to deal with the interference from half of the users.

9.1.5 Chapter 6

In Chapter 6, we proposed a code selection scheme, based on the
Hadamard-Walsh code, to achieve a completely MAI-free property as well as
full diversity gain in an MC-CDMA system with multipath effect. Based on
this code selection, we can conveniently estimate the channel information in
an MAI-free environment. We also showed that MC-CDMA systems can be made
more robust to the CFO effect using the proposed code selection scheme. More
specifically, by properly partitioning the codewords, the users in a specific
codeword set can remain MAI-free even in CFO environments.

9.1.6 Chapter 7

In Chapter 7, we show how a different precoding scheme, i.e., the CI code,
can increase the number of MAI-free users in an MC-CDMA system with CFO.
Moreover, thanks to the MAI-free property of the proposed code selection
schemes, if the system is to be operated in a fully-loaded situation, we can
greatly simplify the computational complexity of MAI suppression since each
user does not need to deal with the interference from all other users.

9.1.7 Chapter 8

In Chapter 8, a channel phase precoding technique is proposed to simplify the
receiver design while reducing the amount of feedback information required
in TRP. The CPP scheme is not only simpler in design but also
computationally more efficient than the TRP-based system. A performance
lower bound on the output SNR is derived to evaluate the system performance
when the transmit symbol is encoded by the estimated phase information
acquired from a set of training symbols. The concentrated signal power at
the CPP receiver is exploited to enhance the data transfer rate by reducing
the symbol interval without much ISI degradation. In the high data rate
transmission case, a better system performance can be achieved by optimizing
the codeword length so that its output SIR is maximized. A closed-form
solution is not possible since the optimization problem is highly nonlinear.
Instead of resorting to an exhaustive search scheme, a fast search algorithm
is derived to find the optimal codeword length with a low computational
burden.

9.2 Future Research Trend


9.2.1 Precoding with Partial Channel Information

In Chapters 4 and 8, we introduce precoding techniques with partial channel
information for MIMO and UWB systems, respectively. Partial channel
information feedback is in general more feasible in wireless or mobile
environments, especially when the numbers of transmit and receive antennas
are large. For instance, in the current IEEE 802.16e-2005 standard, the
number of subcarriers can be up to 2048. If MIMO precoding with full channel
information is performed, we need to compute the precoding solution for all
subcarriers and feed them back to the transmit side. In this case, the
computational complexity and the amount of feedback are huge and may not be
feasible. This also keeps the standard from supporting more antennas to
achieve higher data throughput, because the computational complexity and
the feedback amount both increase exponentially with the numbers of transmit
and receive antennas. To solve this problem, precoding with partial channel
information may offer a good solution.

9.2.2 Combined Precoding and MUD for Multiuser Communications

In Chapters 5–6, we showed that with proper transceiver and code design,
some subsets of users in multiuser systems can achieve an MAI-free or
nearly MAI-free property without any channel information at the transmit
side, where the MAI arises from various sources such as multipath effect,
time and frequency offsets, and Doppler effect. Such precoding schemes can be
combined with MUD to balance the computational complexity between the
transmitter and the receiver. Without a proper precoding scheme, MUD
places a large complexity burden on the receiver side. By using precoding in
multiuser systems, some subsets of users can be MAI-free. In this case, each
target user has fewer interferers. Since the complexity of MUD is directly
related to the number of interferers, reducing the number of interferers
reduces the complexity of MUD. In addition, since the system may not always
be fully loaded, once the number of active users is below the supportable
number of MAI-free users, there is no need to perform MUD while those active
users can still enjoy an MAI-free transmission.

9.2.3 Other Code Scheme to Achieve More MAI-Free User Number

To achieve the MAI-free property, the proposed code selection scheme reduces
the number of users from M to M/2 in the PMU-OFDM system. Also, in the
MC-CDMA system, the number of MAI-free users is limited by the multipath
length. The proposed code selection schemes in these two systems are,
however, only examples. There may exist better code schemes that allow more
users to achieve the MAI-free property. Exploring such code schemes is an
interesting topic.
References

[1] S. Abeta, H. Atarashi, and M. Sawahashi. Forward link capacity of
coherent DS-CDMA and MC-CDMA broadband packet wireless access
in a multi-cell environment. IEEE VTC Fall, 5:2213–2218, Sep. 2000.
[2] M. Abramowitz and I. A. Stegun. Handbook of Mathematical Func-
tions With Formulas, Graphs, and Mathematical Tables. United States
Department of Commerce, 1972 edition.
[3] S. M. Alamouti. A simple transmit diversity technique for wireless com-
munications. IEEE J. Select. Areas Commun., 16:1451–1458, Oct. 1998.
[4] J. G. Andrews. Interference cancellation for cellular systems: a contem-
porary overview. IEEE Commun. Mag., 12:19–29, 2005.
[5] J. T. Aslanis, S. Kasturia, and J. M. Cioffi. Vector coding for partial
response channels. IEEE T. Inform. Theory, 36:741–762, 1990.
[6] S. Barbarossa, M. Pompili, and G. B. Giannakis. Channel-independent
synchronization of orthogonal frequency division multiple access sys-
tems. IEEE J. Select. Areas Commun., 20:474–486, Feb. 2002.
[7] A. Barg and D. Y. Nogin. Bounds on packings of spheres in the
Grassmann manifold. IEEE J. Select. Areas Commun., 48:2450–2454,
September 2002.
[8] K. G. Beauchamp. Walsh Functions and Their Applications. Academic
Press, 1975.
[9] D. Bertsekas. Nonlinear Programming. Athena Scientific, 1995.
[10] J. A. C. Bingham. Multicarrier modulation for data transmission: an
idea whose time has come. IEEE Commun. Mag., 28:5–14, May 1990.
[11] D. Brady and Z. Zvonar. Multiuser detection in single-path fading channels.
IEEE T. Commun., 42:1729–1739, 1994.
[12] M. Brandt-Pearce and A. Dharap. Transmitter-based multiuser
interference rejection for the down-link of a wireless CDMA system
in a multipath environment. IEEE J. Select. Areas Commun., 18(3):
407–417, 2000.
[13] D. R. Brown. Multistage parallel interference cancellation: convergence
behaviour and improved performance through limit cycle mitigation.
IEEE T. Signal Proc., 53:283–294, Jan. 2005.
[14] L. Brunel. Multiuser detection techniques using maximum likelihood
sphere decoding in multicarrier CDMA systems. 3:949–957, May 2004.
[15] A. R. Calderbank and J. E. Mazo. Baseband line codes via spectral
factorization. IEEE J. Select. Areas Commun., SAC-7:914–928, 1989.
[16] A. R. Calderbank and A. V. Giannakis. Minimal tail-biting trellises: the
Golay code and more. IEEE T. Inform. Theory, 45:1435–1455, 1999.
[17] S. Caoleri, M. Ergen, and A. Bahai. Channel estimation techniques
based on pilot arrangements in OFDM systems. 48:223–229, Sep. 2002.
[18] D. Cassioli, M. Z. Win, and A. F. Molisch. The ultra-wide bandwidth
indoor channel: from statistical model to simulations. IEEE J. Select.
Areas Commun., 20(6):1247–1257, Aug. 2002.
[19] R. W. Chang. Synthesis of band-limited orthogonal signals for multi-
channel data transmission. Bell Syst. Tech. J.
[20] Y.-H. Chang, S.-H. Tsai, X. Yu, and C.-C. J. Kuo. Ultra-wideband
(UWB) transceiver design using channel phase precoding (CPP). IEEE
T. Signal Proc., 55(7):3807–3822, 2007.
[21] Y.-H. Chang, X. Yu, and C.-C. J. Kuo. Techniques for received signal
focusing in DSUWB systems. Proc. IEEE VTC’04 Fall, 3:1777–1781,
Sep. 2005.
[22] Y.-L. Chao and R. A. Scholtz. Weighted correlation receivers for ultra-
wideband transmitted reference systems. Proc. IEEE Globecom’04, 1:
66–70, Nov. 2004.
[23] H. H. Chen and M. Guizani. Guest editorial: multiple access technolo-
gies for B3G wireless communications. IEEE Commun. Mag., 9:65–67,
Feb. 2005.
[24] R. L.-U. Choi, K. B. Letaief, and R. D. Murch. MISO CDMA transmission
with simplified receiver for wireless communication handsets. IEEE
T. Commun., 49(5):888–898, May 2001.
[25] L.-U. Choi and R. D. Murch. Transmit-preprocessing techniques with
simplified receivers for the downlink of MISO TDD-CDMA systems.
IEEE T. Veh. Technol., 53(2):285–295, 2004.
[26] A. Chouly, A. Brajal, and S. Jourdan. Orthogonal multicarrier tech-
niques applied to direct sequence spread spectrum CDMA systems.
IEEE Globecom, 3:1723–1728, Dec. 1993.
[27] J. S. Chow, J. C. Tu, and J. M. Cioffi. A discrete multitone transceiver
system for HDSL applications. IEEE J. Select. Areas Commun., 9:
895–908, Aug. 1991.
[28] J. M. Cioffi. Dynamic spectrum management.
http://www.stanford.edu/group/coffi/dsm/tut/chap11.doc.
[29] R. de Buda. Some optimal codes have structure. IEEE J. Select. Areas
Commun., 7:893–899, 1989.
[30] A. Dekorsy, V. Kühn, and K.-D. Kammeyer. Exploiting time and fre-
quency diversity by iterative decoding in OFDM-CDMA systems. In
IEEE Globecom, 5:2576–2580, Dec. 1999.
[31] J. H. Deng and T. S. Lee. An iterative maximum SINR receiver for mul-
ticarrier CDMA systems over a multipath fading channel with frequency
offset. IEEE T. Wirel. Commun., 2:560–569, May 2003.
[32] D. Divsalar and M. K. Simon. CDMA with interference cancellation
for multiprobe missions. JPL TDA Progress Report, 42-120:40–53, Feb.
1995.
[33] G. F. Edelmann, T. Akal, W. S. Hodgkiss, S. Kim, W. A. Kuperman,
and H. C. Song. An initial demonstration of underwater acoustic com-
munication using time reversal. IEEE J. Oceanic Eng., 27:602–609,
2002.
[34] R. Esmailzadeh, E. Sourour, and M. Nakagawa. Prerake diversity combining
in time-division duplex CDMA mobile communications. IEEE
T. Veh. Technol., 48(3):795–801, May 1999.
[35] M. V. Eyuboglu and G. D. Forney, Jr. Combined coding and equalization
using trellis precoding. In Proc. of CSI Workshop on Advanced
Communications Technologies, Ruidoso, NM, May 1989.
[36] M. V. Eyuboglu and G. D. Forney, Jr. Trellis precoding: combined
coding, precoding and shaping for intersymbol interference channels.
IEEE T. Inform. Theory, 38(2):301–314, 1992.
[37] M. V. Eyuboglu and S. U. H. Qureshi. Reduced-state sequence estima-
tion with set partitioning and decision feedback. IEEE T. Commun.,
36:13–20, 1988.
[38] M. V. Eyuboglu and S. U. H. Qureshi. Reduced-state sequence estima-
tion for coded modulation on intersymbol interference channels. IEEE
J. Select. Areas Commun., 7(6):989–995, 1989.
[39] Federal Communications Commission (FCC). Revision of Part 15
of the Commission’s Rules Regarding Ultra-Wideband Transmission
Systems, First Report and Order, ET Docket 98-153, FCC 02-48.
adopted/released Feb. 14/Apr. 22, 2002.
[40] M. Fink. Time reversal of ultrasonic fields-part I: basic principles. IEEE
T. Ultrason. Ferroelectr. Freq. Control, 39(5):555–566, Sep. 1992.
[41] J. R. Foerster. Channel modeling sub-committee report final. IEEE
P802.15 WPAN P802.15-02/490r1-SG3a, Feb. 2003.
[42] G. D. Forney. Trellis shaping. IEEE T. Inform. Theory, 38:281–300,
1992.
[43] G. D. Forney, Jr. and A. R. Calderbank. Coset codes for partial response
channels; or, coset codes with spectral nulls. IEEE T. Inform. Theory,
35:925–943, 1989.
[44] G. D. Forney and M. V. Eyuboglu. Combined equalization and coding
using precoding. IEEE Commun. Mag., 29:25–34, 1991.
[45] G. D. Forney, Jr. and L.-F. Wei. Multidimensional constellation-Part 1:
Introduction, figures of merit, and generalized cross constellation. IEEE
J. Select. Areas Commun., 7:877–892, 1989.
[46] D. A. Gore and A. J. Paulraj. MIMO antenna subset selection with
space-time coding. IEEE Trans. on Signal Processing, 50:2580–2588,
October 2002.
[47] R. M. Gray. On the asymptotic eigenvalue distribution of Toeplitz ma-
trices. IEEE T. Inform. Theory, 18:725–730, Nov. 1972.
[48] X. Gui and T. S. Ng. Performance of asynchronous orthogonal multi-
carrier CDMA system in frequency selective fading channel. IEEE T.
Commun., 47:1084–1091, Jul. 1999.
[49] S. Hara and R. Prasad. Overview of multicarrier CDMA. IEEE Com-
mun. Mag., 35:126–133, Dec. 1997.
[50] H. F. Harmuth. Applications of Walsh function in communications.
IEEE Spectrum, Nov. 1969.
[51] M. G. Heinemann, A. Larraza, and K. B. Smith. Acoustic communications
in an enclosure using single-channel time-reversal acoustics. Appl.
Phys. Lett., 80:694–696, 2002.
[52] J. Hicks, R. J. Boyle, S. Bayram, and W. H. Tranter. Overloaded array
processing with spatially reduced search joint detection. IEEE J. Select.
Areas Commun., 19:1584–1593, 2001.
[53] B. M. Hochwald, T. L. Marzetta, T. J. Richardson, W. Sweldens, and
R. Urbanke. Systematic design of unitary space-time constellations.
IEEE Trans. Inform. Theory, 46:1962–1973, September 2000.
[54] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University
Press, 1985.
[55] M. H. Hsieh and C. H. Wei. Channel estimation for OFDM systems
based on comb-type pilot arrangement in frequency selective fading
channels. IEEE T. Consum. Electr., 44:217–225, 1998.
[56] J. Ibrahim, R. Menon, and R. M. Buehrer. UWB signal detection based
on sequence optimization for dense multipath channels. 10(4):228–230,
2006.
[57] IEEE standard for local and metropolitan area networks. IEEE 802.16a
Standard, Apr. 2003.
[58] S. Imada and T. Ohtsuki. Pre-RAKE diversity combining for UWB
systems in IEEE 802.15 UWB multipath channel. Proc. Joint UWBST
& IWUWBS’04, 2:236–240, May 2004.
[59] S. A. Jafar and A. Goldsmith. Transmit optimization and optimality of
beamforming for multiple antenna systems. 3(4):1165–1175, 2004.
[60] W. C. Jakes. Microwave Mobile Communications. Wiley, NY, 1974
edition.
[61] M. Joham, W. Utschick, and J. A. Nossek. Linear transmit processing
in MIMO communications systems. IEEE T. Signal Processing, 53(8):
2700–2712, Aug. 2005.
[62] G. Jöngren, M. Skoglund, and B. Ottersten. Combining beamforming
and orthogonal space-time block coding. IEEE Trans. Inform. Theory,
48:611–626, March 2002.
[63] A. Kajiwara and M. Nakagawa. Microcellular CDMA system with a
linear multi-user interference canceller. IEEE J. Select. Areas Commun.,
12(4):605–611, May 1994.
[64] S. M. Kay. Fundamentals of Statistical Signal Processing, Estimation/
Detection Theory. Prentice-Hall, Englewood Cliffs, NJ, 1993.
[65] B. L. N. Kennett. A note on the finite Walsh Transform. IEEE T.
Inform. Theory, 16:489–491, Jul. 1970.
[66] I. Koffman and V. Roman. Broadband wireless access solutions based
on OFDM access in IEEE 802.16. IEEE Commun. Mag., 40:96–103,
Apr. 2002.
[67] S. Kondo and L. B. Milstein. Performance of multicarrier DS CDMA
systems. IEEE T. Commun., 44:238–246, Feb. 1996.
[68] P. Kyritsi, G. Papanicolaou, P. Eggers, and A. Oprea. MISO time reversal
and delay-spread compression for FWA channels at 5 GHz. IEEE
Antenn. Wirel. Propag. Lett., 3:96–99, 2004.
[69] E. G. Larsson and P. Stoica. Space-time block coding for wireless com-
munications. Cambridge University Press, 2003.
[70] T. P. Lewis and R. A. Scholtz. An ultrawideband signal design with
power spectral density constraints. Proc. 38th Asilomar Conf., 2:
1521–1525, Nov. 2005.
[71] Y. Li. Pilot-symbol-aided channel estimation for OFDM in wireless
systems. IEEE T. Veh. Technol., 49:1207–1215, 2000.
[72] Y. (G.) Li. Simplified channel estimation for OFDM with multiple trans-
mit antennas. 1:67–75, 2002.
[73] Y. Li, L. G. Cimini, and N. R. Sollenberger. Robust channel estimation
for OFDM systems with rapid dispersive fading channels. IEEE T.
Commun., 46:902–915, 1998.
[74] S. Lin and D. Costello. Error Control Coding. Prentice Hall PTR, 2003
edition.
[75] Y.-P. Lin. Multirate Systems. Course materials at National Chiao Tung
University Taiwan, 2007.
[76] Y.-P. Lin and S.-M. Phoong. Perfect discrete multitone modulation
with optimal transceivers. IEEE T. Signal Processing, 48:1702–1711,
Jun. 2000.
[77] Y.-P. Lin and S.-M. Phoong. ISI Free FIR Filterbank Transceivers
for Frequency Selective Channels. IEEE T. Signal Processing, 49:
2648–2658, Nov. 2001.
[78] Y.-P. Lin and S.-M. Phoong. BER minimized OFDM systems with channel
independent precoders. IEEE T. Signal Processing, 51:2369–2380,
Sep. 2003.
[79] D. J. Love and R. W. Heath. Grassmannian beamforming for multiple-
input multiple-output wireless systems. IEEE Trans. Inform. Theory,
49(10):2735–2747, October 2003.
[80] D. J. Love and R. W. Heath. Limited feedback unitary precoding for
orthogonal space-time block codes. IEEE Trans. on Signal Processing,
53(1):64–73, January 2005.
[81] R. Lupas and S. Verdú. Linear multiuser detectors for synchronous
code-division multiple access channels. IEEE T. Inform. Theory, 34:
123–136, Jan. 1989.
[82] R. Lupas and S. Verdú. Near-far resistance of multi-user detectors in
asynchronous channels. IEEE T. Commun., 38(4):496–508, 1990.
[83] U. Madhow and M. L. Honig. On the average near-far resistance for
MMSE detection of direct sequence CDMA signals with random spread-
ing. IEEE T. Inform. Theory, 45(6):2039–2045, Sep. 1999.
[84] H. Miyakawa and H. Harashima. Matched-transmission technique for
channels with intersymbol interference. IEEE T. Commun., 20(4):
774–780, 1972.
[85] C. B. Moler, J. J. Dongarra, J. R. Bunch, and G. W. Stewart. LINPACK
Users’ Guide. Society for Industrial and Applied Mathematics, 1980
edition.
[86] A. F. Molisch. Ultrawideband propagation channels-theory, mea-
surement, and modeling. IEEE T. Veh. Technol., 54(5):1528–1545,
Sep. 2005.
[87] P. H. Moose. A technique for orthogonal frequency division multi-
plexing frequency offset correction. IEEE T. Commun., 42:2908–2914,
Oct. 1994.
[88] M. Morelli. Timing and frequency synchronization for the uplink of an
OFDMA system. IEEE T. Commun., 52:296–306, Feb. 2004.
[89] S. Moshavi. Multi-user detection for DS-CDMA communications. IEEE
Commun. Mag., 49:124–136, Oct. 1996.
[90] Y. Mostofi and D. C. Cox. ICI mitigation for pilot-aided OFDM mobile
systems. IEEE T. Wireless Commun., 4:765–774, 2005.
[91] A. Narula, M. J. Lopez, M. D. Trott, and G. W. Wornell. Efficient use
of side information in multiple-antenna data transmission over fading
channels. IEEE J. Select. Areas Commun., 16(8):1423–1436, October
1998.
[92] B. Natarajan, Z. Wu, C. R. Nassar, and S. Shattil. Large set of CI
spreading codes for high-capacity MC-CDMA. IEEE T. Commun., 52:
1862–1866, Nov. 2004.
[93] R. Negi and J. Cioffi. Pilot tone selection for channel estimation in a
mobile OFDM system. IEEE T. Consumer Electronics, 44:1122–1128,
1998.
[94] H. T. Nguyen, J. B. Andersen, and G. F. Pedersen. The potential use
of time reversal techniques in multiple element antenna systems. 9(1):
40–42, Jan. 2005.
[95] S. Niranjayan, A. Nallanathan, and B. Kannan. Delay tuning based
transmit diversity scheme for TH-PPM UWB: performance with RAKE
reception and comparison with multi RX schemes. Proc. Joint UWBST
& IWUWBS’04, 341–345, May 2004.
[96] K. Nishimori, K. Cho, Y. Takatori, and T. Hori. Automatic calibra-
tion method using transmitting signals of an adaptive array for TDD
systems. IEEE T. Veh. Technol., 50(6):1636–1640, Nov. 2001.
[97] R. Nogueroles, M. Bossert, A. Donder, and V. Zyablov. Improved per-
formance of a random OFDMA mobile communication system. IEEE
VTC, 3:2502–2506, May 1998.
[98] A. V. Oppenheim and R. W. Schafer. Discrete-Time Signal Processing.
Prentice Hall, 1989.
[99] P. Palomar, J. M. Cioffi, and M. A. Lagunas. Joint Tx-Rx beamforming
design for multicarrier MIMO channels: a unified framework for convex
optimization. IEEE Trans. on Signal Processing, 51(9):2381–2401, 2003.
[100] S.-M. Phoong. Multirate Systems. Course materials at National Taiwan
University Taiwan, 2007.
[101] T. Pollet, M. V. Bladel, and M. Moeneclaey. BER sensitivity of OFDM
systems to carrier frequency offset and Wiener phase noise. IEEE Trans.
Commun., 43:191–193, Feb./Mar./Apr. 1995.
[102] W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling.
Numerical Recipes, the Art of Scientific Computing. Cambridge University
Press, NY, 1986 edition.
[103] C. Prettie, D. Cheung, L. Rusch, and M. Ho. Spatial correlation of UWB
signals in a home environment. Proc. UWBST’02, 65–69, May 2002.
[104] R. Price. Non linearly feedback equalized PAM versus capacity for noisy
filtered channels. Proc. ICC’72, 1972.
[105] J. G. Proakis. Digital Communications. McGraw-Hill, 4th edition, 2000.
[106] M. O. Pun, S.-H. Tsai, and C.-C. Jay Kuo. An EM-based maximum
likelihood joint carrier frequency offset and channel estimation for uplink
of OFDMA systems. In IEEE VTC Fall, Sep. 2004.
[107] M. O. Pun, S.-H. Tsai, and C.-C. Jay Kuo. Joint maximum likelihood
estimation of carrier frequency offset and channel in uplink OFDMA
systems. IEEE Globecom, 6:3748–3752, Dec. 2004.
[108] T. S. Rappaport. Wireless Communications. Prentice Hall PTR, 2002.
[109] I. M. Ryzhik, I. S. Gradshteyn, and A. Jeffrey. Tables of Integrals,
Series, and Products. Academic Press, Inc., 1994 edition.
[110] H. Sampath and A. Paulraj. Linear precoding for space-time coded
systems with known fading correlation. IEEE Trans. Commun., 6(6):
502–513, 2002.
[111] H. Sampath, P. Stoica, and A. Paulraj. A generalized space-time linear
precoder and decoder design using the weighted MMSE criterion.
In Proc. of 39th Asilomar Conf. on Signals, Systems and Computers,
pages 753–758, 2000.
[112] H. Sampath, P. Stoica, and A. Paulraj. Generalized linear precoder and
decoder design for MIMO channels using the weighted MMSE criterion.
IEEE Trans. Commun., 49(12):2198–2206, December 2001.
[113] H. Sari, G. Karam, and I. Jeanclaude. Transmission techniques for
digital terrestrial TV broadcasting. IEEE Commun. Mag., 100–109,
Feb. 1995.
[114] H. Sari, Y. Levy, and G. Karam. An analysis of orthogonal frequency-
division multiple access. IEEE Globecom, 3:1635–1639, Nov. 1997.
[115] A. Scaglione, G. B. Giannakis, and S. Barbarossa. Redundant filterbank
precoders and equalizers part I: unification and optimal designs. IEEE
T. Signal Process., 47:1988–2006, Jul. 1999.
[116] A. Scaglione, G. B. Giannakis, and S. Barbarossa. Lagrange/
Vandermonde MUI eliminating user codes for quasi-synchronous CDMA
in unknown multipath. IEEE T. Signal Process., 48:2057–2073, Jul.
2000.
[117] A. Scaglione, P. Stoica, S. Barbarossa, G. B. Giannakis, and H. Sam-
path. Optimal design for space-time linear precoders and decoders.
IEEE Trans. on Signal Processing, 50(5):1051–1064, May 2002.
[118] T. M. Schmidl and D. C. Cox. Robust frequency and timing synchro-
nization for OFDM. IEEE T. Commun., 45:1613–1621, Dec. 1997.
[119] Q. Shi and M. Latva-aho. Simple spreading code allocation scheme for
downlink MC-CDMA. Electronics Letters, 38:807–809, Jul. 2002.
[120] D. Shiu, G. J. Foschini, M. Gans, and J. M. Kahn. Fading correlation
and its effect on the capacity of multi-element antenna systems. IEEE
Trans. Commun., 48(3):502–513, 2000.
[121] M. K. Simon, J. K. Omura, R. A. Scholtz, and B. K. Levitt. Spread
Spectrum Communications Handbook, Electronic Edition. McGraw-Hill,
electronic edition, 2002.
[122] S. Stanczak, H. Boche, and M. Haardt. Are LAS-codes a miracle? IEEE
Globecom, 3:589–593, Nov. 2001.
[123] T. Strohmer, M. Emami, J. Hansen, G. Papanicolaou, and
A. J. Paulraj. Application of time-reversal with MMSE equalization
to UWB communications. Proc. IEEE Globecom’04, 5:3123–3127, Nov.
2004.
[124] L. Tadjpour, S.-H. Tsai, and C.-C. J. Kuo. Orthogonal codes for MAI-
free MC-CDMA with carrier frequency offsets (CFO). IEEE Globecom,
06, Nov. 2006.
[125] L. Tadjpour, S.-H. Tsai, and C.-C. Jay Kuo. An approximately MAI-
free multiaccess OFDM system in fast time-varying channels. IEEE T.
Signal Process., 55(7):3787–3799, 2007.
[126] O. Takyu, T. Ohtsuki, and M. Nakagawa. Frequency offset compensa-
tion with MMSE-MUD for multi-carrier CDMA in quasi-synchronous
uplink. IEEE ICC, 4:2485–2489, May 2003.
[127] V. Tarokh, H. Jafarkhani, and A. R. Calderbank. Space-time block codes
from orthogonal design. IEEE Trans. Inform. Theory, 45:1456–1467,
1999.
[128] V. Tarokh, N. Seshadri, and A. R. Calderbank. Space-time codes for
high data rate wireless communication: performance criterion and code
construction. IEEE Trans. Inform. Theory, 44:744–765, Mar. 1998.
[129] E. Telatar. Capacity of multi-antenna Gaussian channels. European
Transactions on Telecommunications, 10(6):585–595, Nov/Dec, 1999.
[130] V. Thippavajjula and B. Natarajan. Parallel interference cancellation
techniques for synchronous carrier interferometry/MC-CDMA uplink.
In Proc. of 60th IEEE Vehicular Technology Conference. San Francisco,
CA, Sep. 2004.
[131] M. Tomlinson. New automatic equalizer employing modulo arithmetic.
Electronics Letters, 7:138–139, 1971.
[132] M. Torlak, G. Xu, and H. Liu. An improved signature waveform ap-
proach exploiting pulse shaping information in synchronous CDMA sys-
tems. IEEE ICC, 2:22–27, Jun. 1996.
[133] S.-H. Tsai, Y.-P. Lin, and C.-C. J. Kuo. A repetitively coded multicar-
rier CDMA (RCMC-CDMA) transceiver for multiuser communications.
IEEE WCNC, 2:959–964, Mar. 2004.
[134] S.-H. Tsai, Y.-P. Lin, and C.-C. J. Kuo. Code priority of Multiuser
OFDM systems in frequency asynchronous environment. In Proc.
of IEEE 62nd Semiannual VTC Fall, pages 710–713. Dallas, Texas,
Sep. 25–28, 2005.
[135] S.-H. Tsai, Y.-P. Lin, and C.-C. J. Kuo. An approximately MAI-
free multiaccess OFDM system in carrier frequency offset environment.
IEEE T. Signal Proces., 53(11):4339–4353, 2005.
[136] S.-H. Tsai, Y.-P. Lin, and C.-C. J. Kuo. MAI-free MC-CDMA sys-
tems based on Hadamard Walsh codes. IEEE T. Signal Proces., 54:
3166–3179, Aug. 2006.
[137] G. V. Tsoulos and M. A. Beach. Calibration and linearity issues
for an adaptive antenna system. In Proc. IEEE VTC’97, 1597–1600,
May 1997.
[138] G. Ungerboeck. Channel coding with multilevel/phase signals. IEEE
T. Inform. Theory, IT-28:55–67, 1982.
[139] K. Usuda, H. Zhang, and M. Nakagawa. Pre-RAKE performance for
pulse based UWB system in a standardized UWB short-range channel.
Proc. IEEE WCNC’04, 2:920–925, 2004.
[140] P. P. Vaidyanathan. Multirate Systems and Filter Banks. Englewood
Cliffs, Prentice-Hall, NJ, 1993.
[141] J.-J. van de Beek, M. Sandell, and P. O. Börjesson. ML estimation of
time and frequency offset in OFDM systems. IEEE T. Signal Proces.,
45:1800–1805, Jul. 1997.
[142] J. J. van de Beek, P. O. Börjesson, M. L. Boucheret, D. Landström, J. M.
Arenas, P. Ödling, C. Östberg, M. Wahlqvist, and S. K. Wilson. A time
and frequency synchronization scheme for multiuser OFDM. IEEE J.
Select. Areas Commun., 17:1900–1914, Nov. 1999.
[143] M. Varanasi and B. Aazhang. Multistage detection in asynchronous
code-division multiple-access communications. IEEE Trans. Commun.,
38:509–519, Apr. 1990.
[144] S. Verdú. Multiuser Detection. Cambridge University Press, 1998.
[145] E. Visotsky and U. Madhow. Space-time transmit precoding with
imperfect feedback. IEEE Trans. Inform. Theory, 47(6):2632–2639,
September 2001.
[146] A. J. Viterbi. CDMA: Principles of Spread Spectrum Communication.
Addison-Wesley, Reading, MA, 1995.
[147] B. R. Vojcic and W. M. Jang. Transmitter precoding in synchronous
multiuser communications. IEEE Trans. Commun., 46(10):1346–1355,
Oct. 1998.
[148] Q. Wang and V. K. Bhargava. An efficient maximum likelihood decoding
algorithm for generalized tail biting convolutional codes including
quasicyclic codes. IEEE Trans. Commun., 37:875–879, 1989.
[149] Z. Wang and G. B. Giannakis. Linearly precoded or coded OFDM
against wireless channel fades. In Third IEEE Signal Processing Workshop
on Signal Processing Advances in Wireless Communications,
page 267.
[150] S. S. H. Wijayasuriya, G. H. Norton, and J. P. McGeehan. A near-far
resistant sliding window decorrelating algorithm for multi-user detec-
tors in DS-CDMA systems. In Proc. IEEE Globecom Conference, pages
1331–1338. San Francisco, CA, Dec. 1992.
[151] M. Z. Win and R. A. Scholtz. On the energy capture of ultrawide
bandwidth signals in dense multipath environments. 2(9):245–247,
Sep. 1998.
[152] C. Windpassinger, R. F. H. Fischer, T. Vencel, and J. B. Huber. Pre-
coding in multiantenna and multiuser communications. 3(4):1305–1316,
2004.
[153] X. Xia. Precoded and vector OFDM robust to channel spectral
nulls and with reduced cyclic prefix length. IEEE T. Commun., 49:
1363–1374, 2001.
[154] Y. Xin, Z. Wang, and G. Giannakis. Space-time diversity systems
based on linear constellation precoding. IEEE T. Wireless Commun.,
2(2):502–513, 2003.
[155] N. Yee, J. P. Linnartz, and G. Fettweis. Multi-carrier CDMA in indoor
wireless radio networks. IEICE T. Commun., E77-B:900–904, Jul. 1994.
[156] H. Yoo and D. Hong. Edge sidelobe suppressor scheme for OFDMA
uplink systems. IEEE Commun. Lett., 7:534–536, Nov. 2003.
[157] Y. Zhao and S. G. Haggman. Intercarrier interference self-cancellation
scheme for OFDM mobile communication systems. IEEE T. Commun.,
49:1185–1191, 2001.
[158] S. Zhou, X. Cai, and G. B. Giannakis. Group-orthogonal multicarrier
CDMA. IEEE T. Commun., 52:90–99, Jan. 2004.
Index

ADC, 280
analog-to-digital converter, 280
antiperiodic, 234
approximately MAI-free, 121–124, 127, 137, 146, 159, 188
AWGN, 279

carrier frequency offset (CFO), 271, 272
carrier interferometry (CI) codes, 244, 246
CFO estimation, 118, 159, 166, 225, 229–231
channel estimation, 159, 206–209, 220–222
CLO, 276, 289
codeword length optimization, 276
coding gain, 94
cosets, 21
CPP-UWB, 275

DAC, 42
decimation, 30
decision feedback equalizer (DFE), 13, 14, 20
decorrelator, 47
delay tuning, 276
direct-sequence code division multiple access, 47
diversity, 209, 218, 220, 222, 233
diversity gain, 93, 209–212, 218, 222
Doppler, 186
downlink, 48
DS-CDMA, 47

eigenmode, 68

Fast Search Algorithm, 291
FCC, 277, 297
Federal Communications Commission, 277, 297
frequency division duplex, 275

Generalized Weighted MMSE, 74

Hadamard-Walsh, 47, 123–128, 132, 136–138, 140–147, 151, 156,
    160–162, 166, 167, 171, 172, 174–176, 188, 189, 192, 195, 199,
    200, 202, 203, 209–211, 213, 215, 216, 218, 221–223, 234, 235, 238

inter-symbol interference, 47
intercarrier interference (ICI), 201–203
Interpolation, 30
intersymbol interference, 56
ISI, 13–15, 27–29, 41, 47, 56, 58, 276

Jakes, 187

lattice, 21
Linear constellation precoding (LCP), 92

MAI, 47, 56–58, 117–120
MAI-free, 118, 122, 195, 200
matched filter, 47
maximum likelihood sequence estimation (MLSE), 14

MC-CDMA, 117, 118, 122, 123, 166, 167, 169–171, 209, 210, 216–218,
    221, 224, 230, 233, 234, 237, 238
MF, 47, 48, 51, 52, 54, 55, 58, 60
MIMO, 67
MISO, 52
MMSE receiver, 47
MRC, 209, 211, 212, 218, 220
MUD, 123, 166, 209, 220, 222, 225, 233, 234
multi-input single-output, 52
multipath, 13, 39, 47, 49, 52, 54, 55, 275, 276, 279, 318
multiple access interference, 47
multiple access interference (MAI), 190, 192, 195, 237, 238, 244, 246
multirate, 29, 30, 32
multiuser detection, 118, 119, 123, 211
multiuser detector, 47
multiuser OFDM, 117, 119, 230, 318

Noble Identities, 30

OFDM, 29
    channel information, 40
    cyclic prefix, 37
    multirate representations, 29, 32
    precoding, 41, 46
    zero padding, 39
OFDMA, 118, 122, 123, 146–153, 166–171, 181–183, 185, 202–205,
    207, 208

PAPR, 42
parallel interference cancellation (PIC), 247, 248, 255
partial Pre-Rake, 281
peak-to-average power ratio, 42
phase estimation, 286
phase precoding, 275
PMU-OFDM
    approximately MAI-free, 121
    carrier frequency offset (CFO), frequency asynchronism, 151
    code design, 139
    code priority, 171, 183, 184, 200, 202
    Doppler effect, 188
    Hadamard-Walsh Code, 124, 137
    MAI decreasing rate, 129
    self CFO, 161
    time offset, time asynchronism, 132
    time-variant channel, 185
polyphase decompositions, 31
Polyphase Identity, 31
polyphase representation, 35
post-cursor ISI, 57
Power Spectral Mask, 297
PPR, 281, 283
pre-cursor ISI, 57
Pre-Rake, 52, 54, 275
Precoded Multiuser OFDM (PMU-OFDM), 117

Rake receiver, 47

SC-CP, 41–44
    MMSE equalization, 44
    zero forcing, 43
SC-ZP, 45
    MMSE equation, 45
    zero forcing, 45
selective Pre-Rake, 283
shaping constellation expansion ratio (CER), 23
shaping gain, 22
sign bit shaping, 23
signal power focusing, 276
signal to interference power ratio, 277
signal-to-noise power ratio, 276
single carrier system with cyclic prefix, 41
single carrier system with zero padding, 45
Singular value decomposition (SVD), 68
SIR, 277, 290–294
SNR, 276
SPR, 283
syndrome
    syndrome former, 25, 26
    syndrome sequence, 25

TDD, 48, 275, 318
TDD-DS-CDMA, 52, 318

the additive white Gaussian noise, 279
time division duplex, 48
time-reversal prefilter, 52, 275
Tomlinson–Harashima precoding (THP), 15, 19, 20
transmit matched filter, 48
transmit Wiener filter, 48
transmit zero-forcing filter, 48
Trellis precoding, 27
Trellis Shaping, 22
TRP, 52, 54, 275, 280, 283
Tx-MF, 48, 52
Tx-Wiener, 48, 59
Tx-ZF, 48, 56

ultra wideband, 275
UWB, 275

Viterbi algorithm, 24

WSSUS, 186
