Space Time Code Division
Wireless Networks
Thesis by
Yindi Jing
Doctor of Philosophy
2004
(Submitted September 7, 2004)
© 2004
Yindi Jing
All Rights Reserved
Acknowledgments
I owe a debt of gratitude to many people who have helped me with my graduate study
and research in diverse ways. Without their generosity and assistance, the completion of this thesis would not have been possible.

First, I am deeply grateful to my advisor for his support and boundless energy. He is also an endless source of creative ideas. Oftentimes,
I have realized how truly fortunate I am to have such an open-minded advisor who
allowed me to choose my research subject freely.
My greatest and heartfelt thanks must also go to Professor Babak Hassibi, my
associate advisor and mentor, for his constant encouragement, inspiration, and guid-
ance both in completing this thesis and in my professional development. He led me to
the exciting world of wireless communications. He not only always has great insights
but also shows his students how to start from an ultimate vision of the world and
Vijay Gupta, and Haris Vikalo. Great thanks to Maralle Fakhereddin, a summer
intern, who spent a lot of time and energy proofreading this thesis.
Special thanks to my friends Lun Li and Min Tao for their help, support, and
friendship during my darkest time. My lifetime friend Bing Liu deserves special
mention for his support and concern. He is like a family member to me.
I would also like to acknowledge my officemates of Steele 7, Xin Liu and Domitilla del Vecchio; my classmate and former officemate of Steele 135 and Steele 4, Jiantao Wang, for discussing homework and research problems during my first two years at Caltech; Jim Endrizzi, the international student advisor of the Caltech International Students Program office, for helping me with international student issues; and the Caltech Chinese Association and the Caltech Badminton Club for making my stay at Caltech enjoyable.
Abstract
This thesis studies the design of space-time codes for multiple-antenna systems and the diversity gain when using space-time coding among nodes in wireless networks.
Capacity has long been a bottleneck in wireless communications. Recently, multiple-
antenna techniques have been used in wireless communications to combat the fading
effect, which improves both the channel capacity and performance greatly. A re-
cently proposed method for communicating with multiple antennas over block-fading
channels is unitary space-time modulation, which can achieve the channel capacity
at high SNR. However, it is not clear how to generate well-performing unitary space-time codes that lend themselves to efficient encoding and decoding. In this thesis, the design of unitary space-time codes using the Cayley transform is proposed. Another transmission scheme for multiple-antenna systems with unknown channel information at both the transmitter and the receiver is differential unitary space-time modulation. It can
be regarded as a generalization of DPSK and is suitable for continuous fading. In
differential unitary space-time modulation, fully diverse constellations, i.e., sets of
unitary matrices whose pairwise differences are non-singular, are wanted for their
good pairwise error properties. In this thesis, Lie groups and their representations
are used in solving the design problem. Fully diverse differential unitary space-time
codes for systems with four and three transmit antennas are constructed based on the
Lie groups Sp(2) and SU(3). The designed codes have high diversity products and lend themselves to a fast maximum-likelihood decoding algorithm; simulation results show that they outperform other existing codes, especially at high SNR.
Then the idea of space-time coding devised for multiple-antenna systems is applied
to communications over wireless networks. In wireless relay networks, the relay nodes
encode the signals they receive from the transmit node into a distributed space-time
code and transmit the encoded signals to the receive node. It is shown in this thesis
that at very high SNR, the diversity gain achieved by this scheme is almost the same
as that of a multiple-antenna system whose number of transmit antennas is the same
as the number of relay nodes in the network, which means that the relay nodes work
as if they can cooperate fully and have full knowledge of the message. However, at
moderate SNR, the diversity gain of the wireless network is inferior to that of the
multiple-antenna system. It is further shown that for a fixed total power consumed
in the network, the optimal power allocation is that the transmitter uses half the
power and the relays share the other half fairly. This result addresses the question of
what performance a relay network can achieve. Both it and its extensions have many
applications to wireless ad hoc and sensor network communications.
Contents
2.9 Discussion
2.10 Contributions of This Thesis

3 Cayley Unitary Space-Time Codes
3.1 Introduction
3.6.2 Design of A_r
3.6.3 Design of A_{11,1}, A_{11,2}, ..., A_{11,Q}, A_{22,1}, A_{22,2}, ..., A_{22,Q}
3.6.4 Frobenius Norm of the Basis Matrices
3.6.5 Design Summary
3.7 Simulation Results

6.2 The Special Unitary Lie Group and Its Parameterization
6.3 SU(3) Code Design
6.4 AB Code Design
6.5 A Fast Decoding Algorithm for AB Codes
6.6 Simulation Results

Bibliography
List of Figures
5.3 Comparison of the rate 1.95 Sp(2) code with the rate 1.75 differential Cayley code, the rate 2, 2 × 2 complex orthogonal design, and the rate 1.94, 4 × 4 complex orthogonal design with N = 1 receive antenna
5.4 Comparison of the rate 1.95 Sp(2) code with the rate 1.98 group-based ...
5.6 Comparison of the rate 3.99 Sp(2) code with the rate 4, 2 × 2 and rate 3.99, 4 × 4 complex orthogonal designs with N = 1 receive antenna
5.7 Comparison of P = 11, Q = 7, θ = 0 Sp(2) codes with Γ = {π/4}, R = 3.1334, Γ = {π/8, π/4, 3π/8} + 0.012, R = 3.5296, and Γ = {π/12, π/6, π/4, π/3, 5π/12} + ...
6.1 Comparison of the rate 1.9690, (1, 3, 4, 5) type I AB code with the rate 1.99 G_{21,4} code and the best rate 1.99 cyclic group code
6.2 Comparison of the 1) rate 2.9045, (4, 5, 3, 7) type I AB code, 2) rate 3.15, (7, 9, 11, 1) SU(3) code, 3) rate 3.3912, (3, 7, 5, 11) type II AB code, 4) rate 3.5296, (4, 7, 5, 11) type I AB code, and 5) rate 3.3912, (3, 7, 5, 11) SU(3) code with 6) the rate 3 G_{171,64} code
6.3 Comparison of the 1) rate 3.9838, (5, 8, 9, 11) type II AB code, 2) rate 4.5506, (9, 10, 11, 13) type II AB code, 3) rate 3.9195, (5, 9, 7, 11) SU(3) code, and 4) rate 4.3791, (7, 11, 9, 13) SU(3) code with the 5) rate 4 G_{1365,16} code and 6) rate 4 non-group code
6.4 Comparison of the rate 4.9580, (11, 13, 14, 15) type II AB code with the rate 5 G_{10815,46} code
List of Tables
6.2 Diversity products of some group-based codes and a non-group code
6.3 Diversity products of AB codes
List of Symbols
A^t        transpose of A
Ā          conjugate of A
A^*        conjugate transpose of A
A^⊥        unitary complement of A: if A is m × n, A^⊥ is the m × (m − n) matrix such that [A A^⊥] is unitary
tr A       trace of A
det A      determinant of A
rank A     rank of A
‖A‖_F      Frobenius norm of A
P          probability
E          expected value
Var        variance
Z          set of integers
MIMO       multiple-input multiple-output
ML         maximum-likelihood
OFDM       orthogonal frequency division multiplexing
PEP        pairwise error probability
Chapter 1 Introduction to
Multiple-Antenna Communication
Systems
1.1 Introduction
Wireless communications has developed tremendously in the last ten years, during which new methods were introduced and new devices invented. Nowadays, we are surrounded by wireless devices and networks in our everyday lives: cellular phones, handheld PDAs, wireless Internet, walkie-talkies, etc. The ultimate goal of wireless communications is to communicate with anybody from anywhere at any time.

Since channel capacity grows only logarithmically with the transmit power, we cannot increase capacity by simply increasing the transmit power. Communication systems in use today are predominantly single-antenna systems. Because of the multiple-path propagation in wireless channels, the capacity of
a single wireless channel can be very low. Research efforts have focused on ways to
make more efficient use of this limited capacity and have made remarkable progress. On the one hand, efficient techniques, such as frequency reuse [Rap02] and OFDM [BS99], have been invented to increase the bandwidth efficiency; on the other hand, advances in coding such as turbo codes [BGT93] and low-density parity-check codes [Gal62, MN96, McE02] make it feasible to come very close to the Shannon capacity [CT91, McE02], the theoretical upper bound on the rate of reliable transmission. However, it would still be far-fetched to conclude that the capacity bottleneck has been broken.
Besides their low Shannon capacity, single-antenna systems suffer from another great disadvantage: a high error rate. In an additive white Gaussian noise (AWGN) channel,
which models a typical wired channel, the pairwise error probability (PEP), the prob-
ability of mistaking the transmitted signal with another one, decreases exponentially
with the signal-to-noise ratio (SNR), while due to the fading effect, the average PEP
for wireless single-antenna systems only decreases linearly with SNR. Therefore, to
achieve the same performance, a much longer code or much higher transmit power is
needed for single-antenna wireless communication systems.
Given the above disadvantages, single-antenna systems are unpromising candi-
dates to meet the needs of future wireless communications. Therefore, new commu-
nication systems superior in capacity and error rate must be introduced and conse-
quently, new communication theories for these systems are of great importance at the
present time.
Recently, one such new system, the digital communication system using multiple antennas, has received enormous attention. It has been shown that communication systems with multiple antennas have a much higher capacity than single-antenna systems, and that the capacity improvement is almost linear in
the number of transmit antennas or the number of receive antennas, whichever is
smaller. This result indicated the superiority of multiple-antenna systems and ig-
nited great interest in this area. Within a few years, much work has been done generalizing
and improving their results. On the one hand, for example, instead of assuming that
the channels have rich scattering so that the propagation coefficients between trans-
mit and receive antennas are independent, it was assumed that correlation can exist
between the channels; on the other hand, unrealistic assumptions, such as perfect
channel knowledge at both the transmitter and the receiver are replaced by more re-
alistic assumptions of partial or no channel information at the receiver. Information
theoretic capacity results have been obtained under these and other new assumptions,
for example, [ZT02, SFGK00, CTK02, CFG02].
These results indicate that multiple-antenna systems have much higher Shannon
capacity than single-antenna ones. However, since Shannon capacity can only be
achieved by codes with unbounded complexity and delay, the above results do not
reflect the performance of real transmission systems. For example, in a system with
two transmit antennas, if identical signals are transmitted from both antennas at the same time, a PEP that is inversely proportional to the SNR is obtained, which is the same as that of single-antenna systems although the coding gain is improved. However, if Alamouti's scheme [Ala98] is used, a PEP that behaves as SNR^{-2} is obtained. This improved error performance is the main reason that space-time coding attracts much attention from academic researchers and industrial engineers alike.
The idea of space-time coding was first proposed by Tarokh, Seshadri and Calder-
bank in [TSC98]. They proved that space-time coding achieves a PEP that is inversely proportional to SNR^{MN}, where M and N are the numbers of transmit and receive antennas. The first practical space-time code was proposed by Alamouti in [Ala98], which works for systems with two transmit antennas. It is also one of the most successful space-time
codes because of its great performance and simple decoding.
The result in [TSC98] is based on the assumption that the receiver has full knowl-
edge of the channel, which is not a realistic assumption for systems with fast-changing
channels. Hochwald and Marzetta studied the much more practical case where no
channel knowledge is available at either the transmitter or the receiver. They first
found a capacity-achieving space-time coding structure in [MH99] and based on this
result, they proposed unitary space-time modulation [HM00]. In [HM00], they also
proved that unitary space-time coding achieves the same diversity, MN, as general
space-time coding.
Based on unitary space-time modulation, a transmission scheme that is better tailored for systems with no channel information at either the transmitter or the receiver, differential unitary space-time modulation, was proposed by Hochwald and Sweldens in [HS00] and by Hughes in [Hug00a]. Since then, the design of unitary space-time coding has been improved greatly. There are many papers on the design of differential
and non-differential unitary space-time codes, for example, [TJC99, HH02b, MHH02,
HH02a, JH03e, JH03b, JH04e, GD03, DTB02]. There is also much effort in trying to
improve the coding gain by combining space-time codes with other error-correcting
codes or modulations, for example, [SD01, SG01, LFT01, Ari00, BBH00, FVY01,
GL02, BD01, LB02, JS03]. Today, this area is still under intensive theoretical study.
In this thesis, the design of space-time codes is investigated in order to exploit the transmit diversity provided by the multiple antennas at the transmitter, along with the applications of space-time coding in wireless networks in order to exploit the distributed diversity among network nodes. Part II of the thesis is Chapter 3, which concerns the design of unitary space-time codes using the Cayley transform. Part III includes Chapters 4, 5 and 6, where
the design of differential unitary space-time codes based on groups is discussed. In
Chapter 4, concepts and background materials of groups and Lie groups are listed,
along with the motivation for using groups in differential unitary space-time code
design. Chapters 5 and 6 are on the design of differential unitary space-time codes
for systems with four transmit antennas and three transmit antennas based on the
Lie groups Sp(2) and SU (3), respectively. Part IV is Chapter 7, in which the idea
of space-time coding is used in wireless networks to exploit the distributed diversity
among the relay nodes. The last part, Chapter 8, is the summary and discussion.
Consider a wireless communication system with two users. One is the transmitter
and the other is the receiver. The transmitter has M transmit antennas and the
receiver has N receive antennas as illustrated in Figure 1.1. There exists a wireless
channel between each pair of transmit and receive antennas. The channel between
the m-th transmit antenna and the n-th receive antenna can be represented by the
random propagation coefficient hmn , whose statistics will be discussed later.
[Figure 1.1: A multiple-antenna communication system with M transmit antennas and N receive antennas. The fading coefficient h_{mn} connects the m-th transmit antenna to the n-th receive antenna, and w_n is the additive noise at the n-th receive antenna.]
At each channel use, the transmitter feeds signals s_1, ..., s_M to its M antennas, respectively. The antennas then send the
signals simultaneously to the receiver. Every receive antenna at the receiver obtains
a signal that is a superposition of the signals from every transmit antenna through
the fading coefficient. The received signal is also corrupted by noise. If we denote
the noise at the n-th receive antenna by wn , the received signal at the n-th receive
antenna is
x_n = Σ_{m=1}^{M} h_{mn} s_m + w_n.
Define the M × N channel matrix

H = [ h_{11} h_{12} ... h_{1N} ; h_{21} h_{22} ... h_{2N} ; ... ; h_{M1} h_{M2} ... h_{MN} ],

the 1 × M transmitted signal vector s = [s_1, ..., s_M], the 1 × N received signal vector x = [x_1, ..., x_N], and the 1 × N noise vector w = [w_1, ..., w_N]. The system equation can then be written as

x = sH + w.    (1.1)
The wireless characteristic of the channel places fundamental limitations on the per-
formance of wireless communication systems. Unlike wired channels that are sta-
tionary and predictable, wireless channels are extremely random and are not easily
analyzed due to the diverse environment, the motion of the transmitter, the receiver,
and the surrounding objects. In this section, characteristics of wireless channels are briefly reviewed. In general, the received signal strength decays as the distance between the transmitter and the receiver increases. Traditionally, propagation model-
ing focuses on two aspects. Propagation models that predict the mean signal strength
for an arbitrary transmitter-receiver separation distance are called large-scale propa-
gation models since they characterize signal strength over large transmitter-receiver
distances. Propagation models that characterize the rapid fluctuations of the received
signal strength over very short travel distances or short time durations are called small-scale or fading models. In this thesis, the focus is on fading models, which are more
suitable for indoor and urban areas.
When the transmitted signal bandwidth is smaller than the channel coherence bandwidth, the channel fading process is correlated across the signal bandwidth. This type of fading is referred to as flat fading or frequency non-selective fading.
The Rayleigh distribution is commonly used to describe the statistical time-
varying nature of the received envelope of a flat-fading signal. It is also used to
model fading channels in this thesis. For a typical mobile wireless channel in indoor
or urban areas, we may assume that the direct line-of-sight wave is obstructed and
the receiver obtains only reflected waves from the surrounding objects. When the
number of reflected waves is large, by the central limit theorem, the two quadrature
components of the received signal are uncorrelated Gaussian random processes with
mean zero and variance σ 2 . As a result, the envelope of the received signal at any
time instant has a Rayleigh distribution and its phase is uniform between −π and π.
The probability density function of the Rayleigh distribution is given by
p(r) = (r/σ²) e^{−r²/(2σ²)},  r ≥ 0,

and p(r) = 0 for r < 0.
If the fading coefficients in the multiple-antenna system model given in (1.1) are
normalized by
Σ_{m=1}^{M} |h_{mn}|² = M,  for n = 1, 2, ..., N,    (1.2)

we have σ² = 1/2. Therefore, the fading coefficient h_{mn} has a complex Gaussian distribution with zero mean and unit variance, or equivalently, the real and imaginary parts of h_{mn} are independent Gaussians with mean zero and variance 1/2. Note that with (1.2),

E | Σ_{m=1}^{M} h_{mn} s_m |² = Σ_{m=1}^{M} E |h_{mn}|² E |s_m|² = Σ_{m=1}^{M} E |s_m|² = P,

which indicates that the normalization in (1.2) makes the received signal power at every receive antenna equal to the total transmit power.
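As a quick numerical sanity check of this normalization (an illustrative sketch, not part of the original text; the parameter values are arbitrary), the following Python snippet draws fading coefficients h_{mn} as unit-variance complex Gaussians and verifies that the envelope statistics and the received signal power behave as described above.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, P, trials = 4, 2, 10.0, 200_000

# h_mn ~ CN(0,1): real and imaginary parts are N(0, 1/2), so sigma^2 = 1/2.
h = (rng.normal(0, np.sqrt(0.5), (trials, M, N))
     + 1j * rng.normal(0, np.sqrt(0.5), (trials, M, N)))

print("mean |h|^2 (should be ~1):", np.mean(np.abs(h) ** 2))
print("mean envelope (Rayleigh, expect sqrt(pi)/2 ~ 0.886):", np.mean(np.abs(h)))

# Transmit vector s with total power P, i.e. each entry has power P/M.
s = (rng.normal(0, np.sqrt(P / (2 * M)), (trials, M))
     + 1j * rng.normal(0, np.sqrt(P / (2 * M)), (trials, M)))

# Noise-free received signal at each antenna: x_n = sum_m h_mn s_m.
x = np.einsum('tm,tmn->tn', s, h)
print("received power per antenna (should be ~P):", np.mean(np.abs(x) ** 2, axis=0))
```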
Another widely used channel model is the Ricean model which is suitable for the
case when there is a dominant stationary signal component, such as a line-of-sight
propagation path. The small-scale fading envelope is Ricean, with probability density
function,
p(r) = (r/σ²) e^{−(r²+A²)/(2σ²)} I₀(Ar/σ²),  r ≥ 0,

and p(r) = 0 for r < 0.
The parameter A is always positive and denotes the peak amplitude of the dominant
signal, and I0 (·) is the zeroth-order modified Bessel function of the first kind [GR00].
The capacities of multiple-antenna systems are of great interest. This section is about the capacity of multiple-antenna
communication systems with Rayleigh fading channels. Three cases are discussed:
both the transmitter and the receiver know the channel, only the receiver knows the
channel, and neither the transmitter nor the receiver knows the channel. The results are mainly based on [Tel99]. The transmitted signal is assumed to satisfy the power constraint

E tr(s*s) ≤ P,  or equivalently,  E tr(ss*) ≤ P.
In the first case, assume that both the transmitter and receiver know the channel
matrix H. Note that H is deterministic in this case. Consider the singular value decomposition of the channel matrix, H = UDV*, where U and V are unitary and D is diagonal.¹ By rotating the transmitted, received, and noise vectors by the appropriate unitary matrices, giving s̃, x̃, and ṽ, the system equation can be transformed into the equivalent form

x̃ = D s̃ + ṽ.
Since v is circularly symmetric complex Gaussian2 with mean zero and variance IN ,
¹ An M × N matrix A is diagonal if its off-diagonal entries a_{ij}, i ≠ j, i = 1, 2, ..., M, j = 1, 2, ..., N, are zero.
² A complex vector x ∈ Cⁿ is said to be Gaussian if the real random vector x̂ = [x_Re ; x_Im] ∈ R^{2n} is Gaussian. x is circularly symmetric if the variance of x̂ has the structure

[ Q_Re  −Q_Im ; Q_Im  Q_Re ]

for some Hermitian non-negative definite matrix Q ∈ C^{n×n}. For more on this subject, see [Tel99, Ede89].
ṽ is also circularly symmetric complex Gaussian with mean zero and variance IN .
Since the rank of H is min{M, N}, at most min{M, N} of its singular values are non-zero. Denote the non-zero singular values of H by √λ_i. The system equation can be written component-wise to get

x̃_i = √λ_i s̃_i + ṽ_i,  for 1 ≤ i ≤ min{M, N}.

Thus, the channel is decomposed into min{M, N} parallel independent sub-channels. It is proved in [Tel99] that the capacity-achieving distribution of s̃_i is circularly symmetric Gaussian, and the capacity of the i-th independent channel is log(1 + λ_i P_i), where P_i = E s̃_i s̃_i* is the
power consumed in the i-th independent channel. Therefore, to maximize the mu-
tual information, s̃i should be independent circularly symmetric Gaussian distributed
and the transmit power should be allocated to the equivalent independent channels
optimally. It is also proved in [Tel99] that the power allocation should follow “water-
filling" principle: the power allocated to the i-th sub-channel should be E s̃_i* s̃_i = (µ − λ_i^{-1})^+, where µ is chosen such that Σ_{i=1}^{min{M,N}} (µ − λ_i^{-1})^+ = P.³ The capacity of the system is thus

C = Σ_{i=1}^{min{M,N}} log(µ λ_i).
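The water-filling allocation is easy to compute numerically. The sketch below is an illustration under the assumptions above (base-2 logarithms and hypothetical channel gains λ_i); it finds the water level µ by bisection and returns the per-sub-channel powers and the resulting capacity.

```python
import numpy as np

def water_filling(lam, P, iters=100):
    """Water-filling over parallel channels with gains lam = [lambda_1, ...].

    Powers are P_i = max(mu - 1/lambda_i, 0) with mu chosen so that sum P_i = P.
    Returns (powers, capacity) with capacity = sum log2(1 + lambda_i * P_i).
    """
    lam = np.asarray(lam, dtype=float)
    inv = 1.0 / lam
    lo, hi = 0.0, P + inv.max()          # mu is bracketed by these values
    for _ in range(iters):               # bisection on the water level mu
        mu = 0.5 * (lo + hi)
        if np.maximum(mu - inv, 0.0).sum() > P:
            hi = mu
        else:
            lo = mu
    powers = np.maximum(mu - inv, 0.0)
    capacity = np.log2(1.0 + lam * powers).sum()
    return powers, capacity

lam = np.array([2.0, 1.0, 0.2])          # example channel gains lambda_i
p, c = water_filling(lam, P=3.0)
print("powers  :", p)                    # the strongest sub-channel gets the most power
print("capacity:", c, "bits per channel use")
```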
When only the receiver knows the channel, the transmitter cannot perform the "water-filling" adaptive transmission. It is proved in [Tel99] that the channel capacity is given by

C = log det(I_N + (P/M) H*H),
which is achieved when s is circularly symmetric complex Gaussian with mean zero
³ a⁺ denotes max{0, a}.
and variance (P/M) I_M. When the channel matrix is random according to the Rayleigh distribution, for a fixed N and a large number of transmit antennas M, (1/M) H*H → I_N with probability 1, so the capacity behaves as

N log(1 + P),

which grows linearly in N, the number of receive antennas. Similarly, for a fixed M, lim_{N→∞} (1/N) HH* = I_M with probability 1. Since det(I_N + (P/M) H*H) = det(I_M + (P/M) HH*), the capacity behaves, with probability 1, as

M log(1 + PN/M),

which grows linearly in M. For comparison, a single-antenna system only achieves about log(1 + P). When neither the transmitter nor the receiver knows the channel, the capacity analysis is based on the block-fading channel model, which will be discussed in the next chapter.
1.5 Diversity
Diversity techniques improve the reliability of transmission in fading channels without increasing the transmit power or sacrificing bandwidth. The basic idea of diversity is that, if two or more
independent samples of a signal are sent and then fade in an uncorrelated manner,
the probability that all the samples are simultaneously below a given level is much
lower than the probability of any one sample being below that level. Thus, properly
combining various samples greatly reduces the severity of fading and improves relia-
bility of transmission. We give a very simple analysis below. For more details, please
refer to [Rap02, Stu00, VY03].
The system equation for a single-antenna communication system is
x = √ρ s h + v,
where h is the Rayleigh flat-fading channel coefficient. ρ is the transmit power. v is the
noise at the receiver, which is Gaussian with zero-mean and unit-variance. s satisfies
the power constraint E |s|2 = 1. Therefore, the SNR at the receiver is ρ|h|2 . Since
h is Rayleigh distributed, |h|² is exponentially distributed with probability density function p(x) = e^{−x}, x ≥ 0. Thus, the probability that the receive SNR is less than a level ε is

P(ρ|h|² < ε) = P(|h|² < ε/ρ) = ∫₀^{ε/ρ} e^{−x} dx = 1 − e^{−ε/ρ}.

For large transmit power ρ, this behaves as

P(ρ|h|² < ε) ≈ ε/ρ,
which is inversely proportional to the transmit power. For a multiple-antenna system,
with the same transmit power, the system equation is
x = √ρ s H + v,
where E ss∗ = 1. Further assume that the elements of s are iid, in which case
E |si |2 = 1/M . Since hij are independent, the expected SNR at the receiver is
ρ E[ s H H* s* ] = ρ Σ_{i=1}^{M} Σ_{j=1}^{M} Σ_{k=1}^{N} E[ s_i s_j* h_{ik} h_{jk}* ] = ρ Σ_{i=1}^{M} Σ_{k=1}^{N} E|s_i|² |h_{ik}|² = (ρ/M) Σ_{i=1}^{M} Σ_{k=1}^{N} |h_{ik}|².
The probability that the SNR at the receiver is less than the level ε is then

P( (ρ/M) Σ_{i=1}^{M} Σ_{k=1}^{N} |h_{ik}|² < ε ) = P( Σ_{i=1}^{M} Σ_{k=1}^{N} |h_{ik}|² < Mε/ρ )
  ≤ P( |h_{11}|² < Mε/ρ, ..., |h_{MN}|² < Mε/ρ )
  = Π_{i=1,k=1}^{M,N} P( |h|² < Mε/ρ )
  = ( 1 − e^{−Mε/ρ} )^{MN}.

Therefore, at high transmit power,

P( (ρ/M) Σ_{i=1}^{M} Σ_{k=1}^{N} |h_{ik}|² < ε ) ≲ ( Mε/ρ )^{MN},

which decays as the MN-th power of 1/ρ. This indicates that multiple-antenna systems can achieve a much lower error probability than single-antenna systems at high transmit power.
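The deep-fade probabilities derived above can be checked by Monte Carlo simulation. The following sketch (illustrative only; the chosen ε and ρ values are arbitrary) estimates P(receive SNR < ε) for a single-antenna link and for an M = N = 2 system; the second probability falls off much faster with ρ, reflecting the exponent MN.

```python
import numpy as np

rng = np.random.default_rng(1)

def outage_prob(M, N, rho, eps, trials=1_000_000):
    """Estimate P( (rho/M) * sum_{i,k} |h_ik|^2 < eps ) for iid CN(0,1) fading."""
    h2 = rng.exponential(1.0, size=(trials, M * N)).sum(axis=1)  # |h|^2 ~ Exp(1)
    return np.mean((rho / M) * h2 < eps)

eps = 1.0
for rho in [3.0, 10.0, 30.0]:
    p1 = outage_prob(1, 1, rho, eps)
    p2 = outage_prob(2, 2, rho, eps)
    print(f"rho={rho:5.0f}  P_out(1x1)={p1:.2e}  P_out(2x2)={p2:.2e}")
# P_out(1x1) falls roughly like eps/rho; P_out(2x2) roughly like (M*eps/rho)^(M*N).
```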
There are a lot of diversity techniques. According to the domain where diver-
sity is introduced, they can be classified into time diversity, frequency diversity and
antenna diversity (space diversity). Time diversity can be achieved by transmitting
identical messages in different time slots, which results in uncorrelated fading signals
at the receiver. Frequency diversity can be achieved by using different frequencies to
transmit the same message. The issue we are interested in is space diversity, which is
typically implemented using multiple antennas at the transmitter or the receiver or
both. The multiple antennas should be separated physically by a proper distance to
obtain independent fading. Typically a separation of a few wavelengths is enough.

In receive diversity, the signals obtained at the multiple receive antennas are combined to increase the overall receive SNR and mitigate fading. There are many
combining methods, for example, selection combining, switching combining, maxi-
mum ratio combining, and equal gain combining. Transmit diversity is more difficult
to implement than receive diversity due to the need for more signal processing at
both the transmitter and the receiver. In addition, it is generally not easy for the
transmitter to obtain information about the channel, which results in more difficulties
in the system design.
Transmit diversity in multiple-antenna systems can be exploited by a coding
scheme called space-time coding, which is a joint design of error-control coding, mod-
ulation, and transmit diversity. The idea of space-time coding is discussed in the next
chapter.
Consider the wireless communication system given in Figure 1.1 in Section 1.2. We
use the block-fading model by assuming that the fading coefficients stay unchanged for
T consecutive transmissions, then jump to independent values for another T trans-
missions and so on. This piecewise constant fading process mimics the approximate
coherence interval of a continuously fading process. It is an accurate representation
of many TDMA, frequency-hopping, and block-interleaved systems.
The system equation for a block of T transmissions can be written as
X = √(ρT/M) S H + W,    (2.1)

where

S = [ s_{11} ... s_{1M} ; ... ; s_{T1} ... s_{TM} ],   X = [ x_{11} ... x_{1N} ; ... ; x_{T1} ... x_{TN} ],

H = [ h_{11} ... h_{1N} ; ... ; h_{M1} ... h_{MN} ],   W = [ w_{11} ... w_{1N} ; ... ; w_{T1} ... w_{TN} ].
S is the T ×M transmitted signal matrix with stm the signal sent by the m-th transmit
antenna at time t. The t-th row of S indicates the row vector of the transmitted values
from all the transmitters at time t and the m-th column indicates the transmitted
values of the m-th transmit antenna across the coherence interval. Therefore, the
horizontal axis of S indicates the spatial domain and the vertical axis of S indicates
the temporal domain. This is why S is called a space-time code. In the design of S,
redundancy is added in both the spatial and the temporal domains. H is the M × N channel matrix with h_{mn} the fading coefficient between the m-th transmit antenna and the n-th receive antenna. W is the T × N noise matrix with w_{tn} the noise at the n-th receive antenna at time t. The w_{tn} are iid with CN(0, 1) distribution. X is the T × N matrix of the received signal with x_{tn} the value received by the n-th receive antenna at time t. The t-th row of X is the row vector of the received values at all the receive antennas at time t, and the n-th column contains the received values of the n-th receive antenna across the coherence interval.
If the transmitted signal is further normalized as
(1/M) Σ_{m=1}^{M} E |s_{tm}|² = 1/T,  for t = 1, 2, ..., T,    (2.2)
which means that the average expected power over the M transmit antennas is kept
constant for each channel use, the expected received signal power at the n-th receive
antenna and the t-th transmission is as follows.
E | √(ρT/M) Σ_{m=1}^{M} s_{tm} h_{mn} |² = (ρT/M) Σ_{m=1}^{M} E |s_{tm}|² E |h_{mn}|² = (ρT/M) Σ_{m=1}^{M} E |s_{tm}|² = ρ.
The expected noise power at the n-th receive antenna and the t-th transmission is
E |wtn |2 = 1.
[Figure 2.1: Block diagram of a multiple-antenna system with space-time coding: RT bits b_1, ..., b_{RT} select a T × M codeword S from a codebook of size 2^{RT}; S is sent over the Rayleigh flat-fading channel as √(ρT/M) S H + W, and the decoder recovers the bits from the T × N received matrix X.]
To transmit RT bits during one coherence interval, the transmitter chooses the corresponding codeword from the codebook and feeds the columns of the matrix to its transmit antennas. The receiver decodes the RT bits based
on its received signals which are attenuated by fading and corrupted by noise. The
space-time block code design problem is to design the set, C = {S_1, S_2, ..., S_{2^{RT}}}, of 2^{RT} transmission matrices in order to obtain a low error rate.
In this section, the capacity of the block-fading channel model is discussed. Note that the results in Section 1.4 are actually included in the results here since the system model used in Section 1.4 is a special case of the block-fading model used here with T = 1. As before, three cases are discussed: both the transmitter and the receiver know the channel, only the receiver knows the channel, and neither the transmitter nor the receiver knows the channel. The results are based on [Tel99, MH99] and [ZT02].
When both the transmitter and the receiver know the channel, the capacity is the
same using block-fading model or not since in this case, H is deterministic. When
only the receiver has perfect knowledge of the channel, it is proved in [MH99] that
the average capacity per block of T transmissions is

C = T · E log det(I_N + (ρ/M) H* H),

where the expectation is over all possible channel realizations. Therefore, the average capacity per channel use is

C = E log det(I_N + (ρ/M) H* H),
which is the same as the result in Section 1.4. Thus, the capacity increases almost
linearly in min{M, N }.
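To illustrate the near-linear growth of the capacity in min{M, N}, the following sketch (a numerical illustration, not taken from the thesis) averages log det(I_N + (ρ/M) H*H) over random Rayleigh channel realizations for square systems of increasing size.

```python
import numpy as np

rng = np.random.default_rng(2)

def ergodic_capacity(M, N, rho, trials=2000):
    """Monte Carlo estimate of E log2 det(I_N + (rho/M) H^* H), H iid CN(0,1)."""
    c = 0.0
    for _ in range(trials):
        H = (rng.normal(0, np.sqrt(0.5), (M, N))
             + 1j * rng.normal(0, np.sqrt(0.5), (M, N)))
        G = np.eye(N) + (rho / M) * (H.conj().T @ H)
        _, logdet = np.linalg.slogdet(G)       # log|det G|, det G is real positive
        c += logdet / np.log(2)
    return c / trials

rho = 20.0  # an SNR of 13 dB, chosen arbitrarily
for n in range(1, 5):
    print(f"M = N = {n}: capacity ~ {ergodic_capacity(n, n, rho):.2f} bits/channel use")
# The capacity grows roughly linearly with min{M, N}.
```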
Now, we discuss the case that neither the transmitter nor the receiver knows the
channel. It is proved in [MH99] that for any coherence interval T and any number
of receive antennas, the capacity with M > T transmit antennas is the same as the
capacity obtained with M = T transmit antennas. That is, in terms of capacity,
there is no point in having more transmit antennas than the length of the coherence
interval. Therefore, in the following text, we always assume that T ≥ M .
The structure of the signal that achieves capacity is also given in [MH99], which
will be stated in Section 2.5. Although the structure of capacity-achieving signal is
given in [MH99], the formula for the capacity is still an open problem. In [ZT02],
the asymptotic capacity of Rayleigh block-fading channels at high SNR is computed.
The capacity formula is given up to the constant term according to SNR. Here is the
main result.
Define G(T, M) as the set of all M-dimensional subspaces of C^T. Let K = min{M, N}. If T ≥ K + N, then at high SNR, the asymptotically optimal scheme is to use K of the transmit antennas to send signal vectors with constant equal norm. The asymptotic capacity is then

C = K (1 − K/T) log ρ + c_{K,N} + o(1),    (2.3)

where

c_{K,N} = (1/T) log |G(T, M)| + M (1 − M/T) log(T/(πe)) + (1 − M/T) Σ_{i=N−M+1}^{N} E log χ²_{2i},

and

|G(T, M)| = |S(T, M)| / |S(M, M)| = [ Π_{i=T−M+1}^{T} 2π^i/(i−1)! ] / [ Π_{i=1}^{M} 2π^i/(i−1)! ]

is the volume of the Grassmann manifold G(T, M). χ²_{2i} is a chi-square random variable (see [EHP93]) with 2i degrees of freedom. Formula (2.3) indicates that the capacity is linear in min{M, N, T/2} at high SNR. This capacity expression also has a geometric interpretation as sphere packing in the Grassmann manifold.
In [HM02], the probability density of the received signal when transmitting isotrop-
ically distributed unitary matrices is obtained in closed form, from which capacity
of multiple-antenna systems can be computed. Also, simulation results in [HM02] show that at high SNR, the mutual information is maximized when M = min{N, T/2},
whereas at low SNR, the mutual information is maximized by allocating all transmit
power to a single antenna.
2.3 Performance Analysis of Systems with Known
Channels
When the receiver knows the channel H, it is proved in [TSC98] and [HM00] that the maximum-likelihood (ML) decoder is

arg max_{i=1,2,...,L} P(X | S_i) = arg min_{i=1,2,...,L} ‖ X − √(ρT/M) S_i H ‖²_F.
Since the exact symbol error probability and bit error probability are very difficult
to calculate, research efforts focus on the pairwise error probability (PEP) instead in
order to get an idea of the error performance. The PEP of mistaking S_i for S_j has the following Chernoff upper bound:

P_e ≤ det^{−N}( I_M + (ρT/(4M)) (S_i − S_j)* (S_i − S_j) ).

At high SNR, this behaves as

P_e ≲ det^{−N}( (S_i − S_j)* (S_i − S_j) ) (4M/(ρT))^{MN}.    (2.4)
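As a concrete illustration of this determinant criterion, the hypothetical sketch below evaluates the Chernoff bound and its high-SNR behaviour (2.4) for one pair of 2 × 2 codewords; the Alamouti-type codewords and QPSK symbols are chosen only as an example, not taken from the thesis.

```python
import numpy as np

def alamouti(x, y):
    """2 x 2 orthogonal-design codeword (used here only as an example)."""
    return np.array([[x, y], [-np.conj(y), np.conj(x)]])

def chernoff_pep(Si, Sj, rho, T, M, N):
    """Chernoff bound: det^{-N}( I_M + (rho*T/(4M)) (Si-Sj)^*(Si-Sj) )."""
    D = Si - Sj
    A = np.eye(M) + (rho * T / (4 * M)) * (D.conj().T @ D)
    return 1.0 / np.linalg.det(A).real ** N

def high_snr_pep(Si, Sj, rho, T, M, N):
    """High-SNR behaviour (2.4): det^{-N}((Si-Sj)^*(Si-Sj)) * (4M/(rho*T))^{M N}."""
    D = Si - Sj
    d = np.linalg.det(D.conj().T @ D).real
    return (1.0 / d ** N) * (4 * M / (rho * T)) ** (M * N)

M = T = 2
N = 1
qpsk = np.exp(2j * np.pi * np.arange(4) / 4) / np.sqrt(2)   # unit average power
Si, Sj = alamouti(qpsk[0], qpsk[1]), alamouti(qpsk[2], qpsk[3])
for rho in [10.0, 100.0, 1000.0]:
    print(rho, chernoff_pep(Si, Sj, rho, T, M, N), high_snr_pep(Si, Sj, rho, T, M, N))
# The two expressions agree to leading order as rho grows: the exponent MN is the
# diversity gain, and det((Si-Sj)^*(Si-Sj)) sets the coding gain.
```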
In wireless communication systems, for the receiver to learn the channel, training is needed. Then, data can be sent, and the ML decoding and performance analysis follow the discussion in the previous section. Such schemes are called training-based schemes.
Training-based schemes are widely used in multiple-antenna wireless communica-
tions. The idea of training-based schemes is that when the channel changes slowly,
the receiver can learn the channel information by having the transmitter send pilot
signals known to the receiver. Training-based schemes dedicate part of the trans-
mission matrix S to be a known training signal from which H can be learned. In
particular, training-based schemes are composed of two phases: the training phase
and the data-transmission phase. The following discussion is based on [HH03].
The system equation for the training phase is

X_t = √(ρ_t/M) S_t H + V_t,
M
where St is the Tt × M complex matrix of training symbols sent over Tt time samples
and known to the receiver, ρt is the SNR during the training phase, Xt is the Tt × N
complex received matrix, and V_t is the noise matrix. S_t is normalized as tr(S_t S_t*) = M T_t.
Similarly, the system equation for the data-transmission phase is
X_d = √(ρ_d/M) S_d H + V_d,

where S_d is the data matrix sent during the data-transmission phase, ρ_d is the SNR during that phase, X_d is the received matrix, and V_d is the noise matrix.
There are two general methods to estimate the channel: ML (maximum-likelihood) and LMMSE (linear minimum-mean-square-error) estimation, whose channel estimates are given by

Ĥ = √(M/ρ_t) (S_t* S_t)^{−1} S_t* X_t   and   Ĥ = √(M/ρ_t) ( (M/ρ_t) I_M + S_t* S_t )^{−1} S_t* X_t,

respectively.
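The two estimators can be computed directly from one simulated training phase. The sketch below is an illustration under the model above (with an orthonormal-column training matrix, as suggested by the optimal design discussed next); all parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
M, N, Tt, rho_t = 2, 2, 2, 10.0

# Training matrix with orthonormal columns, scaled so that tr(St* St) = M * Tt.
St = np.sqrt(Tt) * np.linalg.qr(rng.normal(size=(Tt, M)))[0]

H = (rng.normal(0, np.sqrt(0.5), (M, N)) + 1j * rng.normal(0, np.sqrt(0.5), (M, N)))
Vt = (rng.normal(0, np.sqrt(0.5), (Tt, N)) + 1j * rng.normal(0, np.sqrt(0.5), (Tt, N)))
Xt = np.sqrt(rho_t / M) * St @ H + Vt

A = St.conj().T @ St
H_ml = np.sqrt(M / rho_t) * np.linalg.solve(A, St.conj().T @ Xt)
H_mmse = np.sqrt(M / rho_t) * np.linalg.solve((M / rho_t) * np.eye(M) + A,
                                              St.conj().T @ Xt)

print("ML    squared error:", np.linalg.norm(H - H_ml) ** 2)
print("LMMSE squared error:", np.linalg.norm(H - H_mmse) ** 2)
```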
In [HH03], an optimal training scheme that maximizes the lower bound of the
capacity for MMSE estimation is given. There are three parameters to be optimized.
The first one is the training data St . It is proved that the optimal solution is to
choose the training signal as a multiple of a matrix with orthonormal columns. The
second one is the length of the training interval. Setting Tt = M is optimal for any
ρ and T . Finally, the third parameter is the optimal power allocation, which should
satisfy the following,
ρ_d < ρ < ρ_t   if T > 2M,
ρ_d = ρ = ρ_t   if T = 2M,
ρ_d > ρ > ρ_t   if T < 2M.
If Sd is not unitary, the matrix given in (2.6) is only the orthogonal complement of
S.1 Note that the unitary complement of S may not exist in this case.
There are other training schemes based on other design criteria. For exam-
ple, in [Mar99], it is shown that, under certain conditions, by choosing the number of
transmit antennas to maximize the throughput in a wireless channel, one generally
spends half the coherence interval training.
As discussed in the previous section, training-based schemes allocate part of the transmission interval and power to training, which causes both extra time delay and extra power consumption. For systems with multiple transmit and receive antennas, since there are MN channels in total, a considerably long training interval is needed to estimate the channels reliably. Also, when the channels change fast because of the motion of the transmitter, the receiver, or surrounding objects, training is not
possible. In this section, a transmission scheme called unitary space-time modulation
¹ A T × (T − M) matrix Ã is the orthogonal complement of a T × M matrix A if and only if Ã* A = 0.
(USTM) is discussed, which is suitable for transmission in multiple-antenna systems when the channel is unknown to both the transmitter and the receiver, without any training.
When the receiver does not know the channel, it is not clear how to design the signal set and decode. In [MH99], the structure of the capacity-achieving signal is given; the main result is as follows. The capacity-achieving signal can be written as S = ΦD, where Φ is a T × M isotropically distributed matrix with orthonormal columns and D is an independent M × M real, non-negative, diagonal matrix. Furthermore, at high SNR with T > M, d_{11} = d_{22} = ... = d_{MM} = 1 achieves capacity, where d_{ii} is the i-th diagonal entry of D.
With ML decoding, the PEP of mistaking S_i for S_j, averaged over the channel distribution, has the following Chernoff upper bound:

P_e ≤ (1/2) Π_{m=1}^{M} [ 1 / ( 1 + (ρT/M)² (1 − d_m²) / (4(1 + ρT/M)) ) ]^N,

where the d_m are the singular values of S_i* S_j.
Many constellation designs based on this criterion have been proposed, for example, in [HMR+00, ARU01, TK02]. Since L can be quite large, this calls into
question the feasibility of computing and using this performance criterion. The large
number of possible signals also rules out the possibility of decoding via an exhaustive
search. To design constellations that are huge, effective, and yet still simple, so that
they can be decoded in real time, some structure needs to be introduced to the signal
set. In Chapter 3, it is shown how the Cayley transform can be used for this purpose.
X_τ = √ρ S_τ H_τ + V_τ,    (2.8)
The matrix transmitted in block τ is the product of a unitary data matrix U_{z_τ}, z_τ = 0, ..., L − 1, taken from our signal set C and the previously transmitted matrix, S_{τ−1}. In other words,

S_τ = U_{z_τ} S_{τ−1}    (2.9)
with S0 = IM . To assure that the transmitted signal will not vanish or blow up to
infinity, Uzτ must be unitary. Since the channel is used M times, the corresponding
transmission rate is R = (1/M) log₂ L, where L is the cardinality of C. If the propagation environment stays approximately constant for 2M consecutive channel uses, that is, H_τ ≈ H_{τ−1}, then from the system equation in (2.8),

X_τ = √ρ U_{z_τ} S_{τ−1} H_{τ−1} + V_τ = U_{z_τ} (X_{τ−1} − V_{τ−1}) + V_τ = U_{z_τ} X_{τ−1} + V_τ − U_{z_τ} V_{τ−1}.
Thus we obtain the fundamental differential receiver equation

X_τ = U_{z_τ} X_{τ−1} + W_τ',    (2.11)

where W_τ' = V_τ − U_{z_τ} V_{τ−1}. Since U_{z_τ} is unitary, the additive noise term in (2.11) is statistically independent of U_{z_τ} and has independent complex Gaussian entries. Therefore, the maximum-likelihood decoding of z_τ can be written as

ẑ_τ = arg min_{l=0,...,L−1} ‖ X_τ − U_l X_{τ−1} ‖_F.
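The differential encoding rule (2.9) and the ML decoder above are straightforward to simulate. The following sketch is illustrative only: the small diagonal constellation is a placeholder (not a design from the thesis), and the channel is held constant across blocks so that H_τ ≈ H_{τ−1} holds exactly.

```python
import numpy as np

rng = np.random.default_rng(4)
M, N, L, rho, blocks = 2, 2, 4, 100.0, 6

# Example constellation: U_l = diag(w^l, w^l) with w = exp(2*pi*j/L).
w = np.exp(2j * np.pi / L)
U = [np.diag([w ** l, w ** l]) for l in range(L)]

H = (rng.normal(0, np.sqrt(0.5), (M, N)) + 1j * rng.normal(0, np.sqrt(0.5), (M, N)))

def noise():
    return (rng.normal(0, np.sqrt(0.5), (M, N)) + 1j * rng.normal(0, np.sqrt(0.5), (M, N)))

data = rng.integers(0, L, size=blocks)
S = np.eye(M)                                   # S_0 = I_M
X_prev = np.sqrt(rho) * S @ H + noise()         # reference block
decoded = []
for z in data:
    S = U[z] @ S                                # S_tau = U_{z_tau} S_{tau-1}
    X = np.sqrt(rho) * S @ H + noise()
    # ML differential decoding: minimize ||X_tau - U_l X_{tau-1}||_F over l.
    metrics = [np.linalg.norm(X - U[l] @ X_prev) for l in range(L)]
    decoded.append(int(np.argmin(metrics)))
    X_prev = X

print("sent   :", list(data))
print("decoded:", decoded)
```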
It is shown in [HS00, Hug00a] that, at high SNR, the average PEP of transmitting
Ui and erroneously decoding Uj has the upper bound
P_e ≲ (1/2) (8/ρ)^{MN} · 1/| det(U_i − U_j) |^{2N},

which is inversely proportional to | det(U_i − U_j) |^{2N}. Therefore the quality of the code
is measured by its diversity product defined as
ξ_C = (1/2) min_{0 ≤ i < j ≤ L−1} | det(U_i − U_j) |^{1/M}.    (2.13)
From the definition, the diversity product is always non-negative. A code is said to
be fully diverse or have full diversity if its diversity product is not zero. Fully diverse
physically means that the receiver will always decode correctly if there is no noise.
The power 1/M and the coefficient 1/2 in formula (2.13) are used for normalization. With this normalization, the diversity product of any set of unitary matrices is between 0 and 1. From the definition of diversity product, it is easy to see that the set with the largest diversity product is {I_M, −I_M}, since it has the minimum number of elements with the maximum determinant difference. Since | det(I_M − (−I_M)) |^{1/M} = 2, to normalize the diversity product of the set to 1, (2.13) is obtained. The differential
unitary space-time code design problem is thus the following: Let M be the number
of transmit antennas and R be the transmission rate. Construct a set C of L = 2^{MR} M × M unitary matrices such that its diversity product, as defined in (2.13), is as large as possible.
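Formula (2.13) can be evaluated directly for any finite constellation. As a small illustration (the constellations below are examples, not designs from this thesis), the sketch computes ξ_C for {I_M, −I_M}, which equals 1 by the normalization above, and for a small diagonal cyclic constellation.

```python
import numpy as np
from itertools import combinations

def diversity_product(C):
    """xi_C = (1/2) * min_{i<j} |det(U_i - U_j)|^(1/M) for a list of M x M unitaries."""
    M = C[0].shape[0]
    return 0.5 * min(abs(np.linalg.det(Ui - Uj)) ** (1.0 / M)
                     for Ui, Uj in combinations(C, 2))

M = 2
print(diversity_product([np.eye(M), -np.eye(M)]))        # = 1.0 by the normalization

# A small cyclic (diagonal) constellation: U_l = diag(w^l, w^(3l)), w = exp(2*pi*j/L).
L = 8
w = np.exp(2j * np.pi / L)
cyclic = [np.diag([w ** l, w ** (3 * l)]) for l in range(L)]
print(diversity_product(cyclic))                          # > 0, i.e. fully diverse
```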
Many design schemes [HS00, Hug00a, SHHS01, Hug00b, GD03, DTB02] have focused on finding a constellation C = {U_0, ..., U_{L−1}} of L = 2^{MR} unitary M × M matrices that maximizes ξ_C defined in (2.13). Similar to USTM, in general, the
number of unitary M × M matrices in C can be quite large. This huge number
of signals calls into question the feasibility of computing ξC and also rules out the
possibility of decoding via an exhaustive search. To design constellations that are
huge, effective, and yet still simple so that they can be decoded in real time, some
structure should be imposed upon the signal set. In Chapter 4 of this thesis, the idea
of designing differential unitary space-time codes with group structure is introduced.

2.7 Alamouti's Scheme and Its Generalizations
Alamouti's scheme [Ala98] is historically the first, and the most well-known, space-time code that provides full transmit diversity for systems with two transmit antennas. It is also well known for its simple structure and fast ML decoding.
[Figure 2.2: Alamouti's transmission scheme: the two transmit antennas send the signals x, y, −y*, and x* over two channel uses, through the fading coefficients h_{11}, ..., h_{2N}, to the N receive antennas.]
The transmission scheme is shown in Figure 2.2. The channel is used in blocks of
two transmissions. During the first transmission period, two signals are transmitted simultaneously from the two antennas: the first antenna transmits signal x and the second antenna transmits signal y. During the second transmission period, the first antenna transmits signal −y* and the second antenna transmits signal x*. Therefore,
the transmitted signal matrix S is

S = [ x  y ; −y*  x* ].
It is easy to see that the two columns (and the two rows) of S are orthogonal. This design is also called the 2 × 2 orthogonal design. Furthermore, with the power constraint |x|² + |y|² = 1, S is a unitary matrix. The code is the set of such matrices with x ∈ S_1 and y ∈ S_2, where S_1 and S_2 are two sets in C. If S_1 = S_2 = C, the code is exactly the Lie group SU(2). To obtain finite codes, S_1 and S_2 should be chosen as finite sets, and therefore the codes obtained are finite samplings of the infinite Lie group.
Alamouti's scheme not only has a simple structure and full rate (its rate is 1 symbol per channel use), it also has an ML decoding method with
very low complexity. With simple algebra, the ML decoding of Alamouti’s scheme is
equivalent to
arg min_x | x − √(1/ρ) Σ_{i=1}^{N} ( x_{1i} h_{1i}* + x_{2i}* h_{2i} ) |   and   arg min_y | y − √(1/ρ) Σ_{i=1}^{N} ( x_{1i} h_{2i}* − x_{2i}* h_{1i} ) |,
which actually shows that the decodings of the two signals x and y can be decou-
pled. Therefore, the complexity of this decoding is very small. This is one of the
most important features of Alamouti’s scheme. For the unknown channel case, this
transmission scheme can also be used in differential USTM, whose decoding is very
similar to the one shown above and thus can be done very fast.
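The decoupled decoding is simple to verify numerically. The sketch below (an illustration with QPSK symbols and arbitrary parameters, not code from the thesis) transmits one Alamouti block over a known Rayleigh channel and recovers x and y with the two separate statistics given above.

```python
import numpy as np

rng = np.random.default_rng(5)
N, rho = 2, 50.0
qpsk = np.exp(2j * np.pi * (np.arange(4) + 0.5) / 4) / np.sqrt(2)  # so |x|^2 + |y|^2 = 1

x, y = qpsk[1], qpsk[3]
S = np.array([[x, y], [-np.conj(y), np.conj(x)]])        # rows = time, columns = antennas

H = (rng.normal(0, np.sqrt(0.5), (2, N)) + 1j * rng.normal(0, np.sqrt(0.5), (2, N)))
V = (rng.normal(0, np.sqrt(0.5), (2, N)) + 1j * rng.normal(0, np.sqrt(0.5), (2, N)))
X = np.sqrt(rho) * S @ H + V                             # received 2 x N block

# Matched-filter statistics: the decoding of x and y decouples.
x_stat = sum(X[0, i] * np.conj(H[0, i]) + np.conj(X[1, i]) * H[1, i] for i in range(N))
y_stat = sum(X[0, i] * np.conj(H[1, i]) - np.conj(X[1, i]) * H[0, i] for i in range(N))

gain = np.sum(np.abs(H) ** 2)
x_hat = qpsk[np.argmin(np.abs(x_stat / np.sqrt(rho) - qpsk * gain))]
y_hat = qpsk[np.argmin(np.abs(y_stat / np.sqrt(rho) - qpsk * gain))]
print("sent   :", x, y)
print("decoded:", x_hat, y_hat)
```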
We now turn our attention to the performance of this space-time code. For any two distinct codewords built from (x_1, y_1) and (x_2, y_2), det(S_1 − S_2) = |x_1 − x_2|² + |y_1 − y_2|². The diversity product of the code is therefore determined by min_{(x_1,y_1) ≠ (x_2,y_2)} ( |x_1 − x_2|² + |y_1 − y_2|² ). If x_i and y_i are chosen from the P-PSK signal set {1, e^{2πj/P}, ..., e^{2πj(P−1)/P}}, it is shown in [SHHS01] that the diversity product of the code is sin(π/P)/√2.
Because of its great features, much attention has been dedicated to finding meth-
ods to generalize Alamouti's scheme to higher dimensions. A real orthogonal design of size n is an n × n orthogonal matrix whose entries are the indeterminates ±x_1, ..., ±x_n. The existence problem for real orthogonal designs is known as the
Hurwitz-Radon problem [GS79] and has been solved by Radon; real orthogonal designs exist only for n = 2, 4, and 8. The authors of [TJC99] generalized the complex orthogonal design problem to the non-square case also. They proved the existence of complex orthogonal designs with rate no more than 1/2 and gave a 4 × 3 complex orthogonal design with rate 3/4. In [WX03], Wang and Xia proved that the rate of complex orthogonal designs is upper-bounded by 3/4 for systems with more than two transmit antennas and that the rate of generalized orthogonal designs (non-square case) is upper-bounded by 4/5. The restricted (generalized) complex orthogonal
design is also discussed in [WX03].
2.8 Sphere Decoding and Complex Sphere Decoding
For the space-time codes discussed above, maximum-likelihood decoding is required. A natural way to decode is exhaustive search, which finds the optimal decoding signal by searching over all possible signals. However, this algorithm has a complexity that is exponential in both the transmission rate and the dimension. Therefore, it may take a long time and cannot fulfill the real-time requirement, especially when the rate and dimension are high. There are other decoding algorithms, such as
nulling-and-canceling [Fos96], whose complexity is polynomial in rate and dimension,
however, they only provide approximate solutions. In this section, an algorithm
called sphere decoding is introduced which not only provides the exact ML solutions
for many communication systems but also has a polynomial complexity for almost all
rates.
Sphere decoding algorithm was first proposed to find vectors of shortest length in a
given lattice [Poh81], and has been tailored to solve the so-called integer least-square
problem:
min_{s ∈ Z^n} ‖ x − Hs ‖²_F,

where x ∈ R^{m×1}, H ∈ R^{m×n}, and Z^n denotes the n-dimensional integer lattice, i.e.,
s is an n-dimensional vector with integer entries. The geometric interpretation of
the integer least-square problem is this: as the entries of s run over Z, s spans the
“rectangular” n-dimensional lattice. For any H, which we call the lattice-generating
matrix, Hs spans a “skewed” lattice. Therefore, given the skewed lattice and a vector
x, the integer least-square problem is to find the “closest” lattice point (in Euclidean
sense) to x. We can generalize this problem by making s ∈ S n where S is any discrete
34
set.
Many communication decoding problems can be formulated into this problem
with little modification since many digital communication problems have a lattice
formulation [VB93, HH02b, DCB00]. The system equation is often
x = Hs + v,
where s ∈ Rn×1 is the transmit signal, x ∈ Rm×1 is the received signal, H ∈ Rm×n is
the channel matrix, and v ∈ R^{m×1} is the channel noise. Note that here all the matrices and vectors are real.

To obtain the exact solution to this problem, as mentioned before, an obvious method
is exhaustive search, which searches over all s ∈ S n and finds the one with the
minimum kx − Hsk2F . However, this method is not feasible when the number of
possible signals is infinite. Even when the cardinality of the lattice is finite, the
complexity of exhaustive search is usually very high especially when the cardinality
of the lattice is huge. It often increases exponentially with the number of antennas
and transmission rate. Sphere decoding gives the exact solution to the problem with
a much lower complexity. In [HVa], it is shown that sphere decoding has an average
complexity that is cubic in the transmission rate and number of antennas for almost
all practical SNRs and rates. It is a convenient fast ML decoding algorithm.
The idea of sphere decoding is to search over only lattice points that lie in a certain
sphere of radius d around the given vector x. Clearly, the closest lattice point inside
the sphere is the closest point in the whole lattice. The main problem is how to find
the vectors in the sphere.
A lattice point Hs is in a sphere of radius d around x if and only if

‖ x − Hs ‖²_F ≤ d².    (2.15)

Consider the Cholesky or QR factorization of H: H = Q [ R ; 0_{(m−n)×n} ], where R is an n × n upper triangular matrix with positive diagonal entries and Q is an m × m orthogonal matrix.² Let y denote the vector of the first n entries of Q*x, and let d_n² be d² minus the squared norm of the remaining m − n entries of Q*x. Then (2.15) is equivalent to

‖ y − R s ‖²_F ≤ d_n²,    (2.16)
where y_i indicates the i-th entry of y. Since R is upper triangular, the n-th row of the vector on the left-hand side depends only on s_n, the (n − 1)-th row depends only on s_n and s_{n−1}, and so on. Looking at only the n-th row of (2.16), a necessary condition for (2.16) to hold is (y_n − r_{n,n} s_n)² ≤ d_n², which is equivalent to

(−d_n + y_n)/r_{n,n} ≤ s_n ≤ (d_n + y_n)/r_{n,n}.    (2.17)

Therefore, the interval for s_n is obtained. For each s_n in the interval, define d_{n−1}² = d_n² − (y_n − r_{n,n} s_n)². A stronger necessary condition can be found by looking at the
² Here we only discuss the n ≤ m case. The n > m case can be seen in [HVb].
(n − 1)-th row of (2.16): |y_{n−1} − r_{n−1,n−1} s_{n−1} − r_{n−1,n} s_n|² ≤ d_{n−1}². Therefore, for each s_n in (2.17), we get an interval for s_{n−1}:

(−d_{n−1} + y_{n−1} − r_{n−1,n} s_n)/r_{n−1,n−1} ≤ s_{n−1} ≤ (d_{n−1} + y_{n−1} − r_{n−1,n} s_n)/r_{n−1,n−1}.    (2.18)
Continue with this procedure until the interval of s_1 for all possible values of s_n, ..., s_2 is obtained. Thus, all possible points in the sphere (2.15) are found. A flow chart of
sphere decoding can be found in [DAML00] and pseudo code can be found in [HVa].
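The search described above can also be written as a short recursive routine. The following sketch is a simplified illustration of the idea rather than an optimized decoder; it drops the constant term contributed by the orthogonal complement of H (which only offsets the radius) and assumes the initial radius d is large enough to contain at least one lattice point.

```python
import numpy as np

def sphere_decode(x, H, d):
    """Depth-first search over integer vectors s with ||y - R s||^2 <= d^2,
    returning the closest one (a simplified illustration; assumes n <= m)."""
    m, n = H.shape
    Q, R = np.linalg.qr(H)                          # reduced QR: H = Q R
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    R, Q = np.diag(signs) @ R, Q @ np.diag(signs)   # force positive diagonal of R
    y = Q.T @ x
    best = {"s": None, "cost": d * d}
    s = np.zeros(n, dtype=int)

    def search(k, cost_above):
        # cost_above: squared residual accumulated from rows k+1, ..., n-1
        resid = y[k] - R[k, k + 1:] @ s[k + 1:]
        radius = np.sqrt(best["cost"] - cost_above)
        lo = int(np.ceil((resid - radius) / R[k, k]))
        hi = int(np.floor((resid + radius) / R[k, k]))
        for sk in range(lo, hi + 1):                # interval as in (2.17)/(2.18)
            s[k] = sk
            cost = cost_above + (resid - R[k, k] * sk) ** 2
            if cost > best["cost"]:
                continue
            if k == 0:
                best["s"], best["cost"] = s.copy(), cost
            else:
                search(k - 1, cost)

    search(n - 1, 0.0)
    return best["s"], best["cost"]

rng = np.random.default_rng(6)
H = rng.normal(size=(6, 4))
s_true = rng.integers(-3, 4, size=4)
x = H @ s_true + 0.05 * rng.normal(size=6)
print("true   :", s_true)
print("decoded:", sphere_decode(x, H, d=2.0)[0])
```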
The selection of the search radius in sphere decoding is crucial to the complexity.
If the radius is too large, there are too many points in the sphere, and the complexity
is high. If the selected radius is too small, it is very probable that there exists no
point in the sphere. In [VB93], it is proposed to use the covering radius of the
lattice. The covering radius is defined as the radius of the spheres centered at the
lattice points that cover the whole space in the most economical way. However,
the calculation of the covering radius is normally difficult. In [HVa], the authors
proposed to choose the initial radius such that the probability of having the correct
point in the sphere is 0.9, then increase the radius gradually if there is no point in
the sphere. In our simulations, we use this method. Other radius-choosing methods
can be found in [DCB00, DAML00]. There are also publications on methods that can further reduce the complexity of sphere decoding; interested readers can refer to [GH03, AVZ02, Art04b, Art04a].
The sphere decoding algorithm described above applies to real systems when s
is chosen from a real lattice. Therefore, the algorithm can be applied to complex
systems when the system equation can be rewritten as linear equations of unknowns
with twice the dimension by separating the real and imaginary parts of x, H and s.
Fortunately, this is true for many space-time coding systems ([HH02b, HH02a]). In
particular, real sphere decoding is used in the decoding of our Cayley unitary space-
time codes in Chapter 3, the Sp(2) differential unitary space-time codes in Chapter
5, and also the distributed space-time codes in Chapter 7.
Based on real sphere decoding, Hochwald and ten Brink generalized it to the complex case, which is more convenient for wireless communication systems using PSK signals [HtB03]. The main idea is as follows.
The procedure follows all the steps of real sphere decoding. First, use the Cholesky or QR factorization of H as before; the condition obtained from the last row is |y_n − r_{n,n} s_n| ≤ r. This inequality limits the search to points of the constellation contained in a complex disk of radius r/r_{n,n} centered at y_n/r_{n,n}.
disk of radius r/rn,n centered at yn /rn,n . These points are easily found when the
constellation forms a complex circle (as in PSK).
Let s_n = r_c e^{jθ_n}, where r_c is a positive constant and θ_n ∈ {0, 2π/P, ..., 2π(P − 1)/P}. That is, s_n is a P-PSK signal. Denote y_n/r_{n,n} as r̂_c e^{jθ̂_n} and define d = r/r_{n,n}. Then the condition |s_n − y_n/r_{n,n}| ≤ d becomes

r_c² + r̂_c² − 2 r_c r̂_c cos(θ_n − θ̂_n) ≤ d²,    (2.19)

which yields

cos(θ_n − θ̂_n) ≥ (1/(2 r_c r̂_c)) ( r_c² + r̂_c² − d² ).

If the right-hand side of the above is greater than 1, the search disk does not contain any point of the PSK constellation. If the value is less than −1, then the search disk includes the entire constellation. Otherwise, the range of the possible angle for s_n is

θ̂_n − cos^{−1}[ (r_c² + r̂_c² − d²) / (2 r_c r̂_c) ] ≤ θ_n ≤ θ̂_n + cos^{−1}[ (r_c² + r̂_c² − d²) / (2 r_c r̂_c) ],    (2.20)

and the admissible values of s_n are the P-PSK points whose phases fall in this interval.
[Figure 2.3: Geometry of complex sphere decoding with PSK signals: the P-PSK points lie on the solid circle of radius r_c, and the search disk of radius d centered at y_n/r_{n,n} is bounded by the dashed circle in the complex plane.]
This can be easily seen in Figure 2.3. The sphere given in (2.19) is the area
bounded by the dashed circle. The values that sn can take spread on the solid circle
uniformly. Note that

(1/(2 r_c r̂_c)) ( r_c² + r̂_c² − d² ) > 1  ⇔  |r_c − r̂_c| > d  ⇔  r_c > r̂_c + d or r̂_c > r_c + d.
If rc > r̂c + d, then the dashed circle is inside the solid circle. If r̂c > rc + d, the two
circles are disjoint. Therefore, if either happens, there is no possible sn in the sphere
given in (2.19). If (1/(2 r_c r̂_c)) ( r_c² + r̂_c² − d² ) < −1, then the solid circle is contained in
the dashed circle, which means that all the PSK signals are in the sphere. Otherwise,
the solid circle has an arc that is contained in the sphere, and possible angles are
given by (2.20).
Therefore, an interval for sn ’s angle, or equivalently, the set of values that sn can
take on is obtained. For any chosen sn in the set, the set of possible values of sn−1
can be found by similar analysis. By continuing with this procedure till the set of
possible values of s1 is found, all points in the complex disk are obtained.
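The admissible PSK phases at one level can be computed directly from the interval (2.20). The sketch below is a small illustration under the same notation (d denotes the disk radius r/r_{n,n}); it returns the indices of the P-PSK points inside the search disk, handling the empty, full, and arc cases.

```python
import numpy as np

def psk_points_in_disk(center, d, P, rc=1.0):
    """Indices k of the P-PSK points rc*exp(2*pi*j*k/P) with |s - center| <= d."""
    rhat = abs(center)
    theta_hat = np.angle(center)
    c = (rc ** 2 + rhat ** 2 - d ** 2) / (2 * rc * rhat)
    if c > 1:                      # disk misses the PSK circle entirely
        return []
    if c < -1:                     # disk contains the whole PSK circle
        return list(range(P))
    half = np.arccos(c)            # admissible arc: |theta - theta_hat| <= half
    ks = []
    for k in range(P):
        theta = 2 * np.pi * k / P
        diff = np.angle(np.exp(1j * (theta - theta_hat)))   # wrap to (-pi, pi]
        if abs(diff) <= half:
            ks.append(k)
    return ks

# Example: 8-PSK of radius 1, search disk of radius 0.7 centered at 0.9*exp(j*pi/8).
print(psk_points_in_disk(0.9 * np.exp(1j * np.pi / 8), d=0.7, P=8))
```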
2.9 Discussion
The results in Sections 2.2-2.6 are based on the assumption that the fading coeffi-
cients between pairs of transmit and receive antennas are frequency non-selective
and independent of each other. In this section, situations in which these assumptions
are not valid are discussed.
In practice, channels may be correlated especially when the antennas are not suffi-
ciently separated. The correlated fading models are proposed in [ECS+ 98, SFGK00].
The effects of fading correlation and channel degeneration (known as the keyhole
effect) on the MIMO channel capacity have been addressed in [SFGK00, CTK02,
CFG02], in which it is shown that channel correlation and degeneration actually de-
grade the capacity of multiple-antenna systems. Channel correlation can be mitigated
using precoding, equalization and other schemes. For more on these issues, refer to
[ZG03, KS04, SS03, HS02a, PL03].
In wideband systems,³ transmitted signals experience frequency-selective fading,
which causes inter-symbol interference (ISI). It is proved in [GL00] that the cod-
ing gain of the system is reduced, and it is reported that at high SNR, there ex-
ists an irreducible error rate floor. A conventional way to mitigate ISI is to use
an equalizer at the receiver ([CC99, AD01]). Equalizers mitigate ISI and convert
frequency-selective channels to flat-fading channels. Then, space-time codes designed
for flat-fading channels can be applied ([LGZM01]). However, this approach results
³ If the transmitted signal bandwidth is greater than the channel coherence bandwidth, the communication system is called a wideband system.
in high complexity at the receiver. An alternative approach is to use orthogonal
frequency division multiplexing (OFDM) modulation. The idea of OFDM can be
found in [BS99]. In OFDM, the entire channel is divided into many narrow parallel
sub-channels with orthogonal frequencies. In every sub-channel, the fading can be
regarded as frequency non-selective. There are many papers on space-time coded
OFDM, for example [ATNS98, LW00, LSA98, BGP00].
Combinations of space-time coding with other error-correcting codes and modulations have also been analyzed ([BBH00, FVY01, BD01, GL02, TC01, JS03]). Other research investigated the combination of space-time coding with convolutional codes and turbo codes ([Ari00, SG01, SD01, LFT01, LLC02]). These combined schemes increase the performance of the system; however, the decoding complexity is very high and the performance analysis is very difficult.
2.10 Contributions of This Thesis

The contributions of this thesis are mainly on the design of space-time codes for multiple-antenna systems and their implementation in wireless networks. They can be divided into three parts.
In part one, unitary space-time codes are designed, using the Cayley transform, for systems with no channel information at either the transmitter or the receiver. The Cayley transform provides a one-to-one mapping from the space of (skew-)Hermitian matrices to the space of unitary matrices. Based on the linearity of the space of Hermitian matrices, the transmitted data is first broken into sub-streams α_1, ..., α_Q, and, ignoring the data dependence of the additive noise, α_1, ..., α_Q appear linearly at the receiver. Therefore, linear decoding algorithms such as sphere decoding and nulling-and-canceling can be used with polynomial complexity. Our Cayley codes have a structure similar to that of training-based schemes under certain transformations.
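The mapping at the heart of this construction is easy to state concretely. The sketch below is a generic numerical illustration (random Hermitian basis matrices; in the thesis these are optimized, and the exact sign convention of the transform may differ): it forms a Hermitian matrix from real data symbols α_q and applies one common form of the Cayley transform to obtain a unitary matrix.

```python
import numpy as np

rng = np.random.default_rng(7)
M, Q = 2, 3

def random_hermitian(m):
    X = rng.normal(size=(m, m)) + 1j * rng.normal(size=(m, m))
    return (X + X.conj().T) / 2

# Fixed Hermitian basis matrices A_1, ..., A_Q (placeholders; in the thesis they are designed).
basis = [random_hermitian(M) for _ in range(Q)]
alpha = rng.normal(size=Q)                  # real data symbols alpha_1, ..., alpha_Q

A = sum(a * B for a, B in zip(alpha, basis))                   # Hermitian "data" matrix
V = np.linalg.solve(np.eye(M) + 1j * A, np.eye(M) - 1j * A)    # Cayley transform (I+jA)^{-1}(I-jA)

print("A Hermitian?", np.allclose(A, A.conj().T))
print("V unitary?  ", np.allclose(V.conj().T @ V, np.eye(M)))
```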
Cayley codes do not require channel knowledge at either the transmitter or the
receiver, are simple to encode and decode, and can be applied to any combination of
transmit and receive antennas. They are designed with a probabilistic criterion: they
maximize the expected log-determinant of differences between matrix pairs.
The recipe for designing Cayley unitary space-time codes for any combination of
transmit and receive antennas and coherence intervals is given, and also simulation ex-
amples are presented, which compare our Cayley codes with optimized training-based
space-time codes and uncoded training-based schemes for different system settings.
Our simulation results are preliminary. They indicate that Cayley codes generated
with this recipe only slightly underperform optimized training-based schemes using
orthogonal designs and/or linear dispersion codes. However, they are clearly superior
to uncoded training-based space-time schemes. Further optimization on basis ma-
trices of Cayley codes is necessary for a complete comparison of Cayley codes with
training-based schemes.
The second part of our contributions is the design of unitary space-time codes
based on Lie groups for the differential transmission scheme. The work can be re-
garded as extensions of [HK00]. In Chapter 5, we work on the symplectic group
Sp(n) which has dimension n(2n + 1) and rank n. We first give a parameterization
of Sp(n) and then design differential unitary space-time codes which are subsets of
Sp(2) by sampling the parameters appropriately. Necessary and sufficient conditions
for full diversity of the codes are given. The designed constellations are suitable for
systems with four transmit antennas and any number of receive antennas. The spe-
cial symplectic structure of the codes lends itself to linear-algebraic decoding,
such as sphere decoding. Simulation results show that they have better performance
than the 2 × 2 and 4 × 4 complex orthogonal designs, group-based diagonal codes,
and differential Cayley codes at high SNR. Although they slightly underperform the
K_{1,1,−1} finite-group code and the carefully designed non-group code, they do not need the exhaustive search (of exponentially growing size) required by such codes and therefore are far superior in terms of decoding complexity.
In Chapter 6, we keep working on the idea of differential unitary space-time code
design based on Lie groups with rank 2 and analyze the special unitary Lie group
SU (3), which has dimension 8 and rank 2. The group is not fixed-point-free, but
we describe a method to design fully-diverse codes which are subsets of the group.
Furthermore, motivated by the structure of the SU (3) codes, we propose a simpler
code called the AB code. Both codes are suitable for systems with three transmit
antennas. Necessary conditions for full diversity of both codes are given and our
conjecture is that they are also sufficient conditions. The codes have simple formulas
from which their diversity products can be calculated in a fast way. A fast maximum-
likelihood decoding algorithm for AB codes based on complex sphere decoding is
given, by which decoding can be done with a complexity that is polynomial in the
rate and dimension. Simulation results show that SU (3) codes and AB codes perform
as well as finite group-based codes at low rates. At high rates, performance of SU (3)
and AB codes is much better than that of finite group-based codes and about the same
as that of the carefully designed non-group codes. The AB codes are, in addition,
where R is the number of relay nodes and P is the average total power consumed in
the network. This result indicates that when T ≥ R and the average total transmit
power is high, relay networks achieve almost the same diversity as multiple-antenna
systems with R transmit antennas. This result is also supported by simulations. We
further show that with R = T , the leading order term in the PEP of wireless re-
P R
4 R
lay networks behaves as 21 | det(Si1−Sj )|2 8 log
P
, which compared to 1 1
2 | det(Si −Sj )| 2 P
,
the PEP of multiple-antenna systems, shows the loss of performance due to the facts
that space-time codes relay networks are implemented distributively and the relay
nodes have no knowledge of the transmitted symbols. We also observe that the high
SNR coding gain, | det(Si − Sj )|−2 , of relay networks is the same as what arises in
multiple-antenna systems. The same is true at low SNR where a trace condition
comes up.
44
3.1 Introduction
high data rates on wireless channels with multi-path fading [Fos96, Tel99]. Many
proposed schemes that achieve these high rates require the propagation environment
or channel to be known to the receiver (see, e.g., [Fos96, Ala98, TSC98, HH02b]
and the references therein). In practice, knowledge of the channel is often obtained
via training: known signals are periodically transmitted for the receiver to learn the
channel, and the channel parameters are tracked in between the transmission of the
training signals. However, it is not always feasible or advantageous to use training-
based schemes, especially when many antennas are used or either end of the link is
moving so fast that the channel is changing very rapidly [Mar99, HH03].
Hence, there is much interest in space-time transmission schemes that do not re-
quire either the transmitter or receiver to know the channel. Information-theoretic
calculations with a multi-antenna channel that changes in a block-fading manner first
rates. The constellation proposed in [MHH02], on the other hand, while, theoretically
having good performance, has to date no tractable decoding algorithm. Recently, a
USTM design method based on the exponential map has been proposed in [GKB02].
In USTM, the first M columns of the T × T unitary matrices are chosen to be the
transmitted signal. Therefore, let us first look at the space of T × T unitary matrices
which is referred as the Stiefel manifold. It is well-known that this manifold is highly
non-linear and non-convex. Note that an arbitrary complex T × T matrix has 2T 2
real parameters, but for a unitary one, there are T constraints to force each column
to have unit norm and another 2 × T (T2−1) constraints to make the T columns pairwise
of unitary matrices. There are some parameterization methods in existence but all
of them suffer from disadvantages for use in unitary space-time code design. We now
briefly discuss these. The discussion is based on [HH02a].
The first parameterization method is by Givens rotations. A unitary matrix Φ
can encode the data onto the angles of rotations and also the diagonal phases of D.
But it is not a practical method since neither is the parameterization one-to-one (for
example, one can re-order Givens rotations) nor does systematic decoding appears to
be possible.
are Householder matrices [GL96]. This method is also not encouraging to us because
we do not know how to encode and decode data onto Householder matrices in any
efficient manner.
And also, unitary matrices can be parameterized with the matrix exponential
Φ = eiA . When A is T × T Hermitian, Φ is unitary. The exponential map also has
the difficulty of not being one-to-one. This can be overcome by imposing constraints
0 6 A < 2πI, but the constraints are not linear although convex. We do not know
Cayley codes for differential USTM [HH02a]. As will be shown later, this extension is
far from trivial. Nonetheless, the codes designed here inherit many of the properties
of Cayley differential codes. In particular, they:
1. are very simple to encode: the data is broken into substreams used to parame-
47
terize the unitary matrices,
2. can be used for systems with any number of transmit and receive antennas,
The work in this chapter has been published in IEEE Transactions on Signal
Processing Special Issue on MIMO Communications [JH03e], the Proceeding of 2002
Cayley transform was proposed in [HH02a] to design codes for differential unitary
space-time modulation whereby both good performance and simple encoding and
Φ = (I + Y )−1 (I − Y ),
matrix Y = iA:
1 − ia
v= ,
1 + ia
which maps the real line to the unit circle. Notice that no finite point on the real line
can be mapped to the point,−1, on the unit circle.
The most prominent advantage of Cayley transform is that it maps the compli-
cated space of unitary matrices to the space of Hermitian matrices, which is linear.
It can be easily proved that
= I.
The second equation is true because I − iA, I + iA, (I − iA)−1 and (I + iA)−1 all
iA = (I + Φ)−1 (I − Φ)
provided that (I + Φ)−1 exists. This shows that Cayley transform and its inverse
transform coincide. Thus, Cayley transform is one-to-one. It is not an onto map
49
because those unitary matrices with eigenvalues at −1 have no inverse images. Recall
that the space of Hermitian or skew-Hermitian matrices has dimension T 2 which
easily invertible.
And also, it is proved in [HH02a] that a set of unitary matrices is fully diverse if
and only if the set of their Hermitian inverse Cayley transforms is fully diverse. This
suggests that a set of unitary matrices with promising performance can be obtained
that the generalization is far from trivial since the non-squareness of the matrices
causes a lot of problems in the code design.
Because Cayley transform maps the nonlinear Stiefel manifold to the linear space
(over the reals) of Hermitian matrices (and vice-versa), it is convenient and most
straightforward to encode data linearly onto Hermitian matrices and then apply Cay-
ley transform to get unitary matrices.
We call a set of T × M unitary matrices a Cayley unitary space-time code if any
50
element in the set can be written as
IM
S = (IT + iA)−1 (IT − iA) (3.2)
0
Q
X
A= αq A q , (3.3)
q=1
where α1 , α2 , ..., αQ are real scalars (chosen from a set A with r possible values) and
A1 , A2 , ..., AQ are fixed T × T complex Hermitian matrices.
The code is completely determined by the set of matrices {A1 , A2 , ..., AQ }, which
can be thought of as Hermitian basis matrices. Each individual codeword, on the
other hand, is determined by our choice of the scalars α1 , α2 , ..., αQ whose values are
in the set Ar (the subscript ’r’ represents the cardinality of the set). Since each of
the Q real coefficients may take on r possible values and the code occupies T channel
uses, the transmission rate is R = (Q/T ) log2 r. We defer the discussions on how to
design Aq ’s, Q, and the set Ar to the later part of this chapter and concentrate on
how to decode α1 , α2 , ..., αQ at the receiver first.
Similar to differential Cayley codes, our Cayley unitary space-time codes also have
the good property of linear decoding, which means that the receiver can be made to
form a system of linear equations in the real scalars α1 , α2 , ..., αQ . First, it is useful
to see what our codes and their ML decoding look like.
51
Partition the T × T matrix A as
A11 A12
,
A21 A22
Observe that
= (I + iA)−1 (I − iA)
= 2(I + iA)−1 − I
−1
IM + iA11 iA12
= 2 −I
iA∗12 IT −M + iA22
−1 −1 ∗ −1 −1 −1
2[I − (I + iA11 ) A12 ∆2 A12 ](I + iA11 ) − I −2i(I + iA11 ) A12 ∆2
=
−1 ∗ −1 −1
−2i∆2 A12 (I + iA11 ) 2∆2 − I
where ∆2 = I + iA22 + A∗12 (I + iA11 )−1 A12 is the Schur complement of I + iA11 in
I + A.
Therefore, from (3.2),
2[I − (I + iA11 )
−1
A12 ∆−1 ∗
2 A12 ](I
+ iA11 ) −1
−I
S= , (3.4)
−2i∆−1 ∗
2 A12 (I + iA11 )
−1
52
which is composed by the first M columns of Φ, and
−2i(I + iA11 )
−1
A12 ∆−1
2
S⊥ =
2∆−1
2 −I
X1
(T − M ) × N block X2 as X = , the second form of the ML decoder in (2.7)
X2
reduces to
2
arg min
[−2iX1∗ (I + iA11 )−1 A12 + X2∗ (2 − ∆2 )]∆−1
2
.
F
(3.5)
{αq }
The reason for choosing the second form of the ML decoding, as opposed to the first
one, is that we prefer to minimize, rather than maximize the Frobenius norm. In
fact, we shall presently see in the following that a simple approximation leads us
the system equation at the receiver is not linear. The formula looks intractable
because it has matrix inverses as well as the Schur complement ∆2 . Adopting the
approach of [HH02a] by ignoring the covariance of the additive noise term ∆−1
2 , we
obtain
2
arg min
2X2∗ − X2∗ ∆2 − 2iX1∗ (I + iA11 )−1 A12
F , (3.6)
{αq }
and
Some algebra shows that the above decoding formula (3.6) reduces to
α̂lin = arg min kX2∗ − X2∗ B ∗ B − 2iX1∗ B + iX2∗ B ∗ A11 B − iX2∗ A22 k2F , (3.10)
{αq }
which is now quadratic in entries of A. Fast decoding methods such as sphere decod-
ing and nulling-and-canceling can be used which have polynomial complexity as in
BLAST [Fos96].
We call (3.10) the “linearized” decoding because it is equivalent to the decoding
1
With this condition, the number of degrees of freedom in A is T 2 − 2T M + 2M 2 , which is greater
than 2T M − M 2 , the number of degrees of freedom in an arbitrary T × M unitary matrix, when
T ≥ 3M .
54
of a system whose system equation is linear in the unknowns αq s. For a wide range
of rates and SNR, (3.10) can be solved exactly in roughly O(Q3 ) computations us-
ing sphere decoding [FP85, DCB00]. Furthermore, simulation results show that the
penalty for using (3.10) instead of the exact ML decoding is small, especially when
weighed against the complexity of the exact ML decoding. To facilitate the presenta-
tion of the sphere decoding algorithm, the equivalent channel model in matrices are
From (3.8), A12 = A∗21 is fully determined by A11 . Therefore, the degrees of freedoms
in A are all in matrices A11 and A22 . The encoding formula (3.3) of A can thus be
modified to the following encoding formulas of A11 and A22 :
Q Q
X X
A11 = αq A11,q and A22 = αq A22,q , (3.11)
q=1 q=1
where Q is the number of possible A11,q s and A22,q s, α1 , α2 , ...αQ are real scalars chosen
from the set Ar , and A11,1 , A11,2 , ..., A11,Q and A22,1 , A22,2 , ..., A22,Q are fixed M × M
and (T − M ) × (T − M ) complex Hermitian matrices.2 The matrix A is therefore
constructed as
A11 (I + iA11 )B
A =
B ∗ (I − iA11 ) A22
PQ PQ
αq A11,q
q=1 (I + i q=1 αq A11,q )B
= P PQ
B ∗ (I − i Qq=1 α q A 11,q ) q=1 α q A 22,q
2
Actually, in our design , A11 and A22 can have different numbers of degrees of freedom, Q1 and
Q2 , and the P
coefficients of the twoPbasis sets can have non-identical sample spaces. That is, we can
have A11 = Q p=1 αp A11,p , A22 =
1 Q2
q=1 βq A22,q where αi ∈ Ar1 and βi ∈ Ar2 . However, to simplify
the design problem, here we just set Q1 = Q2 and r1 = r2 .
55
Q
X A11,q iA11,q B 0 B
= αq + .
q=1 −iB ∗ A11,q A22,q B∗ 0
Q Q
2
X X
∗ ∗ ∗ ∗ ∗ ∗ ∗
arg min
X2 − X2 B B − 2iX1 B + i αq X2 B A11,q B − i αq X2 A22,q
. (3.12)
{αq }
q=1 q=1 F
Define
for q = 1, 2, ..., Q. By decomposing the complex matrices C and Jq into their real and
imaginary parts, the decoding formula (3.12) can be further rewritten as
2
α I
1 T −M
CR J1,R · · · JQ,R ..
arg min
− .
,
{αq }
CI
J 1,I · · · J Q,I
αQ IT −M
F
where CR , CI are the real and imaginary parts of C and Ji,R , Ji,I are the real and
imaginary parts of Ji . Also, denoting by CR,j , CI,j , Ji,R,j , Ji,I,j the j-th columns of
CR , CI , Ji,R , Ji,I for j = 1, 2, ..., (T − M ), and writing matrices in the above formula
t t t t
t
where R is the 2N (T −M )-dimensional column vector CR,1 CI,1 · · · CR,T −M CI,T −M
J1,R,1 J2,R,1 ··· JQ,R,1
J1,I,1
J2,I,1 ··· JQ,I,1
.. .. .. ..
.
. . . . (3.15)
J1,R,T −M J2,R,T −M · · · JQ,R,T −M
J1,I,T −M J2,I,T −M · · · JQ,I,T −M
R = Hα + W, (3.16)
within a sphere of a certain radius. Sphere decoding has the important advantage
over nulling-and-canceling that it computes the exact solution. Its worst case behavior
is exponential in Q, but its average behavior is comparable to nulling-and-canceling.
When the number of transmit antennas and the rate are small, exact ML decoding
3
In general, the covariance of the noise is dependent on the transmitted signal. However, in
ignoring ∆−1
2 in (3.6), we have ignored this signal dependence.
57
using exhaustive search is possible. However, a search over all possible α1 , ..., αQ may
be impractical for large T and R. Fortunately, the performance penalty for the lin-
earized Ml decoding given in (3.10) is small, especially weighed against the complexity
of exact ML decoding using exhaustive search.
polynomial, it is important that the number of linear equations resulting from (3.10)
be at least as large as the number of unknowns. (3.16) suggests that there are
2N (T −M ) real equations and Q real unknowns. Hence we may impose the constraint
Q ≤ 2N (T − M ).
This argument assumes that the matrix H has full column rank. There is, at first
glance, no reason to assume otherwise but it turns out to be false. Due to the
Hermitian constraint on A, not all the 2M (T − M ) equations are independent. A
careful analysis yields the following result.
Theorem 3.1 (Rank of H). The matrix given in (3.15) generally has rank
min (2N (T − M ) − N 2 , Q) if T − M ≥ N
rank(H) = . (3.17)
2
min ((T − M ) , Q) if T − M < N
equation C = iX2∗ (A22 − B ∗ A11 B) when A11 and A22 vary. Because A11 and A22
are not arbitrary matrices, the range space of C cannot have all the 2(T − M )N
dimensions as it appears. Now let’s study the number of constraints added on the
range space of C as A11 and A22 can only be Hermitian matrices. Since
= C(iX2 ),
to studying iX2∗ A22 , which is the same setting as that of differential USTM [HH02a].
In Theorem 1 of [HH02a], it is argued that for a generic choice of the basis matrices
A22,1 , · · · , A22,Q , the rank of H attains the upper bound. Therefore the same holds
here, and H attains the upper bound.
59
Theorem 3.1 shows that even though there are 2N (T − M ) equations in (3.16),
not all of them are independent. To have at least as many equations as unknowns,
is needed, or equivalently,
With the choice (3.7) or equivalently (3.8), the first block of the transmitted matrix
S in (3.4) can be simplified as
= [2I − 2B∆−1 ∗
2 B (I − iA11 ) − (I + iA11 )](I + iA11 )
−1
= [I − 2B∆−1 ∗ −1
2 B ](I − iA11 )(I + iA11 ) .
I− 2B∆−1
2 B
∗
−1
S= (I + iA11 ) (I − iA11 ).
−2i∆−1
2 B
∗
60
Our Cayley unitary space-time code and its unitary complement can be written as
I −iB IM −2iB∆−1
2
S⊥ =
S= U1 and , (3.19)
0 I −2i∆−1
2 B ∗
2∆−1
2 − IT −M
where
Q Q
X X
∗ ∗
∆2 = I + B B − i αq B A11,q B + i αq A22,q (3.20)
q=1 q=1
A22 , and the code occupies T channel uses, the transmission rate is
Q
R= log2 r. (3.21)
T
IM −iB
nel matrix H to get H 0 = U1 H. If we left multiply X, S and V by =
0 IT −M
IM iB
to get X 0 , S 0 and V 0 , the system equation (2.1) can be rewritten as
0 IT −M
r
ρT IM
X0 =
0 0
H +V .
M −2i∆−1 ∗
2 B
61
We can see that this is very similar to the equation of training-based schemes (2.5).
The only difference is in the noises. In (2.5), entries of the noise are independent white
Gaussian noise with zero-mean and unit-variance. Here, entries of V 0 are no longer
independent with unit-variance, although they still have zero-mean. The dependence
of the noises is beneficial to the performance since more information can be obtained.
The following theorem about the structure of S ⊥ is needed later in the optimization
−2iB∆−1
2 −2iB −1
S⊥ =
= ∆2
2∆−1
2 − I 2I − ∆ 2
and
−2i∆−1
1 A12 (I + iA22 ) −1
∆−1
1 0 −2iA12 (I + iA22 )
−1
S⊥ =
= .
2∆−1
2 −I 0 ∆−1
2 2I − ∆2
⊥ ⊥
− Ŝ
S
−1 −1
∆1 0 −2iA12 (I + iA22 ) ∆1 0 −2iB −1
= ∆ˆ2 − ˆ
∆
2
0 ∆2−1
2I − ∆2 0 ∆2 ˆ
2I − ∆2
−1 −1 ˆ
0 −2iA12 (I + iA22 ) ∆2 + 2i∆1 B −1
∆1 ˆ
= ∆ 2
0 ∆2−1 ˆ 2 − ∆2 ∆
2∆ ˆ 2 − 2∆2 + ∆2 ∆ ˆ2
−1 −1 −1
0 2iA12 (I + iA22 ) ∆2 − 2iA12 (I + iA22 ) ∆2 −1 ˆ
∆1 ˆ
= ∆ 2
0 ∆2−1 ˆ
2(∆2 − ∆2 )
−1 −1 ˆ
−2i∆1 A12 (I + iA22 ) 0 ∆2 − ∆2 −1
ˆ
= ∆ 2
0 2∆2−1 ˆ
∆2 − ∆ 2
−1
−2iB∆2 ˆ ˆ −1
= (∆2 − ∆ 2 )∆ 2
−1
2∆2
−iB −1 ˆ
= 2
∆2 (∆2 − ∆ 2 )∆ ˆ −1
2
I
Without the unitary constraint, this is an affine space since all the data is encoded
in ∆−1 ⊥
2 . So, in general, the space of S is the intersection of the linear affine space in
(3.23) and the Stiefel manifold S ⊥∗ S ⊥ = I. We can see from (3.22) or (3.23) that the
1 −(Sd∗ − Ŝd∗ )
S ⊥ − Ŝ ⊥ = √
. (3.24)
2 0
Note now that the dimension of the affine space is min(M, T − M ) which is smaller
than T − M when T > 2M . So, the affine space of S ⊥ of Cayley codes has a higher
dimension than that of training-based schemes when T > 2M .
Although the idea of Cayley unitary space-time codes has been introduced in (3.19),
we have not yet specified Q, nor have we explained how to choose the discrete
set Ar from which αq s are drawn, or the design of the Hermitian basis matrices
{A11,1 , A11,2 , ..., A11,Q } and {A22,1 , A22,2 , ..., A22,Q }. We now discuss these issues.
3.6.1 Design of Q
To make the constellation as rich as possible, we should make the number of degrees
We are left with how to design the discrete set Ar and how to choose {A11,1 , A11,2 ,
· · · , A11,Q } and {A22,1 , A22,2 , · · · , A22,Q }.
64
3.6.2 Design of Ar
As mentioned in Section 2.5, at high SNR, to achieve capacity in the sense of maxi-
mizing mutual information between X and S, Φ = (I +iA)−1 (I −iA) should assemble
samples from an isotropic random distribution. Since our data modulates the A ma-
trix (or equivalently A11 and A22 ), we need to find the distribution on A that yields
an isotropically distributed Φ.
As proved in [HH02a], the unitary matrix Φ is isotropically distributed if and only
if the Hermitian matrix A has the matrix Cauchy distribution
2 −T
2T (T − 1)! · · · 1! 1
p(A) = ,
π T (T +1)/2 det(I + A2 )T
1
p(a) = .
π(1 + a2 )
matrix A. Therefore (3.3) is used instead of (3.11). We want our code constellation
PQ
A = q=1 αq Aq to resemble samples from a Cauchy random matrix distribution.
1 − iα1 1−v
v= , and α1 = −i .
1 + iα1 1+v
1−v
α1 = −i = − tan(θ/2). (3.26)
1+v
For example, for r = 2, we have the set of points on unit circle V = {eiπ/2 , e−iπ/2 }.
From (3.26), the set of values for α1 is A2 = {−1, 1}. For the case of r = 4, we can
get by simple calculation that A4 = {−2.4142, −0.4142, 0.4142, 2.4142}. It can be
seen that the points rapidly spread themselves out as r increases, which reflects the
r Q = 2RT . (3.27)
To complete the code construction, it is crucial that the two sets of bases {A11,1 , A11,2 ,
· · · , A11,Q } and {A22,1 , A22,2 , · · · , A22,Q } are chosen appropriately, and we present a
criterion in this subsection.
If the rates being considered are reasonably small, the diversity product criterion
min0 | det(Φl −Φl0 )∗ (Φl −Φl0 )| is tractable. At high rates, however, it is not practical to
l6=l
66
pursue the full diversity criterion. There are two reasons for this: first, the criterion
becomes intractable because of the number of matrices involved and second, the
sets of basis matrices {A11,1 , A11,2 , ...A11,Q } and {A22,1 , A22,2 , ...A22,Q }, we define a
distance criterion for the resulting constellation of matrices V to be
1
ξ(V) = E log det(S ⊥ − S 0⊥ )∗ (S ⊥ − S 0⊥ ), (3.28)
T −M
where S is given by (3.19) and (3.20) and S 0 is given by the same formulas except
that the αq s in (3.20) are replace by αq0 s. The expectation is over all possible αq s and
αq0 s chosen uniformly from Ar such that (α1 , ..., αQ ) 6= (α10 , ..., αQ
0
). Remember that
S ⊥ denotes the T × (T − M ) unitary complement matrix of the T × M matrix S.
Let us first look at the difference between this criterion with that in [HH02a]. Here,
we use S ⊥ and S 0⊥ instead of S and S 0 themselves because the unitary complement
instead of the transmitted signal itself is used in the linearized ML decoding. This
criterion cannot be directly related to the diversity product as in the case of [HH02a],
but still, from the structure, it is a measure of the expected “distance” between
matrices S ⊥ and S 0⊥ . Thus, maximizing ξ(V) should be connected with lowering
average pairwise error probability. Hopefully, optimizing the expected “distance”
between the unitary complements S ⊥ and S 0⊥ instead of that between the unitary
signals S and S 0 themselves will obtain a better performance. And also, since the
constraints (3.7) is imposed to simplify ∆2 , which turns out to simplify S ⊥ as well,
the calculation of our criterion is much easier than the calculation of the one used in
[HH02a], which maximizes the expected “distance” between the unitary matrices Φ
67
and Φ0 . Therefore, the optimization problem is proposed to be
By (3.22), we can rewrite the optimization as a function of A11 , A22 and get the
simplified formula,
where
∆2 = I + B ∗ B − iB ∗ A11 B + iA22 ,
and
Q Q
X X
A11 = αq A11,q , A22 = αq A22,q ,
q=1 q=1
Q Q
X X
A011 = αq0 A11,q , A022 = αq0 A22,q .
q=1 q=1
When r is large, the discrete sets from which αq s, αq0 s are chosen from (Ar ) can
be replaced with independent scalar Cauchy distributions. And by noticing that the
sum of two independent Cauchy random variables is scaled-Cauchy, our criterion can
be simplified to
Entries of A11,q s and A22,q s in (3.30) are unconstrained other than that they must be
Hermitian matrices. However, we found that it is beneficial to constrain the Frobenius
norm of all the matrices in {A11,q } to be the same, which we denote by γ1 . This is
similarly the case for the matrices {A22,q }, whose Frobenius norm we denote by γ2 .
In fact, in our experience it is very important, for both the criterion function (3.30)
and the ultimate constellation performance, that the correct Frobenius norms of the
basis matrices be chosen. The gradients of the Frobenius norms γ1 and γ2 are given in
Section 3.9.2 and the gradient-ascent method is used for the optimization. The matrix
B is choosen as γ3 [IM , 0M ×(T −2M ) ] with γ3 close to 1 for the following two reasons.
Firstly, the optimization of B is too complicated to be done by the gradient-ascent
method. Secondly, as long as B is full rank, simulation shows that the Frobenius
norm of B and B itself do not have significant effects on the performance. This has
We now summarize the design method for Cayley unitary space-time codes with M
transmit antennas and N receive antennas, and target rate R.
is a soft limit for sphere decoding, we choose our Q that obeys the inequality
to keep the decoding complexity polynomial.
4. Choose {A11,q } and {A22,q } that solves the optimization problem (3.30). A
gradient-ascent method can be used. The computation of the gradients of the
criterion in (3.30) is presented in Section 3.9.1. At the end of each iteration,
gradient-ascent is used to optimize the Frobenius norms of the basis matrices
A11,1 , A11,2 , · · · , A11,Q and A22,1 , A22,2 , · · · , A22,Q . The computation of the gra-
dients is given in Section 3.9.2. Note first that the solution to (3.30) is highly
non-unique. Another solution can be obtained by simply reordering A11,q s and
A22,q s. And also, since the criterion function is neither linear nor convex in
the design variables A11,q and A22,q , there is no guarantee of obtaining a global
maximum. However, since the code design is performed off-line and only once,
we can use more sophisticated optimization techniques to get a better solution.
Simulation results show that the codes obtained by this method have good per-
In this section, we give examples of Cayley unitary space-time codes and the simulated
performance of the codes for various number of antennas and rates. The fading
coefficient between every transmit-and-receive antenna pair is modeled independently
as a complex Gaussian variable with zero-mean and unit-variance and is kept constant
for T channel uses. At each time, a zero-mean, unit-variance complex Gaussian
noise is added to the received signal at every receive antenna. Two error events
are demonstrated including block errors, which correspond to errors in decoding the
70
T × M matrices S1 , ..., SL , and bit errors, which correspond to errors in decoding
α1 , ..., αQ . The bits are allocated to each αq by a Gray code and therefore, a block
error may correspond to only a few bit errors. We first give an example to compare
the performance of the linearized ML, which is given by (3.10) with that of the true
ML, then performance comparisons of our codes with training-based methods are
given.
and r = 2. The number of signal matrices is 2RT = 64, for which the true ML is
feasible. The resulting bit error rate and block error rate curves for the linearized ML
are the line with circles and line with stars in Figure 3.1. The resulting bit error rate
and block error rate curves for the the true ML are the solid line and the dashed line
in the figure. We can see from Figure 3.1 that the performance loss for the linearized
0
T=4 M=2 N=1 R=1.5
10
−1
10
BER/BLER
−2
10
−4
10
10 12 14 16 18 20 22 24 26 28 30
SNR
Codes
In this subsection a few examples of Cayley codes for various multiple-antenna com-
munication systems are given and their performance are compared with that of
training-based codes.
As discussed in Chapter 2, a commonly used scheme for unknown channel multiple-
antenna communication systems is to obtain the channel information via training. It
is important and meaningful to compare our code with that of training-based codes.
Training-based schemes and the optimal way to do training are discussed in Section
2.4 . In our simulations of training-based schemes, the LMMSE estimation is used.
We set the training period Tτ as M and the training signal matrix Sτ as ρτ IM ,
which are optimal. For simplicity, we use equal-training-and-data-power by setting
√
ρd = ρτ = M , which is optimal if T = 2M . In most of the following simulations,
different space-time codes are used in the data transmission phase for different system
settings. Sphere decoding is used in decoding all the Cayley codes and the decoding
of the training-based codes is always ML, but the algorithm varies according to the
codes used.
Example of T = 4, M = 2, N = 2
The first example is for the case of two transmit and two receive antennas with
coherence interval T = 4. For training-based schemes, half of the coherence interval
is used for training. For the data transmission phase, we consider two different space-
time codes. The first one is the well-known orthogonal design in which the transmitted
allocated to each entry by Gray code. The second one is the LD code proposed in
[HH02b]:
4
X 1 1
Sd = (αq Aq + iβq Bq ), with αq , βq ∈ {− √ , √ },
q=1
2 2
where
1 1 0 0 1
√1
A1 = B 1 = √ , A2 = B 2 =
2 ,
2 0 1 1 0
1 1 0 0 1
, A4 = B4 = √12
A3 = B 3 = √ .
2 0 −1 −1 0
Clearly, the rate of the training-based LD code is also 2. For the Cayley code, from
(3.25), we choose Q = 4. To attain rate 2, r = 4 from (3.27). The Cayley code was
obtained by finding a local maximum to (3.31).
The performance curves are shown in Figure 3.2. The dashed line and dashed line
with plus signs indicate the BER and BLER of the Cayley code at rate 2, respectively.
The solid line and solid line with plus signs indicate the BER and BLER of the
training-based orthogonal design at rate 2 respectively and the dash-dotted line and
dash-dotted line with plus sighs show the BER and BLER of the training-based
LD code at rate 2 respectively. We can see from the figure that the Cayley code
underperforms the optimal training-based codes by 3 − 4dB. However, our results are
preliminary and it is conceivable that better performance may be obtained by further
optimization of (3.30) or (3.31).
74
−1
10
BER/BLER
−2
10
−4
10
0 2 4 6 8 10 12 14 16 18 20
SNR
For the training-based scheme of this setting, 2 channel uses of each coherence interval
are allocated to training. Therefore, in the data transmission phase, bits are encoded
into a 3 × 2 data matrix Sd . Since we are not aware of any 3 × 2 space-time code,
0
T=5 M=2 N=1
10
−1
10
−2
10
BER/BLER
−3
10
The performance curves are shown in Figure 3.3. The solid line and solid line with
plus signs indicate the BER and BLER of the Cayley code at rate 1, respectively,
the dash-dotted line and dash-dotted line with plus signs show the BER and BLER
76
of the Cayley code at rate 2, respectively, and the dashed line and dashed line with
plus signs shows the BER and BLER of the training-based scheme, which has a rate
of 6/5. Exhaustive search is used in decoding the training-based scheme and sphere
decoding is applied to decode the Cayley codes.
We can see that our Cayley code at rate 1 has lower BER and BLER than the
training-based scheme at rate 6/5 at any SNR. And, even at a rate which is 4/5
higher, 2 compared with 6/5, the performance of the Cayley code is comparable to
that of the training-based scheme when the SNR is as high as 35dB.
Example of T = 7, M = 3, N = 1
For this system setting, three channel uses of each coherence interval are allocated
to training. In the data transmission phase of the training-based scheme, we use the
h i h i
β2√
+β3 α2√
−α4 β1 β2 −β3
α1 + α 3 + i + β42 2
−i √
2
+ 2
0
h i h i
β1 β2 −β3
−α√2 +α4
2
− i √
2
+ 2
α1 − i β2√+β
2
3
+i − α2√+α
−2
4 β1
√
2
β2 −β3
2
Sd = h i h i .
α2√
+α4
+ i √β12 − β2 −β α1 − α3 + i β2√+β
0 2 2
3
2
3
− β4
h i h i
α2√
−α4 β1 β2 −β3 α2√
+α4 β β −β
2
+i √
2
+ 2
−α3 + iβ4 − 2 + i √2 − 2 1 2 3
By setting αi , βi as BPSK, we obtain a LD code at rate 8/7. For the Cayley code, we
the training-based LD code, which has a rate of 8/7. Sphere decoding is applied in
the decoding of both codes. From the figure we can see that the performance of the
Cayley code is close to the performance of the training-based LD code. Therefore,
at a rate 1/7 lower, the Cayley code is comparable with the training-based LD code.
77
0
T=7 M=3 N=1
10
−1
10
BER/BLER
−2
10
SNR
3.8 Conclusion
Cayley unitary space-time codes are proposed in this chapter. The codes do not
require channel knowledge at either the transmitter or the receiver, are simple to
encode and decode, and apply to systems with any combination of transmit and
receive antennas. They are designed with a probabilistic criterion: they maximize the
We showed that by constraining A12 = (I + iA11 )B and ignoring the data dependence
of the additive noise, α1 , ..., αQ appear linearly at the receiver. Therefore, linear
decoding algorithms such as sphere decoding and nulling-and-canceling can be used
whose complexity is polynomial in the rate and dimension. Our code has a similar
tion results are preliminary, but indicate that the Cayley codes generated with this
recipe only slightly underperform optimized training-based schemes using orthogonal
designs and LD codes. However, they are clearly superior to uncoded training-based
space-time schemes. Further optimization of the Cayley code basis matrices (in (3.30)
3.9 Appendices
In the simulations, the maximization of the design criterion function (3.30) is per-
formed using a simple gradient-ascent method. In this section, we compute the gra-
max E log det[B ∗ (A11 − A011 )B − (A22 − A022 )]2 − 2E log det ∆22 . (3.32)
{A11,q ,A22,q },B
To compute the gradient of a real function f (Aq ) with respect to the entries of the
∂f (Aq ) 1
= min [f (Aq + δ(ej etk + ek etj )) − f (Aq )], j 6= k, (3.33)
∂<Aq j,k
δ→0 δ
∂f (Aq ) 1
= min [f (Aq + iδ(ej etk − ek etj )) − f (Aq )], j 6= k, (3.34)
∂=Aq j,k
δ→0 δ
∂f (Aq ) 1
= min [f (Aq + δej etj ) − f (Aq )], (3.35)
∂Aq j,j
δ→0 δ
where ej is the unit column vector of the same dimension of columns of Aq which has
a one in the j-th entry and zeros elsewhere. That is, while calculating the gradient
with respect to A11,q , ej should have dimension M and for the gradient with respect
to A22,q , the dimension should be T − M instead.
PQ
First, note that A11 − A011 = A11,q aq where aq = αq − αq0 and similarly, A22 −
q=1
Q
P
A022 = A22,q aq . Therefore, to apply (3.33) to the first term of (3.32) with respect
q=1
80
to A11,q , let H = B ∗ (A11 − A011 )B − (A22 − A022 ). Therefore,
log det[B ∗ (A11 − A011 )B − (A22 − A022 ) + B ∗ (ej etk + ek etj )Bδaq ]2
= log det{H 2 + [HB ∗ (ej etk + ek etj )B + B ∗ (ej etk + ek etj )BH]δaq + o(δ 2 )I}
= log det H 2 + log det{I + H −2 [HB ∗ (ej etk + ek etj )B + B ∗ (ej etk + ek etj )BH]δaq + o(δ 2 )I}
= log det H 2 + tr {H −2 [HB ∗ (ej etk + ek etj )B + B ∗ (ej etk + ek etj )BH]δaq } + o(δ 2 )
= log det H 2 + tr {H −1 B ∗ (ej etk + ek etj )B + H −2 B ∗ (ej etk + ek etj )BHδaq } + o(δ 2 )
= log det H 2 + tr {BH −1 B ∗ (ej etk + ek etj ) + BH −1 B ∗ (ej etk + ek etj )δaq } + o(δ 2 )
where {A}i,j indicates the (i, j)-th entry of matrix A and <{A}i,j indicates the real
part of the (i, j)-th entry of matrix A. We use tr AB = tr BA and the last equality
follows because BH −1 B ∗ is Hermitian. We may now apply (3.33) to obtain
∂ log det[B ∗ (A11 − A011 )B − (A22 − A022 )]2
= 4E <{BH −1 B ∗ }j,k aq , j 6= k.
∂<A11,q j,k
The gradient with respect to the imaginary components of A11,q can be obtained
in a similar way as the following
∂ log det[B ∗ (A11 − A011 )B − (A22 − A022 )]2
= 4E ={BH −1 B ∗ }j,k aq , j 6= k,
∂=A11,q j,k
where ={A}i,j indicates the imaginary part of the (i, j)-th entry of matrix A. And
the gradient with respect to the diagonal elements is
∂ log det[B ∗ (A11 − A011 )B − (A22 − A022 )]2
= 2E {BH −1 B ∗ }j,j aq .
∂A11,q j,j
81
Similarly, we get the gradient with respect to A22,q ,
∂ log det[B ∗ (A11 − A011 )B − (A22 − A022 )]2 −1
= −4E <Hj,k aq , j 6= k,
∂<A22,q j,k
∂ log det[B ∗ (A11 − A011 )B − (A22 − A022 )]2 −1
= −4E =Hj,k aq , j 6= k,
∂=A22,q j,k
∂ log det[B ∗ (A11 − A011 )B − (A22 − A022 )]2 −1
= −2E Hj,j aq .
∂A22,q j,j
For the second term, by using the same method, the following results are obtained
∂ log det ∆22
= 2E <(D + D ∗ + E + E ∗ )j,k αq , j 6= k,
∂<A11,q j,k
∂ log det ∆22
= 2E =(D + D ∗ + E + E ∗ )j,k αq , j 6= k,
∂=A11,q j,k
∂ log det ∆22
= 2E (D + E)j,j αq ,
∂A11,q j,j
∂ log det ∆22
= 2E <(F + F ∗ + G + G∗ )j,k αq , j 6= k,
∂<A22,q j,k
∂ log det ∆22
= 2E =(F + F ∗ + G + G∗ )j,k αq , j 6= k,
∂=A22,q j,k
∂ log det ∆22
= 2E (F + G)j,j αa ,
∂A22,q j,j
where
D = iB∆−2 ∗
2 (I + iA22 )B ,
E = iB∆−2 ∗ ∗
2 B (I − iA11 )BB ,
F = ∆−2
2 A22 ,
G = iB ∗ (I + iA11 )B∆−2
2 ,
Let γ1 be a multiplicative factor that we use to multiple every A11,q and γ2 a multi-
plicative factor that we use to multiple every A22,q . Thus, γ1 and γ2 are the Frobenius
norms of matrices in {A11,q } and {A22,q }. We solve for the optimal γ1 , γ2 > 0 by max-
ξ(γ1 , γ2 ) = E log det[γ1 B ∗ (A11 − A011 )B − γ2 (A22 − A022 )]2 − 2E log det ∆22 ,
where
Q Q1
X X
∆2 = I + B ∗ B − iγ1 B ∗ αq A11,q B + iγ2 αq A22,q .
q=1 q=1
∂f (x1 , x2 ) 1
= lim [f (x1 + δ, x2 ) − f (x1 , x2 )],
∂x1 δ→0 δ
∂f (x1 , x2 ) 1
= lim [f (x1 , x2 + δ) − f (x1 , x2 )].
∂x2 δ→0 δ
∂ξ(γ1 , γ2 )
∂γ1
= −2E tr {f −1 [2γ1 B ∗ A11 BB ∗ A11 B + iB ∗ (BB ∗ A11 − A11 BB ∗ )B
−γ2 (A22 B ∗ A11 B + A11 BA22 B ∗ )]} + E tr [g −1 (2γ1 B ∗ (A11 − A011 )BB ∗ (A11 − A011 )B
−γ2 ((A22 − A022 )B ∗ (A11 − A011 )B + (A11 − A011 )B(A22 − A022 )B ∗ ))]
83
and
∂ξ(γ1 , γ2 )
∂γ2
= −2E tr [f −1 (2γ1 A222 − i(B ∗ BA22 + A22 BB ∗ ) − γ1 (A22 B ∗ A11 B + A11 BA22 B ∗ ))]
E tr [g −1 (2γ2 A222 − γ1 ((A22 − A022 )B ∗ (A11 − A011 )B + (A11 − A011 )B(A22 − A022 )B ∗ ))].
Simulation shows that good performance is obtained when γ1 and γ2 are not too
far away from unity.
84
Another interesting space-time codes design scheme is the group codes proposed orig-
inally in [SHHS01, Hug00b, HK00], in which the set of matrices, which are space-time
codes, forms a group. The motivation of group-based codes is as the following.
As discussed in Section 2.6, the space-time code design problem for differential
unitary space-time modulation, which is well-tailored for the unknown-channel case,
is: given the number of transmitter antennas M and the transmission rate R, find a
set C of L = 2M R M × M unitary matrices, such that the diversity product
1 1
ξC = min | det(Si − Si0 )| M (4.1)
2 Si 6=Si0 ∈C
is as large as possible. This design problem is very difficult to solve because of the
following reasons. First, it is easy to see that the objective function of the code
design problem, given in (4.1), is not convex. Second, the constraint space, which
is the space of unitary matrices, is also non-convex. Furthermore, when our desired
rate of transmission R is not very low, the constellation size L = 2M R can be huge,
which make the problem even more difficult according to computation complexity. For
example, if there are four transmit antennas and we want to transmit at a rate of four
bits per channel use, we need to find a set of L = 216 = 65, 536 unitary matrices whose
minimum value of the determinants of the pairwise difference matrices is maximized.
Therefore, it appears that there is no efficient algorithm with tractable computational
85
complexity to find the exact solution to this problem. To simplify the design problem,
it is necessary to introduce some structure to the constellation set C. Group structure
Definition 4.1 (Group). [DF99] A group is an ordered pair (G, ?), where G is a
set and ? is a binary operation on G satisfying the following axioms.
2. There exists an element e in G, called the identity of G, such that for all a ∈ G,
a ? e = e ? a = a.
matrix of a. Note that the matrix multiplication operation is not commutative, that
is, ab = ba is not true in general.
Example 3. U (n), the set of all unitary n × n matrices, is a group under the
operation of matrix multiplication with e = In and a−1 the inverse matrix of a. It is
a subgroup of the group GLn (C).
Now we are going to discuss the advantages of group structure in the space-time
code design problem. In general, for two arbitrary elements A and B in a set, C, with
86
cardinality L, | det(A − B)| takes on L(L − 1)/2 distinct values if A 6= B. Therefore,
when L is large, diversity product of the set may be quite small. If C forms a group,
for any two matrices A and B in the set, there exists a matrix C ∈ C such that
C = A−1 B. Therefore,
| det(A − B)| = | det A|| det(IM − A−1 B)| = | det(IM − A−1 B)| = | det(IM − C)|,
that the calculations of both the diversity product and the transmission matrix are
greatly simplified. From the above formula, it can be easily seen that the complexity
of calculating the diversity product is reduced dramatically. In general, to calculate
the diversity product of a set with L elements, L(L − 1)/2 calculations of matrix
determinants are needed, which is quadratic in L. However, if the set forms a group, it
has been shown that only L−1 calculations of matrix determinants are needed, which
is linear in L. For the calculation of the transmission signal matrix, generally, the
multiplication of two M × M matrices is needed. If the set C is a group, the product
is also in the group, which means that every transmission matrix is an element of
S = {IM , S, S 2 , · · · , S L−1 },
where
1 0 ··· 0
l1
2πj
0 e L ··· 0
S= .. .. ..
..
. . . .
lM −1
0 0 · · · e2πj L
with l1 , · · · , lM −1 ∈ [0, L − 1]. The code is called cyclic code since any element in the
It is easy to check that S forms a group. Actually it is the Lie group SU (2): the set of
2 × 2 unitary matrix with unit determinant. This group has infinitly many elements.
To get a finite constellation, x and y can be chosen from some finite sets S1 and S2 ,
for example PSK or QAM signals1 .
Before getting into more details about the design of space-time group codes, some
group theory and representation theory are reviewed in this section that are needed
later.
1
By these choices, the resulted sets might not form a group anymore.
88
Definition 4.2. A subgroup H of G is a normal subgroup if
homomorphism if
As discussed in the previous section, if our signal set has a group structure under
matrix multiplication, its diversity product can be simplified to
1 1
ξC = min | det(I − V )| M .
2 I6=V ∈C
If we insist on a fully diverse constellation, which means that ξC 6= 0, then from the
above equalities, the eigenvalues of all non-identity elements in the constellation must
be different from one. This leads to the following definition.
free (fpf ) if and only if it has a faithful representation as unitary matrices with
the property that the representation of each non-unit element of the group has no
eigenvalue at unity.
Note that the above definition does not require that in every representation of the
group, non-unit elements have no eigenvalue at unity, but rather that there exists one
done so, all groups would have been fpf if all elements in the groups are represented
as the identity matrix. For more information on groups, see [DF99] and [Hun74].
90
4.3 Constellations Based on Finite Fixed-Point-
Free Groups
To get space-time codes with good performance, fpf groups have been widely investi-
gated. Shokrollahi, Hassibi, Hochwald and Sweldens classified all finite fpf groups in
their magnificent paper [SHHS01] based on Zassenhaus’s work [Zas36] in 1936. There
are only six types of finite fpf groups as given in the following. Before giving the big
theorem, we first introduce a definition.
Definition 4.10. Given a pair of integers (m, r), define n as the smallest positive
integer such that r n = 1 mod m. Define t = m/ gcd(r − 1, m). The pair (m, r) is
called admissible if gcd(n, t) = 1.
1.
Gm,r = σ, τ |σ m = 1, τ n = σ t , σ τ = σ r ,
2.
Dm,r,l = σ, τ, γ|σ m = 1, τ n = σ t , σ τ = σ r , σ γ = σ l , τ γ = τ l , γ 2 = τ nr0 /2 ,
D m/t
Em,r = σ, τ, µ, γ|σ m = 1, τ n = σ t , σ τ = σ r , µσ = µ,
m/t
E
γσ = γ, µ4 = 1, µ2 = γ 2 , µγ = µ−1 , µτ = γ, γ τ = µγ ,
4.
D m/t m/t
Fm,r,l = σ, τ, µ, γ, ν|σ m = 1, τ n = σ t , σ τ = σ r , µσ = µ, γ σ = γ,
µτ = γ, γ τ = µγ, µ4 = 1, µ2 = γ 2 , µγ = µ−1 , ν 2 = µ2 , σ ν = σ l ,
τ ν = τ l , µν = γ −1 , γ ν = µ−1 ,
5.
SL2 (F5 ) = µ, γ|µ2 = γ 3 = (µγ)5 , µ4 = 1 .
Km,n,l = hJm,r , νi
with relations
Some unitary representations of these abstract groups are also given in [SHHS01].
Constellations based on representations of these finite fpf groups give space-time codes
with amazingly good performances at low to moderate rates, for example, SL2 (F5 )
, which works for systems with two transmit antennas at rate 3.45. In that paper,
the authors also give non-group constellations which are generalizations of some finite
groups and products of group representations.
As shown in [SHHS01], these finite fpf groups are few and far between. There are
only six types of them and unitary representations of them have dimension and rate
constraints. Although very good constellations are obtained for low to moderate
rates, no good constellations are obtained for very high rates from these finite groups.
This motivates the search for infinite fpf groups, in particular, their most interesting
case, Lie groups.
Definition 4.11 (Lie group). [BtD95] A Lie group is a differential manifold which
is also a group such that the group multiplication and inversion maps are differential.
93
The above definition gives us the main reason for studying Lie groups. Since
Lie groups have an underlying manifold strcuture, finite constellations, which are
subsets of infinite Lie groups, can be obtained by sampling the underlying continuous
manifold appropriately.
Definition 4.12 (Lie algebra). [SW86] A Lie algebra g is a vector space over a
field F on which a product [, ], called the Lie bracket, is defined, which satisfies
1. X, Y ∈ g implies [X, Y ] ∈ g,
It turns out that there is a close connection between Lie groups and Lie algebras.
Theorem 4.2 (Connection between Lie group and Lie algebra). [SW86] Let
G be a Lie group of matrices. Then g, the set of tangent vectors to all curves in
G at the identity, is a Lie algebra. Let g be a linear algebra generated by the basis
g1 , · · · , gn , then g(θ) = eθ1 g1 +···+θn gn is a local Lie group for small enough θ.
Therefore, to obtain many, if not most, of its properties, one can study the Lie
algebra, rather than the Lie group itself. Lie algebras are easier to be analyzed
because they are vector spaces with good properties.
Example 1. GL(n, C) is the Lie group of non-singular n × n complex matrices.
Its Lie algebra is the space of n × n complex matrices.
Definition 4.14. A Lie algebra g is simple if dim g > 1 and it contains no nontrivial
ideals. Or equivalently, the Lie group G of the Lie algebra has no nontrivial normal
Lie subgroups.
D 1 g = [g, g]
and
D k g = [D k−1 g, D k−1 g].
The rank of a Lie algebra g equals the maximum number of commuting basis ele-
Theorem 4.3 (Lie groups with unitary representations). [HK00] A Lie group
has a representation as unitary matrices if and only of its Lie algebra is a compact
3
If for any two elements f and g in a group G, f ? g = g ? f , G is an Abelian group.
95
semi-simple Lie algebra or the direct sum of u(1) and a compact semi-simple Lie
algebra.
For more on the definition of semi-simple and simple Lie algebras, see [HK00,
BtD95, Ser92].
To design differential unitary space-time codes with good performance, two conditions
must be satisfied: the unitarity and the full diversity. For unitarity, from Theorem
4.3, to get unitary constellations, we should look at compact, semi-simple Lie groups.
Since any semi-simple Lie group can be decomposed as a direct sum of simple Lie
groups, for simplicity, we look at compact, simple Lie groups.
For full diversity, we want to design constellations with positive diversity product,
based on these two Lie groups are constrained to systems with one and two transmit
antennas. (Codes constructed based on higher-dimensional representations of SU (2)
can be found in [Sho00].) To obtain constellations that work for systems with more
than two transmit antennas, we relax the fpf condition, which is equivalent to non-
identity elements have no unit eigenvalues, and consider Lie groups whose non-identity
elements have no more than k > 0 unit eigenvalues (k = 0 corresponds to fpf groups).
Since constellations of finite size are obtained by sampling the Lie group’s underlying
manifold. When k is small, there is a good chance that, by sampling appropriately,
fully diverse subsets can be obtained.
It follows from the exponential map relating Lie groups with Lie algebras that
96
a matrix element of a Lie group has unit-eigenvalues if and only if the correspond-
ing matrix element (the logarithm of the element) in the corresponding Lie algebra
(tangent space at identity) has zero-eigenvalues and vice versa. Thus classifying Lie
groups whose non-identity elements have no more than k eigenvalues at 1 is the same
as classifying Lie algebras whose non-zero elements have no more than k eigenvalues
at zero. Unfortunately, there does not appear to be a straightforward way of analyz-
Lemma 4.1. If a Lie algebra g of M × M matrices has rank r, it has at least one
non-zero element with r − 1 eigenvalues at zero.
Proof: Assume that b1 , · · · , br are the commuting basis elements of the Lie al-
gebra g. Since they commute, there exists a matrix T such that b1 , · · · , br can be
the center of the group G. There are three groups with rank 2: the Lie group of unit-
97
Group Dimension Z(G) Cartan Name Rank
SU (n); n ≥ 2 n2 − 1 Zn An−1 n−1
Sp(n); n ≥ 2 n(2n + 1) Z2 Cn n
Spin(2n + 1); n ≥ 3 n(2n + 1) Z2 Bn n
Spin(2n); n ≥ 4 n(2n − 1) Z4 (n odd) Dn n
Z2 × Z2 (n even)
E6 78 Z3 E6 6
E7 133 Z2 E7 7
E8 248 0 E8 8
F4 52 0 F4 4
G2 14 0 G2 2
G2 has dimension 14, and its simplest matrix representation is 7-dimensional, which
is very difficult to be parameterized.
98
5.1 Abstract
As discussed in Section 4.5, Sp(n) is a Lie group with rank n. In this chapter, the
ML decoding via sphere decoding. Simulation results show that they have better
performance than 2 × 2 and 4 × 4 orthogonal designs, Cayley differential codes, and
some finite-group-based codes at high SNR. It is also shown that they are comparable
to the carefully designed product-of-groups code.
This chapter is organized as follows. In Section 5.2, the Lie group Sp(n) is dis-
cussed and a parameterization method of it is given. Based on the parameterization,
in Section 5.3, differential unitary space-time codes that are subsets of Sp(2) are de-
signed. The full diversity of the codes is proved in Section 5.4. In Section 5.5, Sp(2)
codes with higher rates are proposed. It is shown in Section 5.6 that the codes have
a fast ML decoding algorithm using sphere decoding. Finally, in Section 5.7 the per-
formance of the Sp(2) code is compared with that of other existing codes including
Alamouti’s orthogonal design, the 4 × 4 complex orthogonal design, Cayley differen-
tial unitary space-time codes, and finite-group-based codes. Section 5.8 provides the
tion
Definition 5.1 (Symplectic group). [Sim94] Sp(n), the n-th order symptectic
group, is the set of complex 2n × 2n matrices S obeying
Sp(n) has dimension n(2n + 1) and rank n. As mentioned before, we are most
interested in the lowest rank case, which is also the simplest case of n = 2. Also
note that Sp(1) = SU (2), and SU (2) constitutes the orthogonal design of Alamouti
Lemma 5.1. The multiplicity of the unit eigenvalue of any matrix in Sp(2) is even.
100
Proof: Assume S is a matrix in Sp(2) and x is an eigenvector of S with eigenvalue
1. Then we have Sx = x. From the symplectic condition S t JS = J, S ∗ J S̄ = J. Since
x1 0 I2 x̄1 x1 = x̄2
x1 = 0
= ⇒ ⇒ ,
x2 −I2 0 x̄2 x2 = −x̄1 x2 = 0
condition 2 becomes
JS = S̄J. (5.1)
design form. To get a more detailed structure of the Lie group, let us look at the
conditions imposed on A and B for S to be unitary. From SS ∗ = I2n or S ∗ S = I2n ,
AA∗ + BB ∗ = In
A∗ A + B t B̄ = In
or . (5.3)
BAt = AB t
A∗ B = B t Ā
Lemma 5.2. For any n × n complex matrices A and B that satisfy (5.3), there exist
unitary matrices U and V such that A = U ΣA V and B = U ΣB V̄ , where ΣA and ΣB
are diagonal matrices whose diagonal elements are singular values of A and B.
(1) (2)
Proof: Denote the i-th diagonal element of D1 and D2 as dii and dii , respec-
tively. Since U D1 U ∗ is a similarity transformation, which preserves the eigenvalues,
the set of eigenvalues of D1 is the same as the set of eigenvalues of D2 , or in other
(1) (2) (1) (2)
words, dii = djj for some j. Notice that dii s and dii s are ordered non-increasingly.
(1) (2)
Therefore, dii = dii for i = 1, 2, ..., n, that is, D1 = D2 = D. Now write D as
diag{d1 IP1 , · · · , dq IPq }, where Pi is the number of times the element di appears in D
for i = 1, · · · , q. It is obvious that U can be written as diag{U1 , · · · , Uq } where the
size of Ui is Pq for i = 1, · · · , q.
1
Note that the structure in (5.2) is akin to the quasi-orthogonal space-time block codes in [TBH00,
Jaf01]. The crucial difference, however, is that in our paper, we shall insist that (5.2) be a unitary
matrix. This leads to further conditions on A and B, which are described below, and do not appear
in quasi-orthogonal space-time block codes.
102
Lemma 5.4. If U D 2 = D 2 U for any n × n positive semi-definite diagonal matrix D
and any n × n matrix U , then U D = DU .
Proof: By looking at the (i, j)-th entries of U D 2 and D 2 U , uij d2jj = d2ii uij . If
d2ii 6= d2jj , uij = 0 is obtained, and therefore uij djj = dii uij . If d2ii = d2jj , since
D is a positive semi-definite matrix, dii is non-negative, therefore, dii = djj and so
uij djj = dii uij is obtained. Therefore, U D = DU .
ordered. From the equation AA∗ +BB ∗ = In in (5.3), the following series of equations
can be obtained.
Since UA and UB are unitary, UA∗ UB is also unitary. Now since the diagonal entries
Therefore, In − Σ2A = Σ2B and (VA0∗ V̄B )Σ2B = Σ2B (VA0∗ V̄B ) by using Lemma 5.3. Define
VD = VA0∗ V̄B , which is obviously a unitary matrix. Therefore V̄B = VA0 VD , VD Σ2B =
matrix and ejG/2 ΣA = ΣA ejG/2 , ejG/2 ΣB = ΣB ejG/2 can be obtained, where G/2 =
diag{G1 /2, · · · , Gq /2}. Therefore ejG/2 is the square root of VD . Thus, A and B can
104
be written as
A = U1 ΣA VD V1∗ = U1 ΣA ejG V1∗ = U1 ejG/2 ΣA ejG/2 V1∗ = (U1 ejG/2 )ΣA (V1 e−jG/2 )∗ ,
Proof: Lemma 5.2 and formula (5.2) imply that any matrix in Sp(2) can be
written as the form in (5.5). Conversely, for any matrix S with the form of (5.5), it
is easy to verify the unitary and symplectic conditions in Definition 5.1.
Now let us look at the dimension of S. It is known that an n × n unitary matrix
n(n−1)
has dimension 2n2 − n − 2 × 2
= n2 . Therefore, there are all together 2n2
degrees of freedom in the unitary matrices U and V . Together with the n real angles
θi , the dimension of S is therefore n(2n + 1), which is exactly the same as that of
Sp(n). But from the discussion above, an extra condition is imposed on the matrix
S: the diagonal elements of ΣA and ΣB are non-negative and non-increasingly/non-
decreasingly ordered. This might cause the dimension of S to be less than matrices
105
in Sp(2) at first glance. However, the order and signs of the diagonal elements of ΣA
and ΣB can be changed by right multiplying U and left multiplying V with two types
Let us now focus on the case of n = 2. The goal is to find fully diverse subsets
of Sp(2). For simplicity, first let ΣA = ΣB = √1 I2 , by which 2 degrees of freedom
2
are neglected. To get a finite subset of unitary matrices from the infinite Lie group,
further choose U and V as orthogonal designs with entries of U chosen from the set of
2π 2π(P −1)
P -PSK signals: {1, ej P , · · · , ej P } and entries of V chosen from the set of Q-PSK
2π 2π(Q−1)
signals shifted by an angle θ: {ejθ , ej( Q +θ) , · · · , ej( Q
+θ)
} [Hug00b, Ala98]. The
following code is obtained.
j 2πk
j 2πl
1
e P e P
U = 2√ ,
2πl 2πk
−j P −j P
−e e
UV U V̄
1
2πm 2πn
CP,Q,θ = √ j( +θ) j( +θ) , (5.6)
e e
Q Q
2 −Ū V Ū V̄ V = √12
,
2πn 2πm
−e−j( Q +θ) e−j( Q +θ)
0 ≤ k, l < P, 0 ≤ m, n < Q
where P and Q are positive integers. θ ∈ [0, 2π) is an angle to be chosen later. There
are P 2 possible U matrices and Q2 possible V matrices. Since the channel is used in
2
Definition of permutation matrices can be found in [Art99]. It is easy to see that both types of
matrices are unitary, therefore, the unitarity of U and V keeps unchanged.
106
blocks of four transmissions, the rate of the code is therefore
1
(log2 P + log2 Q). (5.7)
2
It is easy to see that any transmission matrix in the code can be identified by the
4-tuple (k, l, m, n). The angle θ, an extra degree of freedom added to increase the
diversity product, is used in the proof of the full diversity of the code. However,
simulation results indicate that the code is always fully-diverse no matter what value
θ takes.
The rate of the code is the same as that of CP,Q,θ and its full diversity can be proved
In differential unitary space-time code design for multiple-antenna systems, the most
widely used criterion is the full diversity of the code since as discussed in Section
2.6, the diversity product is directly related to the pairwise error probability of the
1 U1 V1 U1 V̄1 1 U2 V2 U2 V̄2
S1 = √ and S2 = √ (5.8)
2 −Ū V Ū V̄ 2 −Ū V Ū V̄
1 1 1 1 2 2 2 2
and
2πki 2πli 2πmi 2πni
j j j( +θ) j( +θ)
1 e P e P
1 e e Q
Q
2. M ∗ M = M M ∗ = (det M )I2 ,
M∗ Mt
4. M −1 = det M
and M̄ −1 = det M
.
Define
det (S1 − S2 )
1 U 1 V1 − U 2 V2 U1 V̄1 − U2 V̄2
= √ det
2 −(Ū1 V1 − Ū2 V2 ) Ū1 V̄1 − Ū2 V̄2
1 O1 O2
= √ det
2 −Ō2 Ō1
1
= √ det O1 det(Ō1 + Ō2 O1−1 O2 ) (5.11)
2
easy to prove that the addition, multiplication, and conjugate operations preserve
this property, O1 and O2 are also have the orthogonal design structure. By taking
advantage of this, when det O1 6= 0 and det O2 6= 0, the determinant of the difference
can be calculated to be
det (S1 − S2 )
1 Ō2 O1∗ O2
= √ det O1 det Ō1 +
2 det O1
1 Ō1 O2 O2 Ō2 O1∗ O2
∗
= √ det O1 det +
2 det O2 det O1
∗
1 Ō1 O2 Ō2 O1∗
= √ det O1 det + det O2
2 det O2 det O1
√ √
1 det O1 det O2 ∗ det O1 det O2 ∗
= √ det Ō1 O2 + Ō2 O1
2 det O2 det O1
1 ∗ 1 ∗ t
= √ det aŌ1 O2 + (Ō1 O2 )
2 a
t
1 α β 1 α β
= √ det a +
2 −β̄ ᾱ a −β̄ ᾱ
2 2 !
1 1 1
= √ aα + α + aβ − β̄
2 a a
109
2 2
1 1 1 β̄
= √ |α|2 a + + aβ − , (5.12)
2 a 2 a
q
det O1
where a = det O2
is a positive number and (α, β) is the first row of Ō1 O2∗ .
Lemma 5.6. For any S1 and S2 given in (5.8) and (5.9) where S1 6= S2 , det (S1 − S2 ) =
0 if and only if O1 = ±J Ō2 , or equivalently
e2jθ w + = x+
e2jθ w − = x−
, or , (5.13)
2jθ +
e y = z+
2jθ −
e y = z−
where
k1 m1 k2 m2 l1 m1 l2 m2
w + = e2πj( P + Q ) − e2πj( P + Q ) + e2πj( P + Q ) − e2πj( P + Q )
k n k n l n l n
x+ = −e2πj( P1 − Q1 ) + e2πj( P2 − Q2 ) + e2πj( P1 − Q1 ) − e2πj( P2 − Q2 )
k n k n l n l n
(5.14)
+ 2πj( P1 + Q1 ) 2πj( P2 + Q2 ) 2πj( P1 + Q1 ) 2πj( P2 + Q2 )
y = e − e + e − e
z + = e2πj( kP1 − mQ1 ) − e2πj( kP2 − mQ2 ) − e2πj( lP1 − mQ1 ) + e2πj( lP2 − mQ2 )
and
k1 m1 k2 m2 l1 m1 l2 m2
w − = e2πj( P + Q ) − e2πj( P + Q ) − e2πj( P + Q ) + e2πj( P + Q )
k n k n l n l n
x− = e2πj( P1 − Q1 ) − e2πj( P2 − Q2 ) + e2πj( P1 − Q1 ) − e2πj( P2 − Q2 )
k n k n l n l n
. (5.15)
y − = e2πj( P1 + Q1 ) − e2πj( P2 + Q2 ) − e2πj( P1 + Q1 ) + e2πj( P2 + Q2 )
z − = −e2πj( kP1 − mQ1 ) + e2πj( kP2 − mQ2 ) − e2πj( lP1 − mQ1 ) + e2πj( lP2 − mQ2 )
Theorem 5.2 (Condition for full diversity). There exists a θ such that the code
in (5.6) is fully diverse if and only if P and Q are relatively prime.
110
This theorem provides both the sufficient and the necessary condition for the code
set to be fully diverse. Before proving this theorem, a few lemmas are given first which
Lemma 5.7. For any four points on the unit circle that add up (as complex numbers)
to zero, it is always true that two of them add up to zero. (Clearly, the other two
must also have a summation of zero.)
Lemma 5.8. If P and Q are relatively prime, then for any non-identical pairs,
(k1 , l1 , m1 , n1 ) and (k2 , l2 , m2 , n2 ), where k1 , l1 , k2 , l2 ∈ [0, P ) and m1 , n1 , m2 , n2 ∈
[0, Q) are integers, w + , x+ , y + , z + , as defined in (5.14), cannot be zero simulta-
Proof of Theorem 5.2: The proof has two steps. First, we prove the suffi-
ciency of the condition, that is, assuming P and Q are relatively prime, we show
that there exists a θ such that the code is fully-diverse. If P and Q are relatively
prime, by Lemma 5.8, for any non-identical pair of signal matrices (k1 , l1 , m1 , n1 ) and
1 +
2
Arg ( wx + ) mod 2π if w + 6= 0
− 1 Arg ( w++ )
mod 2π if w + = 0, x+ 6= 0
2 x
θk+1 ,l1 ,m1 ,n1 ,k2 ,l2 ,m2 ,n2 = ,
1 +
Arg ( yz + ) mod 2π if w + = x+ = 0, y + 6= 0
2
− 1 Arg ( y+ )
mod 2π if w + = x+ = y + = 0, z + 6= 0
2 z+
111
which is the same as
1 +
2
Arg ( wx + ) mod 2π if w + 6= 0
0 if w + = 0, x+ 6= 0
θk+1 ,l1 ,m1 ,n1 ,k2 ,l2 ,m2 ,n2 = (. 5.16)
1 +
2
Arg ( yz + ) mod 2π if w + = x+ = 0, y + 6= 0
0 if w + = x+ = y + = 0, z + 6= 0
Arg c indicates the argument of the complex number c. Also, by Lemma 5.8, w − , x− , y − , z −
cannot be zero simultaneously. (For definitions of w − , x− , y − , z − , see (5.15)). Define
1 −
2
Arg ( wx − ) mod 2π if w − 6= 0
− 1 Arg ( w−− )
mod 2π if w − = 0, x− 6= 0
2 x
θk−1 ,l1 ,m1 ,n1 ,k2 ,l2 ,m2 ,n2 = ,
1 −
Arg ( yz − ) mod 2π if w − = x− = 0, y − 6= 0
2
− 1 Arg ( y− )
mod 2π if w − = x− = y − = 0, z − 6= 0
2 z−
By choosing
/ θk+1 ,l1 ,m1 ,n1 ,k2 ,l2 ,m2 ,n2 ,
θ∈
(5.18)
+
|w | = |x+ |, |y +| = |z + |, 0 ≤ k1 , l1 , k2 , l2 < P, 0 ≤ m1 , n1 , m2 , n2 < Q}
112
and
/ θk−1 ,l1 ,m1 ,n1 ,k2 ,l2 ,m2 ,n2 ,
θ∈
(5.19)
||w − | = |x− |, |y −| = |z − |, 0 ≤ k1 , l1 , k2 , l2 < P, 0 ≤ m1 , n1 , m2 , n2 < Q} ,
(5.13) cannot be true at the same time. Therefore, by Lemma 5.6, det(S1 − S2 ) 6= 0,
which means that the code is fully diverse. An angle in [0, 2π) that satisfies (5.18)
can always be found since the two sets at the right-hand side of (5.18) and (5.19) are
finite. This proves the sufficiency of the condition (P and Q are relatively prime) in
Theorem 5.2.
In the second step, we prove the necessity of the condition, that is, assuming that
P and Q are not relatively prime, we show that there exist two signals in the code
such that the determinant of the difference of the two is zero for any θ. Assume
that the greatest common divisor of P and Q is G > 1, then there exist positive
integers P 0 and Q0 such that P = P 0 G and Q = Q0 G. Consider the following two
signal matrices S1 and S2 as given in (5.8) and (5.9) with k2 = k1 − P 0 , l2 = l1 − P 0 ,
m2 = m1 + Q0 , n2 = n1 + Q0 , k1 = l1 , and k2 = l2 . Assume that k1 , l1 ∈ [P 0 , P ), and
m1 , n1 ∈ [0, Q − Q0 ). Since P > P 0 and Q > Q0 , we can always choose k1 , l1 , m1 , n1
Remark: Note that we have actually proved that the codes in (5.6) are fully
diverse for almost any θ except for a measure zero set. However, this is a sufficient
condition and may be not necessary. The diversity products of many codes for θ from
0 to 2π with step size 0.001 are calculated by simulation. Two of these are shown in
the following. Simulation results show that the codes are fully diverse for all θ.
The following two plots show the diversity products of two Sp(2) codes at different
113
0.19
1/4
ξ=1/2 min|det(S −S )|
2
1 0.18
0.17
0.16
0.15
0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2
θ (radian)
0.09
0.085
0.08
1/4
ξ=1/2 min|det(S1−S2)|
0.075
0.07
0.065
0.06
0.055
0.05
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
θ (radian)
Since the angles of the elements in the V matrix of (5.6) are chosen from Q-PSK
signals shifted by an angle θ, it is enough to set the changing region of θ as [0, 2π/Q)
instead of [0, 2π). It can be seen from the two plots that the Sp(2) code with P = 7
and Q = 3 gets its highest diversity product, 0.1870, at θ = 0.0419 and the Sp(2)
In section 5.4, Sp(2) codes are designed with the 2 degrees of freedom in ΣA and ΣB
unused. For higher rate code design, one of the two degrees of freedom can be added
in by letting
where γi ∈ Γ for some real set Γ. The code can be constructed as follows.
j 2πk j 2πl
e P e P
U = √12
,
2πl 2πk
−e −j P e −j P
cos γi U V sin γi U V̄ 2πm 2πn
CP,Q,θ,Γ = j( Q +θ) j( Q +θ) , (5.20)
e e
− sin γi ŪV cos γi Ū V̄ V = √1 ,
2 2πn 2πm
−e−j( Q +θ) e−j( Q +θ)
0 ≤ k, l < P, 0 ≤ m, n < Q, γi ∈ Γ
where P and Q are positive integers and θ ∈ [0, 2π) is a constant to be chosen later.
It can be easily seen that any signal matrix in the code can be identified by the 5-tuple
(k, l, m, n, γi ). The code proposed in (5.6) is a special case of this code, which can be
obtained by setting Γ = { π4 }. Since the set has altogether P 2 Q2 |Γ| matrices and the
115
channel is used in blocks of four transmissions, the rate of the code is
1 1
(log2 P + log2 Q) + log2 |Γ|, (5.21)
2 4
where |Γ| indicates the cardinality of the set Γ. A conclusion similar to Theorem 5.2
Theorem 5.3 (Condition for full diversity). If P and Q are relatively prime with
1. Γ ⊂ (0, π2 ),
2. For any γ ∈ Γ,
sin π( Pl + m
Q
)
tan γ 6= ± , (5.22)
sin π( Pk + m
Q
)
then there exists θ such that the signal set in (5.20) is fully-diverse.
Proof: First we need to show that the right-hand side formula of (5.22) is well
k = 0 and m = 0. This contradicts the condition that (k, m) 6= (0, 0). We now prove
116
that the right-hand side formula of (5.23) is well defined, that is, cos 2π Pl 6= 0 for any
l ∈ [0, P ). Again, this can be proved by contradiction. Assume that cos 2π Pl = 0.
there exists a θ such that the code is fully diverse. It is equivalent to show that for
any non-identical pair of signals Si and Sj of the code, det(Si − Sj ) 6= 0. Without
loss of generality, assume
cos γi Ui Vi sin γi Ui V̄i cos γi Uj Vj sin γi Uj V̄j
Si = and Sj = , (5.24)
− sin γi Ūi Vi cos γi Ūi V̄i − sin γi Ūj Vj cos γi Ūj V̄j
where Ui , Vi are 2 × 2 unitary matrices given by (5.9) and Uj , Vj are the two 2 × 2
matrices in (5.9) by replacing i by j. The two signals being different indicates that
the two 5-tuples, (ki , li , mi , ni , γi ) and (kj , lj , mj , nj , γj ), are not identical. From the
proof of Lemma 5.6, det(Si − Sj ) is zero if and only if O10 = ±J Ō20 , where O10 and O20
are defined as
O10 = cos γi Ui Vi − cos γj Uj Vj and O20 = sin γi Ui V̄i − sin γj Uj V̄j . (5.25)
By using (5.25) and (5.9), similar to the argument in the proof of Lemma 5.6, O10 =
±J Ō20 can be equivalently written as
ki mi kj mj li mi lj mj
e 2jθ (cos γ e2πj( P + Q ) − cos γ e2πj( P + Q ) ± sin γ e2πj( P + Q ) ∓ sin γ e2πj( P + Q ) )
i j i j
k n k n l n l n
2πj( Pi − Qi ) 2πj( Pj − Qj ) 2πj( Pi − Qi ) 2πj( Pj − Qj )
= ∓ sin γ i e ± sin γ j e + cos γ i e − cos γ j e
or (5.26)
k n kj nj l n lj nj
e2jθ (cos γ e2πj( Pi + Qi ) − cos γ e2πj( P + Q ) ± sin γ e2πj( Pi + Qi ) ∓ sin γ e2πj( P + Q ) )
i j i j
ki mi kj mj li mi l m
2πj( P − Q ) 2πj( P − Q ) 2πj( P − Q ) 2πj( Pj − Qj )
= ± sin γi e ∓ sin γj e − cos γi e + cos γj e
117
Define
k m k m l m l m
+ 2πj( Pi + Qi ) 2πj( Pj + Qj ) 2πj( Pi + Qi ) 2πj( Pj + Qj )
w̃ = cos γ i e − cos γ j e + sin γ i e − sin γ j e
ki ni kj nj li ni l n
x̃+ = − sin γi e2πj( P − Q ) + sin γj e2πj( P − Q ) + cos γi e2πj( P − Q ) − cos γj e2πj( Pj − Qj )
k n k n l n l n (5.27)
+ 2πj( Pi + Qi ) 2πj( Pj + Qj ) 2πj( Pi + Qi ) 2πj( Pj + Qj )
ỹ = cos γ i e − cos γ j e + sin γ i e − sin γ j e
ki mi kj mj li mi l m
z̃ + = + sin γ e2πj( P − Q ) − sin γ e2πj( P − Q ) − cos γ e2πj( P − Q ) + cos γ e2πj( Pj − Qj )
i j i j
and
ki mi kj mj li mi lj mj
w̃ − = cos γi e2πj( P + Q ) − cos γj e2πj( P + Q ) − sin γi e2πj( P + Q ) + sin γj e2πj( P + Q )
k n k n l n l n
x̃− = + sin γi e2πj( Pi − Qi ) − sin γj e2πj( Pj − Qj ) + cos γi e2πj( Pi − Qi ) − cos γj e2πj( Pj − Qj )
k n k n l n l n (5.28)
.
− 2πj( Pi + Qi ) 2πj( Pj + Qj ) 2πj( Pi + Qi ) 2πj( Pj + Qj )
ỹ = cos γ i e − cos γ j e − sin γ i e + sin γ j e
ki mi kj mj li mi l m
z̃ − = − sin γ e2πj( P − Q ) + sin γ e2πj( P − Q ) − cos γ e2πj( P − Q ) + cos γ e2πj( Pj − Qj )
i j i j
Lemma 5.9. For any non-identical pairs (ki , li , mi , ni , γi ) and (kj , lj , mj , nj , γj ), where
ki , li , kj , lj ∈ [0, P ), mi , ni , mj , nj ∈ [0, Q) are integers and γi , γj ∈ Γ, if P and Q are
relatively prime with P odd and if the set Γ ⊂ (0, π2 ) satisfies conditions (5.22) and
1 +
2
Arg ( w̃x̃ + ) mod 2π if w̃ + 6= 0
− 1 Arg ( w̃++ ) mod 2π if w̃ + = 0, x̃+ 6= 0
2 x̃
θ̃k+1 ,l1 ,m1 ,n1 ,k2 ,l2 ,m2 ,n2 = ,
1 +
Arg ( ỹz̃ + ) mod 2π if w̃ + = x̃+ = 0, ỹ + 6= 0
2
− 1 Arg ( ỹ+ ) mod 2π if w̃ + = x̃+ = ỹ + = 0, z̃ + 6= 0
2 z̃ +
1 x̃−
2
Arg ( w̃ −) mod 2π if w̃ − 6= 0
− 1 Arg ( w̃−− )
mod 2π if w̃ − = 0, x̃− 6= 0
2 x̃
θ̃k−1 ,l1 ,m1 ,n1 ,k2 ,l2 ,m2 ,n2 = ,
1 −
Arg ( ỹz̃ − ) mod 2π if w̃ − = x̃− = 0, ỹ − 6= 0
2
− 1 Arg ( ỹ− )
mod 2π if w̃ − = x̃− = ỹ − = 0, z̃ − 6= 0
2 z̃ −
n
/ θ̃k+1 ,l1 ,m1 ,n1 ,k2 ,l2 ,m2 ,n2
θ∈
(5.30)
+
|w̃ | = |x̃+ |, |ỹ +| = |z̃ + |, 0 ≤ k1 , l1 , k2 , l2 < P, 0 ≤ m1 , n1 , m2 , n2 < Q
and
n
/ θ̃k−1 ,l1 ,m1 ,n1 ,k2 ,l2 ,m2 ,n2
θ∈
(5.31)
,
−
|w̃ | = |x̃− |, |ỹ −| = |z̃ − |, 0 ≤ k1 , l1 , k2 , l2 < P, 0 ≤ m1 , n1 , m2 , n2 < Q
(5.29) cannot be true. Therefore det(Si − Sj ) 6= 0, which means that the code is
fully-diverse. An angle in [0, 2π) that satisfies both (5.30) and (5.31) can always be
found since the two sets at the right-hand side of (5.30) and (5.31) are finite.
One of the most prominent properties of our Sp(2) codes is that it is a generalization
of the orthogonal design. In this section, it is shown how this property can be used
to get linear-algebraic decoding, which means that the receiver can be made to form
a system of linear equations in the unknowns.
5.6.1 Formulation
The ML decoding for differential USTM is given in (2.12), which, in our system, can
be written as
2
Ui∗ 0 (τ ) Vj V̄j (τ −1)
X1 X1
0 Uit
−Vj V̄j
1
.. ..
arg max
. − √ .
,
i,j
2
∗
Ui 0 (τ ) Vj V̄j (τ −1)
XN XN
t
0 Ui −Vj V̄j
F
(τ ) (τ −1)
where Xi denotes the i-th column of Xτ and Xi denotes the i-th column of
(τ ) (τ −1)
Xτ −1 . It is obvious that Xi and Xi are 4 × 1 column vectors. We further denote
(τ ) (τ −1)
the (i, j)-th entry of Xτ as xij and denote the (i, j)-th entry of Xτ as xij for
i = 1, 2, 3, 4 and j = 1, 2, · · · , N . The ML decoding is equivalent to
h i h i h i
2
(τ ) (τ ) t (τ −1) (τ −1) t (τ −1) (τ −1) t
Ui∗ x11 , x21 Vj x11 , x21 + V̄j x31 , x41
h i h i h i
(τ ) (τ ) t (τ −1) (τ −1) t (τ −1) (τ −1) t
Ujt x31 , x41
−Vj x11 , x21 + V̄j x31 , x41
1
.. ..
arg max
. − √ .
.
i,j
h
it 2
h it h it
∗ (τ ) (τ ) (τ −1) (τ −1) (τ −1) (τ −1)
Ui xN 1 , x N 2
Vj x1N , x2N + V̄j x3N , x4N
h it h it h it
(τ ) (τ ) (τ −1) (τ −1) (τ −1) (τ −1)
Ujt xN 3 , xN 4 −Vj x1N , x2N + V̄j x3N , x4N
F
From the design of the code, we know that the matrices Ui , Vj and their conjugates
a b
and transposes are all orthogonal designs. For any orthogonal design M =
−b̄ ā
121
and any two-dimensional vector X = [x1 , x2 ]t , M X can be written equivalently as
<a <a
<x1 =x1 −<x2 =x2 =a =x −<x −=x −<x
1 1 2 2
=a
+i ,
<x2 −=x2 <x1 =x1 <b =x2 <x2 =x1 −<x1 <b
=b =b
where <x indicates the real part of x and =x indicates the imaginary part of x. It can
be seen that the roles of M and X are interchanged. Therefore, by careful calculation,
the ML decoding of Sp(2) codes can be shown to be equivalent to
2
cos 2πk
P
sin 2πk
(τ ) (τ −1) P
A1 (X1 ) −C1 (X1 )
(τ ) (τ −1)
cos 2πl
P
B1 (X1 ) −D1 (X1 )
sin 2πl
P
arg max
···
,(5.32)
0≤k,l<P,0≤m,n<Q
2πm
cos( Q + θ)
(τ ) (τ −1)
AN (XN ) −CN (XN )
2πm
sin( Q + θ)
(τ ) (τ −1)
BN (XN ) −DN (XN )
cos( 2πn
Q
+ θ)
sin( 2πn
Q
+ θ)
F
where
(τ ) (τ ) (τ ) (τ )
<xi1 =xi1 −<xi2 =xi2
1 <x(τ ) −=x(τ ) <x(τ ) (τ )
=xi1
(τ ) i2 i2 i1
Ai (Xi ) = √ , (5.33)
2 (τ ) (τ ) (τ )
<xi3 −=xi3 −<xi4 −=xi4
(τ )
(τ ) (τ ) (τ ) (τ )
<xi4 =xi4 <xi3 −=xi3
122
(τ ) (τ ) (τ ) (τ )
=xi1 −<xi1 −=xi2 −<xi2
1 =x(τ ) (τ )
<xi2
(τ )
=xi1
(τ )
−<xi1
(τ ) i2
Bi (Xi ) = √ , (5.34)
2 =x(τ )
(τ )
<xi3
(τ )
−=xi4
(τ )
<xi4
i3
(τ ) (τ ) (τ ) (τ )
=xi4 −<xi4 =xi3 <xi3
(τ −1) (τ −1) (τ −1) (τ −1)
<(xi1 + xi3 ) =(−xi1
) + xi3
<(x(τ −1) + x(τ −1) ) (τ −1)
=(xi2
(τ −1)
− xi4 )
(τ −1) 1 i2 i4
Ci (Xi )= 2
(τ −1) (τ −1) (τ −1) (τ −1)
<(−xi1 + xi3 ) =(xi1 + xi3 )
(τ −1) (τ −1) (τ −1) (τ −1)
<(−xi2 + xi4 ) =(−xi2 − xi4 )
, (5.35)
(τ −1) (τ −1) (τ −1) (τ −1)
<(xi2 + xi4 ) =(−xi2 − xi4 )
(τ −1) (τ −1) (τ −1)
(τ −1)
<(−xi1 − xi3 ) =(−xi1 +
xi3 )
(τ −1) (τ −1) (τ −1) (τ −1)
<(−xi2 + xi4 ) =(xi2 − xi4 )
(τ −1) (τ −1) (τ −1) (τ −1)
<(xi1 − xi3 ) =(xi1 + xi3 )
(τ −1) (τ −1) (τ −1) (τ −1)
=(xi1 + xi3 ) <(xi1 − xi3 )
=(x(τ −1) + x(τ −1) ) <(−x(τ −1) + x(τ −1) )
(τ −1) 1 i2 i4 i2 i4
Di (Xi )= 2
=(−x(τ −1) + x(τ −1) ) <(−x(τ −1) − x(τ −1) )
i1 i3 i1 i3
(τ −1) (τ −1) (τ −1) (τ −1)
=(−xi2 + xi4 ) <(xi2 + xi4 )
, (5.36)
(τ −1) (τ −1) (τ −1) (τ −1)
=(xi2 + xi4 ) <(xi2 − xi4 )
(τ −1) (τ −1)
(τ −1) (τ −1)
=(−xi1 − xi3 ) −
<(xi1 xi3 )
(τ −1) (τ −1) (τ −1) (τ −1)
=(−xi2 + xi4 ) <(−xi2 − xi4 )
(τ −1) (τ −1) (τ −1) (τ −1)
=(xi1 − xi3 ) <(−xi1 − xi3 )
h it
2πk
and cos P
, sin 2πk
P
, cos 2πl
P
, sin 2πl
P
, cos( 2πm
Q
+ θ), sin( 2πm
Q
+ θ), cos( 2πn
Q
+ θ), sin( 2πn
Q
+ θ)
(τ ) (τ )
is the vector of unknowns. Notice that Bi (Xi ) can also be constructed as Ai (−jXi ).
It can be seen that formula (5.32) is quadratic in sines and cosines of the unknowns.
123
Thus, it is possible to use fast decoding algorithms such as sphere decoding to achieve
the exact ML solution in polynomial time.
In this paragraph, the sphere decoding for codes given in (5.20) is discussed. For
each of the angle γi , a sphere decoding is applied and one signal is gained. In doing
the sphere decoding for each γi , matrices Ai , Bi for i = 1, 2, · · · , N are the same as
those in (5.33) and (5.34), but the Ci and Di matrices should be modified to
(τ −1) (τ −1) (τ −1) (τ −1)
<(cos γi xi1 + sin γi xi3 ) =(− cos γi xi1 + sin γi xi3 )
<(cos γi x(τ −1) + sin γi x(τ −1) ) (τ −1)
=(cos γi xi2 − sin γi xi4
(τ −1)
)
i2 i4
Ci =
<(− cos γ x(τ −1) + sin γ x(τ −1) ) (τ −1)
=(cos γi xi1 + sin γi xi3
(τ −1)
)
i i1 i i3
(τ −1) (τ −1) (τ −1) (τ −1)
<(− cos γi xi2 + sin γi xi4 ) =(− cos γi xi2 − sin γi xi4 )
(5.37)
(τ −1) (τ −1) (τ −1) (τ −1)
<(cos γi xi2 + sin γi xi4 ) =(− cos γi xi2 − sin γi xi4 )
(τ −1) (τ −1) (τ −1) (τ −1)
<(− cos γi xi1 − sin γi xi3 ) =(− cos γi xi1 + sin γi xi3 )
(τ −1) (τ −1) (τ −1) (τ −1)
<(− cos γi xi2 + sin γi xi4 ) =(cos γi xi2 − sin γi xi4 )
(τ −1) (τ −1) (τ −1) (τ −1)
<(cos γi xi1 − sin γi xi3 ) =(cos γi xi1 + sin γi xi3 )
and
(τ −1) (τ −1) (τ −1) (τ −1)
=(cos γi xi1 + sin γi xi3 ) <(cos γi xi1 − sin γi xi3 )
=(cos γi x(τ −1) + sin γi x(τ −1) ) <(− cos γi xi2
(τ −1)
+ sin γi xi4
(τ −1)
)
i2 i4
Di =
(τ −1) (τ −1) (τ −1) (τ −1)
=(− cos γi xi1 + sin γi xi3 ) <(− cos γi xi1 − sin γi xi3 )
(τ −1) (τ −1) (τ −1) (τ −1)
=(− cos γi xi2 + sin γi xi4 ) <(cos γi xi2 + sin γi xi4 )
. (5.38)
(τ −1) (τ −1) (τ −1) (τ −1)
=(cos γi xi2 + sin γi xi4 ) <(cos γi xi2 − sin γi xi4 )
(τ −1) (τ −1) (τ −1)
(τ −1)
=(− cos γi xi1 − sin γi xi3 ) <(cos γi xi1
− sin γi xi3 )
(τ −1) (τ −1) (τ −1) (τ −1)
=(− cos γi xi2 + sin γi xi4 ) <(− cos γi xi2 − sin γi xi4 )
(τ −1) (τ −1) (τ −1) (τ −1)
=(cos γi xi1 − sin γi xi3 ) <(− cos γi xi1 − sin γi xi3 )
For each transmission, |Γ| sphere decoding is used and therefore |Γ| signal matrices
are obtained in total. ML decoding given in (5.32) is then used to get the optimal
124
one. The complexity of this decoding algorithm is |Γ| times the original one, but it
is still cubic polynomial in the transmission rate and dimension.
Below are some remarks on the implementation of sphere decoding in our systems.
1. The main idea of sphere decoding is disucssed in Section 2.8. The choice of the
searching radius is very crucial to the speed of the algorithm. Here, the radius
is initialized as a small value and then increase it gradually based on the noise
√
level [HV02]. The searching radius C is initialized in such a way that the
probability that the correct signal is in the sphere is 0.9, that is,
√
P (kvkF < C) = 0.9. (5.39)
If no point is found in this sphere, the searching radius is then raised such
that the probability is increased to 0.99 and so on. Using this algorithm, the
probability that a point can be found during the first search is high. The noise
of the system is given in (2.11). Since Wτ , Wτ −1 and the transmitted unitary
matrix Vzτ are independent, it is easy to prove that the noise has mean zero
and variance 2N I4 . Each component of the 4 × N -dimensional noise vector
has mean zero and variance 2. Therefore the random variable v = kWτ0 k2F has
Gamma distribution with mean 4N . The value of C that satisfies (5.39) can be
easily calculated.
2. From (5.32), it can be seen that the unknowns are in forms of sines and
cosines. Notice that for any α 6= β ∈ [0, 2π), sin α = sin β if and only
if β = (2k + 1)π − α for some integer k. When P, Q are odd, it can be
seen that π cannot be in the set ΘP = { 2πk
P
|k = 0, 1, 2, ..., P − 1}, which is
the set of all possible angles of Ui ’s entries, and π + θ cannot be in the set
125
ΘQ = { 2πk
Q
+ θ|k = 0, 1, 2, ..., P − 1}, which is the set of all possible angles of
Vi ’s entries. Therefore, the map fP : ΘP → {sin x|x ∈ ΘP } by fP (θ) = sin θ
and the map fQ : ΘQ → {sin x|x ∈ ΘQ } by fQ (θ) = sin θ are one-to-one and
onto. The independent unknowns k, l, m, n can thus be replaced equivalently
by their sines: sin 2πk
P
, sin 2πl
P
, sin( 2πm
Q
+ θ), sin( 2πn
Q
+ θ).
3. Notice that there are only four independent unknowns but eight components in
the unknown vector in (5.32). We combine the 2i-th component (with the form
of cos x) and the (2i + 1)-th component (with the form of sin x) for i = 1, 2, 3, 4.
From previous discussions we know that for any value in the set {sin x|x ∈ ΘP }
or {sin x|x ∈ ΘQ }, there is only one possible value in ΘP or ΘQ whose sine equals
the value. Therefore, there is only one possible value of the cosine. In other
words, for any possible value of the 2i-th component, there is one unique value
for the (2i + 1)-th component. Therefore, it is natural to combine the 2i-th and
the (2i+1)-th components. To simplify the programming, while considering the
searching range of each unknown variable, we skip the 2i-th component and only
consider the (2i + 1)-th one. For example, instead of analyzing all the possible
values of sin 2πn
Q
(the 8th component) satisfying q88 sin2 2πn
Q
+q77 (cos 2πn
Q
+S8 )2 ≤
C, all the possible values of sin 2πn
Q
satisfying q88 sin2 2πn
Q
≤ C are considered
[DAML00]. It may seem that more points than needed is searched, but actually
the extra points will be eliminated in the next step of the sphere decoding
algorithm.
4. Complex sphere decoding can also be used here to obtain ML results, which ac-
tually is simplier than the real sphere decoding. However, in all the simulations,
In this section, examples of Sp(2) codes and also the simulated performance of the
codes at different rates are given. The fading coefficient between every transmit-and-
z1 z2
C(z1 , z2 ) = , (5.40)
∗ ∗
−z2 z1
z z2 z3 0
1
−z ∗ z ∗ 0 −z3
2 1
C(z1 , z2 , z3 ) = (5.41)
−z3∗ 0 z1∗
z2
∗ ∗
0 z3 −z2 z2
proposed in [TH02].
127
0
Sp(2) code VS Cayley code and orthogonal design
10
R=2 2−d orthogonal design
R=1.75 Cayley code
P=5, Q=3, R=1.95 Sp(2) code
R=1.94 4−d orthogonal design
−1
10
−2
10
BLER
−3
10
−4
10
−5
10
−6
10
10 12 14 16 18 20 22 24 26 28 30
SNR
Figure 5.3: Comparison of the rate 1.95 Sp(2) code with the rate 1.75 differential
Cayley code, the rate 2, 2 × 2 complex orthogonal design, and the rate 1.94, 4 × 4
complex orthogonal design with N = 1 receive antennas
128
5.7.1 Sp(2) Code vs. Cayley Code and Complex Orthogonal
Designs
The first example is the Sp(2) code with P = 5, Q = 3, θ = 0, that is, entries of the
2π 4π 6π 8π
U matrix of the code are chosen from the 5-PSK signal set {1, ej 5 , ej 5 , ej 5 , ej 5 },
2π 4π
and entries of the V matrix are chosen from the 3-PSK signal set {1, ej 3 , ej 3 }.
Therefore by (5.7), the rate of the code is 1.95. It is compared with 3 code: the
rate 2, 2 × 2 complex orthogonal design given by (5.40), where z1 , z2 are chosen
2π 4π 6π
from the 4-PSK signal set {1, ej 4 , ej 4 , ej 4 }; a rate 1.75 differential Cayley code
with parameters Q = 7, r = 2 [HH02a]; and also the rate 1.94, 4 × 4 complex or-
thogonal design given by (5.41), where z1 , z2 , z3 are chosen from the 6-PSK signal set
2π 4π 6π 8π 10π
{1, ej 6 , ej 6 , ej 6 , ej 6 , ej 6 }. The number of receive antennas is 1. The performance
curves are shown in Figure 5.3. The solid line indicates the BLER of the Sp(2) code.
The lines with circles indicates the BLER of the differential Cayley code. The line
with plus signs and the dashed line show the BLER of the 2 × 2 and 4 × 4 complex
orthogonal designs, respectively. From the plot, it can be seen that the Sp(2) code
has the lowest BLER at high SNR. For example, at a BLER of 10−3 , the Sp(2) code
is 2dB better than the differential Cayley code, even though the Cayley code has a
lower rate, 1dB better than the 4 × 4 complex orthogonal design, and 4dB better than
the 2 × 2 complex orthogonal design.
In this subsection, the same Sp(2) code is compared with a group-based diagonal
code and the K1,1,−1 code both at rate 1.98 [SHHS01]. The K1,1,−1 code is in one of
the 6 types of the finite fixed-point-free groups given in [SHHS01]. The number of
receive antennas is 1. In Figure 5.4, the solid line indicates the BLER of the Sp(2)
code and the line with circles and plus signs show the BLER of the K1,1,−1 code and
129
0
Sp(2) code vs. group based codes
10
−1
10
−2
10
−3
10
BLER
−4
10
−5
10
−6
10
R=1.98 diagonal group code
R=1.98 K1,1,−1 code
P=5, Q=3, R=1.95 Sp(2) code
−7
10
0 5 10 15 20 25 30
SNR
Figure 5.4: Comparison of the rate 1.95 Sp(2) code with the rate 1.98 group-based
K1,1,−1 code and a rate 1.98 group-based diagonal code with N = 1 receive antennas
130
the diagonal code, respectively. The plot indicates that the Sp(2) code is better than
the diagonal code but worse than K1,1,−1 code according to the BLER. For example,
at a BLER of 10−3 , the Sp(2) code is 2dB better than the diagonal code, but 1.5dB
worse than K1,1,−1 group code. However, decoding K1,1,−1 code requires an exhaustive
search over the entire constellation.
0
Sp(2) code VS Orthogonal design
10
−1
10
−2
10
BLER
−3
10
−4
10
−5
10
Figure 5.5: Comparison of the rate 3.13 Sp(2) code with the rate 3, 2 × 2 and 4 × 4
complex orthogonal designs with N = 1 receive antenna
The comparison of the Sp(2) codes with complex orthogonal designs at rate ap-
proximately 3 and 4 is shown in Figures 5.5 and 5.6. In Figure 5.5, the solid line
indicates the BLER of the Sp(2) code of P = 11, Q = 7, θ = 0. The line with circles
shows the BLER of the 2×2 complex orthogonal design (5.40) with z1 , z2 chosen from
8-PSK. The dashed line indicates the BLER of the rate 3, 4 × 4 complex orthogonal
131
0
Sp(2) code vs. orthogonal design
10
−1
10
−2
10
BLER
−3
10
−4
10
Figure 5.6: Comparison of the rate 3.99 Sp(2) code with the rate 4, 2 × 2 and rate
3.99, 4 × 4 complex orthogonal designs with N = 1 receive antenna
design (5.41) with z1 , z2 , z3 chosen from 16-PSK. Therefore, the rate of the Sp(2) code
is 3.13 and the rate of the 2 × 2 and 4 × 4 orthogonal designs is 3. Similarly, in Figure
5.6, the solid line indicates the BLER of the Sp(2) code of P = 23, Q = 11, θ = 0.
The line with circles shows the BLER of the 2 × 2 complex orthogonal design (5.40)
with z1 , z2 chosen from 16-PSK and the dashed line indicates the BLER of the 4 × 4
complex orthogonal design (5.41) with z1 , z2 , z3 chosen from 40-PSK. Therefore, the
rate of the Sp(2) code is 3.99 and the rates of the 2 × 2 and 4 × 4 complex orthogonal
designs are 4 and 3.99. The number of receive antennas is 1. It can be seen from the
two figures that the Sp(2) codes are better than the 4 × 4 complex orthogonal designs
for all the SNRs and are better than the 2 × 2 complex orthogonal designs at high
SNR.
132
5.7.4 Performance of Sp(2) Codes at Higher Rates
In this subsection, simulated performances of the Sp(2) codes at higher rates, as given
in (5.20), are shown for different Γ and are compared with the corresponding original
codes given in (5.6), whose Γ is { π4 }.
indicate the BLER of the Sp(2) codes that are just mentioned, which we call the new
codes, and the solid line shows the BLER of the P = 11, Q = 7, θ = 0 Sp(2) code
with Γ = { π4 } and rate 3.1334, which we call the original code. The figure shows that
the new codes are about only 1dB and 2dB worse than the original one with rates
0.3962 and 0.5805 higher. The BLER of the rate 4 non-group code given in [SHHS01],
which has the structure of product-of-groups, is also shown in the figure by the line
with circles. It can be seen that performance of the new code is very close to that of
the non-group code with rate 0.4704 lower. The result is actually encouraging since
the design of the non-group is very difficult and its decoding needs exhaustive search
over 216 = 65, 536 possible signal matrices.
π π π π 5π
The second example is the P = 9, Q = 5, θ = 0.0377, Γ = { 12 , 6 , 4 , 3 , 12 } + 0.016
Sp(2) code. The rate of the code is therefore 3.3264 by formula (5.21). In Figure 5.8,
the dashed line indicates the BLER of the rate 3.3264 Sp(2) code we just mention,
which we call the new code, and the solid line shows the BLER of the P = 9, Q =
5, θ = 0.0377 Sp(2) with Γ = { π4 } and rate 2.7459, which we call the original code.
The figure shows that the new code is only about 2dB worse than the original one with
rate 0.5805 higher. Also, the BLER of the rate 4 non-group code given in [HH02a]
is shown in the figure by the line with circles. It can be seen that the new code is
133
0
Sp(2) code with different Γ
10
P=11, Q=7, |Γ|=1, R=3.13 Sp(2) code
P=11, Q=7, |Γ|=5, R=3.71 Sp(2) code
P=11, Q=7, |Γ|=3, R=3.53 Sp(2) code
R=4 non−group code
−1
10
−2
10
BLER
−3
10
−4
10
−5
10
20 21 22 23 24 25 26 27 28 29 30
SNR
1dB better than the non-group code with rate 0.6736 lower. As mentioned before,
the result is actually encouraging since the design of the non-group is very difficult
and its decoding needs exhaustive search over 216 = 65, 536 possible signal matrices.
5.8 Conclusion
In this chapter, the symplectic group Sp(n), which has dimension n(2n + 1) and
rank n, are analyzed and differential USTM codes based on Sp(2) are designed. The
group, Sp(2), is not fpf, but a method to design fully-diverse codes which are subsets
of the group are proposed. The constellations designed are suitable for systems with
four transmit antennas and any number of receive antennas. The special symplectic
−1
Sp(2) code VS non−group code
10
P=9, Q=5, |Γ|=1, R=2.75 Sp(2) code
P=9, Q=5, |Γ|=5, R=3.33 Sp(2) code
R=4 non−group code
−2
10
BLER
−3
10
−4
10
−5
10
20 21 22 23 24 25 26 27 28 29 30
SNR
such as sphere decoding. Simulation results show that they have better performance
than the 2 × 2 and 4 × 4 complex orthogonal designs, a group-based diagonal code as
well as differential Cayley codes at high SNR. Although they slightly underperform
the k1,1,−1 finite-group code and the carefully designed non-group code, they do not
need the exhaustive search (of exponentially growing size) required for such codes
and therefore are far superior in term of decoding complexity. Our work shows the
promise of studying constellations inspired by group-theoretic considerations.
135
5.9 Appendices
Proof: First assume that the determinant is zero and prove that O1 = ±J Ō2 . As-
sume det (S1 − S2 ) = 0. If det O1 = 0, from Lemma 5.5, O1 = 022 . Therefore
1 0 O2 2
det (S1 − S2 ) = √ det = det O2 = 0.
2 −Ō2 0
β
Ō1 = JO2 ,
det O2
or equivalently,
β
O1 = J Ō2
det O2
O2∗
since O2−1 = det O2
by Lemma 5.5. Since a = 1 we have det O1 = det O2 . Therefore,
the following equations can be obtained.
2 2
β β β
det O2 = det O1 = det( J Ō2 ) = det O2 = det O2 .
det O2 det O2 det O2
136
Therefore,
β
= ±1.
det O2
Thus, O1 = ±J Ō2 .
Now assume that O1 = ±J Ō2 and prove that det(S1 − S2 ) = 0. First, assume that
O1 is invertible. If O1 = ±J Ō2 , we have O1−1 = (±J Ō2 )−1 = ±Ō2−1 J¯−1 = ∓Ō2−1 J.
From (5.11),
det (S1 − S2 )
1
= √ det O1 det(±JO2 ∓ Ō2 Ō2−1 JO2 )
2
1
= √ det O1 det(±JO2 ∓ JO2 )
2
= 0.
Secondly, assume that O1 is not invertible, that is det O1 = 0. From Lemma 5.5,
O1 = 022 . Therefore, from O1 = ±J Ō2 , O2 = 022 . Thus, S1 − S2 = 044 and
det(S1 − S2 ) = 0.
Now what left to be proved is that O1 = ±J Ō2 is equivalent to (5.13). By (5.10),
it is equivalent to
U1 V1 − U2 V2 = ±J(Ū1 V1 − Ū2 V2 ),
and thus,
“
m
”
k l2 k2 l2 j 2π Q2 +θ n2
1 e P
2πj 2
e2πj P 0 1 e−2πj P e−2πj P e ej (2π P +θ)
= ∓ “
n
” “
m
”
2 l2
−e−2πj P e−2πj P
k2
−1 0
l2
−e2πj P e2πj P
k2
−e
−j 2π Q2 +θ
e
−j 2π Q2 +θ
h “k m ” i h “
k n
” i h “
l m
” i h “
l n
” i
j 2π P1 + Q1 +θ j 2π P1 − Q1 −θ j 2π P1 + Q1 +θ j 2π P1 − Q1 −θ
e ±e −e ±e
⇔ h “
k1 m1
” i h “
k1 n1
” i h “
l1 m1
” i h “
l n
” i
−j 2π P − Q −θ −j 2π P + Q +θ −j 2π P − Q −θ −j 2π P1 + Q1 +θ
±e −e −e ∓e
h “
k m
” i h “
k n
” i h “
l m
” i h “
l n
” i
j 2π P1 − Q1 −θ j 2π P1 + Q1 +θ j 2π P1 − Q1 −θ j 2π P1 + Q1 +θ
∓e +e +e ±e
h “ ” i h “ ” i h “ ” i h “ ” i
k1 m k1 n l1 m l n
−j 2π + Q1 +θ −j 2π − Q1 −θ −j 2π + Q1 +θ −j 2π P1 − Q1 −θ
e P
±e P
−e P
±e
h “
k2 m2
” i h “
k2 n2
” i h “
l2 m2
” i h “
l2 n2
” i
j 2π + +θ j 2π − −θ j 2π + +θ j 2π − −θ
e
P Q
±e P Q
−e P Q
±e P Q
= h “
k m
” i h “
k n
” i h “
l m
” i h “
l n
” i
−j 2π P2 − Q2 −θ −j 2π P2 + Q2 +θ −j 2π P2 − Q2 −θ −j 2π P2 + Q2 +θ
±e −e −e ∓e
h “
k m
” i h “
k n
” i h “
l m
” i h “
l n
” i
j 2π P2 − Q2 −θ j 2π P2 + Q2 +θ j 2π P2 − Q2 −θ j 2π P2 + Q2 +θ
∓e +e +e ±e
h “ ” i h “ ” i h “ ” i h “ ” i
k2 m k2 n l2 m l2 n
−j 2π + Q2 +θ −j 2π − Q2 −θ −j 2π + Q2 +θ −j 2π − Q2 −θ
e P
±e P
±e P
−e P
“ ” “ ” “ ” “ ”
k m k m l m l m
2θj 2πj P1 + Q1 2πj P2 + Q2 2πj P1 + Q1 2πj P2 + Q2
e e −e ±e ∓e
“
k n
” “
k n
” “
l n
” “
l n
”
2πj P1 − Q1 2πj P2 − Q2 2πj P1 − Q1 2πj P2 − Q2
= ∓e ±e +e −e
⇔ “
k1 n1
” “
k2 n2
” “
l1 n1
” “
l2 n2
” .
e 2θj e2πj P + Q − e2πj P + Q ± e2πj P + Q ∓ e2πj P + Q
“
k m
” “
k m
” “
l m
” “
l m
”
2πj P1 − Q1 2πj P2 − Q2 2πj P1 − Q1 2πj P2 − Q2
= ±e ∓e −e +e
Proof: Assume ejθ1 + ejθ2 + ejθ3 + ejθ4 = 0, then the following series of equations are
138
true.
θ1 +θ2
θ1 −θ2 θ1 −θ2
θ3 +θ4
θ3 −θ4 θ3 −θ4
j j −j j j −j
e 2 +e e +e 2 e +e 2 =0 2 2 2
θ1 − θ 2 θ1 +θ2 θ3 − θ4 j θ3 +θ4
⇒ 2 cos ej 2 = −2 cos e 2
2 2
θ1 −θ2 = ± θ3 −θ4 + 2kπ
θ1 −θ2 = ± θ3 −θ4 + (2k + 1)π
2 2 2 2
⇒ or ,
θ1 +θ2 = θ3 +θ4 + (2l + 1)π
θ1 +θ2 = θ3 +θ4 + 2lπ
2 2 2 2
for some integers k and l. Without loss of generality, only the first case is considered
P4 P1
P6 O P5
P3 P2
This lemma can also be proved easily in a geometric way. As in Figure 5.9,
139
P1 , P2 , P3 , P4 are the four points on the unit circle that add up to zero, and O is the
center of the unit circle, which is the origin. Without loss of generality, assume P2 is
−→
the point that is closest to P1 . Since the four points add up to zero, OP 5 , which is the
−−→ −→ −→ −→ −→
summation of OP1 and OP 2 , and OP 6 , which is the summation of OP 3 and OP 4 , are
on the same line with inverse directions and have the same length. Since OP1 = OP3 ,
P1 P5 = OP2 = OP4 = P3 P6 , 4OP1 P5 = 4OP3 P6 . Therefore, ∠P1 OP5 = ∠P3 OP6 .
−→ −→
Thus, OP 1 and OP 3 are on the same line with inverse directions and have the same
−→ −→
length, which means that OP 1 + OP 3 = 0.
or
k n l n
e2πj( P1 − Q1 ) = −e2πj( P2 − Q2 )
k n l n
.
e2πj( P2 − Q2 ) = −e2πj( P1 − Q1 )
Without loss of generality and to simplify the proof, only the first case is discussed
140
here. From the first set of equations, there exist integers i1 and i2 such that
2π( k1 − n1
P Q
) = 2π( Pl1 − n1
Q
) + 2πi1
,
2π( k2 −
n2
) = 2π( Pl2 − n2
) + 2πi2
P Q Q
and therefore
k1 −l1
P
= i1
. (5.42)
k2 −l2
P
= i2
mj
k l
mi k l
2πi 2πi Pi 2πi Pi 2πi 2πi Pj 2πi Pj
e Q cos γi e + sin γi e = e cos γj e
Q + sin γj e (5.43)
n
kj lj
n
−2πi Qi ki li
−2πi Qj
e sin γi e2πi P − cos γi e2πi P = e sin γj e2πi P − cos γj e2πi P (5.44)
n
k l
ni ki l
2πi Pi 2πi Qj 2πi Pj 2πi Pj
e2πi Q cos γi e2πi P + sin γi e = e cos γj e + sin γj e (5.45)
m
k l
mi ki l
2πi Pi −2πi Qj 2πi Pj 2πi Pj
e−2πi Q sin γi e2πi P − cos γi e = e sin γj e − cos γj e (5.46)
The square of the norm of the left hand side of (5.43) equals
ki
li 2
cos γi e2πi P + sin γi e2πi P
ki li ki li
= (cos γi cos 2π + sin γi cos 2π )2 + (cos γi sin 2π + sin γi sin 2π )2
P P P P
k i l i k i l i
= cos2 γi cos2 2π + sin2 γi cos2 2π + 2 cos γi cos 2π sin γi cos 2π +
P P P P
ki li ki li
cos2 γi sin2 2π + sin2 γi sin2 2π + 2 cos γi sin 2π sin γi sin 2π
P P P P
k i k i k i l i k i li
= cos2 2π + sin2 2π + 2 cos γi sin γi (cos 2π cos 2π + sin 2π sin 2π )
P P P P P P
ki − li
= 1 + sin 2γi cos 2π .
P
Similarly, the square of the norm of the right hand side of (5.43) is equal to
kj − lj
1 + sin 2γj cos 2π .
P
ki − li kj − lj
1 + sin 2γi cos 2π = 1 + sin 2γj cos 2π ,
P P
which is equivalent to
ki − li kj − lj
sin 2γi cos 2π = sin 2γj cos 2π . (5.47)
P P
142
Define r = |ki −li | and s = |kj −lj |. Since ki , li , kj , lj ∈ [0, P ), ki −li , kj −lj ∈ (−P, P ).
kj −lj
Therefore, r, s ∈ [0, P ), cos 2π Pr = cos 2π kiP−li and cos 2π Ps = cos 2π P
. Thus,
r s
sin 2γi cos 2π = sin 2γj cos 2π .
P P
“k mj
” “l mj
”
k mi j l mi j
2πj ( Pi + ) − e2πj + 2πj ( Pi + ) − e2πj +
cos γi e Q P Q
= − sin γi e Q P Q
and
k i − k j mi − m j
“ k +k m +m
”
2πj i2P j + i Q j
2j cos γi sin 2π + e
2P 2Q
li − l j m i − m j
“ l +l m +m
”
2πj i2P j + i2Q j
= −2 sin γi sin 2π + e .
2P 2Q
Therefore,
k i − k j mi − m j li − l j m i − m j
cos γi sin 2π + = ± sin γi sin 2π + . (5.48)
2P 2Q 2P 2Q
From (5.23),
k i − k j mi − m j li − l j m i − m j
cos γi sin 2π + 6= ± sin γi sin 2π + ,
2P 2Q 2P 2Q
ki = k j
li = l j .
mi = m j
“k nj
” “l nj
”
k n j l n j
2πj ( Pi − Qi ) 2πj − 2πj ( Pi − Qi ) 2πj −
sin γi e −e P Q
= cos γi e −e P Q
and
k i − k j ni − n j
“ k +k n +n
”
2πj i2P j − i Q j
2i sin γi sin 2π − e
2P 2Q
li − l j ni − n j
“ l +l n +n
”
2πj i2P j − i2Q j
= 2j cos γi sin 2π − e .
2P 2Q
Therefore,
k i − k j ni − n j li − l j n i − n j
sin γi sin 2π − = ± cos γi sin 2π − . (5.49)
2P 2Q 2P 2Q
By a similar argument,
ki = k j
li = l j .
ni = n j
6.1 Abstract
In this chapter, the special unitary Lie group SU (3) is discussed. Based on the
the codes can be calculated in a fast way. Necessary conditions for full-diversity of
the codes are also proved. Our conjecture is that they are also sufficient conditions.
Simulation results given in Section 6.6 show that the codes have better performances
than the group-based codes [SHHS01] especially at high rates and are as good as
eterization
Definition 6.1 (Special unitary group). [Sim94] SU (n) is the group of complex
From the definition, SU (n) is the group of complex n × n unitary matrices with
determinant 1. It is called the special unitary group. It is also known that SU (n) is
a compact, simple, simply-connected Lie group of dimension n2 − 1 and rank n − 1.
Since we are most interested in the case of rank 2, here the focus is on SU (3), which
has dimension 8. The following theorem on the parameterization of SU (3) is proved.
u22 u23
matrix of U . This theorem indicates that any matrix in SU (3) can
u32 u33
be written as a product of three 3 × 3 unitary matrices which are basically SU (2)
since they are actually reducible 3 × 3 unitary representations of SU (2) by adding
an identity block. Now let’s look at the number of degrees of freedom in U . Since
Φ, Ψ ∈ SU (2), there are 6 degrees of freedom in them. Together with the 2 degrees of
freedom in the complex scalar α, the dimension of U is 8, which is exactly the same
146
as that of SU (3). Based on (6.1), matrices in SU (3) can be parameterized by entries
of Φ, Ψ and α, that is, any matrix in SU (3) can be identified with a 3-tuple (Φ, Ψ, α).
From (6.1), it can also be seen that all the three matrices are block-diagonal with a
unit block. The first and third matrices have the unit element at the (1, 1) entry and
the second matrix has the unit element at the (2, 2) entry. To get a more symmetric
parameterization method, the following corollary is proved.
Corollary 6.1. Any matrix U belongs to SU (3) if and only if it can be written as
jω
p jβ
αe 0 1 − |α| e
2
1 012 Ψ 021
U = 0 1 0
, (6.2)
021 Φ p 012 1
− 1 − |α|2 e−jβ 0 αe−jω
Proof: First, it is easy to prove that any matrix with the structure in (6.2) is in
SU (3) by checking the unitary and determinant conditions. What is left is to prove
that any matrix in SU (3) can be written as the formula
in (6.2).
0 0 −1
For any matrix U ∈ SU (3), define U 0 = U 0 1 0 . It is easy to check
1 0 0
that U 0 is also a matrix in SU (3). Therefore, from Theorem 6.1, there exist matrices
Φ0 , Ψ00 ∈ SU (2) and a complex scalar a, such that
p
a 0 − 1 − |a|2
1 0 12 1 012
U0 =
0 1 0
.
0 021 Ψ00
021 Φ p
1 − |a|2 0 ā
147
0 0
00 ψ11 ψ12 0 2 0 2
Let Ψ = , where |ψ11 | + |ψ12 | = 1. Note that
0 0
−ψ̄12 ψ̄11
0 0 1 0 0 −1
1 0 Ψ 0 0
12
21
= 0 1 0
0 1 0 ,
021 Ψ00 012 1
−1 0 0 1 0 0
0 0
ψ̄11 −ψ̄12
where we have defined Ψ0 = (it is easy to see that Ψ0 ∈ SU (2)).
0 0
ψ12 ψ11
Therefore,
U0
p
a 0 − 1 − |a|2 0 0 1
0 0 −1
1 012 Ψ0 021
=
0 1 0 0
1 0
0 1 0
021 Φ0 p 012 1
1 − |a|2 0 ā −1 0 0 1 0 0
p
2
1 − |a| 0 −1
a 0 0
1 012 Ψ0 021
=
0 1 0
0
1 0
0
021 Φ p 012 1
−ā 0 1 − |a| 2 1 0 0
and p
2
1 − |a| 0
a
0
1 012 Ψ 021
U = 0 1 0
.
021 Φ 0 012 1
p
−ā 0 1 − |a|2
p
It is easy to see that Φ, Ψ ∈ SU (2). (6.2) is obtained by letting α = 1 − |a|2 and
β = ∠a,
The parameter ω does not add any degrees of freedom as can be seen in the
proof of the corollary. However, as will be seen later that it is important to our code
design. From formula (6.2), any matrix in SU (3) can be written as a product of three
To get finite constellations of unitary matrices from the infinite Lie group SU (3), the
parameters, Φ, Ψ, α, β, ω, need to be sampled appropriately. We first sample Φ and
Ψ. As discussed in Chapter 2, Alamouti’s orthogonal design
x y
1
p x, y ∈ C
|x|2 + |y|2 −x̄ ȳ
is a faithful representation of the group SU (2). To get a discrete set, x and y must
belong to discrete sets. As is well known, the PSK signal is a very good and simple
149
modulation. Therefore, Φ and Ψ are chosen as follows.1
q
2πj Pp 2πj Q r
2πj R 2πj Ss
1 e e 1 e e
Φ= √ q
and Ψ= √ ,
2 −e−2πj Q e−2πj P
p
2 −e−2πj Ss e
r
−2πj R
r
p q s
θp,q = 2π − and ξr,s = 2π + . (6.3)
P Q R S
(6.2) becomes
r s
√1 e2πj R √1 e2πj S
1 0 0 2 2
0
q
√1 e2πj P
p
√1 e2πj Q diag eθp,q , 1, e−θp,q diag {eξr,s , 1, e−ξr,s } − √1 e−2πj Ss r
√1 e−2πj R .
0 0
2 2 2 2
q p
0 − √12 e−2πj Q √1 e−2πj P
2
0 0 1
Define
jθp,q
e 0 0
(1)
q
A(p,q) = √1 e 2πj Pp √1 e
2πj Q , (6.4)
0 2 2
e−jθp,q
q p
0 − √12 e−2πj Q √1 e−2πj P e−jθp,q
2
1
PSK symbols have been analyzed in [SWWX04], where it is shown that having a full parame-
terization of SU (2), that is, parameterizing x and y fully (both the norms and the arguments) gives
about 1-2 dB improvement but with a much more complicated decoding. Here, to make our main
idea clear, x and y are chosen as simple P SK signals.
150
which is the product of the first two matrices in the above formula, and
r s
√1 e2πj R ejξr,s √1 e2πj S ejξr,s 0
2 2
(1)
B(r,s) = 1 −2πj Ss √1 e
r
−2πj R , (6.5)
− √2 e 2
0
0 0 e−jξr,s
which is the product of the last two matrices. The following codes are obtained.
n o
(1) (1) (1)
C(P,Q,R,S) = A(p,q) B(r,s) |p ∈ [0, P ), q ∈ [0, Q), r ∈ [0, R), s ∈ [0, S) (6.6)
The set is a subset of SU (3). We call it the SU (3) code. There are all together P QRS
elements in the code (6.6). Since the channel is used in blocks of three transmissions,
1
R= log2 (P QRS). (6.7)
3
p1 −p2 q −q
2πj ( − 12Q 2 ) cos 2π p1 −p2 q1 −q2
x=e 2P
2P
+ 2Q
2πj (
r −r
− 12R 2
s −s
− 12S 2 ) cos 2π r1 −r2 s1 −s2
. (6.8)
w=e 2R
− 2S
complex scalar c.
1
ζC (1) (P, Q, R, S) = min |2=[(1 − Θ̄x)(1 − Θw)]|1/3 . (6.10)
2 δp ∈ (−P, P ), δq ∈ (−Q, Q),
δr ∈ (−R, R), δs ∈ (−S, S)
(δp , δq , δr , δs ) 6= (0, 0, 0, 0)
Actually, instead of 16L, less than 8L calculations is enough because of the symmetry
in (6.10). Note that
|∆(δp , δq , δr , δs )|
Therefore, only half of the determinants are needed to be calculated. The computa-
tional complexity is greatly reduced especially for codes of high rates, that is, when
P QRS is large.
From the symmetries of δp , δq , δr , δs in (6.10), it is easy to prove that
Table 6.2: Diversity products of some group-based codes and a non-group code
which indicates that switching the positions of P and Q, R and S, or (P, Q) and (R, S)
does not affect the diversity product. But generally, ζC (1) (P, Q, R, S) 6= ζC (1) (P, R, Q, S).
Diversity products of some of the SU (3) codes are given in Table 6.1. Diversity
products of some of the group-based codes and non-group codes in [SHHS01] are
also given in Table 6.2 for comparison. It can be seen from the tables that diversity
products of SU (3) codes are about the same as those of the group-based codes at
low rates, but when the rates are high, diversity products of SU (3) codes are much
greater than those of the group-based codes at about the same rates. However,
diversity products of the SU (3) code at rate 3.9195, which is 0.0510, is smaller than
that of the non-group code at rate 4.01, which is 0.0933. But simulated performance
shows that the code performs as well as the non-group code, which will be seen in
Section 6.6.
Proof: First, we prove that gcd(P, Q) = 1 is a necessary condition for the set
(1)
{A(p,q) } to be fully diverse and thus a necessary condition for full diversity of the
code. Let
jθ1
e 0 0
(1)
p q1
A(p1 ,q1 ) =
0 √1 e 2πj P1 √1 e2πj Q e −jθ1
2 2
q p
−2πj Q1 −2πj P1
0 − √12 e √1 e
2
e−jθ1
and
jθ2
e 0 0
(1)
p q
A(p2 ,q2 ) = √1 e 2πj P2 √1 e 2πj Q2 ,
0 2 2
e−jθ2
q2 p2
0 − √12 e−2πj Q √1 e−2πj P e−jθ2
2
(1) (1)
det A(p1 ,q1 ) − A(p2 ,q2 ) = ejθ1 − ejθ2 X
p2 q2 p2 q2
ejθ1 − ejθ2 = e2πj ( P − Q ) − e2πj ( P − Q ) = 0.
(1)
Therefore, gcd(P, Q) = 1 is a necessary condition for the set {A(p,q) } to be fully
diverse. By a similar argument, gcd(R, S) = 1 is also a necessary condition.
We are not able to give sufficient conditions for full diversity of the SU (3) codes.
Here is our conjecture.
Conjecture 6.1 (Sufficient conditions for full diversity). The conditions, that
any two of the integers (P, Q, R, S) are relatively prime and none of them are even,
155
(1)
are sufficient for code C(P,Q,R,S) to be fully diverse.
and2
r
0 p q 0 s
θp,q = 2π ± ± ξr,s = 2π ± ± . (6.12)
P Q R S
n o
(2) (2) (2)
C(P,Q,R,S) = A(p,q) B(r,s) |p ∈ [0, P ), q ∈ [0, Q), r ∈ [0, R), s ∈ [0, S) . (6.13)
They are not subsets of the Lie group SU (3) any more since the determinant of
0 0
the matrices is now ej(θ −ξ ) which is not 1 in general. However, the matrices in the
codes are still unitary. Since any matrix in the code is a product of two unitary
matrices (they are not representations of SU (2) anymore because their determinants
are no longer 1), we call it AB code. Simulations show that they have the same
and sometimes slightly better diversity products than the codes in (6.6), which is
2
There are actually 16 possibilities in (6.12). Different codes are obtained by different choices of
signs. Two of them are used in this chapter.
156
not surprising since we now get rid of the constraint of unit determinant. In the
next section, it will be seen that the handy structure of AB codes results in a fast
maximum-likelihood decoding algorithm. The code has the same rate as the code in
(6.6). It is easy to see that any matrix U in the two codes can be identified by the
4-tuple (p, q, r, s).
Theorem 6.4 (Calculation of the diversity product). For any two matrices
U1 (p1 , q1 , r1 , s1 ) and U2 (p2 , q2 , r2 , s2 ) in the code C (2) ,
0 0
det(U1 − U2 ) = 2ejθ1 e−jξ2 Θ̄1 Θ̄2 =[(Θ1 − Θ̄1 w)(Θ̄2 − Θ2 x)], (6.14)
p1 −p2 q −q
where Θ1 = e2πj (± 2P
± 12Q 2 ) and Θ = e2πj (± r12R
−r2 s −s
± 12S 2 )
.
2
1
ζC (2) (P, Q, R, S) = min |2=[(Θ1 − Θ̄1 w)(Θ̄2 − Θ2 x)]|1/3 . (6.15)
2 δp ∈ (−P, P ), δq ∈ (−Q, Q),
δr ∈ (−R, R), δs ∈ (−S, S)
(δp , δq , δr , δs ) 6= (0, 0, 0, 0)
the determinants of difference matrices are enough to obtain the diversity prod-
uct. AB codes also have the symmetry that ζC (2) (P, Q, R, S) = ζC (2) (Q, P, R, S) =
ζC (2) (P, Q, S, R) = ζC (2) (R, S, P, Q). But generally, ζC (1) (P, Q, R, S) 6= ζC (1) (P, R, Q, S).
As mentioned before, for AB codes, the choices for the angles θ and ξ are not
unique. Based on (6.12), there are actually 16 possible choices. Two of them are used
here,
r
0 p q s
θ = 2π − + , ξ 0 = 2π − −
P Q R S
and
0 p q 0 r s
θ = 2π − , ξ = 2π − − ,
P Q R S
157
(P, Q, R, S) Rate Type Diversity Product
(1, 3, 4, 5) 1.9690 I 0.2977
(4, 5, 3, 7) 2.9045 I 0.1413
(3, 7, 5, 11) 3.3912 II 0.0899
(4, 7, 5, 11) 3.5296 I 0.0731
(5, 9, 7, 11) 3.9195 I 0.0510
(5, 8, 9, 11) 3.9838 II 0.0611
(9, 10, 11, 13) 4.5506 II 0.0336
(11, 13, 14, 15) 4.9580 II 0.0276
are high, diversity products of AB codes are much greater than those of group-based
codes at about the same rates. However, diversity product of the AB code at rate
3.9838, which is 0.0661, is smaller than that of the non-group code at rate 4.01,
which is 0.0933. However, simulated performances in Section 6.6 show that the code
gcd(R, S) = 1.
n o
(2)
Proof: We first prove that the set A(p,q) , p ∈ [0, P ), q ∈ [0, Q) is fully diverse if
(2)
and only if P and Q are relatively prime. For any two different matrices A(p1 ,q1 ) and
(2)
A(p2 ,q2 ) in the set, denote
jθ1 jθ2
e 0 0 e 0 0
(2)
p q (2)
p q2
A(p1 ,q1 ) = √1 e 2πj P1 √1 e 2πj Q1 A(p2 ,q2 ) = √1 e 2πj P2 √1 e2πj ,
0 0
Q
2 2 2 2
q1 p1
q2 p2
0 − √12 e−2πj Q √1 e−2πj P
2
0 − √12 e−2πj Q √1 e−2πj P
2
158
where θ1 = 2π(± pP1 ± q1
Q
) and θ2 = 2π(± pP2 ± q2
Q
). Therefore,
(2) (2)
det A(p1 ,q1 ) − A(p2 ,q2 )
p1 p2 q1 q2
2πj 2πj
√1 e2πj P − √1 e2πj P √1 e Q − √1 e Q
2 2 2 2
ejθ1 − ejθ2 det
= q q p p
−2πj Q1 −2πj Q2 −2πj P1 −2πj P2
− √12 e + √12 e √1 e
2
− √1 e
2
1 jθ1 jθ2
2πj p1 p
2
2πj P2 2πj qQ1 q
2
2πj Q2
= e −e e P −e + e −e .
2
p1 p2 q1 q2
The second factor equals zero if and only if e2πj P = e2πj P and e2πj Q = e2πj Q . Since
p1 , p2 ∈ [0, P ) and q1 , q2 ∈ [0, Q), this is equivalent to (p1 , q1 ) = (p2 , q2 ), which cannot
be true since the two matrices are different. Therefore, the determinant equals zero
if and only if ejθ1 = ejθ2 .
Now assume that gcd(P, Q) = G > 1, that is, P and Q are not relatively prime.
P Q
When p1 −p2 = G
and q1 −q2 = − G (since G divides both P and Q, this is achievable,)
p 1 − p 2 q1 − q 2 1 1
θ1 − θ2 = 2π ± ± = 2π ± ± − = 0,
P Q G G
n o
jθ1 jθ2 (2)
which means that e =e . Therefore, the set A(p,q) , p ∈ [0, P ), q ∈ [0, Q) is not
fully diverse.
Now assume that gcd(P, Q) = 1. If ejθ1 = ejθ2 , θ1 − θ2 = 2kπ for some integer
±kQ∓(q1 −q2 )
k, which means that ± p1 −p
P
2
± q1 −q2
Q
= k. Therefore, p1 −p2
P
= Q
. Since
gcd(P, Q) = 1, P |(p1 − p2 ). However, because p1 − p2 ∈ (−P + 1, P − 1), the only
q1 −q2
possibility is that p1 − p2 = 0. Therefore, Q
= ±k. From q1 − q2 ∈ (−Q +
1. Necessary conditions for full diversity of the type I AB code are that any two of
P, Q, R, S are relatively prime and at most one of the four integers P, Q, R, S
is even.
2. Necessary conditions for the full diversity of the type II AB code are gcd(P, Q) =
gcd(R, S) = 1 and at most one of the four integers P, Q, R, S is even.
n o n o
(2) (2)
Proof: Since for code C (2) to be fully diverse, A(p,q) and B(r,s) must be
fully-diverse. Therefore, from Theorem 6.5, gcd(P, Q) = gcd(R, S) = 1 are necessary
where
2πj (
p1 −p2 q −q p 1 − p 2 q1 − q 2
− 12Q 2 ) cos 2π π π
x = e 2P + = ej 2 cos = 0,
2P 2Q 2
r1 −r2 s1 −s2 r1 − r 2 s1 − s 2 π
w = e2πj (− 2R − 2S ) cos 2π
π
− = e−j 2 cos = 0,
2R 2S 2
p1 −p2 q1 −q2 r1 −r2 s1 −s2
Θ1 = e2πj (± 2P ± 2Q ) = e±j 2 , and Θ2 = e2πj (± 2R ± 2S ) = e±j 2 .
π π
Therefore,
fully diverse. Therefore, necessary conditions for code C (2) to be fully-diverse are
gcd(P, Q) = gcd(R, S) = 1 and among the four integers P, Q, R, S, at most one is
even.
What is left is to prove that for type I AB code to be fully diverse, gcd(P, R) =
We are not able to give sufficient conditions for the full diversity of the AB codes.
Our conjectures are that the necessary conditions are also sufficient.
Conjecture 6.2 (Sufficient conditions for full diversity). The necessary con-
ditions given in Theorem 6.6 are also sufficient conditions for the two types of AB
2
2
(2) (2)
(2)∗ (2)
arg min
Xτ − A(p,q) B(r,s) Xτ −1
= arg min
A(p,q)Xτ − B(r,s) Xτ −1
.
p,q,r,s F p,q,r,s F
(2)
Therefore, the decoding formula for code C(P,Q,R,S) can be written as
e−jθp,q
0 0 xτ,11 · · · xτ,1N
p q
2πj Q
arg max
0 √1 e−2πj P − √1 e τ,21 · · · xτ,2N
x
p,q,r,s
2 2
q p
−2πj Q
0 √1 e √1 e2πj P xτ,31 · · · xτ,3N
2 2
2
r s
√1 e2πj R √1 e2πj S 0 xτ −1,11 · · · xτ −1,1N
2 2
s r
−
− √12 e−2πj S √12 e−2πj R 0 τ −1,21 · · ·
xτ −1,2N
0 0 e−jξr,s xτ −1,31 · · · xτ −1,3N
F
2
xτ −1,11 xτ −1,21
xτ,11
0 0 0 − √
2
− √
2
0
e−jθp,q
0 x̄τ,21 x̄τ,31 x̄τ −1,21 x̄τ −1,11
√ − √ 0 − √2 √
2 2 2 e2πj Pp
xτ,31 xτ,21
0
√
2
√
2
−xτ −1,31 0 0
−2πj Q
q
. .. .. .. .. .. e
= arg max
..
−jξ 0
,
. . . . .
p,q,r,s
e r,s
xτ −1,1N xτ −1,2N
xτ,1N 0 0 0 − √ − √
2 2 e2πj Rr
x̄τ,2N x̄τ,3N x̄τ −1,2N x̄τ −1,1N
0
√
2
− √
2
0 − √2 √
2
s
2πj
xτ,3N xτ,2N
e S
0 √
2
√
2
−xτ −1,3N 0 0
F
where xt,ij indicates the (i, j)-th entry of the M × N matrix Xt for t = τ, τ − 1. The
(2) (2)
equality is obtained since the matrices A(p,q) and B(r,s) have orthogonal structure. It
is easy to see that the formula inside the norm is linear in the PSK unknown signals.
Therefore, sphere decoding for complex channels proposed in [HtB03] can be used with
0 0
slight modification. The only difference here is that the unknowns e−jθp,q and e−jξr,s
p q
are not independent unknown PSK signals but are determined by e2πj P , e−2πj Q and
162
r s
e2πj R , e2πj S . Therefore, in the sphere decoding, instead of searching over the intervals
0 0
for the unknowns e−jθp,q and e−jξr,s , their values are calculated by values of p, q and
is chosen to be too small, then there may be no point in the sphere being searched.
It is better to start with a small value then increase it gradually. In [DAML00], the
authors proposed to choose the packing radius or the estimated packing radius to be
the initial searching radius. Here, another initialization for the searching radius based
on the noise level as in [HV02] and [JH03b] is used. The noise of the system is given
in (2.11). Since Wτ , Wτ −1 and the transmitted unitary matrix, Vzτ , are independent,
it is easy to prove that the noise matrix has mean zero and variance 2N I3 . Each
component of the 3 × N -dimensional noise vector has mean zero and variance 2.
Therefore the random variable v = kWτ0 k2F has Gamma distribution with mean 3N .
√
The searching radius C is initialized in such a way that the probability that the
√
correct signal is in the sphere is 0.9, that is, P (kvkF < C) = 0.9. If no point is
found in the sphere, then the searching radius is raised such that the probability is
increased to 0.99 and so on. Using this algorithm, the probability that a point can be
found during the first search is high. For more details of sphere decoding and sphere
decoding for complex channels, please refer to [DAML00] and [HtB03].
Although SU (3) codes also have the structure of products of two unitary matrices
(1) (1)
A(p,q) and B(r,s) , since the two unitary matrices do not have the orthogonal design
163
structure, we cannot find a way to simplify the decoding. Therefore, for SU (3) codes,
exhaustive search is used to obtain the ML results.
In this section, examples of both SU (3) codes and the two types of AB codes are shown
and also the simulated performance of the codes at different rates. The number of
transmit antennas is three. The fading coefficient from each transmit antenna to
each receive antenna is modeled independently as a complex Gaussian variable with
zero-mean and unit-variance and keeps constant for 2M = 6 channel uses. At each
channel use, zero-mean, unit-variance complex Gaussian noise is added to each receive
antenna. The block error rate (BLER), which corresponds to errors in decoding the
3 × 3 transmitted matrices, is demonstrated as the error event of interest. The
comparison of the proposed codes with some of the group-based codes and the non-
group code in [SHHS01] is also shown.
the rate of the code is 1.9690. From Table 6.3, diversity product of the code is 0.2977.
Its BLER is compared with the G21,4 group code at rate R = 1.99 with diversity prod-
uct 0.3851 and also the best cyclic group code at rate 1.99, whose diversity product is
0.3301, with u = (1, 17, 26). The number of receive antennas is one. The performance
curves are shown in Figure 6.1. The solid line indicates the BLER of the type I AB
code. The solid line with circles indicates the BLER of the G21,4 code and the dashed
line indicates the BLER of the cyclic group code. It can be seen from the plot that
the performance of the three codes is close to each other. The AB code is a little
164
0
10
G code R=1.99
21,4
(1,3,4,5) AB code R=1.97
cyclic group code R=1.99
−1
10
−2
10
BLER
−3
10
−4
10
−5
10
−6
10
10 12 14 16 18 20 22 24 26 28 30
SNR (dB)
Figure 6.1: Comparison of the rate 1.9690, (1, 3, 4, 5) type I AB code with the rate
1.99 G21,4 code and the best rate 1.99 cyclic group code
165
(0.5dB-1dB) worse than the G21,4 code and better (0.5dB-1dB) than the cyclic group
code. Notice that the decoding of both group-based codes needs exhaustive search
but the AB code has a fast decoding method. Therefore, at rate approximately 2, the
AB code is as good as the group-based codes with far superior decoding complexity.
R≈3
0
10
G171,64 code R=3
(4,5,3,7) AB code R=2.90
(7,9,11,1) SU(3) code R=3.15
(3,7,5,11) AB code R=3.39
(4,7,5,11) AB code R=3.53
(3,7,5,11) SU(3) code R=3.39
−1
10
−2
10
bler
−3
10
−4
10
10 12 14 16 18 20 22 24 26 28 30
SNR (dB)
Figure 6.2: Comparison of the 1) rate 2.9045, (4, 5, 3, 7) type I AB code, 2) rate
3.15, (7, 9, 11, 1), SU (3) code, 3) rate 3.3912, (3, 7, 5, 11) type II AB code, 4) rate
3.5296, (4, 7, 5, 11) type I AB code, and 5) rate 3.3912, (3, 7, 5, 11), SU (3) code with
6) the rate 3, G171,64 code
In this subsection, two sets of codes are compared. The first set includes the
(4, 5, 3, 7) type I AB code with rate R = 2.9045, the G171,64 group-based code at rate
3, and the SU (3) with (P, Q, R, S) = (7, 9, 11, 1) and rate 3.1456. The number of
receive antennas is one. The simulated BLERs are shown in Figure 6.2. The line
166
with squares indicates the bler of the type I AB code at rate 2.9045. The line with
plus signs indicates the BLER of the G171,64 code and the line with stars shows the
BLER of the SU (3) code at rate 3.1456. It can be seen from the plot that the rate
2.9045 AB code is about 1dB better than the G171,64 code whose rate is 3. The SU (3)
code has about the same performance as the group-based code with a rate 0.1456
higher.
The second set of codes includes the (3, 7, 5, 11) type II AB code at rate R =
3.3912, the (4, 7, 5, 11) type I AB code with rate R = 3.5296, and the (3, 7, 5, 11)
SU (3) code with rate R = 3.3912. The number of receive antennas is one. The
simulated BLERs are also shown in Figure 6.2. The solid and dashed lines show the
BLER of the rate 3.3912 and rate 3.5296 AB code, respectively. BLER of the SU (3)
code is shown by the dash-dotted line. The three codes have very close performance.
Compared with the performance of the G171,64 code, which is shown by the line
with circles, the three codes, the two AB codes and the SU (3) codes with rates
0.3912,0.5296, and 0.3912 higher, perform about 1.5dB worse than that of the group-
based code. Note that the AB codes can be decoded much faster than the G171,64
code and the SU (3) codes.
The comparison of the (5, 8, 9, 11) type II AB code at rate 3.9838 and the (9, 10, 11, 13)
type II AB code at rate 4.5506, the (5, 9, 7, 11) SU (3) code at rate 3.9195, and the
(7, 11, 9, 13) SU (3) code at rate 4.3791, with the rate 4 group-based G1365,16 code is
given in Figure 6.3. As can be seen in Figure 6.3, the line with circles indicates the
BLER of the G1365,16 code. The line with plus signs and the solid line show the BLER
of the rate 3.9838 and 4.5506 AB code, respectively. The dashed and the dash-dotted
167
0
10
G1365,16 code R=4
(5,8,9,11) AB code R=3.98
(9,10,11,13) AB code R=4.55
(5,9,7,11) SU(3) code R=3.92
(7,11,9,13) SU(3) code R=4.38
−1
10 non−group code R=4
−2
10
BLER
−3
10
−4
10
−5
10
20 25 30 35
SNR (dB)
Figure 6.3: Comparison of the 1) rate 3.9838, (5, 8, 9, 11) type II AB code, 2) rate
4.5506, (9, 10, 11, 13) type II AB code, 3) rate 3.9195, (5, 9, 7, 11), SU (3) code, and 4)
rate 4.3791, (7, 11, 9, 13), SU (3) code with the 5) rate 4 G1365,16 code and 6) rate 4
non-group code
168
line show the BLER of the rate 3.9195 and 4.3791 SU (3) code, respectively. The
number of receive antennas is one. It can been seen from the plot that at about the
same rate, the (5, 8, 9, 11) type II AB code and the (5, 9, 7, 11) SU (3) code perform
a lot better than the G1365,16 code. For example, at the BLER of 10−3 , the AB code
has an advantage of about 4dB and the SU (3) code has an advantage of about 3.5dB.
Also, at rate 0.3791 higher, the (7, 11, 9, 13), SU (3) code is more than 1dB better than
the G1365,16 code does at high SNRs. The BLER of the (9, 10, 11, 13) type II AB code
is slightly lower than that of the G1365,16 code even with a rate 0.5506 higher. The
performance of the non-group code is also shown, which is indicated by the line with
squares. It can be seen from the plot that the (5, 8, 9, 11) type II AB code and the
SU (3) code at rates 3.9838 and 3.9195 are as good as the non-group code given in
[SHHS01] at rate 4 according to BLER, although diversity product of the non-group
code is much higher than those of the AB and SU (3) codes from Tables 6.1-6.3.
The reason might be that although in the AB and SU (3) codes, the minimum of
the determinants of the differences of two matrices is much smaller than that of the
non-group code, the overall distribution of elements in the AB and SU (3) codes are
as good as the overall distribution of the non-group code. Or in other words, in the
AB and SU (3) codes, pairs of matrices that have very small difference determinant
is scarce. The expected difference determinant, E 1≤i<j≤L det |Ui − Uj |, of the AB and
SU (3) codes may be as large as that of the non-group code. When the rate is high,
the probability that matrices that are close to others are transmitted is small. It
is the expected “distance”4 instead of the worse-case “distance” that dominants the
BLER.
This plot shows that both the AB codes and the SU (3) codes have much better
performance than the group-based code. They even have the same good performance
as the elaborately designed non-group codes. Another advantage is that the AB
4
Here, the distance of two matrices, A and B, is | det(A − B)|. It is quoted since it is not a metric
by definition. For definition of metric, see [Yos78].
169
codes have a fast decoding algorithm while the decoding of both the group-based and
non-group codes needs exhaustive search.
0
10
(11,13,14,15) AB code R=4.96
G code R=5
10815,46
−1
10
BLER
−2
10
−3
10
20 22 24 26 28 30 32 34 36 38 40
SNR (dB)
Figure 6.4: Comparison of the rate 4.9580, (11, 13, 14, 15) type II AB code with the
rate 5 G10815,46 code
In this subsection, the (11, 13, 14, 15) type II AB code is compared with the
G10815,46 group-based code. The rate of the AB code is 4.9580 and the rate of the
group-based code is 5. The performance is shown in Figure 6.4. The line with circles
indicates BLER of the G10815,46 code and the solid line shows BLER of the AB code.
The plot shows that the AB code has a much better performance. For example, at the
BLER of 10−3 , the AB code is 6dB better and the performance gap is even higher for
lower BLERs or higher SNRs. As mentioned before, the AB code has a fast decoding
algorithm while decoding the group-based codes needs exhaustive search. Therefore,
170
at high rates, AB codes have great advantages over group-based codes in both the
performance and decoding complexity.
6.7 Conclusion
In this chapter, the research on the idea of differential unitary space-time code designs
based on Lie groups with rank 2, which is first discussed in Chapter 4, is continued.
The special unitary Lie group SU (3) is analyzed, which has dimension 8 and rank 2.
The group is not fixed-point-free, but a method to design fully-diverse codes, which
are subsets of the group, is described. Furthermore, motivated by the structure of the
SU (3) codes proposed, a simpler code, called the AB code, is proposed. Both codes
are suitable for systems with three transmit antennas. Necessary conditions for the
full diversity of both codes are given and our conjecture is that they are also sufficient
conditions. The codes have simple formulas from which their diversity products can
be calculated in a fast way. A fast maximum-likelihood decoding algorithm for AB
codes based on complex sphere decoding is given by which the codes can be decoded
in a complexity that is polynomial in the rate and dimension. Simulation results
show that both SU (3) codes and AB codes perform as well as the finite group-based
codes at low rates. But they do not need the exhaustive search (of exponentially
growing size) required of group-based codes and therefore are far superior in terms
of decoding complexity. Both SU (3) and AB codes have great advantages over the
finite group-based codes at high rates and perform as well as the carefully designed
Proof: It is easy to check that any matrix U that satisfies (6.1) is in SU (3) by
checking that U U ∗ = I3 and det U = 1. Now what is left to prove is that
any matrix
a U12
U ∈ SU (3) can be written as (6.1). Partition U into where a is a
U21 U22
complex number, U12 is 1 × 2, U21 is 2 × 1, and U22 is 2 × 2. Since U U ∗ = I3 ,
∗
a U12 ā 1 012
U21
= .
∗ ∗
U21 U22 U12 U22 021 I2
∗
Comparing the (1, 1) entries, U12 U12 = 1 − |a|2 can be obtained. Therefore |a|2 < 1.
Comparing the (1, 2) entries,
∗ ∗ ∗
aU21 + U12 U22 = 012 ⇒ U21 = −a−1 U12 U22
∗
.
Comparing the (2, 2) entries and using the above equality, we have
∗ ∗
U21 U21 + U22 U22 = I2 ⇒ −a−1 U21 U12 U22
∗ ∗
+ U22 U22 = I2 ⇒ a−1 U21 U12 = U22 − U22
−∗
.
∗ ∗ ∗
Therefore, det U22 = ā. From U21 U21 + U22 U22 = I2 , it is obvious that I2 − U22 U22
∗
has rank 1. So, 1 is an eigenvalue of U22 U22 . The other eigenvalue must be |a|2
∗
since det U22 U22 = |a|2 . Thus, the Hermitian and positive matrix U22 U22
∗
can be
172
∗ 1 0 ∗
decomposed as U22 U22 = Φ Φ , for some unitary matrix Φ with deter-
0 |α|2
minant 1.
Therefore,
there exists a unitary matrix Ψ with determinant 1 such that
1 0
U22 = Φ Ψ.
0 α
∗ ∗
Again, from U21 U21 + U22 U22 = I2 ,
∗
U21 U21
∗
= I2 − U22 U22
0 0 ∗
= Φ Φ
0 1 − |a|2
0 p ∗
= Φ p 0 1 − |a|2 e−jζ Φ .
1 − |a|2 ejζ
where ζ is an arbitrary angle. By similar argument, a general solution for U12 is,
p
U12 = 0 1 − |α|2 ejη Ψ
∗ ∗
aU21 + U12 U22 = 012
p p 1 0 ∗
⇒ a 0 1 − |a|2 e−jζ Φ∗ + 0 1 − |a|2 ejη ΨΨ∗ Φ = 0
0 a
⇒ ejη = −e−jζ .
173
Therefore, we have proved that matrices in SU (3) can be written as
p
a 0 − 1− |a|2 e−jζ Ψ
0 1 0
Φ p Φ Ψ
1 − |a|2 ejζ 0 ā
p −jζ
ᾱ 2
0 − 1 − |α| e
1 0 12 1 012
= 0 1 0
.
021 Φ p 021 Ψ
1 − |α|2 ejζ 0 α
Since
p −jζ
0 − 1 − |a| e 2
1 0 0 a 1 0 0
0 e−jβ 0 0 1 0 0 ejβ 0
p
jβ
0 0 e 1 − |a|2 ejζ 0 ā 0 0 e−jβ
p −j(ζ+β)
a 0 − 1 − |a| e 2
=
0 1 0
p
1 − |a|2 ej(ζ+β) 0 ā
for any real angle β, the angle ζ is a redundant degree of freedom. Therefore, we can
−1
(1) (1)
A(p2 ,q2 ) A(p1 ,q1 )
−jθ2 jθ1
e 0 0 e 0 0
p2 q 2
p1 q1
= √1 e−2π P 2π
√1 e Q √1 e2π P 2π
√1 e Q e−jθ1
0 2
− 2
0
2 2
q2 p
q p
1 −2π Q jθ2 1 2π P2 jθ2 1 −2π Q1 1 −2π P1 −jθ1
0 √
2
e e √
2
e e 0 − 2e √ √
2
e e
j(θ1 −θ2 )
e
p1 −p2
−2πj
q1 −q2
2πj(−
p2
+
q1
) 2πj(−
p1
+
q2
)
= 1 2πj P 1 Q − e −jθ1
0 2
e + e Q
2
e P P Q e
p1 q2 p2 q1
p −p q −q
1 2πj( P − Q ) 2πj( P − Q ) jθ2 1 −2πj 1 P 2 2πj 1 Q 2 −j(θ1 −θ2 )
0 2
e −e e 2
e +e e
j(θ1 −θ2 )
e 0 0
p1 −p2 q1 −q2 p1 +p2 q1 +q2
= 2πj ( 2P − 2Q ) −jθ1 2πj (− 2P + 2Q )
0 e cos γ 1 e je sin γ 1
p1 +p2 q1 +q2 p1 −p2 q1 −q2
jθ2 2πj ( 2P − 2Q ) −j(θ1 −θ2 ) 2πj (− 2P + 2Q )
0 e je sin γ1 e e cos γ1
−jθ2
e 0 0 1 0 0
p2 q2
= 2πj(− + )
0 e 0
2P 2Q 0 cos γ j sin γ
1 1
p q2
jθ2 2πj( 2P2 − 2Q )
0 0 e e 0 j sin γ1 cos γ1
jθ1
e 0 0
p1 q1
0 e 2P 2Q 2πj( − ) ,
0
p1 q1
−jθ1 2πj(− 2P + 2Q )
0 0 e e
and
−1
(1) (1)
B(r2 ,s2 ) B(r1 ,s1 )
r2 s2 r1 s1
√1 e2π
e R
jξ2 √1 e2π S e jξ2
0 √1 e−2π R e −jξ1
− √12 e2π S 0
2 2 2
s r2
s1 r1
= 1 −2π S2 √1 e−2π √1 e−2π √1 e2π
− √2 e 2
R 0
2
S e−jξ1 2
R 0
0 0 e−jξ2 0 0 ejξ1
175
r1 −r2 s1 −s2 r s2 r s1
1 −2πj −2πj −j(ξ1 −ξ2 ) 1 2πj( R1 + ) 2πj( R2 + ) jξ2
2
e +e
R S e 2
e S −e S e 0
r s r s
r −r s −s
= 1 2πj(− R2 − S1 ) 2πj(− R1 − S2 ) 1 2πj 1 R 2 2πj 1 S 2
2
e − e e−jξ1 2
e +e 0
0 0 ej(ξ1 −ξ2 )
r −r s −s r1 +r2 s1 +s2
−j(ξ1 −ξ2 ) 2πj (− 12R 2 − 12S 2 )
e e cos γ2 jejξ2 e2πj ( 2R + 2S ) sin γ2 0
r1 +r2 s1 +s2
−jξ1 2πj (− 2R − 2S )
r1 −r2 s1 −s2
=
je e sin γ 2 e 2πj( 2R + 2S )
cos γ 2 0
0 0 ej(ξ1 −ξ2 )
r2 s2
jξ2 2πj ( 2R + 2S )
e e 0 0 cos γ2 j sin γ2 0
r2 s2
= 2πj (− 2R − 2S )
0 e 0 j sin γ cos γ
2 2
0 0 e−jξ2 0 0 1
r1 s1
−jξ1 2πj (− 2R − 2S )
e e 0 0
r1 s1
0 e2πj ( 2R + 2S ) 0 .
0 0 ejξ1
Thus,
e e 0 0 cos γ2 j sin γ2 0
r2 s2
−
0 e2πj (− 2R − 2S ) 0 j sin γ cos γ 0
2 2
0 0 e−jξ2 0 0 1
r s
−jξ1 2πj (− 2R 2S )
1 − 1
e e 0 0
r1 s1
0 e2πj ( 2R
+ 2S ) 0
0 0 ejξ1
= ej(θi −θ
2 +ξi −ξ2 )
1 0 0 1 0 0
p2 q2
det 0 e 2πj(− + )
2P 2Q 0 0 cos γ1 j sin γ1
p q2
j(θ2 +ξ2 ) 2πj( 2P2 − 2Q )
0 0 e e 0 j sin γ1 cos γ1
1 0 0
p1 q1
0 e2πj( 2P − 2Q )
0
p1 q1
0 0 e−j(θ1 +ξ1 ) e2πj(− 2P + 2Q )
r2 s2
j(θ2 +ξ2 ) 2πj ( 2R + 2S )
e e 0 0 cos γ2 j sin γ2 0
r2 s2
− 2πj (− 2R − 2S ) 0 j sin γ cos γ 0
0 e 2 2
0 0 1 0 0 1
r1 s1
−j(θ1 +ξ1 ) 2πj (− 2R − 2S )
e e 0 0
r1 s1
2πj ( 2R + 2S ) 0
0 e
0 0 1
= ej(θi −θ
2 +ξi −ξ2 )
1 0 0
p1 −p2 q1 −q2 p1 +p2 q1 +q2
det 0 2πj ( 2P − 2Q ) −j(θ1 +ξ1 ) 2πj (− 2P + 2Q )
e cos γ 1 je e sin γ 1
p1 +p2 q1 +q2 p1 −p2 q1 −q2
j(θ2 +ξ2 ) 2πj ( 2P − 2Q ) −j(θ1 −θ2 +ξ1 −ξ2 ) 2πj (− 2P + 2Q )
0 je e sin γ1 e e cos γ1
r −r s −s r +r s +s
−j(θ1 −θ2 +ξ1 −ξ2 ) 2πj (− 12R 2 − 12S 2 ) j(θ2 +ξ2 ) 2πj ( 12R 2 + 12S 2 )
e e cos γ2 je e sin γ2 0
r +r s +s r1 −r2 s1 −s2
−j(θ +ξ ) 2πj (− 12R 2 − 12S 2 )
− je 1 1 e sin γ2 e2πj ( 2R + 2S ) cos γ2 0 .
0 0 1
177
Define
2πj (−
p1 +p2 q +q
) sin 2π
+ 12Q 2 p 1 − p 2 q 1 − q 2
y = je 2P + ,
2P 2Q
r +r s +s
2πj ( 12R 2 + 12S 2 ) r1 − r 2 s1 − s 2
z = je sin 2π − .
2R 2S
Therefore,
det(U1 − U2 )
−j(θ1 −θ2 +ξ1 −ξ2 ) j(θ2 +ξ2 )
1 0 0 e w e z 0
= Θ̄ det e−j(θ1 +ξ1 ) y − −e−j(θ1 +ξ1 ) z̄
0 x w̄ 0
j(θ2 +ξ2 ) −j(θ1 −θ2 +ξ1 −ξ2 )
0 −e ȳ e x̄ 0 0 1
−j(θ1 −θ2 +ξ1 −ξ2 )
1−e w −ej(θ2 +ξ2 ) z 0
= Θ̄ det
−j(θ
e 1 1 z̄+ξ )
x − w̄ −j(θ
e 1 1y +ξ )
j(θ2 +ξ2 ) −j(θ1 −θ2 +ξ1 −ξ2 )
0 −e ȳ e x̄ − 1
= Θ̄ (1 − e−j(θ1 −θ2 +ξ1 −ξ2 ) w)(x − w̄)(e−j(θ1 −θ2 +ξ1 −ξ2 ) x̄ − 1)+
e−j(θ1 −θ2 +ξ1 −ξ2 ) |z|2 (e−j(θ1 −θ2 +ξ1 −ξ2 ) x̄ − 1) + e−j(θ1 −θ2 +ξ1 −ξ2 ) |y|2 (1 − e−j(θ1 −θ2 +ξ1 −ξ2 ) w)
= Θ̄ (1 − Θw)(x − w̄)(Θx̄ − 1) + Θ|z|2 (Θx̄ − 1) + Θ|y|2 (1 − Θw)
= Θ̄ (x − Θwx − w̄ + Θ|w|2 )(Θx̄ − 1) + Θ|z|2 (Θx̄ − 1) + Θ|y|2 (1 − Θw)
= Θ̄ (Θx̄ − 1)(x − Θwx − w̄ + Θ) + Θ|y|2 (1 − Θw)
= Θ̄ Θ|x|2 − x − Θ2 |x|2 w + Θwx − Θx̄w̄ + w̄ + Θ2 x̄ − Θ + Θ|y|2 − Θ2 |y|2 w 0
= Θ̄ Θwx − Θx̄w̄ − (x − Θ2 x̄) − (Θ2 w − w̄)
−1
(2) (2)
A(p2 ,q2 ) A(p1 ,q1 )
e−jθ2 0 0 1 0 0
“
p q2
”
2πj − 2P2 + 2Q 0 cos 2π p1 −p2 + q1 −q2 p1 −p2 q1 −q2
=
0 e 0 2P 2Q j sin 2π 2P + 2Q
“
p2 q2
”
2πj − 2Q p1 −p2 q1 −q2 p1 −p2 q1 −q2
0 0 e 2P
0 j sin 2π 2P + 2Q cos 2π 2P + 2Q
jθ 0 0
e
1
“ ”
p1 q1
0 e2πj 2P − 2Q 0 ,
“
p1 q1
”
2πj − 2P + 2Q
0 0 e
and
−1
(2) (2)
B(r2 ,s2 ) B(r1 ,s1 )
r2 s2
2πj ( 2R + 2S ) r1 −r2 s1 −s2
r1 −r2 s1 −s2
e 0 0 cos 2π 2R − 2S j sin 2π 2R − 2S 0
r2 s2
= 0 e2πj (− 2R − 2S ) 0 j sin 2π
r1 −r2
2R − s1 −s2
2S cos 2π r1 −r2
2R − s1 −s2
2S 0
0 0 e−jξ2 0 0 1
r1 s1
2πj (− 2R − 2S )
e 0 0
r1 s1
0 e2πj ( 2R + 2S ) 0 .
0 0 ejξ1
Therefore,
= 2jejθ1 e−jξ2 Θ̄1 Θ̄2 Im(Θ1 Θ̄2 − Θ1 Θ2 x − Θ̄1 Θ̄2 w + Θ̄1 Θ2 wx)
= 2jejθ1 e−jξ2 Θ̄1 Θ̄2 Im[Θ1 Θ̄2 (1 − Θ̄21 w)(1 − Θ22 x)].
180
7.1 Abstract
In this chapter, the idea of space-time coding devised for multiple-antenna systems
the same as assuming that the R relay nodes can fully cooperate and have full knowl-
edge of the transmitted signal. It is further shown that for a fixed total transmit
power across the entire network, the optimal power allocation is for the transmitter
to expend half the power and for the relays to collectively expend the other half. It
is also proved that at low and high SNR, the coding gain is the same as that of a
multiple-antenna system with R transmit antennas. However, at intermediate SNR,
it can be quite different.
181
7.2 Introduction
The communication systems that have been discussed or worked with in previous
chapters are point-to-point communication systems, which only have two users: one
is the transmitter and the other is the receiver. Recently, communications in wireless
networks are of great interest because of their diverse applications. Wireless networks
consist of a number of nodes or users communicating over wireless channels. Roughly,
there are two types of wireless networks according to the structure. One type is
networks that have a master node or base station. All nodes communicate with
the base station directly and the base station is in control of all transmissions and
forwarding data to the intended users. A cellular phone system is the most popular
example of this kind of wireless networks. Another example is satellite communication
systems. The other kind of wireless networks is ad hoc or sensory networks, which is
the type of networks that are going to be dealt with in this chapter.
Figure 7.1: Ad hoc network
between each pair of nodes. In the figure, only some of the links are represented. In ad
182
hoc wireless networks, there is no master node or base station. All communications
are peer to peer. As every node may not be in the direct communication range1
of every other node, nodes can cooperate in routing each other’s data. Therefore,
transmissions may be completed by one-hop routing or even multiple-hop routing.
In addition, nodes in an ad-hoc network may be mobile. The difference between an
ad-hoc network and sensory network is that in the former, nodes may be mobile and
there can be more than one pair of nodes communicating at the same time, while, for
sensory networks, the nodes are normally static and there is only one pair of nodes
communicating at a time.
According to the features mentioned above, ad-hoc and sensory networks can be
rapidly deployed and reconfigured, can be easily tailored to specific applications, and
are robust due to the distributed nature and redundancy of the nodes. Because
of these advantages of ad-hoc and sensory networks, they have many applications,
for example, the data network, the home network, the wireless network of mobile
laptops, PDAs and smart phones, the automated transportation systems, sensor dust,
Bluetooth [Har00], etc.2 However, because of exactly the same unique features, the
analysis on wireless ad hoc networks is very difficult in networking, signal processing,
and especially information theoretical aspects.
There are many preliminary results in ad hoc wireless networks. In 2000, the
capacity of wireless ad-hoc networks was first analyzed in the landmark paper [GK00].
It is proved that the optimal bit-distance product can be transported by a network
√
placed in a disk of unit area scales as O( n) bit-meters per second, where n is the
number of nodes in the network. In [GT02], it is proved that the mobility of nodes can
1
There are many ways to define communication range of a wireless network according to the
transmit power, interference, distance, and other factors in the network. For example, in the protocol
model in [GK00], it is defined that one node located at Xi can transmit to another node located at
Xj successfully if |Xk − Xj | ≥ (1 + ∆)|Xi − Xj | for every other node located at Xk simultaneously
transmitting over the same sub-channel. ∆ is a positive number that models situations where a
guard zone is specified by the protocol to prevent a neighboring node from transmitting on the same
sub-channel at the same time.
2
For more applications and introduction, refer to [GW02] and [Per01].
183
increase the per-session throughput greatly. Results on the network layer designing,
interference, and energy management can be found in [PGH00, BMJ+ 98, RkT99,
RM99, DBT03]. Although these work illuminate issues in ad hoc networks with
specific network models and under specific conditions, most of the questions about
ad hoc networks are still open. For example, what is the Shannon capacity region,
how to do scheduling and coding to achieve capacity, and how to allocate power
among the nodes? In this chapter, we use the space-time coding idea, which is widely
used in multiple-antenna systems, in wireless networks to improve the performance
of network communications.
As has been mentioned in Chapter 1, multiple antennas can greatly increase
versity using the repetition and space-time algorithms. The mutual information and
outage probability of the network are analyzed. However, in their model, the relay
nodes need to decode their received signals, which causes extra consumption in both
time and energy and also may cause error propagation. In [NBK04], a network with
a single relay under different protocols is analyzed and second order spatial diversity
is achieved. In [HMC03], the authors use space-time codes based on Hurwitz-Radon
matrices and conjecture a diversity factor around R/2 from their simulations. Also,
their simulations in [CH03] show that the use of Khatri-Rao codes lowers the average
bit error rate. In this chapter, relay networks with fading are considered and linear
dispersion space-time codes [HH02b] are applied among the relays. The problem we
are interested in is: can we increase the reliability of a wireless network by using
space-time codes among the relay nodes?
184
A key feature of this work is that no decoding is required at the relay nodes.
This has two main benefits: first, the computation at the relay nodes is considerably
simplified, and second, we can avoid imposing bottlenecks on the rate by requiring
some relay nodes to decode (See e.g., [DSG+ 03]).
The wireless relay network model used here is similar to those in [GV02, DH03]. In
[GV02], the authors show that the capacity of the wireless relay network with n nodes
√
behaves like log n. In [DH03], a power efficiency that behaves like n is obtained.
Both results are based on the assumption that each relay knows its local channels
so that they can work coherently. Therefore, the system should be synchronized at
the carrier level. Here, it is assumed that the relay nodes do not know the channel
information. All we need is the much more reasonable assumption that the system is
synchronized at the symbol level.
The work in this chapter shows that the use of space-time codes among the relay
nodes, with linear dispersion structure, can achieve a diversity, min{T, R} 1 − logloglogP P .
When T ≥ R, the transmit diversity is linear in the number of relays (size of the
network) and is a function of the total transmit power. When P is very large,
the diversity is approximately R. The coding gain for large R and very large P is
det −1 (Si − Sj )∗ (Si − Sj ), where Si is the distributed space-time code. Therefore, with
very large transmit power and a big network, the same transmit diversity and cod-
ing gain are obtained as in the multiple-antenna case, which means that the systems
works as if the relays can fully cooperate and have full knowledge of the transmitted
signal.
This chapter is organized as follows. In the following section, the network model
and the two-step protocol is introduced. The distributed space-time coding scheme
is explained in Section 7.4 and the pairwise error probability (PEP) is calculated
in Section 7.5. In Section 7.6, the optimum power allocation based on the PEP is
derived. Sections 7.7 and 7.8 contain the main results. The transmit diversity and the
185
coding gain are derived. To motivate the main results, simple approximate derivations
are given first in Section 7.7, and then in Section 7.8 the more involved rigorous
derivation is shown. In Section 7.9, the transmit diversity obtained in Sections 7.7
and 7.8 is improved slightly, and the optimality of the new diversity is proved. A
more general distributed linear dispersion space-time coding is discussed in Section
7.10, and in Section 7.11 the transmit diversity and coding gain for a special case
are obtained, which coincide with those in Sections 7.7 and 7.8. The performance of
relay networks with randomly chosen distributed linear dispersion space-time codes
is simulated and compared with the performance of the same space-time codes used
in multiple-antenna systems with R transmit antennas and one receive antenna. The
details of the simulations and the BER and BLER figures are given in Section 7.12.
Section 7.13 provides the conclusion and future work. Section 7.14 contains some of
the technical proofs.
The work in this chapter has been published in the Proceeding of the Third Sen-
sory Array and Multi-Channel Signal Processing Workshop (SAM’04) [JH04f] and is
Consider a wireless network with R + 2 nodes which are placed randomly and inde-
pendently according to some distribution. There is one transmit node and one receive
node. All the other R nodes work as relays. Every node has one antenna. Anten-
nas at the relay nodes can be used for both transmission and reception. Denote the
channel from the transmitter to the i-th relay as fi , and the channel from the i-th
relay to the receiver as gi . Assume that fi and gi are independent complex Gaussian
186
with zero-mean and unit-variance. If the fading coefficients fi and gi are known to
relay i, it is proved in [GV02] and [DH03] that the capacity behaves like log R and a
√
power efficiency that behaves like R can be obtained. However, these results rely on
the assumption that the relay nodes know their local connections, which requires the
system to be synchronized at the carrier level. However, for ad hoc networks with a
lot of nodes which can also be mobile, this is not a realistic assumption. In our work,
a much more practical assumption, that the relay nodes are only coherent at the
symbol level, is made. In the relay network, it is assumed that the relay nodes, know
only the statistical distribution of the channels. However, we make the assumption
that the receiver knows all the fading coefficients fi and gi , which needs the network
to be synchronized at the symbol level. Its knowledge of the channels can be obtained
by sending training signals from the relays and the transmitter. The main question
is what gains can be obtained? There are two types of gains: improvement in the
outage capacity and improvement in the PEP. In this chapter, the focus is on the
latter.
relays
r1 t1
transmitter f1 r2
t2
g1
receiver
f2
g2
s x
. . .
...
...
fR
gR
rR tR
Assume that the transmitter wants to send the signal s = [s1 , · · · , sT ]t in the
codebook {s1 , · · · , sL } to the receiver, where L is the cardinality of the codebook. s
187
is normalized as
E s∗ s = 1. (7.1)
time τ is denoted as ri,τ , which is corrupted by the noise vi,τ . From time T + 1 to 2T ,
the i-th relay node transmits ti,1 , · · · , ti,T to the receiver based on its received signals.
Denote the received signal at the receiver at time τ + T by xτ , and the noise at the
receiver at time τ + T by wτ . Assume that the noises are complex Gaussian with
zero-mean and unit-variance, that is, the distribution of vi,τ and wτ are CN (0, 1).
The following notations are used:
vi,1 ri,1 ti,1 w1 x1
vi,2 ri,2 ti,2 w2 x2
vi = . , ri = . , ti = . , w= .. , x= .. .
. . .
. . . . .
vi,T ri,T ti,T wT xT
p
ri = P1 T fi s + v i (7.2)
and
R
X
x= gi ti + w. (7.3)
i=1
3
Although in the figure, all the relay nodes sit on a line in the middle of the transmitter and the
receiver, this does not means that they must be in the middle of the two communicating nodes to
relay the information. The positions of the relay nodes are arbitrary. For simplicity and clearness
of the figure, we draw it this way.
188
7.4 Distributed Space-Time Coding
The key question is what the relay nodes should do. There are two widely used
cooperative strategies for the relay nodes. The first one is called amplify-and-forward,
in which the relays just amplify their received signals according to power constraints
and forward to the receiver. The other is called decode-and-forward, in which the
relay nodes do fully decoding and then send their decoded information to the receiver.
If the relay nodes know their local connections, beamforming can be done by amplify-
and-forward. However, it is obvious that if the relay nodes do not know the channels,
amplify-and-forward is not optimal. For decode-and-forward, if the relays can decode
the signal correctly, which happens when the transmit power is very high or the
transmission rate is very low, the system is equivalent to a multiple-antenna system
with R transmit antennas and one receive antenna, and the best diversity R can be
obtained. However if some relay nodes decode incorrectly, whether because of bad
channel, low transmit power, or high transmission rate, they will forward incorrect
signals to the receiver, which will harm the decoding at the receiver greatly. Therefore,
for ad hoc networks whose nodes have limited power, decode-and-forward puts a heavy
to decode. The strategy we use is called distributed space-time coding, in which simple
signal processing is done at relay nodes. No decoding is need at relay nodes, which
saves both time and energy, and more importantly, there is no rate constraint on
transmissions. As will be seen later, this strategy leads to the optimal diversity, R,
with asymptotically high transmit power.4
4
A combination of requiring some relay nodes to decode and others to not, may also considered.
However, in the interest of space, we shall not do so here.
189
In our approach, we use the idea of the linear dispersion space-time codes [HH02b]
for multi-antenna systems by designing the transmitted signal at every relay as a linear
r T r
P2 X P2
ti,τ = ai,τ t ri,t = [ai,τ 1 , ai,τ 2 , · · · , ai,τ T ]ri ,
P1 + 1 t=1 P1 + 1
or in other words,
r
P2
ti = Ai ri , (7.4)
P1 + 1
where
a ai,12 · · · ai,1T
i,11
ai,21 ai,22 · · · ai,2T
Ai = . .. .. , for i = 1, 2, · · · , R.
. ..
. . . .
ai,T 1 ai,T 2 · · · ai,T T
henceforth assume that Ai are unitary matrices. As we shall presently see, this also
simplifies the analysis considerably since it keeps the noises forwarded by the relay
nodes to the receiver white.
Now let’s discuss the transmit power at each relay node. Because tr ss∗ = 1, fi , vi,j
p p
E r∗i ri = E ( P1 T fi s + vi )∗ ( P1 T fi s + vi ) = E P1 T |fi |2 s∗ s + vi∗ vi = (P1 + 1)T.
5
Note that the conjugate of ri does not appear in (7.4). The case with ri is discussed in Section
7.10.
190
Therefore the average transmit power at relay node i is
P2 P2
E t∗i ti = E (Ai ri )∗ (Ai ri ) = E r∗i ri = P2 T,
P1 + 1 P1 + 1
which explains our normalization in (7.4). The expected transmit power for one
r R
P2 X
x = gi Ai ri + w
P1 + 1 i=1
r R
P2 X p
= g i A i ( P1 T f i s + v i ) + w
P1 + 1 i=1
r f1 g1 r R
P1 P2 T .
. P2 X
= [A1 s, · · · , AR s] . +
gi Ai vi + w.
P1 + 1 P1 + 1 i=1
fR gR
Define
f g
1 1
r
f2 g2 R
P2 X
S = [A1 s, A2 s, · · · , AR s], H = . , and W = gi Ai vi + w.
.
. P1 + 1 i=1
fR gR
r
P1 P2 T
x= SH + W. (7.5)
P1 + 1
Remark: From equation (7.5), it can be seen that the T × R matrix S works like
the space-time code in the multiple-antenna case. We call it the distributed space-time
code to emphasize that it has been generated in a distributed way by the relay nodes,
191
without having access to s. H, which is R × 1, is the equivalent channel matrix and
W , which is T × 1, is the equivalent noise, W is clearly influenced by the choice of
the space-time code. Using the unitarity of the Ai , it is easy to get the normalization
of S:
R
X R
X
∗ ∗
tr S S = s A∗i Ai s = s∗ s = R.
i=1 i=1
Since Ai s are unitary and wj , vi,j are independent Gaussian, W is also Gaussian when
gi s are known. It is easy to see that E W = 0T 1 and
Var (W |gi ) = E W W ∗
r R
! r R
!∗
P2 X P2 X
= E g i A i vi + w g i A i vi + w
P1 + 1 i=1 P1 + 1 i=1
R,R
P2 X
= E gi ḡj Ai vi vj∗ A∗j + IT
P1 + 1 i=1,j=1
R
P2 X 2
= |gi | Ai A∗i + IT
P1 + 1 i=1
R
!
P2 X 2
= 1+ |gi | IT .
P1 + 1 i=1
Thus, W is both spatially and temporally white. This implies that, when both fi and
gi are known, x|si is also Gaussian with the following mean and variance.
r
P1 P2 T
E (x|si ) = Si H
P1 + 1
192
and
R
!
P2 X 2
Var (x|si ) = Var W = 1+ |gi | IT .
P1 + 1 i=1
Thus,
r r
P1 P2 T P1 P2 T
(x− S H)∗ (x− S H)
P1 +1 i P1 +1 i
−
1 1+
P 2 PR
|gi |2
P (x|si ) = h PR iT e P1 +1 i=1
.
P2
2π 1 + P1 +1 i=1 |gi |2
r
2
P P T
1 2
arg max P (x|si ) = arg min
x − Si H
. (7.6)
si si
P1 + 1
F
P
2
r PR R
xRe
P1 P2 T i=1 fi gi Ai − i=1 fi gi Ai sRe
arg min
− P Re P Im
(. 7.7)
si
P1 + 1 R R
xIm i=1 fi gi Ai i=1 fi gi Ai sIm
Im Re F
Since (7.7) is equivalent to the decoding of a real linear system, sphere decoding can
be used whose complexity is polynomial in the transmission rate and dimension at
Theorem 7.1 (Chernoff bound on PEP). With the ML decoding in (7.6), the
PEP, averaged over the channel coefficients, of mistaking si by sj has the following
Chernoff bound.
P1 P2 T
− H ∗ (Si −Sj )∗ (Si −Sj )H
4(1+P1 +P2 R |gi |2 )
Pe ≤ E e
P
i=1 .
fi ,gi
By integrating over fi s in the above formula, we can get the following inequality on
193
PEP.
P1 P2 T
P e ≤ E det −1 IR + PR (Si − Sj )∗ (Si − Sj )diag {|g1 |2 , · · · , |gR |2 } .(7.8)
gi 2
4 1 + P1 + P2 i=1 |gi |
Proof: The PEP of mistaking S1 by Si has the following Chernoff upper bound
[SOSL85].
q
P1 P2 T
Since Si is transmitted, x = P1 +1
Si H + W . Therefore, from (7.6),
ln P (x|Sj ) − ln P (x|Si )
h q q i
P1 P2 T ∗ ∗ P1 P2 T ∗ ∗ P1 P2 T ∗
P1 +1
H (S i − S j ) (S i − S j )H + P1 +1
H (S i − S j ) W + P1 +1
W (S i − S j )H
= − P2
PR .
1 + P1 +1 i=1 |gi |2
Thus,
Pe h q q i
λ P2 P1 P2 T P1 P2 T
− P 2 PR P T H ∗ (Si −Sj )∗ (Si −Sj )H+
P1 +1 1 P1 +1
H ∗ (Si −Sj )∗ W + P1 +1
W ∗ (Si −Sj )H
1+ |g |2
≤ E e P1 +1 i=1 i
fi ,gi ,W
» r r –
P2 P1 P2 T ∗ P1 P2 T
λ P T H ∗ (Si −Sj )∗ (Si −Sj )H+ H (Si −Sj )∗ W + W ∗ (Si −Sj )H +W W ∗
P1 +1 1 P1 +1 P1 +1
− P 2 PR
Z 1+ |g |2
e P1 +1 i=1 i
= E h iT dW
fi ,gi P2
PR 2
2π 1 + P1 +1 i=1 |gi |
„ r «∗ „ r «
P1 P2 T P1 P2 T
λ (Si −Sj )H+W λ (Si −Sj )H+W
P1 +1 P1 +1
P P T −
λ(1−λ) 1 2 Z 1+
P2 PR
|g |2
1+P1
− PR
|g |2
H ∗ (Si −Sj )∗ (Si −Sj )H e P1 +1 i=1 i
= E e 1+α
i=1 i h iT dW
fi ,gi P2
PR
2π 1 + P1 +1 i=1 |gi |2
P P T
λ(1−λ) 1 2
P1 +1
− P 2 PR H ∗ (Si −Sj )∗ (Si −Sj )H
1+ |g |2
= E e P1 +1 i=1 i
fi ,gi
λ(1−λ)P1 P2 T
− PR H ∗ (Si −Sj )∗ (Si −Sj )H
|g |2
= E e 1+P1 +P2
i=1 i .
fi ,gi
194
1 1
Choose λ = 2
which maximizes λ(1 − λ) = 4
and therefore minimizes the right-hand
side of the above formula. Thus,
P1 P2 T
− PR H ∗ (Si −Sj )∗ (Si −Sj )H
|g |2 )
Pe ≤ E e 4(1+P1 +P2
i=1 i . (7.9)
fi ,gi
This is the first upper bound in Theorem 7.1. To get the second upper bound,
the expectation over fi s must be calculated. Notice that
f1 g1 g1 · · · 0 f1
. .. . . .. ..
H= . = . .
. . . .
fR gR 0 · · · gR fR
P1 P2 T
− f ∗ G(Si −Sj )∗ (Si −Sj )Gf
4(1+P1 +P2 R |gi |2 )
Pe ≤
P
E e i=1
fi ,gi
Z P1 P2 T
1 1 −
4(1+P1 +P2 R |gi |2 )
f ∗ G(Si −Sj )∗ (Si −Sj )Gf ∗
e−f f df
P
= E R
e i=1
2 ig (2π)
Z „ «
P1 P2 T
1 −f ∗ IR + PR 2
G(Si −Sj )∗ (Si −Sj )G f
= E e 4(1+P1 +P2 |g | )
i=1 i df
gi (2π)R
P1 P2 T
= E det −1 IR + PR G(Si − Sj )∗ (Si − Sj )G
gi
4 1 + P1 + P2 i=1 |gi |2
P1 P2 T
= E det −1 IR + PR (Si − Sj )∗ (Si − Sj )diag {|g1 |2 , · · · , |gR |2 }
gi
4 1 + P1 + P2 i=1 |gi |2
as desired.
Let’s compare (7.8) with the Chernoff bound on the PEP of a multiple-antenna
system with R transmit antennas and 1 receive antenna (the receiver knows the
channel) [TSC98, HM00]:
−1 PT ∗
P e ≤ det IR + (Si − Sj ) (Si − Sj ) .
4R
195
The difference is that now the expectations over the gi must be calculated. Similar
to the multiple-antenna case, the “full diversity” condition can be obtained from
(7.8). It is easy to see that if Si − Sj drops rank, the upper bound in (7.8) increases.
Therefore, the Chernoff bound is minimized when Si − Sj is full-rank, or equivalently,
det(Si − Sj )∗ (Si − Sj ) 6= 0 for any 1 ≤ i 6= j ≤ L.
In this section, the optimum power allocation between the transmit node and relay
nodes, that minimize the PEP, is discussed. Because of the expectations over gi ,
this is easier said than done. Therefore, a heuristic argument is used. Note that
P
g= R 2
i=1 |gi | has the gamma distribution [EHP93],
g R−1 e−g
p(g) = ,
(R − 1)!
whose mean and variance are both R. By the law of large numbers, almost surely
1
R
g → 1 when R → ∞. It is therefore reasonable to approximate g by its mean, i.e.,
PR
i=1 |gi |2 ≈ R, especially for large R. Therefore, (7.8) becomes
−1 P1 P2 T ∗ 2 2
P e . E det IT + (Si − Sj ) (Si − Sj )diag {|g1 | , · · · , |gR | } . (7.10)
gi 4 (1 + P1 + P2 R)
P1 P2 T
It can be seen that the upper bound in (7.10) is minimized when 4(1+P1 +P2 R)
is
maximized, which can be easily done.
Assume that the total power consumed in the whole network is P T for transmis-
sions of T symbols. Since the power used at the transmitter and each relay are P1
and P2 respectively for each transmission, P = P1 + RP2 . Therefore,
P1 P2 T P1 P −P
R
1
T P1 (P − P1 )T P 2T
= = ≤
4 (1 + P1 + P2 R) 4(1 + P1 + P − P1 ) R(1 + P ) 16R(1 + P )
196
with equality when
P P
P1 = and P2 = . (7.11)
2 2R
Therefore, the optimum power allocation is such that the transmitter uses half the
total power and the relay nodes share the other half fairly. So, for large R, the relay
nodes spend only a very small amount of power to help the transmitter.
With this optimum power allocation, for high total transmit power (P 1),
P1 P2 T
PR
4 1 + P1 + P2 i=1 |gi | 2
P P
2 2R
T
= PR
P P 2
4 1+ 2
+ 2R i=1 |gi |
P P
2 2R
T
≈ P
P P R
4 2
+ 2R i=1 |gi |2
PT
= PR .
2
8(R + i=1 |gi | )
(7.8) becomes
" #
PT
P e . E det −1 IT + PR (Si − Sj )∗ (Si − Sj )diag {|g1 |2 , · · · , |gR |2 } . (7.12)
gi 2
8(R + i=1 |gi | )
As mentioned earlier, to obtain the diversity, the expectation in (7.8) must be cal-
culated. This will be done rigorously in Section 7.8. However, since the calculations
are detailed and give little insight, a simple approximate derivation, which leads to
the same diversity result, is given here. As discussed in the previous section, when R
197
PR
is large, i=1 |gi |2 ≈ R with high probability. In this section, this approximation is
used to simplify the derivation.
Define
To highlight the transmit diversity result, we first upper bound the PEP using the
2
minimum nonzero singular values of M , which is denoted as σmin . Therefore, from
(7.12),
2
−1 P T σmin
Pe . E det IT + diag {Irank M , 0}diag {|g1 |2 , · · · , |gR |2 }
gi 16R
rank M 2
−1
Y P T σmin 2
= E 1+ |gi |
gi
i=1
16R
"Z −1 #rank M
∞ 2
P T σmin
= 1+ x e−x dx
0 16R
2
−rank M rank M
P T σmin − 16R
P T σ2
16R
= −e min Ei − 2
,
16R P T σmin
where
Z χ
et
Ei(χ) = dt, χ<0
−∞ t
∞
X (−1)k χk
Ei(χ) = c + log(−χ) + , (7.14)
k=1
k · k!
− 16R
P T σ2
1
e min =1+O ≈1
P
6
Pn 1
The Euler-Mascheroni constant is defined by c = limn→∞ k=1 k − log n . c has the numerical
value 0.57721566 · · · .
198
and
16R
−Ei − 2
= log P + O(1) ≈ log P.
P T σmin
Therefore,
log log P
When M is full rank, the transmit diversity is min{T, R} 1 − log P
. Therefore,
similar to the multiple-antenna case, there is no point in having more relays than the
coherence interval according to the diversity. Thus, we will henceforth always assume
log log P
T ≥ R. The transmit diversity is therefore R 1 − log P . (7.15) also shows that
the PEP is smaller for bigger coherence interval T . A tighter upper bound is given
Theorem 7.2. Design the transmit signal at the i-th relay node as in (7.4) and
use the power allocation in (7.11). For full diversity of the space-time code, assume
T ≥ R. If P 1, for any positive x, the PEP has the following upper bound
R k
X 16R X R−k
Pe . det −1 [M ]i1 ,··· ,ik 1 − e−x [−Ei(−x)]k , (7.16)
k=0
PT 1≤i1 <···<ik ≤R
where [M ]i1 ,··· ,ik denotes the k × k matrix composed by choosing the i1 , · · · , ik -th rows
and columns of M .
Z ∞ Z ∞
−1 PT
Pe . ··· det IR + M diag {λ1 , · · · , λR } e−λ1 · · · e−λR dλ1 · · · dλR ,
0 0 16R
Pe
Z x Z ∞ Z x Z ∞ −1
PT
. + ··· + det IR + M diag {λ1 , · · · , λR }
0 x 0 x 16R
e−λ1 · · · e−λR dλ1 · · · dλR
R
1X X
= Ti1 ,··· ,ik ,
2 k=0 1≤i <···<i
1 k ≤R
where
Z Z −1
PT
Ti1 ,··· ,ik = ··· det IR + M diag {λ1 , · · · , λR } e−λ1 · · · e−λR dλ1 · · · dλR .
16R
the i1 , · · · ik -th integrals
are from x to ∞,
all others are from 0 to x
Z ∞ Z ∞Z x Z x −1
PT
T1,··· ,k = ··· ··· det IR + M diag {λ1 , · · · , λR } e−λ1 · · · e−λR dλ1 · · · dλR .
16R
|x {z x
}| 0 {z 0
}
k R−k
PT
det IR + M diag {λ1 , · · · , λR }
16R
PT
> det IR + M diag {λ1 , · · · , λk , 0, · · · , 0}
16R
PT det[M ]1,··· ,k diag {λ1 , · · · , λk } 0k,R−k
= det IR + M
16R ∗ 0R−k,R−k
PT
= det Ik + [M ]1,··· ,k diag {λ1 , · · · , λk }
16R
PT
> det [M ]1,··· ,k diag {λ1 , · · · , λk }
16R
k
PT
= det[M ]1,··· ,k λ1 · · · λk ,
16R
200
where [M ]i1 ,··· ,ik is defined in Theorem 7.2. Therefore,
k Z x Z x
16R −1
T1,··· ,k < det e−λk+1 · · · e−λR dλk+1 · · · dλR
[M ]1,··· ,k ···
PT 0 0
Z ∞ Z ∞ −λ1
e e−λk
··· ··· dλ1 · · · dλk
x x λ1 λk
k Z x R−k Z ∞ −λ k
16R −1 −λ e
= det [M ]1,··· ,k e dλ dλ
PT 0 x λ
k
16R R−k
= det −1 [M ]1,··· ,k 1 − e−x [Ei(−x)]k .
PT
In general,
k
16R R−k
Ti1 ,··· ,ik < det −1 [M ]1,··· ,k 1 − e−x [Ei(−x)]k .
PT
matrix since M is positive definite. Therefore, all terms in (7.16) are positive.
R k
1 X 16R X
Pe . det −1 [M ]i1 ,··· ,ik logk P. (7.17)
P R k=0 T 1≤i <···<i 1 k ≤R
1 7
Proof: Set x = P
. Therefore,
R−k R−k
−x R−k − P1 1 1 1 1
1−e = 1−e = +o = R−k + o .
P P P P R−k
7
Actually, this is not the optimum choice based on the transmit diversity. The transmit diversity
can be improved slightly by choosing a optimum x. However, the coding gain of that case is smaller
than the coding gain in (7.17). The details will be discussed in Section 7.9.
201
From (7.14),
1
Therefore, (7.17) is obtained from (7.16) by omitting the higher order terms of P
.
The same as the PEP Chernoff upper bound of multiple-antenna systems with R
transmit antennas and one receive antenna at high SNR, which is
R
1 −1 ∗ 4R
P e ≤ R det (Si − Sj ) (Si − Sj ) ,
P T
1
the factor PR
is also obtained in the network case. However, instead of a constant
equivalent to
R
16R log log P
det −1
M P −R(1− log P ). (7.18)
T
Therefore, as in (7.15), transmit diversity of the distributed space-time code is, again,
R 1 − logloglogP P , which is linear in the number of relays. When P is very large (P
log log P
log P ), log P
1, and a transmit diversity about R is obtained which is the same
as the transmit diversity of a multiple-antenna system with R transmit antennas
and one receive antenna. That is, the system works as if the R relay nodes can fully
cooperate and have full knowledge of the transmitted signal as in the multiple-antenna
case. However, for any general average total transmit power, the transmit diversity
depends on the average total transmit power P .
202
7.8 Rigorous Derivation of the Diversity
PR
In the previous section, we use the approximation i=1 |gi |2 ≈ R. In this section,
a rigorous derivation of the Chernoff upper bound on the PEP is given. The same
transmit diversity is obtained but the coding gain becomes more complicated. Here
is the main result.
Theorem 7.3. Design the transmit signal at the i-th relay node as in (7.4) and
use the power allocation in (7.11). For full diversity of the space-time code, assume
R k k
1 X 8 X X
Pe . det −1 [M ]i1 ,··· ,ik BR (k − l, k) logl P,(7.19)
P R k=0 T 1≤i <···<i 1 k ≤R l=0
where
k k−i k−i1 −···−ij−1
k X X1 X k k − i1 − · · · − ij−1
BR (j, k) = ··· ···
j i1 =1 i2 =1 ij =1 i1 ij
(i1 − 1)! · · · (ij − 1)!Rk−i1 −···−ij . (7.20)
Proof: Before proving the theorem, we first give a lemma that is needed.
Z Z k
!k k
∞ ∞ X e−λ1 · · · e−λk X
··· A+ λi dλ1 · · · dλk = BA,x (j, k) [−Ei(−x)]k−j , (7.21)
x x i=1
λ1 · · · λ k j=0
203
where
k P P 1 Pk−i1 −···−ij−1 k k − i1 − · · · − ij−1
BA,x (j, k) = ki1 =1 k−i
i2 =1 · · · ij =1 · · ·
j i1 ij
and
Z ∞
Γ(i, x) = e−t ti−1 dt
x
Z ∞ Z ∞
PT
··· det −1 IR + PR M diag {λ1 , · · · , λR } e−λ1 · · · e−λR dλ1 · · · dλR .
0 0 8 R + i=1 λi
We use the same method as in the previous section: breaking every integral into two
parts. Therefore,
R
1X X
Pe ≤ Ti01 ,··· ,ik ,
2 k=0 1≤i <···<i
1 k ≤R
while
0
T1,··· ,k
Z ∞ Z ∞Z x Z x
PT
= ··· ··· det −1 IR + PR M diag {λ1 , · · · , λR }
| x {z x }| 0 {z 0 } 8 R + i=1 λi
k R−k
Therefore,
k Z x Z x
0 8 −1
T1,··· ,k < det [M ]1,··· ,k ··· e−λk+1 · · · e−λR dλk+1 · · · dλR
PT 0 0
Z Z " k
#k
∞ ∞ X e−λ1 · · · e−λk
··· R + (R − k)x + λi dλ1 · · · dλk .
x x i=1
λ1 · · · λ k
k k
8
−x R−k
X
0
T1,··· ,k < det −1
[M ]1,··· ,k 1 − e BR+(R−k)x,x (j, k) [−Ei(−x)]k−j .
PT j=0
205
1
Choose x = P
. Similarly, for large P ,
k
1 1
R + (R − k) ≈ Rk , −Ei − ≈ log P,
P P
1 1
1 − e− P ≈ , Γ(i, x) ≈ (i − 1)!.
P
Therefore,
k k
8 1 X
0
T1,··· ,k < det −1
[M ]1,··· ,k BR,x (j, k) logk−j P
PT P R−k j=0
k k
8 1 X
= det −1
[M ]1,··· ,k BR,x (k − l, k) logl P.
PT P R−k
l=0
In general,
k k
1 8 X
Ti01 ,··· ,ik < det −1 [M ]i1 ,··· ,ik BR,x (k − l, k) logl P.
PR T l=0
Corollary 7.2. If R 1,
R k
1 X 8R X
Pe . R det −1 [M ]i1 ,··· ,ik logk P. (7.22)
P k=0 T 1≤i <···<i 1 k ≤R
Proof: When R 1, BR (0, k) >> BR (l, k) for all l > 0 since BR (0, k) = Rk is
the term with the highest order of R. Therefore, (7.22) is obtained from (7.19).
Remarks:
1. The k = l = R term,
R
−1 8R log P
det M , (7.23)
TP
206
in (7.19) has the highest order of P . By simple rewriting, it is equivalent to
R
8R log log P
det −1
M P −R(1− log P ), (7.24)
T
which is the same as (7.18) except for a coefficient of 2R . Therefore, the same
log log P
transmit diversity, R 1 − log P , is obtained.
R
−1 4R
det M .
PT
Comparing this with the highest order term given in (7.23), we can see the relay
network has a performance that is
3. Corollary 7.2 also gives the coding gain for networks with large number of relay
nodes. When P is very large (log P 1), the dominant term in (7.22) is (7.24).
The coding gain is therefore det−1 M , which is the same as the multiple-antenna
case. When P is not very large, the second term in (7.22),
R
R−1 X
8R logR−1 P
det −1 [M ]1,··· ,i−1,i+1,··· ,R ,
T i=1
PR
[M ]i1 ,··· ,ik = ([Si ]i1 ,··· ,ik − [Sj ]i1 ,··· ,ik )∗ ([Si ]i1 ,··· ,ik − [Sj ]i1 ,··· ,ik ),
where [Si ]i1 ,··· ,ik = [Ai1 si · · · Aik si ] is the distributed space-time code when
only the i1 , · · · , ik -th relay nodes are working. To have a good performance
for not very large transmit power, Corollary 7.2 indicates that the distributed
space-time code should have the property that it is “scale-free” in the sense
that it is still a good distributed space-time code when some of the relays are
not working. In general, for networks with any number of relay nodes, the same
conclusion can be obtained from (7.19).
4. Now we look at the low average total transmit power case, that is the P 1
P
case. With the same approximation R 2
i=1 |gi | ≈ R, using the power allocation
given in (7.11),
P P
P1 P2 T 2 2R
T P 2T
≈ = .
4 1 + P 1 + P2 R
P
|g | 2 4 (1 + P ) 16R
i=1 i
−1 P 2T 2 2
Pe . E det IR + M diag {|g1 | , · · · , |gR | }
gi 16R
−1
P 2T 2 2
2
= E 1+ tr M diag {|g1 | , · · · , |gR | } + o(P )
gi 16R
R
!−1
P 2T X 2 2
= E 1+ mii |gi | + o(P )
gi 16R i=1
R
!
P 2T X
= E 1− mii |gi |2 + o(P 2 )
gi 16R i=1
R
!
P 2T X
= 1− mii + o(P 2 )
16R i=1
208
P 2T
= 1− tr M + o(P 2 ),
16R
where mii is the (i, i) entry of M . Therefore, the same as in the multiple-antenna
case, the coding gain at low total transmit power is tr M . The design criterion
is to maximize tr M .
5. Corollary 7.2 also shows that the results obtained by the rigorous derivation in
this section is consistent with the approximate derivation in the previous section
except for a coefficient 2k . Actually the upper bound in (7.22) is tighter than
the one in (7.17). This is reasonable since in (7.22) all the terms except the one
with the highest order of R are omitted, however in the derivation of (7.17), we
P
approximate R 2
i=1 |gi | by its expected value R.
choice according to transmit diversity. The transmit diversity can be improved slightly
by choosing the positive number x optimally.
Theorem 7.4. The best transmit diversity can be obtained using the distributed space-
time codes is α0 R, where α0 is the solution of
R k k
X 8 X X
Pe . det −1 [M ]i1 ,··· ,ik BR (k − l, k)P −[α0 R+(1−α0 )(k−l)] . (7.27)
k=0
T 1≤i1 <···<ik ≤R l=0
209
If R 1,
" R k #
X 8R X
Pe . det −1 [M ]i1 ,··· ,ik P −α0 R . (7.28)
k=0
T 1≤i1 <···<ik ≤R
Pe
R (7.29)
.
P P 8 k Pk
≤ det −1 [M ]i1 ,··· ,ik PT (1 − ex )R−k l=0 BR+(R−k)x (k − l, k)[−Ei(−x)]
l
k=01≤i1 <···<ik ≤R
1
Set x = Pα
with α > 0. Therefore,
R−k R−k
−x R−k − P1α 1 1 1 1
1−e = 1−e = +o = α(R−k) + o .
Pα Pα P P α(R−k)
and
k
1 k 1
R + (R − k) α =R +O .
P Pα
From (7.14),
R k X k
X X 8
−1 αl logl P
Pe . det [M ]i1 ,··· ,ik BR (k − l, k) α(R−k)
k=0 1≤i1 <···<ik ≤R
PT l=0
P
R k Xk
X X 8
= −1
det [M ]i1 ,··· ,ik BR (k − l, k)P −[k+α(R−k)] αl logl P.
k=0 1≤i <···<i ≤R
T l=0
1 k
210
log α log log P
Note that αl = P l log P and logl P = P l log P . Therefore,
Pe
R k
k X
X X 8 log α log log P
. det −1
[M ]i1 ,··· ,ik BR (k − l, k)P −[αR+(1−α)k−l log P −l log P ].
k=0 1≤i1 <···<ik ≤R
PT l=0
which is the negative of the highest order of P . To obtain the best transmit diversity,
α should be chosen to maximize mink∈[1,R] β(α, k). Note that
∂β
|α=α0 = 0.
∂k
Also note
∂2β 1
= −1 − < 0.
∂α∂k α log P
Therefore,
∂β
∂k
> 0 if α < α0
.
∂β
∂k
< 0 if α > α0
211
Thus,
β(α, 0) = αR if α ≤ α0
min β(α, k) = .
k∈[1,R]
β(α, R) = R 1 − log α log log P
log P
− log P
if α ≥ α0
Therefore,
log α0 log log P
max min β(α, k) = β(α0 , R) = R 1 − − = α0 R,
α≥α0 k∈[1,R] log P log P
1 log α
Still we need to check the condition log P α0
. Define γ(α) = α + log P
. Then
dγ(α) 1
dα
=1+ α log P
> 0. Since
log log P
γ(1) = 1 > 1 −
log P
and
log log P
log log P
log log P log 1 − log P log log P
γ 1− =1− + <1− ,
log P log P log P log P
212
therefore,
log log P
1− < α0 < 1. (7.30)
log P
Therefore, α0 log P > log P − log log P . If log P log log P , log P 1 is true and
log log P
Proof: From the proof of Theorem 7.4, we know that 1 − log P
< α0 . We only
need to prove the other part. Let
Since as in the proof of Theorem 7.4, γ 0 (α) > 0. We just need to prove that γ(α1 ) −
log log P
1− log P
> 0.
Let’s first prove
log P log log P log log P
log α1 > − − .
log P − log log P log P (log P − log log P ) log P
1
Define g(x) = log(1 − x) + cx. Since g 0 (x) = c − 1−x , g 0 (x) > 0 if x < 1 − 1c . Note that
g(0) = 0, therefore, g(x) > 0 or equivalently log(1−x) > −cx when 0 < x < 1− 1c . Let
log log P log log P log P log log P
x0 = log P
− log P (log P −log log P )
and c0 = log P −log log P
. 1− c10 = log P
> x0 for P > e.
213
It is also easy to check that x0 > 0 for P > e. Therefore, log α1 = log(1 − x0 ) > −c0 x0
and
log log P
γ(α1 ) − 1 −
log P
log log P 1
= + log α1
log P (log P − log log P ) log P
log log P 1 log P log log P log log P
> − −
log P (log P − log log P ) log P log P − log log P log P log P (log P − log log P )
log log P
=
log P (log P − log log P )2
> 0.
Theorem 7.5 indicates that the PEP Chernoff bound of the distributed space-time
codes decreases faster than
R k R
X 8R X
−1 log P
det [M ]i1 ,··· ,ik
k=0
T 1≤i1 <···<ik ≤R
P
R k 1
!R
X 8R X (log P )1− log P −log log P
det −1 [M ]i1 ,··· ,ik .
k=0
T 1≤i1 <···<ik ≤R
P
" R k #−1
X 8R X
det −1 [M ]i1 ,··· ,ik ,
k=0
T 1≤i1 <···<ik ≤R
214
and the coding gain of (7.22) for very high SNR log P 1 is det M . To compare
√
the two, we assume that the singular values of M take their
maximum
−1 value, 2,
P R k
and R = T . Therefore the coding gain of (7.28) is R
k=0 4 = 5−R . The
k
coding gain of (7.22) is 4−R . The upper bound in (7.22) is 0.97dB better according
to coding gain.
Therefore, when P is extremely large, the new upper bound is tighter than the
previous one since it has a lager diversity. Otherwise, the previous bound is tighter
since it has a larger coding gain.
In this section, a more general type of distributed linear dispersion space-time codes
[HH02b] is discussed. The transmitted signal at the i-th relay node is designed as
r
P2
ti = (Ai ri + Bi ri ) i = 1, 2, · · · , R, (7.31)
P1 + 1
By separating the real and imaginary parts, (7.31) can be written equivalently as
r
ti,Re P2 A i + B i 0 ri,Re
= . (7.32)
ti,Im P1 + 1 0 Ai − B i ri,Im
215
The expected total transmit power at the i-th relay can therefore be calculated to be
P2 T .
Now let’s look at the received signal. Similar to the rewriting of (7.31), (7.2) can
be equivalently written as
ri,Re p fi,Re IT −fi,Im IT sRe vi,Re
= P1 T + .
ri,Im fi,Im IT fi,Re IT sIm vi,Im
Therefore,
r
ti,Re P1 P2 T A i + B i 0 fi,Re IT −fi,Im IT sRe
=
ti,Im P1 + 1 0 Ai − B i fi,Im IT fi,Re IT sIm
r
P2 vi,Re
+
P1 + 1 vi,Im
xRe
For the T × 1 complex vector x, define the 2T × 1 real vector x̂ = . Further
xIm
define the 2T × 2T real matrix
R
X gi,Re IT −gi,Im IT Ai + Bi 0 fi,Re IT −fi,Im IT
H= ,
i=1 gi,Im IT gi,Re IT 0 Ai − B i fi,Im IT fi,Re IT
r
P1 P2 T
x̂ = Hŝ + W,
P1 + 1
216
where H is the equivalent channel matrix and W is the equivalent noise.
Theorem 7.6 (ML decoding and PEP). Design the transmit signal at the i-th
relay node as in (7.31). Then
„ r «∗ „ r «
P1 P2 T P1 P2 T
x̂− Hŝi x̂− Hŝi
P1 +1 P1 +1
−
1 (
2 1+
P 2 PR
|g |2)
P (x|si ) = h PR iT e P1 +1 i=1 i
, (7.33)
P2 2
2π 1 + P1 +1 i=1 |gi |
r
2
P1 P2 T
arg max P (x|si ) = arg min
x̂ − Hŝi
.
si si
P1 + 1
F
Using the optimum power allocation given in (7.11), the PEP of mistaking s i with sj
has the following Chernoff upper bound for large P .
R
−1/2 PT
P
P e ≤ E det I2R + 8(R+
Gk Gkt , (7.34)
k=1 |gk | ) k=1
PR 2
gi
where
gk,Re IT −gk,Im IT Ak + Bk 0 (si − sj )Re −(si − sj )Im
Gk = .
gk,Im IT gk,Re IT 0 Ak − B k (si − sj )Im (si − sj )Re
Proof: To get the distribution of x|si , let’s first discuss the noise part. Since
vi and w are independent circularly symmetric Gaussian with mean 0 and vari-
ance IT , W is Gaussian with mean zero and its variance can be calculated to be
PR
1 + PP1 +1
2
|g
i=1 i | 2
I2T . Therefore, when fi and gi are known, x̂|si is Gaussian with
q
mean PP11P+1 2T
Hŝi and variance the same as that of W. Thus, P (x|si ) = P (x̂|ŝi ) is as
given in (7.33). It is straightforward to get the ML decoding from the distribution.
Now let’s look at the PEP of mistaking si by sj . By the same argument as in the
217
proof of Theorem 7.1, the PEP has the following Chernoff upper bound.
[H(s\ \
P1 P2 T t
− i −sj )] H(si −sj )
Pe ≤ E e ( 8 1+P1 +P2 R |gi |2 )
P
i=1 .
fi ,gi
Note that
fi,Re IT −fi,Im IT (si − sj )Re (si − sj )Re −(si − sj )Im fi,Re
= .
fi,Im IT fi,Re IT (si − sj )Im (si − sj )Im (si − sj )Re fi,Im
Therefore,
H(s\i − sj )
R
X gi,Re IT −gi,Im IT Ai + Bi 0 fi,Re IT −fi,Im IT (si − sj )Re
=
i=1 gi,Im IT gi,Re IT 0 Ai − B i fi,Im IT fi,Re IT (si − sj )Im
R
X gi,Re IT −gi,Im IT Ai + Bi 0 (si − sj )Re −(si − sj )Im fi,Re
=
i=1 gi,Im IT gi,Re IT 0 Ai − B i (si − sj )Im (si − sj )Re fi,Im
..
.
f
i,Re
= Ĥ ,
fi,Im
..
.
7.11 Either Ai = 0 or Bi = 0
We have not yet been able to explicitly evaluate the expectation in (7.34). Our
conjecture is that when T ≥ R, the same transmit diversity R 1 − logloglogP P will be
obtained. Here we give an analysis of a much simpler, but far from trivial, case:
for any i, either Ai = 0 or Bi = 0. That is, each relay node sends a signal that
is either linear in its received signal or linear in the conjugate of its received signal.
It is clear to see that Alamouti’s
scheme
is included in this case with R = 2, A1 =
0 1
I2 , B1 = 0, A2 = 0, and B2 = . The conditions that Ai + Bi and Ai − Bi are
1 0
orthogonal become that Ai is orthogonal if Bi = 0 and Bi is orthogonal if Ai = 0.
Theorem 7.7. Design the transmitted signal at the i-th relay node as in (7.31).
Use the optimum power allocation given in (7.11). Further assume that for any
i = 1, · · · , R, either Ai = 0 or Bi = 0. The PEP of mistaking si with sj has the
following Chernoff upper bound.
PT
P e ≤ E det −1 IR + PR (Ŝi − Ŝj )∗ (Ŝi − Ŝj )diag {|g1 |2 , · · · , |gR |2 } ,(7.35)
gi 2
8 R + i=1 |gi |
where
Theorem 7.8. Design the transmit signal at the i-th relay as in (7.31). Use the
optimum power allocation as given in (7.11). For the full diversity of the space-time
code, assume T ≥ R. Define
R k
" k
#
X 8 X X logl P
Pe . BR (k − l, k) det −1 [M̂ ]i1 ,··· ,ik .
k=0
T 1≤i1 <···<ik ≤R l=0
PR
R k
" R
#
X 8 X X
Pe . BR (k − l, k) det −1 [M̂ ]i1 ,··· ,ik P −[α0 R+(1−α0 )(k−l)] .
k=0
T 1≤i1 <···<ik ≤R l=0
all 0 ≤ k ≤ R, 1 ≤ i1 < · · · < ik ≤ R. That is, to have good performance for not very
large transmit power, the distributed space-time code should have the property that
it is “scale-free” in the sense that it is still a good distributed space-time code when
some of the relays are not working.
220
7.12 Simulation Results
In this section, we give the simulated performance of the distributed space-time codes
for different values of the coherence interval T , number of relay nodes R, and total
transmit power P . The fading coefficients between the transmitter and the relays,
fi , and between the receiver and the relays, gi , are modeled as independent complex
Gaussian variables with zero-mean and unit-variance. The fading coefficients keep
constant for T channel uses. The noises at the relays and the receiver are also modeled
as independent zero-mean unit-variance Gaussian additive noise. The block error rate
(BLER), which corresponds to errors in decoding the vector of transmitted signals
s, and the bit error rate (BER), which corresponds to errors in decoding s1 , · · · , sT ,
is demonstrated as the error events of interest. Note that one block error rate may
correspond to only a few bit errors.
The transmit signals at each relay are designed as in (7.4). We should remark
that our goal here is to compare the performance of linear dispersion (LD) codes
implemented distributively over wireless networks with the performance of the same
codes in multiple-antenna systems. Therefore the actual design of the LD codes and
their optimality is not an issue here: all that matters is that the codes used for
simulations in both systems be the same.8 Therefore, the matrices, Ai , are generated
randomly based on the isotropic distribution on the space of T × T unitary matrices.
It is certainly conceivable that the performance obtained in the following figures can
s
6
{−(N − 1)/2, · · · , −1/2, 1/2, · · · , (N − 1)/2},
T (N 2 − 1)
q
6
where N is a positive integer. The coefficient T (N 2 −1)
is used for the normalization
of s given in (7.1). The number of possible transmitted signal is therefore L2T . Since
the channel is used in blocks of T transmissions, the rate of the code is, therefore,
1
log N 2T = 2 log N.
T
coefficients between the transmit antennas and the receive antenna as independent
zero-mean unit-variance complex Gaussian. The noises at the receive antenna are also
modeled as independent zero-mean unit-variance complex Gaussian. As discussed in
the chapter, the space-time code used is the T × R matrix S = [A1 s, · · · , AR s].
The rate of the space-time code is again 2 log N . In both systems, sphere decoding
[DAML00, HV02] is used to obtain the ML results.
and R
In Figure 7.3, the BER curves of relay networks with different coherence interval
T and number of relay nodes R are shown. The solid line indicates the BER of
a network with T = R = 5, the line with circles indicates the BER of a network
with T = 10 and R = 5, the dash-dotted line indicates the BER of a network
with T = R = 10, and the line with stars indicates the BER of a network with
T = R = 20. It can be seen from the plot that the bigger R, the faster the BER
222
BER of networks with different T and R
0
10
T=R=5
T=10,R=5
T=R=10
T=R=20
−1
10
−2
10
BER
−3
10
−4
10
−5
10
−6
10
20 21 22 23 24 25 26 27 28 29 30
P (dB)
curve decreases, which verifies our analysis that the diversity is linear in R when
T ≥ R. However, the slopes of the BER curves of the networks with T = R = 5 and
T = 10, R = 5 are the same. This verifies our result that the transmit diversity only
depends on min{T, R}, which is always R in our examples. Increasing the coherence
interval does not improve the diversity. According to the analysis in Sections 6 and
7, increasing T can improve the coding gain. However, not much performance improvement
from the larger coherence interval can be seen from the plot by comparing the solid line
(the BER curve of the network with T = R = 5) and the line with circles (the BER curve
of the network with T = 10, R = 5). The reason may be that our code is randomly chosen
without any optimization.
7.12.2 Performance Comparisons of Distributed Space-Time Codes

In this subsection, we compare the performance of the relay network with that of a multiple-antenna system using the same codes. The performance is compared in two ways. In one, we assume that the total
transmit power for both the systems is the same. This is done since the noise and
channel variances are everywhere normalized to unity. In other words, the total
transmit power in the networks (summed over the transmitter and R relay nodes) is
the same as the transmit power of the multiple-antenna systems. In the other, we
assume that the SNR at the receiver is the same for the two systems. Assuming that
the total transmit power is P, in the distributed scheme the SNR can be calculated
to be $\frac{P^2}{4(1+P)}$, and in the multiple-antenna setting it is P. Thus, roughly a 6 dB
increase in power is needed to make the SNR of the relay networks identical to that
of the multiple-antenna system.
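As a quick check of the 6 dB figure (a sketch that only evaluates the two SNR expressions above):

```python
import numpy as np

# Gap between the receive SNRs: P (multiple-antenna) vs. P^2 / (4(1+P)) (relay network)
for P_dB in (20, 30, 40):
    P = 10 ** (P_dB / 10)
    gap_dB = 10 * np.log10(P / (P ** 2 / (4 * (1 + P))))   # = 10 log10(4(1+P)/P)
    print(P_dB, round(gap_dB, 2))                          # approaches 10 log10(4) ~ 6.02 dB
```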
In the first example, T = R = 5 and N = 2, so the rate is 2. The BER and BLER curves are shown in Figures 7.4 and 7.5. Figure 7.4 shows the BER
and BLER of the two systems with respect to the total transmit power. Figure
7.5 shows the BER and BLER of the two systems with respect to the receive SNR.
In both figures, the solid and dashed curves indicate the BER and BLER of the
relay network. The curve with plus signs and curve with circles indicate the BER
and BLER of the multiple-antenna system. It can be seen from the figures that the
performance of the multiple-antenna system is always better than that of the relay network at
any total transmit power or SNR. This is what we expected because in the multiple-
antenna system, the transmit antennas of the transmitter can fully cooperate and
Figure 7.4: BER/BLER comparison of the relay network with the multiple-antenna system with respect to the total transmit power P (dB)

Figure 7.5: BER/BLER comparison of the relay network with the multiple-antenna system with respect to the receive SNR (dB)
relay network. However, the differences between the slopes of the BER and BLER curves of
the two systems diminish as the total transmit power increases. This can be
seen more clearly in Figure 7.5. At low SNR (0-10 dB), the BER and BLER curves of
the multiple-antenna system go down faster than those of the relay network. As the SNR
increases, the differences between the slopes of the BER and BLER curves vanish,
which indicates that the two systems have about the same diversity at high SNR. This
verifies our analysis of the transmit diversity.
Also, in Figure 7.4, at a BER of $10^{-4}$, the total transmit power of the relay
network is about 37.5 dB. Our analysis of (7.25) indicates that the performance of
the relay network should be 12.36 dB worse. Reading from the plot, we get an 11.5 dB
difference. This verifies the correctness and tightness of our upper bound.
In the next example, T = R = 10 and N = 2. Therefore, the rate is again 2.
The simulated performances are shown in Figures 7.6 and 7.7. Figure 7.6 shows
the BER and BLER of the two systems with respect to the total transmit power.
Figure 7.7 shows the BER and BLER of the two systems with respect to the receive
SNR. The indicators of the curves are the same as before. The BER and BLER of
both the relay network and the multiple-antenna system are lower than those in the
previous example. This is because there are more relay nodes or transmit antennas
present. From Figure 7.6, it can be seen that the multiple-antenna system has a
higher diversity at low transmit power. However, as the total transmit power or SNR
goes higher, the slope differences of the BER and BLER curves between the two
systems diminish. Figure 7.7 shows the same phenomenon. When the receive SNR is
low (0-10 dB), the performances of the two systems are about the same. However, the
BER and BLER curves of the multiple-antenna system go down faster than those
of the relay network. When the SNR is high (above 20 dB), the BER and BLER curves
Figure 7.6: BER/BLER comparison of the relay network with the multiple-antenna
system at T = R = 10, rate = 2 and the same total transmit power
Figure 7.7: BER/BLER comparison of the relay network with the multiple-antenna
system at T = R = 10, rate = 2 and the same receive SNR
have about the same slope.
Also, in Figure 7.6, at a BER of $10^{-3}$, the total transmit power of the relay
network is about 26 dB. Our analysis of (7.25) indicates that the performance of the
relay network should be 10.77 dB worse than that of the multiple-antenna system.
Reading from the plot, we get a 9 dB difference. At a BER of $10^{-4}$, the total transmit
power of the relay network is about 30 dB. Our analysis of (7.25) indicates that the
performance of the relay network should be 11.39 dB worse. Reading from the plot,
we get a 10 dB difference.
Figure 7.8: BER/BLER comparison of the relay network with the multiple-antenna
system at T = R = 20, rate = 2 and the same total transmit power
Figure 7.8 and Figure 7.9 show the performance of systems with T = R = 20 and
N = 2. The rate is again 2. Figure 7.8 shows the BER and BLER of the two systems
with respect to the total transmit power. Figure 7.9 shows the BER and BLER of
the two systems with respect to the receive SNR. The indicators of the curves are
Figure 7.9: BER/BLER comparison of the relay network with the multiple-antenna
system with T = R = 20, rate = 2 and the same receive SNR
the same as before. It can be seen from the figures that for total transmit power or
SNR higher than 20 dB, the slopes of the BER and BLER curves of the two systems
are about the same. Also, from Figure 7.9 we can see that for SNR less than 14 dB,
the performances of the two systems are about the same. However, the BER and
BLER curves of the multiple-antenna system then go down faster than those of the relay
network. When the SNR is high (above 20 dB), the performance difference converges.
Again, in Figure 7.8, at a BER of $10^{-4}$, the total transmit power of the relay
network is about 26 dB. Our analysis of (7.25) indicates that the performance of
the relay network should be 10.77 dB worse. Reading from the plot, we get a 9 dB
difference.
Figure 7.10: BER/BLER comparison of the relay network with the multiple-antenna
system at T = 10, R = 5, rate = 2 and the same total transmit power
7.13 Conclusion

In this chapter, the use of linear dispersion space-time codes in wireless relay net-
works is proposed. We assume that the transmitter and relay nodes do not know
the channel realizations but only their statistical distribution. The ML decoding and
the pairwise error probability at the receiver are analyzed. The main result is that the
diversity of the system behaves as $\min\{T,R\}\left(1-\frac{\log\log P}{\log P}\right)$, which shows that when
T ≥ R and the average total transmit power is very high ($P \gg \log P$), the relay
network has almost the same diversity as a multiple-antenna system with R transmit
antennas and one receive antenna. This result is also supported by simulations. It is
further shown that, assuming R = T, the leading-order term in the PEP behaves as
$\frac{1}{|\det(S_i-S_j)|^2}\left(\frac{8R\log P}{P}\right)^{R}$, which, compared with $\frac{1}{|\det(S_i-S_j)|^2}\left(\frac{4}{P}\right)^{R}$, the PEP of a space-time
code, shows the loss of performance due to the fact that the code is implemented
distributively and the relay nodes have no knowledge of the transmitted symbols. We
also observe that the high-SNR coding gain, $|\det(S_i-S_j)|^{-2}$, is the same as the one that
arises in space-time coding. The same is true at low SNR, where a trace condition
comes up.
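To get a feel for how slowly the factor $1-\frac{\log\log P}{\log P}$ approaches one, the short sketch below evaluates the diversity expression at a few power levels; natural logarithms are assumed here, the choice of base mattering only at finite P:

```python
import numpy as np

T, R = 5, 5
for P_dB in (20, 30, 40, 60, 80):
    P = 10 ** (P_dB / 10)
    d = min(T, R) * (1 - np.log(np.log(P)) / np.log(P))   # diversity expression
    print(P_dB, round(d, 2))        # slowly approaches min{T, R} = 5 as P grows
```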
We then improve the achieved diversity gain slightly (by an order no larger than
$O\!\left(\left(\frac{\log\log P}{\log P}\right)^{2}\right)$). Furthermore, a more general type of distributed space-time linear code
is discussed, in which the transmitted signal from each relay node to the receive node
is designed as a linear combination of both its received signal and the conjugate of its
received signal. For a special case, which includes Alamouti's scheme, exactly the
same diversity gain can be obtained. Simulation results on some randomly generated
distributed space-time codes are demonstrated, which verify our theoretical analysis
on both the diversity and coding gain.
There are several directions for future work that can be envisioned. One is to study
the outage capacity of our scheme. Another is to determine whether the diversity
order $\min\{T,R\}\left(1-\frac{\log\log P}{\log P}\right)$ can be improved by other coding methods that are
more complicated and general than the linear codes used here. We conjecture that it
large. Also, in our network model, only a single antenna is used at every node. What
if there are multiple antennas at the transmit node, the receive node, and/or the
relay nodes? For multiple-antenna systems, it has been shown in Chapter 2 that the
diversity increases linearly in the number of transmit and receive antennas. Here,
in relay networks, can we obtain the same linear increase in the number of antennas
constraint. If, in the relay network, we allow some of the relay nodes to decode
and then all the relay nodes, those who decode and those who do not, generate
a distributed space-time code, it is conceivable that the diversity can be improved
with some sacrifice of rate (needed for the decoding relay nodes to decode correctly).
receiver as well. The Cayley codes [HH02a] might be a suitable candidate for this.
7.14 Appendices
\[
I \equiv \int_x^{\infty}\!\!\cdots\int_x^{\infty}\left(A+\sum_{i=1}^{k}\lambda_i\right)^{k}\frac{e^{-\lambda_1}e^{-\lambda_2}\cdots e^{-\lambda_k}}{\lambda_1\cdots\lambda_k}\,d\lambda_1\cdots d\lambda_k.
\]
Consider the expansion of $\left(A+\sum_{i=1}^{k}\lambda_i\right)^{k}$ into monomial terms. We have
\[
\left(A+\sum_{i=1}^{k}\lambda_i\right)^{k}=\sum_{j=0}^{k}\ \sum_{1\le l_1<\cdots<l_j\le k}\ \sum_{i_1=1}^{k}\sum_{i_2=1}^{k-i_1}\cdots\sum_{i_j=1}^{k-i_1-\cdots-i_{j-1}} C(i_1,\ldots,i_j)\,\lambda_{l_1}^{i_1}\lambda_{l_2}^{i_2}\cdots\lambda_{l_j}^{i_j}\,A^{k-i_1-\cdots-i_j},
\]
where $j$ denotes how many $\lambda$'s are present, $l_1,\ldots,l_j$ are the subscripts of the $j$ $\lambda$'s
that appear, and $i_m\ge 1$ indicates that $\lambda_{l_m}$ is taken to the $i_m$-th power (the summation
should be
\[
\sum_{\substack{i_1,\ldots,i_j\ge 1\\ \sum i_m\le k}},
\]
which is equivalent to
\[
\sum_{i_1=1}^{k}\sum_{i_2=1}^{k-i_1}\cdots\sum_{i_j=1}^{k-i_1-\cdots-i_{j-1}}\ \Big).
\]
\[
C(i_1,\ldots,i_j)=\binom{k}{i_1}\binom{k-i_1}{i_2}\cdots\binom{k-i_1-\cdots-i_{j-1}}{i_j}
\]
counts how many times the term $\lambda_{l_1}^{i_1}\lambda_{l_2}^{i_2}\cdots\lambda_{l_j}^{i_j}A^{k-i_1-\cdots-i_j}$ appears in the expansion.
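As a small sanity check of this counting argument (a sketch for k = 3 using symbolic expansion; the variable names are only for illustration):

```python
import sympy as sp
from math import comb

k = 3
A, l1, l2, l3 = sp.symbols('A lambda1 lambda2 lambda3')
poly = sp.Poly(sp.expand((A + l1 + l2 + l3) ** k), l1, l2, l3, A)

# Monomial lambda_1^1 * lambda_2^1 * A^(k-2): here j = 2 and i_1 = i_2 = 1.
i1, i2 = 1, 1
coeff = poly.coeff_monomial(l1 ** i1 * l2 ** i2 * A ** (k - i1 - i2))

# C(i_1, i_2) = C(k, i_1) * C(k - i_1, i_2) as in the expression above
print(coeff, comb(k, i1) * comb(k - i1, i2))   # both equal 6
```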
Thus we have
\[
I=\sum_{j=0}^{k}\ \sum_{1\le l_1<\cdots<l_j\le k}\ \sum_{i_1=1}^{k}\cdots\sum_{i_j=1}^{k-i_1-\cdots-i_{j-1}} C(i_1,\ldots,i_j)\,I(j;l_1,\ldots,l_j;i_1,\ldots,i_j),
\]
where
\[
I(j;l_1,\ldots,l_j;i_1,\ldots,i_j)\equiv\int_x^{\infty}\!\!\cdots\int_x^{\infty}\lambda_{l_1}^{i_1}\lambda_{l_2}^{i_2}\cdots\lambda_{l_j}^{i_j}\,A^{k-i_1-\cdots-i_j}\,\frac{e^{-\lambda_1}\cdots e^{-\lambda_k}}{\lambda_1\cdots\lambda_k}\,d\lambda_1\cdots d\lambda_k.
\]
We compute
\begin{eqnarray*}
I(j;l_1,\ldots,l_j;i_1,\ldots,i_j) &=& A^{k-i_1-\cdots-i_j}\left(\prod_{m=1}^{j}\int_x^{\infty}\lambda_{l_m}^{i_m-1}e^{-\lambda_{l_m}}\,d\lambda_{l_m}\right)\prod_{i\ne l_1,\ldots,l_j}\int_x^{\infty}\frac{e^{-\lambda_i}}{\lambda_i}\,d\lambda_i\\
&=& A^{k-i_1-\cdots-i_j}\left(\prod_{m=1}^{j}\Gamma(i_m,x)\right)\left[-\mathrm{Ei}(-x)\right]^{k-j}.
\end{eqnarray*}
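The closed form above can be checked numerically for a small case (a sketch with k = 2, j = 1, i1 = 1 and arbitrary test values A = 2, x = 0.5; scipy's gammaincc is the regularized upper incomplete gamma and exp1(x) equals −Ei(−x)):

```python
import numpy as np
from scipy import integrate, special

A, x = 2.0, 0.5                  # arbitrary positive test values
k, j, i1 = 2, 1, 1               # one lambda (lambda_1) raised to the first power

# Direct numerical integration of lambda_1^{i1} A^{k-i1} e^{-l1} e^{-l2} / (l1 l2)
f = lambda l2, l1: l1 ** i1 * A ** (k - i1) * np.exp(-l1 - l2) / (l1 * l2)
lhs, _ = integrate.dblquad(f, x, np.inf, lambda l1: x, lambda l1: np.inf)

# Closed form: A^{k-i1} * Gamma(i1, x) * [-Ei(-x)]^{k-j}
upper_gamma = special.gammaincc(i1, x) * special.gamma(i1)
rhs = A ** (k - i1) * upper_gamma * special.exp1(x) ** (k - j)

print(lhs, rhs)                  # the two values agree to numerical precision
```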
Note that the result is independent of l1 , . . . , lj . Finally adding the terms up, we have
\begin{eqnarray*}
I &=& \sum_{j=0}^{k}\ \sum_{1\le l_1<\cdots<l_j\le k}\ \sum_{i_1=1}^{k}\cdots\sum_{i_j=1}^{k-i_1-\cdots-i_{j-1}} C(i_1,\ldots,i_j)\,A^{k-i_1-\cdots-i_j}\left[-\mathrm{Ei}(-x)\right]^{k-j}\prod_{m=1}^{j}\Gamma(i_m,x)\\
&=& \sum_{j=0}^{k}\binom{k}{j}\sum_{i_1=1}^{k}\cdots\sum_{i_j=1}^{k-i_1-\cdots-i_{j-1}}\binom{k}{i_1}\cdots\binom{k-i_1-\cdots-i_{j-1}}{i_j}\,\Gamma(i_1,x)\cdots\Gamma(i_j,x)\,A^{k-i_1-\cdots-i_j}\left[-\mathrm{Ei}(-x)\right]^{k-j}\\
&\equiv& \sum_{j=0}^{k} B_{A,x}(j,k)\left[-\mathrm{Ei}(-x)\right]^{k-j}.
\end{eqnarray*}
Thus ends the proof.
and
\[
\begin{bmatrix} g_{k,Re}I_T & \mp g_{k,Im}I_T\\ \pm g_{k,Im}I_T & g_{k,Re}I_T\end{bmatrix}
\begin{bmatrix} (s_i-s_j)_{Re} & -(s_i-s_j)_{Im}\\ (s_i-s_j)_{Im} & (s_i-s_j)_{Re}\end{bmatrix}
=
\begin{bmatrix} (s_i-s_j)_{Re} & -(s_i-s_j)_{Im}\\ (s_i-s_j)_{Im} & (s_i-s_j)_{Re}\end{bmatrix}
\begin{bmatrix} g_{k,Re} & \mp g_{k,Im}\\ \pm g_{k,Im} & g_{k,Re}\end{bmatrix}.
\]
Therefore,
\[
\begin{bmatrix} g_{k,Re}I_T & -g_{k,Im}I_T\\ g_{k,Im}I_T & g_{k,Re}I_T\end{bmatrix}
\begin{bmatrix} A_k+B_k & 0\\ 0 & A_k-B_k\end{bmatrix}
\begin{bmatrix} (s_i-s_j)_{Re} & -(s_i-s_j)_{Im}\\ (s_i-s_j)_{Im} & (s_i-s_j)_{Re}\end{bmatrix}
=
\begin{bmatrix} A_k+B_k & 0\\ 0 & A_k-B_k\end{bmatrix}
\begin{bmatrix} (s_i-s_j)_{Re} & -(s_i-s_j)_{Im}\\ (s_i-s_j)_{Im} & (s_i-s_j)_{Re}\end{bmatrix}
\begin{bmatrix} g_{k,Re} & -\operatorname{sgn}_k\, g_{k,Im}\\ \operatorname{sgn}_k\, g_{k,Im} & g_{k,Re}\end{bmatrix},
\]
where $\operatorname{sgn}_k = 1$ if $B_k = 0$ and $\operatorname{sgn}_k = -1$ if $A_k = 0$. Thus,
\[
G_k G_k^t=\begin{bmatrix} A_k+B_k & 0\\ 0 & A_k-B_k\end{bmatrix}
\begin{bmatrix} (s_i-s_j)_{Re} & -(s_i-s_j)_{Im}\\ (s_i-s_j)_{Im} & (s_i-s_j)_{Re}\end{bmatrix}
\begin{bmatrix} |g_k|^2 & 0\\ 0 & |g_k|^2\end{bmatrix}
\begin{bmatrix} (s_i-s_j)_{Re} & -(s_i-s_j)_{Im}\\ (s_i-s_j)_{Im} & (s_i-s_j)_{Re}\end{bmatrix}^t
\begin{bmatrix} A_k+B_k & 0\\ 0 & A_k-B_k\end{bmatrix}^t.
\]
Define
\[
S_i' = \left[\begin{bmatrix} A_1+B_1 & 0\\ 0 & A_1-B_1\end{bmatrix}\begin{bmatrix} s_{i,Re} & -s_{i,Im}\\ s_{i,Im} & s_{i,Re}\end{bmatrix},\ \cdots,\ \begin{bmatrix} A_R+B_R & 0\\ 0 & A_R-B_R\end{bmatrix}\begin{bmatrix} s_{i,Re} & -s_{i,Im}\\ s_{i,Im} & s_{i,Re}\end{bmatrix}\right]. \tag{7.38}
\]
Define
\[
\mathrm{SGN} = \mathrm{diag}\left\{\begin{bmatrix}1 & 0\\ 0 & \operatorname{sgn}_1\end{bmatrix},\ \cdots,\ \begin{bmatrix}1 & 0\\ 0 & \operatorname{sgn}_R\end{bmatrix}\right\}.
\]
It is easy to see that the matrix
\[
\begin{bmatrix}[\hat S_i-\hat S_j]_{Re} & -[\hat S_i-\hat S_j]_{Im}\\ [\hat S_i-\hat S_j]_{Im} & [\hat S_i-\hat S_j]_{Re}\end{bmatrix}
\]
can be obtained by switching the columns of $(S_i'-S_j')\,\mathrm{SGN}$. More precisely,
\[
\begin{bmatrix}[\hat S_i-\hat S_j]_{Re} & -[\hat S_i-\hat S_j]_{Im}\\ [\hat S_i-\hat S_j]_{Im} & [\hat S_i-\hat S_j]_{Re}\end{bmatrix} = (S_i'-S_j')\,\mathrm{SGN}\prod_{k=2}^{R}E_k,
\]
where $E_k = [e_1,\cdots,e_{k-1},e_{2k-1},e_k,\cdots,e_{2k-2},e_{2k},\cdots,e_{2R}]$ with $\{e_k\}$ the standard
basis of $\mathbb{R}^{2R}$. It is easy to see that $E_k^{-1}=E_k^t$ and $\det E_k = 1$. Right multiplying by
$E_k$, we move the (2k − 1)-th column of a matrix to the k-th position and shift the
k-th to (2k − 2)-th columns one column to the right, that is,
Therefore,
where we have defined $G = \mathrm{diag}\{|g_1|^2,\cdots,|g_R|^2\}$. Note that for any complex matrix A,
\[
\det\begin{bmatrix}A_{Re} & A_{Im}\\ -A_{Im} & A_{Re}\end{bmatrix} = |\det A|^2.
\]
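This identity is easy to verify numerically (a quick sketch with a random 3 × 3 complex matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

# Real representation [[A_Re, A_Im], [-A_Im, A_Re]]
M = np.block([[A.real, A.imag], [-A.imag, A.real]])

print(np.linalg.det(M), abs(np.linalg.det(A)) ** 2)   # the two values agree
```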
Therefore,
\begin{eqnarray*}
\lefteqn{\det\left[I_{2R}+\frac{PT}{8\left(R+\sum_{i=1}^{R}|g_i|^2\right)}(S_i'-S_j')^t(S_i'-S_j')\,\mathrm{diag}\{|g_1|^2,|g_1|^2,\cdots,|g_R|^2,|g_R|^2\}\right]}\\
&=&\det\left[I_{2R}+\frac{PT}{8\left(R+\sum_{i=1}^{R}|g_i|^2\right)}\begin{bmatrix}[(\hat S_i-\hat S_j)^*(\hat S_i-\hat S_j)G]_{Re} & [(\hat S_i-\hat S_j)^*(\hat S_i-\hat S_j)G]_{Im}\\ -[(\hat S_i-\hat S_j)^*(\hat S_i-\hat S_j)G]_{Im} & [(\hat S_i-\hat S_j)^*(\hat S_i-\hat S_j)G]_{Re}\end{bmatrix}\right]\\
&=&\det\begin{bmatrix}I_R+\frac{PT}{8(R+\sum_{i=1}^{R}|g_i|^2)}[(\hat S_i-\hat S_j)^*(\hat S_i-\hat S_j)G]_{Re} & \frac{PT}{8(R+\sum_{i=1}^{R}|g_i|^2)}[(\hat S_i-\hat S_j)^*(\hat S_i-\hat S_j)G]_{Im}\\ -\frac{PT}{8(R+\sum_{i=1}^{R}|g_i|^2)}[(\hat S_i-\hat S_j)^*(\hat S_i-\hat S_j)G]_{Im} & I_R+\frac{PT}{8(R+\sum_{i=1}^{R}|g_i|^2)}[(\hat S_i-\hat S_j)^*(\hat S_i-\hat S_j)G]_{Re}\end{bmatrix}\\
&=&\left|\det\left[I_R+\frac{PT}{8\left(R+\sum_{i=1}^{R}|g_i|^2\right)}(\hat S_i-\hat S_j)^*(\hat S_i-\hat S_j)G\right]\right|^2.
\end{eqnarray*}
To conclude this thesis, a brief summary of its contributions and a discussion of
possible future research directions are given in the following.
As this thesis can be roughly divided into two parts, the MIMO/multiple-antenna
systems part and the wireless ad hoc network part, a separate summary and discussion
are provided for each.
MIMO/Multiple-Antenna Systems
is the well-known Alamouti scheme, is presented. A brief review of the real and
complex sphere decoding algorithms, which are widely used as fast ML decoding al-
gorithms in multiple-antenna communications, is also given. In Chapter 3, non-square
unitary space-time codes are designed via the Cayley transform. The codes can be used in
systems with any number of transmit and receive antennas and have a fast, nearly optimal
decoding algorithm. Preliminary simulations show that the codes are far better than
uncoded training-based space-time schemes and only slightly underperform op-
timized training-based schemes using orthogonal designs and linear dispersion codes.
most prominent one is the capacity. The capacity of multiple-antenna systems is still
unknown when neither the transmitter nor the receiver has the channel information,
which is the most practical case. Although some results have been obtained for the very high SNR
[MH99, HM00, ZT02, LM03] and very low SNR cases [LTV03, PV02, HS02b, RH04],
we have just scratched the surface of the research on the capacity of multiple-antenna
systems. The capacity when partial channel information is available and the capacity
for systems with frequency-selective channels are also open. This area of research will
remain timely and important for many years.
While most research on multiple-antenna systems has focused on exploiting the diversity
gain provided by multiple antennas, there is another gain, called spatial multiplex-
ing or degrees of freedom, corresponding to the increase in the data rate provided
by multiple antennas. In [ZT03], it is proved that these two types of gains can be
obtained simultaneously; however, there is a fundamental trade-off between how much
of each of the two gains can be extracted. Then comes the question of finding practical
codes which can actually achieve the optimal trade-off between diversity and spatial
multiplexing with good performance. In [GCD04], a coding scheme called LAST cod-
ing is proposed and is proved to achieve the optimal trade-off. Other related work
can be found in [LK04, GT04, TV04].
Another open problem is the error rate of multiple-antenna systems. The analysis
of the exact block or bit error rate is very difficult. In all the analysis given in this
thesis, the Chernoff upper bound on the pairwise error probability is used. There is
also work on the exact pairwise error probability [TB02]. Any improvement in the
analysis of the exact block/bit error rate, or any non-trivial lower bound on the error
probability, would be valuable.
Wireless Ad Hoc Networks
Chapter 7 is about wireless ad hoc networks. In this chapter, the idea of space-time
coding proposed for multiple-antenna systems is applied to wireless relay networks,
and it is shown that a diversity linear in the number of relay nodes can be obtained at
high transmit power although space-time codes are used distributively among the R
relay nodes.
As discussed in Chapter 7, the straightforward future research directions are the analysis
of the outage capacity of this coding scheme and coding scheme designs when no
channel information is known at the receiver. Another interesting regime is the low transmit power
regime, in which case the receive SNR at the relay nodes is low. In this situation,
decode-and-forward is not advantageous. However, if some relay nodes are very
near the transmitter, it might be advantageous for them to decode since they have
high receive SNRs, in terms of diversity gain, capacity, outage capacity, etc. In our
approach, simple signal processing, which is called distributed space-time coding,
is used at the relay nodes. No decoding needs to be done at the relay nodes, which
both saves computational complexity and improves reliability when the SNR
is not very high. This algorithm is superior to amplify-and-forward since the latter is
actually a special case of the former. Other work based on this algorithm can be found
might not be applicable for real ad hoc or sensory networks. What is more important
is that this might not be optimal if some relay nodes have full or partial knowledge
of their local channels. Relay nodes can estimate their instantaneous receive SNR
from the transmit node at any time. Therefore, it might be advantageous for those
relay nodes that have high SNRs to use higher transmit power to relay the signals.
Thus, the optimal power allocation among the relay nodes is another interesting
problem.
In the network model in Chapter 7, there is only one transmit-and-receive pair,
which is applicable to most sensory networks but not to general ad hoc wireless
networks. When there are multiple pairs of transmit and receive nodes, not only
noise but also interference exists. The most straightforward method to handle this
is to use time division or frequency division by assigning a different time instant or
frequency interval to every pair. However, it will be interesting to see if there are
better and more efficient strategies.
Other open questions include: when is multi-hop routing better than single-hop routing,
and what is the optimal power allocation? Related work can be found in
[GK00, GK01, GT02, TG03]. To get some results in this area, most of the work nowadays
focuses on one of two special types of networks: networks with a small number of nodes, so
that theoretical analysis is possible (for example, [TG03, CH04]), and networks with a very
large number of nodes, in which asymptotic results may be obtained [GK00, GK01, GT02].
Understanding wireless ad hoc networks is the key to our ultimate goal of wireless
communication: to communicate with anybody, anywhere, at any time, for anything. For a
considerable period of time, research on wireless ad hoc networks will remain timely,
interesting, and significant.
Bibliography
[Art04a] H. Artés. Channel dependent tree pruning for the sphere decoder. In
Proc. of 2004 IEEE International Symposium on Information Theory
(ISIT’04), June-July 2004.
OFDM for high data rate wireless communication over wide-band chan-
nels. In Proc. of 1998 Fall IEEE Vehicular Technology Conference
(VTC’98-Fall), pages 2232–2236, Sep. 1998.
[AVZ02] E. Agrell, A. Vardy, and K. Zeger. Closest point search in lattices. IEEE
Trans. on Information Theory, 48:2201–2214, Aug. 2002.
[DCB00] M. O. Damen, A. Chkeif, and J.-C. Belfiore. Lattice code decoder for
space-time codes. IEEE Communications Letters, pages 161–163, May
2000.
[DF99] Dummit and Foote. Abstract Algebra. John Wiley and Sons Inc., 2nd
edition, 1999.
[GK01] P. Gupta and P. R. Kumar. Internet in the sky: the capacity of three-
dimensional wireless networks. Communications in Information and Sys-
[GKB02] S. Galliou, I. Kammoun, and J. Belfiore. Space-time codes for the GLRT
noncoherent detector. In Proc. of 2002 International Symposium on In-
formation Theory (ISIT’02), page 419, 2002.
[GV02] M. Gastpar and M. Vetterli. On the capacity of wireless networks: the re-
lay case. In Proc. of the 21st Annual Joint Conference of the IEEE Com-
puter and Communications Societies (Infocom’02), pages 1577 – 1586,
June 2002.
[Hay01] S. Haykin. Communications Systems. John Wiley and Sons Inc., 4th
edition, 2001.
Cambridge, 1991.
[HJ02a] B. Hassibi and Y. Jing. Unitary space-time codes and the Cayley trans-
form. In Proc. of 2002 IEEE International Conference on Acoustics,
Speech, and Signal Processing, (ICASSP’02), volume 3, pages 2409–2412,
May 2002.
[HJ02b] B. Hassibi and Y. Jing. Unitary space-time modulation via the Cayley
transform. In Proc. of 2002 IEEE International Symposium on Informa-
tion Theory (ISIT ’02), page 134, June-July 2002.
[HVb] B. Hassibi and H. Vikalo. On the sphere decoding algorithm: Part II, generaliza-
tions, second-order statistics, and applications to communications. Submitted
to IEEE Trans. on Signal Processing.
Speech, and Signal Processing 2003 (ICASSP ’03), volume 4, Apr. 2003.
[JH03c] Y. Jing and B. Hassibi. Fully-diverse Sp(2) code design. In Proc. of 2003
[JH03e] Y. Jing and B. Hassibi. Unitary space-time modulation via Cayley trans-
form. IEEE Trans. on Signal Processing Special Issue on MIMO Com-
munications, 51:2891–2904, Nov. 2003.
[JH04f] Y. Jing and B. Hassibi. Using space-time codes in wireless relay net-
works. In Proc. of the 3rd IEEE Sensory Array and Multichannel Signal
[KS04] J. H. Kotecha and A. M. Sayeed. Transmit signal design for optimal esti-
mation of correlated MIMO channels. IEEE Trans. on Signal Processing,
52:546–557, Feb. 2004.
May 2001.
[LM03] A. Lapidoth and S. M. Moser. Capacity bounds via duality with appli-
cations to multi-antenna systems on flat-fading channels. IEEE Trans.
Apr. 2002.
[MN96] D. J. C. MacKay and R. M. Neal. Near Shannon limit performance of
low density parity check codes. IEE Electronics Letters, 32:1645–1655,
Aug. 1996.
[PGH00] G. Pei, M. Gerla, and X. Hong. Lanmar: landmark routing for large scale
wireless ad hoc networks with group mobility. In Proc. of the 1st Annual
workshop on Mobile and Ad Hoc Networking and Computing, Aug. 2000.
[Ser92] Jean-Pierre Serre. Lie Algebras and Lie Groups. Springer-Verlag, 1992.
[SW86] Sattinger and Weaver. Lie Groups and Algebras with Applications to
Physics, Geometry, and Mechanics. Springer-Verlag, 1986.
[SWWX04] A. Song, G. Wang, W. Wu, and X.-G. Xia. Unitary space-time codes
from Alamouti’s scheme with APSK signals. IEEE Trans. on Wireless
[TC01] M. X. Tao and R. S. Cheng. Improved design criteria and new trellis
codes for space-time coded modulation in slow flat-fading channels. IEEE
Communications Letters, 5:312–315, July 2001.
[TG03] S. Toumpis and A.J. Goldsmith. Capacity regions for wireless ad hoc
networks. IEEE Trans. on Wireless Communications, 2:736–748, July 2003.
time codes over Rayleigh fading channels. In Proc. of 2000 Fall IEEE
Vehicular Technology Conference (VTC’00-Fall), volume 5, pages 2285–
2290, Sep. 2000.
[Win83] J. H. Winters. Switched diversity with feedback for DPSK mobile radio
systems. IEEE Trans. on Vehicular Technology, 32:134–150, 1983.
[WX03] H. Wang and X-G Xia. Upper bounds of rates of complex orthogonal
space-time block codes. IEEE Trans. on Information Theory, 49:2788–
2796, Oct. 2003.
1978.
[Zas36] H. Zassenhaus. Über endliche Fastkörper. Abh. Math. Sem. Hamburg,
11:187–220, 1936.