Random Matrix Theory and Wireless Communications
Editorial Board
Editor-in-Chief: Sergio Verdú
Department of Electrical Engineering
Princeton University
Princeton, New Jersey 08544, USA
Editors
Venkat Anantharam (Berkeley) Amos Lapidoth (ETH Zurich)
Ezio Biglieri (Torino) Bob McEliece (Caltech)
Giuseppe Caire (Eurecom) Neri Merhav (Technion)
Roger Cheng (Hong Kong) David Neuhoff (Michigan)
K.C. Chen (Taipei) Alon Orlitsky (San Diego)
Daniel Costello (NotreDame) Vincent Poor (Princeton)
Thomas Cover (Stanford) Kannan Ramchandran (Berkeley)
Anthony Ephremides (Maryland) Bixio Rimoldi (EPFL)
Andrea Goldsmith (Stanford) Shlomo Shamai (Technion)
Dave Forney (MIT) Amin Shokrollahi (EPFL)
Georgios Giannakis (Minnesota) Gadiel Seroussi (HP-Palo Alto)
Joachim Hagenauer (Munich) Wojciech Szpankowski (Purdue)
Te Sun Han (Tokyo) Vahid Tarokh (Harvard)
Babak Hassibi (Caltech) David Tse (Berkeley)
Michael Honig (Northwestern) Ruediger Urbanke (EPFL)
Johannes Huber (Erlangen) Steve Wicker (GeorgiaTech)
Hideki Imai (Tokyo) Raymond Yeung (Hong Kong)
Rodney Kennedy (Canberra) Bin Yu (Berkeley)
Sanjeev Kulkarni (Princeton)
Editorial Scope
Foundations and Trends™ in Communications and Information Theory
will publish survey and tutorial articles in the following topics:
Antonia M. Tulino
Sergio Verdú
in North America:
now Publishers Inc.
PO Box 1024
Hanover, MA 02339
USA
Tel. +1-781-985-4510
Antonia M. Tulino1, Sergio Verdú2
1 Dept. Ingegneria Elettronica e delle Telecomunicazioni, Università degli Studi di
Napoli “Federico II”, Naples 80125, Italy
2 Dept. Electrical Engineering, Princeton University, Princeton, New Jersey 08544,
USA
Abstract
Random matrix theory has found many applications in physics, statistics
and engineering since its inception. Although early developments
were motivated by practical experimental problems, random matrices
are now used in fields as diverse as the Riemann hypothesis, stochastic
differential equations, condensed matter physics, statistical physics,
chaotic systems, numerical linear algebra, neural networks, multivariate
statistics, information theory, signal processing and small-world
networks. This article is a tutorial on random matrices that gives
an overview of the theory and brings together in one source
the most significant results recently obtained. Furthermore, the
application of random matrix theory to the fundamental limits of wireless
communication channels is described in depth.
Table of Contents
Section 1 Introduction
References
1 Introduction
From its inception, random matrix theory has been heavily influenced
by its applications in physics, statistics and engineering. The landmark
contributions to the theory of random matrices of Wishart (1928) [311],
Wigner (1955) [303], and Marčenko and Pastur (1967) [170] were
motivated to a large extent by practical experimental problems. Nowadays,
random matrices find applications in fields as diverse as the Riemann
hypothesis, stochastic differential equations, condensed matter physics,
statistical physics, chaotic systems, numerical linear algebra, neural
networks, multivariate statistics, information theory, signal processing,
and small-world networks. Despite the widespread applicability of the
tools and results in random matrix theory, there is no tutorial reference
that gives an accessible overview of the classical theory as well as the
recent results, many of which have been obtained under the umbrella
of free probability theory.
In the last few years, a considerable body of work has emerged in the
communications and information theory literature on the fundamental
limits of communication channels that makes substantial use of results
in random matrix theory.
The purpose of this monograph is to give a tutorial overview of ran-
y = Hx + n (1.1)
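$$F^n_{\mathbf{A}}(x) = \frac{1}{n} \sum_{i=1}^{n} 1\{\lambda_i(\mathbf{A}) \le x\}, \tag{1.2}$$
where $\lambda_1(\mathbf{A}), \ldots, \lambda_n(\mathbf{A})$ are the eigenvalues of $\mathbf{A}$ and $1\{\cdot\}$ is the
indicator function.
1 If $F^n_{\mathbf{A}}$ converges as $n \to \infty$, then the corresponding limit (asymptotic empirical distribution
or asymptotic spectrum) is simply denoted by $F_{\mathbf{A}}(x)$.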
1.2. The Role of the Singular Values 7
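ratio (SINR). For an i.i.d. input, the arithmetic mean over the users (or
transmit antennas) of the MMSE is given, as a function of H, by [271]
$$\begin{aligned}
\frac{1}{K} \min_{\mathbf{M} \in \mathbb{C}^{K \times N}} E\left[\|\mathbf{x} - \mathbf{M}\mathbf{y}\|^2\right]
&= \frac{1}{K}\, \mathrm{tr}\left\{\left(\mathbf{I} + \mathrm{SNR}\, \mathbf{H}^\dagger \mathbf{H}\right)^{-1}\right\} && (1.7) \\
&= \frac{1}{K} \sum_{i=1}^{K} \frac{1}{1 + \mathrm{SNR}\, \lambda_i(\mathbf{H}^\dagger \mathbf{H})} && (1.8) \\
&= \int_0^\infty \frac{1}{1 + \mathrm{SNR}\, x}\, dF^K_{\mathbf{H}^\dagger \mathbf{H}}(x) \\
&= \frac{N}{K} \int_0^\infty \frac{1}{1 + \mathrm{SNR}\, x}\, dF^N_{\mathbf{H}\mathbf{H}^\dagger}(x) - \frac{N-K}{K} && (1.9)
\end{aligned}$$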
where the expectation in (1.7) is over x and n while (1.9) follows from
(1.3). Note, incidentally, that both performance measures as a function
of SNR are coupled through
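$$\mathrm{SNR}\, \frac{d}{d\,\mathrm{SNR}} \log_e \det\left(\mathbf{I} + \mathrm{SNR}\, \mathbf{H}\mathbf{H}^\dagger\right) = K - \mathrm{tr}\left\{\left(\mathbf{I} + \mathrm{SNR}\, \mathbf{H}^\dagger \mathbf{H}\right)^{-1}\right\}.$$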
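The coupling between the two performance measures can be checked numerically. The following is a minimal sketch, assuming an arbitrary 8 × 4 complex Gaussian test matrix (the sizes, seed and step size `h` are illustrative choices, not taken from the text). The left-hand side is evaluated by a central finite difference of the log-determinant and the right-hand side directly from the trace.

```python
import numpy as np

def coupling_sides(H, snr, h=1e-5):
    """Return both sides of SNR * d/dSNR ln det(I + SNR H H^†) = K - tr[(I + SNR H^† H)^{-1}]."""
    N, K = H.shape
    HHh = H @ H.conj().T          # N x N
    HhH = H.conj().T @ H          # K x K

    def logdet(s):
        # natural-log determinant of the Hermitian positive definite matrix I + s H H^†
        return np.linalg.slogdet(np.eye(N) + s * HHh)[1]

    lhs = snr * (logdet(snr + h) - logdet(snr - h)) / (2.0 * h)   # numerical derivative
    rhs = K - np.trace(np.linalg.inv(np.eye(K) + snr * HhH)).real
    return lhs, rhs

rng = np.random.default_rng(0)                      # hypothetical test setup
H = rng.standard_normal((8, 4)) + 1j * rng.standard_normal((8, 4))
lhs, rhs = coupling_sides(H, snr=3.0)
print(lhs, rhs)
```

Since the identity holds for every realization of H, no averaging is needed; the two printed numbers should agree up to the finite-difference error.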
As we see in (1.5) and (1.9), both fundamental performance measures
(capacity and MMSE) are dictated by the empirical (squared) singular
value distribution of the random channel matrix.
In the simplest case of H having i.i.d. Gaussian entries, the density
function corresponding to the expected value of $F^N_{\mathbf{H}\mathbf{H}^\dagger}$ can be expressed
explicitly in terms of the Laguerre polynomials. Although the integrals
in (1.5) and (1.9) with respect to such a probability density function
(p.d.f.) lead to explicit solutions, limited insight can be drawn from
either the solutions or their numerical evaluation. Fortunately, much
deeper insights can be obtained using the tools provided by asymptotic
random matrix theory. Indeed, a rich body of results exists analyzing
the asymptotic spectrum of H as the numbers of columns and rows go
to infinity while the aspect ratio of the matrix is kept constant.
Before introducing the asymptotic spectrum results, some justifica-
tion for their relevance to wireless communication problems is in order.
In CDMA, channels with K and N between 32 and 64 would be fairly
typical. In multi-antenna systems, arrays of 8 to 16 antennas would be
Fig. 1.1 The Marčenko-Pastur density function (1.10) for β = 1, 0.5, 0.2.
Fig. 1.2 The Marčenko-Pastur density function (1.12) for β = 10, 1, 0.5, 0.2. Note that the
mass points at 0, present in some of them, are not shown.
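$$= \beta \log\left(1 + \mathrm{SNR} - \frac{1}{4} F(\mathrm{SNR}, \beta)\right) + \log\left(1 + \mathrm{SNR}\, \beta - \frac{1}{4} F(\mathrm{SNR}, \beta)\right) - \frac{\log e}{4\, \mathrm{SNR}}\, F(\mathrm{SNR}, \beta) \tag{1.14}$$
$$\begin{aligned}
\frac{1}{K}\, \mathrm{tr}\left\{\left(\mathbf{I} + \mathrm{SNR}\, \mathbf{H}^\dagger \mathbf{H}\right)^{-1}\right\}
&\to \int_a^b \frac{1}{1 + \mathrm{SNR}\, x}\, f_\beta(x)\, dx && (1.15) \\
&= 1 - \frac{F(\mathrm{SNR}, \beta)}{4\, \beta\, \mathrm{SNR}} && (1.16)
\end{aligned}$$
with
$$F(x, z) = \left(\sqrt{x (1 + \sqrt{z})^2 + 1} - \sqrt{x (1 - \sqrt{z})^2 + 1}\right)^2. \tag{1.17}$$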
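To see how quickly the asymptotic expression (1.16) becomes accurate, the normalized trace on the left-hand side of (1.15) can be compared with the closed form built from (1.17). This is a rough sketch, assuming H is N × K with i.i.d. zero-mean complex Gaussian entries of variance 1/N; the sizes N = 400, K = 200, the SNR value and the seed are arbitrary illustrative choices.

```python
import numpy as np

def F(x, z):
    """Auxiliary function F(x, z) of (1.17)."""
    return (np.sqrt(x * (1 + np.sqrt(z)) ** 2 + 1)
            - np.sqrt(x * (1 - np.sqrt(z)) ** 2 + 1)) ** 2

def mmse_asymptotic(snr, beta):
    """Right-hand side of (1.16)."""
    return 1.0 - F(snr, beta) / (4.0 * beta * snr)

def mmse_empirical(N, K, snr, rng):
    """(1/K) tr[(I + SNR H^† H)^{-1}] for one realization of H (entry variance 1/N, an assumption)."""
    H = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2 * N)
    G = H.conj().T @ H
    return np.trace(np.linalg.inv(np.eye(K) + snr * G)).real / K

rng = np.random.default_rng(0)
N, K, snr = 400, 200, 5.0
emp = mmse_empirical(N, K, snr, rng)
asy = mmse_asymptotic(snr, K / N)
print(emp, asy)
```

A single realization is used on purpose: the point of the asymptotic result is that the random quantity concentrates around a deterministic limit as the dimensions grow.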
Fig. 1.3 Several realizations of the left-hand side of (1.13) are compared to the asymptotic
limit in the right-hand side of (1.13) in the case of β = 1 for sizes: N = 3, 5, 15, 50.
(Four panels, one per value of N; horizontal axis: SNR.)
Fig. 1.4 The semicircle law density function (1.18) compared with the histogram of the
average of 100 empirical density functions for a Wigner matrix of size n = 100.
its entries are chosen to be i.i.d., then the eigenvalues of $\frac{1}{\sqrt{n}}\mathbf{A}$ are
asymptotically uniformly distributed on the unit disk of the complex
plane. This is commonly referred to as Girko’s full-circle law, which is
exemplified in Figure 1.5. It has been proved with various degrees of rigor
and generality in [173, 197, 85, 68, 9]. If the off-diagonal entries $A_{i,j}$ and
$A_{j,i}$ are Gaussian and pairwise correlated with correlation coefficient
ρ, then [238] shows that the eigenvalues of $\frac{1}{\sqrt{n}}\mathbf{A}$ are asymptotically
uniformly distributed on an ellipse in the complex plane whose axes
coincide with the real and imaginary axes and have semi-axis lengths 1 + ρ and
1 − ρ, respectively. When ρ = 1, the projection on the real axis of such an
elliptic law is equal to the semicircle law.
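The full-circle law is easy to probe by simulation. The sketch below assumes i.i.d. zero-mean complex Gaussian entries of unit variance and n = 500 (the size used in Figure 1.5); the seed and the test radius 0.5 are arbitrary. If the eigenvalues of the scaled matrix fill the region bounded by the unit circle uniformly, the fraction falling within radius r should be close to r².

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
# i.i.d. complex Gaussian entries, zero mean, unit variance (an assumption for this sketch)
A = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
eig = np.linalg.eigvals(A / np.sqrt(n))

radius = np.abs(eig)
frac_half = np.mean(radius <= 0.5)   # uniform filling predicts about 0.5**2
print(radius.max(), frac_half)
```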
Fig. 1.5 The full-circle law and the eigenvalues of a realization of a matrix of size n = 500.
dent entries, the matrices HH† whose eigenvalues are of interest do not
have independent entries.
When the entries of H are zero-mean i.i.d. Gaussian, HH† is
commonly referred to as a Wishart matrix. The analysis of the joint
distribution of the entries of Wishart matrices is as old as random matrix
theory itself [311]. The joint distribution of the eigenvalues of such
matrices is known as the Fisher-Hsu-Roy distribution and was discovered
simultaneously and independently by Fisher [75], Hsu [120], Girshick
[89] and Roy [210]. The corresponding marginal distributions can be
expressed in terms of the Laguerre polynomials [125].
The asymptotic theory of singular values of rectangular matrices
has concentrated on the case where the matrix aspect ratio converges
to a constant
$$\frac{K}{N} \to \beta \tag{1.19}$$
as the size of the matrix grows.
The first success in the quest for the limiting empirical singular
value distribution of rectangular random matrices is due to Marčenko
and Pastur [170] in 1967. This landmark paper considers matrices of
the form
$$\mathbf{W} = \mathbf{W}_0 + \mathbf{H}\mathbf{T}\mathbf{H}^\dagger \tag{1.20}$$
where T is a real diagonal matrix independent of H, W0 is a
deterministic Hermitian matrix, and the columns of the N × K matrix H are
i.i.d. random vectors whose distribution satisfies a certain symmetry
condition (encompassing the cases of independent entries and uniform
distribution on the unit sphere). In the special case where W0 = 0,
T = I, and H has i.i.d. entries with variance 1/N, the limiting spectrum
of W found in [170] is the density in (1.10). In the special case of square
H, the asymptotic density function of the singular values, corresponding
to the square root of the random variable whose p.d.f. is (1.10) with
β = 1, is equal to the quarter circle law:
$$q(x) = \frac{1}{\pi} \sqrt{4 - x^2}, \qquad 0 \le x \le 2. \tag{1.21}$$
As we will see in Section 2, in general (W0 ≠ 0 or T ≠ I) no closed-form
expression is known for the limiting spectrum. Rather, [170] character-
1.3. Random Matrices: A Brief Historical Account 17
Fig. 1.6 The quarter circle law compared with a histogram of the average of 100 empirical
singular value density functions of a matrix of size 100 × 100.
Figure 1.6 compares the quarter circle law density function (1.21)
with the average of 100 empirical singular value density functions of
a 100 × 100 square matrix H with independent zero-mean complex
Gaussian entries with variance 1/100.
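The experiment behind Figure 1.6 can be approximated with a few lines of NumPy. This is only a sketch of one way to do it: the bin count, histogram range and seed are assumptions, and the pooled histogram of 100 realizations is used in place of an average of 100 separate histograms (the two coincide for equal bin edges and equal sample counts).

```python
import numpy as np

def quarter_circle(x):
    """Quarter circle density q(x) of (1.21), zero outside [0, 2]."""
    return np.sqrt(np.clip(4.0 - x ** 2, 0.0, None)) / np.pi

rng = np.random.default_rng(2)
n, trials = 100, 100
sv = np.concatenate([
    np.linalg.svd(
        (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2 * n),
        compute_uv=False)
    for _ in range(trials)
])

# Pooled empirical density of the singular values versus q at the bin centers
hist, edges = np.histogram(sv, bins=22, range=(0.0, 2.2), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
mean_abs_err = np.mean(np.abs(hist - quarter_circle(centers)))
print(mean_abs_err, sv.mean(), 8 / (3 * np.pi))
```

The last printed value, 8/(3π), is the mean of the quarter circle law, obtained by integrating x q(x) over [0, 2]; the empirical mean should be close to it.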
Despite the ground-breaking nature of Marčenko and Pastur’s
contribution, it remained in obscurity for quite some time. For example, in
1977 Grenander and Silverstein [101] rediscovered (1.10) motivated by
a neural network problem where the entries of H take only two values.
Also unaware of the in-probability convergence result of [170], in 1978
Wachter [296] arrived at the same solution but in the stronger sense of
almost sure convergence under the condition that the entries of H have
4 The Stieltjes transform is defined in Section 2.2.1. The Dutch mathematician T. J. Stieltjes
(1856-1894) provided the first inversion formula for this transform in [246].
5 The mean is zero in the interesting special case where H has i.i.d. complex Gaussian
entries [15].
1 In the terminology introduced in [188], a random vector with real and imaginary
components x and y, respectively, is proper complex if $E\left[(\mathbf{x} - E[\mathbf{x}])(\mathbf{y} - E[\mathbf{y}])^T\right] = \mathbf{0}$.
22 Random Matrix Theory
The proof of Lemma 2.1 uses the expression of the p.d.f. of H given
in (2.1) and [67, Theorem 3.1].
The p.d.f. of the eigenvalues of standard Gaussian matrices is studied
in [32, 68]. If the n×n matrix coefficients are real, [69] gives an exact
expression for the expected number of real eigenvalues, which grows as
$\sqrt{2n/\pi}$.
$$\frac{1}{(2\pi)^{n/2}}\, e^{-\frac{1}{2} \sum_{i=1}^{n} \lambda_i^2}\, \prod_{i=1}^{n-1} \frac{1}{i!}\, \prod_{i < j}^{n} (\lambda_i - \lambda_j)^2. \tag{2.3}$$
As shown in [304, 172, 81, 175], the spacing between adjacent
eigenvalues of a Wigner matrix exhibits an interesting behavior. With the
eigenvalues of a Gaussian Wigner matrix sorted in ascending order,
denote by L the spacing between adjacent eigenvalues relative to the mean
eigenvalue spacing. The density of L in the large-dimensional limit is
accurately approximated by4
$$f_L(s) \approx \frac{\pi}{2}\, s\, e^{-\frac{\pi}{4} s^2} \tag{2.5}$$
For small values of s, (2.5) approaches zero, implying that very
small spacings are unlikely and that the eigenvalues somehow repel
each other.
3 Such matrices are often referred to as simply Gaussian Wigner matrices.
4 Wigner postulated (2.5) in [304] by assuming that the energy levels of a nucleus behave
like a modified Poisson process. Starting from the joint p.d.f. of the eigenvalues of a
Gaussian Wigner matrix, (2.5) has been proved in [81, 175] where its exact expression has
been derived. Later, Dyson conjectured that (2.5) may also hold for more general random
matrices [65, 66]. This conjecture has been proved by [129] for a certain subclass of not
necessarily Gaussian Wigner matrices.
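$$f_{\mathbf{A}}(\mathbf{B}) = \frac{\pi^{-m(m-1)/2}}{\det \mathbf{\Sigma}^{n}\, \prod_{i=1}^{m} (n-i)!}\, \exp\left[-\mathrm{tr}\left\{\mathbf{\Sigma}^{-1} \mathbf{B}\right\}\right] \det \mathbf{B}^{n-m}. \tag{2.6}$$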
UU† = U† U = I.
$$2^{-n}\, \pi^{-\frac{1}{2} n(n+1)} \prod_{i=1}^{n} (n-i)! \tag{2.7}$$
$$\frac{1}{n!} \prod_{i < \ell} |\zeta_i - \zeta_\ell|^2. \tag{2.8}$$
Lemma 2.9. [164, 96] For a central Wishart matrix W ∼ Wm (n, I),
E[tr{W}] = mn
E[tr{W2 }] = mn (m + n)
E[tr2 {W}] = mn (mn + 1).
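The three moments in Lemma 2.9 lend themselves to a quick Monte Carlo sanity check. The sketch below assumes that a central Wishart matrix W ∼ W_m(n, I) is generated as W = HH† with H an m × n matrix of i.i.d. zero-mean, unit-variance complex Gaussian entries; m = 2, n = 3, the number of trials and the seed are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, trials = 2, 3, 200_000

# Batch of m x n complex Gaussian matrices with unit-variance entries (assumed model)
H = (rng.standard_normal((trials, m, n)) + 1j * rng.standard_normal((trials, m, n))) / np.sqrt(2)
W = H @ np.conj(np.swapaxes(H, 1, 2))           # batch of W = H H^†

tr_W = np.trace(W, axis1=1, axis2=2).real
tr_W2 = np.trace(W @ W, axis1=1, axis2=2).real

estimates = {"E[tr W]": tr_W.mean(),
             "E[tr W^2]": tr_W2.mean(),
             "E[tr^2 W]": (tr_W ** 2).mean()}
exact = {"E[tr W]": m * n,
         "E[tr W^2]": m * n * (m + n),
         "E[tr^2 W]": m * n * (m * n + 1)}

for name in exact:
    print(name, estimates[name], exact[name])
```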
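Lemma 2.10. [164, 96] (see also [133]) For a central Wishart matrix
W ∼ Wm (n, I) with n > m,
$$E\left[\mathrm{tr}\left\{\mathbf{W}^{-1}\right\}\right] = \frac{m}{n-m} \tag{2.9}$$
while, for n > m + 1,
$$E\left[\mathrm{tr}\left\{\mathbf{W}^{-2}\right\}\right] = \frac{m\, n}{(n-m)^3 - (n-m)}$$
$$E\left[\mathrm{tr}^2\left\{\mathbf{W}^{-1}\right\}\right] = \frac{m}{n-m} \left( \frac{n}{(n-m)^2 - 1} + \frac{m-1}{n-m+1} \right).$$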
2.1. Types of Matrices and Non-Asymptotic Results 27
where ψ(·) is Euler’s digamma function [97], which for natural
arguments can be expressed as
$$\psi(m) = \psi(1) + \sum_{\ell=1}^{m-1} \frac{1}{\ell} = \psi(m-1) + \frac{1}{m-1} \tag{2.14}$$
with $\dot{\psi}(1) = \frac{\pi^2}{6}$.
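$$E\left[\det\left(\mathbf{H}^\dagger \mathbf{W}^{-1} \mathbf{H}\right)^{\zeta}\right] = \prod_{\ell=0}^{m-1} \frac{\Gamma(m + p - n - \zeta - \ell)\, \Gamma(n + \zeta - \ell)}{\Gamma(n - \ell)\, \Gamma(m + p - n - \ell)}$$
$$E\left[\log \det\left(\mathbf{H}^\dagger \mathbf{W}^{-1} \mathbf{H}\right)\right] = \sum_{\ell=0}^{m-1} \left( \psi(n - \ell) - \psi(m + p - n - \ell) \right).$$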
10 W = HH† is a pseudo-Wishart matrix if H is an m×n Gaussian matrix and the correlation
matrix of the columns of H has a rank strictly larger than n [244, 267, 94, 58, 59].
$$E[\det(\mathbf{I} + \gamma \mathbf{W})] = \sum_{i=0}^{m} \binom{m}{i} \frac{n!}{(n-i)!}\, \gamma^i. \tag{2.17}$$
− 1 F1 −ζ, 1 − d − ζ, γ
1
(2.19)
Γ(1 − d − ζ)
with 1 F1 (·) the confluent hypergeometric function [97] and with d =
r − t + i + k + 1.
12 In the remainder, det({f (i, j)}) denotes the determinant of a matrix whose (i, j)th entry
is f (i, j).
13 If b is an integer, $[b]_k = b(b+1) \cdots (b-1+k)$.
Fig. 2.1 Joint p.d.f. of the unordered positive eigenvalues of the Wishart matrix HH† with
r = 3 and t = 2. (Scaled version of (2.22).)
Theorem 2.17. [75, 120, 89, 210] Let the entries of H be i.i.d. complex
Gaussian with zero mean and unit variance. The joint p.d.f. of the
ordered strictly positive eigenvalues of the Wishart matrix HH†, λ1 ≥
. . . ≥ λt , equals
$$e^{-\sum_{i=1}^{t} \lambda_i}\, \prod_{i=1}^{t} \frac{\lambda_i^{r-t}}{(t-i)!\,(r-i)!}\, \prod_{i < j}^{t} (\lambda_i - \lambda_j)^2 \tag{2.22}$$
where D(i, j) is the (i, j)th cofactor of the matrix D with entries
$$D_{\ell,k} = \frac{(n - m + k - 1)!}{a_\ell^{-n+m-k}}. \tag{2.27}$$
Figure 2.2 contrasts a histogram obtained via Monte Carlo
simulation with the marginal p.d.f. of the unordered eigenvalues of W ∼
Wm (n, Σ) with n = 3 and m = 2 and with the correlation matrix Σ
chosen such that15
$$\Sigma_{i,j} = e^{-0.2 (i-j)^2}. \tag{2.28}$$
14 An alternative expression for (2.23) can be found in [183, B.7].
15 The correlation in (2.28) is typical of a base station in a wireless cellular system.
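with
$$\mathbf{\Xi} = \begin{bmatrix}
1 & a_1 & \cdots & a_1^{m-n-1} & a_1^{m-n-1} e^{-\lambda_1 / a_1} & \cdots & a_1^{m-n-1} e^{-\lambda_n / a_1} \\
\vdots & \vdots & & \vdots & \vdots & & \vdots \\
1 & a_m & \cdots & a_m^{m-n-1} & a_m^{m-n-1} e^{-\lambda_1 / a_m} & \cdots & a_m^{m-n-1} e^{-\lambda_n / a_m}
\end{bmatrix}.$$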
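$$N \sup_{x \ge 0} \left| F^N_{\mathbf{A}\mathbf{A}^\dagger}(x) - F^N_{\mathbf{B}\mathbf{B}^\dagger}(x) \right| \le \mathrm{rank}(\mathbf{A} - \mathbf{B}). \tag{2.30}$$
$$N \sup_{x \ge 0} \left| F^N_{\mathbf{A}}(x) - F^N_{\mathbf{B}}(x) \right| \le \mathrm{rank}(\mathbf{A} - \mathbf{B}). \tag{2.31}$$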
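A bound of the form (2.31) can be exercised numerically on a low-rank perturbation. The sketch below assumes, for illustration, that A and B are N × N Hermitian matrices differing by a positive semidefinite matrix of rank r; the helper `ecdf_sup_gap` and all sizes are hypothetical names and choices introduced here. Because both empirical distribution functions are right-continuous step functions, their largest gap is attained at one of the eigenvalues, so it suffices to evaluate them there.

```python
import numpy as np

def ecdf_sup_gap(eig_a, eig_b):
    """sup_x |F_A(x) - F_B(x)| for two empirical eigenvalue distributions."""
    a, b = np.sort(eig_a), np.sort(eig_b)
    grid = np.concatenate([a, b])                     # the only points where a jump occurs
    Fa = np.searchsorted(a, grid, side="right") / a.size
    Fb = np.searchsorted(b, grid, side="right") / b.size
    return np.max(np.abs(Fa - Fb))

rng = np.random.default_rng(4)
N, r = 200, 5
G = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
A = (G + G.conj().T) / 2                              # Hermitian test matrix
V = rng.standard_normal((N, r)) + 1j * rng.standard_normal((N, r))
B = A + V @ V.conj().T                                # rank-r Hermitian perturbation

gap = ecdf_sup_gap(np.linalg.eigvalsh(A), np.linalg.eigvalsh(B))
print(N * gap, np.linalg.matrix_rank(A - B))
```

The first printed number should not exceed the second, however large the entries of the perturbation are; only its rank matters.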
where the so-called expansion kernel $\{\psi_{k,\ell}(i, j)\}$ is a set of complete
orthonormal discrete basis functions formed by the eigenfunctions of
the correlation function of A, i.e., this kernel must satisfy for all
$k \in \{1, \ldots, N\}$ and $\ell \in \{1, \ldots, K\}$
$$\sum_{i'=1}^{N} \sum_{j'=1}^{K} r_{\mathbf{A}}(i, j; i', j')\, \psi_{k,\ell}(i', j') = \lambda_{k,\ell}(r_{\mathbf{A}})\, \psi_{k,\ell}(i, j) \tag{2.33}$$
then
$$\mathbf{A} = \mathbf{U} \tilde{\mathbf{A}} \mathbf{V}^\dagger$$
with $U_{k,i} = u_k(i)$ and $V_{j,\ell} = v_\ell^*(j)$, which renders the matrices U and V
unitary. As a consequence, A and its Karhunen-Loève image, Ã, have
the same singular values.
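$$\lim_{K \to \infty} \frac{1}{K} \sum_{j=1}^{K} 1\{P_{i,j} \le \alpha\}$$
$$\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} P_{i,j} = \lim_{K \to \infty} \frac{1}{K} \sum_{j=1}^{K} P_{i,j}. \tag{2.36}$$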
Pi,j = ϕ(i − j)
Theorem 2.27. [79, 193, 29, 44] Denote the maximum singular value
of A (spectral norm of A) by ρ(A). Let A1 , . . . , An , . . . be a stationary
ergodic sequence of random matrices for which
$$E[\log(\max\{\rho(\mathbf{A}_n), 1\})] < \infty.$$
2.2 Transforms
As mentioned in Section 1.3, it is often the case that the solution for the
limiting spectrum is obtained in terms of a transform of its distribution.
In this section, we review the most useful transforms including the
Shannon transform and the η-transform which, suggested by problems
of interest in communications, are introduced in this monograph.
For notational convenience, we refer to the transform of a random
variable and the transform of its cumulative distribution or density
function interchangeably. If the distribution of such variable equals
the asymptotic spectrum of a random matrix, then we refer to the
transform of the matrix and the transform of its asymptotic spectrum
interchangeably.
$$S(z) = \frac{1}{N} S_P(z) - \left(1 - \frac{1}{N}\right) \frac{1}{z} \tag{2.44}$$
where N is the dimension of s and $S_P$ is the Stieltjes transform of the
random variable $\|\mathbf{s}\|^2$.
2.2.2 η-transform
In the applications of interest, it is advantageous to consider a
transform that carries some engineering intuition, while at the same time is
closely related to the Stieltjes transform.
Interestingly, this transform, which has not been used so far in the
random matrix literature, simplifies many derivations and statements
of results.20
$$\eta_X(\gamma) = \sum_{k=0}^{\infty} (-\gamma)^k E[X^k], \tag{2.49}$$
whenever the moments of X exist and the series in (2.49) converges.
From (1.8) it follows that the MMSE considered in Section 1.2 is
equal to the η-transform of the empirical distribution of the eigenvalues
of H† H.
Simple properties of the η-transform that prove useful are:
• $\eta_X(\gamma)$ is strictly monotonically decreasing with γ ≥ 0 from 1
to P[X = 0].21
• $\gamma\, \eta_X(\gamma)$ is strictly monotonically increasing with γ ≥ 0 from
0 to $E\left[\frac{1}{X}\right]$.
while
$$\lim_{n \to \infty} \frac{1}{n}\, \mathrm{tr}\{\mathbf{A}^{-1}\} = \lim_{\gamma \to \infty} \gamma\, \eta_{\mathbf{A}}(\gamma). \tag{2.51}$$
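These properties can be illustrated on the empirical eigenvalue distribution of a single positive definite matrix. The sketch below assumes the usual expectation form η_X(γ) = E[1/(1 + γX)], which is what (1.8) and the series (2.49) suggest; the test matrix, the γ grid and the function name `eta` are illustrative choices.

```python
import numpy as np

def eta(eigs, gamma):
    """Empirical η-transform: average of 1 / (1 + γ λ_i) over the eigenvalues (assumed form)."""
    return np.mean(1.0 / (1.0 + gamma * eigs))

rng = np.random.default_rng(5)
n = 50
G = rng.standard_normal((n, 2 * n))
A = G @ G.T / (2 * n)                     # positive definite with probability one
eigs = np.linalg.eigvalsh(A)

gammas = np.logspace(-2, 6, 50)
eta_vals = np.array([eta(eigs, g) for g in gammas])

# Series (2.49) for a small γ, where it converges because γ λ_max < 1
g_small = 0.05
series = sum((-g_small) ** k * np.mean(eigs ** k) for k in range(40))

# Finite-n analogue of (2.51): γ η(γ) approaches (1/n) tr A^{-1} for large γ
inv_trace = np.mean(1.0 / eigs)
print(eta(eigs, g_small), series, gammas[-1] * eta_vals[-1], inv_trace)
```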
21 Note from (2.47) that it is easy (and, it will turn out, sometimes useful) to extend the
definition of the η-transform to (generalized or defective) distributions that put some
nonzero mass at +∞. In this case, $\eta_X(0) = P[X < \infty]$.
Fig. 2.3 η-transform of the Marčenko-Pastur law (1.10) evaluated for β = 0.1, 0.5, 1, 2, 10.
Lemma 2.29.
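$$\begin{aligned}
\frac{\gamma}{\log e} \frac{d}{d\gamma} V_X(\gamma) &= 1 - \frac{1}{\gamma} S_X\left(-\frac{1}{\gamma}\right) && (2.60) \\
&= 1 - \eta_X(\gamma). && (2.61)
\end{aligned}$$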
Since VX (0) = 0, VX (γ) can be obtained for all γ > 0 by integrating
the derivative obtained in (2.60). The Shannon transform contains the
same information as the distribution of X, either through the inversion
of the Stieltjes transform or from the fact that all the moments of X
are obtainable from VX (γ).
As we saw in Section 1.2, the Shannon transform of the empirical
distribution of the eigenvalues of HH† gives the capacity of various
communication channels of interest.
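$$V(\gamma) = \log\left(1 + \gamma - \frac{1}{4} F(\gamma, \beta)\right) + \frac{1}{\beta} \log\left(1 + \gamma \beta - \frac{1}{4} F(\gamma, \beta)\right) - \frac{\log e}{4 \beta \gamma}\, F(\gamma, \beta). \tag{2.62}$$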
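The derivative relation (2.60)–(2.61) gives a convenient consistency check on the closed form (2.62). The sketch below works in nats (so log e = 1), takes F from (1.17), and assumes that the η-transform of the same Marčenko-Pastur law is the expression 1 − F(γ, β)/(4βγ) appearing in (1.16); the derivative is approximated by a central difference with an arbitrary step `h`.

```python
import numpy as np

def F(x, z):
    """Auxiliary function F(x, z) of (1.17)."""
    return (np.sqrt(x * (1 + np.sqrt(z)) ** 2 + 1)
            - np.sqrt(x * (1 - np.sqrt(z)) ** 2 + 1)) ** 2

def shannon_mp(gamma, beta):
    """Closed form (2.62), natural logarithms."""
    f = F(gamma, beta) / 4.0
    return (np.log(1 + gamma - f)
            + np.log(1 + gamma * beta - f) / beta
            - f / (beta * gamma))

def eta_mp(gamma, beta):
    """η-transform assumed to match (2.62), as in (1.16)."""
    return 1.0 - F(gamma, beta) / (4.0 * beta * gamma)

def gamma_dV(gamma, beta, h=1e-6):
    """γ dV/dγ by central difference."""
    return gamma * (shannon_mp(gamma + h, beta) - shannon_mp(gamma - h, beta)) / (2 * h)

for beta in (0.2, 1.0, 2.0):
    for gamma in (0.5, 3.0, 10.0):
        print(beta, gamma, gamma_dV(gamma, beta), 1.0 - eta_mp(gamma, beta))
```

In each printed row the last two numbers should agree to within the finite-difference error, for aspect ratios both below and above one.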
Fig. 2.4 Shannon transform of the Marčenko-Pastur law (1.10) for β = 0.1, 0.5, 1, 2, 10.
Example 2.17. [61] The Shannon transform of grt (·) in (2.23) is23
t−1
k k
23 Related expressions in terms of the exponential integral function [97] and the Gamma
function can be found in [219] and [126], respectively.
log e n−1 i!
n
1 m
1 m
X
i=1 in
V(γ) = det
(−1) 2 γ 2 i<j φi − φj i<j ai − aj =1
d(d−1) n(n−1)
Y
MXY = MX MY . (2.69)
1
Example 2.20. If X is exponentially distributed with mean µ, then
ν 1−z
MX 2 (z) = Γ(ν + z − 1).
Γ(ν)
r −1 Γ (1 − z + n) Γ(z +
)
r−1 2 r−1−n
Mgr,r (1 − z) = .
Γ(z) Γ(1 − z) (n!)2
!
n=0 =0
VX (γ) = M−1
Υ (γ) (2.70)
where M−1
Υ is the inverse Mellin transform of
2.2.5 R-transform
Definition 2.14. [285] Let $S_X^{-1}(z)$ denote the inverse (with respect
to the composition of functions) of the Stieltjes transform of X, i.e.,
$z = S_X^{-1}(S_X(z))$. The R-transform of X is defined as the
complex-valued function of complex argument
$$R_X(z) = S_X^{-1}(-z) - \frac{1}{z}. \tag{2.72}$$
2.2.6 S-transform
24 A less compact definition of the S-transform on the complex plane is given in the literature
(since the η-transform had not been used before) for arbitrary random variables with
nonzero mean. Note that the restriction to nonnegative random variables stems from the
definition of the η-transform.
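$$\eta_{\mathbf{A}\mathbf{B}}^{-1}(\gamma) = \eta_{\mathbf{B}\mathbf{A}}^{-1}\left(\frac{\gamma - 1}{\beta} + 1\right) \tag{2.90}$$
and hence the S-transform counterpart to (2.56):
$$\Sigma_{\mathbf{A}\mathbf{B}}(x) = \frac{x+1}{x+\beta}\, \Sigma_{\mathbf{B}\mathbf{A}}\left(\frac{x}{\beta}\right). \tag{2.91}$$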
$$= \frac{1}{k+1} \binom{2k}{k}. \tag{2.95}$$
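The expression in (2.95) is the kth Catalan number. Assuming, as the surrounding discussion of Wigner matrices and Theorem 2.33 suggests, that (2.95) gives the even moments of the semicircle law, it can be compared with the normalized traces of powers of one large Wigner matrix. The sketch uses a complex Hermitian Gaussian matrix of size n = 400 with entry variance 1/n; these are illustrative choices.

```python
import numpy as np
from math import comb

rng = np.random.default_rng(6)
n = 400
G = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
W = (G + G.conj().T) / np.sqrt(2 * n)      # Hermitian, entries of variance 1/n
eigs = np.linalg.eigvalsh(W)

catalan = {k: comb(2 * k, k) // (k + 1) for k in (1, 2, 3)}
even_moments = {k: np.mean(eigs ** (2 * k)) for k in (1, 2, 3)}   # (1/n) tr W^{2k}

for k in (1, 2, 3):
    print(k, even_moments[k], catalan[k])
```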
The zero-mean assumption in the definition of a Wigner matrix can be
relaxed to an identical-mean condition using Lemma 2.23. In fact, it
suffices that the rank of the mean matrix does not grow linearly with
N for Theorem 2.33 to hold.
Assuming for simplicity that the diagonal elements of the Wigner
matrix are zero, we can give a simple sketch of the proof of Theorem
2.33 based on the matrix inversion lemma:
$$\left(\mathbf{A}^{-1}\right)_{i,i} = \frac{1}{A_{i,i} - \mathbf{a}_i^\dagger \mathbf{A}_i^{-1} \mathbf{a}_i} \tag{2.96}$$
2.3. Asymptotic Spectrum Theorems 53
Condition (2.93) on the entries of $\sqrt{N}\, \mathbf{W}$ can be replaced by the
Lindeberg-type condition on the whole matrix [10, Thm. 2.4]:
$$\frac{1}{N} \sum_{i,j} E\left[ |W_{i,j}|^2\, 1\{|W_{i,j}| \ge \delta\} \right] \to 0 \tag{2.98}$$
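$$\begin{aligned}
\int_a^b x^k\, \tilde{f}_\beta(x)\, dx &= \sum_{i=1}^{k} \frac{1}{k} \binom{k}{i} \binom{k}{i-1} \beta^i && (2.102) \\
&= \lim_{N \to \infty} \frac{1}{N}\, \mathrm{tr}\left\{ (\mathbf{H}\mathbf{H}^\dagger)^k \right\}. && (2.103)
\end{aligned}$$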
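The moment formula (2.102) and the trace limit (2.103) can be compared on a single large realization. The sketch assumes H is N × K with i.i.d. zero-mean complex Gaussian entries of variance 1/N and β = K/N; N = 600, K = 300 and the seed are arbitrary. The traces of powers of HH† are computed from the K eigenvalues of H†H, which are its nonzero eigenvalues.

```python
import numpy as np
from math import comb

def mp_moment(k, beta):
    """kth moment from (2.102)."""
    return sum(comb(k, i) * comb(k, i - 1) * beta ** i for i in range(1, k + 1)) / k

rng = np.random.default_rng(7)
N, K = 600, 300
H = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2 * N)
eigs = np.linalg.eigvalsh(H.conj().T @ H)

empirical = {k: np.sum(eigs ** k) / N for k in (1, 2, 3)}   # (1/N) tr (H H^†)^k
predicted = {k: mp_moment(k, K / N) for k in (1, 2, 3)}

for k in (1, 2, 3):
    print(k, empirical[k], predicted[k])
```

For k = 1 and k = 2 the formula reduces to β and β + β², which is a quick way to check an implementation by hand.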
$$\frac{1}{K} \sum_{\ell=1}^{K} |\lambda_\ell - x_\ell|^2 \xrightarrow{\text{a.s.}} 0. \tag{2.108}$$
Moreover, if d1 ≤ d2 ≤ . . . ≤ dK denote the ordered differences |λi − xi |,
then
$$d_{\lfloor yK \rfloor} \xrightarrow{\text{a.s.}} 0 \tag{2.109}$$
for all y ∈ (0, 1). For the smallest and largest eigenvalues of H† H, and
for the smallest and largest zero of the polynomial $L_K^{N-K+1}(N x)$, we
have that almost surely
$$\lim_{K \to \infty} x_1 = \lim_{K \to \infty} \lambda_1 = (1 - \sqrt{\beta})^2 \tag{2.110}$$
$$\lim_{K \to \infty} x_K = \lim_{K \to \infty} \lambda_K = (1 + \sqrt{\beta})^2 \tag{2.111}$$
W = W0 + HTH† (2.114)
$$S(z) = S_0\left( z - \beta\, E\left[ \frac{T}{1 + T\, S(z)} \right] \right). \tag{2.115}$$
The case W0 = 0 merits particular attention. Using the more
convenient η-transform and Shannon transform, we derive the following
result from [226]. (The proof is given in Appendix 4.1 under stronger
assumptions on T.)
Fig. 2.5 Shannon transform of the asymptotic spectrum of HTH† for β = 2/3 and T
exponentially distributed. The stars indicate the Shannon transform, obtained via Monte Carlo
simulation, of the averaged empirical distribution of the eigenvalues of HTH† where H is
3 × 2.
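If T = I, then $\eta_T(\gamma) = \frac{1}{1+\gamma}$, and (2.116) becomes
$$\eta = 1 - \beta + \frac{\beta}{1 + \gamma \eta} \tag{2.120}$$
whose explicit solution is the η-transform of the Marčenko-Pastur
distribution, $\tilde{f}_\beta(\cdot)$, in (1.12):
$$\eta(\gamma) = 1 - \frac{F(\gamma, \beta)}{4 \gamma}. \tag{2.121}$$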
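The claim that (2.121) solves the fixed-point equation (2.120) can be confirmed numerically. The sketch below finds the root of (2.120) in [0, 1] by bisection rather than by direct iteration, since the map η ↦ 1 − β + β/(1 + γη) need not be a contraction when β > 1 and γ is large; F is taken from (1.17), and the function names and test values are illustrative.

```python
import numpy as np

def F(x, z):
    """Auxiliary function F(x, z) of (1.17)."""
    return (np.sqrt(x * (1 + np.sqrt(z)) ** 2 + 1)
            - np.sqrt(x * (1 - np.sqrt(z)) ** 2 + 1)) ** 2

def eta_closed_form(gamma, beta):
    """Explicit solution (2.121)."""
    return 1.0 - F(gamma, beta) / (4.0 * gamma)

def eta_fixed_point(gamma, beta, iters=80):
    """Root in [0, 1] of η - (1 - β + β / (1 + γ η)) = 0, found by bisection.

    The residual is increasing in η, negative at η = 0 and positive at η = 1,
    so the root is unique.
    """
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        residual = mid - (1.0 - beta + beta / (1.0 + gamma * mid))
        if residual > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

for beta in (0.2, 1.0, 2.0):
    for gamma in (0.1, 1.0, 10.0):
        print(beta, gamma, eta_fixed_point(gamma, beta), eta_closed_form(gamma, beta))
```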
Equation (2.116) admits an explicit solution in a few other cases, one
of which is illustrated by the result that follows.
$$\eta_T(\gamma) = \frac{\gamma}{4 \tilde{\beta}}\, F\left(\frac{1}{\gamma}, \tilde{\beta}\right) \tag{2.122}$$
fΣ (λ) = −1 1− (2.126)
2πµλ2 σ1 σ2
with σ1 ≤ λ ≤ σ2 and
√ √
( σ2 − σ1 )2
µ= . (2.127)
4σ1 σ2
27 Although [223] obtained (2.123) with the condition that Y be Gaussian, it follows from
(2.121) and Theorem 2.39 that this condition is not required for (2.122) and (2.123) to
hold.
29 In the case that C and A are diagonal deterministic matrices, Theorem 2.43 is a special
case of Theorem 2.50.
Fig. 2.6 η-transform of HTH† with β = 2/3 and ηT given by (2.54). The stars indicate the
η-transform of the averaged empirical spectrum of HTH† for a 3 × 2 matrix H.
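$$\lim_{\gamma \to \infty} \left( \log(\gamma \beta) - \frac{V_{\mathbf{H}\mathbf{H}^\dagger}(\gamma)}{\min\{\beta\, P[T \ne 0],\, P[D \ne 0]\}} \right) = L_\infty \tag{2.140}$$
with
$$L_\infty = \begin{cases}
-E\left[\log \dfrac{D'}{\alpha \beta' e}\right] - \beta'\, V_{T'}(\alpha) & \beta' > 1 \\[2ex]
-E\left[\log \dfrac{T' D'}{e}\right] & \beta' = 1 \\[2ex]
-E\left[\log \dfrac{\Gamma_\infty T'}{e}\right] - \dfrac{1}{\beta'}\, V_{D'}\left(\dfrac{1}{\Gamma_\infty}\right) & \beta' < 1
\end{cases} \tag{2.141}$$
$$\eta_{T'}(\alpha) = 1 - \frac{1}{\beta'}, \qquad \eta_{D'}\left(\frac{1}{\Gamma_\infty}\right) = 1 - \beta'. \tag{2.142}$$
and with D′ and T′ the restrictions of D and T to the events D ≠ 0 and
T ≠ 0.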
a.s. γt (γ)
(N ) (y, γ) → y ∈ [0, 1]
γE[D]
(2.119), while30
m(m − k)!(k − 1)!
B(m1 , . . . , mk , n1 , . . . , nm+1−k ) = .
f (m1 , . . . , mk ) · f (n1 , . . . , nm+1−k )
with
c+1 (d) = β d E[T+1 ]E[D] .
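then, as K, N → ∞ with K/N → β, almost surely
$$\delta_n^{(N)}(y) \xrightarrow{\text{a.s.}} \frac{E[D\, m_n(D)]}{E[D]} = \frac{\xi_n}{E[D]} \tag{2.147}$$
where ξn can be computed through the following recursive equation
$$\xi_n = \beta \sum_{\ell=1}^{n} E\left[D^2\, m_{\ell-1}(D)\right] \sum_{\substack{n_1 + \cdots + n_i = n - \ell \\ 1 \le i \le n - \ell}} E\left[T^{i+1}\right] \xi_{n_1 - 1} \cdots \xi_{n_i - 1}$$
with
$$m_n(d) = \beta\, d \sum_{\ell=1}^{n} m_{\ell-1}(d) \sum_{\substack{n_1 + \cdots + n_i = n - \ell \\ 1 \le i \le n - \ell}} E\left[T^{i+1}\right] \xi_{n_1 - 1} \cdots \xi_{n_i - 1}. \tag{2.148}$$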
Moreover, $E[m_n(D)]$ yields yet another way to compute the nth moment
of the asymptotic spectrum of HH† .
Under mild assumptions on the distribution of the independent
entries of H, the following convergence result is shown in Appendix 4.4.
v N : [0, 1) × [0, 1) → R
(y,γ)
As K, N → ∞, (N ) converges almost surely to E[v(X,y)] , with (y, γ)
solution to the fixed-point equation
⎡ ⎤
v(X, y)
(y, γ) = E ⎣ ⎦ y ∈ [0, 1]. (2.157)
v(X,Y)
1 + γ β E 1+γ (Y,γ) |X
with ΓHH† (·, ·) and ΥHH† (·, ·) satisfying (2.154) and (2.155).
$$\beta' = \beta\, \frac{P[E[v(X, Y)|Y] \ne 0]}{P[E[v(X, Y)|X] \ne 0]},$$
we have that
$$\lim_{\gamma \to \infty} \left( \log(\gamma \beta) - \frac{V_{\mathbf{H}\mathbf{H}^\dagger}(\gamma)}{\min\{\beta\, P[E[v(X, Y)|Y] \ne 0],\, P[E[v(X, Y)|X] \ne 0]\}} \right) = L_\infty$$
with
⎧
⎪
⎪ −E log 1
E v(X ,Y )
| − β E [log (1 + α(Y ))] β > 1
⎪
⎪ e
1+α(Y ) X
⎪
⎪
⎨
a.s.
L∞ → −E log v(Xe,Y ) β = 1
⎪
⎪
⎪
⎪
⎪
⎪
⎩ −E log Γ∞ (Y ) − 1 E log 1 + E v(X ,Y ) |X β < 1
e β Γ∞ (Y )
where Γ∞ (y) is the solution to (2.160) for β < 1 and 0 otherwise while
(y, γ) is the solution to (2.157).
In the case that v(x, y) factors as $v(x, y) = v_X(x)\, v_Y(y)$, then (2.164)
becomes
$$m_n(r) = \beta\, r \sum_{\ell=1}^{n} m_{\ell-1}(r) \sum_{\substack{n_1 + \cdots + n_i = n - \ell \\ 1 \le i \le n - \ell}} E\left[D^{i+1}\right] E\left[C\, m_{n_1 - 1}(C)\right] \cdots E\left[C\, m_{n_i - 1}(C)\right]$$
Define
$$R(N, m) = \frac{1}{N} \sum H^*_{i_1, j_m} H_{i_1, j_1} \cdots H^*_{i_m, j_{m-1}} H_{i_m, j_m}, \tag{2.174}$$
where the summation ranges over all 2m-tuples $i_1, \ldots, i_m, j_1, \ldots, j_m$
satisfying $1 \le i_\ell \le N$ and $1 \le j_\ell \le K$, such that the cardinality of
the set of distinct values of $i_\ell$ plus the cardinality of the set of distinct
values of $j_\ell$ equals k + 1, and such that there is a one-to-one pairing of
the unconjugated and the conjugated terms in the products.
31 The existence of ρ(x, y) implies that Λ is a matrix that behaves ergodically in the sense
of Definition 2.17.
we have
whenever
$$\phi\left( p_i(\mathbf{A}_{j(i)}) \right) = 0 \qquad \forall i = 1, \ldots, \ell \tag{2.181}$$
where j(i) ≠ j(i + 1) (i.e., consecutive indices are distinct, but non-
neighboring indices are allowed to be equal).
such that
we have
(2.186)
Example 2.34. Any random matrix and the identity are asymptoti-
cally free.
Historically, Examples 2.35 and 2.36 are the first results on the
freeness of random matrices.
Since SS† and {P1 , P†1 , . . . , PL , P†L } are asymptotically free (e.g. Ex-
ample 2.45), it follows from Theorem 2.61 that
We note that, under the condition that P and V are unitary Haar
matrices, Theorems 2.61 and 2.62 hold not only in terms of asymptotic
freeness but also in terms of almost surely asymptotic freeness.
$$\mathbf{H}\mathbf{T}\mathbf{H}^\dagger = \sum_{k=1}^{K} T_k\, \mathbf{h}_k \mathbf{h}_k^\dagger. \tag{2.196}$$
2.4. Free Probability 83
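Thus, with ζ ≥ 0
$$\begin{aligned}
R_{\mathbf{H}\mathbf{T}\mathbf{H}^\dagger}(-\zeta) &= \lim_{K \to \infty} \sum_{k=1}^{K} R_{T_k \mathbf{h}_k \mathbf{h}_k^\dagger}(-\zeta) && (2.197) \\
&= \lim_{K \to \infty} \frac{\beta}{K} \sum_{k=1}^{K} \frac{T_k}{1 + T_k \zeta} && (2.198) \\
&= \beta\, \frac{1 - \eta_T(\zeta)}{\zeta} && (2.199)
\end{aligned}$$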
where (2.198) follows from (2.82) whereas (2.199) follows from the
law of large numbers. Finally, using the relationship between the η-
transform and the R-transform in (2.74) we obtain (2.116) letting
ζ = γηHTH† (γ), i.e.
Note that (2.197) has not been rigorously justified above, since it
involves both the limit in the size of the matrices which is the basis for
the claim of asymptotic freeness and a limit in the number of matrices.
The more general result (2.133)–(2.134) can be readily obtained
from (2.194), (2.195) and (2.200).
For T = I, we recover the η-transform in (2.121) of the Marčenko-Pastur
law. It is interesting to note that, in this special case, we are
summing unit-rank matrices whose spectra consist of a 1 − 1/N mass
at 0 and a 1/N mass at a location that converges to 1. If we were to
take the Nth classical convolution (inverting the sum of log-moment
generating functions) of those distributions we would obtain
asymptotically the Poisson distribution; however, the distribution we obtain by
taking the Nth free convolution (inverting the sum of R-transforms)
is the Marčenko-Pastur law. Thus, we can justifiably claim that the
Marčenko-Pastur law is the free analog of the classical Poisson law.
The free analog of the Gaussian law is the semicircle law according
to the celebrated free probability central limit theorem:
$$= \sqrt{2}\, R_W\left(\frac{z}{\sqrt{2}}\right) \tag{2.205}$$
which admits the solution (cf. Example 2.25)
$$R_W(z) = z. \tag{2.206}$$
1 †
m
†
HH = √ si si
m
i
is the semicircle law. This result was also found using the moment
approach, based on combinatorial tools, in [16] (without invoking
Gaussianity) and in [57] using results on the asymptotic distribution of the
zeros of Laguerre polynomials $L_N^{m}(\sqrt{N m}\, x + m + N)$.
34 Given the definition of the S-transform, we shall consider only nonnegative random ma-
trices whose trace does not vanish asymptotically.
$$\eta_{\mathbf{A}\mathbf{B}}(\gamma) = \eta_{\mathbf{A}}\left( \frac{\gamma}{\Sigma_{\mathbf{B}}(\eta_{\mathbf{A}\mathbf{B}}(\gamma) - 1)} \right). \tag{2.209}$$
As an application of (2.209), we can obtain the key relation (2.116)
from the S-transform of the Marčenko-Pastur law in (2.87)
$$\Sigma_{\mathbf{H}^\dagger \mathbf{H}}(x) = \frac{1}{1 + \beta x}$$
From (2.209), Examples 2.13, 2.32 and 2.45 we obtain the following
result.
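$$\eta_{\mathbf{Q}\mathbf{Q}^\dagger \mathbf{A}}(\gamma) = \eta_{\mathbf{A}}\left( \gamma + \gamma\, \frac{\beta - 1}{\eta_{\mathbf{Q}\mathbf{Q}^\dagger \mathbf{A}}(\gamma)} \right) \tag{2.212}$$
with K/N → β.
$$\begin{aligned}
\Sigma_{\mathbf{H}\mathbf{T}\mathbf{H}^\dagger}(x) &= \frac{x+1}{x+\beta}\, \Sigma_{\mathbf{H}^\dagger \mathbf{H}}\left(\frac{x}{\beta}\right) \Sigma_{\mathbf{T}}\left(\frac{x}{\beta}\right) && (2.215) \\
&= \frac{1}{x+\beta}\, \Sigma_{\mathbf{T}}\left(\frac{x}{\beta}\right) && (2.216)
\end{aligned}$$
where (2.216) follows from Example 2.29.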
Example 2.54. Consider the set {1, 2, 3, 4} and the non-crossing
partition {{1, 3}, {2}, {4}}. Definition 2.22 is interpreted graphically
in Figure 2.7(a) by connecting elements in the same block with a line.
The fact that these lines do not cross evidences the non-crossing nature
of the partition. In contrast, the crossing partition {{1, 3}, {2, 4}}
of the same set is shown in Figure 2.7(b).
Fig. 2.7 Figures (a) and (b) depict a non-crossing and a crossing partition respectively.
Fig. 2.8 The non-crossing partition {{1, 5, 7}, {2, 3, 4}, {6}} and its image under the
complementation map K, {{1, 4}, {2}, {3}, {5, 6}, {7}}, obtained with the repeated integers.
elements of the set placing them between the elements of the old set;
then connect with a line as many elements of the new set as possible
without crossing the lines of the original partition.
The number of non-crossing partitions of the set {1, 2, . . . , n} into
i blocks equals35
$$Q_i = \frac{1}{n} \binom{n}{i} \binom{n}{i-1}.$$
Moreover, the number of non-crossing partitions of {1, 2, . . . , n} equals
the nth Catalan number. This follows straightforwardly from the fact
that
$$\sum_{i=1}^{n} Q_i = \frac{1}{n+1} \binom{2n}{n}.$$
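Both counting statements can be confirmed by brute force for small n. The sketch below enumerates all set partitions of {1, …, n} through restricted growth strings, keeps those with no crossing (no a < b < c < d with a, c in one block and b, d in a different block), and tallies them by number of blocks. The function names are illustrative and the approach is only practical for small n.

```python
from itertools import combinations
from math import comb

def set_partitions(n):
    """Yield every partition of {1..n} as a tuple of block labels (restricted growth string)."""
    def extend(labels, max_label):
        if len(labels) == n:
            yield tuple(labels)
            return
        for lab in range(max_label + 2):          # reuse a block or open a new one
            yield from extend(labels + [lab], max(max_label, lab))
    yield from extend([0], 0)

def is_noncrossing(labels):
    """True if no positions a < b < c < d have a, c in one block and b, d in another."""
    for a, b, c, d in combinations(range(len(labels)), 4):
        if labels[a] == labels[c] and labels[b] == labels[d] and labels[a] != labels[b]:
            return False
    return True

def count_by_blocks(n):
    """Map: number of blocks -> number of non-crossing partitions of {1..n}."""
    counts = {}
    for labels in set_partitions(n):
        if is_noncrossing(labels):
            blocks = max(labels) + 1
            counts[blocks] = counts.get(blocks, 0) + 1
    return counts

for n in range(1, 7):
    counts = count_by_blocks(n)
    print(n, counts, sum(counts.values()), comb(2 * n, n) // (n + 1))
```

In this encoding the two partitions of Example 2.54 are `(0, 1, 0, 2)` and `(0, 1, 0, 1)`, and `is_noncrossing` accepts the first and rejects the second.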
35 Note that $\sum_{i=1}^{n} Q_i \beta^i$ equals the n-th moment of $\tilde{f}_\beta(\cdot)$ given in (1.12).
ξN = Op (aN ) (2.220)
ξN = o(aN ) a. s. (2.222)
Then,
$$\left\| F^N_{\mathbf{W}} - F_w \right\| = O_p(N^{-2/5}). \tag{2.224}$$
If we further assume that all entries of $\sqrt{N}\, \mathbf{W}$ have finite moments of
all orders, then for any η > 0, the empirical distribution of the Wigner
matrix tends to the semicircle law as
$$\left\| F^N_{\mathbf{W}} - F_w \right\| = o(N^{-2/5+\eta}) \quad \text{a.s.} \tag{2.225}$$
If we relax the assumption on the entries of $\sqrt{N}\, \mathbf{W}$ to simply finite
fourth-order moments, then the convergence rates for $F^N_{\mathbf{W}}$ and $E[F^N_{\mathbf{W}}]$
2 otherwise.
† 1−β
= log det(H H) + K log(1 − β) + log e
β
The counterpart of Theorem 2.75 for real H was first derived by Jonsson
in [131] for a real zero-mean matrix with Gaussian i.i.d. entries,
and an analogous result has been found by Girko in [87] for real (possibly
nonzero-mean) matrices with i.i.d. entries and variance 1/N. In the
special case of Gaussian entries, Theorem 2.75 can be easily obtained
following [131] using the expression of the moment-generating function
of log det(H† H) in (2.11). In the general case, Theorem 2.75 can be
easily verified using the result given in [15].
Notice that
\[ \lim_{\gamma\to\infty} \frac{\mathcal{F}(\gamma,\beta)}{4\gamma} = \min\{1,\beta\} \tag{2.237} \]
and Theorem 2.75 can be obtained as a special case.
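The limit (2.237) can be checked numerically. The sketch below assumes the definition of F(x, z) from (1.17), which lies outside this excerpt and is recalled here as F(x, z) = (√(x(1+√z)² + 1) − √(x(1−√z)² + 1))².

```python
import math


def F(x, z):
    """F(x, z) as recalled from (1.17); this recollection is an assumption."""
    root_z = math.sqrt(z)
    return (math.sqrt(x * (1 + root_z) ** 2 + 1)
            - math.sqrt(x * (1 - root_z) ** 2 + 1)) ** 2


gamma = 1e10
for beta in (0.25, 1.0, 4.0):
    # The ratio should approach min{1, beta} as gamma grows.
    print(beta, F(gamma, beta) / (4 * gamma), min(1.0, beta))
```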
2.5. Convergence Rates and Asymptotic Normality 95
More general results (for functions other than log(1+γx)) are given
in [15].
\[ \Delta_N = \sum_{i=1}^{N} g(\lambda_i) - N \int g(x)\, dF_{\mathbf{H}\mathbf{T}\mathbf{H}^\dagger}(x) \tag{2.241} \]
converges, as K, N → ∞ with K/N → β, to a zero-mean Gaussian random
variable.41
40 In [14, 13, 170, 222] this interval contains the spectral support of H† HT.
41 See [15] for an expression of the variance of the limit.
3 Applications to Wireless Communications
3.1. Direct-Sequence CDMA 97
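and
\[ C^{\mathsf{mmse}}(\beta,\mathsf{SNR}) = \beta \log\left(1 + \mathsf{SNR} - \frac{\mathcal{F}(\mathsf{SNR},\beta)}{4}\right) \tag{3.7} \]
while the capacity achieved with the optimum receiver is (1.14)
\[ C^{\mathsf{opt}}(\beta,\mathsf{SNR}) = \beta \log\left(1 + \mathsf{SNR} - \frac{\mathcal{F}(\mathsf{SNR},\beta)}{4}\right) + \log\left(1 + \mathsf{SNR}\,\beta - \frac{\mathcal{F}(\mathsf{SNR},\beta)}{4}\right) - \frac{\mathcal{F}(\mathsf{SNR},\beta)}{4\,\mathsf{SNR}} \log e. \tag{3.8} \]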
Fig. 3.1 Capacity of CDMA without fading for Eb/N0 = 10 dB: spectral efficiency (bits/s/Hz)
versus β for no spreading and for the optimal, MMSE, decorrelator and matched-filter receivers.
Figure 3.1 (from [275]) compares (3.6), (3.7) and (3.8) as a function
of the ratio of the number of users to the spreading gain, β, choosing SNR so that
\[ \frac{\beta\,\mathsf{SNR}}{C(\beta,\mathsf{SNR})} = \frac{E_b}{N_0} = 10. \]
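The comparison can be reproduced for the two expressions available in this excerpt, (3.7) and (3.8). The sketch below is an illustration under assumptions: F(x, z) is recalled from (1.17), logarithms are taken in base 2 so that the results are in bits/s/Hz, and the SNR matching a target Eb/N0 is found by a plain bisection (the helper `snr_for_ebn0` is hypothetical, not a method described in the text).

```python
import math


def F(x, z):
    """F(x, z) as recalled from (1.17); this recollection is an assumption."""
    root_z = math.sqrt(z)
    return (math.sqrt(x * (1 + root_z) ** 2 + 1)
            - math.sqrt(x * (1 - root_z) ** 2 + 1)) ** 2


def c_mmse(beta, snr):
    """Spectral efficiency (3.7) of the MMSE receiver, in bits/s/Hz."""
    return beta * math.log2(1 + snr - F(snr, beta) / 4)


def c_opt(beta, snr):
    """Spectral efficiency (3.8) of the optimum receiver, in bits/s/Hz."""
    f = F(snr, beta)
    return (beta * math.log2(1 + snr - f / 4)
            + math.log2(1 + snr * beta - f / 4)
            - math.log2(math.e) * f / (4 * snr))


def snr_for_ebn0(capacity, beta, ebn0, lo=1e-6, hi=1e9):
    """Bisection (on a log scale) for beta * SNR / C(beta, SNR) = ebn0."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if beta * mid / capacity(beta, mid) < ebn0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


for beta in (0.5, 1.0, 1.5, 2.0):
    snr = snr_for_ebn0(c_opt, beta, 10.0)  # Eb/N0 = 10 dB
    print(beta, snr, c_opt(beta, snr))
```

The bisection only relies on the ratio being below the target at very low SNR and above it at very high SNR, so it returns a root of the Eb/N0 equation without requiring a closed-form inverse.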
y = SAx + n. (3.9)
Here, the role of the received signal-to-noise ratio of the kth user is
taken by |Ak |2 SNR .
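The η-transform is intimately related to the performance of MMSE
multiuser detection of (3.1). The arithmetic mean of the MMSEs for
the K users satisfies [271, (6.27)]
\begin{align}
\frac{1}{K}\sum_{k=1}^{K} \mathsf{MMSE}_k &= \frac{1}{K}\,\mathrm{tr}\left\{\left(\mathbf{I} + \mathsf{SNR}\,\mathbf{A}^\dagger\mathbf{S}^\dagger\mathbf{S}\mathbf{A}\right)^{-1}\right\} \tag{3.10} \\
&\to \eta_{\mathbf{A}^\dagger\mathbf{S}^\dagger\mathbf{S}\mathbf{A}}(\mathsf{SNR}) \tag{3.11}
\end{align}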
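whereas the multiuser efficiency of the kth user (output SINR relative
to the single-user signal-to-noise ratio) achieved by the MMSE receiver,
$\eta_k^{\mathsf{mmse}}(\mathsf{SNR})$, is¹
\begin{align}
\eta_k^{\mathsf{mmse}}(\mathsf{SNR}) &= \mathbf{s}_k^T \left(\mathbf{I} + \mathsf{SNR}\sum_{i\neq k} |A_i|^2\, \mathbf{s}_i \mathbf{s}_i^T\right)^{-1} \mathbf{s}_k \tag{3.12} \\
&\to \eta_{\mathbf{S}\mathbf{A}\mathbf{A}^\dagger\mathbf{S}^\dagger}(\mathsf{SNR}) \tag{3.13}
\end{align}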
where the limit follows from (2.57). According to Theorem 2.39, the
MMSE multiuser efficiency, abbreviated as
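The limit in (3.12) and (3.13) can be illustrated by simulation. The sketch below makes several assumptions that are not spelled out in this excerpt: binary ±1/√N random spreading, equal-power users (|A_i| = 1), and the closed form η = 1 − F(SNR, β)/(4 SNR) for the equal-power limit, with F recalled from (1.17). The function names are hypothetical.

```python
import numpy as np


def F(x, z):
    """F(x, z) as recalled from (1.17); this recollection is an assumption."""
    root_z = np.sqrt(z)
    return (np.sqrt(x * (1 + root_z) ** 2 + 1)
            - np.sqrt(x * (1 - root_z) ** 2 + 1)) ** 2


def mmse_efficiency(S, snr, k):
    """Right-hand side of (3.12) for user k with |A_i| = 1 for all users."""
    n = S.shape[0]
    others = np.delete(S, k, axis=1)  # signatures of the interferers
    M = np.eye(n) + snr * others @ others.T
    s_k = S[:, k]
    return float(s_k @ np.linalg.solve(M, s_k))


def average_efficiency(n, num_users, snr, rng):
    """Average of (3.12) over the users for one draw of binary signatures."""
    S = rng.choice([-1.0, 1.0], size=(n, num_users)) / np.sqrt(n)
    return float(np.mean([mmse_efficiency(S, snr, k) for k in range(num_users)]))


def asymptotic_efficiency(beta, snr):
    """Large-system limit for equal-power users (assumed closed form)."""
    return 1 - F(snr, beta) / (4 * snr)


rng = np.random.default_rng(1)
print(average_efficiency(128, 64, 10.0, rng), asymptotic_efficiency(0.5, 10.0))
```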
and, for β ≤ 1,
\[ C^{\mathsf{dec}}(\beta,\mathsf{SNR}) = \beta\, E\left[\log\left(1 + |A|^2\,\mathsf{SNR}\,(1 - \beta\, \mathrm{P}[|A| > 0])\right)\right] \tag{3.17} \]
2 Although most fading distributions of practical interest do not have any point masses
at zero, we express various results without making such an assumption on the fading
distribution. For example, the inactivity of certain users or groups of users can be modelled
by nonzero point masses in the fading distribution.
3 Equation (3.18) also holds for the capacity with non-Gaussian inputs, as shown in [186]
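\[ = \frac{1-\eta}{\mathsf{SNR}}\left(1 + \frac{\mathsf{SNR}\,\dot{\eta}}{\eta}\right)\log e \tag{3.22} \]
where we used (3.15) to write (3.22). The derivative of (3.19) yields
\[ \frac{d}{d\mathsf{SNR}}\, C^{\mathsf{opt}}(\beta,\mathsf{SNR}) = \frac{1-\eta}{\mathsf{SNR}}\log e. \tag{3.23} \]
\[ \frac{d}{d\mathsf{SNR}}\, C^{\mathsf{opt}}(\beta,\mathsf{SNR}) - \frac{d}{d\mathsf{SNR}}\, C^{\mathsf{mmse}}(\beta,\mathsf{SNR}) = \dot{\eta}\left(1 - \frac{1}{\eta}\right)\log e, \tag{3.24} \]
which is equivalent to (3.18) since, at SNR = 0, both functions equal 0.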
Random matrix methods have also been used to optimize power
control laws in DS-CDMA, as the number of users goes to infinity,
for various receivers: matched filter, decorrelator, MMSE and optimum
receiver [217, 281].
Departing from the usual setup where the channel and spreading
sequences are known by the receiver, the performance analysis of blind and
group-blind linear multiuser receivers that have access only to the received
spreading sequence of the user of interest is carried out via random
matrix techniques in [318]. The asymptotic SINR at the output of direct
matrix inversion blind MMSE, subspace blind MMSE and group-blind
MMSE receivers with binary random spreading is investigated and an
interesting saturation phenomenon is observed. This indicates that the
performance of blind linear multiuser receivers is not only limited by
interference, but by estimation errors as well. The output residual inter-
ference is shown to be zero-mean and Gaussian with variance depending
on the type of receiver.
where
\[ \mathbf{A}_\ell = \mathrm{diag}\{A_{1,\ell}, \ldots, A_{K,\ell}\}, \qquad \ell = 1, \ldots, L \tag{3.26} \]
and {A_{k,ℓ}} indicates the i.i.d. fading coefficients of the kth user at the
ℓth antenna.
Assuming that the fading coefficients are bounded,4 using Lemma
2.60, [108] shows that the asymptotic averaged empirical singular value
distribution of (3.25) is the same as that of
\[ \begin{bmatrix} \mathbf{S}_1 \mathbf{A}_1 \\ \vdots \\ \mathbf{S}_L \mathbf{A}_L \end{bmatrix} \]
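\[ C^{\mathsf{dec}}(\beta,\mathsf{SNR}) = \beta\, E\left[\log\left(1 + \mathsf{SNR}\, P \left(1 - \frac{\beta}{L}\, \mathrm{P}[P > 0]\right)\right)\right] \tag{3.33} \]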
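while
\[ \mathbf{H} = [\mathbf{C}_1 \mathbf{s}_1, \ldots, \mathbf{C}_K \mathbf{s}_K]\,\mathbf{A} \tag{3.35} \]
\[ (\mathbf{C}_k)_{i,j} = \frac{1}{W_c}\, c_k\!\left(\frac{i-j}{W_c}\right) \tag{3.36} \]
with c_k(·) the impulse response of the channel for the kth user, independent
across users.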
Let Λ be an N × K matrix whose (i, j)th entry is
with E[ρk (X)] representing the one-dimensional channel profile (cf. Def-
inition 2.18) of Λ. The multiuser efficiency of the kth user achieved by
the MMSE receiver is [159]
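\begin{align}
\eta_k^{\mathsf{mmse}}(\mathsf{SNR}) &= \frac{\mathsf{SINR}_k}{\mathsf{SNR}\,\|\mathbf{h}_k\|^2} \tag{3.42} \\
&= \frac{\mathbf{h}_k^\dagger \left(\mathbf{I} + \mathsf{SNR}\sum_{i\neq k} \mathbf{h}_i \mathbf{h}_i^\dagger\right)^{-1} \mathbf{h}_k}{\|\mathbf{h}_k\|^2} \tag{3.43}
\end{align}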
(y, SNR )
→ (3.44)
E[ρk (X)]
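Let the ratio between the effective number of users and the effective
processing gain be defined as
\[ \beta' = \beta\, \frac{\mathrm{P}[E[\rho(\mathsf{X},\mathsf{Y})\,|\,\mathsf{Y}] > 0]}{\mathrm{P}[E[\rho(\mathsf{X},\mathsf{Y})\,|\,\mathsf{X}] > 0]}. \tag{3.46} \]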
Using Corollary 2.4, we obtain that the asymptotic MMSE multiuser
efficiency admits the following expression for β < 1:
H = CSA. (3.49)
Using Theorem 2.46 and with the aid of an auxiliary function χ(SNR ),
abbreviated as χ, we obtain that the MMSE multiuser efficiency of the
For the sake of brevity, we will not explicitly extend the analysis to
the case in which both frequency selectivity and multiple receive anten-
nas are present. This can be done by blending the results obtained in
the time delays of the resolvable paths of all users are assumed known.
Thus, the channel estimation encompasses only the path gains and it is
further conditioned on the data (hypothesis that is valid during training
or with error-free data detection). The joint estimation of the channel
path gains for all the users is performed over an estimation window
of Q symbols, presumed small relative to the channel coherence time.
For the ith symbol within this window, the output of the chip matched
filter is
\[ \mathbf{y}(i) = \sum_{k=1}^{K} \sum_{\ell=1}^{L} C_{k,\ell}\, \mathbf{s}_{k,\ell}(i)\, (\mathbf{x}(i))_k + \mathbf{n}(i) \tag{3.55} \]
The linear receiver performing data estimation operates under the belief
that the estimate of the ℓth path gain of the kth user has mean $\bar{C}_{k,\ell}$
and variance $\xi_k^2$. These estimates are further assumed uncorrelated and
with equal variance for all paths of each user. (When the channel is
perfectly known, $C_{k,\ell} = \bar{C}_{k,\ell}$ and $\xi_k^2 = 0$ and the results reduce to
\[ \frac{1}{1 + \xi_k^2\, \mathsf{SINR}_d} \sum_{\ell=1}^{L} |\bar{C}_{k,\ell}|^2\, \mathsf{SINR}_d \tag{3.56} \]
for D ≥ 0, where SINR_0 = 0 and SINR_1 = P SNR/(1 + β P SNR) is the SINR at the
output of the matched filter. The analysis for unequal-power users can be
found in [253, 255]. A generalization of the analysis in [116] and [253]
can be found in [162] where a connection between the asymptotic be-
havior of the SINR at the output of the reduced rank Wiener filter and
the theory of orthogonal polynomials for the so-called power moments
is established. It is further demonstrated in [116] and [162], numerically
and analytically respectively, that the number of stages D needed in
the reduced-rank MSW filter to achieve a desired output SINR does not
scale with the dimensionality; in fact, a few stages are usually sufficient
to achieve near-full-rank output SINR regardless of the dimension of the
signal space. However, the weights of the reduced-rank receiver do de-
pend on the spreading sequences. Therefore, in long-sequence CDMA
they have to be reevaluated from symbol to symbol, which hampers
real-time implementation.
To lift the burden of computing the weights from the spreading se-
quences for every symbol interval, [187, 265, 159] proposed the asymp-
totic reduced-rank MMSE receiver, which replaces the weights in (3.58)
with their limiting values in the asymptotic regime. Following this ap-
proach, various scenarios described by (1.1) have been evaluated in
[45, 105, 158, 159, 187, 265].8 For all these different scenarios it has
been proved that, in contrast with the exact weights, the asymptotic
weights do not depend on the realization of H and hence they do not
need to be updated from symbol to symbol. The asymptotic weights are
determined only by the number of users per chip and by the asymptotic
moments of HH† and thus, in order to compute these weights explic-
itly, it is only necessary to obtain explicit expressions for the asymptotic
eigenvalue moments of the interference autocorrelation matrix. Numer-
ical results show that the asymptotic weights work well for even modest
dimensionalities.
Alternative low-complexity implementations of both the decorrela-
tor and the MMSE receiver can be realized using the concepts of itera-
tive linear interference cancellation [84, 124, 33, 207, 71, 72], which rely
on well-known iterative methods for the solution of systems of linear
equations (and consequently for matrix inversion) [7]. This connection
has been recently established in [99, 251, 72]. In particular, parallel
interference cancellation receivers are an example of application of the
Jacobi method, first- and second-order stationary methods and Cheby-
shev methods, while serial interference cancellation receivers are an
example of application of Gauss-Seidel and successive relaxation meth-
ods. For all these linear (parallel and serial) interference cancellation
receivers, the convergence properties to the true decorrelator or MMSE
solution have been studied in [99] for large systems. For equal-power
users, the asymptotic convergence of the output SINR of the linear
multistage parallel interference cancellation receiver (based on the first
8 In [187], DS-CDMA with equal-power users and no fading is studied. In turn, [158]
considers the more general scenario of DS-CDMA with unequal-power users and flat-fading.
Related results in the context of the reduced-rank MSW and of the receiver originally pro-
posed by [179] were reported in [45]. In [158, 159], the analysis is extended to multi-antenna
receivers and further extended to include frequency selectivity in [105, 159]. Specifically,
the frequency-selective CDMA downlink is studied in [105] with the restriction that the
signature matrix be unitarily invariant with i.i.d. entries. In [159], in contrast, the analysis
with frequency-selectivity is general enough to encompass uplink and downlink as well as
signature matrices whose entries are independent with common mean and variance but
otherwise arbitrarily distributed. The case of frequency-selective CDMA downlink with
orthogonal signatures has been treated in [105].
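The link between linear interference cancellation and classical iterative solvers can be made concrete with a small experiment. The sketch below is an illustration under assumptions, not the receivers analyzed in the cited references: it uses binary ±1/√N spreading, a lightly loaded system so that the Jacobi iteration converges, and plain textbook Jacobi (parallel update) and Gauss-Seidel (serial update) iterations applied to the matched-filter outputs. All names are hypothetical.

```python
import numpy as np


def jacobi(M, b, iters):
    """Parallel update: every user subtracts the current estimate of the interference."""
    d = np.diag(M)
    x = np.zeros_like(b)
    for _ in range(iters):
        x = (b - (M @ x - d * x)) / d
    return x


def gauss_seidel(M, b, iters):
    """Serial update: each user reuses the estimates already refreshed in this sweep."""
    x = np.zeros_like(b)
    for _ in range(iters):
        for k in range(len(b)):
            x[k] = (b[k] - M[k] @ x + M[k, k] * x[k]) / M[k, k]
    return x


def make_system(n, num_users, sigma, rng):
    """Correlation matrix R = S^T S and matched-filter outputs b = S^T y."""
    S = rng.choice([-1.0, 1.0], size=(n, num_users)) / np.sqrt(n)
    bits = rng.choice([-1.0, 1.0], size=num_users)
    y = S @ bits + sigma * rng.standard_normal(n)
    return S.T @ S, S.T @ y


rng = np.random.default_rng(0)
sigma = 0.1
R, b = make_system(128, 8, sigma, rng)
decorrelator = np.linalg.solve(R, b)                      # R^{-1} b
mmse = np.linalg.solve(R + sigma ** 2 * np.eye(8), b)     # (R + sigma^2 I)^{-1} b
print(np.max(np.abs(jacobi(R, b, 50) - decorrelator)))
print(np.max(np.abs(gauss_seidel(R + sigma ** 2 * np.eye(8), b, 50) - mmse)))
```

The Jacobi iteration converges only when the spectral radius of its iteration matrix is below one, which is why the example uses a small load K/N; the Gauss-Seidel iteration converges for any symmetric positive definite system.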
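where $\mathbf{H}_k$ indicates the matrix H with the kth column removed. The
weights that minimize the mean-squared error are
\[ \mathbf{w} = \begin{bmatrix} H_1 + H_0 H_0 & \cdots & H_D + H_{D-1} H_0 \\ \vdots & \ddots & \vdots \\ H_D + H_{D-1} H_0 & \cdots & H_{2D-1} + H_{D-1} H_{D-1} \end{bmatrix}^{-1} \begin{bmatrix} H_0 \\ \vdots \\ H_{D-1} \end{bmatrix} \tag{3.61} \]
where the (i, j)th entry of the above matrix is $H_{i+j-1} + H_{i-1} H_{j-1}$ with
\[ H_m = \mathbf{h}_k^\dagger \left(\mathbf{H}_k \mathbf{H}_k^\dagger + \sigma^2 \mathbf{I}\right)^m \mathbf{h}_k. \tag{3.62} \]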
\[ H_m^\infty = \sum_{n=0}^{m} \binom{m}{n} \sigma^{2m-2n}\, |A_k|^2\, \mu_n \tag{3.65} \]
\[ \sum_{\ell=1}^{i} m_\ell = n - i + 1 \tag{3.67} \]
\[ \sum_{\ell=1}^{i} \ell\, m_\ell = n, \tag{3.68} \]
A similar result holds for the faded DS-CDMA with antenna diver-
sity described in Section 3.1.3 with |A| now equal to the square root
of the random variable whose distribution is given by the asymptotic
empirical distribution of P1 , . . . , PK as defined in Section 3.1.3.
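For the frequency-selective faded downlink, applying Theorem 2.48
to the model in Section 3.1.4 we have [159]
\[ H_m^\infty = \sum_{n=0}^{m} \binom{m}{n} \sigma^{2m-2n}\, |A_k|^2\, E\left[|C|^2\, m_n(|C|^2)\right] \tag{3.69} \]
where
\[ m_n(r) = \beta\, r \sum_{\ell=1}^{n} m_{\ell-1}(r) \sum_{\substack{n_1+\cdots+n_i = n-\ell \\ 1 \le i \le n-\ell}} E\left[|A|^{2i+2}\right] E\left[|C|^2\, m_{n_1-1}(|C|^2)\right] \cdots E\left[|C|^2\, m_{n_i-1}(|C|^2)\right] \tag{3.70} \]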
with |C|2 as in Section 3.1.4 and with |A| representing a random vari-
able, independent of |C|2 , whose distribution equals the asymptotic
3.2. Multi-Carrier CDMA 117
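\begin{align}
H_m^\infty &= \sum_{n=0}^{m} \binom{m}{n} \sigma^{2m-2n}\, \delta_{n,k} \\
&= \sum_{n=0}^{m} \binom{m}{n} \sigma^{2m-2n}\, E[\rho(\mathsf{X},k)]\, E[m_n(\mathsf{X})\,\rho_k(\mathsf{X})] \tag{3.71}
\end{align}
with ρ(·, ·) and ρ_k(·) as in Section 3.1.4 and with m_n(·) obtained
through the recursive equation given by (2.164) in Theorem 2.55.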
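\[ \Psi_\infty(y) = E\left[\frac{\upsilon(\mathsf{X},y)}{1 + \beta\, E\left[\frac{\upsilon(\mathsf{X},\mathsf{Y})}{\Psi_\infty(\mathsf{Y})}\,\Big|\,\mathsf{X}\right]}\right]. \tag{3.88} \]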
Although (3.90) and (3.93) are equivalent, they admit different inter-
pretations. The latter is a generalization of the capacity given in (3.18).
where
⎡ ⎤
υ(X, y)
(y, SNR ) = E ⎣ ⎦ (3.96)
υ(X,Z)
1 + SNR β(1 − y)E 1+SNR (Z,SNR ) |X
H = CQA. (3.97)
with C = diag(c).
The role of the received signal-to-noise ratio of the kth user is, in this
scenario, taken by |Ak |2 SNR E[|C|2 ] where |C| is a random variable whose
distribution equals the asymptotic empirical singular value distribution
of C.
In our asymptotic analysis, we assume that the empirical singular
value distributions of A and C converge almost surely to respective
nonrandom limiting distributions F|A| and F|C| .
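\[ \eta^{\mathsf{mmse}} = \frac{1}{E[|C|^2]}\, E\left[\frac{|C|^2}{1 + \mathsf{SNR}\,\beta\,|C|^2\, E\left[\frac{|A|^2}{1 + |A|^2\,\mathsf{SNR}\,E[|C|^2]\,\eta^{\mathsf{mmse}}}\right]}\right]. \tag{3.98} \]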
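where
\[ \theta\,\rho = 1 - \eta_{|A|^2}(\mathsf{SNR}\,\theta) \tag{3.102} \]
\[ \beta\,\theta\,\rho = 1 - \eta_{|C|^2}(\rho\,\beta). \tag{3.103} \]
Note that $\theta(\mathsf{SNR}) = E[|C|^2]\, \eta^{\mathsf{mmse}}(\mathsf{SNR})$.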
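\[ \eta_{\mathbf{C}\mathbf{Q}\mathbf{Q}^\dagger\mathbf{C}^\dagger}(\mathsf{SNR}) = \eta_{\mathbf{C}\mathbf{C}^\dagger}\!\left(\mathsf{SNR}\, \frac{\beta - 1 + \eta_{\mathbf{C}\mathbf{Q}\mathbf{Q}^\dagger\mathbf{C}^\dagger}(\mathsf{SNR})}{\eta_{\mathbf{C}\mathbf{Q}\mathbf{Q}^\dagger\mathbf{C}^\dagger}(\mathsf{SNR})}\right). \tag{3.108} \]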
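From
\[ \frac{1}{K}\sum_{k=1}^{K} \mathsf{MMSE}_k = \frac{1}{K}\sum_{k=1}^{K} \frac{1}{1 + \mathsf{SINR}_k} \tag{3.109} \]
it follows that, as K, N → ∞,
\[ \frac{1}{K}\sum_{k=1}^{K} \frac{1}{1 + \mathsf{SINR}_k} \xrightarrow{\text{a.s.}} 1 - \frac{1}{\beta}\left(1 - \eta_{\mathbf{C}\mathbf{Q}\mathbf{Q}^\dagger\mathbf{C}^\dagger}(\mathsf{SNR})\right). \tag{3.110} \]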
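\[ \beta\, \frac{\mathsf{SINR}}{1 + \mathsf{SINR}} = 1 - \eta_{\mathbf{C}\mathbf{C}^\dagger}\!\left(\mathsf{SNR}\, \frac{\beta}{1 + \mathsf{SINR}\,(1-\beta)}\right) \tag{3.112} \]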
9 This is called the Stiefel manifold (cf. Section 2, Footnote 2).
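whereas the multiuser efficiency of the kth user achieved by the MMSE
receiver, $\eta_k^{\mathsf{mmse}}(\mathsf{SNR})$, converges almost surely to
\[ \eta_k^{\mathsf{mmse}}(\mathsf{SNR}) \to \eta^{\mathsf{mmse}}\!\left(\mathsf{SNR}\, E[|C|^2]\right) \]
where the right side is the solution to the following equation at the
point $\tau = \mathsf{SNR}\, E[|C|^2]$
\[ \frac{\eta^{\mathsf{mmse}}}{1 + \tau\,\eta^{\mathsf{mmse}}} = E\left[\frac{|\tilde{C}|^2}{\beta\,\tau\,|\tilde{C}|^2 + 1 + (1-\beta)\,\tau\,\eta^{\mathsf{mmse}}}\right] \tag{3.113} \]
with $|\tilde{C}|^2 = |C|^2 / E[|C|^2]$. A fixed-point equation equivalent to (3.113) was
derived in [56].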
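For equal-power users, the spectral efficiencies achieved by the
MMSE receiver and the decorrelator are
\[ C^{\mathsf{mmse}}(\beta,\mathsf{SNR}) = \beta \log\left(1 + \mathsf{SNR}\, E[|C|^2]\, \eta^{\mathsf{mmse}}(\mathsf{SNR})\right) \tag{3.114} \]
and, for 0 ≤ β ≤ 1,
\[ C^{\mathsf{dec}}(\beta,\mathsf{SNR}) = \beta \log\left(1 + \mathsf{SNR}\, E[|C|^2]\,(1-\beta)\right). \tag{3.115} \]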
In parallel with [217, Eqn. (141)], the capacity of the optimum receiver
is characterized in terms of the η-transform of HH† = CQQ†C†
\[ C^{\mathsf{opt}}(\beta,\mathsf{SNR}) = \int_0^{\mathsf{SNR}} \frac{1}{x}\left(1 - \eta_{\mathbf{C}\mathbf{Q}\mathbf{Q}^\dagger\mathbf{C}^\dagger}(x)\right) dx \tag{3.116} \]
with $\eta_{\mathbf{C}\mathbf{Q}\mathbf{Q}^\dagger\mathbf{C}^\dagger}(\cdot)$ satisfying (3.108). An alternative characterization of
the capacity (inspired by the optimality by successive cancellation with
MMSE protection against uncancelled users) is given by
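with
\[ \frac{\gimel(y,\mathsf{SNR})}{1 + \gimel(y,\mathsf{SNR})} = E\left[\frac{\mathsf{SNR}\,|C|^2}{\beta\, y\, \mathsf{SNR}\,|C|^2 + 1 + (1 - \beta y)\,\gimel(y,\mathsf{SNR})}\right] \tag{3.118} \]
where Y is a random variable uniform on [0, 1].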
The case of unequal-power users has been analyzed in [37] with the
restrictive setup of a finite number of user classes where the power
is allowed to vary across classes but not over users within each class.
Reference [37] shows that the SINR of the kth user at the output of
the MMSE receiver, SINR_k, and consequently $\eta_k^{\mathsf{mmse}}(\mathsf{SNR})$, converge
almost surely to nonrandom limits. Specifically, the multiuser efficiency
converges to the solution η of
\[ E\left[\frac{|\tilde{C}|^2}{\beta\,|\tilde{C}|^2\left(1 - \eta_{|A|^2}(\tau\eta)\right) + \eta\left(1 - \beta + \beta\,\eta_{|A|^2}(\tau\eta)\right)}\right] = 1 \tag{3.119} \]
with $\tau = \mathsf{SNR}\, E[|C|^2]$. From the multiuser efficiency, the capacity can be
readily obtained using the optimality of successive interference
cancellation as done in (3.117).
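and
\[ \mathsf{SINR}_{\mathsf{orth}} = E\left[\frac{|\tilde{C}|^2}{\frac{1}{\tau}\left(1 - \beta\,\frac{\mathsf{SINR}_{\mathsf{orth}}}{1 + \mathsf{SINR}_{\mathsf{orth}}}\right) + \beta\,\frac{|\tilde{C}|^2}{1 + \mathsf{SINR}_{\mathsf{orth}}}}\right]. \tag{3.122} \]
Notice, by comparing (3.121) and (3.122), that in the latter the term
$\frac{1}{\tau} = \frac{1}{\mathsf{SNR}\,E[|C|^2]}$ is multiplied by $1 - \beta\,\frac{\mathsf{SINR}_{\mathsf{orth}}}{\mathsf{SINR}_{\mathsf{orth}}+1}$, which is less than 1.
Accordingly, for a given SINR the required SNR is reduced with respect
to the one required with i.i.d. sequences.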
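\[ H_m^\infty = |A_k|^2 \sum_{n=0}^{m} \binom{m}{n} \sigma^{2m-2n}\, \xi_n \tag{3.123} \]
where, in the case of i.i.d. spreading sequences,
\[ \xi_n = \beta \sum_{\ell=1}^{n} E\left[m_{\ell-1}(|C|^2)\,|C|^4\right] \sum_{\substack{n_1+\cdots+n_i = n-\ell \\ 1 \le i \le n-\ell}} E\left[|A|^{2i+2}\right] \xi_{n_1-1} \cdots \xi_{n_i-1} \tag{3.124} \]
and
\[ m_n(r) = \beta\, r \sum_{\ell=1}^{n} m_{\ell-1}(r) \sum_{\substack{n_1+\cdots+n_i = n-\ell \\ 1 \le i \le n-\ell}} E\left[|A|^{2i+2}\right] \xi_{n_1-1} \cdots \xi_{n_i-1} \tag{3.125} \]
with |C| and |A| random variables whose distributions equal the
asymptotic empirical distributions of the singular values of C and A,
respectively. In the case of orthogonal sequences, the counterparts of (3.124)
and (3.125) can be found in [105].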
For the uplink, the binomial expansion (3.62) becomes
\[ H_m^\infty = \sum_{n=0}^{m} \binom{m}{n} \sigma^{2m-2n}\, \xi_{n,k} \tag{3.126} \]
where
\[ \xi_{n,k} = E[m_n(\mathsf{X})\, v_k(\mathsf{X})] \tag{3.127} \]
with mn (·) solution to the recursive equation
n
mn (x) = β m−1 (x) E[ v(x, Y) E [v(X, Y)mn1 −1 (X)|Y]
=1 n1 +···+ni =n−
1≤i≤n−
3.3.1 Preliminaries
while
\[ \mathsf{SNR} = \frac{E[\|\mathbf{x}\|^2]}{\frac{1}{n_R} E[\|\mathbf{n}\|^2]}. \tag{3.130} \]
In contrast with the multiaccess scenarios, in this case the signals
transmitted by different antennas can be advantageously correlated and thus
the covariance of x becomes relevant. Normalized by its energy per
dimension, the input covariance is denoted by
\[ \mathbf{\Phi} = \frac{E[\mathbf{x}\mathbf{x}^\dagger]}{\frac{1}{n_T} E[\|\mathbf{x}\|^2]} \tag{3.131} \]
10 Although, in most of the multi-antenna literature, $E[\mathrm{tr}\{\mathbf{H}\mathbf{H}^\dagger\}] = n_T n_R$, for consistency
with the rest of the paper we use the normalization in (3.129). In the case that the entries
of H are identically distributed, the resulting variance of each entry is $1/n_T$.
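For all these scenarios, the capacity per receive antenna is given
by the maximum over Φ of the Shannon transform of the averaged
empirical distribution of HΦH†, i.e.
\[ C(\mathsf{SNR}) = \max_{\mathbf{\Phi}:\, \mathrm{tr}\,\mathbf{\Phi} = n_T} \mathcal{V}_{\mathbf{H}\mathbf{\Phi}\mathbf{H}^\dagger}(\mathsf{SNR}). \tag{3.132} \]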
If, instead, only statistical CSI is available, then V should be set, for
all the channels that we will consider, to coincide with the eigenvectors
of E[H† H] while the capacity-achieving power allocation, P, can be
found iteratively [264].
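\[ C(\mathsf{SNR}) = \beta \int_{\max\{a,\nu^{-1}\}}^{b} \log\left(\frac{\nu\,\mathsf{SNR}}{\beta}\,\lambda\right) f_\beta(\lambda)\, d\lambda \tag{3.136} \]
where ν satisfies
\[ \int_{\max\{a,\nu^{-1}\}}^{b} \left(\nu - \frac{\beta}{\mathsf{SNR}\,\lambda}\right)^{+} f_\beta(\lambda)\, d\lambda = 1 \tag{3.137} \]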
and thus we need only solve the integrals for β < 1. Applying Example
2.15 to (3.136) and Theorem 2.10 to (3.137) and exploiting (3.138), the
following result is obtained.
Fig. 3.2 Capacity (bits/s/Hz) versus SNR (dB) of a canonical channel with various numbers of
transmit and receive antennas (nT = 4, nR = 6; nT = 2, nR = 6; nT = 2, nR = 4), analytical and
simulation. The arrows indicate the SNR above which (3.139) is satisfied.
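\[ C(\beta,\mathsf{SNR}) = \beta \log\left(1 + \frac{\mathsf{SNR}}{\beta} - \frac{1}{4}\mathcal{F}\!\left(\frac{\mathsf{SNR}}{\beta},\beta\right)\right) + \log\left(1 + \mathsf{SNR} - \frac{1}{4}\mathcal{F}\!\left(\frac{\mathsf{SNR}}{\beta},\beta\right)\right) - \frac{\beta \log e}{4\,\mathsf{SNR}}\, \mathcal{F}\!\left(\frac{\mathsf{SNR}}{\beta},\beta\right) \]
with F(·, ·) given in (1.17). Notice that this capacity coincides, except
for a signal-to-noise scaling, with that of an unfaded equal-power
DS-CDMA channel.11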
If β = 1, the asymptotic capacity per receive antenna with statistical
CSI at the transmitter is equal to
\[ C(\beta,\mathsf{SNR}) = 2 \log\left(\frac{1 + \sqrt{1 + 4\,\mathsf{SNR}}}{2}\right) - \frac{\log e}{4\,\mathsf{SNR}} \left(\sqrt{1 + 4\,\mathsf{SNR}} - 1\right)^2 \]
evidencing the linear growth with the number of antennas originally
observed in [250, 76]. Further insight can be drawn, for arbitrary β,
from the high-SNR behavior of the capacity (cf. Example 2.15):
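\[ C(\mathsf{SNR}) = \begin{cases} \log\frac{\mathsf{SNR}}{e} - (\beta-1)\log\frac{\beta-1}{\beta} + o(1) & \beta > 1 \\[4pt] \log\frac{\mathsf{SNR}}{e} + o(1) & \beta = 1 \\[4pt] \beta \log\frac{\mathsf{SNR}}{\beta e} - (1-\beta)\log(1-\beta) + o(1) & \beta < 1. \end{cases} \]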
Besides asymptotically in the number of antennas, the high-SNR capac-
ity can be characterized for fixed nT and nR via (2.12) in Theorem 2.11.
Also in this case, the capacity is seen to scale linearly with the num-
ber of antennas, more precisely with min(nT , nR ). While this scaling
makes multi-antenna communication highly appealing, it hinges on the
validity of the idealized canonical channel model. Much of the research
that has ensued, surveyed in the remainder of this section, is geared
precisely at accounting for various nonidealities (correlation, determin-
istic channel components, etc) that have the potential of compromising
this linear scaling.
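The β = 1 closed form and its high-SNR approximation log(SNR/e) can be compared against a direct simulation of the canonical channel. The sketch below is an illustration under assumptions: complex Gaussian entries of variance 1/nT, isotropic input Φ = I, logarithms in base 2, and an arbitrary choice of dimension and number of trials. The function names are hypothetical.

```python
import numpy as np


def capacity_beta1(snr):
    """Asymptotic capacity per receive antenna for beta = 1, in bits/s/Hz."""
    r = np.sqrt(1 + 4 * snr)
    return 2 * np.log2((1 + r) / 2) - np.log2(np.e) / (4 * snr) * (r - 1) ** 2


def simulated_capacity(n, snr, trials, rng):
    """Average of (1/n) log2 det(I + snr H H^dagger) over i.i.d. n x n channels."""
    total = 0.0
    for _ in range(trials):
        h = (rng.standard_normal((n, n))
             + 1j * rng.standard_normal((n, n))) / np.sqrt(2 * n)  # variance 1/n
        _, logdet = np.linalg.slogdet(np.eye(n) + snr * h @ h.conj().T)
        total += logdet / np.log(2) / n
    return total / trials


rng = np.random.default_rng(0)
snr = 10.0
print(simulated_capacity(64, snr, 20, rng), capacity_beta1(snr))
print(capacity_beta1(1e6), np.log2(1e6 / np.e))  # high-SNR approximation
```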
correlated. In its full generality, the correlation between the (i, j) and
(i′, j′) entries of H is given by
\[ r_{\mathbf{H}}(i,i',j,j') = E\left[H_{i,j}\, H^*_{i',j'}\right]. \tag{3.141} \]
In a number of interesting cases, however, correlation turns out to
be a strictly local phenomenon that can be modeled in a simplified
manner. To that end, the so-called separable (also termed Kronecker
or product) correlation model was proposed by several authors [220, 40,
203]. According to this model, an nR × nT matrix Hw , whose entries
are i.i.d. zero-mean with variance $1/n_T$, is pre- and post-multiplied by
the square root of deterministic matrices, ΘT and ΘR , whose entries
represent, respectively, the correlation between the transmit antennas
and between the receive antennas:
\[ \mathbf{H} = \mathbf{\Theta}_R^{1/2}\, \mathbf{H}_w\, \mathbf{\Theta}_T^{1/2}. \tag{3.142} \]
Implied by this model is that the correlation between two transmit
antennas is the same regardless of the receive antenna at which the
observation is made and vice versa. As confirmed experimentally in [41],
this condition is often satisfied in outdoor environments if the arrays are
composed of antennas with similar polarization and radiation patterns.
When (3.142) holds, the correlation in (3.141) can be expressed (cf.
Definition 2.9) as
\[ r_{\mathbf{H}}(i,i',j,j') = \frac{(\mathbf{\Theta}_R)_{i,i'}\,(\mathbf{\Theta}_T)_{j,j'}}{n_T}. \tag{3.143} \]
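The separable model is simple to simulate, and (3.143) can be checked empirically. The sketch below is an illustration under assumptions: real symmetric correlation matrices with an exponential profile ρ^{|i−j|} (a common choice, not one taken from this excerpt), complex Gaussian entries for Hw, and matrix square roots computed through an eigendecomposition. The function names are hypothetical.

```python
import numpy as np


def sqrtm_psd(theta):
    """Symmetric square root of a positive semidefinite matrix."""
    vals, vecs = np.linalg.eigh(theta)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.conj().T


def exponential_correlation(n, rho):
    """Correlation matrix with entries rho^{|i-j|} (assumed profile)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def kronecker_channel(theta_r, theta_t, rng):
    """One draw of H = Theta_R^{1/2} H_w Theta_T^{1/2} as in (3.142)."""
    n_r, n_t = theta_r.shape[0], theta_t.shape[0]
    h_w = (rng.standard_normal((n_r, n_t))
           + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2 * n_t)  # variance 1/n_t
    return sqrtm_psd(theta_r) @ h_w @ sqrtm_psd(theta_t)


rng = np.random.default_rng(0)
theta_r = exponential_correlation(2, 0.5)
theta_t = exponential_correlation(2, 0.8)
samples = 20000
acc = 0.0
for _ in range(samples):
    h = kronecker_channel(theta_r, theta_t, rng)
    acc += h[0, 0] * np.conj(h[1, 1])
# Empirical correlation versus the prediction of (3.143) for (i, j) = (1, 1), (i', j') = (2, 2).
print(acc / samples, theta_r[0, 1] * theta_t[0, 1] / 2)
```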
Results on the asymptotic capacity and mutual information, with vari-
ous levels of transmitter information, of channels that obey this model
can be found in [181, 262, 43, 263, 178]. Analytical non-asymptotic
expressions have also been reported: in [208, 209, 2], the capacity of
one-sided correlated channels is obtained starting from the joint dis-
tribution of the eigenvalues of a Wishart matrix ∼ Wm (n, Σ) given in
Theorem 2.18 and (2.19). References [135, 234, 149, 39] compute the
moment generating function of the mutual information of a one-sided
correlated MIMO channel, constraining the eigenvalues of the correla-
tion matrix to be distinct. The two-sided correlated MIMO channel is
analyzed in [148, 231, 149] also through the moment generating func-
tion of the mutual information (cf. (2.16)).
where ν satisfies
\[ \int_0^\infty \left(\nu - \frac{1}{\mathsf{SNR}\,\lambda}\right)^{+} dG(\lambda) = 1 \tag{3.145} \]
with G(·) the asymptotic spectrum of H† H whose η-transform can be
derived using Theorem 2.43 and Lemma 2.28. Invoking Theorem 2.45,
the capacity in (3.144) can be evaluated as follows.
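Corollary 3.1. With correlation at the end of the link with the fewest
antennas, the capacity per antenna with full CSI at the transmitter
converges to
\[ C = \begin{cases} \beta\, E\left[\log\frac{\Lambda_T}{e}\right] + \log\frac{1}{1-\beta} + \beta \log\left(\frac{1-\beta}{\beta}\,\mathsf{SNR} + E\left[\frac{1}{\Lambda_T}\right]\right) & \beta < 1,\ \Lambda_R = 1 \\[6pt] E\left[\log\frac{\Lambda_R}{e}\right] - \beta \log\frac{\beta-1}{\beta} + \log\left(\mathsf{SNR}\,(\beta-1) + E\left[\frac{1}{\Lambda_R}\right]\right) & \beta > 1,\ \Lambda_T = 1. \end{cases} \]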
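The water-filling condition (3.145) has a direct finite-dimensional analogue that is convenient for numerical work. The sketch below is an illustration, not the derivation behind the corollary: it replaces G by the empirical distribution of a given list of eigenvalues, so the integral becomes an average, and it finds the water level ν by bisection. The function names are hypothetical.

```python
import numpy as np


def water_level(eigs, snr):
    """Solve mean((nu - 1/(snr * lambda_i))^+) = 1 for nu by bisection."""
    inv = 1.0 / (snr * np.asarray(eigs, dtype=float))
    lo, hi = 0.0, 1.0 + inv.max()  # the average is < 1 at lo and >= 1 at hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if np.mean(np.maximum(mid - inv, 0.0)) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def power_allocation(eigs, snr):
    """Normalized powers p_i = (nu - 1/(snr * lambda_i))^+ with unit average."""
    inv = 1.0 / (snr * np.asarray(eigs, dtype=float))
    return np.maximum(water_level(eigs, snr) - inv, 0.0)


eigs = [0.1, 1.0, 3.0]
p = power_allocation(eigs, 1.0)
print(water_level(eigs, 1.0), p, p.mean())
```

With a weak eigenvalue in the list, the corresponding mode receives no power, which is the discrete counterpart of the positive-part operator in (3.145).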
[Figure: mutual information (bits/s/Hz) versus SNR (dB), analytical and simulation, with
curves labeled d = 1, d = 2 and i.i.d., and with labels "receiver" and "transmitter".]
the same pattern are used to discriminate different multipath components and reduce
correlation.
For the more restrictive case where UR and UT are Fourier matrices,
the model (3.152) was proposed earlier in [213].
The matrices H and H̃ are directly related through the Karhunen-
Loève expansion (cf. Lemma 2.25) with the variances of the entries of
H̃ given by the eigenvalues of rH obtained by solving the system of
equations in (2.33). Furthermore, from Theorem 2.58, the asymptotic
spectrum of H is fully characterized by the variances of the entries of
H̃, which we assemble in a matrix G such that $G_{i,j} = n_T\, E[|\tilde{H}_{i,j}|^2]$ with
\[ \sum_{i,j} G_{i,j} = n_T\, n_R. \tag{3.153} \]
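\begin{align}
C(\beta,\mathsf{SNR}) = {} & \beta\, E\left[\log\left(1 + \mathsf{SNR}\, E[G(\mathsf{R},\mathsf{T})\, P(\mathsf{T},\mathsf{SNR})\, \Gamma(\mathsf{R},\mathsf{SNR})\,|\,\mathsf{T}]\right)\right] \\
& + E\left[\log\left(1 + E[G(\mathsf{R},\mathsf{T})\, P(\mathsf{T},\mathsf{SNR})\, \Upsilon(\mathsf{T},\mathsf{SNR})\,|\,\mathsf{R}]\right)\right] \\
& - \beta\, E\left[G(\mathsf{R},\mathsf{T})\, P(\mathsf{T},\mathsf{SNR})\, \Gamma(\mathsf{R},\mathsf{SNR})\, \Upsilon(\mathsf{T},\mathsf{SNR})\right] \log e
\end{align}
where P(t, SNR) is the asymptotic power profile of the capacity-achieving
power allocation at each SNR.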
[Figure: mutual information (bits/s/Hz) versus SNR (dB), analytical and simulation.]
\[ \mathbf{H} = \mathbf{A} \circ \mathbf{H}_w \tag{3.157} \]
\[ \mathbf{H} = \sum_{\ell=1}^{L} \mathbf{H}_\ell \tag{3.158} \]
ηAL (SNR )
L
β
SNR = . (3.163)
1 − ηAL (SNR ) ηAL (SNR ) + β−1 − 1
=1
with the scalar Ricean factor K quantifying the ratio between the
Frobenius norm of the deterministic (unfaded) component and the
expected Frobenius norm of the random (faded) component. Considered
individually, each (i, j)th channel entry has a Ricean factor given by
\[ K\, \frac{|\bar{H}_{i,j}|^2}{E[|H_{i,j}|^2]}. \]
Using Lemma 2.22 the next result follows straightforwardly.
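\[ \mathbf{n} = \sum_{\ell=1}^{L} \mathbf{H}_\ell\, \mathbf{x}_\ell + \mathbf{n}_{\mathsf{th}} \tag{3.166} \]
\[ \mathbf{y} = \mathbf{H}\mathbf{x} + \sum_{\ell=1}^{L} \mathbf{H}_\ell\, \mathbf{x}_\ell + \mathbf{n}_{\mathsf{th}}. \tag{3.167} \]
\[ \mathsf{SIR}_\ell = \frac{E[\|\mathbf{x}\|^2]}{E[\|\mathbf{x}_\ell\|^2]} \tag{3.168} \]
and use SNR to specify the signal-to-thermal-noise ratio. With that, the
overall SINR satisfies
\[ \frac{1}{\mathsf{SINR}} = \frac{1}{\mathsf{SNR}} + \sum_{\ell=1}^{L} \frac{1}{\mathsf{SIR}_\ell} \tag{3.169} \]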
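• For growing β,
\[ \lim_{\beta\to\infty} \eta_1 = \frac{1}{1 + \mathsf{SNR}\left(1 + \sum_{\ell=1}^{L} \frac{1}{\mathsf{SIR}_\ell}\right)} \tag{3.174} \]
\[ \lim_{\beta\to\infty} \eta_2 = \frac{1}{1 + \mathsf{SNR} \sum_{\ell=1}^{L} \frac{1}{\mathsf{SIR}_\ell}} \tag{3.175} \]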
(1) The mean and variance of I are obtained through the mo-
ment generating function (for fixed number of antennas). A
Gaussian distribution with such mean and variance is then
compared, through Monte Carlo simulation, to the empirical
distribution of I. This approach is followed in [235, 299, 26]
for the canonical channel, in [234] for channels with one-sided
correlation, and in [235] for uncorrelated Ricean channels.
Although, in every case, the match is excellent, no proof of
asymptotic Gaussianity is provided. Only for SNR → ∞ with
Φ = I and with H being a real Gaussian matrix with i.i.d.
entries has it been shown that I − E[I] converges to a Gaus-
sian random variable [87].
16 The input covariance is constrained to be Φ = I in [233], which also gives the corre-
sponding distribution of I for min(nT , nR ) = 2 and arbitrary max(nT , nR ) although in
the form of an involved integral expression.
17 The more restrictive case of a canonical channel at either low or high SNR is analyzed in
[113].
3.3. Single-User Multi-Antenna Channels 151
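(1.17)]. For the canonical channel, this yields (cf. Theorem 2.76)
\begin{align}
E[\Delta^2] &= -\log\left(1 - \frac{\left(1 - \eta_{\mathbf{H}\mathbf{H}^\dagger}(\gamma)\right)^2}{\beta}\right) \\
&= -\log\left(1 - \frac{1}{\beta}\left(\frac{\mathcal{F}(\gamma,\beta)}{4\gamma}\right)^2\right). \tag{3.180}
\end{align}
With Rayleigh fading and correlation at the transmitter, in turn,
\[ E[\Delta^2] = -\log\left(1 - \frac{\left(1 - \eta_{\mathbf{H}\mathbf{T}\mathbf{H}^\dagger}(\gamma)\right)^2}{\beta}\right) \tag{3.181} \]
where $\mathbf{T} = \mathbf{\Theta}_T^{1/2}\, \mathbf{\Phi}\, \mathbf{\Theta}_T^{1/2}$ with Φ the capacity-achieving power allocation.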
Fig. 3.5 Histogram of ∆nR for a Rayleigh-faded channel with nT = 5 and nR = 10. The
transmit antennas are correlated as per (3.182) while the receive antennas are uncorrelated.
Solid line indicates the corresponding limiting Gaussian distribution.
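\begin{align}
\sum_{i=1}^{N} \frac{\lambda_i}{\lambda_i + \sigma^2} &= \mathrm{tr}\left\{\left(\sigma^2 \mathbf{I} + \mathbf{S}\mathbf{T}\mathbf{S}^\dagger\right)^{-1} \mathbf{S}\mathbf{T}\mathbf{S}^\dagger\right\} \\
&= \mathrm{tr}\left\{\left(\sigma^2 \mathbf{I} + \mathbf{S}\mathbf{T}\mathbf{S}^\dagger\right)^{-1} \sum_{k=1}^{K} T_k\, \mathbf{s}_k \mathbf{s}_k^\dagger\right\}
\end{align}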
154 Appendices
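\[ \eta = \eta_{\mathbf{S}\mathbf{T}\mathbf{S}^\dagger}\!\left(\frac{1}{\sigma^2}\right). \tag{4.3} \]
From the fact that the η-transform of STS† evaluated at σ^{−2} is the
multiuser efficiency achieved by each of the users asymptotically,
\[ \mathsf{SIR}_k = \eta\, \frac{T_k}{\sigma^2} \tag{4.4} \]
we obtain
\begin{align}
\lim_{K\to\infty} \frac{1}{K}\sum_{k=1}^{K} \frac{\mathsf{SIR}_k}{\mathsf{SIR}_k + 1} &= 1 - \lim_{K\to\infty} \frac{1}{K}\sum_{k=1}^{K} \frac{1}{\eta\,\frac{T_k}{\sigma^2} + 1} \\
&= 1 - \eta_{\mathbf{T}}\!\left(\frac{\eta}{\sigma^2}\right) \tag{4.5}
\end{align}
almost surely, by the law of large numbers and the definition of
η-transform. Also by definition of η-transform,
\[ \lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N} \frac{\lambda_i}{\lambda_i + \sigma^2} = 1 - \eta \tag{4.6} \]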
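Using Theorem 2.38 with W0 and T therein equal to T^{−1} and Q^{−1}
(this is a valid choice since Q^{−1} is diagonal), it follows that the
asymptotic spectrum of T^{−1} + γ(HQ^{−1}H†) depends on T^{−1} only through its
asymptotic spectrum. Therefore, when we take $\frac{1}{N}\log$ of both sides of
(4.9) we are free to replace T by D_T. Thus,
\begin{align}
\mathcal{V}_{\mathbf{W}}(\gamma) &= \lim_{N\to\infty} \frac{1}{N} \log\det\left(\mathbf{I} + \gamma(\mathbf{W}_0 + \mathbf{H}\mathbf{T}\mathbf{H}^\dagger)\right) \tag{4.10} \\
&= \lim_{N\to\infty} \frac{1}{N} \log\det\left(\mathbf{I} + \gamma(\mathbf{W}_0 + \mathbf{H}\mathbf{D}_T\mathbf{H}^\dagger)\right) \tag{4.11} \\
&= \mathcal{V}_{\mathbf{W}_0 + \mathbf{H}\mathbf{D}_T\mathbf{H}^\dagger}(\gamma) \tag{4.12}
\end{align}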
Since the Shannon transforms are identical, so are the η-transforms.
Using Theorem 2.38 and (2.48), it follows that the η-transform of W0 +
HDT H† and consequently of W is
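\[ \eta\,\gamma = E\left[\frac{1}{\mathsf{W}_0 + \frac{1}{\gamma} + \beta\, E\left[\frac{\mathsf{T}}{1 + \mathsf{T}\,\eta\,\gamma}\right]}\right] \tag{4.13} \]
\[ = \frac{\gamma}{1 + \frac{\beta}{\eta}\left(1 - \eta_{\mathbf{T}}(\eta\gamma)\right)} \tag{4.16} \]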
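from which
\begin{align}
\eta\,\varphi + \varphi\,\beta\left(1 - \eta_{\mathbf{T}}(\eta\gamma)\right) &= \gamma\,\eta \tag{4.17} \\
&= \varphi\,\eta_0(\varphi). \tag{4.18}
\end{align}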
with vX (x) and vY (y) such that the distributions of vX (X) and vY (Y)
(with X and Y independent random variables uniform on [0, 1]) equal
the asymptotic empirical distributions of D and T respectively. In turn,
(2.137) can be proved as special case of (2.158) when the function
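v(x, y) can be factored. In this case, the expressions of $\Gamma_{\mathbf{H}\mathbf{H}^\dagger}(x,\gamma)$ and
$\Upsilon_{\mathbf{H}\mathbf{H}^\dagger}(y,\gamma)$ given by Equations (2.154) and (2.155) in Theorem 2.50
become
\begin{align}
\Gamma_{\mathbf{H}\mathbf{H}^\dagger}(x,\gamma) &= \frac{1}{1 + \beta\,\gamma\, v_X(x)\, E[v_Y(\mathsf{Y})\, \Upsilon_{\mathbf{H}\mathbf{H}^\dagger}(\mathsf{Y},\gamma)]} \\
&= \frac{1}{1 + \beta\,\gamma\, v_X(x)\, \tilde{\Upsilon}_{\mathbf{H}\mathbf{H}^\dagger}(\gamma)} \tag{4.19}
\end{align}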
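and thus
\begin{align}
E\left[\log\left(1 + \gamma\, E[v(\mathsf{X},\mathsf{Y})\,\Gamma(\mathsf{X},\gamma)\,|\,\mathsf{Y}]\right)\right] &= E\left[\log\left(1 + \gamma\, \Lambda_T\, \tilde{\Gamma}(\gamma)\right)\right] \\
&= \mathcal{V}_{\mathbf{T}}(\gamma\, \tilde{\Gamma}(\gamma)). \tag{4.23}
\end{align}
Likewise,
\begin{align}
E\left[\log\left(1 + \gamma\,\beta\, E[v(\mathsf{X},\mathsf{Y})\,\Upsilon(\mathsf{Y},\gamma)\,|\,\mathsf{X}]\right)\right] &= E\left[\log\left(1 + \gamma\,\beta\, \Lambda_D\, \tilde{\Upsilon}(\gamma)\right)\right] \\
&= \mathcal{V}_{\mathbf{D}}(\gamma\,\beta\, \tilde{\Upsilon}(\gamma)). \tag{4.24}
\end{align}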
Moreover,
Defining
plugging (4.25), (4.24), (4.23) into (2.158), and using (4.26), (4.21) and
(4.20), the expression for VHH† in Theorem 2.44 is found.
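\[ \Gamma_{\mathbf{H}\mathbf{H}^\dagger}(x,\gamma) = \frac{1}{1 + \beta\,\gamma\, E\left[\frac{v(x,\mathsf{Y})}{1 + \gamma\, E[v(\mathsf{X},\mathsf{Y})\,\Gamma_{\mathbf{H}\mathbf{H}^\dagger}(\mathsf{X},\gamma)\,|\,\mathsf{Y}]}\right]} \tag{4.27} \]
\[ E\left[\frac{v(r,\mathsf{Y})}{1 + \gamma\, E[v(\mathsf{X},\mathsf{Y})\,\Gamma(\mathsf{X},\gamma)\,|\,\mathsf{Y}]}\right] \]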
4.5. Proof of Theorem 2.53 159
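\begin{align}
\Gamma(\gamma) &= \frac{1}{1 + \beta\,\gamma\, E\left[\frac{v(x,\mathsf{Y})}{1 + \gamma\,\Gamma(\gamma)\, E[v(\mathsf{X},\mathsf{Y})\,|\,\mathsf{Y}]}\right]} \\
&= \frac{1}{1 + \beta\,\gamma\, E\left[\frac{v(x,\mathsf{Y})}{1 + \gamma\,\Gamma(\gamma)\,\mu}\right]} \\
&= \frac{1}{1 + \beta\,\gamma\, \frac{E[v(x,\mathsf{Y})]}{1 + \gamma\,\Gamma(\gamma)\,\mu}}
\end{align}
resulting in
\[ \Gamma(\gamma) = \frac{1}{1 + \beta\,\gamma\,\mu\, \frac{1}{1 + \gamma\,\Gamma(\gamma)\,\mu}} \]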
\[ \eta_{\mathbf{H}\mathbf{H}^\dagger}(\gamma) = 1 - \frac{\mathcal{F}(\gamma,\beta)}{4\,\beta\,\gamma}. \]
Using (2.48) and the inverse Stieltjes formula, the claim is proved.
where FHH† (·) represents the limiting distribution to which the em-
pirical eigenvalue distribution of HH† converges almost surely. The
log e 1
= 1− dFHH† (λ)
γ 1 + γλ
log e 1
= 1− dFHH† (λ)
γ 1 + γλ
log e
= (1 − E [ΓHH† (X, γ)]) (4.28)
γ
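where, in the last equality, we have invoked Theorem 2.50 and where
$\Gamma_{\mathbf{H}\mathbf{H}^\dagger}(\cdot,\cdot)$ satisfies the equations given in (2.154) and (2.155), namely
\[ \Gamma_{\mathbf{H}\mathbf{H}^\dagger}(x,\gamma) = \frac{1}{1 + \beta\,\gamma\, E[v(x,\mathsf{Y})\,\Upsilon_{\mathbf{H}\mathbf{H}^\dagger}(\mathsf{Y},\gamma)]} \tag{4.29} \]
\[ \Upsilon_{\mathbf{H}\mathbf{H}^\dagger}(y,\gamma) = \frac{1}{1 + \gamma\, E[v(\mathsf{X},y)\,\Gamma_{\mathbf{H}\mathbf{H}^\dagger}(\mathsf{X},\gamma)]} \tag{4.30} \]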
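with X and Y independent random variables uniform on [0, 1]. For
brevity, we drop the subindices from $\Gamma_{\mathbf{H}\mathbf{H}^\dagger}$ and $\Upsilon_{\mathbf{H}\mathbf{H}^\dagger}$. Using (4.29)
we can write
\[ \frac{1 - \Gamma(x,\gamma)}{\gamma} = \frac{\beta\, E[v(x,\mathsf{Y})\,\Upsilon(\mathsf{Y},\gamma)]}{1 + \beta\,\gamma\, E[v(x,\mathsf{Y})\,\Upsilon(\mathsf{Y},\gamma)]}, \]
which, after adding and subtracting to the right-hand side
\[ \frac{\beta\,\gamma\, E[v(x,\mathsf{Y})\,\dot{\Upsilon}(\mathsf{Y},\gamma)]}{1 + \beta\,\gamma\, E[v(x,\mathsf{Y})\,\Upsilon(\mathsf{Y},\gamma)]}, \]
becomes
\begin{align}
\frac{1 - \Gamma(x,\gamma)}{\gamma} &= \frac{\beta\, E[v(x,\mathsf{Y})\,\Upsilon(\mathsf{Y},\gamma)] + \beta\,\gamma\, E[v(x,\mathsf{Y})\,\dot{\Upsilon}(\mathsf{Y},\gamma)]}{1 + \beta\,\gamma\, E[v(x,\mathsf{Y})\,\Upsilon(\mathsf{Y},\gamma)]} - \frac{\beta\,\gamma\, E[v(x,\mathsf{Y})\,\dot{\Upsilon}(\mathsf{Y},\gamma)]}{1 + \beta\,\gamma\, E[v(x,\mathsf{Y})\,\Upsilon(\mathsf{Y},\gamma)]} \\
&= \frac{d}{d\gamma} \ln\left(1 + \beta\,\gamma\, E[v(x,\mathsf{Y})\,\Upsilon(\mathsf{Y},\gamma)]\right) - \frac{\beta\,\gamma\, E[v(x,\mathsf{Y})\,\dot{\Upsilon}(\mathsf{Y},\gamma)]}{1 + \beta\,\gamma\, E[v(x,\mathsf{Y})\,\Upsilon(\mathsf{Y},\gamma)]} \tag{4.31}
\end{align}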
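where $\dot{\Upsilon}(\cdot,\gamma) = \frac{d}{d\gamma}\Upsilon(\cdot,\gamma)$. From (4.28) and (4.29) it follows that
\[ \dot{\mathcal{V}}_{\mathbf{H}\mathbf{H}^\dagger}(\gamma) = \frac{d}{d\gamma}\, E\left[\log\left(1 + \beta\,\gamma\, E[v(\mathsf{X},\mathsf{Y})\,\Upsilon(\mathsf{Y},\gamma)\,|\,\mathsf{X}]\right)\right] - \beta\,\gamma\, E\left[v(\mathsf{X},\mathsf{Y})\,\Gamma(\mathsf{X},\gamma)\,\dot{\Upsilon}(\mathsf{Y},\gamma)\right] \log e. \tag{4.32} \]
Notice that
\begin{align}
-\gamma\, E\left[v(\mathsf{X},\mathsf{Y})\,\Gamma(\mathsf{X},\gamma)\,\dot{\Upsilon}(\mathsf{Y},\gamma)\right] = {} & -\frac{d}{d\gamma}\left(\gamma\, E[v(\mathsf{X},\mathsf{Y})\,\Gamma(\mathsf{X},\gamma)\,\Upsilon(\mathsf{Y},\gamma)]\right) \\
& + E\left[\gamma\, v(\mathsf{X},\mathsf{Y})\,\dot{\Gamma}(\mathsf{X},\gamma)\,\Upsilon(\mathsf{Y},\gamma)\right] \\
& + E\left[v(\mathsf{X},\mathsf{Y})\,\Gamma(\mathsf{X},\gamma)\,\Upsilon(\mathsf{Y},\gamma)\right] \tag{4.33}
\end{align}
with $\dot{\Gamma}(\cdot,\gamma) = \frac{d}{d\gamma}\Gamma(\cdot,\gamma)$. From (4.29),
\begin{align}
E\left[v(\mathsf{X},\mathsf{Y})\left(\gamma\,\dot{\Gamma}(\mathsf{X},\gamma) + \Gamma(\mathsf{X},\gamma)\right)\Upsilon(\mathsf{Y},\gamma)\right] &= E\left[\frac{v(\mathsf{X},\mathsf{Y})\left(\gamma\,\dot{\Gamma}(\mathsf{X},\gamma) + \Gamma(\mathsf{X},\gamma)\right)}{1 + \gamma\, E[v(\mathsf{X},\mathsf{Y})\,\Gamma(\mathsf{X},\gamma)\,|\,\mathsf{Y}]}\right] \\
&= E\left[\frac{E\left[v(\mathsf{X},\mathsf{Y})\left(\gamma\,\dot{\Gamma}(\mathsf{X},\gamma) + \Gamma(\mathsf{X},\gamma)\right)|\,\mathsf{Y}\right]}{1 + \gamma\, E[v(\mathsf{X},\mathsf{Y})\,\Gamma(\mathsf{X},\gamma)\,|\,\mathsf{Y}]}\right]
\end{align}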
References
[110] F. Hiai and D. Petz, “Asymptotic freeness almost everywhere for random
matrices,” Acta Sci. Math. Szeged, vol. 66, pp. 801–826, 2000.
[111] F. Hiai and D. Petz, The Semicircle Law, Free Random Variables and Entropy.
American Mathematical Society, 2000.
[112] B. Hochwald and S. Vishwanath, “Space-time multiple access: Linear growth
in the sum-rate,” in Proc. Allerton Conf. on Communication, Control and
Computing, (Monticello, IL), Oct. 2002.
[113] B. M. Hochwald, T. L. Marzetta, and V. Tarokh, “Multi-antenna channel
hardening and its implications for rate feedback and scheduling,” IEEE Trans.
on Information Theory, submitted May 2002.
[114] T. Holliday, A. J. Goldsmith, and P. Glynn, “On entropy and Lyapunov ex-
ponents for finite state channels,” submitted to IEEE Trans. on Information
Theory, 2004.
[115] M. L. Honig, “Adaptive linear interference suppression for packet DS-CDMA,”
Euro. Trans. Telecommunications, vol. 9, pp. 173–182, Mar./Apr. 1998.
[116] M. L. Honig and W. Xiao, “Performance of reduced-rank linear interference
suppression for DS-CDMA,” IEEE Trans. on Information Theory, vol. 47,
pp. 1928–1946, July 2001.
[117] R. Horn and C. Johnson, Matrix Analysis. Cambridge University Press, 1985.
[118] D. Hösli and A. Lapidoth, “The capacity of a MIMO Ricean channel is mono-
tonic in the singular values of the mean,” 5th Int. ITG Conf. on Source and
Channel Coding, Jan. 2004.
[119] D. C. Hoyle and M. Rattray, “Principal component analysis eigenvalue spectra
from data with symmetry breaking structure,” Physical Review E, vol. 69,
026124, 2004.
[120] P. L. Hsu, “On the distribution of roots of certain determinantal equations,”
Annals of Eugenics, vol. 9, pp. 250–258, 1939.
[121] L. K. Hua, Harmonic analysis of functions of several complex variables in the
classical domains. Providence, RI: American Mathematical Society, 1963.
[122] P. Jacquet, G. Seroussi, and W. Szpankowski, “On the entropy of a hidden
Markov process,” in Proc. Data Compression Conference, Mar. 23–25 2004.
[123] S. A. Jafar, S. Vishwanath, and A. J. Goldsmith, “Channel capacity and beam-
forming for multiple transmit and receive antennas with covariance feedback,”
Proc. IEEE Int. Conf. on Communications (ICC’01), vol. 7, pp. 2266–2270,
2001.
[124] K. Jamal and E. Dahlman, “Multi-stage interference cancellation for DS-
CDMA,” in Proc. IEEE Vehicular Technology Conf. (VTC’96), (Atlanta, GA),
pp. 671–675, Apr. 1996.
[125] A. T. James, “Distributions of matrix variates and latent roots derived from
normal samples,” Annals of Math. Statistics, vol. 35, pp. 475–501, 1964.
[126] R. Janaswamy, “Analytical expressions for the ergodic capacities of certain
MIMO systems by the Mellin transform,” Proc. IEEE Global Telecomm. Conf.
(GLOBECOM’03), vol. 1, pp. 287–291, Dec. 2003.
[127] S. K. Jayaweera and H. V. Poor, “Capacity of multiple-antenna systems with
both receiver and transmitter channel state information,” IEEE Trans. on
Information Theory, vol. 49, pp. 2697–2709, Oct. 2003.
References 171
[177] X. Mestre, Space processing and channel estimation: performance analysis and
asymptotic results. PhD thesis, Dept. de Teoria del Senyal i Comunicacions,
Universitat Politècnica de Catalunya, Barcelona, Catalonia, Spain, 2002.
[178] X. Mestre, J. R. Fonollosa, and A. Pages-Zamora, “Capacity of MIMO chan-
nels: asymptotic evaluation under correlated fading,” IEEE J. on Selected
Areas in Communications, vol. 21, pp. 829– 838, June 2003.
[179] S. Moshavi, E. G. Kanterakis, and D. L. Schilling, “Multistage linear receivers
for DS-CDMA systems,” Int. J. of Wireless Information Networks, vol. 39,
no. 1, pp. 1–17, 1996.
[180] A. L. Moustakas and S. H. Simon, “Optimizing multiple-input single-output
(MISO) communication systems with general Gaussian channels: nontrivial
covariance and nonzero mean,” IEEE Trans. on Information Theory, vol. 49,
pp. 2770–2780, Oct. 2003.
[181] A. L. Moustakas, S. H. Simon, and A. M. Sengupta, “MIMO capacity through
correlated channels in the presence of correlated interferers and noise: a (not
so) large N analysis,” IEEE Trans. on Information Theory, vol. 49, pp. 2545–
2561, Oct. 2003.
[182] R. J. Muirhead, Aspects of multivariate statistical theory. New York, Wiley,
1982.
[183] R. R. Müller, Power and Bandwidth Efficiency of Multiuser Systems with
Random Spreading. PhD thesis, Universtät Erlangen-Nürnberg, Erlangen,
Germany, Nov. 1998.
[184] R. R. Müller, “On the asymptotic eigenvalue distribution of concatenated
vector-valued fading channels,” IEEE Trans. on Information Theory, vol. 48,
pp. 2086–2091, July 2002.
[185] R. R. Müller, “Multiuser receivers for randomly spread signals: Fundamen-
tal limits with and without decision-feedback,” IEEE Trans. on Information
Theory, vol. 47, no. 1, pp. 268–283, Jan. 2001.
[186] R. R. Müller and W. Gerstacker, “On the capacity loss due to separation of
detection and decoding in large CDMA systems,” in IEEE Information Theory
Workshop (ITW), p. 222, Oct. 2002.
[187] R. R. Müller and S. Verdú, “Design and analysis of low-complexity interference
mitigation on vector channels,” IEEE J. on Selected Areas on Communica-
tions, vol. 19, pp. 1429–1441, Aug. 2001.
[188] F. D. Neeser and J. L. Massey, “Proper complex random processes with appli-
cations to information theory,” IEEE Trans. on Information Theory, vol. 39,
pp. 1293–1302, July 1993.
[189] A. Nica, R-transforms in free probability. Paris, France: Henri Poincare Insti-
tute, 1999.
[190] A. Nica and R. Speicher, “On the multiplication of free n-tuples of non-
commutative random variables,” American J. Math., vol. 118, no. 4, pp. 799–
837, 1996.
[191] B. Niederhauser, “Norms of certain random matrices with dependent entries,”
Random Operators and Stochastic Equations, vol. 11, no. 1, pp. 83–101, 2003.
[192] A. Y. Orlov, “New solvable matrix integrals,” Acta Sci. Math, vol. 63, pp. 383–
395, 1997.
References 175
[259] D. N. Tse and P. Viswanath, “On the capacity of the multiple antenna broad-
cast channel,” in Multiantenna channels: Capacity, Coding and Signal Pro-
cessing, (G. Foschini and S. Verdú, eds.), pp. 87–106, American Mathematical
Society Press, 2003.
[260] B. S. Tsybakov, “The capacity of a memoryless Gaussian vector channel,”
Problems of Information Transmission, vol. 1, pp. 18–29, 1965.
[261] A. M. Tulino, A. Lozano, and S. Verdú, “Capacity-achieving input covariance
for single-user multi-antenna channels,” Bell Labs Tech. Memorandum ITD-
04-45193Y (also submitted to IEEE Trans. on Wireless Communications.),
Sep. 2003.
[262] A. M. Tulino, A. Lozano, and S. Verdú, “Impact of correlation on the capacity
of multi-antenna channels,” Bell Labs Technical Memorandum ITD-03-44786F
(also submitted to IEEE Trans. on Information Theory), Sep. 2003.
[263] A. M. Tulino, A. Lozano, and S. Verdú, “MIMO capacity with channel state
information at the transmitter,” in Proc. IEEE Int. Symp. on Spread Spectrum
Tech. and Applications (ISSSTA’04), Aug. 2004.
[264] A. M. Tulino, A. Lozano, and S. Verdú, “Power allocation in multi-antenna
communication with statistical channel information at the transmitter,” in
Proc. IEEE Int. Conf. on Personal, Indoor and Mobile Radio Communica-
tions. (PIMRC’04), (Barcelona, Catalonia, Spain), Sep. 2004.
[265] A. M. Tulino and S. Verdú, “Asymptotic analysis of improved linear receivers
for BPSK-CDMA subject to fading,” IEEE J. on Selected Areas in Commu-
nications, vol. 19, pp. 1544–1555, Aug. 2001.
[266] A. M. Tulino, S. Verdú, and A. Lozano, “Capacity of antenna arrays with
space, polarization and pattern diversity,” Proc. 2003 IEEE Information The-
ory Workshop (ITW’03), pp. 324–327, Apr. 2003.
[267] H. Uhlig, “On singular Wishart and singular multivariate beta distributions,”
Annals of Statistics, vol. 22, pp. 395–405, 1994.
[268] V. V. Veeravalli, Y. Liang, and A. Sayeed, “Correlated MIMO Rayleigh fading
channels: Capacity, optimal signalling and asymptotics,” submitted to IEEE
Trans. on Information Theory, 2003.
[269] S. Venkatesan, S. H. Simon, and R. A. Valenzuela, “Capacity of a Gaussian
MIMO channel with nonzero mean,” Proc. 2003 IEEE Vehicular Technology
Conf. (VTC’03), Oct. 2003.
[270] S. Verdú, “Capacity region of Gaussian CDMA channels: The symbol syn-
chronous case,” in Proc. Allerton Conf. on Communication, Control and Com-
puting, (Monticello, IL), pp. 1025–1034, Oct. 1986.
[271] S. Verdú, Multiuser Detection. Cambridge, UK: Cambridge University Press,
1998.
[272] S. Verdú, “Random matrices in wireless communication, proposal to the Na-
tional Science Foundation,” Feb. 1999.
[273] S. Verdú, “Large random matrices and wireless communications,” 2002 MSRI
Information Theory Workshop, Feb 25–Mar 1, 2002.
[274] S. Verdú, “Spectral efficiency in the wideband regime,” IEEE Trans. on In-
formation Theory, vol. 48, no. 6, pp. 1319–1343, June 2002.
180 References
[275] S. Verdú and S. Shamai, “Spectral efficiency of CDMA with random spread-
ing,” IEEE Trans. on Information Theory, vol. 45, pp. 622–640, Mar. 1999.
[276] S. Vishwanath, N. Jindal, and A. Goldsmith, “On the capacity of multiple
input multiple output broadcast channels,” in Proc. IEEE Int. Conf. in Com-
munications (ICC’02), pp. 1444–1450, Apr. 2002.
[277] S. Vishwanath, N. Jindal, and A. Goldsmith, “Duality, achievable rates and
sum-rate capacity of Gaussian MIMO broadcast channels,” IEEE Trans. on
Information Theory, vol. 49, pp. 2658–2668, Oct. 2003.
[278] S. Vishwanath, G. Kramer, S. Shamai, S. Jafar, and A. Goldsmith, “Capacity
bounds for Gaussian vector broadcast channels,” in Multiantenna channels:
Capacity, Coding and Signal Processing, (G. Foschini and S. Verdú, eds.),
pp. 107–122, American Mathematical Society Press, 2003.
[279] E. Visotsky and U. Madhow, “Space-time transmit precoding with imperfect
feedback,” IEEE Trans. on Information Theory, vol. 47, pp. 2632–2639, Sep.
2001.
[280] P. Viswanath and D. N. Tse, “Sum capacity of the multiple antenna Gaussian
broadcast channel,” in Proc. IEEE Int. Symp. Information Theory (ISIT’02),
p. 497, June 2002.
[281] P. Viswanath, D. N. Tse, and V. Anantharam, “Asymptotically optimal water-
filling in vector multiple-access channels,” IEEE Trans. on Information The-
ory, vol. 47, pp. 241–267, Jan. 2001.
[282] H. Viswanathan and S. Venkatesan, “Asymptotics of sum rate for dirty paper
coding and beamforming in multiple antenna broadcast channels,” in Proc.
Allerton Conf. on Communication, Control and Computing, (Monticello, IL),
Oct. 2003.
[283] D. Voiculescu, “Asymptotically commuting finite rank unitary operators with-
out commuting approximants,” Acta Sci. Math., vol. 45, pp. 429–431, 1983.
[284] D. Voiculescu, “Symmetries of some reduced free product c∗ -algebra,” in Op-
erator algebras and their connections with topology and ergodic theory, Lecture
Notes in Mathematics, vol. 1132, pp. 556–588, Berlin: Springer, 1985.
[285] D. Voiculescu, “Addition of certain non-commuting random variables,” J.
Funct. Analysis, vol. 66, pp. 323–346, 1986.
[286] D. Voiculescu, “Multiplication of certain non-commuting random variables,”
J. Operator Theory, vol. 18, pp. 223–235, 1987.
[287] D. Voiculescu, “Limit laws for random matrices and free products,” Inven-
tiones Mathematicae, vol. 104, pp. 201–220, 1991.
[288] D. Voiculescu, “The analogues of entropy and of Fisher’s information measure
in free probability theory, I,” Communications in Math. Physics, vol. 155,
pp. 71–92, July 1993.
[289] D. Voiculescu, “The analogues of entropy and of Fisher’s information measure
in free probability theory, II,” Inventiones Mathematicae, vol. 118, pp. 411–
440, Nov. 1994.
[290] D. Voiculescu, “Alternative proofs for the type II free Poisson variables and
for the free compression results (appendix to a paper by A. Nica and R.
Speicher),” American J. Math., vol. 118, pp. 832–837, 1996.
References 181
[307] E. Wigner, “Distribution laws for the roots of a random Hermitian matrix,”
in Statistical Theories of Spectra: Fluctuations, (C. E. Porter, ed.), New York:
Academic, 1965.
[308] E. Wigner, “Random matrices in physics,” SIAM Review, vol. 9, pp. 1–123,
1967.
[309] J. H. Winters, “Optimum combining in digital mobile radio with cochannel
interference,” IEEE J. on Selected Areas in Communications, vol. 2, pp. 528–
539, July 1984.
[310] J. H. Winters, J. Salz, and R. D. Gitlin, “The impact of antenna diversity on
the capacity of wireless communication systems,” IEEE Trans. on Communi-
cations, vol. 42, pp. 1740–1751, Feb./Mar./Apr. 1994.
[311] J. Wishart, “The generalized product moment distribution in samples from a
normal multivariate population,” Biometrika, vol. 20 A, pp. 32–52, 1928.
[312] W. Xiao and M. L. Honig, “Large system convergence analysis of adaptive
reduced- and full-rank least squares algorithms,” IEEE Trans. on Information
Theory, 2004, to appear.
[313] Y. Q. Yin, “Limiting spectral distribution for a class of random matrices,” J.
of Multivariate Analysis, vol. 20, pp. 50–68, 1986.
[314] Y. Q. Yin and P. R. Krishnaiah, “A limit theorem for the eigenvalues of
product of two random matrices,” J. of Multivariate Analysis, vol. 13, pp. 489–
507, 1984.
[315] Y. Q. Yin and P. R. Krishnaiah, “Limit theorem for the eigenvalues of the
sample covariance matrix when the underlying distribution is isotropic,” The-
ory Prob. Appl., vol. 30, pp. 861–867, 1985.
[316] W. Yu and J. Cioffi, “Trellis precoding for the broadcast channel,” in Proc.
IEEE Global Telecomm. Conf. (GLOBECOM’01), pp. 1344–1348, Oct. 2001.
[317] B. M. Zaidel, S. Shamai, and S. Verdú, “Multicell uplink spectral efficiency of
coded DS-CDMA with random signatures,” IEEE Journal on Selected Areas
in Communications, vol. 19, pp. 1556–1569, Aug. 2001.
[318] J. Zhang and X. Wang, “Large-system performance analysis of blind and
group-blind multiuser receivers,” IEEE Trans. on Information Theory, vol. 48,
pp. 2507–2523, Sep. 2002.