On The Sphere Decoding Algorithm. I. Expected Complexity

Babak Hassibi, Haris Vikalo
Abstract
The problem of finding the least-squares solution to a system of linear equations where the unknown
vector is comprised of integers, but the coefficient matrix and given vector are comprised of real numbers,
arises in many applications: communications, cryptography, GPS, to name a few. The problem is equivalent to finding the closest lattice point to a given point and is known to be NP-hard. In communications
applications, however, the given vector is not arbitrary, but rather is an unknown lattice point that has been
perturbed by an additive noise vector whose statistical properties are known. Therefore in this paper, rather
than dwell on the worst-case complexity of the integer-least-squares problem, we study its expected complexity, averaged over the noise and over the lattice. For the sphere decoding algorithm of Fincke and
Pohst we find a closed-form expression for the expected complexity, both for the infinite and finite lattice.
It is demonstrated in the second part of this paper that, for a wide range of signal-to-noise ratios (SNRs)
and numbers of antennas, the expected complexity is polynomial, in fact often roughly cubic. Since many
communications systems operate at noise levels for which the expected complexity turns out to be polynomial, this suggests that maximum-likelihood decoding, which was hitherto thought to be computationally
intractable, can in fact be implemented in real time, a result with many practical implications.
Index Terms: Integer least-squares problem, sphere decoding, wireless communications, multiple-antenna
systems, lattice problems, NP-hard, expected complexity.
1 Introduction

In this paper we shall be concerned with the following integer least-squares problem:

$$\min_{s \in \mathbb{Z}^m} \|x - Hs\|^2, \tag{1}$$

where $x \in \mathbb{R}^{n \times 1}$, $H \in \mathbb{R}^{n \times m}$, and $\mathbb{Z}^m$ denotes the $m$-dimensional integer lattice, i.e., $s$ is an $m$-dimensional vector with integer entries. Often, the search space is a (finite) subset $D$ of the infinite lattice, in which case we have

$$\min_{s \in D \subset \mathbb{Z}^m} \|x - Hs\|^2. \tag{2}$$
The integer least-squares problem has a simple geometric interpretation. As the entries of $s$ run over the
integers, $s$ spans the "rectangular" $m$-dimensional lattice $\mathbb{Z}^m$. However, for any given lattice-generating
matrix $H$, the $n$-dimensional vector $Hs$ spans a "skewed" lattice. (When $n > m$, this skewed lattice
lives in an $m$-dimensional subspace of $\mathbb{R}^{n \times 1}$.) Therefore, given the skewed lattice $Hs$ and a vector
$x \in \mathbb{R}^{n \times 1}$, the integer least-squares problem is to find the closest lattice point (in a Euclidean sense) to
$x$; see Figure 1.
2 Overview of Methods
Since the integer least-squares problem arises in many applications and finding the exact solution is, in
general, NP-hard, all practical systems employ some approximations, heuristics, or combinations thereof. In
communications applications, these approximations can be broadly categorized into three classes.
1. Solve the unconstrained least-squares problem to obtain $\hat{s} = H^\dagger x$, where $H^\dagger$ denotes the pseudo-inverse of $H$. Since the entries of $\hat{s}$ will not necessarily be integers, round them off to the closest
integers (a process referred to as slicing) to obtain

$$\hat{s}_B = \left[ H^\dagger x \right]_{\mathbb{Z}^m}. \tag{3}$$
The above $\hat{s}_B$ is often called the Babai estimate [1]. In communications parlance, this procedure is
referred to as zero-forcing equalization.
2. Nulling and cancelling. In this method, the Babai estimate is used for only one of the entries of $s$, say
the first entry $s_1$. Then $s_1$ is assumed to be known and its effect is cancelled out to obtain a reduced-order integer least-squares problem with $m - 1$ unknowns. The process is then repeated to find $s_2$,
etc. In communications parlance this is known as decision-feedback equalization.
3. Nulling and cancelling with optimal ordering. Nulling and cancelling can suffer from error propagation:
if $s_1$ is estimated incorrectly, it can have an adverse effect on the estimation of the remaining unknowns
$s_2$, $s_3$, etc. To minimize the effect of error propagation, it is advantageous to perform nulling and cancelling from the "strongest" to the "weakest" signal. This is the method proposed for V-BLAST
[3]; see also [4].
The above heuristic methods all require $O(mn^2)$ computations, essentially because they all first solve
the unconstrained least-squares problem; a minimal sketch of the first of them is given below.
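To make heuristic 1 concrete, here is a minimal Python/NumPy sketch of the zero-forcing (Babai) estimate (3); the function name is ours, and the unconstrained solution is obtained by least squares rather than by forming an explicit pseudo-inverse:

```python
import numpy as np

def babai_estimate(H, x):
    """Zero-forcing heuristic: solve the unconstrained least-squares problem
    and slice (round) each entry to the nearest integer, as in (3)."""
    s_unconstrained, *_ = np.linalg.lstsq(H, x, rcond=None)  # H^dagger x
    return np.round(s_unconstrained).astype(int)
```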
Note that when the columns of $H$ happen to be orthogonal, slicing in fact yields the exact solution. In practice, however, the columns of $H$ are rarely orthogonal, and orthogonalizing the columns of $H$ via
a QR decomposition, or otherwise, generally destroys the lattice structure. (The reason is that, if $s$
has integer entries, $Rs$ need not have integer entries.) One method that attempts to alleviate this is lattice
reduction. In these methods, one attempts to find an invertible $m \times m$ matrix $T$, such that $T$ and $T^{-1}$ have
integer entries (thereby preserving the lattice structure), and such that the matrix $G = HT$ is as "orthogonal
as possible". Having found such a $T$, rather than solve (1), one can solve the integer least-squares problem
$$\min_{t \in \mathbb{Z}^m} \|x - Gt\|^2, \tag{4}$$
using the earlier-mentioned heuristics, and set $s = Tt$. Of course, lattice reduction is itself NP-hard. A
common heuristic is the LLL (Lenstra, Lenstra, and Lovász [5]) algorithm which, at the risk of gross oversimplification, can be regarded as Gram-Schmidt over the integers.
While lattice reduction may lead to some improvement in the solution of (1), the integer least-squares
problem over the infinite lattice, it is not useful for (2), which is over a subset of the lattice. The reason
is that the lattice-transforming matrix $T$ often destroys the properties of the subset $D \subset \mathbb{Z}^m$. Since in
communications applications we are always interested in a finite subset of the integer lattice, we shall
therefore not consider lattice reduction methods in this paper.
As an illustration, consider a multiple-antenna system employing a rate $R = 16$ linear space-time code with $m = 64$, where the entries of $s$ are drawn from a 4-PAM constellation, so that the number of lattice points
in $D$ is $4^{64} = 2^{128} \approx 3.4 \times 10^{38}$. As can be seen from Figure 2, the bit-error rate (BER) performance of the
exact integer least-squares solution is far superior to that of the best heuristic, which in this case is nulling
and cancelling with optimal ordering.¹

Figure 2: Bit-error performance of a rate $R = 16$ linear space-time code, corresponding to $m = 64$. Exact
ML solution vs. nulling/cancelling with optimal ordering. (Number of lattice points $= 2^{128} \approx 3.4 \times 10^{38}$.)
The above discussion shows that there is merit in studying exact solutions. The most obvious one is
a full search over the entire lattice which, although theoretically feasible for finite lattices, invariably
has exponential complexity. There do, however, exist exact methods that are a bit more sophisticated
than a full search. These include Kannan's algorithm [8] (which searches only over restricted
parallelograms), the KZ algorithm [9] (based on the Korkin-Zolotarev reduced basis [10]), and the sphere
decoding algorithm of Fincke and Pohst [11, 12]. Since the work of Fincke and Pohst, the sphere
decoding algorithm has been rediscovered in several contexts (see, e.g., [13] in the context of GPS systems),
and it is the algorithm we will be considering in this paper. Many of the results of this paper and various
extensions can be found in the second author's PhD dissertation [14].
¹ Of course, at this point it may appear surprising that one can even generate the ML curve in Figure 2, since it requires finding
the exact solution among a set of size $10^{38}$; more on this later.
3 Sphere Decoding
The basic premise in sphere decoding is rather simple: we attempt to search over only the lattice points $s \in \mathbb{Z}^m$
that lie in a certain sphere of radius $d$ around the given vector $x$, thereby reducing the search space and
hence the required computations (see Figure 3). Clearly, the closest lattice point inside the sphere will also
be the closest lattice point for the whole lattice. However, close scrutiny of this basic idea leads to two key
questions.
1. How to choose d? Clearly, if d is too large, we obtain too many points and the search remains
exponential in size, whereas if d is too small, we obtain no points inside the sphere.
A natural candidate for $d$ is the covering radius of the lattice, defined to be the smallest radius of
spheres centered at the lattice points that cover the entire space. This is clearly the smallest radius that
guarantees the existence of a point inside the sphere for any vector $x$. The problem with this choice
of $d$ is that determining the covering radius for a given lattice is itself NP-hard [15].
Another choice is to take $d$ to be the distance between the Babai estimate and the vector $x$, i.e., $d =
\|x - H\hat{s}_B\|$, since this radius guarantees the existence of at least one lattice point (here the Babai
estimate) inside the sphere. However, it is again not clear in general whether this choice of radius
leads to too many lattice points lying inside the sphere.
2. How can we tell which lattice points are inside the sphere? If this requires testing the distance of each
lattice point from x (to determine whether it is less than d), then there is no point in sphere decoding
as we will still need an exhaustive search.
Sphere decoding does not really address the first question. However, it does propose an efficient way to
answer the second, and more pressing, one. The basic observation is the following. Although it is difficult
to determine the lattice points inside a general $m$-dimensional sphere, it is trivial to do so in the (one-dimensional) case of $m = 1$. The reason is that a one-dimensional sphere is simply an interval, and so the
desired lattice points will be the integer values that lie in this interval. We can use this observation to go
from dimension $k$ to dimension $k+1$. Suppose we have determined all $k$-dimensional lattice points that lie
in a sphere of radius $d$. Then for any such $k$-dimensional point, the set of admissible values of the $(k+1)$-th
coordinate that lie in the higher-dimensional sphere of the same radius $d$ forms an interval.
The above means that we can determine all lattice points in a sphere of dimension m and radius d by
successively determining all lattice points in spheres of lower dimensions 1, 2, . . . , m and the same radius
d. Such an algorithm for determining the lattice points in an m-dimensional sphere essentially constructs
a tree where the branches in the k-th level of the tree correspond to the lattice points inside the sphere of
radius $d$ and dimension $k$; see Figure 4. Moreover, the complexity of such an algorithm will depend on the
size of the tree, i.e., on the number of lattice points visited by the algorithm in different dimensions.
Figure 4: Sample tree generated to determine lattice points in an $m = 4$ dimensional sphere (levels $k = 1, 2, 3, 4$).

With this in mind, let us be more specific. The lattice point $Hs$ lies inside a sphere of radius $d$ centered at $x$ if and only if

$$d^2 \geq \|x - Hs\|^2. \tag{5}$$
In order to break the problem into the subproblems described above, it is useful to consider the QR factorization of the matrix $H$,

$$H = Q \begin{bmatrix} R \\ 0_{(n-m) \times m} \end{bmatrix} = \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix} R \\ 0_{(n-m) \times m} \end{bmatrix}, \tag{6}$$

where $R$ is an $m \times m$ upper triangular matrix and $Q = [Q_1\ Q_2]$ is an $n \times n$ orthogonal matrix; the
matrices $Q_1$ and $Q_2$ represent the first $m$ and last $n - m$ orthonormal columns of $Q$, respectively. The
condition (5) can therefore be written as

$$d^2 \geq \left\| x - \begin{bmatrix} Q_1 & Q_2 \end{bmatrix} \begin{bmatrix} R \\ 0 \end{bmatrix} s \right\|^2 = \left\| \begin{bmatrix} Q_1^* \\ Q_2^* \end{bmatrix} x - \begin{bmatrix} R \\ 0 \end{bmatrix} s \right\|^2 = \|Q_1^* x - Rs\|^2 + \|Q_2^* x\|^2, \tag{7}$$

where $(\cdot)^*$ denotes the transpose. Defining $y = Q_1^* x$ and $d'^2 = d^2 - \|Q_2^* x\|^2$ allows us to rewrite this as

$$d'^2 \geq \sum_{i=1}^{m} \left( y_i - \sum_{j=i}^{m} r_{i,j} s_j \right)^2, \tag{8}$$
where $r_{i,j}$ denotes the $(i, j)$ entry of $R$. Here is where the upper triangular property of $R$ comes in handy.
The right-hand side (RHS) of the above inequality can be expanded as

$$d'^2 \geq (y_m - r_{m,m} s_m)^2 + (y_{m-1} - r_{m-1,m} s_m - r_{m-1,m-1} s_{m-1})^2 + \cdots, \tag{9}$$

where the first term depends only on $s_m$, the second term on $\{s_m, s_{m-1}\}$, and so on. Therefore a necessary
condition for $Hs$ to lie inside the sphere is that $d'^2 \geq (y_m - r_{m,m} s_m)^2$. This condition is equivalent to $s_m$
belonging to the interval

$$\left\lceil \frac{-d' + y_m}{r_{m,m}} \right\rceil \leq s_m \leq \left\lfloor \frac{d' + y_m}{r_{m,m}} \right\rfloor, \tag{10}$$
where $\lceil \cdot \rceil$ denotes rounding to the nearest larger element in the set of numbers that spans the lattice,² and $\lfloor \cdot \rfloor$ denotes rounding to the nearest smaller element in that set.

Of course, (10) is by no means sufficient. For every $s_m$ satisfying (10), defining $d_{m-1}^2 = d'^2 - (y_m - r_{m,m} s_m)^2$ and $y_{m-1|m} = y_{m-1} - r_{m-1,m} s_m$, a stronger necessary condition can be found by looking

² Clearly, for the case of the infinite lattice $\mathbb{Z}^m$, this set is the set of integers $\mathbb{Z}$. The set which spans a finite subset of $\mathbb{Z}^m$
is a finite subset of $\mathbb{Z}$, possibly shifted.
at the first two terms in (9), which leads to $s_{m-1}$ belonging to the interval

$$\left\lceil \frac{-d_{m-1} + y_{m-1|m}}{r_{m-1,m-1}} \right\rceil \leq s_{m-1} \leq \left\lfloor \frac{d_{m-1} + y_{m-1|m}}{r_{m-1,m-1}} \right\rfloor. \tag{11}$$
One can continue in a similar fashion for $s_{m-2}$, and so on until $s_1$, thereby obtaining all lattice points
belonging to (5).
3.1 The Sphere Decoding Algorithm

The sphere decoding algorithm can thus be summarized as follows.

Input: $Q = [Q_1\ Q_2]$, $R$, $x$, $y = Q_1^* x$, $d$.

1. Set $k = m$, $d_m^2 = d^2 - \|Q_2^* x\|^2$, $y_{m|m+1} = y_m$.
2. (Bounds for $s_k$) Set $UB(s_k) = \lfloor (d_k + y_{k|k+1})/r_{k,k} \rfloor$, $s_k = \lceil (-d_k + y_{k|k+1})/r_{k,k} \rceil - 1$.
3. (Increase $s_k$) $s_k = s_k + 1$. If $s_k \leq UB(s_k)$, go to 5; else, go to 4.
4. (Increase $k$) $k = k + 1$; if $k = m + 1$, terminate algorithm; else, go to 3.
5. (Decrease $k$) If $k = 1$, go to 6. Else set $k = k - 1$, $y_{k|k+1} = y_k - \sum_{j=k+1}^{m} r_{k,j} s_j$, $d_k^2 = d_{k+1}^2 - (y_{k+1|k+2} - r_{k+1,k+1} s_{k+1})^2$, and go to 2.
6. Solution found. Save $s$ and its distance from $x$, $d_m^2 - d_1^2 + (y_1 - r_{1,1} s_1)^2$, and go to 3.
Note that the subscript $k|k+1$ in $y_{k|k+1}$ above is used to denote the received signal $y_k$ adjusted with the
already estimated symbol components $s_{k+1}, \ldots, s_m$. Furthermore, note that in steps 2 and 3 of the algorithm
we assumed unit spacing between any two nearest elements of the set spanning the lattice. If the lattice is
scaled, i.e., if the spacing between two nearest neighbors in the set spanning the lattice is different from 1,
those algorithm steps need to be adjusted accordingly.
We should mention that the original paper of Fincke and Pohst [12] used slightly different notation from the
one we have used. For completeness, we shall include it here. The paper [12] makes use of the unconstrained
least-squares solution $\hat{s} = H^\dagger x = R^{-1} Q_1^* x$. In this case, it follows that $\|Q_2^* x\|^2 = \|x\|^2 - \|H\hat{s}\|^2$, and so
inequality (7) becomes

$$d^2 - \|x\|^2 + \|H\hat{s}\|^2 \geq \|R(\hat{s} - s)\|^2. \tag{12}$$
The right-hand side of (12) can be expanded as

$$r_{m,m}^2 (\hat{s}_m - s_m)^2 + r_{m-1,m-1}^2 \left( \hat{s}_{m-1} - s_{m-1} + \frac{r_{m-1,m}}{r_{m-1,m-1}} (\hat{s}_m - s_m) \right)^2 + \cdots \leq d^2 - \|x\|^2 + \|H\hat{s}\|^2, \tag{13}$$

so that the counterparts of the intervals (10) and (11) become

$$\left\lceil \hat{s}_m - \frac{d'}{r_{m,m}} \right\rceil \leq s_m \leq \left\lfloor \hat{s}_m + \frac{d'}{r_{m,m}} \right\rfloor \tag{14}$$

and

$$\left\lceil \hat{s}_{m-1|m} - \frac{d_{m-1}}{r_{m-1,m-1}} \right\rceil \leq s_{m-1} \leq \left\lfloor \hat{s}_{m-1|m} + \frac{d_{m-1}}{r_{m-1,m-1}} \right\rfloor, \tag{15}$$

respectively, where we have defined $\hat{s}_{m-1|m} = \hat{s}_{m-1} + \frac{r_{m-1,m}}{r_{m-1,m-1}} (\hat{s}_m - s_m)$. We can now alternatively write
the algorithm as
Input: $R$, $x$, $\hat{s}$, $d$.

1a. Set $k = m$, $d_m^2 = d^2 - \|x\|^2 + \|H\hat{s}\|^2$, $\hat{s}_{m|m+1} = \hat{s}_m$.
2a. (Bounds for $s_k$) Set $UB(s_k) = \lfloor d_k/r_{k,k} + \hat{s}_{k|k+1} \rfloor$, $s_k = \lceil -d_k/r_{k,k} + \hat{s}_{k|k+1} \rceil - 1$.
3a. (Increase $s_k$) $s_k = s_k + 1$. If $s_k \leq UB(s_k)$, go to 5a; else, go to 4a.
4a. (Increase $k$) $k = k + 1$; if $k = m + 1$, terminate algorithm; else, go to 3a.
5a. (Decrease $k$) If $k = 1$, go to 6a. Else set $k = k - 1$, $\hat{s}_{k|k+1} = \hat{s}_k + \sum_{j=k+1}^{m} \frac{r_{k,j}}{r_{k,k}} (\hat{s}_j - s_j)$, $d_k^2 = d_{k+1}^2 - r_{k+1,k+1}^2 (s_{k+1} - \hat{s}_{k+1|k+2})^2$, and go to 2a.
6a. Solution found. Save $s$ and its distance from $x$, $d_m^2 - d_1^2 + r_{1,1}^2 (s_1 - \hat{s}_{1|2})^2$, and go to 3a.
Fincke and Pohst [12] also provided an upper bound on the number of arithmetic operations required by the algorithm:

$$\frac{1}{6}\left(2m^3 + 3m^2 - 5m\right) + \frac{1}{2}\left(m^2 + 12m - 7\right)\left( \binom{\lfloor 4d^2 t \rfloor + m - 1}{\lfloor 4d^2 t \rfloor} \left( 2\left\lfloor \sqrt{d^2 t} \right\rfloor + 1 \right) + 1 \right), \tag{16}$$
where $t = \max(r_{1,1}^2, \ldots, r_{m,m}^2)$. In practice, $t$ grows proportionally to $n$ ($r_{1,1}^2$, for example, is simply the
squared norm of the first column of $H$, which has $n$ entries) and $d^2$ grows proportionally to $m$ (for more
on this see below), and so the upper bound on the number of computations in (16) can be quite large.
Our experience with numerical implementations of the algorithm shows that the bound is extremely loose.
Moreover, although it does depend on the lattice-generating matrix H (through the quantity t), it offers little
insight into the complexity of the algorithm. We will therefore not further consider it.
In this paper we propose to study the complexity of the sphere decoding algorithm using the geometric
interpretation we have developed so far. As mentioned earlier, the complexity of the sphere decoding algorithm depends on the size of the generated tree in Fig. 4, which is equal to the sum of the number of lattice
points in spheres of radius d and dimensions k = 1, . . . , m. The size of this tree depends on the matrix
H, as well as on the vector x. Therefore, unlike the complexity of solving the unconstrained least-squares
problem which only depends on m and n and not on the specific H and x, the complexity of the sphere
decoding algorithm is data-dependent.
3.2.1 Expected Complexity
Of course, since the integer least-squares problem is NP-hard, the worst-case complexity of sphere decoding
is exponential. However, if we assume that the matrix $H$ and vector $x$ are generated randomly (according
to some known distributions), then the complexity of the algorithm will itself be a random variable. In this
case, it is meaningful to study the expected (or average) complexity of sphere decoding, and perhaps even
some of its higher-order moments.³
In what follows we will give a rough argument for the expected complexity of sphere decoding being
exponential, although it is not too difficult to make it rigorous. (For a rigorous treatment, albeit using a
different approach, see [2].) For an arbitrary point x, and an arbitrary lattice H, it is not too difficult to show
that the expected number of lattice points inside the k-dimensional sphere of radius d is proportional to its
volume, given by (see, e.g., [15])

$$\frac{\pi^{k/2}}{\Gamma\left(\frac{k}{2} + 1\right)}\, d^k.$$
Therefore the expected total number of points visited by the sphere decoding is proportional to the total
number of lattice points inside spheres of dimensions $k = 1, \ldots, m$:

$$P = \sum_{k=1}^{m} \frac{\pi^{k/2}}{\Gamma\left(\frac{k}{2} + 1\right)}\, d^k.$$

³ In passing, we should mention that there is recent interest in studying the expected, rather than worst-case, complexity of
various algorithms in the computer science literature. The reader may wish to refer to the survey paper [16] and the references
therein, as well as the influential papers [17, 18]. In these works, a uniform distribution on the underlying problem is often
(artificially) assumed and complexity issues such as NP-completeness, etc., are revisited. However, as we shall see below, our
problem allows for a very natural stochastic model.
A simple lower bound on $P$ can be obtained by considering only the volume of an arbitrary intermediate
dimension, say $\bar{k}$:

$$P \geq \frac{\pi^{\bar{k}/2}}{\Gamma(\bar{k}/2 + 1)}\, d^{\bar{k}} \geq \left( \frac{2\pi e d^2}{\bar{k}} \right)^{\bar{k}/2} \frac{1}{\sqrt{\pi \bar{k}}},$$

where we have assumed $m \geq \bar{k} \gg 1$ and have used Stirling's formula for the Gamma function. Clearly,
$P$, and its lower bound, depend on the radius $d^2$. This must be chosen in such a way that the probability of
the sphere decoder finding a lattice point does not vanish. This clearly requires the volume of the
$m$-dimensional sphere not to tend to zero, i.e.,

$$\left( \frac{2\pi e d^2}{m} \right)^{m/2} \frac{1}{\sqrt{\pi m}} = O(1),$$

which for large $m$ implies that $2\pi e d^2 = m^{1 + \frac{1}{m}}$. Plugging this into the lower bound for $P$ yields

$$P \geq \left( \frac{m^{1 + \frac{1}{m}}}{\bar{k}} \right)^{\bar{k}/2} \frac{1}{\sqrt{\pi \bar{k}}} = \left( \alpha\, m^{\frac{1}{m}} \right)^{\frac{m}{2\alpha}} \sqrt{\frac{\alpha}{\pi m}},$$

where we have defined $\alpha = m/\bar{k} > 1$. This last expression clearly shows that the expected number of points
$P$, and hence the complexity of the algorithm, grows exponentially in $m$. (Take, e.g., $\alpha = 2$.)
4 A Random Model
Although not unexpected, the above is a discouraging result. In communications applications, however,
the vector x is not arbitrary, but rather is a lattice point perturbed by additive noise with known statistical
properties. Thus, we will assume
$$x = Hs + v, \tag{17}$$
where the entries of $v$ are independent $\mathcal{N}(0, \sigma^2)$ random variables with known variance, and the entries of
$H$ are independent $\mathcal{N}(0, 1)$ random variables. Furthermore, $H$ and $v$ are mutually independent.
Since $\frac{1}{2\sigma^2}\|v\|^2 = \frac{1}{2\sigma^2}\|x - Hs\|^2$ is a (scaled) $\chi^2$ random variable with $n$ degrees of freedom,
we may choose the radius to be a scaled version of the noise variance, $d^2 = \alpha n \sigma^2$, where the scale factor $\alpha$ is chosen so that, with high probability, we find a lattice point inside the sphere:

$$\int_0^{\alpha n/2} \frac{\lambda^{n/2 - 1}}{\Gamma(n/2)} e^{-\lambda}\, d\lambda = 1 - \epsilon,$$

where the integrand is the probability density function of the $\chi^2$ random variable with $n$ degrees of freedom
(in the normalization above), and where $1 - \epsilon$ is set to a value close to 1, say, $1 - \epsilon = 0.99$. [If the point is not found, we can increase the
probability $1 - \epsilon$, adjust the radius, and search again.]
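As an aside, such a radius can be obtained directly from the $\chi^2$ quantile function; a one-line sketch assuming SciPy (the function name is ours):

```python
from scipy.stats import chi2

def search_radius_sq(n, sigma2, eps=0.01):
    """d^2 such that P(||v||^2 <= d^2) = 1 - eps, where ||v||^2/sigma^2 ~ chi^2_n."""
    return sigma2 * chi2.ppf(1.0 - eps, df=n)
```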
The important point is that the radius $d$ is chosen based on the statistics of the noise, and not based on
the lattice $H$. Making the choice based on $H$ quickly leads us to NP-hard problems (such as determining the covering radius). Moreover, choosing the radius based on the noise has a beneficial effect on the
computational complexity.
Of course, if H possesses special structure, such as Toeplitz structure, then this is not a reasonable assumption and the structure must be explicitly taken into account. However, this merits a separate analysis and
will be considered in the second part of this paper.
Now, as argued in the previous section, the complexity of the sphere decoding algorithm is proportional
to the number of nodes visited on the tree in Figure 4 and, consequently, to the number of points visited in
the spheres of radius d and dimensions k = 1, 2, . . . , m. Hence the expected complexity is proportional to
the number of points in such spheres that the algorithm visits on average. Thus the expected complexity of
the sphere decoding algorithm is given by

$$C(m, \sigma^2, d^2) = \sum_{k=1}^{m} f_p(k)\, E_p(k, d^2 = \alpha n \sigma^2), \tag{18}$$

where $E_p(k, d^2)$ denotes the expected number of lattice points inside the $k$-dimensional sphere of radius $d$.
The summation in (18) goes over the dimensions $k = 1$ through $k = m$. The coefficient

$$f_p(k) = 2k + 11$$

is the number of elementary operations (additions, subtractions, and multiplications) that the Fincke-Pohst
algorithm performs per visited point in dimension $k$. We thus need to compute $E_p(k, d^2)$.
Let us first begin with the highest dimension, i.e., $k = m$.
4.2.1 $k = m$
Figure 5: $s_t$ transmitted and $x$ received. We are interested in whether an arbitrary lattice point $Hs_a$ lies in a sphere
of radius $d$ centered around $x$.
If the lattice point $s_t$ was transmitted and the vector $x = Hs_t + v$ received, we are interested in the
probability that an arbitrary lattice point $s_a$ lies in a sphere of radius $d$ centered at $x$, i.e., in the probability of the event

$$d^2 \geq \|x - Hs_a\|^2 = \|v + H(s_t - s_a)\|^2. \tag{19}$$
Now the vector $w = v + H(s_t - s_a)$ is clearly a zero-mean Gaussian random vector, since its entries are the
sums of zero-mean Gaussian random variables. Its covariance matrix has $(i, j)$ entry

$$E\, w_i w_j = E \left( v_i + \sum_{k=1}^{m} h_{ik}(s_{t,k} - s_{a,k}) \right)\left( v_j + \sum_{l=1}^{m} h_{jl}(s_{t,l} - s_{a,l}) \right) = \sigma^2 \delta_{ij} + \sum_{k=1}^{m}\sum_{l=1}^{m} (s_{t,k} - s_{a,k})(s_{t,l} - s_{a,l})\, E\, h_{ik} h_{jl} = \left( \sigma^2 + \|s_t - s_a\|^2 \right)\delta_{ij}.$$

Thus, $w$ is an $n$-dimensional vector of zero-mean iid Gaussian random variables with variance $\sigma^2 + \|s_t - s_a\|^2$, and so

$$\frac{\|w\|^2}{\sigma^2 + \|s_t - s_a\|^2} = \frac{\|v + H(s_t - s_a)\|^2}{\sigma^2 + \|s_t - s_a\|^2}$$

is a $\chi^2$ random variable with $n$ degrees of freedom.
Thus, the probability that the lattice point $s_a$ lies in a sphere of radius $d$ around $x$ is given by the normalized
incomplete gamma function

$$\gamma\left( \frac{d^2}{2(\sigma^2 + \|s_a - s_t\|^2)},\; \frac{n}{2} \right) = \int_0^{\frac{d^2}{2(\sigma^2 + \|s_a - s_t\|^2)}} \frac{\lambda^{n/2 - 1}}{\Gamma(n/2)} e^{-\lambda}\, d\lambda. \tag{20}$$
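As a quick sanity check of (20), one can compare it against a Monte Carlo estimate; a sketch assuming NumPy/SciPy, where all numerical choices are illustrative and scipy.special.gammainc is the regularized lower incomplete gamma, i.e., the normalized $\gamma$ above with its two arguments swapped:

```python
import numpy as np
from scipy.special import gammainc

n, m, sigma2, trials = 6, 4, 0.5, 200000
s_t = np.array([0., 1., 0., 2.])     # transmitted point (arbitrary choice)
s_a = np.array([1., 1., 0., 2.])     # candidate point at squared distance 1
d2 = 2.0 * n * sigma2                # some radius
l = np.sum((s_t - s_a)**2)

hits = 0
for _ in range(trials):
    H = np.random.randn(n, m)
    x = H @ s_t + np.sqrt(sigma2) * np.random.randn(n)
    hits += np.sum((x - H @ s_a)**2) <= d2

print(hits / trials)                               # empirical probability
print(gammainc(n / 2, d2 / (2 * (sigma2 + l))))    # prediction (20)
```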
Now that we have computed this probability, the expected number of points in the m-dimensional sphere
can be evaluated. However, before doing so, let us turn to the k < m case.
4.2.2 $k < m$
Referring back to (9), we are interested in all $k$-dimensional lattice points $s$ such that

$$d'^2 \geq \sum_{i=m-k+1}^{m} \left( y_i - \sum_{j=i}^{m} r_{i,j} s_j \right)^2. \tag{21}$$
To better understand this set, let us again consider the QR decomposition (6) and write

$$\|x - Hs_a\|^2 = \left\| v + Q \begin{bmatrix} R \\ 0_{(n-m) \times m} \end{bmatrix} (s_t - s_a) \right\|^2 = \left\| Q^* v + \begin{bmatrix} R \\ 0_{(n-m) \times m} \end{bmatrix} (s_t - s_a) \right\|^2.$$
[We shall henceforth suppress the subscripts of the all-zero submatrices, as long as their dimensions are
self-evident.] Now if we partition the upper triangular matrix $R$ and the vector $u = Q^* v$ as

$$R = \begin{bmatrix} R_{m-k,m-k} & R_{m-k,k} \\ 0_{k \times (m-k)} & R_{k,k} \end{bmatrix} \quad \text{and} \quad u = \begin{bmatrix} u^{m-k} \\ u^{k} \\ u^{n-m} \end{bmatrix}, \tag{22}$$

where the block matrices $R_{m-k,m-k}$, $R_{m-k,k}$, and $R_{k,k}$ are $(m-k) \times (m-k)$, $(m-k) \times k$, and $k \times k$,
respectively, and the vectors $u^{m-k}$, $u^{k}$, and $u^{n-m}$ are $m-k$, $k$, and $n-m$ dimensional, respectively, then
we can write

$$\|x - Hs_a\|^2 = \|u^{m-k} + R_{m-k,m-k}(s_t^{m-k} - s_a^{m-k}) + R_{m-k,k}(s_t^{k} - s_a^{k})\|^2 + \|u^{k} + R_{k,k}(s_t^{k} - s_a^{k})\|^2 + \|u^{n-m}\|^2,$$

where the vectors $s_t^{m-k}$, $s_a^{m-k}$ and $s_t^{k}$, $s_a^{k}$ are $m-k$ and $k$ dimensional, respectively, and such that

$$s_t = \begin{bmatrix} s_t^{m-k} \\ s_t^{k} \end{bmatrix}, \qquad s_a = \begin{bmatrix} s_a^{m-k} \\ s_a^{k} \end{bmatrix}.$$
It is now straightforward to see that $d'^2 = d^2 - \|u^{n-m}\|^2$ and that $\|u^{k} + R_{k,k}(s_t^{k} - s_a^{k})\|^2$ is simply the sum
of the last $k$ terms in the sum (8). Thus, we may rewrite the inequality (21) as

$$d^2 \geq \|u^{k} + R_{k,k}(s_t^{k} - s_a^{k})\|^2 + \|u^{n-m}\|^2 = \left\| \begin{bmatrix} u^{k} \\ u^{n-m} \end{bmatrix} + \begin{bmatrix} R_{k,k} \\ 0 \end{bmatrix} (s_t^{k} - s_a^{k}) \right\|^2. \tag{23}$$
Thus, to compute the expected number of k-dimensional lattice points that satisfy (21), we need to
determine the probability distribution of the RHS of (23). For this we need the following result.
Lemma 1. Let $H$ be an $n \times m$ (with $n \geq m$) random matrix with iid columns such that each column has a
distribution that is rotationally-invariant from the left. In other words, for any $n \times n$ unitary matrix $\Theta$, the
distribution of $h_i$, the $i$-th column of $H$, satisfies

$$p_h(\Theta h_i) = p_h(h_i).$$

Consider now the QR decomposition

$$H = Q \begin{bmatrix} R \\ 0 \end{bmatrix},$$

where $Q$ is $n \times n$ and unitary, and $R$ is $m \times m$ and upper triangular with non-negative diagonal entries.
Then $Q$ and $R$ are independent random matrices, where:

1. $Q$ has an isotropic distribution, i.e., one that is invariant under pre-multiplication by any $n \times n$ unitary
matrix:

$$p_Q(\Theta Q) = p_Q(Q), \quad \text{for all } \Theta \text{ with } \Theta\Theta^* = \Theta^*\Theta = I.$$

2. If we partition

$$H = \begin{bmatrix} H_{m-k,m-k} & H_{m-k,k} \\ H_{n-m+k,m-k} & H_{n-m+k,k} \end{bmatrix},$$

where the subscripts indicate the dimensions of the sub-matrices, then $R_{k,k}$ (the trailing $k \times k$ block of $R$, as in (22)) has the same distribution
as the $R$ obtained from the QR decomposition of the $(n-m+k) \times k$ matrix $H_{n-m+k,k}$.
Proof: See Appendix A.
Remarks:
1. What is interesting about the above Lemma is that, even though the $(n-m+k) \times k$ submatrix
$\begin{bmatrix} R_{k,k} \\ 0 \end{bmatrix}$ is not the $R$ of the QR decomposition of the $(n-m+k) \times k$ submatrix $H_{n-m+k,k}$, it has
the same distribution.
2. Lemma 1 clearly holds for an $H$ with iid zero-mean unit-variance Gaussian entries. In this case, one
can be explicit about the distribution of $R$: the entries are all independent, with the square of the $i$-th diagonal term
having a $\chi^2$ distribution with $n - i + 1$ degrees of freedom and the strictly upper triangular entries
having iid zero-mean unit-variance Gaussian distributions.
Let us now apply Lemma 1 to the problem at hand. First, since $v$ has iid zero-mean $\sigma^2$-variance Gaussian
entries and $Q$ is unitary, the same is true of $u = Q^* v$ and also of the sub-vectors $u^{m-k}$, $u^{k}$, and $u^{n-m}$.
Moreover, since $Q$ is independent of $R$, the same is true of $u$. Returning to the inequality (23), let us
multiply the vector inside the norm by an isotropically-random unitary matrix $\Theta$. Since this does not change
norms, we have

$$d^2 \geq \left\| \Theta \begin{bmatrix} u^{k} \\ u^{n-m} \end{bmatrix} + \Theta \begin{bmatrix} R_{k,k} \\ 0 \end{bmatrix} (s_t^{k} - s_a^{k}) \right\|^2.$$

Now $\tilde{v} = \Theta \begin{bmatrix} u^{k} \\ u^{n-m} \end{bmatrix}$ is again a vector of iid zero-mean $\sigma^2$-variance Gaussian
entries. Also, from Lemma 1 part 2, the $(n-m+k) \times k$ matrix $\tilde{H} = \Theta \begin{bmatrix} R_{k,k} \\ 0 \end{bmatrix}$ has iid zero-mean
unit-variance Gaussian entries. Thus, we may write (23) as

$$d^2 \geq \|\tilde{v} + \tilde{H}(s_t^{k} - s_a^{k})\|^2, \tag{24}$$
which is precisely of the form (19), except that the dimensions have changed from $n$ and $m$ to $n-m+k$ and $k$. Thus,
using the same argument as presented after (19), we conclude that the probability that the $k$-dimensional
lattice point $s_a^{k}$ lies in a sphere of radius $d$ around $x$ is

$$\gamma\left( \frac{d^2}{2(\sigma^2 + \|s_a^{k} - s_t^{k}\|^2)},\; \frac{n-m+k}{2} \right) = \int_0^{\frac{d^2}{2(\sigma^2 + \|s_a^{k} - s_t^{k}\|^2)}} \frac{\lambda^{(n-m+k)/2 - 1}}{\Gamma((n-m+k)/2)} e^{-\lambda}\, d\lambda. \tag{25}$$
Given this probability and the one in (20), one could in principle proceed by finding the argument of
the gamma function in (20) and (25) for each pair of points $(s_a, s_t)$ and sum their contributions; however,
even for a finite lattice this would clearly be a computationally formidable task (and not doable at all in the
infinite-lattice case). Therefore, we shall find it useful to enumerate the lattice, i.e., count the number of
points with the same argument of the gamma function in (20) and (25). Enumeration of infinite and finite
lattices is treated separately.
For the infinite lattice $\mathbb{Z}^m$, the differences $s = s_a^{k} - s_t^{k}$ run over all of $\mathbb{Z}^k$, and so the expected number of points in the $k$-dimensional sphere is

$$E_p(k, d^2) = \sum_{l=0}^{\infty} \gamma\left( \frac{d^2}{2(\sigma^2 + l)},\; \frac{n-m+k}{2} \right) \cdot \left| \{ s \in \mathbb{Z}^k : \|s\|^2 = l \} \right|. \tag{26}$$
Since $\|s\|^2 = s_1^2 + \cdots + s_k^2$, we basically need to figure out how many ways a non-negative integer $l$ can be
represented as the sum of $k$ squared integers. This is a classic problem in number theory, and the solution is
denoted by $r_k(l)$ [20]. There exists a plethora of results on how to compute $r_k(l)$. We only mention one here,
due to Euler: $r_k(l)$ is given by the coefficient of $x^l$ in the expansion

$$\left( 1 + 2\sum_{m=1}^{\infty} x^{m^2} \right)^k = 1 + \sum_{l=1}^{\infty} r_k(l)\, x^l. \tag{27}$$
(For more on the problem of representing integers as the sum of squares see Appendix B.)
The above arguments lead to the following result.
Theorem 1 (Expected complexity of sphere decoding over the infinite lattice). Consider the model

$$x = Hs + v,$$

where $v \in \mathbb{R}^{n \times 1}$ is comprised of iid $\mathcal{N}(0, \sigma^2)$ entries, $H \in \mathbb{R}^{n \times m}$ is comprised of iid $\mathcal{N}(0, 1)$ entries, and
$s \in \mathbb{Z}^m$ is an $m$-dimensional vector whose entries are integers. Then the expected complexity of
the sphere decoding algorithm of Section 3.1 with a search radius $d$ for solving the integer least-squares
problem

$$\min_{s \in \mathbb{Z}^m} \|x - Hs\|^2$$

is given by

$$C(m, \sigma^2, d^2) = \sum_{k=1}^{m} f_p(k) \sum_{l=0}^{\infty} \gamma\left( \frac{d^2}{2(\sigma^2 + l)},\; \frac{n-m+k}{2} \right) r_k(l). \tag{28}$$
We should remark that, for any given search radius $d$, there is always a nonzero probability that no lattice
point is found. Therefore, to obtain the optimal solution, it is necessary to increase the search radius. One
plausible way of doing this is to start with a radius for which the probability of finding the transmitted point
is $1 - \epsilon$; then, if no point is found, to increase the search radius to a value such that the probability of not
finding the transmitted point is $\epsilon^2$, and so on. For such a strategy, we have the following result.
Corollary 1 (Expected complexity for finding the optimal solution). Consider the setting of Theorem 1.
Given any $0 < \epsilon < 1$, consider a strategy where we first choose a radius such that we find the transmitted
lattice point with probability $1 - \epsilon$, and then increase it to a probability of $1 - \epsilon^2$, and so on, if no point
is found. Then the expected complexity of the sphere decoding algorithm to find the optimal solution is
bounded by

$$C(m, \sigma^2, \epsilon) \leq \sum_{i=1}^{\infty} (1 - \epsilon)\epsilon^{i-1} \sum_{k=1}^{m} f_p(k) \sum_{l=0}^{\infty} \gamma\left( \frac{\alpha_i n \sigma^2}{2(\sigma^2 + l)},\; \frac{n-m+k}{2} \right) r_k(l), \tag{29}$$

where $\alpha_i$ is chosen such that

$$\gamma\left( \frac{\alpha_i n}{2},\; \frac{n}{2} \right) = 1 - \epsilon^i, \qquad i = 1, 2, \ldots \tag{30}$$
Note that the probability of having to perform $i$ decoding steps in order to find the transmitted point can
be calculated to be $(1 - \epsilon)\epsilon^{i-1}$, as follows:

$$p\left( d(i)^2 > \|v\|^2 > d(i-1)^2 \right) = p\left( \|v\|^2 < d(i)^2 \right) - p\left( \|v\|^2 < d(i-1)^2 \right) = 1 - \epsilon^i - 1 + \epsilon^{i-1} = (1 - \epsilon)\epsilon^{i-1},$$

where $d(i)$ denotes the sphere radius at the $i$-th decoding step. We remark that this is different from the probability of having to perform $i$ decoding steps in (29), since there is always the probability of the sphere
decoder finding a lattice point even when the transmitted lattice point is not inside the sphere. This explains
why we have only an upper bound in the above corollary.
In communications applications, the lattice is finite and the entries of $s$ are typically drawn from an $L$-PAM constellation

$$D_L = \left\{ -\frac{L-1}{2},\; -\frac{L-3}{2},\; \ldots,\; \frac{L-3}{2},\; \frac{L-1}{2} \right\}. \tag{31}$$
In fact, $L$ is often taken to be a power of 2. We say that the point $s$ then belongs to the lattice constellation $D_L^m$,

$$D_L^m = \underbrace{D_L \times D_L \times \cdots \times D_L}_{m \text{ times}}.$$
Furthermore, in this case, rather than the noise variance $\sigma^2$, one is interested in the signal-to-noise ratio $\rho$,

$$\rho = \frac{m(L^2 - 1)}{12\sigma^2}. \tag{32}$$
The probability expression (25) for finding an arbitrary lattice point $s_a^{k}$ inside a sphere around the given
point $x$ when the lattice point $s_t^{k}$ was transmitted holds for the finite-lattice case as well. However, counting
the lattice points which have the same argument of the incomplete gamma function in (25) is not as easy.
The reason is that, unlike in the infinite-lattice case, the difference between two lattice points in $D_L^k$, $s_a^{k} - s_t^{k}$,
is not necessarily another lattice point in $D_L^k$. Thus, the lattice enumeration that we used in the previous
section cannot be applied directly.
More formally, the number of subset lattice points in the $k$-dimensional sphere is given by

$$\frac{1}{L^k} \sum_{l} \left| \left\{ (s_t^{k}, s_a^{k}) : s_t^{k}, s_a^{k} \in D_L^k,\; \|s_t^{k} - s_a^{k}\|^2 = l \right\} \right| \, \gamma\left( \frac{d^2}{2(\sigma^2 + l)},\; \frac{n-m+k}{2} \right),$$

where counting the number of elements of the set

$$\left\{ (s_t^{k}, s_a^{k}) \;:\; s_t^{k}, s_a^{k} \in D_L^k,\; \|s_t^{k} - s_a^{k}\|^2 = l \right\}$$

appears to be complicated.
For this, we propose a modification of Euler's generating function technique. In particular, for various finite lattices we will define generating polynomials that, when combined appropriately, perform the counting
operations for us.
1. $D_2^k$: In this case, for any pair $(s_t^{k}, s_a^{k})$, the entrywise absolute difference $|s_a^{k} - s_t^{k}|$ is comprised of zero and one entries. The number of such vectors whose
squared norm is $l$ is clearly $\binom{k}{l}$, regardless of the transmitted point $s_t^{k}$.
2. $D_4^k$: In this case, not all points in the constellation are the same. For each entry of $s_t^{k}$, we can
distinguish between the corner points $\{-\frac{3}{2}, \frac{3}{2}\}$ and the center points $\{-\frac{1}{2}, \frac{1}{2}\}$, as illustrated in Figure 7.

Figure 7: Corner and center points of the constellation $D_4 = \{-\frac{3}{2}, -\frac{1}{2}, \frac{1}{2}, \frac{3}{2}\}$.

Extending Euler's idea, for the corner points we identify the generating polynomial

$$\theta_0(x) = 1 + x + x^4 + x^9, \tag{33}$$

and for the center points

$$\theta_1(x) = 1 + 2x + x^4. \tag{34}$$
Essentially, the powers in the polynomials $\theta_0(x)$ and $\theta_1(x)$ contain information about possible squared
distances between an arbitrary point $s_a^{k}$ and the transmitted point $s_t^{k}$. For instance, if an entry in the
transmitted vector $s_t^{k}$, say $s_{t,1}$, is a corner point, then $s_{a,1} - s_{t,1} \in \{0, \pm 1, \pm 2, \pm 3\}$, depending on
$s_{a,1} \in D_4$. Thus the squared norm of their difference, $|s_{a,1} - s_{t,1}|^2$, can be either 0, 1, 4, or 9, as
described by the powers of $\theta_0(x)$. On the other hand, if $s_{t,1}$ is a center point, then $s_{a,1} - s_{t,1} \in
\{0, 1, -1, 2\}$ (which explains the coefficient 2 in front of the term $x$ in $\theta_1(x)$). Now, if among the $k$
entries of $s_t^{k}$ we choose a corner point $j$ times, the number of ways $\|s_t^{k} - s_a^{k}\|^2$ can add up to $l$ is given by the coefficient of $x^l$ in the polynomial

$$\binom{k}{j}\, \theta_0^j(x)\, \theta_1^{k-j}(x). \tag{35}$$
3. $D_8^k$: In this case,

$$D_8^k = \left\{ -\frac{7}{2}, -\frac{5}{2}, -\frac{3}{2}, -\frac{1}{2}, \frac{1}{2}, \frac{3}{2}, \frac{5}{2}, \frac{7}{2} \right\}^k. \tag{36}$$

Similar to the $L = 4$ case, we can identify the following polynomials for counting $s_a - s_t$ in the $D_8^k$
lattice, one for each of the four symmetric point classes $S_i = \{\pm\frac{7-2i}{2}\}$, $i \in \{0, 1, 2, 3\}$ (the last two polynomials follow from the same pattern as the first two, which are the ones that survive in our source):

$$\theta_0(x) = 1 + x + x^4 + x^9 + x^{16} + x^{25} + x^{36} + x^{49},$$
$$\theta_1(x) = 1 + 2x + x^4 + x^9 + x^{16} + x^{25} + x^{36},$$
$$\theta_2(x) = 1 + 2x + 2x^4 + x^9 + x^{16} + x^{25},$$
$$\theta_3(x) = 1 + 2x + 2x^4 + 2x^9 + x^{16}. \tag{37}$$
Therefore, if among the $k$ entries of $s_t^{k}$ we choose $j_i$ points from $S_i$, $i \in \{0, 1, 2, 3\}$, then the number of
ways $\|s_t^{k} - s_a^{k}\|^2$ can add up to $l$ is given by the coefficient of $x^l$ in the polynomial

$$\binom{k}{j_0, j_1, j_2, j_3}\, \theta_0^{j_0}(x)\, \theta_1^{j_1}(x)\, \theta_2^{j_2}(x)\, \theta_3^{j_3}(x), \tag{38}$$

where $j_0 + j_1 + j_2 + j_3 = k$ and

$$\binom{k}{j_0, j_1, j_2, j_3} = \frac{k!}{j_0!\, j_1!\, j_2!\, j_3!}.$$
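The generating-polynomial bookkeeping is easy to mechanize. The sketch below (our own naming, assuming NumPy) computes, for $D_4^k$, the average number of candidate points at each squared distance $l$ from a uniformly drawn transmitted point, i.e., $2^{-k}\sum_j$ of the coefficients of (35):

```python
import numpy as np
from math import comb

def d4_spectrum(k):
    """Average number of s_a in D_4^k at squared distance l (l = 0..9k) from a
    uniformly drawn s_t, using the generating polynomials (33)-(34)."""
    lmax = 9 * k                         # max squared distance per entry is 9
    theta0 = np.zeros(lmax + 1); theta0[[0, 1, 4, 9]] = 1.0        # corners
    theta1 = np.zeros(lmax + 1); theta1[[0, 4]] = 1.0; theta1[1] = 2.0  # centers
    spectrum = np.zeros(lmax + 1)
    for j in range(k + 1):               # j corner entries, k - j center entries
        poly = np.array([1.0])
        for _ in range(j):
            poly = np.convolve(poly, theta0)[:lmax + 1]
        for _ in range(k - j):
            poly = np.convolve(poly, theta1)[:lmax + 1]
        spectrum[:poly.size] += comb(k, j) * poly / 2.0**k
    return spectrum
```

A useful check: the entries of the returned array sum to $4^k$, the total number of candidate points in $D_4^k$.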
We can now summarize the above results for the expected computational complexity of the Fincke-Pohst
algorithm for finite lattices in the following theorem.
Theorem 2 (Expected complexity of sphere decoding over a finite lattice). Consider the model

$$x = Hs + v,$$

where $v \in \mathbb{R}^{n \times 1}$ is comprised of iid $\mathcal{N}(0, 1)$ entries, $H \in \mathbb{R}^{n \times m}$ is comprised of iid $\mathcal{N}(0, \rho/m)$ entries,
and $s \in D_L^m$ is an $m$-dimensional vector whose entries are elements of an $L$-PAM constellation. Then the
expected complexity of the sphere decoding algorithm of Section 3.1 with a search radius of $d$ for solving
the integer least-squares problem

$$\min_{s \in D_L^m} \|x - Hs\|^2$$

is given (with $f_p(k)$ the number of elementary operations per visited point⁴), for $L = 2$, by

$$C(m, \rho, d^2) = \sum_{k=1}^{m} f_p(k) \sum_{l=0}^{k} \binom{k}{l}\, \gamma\left( \frac{d^2}{2\left(1 + \frac{12\, l\, \rho}{m(L^2 - 1)}\right)},\; \frac{n-m+k}{2} \right), \tag{39}$$

for $L = 4$, by

$$C(m, \rho, d^2) = \sum_{k=1}^{m} f_p(k) \sum_{q} \frac{1}{2^k} \sum_{j=0}^{k} g_{k,j}(q)\, \gamma\left( \frac{d^2}{2\left(1 + \frac{12\, q\, \rho}{m(L^2 - 1)}\right)},\; \frac{n-m+k}{2} \right), \tag{40}$$

where $g_{k,j}(q)$ is the coefficient of $x^q$ in the polynomial (35), and, for $L = 8$, by

$$C(m, \rho, d^2) = \sum_{k=1}^{m} f_p(k) \sum_{q} \frac{1}{4^k} \sum_{j_0 + j_1 + j_2 + j_3 = k} g_k^{j_0 j_1 j_2 j_3}(q)\, \gamma\left( \frac{d^2}{2\left(1 + \frac{12\, q\, \rho}{m(L^2 - 1)}\right)},\; \frac{n-m+k}{2} \right), \tag{41}$$

where $g_k^{j_0 j_1 j_2 j_3}(q)$ is the coefficient of $x^q$ in the polynomial (38).
We remark that to obtain the optimal solution to the integer least-squares problem we will occasionally
need to increase the search radius d, and so we can obtain a result similar to that of Corollary 1, which we
omit for brevity.
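Under the reconstruction of (39) above, the 2-PAM expression is equally simple to evaluate; a sketch assuming SciPy (the function name is ours, and the placement of $\rho$ and $d^2$ follows our reading of (39)):

```python
from math import comb
from scipy.special import gammainc

def expected_complexity_2pam(m, n, rho, d2, L=2):
    """Sketch of (39); fp(k) = 2k + 11 is reused here, although footnote 4
    notes it differs slightly in the finite-lattice case."""
    total = 0.0
    for k in range(1, m + 1):
        fp = 2 * k + 11
        for l in range(k + 1):
            arg = d2 / (2.0 * (1.0 + 12.0 * l * rho / (m * (L**2 - 1))))
            total += fp * comb(k, l) * gammainc((n - m + k) / 2.0, arg)
    return total
```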
5 Conclusion
In many communication problems, maximum-likelihood detection reduces to solving an integer least-squares
problem, i.e., a search for the closest integer lattice point to the given vector. In such applications ML detection is rarely performed, on the grounds that it requires exponential complexity and is therefore computationally intractable. However, what distinguishes the communications applications from many other instances
of integer-least-squares problems is that the given vector is not arbitrary, but rather is an unknown lattice
point that has been perturbed by an additive noise vector with known statistical properties. Therefore, the
complexity of finding the closest lattice point for any given algorithm should, in fact, be viewed as a random
variable. This is the approach taken in this paper and, for the sphere decoding algorithm of Fincke and
Pohst, we obtained a closed-form expression for the mean of the complexity, averaged over both the noise and the lattice.
This was done for both finite and infinite lattices.

⁴ Since $D_L^m$ is a shifted integer lattice, we assume that each rounding in step 2 of the algorithm in Section 3.1 takes $L - 1$ operations.
Hence $f_p(k)$ differs slightly from the one used to find the expected complexity of sphere decoding in the infinite lattice $\mathbb{Z}^m$.
Based on these closed-form expressions, in the second part of this paper, we will demonstrate that over a
wide range of SNRs, rates and dimensions, the expected complexity of sphere decoding is polynomial (often
roughly cubic), which implies the practical feasibility of sphere decoding in many applications. The second
part of this paper will deal with various generalizations of this result and will also address the computation
of the variance of the complexity of the sphere decoding algorithm.
We should also mention that there are many variants of the sphere decoding algorithm, some of which are
mentioned in Sec. 7 of the second part of this paper. While these algorithms generally outperform the standard
sphere decoder described here, the computation of their expected complexity appears to be a formidable
task. Our results may therefore be viewed as upper bounds for these more powerful algorithms.
Appendix A: Proof of Lemma 1
Let us start with part 1. Since $H$ is rotationally-invariant from the left, the matrix $\Theta H$ has the same distribution as $H$ for any unitary matrix $\Theta$. The following simple calculation shows that the same is true for any
random unitary $\Theta$ that is independent of $H$:

$$p(\Theta H) = \int p(\Theta H \mid \Theta)\, p(\Theta)\, d\Theta = \int p(H)\, p(\Theta)\, d\Theta = p(H),$$

where in the second step we used the fact that $\Theta$ is independent of $H$ to conclude $p(\Theta H \mid \Theta) = p(H)$.
In particular, the above equation is true for an independent isotropically random unitary $\Theta$.⁵ Now, for such
a $\Theta$, we have

$$\Theta H = (\Theta Q) \begin{bmatrix} R \\ 0 \end{bmatrix},$$
from which, due to the uniqueness of the QR decomposition when $R$ has positive diagonal entries (see, e.g.,
[23, 24]), we conclude that $\Theta Q$ is the "Q" and $R$ remains the "R" in the QR decomposition of $\Theta H$. Now, since
$\Theta$ is an independent isotropically random unitary, $\Theta Q$ is also isotropically random and, moreover, it is independent of
$Q$. Therefore $Q$ must be independent of $R$, as well. Since $H$ and $\Theta H$ have the same distribution, the Q's
in their QR decompositions must have the same distribution, from which we conclude that $Q$ must be an
isotropically random unitary matrix, independent of $R$.
This concludes the proof of part 1. [We remark that the proof of part 1 only required that H be
rotationally-invariant. We did not require the independence of the columns of H. This independence is
required for the proof of part 2, to which we now turn our attention.]
Consider the partitioning

$$H = \begin{bmatrix} H_{m-k,m-k} & H_{m-k,k} \\ H_{n-m+k,m-k} & H_{n-m+k,k} \end{bmatrix},$$
where the subscripts indicate the dimensions of the sub-matrices. Now consider the QR decomposition of
the leading $m-k$ columns of $H$, i.e.,

$$\begin{bmatrix} H_{m-k,m-k} \\ H_{n-m+k,m-k} \end{bmatrix} = Q_1 \begin{bmatrix} P_{m-k,m-k} \\ 0 \end{bmatrix},$$

where $Q_1$ is $n \times n$ and unitary, and $P_{m-k,m-k}$ is upper triangular with non-negative diagonal entries.
⁵ For some of the properties of isotropically random unitary matrices, the reader may refer to [21, 22].
Now, since $P_{m-k,m-k}^* P_{m-k,m-k} = H_{m-k,m-k}^* H_{m-k,m-k} + H_{n-m+k,m-k}^* H_{n-m+k,m-k}$, which is the leading
$(m-k) \times (m-k)$ block of $H^* H$, the uniqueness of the QR decomposition implies that $P_{m-k,m-k} = R_{m-k,m-k}$. We may therefore write

$$Q_1^* H = \begin{bmatrix} R_{m-k,m-k} & \tilde{H}_{m-k,k} \\ 0 & \tilde{H}_{n-m+k,k} \end{bmatrix}. \tag{A.1}$$
Now, since $Q_1$ depends only on the first $m-k$ columns of $H$, and these are independent of the remaining
$k$ columns, the matrix $\tilde{H}_{n-m+k,k}$ has the same distribution as $H_{n-m+k,k}$. Consider now the QR decomposition

$$\tilde{H}_{n-m+k,k} = Q_2 \begin{bmatrix} R_{k,k} \\ 0 \end{bmatrix},$$

so that

$$\begin{bmatrix} I & 0 \\ 0 & Q_2^* \end{bmatrix} Q_1^* H = \begin{bmatrix} R_{m-k,m-k} & \tilde{H}_{m-k,k} \\ 0 & R_{k,k} \\ 0 & 0 \end{bmatrix},$$

and so

$$H = Q_1 \begin{bmatrix} I & 0 \\ 0 & Q_2 \end{bmatrix} \begin{bmatrix} R_{m-k,m-k} & \tilde{H}_{m-k,k} \\ 0 & R_{k,k} \\ 0 & 0 \end{bmatrix}. \tag{A.2}$$

Since $Q_1 \begin{bmatrix} I & 0 \\ 0 & Q_2 \end{bmatrix}$ is unitary, and since the diagonal entries of $R_{m-k,m-k}$ and $R_{k,k}$ are non-negative, we
conclude that this is indeed the QR decomposition of $H$ (which justifies our use of the notation $R_{k,k}$ for the
R in the QR of $\tilde{H}_{n-m+k,k}$). Since $\tilde{H}_{n-m+k,k}$ and $H_{n-m+k,k}$ have the same distribution, we conclude that $R_{k,k}$
has the same distribution as the R obtained from the QR decomposition of $H_{n-m+k,k}$.
This concludes the proof of part 2.
Appendix B: Representing Integers as the Sum of Squares

A classical result gives the number of representations of a positive integer $l$ as the sum of two squares:

$$r_2(l) = 4\left( d_1(l) - d_3(l) \right), \tag{B.1}$$

where $d_1(l)$ and $d_3(l)$ are the number of divisors of $l$ congruent to 1 and 3 mod 4, respectively.
In 1770, Lagrange proved his famous Four Squares Theorem, which states that every positive integer
can be represented as the sum of four squares. This essentially establishes that $r_4(l) > 0$ for all positive
integers $l$; however, Lagrange did not give an explicit formula for $r_4(l)$.
In terms of computing the value of $r_k(l)$, the first result is due to Euler, who introduced (what is now
known as) the Jacobi theta function

$$\theta(x) = \sum_{m=-\infty}^{\infty} x^{m^2} = 1 + 2\sum_{m=1}^{\infty} x^{m^2}, \tag{B.2}$$

whose $k$-th power generates the desired representation numbers,

$$\theta^k(x) = 1 + \sum_{l=1}^{\infty} r_k(l)\, x^l. \tag{B.3}$$
In other words, the number of ways a non-negative integer $l$ can be represented as the sum of $k$ squares is
given by the coefficient of $x^l$ in the expansion of $\theta^k(x)$.⁸

⁸ In fact, Waring considered the much more general problem of determining the number of ways an integer can be represented as
the sum of $k$ integers each raised to the power $q$. In this sense, the number of ways an integer can be represented as the sum of $k$ squares
is essentially the $q = 2$ Waring problem.
References
[1] M. Grötschel, L. Lovász, and A. Schrijver, Geometric Algorithms and Combinatorial Optimization. Springer-Verlag, 2nd ed., 1993.

[2] M. Ajtai, "The shortest vector problem in L2 is NP-hard for randomized reductions," Proceedings of the 30th Annual ACM Symposium on Theory of Computing, pp. 10-19, 1998.

[3] G. J. Foschini, "Layered space-time architecture for wireless communication in a fading environment when using multi-element antennas," Bell Labs Tech. J., vol. 1, no. 2, pp. 41-59, 1996.

[4] B. Hassibi, "An efficient square-root algorithm for BLAST," submitted to IEEE Trans. Sig. Proc., 2000. Download available at http://mars.bell-labs.com.

[5] A. K. Lenstra, H. W. Lenstra, and L. Lovász, "Factoring polynomials with rational coefficients," Math. Ann., pp. 515-534, 1982.

[6] M. O. Damen, A. Chkeif, and J.-C. Belfiore, "Lattice code decoder for space-time codes," IEEE Comm. Let., pp. 161-163, May 2000.

[7] B. Hassibi and B. Hochwald, "High-rate codes that are linear in space and time," IEEE Trans. Info. Theory, vol. 48, pp. 1804-1824, July 2002.

[8] R. Kannan, "Improved algorithms on integer programming and related lattice problems," Proc. 15th Annu. ACM Symp. on Theory of Computing, pp. 193-206, 1983.

[9] J. Lagarias, H. Lenstra, and C. Schnorr, "Korkin-Zolotarev bases and successive minima of a lattice and its reciprocal," Combinatorica, vol. 10, pp. 333-348, 1990.

[10] A. Korkin and G. Zolotarev, "Sur les formes quadratiques," Math. Ann., vol. 6, pp. 366-389, 1873.

[11] M. Pohst, "On the computation of lattice vectors of minimal length, successive minima and reduced basis with applications," ACM SIGSAM Bull., vol. 15, pp. 37-44, 1981.

[12] U. Fincke and M. Pohst, "Improved methods for calculating vectors of short length in a lattice, including a complexity analysis," Mathematics of Computation, vol. 44, pp. 463-471, April 1985.

[13] A. Hassibi and S. Boyd, "Integer parameter estimation in linear models with applications to GPS," IEEE Transactions on Signal Processing, vol. 46, pp. 2938-2952, November 1998.

[14] H. Vikalo, Sphere Decoding Algorithms for Digital Communications. PhD Thesis, Stanford University, 2003.

[15] J. Conway and N. Sloane, Sphere Packings, Lattices and Groups. Springer-Verlag, 1993.

[16] J. Wang, "Average-case computational complexity theory," Complexity Theory Retrospective II, pp. 295-328, 1997.

[17] L. Levin, "Average case complete problems," SIAM Journal on Computing, vol. 15, pp. 285-286, 1986.

[18] Y. Gurevich, "Average case completeness," Journal of Computer and System Sciences, vol. 42, no. 3, pp. 346-398, 1991.

[19] M. L. Mehta, Random Matrices. Academic Press, 2nd ed., 1991.

[20] G. Hardy, Ramanujan: Twelve Lectures. Chelsea Publishing, 1940.

[21] B. M. Hochwald and T. L. Marzetta, "Unitary space-time modulation for multiple-antenna communication in Rayleigh flat-fading," IEEE Trans. Info. Theory, vol. 46, pp. 543-564, Mar. 2000.

[22] B. Hassibi and T. L. Marzetta, "Block-fading channels and isotropically-random unitary inputs: The received signal density in closed-form," IEEE Trans. Info. Theory, vol. 48, pp. 1473-1484, June 2002.

[23] R. Horn and C. Johnson, Topics in Matrix Analysis. Cambridge University Press, 1991.

[24] G. Golub and C. Van Loan, Matrix Computations. Baltimore: Johns Hopkins University Press, 2nd ed., 1989.

[25] V. Kac and M. Wakimoto, "Integrable highest weight modules over affine superalgebras and Appell's function," Comm. Math. Phys., vol. 215, no. 3, pp. 631-682, 2001.

[26] S. Milne, "Infinite families of exact sums of squares formulas, Jacobi elliptic functions, continued fractions, and Schur functions," Ramanujan J., vol. 6, no. 1, pp. 7-149, 2002.

[27] I. Peterson, "Surprisingly square," Science News, vol. 159, June 2001.

[28] M. Knopp, Modular Functions in Analytic Number Theory. Chicago, IL: Markham Publishing Company, 1970.

[29] S. Wolfram, The Mathematica Book. Cambridge University Press, 4th ed., 1999.