A Subspace Approach To Blind Space-Time Signal Processing For Wireless Communication Systems
I. INTRODUCTION
It is common to assume at this point that all channels are FIR filters of length at most L symbol periods. Collecting N symbol-rate snapshots of the (oversampled) outputs of all channels into an M x N data matrix X, where M denotes the total number of observables per symbol period (the number of antennas times the oversampling factor), the data model can then be written as

    X = H S,                                            (2)

where H contains the samples of the channel impulse responses of the d sources and S is a block-Toeplitz matrix with dL rows whose entries are the transmitted symbols.
If the symbols belong to a real alphabet (e.g., BPSK), then the matrix obtained by stacking the real and imaginary parts of X has a factorization

    [Re(X); Im(X)] = [Re(H); Im(H)] S,                  (3)

while S itself is unchanged. This effectively doubles the number of observables while halving the noise power on each entry.
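For real alphabets this stacking is a one-line operation; the following minimal numpy sketch illustrates it under the BPSK assumption used later in the paper (the function name is ours, not the paper's).

import numpy as np

def stack_real_imag(X):
    # For a real symbol alphabet, X = H S with S real implies
    # [Re(X); Im(X)] = [Re(H); Im(H)] S: twice as many observables,
    # each carrying half of the complex noise power.
    return np.vstack([X.real, X.imag])

# illustrative use: 3 observables, 100 snapshots of complex data
X = np.random.randn(3, 100) + 1j * np.random.randn(3, 100)
X_tilde = stack_real_imag(X)        # real-valued, shape (6, 100)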
The algorithms we consider in this paper rely on the existence of a filtering matrix W such that WX = S. This implies that the row span of S is equal to (or contained in) the row span of X. For this to be true, it is necessary that H has full column rank, which implies that M >= dL.
This may put undue requirements on the number of antennas or the oversampling rate. However, it is possible to ease this condition by making use of time invariance and the structure of S. Extending X to a block-Hankel matrix 𝒳 by left shifting and stacking m times (as discussed later, m can be viewed as the length of the time window used by the space-time equalizer), we obtain¹

    𝒳 = ℋ 𝒮                                             (4)

(the shifts of X to the left are each over one position), and the objective becomes, for given 𝒳, to determine factors ℋ and 𝒮 of the indicated structure such that the entries of 𝒮 belong to the finite alphabet. As we show in the sequel, identification is possible if this is a minimal-rank factorization. Necessary conditions for 𝒳 to have a unique factorization 𝒳 = ℋ𝒮 are that ℋ is a tall matrix and that 𝒮 is a wide matrix, which for given m leads to

    mM >= d(L + m − 1),    N − m + 1 >= d(L + m − 1).   (5)

Given sufficient data, only the condition M > d poses a fundamental identifiability restriction.

¹The subscript denotes the number of block rows in 𝒮; we usually omit the subscript if this does not lead to confusion.
Note that these conditions are not sufficient for ℋ and 𝒮 to have full rank. One case where ℋ does not have full rank is when the channels do not have equal lengths, in which case the rank of ℋ is at most d(m − 1) + (L_1 + ... + L_d), where L_j is the length of the jth channel. Ill-conditioned cases might occur when the channels are bandwidth limited, so that sampling faster than the Nyquist rate does not provide independent linear combinations of the same symbols. In principle, the maximal effective oversampling factor is given by the ratio of the Nyquist rate and the symbol rate [27]. (There may be other practical reasons to select a larger oversampling factor, e.g., to correct for errors in carrier recovery. This is not considered here.)
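As a concrete illustration of the stacking in (4) and of the dimension count behind (5), the sketch below builds the block-Hankel data matrix from m left-shifted copies of X and checks the tall/wide conditions. It is a minimal reconstruction under the notation introduced above (M observables, d sources, channel length L); the helper names are ours.

import numpy as np

def block_hankel(X, m):
    # Stack m left-shifted copies of the M x N data matrix X.
    # The result has m*M rows and N - m + 1 columns; block row i is X
    # shifted left over i positions.
    M, N = X.shape
    return np.vstack([X[:, i:N - m + 1 + i] for i in range(m)])

def dimension_conditions(M, N, d, L, m):
    # Both factors need at least d*(L + m - 1) rows/columns available:
    # the channel factor must be tall, the symbol factor must be wide.
    needed = d * (L + m - 1)
    return m * M >= needed and (N - m + 1) >= needed

For instance, with d = 2 sources, channel length L = 4, and M = 4 observables, the first condition is met from m = 3 onwards, which is only possible because M > d.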
For SISO models (d = 1), the condition that ℋ is of full rank is usually formulated in terms of common zeros: if the z-transforms of the rows of H do not have a root in common, then ℋ has full column rank for all sufficiently large m (viz., e.g., [7], [26]). For arbitrary channels, this technical condition holds almost surely. In the FIR-MIMO case, the corresponding requirement is that H(z) is irreducible and column reduced (viz. [22]).
III. SUBSPACE-BASED APPROACHES
According to the previous section, the basic problem in solving the blind FIR-MIMO problem is, for a given matrix 𝒳, to find a factorization 𝒳 = ℋ𝒮, where 𝒮 is block-Toeplitz with entries taken from the finite alphabet. If we assume that the data matrix 𝒳 is corrupted by additive white Gaussian noise, then the maximum likelihood criterion yields the nonlinear least squares minimization problem

    min over ℋ, 𝒮 of ‖𝒳 − ℋ𝒮‖_F²,  subject to 𝒮 being block-Toeplitz with finite-alphabet entries.   (6)
To find an exact solution of this nonlinear optimization problem is computationally formidable. It is possible to approach the optimum via iterative techniques that alternately estimate ℋ and 𝒮, starting from some initial estimate [28]. This approach is still computationally expensive due to the repeated enumeration of all possible symbol sequences of a given length using the Viterbi algorithm. In addition, the initial point has to be quite accurate in order to converge to the global minimum rather than to one of the numerous local minima.
The subspace-based approaches derived in this section simplify the problem by breaking it up into two subproblems. Suppose that the channels have equal lengths and that the conditions (5) are satisfied. Then ℋ has full column rank and 𝒮 has full row rank, so that

    row(𝒳) = row(𝒮),    col(𝒳) = col(ℋ).
To factor 𝒳 into ℋ𝒮, the strategy is to find either 𝒮, which is a block-Toeplitz matrix with a specified row span, or ℋ, which is a block-Hankel matrix with a specified column span. In the scalar case (d = 1 signal), a number of algorithms have been proposed for doing the latter, in particular [7] and [8], and it is straightforward to extend these algorithms to the vector case (d > 1), presuming the channel lengths are all equal. However, for d > 1, subspace information alone leads to an ambiguity: 𝒳 = (ℋ𝒯⁻¹)(𝒯𝒮) is a factorization with the same subspaces for 𝒯 = diag(T, ..., T) and any invertible d x d matrix T. This ambiguity is resolved in a second step by taking advantage of the finite-alphabet property of the signals.

We outline three approaches: one that directly estimates 𝒮 from its row span, as was originally proposed in [20], [1], and [2]; then an entirely equivalent but computationally more attractive version; and, finally, an approach in which ℋ is estimated first. In the absence of noise, all approaches give exact results. Note that none of these approaches provides a factorization 𝒳 = ℋ𝒮 in which both factors are forced to have the required Toeplitz or Hankel structure, so that they are suboptimal in that respect.
As argued above, the block-Toeplitz matrix 𝒮 satisfies

    row(𝒮) = row(𝒳).                                    (7)

Given 𝒳, we take V̂ to be a matrix whose rows form a basis for this row span. Hence, the Toeplitz matrix 𝒮 is determined uniquely up to multiplication at the left by a factor diag(T, ..., T). Now, to identify S, we have to find a factorization of the form TS with T invertible, which, in the case of finite-alphabet signals, can be done using a suitable I-MIMO signal separation algorithm, as outlined in Section III-C.
The computation of V̂ calls for an SVD of 𝒳, which is a matrix with dimensions of the order of the full data set. Hence, this approach requires a number of operations that grows quickly with the data size and is not feasible for more than moderate block lengths. It is possible to alleviate the computational requirements since we need only the basis vectors in the null space, which does not require a full SVD. For example, a spherical subspace updating algorithm, if applicable, would yield a much lower complexity.
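The first step, computing an orthonormal basis of the signal row span, can be summarized in a few lines of numpy. This sketch uses a plain economy-size SVD with a simple threshold rule; it stands in for, but is not, the spherical subspace updating algorithm mentioned above.

import numpy as np

def signal_row_basis(Xm, rank=None, rel_tol=1e-8):
    # Estimate an orthonormal basis V_hat for the row span of the
    # (block-Hankel) data matrix Xm.  Without noise, rank(Xm) equals
    # d*(L + m - 1); with noise, the rank is decided by thresholding
    # the singular values unless it is passed in explicitly.
    U, sv, Vh = np.linalg.svd(Xm, full_matrices=False)
    if rank is None:
        rank = int(np.sum(sv > rel_tol * sv[0]))
    return Vh[:rank, :], sv     # rows of Vh[:rank] form the basis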
Row Span Intersections: We again consider (7) and let V̂ be a basis for row(𝒳) as determined in the first step. Define V_T^(n) as in (9), a stack of shifted copies of V̂ that are realigned so that, as in (10), the row spans of the shifted copies can be intersected. The number of block columns of V_T^(n) is equal to n, where n is a parameter chosen equal to the channel length (or maybe smaller, as we will propose later). The blocks are each shifted down over one position.
If V_T^(n) is a wide matrix (this gives additional conditions on the dimensions), then it determines the intersection of the shifted row spans, but only up to a left invertible d x d matrix, because any nonsingular transformation of the intersection basis, as in (11), spans the same subspace. The shifted copies of 𝒮 that appear in this construction are each block-Toeplitz and shifted over one entry (12); the accuracy of the estimate can be expressed in terms of the distance dist(row(·), row(·)) between the estimated and exact row spans (14).
The matrices in this construction summarize the stacks of identity matrices present in (8), which is possible because we are only interested in the singular values and right singular vectors of V_T^(n), and these do not change when a stack of identity matrices is replaced by a single, suitably scaled identity. This is immediately seen by forming the product of V_T^(n) with its conjugate transpose and observing that it is the same as the square of (11).

The estimated basis for the intersection is given by the right singular vectors of V_T^(n) that correspond to the largest singular values: by Appendix A, those that are equal to sqrt(n) (if there is no noise). As we will motivate later in Section IV-D, the next largest singular values are close to sqrt(n − 1). Thus, the ISI filtering process is based on distinguishing singular values between sqrt(n) and sqrt(n − 1). It is clear that for large n, this becomes a delicate matter. This motivates us to keep n small, i.e., not to make the stacking parameter larger than necessary.
The relation between V_T^(n) and its complementary counterpart in (8) is given by (13) (cf. Appendix A). Hence, the right singular vectors of the two stacked matrices are pairwise identical, except for a reversal in ordering. In addition, their squared singular values pairwise add up to n. This is independent of any noise influence and is entirely caused by the fact that we took orthonormal bases of complementary subspaces. Hence, the null span union method is just as delicate: the two methods give exactly the same results and have the same robustness and sensitivity to noise.
We let d̂_S denote the dimension of the estimated basis of the intersection. Under the (noise-free) conditions specified in Section IV-A, with the full number of intersections, we obtain d̂_S = d, i.e., the intersecting subspace is precisely the row span of the symbol matrix.
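The intersection step itself again reduces to an SVD. The sketch below stacks orthonormal bases of the shifted subspaces and keeps the right singular vectors whose singular values are close to sqrt(n), as described above; the realignment bookkeeping of (9) and (10) is omitted, so this illustrates the principle rather than reproducing the paper's exact construction.

import numpy as np

def subspace_intersection(bases):
    # `bases` is a list of n matrices whose rows are orthonormal bases of
    # the (shifted, realigned) subspaces.  A unit-norm vector lying in all
    # n subspaces yields a singular value sqrt(n) of the stacked matrix,
    # while a vector lying in only n-1 of them yields a value near
    # sqrt(n-1); the threshold is placed between the two.
    n = len(bases)
    stacked = np.vstack(bases)
    U, sv, Vh = np.linalg.svd(stacked, full_matrices=False)
    thresh = 0.5 * (np.sqrt(n) + np.sqrt(n - 1))
    return Vh[sv > thresh, :], sv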
TABLE I. ILSF ALGORITHM
We thus have to solve a factorization problem of the form

    V̂ = TS,   with the entries of S in the finite alphabet,   (15)

which is precisely the problem studied in [16] and [17]. There, two iterative block algorithms are introduced, ILSE and ILSP, which are summarized below. Starting from an initial estimate T̂_0, the algorithms proceed as follows for k = 1, 2, ...:

ILSE
a) Given T̂_(k−1), estimate each column of Ŝ_k by enumeration over the finite alphabet (the arg min of the residual).
b) Given Ŝ_k, update T̂_k by least squares.

ILSP
a) Given T̂_(k−1), compute Ŝ_k = Proj(T̂_(k−1)† V̂), where Proj denotes entrywise projection onto the closest alphabet symbol.
b) Given Ŝ_k, update T̂_k = V̂ Ŝ_k† by least squares.
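A minimal numpy sketch of the ILSP iteration, specialized to BPSK symbols (entrywise projection onto +/-1), is given below; the initialization strategy, the ILSE enumeration step, and the ILSF variant of Table I are omitted, and the variable names are ours.

import numpy as np

def ilsp_bpsk(V, d, n_iter=20, T0=None):
    # Alternating estimation for V ~= T S with S in {-1,+1}^(d x N):
    #   a) S = Proj(pinv(T) V)   (project onto the BPSK alphabet)
    #   b) T = V pinv(S)         (least-squares update of the mixture)
    T = np.eye(d, dtype=complex) if T0 is None else T0.copy()
    for _ in range(n_iter):
        S = np.sign(np.real(np.linalg.pinv(T) @ V))
        S[S == 0] = 1.0
        T = V @ np.linalg.pinv(S)
    return T, S

As stated in Section IV-A, the factors are recovered only up to a permutation of the sources and a scaling by +/-1, and the iteration is sensitive to the initial estimate T̂_0.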
In the third approach, a basis for the column span of 𝒳 is estimated first. Then, writing out the block structure of ℋ and assuming that 𝒮 has full row rank, we obtain a linear system of equations of the form (16). If the matrix on the left in (16) is tall (this gives minimal conditions on m), then generically its right null space specifies ℋ up to a right block-diagonal factor diag(T, ..., T). For any solution ℋ̂, the corresponding symbol estimates are found from an inverse as Ŝ = ℋ̂†𝒳, where vec(Ŝ) is a stack of all input data. At this point, we are back at the model V̂ = TS, and the ILSE/P/F algorithm is employed to remove the ambiguity that T represents.
For the estimation of ℋ, it is only required that 𝒮 be of full row rank, which is a mild condition. In particular, it is not necessary that all channels have equal length, although certain modifications are in order (see [22], which also contains some identifiability results).

It is unclear whether a direct estimation of ℋ is to be preferred over an indirect estimation via 𝒮. The former initially forces only the structure of ℋ, neglecting that of 𝒮, whereas the latter does the opposite. In general, estimating ℋ is computationally easier for large N and can be done consistently. Our experience with simulations, however, is that estimating 𝒮 directly might be more accurate in the presence of model mismatch (see Section IV-E). In addition, if the channel lengths are not well defined (i.e., the FIR assumption is only approximately true), row span methods can potentially obtain a better model fit. This is because they do not force zeros in the lower right block of ℋ but have the freedom to insert the actual (nonzero) coefficients instead. Finally, without going into details, we mention that the row span methods are almost immediately applicable to more general ARMA (rational) channel models, in which a state space model is assumed.
IV. ASPECTS OF THE ALGORITHM
A. Identifiability
Does the intersection/FA algorithm provide a unique estimate of 𝒮? This identifiability issue is the subject of the following theorem. Similar results for d = 1 were presented in [7], but from the point of view of estimating ℋ from its Hankel structure. An alternative proof appears in [26].
Theorem 1: Consider the FIR-MIMO scenario with d sources and channels of equal length L. Suppose that the dimension conditions (5) are satisfied for some m and that ℋ and 𝒮 have full column rank and full row rank, respectively. Let 𝒳 = ℋ𝒮 be a structured factorization of 𝒳.

Taking only the Toeplitz structure into account, 𝒮 is uniquely specified by the condition row(𝒮) = row(𝒳) up to a left block-diagonal factor 𝒯 = diag(T, ..., T), where T is an invertible d x d matrix.

Taking also the FA property into account, under the conditions of [17, Theorem 3.2],² 𝒮 is unique up to 𝒯 = diag(T, ..., T), where T can take the form of a permutation and a diagonal scaling by +/-1.

We first derive the following lemma, where n can be any number of intersections between 1 and the maximal number.

²This theorem basically requires that S contains all possible d-dimensional columns that can be generated by the finite alphabet. This is a sufficient but pessimistically large condition on N.
Lemma 1: For i = 1, ..., n, let V̂_i be an orthonormal basis of the row span of the ith shifted copy, and define V_T^(n) as in (9). Under the conditions of Theorem 1, the intersection of these row spans is a subspace of dimension d(L + m − n) and contains the rows of S that are common to all n shifts.
Proof of the Lemma: The rank condition on 𝒮 implies that S has full row rank. In turn, this implies that every sub-stack of block rows of 𝒮 has full row rank as well (since any subset of the rows of 𝒮 is linearly independent). Suppose n = 2. In investigating the intersection of the two row spans, we may as well look at the shifted symbol matrices instead of the bases V̂_i, since they span the same subspaces, and write out their block-Toeplitz structure explicitly.

The full-rank condition is essential here. Consider, for instance, a symbol sequence that is periodic with period 3, repeating the blocks s_0, s_1, s_2. The shifted symbol matrices then look like

    S_3 = [ s_2  s_0  s_1  s_2  s_0  s_1
            s_1  s_2  s_0  s_1  s_2  s_0
            s_0  s_1  s_2  s_0  s_1  s_2 ].

In this case, the intersections do not remove any row. Note that S_4 has rank 3d; therefore, it is not of full rank.
B. Detection of d and L

If ℋ and 𝒮 have full column rank and full row rank, respectively, then the rank of 𝒳 is d(L + m − 1). The number of signals d can therefore be estimated by increasing m by one and looking at the increase in rank of 𝒳. This property provides a very effective detection mechanism even if the noise level is quite high, since it is independent of the actual (observable) channel length. Furthermore, it still holds if the channels do not all have equal lengths (see Section IV-C). In case they do, L can be determined from the estimated rank of 𝒳 and the estimated number of signals d̂ by L̂ = rank(𝒳)/d̂ − m + 1.
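The detection rule can be sketched as follows, assuming equal channel lengths and that the dimension conditions (5) hold for the values of m that are used; the simple threshold-based rank decision stands in for whatever detector is preferred in practice.

import numpy as np

def estimate_d_and_L(X, m_max=6, rel_tol=1e-6):
    # rank(X_m) = d*(L + m - 1) in the noise-free case, so the rank grows
    # by d each time m is increased by one, and L follows from
    # L = rank(X_m)/d - m + 1.
    def numerical_rank(A):
        sv = np.linalg.svd(A, compute_uv=False)
        return int(np.sum(sv > rel_tol * sv[0]))

    M, N = X.shape
    ranks = [numerical_rank(np.vstack([X[:, i:N - m + 1 + i] for i in range(m)]))
             for m in range(1, m_max + 1)]
    d_hat = max(1, int(round(np.median(np.diff(ranks)))))  # typical rank increase
    L_hat = int(round(ranks[-1] / d_hat)) - m_max + 1
    return d_hat, L_hat, ranks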
It is interesting to note that there is an efficient updating algorithm for estimating the rank of 𝒳 and the corresponding column span for all values of m at once, without requiring SVDs and using only the full-size data matrix. The SSE-1 subspace estimator derived in [32] is a technique for computing the number of singular values of a matrix that are larger than a given threshold, along with a basis for a subspace that is close (within the threshold) to the column span of the matrix in some norm. The algorithm is such that, at the same time, this information is produced for all principal submatrices as well. Applied to 𝒳, it produces the ranks of all the stacked data matrices, with respect to a given threshold, at the complexity of a QR factorization.
C. Unequal Channel Lengths
For simplicity of presentation, we have only considered channels with equal lengths up to now: L_1 = ... = L_d = L. In general, however, the lengths L_j may be different. In that case, it is perhaps more natural to write the factorization with a rank-deficient ℋ as

    𝒳 = sum over j of ℋ_j 𝒮_j,                          (17)

where each ℋ_j and 𝒮_j correspond to the channel and symbol matrix of source j only. Generically, these factors are of full rank L_j + m − 1. The rank of 𝒳 is thus expected to be

    rank(𝒳) = sum over j of (L_j + m − 1) = d(m − 1) + L_tot,    L_tot = L_1 + ... + L_d,

i.e., the reduction in dimension produced by each intersection is equal to the number of sources whose channel length is at least as large as the number of intersections taken so far. Thus, this reduction is monotonically decreasing from d (at the first intersection) to the number of sources of maximal channel length (at the last intersection), and the reductions add up to L_tot. If we perform the intersections step by step, then the first intersection removes the top row of every 𝒮_j, and the later intersections act only on the sources whose channels are still represented.
the rows of V̂. The latter normalization is in the order of 1/sqrt(N). For large N, it is clear that the rows of the second term (containing the remaining ISI) become orthogonal to V̂, since the inner product is proportional to 1/sqrt(N). Obviously, the columns of this term are orthogonal as well. Hence, the second term contributes additional singular values, each repeated two times. Altogether, asymptotically and under noise-free conditions, V_T^(n) has d singular values equal to sqrt(n) and groups of d singular values equal to sqrt(n − 1), sqrt(n − 2), and so on. If we take fewer than the maximal number of intersections, then, similarly, we can show that there are d singular values equal to sqrt(n), followed by groups of d singular values equal to sqrt(n − 1), and so on. The right singular vectors corresponding to the d largest singular values are a basis for the intersection; the intersections have removed n − 1 echos of each signal.
If N is not large and if there is noise, then obviously the singular values start to deviate from these asymptotic values, and in particular, the gap between the singular values around sqrt(n) and sqrt(n − 1) closes. The assessment of these deviations is a subject of future research. Such an analysis would give pointers to suitable minimal values of N (in relation to the noise level) such that there still can be a gap.
E. Comparison by Simulation
To assess and compare the performance of the proposed algorithms, we consider a simple but unrealistic scenario in which all assumptions on the model are satisfied. A more challenging test case is deferred to Section VI. We took d = 2 real-valued BPSK sources and a randomly selected complex channel matrix with equal channel lengths. (Multiple antennas and oversampling are equivalent in this example because there is no modulation function and no multiray model.) We added complex white Gaussian i.i.d. noise with variance sigma² to the N snapshots. The signal-to-noise ratio (SNR) is defined as the average SNR per signal per observable. The relative power of both sources was set equal.

The singular values of 𝒳 are displayed in Fig. 3. Without noise, the rank of 𝒳 is expected to be d(L + m − 1), which turns out to be the case. The number of sources can be identified from the graph by looking at the increase in rank of 𝒳 as m increases. In addition, L can be estimated, assuming equal channel lengths, from (18).
Fig. 4. Transformed singular values of V_T^(n), namely, (n − sv(V_T^(n))²)^(1/2). Small values indicate the number of remaining signals after the intersections. (a) SNR = 10 dB, full intersections. (b) SNR = 5 dB, full intersections (second source not resolved). (c) SNR = 5 dB, underestimating d and taking fewer intersections; after the intersections, d̂_S = 3 signals remain.
Fig. 5. BER performance for d = 2 BPSK signals. (a) Using exact values L̂ = L and d̂_X = d(L + m − 1). (b) Using approximate values. For comparison, the CRLB for a zero-forcing equalizer is indicated (assuming perfect knowledge of H), along with the CRLB for the blind scenario (not using the FA property) and the performance of ILSE initialized with the exact H.
Ideally, this matrix is square and of full rank, but for ill-defined channel lengths, its size is determined by the actual (large and fuzzy) channel length. Fig. 6(c) shows the magnitude of the entries of the estimated mixing matrix (up to its first 24 rows). The first 10 rows have 11 large entries; thus, the first 10 rows of V̂ are a linear combination of 11 rows of S, plus some weaker ISI from other rows. The reason for this is that the impulse response contains a sharp peak, which is smeared by the shifts over 11 symbols. The next few rows show the influence of the smaller peaks in the impulse response: an increasing number of rows of S get involved.
For large N, we may write, as in (18), that each vector in the estimated intersection should have only one large entry. However, it is seen that the top row has at least eight large entries; therefore, the vector in the intersection is still a linear combination of at least eight symbols. Thus, the intersection did not produce the desired effect of removing all ISI. The structure of this figure is very characteristic and shows how the intersections work. Indeed, small singular values of V_T^(n) correspond to the top and bottom rows of S since these are repeated only a few times in V_T^(n). The large singular values correspond to rows in the middle of S, which are repeated up to n times. The width of the legs of the V-shaped pattern is nearly constant. For well-defined channel lengths, the width of the legs is expected to be 1 because the right singular vectors corresponding to a singular value are specific echos (rows of S; cf. (18)). The widening of the second leg of the V in our example shows the influence of the structured noise that is introduced by truncating the rank of 𝒳 at 12. Qualitatively, it can be attributed to the second peak in the impulse response, which is partly (but not entirely) eliminated by the truncation of 𝒳 to rank 12. The truncated data matrix still contains one or a few linear combinations of echos, but since there are fewer combinations than symbols that play a role after truncation, the echos cannot be removed by intersections. The conclusion drawn from this experiment is that, for actual channels, the SVD-based intersection scheme may not remove all of the ISI if the rank of 𝒳 is ill-defined.
B. Effect of Taking Fewer Intersections
What happens if we take fewer than the maximal number of intersections? We provide an intuitive analysis. Let us say that d̂_X is the true rank of 𝒳 and that the resulting approximation error is lumped into the noise term. Since V̂ is obtained from a truncated SVD of 𝒳, it is seen that the noise on the rows of V̂ is not uniform: the weighting by the inverse singular values amplifies the noise at the top rows of V̂ less than at the later rows. Consider a simple example in which only a few intersections are taken; the basis of singular vectors of V_T^(n) is then dominated by the rows of S that are repeated most often.
TABLE II. BLIND FIR-MIMO IDENTIFICATION ALGORITHM

Fig. 7. (a) Relative power and (b) response to a raised-cosine pulse (T = 6 ns, roll-off factor 0.5) of two measured indoor channels.

Fig. 8. Singular values of 𝒳 for m = 2, ..., 10.
Substituting this into the above equation, and noting that both factors are diagonal, implies that there is a unitary matrix such that the transformed product is diagonal as well. This, however, constitutes precisely an SVD of the stacked matrix of bases. This result is readily generalized to the joint intersection of more than two subspaces. Likewise, we compute an SVD of the corresponding stacked basis matrix, but now look for singular values that are close to sqrt(n).
APPENDIX B
APPROXIMATE CRAMÉR–RAO BOUNDS

Suppose that 𝒳 = ℋ𝒮 plus noise, where the noise is a white i.i.d. complex Gaussian process with covariance matrix sigma²I. For simplicity of future notation, let us specialize to the case of real signals (e.g., BPSK signals, s_i = +/-1). Define vectors of unknown parameters collecting vec(ℋ) and the symbols. We assume that the number of sources is known and that the channels have equal, known channel length L. If we do not take into account that the entries of 𝒮 belong to a finite alphabet, then the concentrated Fisher information matrix for the symbols is derived in [26]. This matrix is singular: without further constraints, the factors are identifiable only up to an invertible matrix T. Indeed, the dimension of the null space of the Fisher information matrix is observed to correspond to this ambiguity in generic examples. To fix T, one has to assume that certain symbols are known. For d = 1, knowing the value of one symbol suffices, and the variance of the remaining estimates is obtained by deleting the corresponding column in the Fisher information computation. Let the reduced quantities be defined with the column corresponding to the known symbol taken out. Then, the CRLB on the covariance of the remaining symbol estimates follows as in (19).
The approximate CRLB for the blind scenario without using the FA property then follows by a diagonal correction based on the median of the diagonal entries, as expressed in (21), where the exact symbols and the noise on the estimates enter separately; a normalization is applied to arrive at an estimate.⁶

⁶Here, the first-order approximation s_r(s_r + e_r)⁻¹(s + e) ≈ s − e_r s_r⁻¹ s + e is used, as well as the BPSK assumption |s_i| = 1.
Note that the normalization can actually amplify the noise contribution. To estimate the resulting correction on the bound, assume (not entirely correctly) that the columns of the estimation error are independent, zero mean, and identically distributed. Let e_k be the kth such column; the left-hand side of the resulting expression is then given by the uncorrected CRLB, namely, the submatrix of (20) corresponding to the kth symbol vector. It follows that an estimate of the correction and an approximate lower bound on var(s_blind) can be obtained as in (22), using the median of the corresponding diagonal entries. For BPSK signals, this expression reduces to (21).
REFERENCES

[1] A. J. van der Veen, S. Talwar, and A. Paulraj, "Blind estimation of multiple digital signals transmitted over FIR channels," IEEE Signal Processing Lett., vol. 2, pp. 99–102, May 1995.
[2] A. J. van der Veen, S. Talwar, and A. Paulraj, "Blind identification of FIR channels carrying multiple finite alphabet signals," in Proc. IEEE ICASSP, 1995, pp. 1213–1216.
[3] W. A. Gardner, "A new method of channel identification," IEEE Trans. Commun., vol. 39, pp. 813–817, June 1991.
[4] L. Tong, G. Xu, and T. Kailath, "Blind identification and equalization based on second-order statistics: A time domain approach," IEEE Trans. Inform. Theory, vol. 40, pp. 340–349, Mar. 1994.
[5] L. Tong, G. Xu, B. Hassibi, and T. Kailath, "Blind channel identification based on second-order statistics: A frequency-domain approach," IEEE Trans. Inform. Theory, vol. 41, pp. 329–334, Jan. 1995.
[6] L. Tong, G. Xu, and T. Kailath, "Fast blind equalization via antenna arrays," in Proc. IEEE ICASSP, 1993, pp. IV:272–274.
[7] E. Moulines, P. Duhamel, J. Cardoso, and S. Mayrargue, "Subspace methods for the blind identification of multichannel FIR filters," IEEE Trans. Signal Processing, vol. 43, pp. 516–525, Feb. 1995.
[8] D. Slock, "Blind fractionally-spaced equalization, perfect-reconstruction filter banks and multichannel linear prediction," in Proc. IEEE ICASSP, vol. IV, 1994, pp. 585–588.
[9] G. Xu and H. Liu, "A deterministic approach to blind identification of multi-channel FIR systems," in Proc. IEEE ICASSP, 1994, vol. IV, pp. 581–584.
[10] F. R. Magee and J. G. Proakis, "Adaptive maximum-likelihood sequence estimation for signaling in the presence of intersymbol interference," IEEE Trans. Inform. Theory, vol. IT-19, pp. 120–124, Jan. 1973.
[11] G. Ungerboeck, "Adaptive maximum-likelihood receiver for carrier-modulated data transmission systems," IEEE Trans. Commun., vol. COM-22, pp. 624–636, May 1974.
[12] G. Picchi and G. Prati, "Blind equalization and carrier recovery using a stop-and-go decision-directed algorithm," IEEE Trans. Commun., vol. COM-35, pp. 877–887, Sept. 1987.
[13] Z. Ding, "Blind equalization based on joint minimum MSE criterion," IEEE Trans. Commun., vol. 42, pp. 648–654, Feb. 1994.
[14] N. Seshadri, "Joint data and channel estimation using fast blind trellis search techniques," in Proc. Globecom, 1991, pp. 1659–1663.
[15] D. Yellin and B. Porat, "Blind identification of FIR systems excited by discrete-alphabet inputs," IEEE Trans. Signal Processing, vol. 41, pp. 1331–1339, Mar. 1993.
[16] S. Talwar, M. Viberg, and A. Paulraj, "Blind estimation of multiple co-channel digital signals using an antenna array," IEEE Signal Processing Lett., vol. 1, pp. 29–31, Feb. 1994.
Alle-Jan van der Veen (S'87–M'94) was born in The Netherlands in 1966. He
graduated (cum laude) from the Department of Electrical Engineering, Delft
University of Technology, Delft, The Netherlands, in 1988, and received the
Ph.D. degree (cum laude) from the same institute in 1993.
Throughout 1994, he was a postdoctoral scholar at Stanford University,
Stanford, CA, in the Scientific Computing/Computational Mathematics Group
and in the Information Systems Laboratory. At present, he is a researcher in
the Signal Processing Group of DIMES, Delft University of Technology. His
research interests are in the general area of system theory applied to signal
processing, in particular, system identification, time-varying system theory,
and in numerical methods and parallel algorithms for linear algebra problems.
Dr. van der Veen is the recipient of a 1994 IEEE SP paper award.
Shilpa Talwar received the M.S. degree in electrical engineering and the
Ph.D. degree in scientific computing and computational mathematics from
Stanford University, Stanford, CA, in 1996.
She is currently employed by Stanford Telecom, Sunnyvale, CA. Her
research interests include wireless communications, array signal processing,
and numerical linear algebra.