
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 44, NO. 5, MAY 1996

An Analytical Constant Modulus Algorithm

Alle-Jan van der Veen, Member, IEEE, and Arogyaswami Paulraj, Fellow, IEEE

Abstract: Iterative constant modulus algorithms such as Godard and CMA have been used to blindly separate a superposition of cochannel constant modulus (CM) signals impinging on an antenna array. These algorithms have certain deficiencies in the context of convergence to local minima and the retrieval of all individual CM signals that are present in the channel. In this paper, we show that the underlying constant modulus factorization problem is, in fact, a generalized eigenvalue problem, and may be solved via a simultaneous diagonalization of a set of matrices. With this new analytical approach, it is possible to detect the number of CM signals present in the channel, and to retrieve all of them exactly, rejecting other, non-CM signals. Only a modest number of samples is required. The algorithm is robust in the presence of noise and is tested on measured data collected from an experimental set-up.

I. INTRODUCTION

A. Blind Signal Separation

An elementary problem in the area of spatial signal processing is that of blind beamforming. This problem arises, e.g., in the following wireless communications scenario, illustrated in Fig. 1. Consider a number of sources (users) at distinct locations, all broadcasting signals at the same frequency and at the same time. The signals are received by a central platform containing an array of antennas. By linearly combining the antenna outputs, the objective is to separate the signals and to copy each of them without interference from the other signals. The task of the blind beamformer is to compute the proper weight vectors w_i from the measured data only, without detailed knowledge of the signals and the channel.

Fig. 1. Elementary blind beamforming scenario.

Manuscript received June 8, 1994; revised November 6, 1995. This research was supported by ARPA, Contract F49620-91-C-0086, monitored by the AFOSR, and by NSF Grant DMS-9205192. Portions of this work were presented at SPIE '94. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Zhi Ding.
A. J. van der Veen was with the Department of Computer Science/SCCM, Stanford University, Stanford, CA 94305 USA. He is now with the Department of Electrical Engineering/DIMES, Delft University of Technology, Delft, The Netherlands (e-mail: [email protected]).
A. Paulraj is with the Information Systems Laboratory, Stanford University, Stanford, CA 94305 USA.
Publisher Item Identifier S 1053-587X(96)03068-1.

Mathematically, the situation is described by the simple and well-known data model

    X = AS        (1)

where the matrix X: m x n is a collection of n snapshots from each of the m antennas, A: m x d is the array response matrix, and S: d x n is the signal matrix, with d rows s_i (i = 1, ..., d) corresponding to each of the d source signals. This model is a reasonably accurate description for stationary propagation environments in which the multipath has only a short delay spread (as compared to the inverse of the signal bandwidths), so that no equalization is required. The beamforming problem may thus be formulated as a structured matrix factorization problem: given X, find factors A and S satisfying certain structural properties. Once A is known, the weight vectors w_i for signal copy are given by the rows of W, where W = A† is the pseudoinverse of A.
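As an illustration of the data model (1) and of signal copy via the pseudoinverse, the following Python sketch may be helpful (written for this text, not taken from the paper; the array response is simply a random matrix here):

    import numpy as np

    rng = np.random.default_rng(0)
    m, d, n = 4, 2, 100
    S = np.exp(2j * np.pi * rng.random((d, n)))    # d unit-modulus source signals
    A = rng.standard_normal((m, d)) + 1j * rng.standard_normal((m, d))
    X = A @ S                                      # data model (1): m x n snapshots
    W = np.linalg.pinv(A)                          # W = A†; rows are weight vectors
    print(np.allclose(W @ X, S))                   # perfect signal copy when A is known

The blind problem discussed in this paper is to find such a W from X alone, without knowing A or S.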
Although we will be concerned with blind beamforming, it is useful to note that quite similar structured factorization problems arise in the context of blind equalization of a single source observed through an unknown time-dispersive channel. The two scenarios might even be combined into a blind multi-user separation problem in the presence of long-delay multipath. Such problems are often separable into an equalization and a separation (beamforming) step (viz. e.g., [2]), so that a generic solution to the blind beamforming problem is also relevant in the combined scenario.

One mainstream of approaches for computing the structured factorization has focused on properties of the A-matrix. In particular, the columns of the A-matrix are (not always correctly) assumed to be vectors on the array manifold, each associated with a certain direction-of-arrival (DOA). The identification of


the DOAs necessitates the use of calibrated antenna arrays (for the MUSIC algorithm [3]) or special array geometries (for the ESPRIT algorithm [4]), and puts serious limitations on the propagation environment as well: since in principle the direction of each multipath ray is estimated, the total number of dominant rays has to be less than the number of antennas. Moreover, rays cannot have identical delays, and diffuse multipath is not allowed. For short-delay or diffuse multipath, it might be more accurate to model each column of A as the sum of two (or many) vectors on the array manifold, but then the estimation of all corresponding directions is computationally very intensive, if possible at all.

A second class of approaches, more promising in the presence of unstructured multipath and useful in the context of blind equalization as well, exploits structural properties of the S-matrix that should hold and be reconstructed by the factorization. One widely used property is the constant modulus of many communication signals (e.g., FM and PM in the analog domain, and FSK, PSK, 4-QAM for digital signals). A related but distinct property is the finite alphabet of digital signals. The idea of modulus restoral has its roots in the work of Sato [5], Godard [6], and Treichler, Agee, and Larimore [7], [8], all for the purpose of blind equalization; the algorithms are known as CMAs. They are usually implemented as stochastic gradient-descent optimizers of a modulus-error cost function, and are in that form quite similar to decision-directed adaptive filters or Bussgang algorithms for finite-alphabet restoral (the literature is abundant; viz. [9]). The application of the CMA to blind beamforming is straightforward and was first considered in [10] and [11]; a combined spatio-temporal CMA was proposed in [12]. Blind beamforming based solely on the finite alphabet structure is developed in [13]. Other properties of S that might be used are the spectral self-coherence of communication signals, leading to the SCORE class of blind beamforming algorithms [14], and several statistical properties: e.g., the assumed independence of the sources allows the separation of non-Gaussian signals based on their high-order cross-correlations [15]-[18].

In the context of blind equalization of digital signals, finally, the cyclostationarity of such signals may be exploited by the use of multiple antennas [19] or by sampling faster than the symbol rate and using fractionally spaced equalizers (FSE). The spatial or temporal oversampling of cyclostationary signals leads to a data matrix factorization X = HS in which both the channel matrix H and the signal matrix S have a Hankel or Toeplitz structure. This structure by itself is already strong enough to determine the factorization, as is demonstrated in the innovative approach by Tong, Xu, and Kailath [20], but perhaps more clearly visible in consecutive work [21], [22]. It may also very effectively be combined with the properties of S into a single scheme, such as FSE with CMA [23], [24], or FSE with finite alphabets [2], [25]. The latter papers consider the more ambitious joint separation and equalization of multiple digital signals, but the same should be possible with FSE-CMA as well. As mentioned before, in these applications the equalization and beamforming stages are separable, and a good solution to the beamforming problem is crucial.

In this paper, we will limit ourselves to blind beamforming of constant-modulus (CM) signals, assuming no other properties of the signals except their independence. This blind CM beamforming problem was introduced in [10] and [11] and was solved using the CMA [7], but in a restricted form: only the reception of a single signal-of-interest among other interfering signals was considered. It was observed that the algorithm can lock onto one of the interfering signals rather than the desired signal. In later papers, this misbehavior was used to set up a multistage CMA, in which the intention is to capture all incident CM signals [10], [26], [27]. The output of a first CMA stage results in the detection of the first CM signal. By an orthogonal projection, or an LMS implementation of it, this signal is subtracted from the data matrix, and the resulting filtered data is fed to a second CMA stage in order to detect a possible second CM signal. However, for short data sequences, the signals are not yet orthogonal to each other, and the projection leads to a misadjustment in the second and subsequent stages, thus limiting its performance. To mitigate this effect, the forced orthogonality of the signals may be relaxed [28]-[30] by only making sure that they are sufficiently independent of each other. In these schemes, a number of CMAs are running in parallel, all started from distinct initializations. Orthogonality is tested and weakly restored at the end of an update block of n samples.

Although the latter approach has been successfully demonstrated in an on-line outdoor experiment [31], it provides only a heuristic solution to the underlying, very tough problem: how can gradient descent techniques be used to converge reliably to all minima of the cost function? Indeed, how do we know the number of minima to look for in the first place? When only a finite block of data is available, it is very likely that there are also local minima of the sample cost function that do not correspond to any of the source signals. Depending on the initialization of the gradient descent optimization, the CMAs do converge sometimes to these solutions. In this respect, it should be noted that global convergence of the CMA has only been proven for infinite sets of data (or in the averaged sense) [32], and only for scenarios that admit a perfect solution.¹ Finally, convergence of the CMAs may be slow and irregular, especially for weak sources and short data sets.

B. Contributions

In this paper, we introduce a new approach to the constant modulus factorization problem. We show that the problem is essentially a generalized eigenvalue problem and can be solved analytically, by a deterministic algorithm and using only a finite set of data (n samples of m antennas). This leads to the following results.

- For d ≤ m sources, and without noise, n > d² samples are sufficient to compute A and S exactly via a certain eigenvalue problem.

¹ This is not so much an issue in blind beamforming, where the usual assumption is that the number of sources is at most equal to the number of sensors, but it has caused much confusion in the blind equalization of finite impulse response (FIR) channels by FIR equalizers, viz. [33]. Current insight is that a fractionally sampled equalizer allows a perfect solution and thus assures asymptotic global convergence [24].

- For n > d², it is possible to detect the number of CM signals present in X. This implies that not all sources have to generate CM signals, although the algorithm only recovers the CM sources.
- With X distorted by additive noise, a generalization of the algorithm is robust in finding S, even when n is quite small. This is demonstrated with experimental data.
The algorithm is derived by setting up the equations for the weight vector w such that wX is a CM signal (Section II). This gives n quadratic equations in the entries of w, which can be linearized when written in terms of the Kronecker product w ⊗ w̄, a vector with d² entries (w̄ is the complex conjugate of w). If n > d², then the dimensionality of the solution space of this linear system of equations indicates how many CM signals are present in X. Most solution vectors of the linear system do not have the Kronecker structure w ⊗ w̄; the core of the CM problem is to find those that do. It is shown in Section III how this problem can be transformed into a generalization of an eigenvalue problem: the simultaneous diagonalization of a number of matrices. Without noise, this problem has an essentially unique solution that can be found using standard linear algebra tools. With noise added to X, there is in general no exact diagonal solution, and we have to find an approximate simultaneous diagonalization. This is a challenging, nonstandard linear algebra problem for which we propose an algorithm that exhibits quadratic convergence in simulations (Section IV). In Section V, the algorithm is applied both to computer-generated data and to measured data sets, with very good results.

It is not the first time that a Kronecker approach has been proposed to solve the CM problem, viz. e.g., [34]-[36]. However, these authors operate in the equalization context and try to find only one structured solution w ⊗ w̄ to the linear system, ignoring the fact that there might be more such solutions (in the equalization context, this occurs if the equalizer length is too long). Interestingly enough, an entirely similar simultaneous diagonalization problem did turn up in fourth-order cumulant-based techniques for blind separation of multiple non-Gaussian signals [15]-[18]. With hindsight, one might perhaps say that CM signals are deterministic counterparts of non-Gaussian signals, but only as far as the fourth-order cumulant is concerned. At this point, we are only aware that there must be connections, but the details remain to be sorted out.
C. Notation

Lower-case bold (as in w) denotes either a row or a column vector. Its ith entry is sometimes denoted as (w)_i. w^T is the transpose, w̄ is the complex conjugate, and w* is the complex conjugate transpose. ⊗ is the Kronecker product. For matrices, A† denotes the Moore-Penrose pseudoinverse, and row(A) denotes the row span (co-range) of A.
II. PROBLEM FORMULATION AND TRANSFORMATION

A. Problem Statement; Uniqueness

In this section, we discuss the actual problem that will be solved. Starting from the data model X = AS in (1), we first note that, without loss of generality, the constant modulus of all signals may be modeled to be equal to one: any other value of the amplitude of one of the signals is absorbed in the A-matrix by a proper scaling of corresponding columns of A and rows of S. Hence, the problem we consider is, for a given data matrix X, to find a factorization

    X = AS,  with A, S full rank, |S_ij| = 1.        (2)

A slightly more general way to formulate the problem is obtained by premultiplying (2) with W = A†, where A† denotes the pseudoinverse of A:

Problem P1 (CM Factorization Problem): For a given data matrix X: m x n of rank d, find S and W: δ x m, such that

    WX = S,  |S_ij| = 1

where S is of full rank and δ ≤ d is as large as possible.

In this formulation, X is a linear combination of d signals, but only δ ≤ d of them are of CM type. Only the CM signals will be reconstructed by the beamformer.

The formulations X = AS and WX = S are equivalent only if the factorization X = AS is essentially unique, meaning that the only CM signals that can be constructed by the beamformer are the signals that were originally sent, and not some spurious ghost signals. Trivial transformations such as the choice of ordering of the rows of S and the complex phases of the entries of the first column of S cannot be avoided but lead to an admissible form of nonuniqueness. Save for these transformations, and under conditions that A, S are full rank and the sources generating S are sufficiently independent and have sufficient phase richness, uniqueness is guaranteed with probability 1 once n is large enough. This is well known for δ = d, n → ∞, and analog CM signals (viz. e.g., [7] for equalization, [37] for beamforming). One may have concerns about the sufficient phase richness of digital CM signals with small constellations, but in fact even BPSK signals give unique factorizations [13]. However, a sharpening of the n → ∞ condition is possible.

Lemma 1: Suppose that X: m x n has rank d and that the factorization {X = AS, |S_ij| = 1} is unique for n → ∞. Then the factorization is in general already unique for n ≥ 2d.

Proof: See Appendix B.

The algorithm that we derive in this paper requires n > d², which is still quite reasonable for small values of d (e.g., d < 10).
B. The Gerchberg-Saxton Algorithm (GSA)

Denote by row(X) the subspace spanned by the rows of X (the co-range of X), and define the set of CM signals as

    CM = { S : |S_ij| = 1, all i, j }.

Problem P1 asks for all row vectors w (the rows of W) such that wX = s is a CM signal, for linearly independent signals s. Hence, we have the following lemma.

Lemma 2: Problem P1 is precisely equivalent to finding all linearly independent signals s that satisfy

    (A)  s ∈ row(X),
    (B)  s ∈ CM.

From this formulation, it is straightforward to devise an algorithm based on alternating projections: start with a (random) choice of s in the row span of X, and alternatingly project it onto CM and back onto the row span of X. The set CM is not a linear subspace, so that the projection onto CM is nonlinear:

    P_CM(s): (s)_i ↦ (s)_i / |(s)_i|.

Every entry of the vector is radially projected onto the complex unit circle. It is customary to estimate weight vectors w rather than signals, in which case the alternating projection algorithm is expressed as the iteration

    w^(i+1) = [P_CM(w^(i) X)] X†.        (3)

(Note that s^(i) = w^(i) X and that X†X is a projection onto the row span of X.)

It is interesting to note that this is a well-established algorithm in the field of optics for solving the phase-retrieval problem, where it is known as the Gerchberg-Saxton algorithm (GSA) [38]. The connection of the phase-retrieval problem with the CM problem was made only recently [37]. Essentially the same algorithm was derived from the CMA by Agee [39], called the LSCMA, and claimed to have faster convergence than the standard CMA. It is also closely related to the OCMA variant by Gooch and Lundell [11], who replaced the LMS-type updating of the CMA by an RLS update. One difference of the GSA and LSCMA with other CMA methods is that they are block methods: they iterate on X rather than updating with individual snapshots. Hence, they typically require less data (smaller n), although of course the standard iterative CMA could reuse old data as well. Conversely, the GSA/LSCMA methods could be used on data matrices of increasing sizes by introducing updating versions for the pseudoinverse, which leads to the OCMA. The disadvantage of using these iterative algorithms on small finite data sets is that global convergence properties are lost: spurious local minima could be introduced. It is not known how large the block size has to be before the asymptotic global convergence results are applicable.
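As a concrete illustration of iteration (3), the following sketch implements the GSA in Python with NumPy (an illustration written for this text, not the authors' code; the initialization and iteration count are arbitrary choices):

    import numpy as np

    def gsa(X, w0, n_iter=50):
        """Gerchberg-Saxton / LSCMA iteration (3): alternately project the
        current signal estimate onto the CM set and back onto row(X)."""
        X_pinv = np.linalg.pinv(X)      # X† is fixed; precompute it
        w = w0
        for _ in range(n_iter):
            s = w @ X                   # current signal estimate
            s_cm = s / np.abs(s)        # radial projection onto the CM set
            w = s_cm @ X_pinv           # back-projection onto row(X)
        return w

A random initialization w0 of length m suffices for a demonstration; which CM signal the iteration converges to (if any) depends on this starting point, which is exactly the difficulty discussed above.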

C. Equivalent Problem Statement

However, our intent is to compute an exact solution to the problem in Lemma 2 via analysis, and not via alternating projections. Recall that we are searching for all vectors s that are in the row span row(X) and also have the CM property. The first property has so far been captured as requiring s = wX, but it is more convenient to take linear combinations of a minimal (orthonormal) basis for the row span of X. This avoids problems with nonuniqueness of w when d < m, and makes sure that different linear combinations lead to different signals.

Thus, let X = UΣV, with U ∈ C^{m x m}, Σ ∈ R^{m x n}, V ∈ C^{n x n}, be a singular value decomposition of X [40]; U and V are unitary matrices containing the singular vectors, and Σ is a real diagonal matrix with nonnegative entries: the singular values. Suppose that there are d sources, so that the rank of X is equal to d. Only d singular values are nonzero, and we collect the corresponding d rows of V in a matrix V̂ ∈ C^{d x n}. The rows of V̂ form an orthonormal basis of the row span of X, and we can rewrite condition (A) in Lemma 2 as

    (A): s ∈ row(X)  ⟺  s = wV̂,  V̂: d x n.

Here, the weight vector w is not precisely the same as before: it is now acting on the orthogonal basis vectors of row(X), rather than directly on X. This reduces the number of parameters to estimate from m to d and ensures that linearly independent w result in linearly independent s. A second advantage of an orthogonal basis is that the corresponding matrix W (acting on V̂ instead of X) has a condition number that tends to 1 as n grows: for uncorrelated signals and large n, S/√n becomes an isometry and W/√n, as a mapping of one isometry into another, becomes unitary. When using X instead of V̂, W would have a bad condition number if signals come from close directions.²
To rewrite condition (B): s ∈ CM, put V̂ = [v_1 ... v_n], where v_k ∈ C^d is the kth column of V̂. Then

    (B): s = [(s)_1 ... (s)_n] ∈ CM
       ⟺ [ |(s)_1|² ... |(s)_n|² ] = [1 ... 1]
       ⟺ [ w v_1 v_1* w*  ...  w v_n v_n* w* ] = [1 ... 1].

If we define P_k = v_k v_k* ∈ C^{d x d}, for k = 1, ..., n, then the above derivation has shown that problem P1 is precisely equivalent to finding all linearly independent vectors w such that

    w P_k w* = 1,   k = 1, ..., n        (4)

which calls for the simultaneous solution of n quadratic equations in the entries of w, or the intersection of n ellipsoids. To find all solutions, the approach is to expand these equations in the entries of w, which leads to Kronecker products.
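The expansion can be checked numerically: assuming a row-major stacking of P_k (consistent with the vec operation defined next), the quadratic form w P_k w* is a linear function of the Kronecker vector w ⊗ w̄. A small NumPy check, illustrative only:

    import numpy as np

    rng = np.random.default_rng(0)
    d = 3
    w = rng.standard_normal(d) + 1j * rng.standard_normal(d)  # row vector w
    v = rng.standard_normal(d) + 1j * rng.standard_normal(d)  # basis column v_k
    Pk = np.outer(v, v.conj())                    # P_k = v_k v_k*
    lhs = w @ Pk @ w.conj()                       # w P_k w*
    rhs = Pk.flatten() @ np.kron(w, w.conj())     # vec(P_k)^T (w ⊗ w̄)
    print(np.allclose(lhs, rhs))                  # True: (4) is linear in w ⊗ w̄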
For matrices Y ∈ C^{d x d} and vectors y ∈ C^{d²}, denote by

    vec(Y) := [Y_11 Y_12 ... Y_1d Y_21 ... Y_dd]^T        (5)

the stacking of Y into a vector of d² entries, and by

    vec⁻¹(y) := Y        (6)

the inverse operation, which unstacks y into a d x d matrix.

² The idea to switch to an orthogonal basis and force W/√n to be close to unitary can be used to enhance convergence to independent solutions in iterative CMA algorithms as well. Similar to Pro-ESPRIT [41], such an approach could be called a Procrustes CMA.


With these definitions, the kth condition in (4) can be written as a linear equation in y := w ⊗ w̄:

    w P_k w* = p_k^T y,   p_k := vec(P_k).        (7)

Collect the n condition vectors p_k in one matrix P of size n x d²:

    P := [vec(v_1 v_1*) ... vec(v_n v_n*)]^T.        (8)

With these definitions, we have

Lemma 3: Problem P1 is precisely equivalent to finding all linearly independent vectors y satisfying

    Py = [1 ... 1]^T        (9)
    y = w ⊗ w̄.        (10)

For each solution w, the corresponding CM signal is given by s = wV̂.

Proof: To prove that this problem is equivalent to solving (4), and hence problem P1, it remains to show that a set of solutions {w_k ⊗ w̄_k} is linearly independent if and only if the corresponding set {w_k} is linearly independent. This is straightforward; see Appendix B.

The linear system of (9) is overdetermined for n > d². Nonetheless, if there is more than one CM signal present in X, there has to be more than one solution y to the linear system, and because they are linearly independent, P has to be singular. Hence, the set of independent solutions y to the linear system is not unique: any vector in the kernel of P can be added to any solution y. The second condition (10) restricts the solution space to vectors y that have a certain structure: they must be expressible as a Kronecker product of a vector with its complex conjugate. Note that it is not sufficient to compute solutions y to the system of equations and hope that they have the required structure y = w ⊗ w̄.

In general, the solution space to (9) can be written as an affine space of the form y = y_0 + α_1 y_1 + ... + α_ℓ y_ℓ, where y_0 is a particular solution to (9), and y_1, ..., y_ℓ is a basis of the kernel of P. We find it more convenient to work with a fully linear solution space, which is obtained via a linear transformation, as follows. Let Q be any n x n unitary matrix such that

    Q [1 ... 1]^T = [√n 0 ... 0]^T.        (11)

Simple choices for Q suffice; e.g., Q could be a Householder transformation [40], or Q could be a DFT (discrete Fourier transform) matrix. Apply Q to P as follows:

    QP =: [ p̂ ]
          [ P̂ ]        (12)

where p̂ is the first row of QP, and P̂: (n-1) x d² contains the remaining rows. The conditions (9)-(10) then transform into p̂y = √n, P̂y = 0, and y = w ⊗ w̄. We will show that the first condition can always be satisfied by scaling a nontrivial solution to the second equation. This then leads to the following equivalent problem statement:

Lemma 4: The CM problem P1 is precisely equivalent to the following problem. Let X be a given matrix. With P̂: (n-1) x d² and V̂ constructed from X, find all linearly independent nonzero solutions y that satisfy

    P̂y = 0
    y = w ⊗ w̄.

For each solution w, scaled such that ‖w‖ = n^(1/2), the vector s = wV̂ is a CM signal contained in X.

Proof: See Appendix B.

Suppose that the dimension of the kernel of P̂ is equal to some number δ̂ (we will argue in Section III-A below that in general δ̂ is, indeed, equal to the number of CM signals). Let {y_1, ..., y_δ̂} be a basis for this kernel. It can be computed via a QR factorization, or, with more numerical accuracy, from an SVD of P̂. In the latter case, the {y_k} are the right singular vectors corresponding to the δ̂ singular values of P̂ that are zero. With a basis of the kernel, any solution y to P̂y = 0 can be written as y = α_1 y_1 + ... + α_δ̂ y_δ̂, for arbitrary coefficients α_i ∈ C. The condition that y = w ⊗ w̄ as well is more conveniently expressed as Y = w*w, where Y = vec⁻¹(y). Likewise, denote by Y_k = vec⁻¹(y_k), k = 1, ..., δ̂, the corresponding d x d matrices constructed from the chosen basis of ker(P̂). Since

    α_1 y_1 + ... + α_δ̂ y_δ̂ = w ⊗ w̄   ⟺   α_1 Y_1 + ... + α_δ̂ Y_δ̂ = w*w        (13)

it is seen that, in essence, we have to find scalar linear combinations of a set of matrices (a generalized matrix pencil for more than two matrices) such that the result is a rank-1 Hermitian matrix, hence factorizable as w*w.

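A small numerical experiment (illustrative, with arbitrary test dimensions) confirms the structure of ker(P̂): for a noiseless mixture of δ CM signals, the kernel has dimension δ and contains the structured vectors w ⊗ w̄:

    import numpy as np

    rng = np.random.default_rng(1)
    d, m, n = 2, 3, 12                              # n > d² = 4
    S = np.exp(2j * np.pi * rng.random((d, n)))     # two CM signals
    A = rng.standard_normal((m, d)) + 1j * rng.standard_normal((m, d))
    X = A @ S
    _, _, Vh = np.linalg.svd(X, full_matrices=False)
    Vhat = Vh[:d]                                   # orthonormal basis of row(X)
    # condition matrix P of (8): rows vec(v_k v_k*), row-major stacking
    P = np.array([np.outer(v, v.conj()).flatten() for v in Vhat.T])
    F = np.fft.fft(np.eye(n)) / np.sqrt(n)          # DFT choice of Q in (11)
    Phat = (F @ P)[1:]                              # drop the first row: P̂
    sv = np.linalg.svd(Phat, compute_uv=False)
    print(np.sum(sv < 1e-10))                       # 2 = number of CM signals
    # a structured kernel vector: w such that s_1 = w Vhat
    w = np.linalg.lstsq(Vhat.T, S[0], rcond=None)[0]
    print(np.linalg.norm(Phat @ np.kron(w, w.conj())))  # ~ 0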

Linearly independent solutions w correspond to linearly independent y, and in turn to linearly independent parameter vectors [α_1 ... α_δ̂]. Hence, we may rewrite the conditions (13) in terms of the Y_k, which gives a new problem statement that is entirely equivalent to the original CM problem.

Problem P2 (Equivalent CM Problem): Let X be the given matrix, from which the set of d x d matrices {Y_1, ..., Y_δ̂} is derived as discussed above. The CM problem P1 is precisely equivalent to the following problem: determine all independent nonzero parameter vectors [α_1 ... α_δ̂] such that

    α_1 Y_1 + ... + α_δ̂ Y_δ̂ = w*w.        (14)

For each solution w scaled to ‖w‖ = n^(1/2), the vector s = wV̂ is a CM signal contained in X.

Finding all solutions to (14) is thus the core of the CM problem. In the next section, we show that (14) is, in fact, a generalized eigenvalue problem.

Remark: The given definition of the vec-operation in (5)-(6) does not make use of the fact that we only apply it to Hermitian matrices w*w and P_k. Other choices that do make use of this are possible; e.g., we could define vec(w*w) to be a real vector containing Re((w)_i (w)_j*) and Im((w)_i (w)_j*), instead of the redundant (w)_i (w)_j* and (w)_j (w)_i*. This transformation to real vectors leads to an equivalent but numerically and computationally favorable variant of the procedure and is detailed in Appendix A. The result is that, with this alternate definition, P and P̂ are transformed to real matrices, and the basis matrices Y_k constructed from the kernel of P̂ are Hermitian by construction. The coefficients α_k to be computed may then be restricted to reals as well.
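A sketch of such a real parameterization (the exact stacking used in Appendix A may differ; this is one natural choice): a Hermitian d x d matrix is described by d² real numbers, namely its real diagonal and the real and imaginary parts of its strictly upper triangular entries.

    import numpy as np

    def vech_real(Y):
        """Real vector of length d² for a Hermitian Y: diagonal, then
        Re and Im of the strictly upper triangular part."""
        d = Y.shape[0]
        iu = np.triu_indices(d, k=1)
        return np.concatenate([Y.diagonal().real, Y[iu].real, Y[iu].imag])

    def vech_real_inv(y, d):
        """Inverse operation: rebuild the Hermitian matrix."""
        iu = np.triu_indices(d, k=1)
        nu = len(iu[0])
        Y = np.zeros((d, d), dtype=complex)
        Y[np.diag_indices(d)] = y[:d]
        Y[iu] = y[d:d + nu] + 1j * y[d + nu:]
        return Y + np.triu(Y, k=1).conj().T

    w = np.array([1.0 + 2j, 3.0 - 1j])
    Y = np.outer(w.conj(), w)                     # w*w: Hermitian, rank 1
    print(np.allclose(vech_real_inv(vech_real(Y), 2), Y))   # True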
III. EXACT SOLUTION TO PROBLEM P2

In this section, we will show that problem P2 admits an exact solution, which can be computed using standard linear algebra techniques. We still operate under the assumption that there exists a solution, and that the solution is essentially unique. For computability, we will now also require that n > d².

A. Estimation of the Number of CM Signals

We first show that there is a relation between δ, the number of CM signals that are present in X, and δ̂, the dimension of the kernel of P̂. With δ independent constant modulus signals, there are δ linearly independent solutions w to the CM factorization problem, corresponding to δ linearly independent vectors y = w ⊗ w̄ in the kernel of P̂. Hence, it is necessary that dim ker P̂ ≥ δ. Since P̂: (n-1) x d², we also have dim ker P̂ ≥ d² - (n-1). To be able to detect δ from dim ker P̂, we have to require that, at least, n ≥ d² + 1.

Proposition 5: Let δ be the number of CM signals in X, and suppose n > d². Then dim ker P̂ ≥ δ. Generically, dim ker P̂ = δ.

Proof: See Appendix B.

In the proof it is shown that the occurrence of the nongeneric case where δ̂ > δ is independent of the propagation environment and can only happen if there are specific phase relations between the signals. Explicit examples can be constructed for the case of two CM signals s_1, s_2: a rank deficiency occurs if and only if there are constants α, c ∈ C such that for each sample point k

    (s_1)_k (s̄_2)_k + α (s_2)_k (s̄_1)_k = c.

Writing (s_2)_k = (s_1)_k exp(jφ_k), where φ_k is the phase difference between the two signals, this reduces to

    e^{-jφ_k} + α e^{jφ_k} = c.        (15)

For every choice of constants α, c, there are at most two values of φ_k such that the equation holds. Hence, a degeneracy can occur only for BPSK-type (two-state) signals sampled at the symbol rate. E.g., for BPSK, φ_k ∈ {0, π}, so that there are constants (α = -1, c = 0) for which (15) holds for all k. A problem also occurs for MSK signals, which are ±1 for even k and ±j for odd k. These degeneracies go away when the signals are fractionally sampled, because φ_k then also assumes intermediate values, or by a specialization to real signals.

B. Computation of W

We will assume from now on that dim ker P̂ = δ is equal to the number of CM signals in X. The CM problem is solved once we have found all δ independent parameter vectors [α_1, ..., α_δ] that make the generalized matrix pencil (14) rank 1 and Hermitian. This problem is in essence a generalized eigenvalue problem. Indeed, if d = δ = 2, then there are two matrices Y_1 and Y_2, each of size 2 x 2, and we have to find λ = α_2/α_1 such that Y_1 + λY_2 has its rank reduced by one (to become one). For larger δ, there are more than two matrices, and the rank should be reduced to one by taking linear combinations of all of them. This can be viewed as an extension of the generalized eigenvalue problem.

From the opposite perspective, suppose that the solutions of the CM problem are w_1, ..., w_δ. We already showed that w_1 ⊗ w̄_1, ..., w_δ ⊗ w̄_δ is a linearly independent set of vectors; together they are a basis of the kernel of P̂. Moving to matrices, each of the matrices Y_1, ..., Y_δ is a different (independent) linear combination of the pencil basis w_1*w_1, ..., w_δ*w_δ, i.e.,

    Y_1 = λ_11 w_1*w_1 + λ_12 w_2*w_2 + ... + λ_1δ w_δ*w_δ = W*Λ_1 W
    Y_2 = λ_21 w_1*w_1 + λ_22 w_2*w_2 + ... + λ_2δ w_δ*w_δ = W*Λ_2 W
    ...
    Y_δ = λ_δ1 w_1*w_1 + λ_δ2 w_2*w_2 + ... + λ_δδ w_δ*w_δ = W*Λ_δ W        (16)

where W: δ x d is the matrix with rows w_1, ..., w_δ, and Λ_k is the diagonal matrix with entries λ_k1, ..., λ_kδ. Hence, by the existence of a solution to the CM problem, there must be a matrix W whose inverse simultaneously


diagonalizes Y_1, ..., Y_δ. (Its uniqueness is, in fact, proven in [15] for the case of Hermitian matrices Y_k.) This makes Problem P2 equivalent to the following problem.

Theorem 6: Suppose n > d² and dim ker(P̂) = δ. Then the CM factorization problem P1, or P2, is equivalent to a simultaneous diagonalization problem: find W: δ x d (full rank δ) such that

    Y_1 = W*Λ_1 W
    Y_2 = W*Λ_2 W
    ...
    Y_δ = W*Λ_δ W        (Λ_1, ..., Λ_δ ∈ C^{δ x δ} diagonal).        (17)

In general, Y_1 and Y_2 are d x d matrices of rank δ, and not less than δ. In this case, a generalized eigenvalue decomposition of just Y_1 and Y_2 will already determine W: there exist matrices M, N (invertible) such that

    M*Y_1 N = Λ̂_1
    M*Y_2 N = Λ̂_2

where Λ̂_1, Λ̂_2 are diagonal matrices of size d x d, each with δ nonzero entries. Reducing these matrices to δ x δ diagonal matrices with nonzero entries on the diagonal, and trimming M and N likewise to full rank δ x d matrices M̂ and N̂, we obtain the decomposition

    M̂*Y_1 N̂ = Λ_1
    M̂*Y_2 N̂ = Λ_2.

M̂ and N̂ are unique up to equal permutations of their columns and (possibly different) right diagonal invertible factors. This uniqueness implies that, after a suitable diagonal scaling, we can arrange it such that M̂ = N̂ = W†, or W = N̂†, with each row of W having norm n^(1/2).

For the case where Y_1 and Y_2 are not of rank δ, it is possible that they do not fully determine W, so that the other Y_k also have to be taken into account. It is obvious that it is possible to obtain W also in this case, but we omit the details of this more general procedure at this point. Numerically, it is better to take all Y_k into account in all cases, and this is of course preferable in the presence of noise as well. Such an algorithm is described in the next section.

Hence, we have shown at this point that in the absence of noise, the CM problem is, in fact, a generalized eigenvalue problem and can be solved explicitly.

IV. THE CM PROBLEM WITH ADDITIVE NOISE

A. Equivalent Optimization Problem

With noise added to the data, the CM conditions can in general no longer be satisfied exactly, and we have to settle for signals that are close to constant modulus. Define the distance function of signals to the set of CM signals,

    dist²(s, CM) = Σ_{k=1}^{n} ( |(s)_k|² - 1 )².

In terms of this distance, the problem can be posed as finding δ independent signals s that are minimizers of

    min { dist(s, CM) : s ∈ row(X) }        (18)

where row(X) is the estimated row span of S, which we will take to be the principal row span of X as determined using an SVD. Thus, let X = UΣV as before, and let d be the number of singular values of X that are significantly larger than zero. The detection of d from the singular values is relatively straightforward if the noise power and the statistical distribution of the noise are known, but notoriously nontrivial otherwise (cf. [42] and references therein; we do not go into details here). The rows of V corresponding to these singular values form an orthogonal basis of the principal row span of X and are collected in the matrix V̂. The matrices P and P̂ can be constructed from V̂ as in Section II-C. The following proposition is a result of expressing the cost function in (18) in terms of P̂ and w.

Proposition 7: The CM problem with noise (18) is solved by finding the set of all linearly independent minimizers y of

    min ‖P̂y‖²        (19)

subject to y = w ⊗ w̄, ‖w‖ = n^(1/2). For each such y, the corresponding signal s is s = (cw)V̂, where the corrective scaling c is given by c² = n/(n + ε²), with ε the residual of (19).

Proof: See Appendix B.

The correction of w by c is close to one and of no importance in practice, as it will only scale the amplitude of the corresponding signal s.

Minimizing (19) with the given conditions on y undoubtedly requires some iterative method, but the route set out by the solution of the noiseless case will provide accurate initial points for such a method. Thus, we first compute a basis of orthogonal vectors y_k that solve (19) without the structural constraint; as before, these follow from an SVD of P̂ as the right singular vectors corresponding to the smallest δ singular values (the numerical kernel of P̂). The number of CM signals is estimated from the number of singular values that are significantly smaller than the others; a suitable threshold level is given in (30) later in this paper. The next step is to unstack the vectors y_k into corresponding matrices Y_k = vec⁻¹(y_k), and subsequently impose the required Kronecker structure onto these matrices: linear combinations of the Y_k should result in matrices that are close to rank-1 Hermitian matrices of the form w*w, i.e.,

    α_1 Y_1 + ... + α_δ Y_δ ≈ w*w.        (20)
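The noiseless statement of Theorem 6 can be checked numerically. In the sketch below (a synthetic test written for this text, not part of the paper), W and diagonal Λ_1, Λ_2 are drawn at random, and the rows of W are recovered from the eigenvector matrix of the generalized eigenvalue problem Y_1 v = λ Y_2 v, using scipy.linalg.eig:

    import numpy as np
    from scipy.linalg import eig

    rng = np.random.default_rng(0)
    d = 3                                          # take δ = d for simplicity
    W = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    Y1 = W.conj().T @ np.diag(rng.standard_normal(d)) @ W   # W* Λ1 W
    Y2 = W.conj().T @ np.diag(rng.standard_normal(d)) @ W   # W* Λ2 W
    _, N = eig(Y1, Y2)            # eigenvectors: scaled columns of W^{-1}
    W_est = np.linalg.inv(N)      # rows are the rows of W, up to scale and order
    # check: each estimated row is parallel to one true row of W
    C = np.abs(W_est @ W.conj().T) / (
        np.linalg.norm(W_est, axis=1)[:, None] * np.linalg.norm(W, axis=1))
    print(np.allclose(C.max(axis=1), 1.0))         # True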

B. Simultaneous Diagonalization as a Super-Generalized Schur Problem

Assume for the moment that there is no noise added to X. As we have seen in Theorem 6, the matrix W ∈ C^{δ x d} that we try to find is full rank and such that

    Y_1 = W*Λ_1 W
    ...
    Y_δ = W*Λ_δ W.        (21)

With noise, we can try to find M = W† to simultaneously make M*Y_1 M, ..., M*Y_δ M as diagonal as possible. Because M is not unitary, the fact that it has to have full rank is hard to quantify, and it makes sense to rewrite this δ-generalized eigenvalue problem as a δ-generalized Schur decomposition. We first explain the procedure for the noise-free case. Bring in a QR factorization of W* and an RQ factorization of W,

    W* = Q*R',    W = R''Z*

where Q, Z are unitary d x d matrices, and R' ∈ C^{d x δ}, R'' ∈ C^{δ x d} are upper triangular. The factorizations are of course related, but we will ignore this for the moment. If δ < d, then we can arrange it such that only the leading δ columns of R' and rows of R'' are nonzero (and nonsingular). Substitution into (21) leads to

    QY_1 Z = R_1
    ...
    QY_δ Z = R_δ        (R_1, ..., R_δ ∈ C^{d x d} upper triangular)        (22)

where

    R_1 = R'Λ_1 R''
    ...
    R_δ = R'Λ_δ R''.        (23)

Only the top-left δ x δ block of each R_k is nonzero. In addition, each of these blocks is nonsingular. Hence, there exist Q, Z such that QY_k Z is upper triangular, for k = 1, ..., δ, which is a generalized Schur decomposition, but for δ matrices rather than two. With this decomposition, it is seen that a parameter vector [α_1 ... α_δ] satisfies (20) only if

    α_1 R_1 + ... + α_δ R_δ  is rank 1.        (24)

With the model of R_1, ..., R_δ in (23), we obtain, equivalently, that

    R'(α_1 Λ_1 + ... + α_δ Λ_δ)R''  is rank 1.

Since all the Λ_k are diagonal, the α_k are straightforward to compute: only one entry of the diagonal matrix α_1 Λ_1 + ... + α_δ Λ_δ can be nonzero. Setting this entry equal to one, all possible parameter vectors [α_1 ... α_δ] follow by constructing a matrix whose rows consist of the diagonal entries of the Λ_k; the rows of the inverse of this matrix are the independent vectors [α_1 ... α_δ]. A straightforward generalization shows that we do not need to compute the factorization (23), as the parameters can be computed directly from the main diagonals of the R_k without knowing the Λ_k:

Proposition 8: For given Y_1, ..., Y_δ, assume the decomposition (22) and the existence of decomposition (23). All independent parameter vectors [α_1 ... α_δ] such that Ŷ := α_1 Y_1 + ... + α_δ Y_δ has rank 1 are given by the rows of A := R⁻¹, where the kth row of R contains the main diagonal of R_k:

    R = [ (R_1)_11 ... (R_1)_δδ ]
        [    :             :    ]
        [ (R_δ)_11 ... (R_δ)_δδ ].

Proof: Because the R_k are upper triangular, a necessary condition for (24) to hold is that the resulting matrix has a main diagonal with at most one nonzero entry. But, in view of the existence of factorization (23) with R', R'' having nonsingular main diagonals, it cannot happen that the main diagonal of the result is all zero.

Note that Proposition 8 by itself does not ensure that Ŷ is Hermitian, unless, e.g., all Y_k are Hermitian and all α_k are real. This feature is a side effect: the set of δ independent solutions Ŷ is in our case unique, and it suffices to enforce the rank-1 property.

Factoring each of the δ rank-1 matrices that is obtained in this way gives δ independent vectors w, which form the rows of the matrix W that we were looking for in (21). Hence, in the noise-free case, the computation of a "super-generalized" Schur decomposition, i.e., two unitary matrices Q, Z, gives the solution to the simultaneous diagonalization problem. Although it seems at first sight that we have doubled the number of parameters to estimate (two matrices Q, Z rather than one matrix W), this is not true: the fact that the matrices are unitary makes the total number of parameters to estimate precisely the same. However, the constraint that Q, Z be unitary is a desirable condition, whereas the fact that W must have full rank is difficult to handle.

We now return to the case where the data matrix X is distorted by noise. In this case, there are no unitary Q, Z that simultaneously make all matrices Y_k exactly upper triangular. However, we can try to find Q, Z that make these matrices as much upper triangular as possible, by minimizing the Frobenius norm of the residual lower triangular entries. One approach for doing this is described in Section IV-C below. It is an extension to more than two matrices of the usual QZ iteration for computing the generalized Schur decomposition of two matrices. There are several other approaches for solving simultaneous diagonalization problems as well, as discussed in that subsection.

With Q, Z and hence R_1, ..., R_δ obtained this way, we can compute all independent parameter vectors [α_1 ... α_δ] as in Proposition 8. Each parameter vector gives a matrix Ŷ, approximately of the form Ŷ ≈ w*w, and each w can be estimated as the singular vector corresponding to the largest singular value of each Ŷ. It remains to scale w to ensure that ‖w‖ = n^(1/2).


Given a matrix X = AS + N ∈ C^{m x n}, an estimate of S ∈ CM is obtained as follows:

1. Estimate row(X):
   a. Compute SVD(X): X =: UΣV
   b. Estimate d = rank(X) from Σ: the number of signals
   c. V̂ := first d rows of V

2. Estimate ker(P̂), which summarizes all CM conditions:
   a. Construct P̂: (n-1) x d² from V̂ =: [v_1 ... v_n]:
          P := [vec(v_1 v_1*) ... vec(v_n v_n*)]^T
          [p̂; P̂] := QP, with Q as in (11)
   b. Compute SVD(P̂): P̂ =: U_P Σ_P V_P
   c. Estimate δ = dim ker(P̂) from Σ_P: the number of CM signals
   d. y_1, ..., y_δ := the δ right singular vectors of P̂ corresponding to its smallest singular values

3. Solve the simultaneous diagonalization problem (21):
   a. Y_1 := vec⁻¹(y_1), ..., Y_δ := vec⁻¹(y_δ)
   b. Find Q, Z to make R_1 := QY_1Z, ..., R_δ := QY_δZ approximately upper triangular (Section IV-C)
   c. From R_1, ..., R_δ, compute all vectors [α_k1 ... α_kδ], k = 1, ..., δ, such that
          Ŷ_k := α_k1 Y_1 + ... + α_kδ Y_δ  is approximately rank 1 (Proposition 8)

4. Recover the signals: for each Ŷ_k:
   a. Compute w_k such that Ŷ_k ≈ w_k* w_k
   b. Scale w_k such that ‖w_k‖ = n^(1/2)
   c. s_k := w_k V̂
   (d. Optionally perform a few Gerchberg-Saxton iterations, as in (3))

The vectors s_1, ..., s_δ are the rows of S.

Fig. 2. Analytic constant modulus factorization algorithm (ACMA). The vectoring operations in Steps 2a and 3a may be replaced by Hermitian vectoring operations (see Appendix A).
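The following Python sketch implements the noiseless core of Fig. 2 for the case where all d signals are CM (δ = d); it is written for this text and replaces the extended QZ step 3b by the two-matrix generalized eigenvalue decomposition of Theorem 6, which suffices without noise:

    import numpy as np
    from scipy.linalg import eig

    def acma_noiseless(X, d):
        """Steps 1-2 and 4 of Fig. 2; step 3 via eig(Y1, Y2) (Theorem 6)."""
        m, n = X.shape
        assert n > d * d
        # Step 1: orthonormal basis of the (principal) row span of X
        _, _, Vh = np.linalg.svd(X, full_matrices=False)
        Vhat = Vh[:d]
        # Step 2: condition matrix P of (8) and kernel of P-hat
        P = np.array([np.outer(v, v.conj()).flatten() for v in Vhat.T])
        F = np.fft.fft(np.eye(n)) / np.sqrt(n)     # DFT choice of Q in (11)
        Phat = (F @ P)[1:]
        _, _, Vp = np.linalg.svd(Phat)
        Y = [Vp[-(k + 1)].conj().reshape(d, d) for k in range(d)]
        # Step 3: simultaneous diagonalization, here via two pencil members
        _, E = eig(Y[0], Y[1])
        W = np.linalg.inv(E).conj()
        # Step 4: scale each row to norm sqrt(n) and reconstruct the signals
        W *= np.sqrt(n) / np.linalg.norm(W, axis=1, keepdims=True)
        return W @ Vhat

    rng = np.random.default_rng(1)
    d, m, n = 3, 4, 40
    S = np.exp(2j * np.pi * rng.random((d, n)))    # CM sources, random phases
    A = rng.standard_normal((m, d)) + 1j * rng.standard_normal((m, d))
    S_est = acma_noiseless(A @ S, d)
    print(np.allclose(np.abs(S_est), 1))           # recovered rows are CM

The recovered rows match the sources up to the admissible permutation and unit-modulus scaling discussed in Section II-A.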

The above scheme provides an approximate solution to the problem in Proposition 7, i.e., the CM factorization problem with additive noise. The algorithm is summarized in Fig. 2. We call it the analytic CM factorization algorithm (ACMA). It is not clear in what sense the solution approximates the optimal solution; however, it finds the exact solution if there is no noise, and simulations give very accurate results for moderate noise levels or large n. For high noise levels, closely spaced signals, or small n, the vectors w that are obtained by the above procedure can be used as initial starting points in a Gerchberg-Saxton iteration, which effectively searches for the minima of (18). Since these starting points are accurate, a few iterations suffice, and independent signals are almost always obtained, except in severely ill-conditioned cases. Examples of the application of the algorithm to simulated and measured data are given in Section V.
C. Super-Generalized Schur Decomposition

In this subsection, we describe a possible approach to the super-generalized Schur decomposition problem: for given matrices Y_1, ..., Y_δ, find Q, Z (unitary) such that

    QY_1 Z =: R_1, ..., QY_δ Z =: R_δ        (25)

where R_1, ..., R_δ are as much upper triangular as possible. Our approach is to modify the standard QZ iteration method used for computing the Schur decomposition of two matrices so that it works for more than two matrices. There are several ways to do this. We will present a variant that treats all matrices Y_1, ..., Y_δ equally.

The QZ iteration for computing the Schur decomposition of two matrices [40] starts with setting Q^(0) = I, Z^(0) = I. At the kth iteration step, a unitary matrix Q^(k) is computed such that Q^(k)(Y_1 Z^(k-1)) is upper triangular, and a unitary matrix Z^(k) is computed to make (Q^(k) Y_2) Z^(k) upper triangular. As an extension to more than two matrices, we propose the following two-step iteration. Denote by ‖·‖_L the Frobenius norm of the strictly lower triangular part of a matrix.


[Extended QZ iteration]⁴

    for k = 1, 2, ...
      a. find Q^(k) (unitary) to minimize Σ_i ‖Q^(k) (Y_i Z^(k-1))‖_L²
      b. find Z^(k) (unitary) to minimize Σ_i ‖(Q^(k) Y_i) Z^(k)‖_L²        (26)

Each of the two steps in the iteration poses a least-squares problem with an exact solution which might, however, be hard to find. The customary idea in such situations is to find an approximate solution to each of the steps and rely on the outer iteration to provide convergence.

To describe the approximate solution to Step (a) (Step (b) is similar), suppose that, at the kth stage, we have matrices R_1 := Q^(k-1) Y_1 Z^(k-1), ..., R_δ := Q^(k-1) Y_δ Z^(k-1), not yet upper triangular, and we have to find a unitary matrix Q that minimizes the below-diagonal norm of QR_1, ..., QR_δ. Recall that for a single matrix R_1, a QR factorization gives the solution, and is obtained as a product of Householder rotations H_1, ..., H_{d-1}, where a single H_i maps the below-diagonal entries of the ith column of the matrix to zero. This approach may be mimicked for the simultaneous triangularization of a set of matrices, although we can only try to make the below-diagonal entries small. Thus, Q in Step (a) is obtained as the product of d-1 elementary unitary matrices,

    Q = H_{d-1} ... H_2 H_1.

The first factor, H_1, is designed to simultaneously minimize the below-diagonal norms of only the first column of each of the matrices R_1, ..., R_δ. Similarly, H_k is used to minimize the below-diagonal norm of all the kth columns. Denote by (R_1)_1 the first column of R_1, and similarly for the other R_k; then

    H_1 [(R_1)_1 ... (R_δ)_1] = [ * ... * ]
                                [    E    ]

where [* ... *] is the first row of the result, and E contains the remaining rows. The objective is to find H_1 such that ‖E‖_F is minimized. The solution is not unique, but a possible H_1 follows directly from an SVD:

    [(R_1)_1 ... (R_δ)_1] = UΣV*,    H_1 := U*.

Indeed, for this choice of H_1, we have ‖E‖_F² = σ_2² + ..., which is as small as we can hope for.

After H_1 has been computed and applied to R_1, ..., R_δ, we have obtained new matrices R_1', ..., R_δ', with the below-diagonal norm of the first columns minimized. The next factor, H_2, is used to minimize the below-diagonal norm of the second columns of these matrices. As H_2 is unitary and does not affect the first rows of R_1', ..., R_δ', this will not change the below-diagonal norm of the first columns. In fact, H_2 can be found in precisely the same way as H_1 by looking at the reduced problem where we act on R_1', ..., R_δ' with their first rows and columns removed. The matrices H_3, ..., H_{d-1} follow, in turn.

The reason that this does not necessarily provide the optimal solution to the LS problem of Step (a) is that H_1 only looks at the first columns of the matrices R_k, and might introduce potentially large entries in the below-diagonal part of subsequent columns. It is not even guaranteed that the below-diagonal norm is lower than before. Note that this is nothing new: the same happens in the original QZ iteration for two matrices, and nonetheless it converges (except perhaps for strongly nonnormal problems).

The resulting QZ iteration (26) is observed in simulations to converge fast, usually quadratically in 3-5 iterations. At this point, there is no proof of convergence. Hints for a possible proof might be provided by convergence proofs for the standard QZ iteration. Because the inner loop consists of SVD's, the scheme is only practical if d is small, which is certainly the case for the currently envisioned applications.

Remark: While this paper was in review, we learned about other approaches to the super-generalized Schur decomposition, and to simultaneous diagonalization problems in general. These problems are not entirely new to the SP community: in the context of blind beamforming of non-Gaussian signals, analysis of the fourth-order cumulants of the data matrix has led to problems of the form (21), viz. [15], [43], as well as related problems of the form Q*X_i Q = Λ_i, i = 1, ..., d, with the X_i Hermitian, Q unitary, and all Λ_i close to diagonal, viz. [17]. The fact that Q is unitary instead of just nonsingular is a consequence of the statistical expectation operator: with infinite data, our W would be unitary as well (cf. footnote 2). In [18] and [43], a whitening transformation derived from the data covariance matrix immediately reduces the problem to Q*X_i Q = Λ_i as well.

Overviews of several such problems are given in [44] and [45]. The algorithms in [44] are of Jacobi type, and intended for solving Q*X_i Q = Λ_i and some structured variants. A similar Jacobi algorithm is proposed in [17]. Such Jacobi iterations are readily set up for the generalized Schur decomposition (25) as well, although one has to be careful about outer and inner rotations to ensure convergence; cf. [46]. The (real-valued) QZ problem is considered in [45] and solved using isospectral flows, which results in a steepest gradient-type algorithm. For the diagonalization problem (21) with positive definite matrices X_i, the nonorthogonal FG+ algorithm of [47] may be used. The orthogonal variant of this algorithm is a generalization of the cyclic Jacobi algorithm. Note that in our application the Y_k, even if they are constructed to be Hermitian, are not necessarily positive, although we may try to find linear combinations that are positive. The approach in [18] and [43] is to find one such positive combination, then try to glean W from a Cholesky decomposition (or Schur decomposition, after a whitening transformation) of this single matrix. Numerically, this is likely to be suboptimal, because in the end only two matrices determine the decomposition. So far, none of the above approaches has proven convergence, but reported experimental results are invariably positive.

⁴ A different suitable initialization follows from a Schur decomposition of just Y_1 and Y_2.
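A sketch of the Step (a) update described above (written for this text; the Z-step can be treated analogously by flipping and transposing the matrices):

    import numpy as np

    def lower_norm(R):
        """Frobenius norm of the strictly lower triangular part."""
        return np.linalg.norm(np.tril(R, k=-1))

    def step_a(Rs):
        """Approximate LS solution for Q: a product of SVD-based unitary
        factors H_1, ..., H_{d-1}, each minimizing the below-diagonal norm
        of the i-th columns of the (deflated) matrices."""
        d = Rs[0].shape[0]
        Q = np.eye(d, dtype=complex)
        Rs = [R.astype(complex).copy() for R in Rs]
        for i in range(d - 1):
            C = np.column_stack([R[i:, i] for R in Rs])  # stacked i-th columns
            U, _, _ = np.linalg.svd(C)
            Hi = np.eye(d, dtype=complex)
            Hi[i:, i:] = U.conj().T       # concentrates column energy in the top row
            Q = Hi @ Q
            Rs = [Hi @ R for R in Rs]
        return Q, Rs

    # demonstration on matrices that are exactly jointly triangularizable
    rng = np.random.default_rng(0)
    d, delta = 4, 3
    G = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    Q0, _ = np.linalg.qr(G)
    Ys = [Q0.conj().T @ np.triu(rng.standard_normal((d, d))) for _ in range(delta)]
    Q, Rs = step_a(Ys)                    # here Z = I suffices
    print(max(lower_norm(R) for R in Rs)) # ~ 0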


Fig. 3. Singular values of P̂: no noise, θ_2 = 30°, for n = 100, 26, 17. (a) Four CM signals. (b) Two CM signals. (c) Similar to (a), but with θ_2 = 5°.

D. Computational Complexity

We briefly investigate the computational complexity of the proposed algorithm (Fig. 2). The ACMA consists of mainly three computational steps: an SVD of X (size m x n), an SVD of P̂ (size (n-1) x d²), and a simultaneous diagonalization of δ matrices Y_k of size d x d. The second SVD is the most expensive and has order n(d²)² operations. Since we require n > d², and m ≥ d, the complexity of this step is at least O(d⁶). In comparison, the first SVD has O(m²n) ≥ O(d⁴) operations, and the complexity of the simultaneous diagonalization step is also O(d⁴). This implies that d cannot be very large and that the algorithm is too complex for equalization purposes (where sometimes d > 100 is taken).⁵ Since only subspaces are needed but not the individual singular vectors, the SVD's may be replaced by any other principal subspace estimator, such as provided by the Schur subspace estimation (SSE) method [48], the URV updating [49], the PAST method [50], the FSD [51], or the FST [52]. The latter three algorithms can also exploit the fact that only δ kernel vectors out of d² singular vectors are needed, which gives rise to significant savings.

⁵ As mentioned before, the CM equalization problem satisfies the same model, but has different properties: the data matrix has a Hankel structure, and, in principle, it suffices to find only one solution. In this case, a combination with other (intersection-type) algorithms that make use of this structure is probably preferable.

TABLE I
APPROXIMATE COMPUTATIONAL COMPLEXITY OF BLIND ALGORITHMS (kflop PER SNAPSHOT)

             GSA (kflop)                 ACMA (kflop)
         m=4   m=6   m=8   m=10      m=4   m=6   m=8   m=10
  d=2    0.8   1.2   1.8   2.4       0.7   1.4   2.4   3.7
  d=4    1.4   2.2   3.1   4.0       2.9   3.6   4.6   5.9
  d=6          3.2   4.4   5.6            13.0  14.0  15.3
  d=8                5.6   7.2                  39.2  40.5
  d=10                     8.8                        93.6
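The entries of Table I can be reproduced from the flop counts (27) and (28) derived below (per snapshot, i.e., n = 1); a short Python check:

    # evaluate (27) and (28) for n = 1, in kflop
    def acma_kflop(m, d):
        return (9 * d**4 + 36 * m**2) / 1e3        # (27)

    def gsa_kflop(m, d):
        return (80 * d * m + 8 * m**2) / 1e3       # (28)

    for d in (2, 4, 6, 8, 10):
        print(d, [(m, round(gsa_kflop(m, d), 1), round(acma_kflop(m, d), 1))
                  for m in (4, 6, 8, 10) if m >= d])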

In addition, all above-mentioned methods allow for efficient updating of the subspaces for increasing n, so that the ACMA algorithm may be transformed from a block algorithm into an adaptive algorithm. (Interesting complications arise because the two SVD's operate in conjunction. A more detailed analysis of the possibilities is beyond the scope of the present paper.)

To illustrate the computational requirements in a more quantitative manner, we consider a replacement of the SVD by the SSE. Without updating, one implementation of the SSE (the SSE-2 [53]) has a complexity of about m²n complex multiplications and 2m²n complex rotations (for a matrix of size n x m). Assuming no special-purpose rotation processors, we set one complex rotation equal to four complex multiplications, and one complex multiplication equal to four real floating point operations (flop).


Fig. 4. Four CM signals. Each signal has SNR = 15 dB, θ_2 = 30°. (a) Singular values of P̂. (b) Convergence of the Gerchberg algorithm for analytically computed initial points, and for random initial points, for n = 17.

floating point operations (flop). In that case, the SSE of X takes 36m²n real flop, and (since P̂ can be transformed to a real matrix) the SSE of P̂ takes 9(d²)²n real flop, so that the complexity of ACMA is

    ACMA: 9d⁴n + 36m²n real flop.    (27)

In comparison, the complexity of the GSA (3) is mainly determined by a loop containing two complex matrix multiplications, W·X and S·X† (where W: d × m, X: m × n, S: d × n), not taking additional soft-orthogonalization steps or restarts into account. Each of the multiplications has a complexity of dmn complex operations. About ten iterations of the inner loop are usually sufficient, although occasionally many more are needed. In addition, the computation of X† = X*(XX*)⁻¹ calls for about 2m²n complex multiplications (ignoring the inversion), or 8m²n real flop. Altogether, the complexity of GSA is approximately

    GSA (CMA): 4·10·2dmn + 8m²n = 80dmn + 8m²n real flop.    (28)

The standard CMA may be viewed as an updating version of the GSA, where instead of iterating on the same data, new data is continuously introduced. It is not likely to converge faster than the GSA, viz. [37], so that it has at least the complexity of (28). Table I gives a listing of (27) and (28) for a range of values of m and d, in kflop per snapshot (i.e., n = 1). It is seen that for up to six sources the complexity of the ACMA is comparable to that of the GSA.

V. EXPERIMENTAL EVALUATION

To assess the performance of the algorithm, we have applied it to a number of test matrices, based both on computer-generated data and on real data collected from an experimental set-up. The results are quite convincing. For example, the algorithm could find weight vectors to separate a superposition of four CM signals using four sensors and only 17 data samples, in well-conditioned cases even if each signal has a signal-to-background-noise ratio of 5 dB.

A. Computer-Generated Data

We first study the performance of the algorithm on computer-generated data. The set-up of this experiment is kept extremely simple on purpose. We simulate a uniform linear array of m = 4 isotropic sensors, spaced λ/2 apart, where λ is the wavelength of the carrier frequency of the signals.⁶ The resulting main lobe has a beam width of approximately 26°. There are d = 4 signals present, with angles of arrival θ₁ = 0°, varying θ₂ (θ₂ = 30° or 5°), θ₃ = 60°, and θ₄ = -20°. The number of CM signals among the four signals is varied from δ = 4 to δ = 2. The number of samples is varied, too, and taken to be n = 100, 26, and 17. The CM signals that are generated are sequences of unit-modulus numbers with uniformly distributed random phase. The other (non-CM) signals are normally distributed random complex numbers with zero mean and unit variance. The signals are scaled according to their relative SNR's.

In the first experiment, we consider the noiseless case. Fig. 3(a) shows plots of the singular values of P̂, for θ₂ = 30° and with d = δ = 4 CM signals. In Fig. 3(b), only the first two signals are CM; the other two are Gaussian. In Fig. 3(c), the number of CM signals is again equal to four, but θ₂ is taken to be 5°. It is seen that the number of zero singular values is precisely equal to the number of CM signals, as predicted by Proposition 5. Changing the number of CM signals or moving the angles of arrival closer does not influence the distribution of the other singular values by much. In particular, the level of the smallest nonzero singular value stays roughly constant. The distribution of the nonzero singular values does change with n: they tend to be located along slanted lines. For larger n, the graph flattens, which facilitates detection. As there is no noise in the example so far, the CM signals can be retrieved without errors. To give an idea of the convergence speed of the extended QZ iteration, we list the total below-diagonal norm of the matrices Rₖ in (25) after each iteration, for an instance of case (a), n = 17: 0.3, 9·10⁻⁶, 4·10⁻¹⁵.

In the next series of experiments, the same set-up is used, but now we add normally distributed independent white noise to all samples.

⁶This information is not used in any way by the algorithm; any other array geometry would have been suitable as well.
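
To make the construction of P̂ concrete, the following numpy sketch builds it from a data block X along the lines of (7) and (8) and returns its singular values; the projection shortcut for the unitary Q, and all names, are our own simplifications rather than the paper's code:

    import numpy as np

    def cm_kernel_svals(X, d):
        """Singular values of P-hat for an m x n data block X with d signals."""
        # V-hat: the d dominant right singular vectors of X (rows v_k), an isometry.
        _, _, Vh = np.linalg.svd(X, full_matrices=False)
        V = Vh[:d, :].conj().T                           # n x d
        # Row k encodes the CM condition |w v_k|^2 = 1 as (conj(v_k) kron v_k) y = 1.
        P = np.stack([np.kron(v.conj(), v) for v in V])  # n x d^2
        # Applying a unitary Q with first row n^{-1/2}[1...1] and dropping that row
        # leaves the same nonzero singular values as projecting out the mean row.
        Phat = P - P.mean(axis=0)
        return np.linalg.svd(Phat, compute_uv=False)

    # Example: two CM sources and two Gaussian sources through a random mixture.
    rng = np.random.default_rng(0)
    n, m, d = 100, 4, 4
    S = np.vstack([np.exp(2j * np.pi * rng.random((2, n))),
                   (rng.standard_normal((2, n)) + 1j * rng.standard_normal((2, n))) / np.sqrt(2)])
    A = rng.standard_normal((m, d)) + 1j * rng.standard_normal((m, d))
    print(np.round(cm_kernel_svals(A @ S, d), 4))        # exactly two near-zero values

In the noiseless case, the number of (near-)zero singular values returned this way equals the number of CM signals, in agreement with Proposition 5.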


Fig. 5. Four signals, two CM signals. Each signal has SNR = 15 dB, θ₂ = 30°.

Fig. 6. Four CM signals. Signal 2 has SNR = 5 dB, θ₂ = 30°; the other signals have SNR = 15 dB.

Fig. 7. Four CM signals. Each signal has SNR = 5 dB.

The SNR of each signal with respect to the noise level is set to 15 dB per antenna element. We first take all signals to be CM, and θ₂ = 30°. The singular value plot of P̂ is shown in Fig. 4(a). The previously zero singular values are now raised by some amount, but there still is a gap between the small and the larger ones. To evaluate how close the analytically computed weight vectors w₁, ..., w₄ are to the optimal solution of the problem (the minimizers of the distance function in (18)), these weight vectors are used as initial points in the Gerchberg-Saxton iteration, viz. (3). Fig. 4(b) shows the computed average modulus error of ŴX after each iteration step (solid line), for the case n = 17. We have chosen this "1-2" norm rather than the "2-2" norm in (17) because it has a nicer physical interpretation (as the standard deviation of the modulus of the signals), and because the convergence of the GSA is usually monotonic in this norm, but not in (17). It is seen that the post-processing hardly changes the computed wₖ, which is reflected by the horizontal lines: they are almost equal to the optimal values. For n = 26 (not shown), the lines are perfectly straight. Although not clearly visible in Fig. 4(b), all four signals are resolved; because the signals have the same amplitude, the modulus error lines tend to overlap. The independence of the retrieved signals was checked by computing their covariance. The value of the modulus error is commensurate with the



TABLE II
WORST SIR [dB] AFTER SEPARATION, CASE SNR(s₂) = 17.6 dB (WORST RECEIVED SIR = -2.4 dB/ANTENNA)

          d̂=4, δ̂=4      d̂=5, δ̂=4      d̂=5, δ̂=5        d̂=6, δ̂=4
          ACMA  +GSA    ACMA  +GSA    ACMA    +GSA      ACMA  +GSA
n=100     36.0  34.8    35.4  34.8    35.1    34.8      36.0  34.8
n=50      27.2  26.8    23.5  26.9    (18.9)  (36.9)    19.5  26.9
n=26      12.6  22.8     6.6  25.1     3.0    25.3
n=17       8.2  17.2

(.): not all signals were recovered

TABLE III
WORST SIR [dB] AFTER SEPARATION, CASE SNR(s₂) = 7.6 dB (WORST RECEIVED SIR = -11 dB/ANTENNA)

          d̂=4, δ̂=4        d̂=5, δ̂=4        d̂=5, δ̂=5        d̂=6, δ̂=4
          ACMA    +GSA    ACMA    +GSA    ACMA    +GSA    ACMA    +GSA
n=100     (14.1)  (34.9)  (14.2)  (33.2)  (19.6)  (26.1)  (12.7)  (12.4)
n=50      (8.2)   (7.0)   (9.5)   (12.8)

(.): not all signals were recovered

Fig. 8. Four CM signals. Each signal has SNR = 15 dB, θ₂ = 5°.
noise level and number of antennas: SNR = 15 dB translates to an expected modulus error of 0.063 (roughly 10^(-SNR/20)/√(2m)) in situations without cochannel interference.
Also shown in Fig. 4(b) is the performance of the Gerchberg-Saxton algorithm when started with random initial weight vectors (dashed lines), which would be the usual approach to the CM problem. It is seen that not always all signals are retrieved, that the convergence can be extremely slow, and that the algorithm sometimes converges to suboptimal stationary values. We mention that for larger n (say n = 100), the local minima are usually not attractive, but recovering all independent signals remains an issue (solved to some extent by "soft-orthogonalization," as mentioned in the introduction, but this was not implemented in these simulations).
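
The post-processing iteration used in these comparisons is the alternating projection of (3); a minimal numpy sketch under our own naming (the soft-orthogonalization step is omitted, as it was in these simulations):

    import numpy as np

    def gsa(X, W, iters=10):
        """Gerchberg-Saxton-type CM iteration: force unit modulus on the outputs,
        then refit the weights in the least-squares sense."""
        Xpinv = np.linalg.pinv(X)        # X^dagger = X^*(XX^*)^{-1}
        for _ in range(iters):
            S = W @ X                    # d x n output signals
            S = S / np.abs(S)            # project the outputs onto unit modulus
            W = S @ Xpinv                # least-squares refit of the weight vectors
        return W

    def modulus_error(W, X):
        """Average "1-2" modulus error: per-signal standard deviation of the
        output modulus around 1, averaged over the signals."""
        S = W @ X
        return np.mean(np.sqrt(np.mean((np.abs(S) - 1.0) ** 2, axis=1)))

The modulus_error function is the quantity plotted in Fig. 4(b) and the following figures, under our reading of the "1-2" norm.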
Essentially the same remarks can be made for the case where two out of four signals are CM signals (Fig. 5), and when the signal power of one or all of the signals is reduced to 5 dB (Figs. 6 and 7). The effect of a reduced signal power (or increased noise level) is seen in the singular values of P̂ as an increase in the small singular values, which will limit the detection at some point. The average modulus error of a 5-dB signal is expected to be 0.20, which matches with the figures.


The angular spacing between the first two signals can be reduced to 10° without problems. When the spacing is further reduced to 5° (Fig. 8), the detection of the two closely spaced CM signals becomes problematic, and for n = 26, GSA postprocessing failed to keep two of the signals independent. The reason is that, because of the close angles of arrival, the condition number of the A-matrix is increased (from 2.3 to 11.9), which, at this noise level, is sufficient to close the gap between the large and small singular values for any n. This is confirmed by theoretical predictions of the gap size (Section V-C). Nonetheless, n = 100 was sufficient in simulations to separate the signals even without additional GSA iterations: the loss of gap prevents detection of the number of CM signals, but not necessarily their separation.
B. Experiments on Measured Data

The algorithm was also tested on data collected from an experimental rooftop antenna array.⁷ The configuration of the array is shown in Fig. 9. The receiving array consisted of m = 6 isotropic antennas, where antennas x₁-x₅ formed part of an airplane DF array with a baseline of approximately 1.5 m, and antenna x₆ was a dipole at approximately 1 m to the right of the array. Located nearby were d = 4 dipole antennas, marked s₁-s₄, each broadcasting FM signals at RF carrier frequencies of 902.1 MHz ± 200 Hz (i.e., the individual carrier frequencies were slightly offset). The signal transmitted by source s₁ was an FM-modulated tone of 1 kHz; signals s₂-s₄ consisted of FM-modulated speech and music. The received signals were RF-demodulated, sampled at 37.5 kHz (complex), digitized at 12 b, and bandlimited at 25 kHz. The actual 10-dB bandwidth of the sources was around 6 kHz. In the first experiment, the power of each transmitted signal with respect to the ambient background noise (SNR) was 19.1, 17.6, 17.9, and 16.7 dB, respectively. In a second experiment, the power of s₂ was lowered to SNR(s₂) = 7.6 dB.

In Fig. 10, the singular values of X and P̂ are shown. For n = 100 and n = 50, it is clear that there are four CM signals. (For the record, we mention that the condition number of A was later estimated as 5.8.) Denote by d̂, δ̂ the parameters used by the ACMA, as opposed to the true values (d = δ = 4).

⁷The data was measured and provided by ARGOSystems, Inc., Sunnyvale, CA, as part of an ongoing research project with Stanford University.


Fig. 9. Experimental set-up, with four FM transmitters (s₁, ..., s₄) and an antenna array consisting of six receivers.

Fig. 11 shows the modulus error during subsequent Gerchberg iterations when the ACMA is run with d̂ = δ̂ = 4. For n = 50, the analytically computed values of w are hardly changed; for n = 17, the Gerchberg iterations improve a bit on the w. With random initializations, the Gerchberg iterations may converge to at least two spurious local minima.
Table II lists the estimated signal-to-interference ratios (SIR's) obtained by the ACMA, both before and after the additional GSA iterations. The values are based on the rows of the matrix ŴA, where Ŵ contains the weight vectors as determined by the algorithm for the listed n, d̂, δ̂, and A is an estimate of the unknown true A-matrix, computed using the ACMA on n = 400 samples and with d̂ = δ̂ = 4. ŴA should be close to the identity matrix (or a permutation and diagonal scaling thereof). The table shows the results obtained for various choices of the parameters d̂ and δ̂ used in the algorithm. It is seen that overestimating d̂ is not really a problem, provided n is large enough. (The case d̂ = m is of interest because in that case the SVD of X may be replaced by a simple QR factorization.) Overestimating δ̂ as well is sometimes not a problem, but led to a fatal result for n = 50: only two independent signals were obtained. In general, overestimating δ̂ is not a good idea because the algorithm tries to compute a change of basis from an orthonormal basis of ker(P̂) to a rank-one basis {w̄ₖ ⊗ wₖ}. If the orthonormal basis is too large, then there is no suitable transformation to a rank-1 basis, and all estimates of w are affected.
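
The SIR figures in the tables can be reproduced from ŴA along the following lines (a sketch; the greedy row-to-source assignment is our own convention):

    import numpy as np

    def worst_sir_db(WA):
        """Worst signal-to-interference ratio [dB] over the rows of W-hat A."""
        power = np.abs(np.asarray(WA)) ** 2
        sirs = []
        for row in power:
            k = int(np.argmax(row))               # dominant source of this output
            interference = row.sum() - row[k]
            sirs.append(10 * np.log10(row[k] / interference))
        return min(sirs)

If two rows claim the same dominant source, not all independent signals were recovered; these cases correspond to the parenthesized entries in Tables II and III.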
In a second experiment, the power of source s₂ was lowered to SNR(s₂) = 7.6 dB. As the spacing between s₁ and s₂ is still only 1.5°, this is a challenging test of the algorithm. (The condition number of A is now 15.9.) Some results are depicted in Figs. 12 and 13. The detection of the other three signals from the singular values of P̂ remained the same, but the fourth singular value (apparently corresponding to s₂) is raised and now somewhere in the middle of the gap between the large and small singular values. The detection that there are four independent signals by inspection of the singular values of X is also more difficult now, even if n = 100. Overestimating d̂ decreases the singular value gap (resolution) in P̂ if n is small, but the gap remains unchanged for larger n.

If d and δ are estimated correctly (d̂ = δ̂ = 4), then all signals are retrieved for n as low as 26 (Fig. 13). For n = 17, the recovered signals were no longer sufficiently independent. Table III lists the SIR's for various choices of the parameters. The improvement in SIR is about the same as for the first experiment. Note that if both d̂ and δ̂ are underestimated (as could easily occur because s₂ does not show up very well in the SVD of X), then s₂ is of course lost, but the behavior of the other estimates is approximately the same.
Note from the tables that the SIR is not always improved by the additional Gerchberg iterations. The reason is that the GSA is connected to a different cost function, (29) rather than (17), with slightly different minima for finite n and SNR. None of these minima necessarily coincides with minimal SIR.
The main conclusion to draw from the experiments carried out in this section is that in all observed cases the algorithm obtains the optimum of the minimization problem if n is sufficiently large. For four signals, n = 50-100 is typically large enough, even under severe conditions. For smaller n, the estimates move away from their optimal values, but usually, the algorithm still finds all CM signals if their number has been estimated correctly, and the optima can be obtained by adding a few iterations of the GSA as postprocessing. The effect of a smaller n is mostly felt in a closing of the gap between the larger and smaller singular values of P̂, which limits the detection of δ. This is mitigated to some extent by the property that the algorithm is quite robust when d̂ or δ̂ are overestimated.
C. Detection Thresholds

What determines the singular values of P̂, and thus the resolution of the algorithm? This is the topic of a separate paper, but it is relevant to at least summarize some of the results here, as they explain some properties of the singular value plots quantitatively.

The large singular values of P̂ tend toward 1, but for small values of n, they are not constant yet but distributed along a line. This distribution is similar to that of the singular values of a random matrix, which has been investigated in [53] and [54]. Extrapolating the result in [53], the smallest among the set of d² - d large singular values is (with probability better than 0.95) expected to satisfy

    min(large sv) > 1 - √((d² - d)/n).

This matches with the experiments earlier in this section as well. At the other side of the gap, the δ small singular values of the numerical kernel should ideally be equal to zero, but with noise they are increased to

    max(small sv) = σ√2 · cond(A)/√m.

Here, σ² is the normalized noise power per sample per antenna: 20 log(1/σ) is the SNR of the strongest signal at a single receiver.

Fig. 10. Experiment with four FM signals and six antennas; SNR(s₂) = 17.6 dB: (a) singular values of X; (b) singular values of P̂, with d̂ = 4.

Fig. 11. Gerchberg iterations for d̂ = 4, δ̂ = 4 (SNR(s₂) = 17.6 dB): (a) n = 50; (b) n = 17.

The noise is enhanced by a factor √2 because of the inherent squaring of the data. The factor cond(A) is the conditioning of A, and includes two effects. When the array response is approximately uniform in all directions, cond(A) is just the square root of the ratio of the power of the strongest signal to that of the weakest: this translates σ into the SNR of the weakest signal. A large (bad) condition number of A may also be due to a close spacing of two signals, as determined by the resolution limit of the array. In such cases, a correction by 1/√2 is sometimes in order. The above two equations allow to derive the maximal noise power for which there still can be a gap, as

    SNR > 3 dB - 10 log m + 20 log cond(A)    (32)

(independent of n), and an indication of the minimal number of samples that is needed in that case, as follows:

    n > [ √(d² - d) / (1 - σ√2·cond(A)/√m) ]²    (33)

(we still require n > d², too). They also allow to set automatic decision thresholds for rank detection in subspace trackers.
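
A small calculator for these two thresholds, using (32) and (33) as reconstructed above (a sketch, not the paper's code):

    import numpy as np

    def min_snr_db(m, cond_A):
        """SNR threshold (32) below which no gap can exist, for any n."""
        return 3.0 - 10 * np.log10(m) + 20 * np.log10(cond_A)

    def min_samples(d, m, cond_A, snr_db):
        """Minimal number of samples (33) for a detectable gap at the given SNR."""
        sigma = 10 ** (-snr_db / 20.0)            # noise amplitude re the strongest signal
        denom = 1.0 - sigma * np.sqrt(2.0) * cond_A / np.sqrt(m)
        if denom <= 0:
            return np.inf                         # below the threshold (32)
        return max(int(np.ceil((d**2 - d) / denom**2)), d**2 + 1)

    # Example: m = 4 and cond(A) = 11.9 give a threshold of about 18.5 dB, which is
    # consistent with the gap closing at SNR = 15 dB in the scenario of Fig. 8.
    print(min_snr_db(4, 11.9))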

VI. CONCLUDING REMARKS

In this paper, we have described an analytic method for solving the constant modulus problem. The method condenses

all conditions on the weight vectors w into a single matrix P̂ and finds all independent vectors in the kernel of this matrix that have a Kronecker product structure. This problem, in turn, is shown to be a generalized matrix pencil (eigenvalue) problem, which may be formulated in terms of a super-generalized Schur decomposition: For given matrices Y₁, ..., Y_δ, find Q, Z (unitary) such that

    Q*Y₁Z = R₁, ..., Q*Y_δZ = R_δ

where R₁, ..., R_δ are as much upper triangular as possible. We have proposed a modified QZ iteration that treats all Yₖ equally, converges to upper triangular matrices Rₖ in the absence of noise, and usually has quadratic convergence in our simulations.
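
For δ = 2, the simultaneous reduction collapses to an ordinary generalized eigenvalue problem, which conveys the idea; a minimal sketch (scipy's eig handles the two-matrix pencil, whereas the extended QZ above treats all δ matrices jointly):

    import numpy as np
    from scipy.linalg import eig

    # Two Hermitian matrices Y1 = T D1 T*, Y2 = T D2 T* built on the same
    # (non-orthogonal) rank-one terms; T plays the role of the unknown mixture.
    rng = np.random.default_rng(1)
    T = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
    Y1 = T @ np.diag([1.0, 2.0]) @ T.conj().T
    Y2 = T @ np.diag([3.0, -1.0]) @ T.conj().T

    # The generalized eigenvectors of the pencil (Y1, Y2) recover the columns of
    # T^{-*} up to scaling, i.e., the separating directions; the eigenvalues are
    # the ratios of the diagonals: 1/3 and 2/(-1).
    lam, V = eig(Y1, Y2)
    print(np.round(lam, 3))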
The analytic algorithm is definitely more complex to implement than the usual iterative approaches for blind beamforming and deconvolution of constant modulus signals. However, it gives fundamental solutions to a number of problems that have plagued iterative CM algorithms ever since their inception in the early 1980's. The most important advantages of the analytic approach are listed below.


Fig. 12. Experiment with four FM signals (SNR(s₂) = 7.6 dB): (a) singular values of X; (b) singular values of P̂, with d̂ = 4; (c) with d̂ = 5.

1) It is less blind: The number of CM signals is detected explicitly from the close-to-zero singular values of P̂. Not all signals have to be CM signals.
2) It is deterministic: The minima of the cost functions are found by analysis, rather than by trying different initial points in the usual steepest-gradient methods. The only parameters that have to be set are the total number of signals and the number of CM signals.
3) It is robust on small data sets and in the presence of noise, although a few additional iterations of the standard CMA may be necessary to find the optimal weight vectors.
4) There are detection criteria that predict how many antennas and samples are needed in given scenarios (cf. Section V-C).

The modest requirement on the number of samples is an important issue in applications where multipath causes fast fading.
Signals that are not CM signals but have a kurtosis⁸ smaller than two, e.g., QAM and other finite-alphabet signals, may be modeled as a CM signal at the RMS amplitude, plus a limited amount of noise corresponding to the variance of this amplitude. When the equivalent noise power satisfies (32), then the number of samples n can be chosen large enough to allow detection of the signal as a CM signal, and thus to recover this signal as well. This abuse of the CM property to separate independent non-Gaussian signals is of course already common practice in blind equalizers and beamformers, ever since their invention (viz. [6], [55]). It might even be argued that the fourth-order cumulant techniques in [15]-[17], constructed to separate independent non-Gaussian signals, do in fact rely on the same property. Further research is needed to bring the many hidden connections into perspective.

⁸The kurtosis of a signal x(t) is defined as κ(x) = E|x(t)|⁴/(E|x(t)|²)².
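
As a quick numerical illustration of the kurtosis criterion (our own sketch):

    import numpy as np

    def kurtosis(x):
        """kappa(x) = E|x|^4 / (E|x|^2)^2: 1 for CM, 2 for complex Gaussian."""
        x = np.asarray(x)
        return np.mean(np.abs(x) ** 4) / np.mean(np.abs(x) ** 2) ** 2

    rng = np.random.default_rng(2)
    n = 100000
    fm = np.exp(2j * np.pi * rng.random(n))                          # constant modulus
    qam16 = rng.choice([-3, -1, 1, 3], n) + 1j * rng.choice([-3, -1, 1, 3], n)
    gauss = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    print(kurtosis(fm), kurtosis(qam16), kurtosis(gauss))            # ~1.0, ~1.32, ~2.0

16-QAM, with a kurtosis of about 1.32 < 2, can thus be treated as a noisy CM signal in the sense described above.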
APPENDIX A
VECTORING OF HERMITIAN MATRICES

For Hermitian matrices Y, we can redefine the vectoring operation vec(Y) to take advantage of the symmetry in Y, and end up with about half the number of parameters. One convenient way to do so leads to real vectors instead of complex vectors, and is based on the property (for x ∈ ℂ):

    (1/√2) [ 1   1 ] [ x  ]  =  √2 [ Re(x) ]
           [-j   j ] [ x* ]        [ Im(x) ]

Hence, a unitary matrix will transform a vector in which both x and x* are present into a vector where these components are real.

Fig. 13. Gerchberg iterations for d̂ = 4, δ̂ = 4 (SNR(s₂) = 7.6 dB): (a) n = 50; (b) n = 26.

Thus define, for Hermitian matrices Y ∈ ℂ^(d×d), the Hermitian vectoring operation "vech" as

    y = vech(Y) = [ yᵢᵢ (1 ≤ i ≤ d);  √2·Re(yᵢⱼ), √2·Im(yᵢⱼ) (1 ≤ i < j ≤ d) ]

i.e., a real vector collecting the (real) diagonal entries of Y and the scaled real and imaginary parts of its strictly upper triangular entries. There is a unitary matrix (U, say) with a simple structure, mapping the result of the original vectoring operation into the new result, y = U vec(Y). The inverse operation is Y = vec⁻¹(U*y) =: vech⁻¹(y), which also may be evaluated explicitly.

Besides the fact that vech(Y) is a real vector, a second advantage is that the inverse operation vech⁻¹ returns matrices that are Hermitian by construction. Both advantages show up when we elaborate on our application; see (7). Because Pₖ = vₖvₖ* is Hermitian, pₖ is a real vector. The implication is that the matrix P, constructed from these vectors, is real, as is the matrix P̂. Hence, the SVD of P̂ is a real SVD, which saves about a factor of three on computations. Because of the unitarity of the transformation, the singular values of the complex and real versions of P̂ are precisely the same, but the basis {yᵢ} that we select from the kernel of P̂ consists of real vectors. As a result, the matrices Y₁ = vech⁻¹(y₁), ..., Y_δ = vech⁻¹(y_δ) that we form from the basis are Hermitian by construction.
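
A direct implementation of this pair of operations might read as follows (a sketch; the ordering of the entries within the vector is our choice):

    import numpy as np

    def vech(Y):
        """Real vector of length d^2 for Hermitian Y: the diagonal, then
        sqrt(2) times the real and imaginary parts of the upper triangle."""
        d = Y.shape[0]
        iu = np.triu_indices(d, k=1)
        return np.concatenate([np.diag(Y).real,
                               np.sqrt(2) * Y[iu].real,
                               np.sqrt(2) * Y[iu].imag])

    def vech_inv(y, d):
        """Inverse operation; the result is Hermitian by construction."""
        iu = np.triu_indices(d, k=1)
        noff = len(iu[0])                       # d(d-1)/2 off-diagonal entries
        Y = np.diag(y[:d].astype(complex))
        upper = (y[d:d + noff] + 1j * y[d + noff:]) / np.sqrt(2)
        Y[iu] = upper
        Y[iu[1], iu[0]] = upper.conj()
        return Y

Since the map is length-preserving (it is the unitary U above), a round trip vech_inv(vech(Y), d) returns Y exactly.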

APPENDIX B
PROOFS

Proof of Lemma 1: Without loss of generality, we may take d = m. Our approach is to determine how many vectors w there can be such that wX is a CM signal. As derived in Section II-C, each column of X gives a quadratic equation that the entries of w have to satisfy. We assume that these constraints are independent.

Since w is a complex vector, it consists of 2d real parameters. Any w can be scaled by a unimodular scaling such that its first entry is real and positive; since this scaling does not affect the constant-modulus property of wX, it is an unconstrained parameter, so that the n columns of X only put constraints on the remaining 2d - 1 parameters. On general principles, we expect that when n = 2d - 1, there is only a discrete set of isolated solutions for w. Nonetheless, this set might be too large: e.g., when d = 2, the isolated solutions are determined by the intersection of three ellipsoids in 3-space, with 0, 2, or 4 solutions. Adding one more constraint (i.e., n = 2d) will place a new condition on the isolated solutions, to which in general only the original CM source signals and their weight vectors can comply. □
Proof of Lemma 3: To prove equivalence, it remains to show that a set of solutions {yₖ}₁^δ of the form yₖ = w̄ₖ ⊗ wₖ is linearly independent if and only if the corresponding set {wₖ}₁^δ is linearly independent. Indeed, with Yₖ = wₖ*wₖ and yₖ = w̄ₖ ⊗ wₖ = vec(Yₖ),

    {yₖ}₁^δ is a linearly independent set
    ⇔ [α₁y₁ + α₂y₂ + ... + α_δy_δ = 0  ⇒  αᵢ = 0, i = 1, ..., δ]
    ⇔ [α₁Y₁ + α₂Y₂ + ... + α_δY_δ = 0  ⇒  αᵢ = 0, i = 1, ..., δ]
    ⇔ rank[Y₁ Y₂ ... Y_δ] = δ
    ⇔ rank[w₁* w₂* ... w_δ*] = δ
    ⇔ {wₖ}₁^δ is a linearly independent set. □
Proof of Lemma 4: The only issue to show is the equivalence of p₁y = √n to ‖w‖ = √n. This proof consists of two (technical) steps.
1) We first show that p₁y = √n ⇔ tr(Y) = n, where Y = vec⁻¹(y) (tr(·) is the trace operator). Indeed, let P₁ = vec⁻¹(p₁). We show that P₁ = (1/√n)·I. For this, we use the fact that Q is unitary and P is constructed from V̂, an isometry. p₁ only depends on the first row of Q. This row must be equal to (1/√n)[1 ... 1], because all other rows of Q are necessarily orthogonal to this vector. Using the definition of P gives

    √n·P₁ = v₁v₁* + ... + vₙvₙ* = V̂V̂* = I.

Finally, it remains to note that p₁y = wP₁w* = ww*/√n = tr(w*w)/√n = tr(Y)/√n.
2) Furthermore, when y = w̄ ⊗ w, then tr(Y) = tr(w*w) = ‖w‖², so that p₁y = √n ⇔ ‖w‖² = n. □
Proof of Proposition 5: The relation between V̂ and S may be written as V̂ = AS, where the nonsingular d × d matrix A is derived from the original antenna response matrix. P̂ was defined in terms of V̂ in (8); in a similar way, we may define a matrix P_S in terms of S. This produces

    P̂ = P_S·[Aᵀ ⊗ A*].

The d² × d² matrix [Aᵀ ⊗ A*] is nonsingular: its singular values are given by all cross-products of the singular values of A. Hence, the propagation environment does not influence the dimension of the kernel of P̂ (or P_S); it will be too large only if there are specific phase relations between the signals, valid for all points in time. It is not a trivial task to analyze these relations, except for the case d = δ = 2, which is done in the main text. For statistically independent signals with a rich enough phase space (analog FM or PM signals, or digital CM signals with reasonably large constellations or sufficient oversampling), the probability is zero that the rank of the matrix P_S is any lower than necessary. This becomes more so for larger n. □

Proof of Proposition 7: We first show that

    dist(wV̂, CM) = n⁻¹(n - ww*)² + ‖P̂(w̄ ⊗ w)‖².

Indeed, take the definition of dist(·, CM) in (17), and make the same series of substitutions as in Section II-C:

    dist(wV̂, CM) = Σₖ (|wvₖ|² - 1)²
                 = Σₖ (1 - wPₖw*)²
                 = Σₖ (1 - pₖy)²        (y = w̄ ⊗ w)
                 = (√n - p₁y)² + ‖P̂y‖².

In making the last step, we have used that Q is unitary, together with the proof of Lemma 4: √n - p₁y = (n - ww*)/√n.

Hence, we have shown that the distance function dist(wV̂, CM) splits into two terms. The first term, n⁻¹(n - ww*)², is only a penalty on the norm of w: ‖w‖ should be close to √n. A multiplication of w by some number c will scale both (ww*) and ‖P̂(w̄ ⊗ w)‖ by c². This means that the given minimization problem is separable into the constrained minimization problem for w,

    ε² := min ‖P̂(w̄ ⊗ w)‖²   s.t.   ww* = n

which will provide the direction of w, and the computation of a scalar c to minimize dist(cwV̂, CM). After solving the first problem, the optimal value for c is directly determined by min_c n⁻¹(n - c²n)² + c⁴ε², which has the solution

    c² = (1 + ε²/n)⁻¹.    (35)

□

ACKNOWLEDGMENT

The authors are grateful to F. McCarthy of ARGOSystems, Inc., for sharing his measurement data reported on in Section V-B. The first author would like to thank Prof. G. H. Golub for inviting him to Stanford University, and the anonymous reviewers for their stimulating critiques.

REFERENCES
[1] A. J. van der Veen and A. Paulraj, "A constant modulus factorization technique for smart antenna applications in mobile communications," in Proc. SPIE, Adv. Signal Processing Algorithms, Architect., Implementat. V, F. T. Luk, Ed., San Diego, CA, July 1994, vol. 2296, pp. 230-241.
[2] A. J. van der Veen, S. Talwar, and A. Paulraj, "Blind estimation of multiple digital signals transmitted over FIR channels," IEEE Signal Processing Lett., vol. 2, pp. 99-102, May 1995.
[3] R. O. Schmidt, "Multiple emitter location and signal parameter estimation," IEEE Trans. Antennas Propagat., vol. 34, pp. 276-280, Mar. 1986.
[4] R. Roy, A. Paulraj, and T. Kailath, "ESPRIT-A subspace rotation approach to estimation of parameters of cisoids in noise," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, no. 5, pp. 1340-1342, 1986.
[5] Y. Sato, "A method of self-recovering equalization for multilevel amplitude-modulation systems," IEEE Trans. Commun., vol. COM-23, pp. 679-682, June 1975.
[6] D. N. Godard, "Self-recovering equalization and carrier tracking in two-dimensional data communication systems," IEEE Trans. Commun., vol. COM-28, pp. 1867-1875, Nov. 1980.
[7] J. R. Treichler and B. G. Agee, "A new approach to multipath correction of constant modulus signals," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 459-471, Apr. 1983.
[8] M. G. Larimore and J. R. Treichler, "Convergence behavior of the constant modulus algorithm," in Proc. IEEE ICASSP, 1983, vol. 1, pp. 13-16.
[9] S. Haykin, Ed., Blind Deconvolution. Englewood Cliffs, NJ: Prentice-Hall, 1994.
[10] J. R. Treichler and M. G. Larimore, "New processing techniques based on constant modulus adaptive algorithm," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, pp. 420-431, Apr. 1985.
[11] R. Gooch and J. Lundell, "The CM array: An adaptive beamformer for constant modulus signals," in Proc. IEEE ICASSP, Tokyo, Japan, 1986, pp. 2523-2526.
[12] R. P. Gooch and B. J. Sublett, "Joint spatial and temporal equalization in a decision-directed adaptive antenna system," in 22nd IEEE Asilomar Conf. Signals, Syst., Comput., 1988, vol. 1, pp. 255-259.
[13] S. Talwar, M. Viberg, and A. Paulraj, "Blind estimation of multiple co-channel digital signals arriving at an antenna array," in 27th IEEE Asilomar Conf. Signals, Syst., Comput., 1993, vol. 1, pp. 349-353.
[14] B. G. Agee, S. V. Schell, and W. A. Gardner, "Spectral self-coherence restoral: A new approach to blind adaptive signal extraction using antenna arrays," Proc. IEEE, vol. 78, pp. 753-767, Apr. 1990.
[15] J. F. Cardoso, "Super-symmetric decomposition of the fourth-order cumulant tensor. Blind identification of more sources than sensors," in Proc. IEEE ICASSP, Toronto, Canada, 1991, vol. 5, pp. 3109-3112.
[16] J. F. Cardoso, "Iterative techniques for blind source separation using only fourth-order cumulants," in Signal Processing VI: Proc. EUSIPCO-92, J. Vandewalle et al., Eds. Brussels, Belgium: Elsevier, 1992, vol. 2, pp. 739-742.
[17] J. F. Cardoso and A. Souloumiac, "Blind beamforming for non-Gaussian signals," IEE Proc. F (Radar and Signal Processing), vol. 140, pp. 362-370, Dec. 1993.

[18] L. Tong, Y. Inouye, and R.-W. Liu, "Waveform-preserving blind estimation of multiple independent sources," IEEE Trans. Signal Processing, vol. 41, pp. 2461-2470, July 1993.
[19] S. Mayrargue, "A blind spatio-temporal equalizer for a radio-mobile channel using the constant modulus algorithm (CMA)," in Proc. IEEE ICASSP, 1994, pp. IV:317-320.
[20] L. Tong, G. Xu, and T. Kailath, "Blind identification and equalization based on second-order statistics: A time domain approach," IEEE Trans. Inform. Theory, vol. 40, pp. 340-349, Mar. 1994.
[21] E. Moulines, P. Duhamel, J. F. Cardoso, and S. Mayrargue, "Subspace methods for the blind identification of multichannel FIR filters," in Proc. IEEE ICASSP, 1994, pp. IV:573-576.
[22] D. T. M. Slock, "Blind fractionally-spaced equalization, perfect-reconstruction filter banks and multichannel linear prediction," in Proc. IEEE ICASSP, 1994, pp. IV:585-588.
[23] Y. Li and Z. Ding, "Global convergence of fractionally spaced Godard equalizers," in 28th IEEE Asilomar Conf. Signals, Syst., Comput., 1994, pp. 617-621.
[24] I. Fijalkow, F. Lopez de Victoria, and C. R. Johnson, Jr., "Adaptive fractionally spaced blind equalization," in Proc. 6th IEEE DSP Workshop, Yosemite, 1994, pp. 257-260.
[25] H. Liu and G. Xu, "A deterministic approach to blind symbol estimation," IEEE Signal Processing Lett., vol. 1, pp. 205-207, Dec. 1994.
[26] J. J. Shynk and R. P. Gooch, "Convergence properties of the multistage CMA adaptive beamformer," in 27th IEEE Asilomar Conf. Signals, Syst., Comput., 1993, vol. 1, pp. 622-626.
[27] A. V. Keerthi, A. Mathur, and J. J. Shynk, "Direction-finding performance of the multistage CMA array," in 28th IEEE Asilomar Conf. Signals, Syst., Comput., 1994, vol. 2, pp. 847-852.
[28] B. G. Agee, "Fast adaptive polarization control using the least-squares constant modulus algorithm," in 20th IEEE Asilomar Conf. Signals, Syst., Comput., 1987, pp. 590-595.
[29] ——, "Blind separation and capture of communication signals using a multitarget constant modulus beamformer," in Proc. MILCOM, Boston, MA, 1989, vol. 2, pp. 340-346.
[30] F. McCarthy, "Multiple signal direction-finding and interference reduction techniques," in Wescon/93 Conf. Rec., San Francisco, CA, Sept. 1993, pp. 354-361.
[31] ——, demonstration at the Workshop on Smart Antennas in Wireless Mobile Communications, Stanford University, Stanford, CA, June 1994.
[32] B. G. Agee, "Convergent behavior of modulus-restoring adaptive arrays in Gaussian interference environments," in 22nd Asilomar Conf. Signals, Syst., Comput., Pacific Grove, CA, Nov. 1988, pp. 818-822.
[33] C. R. Johnson, "Admissibility in blind adaptive channel equalization," IEEE Control Syst. Mag., vol. 11, pp. 3-15, Jan. 1991.
[34] H. Jamali and S. L. Wood, "Error surface analysis for the complex constant modulus adaptive algorithm," in 24th IEEE Asilomar Conf. Signals, Syst., Comput., 1990, vol. 1, pp. 248-252.
[35] H. Jamali, S. L. Wood, and R. Cristi, "Experimental validation of the Kronecker product Godard blind adaptive algorithms," in 26th Asilomar Conf. Signals, Syst., Comput., 1992, vol. 1, pp. 1-5.
[36] K. Dogancay and R. A. Kennedy, "A globally admissible off-line modulus restoral algorithm for low-order adaptive channel equalisers," in Proc. IEEE ICASSP, Adelaide, Australia, 1994, vol. 3, pp. III:61-64.
[37] Y. Wang et al., "A matrix factorization approach to signal copy of constant modulus signals arriving at an antenna array," in Proc. 28th Conf. Inform. Sci. Syst., Princeton, NJ, Mar. 1994.
[38] R. W. Gerchberg and W. O. Saxton, "A practical algorithm for the determination of phase from image and diffraction plane pictures," Optik, vol. 35, pp. 237-246, 1972.
[39] B. G. Agee, "The least-squares CMA: A new technique for rapid correction of constant modulus signals," in Proc. IEEE ICASSP, Tokyo, Japan, 1986, pp. 953-956.
[40] G. H. Golub and C. F. Van Loan, Matrix Computations. Baltimore, MD: The Johns Hopkins Univ. Press, 1989.
[41] M. D. Zoltowski and D. Stavrinides, "Sensor array signal processing via a Procrustes rotations based eigenanalysis of the ESPRIT data pencil," IEEE Trans. Acoust., Speech, Signal Processing, vol. 37, pp. 832-861, June 1989.
[42] A. A. Shah and D. W. Tufts, "Determination of the dimension of a signal subspace from short data records," IEEE Trans. Signal Processing, vol. 42, pp. 2531-2535, Sept. 1994.
[43] L. Tong, Y. Inouye, and R.-W. Liu, "A finite-step global convergence algorithm for the cumulant-based parameter estimation of multichannel moving average processes," in Proc. IEEE ICASSP, 1991, pp. V:3445-3448.

[44] A. Bunse-Gerstner, R. Byers, and V. Mehrmann, "Numerical methods for simultaneous diagonalization," SIAM J. Matrix Anal. Applic., vol. 14, pp. 927-949, 1993.
[45] M. T. Chu, "A continuous Jacobi-like approach to the simultaneous reduction of real matrices," Linear Algebra Applic., vol. 147, pp. 75-96, 1991.
[46] P. J. Eberlein, "On the Schur decomposition of a matrix for parallel computation," IEEE Trans. Comput., vol. C-36, pp. 167-174, 1987.
[47] B. D. Flury and B. E. Neuenschwander, "Simultaneous diagonalization algorithms with applications in multivariate statistics," in Approximation and Computation, R. V. M. Zahar, Ed. Basel, Switzerland: Birkhäuser, 1995, pp. 179-205.
[48] A. J. van der Veen, "A Schur method for low-rank matrix approximation," SIAM J. Matrix Anal. Appl., vol. 17, no. 1, pp. 139-160, Jan. 1996.
[49] G. W. Stewart, "An updating algorithm for subspace tracking," IEEE Trans. Signal Processing, vol. 40, pp. 1535-1541, June 1992.
[50] B. Yang, "Projection approximation subspace tracking," IEEE Trans. Signal Processing, vol. 43, pp. 95-107, Jan. 1995.
[51] G. Xu and T. Kailath, "Fast signal-subspace decomposition," IEEE Trans. Signal Processing, vol. 42, pp. 539-551, Mar. 1994.
[52] D. J. Rabideau and A. O. Steinhardt, "Fast subspace tracking," in Proc. 7th SP Workshop Statist. Signal Array Processing, Quebec, Canada, June 1994, pp. 353-356.
[53] A. J. van der Veen, "Updating the Schur subspace estimator," IEEE Trans. Signal Processing, submitted for publication.
[54] A. Edelman, "The distribution and moments of the smallest eigenvalue of a random matrix of Wishart type," Linear Algebra Applic., vol. 159, pp. 55-80, 1991.
[55] J. Lundell and B. Widrow, "Application of the constant modulus adaptive beamformer to constant and nonconstant modulus signals," in 21st Asilomar Conf. Signals, Syst., Comput., Nov. 1987, vol. 1, pp. 432-436.

Alle-Jan van der Veen (M'94) was born in The Netherlands in 1966. He graduated (cum laude) from the Department of Electrical Engineering, Delft University of Technology, in 1988, and received the Ph.D. degree (cum laude) from the same institute in 1993.

Throughout 1994, he was a postdoctoral scholar at Stanford University, Stanford, CA, USA, in the Scientific Computing/Computational Mathematics group and in the Information Systems Laboratory. He is currently a researcher in the Signal Processing Group of DIMES, Delft University of Technology. His research interests are in the general area of system theory applied to signal processing, in particular system identification and time-varying system theory, and in numerical methods and parallel algorithms for linear algebra problems.

Dr. van der Veen is the recipient of a 1994 IEEE Signal Processing paper award.

Arogyaswami Paulraj (F'91) was educated at the Naval Engineering College, India, and at the Indian Institute of Technology, New Delhi, India. He received the Ph.D. degree in 1973.

A large part of his career to date has been spent in research laboratories in India, where he supervised the development of several electronic systems. His contributions include a sonar receiver (1973-1974), a surface ship sonar (1976-1983), a parallel computer (1988-1991), and telecommunications systems. He is currently a professor of electrical engineering at Stanford University, working in the area of mobile communications. He has held visiting appointments at several universities: Indian Institute of Technology, Delhi, 1973-1974; Loughborough University of Technology, U.K., 1974-1975; and Stanford University, CA, 1983-1986. His research has spanned several disciplines, emphasizing estimation theory, sensor signal processing, antenna array processing, parallel computer architectures/algorithms, and communication systems.

Dr. Paulraj is the author of approximately 90 research papers and holds several patents. He has won a number of national awards in India for his contributions to technology development.
