Direction-of-Arrival Estimation Based On Deep
fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TAP.2018.2874430, IEEE
Transactions on Antennas and Propagation
IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION 1
Abstract—Lack of adaptation to various array imperfections is an open problem for most high-precision direction-of-arrival (DOA) estimation methods. Machine learning-based methods are data-driven: they do not rely on prior assumptions about array geometries, and are expected to adapt better to array imperfections than their model-based counterparts. This paper introduces a deep neural network (DNN) framework to address the DOA estimation problem, so as to obtain good adaptation to array imperfections and enhanced generalization to unseen scenarios. The framework consists of a multi-task autoencoder and a series of parallel multi-layer classifiers. The autoencoder acts like a group of spatial filters: it decomposes the input into multiple components in different spatial subregions. These components thus have more concentrated distributions than the original input, which helps to reduce the burden of generalization for the subsequent DOA estimation classifiers. The classifiers follow a one-vs-all classification guideline to determine whether there are signal components near preset directional grids, and the classification results are concatenated to reconstruct a spatial spectrum and estimate signal directions. Simulations are carried out to show that the proposed method performs satisfactorily in both generalization and imperfection adaptation.

Index Terms—Direction-of-arrival (DOA) estimation, deep neural network (DNN), array imperfection, multi-task autoencoder, one-vs-all classification, supervised learning.

Z.-M. Liu is with the State Key Laboratory of Complex Electromagnetic Environment Effects on Electronics and Information System, National University of Defense Technology, Changsha, Hunan, 410073, China. E-mail: [email protected].
C. Zhang and P. S. Yu are with the Department of Computer Science, University of Illinois at Chicago. Email: {czhang99, psyu}@uic.edu.
The research in this paper is partially supported by National Science Foundation of China (NO. 61771477).
Manuscript received April 24, 2018.
0018-926X (c) 2018 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

Direction-of-arrival (DOA) estimation is a widely studied problem in various areas, including wireless communications, astronomical observation, radar and sonar [1]. A main trend of the research in DOA estimation is improving precision and superresolution, and enhancing adaptation to demanding scenarios with limited snapshots, low signal-to-noise ratio (SNR), etc. [2]. Various methods have been proposed to meet these requirements, such as beamformers [3], [4], [5], subspace-based methods [6], [7], [8], sparsity-inducing methods [9], [10], [11], [12], [13] and maximum likelihood methods [14], [15], [16]. There have been long-lasting developments in DOA estimation performance [2]. A common feature of the above-mentioned methods is that they are parametric and formulate forward mappings from signal directions to array outputs and assume that the mappings are reversible. Based on this assumption, the array outputs can be matched by the pre-formulated mappings to realize direction estimation. Different matching criteria result in different methods, e.g., manifold correlation for beamformers [3], [4], [5], superplane fitting for subspace-based methods [6], [7], [8], raw array output reconstruction on overcomplete dictionaries for sparsity-inducing methods [9], [10], [11], [12], [13], and raw array output fitting for maximum likelihood methods [14], [15], [16]. The performance of these parametric methods depends heavily on the consistency between the two mappings: the forward mapping from signal directions to array outputs during data collection, and the inverse mapping from array outputs to signal directions for DOA estimation.

Various imperfections may exist in array systems due to non-ideal sensor design and manufacture, array installation and inter-sensor mutual interference, background radiation, etc. [17]. As a result, the forward mapping from signal directions to array outputs in practical systems is far more complicated than its backward counterparts used in parametric DOA estimation methods [18], [19]. Some of these imperfections are too complicated to be modeled accurately, and inaccurate modeling may significantly degrade the performance of DOA estimation [20], [21]. To facilitate method implementation, simplified models are established to describe the effects of various imperfections, and auto-calibration processes are proposed to improve DOA estimation precision [22], [23], [24], [25], [26], [27], [28].

Most of the simplifications of array imperfections are mathematical approximations made under various additional assumptions, such as uniform linear or circular array geometries [22], [23], [24], sensor location errors constrained within a particular line or plane [25], [26], and inter-sensor independence of gain and phase errors [27], [28]. Auto-calibration DOA estimation methods have been shown to be effective via simulations in the literature [22], [23], [24], [25], [26], [27], [28]. However, in these simulations, array outputs are generated based on artificially simplified models, and the formulations of the imperfections are assumed to be known beforehand with only some unknown variables. These simplifications and assumptions deviate from practice to different degrees, and it is still an open question how the auto-calibration methods behave in practical systems, especially when additional assumptions, such as linear/circular array geometries, are violated. Moreover, combined effects of multiple kinds of imperfections, which probably exist in
practical systems, are much more difficult to model precisely and calibrate automatically. Only a few works have discussed this problem in depth [29], [30], [31].

Recent research introduces machine learning techniques to solve the DOA estimation problem [32], [33], [34], [35], [36]. Although similar ideas date back to the 1990s [37], [38], [39], they are being revived by fast-developing deep learning theory and methods [40], [41], [42], [43], which offer much greater modeling capability than their shallow counterparts and other machine learning techniques. Methods falling in this category first establish training datasets with DOA labels, and then derive a mapping from array outputs to signal directions with existing machine learning techniques, such as radial basis function (RBF) [39] and support vector regression (SVR) [32], [33]. The derived mapping is then used on test data to estimate signal directions. These methods are data-driven and do not rely on prior assumptions about array geometries or on whether the arrays are calibrated. They have been demonstrated to be computationally more efficient than subspace-based methods in simulations [32], [39], and to perform comparably with them in experiments [33].

The performance of the RBF- and SVR-based DOA estimation methods [32], [33], [39] relies heavily on the generalization characteristic of the machine learning techniques. They perform satisfactorily when the training and test data have nearly identical distributions [44], [45]. However, in most DOA estimation problems, it is very difficult or even impossible to establish a training dataset large enough to cover the distributions of all test data. That is because too many unknown parameters are present in the array output model, such as signal number, signal directions, SNR, signal waveforms and noise samples.

In the past few years, some researchers have also introduced deep learning techniques to solve DOA estimation and source localization problems with microphone arrays [46], [47], [48], [49], [50]. Very demanding scenarios with dynamic acoustic signals [46], reverberant environments [47], [48] and wideband signals [49], [50] are considered. It is very hard to establish analytical signal propagation models in such applications, and parametric methods may encounter great difficulties in solving these problems. However, deep learning-based methods are able to reconstruct complicated propagation models based on training datasets, and then estimate source directions and locations. Despite the successes of these methods in single-signal scenarios [47], [48] or in the area of acoustic signal processing [46], [49], [50], they can hardly be used directly for general DOA estimation. That is because superresolution of multiple temporally overlapped signals is usually required for array signal processing methods, and acoustic signals generally last for seconds and contain redundant time-frequency features. In these methods, the original acoustic signals are first transformed to the time-frequency domain, and the transformed signals are then treated as inputs of deep neural networks [46], [49], [50]. DOA estimation of acoustic signals is finally realized in a way similar to pattern recognition of images. Nonetheless, in general DOA estimation problems, the snapshot number is usually on the scale of tens or hundreds, which is not large enough for a time-frequency transformation like that applied to acoustic signals. Moreover, most of the deep learning methods locate sources on very coarse grids with inter-grid spacings of 5° [46], [49] or even 10° [50]. Such coarse DOA estimates do not meet the precision requirements in most general DOA estimation applications.

In this paper, we propose a hierarchical framework of deep neural networks (DNN) to deal with the general DOA estimation problem. The covariance vector of the array outputs is computed and used as the input of the DNN, and a multi-task autoencoder is introduced before the multi-layer classifiers to decompose the signal components in the input into spatial subregions. After that, a series of multi-layer classifiers is introduced to realize DOA estimation. The major contributions of this paper are three-fold.

1) A deep neural network is established and an end-to-end method is proposed for general DOA estimation. It does not need to transform array outputs to the frequency domain, and is therefore different from existing deep learning-based DOA estimation methods for acoustic signals [46], [49], [50].
2) An autoencoder is introduced to preprocess the original array outputs like spatial filters. This preprocessing step helps to reduce distribution divergences of the input data of the DOA estimation neural networks, which largely enhances the generalization of the proposed method in unseen scenarios.
3) If the DNN framework is trained using outputs of a certain array, the corresponding DOA estimation method will be robust to various kinds of imperfections in the steering vector of the array without using any prior information about them. Analyses are carried out to support this advantage, and simulation results provide further evidence.

The rest of the paper consists of five parts. Section II formulates the array output model. Section III presents a new DNN framework and interprets how it fits DOA estimation requirements. Section IV introduces training strategies of the hierarchical DNN, and highlights its behavior in array imperfection adaptation. Section V carries out simulations to demonstrate the advantage of the proposed method in generalization over previous machine learning-based methods, and its advantage in imperfection adaptation over parametric methods. Section VI concludes the whole paper.

II. PROBLEM FORMULATION

Assume that $K$ independent signals impinge onto an $M$-element array consisting of omni-directional sensors, and that the incident directions of the signals are $\theta_1, \cdots, \theta_K$, respectively. The waveform of the $k$th signal is $s_k(t)$, and the array output is sampled at $N$ uniquely-spaced time instants $t_1, \cdots, t_N$ to obtain snapshots $X = [x(t_1), \cdots, x(t_N)]$. The array outputs are contaminated by zero-mean Gaussian noise $v(t)$.

In most academic works on array signal processing, various kinds of practical array imperfections are overlooked, and the mappings from signal directions to array responding functions are supposed to be deterministic and known beforehand. Denote the imperfection-free mapping by $\theta \mapsto a(\theta)$; the array outputs can then be formulated as follows,
The DNN framework has two parts: a multi-task autoencoder that behaves as a set of spatial filters, and a group of parallel multi-layer classifiers that realize spatial spectrum construction. A sketch of the DNN structure is shown in Fig. 1.

The autoencoder denoises the DNN input and decomposes its components into $P$ spatial subregions. If some signals located in the $p$th subregion impinge onto the array (possibly together with some other signals located in the other $P - 1$ subregions), the output of the $p$th decoder equals the DNN input that would be obtained if the other signals were absent. If no signal impinges from this subregion, the output of the $p$th decoder equals zero.

A fully-connected multi-layer NN is designed for each subregion afterwards. Each of them behaves as a multi-class classifier to determine whether there are signals on a list of refined directional grids within the spatial subregion. If a signal is located on a certain grid or between two adjacent grids, the outputs of the corresponding NN node(s) will be non-zero, and the values of the node outputs indicate how close the signal direction is to this grid. The grids are preset properly to ensure that no two signals coexist between two neighboring grids.

weight matrix from the $(l_1 - 1)$th layer to the $l_1$th layer of the $p$th task, $b_{l_1}^{(p)} \in \mathbb{R}^{|c_{l_1}| \times 1}$ is the additive bias vector in the $l_1$th layer, and $f_{l_1}^{(p)}[\bullet]$ represents the element-wise activation function in the $l_1$th layer.

The multi-task autoencoder aims at decomposing the inputs into $P$ spatial subregions. A straightforward strategy for defining the subregions is to choose $P + 1$ particular directions $\theta^{(0)} < \theta^{(1)} < \cdots < \theta^{(P)}$, which satisfy $\theta^{(1)} - \theta^{(0)} = \theta^{(2)} - \theta^{(1)} = \cdots = \theta^{(P)} - \theta^{(P-1)}$, such that $[\theta^{(0)}, \theta^{(P)})$ spans the potential scope of the incident signals. If a signal component impinging from the $p$th subregion is used as the input to the autoencoder, the output of the $p$th decoder, which is also denoted as $u_p = c_{2L_1}^{(p)}$, is expected to be equal to the input $r$, while the other decoder outputs equal zero.¹

¹ The denoising ability of the autoencoder can be further enhanced by using the perturbation-free counterpart of the input as the expected output of the corresponding decoder. But as it is not an easy task to collect perturbation-free counterparts of the training dataset, we use the original noisy inputs as the outputs directly in this paper.

There are many ways to design spatial filter-like autoencoders, so that $F^{(p)}(r) = r$ if the signal direction $\theta \in$
$[\theta^{(p-1)}, \theta^{(p)})$ and, approximately, $F^{(p)}(r) = 0$ otherwise, where $F^{(p)}(\bullet)$ is the overall function of the $p$th autoencoder task. Furthermore, an additional requirement is necessary in the autoencoder structure for the DOA estimation application, i.e., $F^{(p)}(r_1 + r_2) = F^{(p)}(r_1) + F^{(p)}(r_2)$. The additive property is required because, when multiple signals located in different subregions impinge onto the array simultaneously, the autoencoder should be able to decompose the input vector into different decoder outputs. In order to make the autoencoder additive, the activation functions $f_{l_1}^{(p)}[\bullet]$ should be linear. Therefore, we replace them with the unit function, i.e.,

$$c_{l_1}^{(p)} = \mathrm{net}_{l_1}^{(p)}. \tag{5}$$

As there are no nonlinear transformations in the autoencoder hidden layers, each of the multi-layer encoding and decoding processes can be simplified to a single layer, i.e., $L_1 = 1$, and the autoencoder can be rewritten as

$$c_1 = U_{1,0}\, r + b_1, \tag{6}$$

$$u_p = U_{2,1}^{(p)} c_1 + b_2^{(p)}, \quad p = 1, \cdots, P. \tag{7}$$

Fig. 1. Structure of the proposed deep neural network for direction-of-arrival estimation. The network consists of two parts: one is a multi-task autoencoder for spatial filtering, the other is a fully-connected multi-layer neural network for spatial spectrum estimation.

to be known beforehand. The trained models do not work if the incident signal number changes. Therefore, a set of models should be trained for each case of incident signal number. Even so, these models can hardly be integrated to deal with DOA estimation problems in which the signal number is not known beforehand.

A more flexible way to enhance the generalization of the methods to an unknown signal number is to use a list of one-vs-all classifiers instead. In the DOA estimation problem, each node of the classifier output stands for a preset directional grid, and the final output value on the node represents the probability of a signal being located in the neighborhood of the grid. The DOA of signals impinging from off-grid directions can be estimated via interpolation between two adjacent grids.

As shown in Fig. 1, there are $P$ parallel classifiers in total. The $p$th classifier takes the output of the $p$th decoder as input, and analyzes the components of the input on the preset grids in the $p$th spatial subregion. There are no mutual connections between different classifiers. The computations of the classifiers are feedforward,

$$\mathrm{net}_{l_2}^{(p)} = W_{l_2, l_2-1}^{(p)} h_{l_2-1}^{(p)} + q_{l_2}^{(p)}, \quad p = 1, \cdots, P;\ l_2 = 1, \cdots, L_2. \tag{8}$$
$\mathbb{R}^{|h_{l_2}^{(p)}| \times |h_{l_2-1}^{(p)}|}$ is the fully-connected feed-forward weight matrix between the $(l_2 - 1)$th layer and the $l_2$th layer, $q_{l_2}^{(p)}$ is the additive bias vector on the $l_2$th layer; $g_{l_2}[\bullet]$ is an element-wise activation function for the inputs of the $l_2$th layer nodes.

After obtaining all the outputs of the $P$ classifiers in parallel based on the $P$ decoder outputs, the spatial spectrum associated with the DNN input $r$ can be estimated by concatenating the $P$ outputs in order, i.e.,

$$y = \left[ y_1^T, \cdots, y_P^T \right]^T. \tag{10}$$

There are $|y|$ one-vs-all classifiers in total in this part of the DNN. In order to realize DOA estimation based on the spectrum estimate, only the grid nodes close to the true signal directions are expected to have positive values in $y$, while all the others have zero values.

IV. DNN-BASED DIRECTION-OF-ARRIVAL ESTIMATION

Besides the framework described in the previous section, the training dataset structure and the training strategy are two other factors that play important roles in the performance of the DNN-based DOA estimator. As the autoencoder and the parallel classifiers implement different functions and work separately during DOA estimation, and training a deep NN as a whole gets trapped in undesirable local minima more easily [51], we propose to train the two parts of the DNN in separate procedures.

In order to reduce the variability of the DNN input, which is influenced significantly by uncertain signal waveforms, we follow the guidelines of the RBF- and SVR-based methods to compute the array covariance matrix [32], [33], [39], and reformulate the off-diagonal upper right matrix elements as an input vector to the DNN,

signal directions is the equally spaced spectrum grids of the classifier outputs, which are denoted by $\vartheta_1, \vartheta_2, \cdots, \vartheta_I$. $I$ is supposed to be divisible by $P$, with $I/P = I_0$ being an integer. If the covariance vector $r(\vartheta_i)$ corresponding to the signal from direction $\vartheta_i$ is input to the autoencoder, the output of the $p_i$th decoder ($p_i = \lceil i/I_0 \rceil$, with $\lceil \alpha \rceil$ equaling the smallest integer not smaller than $\alpha$) is expected to be $r(\vartheta_i)$, while the outputs of the other $P - 1$ decoders are expected to be $0_{\kappa \times 1}$, where $\kappa = |r|$. By concatenating the outputs of all the $P$ decoders, the expected output of the whole autoencoder can be obtained as

$$u = \left[ u_1^T, \cdots, u_P^T \right]^T = \Big[ \underbrace{0_{\kappa \times 1}^T, \cdots, 0_{\kappa \times 1}^T}_{p_i - 1},\ r^T(\vartheta_i),\ \underbrace{0_{\kappa \times 1}^T, \cdots, 0_{\kappa \times 1}^T}_{P - p_i} \Big]^T. \tag{13}$$

When $\vartheta_i$ varies from $\theta^{(0)}$ to $\theta^{(P)}$, the $p_i$'s corresponding to $i = 1, \cdots, I$ are $\underbrace{1, \cdots, 1}_{I_0}, \underbrace{2, \cdots, 2}_{I_0}, \cdots, \underbrace{P, \cdots, P}_{I_0}$. Denote the autoencoder label corresponding to data $r(\vartheta_i)$ by $u(\vartheta_i)$; then the dataset for autoencoder training is

$$\Gamma^{(1)} = \left[ r(\vartheta_1), \cdots, r(\vartheta_I) \right], \tag{14}$$

and the column-wise labelset associated with the dataset is

$$\Psi^{(1)} = \left[ u(\vartheta_1), u(\vartheta_2), \cdots, u(\vartheta_I) \right] = \begin{bmatrix} \Phi_1 & 0_{\kappa \times I_0} & 0_{\kappa \times I_0} & 0_{\kappa \times I_0} \\ 0_{\kappa \times I_0} & \Phi_2 & 0_{\kappa \times I_0} & 0_{\kappa \times I_0} \\ 0_{\kappa \times I_0} & 0_{\kappa \times I_0} & \ddots & 0_{\kappa \times I_0} \\ 0_{\kappa \times I_0} & 0_{\kappa \times I_0} & 0_{\kappa \times I_0} & \Phi_P \end{bmatrix}, \tag{15}$$
$$\frac{\partial \varepsilon^{(1)}(\vartheta_i)}{\partial [b_1]_l} = \tilde{u}^T(\vartheta_i)\, [U_{2,1}]_{:,l}, \tag{21}$$

$$\frac{\partial \varepsilon^{(1)}(\vartheta_i)}{\partial [b_2]_l} = [\tilde{u}(\vartheta_i)]_l, \tag{22}$$

where $[\alpha]_l$ represents the $l$th element of vector $\alpha$, and $[A]_{i_1, i_2}$ represents the $(i_1, i_2)$th element of matrix $A$. The variables are then updated iteratively against the gradient as

$$\alpha_{new} = \alpha_{old} - \mu_1 \frac{\partial \varepsilon^{(1)}(\vartheta_i)}{\partial \alpha}, \tag{23}$$

where $\alpha$ can be any element of matrices $U_{1,0}$, $U_{2,1}$ or vectors $b_1$, $b_2$, $\mu_1$ is the learning rate, and $\alpha_{old}$ and $\alpha_{new}$ denote the values of the variables before and after the current update, respectively.

The subfigures in Fig. 2 show the spatial responses of six trained filters, i.e., $P = 6$, in the spatial scope of $[-60^\circ, 60^\circ)$.² Fig. 2(a) shows the spatial gains of the filters, i.e.,

[Fig. 2, panels (a) to (d): horizontal axis Direction (Deg.), from −60 to 60; vertical axis Gain; one curve per filter, filter 0 to filter 5.]
Fig. 2. Performance of multi-task autoencoder for spatial filtering, (a) gain responses; (b) phase responses; (c) filter outputs of two signals in the same subregion ($\theta_1 = 5^\circ$, $\theta_2 = 15^\circ$); (d) filter outputs of two signals in different subregions ($\theta_1 = 10^\circ$, $\theta_2 = 30^\circ$).
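One training update of the linear autoencoder in (6) and (7) can be sketched as follows, with each parameter moved against its gradient so that the squared reconstruction error decreases. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names, the toy sizes and the learning rate are hypothetical, the error is taken as estimate minus label (consistent with the bias gradients in (21) and (22)), and the weight-matrix gradients are the standard back-propagation results for a two-layer linear network rather than formulas quoted from the paper.

```python
def matvec(A, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def forward(params, r):
    """Linear encoder (6) followed by the stacked linear decoders (7)."""
    U1, b1, U2, b2 = params
    c1 = [s + b for s, b in zip(matvec(U1, r), b1)]
    u_hat = [s + b for s, b in zip(matvec(U2, c1), b2)]
    return c1, u_hat


def loss(params, r, u):
    """Half the squared l2-norm of the reconstruction error."""
    _, u_hat = forward(params, r)
    return 0.5 * sum((a - b) ** 2 for a, b in zip(u_hat, u))


def gradient_step(params, r, u, mu):
    """Return the parameters after one gradient-descent update."""
    U1, b1, U2, b2 = params
    c1, u_hat = forward(params, r)
    err = [a - b for a, b in zip(u_hat, u)]                 # u_tilde
    g_b2 = err                                              # cf. (22)
    g_b1 = [sum(err[k] * U2[k][l] for k in range(len(err)))
            for l in range(len(b1))]                        # cf. (21)
    g_U2 = [[e * c for c in c1] for e in err]               # standard result
    g_U1 = [[g * x for x in r] for g in g_b1]               # standard result

    def step_m(M, G):
        return [[m - mu * g for m, g in zip(mr, gr)] for mr, gr in zip(M, G)]

    def step_v(v, g):
        return [a - mu * b for a, b in zip(v, g)]

    return step_m(U1, g_U1), step_v(b1, g_b1), step_m(U2, g_U2), step_v(b2, g_b2)


# Hypothetical toy problem: kappa = 2, one hidden node, P = 2 decoders.
params = ([[0.1, -0.1]], [0.0], [[0.1], [0.1], [-0.1], [0.1]], [0.0] * 4)
r = [1.0, 0.5]
u = [1.0, 0.5, 0.0, 0.0]          # signal belongs to the first subregion
before = loss(params, r, u)
params = gradient_step(params, r, u, mu=0.1)
after = loss(params, r, u)
```

In practice the gradients would be obtained from an automatic differentiation tool rather than coded by hand; the explicit form is shown only to make the update rule concrete.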
$$[y(\theta, \Delta_j)]_l = \begin{cases} \dfrac{\bar{\theta} - \vartheta_{l-1}}{\vartheta_l - \vartheta_{l-1}}, & \vartheta_{l-1} \le \bar{\theta} < \vartheta_l,\ \bar{\theta} \in \{\theta, \theta + \Delta_j\}, \\[2mm] \dfrac{\vartheta_{l+1} - \bar{\theta}}{\vartheta_{l+1} - \vartheta_l}, & \vartheta_l \le \bar{\theta} < \vartheta_{l+1},\ \bar{\theta} \in \{\theta, \theta + \Delta_j\}, \\[2mm] 0, & \text{otherwise}. \end{cases} \tag{28}$$

That is, the reconstructed spectrum is expected to have non-zero positive values only on the grids adjacent to the true signal directions, and the direction of each signal can be estimated precisely via linear amplitude interpolation between the two adjacent grids.

The training dataset of the classifiers can be written as

$$\Gamma^{(2)} = \left[ \Gamma_1^{(2)}, \cdots, \Gamma_J^{(2)} \right], \tag{29}$$

where

$$\Gamma_j^{(2)} = \left[ r(\vartheta_1, \Delta_j), \cdots, r(\vartheta_I - \Delta_j, \Delta_j) \right], \tag{30}$$

and the associated labelset is

$$\Psi^{(2)} = \left[ \Psi_1^{(2)}, \cdots, \Psi_J^{(2)} \right], \tag{31}$$

where

$$\Psi_j^{(2)} = \left[ y(\vartheta_1, \Delta_j), \cdots, y(\vartheta_I - \Delta_j, \Delta_j) \right]. \tag{32}$$

During the training process, the reconstruction error of the spatial spectrum is back-propagated to update the NN parameters of the parallel classifiers. Denote the expected and actual classifier outputs corresponding to $r(\theta, \Delta)$ by $y(\theta, \Delta)$ and $\hat{y}(\theta, \Delta)$, respectively, and also denote the reconstruction error by

$$\tilde{y}(\theta, \Delta) = \hat{y}(\theta, \Delta) - y(\theta, \Delta). \tag{33}$$

The loss function for the classifiers is taken as the squared $l_2$-norm of the spectrum reconstruction error, i.e.,

$$\varepsilon^{(2)}(\theta, \Delta) = \frac{1}{2} \left\| \tilde{y}(\theta, \Delta) \right\|_2^2. \tag{34}$$

The gradients of the loss function with respect to the classifier variables can be derived via straightforward mathematical analyses. We skip the details of the derivations here and refer interested readers to the literature, such as [43]. Most deep learning platforms, such as TensorFlow [52], also provide callable instructions for computing the gradients automatically.

The elements of the weight matrices and bias vectors are then updated against their gradients as follows,

$$\alpha_{new} = \alpha_{old} - \mu_2 \frac{\partial \varepsilon^{(2)}(\theta, \Delta)}{\partial \alpha}, \tag{35}$$

where $\mu_2$ is the learning rate.

After training the classifiers with the settings detailed in Section V, we re-input the array covariance vectors $r(\theta = 5^\circ, \Delta = 10^\circ)$ and $r(\theta = 10^\circ, \Delta = 20^\circ)$ associated with Fig. 2(c) and Fig. 2(d) to the whole DNN, and get the reconstructed spectra shown in Fig. 3(a) and Fig. 3(b). The two signals in both scenarios are well separated, no matter whether they impinge from the same or different spatial subregions. There are only slight perturbations on the spectrum grids without incident signals. The directions of the signals can finally be estimated based on the estimated spectrum via linear interpolation within the spectrum peaks.

[Fig. 3, panels (a) and (b): horizontal axis Direction (Deg.), from −60 to 60; vertical axis Spectrum.]
Fig. 3. Reconstructed spatial spectrum of two signals, (a) $\theta_1 = 5^\circ$, $\theta_2 = 15^\circ$; (b) $\theta_1 = 10^\circ$, $\theta_2 = 30^\circ$.

C. Adaptation to array imperfections

As has been discussed in Section II, data-driven DOA estimation methods are expected to have built-in adaptation to typical array imperfections, such as gain and phase inconsistency [27], [28], sensor position error [25], [26] and mutual coupling [22], [23], [24]. We validate this property of the proposed method in this subsection.

Suppose that the responding function of the array is perturbed by a particular kind of imperfection, or by a combination of imperfections, with parameters $e$, and that the mapping from signal directions to covariance vectors is $\theta \overset{e}{\longmapsto} r_e(\theta)$. The DNN is assumed to have no prior information about the imperfections and the perturbed array responding function. When the perturbed vector $r_e(\vartheta_i)$ with $\lceil i/I_0 \rceil = p$ is input to the autoencoder, the associated label vector is

$$u = \left[ u_1^T, \cdots, u_P^T \right]^T = \Big[ \underbrace{0_{\kappa \times 1}^T, \cdots, 0_{\kappa \times 1}^T}_{p - 1},\ r_e^T(\vartheta_i),\ \underbrace{0_{\kappa \times 1}^T, \cdots, 0_{\kappa \times 1}^T}_{P - p} \Big]^T. \tag{36}$$

That is to say, the vector is filtered into the $p$th filter even in the presence of array imperfections.

After that, the decoder outputs are input to the classifiers. As the component corresponding to the signal from direction $\vartheta_i$ is embedded in the output of the $p$th decoder, it will be processed by the $p$th classifier. The associated spectrum label contains a spectrum peak on the one or two grids closest to $\vartheta_i$, which can be interpolated to obtain a DOA estimate of $\vartheta_i$ exactly. Therefore, the whole DNN (the autoencoder together with the parallel classifiers) actually forms an inverse mapping $r_e(\theta) \overset{e}{\longmapsto} \theta$, no matter which kinds of imperfections are present and how they perturb the array responding function. The derived inverse mapping from DNN inputs to signal directions, with the effect of array imperfections embedded in it, also adapts to test data, and is expected to obtain correct DOA estimates despite array imperfections.
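The spectrum label of (28) and the linear amplitude interpolation used to read directions back from a spectrum can be sketched together. This is an illustrative sketch, not the authors' code: the function names, the `min_height` threshold for discarding small perturbations, the peak-picking rule, and the use of `max` when two signals touch the same grid are all assumptions introduced here. Directions are assumed to lie strictly inside the gridded range.

```python
def spectrum_label(directions, grids):
    """Expected spectrum in the spirit of eq. (28): each direction puts
    linearly interpolated weights on its two neighbouring grids."""
    y = [0.0] * len(grids)
    for theta in directions:
        for l in range(len(grids) - 1):
            lo, hi = grids[l], grids[l + 1]
            if lo <= theta < hi:
                w = (theta - lo) / (hi - lo)
                y[l] = max(y[l], 1.0 - w)       # (hi - theta) / (hi - lo)
                y[l + 1] = max(y[l + 1], w)     # (theta - lo) / (hi - lo)
                break
    return y


def estimate_directions(y, grids, min_height=0.3):
    """Pick local maxima of the spectrum and interpolate each peak with
    its larger neighbour, weighting the two grids by their amplitudes."""
    estimates = []
    n = len(y)
    for l in range(n):
        left = y[l - 1] if l > 0 else 0.0
        right = y[l + 1] if l < n - 1 else 0.0
        if not (y[l] >= min_height and y[l] > left and y[l] >= right):
            continue
        if right >= left and l < n - 1:
            m, w = l + 1, right
        elif l > 0:
            m, w = l - 1, left
        else:
            m, w = l, 0.0
        w = max(w, 0.0)                          # ignore negative leakage
        estimates.append((grids[l] * y[l] + grids[m] * w) / (y[l] + w))
    return estimates


# Grids at 1 degree spacing over [-60, 60), as in the simulation settings.
grids = list(range(-60, 60))
y = spectrum_label([5.3, 15.0], grids)
est = estimate_directions(y, grids)
```

For an ideal label the interpolation returns the original directions; for a spectrum produced by a trained network, the accuracy depends on how closely the output follows the label shape.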
V. SIMULATIONS AND ANALYSES

This section carries out simulations to show the advantage of the proposed method over the state-of-the-art machine learning-based DOA estimation method [32], [33] in generalization, and also its advantage over the most common parametric method [6] in imperfection adaptation. The simulations are implemented on TensorFlow [52], and the gradients are computed directly using its embedded tools. The more recently proposed methods [46], [47], [48], [49], [50] are not chosen as baselines, because some of them work only in single-source scenarios [47], [48], and the others take time-frequency representations of incident signals as inputs [46], [49], [50]. They do not adapt to the considered multi-source scenarios with only a few hundred snapshots in the simulations. Auto-calibration techniques [22], [23], [24], [25], [26], [27], [28] are not introduced in parametric methods, as no prior information (neither formulation nor value) about array imperfections is assumed to be known beforehand. Such settings help to show the robustness of different methods to uncalibrated arrays and make the performance comparisons fairer. This simulation setting also rules out calibration techniques designed for certain kinds of imperfections with pre-assumed formulations [22], [23], [24], [25], [26], [27], [28].

A. Simulation settings

In the following simulations, we use a 10-element uniform linear array (ULA) to estimate directions of signals impinging from the spatial scope of $[-60^\circ, 60^\circ)$, i.e., $M = 10$, $\theta^{(0)} = -60^\circ$, $\theta^{(P)} = 60^\circ$. The inter-element spacing of the ULA is half a wavelength, and the potential space is divided into $P = 6$ subregions with equal spatial scopes. The spatial spectrum is constructed with a grid of $1^\circ$, thus there are $I = 120$ grids in total with $\vartheta_1 = -60^\circ, \vartheta_2 = -59^\circ, \cdots, \vartheta_I = 59^\circ$, and each spatial subregion has $I_0 = 20$ grids. The covariance vectors $\mathbf{r}$ in the training datasets of both the autoencoder and the classifiers, and also the vectors in the test datasets, are obtained from $N = 400$ snapshots.

For the training of the autoencoder, the $[-60^\circ, 60^\circ)$ space is also sampled with an interval of $1^\circ$ to obtain a direction set with $\vartheta_1 = -60^\circ, \vartheta_2 = -59^\circ, \cdots, \vartheta_I = 59^\circ$, and the covariance vectors and associated labels are computed according to (14) and (15). At each of the directional grids, only one group of snapshots is collected to compute one covariance vector. The SNR of the snapshots is 10dB. The mini-batch training strategy [53] is followed with a batch size of 32 and a learning rate of $\mu_1 = 0.001$, and 1000 epochs are taken for the training with the dataset shuffled in each epoch. The size of the input layer is $\kappa = M(M-1)/2 = 45$, and those of the hidden and output layers are $\lfloor 45/2 \rfloor = 22$ and $\kappa P = 45 \times 6 = 270$, respectively.

The autoencoder parameters are fixed after the training process, and another dataset is collected in two-signal scenarios to train the classifiers. The inter-signal angle $\Delta$ is sampled from the set $\{2^\circ, 4^\circ, \cdots, 40^\circ\}$, which covers scenarios from very close signals to signals separated by twice the width of a subregion. Then the direction of the first signal (denoted by $\theta$) is sampled with an interval of $1^\circ$ from $-60^\circ$ to $60^\circ - \Delta$, and the direction of the second signal is $\theta + \Delta$. The SNR of both signals is 10dB, and 10 groups of snapshots are collected for each direction setting with random noise. Finally, $(118 + 116 + \cdots + 80) \times 10 = 19800$ covariance vectors are collected in the dataset. The vectors are used for training with a mini-batch size of 32 and a learning rate of $\mu_2 = 0.001$, and the order of the vectors is shuffled during each of the 300 training epochs. The number of hidden layers is chosen to be $L_2 - 1 = 2$ as a trade-off between the expressive power (which improves with deeper networks [41]) and the under-training risk (which grows with more network parameters [51]) of the classifiers, and the sizes of the hidden and output layers in each classifier are $\lfloor 2/3 \times \kappa \rfloor = 30$, $\lfloor 4/9 \times \kappa \rfloor = 20$ and $I_0 = 20$, respectively. All the weights and biases of the DNN are randomly initialized according to a uniform distribution between $-0.1$ and $0.1$.

Three typical kinds of array imperfections are considered in the simulations, including gain and phase inconsistence, sensor position error and inter-sensor mutual coupling. The imperfections may be too complicated to be described with concise mathematical models, so we use simplified models to facilitate simulations in this paper. The gain biases of the array sensors are

$$\mathbf{e}_{gain} = \rho \times [0, \underbrace{0.2, \cdots, 0.2}_{5}, \underbrace{-0.2, \cdots, -0.2}_{4}]^T, \tag{37}$$

where the parameter $\rho \in [0, 1]$ is introduced to control the strength of the imperfections. The phase biases are

$$\mathbf{e}_{phase} = \rho \times [0, \underbrace{-30^\circ, \cdots, -30^\circ}_{5}, \underbrace{30^\circ, \cdots, 30^\circ}_{4}]^T. \tag{38}$$

The position biases are

$$\mathbf{e}_{pos} = \rho \times [0, \underbrace{-0.2, \cdots, -0.2}_{5}, \underbrace{0.2, \cdots, 0.2}_{4}]^T \times d, \tag{39}$$

where $d$ is the inter-sensor spacing of the ULA. The mutual coupling coefficient vector is

$$\mathbf{e}_{mc} = \rho \times [0, \gamma^1, \cdots, \gamma^{M-1}]^T, \tag{40}$$

where $\gamma = 0.3e^{j60^\circ}$ is the mutual coupling coefficient between adjacent sensors.

By specifying $\rho$, the array imperfections are determined, and the perturbed array responding function is rewritten as follows,

$$\mathbf{a}(\theta, \mathbf{e}) = (\mathbf{I}_M + \delta_{mc}\mathbf{E}_{mc}) \times (\mathbf{I}_M + \mathrm{Diag}(\delta_{gain}\mathbf{e}_{gain})) \times \mathrm{Diag}(\exp(j\delta_{phase}\mathbf{e}_{phase})) \times \mathbf{a}(\theta, \delta_{pos}\mathbf{e}_{pos}), \tag{41}$$

where $\delta_{(\bullet)}$ is used to indicate whether a certain kind of imperfection exists, $\mathbf{I}_M$ is the $M \times M$ identity matrix, $\mathrm{Diag}(\bullet)$ forms diagonal matrices with the given vector on the diagonal, $\mathbf{E}_{mc}$ is a Toeplitz matrix with parameter vector $\mathbf{e}_{mc}$ [22], and $\mathbf{a}(\theta, \delta_{pos}\mathbf{e}_{pos})$ is the actual array responding vector corresponding to the signal from direction $\theta$ when position error $\mathbf{e}_{pos}$ is embedded in the array geometry.

The array responding function given in (41) has been largely simplified when compared with its counterpart in practical applications, which can be measured more precisely with computational electromagnetic methods, such as [54], [55],
we use the simplified formulations mainly to facilitate simulation, and we believe that these simplifications are reasonable for performance comparison. That is because the proposed machine-learning method does not make use of any prior information about the array imperfections and steering vectors. The proposed end-to-end training and testing strategies can be generalized straightforwardly to other array geometries and imperfections, no matter how the antennas are fed and how much the array steering vector has been biased by imperfections.

In the figures, we use dashed lines to represent true values of signal directions, and dots with triangular or circle markers to represent their estimates and (statistical) estimation errors.
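As an illustration of the simplified imperfection model of Section V-A, the following NumPy sketch builds the bias vectors of (37)-(40) and the perturbed array response of (41) for the 10-element half-wavelength ULA. It is a minimal sketch rather than the authors' code: the function names are hypothetical, the steering-vector phase convention $\exp(j2\pi x_m \sin\theta)$ (positions in wavelengths) is an assumption, and the Toeplitz matrix $\mathbf{E}_{mc}$ is assumed to be symmetric with $\mathbf{e}_{mc}$ as its first column.

```python
import numpy as np

M = 10    # number of sensors, as in Section V-A
D = 0.5   # inter-sensor spacing in wavelengths (half-wavelength ULA)


def imperfection_vectors(rho):
    """Bias vectors of (37)-(40) for imperfection strength rho in [0, 1]."""
    gain = rho * np.array([0.0] + [0.2] * 5 + [-0.2] * 4)                  # (37)
    phase = rho * np.deg2rad([0.0] + [-30.0] * 5 + [30.0] * 4)             # (38), radians
    pos = rho * np.array([0.0] + [-0.2] * 5 + [0.2] * 4) * D               # (39)
    gamma = 0.3 * np.exp(1j * np.deg2rad(60.0))
    mc = rho * np.concatenate(([0.0], gamma ** np.arange(1, M)))           # (40)
    return gain, phase, pos, mc


def steering_vector(theta_deg, pos_error=None):
    """ULA response; the sign and phase convention here is an assumption."""
    positions = np.arange(M) * D
    if pos_error is not None:
        positions = positions + pos_error
    return np.exp(2j * np.pi * positions * np.sin(np.deg2rad(theta_deg)))


def perturbed_steering_vector(theta_deg, rho, d_gain=1, d_phase=1, d_pos=1, d_mc=1):
    """Perturbed response of (41); each d_* flag (0 or 1) switches one imperfection."""
    gain, phase, pos, mc = imperfection_vectors(rho)
    # Assumed symmetric Toeplitz structure: entry (i, k) equals mc[|i - k|].
    lag = np.abs(np.arange(M)[:, None] - np.arange(M)[None, :])
    E_mc = mc[lag]
    eye = np.eye(M)
    a = steering_vector(theta_deg, d_pos * pos)
    return ((eye + d_mc * E_mc)
            @ (eye + np.diag(d_gain * gain))
            @ np.diag(np.exp(1j * d_phase * phase))
            @ a)


if __name__ == "__main__":
    a_ideal = perturbed_steering_vector(31.5, rho=0.0)
    a_full = perturbed_steering_vector(31.5, rho=1.0)
    print(np.linalg.norm(a_full - a_ideal))
```

Setting `rho=0.0` recovers the ideal steering vector, and setting three of the four flags to zero isolates a single imperfection type, which mirrors the separate cases compared in the simulations.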
B. Generalization to untrained scenarios

In this subsection, we compare the proposed method with the state-of-the-art machine learning-based DOA estimation method, i.e., the SVR-based DOA estimator [32], [33], to show how they generalize to scenarios not included in the training dataset. No array imperfection is considered in the simulations.

Firstly, two signals with an angular distance of $9.4^\circ$ and SNR=10dB are assumed to impinge onto the array simultaneously, and the direction of the first signal varies from $-60^\circ$ to $50^\circ$. This angular distance is not contained in the training set of $\Delta$, and the direction of the second signal deviates from the preset training directions and the output spectrum grids. The final DOA estimates are obtained via amplitude interpolation within the two most significant peaks of the reconstructed spectra. The estimated directions and the estimation errors of the two signals when the first signal direction increases from $-60^\circ$ to $50^\circ$ with a step of $1^\circ$ are shown in Fig.4(a) and (b), respectively. The DOA estimates match their true values well and most of the estimation errors are smaller than $0.5^\circ$. In Fig.4(c) and (d), we plot the SVR-based DOA estimation results in the same scenarios. In Fig.4(c), the SVRs are trained with the same training dataset as the DNN classifiers, except that the array outputs for training are noise-free, while the testing data are noise-contaminated with SNR=10dB. In Fig.4(d), the training data of the SVRs are also polluted by noise with SNR=10dB, exactly as for the proposed method. The SVRs also perform well when the training data are noise-free, but their performance degrades significantly when there are perturbations in the training data. As it is very difficult or even impossible to collect noise-free training data, the proposed method is believed to behave better than the SVR-based method in practice.

Fig. 4. DOA estimation performance of off-grid signals, (a) DNN-based DOA estimates; (b) DNN-based DOA estimation errors; (c) SVR-based DOA estimates with noise-free training data; (d) SVR-based DOA estimates with training data of SNR=10dB.

Secondly, two signals with different SNR impinge onto the array simultaneously, with the SNR of the first signal being 10dB and that of the second one being 13dB. The angular distance between them is $16.4^\circ$ and the direction of the first signal varies from $-60^\circ$ to $43^\circ$. In order to improve the ability of the proposed method to adapt to power divergences between multiple signals, we enlarge the training dataset of the classifiers introduced in sections III.C and IV.B. Inter-signal SNR divergences of ±6dB, ±3dB and 0dB are considered in addition to the original training dataset settings, and the network training process remains unchanged. The DOA estimation results and errors of the proposed DNN-based method are shown in Fig.5(a) and (b), and the performances of the existing SVR-based method are shown in Fig.5(c) and (d). In order to obtain valid DOA estimates, the SVRs are trained with a noise-free dataset, while the DNN is trained with array outputs of SNR=10dB. The DNN-based method still performs satisfactorily in DOA estimation precision in spite of the inter-signal SNR divergences. On the contrary, as the different signal SNRs introduce diversity to the distributions of the training and testing data, the SVR-based method only obtains biased DOA estimates. The DOA estimation bias of the first signal, which has the lower SNR, is as large as $4^\circ \sim 5^\circ$ in most of the cases.

We then keep the SNR of the two signals fixed at 10dB and enlarge their angular distance to $60^\circ$, which deviates largely from the $\Delta$'s in the training set. When the first signal direction varies from $-60^\circ$ to $-1^\circ$, the DOA estimates of the DNN-based method and the SVR-based method are shown in Fig.6. The proposed DNN-based method again shows much better adaptation to such a scenario unseen during training, and the SVR-based method fails to obtain valid DOA estimates for the signals.

Finally, we show how the proposed method behaves when the testing data contain a different number of signals from the training data. The DNN and SVR have been trained with array outputs in two-signal scenarios, and the SVR-based method forms two regression machines for processing test data and outputs two DOA estimates for each given input [32], [33]. If the input covariance vector contains more or fewer signal components, the SVR outputs make no sense. However, the
Fig. 5. DOA estimation performance of two unequally-powered signals with SNR of 10dB and 13dB, (a) DNN-based DOA estimates; (b) DNN-based DOA estimation errors; (c) SVR-based DOA estimates; (d) SVR-based DOA estimation errors. [Plots omitted; axis labels: Sample index, Direction (Deg.).]

imperfections, the differences in DOA estimation precision between different parametric methods are minor. Among the existing parametric methods, we have chosen MUSIC as it is a widely accepted baseline method. The SVR method is also excluded here, because it lacks robustness to noisy training datasets, and SVR models trained with noise-free datasets would make the comparisons unfair.

Two signals with SNR=10dB are assumed to impinge onto the array from directions of $31.5^\circ$ and $41.5^\circ$, both off the training and output spectrum grids. The adjusting parameter $\rho$ in (37)-(40) varies from 0 to 1. When $\rho = 0$, no imperfection is contained in the array responding functions. Four cases
Fig. 8. DOA estimation root-mean-square-error (RMSE) of MUSIC and the proposed method for two signals from directions of $31.5^\circ$ and $41.5^\circ$ in the presence of different array imperfections, (a) gain and phase inconsistence; (b) sensor position error; (c) mutual coupling; (d) combined imperfection. [Plots omitted; axes: $\rho$ versus RMSE (Deg.); curves: MUSIC, DNN.]

Fig. 9. DOA estimation root-mean-square-error (RMSE) of the proposed method with different spatial filters in the presence of array imperfections, (a) gain and phase inconsistence; (b) sensor position error; (c) mutual coupling; (d) combined imperfection. [Plots omitted; axes: $\rho$ versus RMSE (Deg.); curves: 3 filters, 6 filters, 10 filters.]
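The RMSE values plotted in Fig. 8 and Fig. 9 summarize DOA estimation errors over repeated trials. The exact formula is not restated in this section, so the sketch below assumes the usual definition: the squared errors are pooled over all trials and both signals, averaged, and square-rooted. The function name `doa_rmse` and the sorting step used to pair estimates with true directions are illustrative assumptions.

```python
import numpy as np


def doa_rmse(estimates_deg, truths_deg):
    """Pooled RMSE in degrees.

    estimates_deg: array of shape (trials, K), K DOA estimates per trial.
    truths_deg:    array of shape (K,), the true directions.
    Estimates and truths are sorted so that each estimate is compared with
    the true direction of the same rank (an assumed pairing rule).
    """
    est = np.sort(np.asarray(estimates_deg, dtype=float), axis=-1)
    true = np.sort(np.asarray(truths_deg, dtype=float))
    return float(np.sqrt(np.mean((est - true) ** 2)))


if __name__ == "__main__":
    truths = [31.5, 41.5]                      # directions used for Fig. 8
    estimates = [[31.0, 42.0], [41.5, 31.5]]   # made-up estimates for illustration
    print(doa_rmse(estimates, truths))
```

Sorting before the comparison keeps the metric independent of the order in which an estimator reports the two directions.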
DOA estimation precision seldom changes with the amplitudes of the imperfections. When $\rho$ is as small as 0.1∼0.3, the DNN-based method performs comparably to MUSIC, and when $\rho$ becomes larger and the array responding function deviates farther away from its ideal counterpart, the DNN-based method performs much better than MUSIC.

In order to demonstrate the contribution of the multi-task autoencoder in DOA estimation, we select different decoder numbers to show how the DOA estimation precision changes with the number of spatial filters (i.e., decoders). Three filter numbers of 3, 6 and 10 are selected, and the corresponding DNNs are trained and tested in scenarios with different kinds of array imperfections and different values of $\rho$. Parameters of the training and testing datasets and processes are set the same as those in the simulations corresponding to Fig.8. The DOA estimation RMSEs are shown in Fig.9. The results show that, when the filter number is as small as 3, each filter covers a wide spatial subregion and the filter outputs are still widely divergent in distribution. Thus the trained DNN does not work well in some of the testing scenarios, and the DOA estimation RMSEs have very large variances. When the filter number increases to 6, the DOA estimation RMSE becomes very small and remains stable in different scenarios. After that, when we further increase the filter number to 10, the RMSE no longer decreases significantly. It can be concluded from this group of simulation results that decoder numbers smaller than a certain threshold lead to worse DOA estimation performance in the proposed DNN framework, while decoder numbers larger than the threshold do not lead to significant performance improvements. Therefore, we have empirically set the decoder number to 6 in previous simulations. Another special value of the decoder number is 1, which is equivalent to removing the autoencoder from the proposed DNN framework in Fig.1. In this case, a single classifier will be trained for DOA estimation directly. The results in Fig.9 indicate that the DOA estimation performance of the single classifier will be much worse than that of a DNN with a 3-task autoencoder. That is why we have introduced a multi-task autoencoder in the DNN framework to spatially filter the inputs to concentrate their distributions, so as to reduce the burden in generalization of the subsequent DOA estimation classifiers.

VI. CONCLUSION

This paper proposes a deep neural network (DNN) framework to deal with the problem of DOA estimation, so as to make up for the drawbacks of previous parametric and data-driven methods in terms of array imperfection adaptation and generalization. The proposed DNN-based framework consists of a multi-task autoencoder and a series of parallel multi-layer classifiers. The two parts are trained separately with different datasets, and the dataset for training the refined classifiers is generated only in two-signal scenarios. In spite of the simplicity of the training dataset, the proposed method is demonstrated via simulations to have much enhanced generalization when compared with the SVR-based method in the machine learning community. It adapts well to noisy training data, off-grid signals, unequally-powered signals, much larger angular distances, and even numbers of signals different from that in the training dataset. The proposed method has also been shown to adapt well to various kinds of array imperfections. It obtains DOA estimates with much higher precision than the most widely-studied parametric method, MUSIC, when the imperfections are significant.

The proposed method also has an obvious drawback when compared with existing parametric counterparts, such as MUSIC. It requires a large amount of labeled data to train the DNN framework for DOA estimation, which may be very demanding in practical applications when it is difficult to
collect such data. Therefore, a potential direction of our future research is reducing the size of the training dataset, or using alternative training processes, e.g., training the DNN with simulation data first and then adjusting it for practical usage via transfer learning with a small amount of practical data collected with arrays.

REFERENCES

[1] D. H. Johnson and D. E. Dudgeon, Array signal processing: concepts and techniques. Simon & Schuster, 1992.
[2] H. Krim and M. Viberg, “Two decades of array signal processing research: the parametric approach,” IEEE Signal Processing Magazine, vol. 13, no. 4, pp. 67–94, 1996.
[3] B. D. Van Veen and K. M. Buckley, “Beamforming: A versatile approach to spatial filtering,” IEEE ASSP Magazine, vol. 5, no. 2, pp. 4–24, 1988.
[4] J. Litva and T. K. Lo, Digital beamforming in wireless communications. Artech House, Inc., 1996.
[5] J. Li and P. Stoica, Robust adaptive beamforming. John Wiley & Sons, 2005, vol. 88.
[6] R. Schmidt, “Multiple emitter location and signal parameter estimation,” IEEE Transactions on Antennas and Propagation, vol. 34, no. 3, pp. 276–280, 1986.
[7] R. Roy and T. Kailath, “ESPRIT-estimation of signal parameters via rotational invariance techniques,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, no. 7, pp. 984–995, 1989.
[8] E. Gonen and J. M. Mendel, “Subspace-based direction finding methods,” in The Digital Signal Processing Handbook, V. K. Madisetti and D. B. Williams, Eds., ch. 62, 1999.
[9] D. Malioutov, M. Cetin, and A. S. Willsky, “A sparse signal reconstruction perspective for source localization with sensor arrays,” IEEE Transactions on Signal Processing, vol. 53, no. 8, pp. 3010–3022, 2005.
[10] Z.-M. Liu, Z.-T. Huang, and Y.-Y. Zhou, “Direction-of-arrival estimation of wideband signals via covariance matrix sparse representation,” IEEE Transactions on Signal Processing, vol. 59, no. 9, pp. 4256–4270, 2011.
[11] Z.-M. Liu, Z.-T. Huang, and Y.-Y. Zhou, “An efficient maximum likelihood method for direction-of-arrival estimation via sparse Bayesian learning,” IEEE Transactions on Wireless Communications, vol. 11, no. 10, pp. 1–11, 2012.
[12] Z. Yang, L. Xie, and C. Zhang, “Off-grid direction of arrival estimation using sparse Bayesian inference,” IEEE Transactions on Signal Processing, vol. 61, no. 1, pp. 38–43, 2013.
[13] Z.-M. Liu and F.-C. Guo, “Azimuth and elevation estimation with rotating long-baseline interferometers,” IEEE Transactions on Signal Processing, vol. 63, no. 9, pp. 2405–2419, 2015.
[14] A. G. Jaffer, “Maximum likelihood direction finding of stochastic sources: A separable solution,” in Acoustics, Speech, and Signal Processing, 1988. ICASSP-88., 1988 International Conference on. IEEE, 1988, pp. 2893–2896.
[15] P. Stoica and A. Nehorai, “MUSIC, maximum likelihood, and Cramer-Rao bound,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, no. 5, pp. 720–741, 1989.
[16] M. I. Miller and D. R. Fuhrmann, “Maximum-likelihood narrow-band direction finding and the EM algorithm,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 38, no. 9, pp. 1560–1577, 1990.
[17] B. Allen and M. Ghavami, Adaptive array systems: fundamentals and applications. John Wiley & Sons, 2006.
[18] B. Porat and B. Friedlander, “Accuracy requirements in off-line array calibration,” IEEE Transactions on Aerospace and Electronic Systems, vol. 33, no. 2, pp. 545–556, 1997.
[19] G. R. Hopkinson, T. M. Goodman, and S. R. Prince, A guide to the use and calibration of detector array equipment. SPIE Press, 2004, vol. 142.
[20] M. Viberg and A. L. Swindlehurst, “Analysis of the combined effects of finite samples and model errors on array processing performance,” IEEE Transactions on Signal Processing, vol. 42, no. 11, pp. 3073–3083, 1994.
[21] Z. Liu, Z. Huang, and Y. Zhou, “Bias analysis of MUSIC in the presence of mutual coupling,” IET Signal Processing, vol. 3, no. 1, pp. 74–84, 2009.
[22] B. Friedlander and A. J. Weiss, “Direction finding in the presence of mutual coupling,” IEEE Transactions on Antennas and Propagation, vol. 39, no. 3, pp. 273–284, 1991.
[23] T. Svantesson, “Modeling and estimation of mutual coupling in a uniform linear array of dipoles,” in Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on, vol. 5. IEEE, 1999, pp. 2961–2964.
[24] M. Lin and L. Yang, “Blind calibration and DOA estimation with uniform circular arrays in the presence of mutual coupling,” IEEE Antennas and Wireless Propagation Letters, vol. 5, no. 1, pp. 315–318, 2006.
[25] A. J. Weiss and B. Friedlander, “Array shape calibration using eigenstructure methods,” Signal Processing, vol. 22, no. 3, pp. 251–258, 1991.
[26] B. P. Flanagan and K. L. Bell, “Array self-calibration with large sensor position errors,” Signal Processing, vol. 81, no. 10, pp. 2201–2214, 2001.
[27] A. Paulraj and T. Kailath, “Direction of arrival estimation by eigenstructure methods with unknown sensor gain and phase,” in Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP’85., vol. 10. IEEE, 1985, pp. 640–643.
[28] Y. Li and M. Er, “Theoretical analyses of gain and phase error calibration with optimal implementation for linear equispaced array,” IEEE Transactions on Signal Processing, vol. 54, no. 2, pp. 712–723, 2006.
[29] B. C. Ng and C. M. S. See, “Sensor-array calibration using a maximum-likelihood approach,” IEEE Transactions on Antennas and Propagation, vol. 44, no. 6, pp. 827–835, 1996.
[30] K. V. Stavropoulos and A. Manikas, “Array calibration in the presence of unknown sensor characteristics and mutual coupling,” in Signal Processing Conference, 2000 10th European. IEEE, 2000, pp. 1–4.
[31] Z.-M. Liu and Y.-Y. Zhou, “A unified framework and sparse Bayesian perspective for direction-of-arrival estimation in the presence of array imperfections,” IEEE Transactions on Signal Processing, vol. 61, no. 15, pp. 3786–3798, 2013.
[32] M. Pastorino and A. Randazzo, “A smart antenna system for direction of arrival estimation based on a support vector regression,” IEEE Transactions on Antennas and Propagation, vol. 53, no. 7, pp. 2161–2168, 2005.
[33] A. Randazzo, M. Abou-Khousa, M. Pastorino, and R. Zoughi, “Direction of arrival estimation based on support vector regression: Experimental validation and comparison with MUSIC,” IEEE Antennas and Wireless Propagation Letters, vol. 6, pp. 379–382, 2007.
[34] A. Rawat, R. Yadav, and S. Shrivastava, “Neural network applications in smart antenna arrays: a review,” AEU-International Journal of Electronics and Communications, vol. 66, no. 11, pp. 903–912, 2012.
[35] K. Terabayashi, R. Natsuaki, and A. Hirose, “Ultrawideband direction-of-arrival estimation using complex-valued spatiotemporal neural networks,” IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 9, pp. 1727–1732, 2014.
[36] Y. Gao, D. Hu, Y. Chen, and Y. Ma, “Gridless 1-b DOA estimation exploiting SVM approach,” IEEE Communications Letters, vol. 21, no. 10, pp. 2210–2213, 2017.
[37] S. Jha and T. Durrani, “Direction of arrival estimation using artificial neural networks,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 21, no. 5, pp. 1192–1201, 1991.
[38] H. L. Southall, J. A. Simmers, and T. H. O’Donnell, “Direction finding in phased arrays with a neural network beamformer,” IEEE Transactions on Antennas and Propagation, vol. 43, no. 12, pp. 1369–1374, 1995.
[39] A. H. El Zooghby, C. G. Christodoulou, and M. Georgiopoulos, “A neural network-based smart antenna for multiple source tracking,” IEEE Transactions on Antennas and Propagation, vol. 48, no. 5, pp. 768–776, 2000.
[40] G. E. Hinton and R. R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” Science, vol. 313, no. 5786, pp. 504–507, 2006.
[41] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.
[42] J. Schmidhuber, “Deep learning in neural networks: An overview,” Neural Networks, vol. 61, pp. 85–117, 2015.
[43] I. Goodfellow, Y. Bengio, and A. Courville, Deep learning. MIT Press, 2016.
[44] J. Quiñonero-Candela, M. Sugiyama, A. Schwaighofer, and N. D. Lawrence, Dataset shift in machine learning. The MIT Press, 2009.
[45] M. Sugiyama and M. Kawanabe, Machine learning in non-stationary environments: Introduction to covariate shift adaptation. MIT Press, 2012.
[46] R. Takeda and K. Komatani, “Discriminative multiple sound source localization based on deep neural networks using independent location model,” in Spoken Language Technology Workshop (SLT), 2016 IEEE. IEEE, 2016, pp. 603–609.
[47] X. Xiao, S. Zhao, X. Zhong, D. L. Jones, E. S. Chng, and H. Li, “A learning-based approach to direction of arrival estimation in noisy and reverberant environments,” in Acoustics, Speech and Signal Processing
(ICASSP), 2015 IEEE International Conference on. IEEE, 2015, pp. Chenwei Zhang is a Ph.D. student in the De-
2814–2818. partment of Computer Science at the University of
[48] F. Vesperini, P. Vecchiotti, E. Principi, S. Squartini, and F. Piazza, “A Illinois at Chicago. He received his B.S. degree in
neural network based algorithm for speaker localization in a multi-room Computer Science and Technology from Southwest
environment,” in Machine Learning for Signal Processing (MLSP), 2016 University, China, in 2014. His research interests
IEEE 26th International Workshop on. IEEE, 2016, pp. 1–6. include deep learning, natural language processing,
[49] S. Chakrabarty, E. Habets et al., “Broadband doa estimation using and applications in text and web mining.
convolutional neural networks trained with noise signals,” arXiv preprint
arXiv:1705.00919, 2017.
[50] S. Adavanne, A. Politis, and T. Virtanen, “Direction of arrival estima-
tion for multiple sound sources using convolutional recurrent neural
network,” arXiv preprint arXiv:1710.10059, 2017.
[51] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep
network training by reducing internal covariate shift,” in International
Conference on Machine Learning, 2015, pp. 448–456.
[52] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S.
Corrado, A. Davis, J. Dean, M. Devin et al., “Tensorflow: Large-scale
machine learning on heterogeneous distributed systems,” arXiv preprint
arXiv:1603.04467, 2016.
[53] A. Cotter, O. Shamir, N. Srebro, and K. Sridharan, “Better mini-batch
algorithms via accelerated gradient methods,” in Advances in neural
information processing systems, 2011, pp. 1647–1655.
[54] R. W. Kindt, K. Sertel, and J. L. Volakis, “A review of finite array
modeling via finite-elementand integral-equation-based decomposition
methods,” The Radio Science Bulletin, no. 336, pp. 12–22, 2011.
[55] J. Hu, W. Lu, H. Shao, and Z. Nie, “Electromagnetic analysis of large
scale periodic arrays using a two-level cbfs method accelerated with
fmm-fft,” IEEE transactions on antennas and propagation, vol. 60,
no. 12, pp. 5709–5716, 2012.
[56] D. J. Ludick, M. M. Botha, R. Maaskant, and D. B. Davidson, “The
CBFM-enhanced Jacobi method for efficient finite antenna array analysis,”
IEEE Antennas and Wireless Propagation Letters, vol. 16, pp. 2700–2703,
2017.
[57] K. M. Pasala and E. M. Friel, “Mutual coupling effects and their
reduction in wideband direction of arrival estimation,” IEEE Transactions
on Aerospace and Electronic Systems, vol. 30, no. 4, pp. 1116–1122,
1994.
[58] R. S. Adve and T. K. Sarkar, “Compensation for the effects of mutual
coupling on direct data domain adaptive algorithms,” IEEE Transactions
on Antennas and Propagation, vol. 48, no. 1, pp. 86–94, 2000.
[59] C. K. Edwin Lau, R. S. Adve, and T. K. Sarkar, “Minimum norm
mutual coupling compensation with applications in direction of arrival
estimation,” IEEE Transactions on Antennas and Propagation, vol. 52,
no. 8, pp. 2034–2041, 2004.
[60] H.-S. Lui and H. T. Hui, “Direction-of-arrival estimation: measurement
using compact antenna arrays under the influence of mutual coupling,”
IEEE Antennas and Propagation Magazine, vol. 57, no. 6, pp. 62–68,
2015.

Zhang-Meng Liu received the PhD degree in statistical signal processing
from National University of Defense Technology (NUDT) of China in 2012.
He is now an associate professor at NUDT working at the intersection of
electronics engineering and computer science, especially electronic data
mining. Dr. Liu was a visiting scholar in the Big Data and Social Computing
(BDSC) laboratory led by Prof. Philip S. Yu at the University of Illinois at
Chicago from April 2017 to March 2018. He has published more than 20 papers
on signal processing, Bayesian learning and data mining.

Philip S. Yu received the B.S. degree in E.E. from National Taiwan
University, the M.S. and Ph.D. degrees in E.E. from Stanford University, and
the M.B.A. degree from New York University. His main research interests
include big data, data mining (especially graph/network mining), social
networks, privacy-preserving data publishing, data streams, database
systems, and Internet applications and technologies. He is a Distinguished
Professor in the Department of Computer Science at UIC and also holds the
Wexler Chair in Information and Technology. Before joining UIC, he was with
IBM Thomas J. Watson Research Center, where he was manager of the Software
Tools and Techniques department. Dr. Yu has published more than 970 papers
in refereed journals and conferences with more than 74,500 citations and an
H-index of 127. He holds or has applied for more than 300 US patents.
Dr. Yu is a Fellow of the ACM and the IEEE. He is the recipient of the ACM
SIGKDD 2016 Innovation Award for his influential research and scientific
contributions on mining, fusion and anonymization of big data, the IEEE
Computer Society’s 2013 Technical Achievement Award for “pioneering and
fundamentally innovative contributions to scalable indexing, querying,
searching, mining and anonymization of big data”, and the Research
Contributions Award from IEEE Intl. Conference on Data Mining (ICDM) in 2003
for his pioneering contributions to the field of data mining. He also
received an IEEE Region 1 Award for “promoting and perpetuating numerous new
electrical engineering concepts” in 1999. Dr. Yu is the Editor-in-Chief of
ACM Transactions on Knowledge Discovery from Data. He is on the steering
committee of ACM Conference on Information and Knowledge Management and was
a steering committee member of the IEEE Conference on Data Mining and the
IEEE Conference on Data Engineering.