Deep-Learning for Radar: A Survey
Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000.
Digital Object Identifier 10.1109/ACCESS.2021.Doi Number
ABSTRACT A comprehensive and well-structured review of the application of deep learning (DL) based
algorithms, such as convolutional neural networks (CNNs) and long short-term memory (LSTM), in radar
signal processing is given. The following DL application areas are covered: i) radar waveform and antenna
array design; ii) passive or low probability of interception (LPI) radar waveform recognition; iii) automatic
target recognition (ATR) based on high range resolution profiles (HRRPs), Doppler signatures, and
synthetic aperture radar (SAR) images; and iv) radar jamming/clutter recognition and suppression.
Although most existing works on similar topics praise DL as the ultimate solution to many bottleneck
problems, this work examines both the strengths and the weaknesses of DL. Specifically, two factors that
limit the real-life performance of deep neural networks (DNNs), limited training samples and adversarial
examples, are thoroughly examined. By investigating the relationships between the DL-based algorithms
proposed in various papers and linking them together to form a full picture, this work serves as a valuable
source for researchers seeking research opportunities in this promising field.
INDEX TERMS Deep-learning, radar waveform recognition, synthetic aperture radar (SAR), automatic
target recognition (ATR), adversarial examples, jamming recognition
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2021.3119561, IEEE Access
interference between radar and communications systems that share the same frequency band. In [7], Smith et al. proposed a novel DNN structure made of an actor network, which performs actions based on the current environment state, and a critic network, which judges whether the actor's behavior is appropriate. A deep deterministic policy gradient (DDPG)-based reinforcement learning strategy is adopted, and waveforms containing power spectrum notches are designed to limit the interference from radar to communications systems. In [8]-[9], Thornton et al. proposed a novel Double Deep Recurrent Q-Network, which combines the double Q-learning algorithm with long short-term memory (LSTM), so that the radar learns to avoid sub-bands containing interference signals in a spectrum co-existence scenario. DL-based algorithms are also increasingly adopted to solve the problem of target tracking in congested-spectrum environments. Specifically, researchers from the U.S. Army Combat Capabilities Development Command (DEVCOM) developed a DL-based strategy for radars to autonomously learn the behavior of interference from co-existing communication systems, so that clean spectrum is identified and radar waveforms are modified accordingly [10]. In [11], Kozy et al. modeled the problem of radar tracking in the presence of interference as a Markov decision process and applied deep Q-learning to balance the signal-to-interference-plus-noise ratio (SINR) against the bandwidth usage, so that the mutual interference between radar and the co-existing communications systems is minimized.

B. DL FOR OPTIMIZED WAVEFORM SYNTHESIS
DL-based algorithms are also increasingly adopted for radar waveform optimization under specific constraints, especially for MIMO radar. To separate, at the receiving end, the echoes caused by the illuminating signals from different MIMO transmitters for further processing, and thereby achieve the waveform diversity gain, the waveforms from different transmitting antennas have to be near-orthogonal [12]. Hence the cross-correlations between waveforms from different transmitting antennas are to be minimized. To minimize the auto-/cross-correlation sidelobes while meeting the constant modulus constraint, Hu et al. designed a deep residual neural network consisting of 10 residual blocks, each of which is made of dual layers of 128 neurons [13]. Later, a deep residual network similar to the one in [13] was adopted in [14] to synthesize desired beampatterns while minimizing the cross-correlation sidelobes under the constant modulus constraint. In [15], Zhong et al. proposed a feedforward neural network with ten hidden layers to maximize the SINR of MIMO radar under the constraints of constant modulus and low sidelobe levels. Many research works focus on minimizing the cross-correlation sidelobe levels. In [16], the problem of multi-target detection was considered assuming unknown target positions, where a deep reinforcement learning based strategy was adopted for waveform synthesis to maximize the detection capabilities of MIMO radar. Finally, the waveform generation and selection problem for multi-mission airborne weather radar was discussed in [17], where a feedforward neural network with a varying number of hidden layers was designed to synthesize nonlinear frequency modulated waveforms (NFMW) with pre-determined bandwidth and pulse length.

C. DL FOR ARRAY DESIGN
DL-based algorithms have also been employed to realize cognitive selection and intelligent partition of antenna subarrays. For example, in [18], a CNN with multiple convolutional layers, pooling layers, and fully connected layers (referred to as “Conv”, “POOL”, and “FC”, respectively, for simplicity in the rest of this work) was utilized for cognitive transmit/receive subarray selection as the surrounding environment evolves. Moreover, DL-based algorithms could potentially boost the performance of subarray-based MIMO (Sub-MIMO) radar, which could be regarded as a hybrid of phased-array radar and MIMO radar. The essence of Sub-MIMO radar is to transmit correlated waveforms within the same subarray, which resembles the working mechanism of the conventional phased array, while the waveforms from different subarrays are designed to be orthogonal, so that they can be separated at the receiving end for waveform diversity gain [19]. It follows naturally that the partition of subarrays for Sub-MIMO radar plays a key role in deciding the balance between the coherent processing gain and the waveform diversity gain. In [20], a novel CNN was proposed for interleaved sparse array design for phased-MIMO radar. Specifically, the parallel lightweight structure (i.e. PL module), which is based on the MobileNet-V2 structure, was used to divide feature matrices into parallel branches. Meanwhile, the scale reduced convolution structure (i.e. SR module) was used as an alternative to the conventional pooling layer for feature matrix dimension reduction. Simulation results show that, compared with uniform antenna array partition, the proposed CNN provides transmit beampatterns with a narrower mainlobe and lower sidelobes, more accurate direction of arrival (DOA) estimation, and higher output SINR.

The structures of the DNNs proposed in [7]-[20] and their distinctive features are summarized in TABLE 1.

III. DL FOR LPI OR PASSIVE RADAR WAVEFORM RECOGNITION
DL-based radar waveform recognition has also been gaining popularity in recent years. Various neural networks and algorithms have been developed, which include deep convolutional neural networks (CNNs) [21]-[23], auto-encoders [24]-[26], and recurrent neural networks (RNNs) [27]-[29]. These techniques could potentially 1) increase the probability of intercepting and recognizing the signals transmitted from low probability of interception (LPI) radar [30]-[31]; and 2) improve the direct-path signal estimation accuracy for passive radar applications [43]-[45]. However, as is pointed out in [46], [47], DL-based signal
classification algorithms are vulnerable to adversarial attacks, which are expected to be more powerful than classical jamming attacks.

A. DL FOR LPI RADAR
Most modern radar systems have been designed to emit LPI waveforms to avoid interception and detection by enemies. Therefore, automatic LPI radar waveform recognition has become a key counter-countermeasures technology. In the literature, dozens of DL-based waveform recognition techniques have been proposed within the past five years. Usually, the raw radar data are first pre-processed with time-frequency analysis (TFA) techniques, such as the Choi-Williams distribution (CWD) [30]-[35], the Fourier-based synchrosqueezing transform (FSST) [36], the Wigner-Ville distribution (WVD) [37], and the short-time Fourier transform (STFT) [38]-[40], to obtain time-frequency images. After that, various DNN structures, mostly CNNs, can be designed for feature extraction and waveform classification.

In [30]-[35], the CWD was used to generate time-frequency images in the pre-processing step. In [30], the sample averaging technique (SAT) was adopted for signal pre-processing to reduce the computational cost, after which a 9-layer CNN was proposed. In [31], a 7-layer CNN along with a novel tree structure-based process optimization tool (TPOT) classifier was designed. In [32], Ma et al. employed two different DNN structures to approach the waveform classification problem: an 11-layer CNN and a bidirectional LSTM, with the former exhibiting better performance. In [33], transfer learning was employed to counter the problem of limited training data. Five existing high-performance CNN architectures were used as pretrained networks: VGG16, ResNet50, Inception-ResNetV2, DenseNet, and MobileNetV2, with VGG16 offering the highest classification accuracy.

Twelve different types of radar waveforms have been used to test the performance of the various CNN structures proposed in [30]-[33], which include the linear frequency modulated (LFM) waveform, the BPSK waveform, the Frank-coded waveform, the Costas-coded waveform, the P1-P4 phase-coded waveforms, and the T1-T4 time-coded waveforms. Although the performances of the different DNNs in [30]-[33] are not directly comparable because their training/test data differ, the classification accuracy offered by these DNNs at SNR = -4 dB is higher than 90% in every case. In [34]-[35], the performances of DNNs were tested with fewer than eight different types of waveforms. In [34], networks (Inception-v3 and ResNet-152) pretrained with ImageNet were used to reduce the training cost. In [35], the instantaneous autocorrelation function (IAF) was used for denoising via the atomic norm as a pre-processing step, following which a CNN structure was proposed for the classification of the LFM, the Costas-coded, and the P2-P4 coded waveforms.

Although the CWD is a widely adopted TFA technique, it also involves high computational complexity, which has led researchers to seek computationally efficient alternatives. The FSST was used in [36] as a substitute for the CWD in the pre-processing step, following which a multi-resolution CNN with three different kernel sizes was proposed. In [37], the WVD was adopted, and a VGG16 variant pretrained with ImageNet was used to reduce the training cost. Moreover, the STFT was adopted in [38]-[40] to obtain the time-frequency diagram of radar data. In [38], Ghadimi et al. proposed two CNN structures based on GoogLeNet and AlexNet, respectively, for the classification of LFM, P2-P4, and T1-T4 waveforms. In [39], Wei et al. proposed a novel squeeze-and-excitation network for feature extraction in the time, frequency, and time-frequency domains, with the recognition results of all the domains fused subsequently. In [40], a simple CNN with three convolution layers and one fully connected layer was used to classify 20 different types of signals, which include frequency-modulated waveforms with various bandwidths and pulse widths and phase-modulated waveforms.
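As a rough illustration of the TFA pre-processing step described in this subsection, the following minimal sketch generates a baseband LFM pulse and turns it into an STFT magnitude image of the kind that would be fed to a CNN. It is not taken from any of the cited works: the function names, the Hann window, and all parameter values (pulse length, bandwidth, window length, hop size) are assumptions chosen for a toy example, and a direct DFT is used only to keep the code dependency-free.

```python
import cmath
import math


def lfm_pulse(n_samples, bandwidth, fs):
    """Complex baseband LFM pulse sweeping from -bandwidth/2 to +bandwidth/2."""
    duration = n_samples / fs
    chirp_rate = bandwidth / duration  # Hz per second
    pulse = []
    for n in range(n_samples):
        t = n / fs - duration / 2  # time axis centred on the pulse
        pulse.append(cmath.exp(1j * math.pi * chirp_rate * t * t))
    return pulse


def stft_magnitude(signal, win_len, hop):
    """Magnitude STFT with a Hann window; one row per frame, one column per DFT bin."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * k / win_len) for k in range(win_len)]
    image = []
    for start in range(0, len(signal) - win_len + 1, hop):
        segment = [signal[start + k] * window[k] for k in range(win_len)]
        row = []
        for m in range(win_len):
            # Direct DFT: fine for a toy example, an FFT would be used in practice
            acc = sum(segment[k] * cmath.exp(-2j * math.pi * m * k / win_len)
                      for k in range(win_len))
            row.append(abs(acc))
        image.append(row)
    return image


def peak_frequencies(image, fs):
    """Frequency of the strongest bin in each frame, mapped to [-fs/2, fs/2)."""
    win_len = len(image[0])
    peaks = []
    for row in image:
        m = max(range(win_len), key=row.__getitem__)
        if m >= win_len // 2:
            m -= win_len  # upper half of the DFT holds negative frequencies
        peaks.append(m * fs / win_len)
    return peaks


# Hypothetical toy parameters (normalised sampling rate)
pulse = lfm_pulse(n_samples=256, bandwidth=0.5, fs=1.0)
tf_image = stft_magnitude(pulse, win_len=32, hop=16)
peaks = peak_frequencies(tf_image, fs=1.0)
print(len(tf_image), "frames; peak frequency runs from", peaks[0], "to", peaks[-1])
```

For an LFM pulse the ridge of the time-frequency image rises linearly with time, which is the kind of visual signature a CNN classifier can pick up; phase-coded waveforms would produce different patterns in the same representation.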
Finally, it is worth mentioning that some research works on this topic did not employ TFA techniques for signal pre-processing. For example, in [41], an adaptive 1D CNN with four hidden layers and two dense layers was proposed for the classification of continuous and pulsed waveforms (sinusoidal, LFM, bi-phase coded, frequency-stepped).

The preprocessing procedures, the DNN structures, and the radar waveforms used for performance evaluation in [30]-[40] are summarized in TABLE 2.

B. DL FOR PASSIVE RADAR
Another potential application area for DL-based automatic waveform recognition algorithms is passive radar. Passive radar utilizes the signals from illuminators of opportunity (IOs) (e.g. base stations of wireless communications systems) for target detection, imaging, and tracking, which could increase the radar coverage area while avoiding the high infrastructure cost and the spectrum crowding caused by the construction of new dedicated radar transmitters. However, since the waveforms from the IOs are usually unknown to radar receivers, the performance of passive radar is usually much worse than that of conventional active radar [42]. In [43]-[44], DL was used to realize simultaneous waveform estimation and image reconstruction for a passive SAR composed of a ground-based IO at a known position and an airborne receiver. A recurrent neural network (RNN) was designed, with which the scene reflectivity was recovered via forward propagation, while the waveform coefficients were reconstructed via backpropagation. Simulation results show that the proposed RNN could learn the characteristics of quadrature phase-shift keying (QPSK) signals [43] and OFDM signals transmitted from DVB-T [44], and perform the SAR image reconstruction with low error. It was also shown that as the number of layers of the RNN increases, the image contrast improves at the cost of increased reconstruction error. In [45], Wang et al. developed a novel DNN consisting of a two-channel CNN and a bi-directional LSTM, termed TCNN-BL, for waveform recognition in cognitive passive radar, which could modify the sampling rate adaptively to suit the task at hand. Moreover, a parameter transfer approach was utilized to improve the network training efficiency.

C. CHALLENGES
According to [46], DNNs are highly vulnerable to adversarial attacks. Depending on the information that is available to the attackers, adversarial attacks could be classified as white-box attacks (the model structure and the parameters of the network are completely known a priori), grey-box attacks (known model structure and unknown parameters), and black-box attacks (unknown model structure and parameters). In most cases, the detailed information regarding the DNN is unknown to the attacker, who can only get access to the classification results of the network. Although the black-box attack is more common and less devastating than the other two types of attacks, the white-box attack is often used in research works to evaluate the worst-case scenario. In [47], Sadeghi et al. showed that a black-box attack can be designed to be approximately as effective as a white-box attack, which could lead to dramatic performance degradation in DL-based radio signal classification. It is worth noting that most research works on the topic of signal/waveform misclassification caused by adversarial attacks target wireless communication systems rather than radar. Nevertheless, the theory and
TABLE 3

Ref.: Feng et al. [54]
Preprocessing: Time-shift compensation, energy normalization & average processing for sensitivity elimination
DNN structure: Deep belief network (one Gaussian–Bernoulli RBM layer + two conventional RBM layers); stacked denoising autoencoder
Main features: Incorporate HRRP frame & average processing into one autoencoder; robust to noisy observation
Dataset: HRRP data from Yak-42 (large jet), Cessna Citation S/II (small jet), and An-26 (twin-engine turboprop)
Public-domain available?: Yes

Ref.: Pan et al. [55]
Preprocessing: Time-shift compensation, energy normalization & average processing for sensitivity elimination
DNN structure: Deep belief network (stacked RBMs) + softmax
Main features: Use t-SNE to deal with imbalanced distribution among different targets & aspects
Dataset: HRRP data from Yak-42 (large jet), Cessna Citation S/II (small jet), and An-26 (twin-engine turboprop)
Public-domain available?: Yes

Ref.: Zhao et al. [56]
Preprocessing: Energy normalization to eliminate amplitude-scaling sensitivity
DNN structure: Semi-supervised multitask recognition: deep-u-blind denoising network (autoencoder - decoder) + recognition phase (AlexNet variant)
Main features: Feature maps of encoder are transferred to decoder through the fusion layers to avoid gradient vanishing
Dataset: HRRP data from Yak-42 (large jet), Cessna Citation S/II (small jet), and An-26 (twin-engine turboprop)
Public-domain available?: Yes

Ref.: Xu et al. [57][58]
Preprocessing: Dividing HRRP data into multiple overlapping sequential features
DNN structure: Target-aware recurrent attentional network (input + encoder layer + attention mechanism)
Main features: Robust to time-shift sensitivity due to memory & attention mechanism
Dataset: HRRP data from Yak-42 (large jet), Cessna Citation S/II (small jet), and An-26 (twin-engine turboprop)
Public-domain available?: Yes

Ref.: Liao et al. [59]
Preprocessing: Time-shift compensation, energy normalization, embedding secondary-label (i.e. target aspect angle) in the model
DNN structure: Concatenated neural network consisting of 3 independent shallow neural sub-networks
Main features: Samples of target are divided into 4 sub-classes based on aspect angle to reduce target-aspect sensitivity; recognition results of multiple samples are fused
Dataset: HRRP data from civilian aircraft (Airbus A319, A320, A321, Boeing B738)
Public-domain available?: No

Ref.: Guo et al. [60]
Preprocessing: Energy normalization & average processing for sensitivity elimination
DNN structure: Deep 1D residual-inception network (Conv + POOL + residual-inception block + inception-POOL + FC)
Main features: Multi-scale conv kernels to extract features with different precisions and possess weight-sharing property; novel loss function considering inter-class/intra-class distance
Dataset: Seven types of ship of different sizes (length from 89.3 m to 182.8 m)
Public-domain available?: No

Ref.: Song et al. [61]
Preprocessing: Energy normalization, training data augmentation with shifting (translation)
DNN structure: Multi-channel CNN (3 × Conv + 3 × POOL + 2 × FC)
Main features: Multi-channel input (real-imaginary, amplitude-spectrum); use “deep features” generated by the final conv layer instead of handcrafted features
Dataset: Four classes of unspecified ground targets with different HRRP
Public-domain available?: No

Ref.: Song et al. [62]
Preprocessing: Target section segmentation, padding & normalization for noise & clutter elimination in GAN training
DNN structure: Deep convolutional generative adversarial network consisting of generator & discriminator made of 1D convolutional operators
Main features: GAN for HRRP generation; unbalanced training samples (i.e. majority vs minority class); 2 novel 1D convolutional operators, SC & FSC, are used
Dataset: 6 classes of vehicles (Sedan, Jeep, MPV, tractor, farm vehicle, box truck)
Public-domain available?: No

Ref.: Lundén et al. [63]
Preprocessing: (none listed)
DNN structure: CNN (2 × Conv + 2 × POOL + 3 × FC)
Main features: Multistatic radar system; the HRRPs of targets are calculated with POFACETS & 3D facet models of aircraft
Dataset: 8 fighters (F-16, F-35, F-18, MQ-1, PAK FA T-50, JAS-39C, Eurofighter, Rafale)
Public-domain available?: No

Ref.: Karabayır et al. [64]
Preprocessing: Energy normalization to eliminate amplitude-scaling sensitivity
DNN structure: CNN (4 × Conv + 2 × POOL + 1 × FC)
Main features: HRRPs are simulated based on target CAD models and then converted to 2D images; simulation infrastructure from MatConvNet is used
Dataset: HRRPs simulated based on CAD models of 6 military & 4 civilian ship targets assuming X-band maritime radar
Public-domain available?: No

Ref.: Liu et al. [65]
Preprocessing: (none listed)
DNN structure: Frame maximum likelihood profile (FMLP)-trajectory similarity auto-encoder (stacked autoencoders)
Main features: FMLP is used to characterize all HRRP signals in a specific frame rather than centroid alignment used in [54]
Dataset: Three aircraft targets (D507, D715, D910)
Public-domain available?: No

Ref.: Zhang et al. [66][67]
Preprocessing: Data alignment & normalization to reduce time-shift & amplitude sensitivity; data reshaping with coding module (raw fully polarimetric data coded into real matrix)
DNN structure: Self-attention module (focus on specific range cells) + 2 × Conv LSTM layers with 1 × POOL between them + classification module (FC + softmax)
Main features: Polarimetric HRRP recognition; collect scattering information from both spatial and polarimetric dimensions; focus on discriminative range cells for learning capacity improvement
Dataset: 10 vehicles (Camry, Civic, Jeep93, Jeep99, Maxima, Mazda MPV, Mitsubishi, Sentra, Avalon, Tacoma) [66]; 4 vehicles (truck, pick-up, sedan, minibus) [67]
Public-domain available?: Yes
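The preprocessing column of the table above repeatedly lists energy normalization, time-shift compensation, and average processing. The following minimal sketch shows one plausible way to chain these three steps on real-valued HRRP magnitude vectors. It is an illustration under stated assumptions rather than the procedure of any cited work: time-shift compensation is implemented here as circular centroid alignment, and the function names and the toy profile are hypothetical.

```python
import math


def energy_normalize(profile):
    """Scale a real-valued HRRP to unit L2 norm (removes amplitude-scaling sensitivity)."""
    norm = math.sqrt(sum(x * x for x in profile))
    return [x / norm for x in profile] if norm > 0 else list(profile)


def centroid_align(profile):
    """Circularly shift the HRRP so that its energy centroid lands on the middle range cell."""
    n = len(profile)
    # Circular centroid: angle of the energy-weighted mean of unit phasors
    re = sum(x * x * math.cos(2 * math.pi * k / n) for k, x in enumerate(profile))
    im = sum(x * x * math.sin(2 * math.pi * k / n) for k, x in enumerate(profile))
    centroid = (math.atan2(im, re) % (2 * math.pi)) * n / (2 * math.pi)
    shift = round(n / 2 - centroid)
    aligned = [0.0] * n
    for k, x in enumerate(profile):
        aligned[(k + shift) % n] = x
    return aligned


def frame_average(profiles):
    """Average a frame of aligned, normalised HRRPs cell by cell."""
    prepared = [energy_normalize(centroid_align(p)) for p in profiles]
    n = len(prepared[0])
    return [sum(p[k] for p in prepared) / len(prepared) for k in range(n)]


# Hypothetical toy profile and a copy that is shifted by 7 cells and scaled by 5
base = [0, 0, 1, 3, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
shifted_and_scaled = [5 * x for x in base[-7:] + base[:-7]]

a = energy_normalize(centroid_align(base))
b = energy_normalize(centroid_align(shifted_and_scaled))
averaged = frame_average([base, shifted_and_scaled])
print(max(abs(x - y) for x, y in zip(a, b)))  # difference after preprocessing
```

After alignment and normalization the two toy profiles coincide, which is the point of these steps: the network no longer has to learn invariance to range offset or echo strength on its own.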
mechanism of adversarial attacks for these two closely related fields are identical. To counter the challenges posed by adversarial examples, various adversarial training and detection approaches have been developed. For example, in [48], the 1D CNN used as an RF signal classifier was pre-trained with an autoencoder to mitigate the deceiving effects of adversarial examples, an approach that has the potential to be extended to the 2D image classification problem. In [49], two statistical tests were proposed for the detection of adversarial examples.

IV. DL FOR ATR
Machine learning (such as k-nearest neighbor and dictionary learning) has been employed for ATR long
before the emergence of DL [50], [51]. After AlexNet (one of the most popular deep CNNs) won the ILSVRC’12 contest [52], DL for radar ATR has become an intensively researched subject. Based on the amount of labeled data in the dataset used for training the network, DL could be classified as unsupervised learning, supervised learning, and semi-supervised learning (SSL), with SSL lying halfway between the other two. According to [53], in common cases, 1%-10% of the data used for SSL training are labeled, while the rest are unlabeled samples. Since most of the existing DL-based radar ATR methods are supervised, the recognition/classification accuracies of these methods are heavily limited by the amount of labeled training data. In this section, we provide a comprehensive review of DL-based ATR methods proposed in recently published research works, which includes i) ATR using the HRRP; ii) ATR using the micro-Doppler signatures; iii) ATR for SAR; and iv) major challenges for DL-based ATR.

A. DL-BASED ATR USING HRR PROFILES
In order to perform ATR using the HRRP, some preprocessing procedures are often required to eliminate the sensitivities of the DL-based algorithm to time-shift, amplitude-scaling, and aspect-angle. Commonly used sensitivity removal approaches include time-shift compensation, energy normalization, and average processing [54]-[56]. The DNN structures used for radar HRRP target recognition include the deep belief network [54], [55], recurrent attentional network [57], [58], concatenated neural network [59], CNNs [62]-[64], stacked auto-encoder (SAE) [65], and convolutional LSTM [66], [67].

Some researchers used measured HRRP data for performance evaluation. For example, the HRRP data from Yak-42 (large jet), Cessna Citation S/II (small jet), and An-26 (twin-engine turboprop) were used in [54]-[58]; the HRRP data from Airbus A319, A320, A321, and Boeing B738 were used in [59]; the HRRP data from seven types of ship of different sizes (length from 89.3 m to 182.8 m) were used in [60]; the HRRP data from various types of ground vehicles were used in [62], [66], [67]. Since most researchers only have access to a limited amount of HRRP measurement data associated with a handful of vehicles, many of them resort to simulated HRRP data generated by software based on the specific CAD models of vehicles for research purposes. For example, in [63], Lundén et al. generated HRRP data for 8 fighters (F-35, Eurofighter, etc.) with POFACETS and 3D facet models of aircraft. In [64], the HRRP data for 6 military and 4 civilian ship targets are simulated based on CAD models assuming X-band maritime radar. Another feasible alternative is data augmentation with a generative adversarial network (GAN). Specifically, in [62], a GAN was adopted to address the problem of unbalanced training samples, i.e. the labeled training samples for some classes (majority classes) significantly outnumber those for the other classes (minority classes).

The DNN structures of the DL-based ATR methods proposed in [54]-[65] along with their distinctive features are summarized in TABLE 3. The preprocessing procedures and the dataset used for performance evaluation have also been noted in the table. It is worth mentioning that some simulation results regarding target recognition using supervised DL based on the HRRPs collected with MIMO radar have also been presented [68]. However, since the DNN used to obtain the results in [68] was not detailed, it is not included in TABLE 3.

B. DL-BASED ATR USING MICRO-DOPPLER SIGNATURES
DL-based target detection/classification based on micro-Doppler signatures has been gaining ground rapidly in the field of automatic ground moving human/animal/vehicle target recognition [69]-[73] and drone classification [74]-[77]. In [69], the MAFAT dataset, which contains the echo signals from humans and animals collected by different pulse-Doppler radars at different locations, terrains, and SNRs, was used for the training of a six-layer CNN. To achieve higher classification accuracy, the data was further augmented via random frequency/time shifting, noise-adding, and vertical/horizontal image flipping. In [70], a CNN composed of 5 dense blocks (i.e. 3 × 3 Conv followed by 1 × 1 Conv) and 5 transition blocks (i.e. 1 × 1 Conv followed by 2 × 2 POOL) was proposed for human motion classification based on micro-Doppler signatures, the performance of which was tested with two datasets containing the echoes associated with six human motions (walking, running, crawling, forward jumping, creeping, and boxing) obtained via simulation and measurement, respectively. The major feature of the human motion recognition algorithm in [70] is that the proposed network is more robust to the varying target aspect angle than most classic CNN models, such as VGGNet, ResNet, and DenseNet. In [71]-[73], Hadhrami et al. investigated the problem of single-person/group/vehicle recognition based on micro-Doppler signatures with DL. Pre-trained classic CNN models (such as VGG16, VGG19, and AlexNet) and transfer learning were adopted to improve the network training efficiency. The RadEch human/vehicle target tracking data collected with Ku-band pulse-Doppler radar, which covered typical scenarios like single-person/group walking/running and truck moving, was used to test the performance of the proposed network. Moreover, data augmentation (×16) with image vertical flipping and circular shifting was employed to compensate for the limited training data.

In [74] and [75], pretrained classic CNN models (e.g. GoogLeNet) are used for drone classification. Specifically, in [74], the micro-Doppler signatures and the cadence-velocity diagrams obtained by 14 GHz frequency modulated continuous wave (FMCW) radar in indoor/outdoor experiments are merged as Doppler images, based on which drones with different numbers of motors are classified. In [75], both the pretrained GoogLeNet and a deep series CNN with 34 layers are employed for in-flight drone/bird classification. The RGB and the grayscale echo signal datasets collected by 24 GHz and 94 GHz FMCW radars are used to train the two networks, respectively. One distinctive feature
TABLE 4

Reference: Dadon et al. [69]
Preprocessing: STFT + FFT shift + abs(.) + log(.) + norm(.)
DNN structure: CNN (2 × Conv + 2 × POOL + 2 × FC)
Main features: Data augmentation with random frequency/time shifts, noising, flipping
Dataset: MAFAT dataset (echo signals from humans & animals within coverage area of pulse-Doppler radar)
Public-domain available?: Yes

Reference: Yang et al. [70]
Preprocessing: STFT + average background subtraction
DNN structure: CNN (5 dense blocks + 5 transition blocks + output); dense block: 2 × Conv; transition block: 1 × Conv + 1 × POOL
Main features: Insensitive to angle aspect (i.e. the target moves in arbitrary direction)
Dataset: Simulation & measurement dataset: 6 human motions (walking, running, crawling, forward jumping, creeping, boxing)
Public-domain available?: No

Reference: Hadhrami et al. [71][72][73]
Preprocessing: (none listed)
DNN structure: Pretrained CNN (VGG16 & VGG19 [71][73]; AlexNet [72][73])
Main features: Pre-trained CNN model & transfer learning; data augmentation (16 ×) with vertical flipping & circular shifting
Dataset: RadEch tracking data collected with Ku-band pulse-Doppler radar (one-person walking/running/crawling; group walking/running, wheeled, truck, clutter)
Public-domain available?: Yes

Reference: Kim et al. [74]
Preprocessing: Short-time Fourier transform (STFT)
DNN structure: CNN - pretrained model (GoogLeNet)
Main features: Micro-Doppler signature & cadence-velocity diagram are merged as Doppler image
Dataset: Micro-Doppler signatures of drones with different number of motors measured by 14 GHz FMCW radar in indoor/outdoor experiments
Public-domain available?: No

Reference: Rahman et al. [75]
Preprocessing: (none listed)
DNN structure: 2 networks: pretrained CNN (GoogLeNet) & deep series network (CNN with 34 layers)
Main features: RGB & grayscale dataset are used for training GoogLeNet & the proposed series network, respectively; clutter & noise are regarded as sub-classes
Dataset: Echo signals from inflight drones and birds collected by 24 GHz & 94 GHz FMCW radar
Public-domain available?: No

Reference: Mendis et al. [76][77]
Preprocessing: Cyclic autocorrelation function + FFT
DNN structure: Deep belief network (Gaussian–Bernoulli RBM + RBM layers, similar to [54])
Main features: Spectral correlation function (SCF) pattern signature is used; resilient to white Gaussian noise
Dataset: Echo signals collected from three micro unmanned aerial systems by S-band CW Doppler radar
Public-domain available?: No
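Several entries in the table above rely on data augmentation by flipping and circular shifting of the micro-Doppler image. The minimal sketch below shows one way such an augmentation could produce 16 variants per image. The split into 8 circular time shifts, each with and without a vertical flip, is purely an assumption made for illustration, since the exact recipe behind the reported 16× factor is not given here. Images are represented as lists of rows, with rows taken to be Doppler bins and columns taken to be time frames.

```python
def vertical_flip(image):
    """Reverse the row order (mirrors the Doppler axis when rows are Doppler bins)."""
    return [list(row) for row in reversed(image)]


def circular_shift(image, shift):
    """Rotate every row by `shift` columns so that the time axis wraps around."""
    width = len(image[0])
    shift %= width
    if shift == 0:
        return [list(row) for row in image]
    return [list(row[-shift:]) + list(row[:-shift]) for row in image]


def augment(image, n_shifts=8):
    """Return 2 * n_shifts variants: each circular shift, with and without a vertical flip."""
    width = len(image[0])
    step = max(1, width // n_shifts)  # assumed spacing between successive shifts
    variants = []
    for i in range(n_shifts):
        shifted = circular_shift(image, i * step)
        variants.append(shifted)
        variants.append(vertical_flip(shifted))
    return variants


# Hypothetical 4 x 8 toy image with unique pixel values
toy_image = [[r * 8 + c for c in range(8)] for r in range(4)]
variants = augment(toy_image)
print(len(variants), "augmented images from one original")
```

Both operations only rearrange pixels, so every variant keeps the energy content of the original image while presenting the network with a different time offset or Doppler sign.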
of the networks presented in [75] is that clutter and noise have been treated as two separate sub-classes. In [76] and [77], Mendis et al. proposed a deep belief network (DBN) formed by stacking the conventional RBM and the Gaussian Bernoulli RBM (GBRBM), which is similar to the one proposed in [54], to address the problem of micro drone detection and classification. The classification was based on the Doppler signatures of the targets of interest and their spectral correlation function (SCF) (i.e. Fourier transform of autocorrelation function) signature patterns. The performance of the proposed DBN was tested with the echo signals collected from three micro-drones (available at supermarkets at a price lower than $100) by S-band CW Doppler radar. The micro-Doppler signature based target detection and classification approaches proposed in [69]-[77] are summarized in TABLE 4.
Finally, it is worth mentioning that a comprehensive review on the application of DL for UAV detection and classification was provided in [78]. Although [78] covers the general topic of drone detection with multiple types of sensors (which include electro-optical, thermal, sonar, radar, and radio frequency sensors) and does not focus specifically on drone classification using the Doppler signatures collected by radar, it still serves as a good reference work for readers who are interested in the topic of drone/bird detection and classification.
C. DL-BASED ATR FOR SAR AND VIDEO SAR
In 2020, Majumder, Blasch, and Garren published a book summarizing recently proposed DL-based approaches for radar ATR, where DL for single and multi-target classification in SAR imagery was considered [79]. Specifically, this book focused on the ATR performances of various DNNs evaluated with the popular MSTAR dataset, where MSTAR stands for Moving and Stationary Target Acquisition and Recognition. The public release of the MSTAR dataset, which was collected by the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL), consists of 20,000 SAR image chips covering 10 target types from the former Soviet Union. It should be noted that, although the MSTAR dataset has long been widely adopted in research works to evaluate the performance of traditional machine-learning based algorithms (e.g. SVM), by which a classification rate of 97%-100% had been reached, it has been shown in some papers that the ATR performance of the algorithms trained/tested merely on the MSTAR dataset usually degrades when trained/tested using other datasets (e.g. the QinetiQ dataset [80], [81]). Nevertheless, in this section, we will give a brief review of recently proposed DNNs for ATR employing the MSTAR dataset [82]-[92] and other SAR image datasets (e.g. TerraSAR-X). The limitation of the MSTAR dataset and the possible counter solutions will be covered later in Section IV-D.
In [82], Chen et al. proposed an all-convolutional network (A-ConvNet) composed of 5 × Conv and 3 × POOL. Since only sparsely connected Conv layers were used and the FC layer was omitted, A-ConvNet is highly computationally efficient. The performance of A-ConvNet was evaluated under both standard operating condition (SOC) and extended operating condition (EOC) (e.g. substantial variation in depression angle/target articulation), which has been widely adopted as the performance benchmark in research papers. In [83], a normal multiview deep CNN (DCNN) was proposed, which
is a parallel network with multiple inputs (i.e. SAR images from different views) requiring only a limited amount of raw SAR images. The features learned from different views are fused progressively toward the last layer of the network, which leads to classification rates of 98% and 93% for SOC and EOC, respectively. In [84], Furukawa et al. proposed a CNN termed verification support network (VersNet) composed of an encoder and a decoder. A main feature of the network is that the input SAR image can be of arbitrary size and can contain multiple targets from different classes. In [85], Shang et al. added an information recorder, which is a variant of the memory module proposed in [89], along with a mapping matrix to the basic CNN. The resulting memory CNN (M-Net) uses spatial similarity information of recorded features to predict unknown sample labels. A two-step training process (i.e. parameter transfer) was employed to guarantee convergence of the results and to reduce the required training time. The CNNs proposed in [86]-[88] are also worth a brief mention. In [86], morphological operations were used to smooth edges, remove blurred pixels, and amend cracks, and the large-margin softmax batch normalization was employed. In [87] and [88], the database was extended with affine transformation in range, and a couple of SVMs were used to replace the FC layer in the CNN for final classification.
ATR based on SAR image sequences obtained from, for example, single-radar observations along a circular orbit over time or joint observation from different angles by multiple airborne radars, has also been investigated in research works. Considering that the sub-images in the SAR image sequence obtained by the imaging radar over a period of time from the same target often exhibit conspicuous variations, a spatial-temporal ensemble convolutional network (STEC-Net) consisting of 4 convolutional layers and 4 pooling layers was proposed in [90]. Dilated 3D convolution was used to extract spatial and temporal features simultaneously, which were progressively fused and represented as the ensemble feature tensors. To reduce the training time, compact connection was used rather than a fully connected layer. In [91], Zhang et al. proposed a multi-aspect-aware bidirectional LSTM network (MA-BLSTM) consisting of the feature extraction blocks, the feature dimension reduction block, and a 3-layer LSTM block. The feature extraction block utilizes the Gabor filter (orientation and rotation sensitive) in combination with the three-patch local binary pattern (TPLBP) operator (rotation invariant) to obtain global & local features, while a 3-layer MLP was employed for feature dimension reduction. In [92], Bai et al. proposed a bidirectional LSTM network, the performance of which was evaluated for two cases: clutter-present and clutter-free. Surprisingly, the presence of clutter led to higher classification accuracy than the clutter-free case. All the DNNs proposed in [90]-[92] reported a target recognition accuracy higher than 99.9%, but the performance is expected to degrade in real-life application scenarios (note: “a machine trained in one environment cannot be expected to perform well when environmental conditions change,” as stated by J. Pearl [93]).
According to [91] and [92], the LSTM network outperforms the hidden Markov models (HMMs), which had been widely adopted to model the multi-aspect SAR images until the 2000s [94], in modeling the stochastic sequences, especially when the initial probability of states is unknown. However, the LSTM is notoriously time-consuming to train (not to mention that the training time of MA-BLSTM increases by 5 times with the decrease of training data [91]). Moreover, auto-extracted features obtained with CNNs or other types of unsupervised neural networks are not necessarily better than the hand-crafted ones designed by human experts. Actually, many well-established researchers have doubts about the “black-box” process of “automatic” feature extraction, which makes a network extremely vulnerable to adversarial attacks (more details regarding this problem will be provided in Section IV-D).
Apart from the CNNs and the LSTM networks mentioned above, other DL-based networks such as the autoencoders and Capsule Networks (CapsNets) have also been investigated as feasible solutions to the ATR problem. In [95], Deng et al. proposed a network composed of stacked auto-encoders (SAE). To avoid overfitting, a restriction based on Euclidean distance was implemented (i.e. samples from the same target at different aspect angles have shorter distance in feature space) and a dropout layer was added to the network. In [96] and [97], Geng et al. proposed a deep supervised & contractive neural network (DSCNN), which consists of 4 layers of supervised and contractive autoencoders. Multiscale patch-based feature extraction was performed with three filters: the gray-level gradient cooccurrence matrix (GLGCM) filter, the Gabor filter, and the histogram of oriented gradient (HOG) filter. The graph-cut-based spatial regularization was applied to smooth the results. Moreover, unlike the other networks discussed in this subsection, which have all been trained and tested using the MSTAR dataset, the DSCNN was tested with three datasets: the TerraSAR-X, the Radarsat-2, and the ALOS-2 data. A comprehensive review of the autoencoder and its variants for target recognition in SAR images can be found in [98]. In [99]-[102], various capsule networks (CapsNets) were proposed to address two problems in SAR-image based ATR: limited training data and depression angle variance. CapsNets are composed of capsules, which are vectors of information about the input data, with the magnitude representing the probability of the presence of an entity and the direction representing the pose and position of the entity. Due to page limitation, this minority group of CapsNets based networks will not be detailed here. The DNNs discussed in this section for ATR using SAR images are summarized in TABLE 5.
Finally, note that DL could also be used for video-SAR moving target indication. Specifically, Ding et al. proposed a faster region-based CNN in [103], which is a variant of the algorithm proposed by Ren et al. in [104]. To reduce the
training burden, the features were extracted with pretrained CNN models such as AlexNet, VGGNet, and ZFNet. The Density-based Spatial Clustering of Applications with Noise (DBSCAN) algorithm was applied to reduce false alarms, and the Bi-LSTM was used to improve the detection probability. The performance of the proposed network was evaluated with both simulated video SAR data and real data released by Sandia National Laboratory, which was further augmented with rotation and cropping.
D. MAJOR CHALLENGES FOR DL-BASED ATR
In Section IV-C, we reviewed many DNNs trained and tested with the MSTAR dataset. In this subsection, we will look into two limiting factors which have been keeping the unanimous adoption of DNNs for radar ATR tasks on battlefields from becoming a reality: the limited amount of training data and the potential security risk posed by carefully crafted adversarial attacks.
(1) Lack of training data
Although classification rates of higher than 99% have been reported in many papers covering DNNs trained for radar ATR using the MSTAR dataset, the accuracies of these networks are expected to degrade dramatically when tested with SAR images taken at depression angles that are very different from the ones used to obtain the training dataset or other SAR image datasets, e.g. the QinetiQ dataset [80], [81]. As pointed out by J. Pearl, the neural networks usually cannot perform well if the environment they are tested in is different from the one they are trained with [93]. However, the DL-based approaches will simply lose all their glamor if we must train the network from the very beginning with a large amount of qualified training data for every new classification task. What’s worse, unlike other ordinary image classification tasks (e.g. cat/dog classification), the SAR images used for radar ATR are usually very scarce, especially when the targets are military vehicles employed by other countries. Therefore, machine learning with small training data sets is key to the success of radar ATR using SAR images. In the following, we will examine various neural networks that are designed to meet this challenge.
Since these networks have all been trained using the MSTAR dataset, the classification accuracies of these networks and the number of samples involved in the training process are comparable. Before we move on, we will first provide some details on the MSTAR dataset, so the readers could get a clear picture of what is happening. As was mentioned before, the MSTAR dataset consists of 20,000 SAR image chips covering 10 target types from the former Soviet Union (BMP2, BTR70, T72, BTR60, 2S1, BRDM2, D7, T62, ZIL131, ZSU23/4). These targets were measured over the full 360° azimuth angles and over multiple depression angles (15°, 17°, 30°, and 45°), and the SAR images are 128 × 128 pixels in size and of 1 foot × 1 foot resolution. In most papers, to demonstrate the robustness of the proposed networks to the variation of angles, the SAR images used for training and testing usually correspond to two different depression angles (e.g. 15° and 17°).
Supervised learning: For comparison purposes, we first look at the application of traditional machine learning methods to address this problem. The topic has been thoroughly reviewed in [105]. More recently, in [106], Clemente et al. utilized K-nearest neighbor for ATR against compound Gaussian noise, which was added to the MSTAR dataset manually. The features were represented by Krawtchouk moments, and the selection of testing/training samples was randomized in each Monte Carlo run. Using only 191 training samples, the method proposed in [106] reached an accuracy of 93.86%.
Semi-supervised learning: Since the manual feature extraction usually induces high computational complexity while the auto feature extraction is a time-consuming
process requiring a large amount of labeled training samples, some researchers resort to semi-supervised machine learning. In [107], Hou et al. introduced a semi-supervised online dictionary learning algorithm, where the SAR images were modeled with complex Gaussian distribution (CGD). The dictionary was updated by adding samples to the training process in a progressive way, and the Bayesian inference was employed to learn the dictionary. In [108], Wang et al. used dual-networks and cross-training (i.e. the Siamese network) to improve the classification rate with limited training data. Specifically, the pseudo-labels generated by one network were used to fine-tune the other network, and an iterative categorical cross-entropy function was designed as the loss function of the dual-networks for contrastive learning. Although a high accuracy of 97.86% was obtained in [108] with only 400 training samples, it is worth noting that the Siamese network is known for its sensitivity to input variations and weak generalizability. Feature augmentation, i.e. combining complementary features extracted by optimally-selected multi-level layers rather than utilizing the high-level features only, is another solution to improve the accuracy with limited training samples. In [109], Zhang et al. proposed a CNN composed of 5 Conv layers, 5 pooling layers, and 2 FC layers. The features from the Conv layers were concatenated, and the AdaBoost rotation forest (RoF) was used to replace the original softmax layers. With 500 training samples, the network proposed in [109] reached a classification rate of 96.3%. Note that other supervised classifiers, such as SVM and random forest, could also be used as substitutes for the softmax layers of a classic CNN to improve the accuracy.
Unsupervised learning: One way to realize unsupervised learning with limited training data samples is to employ transfer learning. In [110], Huang et al. proposed a DNN composed of stacked convolutional auto-encoders, which was trained with unlabeled SAR images for the subsequent transfer learning rather than the commonly used ImageNet, which contains optical images that are very different from SAR images. In [111]-[118], data augmentation was performed to boost the training dataset in addition to transfer learning to further improve the classification accuracy. Specifically, in [111], Zhong et al. employed three classic CNNs, namely CaffeNet, VGG-F, and VGG-M, that have been pretrained with the ImageNet dataset. The data augmentation method used in [82] was adopted, and 2700 images for each class were obtained via randomly sampling 88 × 88 patches from the 128 × 128 SAR image chips. With network pruning (a maximum of 80% of the filters pruned) and recovery employed, the network presented in [111] is 3.6 times faster than the A-ConvNet proposed in [82] at the cost of a 1.42% decrease in accuracy. In [112], Ding et al. proposed an all-in-one 6-layer CNN, and three types of data augmentation, namely posture synthesis, translation, and noise-adding, were combined. With training samples augmented to 1000 per class, the network in [112] reached a test accuracy of 93.16%. In [113], Yu et al. proposed a 13-layer CNN, with the input data preprocessed with Gabor filters. The center 88 × 88 pixels of the SAR images were cropped to reduce the computational burden, and the training dataset was augmented with the approach proposed in [112]. By replacing 1%-15% of the pixels in the target scene with randomly generated samples, the anti-noise performance of the proposed network was demonstrated. In [114], data augmentation was performed by first using improved Lee sigma filtering to remove speckles and then adding random noise. The proposed 9-layer CNN reached a high accuracy of 98.7% with 1900 training samples.
In [115] and [116], Lewis and Scarnati pointed out that the synthetic SAR images obtained by simply manipulating the real SAR images as if they were ordinary optical images are of poor quality (despite the resemblance between them in “appearance”), and using only the synthetic data in the training process could lead to dramatic performance degradation. For example, the SAR ATR CNN in [117] achieved only a 19.5% accuracy when trained with synthetic data and tested with real data. Therefore, in [115] and [116], 3D CAD models of targets were used to synthesize the Synthetic and Measured Paired and Labeled Experiment (SAMPLE) dataset. The input data was preprocessed with t-SNE for dimension reduction, and variance-based joint sparsity was employed for denoising. Moreover, the clutter was transferred from real to synthetic SAR images via task masks. With 50% real data from the MSTAR dataset and 50% synthesized data generated with the GAN, the modified DenseNet proposed in [115] reached an accuracy of 92%. In [118], a dual parallel GAN (DPGAN) made of a generator with 4 convolution layers and 4 deconvolution layers and a discriminator with 4 convolution layers was proposed. The raw images with opposite azimuth were merged together for shadow compensation. With 300 GAN-augmented training samples, the 5-layer CNN proposed in [118] reached a high accuracy of 99.3%.
The networks proposed in [106]-[118] along with the number of MSTAR samples used for training and the corresponding accuracies are summarized in TABLE 6, where “AUG” represents training data augmentation. Since transfer learning plays a key role in improving the accuracy of DNNs with limited training data while reducing the training time, the readers are also referred to [119], in which how to apply transfer learning in SAR ATR was discussed in detail (note that it was concluded in [119] that simple “domain adaption based transfer learning” by applying a DNN model pretrained with natural optical images, e.g. ImageNet, directly to the problem of SAR image classification/recognition does not work well). Finally, although the MSTAR data set has been widely used for the training of SAR ATR DNNs [106]-[118], some researchers resort to a few SAR image datasets obtained by TerraSAR-X that have been made available to the public, which include the landscape mapping dataset [120], the
ship detection dataset [121], [122], and the vehicle detection dataset [123].
(2) Adversarial attacks
According to the literature, one of the most intriguing features of adversarial attacks is that by slightly changing some pixels of a picture (changes so trivial that humans cannot even notice), the DL-based image classification algorithm will be fooled into making unbelievable mistakes. For example, if we add a toaster sticker to a banana, it could be misclassified as a toaster by a DL-based classifier [46]. Based on the adversary’s knowledge of the network to be attacked, adversarial attacks could be classified as white-box, grey-box and black-box attacks (see Section III-C for details). Moreover, an adversarial attack is said to be “targeted” if the adversarial examples have been designed to be misclassified as a specific type of target and “nontargeted” otherwise. The research in the field of adversarial attacks resembles a cat-and-mouse game: many algorithms are designed to misguide the existing DNNs into misclassification, while the others are developed to improve the robustness of the DNNs to adversarial examples via adversarial training, adversarial detection, gradient-masking, etc. In this subsection, we will give a brief introduction to several highly-cited adversarial attack algorithms proposed in recent years. Before we move on to introduce original research works on this topic, we will first provide some background information on commonly used attack methods that are readily available as Python toolboxes free for download [124].
The adversarial attacks widely adopted by DNN attackers generally belong to three categories: the gradient-based attacks, the score-based attacks, and the decision-based attacks.
The gradient-based attacks utilize the input gradients to obtain perturbations that the model predictions for a specific class are most sensitive to. The fast gradient sign method (FGSM), the Basic Iterative Method (BIM), the iterative least-likely class method (ILCM), the Projected Gradient Descent (PGD) and the DeepFool are some of the most famous attack methods belonging to this group [124]. The FGSM proposed by Goodfellow et al. [126] utilizes the gradient of the loss function with respect to the input to create an adversarial example that maximizes the loss so that it will be misclassified. The BIM, which is also referred to in the literature as the iterative fast gradient sign attack method (I-FGSM), and the ILCM were both proposed by Kurakin et al. in [127]. The BIM is a straightforward extension of the FGSM method, which seeks to maximize the cost of the true class along small steps in the gradient direction in an iterative manner. In contrast, the ILCM iteratively maximizes the probability of the specific false target class with the lowest confidence score for the clean image. The PGD-based attack method [128] is essentially the same as the BIM except that for PGD, the example is initialized at a random point in the ball of interest determined by the l∞ norm. The DeepFool method proposed by Moosavi-Dezfooli et al. [129] first computes the minimum distance it takes to reach the class boundary assuming that the classifier is linear, then makes corresponding steps towards that direction.
The score-based attacks do not require gradients of the model or other internal knowledge about the networks to be attacked, but need to know the probability that the input samples belong to a certain class, i.e. the probability labels. They are less popular than the gradient-based attacks. The single-pixel attack proposed by Narodytska and Kasiviswanathan [130] in 2017 is a typical score-based attack. It probes the weakness of a DNN by changing single pixels to white or black one at a time. In 2019, an alternative single-pixel based approach was proposed in [131], which relies on the differential evolution algorithm and achieved a high successful-misguiding rate by modifying fewer than 5 image pixels. In contrast, the decision-based attacks rely only on the class decision made by the targeted networks and do not require any knowledge regarding gradients or probabilities. This last category of adversarial attacks includes the boundary attack [132], the noise attack, and the blur attack (for images only) [124].
In the following, we will concentrate on the application of adversarial attacks in radar ATR. In [133], Huang et al. proposed four algorithms to misguide multi-layer perceptron (MLP) and CNN models designed for radar ATR using HRRP. Two of them are fine-grained perturbations (i.e. the adversarial sample is updated according to the input), while the other two are universal perturbations (i.e. image-
Reference | Name | Adversary’s knowledge | Adversary’s specificity | Main features
Huang et al. [133] | Algorithm-1 | White-box & black-box | Nontargeted | Fine-grained perturbation; fast gradient sign method (FGSM) variant with scaling factor obtained via binary search; more effective than FGSM
Huang et al. [133] | Algorithm-2 | White-box & black-box | Targeted | Fine-grained perturbation; multi-iteration method continuously updating
Huang et al. [133] | Algorithm-3 | White-box & black-box | Nontargeted | Universal perturbation via the aggregation of fine-grained perturbations
Huang et al. [133] | Algorithm-4 | White-box & black-box | Targeted | Universal perturbation by scaling one fine-grained perturbation
Huang et al. [134] | I-FGSM | | Nontargeted | FGSM variant; in each iteration, the clipping function changes only 1 pixel
Huang et al. [134] | ILCM | White-box | Targeted | Iteratively maximize the probability of specific false target class with lowest confidence score for clean image
Huang et al. [134] | Decision based attack | Black-box | Targeted | Attack decisions; don’t need gradients; find the decision boundary between clear sample and samples of desired false class
Lewis et al. [135] | FGSM | White-box | Nontargeted | Compute the gradient of loss function & seek the minimum step size to obtain adversarial sample
Lewis et al. [135] | DeepFool | White-box | Nontargeted | Compute minimum distance to class boundary assuming linear classifier
Lewis et al. [135] | NewtonFool | White-box | Nontargeted | Compute the minimum distance to a point x’ adjacent to x so that the probability of x’ belongs to true class of x approaches zero
Lewis et al. [135] | BIM | White-box | Nontargeted | Maximize the loss along small steps in the gradient direction
Lewis et al. [135] | PGD | White-box | Nontargeted | Maximize probability of some specific target class which is unlikely to be the true class for a given sample; multiple iterations
Wang et al. [136] | UAP | Black-box | Nontargeted | Universal perturbation proposed in [137] is added to MSTAR images to fool CNNs; success rate higher than 80%
Wagner et al. [138] | DeepFool | White-box | Nontargeted | Competitive Overcomplete Output Layer (COOL) used as the output layer to improve robustness against DeepFool
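To make the relationship between three of the gradient-based attacks listed above concrete, the following is a minimal NumPy sketch of FGSM, BIM, and PGD. It is an illustration under stated assumptions, not the implementation used in any cited work: the victim is a stand-in linear softmax classifier with random weights `W` and `b` (not a CNN trained on radar data), and the budget `eps`, step size `alpha`, and iteration count `steps` are arbitrary. ILCM and DeepFool are not shown.

```python
import numpy as np

def loss_and_input_grad(W, b, x, y):
    """Cross-entropy loss of a linear softmax classifier and its gradient w.r.t. x."""
    z = W @ x + b
    z = z - z.max()                       # stabilise the exponentials
    p = np.exp(z) / np.exp(z).sum()       # class probabilities
    loss = np.log(np.exp(z).sum()) - z[y]
    onehot = np.zeros_like(p)
    onehot[y] = 1.0
    return loss, W.T @ (p - onehot)

def fgsm(W, b, x, y, eps):
    """Single step of size eps along the sign of the input gradient."""
    _, g = loss_and_input_grad(W, b, x, y)
    return x + eps * np.sign(g)

def bim(W, b, x, y, eps, alpha, steps, x_start=None):
    """Iterated small sign-gradient steps, clipped to the l-infinity ball around x."""
    x_adv = x.copy() if x_start is None else x_start.copy()
    for _ in range(steps):
        _, g = loss_and_input_grad(W, b, x_adv, y)
        x_adv = np.clip(x_adv + alpha * np.sign(g), x - eps, x + eps)
    return x_adv

def pgd(W, b, x, y, eps, alpha, steps, rng):
    """Same iteration as BIM, but started from a random point inside the ball."""
    x0 = x + rng.uniform(-eps, eps, size=x.shape)
    return bim(W, b, x, y, eps, alpha, steps, x_start=x0)

# Assumed toy setup: 3 classes, 16-dimensional input, label = clean prediction.
rng = np.random.default_rng(0)
W, b = rng.normal(size=(3, 16)), np.zeros(3)
x = rng.normal(size=16)
y = int(np.argmax(W @ x + b))
eps, steps = 0.1, 10
alpha = eps / steps

x_fgsm = fgsm(W, b, x, y, eps)
x_bim = bim(W, b, x, y, eps, alpha, steps)
x_pgd = pgd(W, b, x, y, eps, alpha, steps, rng)
clean_loss = loss_and_input_grad(W, b, x, y)[0]
fgsm_loss = loss_and_input_grad(W, b, x_fgsm, y)[0]
bim_loss = loss_and_input_grad(W, b, x_bim, y)[0]
```

All three perturbations stay within the same l∞ budget `eps`; they differ only in how that budget is spent (one step, many small steps, or many small steps from a random start).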
agnostic). These algorithms and their main features are summarized in TABLE 7. Simulation results show that the proposed algorithms are highly aggressive when conducting both white-box and black-box attacks. In [134], Huang et al. considered the problem of adversarial attacks on radar ATR using SAR images. First, the I-FGSM was employed to generate adversarial examples for white-box and black-box nontargeted attacks on three classic CNN models: AlexNet, VGGNet, and ResNet. After that, the ILCM algorithm and the decision-based attack (DBA) algorithm were used to create adversarial examples for targeted white-box and black-box attacks, respectively. The characteristics of these three algorithms are briefly introduced in TABLE 7. Simulation results show that using the adversarial examples generated with the I-FGSM, the success rate of VGGNet and ResNet in target recognition dropped from 95% to 7% when the black-box attack was conducted. In addition, under the targeted white-box attack from ILCM, the confidence level of ResNet for the true class label decreased from 99% to 61.4%. Meanwhile, under the targeted black-box attack from the DBA, the confidence levels of AlexNet, VGGNet, and ResNet for the true class label were as low as 22.4%, 15.9%, and 23.2%, respectively. In [135], Lewis et al. tested five white-box adversarial attacks to fool the DL-based radar classifier: FGSM, DeepFool, NewtonFool, BIM, and PGD. In [136], the nontargeted black-box universal adversarial perturbation (UAP) was employed to fool the CNNs, for which the success rate in misguiding the network was higher than 80%.
As was mentioned before, although the mainstream research in the field of adversarial examples aims to “attack”, a considerable number of researchers work on the “defence” side, i.e. to improve the robustness of the DNNs to adversarial examples via adversarial training, adversarial detection, gradient-masking, etc. For example, in [138], the competitive overcomplete output layer (COOL) was designed to replace the commonly used softmax layer for improved robustness of the CNN against the adversarial examples generated by DeepFool.
V. DL FOR RADAR INTERFERENCE SUPPRESSION
Jamming and clutter are two types of interference that limit the performance of modern radar systems. In this section, various DL-based jamming recognition and anti-jamming algorithms are reviewed. The technical trends in using the DNNs to address the challenging problem of marine target detection in sea clutter are also discussed.
A. JAMMING
In [145]-[151], various DNNs were designed for jamming signal classification, with the majority of them being CNNs. The main features of these networks are summarized in TABLE 8, along with the types of jamming signals that have been used for network training and performance testing. Specifically, in [146] and [147], an improved Siamese-CNN (S-CNN) was proposed, which is composed of two 1-D CNNs for feature extraction from the real and the imaginary parts of the data, respectively. This network only needs 500 training samples for each target class, and its performance was compared with various machine learning methods (e.g. the SVM). In [148] and [149], the 1-D jamming signals were transformed to 2-D time-frequency images via time-frequency analysis so that they could be processed with a CNN. In [149], a DNN based on the bilinear EfficientNet-B3 and the attention mechanism was proposed. The model parameters of EfficientNet-B3 obtained in the pretraining process using the ImageNet dataset were used as the initial weights of the proposed network. Note that EfficientNet-B3 belongs to a large family of EfficientNet algorithms (named EfficientNet-B0 to B7) [150]. Although the accuracy of EfficientNet-B3 is 4% lower than that of EfficientNet-B7, the number of model parameters involved in the former is only 1/5 of the latter, which indicates less training time. In [151], a VGG-16 variant was developed for barrage jamming
detection and classification for SAR, where the statistical characteristics of SAR echo signals were exploited.
Beyond the works discussed above, DL-based approaches have also been investigated for target classification in the presence of jamming [152], for choosing the optimum anti-jamming strategy for radar [153], [154], for analyzing the probability of a radar being jammed [155], and for adaptively selecting the best method to jam an enemy radar [157]. The DNN structures proposed in these works and their distinctive features are summarized in TABLE 8. Finally, a detailed discussion regarding the application of artificial intelligence in electronic warfare systems was presented in [158], which is recommended for readers who are interested in the recent trends of DL-based jamming/anti-jamming techniques.
B. CLUTTER
Marine target detection is a much more challenging task for radar than ground moving target detection due to the highly nonhomogeneous and time-varying clutter incurred by the sea. An early attempt at using machine learning methods for target detection in the presence of sea clutter was made in [159], where k-Nearest-Neighbor and SVM were used for marine target/clutter classification using the data collected by the S-band NetRAD system jointly developed by University College London and the University of Cape Town [160].
With DL gaining popularity in recent years, many researchers have resorted to DNNs to further improve the detection performance of marine radars [161]-[164]. Specifically, in [161], Pan et al. used the Faster R-CNN proposed by Ren et al. in [104] for target detection using the sea clutter dataset collected with the X-band ground-based Fynmeet marine radar by the Council for Scientific and Industrial Research (CSIR). In [162], Chen et al. proposed a dual-channel convolutional neural network (DCCNN) made of LeNet and VGG16, for which the amplitude and the time-frequency information were used as two inputs, and the features extracted from the two channels were fused at the FC layer. One distinctive characteristic of [162] is that a softmax classifier with a variable threshold and an SVM classifier with controllable false alarm rates were designed. The performance of the proposed network was tested with two datasets: the Intelligent PIXel processing radar (IPIX) dataset collected by the fully coherent dual-pol X-band radar for floating targets, and the CSIR dataset for maneuvering marine targets. In [163], a fully convolutional network (FCN) with 20 layers was proposed for ship detection in SAR images collected by Gaofen-3 and TerraSAR-X. It is worth mentioning that pixel truncation was implemented as a preprocessing procedure under the assumption that the potential ship pixels are brighter than the clutter, which is not necessarily true. Finally, in [164], a DL-based empirical clutter model named the multi-source input neural network (MSINN) was proposed to predict the sea clutter reflectivity. This model was tested with the sea clutter data collected by a ground-based UHF-band polarized radar and was shown to fit the measurement data better than the existing empirical sea clutter models.
Although most research papers in this field focus on sea clutter, DNNs have also been designed to address other types of clutter. For example, in [165], Cifola et al. considered the problem of clutter/target recognition for drone signals polluted by wind turbine returns, for which a denoising adversarial autoencoder was designed.
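To make the term "denoising autoencoder" concrete, the following is a minimal NumPy sketch of a plain (non-adversarial) denoising autoencoder. It is not the network of [165]: the adversarial component is omitted, synthetic noisy sinusoids stand in for micro-Doppler signatures, and every name and hyperparameter below (layer sizes, noise level, learning rate) is an illustrative assumption. The only point it illustrates is that the network receives a corrupted input and is trained against the clean target.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): noisy sinusoids, not radar measurements.
n_samples, n_points, n_hidden = 512, 64, 8
t = np.linspace(0.0, 1.0, n_points)
freq = rng.uniform(2.0, 6.0, size=(n_samples, 1))
phase = rng.uniform(0.0, 2.0 * np.pi, size=(n_samples, 1))
clean = np.sin(2.0 * np.pi * freq * t + phase)
noisy = clean + 0.5 * rng.standard_normal(clean.shape)

# One-hidden-layer autoencoder: tanh encoder, linear decoder.
W1 = 0.1 * rng.standard_normal((n_points, n_hidden))
b1 = np.zeros(n_hidden)
W2 = 0.1 * rng.standard_normal((n_hidden, n_points))
b2 = np.zeros(n_points)


def forward(x):
    """Return the hidden code and the reconstruction for a batch x."""
    code = np.tanh(x @ W1 + b1)
    return code, code @ W2 + b2


def mse(a, b):
    """Mean squared error over all elements."""
    return float(np.mean((a - b) ** 2))


loss_before = mse(forward(noisy)[1], clean)

learning_rate = 0.2
for _ in range(300):
    code, recon = forward(noisy)
    # Gradient of the MSE between the reconstruction and the CLEAN target.
    g_recon = 2.0 * (recon - clean) / recon.size
    g_W2 = code.T @ g_recon
    g_b2 = g_recon.sum(axis=0)
    g_pre = (g_recon @ W2.T) * (1.0 - code ** 2)  # back-propagate through tanh
    g_W1 = noisy.T @ g_pre
    g_b1 = g_pre.sum(axis=0)
    # Plain full-batch gradient descent step (in-place updates).
    W1 -= learning_rate * g_W1
    b1 -= learning_rate * g_b1
    W2 -= learning_rate * g_W2
    b2 -= learning_rate * g_b2

loss_after = mse(forward(noisy)[1], clean)
print(f"MSE to clean target: {loss_before:.3f} -> {loss_after:.3f}")
```

The sketch uses full-batch gradient descent with hand-written gradients only to stay dependency-free; a practical model would more likely be built in a DL framework with a deeper convolutional encoder and decoder.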
The autoencoder in [165] was tested with the micro-Doppler signatures of drones and wind turbines measured with an X-band CW radar. In [166], Lepetit et al. used U-Net, a CNN variant that was originally proposed for medical image segmentation, to remove clutter from precipitation echoes collected by weather radar. A total of 150,000 images collected by the Trappes polarimetric ground weather radar of Météo-France were used for network training.
The DNN structures presented in [161]-[166] and their main features are summarized in TABLE 9. Note that, in addition to the works mentioned above, deep convolutional autoencoders were proposed for target detection in sea clutter in [167], [168], and an LSTM-based network was designed for sea clutter prediction in [169]. Since these networks were tested only with simulated data, they are expected to exhibit noticeable performance degradation in real-life detection scenarios.
VI. CONCLUSION
In this work, we consider the application of DL algorithms in radar signal processing. With DL gaining popularity rapidly in recent years, DL for radar signal recognition, DL for ATR based on HRRP/Doppler signatures/SAR images, and DL for radar jamming recognition and clutter suppression have been explored thoroughly by many researchers. Although classification accuracies of 98%-100% have been reported in many research works on radar ATR with DL networks using the MSTAR dataset, it should be emphasized that there is a long way to go before the DL approaches become qualified substitutes for the classic radar ATR methods. Firstly, DL networks demand large amounts of training data. Unlike the typical problem of image classification, for which large amounts of training data are available online, representative real-world HRRPs and SAR images that are labelled with accurately verified targets are simply not readily available to everyone on demand. In addition, a network trained under a specific environment does not work the same way when the environment changes. Secondly, although some DL networks reach high accuracies with limited training data, most of them were tested with only the MSTAR dataset, which was also used to demonstrate the high-accuracy performance (above 97%) of traditional machine-learning-based ATR methods 20 years ago. Moreover, the ever-evolving adversarial attacks also pose a great security risk to the DNNs. This work provides a full picture of numerous potential research opportunities and grave challenges in applying the DL-based approaches to address the existing problems in radar signal processing, which serves as a good reference work for researchers interested in this field.
Acknowledgments
The author would like to thank the anonymous reviewers for their insightful comments and suggestions, which definitely made the work more technically sound.
REFERENCES
[1] E. Mason, B. Yonel, and B. Yazici, “Deep learning for radar,” in 2017 IEEE Radar Conference (RadarConf), 2017, pp. 1703–1708.
[2] F. Gini, “Grand challenges in radar signal processing,” Frontiers in Signal Processing, vol. 1, pp. 1–6, March 2021.
[3] P. Lang, X. Fu, M. Martorella, et al., “A comprehensive survey of machine learning applied to radar signal processing,” ArXiv Preprint ArXiv:2009.13702, 2020.
[4] X. X. Zhu et al., “Deep learning in remote sensing: a comprehensive review and list of resources,” IEEE Geoscience and Remote Sensing Magazine, vol. 5, no. 4, 2017.
[5] L. Zhang, L. Zhang, and B. Du, “Deep learning for remote sensing data: a technical tutorial on the state of the art,” IEEE Geoscience and Remote Sensing Magazine, vol. 4, no. 2, 2016.
[6] S. Haykin, “Cognitive radar: a way of the future,” in IEEE Signal Processing Magazine, vol. 23, no. 1, pp. 30–40, Jan. 2006.
[7] G. E. Smith and T. J. Reininger, “Reinforcement learning for waveform design,” in 2021 IEEE Radar Conference (RadarConf21), 2021, pp. 1–6.
[8] C. E. Thornton, R. M. Buehrer, A. F. Martone, and K. D. Sherbondy, “Experimental analysis of reinforcement
learning techniques for spectrum sharing radar,” in 2020 IEEE International Radar Conference (RADAR), 2020, pp. 67–72.
[9] C. E. Thornton, M. A. Kozy, R. M. Buehrer, A. F. Martone, and K. D. Sherbondy, “Deep reinforcement learning control for radar detection and tracking in congested spectral environments,” IEEE Transactions on Cognitive Communications and Networking, vol. 6, no. 4, 2020.
[10] U.S. Army DEVCOM Army Research Laboratory Public Affairs, “Army fast-tracks adaptable radars for congested environments,” Defense Insider, 2020.
[11] M. A. Kozy, J. Yu, R. Buehrer, A. Martone, and K. Sherbondy, “Applying deep-Q networks to target tracking to improve cognitive radar,” 2019 IEEE Radar Conference (RadarConf), pp. 1–6, 2019.
[12] H. Deng, Z. Geng and B. Himed, “MIMO radar waveform design for transmit beamforming and orthogonality,” in IEEE Transactions on Aerospace and Electronic Systems, vol. 52, no. 3, pp. 1421–1433, June 2016.
[13] J. Hu, Z. Wei, Y. Li, H. Li, and J. Wu, “Designing unimodular waveform(s) for MIMO radar by deep learning method,” IEEE Transactions on Aerospace and Electronic Systems, vol. 57, no. 2, 2021.
[14] W. Zhang, J. Hu, Z. Wei, H. Ma, X. Yu, and H. Li, “Constant modulus waveform design for MIMO radar transmit beampattern with residual network,” Signal Processing, vol. 177, pp. 107735, 2020.
[15] K. Zhong et al., “MIMO radar waveform design via deep learning,” in 2021 IEEE Radar Conference (RadarConf21), 2021, pp. 1–5.
[16] L. Wang, S. Fortunati, M. S. Greco, and F. Gini, “Reinforcement learning-based waveform optimization for MIMO multi-target detection,” in 2018 52nd Asilomar Conference on Signals, Systems, and Computers, 2018, pp. 1329–1333.
[17] J. Kurdzo, J. Y. N. Cho, B. Cheong, and R. Palmer, “A neural network approach for waveform generation and selection with multi-mission radar,” 2019 IEEE Radar Conference (RadarConf), pp. 1–6, 2019.
[18] A. M. Elbir, K. V. Mishra, and Y. C. Eldar, “Cognitive radar antenna selection via deep learning,” IET Radar, Sonar & Navigation, vol. 13, no. 6, pp. 871–880, Jun. 2019.
[19] Z. Geng, H. Deng and B. Himed, “Interference mitigation for airborne MIMO radar,” International Conference on Radar Systems, 2017, pp. 1–6.
[20] T. Cheng, B. Wang, Z. Wang, R. Dong, and B. Cai, “Lightweight CNNs-based interleaved sparse array design of phased-MIMO radar,” IEEE Sensors Journal, vol. 21, no. 12, 2021.
[21] C. Wang, J. Wang and X. Zhang, “Automatic radar waveform recognition based on time-frequency analysis and convolutional neural network,” 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, pp. 2437–2441.
[22] R. Zhou, F. Liu and C. W. Gravelle, “Deep learning for modulation recognition: A survey with a demonstration,” in IEEE Access, vol. 8, pp. 67366–67376, 2020.
[23] P. Itkin and N. Levanon, “Ambiguity function based radar waveform classification and unsupervised adaptation using deep CNN models,” 2019 IEEE International Conference on Microwaves, Antennas, Communications and Electronic Systems (COMCAS), 2019, pp. 1–6.
[24] A. Dai, H. Zhang and H. Sun, “Automatic modulation classification using stacked sparse auto-encoders,” 2016 IEEE 13th International Conference on Signal Processing (ICSP), pp. 248–252, November 2016.
[25] Z. Qu, W. Wang, C. Hou and C. Hou, “Radar signal intra-pulse modulation recognition based on convolutional denoising autoencoder and deep convolutional neural network,” in IEEE Access, vol. 7, pp. 112339–112347, 2019.
[26] K. Ren, H. Ye, G. Gu and Q. Chen, “Pulses classification based on sparse auto-encoders neural networks,” in IEEE Access, vol. 7, pp. 92651–92660, 2019.
[27] X. Li, Z. Liu, Z. Huang and W. Liu, “Radar emitter classification with attention-based multi-RNNs,” in IEEE Communications Letters, vol. 24, no. 9, pp. 2000–2004, Sept. 2020.
[28] X. Li, Z. Liu, and Z. Huang, “Attention-based radar PRI modulation recognition with recurrent neural networks,” IEEE Access, vol. 8, pp. 57426–57436, 2020.
[29] Z.-M. Liu and P. S. Yu, “Classification, denoising, and deinterleaving of pulse streams with recurrent neural networks,” IEEE Transactions on Aerospace and Electronic Systems, vol. 55, no. 4, 2019.
[30] S.-H. Kong, M. Kim, L. M. Hoang, and E. Kim, “Automatic LPI radar waveform recognition using CNN,” IEEE Access, vol. 6, pp. 4207–4219, 2018.
[31] J. Wan, X. Yu, Q. Guo, et al., “LPI radar waveform recognition based on CNN and TPOT,” Symmetry, vol. 11, no. 5, 2019.
[32] Z. Ma, Z. Huang, A. Lin, and G. Huang, “LPI radar waveform recognition based on features from multiple images,” Sensors, vol. 20, no. 2, pp. 526, 2020.
[33] B. Lay and A. Charlish, “Classifying LPI signals with transfer learning on CNN architectures,” 2020 Sensor Signal Processing for Defence Conference (SSPD), 2020.
[34] Q. Guo, X. Yu, and G. Ruan, “LPI radar waveform recognition based on deep convolutional neural network transfer learning,” Symmetry, vol. 11, no. 4, 2019.
[35] S. Zhang, A. Ahmed and Y. D. Zhang, “Sparsity-based time-frequency analysis for automatic radar waveform recognition,” 2020 IEEE International Radar Conference (RADAR), 2020.
[36] X. Ni, H. Wang, F. Meng, J. Hu, and C. Tong, “LPI radar waveform recognition based on multi-resolution deep feature fusion,” IEEE Access, vol. 9, pp. 26138–26146, 2021.
[37] Z. Pan, S. Wang, M. Zhu, and Y. Li, “Automatic waveform recognition of overlapping LPI radar signals based on multi-instance multi-label learning,” IEEE Signal Processing Letters, vol. 27, pp. 1275–1279, 2020.
[38] G. Ghadimi, Y. Norouzi, R. Bayderkhani, M. Nayebi, and S. M. Karbasi, “Deep learning-based approach for low probability of intercept radar signal detection and classification,” Journal of Communications Technology and Electronics, vol. 65, pp. 1179–1191, 2020.
[39] S. Wei, Q. Qu, H. Su, J. Shi, X. Zeng, and X. Hao, “Intra-pulse modulation radar signal recognition based on Squeeze-and-Excitation networks,” Signal, Image and Video Processing, vol. 14, no. 6, 2020.
[40] A. Orduyilmaz, E. Yar, M. B. Kocamis, M. Serin, and M. Efe, “Machine learning-based radar waveform classification for cognitive EW,” Signal, Image and Video Processing, pp. 1–10, 2021.
[41] A. Yildirim and S. Kiranyaz, “1D convolutional neural networks versus automatic classifiers for known LPI radar signals under white gaussian noise,” IEEE Access, vol. 8, pp. 180534–180543, 2020.
[42] Z. Geng, “Evolution of netted radar systems,” in IEEE Access, vol. 8, pp. 124961–124977, 2020.
[43] B. Yonel, E. Mason, and B. Yazici, “Deep learning for waveform estimation in passive synthetic aperture radar,” in 2018 IEEE Radar Conference (RadarConf18), 2018, pp. 1395–1400.
[44] B. Yonel, E. Mason, B. Yazici, “Deep learning for waveform estimation and imaging in passive radar,” IET Radar Sonar & Navigation, vol. 13, no. 6, pp. 915–926, 2019.
[45] Q. Wang, P. Du, J. Yang, G. Wang, J. Lei, and C. Hou, “Transferred deep learning based waveform recognition for cognitive passive radar,” Signal Processing, vol. 155, pp. 259–267, 2019.
[46] M. Stone, “Why adversarial examples are such a dangerous threat to deep learning,” https://securityintelligence.com/articles/why-adversarial-examples-are-such-a-dangerous-threat-to-deep-learning/, Mar. 2020. Accessed on 23 June 2021.
[47] M. Sadeghi and E. G. Larsson, “Adversarial attacks on deep-learning based radio signal classification,” in IEEE Wireless Communications Letters, vol. 8, no. 1, pp. 213–216, Feb. 2019.
[48] S. Kokalj-Filipovic, R. Miller, N. Chang, and C. L. Lau, “Mitigation of adversarial examples in RF deep classifiers utilizing autoencoder pre-training,” Military Communications and Information Systems Conference, 2019.
[49] S. Kokalj-Filipovic, R. Miller, and G. Vanhoy, “Adversarial examples in RF deep learning: detection and physical robustness,” in 2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP), 2019, pp. 1–5.
[50] B. Chen, H. Liu, J. Chai, Z. Bao, “Large margin feature weighting method via linear programming,” IEEE Trans. Knowl. Data Eng., vol. 21, no. 10, 2008, pp. 1475–1488.
[51] B. Feng, L. Du, H.-w. Liu, F. Li, “Radar HRRP target recognition based on K-SVD algorithm,” in Proceedings of the 2011 IEEE CIE International Conference on Radar, Chengdu, China, 2011, pp. 642–645.
[52] R. Alake, “What AlexNet brought to the world of deep learning,” (online), https://towardsdatascience.com/what-alexnet-brought-to-the-world-of-deep-learning46c7974b46fc, July 2020. Accessed on 18 Sept. 2021.
[53] Y. Ouali, C. Hudelot, and M. Tami, “An Overview of Deep Semi-Supervised Learning,” ArXiv, vol. abs/2006.05278, 2020.
[54] B. Feng, B. Chen, and H. Liu, “Radar HRRP target recognition with deep networks,” Pattern Recognition, vol. 61, pp. 379–393, 2017.
[55] M. Pan, J. Jiang, Q. Kong, J. Shi, Q. Sheng, and T. Zhou, “Radar HRRP target recognition based on t-SNE segmentation and discriminant deep belief network,” IEEE Geoscience and Remote Sensing Letters, vol. 14, no. 9, 2017.
[56] C. Zhao, X. He, J. Liang, T. Wang, C. Huang, “Radar HRRP target recognition via semi-supervised multi-task deep network,” IEEE Access, vol. 7, pp. 114788–114794, 2019.
[57] B. Xu, B. Chen, J. Wan, H. Liu, and L. Jin, “Target-aware recurrent attentional network for radar HRRP target recognition,” Signal Processing, vol. 155, pp. 268–280, 2019.
[58] B. Xu, B. Chen, J. Liu, C. Du, “Gaussian mixture model-tensor recurrent neural network for HRRP target recognition,” 2019 International Radar Conference (RADAR), pp. 1–6, 2019.
[59] K. Liao, J. Si, F. Zhu, and X. He, “Radar HRRP target recognition based on concatenated deep neural networks,” IEEE Access, vol. 6, pp. 29211–29218, 2018.
[60] C. Guo, Y. He, H. Wang, T. Jian, S. Sun, “Radar HRRP target recognition based on deep one-dimensional residual-inception network,” IEEE Access, vol. 7, pp. 9191–9204, 2019.
[61] J. Song, Y. Wang, W. Chen, Y. Li, J. Wang, “Radar HRRP recognition based on CNN,” The Journal of Engineering, vol. 2019, no. 21, pp. 7766–7769, 2019.
[62] Y. Song, Y. Li, Y. Wang, and C. Hu, “Data augmentation for imbalanced HRRP recognition using deep convolutional generative adversarial network,” IEEE Access, vol. 8, pp. 201686–201695, 2020.
[63] J. Lundén and V. Koivunen, “Deep learning for HRRP-based target recognition in multistatic radar systems,” 2016 IEEE Radar Conference (RadarConf), 2016, pp. 1–6.
[90] R. Xue, X. Bai, and F. Zhou, “Spatial–temporal ensemble convolution for sequence SAR target classification,” IEEE Transactions on Geoscience and Remote Sensing, vol. 59, no. 2, 2021.
[91] F. Zhang, C. Hu, Q. Yin, W. Li, H.-C. Li, and W. Hong, “Multi-aspect-aware bidirectional LSTM networks for Synthetic Aperture Radar target recognition,” IEEE Access, vol. 5, pp. 26880–26891, 2017.
[92] X. Bai, R. Xue, L. Wang, and F. Zhou, “Sequence SAR image classification based on bidirectional convolution-recurrent network,” IEEE Transactions on Geoscience and Remote Sensing, vol. 57, no. 11, 2019.
[93] J. Pearl, “Theoretical impediments to machine learning with seven sparks from the causal revolution,” arXiv:1801.04016, 2018.
[94] B. Pei and Z. Bao, “Multi-aspect radar target recognition method based on scattering centers and HMMs classifiers,” IEEE Trans. Aerosp. Electron. Syst., vol. 41, no. 3, pp. 1067–1074, Jul. 2005.
[95] S. Deng, L. Du, C. Li, J. Ding, and H. Liu, “SAR automatic target recognition based on Euclidean distance restricted autoencoder,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 10, no. 7, 2017.
[96] J. Geng, J. Fan, H. Wang, X. Ma, B. Li, and F. Chen, “High-resolution SAR image classification via deep convolutional autoencoders,” IEEE Geoscience and Remote Sensing Letters, vol. 12, pp. 2351–2355, 2015.
[97] J. Geng, H. Wang, J. Fan, and X. Ma, “Deep supervised and contractive neural network for SAR image classification,” IEEE Transactions on Geoscience and Remote Sensing, vol. 55, pp. 2442–2459, 2017.
[98] G. Dong, G. Liao, H. Liu, and G. Kuang, “A review of the autoencoder and its variants: a comparative perspective from target recognition in synthetic-aperture radar images,” IEEE Geoscience and Remote Sensing Magazine, vol. 6, no. 3, 2018.
[99] J. Guo, L. Wang, D. Zhu, C. Hu, “Compact convolutional autoencoder for SAR target recognition,” IET Radar Sonar & Navigation, vol. 14, no. 7, pp. 967–972, 2020.
[100] C. Schwegmann, W. Kleynhans, B. P. Salmon, L. Mdakane, and R. Meyer, “Synthetic aperture radar ship detection using capsule networks,” IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium, pp. 725–728, 2018.
[101] L. De Laurentiis, A. Pomente, F. Del Frate, and G. Schiavon, “Capsule and convolutional neural network-based SAR ship classification in Sentinel-1 data,” Active and Passive Microwave Remote Sensing for Environmental Monitoring III, Oct. 2019.
[102] R. Shah, A. Soni, V. Mall, T. Gadhiya, and A. K. Roy, “Automatic target recognition from SAR images using capsule networks,” in Pattern Recognition and Machine Intelligence, Cham, 2019, pp. 377–386.
[103] J. Ding, L. Wen, C. Zhong, and O. Loffeld, “Video SAR moving target indication using deep neural network,” IEEE Transactions on Geoscience and Remote Sensing, vol. 58, pp. 7194–7204, 2020.
[104] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: towards real-time object detection with region proposal networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, 2017.
[105] K. El-Darymli, E. W. Gill, P. Mcguire, D. Power, and C. Moloney, “Automatic target recognition in synthetic aperture radar imagery: a state-of-the-art review,” IEEE Access, vol. 4, pp. 6014–6058, 2016.
[106] C. Clemente, L. Pallotta, D. Gaglione, A. De Maio, and J. J. Soraghan, “Automatic target recognition of military vehicles with Krawtchouk moments,” IEEE Transactions on Aerospace and Electronic Systems, vol. 53, no. 1, 2017.
[107] B. Hou, L. Wang, Q. Wu, Q. Han, L. Jiao, “Complex Gaussian–Bayesian online dictionary learning for SAR target recognition with limited labeled samples,” IEEE Access, vol. 7, pp. 120626–120637, 2019.
[108] C. Wang, H. Gu, and W. Su, “SAR image classification using contrastive learning and pseudo-labels with limited data,” IEEE Geoscience and Remote Sensing Letters, pp. 1–5, 2021.
[109] F. Zhang, Y. Wang, J. Ni, Y. Zhou, and W. Hu, “SAR target small sample recognition based on CNN cascaded features and AdaBoost rotation forest,” IEEE Geoscience and Remote Sensing Letters, vol. 17, pp. 1008–1012, 2020.
[110] Z. Huang, Z. Pan, and B. Lei, “Transfer learning with deep convolutional neural network for SAR target classification with limited labeled data,” Remote Sensing, vol. 9, pp. 907, 2017.
[111] C. Zhong, X. Mu, X. He, J. Wang, and M. Zhu, “SAR target image classification based on transfer learning and model compression,” IEEE Geoscience and Remote Sensing Letters, vol. 16, pp. 412–416, 2019.
[112] J. Ding, B. Chen, H. Liu, and M. Huang, “Convolutional neural network with data augmentation for SAR target recognition,” IEEE Geoscience and Remote Sensing Letters, vol. 13, pp. 364–368, 2016.
[113] Q. Yu, H. Hu, X. Geng, Y. Jiang, J. An, “High-performance SAR automatic target recognition under limited data condition based on a deep feature fusion network,” IEEE Access, vol. 7, pp. 165646–165658, 2019.
[114] Y. Kwak, W.-J. Song, and S.-E. Kim, “Speckle-noise-invariant convolutional neural network for SAR Target recognition,” IEEE Geoscience and Remote Sensing Letters, vol. 16, pp. 549–553, 2019.
[115] B. Lewis, T. Scarnati, E. Sudkamp, J. Nehrbass, S. Rosencrantz, and E. Zelnio, “A SAR dataset for ATR development: the Synthetic and Measured Paired Labeled Experiment (SAMPLE),” in Algorithms for Synthetic Aperture Radar Imagery XXVI, 2019, vol. 10987, pp. 39–54.
[116] T. Scarnati and B. Lewis, “A deep learning approach to the Synthetic and Measured Paired and Labeled Experiment (SAMPLE) challenge problem,” in Algorithms for Synthetic Aperture Radar Imagery XXVI, 2019, vol. 10987, pp. 29–38.
[117] M. Cha, A. Majumdar, H. T. Kung, and J. Barber, “Improving SAR automatic target recognition using simulated images under deep residual refinements,” IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2606–2610, 2018.
[118] H. Zhu, Rocky Leung, M. Hong, “Shadow compensation for synthetic aperture radar target classification by dual parallel generative adversarial network,” IEEE Sensors Letters, vol. 4, no. 8, pp. 1–4, 2020.
[119] Z. Huang, Z. Pan, and B. Lei, “What, where, and how to transfer in SAR target recognition based on deep CNNs,” IEEE Transactions on Geoscience and Remote Sensing, vol. 58, no. 4, 2020.
[120] C. He, D. Xiong, Q. Zhang, and M. Liao, “Parallel connected generative adversarial network with quadratic operation for SAR image generation and application for classification,” Sensors (Basel, Switzerland), vol. 19, 2019.
[121] Q. An, Z. Pan, H. You, and Y. Hu, “Transitive transfer learning-based anchor free rotatable detector for SAR target detection with few samples,” IEEE Access, vol. 9, pp. 24011–24025, 2021.
[122] C. Lu and W. Li, “Ship classification in high-resolution SAR images via transfer learning with small training dataset,” Sensors, vol. 19, 2019.
[123] Y. Guo, L. Du, D. Wei, and C. Li, “Robust SAR automatic target recognition via adversarial learning,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 14, pp. 716–729, 2021.
[124] J. Rauber, W. Brendel, and M. Bethge, “Foolbox: A Python toolbox to benchmark the robustness of machine learning models,” Reliable Machine Learning in the Wild Workshop, 34th International Conference on Machine Learning, Sydney, Australia, 2017.
[125] S. Haldar, “Gradient-based adversarial attacks: an introduction,” https://medium.com/swlh/gradient-based-adversarial-attacks-an-introduction-526238660dc9, published on 9 Apr. 2020, accessed on 1 Aug. 2021.
[126] I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples,” 3rd International Conference on Learning Representations (ICLR), 2015.
[127] A. Kurakin, I. Goodfellow, and S. Bengio, “Adversarial examples in the physical world,” Workshop track of International Conference on Learning Representations (ICLR), 2017.
[128] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, “Towards deep learning models resistant to adversarial attacks,” International Conference on Learning Representations (ICLR), 2018.
[129] S.-M. Moosavi-Dezfooli, A. Fawzi, and P. Frossard, “DeepFool: A Simple and accurate method to fool deep neural networks,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 2574–2582.
[130] N. Narodytska and S. P. Kasiviswanathan, “Simple black-box adversarial perturbations for deep networks,” arXiv:1612.06299, 2016.
[131] J. Su, D. V. Vargas, and K. Sakurai, “One pixel attack for fooling deep neural networks,” IEEE Transactions on Evolutionary Computation, vol. 23, pp. 828–841, 2019.
[132] W. Brendel, J. Rauber, and M. Bethge, “Decision-based adversarial attacks: reliable attacks against black-box machine learning models,” International Conference on Learning Representations (ICLR), 2018.
[133] T. Huang, Y. Chen, B. Yao, B. Yang, X. Wang, and Y. Li, “Adversarial attacks on deep-learning-based radar range profile target recognition,” Information Sciences, vol. 531, pp. 159–176, 2020.
[134] T. Huang, Q. Zhang, J. Liu, R. Hou, X. Wang, and Y. Li, “Adversarial attacks on deep-learning-based SAR image target recognition,” Journal of Network and Computer Applications, vol. 162, p. 102632, 2020.
[135] B. Lewis, K. Cai, and C. Bullard, “Adversarial training on SAR images,” in Automatic Target Recognition, 2020, vol. 11394, pp. 83–90.
[136] L. Wang, X. Wang, S. Ma, and Y. Zhang, “Universal adversarial perturbation of SAR images for deep learning based target classification,” in 2021 IEEE 4th International Conference on Electronics Technology (ICET), 2021, pp. 1272–1276.
[137] S.-M. Moosavi-Dezfooli, A. Fawzi, O. Fawzi, and P. Frossard, “Universal adversarial perturbations,” 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 86–94, 2017.
[138] S. Wagner, C. Panati, and S. Brüggenwirth, “Fool the COOL - On the robustness of deep learning SAR ATR systems,” in 2021 IEEE Radar Conference (RadarConf21), 2021, pp. 1–6.
[139] J. Deng, W. Yi, K. Zeng, Q. Peng, and X. Yang, “Supervised learning based online filters for targets tracking using radar measurements,” 2020 IEEE Radar Conference (RadarConf20), pp. 1–6, 2020.
[140] C. Gao, J. Yan, S. Zhou, P. Varshney, and H. Liu, “Long short-term memory-based deep recurrent neural networks for target tracking,” Inf. Sci., vol. 502, pp. 279–296, 2019.
[141] C. Gao, J. Yan, S. Zhou, B. Chen, and H. Liu, “Long short-term memory-based recurrent neural networks for nonlinear target tracking,” Signal Process., vol. 164, pp. 67–73, 2019.
[142] J. Liu, Z. Wang, and M. Xu, “DeepMTT: A deep learning maneuvering target-tracking algorithm based on bidirectional LSTM network,” Inf. Fusion, vol. 53, pp. 289–304, 2020.
[143] W. Yu, H. Yu, J. Du, M. Zhang, and J. Liu, “DeepGTT: A general trajectory tracking deep learning algorithm based on dynamic law learning,” IET Radar, Sonar & Navigation, 2021.
[144] Y. Shi, B. Jiu, J. Yan, H. Liu, and K. Li, “Data-driven simultaneous multibeam power allocation: when multiple targets tracking meets deep reinforcement learning,” IEEE Systems Journal, vol. 15, pp. 1264–1274, 2021.
[145] A. Mendoza, A. Soto, and B. C. Flores, “Classification of radar jammer FM signals using a neural network,” in Radar Sensor Technology XXI, 2017, vol. 10188, pp. 526–536.
[146] G. Shao, Y. Chen, and Y. Wei, “Convolutional neural network-based radar jamming signal classification with sufficient and limited samples,” IEEE Access, vol. 8, pp. 80588–80598, 2020.
[147] G. Shao, Y. Chen, and Y. Wei, “Deep fusion for radar jamming signal classification based on CNN,” IEEE Access, vol. 8, pp. 117236–117244, 2020.
[148] Q. Liu and W. Zhang, “Deep learning and recognition of radar jamming based on CNN,” in 2019 12th International Symposium on Computational Intelligence and Design, 2019, vol. 1, pp. 208–212.
[149] Y. Xiao, J. Zhou, Y. Yu, and L. Guo, “Active jamming recognition based on bilinear EfficientNet and attention mechanism,” IET Radar, Sonar & Navigation, 2021.
[150] M. Tan and Q. V. Le, “EfficientNet: rethinking model scaling for convolutional neural networks,” arXiv:1905.11946, 2019.
[151] J. Yu, J. Li, B. Sun, and Y. Jiang, “Barrage jamming detection and classification based on convolutional neural network for Synthetic Aperture Radar,” in IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium, 2018, pp. 4583–4586.
[152] W. Wang, Y. Wei, X. Zhen, H. Yu, and R. Wang, “Classifying aircraft based on sparse recovery and deep-learning,” The Journal of Engineering, vol. 2019, pp. 7464–7468, 2019.
[153] K. Li, B. Jiu, H. Liu, and S. Liang, “Reinforcement learning based anti-jamming frequency hopping strategies design for cognitive radar,” in 2018 IEEE International Conference on Signal Processing, Communications and Computing, 2018, pp. 1–5.
[154] K. Li, B. Jiu, P. Wang, H. Liu, and Y. Shi, “Radar active antagonism through deep reinforcement learning: A way to address the challenge of mainlobe jamming,” Signal Process., vol. 186, p. 108130, 2021.
[155] S. Ak and S. Bruggenwirth, “Avoiding jammers: a reinforcement learning approach,” in 2020 IEEE International Radar Conference, 2020, pp. 321–326.
[156] S.-J. Hong, Y.-G. Yi, J. Jo, and B. Seo, “Classification of radar signals with convolutional neural networks,” 2018 Tenth International Conference on Ubiquitous and Future Networks (ICUFN), pp. 894–896, 2018.
[157] R. Di Pietro, G.-H. Lee, J. Jo, and C. H. Park, “Jamming prediction for radar signals using machine learning methods,” Security and Communication Networks, vol. 2020, pp. 1–9, 2020.
[158] P. Sharma, K. K. Sarma, and N. E. Mastorakis, “Artificial intelligence aided electronic warfare systems-recent trends and evolving applications,” IEEE Access, vol. 8, pp. 224761–224780, 2020.
[159] D. Callaghan, J. Burger, and A. K. Mishra, “A machine learning approach to radar sea clutter suppression,” in 2017 IEEE Radar Conference (RadarConf), 2017, pp. 1222–1227.
[160] M. Inggs, H. Griffiths, F. Fioranelli, M. Ritchie, and K. Woodbridge, “Multistatic radar: System requirements and experimental validation,” 2014 International Radar Conference, pp. 1–6, 2014.
[161] M. Pan, J. Chen, S. Wang, and Z. Dong, “A novel approach for marine small target detection based on deep learning,” in 2019 IEEE 4th International Conference on Signal and Image Processing (ICSIP), 2019, pp. 395–399.
[162] X. Chen, N. Su, Y. Huang, and J. Guan, “False-alarm-controllable radar detection for marine target based on multi features fusion via CNNs,” IEEE Sensors Journal, vol. 21, pp. 9099–9111, 2021.
[163] Q. An, Z. Pan, and H. You, “Ship Detection in Gaofen-3 SAR Images Based on Sea Clutter Distribution Analysis and Deep Convolutional Neural Network,” Sensors (Basel, Switzerland), vol. 18, 2018.
[164] L. Ma et al., “Research on sea clutter reflectivity using deep learning model in Industry 4.0,” IEEE Trans. on Industrial Informatics, vol. 16, no. 9, 2020.
[165] L. Cifola and R. Harmanny, “Target/clutter disentanglement using deep adversarial training on micro-Doppler signatures,” 2019 16th European Radar Conference (EuRAD), pp. 201–204, 2019.
[166] P. Lepetit et al., “Using deep learning for restoration of precipitation echoes in radar data,” IEEE Transactions on Geoscience and Remote Sensing, pp. 1–14, 2021.
[167] Q. Zhang, Y. Shao, S. Guo, L. Sun, and W. Chen, “A novel method for sea clutter suppression and target detection via deep convolutional autoencoder,” International Journal of Signal Processing, 2017.
[168] S. Guo, Q. Zhang, Y. Shao, and W. Chen, “Sea clutter and target detection with deep neural networks,” DEStech Transactions on Computer Science and Engineering, 2017.
[169] J. Zhao, J. Wu, X. Guo, J. Han, K. Yang, and H. Wang, “Prediction of radar sea clutter based on LSTM,” Journal of Ambient Intelligence and Humanized Computing, 2019.