
Robust Low-Cost Drone Detection and Classification in Low SNR Environments
Stefan Glüge¹, Matthias Nyfeler¹, Ahmad Aghaebrahimian¹, Nicola Ramagnano² and Christof Schüpbach³, Fellow, IEEE

¹ Institute of Computational Life Sciences, Zurich University of Applied Sciences, 8820 Wädenswil, Switzerland
² Institute for Communication Systems, Eastern Switzerland University of Applied Sciences, 8640 Rapperswil-Jona, Switzerland
³ Armasuisse Science + Technology, 3602 Thun, Switzerland

arXiv:2406.18624v1 [eess.SP] 26 Jun 2024

Corresponding author: Stefan Glüge (email: [email protected]).

ABSTRACT The proliferation of drones, or unmanned aerial vehicles (UAVs), has raised significant
safety concerns due to their potential misuse in activities such as espionage, smuggling, and infrastructure
disruption. This paper addresses the critical need for effective drone detection and classification systems that
operate independently of UAV cooperation. We evaluate various convolutional neural networks (CNNs)
for their ability to detect and classify drones using spectrogram data derived from consecutive Fourier
transforms of signal components. The focus is on model robustness in low signal-to-noise ratio (SNR)
environments, which is critical for real-world applications. A comprehensive dataset is provided to support
future model development. In addition, we demonstrate a low-cost drone detection system using a standard
computer, software-defined radio (SDR) and antenna, validated through real-world field testing. On our
development dataset, all models consistently achieved an average balanced classification accuracy of
≥ 85% at SNR > −12 dB. In the field test, these models achieved an average balanced accuracy of
> 80%, depending on transmitter distance and antenna direction. Our contributions include: a publicly
available dataset for model development, a comparative analysis of CNNs for drone detection under low
SNR conditions, and the deployment and field evaluation of a practical, low-cost detection system.

INDEX TERMS Deep neural networks, Robustness, Signal detection, Unmanned aerial vehicles

I. INTRODUCTION

Drones, or civil UAVs, have evolved from hobby toys to commercial systems with many applications. In particular, mini/amateur drones have become ubiquitous. With the proliferation of these low-cost, small and easy-to-fly drones, safety issues have become more pressing (e.g. spying, transfer of illegal or dangerous goods, disruption of infrastructure, assault). Although regulations and technical solutions (such as transponder systems) are in place to safely integrate UAVs into the airspace, detection and classification systems that do not rely on the cooperation of the UAV are necessary. Various technologies such as audio, video, radar, or radio frequency (RF) scanners have been proposed for this task [1].

In this paper, we evaluate different CNNs for drone detection and classification using the spectrogram data computed with consecutive Fourier transforms for the real and imaginary parts of the signal. To facilitate future model development, we make the dataset publicly available. In terms of performance, we focus on the robustness of the models to low SNRs, as this is the most relevant aspect for a real-world application of the system. Furthermore, we evaluate a low-cost drone detection system consisting of a standard computer, SDR, and antenna in a real-world field test.

Our contributions can therefore be summarised as follows:

• We provide the dataset used to develop the model. Together with the code to load and transform the data, it can be easily used for future model development.
• We compare different CNNs using 2D spectrogram data for detection and classification of drones based on their RF signals under challenging conditions, i.e. low SNRs down to −20 dB.
• We visualise the model embeddings to understand how the model clusters and separates different classes, to identify potential overlaps or ambiguities, and to examine the hierarchical relationships within the learned features.
• We implement the models in a low-cost detection system and evaluate them in a field test.

A. RELATED WORK

A literature review on drone detection methods based on deep learning (DL) is given in [1] and [2]. Both works reflect the state of the art in 2024. Different DL algorithms are discussed with respect to the techniques used to detect drones based on visual, radar, acoustic, and RF signals. Given these general overviews, we briefly summarise recent work based on RF data, with a particular focus on the data side of the problem to motivate our work.

With the advent of DL-based methods, the data used to train models became the cornerstone of any detection system. Table 1 provides an overview of openly available datasets of RF drone signals. The DroneRF dataset [3] is one of the first openly available datasets. It contains RF time series data from three drones in four flight modes (i.e. on, hovering, flying, video recording) recorded by two universal software radio peripheral (USRP) SDR transceivers [4]. The dataset is widely used and enabled follow-up work with different approaches to classification systems, i.e. DL-based [5], [6], focused on pre-processing and combining signals from two frequency bands [7], genetic algorithm-based heterogeneous integrated k-nearest neighbour [8], and hierarchical reinforcement learning-based [9]. In general, the classification accuracies reported in the papers on the DroneRF dataset are close to 100%. Specifically, [4], [5], and [6] report an average accuracy of 99.7%, 100%, and 99.98%, respectively, to detect the presence of a drone. There is therefore an obvious need for a harder, more realistic dataset.

Consequently, [10] investigate the detection and classification of drones in the presence of Bluetooth and Wi-Fi signals. Their system used a multi-stage detector to distinguish drone signals from the background noise and interfering signals. Once a signal was identified as a drone signal, it was classified using machine learning (ML) techniques. The detection performance of the proposed system was evaluated for different SNRs. The corresponding recordings (17 drone controls from eight different manufacturers) are openly available [11]. Unfortunately, the Bluetooth/Wi-Fi noise is not part of the dataset. Ozturk et al. [12] used the dataset to further investigate the classification of RF fingerprints at low SNRs by adding white Gaussian noise to the raw data. Using a CNN, they achieved classification accuracies ranging from 92% to 100% for SNR ∈ [−10, 30] dB.

The openly available DroneDetect dataset [13] was created by Swinney and Woods [14]. It contains raw in-phase and quadrature (IQ) data recorded with a BladeRF SDR. Seven drone models were recorded in three different flight modes (on, hovering, flying). Measurements were also repeated with different types of noise, such as interference from a Bluetooth speaker, a Wi-Fi hotspot, and simultaneous Bluetooth and Wi-Fi interference. The dataset does not include measurements without drones, which would be necessary to evaluate a drone detection system. The results in [14] show that Bluetooth signals are more likely to interfere with detection and classification accuracy than Wi-Fi signals. Overall, frequency domain features extracted from a CNN were shown to be more robust than time domain features in the presence of interference. In [15] the drone signals from the DroneDetect dataset were augmented with Gaussian noise and SDR recorded background noise. Hence, the proposed approach could be evaluated regarding its capability to detect drones. They trained a CNN end-to-end on the raw IQ data and report an accuracy of 99% for detection and between 72% and 94% for classification.

The Cardinal RF dataset [16] consists of the raw time series data from six drones + controller, two Wi-Fi and two Bluetooth devices. Based on this dataset, Medaiyese et al. [17] proposed a semi-supervised framework for UAV detection using wavelet analysis. Accuracy between 86% and 97% was achieved at SNRs of 30 dB and 18 dB, while it dropped to chance level for SNRs below 10 dB to 6 dB. In addition, [18] investigated different wavelet transforms for the feature extraction from the RF signals. Using the wavelet scattering transform from the steady state of the RF signals at 30 dB SNR to train SqueezeNet [19], they achieved an accuracy of 98.9% at 10 dB SNR.

In our previous work [20], we created the noisy drone RF signals dataset¹ from six drones and four remote controllers. It consists of non-overlapping signal vectors of 16384 samples, corresponding to ≈ 1.2 ms at 14 MHz. We added Labnoise (Bluetooth, Wi-Fi, Amplifier) and Gaussian noise to the dataset and mixed it with the drone signals with SNR ∈ [−20, 30] dB. Using IQ data and spectrogram data to train different CNNs, we found an advantage in favour of the 2D spectrogram representation of the data. There was no performance difference at SNR ≥ 0 dB but a major improvement in the balanced accuracy at low SNR levels, i.e. 84.2% on the spectrogram data compared to 41.3% on the IQ data at −12 dB SNR.

¹ https://www.kaggle.com/datasets/sgluege/noisy-drone-rf-signal-classification

Recently, [21] proposed an anchor-free object detector based on keypoints for drone RF signal spectrograms. They also proposed an adversarial learning-based data adaptation method to generate domain independent and domain aligned features. Given five different types of drones, they report a mean average precision of 97.36%, which drops to ≈ 55% when adding Gaussian noise with −25 dB SNR. The raw data used in their work is available², but unfortunately not usable without further documentation.

² https://www.kaggle.com/datasets/zhaoericry/drone-rf-dataset

TABLE 1: Overview on openly available drone RF datasets.

Dataset | Year | Datatype | UAV | Noise | Size
DroneRF [3] | 2019 | Raw Amplitude | 3 drones + 3 controller | Background RF activities | 3.75 GB
Drone remote controller RF signal dataset [11] | 2020 | Raw Amplitude | 17 controller | none | 124 GB
DroneDetect dataset [13] | 2020 | Raw IQ | 7 drones + 7 controller | Bluetooth, Wi-Fi devices | 66 GB
Cardinal RF [16] | 2022 | Raw Amplitude | 6 drones + 6 controller | Bluetooth, Wi-Fi | 65 GB
Noisy drone RF signals [20] | 2023 | Pre-processed IQ and Spectrogram | 6 drones + 4 controller | Bluetooth, Wi-Fi, Gauss | 23 GB

B. MOTIVATION

As we have seen in other fields, such as computer vision, the success of DL can be attributed to: (a) high-capacity models; (b) increased computational power; and (c) the availability of large amounts of labelled data [22]. Thus, given the large amount of available raw RF signals (cf. Tab. 1) we promote the idea of open and reusable data, to facilitate model development and model comparison.

With the noisy drone RF signals dataset [20], we have provided a first ready-to-use dataset to enable rapid model development, without the need for any data preparation. Furthermore, the dataset contains samples that can be considered as “hard” in terms of noise, i.e. Bluetooth + Wi-Fi + Gaussian noise at very low SNRs, and allows a direct comparison with the published results.

While the models proposed in [20] performed reasonably well in the training/lab setting, we found it difficult to transfer their performance to practical application. The reason was the choice of rather short signal vectors of 16384 samples, corresponding to ≈ 1.2 ms at 14 MHz. Since the drone signals occur in short bursts of ≈ 1.3–2 ms with a repetition period of ≈ 60–600 ms, our continuously running classifier predicts a drone whenever a burst occurs and noise during the repetition period of the signal. Therefore, in order to provide a stable and reliable classification for every second, one would need an additional “layer” to pool the classifier outputs given every 1.2 ms.

In the present work, we follow a data-centric approach and simply increase the length of the input signal to ≈ 75 ms to train a classifier in an end-to-end manner. Again, we provide the data used for model development in the hope that it will inspire others to develop better models.

In the next section, we briefly describe the data collection and preprocessing procedure. Section III describes the model architectures and their training/validation method. In addition, we describe the setup of a low-cost drone detection system and of the field test. The resulting performance metrics are presented in Section IV and are further discussed in Section V.

II. MATERIALS

We used the raw RF signals from the drones that were collected in [20]. Nevertheless, we briefly describe the data acquisition process again to provide a complete picture of the development from the raw RF signal to the deployment of a detection system within a single manuscript.

A. DATA ACQUISITION

The drone’s remote control and, if present, the drone itself were placed in an anechoic chamber to record the raw RF signal without interference for at least one minute. The signals were received by a log-periodic antenna and sampled and stored by an Ettus Research USRP B210, see Fig. 1. In the static measurement, the respective signals of the remote control (TX) alone or with the drone (RX) were measured. In the dynamic measurement, one person at a time was inside the anechoic chamber and operated the remote control (TX) to generate a signal that is as close to reality as possible. All signals were recorded at a sampling frequency of 56 MHz (highest possible real-time bandwidth). All drone models and recording parameters are listed in Tab. 2, including both uplink and downlink signals.

FIGURE 1: Recording of drone signals in the anechoic chamber. A DJI Phantom 4 Pro drone with the DJI Phantom GL300F remote control.

We also recorded three types of noise and interference. First, Bluetooth/Wi-Fi noise was recorded using the hardware setup described above. Measurements were taken in a public and busy university building. In this open recording setup, we had no control over the exact number or types of active Bluetooth/Wi-Fi devices and the actual traffic in progress.

TABLE 2: Transmitters and receivers recorded in the development dataset and their respective class labels. Additionally, we show the center frequency (GHz), the channel spacing (MHz), the burst duration (ms), and the repetition period of the respective signals (ms).

Transmitter | Receiver | Label | Center Freq. (GHz) | Spacing (MHz) | Duration (ms) | Repetition (ms)
DJI Phantom GL300F | DJI Phantom 4 Pro | DJI | 2.44175 | 1.7 | 2.18 | 630
Futaba T7C | - | FutabaT7 | 2.44175 | 2 | 1.7 | 288
Futaba T14SG | Futaba R7008SB | FutabaT14 | 2.44175 | 3.1 | 1.4 | 330
Graupner mx-16 | Graupner GR-16 | Graupner | 2.44175 | 1 | 1.9/3.7 | 750
Bluetooth/Wi-Fi Noise | - | Noise | 2.44175 | - | - | -
Taranis ACCST | X8R Receiver | Taranis | 2.440 | 1.5 | 3.1/4.4 | 420
Turnigy 9X | - | Turnigy | 2.445 | 2 | 1.3 | 61, 120–2900 a

a The repetition period of the Turnigy transmitter is not static. First bursts were observed after 61 ms, the following signal bursts were observed in the interval [120, 2900] ms.

Second, artificial white Gaussian noise was used, and third, receiver noise was recorded for 30 seconds from the USRP at various gain settings ([30, 70] dB in steps of 10 dB) without the antenna attached. This should prevent the final model from misclassifying quantisation noise in the absence of a signal, especially at low gain settings.

B. DATA PREPARATION

To reduce memory consumption and computational effort, we reduced the bandwidth of the signals by downsampling from 56 MHz to 14 MHz using the SciPy [23] signal.decimate function with an 8th order Chebyshev type I filter.

The drone signals occur in short bursts with some low power gain or background noise in between (cf. Tab. 2). We divided the signals into non-overlapping vectors of 1048576 samples (74.9 ms) and only vectors containing a burst, or at least a partial burst, were used for the development dataset. This was achieved by applying an energy threshold. As the recordings were made in an echo-free chamber, the signal burst is always clearly visible. Hence, we only used vectors that contained a portion of the signal whose energy was above the threshold, which was arbitrarily set at 0.001 of the average energy of the entire recording.

The selected drone signal vectors x with samples i ∈ {1, . . . , k} were normalised to a carrier power of 1 per sample, i.e. only the part of the signal vector containing drone bursts was considered for the power calculation (m samples out of k). This was achieved by identifying the bursts as those samples where a smoothed energy was above a threshold. The signal vectors x are thus normalised by

x̂(i) = x(i) / √( (1/m) Σᵢ |x(i)|² ).    (1)

Noise vectors (Bluetooth, Wi-Fi, Amplifier, Gauss) n with samples i ∈ {1, . . . , k} were normalised to a mean power of 1 with

n̂(i) = n(i) / √( (1/k) Σᵢ |n(i)|² ).    (2)

Finally, the normalised drone signal vectors were mixed with the normalised noise vectors by

ŷ(i) = ( √k · x̂(i) + n̂(i) ) / √(k + 1),    with k = 10^(SNR/10),    (3)

to generate the noisy drone signal vectors ŷ at different SNRs.

TABLE 3: Number of samples in the different classes in the development dataset.

Class | DJI | FutabaT14 | FutabaT7 | Graupner | Taranis | Turnigy | Noise
#samples | 1280 | 3472 | 801 | 801 | 1663 | 855 | 8872

C. DEVELOPMENT DATASET

To facilitate future model development, we provide our resulting dataset³ along with a code example⁴ to load and inspect the data. The dataset consists of the non-overlapping signal vectors of 2²⁰ samples, corresponding to ≈ 74.9 ms at 14 MHz.

³ https://www.kaggle.com/datasets/sgluege/noisy-drone-rf-signal-classification-v2
⁴ https://github.com/sgluege/noisy-drone-rf-signal-classification-v2

As described in Sec. B, the drone signals were mixed with noise. More specifically, 50% of the drone signals were mixed with Labnoise (Bluetooth + Wi-Fi + Amplifier) and 50% with Gaussian noise. In addition, we created a separate noise class by mixing Labnoise and Gaussian noise in all possible combinations (i.e., Labnoise + Labnoise, Labnoise + Gaussian noise, Gaussian noise + Labnoise, and Gaussian noise + Gaussian noise). For the drone signal classes, as for the noise class, the number of samples for each SNR level was evenly distributed over the interval of SNRs ∈ [−20, 30] dB in steps of 2 dB, i.e., 679–685 samples per SNR level. The resulting number of samples per class is given in Tab. 3.
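A minimal NumPy/SciPy sketch of the preparation steps in Sec. B (decimation, burst-power normalisation as in (1)–(2) and SNR mixing as in (3)) is given below. The smoothing window, the burst-mask handling and the function name are our assumptions rather than the published pipeline.

```python
import numpy as np
from scipy import signal


def prepare_and_mix(raw_iq, noise_iq, snr_db, burst_threshold=1e-3):
    """Sketch: decimate 56 MHz -> 14 MHz, normalise drone bursts to unit
    carrier power (Eq. 1), normalise noise to unit mean power (Eq. 2) and
    mix at the requested SNR (Eq. 3)."""
    # Downsample by 4; scipy.signal.decimate defaults to an 8th-order
    # Chebyshev type I filter, as stated in Sec. II-B.
    x = signal.decimate(raw_iq.real, 4) + 1j * signal.decimate(raw_iq.imag, 4)
    n = noise_iq[: len(x)]

    # Burst mask from a smoothed energy relative to the mean energy of the
    # vector; the 1e-3 threshold mirrors the value mentioned in Sec. II-B,
    # the 128-sample smoothing window is an assumption.
    energy = np.convolve(np.abs(x) ** 2, np.ones(128) / 128, mode="same")
    burst = energy > burst_threshold * np.mean(np.abs(x) ** 2)
    m = max(int(burst.sum()), 1)

    # Eq. (1): carrier power of 1 per sample, computed over burst samples only.
    x_hat = x / np.sqrt(np.sum(np.abs(x[burst]) ** 2) / m)
    # Eq. (2): noise normalised to a mean power of 1 over all samples.
    n_hat = n / np.sqrt(np.mean(np.abs(n) ** 2))

    # Eq. (3): mix signal and noise at the requested SNR (in dB).
    k = 10.0 ** (snr_db / 10.0)
    return (np.sqrt(k) * x_hat + n_hat) / np.sqrt(k + 1)
```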


In our previous work [20] we found an advantage in using the spectrogram representation of the data compared to the IQ representation, especially at low SNR levels. Therefore, we transform the raw IQ signals by computing the spectrum of each sample with consecutive Fourier transforms with non-overlapping segments of length 1024 for the real and imaginary parts of the signal. That is, the two IQ signal vectors ([2 × 2²⁰]) are represented as two matrices ([2 × 1024 × 1024]). Fig. 2 shows four samples of the dataset at different SNRs. Note that we have plotted the log power spectrogram of the complex spectrum ŷ_fft as

log₁₀ |ŷ_fft| = log₁₀ √( Re(ŷ_fft)² + Im(ŷ_fft)² ).    (4)

FIGURE 2: Log power spectrogram and IQ data samples from the development dataset at different SNRs: (a) FutabaT14 at SNR 26 dB, (b) DJI at SNR 6 dB, (c) Taranis at SNR −4 dB, (d) Noise at SNR 22 dB.
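One way to obtain this [2 × 1024 × 1024] representation and the log power spectrogram of (4) is sketched below; the exact segmentation and scaling used for the published dataset may differ, and the small epsilon is only added here to avoid log(0).

```python
import numpy as np


def iq_to_spectrogram(iq, seg_len=1024):
    """Sketch: consecutive, non-overlapping FFTs of length 1024 over the
    real and imaginary parts of a 2**20-sample IQ vector, giving a
    [2, 1024, 1024] array per sample."""
    assert iq.size % seg_len == 0
    channels = []
    for part in (iq.real, iq.imag):
        segments = part.reshape(-1, seg_len)      # 1024 segments of length 1024
        channels.append(np.fft.fft(segments, axis=-1))
    y_fft = np.stack(channels)                    # complex, shape [2, 1024, 1024]

    # Eq. (4): log power spectrogram, |y_fft| = sqrt(Re^2 + Im^2).
    log_power = np.log10(np.abs(y_fft) + 1e-12)
    return log_power.astype(np.float32)
```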
be misleading for unbalanced datasets. In our case, the noise
D. DETECTION SYSTEM PROTOTYPE
class is over-represented in the dataset (cf. Tab. 3). Therefor,
we also report the balanced accuracy, which is defined as
For field use, a system based on a mobile computer was
the average of the recall obtained for each class, i.e. it gives
used as shown in Fig. 3 and illustrated in Fig. 4. The RF
equal weight to each class regardless of how frequent or rare
signals were received using a directional left-hand circularly
it is.
polarised antenna (H&S SPA 2400/70/9/0/CP). The antenna
gain of 8.5 dBi and the front-to-back ratio of 20 dB helped
to increase the detection range and to attenuate the unwanted C. VISUALISATION OF MODEL EMBEDDINGS
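The GNU Radio flowgraph itself is not described in detail here; the following sketch only illustrates the Python-side consumer that takes a one-second batch of baseband IQ samples and classifies the ≈ 74.9 ms windows it contains. The buffer hand-over, the class label order and the `to_spectrogram` helper (e.g. the sketch shown above) are assumptions.

```python
import numpy as np
import torch

SAMPLE_RATE = 14_000_000   # 14 Msps from the USRP B210
WINDOW = 2 ** 20           # ~74.9 ms per model input
# Class label order is an assumption (alphabetical over the seven classes).
CLASSES = ["DJI", "FutabaT14", "FutabaT7", "Graupner", "Noise", "Taranis", "Turnigy"]


@torch.no_grad()
def classify_second(model, iq_buffer, to_spectrogram, device="cuda"):
    """Classify all complete ~74.9 ms windows contained in a one-second
    batch of IQ samples (iq_buffer: complex64 array of SAMPLE_RATE samples)."""
    model.eval()
    n_windows = len(iq_buffer) // WINDOW
    specs = [to_spectrogram(iq_buffer[i * WINDOW:(i + 1) * WINDOW])
             for i in range(n_windows)]
    batch = torch.stack([torch.as_tensor(s) for s in specs]).to(device)
    logits = model(batch)
    return [CLASSES[i] for i in logits.argmax(dim=1).tolist()]
```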
III. METHODS

A. MODEL ARCHITECTURE AND TRAINING

As in [20] we chose the Visual Geometry Group (VGG) CNN architecture [24]. The main idea of this architecture is to use multiple layers of small (3 × 3) convolutional filters instead of larger ones. This is intended to increase the depth and expressiveness of the network, while reducing the number of parameters. There are several variants of this architecture, which differ in the number of convolutional layers (from 11 to 19). We used a variant with a batch normalisation [25] layer after the convolutions, denoted as VGG11 BN to VGG19 BN. For the dense classification layer, we used 256 linear units followed by 7 linear units at the output (one unit per class).

A stratified 5-fold train-validation-test split was used as follows. In each fold, we trained a network using 80% and 20% of the available samples of each class for training and testing, respectively. Repeating the stratified split five times ensures that each sample was in the test set once in each experiment. Within the training set, 20% of the samples were used as the validation set during training.

Model training was performed for 200 epochs with a batch size of 8. The PyTorch [26] implementation of the Adam algorithm [27] was used with a learning rate of 0.005, betas (0.9, 0.999) and weight decay of 0.
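The paper does not state which VGG implementation was used; assuming torchvision's VGG11 with batch normalisation, the two-channel input, the 256 + 7 unit classification head and the stated optimiser settings could be set up as follows. The input-size handling (torchvision's adaptive pooling) and the loss function are assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg11_bn


def build_model(n_classes=7):
    """Sketch: torchvision VGG11 BN adapted to 2-channel spectrogram
    inputs and a 256 + n_classes dense classification head."""
    model = vgg11_bn(weights=None)
    # The stock first conv expects 3 RGB channels; the spectrograms have 2.
    model.features[0] = nn.Conv2d(2, 64, kernel_size=3, padding=1)
    # Dense head: 256 linear units followed by one output unit per class.
    model.classifier = nn.Sequential(
        nn.Linear(512 * 7 * 7, 256),
        nn.ReLU(inplace=True),
        nn.Linear(256, n_classes),
    )
    return model


model = build_model()
# Adam with the stated hyperparameters: lr 0.005, betas (0.9, 0.999), no weight decay.
optimizer = torch.optim.Adam(model.parameters(), lr=0.005,
                             betas=(0.9, 0.999), weight_decay=0.0)
criterion = nn.CrossEntropyLoss()  # loss function is not stated in the text; assumption
```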
B. MODEL EVALUATION

During training, the model was evaluated on the validation set after each epoch. If the balanced accuracy on the validation set increased, it was saved. After training, the model with the highest balanced accuracy on the validation set was evaluated on the withheld test data. The performance of the models on the test data was assessed in terms of classification accuracy and balanced accuracy.

As accuracy simply measures the proportion of correct predictions out of the total number of observations, it can be misleading for unbalanced datasets. In our case, the noise class is over-represented in the dataset (cf. Tab. 3). Therefore, we also report the balanced accuracy, which is defined as the average of the recall obtained for each class, i.e. it gives equal weight to each class regardless of how frequent or rare it is.
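Balanced accuracy, i.e. the mean recall over classes, can be computed directly or with scikit-learn; a small sketch (the array handling is ours):

```python
import numpy as np
from sklearn.metrics import balanced_accuracy_score


def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall: each class gets equal weight,
    regardless of how many samples it contributes."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))


# Equivalent result via scikit-learn:
# balanced_accuracy_score(y_true, y_pred)
```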
C. VISUALISATION OF MODEL EMBEDDINGS

Despite their effectiveness, CNNs are often criticised for being “black boxes”. Understanding the feature representations, or embeddings, learned by the CNN helps to demystify these models and provide some understanding of their capabilities and limitations. In general, embeddings are high-dimensional vectors generated by the intermediate layers that capture essential patterns from the input data.

In our case, we chose the least complex VGG11 BN model to visualise its embeddings. During inference on the test data, we collected the activations at the last dense classification layer, which consists of 256 units. Given 3549 test samples, this results in a 256 × 3549 matrix. Using t-distributed Stochastic Neighbor Embedding (t-SNE) [28] and Uniform Manifold Approximation and Projection (UMAP) [29] as dimensionality reduction techniques, we project these high-dimensional embeddings into a lower-dimensional space, creating interpretable visualisations that reveal the model’s internal data representations.

Our goals were to understand how the model clusters and separates different classes, to identify potential overlaps or ambiguities, and to examine the hierarchical relationships within the learned features.
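A minimal sketch of projecting the embedding matrix (one 256-dimensional vector per test sample) to 2D with both techniques is given below; apart from the t-SNE settings listed in Fig. 8, all hyperparameters and the file name are assumptions, and UMAP requires the separate umap-learn package.

```python
import numpy as np
from sklearn.manifold import TSNE
import umap  # provided by the "umap-learn" package

# Activations of the last 256-unit dense layer, shape (3549, 256); file name is hypothetical.
embeddings = np.load("vgg11_bn_test_embeddings.npy")

# t-SNE with euclidean metric, perplexity 30 and Barnes-Hut gradient approximation;
# scikit-learn's default of 1000 iterations matches the setting reported in Fig. 8.
tsne_2d = TSNE(n_components=2, metric="euclidean", perplexity=30,
               method="barnes_hut").fit_transform(embeddings)

# UMAP with assumed default-like hyperparameters.
umap_2d = umap.UMAP(n_components=2, metric="euclidean",
                    n_neighbors=15, min_dist=0.1).fit_transform(embeddings)
```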


D. DETECTION SYSTEM FIELD TEST

We conducted a field test of the detection system in Rapperswil at the Zurich Lake. The drone detection prototype was placed on the shore (cf. Fig. 4) in line of sight of a wooden boardwalk across the lake, with no buildings to interfere with the signals. The transmitters were mounted on a 2.5 m long wooden pole. The signals from the transmitters were recorded (and classified in real time) at four positions along the walkway at approximately 110 m, 340 m, 560 m and 670 m from the detection system. Figure 5 shows an overview of the experimental setup. At each recording position, we measured with the directional antenna at three different angles, i.e. at 0° – facing the drones and/or remote controls, at 90° – perpendicular to the direction of the transmitters, and at 180° – in the opposite direction. Directing the antenna in the opposite direction should result in ≈ 20 dB attenuation of the radio signals.

FIGURE 4: Detection prototype at the Zurich Lake in Rapperswil.

Table 4 lists the drones and/or remote controls used in the field test. Note that the Graupner drone and remote control are part of the development dataset (cf. Tab. 2), but were not measured in the field experiment. We assume that no other drones were present during the measurements, so recordings where none of our transmitters were used are labelled as “Noise”.

TABLE 4: Drones and/or remotes used in the field test.

Class | Drone/remote control
DJI | DJI Phantom 4 Pro drone and remote
FutabaT14 | Futaba T14 remote control
FutabaT7 | Futaba T7C remote control
Taranis | FrSky Taranis Q X7 remote control
Turnigy | Turnigy Evolution remote control

For each transmitter, distance, and angle, 20 to 30 s, or approximately 300 spectrograms, were live classified and recorded. The resulting number of samples for each class, distance, and angle are shown in Tab. 5.


FIGURE 5: Experimental measurement setup at the Zurich Lake in Rapperswil. One can see the four recording positions along the wooden walkway and the detection system positioned at the lake side. Further, recordings were done at different angles of the directional antenna, indicated by the arrows at the detection system.

TABLE 5: Number of samples (#samples) for each class, distance and antenna direction (angle) recorded in the field test. Recordings at 0 m distance have no active transmitter and were therefore labelled “Noise”.

Class | #samples
DJI | 4900
FutabaT14 | 5701
FutabaT7 | 5086
Noise | 2597
Taranis | 5094
Turnigy | 4032

Distance [m] | #samples
0 | 2597
110 | 6208
340 | 6305
560 | 6358
670 | 5942

Angle [°] | #samples
0 | 9110
90 | 9076
180 | 9226

IV. RESULTS

A. CLASSIFICATION PERFORMANCE ON THE DEVELOPMENT DATASET

Table 6 shows the overall mean ± standard deviation of accuracy and balanced accuracy on the test data of the development dataset (cf. Sec. C), obtained in the 5-fold cross-validation of the different models.

TABLE 6: Mean ± standard deviation of the accuracy (Acc.) and the balanced accuracy (balanced Acc.) obtained in 5-fold cross-validation of the different models on the test data of the development dataset. An indication of the model training time is given with the mean ± standard deviation of the number of training epochs (#epochs), i.e. when the highest balanced accuracy on the validation set was reached. The number of trainable parameters (#params) indicates the complexity of the model.

Model | Acc. | balanced Acc. | #epochs | #params
VGG11 BN | 0.944 ± 0.005 | 0.932 ± 0.002 | 66.4 ± 24.4 | 9.36 · 10^6
VGG13 BN | 0.947 ± 0.003 | 0.935 ± 0.003 | 138.6 ± 46.4 | 9.54 · 10^6
VGG16 BN | 0.947 ± 0.006 | 0.937 ± 0.005 | 101.8 ± 41.5 | 14.86 · 10^6
VGG19 BN | 0.952 ± 0.006 | 0.939 ± 0.008 | 98.2 ± 45.8 | 20.17 · 10^6

There is no meaningful difference in performance between the models, even when the model complexity increases from VGG11 BN to VGG19 BN. The number of epochs for training (#epochs) shows when the highest balanced accuracy was reached on the validation set. It can be seen that the least complex model, VGG11 BN, required the least number of epochs compared to the more complex models. However, the resulting classification performance is the same.

Figure 6 shows the resulting 5-fold mean balanced accuracy over SNRs ∈ [−20, 30] dB in 2 dB steps. Note that we do not show the standard deviation to keep the plot readable. In general, we observe a drastic degradation in performance from −12 dB down to near chance level at −20 dB.

FIGURE 6: Mean balanced accuracy obtained in the 5-fold cross-validation of the different models on the test set of the development dataset over the SNR levels.

The vast majority of misclassifications occurred between noise and drones and not between different types of drones. Figure 7 illustrates this fact. It shows the confusion matrix for the VGG11 BN model for a single fold on the test data for the samples with −14 dB SNR.

B. EMBEDDING SPACE VISUALISATION

Figure 8 shows the 2D t-SNE visualisation of the VGG11 BN embeddings of 3549 test samples from the development dataset. It can be seen that each class forms a separate cluster. While the different drone signal clusters are rather small and dense, the noise cluster takes up most of the embedding space and even forms several sub-clusters. This is most likely due to the variety of the signals used in the noise class, i.e. Bluetooth and Wi-Fi signals plus Gaussian noise.


FIGURE 7: Confusion matrix of the outputs of the VGG11 BN model on a single fold for the samples at −14 dB SNR from the test data. The average balanced accuracy is 0.71.

FIGURE 8: 2D t-SNE visualisation of the VGG11 BN embeddings of 3549 test samples from the development dataset. The hyperparameters for t-SNE were: metric “euclidean”, number of iterations 1000, perplexity 30 and method for gradient approximation “barnes hut”.

We used t-SNE for dimensionality reduction because of its ability to preserve local structure within the high-dimensional embedding space. Furthermore, t-SNE has been widely adopted in the ML community and has a well-established track record for high-dimensional data visualisation. However, it is sensitive to hyperparameters such as perplexity and requires some tuning, i.e. different parameters can lead to considerably different results.

It can be argued that UMAP would be a better choice due to its balanced preservation of local and global structure together with its robustness to hyperparameters. Therefore, we created a web application⁵ that allows users to test and compare both approaches with different hyperparameters.

⁵ https://visvgg11bndronerfembeddings.streamlit.app

C. CLASSIFICATION PERFORMANCE IN THE FIELD TEST

For each model architecture, we performed 5-fold cross-validation on the development dataset (cf. Sec. A), resulting in five trained models per architecture. Thus, we also evaluated all five trained models on the field test data. We report the balanced accuracy ± standard deviation for each model architecture for the complete field test dataset averaged over all directions and distances in Tab. 7.

TABLE 7: Mean ± standard deviation of the balanced accuracy (balanced Acc.) of the complete field test recordings for the different models.

Model | balanced Acc.
VGG11 BN | 0.792 ± 0.022
VGG13 BN | 0.807 ± 0.011
VGG16 BN | 0.811 ± 0.009
VGG19 BN | 0.806 ± 0.016

As observed on the development dataset (cf. Tab. 6), there is no meaningful difference in performance between the model architectures. We therefore focus on VGG11 BN, the simplest model trained, in the more detailed analysis of the field test results.

A live system should trigger an alarm when a drone is present. Therefore, the question of whether the signal is from a drone at all is more important than predicting the correct type of drone. Therefore, we also evaluated the models in terms of a binary problem with two classes “Drone” (for all six classes of drones in the development dataset) and “Noise”.
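The detection scores can be obtained from the seven-class predictions by collapsing all drone classes into a single “Drone” label; a small sketch, where y_true and y_pred stand for the field test ground truth and the model predictions (hypothetical names):

```python
import numpy as np
from sklearn.metrics import balanced_accuracy_score


def to_binary(labels):
    """Collapse the seven-class labels to the detection problem:
    every drone/remote-control class becomes 'Drone', noise stays 'Noise'."""
    labels = np.asarray(labels)
    return np.where(labels == "Noise", "Noise", "Drone")


detection_balanced_acc = balanced_accuracy_score(to_binary(y_true), to_binary(y_pred))
```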
Table 8 shows that the accuracies were highly dependent on the class. Our models generalise well to the drones in the dataset, with the exception of the DJI. The dependence on direction is not as strong as expected. Orienting the antenna 180° away from the transmitter reduces the signal power by about 20 dB, resulting in lower SNR and lower classification accuracy. However, as the transmitters were still quite close to the antenna, the effect is not pronounced. As we have seen on the development dataset in Fig. 6, there is a clear drop in accuracy once the SNR is below −12 dB. Apparently we were still above this threshold, regardless of the direction of the antenna.

What may be surprising is the low accuracy on the signals with no active transmitter, labelled as “Noise”, in the direction of the lake (0°). Given the uncontrolled nature of a field test, it could well be that a drone was actually flying on the other side of the 2.3 km wide lake. This could explain the false positives we observed in that direction.

Table 9 shows the average balanced accuracy of the VGG11 BN models on the field test data collected at different distances for each antenna direction. There is a slight decrease in accuracy with distance. However, the longest distance of 670 m appears to be too short to be a problem for the system.


Unfortunately, this was the longest distance within line-of-sight that could be recorded at this location.

Figure 9 shows the confusion matrix for the outputs of the VGG11 BN model of a single fold on the field test data. As with the development dataset (cf. Fig. 7), most of the confusion is between noise and drones rather than between different types of drones.

TABLE 8: Mean balanced accuracy ± standard deviation of the VGG11 BN models on the field test recordings for the different classes for each direction (0°, 90° and 180°). The upper part shows the accuracies for the classification problem (seven classes) and the lower part the accuracies for the detection problem “Drone” or “Noise”.

Class | 0° | 90° | 180°
DJI | 0.623 ± 0.080 | 0.624 ± 0.035 | 0.540 ± 0.051
FutabaT14 | 0.716 ± 0.101 | 0.984 ± 0.011 | 0.911 ± 0.042
FutabaT7 | 0.724 ± 0.041 | 0.737 ± 0.034 | 0.698 ± 0.059
Noise | 0.554 ± 0.038 | 0.924 ± 0.026 | 0.833 ± 0.068
Taranis | 0.936 ± 0.008 | 0.879 ± 0.002 | 0.858 ± 0.002
Turnigy | 0.899 ± 0.129 | 0.958 ± 0.027 | 0.962 ± 0.024
Drone | 0.958 ± 0.011 | 0.859 ± 0.014 | 0.847 ± 0.017
Noise | 0.554 ± 0.038 | 0.924 ± 0.026 | 0.833 ± 0.068

TABLE 9: Mean balanced accuracy ± standard deviation of the VGG11 BN models on the field test data with active transmitters collected at different distances for each antenna direction (0°, 90° and 180°). The upper part shows the accuracies for the classification problem (seven classes) and the lower part the accuracies for the detection problem “Drone” or “Noise”.

Classification
Distance (m) | 0° | 90° | 180°
110 | 0.852 ± 0.044 | 0.838 ± 0.024 | 0.786 ± 0.044
340 | 0.815 ± 0.078 | 0.916 ± 0.017 | 0.828 ± 0.031
560 | 0.730 ± 0.063 | 0.796 ± 0.007 | 0.764 ± 0.017
670 | 0.708 ± 0.068 | 0.805 ± 0.008 | 0.777 ± 0.007

Detection
Distance (m) | 0° | 90° | 180°
110 | 0.955 ± 0.007 | 0.851 ± 0.025 | 0.849 ± 0.018
340 | 0.986 ± 0.008 | 0.940 ± 0.010 | 0.913 ± 0.018
560 | 0.960 ± 0.011 | 0.820 ± 0.013 | 0.807 ± 0.022
670 | 0.925 ± 0.021 | 0.826 ± 0.013 | 0.823 ± 0.014

FIGURE 9: Confusion matrix of the outputs of the VGG11 BN model on a single fold for the samples from the field test data. The average balanced accuracy is 0.80.

V. DISCUSSION

We were able to show that a standard CNN, trained on drone RF signals recorded in a controlled laboratory environment and artificially augmented with noise, generalised well to the more challenging conditions of a real-world field test.

The drone detection system consisted of rather simple and low-budget hardware (consumer grade notebook with GPU + SDR). Recording parameters such as sampling frequency, length of input vectors, etc. were set to enable real-time detection with the limited amount of memory and computing power. This means that data acquisition, pre-processing and model inference did not take longer than the signal being processed (≈ 74.9 ms per sample in our case).

Obviously, the VGG models were able to learn the relevant features for the drone classification from the complex spectrograms of the RF signal. In this respect, we did not find any advantage for the use of more complex models, such as VGG19 BN, over the least complex model, VGG11 BN (cf. Tabs. 6 and 7).

Furthermore, we have seen that the misclassifications mainly occur between the noise class and the drones, and not between the different drones themselves (cf. Figs. 7 and 9). This is particularly relevant for the application of drone detection systems in security sensitive areas. The first priority is to detect any kind of UAV, regardless of its type.

Based on our experience and results, we see the following limitations of our work. The field test showed that the models can be used and work reliably (cf. Tab. 8). However, it is the nature of a field test that the level of interference from WiFi/Bluetooth noise and the possible presence of other drones cannot be fully controlled. Furthermore, due to the limited space/distance between the transmitter and receiver in our field test setup, we were not able to clearly demonstrate the effect of free space attenuation on detection performance (cf. Tab. 9).

Regarding the use of simple CNNs as classifiers, it is not possible to reliably predict whether multiple transmitters are present. In that case, an object detection approach on the spectrograms could provide a more fine-grained prediction, see for example the works [30], [31] and [21]. Nevertheless, the current approach will still detect a drone if one or more are present.


We have only tested a limited set of VGG architectures. It remains to be seen whether more recent architectures, such as the pre-trained Vision Transformer [32], generalise as well or better. We hope that our development dataset will inspire others to further optimise the model side of the problem and perhaps find a model architecture with better performance.

Another issue to consider is the occurrence of unknown drones, i.e. drones that are not part of the train set. Examining the embedding space (cf. Sec. B) gives a first idea of whether a signal is clearly part of a known dense drone cluster or rather falls into the larger, less dense, noise cluster. We believe that a combination of an unsupervised deep autoencoder approach [33], [34] with an additional classification part (cf. [35]) would allow, first, to provide a stable classification of known samples and, second, to indicate whether a sample is known or rather an anomaly.

REFERENCES

[1] N. Al-lQubaydhi, A. Alenezi, T. Alanazi, A. Senyor, N. Alanezi, B. Alotaibi, M. Alotaibi, A. Razaque, and S. Hariri, “Deep learning for unmanned aerial vehicles detection: A review,” Computer Science Review, vol. 51, p. 100614, 2 2024. [Online]. Available: https://linkinghub.elsevier.com/retrieve/pii/S1574013723000813

[2] M. H. Rahman, M. A. S. Sejan, M. A. Aziz, R. Tabassum, J.-I. Baik, and H.-K. Song, “A comprehensive survey of unmanned aerial vehicles detection and classification using machine learning approach: Challenges, solutions, and future directions,” Remote Sensing, vol. 16, p. 879, 3 2024. [Online]. Available: https://www.mdpi.com/2072-4292/16/5/879

[3] M. S. Allahham, M. F. Al-Sa’d, A. Al-Ali, A. Mohamed, T. Khattab, and A. Erbad, “Dronerf dataset: A dataset of drones for rf-based detection, classification and identification,” Data in Brief, vol. 26, p. 104313, 10 2019. [Online]. Available: https://linkinghub.elsevier.com/retrieve/pii/S2352340919306675

[4] M. F. Al-Sa’d, A. Al-Ali, A. Mohamed, T. Khattab, and A. Erbad, “Rf-based drone detection and identification using deep learning approaches: An initiative towards a large open source drone database,” Future Generation Computer Systems, vol. 100, pp. 86–97, 11 2019.

[5] C. J. Swinney and J. C. Woods, “Unmanned aerial vehicle flight mode classification using convolutional neural network and transfer learning,” in 2020 16th International Computer Engineering Conference (ICENCO), 2020, pp. 83–87.

[6] Y. Zhang, “Rf-based drone detection using machine learning,” in 2021 2nd International Conference on Computing and Data Science (CDS), 2021, pp. 425–428.

[7] C. Ge, S. Yang, W. Sun, Y. Luo, and C. Luo, “For rf signal-based uav states recognition, is pre-processing still important at the era of deep learning?” in 2021 7th International Conference on Computer and Communications (ICCC), 2021, pp. 2292–2296.

[8] Y. Xue, Y. Chang, Y. Zhang, J. Sun, Z. Ji, H. Li, Y. Peng, and J. Zuo, “Uav signal recognition of heterogeneous integrated knn based on genetic algorithm,” Telecommunication Systems, vol. 85, pp. 591–599, 4 2024. [Online]. Available: https://link.springer.com/10.1007/s11235-023-01099-x

[9] A. AlKhonaini, T. Sheltami, A. Mahmoud, and M. Imam, “Uav detection using reinforcement learning,” Sensors, vol. 24, no. 6, 2024. [Online]. Available: https://www.mdpi.com/1424-8220/24/6/1870

[10] M. Ezuma, F. Erden, C. K. Anjinappa, O. Ozdemir, and I. Guvenc, “Detection and classification of uavs using rf fingerprints in the presence of wi-fi and bluetooth interference,” IEEE Open Journal of the Communications Society, vol. 1, pp. 60–76, 2020. [Online]. Available: https://ieeexplore.ieee.org/document/8913640/

[11] ——, “Drone remote controller rf signal dataset,” 2020. [Online]. Available: https://dx.doi.org/10.21227/ss99-8d56

[12] E. Ozturk, F. Erden, and I. Guvenc, “Rf-based low-snr classification of uavs using convolutional neural networks,” ITU Journal on Future and Evolving Technologies, vol. 2, pp. 39–52, 7 2021. [Online]. Available: https://www.itu.int/pub/S-JNL-VOL2.ISSUE5-2021-A04

[13] C. J. Swinney and J. C. Woods, “Dronedetect dataset: A radio frequency dataset of unmanned aerial system (uas) signals for machine learning detection & classification,” 2021. [Online]. Available: https://dx.doi.org/10.21227/5jjj-1m32

[14] ——, “Rf detection and classification of unmanned aerial vehicles in environments with wireless interference,” in 2021 International Conference on Unmanned Aircraft Systems (ICUAS), 2021, pp. 1494–1498.

[15] S. Kunze and B. Saha, “Drone classification with a convolutional neural network applied to raw iq data,” in 2022 3rd URSI Atlantic and Asia Pacific Radio Science Meeting (AT-AP-RASC), May 2022, pp. 1–4. [Online]. Available: https://ieeexplore.ieee.org/document/9814170/

[16] O. Medaiyese, M. Ezuma, A. Lauf, and A. Adeniran, “Cardinal rf (cardrf): An outdoor uav/uas/drone rf signals with bluetooth and wifi signals dataset,” 2022. [Online]. Available: https://dx.doi.org/10.21227/1xp7-ge95

[17] O. O. Medaiyese, M. Ezuma, A. P. Lauf, and A. A. Adeniran, “Hierarchical learning framework for uav detection and identification,” IEEE Journal of Radio Frequency Identification, vol. 6, pp. 176–188, 2022.

[18] O. O. Medaiyese, M. Ezuma, A. P. Lauf, and I. Guvenc, “Wavelet transform analytics for rf-based uav detection and identification system using machine learning,” Pervasive and Mobile Computing, vol. 82, p. 101569, 6 2022. [Online]. Available: https://linkinghub.elsevier.com/retrieve/pii/S1574119222000219

[19] F. N. Iandola, M. W. Moskewicz, K. Ashraf, S. Han, W. J. Dally, and K. Keutzer, “Squeezenet: Alexnet-level accuracy with 50x fewer parameters and <1mb model size,” CoRR, vol. abs/1602.07360, 2016. [Online]. Available: http://arxiv.org/abs/1602.07360

[20] S. Glüge, M. Nyfeler, N. Ramagnano, C. Horn, and C. Schüpbach, “Robust drone detection and classification from radio frequency signals using convolutional neural networks,” in Proceedings of the 15th International Joint Conference on Computational Intelligence - NCTA, INSTICC. SciTePress, 2023, pp. 496–504.

[21] R. Zhao, T. Li, Y. Li, Y. Ruan, and R. Zhang, “Anchor-free multi-uav detection and classification using spectrogram,” IEEE Internet of Things Journal, vol. 11, pp. 5259–5272, 2 2024. [Online]. Available: https://ieeexplore.ieee.org/document/10221859/

[22] C. Sun, A. Shrivastava, S. Singh, and A. Gupta, “Revisiting unreasonable effectiveness of data in deep learning era,” in 2017 IEEE International Conference on Computer Vision (ICCV), 2017, pp. 843–852.

[23] P. Virtanen, R. Gommers, T. E. Oliphant, M. Haberland, T. Reddy, D. Cournapeau, E. Burovski, P. Peterson, W. Weckesser, J. Bright, S. J. van der Walt, M. Brett, J. Wilson, K. J. Millman, N. Mayorov, A. R. J. Nelson, E. Jones, R. Kern, E. Larson, C. J. Carey, İ. Polat, Y. Feng, E. W. Moore, J. VanderPlas, D. Laxalde, J. Perktold, R. Cimrman, I. Henriksen, E. A. Quintero, C. R. Harris, A. M. Archibald, A. H. Ribeiro, F. Pedregosa, P. van Mulbregt, and SciPy 1.0 Contributors, “SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python,” Nature Methods, vol. 17, pp. 261–272, 2020.

[24] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, Y. Bengio and Y. LeCun, Eds., 2015. [Online]. Available: http://arxiv.org/abs/1409.1556

[25] S. Ioffe and C. Szegedy, “Batch normalization: accelerating deep network training by reducing internal covariate shift,” in Proceedings of the 32nd International Conference on International Conference on Machine Learning - Volume 37, ser. ICML’15. JMLR.org, 2015, pp. 448–456.

[26] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala, “Pytorch: An imperative style, high-performance deep learning library,” in Advances in Neural Information Processing Systems 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, Eds. Curran Associates, Inc., 2019, pp. 8024–8035.


[27] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, Y. Bengio and Y. LeCun, Eds., 2015. [Online]. Available: http://arxiv.org/abs/1412.6980

[28] L. van der Maaten and G. Hinton, “Visualizing data using t-sne,” Journal of Machine Learning Research, vol. 9, no. 86, pp. 2579–2605, 2008. [Online]. Available: http://jmlr.org/papers/v9/vandermaaten08a.html

[29] L. McInnes, J. Healy, N. Saul, and L. Großberger, “Umap: Uniform manifold approximation and projection,” Journal of Open Source Software, vol. 3, no. 29, p. 861, 2018. [Online]. Available: https://doi.org/10.21105/joss.00861

[30] K. N. R. Surya Vara Prasad and V. K. Bhargava, “A classification algorithm for blind uav detection in wideband rf systems,” in 2020 IEEE 92nd Vehicular Technology Conference (VTC2020-Fall), 2020, pp. 1–7.

[31] S. Basak, S. Rajendran, S. Pollin, and B. Scheers, “Combined rf-based drone detection and classification,” IEEE Transactions on Cognitive Communications and Networking, vol. 8, no. 1, pp. 111–120, 2022.

[32] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, “An image is worth 16x16 words: Transformers for image recognition at scale,” ICLR, 2021.

[33] S. Lu and R. Li, DAC–Deep Autoencoder-Based Clustering: A General Deep Learning Framework of Representation Learning. Springer Science and Business Media Deutschland GmbH, 2 2022, vol. 294, pp. 205–216. [Online]. Available: https://link.springer.com/10.1007/978-3-030-82193-7_13

[34] H. Zhou, J. Bai, Y. Wang, J. Ren, X. Yang, and L. Jiao, “Deep radio signal clustering with interpretability analysis based on saliency map,” Digital Communications and Networks, 1 2023. [Online]. Available: https://linkinghub.elsevier.com/retrieve/pii/S2352864823000238

[35] E. Pintelas, I. E. Livieris, and P. E. Pintelas, “A convolutional autoencoder topology for classification in high-dimensional noisy image datasets,” Sensors, vol. 21, p. 7731, 11 2021. [Online]. Available: https://www.mdpi.com/1424-8220/21/22/7731
