Drone RF Signal Detection and Fingerprinting UAVSig Dataset and Deep Learning Approach
Abstract—Unmanned aerial vehicles (UAVs) are useful for commercial, recreational, and military applications, but they can also be used by malicious attackers and pose security threats. Therefore, it is important to monitor their occurrences and ensure their compliance. Radio frequency (RF) signals can be leveraged for this task. However, commercial UAVs often adopt frequency hopping spread spectrum signals and proprietary protocols, which makes them more challenging to detect. While prior works have studied the UAV detection and classification problem, most of them consider classification between different UAV models; classification between drones of the same model has not yet been fully investigated. In this work, we consider the tasks of detecting and localizing UAV signals in a wideband spectrum, as well as fingerprinting drones of the same model. To solve this problem, we collect UAVSig, an over-the-air dataset of UAV RF signals. We also present a deep learning model which can detect and fingerprint multiple drones of the same model simultaneously based on spectrograms. Our model achieves 99.5% precision and 99.4% recall for transmission detection, and 90.9% classification accuracy between drones of the same model.

Index Terms—Radio frequency fingerprinting, deep learning, unmanned aerial vehicles, UAV, dataset, YOLO

I. INTRODUCTION

A. Motivation

Unmanned aerial vehicles (UAVs) have been used in various scenarios, including military, surveillance and monitoring applications. Many other potential applications, such as flying wireless base stations, are also promising [1]. However, the flexibility and mobility of UAVs can also pose threats to people and property, and thus it is important to develop technologies for drone detection and classification [2]. Existing drone detection and classification techniques are based on four types of signals: radar, acoustic, visual, and radio frequency (RF) [2]. Among these methods, RF-based techniques have many advantages, such as being able to work in all kinds of weather [3] and to operate in non-line-of-sight (NLoS) scenarios [4].

Using the RF signals, the monitoring system would need to detect, localize and fingerprint multiple UAVs in a wideband spectrum. The detection algorithm needs to find occurrences of the UAV transmissions in the spectrum, localize the time and frequency parameters of those transmissions, and classify their source transmitters based on RF fingerprinting. Here, fingerprinting means distinguishing between different UAVs of the same model. The fingerprinting problem is challenging because UAVs of the same model use the same protocols, and many commercial UAVs transmit proprietary waveforms [5]. As a result, the detector might have limited or no knowledge of the protocols and transmitted data, and must therefore rely on the raw I/Q samples in order to differentiate between them. Commercial UAVs usually adopt frequency hopping spread spectrum (FHSS) physical layer signals [6]. Therefore, localization of the UAV signals in a wideband spectrum is also a major challenge for UAV detection.

To investigate this problem, we need an appropriate UAV RF dataset. We surveyed existing public UAV RF datasets [7]–[11], which were collected for different purposes, such as transmission type classification, UAV operational mode classification, etc. However, those datasets have several limitations, as listed in Table I. Synthetic data can be used for training but cannot always represent real-world scenarios. Consequently, a model trained on synthetic data can have degraded performance on real signals captured over-the-air (OTA) [12]. Therefore, to address this problem, we collect and release a comprehensive set of OTA RF signals from UAVs of the same model, referred to as the UAVSig dataset. A detailed comparison of the datasets is presented in Table I.

Many prior works have studied UAV detection and classification using RF signals and achieved good performance with machine learning methods [3]–[5], [13]–[15]. The authors in these works have considered this task under different scenarios, including low signal-to-noise ratio (SNR) [13], the presence of interference [3], [14], out-of-distribution (OOD) and misclassified signal detection [4], different operational modes [15], and preamble feature extraction [5]. However, these works mainly consider classification between UAVs of different models, and do not consider the fingerprinting problem of interest. On the other hand, fingerprinting UAVs is discussed in [10], where data streams from 7 identical DJI M100 UAVs are used for classification. It only considered a single 10 MHz channel UAV transmission, and thus the detection and localization of UAV signals in a wideband spectrum was not fully characterized. The authors in [16] investigated drone detection and classification over the entire 2.4 GHz ISM band. However, this classification was also limited to differentiating between different UAV makes and models.
979-8-3503-7423-0/24/$31.00 ©2024 IEEE 431
MILCOM 2024 Track 3 - Cyber Security and Trusted Computing
B. Contributions

Our contributions can be summarized as follows:
• We collect a large scale UAV RF signal dataset (UAVSig) for drone detection and fingerprinting. UAVSig consists of OTA transmission captures from 8 transmitters in the 2.4 GHz ISM band. Specifically, the transmissions are labeled with both time and frequency domain parameters, as well as the transmitter identity. The dataset will be made public for future research.
• We consider the task of UAV signal detection, spectrum localization, and fingerprinting simultaneously.

[Figure: model architecture. Main network: Input → Conv2D(64, 3, 3) → MaxPool2D(2, 2) → ResBlock(64) → ResBlock(128) → ResBlock(256) → ResBlock(512) → Conv2D(1024, 3, 3) → Conv2D(2048, 3, 3) → Conv2DTranspose(2048, 3, 3) → Conv2DTranspose(2048, 3, 3) → Conv2D(9, 2, 2). ResBlock(K) detail: Input → Conv2D(K, 3, 3) → BatchNormalization + LeakyReLU → Conv2D(K, 3, 3) → BatchNormalization + LeakyReLU → Conv2D(K, 1, 1) → BatchNormalization → LeakyReLU.]
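The network above ends in a 9-channel convolution, matching a per-cell prediction of 5 box values plus 4 class scores. A sketch of decoding such a cell grid into bounding boxes is below; the channel order [x, y, w, h, C, class scores] and the confidence threshold are assumed conventions for illustration, not the released implementation:

```python
import numpy as np

def decode_grid(pred, conf_thresh=0.5):
    """Convert an (n_rows, n_cols, c + 5) sigmoid-activated grid into boxes.

    Each cell predicts [x, y, w, h, C, p_1 ... p_c], all in [0, 1]:
    (x, y) is the box center offset inside the cell, (w, h) the box size
    relative to the whole spectrogram, C the confidence, p the class scores.
    """
    n_rows, n_cols = pred.shape[:2]
    boxes = []
    for i in range(n_rows):
        for j in range(n_cols):
            x, y, w, h, conf = pred[i, j, :5]
            if conf < conf_thresh:
                continue                  # no transmission in this cell
            cx = (j + x) / n_cols         # normalized center coordinates
            cy = (i + y) / n_rows
            cls = int(np.argmax(pred[i, j, 5:]))
            boxes.append((cx, cy, float(w), float(h), float(conf), cls))
    return boxes
```

One box per cell keeps the decoder trivial; non-maximum suppression across cells would be the natural next step in a full pipeline.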
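The model is trained with a YOLO-style sum of squared errors over its output grid, given in (5). A NumPy sketch is below; the last-axis layout [x, y, w, h, C, p_1 … p_c] is an assumed convention, and the uniform (λ-free) weighting follows (5) as printed rather than the original YOLO loss:

```python
import numpy as np

def yolo_style_loss(y_true, y_pred):
    """Sketch of the loss in (5) for (n_rows, n_cols, c + 5) target/prediction grids."""
    obj = y_true[..., 4]            # 1 if a transmission exists in cell (i, j)
    noobj = 1.0 - obj
    # Localization terms: box centers, then square-rooted width/height.
    loc = np.sum(obj * ((y_true[..., 0] - y_pred[..., 0]) ** 2 +
                        (y_true[..., 1] - y_pred[..., 1]) ** 2))
    size = np.sum(obj * ((np.sqrt(y_true[..., 2]) - np.sqrt(y_pred[..., 2])) ** 2 +
                         (np.sqrt(y_true[..., 3]) - np.sqrt(y_pred[..., 3])) ** 2))
    # Class-probability term, only for cells that contain a transmission.
    cls = np.sum(obj[..., None] * (y_true[..., 5:] - y_pred[..., 5:]) ** 2)
    # Confidence terms for object and no-object cells.
    conf = np.sum(obj * (y_true[..., 4] - y_pred[..., 4]) ** 2)
    conf += np.sum(noobj * (y_true[..., 4] - y_pred[..., 4]) ** 2)
    return loc + size + cls + conf
```

As in YOLO, the square roots make the size penalty relatively larger for small boxes than a plain squared difference would.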
The output shape of the model is (15, 15, c + 5), and the loss function of the model is detailed in (5):

$$
\begin{aligned}
L ={}& \sum_{i=0}^{n_c} \sum_{j=0}^{n_r} \mathbb{1}_{ij}^{\mathrm{obj}} \left[ (x_{i,j} - \hat{x}_{i,j})^2 + (y_{i,j} - \hat{y}_{i,j})^2 \right] \\
&+ \sum_{i=0}^{n_c} \sum_{j=0}^{n_r} \mathbb{1}_{ij}^{\mathrm{obj}} \left[ \left( \sqrt{w_{i,j}} - \sqrt{\hat{w}_{i,j}} \right)^2 + \left( \sqrt{h_{i,j}} - \sqrt{\hat{h}_{i,j}} \right)^2 \right] \\
&+ \sum_{i=0}^{n_c} \sum_{j=0}^{n_r} \mathbb{1}_{ij}^{\mathrm{obj}} \sum_{n=1}^{N} \left( p_{i,j}(n) - \hat{p}_{i,j}(n) \right)^2 \\
&+ \sum_{i=0}^{n_c} \sum_{j=0}^{n_r} \mathbb{1}_{ij}^{\mathrm{obj}} \left( C_{i,j} - \hat{C}_{i,j} \right)^2 + \sum_{i=0}^{n_c} \sum_{j=0}^{n_r} \mathbb{1}_{ij}^{\mathrm{noobj}} \left( C_{i,j} - \hat{C}_{i,j} \right)^2
\end{aligned} \tag{5}
$$

In (5), $\mathbb{1}_{ij}^{\mathrm{obj}}$, $C_{i,j}$ and $p_{i,j}(n) \in \{0, 1\}$ are equal to 1 if a transmission exists in the area denoted by coordinates (i, j). Similarly, $\mathbb{1}_{ij}^{\mathrm{noobj}} \in \{0, 1\}$ is 1 if no transmission exists in the area denoted by coordinates (i, j). All the predictions $\hat{x}, \hat{y}, \hat{w}, \hat{h}, \hat{p}, \hat{C} \in [0, 1]$, and thus the sigmoid activation function is used at the output layer of the model.

To develop and evaluate the model for this task, we collected the UAVSig dataset, which is detailed in the next section.

III. DATASET

A. RF and Hardware Configuration

In our UAVSig dataset, we collected over-the-air spectrum data from 8 transmitters: 4 identical DJI M100 UAVs and 4 C1 DJI remote controllers. The UAVs are wideband, fixed-channel transmitters, operating in a 10 MHz channel we select through the controller. Meanwhile, the controllers are frequency-hopped transmitters that operate over the whole 2.4 GHz ISM band. We captured 50 MHz of bandwidth centered at 2.4435 GHz using a small software-defined radio (SDR), a USRP B205mini-i. We collected data with each drone transmitting in each of 4 selected channels, as illustrated in Fig. 2. Our center frequency does not precisely center the 4 channels in the spectrum, but we do ensure that all UAV channels-of-interest are fully inside the spectrograms. Since the controllers are frequency-hopped, some of their transmissions are cut off partially or entirely.

The USRP was connected to a 20 dBi panel antenna with an 18° beamwidth. While we were unable to capture data in a location devoid of 2.4 GHz ISM band devices, the directional antenna's spatial isolation significantly reduced interference from unknown sources we cannot label or control. The receiver antenna was angled upwards, towards the transmitters-of-interest and the sky, further reducing unwanted interference.

Fig. 3: Two drone collection setup

B. Collection

The UAVSig dataset contains two capture types from two days: 1) one and two drone captures, and 2) controller captures. Note that, for clarity, we denote each combination of transmitters-of-interest, physical placements, and possible channel assignments as a scenario.

For the one and two drone captures, the drones, the transmitters-of-interest, were placed on an elevated table at a set distance away from the receiver antenna. We designated two locations for drone placement, side-by-side horizontally across the table, as shown in Fig. 3. These designated spots allowed the captured transmit power to be approximately the same for each drone. A drone was placed in each spot for two drone collection, or only in the leftmost spot for one drone collection. When capturing drone signals, a controller needs to be paired to each drone. Since we needed to isolate the drone signals from those of the controllers, we placed the controllers far behind and a floor below the receiver antenna. Note that the data collection was conducted in a controlled, lab-like setting to make the captures from different transmitters as similar as possible. Our goal was to isolate the RF fingerprinting capability, rather than adding side-channel information that could affect the performance.

As described earlier, we assigned each drone to transmit in one of four channels. For one drone, we collected every combination of each drone transmitting on each of the four channels, for a total of 16 scenarios. For two drones, we collected every combination of two drones transmitting on two different channels, for a total of 72 scenarios.

For the controller captures, we placed the controllers on the same elevated table as with the drones. There were four designated locations for controller placement: two placed on the table side-by-side, and two placed additionally on top of boxes behind the first two, as illustrated in Fig. 4. To make the transmissions more consistent and to avoid interference from the drones, we did not pair a drone to the controllers when collecting controller captures. Anecdotally,
Fig. 4: Controllers collection setup

we did not visually notice any significant difference between the controller transmissions and activity with and without a drone connected. Each controller scenario does not include an RF channel to select since, as previously stated, the controllers hop over the entire 2.4 GHz ISM band. Instead, we captured every combination of the four controllers being turned on or off. We also captured this for two different position setups of the controllers, by flipping each controller diagonally from the original setup, for a total of 32 scenarios.

For each of the 16 scenarios with one drone, 72 scenarios with two drones, and 32 scenarios with controllers in the UAVSig dataset, we captured six one-second, 50 MSa/s captures with a receiver gain of 20 dB. The captures were taken consecutively, with small time gaps between each capture. Each captured sample includes 16-bit I/Q data, saved as two 32-bit floats by our GNU-Radio-based data collection software. Synthesized signals were not used for the UAVSig dataset, as signal addition might introduce unrealistic RF characteristics: synthesizing multiple transmitter signals together could fail to capture more complex interactions between simultaneous transmissions.

C. Processing

To utilize the collected data for training and testing our model, we need to label each transmission with its start time, end time, center frequency, bandwidth, and source transmitter identity. To generate the time and frequency labels, we adopt a digital signal processing (DSP) procedure as depicted in Fig. 5. Then, to manually assign the transmitter identity labels, we consider each scenario separately. For a scenario where only one transmitter is on, all the detected transmissions in the 6 captures are assigned the same transmitter label. For the scenarios where two drones are on, as the channels of the drones are recorded for each scenario, we assign the drone labels by comparing the estimated and recorded true center frequencies of the transmissions. Finally, for the scenarios where multiple controllers are on, we cannot control their hopping sequence and thus can only assign non-deterministic labels for the transmissions: all active controllers are listed in these labels. It should be noted that while the DSP approach can help to label the transmissions, it cannot fingerprint the UAVs at the same time. Therefore, we present our spectrogram-based one-stage deep learning model to solve the detection and fingerprinting tasks simultaneously. After the labeling, we split each 1-second signal into 5.2-millisecond segments to generate the 512 × 512 spectrograms. Examples of labeled spectrograms are shown in Fig. 6.

Fig. 5: DSP label generation workflow (transmission exists? — no: end; yes: time-domain labels, then FFT for frequency labels).

Fig. 6: Example labeled spectrograms of signals in the UAVSig dataset, where the transmissions are bounded with red boxes (axes: time 0–5.2 ms, frequency 2415–2465 MHz, power in dB; frequency resolution 97.7 kHz). (a) shows a spectrogram with drone 1 on channel 3 and drone 3 on channel 1, and (b) shows a spectrogram with all four controllers active.

IV. EVALUATION

To examine the model performance, an intersection-over-union (IoU) threshold is used, where IoU is the ratio of the intersected area over the union area of two bounding boxes. In the scope of this work, we consider a prediction as correct if it has an IoU ≥ 0.5 with the real transmission. Consequently, to evaluate the transmission detection performance of the model, we report the precision and recall of the predictions, which are calculated as shown in (6) and (7), where TP, FP and FN denote true positives, false positives and false negatives, respectively. Then, to evaluate the UAV fingerprinting performance, we consider the classification accuracy, which is computed within the correctly detected transmissions. In the evaluation, we consider a baseline model from [16], because it is designed for the most similar problem, which is to detect UAV transmissions and classify UAV models. The baseline model adopts the YOLO-lite [18] architecture with some adaptation. To ensure a fair comparison, we use the same input shape of (512 × 512) for both our model and the baseline model.

The details of the experiments are explained in the rest of the section. For all the evaluations, we use 5 captures of a scenario for training and the remaining capture in the same scenario for testing.

    Precision = TP / (TP + FP)    (6)

    Recall = TP / (TP + FN)    (7)
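The IoU criterion and the metrics in (6) and (7) can be sketched as follows; the corner-coordinate box format (x1, y1, x2, y2) is an assumed convention for illustration:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def precision_recall(tp, fp, fn):
    """Equations (6) and (7)."""
    return tp / (tp + fp), tp / (tp + fn)

# A prediction counts as correct when IoU >= 0.5 with the true box.
print(iou((0, 0, 2, 2), (1, 0, 3, 2)))  # 2 / 6 ≈ 0.333
```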
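The DSP label generation workflow of Fig. 5 (an energy threshold for the time-domain labels, then an FFT for the frequency labels) can be sketched as follows. The threshold value and single-burst assumption are illustrative, not the exact procedure used for UAVSig:

```python
import numpy as np

FS = 50e6    # sample rate (Sa/s) of the captures
NFFT = 512   # 512-point FFT -> 50 MHz / 512 ≈ 97.7 kHz resolution;
             # 512 * 512 samples per spectrogram segment ≈ 5.24 ms at 50 MSa/s

def label_transmission(iq, threshold_db=-60.0):
    """Return (start_idx, stop_idx, freq_offset_hz) for one burst in `iq`."""
    power_db = 10 * np.log10(np.abs(iq) ** 2 + 1e-12)
    active = np.flatnonzero(power_db > threshold_db)
    if active.size == 0:
        return None                      # no transmission in this capture
    start, stop = active[0], active[-1]  # time-domain labels
    # Frequency label: FFT peak of the active samples, as offset from the
    # capture's center frequency.
    spec = np.abs(np.fft.fftshift(np.fft.fft(iq[start:stop + 1], NFFT)))
    freqs = np.fft.fftshift(np.fft.fftfreq(NFFT, d=1 / FS))
    return start, stop, freqs[int(np.argmax(spec))]
```

Under this sketch, the drone labels of Section III-C follow by matching the estimated frequency offset against the recorded channel assignments.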
The results are shown in Table II, where the performance of our model is in the "train on one-drone, test on one-drone" column. It can be observed that our model is able to correctly detect 98.2% of the drones' transmissions, with a limited number of false positive predictions. Moreover, within the successfully detected transmissions, we can classify between the drones with an accuracy of 81.6%. At the same time, while the baseline model is able to detect most of the transmissions, it also makes many false predictions and fails at the drone fingerprinting task. This result shows that, compared to the UAV classification problem, the fingerprinting problem is indeed more challenging and warrants a more complex network architecture; a carefully designed deep learning model, however, is able to solve it. As the baseline model cannot reliably fingerprint the drones, we only evaluate our model in the following cases.

Finally, the confusion matrix of our model in this case is presented in Fig. 7, where an integer value in the confusion matrix represents the number of bounding boxes. As shown in the confusion matrix, the model tends to misclassify between the pair of drone 1 and drone 2, and also the pair of drone 3 and drone 4. This result shows that there might be some subtle hardware similarities within each pair of drones. These observations can be helpful for future security design, such as preventing possible spoofing attacks.

Fig. 7: Confusion matrix of our model trained and tested on one-drone data (rows: true label; columns: predicted label, drone 1–4; recovered rows): drone 1: 438, 132, 33, 2; drone 2: 103, 491, 10, 1.

Fig. 8: Confusion matrix of our model when it is trained on two-drone and also tested on two-drone data (recovered rows): drone 3: 162, 99, 5107, 222; drone 4: 51, 9, 7, 5431.

(a) Confusion matrix with rows drone 1: 2296, 1643, 322, 75; drone 2: 1952, 2918, 113, 15; drone 3: 2373, 951, 1478, 233; drone 4: 1984, 896, 1172, 190. (b) Confusion matrix with rows drone 1: 167, 171, 177, 90; drone 2: 180, 378, 27, 18; drone 3: 1, 3, 505, 97; drone 4: 0, 0, 465, 122.
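The classification accuracy reported above is the trace of a confusion matrix over its sum, i.e. correctly classified bounding boxes over all detected boxes. A sketch, using the two recovered rows of Fig. 7 restricted to drones 1 and 2 purely as illustrative input:

```python
import numpy as np

def accuracy_from_confusion(cm):
    """Overall accuracy: correctly classified boxes / all detected boxes."""
    cm = np.asarray(cm, dtype=float)
    return np.trace(cm) / cm.sum()

# Illustrative 2x2 slice built from the drone 1 / drone 2 rows of Fig. 7,
# with columns restricted to the same two drones (for illustration only).
cm = [[438, 132],
      [103, 491]]
print(round(accuracy_from_confusion(cm), 3))  # → 0.798
```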
As shown in Section III, compared to drones, the controllers transmit with much smaller bandwidth, shorter time duration, and longer intervals. Therefore, the detection and spectrum localization of controller transmissions could be a harder problem. We include all 15 controller scenarios for training. Since we do not have precise ground truth controller labels for the multiple controller scenarios, we cannot fingerprint controllers in this case. The detection and spectrum localization results are presented in the last column of Table II. Both the precision and recall slightly decrease compared to the drone transmission detection results. This degradation confirms our intuition that the detection and spectrum localization of controller transmissions is more difficult than that of drone transmissions.

V. CONCLUSION

In this work, we considered possible security threats brought by UAVs. Specifically, to ensure UAV compliance and security, it is important to solve the UAV detection, spectrum localization and fingerprinting problem. First, we presented a one-stage deep learning model to detect and fingerprint the UAVs using spectrograms. Then, to evaluate the model, we collected UAVSig, an OTA-captured RF dataset for the UAVs¹. Using this dataset, our model can detect the drone transmissions with 99.5% precision and 99.4% recall, and fingerprint the drones with 90.9% accuracy. In the evaluations, we focused on the drone transmissions, and also observed model performance degradation in certain scenarios, which can be further analyzed for a more robust model.

In the future, we will improve the generalizability of UAVSig and our RF fingerprinting model. We will augment the existing dataset with more OTA captures to include more UAVs and diversified scenarios, such as the varying SNRs found in UAV swarms. While increasing the training data size can improve the performance, capturing data for all possible combinations of the UAVs is unrealistic, especially with many UAVs. Therefore, future data collection must introduce enough variability within a reasonable amount of data. For the model, we will also investigate and improve its robustness in different scenarios, such as open-set scenarios [19] due to adversarial attacks, and also domain-shift problems due to channel [20] and receiver [21] variations.

¹Dataset available at: https://cores.ee.ucla.edu/downloads/datasets/uavsig/.

REFERENCES

[1] M. Mozaffari, W. Saad, M. Bennis, Y.-H. Nam, and M. Debbah, "A Tutorial on UAVs for Wireless Networks: Applications, Challenges, and Open Problems," IEEE Communications Surveys & Tutorials, vol. 21, no. 3, pp. 2334–2360, 2019.
[2] B. Taha and A. Shoufan, "Machine Learning-Based Drone Detection and Classification: State-of-the-Art in Research," IEEE Access, vol. 7, pp. 138669–138682, 2019.
[3] H. Zhang, T. Li, Y. Li, J. Li, O. A. Dobre, and Z. Wen, "RF-Based Drone Classification Under Complex Electromagnetic Environments Using Deep Learning," IEEE Sensors Journal, vol. 23, no. 6, pp. 6099–6108, 2023.
[4] Y. Chen, L. Zhu, Y. Jiao, C. Yao, K. Cheng, and Y. Gu, "An Extreme Value Theory-Based Approach for Reliable Drone RF Signal Identification," IEEE Transactions on Cognitive Communications and Networking, vol. 10, no. 2, pp. 454–469, 2024.
[5] G. Reus-Muns and K. R. Chowdhury, "Classifying UAVs With Proprietary Waveforms via Preamble Feature Extraction and Federated Learning," IEEE Transactions on Vehicular Technology, vol. 70, no. 7, pp. 6279–6290, 2021.
[6] I. Guvenc, F. Koohifar, S. Singh, M. L. Sichitiu, and D. Matolak, "Detection, Tracking, and Interdiction for Amateur Drones," IEEE Communications Magazine, vol. 56, no. 4, pp. 75–81, 2018.
[7] M. Ezuma, F. Erden, C. K. Anjinappa, O. Ozdemir, and I. Guvenc, "Drone Remote Controller RF Signal Dataset." IEEE Dataport, doi: 10.21227/ss99-8d56, 2020.
[8] C. J. Swinney and J. C. Woods, "DroneDetect Dataset: A Radio Frequency dataset of Unmanned Aerial System (UAS) Signals for Machine Learning Detection & Classification." IEEE Dataport, doi: 10.21227/5jjj-1m32, 2021.
[9] O. Medaiyese, M. Ezuma, A. Lauf, and A. Adeniran, "Cardinal RF (CardRF): An Outdoor UAV/UAS/Drone RF Signals with Bluetooth and WiFi Signals Dataset." IEEE Dataport, doi: 10.21227/1xp7-ge95, 2022.
[10] N. Soltani, G. Reus-Muns, B. Salehi, J. Dy, S. Ioannidis, and K. Chowdhury, "RF Fingerprinting Unmanned Aerial Vehicles With Non-Standard Transmitter Waveforms," IEEE Transactions on Vehicular Technology, vol. 69, no. 12, pp. 15518–15531, 2020.
[11] S. Basak, S. Pollin, and B. Scheers, "Drone RF Dataset." KU Leuven RDR, doi: 10.48804/HZRVNZ, 2024.
[12] D. Uvaydov, M. Zhang, C. P. Robinson, S. D'Oro, T. Melodia, and F. Restuccia, "Stitching the Spectrum: Semantic Spectrum Segmentation with Wideband Signal Stitching." arXiv:2402.03465, 2024.
[13] E. Ozturk, F. Erden, and I. Guvenc, "RF-Based Low-SNR Classification of UAVs Using Convolutional Neural Networks." arXiv:2009.05519, 2020.
[14] M. Ezuma, F. Erden, C. K. Anjinappa, O. Ozdemir, and I. Guvenc, "Detection and Classification of UAVs Using RF Fingerprints in the Presence of Wi-Fi and Bluetooth Interference," IEEE Open Journal of the Communications Society, vol. 1, pp. 60–76, 2020.
[15] K. Raina, T. Alladi, V. Chamola, and F. R. Yu, "Detecting UAV Presence Using Convolution Feature Vectors in Light Gradient Boosting Machine," IEEE Transactions on Vehicular Technology, vol. 72, no. 4, pp. 4332–4341, 2023.
[16] S. Basak, S. Rajendran, S. Pollin, and B. Scheers, "Combined RF-Based Drone Detection and Classification," IEEE Transactions on Cognitive Communications and Networking, vol. 8, no. 1, pp. 111–120, 2022.
[17] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection." arXiv:1506.02640, 2016.
[18] R. Huang, J. Pedoeem, and C. Chen, "YOLO-LITE: A Real-Time Object Detection Algorithm Optimized for Non-GPU Computers," in 2018 IEEE International Conference on Big Data (Big Data), Los Alamitos, CA, USA, pp. 2503–2510, IEEE Computer Society, Dec. 2018.
[19] T. Zhao, S. Sarkar, Y. Tian, and D. Cabric, "Anomaly Transmitter Recognition and Tracking," in 2024 IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN), pp. 357–364, 2024.
[20] G. Shen, J. Zhang, A. Marshall, and J. R. Cavallaro, "Towards Scalable and Channel-Robust Radio Frequency Fingerprint Identification for LoRa," IEEE Transactions on Information Forensics and Security, vol. 17, pp. 774–787, 2022.
[21] T. Zhao, S. Sarkar, E. Krijestorac, and D. Cabric, "GAN-RXA: A Practical Scalable Solution to Receiver-Agnostic Transmitter Fingerprinting," IEEE Transactions on Cognitive Communications and Networking, vol. 10, no. 2, pp. 403–416, 2024.