
Ref 14

This paper discusses a method for drone detection and classification using deep learning, specifically employing the YOLOv3 object detector. The model is trained on a dataset of over 10,000 images of various types of drones, achieving a mean average precision of 0.74 after 150 epochs. The study highlights the challenges of detecting drones due to their small size and varying appearances, suggesting potential improvements through additional detection mechanisms.


Proceedings of the International Conference on Intelligent Computing and Control Systems (ICICCS 2020)

IEEE Xplore Part Number: CFP20K74-ART; ISBN: 978-1-7281-4876-2

Drone Detection and Classification using Deep Learning

Dinesh Kumar Behera
Dept. of Aerospace Engineering
Defence Institute of Advanced Technology
Pune, India
[email protected]

Arockia Bazil Raj
Dept. of Electronics Engineering
Defence Institute of Advanced Technology
Pune, India
[email protected]
Abstract— This paper presents a systematic approach to drone detection and classification using deep learning with different modalities. The YOLOv3 object detector is used to detect moving or still objects. It uses a computer vision-based approach as a reliable solution. The convolutional neural network helps to extract features from images and to detect the object with maximum accuracy. The model is trained on a proper dataset, for 150 epochs only, to detect various types of drones. A convolutional neural network with modern object detection methods shows an excellent approach for real-time detection of drones.

Keywords— Deep learning, Convolutional neural network, object detector

I. INTRODUCTION

The drone industry has expanded exponentially, making these gadgets reachable to common citizens at cheaper prices. By loading them with explosives, drones can easily be converted into killer weapons, and some attempted terror attacks using drones have already been reported [1, 17]. Because drones are small in size and have a small electromagnetic signature, they are difficult for conventional RADAR to detect. Counter mechanisms are therefore being appraised by industry and the academic world.

An object has a specific structure and texture as well as some specific pattern. In natural environments, it is difficult to differentiate between objects of the same type because of high variation. The performance of an object detector degrades with lighting conditions, changes of appearance, and the angle at which the object faces the camera. Most object detectors fail when the object is deformed or changes in scale [3, 4]. Noise and cluttered backgrounds add further difficulties for the object detector. In modern object detection, the convolutional neural network (CNN) [14] has performed so well that traditional methods have almost vanished from the picture. The best part of the convolutional neural network is its ability to extract features. Based on the convolutional neural network, many object detectors have come into the picture, such as R-CNN [15], Fast R-CNN [7], Faster R-CNN [9], YOLO [6], SSD [12], etc. Apart from that, high-performance GPUs, made easily available through high-performance cloud computing, have advanced computational ability and played a crucial role in the success of neural networks. Deep learning architectures allow features to be extracted that are easier to understand and more reliable than those of conventional machine learning approaches [4-5].

In this study, experiments with the latest object detectors, based on the deep learning approach, are carried out to detect drones. While detecting, the model also classifies the type, such as tricopter, quadcopter, or hexacopter. The model is trained on a proper dataset so that if, due to some orientation or scaling issue, it cannot determine the type, it still detects the drone. In figure 1, some images from the dataset are shown with ground truth annotation [6-8].

Fig 1. Sample images from the dataset

II. LITERATURE REVIEW

In the past few years, many object detectors have been proposed in the field of deep learning, such as R-CNN [15], Fast R-CNN [7], Faster R-CNN [9], YOLO [6], SSD [12], etc. CNN brought a revolutionary change to the field of computer vision and object detection. CNN is a hierarchical model that can be trained to perform a variety of detection, recognition, and segmentation tasks. It conducts a region-based approach in which characteristics are extracted in a layer hierarchy, where lower layers in the network extract

978-1-7281-4876-2/20/$31.00 ©2020 IEEE 1012

Authorized licensed use limited to: University of Exeter. Downloaded on June 25,2020 at 12:02:19 UTC from IEEE Xplore. Restrictions apply.

lower-level characteristics such as edges, while middle layers display droplet-like structures [18-20].

R-CNN, Fast R-CNN, and Faster R-CNN are the modern region-based approaches. R-CNN and Fast R-CNN use selective search methods to find object proposals at different scales and at different positions in the image. For all of these, the image is compressed to a fixed size of (227x227) pixels before being fed to the CNN model. R-CNN collects 2000 regions using the selective search [21] method, which gathers similar-looking sub-regions and tries to merge them into a final larger region. A classifier then has to assign the resulting region to one of the classes. But selective search does not fulfil the proper requirements, so a linear regression method is used to map the predicted bounding box to the ground-truth bounding box. An SVM classifier [8] is then used in an offline manner; for every classified region, one SVM classifier is used. R-CNN takes more time in the testing procedure, as its training pipeline is very complex. In the case of Fast R-CNN, the feeding approach is the same as in R-CNN. After the image is fed to the CNN, it creates a CNN feature map. This feature map is fed to the RoI pooling layer, which compresses the regions into square sizes and feeds them to fully connected layers to obtain image vectors. Then, using the softmax layer, it predicts offset values for the bounding boxes as well as the class to which the bounded region belongs.

In the case of Faster R-CNN, instead of using the selective search method to find regions from the feature map of the CNN layer, a separate region proposal network is used. Its output is fed to the RoI pooling layer, and finally the softmax layer is used for classification.

The network generates a class likelihood for each bounding box together with offset values for the bounding box. A threshold is set on the predicted class probability of a bounding box: if the value is more than the threshold, that region has the maximum probability of containing the object in the image. SSD, in contrast, is a feed-forward CNN that generates fixed-sized bounding boxes, each with a confidence score for every class, to locate the object inside the image. Its architecture is based on the VGG-16 (Visual Geometry Group-16) [13] architecture, with the fully connected layers not taken into consideration. In terms of high-quality image classification, VGG-16 is very accurate. To extract features at different scales, a collection of convolutional layers is added on top of the VGG-16 layers.

Based on this brief description of object detectors, YOLO was considered in this study for experiments on drone detection and classification.

III. METHODOLOGY

In this work, an approach is proposed to detect the target, moving or still, using the YOLOv3 [11] object detector to get maximum accuracy. YOLOv3 is the latest model of the YOLO family. YOLO uses only a convolutional neural network to extract features from images, with some small changes for better and faster performance. DarkNet-53 is used in YOLOv3, which has 53 convolutional layers for extracting features from the images. The architecture of DarkNet-53 is shown in figure 2 [27, 28].

Fig 2. DarkNet53 Architecture

YOLOv3 follows the same prediction method as YOLO9000 [10] with some small changes. It uses dimension clusters as anchor boxes for predicting bounding boxes. The equations given below show the calculation of the coordinates of a bounding box [29, 30]:

b_x = σ(t_x) + c_x  (1)
b_y = σ(t_y) + c_y  (2)
b_w = p_w e^(t_w)  (3)
b_h = p_h e^(t_h)  (4)

where t_x and t_y are the predicted x and y coordinates of the bounding box, c_x and c_y are the top-left coordinates of the grid cell once the offset is taken into consideration, t_w and t_h are the predicted width and height of each bounding box, p_w and p_h are the width and height of the anchor (prior) box, and σ is the sigmoid function. For predicting an objectness score, YOLOv3 uses logistic regression, and for class prediction it uses a binary cross-entropy loss function.

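As a concrete illustration, the box decoding in equations (1)-(4) can be sketched in a few lines of Python. This is a minimal sketch, not the authors' code; the cell coordinates and anchor dimensions in the example are hypothetical values.

```python
import math

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """Decode raw network outputs (tx, ty, tw, th) into a bounding box,
    following equations (1)-(4): the centre offsets pass through a
    sigmoid, and the width/height scale the anchor priors (pw, ph)."""
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
    bx = sigmoid(tx) + cx        # eq. (1)
    by = sigmoid(ty) + cy        # eq. (2)
    bw = pw * math.exp(tw)       # eq. (3)
    bh = ph * math.exp(th)       # eq. (4)
    return bx, by, bw, bh

# e.g. zero offsets in the cell at (3, 4) with a 2x5 anchor box:
# decode_box(0, 0, 0, 0, 3, 4, 2, 5) -> (3.5, 4.5, 2.0, 5.0)
```

The sigmoid keeps the predicted centre inside its grid cell, which is why the same logistic function also serves for the objectness score mentioned above.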

It predicts boxes at 3 different scales. As it uses a convolutional neural network as its main structure, it extracts features from boxes at different scales using a concept similar to a feature pyramid network. When a bounding box prior overlaps the ground-truth object more than every other bounding box prior, its objectness value is 1. If a bounding box prior is not the best but overlaps a ground-truth object by more than the threshold value, that prediction is ignored. In Darknet-53, the last few layers give a 3-d tensor which carries the information regarding the bounding box, objectness value, and class scores [31, 32]. For class prediction, each box predicts the classes which the bounding box can contain using multi-label classification.

PyTorch, an open-source machine learning library based on the Torch library, is used as the platform. It is developed by Facebook's AI research lab for natural language processing and computer vision applications. PyTorch has two interfaces, Python and C++, but the Python interface is the more refined one, as Python is the most used language for AI projects.

Fig 3. Flowchart of the complete experiment

IV. RESULTS FROM EXPERIMENT

A. Dataset
The dataset was made by collecting images from the internet and extracting frames from videos of different types of drones. There are more than 10000 images of different categories of drones. The type of drone is differentiated based on its number of rotors; in this work, all drones are multi-rotor drones: a drone with three rotors is a tricopter, one with four rotors is a quadcopter, and one with six rotors is a hexacopter. If, due to some lighting or viewpoint issue, the type cannot be differentiated, those images are placed in a generic drone category to train the model. The drones appear in the images at different scales, orientations, viewpoints, and illuminations. The annotations give the height, width, top-left (x, y) coordinate, and type of drone for the ground-truth bounding box. For this experiment, the annotations are taken in YOLO format.

B. Performance of Dataset Training
The model is trained on an NVIDIA GeForce GTX 1050 Ti GPU with a learning rate of 0.0001 and a batch size of 64. The performance of the model is analyzed at different iterations: every 10 epochs, the model's loss rate, precision, recall, etc. are saved during training, which is kept running for 150 epochs. To understand the training procedure and how good the training is, the loss, precision, and recall values are calculated. The graphs of the model performance are shown in figure 4. To evaluate the detection performance, the mean average precision (mAP) value is calculated. The results show the best performance of the model is 0.74 at the 150th epoch.

Fig 4. Results from training

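The YOLO-format annotations described in the Dataset subsection can be sketched as follows, assuming the common YOLO txt label convention (one object per line: class index followed by box centre, width, and height, all normalized to [0, 1]). This is an illustrative sketch, not the authors' code; the class index and image size in the example are hypothetical.

```python
def yolo_label_to_pixel_box(line, img_w, img_h):
    """Convert one YOLO-format label line into
    (class_id, top-left x, top-left y, width, height) in pixels."""
    cls, cx, cy, w, h = line.split()
    cx, w = float(cx) * img_w, float(w) * img_w
    cy, h = float(cy) * img_h, float(h) * img_h
    x_tl = cx - w / 2.0  # YOLO stores the box centre, so recover the corner
    y_tl = cy - h / 2.0
    return int(cls), x_tl, y_tl, w, h

# e.g. a hypothetical class-1 drone centred in a 640x480 frame:
# yolo_label_to_pixel_box("1 0.5 0.5 0.25 0.5", 640, 480)
# -> (1, 240.0, 120.0, 160.0, 240.0)
```

Normalizing the coordinates this way is what lets the same label file serve images shown at different scales, matching the scale variation the dataset is built around.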

C. Visual Analysis of Test Results

In figure 5, some results of drone detection on images are shown. The testing images are different from the training images. The 4 images are of different types of drones from different viewpoints: the first image is detected as a quadcopter, the second as a hexacopter, the third as a generic drone, and the fourth as a tricopter.

Fig 5. Results from test images

V. CONCLUSION

Due to the big architecture of the YOLOv3 model and the small number of classes, the model is trained for 150 epochs only. In some cases, the model is unable to detect the correct drone type. To improve accuracy, a new kind of counter mechanism can be integrated, such as RF signal detection, in which the RF signal between the operator and the drone is detected. X-band RADAR and micro-Doppler RADAR are newer methods. A modern acoustic system can detect a drone from its blade sound and is also able to detect the type of drone.

REFERENCES
[1] A. Rohan, M. Rabah, and S.-H. Kim, "Convolutional Neural Network-Based Real-Time Object Detection and Tracking for Parrot AR Drone 2," School of Electronics and Information Engineering / Department of Control and Robotics Engineering, Kunsan National University, Gunsan, South Korea.
[2] J. Lee, J. Wang, D. Crandall, S. Sabanovic, and G. Fox, "Real-Time, Cloud-based Object Detection for Unmanned Aerial Vehicles," School of Informatics and Computing, Indiana University.
[3] E. Unlu, E. Zenou, and N. Riviere, "Using Shape Descriptors for UAV Detection," Electronic Imaging 2017, Jan. 2018, Burlingame, United States, pp. 1-5.
[4] D. R. Lee, W. G. La, and H. Kim, "Drone Detection and Identification System using Artificial Intelligence," School of Electrical Engineering, Korea University, Seoul, Rep.
[5] M. Saqib, N. Sharma, and S. D. Khan, "A Study on Detecting Drones Using Deep Convolutional Neural Networks."
[6] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 779-788.
[7] R. Girshick, "Fast R-CNN," in Proceedings of the IEEE International Conference on Computer Vision, 2015.
[8] T. Evgeniou and M. Pontil, "Support Vector Machines: Theory and Applications."
[9] S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," in Advances in Neural Information Processing Systems, pp. 91-99, 2015.
[10] J. Redmon and A. Farhadi, "YOLO9000: Better, Faster, Stronger," University of Washington, Allen Institute for AI.
[11] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," University of Washington.
[12] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, "SSD: Single Shot MultiBox Detector."
[13] X. Zhang, J. Zou, K. He, and J. Sun, "Accelerating Very Deep Convolutional Networks for Classification and Detection."
[14] M. D. Zeiler and R. Fergus, "Visualizing and Understanding Convolutional Networks," Dept. of Computer Science, New York University, USA.
[15] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 580-587, 2014.
[16] J. R. R. Uijlings, K. E. A. van de Sande, T. Gevers, and A. W. M. Smeulders, "Selective Search for Object Recognition."
[17] A. Arockia Bazil Raj and H. C. Kumawat, "Extraction of Doppler Signature of micro-to-macro rotations/motions using CW Radar assisted measurement system," IET Science, Measurement & Technology, accepted Mar. 2020. [SCI – IET; IF: 1.895]
[18] D. V. Thiruvoth, B. Pawan Kumar, V. S. Kumar, A. A. Bazil Raj, and R. D. Gupta, "Dual-Band Shared-Aperture Reflectarray Antenna Element at Ku-Band for the TT&C Application of a Geostationary Satellite," IEEE Inter. Conf. 'Recent Trends on Electronics, Information, Communication and Technology (RTEICT)', 2019.
[19] A. Gupta and A. A. Bazil Raj, "Feature Extraction of Intra-Pulse Modulated LPI Waveforms Using STFT," IEEE Inter. Conf. 'Recent Trends on Electronics, Information, Communication and Technology (RTEICT)', 2019, pp. 90-95.
[20] S. Batabyal and A. A. Bazil Raj, "Design of Ring Oscillator Based PUF with Enhanced Challenge Response Pair and Improved Reliability," IEEE Inter. Conf. 'Recent Trends on Electronics, Information, Communication and Technology (RTEICT)', 2019, pp. 1-5.
[21] L. Prasad and A. A. Bazil Raj, "Design of 2D-WH/TS OCDMA PON ONU Receiver with FLC Technique," IEEE Inter. Conf. 'Recent Trends on Electronics, Information, Communication and Technology (RTEICT)', 2019, pp. 90-95.
[22] R. Vaishnavi, G. Unnikrishnan, and A. A. Bazil Raj, "Implementation of Algorithm for Point Target Detection and Tracking in Infrared Image Sequence," IEEE Inter. Conf. 'Recent Trends on Electronics, Information, Communication and Technology (RTEICT)', 2019, pp. 1-5.
[23] S. R. Nishad and A. A. Bazil Raj, "Sliding Mode Control of Robotic Gait Simulator," IEEE Int. Conf. 'Intelligent Computing and Control Systems (ICCS-2019)', pp. 1-6.
[24] P. Shakya and A. A. Bazil Raj, "Inverse Synthetic Aperture Radar Imaging Using Fourier Transform Technique," IEEE Inter. Conf. 'Innovations in Information and Communication Technology (ICIICT)', 2019, pp. 1-4.
[25] E. Unlu, E. Zenou, and N. Riviere, "Using Shape Descriptors for UAV Detection," Electronic Imaging 2017, Jan. 2018, Burlingame, United States, pp. 1-5.
[26] U. Garg, A. A. Bazil Raj, and K. P. Ray, "Cognitive Radar Assisted Target Tracking: A Study," IEEE Inter. Conf. 'Communication and Electronics Systems (ICCES)', 2018, pp. 427-430.
[27] A. Arockia Bazil Raj et al., "Multi-Bit Digital Receiver Design For Radar Signature Estimation," IEEE Int. Conf. 'Recent Trends in Electronics, Communication and Information Technology 2018', pp. 1-6.
[28] A. Arockia Bazil Raj et al., "Design and Evaluation of C-band FMCW Radar System," IEEE Int. Conf. 'Trends in Electronics and Informatics 2018', pp. 1-5.
[29] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 779-788.
[30] A. Arockia Bazil Raj et al., "Prehistoric man's fire to today's free space optical communication: technology and advancements," IEEE Communications Surveys and Tutorials, R1 submitted, Mar. 2020.
[31] A. A. Bazil Raj, "FPGA-Based Embedded System Developer's Guide," 1st ed., USA: CRC Press, 2018.

