Object Detection Using You Only Look Once (YOLO) Algorithm in Convolution Neural Network (CNN)
6 authors, including:
Jayshree Das
BVRIT
All content following this page was uploaded by Jayshree Das on 04 July 2023.
Abstract: Object detection is a computer vision technique that enables us to recognize and locate items in an image or video. Many algorithms are available to meet this demand, including the Region-based Convolutional Neural Network (RCNN) and the Fast Region-based Convolutional Neural Network (FRCNN). In this work, however, the You Only Look Once (YOLO) V3 technique is proposed for object detection. The program can find and identify different objects almost instantly. YOLO treats object detection as a regression problem and provides the class probabilities of the detected objects. Python and OpenCV are used to implement the work. The work shows that the YOLO algorithm detects objects with higher accuracy.

Keywords: Computer Vision, Neural Network, OpenCV, You Only Look Once

I. INTRODUCTION

There are several methods for object detection, including RCNN and Fast RCNN. Even though these techniques have overcome the constraints of limited data and modelling in object detection, they cannot find objects in a single run of the algorithm. While other algorithms may require many runs to detect an item, You Only Look Once (YOLO) can do so in only one. As the name implies, the approach requires just a single forward propagation through a neural network to identify objects [1].

Autonomous driving technology also relies heavily on object detection. Many automakers employ it in conjunction with image recognition software to enable the AI sensors in their vehicles to operate safely, identify traffic, produce 3D maps, and navigate without a driver [2].

II. VARIOUS ALGORITHMS

A. Region based Convolutional Neural Network (RCNN)

The RCNN is a machine learning model used in image processing and computer vision. Its primary objective is to identify the objects in a picture by drawing bounding boxes around them [3]. The disadvantage of RCNN is that it employs three models: a CNN for feature extraction, a linear SVM for object identification, and a regression model for bounding boxes. Using so many models lengthens the runtime; a result takes about 45-50 seconds to obtain.

B. Fast-RCNN

Fast RCNN leverages past work and deep convolutional neural networks (DCNN) to perform this task and provide effective results quickly. Fast RCNN is made up of a CNN whose final pooling layer is replaced by an "RoI pooling" layer and whose final fully connected (FC) layer is replaced by two branches: a (K + 1)-category softmax branch and a category-specific bounding box regression branch. The strategy is comparable to the RCNN algorithm. However, rather than feeding the region proposals to the CNN, we send the input picture to the CNN to create a convolutional feature map. The region proposals are located from the convolutional feature map, warped into squares, and then reshaped by an RoI pooling layer into a fixed size to be input into a fully connected layer [4].

C. Single Shot Detection (SSD)

SSD is a single-shot detector. It predicts the bounding boxes and the classes in a single pass, without the need for a dedicated region proposal network. SSD adds two new features to increase accuracy: small convolutional filters to predict object classes, and offsets for default bounding boxes. Although the precision of region-proposal-based detection is said to be state-of-the-art, the complete process runs at only 7 fps, far below what real-time processing demands. By removing the requirement for the region proposal network, SSD accelerates the process. SSD makes several adjustments, such as multi-scale features and default boxes, to make up for the decline in accuracy [5].

D. You Only Look Once (YOLO)

The YOLO algorithm uses convolutional neural networks (CNN) to detect objects. As the name suggests, it requires only a single forward propagation, which means that detection is completed in a single run. The CNN simultaneously predicts the different class probabilities and bounding boxes. YOLO works by dividing the image into a grid, as shown in the illustration. The size of the grid is S x S.
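
The S x S grid described above can be illustrated with a minimal sketch. It assumes the formulation of the original YOLO paper [6], in which the cell containing the centre of an object is responsible for detecting it and each cell predicts B boxes (x, y, w, h, confidence) plus C class probabilities. The function names and the parameters B and C are illustrative assumptions and do not come from this work; YOLO V3 additionally predicts at several scales, which this sketch does not cover.

```python
def responsible_cell(cx, cy, img_w, img_h, S):
    """Return (row, col) of the S x S grid cell that contains the box centre.

    cx, cy are the centre coordinates in pixels; img_w, img_h the image size.
    """
    col = min(int(cx / img_w * S), S - 1)  # clamp centres on the right edge
    row = min(int(cy / img_h * S), S - 1)  # clamp centres on the bottom edge
    return row, col


def output_size(S, B, C):
    """Number of values predicted per image in the original YOLO layout.

    Each of the S*S cells predicts B boxes with 5 values each
    (x, y, w, h, confidence) and C class probabilities.
    """
    return S * S * (B * 5 + C)
```

For example, with S = 7, B = 2 and C = 20 (the values used in [6]), `output_size(7, 2, 20)` gives 1470, and an object centred at pixel (300, 200) in a 448 x 448 image falls in cell (3, 4).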
has more characteristics and is easier to identify, YOLO weights are employed in this.
Then, after a picture is received as input, the image is divided into 1024x1024 squares.
Bounding boxes are constructed for the image, each with a confidence level for the detected item.
An object is recognized only if it is present in the training and test data set.
The result is the image with a box encircling the object and the confidence of the detection. Here, Mean Average Precision (mAP) is used to express the confidence level of the detection on a scale of 0-1.

IV. REAL TIME OBJECT DETECTION

Once the detection criteria are satisfied, the code is used in a real-time environment to detect several classes. However, this technique is restricted to a few classes.

A. Observations

The YOLO V3 weights are important, since the accuracy also depends on the number of features the image is subjected to.
The size of the picture affects how quickly objects are detected. The conversion process takes less time if the image's converted size is small, and the time required for detection increases if the converted image is large.
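
The effect of the converted image size can be made concrete with a small sketch. It assumes a square network input (416 and 608 are commonly used YOLO V3 input sizes, not values taken from this work) and that the convolutional workload grows roughly in proportion to the number of input pixels. Both function names are hypothetical.

```python
def letterbox_size(img_w, img_h, target):
    """Size of the image after scaling it to fit a target x target input
    while preserving the aspect ratio."""
    scale = min(target / img_w, target / img_h)
    return round(img_w * scale), round(img_h * scale)


def relative_cost(size_a, size_b):
    """Approximate ratio of convolutional work between two square input
    sizes, assuming the cost scales with the number of pixels."""
    return (size_a * size_a) / (size_b * size_b)
```

Under these assumptions, a 1280 x 720 frame is scaled to 416 x 234 for a 416 x 416 input, and a 608 x 608 input costs roughly 2.1 times as much as a 416 x 416 input, which is consistent with the observation that larger converted images take longer to process.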
B. Results
Fig. 3. Bird class detection
Bird class is detected with a confidence level of 0.87.
Method    mAP    FPS
SSD500    46.5   19
FRCNN     59.1   6
RFCN      51.9   12
YOLOV3    60.6   20
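
Each detection reported above consists of a bounding box, a class, and a confidence level between 0 and 1. The following is a minimal sketch of how such raw detections are commonly post-processed with a confidence threshold and non-maximum suppression. This work does not describe its own post-processing, so the thresholds (0.5 and 0.4), the box format (x1, y1, x2, y2), and the function names are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets, conf_thresh=0.5, iou_thresh=0.4):
    """Keep confident, non-overlapping detections.

    dets is a list of (box, confidence, label) tuples. Detections below
    conf_thresh are dropped; among the rest, a detection is suppressed if
    a higher-confidence detection of the same label overlaps it by more
    than iou_thresh.
    """
    candidates = sorted(
        (d for d in dets if d[1] >= conf_thresh),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for det in candidates:
        if all(det[2] != k[2] or iou(det[0], k[0]) <= iou_thresh for k in kept):
            kept.append(det)
    return kept
```

With this sketch, two heavily overlapping "bird" boxes with confidences 0.87 and 0.60 would be reduced to the single 0.87 detection, and a box with confidence 0.30 would be discarded by the threshold.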
Fig. 6. Apple class detection
Apple class is detected with different confidence levels.

Fig. 7. Fire Hydrant class detection
Fire Hydrant class is detected with a confidence level of 1.

V. CONCLUSION

In comparison to Fast R-CNN, RetinaNet, and other object identification methods, this method yields better results for object detection.
The observation is that the detection accuracy depends on the quality of the image, irrespective of whether it is a color or a black-and-white image.
Using this algorithm, any class can be detected; all that needs to be done is to train and test on a dataset in which images of the class are available. A limitation of this detection is that it cannot detect small objects unless the algorithm is trained on similar objects.

VI. FUTURE SCOPE

The use of weapons for illicit purposes is increasingly common nowadays. One such activity is mass shooting. By employing YOLO, it may be possible to locate weapons rapidly. Future work will expand object detection into specialized domains such as the autonomous field and the military field.
The results of this study can be utilized to enhance the accuracy of detecting distinct types of weapons, such as rifles and handguns. They can also help in preventing the mass shootings that have been happening in the recent past.

REFERENCES

[1] Zihong Shi, “Object Detection Models and Research Directions.” IEEE International Conference on Consumer Electronics and Computer Engineering (ICCECE), 2021.
[2] Abhishek Sarda, Shubhra Dixit, Anupama Bhan, “Object Detection for Autonomous Driving using YOLO algorithm.” 2021.
[3] Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik, “Region-based convolutional networks for accurate object detection and segmentation.” IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume: 38, Issue: 1, January 2016.
[4] Ross Girshick, “Fast R-CNN.” IEEE International Conference on Computer Vision, 2015.
[5] Qianjun Shuai, Xingwen Wu, “Object detection system based on SSD algorithm.” IEEE International Carnahan Conference on Security Technology, October 2020.
[6] Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi, “You Only Look Once: Unified, Real-Time Object Detection.” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016.
[7] Muhammed Kürşad Uçar, Majid Nour, Hatem Sindi, Kemal Polat, “The Effect of Training and Testing Process on Machine Learning in Biomedical Datasets.” May 2020.
[8] Y. Amit, P. Felzenszwalb, R. Girshick, “Object Detection.” 2020.
[9] Zhong-Qiu Zhao, Peng Zheng, Shou-Tao Xu, Xindong Wu, “Object Detection with Deep Learning: A Review.” IEEE Transactions on Neural Networks and Learning Systems, Volume: 30, Issue: 11, January 2019.
[10] Harsh Jain, Aditya Vikram, Mohana, Ankit Kashyap, Ayush Jain, “Weapon Detection using Artificial Intelligence and Deep Learning for Security Applications.” IEEE International Conference on Electronics and Sustainable Communication Systems (ICESC), 2020.