10 - CPU Based YOLO A Real Time Object Detection Algorithm
Authorized licensed use limited to: Auckland University of Technology. Downloaded on December 18,2020 at 13:18:33 UTC from IEEE Xplore. Restrictions apply.
$$
\mathrm{IoU} = \frac{\text{Area of Overlap}}{\text{Area of Union}} \tag{2}
$$

Fig. 2. Intersection Over Union of the predicted and ground-truth boxes: (a) Area of Overlap, (b) Area of Union.

At the same time, while generating bounding boxes, each grid cell predicts C conditional class probabilities for the object. The class-specific probability for each grid cell is:

$$
\Pr(\mathrm{Class}_i \mid \mathrm{Object}) * \Pr(\mathrm{Object}) * \mathrm{IoU} = \Pr(\mathrm{Class}_i) * \mathrm{IoU} \tag{3}
$$

TABLE I. COMPARISON BETWEEN DIFFERENT YOLO VERSIONS

                                                                 Performance Evaluation
Model              Input Image Size   Train Set       Test Set   mAP    FPS
YOLOv1 [13]        448x448            VOC 2007+2012   VOC 2007   63.4   45
Fast YOLOv1        448x448            VOC 2007+2012   VOC 2007   52.7   155
YOLOv2             416x416            VOC 2007+2012   VOC 2007   76.8   67
Tiny-YOLOv2 [11]   416x416            VOC 2007+2012   VOC 2007   57.1   207
YOLOv2             608x608            COCO [14]       COCO       48.1   40
YOLOv3 [12]        608x608            COCO            COCO       57.9   20

YOLO was later upgraded in versions such as YOLOv2 and YOLOv3 [12] to reduce localization errors and increase mean average precision (mAP). The FPS and mAP of the different versions of YOLO are shown in Table I.
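The overlap-over-union ratio of Eq. (2) and the class-specific score of Eq. (3) can be illustrated with a short Python sketch. The corner-based box format (x_min, y_min, x_max, y_max) and the function names are assumptions made for illustration; the paper does not specify how boxes are stored.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(pred: Box, truth: Box) -> float:
    """Area of overlap divided by area of union, as in Eq. (2)."""
    # Corners of the intersection rectangle.
    ix_min = max(pred[0], truth[0])
    iy_min = max(pred[1], truth[1])
    ix_max = min(pred[2], truth[2])
    iy_max = min(pred[3], truth[3])

    # Width and height are clamped at zero when the boxes do not overlap.
    overlap = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_truth = (truth[2] - truth[0]) * (truth[3] - truth[1])
    union = area_pred + area_truth - overlap

    return overlap / union if union > 0 else 0.0


def class_specific_score(p_class_given_object: float,
                         p_object: float,
                         iou_value: float) -> float:
    """Left-hand side of Eq. (3): Pr(Class_i | Object) * Pr(Object) * IoU."""
    return p_class_given_object * p_object * iou_value
```

For two 2x2 boxes offset by one pixel in each direction, the overlap is 1 and the union is 7, so `iou((0, 0, 2, 2), (1, 1, 3, 3))` evaluates to 1/7.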
YOLO uses the equation below to calculate the loss function and ultimately improve confidence [7]:

LossFunction:

$$
\begin{aligned}
& \lambda_{\mathrm{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\mathrm{obj}} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
& + \lambda_{\mathrm{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\mathrm{obj}} \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
& + \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\mathrm{obj}} \left( C_i - \hat{C}_i \right)^2 \\
& + \lambda_{\mathrm{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\mathrm{noobj}} \left( C_i - \hat{C}_i \right)^2 \\
& + \sum_{i=0}^{S^2} \mathbb{1}_{i}^{\mathrm{obj}} \sum_{c \in \mathrm{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
\tag{4}
$$

The loss function is used to correct the center and the bounding box of each prediction. Every input image is divided into an S × S grid, with B bounding boxes for each grid cell. The $x$ and $y$ variables are the coordinates of the center of each prediction, while $w$ and $h$ refer to the dimensions of the bounding box. The variable $\lambda_{\mathrm{coord}}$ is used to increase the emphasis on boxes with objects, and the variable $\lambda_{\mathrm{noobj}}$ decreases the emphasis on boxes with no objects. $C$ refers to the confidence score, and $p(c)$ refers to the classification prediction. $\mathbb{1}_{ij}^{\mathrm{obj}}$ is 1 if the $j$-th bounding box in the $i$-th cell predicts the object, and 0 otherwise. If the object is in cell $i$, then $\mathbb{1}_{i}^{\mathrm{obj}}$ is 1, and 0 otherwise. The loss function indicates the performance of the model: the lower the loss, the higher the performance [7]. The accuracy of the predictions generated by object detection models is calculated through the average precision equation, defined as [7]:

$$
\mathrm{AP} = \sum_{k} P(k)\, \Delta r(k) \tag{5}
$$

$P(k)$ refers to the precision at threshold $k$ and $\Delta r(k)$ refers to the change in recall [7].

B. Frameworks
To develop our model, we need to install a deep learning framework in which to run the YOLO algorithm. Several frameworks can run the algorithm, as discussed below:
• TensorFlow: A deep learning framework created by Google that can be used for designing, building, and training models. However, it needs a large amount of GPU power and favors Linux [15].
• Darknet: Developed on Linux by the developer of YOLO. It runs better on GPU-based computers [16].
• Darkflow: Made by adapting Darknet to TensorFlow; it works very fast on GPU-based computers. It also runs on CPU-based computers, but installation on Windows is difficult and it is very slow [17].
• OpenCV: Built by Intel, it also includes a deep learning framework. It works only on CPU, and installation on Windows is easy [10].

III. CPU BASED YOLO ARCHITECTURE
Our aim is to build a model (CPU Based YOLO) for CPU-based real-time object detection. The developers of YOLO used the Darknet [16] framework to run the algorithm. We ran YOLOv3 with Darkflow [17] and OpenCV [10], as only they are suitable for CPU Based YOLO. Our aim was to develop the model on the Windows operating system, but we found that installing Darkflow on Windows is a complex task. We nevertheless installed it and ran YOLOv3 [12] on it, but the result was poor: the FPS was too low and the startup time was too long. We then moved the task to OpenCV. We input a video through OpenCV and obtained a good FPS in the output. The architecture is shown in Fig. 3.

A. Setup
We loaded YOLOv3 and the COCO dataset [14] through the OpenCV framework. The procedure for loading the program is shown in Fig. 4.
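Fig. 4 is not reproduced in this excerpt, so the following is only a minimal sketch of how a Darknet-format YOLOv3 model and a COCO label file might be loaded through OpenCV's `dnn` module and pinned to the CPU. The helper names, the file paths, and the 416x416 input size are all assumptions chosen for illustration, not the authors' actual procedure.

```python
from pathlib import Path
from typing import List


def load_class_names(names_path: str) -> List[str]:
    """Read one class label per non-empty line of a label file."""
    lines = Path(names_path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def load_yolov3_on_cpu(cfg_path: str, weights_path: str):
    """Load a Darknet-format network with OpenCV and request CPU inference."""
    import cv2  # imported here so load_class_names works without OpenCV installed

    net = cv2.dnn.readNetFromDarknet(cfg_path, weights_path)
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)
    return net


def detect(net, frame, input_size=(416, 416)):
    """Run one forward pass and return the raw outputs of the output layers.

    input_size is an assumed value; the excerpt does not state the size used.
    """
    import cv2

    # Scale pixels to [0, 1], resize, and swap BGR to RGB.
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, input_size,
                                 swapRB=True, crop=False)
    net.setInput(blob)
    return net.forward(net.getUnconnectedOutLayersNames())
```

A call such as `load_yolov3_on_cpu("yolov3.cfg", "yolov3.weights")` presumes that both files have already been downloaded; these paths are placeholders.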
above 16 but not so accurate for tiny objects such as a cup, book, or cellphone. The result is shown in Fig. 7. We ran the same task on many videos and obtained almost the same result. Because we used OpenCV, the GPU was not used by the framework.

[Figure: block diagram with the labels YOLO, COCO dataset, OpenCv, and Real Time Object Detection]
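The text reports throughput in frames per second (FPS), but this excerpt does not say how it was measured. One common approach, given here as an assumption rather than the authors' method, is to divide the number of processed frames by the elapsed wall-clock time. The function name and the injectable `clock` parameter are hypothetical.

```python
import time
from typing import Callable, Iterable


def measure_fps(process_frame: Callable[[object], object],
                frames: Iterable[object],
                clock: Callable[[], float] = time.perf_counter) -> float:
    """Return the number of frames processed per second of elapsed time."""
    count = 0
    start = clock()
    for frame in frames:
        process_frame(frame)  # e.g. one detector forward pass on this frame
        count += 1
    elapsed = clock() - start
    # Guard against a zero-length interval on very fast or empty runs.
    return count / elapsed if elapsed > 0 else 0.0
```

Passing the clock as a parameter keeps the helper easy to check with a fake timer; in real use the default `time.perf_counter` is sufficient.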
[Figure: bar chart comparing mAP and FPS (vertical axis 0 to 80) for YOLO (GPU) [7], Fast R-CNN (GPU) [9], DPM (GPU) [18], Faster R-CNN (GPU) [8], SSD (GPU) [19], and CPU Based YOLO (Proposed Model)]
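The mAP values compared above are means of per-class average precision. As a minimal Python sketch of Eq. (5), the helper below sums precision times the change in recall, then averages over classes. It assumes the precision and recall values are ordered by increasing recall and that recall starts at 0 before the first threshold; neither detail is stated in the text, and the function names are illustrative.

```python
from typing import Sequence


def average_precision(precisions: Sequence[float],
                      recalls: Sequence[float]) -> float:
    """Sum of P(k) * delta r(k) over thresholds k, following Eq. (5)."""
    if len(precisions) != len(recalls):
        raise ValueError("precisions and recalls must have the same length")
    ap = 0.0
    previous_recall = 0.0  # assumption: recall is 0 before the first threshold
    for p_k, r_k in zip(precisions, recalls):
        ap += p_k * (r_k - previous_recall)  # P(k) times the change in recall
        previous_recall = r_k
    return ap


def mean_average_precision(per_class_ap: Sequence[float]) -> float:
    """Arithmetic mean of the per-class AP values."""
    if not per_class_ap:
        raise ValueError("at least one class is required")
    return sum(per_class_ap) / len(per_class_ap)
```

With precisions 1.0 and 0.5 at recalls 0.5 and 1.0, the sum is 1.0 × 0.5 + 0.5 × 0.5 = 0.75.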