International Journal of Engineering, Science & Information Technology (IJESTY)

Volume 2, No. 3 (2022) pp. 96-99

ISSN 2775-2674 (online)
Research Paper, Short Communication, Review, Technical Paper

Design of a Real-Time Object Detection Prototype System With YOLOv3 (You Only Look Once)
Chichi Rizka Gunawan1*, Nurdin2, Fajriana2
1Master Student Department of Information Technology, Universitas Malikussaleh, Aceh, Indonesia
2Department of Information Technology, Universitas Malikussaleh, Aceh, Indonesia
*Corresponding author E-mail: [email protected]

Manuscript received 15 April 2022; revised 1 May 2022; accepted 15 June 2022. Date of publication 25 July 2022

Object detection aims to establish the classification, concept estimation, and location of objects in an image. As one of the
fundamental computer vision problems, it provides valuable information for the semantic understanding of images and videos and is
associated with many applications, including image classification. Object detection has recently become one of the most active fields
in computer vision. The system in this study detects objects using YOLOv3. The You Only Look Once (YOLO) method is among the fastest
and most accurate methods for object detection and is reported to exceed the capabilities of other algorithms by a factor of two.
YOLO is very fast because a single neural network predicts bounding boxes and class probabilities directly from the whole image in
one evaluation. The objects studied are random objects in the researcher's surroundings. The system is designed using Unified
Modeling Language (UML) diagrams, including use case diagrams, activity diagrams, and class diagrams, and will be built in Python.
Python is a high-level, interpreted programming language that supports object-oriented programming and uses dynamic semantics with a
readable syntax. It is easy to learn, in part because it provides automatic memory management. The user runs the system from the
Anaconda prompt and then continues in Jupyter Notebook. The purpose of this study was to determine the accuracy and performance of
YOLOv3 in detecting random objects. The detection result displays the object name and a bounding box with the accuracy percentage.
The system is also able to recognize objects whether they are stationary or moving.

Keywords: YOLO, YOLOv3, Python, Anaconda.

1. Introduction
As technology advances, humans continue to develop knowledge and tools to assist and simplify their work. One research area that is
still developing is Artificial Intelligence (AI) [1][2][3][4].
Machine learning is an approach within AI that is widely used to replace or imitate human behavior in order to solve problems or
automate tasks. As the name implies, machine learning tries to imitate how humans or other intelligent creatures learn and
generalize. Its hallmark is the existence of a training (learning) process. Machine learning therefore requires data to learn from,
known as training data [5] [6] [7].
Object detection is the ability of a system to recognize objects in an image or video [8]. The object detection process begins with
the original image as a .bmp file, followed by resizing, grayscale conversion, and edge-detection convolution [9]. As one of the
fundamental computer vision problems, object detection can provide valuable information for the semantic understanding of images and
videos and is associated with many applications, including image classification. Object detection has recently become one of the most
active fields in computer vision [2] [10].
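The preprocessing chain mentioned above (resizing, grayscale conversion, edge-detection convolution) can be sketched as follows. This is a minimal pure-Python illustration and not the pipeline from [9]: the nearest-neighbour resize, the luminance weights, the Sobel kernel, and all function names are assumptions chosen for brevity, and reading the .bmp file itself is omitted.

```python
# Illustrative sketch only. An image is a nested list of rows; each pixel is an (r, g, b) tuple.

def resize_nearest(image, new_h, new_w):
    """Resize by nearest-neighbour sampling (an assumed, simple method)."""
    old_h, old_w = len(image), len(image[0])
    return [
        [image[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]

def to_grayscale(image):
    """Convert RGB pixels to luminance with the common 0.299/0.587/0.114 weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in image]

# Sobel kernel that responds to horizontal intensity changes (vertical edges).
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

def convolve3x3(gray, kernel):
    """Slide a 3x3 kernel over the image without padding.

    As is usual in image processing code, the kernel is not flipped,
    so strictly speaking this is a cross-correlation.
    The output is 2 pixels smaller in each dimension.
    """
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * gray[y + ky - 1][x + kx - 1]
            row.append(acc)
        out.append(row)
    return out

def preprocess(image, size=(8, 8)):
    """Resize, convert to grayscale, then apply the edge-detection kernel."""
    small = resize_nearest(image, size[0], size[1])
    gray = to_grayscale(small)
    return convolve3x3(gray, SOBEL_X)

# Synthetic 4x4 test image: left half black, right half white.
image = [[(0, 0, 0)] * 2 + [(255, 255, 255)] * 2 for _ in range(4)]
edges = preprocess(image, size=(8, 8))
```

For the synthetic image, the response is non-zero only in the columns where black meets white, which is the behaviour an edge-detection step is expected to show.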
The You Only Look Once (YOLO) method is among the fastest and most accurate methods for object detection and is reported to exceed
the capabilities of other algorithms by a factor of two. YOLO is very fast because a single neural network predicts bounding boxes
and class probabilities directly from the full image in one evaluation [11] [12]. However, it makes more localization errors, and
its training is relatively slow. This research will create a system that detects objects in real time. The study aims to determine
the accuracy and performance of the algorithm using data from surrounding objects. It is hoped that this research will provide
accuracy values and show how well the object detection algorithm performs when applied [12] [13].

Copyright © Authors. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted
use, distribution, and reproduction in any medium, provided the original work is properly cited.
2. Literature Review
2.1. Object Detection
Object detection is an activity that aims to gain an understanding of the classification, concept estimation, and location of objects in an
image. As one of the basic computer vision problems, object detection can provide valuable information for semantic understanding of
images and videos, and is associated with many applications, including image classification [14].

2.2. Artificial Intelligence

Artificial intelligence is a simulation of human intelligence that is modeled in a machine and programmed to think like humans. It
is a technology that requires data as its knowledge base, so that the intelligence produced can keep improving, growing, and
learning from previous mistakes [3].
Artificial intelligence can self-correct because it is designed to learn from the mistakes it has made. Definitions of artificial
intelligence fall into four categories: acting humanly, thinking humanly, thinking rationally, and acting rationally [4].

2.3. Machine Learning

Machine learning can be defined as computer applications and mathematical algorithms that learn from data and produce predictions
about the future. The learning process in question is an attempt to acquire intelligence through two stages: training and testing.
The field of machine learning deals with the question of how to build computer programs that improve automatically with experience
[15].

2.4. You Only Look Once

You Only Look Once (YOLO) is an object detection algorithm based on a Convolutional Neural Network. The YOLO architecture has 24
convolutional layers that extract features from the image, followed by 2 fully connected layers that predict the class probabilities
and box coordinates [12].
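To illustrate how a single network evaluation yields both box coordinates and class probabilities, the sketch below decodes one grid cell of a YOLO-style output vector. The layout (B boxes of x, y, w, h, confidence, followed by C class probabilities) and the values S = 7, B = 2, C = 20 are taken from the original YOLO paper rather than from this study; the function name and the sample numbers are hypothetical.

```python
# Assumed values from the original YOLO paper, not from this study.
S, B, C = 7, 2, 20                # grid size, boxes per cell, number of classes
CELL_LEN = B * 5 + C              # 30 values per grid cell
OUTPUT_LEN = S * S * CELL_LEN     # 1470 values produced in one forward pass

def decode_cell(cell):
    """Return (class_index, score, box) for the best box/class pair in one grid cell."""
    if len(cell) != CELL_LEN:
        raise ValueError("unexpected cell length")
    class_probs = cell[B * 5:]
    best = None
    for b in range(B):
        x, y, w, h, conf = cell[b * 5:(b + 1) * 5]
        for cls, prob in enumerate(class_probs):
            score = conf * prob   # class-specific confidence score
            if best is None or score > best[1]:
                best = (cls, score, (x, y, w, h))
    return best

# Hypothetical cell: box 0 is confident, box 1 is not, class 3 is the likely class.
cell = [0.5, 0.5, 0.2, 0.3, 0.9,   # box 0: x, y, w, h, confidence
        0.4, 0.6, 0.1, 0.1, 0.2]   # box 1: x, y, w, h, confidence
cell += [0.0] * C
cell[B * 5 + 3] = 0.8              # probability of class index 3

cls, score, box = decode_cell(cell)
```

In a full detector this decoding would run over all S x S cells, followed by thresholding and duplicate removal, which are omitted here.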

2.5. You Only Look Once v3

YOLO (You Only Look Once) is a single-stage architecture, which gives it a fast inference time. The frame rate for a 448 x 448-pixel
image is 45 fps (0.022 seconds per image) on a Titan X GPU while achieving a high mean average precision (mAP). YOLOv3 performs
detection in several stages: feature extraction uses Darknet to predict the class and location of objects, after which YOLOv3
classifies the objects according to their class [16] [17].
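The relationship between the quoted frame rate and the per-image time is simple arithmetic: 45 fps corresponds to 1/45, or roughly 0.022 seconds per image. A small sketch of the conversion, with hypothetical helper names and made-up sample timings, is shown below.

```python
def seconds_per_frame(fps):
    """Convert a frame rate into the time spent on one frame."""
    return 1.0 / fps

def average_fps(frame_times):
    """Estimate frames per second from per-frame processing times given in seconds."""
    return len(frame_times) / sum(frame_times)

latency = seconds_per_frame(45)                   # roughly 0.022 s per image
measured = average_fps([0.020, 0.025, 0.022])     # sample timings are invented
```

The second helper is the form that would typically be used when reporting real-time performance from measured per-frame times.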

2.6. Python
Python is a high-level, interpreted programming language that executes instructions directly, supports object-oriented programming,
and uses dynamic semantics with a syntax designed for readability. As a high-level programming language, Python is easy to learn, in
part because it provides automatic memory management [18].

2.7. TensorFlow
TensorFlow is a free, open-source software library for machine learning. It is used for many tasks but focuses on the training and
inference of deep neural networks. The library is based on dataflow programming [19] [20].
TensorFlow is a computational framework for building machine learning models. It provides a variety of toolkits that allow you to
build models at your preferred level of abstraction and run computation graphs on multiple hardware platforms, including CPU, GPU,
and TPU.

3. Methods
3.1. Object Analysis
Detection targets random objects in the researcher's surroundings. Light intensity is also taken into account.

3.2. System Overview

To obtain object information, a system will be built that recognizes objects captured by the webcam and reports the name and
accuracy of each random object detected. The system will be built in Python; the user runs it from the Anaconda prompt and then
continues in Jupyter Notebook. When the system displays the camera view, the user presents an object to the camera so that the
camera can capture it and the system can generate information about it.
Next comes a training stage in which all datasets used as training data are trained with the YOLO (You Only Look Once) method. All
data must be recognized so that the system can detect objects accurately.
Fig 1. Flowchart of the Overall System

Figure 1 shows the system flow diagram for this study. The camera monitors the surrounding objects. When the camera captures an
object, the captured object is processed with the YOLOv3 algorithm for identification. If the object is identified, it is marked
with a bounding box on the display, together with its information and accuracy. If the system cannot identify the object, it returns
to monitoring the surroundings.
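The flow in Figure 1 can be sketched as a simple loop. The code below is an illustration under stated assumptions, not the authors' implementation: `frames` stands in for the webcam stream, `detector` is a hypothetical callable wrapping the trained YOLOv3 model, and the 0.5 confidence threshold is an arbitrary choice.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Tuple

@dataclass
class Detection:
    name: str                          # object name reported by the detector
    confidence: float                  # value between 0.0 and 1.0
    box: Tuple[int, int, int, int]     # x, y, width, height in pixels

def monitor(
    frames: Iterable,
    detector: Callable[[object], List[Detection]],
    threshold: float = 0.5,            # assumed cut-off, not taken from the paper
) -> Iterator[Tuple[int, List[Detection]]]:
    """Yield (frame_index, detections) for every frame with a confident detection."""
    for index, frame in enumerate(frames):
        detections = [d for d in detector(frame) if d.confidence >= threshold]
        if not detections:
            continue                   # nothing identified: go back to monitoring
        yield index, detections        # caller draws the boxes and labels

# Stand-in detector so the sketch runs without a camera or a trained model.
def fake_detector(frame):
    if frame == "cup":
        return [Detection("cup", 0.91, (40, 60, 120, 150))]
    if frame == "blur":
        return [Detection("cup", 0.30, (0, 0, 10, 10))]
    return []

results = list(monitor(["empty", "cup", "blur"], fake_detector))
```

Only the second frame produces a result here: the first contains no object and the third falls below the threshold, so the loop keeps monitoring in both cases.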

4. Results and Discussion

In applying machine learning, the system can learn various visual forms of random objects from their colors, shapes, textures, and
images. Problems often encountered in object detection are the difficulty of distinguishing objects from non-objects in an image and
the high variability of objects, as is the case with random objects that vary in shape, color, and size.

4.1. System Design

In this stage, the system design will be carried out using Unified Modeling Language (UML) diagrams including use case diagrams,
activity diagrams, and class diagrams.

4.1.1. Use Case Diagram

Description of the use cases in this system:
1. Scan Object is the step the user performs before obtaining detection results: the user points a random object at the camera so
that it can be detected.
2. Detecting Object is the process in which the object captured by the camera is recognized, yielding the object name and its
accuracy.
3. Viewing Object Detection Results is a feature that lets users see the detection results directly, complete with object names and
accuracy percentages.

Fig 2. Use Case Diagram

4.1.2. Activity Diagram

An activity diagram shows the activities of each function. It describes the workflow of a system and can describe the menu
activities that exist in the system.
1. Object Scan

Fig 3. Activity Diagram of the Scan Object Process

In the object scan process, the object detection system is started first, then the user points the object at the camera, and the
object scan runs.

2. Object Detection

Fig 4. Activity Diagram of the Object Detection Process

In the object detection process, the system first checks whether the object was successfully captured by the camera. There are two
conditions in this process: whether the system can recognize the object or not. If not, the system returns to the initial step,
which is detecting the object to be captured by the camera.
3. Object Detection Results

Fig 5. Activity Diagram of the Object Detection Result Process

In the object detection result process, once the system obtains the results, it displays the name of the object and its accuracy
percentage.

4. Managing Object Information

Fig 6. Activity Diagram of the Process of Managing Object Information
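The object detection result step described in item 3 above (displaying the object name with its accuracy percentage) amounts to formatting a label for each bounding box. A minimal sketch follows, assuming the detector reports confidence as a value between 0 and 1 and that two decimal places are wanted; `format_label` is a hypothetical helper, not part of the described system.

```python
def format_label(name, confidence):
    """Build the text shown next to a bounding box, e.g. 'bottle: 87.34%'."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return f"{name}: {confidence * 100:.2f}%"

label = format_label("bottle", 0.8734)
```

Rejecting out-of-range values early makes it easier to notice when a detector returns raw scores instead of probabilities.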

4.1.3. Class Diagram

Class diagrams describe the static structure of the classes in the system, including their attributes, operations, and
relationships. They help visualize the class structure of a system and are among the most widely used types of UML diagram.

Fig 7. Class Diagram

5. Conclusion

It can be concluded that the system design has been carried out using Unified Modeling Language (UML) diagrams, including use case
diagrams, activity diagrams, and class diagrams. The system will be built in Python; the user runs it from the Anaconda prompt and
then continues in Jupyter Notebook. The objects under study are those in the researcher's surroundings. The detection result
displays the object name and a bounding box with the accuracy percentage. The system is also able to recognize objects whether they
are stationary or moving.


