
2023 3rd International Conference on Intelligent Cybernetics Technology & Applications (ICICyTA)
979-8-3503-9455-9/23/$31.00 ©2023 IEEE | DOI: 10.1109/ICICyTA60173.2023.10428952

Detecting Vehicles using YOLOv8n in Edge Computing Dashcam

1st Aura Syafa Aprillia Radim
School of Electrical Engineering
Telkom University
Bandung, Indonesia
[email protected]

2nd Muchammad ’Irfan Chanif Rusydi
School of Electrical Engineering
Telkom University
Bandung, Indonesia
[email protected]

3rd Surya Michrandi Nasution
School of Electrical Engineering
Telkom University
Bandung, Indonesia
[email protected]

4th Casi Setianingsih
School of Electrical Engineering
Telkom University
Bandung, Indonesia
[email protected]

Abstract—A dashcam is a camera placed on the dashboard of a vehicle. This device's function is to capture footage of all events in front of the vehicle. Security and safety have become a significant concern in various sectors, including transportation and public roads. Traffic accidents caused by drivers' unawareness of objects around the vehicle are still a severe problem on the highway. In this study, a simple dashcam built from an edge computer was developed. By adding a camera, the dashcam is able to detect vehicles ahead. Whenever vehicles appear in front of the system, they are detected using an object detection method called YOLOv8. This research is intended as one step in a proof of concept for the development of an Intelligent Transportation System suited to traffic conditions in Indonesia. This paper simulates and tests the usage of the GPU on the edge computing device. Even though YOLOv8n scores 6.29, 9.11, 6.05, and 0.24 points lower than YOLOv7-tiny in precision, recall, mAP50, and mAP50-95 respectively, it uses only about half the computational cost of YOLOv7-tiny. This shows that YOLOv8n is suitable as a detection method on an edge computing device. In the inference time testing, objects in an image can be detected in 65-500 ms depending on the power supplied to the computer, which means the system is able to infer objects at 2 to 15.38 frames per second.

Keywords—ADAS, dashcam, object detection, You Only Look Once, YOLOv8

I. INTRODUCTION

Security and safety have become a significant concern in various sectors, including transportation and public security. Traffic accidents caused by drivers' unawareness of objects around the vehicle are still a serious problem on the highway. Intelligent and effective object detection technology is increasingly important in monitoring traffic ahead of and approaching the vehicle [1].

A dashboard camera (dashcam) is placed on a vehicle's dashboard. This device usually serves to record all events in front of the vehicle, and demand for it is proliferating in the market. Currently, a dashcam serves as a device to record events in front of the vehicle; the recording is then used as evidence of an accident [2], so it can later serve as proof for insurance claims. However, dashcams could also be used for other purposes, such as being one of the components of an Advanced Driver Assistance System (ADAS).

Many researchers have incorporated dashcam footage in order to classify crash and near-crash conditions [3] and road damage [4]. Some researchers also applied deep learning methods to traffic scenes [5], [6]. However, the detection process was mainly carried out on on-premises computers or a smartphone and had high latency. To prevent accidents among vehicles, the detection process needs to be near real-time with low latency.

In this paper, a solution to prevent accidents among vehicles is proposed by implementing an object detection system on an edge computing device, or single-board computer (SBC). This SBC works as the main processing unit using the GPU that is preinstalled in it. Objects in front of the dashboard camera are detected by using the Convolutional Neural Network method.

The content of this paper is organized as follows. Section II presents the literature review. Section III presents the system design. Section IV presents experimentation results and a discussion of the process of developing the system. Finally, Section V explains the conclusion and future work of this study.

II. LITERATURE REVIEW

A. Dashcam

Dashcams have been widely used by drivers to record traffic on the road, and many believe that a dashcam is an essential part of a vehicle. There are several reasons to install one, including favorable insurance rates from insurance corporations for drivers and the collection of material that can be used as evidence in legal procedures [7]. To such a degree, both the Chinese and South Korean governments oblige public transportation and commercial vehicles to install a dashcam to assist in investigating traffic accidents [8]. Despite this widely known importance, only some utilize dashcams for purposes other than recording road and traffic conditions.

B. Object Detection

Object detection is one of the essential tasks in the computer vision field, mainly dealing with detecting instances of visual objects and then categorizing them into several classes [9]. With this kind of identification and localization, object detection can be used to count objects in a scene and determine and track their precise locations, all while accurately labeling them. It has been widely used for recognizing faces [10], detecting vehicles [11], counting pedestrians [12], securing systems [13], implementing autonomous cars [14], etc. Object detection has undergone many changes and developments in the past twenty years [9].


It is commonly divided into two periods: (1) traditional object detection and (2) deep learning-based object detection.

In 2012, Krizhevsky et al. proposed a deep convolutional network trained on a subset of ImageNet [15] called AlexNet. It was the forerunner of the YOLO model. A year later, Girshick et al. proposed a new object detection framework called R-CNN [16]. It combined region proposals with CNNs to detect objects in images. Since then, the object detection research and development field has been advancing rapidly, with new models, datasets, and techniques emerging constantly.

Following AlexNet, the YOLO (You Only Look Once) model was introduced in 2015 [17]. The original base YOLO model can achieve 45 frames per second. Redmon et al. also released a smaller version of YOLO called Fast YOLO, which can achieve 155 frames per second. According to Redmon et al., YOLO outperforms DPM and R-CNN on the Picasso Dataset and People-Art Dataset [17]. In 2022, Wang et al. released YOLOv7 [18]. A year later, in January 2023, YOLOv8 was introduced by Ultralytics [19], the same software company that released YOLOv3 and YOLOv5. As of now, YOLOv8 is one of the latest state-of-the-art (SOTA) open-source object detection models.

C. COCO Dataset

The COCO dataset is a large-scale object detection, segmentation, and keypoint dataset. In total, the Microsoft Common Objects in COntext dataset contains 91 common object categories, with 82 of them having more than 5,000 labeled instances [20]. The first version of the COCO Dataset had 124,000 images divided into training and validation datasets: the training dataset consists of 83,000 images, and the rest are used as the validation dataset. Later, the COCO dataset grew; in 2017, the dataset contained more than 330,000 images in total. Because of its size, the COCO Dataset is widely used by state-of-the-art object detection models for training and evaluating model performance.

Despite its popularity, the COCO Dataset has some drawbacks. Although it contains quite a large number of image classes, it has an imbalanced class distribution. The total number of annotated objects for the person class is 64,115, while the hair dryer class has only around 100. Additionally, for this research, the model does not need to detect anything other than objects related to traffic.
D. Performance Metrics

The performance of YOLOv8n is measured by calculating the Precision (P), Recall (R), and mean Average Precision (mAP). mAP is the average of the Average Precision (AP) over all classes, where AP is the area under the precision-recall curve. The mAP is compared both averaged over Intersection over Union (IoU) thresholds from .50 to .95 with .05 increments (the MS COCO standard metric, abbreviated as mAP50-95) and at the single .50 threshold (the PASCAL VOC metric, abbreviated as mAP50) [7].

Precision and Recall in this experiment are relative measures because they depend on the threshold value. Precision is calculated by dividing the True Positives (TP) by the sum of True Positives (TP) and False Positives (FP). Meanwhile, Recall is calculated by dividing the True Positives (TP) by the sum of True Positives (TP) and False Negatives (FN). The formulations to measure precision and recall are shown in (1) and (2) respectively.

P = TP / (TP + FP)    (1)

R = TP / (TP + FN)    (2)

The threshold value determines whether a prediction is Positive or Negative. For example, if the threshold is 0.5, then a prediction is positive if its intersection over union (IoU) is greater than 0.5; otherwise, the prediction is negative. Both Precision and Recall are relative to the threshold value and are usually presented as a Precision-Recall curve. The Average Precision (AP) is obtained by calculating the area under this curve, and the AP must be calculated for each class to obtain the mAP. In (3), N is the number of classes in the COCO dataset.

mAP = (1/N) * Σ AP_i    (3)
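To make these definitions concrete, below is a minimal Python sketch of (1)-(3); the counts and AP values in the example are illustrative and not taken from this paper's experiments.

# Minimal sketch of metrics (1)-(3). All counts below are illustrative.

def precision(tp, fp):
    # (1): P = TP / (TP + FP)
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0

def recall(tp, fn):
    # (2): R = TP / (TP + FN)
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0

def mean_average_precision(ap_per_class):
    # (3): mAP = (1/N) * sum of AP_i over the N classes
    return sum(ap_per_class) / len(ap_per_class)

# A detection counts as TP when its IoU with a ground-truth box exceeds
# the threshold (0.5 for mAP50); otherwise it is an FP.
print(precision(tp=80, fp=20))                   # 0.8
print(recall(tp=80, fn=40))                      # 0.666...
print(mean_average_precision([0.6, 0.4, 0.5]))   # 0.5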
E. Edge Computing Devices

Nasution et al. proposed a system that can detect vehicles and street lanes [21]. The image feed was captured using a smartphone and then processed using the ImageAI library with RetinaNet trained on COCO as the object detection model. In their study, the object detection process was done on an on-premises computer.

Nasution and Dirgantara improved the system by building an ADAS on an edge computing device, using a Raspberry Pi 4 as the main processing unit [22]. Unfortunately, their results only managed to reach 0.9 FPS. Based on this lack of FPS, this study focused on finding a more robust edge computer.

III. SYSTEM DESIGN

In this section, the system designed for this study is discussed. Overall, the proposed system consists of 3 stages, as follows: (1) collection of the image stream from the camera feed, (2) the image processing stages, using an edge computer, and (3) the display of the detection result on the ADAS screen. Fig. 1 shows the flowchart of the proposed method of detecting traffic objects.

Fig. 1. System Flowchart


First, the camera captures the image frame by frame, followed by preprocessing the images before feeding them to the YOLO model. The YOLO model provides the bounding box coordinates and a probability for each class for all detected objects. The bounding box and class are then drawn on the image, and the annotated image is displayed on the screen.
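A minimal sketch of this capture-infer-draw loop is shown below, assuming the Ultralytics YOLOv8 Python API and OpenCV; the weight filename and camera index are illustrative assumptions, not the authors' exact code.

# Sketch of the loop described above: capture a frame, run YOLOv8n,
# draw boxes and labels, and show the result. Paths/indices are assumptions.
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")    # pretrained nano weights (assumed filename)
cap = cv2.VideoCapture(0)     # dashcam feed on camera index 0

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame)          # boxes, classes, and confidences
    annotated = results[0].plot()   # frame with boxes and labels drawn
    cv2.imshow("dashcam", annotated)
    if cv2.waitKey(1) & 0xFF == ord("q"):   # press 'q' to stop
        break

cap.release()
cv2.destroyAllWindows()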
A. Image Feed

The image is gathered using the camera mentioned earlier. Fig. 2 shows a sample of the image gathered from the camera. In the proposed system, there are several classes (vehicles) that the system tries to recognize using an object detection method. Later, the dataset is filtered into the classes that are related to traffic conditions.

Fig. 2. Image Sample from Collected Footage

Some of the traffic footage that has been collected is utilized as an addition to the training dataset. Indonesia has unique traffic conditions, in which most vehicles are motorcycles. According to the Bureau of Indonesian Statistics, motorcycles dominate at 77.5% compared to other kinds of vehicles [23]. Based on this number, the traffic situation in Indonesia differs from other countries.

B. Filtering The Dataset

In this study, the dataset used to train and validate the model is the COCO dataset, specifically the train2017 and val2017 splits. In the real-data inference step, images are captured from real traffic conditions on the road; the observation area in this paper is Bandung, Indonesia.

As mentioned in the previous section, the COCO dataset contains 80 classes. Only traffic-related object classes are needed in this research, namely cars, trucks, buses, motorcycles, bicycles, traffic lights, stop signs, trains, hydrants, cats, and dogs. There are 78,663 training images split into 12 classes, as shown in Fig. 3. This number was reduced from the 122,125 images of the original dataset; decreasing the size of the dataset is expected to reduce the training time of the model. As seen in Fig. 3, the number of motorcycle instances is lower than the number of car instances. A sketch of this filtering step is shown after the figure.

Fig. 3. Filtered COCO Dataset Class Count
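As an illustration of the filtering described above, the following is a minimal sketch using pycocotools. The class list follows the paper; the annotation path is an assumption, and the export to YOLO-format labels is omitted. This is not the authors' exact script.

# Sketch: keep only images containing traffic-related classes in train2017.
from pycocotools.coco import COCO

TRAFFIC_CLASSES = ["car", "truck", "bus", "motorcycle", "bicycle",
                   "traffic light", "stop sign", "train",
                   "fire hydrant", "cat", "dog"]

coco = COCO("annotations/instances_train2017.json")  # assumed path
cat_ids = coco.getCatIds(catNms=TRAFFIC_CLASSES)

# Union of all images that contain at least one kept class.
keep_img_ids = set()
for cat_id in cat_ids:
    keep_img_ids.update(coco.getImgIds(catIds=[cat_id]))

print(f"kept {len(keep_img_ids)} of {len(coco.getImgIds())} images")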

C. Training The Model

Before training the YOLOv8n traffic model with a high number of epochs, the optimal hyperparameter configuration must be defined first. Various training configurations are run with a small number of epochs to save time; each training configuration needs up to 2 hours even though the number of epochs is only 8. In training the model, various batch sizes and optimizers were tested.

The batch sizes used in this paper are 16 and 32, and the optimizers used are SGD and Adam. Fig. 4 shows the result of training with the different batch sizes and optimizers. It can be seen that batch size 32 performs better than batch size 16, especially in the model training stage. The best combination of batch size and optimizer for the 8-epoch model training is batch size 32 with the SGD optimizer. Meanwhile, Fig. 5 shows the comparison of GPU utilization for the SGD optimizer when using batch sizes of 16 and 32.

Fig. 4. mAP50-95 Comparison using Variation of Batch Sizes

Fig. 5. GPU Utilization Comparison Between Variation of Batch Sizes
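A sketch of this short hyperparameter sweep using the Ultralytics training API is shown below; the dataset YAML filename and run names are illustrative assumptions.

# Sketch: 8-epoch runs over {16, 32} x {SGD, Adam}, as described above.
# "traffic.yaml" (the filtered dataset config) is an assumed filename.
from ultralytics import YOLO

for batch in (16, 32):
    for opt in ("SGD", "Adam"):
        model = YOLO("yolov8n.pt")
        model.train(
            data="traffic.yaml",         # filtered COCO traffic classes
            epochs=8,                    # short runs to save time
            batch=batch,
            optimizer=opt,
            name=f"b{batch}_{opt}",      # run directory name
        )
# The best setting found this way (batch 32 + SGD) is then used for the
# longer 80-epoch training.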
D. Edge Processing Unit

In order to improve the frame rate, the edge computer used as the main processing unit in this paper is the Jetson Nano. It is powered by a quad-core ARM Cortex-A57 CPU with a 128-core Maxwell GPU. The GPU (CUDA cores) allows the system to accelerate the deep learning model. The availability of a GPU is another factor that needs to be considered in order to improve the frame rate; according to Pandey et al., the GPU is a crucial component in deep learning [24], which means an edge computer with a GPU is preferred. The object detection system runs on python3 and uses the PyTorch library as the deep learning framework.
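As a quick sanity check (a sketch, not taken from the paper), one can confirm that PyTorch sees the Jetson's CUDA GPU before running the detector:

# Sketch: verify the CUDA (Maxwell) GPU is visible to PyTorch on the Nano.
import torch

print(torch.cuda.is_available())            # True with a CUDA-enabled build
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))    # e.g. "NVIDIA Tegra X1" on a Nano
device = "cuda" if torch.cuda.is_available() else "cpu"
# The YOLO model and input tensors are then moved to this device.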
There are two stages in creating the proposed system, namely the training and inferencing stages. The model training stage was conducted on a computer with a GeForce RTX 3090 with 24 GB of video RAM, 32 GB of system RAM, and an Intel Core i3-12100 (4 cores, 8 threads), running Windows 10. Meanwhile, the model inferencing and testing stage was performed on a Jetson Nano 4GB running Ubuntu 20.04.02 LTS.

As shown in Fig. 6, the prototype of the dashcam is implemented in a car. It can be seen that the camera is placed behind the Jetson Nano. The main power for the system comes from the car battery, which is connected through the cigarette lighter and a USB car charger.


Whenever the camera of the dashcam is ready to capture images, testing is conducted by driving around the city of Bandung.

Fig. 6. Prototype Implementation – Front and Side View

IV. RESULTS AND DISCUSSION

A. Object Detection

According to the filtered dataset, there are 12 kinds of objects to be detected. As shown in Fig. 7, the detected objects are limited to the filtered classes of the COCO dataset. In the figure, several cars and a motorcycle are detected, while another motorcycle fails to be detected by the system.

Fig. 7. Object Detection Result

B. Dataset Comparison

The filtered dataset is compared to the original dataset based on GPU utilization (%) and GPU memory (%). As seen in Fig. 8 and Fig. 9, the filtered dataset has lower GPU and memory utilization, which means the filtered dataset is more efficient to train. Based on these results, the longer the model is trained, the better the results it will deliver.

Fig. 8. Comparison of GPU Utilization between Model Training using Filtered and COCO's Original Dataset

Fig. 9. Comparison of GPU Memory Utilization between Model Training using Filtered and COCO's Original Dataset

Based on the formula in (3), the filtered COCO dataset gives a better result, as shown in Fig. 10. The calculation was conducted by simulating model training with 8 epochs. It can be seen in the figure that the mAP50-95 of the filtered dataset reaches 0.298, while the original COCO dataset (train2017) only reaches 0.135. In this training simulation, the small number of epochs was chosen to save training time.

Fig. 10. Comparison of mAP50-95 between Model Training using Filtered and COCO's Original Dataset

C. Training Result

This section discusses the result of the trained model for detecting objects. This simulation uses more epochs than the previous one: the model is trained for 80 epochs. The object detection method used in this paper is YOLOv8, which offers several model sizes, such as YOLOv8n as the smallest model, YOLOv8l as a larger model, and YOLOv8x as the largest model. YOLOv8n was chosen due to its small size and compatibility with edge computing devices. YOLOv8n uses 168 layers with over 3 million parameters and a computational cost of around 8 GFLOPs.

Along with the model training using YOLOv8, the previous version of this method (YOLOv7-tiny) was also trained to compare their performances. Both YOLOv7-tiny and YOLOv8n are the smallest models in their respective YOLO versions. As seen in Fig. 11, YOLOv8n has the lower computational cost: in terms of GFLOPs, YOLOv8n needs only 8.7 while YOLOv7-tiny needs 13.1. From the parameter standpoint, YOLOv8n needs almost half of YOLOv7-tiny's parameters. Meanwhile, YOLOv8n uses 168 layers, slightly fewer than YOLOv7-tiny's 208 layers.

Fig. 11. Layers, Parameters, and GFLOPs Comparison between YOLOv7-tiny and YOLOv8n

Both YOLO methods are tested using a larger number of epochs to compare their performances. As shown in Table I, the newest version of YOLO has almost similar performance to the previous one. YOLOv8n achieves 71.25%, 56.59%, 63.8%, and 46.31% for its precision, recall, mAP50, and mAP50-95 respectively. Meanwhile, YOLOv7-tiny scores 6.29, 9.11, 6.05, and 0.24 points higher for the same metrics.


Though YOLOv8n has lower performance, it uses fewer GFLOPs and parameters and needs fewer layers than YOLOv7-tiny.

TABLE I. COMPARISON BETWEEN YOLOV7-TINY AND YOLOV8N

Model          Recall (%)   Precision (%)   mAP50 (%)   mAP50-95 (%)
YOLOv7-tiny    65.7         77.54           69.85       46.55
YOLOv8n        56.59        71.25           63.8        46.31

Based on the results, both methods have almost similar performance. The comparison of mAP50-95 between the smallest versions of both YOLO releases is shown in Fig. 12. It can be seen in the figure that the mean average precision of YOLOv8n is almost similar to YOLOv7-tiny's, as mentioned before. YOLOv8n has lower performance since it is the smallest version of YOLOv8; a bigger version may improve the performance.

Fig. 12. mAP50-95 Comparison between YOLOv8n and YOLOv7-tiny with 80 Epochs

Based on Fig. 13, it can be inferred that an mAP50-95 above 50% on COCO val2017 can only be reached by bigger models such as YOLOv7 and YOLOv8m. However, both of them have 4 to 8 times the parameters of YOLOv8n and YOLOv7-tiny and require roughly 10 times the computational cost in terms of GFLOPs. Adopting those bigger YOLO models was not an option, since the detection process is meant to run on the Edge Processing Unit with limited computational resources.

Fig. 13. Comparison of YOLO Object Detection Models on COCO val2017 mAP50-95

On the other hand, the COCO dataset has an imbalanced instance count problem. The chosen model was trained on data that incorporated footage collected and annotated from traffic, which only slightly fixed the problem, so the imbalance still affected the performance of the model. Although filtering the COCO dataset is essential to get better results in terms of mAP, since it removes a number of image classes that are unnecessary in this study, most of the filtered image classes still have fewer than 1,000 instances. In short, fine-tuning the model would be beneficial for the performance results.

D. Prototype Testing

The prototype implemented in a vehicle reflects the real conditions in which the dashcam works. The prototype's power consumption must be tested in order to know its usage, since the power source in a vehicle may also need to support other devices that are charged at the same time it powers the edge computing device.

The device may need more power in order to infer images faster. As shown in Table II, higher power consumption (W) yields faster inference times. At 5 W, the fastest, medium, and slowest inference times are 341, 374, and 500 ms. By increasing the power by 2 watts, inference becomes almost twice as fast: the fastest, medium, and slowest inference times become 158, 178, and 220 ms. Meanwhile, when the power is doubled, inference is almost 5 times faster; the inference times in this test (10 W) are all under 100 ms.

TABLE II. POWER CONSUMPTION EFFECT ON INFERENCE TIME

Power Consumption (W)   Inference Time (ms)
                        Fastest   Medium   Slowest
5                       341       374      500
7                       158       178      220
10                      65        70       81

Object detection using the Jetson Nano as the edge computing device has better results compared with the previous research mentioned earlier. As shown in (4), the number of frames per second (FPS) is determined by dividing 1000 by the inference time in milliseconds.

FPS = 1000 / inference time (ms)    (4)
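A minimal timing sketch consistent with (4) is shown below; the warm-up run, sample frame, and run count are illustrative assumptions, not the authors' benchmark code.

# Sketch: time single-frame inference and convert to FPS as in (4).
import time
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
model("sample_frame.jpg")        # warm-up run (assumed sample image)

times_ms = []
for _ in range(100):
    start = time.perf_counter()
    model("sample_frame.jpg")    # single-image inference
    times_ms.append((time.perf_counter() - start) * 1000.0)

mean_ms = sum(times_ms) / len(times_ms)
print(f"mean inference: {mean_ms:.1f} ms -> {1000.0 / mean_ms:.2f} FPS")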
As seen in Table III, the FPS value varies with the power consumption. The FPS is between 2 and 2.93 when the edge computing device is powered at 5 W. The FPS then rises along with the power: when the Jetson Nano receives 7 W, the system can detect objects at 4.55 to 6.33 frames per second. Finally, when the power consumption is doubled from the first test, the FPS ranges between 12.35 and 15.38 frames per second.

TABLE III. FRAMES PER SECOND BASED ON THE INFERENCE TIME FOR EACH POWER CONSUMPTION

Power Consumption (W)   Frames per Second
                        Fastest   Medium   Slowest
5                       2.93      2.67     2
7                       6.33      5.62     4.55
10                      15.38     14.29    12.35

V. CONCLUSION

According to the tests conducted in this study, YOLOv7-tiny and YOLOv8n have almost similar performances based on the calculation of precision, recall, and mAP.


The gaps between these methods are 6.29, 9.11, 6.05, and 0.24 points for precision, recall, mAP50, and mAP50-95 respectively. Even though the performance of YOLOv8n is lower than YOLOv7-tiny's, YOLOv8n has a better computational cost than YOLOv7-tiny. This means YOLOv8n is a suitable method for implementation on edge computing devices without reducing the object detection capability.

Object detection on the Jetson Nano edge computing device shows varying inference times. When the edge computer receives 5 W of power, it needs 341-500 ms to detect the objects in an image. When the power is doubled, detection is about 5 times faster, needing only 65-81 ms per image. Based on the inference time testing, the frames per second were also calculated. When the Jetson Nano received 5 W, it was able to process 2-2.93 frames per second. As the power supply increased to 7 W, the Jetson Nano could detect objects at 4.55-6.33 frames per second. Finally, when the power supply was changed to 10 W, the number of frames processed per second increased greatly, to 12.35-15.38 frames per second.

Future improvement of the system can be done by using the full potential of the GPU in the Nvidia Jetson series. The range of edge computing devices is vast, but the device must be chosen wisely based on the purpose of the research. Another improvement is rewriting the code in C++, which has lower processing and memory overhead and may reduce the inference time for each frame. Further development towards an Advanced Driver Assistance System (ADAS) can be done by combining more cameras and sensors.

ACKNOWLEDGMENT

This work is funded by the Ministry of Education, Culture, Research, and Technology, Republic of Indonesia (Kementerian Pendidikan, Kebudayaan, Riset, dan Teknologi, Kemdikbudristek) via DRTPM 2023 under contract numbers 003/SP2H/RT-MONO/LL4/2023 and 346/PNLT2/PPM/2023.

REFERENCES

[1] A. Soin and M. Chahande, "Moving vehicle detection using deep neural network," Int. Conf. Emerging Trends Comput. Commun. Technol., 2017.
[2] P. M. Greenwood, J. K. Lenneman, and C. L. Baldwin, "Advanced driver assistance systems (ADAS): Demographics, preferred sources of information, and accuracy of ADAS knowledge," Transp. Res. Part F Traffic Psychol. Behav., vol. 86, pp. 131–150, 2022, doi: 10.1016/j.trf.2021.08.006.
[3] L. Taccari et al., "Classification of Crash and Near-Crash Events from Dashcam Videos and Telematics," IEEE Conf. Intell. Transp. Syst. (ITSC), pp. 2460–2465, 2018, doi: 10.1109/ITSC.2018.8569952.
[4] A. Tedeschi and F. Benedetto, "A real-time automatic pavement crack and pothole recognition system for mobile Android-based devices," Adv. Eng. Informatics, vol. 32, pp. 11–25, 2017, doi: 10.1016/j.aei.2016.12.004.
[5] X. He, R. Cheng, Z. Zheng, and Z. Wang, "Small object detection in traffic scenes based on YOLO-MXANet," Sensors, vol. 21, no. 21, 2021, doi: 10.3390/s21217422.
[6] A. Boukerche and Z. Hou, "Object Detection Using Deep Learning Methods in Traffic Scenarios," ACM Comput. Surv., vol. 54, no. 2, 2021, doi: 10.1145/3434398.
[7] V. Adamová, "Dashcam as a Device to Increase the Road Safety Level," Proc. CBU Nat. Sci. ICT, vol. 1, pp. 1–5, 2020, doi: 10.12955/pns.v1.113.
[8] J. Kim, J. Kim, S. Park, and U. Lee, "Dashcam Witness: Video Sharing Motives and Privacy Concerns across Different Nations," IEEE Access, vol. 8, pp. 110425–110437, 2020, doi: 10.1109/ACCESS.2020.3002079.
[9] Z. Zou, K. Chen, Z. Shi, Y. Guo, and J. Ye, "Object Detection in 20 Years: A Survey," Proc. IEEE, vol. 111, no. 3, pp. 257–276, 2023, doi: 10.1109/JPROC.2023.3238524.
[10] A. Kumar, A. Kaur, and M. Kumar, "Face detection techniques: a review," Artif. Intell. Rev., vol. 52, no. 2, pp. 927–948, 2019, doi: 10.1007/s10462-018-9650-2.
[11] R. A. Hadi, G. Sulong, and L. E. George, "Vehicle Detection and Tracking Techniques: A Concise Review," Signal Image Process. Int. J., vol. 5, no. 1, pp. 1–12, 2014, doi: 10.5121/sipij.2014.5101.
[12] C. Song, W. Guan, and J. Ma, "Potential travel cost saving in urban public-transport networks using smartphone guidance," PLoS One, vol. 13, no. 5, pp. 1–22, 2018, doi: 10.1371/journal.pone.0197181.
[13] H. Hashib, M. Leon, and A. M. Salaque, "Object Detection Based Security System Using Machine Learning Algorithm and Raspberry Pi," 5th Int. Conf. Comput. Commun. Chem. Mater. Electron. Eng. (IC4ME2), pp. 1–4, 2019, doi: 10.1109/IC4ME247184.2019.9036531.
[14] A. Balasubramaniam and S. Pasricha, "Object Detection in Autonomous Vehicles: Status and Open Challenges," pp. 1–6, 2022. [Online]. Available: http://arxiv.org/abs/2201.07706
[15] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[16] R. Girshick, J. Donahue, T. Darrell, and J. Malik, "Rich feature hierarchies for accurate object detection and semantic segmentation," Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), pp. 580–587, 2014, doi: 10.1109/CVPR.2014.81.
[17] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2016.
[18] C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors," pp. 1–15, 2022. [Online]. Available: http://arxiv.org/abs/2207.02696
[19] G. Wang, Y. Chen, P. An, H. Hong, J. Hu, and T. Huang, "UAV-YOLOv8: A Small-Object-Detection Model Based on Improved YOLOv8 for UAV Aerial Photography Scenarios," Sensors, vol. 23, no. 16, 2023, doi: 10.3390/s23167190.
[20] T.-Y. Lin et al., "Microsoft COCO: Common objects in context," Lect. Notes Comput. Sci., vol. 8693, part 5, pp. 740–755, 2014, doi: 10.1007/978-3-319-10602-1_48.
[21] S. M. Nasution, E. Husni, Kuspriyanto, R. Yusuf, and R. Mulyawan, "Road Information Collector Using Smartphone for Measuring Road Width Based on Object and Lane Detection," Int. J. Interact. Mob. Technol., vol. 14, no. 2, pp. 42–61, Feb. 2020.
[22] S. M. Nasution and F. M. Dirgantara, "Pedestrian Detection System using YOLOv5 for Advanced Driver Assistance System (ADAS)," J. RESTI (Rekayasa Sist. dan Teknol. Informasi), vol. 7, no. 3, pp. 715–721, 2023, doi: 10.29207/resti.v7i3.4884.
[23] BPS, "Perkembangan Jumlah Kendaraan Bermotor Menurut Jenis, 1949-2016," 2018. [Online]. Available: https://www.bps.go.id/linkTableDinamis/view/id/1133
[24] M. Pandey et al., "The transformational role of GPU computing and deep learning in drug discovery," Nat. Mach. Intell., vol. 4, no. 3, pp. 211–221, 2022, doi: 10.1038/s42256-022-00463-x.
