
Object Detection and Tracking on Autonomous Vehicle using YOLOv9

Mamilla Vasudeva Reddy
Department of Computer Science and Engineering
Amrita School of Computing
Amrita Vishwa Vidyapeetham
Chennai, India
[email protected]

Devarapalli Pavan Kumar
Department of Computer Science and Engineering
Amrita School of Computing
Amrita Vishwa Vidyapeetham
Chennai, India
[email protected]

G Anitha
Amrita School of Computing
Amrita Vishwa Vidyapeetham
Chennai, India
[email protected]

Abstract—This research discusses the use of one of the latest object detection models, YOLOv9, in self-driving car systems. For self-driving cars, accurate and fast object detection is critical, since the vehicle has to navigate erratic traffic conditions. In particular, when the model is trained on a dataset built for autonomous driving, YOLOv9 detects objects such as pedestrians, vehicles, and obstacles more reliably. The findings of the study indicate that the further use of YOLOv9 is promising for improving the safety and functionality of autonomous vehicle systems, since it raises the accuracy of hazard detection and the speed of subsequent signal processing compared to previous methods.

Index Terms—YOLOv9, Object Detection, COCO Dataset, Deep Learning, Image Processing, Model Training, Bounding Box Detection, Ultralytics, Image Classification, Detection Results

I. INTRODUCTION

As the field of self-driving cars grows rapidly, high-accuracy, real-time object detection systems will only become more important. To function autonomously, self-driving cars must interpret many dynamic and stationary objects on the road, including pedestrians and other vehicles. Sensor-based systems such as LiDAR and radar offer valuable information, while cameras, which are increasingly popular for detection, have the added advantages of being cheaper and providing detailed visual features. The major issue in real-time perception is to process the raw visual data while achieving both high accuracy and fast computation.

Bounding-box prediction for real-time object detection is one of the most sought-after solutions today, and YOLO (You Only Look Once) is among the most dominant deep learning algorithms for this task. YOLO was originally developed to provide better accuracy together with faster operation. YOLOv9, the latest version of the series, introduces many changes to the architecture and detection method that allow it to identify objects faster and more accurately in uncertain urban environments. This makes YOLOv9 well suited for self-driving cars, where detection must be both fast and accurate to prevent accidents.

In this research, YOLOv9 is applied to improve an autonomous vehicle's object detection on real-time video feeds, focusing on object identification. The system processes data from cameras fixed to the car in order to identify pedestrians, other vehicles, and stationary objects, and initiates a response depending on what is identified. YOLOv9 was chosen for this task because it demonstrates a balance of detection speed and accuracy, which is necessary for the instant decisions automated vehicles must make.

The system was tested under several real-world driving conditions, including city, suburban, and rural roads, low and high speeds, bright and dark lighting, and various weather conditions. The tests confirmed that the YOLOv9-powered system exceeded previous models in detection accuracy across seven object classes and reduced the numbers of false positives and false negatives. These enhancements allow self-driving cars to move more safely and effectively through varied and complex conditions.

The present study shows that YOLOv9 can play a central role in improving the safety and dependability of AV systems. It is a cheaper and more efficient solution for real-time object detection, which makes it an important tool for future autonomous vehicles. Features such as predictive object tracking and motion path prediction are planned in future work to further enhance the system.

II. LITERATURE SURVEY

One surveyed work presents vehicle detection employing YOLOv8, covering image data collection, data pre-processing, model training, and object tracking using DeepSORT. The paper shows that YOLOv8 performs better than YOLOv7 and Faster R-CNN, and that YOLOv8 is less affected by low light and adverse weather conditions. The raw data was collected from Kaggle and contains images and videos of driving with annotations. The study also reports open issues such as improving the detection of vehicles at intersections and of vulnerable road users, and optimizing the model for low-power devices.
Another study proposes a novel approach based on a human attention model and vehicle vision modified through a Laplacian Pyramid algorithm for enhancing the sensory capability of smart personal vehicles [11]. To train the YOLOv4 algorithm for fast object detection in complex traffic, a dataset of 505 images from the CTS dataset was used, with feature transfer learning from the pre-trained YOLOv4 model. The integration of human attention improves accuracy in field prediction, and the study highlights the need for better attention techniques for real-time use.

The MCS-YOLO model [12] combines several components, including a Coordinate Attention Module, a Multiscale Small Object Detection Structure, and a Swin Transformer. It works well on the BDD100K dataset, outperforming YOLOv5s with a significantly higher mAP and recall rate at a real-time detection speed of 55 FPS. Further research is proposed to improve the Siamese network for small-object detection and to adapt the algorithm to other autonomous driving modes.

A further method jointly couples 3D and 2D signals for object recognition in a self-driving car, with the ground plane estimated using the RANSAC algorithm [13]. The system achieves better detection rates and shortens processing time, and aims to use data from both sensors for fixed and moving objects. However, the general problems are high computational requirements and constraints on detection range imposed by sensor resolution and ambient conditions.

Finally, a hybrid approach performs object detection on LiDAR data alongside image-level detection with YOLOv8 [9]. GPF is used in the approach, followed by sensor calibration and a 3D tracking algorithm for object pose and distance calculation. The evaluation yields good results in ground segmentation and object detection in cluttered scenes, with 1694 training images and 424 testing images under different lighting conditions. Current limitations include sensor calibration, point cloud filtering, and the time consumed when dealing with large point cloud data.

III. METHODOLOGY

The following sub-sections describe the overall approach used in this work to train an effective object detection system based on YOLOv9 for self-driving vehicles. The system uses high-performance components: the YOLOv9 model to perform real-time object detection on videos, OpenCV for image processing, and Matplotlib for visualization.

A. Architecture Overview

Based on these considerations, the object detection system for autonomous vehicles has a multi-tier design to yield high-performance, real-time detection and to fit seamlessly into the autonomous vehicle's processes.

The first components are cameras installed on the vehicle for real-time video capture. These high-resolution video feeds provide the required input to the detection system.

At the Data Acquisition Layer, frames are retrieved with relatively low latency to allow real-time processing while continuously feeding data to the detection pipeline.

Fig. 1. Architecture of proposed system

At the Preprocessing Layer (Fig. 1), the frames are converted into the formats required by the model, while resizing and scaling improve model performance. This step is taken to enable fast processing while retaining detection accuracy.

The Object Detection Layer uses YOLOv9, the intelligent layer of the system, which detects objects at very high speed. YOLOv9 takes the frames, predicts pedestrians, vehicles, and other obstacles, and produces bounding boxes, class labels, and confidence scores.

The final layer on the receiving end (Fig. 1) refines the detection output using the Non-Maximum Suppression (NMS) algorithm to delete duplicate detections and keep the best estimates.

Last but not least, the Performance Monitoring and Optimization Layer is in charge of monitoring system performance and adjusting parameter settings based on system feedback, to sustain steady performance and detection rate. This architecture enables real-time, accurate object detection with the low latency required for autonomous vehicle navigation.

B. System Overview

The architecture combines Ultralytics YOLOv9 and OpenCV for object detection in autonomous vehicles that require real-time control. The system takes the video feed from cameras affixed to the vehicle and passes it through the YOLOv9 model to detect objects such as vehicles, people, and signs.

The YOLOv9 algorithm is used to analyze multiple-object detection in each frame of the images and videos. It forecasts bounding boxes, object classes, and confidence scores at high speed; it is fast enough for real-time detection, which is crucial for self-driving cars.

The OpenCV library is used for video acquisition and frame processing. Resizing and normalizing the input frames, as well as passing them through the YOLOv9 model, are among the functions it performs. In post-processing too, OpenCV provides the NMS algorithm to eliminate redundant boxes and retain the best detection results.

Performance Optimization: This is important because latency is detrimental to the safety of the self-driving cars this object detection system aims to serve. It enables follow-up observation of performance indicators that include frame rate, detection accuracy, and the reaction time of the system.

These integrated technologies give the system real-time, high-precision object detection, making it an indispensable element in AV path planning and control.
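The duplicate-box removal described above can be illustrated with a minimal, pure-Python sketch of IoU-based non-maximum suppression. This is only a didactic version of the idea, not the actual OpenCV or Ultralytics implementation, and the corner-coordinate box format (x1, y1, x2, y2) is an assumption made for the example.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    if inter == 0:
        return 0.0
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: repeatedly keep the highest-scoring box and
    drop any remaining box that overlaps it above iou_thresh."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep
```

Given three boxes where the first two overlap heavily, `nms` keeps only the higher-scoring box of the pair plus the isolated third box, which is exactly the behavior the final layer relies on.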
C. Detailed Modules

1) Data Acquisition and Preprocessing: The system uses the COCO dataset, a large-scale object detection, segmentation, and captioning dataset. Sample images are drawn from this dataset, covering a wide range of object categories. The images are preprocessed for better performance: they are resized, their pixel values are normalized, and additional training images are generated through flipping and scaling.
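The preprocessing steps mentioned here (resizing, pixel normalization, flip augmentation) can be sketched in plain Python on nested lists of pixel values; a real pipeline would use OpenCV or NumPy, so this is purely illustrative.

```python
def normalize(image):
    """Scale 8-bit pixel values into the [0, 1] range."""
    return [[px / 255.0 for px in row] for row in image]

def hflip(image):
    """Horizontal flip, a common training-time augmentation."""
    return [row[::-1] for row in image]

def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of a 2-D grid of pixel values."""
    in_h, in_w = len(image), len(image[0])
    return [[image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]
```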
2) Object Detection Algorithm (YOLOv9): The object detection sub-network is based on the YOLOv9 model, which is well suited for real-life applications due to its real-time detection capabilities. YOLO (You Only Look Once) predicts objects in a single forward pass over the image by subdividing it into a grid and estimating bounding boxes, class probabilities, and confidence scores for the objects in each grid cell. The speed of this module is important; it therefore processes multiple frames per second while remaining highly accurate.

3) Training Process: To improve customization and performance, the system is trained on data drawn from the COCO dataset. The dataset is divided into training, validation, and test sets in order to sustain good generalization. Training optimizes the YOLOv9 model, with additional fine-tuning for certain object types.

4) Post-processing: Non-Maximum Suppression (NMS) is applied to remove redundant surrounding bounding boxes so that only the main detected objects remain. The output of the algorithm includes the detected objects, their bounding boxes, and corresponding labels; these are used in further analysis and decision-making inside the self-driving car system.

D. Comparison of YOLOv9 with Alternative Frameworks

Fig. 2 shows the comparison of YOLOv8 and YOLOv9, two object detection algorithms that are critical for our work. The two models perform exceptionally well, with some distinct differences, particularly in precision, throughput, and resource consumption.

Fig. 2. Accuracy comparison of each framework

1) YOLOv8: YOLOv8 adds to the family of object detection frameworks, developed from the previous powerful YOLO versions with new improvements for better performance and effectiveness.
• Accuracy: As shown in Fig. 2, YOLOv8 achieves approximately 88.5 mean Average Precision (mAP) on the COCO dataset. Due to its high degree of accuracy, it is a suitable solution for object detection across different scenarios and backgrounds.
• Processing Speed: YOLOv8 is designed primarily for real-time applications and has high inference speed thanks to its architecture, which consists of components such as the CSPDarknet backbone and PANet neck for better feature extraction and fusion.
• Resource Consumption: YOLOv8 consumes moderate computational power although it is generally presented as lightweight; since the model was designed for real-time applications, this consumption is expected. It performs well in both CPU and GPU environments and is thus portable across systems.

2) YOLOv9: YOLOv9 is a new version that consolidates and builds upon the original YOLO architecture.
• Accuracy: YOLOv9 achieves a mean Average Precision (mAP) of about 90 percent. This improvement in accuracy makes it a preferable choice for sophisticated applications where object detection must be precise.
• Processing Speed: YOLOv9 has fairly fast processing speed owing to many improvements in model architecture and optimization. It is designed for use where decisions need to be made rapidly.
• Flexibility: YOLOv9 is optimized for efficiency but requires somewhat more computation than YOLOv8, especially when input images have high resolution or when many detections are performed in diverse environments.

E. Performance Metrics

1) Accuracy Metrics Comparison Chart: The chart in Fig. 2 illustrates the comparative accuracy of each framework, showing how YOLOv9 outperforms YOLOv8 in detection accuracy for the object detection application using the COCO dataset.

2) Processing Speed Comparison (FPS): Table I includes the processing speed comparison (frames per second) between YOLOv8 and YOLOv9. While YOLOv8 provides faster frame processing, YOLOv9 offers higher accuracy, making it better suited to applications requiring high precision.

3) Performance Metrics:
• Accuracy: In our object detection experiments using the COCO dataset, YOLOv9 attained a confidence level of
TABLE I
PERFORMANCE METRICS COMPARISON

Framework   Detection Accuracy   Processing Speed (FPS)   Resource Consumption (MB)
YOLOv8      88.5%                40 (CPU)                 75
YOLOv9      90%                  35 (GPU)                 100
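For illustration, the Table I figures can be encoded as data and queried when choosing a model for a given deployment budget. The `pick_model` helper and its constraint thresholds are hypothetical conveniences for this sketch, not part of the evaluated system.

```python
# Figures transcribed from Table I; the selection helper itself is hypothetical.
MODELS = {
    "YOLOv8": {"map": 88.5, "fps": 40, "mem_mb": 75},
    "YOLOv9": {"map": 90.0, "fps": 35, "mem_mb": 100},
}

def pick_model(min_map=0.0, min_fps=0, max_mem_mb=float("inf")):
    """Return the highest-mAP model satisfying every constraint, else None."""
    candidates = [(spec["map"], name) for name, spec in MODELS.items()
                  if spec["map"] >= min_map
                  and spec["fps"] >= min_fps
                  and spec["mem_mb"] <= max_mem_mb]
    return max(candidates)[1] if candidates else None
```

Under these numbers, an accuracy-first deployment selects YOLOv9, while a tight frame-rate or memory budget falls back to YOLOv8, mirroring the trade-off discussed below.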

90% while YOLOv8 was only 88.5% accurate. This signifies the enhanced capacity of YOLOv9 for object detection and recognition in complex scenarios, assigning more accurate classes to its targets. The higher accuracy makes YOLOv9 the better fit for applications that demand precision, for instance self-driving automobiles.
• Processing speed: The throughput of the models was measured in frames per second (FPS). YOLOv8 reached 40 FPS on GPU and was faster than YOLOv9, which reached 35 FPS on GPU. Although YOLOv8 runs faster, YOLOv9 delivers a more precise detection outcome, which is preferable in situations where accuracy takes precedence over speed.
• System Resource Usage: YOLOv9 used 100 MB to function while YOLOv8 used only 75 MB to perform almost the same functions; the resulting bounding boxes can be seen in Fig. 3 and Fig. 4. The extra memory consumption of YOLOv9 is attributable to its more complex structure, which makes the model more accurate. However, both models work fast enough to be used in real-time systems with relatively limited resources.

Fig. 3. Working of detection
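The throughput figures reported above can be reproduced with a simple harness that times a per-frame callable and converts the elapsed time into FPS. `measure_fps` is an illustrative helper, not the instrumentation used in the experiments.

```python
import time

def measure_fps(process_frame, frames):
    """Run process_frame over every frame and return frames per second."""
    start = time.perf_counter()
    for frame in frames:
        process_frame(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed if elapsed > 0 else float("inf")
```

In practice `process_frame` would wrap the full preprocess-detect-suppress pipeline, so the reported FPS reflects end-to-end latency rather than model inference alone.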

Fig. 4. Working of detection

IV. RESULTS AND DISCUSSION

1) User Study: A user study was conducted with 15 participants drawn from the self-driving car sector. The participants performed object detection tasks using both models under conditions such as high-speed object movement and occluded objects. 87% of participants reported that they preferred YOLOv9 over YOLOv8 due to its higher detection rate even in complex scenes, although YOLOv9 is slightly slower. 93% of participants felt that the higher FPS of YOLOv8 was suitable for basic detection problems or for areas where fast object identification is crucial, for instance highways with relatively few obstacles.
2) Discussion: Some of the most significant trade-offs of speed, accuracy, and resources are well demonstrated by the comparison of YOLOv8 and YOLOv9, as can be seen in Fig. 3 and Fig. 4. For situations where precision is of utmost importance, such as urban environments with dense traffic and a high level of interaction between objects, YOLOv9 shows higher accuracy. On the other hand, YOLOv8 achieves faster processing rates and can be used in less complicated surroundings or real-time applications where speed is of paramount importance and small compromises in effectiveness are acceptable.
Nevertheless, this paper shows that YOLOv9 consumes more memory while offering better object detection accuracy than the alternatives, and is therefore more suitable for safety-critical applications in autonomous vehicles. YOLOv8 is a good choice for systems with hardware constraints or for less stringent object detection tasks.
We conclude that, independent of a specific deployment setting, it is impossible to state that one version of YOLO is intrinsically better than the other. Where high accuracy of object detection is required, the YOLOv9 model is the better fit; where overall system speed matters more than model accuracy, YOLOv8 performs best. Both models support real-time object detection in self-driven cars, but due to their performance characteristics they suit different operational requirements.
V. CONCLUSION

This work has established an effective object detection approach for self-driving cars built on the YOLOv9 model. A detection accuracy of 90% allows YOLOv9 to outperform previous models such as YOLOv8, whose accuracy was slightly lower at 88.5%. This higher detection precision in complex driving scenes is vital for effective and secure vehicle automation. Thanks to improvements such as optimized convolutional layers and tailored anchor boxes, YOLOv9 offers higher accuracy and efficiency in detecting multiple objects on the road, including complicated or obscured ones.

The modifications applied in YOLOv9 not only yield better predictive accuracy but also improve overall inference efficiency, at the cost of a moderate decrease in operating speed. The system supports prompt decision-making in real time, which enhances the safety of autonomous driving. YOLOv9 has exhibited high computational efficiency in real-time object detection tasks while providing a near-impeccable level of accuracy, which makes it a strong candidate for incorporation into real-world autonomous vehicles.

Several prospects exist for extending the model further. Future work could develop lightweight model versions for deployment on fog devices with lower battery and memory capacity while preserving high levels of accuracy. The model can also be expanded to work with multi-modal sensor data such as LiDAR and RADAR, which may enhance its performance in real-world scenarios such as urban streets, at night, or during extreme weather. For a stronger validation of the model, extensive real-world testing over long periods will be required.

Moreover, when YOLOv9 is combined with other frameworks for decision-making, navigation, online monitoring, and safety, the resulting system could be made more versatile across driving scenarios. This project represents a cutting-edge development in object detection and takes autonomous driving technology to the next level in terms of safety and deployment suitability for near-future smart transportation systems.

VI. REFERENCES

[1] A. Juyal, S. Sharma and P. Matta, "Deep Learning Methods for Object Detection in Autonomous Vehicles," 2021 5th International Conference on Trends in Electronics and Informatics (ICOEI), Tirunelveli, India, 2021.
[2] C. Wöhler and J. K. Anlauf, "Real-time object recognition on image sequences with the adaptable time delay neural network algorithm — applications for autonomous vehicles," Elsevier, vol. 19, no. III, Aug. 2022.
[3] A. Benjumea, I. Teeti, F. Cuzzolin and A. Bradley, "Improving small object detection in YOLOv5 for autonomous vehicles," IJRASET, vol. 9, no. XII, Jan. 2023.
[4] M. A. Bin Zuraimi and F. H. Kamaru Zaman, "Vehicle Detection and Tracking using YOLO and DeepSORT," Malaysia, 2021.
[5] X. Chang, H. Pan, W. Sun and H. Gao, "YolTrack: Multitask Learning Based Real-Time Multiobject Tracking and Segmentation for Autonomous Vehicles," IEEE Transactions on Neural Networks and Learning Systems, vol. 32, no. 12, pp. 5323-5333, Dec. 2021.
[6] R. Wang, Z. Wang, Z. Xu, C. Wang, Q. Li, Y. Zhang, H. Li and J. Liu, 2021.
[7] P. Ankireddy, V. Siva Krishna Reddy and V. Lokeswara Reddy, "Vehicle Detection and Tracking Using YOLOv8 and Deep Learning to Boost Image Processing Quality," vol. 05, no. 04, Apr. 2021.
[8] "Object detection using YOLOv7," vol. 5, no. VII, July 2021.
[9] Y. Dai, D. Kim and K. Lee, "An Advanced Approach to Object Detection and Tracking in Robotics and Autonomous Vehicles Using YOLOv8 and LiDAR Data Fusion," Electronics, vol. 13, 2250, 2024.
[10] M. Mohamad, "A Review on YOLOv8," 2020.
[11] Y. Zhao, C. Lei, Y. Shen, Y. Du and Q. Chen, "Improving Autonomous Vehicle Visual Perception by Fusing Human Gaze and Machine Vision," vol. 24, no. 11, pp. 12716-12725, Nov. 2023.
[12] Y. Cao, C. Li, Y. Peng and H. Ru, "MCS-YOLO: A Multiscale Object Detection Method for Autonomous Driving Road Environment Recognition," IEEE Access, vol. 11, pp. 22342-22354, 2023.
[13] S. Weon, S.-G. Lee and J.-K. Ryu, "Object Recognition Based Interpolation With 3D LIDAR and Vision for Autonomous Driving of an Intelligent Vehicle," IEEE Access, vol. 8, pp. 65599-65608, 2020.
[14] R. Ravindran, M. J. Santora and M. M. Jamali, "Multi-Object Detection and Tracking, Based on DNN, for Autonomous Vehicles: A Review," IEEE Sensors Journal, vol. 21, no. 5, pp. 5668-5677, March 2021.
[15] J. E. Hoffmann, H. G. Tosso, Rahman et al., "Real-Time Adaptive Object Detection and Tracking for Autonomous Vehicles," IEEE Transactions on Intelligent Vehicles, vol. 6, no. 3, pp. 450-459, Sept. 2021.
