
MATEC Web of Conferences 393, 04002 (2024) https://doi.org/10.1051/matecconf/202439304002
STAAAR-2023

Object Detection for Self-Driving Car in Complex Traffic Scenarios

Biplab Das1*, Pooja Agrawal2
1,2 School of Robotics, Defence Institute of Advanced Technology (DIAT), DU, Pune, India

Abstract. Recent advances in artificial intelligence (AI), and in convolutional neural networks (CNNs) in particular, have greatly enhanced the object detection capabilities of self-driving cars. However, striking a balance between high precision and fast processing in vehicular settings remains a persistent challenge. Developing nations such as India, which has the second-largest population in the world, introduce unique intricacies to road scenarios. Indian roads pose numerous challenges, including diverse traffic patterns and vehicle types seen almost exclusively in India, such as auto-rickshaws. This study presents the results of evaluating YOLOv8 models, which demonstrated superior performance in Indian traffic conditions compared to other existing YOLO models. The evaluation used a dataset compiled from data collected in the cities of Bangalore and Hyderabad and their surrounding areas. The findings demonstrate how effectively the YOLOv8 models address the distinctive problems presented by Indian road conditions. This study advances the development of autonomous vehicles designed for intricate traffic situations such as those found on Indian roads.

1 Introduction
The incorporation of artificial intelligence (AI) technology is causing the automotive sector to evolve quickly [1,2]. With an average annual growth rate of 36.15%, worldwide demand for AI in cars is predicted to increase significantly, with projected sales of automotive AI technology reaching USD 6.6 billion by 2025 [3,4]. Object detection is a very important part of self-driving: the car must locate and classify the objects in front of it and in its surroundings so that it can decide how to drive efficiently. Computer vision has grown enormously in recent years, and object detection for autonomous cars has improved with it. YOLO models have shown great performance compared to other detection models [5]. Speed is crucial for real-time applications, and YOLO models, being single-stage detection systems, outperform two-stage detection systems such as R-CNN and Faster R-CNN in object localization and classification speed [6]. Much research has been done on object detection for self-driving cars, but most of it used datasets that do not contain very complex data. It is extremely difficult to assure safety and dependability in extreme corner cases for self-driving cars without the

* Corresponding author: [email protected]

© The Authors, published by EDP Sciences. This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0 (https://creativecommons.org/licenses/by/4.0/).
IDD dataset [7], which contains very complex and diverse road scene scenarios. The dataset used here consists of complex traffic scenarios from roads in Hyderabad, Bangalore, and their outskirts in India. The object classes in the dataset are car, person, bicycle, bus, traffic sign, traffic light, truck, autorickshaw, animal, motorcycle, and rider. By training the most recent YOLO algorithms on an unconstrained, complicated environment dataset and comparing their accuracy, this paper makes significant advances in the field of object localization and classification for self-driving cars.

Object localization and classification are very important parts of vision-based self-driving cars. In recent years various deep learning methods have achieved higher accuracy and speed in detecting objects in traffic road scenarios. Below we discuss some recent papers focused on object detection for self-driving cars.

1.1 Related Works

The authors of [8] proposed an improved YOLOv4 model to identify ten different types of objects, together with a model to forecast whether pedestrians would cross the street. They used the BDD100k dataset for object detection, and for pedestrian pose estimation they collected and validated their own data. By optimizing the YOLOv4 model they reduced its complexity, achieving faster speed while preserving accuracy.
In [9] the authors improved the capability of the YOLOv5 object detector for detecting tiny objects, specifically in the context of autonomous racing. They propose architectural modifications to the YOLOv5 model, including changes to the backbone, neck, and other elements, and suggest further research and testing with different datasets to validate and refine the proposed techniques. Overall, the research advances autonomous vehicle vision systems by offering insights into enhancing tiny object recognition in the YOLOv5 model.
A novel hybrid model for object detection and tracking in autonomous cars is presented in [10]. It combines a Kalman-filter-based tracking model with a two-stage object detector based on Faster R-CNN. Extensive experiments on the KITTI dataset showed faster and more accurate results than previous approaches, which is essential for real-time object detection and vehicle safety. It is important to note, though, that the intricacy of the design can require a large amount of processing power.
The authors of [11] emphasized the importance of testing self-driving cars on Indian roads specifically, which requires a specialized dataset because of the unique, unstructured complexities present in India; consequently, they developed such a dataset. On this dataset the authors employed Faster R-CNN for object detection and conducted an evaluation. Various metrics, such as classifier accuracy, RPN classifier loss, RPN regression loss, detector classifier loss, and detector regression loss, were used to evaluate the model's performance after training on 1200 images.

This study proposes an enhancement of object detection for self-driving cars based on the YOLOv8 model under complex traffic environments, and compares it against other YOLO models. Fig. 1 shows the architecture of YOLOv8.


Fig. 1. Architecture of YOLOv8 [12].

2 Methodology
The subsections below describe data collection and preprocessing, as well as the strategies for training, validation, and testing alongside their corresponding evaluation criteria.

2.1 Data Preparation

The unstructured and diverse dataset used here was collected from open sources: a total of 9300 annotated images with the class labels 0-animal, 1-autorickshaw, 2-bicycle, 3-bus, 4-car, 5-motorcycle, 6-person, 7-rider, 8-traffic light, 9-traffic sign, and 10-truck. The collection includes road scene views of Indian cities and their surrounding areas, featuring a combination of towns and villages under varying lighting conditions. Fig. 2 shows some images from the dataset. YOLO models prefer particular input image sizes, such as 640x640 and 1280x1280, with annotations in YOLO format. Therefore, before training, the whole dataset was preprocessed with the online tool Roboflow (https://app.roboflow.com) for auto-orientation, resizing the images to 640x640 to reduce training time, and converting the annotations from XML to TXT (YOLO) format. Additionally, images were resized to 1280x1280 to assess model performance at higher resolution. The dataset was split on the Roboflow website in a 70/20/10 percent ratio for training, validation, and testing.
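The XML-to-YOLO annotation conversion described above can be sketched as follows. This is a minimal sketch assuming Pascal-VOC-style XML; the exact tag layout of the dataset's annotations is an assumption, while the class indices follow the list given above.

```python
import xml.etree.ElementTree as ET

# Class indices as listed in the dataset description above.
CLASSES = ["animal", "autorickshaw", "bicycle", "bus", "car", "motorcycle",
           "person", "rider", "traffic light", "traffic sign", "truck"]

def voc_to_yolo(xml_text):
    """Convert one Pascal-VOC-style XML annotation to YOLO TXT lines.

    Each output line is: <class-id> <x-center> <y-center> <width> <height>,
    with coordinates normalized to [0, 1] by the image dimensions.
    """
    root = ET.fromstring(xml_text)
    img_w = float(root.findtext("size/width"))
    img_h = float(root.findtext("size/height"))
    lines = []
    for obj in root.iter("object"):
        cls = CLASSES.index(obj.findtext("name"))
        box = obj.find("bndbox")
        xmin, ymin = float(box.findtext("xmin")), float(box.findtext("ymin"))
        xmax, ymax = float(box.findtext("xmax")), float(box.findtext("ymax"))
        cx = (xmin + xmax) / 2 / img_w    # normalized box center x
        cy = (ymin + ymax) / 2 / img_h    # normalized box center y
        w = (xmax - xmin) / img_w         # normalized box width
        h = (ymax - ymin) / img_h         # normalized box height
        lines.append(f"{cls} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
    return lines
```

One such TXT file is written per image, which is the format the YOLO trainers expect.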

Fig. 2. Some unstructured road scene images from the dataset.


2.2 Testing and Evaluation

All training and evaluation were done in the Python programming language. The deep learning-based YOLOv8 model was trained on the unstructured complex traffic dataset through transfer learning. The YOLOv8 model detects objects faster and more accurately than existing YOLO models, and this study applied it to an extensive variety of traffic circumstances, such as those on Indian roads. This work uses the YOLOv8 models provided by Ultralytics [13] for training, saving models, validation, and testing. Ultralytics provides several pretrained base variants of YOLOv8: yolov8n.pt (nano), yolov8s.pt (small), yolov8m.pt (medium), yolov8l.pt (large), and yolov8x.pt (extra-large). These base models were used for training, validation, and testing: around 6510 images were used for training, 1860 for validation, and 930 for testing. Fig. 3 shows the workflow of this paper.

Fig. 3. Object Detection Model.
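The transfer-learning workflow above can be sketched with the Ultralytics Python API as below. This is a minimal sketch, not the authors' exact script: the data.yaml path, epoch count, and other hyperparameters are assumptions.

```python
def train_and_validate(data_yaml="data.yaml", weights="yolov8x.pt",
                       imgsz=640, epochs=100):
    """Fine-tune a pretrained YOLOv8 base model via transfer learning,
    then run validation to obtain the mAP metrics reported in Table 1.

    The import is deferred so the sketch can be read (and the function
    defined) without the `ultralytics` package installed
    (pip install ultralytics).
    """
    from ultralytics import YOLO
    model = YOLO(weights)                             # load pretrained base model
    model.train(data=data_yaml, imgsz=imgsz, epochs=epochs)
    return model.val()                                # metrics incl. mAP50, mAP50-95
```

Swapping `weights` for yolov8m.pt or yolov8l.pt, and `imgsz` for 1280, reproduces the other configurations compared in this study.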

A crucial step in evaluating object detection algorithms is determining the extent to which predicted bounding boxes overlap with the ground truth, which indicates successful recognition. Intersection over Union (IoU) is used for this; mAP0.5 is the mean average precision at an IoU threshold of 0.5, meaning a detection is deemed successful if there is more than 50% overlap. At higher thresholds, the bounding box must be localized with greater accuracy. All evaluations were based on mAP0.5, mAP0.5-0.95, the Precision-Recall curve, and other object detection metrics. The formulas for IoU, Precision, and Recall are given below.

IoU = (Area of Intersection) / (Area of Union)    (1)

Precision = (Correct Predictions) / (Total Predictions)    (2)

Recall = (Correct Predictions) / (Total Ground Truths)    (3)
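Equations (1)-(3) can be computed directly; a small self-contained sketch follows (the `(xmin, ymin, xmax, ymax)` box format is an assumption):

```python
def iou(box_a, box_b):
    """Eq. (1): intersection area over union area of two axis-aligned
    boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def precision_recall(num_correct, num_predictions, num_ground_truths):
    """Eqs. (2) and (3)."""
    precision = num_correct / num_predictions if num_predictions else 0.0
    recall = num_correct / num_ground_truths if num_ground_truths else 0.0
    return precision, recall
```

With the mAP0.5 criterion, a prediction counts as correct when `iou(pred, gt) > 0.5` and the class labels match.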


3 Experimental Results
This section presents the evaluation of the object localization and classification models used for detecting objects in complex traffic scenarios. The object detection models were trained, validated, and tested on a complex dataset of 9300 labeled images.

Table 1. Performance of trained YOLO models.

Model      Image size used for training    mAP0.5    mAP0.5-0.95
YOLOv5l    640 x 640                       0.58      0.374
YOLOv8m    640 x 640                       0.579     0.371
YOLOv8l    640 x 640                       0.597     0.388
YOLOv8x    640 x 640                       0.603     0.390
YOLOv8x    1280 x 1280                     0.688     0.446

The dataset contains various types of labeled images, such as person, traffic light, motorcycle, bicycle, traffic sign, animal, rider, bus, car, and truck. The mean average precision of the different models is presented in Table 1. Furthermore, Fig. 4 displays the precision-recall curves of the trained models: YOLOv8x (1280x1280), YOLOv8x (640x640), YOLOv8l, and YOLOv8m in (a), (b), (c), and (d) respectively. The precision-recall curve displays the balance between precision and recall at various thresholds: high precision corresponds to a low false positive rate, while high recall corresponds to a low false negative rate, so a large area under the curve indicates both high precision and high recall. The precision-recall curve of the YOLOv8x (1280x1280) model shows the best results among the compared models. Fig. 5 shows the validation results of the YOLOv8x (1280x1280) and YOLOv8x (640x640) models in (a) and (b); the YOLOv8x (1280x1280) model achieved good accuracy. Fig. 6 displays sample input images (1280x1280) used for testing, and Fig. 7 displays the objects detected by the YOLOv8x model. The YOLOv8x model demonstrates suboptimal performance on underrepresented classes, despite delivering strong results for the other categories. When the input image resolution was increased to 1280x1280, the YOLOv8x model performed approximately 14% better than on 640x640 images, as seen in Table 1.
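The ~14% figure follows from Table 1 if it is read as relative improvement (our assumed interpretation), and holds on both metrics:

```python
def relative_improvement(new, old):
    """Relative gain of `new` over `old`, as a percentage."""
    return 100.0 * (new - old) / old

# Table 1: YOLOv8x at 1280x1280 vs 640x640.
gain_map50 = relative_improvement(0.688, 0.603)     # mAP0.5: ~14.1%
gain_map50_95 = relative_improvement(0.446, 0.390)  # mAP0.5-0.95: ~14.4%
```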


Fig. 4. Precision-Recall curves of different YOLOv8 models after validation. a) YOLOv8x (image size 1280x1280) b) YOLOv8x (image size 640x640) c) YOLOv8l (image size 640x640) d) YOLOv8m (image size 640x640).

Fig. 5. Validation accuracy of different YOLOv8 models. a) YOLOv8x (image size 1280x1280) b) YOLOv8x (image size 640x640).


Fig. 6. Sample images for testing.

Fig. 7. Objects detected by the YOLOv8x model.

4 Conclusion
Object detection is a critical component of self-driving cars, but it can be difficult in a complex, varied traffic environment. This paper proposes using the YOLOv8 model to improve object detection in unstructured complex traffic scenarios. The YOLOv8x (1280x1280) trained model demonstrated superior performance over the other models: when the input image resolution was increased to 1280x1280, the YOLOv8x model performed approximately 14% better than on 640x640 images, as seen in Table 1. Model performance can be further improved by training on a larger dataset with more, and more nearly equal, numbers of instances of each class.

References
1. H. Mankodiya, D. Jadav, R. Gupta, S. Tanwar, W.C. Hong, R. Sharma, "OD-XAI:
Explainable AI-Based Semantic Object Detection for Autonomous Vehicles," Appl.
Sci. 12, 5310 (2022).
2. S.A. Khan, H. Lim, "Novel Fuzzy Logic Scheme for Push-Based Critical Data
Broadcast Mitigation in VNDN," Sensors 22, 8078 (2022).
3. A. N. Bhavana, M.M. Kodabagi, "Exploring the Current State of Road Lane Detection:
A Comprehensive Survey," Int. J. Hum.-Comput. Interact. 2, 40-46 (2023).
4. S.A. Khan, H. Lim, "Push-Based Forwarding Scheme Using Fuzzy Logic to Mitigate
the Broadcasting Storm Effect in VNDN," in Proceedings of the Artificial Intelligence
and Mobile Services–AIMS 2022: 11th International Conference, Held as Part of the
Services Conference Federation, SCF , Honolulu, HI, USA, December 10–14, (2022),
pp. 3–17. Springer, Berlin/Heidelberg, Germany.


5. J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified,
Real-Time Object Detection," in 2016 IEEE Conference on Computer Vision and
Pattern Recognition (CVPR), 2016, pp. 779-788.
6. S.A. Khan, H.J. Lee, H. Lim, "Enhancing Object Detection in Self-Driving Cars Using
a Hybrid Approach," Electronics 12, 2768 (2023).
7. G. Varma, A. Subramanian, A. Namboodiri, M. Chandraker, and C.V. Jawahar, "IDD:
A Dataset for Exploring Problems of Autonomous Navigation in Unconstrained
Environments," in IEEE Winter Conference on Applications of Computer Vision
(WACV), 2019.
8. Y. Li, H. Wang, L.M. Dang, D. Han, H. Moon, T. Nguyen, "A Deep Learning-Based
Hybrid Framework for Object Detection and Recognition in Autonomous Driving,"
IEEE Access, vol. 8, (2020).
9. A. Benjumea, I. Teeti, F. Cuzzolin, and A. Bradley, "YOLO-Z: Improving Small
Object Detection in YOLOv5 for Autonomous Vehicles," (2021).
10. Y. Kortli, S. Gabsi, Y. Lew Yan Voon, J. Lew, M. Jridi, M. Maher, M. Marzougui, and
M. Atri, "Deep Embedded Hybrid CNN-LSTM Network for Lane Detection on
NVIDIA Jetson Xavier NX," Knowledge-Based Systems, (2022).
11. G.N.V.V. Satya Sai Srinath Namburi, Athul Zac Joseph, S. Umamaheswaran, Ch.
Lakshmi Priyanka, Malavika Nair M, and Praveen Sankaran, "NITCAD - Developing
an Object Detection, Classification, and Stereo Vision Dataset for Autonomous
Navigation in Indian Roads," Procedia Computer Science, (2020).
12. J. Solawetz and F. Francesco, "What is YOLOv8? The Ultimate Guide," (2023).
Available online: https://blog.roboflow.com/whats-new-in-yolov8/.
13. G. Jocher and AyushExel, "YOLO by Ultralytics," (2023). Available online:
https://docs.ultralytics.com/.
