Project Report (Group 9)

Uploaded by

Abhay Chauhan

(OBJECT DETECTION IN AUTONOMOUS VEHICLES USING DEEP LEARNING)


Project report submitted
in partial fulfilment of the requirement for the degree of

Bachelor of Technology

By

Abhay Singh (223025001)

Aryan Kaushik (223025016)

Kashish Bhardwaj (223025036)

Lakshay Choudhary (223025040)

COMPUTER SCIENCE ENGINEERING DEPARTMENT

COLLEGE OF SMART COMPUTING, COER UNIVERSITY


(April 2024)
CERTIFICATE

It is certified that the work contained in the project report titled
“Object detection in autonomous vehicles using deep learning” by
Abhay Singh, Aryan Kaushik, Kashish Bhardwaj and Lakshay Choudhary
has been carried out under my/our supervision and that this work has
not been submitted elsewhere for a degree.

Signature of Supervisor(s)
MR. KAPIL KUMAR
CSE DEPARTMENT
COER UNIVERSITY
April, 2024
Declaration
I declare that this written submission represents my ideas in my own words and
where others' ideas or words have been included, I have adequately cited and
referenced the original sources. I also declare that I have adhered to all principles of
academic honesty and integrity and have not misrepresented or fabricated or falsified
any idea/data/fact/source in my submission. I understand that any violation of the
above will be cause for disciplinary action by the Institute and can also evoke penal
action from the sources which have thus not been properly cited or from whom
proper permission has not been taken when needed.

Abhay Singh (223025001)

Aryan Kaushik (223025016)

Kashish Bhardwaj (223025036)

Lakshay Choudhary (223025040)


ACKNOWLEDGMENT

We would like to convey our heartfelt gratitude to our mentor, Mr. Kapil Kumar, for his
invaluable advice and assistance in completing our project. He was there to assist us at every
step of the way, and his motivation is what enabled us to accomplish our task effectively. We
would also like to thank all of the supporting personnel who assisted us by supplying the
essential equipment, without which we would not have been able to work efficiently on this
project. We would also like to thank COER University (College of Engineering Roorkee) for
accepting our project in our desired field of expertise. We would also like to thank our
friends and parents for their support and encouragement as we worked on this project.

This project would also not have been completed without the help of its readers, including
ourselves, and of the various websites that helped us greatly in writing and understanding
this project.

Finally, we express our gratitude to everyone.

Thank you
CONTENTS

I. Introduction
II. Review of Literature
III. Report on present investigation
IV. Result and Discussion
V. Summary and Conclusion
VI. Appendix
VII. References

S.No. CHAPTER PAGE NO.

1. INTRODUCTION 1-2
1.1 APPLICATION OF OBJECT DETECTION IN AUTONOMOUS VEHICLES
1.1.1. Pedestrian Detection
1.1.2. Vehicle Detection
1.1.3. Obstacle Detection
1.1.4. Traffic Sign recognition

2. REVIEW OF LITERATURE 3-5


2.1 HISTOGRAM OF ORIENTED GRADIENT (HOG)
2.2 Region-based Convolutional Neural Networks (R-CNN)
2.3 Fast R-CNN
2.4 Faster R-CNN
2.5 YOLO

3. REPORT ON THE PRESENT INVESTIGATION 6-8


3.1 Methodology:
3.1.1 Data collection
3.1.2 Labelling Img
3.1.3 Downloading Required model
3.1.4 Training the model
3.1.5 Making predictions
3.2 Technologies used:
3.2.1 Python
3.2.2 Yolov8 model
3.2.3 Deep learning
4. RESULT AND DISCUSSION 9-10
5. SUMMARY AND CONCLUSIONS 11
6. Appendix 12
7. REFERENCES 13
LIST OF ABBREVIATIONS

YOLO     YOU ONLY LOOK ONCE
HOG      HISTOGRAM OF ORIENTED GRADIENT
CNN      CONVOLUTIONAL NEURAL NETWORK
R-CNN    REGION BASED CONVOLUTIONAL NEURAL NETWORK
IMG      IMAGE
CHAPTER 1 INTRODUCTION
In this project, “Object detection in autonomous vehicles using deep learning”, we detect
objects in real time using the YOLOv8 algorithm. Object detection identifies and locates the
objects within a frame. The terms object classification and object detection are often
confused. Object classification only identifies and categorizes an object; it does not locate
the position of the object in the image. Object detection, on the other hand, both identifies
and locates the objects in a given image. With the development of deep learning, accurate
object detection has become achievable. Today, car manufacturers such as Tesla use object
detection technology to build self-driving cars: when the driver activates Autopilot mode,
the car is able to navigate by itself. We use the YOLOv8 algorithm because of its speed and
accuracy. Over the years, many models have been introduced to detect objects, such as
detectors based on CNN (Convolutional Neural Network) and R-CNN (Region-Based Convolutional
Neural Network), but they cannot detect objects in real time because they take too long to
process each image. YOLO was proposed to address these issues: it detects objects in real
time, with high accuracy and very little time per image. YOLO was introduced in 2015 by
Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. It is a single-shot
algorithm, which means that it detects objects in a single pass over the image. The YOLO
algorithm divides the given image into grid cells, and for each cell it predicts bounding box
coordinates together with the probability that an object is present.

The YOLO algorithm works in several steps:

1. First, the image is passed through a CNN. This is done to extract the features of the
image.

2. The extracted features are then passed through a series of fully connected layers that
help in predicting the object probabilities and bounding box coordinates.

3. The image is divided into a grid of cells, and each grid cell is responsible for
predicting a set of bounding boxes and object probabilities.

4. A post-processing algorithm (non-maximum suppression) is applied to the predicted bounding
boxes to remove overlapping boxes and keep the box with the highest probability.

5. The final output is a set of predicted bounding boxes.
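
The post-processing in step 4 is commonly done with non-maximum suppression (NMS). The following is a minimal sketch in plain Python, not the code used inside YOLOv8. The box format (x1, y1, x2, y2), the function names and the 0.5 overlap threshold are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes: Sequence[Box], scores: Sequence[float],
                        iou_threshold: float = 0.5) -> List[int]:
    """Return the indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

For example, two heavily overlapping boxes around the same car collapse to the one with the higher score, while a box around a different car elsewhere in the frame is kept.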
1.1 Application of object detection in autonomous vehicles:

1. Pedestrian Detection

Object detection is used to detect pedestrians on the road, which increases safety.

2. Vehicle Detection

By detecting other vehicles, an autonomous vehicle is able to keep a safe distance and
navigate through traffic.

3. Obstacle Detection

Detecting obstacles, such as construction work on the road, helps to avoid accidents.

4. Traffic Sign Recognition

Detecting traffic signs and speed limits allows the vehicle to follow traffic rules and
regulations.

CHAPTER 2 REVIEW OF LITERATURE
Many models have been used to detect objects. The older ones have limitations, which is why
new models were proposed to increase speed and accuracy. The models reviewed in this chapter
are:

1 . Histogram of oriented Gradients (HOG)

2 . Region-based Convolutional Neural Networks (R-CNN)

3 . Fast R-CNN

4. Faster R-CNN

5 . YOLO

2.1 HISTOGRAM OF ORIENTED GRADIENT (HOG)

Histogram of Oriented Gradients was introduced in 1986. It is one of the oldest methods for
object detection. It was not very popular at that time. It became popular in 2005, when it
began to be used for many computer vision tasks. HOG extracts the features of an image in
order to detect the objects in it.

The points below describe how HOG works -

1. First, the gradient (magnitude and direction) is computed for the image, and the image is
divided into cells of 8x8 pixels.

2. The 64 gradient vectors of each cell are assigned to angular bins to form a histogram for
that cell. This reduces the 64 vectors to 9 values.

3. Once the 9 histogram values of each cell are available, the cells are grouped into
overlapping blocks.

4. The final step is to form the feature blocks, normalize the feature vector of each block,
and collect all the feature vectors to obtain the complete HOG features.
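
Steps 2 and 4 above can be sketched with NumPy as follows. This is a simplified illustration, not a full HOG implementation: it assumes unsigned orientations (0 to 180 degrees), assigns each gradient to a single bin without the interpolation between neighbouring bins that full implementations usually apply, and the function names are hypothetical.

```python
import numpy as np


def cell_histogram(cell: np.ndarray, n_bins: int = 9) -> np.ndarray:
    """Reduce the 64 gradients of an 8x8 cell to a 9-bin orientation histogram."""
    cell = cell.astype(np.float64)
    gy, gx = np.gradient(cell)            # derivatives along rows and columns
    magnitude = np.hypot(gx, gy)
    orientation = np.degrees(np.arctan2(gy, gx)) % 180.0  # unsigned angle
    bin_width = 180.0 / n_bins
    # Clamp guards against an angle that rounds to exactly 180 degrees.
    bins = np.minimum((orientation // bin_width).astype(int), n_bins - 1)
    hist = np.zeros(n_bins)
    np.add.at(hist, bins.ravel(), magnitude.ravel())  # magnitude-weighted votes
    return hist


def normalize_block(cell_hists, eps: float = 1e-6) -> np.ndarray:
    """Concatenate the histograms of one block and L2-normalize them."""
    v = np.concatenate(cell_hists)
    return v / np.sqrt(np.sum(v ** 2) + eps ** 2)
```

With 2x2 cells per block, each block contributes a vector of 36 values; concatenating the vectors of all blocks gives the HOG descriptor of the image.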

LIMITATIONS –

• It is time consuming.
• Computational complexity is very high.

2.2 Region-based Convolutional Neural Networks (R-CNN)

It was introduced in 2014. This model addresses many of the issues present in HOG. It
extracts about 2000 region proposals from the image using the selective search algorithm,
which selects the most significant regions.

The points below describe how R-CNN works -

1. First, the selective search algorithm selects the important region proposals by generating
multiple sub-segments of the image.

2. Once the selective search algorithm has finished, the next step is to extract the
features. This is done with a pre-trained convolutional neural network.

3. The final step is to make predictions for the image. A classification model predicts the
class of each proposed region, and a regression model corrects the bounding box of the
proposed region.
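
The bounding-box correction in the last step can be illustrated as follows. The sketch uses the parameterization described in the R-CNN paper, where the regression model outputs four offsets (dx, dy, dw, dh) that shift the centre and rescale the size of a proposal. The function name and the (centre x, centre y, width, height) box format are assumptions for illustration.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed: (center_x, center_y, width, height)


def apply_box_deltas(proposal: Box, deltas: Box) -> Box:
    """Correct a region proposal with predicted regression offsets."""
    px, py, pw, ph = proposal
    dx, dy, dw, dh = deltas
    gx = pw * dx + px        # shift the centre in units of the proposal size
    gy = ph * dy + py
    gw = pw * math.exp(dw)   # rescale width and height in log space
    gh = ph * math.exp(dh)
    return gx, gy, gw, gh
```

Offsets of zero leave the proposal unchanged, so the regression model only has to learn small corrections.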

LIMITATIONS –

• High memory consumption.
• R-CNN can be slow during the training phase because it processes each region proposal
independently.

2.3 Fast R-CNN

This model was introduced in 2015. In R-CNN, each region proposal is passed through the CNN
architecture one by one, and the selective search algorithm generates about 2000 region
proposals per image, so training and running R-CNN is complex and expensive. Fast R-CNN was
introduced to remove this problem. It takes the whole image as the input to the CNN
architecture once, instead of taking the 2000 region proposals separately, and the features
of each proposal are then read from the shared feature map.

LIMITATIONS –

• It struggles to detect small objects in the image.
• Training can be time consuming when working with large datasets.
2.4 Faster R-CNN

Faster R-CNN was introduced in 2015. Fast R-CNN was proposed to address the issues in R-CNN,
but Fast R-CNN has issues of its own, and Faster R-CNN was introduced to address them. Fast
R-CNN still uses the selective search algorithm to compute the region proposals; Faster
R-CNN replaces this technique with a region proposal network. The region proposal network
reduces the marginal cost of computing proposals, usually to about 10 ms per image. This
network consists of a convolutional network that produces a feature map of the image. For
each location of the feature map there are multiple anchors (boxes centred on the sliding
window, each with its own scale and aspect ratio). These anchors are passed to a
classification layer and a regression layer, which classify the object and localize the
bounding box.
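
The anchors can be illustrated with a short sketch. It assumes the configuration reported in the Faster R-CNN paper (3 scales and 3 aspect ratios, giving 9 anchors per location). The function name, the (x1, y1, x2, y2) box format and the convention that the ratio means height divided by width are choices made for this example.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def make_anchors(cx: float, cy: float,
                 scales: Sequence[float] = (128, 256, 512),
                 ratios: Sequence[float] = (0.5, 1.0, 2.0)) -> List[Box]:
    """Generate the anchor boxes centred on one sliding-window location."""
    anchors: List[Box] = []
    for s in scales:
        for r in ratios:
            # r is taken as height / width; the area stays close to s * s.
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors
```

Every anchor at a location shares the same centre; only the size and shape differ, which lets the network propose both wide objects such as cars and tall objects such as pedestrians.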

LIMITATION –

• It may not be fast enough for real-time applications because of its multi-stage process.

2.5 YOLO

YOLO (You Only Look Once) was introduced by Joseph Redmon, Santosh Divvala, Ross Girshick,
and Ali Farhadi in 2015, and the paper was published at CVPR 2016. It is a real-time object
detection algorithm that uses deep learning to detect objects in images or videos. YOLO
works by processing one image or video frame at a time and predicting the location and class
of the objects in the frame. It uses a convolutional neural network (CNN) to extract features
from the input image and then treats detection as a regression problem, predicting the
bounding boxes and class probabilities of the objects in the frame. YOLO is known for its
speed and accuracy, making it a popular choice for real-time object detection applications.
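
In the original YOLO formulation, the image is divided into an S x S grid (S = 7 in the paper), and the grid cell that contains the centre of an object is responsible for detecting it. The sketch below illustrates only that assignment. The function name is hypothetical, and later versions such as YOLOv8 use a different prediction head, so this should not be read as a description of YOLOv8 internals.

```python
from typing import Tuple


def responsible_cell(x_center: float, y_center: float,
                     img_w: float, img_h: float,
                     grid_size: int = 7) -> Tuple[int, int]:
    """Return (row, col) of the grid cell that contains an object's centre."""
    # Clamp so that a centre lying exactly on the right or bottom edge
    # still maps to the last cell.
    col = min(int(x_center / img_w * grid_size), grid_size - 1)
    row = min(int(y_center / img_h * grid_size), grid_size - 1)
    return row, col
```

For a 640 x 480 frame, an object centred at (320, 240) falls in the middle cell of the 7 x 7 grid.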

CHAPTER 3 REPORT ON THE PRESENT INVESTIGATION

We use the YOLOv8 model for object detection to increase speed and accuracy.
The steps we followed in completing our project are described below.

3.1 Methodology

3.1.1 Data Collection:

This step involves gathering images of vehicles from online sources like Pexels and Pixabay.
These platforms offer a wide variety of high-quality images that can be used for training an
object detection model. It is important to collect a diverse set of images that cover different
types of vehicles, backgrounds, lighting conditions, and angles to ensure that the model
generalizes well to real-world scenarios.

3.1.2. Labeling Img:

Labeling is the process of marking images with bounding boxes that indicate the location of
objects (in this case, vehicles) within the image. The tool used, labelImg, is commonly
used for this purpose. It allows users to open images in a graphical interface and draw
bounding boxes around objects. The labeled images are then saved along with XML files
that contain the coordinates of the bounding boxes and the corresponding object classes.
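
The Ultralytics training tools read labels in the YOLO text format (one line per object: class index, then the box centre, width and height, all normalized to the image size), so XML annotations need to be converted before training. labelImg can also save in YOLO format directly. The following is a minimal conversion sketch, assuming the XML follows the Pascal VOC layout; the function name and the class list passed to it are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Sequence


def voc_to_yolo(xml_text: str, class_names: Sequence[str]) -> List[str]:
    """Convert one Pascal VOC annotation into YOLO-format label lines."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)

    lines: List[str] = []
    for obj in root.iter("object"):
        class_id = list(class_names).index(obj.find("name").text)
        box = obj.find("bndbox")
        xmin = float(box.find("xmin").text)
        ymin = float(box.find("ymin").text)
        xmax = float(box.find("xmax").text)
        ymax = float(box.find("ymax").text)
        # YOLO format: normalized centre coordinates, width and height.
        x_c = (xmin + xmax) / 2 / img_w
        y_c = (ymin + ymax) / 2 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        lines.append(f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}")
    return lines
```

Each returned line would be written to a .txt file with the same base name as the image.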

The labeled images are typically divided into three subsets: training, validation, and
testing. The training set is used to train the model, the validation set is used to evaluate
the model's performance during training and to tune it, and the testing set is used to assess
its generalization ability on images it has not seen.
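
A split into the three subsets can be sketched as follows. The 70/20/10 proportions, the fixed random seed and the function name are assumptions for illustration; the report does not state the proportions that were used.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(items: Sequence[str], train_frac: float = 0.7,
                  val_frac: float = 0.2, seed: int = 0
                  ) -> Tuple[List[str], List[str], List[str]]:
    """Shuffle the items and split them into train, validation and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]       # whatever remains
    return train, val, test
```

Shuffling before splitting avoids putting all images from one source or one scene into the same subset.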

3.1.3. Downloading Required Model:

Before training the object detection model, it's necessary to have the appropriate software and
libraries installed. In this case, the ultralytics package needs to be installed using pip. This
package provides implementations of various deep learning models, including YOLOv8,
which is a popular architecture for object detection tasks.

3.1.4. Training the Model:

Training the object detection model involves feeding the labeled images into the YOLOv8
architecture and adjusting the model's parameters to minimize the difference between the
predicted bounding boxes and the ground truth bounding boxes.

The main.py file contains the code for configuring the training process, including specifying
hyperparameters such as learning rate, batch size, and number of epochs. During training, the
model learns to recognize vehicles in the images and predict their bounding boxes.

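The output of this step is a trained model file, referred to in this project as yolov8n.pt,
which contains the learned weights and parameters of the model. (yolov8n.pt is also the file
name of the pretrained YOLOv8 nano checkpoint distributed by Ultralytics, which is commonly
used as the starting point for training.)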
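
The contents of main.py are not reproduced in this report. The following is a minimal sketch of what such a training script could look like with the Ultralytics Python API. The dataset file name data.yaml and all hyperparameter values are assumptions for illustration, not the settings used in the project.

```python
from typing import Any, Dict


def build_train_args(data_yaml: str = "data.yaml", epochs: int = 50,
                     batch: int = 16, lr0: float = 0.01,
                     imgsz: int = 640) -> Dict[str, Any]:
    """Collect the training hyperparameters (all values here are assumed)."""
    return {
        "data": data_yaml,   # dataset description: image folders and class names
        "epochs": epochs,    # number of passes over the training set
        "batch": batch,      # batch size
        "lr0": lr0,          # initial learning rate
        "imgsz": imgsz,      # training image size in pixels
    }


def train(weights: str = "yolov8n.pt", **overrides: Any):
    """Fine-tune a YOLOv8 model starting from the given weights."""
    # Imported here so the helper above works without the package installed.
    # Requires: pip install ultralytics
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(**build_train_args(**overrides))
```

Calling train(epochs=10) would start a shorter run with the other values left at the assumed defaults.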

3.1.5. Making Predictions:

Once the model is trained, it can be used to detect vehicles in new images or videos. In this
case, the model is applied to a video file (test2.mp4) to identify vehicles.

The yolo command is used to perform object detection using the trained YOLOv8 model.
Parameters such as the model file (yolov8n.pt), confidence threshold (conf), and source file
(test2.mp4) are specified to customize the detection process.

The output of this step is typically a new video file or images with bounding boxes drawn
around the detected vehicles, providing visual confirmation of the model's performance.
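
The prediction step can be sketched as follows. The first helper assembles the yolo command line described above, and the second shows the equivalent call through the Ultralytics Python API. The confidence threshold of 0.25 and the function names are assumptions; the report does not state the threshold that was used.

```python
from typing import List


def build_predict_command(model: str = "yolov8n.pt",
                          source: str = "test2.mp4",
                          conf: float = 0.25) -> List[str]:
    """Build the yolo CLI call as an argument list (conf value is assumed)."""
    return ["yolo", "detect", "predict",
            f"model={model}", f"conf={conf}", f"source={source}"]


def predict_in_python(model: str = "yolov8n.pt",
                      source: str = "test2.mp4",
                      conf: float = 0.25):
    """Run the same prediction through the Python API instead of the CLI."""
    # Imported here so the helper above works without the package installed.
    from ultralytics import YOLO

    # save=True writes the annotated output next to the other run results.
    return YOLO(model).predict(source=source, conf=conf, save=True)
```

The argument list can be passed to subprocess.run, or joined with spaces and typed in a terminal. Raising the confidence threshold removes low-confidence boxes at the cost of missing some vehicles.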

3.2 Technologies Used:


3.2.1. Python

Python plays a crucial role in the application of YOLO (You Only Look Once), a real-time
object detection system, due to its versatility, extensive library support, and ease of use.
Python is often used to integrate, customize, and extend the YOLO codebase, allowing
developers to tailor the model to specific use cases. It is also employed for data preprocessing
and augmentation, leveraging libraries such as NumPy, OpenCV, and PIL for tasks like
resizing, cropping, and applying transformations to images. Furthermore, Python's popular
deep learning libraries such as TensorFlow and PyTorch are commonly used for training,
fine-tuning, and inference with YOLO models, while visualization libraries like matplotlib
and seaborn aid in model analysis and performance visualization. The rich Python community
and availability of resources further make it an attractive choice for working with
YOLO.
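
As a small illustration of augmentation with NumPy, the sketch below flips an image horizontally and updates its labels to match. It assumes labels in the normalized YOLO format (class, x centre, y centre, width, height); the function name is hypothetical, and this is not a step the report states was performed.

```python
from typing import List, Tuple

import numpy as np

Label = Tuple[int, float, float, float, float]  # (class, x_c, y_c, w, h), normalized


def hflip(image: np.ndarray, labels: List[Label]) -> Tuple[np.ndarray, List[Label]]:
    """Flip an image left to right and mirror its bounding boxes."""
    flipped = image[:, ::-1].copy()   # reverse the column (x) axis
    # Only the x centre changes; width, height and y centre are unaffected.
    new_labels = [(c, 1.0 - x, y, w, h) for (c, x, y, w, h) in labels]
    return flipped, new_labels
```

A vehicle on the left side of the original image appears on the right side of the flipped image, with a box of the same size.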
3.2.2 YOLOv8 model

The YOLOv8 model, an advanced iteration of the YOLO (You Only Look Once) series, is
widely employed for real-time object detection across diverse applications due to its
exceptional features and capabilities. Its real-time detection capability makes it well-suited
for applications such as autonomous vehicles, surveillance systems, and robotics where rapid
and accurate object detection is essential. The model's versatility extends across various
domains including industrial automation, retail analytics, security systems, medical imaging,
and sports analytics. Moreover, YOLOv8 is known for its precision in accurately localizing
objects within images, an important feature for applications such as medical imaging and
quality control in manufacturing. Its scalability and efficiency make it suitable for
deployment in both high-powered server environments and resource-constrained edge devices,
contributing to its wide applicability.

3.2.3 Deep learning

Deep learning plays a critical role in object detection by leveraging complex neural network
architectures to automatically extract features from images, train object detection models, and
significantly enhance detection accuracy compared to traditional computer vision methods.
Through techniques like convolutional neural networks, deep learning models can efficiently
learn hierarchical representations of data, enabling precise localization of objects and the
ability to detect a wide range of object classes across diverse domains such as healthcare,
autonomous driving, and surveillance.

CHAPTER 4 RESULT AND DISCUSSION

YOLO is a popular deep learning model for object detection in autonomous vehicles. In a
study that compared the performance of Faster R-CNN and YOLO on the COCO dataset, YOLO
achieved a higher speed (45 frames per second) with accuracy comparable to Faster R-CNN.

(Fig 1)

Fig 1 shows the detection of cars and a truck, along with the percentage (confidence score)
displayed for each detection.

(Fig 2)

Fig 2 shows the detection of cars, along with the percentage (confidence score) displayed for
each detection.

Object detection is a critical component of autonomous vehicles, as it enables the vehicle to
perceive and react to its surroundings. Deep learning has shown great potential in achieving
accurate and efficient object detection in autonomous vehicles, as demonstrated by various
studies that have applied deep learning models such as Faster R-CNN and YOLO. One of the key
benefits of deep learning-based object detection is its capacity to handle complex and
diverse objects, such as pedestrians, automobiles, and traffic signs, in a variety of
environmental situations, such as different lighting, weather, and road conditions. Deep
learning models can learn to detect and classify these objects from their features and
patterns in large datasets, enabling the autonomous vehicle to make informed decisions and
respond appropriately.

Another advantage of deep learning-based object detection is its adaptability and
scalability. Deep learning models can be trained on diverse datasets and can be fine-tuned
or transferred to different domains or tasks, enabling the autonomous vehicle to detect new
objects or respond to new situations. Additionally, deep learning models can be optimized for
different hardware platforms, such as GPUs and embedded systems, to achieve real-time
performance and low power consumption.

However, deep learning-based object detection in autonomous vehicles also has challenges and
limitations. One challenge is the need for large and diverse datasets for training and
evaluation, which may require significant resources and time. Another challenge is the
interpretability and transparency of deep learning models, which may affect their
trustworthiness and accountability.

CHAPTER 5 SUMMARY AND CONCLUSIONS

This project, “Object detection in autonomous vehicles using deep learning”, focuses on
detecting objects such as cars, bikes and trucks in real time with the help of the YOLOv8
model. Autonomous vehicles are vehicles without a driver that offer better security and
comfort to passengers. The safety of their propulsion and their ability to avoid causing
traffic accidents are the two most crucial factors with regard to autonomous cars. This
involves the functional safety of the vehicle's systems and devices. Object detection is a
critical component in enabling autonomous vehicles to perceive and interact with their
environment. In recent years, deep learning-based approaches have shown significant
improvements in object detection accuracy and speed. We propose a method for object detection
in autonomous vehicles using the YOLOv8 model. Our approach achieves high accuracy and fast
speed, making it suitable for real-time applications in autonomous vehicles. In this project
we first collected images of vehicles from online sources like Pexels and Pixabay. We then
labeled the images, marking them with bounding boxes that indicate the location of objects
(in this case, vehicles) within the image. After that we chose the YOLOv8 model for object
detection, because it is capable of detecting objects in real time and because its speed and
accuracy are very high. We then trained our model. Once the model is trained, it can detect
vehicles in new images or videos.

The purpose of our project is to detect objects for autonomous vehicles using deep learning
with the YOLOv8 model. With the help of the YOLOv8 algorithm we are able to detect the
objects in images correctly, and the predicted bounding boxes and class probabilities are
displayed together, but we are not able to achieve 100% accuracy.

CHAPTER 7 REFERENCES

1. YouTube: https://youtu.be/m9fH9OWn8YM?si=m6_nKzeRK7RmE1C5
2. Deep learning module: https://docs.ultralytics.com/quickstart/#conda-docker-image
3. https://ieeexplore.ieee.org/document/9633965
4. Muhammad Azriyahya, “Deep Learning for Object Identification in LiDAR for
Autonomous Vehicles,” 2020 IEEE 10th International Conference on System
Engineering and Technology (ICSET), 9 November 2020, Shah Alam, Malaysia.
5. Ruturaj Kulkarni, “Traffic Light Detection and Recognition for Self-Driving Vehicles
using Deep Learning,” 2018 IEEE Fourth International Conference on Computing,
Communication, Control, and Automation (ICCUBEA).
