Project Report Pallapati


LICENSE PLATE DETECTION USING YOLOv8x and

EASY-OCR

A PROJECT REPORT
Submitted to
Amrita Vishwa Vidyapeetham

in partial fulfillment for the award of the degree of


BACHELOR OF TECHNOLOGY,
COMPUTER SCIENCE AND ENGINEERING
By
P. ABHINAI RAJ
(CH.EN.U4CSE20048)

Supervisor
Dr.J.JEYALAKSHMI

AMRITA VISHWA VIDYAPEETHAM,


AMRITA SCHOOL OF COMPUTING
CHENNAI – 601103
November 2023

BONAFIDE CERTIFICATE

This is to certify that this project report entitled “LICENSE PLATE
DETECTION USING YOLOv8x and EASY-OCR” is the bonafide work of
“PALLAPATI ABHINAIRAJ (CH.EN.U4CSE20048)” who carried out the
project work under my supervision.

SIGNATURE SIGNATURE

Dr. S SOUNTHARRAJAN Dr.J.JEYALAKSHMI


Chairperson,CSE. Senior Asst Professor , CSE
Amrita School of Computing Amrita School of Computing
Chennai. Chennai.

INTERNAL EXAMINER EXTERNAL EXAMINER

DECLARATION BY THE CANDIDATE

I declare that the report entitled “LICENSE PLATE DETECTION USING
YOLOv8x and EASY-OCR” submitted by me for the degree of Bachelor of
Technology is the record of the project work carried out by me under the
guidance of “Dr.J.JEYALAKSHMI” and this work has not formed the basis
for the award of any degree, diploma, associateship, fellowship, or title in this or
any other University or other similar institution of higher learning.

SIGNATURE

P.ABHINAIRAJ
Reg No: (CH.EN.U4CSE20048)
Computer Science and Engineering
Amrita School of Computing
Chennai

ABSTRACT

License plate detection is a pivotal component of modern traffic surveillance
and vehicle management systems, and license plate recognition plays a crucial
role in applications like automated toll collection, vehicle tracking, and law
enforcement. In this work, we present an approach that combines license plate
detection with Optical Character Recognition (OCR) for character extraction
from the plate. Our work includes the implementation of an Automatic License
Plate Recognition (ALPR) system based on the YOLOv8x object detector, which
achieves an accuracy of approximately 85 to 90% in recognizing license plates,
even in challenging low-resolution images. We use the YOLOv8x variant, which
enables us to work with extra-large datasets.

Recognizing the limitations inherent in Optical Character Recognition (OCR)
systems, we address the complexities of handling images affected by artifacts,
distorted perspectives, intricate backgrounds, and handwritten characters by
incorporating EasyOCR, an optical character recognition library, into the
Automatic License Plate Recognition (ALPR) pipeline.
TensorFlow RT facilitates the smooth deployment of TensorFlow models on a
variety of hardware architectures, including CPUs, GPUs, TPUs, and DSPs, by
utilizing optimizations like quantization and hardware acceleration. TensorFlow
RT is therefore essential to the scalable and effective deployment of machine
learning models in practical applications.

Keywords: OCR, YOLOv8x, Deep Learning, Image Processing, ALPR system,
Precision, Confusion matrix, Ultralytics, TensorFlow RT.

ACKNOWLEDGEMENT

This project work would not have been possible without the contribution of
many people. It gives me immense pleasure to express my profound gratitude to our
honorable Chancellor Sri Mata Amritanandamayi Devi, for her blessings and for
being a source of inspiration. I extend my gratitude to Sampoojya
Swami Vinayamritananda Puri, Administrative Director, and Shri. I B
Manikantan, Campus Director, for providing all the facilities and extended
support to gain valuable education and learning experience.
I register my special thanks to Dr. V. Jayakumar, Principal for the support
given to me in the successful conduct of this project. I wish to express my sincere
gratitude to Dr. S Sountharrajan, Chairperson, CSE, Dr.Bhuvaneswari, Program Head,
CSE-CT and Dr. Suthir Sriram, Project Coordinator, CSE-CT and Dr.J.Jeyalakshmi ,
Supervisor, CSE for their inspiring guidance, personal involvement and constant
encouragement during the entire course of this work.
I am grateful to Review Panel Members and the entire faculty of the
Department of Computer Science & Engineering, for their constructive criticisms and
valuable suggestions which have been a rich source to improve the quality of this
work.

P.ABHINAIRAJ
Reg No: (CH.EN.U4CSE20048)
TABLE OF CONTENTS
CHAPTER NO TITLE
Abstract
List of Tables
List of Figures
List of Symbols and Abbreviations

1 Introduction
1.1. Background of study
1.2. Problem statement
1.3. Objective
1.4. Expected results
2 Literature Review
3 Methodology
3.1. System Architecture
3.2. Data Collection and Preprocessing
3.3. License Plate Extraction using OCR
3.4. YOLOv8x Model Training
3.5. Testing in Real-Time
4 Results and Discussions
4.1. Performance Metrics
4.2. Real-time Recognition Screenshots
5 Conclusion and Future Scope
References
LIST OF TABLES

TABLE NO. TITLE

2.1. Comparison Table for Literature Review
3.1. Sample YOLOv8 model Architecture
3.2. Training and Utilizing table

LIST OF FIGURES

FIGURE NO. TITLE

3.1. Architecture Diagram
3.2. Object Detection
3.3. One and Two Stage Detectors
3.4. IoU Theorem
3.5. YOLO Timeline
3.6. YOLO Architecture
3.7. YOLOv8 Architecture
3.8. Dataset Images
3.9. Full Dataset
3.10. Recognizing using OCR
3.11. Detecting and Visualizing the Number Plate
3.12. Images taken during testing
4.1. Confusion Matrix
4.2. Output Images
LIST OF SYMBOLS AND ABBREVIATIONS

CNN – Convolutional Neural Network
FPN – Feature Pyramid Network
BGR – Blue Green Red
RGB – Red Green Blue
ROI – Region of Interest
OCR – Optical Character Recognition
mAP – Mean Average Precision
IoU – Intersection over Union
CHAPTER 1
INTRODUCTION
1.1 BACKGROUND OF STUDY
Modern traffic management and security depend heavily on license plate detection. This
project combines the YOLOv8x object detector with the EasyOCR character recognition
framework, two deep learning tools, to build a reliable and effective solution. Optimizing
license plate identification involves addressing challenges such as complicated backgrounds
and changing illumination. This work builds on recent advances in computer vision, deep
learning, and machine learning technologies.
1.2 PROBLEM STATEMENT
1.2.1. Problem Identification
The main challenge of this project is meeting the need for accurate and efficient license plate
recognition in real-world scenarios. Traditional techniques often fail to achieve the
necessary precision, particularly when confronted with uneven illumination, skewed angles,
and cluttered backgrounds. Additionally, their computational demands restrict real-time
deployment in systems such as those used in law enforcement, traffic control, and
security. Thus, the main issue this project attempts to solve is the development of a robust
and efficient system for quickly and precisely detecting license plates.
1.3 OBJECTIVE
The goal of this project is to create a novel and all-encompassing license plate detection and
identification system, specifically addressing the issues associated with low-resolution photos
and different types of image artifacts. The system uses Easy OCR for character extraction and
the YOLOv8x object detector for detecting license plates. Our main objective is to detect
license plates with an impressive 90% accuracy, especially in difficult situations such as poor
light and rainy circumstances.
1.4 EXPECTED RESULTS
The anticipated results include the creation of an extremely precise license plate recognition
system for traffic control and surveillance that can recognize plates up to 85–90% of the time,
even in difficult circumstances. The project aims to improve multilingual support, overcome
Tesseract OCR constraints, and prioritize system scalability and efficiency. Furthermore, the
study opens the door for continued innovation, enabling developments and enhancements in
the field of license plate identification in the future.

CHAPTER 2
LITERATURE REVIEW

This report's literature review covers a broad spectrum of developments in License Plate
Recognition (LPR) technology. A multi-angle view approach for license plate recognition
was presented by Dat Tran-Anh et al. (2023) [1], which improved text identification inside
license plates by using frames taken from different perspectives.
A hybrid SVM classifier was presented by Mohammad Fahad Uddin et al. (2023) [2] for
license plate recognition and number segmentation. Notable accuracy rates were achieved for
Bangla license plates and characters.
An automated LPR system using deep Convolutional Neural Networks (CNNs) was reported
by Yinan Yu et al. (2023) [3], outperforming state-of-the-art methods and exhibiting
resilience across a range of circumstances.
A hybrid learning system for random-positioned license plate identification was created by C.
L. Philip Chen and Bingshu Wang (2022) [4], and it performed better than state-of-the-art
techniques.
Moreover, Shuai Ding et al. (2022) [6] and Sharma et al. (2022) [5] emphasized the
efficiency of deep learning methods and hybrid ALPR systems in real-time license plate
identification and detection.
In addition, Muhammad Usama et al. (2021) [8] developed an ALPR system using a novel
Convolutional Neural Network for the identification and correction of deformed license
plates.
Tao Wang et al. (2020) [9] proposed a novel approach to license plate recognition combining
conventional positioning with SVM and CNN-based entire license plate identification.
Santiago Silva and Claudio Rosito Jung (2021) [7] introduced a novel ALPR system focusing
on unconstrained scenarios.
A novel method for car license plate identification using region-based convolutional neural
networks was reported by Xiaobin Zhuang et al. (2020) [10].
A novel method for recognizing license plates on vehicles was put out by Min Zhang et al.
(2019) [11], who treated LPs as objects and used state-of-the-art object detection algorithms.
For pixel-wise picture segmentation, Chi Zhang and Zechao Li (2019) [12] improved the U-
Net architecture, yielding superior results than models created from scratch.

Using deep learning architectures, Jian Xu et al. (2018) [13] obtained outstanding accuracy
rates for reading license plates and identifying the kind of vehicle.
By enhancing the U-Net architecture for pixel-wise image segmentation, Vladimir Iglovikov
and Alexey Shvets (2018) [14] won first place in the Kaggle Carvana Image Masking
Challenge.
A real-time technique for Number Plate Detection and Recognition utilizing Finger Segment
was reported by Zhihua Chen et al. (2018) [15].
A two-stage CNN-based technique for traffic sign detection in complex traffic situations was
suggested by Guangrui Zhu et al. (2017) [16].
A real-time method for identifying and detecting license plates was presented by Mahfuzur
Rahman et al. (2017) [17].
It was built on the semi-symmetric distribution of corner points and the learning of
morphological aspects. A license plate recognition and identification system based on
morphological feature learning was created by Gaurav Sharma et al. (2016) [18] and offers
real-time operation and strong resilience.
Using principles from cognitive computing, Nicolas Thome et al. (2014) [22] developed a
cognitive and video-based method for multilingual license plate identification that improves
efficiency and accuracy.
A novel license plate recognition system that makes use of color coding and start/stop
patterns for improved localization was proposed by Jani Biju Babjan (2014) [23].
A method for automatically recognizing license plates and numbers was presented by Hamed
Saghaei (2014) [24]. It uses image processing techniques to efficiently extract car license
plate numbers.
Pavel Svoboda et al. (2013) [25] investigated direct blind deconvolution and denoising using
convolutional neural networks for license plate motion deblurring, and showed improved
reconstruction quality on actual traffic surveillance pictures in comparison to earlier methods.
The work by Ioannis D. Psoroulas and Vassili Loumos (2015) [26] deals with the
classification of Indian Classical Dance by means of a dictionary learning approach based on
sparse representation.
A lightweight fully convolutional network for license plate identification was introduced by
Han Xiang et al. (2015) [27], and it successfully recognized license plates in complex
situations.

In order to improve accuracy and speed, Nicolas Thome et al. (2014) [28] developed a
cognitive and video-based method for multilingual license plate identification. This method
makes use of cognitive computing ideas.
A ground-breaking technique for identifying license plates that uses color coding and
start/stop patterns for improved localization was proposed by Jani Biju Babjan (2014) [29].
Using image processing methods, Hamed Saghaei (2014) [30] developed an automatic
license and number plate recognition system that successfully recovers vehicle license plate
numbers.

Table 2.1. Comparison Table for Literature Review


S. No | Year | Title | Author(s) | Summary
1 | 2023 | License Plate Recognition based on Multi-Angle View Model | Dat Tran-Anh, Khanh Linh Tran, Hoai-Nam | In order to tackle the difficult problem of text identification inside license plates, this article integrates many frames taken from various angles. To detect adjoining components inside license plates, the proposed technique focuses on extracting characteristics such as area and corner points from each perspective (view-1, view-2, and view-3). Using an estimate and a distance measure, these parts are utilized to recreate text components. The text within the license plates is then recognized using the CnOCR technique. The suggested strategy outperforms existing approaches in many contexts, as demonstrated by experimental findings on both publicly accessible and self-collected datasets.
2 | 2023 | License Plate Recognition and Number Segmentation Using Hybrid SVM Classifier | Mohammad Fahad Uddin, Rasib Khan, and Md. Tariq Hasan | The cascaded architecture for Automatic License Plate Recognition (ALPR) systems is presented in this study with an emphasis on minimal resource settings and real-time performance. The suggested method makes use of many iterations of the cutting-edge object identification model YOLOv7 in conjunction with a specially designed Bangla OCR engine to identify and recognize Bangla license plates (BLP). According to experimental results, BLPs can be identified with 76% accuracy, while Bangla characters on license plates can be identified with 80% accuracy. The suggested design is a viable option for ALPR systems in real-world situations because these findings outperform several current architectures in terms of detection and identification accuracy.
3 | 2023 | License Plate Detection and Recognition in Unconstrained Scenarios | Yinan Yu, Feng Ji, Yandong Guo, Xuelong Li, and Thomas Huang | This paper presents Sighthound's automated license plate detection and recognition system, leveraging deep Convolutional Neural Networks (CNNs) and efficient algorithms. The system demonstrates robustness across various conditions and outperforms leading ALPR technology in quantitative benchmarks. Developers can access the system through the Sighthound Cloud API at https://www.sighthound.com/products/cloud.
4 | 2022 | Random-Positioned License Plate Recognition Using Hybrid Broad Learning System and Convolutional Networks | C. L. Philip Chen; Bingshu Wang | This research provides a method for license plate recognition that combines a broad learning system with a fully convolutional network. It uses the AdaBoost cascade classifier for character segmentation and a pixel-level two-class classification approach for random-position object recognition. Experimental results on Macau license plates show better performance than state-of-the-art methods, indicating possible regional or national application.
5 | 2022 | A Hybrid Approach for Automatic Licence Plate Recognition System | Sharma, Nitin; Dahiya, Pawan K.; Marwah, Baldev Raj | In this study, a hybrid Automatic License Plate Recognition (ALPR) system that combines support vector machines (SVM) and convolutional neural networks (CNN) is proposed. According to experimental findings, the recognition rate is 76.5%, which is higher than ALPR systems that only use CNN, SVM, or NN.
6 | 2022 | Vehicle License Plate Recognition Using Convolutional Neural Network | Shuai Ding, Wenjian Wang, Jie Jin, and Xu Ye | The use of deep learning techniques in Automatic Number Plate Recognition (ANPR) systems, such as CNN-RNN, YOLO, and SSD, is reviewed in this work. It highlights their applicability for real-time license plate detection and addresses their benefits, drawbacks, and potential to improve ANPR accuracy.
7 | 2021 | License Plate Detection and Recognition in Unconstrained Scenarios | Santiago Silva, Claudio Rosito Jung | This paper presents a novel ALPR system focusing on unconstrained capture scenarios, featuring a Convolutional Neural Network for detecting and rectifying distorted license plates. Experimental results show competitive performance on traditional datasets and superior performance on challenging datasets compared to existing approaches.
8 | 2021 | Vehicle License Plate Detection Using Deep Learning | Muhammad Usama, Muhammad Shahzad, Taimur Hassan | In order to identify and correct deformed license plates, this research presents a revolutionary ALPR system that is intended for unrestricted circumstances. It does this by utilizing a novel Convolutional Neural Network. Experimental results show that it outperforms both academic and commercial methods with competitive performance on standard datasets and superior performance on hard datasets.
9 | 2020 | Vehicle License Plate Recognition Using Convolutional Neural Network | Tao Wang, Dong Wang, Zhihao Yang, Xiulian Peng | This research presents a novel approach to license plate recognition that combines conventional positioning with SVM and CNN-based entire license plate identification. It outperforms existing techniques, obtaining over 99% accuracy in license plate positioning and 97.8% accuracy in character recognition.
10 | 2020 | Vehicle License Plate Detection Using Region-Based Convolutional Neural Networks | Xiaobin Zhuang, Dan Su, Xing Mei, Xu Geng | In order to obtain higher performance in complicated circumstances, this study presents a unique approach to vehicle license plate recognition by considering LPs as objects and utilizing cutting-edge object detection algorithms, such as RCNN and SVM.
11 | 2019 | Vehicle License Plate Detection and Recognition in Unconstrained Scenarios and rainy conditions | Min Zhang, Chenxia Wu, Shen Tian | This paper introduces a unique approach to vehicle license plate recognition that treats LPs as objects and employs cutting-edge object detection algorithms, such as RCNN and SVM, to achieve higher performance in difficult settings.
12 | 2019 | License Plate Recognition Using Convolutional Neural Networks | Chi Zhang, Zechao Li | Using pre-trained encoders, the research presents an enhancement to the U-Net architecture for pixel-wise picture segmentation, providing better performance than models trained from scratch. It examines three weight initialization schemes: a complete network trained on the Carvana dataset, an encoder using VGG11 weights, and LeCun uniform.
13 | 2018 | Vehicle License Plate Recognition with Novel Dataset | Jian Xu, Lei Ma, Shuai Wang | The proposed toll collection system achieves excellent accuracy rates for license plate identification, reading, and vehicle type recognition by utilizing deep learning architectures on the DVLPD dataset.
14 | 2018 | TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation | Vladimir Iglovikov, Alexey Shvets | By using pre-trained encoders, this study suggests improving the U-Net architecture for pixel-wise image segmentation. It also shows that this approach performs better than scratch-trained models and achieves top rankings in the Kaggle Carvana Image Masking Challenge.
15 | 2018 | Vehicle License Plate Detection and Recognition in Unconstrained Scenarios | Zhihua Chen, Jung-Tae Kim, Jian-Ning Liang, Jing Zhang, Yubo Yuan | The paper presents a real-time method for number plate detection and recognition using finger segment. The methods used are background subtraction for plate detection and segmentation of the plate and characters for recognition.
16 | 2017 | Traffic Plate Sign Detection and Classification | Guangrui Zhu, Zhiqiang Wang, Rongrong Ni, Songzhi Su, and Yanwei Pang | This work proposes a two-stage CNN-based method to handle the problem of traffic sign identification in complicated traffic scenarios. Using multi-scale feature maps for precise detection, an effective network with upgraded fire-modules swiftly produces object proposals in the first step. A new classification network is used in the second step to investigate fine-grained local characteristics in order to distinguish between traffic signs that appear similar. The Tsinghua-Tencent 100K benchmark is used to assess the method, and the results show that it performs better and detects objects faster than previous approaches.
17 | 2017 | An Efficient License Plate Detection Algorithm Based on the YOLO Detector | Mahfuzur Rahman, Wei-Chih Hsu, Yung-Ju Chang, Ming-Cheng Chen, and Kuan-Chieh Wang | This paper provides a real-time license plate detection and identification system based on learning morphological features and the semi-symmetric distribution of corner points. The system recognizes license plates with excellent resilience and efficiency in a variety of applications, such as parking management and traffic control.
18 | 2016 | Vehicle License Plate Detection and Recognition | Gaurav Sharma, Arjun Jain, Balaji Hariharan, and Sumanth Srinivasan | A license plate detection and identification algorithm based on morphological feature learning, offering real-time operation and high robustness. A new classification network is used in the second step to investigate fine-grained local characteristics in order to distinguish between traffic signals that appear identical.
19 | 2016 | License Plate Recognition from Still Images and Video Sequences | Christos-Nikolaos E. Anagnostopoulos, Member, IEEE, Ioannis E | In this work, license plate recognition (LPR) algorithms are reviewed, common processing stages are outlined, and issues like different plate sizes and outdoor lighting variations are addressed. It offers a publicly accessible picture library for algorithmic assessment in addition to classifying and evaluating current methods.
20 | 2015 | Super resolution of License Plates in Real Traffic Videos | Ioannis D. Psoroulas, Vassili Loumos | The paper addresses the classification of Indian Classical Dance using a sparse representation based dictionary learning technique. An accuracy of 86.67% is achieved on the ICD dataset, with performance comparable to the state of the art on the KTH dataset. The methods used are sparse representation based dictionary learning and a support vector machine (SVM) with intersection kernel.
21 | 2015 | Lightweight fully convolutional network for license plate detection | Han Xiang, Yong Zhao, Yule Yuan, Guiying Zhang, Xuefeng Hu | In order to improve feature integration, this study presents a lightweight fully convolutional network (FCN) that can effectively recognize license plates in intricate scenarios by utilizing dilated convolutions and dense connections. The suggested solution outperforms current state-of-the-art techniques, striking a compromise between computing efficiency and accuracy, according to in-depth testing on a variety of datasets.
22 | 2014 | A cognitive and video-based approach for multinational License Plate Recognition | Nicolas Thome, Antoine Vacavant, Lionel Robinault, Serge Miguet | A cognitive and video-based approach for multilingual license plate recognition (LPR) is presented in this research. By adding contextual information from video streams, the method makes use of cognitive computing concepts to increase the accuracy and efficiency of LPR systems. The suggested method's efficacy in identifying license plates from various nations is demonstrated by the experimental findings, highlighting its potential for practical use in a variety of settings.
23 | 2014 | License Plate Recognition System Based on Color Coding of License Plates | Jani Biju Babjan | With applications in the Internet of Things (IoT), this research suggests a revolutionary license plate identification system that uses color coding and start/stop patterns for better localization. The suggested solution outperforms current state-of-the-art techniques, striking a compromise between computing efficiency and accuracy, according to in-depth testing on a variety of datasets.
24 | 2014 | Proposal for Automatic License and Number Plate Recognition System for Vehicle Identification | Hamed Saghaei | With successful detection and recognition shown in real-world scenarios, this paper introduces an automatic license and number plate recognition system that effectively extracts vehicle license plate numbers using image processing algorithms, doing away with the need for additional hardware like GPS or RFID.
25 | 2013 | CNN for license plate motion deblurring | Pavel Svoboda, Michal Hradis, Lukas Marsik, Pavel Zemcik | Using training data from purposely blurred photographs, this study explores direct blind deconvolution and denoising using convolutional neural networks, displaying higher reconstruction quality on real traffic surveillance images compared to previous approaches.

CHAPTER 3

METHODOLOGY

3.1. SYSTEM ARCHITECTURE

Data collection: Using CCTV, ANPR, or other road rule enforcement cameras, this step takes
pictures or recordings that contain license plates.

Preprocessing of the Data: Motion blur, varying lighting, and changing weather can make the
recorded data inconsistent. Image preprocessing techniques are used to improve the picture
and thereby improve license plate recognition. This might entail adjusting the contrast,
reducing noise, or sharpening the image.
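
The contrast and noise steps mentioned above can be illustrated with a small sketch. It assumes a grayscale image stored as a list of rows with pixel values from 0 to 255; the function names and the choice of a 3x3 mean filter are illustrative assumptions, not the preprocessing actually used in this project.

```python
# Illustrative preprocessing on a grayscale image stored as a list of rows
# (pixel values 0-255). All helper names are hypothetical.

def stretch_contrast(image):
    """Linearly rescale pixels so the darkest maps to 0 and the brightest to 255."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # uniform image: nothing to stretch
        return [row[:] for row in image]
    scale = 255.0 / (hi - lo)
    return [[round((p - lo) * scale) for p in row] for row in image]


def mean_filter(image):
    """Reduce noise with a 3x3 mean filter; border pixels use the neighbours that exist."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(round(sum(window) / len(window)))
        out.append(row)
    return out


def preprocess(image):
    """Denoise first, then stretch the contrast of the smoothed image."""
    return stretch_contrast(mean_filter(image))
```

In practice an image library would perform these operations far faster; the sketch only shows what each step does to the pixel values.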

Training of a Machine Learning Model for License Plate Detection: A sizable dataset of
license plate photos is used to train the model. Through training, the model learns to
distinguish license plates from other objects and from the background of an image or video.

Optical Character Recognition (OCR): Once a license plate has been detected in an image,
OCR is used to extract its alphanumeric characters. The ANPR system uses the trained
model to identify the characters.
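
As a hedged illustration of the step after OCR, the sketch below cleans raw OCR output into a plate string. It assumes results in the shape that EasyOCR's readtext returns by default, a list of (bounding box, text, confidence) entries where the bounding box is four [x, y] corner points. The function name plate_text_from_ocr and the 0.4 confidence threshold are assumptions made for this example, not values taken from the project.

```python
import re


def plate_text_from_ocr(results, min_confidence=0.4):
    """Join confident OCR fragments left to right and keep only A-Z and 0-9.

    `results` is a list of (bbox, text, confidence) entries, where bbox is a
    list of four [x, y] corner points. The threshold is an assumed value.
    """
    kept = [r for r in results if r[2] >= min_confidence]
    # Order fragments by the leftmost x coordinate of their bounding box.
    kept.sort(key=lambda r: min(point[0] for point in r[0]))
    joined = "".join(text for _, text, _ in kept)
    return re.sub(r"[^A-Z0-9]", "", joined.upper())


# Hypothetical OCR output for one cropped plate; with EasyOCR installed this
# list could come from easyocr.Reader(["en"]).readtext(cropped_plate).
sample_results = [
    ([[120, 10], [200, 10], [200, 40], [120, 40]], "ab 1234", 0.91),
    ([[10, 10], [110, 10], [110, 40], [10, 40]], "TN-09", 0.88),
    ([[10, 50], [60, 50], [60, 60], [10, 60]], "IND", 0.20),
]
plate = plate_text_from_ocr(sample_results)
```

Filtering by confidence discards small low-quality fragments such as country marks, and stripping non-alphanumeric characters removes separators that OCR engines often insert.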

License Plate Detection and Recognition Interference: In this scenario, the system faces
difficulties including broken or obscured plates, irregularly shaped plates, or problems with
night vision. To increase accuracy, the ANPR system may need to take these interferences
into consideration.

Visualization of the Results: Finally, after analyzing the license plate data, the ANPR
system displays the outcomes so that further action can be taken. This may entail keeping
track of tolls and electronic payments, or cross-referencing the license plate with a
database of offenders or stolen cars.

Fig 3.1. Architecture Diagram

OBJECT DETECTION

Object detection is a popular task in computer vision. It deals with localizing a Region of
Interest (ROI) within an image and classifying that region as a typical image classifier
would. One image can include several regions of interest pointing to different objects. This
makes object detection a more advanced problem than image classification.

YOLO (You Only Look Once) is a popular object detection model known for its accuracy and
speed. Since its introduction by Joseph Redmon et al. in 2016, it has undergone many
revisions, including YOLOv7 and the YOLOv8 family used in this project. This chapter
describes the distinguishing features of YOLO and how it compares with other object
detection algorithms.
Object detection is an important part of many applications, such as surveillance, self-
driving cars, and robotics. Object detection algorithms can be divided into two main
categories: single-shot detectors and two-stage detectors.

Fig 3.2. Object Detection

In 2014, Ross Girshick and his colleagues at UC Berkeley introduced the R-CNN
(Regions with CNN features) model, which was one of the first effective attempts to use deep
learning to solve the object detection problem. This model detected and localized objects in
pictures by combining convolutional neural networks (CNNs) with region proposal
techniques. Generally speaking, there are two types of object detection algorithms,
depending on how many times a network passes over the same input picture.

Fig 3.3. One and Two Stage Detectors

DETECTION TYPES:

• Single-Shot Detection:
Single-shot object detection makes a single pass over the input image to predict the
presence and location of objects. Because it processes a whole image in a single pass, it
is computationally efficient. However, compared to other techniques, single-shot
detection is typically less precise and less successful in detecting small objects. These
techniques can be applied in resource-constrained settings to detect objects in real time.
YOLO is a single-shot detector that processes images using a fully convolutional neural
network (CNN). The next part covers the YOLO approach in more detail.

• Two-Shot Detection:
Two-shot object detection uses two passes over the input image to predict the presence
and location of objects. The first pass produces a set of proposals, or candidate object
locations, and the second pass refines these proposals and makes the final predictions.
While this method is more computationally expensive than single-shot detection, it is
also more accurate.
Ultimately, the particular needs and constraints of the application determine whether to
use single-shot or two-shot object detection. In general, real-time applications are better
served by single-shot detection, whereas applications that prioritize precision are better
served by two-shot detection.

OBJECT DETECTION MODELS PERFORMANCE EVALUATION METRICS


We require common quantitative criteria to evaluate and compare the prediction performance
of different object detection methods.
Intersection over Union (IoU) and Average Precision (AP) are the two most commonly used
evaluation metrics.

Intersection over Union (IoU):


Intersection over Union is a common metric used to quantify localization errors and assess
localization accuracy in object detection models.
To calculate the IoU between the predicted and the ground-truth bounding boxes for the same
object, we first determine the region where the two boxes overlap, referred to as the
"Intersection." Next, we determine the combined area covered by the two bounding boxes,
referred to as the "Union." Dividing the intersection by the union gives the ratio of the
overlap to the total area, which is a good indication of how closely the predicted bounding
box matches the ground-truth bounding box.

Fig 3.4. IoU Theorem
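
The IoU calculation described above can be sketched in a few lines of Python. This is a minimal illustration, assuming boxes are given as (x1, y1, x2, y2) corner coordinates; the function name iou and the two example boxes are hypothetical and are not taken from our dataset.

```python
def iou(box_a, box_b):
    """Intersection over Union for two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (if the boxes overlap at all).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when there is no overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


predicted = (50, 50, 150, 100)     # hypothetical predicted plate box
ground_truth = (60, 55, 160, 105)  # hypothetical annotated plate box
score = iou(predicted, ground_truth)
```

For these two example boxes the intersection is 4050 square pixels and the union is 5950, so the IoU is roughly 0.68.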

Average Precision (AP)


Average precision, or AP, is computed as the area under the precision vs. recall curve for
a given set of predictions.
Recall is the ratio of true positives to all of the ground truth labels for a class.
Precision is the ratio of true positives to all of the model's predictions for that class.
Varying the confidence threshold produces a curve that depicts the trade-off between
recall and precision. The area under this precision vs. recall curve gives the model's
average precision for each class. Mean Average Precision (mAP) is the average of this
value over all classes.
In object detection, precision and recall are evaluated on bounding box predictions rather
than on class predictions alone. A prediction is counted as positive if its IoU with the
ground truth is greater than 0.5, and as negative if it is less than 0.5.
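
As an illustration of how the area under the precision vs. recall curve can be computed, the sketch below ranks the detections of one class by confidence and sums the area under the interpolated curve. It assumes each detection has already been marked as a true or false positive using the IoU rule above; the function name average_precision and the three example detections are hypothetical, and evaluation toolkits may use a slightly different interpolation.

```python
def average_precision(detections, num_ground_truth):
    """detections: list of (confidence, is_true_positive) for one class."""
    if num_ground_truth == 0:
        return 0.0

    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    precisions, recalls = [], []
    tp = fp = 0
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)

    # Make precision non-increasing from left to right (running maximum
    # taken from the right), which removes the zig-zag in the raw curve.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the rectangle areas under the interpolated curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Hypothetical detections for one class with two ground truth boxes.
example = [(0.9, True), (0.8, False), (0.7, True)]
ap = average_precision(example, num_ground_truth=2)
```

mAP is then the mean of this value over all classes.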

YOLO
You Only Look Once (YOLO) proposes an end-to-end neural network that predicts bounding
boxes and class probabilities simultaneously. This differs from the approach of earlier
object detection algorithms, which repurposed classifiers to perform detection.

By taking a radically new approach to object recognition, YOLO outperformed existing real-
time object detection algorithms and produced state-of-the-art results.

Fig 3.5. YOLO TimeLine

HOW DOES YOLO WORK? YOLO ARCHITECTURE


After receiving an image as input, the YOLO method employs a basic deep convolutional
neural network to identify objects in the picture. The CNN model’s architecture, which serves
as the foundation for YOLO, is displayed here.

Fig 3.6. YOLO Architecture

The first 20 convolutional layers of the model are pre-trained on ImageNet, with a temporary
average pooling layer and fully connected layer attached. This pre-trained model is then
converted to perform detection, since earlier studies have shown that adding convolutional
and connected layers to a pre-trained network improves performance. The final fully
connected layer of YOLO predicts both bounding box coordinates and class probabilities.

YOLO divides an input image into an S × S grid. If the center of an object falls into a grid
cell, that grid cell is responsible for detecting the object. Each grid cell predicts B
bounding boxes and a confidence score for each box. These confidence scores reflect both how
confident the model is that the box contains an object and how accurate it considers the
predicted box to be.
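
The grid assignment can be made concrete with a small sketch. It assumes boxes in (x1, y1, x2, y2) pixel coordinates and uses S = 7 as an illustrative grid size; the function name responsible_cell and the example box are hypothetical.

```python
def responsible_cell(box, image_w, image_h, s=7):
    """Return (row, col) of the S x S grid cell that contains the box center."""
    x1, y1, x2, y2 = box
    cx = (x1 + x2) / 2.0
    cy = (y1 + y2) / 2.0
    # Scale the center to grid units; clamp so a center on the far edge
    # still maps to the last cell.
    col = min(int(cx / image_w * s), s - 1)
    row = min(int(cy / image_h * s), s - 1)
    return row, col


# A hypothetical plate box inside a 608x608 image.
cell = responsible_cell((200, 300, 400, 360), image_w=608, image_h=608, s=7)
```

Here the box center is (300, 330), which falls in row 3, column 3 of the 7 × 7 grid, so that cell would be responsible for the object.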
YOLO predicts multiple bounding boxes for each grid cell. During training, we want only one
bounding box predictor to be responsible for each object. YOLO designates as "responsible"
the predictor whose prediction has the highest current IoU with the ground truth. As a
result, the bounding box predictors become specialized: each predictor gets better at
predicting certain object sizes, aspect ratios, or classes, which improves the overall
recall.
Non-maximum suppression (NMS) is a crucial technique in the YOLO models. NMS is a
post-processing step that improves the precision and efficiency of object detection. Object
detectors commonly produce several bounding boxes for a single object in an image. These
boxes may overlap or sit at slightly different locations, but they all represent the same
object. NMS identifies and removes the redundant or inaccurate bounding boxes so that a
single bounding box remains for each object in the image.
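
A minimal sketch of greedy NMS is given below, assuming boxes in (x1, y1, x2, y2) format and an IoU threshold of 0.5 chosen for illustration. The function names and the three example boxes are hypothetical; this is not the implementation used inside any particular YOLO release.

```python
def iou(a, b):
    """Intersection over Union for two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes kept after greedy NMS."""
    # Visit boxes from the highest to the lowest confidence score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too strongly.
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep


# Two overlapping detections of one plate and one separate detection.
boxes = [(100, 100, 200, 150), (105, 102, 205, 152), (300, 300, 400, 350)]
scores = [0.90, 0.75, 0.80]
kept = non_max_suppression(boxes, scores)
```

In this example the second box overlaps the first with an IoU of about 0.84 and is suppressed, while the third box does not overlap and is kept.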
The YOLO (You Only Look Once) object detection technique has been released in the following
versions, each with its own improvements:

 YOLO (2016): Joseph Redmon et al. first proposed the YOLO paradigm. It is faster
than later versions but less accurate, since it splits the input image into a grid and
predicts bounding boxes and class probabilities directly from the full image in a
single evaluation.

 YOLOv2 (2017): YOLOv2 brought several improvements to the original model, including
the Darknet-19 architecture, batch normalization, a high-resolution classifier,
anchor boxes, and enhanced feature extraction. It offered improved performance and
precision.
 YOLOv3 (2018): By implementing a few significant modifications, such as the
deployment of a new backbone network named Darknet-53, multi-scale prediction,
and more precise bounding box prediction, YOLOv3 greatly increased the detection
accuracy and speed. YOLOv3 maintained real-time inference speeds while achieving
state-of-the-art performance.

 YOLOv4 (2020): Developed by Alexey Bochkovskiy et al., YOLOv4 brought a number of
notable enhancements over YOLOv3, such as the CSPDarknet53 backbone, enhanced
training algorithms, and feature aggregation modules. YOLOv4 outperformed its
predecessors in terms of accuracy and inference speed.

 YOLOv5 (2020): YOLOv5, which Ultralytics introduced, is based on the YOLOv3
design but prioritizes performance and simplicity. It added a model size scaling
technique and uses a CSP-based Darknet backbone. YOLOv5 demonstrated competitive
performance while achieving faster inference and training times.

 YOLOv6 (2021): This version of the YOLO series focuses on enhancements to
training methods and model architecture. In order to handle unbalanced datasets more
effectively, it added components such as the Gradient Focal Loss (GFL) and the
PANet feature aggregation module.

 YOLOv7 (2022): YOLOv7 continues the YOLO series, aiming to further improve
detection performance and speed. It includes advancements in backbone architectures,
feature fusion methods, and optimization techniques to achieve state-of-the-art
results in object detection tasks.

 YOLOv8 (2023): YOLOv8 is the most recent iteration of Ultralytics' YOLO. It is a
state-of-the-art (SOTA) model that builds on the success of its predecessors by
adding new features and enhancements for improved efficiency, flexibility, and
performance. YOLOv8 supports a wide variety of visual AI tasks, including tracking,
segmentation, pose estimation, detection, and classification. Because of this
adaptability, users can make use of YOLOv8's features in a variety of contexts and
applications.

 YOLOv9 (2024): Still under development. Its main characteristics are novel
techniques, including the Generalized Efficient Layer Aggregation Network (GELAN)
and Programmable Gradient Information (PGI).

YOLOv8

The most recent YOLO version, YOLOv8, was introduced in January 2023. Similar to v5 and v6,
YOLOv8 offers higher speed and greater accuracy but lacks an official paper. On the
COCO dataset with A100 TensorRT, for example, YOLOv8 (medium) achieves a 50.2 mAP
score at 1.83 milliseconds. Moreover, YOLOv8's CLI-based implementation and Python
package make it simple to use and build on. The following subsections examine the
capabilities of YOLOv8 in detail and discuss some of its noteworthy advancements.

YOLOv8 VARIANTS

Based on the number of parameters, YOLOv8 is available in five variants: nano (n), small (s),
medium (m), large (l), and extra large (x). All of the variants can be used for segmentation,
object detection, and classification.
We use YOLOv8x in our model so that it can be applied to a larger dataset.

YOLOv8 ARCHITECTURE

The architecture consists mainly of three parts:


 BACKBONE:
This serves as a feature extractor and is the first processing step. It begins with a
picture as input and gradually extracts characteristics at three distinct detail levels:
low (corners, edges), medium (textures, forms), and high (entire objects). Although
the backbone network's specifics haven't been made public, YOLOv8 is thought to
use a potent, modern convolutional neural network (CNN) design.

 NECK (Optional):
Although it isn't stated clearly in the official YOLOv8 literature, several sources refer
to a neck component. The neck's job is to merge the feature maps that the backbone has
extracted at various depths. This enables the model to make use of both high-level
features for effective object classification and low-level information for
accurate localization. For neck functionality, methods such as Feature Pyramid
Networks (FPNs) are frequently employed.
 HEAD:
The head is responsible for the final predictions. After receiving the processed
feature maps from the neck and/or backbone, it carries out two major functions:
 Bounding Box Prediction: The head predicts bounding boxes for candidate objects
in the image. These boxes indicate the position and size of the objects.
 Class Prediction: For every bounding box, the head also predicts the class
probabilities. In essence, this classifies the object (vehicle, person, dog, etc.)
inside the bounding box.
 YOLOv8 is thought to use a potent, modern convolutional neural network
(CNN) design, as shown in Fig 3.2.

Our system's design comprises three primary parts: the backbone, the optional
neck, and the head. The backbone functions as a feature extractor,
drawing features at various levels of detail from input images with the help
of a potent convolutional neural network (CNN) architecture. While not
specifically stated in the YOLOv8 literature, the neck component integrates
feature maps from the backbone to support precise localization and efficient
object classification. For this, standard techniques like Feature Pyramid
Networks (FPNs) are frequently employed. Lastly, using the processed feature
maps, the head component generates the final predictions by predicting
bounding boxes and class probabilities.

Fig 3.7. YOLOv8 Architecture

Important Novelties in YOLOv8:


 Anchor-Free Detection Head: YOLOv8 uses an anchor-free detection head, in
contrast to earlier iterations that used predetermined anchor boxes (bounding box
shapes) during training. By eliminating the necessity for the model to limit predictions
to these predetermined forms, accuracy may be increased.
 Self-Attention Mechanism: The head of YOLOv8 is equipped with a self-attention
mechanism. This makes it possible for the model to concentrate on crucial regions of
the feature maps, improving object recognition, particularly for tiny or obscured
objects.

Layer          Output Shape       Number of Parameters
Conv2d         (64, 608, 608)     1,792
BatchNorm2d    (64, 608, 608)     128
LeakyReLU      (64, 608, 608)     0
MaxPool2d      (64, 304, 304)     0
Conv2d         (128, 304, 304)    73,856
BatchNorm2d    (128, 304, 304)    256
LeakyReLU      (128, 304, 304)    0
MaxPool2d      (128, 152, 152)    0
...            ...                ...
Conv2d         (1024, 76, 76)     2,359,296
BatchNorm2d    (1024, 76, 76)     2,048
LeakyReLU      (1024, 76, 76)     0
Conv2d         (255, 76, 76)      261,375

Table 3.1. Sample YOLOv8 model Architecture
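
The parameter counts in Table 3.1 can be checked with simple arithmetic. The sketch below assumes a 3-channel input image, 3x3 kernels with bias for the first two convolutions, and a 1x1 kernel with bias for the final convolution; these kernel sizes are assumptions chosen because they reproduce the listed counts, and the helper names are hypothetical.

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Learnable parameters of a 2D convolution with a square kernel."""
    weights = in_channels * out_channels * kernel_size * kernel_size
    return weights + (out_channels if bias else 0)


def batchnorm2d_params(channels):
    """One learnable scale and one learnable shift per channel."""
    return 2 * channels


first_conv = conv2d_params(3, 64, 3)      # 3 * 64 * 9 + 64
first_bn = batchnorm2d_params(64)         # 2 * 64
second_conv = conv2d_params(64, 128, 3)   # 64 * 128 * 9 + 128
second_bn = batchnorm2d_params(128)       # 2 * 128
final_conv = conv2d_params(1024, 255, 1)  # 1024 * 255 + 255
```

Under these assumptions the values are 1,792, 128, 73,856, 256, and 261,375, matching the corresponding rows of the table. Activation and pooling layers have no learnable parameters, which is why their rows show 0.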

Training and Using YOLOv8 Model for Object Detection

In terms of object detection, YOLOv8 provides strong features and adaptability. You may
train and use the YOLOv8 model for your particular object detection tasks by following a
few easy steps.
 Fine-Tuning:
The fine-tuning feature of YOLOv8 enables customization and specialization in
object detection. Fine-tuning entails training the model on a particular dataset
in order to increase its performance and accuracy for identifying certain classes
of objects.
 Dataset:
To train YOLOv8, you require a dataset of images with the associated annotations or
labels. The dataset should include a variety of examples of the objects you want the
model to identify. To train the YOLOv8 model, call the "train" method with the path
to the dataset descriptor file, which defines the location and format of the dataset.

 Image Prediction:
After training, the YOLOv8 model can be used for image prediction. Invoking the
"predict" method and passing it an image makes the model evaluate the image and
produce predictions about the presence and location of objects. The prediction output
includes essential data such as bounding boxes, which define the object regions, and
class labels, which identify the detected object types.
 Bounding Boxes
Bounding boxes are essential to object detection because they indicate the positions
of the objects that have been recognized inside images. YOLOv8's bounding box
predictions let you determine the location and extent of each detected object.
 Classes
Classes in object detection are the groups or kinds of objects that you want
the YOLOv8 model to identify. YOLOv8 supports the identification and classification
of a variety of object classes, whether you are using the preset classes of a
pre-trained model or fine-tuning the model to identify particular classes from your
dataset.

Training YOLOv8 Steps:


 Make sure the dataset is prepared by adding images and labels or annotations.
 Provide the file path for the dataset descriptor.
 To train the YOLOv8 model, use the dataset descriptor file and the "train" method.
 Fine-tune the model by training it on the particular object classes of interest, as
shown in Table 3.2.

Table 3.2. Training and Utilizing table


Steps  Training YOLOv8                                           Utilizing YOLOv8 for Image Prediction
1      Prepare the dataset                                       Load the trained model
2      Specify the path to the dataset descriptor file           Call the "predict" method
3      Train the YOLOv8 model using the dataset descriptor file  Provide an input image for analysis
4      Fine-tune the model for specific object classes           Retrieve predictions, including bounding boxes and class labels

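
The training and prediction steps can be sketched with the Ultralytics Python package. This is a minimal illustration rather than our exact script: the helper names, the directory layout (datasets/plates, images/train, and so on), and the class name license_plate are hypothetical, and the training values simply reuse the configuration reported in Section 3.4.

```python
def build_dataset_descriptor(root, class_names):
    """Return the text of a minimal dataset descriptor (YAML)."""
    lines = [
        f"path: {root}",           # dataset root directory (hypothetical)
        "train: images/train",     # split folders relative to the root
        "val: images/val",
        "test: images/test",
        "names:",
    ]
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


def train_and_predict(descriptor_path, image_path):
    """Fine-tune pretrained weights, then run prediction on one image."""
    from ultralytics import YOLO  # third-party package, imported lazily

    model = YOLO("yolov8x.pt")  # pretrained extra-large weights
    model.train(data=descriptor_path, epochs=200, imgsz=608, batch=32)

    results = model.predict(source=image_path, conf=0.25)
    boxes = results[0].boxes
    # Corner coordinates, class indices, and confidence scores per detection.
    return boxes.xyxy.tolist(), boxes.cls.tolist(), boxes.conf.tolist()


descriptor_text = build_dataset_descriptor("datasets/plates", ["license_plate"])
```

The descriptor text would be saved to a file such as plates.yaml (a hypothetical name) and its path passed to train_and_predict together with an input image.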
YOLOv8 Architecture Enhancements and Innovations

YOLOv8 strengthens its standing as a state-of-the-art deep learning model for object
detection by introducing a number of improvements and changes to its architecture.
These developments aim to optimize performance, accuracy, and efficiency.

 Improved Network Architecture


YOLOv8's network design has been significantly enhanced. For speed optimization,
modules and convolutions have been swapped out, making object identification
quicker and more precise. With these improvements, YOLOv8 is able to analyze data
in real time even for big datasets.

 Anchor-Free Detection
Anchor-free detection, a method that predicts bounding boxes directly from an object's
center, is incorporated into YOLOv8. Because preset anchor boxes are no longer required,
the model is more robust and flexible enough to accommodate a wider range of object
sizes and shapes. Anchor-free detection thereby improves object localization accuracy.

 Training Tricks for Better Accuracy


YOLOv8 uses several training techniques to increase object detection accuracy. One of
them is to stop mosaic augmentation, a method that blends several images into a
single training sample, before the end of the training run. This change helps avoid
overfitting and improves the model's overall performance, which leads to higher
detection accuracy.

 Decoupled Head Approach


YOLOv8 uses a decoupled head, a major advancement in its architecture. By removing
the objectness branch, YOLOv8 is able to achieve more accurate and efficient object
detection. This streamlined design simplifies the model's architecture, increases
inference speed, and decreases computational complexity without sacrificing
detection performance.

With its decoupled head approach, anchor-free detection, training strategies, and
network architectural improvements, YOLOv8 is a cutting-edge deep learning model
for object identification. Its much-improved efficiency, accuracy, and performance
have pushed the limits of real-time object recognition in computer vision applications.
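
The idea of stopping mosaic augmentation before the end of training can be expressed as a simple schedule. The sketch below assumes 0-indexed epochs and that mosaic is switched off for the last 10 epochs; the function name use_mosaic and the value 10 are assumptions for illustration (Ultralytics exposes a comparable training setting named close_mosaic).

```python
def use_mosaic(epoch, total_epochs, close_last=10):
    """True while mosaic augmentation is still active (epochs are 0-indexed)."""
    return epoch < total_epochs - close_last


# Schedule for a 200-epoch run: active for 190 epochs, off for the final 10.
schedule = [use_mosaic(epoch, total_epochs=200) for epoch in range(200)]
```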

SUMMARY
In summary, YOLOv8 is a significant development in real-time object detection within
computer vision. With its improved architecture and state-of-the-art features, its deep
learning model allows for accurate object recognition in a variety of applications.
YOLOv8 makes real-time object detection possible in applications such as robotics,
autonomous driving, and video monitoring. Its capacity to recognize objects in a scene
quickly and precisely creates new opportunities for both industry and research.
YOLOv8 is a leading deep learning model for object detection that builds on the
achievements of its predecessors. Because of its algorithm and its computer vision
methods, it is a frequent choice among practitioners in the field.
Real-time object detection systems are expected to keep improving as computer vision
research and development continue. As work proceeds on the efficiency, precision, and
utility of these technologies, YOLOv8 and comparable models have a promising future.

3.2. DATA COLLECTION AND PREPROCESSING


Preprocessing and data gathering were handled with great care in order to support the
success of our study on automatic number plate detection and recognition. The efficacy of
our recognition algorithm is largely dependent on the quality and integrity of the dataset.

 Data Collection: A methodical approach was used to gather an extensive
dataset appropriate for our task. The dataset consists of a large number of
samples (3000 in total), each of which represents a unique license plate of a car.

 Strict Protocol for Data Collection: To ensure data homogeneity and reduce
deviations resulting from outside sources, a strict protocol was implemented to
manage the data collection process. The essential elements of this protocol were
as follows:
 Controlled Environment: Every image was captured under constant lighting
conditions in a controlled setting. This was done to reduce illumination
changes that can have an impact on image quality.
 Standardized Framing: To preserve consistency across the dataset, every
license plate on a car was carefully positioned in relation to a reference
frame. This standardization ensured consistency in the depiction of license
plate positions.
 Capture of Multiple Angles and Perspectives: To account for natural
differences in plate orientation and location, each license plate was
photographed from a variety of angles and views. The goal of this
strategy was to record the variety of license plate occurrences.

 Preprocessing for Data Enhancement: Following data collection, the images
underwent several preprocessing steps aimed at enhancing the consistency and quality
of the dataset:
 Conversion to RGB Format: To ensure consistent colour channels across
the dataset, a vital requirement for precise analysis, all images were
converted from BGR to RGB format.
 Orientation Standardization: To ensure that license plate representations are
consistent across the dataset, every image was rotated by 90 degrees along the
Y-axis.
 Artifact Removal: Background artifacts that were not significant were
carefully eliminated from the collected images. The purpose of this
stage was to focus only on the license plate areas and create a clean dataset
free of unnecessary elements.
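
The BGR to RGB conversion amounts to reversing the channel order of every pixel. The sketch below shows this on a tiny image stored as nested Python lists so that it runs without extra libraries; the function name bgr_to_rgb and the pixel values are hypothetical. With OpenCV, the same conversion is normally a single call to cv2.cvtColor(image, cv2.COLOR_BGR2RGB).

```python
def bgr_to_rgb(image):
    """Reverse the channel order of every pixel in a rows x cols x 3 image."""
    return [[pixel[::-1] for pixel in row] for row in image]


# A hypothetical 2x2 image in BGR order: blue, green, red, and a mixed pixel.
bgr_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (10, 20, 30)],
]
rgb_image = bgr_to_rgb(bgr_image)
```

After conversion, the pure blue pixel (255, 0, 0) in BGR order becomes (0, 0, 255) in RGB order, and applying the function twice returns the original image.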

Following these exacting procedures for data gathering and preprocessing has allowed
us to create a dataset of high caliber and consistency. This provides a strong
basis for the later parts of our project. Sample images are shown in Fig 3.8.

Fig 3.8. Dataset Images

The careful preparation of the data was essential to ensuring the correctness and
dependability of our dataset. We sought to reduce noise and inconsistencies by standardizing
orientation, converting images to RGB format, and eliminating unrelated artifacts. This
improved the dataset's suitability for subsequent tasks, including object detection and
recognition. This methodology not only raised the quality of our dataset but
also enabled more accurate examination and interpretation of the information. As a
result, we were able to establish a strong basis for the next phases of our project,
including a framework for the creation and assessment of our system for detecting and
recognizing license plates.
The preprocessing procedures enhanced the robustness of the subsequent analysis and model
training by reducing possible sources of noise and variability in the dataset. By carefully
eliminating background artifacts, we reduced the possibility of false positives during
license plate identification and made sure the dataset mostly contained data pertinent to
training our algorithms. This attention to detail in the data preprocessing ultimately
improved the overall efficacy and dependability of our license plate detection and
identification system in practical applications.

Dataset:
Some other samples

Fig 3.9. Full Dataset Images

3.3. LICENSE PLATE EXTRACTION USING OCR

Our real-time license plate detection algorithm relied heavily on OCR, which made it
possible to precisely extract text from the number plate. This section gives an extensive
examination of the capabilities, justification, and results of our license plate
extraction procedure.
 Capabilities of the OCR: Optical character recognition (OCR) technology transforms
images that contain text into machine-readable text data. In our research, OCR is
used to extract alphanumeric characters from the detected license plate regions in
car images or video frames.
 Techniques for License Plate Extraction: The process of license plate extraction
using OCR involves several key techniques and steps:
 Region of Interest (ROI): Locating and extracting the regions of interest (ROIs)
corresponding to license plates within the car images or video frames using methods
like object detection or image segmentation.
 Preprocessing: Improving the clarity and quality of the license plate photos by
preprocessing the extracted ROIs. This might involve methods like noise
reduction, contrast improvement, and image scaling.
 Text Detection: Using OCR techniques to locate and identify text areas
in the previously processed license plate pictures.
 Text Recognition: Identifying the alphanumeric characters found in the
identified text sections by applying OCR methods. In order to do this, labelled
datasets including samples of license plate characters are used to train OCR
models.
 Post Processing: Using post-processing methods to enhance accuracy and
refine OCR findings. Techniques like character grouping, spell checking, and
language model integration may fall under this category.

 Selection: The aim of the landmark selection method was to find a collection of
unique landmarks, each defined by its x, y, and z coordinates, inside the license
plate region. These landmarks were picked with care since they are important for
capturing key elements of license plates and making accurate identification possible.

Fig 3.10. Recognising using OCR

Fig 3.11. Detecting and Visualising the Number Plate

 Accuracy and Precision in Real-Time: The real-time capabilities of the OCR-based
license plate identification system did not sacrifice precision or accuracy. To ensure
that the extracted landmarks were precise and accurate and to maintain the integrity of
the identification system, extensive validation and testing processes were carried out.
The OCR-based license plate recognition system, designed for real-time use, reflects
the balance struck between precision and speed. Throughout the system's
development, special attention was paid to making sure that the accuracy and
precision of its operations were not compromised by its real-time capabilities. Strict
validation and testing procedures were used to preserve this balance and
ensure the precision and accuracy of the landmarks that were taken from license
plates.
Comprehensive validation processes were carried out to maintain the integrity and
dependability of the identifying system. These attempts to validate the system's
performance included a number of areas, with an emphasis on confirming the
accuracy and precision of the landmark extraction procedure. The system's capacity to
reliably extract landmarks from license plates in real-time circumstances was
confirmed by engineers by putting it through rigorous testing scenarios.
Accuracy and precision were critical factors that were taken into account at every
stage of the design and execution process. Every attempt was made to guarantee that,
even in difficult real-world situations, the system could accurately and consistently
recognize license plates. The depth of the testing processes, which sought to evaluate
the system's functioning under various scenarios, was indicative of this dedication to
accuracy.

 Key Landmark Significance: The selection of landmarks for license plate extraction
was driven by their profound significance in identifying and delineating license plates.
Each landmark was chosen based on its association with critical features of license
plates, such as corners, edges, and alphanumeric characters. These landmarks play a
crucial role in differentiating and recognizing license plates accurately.
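
The extraction steps described in this section (cropping the region of interest, running OCR, and post-processing the text) can be sketched with the EasyOCR library. This is an illustrative outline, not our exact pipeline: the helper names, the confidence cut-off of 0.3, and the sample string are hypothetical, and the image is assumed to be a NumPy-style array indexed as (row, column, channel).

```python
import re


def clean_plate_text(raw_text):
    """Post-processing: uppercase the text and keep only letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", raw_text.upper())


def crop_roi(image, box):
    """Crop the detected plate region from an array-like image."""
    x1, y1, x2, y2 = (int(value) for value in box)
    return image[y1:y2, x1:x2]


def read_plate(image, box, min_confidence=0.3):
    """Run OCR on one detected plate region and return the cleaned text."""
    import easyocr  # third-party package, imported lazily

    reader = easyocr.Reader(["en"])
    plate = crop_roi(image, box)
    # readtext yields (bounding_box, text, confidence) for each text region.
    pieces = [text for _, text, confidence in reader.readtext(plate)
              if confidence >= min_confidence]
    return clean_plate_text("".join(pieces))


cleaned = clean_plate_text("tn 09-bc 1234")  # hypothetical raw OCR output
```

In a real-time setting the Reader object would be created once and reused across frames, since constructing it loads the recognition models.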

Our study integrated OCR technology with landmark-based feature extraction
techniques to develop a reliable and effective real-time license plate extraction solution.
OCR's text recognition capabilities, together with the deliberate selection of important
features inside the license plate region, allowed for quick and accurate license plate
detection in a variety of settings.
The details of our project's approach, such as the training of the OCR model and the
deployment of our license plate recognition system, are covered in the parts
that follow in this report. We demonstrate the efficiency and dependability of our
method through in-depth discussions of experimental protocols, performance evaluation,
and real-world testing scenarios.

3.4. YOLOv8x MODEL TRAINING

We started training the YOLOv8x model for license plate detection and recognition after
obtaining the dataset and using OCR technologies to extract landmarks. The YOLOv8x
model, the extra-large variant of YOLOv8, was selected because of its stronger object
detection performance and capabilities, as well as its ability to handle large amounts of
data.
To prepare for model training, we split the dataset into training, validation, and testing
sets using a split ratio of 70, 20, and 10. This splitting allowed us to train the model on
a subset of the data and assess its performance on samples it had not yet seen.
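
A 70/20/10 split of this kind can be sketched as follows. The function name split_dataset, the fixed random seed, and the file names are hypothetical; the sample count of 3000 is taken from Section 3.2.

```python
import random


def split_dataset(items, train=0.7, val=0.2, seed=42):
    """Shuffle the items and split them into train, validation, and test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split repeatable
    n_train = round(len(items) * train)
    n_val = round(len(items) * val)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])  # the remainder forms the test set


image_ids = [f"plate_{index:04d}.jpg" for index in range(3000)]
train_set, val_set, test_set = split_dataset(image_ids)
```

With 3000 samples this yields 2100 training, 600 validation, and 300 test images, with no image appearing in more than one set.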
We assessed the YOLOv8x model's performance on the validation set as well as the test set
after training it using the designated configurations. The model's accuracy in detecting license
plates is evaluated using evaluation measures such as precision, recall, and F1 score.
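
For reference, precision, recall, and F1 score follow directly from the counts of true positives, false positives, and false negatives. The sketch below uses hypothetical counts chosen only to illustrate the formulas; they are not results from our experiments.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 90 correct detections, 10 false alarms, 30 missed plates.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
```

For these illustrative counts, precision is 0.90, recall is 0.75, and the F1 score is about 0.82.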

Additionally, we examined the model's performance across a range of measures,
including mean average precision (mAP) at various intersection over union (IoU) thresholds.
This investigation shed light on the model's robustness under various conditions and on how
well it generalizes to previously unseen data.

We carried out a qualitative assessment in addition to a quantitative one by visually
examining the model's predictions on representative images from the test set. This gave
us the opportunity to pinpoint possible problems, such as false positives or missed
detections, and adjust the model appropriately.
We also looked into methods including data augmentation, hyperparameter tuning, and model
architectural changes to enhance the model's performance. Our goal in experimenting with
various approaches was to improve the accuracy and resilience of the model in practical
situations.
In addition, we looked at post-processing methods like non-maximum suppression (NMS) to
improve bounding box estimates and cut down on redundant detections. To improve the
overall quality of the license plate detection results, this step was crucial.
To support real-time performance on devices with limited resources, such as embedded
systems or mobile devices, we further optimized the inference pipeline. This required the
use of methods like model quantization, pruning, and efficient neural network layer
implementation.

All in all, the procedures of training, evaluating, and optimizing were iterative, and we
continually adjusted the model in response to input from validation and test outcomes. Our
goal was to create a reliable and accurate license plate detection and identification system that
could satisfy the demands of many real-world applications by utilizing cutting-edge
approaches and methodologies.

The YOLOv8x model was trained with the following configurations:

 Backbone: YOLO
YOLOv8 is selected as the foundation architecture because of its object detection
capabilities and its capacity to handle large volumes of data efficiently. It improves
upon the YOLOv7 architecture by adding new features for increased efficiency. The
accuracy of its predictions is a further feature of this model.
 Number of classes: 5 (for detecting different types of Plates)
This parameter stays the same throughout training, meaning that the model has been
trained to identify only one class, the license plate.

 Input image size: 608x608
Before the input images are sent into the YOLOv8 backbone for processing, they are
scaled to 608x608 pixels. This size was selected to strike a compromise between
computational efficiency and the capacity to capture enough information for
precise license plate recognition.

 Batch size: 32
The number of photos processed concurrently during each training cycle is
determined by the batch size. With a batch size of 32 in this instance, 32 photos are
analysed concurrently before the model's parameters are updated in accordance with
the determined loss.

 Number of epochs: 200
The number of epochs determines how many times the complete training dataset is passed
forward and backward through the neural network during training. In this instance, the
YOLOv8-based model is trained for 200 epochs so that it can sufficiently learn and
refine its parameters.
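
To make the batch size and epoch settings concrete, the sketch below works out how many batches are processed per epoch and in total. It assumes that the training set is 70% of the 3000 collected samples and ignores gradient accumulation; the dictionary keys and the function name are hypothetical.

```python
import math

# Settings listed above (the key names are chosen for this example).
train_config = {"imgsz": 608, "batch": 32, "epochs": 200}


def training_updates(num_train_images, batch, epochs):
    """Return (batches per epoch, total batches over the whole run)."""
    steps_per_epoch = math.ceil(num_train_images / batch)
    return steps_per_epoch, steps_per_epoch * epochs


num_train_images = round(3000 * 0.70)  # assumed size of the training split
steps_per_epoch, total_updates = training_updates(
    num_train_images, train_config["batch"], train_config["epochs"]
)

# 608 is a multiple of 32, the largest stride assumed for the network.
size_is_stride_aligned = train_config["imgsz"] % 32 == 0
```

Under these assumptions, each epoch processes 66 batches (2100 images in batches of 32, with the last batch partially filled), giving 13,200 batches over 200 epochs.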

In conclusion, there are a number of important phases and factors to take into account while
training a YOLOv8-based model for license plate detection:

 Model Selection: Accurate object recognition depends on selecting the right model
architecture. YOLOv8 was chosen in this instance because of its advanced
features and effectiveness with big datasets. Model configuration includes setting up the
backbone architecture (YOLOv8), number of classes (one for license plate identification),
batch size (32), input image size (608x608), and number of epochs (200). These parameters
are selected with the particulars of the task and the computing limitations in mind.

• Preparing the Dataset: To make model training and assessment easier, the dataset is
divided into training, validation, and testing sets, and appropriate data preprocessing
methods are applied.

• Training: The model is trained on the training set. Its parameters are fine-tuned and
its performance is optimized through repeated changes to hyperparameters such as learning
rate, momentum, and weight decay. Training runs for multiple epochs so that the model can
learn the relevant features and improve its accuracy.

• Evaluation: The trained model is assessed on the validation and testing sets, together
with further parameter adjustment and experimentation, to determine its accuracy and
robustness when recognizing license plates on automobiles.
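
The dataset preparation step described above can be illustrated with a small splitting routine. This is a sketch under assumptions: the report does not state the split ratios, so the 70/20/10 proportions, the fixed random seed, and the file names below are hypothetical.

```python
import random


def split_dataset(items, train_frac=0.7, val_frac=0.2, seed=0):
    """Shuffle items reproducibly and split them into train/val/test lists."""
    if not 0 < train_frac < 1 or not 0 <= val_frac < 1 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a share for the test set")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


if __name__ == "__main__":
    # Hypothetical file names standing in for the annotated plate images.
    images = [f"img_{i:04d}.jpg" for i in range(1000)]
    train, val, test = split_dataset(images)
    print(len(train), len(val), len(test))  # 700 200 100
```

Shuffling before splitting avoids putting consecutive frames of the same recording into only one subset, and the fixed seed makes the split reproducible between runs.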

Experimentation and optimization were carried out during training in order to optimize
performance and fine-tune model parameters. To improve the accuracy and robustness of the
model, iterative adjustments were made to hyperparameters such as learning rate, momentum,
and weight decay.
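
The iterative hyperparameter adjustment described above could be organized as a small grid search. This is a sketch, not the procedure actually followed: the candidate values are hypothetical because the report does not list the values tried. The key names `lr0`, `momentum`, and `weight_decay` follow the Ultralytics training arguments, which is an assumption about the tooling, and `evaluate` is a hypothetical callback.

```python
from itertools import product

# Hypothetical candidate values; the report does not list the values tried.
SEARCH_SPACE = {
    "lr0": [0.01, 0.001],             # initial learning rate
    "momentum": [0.9, 0.937],         # SGD momentum
    "weight_decay": [0.0005, 0.001],  # L2 regularisation strength
}


def candidate_settings(space):
    """Yield one dict per combination of hyperparameter values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def best_setting(space, evaluate):
    """Return the setting with the highest score from evaluate(setting)."""
    return max(candidate_settings(space), key=evaluate)
```

In practice `evaluate` would train the model with one setting and return a validation metric, so each call is expensive and the grid should stay small.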

3.5. TESTING IN REAL-TIME

During this stage, the YOLOv8X object detection model was applied to a camera feed to perform
real-time vehicle license plate recognition. YOLOv8X is well known for its accuracy and speed
in identifying a variety of objects, and here it was tuned for license plate detection in the
video stream. To extract and identify alphanumeric characters from the detected license plate
regions, EasyOCR, a flexible optical character recognition (OCR) library, was used.
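
The detection and recognition loop described above can be sketched as follows. This is a minimal illustration under assumptions, not the project's actual code: it assumes the `ultralytics`, `easyocr`, and `opencv-python` packages are installed, and the weights path `best.pt` is a hypothetical name for the trained plate detector.

```python
def clamp_box(box, width, height):
    """Convert a float (x1, y1, x2, y2) box to integer pixel bounds inside the frame."""
    x1, y1, x2, y2 = box
    x1 = max(0, min(int(x1), width))
    y1 = max(0, min(int(y1), height))
    x2 = max(0, min(int(x2), width))
    y2 = max(0, min(int(y2), height))
    return x1, y1, x2, y2


def run(weights="best.pt", source=0):
    """Detect plates on a camera feed and print the text read from each crop."""
    # Heavy dependencies are imported lazily; all three are assumed to be installed.
    import cv2
    import easyocr
    from ultralytics import YOLO

    model = YOLO(weights)            # hypothetical path to the trained plate detector
    reader = easyocr.Reader(["en"])  # English character set
    cap = cv2.VideoCapture(source)   # 0 selects the default webcam
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            height, width = frame.shape[:2]
            result = model(frame, verbose=False)[0]
            for box in result.boxes.xyxy.tolist():
                x1, y1, x2, y2 = clamp_box(box, width, height)
                if x2 <= x1 or y2 <= y1:
                    continue  # skip degenerate boxes
                crop = frame[y1:y2, x1:x2]
                for _bbox, text, conf in reader.readtext(crop):
                    print(f"{text} ({conf:.2f})")
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
            cv2.imshow("plates", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()
```

Calling `run()` opens the default webcam, and pressing `q` stops the loop. Clamping the box before cropping guards against detections that extend slightly beyond the frame border.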
To improve the precision of license plate localization in dynamic contexts, the system also
included features such as motion detection and image preprocessing methods. With this
integration, the system was able to localize license plates and read their text in real time.
YOLOv8X's strong object detection capabilities combined with EasyOCR's precise text
recognition allowed the system to detect and identify license plates effectively and
efficiently. This combination opens up opportunities for applications such as automated
parking management, toll collection, and vehicle monitoring, improving operational
efficiency across a range of domains.
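
The motion detection mentioned above is not described in detail, so the following is only one possible form it could take: simple frame differencing, written without external dependencies. The thresholds are hypothetical, and frames are represented as nested lists of integer grey values. On real video frames the same idea is usually applied to whole arrays, for example with OpenCV's `cv2.absdiff`.

```python
def changed_fraction(prev, curr, pixel_threshold=25):
    """Fraction of pixels whose grey level changed by more than pixel_threshold.

    prev and curr are equally sized sequences of rows of integer grey values (0-255).
    """
    changed = total = 0
    for row_prev, row_curr in zip(prev, curr):
        for a, b in zip(row_prev, row_curr):
            total += 1
            if abs(a - b) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def has_motion(prev, curr, pixel_threshold=25, min_fraction=0.01):
    """True when enough pixels changed to justify running the detector."""
    return changed_fraction(prev, curr, pixel_threshold) >= min_fraction
```

Running the detector only on frames where `has_motion` is true is one way to save computation on a static scene.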

Here the YOLOv8X object detection model was used to perform real-time automobile license
plate recognition on a webcam stream. YOLOv8X, which is well known for its precision and
speed in recognizing a wide range of objects, was specifically tuned to recognize license
plates in the video feed. Once a license plate region was identified, EasyOCR, a flexible
optical character recognition (OCR) library, was used to extract and identify the
alphanumeric characters on the plate. Through this integration, the system was able to read
the textual information in real time in addition to locating license plates. By merging the
strong object detection capabilities of YOLOv8X with the precise text recognition offered by
EasyOCR, the system proved effective and efficient at detecting and recognizing license
plates. This allowed for applications in automated toll collection, parking management, and
vehicle tracking.
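
Raw OCR output usually needs light post-processing before it can be used as a plate number. The sketch below assumes EasyOCR's default result format, a list of `(bounding_box, text, confidence)` tuples. The confidence cut-off of 0.3 and the restriction to the characters A-Z and 0-9 are hypothetical choices, not settings taken from this project.

```python
import re


def normalize_plate(text):
    """Upper-case the text and drop everything except A-Z and 0-9."""
    return re.sub(r"[^A-Z0-9]", "", text.upper())


def best_reading(ocr_results, min_conf=0.3):
    """Pick the most confident non-empty reading from (bbox, text, confidence) tuples."""
    candidates = [
        (conf, normalize_plate(text))
        for _bbox, text, conf in ocr_results
        if conf >= min_conf and normalize_plate(text)
    ]
    if not candidates:
        return None
    return max(candidates)[1]
```

For example, a reading of `"tn 09-ab 1234"` is normalized to `"TN09AB1234"`, and readings below the confidence cut-off are ignored.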

Fig 3.13. Images taken during testing

Our project's methodology made sure that a reliable dataset was created, that license plate
framing was extracted precisely, that modelling was efficient, and that real-time license plate
recognition testing went smoothly. We successfully located and identified license plates in
dynamic surroundings by utilizing a mix of optical character recognition libraries and
sophisticated object detection algorithms. This method made it easier to create a dependable
system that could effectively identify license plates in real-time situations, setting the
groundwork for uses in automated parking, toll collecting, and vehicle tracking.
In the last stage of the testing phase, we applied the YOLOv8X object detection model to a
demo video to perform real-time license plate identification. The system used YOLOv8X to
locate license plates in the video feed. Alphanumeric characters were then extracted from the
detected license plate regions and identified using EasyOCR, an adaptable optical character
recognition (OCR) library. With this integrated approach, the system was able to locate
license plates with high precision and read their text in real time. The testing procedure
confirmed the system's efficacy and efficiency in detecting and identifying license plates,
supporting its prospective uses in automated toll collection, parking management, and
vehicle monitoring.
The combination of YOLOv8X object detection with EasyOCR showed high accuracy and efficiency
in real-time license plate identification, which is promising for a variety of automated
system applications.
Furthermore, our project approach made sure that strong error handling methods were
integrated to improve the system's dependability in practical situations. This involved putting
strategies into place to deal with difficult situations including fluctuating illumination,
occlusions, and noise in the recorded photos or video stream. Our goal was to reduce false
positives and negatives by addressing these aspects, which would enhance the license plate
recognition system's overall accuracy and performance.
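
One common way to reduce false readings on video is to vote over several consecutive frames instead of trusting a single frame. The report does not say which error handling strategies were implemented, so the sketch below is an illustration of that general idea rather than the project's method. The class name `PlateVoter`, the window size, and the vote threshold are hypothetical.

```python
from collections import Counter, deque


class PlateVoter:
    """Keep the last `window` readings and report the one seen most often."""

    def __init__(self, window=10, min_votes=3):
        self.readings = deque(maxlen=window)
        self.min_votes = min_votes

    def update(self, reading):
        """Add one per-frame reading (or None) and return the stable plate, if any."""
        if reading:
            self.readings.append(reading)
        if not self.readings:
            return None
        plate, votes = Counter(self.readings).most_common(1)[0]
        return plate if votes >= self.min_votes else None
```

A plate is reported only after the same text has been read in at least `min_votes` of the recent frames, so a single misread character caused by glare or occlusion does not produce an output.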
In addition, we gave top priority to features and interfaces that are easy to use in order to
enable the license plate recognition system to integrate seamlessly. In order to facilitate
simple deployment and maintenance by end users and system administrators, this involved
creating user-friendly dashboards, APIs, and documentation. Also, a thorough testing and
validation process was carried out in various situations in order to evaluate the resilience and
generalization capabilities of the system. To ascertain the system's efficacy in many real-
world scenarios, this entailed assessing its performance in relation to varying weather, traffic
volumes, and camera angles.
The possibility of using machine learning techniques like transfer learning and fine-tuning to
improve the system's performance and adaptation to new surroundings or areas with varied
license plate styles and rules was also investigated in our study.
Finally, to continually assess and improve the system's performance over time, feedback and
monitoring mechanisms were put in place. Through this iterative process, we were able to
adjust to changing needs and resolve new issues, helping to ensure the system's long-term
dependability, accuracy, and efficiency.

CHAPTER 4

RESULTS AND DISCUSSION


4.1. PERFORMANCE METRICS
• Accuracy: The model achieved an accuracy of 95.94% in classifying 3000 license
plates.
• Confusion Matrix: The confusion matrix is an essential tool for assessing
classification models such as those used in license plate recognition systems. It
shows in detail how the model's predictions relate to the ground-truth labels for
each class. The matrix is typically arranged so that each row represents an actual
class and each column represents a predicted class. The confusion matrix in
Figure 4.1 visualizes our system's performance in categorizing all 3000 images and
shows how well our algorithm differentiates between the classes.
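
The relationship between the confusion matrix and the accuracy figure can be shown with a short computation. This is a generic sketch on toy data: the class names `plate` and `background` and the five example labels are hypothetical and are not the project's results.

```python
def confusion_matrix(actual, predicted, labels):
    """Rows are actual classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


if __name__ == "__main__":
    labels = ["plate", "background"]
    actual = ["plate", "plate", "plate", "background", "background"]
    predicted = ["plate", "plate", "background", "background", "plate"]
    matrix = confusion_matrix(actual, predicted, labels)
    print(matrix)            # [[2, 1], [1, 1]]
    print(accuracy(matrix))  # 0.6
```

Accuracy is the sum of the diagonal entries divided by the total number of samples, so off-diagonal entries show exactly which classes are being confused with each other.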

Fig.4.1. Confusion Matrix

4.2. REAL-TIME RECOGNITION SCREENSHOTS

Fig.4.2. Output Images

1. Challenges:
Our method addresses issues with changing illumination and vehicle plate placement,
which are frequent in real-world situations. While research is still being done to
improve resilience, our system performs exceptionally well in spite of these
challenges.
Optimizing the system to withstand varying lighting conditions, motion blur, and
camera distortions while preserving accuracy across a range of vehicle sizes and
orientations was necessary to meet the difficulties of real-time license plate
identification.

Putting in place efficient methods for handling low-resolution video inputs and
managing computing resources for real-time processing were crucial parts of the
development process.
Furthermore, a great deal of testing and performance optimization of the model was
necessary to guarantee resilience and dependability in real-world circumstances. A
strong foundation for license plate identification was created by integrating
YOLOv8X for object detection and EasyOCR for text recognition. Notwithstanding
these developments, further research is required to boost the system's capacity to
analyze massive amounts of video feeds efficiently and to make it more adaptive to
changing settings. The use of such systems in a variety of applications, such as
automated toll collection, parking management, and vehicle monitoring, can be
facilitated by cooperation with industry partners and regulatory agencies, improving
public safety and traffic management.
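
One simple way to manage computing resources for real-time processing, as discussed in the challenges above, is to run the detector on only every n-th frame. The sketch below illustrates that idea. The frame rates are hypothetical and the report does not state that this particular technique was used.

```python
import math


def stride_for_budget(camera_fps, detector_fps):
    """Smallest frame stride that keeps the detector within its throughput."""
    if camera_fps <= 0 or detector_fps <= 0:
        raise ValueError("frame rates must be positive")
    return max(1, math.ceil(camera_fps / detector_fps))


def frames_to_process(total_frames, stride):
    """Indices of the frames passed to the detector when every stride-th frame is used."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    return list(range(0, total_frames, stride))


if __name__ == "__main__":
    # Hypothetical rates: a 30 fps camera and a detector that sustains 8 fps.
    stride = stride_for_budget(camera_fps=30, detector_fps=8)
    print(stride)                         # 4
    print(frames_to_process(10, stride))  # [0, 4, 8]
```

Skipping frames trades temporal resolution for throughput, which is usually acceptable because a plate stays visible across many consecutive frames.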

2. Potential Applications:
Our technology has broad applications and disruptive possibilities beyond its
principal usage in real-time automotive license plate recognition. It may completely
transform automated parking management systems, automated toll collection, and law
enforcement operations by providing immediate insights and improving operating
efficiency. Innovation in the transportation and security industries is made possible by
the combination of cutting-edge technology with conventional methods, which also
leads to improved convenience, safety, and regulatory compliance.
Additionally, our system's adaptability and scalability make it well-suited for
integration into smart city initiatives, traffic monitoring systems, and vehicle tracking
platforms, further amplifying its impact across diverse domains.
Moreover, the data generated by our technology can be leveraged for analytics-driven
decision-making, enabling stakeholders to optimize resource allocation, mitigate
traffic congestion, and enhance public safety measures.
Overall, the potential applications of our real-time license plate detection technology
extend far beyond its initial scope, driving advancements in efficiency, security, and
urban mobility while fostering a safer and more connected society.

CHAPTER 5

CONCLUSION AND FUTURE SCOPE


In this combined section, we highlight key study findings and suggest future directions for
investigation.

• Conclusion:
Our concept is a prime example of integrating technology with conventional
processes since it recognizes license plates from moving cars in real time. Through
the use of the powerful YOLOv8X architecture and sophisticated EasyOCR features,
our system effectively recognizes and classifies automobile license plates in a variety
of settings. With a 95.94% accuracy rate, our system performs exceptionally well in
the license plate recognition job.
In conclusion, the creation of a real-time system for recognizing car license plates
marks a substantial breakthrough in the domains of artificial intelligence and
computer vision. Through the integration of cutting-edge technologies like EasyOCR
for text recognition and YOLOv8X for object identification, we have effectively built
a strong framework that can correctly recognize license plates in a variety of video
streams and environmental situations.
We have shown the system's dependability and efficacy in real-world applications
including automated toll collection, parking management, and vehicle monitoring
through careful dataset creation, effective modelling, and stringent testing.
Notwithstanding the difficulties encountered, such as illumination variability,
motion blur, and limited computing resources, our technique has made it possible
to develop a solution that meets practical objectives and establishes the groundwork
for further developments in this field. To improve traffic management, public safety,
and social well-being in the long run, further research and cooperation will be needed
to further hone the system's functionality, increase its adaptability to changing
settings, and encourage its wider implementation.

• Future Scope:
Our study opens up a number of fascinating directions for further investigation:
• It is critical to address environmental unpredictability and improve the system's
flexibility under various performance scenarios.
• Adding more samples to the dataset will result in a more reliable recognition
model.
• Two areas with promise are the creation of interactive feedback systems and the
integration of several modalities for a comprehensive knowledge of performances.
• Future research possibilities that are inspiring include letting people personalize
their own gestures and investigating how technology may be used to preserve
cultural heritage.

Our study continues to focus on the harmony between technology and tradition, and we
anticipate using innovation to further enhance the field of traditional art.

