
International Journal of Innovative Science and Research Technology
Volume 9, Issue 4, April 2024

Parking Occupancy Detection using Computer Vision Techniques

Ashish Katiyar1, Diksha Gupta2
Geoinformatics, Department of Civil Engineering
Indian Institute of Technology, Kanpur
Uttar Pradesh, India
e-mail addresses: [email protected], [email protected]

Abstract— The surge in vehicle numbers on roads contributes significantly to traffic congestion and management challenges, particularly evident in developing nations like India, where the influx of cars exceeds road and parking capacity. Addressing these issues necessitates the implementation of sophisticated parking management systems. This project focuses on two key objectives: detecting vehicle occupancy within marked parking slots and analyzing parking data. Using the parking lot near the IIT Kanpur main gate as a reference, video data was collected for 14 consecutive days, enabling the evaluation of vehicle occupancy and parking patterns.

Object detection algorithms such as Mask-RCNN and YOLOv5 were employed to identify occupied parking spaces within the lot. Various methods, including HAAR cascade-based classifiers, DNN-based systems utilizing ResNet classifiers, and RCNN with IoU, were tested for detecting vehicles within allotted slots. The data collected was stored in CSV format for analysis.

This project aims to provide insights into detecting parking space availability and analyzing parking data to optimize time and fuel efficiency. In the Mask-RCNN approach, pre-occupied spaces are denoted by red boxes, while green boxes represent available parking spots. Similarly, YOLOv5 was utilized to count cars in video frames and identify available parking spaces. The YOLO Annotation Toolbox facilitated the extraction of parking space coordinates from recorded video frames, which were then visualized in QGIS for further analysis.

Keywords—Parking management system, object detection algorithms, Mask-RCNN, YOLOv5, data analysis, QGIS, real-time data handling, computer vision, urban mobility.

I. INTRODUCTION
The automobile industry report highlights India's remarkable growth: the country produced 22.93M vehicles in FY 2022, and its US $32.73B car market (FY 2021) is projected to reach US $54.8B by FY 2027 [1]. This growth is paralleled by an increase in vehicles, commercial buildings, and connecting roadways, making transportation hubs essential to modern society.

However, with the surge in vehicles, traffic congestion and parking problems have escalated. Modern transportation infrastructure and parking facilities struggle to keep pace, exacerbated by India's status as a developing country. Challenges include insufficient parking lots and a narrower road network compared to developed countries.

High-density cities like Delhi and Mumbai face acute parking shortages, leading to vehicles occupying roadsides, compounding congestion and safety risks. Finding available parking becomes time-consuming, causing traffic to overflow as drivers search for spots [2].

Presently, manual parking procedures persist, contributing to fuel and time wastage. Inefficient planning and management further exacerbate the issue, impacting the viability of public places as people avoid them due to parking constraints.

A smart parking system, a component of India's Intelligent Transportation System (ITS), integrates various technologies such as communication, information processing, electronics, and control. Incorporating these technologies into India's transportation system yields numerous benefits. Advancements like radio frequency identification (RFID), Internet of Things (IoT)-based solutions, and artificial intelligence (AI) methods, including machine learning, deep learning, and computer vision, have been successfully implemented [3]. This research project focuses on designing a cost-effective automated off-street parking system to enhance efficiency for patrons.

The main objective of this work is to develop a system that can detect parked and unoccupied spaces within a parking lot. Multiple automated parking systems have already been designed and implemented, but those systems cannot be applied in India directly. Developing a cost-effective parking system is therefore the main objective of this project; the use of sensors is limited because their maintenance cost is extremely high. The sub-objectives of this work are:
 Detection of the vehicles within the video frame using a camera sensor.
 Locating the parking spaces in marked as well as unmarked parking lots.
 Identifying the presence or absence of vehicles within the allotted parking spaces.
 Finding patterns and insights through parking data analysis.


II. LITERATURE REVIEW
In this section, we review previous research on automatic smart parking management systems. The study aims to offer insights into the extent of previous work and the diverse approaches employed in designing and implementing such systems. Various methods have been utilized across different studies to address unique challenges such as parking space availability, traffic congestion, weather conditions, and parking schemes. The literature review is structured into distinct methodologies, each addressing specific aspects of smart parking management.

A. RFID based Parking System
Using Radio Frequency Identification (RFID) technology, devices can instantly read information encoded in tags. Consisting of an antenna and a microprocessor, RFID facilitates tracking and identification of tags attached to vehicles. This system employs RFID readers, antennas, and tags for managing various operations such as collecting vehicle reports and monitoring parking occupancies [4]. Software enables efficient check-in and check-out of vehicles, reducing congestion near parking lots. By utilizing RFID tags, waiting times at parking gates are minimized, allowing only tagged vehicles to park, which enhances security and prevents unauthorized access.

The RFID system utilizes multiple technologies for automatic vehicle identification and data capture. Generally, the components used in an RFID system are RFID tags, RFID readers, and an antenna. The antenna is primarily responsible for transmitting the data to the RFID reader and storing the data in the database system. Additionally, a local host computer system is required for the RFID system. The RFID reader picks up the signal from the tag and reformats it so that it can be read. Tag data is transmitted via signal interference once it has been observed. To accomplish this, a database is connected to the host computer system to store and preserve the data. In this research, a central database is used to control parking lot input and output.

Anusooya G. et al. [5] suggested transferring the real-time parking occupancy information to a website, a mobile application, and a Telegram bot using a Raspberry Pi communicating with the Google App Engine. The parking lot occupancy status is shown on the different platforms, as in Figure 2.

Figure 1: Structure of RFID based parking system.

Figure 2: Parking Occupancy Status on Telegram bot

B. IoT Based Smart Parking Solution
IoT has become popular in smart cities, addressing issues like traffic congestion, road safety, and smart parking solutions. An IoT-based cloud-integrated system was proposed to develop an automated parking solution, enabling users to locate and book parking spaces via mobile devices before entering the lot [6]. IoT connects physical objects through the internet, allowing them to behave in lifelike manners by performing computation, sensing, and communication. Cloud computing serves as the platform for hosting IoT applications, forming a "Cloud of Things" (CoT) for remote monitoring, accessing, and controlling objects. This system can be implemented for indoor and outdoor parking using sensors such as geomagnetic, Fiber Bragg Grating, and ultrasonic sensors, with geomagnetic sensors wirelessly communicating parked-vehicle status by sensing changes in the magnetic field [7]. The hardware components and their uses in IoT-based parking management systems are shown in Table 1.

Table 1: Hardware and their uses
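The check-in/check-out bookkeeping performed by the central database in the RFID approach can be sketched as follows. This is a minimal, hypothetical in-memory stand-in for the host computer's database; the class name, tag IDs, and capacity are all illustrative, not taken from the reviewed systems:

```python
class ParkingDatabase:
    """Toy stand-in for the central database controlling lot input/output."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.parked = set()          # RFID tag IDs currently inside the lot

    def check_in(self, tag_id):
        """Admit a tagged vehicle only if the lot has a free slot."""
        if len(self.parked) >= self.capacity or tag_id in self.parked:
            return False
        self.parked.add(tag_id)
        return True

    def check_out(self, tag_id):
        """Release the slot held by a tagged vehicle on exit."""
        if tag_id not in self.parked:
            return False
        self.parked.remove(tag_id)
        return True

    def free_slots(self):
        return self.capacity - len(self.parked)

db = ParkingDatabase(capacity=2)
db.check_in("TAG-001")
db.check_in("TAG-002")
print(db.check_in("TAG-003"), db.free_slots())  # lot full -> False 0
```

A real deployment would back this with a persistent database on the host computer, but the gate logic (admit only tagged vehicles while capacity remains) is the same.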


Generally, an ultrasonic sensor is used for outdoor parking systems because it is cost-effective and offers better accuracy. XBee sensors provide a large signal transfer range (40 m indoors and 120 m outdoors). The Uno module provides better memory management and is compatible with the XBee radio. Light-emitting diodes (LEDs) are used because of their visibility from every direction and, above all, their power efficiency.

Figure 3: System Architecture and Working Procedure

This parking solution is time-effective and worked with great efficiency, with some minor errors such as double-parking issues and improper identification of parked vehicles. These errors can be resolved using human intervention or more modern technologies. A big parking lot may require many sensors, and regular checks and maintenance of the sensors are also required at periodic intervals; hence, the system is not cost-effective for every place.

C. Smart Parking Solution With Image Processing
A camera is required as the main hardware component, which makes this system inexpensive and simple to obtain. The camera is used to locate free space in the parking lot in real time through image sequence gathering and free-space detection. The subtraction method is applied to identify moving vehicles. A software component of the system could be an image-processing program such as MATLAB or Python [8]. MATLAB software is used to detect vehicles in the parking lot using the input from the camera in real time. After the image is picked up, it is processed through multiple image-processing commands to interpret the image more efficiently, and the output is obtained. The block diagram is shown in Figure 4.

Figure 4: Flow Structure of Image Processing

When the image is captured from the camera, image processing is applied to adjust the image, and noise reduction is generally accomplished by an image filtering technique. A reference image is taken initially, and the captured image is subtracted from the reference image to check the changes and similarities between the two before further processing is done.

For the removal of generated noise, morphological operations, i.e., dilation and erosion, are employed. The image is converted into a grey image, and image enhancement techniques are applied. Then, corners are identified and compared using edge detectors to check for similarities and differences. The whole system can be divided into the following sequential stages: system initialization, image acquisition, image segmentation, image enhancement, and image detection.

Figure 5: Parking Slot Availability Result on Display

Canny edge detection is used by J. Trivedi et al. [9] and Katy Blumer et al. [10] to create a real-time intelligent parking management system. A USB camera was used to get the input video feed, an LCD board shows the current parking lot capacity, and a Raspberry Pi module transfers the information using Wireless Fidelity (Wi-Fi).

Although this method is cost-effective, it has a limitation: it is most effective for smaller parking areas. It may lead to false results when objects other than cars are present.
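The reference-subtraction and thresholding step described above can be sketched in a few lines. This is a deliberately tiny pure-Python illustration (a real system would use OpenCV functions such as cv2.absdiff, cv2.threshold, cv2.erode, and cv2.dilate on full frames); the frames, threshold, and slot region below are made-up toy values:

```python
def abs_diff(frame, reference):
    """Pixel-wise absolute difference between two grayscale frames."""
    return [[abs(a - b) for a, b in zip(ra, rb)] for ra, rb in zip(frame, reference)]

def threshold(img, t):
    """Binarize: 1 where the change exceeds the threshold, else 0."""
    return [[1 if px > t else 0 for px in row] for row in img]

def occupancy_ratio(mask, top, left, h, w):
    """Fraction of changed pixels inside a rectangular slot region."""
    changed = sum(mask[r][c] for r in range(top, top + h) for c in range(left, left + w))
    return changed / (h * w)

# Hypothetical 4x4 grayscale frames: an empty-lot reference and a frame with a "car".
reference = [[10, 10, 10, 10] for _ in range(4)]
frame = [[10, 10, 10, 10],
         [10, 200, 200, 10],
         [10, 200, 200, 10],
         [10, 10, 10, 10]]

mask = threshold(abs_diff(frame, reference), 50)
ratio = occupancy_ratio(mask, 1, 1, 2, 2)   # slot covering the bright 2x2 block
print("occupied" if ratio > 0.5 else "free")  # whole slot changed -> "occupied"
```

The morphological dilation/erosion mentioned above would be applied to `mask` before the per-slot ratio is computed, to suppress isolated noisy pixels.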


D. Machine Vision based Automated Parking System
Machine learning-based technology has been extensively researched in recent years due to its flexibility and cost-effectiveness. Computer vision is the subset of artificial intelligence (AI) technologies that enables computers and systems to generate relevant information from visual inputs such as digital images and videos. Growth in AI and innovation in neural networks and deep learning have led to machines surpassing humans in various tasks related to labelling and detecting objects. The production of massive amounts of data is the primary driver behind the acceleration of computer vision technology, as this data is used to train models and make them robust. In the context of smart parking solutions, computer vision-based methods have been tested to overcome challenges such as: a.) marking and detection of parking slots; b.) classification of parking slots based on occupancy; c.) detection and counting of the vehicles present in the parking lot [11]. Magnetometers and ultrasonic sensors are no longer necessary because a single camera can monitor a larger area. Furthermore, the camera's installation and maintenance costs are lower, and it can help with extra tasks like detecting theft and looking into unusual driver behavior [12, 13].

Numerous publicly accessible datasets are available for advancing research on computer vision-based parking management systems, such as the PKLot dataset [14], CNRPark, CNRPark-EXT, and the PLDs dataset [15, 16, 17].

Figure 6: PKLot Dataset: red boxes show occupied parking spaces while green boxes show unoccupied parking spaces

Figure 7: Parking Lot Dataset (PLDs)

Furthermore, to determine whether a parking space is occupied or available for use, each parking space classification task can be viewed as a binary problem. These methods can be classified into 2 major categories:
 Feature extraction-based methods
 Deep learning-based methods

Feature extraction-based methods:
All the images acquired by the camera are segmented into individual parking spaces as a pre-processing step. Each image is typically scaled, and histogram equalization is performed to make it more suitable for feature extraction. In the feature extraction step, one or more feature vectors are extracted from the images, such as Local Phase Quantization (LPQ), Local Binary Pattern (LBP), and Histogram of Oriented Gradients (HOG) [18]. Then a classifier (support vector machine (SVM), multilayer perceptron (MLP), etc.) is trained on the feature vectors from the images together with the ground truth of each image. The trained model is then used to perform predictions on unseen images. Almeida et al. proposed using Local Phase Quantization (LPQ) and LBP as feature vectors and SVM as the classifier. Later, Almeida et al. used LPQ and LBP as feature descriptors with an ensemble of SVMs as the classifier on the PKLot dataset. In Suwignyo et al. [19], LBP was used as a feature extractor, and KNN and SVMs were tested as classifiers for the parking management solution. In Dizon et al., LBP and HOG were used as feature descriptors, and a linear SVM classifier was employed.

Figure 8: High-level Scheme of Feature Extraction based Methods.

Deep learning-based methods:
Deep learning-based methods have a workflow similar to the feature-based approach, except that instead of treating feature extraction and the classifier separately, these two steps are combined into a representation learning block. They can be divided into: (a.) transfer learning on pre-existing CNNs for classification, such as LeNet, AlexNet, VGG, and ResNet [20]; (b.) custom CNN generation based on pre-existing Convolutional Neural Networks (CNNs); (c.) use of deep learning-based object detection methods, such as Fast-RCNN, Faster-RCNN, Mask-RCNN, Single Shot Detector (SSD), and You Only Look Once (YOLO). Yoshinki et al. [21] used the concept of transfer learning: the neural network was first trained on a generic dataset before fine-tuning on a parking lot dataset. Because of the compactness of their network structures, AlexNet and LeNet are used by Julien Nyambal et al. for the parking lot management system.
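The extract-features-then-classify workflow of the feature extraction-based methods above can be sketched without any dependencies. A real pipeline would use LBP/HOG descriptors (e.g., skimage.feature.hog) and an SVM (sklearn.svm.SVC); here, as a hypothetical stand-in, the "feature" is a two-bin intensity histogram and the classifier is a simple perceptron:

```python
def feature_vector(patch):
    """Toy descriptor: fractions of dark vs. bright pixels in a slot image.
    (A real pipeline would extract LBP or HOG descriptors here.)"""
    pixels = [p for row in patch for p in row]
    dark = sum(1 for p in pixels if p < 128) / len(pixels)
    return [dark, 1.0 - dark]

def train_perceptron(X, y, epochs=20, lr=0.1):
    """Linear classifier trained on (feature, label) pairs; SVM stand-in."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):                  # yi: +1 occupied, -1 empty
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) <= 0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict(w, b, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1

# Hypothetical 2x2 training patches: dark = car present, bright = empty asphalt.
occupied = [[[30, 40], [20, 50]], [[10, 90], [60, 20]]]
empty = [[[200, 220], [210, 230]], [[180, 250], [240, 200]]]
X = [feature_vector(p) for p in occupied + empty]
y = [1, 1, -1, -1]
w, b = train_perceptron(X, y)
print(predict(w, b, feature_vector([[25, 35], [45, 15]])))  # dark patch -> 1
```

The structure mirrors Figure 8: segment slots, extract a fixed-length descriptor per slot, train a linear classifier on labelled descriptors, then classify unseen slots.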


Ding and Yang et al. [22] proposed a YOLOv3 with added residual blocks to extract more granular features from the images; it is employed to classify pictures of parking lots. The Microsoft Common Objects in Context (COCO) and PASCAL Visual Object Classes Challenge (PASCAL VOC) datasets were used to train the model, which was then fine-tuned.

Based on the CNN architectures of AlexNet, LeNet, and VGGNet, some lightweight models have also been proposed. These models are primarily created for systems with low processing power, like smart cameras, and tested on the PKLot dataset. Merzoug et al. [23] used a lightweight network, the MobileNetV2 algorithm, to classify the parking spaces and tested it on PKLot, CNRPark-EXT, and a private parking dataset.

Figure 8: DL-based methods high-level scheme

III. THEORETICAL BACKGROUND
Object detection is a basic research area in the fields of computer vision, artificial intelligence, deep learning, etc. It aims to locate and identify objects of various categories quickly and accurately in each image. It has been widely used in image and video retrieval, autonomous driving, intelligent surveillance systems, medical image analysis, etc.

There are several kinds of object detection queries, which can be understood through the image below.

Figure 9: Different forms of object Detection in Deep Learning.

The image classification model is evaluated and examined based on mean classification error. The best-performing model is chosen based on "precision" and "recall" across every possible best-matching bounding box for the known objects in the ground truth image [24]. The following is a popular example of object detection, included to demonstrate what an object detection algorithm looks like.

Figure 10: Object Detection with Multiple Classes

To understand the basic deep learning approach for detecting multi-class images, knowledge of CNNs is a must [25].

Convolutional Neural Networks (CNNs) are advanced artificial intelligence systems designed to process visual information, inspired by the human brain's ability to recognize objects in images. These networks are particularly adept at tasks like image recognition, object detection, and medical diagnosis.

CNNs operate by breaking down images into smaller, more manageable components, similar to how our brains analyze visual information. They do this through layers of convolutional and pooling operations. Convolutional layers use filters to scan images for patterns, such as edges and textures, creating feature maps that highlight these features. Pooling layers then reduce the size of these feature maps, focusing on the most relevant information while discarding unnecessary details.

Following convolution and pooling, CNNs pass the feature maps through fully connected layers, where the network interprets the features to make predictions. For example, if trained to recognize cats, the fully connected layers analyze the features and determine whether they resemble cat patterns.

CNNs are trained through backpropagation, where the network adjusts its parameters based on the difference between its predictions and the true labels of the images it is trained on. This iterative process allows CNNs to learn from experience and improve their accuracy over time.

In short, CNNs are sophisticated algorithms that mimic human visual processing. By analyzing images in layers and recognizing patterns, they can accurately identify objects and features within images, making them valuable tools for various applications in fields like computer vision, healthcare, and autonomous vehicles.

Figure 11: CNN Workflow Structure [26]
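The convolution-then-pooling idea described above can be made concrete with a toy example. This is a framework-free sketch (the 6x6 image and the vertical-edge filter are illustrative values only):

```python
def conv2d(image, kernel):
    """Valid 2D convolution (cross-correlation, as in most DL frameworks)."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling: keeps the strongest response per window."""
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

# A 6x6 image with a vertical edge (dark left half, bright right half).
image = [[0, 0, 0, 9, 9, 9] for _ in range(6)]
# Vertical-edge filter: responds where intensity changes left-to-right.
kernel = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]

fmap = conv2d(image, kernel)   # 4x4 feature map, strong response at the edge
pooled = max_pool(fmap)        # 2x2 summary of the strongest responses
print(pooled)                  # -> [[27, 27], [27, 27]]
```

The filter produces large values exactly where the edge lies, and pooling halves each spatial dimension while keeping those strong responses, which is the "focus on relevant information" behavior described above.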


Transfer Learning Paradigm and Fine-Tuning
Transfer learning is a machine learning method in which a model developed for one task is reused as the starting point for a model on another task.

In deep learning, pre-trained models are frequently used as starting points for computer vision tasks because they require less time and computation than developing neural networks from scratch. Pre-trained weights are used, so training does not need to start from scratch [27].

Figure 12: Systematic Illustration of the Transfer Learning Paradigm

Strictly speaking, transfer learning is not a methodology but a paradigm used actively in computer vision. The training pattern used in transfer learning is called "fine-tuning". With fine-tuning, the number of trainable parameters is reduced significantly, which ultimately reduces the required training time. Training from scratch requires a large dataset, while transfer learning is useful when only a small dataset is available. The only constraint when using transfer learning is to ensure that the input to the network is similar or identical.

The objective of the training phase of every neural network is to reduce the value of the loss function. Loss functions are generally defined as a statistical measure of model performance based on accurate prediction. Classification loss functions (cross-entropy loss and focal loss) and regression loss functions (mean square error, mean absolute error, Huber loss) are used for classification and regression tasks, respectively.

Figure 13: MSE (blue), MAE (red) and Huber Loss (yellow)

Basic CNN Architectures
In deep learning, several types of neural networks are used for image processing, image classification, segmentation, self-driving cars, and a range of other applications. One such popular neural network is the convolutional neural network (CNN).

CNNs, or ConvNets, are multi-layer NN architectures used to discern visual patterns from the image grid cells. A CNN consists of numerous layers, such as convolutional layers, pooling layers, and fully connected layers, and utilizes the backpropagation algorithm to obtain the spatial hierarchy of the features present in the image adaptively and automatically. A summary table of various CNN architecture models is given below.

Table 2: ILSVRC Competition CNN Architecture Models

Figure 14: Comparison of various kinds of CNN Architectures in terms of accuracy

Object Detection Algorithms
After understanding the concepts of various CNN algorithms, it becomes crucial to have knowledge of object detection algorithms. CNNs are able to perform classification tasks, but it is often also necessary to estimate the location of the object present in the input image. This whole task is referred to as 'object detection'.

Object detection algorithms are used for various applications, such as face detection, pedestrian detection, vehicle detection, medical image analysis, etc. However, because of significant variations in poses, occlusion, viewpoints, and lighting conditions, it is challenging to accomplish object detection perfectly together with the additional object localization task. In recent years, object detection applications have attracted a lot of attention.

Before the development of CNN architectures, there were other object detection algorithms. These methods usually consist of 3 different stages. The first stage is associated with framing the candidate region on the input image. For this purpose, a sliding window approach is used.


The second stage is associated with feature extraction from the selected candidate region using the 'Scale-Invariant Feature Transform' (SIFT) and HOG algorithms. The third stage is associated with the classification task, whose main objective is to predict the class label of the selected objects. The classifiers used for this purpose come from machine learning, e.g., the 'support vector machine' (SVM), AdaBoost, etc. Classification accuracy is basically measured by calculating precision and recall estimates. 'HOG + SVM' and 'HOG + Cascade' combinations have been used for such purposes.

Based on DNNs, modern object detection algorithms are classified into the following 2 categories:
 Two-stage detection algorithms
 Single-stage detection algorithms

Two-stage object detection algorithms are used in computer vision to identify objects in images or videos. They consist of two main stages: region proposal and object classification. In the region proposal stage, potential object regions are generated using techniques like selective search or edge boxes. These regions are then passed to the object classification stage. Here, the algorithm determines the presence of objects and assigns them to specific classes using deep learning methods like Convolutional Neural Networks (CNNs). Object classification involves analyzing the content of each proposed region and assigning probability scores to different object classes. After classification, the algorithm refines the bounding boxes of detected objects to improve localization accuracy. Techniques like non-maximum suppression are used to remove redundant bounding boxes and select the most likely object detections based on their confidence scores. Overall, two-stage object detection algorithms are powerful tools for accurately identifying and localizing objects in complex scenes, with applications in autonomous driving, surveillance, and video object tracking. RCNN, Fast RCNN, Faster RCNN [28], and Mask RCNN [29] are the popular two-stage object detection algorithms.

Figure 15: Output of Mask R-CNN algorithm at parking area

Single-stage object detection algorithms are a class of computer vision models designed to detect objects within images or videos in a single pass, without the need for region proposal. These algorithms are efficient and fast, making them suitable for real-time applications. Unlike two-stage algorithms, which typically involve region proposal followed by object classification, single-stage algorithms directly predict object bounding boxes and class labels from input images.

One popular single-stage object detection algorithm is YOLO (You Only Look Once). YOLO [30] divides the input image into a grid and predicts bounding boxes and class probabilities for each grid cell. This approach allows YOLO to detect multiple objects in a single forward pass, making it highly efficient.

Another example is SSD (Single Shot MultiBox Detector) [31], which uses a similar approach to YOLO but introduces multiple feature maps at different scales to detect objects of various sizes. SSD achieves high accuracy by combining information from multiple feature maps at different resolutions.

Overall, single-stage object detection algorithms offer a balance between speed and accuracy, making them well-suited for real-time applications where fast inference is crucial. They continue to be an active area of research in the field of computer vision, with ongoing efforts to improve their efficiency and accuracy.

Figure 16: A simplified illustration of the YOLO Object Detection Pipeline

Accuracy of Object Detection Model
By comparing detected objects to ground reference data, accuracy metrics for object detection algorithms define the accuracy of a deep learning model. These metrics mainly include the confusion matrix, the F1 score, the precision-recall curve, and the COCO 'mean Average Precision' (mAP).
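The IoU matching and non-maximum suppression steps used throughout these detection pipelines can be sketched as follows (pure Python; the boxes, scores, and threshold below are hypothetical):

```python
def iou(a, b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression: keep the best box, drop heavy overlaps."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return keep

# Hypothetical detections: two near-duplicates of one car plus a second car.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # -> [0, 2]: the lower-scored duplicate is suppressed
```

The same `iou` function is what lets a detector's output boxes be matched against annotated parking slots: a slot can be declared occupied when some detection overlaps it above a chosen IoU threshold.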


Figure 17: Confusion matrix

Precision: Precision is defined as the proportion of true positives to all positive predictions.

precision = TP / (TP + FP)    (1)

Recall: Recall is the proportion of true positives to actual (relevant) objects.

recall = TP / (TP + FN)    (2)

F1 Score: The F1 score is defined as the weighted average of the precision and recall values. Its range is 0 to 1, with 0 denoting the lowest accuracy. In other words, the F1 score evaluates the balance between precision and recall.

F1 Score = (2 × Precision × Recall) / (Precision + Recall)    (3)

Precision-Recall Curve: The performance of an object detection model is assessed using this plot of precision (y-axis) against recall (x-axis). The area under this curve is called the AUC.

Average Precision: The average precision (AP), which represents the average of all the precision values for a class of objects tested at different IoU thresholds, is a compressed way to represent the AUC. Mathematically, AP is defined as:

AP = Σ_(k=0)^(n−1) [recalls(k) − recalls(k+1)] × Precision(k)    (4)

Mean Average Precision: Depending on the various detection challenges present, the mean Average Precision, or mAP score, is calculated by taking the mean AP across all classes and/or over all IoU thresholds. Mathematically, mAP is defined as:

mAP = (1/n) Σ_(i=0)^(n) AP(i)    (5)

The performance of various computer vision models can be compared using mAP. mAP provides researchers and engineers working in computer vision with a single primary metric that considers both precision and recall when evaluating models.

IV. SETUP AND METHODOLOGY
All the design parameters are examined, and a preliminary experimental set-up and its challenges are described below.

A. Setup of Camera Sensor:
As a result of network and hardware advancements, desktop computing has been surpassed by the burgeoning mobile computing industry, in which smartphones, tablets, and computers play a significant role in people's lives. A smartphone also features several sensors, such as a microphone, gyroscope, accelerometer, digital camera, digital compass, and GPS, enabling the development of apps that help with daily tasks.

Figure 18: Sensors Embedded in a Smartphone [32]

B. Data Collection
For the project, a mobile camera (Redmi Note 10 Pro) was used to record video of the parking area; it was mounted on a tripod stand and placed on a two-story building near the parking lot. The captured area belongs to the travel companies associated with the 'Indian Institute of Technology', Kanpur (IIT Kanpur).

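The evaluation metrics defined in equations (2)-(4) above can be sketched in a few lines of Python. This is an illustrative snippet, not code from the paper; the count-based inputs and function names are assumptions for demonstration only.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0          # equation (2)
    f1 = (2 * precision * recall / (precision + recall)  # equation (3)
          if precision + recall else 0.0)
    return precision, recall, f1


def average_precision(recalls, precisions):
    """AP = sum_k [recalls(k) - recalls(k+1)] * Precision(k), equation (4),
    with recalls given in decreasing order."""
    ap = 0.0
    for k in range(len(recalls) - 1):
        ap += (recalls[k] - recalls[k + 1]) * precisions[k]
    return ap


p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=20)  # all three are ~0.8 here
ap = average_precision([1.0, 0.5, 0.0], [0.5, 1.0, 1.0])
```

The AP helper follows the summation form of equation (4): it walks the (recall, precision) pairs of the precision-recall curve and weights each precision value by the drop in recall.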
www.ijisrt.com
International Journal of Innovative Science and Research Technology
Volume 9, Issue 4, April - 2024
The height and orientation of the tripod stand were kept constant to fix the frame size of the video. The video data was collected for 4 hours, from 11 a.m. to 3 p.m., each day from Wednesday, 04/01/2023, to Tuesday, 17/01/2023, from a two-story building near IIT Kanpur main gate no. 2. The OpenCamera Android application was used to capture and store the data, and the video resolution was set to high definition (HD) at 1,280 x 720 (16:9, 0.92 MP). The image corresponding to the camera setup is shown in Figure 19. A power bank was attached to the camera to provide a power supply during the whole recording period.

Figure 19: Camera Setup on the roof of the building

C. Methodology:
This section includes information about the model architecture, a justification for the model selection, and a detailed mathematical description of all the steps taken throughout the entire process.

Overview of the Process:
The design and analysis of the deep learning computer vision framework can be traced through the following steps.

Figure 20: Overview of Deep Learning-based Object Detection Framework

Data is collected from the roof of the building near IIT Kanpur Gate 2, next to the parking lot. This data constitutes the most critical component of any deep learning process.

After collecting two weeks of data continuously, data pre-processing is required. Video pre-processing covers multiple techniques used to analyze, modify, and enhance video data, involving tasks such as resizing, color correction, noise reduction, and frame extraction.

Image annotation is a crucial first step in creating computer vision and deep learning applications. Annotating images involves labelling regions, objects, and attributes within the images to provide semantic information for the training and evaluation of the algorithms. For the current research, the Yolo Annotation Tool was created to annotate the frames using a manual approach.

For the project, Mask R-CNN and YOLOv5 were used as the base object detection algorithms; they are quick and effective but might not be as precise as other algorithms when trying to spot smaller or partially obscured objects. QGIS was used to perform geospatial analysis on the parking data, and exploratory data analysis was carried out. This data-analysis step uses summary statistics and visual representations of the data to look for insights by plotting various graphs, detecting outliers, and testing hypotheses and underlying assumptions.

Marking of the Parking Spaces
The Yolo Annotation Tool is used to annotate the objects in the image or frame so that object detection can be trained and tested efficiently and quickly. It is an open-source toolbox for performing the above-mentioned task on a video frame or image; OpenCV is a prerequisite for this purpose. The steps required to perform annotations are as below.

Figure 21: Steps involved in Yolo Annotation Tool

The 'main.py' Python file draws the bounding box around objects on the image and stores the top-left and bottom-right points in txt file format. An example txt file is shown below.

Figure 22: Annotation data using Yolo Annotation Tool

The convert.py and process.py Python files are used to convert the annotations into the specific format used by the YOLO algorithm in the training process.

The process of marking the parking slots is represented as follows.

Figure 23: Use of Yolo Annotation Tool

Here, the first row gives the class id value, and the remaining rows provide the left-upper and right-bottom points of the corresponding marked parking space coordinates.

1) Mask-RCNN based Parking Occupancy System
Mask R-CNN provides a wealth of data for each detected object. Most object detection algorithms only return the object's boundary; Mask R-CNN provides not only the object's location but also a mask around that object, i.e., object segmentation. 'Common Objects in Context' (MS COCO) is a widely used dataset consisting of images annotated with object masks. Using the above coordinates, the location of each parking space is known. By looking at multiple frames of video in succession, we can easily work out which parking spaces are occupied or unoccupied. However, for boxes partially occupied by a car, we need a method to measure how much two objects overlap to predict the occupancy, so we use a concept called 'Intersection over Union' (IoU).
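The overlap measure can be sketched as a small helper; this is an illustrative snippet, not the paper's code, assuming boxes given as (x1, y1, x2, y2) pixel corners, and using the 0.20 occupancy threshold this project applies.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def slot_occupied(slot_box, car_boxes, threshold=0.20):
    """A slot counts as occupied when any detected car box overlaps the
    marked slot by more than the IoU threshold (0.20 in this project)."""
    return any(iou(slot_box, car) > threshold for car in car_boxes)
```

For example, a car box covering half of a slot of equal size gives IoU = 50/150 ≈ 0.33, so the slot is reported as occupied.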
IoU = Area of Intersection / Area of Union    (6)

Figure 24: Intersection over Union

For this project, the IoU threshold is set to 0.20; that is, if the car bounding box occupies more than 1/5 of the marked parking space, that space is considered occupied, otherwise unoccupied. If any parking space remains unoccupied for more than 100 consecutive frames, the "Space Available" message is written on the frame using OpenCV.

All the frames are passed through the system and stored in an empty folder; afterwards, they are merged into a final video file corresponding to the final output by setting the required FPS.

Figure 25: Mask RCNN based parking management system

2) YOLOv5 based Parking Occupancy System
YOLOv5 [33] is a recent object detection system developed by "Ultralytics" on the PyTorch framework and released in June 2020. To locate occupied and unoccupied parking spaces, the YOLOv5 pre-trained model 'yolov5m', trained on the COCO dataset, is used. The parking spaces marked through the Yolo Annotation Tool are linked with the YOLOv5 algorithm by creating a function in the plots.py file in the Utils folder of YOLOv5, as shown in the following.

Figure 25: Region of Interests in YOLOv5 Algorithm

The 'detection.py' file is also modified for the parking project, using OpenCV to count and show the total count of the vehicles present in each frame and the status of the empty and occupied parking spaces. The code is run specifically for the 'Car' object only.

Figure 26: YOLOv5 based Parking Occupancy Detection System

A pre-trained model on the COCO dataset (yolov5m.pt) is used to detect the total number of cars present in each frame. The results are computed after tuning various hyperparameters: confidence threshold (0.25), image size (640), IoU threshold (0.50), and the class of the detected object. OpenCV is used to count the total number of cars in any frame as well as the total number of marked empty parking spaces. All the vehicles present in each frame are counted and shown in the upper-left corner of the processed image, and the total number of marked empty parking spaces is shown in the upper-right corner of each processed frame. If no car is present at a marked place, the 'Space Available' message appears at that place.

Parking Data Analysis
Since CSV files containing different parking information have been created, this data is analyzed to extract insights, and day-wise and week-wise analyses are done. The data is also joined with the QGIS shapefile layer attributes, with their specific layer polygons, to represent the insights geographically. Choropleth maps, line charts, and area charts are designed to identify hidden patterns and draw conclusions.
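The per-frame filtering described above (pre-trained yolov5m, confidence threshold 0.25, 'Car' class only) can be sketched as follows. This is an illustrative stand-in, not the project's detection.py: the detection rows mimic the (x1, y1, x2, y2, confidence, class_id) rows of YOLOv5's results.xyxy output, and in practice the model itself would typically be loaded with torch.hub.load('ultralytics/yolov5', 'yolov5m').

```python
# COCO class index for 'car' as used by the pre-trained yolov5m model
CAR_CLASS = 2

def count_cars(detections, conf_thres=0.25):
    """Keep only 'car' detections above the confidence threshold.

    `detections` mimics one frame of YOLOv5's results.xyxy output:
    rows of (x1, y1, x2, y2, confidence, class_id).
    Returns the car count and the surviving boxes.
    """
    cars = [d for d in detections if d[5] == CAR_CLASS and d[4] >= conf_thres]
    return len(cars), [d[:4] for d in cars]


frame_dets = [
    (0, 0, 50, 40, 0.91, 2),     # confident car -> kept
    (60, 5, 110, 45, 0.18, 2),   # car below threshold -> dropped
    (120, 0, 160, 40, 0.88, 0),  # person (class 0) -> dropped
]
n_cars, boxes = count_cars(frame_dets)  # n_cars == 1
```

The surviving boxes would then feed the IoU-based occupancy check for each marked slot, and the counts would be drawn onto the frame with OpenCV.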
Weekly area chart analysis:
Figure 27: Area Chart Week 1 - Count of Vehicles w.r.t. Time

Figure 28: Area Chart Week 2 - Count of Vehicles w.r.t. Time

Insights: The line chart and area chart display the number of cars present in the parking lot with respect to time. The following insights can be observed by visualizing the above line and area charts.

i. The parking area is sufficiently occupied for the whole 4-hour duration on each day.
ii. On average, more cars are present in the morning, i.e., at 11:00 a.m., than at any other time.
iii. In the first week, the average number of cars present in the afternoon was lower than in the morning and increased later. In the next week, the average number of vehicles present in the morning is higher than later in the day.

Choropleth Map Analysis:
Based on numerical data, choropleth maps use a variety of shading and color schemes. Equal intervals, quantiles, pretty breaks, and natural breaks can all be used to calculate the number of classes. The Equal Interval function divides the attribute value range into equal-sized subranges. For data with well-known fields, the equal interval classification should be used because it highlights how much of an attribute there is in relation to other values.

Parking Slot Utilization Frequency

Figure 29: Parking Slot Utilization Frequency Considering 14 Days Dataset

The above choropleth map represents the utilization of pre-defined parking slots by different vehicles over 14 days. The range interval is set to 20 minutes. The color map shows the rate of parking space utilization for each parking space.

Insights: In the above choropleth map, parking spaces are shaded according to the colour map; the shades represent the count of vehicles occupying the respective parking spaces. The following insights are observed from it.

i. During the whole 14 days, parking spaces 17, 18, and 19 were rarely used by patrons to park their vehicles, as these slots are at the end of the parking lot and far from the building. These parking spaces were used only about 2 times during the whole 14 days.
ii. Temporary parking space number 3 is also under-utilized, as it is much more time-consuming to park a vehicle at that location if the Non1 and Non2 parking spaces are already occupied.
iii. The parking spaces nearer to the building are used most of the time.
iv. Due to maintenance work at parking slot number 13 for some days, its use is also lower.

Weekly Average Occupancy of Parking Spaces

Figure 30: Week 1 Average Occupancy of Parking Spaces
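The day-wise summaries behind these charts can be derived from the per-frame CSV logs. The following sketch assumes a hypothetical (timestamp, slot_id, occupied) row format, since the paper does not specify the CSV schema; the aggregated counts are the kind of quantity the choropleth maps shade per parking polygon.

```python
from collections import Counter
from datetime import datetime

# Hypothetical per-frame log rows: (timestamp, slot_id, occupied flag).
rows = [
    ("2023-01-04 11:00", 17, 0),
    ("2023-01-04 11:20", 17, 1),
    ("2023-01-05 11:00", 3, 1),
    ("2023-01-05 11:20", 3, 1),
]

# Count occupied observations per (day, slot); joined to the QGIS
# shapefile attributes, this drives the per-slot choropleth shading.
usage = Counter()
for ts, slot, occupied in rows:
    day = datetime.strptime(ts, "%Y-%m-%d %H:%M").date()
    if occupied:
        usage[(day, slot)] += 1
```

Summing the same counter per day (or per ISO week) gives the day-wise and week-wise series plotted in the line and area charts.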
Figure 31: Week 2 Average Occupancy of Parking Spaces

Insights: The above choropleth maps represent the average occupancy duration of each slot per day using the color map. Yellow shows the lowest value, progressing towards blue, which represents the maximum value.

i. In the first week, slots 17 and 19 are the least occupied compared to the other parking spaces.
ii. In the next week, slots 17, 18, and 19 are the least occupied, as they are at the end of the parking lot.
iii. Due to maintenance work at slot number 13, its total parking duration is also lower in both weeks.
iv. On average, the parking spaces nearer to the connecting road or to the building are more occupied.

Significance of the Work
This research fulfils its objective of developing a functional system that takes input data captured from a camera in the form of videos and images (frames) and processes them to predict the occupied and unoccupied parking spaces within a parking lot. The work meets this requirement by utilizing object detection algorithms and OpenCV. The approach proposed here takes an image or video as input, uses the Yolo Annotation Tool to locate the parking spaces, and passes it through an object detection algorithm to obtain the parking occupancy status. QGIS software is also used to create the shapefile of the parking spaces and merge it with the CSV data generated from the processed frames or videos to extract insights from the parking data. Such a system forms an end-to-end architecture for real-world, real-time applications. Additionally, it establishes a baseline against which future technologies can be evaluated.

Limitations of the Work
The most significant shortcoming of the work is its reliance on a single sensor, a camera; the system is therefore dependent on lighting and atmospheric conditions. Since the data was collected only during the daytime, the analysis relates only to the daytime situation. The data covers a short duration of 4 hours per day over 14 days, which is not sufficient to draw resilient conclusions. Some novel object detection algorithms are more efficient but were not used due to their increased computational complexity.

Furthermore, computational complexity analysis and feature visualization, which can help to understand which parts of the network contribute most to the final predictions, are absent from this work. There is also a lack of comparative research into training times as a function of GPU resources, and of an in-depth analysis of GPU resource utilization.

Future Prospect
Since the approach described in this work is in its initial stages toward the final goal, many advancements can be made; some are discussed subsequently.

The network architectures employed in this project are classic object detection algorithms, i.e., Mask R-CNN and YOLOv5. Some other, more recent object detection algorithms are much more time-efficient and cost-effective.

A camera is a device that records visual data, usually as images or videos, and is frequently employed for tasks like object detection and classification. In contrast, a multimodal system incorporates various types of sensors or information sources, such as cameras, Light Detection and Ranging (LiDAR), radar, or other sensors, to increase the system's accuracy and robustness.

Video observation using an object tracking algorithm may help to obtain individual vehicle trajectories as well as the transition of the parking lot occupancy. This data might be useful for identifying the factors that matter when analyzing the parking behavior of patrons in the parking lot.

ACKNOWLEDGMENT
I want to thank my friends Prashant, Kiran, Diksha, Govind, Ibaad, and Unnat for their help in various parts of my research and for making my stay on the campus memorable. I would especially like to thank Prashant for introducing me to all the initial details of the thesis objectives, possible outcomes, and methods, and Diksha for her support and help during the writing and for providing useful corrections. I would also like to thank my seniors Sujata, Avadh, Rupesh, and Milaa ma'am for their support and time. I cannot describe in words the motivation, fun, and encouragement I have received from these people.

V. REFERENCES
[1] B. R. C. Software, "India Automobiles," September 2022. [Online]. Available: https://www.ibef.org/industry/india-automobiles. [Accessed 2022].
[2] S. A. a. K. C. Shaheen, "Smart parking linked to transit: Lessons learned from field test in San Francisco Bay Area of California," Transportation Research Record, vol. 1, no. 2063, pp. 73-80, 2008.
[3] M. I. L. Y. T. E. N. N. R. Z. Idris, "Car park system: A review of smart parking system and its technology.,"
Information Technology Journal, vol. 8, no. 2, pp. 101-113, 2009.
[4] jensen, T. S. B. and N. , "Parking space verification: improving robustness using a convolutional neural network,"
International Conference on Computer Vision Theory and Applications, vol. 8, no. 3, pp. 311-318, 2018.
[5] Anusooya, G. J. J. C. S. and K. , "Rfid based smart car parking system," International Journal of Applied Engineering
Research, vol. 12, no. 17, pp. 6559-6563, 2017.
[6] S. A. A. S. S. H. and A. , "A review on smart iot based parking system," International Conference On Soft Computing
And Data Mining, pp. 264-273, 2020.
[7] Khanna, A. a. A. and R. , "Iot based smart parking system," IEEE, pp. 266-270, 2016.
[8] Lopez, M. Griffn, T. Ellis, K. Enem and D. , "Parking lot occupancy tracking through image processing," CATA, pp.
265-270, 2019.
[9] Trivedi, J. Devi, M. S. and Dhara, "Canny edge detection based realtime," Transport/Politechnika slaska, 2020.
[10] Blumer, K. Halaseh, H. R. Ahsan, M. U. Dong and Mavridis, "Cost effective single-camera multi-car parking
monitoring and vacancy detection towards real world parking statistics and real-time reporting," International
Conference on Neural Information Processing, pp. 506-515, 2012.
[11] P. R. L. de Almeida, J. H. Alves, R. S. Parpinelli and J. P. Barddal, "A systematic review on computer vision-based parking lot management applied on public datasets," Expert Systems with Applications, p. 116731, 2022.
[12] Li, Chen and Ngan, "Simultaneously detecting and counting dense vehicles from drone images," IEEE Transactions
on Industrial Electronics, vol. 66, no. 12, pp. 9651-9662, 2019.
[13] X. Li, M. C. Chuah and Bhattacharya, "UAV assisted smart parking solution," International Conference on Unmanned Aircraft Systems, IEEE, pp. 1006-1013, 2017.
[14] P. R. de Almeida, L. S. Oliveira, A. S. Britto Jr., E. J. Silva Jr. and A. L. Koerich, "PKLot - a robust dataset for parking lot classification," Expert Systems with Applications, vol. 42, no. 11, pp. 4937-4949, 2015.
[15] Amato, G. Carrara, F. Falchi and F. Gennaro, "Deep learning for decentralized parking lot occupancy detection,"
Expert Systems with Applications, vol. 72, pp. 327-334, 2017.
[16] Amato, Carrara, Falchi, F. Gennaro and Vairo, "Car parking occupancy detection using smart camera networks and
deep learning.," IEEE, pp. 1212-1217, 2016.
[17] Nieto, Garcia-Martin, Hauptmann and Martinez, "Automatic vacant parking places management system using
multicamera vehicle detection," IEEE Transactions on Intelligent Transportation Systems, vol. 20, no. 3, pp. 1069-
1080, 2018.
[18] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 886-893, 2005.
[19] M. A. Suwignyo, Setyawan and B. Yohanes, "Parking space detection using quaternionic local ranking binary
pattern.," IEEE, pp. 351-355, 2018.
[20] Zhang, S. Ren and J. Sun, "Deep residual learning for image recognition," IEEE conference on computer vision and
pattern, pp. 770-778, 2016.
[21] J. Yosinski, J. Clune, Y. Bengio and H. Lipson, "How transferable are features in deep neural networks?," Advances in Neural Information Processing Systems, vol. 27, pp. 3320-3328, 2014.
[22] X. Ding and R. Yang, "Vehicle and parking space detection based on improved yolo network model.," Journal of
Physics: Conference Series, vol. 1325, p. 012084, 2019.
[23] M. Merzoug, A. Mostefaoui and A. Benyahia, ACM International Symposium on QoS and Security for Wireless and
Mobile Networks, pp. 37-42, 2019.
[24] W. Zhao and J. Ma, "End-to-end scene text recognition with character centroid prediction," Neural Information Processing: 24th International Conference, vol. 3, no. 224, pp. 14-18, 2017.
[25] Z. Zou, Z. Shi, Y. Guo and J. Ye, "Object detection in 20 years: A survey," arXiv, 2019.
[26] C. Bishop, "Neural networks and their applications," Review of Scientific Instruments, vol. 65, no. 6, pp. 1803-1832, 1994.
[27] J. Brownlee, "A gentle introduction to transfer learning for deep learning," Machine Learning Mastery, vol. 20, 2017.
[28] S. Ren, K. He, R. Girshick and J. Sun, "Faster r-cnn: Towards real-time object detection with region proposal
networks," Advances in neural information processing systems, vol. 28, 2015.
[29] K. He, G. Gkioxari, P. Dollar and R. Girshick, "Mask r-cnn," Proceedings of the IEEE international conference on
computer vision, pp. 2961-2969, 2017.
[30] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, "You only look once: Unified, real-time object detection," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779-788, 2016.
[31] W. Liu, D. Anguelov, D. Erhan and C. Szegedy, "Ssd: Single shot multibox detector," European conference on
computer vision springer, pp. 21-37, 2016.
[32] N. Katuk, N. Zakaria and K. Ku-Mahamud, "Mobile phone sensing using built-in camera," 2019.
[33] J. Solawetz, "What is yolov5? a guide for beginners.," 2020.