Classification and Detection of Vehicles Using Deep Learning
Volume 4 Issue 3, April 2020 Available Online: www.ijtsrd.com e-ISSN: 2456 – 6470
@ IJTSRD | Unique Paper ID – IJTSRD30353 | Volume – 4 | Issue – 3 | March-April 2020 Page 283
International Journal of Trend in Scientific Research and Development (IJTSRD) @ www.ijtsrd.com eISSN: 2456-6470
Although the training data set is modest in size, the neural network by itself shows no distinct advantage over other models in recognition accuracy or training time. The main reasons are limited computing resources and the lengthy processing time required even for a small network. One key to these issues is the feature extraction and classification of detection images of the vehicle rear view. Faced with many practical problems, neural networks have gradually been combined with better methods that emerged in the 21st century, such as Convolutional Neural Networks, the ResNet architecture, and TensorFlow. In this kind of model, different systems are usually designed for different tasks, each applying its own manually designed features.

For example, object recognition uses feature extraction from a CNN, license plate recognition uses Tesseract-OCR, and pedestrian detection uses the Histogram of Oriented Gradients (HOG) feature. Therefore, this paper aims at classifying and detecting vehicles from different viewpoints under different traffic conditions.

II. EXISTING SYSTEMS
The traditional methods of vehicle classification and detection are mainly based on: 1) SIFT (Scale-Invariant Feature Transform) feature matching and extraction; 2) moving vehicle detection with a Gaussian mixture model; 3) license plate classification; 4) surveillance video classification with HOG (Histogram of Oriented Gradients) and SVM (Support Vector Machine). Built on these traditional methods, a neural network model has a weaker recognition reference for detecting objects, so the results are inaccurate, with accuracy limited to 85%.

The above traditional algorithms have obtained good results, but several shortcomings limit their practical use: (i) the high similarity between vehicle models naturally reduces accuracy; (ii) some models are represented by only a few images; (iii) a single uniform direction in the pictures (front, side, or back) causes inaccurate classification; (iv) the methods usually require images captured from certain viewpoints; and (v) the recognition granularity stops at vehicle model classification. In this paper we address these limitations and provide techniques that are practical for the vehicle classification problem. Building on the existing methods, this paper further optimizes the classification and detection algorithms. It applies Deep Neural Networks (DNNs) to the classification and license plate detection of vehicles and obtains better recognition performance under different viewpoints and traffic conditions.

III. LITERATURE SURVEY
[1] Chen, Z., Ellis, T., and Velastin, S. A., “Vehicle Type Categorization: A comparison of classification schemes”, In Intelligent Transportation Systems (ITSC), 14th International IEEE Conference.

This paper proposed vehicle detection and classification based on the histogram of oriented gradients (HOG) approach, presented by Zezhi Chen and Tim Ellis. The classification model contains two parts: feature extraction and classifier selection. In their approach, measurement-based features (MBF) and histogram of oriented gradients (HOG) features are used to classify the vehicles into four categories: car, van, bus, and motorcycle.

[2] Chang, S.-L., Chen, L.-S., Chung, Y.-C., and Chen, S.-W., “Automatic license plate recognition”, IEEE Transactions on Intelligent Transportation Systems.

This paper proposed a license plate recognition technique consisting of two main modules: a license plate locating module and a license number identification module. Specifically, the license plates extracted by the first module are examined in the identification module to reduce the error rate. They used color edge detection to compute edge maps, but this is limited to only four types of edges. Using conversion formulas, the model transforms the RGB space into the HSI space, which represents the red, green, and blue values of each image pixel as hue, saturation, and intensity parameters. The identification module consists of two main stages, preprocessing and recognition. After this process, segmentation and recognition are invoked sequentially. However, this identification model takes more time to recognize the characters and is a complex process that needs to be modified. The model can detect the license plate only if the plate and its characters fall within the specified color edges, so this type of model is useful only in a particular region or place.

[3] Farhat, A., Al-Zawqari, A., Al-Qahtani, A., Hommos, O., Bensaali, F., Amira, A., and Zhai, X., “Tesseract-OCR Based Feature Extraction and Template Matching Algorithms for Number Plate Recognition”, In Industrial Informatics and Computer Systems (CIICS), International Conference on, IEEE.

A modified template matching correlation algorithm was proposed by Ali Farhat et al. They developed four algorithms for numeric Automatic License Plate Recognition (ALPR) systems: vector crossing, zoning, combined zoning-vector, and template matching correlation. The first three algorithms are based on feature extraction techniques and the last one is an application of the correlation technique. Using the vector crossing algorithm, they distinguished the ten characters (0−9) except “2”, “3” and “5”, since these characters have the same number of vectors. In the zoning method, the algorithm derives the density of the image in each zone to determine the characters.

[...] as bus, car, bike, bicycle. In this way, the task of identifying vehicle types from different angles can be completed.
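The zoning method summarized in the literature survey (the density of the character image in each zone is used to determine the character) can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration and is not taken from Farhat et al.: the 2×2 zone grid, the toy 4×4 binary glyphs, the function names `zone_densities` and `classify`, and the nearest-template rule based on squared error.

```python
def zone_densities(image, rows, cols):
    """Return the fraction of foreground pixels in each zone of a binary image.

    `image` is a list of equal-length rows holding 0 (background) or 1 (ink).
    The zone grid (rows x cols) is an illustrative parameter.
    """
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be divisible by the zone grid")
    zone_h, zone_w = height // rows, width // cols
    densities = []
    for r in range(rows):
        for c in range(cols):
            ink = sum(
                image[y][x]
                for y in range(r * zone_h, (r + 1) * zone_h)
                for x in range(c * zone_w, (c + 1) * zone_w)
            )
            densities.append(ink / (zone_h * zone_w))
    return densities


def classify(image, templates, rows=2, cols=2):
    """Pick the template label whose zone densities are closest (squared error)."""
    features = zone_densities(image, rows, cols)

    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(features, templates[label]))

    return min(templates, key=distance)


# Toy 4x4 glyphs (hypothetical): a vertical bar for "1" and a hollow box for "0".
ONE = [[0, 1, 1, 0]] * 4
ZERO = [[1, 1, 1, 1], [1, 0, 0, 1], [1, 0, 0, 1], [1, 1, 1, 1]]
TEMPLATES = {"1": zone_densities(ONE, 2, 2), "0": zone_densities(ZERO, 2, 2)}

# A "1" with one missing ink pixel is still closest to the "1" template.
noisy_one = [[0, 1, 1, 0], [0, 1, 1, 0], [0, 1, 0, 0], [0, 1, 1, 0]]
label = classify(noisy_one, TEMPLATES)
```

Density features are cheap to compute, but two characters with similar ink distribution per zone would collide, which is presumably why the surveyed work combines zoning with other features.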
A. SYSTEM FRAMEWORK
1. Vehicle Detection and Classification Based on Convolutional Neural Networks:
1.1. Vehicle Detection:
In this system, images of the training dataset are given as input and the foreground area of the target vehicle is detected by the convolutional neural network (CNN). However, building a deep learning model with a CNN raises some issues. In principle, accuracy should increase with the depth of the network as long as over-fitting is taken care of. The first problem with increased depth is that the signal required to change the weights, which arises at the end of the network from comparing the ground truth with the prediction, becomes very small at the earlier layers. It essentially means that the earlier layers learn almost nothing. This is called the vanishing gradient problem. The other problem in training deeper networks is that the optimization runs over a huge parameter space, so naively adding layers results in higher training error. This is called the degradation problem. To overcome these issues, the convolutional neural network uses the ResNet architecture to train the deeper networks.

The ResNet architecture makes use of shortcut connections to solve the vanishing gradient and degradation problems. The basic building block of ResNet is a residual block, which is repeated throughout the network. The ResNet structure is shown in Figure 6.

[...] an identity function and the shortcut connection is called an identity connection. The identity mapping is learned by zeroing out the weights in the intermediate layer during training, since it is easier to zero out the weights than to push them to one. When the dimensions of F(x) differ from those of x, a projection connection is implemented rather than an identity connection. The function G(x) changes the dimensions of the input x to those of the output F(x).

Fig 7: Projection connection of Residual Network

The ResNet architecture uses the Rectified Linear Unit (ReLU) as the activation function of the neural network. The rectified linear activation function is a piecewise linear function that outputs its input if the input is positive and outputs zero otherwise. ReLU is cheap to evaluate, which reduces the amount of computation. It has become the default activation function for many types of neural networks because a model that uses ReLU is easier to train and often achieves better performance. The ReLU activation function is given by

$$F(x) = \begin{cases} x, & \text{if } x > 0 \\ 0, & \text{if } x \le 0 \end{cases} \quad \text{for all } x \in \mathbb{R}$$
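To make the identity and projection shortcuts concrete, the following is a minimal NumPy sketch of a single residual block with ReLU activations. It rests on simplifying assumptions that are not in the paper: the two intermediate layers are plain matrix multiplications rather than convolutions, there is no batch normalization, and the names `residual_block`, `w1`, `w2` and `w_proj` are hypothetical.

```python
import numpy as np


def relu(x):
    """Rectified Linear Unit: x where x > 0, otherwise 0."""
    return np.maximum(x, 0.0)


def residual_block(x, w1, w2, w_proj=None):
    """Compute relu(F(x) + shortcut(x)) for a two-layer residual function F.

    x      : input vector of shape (d_in,)
    w1, w2 : weights of the intermediate layers, F(x) = w2 @ relu(w1 @ x)
    w_proj : optional projection matrix G, needed only when the dimensions
             of F(x) and x differ; with None the shortcut is the identity.
    """
    fx = w2 @ relu(w1 @ x)
    shortcut = x if w_proj is None else w_proj @ x
    if shortcut.shape != fx.shape:
        raise ValueError("shortcut and F(x) must have the same shape")
    return relu(fx + shortcut)


x = np.array([1.0, -2.0, 3.0])

# Identity connection: with the intermediate weights zeroed out, F(x) = 0
# and the block reduces to relu(x).
zeros = np.zeros((3, 3))
identity_out = residual_block(x, zeros, zeros)

# Projection connection: F(x) has 2 components, so G maps 3 -> 2 dimensions.
w1 = np.eye(3)
w2 = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
g = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
projected_out = residual_block(x, w1, w2, w_proj=g)
```

The first call shows why the identity mapping is easy to learn: setting the intermediate weights to zero is enough for the block to pass its input through, so an extra block need not make the network worse.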
Starting from the VGGNet, shortcut connections are inserted to form a residual network. This overcomes issues such as the vanishing gradient and degradation problems, which occur while training the deep learning model.
Fig 8: Vehicle Detection

2. TENSORFLOW:
TensorFlow is an open-source library that uses data flow graphs to build models. This allows developers to create large-scale neural networks with many layers. TensorFlow is a neural network library that is mainly employed for classification, perception, understanding, discovering, prediction and creation. Tensors are multidimensional arrays, an extension of 2-dimensional tables to data with a higher dimension. Many features of TensorFlow make it appropriate for deep learning.
1.2. Vehicle Classification:
The foreground vehicle area obtained by the vehicle detection algorithm is fed into another convolutional neural network for vehicle recognition. In the convolution layer, the network slides a certain number of convolution kernels over the input image with a certain step size (stride) to extract features. Each convolution kernel focuses on a different feature, so the convolution layer produces a different feature map for each kernel.
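The sliding-kernel operation described above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the network used in the paper: the image is a single-channel 4×4 array, there is no padding, and the two 2×2 kernels (`vertical_edge`, `horizontal_edge`) are hand-picked for the example, whereas a CNN learns its kernels during training.

```python
import numpy as np


def conv2d(image, kernel, stride=1):
    """Slide `kernel` over `image` with the given stride (no padding).

    As in most deep learning libraries, this is cross-correlation: the
    kernel is not flipped before the element-wise multiply-and-sum.
    """
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            window = image[top:top + kh, left:left + kw]
            feature_map[i, j] = np.sum(window * kernel)
    return feature_map


# A 4x4 image whose right half is bright, i.e. a single vertical edge.
image = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

# Two hypothetical kernels that respond to different features.
vertical_edge = np.array([[-1.0, 1.0], [-1.0, 1.0]])
horizontal_edge = np.array([[-1.0, -1.0], [1.0, 1.0]])

feature_maps = {
    "vertical": conv2d(image, vertical_edge, stride=1),
    "horizontal": conv2d(image, horizontal_edge, stride=1),
}
```

The vertical-edge kernel responds only in the column where the brightness changes, while the horizontal-edge kernel stays at zero everywhere, which is the sense in which each kernel yields its own feature map. A larger stride shrinks the output: with `stride=2` the same image gives a 2×2 map.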
Fig 10: Feature Map of Convolution Layer

B. SOFTWARE REQUIREMENTS
1. TESSERACT-OCR:
Character recognition is popularly referred to as Optical Character Recognition (OCR). OCR is the software process of converting typed, handwritten, or printed text into machine-encoded text that we can access and manipulate via a string variable, and Tesseract-OCR is an engine that performs this process. OCR has been one of the challenging and popular fields of research in pattern recognition. It can be used as one of the classifiers in the image processing step, since the most crucial step in OCR is identifying each character as one of 36 symbols (A-Z and 0-9). Therefore, for classification, we need to extract visual features from each individual character image. The character can then be classified from these features using machine learning. The most effective tool at present is Tesseract-OCR, which was originally developed by Hewlett-Packard in the 1980s. It is one of the most accurate open-source OCR engines currently available.

Fig 12: Flowchart of Tesseract-OCR Process

Like other traditional procedures, this processing follows a step-by-step approach. As the most important part, recognition consists of two stages: adaptive classification and repeated recognition. In the first stage, each character is recognized sequentially, and each symbol that is classified is stored by the adaptive classifier as training data. The adaptive classifier then learns from the information that the classified characters provide, and the characters that were not recognized in the first stage are classified again. This two-stage process makes the method accurate and efficient.

2. LABELIMG:
LabelImg is a graphical image annotation tool. It is written in Python and uses Qt for its graphical interface. Annotations are saved as XML files in the PASCAL VOC format, which is the format employed by ImageNet. Image annotation is the process of building datasets for computer vision models. It helps machines learn how to automatically assign metadata to a digital image using captions or keywords. This technique is used in image retrieval systems to organize and easily locate particular images in a database. In this work, LabelImg is used to annotate the images on which our customized model is trained. Image labeling gives insight into the content of images. When you use the API, you get a list of the entities that were recognized: people, things, places, activities, and so on. Each label found comes with a score that indicates the confidence of the ML model in its relevance.

3. ANACONDA NAVIGATOR:
Anaconda Navigator is used to launch applications and to manage conda packages, virtual environments, and programs without using command-line commands. To get Navigator, download the Navigator Cheat Sheet and install Anaconda. Anaconda is free, open-source software associated with the Python and R programming languages for scientific computing, such as data science, machine learning applications, large-scale data processing, predictive analytics and so on. It aims to simplify package management and deployment.

C. METHODOLOGY
1. DATA COLLECTION & DATA PREPARATION:
In this project, the dataset is prepared by collecting vehicle images from open sources, CCTV cameras and the internet. To ensure the representativeness and integrity of the data, images are taken from different viewpoints under different traffic conditions. Most of the images in this dataset are rear views of the vehicle. The image dataset also covers most types of vehicles as well as noise images.

Data is also collected by recording with a camera whose auto-iris function keeps the average illumination of the view constant. The i-LIDS dataset is used as well. The input data is then processed into a set of features before becoming suitable input for per-frame vehicle detection and classification using 3D models.

2. EDGE DETECTION:
Edge-based multi-stage detection is the primary function used to detect the vehicle and to localize its license plate edges. In this paper, license plate edges are detected from the original image by edge detection. Localization of the plate is undoubtedly a challenging task, since there are significant variations in plate size, color, lighting conditions and spatial orientation of license plates in images. So, three steps are employed at the preprocessing stage: (i) Gray Scale Conversion, (ii) Median Filtering, (iii) Contrast Enhancement.

2.1. Gray Scale Conversion:
Using the following formula, the 24-bit color image is converted into an 8-bit gray image.
Gray = 0.59×R + 0.30×G + 0.11×B

2.2. Median Filtering:
The median filter is a non-linear filter that computes the median of the gray values of a pixel's neighbors. In this stage, a 3×3 mask is used to gather the gray values of the eight surrounding neighbors, and the pixel value is replaced with the median value. As a result, the filter removes salt-and-pepper noise from the image.
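The preprocessing steps can be sketched with NumPy as follows: gray-scale conversion, 3×3 median filtering, and the histogram equalization that the text turns to next for contrast enhancement. Several details are assumptions rather than statements from the paper. The default weights follow the formula quoted above; the widely used ITU-R BT.601 luma weights are 0.299, 0.587 and 0.114 for R, G and B, so the weights are left as a parameter. The median window is assumed to include the centre pixel, borders are handled by edge replication, and the equalization uses the standard cumulative-histogram form scaled to 0..255. All function names are hypothetical.

```python
import numpy as np


def to_gray(rgb, weights=(0.59, 0.30, 0.11)):
    """Convert an (H, W, 3) uint8 RGB image to an (H, W) uint8 gray image.

    The default weights are the R, G, B coefficients quoted in the text.
    Pass (0.299, 0.587, 0.114) for the ITU-R BT.601 luma weights instead.
    """
    gray = rgb.astype(float) @ np.asarray(weights, dtype=float)
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)


def median_filter_3x3(gray):
    """Replace each pixel with the median of its 3x3 window.

    Assumption: the window includes the centre pixel and borders are
    handled by replicating edge pixels; the text specifies neither.
    """
    padded = np.pad(gray, 1, mode="edge")
    height, width = gray.shape
    # Stack the nine shifted views so the median can be taken per pixel.
    windows = np.stack([
        padded[dy:dy + height, dx:dx + width]
        for dy in range(3)
        for dx in range(3)
    ])
    return np.median(windows, axis=0).astype(np.uint8)


def equalize_histogram(gray, levels=256):
    """Stretch contrast using the cumulative histogram of gray levels."""
    counts = np.bincount(gray.ravel(), minlength=levels)  # pixels per level
    cdf = np.cumsum(counts) / gray.size                   # cumulative share in [0, 1]
    return np.rint(cdf[gray] * (levels - 1)).astype(np.uint8)


# A uniform reddish 3x3 image with one white "salt" pixel in the centre.
rgb = np.zeros((3, 3, 3), dtype=np.uint8)
rgb[..., 0] = 100
rgb[1, 1] = (255, 255, 255)

gray = to_gray(rgb)
clean = median_filter_3x3(gray)
stretched = equalize_histogram(np.array([[0, 10, 20, 30, 40]], dtype=np.uint8))
```

In the example the isolated white pixel disappears after median filtering because it is never the median of any 3×3 window, and the five dark gray levels of the last array are spread across the full 0..255 range.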
2.3. Contrast Enhancement:
The histogram equalization technique is used to enhance the contrast of the images. Let N be the total number of pixels in the image and n_k the number of pixels with gray level k. The stretched gray level S_k is then calculated by the following formula, given here in the standard cumulative form of histogram equalization.

$$S_k = \sum_{j=0}^{k} \frac{n_j}{N}$$

3. FUNCTIONAL PROCESS:
This paper proposes an advanced vehicle classification and detection method based on the deep learning approach. In this system, surveillance cameras observe the vehicles in a specific region or area. These cameras are connected to a personal computer (PC) or a recording device and provide images or videos of the vehicles passing through that gateway or area. This vehicle data is taken as input by the PC. A dataset is created from the i-LIDS data, real-time camera footage and open-source data. With this dataset, the deep learning model is trained for a specified number of epochs. As the number of epochs increases, the loss of the model decreases. Finally, the trained model is tested with new vehicle data to check whether it works effectively for vehicle classification and number plate detection.

The resulting information, which includes the type of vehicle, its number plate and other details such as time in and time out, is tabulated in a log. Every vehicle's information is appended to this log. Finally, the model provides the log of the vehicles that passed through this artificial intelligence surveillance system.

VI. FLOW CHART
Fig 15: (a) loss curves; (b) accuracy curves

The dataset of 3500 pictures, collected and preprocessed with the auto-iris function, is used to evaluate the trained models. We drew the loss curve and accuracy curve of every classification and recognition network structure, which helps in understanding the training process of each model. From the comparison of the curves, it is clear that ResNet-101 achieves the fastest convergence and the highest accuracy, as shown in Fig 15.

Architecture    Accuracy (%)    Year
AlexNet         82.6            2012
Inception-V1    88.4            2013
VGG             90.8            2014
ResNet-50       92.3            2016
ResNet-101      95.2            2018
Table 1: Comparison of Network Structures

We apply the proposed model, trained on the image dataset, to real traffic road videos and images collected by ourselves. The collection of real traffic vehicle pictures is also tested. We observed that, for recognition on real traffic road images, the model trained on this image dataset obtains higher accuracy.

The experimental results obtained from the training and testing datasets are shown in Table 1. Figure 15 shows that the ResNet network structure gives higher accuracy for object detection and classification than the other network structures. Comparing the testing performance of ResNet structures with different numbers of layers, the best-performing one is the cascaded ResNet-101 network structure, which achieves 95.27%.

VIII. RESULTS
In most cases, the proposed system obtained higher accuracy in the experiments. The proposed method integrates feature extraction, object frame boundary generation, linear regression and classification, and license plate recognition into a customized, efficient model. Using this model, we can retrieve the vehicle information accurately. Background noise in the license plate does not affect the recognition performance, so the system can recognize all the characters on the license plate successfully. Thus, the overall performance is greatly improved. Compared with existing models, the current deep learning model for vehicle classification and detection has a very high accuracy rate. Therefore, the proposed model gives standard results.

IX. CONCLUSION
Based on a CNN, this paper proposed vehicle type classification and license plate recognition for urban traffic video surveillance. The deep learning model performs classification and detection of vehicles simultaneously. This reduces the complexity of the processing, which improves the performance of the system when training and testing the model. With a larger training dataset, modified parameters and a replaced model, the proposed method becomes effective, as validated by the relevant experiments. The experimental results show that the vehicle recognition rate is improved. In a comparative analysis, this vehicle classification and detection framework achieves higher accuracy and success rates, and thus better performance, than existing frameworks.

X. FUTURE WORK
In the future, we will implement the proposed model as a real-time hardware application with a deep learning framework, which should further improve the accuracy and robustness of traffic vehicle classification and detection. There is still a great deal of work on deep learning to be studied. For example, other efficient and theory-based deep learning algorithms are needed. New feature extraction models are also worth further research, and efficient parallel training algorithms need to be investigated. How to extend the applications of deep learning and how to make rational use of it to enhance the performance of traditional algorithms remain the focus of a variety of fields.

XI. ACKNOWLEDGEMENT
I would like to express my sincere appreciation to my supervisors Dr. N. Viswanathan, Dr. K. Manivel and N. Jayanthi for their constant guidance and encouragement during my whole B.E. period at the Mahendra Engineering College. They gave me the opportunity to do this project on “CLASSIFICATION & DETECTION OF VEHICLES USING DEEP LEARNING”, which helped me learn a lot of new things. I am thankful to them; without their valuable help, this project would not have been possible.

XII. REFERENCES
[1] Saha, S., Basu, S., Nasipuri, M., and Basu, D. K., “License plate localization from vehicle images: An edge-based multi-stage approach”, International Journal of Recent Trends in Engineering, 1(1):284–288.
[2] Chang, S.-L., Chen, L.-S., Chung, Y.-C., and Chen, S.-W., “Automatic license plate recognition”, IEEE Transactions on Intelligent Transportation Systems, 5(1):42–53.
[3] Al-Shami, S., El-Zaart, A., Zantout, R., Zekri, A., and Almustafa, K., “A new feature extraction method for automatic license plate recognition”, In Digital Information and Communication Technology and its Applications (DICTAP), Fifth International IEEE Conference.
[4] Farhat, A., Al-Zawqari, A., Al-Qahtani, A., Hommos, O., Bensaali, F., Amira, A., and Zhai, X., “OCR based feature extraction and template matching algorithms for Qatari number plate”, In Industrial Informatics and Computer Systems (CIICS), International IEEE Conference.
[5] Anagnostopoulos, I. E., Loumos, C. N. E., Anagnostopoulos, V., and Kayafas, E., “A license plate detection algorithm for intelligent transportation system applications”, IEEE Transactions on Intelligent Transportation Systems, 7(3):377–392.
[6] Dalal, N. and Triggs, B., “Histograms of oriented gradients for human detection”, In Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society Conference on, volume 1, IEEE.