License Plate Recognition System Based On Improved

This article presents an improved end-to-end deep learning model for license plate recognition using an enhanced YOLOv5 and GRU framework. The model incorporates an improved channel attention mechanism and reduces parameters to enhance efficiency and accuracy, achieving a recognition precision of 98.98%. The proposed method demonstrates significant improvements over traditional recognition algorithms, particularly in complex environments.


This article has been accepted for publication in IEEE Access.

This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2023.3240439


This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

License Plate Recognition System Based on Improved YOLOv5 and GRU
HENGLIANG SHI¹, DONGNAN ZHAO²
¹Automobile and Rail Transit School, Luoyang Polytechnic, Science and Technology Avenue 6, Yibin District, Luoyang (e-mail: [email protected])
²Information Engineering School, Henan University of Science and Technology, Kaiyuan Avenue 263, Luolong, Luoyang (e-mail: [email protected])
Corresponding author: Dongnan Zhao (e-mail: [email protected]).
This work was supported in part by the Key Scientific Research Projects of Henan Province Education Institution
under project number 22B120002.

ABSTRACT Traditional license plate recognition methods lack accuracy and speed. To address this
problem, we propose an end-to-end deep learning model for license plate location and recognition in
natural scenarios. First, we add an improved channel attention mechanism to the down-sampling
process of You Only Look Once (YOLOv5). Location information is embedded in the channel attention
to minimize the information loss caused by sampling, which improves the feature extraction ability
of the model. Then, we reduce the number of parameters on the input side and set only one class in
the YOLO layer, which improves the efficiency and accuracy of the detector in locating license plates.
Finally, gated recurrent units (GRU) combined with connectionist temporal classification (CTC) are
used to build the recognition network, which recognizes license plates without character segmentation
and significantly shortens the training time while improving the convergence speed and recognition
accuracy of the network. Experimental results show that the average recognition precision of the
proposed model reaches 98.98%, which is significantly better than traditional recognition algorithms,
and that the model remains stable and robust in complex environments.

INDEX TERMS Deep learning, target detection, license plate recognition, YOLOv5, GRU

I. INTRODUCTION
The license plate is an important information carrier of the vehicle and provides it with a unique
identity mark. License plate recognition is a key link in building intelligent transportation and
plays an important role in traffic calming, vehicle tracking, unmanned parking lots, and automatic
highway toll collection. Current license plate recognition technology is generally divided into
three stages: detection, segmentation and recognition. Such schemes have complex processes and low
efficiency, are easily affected by uneven lighting and noise, and have poor robustness.

Although license plate recognition technology has been widely used in real life, it is mostly
deployed in fixed scenarios and environments, and the precision and robustness of existing
recognition technology can hardly meet the needs of applications in complex conditions and real-time
scenarios. In recent years, with the rapid development of computer hardware, neural network models
based on deep learning have become the best tools for solving complex computer vision problems [1-3].
The convolutional neural network (CNN) is one of the best deep learning techniques for target
detection and recognition tasks, and the most popular CNN-based target detection algorithm is YOLO,
proposed by Joseph Redmon in 2015 [4]. It creatively combines the two stages of candidate region
generation and target recognition into one, so that a single forward pass completes the detection.
This greatly reduces image processing time and makes the model very efficient. Many researchers have
carried out secondary development work on the YOLO family of detectors [5-6]. On the one hand, YOLO
has been combined with other methods; for example, YOLOv3 has been used to extract and classify
underwater objects and combined with a deep learning method based on long short-term memory (LSTM)
to determine the location of the underwater objects [7]. On the other hand, the YOLO backbone
structure has been optimized [8-9]; for example, the output layer was replaced with deformable
convolution in the backbone network CSPDarknet53_dcn(P) of YOLOv4 to improve detection speed, and a
new feature fusion module was designed to improve the detection accuracy of small objects using
multiple-scale detection layers [10]. The latest version of YOLOv5 is highly precise and fast, and
with the PyTorch framework it generates detection weight files of only 10-120 MB, which makes
high-precision license plate detection models much easier to deploy on embedded devices [11-12].
For example, an auxiliary domain (S-DAYOLO) was constructed based on YOLOv5s, in which synthetic
images are generated by converting source images into similar images in the target domain. This
addresses the degradation of object detection performance after domain conversion, and the model is
intended to be embedded in electronic components for autonomous driving assistance systems [13].
Meanwhile, recurrent neural networks (RNNs) play an indispensable role in natural language
processing by processing characters directly as input sequences, which eliminates character
segmentation and has allowed end-to-end license plate recognition models to become mainstream
[14-15]. In model optimization, attention mechanisms enable models to know "where the places of
interest are" and are widely used to improve the performance of neural network models [16-17]. Among
them, SE (Squeeze-and-Excitation) is the most popular attention mechanism because of its low cost
and high gain; it establishes channel correlation through 2D global pooling [18]. However, the SE
attention mechanism only collects information about the relationship between channels and does not
attend to the corresponding location information, which is critical for capturing target structures
in detection and recognition tasks [19].

The main contributions of this paper are as follows:
• The proposed lightweight deep learning model requires only one forward computation to complete
the end-to-end detection and recognition of license plates;
• The YOLOv5 algorithm is improved by adding a novel attention mechanism to the down-sampling
process of the Neck structure, which improves the efficiency and accuracy of license plate location;
• In the improved YOLOv5, we modify the feature parameters of the prediction part of the classifier
to increase the accuracy of the model while reducing the training time;
• We use a GRU + CTC recognition network to recognize the located license plates. The model does
not require pre-segmentation of license plate characters; character features are extracted
automatically by deep neural networks through learning.

To demonstrate that the proposed method is more effective than previous license plate recognition
algorithms, we conducted extensive experiments on the CCPD dataset. With the same training and test
sets, our recognition algorithm improves the recognition precision by 0.44%. In terms of operational
efficiency, we also observe that although an improved attention mechanism is incorporated, a more
efficient recognition network structure ensures that the method meets the requirements of practical
engineering.

The rest of this paper is organized as follows. Section II presents work related to license plate
detection and recognition algorithms. Section III presents the basic framework of the YOLOv5-LSE
method and the optimization process. To verify the effectiveness of the method, Section IV shows its
experimental results on the dataset and gives a detailed comparative analysis with other
cutting-edge recognition algorithms. Section V discusses the improved algorithm.

II. RELATED WORKS
In the license plate positioning stage, traditional license plate localization methods based on a
priori information are generally classified as color texture [20-21], shape regression [22-23], and
edge detection [24-25]. The color of the license plate is usually blue, yellow, white or green, with
high color contrast and a fixed shape, so color and shape features are widely used. For example,
Tian et al. [26] used a color difference model to obtain a binarized image and select the target
region, then used the AdaBoost algorithm to train these features along with other features to obtain
a classifier, and finally used the classifier to precisely locate the license plate. Salau et al.
[27] used the geometric information of the license plate aspect ratio as a threshold for foreground
extraction and applied the GrabCut algorithm for automatic license plate localization. This approach
has limitations because the aspect ratio of the license plate differs from place to place. The
feature extraction of traditional localization methods relies on manual design, which is not well
suited to the diversity of images. Therefore, traditional methods of license plate detection are
inefficient and have poor accuracy.

In recent years, target detection methods based on deep learning have developed rapidly, and the
algorithms fall mainly into two categories. In the first category, the algorithm first generates a
set of candidate regions and then classifies and positions them again [28-29]. For example, Naaman
Omar et al. [30] used a deep semantic segmentation network to classify license plate images into
digital regions, city regions, and country regions. Ibtissam Slimani et al. [31] detected license
plates with a wavelet transform and then validated the potential regions with a CNN classifier. The
second category is end-to-end detection algorithms, which directly output the location coordinates
and class probability of the target; typical algorithms are SSD [32] and YOLO [33-35]. The former
has a lower recognition speed and the latter is slightly less accurate.

In the license plate recognition stage, traditional recognition algorithms usually first segment the
license plate characters one by one and then use optical character recognition (OCR) technology to
recognize each character [36-37]. Nahlah M [38] uses the honeybee algorithm to segment the license
plate characters and then uses a support vector machine (SVM) to recognize them. Experimental
results show that the method has a good license plate recognition

effect, but its recognition efficiency is poor. Currently popular recognition algorithms discard the
character segmentation process and more often fuse license plate detection with recognition [39-40].
For example, Omar et al. [41] proposed LPR-CNN, which uses two convolutional neural networks to form
a license plate and character detection system. The method is trained end to end and does not
require prior character segmentation. Experimental results show that it is effective for license
plate recognition on real vehicle images.

In addition, in practical applications many license plate recognition algorithms achieve good
recognition only under specific conditions [42], such as good weather, adequate lighting, and fixed
scenes and facilities. License plate recognition in complex environments remains difficult, with
challenges such as poor lighting at night, rain and snow, and dumped, obscured or blurred license
plates [43]. In traditional license plate recognition, location and recognition are usually treated
as two separate tasks, and more complex algorithms are used to meet these challenges. Neural network
models based on deep learning, however, connect the two problems well. Therefore, this paper
proposes an end-to-end license plate detection and recognition method based on deep learning that
optimizes the efficiency and accuracy of recognition.

III. METHODOLOGY
A. MODEL FRAMEWORK
The overall framework of the license plate recognition system is shown in Fig. 1. It consists of two
parts, license plate positioning and license plate recognition, which are joined into a single
network model through a data interface. First, in the license plate localization module, we use the
improved YOLOv5 model to detect and crop the license plates in the images. Then, in the recognition
network, we use GRU for sequence labeling and decoding. The GRU output matrix and the corresponding
ground-truth (GT) text are fed into the CTC loss function, and the recognized license plate result
is obtained by calculating the loss function values of each data point.

B. YOLOV5
YOLO is one of the most famous object detection algorithms because of its high efficiency, high
precision and light weight. YOLOv5 is the latest generation of the YOLO family of detection
networks, introduced by Glenn Jocher in May 2020 using the PyTorch framework. There are four
versions of the YOLOv5 network model: YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x. YOLOv5s has the
smallest depth and feature map width in the series, and the other three are deepened and widened on
its basis.

As shown in Fig. 2, the network structure of YOLOv5 consists of four parts: input, backbone, neck
and prediction. YOLOv5 has been iterated to version V6.1, and we use this latest version of the
network structure in this paper; the model introduction and improvements below are based on it.

Input: The input stage pre-processes the input image. Pre-processing includes data enhancement,
adaptive image scaling and anchor frame calculation. YOLOv5 uses the Mosaic data enhancement method
to stitch four images into a new image by random layout, cropping and scaling, which greatly
enriches the detection data. The data of four images can also be processed together in the batch
normalization calculation, which speeds up training. YOLOv5 embeds the anchor frame calculation into
training: it outputs the predicted frame on the basis of the initial anchor frame, compares it with
the ground truth to calculate the loss, and thus continuously updates the anchor frame size and
adaptively calculates the optimal anchor frame values.

Backbone: The Backbone mainly consists of the Focus structure and the CSP structure. Since version
V6.0 of YOLOv5, however, the Focus module has been replaced with a 6 × 6 convolutional layer. The
two are theoretically equivalent, but for some existing GPU devices and the corresponding
optimization algorithms, the 6 × 6 convolutional layer is more efficient. The CSP structure enhances
the learning ability of the model and speeds up network inference.

Neck: The Neck includes FPN and PAN structures.

FIGURE 1. LPR Model Framework
FIGURE 2. YOLOv5 network structure
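For the recognition branch of Fig. 1, the GRU emits one score vector per time step and CTC aligns these vectors with the label text. The text above does not state how the final string is read out at inference time; a common choice is best-path (greedy) CTC decoding, which the following minimal sketch illustrates. The blank index, the toy alphabet and the function name are assumptions made for illustration and are not taken from the paper.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical choice)


def ctc_greedy_decode(frame_scores, alphabet):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # 1. Most likely class index at every time step.
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in frame_scores]
    # 2. Collapse runs of identical indices into a single index.
    collapsed = [index for index, _ in groupby(best_path)]
    # 3. Remove blanks and map the remaining indices to characters.
    return "".join(alphabet[index] for index in collapsed if index != BLANK)


def one_hot(index, size):
    """Build a toy score vector that peaks at `index`."""
    return [1.0 if k == index else 0.0 for k in range(size)]


# Toy alphabet: position 0 is the blank placeholder.
alphabet = ["-", "A", "B", "1"]
frames = [one_hot(i, len(alphabet)) for i in [1, 1, 0, 1, 2, 2, 0, 3]]
plate = ctc_greedy_decode(frames, alphabet)
```

The blank between the two runs of "A" is what lets the decoder keep a genuinely repeated character, which matters for plates that contain doubled letters or digits.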


YOLOv5 adds PAN to the FPN structure so that the combined structure fuses and extracts features at
different levels more effectively. Version V6.1 also replaces SPP with SPPF, which is designed to
convert arbitrarily sized feature maps into fixed-size feature vectors more quickly.

Prediction: Prediction outputs the target detection results. YOLOv5 uses the CIoU loss, which
extends the IoU loss by taking into account the distance between the center points of the bounding
frames; IoU is the ratio of the intersection to the union of the predicted frame and the real frame.
DIOU_nms is also used in this process instead of the traditional NMS operation in order to suppress
redundant frames better and thus further improve the detection accuracy of the algorithm.

C. DETECTION ALGORITHM OPTIMIZATION
In general, the license plate takes up a relatively small part of the image, and license plate
detection faces two problems:
• The information presented by the pixels in the detection area where the license plate is located
is limited, which can easily lead to poor detection of the license plate by the target detection
algorithm;
• In the model training phase, the labeling of small objects is prone to bias, and the detection
results are greatly affected when the targets are small and the number of classes is large.

To solve these two problems, we have improved the YOLOv5 algorithm. We improve the feature
extraction ability of the model by adding an attention mechanism, which improves detection of small
targets. At the same time, we use a single class in the model, which greatly reduces the number of
parameters, makes the model less likely to fall into category confusion, and reduces the impact of
labeling on detection results. The two improvements are explained in detail in the following two
subsections.

1) Novel attention mechanism
The neck structure of YOLOv5 is the FPN + PAN model. FPN is a top-down feature pyramid that passes
higher-level semantic features down through up-sampling and convolution. However, FPN only enhances
the semantic information and does not pass on the localization information. The PAN structure
compensates for this by adding a bottom-up feature pyramid after the FPN. PAN down-samples the
bottom layer of the FPN; the layer above it is passed through a 3 × 3 convolution and connected
laterally with the down-sampled bottom layer, the two are added together, and a 3 × 3 convolution is
applied again to fuse their features. This process is repeated from the bottom of the FPN upward to
form a new feature pyramid that contains both semantic and localization information. PAN uses 8
times, 16 times and 32 times down-sampling and convolution for three different sizes of images to
complete the feature extraction and transfer. In this process, a large amount of position
information is lost, making the model less accurate in detecting small targets.

In recent years, attention mechanisms have been widely used to improve the performance of modern
deep neural networks. We introduce a new attention mechanism to solve the problem presented in the
previous section [44]. The new attention mechanism is an improvement on the SE
(Squeeze-and-Excitation) block. The SE block focuses on the relationship between channels and allows
the model to automatically identify the importance of different channel features, but it ignores
location information. Location information is very important in visual tasks that capture target
structure [45]. Because PAN generates a large amount of channel and location information, we embed
location information into channel attention to form a new attention mechanism, L-SE. Its
implementation in YOLOv5 is shown in Fig. 3; it is added to the down-sampling process in the PAN
structure [46].

FIGURE 3. Attention mechanism

We first give a brief description of the SE module to better explain L-SE. Standard convolution by
itself has difficulty modeling channel relationships, yet this channel relationship information is
significant for the final classification decision of the model. The SE module does this job well: it
makes the model focus on the most informative channel features and suppresses the unimportant ones
to achieve better feature extraction. It works as follows. First, the Squeeze operation is applied
to the feature map obtained by convolution to get the global feature of each channel. Then the
Excitation operation is applied to the global features to learn the relationship between channels
and obtain the weights of the different channels. Finally, the output features are obtained by
multiplying the weights with the original feature map. Given the input U, the Squeeze equation for
the c-th channel is:

$$z_c = F_{sq}(u_c) = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_c(i, j), \quad z \in \mathbb{R}^{c} \tag{1}$$

where $z_c$ is the global feature of the c-th channel, obtained by encoding the entire spatial
extent of that channel. The $u_c$ come from the convolutional layers in the PAN structure, whose
convolutional kernel size is fixed, so they can be considered a collection of local descriptors. H
denotes the height of the feature map and W its width. $F_{sq}$ denotes the squeeze operation, that
is, global average pooling over the feature map.
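The three SE steps described above (squeeze, excitation, scaling) can be traced on a tiny input. The sketch below is a plain-Python illustration, not the authors' implementation: it assumes the standard SE gate of two linear maps with a ReLU in between and a sigmoid at the end, and the weight matrices `w1` and `w2` are hypothetical values chosen only to make the arithmetic easy to follow.

```python
import math


def squeeze(u):
    """Equation (1): average each H x W channel map down to one scalar."""
    return [sum(map(sum, channel)) / (len(channel) * len(channel[0]))
            for channel in u]


def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def excite(z, w1, w2):
    """Bottleneck gate: linear map, ReLU, linear map, sigmoid."""
    hidden = [max(0.0, h) for h in matvec(w1, z)]
    return [1.0 / (1.0 + math.exp(-a)) for a in matvec(w2, hidden)]


def scale(u, s):
    """Reweight every channel map by its attention weight."""
    return [[[value * weight for value in row] for row in channel]
            for channel, weight in zip(u, s)]


# Toy input: C = 2 channels, H = W = 2, reduction ratio r = 2.
u = [[[1.0, 2.0], [3.0, 4.0]],
     [[0.0, 0.0], [0.0, 4.0]]]
w1 = [[1.0, -1.0]]   # hypothetical weights, shape (C/r) x C
w2 = [[0.0], [2.0]]  # hypothetical weights, shape C x (C/r)

z = squeeze(u)         # per-channel global averages
s = excite(z, w1, w2)  # per-channel weights in (0, 1)
x = scale(u, s)        # reweighted feature maps
```

Note that `squeeze` collapses each channel to a single number, so two feature maps with the same mean but different spatial layouts receive the same weight; this is the loss of location information that motivates L-SE.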

After obtaining the global description of the features by squeezing, we next need to capture the
relationship between the channels through the excitation operation. The traditional SE module uses a
gating mechanism in the form of a sigmoid:

$$s = F_{ex}(z, W) = \sigma(g(z, W)) = \sigma(W_2 \, \mathrm{ReLU}(W_1 z)) \tag{2}$$

where $F_{ex}$ denotes the excitation operation, and $W_1 \in \mathbb{R}^{\frac{c}{r} \times c}$ and
$W_2 \in \mathbb{R}^{c \times \frac{c}{r}}$ represent two linear transformations; the weight of each
channel is obtained by learning them. $\sigma$ is a nonlinear activation function that normalizes
the result to give the importance of each channel, with 0 being unimportant and 1 being important.

The optimization of L-SE lies in the addition of location information. Equation (1) only squeezes
the global spatial information and does not preserve the location information. We factorize
equation (1) into two parts, (V, 1) and (L, 1), representing two spatial ranges of pooling kernels
that perform a pair of 1D feature encodings for each channel in different directions, (V, 1) being
vertical and (L, 1) being horizontal. The output of the c-th channel at height v in the vertical
direction is:

$$z_c^v(v) = \frac{1}{L} \sum_{j=1}^{L} u_c(v, j) \tag{3}$$

In the same way, the output of the c-th channel at width l in the horizontal direction can be
written as:

$$z_c^l(l) = \frac{1}{V} \sum_{i=1}^{V} u_c(i, l) \tag{4}$$

These two formulas collect features along the two spatial directions, horizontal and vertical, and
generate a pair of perceptual feature maps in the corresponding directions. L-SE differs from the
squeeze operation in that it can learn the relational weights between individual channels along one
spatial direction while collecting precise position information along the other. This approach helps
YOLOv5 to locate the target of interest more precisely [47].

In order to make full use of the collected location information, we propose a new method for
calculating weights. We create a shared 1 × 1 convolutional transform function F, into which the
aggregated features generated by equations (3) and (4) are fed. The excitation operation yields:

$$f = \sigma\left(F\left(\left[z^v, z^l\right]\right)\right) \tag{5}$$

where $[z^v, z^l]$ denotes the concatenation of the aggregated feature subsets and $\sigma$ is a
nonlinear activation function. $f \in \mathbb{R}^{c/r \times (V+L)}$ denotes an intermediate feature
map with information encoded in two different directions. We then split f along the spatial
dimension into two separate tensors, $f^v \in \mathbb{R}^{c/r \times V}$ and
$f^l \in \mathbb{R}^{c/r \times L}$. We use a bottleneck structure with two fully connected (FC)
layers here to reduce model complexity and to improve generalization. In the first FC layer, r is
the dimensionality reduction coefficient. In the second FC layer, $f^v$ and $f^l$ are transformed to
have the same number of channels as the input U by two 1 × 1 convolutional transforms $F_v$ and
$F_l$, which are independent of each other, yielding:

$$w^v = \sigma(F_v(f^v)) \tag{6}$$

$$w^l = \sigma(F_l(f^l)) \tag{7}$$

where $\sigma$ is the sigmoid function, and $w^v$ and $w^l$ are used as attention weights for the
different channels. Finally, the Scale operation is performed: the obtained weights are multiplied
with the original feature map to get the final features. The output X of the L-SE block is:

$$x_c(i, j) = u_c(i, j) \times w_c^v(i) \times w_c^l(j) \tag{8}$$

As described above, the L-SE block not only considers the importance of the different channels but
also attends to the encoded location information. We apply attention along two different directions
simultaneously to the input tensor, and the resulting attention maps indicate whether the target of
interest lies in the corresponding row and column. The attention can also be adjusted during this
encoding process to localize the target of interest more accurately and thus improve the detection
ability of the model.

2) Parametric optimization
YOLOv5 was originally implemented on the COCO2017 dataset, with 80 classes in the original
classifier (people, motorcycles, fire hydrants, elephants, umbrellas, etc.). In YOLOv5, each
bounding box is represented by five predicted values (center coordinates, width, height and
objectness confidence) plus one score per class, and each detection scale uses three anchor boxes
per grid cell, so every grid cell of a detection head outputs 3 × (5 + 80) = 255 values. This output
is large, which reduces the prediction efficiency of the model while increasing the probability of
class errors. Our second contribution is therefore to reduce the number of classes in the
classifier. With only a single class (license plate), the output per grid cell becomes
3 × (5 + 1) = 18 values. This makes model detection much faster and less prone to class confusion,
which helps to ensure detection accuracy.

D. RECOGNITION ALGORITHM
Character segmentation is an integral part of the traditional license plate recognition framework.
The quality of character segmentation is highly susceptible to noise and complex environments. If
the effect of character segmentation

is not good, then even a high-performance recognizer will produce false recognitions and missed
recognitions. Traditional recognition schemes generally apply many different image pre-processing
methods to solve this problem, but none of them has achieved good results. We therefore treat the
characters in a license plate as an undivided sequence and use a deep learning model to solve the
recognition problem.

1) GRU + CTC
In the field of end-to-end license plate recognition algorithms, the LSTM + CTC scheme is the more
widely used. In recent years, however, the emergence and development of GRU has provided more
options. GRU is a very effective variant of the LSTM network that is simpler in structure, lighter
in weight, and very effective in recognition. Therefore, in this paper we choose the combination of
GRU + CTC as the license plate recognition algorithm, and we show through experiments that this
scheme improves the model training time, convergence speed, and recognition accuracy.

2) Sequence signature generation
The recognition process of the GRU + CTC based license plate recognition model is shown in Fig. 4.
In the first part of the model, we use a pre-trained 7-layer CNN to extract the sequential feature
representation from the cropped license plate image. This gives a conv5 feature map of size
N × C × W × H. A 3 × 3 sliding window is then applied to conv5, so that each point is combined with
the features of its surrounding area to obtain a feature vector of length 3 × 3 × C, and a feature
map of size N × 9C × W × H is output. The features obtained in this way are spatial features suited
to a CNN, so we apply a Reshape operation to transform them into a size that GRU can learn.

FIGURE 4. GRU + CTC recognition model

3) Sequence labelling
We apply GRU recursively to each layer of features in the obtained feature sequence. GRU can exploit
past contextual information for prediction, which makes sequence recognition more stable than
processing each feature individually. GRU has only two gates. It merges the input gate and the
forget gate of LSTM into a single update gate, which selects the memory information that continues
to be retained up to the current moment. The other gate is the reset gate, which controls how much
of the past information is forgotten. We first obtain the two gating states from the state
$h_{t-1}$ passed down from the previous step and the input $x_t$ of the current node:

$$z_t = \sigma(W_{xz} x_t + W_{hz} h_{t-1} + b_z) \tag{9}$$

$$r_t = \sigma(W_{xr} x_t + W_{hr} h_{t-1} + b_r) \tag{10}$$

where $\sigma$ is the sigmoid function, which maps the data to a value in the range 0 to 1 that acts
as a gating signal; $r_t$ controls the reset gate and $z_t$ controls the update gate. After the
gating states are obtained, the reset gate $r_t$ is first applied to the hidden layer output
$h_{t-1}$ of the previous moment to reset the information state. The result is then combined with
the current input $x_t$ and the corresponding bias, and the tanh activation yields the updated
immediate information of the node at the current moment.

$$h'_t = \tanh(W_{hh'} (r_t * h_{t-1}) + W_{xh'} x_t + b_{h'}) \tag{11}$$

In the output of the hidden layer, the GRU uses the single gate $z_t$ for both forgetting and selective memory, yielding:

$$h_t = (1 - z_t) * h_{t-1} + z_t * h'_t \tag{12}$$

Here $(1 - z_t)$ can be regarded as a forget gate that clears unimportant information from $h_{t-1}$: the term $(1 - z_t) * h_{t-1}$ represents selective forgetting of the previous hidden state, and $z_t * h'_t$ represents the selection of certain information in $h'_t$. Equation (12) thus discards some dimensions of the information passed down in $h_{t-1}$ and adds some dimensions of the information entered at the current node. The symbol $*$ denotes the Hadamard product, that is, element-wise multiplication. $W_{xz}$, $W_{xr}$ and $W_{xh'}$ are the weight matrices connecting the input layer to the hidden layer at moment $t$; $W_{hz}$, $W_{hr}$ and $W_{hh'}$ are the weight matrices connecting the hidden layer at moment $t-1$ to the hidden layer at moment $t$; $b_z$, $b_r$ and $b_{h'}$ are the biases of the update gate, the reset gate and the current hidden layer, respectively.

Finally, we use a Softmax layer to transform the GRU state $h_t$ into a probability distribution over 7 classes. In this paper we use the challenging Chinese license plate as the experimental object. A Chinese license plate consists of seven characters, so we set up seven character-information network layers.

$$p_t(c = c_k \mid x_t) = \mathrm{softmax}(h_t), \quad k = 1, 2, \ldots, 7 \tag{13}$$

The whole feature sequence is eventually converted into a sequence of probability estimates $p = (p_1, p_2, \ldots, p_L)$, which has the same length as the output sequence.

4) Sequence decoding

In the final stage of the license plate recognition model, we convert the sequence of probability estimates $p$ into a string. With the common Softmax loss, each output column must correspond to one character. This requires every license plate image in the training set to be labeled with the position of each character, so that each column of the feature map can be aligned with its label for training. In practice such aligned samples are very difficult to produce, and marking the position of every character in addition to the characters themselves is a huge amount of work. Moreover, smudging and occlusion in complex environments may change the number of visible plate characters, so the output columns do not necessarily correspond one-to-one to the characters. Therefore, to handle this sequence problem with an uncertain alignment between input features and output labels, we introduce CTC. It does not require pre-segmented data and can directly decode the predicted sequence into output labels. We connect CTC directly to the output of the GRU, so that the input of CTC is the output activation of the GRU. In addition, we use sequence decoding to make better use of the GRU output sequence and obtain the approximately optimal path with maximum probability.

IV. EXPERIMENTS AND RESULTS ANALYSIS

A. DATASET AND ENVIRONMENT

The license plate dataset used for the experiments in this paper combines the open-source large-scale Chinese urban parking dataset CCPD with vehicle images we collected ourselves, totaling 12,500 images. The images are all 1160 × 720 in size and cover many complex conditions, such as bright light, cloudy skies, dim light, smudged plates and tilted plates, giving a diverse and sufficiently large set of license plate scenes. We divide the dataset into a training set of 10,000 images and a test set of 2,500 images, used to train the model and to test its performance, respectively. For the recognition part, the license plate must first be detected in and cropped from the original image. Some sample data are shown in Fig. 5.

FIGURE 5. Sample images from the dataset

The experimental environment for this paper is built on a Linux operating system. The CPU model is an I7-[email protected], the GPU model is an NVIDIA GeForce RTX3090 8GB, and the software versions are CUDA 11.2 and PyTorch 1.7.

B. EVALUATION INDICATORS

In this paper, we use objective criteria to evaluate the license plate detection and recognition models. Precision, recall and mAP (mean Average Precision) are used as evaluation metrics, defined as follows.

$$Precision = \frac{TP}{TP + FP} \tag{14}$$

$$Recall = \frac{TP}{TP + FN} \tag{15}$$

$$mAP = \frac{1}{C} \sum_{k=1}^{n} P(k) \, \Delta R(k) \tag{16}$$

where TP is the number of true positive samples, FP is the number of false positive samples, FN is the number of false negative samples, C is the number of categories, n is the number of referenced thresholds, k is the threshold index, P(k) is the precision, and R(k) is the recall.

C. RESULTS ANALYSIS

1) License plate positioning model

In the training phase of the improved YOLOv5 license plate detection model, considering that we added the attention mechanism L-SE, which incorporates multi-level feature information, we set the batch size to 128, the number of epochs to 300, the initial learning rate to 0.01, the decay rate to 0.005, and the total number of iterations to 30,000. The framework is PyTorch. Here the batch size is the number of samples selected in a single training iteration, and the learning rate is the hyperparameter that controls how much the network weights are adjusted along the gradient of the loss function. The purpose of license plate localization is to obtain the area where the license plate is located and provide data for license plate recognition. The accuracy of localization therefore directly affects the effectiveness of recognition, so we use Avg IOU, which reflects the accuracy of localization, to measure the effectiveness of license plate localization. The larger the value of this index, the better the localization. The training results of the model are shown in Fig. 6.

FIGURE 6. The training results of YOLOv5. (a) Precision (b) Recall (c) mAP (d) Avg IOU

From (a) and (c) in Fig. 6, it can be seen that once the number of epochs exceeds 50, the precision and mAP of the model converge to between 0.90 and 0.99, which means that the detection accuracy of the model is high. In (b), the recall converges to 1 after 20 iterations, indicating that the targets can be detected completely. Panel (d) shows that as training proceeds, the Avg IOU of the model stabilizes between 0.9 and 1, indicating good training results.

We compare the improved algorithm YOLOv5-LSE with the YOLOv5-1 (only the class parameter is set to 1), YOLOv5, SSD300, Faster R-CNN, RPNet and TE2E target detection algorithms in training tests. The comparison experiments use the same training and test sets, with 10,000 images for the former and 2,500 for the latter, and evaluate the performance of each mainstream algorithm using recall, precision, mAP and FPS. The results of the comparative tests are shown in Table 1. As the table shows, compared with the traditional target detection algorithms Faster R-CNN and SSD300, the YOLOv5-related algorithms are slightly deficient in detection speed, but their accuracy advantage is more obvious. Compared with the original YOLOv5 algorithm, the improved algorithm YOLOv5-LSE has increased complexity and slightly decreased detection speed, but its recall, precision and mAP are improved by about 3, 4 and 2.5 percentage points, respectively. The comprehensive performance is improved, and the detection speed in FPS meets the real-time requirement of engineering applications.

TABLE 1. Detection results of different algorithms.

Algorithm            Recall (%)   Precision (%)   mAP (%)   FPS
Faster R-CNN [48]    86.5         89.4            92.3      15
SSD300 [49]          88.3         89.6            93.3      38
RPnet [50]           95.2         94.8            94.2      62
YOLOv5               93.5         93.4            94.6      50
YOLOv5-1             95.4         95.2            95.8      46
YOLOv5-LSE           96.5         97.4            97.1      52

2) License plate recognition model

The end-to-end license plate recognition model based on GRU + CTC is trained on 10,000 images. We set batch_size to 128 and the number of epochs to 30, use an initial learning rate of 0.01 with a dynamic decay mechanism, optimize gradient descent with the Adam algorithm, and use PyTorch as the framework. Compared with some traditional recognition algorithms, such as BP neural network [51], tesseract [52] and HOG + SVM [53], deep learning-based algorithms have a clear advantage in recognition, so we do not compare against traditional algorithms in the validation stage of the recognition model. Instead, we compare against LSTM + CTC, currently the more popular and effective license plate recognition algorithm, in terms of recognition precision, training time and CTC loss, to verify the effectiveness of the recognition algorithm in this paper. Finally, we use the complete end-to-end license plate recognition model, including the target detection module, to perform license plate detection and recognition in different practical scenarios and verify the precision and robustness of the license plate recognition system.
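To make the recognizer evaluated here more concrete, the following is a minimal pure-Python sketch of one GRU step as written in Eqs. (9) to (12), followed by best-path (greedy) CTC decoding of per-step class scores. It is an illustration under stated assumptions rather than the authors' implementation: the all-zero toy weights, the hidden size of 2, the four-symbol alphabet and the use of index 0 as the CTC blank are hypothetical, and greedy decoding is only one common way to approximate the most probable path.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def gru_step(x_t, h_prev, p):
    """One GRU step following Eqs. (9)-(12).

    p is a dict of hypothetical toy weights; "Wxh", "Whh" and "bh" stand
    for W_{xh'}, W_{hh'} and b_{h'} in the text.
    """
    # Eq. (9): update gate z_t
    z = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["Wxz"], x_t), matvec(p["Whz"], h_prev), p["bz"])]
    # Eq. (10): reset gate r_t
    r = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["Wxr"], x_t), matvec(p["Whr"], h_prev), p["br"])]
    # Eq. (11): candidate state, reset gate applied element-wise to h_{t-1}
    reset_h = [ri * hi for ri, hi in zip(r, h_prev)]
    h_cand = [math.tanh(a + b + c) for a, b, c in
              zip(matvec(p["Whh"], reset_h), matvec(p["Wxh"], x_t), p["bh"])]
    # Eq. (12): blend the previous state and the candidate state with z_t
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, h_cand)]


def ctc_greedy_decode(step_scores, alphabet, blank=0):
    """Best-path decoding: argmax per step, merge repeats, drop blanks."""
    chars = []
    prev = None
    for scores in step_scores:
        k = max(range(len(scores)), key=lambda i: scores[i])
        if k != blank and k != prev:
            chars.append(alphabet[k])
        prev = k
    return "".join(chars)


# Toy GRU with input size 1 and hidden size 2. All-zero weights are an
# assumption chosen so the step is easy to check by hand (z = r = 0.5, h' = 0).
zeros_in = [[0.0], [0.0]]
zeros_hid = [[0.0, 0.0], [0.0, 0.0]]
params = {"Wxz": zeros_in, "Whz": zeros_hid, "bz": [0.0, 0.0],
          "Wxr": zeros_in, "Whr": zeros_hid, "br": [0.0, 0.0],
          "Wxh": zeros_in, "Whh": zeros_hid, "bh": [0.0, 0.0]}
h_next = gru_step([1.0], [1.0, -2.0], params)

# Toy per-step class scores; index 0 is the assumed CTC blank.
alphabet = ["<blank>", "A", "B", "8"]
scores = [[0.1, 0.8, 0.05, 0.05],
          [0.1, 0.7, 0.1, 0.1],
          [0.9, 0.05, 0.03, 0.02],
          [0.1, 0.6, 0.2, 0.1],
          [0.05, 0.05, 0.1, 0.8]]
plate = ctc_greedy_decode(scores, alphabet)
print(h_next, plate)
```

The blank between the second and fourth steps is what lets the decoder keep two consecutive identical characters, which is why CTC can handle plates with repeated characters without per-character position labels.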

FIGURE 7. Model training time

Fig. 7 shows the training time of GRU + CTC compared with LSTM + CTC. The training time of GRU + CTC is significantly lower than that of LSTM + CTC for the same number of epochs, which is caused by the different network structures of the models themselves. The GRU network merges the input and forget gates of the LSTM network into a single update gate, so the GRU contains only one state unit and two gates, update and reset. GRU has one fewer gate function than LSTM, its parameter size is 1/4 smaller than that of LSTM, and the computation is greatly reduced, so GRU + CTC trains in less time and converges faster. When the amount of license plate data is suitable and all hyperparameters are tuned, the performance of the two is comparable, and since the structure of GRU is simpler, GRU + CTC is the more efficient choice for training.

FIGURE 8. CTC_Loss

Fig. 8 shows the CTC loss curves for the different network structures. The CTC loss values of GRU and LSTM both converge from the 10th epoch onward. In the initial training stage, however, the CTC loss of GRU decreases faster. The CTC loss is built from output path probabilities: for a given input, the probability of an output path is the product of the probabilities of its labels at each time step from t = 1 to T. Training with the CTC loss essentially maximizes the summed probability of all paths that yield the target label sequence, and the faster its value decreases before it stabilizes, the more accurate the model is taken to be.

FIGURE 9. Model Precision

Fig. 9 shows how the recognition precision of the two recognition algorithms varies over the training cycle. The GRU + CTC model reaches higher recognition precision in a shorter time: its precision reaches 96.6% at epoch 8, while the precision of LSTM + CTC is only 95.4%, about 1.2 percentage points lower. In addition, the maximum accuracy of GRU + CTC is 98.8%, while the maximum accuracy of LSTM + CTC is 98.1%, which is 0.7 percentage points lower. We also calculated the average accuracy of the two recognition models: 97.6% for GRU + CTC and 96.9% for LSTM + CTC. Therefore, when the number of samples in the dataset is moderate, the GRU + CTC recognition model offers both high training efficiency and good recognition performance.

TABLE 2. Comparison of the effectiveness of complete LPR algorithms.

Model                         Framework   Precision (%)   Time (ms)
OPENCV + LSTM + CTC           PyTorch     96.94           115.69
OPENCV + GRU + CTC            PyTorch     97.41           102.32
YOLOv5 + LSTM + CTC           PyTorch     98.02           69.47
YOLOv5 + GRU + CTC            PyTorch     98.32           65.32
YOLOv5-LSE + LSTM + CTC       PyTorch     98.54           73.25
YOLOv5-LSE + GRU + CTC        PyTorch     98.98           70.25

Table 2 compares the recognition accuracy of YOLOv5-LSE + GRU + CTC, the license plate recognition algorithm proposed in this paper, with other algorithms. In the comparison experiments, we not only chose the original YOLOv5 algorithm, but also added the OPENCV +
RNN license plate recognition method. We use OPENCV to preprocess the image, a process that includes binarization, median filtering, erosion, dilation and centralized blurring. After that, the rectangular aspect ratio (Ar) of the license plate is used as the threshold to locate and segment the license plate, and the same algorithm as in this paper is used for the final character recognition module. From the table we can see that the license plate recognition models using a deep learning localization algorithm have advantages in both accuracy and recognition speed over the models using the traditional localization algorithm. Comparing the two localization algorithms, YOLOv5-LSE and OPENCV, with the same algorithm used in the recognition module, recognition with YOLOv5-LSE localization is significantly faster, and the recognition time in Table 2 is reduced by roughly 31% to 37%. In addition, when the localization algorithm is the same, using GRU + CTC in the recognition module gives some improvement in recognition rate, and because the structure of GRU is simpler than that of LSTM, it has a clear advantage in recognition speed. Using the improved YOLOv5-LSE for license plate localization increases the complexity of the model because of the added attention mechanism, so the recognition time increases by 4.93 ms, but this has little impact on the overall performance of the license plate recognition model, and the improved algorithm gains 0.66 percentage points in recognition precision (from 98.32% to 98.98% in Table 2). In conclusion, the comprehensive performance of YOLOv5-LSE + GRU + CTC proposed in this paper is better, and the model has the highest recognition precision.

FIGURE 10. Recognition results with the proposed model.

The recognition results of the YOLOv5-LSE license plate recognition model proposed in this paper under different practical scenarios and environmental conditions are shown in Fig. 10. The practical scenarios include streets, parking lots and highways, and the environmental conditions include rain, strong light exposure and low light at night. Because the license plate occupies a relatively small part of the image, we have partially cropped the result images and removed some of the background so that the recognition results are as visible as possible. As the figure shows, our recognition model accurately detects the license plate location and outputs the recognized characters together with the precision values. Sub-images (4, 5, 8) show license plate recognition under different lighting scenes. The plates are recognized accurately, with recognition precision of 97.2%, 96.2% and 97.9%, respectively. Sub-images (1, 3, 6, 7) show license plate recognition under complex character conditions, which include consecutive identical characters, digits and letters with similar shapes, and the Chinese abbreviations of different provinces. The experimental results show that the model recognizes these cases well. The model proposed in this paper can thus accurately recognize license plate images in various scenes, with strong stability and robustness.

V. CONCLUSIONS

To address the lack of accuracy and speed in traditional license plate recognition methods, this paper proposes an end-to-end deep learning model for license plate localization and recognition in natural scenes. We improve the deep learning-based YOLOv5 target detection algorithm by adding an improved attention mechanism, LSE, to its network structure. In addition, the number of parameters on the input side is reduced and only a single target class parameter is set in the YOLO layer. Finally, the recognition network is constructed using GRU + CTC to recognize license plates without character segmentation. The experimental results show that the improved YOLOv5-LSE + GRU + CTC license plate recognition model has clear advantages over the traditional model, with an average recognition precision of 98.98% and an FPS that meets the requirements of engineering applications. Verification on example images also shows that the improved model achieves good overall recognition in complex environments, with strong stability and robustness. In future work, we aim to apply the license plate recognition algorithm in this paper to embedded systems and bring it into practical use.

REFERENCES
[1] Ali Arshaghi, Mohsen Ashourin, Leila Ghabeli, “Detection and Classification of Potato Diseases Potato Using a New Convolution Neural Network Architecture,” TS, 2021, 38(6): 1783-1791.
[2] Oleksandr Bezsonov et al, “Breed recognition and estimation of live weight of cattle based on methods of machine learning and computer vision,” Eastern-European Journal of Enterprise Technologies, 2021, 6(9): 64-74.
[3] MohiAlden Khaled et al, “Design and evaluation of an intelligent sorting system for bell pepper using deep convolutional neural networks,” Journal of food science, 2021, 87(1): 289-301.
[4] Redmon J, Divvala S, Girshick R, et al, “You Only Look Once: Unified, Real-Time Object Detection,” Computer Vision & Pattern Recognition. IEEE, 2016.
[5] Kim Munhyeong and Jeong Jongmin and Kim Sungho, “ECAP-YOLO: Efficient Channel Attention Pyramid YOLO for Small Object Detection in Aerial Image,” Remote Sensing, 2021, 13(23): 4851-4851.
[6] Miranda Pedro R., Pestana Daniel, Lopes João D, et al, “Configurable Hardware Core for IoT Object Detection,” Future Internet, 2021, 13(11).
[7] Mathias Ajisha, Dhanalakshmi Samiappan, Kumar R, “Occlusion aware underwater object tracking using hybrid adaptive deep SORT -YOLOv3 approach,” Multimedia Tools and Applications, 2022, 81(30).

[8] Lin Feng et al, “Improved YOLO Based Detection Algorithm for Floating Debris in Waterway,” Entropy, 2021, 23(9): 1111-1111.
[9] Nayereh Zaghari et al, “The improvement in obstacle detection in autonomous vehicles using YOLO non-maximum suppression fuzzy algorithm,” The Journal of Supercomputing, 2021, 77(11): 1-26.
[10] Y. Cai, T. Luan et al, “YOLOv4-5D: An Effective and Efficient Object Detector for Autonomous Driving,” IEEE Transactions on Instrumentation and Measurement, 2021, 70, 4503613: 1-13.
[11] Madasamy Kaliappan et al, “OSDDY: embedded system-based object surveillance detection system with small drone using deep YOLO,” EURASIP Journal on Image and Video Processing, 2021, 2021(1).
[12] Tumas Paulius and Serackis Artūras and Nowosielski Adam, “Augmentation of Severe Weather Impact to Far-Infrared Sensor Images to Improve Pedestrian Detection System,” Electronics, 2021, 10(8): 934-934.
[13] Li G, Ji Z, Qu X, et al, “Cross-Domain Object Detection for Autonomous Driving: A Stepwise Domain Adaptative YOLO Approach,” IEEE Transactions on Intelligent Vehicles, 2022.
[14] Zaremba W, Sutskever I, Vinyals O, “Recurrent Neural Network Regularization,” Eprint Arxiv, 2014.
[15] Palaiahnakote Shivakumara et al, “CNN-RNN based method for license plate recognition,” CAAI Transactions on Intelligence Technology, 2018, 3(3): 169-175.
[16] Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhudinov, Rich Zemel, Yoshua Bengio, “Show, attend and tell: Neural image caption generation with visual attention,” ICML, 2015, 2048-2057.
[17] Feng Lin et al, “EEG-Based Emotion Recognition Using Spatial-temporal Graph Convolutional LSTM with Attention Mechanism,” IEEE journal of biomedical and health informatics, 2022, PP.
[18] Jie Hu, Li Shen, Gang Sun, “Squeeze-and-excitation networks,” CVPR, 2018, 7132-7141.
[19] Huiyu Wang, Yukun Zhu, Bradley Green, Hartwig Adam, Alan Yuille, Liang-Chieh Chen, “Axial-deeplab: Stand-alone axial-attention for panoptic segmentation,” arXiv preprint arXiv:2003, 2020.
[20] TangYushun, ZhangShengguo, et al, “Color-based license plate recognition research,” modern computer, 2020, 32, 63-66, 71.
[21] LiXuehan, “Extraction of Macau car license plate location based on image texture recognition,” Information Technology and Informatization, 2020, 11, 237-240.
[22] Ling Xiang, Huang bang, “Multi-license Plate Location Based on Improved Two-Dimensional Discrete Wavelet Transform,” journal of ChongQing jiaotong university (natural science), 2020, 39(2), 16-21.
[23] ZhangZhenwei, SuYanchen, “Location method for license plate based on M-HSI and pixel offset,” computer engineering and design, 2018, 39(11), 3576-3583.
[24] Felzenszwalb P F, Girshick R B, Mcallester D, “Object detection with discriminatively trained part based models,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32, 1627-1645.
[25] M. Ravichandran, S. Sumitha, “Robust Automated License Plate and Character Recognition,” Digital Image Processing, 2012, 4(4).
[26] TIAN, YUANMEI, SONG, JUAN, ZHANG, XIANGDONG, et al, “An algorithm combined with color differential models for license-plate location,” Neurocomputing, 2016, 212, 22-35.
[27] Salau A O, Yesufu T K, Ogundare B S, “Vehicle Plate Number Localization Using a Modified GrabCut Algorithm,” Journal of King Saud University Computer Information Sciences, 2019.
[28] Mahmood Zahid et al, “Towards Automatic License Plate Detection,” Sensors, 2022, 22(3): 1245-1245.
[29] Singh Sweta, “Automatic Car License Plate Detection System,” IOP Conference Series: Materials Science and Engineering, 2021, 1116(1).
[30] Naaman Omar, Abdulkadir Sengur, Salim Ganim Saeed Al-Ali, “Cascaded deep learning-based efficient approach for license plate detection and recognition,” Expert Systems With Application, 2020, 149(C), 113280-113280.
[31] Ibtissam Slimani et al, “An automated license plate detection and recognition system based on wavelet decomposition and CNN,” Array, 2020, 8.
[32] Mensch A, Varoquaux G, Thirion B, “Compressed Online Dictionary Learning for Fast fMRI Decomposition,” 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI 2016), IEEE, 2016.
[33] Redmon J, Farhadi A, “YOLOv3: an incremental improvement,” https://arxiv.org/abs/1804.02767, 2020.
[34] Zou, Yongjie et al, “License plate detection and recognition based on YOLOv3 and ILPRNET,” Signal, Image and Video Processing, 2021: 1-8.
[35] Yuchao SUN, Qiao PENG, Dengyin ZHANG, “Light-YOLOv3: License Plate Detection in Multi-Vehicle Scenario: Regular Section,” IEICE Transactions on Information and Systems, 2021, E104.D(5): 723-728.
[36] Sahar S Tabrizi, Nadire Cavus, “A Hybrid KNN-SVM Model for Iranian License Plate Recognition,” Procedia Computer Science, 2016, 102: 588-594.
[37] Pu Baoqing, “Research on Chinese License Plate Recognition Algorithm Based on Convolution Neural Network,” Journal of Physics: Conference Series, 2021, 1802(3): 032055-.
[38] Nahlah, M, Shatnawi, “Bees algorithm and support vector machine for Malaysian license plate recognition,” International Journal of Business Information Systems Ijbis, 2018.
[39] Kong Xiangjie et al, “A Federated Learning-Based License Plate Recognition Scheme for 5G-Enabled Internet of Vehicles,” IEEE Transactions on Industrial Informatics, 2021, 17(12), 8523-8530.
[40] Alghyaline Salah, “Real-time Jordanian license plate recognition using deep learning,” Journal of King Saud University - Computer and Information Sciences, 2020, 34(6pa): 2601-2609.
[41] OMAR, NAAMAN, SENGUR, ABDULKADIR, AL-ALI, SALIM GANIM SAEED, “Cascaded deep learning-based efficient approach for license plate detection and recognition,” Expert Systems with Application, 2020, 149, 113280.1-113280.10.
[42] Liu P, Li G H, Tu D, “Low-quality license plate character recognition based on CNN,” 2015 8th International Symposium on Computational Intelligence and Design (ISCID). IEEE, 2015.2, 53-58.
[43] M. C. Rademeyer, A. Barnard, M. J. Booysen, “Optoelectronic and Environmental Factors Affecting the Accuracy of Crowd-Sourced Vehicle-Mounted License Plate Recognition,” IEEE Open Journal of Intelligent Transportation Systems, 2020, 1: 15-28.
[44] Huiyu Wang, Yukun Zhu, Bradley Green, Hartwig Adam, Alan Yuille, Liang-Chieh Chen, “Axial-deeplab: Stand-alone axial-attention for panoptic segmentation,” arXiv preprint arXiv:2003.07853, 2020.
[45] Jie Hu, Li Shen, and Gang Sun, “Squeeze-and-excitation networks,” CVPR, 2018, 7132-7141.
[46] Liu S, Qi L, Qin H, et al, “Path Aggregation Network for Instance Segmentation,” CVPR, 2018.
[47] Hou Q, Zhou D, Feng J, “Coordinate Attention for Efficient Mobile Network Design,” 2021.
[48] REN S, HE K, “Faster R-CNN: Towards real-time object detection with region proposal networks,” IEEE Transactions on Pattern Analysis & Machine Intelligence, 2017, 39, 1137-1149.
[49] Liu W, Anguelov D, Erhan D, “SSD: single shot multibox detector,” Proceedings of the European Conference on Computer Vision. Heidelberg: Springer, 2016, 21-37.
[50] Xu Z B, Yang W, Meng A J, “Towards end-to-end license plate detection and recognition: A large dataset and baseline,” Proceedings of the European conference on computer vision. Heidelberg: Springer, 2018, 261-277.
[51] Xiaochun Wang, Guo Wei Yang, Yang Yang, “License Plate Fault-Tolerant Characters Recognition Algorithm Based on Color Segmentation and BP Neural Network,” Applied Mechanics and Materials, 2013, 2700(825), 1281-1286.
[52] Polishetty R, Roopaei M, Rad P, “A Next-Generation Secure Cloud-Based Deep Learning License Plate Recognition for Smart Cities,” IEEE International Conference on Machine Learning & Applications. IEEE, 2016.
[53] Rashedul ISLAM, MD Rafiqul ISLAM, Kamrul Hasan TALUKDER, “An efficient method for extraction and recognition of bangla characters from vehicle license plate,” Multimedia tools and applications, 2020, 79(27/28), 20107-20132.


HENGLIANG SHI received the Ph.D. degree in transportation engineering from Nanjing University of Science and Technology, China. He was an Associate Professor and Dean of the School of Automotive and Rail Transportation, Luoyang Polytechnic. His research interests include video tracking, intelligent detection, and big data analysis.

DONGNAN ZHAO is currently pursuing the degree with the Henan University of Science and Technology, China. His research interests include deep learning, computer vision and big data analysis.
