Vacant Parking Slot Detection in The Around View
Article
Vacant Parking Slot Detection in the Around View
Image Based on Deep Learning
Wei Li 1 , Libo Cao 1 , Lingbo Yan 1, * , Chaohui Li 1 , Xiexing Feng 1 and Peijie Zhao 2
1 State Key Laboratory of Advanced Design and Manufacturing for Vehicle Body, Hunan University,
Changsha 410006, China; [email protected] (W.L.); [email protected] (L.C.);
[email protected] (C.L.); [email protected] (X.F.)
2 GAC Parts Corporation Limited, Guangzhou 510630, China; [email protected]
* Correspondence: [email protected]
Received: 28 February 2020; Accepted: 8 April 2020; Published: 10 April 2020
Abstract: Due to the complex visual environment, such as lighting variations, shadows, and
limitations of vision, the accuracy of vacant parking slot detection for the park assist system (PAS)
with a standalone around view monitor (AVM) needs to be improved. To address this problem, we
propose a vacant parking slot detection method based on deep learning, namely VPS-Net. VPS-Net
converts the vacant parking slot detection into a two-step problem, including parking slot detection
and occupancy classification. In the parking slot detection stage, we propose a parking slot detection
method based on YOLOv3, which combines the classification of the parking slot with the localization
of marking points so that various parking slots can be directly inferred using geometric cues. In the
occupancy classification stage, we design a customized network whose convolution kernel size
and number of layers are adjusted according to the characteristics of the parking slot. Experiments
show that VPS-Net can detect various vacant parking slots with a precision rate of 99.63% and a
recall rate of 99.31% on the ps2.0 dataset, and that it generalizes satisfactorily to the PSV dataset.
By introducing a multi-object detection network and a classification network, VPS-Net can detect
various vacant parking slots robustly.
Keywords: park assist system; vacant parking slot detection; deep learning; around view image
1. Introduction
With the rapid development of society, passenger cars are becoming more and more popular
in large cities, which makes it difficult to find a vacant parking slot. A study shows that over 50%
of drivers are frustrated by looking for a free parking space in traffic-dense areas [1]. Moreover, 23%
of all car collisions happen in parking lots [2]. In this context, the park assist system (PAS) is a
promising technology that most drivers want to see. It is composed of three parts: object position
designation, path planning, and parking guidance or path tracking. Object position designation is
the most important component of the PAS, and its task is to detect a vacant parking
slot accurately. The PAS can be divided into four categories based on the parking space detection
method: free space-based [3–7], parking slot marking-based [8–11], interface-based [12–14], and
infrastructure-based [15–17]. Compared with other methods, the parking slot marking-based approach
can be applied in wider situations, since it does not depend on the existence of adjacent vehicles or
extra communication equipment. Moreover, as people pay more attention to vehicle safety, myriads of
vehicles are equipped with the around view monitor (AVM) [18], which provides 360° surveillance
around the vehicle. Therefore, the vacant parking slot detection in the around view image can make
full use of the existing equipment on the vehicle.
In order to make vacant parking slot detection in the around view image meaningful and practical,
it should satisfy the following conditions: recognizing various types of parking slots and being robust
under the complex visual environment. To this end, a series of marking point-based parking slot
detection methods were proposed by Suhr [19–22]. These methods utilize designed features to
detect marking points, which are easily affected by lighting variations. To detect marking points
robustly, a method utilizing the deep convolutional neural network (DCNN) was proposed in [11].
Due to the powerful feature extraction ability of DCNN, this method significantly improves the
accuracy of parking slot detection. However, it cannot classify the parking slot occupancy status and
involves a few cumbersome steps to infer the complete parking slot. To complement this method, an
end-to-end DCNN was proposed in [23] to perform automatic parking slot detection and classification
simultaneously. However, this method is based on the Faster R-CNN baseline and it cannot meet
the real-time requirements. Moreover, a few semantic segmentation-based methods were proposed
in recent years, such as VH-HFCN [24] and DFNet [25]. Although these methods perform
outstandingly in ground marking segmentation, they need post-processing to generate parking slots,
which is time-consuming and inaccurate. A detailed literature review will be presented in Section 2.
In view of the limitations of previous works, we attempt to devise a vacant parking slot detection
method with a standalone AVM based on DCNN, namely VPS-Net, which can not only detect various
types of vacant parking slots effectively but also meet real-time requirements. VPS-Net converts the
vacant parking slot detection into a two-step problem, including parking slot detection and occupancy
classification. In the parking slot detection process, we first detect and classify all marking points
and parking slot heads using a pre-trained detector based on YOLOv3 [26]. Then, the geometric
information is used to match paired marking points and infer the complete parking slot. In the
occupancy classification process, a customized DCNN is designed to make the parking slot occupancy
classification reliable. Finally, VPS-Net is evaluated in the ps2.0 dataset [11] and PSV dataset [24].
The results show that VPS-Net outperforms previous methods with a precision rate of 99.63% and a
recall rate of 99.31%. Moreover, it runs in real time, taking 20.5 ms per frame on an Nvidia
Titan Xp.
The contributions of this paper are as follows:
• A new vacant parking slot detection method in the around view image, named VPS-Net, is
proposed. It combines the advantages of a multi-object detection network with those of
a classification network. Compared with the semantic segmentation-based methods, which
need a series of complex post-processing steps to get the position of the parking slot, VPS-Net
directly obtains the coordinates of marking points, so parking slots can be located more
accurately. To facilitate future research, the related code and the annotations
for vacant parking slots of the ps2.0 and PSV datasets have been made publicly available at
https://github.com/weili1457355863/VPS-Net.
• A parking slot detection method based on YOLOv3 is proposed, which combines the classification
of the parking slot with the localization of marking points. Compared with previous marking
point-based methods, which require cumbersome steps to match the paired marking points
of the parking slot, VPS-Net simplifies the process of parking slot detection, so various kinds of
parking slots can be detected quickly and robustly.
• A customized DCNN model is designed to determine whether a detected parking slot is vacant.
To evaluate the performance of the model, we update both ps2.0 and PSV datasets by marking the
type of parking slot in each image. Compared with some state-of-the-art (SOTA) DCNN models,
our customized DCNN model not only achieves comparable accuracy but also consumes less
time to process an image and has fewer parameters.
The remainder of this paper is organized as follows. Section 2 introduces the related research.
Section 3 describes the details of the VPS-Net method. Section 4 presents the experimental results of
VPS-Net. Finally, Sections 5 and 6 discuss the method and conclude the paper.
Sensors 2020, 20, 2138 3 of 22
2. Related Works
In this paper, our method mainly includes the detection of parking slots in the around view image
and the classification of parking slot occupancy. Related works about these aspects will be described
in detail here.
3. Proposed Method
VPS-Net detects various vacant parking slots based on deep learning. As shown in Figure 1, there
are three typical kinds of parking slots that VPS-Net can cope with. A parking slot consists of four
vertices, two of which are paired marking points of the entrance line, and the other two vertices are
usually invisible in the around view image due to limitations of vision. Figure 2 shows the overview
of VPS-Net for detecting vacant parking slots. VPS-Net divides vacant parking slot detection into
two steps: parking slot detection and occupancy classification, which combines the advantages of
a multi-object detection network with a classification network. In the parking slot detection stage,
a YOLOv3-based detector is used to detect marking points and parking slot heads simultaneously.
Subsequently, geometric cues are used to match paired marking points and determine the orientation
of the parking slot. Finally, to obtain the complete parking slot, the two invisible vertices are inferred by
the type, orientation, and paired marking points of the parking slot. After the parking slot is detected,
its position in the image is passed to the occupancy classification part. In the occupancy
classification stage, the detected parking slot is first regularized to a uniform size of 120 × 46 pixels,
and then a customized DCNN is used to determine whether it is vacant. Once a vacant parking
slot is detected, its position is sent to the decision module of the PAS for further processing.
Figure 1. Three typical kinds of parking slots. (a) perpendicular parking slots; (b) parallel parking
slots; (c) slanted parking slots. A parking slot consists of four vertices, of which the paired marking
points of the entrance line are marked with red dots, and the other two invisible vertices are marked
with yellow dots. The entrance lines and the viewing range of an AVM system are also marked out.
Figure 2. Overview of the VPS-Net, which contains two modules: parking slot detection and occupancy
classification. It takes the around view image as input and outputs the position of the vacant parking
slot to the decision module of the PAS.
learning could be divided into one-stage methods [26,32,43] and two-stage methods [44–46]. Compared
with two-stage methods, one-stage methods process an image much faster. Considering the
real-time requirements of vacant parking slot detection and the relative simplicity of our detection task, our
detector is based on YOLOv3 [26], a representative one-stage method. To train the YOLOv3-based
detector, training labels including the bounding boxes of parking slot heads and marking points
are prepared. As shown in Figure 4, the bounding box of the parking slot head consists of four parameters,
$p(x, y)$, $w_1$, and $h_1$, which can be calculated from the coordinates of the paired marking points of the entrance
line by (1)–(3). For each “T-shaped” or “L-shaped” marking point $p_i$, its bounding box is a rectangle of
fixed size $w_2 \times h_2$ centered at $p_i$.
$$p(x, y) = \frac{p_1(x, y) + p_2(x, y)}{2} \tag{1}$$
$$w_1 = \frac{|p_1(x) - p_2(x)|}{2} + \Delta w \tag{2}$$
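As an illustration of Equations (1) and (2), the following Python sketch computes a parking slot head label from one pair of marking points. The function name `head_bounding_box`, the default margins, and the height formula (assumed here to mirror Equation (2) along the y-axis) are assumptions made for illustration; they are not taken from the released code.

```python
def head_bounding_box(p1, p2, delta_w=0.0, delta_h=0.0):
    """Return (cx, cy, w1, h1) for the head spanned by marking points p1 and p2."""
    # Equation (1): the center is the midpoint of the paired marking points.
    cx = (p1[0] + p2[0]) / 2.0
    cy = (p1[1] + p2[1]) / 2.0
    # Equation (2): half the horizontal distance plus the margin delta_w.
    w1 = abs(p1[0] - p2[0]) / 2.0 + delta_w
    # Assumption: the height follows the same pattern along the y-axis.
    h1 = abs(p1[1] - p2[1]) / 2.0 + delta_h
    return cx, cy, w1, h1
```

For example, `head_bounding_box((100, 200), (180, 210), delta_w=10, delta_h=20)` places the center at the midpoint of the entrance line and adds the margins to each extent; the margin values here are arbitrary.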
Figure 3. Marking points and parking slot heads. (a) shows the geometric relationship between the
paired marking points and the parking slot head. Paired marking points are marked with green dots,
and the parking slot head is marked with the red rectangle; (b) shows a variety of deformations of
“T-shaped” or “L-shaped” marking points; (c) shows three kinds of the parking slot head belonging to
classes “right-angled head”, “obtuse-angled head”, and “acute-angled head” respectively.
Figure 4. The bounding boxes of the parking slot head and marking points. Each bounding box consists
of three parts: coordinates of the center point, width, and height.
In the implementation, we use Darknet-53 pre-trained on ImageNet [47] as the feature
extractor of the YOLOv3-based detector and then fine-tune the detector on the ps2.0 dataset [11]. During
fine-tuning, the batch size is 32, the image is scaled to 416 × 416, the anchors are modified for the ps2.0
dataset to [(10, 13), (28, 42), (33, 23), (30, 61), (62, 45), (61, 199), (126, 87), (156, 198)], and the learning
rate starts from 0.0001 and is decayed by a factor of 10 every 45,000 steps. The Adam optimizer is used with
the optimization settings proposed in [48], $[\beta_1, \beta_2, \varepsilon] = [0.9, 0.999, 10^{-8}]$. Data augmentation is
performed during training. We flip the image and the corresponding bounding boxes with
a probability of 50%. We also add color augmentations with a 50% chance, including random
saturation in [1.0, 1.5] and random exposure in [1.0, 1.5] in the HSV color space.
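The schedule and the flip augmentation described above can be sketched in a few lines of Python. This is a minimal illustration under two assumptions: the decay is read as dividing the learning rate by 10 at each 45,000-step boundary, and the flip is taken to be horizontal. The helper names (`learning_rate`, `hflip_box`, `maybe_flip`) are hypothetical, and the color augmentation is omitted.

```python
import random

BASE_LR = 1e-4        # initial learning rate stated in the text
DECAY_STEPS = 45_000  # decay interval stated in the text

def learning_rate(step):
    """Step decay: divide the base rate by 10 every DECAY_STEPS steps (assumed reading)."""
    return BASE_LR * (0.1 ** (step // DECAY_STEPS))

def hflip_box(box, image_width):
    """Mirror a (cx, cy, w, h) box horizontally inside an image of the given width."""
    cx, cy, w, h = box
    return (image_width - cx, cy, w, h)

def maybe_flip(boxes, image_width, rng=random):
    """Flip all boxes with 50% probability; the image would be flipped alongside."""
    if rng.random() < 0.5:
        return [hflip_box(b, image_width) for b in boxes], True
    return list(boxes), False
```

Only the box center changes under a horizontal flip; the width and height are unaffected, which is why `hflip_box` leaves them untouched.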
where $b_j$ is one of the four vertices of $B$ and $\sim b_j$ represents the vertex diagonal to $b_j$. $\Delta w$ and $\Delta h$ are
hyperparameters that control the width and height of $B$.
for each detected parking slot head B do
    for p in P do
        Count the number N of p in B
    end for
    if N = 2 then
        p1 and p2 are paired marking points
    end if
    if N = 1 and the confidence of B > 95% then
        Step 1: Calculate the other marking point p2′ using Equation (4)
        Step 2: p1 and p2′ are paired marking points
    end if
    if N = 0 and the confidence of B > 98% then
        Step 1: Calculate the NAIV of the four vertex regions of B using Equation (5)
        Step 2: The pair of diagonal vertices p1′ and p2′ with the largest NAIV are paired marking points
    end if
    if N > 2 then
        The two points p1 and p2 that are the closest to the diagonal vertices of B are paired marking points
    end if
end for
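The case analysis in the listing above can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation: the box is assumed to be given as `(cx, cy, w, h)` with `w` and `h` as full extents, the function names are hypothetical, and the N > 2 rule is interpreted as choosing the diagonal whose two corners have the nearest distinct marking points. The cases that rely on Equations (4) and (5) only return a label saying which inference step is needed.

```python
import math

def inside(p, box):
    """True if point p = (x, y) lies inside box = (cx, cy, w, h)."""
    cx, cy, w, h = box
    return abs(p[0] - cx) <= w / 2 and abs(p[1] - cy) <= h / 2

def corners(box):
    """Return the four vertices of the box, clockwise from the top-left."""
    cx, cy, w, h = box
    l, r, t, b = cx - w / 2, cx + w / 2, cy - h / 2, cy + h / 2
    return [(l, t), (r, t), (r, b), (l, b)]

def pair_in_head(box, confidence, points):
    """Dispatch on N, the number of detected marking points inside the head box."""
    found = [p for p in points if inside(p, box)]
    n = len(found)
    if n == 2:
        return "paired", tuple(found)
    if n > 2:
        c = corners(box)
        best = None
        for a, b in ((c[0], c[2]), (c[1], c[3])):  # the two diagonals of B
            pa = min(found, key=lambda p: math.dist(p, a))
            pb = min(found, key=lambda p: math.dist(p, b))
            cost = math.dist(pa, a) + math.dist(pb, b)
            if pa != pb and (best is None or cost < best[0]):
                best = (cost, (pa, pb))
        return ("paired", best[1]) if best else ("unmatched", ())
    if n == 1 and confidence > 0.95:
        return "infer_one", tuple(found)  # second point would come from Equation (4)
    if n == 0 and confidence > 0.98:
        return "infer_both", ()           # both points would come from the NAIV test
    return "unmatched", ()
```

Returning a label for the two inference cases keeps the sketch independent of the exact form of Equation (4).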
Figure 5. The relationship between two marking points $p_1$, $p_2$ and the bounding box of the parking
slot head $B$. (a) shows $p_1 \subseteq B$ and $p_2 \subseteq B$; (b) shows $p_1 \subseteq B$ and $p_2 \not\subset B$; (c) shows $p_1 \not\subset B$ and
$p_2 \subseteq B$; (d) shows $p_1 \not\subset B$ and $p_2 \not\subset B$.
If neither $p_1$ nor $p_2$ is contained in $B$ and the object confidence of $B$ is greater than 98%,
then the normalized average intensity values (NAIV) [19] of the four vertex regions are calculated by
Equation (5), and the pair of diagonal vertices with the largest NAIV is selected as the paired marking points.
This is because marking points are much brighter than the ground plane, so the pixels near marking
points tend to have greater intensity [12].
$$\mathrm{NAIV}_i = \frac{1}{\mathrm{MAX}(I)} \left( \frac{1}{N} \sum_{(x, y) \in R_i} I(x, y) \right) \tag{5}$$
where $\mathrm{NAIV}_i$ is the NAIV of the region $R_i$, a fixed-size window of 10 × 10 pixels centered on vertex $i$. $\mathrm{MAX}(I)$ is
the maximum intensity value of the image $I$. $N$ is the number of pixels in the region $R_i$,
and $(x, y)$ are their locations along the x-axis and y-axis.
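A direct reading of Equation (5) can be sketched in plain Python as follows. The grayscale image is assumed to be a list of rows, the function name `naiv` is hypothetical, and clipping the window at the image border is an added assumption that the text does not specify.

```python
def naiv(image, center, size=10):
    """Normalized average intensity of a size x size window centered on `center`.

    `image` is a grayscale image given as a list of rows; `center` is (x, y).
    """
    max_i = max(max(row) for row in image)  # MAX(I) over the whole image
    if max_i == 0:
        return 0.0
    height, width = len(image), len(image[0])
    x0, y0 = center[0] - size // 2, center[1] - size // 2
    # Collect the pixels of R_i, clipped to the image bounds (assumption).
    values = [
        image[y][x]
        for y in range(max(y0, 0), min(y0 + size, height))
        for x in range(max(x0, 0), min(x0 + size, width))
    ]
    if not values:
        return 0.0
    return sum(values) / (len(values) * max_i)
```

Evaluating `naiv` at the four vertices of B and summing the values for each diagonal pair would then give the comparison described in the text.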
If there are more than two marking points in the bounding box $B$, the two marking points that are the
closest to the diagonal vertices of $B$ are taken as the paired marking points. After that, the type of parking slot can
be determined from the distance between the paired marking points and the type of parking slot head.
When the head of the parking slot is classified as a “right-angled head”, the slot is considered a
perpendicular parking slot if the distance is less than $t$; otherwise, it is a parallel parking slot. If the head of the
parking slot is classified as an “obtuse-angled head” or an “acute-angled head”, it is considered a
slanted parking slot.
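The type rule above amounts to a small decision function. The following Python sketch assumes the head class arrives as one of the three strings used in the text and that the threshold `t` is supplied by the caller; the function name `slot_type` is hypothetical.

```python
import math

def slot_type(head_class, p1, p2, t):
    """Classify a slot from its head class and the distance between paired points."""
    if head_class == "right-angled head":
        # A short entrance line means perpendicular, a long one means parallel.
        return "perpendicular" if math.dist(p1, p2) < t else "parallel"
    if head_class in ("obtuse-angled head", "acute-angled head"):
        return "slanted"
    raise ValueError(f"unknown head class: {head_class!r}")
```

The distance is only consulted for right-angled heads, since both slanted head classes map to the same slot type.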
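$$p_3 = \begin{bmatrix} \cos\alpha_i & \sin\alpha_i \\ -\sin\alpha_i & \cos\alpha_i \end{bmatrix} \frac{\overrightarrow{p_1 p_2}}{\left\| \overrightarrow{p_1 p_2} \right\|} \cdot d_i + p_2 \tag{6}$$
$$p_4 = \begin{bmatrix} \cos\alpha_i & \sin\alpha_i \\ -\sin\alpha_i & \cos\alpha_i \end{bmatrix} \frac{\overrightarrow{p_1 p_2}}{\left\| \overrightarrow{p_1 p_2} \right\|} \cdot d_i + p_1 \tag{7}$$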
where $\alpha_i$ and $d_i$ are the parking angle and the depth of the parking slot, respectively.
The parking angle $\alpha_i$ can be determined from the type of the parking slot head and the orientation
of the parking slot. The depth $d_i$ is chosen according to the type of the parking
slot. For the perpendicular or the parallel parking slot, $\alpha_i = \pm\alpha_1$, with $d_i = d_1$ or $d_i = d_2$, respectively.
For the slanted parking slot with an acute angle, $\alpha_i = \pm\alpha_2$ and $d_i = d_3$. For the slanted parking slot
with an obtuse angle, $\alpha_i = \pm\alpha_3$ and $d_i = d_3$. When the four vertices of the parking slot are arranged
clockwise, $\alpha_i$ is positive. Otherwise, it is negative.
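Equations (6) and (7) rotate the unit vector along the entrance line by the parking angle and scale it by the slot depth. The following Python sketch shows that computation under stated assumptions: the angle is given in radians, the depth in pixels, and the function name `infer_hidden_vertices` is hypothetical. The sign of `alpha` is left to the caller, following the clockwise convention described above.

```python
import math

def infer_hidden_vertices(p1, p2, alpha, depth):
    """Return (p3, p4) from the paired marking points, parking angle, and depth."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    norm = math.hypot(dx, dy)
    if norm == 0:
        raise ValueError("p1 and p2 must be distinct")
    ux, uy = dx / norm, dy / norm  # unit vector from p1 to p2
    # Apply the matrix [[cos a, sin a], [-sin a, cos a]] to (ux, uy).
    rx = math.cos(alpha) * ux + math.sin(alpha) * uy
    ry = -math.sin(alpha) * ux + math.cos(alpha) * uy
    p3 = (rx * depth + p2[0], ry * depth + p2[1])  # Equation (6)
    p4 = (rx * depth + p1[0], ry * depth + p1[1])  # Equation (7)
    return p3, p4
```

Because both equations share the same rotated direction, `p3` and `p4` differ only in their anchor point, so the inferred slot is always a parallelogram.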
Figure 6. Complete parking slot inference. (a–d) are the perpendicular parking slot, the parallel parking
slot, the slanted parking slot with an acute angle, and the slanted parking slot with an obtuse angle, respectively.
Their depths are $d_1$ (perpendicular), $d_2$ (parallel), and $d_3$ (slanted), and their parking angles are $\alpha_1$
(perpendicular and parallel), $\alpha_2$ (acute), and $\alpha_3$ (obtuse). $p_1$, $p_2$
are the two visible paired marking points, and $p_3$, $p_4$ are the two invisible vertices.
Since the orientation of the parking slot determines whether the four vertices of the parking slot
are arranged clockwise or counterclockwise, it should be identified through geometric cues. For the
parking slot around the vehicle, the entrance line does not intersect the rectangular box formed by
the car model. Therefore, as shown in Figure 7, the orientation of the parking slot can be determined
according to the IOU between the rectangular box formed by the entrance line and the rectangular box
formed by the car model. The IOU can be calculated by Equation (8). For a vehicle that is entering the
parking slot, the entrance line intersects the rectangular box formed by the car model, as shown in
Figure 8. If it is a perpendicular parking slot or a slanted parking slot, the orientation of the parking slot is
considered to be the downward direction. If it is a parallel parking slot and the slope of the entrance
line is positive, the orientation of the parking slot is the right direction. Otherwise, the orientation of
the parking slot is the left direction.
Area1 ∩
IOU =