Article
Research on Lane Line Detection Algorithm Based on
Instance Segmentation
Wangfeng Cheng 1 , Xuanyao Wang 1,2,3, * and Bangguo Mao 1
1 School of Mechanical Engineering, Anhui University of Science and Technology, Huainan 232001, China
2 Institute of Environment-Friendly Materials and Occupational Health, Anhui University of Science and
Technology, Wuhu 241000, China
3 Shaanxi Automobile Holding Group Huainan Special Purpose Vehicle Co., Ltd., Huainan 232001, China
* Correspondence: [email protected]
Abstract: Aiming at the low detection accuracy and poor real-time speed of current lane line detection algorithms in complex traffic scenes, such as lane lines occluded by shadows, blurred roads, and sparse road markings, this paper proposes a lane line detection algorithm based on instance segmentation. Firstly, the improved lightweight network RepVgg-A0 is used to encode road images, which expands the receptive field of the network; secondly, a multi-size asymmetric shuffled convolution model is proposed for the characteristics of sparse and slender lane lines, which enhances the ability to extract lane line features; an adaptive upsampling model is further proposed as a decoder, which upsamples the feature map to the original resolution for pixel-level classification and detection, and a lane line prediction branch is added to output the confidence of the lane lines; finally, the instance segmentation-based lane line detection algorithm is successfully deployed on the embedded platform Jetson Nano, and half-precision acceleration is performed using NVIDIA's TensorRT framework. The experimental results show that the Acc value of the lane line detection algorithm based on instance segmentation is 96.7%, with a frame rate of 77.5 fps, and the detection speed when deployed on the embedded platform Jetson Nano reaches 27 fps.
Keywords: autonomous driving; complex road environment; lane line detection; instance segmentation
Ozgunalp et al. [5] proposed a feature-map-based lane detection algorithm that used an
inverse perspective transformation method. Chi et al. [6] proposed the use of the road
vanishing point estimation algorithm to detect lane lines. However, such model-based detection algorithms were computationally intensive and could only deal with problems such as road occlusion in specific environments, which limited their results. Traditional lane line
detection algorithms based on features and models [7–11] were easily affected by external
environmental factors. When the lane line was broken, blocked, or unpainted, its robustness
was extremely low, which would lead to incorrect lane line detection or even make it
impossible. To solve the problem of low accuracy of lane line detection in complex road
environments, convolutional neural networks based on deep learning were widely used in
lane line detection because of their powerful feature detection capabilities. Aly et al. [12]
used Gaussian filtering and detected street lanes using line detection and a new RANSAC
spline-fitting technique. Kim et al. [13] combined convolutional neural networks with the
RANSAC algorithm and proposed a continuous end-to-end migration learning method that
can detect both left and right lane lines of the current lane. Neven et al. [14] transformed
the lane line detection problem into an instance segmentation problem that distinguished
the lane lines and their background using a binary classification principle. Ren et al. [15]
proposed the Faster R-CNN network, which used a multi-task loss function for training that allowed all layers to be updated and reduced the number of parameters in the fully connected layers, improving detection performance. However, the two-stage-based network detection
was slow. He et al. [16] proposed the use of SPP-Net to improve the detection speed. By
introducing the pooling layer to reduce the parameter amount of the model, the speed of the
model was improved to a certain extent, but the detection accuracy decreased to a certain
extent. Hairs et al. [17] proposed an ak-cnn model for lane line detection, which had an
auxiliary loss, which reduced parameters and running time while improving detection and
estimation indicators and had an excellent real-time performance. However, it was prone to
a lack of flexibility in complex traffic conditions. Liu et al. [18] proposed a transformer-based
network structure to better learn lane structure information and context information and
directly output the parameter information of lane lines to avoid additional post-processing,
and improved the overall detection speed. However, due to the large number of candidate
lane lines generated in the network, post-processing methods (NMS) were still required
to filter redundant lane lines, and the real-time performance was poor. Chao et al. [19]
proposed a VGG-ss model to build an encoder–decoder structure to improve the accuracy
and real-time performance of lane line detection. However, when the lane line was blocked
or destroyed, its detection accuracy and real-time speed dropped slightly, and the experiments were only conducted on still images rather than on videos or other moving imagery.
In summarizing the aforementioned literature, it can be seen that the current lane line
detection algorithms based on deep learning use data to extract features adaptively, which
makes the lane line detection accuracy and real-time detection speed significantly improved
compared with traditional lane line detection algorithms. However, under factors such
as road occlusion, road blur, and the characteristics of slender and sparse lane lines, it
is difficult for ordinary convolutional neural networks to extract accurate road features
from road images, and they cannot achieve satisfactory accuracy and real-time performance.
Aiming at the above problems, this paper proposes a lane line detection algorithm based
on instance segmentation. The contributions of this paper include the following three points:
1. The RepVgg-A0 network is improved to expand the receptive field of the network without increasing the amount of calculation, and a multi-size asymmetric shuffled convolution model is proposed to enhance the ability to extract sparse and slender lane line features.
2. An adaptive upsampling model is proposed, which allows the network to select the
weight of the two upsampling methods at each position; at the same time, a lane line
prediction branch is added to facilitate the output of lane line confidence.
3. The lane line detection algorithm is deployed to the embedded platform Jetson Nano, and the TensorRT framework is used for half-precision acceleration so that its detection speed meets the needs of real-time detection.
This paper is arranged as follows: A lane line detection model based on instance
segmentation is designed in Section 2. In Section 3, the lane line detection experiment is
carried out by combining the extended TuSimple dataset and the video collected by the real
car and deployed to the mobile terminal. Finally, in Section 4, the content of this paper is
summarized and directions for future work are provided.
Table 1. Overall structure of the encoder.
Figure 1. Improved encoder network structure diagram.
The parameters and calculations of a standard k × k convolution are as follows:
$$\mathrm{Params} = k^{2} C_{i} C_{o} \tag{1}$$
$$\mathrm{FLOPs} = k^{2} C_{i} C_{o} H_{o} W_{o} \tag{2}$$
Among them, H_o and W_o are the height and width of the output feature map, respectively, and C_i and C_o are the channel numbers of the input and output feature maps, respectively. The parameters and calculations of the asymmetric convolution equivalent to a k × k convolution are as follows:
$$\mathrm{Params}_{a} = 2 k C_{i} C_{o} \tag{3}$$
$$\mathrm{FLOPs}_{a} = k C_{i} C_{o} H_{o} (2 W_{o} + k - 1) \tag{4}$$
It can be seen from Formulas (1)–(4) that the larger the size of the convolution kernel, the more the number of parameters and calculations can be reduced by converting it into an asymmetric convolution. In addition, some studies have shown that asymmetric convolution is more effective when applied to the middle layers of the network [23]. Therefore, the 5 × 5 and 7 × 7 convolutions in the multi-size shuffled convolution module are replaced by asymmetric convolutions, and a multi-size asymmetric shuffled convolution module, as shown in Figure 2b, is designed. For a fixed 46 × 80 × 128 input feature map, the number of module parameters and calculations are reduced by 60.24% and 61.47%, respectively.
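For illustration, the following short calculation evaluates Formulas (1)–(4) for a single k × k kernel replaced by its k × 1 and 1 × k pair on a 46 × 80 feature map; the 128-channel setting is taken from the module's input size, while the per-module figures of 60.24% and 61.47% quoted above depend on the full module structure and are not reproduced by this per-kernel sketch.

```python
# Per-kernel parameter/FLOP comparison following Formulas (1)-(4).
def square_cost(k, ci, co, ho, wo):
    params = k * k * ci * co                      # Formula (1)
    flops = k * k * ci * co * ho * wo             # Formula (2)
    return params, flops

def asymmetric_cost(k, ci, co, ho, wo):
    params = 2 * k * ci * co                      # Formula (3)
    flops = k * ci * co * ho * (2 * wo + k - 1)   # Formula (4)
    return params, flops

ci = co = 128          # illustrative channel counts (from the 46 x 80 x 128 feature map)
ho, wo = 46, 80
for k in (5, 7):
    p_sq, f_sq = square_cost(k, ci, co, ho, wo)
    p_as, f_as = asymmetric_cost(k, ci, co, ho, wo)
    print(f"k={k}: params reduced {100 * (1 - p_as / p_sq):.1f}%, "
          f"FLOPs reduced {100 * (1 - f_as / f_sq):.1f}%")
```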
Figure 2. Multi-scale shuffled convolution module and multi-scale asymmetric shuffled convolution module. (a) Multi-scale shuffled convolution module. (b) Multi-scale asymmetric shuffled convolution module.
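To make the structure in Figure 2b and the stacking described below concrete, the following is a minimal TensorFlow/Keras sketch of one multi-size asymmetric shuffled convolution module. The branch widths, the shuffle group count, the use of plain ReLU in place of FReLU [22], the dilation of 1 for the first stacked module, and the omission of the explicit channel-split stage are assumptions made for brevity where the figure leaves details open; it is a sketch rather than the authors' implementation.

```python
import tensorflow as tf
from tensorflow.keras import layers


def asym_conv_block(x, filters, k, dilation=1):
    """k x k convolution factorized into k x 1 followed by 1 x k (cf. Formulas (3)-(4))."""
    x = layers.Conv2D(filters, (k, 1), padding="same", dilation_rate=dilation, use_bias=False)(x)
    x = layers.ReLU()(x)                                  # FReLU in the paper; ReLU used here
    x = layers.Conv2D(filters, (1, k), padding="same", dilation_rate=dilation, use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    return layers.ReLU()(x)


def channel_shuffle(x, groups):
    """ShuffleNet-style channel shuffle applied after the branch concatenation."""
    h, w, c = x.shape[1], x.shape[2], x.shape[3]
    x = tf.reshape(x, [-1, h, w, groups, c // groups])
    x = tf.transpose(x, [0, 1, 2, 4, 3])
    return tf.reshape(x, [-1, h, w, c])


def msasc_module(x, dilation=1):
    """One multi-size asymmetric shuffled convolution module (sketch of Figure 2b)."""
    c = x.shape[-1]
    b3 = layers.Conv2D(c // 2, 3, padding="same", dilation_rate=dilation, activation="relu")(x)
    b5 = asym_conv_block(x, c // 4, 5, dilation)          # 5x5 replaced by 5x1 + 1x5
    b7 = asym_conv_block(x, c // 4, 7, dilation)          # 7x7 replaced by 7x1 + 1x7
    y = layers.Concatenate()([b3, b5, b7])                # back to c channels
    return channel_shuffle(y, groups=4)                   # group count chosen to divide c


# Feature enhancement stack: six modules; the last five use dilation rates 2, 4, 6, 8, 10.
inputs = tf.keras.Input(shape=(46, 80, 128))
x = inputs
for d in (1, 2, 4, 6, 8, 10):                             # dilation 1 for the first module is assumed
    x = msasc_module(x, dilation=d)
feature_enhancement = tf.keras.Model(inputs, x)
```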
Six multi-size asymmetric shuffled convolution modules are stacked to form the feature enhancement module of the lane line detection model in this paper. Among them, the last five modules use dilated convolution to further expand the receptive field, and the dilation rates are set to 2, 4, 6, 8, and 10, respectively. The feature enhancement module further extracts the lane line information in the feature map output by the encoder and inputs the result into the lane line prediction branch and the decoder structure; its corresponding network structure is shown in Figure 3 below.
Figure 3. Feature enhancement module.
2.3. Design of Decoder Network Structure
The decoder's job is to classify each pixel in the feature map by upsampling the low-resolution feature map, which contains rich feature information, to the size of the input image. The two most popular upsampling algorithms are bilinear interpolation and transposed convolution. However, these algorithms ignore the impact of the gradient in pixel values between adjacent points, which degrades the detailed features of the upsampled image, and coarse-grained features are also easily ignored. To solve the above problems, Zheng et al. [24] proposed a bilateral upsampling module, which directly adds the bilinear interpolation and transposed convolution upsampling results. It achieved certain results but did not consider the applicability of the two upsampling methods to specific image areas. To effectively extract image features, this paper proposes an adaptive upsampling module that enables the network to choose the weight of the two upsampling methods at each location.
The adaptive upsampling module structure is shown in Figure 4. After inputting an H × W × C feature map, bilinear interpolation and transposed convolution are first used to perform upsampling, initially obtaining two 2H × 2W × C/2 upsampled feature maps, E and F; then, E and F are concatenated along the channel dimension to obtain a 2H × 2W × C feature map G; a 3 × 3 convolution is performed on G to extract the spatial attention description S (2H × 2W × 2), and the Softmax function is used to extract two attention weights of size 2H × 2W × 1 from S; finally, the attention weights are multiplied with E and F, respectively, and summed to obtain the final upsampling result (2H × 2W × C/2).
Figure 4. Adaptive upsampling module.
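A minimal TensorFlow/Keras sketch of this adaptive upsampling step is given below, assuming channels-last tensors, a 1 × 1 convolution to halve the channels on the bilinear branch, and omission of the Non-bt-1D refinement blocks of Figure 5; it follows the E/F/G/S notation of the text rather than reproducing the authors' exact layers.

```python
import tensorflow as tf
from tensorflow.keras import layers


def adaptive_upsampling(x):
    """Adaptive upsampling module (sketch of Figure 4): H x W x C -> 2H x 2W x C/2."""
    c = x.shape[-1]
    # Branch E: bilinear interpolation followed by a 1x1 convolution to halve the channels.
    e = layers.UpSampling2D(size=2, interpolation="bilinear")(x)
    e = layers.Conv2D(c // 2, 1, padding="same", activation="relu")(e)
    # Branch F: stride-2 transposed convolution that halves the channels directly.
    f = layers.Conv2DTranspose(c // 2, 3, strides=2, padding="same", activation="relu")(x)
    # Spatial attention: concatenate -> 3x3 conv -> two maps -> softmax over the two branches.
    g = layers.Concatenate()([e, f])                  # 2H x 2W x C
    s = layers.Conv2D(2, 3, padding="same")(g)        # 2H x 2W x 2 attention description
    s = layers.Softmax(axis=-1)(s)
    w_e, w_f = tf.split(s, 2, axis=-1)                # two 2H x 2W x 1 weights
    return w_e * e + w_f * f                          # weighted sum, 2H x 2W x C/2
```

Applying this step three times, with channel counts chosen so as to end at the 7-channel output of Figure 6, would give the decoder skeleton described in the following paragraphs.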
Figure 5. The specific structure of the two upsampling modules. (a) Bilinear interpolation upsampling. (b) Transposed convolution upsampling.
The adaptive upsampling module is stacked three times to form the decoder of the lane line detection model in this paper, and its corresponding network structure is shown in Figure 6. The input feature map is decoded by the decoder and upsampled to the original image size, and the number of channels is reduced to 7. The first channel is used to predict the background of the lane lines, and the other channels directly predict the pixel coordinates of the lane line instances, which gives a faster detection speed than algorithms that first perform semantic segmentation and then fit the lane lines.
Figure 6. Decoder network structure.
Figure 7. Lane line prediction branch. (a) Internal structure of lane line prediction branch. (b) External structure of lane line prediction branch.
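For reference, a small sketch of the lane line prediction branch in Figure 7 is shown below; the pooled 23 × 40 × 7 input (6440 values after flattening) is read off Figure 8, while the five max-pooling steps that produce it are not reproduced here, so treat the input shape as an assumption.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Lane line prediction branch (sketch of Figure 7): one confidence in [0, 1] per lane line.
branch_input = tf.keras.Input(shape=(23, 40, 7))        # pooled feature map (assumed shape)
x = layers.Flatten()(branch_input)                      # 23 * 40 * 7 = 6440 values
x = layers.Dense(128, activation="relu")(x)             # Fc 128 + ReLU
confidence = layers.Dense(6, activation="sigmoid")(x)   # Fc 6 + Sigmoid
prediction_branch = tf.keras.Model(branch_input, confidence)
```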
2.5. Proposed Lane Detection Model
Figure 8 depicts the lane line detection model based on instance segmentation. This
model uses ReLU as the activation function and primarily consists of 12 convolutional lay-
ers, 3 upsampling layers, and 5 pooling layers. The input feature map is compressed three
times by the encoder, and the output feature map is stacked by six multi-size asymmetric
shuffled convolutions through the feature enhancement model. Part of the output feature
map is passed to the decoder, and the other part is passed to the lane line prediction branch.
The output feature map is adaptively up-sampled 3 times by the decoder, and the result of
the instance segmentation of the feature map is output. The lane line prediction branch
outputs the lane line confidence through maximum pooling 5 times. The encoder network
structure and the decoder network structure present an asymmetric state, which effectively
reduces the number of parameters and computation of the model.
Figure 8. Lane detection model.
3. Experimental Results and Analysis
3.1. Dataset and Preprocessing
This paper is based on the TuSimple [26] dataset, which comprises video images collected on American highways. There are 20 frames in each segment. Because there are many video frames, the original dataset only marked the final frame of each 20-frame segment. To improve the dataset's generalizability, the first frame and the images of the tenth and eleventh frames in the middle are also chosen for labeling. The labeling file is in json format, and a point is marked every ten pixels in the vertical direction. There are 25,632 pictures of roads in total. Different from the original dataset, 14,504 pictures are selected for training, 2325 pictures are used for verification, and 8803 pictures are used for testing. To enhance the diversity of the data and improve the robustness of the model, data enhancement processing is performed on the training set, including random rotation and random horizontal deflection. Figure 9 shows some common scenes in the dataset. Each image has 2 to 5 marked lane lines. In this paper, these discrete lane line coordinate points are connected to form instance images that serve as the ground-truth labels.
Figure 9. Display of some scenes from the TuSimple dataset.
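Because the augmentation must be applied identically to each road image and its lane label map, a minimal sketch of the random horizontal deflection step is shown below (TensorFlow ops; the random rotation would be handled the same way, using nearest-neighbour interpolation for the label map). The function and dataset names are illustrative rather than taken from the authors' code.

```python
import tensorflow as tf


def random_horizontal_flip(image, label, prob=0.5):
    """Flip the road image and its lane label map together with probability `prob`."""
    do_flip = tf.random.uniform([]) < prob
    image = tf.cond(do_flip, lambda: tf.image.flip_left_right(image), lambda: image)
    label = tf.cond(do_flip, lambda: tf.image.flip_left_right(label), lambda: label)
    return image, label

# Example (illustrative): applied on the fly to a tf.data pipeline of (image, label) pairs.
# train_ds = train_ds.map(random_horizontal_flip, num_parallel_calls=tf.data.AUTOTUNE)
```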
3.2. Experiment Preparation
The server used in the experiments is equipped with an 11th Gen Intel(R) Core(TM) i5-11400H processor @ 2.70 GHz, 512 GB memory, and an NVIDIA GeForce RTX3050 graphics processor. The operating system is Windows 10 Professional, the deep learning framework is TensorFlow 2.4-GPU, and the CUDA version is 11.0.
The video acquisition device used in the experiments is a front-view camera, as shown in Figure 10, with a resolution of 2592 × 1944. The experimental vehicle is a Volkswagen Sagitar, the embedded platform Jetson Nano is used for mobile deployment, and its operating system is Ubuntu 18.04. The details are shown in Figure 11.
Figure 10. Image acquisition equipment.
Figure 11. Experimental vehicle and Jetson Nano embedded platform.
Table 2 displays the lane line detection model's hyperparameter settings. For lane line image segmentation and lane line confidence prediction, different loss functions—the cross-entropy loss function and the binary cross-entropy loss function, respectively—are used. The model parameters are updated after each batch of training, which is recorded as one training session. The maximum number of iterations is set to 300, and the maximum number of training sessions to 80,000; when the number of training sessions exceeds this value, training stops. The learning rate is determined by the following formula:
$$L = \left(1 - \frac{s}{p}\right)^{0.9} \tag{5}$$
Table 2. Hyperparameter settings of the lane line detection model.

Name                      Value
Batch size                8
Iterations                300
Initial learning rate     0.02
Optimizer                 SGD
Optimizer decay factor    0.0001
In the formula, L represents the learning rate, s is the current number of training sessions, and p is the maximum number of training sessions.
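A minimal sketch of this training setup in Keras is given below: the two loss functions named above and a polynomial decay implementing Formula (5). Scaling the decayed factor by the initial learning rate of 0.02 from Table 2, the sparse label encoding, and the output names are assumptions of the sketch rather than details taken from the authors' code.

```python
import tensorflow as tf

# Losses: cross-entropy for the 7-channel segmentation output and
# binary cross-entropy for the 6-way lane confidence branch.
seg_loss = tf.keras.losses.SparseCategoricalCrossentropy()   # integer label maps assumed
conf_loss = tf.keras.losses.BinaryCrossentropy()

# Polynomial decay realizing Formula (5): lr = lr0 * (1 - s / p) ** 0.9
lr_schedule = tf.keras.optimizers.schedules.PolynomialDecay(
    initial_learning_rate=0.02,     # Table 2
    decay_steps=80_000,             # maximum number of training sessions
    end_learning_rate=0.0,
    power=0.9,
)
optimizer = tf.keras.optimizers.SGD(learning_rate=lr_schedule)

# model.compile(optimizer=optimizer,
#               loss={"segmentation": seg_loss, "confidence": conf_loss})  # output names illustrative
```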
$$FP = \frac{F_{pred}}{N_{pred}} \tag{7}$$
$$FN = \frac{M_{pred}}{N_{gt}} \tag{8}$$
Among them, F_pred is the number of wrongly predicted lane lines, M_pred is the number of real lane lines that have not been predicted, N_pred is the total number of predicted lane lines, and N_gt is the number of ground-truth lane lines; the lower the values of FP and FN, the better the model performance.
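Formulas (7) and (8) translate directly into code; the counts would be produced by the evaluation script, so the example values below are purely illustrative.

```python
def false_positive_rate(f_pred: int, n_pred: int) -> float:
    """FP = F_pred / N_pred (Formula (7)): wrongly predicted lanes over all predicted lanes."""
    return f_pred / n_pred


def false_negative_rate(m_pred: int, n_gt: int) -> float:
    """FN = M_pred / N_gt (Formula (8)): missed ground-truth lanes over all ground-truth lanes."""
    return m_pred / n_gt


# Illustrative numbers only: 30 wrong predictions out of 1000, 25 misses out of 980 ground truths.
print(false_positive_rate(30, 1000), false_negative_rate(25, 980))
```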
To verify the performance of the model in this paper, it is compared with existing
models (ResNet-18, ResNet-34 [27], Enet [28], LaneNet [29], SCNN [30], ENet-SAD [31],
RESA-50 [32], SGLD-34 [33], Res34-VP [34]) that conducted comparative experiments on
the TuSimple test set, and the results are shown in Table 3.
As can be seen from Table 3, the lane detection model proposed in this paper is
superior to the current excellent lane detection model in terms of accuracy, achieving the
highest accuracy rate and the lowest FP value. Moreover, the amount of parameters and
calculations of the model is only higher than that of the lightweight network Enet, and
the reasoning speed is second only to Res18-Seg. The model in this paper can quickly and
accurately detect lane lines with a small number of computing resources, achieve a balance
between accuracy and speed, and meet the accuracy and real-time requirements of lane
line detection. Therefore, on the whole, the lane detection model in this paper is superior
to other lane detection models in terms of comprehensive performance.
To verify the influence of the adaptive upsampling module and the feature enhance-
ment module on the overall performance of the lane line detection model, an ablation
experiment was carried out to compare the accuracy before and after adding the adaptive
upsampling module and the feature enhancement module. Table 4 records the results of
the ablation experiment.
It can be seen from Table 4 that after adding the adaptive upsampling module and the
feature enhancement module, the accuracy of the model has been improved to varying
degrees, which are 0.2% and 0.82%, respectively. After using the two modules comprehen-
sively, the accuracy rate of the model is increased to 96.7%, which is 0.89% higher than the
original, indicating that the above two modules can effectively improve the performance of
the lane line detection model.
Figure 12. Model verification—loss function graph.
Figure 13 shows that in the training phase, the verification dataset is used to validate each lane line detection model to generate a loss function verification curve. It can be seen from Figure 13 that ENet-SAD and RESA-50 present an overfitting phenomenon, as their loss function curves first decrease and then increase. At the same time, the other eight lane line detection models did not appear to overfit during the training period, but during the training process, the loss functions of networks such as Res34-VP show a certain degree of oscillation during convergence. Compared with the convergence effects of the remaining seven lane line detection models, the model in this paper has the best verification convergence.
Figure 13. Model verification—loss function graph.
For the ablation experiments in Table 4, the performance before and after adding the adaptive upsampling module and the feature enhancement module is compared, to conduct a comparative analysis with the model in this paper. Figure 14 shows the verification curves of the loss value during model training, and the experimental environment settings are the same as above. For the baseline, when the loss function converges to a specific stage, the convergence speed of the loss function decreases significantly due to the decline in the feature extraction ability of the model. After the convergence speed decreases, the baseline, the network fused with the adaptive upsampling module, and the network fused with the feature enhancement module all show a certain oscillation in convergence. However, the model in this paper has a stable downward trend and the smallest oscillation amplitude, so its convergence works best.
Figure 14. Model verification—loss function graph.
Figure 15 shows that in the training phase, the above four detection models are cross-validated using the verification dataset to generate a loss function verification curve. It can be seen from Figure 15 that the model fused with the feature enhancement module shows a significant increase in convergence speed in the later stage, and its convergence effect outperforms the baseline as well as the network fused with the adaptive upsampling module. However, the model in this paper has more advantages than the first three networks, so it has the best verification convergence.
Figure 15. Model verification—loss function graph.
3.5. Comparison of Lane Line Detection Effects
3.5. Comparison of Lane Line Detection Effects
To verify the effectiveness of the feature enhancement module and the adaptive
20
To verify
upsampling thethe
module, effectiveness of theand
feature maps before feature enhancement
after the module
feature enhancement andand
module the adap
the final
10
sampling detection results are visualized, as shown in Figure 16. Among them,
module, the feature maps before and after the feature enhancement mod Figure 16a
is the feature map before the encoder processing, Figure 16b is the feature map after
the final detection results are visualized, as shown in Figure 16. Among them, Fig
being0processed by the feature enhancement module, and Figure 16c is the input image.
Figure 0
is the 8 map
feature
16b shows 16 before
that 24features
the 32 encoder
the 40
extracted48processing,
56 encoder
by the 64 Figure
72 relatively
are 80 is88
16b the feature
scattered localmap afte
Epoch
features, while the feature enhancement module can capture the complete features of the
lane lines,
Figure 15.and the perceived
Model ability of the
verification—loss lane lines
function is significantly enhanced. Figure 16b is
graph.
the mapping output by the adaptive upsampling module in the original image. It can be
seen that the model can accurately detect the lane lines in the input image.
3.5. Comparison of Lane Line Detection Effects
To verify the effectiveness of the feature enhancement module and the adap
sampling module, the feature maps before and after the feature enhancement mod
the final detection results are visualized, as shown in Figure 16. Among them, Fig
is the feature map before the encoder processing, Figure 16b is the feature map aft
processed by the feature enhancement module, and Figure 16c is the input image. Figure
16b shows that the features extracted by the encoder are relatively scattered local features,
while the feature enhancement module can capture the complete features of the lane lines,
Sensors 2023, 23, 789 and the perceived ability of the lane lines is significantly enhanced. Figure 16b is the15map-
of 21
ping output by the adaptive upsampling module in the original image. It can be seen that
the model can accurately detect the lane lines in the input image.
(a) (b)
(c) (d)
Figure 16.
Figure 16. Visualization of the
Visualization of the lane
lane line
line detection
detection process.
process. (a)
(a) Before
Before feature
feature enhancement.
enhancement. (b)
(b) After
After
feature enhancement. (c) Input image. (d) Detection result.
feature enhancement. (c) Input image. (d) Detection result.
To verify the performance of the lane line detection model in this paper, it is compared with the remaining nine lane line detection models in Table 3 and the first three models in Table 4, in which different modules are fused. The four cases of road shadow, road blur, road occlusion, and slender and sparse lane lines in the test set are selected for instance segmentation analysis, as shown in Figure 17 below. After inputting pictures and fixed labels, for the four lane line detection models ENet-SAD, Res34-VP, RESA-50, and SGLD-34, the segmented solid lines have defect losses in all four scenarios. For the instance segmentation of the Res18-Seg, Res34-Seg, and ENet lane line detection models, the segmented solid lines have defect losses in three scenarios. For the instance segmentation of the LaneNet and SCNN lane line detection models, the segmented solid lines have defect losses in two scenarios. For the instance segmentation of the three network models from the ablation experiments, the segmented solid lines have defect losses in three scenes, two scenes, and one scene, respectively. Compared with the performance of the first 12 models, when the model in this paper performs instance segmentation on the four scenarios, there is no defect loss, and the detection effect reaches the best level. Therefore, on the whole, the lane detection model in this paper is superior to other lane detection models in terms of comprehensive performance.
Figure 17. Instance Segmentation Analysis of Different Network Performance Based on TuSimple
Dataset.
3.6. Lane Line Detection and Mobile Terminal Deployment in Different Scenarios
To further verify the effect of the lane line detection model in this paper, road video information is collected by the front car camera in different scenarios, such as normal roads, road congestion at night, road blocking, and night tunnels. At the same time, according to the results of instance segmentation on the TuSimple test set, the closest to the effect of the model in this paper is the network fused with the feature enhancement module. Considering that there are many network models for comparison and to reduce the repetition of experiments, the lane line detection model fused with the feature enhancement module and the lane line detection model in this paper are compared and analyzed in complex traffic scenarios, as shown in Figure 18.
Figure 18. Lane line detection effect in different scenarios.
It can be seen from Figure 18a,b that when driving on a normal road, both the lane line detection model fused with the feature enhancement module and the lane line detection model in this paper can smoothly segment and accurately detect the lane lines. The scenarios of road congestion at night, road blocking, and the tunnel at night can be seen in Figure 18c,e,g, showing that the lane line detection model fused with the feature enhancement module has a partial missed-detection problem, which is marked with an elliptical dashed line. From Figure 18d,f,h, it can be seen that the lane line detection model in this paper can detect lane lines accurately.
To further test the performance of the lane detection model in this paper, it is deployed on the mobile terminal for verification. It can be seen from Table 3 that the parameter quantity of the lane line model in this paper is 9.57 M, which is very low. At the same time, since RepVgg-A0 is a lightweight network, different branch structures are subtly fused during inference, thereby compressing the parameters of the model. Therefore, the lane line detection model can be directly deployed to the embedded platform Jetson Nano, and the TensorRT framework can be used for half-precision acceleration to make its detection speed meet the requirements of real-time detection. Based on the complex traffic scenes in Figure 18, the lane line detection model in this paper is deployed to the embedded platform Jetson Nano, and the displayed results are shown in Figure 19 below.
Figure 19. Mobile terminal detection rendering.
Figure 19a,c,e,g show the deployment of the lane line detection model to the embedded platform Jetson Nano, and the performance of the model based on the above different scenarios can be tested. Due to the limited space shown in the picture on the left, its enlarged effect under the Ubuntu 18.04 system is shown in Figure 19b,d,f,h, in which it can
be seen that under the premise of accurate detection of lane lines in complex scenarios,
the real-time detection speed of the Jetson Nano platform has reached above 27 fps.
Although there is still a certain gap with the real-time detection speed under the Windows
system, it can meet the real-time detection speed requirements under the deployment of
the mobile terminal. Therefore, the lane line detection model in this paper is deployed on
the mobile terminal and performs well.
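For completeness, a minimal sketch of the half-precision conversion step is shown below using TensorFlow's TF-TRT bridge; the SavedModel paths are placeholders, and the exact TF-TRT API has shifted slightly across TensorFlow releases, so this is an illustration of the FP16 conversion rather than the authors' deployment script (an equivalent route is exporting to ONNX and building an engine with trtexec).

```python
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Convert a SavedModel of the lane detection network into a TensorRT-optimized
# SavedModel with FP16 (half-precision) kernels for the Jetson Nano.
params = trt.TrtConversionParams(precision_mode=trt.TrtPrecisionMode.FP16)
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="lane_detector_saved_model",   # placeholder path
    conversion_params=params,
)
converter.convert()
converter.save("lane_detector_trt_fp16")                 # placeholder output path
```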
4. Conclusions
In this paper, aiming at the problems of low lane line detection accuracy and poor
real-time detection speed of existing lane line detection algorithms in complex traffic scenes,
a lane line detection algorithm based on instance segmentation is proposed. The design
method of this paper mainly includes optimizing the RepVgg-A0 network structure to
expand the receptive field of the network; a multi-size asymmetric shuffled convolution
model is proposed to enhance the ability to extract sparse and slender lane line features; an adaptive
upsampling model is proposed, which allows the network to select the weight of the two
upsampling methods at each position; a lane line prediction branch is added to facilitate
the output of lane line confidence; and the lane line detection algorithm is deployed to the
embedded platform Jetson Nano, using the TensorRT framework for half-
precision acceleration. The experimental results show that the lane line detection algorithm
in this paper has an Acc value of 96.7% on the expanded TuSimple dataset and a real-
time detection speed of 77.5 fps. The model is successfully deployed on the embedded
platform Jetson Nano and achieves a real-time detection speed of 27 fps, making it
suitable for mobile terminal deployment. Therefore, the lane line detection algorithm in
this paper is more suitable for current self-driving cars after being deployed on the mobile
terminal, to improve the accuracy and safety of the automated driving perception part.
Due to the limitation of the experimental conditions, there is a gap between the real-
time detection speed of the lane line algorithm deployed on the mobile terminal and the
real-time detection speed under the Windows system. Therefore, the next step is to consider
further compression of the model parameters, so that the real-time detection speed of the
mobile terminal can be further improved without reducing the accuracy.
Author Contributions: Conceptualization, W.C. and X.W.; methodology, B.M.; software, B.M.; vali-
dation, W.C., X.W. and B.M.; formal analysis, W.C.; investigation, X.W.; resources, X.W.; data curation,
B.M.; writing—original draft preparation, W.C.; writing—review and editing, B.M.; visualization,
X.W.; supervision, B.M.; project administration, W.C.; funding acquisition, X.W. All authors have read
and agreed to the published version of the manuscript.
Funding: This work was supported by the Anhui Provincial Natural Science Foundation under Grant
(1908085ME159), the funder: Xuanyao Wang, funding number: 5. The project was funded by the
Scientific Research Activities of Post-Doctoral Researchers in Anhui Province under Grant (2020B447),
the funder: Xuanyao Wang, funding number: 4. Anhui University of Technology Research Institute
of Environmentally Friendly Materials and Occupational Health (Wuhu) R&D special funding project
(ALW2021YF05). the funder: Xuanyao Wang, funding number: 6.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Haris, M.; Hou, J. Obstacle Detection and Safely Navigate the Autonomous Vehicle from Unexpected Obstacles on the Driving
Lane. Sensors 2020, 20, 4719. [CrossRef] [PubMed]
2. Yang, W.; Zhang, X.; Lei, Q. Lane Position Detection Based on Long Short-Term Memory (LSTM). Sensors 2020, 20, 3115. [CrossRef]
[PubMed]
3. Mammeri, A.; Boukerche, A.; Tang, Z. A real-time lane marking localization, tracking and communication system. Comput.
Commun. 2015, 73, 229. [CrossRef]
4. Sotelo, N.; Rodríguez, J.; Magdalena, L. A Color Vision-Based Lane Tracking System for Autonomous Driving on Unmarked
Roads. Auton. Robot. 2004, 1, 95–116. [CrossRef]
5. Ozgunalp, N.; Dahnoun, N. Lane detection based on improved feature map and efficient region of interest extraction. In
Proceedings of the 2015 IEEE Global Conference Signal and Information Process (GlobalSIP, IEEE 2015), Orlando, FL, USA,
14–16 December 2015. [CrossRef]
6. Chi, F.H.; Huo, Y.H. Forward vehicle detection system based on lane-marking tracking with fuzzy adjustable vanishing point mech-
anism. In Proceedings of the 2012 IEEE International Conference on Fuzzy Systems, Brisbane, QLD, Australia, 10–15 June 2012.
[CrossRef]
7. Lin, H.Y.; Dai, J.M.; Wu, L.T.; Chen, L.Q. A Vision-Based Driver Assistance System with Forward Collision and Overtaking
Detection. Sensors 2020, 20, 5139. [CrossRef] [PubMed]
8. Li, K.; Shao, J.; Guo, D. A Multi-Feature Search Window Method for Road Boundary Detection Based on LIDAR Data. Sensors
2019, 19, 1551. [CrossRef] [PubMed]
9. Cao, Y.; Chen, Y.; Khosla, D. Spiking deep convolutional neural networks for energy-efficient object Recognition. Int. J. Comput.
Vis. 2014, 113, 54–66. [CrossRef]
10. Zhang, X.; Yang, W.; Tang, X.; Wang, Y. Lateral distance detection model based on convolutional neural network. IET Intell.
Transp. Syst. 2019, 13, 31–39. [CrossRef]
11. Kim, J.; Kim, J.; Jang, G.-J.; Lee, M. Fast learning method for convolutional neural networks using extreme learning machine and
its application to lane detection. Neural Netw. 2017, 87, 109–121. [CrossRef] [PubMed]
12. Aly, M. Real time detection of lane markers in urban streets. In Proceedings of the IEEE Intelligent Vehicles Symposium,
Eindhoven, The Netherlands, 4–6 June 2008. [CrossRef]
13. Kim, J.; Park, C. End-To-End Ego Lane Estimation Based on Sequential Transfer Learning for Self-Driving Cars. In Proceedings of
the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW, 2017), Honolulu, HI, USA, 21–26
July 2017; p. 1194. [CrossRef]
14. Neven, D.; de Brabandere, B.; Georgoulis, S.M. Towards End-to-End Lane Detection: An Instance Segmentation Approach. In
Proceedings of the 2018 IEEE Intelligent Vehicles Symposium(IV), Changshu, China, 26–30 June 2018; p. 286. [CrossRef]
15. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE
Trans. Pattern Anal. Mach. Intell. 2017, 6, 1137. [CrossRef] [PubMed]
16. He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. IEEE Trans.
Pattern Anal. Mach. Intell. 2014, 9, 1904. [CrossRef]
17. Haris, M.; Jin, H.; Xiao, W. Lane line detection and departure estimation in a complex environment by using an asymmetric
kernel convolution algorithm. Vis. Comput. 2022, 1–10. [CrossRef]
18. Liu, R.; Yuan, Z.; Liu, T. End-to-end lane shape prediction with transformers. In Proceedings of the IEEE Winter Conference on
Applications of Computer Vision, Online, 5–9 January 2021; pp. 3694–3702.
19. Chao, M.; Dean, L.; He, H. Lane Line Detection Based on Improved Semantic Segmentation. Sens. Mater. 2021, 33, 4545–4560.
20. Ma, N.; Zhang, X.; Zheng, H.T. Shufflenet v2: Practical guidelines for efficient CNN architecture design. In Proceedings of the
European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 116–131.
21. Ding, X.; Zhang, X.; Ma, N.; Han, J.; Ding, G.; Sun, J. Repvgg: Making vgg-style convnets great again. In Proceedings of the
IEEE/CVF Conference on Computer Vision and Pattern Recognition, Online, 19–25 June 2021; pp. 13733–13742.
22. Qiu, S.; Xu, X.; Cai, B. FReLU: Flexible rectified linear units for improving convolutional neural networks. In Proceedings of the
2018 24th International Conference on Pattern Recognition (ICPR), Beijing, China, 20–24 August 2018; pp. 1223–1228.
23. Szegedy, C.; Vanhoucke, V.; Ioffe, S. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
24. Zheng, T.; Fang, H.; Zhang, Y. Resa: Recurrent feature-shift aggregator for lane detection. In Proceedings of the AAAI Conference
on Artificial Intelligence, Online, 2–9 February 2021; Volume 35, pp. 3547–3554.
25. Romera, E.; Alvarez, J.M.; Bergasa, L. Erfnet: Efficient residual factorized convnet for real-time semantic segmentation. IEEE
Trans. Intell. Transp. Syst. 2017, 19, 263–272. [CrossRef]
26. Wang, P.; Chen, P.; Yuan, Y. Understanding convolution for semantic segmentation. In Proceedings of the 2018 IEEE Winter
Conference on Applications of Computer Vision (WACV), IEEE, Lake Tahoe, NV, USA, 12–15 March 2018; pp. 1451–1460.
27. Chen, L.C.; Papandreou, G.; Kokkinos, I. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution,
and fully connected crfs. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 834–848. [CrossRef] [PubMed]
28. Paszke, A.; Chaurasia, A.; Kim, S. Enet: A deep neural network architecture for real-time semantic segmentation. arXiv 2016,
arXiv:1606.02147.
29. Lu, P.; Xu, S.; Peng, H. Graph-Embedded Lane Detection. IEEE Trans. Image Process. 2021, 30, 2977–2988. [CrossRef] [PubMed]
30. Pan, X.; Shi, J.; Luo, P. Spatial as deep: Spatial cnn for traffic scene understanding. In Proceedings of the AAAI Conference on
Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32.
31. Hou, Y.; Ma, Z.; Liu, C. Learning lightweight lane detection CNNS by self attention distillation. In Proceedings of the IEEE
International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1013–1021.
32. Wang, B.; Wang, Z.; Zhang, Y. Polynomial regression network for variable-number lane detection. In Proceedings of the European
Conference on Computer Vision, Online, 23–28 August 2020; pp. 719–734.
33. Su, J.; Chen, C.; Zhang, K. Structure guided lane detection. arXiv 2021, arXiv:2105.05403.
34. Liu, Y.B.; Zeng, M.; Meng, Q.H. Heatmap-based vanishing point boosts lane detection. arXiv 2020, arXiv:2007.15602.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.