CCD Image-Based Pixel-Level Identification Model For Pavement Cracks Under Complex Noises Using Artificial Intelligence

This document presents a novel automatic identification model for pavement cracks using artificial intelligence and CCD imaging technology, addressing the limitations of manual detection methods. The model leverages a modified U-net architecture with a MobileNet backbone and an Atrous Channel Pyramid Attention (ACPA) mechanism to enhance segmentation accuracy and efficiency, achieving a precision of 88.84% and an accuracy of 98.87%. The study demonstrates the effectiveness of deep learning in improving pavement crack detection under complex background conditions, making it suitable for real-time applications.


Received 13 July 2023, accepted 11 August 2023, date of publication 15 August 2023, date of current version 25 August 2023.

Digital Object Identifier 10.1109/ACCESS.2023.3305670

CCD Image-Based Pixel-Level Identification Model for Pavement Cracks Under Complex Noises Using Artificial Intelligence
FEI SONG 1, YU ZOU 2, WENSHA SHAO 3, AND XIAOYUAN XU 1
1 School of Information Technology, Jiangsu Open University, Nanjing 210017, China
2 Science and Technology Office, Jiangsu Open University, Nanjing 210036, China
3 Jiangsu Lifelong Education Credit Bank Management Center, Jiangsu Open University, Nanjing 210017, China

Corresponding author: Xiaoyuan Xu ([email protected])


This work was supported in part by the National Natural Science Foundation of China under Grant 62102184, and in part by the Natural
Science Foundation of Jiangsu Province under Grant BK20200784.

ABSTRACT Existing manual detection methods have limitations, particularly for pavement cracks in complex backgrounds, which manifest as low recognition accuracy, a high misjudgment rate, and long inspection times. To overcome these problems, artificial intelligence technology and charge-coupled device (CCD) imaging technology are combined to construct an automatic identification method for hidden pavement cracks under complex background interference. First, the classic semantic segmentation model U-net is selected as the base model, and the lightweight MobileNet network replaces the parameter-heavy encoder of U-net, which makes the model lightweight and improves the segmentation of pavement cracks. On this basis, the Atrous Channel Pyramid Attention (ACPA) mechanism is introduced into U-net to better capture contextual information and to focus selectively on relevant features. A pavement crack data set containing diverse crack types and background noise is used to evaluate the effectiveness and scope of application of the developed model. Quantitative evaluation results show that the developed model achieves, on the test set, a precision of 88.84%, recall of 89.76%, accuracy of 98.87%, and IoU of 89.95%. Combined with the results of the comparison and ablation experiments, it can be inferred that replacing the encoder of U-net with the lightweight MobileNet network effectively yields a lightweight model, while the ACPA module effectively performs multi-scale and long-distance cross-channel interaction, helps suppress useless features and strengthen useful ones, and helps the network learn stronger feature representations of hidden areas of pavement cracks.

INDEX TERMS Pavement disease, machine vision, deep learning, damage assessment, feature extraction.

I. INTRODUCTION
China is currently undergoing urbanization at an unprecedented speed and scale, and the construction of highways is an important part of urban infrastructure development [1], [2]. Pavement cracks are a very common road disease that seriously harms the service life of roads and vehicle driving safety. Effective identification of pavement cracks is one of the key tasks to ensure pavement safety and long-term operation. Manual inspection is the traditional pavement crack identification method, but it suffers from a high missed-identification rate, long inspection times, and expensive labor costs [3].
To solve this problem, optical imaging technology that combines an onboard charge-coupled device (CCD) sensor with digital image processing has attracted much attention due to its ability to automatically monitor road health conditions [4], [5]. In fact, in the past few years, digital image processing and computer software and hardware technology have been widely used in intelligent crack detection vehicles.

The associate editor coordinating the review of this manuscript and approving it for publication was Wei Wang.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
VOLUME 11, 2023 89733
F. Song et al.: CCD Image-Based Pixel-Level Identification Model for Pavement Cracks

Compared with manual detection, this automatic detection technology has significantly improved accuracy, but the detection vehicles are expensive and struggle with complex road backgrounds [6]. Moreover, pavement cracks mostly present a linear shape, which makes their pixel ratio in the image relatively low, so crack identification is more challenging than the identification of other types of defects, especially in complex environments with a lot of background noise [7].
In the past few decades, with the rapid development of the field of artificial intelligence (AI), various intelligent algorithms have been applied to the identification and detection of road surface defects [8], [9], [10], [11], [12]. Machine learning-based algorithms, represented by artificial neural networks and support vector machines, are often used to learn the intrinsic knowledge association of pavement crack data [13], [14]. However, these methods rely on traditional manual feature extraction, and their performance is limited in scenes with complex road backgrounds, uneven pixel distribution, and noise interference. In addition, the cumbersome feature extraction process slows the overall inference of the model, making it difficult to meet real-time detection requirements.
In recent years, deep learning methods have been researched and partially applied in the fields of computational photography, image recognition, and automatic driving, and have achieved remarkable results [15], [16]. The use of deep learning methods for CCD image data mining and effective extraction has become a hot research topic in the field of road surface management. For example, Liu et al. [17] proposed an automated pavement crack detection and segmentation method using two-step convolutional neural networks. Yang et al. [18] proposed a feature pyramid and hierarchical boosting network for pavement crack detection. Fan et al. [19] utilized the parallel ResNet to develop a high-performance pavement detection and measurement model. In other research, Loprencipe et al. [20] proposed an ensemble method using deep convolutional neural networks for automatic pavement crack detection and measurement. The effective implementation of the above research shows that deep learning is an effective way to identify hidden pavement cracks. On the other hand, although the aforementioned deep learning-based networks achieve good performance, they still suffer from insufficient feature extraction capability, especially for long-range contextual information, which is crucial for tiny crack detection. A small receptive field can easily lead to fractured or false-positive crack regions under complex light and shadow conditions. In addition, the balance between network inference efficiency and inference accuracy needs further study.
To obtain accurate segmentation results for pavement cracks under complicated background and noise interference, the developed network needs to obtain high-quality semantic features and sufficient spatial detail information. A network generally performs a downsampling operation on the feature map to obtain high-quality semantic features. This operation not only reduces the resolution of the feature map and the amount of calculation, but also helps to increase the receptive field of the network. However, spatial detail information is lost when the feature map is downsampled. To address these problems, this study uses the U-shaped network architecture (U-net) as the base network. To realize real-time detection of pavement cracks at the pixel level, the mobile network MobileNet is introduced to replace the backbone of the conventional U-net network. The improved U-net network integrates the encoder feature maps into the decoder at matching resolutions through skip connections. To further reduce the information lost when gradually increasing the spatial resolution, the Atrous Channel Pyramid Attention (ACPA) module is introduced into the U-net network to improve the integrity and accuracy of identifying micro-cracks in complex backgrounds. On this basis, the model loss function is improved to mitigate the severe imbalance between crack and background samples caused by tiny cracks.
The main contributions of this study are as follows.
1) Replacing the backbone network with the lightweight MobileNet network maintains high detection accuracy while greatly reducing the size of the model weight file and the crack segmentation time.
2) The ACPA module effectively performs multi-scale and long-distance cross-channel interaction, helps suppress useless features and strengthen useful features, helps the network learn stronger feature representations of pavement crack areas, and requires fewer parameters.
3) Comparative experiments on the pavement engineering crack data set show that the proposed method has higher inference accuracy and efficiency than other benchmark methods, indicating that the model achieves an effective balance between inference accuracy and efficiency.
The rest of this paper is organized as follows. Section II introduces the architecture of the proposed network and the mathematical principles of each part. Section III describes the implementation process and the dataset sources. Section IV describes the experimental results and feasibility analysis. Lastly, the conclusions are provided in the final part of this paper.

II. METHODOLOGY
In this section, the overall architecture of the developed network is first presented to give a workflow. Then, the theory behind the components of the model is elaborated. The specific content is as follows.

A. NETWORK ARCHITECTURE OVERVIEW
Conventional convolution operations rely on short-range local feature semantics, which is not effective enough for identifying hidden cracks in the pavement under complex background noise [21]. To address this problem, this study proposes an improved lightweight U-net architecture that uses the ACPA module. FIGURE 1 shows the overall architecture of the developed network. It can be seen from this figure that the proposed network is composed of an encoder, a decoder, a central layer, and a side network. The encoder and the decoder each have 4 steps and are connected through the central layer; at each step, the feature map obtained by the encoder is joined to the decoder along the channel dimension through a skip connection.

FIGURE 1. The diagram of the developed model.

B. THE IMPROVED LIGHTWEIGHT U-NET ARCHITECTURE
As a representative deep learning model for semantic segmentation, U-net was proposed by Ronneberger et al. in 2015 [22]. The architecture is characterized by a contracting path to capture context and a symmetric expanding path that enables precise localization. In the past few years, U-net-based convolutional neural network architectures have been widely used for image segmentation tasks, such as identifying and labeling different objects within an image. For example, U-net has been widely used in medical image analysis and other areas that require fine-grained segmentation.
FIGURE 2 shows the basic architecture of the U-net-based network. It can be seen from this figure that the U-net-based network model consists of an encoder and a decoder, where the encoder learns low-level and high-level features of the input image, and the decoder maps these features to the image that contains the segmentation result.

FIGURE 2. The flowchart of the U-net-based architecture.

The downsampling part of the conventional U-net network is usually a VGG16 network with a large number of parameters, which consumes substantial computing resources in the pavement crack segmentation task and makes it difficult to meet the real-time requirements of practical engineering applications. MobileNet is a popular convolutional neural network architecture designed for efficient and lightweight computation on mobile and embedded devices [22]. It is specifically designed to be computationally efficient and to have a small memory footprint. By incorporating MobileNet as the backbone of U-net, the overall model becomes more efficient and requires fewer computational resources, making it suitable for deployment on resource-constrained devices such as mobile phones or edge devices.
Based on the above analysis, this study uses the lightweight convolutional neural network MobileNet (with only 1/41 of the parameters of the VGG16 network) to replace the backbone network of U-net. Using MobileNet makes the model lightweight, reduces the computational complexity, and improves the semantic segmentation of cracks. FIGURE 3 shows the depthwise separable convolution block. The depthwise separable convolutions used in MobileNet significantly reduce the number of parameters compared to traditional convolutional layers. By replacing the standard convolutional layers in U-net with MobileNet's depthwise separable convolutions, the parameter count of the U-net model decreases, leading to faster training and inference.

FIGURE 3. The diagram of the depthwise separable convolution block.
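The parameter saving from depthwise separable convolution can be checked with a short calculation. The sketch below counts the weights of a standard k x k convolution and of its depthwise separable counterpart (a k x k depthwise convolution followed by a 1 x 1 pointwise convolution), ignoring bias terms. The function names and the example layer sizes are illustrative assumptions and are not taken from the paper.

```python
def standard_conv_params(k, c_in, c_out):
    """Number of weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Number of weights in a depthwise k x k conv plus a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise


def reduction_ratio(k, c_in, c_out):
    """Separable-to-standard weight ratio; algebraically 1/c_out + 1/k**2."""
    return depthwise_separable_params(k, c_in, c_out) / standard_conv_params(k, c_in, c_out)


if __name__ == "__main__":
    # Hypothetical layer sizes chosen only for illustration.
    k, c_in, c_out = 3, 64, 128
    print("standard:", standard_conv_params(k, c_in, c_out))
    print("separable:", depthwise_separable_params(k, c_in, c_out))
    print("ratio:", round(reduction_ratio(k, c_in, c_out), 4))
```

For these example sizes, the separable block needs 8,768 weights against 73,728 for the standard layer, roughly 12%, which matches the general ratio 1/C_out + 1/k^2.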


C. ATROUS CHANNEL PYRAMID ATTENTION
Due to the complex environmental conditions of the road, such as landmarks, oil stains, and water stains, breaks and discontinuities in the hidden cracks detected by a visual method can easily cause misjudgment in subsequent results. To alleviate this problem, ACPA is introduced to enhance the identification and segmentation of small target objects. ACPA is an attention mechanism used in computer vision tasks, particularly in semantic segmentation [22]. It is designed to capture long-range dependencies and enhance the feature representations in convolutional neural networks (CNNs). Semantic segmentation involves assigning a class label to each pixel in an image, which requires understanding the context and relationships between different regions. ACPA helps capture such contextual information by selectively attending to relevant features.
In image semantic segmentation, it is usually necessary to obtain high-level semantic information about the image to strengthen the understanding of the whole image. However, in the down-sampling process used to obtain this high-level semantic information, detailed information is inevitably lost, especially for small objects: they occupy few pixels in the image and are easily lost over multiple downsampling steps. This detailed information and these small objects are, however, still present in the channels of low-level feature maps.
Based on the ECA module, this paper proposes an ACPA module that replaces the one-dimensional convolution of size $k$ with multiple parallel one-dimensional convolutions with different dilation rates, enabling interaction between longer-distance channels at multiple scales.
FIGURE 4 shows the basic schematic diagram of the ACPA architecture. To further improve the representation of pavement crack characteristic areas, the ACPA module replaces the common 1D convolution in the conventional attention-based module with multiple parallel 1D dilated convolutions with different dilation rates for cross-channel interactions over multi-scale distances. The specific formula and theory are explained as follows.

$$f_{GAP}(k) = \frac{1}{H \times W} \sum_{i}^{H} \sum_{j}^{W} f_k(i, j) \tag{1}$$

where $f_k(i, j)$ represents the value of $f_k$ at coordinate $(i, j)$. To facilitate the one-dimensional convolution operation, one dimension is removed to obtain $F_{GAP}$, which can also be represented using the following formula.

$$F_{GAP} = \left[ f_{GAP}(1), f_{GAP}(2), f_{GAP}(3), \ldots, f_{GAP}(C) \right] \tag{2}$$

The application of the above mechanism can further improve the accuracy of pavement crack segmentation. This is mainly because an attention mechanism module is added after the connection layer in the decoding part of the network, which improves the network's ability to capture the global information of cracks.

FIGURE 4. The architecture diagram of the ACPA.

1) THE IMPROVED COMBINED LOSS FUNCTION
The white and black areas in the mask image are the pavement crack area and the background area, respectively. The pavement crack (positive sample) accounts for only 4% of the image area, so the pavement background (negative sample) dominates the image [23]. The loss function indicates the degree of difference between the output obtained after a sample is input into the model and the true label of the sample. To further improve the performance of the pixel-level model in hidden pavement crack segmentation, a combined loss function is developed.
Considering that crack identification is in essence a data mining problem with unbalanced samples, this study introduces a variety of loss functions to improve the sample mining ability of the model. In detail, this study combines the traditional binary cross-entropy (BCE) loss function with the Dice loss function to construct a hybrid loss function. Combining these loss functions leverages their respective strengths and encourages the model to learn both global and local features, leading to more accurate segmentation.
The formula for the binary cross-entropy loss function can be represented as follows.

$$Loss_{BCE} = -\frac{1}{N} \sum_{i=1}^{N} \Big( \alpha \, (p(y_i))^2 \, (1 - y_i) \log(1 - p(y_i)) + (1 - \alpha) \, (1 - p(y_i))^2 \, y_i \log(p(y_i)) \Big) \tag{3}$$

where $N$ is the total number of pixels in the image to be segmented; $\alpha$ is a dynamic scaling factor, which can dynamically reduce the weight of easily distinguishable samples during training;


$y_i$ is the real label of the i-th pixel (the background pixel label is 0, the crack pixel label is 1); and $p(y_i)$ is the probability predicted by the network that the i-th pixel is a crack pixel.
The Dice loss is a suitable loss function for training image segmentation models because it encourages the model to produce segmentation masks that align well with the ground truth masks [24]. Minimizing the Dice loss effectively maximizes the overlap between the predicted and ground truth masks, which improves the model's ability to capture fine details and handle class imbalance. The Dice loss is defined as 1 - Dice_coefficient, with

$$Dice\_coefficient = \frac{2|X \cap Y|}{|X| + |Y|} \tag{4}$$

where $|X \cap Y|$ represents the number of elements in the intersection of sets $X$ and $Y$, and $|X|$ and $|Y|$ represent the number of elements in each set. For the segmentation task, $X$ and $Y$ represent the ground truth mask and the predicted mask, respectively.
Since the relationship between the Dice loss and the Dice coefficient is $Loss_{Dice}$ = 1 - Dice_coefficient, the formula for the Dice loss can be written as follows.

$$Loss_{Dice} = 1 - \frac{2|X \cap Y|}{|X| + |Y|} \tag{5}$$

III. IMPLEMENTATION DETAILS

A. DATASET DESCRIPTION
The dataset used in this study is provided by Yang et al. [18] and was originally collected from the main campus of Temple University using cell phones. The resolution of the original pavement crack images is 2000 × 1500, and the dataset contains 500 images. Moreover, each crack image has a pixel-level annotated binary map.
To make the images easier for the network to read and to prevent memory overflow, the images are cropped to 448 × 448 resolution. FIGURE 5 shows sample images of pavement cracks from the crack dataset. It can be observed from the figure that the morphological difference between pavement cracks and the background roughness is very significant. In addition, light and shadow conditions, water stains, oil stains, and other interferences on the road surface also affect the CCD imaging effect.
Specifically, due to the limited number of images, the large size of each image, and limited computing resources, we crop each image into 16 non-overlapping image regions. A total of 3000 cropped images were further divided into the training set, validation set, and test set according to a fixed ratio of 60%, 20%, and 20%. Validation data is used to select the best model during training to prevent overfitting. Once a model is selected, it is tested on test data and other datasets to assess the generalization of the model. FIGURE 6 shows part of the original images and labeling results of the pavement crack dataset. It can be seen from the figure that there are significant differences in the shape of pavement cracks in the data set, with inconsistent crack widths and different bifurcation types.

FIGURE 5. Different types of pavement cracks.

FIGURE 6. Pavement cracks and annotations.

B. TRAINING CONFIGURATION AND EVALUATION INDICATORS
This experiment was carried out on the Windows 10 operating system. The hardware environment used in this study is 2 × Xeon(R) Gold 5118, 256 GB memory, 1 × Tesla T4, and a 1 TB SSD. The software environment is Python 3.7, CUDA 10.2, and cuDNN 7.6. The proposed method is implemented with the deep learning framework PyTorch on the VS Code software platform. Considering the limitation of the GPU card, the batch size is finally set to 6 throughout the training process.
Given the relatively small scale of the pavement crack dataset, this study leverages transfer learning by initializing the U-net model with pre-trained weights from a related task or a larger dataset. This approach can provide a good starting point and accelerate convergence, and fine-tuning the pre-trained model on the specific dataset can help improve performance. All the proposed and benchmark methods are


trained on the same image computing device equipped with a graphics card. The specific training parameters for the proposed method are as follows. The number of batches of the model is set to 8, the number of training rounds is set to 100, and the learning rate is set to 0.0005.
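The objective minimized under these training settings is the hybrid loss of Section II, which combines the weighted binary cross-entropy of Eq. (3) with the Dice loss of Eq. (5). A minimal sketch in plain Python over flattened pixel lists follows. Several details are assumptions that the paper does not specify: the value of alpha (treated here as a fixed constant), the clamping constant `EPS`, the soft (probability-based) intersection in the Dice term, and the unweighted sum of the two losses.

```python
import math

EPS = 1e-7  # assumed clamp that keeps log() finite; not specified in the paper


def weighted_bce_loss(probs, labels, alpha=0.25):
    """Eq. (3): weighted binary cross-entropy averaged over N pixels.

    probs  -- predicted crack probabilities p(y_i), each in [0, 1]
    labels -- ground-truth labels y_i (0 = background, 1 = crack)
    alpha  -- scaling factor; 0.25 is an assumed illustrative value
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, EPS), 1.0 - EPS)
        background_term = alpha * p ** 2 * (1 - y) * math.log(1.0 - p)
        crack_term = (1 - alpha) * (1.0 - p) ** 2 * y * math.log(p)
        total += background_term + crack_term
    return -total / len(probs)


def dice_loss(probs, labels, smooth=1e-7):
    """Eq. (5) with a soft intersection: 1 - 2|X and Y| / (|X| + |Y|)."""
    intersection = sum(p * y for p, y in zip(probs, labels))
    denominator = sum(probs) + sum(labels)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def combined_loss(probs, labels, alpha=0.25, dice_weight=1.0):
    """Hybrid loss; summing the two terms with dice_weight=1 is an assumption."""
    return weighted_bce_loss(probs, labels, alpha) + dice_weight * dice_loss(probs, labels)


if __name__ == "__main__":
    labels = [1, 0, 1, 0]
    print(combined_loss([0.9, 0.1, 0.8, 0.2], labels))  # mostly correct prediction
    print(combined_loss([0.1, 0.9, 0.2, 0.8], labels))  # mostly wrong prediction
```

A prediction that agrees with the labels should yield a smaller combined loss than one that contradicts them. In a PyTorch training loop the same expressions would be written with tensor operations so that gradients can flow.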
FIGURE 7. The change of the loss function and accuracy indicators during the model training.

In this study, several evaluation indicators are used to quantify the model's crack segmentation ability, including recall, precision, intersection over union (IoU), and accuracy. These evaluation indicators are defined as follows:

$$Recall = \frac{TP}{TP + FN} \tag{6}$$

$$Precision = \frac{TP}{TP + FP} \tag{7}$$

$$IoU = \frac{TP}{TP + FP + FN} \tag{8}$$

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \tag{9}$$

where True Positive (TP) refers to the number of pixels in the segmentation result that are correctly identified as crack regions. False Positive (FP) refers to the number of pixels that are identified as crack regions in the segmentation result but belong to the background region in the label map. True Negative (TN) refers to the number of pixels correctly identified as the background area. False Negative (FN) indicates the number of pixels that are recognized as background in the segmentation result but belong to the crack area in the label image.
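The four indicators can be computed directly from pixel-wise confusion counts. The sketch below assumes flattened binary masks (1 = crack, 0 = background) given as Python lists; the function names are illustrative, and accuracy uses the standard definition (TP + TN) / (TP + TN + FP + FN). Empty denominators return 0.0, which is a convention chosen here rather than one stated in the paper.

```python
def confusion_counts(pred, truth):
    """Return (TP, FP, TN, FN) for flattened binary masks (1 = crack)."""
    tp = fp = tn = fn = 0
    for p, t in zip(pred, truth):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def _safe_div(numerator, denominator):
    """Avoid division by zero when a mask contains no relevant pixels."""
    return numerator / denominator if denominator else 0.0


def segmentation_metrics(pred, truth):
    """Recall, precision, IoU, and accuracy as in Eqs. (6)-(9)."""
    tp, fp, tn, fn = confusion_counts(pred, truth)
    return {
        "recall": _safe_div(tp, tp + fn),
        "precision": _safe_div(tp, tp + fp),
        "iou": _safe_div(tp, tp + fp + fn),
        "accuracy": _safe_div(tp + tn, tp + tn + fp + fn),
    }


if __name__ == "__main__":
    pred = [1, 1, 0, 0, 1, 0, 0, 0]
    truth = [1, 0, 0, 0, 1, 1, 0, 0]
    print(segmentation_metrics(pred, truth))
```

In this toy example TP = 2, FP = 1, TN = 4, and FN = 1, so IoU is 0.5 and accuracy is 0.75. Because background pixels dominate crack images, accuracy can stay high even when IoU is modest, which is why the indicators are reported together.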
IV. RESULTS AND DISCUSSIONS
It should be noted that building a robust detection model on the target dataset is an iterative process that requires experimentation and validation on the specific dataset. It is important to carefully analyze the model's performance metrics, such as IoU, to assess the effectiveness of the applied improvements. FIGURE 7 shows the changes in the loss function and the evaluation index (accuracy) of the model during 100 iterations. It can be inferred from the figure that the loss of the model on both the training set and the validation set declines gradually and stably and then converges, indicating that the performance of the constructed model has stabilized after 100 iterations. Correspondingly, the crack identification accuracy of the model on the validation set increases steadily and finally converges as the number of iterations increases.
In this section, we conduct ablation studies to reveal the effectiveness of different modules in the developed method. More specifically, the combined loss function, the conventional BCE loss function, and the ACPA mechanism are removed from the proposed method to evaluate their effectiveness. Table 1 shows the results of the ablation experiment of the proposed method with different modules. As shown in Table 1, replacing the backbone with a lightweight network effectively reduces the number of model parameters and the model scale, but inevitably results in a slight decline in the model's evaluation indicators on the test set. It can be further seen that the introduction of the mixed loss function effectively improves the model's mining of positive samples (crack pixels), which is manifested in a significant increase in the recall rate. Benefitting from the large receptive field and long-range dependence of the attention mechanism, the model's ability to identify hidden cracks is further improved, which is reflected in the increase in model accuracy and IoU.

TABLE 1. The ablation study of the developed model performance with different modules.

In this research, to further provide more convincing evidence for the applicability and superiority of the proposed


TABLE 2. Model training and model building parameter comparison.

TABLE 3. Crack detection performances attained by different models.

FIGURE 8. Prediction results of the developed method and other deep learning methods on the dataset (the color box represents the key area where the pavement crack recognition effect is different).

TABLE 4. Comparative evaluation of crack detection efficiency of different segmentation methods.

method in solving the pavement crack detection problem, an in-depth comparative analysis with some state-of-the-art crack detection models was conducted. Specifically, to demonstrate the effectiveness of the proposed method, several convolution-based semantic segmentation methods, including the original U-net network [25], SegNet [26], Deeplab [27], and FCN [28], are used as comparative methods. Both the proposed method and the comparison methods use the same dataset for training and test evaluation to ensure fairness.
Table 2 shows the model size and training time of the proposed method and the comparison methods. It can be deduced from the table that the introduction of the MobileNet backbone network effectively reduces the number and scale of model parameters, making the model easier to deploy and apply. Table 3 shows the evaluation metrics of the developed method and the comparative methods on the test set. It can be deduced from this table that all deep learning-based crack detection methods achieve promising performance on the pavement crack dataset, while the developed method presents a significant performance boost over the other methods across the evaluation metrics.
Table 4 shows the model inference efficiency comparison of the developed method and the other methods. It can be seen from the table that the introduction of the MobileNet backbone network effectively reduces the overall parameters of the network, thereby significantly improving the model detection efficiency. The model reaches the real-time inference benchmark (30 FPS) and is well suited to detection tasks in particular scenes.

FIGURE 9. Demonstration of pavement crack detection effect under three kinds of real noise (the color box represents the key area where the pavement crack recognition effect is different).

To better evaluate the recognition effect of the proposed method in crack segmentation, three kinds of crack images containing different backgrounds and crack shapes are selected to evaluate the identification performance. FIGURE 8 demonstrates the identification effect of the proposed method on different types of cracks for

VOLUME 11, 2023 89739


F. Song et al.: CCD Image-Based Pixel-Level Identification Model for Pavement Cracks

hydraulic concrete structures. It can be seen from the figure that the proposed method achieves good performance on different types of cracks in the test set, and that the results of neural network segmentation and crack identification are basically consistent.

FIGURE 9 shows the pavement crack identification effect of the developed method under real noise interference. It can be inferred from the results that the constructed model accurately identifies pavement cracks even in the presence of significant noise contamination (including paint, cigarette butts, and wooden sticks). It can also be seen that the geometric profile of the identified cracks is consistent with the real labeling results, indicating the effectiveness of the developed method.

V. CONCLUSION
Pavement cracking is a potentially harmful pavement disease that has attracted great attention from the engineering management community. If it is not dealt with properly, it seriously threatens driving safety and can cause serious accidents. Existing manual detection methods suffer from low efficiency, long inspection times, and poor accuracy, and they struggle to meet the needs of large-scale urban pavement crack diagnosis. To address this, this study combines artificial intelligence methods and CCD image technology to propose an automatic identification method for hidden cracks in pavement with high detection performance. A series of experimental comparisons and multi-angle verifications show that the proposed method still achieves higher detection accuracy and performance than other benchmark methods under complex road noise interference, which illustrates the feasibility of the idea.

A. LIMITATION
However, this study also has some limitations that need further explanation. First, the method developed in this study takes pavement cracks as the research object to verify its feasibility. It should be extended and applied to the identification of other types of pavement defects, including potholes, depressions, etc. The proposed method can be further combined with unmanned aerial photography remote sensing technology to realize real-time, efficient, and large-scale automatic identification and diagnosis of hidden cracks in the pavement, improving the efficiency and automation of pavement management. Also, laser scanning and ground radar photography can be combined to study the 3D reconstruction of pavement cracks and improve the understanding of pavement diseases.

REFERENCES
[1] W. Cao, Q. Liu, and Z. He, “Review of pavement defect detection methods,” IEEE Access, vol. 8, pp. 14531–14544, 2020, doi: 10.1109/ACCESS.2020.2966881.
[2] A. Zhang, K. C. P. Wang, B. Li, E. Yang, X. Dai, Y. Peng, Y. Fei, Y. Liu, J. Q. Li, and C. Chen, “Automated pixel-level pavement crack detection on 3D asphalt surfaces using a deep-learning network,” Comput.-Aided Civil Infrastruct. Eng., vol. 32, no. 10, pp. 805–819, Oct. 2017, doi: 10.1111/mice.12297.
[3] F. Guo, Y. Qian, Y. Wu, Z. Leng, and H. Yu, “Automatic railroad track components inspection using real-time instance segmentation,” Comput.-Aided Civil Infrastruct. Eng., vol. 36, no. 3, pp. 362–377, Mar. 2021, doi: 10.1111/mice.12625.
[4] L. Zhao, Y. Wu, X. Luo, and Y. Yuan, “Automatic defect detection of pavement diseases,” Remote Sens., vol. 14, no. 19, p. 4836, Sep. 2022, doi: 10.3390/rs14194836.
[5] C. Chen, S. Chandra, Y. Han, and H. Seo, “Deep learning-based thermal image analysis for pavement defect detection and classification considering complex pavement conditions,” Remote Sens., vol. 14, no. 1, p. 106, Dec. 2021, doi: 10.3390/rs14010106.
[6] L. Pei, Z. Sun, L. Xiao, W. Li, J. Sun, and H. Zhang, “Virtual generation of pavement crack images based on improved deep convolutional generative adversarial network,” Eng. Appl. Artif. Intell., vol. 104, Sep. 2021, Art. no. 104376, doi: 10.1016/j.engappai.2021.104376.
[7] L. Deng, A. Zhang, J. Guo, and Y. Liu, “An integrated method for road crack segmentation and surface feature quantification under complex backgrounds,” Remote Sens., vol. 15, no. 6, p. 1530, Mar. 2023, doi: 10.3390/rs15061530.
[8] Y. Shi, L. Cui, Z. Qi, F. Meng, and Z. Chen, “Automatic road crack detection using random structured forests,” IEEE Trans. Intell. Transp. Syst., vol. 17, no. 12, pp. 3434–3445, Dec. 2016, doi: 10.1109/TITS.2016.2552248.
[9] T. Rateke and A. von Wangenheim, “Road surface detection and differentiation considering surface damages,” Auton. Robots, vol. 45, no. 2, pp. 299–312, Feb. 2021, doi: 10.1007/s10514-020-09964-3.
[10] H. Maeda, T. Kashiyama, Y. Sekimoto, T. Seto, and H. Omata, “Generative adversarial network for road damage detection,” Comput.-Aided Civil Infrastruct. Eng., vol. 36, no. 1, pp. 47–60, Jan. 2021, doi: 10.1111/mice.12561.
[11] O. D. Adeniji, D. B. Adekeye, S. A. Ajagbe, A. O. Adesina, Y. J. Oguns, and M. A. Oladipupo, “Development of DDoS attack detection approach in software defined network using support vector machine classifier,” in Proc. ICPCSN. Singapore: Springer, 2022, pp. 319–331.
[12] S. A. Ajagbe and M. O. Adigun, “Deep learning techniques for detection and prediction of pandemic diseases: A systematic literature review,” Multimedia Tools Appl., vol. 6, pp. 1–35, 2023, doi: 10.1007/s11042-023-15805-z.
[13] N. Aravind, S. Nagajothi, and S. Elavenil, “Machine learning model for predicting the crack detection and pattern recognition of geopolymer concrete beams,” Construct. Building Mater., vol. 297, Aug. 2021, Art. no. 123785, doi: 10.1016/j.conbuildmat.2021.123785.
[14] A. Malekloo, E. Ozer, M. Alhamaydeh, and M. Girolami, “Machine learning and structural health monitoring overview with emerging technology and high-dimensional data source highlights,” Struct. Health Monit., vol. 21, no. 4, pp. 1906–1955, 2021, doi: 10.1177/14759217211036880.
[15] J. Huyan, W. Li, S. Tighe, Z. Xu, and J. Zhai, “CrackU-net: A novel deep convolutional neural network for pixelwise pavement crack detection,” Struct. Control Health Monitor., vol. 27, no. 8, pp. 1–19, Aug. 2020, doi: 10.1002/stc.2551.
[16] Y. Wu, Y. Qin, Y. Qian, F. Guo, Z. Wang, and L. Jia, “Hybrid deep learning architecture for rail surface segmentation and surface defect detection,” Comput.-Aided Civil Infrastruct. Eng., vol. 37, no. 2, pp. 227–244, Feb. 2022, doi: 10.1111/mice.12710.
[17] J. Liu, X. Yang, S. Lau, X. Wang, S. Luo, V. C. Lee, and L. Ding, “Automated pavement crack detection and segmentation based on two-step convolutional neural network,” Comput.-Aided Civil Infrastruct. Eng., vol. 35, no. 11, pp. 1291–1305, Nov. 2020, doi: 10.1111/mice.12622.
[18] F. Yang, L. Zhang, S. Yu, D. Prokhorov, X. Mei, and H. Ling, “Feature pyramid and hierarchical boosting network for pavement crack detection,” IEEE Trans. Intell. Transp. Syst., vol. 21, no. 4, pp. 1525–1535, Apr. 2020, doi: 10.1109/TITS.2019.2910595.
[19] Z. Fan, H. Lin, C. Li, J. Su, S. Bruno, and G. Loprencipe, “Use of parallel ResNet for high-performance pavement crack detection and measurement,” Sustainability, vol. 14, no. 3, p. 1825, Feb. 2022, doi: 10.3390/su14031825.
[20] Z. Fan et al., “Ensemble of deep convolutional neural networks for automatic pavement crack detection and measurement,” Coatings, vol. 10, no. 2, p. 152, 2020.
[21] J. Zhang, X. Yang, W. Li, S. Zhang, and Y. Jia, “Automatic detection of moisture damages in asphalt pavements from GPR data with deep CNN and IRS method,” Autom. Construct., vol. 113, May 2020, Art. no. 103119, doi: 10.1016/j.autcon.2020.103119.


[22] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in Proc. 18th Int. Conf. Med. Image Comput. Comput.-Assist. Intervent. (MICCAI), Munich, Germany, Oct. 2015, pp. 234–241.
[23] S. Feroz and S. A. Dabous, “UAV-based remote sensing applications for bridge condition assessment,” Remote Sens., vol. 13, no. 9, p. 1809, May 2021, doi: 10.3390/rs13091809.
[24] X. Li, X. Sun, Y. Meng, J. Liang, F. Wu, and J. Li, “Dice loss for data-imbalanced NLP tasks,” in Proc. 58th Annu. Meeting Assoc. Comput. Linguistics, 2020, pp. 465–476, doi: 10.18653/v1/2020.acl-main.45.
[25] S. Guan, A. A. Khan, S. Sikdar, and P. V. Chitnis, “Fully dense UNet for 2-D sparse photoacoustic tomography artifact removal,” IEEE J. Biomed. Health Informat., vol. 24, no. 2, pp. 568–576, Feb. 2020.
[26] V. Badrinarayanan, A. Kendall, and R. Cipolla, “SegNet: A deep convolutional encoder-decoder architecture for image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 12, pp. 2481–2495, Dec. 2017.
[27] C. Liu, L.-C. Chen, F. Schroff, H. Adam, W. Hua, A. L. Yuille, and L. Fei-Fei, “Auto-DeepLab: Hierarchical neural architecture search for semantic image segmentation,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2019, pp. 82–92.
[28] L. Yan, D. Liu, Q. Xiang, Y. Luo, T. Wang, D. Wu, H. Chen, Y. Zhang, and Q. Li, “PSP net-based automatic segmentation network model for prostate magnetic resonance imaging,” Comput. Methods Programs Biomed., vol. 207, Aug. 2021, Art. no. 106211, doi: 10.1016/j.cmpb.2021.106211.

FEI SONG received the master’s degree in technology of computer application from the Nanjing University of Finance and Economics, Nanjing, China, in 2014. She is currently a Lecturer with Jiangsu Open University. Her research interests include research and application of artificial intelligence and deep learning technology.

YU ZOU received the M.E.E. degree in electronics and communication engineering from the Nanjing University of Information Science and Technology, Nanjing, China, in 2019. He is currently pursuing the Ph.D. degree in artificial intelligence with the Nanjing University of Information Science and Technology. He is an Administrative Staff member of Jiangsu Open University. His research interests include research and application of artificial intelligence and deep learning technology.

WENSHA SHAO received the M.E. degree in technology of computer application from the Nanjing University of Finance and Economics, China, in 2014. She is currently the Manager with Jiangsu Open University. Her research interests include research and application of artificial intelligence and deep learning technology.

XIAOYUAN XU received the M.B.A. degree from Nanjing University, Nanjing, China, in 2007. She is currently an Associate Professor with Jiangsu Open University. Her research interests include research and application of data processing and cloud computing.
