Bigdata50022 2020 9378047
Department of Civil & Environmental Engineering, University of Missouri Columbia; WSP USA, 211 N Broadway, St. Louis MO 63108; [email protected]
Department of Civil & Environmental Engineering, University of Missouri Columbia, E2509 Lafferre Hall, Columbia, MO 65211; [email protected]
Department of Civil & Environmental Engineering, University of Missouri Columbia, E2509 Lafferre Hall, Columbia, MO 65211; [email protected]
Abstract— Automatic detection and classification of pavement distresses is critical for the timely maintenance and rehabilitation of pavement surfaces. With the evolution of deep learning and high performance computing, the feasibility of vision-based pavement defect assessments has significantly improved. In this study, the authors deploy state-of-the-art deep learning algorithms based on different network backbones to detect and characterize pavement distresses. The influence of different backbone models such as CSPDarknet53, Hourglass-104 and EfficientNet was studied to evaluate their classification performance. The models were trained using 21,041 images captured across urban and rural streets of Japan, Czech Republic and India. Finally, the models were assessed based on their ability to predict and classify distresses, and tested using the F1 score obtained from the statistical precision and recall values. The best performing model achieved F1 scores of 0.58 and 0.57 on the two test datasets released by the IEEE Global Road Damage Detection Challenge. The source code, including the trained models, is made available at [1].

Keywords—deep convolutional neural networks, deep learning, pavement distress, road crack detection

I. INTRODUCTION

The distresses in pavement surfaces present a potential threat to roadway and driving safety. For most state and local level transportation agencies, maintaining high quality pavement surfaces is critical to their daily endeavor of keeping roadways safe and is a cornerstone of a sustainable transportation infrastructure. Detecting distresses in a timely manner is often regarded as one of the most crucial steps in limiting further degradation and maintaining high quality pavement surfaces. In 2019, the United States federal government spent a total of 29 billion dollars on infrastructure, where almost half of the federal transportation spending went into highway and roadway infrastructure [2]. To optimize financial resources, there exists a need to periodically assess the condition of pavement surfaces and keep their maintenance in check. These maintenance assessments could be either manual or have some form of automation applied.

In manual assessments, technicians determine the condition of pavements and assign a rating that is likely to vary from one technician to another. In addition, these inspection techniques require domain expertise and field trips which can be tedious, unsafe and rather expensive [3]. To overcome the limitations of these manual techniques, optimal and cost-effective procedures could be applied that automate the detection and characterization of pavement distresses. Recently, automated evaluation techniques utilizing vehicle mounted cameras have been used to detect and characterize pavement distresses [4, 5]. Similarly, some state highway agencies have been collecting pavement condition data through automatic 3D surveys that acquire high resolution images capable of identifying damaged pavement surfaces [6, 7]. In such images, the level of distress can often be interpreted in the form of type, length and severity of pavement cracks. Since cracking normally appears in the initial stages of deterioration, proper detection techniques would allow maintenance programs to take timely action, which can ultimately save both time and money. Also, as pavement surfaces only deteriorate 40 percent in quality during the first three quarters of their lifetime, there exists a potential for restoration when distresses are detected and treated in time [8].

With aging infrastructure, state highway agencies have realized the importance of advanced computerized techniques in assessing pavement condition. By the end of 2012, over 35 state highway agencies in the US had deployed either automated or semi-automated image based techniques to obtain pavement cracking data [9]. In the current state of practice, image based computer vision techniques are quite common. Also, with the deteriorating highway quality in the US, where almost 1 out of every 5 miles of highway pavement warrants attention, the issue is certainly worth acting upon [10, 11]. The authors in this paper thus lay out an integrated approach that builds upon some of the advanced deep learning frameworks, utilizing the recent advances in high performance computing to detect pavement distresses from images captured using a vehicle mounted smartphone.

The paper is structured as follows: Section II discusses some of the related works relevant to this study, Section III describes an open-source pavement dataset, Section IV discusses the proposed method, Section V broadly explains the experimental
Authorized licensed use limited to: University of Prince Edward Island. Downloaded on May 29,2021 at 01:46:48 UTC from IEEE Xplore. Restrictions apply.
results, and Section VI closes with the final conclusions of the study.

II. RELATED WORK

In the past few years, many computer vision based studies have emerged that focus on the automatic detection of pavement distresses. Some of these studies include the use of local binary patterns [12, 13], Gabor filters [14], shape based methods [15] and tree structure algorithms [16] among many others. Although these algorithms are generally superior, their inability to capture discriminative features from images makes them incapable of differentiating crack and non-crack image pixels. Also, these methods fail to accurately detect distresses in real-world situations such as changing illumination and varied pavement textures. Deep learning, on the other hand, has shown tremendous potential for solving similar problems.

Recently, the application of deep learning in computer vision based detection of pavement distresses has been studied extensively. Based on how deep learning-based methods detect distresses, they can be broadly classified into three main categories, viz. pure image classification, pixel-level segmentation and object detection based methods [17]. In pure image classification, an image is divided into overlapping blocks and then the block images are separated into classes. Afterwards, a deep convolutional neural network (DCNN) decides which image blocks contain distresses. Classification based methods utilize both binary and multi-class differentiation techniques. Zhang et al. used convolutional neural networks (CNN) to reduce pavement images into smaller patch sizes and, based on the output probability, classified these patches as crack and no-crack [18]. A modified GoogLeNet has also been used to classify image blocks, captured using a smartphone, for detecting pavement cracks [19]. Multi-class classification typically comes in handy when detecting different classes of cracks. In [20], Li et al. used a DCNN to classify pavement cracks utilizing 3D images and further labeled those cracks into 5 different categories. In their approach, they deployed a total of 4 different CNNs for classification which, despite having various receptive field sizes, had little effect on the overall accuracy.

[...] encoder. Their method uses the multi-dilation module which can create inference from multi-featured cracks through the process of dilated convolution based on multiple rates. Furthermore, they optimized the multi-dilation features using the SE-Upsampling technique.

Likewise, object detection based methods typically locate distresses within a pavement image using a bounding box approach. These methods first extract discriminative features from an image using a CNN, then generate regions of interest (RoI) and finally detect objects through bounding box coordinates. YOLO [25] based crack detection that can detect different classes of cracks in near real-time is proposed in [26, 27]. Likewise, Li et al. in [28] deployed Faster R-CNN [29] to detect multiple crack types including potholes and alligator cracks. An important point to note is that Faster R-CNN is a two-stage detector: in its first stage it generates RoI, and in its second stage it passes the region proposals on for object classification and bounding box regression. This two-stage approach allows Faster R-CNN to obtain higher accuracy, but the overall process remains slower compared to single-stage detectors such as YOLO. In the current study, the authors have based their research on single-stage detectors that can detect pavement distresses in near real-time and can be scalable.

III. DATA

Road images released by the IEEE Big Data Global Road Damage Detection Challenge were used in this study [4, 30].

TABLE I. Pavement Distresses with their Corresponding Classes

Pavement Distress     Distress Class    Sample Image
Longitudinal Crack    D00               (image not reproduced)

These images were taken from a vehicle mounted smartphone
and captured scenes of urban and rural streets across three different countries, namely Japan, Czech Republic and India. The dataset comprised three sets altogether: train, test1 and test2. The train set contained 10,506 images from Japan, 2,829 from Czech Republic and 7,706 from India. Images in the train set were pre-annotated with their respective classes of pavement cracks, which served as ground truths. Table I illustrates all four pavement cracks classified as longitudinal, transverse, alligator and pothole. Similarly, test1 and test2 contain images without annotations, and were used in the evaluation of results. The total image counts for test1 and test2 were 2,631 and 2,664 respectively.

IV. PROPOSED METHODOLOGY

The objective of this study is to develop a simplified pavement surface assessment system leveraging deep learning algorithms. The proposed framework should be capable of detecting distresses that would most likely differ from one another due to the variation in pavement textures among the following countries: Japan, Czech Republic and India. Also, each class of pavement distress is regarded as a distinguishable object, and the proposed approach could be applied to any country based on fine-tuning of certain parameters. To learn the visual and textural patterns of each of these distress types, three single stage object detection algorithms, i.e. YOLO [31], CenterNet [32], and EfficientDet [33], were deployed. All these algorithms were implemented in PyTorch [34] and the network models were trained locally on an NVIDIA GTX 1080 Ti GPU and on the cloud based Google Colab platform. Since YOLO, CenterNet and EfficientDet detect objects using bounding boxes, the pavement distresses, which are intrinsically different from each other, need their visual patterns to be analyzed within each of their bounding boxes. A brief description of these single stage object detectors is presented as follows.

A. YOLO

You Only Look Once (YOLO) is a state-of-the-art object detection algorithm. Any object detector needs a certain network size (i.e. resolution) to detect fine textures of pavement distresses, a specific number of layers for an increased receptive field size and a significantly large number of parameters to further fine-tune. The most advanced iteration of YOLO meets these requirements by using the CSPDarknet53 backbone containing 29 convolutional layers of size 3 × 3, a receptive field of 725 × 725 and a total of 27.6 M parameters. Furthermore, the SPP block added over YOLO’s CSPDarknet53 increases the size of the receptive field without affecting its operating speed. Similarly, PANet is used to aggregate parameters from different backbone levels for different detector levels. Some of the advanced features such as weighted-residual-connections, cross-stage-partial-connections, cross mini-batch normalization (CmBN), [...]

[...] YOLO, allows mixing of 4 training images. This means that it helps mix 4 different contexts, while CutMix, another data augmentation feature, mixes 2 input images. Due to this, YOLO is normally able to detect objects outside their normal context, which in our case are unique distresses within the pavement image. Since batch normalization computes activation statistics from those 4 images on every layer, the need for a large mini-batch size is significantly reduced. Also, while applying YOLO to the problem of predicting pavement distresses, proper selection of hyperparameters is necessary. In this study, the YOLO models are trained with the following hyperparameters: a batch size of 64, an optimizer weight decay of 0.0005, an initial learning rate of 0.01 and a momentum of 0.937.

B. CenterNet

CenterNet, just like YOLO, is a state-of-the-art single stage object detection algorithm. However, it functions differently compared to YOLO or any other anchor based object detection method. Technically, CenterNet is an advanced version of CornerNet [35]. CornerNet is another anchor free method that uses a pair of corner keypoints to localize objects. However, CenterNet, unlike its predecessor, also uses the center information, allowing it to use triplets rather than just a pair of keypoints. The network architecture of CenterNet is shown in Fig. 1. As seen from Fig. 1, center pooling and cascade corner pooling are applied to enrich the center and corner information. Here, center pooling gathers useful and more identifiable visual patterns by extracting the maximum values in both horizontal and vertical directions. Center pooling proceeds in the following order: first, a feature map is obtained as an output from the backbone; then, to verify whether a pixel in the feature map is an actual center keypoint, the maximum values in both its horizontal and vertical directions are found and added together. This, in turn, helps center pooling obtain superior detection of center keypoints.

Similarly, cascade corner pooling collects the maximum values in both the boundary and internal directions of an object. This feature helps in extracting not just the boundary information but also the objects’ visual patterns. In the current task of detecting pavement distresses, CenterNet’s Hourglass-104 backbone was used. While training CenterNet models, a learning rate of 0.000025 was used throughout training. CenterNet had an average inference time of 340 ms per image.

[Fig. 1. Network architecture of CenterNet. Recoverable figure labels: CNN, Cascade Corner Pooling, Corner Heatmaps, Embeddings & offsets]
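As a rough illustration of the center pooling step described for CenterNet (the maximum along a pixel's row added to the maximum along its column), the following pure-Python sketch operates on a small 2D list. The function name `center_pooling` and the toy feature map are hypothetical and chosen only for illustration; the published CenterNet implementation applies this operation to learned convolutional feature maps inside the network, which this sketch does not attempt to reproduce.

```python
from typing import List


def center_pooling(feature_map: List[List[float]]) -> List[List[float]]:
    """For every pixel, add the maximum of its row to the maximum of its column."""
    if not feature_map or not feature_map[0]:
        return []
    # Maximum response along the horizontal direction, one value per row.
    row_max = [max(row) for row in feature_map]
    # Maximum response along the vertical direction, one value per column.
    col_max = [max(col) for col in zip(*feature_map)]
    # Each output pixel (i, j) is the sum of its row maximum and column maximum.
    return [[r + c for c in col_max] for r in row_max]


# Toy 3 x 3 feature map (made-up values) with a strong response in the middle.
fmap = [
    [0.1, 0.2, 0.0],
    [0.3, 0.9, 0.4],
    [0.0, 0.5, 0.1],
]
pooled = center_pooling(fmap)
```

In this toy input the middle pixel lies on both the strongest row and the strongest column, so it receives the largest pooled value, which is the intuition behind using the pooled map to confirm center keypoints.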
C. EfficientDet

A scalable and efficient single-stage detector called EfficientDet [33] was deployed to predict distresses and to compare its performance with the other detectors used in the study. EfficientDet combines the EfficientNet [36] backbone with a bi-directional feature pyramid network (BiFPN) and the power of compound scaling. The BiFPN uses learnable weights to learn the importance of various input features and applies top-down and bottom-up fusion of features. Similarly, compound scaling scales the resolution, depth and width of the backbone, the feature network, and the bounding box and class prediction networks. In this study, the EfficientDet model was trained using the SGD optimizer with a momentum of 0.9 and a weight decay of 0.00004. Similarly, the learning rate was increased gradually from 0 to 0.16.

D. Training and Post-Processing

To speed up training and improve performance, transfer learning was used. In the field of computer vision and deep learning, transfer learning has been extensively used in optimizing learning techniques. Using this approach, the knowledge gained from a previous task can be reused to improve generalization on a new task. In this study, we used YOLO weights pre-trained on ImageNet as the initialization for the distress detection task. Also, multiple transfer learning strategies were analyzed using different weights. As the training dataset consists of images from three different countries, the pavement and cracking textures were somewhat dissimilar from one another. Although the images from Japan and Czech Republic shared some similarity, they were still quite different from the pavement images taken from India. Therefore, a single model trained on the comprehensive three-country dataset was expected to yield inferior detection and characterization results. Taking that into account, one model is trained to detect pavement distresses in Japan and Czech Republic images only, while a separate model is trained to detect distresses from India. While running on an NVIDIA GTX 1080 Ti GPU, the proposed models achieved an inference speed of approximately 65 frames per second.

Similarly, the YOLO based model proposed for detecting pavement distresses in the Japan and Czech datasets, trained for 54 epochs, was roughly 381 megabytes in size, whereas the model meant for India, trained for 100 epochs, was about 361 megabytes. A heavier model aimed at the Japan and Czech datasets and trained for 66 epochs was about 729 megabytes in size. Here, the total training time was around 15 hours. To understand the performance capability of YOLO based models, the network was trained for 300 epochs with a constant learning rate, weight decay and momentum. The models trained for different numbers of epochs were tested on the test1 and test2 datasets released by the IEEE Big Data Global Road Damage Detection Challenge. Through these experimental results, it was observed that the models trained for 50-70 epochs performed comparatively better than those trained for 150-300 epochs. Similarly, both CenterNet and EfficientDet were used for comparative analysis. It was inferred that CenterNet and EfficientDet models trained for 300 epochs had similar performance to the ones trained for fewer epochs. The overall training time of CenterNet and EfficientDet for 300 epochs was roughly 75-80 hours each on the same GPU hardware resources shared by YOLO.

In the post-processing stage, Intersection over Union (IoU) values were iterated and further analyzed for the YOLO model. Although IoU is most commonly used as an evaluation metric, we also used it to remove duplicate bounding boxes predicted for the same pavement distress. This was achieved by sorting all the predicted distresses present in an image in descending order of their confidence values. When two bounding boxes point to the same distress type and class, their IoU is likely to have a very high value. In that case, the box with the highest confidence would be chosen and the other one would be discarded.

V. RESULTS

The performance of the proposed YOLO, CenterNet and EfficientDet models was evaluated on a set of 2,631 images present in test1 and 2,664 images in test2. In both test1 and test2, about 50% of the images were taken from Japan, 37% from India and around 13% from Czech Republic. The F1 score, shown in equation (1), was used as the evaluation metric. The F1 score measures the model’s accuracy using the harmonic mean of precision and recall. Here, precision, as shown in equation (2), is the fraction of true positive (tp) examples amongst the retrieved samples, i.e. true positives (tp) and false positives (fp). Similarly, recall, shown in equation (3), is also known as sensitivity. It is the portion of the total number of relevant samples that were successfully retrieved. In other words, it is the ratio of true positives (tp) to the sum of true positives (tp) and false negatives (fn). For any distress prediction to be considered a true positive, it had to have an Intersection over Union (IoU) value greater than or equal to 0.5 with the ground truth, and the predicted and ground truth classes had to match exactly. Likewise, if a prediction obtained an IoU greater than or equal to 0.5 but was assigned a wrong distress class, then it was counted as a false positive. False negatives were the distresses for which no prediction was generated at all.

\[ F_1 = 2 \times \frac{\mathit{precision} \times \mathit{recall}}{\mathit{precision} + \mathit{recall}} \tag{1} \]

\[ \mathit{precision} = \frac{tp}{tp + fp} \tag{2} \]

\[ \mathit{recall} = \frac{tp}{tp + fn} \tag{3} \]

Some of the true positive examples of predicted pavement distresses for all three countries are shown in Fig. 2. The columns from left to right show the longitudinal crack (D00), transverse crack (D10), alligator crack (D20) and potholes (D40). The predictions for Japan and Czech Republic
[Fig. 2. Predicted Pavement Distresses: (a) Japan (b) Czech Republic (c) India. Columns from left to right: D00, D10, D20, D40]
are made using the same YOLO model trained for 66 epochs whereas for India, a different model trained for 100 epochs is deployed. Similarly, some of the misclassifications or false positives are shown in Fig. 3 for all three algorithms. In Fig. 3(a), YOLO generates an incorrect prediction due to the presence of a large shadow on the upper portion of the image. It was observed that YOLO produced erroneous detections for images with appreciable shadow formation. A possible remedy could be to augment the training database with a sufficient number of images covering such scenarios. Similarly, in Fig. 3(b), CenterNet misclassified potholes and had several overlapping detections. Also, the bounding box for the alligator crack exceeds the actual cracking area. A similar issue is observed in Fig. 3(c), where the area of the non-crack region exceeds the regular size and the detection suffers from misclassification. EfficientDet suffers from an inability to detect abstract features, such as cases where the texture of distresses differs from what is normally expected.

To attain the best prediction outcome, YOLO models were trained for different numbers of epochs and tested by varying the IoU and confidence thresholds. From our analysis, the models trained for 66 epochs on the Japan and Czech dataset and for 100 epochs on the India dataset proved to be the best performing. Similarly, confidence thresholds lower than 0.1 would produce several false positives, which would reduce precision. Also, increasing the IoU value beyond 0.9 would increase the possibility of predicting multiple overlapping boxes that could either potentially match or come closer to the ground truths. Experimental results demonstrate that an IoU of 1.0 and a confidence threshold of 0.17 would attain the best F1 score. Table II shows the test results obtained for all three models used in this study. YOLO achieved F1 scores of 0.5814 and 0.5751 on test1 and test2 respectively. It is interesting to note that both CenterNet and EfficientDet underperformed YOLO by a margin of almost 20 to 25 percent, making YOLO a more suitable detector for this study. Although both CenterNet and
EfficientDet possessed superior detection capability, they struggled at detecting transverse cracks and at times certain longitudinal cracks. While YOLO also faced problems detecting longitudinal and transverse cracks, its performance at detecting such cracks on Japan images was not as poor as it was on the other two. However, predicting distresses and characterizing them as transverse cracks appeared challenging. A deficit of transverse cracks in the training database could have been another possible reason. Similarly, despite having trouble detecting transverse cracks, both YOLO and CenterNet did well at predicting alligator cracks and potholes. EfficientDet, on the other hand, had difficulty detecting alligator cracks and tended to make predictions using larger bounding boxes that covered a greater portion of non-distressed area.

TABLE II. Performance comparison of different algorithms

Dataset  Model         Backbone       Precision  Recall   F1 score
test1    YOLO          CSPDarknet53   0.59021    0.57296  0.5814
test1    CenterNet     Hourglass-104  0.48825    0.47665  0.4823
test1    EfficientDet  EfficientNet   0.44269    0.43001  0.4362
test2    YOLO          CSPDarknet53   0.57986    0.57058  0.5751
test2    CenterNet     Hourglass-104  0.47865    0.47395  0.4762
test2    EfficientDet  EfficientNet   0.44879    0.43958  0.4441

VI. CONCLUSION

This study presents an automated approach to pavement distress detection and characterization using deep learning. Taking advantage of the pre-annotated database of training data, deep learning based models are trained on different network architectures. Post-processing parameters such as the optimal confidence thresholds and increased IoU values were applied for different model versions. Overall, the best models achieved F1 scores of 0.5814 and 0.5751 on the test1 and test2 datasets released by the IEEE Global Road Damage Detection Challenge. Our observations suggest that these models performed generally well at detecting alligator cracks and potholes but had difficulty detecting transverse cracks. In particular, for the dataset available from India, very few instances of transverse cracks were seen. This caused a few misclassifications for images from India and prompted confusion with the longitudinal cracks. Also, the distress classes appeared somewhat unbalanced for the data available from India and Czech Republic. Therefore, some instances of false negative predictions were obtained. Regardless, a complete spectrum of distresses from the Japan dataset helped the model achieve largely accurate predictions, which ultimately raised the comprehensive three-country F1 score. The study was conducted as part of the IEEE Global Road Damage Detection Challenge and the proposed solution was ranked 4th out of the 121 teams worldwide that participated in the challenge [37].

REFERENCES

[1] R. C. Detection. [Online]. Available: https://github.com/titanmu/RoadCrackDetection.
[2] T. I. USA Facts. [Online]. Available: https://usafacts.org/state-of-the-union/transportation-infrastructure/.
[3] A. Chatterjee and Y.-C. Tsai, "A fast and accurate automated pavement crack detection algorithm," in 2018 26th European Signal Processing Conference (EUSIPCO), 2018: IEEE, pp. 2140-2144.
[4] D. Arya et al., "Transfer Learning-based Road Damage Detection for Multiple Countries," arXiv preprint arXiv:2008.13101, 2020.
[5] H. Maeda, Y. Sekimoto, T. Seto, T. Kashiyama, and H. Omata, "Road damage detection using deep neural networks with images captured through a smartphone," arXiv preprint arXiv:1801.09454, 2018.
[6] Y. J. Tsai and Z. Wang, "Development of an asphalt pavement raveling detection algorithm using emerging 3D laser technology and macrotexture analysis," 2015.
[7] K. C. Wang, Q. J. Li, G. Yang, Y. Zhan, and Y. Qiu, "Network level pavement evaluation with 1 mm 3D survey system," Journal of traffic and transportation engineering (English edition), vol. 2, no. 6, pp. 391-398, 2015.
[8] M. Gavilán et al., "Adaptive road crack detection system by pavement classification," Sensors, vol. 11, no. 10, pp. 9628-9657, 2011.
[9] W. Vavrik, L. Evans, S. Sargand, and J. Stefanski, "PCR evaluation: considering transition from manual to semi-automated pavement distress collection and analysis," 2013.
[10] ASCE, "2017 infrastructure report card," 2017: ASCE Reston, VA.
[11] K. Gopalakrishnan, S. K. Khaitan, A. Choudhary, and A. Agrawal, "Deep convolutional neural networks with transfer learning for computer vision-based data-driven pavement distress detection," Construction and Building Materials, vol. 157, pp. 322-330, 2017.
[12] Y. Hu and C.-x. Zhao, "A novel LBP based methods for pavement crack detection," Journal of pattern Recognition research, vol. 5, no. 1, pp. 140-147, 2010.
[13] A. Miraliakbari, S. Sok, Y. O. Ouma, and M. Hahn, "Comparative Evaluation of Pavement Crack Detection Using Kernel-Based Techniques in Asphalt Road Surfaces," International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 1, 2016.
[14] M. Salman, S. Mathavan, K. Kamal, and M. Rahman, "Pavement crack detection using the Gabor filter," in 16th international IEEE conference on intelligent transportation systems (ITSC 2013), 2013: IEEE, pp. 2039-2044.
[15] T. Wang, K. Gopalakrishnan, A. K. Somani, O. G. Smadi, and H. Ceylan, "Machine-vision-based roadway health monitoring and assessment: Development of a shape-based pavement-crack-detection approach," 2016.
[16] Q. Zou, Y. Cao, Q. Li, Q. Mao, and S. Wang, "CrackTree: Automatic crack detection from pavement images," Pattern Recognition Letters, vol. 33, no. 3, pp. 227-238, 2012.
[17] W. Cao, Q. Liu, and Z. He, "Review of pavement defect detection methods," IEEE Access, vol. 8, pp. 14531-14544, 2020.
[18] L. Zhang, F. Yang, Y. D. Zhang, and Y. J. Zhu, "Road crack detection using deep convolutional neural network," in 2016 IEEE international conference on image processing (ICIP), 2016: IEEE, pp. 3708-3712.
[19] S. Li and X. Zhao, "Convolutional neural networks-based crack detection for real concrete surface," in Sensors and Smart Structures Technologies for Civil, Mechanical, and Aerospace Systems, 2018.
[20] B. Li, K. C. Wang, A. Zhang, E. Yang, and G. Wang, "Automatic classification of pavement crack using deep convolutional neural
network," International Journal of Pavement Engineering, vol. 21, no. 4, pp. 457-463, 2020.
[21] Z. Fan, Y. Wu, J. Lu, and W. Li, "Automatic pavement crack detection based on structured prediction with the convolutional neural network," arXiv preprint arXiv:1802.02208, 2018.
[22] M. D. Jenkins, T. A. Carr, M. I. Iglesias, T. Buggy, and G. Morison, "A deep convolutional neural network for semantic pixel-wise segmentation of road and pavement surface cracks," in 2018 26th European Signal Processing Conference (EUSIPCO), 2018: IEEE, pp. 2120-2124.
[23] Q. Zou, Z. Zhang, Q. Li, X. Qi, Q. Wang, and S. Wang, "Deepcrack: Learning hierarchical convolutional features for crack detection," IEEE Transactions on Image Processing, vol. 28, no. 3, pp. 1498-1512, 2018.
[24] W. Liu, Y. Huang, Y. Li, and Q. Chen, "FPCNet: Fast pavement crack detection network based on encoder-decoder architecture," arXiv preprint arXiv:1907.02248, 2019.
[25] J. Redmon and A. Farhadi, "YOLO9000: better, faster, stronger," in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 7263-7271.
[26] V. Mandal, L. Uong, and Y. Adu-Gyamfi, "Automated road crack detection using deep convolutional neural networks," in 2018 IEEE International Conference on Big Data (Big Data), 2018: IEEE, pp. 5212-5215.
[27] H. Majidifard, Y. Adu-Gyamfi, and W. G. Buttlar, "Deep machine learning approach to develop a new asphalt pavement condition index," Construction and Building Materials, vol. 247, p. 118513, 2020.
[28] J. Li, X. Zhao, and H. Li, "Method for detecting road pavement damage based on deep learning," in Health Monitoring of Structural and Biological Systems XIII, 2019, vol. 10972: International Society for Optics and Photonics, p. 109722D.
[29] S. Ren, K. He, R. Girshick, and J. Sun, "Faster r-cnn: Towards real-time object detection with region proposal networks," in Advances in neural information processing systems, 2015, pp. 91-99.
[30] H. Maeda, T. Kashiyama, Y. Sekimoto, T. Seto, and H. Omata, "Generative adversarial network for road damage detection," Computer-Aided Civil and Infrastructure Engineering.
[31] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, "YOLOv4: Optimal Speed and Accuracy of Object Detection," arXiv preprint arXiv:2004.10934, 2020.
[32] K. Duan, S. Bai, L. Xie, H. Qi, Q. Huang, and Q. Tian, "Centernet: Keypoint triplets for object detection," in Proceedings of the IEEE International Conference on Computer Vision, 2019, pp. 6569-6578.
[33] M. Tan, R. Pang, and Q. V. Le, "Efficientdet: Scalable and efficient object detection," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 10781-10790.
[34] A. Paszke et al., "Automatic differentiation in pytorch," 2017.
[35] H. Law and J. Deng, "Cornernet: Detecting objects as paired keypoints," in Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 734-750.
[36] M. Tan and Q. V. Le, "Efficientnet: Rethinking model scaling for convolutional neural networks," arXiv preprint arXiv:1905.11946, 2019.
[37] D. Arya et al., "Global Road Damage Detection: State-of-the-art Solutions," arXiv preprint arXiv:2011.08740, 2020.