YOLOv8 and Point Cloud Fusion for Enhanced Road Pothole Detection and Quantification
Keywords YOLOv8, Pothole detection, Point cloud data, Damage quantification, Depth camera
The longevity of road surfaces and overall traffic safety are significantly affected by road faults, underscoring the
heightened importance of road condition evaluation in contemporary maintenance practices1,2. Among these
issues, surface potholes are regarded as one of the most hazardous, posing potential risks to vehicle safety and
compromising the structural integrity of road surfaces. As stated in paper3, the primary approach for pothole
detection remains manual visual inspection. However, this method suffers from limitations in accuracy due to
subjective judgment and lacks detection efficiency. Thus, there is a critical need for the development of intelligent
pothole detection techniques. For a more comprehensive evaluation of road hazards, detection must extend
beyond mere presence confirmation to include precise localization and geometric characterization of potholes.
Furthermore, expanding detection capabilities to allow real-time identification of potholes along the road ahead
is critical for improving efficiency. Although advancements in intelligent pothole detection have enhanced both
timeliness and accuracy, substantial challenges remain4, which numerous researchers have sought to address5–7.
Advancements in sensor technology and image processing techniques have led to a marked increase in academic
and industry research focused on road damage identification. Numerous detection methods with varying
levels of accuracy and applicability have been proposed8–10. The primary approaches include vibration-based
techniques11–13, image-based techniques14–17, and point cloud-based techniques18–23.
Vibration-based techniques assess road damage by measuring how vehicles dynamically respond to surface
flaws. For instance, the authors of11 used gyroscope and accelerometer data to develop a classifier that categorizes
roads as either flat or containing potholes. Another study12 proposed a pothole detection system with a hardware
cost of approximately $60, utilizing triaxial accelerometers and GPS modules to collect raw data. While these
methods are cost-effective and capable of real-time operation, they provide only qualitative results. Additionally,
vehicles must traverse potholes to detect the associated vibrations, and uneven surfaces may be misidentified as
potholes, which can diminish accuracy and performance over longer distances.
Image-based approaches for pothole identification typically involve capturing photos or videos with
cameras, followed by processing with image analysis and machine learning algorithms. These methods enable
rapid classification of potholes, significantly enhancing detection accuracy. Studies24–26 applied YOLO and its
advanced models to detect potholes in images, yielding promising results. However, image-based techniques often struggle in adverse weather or low-light conditions. For example, the dataset used in27 leveraged the YOLO v3 computer vision model for automated pothole detection, incorporating images captured under various lighting and weather conditions. The study indicated that while environmental changes can significantly affect detection accuracy, incorporating images of potholes from harsh environments during the training phase allowed the model to achieve robust detection performance even in complex conditions. Nevertheless, image-based methods primarily focus on the presence of potholes rather than extracting detailed geometric information. In practical applications, stains on asphalt surfaces and road patches are often erroneously identified as potholes, thereby reducing detection accuracy28. Additionally, supervised learning techniques require large amounts of well-labeled training data for effective stereo matching, presenting challenges for real-world implementation29.

1 Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China. 2 University of Science and Technology of China, Hefei 230026, China. email: [email protected]; [email protected]
Depth cameras and multi-line LiDAR (Light Detection and Ranging) are commonly used tools in point
cloud-based techniques, though LiDAR’s high cost remains a limitation. Studies30,31 have demonstrated the
effectiveness of depth cameras for road assessment, utilizing Kinect cameras to detect cracks and potholes. Point
cloud-based techniques allow precise capture of pothole geometries. For example, a clustering approach for
identifying cracks in laser-scanned point cloud data was introduced in32. Additionally, roughness descriptors
have been applied to segment and classify stone and asphalt surfaces33, and 3D fracture skeletons have been
extracted from mobile LiDAR point clouds using the Otsu thresholding technique34. Choi et al.35 applied a
multi-resolution segmentation technique to identify road cracks in point cloud images, while Chen and Li36 used
a high-pass convolution method to locate cracks in asphalt. Although various methods have been developed to
detect damage from laser-scanned data, the effectiveness of detection is highly dependent on the resolution of
the point cloud data. Detecting potholes accurately relies on high-resolution point clouds, but extracting precise
edge information remains challenging due to the discrete nature of sampled data. As noted in related studies37,
the computational efficiency of pothole detection from point clouds is generally lower than that of image-based
methods, primarily due to the large data volume, which gives image-based techniques a comparative advantage
in efficiency38.
To address these challenges, we propose a pothole detection method that integrates point cloud and image
data. Our main contributions are as follows:
• A significant reduction in false positive rates for pothole identification, improving detection accuracy.
• Enhanced precision in determining the 3D dimensions of detected potholes.
• Development of a classification framework for real-time assessment of pothole severity, categorizing potholes
based on the extent of road damage.
Methodology
Research framework
This paper presents a novel pothole detection and measurement method that integrates YOLOv8 with 3D point
cloud data for rapid and precise road surface assessment. The proposed approach enables calculation of key
geometric metrics, including pothole perimeter, area, and depth, as outlined in Fig. 1. The method comprises five
primary stages: (1) Image and depth data acquisition: Using an RGB-D depth camera, road surface imagery and
depth data are collected to capture detailed spatial information; (2) Detection and localization: A deep learning-
based YOLOv8 model detects potholes and generates bounding boxes, providing initial regions of interest; (3)
3D point cloud extraction: The 3D point cloud data corresponding to each detected bounding box is extracted to isolate the relevant pothole data; (4) Edge point filtering and component analysis: Exploiting the inevitable surface
irregularity caused by potholes, the point cloud data from Step 3 is filtered to identify sharp edge points along the
pothole perimeter; (5) Geometric computation of pothole dimensions: The boundary point cloud data extracted
in Step 4 is used to perform precise 3D geometric calculations, yielding metrics for perimeter, surface area, and
depth.
Fig. 1. Workflow of the proposed pothole detection and measurement method combining YOLOv8 and 3D
point cloud data.
Here, (x′, y′, z′) denote the spatial coordinates within the point cloud, with D as the corresponding depth value, while (x, y) refer to the pixel coordinates within the RGB image plane. Following the transformation of these coordinates, points constrained by the bounding box, defined by its top-left and bottom-right vertices, are designated as the point cloud region of interest (ROI). The subset of points meeting these boundary conditions is represented as the set P, formulated as follows:

P = {(X, Y, Z) | x_l < X < x_r, d_l < Y < d_r, y_l < Z < y_r}    (2)
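The back-projection and ROI selection around Eq. (2) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the pinhole intrinsics (fx, fy, cx, cy) and the convention of mapping depth onto the z axis are assumptions, since the paper does not state them.

```python
import numpy as np

def pixels_to_points(us, vs, depths, fx, fy, cx, cy):
    """Back-project pixel coordinates and depth values to 3D camera
    coordinates with a pinhole model (intrinsics are assumed here;
    the paper does not list them)."""
    zs = depths
    xs = (us - cx) * zs / fx
    ys = (vs - cy) * zs / fy
    return np.column_stack([xs, ys, zs])

def roi_points(points, x_l, x_r, y_l, y_r, d_l, d_r):
    """Keep only points inside the box defined by the detection's
    top-left/bottom-right corners plus a depth interval, mirroring
    the set P in Eq. (2); axis naming is an assumption."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    mask = (x_l < x) & (x < x_r) & (d_l < z) & (z < d_r) & (y_l < y) & (y < y_r)
    return points[mask]
```

In practice the bounding-box corners come from the YOLOv8 detection and the depth interval from the depth camera's valid range.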
3D contour extraction
To enhance the accuracy of 3D contour and geometric feature detection for potholes, this study converts the
2D candidate regions identified in the image into corresponding 3D target regions. Within these regions, a
feature analysis of the pothole point cloud is performed. By leveraging the characteristic that potholes disrupt
the smoothness of surface geometry, the boundary points defining the pothole contour are extracted from the
point cloud.
d̄ = (1/n²) Σ_{i=1}^{n} Σ_{j=1}^{n} ‖p_i − p_j‖    (3)

where p_i = (x_i, y_i, z_i) and p_j = (x_j, y_j, z_j) are two points in the point cloud, and ‖p_i − p_j‖ represents the Euclidean distance between points p_i and p_j.
Subsequently, the normal vector n_i is calculated for each filtered triangle. Here, e_1 and e_2 denote two edge vectors of the triangle formed from adjacent vertices; the normal vector of the triangle is then determined as the normalized cross product of these two edge vectors, as described by the following equation:

n_i = (e_1 × e_2) / ‖e_1 × e_2‖    (4)
Next, the normal vectors of the surrounding triangles are associated with each triangle. For any given triangle △_i, all adjacent triangles are identified as T_i = {△_1, △_2, . . . , △_m}, associating the normal vectors n_1, n_2, . . . , n_m of these neighboring triangles with the central triangle. To detect potential boundary points, the angle θ_ij between neighboring normal vectors is calculated, with the expression given as follows:

θ_ij = cos⁻¹( (n_i · n_j) / (‖n_i‖ ‖n_j‖) )    (5)

If the angle θ_ij between the normal vectors of any adjacent triangles exceeds a specified threshold θ_th, this indicates a significant change in smoothness between △_i and its surroundings, and the vertices of △_i are identified as potential boundary points. All selected boundary points are then compiled into the set P_B:

P_B = {p_i | θ_ij > θ_th}    (6)
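Eqs. (4)-(6) can be sketched as follows. This is a minimal illustration under one stated assumption: triangle adjacency is taken to mean shared-edge adjacency, since the paper does not spell out its neighbourhood definition.

```python
import numpy as np
from collections import defaultdict

def triangle_normals(verts, tris):
    """Unit normals via the cross product of two edge vectors (Eq. 4)."""
    e1 = verts[tris[:, 1]] - verts[tris[:, 0]]
    e2 = verts[tris[:, 2]] - verts[tris[:, 0]]
    n = np.cross(e1, e2)
    return n / np.linalg.norm(n, axis=1, keepdims=True)

def boundary_points(verts, tris, theta_th):
    """Collect vertices of triangles whose normal deviates from an
    edge-adjacent neighbour by more than theta_th (Eqs. 5-6)."""
    normals = triangle_normals(verts, tris)
    edge_to_tris = defaultdict(list)
    for t, tri in enumerate(tris):
        for a, b in ((0, 1), (1, 2), (2, 0)):
            edge_to_tris[frozenset((tri[a], tri[b]))].append(t)
    flagged = set()
    for adj in edge_to_tris.values():          # triangles sharing an edge
        for i in range(len(adj)):
            for j in range(i + 1, len(adj)):
                cos = np.clip(normals[adj[i]] @ normals[adj[j]], -1.0, 1.0)
                if np.arccos(cos) > theta_th:  # sharp smoothness change
                    flagged.update(tris[adj[i]])
                    flagged.update(tris[adj[j]])
    return sorted(int(v) for v in flagged)
```

Two coplanar triangles produce no boundary points; folding one of them past the threshold flags the vertices of both.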
At this stage, however, the set PB may contain some noise points, such as other surface irregularities that do not
correspond to the true pothole boundary and thus require further refinement.
Boundary extraction
To filter out noise points and accurately extract pothole boundary points, we employed the DBSCAN (Density-
Based Spatial Clustering of Applications with Noise) clustering algorithm. DBSCAN is an unsupervised density-
based clustering method that automatically forms clusters based on the spatial distribution of data points,
without the need to predefine the number of clusters. The algorithm determines whether a point is a core point,
a border point, or a noise point by setting two key parameters: the neighborhood radius (ϵ) and the minimum
number of neighbors (minPts). This effectively removes isolated points and low-density noise, improving the
accuracy of pothole boundary extraction.
In this study, DBSCAN is used to identify pothole boundary points and filter out noise points from non-
target regions. The algorithm first searches for high-density areas in the point cloud data and classifies them
into the same cluster. Specifically, if a point has at least minPts neighbors within a radius ϵ, it is considered a
core point and forms a new cluster. If a point is not a core point but lies within the ϵ-neighborhood of a core
point, it is labeled as a boundary point and assigned to the cluster of the core point. If a point is neither a core
point nor within the neighborhood of any core point, it is considered a noise point and is removed. DBSCAN’s
performance heavily depends on two key parameters: the neighborhood radius (ϵ) and the minimum number of
neighbors (minPts). Therefore, it is essential to carefully select these parameters to ensure effective extraction of
pothole boundaries and the filtering of noise points.
The neighborhood radius (ϵ) determines the maximum distance at which two points in the point cloud are
considered neighbors. If ϵ is set too small, true pothole boundary points may be misclassified as noise points.
If ϵ is set too large, different pothole regions may be incorrectly merged into a single cluster, compromising
boundary extraction accuracy. In this study, ϵ is set based on the average point distance of the point cloud data,
ensuring that the formation of clusters matches the actual shape of the potholes. Specifically, we calculate the
average distance to the K-nearest neighbors for all points in the dataset and set ϵ to twice the average point
distance. This ensures that boundary points are properly grouped while effectively filtering out outlier noise
points.
The minimum number of neighbors (minPts) determines the minimum number of neighbors required for
a point to be classified as a core point and plays a significant role in clustering connectivity and noise point
identification. If minPts is set too low, isolated points may be mistakenly classified as boundary points, reducing
the robustness of the clustering. If minPts is set too high, sparse pothole boundary points may fail to form valid
clusters, affecting the completeness of boundary extraction. In this study, based on the density of the point cloud
data and the distribution characteristics of boundary points, minPts=10 is selected to ensure the continuity of
pothole boundary points, while effectively removing isolated noise points and improving detection accuracy.
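The parameter selection described above can be sketched in a few lines. This is a brute-force O(n²) illustration, not the authors' implementation, and the neighbour count k in the adaptive ε rule is an assumption (the text says "K-nearest neighbors" without giving K).

```python
import numpy as np

def adaptive_eps(points, k=10):
    """eps = 2x the mean distance to the k nearest neighbours,
    following the selection rule described in the text."""
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    knn = np.sort(dist, axis=1)[:, 1:k + 1]   # column 0 is the self-distance
    return 2.0 * knn.mean()

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: one label per point, -1 marking noise."""
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    neigh = [np.flatnonzero(row <= eps) for row in dist]
    labels = np.full(len(points), -1)
    cluster = 0
    for i in range(len(points)):
        if labels[i] != -1 or len(neigh[i]) < min_pts:
            continue                           # visited, or not a core point
        labels[i] = cluster
        stack = list(neigh[i])
        while stack:                           # expand the cluster
            j = stack.pop()
            if labels[j] == -1:
                labels[j] = cluster
                if len(neigh[j]) >= min_pts:   # j is itself a core point
                    stack.extend(neigh[j])
        cluster += 1
    return labels
```

A dense patch of boundary points forms one cluster, while an isolated point far from the pothole is labelled -1 and discarded.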
By employing this parameter selection method, DBSCAN can adaptively extract pothole boundaries in
this study and efficiently filter out environmental noise points, thus providing high-quality data support for
subsequent pothole depth analysis and geometric feature extraction. Experimental results demonstrate that
this approach maintains efficient and stable performance in complex scenarios, enhancing the robustness and
accuracy of pothole detection.
We first perform dimensionality reduction on the filtered 3D point cloud data by projecting it onto the
xoy plane, thereby obtaining a planar point cloud dataset without height information. Next, the Alpha-Shape
algorithm is applied for boundary extraction to identify the contours of the pit areas. The Alpha parameter
(α) plays a crucial role in the Alpha-Shape algorithm, as it determines the tightness and smoothness of the
boundary. An appropriate value of α ensures the accuracy of the boundary extraction, while too large or too
small an α can result in incomplete or overfitted boundaries. Typically, the choice of α depends on the density
and spatial distribution characteristics of the point cloud data, ensuring the rationality of the boundary.
Let the point set of the region of interest be p = {(x_1, y_1), . . . , (x_n, y_n)}, where i indexes the points and there are n points in total, forming n(n − 1)/2 candidate line segments. For any two points p_1(x_1, y_1) and p_2(x_2, y_2) in the point set p, we draw circles of radius α passing through these two points and calculate the two circle centers, denoted p_c(x_c, y_c), expressed as follows:
x_c = x_1 + (1/2)(x_2 − x_1) ± H(y_2 − y_1)
y_c = y_1 + (1/2)(y_2 − y_1) ∓ H(x_2 − x_1)    (7)

where

H = √( α² / ((x_1 − x_2)² + (y_1 − y_2)²) − 1/4 )    (8)

and the two sign choices in Eq. (7) give the two circle centers.
After obtaining the two circle centers, the distances from the remaining points to each center are compared against the radius α to determine whether any other point lies inside either circle. If one of the circles contains no other points, the points p_1 and p_2 are identified as boundary points of the pit, and the line segment connecting p_1 and p_2 is considered a boundary segment. By iterating over the point set p, the set of boundary points, denoted p_L, is obtained.
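The pairwise circle test of Eqs. (7)-(8) can be sketched as follows. This is a brute-force O(n³) illustration of the Alpha-Shape boundary test, kept deliberately simple rather than the efficient Delaunay-based formulation.

```python
import numpy as np

def alpha_shape_edges(pts, alpha):
    """For each pair of 2D points closer than 2*alpha, place the two
    circles of radius alpha passing through both (Eqs. 7-8); if either
    circle contains no other point, the pair is a boundary segment."""
    edges = []
    n = len(pts)
    for i in range(n):
        for j in range(i + 1, n):
            p1, p2 = pts[i], pts[j]
            d2 = float(np.sum((p2 - p1) ** 2))
            if d2 == 0.0 or d2 > 4.0 * alpha * alpha:
                continue                              # no radius-alpha circle fits
            h = np.sqrt(alpha * alpha / d2 - 0.25)    # Eq. (8)
            mid = (p1 + p2) / 2.0
            perp = np.array([p2[1] - p1[1], p1[0] - p2[0]])
            for center in (mid + h * perp, mid - h * perp):  # the two centers
                r = np.linalg.norm(pts - center, axis=1)
                r[[i, j]] = np.inf                    # the pair lies on the circle
                if np.all(r >= alpha - 1e-9):         # circle empty of other points
                    edges.append((i, j))
                    break
    return edges
```

For a unit square with one interior point and α = 0.7, only the four square sides survive the test: the diagonals are too long for a radius-α circle, and every circle through the interior point contains a corner.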
After extracting the boundary points, the elevation value zi of each point pL (xi , yi ) in pL is remapped,
resulting in a new 3D point cloud dataset pLT (xLT , yLT , zLT ). This method not only preserves the features
of the pit but also effectively extracts the upper and lower boundary point cloud information, providing the
necessary data support for subsequent point cloud completion tasks. Figure 2 illustrates the boundary extraction
process.
The projected area S of the pit is then computed from the N extracted boundary points using the shoelace formula:

S = (1/2) | Σ_{i=1}^{N} (x_i y_{i+1} − x_{i+1} y_i) |    (9)

where x_{N+1} = x_1 and y_{N+1} = y_1.
To calculate the depth, an iteration is performed over all data points within the pothole point set. The maximum absolute elevation among these points represents the pothole's maximum depth, denoted H_max, expressed as:

H_max = | max_i(z_i) − z̄_plane |,  z_i ∈ P_LT    (10)

Here, P_LT represents the pothole point set, z_i is the elevation value of each point within the pothole, and z̄_plane is the average height of points on the segmentation plane. The perimeter of the pothole can be calculated by summing the distances between consecutive boundary points. The expression is as follows:

C = Σ_{i=1}^{n} √( (x_{i+1} − x_i)² + (y_{i+1} − y_i)² + (z_{i+1} − z_i)² )    (11)
Fig. 2. Feature point extraction: (a) Standard pothole. (b) Triangulation processing. (c) Feature point
extraction based on normal vectors. (d) Boundary point extraction based on Alpha-Shape.
This formula represents the sum of Euclidean distances between consecutive points, providing the perimeter
length of the pothole boundary in three-dimensional space.
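The geometric metrics can be computed from an ordered loop of boundary points as follows. This is a minimal sketch: it assumes the loop ordering comes from the preceding boundary extraction, and it uses the shoelace formula for the projected area.

```python
import numpy as np

def pothole_metrics(boundary, z_plane):
    """Maximum depth (Eq. 10), 3D perimeter (Eq. 11) and projected
    area (shoelace) from an ordered (N, 3) loop of boundary points."""
    z = boundary[:, 2]
    h_max = np.abs(z - z_plane).max()                          # Eq. (10)
    nxt = np.roll(boundary, -1, axis=0)                        # closes the loop
    perimeter = np.linalg.norm(nxt - boundary, axis=1).sum()   # Eq. (11)
    x, y = boundary[:, 0], boundary[:, 1]
    xn, yn = np.roll(x, -1), np.roll(y, -1)
    area = 0.5 * np.abs(np.sum(x * yn - xn * y))               # shoelace area
    return h_max, perimeter, area
```

A 1 m square boundary sitting 50 mm below the road plane yields a depth of 0.05 m, a perimeter of 4 m and an area of 1 m².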
Experiments
To validate the effectiveness of the proposed YOLOv8 and point cloud fusion-based road pothole detection
method, two experimental scenarios were designed. The first involved real-world road pothole testing, while the
second utilized foam-based simulated potholes. These experiments aimed to evaluate the algorithm’s accuracy,
robustness, and stability. Pothole characteristics such as perimeter, area, and depth were detected and quantified
across multiple dimensions. In addition, the algorithm was compared with existing detection methods to
provide a comprehensive performance analysis. The primary goal of the experiments was to highlight the
advantages of integrating point cloud and image data, while assessing the algorithm’s effectiveness under varying
environmental conditions.
Fig. 4. Illustration of pothole quantification: (a) Depth measurement; (b) Area and perimeter measurement.
Fig. 5. Test road. The satellite imagery was obtained from Baidu Maps (https://map.baidu.com). The left side is
an asphalt road with lower roughness, while the right side is a gravel road with higher roughness.
To minimize measurement errors, all manual measurements were repeated three times, and the average values were used as the ground truth for validating the method's multidimensional metrics.
Fig. 6. Types of false positives: (a) Caused by manhole covers and road patches; (b) Caused by stains; (c)
Caused by road patches; (d) Caused by weeds; (e) Excessive roughness; (f) Caused by distant vehicles.
Precision = TP / (TP + FP)

Recall = TP / (TP + FN)

F1 = 2 × (Precision × Recall) / (Precision + Recall)    (12)
In this evaluation, TP (true positives) represents the number of correctly detected potholes, FP (false positives)
indicates the number of incorrectly detected potholes, and FN (false negatives) denotes the number of missed
potholes. A detection is classified as a true positive (TP) if the intersection over union (IoU) between the detected
bounding box and the annotated pothole box exceeds 0.5. Conversely, if the detected bounding box does not
overlap with the annotated pothole or the IoU is less than 0.5, it is classified as a false positive (FP).
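The TP/FP/FN counting and Eq. (12) can be sketched with an IoU matcher. The one-to-one greedy matching below is an assumption, as the paper does not describe its matching procedure beyond the IoU > 0.5 rule.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def detection_scores(detections, ground_truth, iou_th=0.5):
    """Greedy one-to-one matching at IoU > iou_th, then Eq. (12)."""
    matched, tp = set(), 0
    for det in detections:
        best, best_iou = None, iou_th
        for g, gt in enumerate(ground_truth):
            v = iou(det, gt)
            if g not in matched and v > best_iou:
                best, best_iou = g, v
        if best is not None:
            matched.add(best)
            tp += 1
    fp = len(detections) - tp
    fn = len(ground_truth) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

With one matched detection, one spurious box and one missed pothole, precision, recall and F1 all come out at 0.5.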
From Table 1, it is evident that the precision achieved by the proposed method surpasses that of YOLOv8
alone. This enhancement is primarily due to the incorporation of 3D point cloud data, which effectively reduces
false positive (FP) detections. The experiments identified two primary types of errors in pothole detection, with
representative misdetection examples presented in Fig. 6. The first error type involves misclassifying non-ground
objects, such as vehicles, as potholes. The second type pertains to the incorrect identification of ground features,
including road patches, manhole covers, or stains, as potholes.
To address misdetections of non-ground objects, we leverage the 3D point cloud data of the bounding box and
apply an elevation threshold, recognizing that potholes are confined to the road surface. Detections exceeding
this elevation threshold are classified as non-road potholes. For misdetections on the ground, a maximum depth
threshold is utilized. After extracting the 3D point cloud data of a potential pothole, its maximum depth is
compared against predefined damage levels. Detections with a maximum depth below the threshold for small
potholes are flagged as false detections, effectively eliminating road patches, stains, shadows, and other similar
anomalies. Table 2 provides a more detailed classification of correctly detected potholes, building upon the data
presented in Table 1.
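The two filtering rules above can be sketched as follows. The threshold values in the test are placeholders; the paper does not publish the exact elevation and minimum-depth thresholds it uses.

```python
import numpy as np

def filter_false_positives(candidates, road_z, elev_th, min_depth):
    """Discard detections whose point cloud rises above the road plane
    (non-ground objects such as vehicles) or whose maximum depth falls
    below the smallest damage level (patches, stains, shadows)."""
    kept = []
    for pts in candidates:
        z = pts[:, 2]
        if np.max(z - road_z) > elev_th:     # sticks out above the road
            continue
        if np.max(road_z - z) < min_depth:   # too shallow: patch or stain
            continue
        kept.append(pts)
    return kept
```

A genuine 40 mm pothole passes both checks, while a vehicle (elevated points) and a flat stain (negligible depth) are rejected.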
Table 2 offers an intuitive visualization of the quantity, size, and classification of detected potholes along
the test sections, providing actionable insights for road inspection and maintenance. Specifically, potholes are
classified into three categories based on depth: low (less than 25 mm), moderate (25-50 mm), and high (greater
than 50 mm). To further validate the effectiveness and applicability of the proposed method, we compared its
performance with several widely adopted pothole detection algorithms, including point cloud-based methods6,
traditional image processing techniques28, and multi-sensor fusion algorithms39. The comparison results are
detailed in Table 3.
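The depth-based severity categories used in Table 2 can be expressed as a simple classifier; the placement of the exact boundary values (25 mm, 50 mm) follows the ranges quoted in the text.

```python
def severity(depth_mm):
    """Severity categories from the text: low (< 25 mm),
    moderate (25-50 mm), high (> 50 mm)."""
    if depth_mm < 25:
        return "low"
    if depth_mm <= 50:
        return "moderate"
    return "high"
```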
A comparison of the detection results in Table 3 reveals that the proposed method outperforms existing
approaches in both precision and recall. Notably, the method demonstrates a significant advantage in precision,
attributed to the integration of YOLOv8’s high accuracy with a classification step that categorizes detected
potholes into small, medium, and large sizes using 3D point cloud features. This additional classification effectively
eliminates false positives, such as road patches and stains, by excluding detections where depth differences fall
below the specified thresholds. The proposed method achieves an overall detection accuracy of 95.8% and a
recall rate of 93.3%, with a single-frame detection speed of 0.24 seconds, meeting the requirements for real-time
applications. Moreover, it surpasses other recent advanced methods in terms of detection effectiveness.
Methods | Depth precision | Area precision
Deep learning (GA-DenseNet) and binocular stereo vision40 | 93% | 88%
Laser point cloud6 | ≥ 94% | ≥ 93%
RGB-D camera and instance segmentation algorithm41 | – | 87%-91%
Ours | ≥ 96% | 87%-95%

Table 5. Comparative analysis of geometric dimension accuracy with other detection methods.
Potholes on rougher road surfaces, which have less distinct boundary profiles, show less accurate detection results, with surface area errors of approximately 13% and perimeter errors of around 11%. However, regardless of the road roughness or the clarity of the boundary contours, depth values are consistently detected with high accuracy, with relative errors not exceeding 4%.
Some of the detection results for the potholes presented in Table 4 are illustrated in Fig. 7. The first row shows
the RGB image of the pothole, the second row displays the triangulated point cloud, the third row presents the
extracted feature point cloud, and the fourth row shows the corresponding point cloud of the pothole.
To further assess the accuracy of the proposed method, a comparative analysis was conducted against recent
related studies. The results of this comparison are summarized in Table 5.
In Li et al.40, the GA-DenseNet classification model is applied to categorize road surface damage types, where
point cloud data is processed and converted into binary images to extract the depth and area features of potholes.
However, this approach, which relies on a transformation ratio to estimate the damage area, produces larger
errors compared to our boundary extraction method based on smoothness. Similarly, Rufei et al.6 employ the
integral invariants principle based on laser point clouds for pothole detection. This direct method of extracting
pothole dimensions from point cloud contours demonstrates higher accuracy in measuring pothole sizes. In
contrast, Lin et al.41 focus on the surface area and volume of potholes, rather than their maximum depth. Their
findings indicate that as the severity of damage increases, the errors in calculated area and volume decrease, with
error rates ranging from 9% to 13.2%. The accuracy of these measurements is highly dependent on the size of
the pothole area.
The method proposed in this paper, however, performs consistently across potholes of varying sizes.
Detection accuracy is influenced primarily by road roughness and the clarity of pothole boundaries. The highest
detection precision is achieved with foam potholes due to their distinct contours and pronounced curvature
changes, which enhance boundary extraction accuracy. In contrast, natural potholes on cement roads often have
smoother boundary planes and less variation in internal structure. Additionally, prolonged vehicle pressure
causes the boundary localization of these potholes to become less distinct, leading to higher errors in detection.
Conclusion
This paper presents a method for road pothole detection by integrating image and point cloud data. The proposed
approach begins with YOLOv8, which detects potholes in captured images and marks the 2D bounding boxes.
The top-left and bottom-right corner coordinates of these bounding boxes are matched with corresponding depth
data to determine the 3D coordinates, designating the target region within the point cloud data. Subsequently,
the pothole boundary contour is identified by analyzing smoothness variations, and the point clouds within
the contour are extracted to calculate geometric features, including average depth, surface area, and perimeter.
The proposed method was validated under real-world road conditions. YOLOv8 effectively identified
candidate potholes, which were then classified by damage severity. Furthermore, surface stains and patches, often misclassified as potholes, were successfully filtered out, addressing a limitation of YOLO-based methods used alone. While the recall rate remained consistent at 93.3% compared to YOLOv8's results, the proposed method improved precision
from 89.3% to 95.8%. To evaluate the geometric accuracy of the detected potholes, multiple experiments were
conducted. The results showed that under road conditions with lower roughness, the detection accuracies for
perimeter, surface area, and depth were 96%, 95%, and 96%, respectively. Furthermore, the model processed
one image in 0.23 seconds, demonstrating its suitability for practical applications. However, the experiments
revealed that the proposed method relies on a complete pothole point cloud for accurate detection. In cases
where occlusion leads to an incomplete point cloud representation of the pothole, the accuracy of size estimation
is compromised.
Future research should focus on enhancing the recall rate of pothole detection, expanding the dataset to
include diverse environmental conditions, and improving the model’s adaptability to complex and harsh
environments. Additionally, developing a pothole detection approach that can handle incomplete point cloud
data is a key direction for future studies. Further efforts are also needed to advance the sampling frequency and
quality of data acquisition equipment, broadening the applicability of this method in engineering practice.
Data availability
The datasets generated and analyzed during the current study are available from the corresponding author on reasonable request.
References
1. Arya, D. et al. Deep learning-based road damage detection and classification for multiple countries. Automation in Construction
132, 103935 (2021).
2. Zhao, L., Wu, Y., Luo, X. & Yuan, Y. Automatic defect detection of pavement diseases. Remote Sensing 14, 4836 (2022).
3. Fan, R. & Liu, M. Road damage detection based on unsupervised disparity map segmentation. IEEE Transactions on Intelligent
Transportation Systems 21, 4906–4911 (2019).
4. Dhiman, A. & Klette, R. Pothole detection using computer vision and learning. IEEE Transactions on Intelligent Transportation
Systems 21, 3536–3550 (2019).
5. Fan, R. et al. Long-awaited next-generation road damage detection and localization system is finally here. In 2021 29th European
Signal Processing Conference (EUSIPCO), 641–645 (IEEE, 2021).
6. Rufei, L., Jiben, Y., Hongwei, R., Bori, C. & Chenhao, C. Research on a pavement pothole extraction method based on vehicle-
borne continuous laser scanning point cloud. Measurement Science and Technology 33, 115204 (2022).
7. Ravi, R., Bullock, D. & Habib, A. Pavement distress and debris detection using a mobile mapping system with 2d profiler lidar.
Transportation research record 2675, 428–438 (2021).
8. Ma, N. et al. Computer vision for road imaging and pothole detection: a state-of-the-art review of systems and algorithms.
Transportation safety and Environment 4, tdac026 (2022).
9. Kim, Y.-M. et al. Review of recent automated pothole-detection methods. Applied Sciences 12, 5320 (2022).
10. Tang, K. et al. Decision fusion networks for image classification. IEEE Transactions on Neural Networks and Learning Systems
(2022).
11. Allouch, A., Koubâa, A., Abbes, T. & Ammar, A. Roadsense: Smartphone application to estimate road conditions using
accelerometer and gyroscope. IEEE Sensors Journal 17, 4231–4238 (2017).
12. Ren, J. & Liu, D. Pads: A reliable pothole detection system using machine learning. In International Conference on Smart Computing
and Communication, 327–338 (Springer, 2016).
13. Ghadge, M., Pandey, D. & Kalbande, D. Machine learning approach for predicting bumps on road. In 2015 International Conference
on Applied and Theoretical Computing and Communication Technology (iCATccT), 481–485 (IEEE, 2015).
14. Kortmann, F. et al. Detecting various road damage types in global countries utilizing faster r-cnn. In 2020 IEEE International
Conference on Big Data (Big Data), 5563–5571 (IEEE, 2020).
15. Javed, A. et al. Pothole detection system using region-based convolutional neural network. In 2021 IEEE 4th International
Conference on Computer and Communication Engineering Technology (CCET), 6–11 (IEEE, 2021).
16. Cano-Ortiz, S., Iglesias, L. L., del Árbol, P. M. R., Lastra-González, P. & Castro-Fresno, D. An end-to-end computer vision system
based on deep learning for pavement distress detection and quantification. Construction and Building Materials 416, 135036
(2024).
17. Cano-Ortiz, S., Iglesias, L. L., del Árbol, P. M. R. & Castro-Fresno, D. Improving detection of asphalt distresses with deep learning-
based diffusion model for intelligent road maintenance. Developments in the Built Environment 17, 100315 (2024).
18. Haq, M. U. U., Ashfaque, M., Mathavan, S., Kamal, K. & Ahmed, A. Stereo-based 3d reconstruction of potholes by a hybrid, dense
matching scheme. IEEE Sensors Journal 19, 3807–3817 (2019).
19. Ahmed, A. et al. Pothole 3d reconstruction with a novel imaging system and structure from motion techniques. IEEE Transactions
on Intelligent Transportation Systems 23, 4685–4694 (2021).
20. Guan, J. et al. Automated pixel-level pavement distress detection based on stereo vision and deep learning. Automation in
Construction 129, 103788 (2021).
21. Wu, R. et al. Scale-adaptive pothole detection and tracking from 3-d road point clouds. In 2021 IEEE International Conference on
Imaging Systems and Techniques (IST), 1–5 (IEEE, 2021).
22. Tang, K. et al. Deep manifold attack on point clouds via parameter plane stretching. In Proceedings of the AAAI Conference on
Artificial Intelligence 37, 2420–2428 (2023).
23. Tang, K. et al. Manifold constraints for imperceptible adversarial attacks on point clouds. In Proceedings of the AAAI Conference on
Artificial Intelligence 38, 5127–5135 (2024).
24. Dharneeshkar, J., Aniruthan, S., Karthika, R., Parameswaran, L. et al. Deep learning based detection of potholes in Indian roads
using YOLO. In 2020 International Conference on Inventive Computation Technologies (ICICT), 381–385 (IEEE, 2020).
25. Suong, L. K. & Jangwoo, K. Detection of potholes using a deep convolutional neural network. Journal of Universal Computer Science
24, 1244–1257 (2018).
26. Omar, M. & Kumar, P. Detection of roads potholes using YOLOv4. In 2020 International Conference on Information Science and
Communications Technologies (ICISCT), 1–6 (IEEE, 2020).
27. Bučko, B., Lieskovská, E., Zábovská, K. & Zábovský, M. Computer vision based pothole detection under challenging conditions.
Sensors 22, 8878 (2022).
28. Anand, S., Gupta, S., Darbari, V. & Kohli, S. Crack-pot: Autonomous road crack and pothole detection. In 2018 digital image
computing: techniques and applications (DICTA), 1–6 (IEEE, 2018).
29. Wang, H., Fan, R., Cai, P. & Liu, M. PVStereo: Pyramid voting module for end-to-end self-supervised stereo matching. IEEE
Robotics and Automation Letters 6, 4353–4360 (2021).
30. Zhang, Y. et al. A Kinect-based approach for 3D pavement surface reconstruction and cracking recognition. IEEE Transactions on
Intelligent Transportation Systems 19, 3935–3946 (2018).
31. Kamal, K. et al. Performance assessment of Kinect as a sensor for pothole imaging and metrology. International Journal of Pavement
Engineering 19, 565–576 (2018).
32. Chang, K., Chang, J. & Liu, J. Detection of pavement distresses using 3D laser scanning technology. In Computing in Civil Engineering
2005, 1–11 (2005).
33. Díaz-Vilariño, L., González-Jorge, H., Bueno, M., Arias, P. & Puente, I. Automatic classification of urban pavements using mobile
LiDAR data and roughness descriptors. Construction and Building Materials 102, 208–215 (2016).
34. Yu, Y., Li, J., Guan, H. & Wang, C. 3D crack skeleton extraction from mobile LiDAR point clouds. In 2014 IEEE Geoscience and Remote
Sensing Symposium, 914–917 (IEEE, 2014).
35. Choi, J., Zhu, L. & Kurosu, H. Detection of cracks in paved road surface using laser scan image data. The International Archives of
the Photogrammetry, Remote Sensing and Spatial Information Sciences 41, 559–562 (2016).
36. Chen, X. & Li, J. A feasibility study on use of generic mobile laser scanning system for detecting asphalt pavement cracks. The
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 41, 545–549 (2016).
37. Wu, H. et al. Road pothole extraction and safety evaluation by integration of point cloud and images derived from mobile mapping
sensors. Advanced Engineering Informatics 42, 100936 (2019).
38. Yoon, S. & Cho, J. Convergence of stereo vision-based multimodal YOLOs for faster detection of potholes. Computers, Materials &
Continua 73 (2022).
39. Chen, L. et al. GoComfort: Comfortable navigation for autonomous vehicles leveraging high-precision road damage crowdsensing.
IEEE Transactions on Mobile Computing 22, 6477–6494 (2022).
40. Li, J., Liu, T. & Wang, X. Advanced pavement distress recognition and 3D reconstruction by using GA-DenseNet and binocular stereo
vision. Measurement 201, 111760 (2022).
41. Lin, W., Li, X., Han, H., Yu, Q. & Cho, Y.-H. A novel approach for pavement distress detection and quantification using RGB-D
camera and deep learning algorithm. Construction and Building Materials 407, 133593 (2023).
Author contributions
Junkui Zhong: Conceptualization, Methodology, Data Collection, Analysis, Writing-Original Draft, Writing-Review and Editing. Deyi Kong: Methodology, Supervision, Writing-Review and Editing. Yuliang Wei: Methodology, Supervision, Review and Editing, Validation. Bin Pan: Data Analysis, Review and Editing. All authors have read and agreed to the published version of the manuscript.
Funding
This paper was supported by the Anhui Provincial Natural Science Foundation (No. 2308085QA22), the Hefei
Institute of Technology Innovation Engineering (Project No. KY-2023-SC-01), and the Anhui Provincial Major
Science and Technology Project (Project No. 202203a06020002).
Declarations
Competing interests
The authors declare no competing interests.
Additional information
Correspondence and requests for materials should be addressed to D.K. or Y.W.
Reprints and permissions information is available at www.nature.com/reprints.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.