Final Report
A
Project Report
Submitted in partial fulfilment of the
Requirements for the award of the Degree of
BACHELOR OF ENGINEERING
IN
INFORMATION TECHNOLOGY
By
Ms. B. Leelavathy
Assistant Professor
BONAFIDE CERTIFICATE
Beyond mere detection, the system furnishes insights into object bounding
box coordinates and associated labels, fostering a comprehensive understanding of
detected entities' spatial context and semantic relevance within the image.
experimenting with different momentum values is conducted to optimize the
model's training process, ensuring better performance. In summary, the solution
combines the strengths of PyTorch, ResNet34 architecture, and SSD model to
propose a robust system for pothole detection.
This innovative system not only alerts drivers to the presence of potholes
but also provides a mechanism to instantly notify relevant officials about damaged
roads. This proactive approach to road damage monitoring and reporting aims to
enhance overall road safety and expedite the necessary repairs, ultimately
contributing to more reliable and secure road infrastructure.
One of the key objectives of the proposed work is to achieve real-time
performance, enabling the system to process live camera feeds and detect potholes
on the fly. This requires developing efficient algorithms and optimizing the
computational resources to minimize processing time and latency.
Additionally, the project aims to integrate the pothole detection system with
Google Maps, to provide users with enhanced functionalities or visualizations. This
integration will enhance the usability and usefulness of the system, making it more
valuable for both end users and stakeholders.
Overall, the proposed work seeks to deliver an accurate, robust, and real-
time pothole detection system that can effectively address the challenges associated
with road maintenance and infrastructure management. Through a systematic
approach to model development, evaluation, and optimization, the project aims to
achieve its objectives and contribute to improving road safety and infrastructure
maintenance efforts.
The report also includes a Future Scope section that explores potential
enhancements and extensions to the current system, discussing future research
directions and opportunities for improvement. The Conclusion summarizes the key
findings, contributions, and the project's significance, while the References section
lists the sources, references, and literature cited throughout the report. Lastly, the
Appendices contain supplementary material, additional data, code snippets, and
detailed documentation to support the main content of the report.
2. LITERATURE SURVEY
The literature survey conducted for the Realtime Pothole Detection System
project provided a comprehensive evaluation of the Single Shot Detector (SSD) in
the context of object detection, with a particular focus on its application in pothole
detection. Object detection plays a crucial role in various computer vision
applications, including road maintenance, safety, and infrastructure management.
SSD emerged as a leading object detection framework due to its efficiency,
accuracy, and suitability for real-time detection tasks, making it a prime candidate
for the project's objectives.
The findings from the literature survey provided valuable insights into the
capabilities and potential of SSD in pothole detection applications. SSD
demonstrated superior performance in terms of detection accuracy, processing
speed, and adaptability to varying lighting and environmental conditions compared
to YOLO [2]. Its multi-scale feature extraction capabilities and efficient use of
computational resources enable SSD to detect potholes of various sizes, shapes, and
dimensions with higher precision and reduced false positives [13].
Studies like [3] delve into optimizing pre-trained SSD models for pothole detection
tasks. This can involve techniques like transfer learning and fine-tuning the network
on pothole image datasets specific to a given region or set of road conditions. This
can lead to a model that is both accurate in pothole detection and efficient in its use
of processing power.
While YOLO variants like YOLOv5 offer potentially higher accuracy, SSD's
single-stage processing pipeline keeps inference fast [10][11][15]. This speed
makes SSD well suited to real-time applications where immediate
pothole detection is critical. Research on strategies for creating large and diverse
pothole image datasets for a specific region or application can also be valuable
[4]. This might involve collecting images under various lighting conditions, weather
scenarios, and different road surface textures to improve the model's
generalizability.
Firebase Realtime Database or Cloud Firestore can be leveraged for storing
pothole location data, including GPS coordinates, damage severity, and timestamps
[20]. These services allow for real-time updates whenever a new pothole is detected,
facilitating efficient data access for visualization and analysis.
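As an illustration of the kind of record described above, the following is a minimal sketch of a pothole entry prepared for storage. The field names, the severity scale, and the `make_record` helper are assumptions introduced here for illustration; the report does not specify the schema actually used, and no Firebase client call is shown.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class PotholeRecord:
    """One detected pothole (field names are illustrative assumptions)."""
    latitude: float    # decimal degrees, from the GPS module
    longitude: float   # decimal degrees, from the GPS module
    severity: str      # hypothetical scale, e.g. "low", "medium", "high"
    timestamp: str     # ISO 8601 string in UTC


def make_record(lat: float, lon: float, severity: str, when: datetime) -> dict:
    """Validate the coordinates and return a JSON-serializable dict."""
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude out of range")
    if not -180.0 <= lon <= 180.0:
        raise ValueError("longitude out of range")
    stamp = when.astimezone(timezone.utc).isoformat()
    return asdict(PotholeRecord(lat, lon, severity, stamp))


record = make_record(17.3850, 78.4867, "high",
                     datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc))
print(record)
```

The resulting dictionary could then be written to the database with whichever Firebase client the project uses.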
Integrating pothole data with Google Maps allows users to visualize
detected potholes and plan their routes accordingly. Studies on integrating real-time
pothole detection data with Google Maps can provide insights on data visualization
and user interaction, including the use of Google Maps APIs to display
pothole information along user-defined routes or within user-specified areas [19].
This can empower users to make informed decisions about their journeys and
contribute to safer road conditions.
Real-time pothole detection systems also face limitations and challenges. These
include factors like varying lighting conditions, weather, shadows, and different
road surface textures that can impact detection accuracy [3][4][5][7]. Consider
strategies to mitigate these challenges, such as image pre-
By incorporating the insights from this literature survey, the project aims to build a robust,
user-friendly, and scalable Real-Time Pothole Detection System with SSD and
Google Maps integration. Such a system can contribute significantly to improving
road safety and facilitating efficient road maintenance efforts.
3. PROPOSED WORK
3.1 Block Diagram
5. Object Detection and Location Tracking: Here, the machine learning
model is split into two parts:
Object detection: This part of the model is responsible for identifying and
locating damaged areas in the road images.
Location tracking: This part of the model uses a Neo6M GPS module to
track the location of the camera as it captures video.
6. Training the Dataset: The training set is fed into the object detection
model to train it to recognize damaged roads.
7. Obtain Results: Once trained, the model is used to process the test set and
identify damaged roads in the images.
8. Damaged Road Detected: If a damaged road is detected, the Neo6M GPS
module logs the location of the damage and updates it in Firebase.
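The location-logging step can be sketched as follows. GPS receivers of this kind typically emit NMEA 0183 sentences over a serial link; the sketch below converts the latitude and longitude fields of a GGA sentence to decimal degrees. The function names are hypothetical, the checksum is not validated, and the report does not state how the project actually reads the module.

```python
def nmea_to_decimal(value: str, hemisphere: str) -> float:
    """Convert NMEA (d)ddmm.mmmm plus a hemisphere letter to signed decimal degrees."""
    dot = value.index(".")
    degrees = int(value[:dot - 2])     # digits before the two minute digits
    minutes = float(value[dot - 2:])   # mm.mmmm
    decimal = degrees + minutes / 60.0
    return -decimal if hemisphere in ("S", "W") else decimal


def parse_gga(sentence: str):
    """Return (latitude, longitude) from a GGA sentence, or None without a fix."""
    body = sentence.strip().split("*")[0]   # drop the checksum suffix
    fields = body.split(",")
    if len(fields) < 7 or not fields[0].endswith("GGA"):
        return None
    if fields[6] in ("", "0"):              # fix quality 0 means no valid fix
        return None
    lat = nmea_to_decimal(fields[2], fields[3])
    lon = nmea_to_decimal(fields[4], fields[5])
    return lat, lon


sample = "$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47"
print(parse_gga(sample))
```

Returning `None` when there is no fix lets the caller skip logging rather than store a meaningless position.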
3.2 Algorithm
The Realtime Pothole Detection System project utilized the Single Shot
Detector (SSD) algorithm as the core machine learning model for object detection,
specifically targeting potholes on roads. SSD is a state-of-the-art object detection
algorithm known for its efficiency, accuracy, and suitability for real-time
applications.
Algorithm for preparing a custom dataset for object detection tasks using the
Single Shot Detector (SSD) architecture:
perform simultaneous object localization and classification in a single forward pass
through the network, hence the name "Single Shot."
3. If __name__ == '__main__':
3.1 Call create_model function with parameters (num_classes=2, size=300) to create
SSD model instance
3.2 Print SSD model architecture
3.3 Calculate and print total number of parameters and trainable parameters in the
model
Algorithm for training an object detection model using the Single Shot
Detector (SSD) architecture:
4.2 Initialize tqdm progress bar
4.3 For each batch in train_data_loader:
4.3.1 Forward pass through model
4.3.2 Compute losses
4.3.3 Accumulate losses
4.3.4 Backward pass and update model parameters
4.3.5 Update tqdm progress bar description with current loss
4.4 Return final loss value
6. If __name__ == '__main__':
6.1 Create directories for outputs if not exist
6.2 Load training and validation datasets and create data loaders
6.3 Initialize and move model to device (CPU or GPU)
6.4 Print total and trainable parameters of the model
6.5 Initialize optimizer and learning rate scheduler
6.6 Initialize loss and mAP history trackers
6.8.3 Run training and validation functions
6.8.4 Print epoch, training loss, and mAP values
6.8.5 Save best model based on mAP
6.8.6 Save current epoch model
6.8.7 Save loss and mAP plots
6.8.8 Step learning rate scheduler
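The bookkeeping parts of the steps above (accumulating batch losses and keeping the best model by mAP) can be sketched without any deep learning framework. In the sketch below the forward and backward passes are replaced by precomputed, made-up numbers, and the class and function names are assumptions for illustration rather than the project's actual code.

```python
class Averager:
    """Running mean of per-batch loss values within one epoch."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def send(self, value):
        self.total += value
        self.count += 1

    @property
    def value(self):
        return self.total / self.count if self.count else 0.0

    def reset(self):
        self.total = 0.0
        self.count = 0


class BestModelTracker:
    """Remembers the best validation mAP seen so far."""

    def __init__(self):
        self.best_map = float("-inf")
        self.best_epoch = None

    def update(self, epoch, current_map):
        """Return True on a new best; the caller would save a checkpoint then."""
        if current_map > self.best_map:
            self.best_map = current_map
            self.best_epoch = epoch
            return True
        return False


def run_training(loss_batches, map_values):
    """Walk through epochs using precomputed losses and mAP values."""
    averager, tracker, history = Averager(), BestModelTracker(), []
    for epoch, (losses, current_map) in enumerate(zip(loss_batches, map_values), start=1):
        averager.reset()
        for loss in losses:          # stands in for the per-batch training step
            averager.send(loss)
        is_best = tracker.update(epoch, current_map)
        history.append((epoch, averager.value, current_map, is_best))
    return history, tracker


# Made-up numbers, only to exercise the bookkeeping.
history, tracker = run_training([[2.0, 1.0], [1.0, 0.5], [0.5, 0.5]],
                                [0.10, 0.40, 0.35])
for row in history:
    print(row)
```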
In the context of the Realtime Pothole Detection System project, the SSD
algorithm was trained using a custom dataset of road images annotated with
bounding boxes around potholes. The training process involved optimizing key
hyperparameters, such as the learning rate, batch size, number of epochs, and loss
function, to ensure efficient model convergence and performance. Various data
augmentation techniques, including random cropping, grayscale conversion,
brightness and contrast adjustments, were also employed to enhance the model's
generalization and robustness.
Once trained, the SSD model was deployed to perform real-time pothole
detection using live camera feeds integrated with the system. The detected potholes
were then visualized and analyzed on a comprehensive website integrated with
Google Maps API, providing users with a detailed overview of pothole locations,
sizes, and other relevant information for road maintenance and safety purposes.
4. EXPERIMENTAL STUDY
4.1 Data Sets
In the Realtime Pothole Detection System project, the dataset plays a crucial
role in training and evaluating the performance of the machine learning models.
The datasets used in the project are described below.
The dataset for the project was sourced from Kaggle [15][16][17][18], a
platform known for hosting machine learning datasets and competitions. The
dataset comprises 768 unique images containing various road conditions and
pothole instances.
To facilitate the training of the computer vision model, the images in the
dataset were annotated using LabelMe, a popular tool for annotating images with
bounding boxes. This annotation process involved marking the location of potholes
and other relevant objects or features in the images. The annotated data, which
contains information about the coordinates and dimensions of the bounding boxes,
was then stored in XML format. This structured data is essential for training the
machine learning models to recognize and detect potholes accurately.
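A minimal sketch of reading such an annotation file is shown below. It assumes a Pascal VOC-style XML layout (`object`, `name`, `bndbox`, `xmin`, and so on); the report states only that the annotations are stored in XML, so the exact schema, the sample content, and the `parse_annotation` helper are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in a Pascal VOC-style layout.
SAMPLE_XML = """<annotation>
  <filename>road_001.jpg</filename>
  <size><width>640</width><height>480</height></size>
  <object>
    <name>pothole</name>
    <bndbox><xmin>120</xmin><ymin>200</ymin><xmax>260</xmax><ymax>310</ymax></bndbox>
  </object>
  <object>
    <name>pothole</name>
    <bndbox><xmin>300</xmin><ymin>150</ymin><xmax>380</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>"""


def parse_annotation(xml_text):
    """Return (boxes, labels); each box is (xmin, ymin, xmax, ymax) in pixels."""
    root = ET.fromstring(xml_text)
    boxes, labels = [], []
    for obj in root.findall("object"):
        labels.append(obj.findtext("name"))
        bndbox = obj.find("bndbox")
        box = tuple(int(float(bndbox.findtext(tag)))
                    for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append(box)
    return boxes, labels


boxes, labels = parse_annotation(SAMPLE_XML)
print(boxes, labels)
```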
In summary, the dataset used in the project consists of over 700 annotated
images sourced from Kaggle, which were further enhanced and diversified through
image augmentation techniques using the Albumentations library. This curated and
augmented dataset serves as the foundation for training and evaluating the machine
learning models developed for the Realtime Pothole Detection System.
A Google Colab account offers a cloud-based platform for GPU-accelerated
model training.
ReactJS and NodeJS are used for web development, creating the project's
frontend and backend components.
Firebase provides real-time database and authentication services.
Google Maps API enables geolocation functionalities.
Lastly, a browser with JavaScript and CORS enabled is essential for web
interface interaction and compatibility.
A minimum of 5GB of storage is required to accommodate the project's
datasets, software tools, libraries, and other resources. Adequate storage space
ensures smooth data handling, model training, and software development processes
without the risk of running out of space, facilitating seamless progress and efficient
project management.
The Neo6M GPS Module is an essential hardware component that facilitates
geolocation data collection by capturing geographical coordinates and locations of
detected potholes. Integrated with the system, this GPS module provides accurate
positioning information, enabling the system to identify and map the exact locations
of potholes on a geographical platform. The Neo6M GPS Module's high accuracy,
reliability, and compatibility with the project's hardware and software components
make it a crucial tool for enhancing the system's precision and effectiveness in real-
time pothole detection and mapping.
In summary, the hardware requirements for the Realtime Pothole Detection
System project encompass a range of essential components, including a powerful
PC with an NVIDIA graphics card for computational tasks, sufficient storage for data
management, an ESP32 Camera Module for real-time image capture, and a Neo6M GPS
Module for accurate geolocation data collection. These hardware components play
a pivotal role in enabling data collection, system integration, and real-time
processing, facilitating the development and deployment of an efficient and scalable
Realtime Pothole Detection System.
4.4 Preprocessing
1. Data Collection:
Process: The ESP32 Camera Module is utilized to capture images or video feeds
of road surfaces in different environmental conditions, such as varying lighting,
weather, and road types.
Importance: High-resolution and diverse data collection ensures that the
machine learning models are exposed to various scenarios and conditions,
enhancing their ability to generalize and accurately detect potholes in real-world
settings.
2. Image Annotation:
Purpose: Image annotation involves labelling the collected images by marking
the locations of potholes and other relevant objects using bounding boxes.
Importance: The annotated images provide ground truth data required for
supervised learning, enabling the machine learning models to learn and recognize
potholes based on the labelled examples, thereby improving detection accuracy and
reducing false positives.
3. Data Augmentation:
Purpose: Data augmentation techniques are applied to increase the diversity and
robustness of the dataset, enhancing the model's generalization capabilities.
4. Data Splitting:
model's performance on unseen data, enabling iterative refinement and
improvement.
Process: The pixel values of the pre-processed images are normalized and
standardized to a common scale or range, such as [0, 1] or [-1, 1].
6. Data Integration:
Process: The geographical coordinates obtained from the GPS module are
mapped to the corresponding annotated images, creating a spatial reference for each
detected pothole.
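One way to picture the data integration step is to pair each captured frame with the GPS fix closest to it in time. The sketch below does this with a binary search; matching by timestamp, the time units, and the `nearest_fix` helper are assumptions made for illustration, since the report does not describe how the pairing is actually performed.

```python
from bisect import bisect_left


def nearest_fix(frame_time, fix_times, fixes):
    """Return the GPS fix whose timestamp is closest to frame_time.

    fix_times must be sorted in ascending order and aligned with fixes.
    """
    if not fix_times:
        raise ValueError("no GPS fixes available")
    i = bisect_left(fix_times, frame_time)
    if i == 0:
        return fixes[0]
    if i == len(fix_times):
        return fixes[-1]
    before, after = fix_times[i - 1], fix_times[i]
    # Prefer the earlier fix when the frame is equally close to both.
    return fixes[i] if after - frame_time < frame_time - before else fixes[i - 1]


# Hypothetical timestamps in seconds and (latitude, longitude) fixes.
fix_times = [0.0, 1.0, 2.0]
fixes = [(17.0, 78.0), (17.1, 78.1), (17.2, 78.2)]
print(nearest_fix(1.2, fix_times, fixes))
```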
data for machine learning model training, ensuring data quality, diversity,
consistency, relevance, and spatial context, ultimately facilitating the development
and deployment of an accurate, efficient, and scalable Realtime Pothole Detection
System capable of identifying, mapping, and addressing potholes in real time.
1. Hyperparameters:
Batch Size: A batch size of 16 was selected based on available memory and
computational resources, balancing training speed and memory usage.
Number of Epochs: The model was trained for 75 epochs to ensure adequate
learning without overfitting to the training data.
Optimizer: SGD with learning rate 0.0005, momentum 0.9, and Nesterov
momentum.
Loss Function: SSD internally uses a combined loss function addressing both
classification and localization tasks. Classification loss: Measures how well the
model classifies objects within bounding boxes. Localization loss: Measures the
difference between predicted and ground truth bounding boxes.
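For reference, the combined loss can be written as in the original SSD formulation. This is the published form, given here as an aid to the description above; the implementation used in the project may differ in detail.

```latex
L(x, c, l, g) = \frac{1}{N}\left( L_{\text{conf}}(x, c) + \alpha \, L_{\text{loc}}(x, l, g) \right)
```

Here N is the number of default boxes matched to ground truth, L_conf is the classification (confidence) loss over class scores c, L_loc is the localization loss (Smooth L1) between predicted boxes l and ground truth boxes g, x indicates which default boxes are matched, and alpha weights the two terms. When N is 0, the loss is taken to be 0.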
2. Model Architecture:
Object Detection Framework: SSD (Single Shot Detector) was employed for
predicting bounding boxes at different scales in a single pass through the network,
facilitating efficient and accurate object detection.
The key innovation of SSD lies in its ability to perform object detection in
a single forward pass through a neural network, eliminating the need for a two-stage
process as seen in previous models. This approach significantly speeds up the
detection process, making SSD highly suitable for real-time applications where
speed is crucial.
Architecture:
One of the distinguishing features of SSD is its use of multiscale feature
maps to predict bounding boxes at different resolutions. This enables SSD to detect
objects of various sizes effectively, from small to large, within a single network
architecture. Each feature map is associated with default bounding boxes of
different sizes and aspect ratios, allowing SSD to detect objects with high precision
across the entire image.
Default Boxes:
SSD uses a set of default boxes or anchor boxes at different aspect ratios
and scales to predict the location and size of objects within the image. These default
boxes are predefined and serve as starting points for the detection process, which is
refined based on the predictions made by the network. By utilizing default boxes at
multiple scales and aspect ratios, SSD achieves high detection accuracy and
robustness against objects of different sizes and orientations.
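The idea of default boxes at several scales and aspect ratios can be sketched as follows. The linear scale rule and the values s_min = 0.2 and s_max = 0.9 follow the original SSD paper; the number of feature maps, the aspect ratios, and the function names are assumptions, and the implementation used in the project may use different settings.

```python
import math


def default_box_scales(num_maps, s_min=0.2, s_max=0.9):
    """Linearly spaced box scales (relative to image size), one per feature map."""
    if num_maps < 2:
        raise ValueError("need at least two feature maps")
    step = (s_max - s_min) / (num_maps - 1)
    return [s_min + step * k for k in range(num_maps)]


def box_shapes(scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Return (width, height) pairs for one scale; each keeps area scale**2."""
    return [(scale * math.sqrt(ar), scale / math.sqrt(ar)) for ar in aspect_ratios]


scales = default_box_scales(6)
for s in scales:
    print(round(s, 3), [(round(w, 3), round(h, 3)) for w, h in box_shapes(s)])
```

Shallower feature maps receive the smaller scales and deeper ones the larger scales, which is how a single network covers both small and large potholes.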
Prediction Layers:
Loss Function:
Augmentation Techniques: A comprehensive set of augmentation techniques,
including random cropping, grayscale conversion, saturation adjustment, brightness
adjustment, hue adjustment, shadow adjustment, and nighttime simulation, was
applied to enhance model robustness and generalization capabilities.
4. Regularization Parameters:
Dropout Rate: A dropout rate of 0.5 was applied during training to prevent
overfitting by randomly deactivating neurons and promoting model generalization.
Weight Decay: A weight decay term with a value of 0.0001 was added to the loss
function to penalize large weights and prevent overfitting.
5. Evaluation Metrics:
Precision, Recall, and F1 Score: Precision, Recall, and F1 Score metrics were
computed, demonstrating the model's ability to correctly detect potholes, identify
all potholes, and balance Precision and Recall effectively.
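To make these metrics concrete, the following is a minimal sketch and not the project's evaluation code. It assumes boxes in (xmin, ymin, xmax, ymax) form, predictions already sorted by confidence, a greedy one-to-one matching against ground truth, and an IoU threshold of 0.5 (the threshold implied by mAP@50). The function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(predictions, ground_truth, iou_threshold=0.5):
    """Greedy matching: each ground truth box can be claimed by one prediction."""
    matched, true_pos = set(), 0
    for pred in predictions:
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx in matched:
                continue
            score = iou(pred, gt)
            if score > best_iou:
                best_iou, best_idx = score, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_pos += 1
    false_pos = len(predictions) - true_pos
    false_neg = len(ground_truth) - true_pos
    precision = true_pos / (true_pos + false_pos) if predictions else 0.0
    recall = true_pos / (true_pos + false_neg) if ground_truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up boxes: two ground truth potholes, three predictions.
truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(0, 0, 10, 10), (50, 50, 60, 60), (21, 21, 30, 30)]
print(precision_recall_f1(preds, truth))
```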
4.6 Results
adaptability and reliability in real-world scenarios where lighting can vary
significantly.
The model showcased the ability to detect potholes of diverse sizes, shapes, and
dimensions with high accuracy. This versatility ensures comprehensive coverage
and effective detection of potholes across different road surfaces and environments.
Figure(xi): Comprehensive website integrated with Google Maps API for
analyzing results and viewing potholes in a route
The comprehensive website serves as the central interface where users can
access and analyze the pothole detection results. It offers a user-friendly
environment with intuitive navigation and interactive features, allowing users to
easily explore and interpret the collected data.
The integration with Google Maps API enables the visualization of detected
potholes directly on the map interface, providing spatial context and geographical
distribution of the identified potholes. Users can view the locations of detected
potholes, zoom in/out for detailed analysis, and click on individual markers to
access additional information about each pothole, such as size, shape, and severity.
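As a sketch of how detections might be handed to a map view, the following converts pothole records into a GeoJSON FeatureCollection, a format that map libraries commonly accept for marker layers. The record fields and the `potholes_to_geojson` helper are assumptions; the report does not state which data format the website actually uses.

```python
import json


def potholes_to_geojson(potholes):
    """Build a GeoJSON FeatureCollection with one Point feature per pothole."""
    features = []
    for p in potholes:
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point",
                         "coordinates": [p["longitude"], p["latitude"]]},
            "properties": {"severity": p.get("severity"),
                           "timestamp": p.get("timestamp")},
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical records, shaped like entries read from the database.
sample_potholes = [
    {"latitude": 17.3850, "longitude": 78.4867, "severity": "high",
     "timestamp": "2024-01-15T09:30:00+00:00"},
    {"latitude": 17.4000, "longitude": 78.5000, "severity": "low"},
]
collection = potholes_to_geojson(sample_potholes)
print(json.dumps(collection, indent=2))
```

The properties attached to each feature are what a marker click handler could display as the additional information for that pothole.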
potholes based on various parameters, such as location, size, and detection
confidence, to identify trends, patterns, and areas requiring attention.
The real-time integration with Google Maps API ensures that the website's
map interface reflects the latest pothole detection results, providing users with up-
to-date information and enabling prompt action and response to identified issues.
4.7 Analysis:
Table (I): Results obtained while training SSD model under various parameters
1. Momentum Variation:
Momentum values of 0.5 and 0.9 were tested. A momentum value of 0.9 yielded
a mAP@50 of 85.85, whereas a momentum of 0.5 resulted in a significantly lower
mAP@50 of 13.15. The 0.5 run also used a much lower learning rate (0.0000002
versus 0.01), so the drop cannot be attributed to the momentum value alone.
3. Optimizer Variation:
Two types of optimizers, SGD (Stochastic Gradient Descent) and Adam (Adaptive
Moment Estimation, also labelled AGD in this report), were tested. SGD yielded the
highest mAP@50 values of 88.676 and 85.85, with momentum values of 0.1 and 0.9,
respectively. Adam resulted in lower mAP@50 values ranging from 3.5 to 49.9 across
different configurations. SGD appears to be more effective than Adam in optimizing
the model for higher detection accuracy in this context.
The achieved train loss values varied across different configurations. Lower train
loss values generally corresponded to higher mAP@50 values, although the
relationship was not strict. For instance, a train loss of 0.545 resulted in a
mAP@50 of 25.769, while a slightly higher train loss of 0.6764 yielded a mAP@50
of 88.676, which shows that training loss alone does not determine detection
performance.
Moderate learning rates around 0.01 appear to contribute to better detection
performance with SGD, at momentum values of both 0.1 and 0.9. The SGD optimizer
outperformed the Adam (AGD) optimizer in optimizing the model for higher detection
accuracy. Lower train loss values are generally indicative of better model performance,
highlighting the importance of effective model training and optimization strategies.
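The role of the momentum term discussed in this analysis can be illustrated on a toy problem. The sketch below applies the SGD-with-momentum update (with an optional Nesterov variant) to a one-dimensional quadratic. It is not the SSD training run and does not reproduce the reported results; the objective, starting point, and step count are assumptions chosen for illustration.

```python
def sgd_momentum(grad_fn, w0, lr, momentum, steps, nesterov=False):
    """Minimise a 1-D function with SGD plus momentum; returns the final weight."""
    w, velocity = w0, 0.0
    for _ in range(steps):
        g = grad_fn(w)
        velocity = momentum * velocity + g          # accumulate past gradients
        step = g + momentum * velocity if nesterov else velocity
        w -= lr * step
    return w


def grad(w):
    """Gradient of the toy objective f(w) = (w - 3)**2, minimised at w = 3."""
    return 2.0 * (w - 3.0)


w_low = sgd_momentum(grad, w0=0.0, lr=0.01, momentum=0.1, steps=50)
w_high = sgd_momentum(grad, w0=0.0, lr=0.01, momentum=0.9, steps=50)
print("momentum 0.1:", w_low)
print("momentum 0.9:", w_high)
```

On this toy objective the higher momentum value ends closer to the minimum within the same number of steps, because accumulated gradients act like a larger effective step size. Whether that carries over to a given detection model depends on the learning rate and the data.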
Figure(xiii): Iterations vs Train Loss and Epoch vs mAP@50
The above analysis is done for momentum 0.5 with learning rate of 0.0000002
having SGD (Stochastic Gradient Descent) optimizer achieving 13.15 mAP@50
and train loss of 39.944
The above analysis is done for momentum 0.9 with learning rate of 0.01 having
SGD (Stochastic Gradient Descent) optimizer achieving 85.85 mAP@50 and train
loss of 0.9760
Figure(xiv): Iterations vs Train Loss and Epoch vs mAP@50
The above analysis is done for momentum 0.9 with learning rate of 0.001 having
AGD (Accelerated Gradient Descent) optimizer achieving 49.90 mAP@50 and
train loss of 1.892
The above analysis is done for momentum 0.1 with learning rate of 0.001 having
ADAM optimizer achieving 25.769 mAP@50 and train loss of 0.5450
The above analysis is done for momentum 0.1 with learning rate of 0.01 having
ADAM optimizer achieving 14.2 mAP@50 and train loss of 5.026
The above analysis is done for momentum 0.1 with learning rate of 0.02 having
ADAM optimizer achieving 3.5 mAP@50 and train loss of 7.5
The data with a momentum of 0.5, a learning rate of 0.0000002, and an SGD
optimizer produced a notably lower mAP@50 of 13.15 with a train loss of 39.944.
This configuration may be less favourable due to the extremely low learning rate.
With a momentum of 0.1, a learning rate of 0.01, and an SGD optimizer, the results
showed a mAP@50 of 88.676 and a train loss of 0.6764. This suggests that a
moderate learning rate contributes to better detection performance with SGD, even
at a low momentum value.
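The seven configurations reported in this analysis can be collected in one place and compared programmatically. The sketch below uses only the figures quoted in the text; the dictionary layout and variable names are assumptions for illustration.

```python
# Values as reported in the analysis above.
runs = [
    {"optimizer": "SGD", "momentum": 0.5, "lr": 0.0000002, "map50": 13.15, "train_loss": 39.944},
    {"optimizer": "SGD", "momentum": 0.9, "lr": 0.01, "map50": 85.85, "train_loss": 0.9760},
    # This run is labelled AGD in the report; the analysis text groups it with the Adam runs.
    {"optimizer": "Adam", "momentum": 0.9, "lr": 0.001, "map50": 49.90, "train_loss": 1.892},
    {"optimizer": "Adam", "momentum": 0.1, "lr": 0.001, "map50": 25.769, "train_loss": 0.5450},
    {"optimizer": "Adam", "momentum": 0.1, "lr": 0.01, "map50": 14.2, "train_loss": 5.026},
    {"optimizer": "Adam", "momentum": 0.1, "lr": 0.02, "map50": 3.5, "train_loss": 7.5},
    {"optimizer": "SGD", "momentum": 0.1, "lr": 0.01, "map50": 88.676, "train_loss": 0.6764},
]

best = max(runs, key=lambda r: r["map50"])
lowest_loss = min(runs, key=lambda r: r["train_loss"])
ranked = sorted(runs, key=lambda r: r["map50"], reverse=True)

print("best mAP@50:", best)
print("lowest train loss:", lowest_loss)
for r in ranked:
    print(r["optimizer"], r["momentum"], r["lr"], r["map50"], r["train_loss"])
```

Sorting by mAP@50 shows that the run with the lowest train loss is not the run with the best mAP@50, so the two quantities should be read together.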
5. CONCLUSION AND FUTURE SCOPE
The Realtime Pothole Detection System project was undertaken with the
primary objective of developing an efficient machine learning model capable of
real-time detection of potholes. To achieve this, the project focused on leveraging
the capabilities of the SSD (Single Shot Detector) model. A series of optimization
processes was conducted to fine-tune key hyperparameters, including the learning
rate, batch size, number of epochs, optimizer, and loss function. As a result of these
optimizations, specific values were determined for the hyperparameters, namely a
learning rate of 0.0005, a batch size of 16, 75 training epochs, the SGD
optimizer, and the default loss function.
Looking ahead, the project has several avenues for future development and
enhancement. Firstly, there is a need for further optimization of the model to
enhance its accuracy, efficiency, and real-time performance, potentially through
additional fine-tuning of hyperparameters or exploring advanced model
architectures. Secondly, improvements in real-time camera feed handling and
integration with IP streaming solutions could be pursued to bolster the system's
detection capabilities and responsiveness. Additionally, the integration of the
system with the Google Maps API could facilitate location-based analysis and
visualization of detected potholes, offering valuable insights for infrastructure
maintenance and planning.
Continued refinement of performance metrics, such as mAP@50, precision,
recall, and overall accuracy, will be essential to achieving higher standards and
reliability in pothole detection. Furthermore, retraining the model with an expanded
and diversified dataset comprising 5000+ augmented images could be undertaken
to further enhance its detection capabilities. Lastly, the development of a
comprehensive user interface integrated with the Google Maps API would enable
users to analyse, visualize, and report detected potholes and damages in real-time,
enhancing user engagement and system usability.
REFERENCES
[1] Mohammed Alamgeer, Hend Khalid Alkahtani, Mashael Maashi. Optimal Fuzzy
Wavelet Neural Network Based Road Damage Detection. (June 2023)
[2] Au Yang Her, Weng Kean Yew, Pang Jia Yew, Melissa Chong Jia Ying.
Realtime pothole detection system on vehicle using improved YOLOv5 in Malaysia.
(2022)
[3] Liu W, Wang Z, Liu X, Zeng N, Liu Y, Alsaadi FE. A survey of deep neural
network architectures and their applications. Neurocomputing. (2020)
[4] Alom MZ, Taha TM, Yakopcic C, Westberg S, Sidike P, Nasrin MS, Hasan M,
Van Essen BC, Awwal AA, Asari VK. A state-of-the-art survey on deep learning
theory and architectures. Electronics. (2021)
[5] Y. Darma, M. R. Karim, and S. Abdullah, “An analysis of Malaysia road traffic
death distribution by road environment,” Sadhana - Academy Proceedings in
Engineering Sciences, vol. 42, no. 9, pp. 1605–1615, Sep. 2017, doi:
10.1007/s12046-017-0694-9.
[6] “Malaysia’s Minister Khairy Jamaluddin injured from fall after bicycle hits
pothole,” The Straits Times, Dec. 28, 2020. Accessed: May 31, 2022. [Online].
Available:
https://www.straitstimes.com/asia/seasia/malaysiasministerkhairyjamaluddininjuredfromfallafterbicyclehitspothole
[7] K. Perimbanayagam, “75-year-old man killed after crashing into pothole,” New
Straits Times, Jan. 03, 2021. Accessed: May 31, 2022. [Online]. Available:
https://www.nst.com.my/news/nation/2021/01/654169/75yearoldmankilledaftercrashingpothole
[8] S. Thiruppathiraj, U. Kumar, and S. Buchke, “Automatic pothole classification
and segmentation using android smartphone sensors and camera images with
machine learning techniques,” in 2020 IEEE REGION 10 CONFERENCE
(TENCON), Nov. 2020, pp. 1386–1391. doi:
10.1109/TENCON50793.2020.9293883.
[9] R. Fan, U. Ozgunalp, B. Hosking, M. Liu, and I. Pitas, “Pothole Detection
Based on Disparity Transformation and Road Surface Modeling,” IEEE
Transactions on Image Processing, vol. 29, pp. 897–908, 2020, doi:
10.1109/TIP.2019.2933750.
[10] J. W. Baek and K. Chung, “Pothole classification model using edge detection
in road image,” Applied Sciences (Switzerland), vol. 10, no. 19. MDPI AG, Oct.
01, 2020. doi: 10.3390/APP10196662.
[11] Y. Jo and S. Ryu, “Pothole detection system using a black-box camera,” Sensors
(Switzerland), vol. 15, no. 11, pp. 29316–29331, Nov. 2015, doi:
10.3390/s151129316.
[12] S. Liu, L. Qi, H. Qin, J. Shi, and J. Jia, “Path Aggregation Network for Instance
Segmentation,” Mar. 2018.
[13] GitHub, “GitHub ultralytics/yolov5.” https://github.com/ultralytics/yolov5
(accessed May 31, 2022).
[14] https://docs.ultralytics.com/models/#featuredmodels
[15] https://www.kaggle.com/datasets/atulyakumar98/potholedetectiondataset
[16] https://www.kaggle.com/datasets/andrewmvd/potholedetection
[17] https://www.kaggle.com/datasets/rajdalsaniya/potholedetectiondataset
[18] https://www.kaggle.com/datasets/prudhvignv/roaddamageclassificationandassessment
[19] https://developers.google.com/maps/documentation/
[20] Firebase Documentation (google.com)