
Detection and Obstacle Recognition Technique for Blind Persons Using IoT

ABSTRACT Object detection is a critical task in computer vision, essential for real-time applications ranging from autonomous vehicles to surveillance systems. This work presents a comparative evaluation of single-shot and two-stage detection algorithms (YOLOv3, MobileNetv3, RetinaNet, and Faster R-CNN) in the context of real-time obstacle detection from camera images. The study evaluates these algorithms on the accuracy metrics precision, recall, and F1 score across various Intersection over Union (IoU) thresholds. Computational efficiency, in terms of the time taken per frame, is also assessed to determine the effectiveness of each algorithm. The workflow includes image processing, augmentation, and application of the detection models to objects such as vehicles, pedestrians, and traffic signals. Results indicate that YOLOv3 achieves the highest precision of 96%, demonstrating robust performance in real-time scenarios; MobileNetv3 follows closely with 92%, while RetinaNet and Faster R-CNN each achieve 90%. These findings contribute to understanding the trade-offs between accuracy and computational efficiency when selecting suitable detection models for practical deployment.

I. INTRODUCTION
Visual aids are fundamental for the visually impaired to interact with the environment, enabling them to navigate their surroundings, recognize objects, and receive audible outputs [1]. The world needs to address the unique challenges that necessitate such solutions. Object detection enables machines to recognize and identify objects in an image [2]. To solve this problem, an obstacle detection and recognition system that works in a real-time environment is required. Affected people need a device that can help them walk on the road and avoid major accidents. Artificial intelligence techniques have helped them recognize objects and have guided blind users both indoors and outdoors [3]. Advancements in deep learning have significantly improved the accuracy and efficiency of object detection algorithms. One such innovative approach is the family of single-shot detector (SSD) algorithms [4], alongside two-stage detectors. These detection algorithms are used in a real-time object detection framework that combines the strengths of both speed and accuracy. Traditional object detection methods involved multi-step processes, including region proposal generation and object classification. The single-shot detector (SSD) revolutionized this by performing both tasks in one step, making it much faster. The features used in this proposed work are the bounding box, confidence score, and thresholding [5]. Studies have been conducted to detect obstacles using sensors or cameras combined with machine learning and deep learning technologies. Mini cameras have been used to detect obstacles from images in live systems using machine learning techniques [6]. With the YOLO algorithm, cameras, and an image dataset, obstacles have been detected and recognized to help the visually impaired avoid accidents and walk more safely in both indoor and outdoor environments. These techniques are combined with text-to-speech to produce an audio voice [7].
The motivation behind this proposed work is:
• In certain scenarios, the obstacle detection system may need to interact with humans or provide interpretable outputs.
Challenges include designing user-friendly interfaces and ensuring the system's decisions are understandable and
trustworthy.
• Over time, the reliability of an obstacle detection system may be affected by factors such as model degradation, changing
environmental conditions, and wear and tear of sensors. Ensuring long-term reliability and ease of maintenance is an
ongoing challenge.
• Achieving real-time performance in obstacle detection systems is crucial for applications like autonomous vehicles,
where timely responses are essential. Ensuring that the detection models meet real-time constraints without sacrificing
accuracy is challenging.

Addressing these challenges requires a multidisciplinary approach, combining computer vision, machine learning, and domain-specific knowledge for the application at hand [8]. Continuous effort is essential to overcome these challenges and advance the effectiveness of obstacle detection and warning systems using SSD models [9]. In one navigation system, ultrasonic sensors installed in both shoes and a walking stick are used, together with algorithms that identify obstacles and immediately inform the user of their presence [10]. Most previous work reports obstacle detection rates below 96%. This explains the rationale behind our proposed model, which seeks to solve the problems of the visually impaired. Previous systems used computer vision algorithms to improve the accuracy of outdoor navigation systems for detecting obstacles during walking; however, these algorithms have been found incapable of providing the desired results [11]. The objective of this work is to provide a comprehensive analysis of the performance of several advanced detection algorithms, YOLOv3, MobileNetv3, RetinaNet, and Faster R-CNN, in the context of obstacle detection and recognition. Evaluation measures including precision, recall, and F1 score are compared across various datasets and experimental settings. This study begins with the idea of object detection specifically for the visually impaired and focuses on the effectiveness of single-shot and two-stage detection algorithms, which are considered for their capability to provide environmental awareness.
This paper has the following contributions:

• The central aim of our study is to focus on the performance, strengths, and limitations of detection algorithms when
applied in a real-time environment.
• The potential impact of this study is to revolutionize assistive technologies, to make them more accessible and
effective concerning accuracy and time for the visually impaired.
• Through an in-depth analysis of prior works utilizing deep learning detection algorithms, we identified key innovations and common defects. This critical examination of the strengths and weaknesses of various approaches informs efficient models.
• By providing a comprehensive evaluation and comparison of various detection algorithms, this study contributes to the broader field of autonomous systems. The findings and recommendations presented in this paper have significant implications for optimizing obstacle detection systems, particularly in the context of assistive navigation for the visually impaired.

The visually impaired face challenges in their daily lives, from navigation to obstacle avoidance. Object detection, when combined with accessible interfaces, can help them detect and avoid such obstacles. Detection algorithms have gained prominence for their speed and efficiency in real-time environments. The experiments in this paper present a comparative analysis.

The paper has been organized as follows.


• The Literature Review provides extensive details about previous work on obstacle detection models.
• The framework and architectures of the detection algorithms are discussed in the Materials and Methods section.
• Implementation and results of the different detection algorithms in different environments.
• Conclusion.

II. Related Work

Blind people face many challenges in their daily routine because of their dependency on others for daily tasks. These challenges affect their independence, mobility, and overall quality of life. Safety is the biggest challenge they face: without visual assistance they risk accidents, falls, and collisions with objects or other individuals. Navigation is among the most prominent challenges for visually impaired individuals; they face difficulties in crossing roads, avoiding obstacles, locating destinations, and maintaining independence. The review encompasses the various technologies, methodologies, and approaches utilized in developing these systems. It also highlights common challenges and limitations while emphasizing the real-world impact of such technology [12].
TABLE 1. Literature review of the existing studies.

| Reference | Research problem | Goal | Techniques and hardware | Achievements |
| [13]-2024 | To avoid collision | WBSS designed for a 7 m-class USV | F-KAZE algorithm | Proved good accuracy of the proposed technology |
| [14]-2024 | To navigate the visually impaired | To enhance vision and navigate blind people | Ultrasonic sensor and RGB sensing camera | Navigates well for the visually impaired |
| [15]-2023 | To remotely locate the stick | To achieve high security | GPS receiver, GSM modem, statistical approach | Detects and automatically gives an alert |
| [16]-2023 | To give directions by hand | Object tracking | Deep neural networks (DNN) and YOLO version 5 | Outperforms |
| [17]-2023 | Alert for terrorist attacks in the military | Sense distant objects and monitor human voice | HC-05, ultrasonic sensor, Arduino Uno microcontroller | Robot moves in all directions |
| [18]-2023 | Self-mobility | Movement with hand gestures | Preprogrammed algorithm | Health monitoring and alert system |
| [19]-2023 | To improve quality of life | To navigate and position | k-nearest neighbor (KNN) algorithm | Satisfactory accuracy |
| [20]-2023 | Detect objects in time | Monitor lifestyle | Statistical technique | Improved accuracy |
| [21]-2022 | Easy life for common individuals | Wearable, modest, versatile system | TensorFlow detection algorithm | Highly efficient and convenient system |

To provide quality work regarding obstacle detection and navigation systems, the major objectives of different research articles are discussed below. The following discussion covers the different approaches used for the visually impaired.
An obstacle detection algorithm utilizing deep learning is proposed for an unmanned surface vehicle (USV) equipped with a wide baseline stereovision system (WBSS) [13].
Over time, researchers have carried out various studies in a bid to develop gadgets that help the visually impaired. One manuscript is devoted to the different sensors applied in navigation systems designed for this group. The review looks at current developments in navigation aids for the visually impaired, such as visual, proximity, and LIDAR sensors, among others. These sensors produce a lot of data, which is then used to model the environment. The manuscript also addresses the strengths of each type of sensor and the configurations that provide the best outcomes. Some of the issues related to the use of these sensors are presented, as well as ways in which they can be addressed [14].

Another system assists users in both indoor and outdoor environments. Sensors are used to collect quantitative datasets to train the model, and the blind user is monitored through a mobile application for security and navigation. The system integrates a soil sensor, a stair detection sensor, and an obstacle sensor. Radio frequency is used to decide the obstacle-free track, and a Global Positioning System (GPS) is used for navigation. Keywords: Arduino Uno, ultrasonic sensor, infrared sensor, soil moisture sensor, Global Positioning System (GPS), Global System for Mobile Communication (GSM) module [15].
The aim of the work in [16], published in IEEE in 2023, is to develop a smart glove that gives directions with the hands. Deep neural networks and You Only Look Once (YOLO) version 5 are used to train the model. The system also includes cameras and microphones, and an image-based dataset is used to train the model. The contributions of this work include identifying the color of objects [16].
Another work aims to help the visually impaired by using smart cars. The car is composed of different sensors that detect objects, and it takes directions in the form of voice commands from the user. It also monitors the distance of the user from the object. An application was developed to select the path and observe the movement of the car, with quantitative data collected by the sensors to measure the distance from the object. The application uses artificial intelligence to train the model. This article was published in IEEE in 2023. Keywords: Arduino Uno, L298N, Bluetooth (HC-05), ultrasonic sensor (HC-SR04) [17].
Another paper aims to develop an IoT-based wheelchair for disabled persons. A machine learning algorithm is used to make pathfinding decisions more securely and safely, and the sensors used collect quantitative datasets for the results. Keywords: quadriplegia, paralysis, wheelchair, IoT, gestures, self-mobility, Artificial Intelligence (AI), Machine Learning (ML) [18].
One work has a remarkable effect on quality of life. Many features need to be measured to know the direction of a place, and brute-force matching with the k-Nearest Neighbor (KNN) algorithm is used for feature matching and for avoiding dangerous situations. Keywords: blindness, feature detection, feature matching, landmark detection, oriented FAST and rotated BRIEF (ORB) algorithm, visual impairment [19].
The YOLO algorithm, which belongs to the convolutional neural network (CNN) family, is used in another work to increase efficiency. The work addresses many problems the visually impaired face in their surroundings. A white cane is used as the obstacle-detection device, with sensors integrated into the cane to collect data. The cane is connected to a mobile application where the YOLO algorithm is trained, which increases the accuracy of the model. Keywords: modern cane, visually challenged, application, detection, accuracy [20].
In another system, a camera and detection sensors are used to detect objects. A Raspberry Pi 4B, camera, ultrasonic sensor, and Arduino are integrated on a stick. The user carries the stick to detect objects, and the data is transferred to a mobile application for classification. A deep learning algorithm, the TensorFlow Object Detection algorithm, is used for classification and to train the model. The ultrasonic sensor also detects obstacles and sends a beep to the user, while the camera supplies details about the object. The device is designed to make the user more secure in a crowd. Keywords: smart system, visual losses, biomedical sensor, object recognition, TensorFlow, Viola-Jones, ultrasonic sensor [21].

The challenges faced by the visually impaired limit their daily interaction with the surrounding world, and a navigation system is their biggest need for directions along their path. The IoT field plays a vital role in helping blind users walk in outdoor and indoor environments. In one such system, a mobile application is designed to process the data: sensors collect data and send it to the mobile application for preprocessing and classification, and Google Maps is used to give directions. A prototype was developed and tested in different environments, improving accuracy from 81% to 99%. A CNN model is compared with a KNN classifier to improve performance. Keywords: visually impaired, walking assistance, IoT, machine learning [22].

Table 2 presents a discussion of both innovations and defects. It outlines the technologies employed in the referenced research articles, along with their respective weaknesses. This comparative analysis aims to highlight the novelty of the proposed work.
Research questions for the comparative study on obstacle detection for the visually impaired are:

RQ1: How do different detector algorithms perform in terms of accuracy and reliability for obstacle detection across various
environmental conditions?
RQ2: How does the performance of deep-learning-based detector algorithms compare in the obstacle detection model, and are
there scenarios where one approach outperforms the other?
TABLE 2. Innovations and defects in previous work.

| Type | Object detection algorithm | Innovations | Defects |
| One-shot detector | YOLOv3 | A powerful and efficient algorithm for target detection [23]. | Needs a deeper focus on implementation comparisons, including scenario analysis. |
| One-shot detector | EfficientNet | Several techniques used, including weight decay, mosaic data augmentation, and loss normalization, to improve the performance of EfficientNet [24]. | Neural architecture search (NAS) is planned to improve the universality and performance of detection. |
| Two-shot detector | R-CNN | Amalgamated deep learning methods with traditional methods [25]. | Training time is too long. |
| One-shot detector | RetinaNet | Recognizes objects of many sizes using a feature pyramid network and focal loss [26]. | Used the Kalman filter algorithm to obtain the most accurate results for object recognition. |
| One-shot detector | MobileNetv3 | UAV-based object detection model [27]. | Lower model accuracy, imperfect real-time performance, and insensitivity to occlusion in the image. |
| Two-shot detector | CenterNet | By appropriately reducing structural complexity, CenterNet attains a favorable trade-off [28]. | Generates incorrect bounding boxes. |
| Two-shot detector | Faster R-CNN | Employs an innovative approach by utilizing a convolutional network to generate proposal boxes, reducing the number of proposals from 2000 to 300 [29]. | Gives low accuracy when used with the VGG16 network. |

The innovations and defects found in previous research are discussed in Table 2. Models that used one-shot and two-shot detectors are cited in the table above, along with their findings.

III. MATERIALS AND METHODS

In recent times, the deep learning-based object detection field has witnessed the emergence of numerous high-performance methods. These approaches specifically consider the unique characteristics of road traffic [30], frequent changes in lighting and background conditions [31], diverse dimensions and positions of obstacles in images [32], as well as the shape, size, and color of the obstacles [33]. Given the demanding real-time performance requirements, the object detection algorithms were selected from a multitude of options. The aim is to conduct a comparative analysis of these algorithms, emphasizing their accuracy and real-time capabilities.
This section elucidates the comprehensive design of the obstacle detection model, comprising two distinct segments: the input phase and the analysis phase. The input phase involves the integration of a sensing camera to observe the surroundings, while the analysis phase unfolds in a Python Jupyter environment. This section also delves into the implementation details of the obstacle detection model. The obtained data consists of frontal image values, and this dataset is sent to the microcontroller to be fed into the system. Real-time experiments were performed using datasets received from outdoor and indoor environments, and the model functioned successfully during these experiments. The gathered dataset serves as the foundation for generating results in Python. The dataset comprises information collected from a camera monitoring the surroundings. To enhance the accuracy of the results, the dataset undergoes preprocessing in the Jupyter environment, ensuring a robust foundation for subsequent analysis and interpretation.

A. Architecture of the proposed model


The framework of object detection algorithms typically involves several key components and stages. Figure 1 shows the external architecture of the proposed model.

[Figure 1 shows the external architecture: a mini AI camera senses and detects obstacles, and the data is transmitted over a Bluetooth device for preprocessing before the processed data is passed on.]

FIGURE 1. External architecture of the proposed obstacle detection model.



The internal architecture of the obstacle detection model is discussed in Figure 2.

[Figure 2 shows the internal pipeline: 1. image capture, 2. preprocessing, 3. region proposal (two-shot detector) or 4. prediction (one-shot detector), 5. post-processing steps, 6. output, 7. evaluation, and 8. fine-tuning.]

FIGURE 2. Internal architecture of the proposed obstacle detection model.

The overview of the external architecture for an obstacle detection system using the mini camera, Arduino Uno, Bluetooth device, mobile application, and hands-free module is shown in Figure 1. The mini cameras capture images of the environment, which are processed to detect obstacles. This processing can be done directly on the camera; in the proposed model, the camera is fixed on knee gloves, which makes it easy for the user to scan the surroundings. The Arduino Uno is the control unit of the proposed system: the algorithms implemented on it make decisions based on the input, using object recognition and edge detection techniques to identify obstacles. The mobile application serves as the user interface; it receives information from the Arduino Uno and displays details of the detected obstacle to the user. The application also provides additional information for navigation. The pipeline stages, with a sketch of the post-processing step after this list, are:
• Image capture: The process begins by capturing an image, which is sent to the dedicated mobile application developed for this purpose, as illustrated in Figure 2.
• Preprocessing: Upon receiving the image, the server performs preprocessing steps to enhance the quality of the image and prepare it for further analysis. This involves considering input size, anchor box configuration, and feature map resolution. The preprocessed image is passed through a pre-trained classification model whose purpose is to classify the content of the image, in this case identifying specific objects and patterns in indoor and outdoor environments.
• Region proposal (two-stage detector): In two-stage detectors, a region proposal network (RPN) may be employed to generate candidate bounding boxes or regions of interest (RoIs) that are likely to contain objects.
• Prediction (one-stage detector): For a one-stage detector, predictions for bounding boxes and class probabilities are made directly in one step through the network, eliminating the need for a separate region proposal stage.
• Post-processing: Post-processing steps may include non-maximum suppression (NMS) to filter redundant or overlapping bounding boxes and improve the final set of detections.
• Output: The final output consists of the detected objects, along with their corresponding bounding boxes and class labels.
• Evaluation: The algorithm's performance is evaluated using metrics such as precision, recall, and F1 score, often on a separate validation or test dataset.
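As a concrete illustration of the NMS post-processing step, the following is a minimal NumPy sketch; the corner-format boxes and the 0.5 IoU threshold are illustrative assumptions rather than the exact settings used in the experiments.

```python
import numpy as np

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS over [x1, y1, x2, y2] boxes (illustrative sketch).

    Keeps the highest-scoring box, removes boxes that overlap it
    beyond iou_threshold, and repeats on the remainder.
    """
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(best)
        # IoU of the best box against all remaining boxes
        x1 = np.maximum(boxes[best, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[best, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[best, 2], boxes[order[1:], 2])
        y2 = np.minimum(boxes[best, 3], boxes[order[1:], 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_best = (boxes[best, 2] - boxes[best, 0]) * (boxes[best, 3] - boxes[best, 1])
        area_rest = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                    (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_best + area_rest - inter)
        order = order[1:][iou <= iou_threshold]  # drop overlapping boxes
    return keep
```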

B. Data Preprocessing and Augmentation


Image processing is the process of detecting individual objects in images. Image classification alone sometimes cannot produce good results, because the image may contain various kinds of noise or many other objects that are not of interest [34]. The input image is preprocessed and augmented in Python, which converts the image into a mask. In an outdoor environment there can be multiple things in the image, and the mask helps identify these multiple objects [35]. This approach is commonly used in self-driving cars [36], because it is important not only to detect the objects in front of you but also their shapes, in order to avoid collision. This technique will therefore help the visually impaired when walking on roads, as it helps in classifying the important objects. Figure 3 shows how the extraction of meaningful image regions reduces noise and improves the accuracy of the model. Image classification algorithms focus on the highlighted objects and help in recognizing them by assigning class labels [37].
After preprocessing, augmentation is applied to the training dataset, which improves the robustness and generalization ability of machine learning models. The process involves applying various transformations to the original images: rotating the image by different degrees; shifting the image along the x and y axes; resizing; applying a shearing transformation, which skews the image; flipping the image horizontally and vertically; increasing and decreasing brightness; modifying contrast; changing color intensity; shifting the colors of the image; adding Gaussian noise; cropping random portions of the image; randomly hiding parts of the image to simulate partial occlusion; applying motion blur; changing the perspective of the image as if viewed from a different angle; and adjusting the gamma value to control overall brightness. All of this preprocessing and augmentation is done in Python Jupyter and applied to 100 training images.
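As a rough illustration of these augmentations, the sketch below applies a subset of them with TensorFlow image ops; it assumes float images scaled to [0, 1], and the parameter ranges are illustrative choices rather than the exact values used in the study.

```python
import tensorflow as tf

def augment(image):
    """Apply a subset of the augmentations described above.

    Assumes `image` is a float32 RGB tensor with values in [0, 1].
    """
    image = tf.image.random_flip_left_right(image)       # horizontal flip
    image = tf.image.random_flip_up_down(image)          # vertical flip
    image = tf.image.random_brightness(image, max_delta=0.2)
    image = tf.image.random_contrast(image, 0.8, 1.2)
    image = tf.image.random_hue(image, max_delta=0.05)   # color shift
    # Gaussian noise, as mentioned in the text
    noise = tf.random.normal(tf.shape(image), stddev=0.02)
    return tf.clip_by_value(image + noise, 0.0, 1.0)
```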

IV. RESULTS
The model is trained and tested on real-time objects. It measures discontinuities in the image via point, edge, and line detection, responding to the high-frequency content of objects while ignoring noise [38]. Figure 3 shows the results of preprocessing and augmentation. Images are resized to the input size the SSD model expects. Pixel values are scaled to a range the model can handle, usually between 0 and 1, by dividing them by 255. The mean RGB values calculated from the training dataset are then subtracted to center the data, which helps improve the model's convergence during training. Data augmentation techniques, including random cropping, flipping, rotation, and color jittering, are applied to make the model more robust to variations in the input data.
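A minimal sketch of this preprocessing is given below, under the assumption of a 300x300 SSD input and placeholder per-channel means (the study's actual dataset means are not reported).

```python
import numpy as np
import tensorflow as tf

INPUT_SIZE = (300, 300)  # assumed SSD input size, for illustration
# Placeholder per-channel means; in practice, compute from the training set.
MEAN_RGB = np.array([0.485, 0.456, 0.406], dtype=np.float32)

def preprocess(image):
    image = tf.image.resize(image, INPUT_SIZE)    # resize to model input
    image = tf.cast(image, tf.float32) / 255.0    # scale pixels to [0, 1]
    return image - MEAN_RGB                       # center with dataset means
```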

A. EXPERIMENTAL RESULTS OF PREPROCESSING AND AUGMENTATION

The results were obtained using Python Jupyter. The libraries used in this process are NumPy and TensorFlow.
FIGURE 3. Preprocessing and augmentation of images from different environments.
B. IMAGE CLASSIFICATION BY USING SSD MODELS

The process of detecting the right object at the right time is the most important part of this model; any negligence in measuring the parameters can cause an accident or other loss. This intelligent real-time obstacle detection model is proposed to avoid such accidents, and its notification system is an intelligent approach to informing the visually impaired of their current situation so they can avoid collisions. When a color image is fed into the input layer of the one-stage and two-stage algorithms, the algorithm goes through a series of steps to perform object detection. After image segmentation, several detection algorithms are used to compare performance with respect to time and accuracy.

The steps used by the detection algorithms to recognize an object in the image are listed below, with an illustrative anchor-box sketch after the list:

• VGG-16 network: This network is used to extract features from the original image.
• Feature extraction: The resized and preprocessed image is passed through the one-stage model's convolutional layers, which serve as a feature extractor. These layers capture hierarchical features at different scales in the image, recognizing patterns, edges, textures, and object-specific features.
• Anchor box generation: At various levels of the feature hierarchy, the one-stage algorithms generate a set of anchor boxes (default bounding boxes). These boxes have different aspect ratios and scales to capture objects of varying sizes and proportions. For each anchor box, the one-stage model then makes predictions regarding two key aspects.
• Class score: The model assigns a class score to each anchor box, indicating the likelihood that the box contains an object of a particular class. These class scores are obtained through classification layers.
• Bounding box: The model predicts bounding box offsets to adjust the anchor boxes and better fit the actual objects. These offsets are obtained through regression layers. The one-stage algorithm can detect multiple objects in a single pass through the neural network, making it highly efficient for real-time applications; it leverages anchor boxes to handle objects of various sizes and aspect ratios and produces reliable detection results.
• Final predictions: The remaining bounding boxes, along with their associated class labels and confidence scores, are the final predictions of the one-stage algorithm, indicating the objects detected in the input image and their locations.
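To make the anchor-box mechanics concrete, here is a hypothetical NumPy sketch of default-box generation for one feature-map level and a common SSD-style offset decoding; the scales, aspect ratios, and decoding convention are assumptions for illustration, not the configuration used in the experiments.

```python
import numpy as np

def make_anchors(feature_size, scales=(0.1, 0.2), ratios=(1.0, 2.0, 0.5)):
    """Generate default (anchor) boxes for one feature-map level,
    in normalized center form (cx, cy, w, h); values are illustrative."""
    anchors = []
    for i in range(feature_size):
        for j in range(feature_size):
            cx, cy = (j + 0.5) / feature_size, (i + 0.5) / feature_size
            for s in scales:
                for r in ratios:
                    anchors.append([cx, cy, s * np.sqrt(r), s / np.sqrt(r)])
    return np.array(anchors)

def decode(anchors, offsets):
    """Apply predicted offsets (tx, ty, tw, th) to the anchors,
    a typical SSD-style decoding assumed here for illustration."""
    cxcy = anchors[:, :2] + offsets[:, :2] * anchors[:, 2:]
    wh = anchors[:, 2:] * np.exp(offsets[:, 2:])
    return np.concatenate([cxcy, wh], axis=1)
```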
[Figure 4 panels show object detection results from RetinaNet, MobileNetv3, YOLOv3, and Faster R-CNN.]
FIGURE 4. Object Detection and Recognition Using SSD models.

As shown in Figure 4, the behavior of one-stage and two-stage algorithms depends on many factors, such as model architecture, the hardware used, and the implementation techniques. The proposed model uses three one-shot algorithms and one two-shot algorithm; they can be differentiated in terms of speed and efficiency, but no single algorithm is the fastest in all scenarios. The model is trained on 80 classes, covering most objects a user can encounter in outdoor environments.
The key mathematical equations and concepts involved in each algorithm are as follows.
RetinaNet is known for its focal loss, which is useful in dealing with class imbalance in object detection:

$$\text{Focal Loss} = -w_t \,(1 - T_p)^f \log(T_p) \quad (1)$$

where $T_p$ is the predicted probability for the true class, $w_t$ is the weighting factor for the class, and $f$ is the focusing parameter that adjusts the importance of hard examples. The bounding box regression uses the smooth L1 loss:

$$L_{reg}(x, x^*) = \begin{cases} 0.5\,(x - x^*)^2 & \text{if } |x - x^*| < 1 \\ |x - x^*| - 0.5 & \text{otherwise} \end{cases} \quad (2)$$

where $x$ denotes the predicted bounding box coordinates and $x^*$ the ground truth coordinates.
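A small NumPy sketch of these two losses is given below, using typical values for the weighting and focusing parameters (the exact values used in the study are not reported).

```python
import numpy as np

def focal_loss(t_p, w_t=0.25, f=2.0):
    """Equation (1): down-weights well-classified examples.
    w_t and f are common defaults, assumed here for illustration."""
    return -w_t * (1.0 - t_p) ** f * np.log(t_p)

def smooth_l1(x, x_star):
    """Equation (2): quadratic for small errors, linear for large ones."""
    d = np.abs(x - x_star)
    return np.where(d < 1.0, 0.5 * d ** 2, d - 0.5)
```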
MobileNetv3 is designed for efficient mobile vision applications with reduced computational complexity and increased accuracy. Depthwise convolution:

$$A_x = B_x * L_x, \quad x = 1, \dots, X \quad (3)$$

where $B_x$ is the $x$-th input channel, $L_x$ is the corresponding filter, and $*$ denotes the convolution operation. Pointwise convolution:

$$\text{Output}_p = \text{PointwiseConv}(\text{Output}_d) = \text{Conv}_{1\times 1}(\text{Output}_d) \quad (4)$$

where $\text{Output}_p$ is the output feature map after the pointwise convolution, $\text{PointwiseConv}$ represents the pointwise convolution operation, $\text{Output}_d$ is the input feature map (usually the result of a depthwise convolution in the MobileNet architecture), and $\text{Conv}_{1\times 1}$ is a convolution with a kernel size of $1\times 1$.
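In Keras terms, Equations (3) and (4) correspond to a depthwise convolution followed by a 1x1 pointwise convolution; the sketch below is illustrative, with placeholder filter counts rather than the actual MobileNetv3 configuration.

```python
import tensorflow as tf

# Depthwise-separable block: a per-channel spatial filter (Eq. 3)
# followed by a 1x1 pointwise mix across channels (Eq. 4).
# The filter count of 64 is a placeholder for illustration.
block = tf.keras.Sequential([
    tf.keras.layers.DepthwiseConv2D(kernel_size=3, padding="same",
                                    activation="relu"),  # Eq. (3)
    tf.keras.layers.Conv2D(filters=64, kernel_size=1,
                           activation="relu"),           # Eq. (4)
])
```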
YOLOv3 is a real-time object detection system that frames object detection as a single regression problem, straight from image pixels to bounding box coordinates and class probabilities. For each grid cell, YOLOv3 predicts:

$$P(\text{class}_x \mid \text{Object}) = \frac{a_{\text{class}_x}}{\sum_{b=1}^{C} a_{\text{class}_b}} \quad (5)$$

The final score for each class in a bounding box is the product of the objectness score and the class probability:

$$S(\text{class}_x) = P(\text{object}) \cdot P(\text{class}_x \mid \text{object}) \quad (6)$$

YOLOv3 uses a multi-part loss function:

$$\text{Loss} = \lambda_{coord} \sum_i \mathbb{1}^{obj}_i \left[\text{IOU}_{pt} + \text{MSE}_{coord}\right] + \lambda_{noobj} \sum_i \mathbb{1}^{noobj}_i \,\text{MSE} \quad (7)$$

where $\text{IOU}_{pt}$ is the Intersection over Union between predicted and ground truth boxes, $\text{MSE}_{coord}$ is the mean squared error for the coordinates, $\text{CE}_{class}$ is the cross-entropy for classification, and the $\lambda$ terms are weights.
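Equations (5) and (6) can be sketched as follows; this is a simplified per-cell illustration rather than the full YOLOv3 head.

```python
import numpy as np

def class_scores(objectness, class_logits):
    """Softmax over class logits (Eq. 5), scaled by the box's
    objectness score (Eq. 6)."""
    e = np.exp(class_logits - class_logits.max())  # numerically stable
    p_class = e / e.sum()                          # Eq. (5)
    return objectness * p_class                    # Eq. (6)
```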
Faster R-CNN introduces a Region Proposal Network (RPN) to generate region proposals and then applies Fast R-CNN for object detection. Class prediction probability:

$$p(\text{class}_a) = \frac{e^{s_a}}{\sum_{j=1}^{C} e^{s_j}} \quad (8)$$

$$\text{Box Loss}_{FastRCNN} = \text{SmoothL1}(X_{true}, Y_{pred}) \quad (9)$$

where the smooth L1 loss aims to make the model robust to outliers by reducing sensitivity to large errors.
C. PERFORMANCE METRICS RESULTS
The performance of an algorithm also depends on factors such as input image resolution, input size, and hardware acceleration. To determine the fastest algorithm in this proposed work, an experiment was conducted on images and the results were compared, considering factors such as accuracy and the detection time taken by each algorithm.
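A simple timing harness in the spirit of this experiment is sketched below; `detector` stands for any of the four models wrapped as a callable, which is an assumption of this illustration.

```python
import time

def mean_detection_time(detector, images):
    """Average seconds per frame over a fixed image set."""
    start = time.perf_counter()
    for img in images:
        detector(img)  # run one forward pass per image
    return (time.perf_counter() - start) / len(images)
```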

FIGURE 5. Detection time comparison of SSD Detection Algorithms


MobileNetv3 and YOLOv3 perform better than RetinaNet and Faster R-CNN. During the experiment, detection time was tracked and plotted as a bar chart. Figure 5 shows object detection times for the different algorithms when working on a fixed number of images or frames; this comparison is essential in determining how efficient and effective each algorithm may be in real-time applications. The y-axis represents the time taken by each algorithm to process the same 100 images, measured in seconds. To make the comparison fair, all algorithms were tested on identical images. Each bar represents the detection time for a particular algorithm, with its height depicting how long that algorithm takes to process the images. The YOLOv3 algorithm has the shortest detection time, 7 seconds, so it can be concluded that YOLOv3 processes images efficiently and is suitable for real-time applications requiring high speed. MobileNetv3 takes 8.7 seconds, just slower than YOLOv3, while RetinaNet is slower still, indicating that it may not be the most efficient in speed but may offer other advantages in accuracy and robustness. The Faster R-CNN algorithm has the longest detection time, about 9 seconds, making it the slowest among them.
A second visualization shows how confident each of these algorithms is in its object detections and how those confidence levels vary. The model has been trained for 80 classes, covering the objects a user might come across while walking on the road. The x-axis represents the detection confidence score, ranging from 0 to 1; these scores indicate the probability that a detected object belongs to a particular class, while the y-axis shows the frequency of such confidence scores within the dataset. The graph combines histograms and kernel density estimation (KDE) plots for each algorithm, providing smooth curves representing the distributions of their confidence scores.
YOLOv3: YOLOv3 gives fairly high confidence scores, with many detections beyond 0.80; the KDE plot shows a peak around 0.90, indicating strong confidence in its detections.
MobileNetv3: MobileNetv3 has confidence scores above 0.80 with peaks around 0.90, indicating reliability similar to YOLOv3.
RetinaNet: RetinaNet exhibits a slightly wider spread of confidence scores, with a definite peak at about 0.85. Although many of its detections are confident, they show more variability than those of YOLOv3 and MobileNetv3.
Faster R-CNN: Faster R-CNN shows confidence scores with a peak around 0.87, indicating strong detection confidence. Similar to the other algorithms, Faster R-CNN demonstrates high confidence in its detections.

FIGURE 6. Detection confidence scores of SSD models.

The high confidence scores across all algorithms suggest robust performance in the object detection task, as illustrated in Figure 6. YOLOv3 and MobileNetv3 show particularly high and consistent confidence levels, while RetinaNet and Faster R-CNN also exhibit strong confidence with slightly more variability. YOLOv3 achieved 96% accuracy, MobileNetv3 achieved 92%, and RetinaNet and Faster R-CNN each achieved 90% accuracy on the training data.
FIGURE 7. Accuracy Comparison of Detection Algorithms used in Obstacle detection model

The model was experimented with on the road and trained with the detection algorithm that gives the best performance compared to the others. In the road experiment, a dataset was collected to check the accuracy of the trained model. The line chart in Figure 7 compares the performance of the four object detection algorithms, YOLOv3, RetinaNet, MobileNetv3, and Faster R-CNN, across metrics and threshold variations. The metrics evaluated are precision, recall, and F1 score, and the thresholds represent different Intersection over Union (IoU) values used to determine true positive detections. The y-axis represents the scores of the three performance metrics, each of which provides a different perspective on the model's detection accuracy and reliability. Measuring the accuracy of the algorithms for obstacle detection involves metrics that evaluate different aspects of the model's performance; a sketch of the IoU computation underlying these thresholds follows.
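The IoU underlying these thresholds can be computed as in the short sketch below, assuming corner-format [x1, y1, x2, y2] boxes.

```python
def iou(a, b):
    """Intersection over Union of two [x1, y1, x2, y2] boxes; a
    detection counts as a true positive when IoU exceeds the threshold."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)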

$$\text{Precision} = \frac{TP}{TP + FP} \quad (10)$$

Precision measures the proportion of true positive detections out of all positive detections, providing a single measure of the model's accuracy across all object classes.

$$\text{Recall} = \frac{TP}{TP + FN} \quad (11)$$

Recall measures the proportion of true positive detections out of all actual obstacles (true positives plus false negatives).

$$\text{F1 score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \quad (12)$$
The legend on the chart identifies each algorithm and the corresponding performance metric, helping to distinguish between the different lines.
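Given raw true positive, false positive, and false negative counts at one IoU threshold, Equations (10) to (12) reduce to the following sketch (assuming nonzero denominators).

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall, and F1 from raw counts; Eqs. (10)-(12)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```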
FIGURE 8. Receiver Operating Characteristic curves of the models.

The Area Under the Curve (AUC) score is used to evaluate the performance of classification models. Specifically, the AUC score represents the area under the Receiver Operating Characteristic (ROC) curve shown in Figure 8, which plots the true positive rate against the false positive rate at various threshold settings. The models were tested on different images.
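One common way to obtain these curves, sketched here under the assumption that scikit-learn is available, that `y_true` holds binary per-detection labels, and that `y_score` holds the detector confidences, is:

```python
from sklearn.metrics import roc_curve, auc

def roc_auc(y_true, y_score):
    """Compute the points of the ROC curve and the area under it."""
    fpr, tpr, _ = roc_curve(y_true, y_score)
    return auc(fpr, tpr)
```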

FIGURE 9. Detection error tradeoff graphs of classifiers.

The graph in Figure 9 illustrates the false positive rate against the false negative rate. It helps in understanding the tradeoff between detection errors and is useful in evaluating and comparing the performance of different classification models.
YOLOv3: YOLOv3 shows strong precision values, but its recall and F1 scores exhibit more variability than its precision.

MobileNetv3: The precision values for MobileNetv3 are consistently high across all thresholds, indicating robust performance in detecting and recognizing obstacles, although it may miss some obstacles at certain thresholds. Its recall and F1 scores also remain stable, suggesting good overall accuracy and a good balance between precision and recall.

RetinaNet: RetinaNet's precision values are lower than those of MobileNetv3 and YOLOv3, but they show an upward trend, suggesting it becomes more effective at identifying obstacles as the IoU threshold increases. The F1 score reflects this improvement.

Faster R-CNN: Faster R-CNN's precision demonstrates high values similar to MobileNetv3, with recall and F1 scores that are stable and slightly lower, indicating reliable performance across different thresholds.
FIGURE 10. Detection scores for various classes using the different SSD algorithms.
Figure 10 shows heatmaps that illustrate the detection scores for various object classes across ten different images. This visualization provides an intuitive understanding of how well each object detection algorithm performs on the different classes in the dataset. The dataset consists of detection scores or frequencies for eight object classes: car, person, bicycle, truck, rickshaw, footpath, traffic light, and bus, recorded across images 1 to 10. Each cell in a heatmap represents the detection score for a particular class in a specific image; the x-axis lists the object classes and the y-axis shows the image numbers. Each cell's intensity corresponds to the detection score, with darker shades indicating higher scores, and annotating each cell with the actual value helps in precisely understanding the model's performance. The heatmap allows a quick comparison of detection scores across classes; for instance, it can be seen whether certain classes, such as car and person, consistently have higher detection scores across all images. Performance across the different images can also be evaluated: if some images have low detection scores for all classes, this could be due to conditions such as low light or occlusion.
FIGURE 11. Confusion matrices based on the classification performance of the SSD models.

In this proposed work, confusion matrices have been computed to evaluate the classification performance of each object detection algorithm. In each confusion matrix in Figure 11, the horizontal axis represents the predicted label while the vertical axis represents the true label of the dataset. The confusion matrix reports the performance of YOLOv3 in relation to the ground truth labels, showing the number of true positives, true negatives, false positives, and false negatives for each class. The MobileNetv3 confusion matrix gives information about that algorithm's classification behavior, highlighting where the model performs well or poorly in identifying objects within the dataset. The confusion matrix of the RetinaNet algorithm measures its accuracy over the different object classes, pointing out both areas of strength and areas needing improvement in the detection of specific classes. The confusion matrix of Faster R-CNN measures the accuracy of that algorithm's classification, showing how well the model separates object classes in its predictions. Together, these matrices provide a quantitative comparison of each algorithm's performance, which facilitates identifying the best approach to object detection. The detection model parameters include the confidence score, class score threshold, input image size, and input image scale.
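A minimal sketch of how such a matrix is accumulated from integer class labels follows; the label encoding is an assumption of this illustration.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true labels, columns are predicted labels."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1  # count one (true, predicted) pair
    return cm
```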

TABLE 3. Comparison based on two categories: speed and accuracy.

| Training model | Speed (how fast) | Accuracy (correct) |
| RetinaNet (SSD) | 9.5 ms | 90% |
| MobileNetv3 (SSD) | 7.2 ms | 92% |
| YOLOv3 | 10.1 ms | 96% |
| Faster R-CNN | 11.8 ms | 90% |

Table 3 above summarizes the concluded results of all the detection algorithms tested. After comparing the different results, it is concluded that YOLOv3 gives the best results in the proposed model when trained in a real-time environment. All the detection algorithms were good in different respects, but the model is trained with YOLOv3, which proved best overall.
TABLE 4. Accuracy comparison with existing work.

| Reference | Accuracy (existing work) | Technique (existing work) |
| [39] | Accurate navigation in space environments (qualitative) | Single Shot MultiBox Detector |
| [40] | Reliability and safety (qualitative) | YOLOR algorithm |
| [41] | 90.49% | MobileNetv3 and SSD |
| [42] | 0.781 | MobileNet and SSD |
| [43] | Safety and efficiency (qualitative) | YOLOv7 |
| Proposed work (our study) | 96% | YOLOv3 |
Table 4 compares the accuracies achieved in previous work with the accuracy of the proposed model. YOLOv3 in the proposed study shows a slight improvement in precision and F1 score compared to existing work, indicating better overall detection accuracy. The recall is also marginally higher, suggesting a better balance between precision and recall. All algorithms evaluated in the proposed study show improvements over existing work, suggesting advancements in object detection techniques and algorithm optimizations. Table 4 thus provides a comprehensive comparison of the accuracy of the various algorithms in the proposed study against existing work. The improvements observed underscore the effectiveness of the optimizations and techniques applied to these algorithms and can be useful for choosing the most suitable algorithm for a given application. Further research should extend these algorithms to make them more accurate and faster.
D. DISCUSSION
The comparison of the four algorithms, YOLOv3, MobileNetv3, RetinaNet, and Faster R-CNN, is relevant to real-time object detection. The assessment focused on two critical aspects: detection time and detection confidence.
1) Detection time:
The study provides a comprehensive evaluation of detection algorithms for obstacle detection and recognition, focusing on four prominent models: YOLOv3, MobileNetv3, RetinaNet, and Faster R-CNN. The analysis was performed with a dataset containing several object classes, and the evaluation was based on the models' precision, recall, and F1 score. The findings show that YOLOv3 produces the most accurate results, with the highest precision of 96% and recall and F1 scores of 90%. MobileNetv3 is second with an accuracy of 92%, followed by RetinaNet and Faster R-CNN at 90%. The primary outcomes of this investigation are as follows: the YOLOv3 model outperforms the others in accuracy and inference time, making it optimal for real-time use; MobileNetv3 offers an accuracy/inference-time ratio just below YOLOv3; and RetinaNet and Faster R-CNN are promising where the detector should be both highly accurate and reasonably fast.
2) Detection confidence:

The confidence score of YOLOv3 is high and very stable, with a maximum of approximately 96%, showing that its detections are highly reliable. This high level of confidence minimizes false positives and negatives, which is critical for accurate obstacle recognition. MobileNetv3 shows confidence scores comparable to those of YOLOv3, peaking slightly lower at around 92%, suggesting it also gives good detections and is a good candidate for applications where detection accuracy is important. RetinaNet showed a slightly wider spread in confidence scores, with a peak around 0.90; while this indicates a bit more variability in detection confidence, RetinaNet still offers robust performance, especially in scenarios where detection variability can be tolerated. The confidence scores of Faster R-CNN, peaking around 0.90, suggest strong detection reliability; despite its slower detection time, the high confidence scores justify its use in applications where accuracy is prioritized over speed.

3) Limitations:

Several limitations apply to this study.


• The experiments were performed on specific hardware configurations, and the results may vary with different setups. Computational resources, such as GPU capability, can significantly influence processing time and the feasibility of real-time applications.
• The models were evaluated in their standard forms without extensive optimization for specific use cases. Techniques like quantization, pruning, and hardware-specific optimization were not explored, although they could enhance real-time performance.
• The study primarily focused on precision, recall, and F1 score. Other important factors, such as model robustness, interpretability, and ease of deployment, were not covered in depth. These factors are crucial for practical implementations and should be considered in future research.

4) Assumptions:

A few assumptions were made during this study:

• It was assumed that the object classes in the dataset were uniformly distributed. In real-world scenarios, certain object types may be more prevalent than others, which could affect detection performance.
• The study primarily considered static objects in the evaluation process. Moving objects, or objects with significant motion blur, may present additional challenges that were not fully addressed in this analysis.
• It was assumed that the preprocessing steps (such as resizing and normalizing) were consistently applied across all algorithms. Variations in preprocessing techniques could influence detection performance.
• The default hyperparameters of each model were used in the evaluation. Fine-tuning these hyperparameters for specific datasets or use cases could yield different results.
5) Recommendations:

Based on the findings and limitations of this study, several recommendations are proposed for future research. Future work should involve a more diverse and comprehensive dataset that includes various object types, lighting conditions, and environmental settings to improve the generalizability of the results. Extensive testing should be conducted in real-world scenarios to evaluate the robustness and reliability of the algorithms under different conditions, such as varying weather, occlusions, and dynamic backgrounds. Hardware-specific optimizations and model compression techniques, such as quantization and pruning, should be explored to improve the real-time performance of the models on different devices. Finally, the integration of advanced techniques such as attention mechanisms, multi-scale feature integration, and hybrid models should be investigated to enhance detection performance and efficiency.

V. CONCLUSION

The primary objective of this work was to improve the performance of obstacle detection systems by making them more reliable, accurate, and suitable for real-time applications. To the best of our knowledge, this study offers a comprehensive comparison of the recent algorithms Faster R-CNN, MobileNetv3, RetinaNet, and YOLOv3. We evaluated them on accuracy and time efficiency, which gives a comprehensive picture of how the models will behave in real-time applications, including autonomous navigation and safety systems; these same measures were used to judge each model's effectiveness in dynamic, arduous circumstances. Our model employs the efficient YOLOv3 detection algorithm to process images and video in real time, and this algorithm proved effective in our model. We acquired a real-world dataset that includes various scenarios and obstacle types, which made training and testing the model challenging. MobileNetv3, with its good balance between accuracy and speed, is suitable for scenarios requiring both. Future work should address the limitations identified and explore the recommendations provided to enhance the robustness and applicability of SSD-based object detection models in diverse real-world scenarios.

VI. FUTURE WORK

While this study has provided valuable insights into the performance of various detection algorithms for obstacle detection and recognition, several avenues for future research can further enhance the applicability and robustness of these models. Larger, more diverse datasets should be created and utilized, covering a wide range of environmental conditions, object types, and scenarios, to ensure that models are robust and generalize well to real-world situations. Extensive trials should be conducted in various real-world environments, such as urban, rural, and industrial settings, to validate and refine the models under different conditions. Multi-scale feature integration techniques should be implemented to better detect objects of varying sizes and improve overall detection performance.

REFERENCES

1. Ramisetti, C., et al. An Ultrasonic Sensor-based blind stick analysis with instant accident alert for Blind People. in 2022 International Conference
on Computer Communication and Informatics (ICCCI). 2022. IEEE.
2. Nunes, D., et al. Real-time Vision Based Obstacle Detection in Maritime Environments. in 2022 IEEE International Conference on Autonomous
Robot Systems and Competitions (ICARSC). 2022. IEEE.
3. Assaf, E.H., et al., High-Precision Low-Cost Gimballing Platform for Long-Range Railway Obstacle Detection. Sensors, 2022. 22(2): p. 474.
4. Shuai, Q. and X. Wu. Object detection system based on SSD algorithm. in 2020 international conference on culture-oriented science & technology
(ICCST). 2020. IEEE.
5. Kumar, A., Z.J. Zhang, and H. Lyu, Object detection in real time based on improved single shot multi-box detector algorithm. EURASIP Journal
on Wireless Communications and Networking, 2020. 2020: p. 1-18.
6. He, D., et al., Urban rail transit obstacle detection based on Improved R-CNN. Measurement, 2022. 196: p. 111277.
7. Guan, L., et al., A Lightweight Framework for Obstacle Detection in the Railway Image based on Fast Region Proposal and Improved YOLO-tiny
Network. IEEE Transactions on Instrumentation and Measurement, 2022. 71: p. 1-16.
8. Kamaruddin, F., et al. Smart Assistive Shoes with Internet of Things Implementation for Visually Impaired People. in Journal of Physics:
Conference Series. 2021. IOP Publishing.
9. Wang, X., et al., Target Electromagnetic Detection Method in Underground Environment: A Review. IEEE Sensors Journal, 2022.
10. Fang, R. and C. Cai. Computer vision based obstacle detection and target tracking for autonomous vehicles. in MATEC Web of Conferences. 2021.
EDP Sciences.
11. Vorapatratorn, S. AI-Based Obstacle Detection and Navigation for the Blind Using Convolutional Neural Network. in 2021 25th International
Computer Science and Engineering Conference (ICSEC). 2021. IEEE.
12. BAMDAD, M., D. SCARAMUZZA, and A. DARVISHY, SLAM for Visually Impaired Navigation: A Systematic Literature Review of the Current
State of Research. 2023.
13. Jin, J., et al., Wide baseline stereovision based obstacle detection for unmanned surface vehicles. Signal, Image and Video Processing, 2024. 18(5):
p. 4605-4614.
14. Patel, I., M. Kulkarni, and N. Mehendale, Review of sensor-driven assistive device technologies for enhancing navigation for the visually impaired.
Multimedia Tools and Applications, 2024. 83(17): p. 52171-52195.
15. Rajesh, P., et al. Arduino based Smart Blind Stick for People with Vision Loss. in 2023 7th International Conference on Computing Methodologies
and Communication (ICCMC). 2023. IEEE.
16. Jayachitra, J., et al. Design and Implementation of Smart Glove for Visually Impaired People. in 2023 5th International Conference on Smart
Systems and Inventive Technology (ICSSIT). 2023. IEEE.
17. Sissodia, R., M.S. Rauthan, and V. Barthwal. Arduino based bluetooth voice-controlled robot car and obstacle detector. in 2023 IEEE International
Students' Conference on Electrical, Electronics and Computer Science (SCEECS). 2023. IEEE.
18. Chauhan, R., J. Upadhyay, and C. Bhatt. An innovative wheelchair for quadreplegic patient using IoT. in 2023 International Conference on Device
Intelligence, Computing and Communication Technologies,(DICCT). 2023. IEEE.
19. Lima, R., et al., Visually impaired people positioning assistance system using artificial intelligence. IEEE Sensors Journal, 2023. 23(7): p. 7758-
7765.
20. Prathibha, S., et al. Ultra-modern walking stick designed for the blind. in 2023 International Conference on Networking and Communications (ICNWC). 2023. IEEE.
21. Masud, U., et al., Smart assistive system for visually impaired people obstruction avoidance through object detection and classification. IEEE
access, 2022. 10: p. 13428-13441.
22. Dhou, S., et al., An IoT machine learning-based mobile sensors unit for visually impaired people. Sensors, 2022. 22(14): p. 5202.
23. Jiang, P., et al., A Review of Yolo algorithm developments. Procedia Computer Science, 2022. 199: p. 1066-1073.
24. Yu, T., et al., Intelligent detection method of forgings defects detection based on improved efficientnet and memetic algorithm. IEEE Access, 2022.
10: p. 79553-79563.
25. Arkin, E., et al., A survey: Object detection methods from CNN to transformer. Multimedia Tools and Applications, 2023. 82(14): p. 21353-21383.
26. Reddy, B.S., et al. A Comparative Study on Object Detection Using Retinanet. in 2022 IEEE 2nd Mysore Sub Section International Conference
(MysuruCon). 2022. IEEE.
27. Yang, Y. and J. Han, Real-Time object detector based MobileNetV3 for UAV applications. Multimedia Tools and Applications, 2023. 82(12): p.
18709-18725.
28. Duan, K., et al., CenterNet++ for object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023.
29. Liu, B., W. Zhao, and Q. Sun. Study of object detection based on Faster R-CNN. in 2017 Chinese Automation Congress (CAC). 2017. IEEE.
30. Darms, M.S., et al., Obstacle detection and tracking for the urban challenge. IEEE Transactions on intelligent transportation systems, 2009. 10(3):
p. 475-485.
31. Li, L., et al. Foreground object detection from videos containing complex background. in Proceedings of the eleventh ACM international
conference on Multimedia. 2003.
32. Leng, J., et al., Robust obstacle detection and recognition for driver assistance systems. IEEE transactions on intelligent transportation systems,
2019. 21(4): p. 1560-1571.
33. Badrloo, S., et al., Image-based obstacle detection methods for the safe navigation of unmanned vehicles: A review. Remote Sensing, 2022. 14(15):
p. 3824.
34. Sugimoto, S., et al. Obstacle detection using millimeter-wave radar and its visualization on image sequence. in Proceedings of the 17th
International Conference on Pattern Recognition, 2004. ICPR 2004. 2004. IEEE.
35. Devi, S.K., et al., Intelligent Deep Convolutional Neural Network Based Object Detection Model for Visually Challenged People. Computer
Systems Science & Engineering, 2023. 46(3).
36. Cervera-Uribe, A.A. and P.E. Mendez-Monroy, U19-Net: a deep learning approach for obstacle detection in self-driving cars. Soft Computing,
2022. 26(11): p. 5195-5207.
37. Wang, X., et al. Cut and learn for unsupervised object detection and instance segmentation. in Proceedings of the IEEE/CVF Conference on
Computer Vision and Pattern Recognition. 2023.
38. Ashiq, F., et al., CNN-based object recognition and tracking system to assist visually impaired people. IEEE Access, 2022. 10: p. 14819-14834.
39. Duba, P.K., N.P.B. Mannam, and P. Rajalakshmi, Stereo vision based object detection for autonomous navigation in space environments. Acta
Astronautica, 2024. 218: p. 326-329.
40. Ouardi, M.M. and D.N. Jawawi, Object Detection Algorithms for Autonomous Navigation Wheelchairs in Hospital Environment: Object Detection
Algorithms for Autonomous Navigation Wheelchairs in Hospital Environment. International Journal of Innovative Computing, 2024. 14(1): p. 1-
6.
41. Cheng, B. and L. Deng, Vision detection and path planning of mobile robots for rebar binding. Journal of Field Robotics.
42. Kumar, D.N. and A. Akilandeswari. Novel approach for object detection neural network model using you only look once v4 algorithm and
compared with tensor flow SSD mobile net algorithm in terms of accuracy and latency. in AIP Conference Proceedings. 2024. AIP Publishing.
43. Adiuku, N., et al., Improved Hybrid Model for Obstacle Detection and Avoidance in Robot Operating System Framework (Rapidly Exploring
Random Tree and Dynamic Windows Approach). Sensors, 2024. 24(7): p. 2262.
