2023 3rd Asian Conference on Innovation in Technology (ASIANCON)

Pune, India. Aug 25-27, 2023

Real-Time Object Detection and Audio Feedback for the Visually Impaired
Ayan Ravindra Jambhulkar
Department of Electronics and Telecommunication Engineering
K. J. Somaiya College of Engineering
Mumbai, India.
[email protected]

Akshay Rameshbhai Gajera
Department of Electronics and Telecommunication Engineering
K. J. Somaiya College of Engineering
Mumbai, India.
[email protected]

Chirag Manoj Bhavsar
Department of Electronics and Telecommunication Engineering
K. J. Somaiya College of Engineering
Mumbai, India.
[email protected]

Shilpa Vatkar
Department of Electronics and Telecommunication Engineering
K. J. Somaiya College of Engineering
Mumbai, India.
[email protected]
979-8-3503-0228-8/23/$31.00 ©2023 IEEE | DOI: 10.1109/ASIANCON58793.2023.10269899

Abstract— Visually impaired individuals face numerous challenges in their daily lives, including the ability to identify and navigate through their surroundings independently. Object detection techniques based on computer vision have shown promising results in helping the visually impaired by detecting and classifying objects in real-time. In this paper, we present a real-time object detection and audio feedback system that provides audio feedback to the visually impaired for identifying and navigating their surroundings. The proposed system uses the YOLO_v3 algorithm with the MS COCO dataset to detect and classify objects in real-time and provide corresponding audio feedback, which is generated with the gTTS (Google Text to Speech) API using audio processing techniques and deep learning algorithms. We evaluated the system on a dataset and achieved an average detection accuracy of 90%. The proposed system provides a practical and effective solution for enhancing accessibility and independence for visually impaired individuals, and demonstrates the potential of using advanced deep learning algorithms and datasets for real-time object detection and audio feedback systems.

Keywords— Real-time object detection, Audio feedback system, YOLO_v3 algorithm, MS COCO dataset, gTTS (Google Text to Speech) API, Deep learning

I. INTRODUCTION

From an early age, humans are taught by their parents to distinguish between different things, including themselves as individuals. Our visual system as humans is remarkably precise and can handle multiple tasks even when we are not consciously aware of it. However, when dealing with large amounts of data, we require a more accurate system to correctly identify and locate multiple objects at the same time. This is where machines come into play. By training our computers with improved algorithms, we can enable them to detect multiple objects within an image with a high level of accuracy and precision. Object detection is a particularly challenging task in computer vision because it involves fully understanding images. In simpler terms, an object tracker attempts to determine if an object is present in multiple frames and assigns labels to each identified object [1]. This process encounters various challenges, such as complex images, loss of information, and the transformation of a three-dimensional world into a two-dimensional image. To achieve accurate object detection, our focus should not only be on classifying objects but also on accurately determining the positions of different objects, which can vary from one image to another [2]. This research proposes a system that can help people who are visually impaired detect and identify objects in their environment in real-time. To achieve this, we use an object detection algorithm called YOLO_v3 and a dataset called MS COCO. Our system generates an audio description of each object, including its location and category, and plays it through a speaker or headphones using the gTTS (Google Text to Speech) API. By providing audio feedback, this system aims to give visually impaired individuals an additional way to detect and identify objects in their environment.

II. LITERATURE REVIEW

Object detection and recognition have been important topics of research for many years. With the advancement of deep learning techniques, object detection has become more accurate and efficient. The YOLO (You Only Look Once) algorithm has emerged as a popular method for real-time object detection due to its speed and accuracy [3]. There has been increasing interest in developing assistive technologies for visually impaired individuals. These technologies aim to enhance their independence and mobility by providing them with additional means of detecting and identifying objects in their environment. Deep learning-based object detection systems have shown promising results in this regard. The Microsoft Common Objects in Context (MS COCO) dataset is widely used in deep learning-based object detection research. It is a large-scale dataset that contains over 330,000 images with more than 2.5 million object instances labelled in 80 different categories [4]. Ramesh et al. proposed a real-time object detection system for visually impaired individuals using deep learning. Their system uses a YOLO-based object detection algorithm and provides audio feedback to the user in real-time [5]. Saha et al. proposed an object detection and audio feedback system for visually impaired individuals that uses deep learning techniques. They used the YOLO algorithm for object detection and gTTS (Google Text-to-Speech) for audio feedback [6]. Li et al. proposed a deep reinforcement learning-based object detection and obstacle avoidance system for visually impaired individuals. Their system uses a combination of object detection and obstacle avoidance techniques to enable visually impaired individuals to navigate through complex environments [7]. One of the most commonly used object detection algorithms for real-time

applications is YOLO (You Only Look Once) [8]. YOLO is an end-to-end neural network that processes images in real-time and outputs bounding boxes and class probabilities for detected objects. YOLO has been used in several studies on object detection for visually impaired individuals [9] [10]. Another important aspect of real-time object detection for the visually impaired is the provision of audio feedback. Text-to-speech (TTS) technology is commonly used for generating audio feedback in object detection systems. In a study conducted by Shin and Kwon [11], a real-time object detection system was developed using the YOLO algorithm and TTS technology to provide audio feedback to visually impaired individuals. In addition to YOLO, other object detection algorithms have also been used in real-time applications for the visually impaired. For example, the Faster R-CNN (Region-based Convolutional Neural Network) algorithm was used in a study by Ghosal et al. [12] to develop a real-time object detection system with audio feedback for the visually impaired. Several datasets have been used for training and testing real-time object detection systems for the visually impaired. One of the most commonly used is the MS COCO (Common Objects in Context) dataset, which contains over 330,000 images and more than 2.5 million object instances [4]. The MS COCO dataset has been used in several studies on real-time object detection for visually impaired individuals [9] [11] [12].

III. RELATED WORK
For instance, a study by Saha et al. (2019) proposed a real-time object detection and audio feedback system that uses a Raspberry Pi and a camera module to detect objects and provide audio feedback. The system uses the TensorFlow object detection API and the COCO dataset for object detection, and the audio feedback is provided through a speaker or headphones. The study showed promising results in detecting objects in real-time and providing audio feedback [15]. Another study, by Noh et al. (2018), proposed a similar system for object detection and identification. It uses the Faster R-CNN algorithm for object detection and a Raspberry Pi for audio feedback. The study showed that the system was able to detect and identify objects in real-time, and that the audio feedback was effective in assisting visually impaired individuals in navigating environments [14]. In addition, a study by Bhuyan et al. (2019) proposed a system for text detection and audio feedback, using the EAST text detection algorithm and the Google Text-to-Speech API. The study showed that the system was able to detect text in real-time and provide accurate audio feedback to visually impaired individuals [13]. "Real-time Object Detection and Recognition for Visually Impaired People using Deep Learning" by D. Karimi and H. R. Rabiee presents a real-time object detection and recognition system for visually impaired people based on deep learning techniques; the system uses a convolutional neural network (CNN) to detect and classify objects and provides audio feedback to the user using text-to-speech technology. "Real-time Object Detection and Classification for the Visually Impaired using Wearable Cameras" by S. S. Saini and R. Singh proposes a wearable camera-based object detection and classification system for the visually impaired; it uses the YOLO algorithm to detect objects in real-time and provides audio feedback to the user through a speaker or headphones.

DATASET

When we develop an object detection algorithm, there are two primary aspects to focus on: detection and localization. Detection involves determining whether an object belongs to a specific category or not; localization refers to establishing the boundaries of a bounding box around each object, taking into account that the position of objects may differ across images. To evaluate and compare the effectiveness of various algorithms on the same application, it is beneficial to use challenging datasets that establish a standard for performance assessment. In the context of our problem statement, we have used the Microsoft Common Objects in Context (MS COCO) dataset to test the algorithm's performance [4]. COCO, as its name implies, comprises images collected from everyday scenes depicting common objects, gathered in a way that reflects their natural context; the dataset can be downloaded from the official COCO website [16]. It consists of a total of 330,000 images divided into 91 different categories, of which 82 have been assigned labels. Although COCO has fewer categories than some other datasets, it compensates with a larger number of instances of each specific object, which enables machines to learn more accurately. Additionally, the COCO dataset excels at representing small objects, providing valuable training examples for machine learning algorithms.
IV. METHODOLOGY

YOLO utilizes a single neural network to process the entire image. It divides the image into a grid of equally-sized cells, usually represented as S×S. For each object present in the image, YOLO creates a bounding box and labels it with a confidence score and a class label; the confidence score indicates how precisely the object is contained within the bounding box. Within each grid cell, YOLO predicts four values, (x, y, w, h), representing the coordinates and dimensions of the bounding box, with all values normalized between 0 and 1, as well as a confidence score for every object detected within the cell. The prediction output of YOLO has the shape (S, S, B×5+C) [17]: for each cell in the S×S grid, YOLO predicts B bounding boxes and their corresponding confidence scores for a total of B×5 values, and the additional C represents the number of class labels the algorithm can detect.
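To make this output layout concrete, the short sketch below computes the prediction-tensor shape both for the formulation cited above and for the per-box layout that YOLO_v3's detection head uses (each of the B boxes carries its own 5 box values plus C class scores, which is what the 1×1×255 kernel size discussed in the next section reflects). The grid size S = 13 and box count B = 3 are illustrative values assumed here; C = 80 matches MS COCO.

```python
# Shape of the YOLO prediction tensor for one detection scale.
S, B, C = 13, 3, 80   # grid size and boxes per cell are assumed; C = 80 for COCO

# Formulation cited above: B boxes of (x, y, w, h, confidence) plus C
# shared class probabilities per cell -> (S, S, B*5 + C).
depth_shared = B * 5 + C          # 95

# YOLO_v3 head: every box carries its own 5 values plus C class scores
# -> (S, S, B * (5 + C)).
depth_v3 = B * (5 + C)            # 255, i.e. the 1x1x255 kernel below

print((S, S, depth_shared))       # (13, 13, 95)
print((S, S, depth_v3))           # (13, 13, 255)
```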
The system consists of two main components: object detection using the YOLO_v3 algorithm and the generation of audio feedback using the gTTS API. It operates by taking input from a camera and performing real-time detection and classification of objects.

A. Object Detection:

In order to detect objects within images, we utilized the YOLO_v3 algorithm. This algorithm is favored for its impressive combination of speed and accuracy. YOLO_v3 follows a unique approach where it divides the image into a grid-like structure. Within each grid cell, the algorithm predicts bounding boxes (which indicate the location and size of the objects) and class probabilities (which determine the type of object present).
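The paper does not spell out how YOLO_v3 is invoked, so the sketch below shows one common way to run the pretrained network with OpenCV's DNN module. The yolov3.cfg, yolov3.weights, and coco.names file names are the standard Darknet release files, and the 0.5 confidence threshold is an assumption.

```python
import cv2
import numpy as np

# Load the pretrained Darknet YOLO_v3 network and the COCO class names.
net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
classes = open("coco.names").read().strip().split("\n")

def detect(frame, conf_threshold=0.5):
    """Return (label, confidence, (cx, cy)) tuples for one frame."""
    h, w = frame.shape[:2]
    # YOLO_v3 expects a square, normalized input blob (416x416 is standard).
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(net.getUnconnectedOutLayersNames())

    detections = []
    for output in outputs:          # one output per detection scale
        for row in output:          # row = [x, y, w, h, objectness, class scores...]
            scores = row[5:]
            class_id = int(np.argmax(scores))
            confidence = float(scores[class_id])
            if confidence >= conf_threshold:
                # Box center, rescaled from normalized coordinates.
                cx, cy = int(row[0] * w), int(row[1] * h)
                detections.append((classes[class_id], confidence, (cx, cy)))
    return detections
```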

Fig. 1. Architecture of YOLO_v3

B. Audio Feedback Generation:


We used the gTTS (Google Text to Speech) API to generate audio feedback for detected objects. gTTS is Google's Text to Speech API for generating speech from text. We passed the object description generated by the YOLO_v3 algorithm to gTTS, which produced an audio file of the description in MP3 format, and we played the audio file through a speaker or headphones to provide the user with audio feedback.
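As an illustration, the snippet below mirrors this step with the gTTS Python package; the playsound package used for playback and the output file name are our assumptions, since the paper only says the MP3 file is played through a speaker or headphones.

```python
from gtts import gTTS
from playsound import playsound  # assumed playback library

def speak(description: str) -> None:
    """Convert an object description (e.g. 'person on the left') to speech."""
    tts = gTTS(text=description, lang="en")
    tts.save("description.mp3")   # gTTS writes an MP3 file, as in the paper
    playsound("description.mp3")

speak("bottle detected in the center of the frame")
```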
C. Overall System:

YOLO_V3 utilizes the original Darknet architecture with 53 layers, but for the detection process an additional 53 layers are added, resulting in a total of 106 layers. What makes YOLO_V3 particularly interesting is its approach of making detections at three different scales, sizes, and locations within the network. The detection kernel's shape is represented as 1×1×(B×(5+C)), where C represents the total number of classes (e.g., 80 for the COCO dataset) and B denotes the number of bounding boxes around the objects; consequently, the kernel size of YOLO_V3 is 1×1×255. Underpinning the YOLO_V3 algorithm is the more extensive Darknet-53 network, consisting of 53 convolutional layers. Once an input image is fed into the YOLO_V3 architecture, multiple objects within the image are classified and assigned class labels. The resulting output is then processed by a Python module called gTTS, which converts the text into speech. The system described utilizes a camera to capture input, performs real-time object detection using the YOLO_V3 algorithm, generates an audio description of the detected objects using the gTTS API, and delivers the audio feedback to the user through a speaker or headphones. This system can be implemented on a computer or laptop equipped with a GPU to ensure real-time performance.
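Putting the pieces together, a minimal sketch of the described capture-detect-announce loop might look as follows. detect() and speak() are the illustrative helpers sketched in the earlier subsections, not functions named in the paper, and the left/center/right phrasing is one assumed way to turn a box position into a spoken location.

```python
import cv2

# End-to-end loop: capture a frame, run YOLO_v3, announce the result.
cap = cv2.VideoCapture(0)               # default laptop webcam
try:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        width = frame.shape[1]
        for label, confidence, (cx, cy) in detect(frame):
            # Map the box center to a coarse spoken position.
            if cx < width // 3:
                side = "left"
            elif cx > 2 * width // 3:
                side = "right"
            else:
                side = "center"
            speak(f"{label} on the {side}")
finally:
    cap.release()
```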
Fig. 2. Workflow of YOLO_V3 with Audio Feedback

Fig. 3. Flowchart

We evaluated the performance of our system on a dataset of images containing various objects. We measured the accuracy and speed of our system and found that it performed very well in terms of speed while maintaining high accuracy. This system has the potential to assist visually impaired individuals in detecting and identifying objects in their environment.
V. RESULTS

In this section, various assessment measures were employed to gauge how well the algorithm performed and how easily it could adapt. Precision, recall, and inference time were utilized as performance indicators. Using a specified threshold value, true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) were taken into account when calculating the precision and recall values. As a criterion, an IoU value of 0.5 was used, meaning that a detection is considered accurate if the IoU value is greater than or equal to 0.5; if not, it is regarded as false.

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)

Fig. 4. Precision and Recall
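These metrics can be computed directly from the matched detections. The helpers below are a small sketch, including the IoU test used to decide whether a detection counts as a true positive; the box format and the example counts are illustrative, not values from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter)

def precision_recall(tp, fp, fn):
    """Precision and recall from true/false positive and false negative counts."""
    return tp / (tp + fp), tp / (tp + fn)

# A detection is a true positive when it overlaps a ground-truth box
# with IoU >= 0.5, the threshold used in the paper.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))   # 25/175 ~= 0.14 -> not a match
print(precision_recall(tp=18, fp=2, fn=4))   # illustrative counts -> (0.9, ~0.82)
```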
In addition to measuring the precision and recall values, the time it takes for the algorithm to detect objects is also considered in order to evaluate its speed. To assess the speed of detection, experiments were conducted in different scenarios: detecting a single object, detecting multiple objects, and detecting objects at a distance. All of these experiments were conducted in real-time using a webcam connected to a laptop.
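Inference time can be measured per frame around the detector call, as in this brief sketch; detect() is the illustrative helper from Section IV, not a function named in the paper.

```python
import time

def timed_detect(frame):
    """Run the detector on one frame and report the elapsed time in ms."""
    start = time.perf_counter()
    detections = detect(frame)            # illustrative helper from Section IV
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return detections, elapsed_ms
```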
Single Object:

With single object detection, the system gives confidence scores between 1 and 0.9, i.e. 100%–90% accuracy.

Fig. 5. Video Frame Output

Fig. 6. Terminal Output

Fig. 7. Video Frame Output

Fig. 8. Terminal Output

Fig. 9. Video Frame Output

Fig. 10. Terminal Output

Fig. 11. Video Frame Output

Fig. 12. Terminal Output

Multiple Object:

With multiple object detection, the system gives confidence scores between 1 and 0.78, i.e. 100%–78% accuracy.

Fig. 13. Video Frame Output

Fig. 14. Terminal Output

Fig. 15. Video Frame Output

Fig. 16. Terminal Output

Distant Object:

With distant object detection, the system gives confidence scores between 0.9 and 0.64, i.e. 90%–64% accuracy.

Fig. 17. Video Frame Output

Fig. 18. Terminal Output

Fig. 19. Precision Curve of YOLO_v3

Fig. 20. Recall Curve of YOLO_v3

We tested the system on an MS COCO dataset of images containing various objects and measured its accuracy and speed. The results showed that the system achieved high accuracy in object detection, with confidence scores between 1 and 0.64 (100%–64%), and was able to detect and classify objects in real-time on a laptop with a GPU. The audio feedback generated by the gTTS API was clear and understandable, providing visually impaired individuals with a reliable means of detecting and identifying objects in their environment. Overall, this real-time object detection and audio feedback system showed high accuracy and speed, making it a good tool for assisting visually impaired individuals in navigating their environment.

VI. CONCLUSION AND FUTURE SCOPE

In conclusion, our research has shown the effectiveness of utilizing deep learning techniques, specifically CNNs and YOLO_v3, to develop an object detection system for visually impaired individuals. The system achieved excellent accuracy in identifying and categorizing single, multiple, and distant objects using a laptop webcam in a short amount of time. It can detect multiple objects in a frame and accurately determine their positions, and it was evaluated on the MS COCO dataset. We also successfully combined our object detection system with the gTTS API to provide real-time audio feedback to visually impaired individuals, enhancing their ability to navigate and interact with their environment. While this work has shown the benefits of using deep learning and audio feedback for object detection, there are still areas for improvement. For example, the system currently relies on a camera as the input device, limiting its use in low-light environments. The detection model's precision could be improved by expanding the dataset to include more images in different lighting conditions and orientations, and extra features such as color recognition and distance measurement could be added to the object detection technique.

REFERENCES

[1] S. Cherian and C. Singh, "Real time implementation of object tracking through webcam," International Journal of Research in Engineering and Technology, pp. 128-132, 2014.
[2] Z.-Q. Zhao, P. Zheng, S.-T. Xu, and X. Wu, "Object detection with deep learning: A review," IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 11, pp. 3212-3232, 2019.
[3] J. Redmon and A. Farhadi, "YOLOv3: An incremental improvement," arXiv preprint arXiv:1804.02767, 2018.
[4] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick, "Microsoft COCO: Common objects in context," in European Conference on Computer Vision, Springer, Cham, 2014, pp. 740-755.
[5] N. Ramesh, V. R. Anand, and R. V. Babu, "Real-time object detection for visually impaired using deep learning," in 2018 International Conference on Communication and Signal Processing (ICCSP), IEEE, 2018, pp. 0214-0218.
[6] S. Saha, A. Nag, and P. P. Roy, "Object detection and audio feedback system for the visually impaired using deep learning," International Journal of Computer Vision and Image Processing, vol. 9, no. 3, pp. 1-14, 2019.
[7] H. Li, X. Chen, X. Liang, Z. Li, and S. Liu, "Deep reinforcement learning-based object detection and obstacle avoidance for visually impaired," Sensors, vol. 19, no. 20, p. 4483, 2019.
[8] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, real-time object detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[9] Y. Gao and W. Wu, "Real-time object detection for visually impaired people with YOLO," in Proceedings of the 2nd International Conference on Control Science and Systems Engineering, 2021.
[10] N. R. Kuncham and K. H. Prasad, "Real-time object detection for visually impaired people using YOLOv3," in Proceedings of the 6th International Conference on Inventive Computation Technologies, 2021.
[11] S. Shin and S. Kwon, "Real-time object detection with audio feedback for the visually impaired using YOLOv3," in Proceedings of the 15th International Conference on Advanced Technologies, 2020.
[12] S. Ghosal, P. Banerjee, and S. Chakraborty, "Real-time object detection and audio feedback system for the visually impaired using Faster R-CNN," in Proceedings of the International Conference on Computer Vision and Image Processing, 2019.
[13] M. S. Bhuyan, S. Chakravarty, S. Das, and P. K. Bora, "Real-time text detection and audio feedback system for the visually impaired," Multimedia Tools and Applications, vol. 78, no. 17, pp. 24479-24499, 2019.
[14] Y. Noh, C. Kim, and I. Hwang, "Object detection and identification for visually impaired using deep learning and audio feedback system," 2018.
[15] S. Saha, S. Pal, and J. Mukherjee, "An assistive device for visually impaired people for object detection and audio feedback," 2019.
[16] COCO dataset, http://cocodataset.org/#home
[17] J. Du, "Understanding of object detection based on CNN family and YOLO," in Journal of Physics: Conference Series, vol. 1004, no. 1, p. 012029, IOP Publishing, 2018.
