Vision AI: A Deep Learning-Based Object Recognition System for Visually Impaired People Using TensorFlow and OpenCV
https://doi.org/10.22214/ijraset.2023.52197
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 11 Issue V May 2023- Available at www.ijraset.com
Abstract: Object detection is a challenging computer vision task that can provide valuable information to visually impaired
people. In this paper, we propose a deep learning-based system that detects objects for blind people using TensorFlow and
OpenCV. The system uses a pre-trained YOLOv7 model to detect and recognize objects in real time from images or video
captured by a camera. Other versions of YOLO were also considered; however, recent comparisons indicate that YOLOv7 is the
fastest and most accurate official YOLO version. It achieves 2% higher accuracy than Cascade-Mask R-CNN models while
running 509% faster at inference. The detection results are converted to speech using text-to-speech technology and delivered
to the user through headphones or a speaker. The system is intended as an IoT-based project that recognizes common objects
and people. We evaluated the system on several datasets, including Common Objects in Context (COCO), a large-scale labelled
dataset containing 1.5 million object instances, to demonstrate its accuracy and efficiency. We believe the system offers a
practical and affordable way for visually impaired people to access visual information and improve their quality of life.
Keywords: Object detection, Deep learning, TensorFlow, OpenCV, Blind people, YOLOv7, Text-to-speech.
I. INTRODUCTION
Visually impaired people face numerous challenges in their daily lives because they cannot gather visual information about their
surroundings. Object detection systems can help address these challenges by providing essential information about the physical
environment. With recent advances in deep learning and computer vision, researchers have developed object detection models
that can accurately identify and locate objects in an image or video. In this paper, we present a deep learning-based system
that detects objects for visually impaired people using TensorFlow and OpenCV. The system uses a pre-trained YOLOv7 model to
detect and recognize various objects in real time from images or video captured by a camera. The labels of the detected objects
are then converted to speech using text-to-speech technology and delivered to the user through headphones or a speaker.
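As a minimal sketch of the final stage described above, the snippet below turns a list of detections into a sentence that a text-to-speech engine could read aloud. The `Detection` type, the `build_announcement` helper, and the 0.5 confidence threshold are assumptions made for illustration; the paper does not specify how detections are phrased before being spoken.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected object: class label and confidence in [0, 1] (hypothetical type)."""
    label: str
    confidence: float


def build_announcement(detections, min_confidence=0.5):
    """Summarise detections as a short sentence suitable for text-to-speech."""
    # Keep only detections at or above the (assumed) confidence threshold.
    counts = Counter(d.label for d in detections if d.confidence >= min_confidence)
    if not counts:
        return "No objects detected."
    # Naive pluralisation: append "s" when more than one instance is present.
    parts = [f"{n} {label}{'s' if n > 1 else ''}" for label, n in sorted(counts.items())]
    if len(parts) == 1:
        listing = parts[0]
    else:
        listing = ", ".join(parts[:-1]) + " and " + parts[-1]
    return f"Detected {listing}."


detections = [
    Detection("chair", 0.91),
    Detection("chair", 0.78),
    Detection("dog", 0.66),
    Detection("bicycle", 0.31),  # below the threshold, so it is not announced
]
announcement = build_announcement(detections)
print(announcement)
```

The resulting string could then be handed to whichever text-to-speech engine the device uses. Grouping repeated labels keeps the spoken message short, which matters when the scene changes from frame to frame.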
©IJRASET: All Rights are Reserved | SJ Impact Factor 7.538 | ISRA Journal Impact Factor 7.894 | 2591
B. Related Works
There have been several related works in this field, including a Deep Learning based Object Detection and Recognition Framework
for the Visually-Impaired [1], an Object Recognition System for Visually Impaired People [2], and a Computer Vision-Based
Assistance System for the Visually Impaired Using Mobile Edge AI Accelerator Devices [3].
III. METHODOLOGY
The proposed methodology involves collecting images of objects, pre-processing them to enhance object features, designing a real-
time object recognition model using TensorFlow and OpenCV, training and validating the model, and implementing it on a portable
device with audio feedback for visually impaired individuals.
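The paper does not state its exact pre-processing steps. One step commonly used with YOLO-style detectors is letterbox resizing, where the camera frame is scaled to fit a fixed square input while preserving its aspect ratio and the remainder is padded. The sketch below computes only the geometry of that step; the 640-pixel input size and both function names are assumptions for illustration.

```python
def letterbox_params(width, height, target=640):
    """Return (new_w, new_h, pad_left, pad_top) for an aspect-preserving
    resize of a width x height frame into a target x target square."""
    scale = min(target / width, target / height)
    new_w = round(width * scale)
    new_h = round(height * scale)
    # Split the leftover space evenly; any odd pixel goes to the right/bottom.
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    return new_w, new_h, pad_left, pad_top


def to_original_coords(x, y, width, height, target=640):
    """Map a point from the letterboxed input back to the original frame."""
    scale = min(target / width, target / height)
    _, _, pad_left, pad_top = letterbox_params(width, height, target)
    return (x - pad_left) / scale, (y - pad_top) / scale


# A 1280x720 camera frame fitted into a 640x640 model input.
params = letterbox_params(1280, 720)
print(params)
# The centre of the model input maps back to the centre of the frame.
print(to_original_coords(320, 320, 1280, 720))
```

The inverse mapping is needed because bounding boxes come back in model-input coordinates and have to be related to the original frame before they are described to the user.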
B. Model Architecture
1) Design a deep learning-based object recognition model using TensorFlow and OpenCV.
2) Ensure that the model can recognize objects in real time.
A. Performance Evaluation
The proposed system design shows how deep learning algorithms can be efficiently incorporated into computer vision-based
visual assistance systems. The proposed system is cost-effective, portable, and almost unnoticeable as an assistive device.
YOLOv7-tiny is an edge-optimized model that uses leaky ReLU as its activation function, while the other YOLOv7 models use SiLU.
Compared to YOLOv5-N, YOLOv7-tiny is 127 FPS faster and 10.7% more accurate. YOLOv7-X achieves an inference speed of 114 FPS
compared with 99 FPS for the comparable YOLOv5-L, while also achieving better accuracy (AP higher by 3.9%). Compared with
models of a similar scale, YOLOv7-X achieves a 21% higher AP score than YOLOv5-L [5].
Fig. 1 Performance comparison of YOLOv7 vs. YOLOR vs. YOLOv5 vs. ViT transformers (source [6])
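To put the frame rates quoted above in per-frame terms, the short calculation below converts the 114 FPS and 99 FPS figures for YOLOv7-X and YOLOv5-L into frame times and a relative speedup. The helper names are illustrative; only the two FPS values come from the text.

```python
def frame_time_ms(fps):
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def relative_speedup_percent(fps_new, fps_old):
    """Throughput gain of fps_new over fps_old, as a percentage of fps_old."""
    return (fps_new - fps_old) / fps_old * 100.0


yolov7_x_fps = 114  # figure quoted in the text
yolov5_l_fps = 99   # figure quoted in the text

fps_gain = yolov7_x_fps - yolov5_l_fps
speedup = relative_speedup_percent(yolov7_x_fps, yolov5_l_fps)
saved_ms = frame_time_ms(yolov5_l_fps) - frame_time_ms(yolov7_x_fps)

print(f"Absolute gain: {fps_gain} FPS")
print(f"Relative speedup: {speedup:.1f}%")
print(f"Time saved per frame: {saved_ms:.2f} ms")
```

A gain of 15 FPS at this operating point corresponds to roughly 15% more throughput, or a little over one millisecond saved per frame.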
Some of the outputs of the model while being tested on real-world scenarios were as follows:
C. Results
The results show that YOLOv7 outperforms other object detection algorithms in terms of accuracy and speed. The model achieves
an average precision of 0.5 at 286 frames per second (FPS), which is faster than other state-of-the-art object detection
algorithms such as YOLOv5, YOLOv4, and Faster R-CNN. The results also show that YOLOv7 uses less memory than other object
detection algorithms. TensorFlow is used to convert YOLOv7 to TensorFlow Lite for mobile deployment. It is also used to load
the YOLOv7 network model from disk into OpenCV.
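The TensorFlow Lite conversion mentioned above can be sketched as follows. This assumes the YOLOv7 model is already available as a TensorFlow SavedModel; the reference YOLOv7 implementation is written in PyTorch, so a prior export step is implied that the paper does not describe. The directory name and both helper functions are hypothetical, and the default optimization setting is optional.

```python
from pathlib import Path


def tflite_output_path(saved_model_dir):
    """Place the .tflite file next to the SavedModel directory, reusing its name."""
    directory = Path(saved_model_dir)
    return directory.with_name(directory.name + ".tflite")


def convert_saved_model_to_tflite(saved_model_dir):
    """Convert a TensorFlow SavedModel to a TensorFlow Lite file on disk."""
    # Imported lazily so the path helper above works without TensorFlow installed.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_saved_model(str(saved_model_dir))
    # Optional: default optimisations (such as weight quantisation) to shrink the model.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_model = converter.convert()  # serialised model as bytes

    output_path = tflite_output_path(saved_model_dir)
    output_path.write_bytes(tflite_model)
    return output_path


# "models/yolov7_saved_model" is a hypothetical path, not one given in the paper.
planned_output = tflite_output_path("models/yolov7_saved_model")
print(planned_output)
```

Calling `convert_saved_model_to_tflite("models/yolov7_saved_model")` on a machine with TensorFlow installed would write the converted model to the printed path, from where it can be bundled with a mobile or embedded application.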
The scenarios below show the model recognizing objects in input images of outdoor environments.
V. CONCLUSIONS
The experimental results show that YOLOv7 is a state-of-the-art object detection algorithm that achieves high accuracy and
speed in real-world scenarios, with lower memory usage than other object detection algorithms. The project is proposed as an
IoT-enabled automated system that helps visually impaired people navigate safely by identifying common objects in indoor and
outdoor environments in real time, which has the potential to greatly improve their daily lives.
REFERENCES
[1] IEEE Conference Publication, “Deep Learning based Object Detection and Recognition Framework for the Visually-Impaired,” IEEE Xplore, 2019.
[2] IEEE Conference Publication, “Object Recognition System for Visually Impaired People,” IEEE Xplore, 2018.
[3] CVPR 2021 Workshop on Media and Automotive Intelligence, “Computer Vision-Based Assistance System for the Visually Impaired Using Mobile Edge AI
Accelerator Devices,” 2021.
[4] Academia.edu, “Virtual Fitness Trainer using Artificial Intelligence,” 2019.
[5] “Object Detection in 2023: The Definitive Guide - viso.ai.” https://viso.ai/deep-learning/object-detection/. Accessed 2 Apr. 2023.
[6] Wang, C.-Y., Bochkovskiy, A., & Liao, H.-Y. M. (2022). YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv
preprint arXiv:2207.02696.
[7] Singh, A., & Davis, L.S. (2021). A Survey of Modern Deep Learning based Object Detection Models. arXiv preprint arXiv:2104.11892.
[8] Li, X., Chen, Y., & Zhang, Y. (2019). A Real-Time Objects Recognition Approach for Assisting Blind People. IEEE Access, 7, 159013-159021. doi:
10.1109/access.2019.2942813.
[9] Zhang, Y., Wang, S., & Yang, S. (2018). A Real-Time Objects Recognition Approach for Assisting Blind People based on Deep Learning. Journal of Image
and Graphics, 6(4), 289-294. doi: 10.11648/j.ijg.20180604.18.
[10] Rao, R., & Saxena, A. (2017). Real-Time Objects Recognition for Assisting Blind People using Raspberry Pi. International Journal of Advanced Research in
Computer Science, 8(3), 375-378. doi: 10.26483/ijarcs.v8i3.3914.