
TITLE:

BLIND ASSISTANCE SYSTEM: REAL TIME OBJECT DETECTION WITH


DISTANCE AND VOICE ALERTS
ABSTRACT:
An estimated 285 million people in the world are visually impaired, roughly equal to 20% of
the Indian population. Their main challenge is identifying distant objects, especially when
they are on their own. They mostly depend on someone else even to access their basic
day-to-day needs, so a technological solution for them is of utmost importance and much
needed. The existing system is an IoT-based project that uses the Raspberry Pi model: smart
eyewear that recognises objects, but it requires an additional device in order to detect an
object. So we came up with an application that doesn't need any extra equipment; all we
need to do is use our phone to detect the object.
One such attempt from our side is an Integrated Machine Learning System that allows blind
users to identify and classify common day-to-day objects in real time, generates voice
feedback, and calculates distance, producing warnings about whether the user is very close
to or far away from the object. The same system can also be used as an obstacle detection
mechanism.

Key words: Blind, Object Detection, Object Recognition, Image Processing.


1.INTRODUCTION:
1.1 INTRODUCTION ABOUT PROBLEM:
Good eyesight facilitates us in carrying out day-to-day activities, but some people face
difficulties in carrying out these activities due to lack of proper vision. The visually impaired,
like everyone else, pursue a great range of interests, but a large percentage of them are
unemployed due to inadequate resources to learn from. The number of blind schools and
institutions available in the country isn't enough to educate such a large population. Braille,
a partial transliteration of the English written language, was developed to help people with
low vision read, but surveys conducted in blind schools have shown that many find it
difficult to learn, and there aren't sufficient learning kits to serve such a huge number of
people. Most visually impaired people also depend on others to become acquainted with
objects surrounding them and to avoid collisions with obstacles while navigating. To address
these problems and help them self-learn as well as become more aware of their surroundings,
there is a need for a product that uses various technologies for the extraction of text and the
recognition of objects.

1.2 INTRODUCTION ABOUT EXISTING SOLUTION:


In the existing system, the authors proposed a method that detects objects using IoT-based
smart wearable glasses called the Perspective glass. The Perspective glass is a wearable pair
of glasses designed for blind people that helps them identify objects or obstacles in front of
them while walking. It consists of a Raspberry Pi board, a 5 MP camera, ultrasonic sensors,
a buzzer, headphones, and a power source. The glasses are controlled by a power button:
when it is pushed on, they take pictures of the surroundings. The overall setup is powered
through an external power source (a power bank).

1.3 INTRODUCTION ABOUT PROPOSED SOLUTION:


There are four phases in our proposed system. They are:
a. Object Detection
b. Object Identification
c. Depth Estimation
d. Voice Assistant
a. Object Detection and Identification:
Object detection is a computer vision technique for locating instances of objects in images or
videos. Object detection algorithms typically leverage machine learning or deep learning to
produce meaningful results.
SSD object detection consists of two parts:
 Extracting feature maps
 Applying convolution filters to detect objects.

SSD uses VGG16 to extract feature maps. It then detects objects using the Conv4_3 layer.
For illustration, we draw Conv4_3 as 8 × 8 spatially (it is actually 38 × 38). For each
cell (also called a location), it makes 4 object predictions.
Each prediction consists of a boundary box and 21 scores, one per class (with one extra class
for "no object"), and we pick the highest score as the class of the bounded object. Conv4_3
makes a total of 38 × 38 × 4 predictions: four predictions per cell regardless of the depth of
the feature maps. As expected, many predictions contain no object; SSD reserves class "0" to
indicate this.
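The prediction counts quoted above can be checked with a short calculation; this is a minimal sketch, assuming the 21 scores correspond to 20 object classes plus the reserved "no object" class:

```python
def ssd_prediction_count(grid_size, boxes_per_cell):
    """Total default-box predictions made from one grid_size x grid_size
    feature map, with boxes_per_cell predictions at every location."""
    return grid_size * grid_size * boxes_per_cell

# Conv4_3 at its true 38 x 38 resolution, 4 predictions per cell:
conv4_3 = ssd_prediction_count(38, 4)
print(conv4_3)  # 5776 predictions from this one layer

# Each prediction carries 4 box coordinates plus 21 class scores
# (20 object classes and the reserved "no object" class "0").
print(conv4_3 * (4 + 21))  # 144400 raw output values
```

Most of these 5,776 predictions are background, which is why the "no object" class and a score threshold matter in practice.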

c. Depth Estimation:
Depth estimation, or depth extraction, refers to the techniques and algorithms that aim to
obtain a representation of the spatial structure of a scene. In simpler words, it is used to
calculate the distance between two objects. Our prototype assists blind people by issuing
warnings about hurdles coming their way. To do this, we need to find how far apart the
obstacle and the person are in any real-time situation. After an object is detected, a
rectangular box is generated around it. If that object occupies most of the frame, then,
subject to some constraints, the approximate distance of the object from the person is
calculated.
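A common way to obtain such an approximate distance from the bounding box is the pinhole-camera (similar triangles) relation: distance = real object width × focal length / box width in pixels. The sketch below assumes a one-off calibration photo of an object of known size; all the numbers are illustrative, not measurements from our prototype:

```python
def focal_length_px(known_distance_cm, known_width_cm, width_in_pixels):
    """Calibrate the focal length from one reference image of an object
    of known width photographed at a known distance."""
    return (width_in_pixels * known_distance_cm) / known_width_cm

def distance_to_object_cm(known_width_cm, focal_px, box_width_px):
    """Similar-triangles distance estimate from the detected box width."""
    return (known_width_cm * focal_px) / box_width_px

# Hypothetical calibration: a 20 cm wide object at 100 cm fills 200 px.
f = focal_length_px(100, 20, 200)          # 1000.0 px
# The same object later detected as a 400 px wide box is twice as close.
print(distance_to_object_cm(20, f, 400))   # 50.0 cm
```

This only works when the real-world size of the detected class is roughly known (a chair, a door), which is one of the "constraints" mentioned above.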

d. Voice Assistant:
After an object is detected, it is of utmost importance to inform the person about the
presence of that object in his/her way. For the voice generation module, pyttsx3 plays an
important role. Pyttsx3 is a conversion library in Python that converts text into speech, and
it works well with both Python 2 and 3. To get a reference to a pyttsx3.Engine instance, an
application invokes the factory function pyttsx3.init(). PyTorch, which is primarily a
machine learning library, can also be applied to the audio domain: it helps in loading voice
files in standard MP3 format and in regulating audio properties such as the sampling rate.
Thus, it can be used to manipulate properties of sound like frequency, wavelength, and
waveform; the numerous options available for audio synthesis can be verified by looking at
PyTorch's audio functions.
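A minimal sketch of the voice feedback step, keeping the warning text separate from the speech engine so the logic can be checked without audio hardware; the 1 m and 3 m thresholds are assumptions for illustration, not values fixed by the system:

```python
def build_alert(label, distance_m, near_m=1.0, far_m=3.0):
    """Compose the spoken warning for a detected object."""
    if distance_m < near_m:
        return f"Warning! {label} is very close, about {distance_m:.1f} metres."
    if distance_m > far_m:
        return f"{label} detected far away, about {distance_m:.1f} metres."
    return f"{label} ahead at about {distance_m:.1f} metres."

message = build_alert("chair", 0.8)

# Speak the message with pyttsx3; fall back to printing if the library
# or an audio device is unavailable (e.g. on a headless machine).
try:
    import pyttsx3
    engine = pyttsx3.init()   # factory function returning an Engine
    engine.say(message)
    engine.runAndWait()
except Exception:
    print(message)
```

Separating the message builder from the engine call also makes it easy to change thresholds or wording without touching the audio code.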

1.4 BRIEF INTRODUCTION ABOUT PLATFORM AND TECHNOLOGY


USED:
 TensorFlow: TensorFlow APIs were used to implement the system. The benefit of using
APIs is that they provide a collection of common operations, so we don't have to
write the program's code from scratch. They are both helpful and efficient, in our
opinion; APIs are time savers since they provide us with convenience. The TensorFlow
object detection API is essentially a mechanism for building a deep learning network
that can solve object detection challenges. Its framework includes trained models,
referred to as the Model Zoo. This contains models trained on the COCO dataset, the
KITTI dataset, and the Open Images dataset, among others. The COCO dataset is the
primary focus here.

 The SSD: It consists of two parts: an SSD head and a backbone model. The backbone
model is essentially a trained image classification network used as a feature
extractor. This is often a network trained on ImageNet, such as ResNet, that has had
its final fully connected classification layer removed. The SSD head is just one or more
convolutional layers added to the backbone, with the outputs read as bounding boxes
and classifications of objects at the spatial positions of the final layer activations [3].
As a result, we have a deep neural network that can extract semantic meaning from an
input image while keeping its spatial structure, although at a lower resolution. With a
ResNet34 backbone, 256 7×7 feature maps are produced for an input picture. SSD
divides the image into grid cells, with each grid cell in charge of detecting objects
in that region [1][7]; detecting objects means predicting the class and location of each
object within that cell.

 Pyttsx3: Pyttsx3 is a conversion library in Python that converts text into speech and
works well with both Python 2 and 3. To get a reference to a pyttsx3.Engine instance,
an application invokes the factory function pyttsx3.init(). Whenever an object is
detected, its approximate distance is calculated, and with the help of the cv2 library
and the cv2.putText() function, the text is displayed on the screen. To identify hidden
text in an image, we use Python-tesseract for character recognition.

 Python: Python is a widely used general-purpose, high-level programming language.
It was initially designed by Guido van Rossum in 1991 and is developed by the Python
Software Foundation. Its main objective is to provide code readability and high
developer productivity. Our project mainly relies on neural network, statistics and
machine learning, and image processing and computer vision toolboxes, all of which
involve Python programming.

1.5 PURPOSE OF WORK:


It makes the work of blind people easy, efficient, and reliable by sending wireless
voice-based feedback about whether an object is too close to the user or at a safer distance.
The smart assistant for blind people is a portable device. This device will make blind and
visually impaired people's lives much easier, as it will help them recognize objects. Another
aim is to identify text on objects.
1.6 SCOPE OF WORK:
Our aim is to build an application for visually impaired people. The motivation of this
project is therefore to build a technological tool, supported by computer science, that
overcomes some of these barriers by creating a service that automatically recognizes and
characterizes images taken or provided by a user.
2.LITERATURE SURVEY:
Megha P. Arakeri et al.:
Assistive Technology for the Visually Impaired Using Computer Vision:

The proposed product successfully captures the readable material in front of the user,
identifies the text in the image and reads it out. It also informs the user about the distance of
object that is at his level of eyesight and tells the objects around him. Hence this product
helps the user to gain knowledge from the readable material. This gives him necessary
information about his surroundings and makes him independent. The user-friendly wearable
device is portable and compact.
Drawbacks:
While identifying objects at a greater distance, it fails to locate the particular object and
announce its name, because the system gets confused when many objects are present.

Laviniu Țepelea et al.:


A Vision Module for Visually Impaired People by Using Raspberry PI Platform:
A vision-based guidance module has been proposed to help visually impaired people. The
vision module can detect and recognize traffic signs with high accuracy and at a reasonable
distance detection range. The application is developed with a Raspberry PI3 Model B+
platform

Yuraja Kadri:
Future Vision Technology:
This paper presents a prototype of lightweight smart glasses for visually impaired people. We
exhibited the working of the glasses, along with the hardware design and software design,
and implemented many excellent image processing and text recognition algorithms on the
new lightweight smart glass system. This system can detect and recognize text in real
time. In the near future, we will implement more useful applications in the smart glass
system, such as handwritten texts and image detection.

K. Vijiyakumar:
Object Detection For Visually Impaired People Using SSD Algorithm:
In this project, a real-time object detection system for visually impaired people based on the
SSD algorithm has been proposed. The system retrieves the trained model from a cloud
database to perform object detection in real time. The proposed system benefits visually
impaired people by detecting objects as well as calculating their distance, for a better
quality of life.
Esra Ali Hassan and Tong Bong Tang:
Smart Glasses For The Visually Impaired People:
This project presents a new design of assistive smart glasses for visually impaired students.
The objective is to assist in multiple daily tasks using the advantage of wearable design
format. As a proof of concept, this paper only presents one example application, i.e. text
recognition technology that can help reading from hardcopy materials. The building cost is
kept low by using single board computer raspberry pi 2 as the heart of processing and the
raspberry pi 2 camera for image capturing. Experiment results demonstrate that the prototype
is working as intended.

Jihong Liu and Xiaoye Sun:


A Survey of Vision Aids for the Blind:
The development of effective user interfaces, appropriate sensors, and information processing
techniques makes it possible for the blind to achieve additional perception of the
environment. Since the beginning of the 1970s, research into vision aids for visually
impaired people has broadened considerably. After introducing the traditional methods for
guiding the blind, two typical modes of mobility aid are presented in this paper. One, called
ETA (short for Electronic Travel Aids), is based on the blind person's other natural senses,
such as hearing, touch, smell, and feeling; the paper focuses on Meijer's vOICe system,
which relies on the blind person's sensitive auditory sense, and ENVS, which works by
means of haptic feedback. The other technique is artificial vision, which uses surgical
methods to implant a visual prosthesis in the blind person's healthy retina, cortex, or optic
nerve. The prosthesis generates electrical impulses and evokes the perception of points of
light in the patient's visual cortex. The first type is non-invasive for the blind person, while
the second type is invasive.

Ali Khan et al.:


Wearable Navigation Assistance System For The Blind And Visually Impaired:
An assistance system was designed and developed in this research work. It is based on
ultrasonic sensors for detecting obstacles in the path. These sensors, together with a
vibrating device and a buzzer, have been placed at multiple positions on a jacket. The
sensors scan the user's environment and inform him through vibration and the buzzer. The
system aids the movement of blind people by scanning their environment through object
detection and guiding them to a safe path. The experiments revealed that the developed
prototype achieves its objectives with adequate accuracy. In the future, it is planned that
image-processing-based obstacle and person recognition be employed for tackling further
real-life problems associated with the travel of blind people.
Surya Chaitanya Jakka et al.:
Blind Assistance System Using Tensor Flow:

The proposed system is divided into two levels based on the SSD algorithm and TensorFlow,
which not only recognizes objects but also localizes them. It also tells the user how far the
person is from the object. Individuals with visual impairments may face difficulties as
innovation advances step by step. This research work has proposed a novel framework
utilizing AI, which makes the framework more straightforward to use, specifically for
individuals with visual impairments, and helps society. The key aspects of the proposed
system are identifying or naming the detected object, calculating the accurate distance
between the user and objects, and the voice-over using audio commands.

Devashish Pradeep Khairnar et al.:


PARTHA: A Visually Impaired Assistance System:
The proposed system gives an effective solution to assist visually impaired people. The
system has multiple modules and a simple architecture, which makes it more practical and
easier to use. The system claims to assist the visually impaired not only by detecting
obstacles around them but also by recognizing them and providing respective information
such as the type of obstacle and the distance to it. The indoor navigation and live location
sharing modules help the user navigate in known as well as unknown indoor environments.
The preliminary results show that the system is easy to use, effective, safe, and fulfils all
the targeted goals to assist the visually impaired. The system is user-friendly, as it accepts
instructions and gives responses in audio format. The system is affordable, reliable, and
satisfies other non-functional requirements.
3.PROBLEM STATEMENT:
3.1 EXISTING SYSTEMS:
The Perspective glass is a wearable pair of glasses designed for blind people that helps
them identify objects or obstacles in front of them while walking. It consists of a
Raspberry Pi board, a 5 MP camera, ultrasonic sensors, a buzzer, headphones, and a power
source. The glasses are controlled by a power button: when it is pushed on, they take
pictures of the surroundings. The overall setup is powered through an external power
source (a power bank).

3.2 PROBLEM DEFINITION:


3.3 PROBLEM STATEMENT:
4.SYSTEM ARCHITECTURE:
4.1 PROCEDURAL DESIGN:

SSD ARCHITECTURE:
4.2 MODULE EXPLANATION:
 Picture Capturing Module: When the system is turned on, it captures images using the
camera. These images are fed as input for comparison against the COCO dataset, where
pixel grouping and feature extraction take place. The captured frames can be seen on
the screen with drawn boundaries and labels. The VideoCapture() method is used to
start the camera and capture the video.
 Picture Processing Module: OpenCV (Open-Source Computer Vision) is a Python
library focused mostly on real-time computer vision. It is mainly used for all the
computational work related to images. The cv2 module performs image processing and
provides methods to detect and capture frames and display labels. This module runs
after the input is taken from the camera.
 Object Detecting Module: The algorithm takes the image as input, and all the
computation, such as dividing the image into pixels and extracting features, is done on
a neural network. The image is read for the next computation and analyzed against the
trained dataset. This is achieved using a class list in which 90 object categories are
trained separately. Here we used the SSD architecture, which comes under the
TensorFlow API.
 Distance Computation Module: To find the distance of the object, NumPy is used, a
pip package for numerical computation. Distance is found through depth estimation:
using the detected objects visible on the screen, the depth estimate is obtained by
finding mid-ranges and adjusting the estimation scale to 0-10.
 Sound Output Module: After detecting the object and calculating the distance, our aim
is to give the result as audio using voice notes. In the output we specify the distance
along with units and warning messages to alert the user. For the audio output, the
pyttsx3 pip package, a predefined Python module for converting text to speech, is used.
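The 0-10 rescaling described in the distance computation module can be sketched as a clip-and-normalise step with NumPy; the raw depth range of 0-500 cm used here is an assumed calibration for illustration, not a value from the original design:

```python
import numpy as np

def scale_to_0_10(raw_depths, d_min=0.0, d_max=500.0):
    """Clip raw depth estimates (assumed here to be in cm) to the
    calibrated range and rescale them to the 0-10 warning scale."""
    d = np.clip(np.asarray(raw_depths, dtype=float), d_min, d_max)
    return 10.0 * (d - d_min) / (d_max - d_min)

# Depths of 0, 250, 500, and 900 cm map to 0, 5, 10, and 10
# (anything beyond d_max saturates at the top of the scale).
print(scale_to_0_10([0, 250, 500, 900]))
```

Saturating at the top of the scale keeps far-away objects from producing alarming warnings, while anything near 0 can trigger the "very close" voice alert.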

5.PRELIMINARY ANALYSIS:
5.1 BRIEF ABOUT INPUT DATA:
COCO is a large image dataset designed for object detection, segmentation, person keypoint
detection, stuff segmentation, and caption generation. It stores its annotations in JSON
format, describing object classes, bounding boxes, and bitmasks.

To use the tools developed for COCO, the dataset has to be in the COCO format: we can
either convert an existing dataset to the COCO format or create one ourselves.

COCO has several features:
 Object segmentation
 Recognition in context
 Superpixel stuff segmentation
 330K images (>200K labeled)
 1.5 million object instances
 80 object categories
 91 stuff categories
 5 captions per image
 250,000 people with keypoints
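Because COCO annotations are plain JSON, a detection entry can be read with the standard library alone. The tiny annotation below is a made-up example in COCO's layout, not an excerpt from the real dataset:

```python
import json

# Minimal COCO-style annotation file. Bounding boxes use the COCO
# convention [x, y, width, height] in pixel coordinates.
coco_json = '''
{
  "images": [{"id": 1, "file_name": "street.jpg", "width": 640, "height": 480}],
  "categories": [{"id": 18, "name": "dog"}],
  "annotations": [{"id": 7, "image_id": 1, "category_id": 18,
                   "bbox": [120.0, 200.0, 80.0, 60.0]}]
}
'''

data = json.loads(coco_json)
# Map category ids to human-readable names, then walk the annotations.
cats = {c["id"]: c["name"] for c in data["categories"]}
for ann in data["annotations"]:
    x, y, w, h = ann["bbox"]
    print(cats[ann["category_id"]], "at", (x, y), "size", (w, h))
```

Converting one's own data to this layout is all that is needed to reuse COCO-compatible training and evaluation tools.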

5.2 Type of analysis doing on Data:


5.3 Expected Outcome:
Once the input is taken from the dataset, it goes through a data pre-processing phase, from
which we obtain a balanced dataset. This dataset is trained and tested by the SSD model for
object detection, which gives us the required outcome (voice feedback) with better accuracy.
This makes it possible to use this model in blind assistance rather than other implemented
models, i.e., smart glasses.

The proposed system benefits visually impaired people by detecting objects as well as
calculating their distance, for a better quality of life.
6.FEASIBILITY STUDY:
6.1 TECHNICAL FEASIBILITY:
This study is carried out to check the technical feasibility, that is, the technical requirements
of the system. The developed system should not place a high demand on the available
technical resources; otherwise, high demands would be placed on the client. The developed
system must therefore have modest requirements, so that only minimal or no changes are
needed to implement it.

6.2 OPERATIONAL FEASIBILITY:


 Operational feasibility is the measure of how well a proposed system solves the
problems and takes advantage of the opportunities identified during scope
definition and how it satisfies the requirements identified in the requirements
analysis phase of system development.
 This includes staffing requirements, organizational structure, and any applicable
legal requirements. At the end of the operational feasibility study, your team will
have a sense of whether you have the resources, skills, and competencies to
complete this work.

6.3 ECONOMIC FEASIBILITY:


This study is carried out to check the economic impact that the system will have on the
organization. The amount of funds that the company can pour into the research and
development of the system is limited. The expenditure must be justified. Thus, the developed
system is well within the budget, and this was achieved because most of the technologies
used are freely available. Only the customized products had to be purchased.
7.SUMMARY OF THE PROJECT:
8.REFERENCE:
[1] Shiyam Raghul.M, Surendhar K, Suresh N R, R. Hemalatha, “Raspberry Pi based assistive device
for deaf, dumb and blind people”, International Journal of Scientific Research Engineering &
Technology, May 2017, vol. 2, issue 2, pp. 167-174.

[2] Mallapa D. Gaurav, Shruti S. Salimath, Shruti B. Hatti, Vijayalaxmi I. Byakod, Shivaleela Kanede,
”B-Light: A reading aid for the blind people using OCR and OpenCV”, International Journal of
Scientific Research Engineering & Technology, May 2017, vol. 6, issue 5, pp. 546-548

[3] Nikhil Mishra, “Image Text to Speech Conversion using Raspberry Pi & OCR Techniques”,
International Journal for Scientific Research and Development, vol. 5, issue 08, 2017, pp. 523-525.

[4] Zhiming Liu, Yudong Luo, Jose Cordero, “Finger-eye: A wearable text reading assistive system for
the blind and visually impaired”, IEEE International Conference on Real-time Computing and
Robotics, 6-10 June 2016, pp. 125-128.

[5] “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.” Shaoqing
Ren, Kaiming He, Ross Girshick, and Jian Sun, IEEE transactions, Dec 2016.

[6] Vicky Mohane, Chetan Gade “Object Recognition for Blind people Using Portable Camera” WCFTR
World conference 2016.

[7] Image recognition: By Samer Hijazi, Rishi Kumar, and Chris Rowen, IP Group, Cadence “Using
convolutional neural network for image recognition”.

[8] V. Tiponuţ, D. Ianchis, Z. Haraszy, “Assisted Movement of Visually Impaired in Outdoor


Environments”, Proceedings of the WSEAS International Conference on Systems, Rodos, Greece,
pp.386-391, 2009.

[9] blind-person-assistant-object-detection (ijraset.com)

[10] http://cocodataset.org/#home

[11] https://sci-hub.se/https://ieeexplore.ieee.org/document/8554625

[12] https://ieeexplore.ieee.org/document/1713189

[13] https://ieeexplore.ieee.org/document/8795205

[14] https://sci-hub.se/https://ieeexplore.ieee.org/document/9137791
