Project Report
ON
"Nayan Drishti": A Revolutionary Navigation/Visual Aid for the Visually Impaired
SUBMITTED TO THE
Submitted by
Amar Sanjay Gupta
(2009590005)
Supervisor (s):
HOD Shaikh Sir (Computer Department)
Aditya Polytechnic,
Telgaon Naka, Beed-431122.
“Nayan Drishti”: A Revolutionary Navigation/Visual Aid
for the Visually Impaired
by
Amar Sanjay Gupta
(2009590005)
Supervisor(s):
HOD Shaikh Sir (Computer Department)
Aditya Polytechnic,
Telgaon Naka, Beed-431122.
2022-2023
CERTIFICATE
Project Report Approval for Diploma
Examiners
1.
2.
Date:
Place:
DECLARATION
I declare that this written submission represents my ideas in my own words and
where others' ideas or words have been included, I have adequately cited and
referenced the original sources. I also declare that I have adhered to all
principles of academic honesty and integrity and have not misrepresented
or fabricated or falsified any idea/data/fact/source in my submission. I
understand that any violation of the above will be cause for disciplinary action
by the Institute and can also evoke penal action from the sources which have
thus not been properly cited or from whom proper permission has not been
taken when needed.
(2009590005) --------------------------------------------------
Date:
ABSTRACT
The disabled section of society, typically blind people or people suffering from some form of
visual impairment, has to live with uncertainty throughout their lives. The lack of vision,
whether due to defects at birth or an accident, forces them to constantly be aware of their
surrounding environment and makes them live with fear. Blind people are often dependent on
others to lead a normal life and experience difficulty in performing day-to-day activities on
account of a disability they did not choose to have. Performing even a simple task like picking
up an object or walking from one place to another is a daunting challenge for blind people. In
our project, we are developing a prototype of a product which aids blind people in performing
day-to-day activities. We are working in the domains of computer vision and machine learning
to develop this prototype. Since blind people cannot see what objects are in front of them, a
feature for object recognition has to be incorporated. While walking or navigating, blind people
are not aware of obstacles in front of them, which is why we have decided to implement a
feature which detects these obstructions. Blind people also face uncertainty as to where they
are or whether they are walking in the correct direction. To relieve blind people of this burden,
a GPS navigation feature is used in the prototype. By developing this prototype we aim to
demonstrate how machine learning can be used to help the disabled section of society, and to
inspire computer engineers to build upon this prototype, which would lead to the development
of an all-round navigation assistant for the blind and visually impaired.
CONTENTS
ABSTRACT 6
CONTENTS 7
LIST OF FIGURES 8
LIST OF TABLES 10
ABBREVIATIONS 11
1. Introduction 13
1.1. Introduction 13
1.2. Aim and Objectives 14
1.3. Organization of report 15
2. Literature Survey 16
3. Problem Statement 18
3.1. Scope 18
4. System Analysis 19
4.1. Existing System 19
4.2. Proposed System 19
4.3. Analysis 20
4.4. Hardware and Software details 25
4.5. Methodology 27
4.6. Design details 30
5. Implementation 32
5.1. Module I – Object Recognition 32
5.2. Module II – Obstacle Detection 33
5.3. Module III – Location Sensing 34
6. Results 36
7. Project timeline and task distribution 39
8. Conclusion 42
References 43
Appendix 45
Coding conventions 49
Source Code, Acknowledgement, Publication 49, 58, 59
LIST OF FIGURES
16. Diagrammatic representation of GPS Location Sensing setup 35
LIST OF TABLES
1. Abbreviations 11
4. Appendix 45-48
5. Publication Table 1 59
6. Publication Table 2 59
ABBREVIATIONS
15. HDOP: Horizontal Dilution of Precision.
1. INTRODUCTION
1.1 Introduction:
We sometimes wonder how the disabled section of our society manages to perform tasks that look
impossible given their disability, how efficiently they can perform their day-to-day tasks, and how
well they can communicate with others. Keeping in mind the difficulties they face, their ability to
perform such tasks and the need to ease the fear related to it, technology has advanced in many fields.
Our target audience is blind people, and our main motive is to enable them to experience freedom
in all walks of life and to immediately alert their dear ones if they are facing trouble in anything.
Blindness can basically be classified into three types:
1. Complete Blindness: A state where the affected person is completely without sight and is
unable to differentiate between objects and humans.
2. Night Blindness: A state where the affected person is unable to see from the time it starts
getting dark until dawn, or in any place that is poorly lit. People with night blindness often have
difficulty driving at night or seeing stars.
3. Color Blindness: A state where the person is unable to differentiate between colors. Color
blindness is also known as dyschromatopsia.
In our project, we wanted to implement a model that would be beneficial to our user (i.e. the
blind person), not only helping them navigate easily with the help of object prediction and
detection, but also guiding them in aspects such as “How can one reach his/her destination from
the source?”, i.e. navigation using the Global Positioning System (GPS), and, if time permits,
even software that helps our user operate a smartphone in a way that is comfortable to them (a
software/app that can understand braille signs or sense the motion of the user to perform tasks
such as calling whenever in distress, sending the location of the user to the guardian, etc.). The
outcome of our project is based on some of the most talked-about topics that form the backbone
of computer technology and have a plethora of applications in the real world:
1. Convolutional Neural Networks: These networks are a stack of layers that filter images
using convolution techniques. Convolutional layers convolve the input and pass the result to the
next layer, and so on until the output layer is reached.
The product will take an input from the surrounding environment of the user and convolve it
for a better prediction of the obstructing object, which will be done using machine learning.
2. Machine Learning: Using the convolved output image, the product will predict the object
that is hindering the user's path with the help of pre-trained datasets.
3. Global Positioning System: One of the main aspects of this product, the GPS will eventually
help the user reach his/her destination from his/her source. The user will be able to
communicate his/her destination to the software and the software will guide the user to the
destination. Many APIs are available that can be integrated with the software to show the
real-time location of the user.
4. IoT (Internet of Things): The Internet of Things is a trend in the modern technological world
and a boon to society. It helps power our world with capabilities such as wireless network
connectivity, portable processors that can be used anywhere, etc. Our project consists of a
Raspberry Pi, a GPS module, a camera module, etc. that will be effectively used in processing
data, capturing images, reporting the location of the person, etc. This domain serves as a good
link to the domains above.
1.2 Aim and Objectives:
The main aim of our project is to develop a prototype that would set a standard for products
manufactured for the blind and visually impaired. Blind people are often taken advantage of
and are left behind in most walks of life. A bitter truth blind people must face is that they
cannot drive a vehicle, and high-level jobs like those of an engineer or doctor are largely closed
to them, so they often have to accept relatively lower salaries in other jobs. Reading, too, can
only be done by means of Braille script, which can be difficult even with practice. Thus, we aim
to relieve blind people of the hurdles they face on a daily basis. By developing this prototype
product we aim to bring to light the endless possibilities machine learning has brought to the
table in terms of helping the blind and visually impaired.
To summarize our objectives:
To set the foundation for a product that aids the visually impaired.
To acquire some of the technical machine learning and artificial intelligence skills
required in the industry.
1.3 Organization of Report
The first chapter of the report contains the introduction to this project, which explains our
motivation for taking up this project and the different domains that it encompasses. The second
chapter consists of the literature survey we performed to analyze work that has already been
done in our project domain. The third chapter explains the problem statement, while the fourth
chapter deals with the design and architecture of our system. The fifth chapter is divided into
three phases: Module I deals with Object Recognition, Module II deals with Obstacle Sensing
and Module III deals with Location Sensing. The sixth chapter explains the results in detail.
The seventh chapter deals with the work distribution, while a conclusion is given in the eighth
and final chapter. The remainder of the report deals with the code, future scope and
acknowledgement.
2. LITERATURE SURVEY
In “Electronic walking stick for the blind”, published in the year 2018, the use of optical sensors has been
highlighted. This is a modern concept of a walking stick which is completely digital. These sensors
essentially convert light reflected from any surface into an electric signal, which acts as the response to
the stimulus and informs the blind person about the obstacle via a speaker on the handle. The device
aimed to give voice assistance to the user and was able to deliver in various cases.[1]
Platform: Microcontroller.
“Portable Blind Aid Device”, published in the year 2019, highlights the use of a mobile-based project
in which the user can switch his wireless device into blind-assistance mode with the help of a button.
With the help of a camera, GPS and a cloud-based architecture it is able to give the real-time location
of the user and also make him aware of his surroundings. The authors also plan to include an advanced
image recognition algorithm to recognize and store the faces of strangers.[2]
Platform: Android or IOS devices or any mobile devices.
In “Intelligent glasses for the blind”, published in the year 2016, the device is a smart glass which uses a
camera, an ultrasonic sensor and an electronic touchpad to assist the blind person. Using a mobile device
one can activate the glass, and the camera will capture and convert the 3D image into a spatial matrix
and give appropriate outputs. The touchpad gives light electric shocks to make the user aware of the
obstacles. The stated future scope was to add a walking cane with a button to give the user an audio
output of the real-time surroundings.[3]
Platform: Any computing device like mobile phones, laptops or tablets.
In “Object Identification for Visually Impaired”, published in the year 2016, a simple image recognition
system that makes use of a camera to recognize images and an ultrasonic sensor to detect obstacles is
explained. The camera captures the images and, if the image can be matched against the images in the
dataset, the device gives an audio output using a speaker attached to the user's clothing.
Future upgrades include introducing face recognition and using a wireless camera.[4]
Platform: Raspberry Pi, PIC controller.
In “Real-Time Objects Recognition Approach for Assisting Blind People”, published in the year 2017, an
object recognition project is described that, with the help of SURF (Speeded Up Robust Features) and
lightweight machine learning, is able to give accurate information about the objects captured. Using GPS
and image recognition it gives about 90% accurate results. It makes use of a database and a machine
learning algorithm to identify the objects.[5]
Platform: Windows, Ubuntu, MacOS.
“EarTouch: Facilitating Smartphone Use for Visually Impaired People in Mobile and Public Scenarios”,
published in the year 2019, describes a blind-aid technique which uses ear gestures like a long swipe, tap,
slide, etc. for operating the various functions of a smartphone. Making use of one-hand gestures and the
EarTouch feature, it enables the user to use the device with ease, and it gives audio output through the
inbuilt speakers. The system was able to read the gestures accurately and activate the respective functions
when the user wanted.[6]
Platform: Android, IOS, Windows OS or any mobile device.
“An Ultrasonic Navigation System for Blind People”, published in the year 2017, introduces another
device for obstacle detection which not only uses an ultrasonic sensor but also includes an accelerometer,
a footswitch, a microcontroller and a vibration pad. The user is made aware of the obstacle and its
closeness by different degrees of vibration and also through an audio output.[7]
Platform: PIC microcontroller.
“Mobile Blind Navigation System Using RFID”, published in the year 2015, introduces a mobile-based
communication device which makes use of Wi-Fi and GPS to determine the user's location. It also makes
use of RFID tags in the surroundings to give a more accurate location to the user and guide them with
appropriate instructions. The user makes use of a smart walking cane and a handheld RFID reader.[8]
Platform: Android, iOS and any mobile devices.
“Voice assisted system for the blind”, published in the year 2015, introduces a simple object/obstacle
detector which makes use of an ultrasonic sensor, a microcontroller, an MP3 module and an SD card. The
sensor sends the measured distance, and the microcontroller, depending on the proximity limit it is
programmed with, sends an output to the MP3 module. It makes use of depth sensing to avoid potholes
etc. so that the user can walk smoothly, comfortably and with normalcy.[9]
Platform: Microcontroller.
“Virtual-Blind-Road Following Based Wearable Navigation Device for Blind People”, published in the
year 2019, depicts a navigation device specifically for indoor environments which makes use of SLAM
(simultaneous localization and mapping). The device tries to give the person the best path to an indoor
destination by keeping track of the positions of previously encountered obstacles. It makes use of a PoI
(Points of Interest) graph to store new points on the route to the destination and, using the A* algorithm
on the PoI graph, it finds the shortest and optimum path to the destination. Obstacle distance is measured
by an ultrasonic range finder, and the device provides audio output.[10]
Platform: Embedded CPU like Raspberry or Arduino.
3. PROBLEM STATEMENT
Our product introduces a three-way combined solution system that will serve as a navigation
aid for the blind:
To ease the difficulty and uncertainty faced by the blind or any visually impaired person when
they have to walk from one place to another. A simple task like walking is scary for blind
people owing to the fact that they just do not know what obstacles, whether dangerous or not,
may be in front of them. Through our project we hope to create a product that makes navigation
a safer and simpler task for blind people. Object detection and recognition is the core concept
which our project revolves around. Our project domain is machine learning and we have
decided to make use of the concept of a Convolutional Neural Network built using TensorFlow
to process image data and perform the object recognition task after being trained with datasets.
3.1. Scope
Our project aims to implement three main features:
1) Object recognition using a camera attached to the Pi module: using our machine
learning domain and implementing a Convolutional Neural Network (CNN), we break
the images down into 2D frames and train the algorithm to identify various objects
in the surroundings.
2) Obstacle detection using an ultrasonic sensor which can detect obstacles at a
distance of up to 4 m and send an alert to the user if the proximity of the object is
within a certain given range.
3) Giving the real-time location of the user using a GPS module that is programmed to
communicate with the satellite and, at the press of a button, give the user his/her
location. The live location data is uploaded to Firebase so that, in case the
blind person gets lost or robbed in public, the authorities can make use of the last
known location of the blind person to track him/her down.
All the directions, instructions and outputs from each feature will be given to the user via a
Bluetooth speaker connected to the Bluetooth of the Raspberry Pi module. Since our final
outcome is essentially a prototype product, there is vast scope for improvement. With the rapidly
increasing trend towards machine learning and AI, additional functionalities can be implemented
in the future. Once the prototype model is complete and all the above-mentioned features are
working smoothly, a few additional features can be added pertaining to distinguishing day and
night, depending on the type of blindness. For example, if a person has night blindness, he/she
can use the glasses at night to distinguish between the various objects that hinder his/her path.
For a person suffering from complete blindness, the day-night distinguishing feature can be
implemented so that the person does not have to depend on someone else to inform him/her
whether it is day or night. Since we are performing this project at the student level, we face
restrictions in terms of hardware. If the prototype is taken up by a company, a professional
aspect can be incorporated which would make the product give more accurate readings and
predictions and be of genuinely great help to blind people.
4. SYSTEM ANALYSIS
4.1. Existing System
All the papers that we have surveyed report around a 90% success rate in the field and across
their test cases, although with certain setbacks. All the devices are capable of giving the
required results, but the reaction time demanded in the real world is short and the devices are
not always able to keep up. For real-time outputs, a more complex and advanced system and a
cloud-based architecture are required, which is difficult on a limited budget. In image
recognition, a few projects were not able to identify objects in the dark or at night. The
components used in a few projects are fragile and prone to damage, and hence could get
damaged when used by a user, while using sturdier equipment makes the project expensive.
4.2. Proposed System
Using the Raspberry Pi, we are going to implement Object Detection and Recognition,
Obstacle Distance Sensing and GPS Location Sensing.
Cost of this operation:
A single convolution operation requires 1 × S multiplications.
Since the filter is slid Q × Q times, the total number of multiplications is S × Q × Q × (number of filters).
So, for the pointwise convolution operation,
Total number of multiplications = S × Q² × P
We can now make a mathematical comparison to show how our architecture is more efficient
and computationally less intensive than standard convolution:
Normal convolution: P × Q² × K² × S
Depthwise separable convolution: S × Q² × (K² + P)
where
P = number of filters
S = number of channels in the image (e.g. an RGB image has 3 channels: red, green and blue)
Q = image dimensions (height and width)
K = kernel/filter dimensions
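As a quick illustration (with representative values assumed here rather than taken from our actual network), the saving can be checked numerically with a few lines of Python:

S = 3     # channels in the input image (RGB)
Q = 224   # spatial dimensions of the image (height = width)
K = 3     # kernel/filter dimensions
P = 64    # number of filters

normal = P * Q**2 * K**2 * S        # standard convolution: 86,704,128 multiplications
separable = S * Q**2 * (K**2 + P)   # depthwise separable: 10,988,544 multiplications
print(separable / normal)           # ~0.127, i.e. roughly 1/P + 1/K**2

For these values the depthwise separable form needs only about 13% of the multiplications of standard convolution.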
Fig 1. Single layer of the MobileNet V2 architecture used in our project
The input for the CNN we are using is an image. On the Raspberry Pi, the camera module will
take the live video feed and on pressing a button an image will be captured. This image is then
fed to the object detection program which has the CNN logic coded into it. We have made use
of the Raspberry Pi NOIR Camera Module to take the live image feed.
The output of this image analysis by the CNN will be a rectangular box around the object
detected by the CNN. The box will have a label which indicates the prediction made by the
CNN, i.e. what object the CNN thinks it is based on the training it has received. Once we have
successfully integrated a wireless speaker with the Raspberry Pi, we will provide an audio cue
by making the speaker sound out the object recognized by the neural network.
C. Location Sensing
In order for the prototype to deliver on its promise of informing the blind person of their
current physical location, a GPS feature is to be implemented. For this purpose we are
using the UBLOX Neo 6M GPS Module.
The input would be the satellite signal picked up by the antenna from space. The module has an
antenna that picks up the signal sent by a GPS satellite and circuitry that decodes the signal and
transfers the location data to the Raspberry Pi.
The output would involve extracting the latitude and longitude coordinates from the input and
converting them into a physical location address, which would be sounded on the Bluetooth
speaker.
GPS location data comes in various formatted sentences under the NMEA structure, e.g.:
$GPGGA,181908.00,3404.7041778,N,07044.3966270,W,4,13,1.00,495.144,M,29.200,M,0.10,0000*40
All NMEA messages start with the $ character, and each data field is separated by a comma.
181908.00 is the time stamp: UTC time in hours, minutes and seconds.
3404.7041778 is the latitude in the DDMM.MMMMM format. Decimal places are variable.
1 = Uncorrected coordinate.
1.00 denotes the HDOP (horizontal dilution of precision).
29.200 denotes the geoidal separation (subtract this from the altitude of the antenna to arrive at
the Height Above Ellipsoid, HAE).
The $GPGGA is a basic GPS NMEA message. There are alternative and companion NMEA
messages that provide similar or additional information.
Here are a couple of popular NMEA messages similar to the $GPGGA message with GPS
coordinates in them (these can possibly be used as an alternative to the $GPGGA message):
$GPGLL, $GPRMC
In addition to NMEA messages that contain a GPS coordinate, several companion NMEA
messages offer additional information besides the GPS coordinate. Following are some of the
common ones:
$GPGSA – Detailed GPS DOP and detailed satellite tracking information (eg. individual
satellite numbers). $GNGSA for GNSS receivers.
$GPGSV – Detailed GPS satellite information such as azimuth and elevation of each satellite
being tracked. $GNGSV for GNSS receivers.
$GPGST – Estimated horizontal and vertical precision. $GNGST for GNSS receivers.
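As a rough sketch of how such a sentence can be decoded in Python, the pynmea2 library (which our GPS code already relies on) exposes the fields by name; the sentence below is the $GPGGA example given earlier, and the attribute names are pynmea2's:

import pynmea2

sentence = "$GPGGA,181908.00,3404.7041778,N,07044.3966270,W,4,13,1.00,495.144,M,29.200,M,0.10,0000*40"
msg = pynmea2.parse(sentence)

print(msg.timestamp)       # UTC time of the fix
print(msg.latitude)        # latitude converted to decimal degrees
print(msg.longitude)       # longitude converted to decimal degrees
print(msg.num_sats)        # number of satellites used
print(msg.horizontal_dil)  # HDOP
print(msg.altitude)        # antenna altitude above mean sea level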
D. Audio Feedback
Since blind people are the target audience of our prototype, some form of auditory feedback
has to be put in place: because the user cannot see, audio messages have to be delivered in
order to make the outputs of the above features useful. For this purpose, we have made use of a
wireless Bluetooth speaker.
The input to the speaker consists of the objects recognized by the neural network, the distance
sensed by the ultrasonic sensor in the case of obstacles, and finally the physical location of the
person.
The output of the speaker is an artificial voice generated through Python which informs the
blind person of the above pieces of information.
Fig 3. Sony XRS Bluetooth speaker which we have used in our project
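A minimal sketch of how an audio message reaches the speaker from Python is given below; it mirrors the espeak-based approach used in the source code at the end of this report (the voice options are left at their defaults here):

import subprocess

def speak(message):
    # Hand the text to the espeak text-to-speech engine; the audio is routed to
    # whichever output device is active, here the paired Bluetooth speaker.
    subprocess.call('espeak "%s" 2>/dev/null' % message, shell=True)

speak("Cup Detected!")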
Software:
Raspbian OS which is the primary operating system of the Raspberry Pi on which
programs will be coded.
TensorFlow which is the machine learning framework by Google whose API is used to
code the convolutional neural network for object classification.
Python which is a high level programming language with massive machine learning
capabilities.
Under Python we will be using several special-purpose libraries:
- RPi.GPIO, Python's package that allows the program to exchange input and output with the
GPIO pins of the Pi.
- time, which is needed to measure the interval between sending and receiving the pulse in
distance sensing.
- OpenCV and Keras, for importing certain machine learning functions.
- matplotlib, for data visualization and mapping.
- TensorFlow, for building the neural network layer by layer.
- pyrebase, to allow the Python program to upload live GPS data to Firebase.
4.5. Methodology
A. Object Recognition
We have made use of the MobileNet SSD architecture to program our neural network. This
architecture uses two main operations, namely depthwise convolutions and pointwise
convolutions, i.e. the concept of depthwise separable convolution. In MobileNets the
depthwise convolution applies a single filter to each input channel. The pointwise convolution
then applies a 1×1 convolution to produce a linear combination of the outputs of the depthwise
convolution. All layers in MobileNet consist of a 3×3 depthwise separable convolution except
for the first layer, which is a full convolution. Each layer consists of the depthwise separable
convolution followed by Batch Normalization (BN) and a Rectified Linear Unit (ReLU)
nonlinearity, with the exception of the final fully connected layer, which has no nonlinearity
and feeds into a softmax layer for classification. A final average pooling reduces the spatial
resolution to 1 before the fully connected layer. Counting depthwise and pointwise
convolutions as separate layers, MobileNet has 28 layers.
The COCO dataset has been used to train our MobileNet CNN for object detection and
recognition. COCO is a dataset used for training object detection, segmentation and captioning
networks. COCO stands for Common Objects in Context, which means that the images are
taken of everyday objects in their usual settings. It has approximately 330,000 images, with
more than 200,000 of them labelled. For each convolutional layer of the architecture, the ReLU
activation function has been used along with Batch Normalization. The rectified linear
activation function, also called ReLU, is the most commonly used activation function in
artificial neural networks: it returns 0 as output for any negative input, and if it receives a
positive value x as input it returns x as the output, so in general the function is given by the
equation f(x) = max(0, x). Batch Normalization is a deep learning technique that normalizes the
output of each sublayer; it allows for faster processing and deeper network training by reducing
internal covariate shift.
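To make the layer structure concrete, the sketch below builds one depthwise separable block (3×3 depthwise convolution, BatchNorm, ReLU, then a 1×1 pointwise convolution, BatchNorm, ReLU) with tf.keras. It is an illustrative reconstruction of the block described above, not the exact layer code of the pretrained SSD-MobileNet model we load:

import tensorflow as tf
from tensorflow.keras import layers

def depthwise_separable_block(x, pointwise_filters, stride=1):
    # 3x3 depthwise convolution: one filter per input channel
    x = layers.DepthwiseConv2D(kernel_size=3, strides=stride, padding="same", use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.ReLU()(x)
    # 1x1 pointwise convolution: linear combination of the depthwise outputs
    x = layers.Conv2D(pointwise_filters, kernel_size=1, use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    return layers.ReLU()(x)

inputs = tf.keras.Input(shape=(224, 224, 3))
outputs = depthwise_separable_block(inputs, pointwise_filters=64)
tf.keras.Model(inputs, outputs).summary()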
Fig 8. Data Flow Diagram for Object Recognition
B. Obstacle Sensing
We have used the HC-SR04 ultrasonic sensor to perform this functionality. The accurate range
of this sensor is 2 cm to 400 cm, which means that it can correctly measure the distance between
the user and an obstacle when the obstacle is no less than 2 cm and no more than 400 cm away
from the sensor.
C. Location Sensing
For the purpose of determining the location of the blind person in real time, we have made use
of a Ublox Neo 6M GPS module which is easily interfaced with the Raspberry Pi. The VCC
pin of the GPS is connected to the 5V pin 1 of the Pi, the transmitter TX pin is connected to
the receiver RX GPIO pin 15 of the Raspberry Pi, and finally the ground GND pin is connected
to pin 3 of the Pi. After the connections are made, the unit is powered up and left in the open to
lock onto a satellite signal. Once the blue LED on the GPS module is blinking, it means that
location data is being received. The program extracts one of the NMEA-formatted sentences and
obtains the latitude and longitude coordinates. These coordinates are then converted to a location
address, and all the location information is uploaded to the Firebase cloud.
4.6. Design details/ Architecture
The first functionality offered by our prototype product is object detection and recognition. To
implement this the camera module will obtain a live video feed and in this feed one image or
frame will be sent to the program for analysis. The coded CNN will perform feature extraction
and match the object to a specific label. This label is nothing but the prediction made by the
neural network as to what the object is.
The second functionality we are implementing is obstacle distance sensing. Here we have
interfaced an ultrasonic sensor to send out an ultrasonic pulse in the direction of the obstacle
and the echo pulse will be received by the sensor. The exchange of signals will happen at the
I/O pins of the Pi and sensor. The time interval between the instant the pulse was sent and the
instant the echo was received is recorded and the distance between sensor and obstacle is
outputted after being calculated by a formula coded in the program.
The final feature we are incorporating in our project is GPS Location sensing. We will be
implementing this feature in our final semester i.e. semester 8. For this we have the GPS
Module with the antenna ready to interface with the Raspberry Pi. Once the module has been
locked on with a satellite GPS data will be retrieved by means of a program and using this data
the current location will be fetched in parallel with Google Maps.
For the purpose of giving the visually impaired person audio messages we will be interfacing
a Bluetooth Speaker. By means of the speaker, the user will get audio cues about objects
identified by the CNN, the distance between the user and an obstacle and also will be given an
audio representation of their location in terms of street name, city, country, zip code, etc.
Fig 12. Data Flow Diagram for the Architecture
5. IMPLEMENTATION
For the purpose of training the neural network and generating the label map, we have used the
COCO image dataset. COCO is a large-scale object detection, segmentation, and captioning
dataset. COCO has several features:
Object segmentation
Recognition in context
Super pixel stuff segmentation
330K images (>200K labeled)
1.5 million object instances
80 object categories.
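Since the source code in the appendix imports label_map_util from the TensorFlow Object Detection API, the 80 COCO categories can be turned into a lookup table roughly as sketched below (the label-map path is a placeholder for wherever the .pbtxt file sits on the Pi):

from utils import label_map_util  # TensorFlow Object Detection API utility

NUM_CLASSES = 80
PATH_TO_LABELS = "data/mscoco_label_map.pbtxt"  # placeholder path to the COCO label map

label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(
    label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

# category_index maps a numeric class id to its human-readable name,
# e.g. category_index[47]["name"] should give "cup".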
The NOIR Camera Module of the Raspberry Pi, once interfaced, provides the Python program
with the live video feed. We have implemented the MobileNet neural network using the
interfaces provided by TensorFlow. The video frames are given as input to the network, which
performs feature extraction and compares the result to the features it learnt during the training
phase. Once the probabilities of the labels are calculated, a rectangular box is drawn around the
object together with its label and probability score. The Bluetooth speaker then sounds a
message indicating which object has been recognized. This entire process happens in a matter
of seconds.
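The step from the raw network outputs to the spoken message can be sketched as below, assuming the scores, classes and category_index variables produced by the detection loop in the appendix; the 0.5 confidence threshold is an illustrative choice:

def announce_detections(classes, scores, category_index, min_score=0.5):
    # Keep only confident detections and build the spoken phrase for each one.
    messages = []
    for cls, score in zip(classes[0], scores[0]):
        if score >= min_score:
            name = category_index[int(cls)]["name"]
            messages.append("%s Detected!" % name.capitalize())
    return messages

# e.g. announce_detections(classes, scores, category_index) -> ["Cup Detected!"]

Each returned message is then handed to the speaker routine described in the audio feedback section.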
When the program is executed, GPIO pin 24 sends a signal to the TRIG pin of the sensor,
triggering it to send out an ultrasonic pulse. The program notes this as the start of the time
interval. The echo from the obstacle returns to the sensor, causing a signal to be sent out on the
sensor's ECHO pin. The ECHO signal is lowered from 5V to 3.3V by the voltage divider and
then sent to GPIO pin 23 of the Raspberry Pi. The program notes this as the end of the pulse
interval. The time taken for the pulse to hit the object and for its echo to return is calculated as
t = pulse_end - pulse_start
Finally, the distance between obstacle and sensor is calculated as
D = (t * 34300) / 2
where the speed of sound is 343 m per second, i.e. 34300 cm per second.
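In code, this timing-to-distance step reduces to a small helper like the one below, consistent with the formula above (pulse_start and pulse_end are the timestamps recorded by the program):

def distance_from_pulse(pulse_start, pulse_end):
    t = pulse_end - pulse_start        # round-trip time of the ultrasonic pulse, in seconds
    return round((t * 34300) / 2, 2)   # speed of sound 34300 cm/s, halved for the one-way distance

# e.g. a 1 millisecond round trip corresponds to about 17.15 cm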
Fig 16. Diagrammatic representation of GPS Location Sensing setup
There are three pins used to take the received data from the module:
a. GND: this pin is connected to pin 3 of the Pi, providing electrical grounding.
b. VCC: this pin is connected to pin 1 of the Pi, which provides a 5V power supply.
c. TX: the transmitter pin of the module is connected to the RX pin of the Pi, which is pin 5.
RX is the receiver pin of the Pi, which transfers the received GPS data to the serial port to be
read by the Python program.
In Python, the GeoPy, pynmea2 and GeoPandas libraries have been used. The GPS module
receives location data in the form of NMEA sentences. In the Python code, the
GPRMC (Recommended Minimum Specific GPS/Transit Data) sentence has been extracted and
used to obtain the latitude and longitude coordinates of the module's current location.
An example of a GPRMC sentence:
eg1. $GPRMC,081836,A,3751.65,S,14507.36,E,000.0,360.0,130998,011.3,E*62
eg2. $GPRMC,225446,A,4916.45,N,12311.12,W,000.5,054.7,191194,020.3,E*68
Reverse geocoding, i.e. converting coordinates to an address, has been performed by the
program to obtain the address. As with the other features, this location address is sounded to the
blind person by means of the Bluetooth speaker.
Since we wanted to upload the location data to the cloud we have made use of the pyrebase
library as well. The library allows us to create nodes in Firebase and upload latitude, longitude
and address to a new node each time the blind person uses the GPS.
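A trimmed-down sketch of that upload step, following the same pyrebase calls as the source code in the appendix (firebaseConfig is the project-specific dictionary shown there, and the example values are illustrative):

import pyrebase

firebase = pyrebase.initialize_app(firebaseConfig)
db = firebase.database()

def upload_location(latitude, longitude, address):
    # Each call pushes a new child node holding the latest GPS fix.
    db.push({"Latitude": latitude, "Longitude": longitude, "Location": address})

upload_location("19.0544", "72.8402", "Bandra West, Mumbai, Maharashtra, India")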
6. RESULTS
A. Object Recognition:
Figs 17, 18, 19. Object Recognition output for a few objects
The output for object recognition was taken using some common household objects. The
neural network was able to accurately recognize the objects we put in front of the camera
within a matter of just 5 seconds or less, which is extremely good considering that the
Raspberry Pi model we are using has just 1 GB of RAM. We tested the neural network on a
few other items and a similarly accurate and fast output was obtained. The objects that were
recognized were sounded on the Bluetooth speaker using a particular format. For example, if a
cup was detected, the message sounded was “Cup Detected!”.
B. Obstacle Sensing:
In the obstacle sensing feature using the ultrasonic sensor, whenever the obstacle is at
a distance of 15 cm or less, an alert message is sounded on the Bluetooth speaker
to warn the blind person that they are walking into something. We have chosen 15 cm as the
range at which the warning is given; however, the range can always be changed in the program
to whatever is needed. The speaker gives messages in a fixed format. For example, if the
obstacle was 10 cm away, the message sounded was
“Warning, obstacle is 10 centimeters away!”.
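The alerting logic itself is a simple threshold check; a sketch is shown below, where speak() is the text-to-speech helper sketched in the audio feedback section and 15 cm is the threshold we chose:

def alert_if_close(distance_cm, threshold_cm=15):
    # Warn the user only when the obstacle is within the configured range.
    if distance_cm <= threshold_cm:
        speak("Warning, obstacle is %d centimeters away!" % int(distance_cm))

alert_if_close(10)  # sounds "Warning, obstacle is 10 centimeters away!"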
C. Location Sensing
The GPS module was collecting strong satellite signals outdoors. The $GPGGA NMEA
sentence was used to extract the latitude and longitude coordinates as seen in the screenshot.
The coordinates were then reverse geocoded to an address. The message sounded by the speaker
for the address in the screenshot is “You are currently at Nirmala Cooperative Housing
Society, St John Baptist Rd, Bandra West, Mumbai Suburban, Maharashtra, 400050 India.”
7. PROJECT TIMELINE AND TASK DISTRIBUTION
Project Timeline:
The timeline of work and events of our project can be split across semesters 7 and 8 in the form
of tables and charts.
Semester 7:
We had managed to complete implementation of the Object Recognition and Obstacle
Detection features of our prototype.
September 2020: Collect all the remaining hardware required for initiating and completing the
project. We will finish all the basic technical activities, like installing the Raspbian OS on the
Raspberry Pi and interfacing all the components, ensuring that they are functioning correctly.
October 2020: We will begin the main technical task of coding the convolutional neural
network, incorporating Google's TensorFlow framework. Appropriate datasets will be used for
training the network. Coding will also be done for sensing and perceiving the distance of an
obstacle with the ultrasonic sensor.
November 2020: Since we want GPS location to be a feature of our project, whereby audio
output of street, city, etc. is given so that the person knows where he/she is, coding for utilizing
the GPS module will be completed. Cloud storage will be used here.
December 2020: Any bugs in the code will be rectified, and if any feature is not working as
expected it will be taken care of. We will test the device in different scenarios to see how fast
and accurate the generated results are.
January 2021: Additional features, like a reading feature or a face recognition feature which
recognizes the faces of people known to the blind person, may be developed depending on the
status of the work of the previous months.
Fig 23. Timeline chart for semester 7
Semester 8:
Fig 24. Timeline chart for semester 8.
Task Distribution:
We decided on how the overall work needed to complete the project could be broken up into
six main activities of collecting necessary hardware, interfacing components, programming/
coding, testing under different test cases, debugging any errors and finally documentation of
the work. The responsibility matrix we came up with ensured an equitable distribution of tasks
such that each group member contributed in every aspect of the project equally.
8. CONCLUSION
As stated in the introduction, we are committed to the comfort of the user: he/she should be
able to lead a near-normal life with the help of our proposed project. We were also able to
decide on and design the model as well as complete a part of it, i.e. object detection. The code
that we implemented was able to successfully distinguish between a non-living object and a
human being with a considerable amount of accuracy.
The objectives planned for this semester have been successfully fulfilled, from idea
formulation to research to design to code implementation. The work was divided successfully
among the group members, be it the study of IEEE papers for idea formulation, the design of
the system architecture, or eventually the coding of the object detection part.
Our guide reviewed our work weekly and gave us the necessary instructions wherever possible.
Last but not least, our project panel approved of the work done from idea formulation to object
detection and lauded us for it, as well as constructively criticizing certain aspects that needed
amendment. We have planned for the next semester, and hopefully we will achieve our targets
well in advance so that, if possible, we can integrate more functionalities that will appeal to
the user.
REFERENCES
[1] Ren, Shaoqing, et al. “Faster R-CNN: Towards real-time object detection with region
proposal networks.” Advances in neural information processing systems, 2015.
[2] Chi-Sheng, Hsieh. “Electronic walking stick for the blind.” U.S. Patent No. 5,097,856, 24
Mar. 1992.
[3] Evanitsky, Eugene. “Portable blind aid device.” U.S. Patent No. 8,606,316, 10 Dec. 2013.
[4] Cervantes, Humberto Orozco. “Intelligent glasses for the visually impaired.” U.S. Patent
No. 9,488,833. 8 Nov. 2016.
[5] Jothimani, A., Shirly Edward, and G. K. Divyashree. “Object Identification for Visually
Impaired.” Indian Journal of Science and Technology 9.S1, 2016.
[6] Zraqou, Jamal S., Wissam M. Alkhadour, and Mohammad Z. Siam. “Real-Time Objects
Recognition Approach for Assisting Blind People.”, 2017.
[7] Ananth Noorithaya, M. Kishore Kumar, A. Sridevi. “Voice assisted system for the blind”,
2015. https://doi.org/10.1109/CIMCA.2014.7057785
[8] Ruolin Wang, Chun Yu, Xing-Dong Yang, Weijie He, Yuanchun Shi. 2019. EarTouch:
Facilitating Smartphone Use for Visually Impaired People in Mobile and Public Scenarios. In
CHI Conference on Human Factors in Computing Systems Proceedings (CHI 2019), May 4–
9, 2019, Glasgow, Scotland, UK. ACM, New York, NY, USA, 13 pages.
https://doi.org/10.1145/3290605.3300254
[9] Mounir Bousbia-Salah, Abdelghani Redjati, Mohamed Fezari, Maamar Bettayeb. “An
Ultrasonic Navigation System For Blind People”. 2007.
https://www.researchgate.net/publication/251851635
[10] Rachid Sammouda, Ahmad AlRjoub. “Mobile Blind Navigation System Using
RFID”. 2015. https://ieeexplore.ieee.org/document/7353325
[11] Zraqou, Jamal S., Wissam M. Alkhadour, and Mohammad Z. Siam. “Real-Time Objects
Recognition Approach for Assisting Blind People.”, 2017.
[12] Simonyan, Karen, and Andrew Zisserman. “Very deep convolutional networks for large-
scale image recognition.” arXiv preprint arXiv:1409.1556 (2014).
[14] Ren, Shaoqing, et al. “Object detection networks on convolutional feature maps.” IEEE
transactions on pattern analysis and machine intelligence 39.7 (2017): 1476-1481.
[15] Anika Nawer, Farhana Hossain, Md. Galib Anwar. “Ultrasonic Navigation System for the
visually impaired & blind pedestrians” .2015.
https://www.researchgate.net/publication/283153904
[16] Shiri Azenkot, Sanjana Prasain, Alan Borning, Emily Fortuna, Richard E. Ladner, and
Jacob O. Wobbrock. 2011. Enhancing Independence and Safety for Blind People. In
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’11).
ACM, New York, NY, USA, 3247–3256. https://doi.org/10.1145/1978942.1979424
APPENDIX
Coding Conventions
Source code:
# Import packages
import os
import cv2
import numpy as np
from picamera.array import PiRGBArray
from picamera import PiCamera
import tensorflow as tf
import argparse
import sys
import subprocess
# Select camera type (if user enters --usbcam when calling this script,
# a USB webcam will be used)
camera_type = 'picamera'
parser = argparse.ArgumentParser()
parser.add_argument('--usbcam', help='Use a USB webcam instead of picamera',
action='store_true')
args = parser.parse_args()
if args.usbcam:
camera_type = 'usb'
# Import utilites
from utils import label_map_util
from utils import visualization_utils as vis_util
# Name of the directory containing the object detection module we're using
MODEL_NAME = 'ssdlite_mobilenet_v2_coco_2018_05_09'
# Path to frozen detection graph .pb file, which contains the model that is used
# for object detection.
CWD_PATH = os.getcwd()  # working directory holding the model folder (assumed; set up earlier in the full script)
PATH_TO_CKPT = os.path.join(CWD_PATH, MODEL_NAME, 'frozen_inference_graph.pb')
sess = tf.compat.v1.Session(graph=detection_graph)
# Define input and output tensors (i.e. data) for the object detection classifier
for frame1 in camera.capture_continuous(rawCapture, format="bgr", use_video_port=True):
    t1 = cv2.getTickCount()

    # Acquire frame and expand frame dimensions to have shape: [1, None, None, 3]
    # i.e. a single-column array, where each item in the column has the pixel RGB value
    frame = np.copy(frame1.array)
    frame.setflags(write=1)
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    frame_expanded = np.expand_dims(frame_rgb, axis=0)

    # Perform the actual detection by running the model with the image as input
    (boxes, scores, classes, num) = sess.run(
        [detection_boxes, detection_scores, detection_classes, num_detections],
        feed_dict={image_tensor: frame_expanded})

    # All the results have been drawn on the frame, so it's time to display it.
    cv2.imshow('Object detector', frame)

    t2 = cv2.getTickCount()
    time1 = (t2 - t1) / freq
    frame_rate_calc = 1 / time1
    rawCapture.truncate(0)

    # Press 'q' to quit the display loop
    if cv2.waitKey(1) == ord('q'):
        break

camera.close()
while(True):
    t1 = cv2.getTickCount()

    # Acquire frame and expand frame dimensions to have shape: [1, None, None, 3]
    # i.e. a single-column array, where each item in the column has the pixel RGB value
    ret, frame = camera.read()
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    frame_expanded = np.expand_dims(frame_rgb, axis=0)

    # Perform the actual detection by running the model with the image as input
    (boxes, scores, classes, num) = sess.run(
        [detection_boxes, detection_scores, detection_classes, num_detections],
        feed_dict={image_tensor: frame_expanded})

    cv2.putText(frame, "FPS: {0:.2f}".format(frame_rate_calc), (30, 50), font, 1, (255, 255, 0), 2, cv2.LINE_AA)

    # All the results have been drawn on the frame, so it's time to display it.
    cv2.imshow('Object detector', frame)

    t2 = cv2.getTickCount()
    time1 = (t2 - t1) / freq
    frame_rate_calc = 1 / time1

    # Press 'q' to quit the display loop
    if cv2.waitKey(1) == ord('q'):
        break

cv2.destroyAllWindows()
def execute_unix(inputcommand):
    p = subprocess.Popen(inputcommand, stdout=subprocess.PIPE, shell=True)
    (output, err) = p.communicate()
    return output
import RPi.GPIO as GPIO  # required by the GPIO calls below
import time

GPIO.setmode(GPIO.BCM)
TRIG = 23
ECHO = 24

while 1:
    GPIO.setmode(GPIO.BCM)
    print("Distance Measurement In Progress")
    GPIO.setup(TRIG, GPIO.OUT)
    GPIO.setup(ECHO, GPIO.IN)
    GPIO.output(TRIG, False)
    print("Waiting For Sensor To Settle")
    time.sleep(2)

    # Send a 10 microsecond trigger pulse
    GPIO.output(TRIG, True)
    time.sleep(0.00001)
    GPIO.output(TRIG, False)

    # Record the timestamps at which the echo pulse starts and ends
    while GPIO.input(ECHO) == 0:
        pulse_start = time.time()
    while GPIO.input(ECHO) == 1:
        pulse_end = time.time()

    # Distance = (round-trip time x speed of sound in cm/s) / 2
    pulse_duration = pulse_end - pulse_start
    distance = (pulse_duration * 34300) / 2
    distance = round(distance, 2)
    print("Distance:", distance, "cm")
import serial
import time
import string
import pynmea2
import pandas as pd
#import geopandas as gpd
import geopy
from geopy.geocoders import Nominatim  # Nominatim is used below for reverse geocoding
import subprocess
import pyrebase
firebaseConfig = {
"apiKey": "AIzaSyBRYIpKhCOMrf9wSJhKGoupsaRq-AxYq0o",
"authDomain": "connectingfbtopy.firebaseapp.com",
"projectId": "connectingfbtopy",
"databaseURL": "https://connectingfbtopy-default-rtdb.firebaseio.com/",
"storageBucket": "connectingfbtopy.appspot.com",
"messagingSenderId": "921973152097",
"appId": "1:921973152097:web:56e04327d2f65c039f8d20",
"measurementId": "G-DPMCW1GJSB"
};
firebase = pyrebase.initialize_app(firebaseConfig)
db = firebase.database()
def execute_unix(inputcommand):
    p = subprocess.Popen(inputcommand, stdout=subprocess.PIPE, shell=True)
    (output, err) = p.communicate()
    return output
while True:
    port = "/dev/ttyS0"
    ser = serial.Serial(port, baudrate=9600, timeout=0.5)
    dataout = pynmea2.NMEAStreamReader()
    # readline() returns bytes, so decode before comparing with the sentence header
    newdata = ser.readline().decode('ascii', errors='replace')

    if newdata[0:6] == "$GPRMC":
        newmsg = pynmea2.parse(newdata)
        lat = newmsg.latitude
        lng = newmsg.longitude
        gps = "Latitude:" + str(lat) + " Longitude:" + str(lng)
        print(gps)

        # Reverse geocode the coordinates to a human-readable address
        locator = Nominatim(user_agent="myGeocoder")
        latitude = str(lat)
        longitude = str(lng)
        coordinates = latitude + ", " + longitude
        location = locator.reverse(coordinates)
        addr = location.address

        # Push the latest fix to Firebase and speak the address aloud
        data = {"Latitude": latitude, "Longitude": longitude, "Location": addr}
        db.push(data)
        print(addr)
        string = "You are currently at " + addr
        c = 'espeak -ven+m4 -k5 -s140 --punct="?" "%s" 2>>/dev/null' % string
        execute_unix(c)
ACKNOWLEDGEMENT
The success of a project like this, which involves high technical expertise, patience beyond
limits to sit and keep watching a black-and-white terminal screen popping message after
message, and the impeccable support of our guides, is possible only with every team member
working together. So big congratulations to my team-mates.
We take this opportunity to express our gratitude to the people who have been instrumental in
the successful completion of this project. We would like to show our greatest appreciation to
Mrs. Dipti Jadhav for her tremendous support and help; without her encouragement and
support this project would have been left dangling midway. She made sure that we were always
on time. We would also like to express our gratitude to all the panel members and project
mentors for their valuable inputs during the mock presentations. Thank you to our HOD
Ms. Sana Shaikh for her constant support and motivation. Thank you all for helping us achieve
this.
Date:
PUBLICATION
Name of the Authors: Salil Fernandes, Jordan D’Souza, Anthony Kattikaren, Mrs. Dipti Jadhav.
Publication Conference: ICT4SD 2021, Goa.
Status: Paper accepted, pending the filling of the permission-to-publish and other nomination forms.
Reviews by the panel members of the conference:
1. The work is encouraging.
2. Abstract well written.
3. The originality and scientific quality of this paper is acceptable.
4. Statistical analysis in this paper is suitable and infographs are satisfactory.
5. Recommended for inclusion.
NAYAN-DRISHTI: A Revolutionary Navigation/Visual Aid for the
Visually Impaired.
Salil Fernandes 1, Jordan D’souza 1, Anthony Kattikaren 1, Mrs. Dipti Jadhav 1
Don Bosco Institute of Technology, Mumbai-70.
[email protected], [email protected], [email protected], [email protected]
Abstract: The project/proposed product hinges on three domains of computational technology, i.e., machine learning,
convolutional neural networks and the Internet of Things. The aim of the project is to develop a product that is as helpful as
possible to the disabled section of society, as well as to acquaint ourselves with the much talked about and ever-growing
domains of computer technology. The main functions that our proposed product will offer are detection of the obstructing
object and alerting via a speaker (along with classification and the distance of the object from the user) and a navigation
system (which obtains live data of the current location of the user with the help of the UBLOX GPS module). The proposed
product is designed so as to provide an all-in-one, multitasking and hassle-free solution to our user and to ease the burdens
that come along with the disability of blindness. The proposed product is touted as a boon to our users, since it will not only
help them in identifying the obstructions ahead of them but will also help them to navigate from their current location to their
destination with freedom and no fear.
Keywords: Convolutional Neural Networks (CNN), Rectified Linear Unit (ReLU), Batch Normalization (BN).
1. Introduction:
We sometimes wonder how the disabled section of our society manages to perform tasks that look impossible given their
disability. We ponder how efficiently they can perform their day-to-day tasks and how well they can communicate with
others. Keeping in mind the difficulties they face, their ability to perform such tasks and the need to ease the fear related to it,
technology has advanced in many fields.
Blindness can basically be classified into three types:
Complete Blindness, Night Blindness and Colour Blindness.
The proposed product will function accordingly as explained below using the following computing domains:
1. Convolutional Neural Networks: The product will take an input from the surrounding environment of the user and convolve
it for a better prediction of the obstructing object which will be done using machine learning.
2. Machine Learning: Using the output image, the product will predict the object hindering the user's path with the help of
pre-trained datasets.
3. Global Positioning System: The user will be able to communicate his/her destination to the software and the software will
guide the user to the destination. The UBLOX GPS module plays an important role in communicating with the satellite in
space to obtain the real-time location of the user.
4. IOT (Internet of Things): Our project consists of raspberry pi, GPS module, camera module, etc. that will be effectively
used in processing data, capturing images, displaying location of the person, etc.
2. Literature Survey:
A. Survey of Existing Systems:
In “Electronic walking stick for the blind”, published in the year 2018, the use of optical sensors has been highlighted. This is a
modern concept of a walking stick which is completely digital. These sensors essentially convert light reflected from any
surface into an electric signal, which acts as the response to the stimulus and informs the blind person about the obstacle via a
speaker on the handle. The device aimed to give voice assistance to the user and was able to deliver in various cases.[1]
“Portable Blind Aid Device”, published in the year 2019, highlights the use of a mobile-based project in which the user can
switch his wireless device into blind-assistance mode with the help of a button. With the help of a camera, GPS and a
cloud-based architecture it is able to give the real-time location of the user and also make him aware of his surroundings. The
authors also plan to include an advanced image recognition algorithm to recognize and store the faces of strangers.[2]
In “Intelligent glasses for the blind”, published in the year 2016, the device is a smart glass which uses a camera, an ultrasonic
sensor and an electronic touchpad to assist the blind person. Using a mobile device one can activate the glass, and the camera
will capture and convert the 3D image into a spatial matrix and give appropriate outputs. The touchpad gives light electric
shocks to make the user aware of the obstacles. The stated future scope was to add a walking cane with a button to give the
user an audio output of the real-time surroundings.[3]
In “Object Identification for Visually Impaired”, published in the year 2016, a simple image recognition system that makes use
of a camera to recognize images and an ultrasonic sensor to detect obstacles is explained. The camera captures the images and,
if the image can be matched against the images in the dataset, the device gives an audio output using a speaker attached to the
user's clothing. Future upgrades include introducing face recognition and using a wireless camera.[4]
In “Real-Time Objects Recognition Approach for Assisting Blind People”, published in the year 2017, an object recognition
project is described that, with the help of SURF (Speeded Up Robust Features) and lightweight machine learning, is able to
give accurate information about the objects captured. Using GPS and image recognition it gives about 90% accurate results. It
makes use of a database and a machine learning algorithm to identify the objects.[5]
“EarTouch: Facilitating Smartphone Use for Visually Impaired People in Mobile and Public Scenarios”, published in the year
2019, describes a blind-aid technique which uses ear gestures like a long swipe, tap, slide, etc. for operating the various
functions of a smartphone. Making use of one-hand gestures and the EarTouch feature, it enables the user to use the device
with ease, and it gives audio output through the inbuilt speakers. The system was able to read the gestures accurately and
activate the respective functions when the user wanted.[6]
“An Ultrasonic Navigation System for Blind People”, published in the year 2017, introduces another device for obstacle
detection which not only uses an ultrasonic sensor but also includes an accelerometer, a footswitch, a microcontroller and a
vibration pad. The user is made aware of the obstacle and its closeness by different degrees of vibration and also through an
audio output.[7]
“Mobile Blind Navigation System Using RFID”, published in the year 2015, introduces a mobile-based communication device
which makes use of Wi-Fi and GPS to determine the user's location. It also makes use of RFID tags in the surroundings to
give a more accurate location to the user and guide them with appropriate instructions. The user makes use of a smart walking
cane and a handheld RFID reader.[8]
“Voice assisted system for the blind”, published in the year 2015, introduces a simple object/obstacle detector which makes
use of an ultrasonic sensor, a microcontroller, an MP3 module and an SD card. The sensor sends the measured distance, and
the microcontroller, depending on the proximity limit it is programmed with, sends an output to the MP3 module. It makes use
of depth sensing to avoid potholes etc. so that the user can walk smoothly, comfortably and with normalcy.[9]
“Virtual-Blind-Road Following Based Wearable Navigation Device for Blind People”, published in the year 2019, depicts a
navigation device specifically for indoor environments which makes use of SLAM (simultaneous localization and mapping).
The device tries to give the person the best path to an indoor destination by keeping track of the positions of previously
encountered obstacles. It makes use of a PoI (Points of Interest) graph to store new points on the route to the destination and,
using the A* algorithm on the PoI graph, it finds the shortest and optimum path to the destination. Obstacle distance is
measured by an ultrasonic range finder, and the device provides audio output.[10]
Under CNNs there are several architectures that come into the picture, like the Faster R-CNN model, the Mask R-CNN
Inception model and the SSD ResNet models, to mention but a few. In our project we are using a Raspberry Pi 3 Model B+ to
program and execute our programs. Due to the limited memory of 1 GB RAM and the lower processing power available, we
have to make use of a neural network which is computationally less intensive but without compromising accuracy. The
architecture which suited this purpose is the SSD-MobileNet v2 architecture.
Fig 1. Single convolution layer in MobileNet. Fig 2. Data Flow diagram for Object Recognition.
The architecture uses the concept of depth-wise convolutions. A depth-wise convolution is a type of convolution wherein a single convolutional filter is applied to each input channel. In a regular 2D convolution performed over multiple input channels, the filter is as deep as the input and lets us freely mix channels to generate each element in the output. In contrast, depth-wise convolutions keep each channel separated from the others. In general, the steps used to perform a depth-wise convolution are:
1. Break up the input and the filter into channels.
2. Convolve each input channel with its corresponding filter and stack the convolved outputs together.
A depth-wise separable convolution consists of a depth-wise convolution followed by a pointwise convolution. A spatial separable convolution, by comparison, works mainly with the spatial dimensions of an image and kernel, namely the width and height. (The third dimension, called depth, is the number of channels of the image and is not taken into account by spatial separable convolutions.) A spatial separable convolution breaks a kernel into two smaller kernels. The most common case is to divide a 3x3 kernel into a 3x1 and a 1x3 kernel. In place of performing a single convolution with 9 multiplications, we perform two convolutions with 3 multiplications each to get the same result. With fewer multiplications, computational complexity goes down and the network is able to run more efficiently. Unlike spatial separable convolutions, depth-wise separable convolutions use kernels that cannot be divided into two smaller spatial kernels. The depth-wise separable convolution is so named because, in addition to the spatial dimensions, it also deals with the depth dimension, i.e. the number of channels. An input image can have 3 channels (RGB); after a few convolutions, an image may have many more channels. An image with 64 channels can be thought of as 64 different filtered versions of the same image. Analogous to the spatial separable convolution, a depth-wise separable convolution splits the work into two convolutions: a depth-wise convolution and a pointwise convolution.
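To make the two steps concrete, the snippet below is a minimal sketch in TensorFlow/Keras; the input shape and channel counts are chosen purely for illustration and are not taken from our project code.

import tensorflow as tf
from tensorflow.keras import layers

# Example feature map: 224x224 spatial size with 32 channels.
inputs = tf.keras.Input(shape=(224, 224, 32))

# Depth-wise step: one 3x3 filter per input channel, channels stay separate.
x = layers.DepthwiseConv2D(kernel_size=3, padding="same")(inputs)

# Pointwise step: a 1x1 convolution that mixes channels into 64 outputs.
x = layers.Conv2D(filters=64, kernel_size=1)(x)

model = tf.keras.Model(inputs, x)
model.summary()

For a 3x3 kernel with 32 input and 64 output channels, a standard convolution needs 3 × 3 × 32 × 64 = 18,432 weights, whereas the depth-wise separable version above needs only 3 × 3 × 32 + 32 × 64 = 2,336, which is exactly the saving MobileNet exploits.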
B. Dataset Used:
The COCO dataset has been used to train our MobileNet CNN for object detection and recognition. COCO is a dataset used for training object detection, segmentation and captioning networks. COCO stands for Common Objects in Context, which means that the images used to prepare the dataset show everyday objects in everyday scenes. It has approximately 330,000 images, with more than 200,000 of them labelled.
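To give a concrete picture of how such a COCO-trained model is typically run on the Raspberry Pi, the sketch below loads an SSD-MobileNet v2 detector that has been converted to TensorFlow Lite. The model file name, the use of OpenCV to grab a frame, and the assumption of a quantized (uint8) model are illustrative placeholders rather than the exact files and settings of our build.

import numpy as np
import tensorflow as tf
import cv2

# Load a COCO-trained SSD-MobileNet v2 model converted to TensorFlow Lite
# ("ssd_mobilenet_v2.tflite" is a placeholder path).
interpreter = tf.lite.Interpreter(model_path="ssd_mobilenet_v2.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Grab one frame from the attached camera (OpenCV is used here only for
# convenience in capturing and resizing the image).
cap = cv2.VideoCapture(0)
ret, frame = cap.read()
cap.release()

# Resize to the network's expected input size (commonly 300x300 for SSD).
height, width = input_details[0]['shape'][1:3]
resized = cv2.resize(frame, (int(width), int(height)))
input_data = np.expand_dims(resized, axis=0).astype(np.uint8)  # quantized model assumed

interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()

# Typical TFLite SSD detectors output boxes, class ids, scores and a count.
boxes = interpreter.get_tensor(output_details[0]['index'])
classes = interpreter.get_tensor(output_details[1]['index'])
scores = interpreter.get_tensor(output_details[2]['index'])
print("Top detection score:", scores[0][0])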
c. ECHO, which will gather the echo pulse from the obstacle and send the signal that the obstacle has been detected back to the Raspberry Pi.
d. GND, which provides general grounding to the sensor.
For exchanging signals and for the program to receive readings from the sensor, the GPIO pins of the Raspberry Pi were connected to the pins of the sensor. We have used the HC-SR04 ultrasonic sensor to perform this functionality. The accurate range of this sensor is 2 cm to 400 cm, which means that it can correctly measure the distance between the user and an obstacle when the obstacle is no less than 2 cm and no more than 400 cm away from the sensor. Extra caution had to be taken when implementing the circuit between the sensor and the GPIO pins. The ultrasonic sensor operates at 5V, and this voltage was supplied by connecting GPIO pin 1 to the VCC pin of the sensor. Since all the GPIO pins (except for pin 1) operate at 3.3V while the signals sent by the sensor are at 5V, care has to be taken to lower the voltage. To do this we made use of a voltage divider circuit implemented on a breadboard. We used 560 ohm and 1000 ohm resistors to lower the 5V echo signal to approximately 3.3V.
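The sketch below shows how this wiring is typically driven from Python with the RPi.GPIO library; the BCM pin numbers chosen here (23 for TRIG, 24 for the divided ECHO signal) are illustrative assumptions, not the exact pins of our circuit.

import time
import RPi.GPIO as GPIO

TRIG = 23  # assumed BCM pin for the trigger line
ECHO = 24  # assumed BCM pin for the voltage-divided echo line

GPIO.setmode(GPIO.BCM)
GPIO.setup(TRIG, GPIO.OUT)
GPIO.setup(ECHO, GPIO.IN)

def read_distance_cm():
    # Send a 10 microsecond trigger pulse to start a measurement.
    GPIO.output(TRIG, True)
    time.sleep(0.00001)
    GPIO.output(TRIG, False)

    # Time how long the echo pin stays high.
    pulse_start = pulse_end = time.time()
    while GPIO.input(ECHO) == 0:
        pulse_start = time.time()
    while GPIO.input(ECHO) == 1:
        pulse_end = time.time()

    duration = pulse_end - pulse_start
    # Speed of sound taken as 34,000 cm/s; halve for the one-way distance.
    return (duration * 34000) / 2

print("Distance: %.1f cm" % read_distance_cm())
GPIO.cleanup()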
E. Location Sensing:
In order to allow the blind person to know where exactly he/she is when navigating outdoors, a GPS sensor has been used. The
Ublox Neo 6M GPS Module allows for easy interfacing with the Raspberry Pi.
The antenna, once attached to the sensor, allows GPS data to be received. The antenna has to be left exposed to the sky so that the module can lock onto the satellites and start receiving GPS data. There are 3 pins that are used to take the received data from the module:
a. GND: this pin is connected to pin 3 of the Pi for providing electrical grounding.
b. VCC: this pin is connected to pin 1 of the Pi which provides a 5V power supply.
c. TX: the transmit pin of the module is connected to the RX pin of the Pi, which is pin 5. RX is the receiver pin of the Pi, which passes the received GPS data to the serial port to be read by the Python program.
In Python the GeoPy, PyNMEA2 and GeoPandas libraries have been used. The GPS module receives location data in the form of NMEA sentences. In the Python code, the GPGLL (geographic position, latitude/longitude) sentence has been extracted and used to obtain the latitude and longitude coordinates of the module's current location. Reverse geocoding, i.e. converting the coordinates to an address, is then performed by the program to obtain the address. As with the other features, this location address is sounded to the blind person by means of the Bluetooth speaker.
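A minimal sketch of this pipeline is shown below; the serial device path /dev/serial0, the baud rate and the Nominatim user-agent string are assumptions for illustration and may differ from our actual configuration.

import serial
import pynmea2
from geopy.geocoders import Nominatim

# Open the serial port the GPS module is attached to (path is an assumption).
port = serial.Serial("/dev/serial0", baudrate=9600, timeout=1)
geolocator = Nominatim(user_agent="nayan_drishti_demo")  # illustrative agent name

while True:
    line = port.readline().decode("ascii", errors="ignore").strip()
    if line.startswith("$GPGLL"):
        msg = pynmea2.parse(line)                    # parse the GPGLL sentence
        lat, lon = msg.latitude, msg.longitude       # decimal-degree coordinates
        location = geolocator.reverse((lat, lon))    # reverse geocode to an address
        print("You are near:", location.address)
        break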
F. Wireless Sound:
The wireless Bluetooth chip that comes with the Raspberry Pi has been used to provide audio messages. A Bluetooth speaker is connected to the Pi with the help of this chip. To make use of Bluetooth, several packages had to be installed first, namely BlueZ, ALSA, Bluetooth Manager and PulseAudio.
Since our device aims to help blind people navigate, audio messages have to be delivered, and this is the primary reason for using a wireless speaker. It is not possible for blind people to get visual cues about their environment, hence audio hints have to be delivered to inform them.
In the object detection and recognition feature, once an object is successfully detected, the speaker delivers a message, e.g. “Cup Detected”.
In the obstacle sensing feature, whenever the ultrasonic sensor senses that the obstacle's distance is less than 10 cm, a message is sounded, e.g. “Careful, obstacle is 10 cm away”.
In the location sensing feature, the speaker sounds the address of the current location obtained by reverse geocoding the
coordinates.
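The report does not spell out the exact text-to-speech tool, so the sketch below simply assumes the common espeak utility is installed on the Pi and that the paired Bluetooth speaker is already the default PulseAudio output; it shows how such spoken alerts can be triggered from Python.

import subprocess

def speak(message):
    # Hand the text to the espeak command-line TTS engine; the audio goes to
    # whatever device PulseAudio currently uses as its default sink (here,
    # the paired Bluetooth speaker).
    subprocess.run(["espeak", message])

speak("Cup Detected")
speak("Careful Obstacle is 10 cm away")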
4. Mathematical Model:
Input: H × W × K            Operator: 1×1 convolution                        Output: H × W × TK
Input: H × W × TK           Operator: 3×3 depth-wise convolution, stride s   Output: (H/s) × (W/s) × TK
Input: (H/s) × (W/s) × TK   Operator: 1×1 convolution (linear)               Output: (H/s) × (W/s) × K'
T is called the expansion factor, s is the stride, and H and W are the height and width of the image respectively. For our MobileNet, the depth-wise convolution applies a single filter to each input channel. The pointwise convolution then applies a 1×1 convolution to produce a linear combination of the outputs of the depth-wise convolution. All layers in MobileNet consist of a 3×3 depth-wise separable convolution except for the first layer, which is a full convolution. A final average pooling layer reduces the spatial resolution to 1 before the fully connected layer. Counting depth-wise and pointwise convolutions as separate layers, MobileNet has 28 layers.
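As a quick cross-check of this structure (purely illustrative, using the stock Keras MobileNet rather than our trained detector), the backbone can be instantiated and inspected to see the alternating depth-wise and pointwise layers described above.

import tensorflow as tf

# Build the standard MobileNet backbone with random weights (no download needed).
backbone = tf.keras.applications.MobileNet(weights=None, input_shape=(224, 224, 3))

# Print the layer names; depth-wise (conv_dw_*) and pointwise (conv_pw_*) layers alternate.
for layer in backbone.layers:
    print(layer.name)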
B. Ultrasonic Distance Sensing:
We have assumed the speed of sound to be approximately 340 metres per second, or in other words 34,000 cm per second. Because the time interval being measured runs from the instant the ultrasonic pulse is sent to the instant the echo is received, it covers the round trip to the obstacle and back, so the result has to be halved to obtain the one-way distance between the user and the obstacle.
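As a worked example (the timing value is illustrative): distance in cm = (34,000 × t) / 2, where t is the measured echo time in seconds. An echo returning after t = 0.001 s therefore corresponds to (34,000 × 0.001) / 2 = 17 cm, while t = 0.0005 s corresponds to 8.5 cm, which would fall inside the 10 cm alert threshold used in our tests.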
alert is generated. The range can be manually changed in the program.
C. Live Location Sensing:
This program enables the GPS module to acquire and lock onto the satellite signal. Once the signal is locked, the program determines the user's current location and announces it to the user.
Fig 10. Ultrasonic Distance output depicting an alert for distances less than or equal to 10 cm.
Fig 11. Live Location Detection.
7. Conclusions:
Through the results depicted in the above section, it is clear that our object detection was overall a success in terms of efficiency and accuracy. The system could easily differentiate between basic objects, with an efficiency rate ranging between 75% and 98% depending on the frame rate. On testing the object distance measurement, we found that the system was likewise efficient in displaying and sounding an alert as an object neared the user. In actual use an alert message would be sounded when the user is around 3-5 m from the obstructing object, but for test purposes we chose 10 cm as the minimal limit for sounding an alert message. Thus, object detection and object recognition were successfully implemented.