Blind Navigation Report
2018-2019
A PROJECT REPORT ON
“BLIND NAVIGATION SYSTEM USING ARTIFICIAL INTELLIGENCE”
BACHELOR OF ENGINEERING
IN
COMPUTER SCIENCE & ENGINEERING
CERTIFICATE
Certified that the project work entitled “BLIND NAVIGATION SYSTEM USING
ARTIFICIAL INTELLIGENCE”, presented by Deekshit LN Swamy (1AT15CS019),
Kavitha C (1AT15CS036) and Kavya Sree M (1AT15CS037), bonafide students of Atria Institute
of Technology, Bangalore, is in partial fulfilment for the award of the degree of Bachelor of
Engineering in Computer Science and Engineering of Visvesvaraya Technological University,
Belagavi, during 2018-19. It is certified that all corrections/suggestions indicated during internal
assessment have been incorporated in the report. The project report has been approved as it
satisfies the academic requirements with respect to the project work prescribed for the said
degree.
External Viva
(Name of Internal/External Examiner with Signature & Date)
Examiner 1:
Examiner 2:
DECLARATION
We, Deekshit LN Swamy (1AT15CS019), Kavitha C (1AT15CS036) and Kavya Sree M
(1AT15CS037), bonafide students of the Eighth Semester, Department of Computer Science
& Engineering, Atria Institute of Technology, hereby declare that this project titled “BLIND
NAVIGATION SYSTEM USING ARTIFICIAL INTELLIGENCE” has been carried
out by us under the guidance of Mr. Vijay Swaroop, Associate Professor, Dept. of CS&E. We
also declare that this project, which is aimed towards the partial fulfilment of the academic
requirements for the award of the degree of Bachelor of Engineering of Visvesvaraya
Technological University (VTU), Belagavi, for the academic year 2018-2019, is the
result of our own efforts and that no part of the project has been plagiarized without
citations.
ABSTRACT
The need to move from place to place is crucial for any human who intends to complete a
specific task. The path a person takes is not always favourable or feasible for that person:
there is a high chance of encountering obstacles, at which point the person must use visual
and cognitive skills to decide which path avoids the obstacle. This is a real problem for
visually impaired people. This project illustrates one way of overcoming this issue, keeping
visually impaired people in focus. We have devised a system capable of identifying objects
and suggesting to the user which path to choose based on the obstacle ahead. We have
implemented the YOLO algorithm for classification and identification of objects. The system
uses a client-server architecture so that the device remains closer to real time and efficient.
The outcome of the project is to indicate, through audio guidance, the safest path to choose
with respect to the object ahead of the user.
ACKNOWLEDGMENT
The foundation for any successful venture is laid out not just by the individual accomplishing
the task, but also by several other people who believe that the individual can excel and put in
their every bit in every endeavor he/she embarks on, at every stage in life. And the success is
derived when opportunity meets preparation, also supported by a well-coordinated approach
and attitude.
We would like to express our sincere gratitude to the respected principal Dr. K.V.
Narayanaswamy, for providing a congenial environment to work in. We would also like to
express our sincere gratitude to Dr. Aishwarya P, Head of the Department of Computer Science,
for her continuous support and encouragement.
We are indeed indebted to Mr. G. Srinivas Achar, our coordinator, and Mr. Vijay Swaroop, our
guide, for their continued support, advice and valuable inputs during the course of this project
work.
Last but not least, we would like to thank our families, who have acted as a beacon of
light throughout our life.
Our sincere gratitude goes out to all our comrades and well-wishers who have supported us
through all the ventures.
TABLE OF CONTENTS
ABSTRACT
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1: INTRODUCTION
1.1 PURPOSE
1.2 PROBLEM STATEMENT
1.3 SYSTEM OVERVIEW
1.4 SCOPE
1.5 OBJECTIVES
1.6 EXPECTED RESULTS
4.2 FLOWCHART
CHAPTER 5: IMPLEMENTATION
5.1 METHODOLOGY
CHAPTER 6: RESULTS
LIST OF FIGURES

LIST OF TABLES
TABLE 1.6 EXPECTED RESULTS
CHAPTER 1
INTRODUCTION
The world is ever changing. There has been phenomenal growth in technology and the
economy in the past few decades. As a result, we are expected to adapt ourselves to the
changes implemented. These changes may be comfortable for quite a large portion of the
population; nevertheless, another section of the population fails to adapt to them.
Amongst them are the blind. Analysis of data from 188 countries suggests there are more
than 200 million people with moderate to severe vision impairment, and that figure is expected
to rise to more than 550 million by 2050. Consider also that a region once familiar to a blind
person may not remain the same a couple of days later.
With the advancements in Artificial Intelligence, we now have various algorithms that can
perform tasks like identifying an object. This ability comes in handy, as this could practically
act like an artificial eye to the blind. It could distinguish between static objects and dynamic
objects and help the blind walk around. A voice assistant is software whose key role is to
provide voice guidance to the blind person; here, the speech output is generated with a
machine-learning-based text-to-speech service (Google gTTS).
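As a rough illustration, a voice prompt can be generated with the gTTS library and played with the mpg321 player used elsewhere in this report; the message and file name below are illustrative assumptions, not the project's exact prompts.

# Minimal sketch of producing a voice prompt with gTTS
from gtts import gTTS
import os

def speak(message, out_file="prompt.mp3"):
    # Convert the text message to speech and save it as an MP3 clip
    tts = gTTS(text=message, lang="en")
    tts.save(out_file)
    # Play the clip through the earphone/speaker attached to the Raspberry Pi
    os.system("mpg321 " + out_file)

speak("Device online and ready")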
1.1 PURPOSE
Current implementations of blind navigation primarily use a stick in which a proximity sensor
is implanted. This approach does not meet many of the needs that sighted people can fulfil for
themselves. Hence, we came up with an approach that identifies the type of object ahead and
also provides a navigation guide via audio. This enables the blind person to understand the
surroundings and move safely with confidence.
1.2 PROBLEM STATEMENT
Most technologies used today need the user to look at and operate them because of the
convenience of a GUI. However, this is not suitable for a visually impaired person. The problem
goes beyond navigation alone: it also closes off many job opportunities, since a blind person
finds it difficult to navigate to a place of employment. As a result, the economic growth of
a country suffers.
1.3 SYSTEM OVERVIEW
The entire project system is categorized into two types of processing. The initial setup and
processing happen on the Raspberry Pi itself. We refer to this device as the server device,
as it plays the role of sending images to the host computer (the client), where the actual
processing of the images is done. On the Raspberry Pi, a few Python scripts are written to
invoke some preliminary actions. One of these is to determine the close proximity of objects:
the Raspberry Pi uses an ultrasonic sensor to obtain information about the closeness of the
person to the object, and it triggers a stop signal only if the object is sufficiently near. Once
the object is near, the script responsible for capturing images of the object ahead is invoked,
and it saves and stores the images. These images are then ready to be sent to the client
(host computer).
When the images have been stored, the script responsible for the server socket is initiated.
The server socket waits for a connection with the client system. If no connection is available,
the process continues to wait; if a connection is established, the images from the Raspberry Pi
are sent to the host computer.
The images on the host computer (client) then undergo image processing through a script
that implements YOLO. After the processing is done, the result is used to determine the
navigational decision parameter, which is sent back to the Raspberry Pi. The Raspberry Pi
then determines in which direction the person should deviate with respect to the object
encountered, and this decision is conveyed through an audio output announcing the direction.
1.4 SCOPE
This concept of navigation is merely a first step towards innovation of this sort. The device
could be integrated with navigation-related applications found on the internet, such as
booking a cab, home delivery services, etc., and could moreover be made traceable to
ensure that the person is not lost. Furthermore, an active group of volunteers could help
improve the detection algorithms and update any newly implemented systems in society,
making the device adaptable to changes as the nation develops.
1.5 OBJECTIVES
The system involves a step-by-step approach to determine a navigation path that directs the blind
person towards the destination. To make this functional, there are certain objectives that the
system should achieve:
• Object identification.
• Differentiation between Static (still) objects and Dynamic (moving) objects.
• Navigation system, to record path and remember locations.
• Voice-assisted, action-based system that warns the person about an obstacle and, in
addition, helps direct the blind person to a location.
• Voice assistant that recognizes user command to make calls, book cabs, etc.
• Notify the user about a scheduled plan and create a navigation plan.
Title: You Only Look Once: Unified, Real-Time Object Detection
Description:
In this paper, the authors present a unified architecture that is extremely fast. The base YOLO
model processes images in real time at 45 frames per second. A smaller version of the network,
Fast YOLO, processes an astounding 155 frames per second while still achieving double the
mAP of other real-time detectors. Compared to state-of-the-art detection systems, YOLO makes
more localization errors but is far less likely to predict false detections where nothing exists.
Finally, YOLO learns very general representations of objects.
Title: Blind Navigation Assistance for Visually Impaired Based on Local Depth
Hypothesis from a Single Image
Description:
This paper proposes a depth estimation technique from a single image based on a local depth
hypothesis, devoid of any user intervention, and its application in assisting visually impaired
people. The ambient space ahead of the user is captured by a camera and the captured image
is resized for computational efficiency. The obstacles in the foreground of the image are
segregated using edge detection followed by morphological operations. Depth is then estimated
for each obstacle based on the local depth hypothesis. The estimated depth map is compared
with the reference depth map of the corresponding depth hypothesis, and the deviation of the
estimated depth map from the reference depth map is used to retrieve spatial information
about the obstacles ahead of the user.
Description:
This paper presents the design and development of a smart and intelligent cane that helps
visually impaired people navigate. The navigator system detects an object or obstacle using
ultrasonic sensors and gives audio instructions for guidance. The signals from the ultrasound
sensor are processed by a microcontroller in order to identify sudden changes in the ground
gradient and/or an obstacle in front. The algorithm developed gives a suitable audio instruction
depending on the duration of ultrasound travel, which in turn is made available by an MP3
module associated with the system. This work presents a new prototype of a navigation system
on a cane which can be used as a travel aid for blind people. The product developed is light in
weight and hence does not cause fatigue to the user.
Author: Mirjana Maksimović, Vladimir Vujović, Nikola Davidović, Vladimir Milošević and
Branko Perišić
Description:
This paper examines the Raspberry Pi as a small programmable computer board. A comparative
analysis of its key elements and performance against some existing IoT prototype platforms
shows that, despite a few disadvantages, the Raspberry Pi remains an inexpensive computer
that is used very successfully in a diverse range of research applications within the IoT vision.
Description:
The objective of this project is to design, prototype and evaluate a smart guidance system to
help blind students navigate in all places. The system is intended to provide two overall
measures: artificial vision and object detection. The aim of the overall system is to provide a
low-cost and efficient navigation aid for the blind which gives a sense of artificial vision by
providing information about the environmental scenario of static and dynamic objects around
them.
Description:
This paper builds on studies that have established the need for, and utility of, an accessible
urban transport system for visually impaired persons. However, most public transportation
systems, especially in developing countries, are not accessible, and this is often listed as one of
the major bottlenecks for the social and economic inclusion of the visually impaired. An
on-board bus identification and homing system has been developed to address these needs.
Description:
In this paper, the proposed system is a portable camera-based visual assistance prototype for
blind people that identifies currency notes and also helps them read printed text on handheld
objects.
Description:
Two of the most demanding and widely studied applications relate to object detection and
classification. The task is challenging due to variations in product quality under certain
complicated circumstances influenced by nature and humans. Research in these fields has
resulted in a wealth of processing and analysis methods. In this paper, current advances in the
field of object detection and categorization based on computer vision are explicitly explored,
and a comparison of these methods is given.
Description:
There has been extensive research in the field of object detection and tracking. Many
remarkable algorithms have been developed for object detection and tracking, including color
segmentation, edge tracking and many more. However, all these algorithms achieved only
limited success in real-world implementations and were also bounded by constraints such as a
white or plain background. This paper is the result of research in which the authors developed
and implemented an object detection and tracking system operating against an unknown
background, using real-time video processing and a single camera.
Description:
This paper presents a review of the various techniques that are used to detect an object,
localize an object, categorize an object, extract features and appearance information, and more,
in images and videos.
Description:
This paper discusses machine learning, which is used in a variety of computational tasks where
designing and programming explicit algorithms with good performance is not easy. Applications
include email filtering and recognition of network intruders or malicious insiders working
towards a data breach. One of the foundational objectives of machine learning is to train
computers to utilize data to solve a specified problem.
Description:
Image processing is any form of signal processing for which the input is an image, such as
photographs or frames of video; the output of image processing can be either an image or a
set of characteristics or parameters related to the image. Most image-processing techniques
involve treating the image as a two-dimensional signal and applying standard signal-
processing techniques to it.
Author: Shahenda Sarhan
Description:
This paper shares the dream of having a domain-independent search engine, and not merely an
ordinary one but a smart search-by-voice engine which searches the user's speech automatically,
without the user's request, and provides evidence on that speech; this engine is called SVSE.
Description:
The system described in this paper, named Personal Assistant with Voice Recognition
Intelligence, takes user input in the form of voice or text, processes it, and returns the output in
various forms, such as an action to be performed or a search result dictated to the end user.
Description:
Voice assistants are software agents that can interpret human speech and respond via
synthesized voices. Apple’s Siri, Amazon’s Alexa, Microsoft’s Cortana, and Google’s
Assistant are the most popular voice assistants and are embedded in smartphones or dedicated
home speakers. Users can ask their assistants questions, control home automation devices and
media playback via voice, and manage other basic tasks such as email, to-do lists, and
calendars with verbal commands.
The paper proposes a system for visually challenged people: an electronic travelling aid which
makes use of haptic feedback. It performs various tasks such as detecting obstacles, identifying
them, and providing information about them. Circuitry is embedded in the cane for this purpose.
When the cane comes across an obstacle along its path, it generates a vibration, mimicking the
real sensation of the object with the help of an eccentric-mass motor installed in the cane as an
actuator. Obstacle detection is performed using a disc-shaped ultrasonic sensor and a
detection-simulation algorithm. This system also provides a smart processing architecture,
avoids cross-interference between the actuators and the cane, and avoids cane-floor interaction.
Though this system provides aid in navigation, it cannot adapt itself to new surroundings whose
database is not available.
Title: The GuideCane - applying mobile robot technologies to assist the visually impaired
In this reference, the user pushes the lightweight GuideCane, which is used for obstacle
avoidance. When the ultrasonic sensor detects an obstacle, the embedded computer determines a
suitable direction of motion that steers the GuideCane and the user around it. This steering is
quite noticeable at the handle, so the user is easily guided and needs only minimal training to
use this aid. The GuideCane is very intuitive and users can travel effortlessly at an average
speed of 1 m/s. Since the sonars are close to the ground, they may sometimes detect minor
irregularities in the ground and misinterpret them as obstacles. The obstacle avoidance
performance is adequate in indoor environments only.
In this proposed system, a smart stick helps the visually impaired negotiate obstacles, pits or
water, and helps the visually impaired person lead a more confident life. The proposed walking
stick includes a GPS unit which is preprogrammed to identify the optimal route. Sensors such as
a pit sensor, water sensor and ultrasonic sensors are used. It also consists of a voice synthesizer,
keypad, vibrator, level converter, and speaker or headphone. The controller used is a PIC
controller, and the proposed system includes two parts: a sensor unit and a GPS unit. For the
identification of pits, infrared sensors are used, whereas ultrasonic sensors can be used to detect
any other obstacle. In order to detect water, electrodes are fixed at the bottom so that they sense
water and give an indication to the visually impaired person. When the person reaches the
destination using GPS, he is alerted with a voice response. As it is used only for a defined set of
routes, it may not be helpful in navigating a new route.
Title: A Multidimensional Walking Aid for Visually Impaired Using Ultrasonic Sensors
Network with Voice Guidance
In this paper, a walking aid for the visually impaired is designed and implemented using a
network of ultrasonic sensors, thereby capable of detecting the direction and position of
obstacles. The performance and functionality are improved by the addition of an alert light and
a voice guidance signal which is relayed to a miniature headset. The recorded voice alerts the
user to the presence and direction of the obstacle. It can detect obstacles within 0 m to 1 m to
the left, to the right and in front of the stick, with an appropriate voice alert. However, the
walking stick designed here cannot determine the exact distance to the obstacle.
Title: Environment sniffing smart portable assistive device for visually impaired
individuals
The proposed system consists of a belt along with a smart cane which mimics the real sensation
of touch, achieved by using haptic feedback. Along with obstacle detection (using an ultrasonic
sensor), this system provides solutions such as staircase detection and human detection (using a
PIR sensor). Voice assistance is provided with the help of a smartphone app. An object database
is maintained against which the captured object is matched and identified. Initially, the
destination information is collected from the user and given as input to the smartphone, and
Google Maps is used to provide directions to that destination. There are three pairs of ultrasonic
sensors which detect objects in the right and left directions and guide the user accordingly. In
scenarios where obstacles are present in all directions, there may be ambiguity in detection.
Also, the smartphone app used may not always yield accurate results.
Title: Autonomous walking stick for the blind using echolocation and image processing
In this reference, the author proposes a smart walking stick called the Assistor. It works on the
principle of echolocation, using an ultrasonic sensor to echo sound waves and detect objects. An
image sensor is used for capturing run-time objects, and a smartphone application is used for
navigation. These two sensors are placed on the Assistor, which continuously feeds information
to the smart application through Bluetooth technology. The system is also provided with servo
motors for the mobility of the stick; the motors are adjusted based on the directions provided by
Google Maps. A deepening depth-first search algorithm is implemented in order to provide
better navigation. The major drawback of this work is the unreliability of the smartphone app.
Also, it does not provide any solution to determine the distance between the obstacle and the
user.
Title: Object recognition for blind people based on features extraction, International
Image Processing, Applications and Systems Conference
The paper proposes a visual substitution system based on video analysis and interpretation. It is
based on robust algorithms that recognize and locate objects in the image. A single camera is
used to capture the image. The type of features extracted depends on the type of image
captured; the images can be in binary, gray-level or color format. The major algorithms used are
the Scale Invariant Feature Transform (SIFT) and Speeded-Up Robust Features (SURF). The
SIFT algorithm works by comparing the target image with a reference image stored in the
database, based on the Euclidean distance between feature vectors and key points. The SURF
algorithm can be used in the field of computer vision for obstacle detection and 3D object
reconstruction; it works on the principle of Haar-wavelet responses computed at the current
scale 's'. The proposed system uses two frames: while the first frame is matched with objects in
the database, successive frames are matched against previous ones. When an object is
recognized, the corresponding voice output is given to the user. Here the precision of object
recognition and the processing time play a vital role.
Title: Design and development of smart assistive device for visually impaired people
The proposed system helps the blind person navigate alone safely and avoid any obstacles that
may be stationary or mobile. The device provides voice output directions to the user through
RFID technology. The system consists of five sections: obstacle detection, traffic signal
information, bus route information, location of the blind stick, and power generation. Obstacles
are detected through IR sensors and a notification is given to the user through vibrations. Two
IR receivers placed on the blind stick help in determining the distance of the object from the
user. For traffic signal detection, when the signal turns green, the IR receiver installed on the
traffic pole gets activated and the corresponding voice output is generated; this enables the user
to cross the street independently without relying on others. In the bus route information system,
the destination of the bus is provided to the blind person through a prerecorded voice output,
and the user is prompted with a voice message about the arrival of the bus when it is within RF
range. The location of the blind stick can be obtained by providing a switch with an RF
transmitter which gets activated when pressed and gives a beep sound at the receiver end. The
wheels placed on the stick can be used to generate power by transforming mechanical energy
into electrical energy. Since the IR sensors are placed on a belt worn by the user, wearing it at all
times may cause discomfort.
Title: Blind aid stick: Hurdle recognition, simulated perception, android integrated
voice based cooperation via GPS along with panic alert system
In this reference, IR sensors along with ultrasonic range-finder circuits are used for hurdle
detection. By combining a Bluetooth module with GPS technology and an Android app for the
blind, voice assistance is provided, and in a panic situation the location information is shared
with the concerned person through SMS. The user is alerted through different durations of
buzzer ringing and vibration to judge the distance from the obstacle. It is easy to operate, low in
power consumption and cost-efficient. However, any malfunction that changes the duration of
vibration or ringing will result in unreliable navigation, and the Android app (Google Maps) is
not always reliable as it depends on internet connectivity.
Title: An invisible eye for the blind people making life easy for the blind with Internet of
Things
In this paper, a smart electronic travelling aid called BlinDar is proposed. The location
information is shared on the cloud using GPS and an ESP8266 Wi-Fi module. A gas sensor is
used to detect fire in case of a fire accident. Also, an RF transmitter/receiver module is used to
locate the stick when it is misplaced. Apart from this, the main sensing devices are ultrasonic
sensors, which can detect obstacles and potholes up to a range of 2 m. The main computing
engine used here is an Arduino Mega2560. BlinDar is a fast-responding, low-power, lightweight
and cost-effective device for the visually challenged. Navigation sometimes gets difficult as
GPS is not very reliable in indoor environments.
The purpose of the project is to help the blind person avoid obstacles of any kind in his
path. This is only possible if there is an object detection module, such as an ultrasonic
sensor, capable of registering any change in the distance between the person and the
object in real time.
Thus, we make use of a function designed exclusively to determine if the object is in
close proximity.
Once we know that there is an object ahead, we also need to know what kind of object
it is and where it lies with respect to the person. This is determined using an algorithm
called YOLO (You Only Look Once). This algorithm not only classifies objects but also
localizes and labels them. This feature comes in handy when it comes to building a
virtual awareness of the object ahead.
Once the algorithm determines where the object lies in the frame of the captured image,
it is crucial to determine which way the person should turn. This is decided by a logic
statement that judges the position of the object with respect to the person, as illustrated
in the sketch below.
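The following minimal sketch shows one way such a decision rule can be expressed, assuming the detector returns the left and right x-coordinates of the obstacle's bounding box; the function and variable names are illustrative, not taken from the project code.

def choose_direction(box_left, box_right, frame_width):
    # Horizontal centre of the detected obstacle
    box_centre = box_left + (box_right - box_left) // 2
    # If the obstacle sits in the right half of the frame, deviate left;
    # otherwise deviate to the right of the obstacle.
    return "left" if box_centre > frame_width // 2 else "right"

print(choose_direction(100, 300, 640))   # obstacle in the left half -> "right"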
3.2.1 COST:
1. Wi-Fi:
Compatible devices can connect to each other over Wi-Fi through
a wireless access point as well as to connected Ethernet devices and
may use it to access the Internet. Such an access point (or hotspot) has
a range of about 20 meters (66 feet) indoors and a greater range
outdoors. Hotspot coverage can be as small as a single room with walls
that block radio waves, or as large as many square kilometres achieved
by using multiple overlapping access points.
The different versions of Wi-Fi are specified by various IEEE 802.11
protocol standards, with the different radio technologies determining
the ranges, radio bands, and speeds that may be achieved. Wi-Fi most
commonly uses the 2.4 gigahertz (12 cm) UHF and 5 gigahertz
(6 cm) SHF ISM radio bands; these bands are subdivided into multiple
channels. Each channel can be time-shared by multiple networks.
These wavelengths work best for line-of-sight use. Many common
materials absorb or reflect them, which further restricts range but can
help minimize interference between different networks in crowded environments.
3.4 PLATFORM REQUIREMENTS
The Raspberry Pi 3 uses a Broadcom BCM2837 SoC with a 1.2 GHz 64-bit quad-core ARM
Cortex-A53 processor, with 512 KB shared L2 cache. Raspberry Pi 3 Model B has 1 GB of
RAM. The Raspberry Pi 3 (wireless) is equipped with 2.4 GHz Wi-Fi 802.11n (150 Mbit/s)
and Bluetooth 4.1 (24 Mbit/s) based on Broadcom BCM43438 FullMAC chip with no official
support for Monitor mode but implemented through unofficial firmware patching and the Pi 3
also has a 10/100 Ethernet port. The Raspberry Pi may be operated with any generic USB
computer keyboard and mouse. It may also be used with USB storage, USB to MIDI
converters, and virtually any other device/component with USB capabilities.
Other peripherals can be attached to the various pins and connectors on the surface of the
Raspberry Pi. Python and Scratch are the main programming languages used, and many
other languages are also supported.
The Camera Module uses a Sony IMX219 8-megapixel sensor (compared to the 5-megapixel
OmniVision OV5647 sensor of the original camera). The camera module can take video and
still photographs, and the libraries bundled with it can be used to create effects.
supports 1080p30, 720p60, and VGA90 video modes, as well as still capture. It attaches via a
15cm ribbon cable to the CSI port on the Raspberry Pi. The camera works with all models of
Raspberry Pi 1, 2, and 3. It can be accessed through the MMAL and V4L APIs, and there are
numerous third-party libraries built for it, including the Pi Camera Python library. The
camera module is very popular in home security applications.
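As a rough illustration, the Pi Camera Python library mentioned above can be used to grab the still images that are later sent to the client; the resolution and file name below are assumptions, not values taken from the project scripts.

# Minimal still-capture sketch using the picamera library (runs on the Raspberry Pi)
from picamera import PiCamera
from time import sleep

camera = PiCamera()
camera.resolution = (1280, 720)     # assumed resolution
camera.start_preview()
sleep(2)                            # give the sensor time to adjust exposure
camera.capture('obstacle.jpg')      # illustrative file name
camera.stop_preview()
camera.close()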
Ultrasonic transducers or ultrasonic sensors are a type of acoustic sensor divided into three
broad categories: transmitters, receivers and transceivers. Transmitters convert electrical
signals into ultrasound, receivers convert ultrasound into electrical signals, and transceivers
can both transmit and receive ultrasound.
In a similar way to radar and sonar, ultrasonic transducers are used in systems which evaluate
targets by interpreting the reflected signals. For example, by measuring the time between
sending a signal and receiving an echo the distance of an object can be calculated. Passive
ultrasonic sensors are basically microphones that detect ultrasonic noise that is present under
certain conditions.
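The begin.py loop shown later in this report calls a distance() helper; a typical implementation for an HC-SR04-style sensor wired to the Raspberry Pi GPIO pins might look like the sketch below (the pin numbers are assumptions, not the project's actual wiring).

# Time-of-flight distance measurement with an HC-SR04 ultrasonic sensor
import time
import RPi.GPIO as GPIO

TRIG, ECHO = 23, 24                 # assumed BCM pin numbers
GPIO.setmode(GPIO.BCM)
GPIO.setup(TRIG, GPIO.OUT)
GPIO.setup(ECHO, GPIO.IN)

def distance():
    # Send a 10 microsecond trigger pulse
    GPIO.output(TRIG, True)
    time.sleep(0.00001)
    GPIO.output(TRIG, False)
    # Measure how long the echo pin stays high
    start = stop = time.time()
    while GPIO.input(ECHO) == 0:
        start = time.time()
    while GPIO.input(ECHO) == 1:
        stop = time.time()
    # Sound travels at roughly 34300 cm/s; halve the round-trip time
    return (stop - start) * 34300 / 2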
Raspbian is a Debian-based computer operating system for the Raspberry Pi. There are
several versions of Raspbian, including Raspbian Stretch and Raspbian Jessie. Since
2015 it has been officially provided by the Raspberry Pi Foundation as the primary
operating system for the family of Raspberry Pi single-board computers. Raspbian was
created by Mike Thompson and Peter Green as an independent project; the initial
build was completed in June 2012. The operating system is still under active development.
Computer: Raspberry Pi 3 Model B as the server, and a DELL (or any other) host computer
with the following requirements:
RAM: 4 GB or more
Storage: 500 GB or more
Software requirements:
1. Python3
2. OpenCV
3.4.2.1 Python3:
Python 3.0 (also called "Python 3000" or "Py3K") was released on December 3,
2008. It was designed to rectify fundamental design flaws in the language—the changes
required could not be implemented while retaining full backwards compatibility with the 2.x
series, which necessitated a new major version number. The guiding principle of Python 3
was: "reduce feature duplication by removing old ways of doing things".
Python 3.0 was developed with the same philosophy as in prior versions. However, as Python
had accumulated new and redundant ways to program the same task, Python 3.0 had an
emphasis on removing duplicative constructs and modules, in keeping with "There should be
one and preferably only one obvious way to do it".
Nonetheless, Python 3.0 remained a multi-paradigm language. Coders still had options
among object-orientation, structured programming, functional programming and other
paradigms, but within such broad choices, the details were intended to be more obvious in
Python 3.0 than they were in Python 2.x.
3.4.2.2 OpenCV:
OpenCV supports the deep learning frameworks TensorFlow, Torch/ PyTorch and Caffe.
Classification determines the category an object belongs to and regression deals with
obtaining a set of numerical input or output examples, thereby discovering functions enabling
the generation of suitable outputs from respective inputs. Mathematical analysis of machine
learning algorithms and their performance is a well-defined branch of theoretical computer
science often referred to as computational learning theory. Machine perception deals with the
capability to use sensory inputs to deduce the different aspects of the world, while computer
vision is the power to analyze visual inputs with a few sub-problems such as facial, object
and gesture recognition. Artificial neural networks (ANNs) or connectionist systems are
computing systems inspired by biological neural networks. An ANN is based on a
collection of connected units or nodes called artificial neurons. Each connection (analogous
to a synapse) between artificial neurons can transmit a signal from one to another. The
artificial neuron that receives the signal can process it and then signal the artificial neurons
connected to it. In common ANN implementations, the signal at a connection between
artificial neurons is a real number, and the output of each artificial neuron is calculated by a
non-linear function of the sum of its inputs. Artificial neurons and connections typically have
a weight that adjusts as learning proceeds; the weight increases or decreases the strength of
the signal at a connection. Artificial neurons also have a threshold, and a signal is sent onward
only if the aggregate signal crosses that threshold. Artificial neurons are generally organized in
layers. Different layers have different functions and perform different kinds of
transformations on their inputs. Signals travel from the first (input) layer to the last (output)
layer, possibly after traversing the layers multiple times.
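Since the client-side script (object_detection_yolo.py) uses OpenCV together with YOLO, the sketch below illustrates how OpenCV's dnn module can load Darknet YOLO weights and run one forward pass; the file names and the confidence threshold are assumptions, not the project's actual configuration.

# Loading a YOLO model into OpenCV's dnn module and running one detection pass
import cv2
import numpy as np

confThreshold = 0.5                                    # assumed confidence cut-off
net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")   # assumed files

frame = cv2.imread("obstacle.jpg")                     # illustrative input image
blob = cv2.dnn.blobFromImage(frame, 1/255.0, (416, 416), swapRB=True, crop=False)
net.setInput(blob)

# Forward pass through the layers with unconnected outputs
layerNames = net.getLayerNames()
outNames = [layerNames[i - 1] for i in net.getUnconnectedOutLayers().flatten()]
outs = net.forward(outNames)

for out in outs:
    for detection in out:
        scores = detection[5:]
        classId = int(np.argmax(scores))
        if scores[classId] > confThreshold:
            print("detected class", classId, "confidence", float(scores[classId]))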
1. SD CARD:
It is the storage section in the Raspberry Pi which contains Raspbian (LINUX)
and we can store scripts in it. Here, we have used 8GB SD card.
2. POWER SUPPLY:
The Raspberry Pi 3 requires a power supply; here, we supply 5 V to the board.
3. RASPBERRY PI 3:
It runs LINUX, on which we run the scripts, including the server socket. We connect the
ultrasonic sensor, Pi Camera and earplug to this module. Once the ultrasonic sensor detects an
object in the path, the camera captures images of the objects encountered as obstacles in the
person's path and stores them on the device.
4. PI CAMERA:
The camera is used to capture images of the objects in the path of the blind person, so that
the type of object can be identified and classified and the person can be guided to understand
the objects around them.
5. ULTRASONIC SENSOR:
The ultrasonic sensor is used to detect any object that lies within the set threshold and records
the distance to the object, which is used to decide how far away the object is.
6. EAR PLUG/SPEAKER:
It is used to give voice output to the blind person, describing the type of object and the
direction of deviation needed to overcome the obstacle in front.
4.2 FLOWCHART
A flow chart of a system is a step-by-step illustration of the process that takes place in the
system. It can be considered a guide book that explains the system in an illustrative manner.
The flow chart presented above explains the step-by-step process that happens in the blind
navigation system, which is as follows:
Step 1. Start the system.
Step 2. Voice output that is, Device ready.
Step 3. If the ultrasonic sensor detects an object, then proceed; otherwise loop at Step 3.
Step 4. The Pi camera captures images, and the distance of the object from the blind person is
also recorded.
Step 5. The captured image is sent to the YOLO algorithm for identification and classification
of the objects in the path.
Step 6. Voice output, that is, "<object> ahead and go <direction>".
The sequence diagram describes how the Raspberry Pi model is connected to the ultrasonic
sensor, Pi camera and earplug. When the device is ON, it indicates to the user that the device is
ready. Once the ultrasonic sensor detects an object within its range, the Pi camera captures the
images and stores them. The server then waits for a connection with the client. When the client
socket is initialized, the connection is established and the client receives all the captured images
and distance data from the server. On the client, all the images are processed for identification
and classification of objects, producing a set of decision parameters. Based on these parameters,
a decision is made about the object and the direction, which is given to the blind person as
audio output.
IMPLEMENTATION
5.1 METHODOLOGY
1. Turning on Device:
The most basic step involved in the project is to ensure that the server end of the entire
system (the Raspberry Pi) is active. The user carrying the device must simply connect it to a
power source such as a power bank or battery pack. Once the device loads and boots up, it
begins to execute the set of scheduled instructions defined in the crontab interface of the
Raspberry Pi. An audio clip saying "Device online and ready" confirms that the server end of
the system (Raspberry Pi) is active.
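As a rough example of such a scheduled start-up entry, the crontab line might look like the one below; the script path is an assumption, not the project's actual location.

# Run the start-up script each time the Raspberry Pi boots (illustrative path)
@reboot python3 /home/pi/blind-nav/begin.py &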
Once the device is online, the crontab automatically executes a Python script associated with
the ultrasonic sensor module. This code determines how far the object is from the perspective
of the user. A threshold is set such that any object within the coded minimum distance triggers
a stop signal for the user; that is when the user must come to a halt. A Python script responsible
for capturing images is then triggered.
When the user is at a halt, the camera module of the Raspberry Pi automatically begins to
capture images of the object ahead and stores these images in a folder. The images are
associated with the distance determined by the ultrasonic sensor. Once the images are
successfully saved and stored, the server socket on the Raspberry Pi begins to wait for a
connection from the client.
The server socket now waits for a connection to be established with the client system through
the client socket. Both sockets must be on the same network, which helps ensure the security
and integrity of the data transfer. The stored images are then transferred to the client's system
over a network port through this connection.
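The transfer described above can be realised with a plain TCP socket on the Raspberry Pi, roughly as sketched below; the port number and file name are assumptions rather than the exact values used in serverd.py.

# serverd-style sender: wait for the client, then stream the stored image
import socket

PORT = 5005                           # illustrative port, must match the client
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
    server.bind(("", PORT))
    server.listen(1)
    conn, addr = server.accept()      # block until the client connects
    with conn:
        with open("obstacle.jpg", "rb") as img:        # illustrative file name
            conn.sendall(img.read())  # push the raw image bytes to the client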
In order for the client to establish a connection with the server device, it must know the IP
address of the server device. This is determined during start-up, when the server device creates
an audio clip announcing its IP address. Once the connection is made, the client is ready to
receive the data (images) sent from the server device.
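A minimal sketch of the receiving side, assuming the server simply streams the image bytes after accepting the connection, is shown below; the IP address, port and output file name are illustrative, not values taken from clientd.py.

# clientd-style receiver: connect to the Raspberry Pi and save the incoming image
import socket

SERVER_IP = "192.168.1.10"            # assumed address announced by the Pi at start-up
PORT = 5005                           # must match the server sketch above

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    sock.connect((SERVER_IP, PORT))
    with open("obstacle_received.jpg", "wb") as img:
        while True:
            chunk = sock.recv(4096)
            if not chunk:             # server closed the connection: transfer done
                break
            img.write(chunk)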
Once the client receives the images of the object ahead from the server device, the process of
image classification and object detection takes place. The client device runs the YOLO
algorithm, which is capable of recognizing several common day-to-day objects. The
computation is expensive in terms of processing, and hence a system with high-end processing
capacity is recommended. The output determines which object is present and where the object
lies in the frame. Based on the position of the object in the frame, a decision is made to move to
either the left or the right of the detected object.
The module design of a system is the representation of the modules used in the system and how
they interact with each other. It is very important to express the system in terms of modules, as
this makes it easier to interact with and understand the system. The idea behind designing the
modules of a system is to divide the functionality of the system into different parts and map
them to obtain the desired behaviour of the system.
Our project is divided into four modules, namely M1, M2, M3 and M4. All four modules define
the functionality and how it affects the overall system.
M1 module has interaction between Raspberry Pi and Ultrasonic sensor, Pi Camera where
sensor gives distance data and camera gives captured images to the Raspberry Pi.
M2 module deals with sending data to the client which is stored in Raspberry Pi.
M3 module deals with processing of images for identification and classification and sends
decision parameters to Raspberry Pi.
M4 module processes the decision parameters and outputs audio to provide direction.
Files on the server (Raspberry Pi):
1. begin.py
2. ultra.py
3. capture.py
4. serverd.py
5. decision.py
Files on the client (host computer):
1. clientd.py
2. object_detection_yolo.py
Purpose:
These files are linked so that the scripts execute easily in sequence: the system detects an
object, captures images of it, processes the images to identify and classify the type of object in
the path, and gives guidance to the visually impaired person.
LOC
# begin.py -- main loop on the Raspberry Pi (server side)
import os
import time
# distance() is assumed to be provided by the ultrasonic helper script (ultra.py)
from ultra import distance

if __name__ == '__main__':
    try:
        while True:
            dist = distance()
            if dist < 50:
                print("close")
                # announce the stop, then capture images and serve them to the client
                os.system("mpg321 audio/nav/Stop.mp3")
                os.system("python3 capture.py")
                os.system("python3 serverd.py")
                # os.system("pkill -9 python3")
                os.system("python3 killp.py")
                os.system("python3 killp.py")
                # time.sleep(2)
            print("Measured Distance = %.1f cm" % dist)
            time.sleep(1)
    except KeyboardInterrupt:
        pass
def getOutputsNames(net):
    # Get the names of the output layers, i.e. the layers with unconnected outputs
    layersNames = net.getLayerNames()
    return [layersNames[i - 1] for i in net.getUnconnectedOutLayers().flatten()]
'''
'''
# Decide the deviation direction from the horizontal centre of the detected box
# (left and right are the box edges, imgW is the frame width, totaldata collects results)
val = (right - left) // 2 + left
direction = "right"
print("{} {}\n".format(val, imgW // 2))
if val > imgW // 2:
    direction = "left"
totaldata.append(direction)
def postprocess(frame, outs):
    frameHeight = frame.shape[0]
    frameWidth = frame.shape[1]
    print("\n\n{}w {}h\n\n".format(frameWidth, frameHeight))
    # Scan through all the bounding boxes output from the network and keep only the
    # ones with high confidence scores. Assign the box's class label as the class
    # with the highest score.
    classIds = []
    confidences = []
    boxes = []
    # print(len(outs))
    for out in outs:
        for detection in out:
            scores = detection[5:]
            classId = np.argmax(scores)
            confidence = scores[classId]
            if confidence > confThreshold:
                classIds.append(classId)
                confidences.append(float(confidence))
TC1:
In this first test case, all the setup operations are undertaken and the device is initialized.
The Raspberry Pi announces the network to which it is connected and starts to run on that
network. The Python scripts are then executed, allowing the device, i.e. the Raspberry Pi, to
produce a voice output telling the blind person that the device is ready.
TC2:
In this second test case, the ultrasonic sensor detects an object within its minimum threshold
and produces a voice output telling the blind person to stop.
TC3:
In this third test case, once the person stops, the camera turns ON and captures images of the
object that is within the minimum threshold. All this information, that is, the distance and the
captured images, is stored on the server. The server then waits for a client connection to be
established; if none is available, this process is repeated.
TC4:
In this fourth test case, when the server is connected to the client, the client runs its socket
scripts and retrieves all the images and distance data saved and stored on the server
(Raspberry Pi).
TC5:
In this fifth test case, the captured images are processed by an object detection algorithm,
YOLO (You Only Look Once), which identifies and classifies the object. This determines the
type of object and its location in the frame, which helps decide the direction in which the blind
person should navigate in order to avoid that object. With this decision parameter, the device
outputs a voice message: "<object> ahead and go <direction>".
INPUT:
The input is collected with the help of the ultrasonic sensor and Pi Camera, which are
connected to the Raspberry Pi 3 Model B and driven by the Python scripts. The data is then fed
to the client (host computer) for processing.
OUTPUT:
The output is the voice that guides the blind person, describing what type of object is in his
path and in which direction to navigate, based on the objects in the frame and their distance
from the blind person.
RESULT:
The ultrasonic sensor detects an object within its range, the Pi Camera captures images of the
object, and the images are processed to determine what type of object it is; this is repeated for
every reading of the ultrasonic sensor, guiding the visually impaired person along his path.
RESULTS
This device could be integrated with navigation-related applications found on the internet, such
as booking a cab, home delivery services, etc., and could moreover be made traceable to ensure
that the person is not lost.
[1] http://www.research.ibm.com/cognitive-computing/neurosynaptic-chips.shtml
http://www.aaai.org/Magazine/Watson/watson.php.
[3] http://www.karigirl.com/.
[4]http://www.nytimes.com/2009/04/27/technology/27jeopardy.html?_r=0&adxnnl=1&adxn
nlx=1352725300-PoS7/s6cj4rYR8Fof95/EA.
[5]https://www.aclunc.org/issues/technology/blog/note_to_self_siri_not_just_working_for_m
e,working_full-time_for_apple,_too.shtml.
[6]http://www.chatbots.org/conversational/agent/ask_jenny_lends_a_hand_on_virgin_medias
_website/.
Zabaware Inc. (n.d.). Ultra Hal can hold conversations with you. Available:
[7] http://www.zabaware.com/assistant/index.html.
[8] Dean, Jeff; Monga, Rajat; et al. (9 November 2015). "TensorFlow: Large-scale machine
learning on heterogeneous systems" (PDF). TensorFlow.org. Google Research. Retrieved 10
November 2015.
[9]Getting Started with Raspberry Pi; Matt Richardson and Shawn Wallace; 176 pages; 2013;
ISBN 978- 1449344214
[10] Raspberry Pi User Guide; Eben Upton and Gareth Halfacree; 312 pages; 2014; ISBN
9781118921661.
[11] Hinton, G.; Deng, L.; Yu, D.; Dahl, G.; Mohamed, A.; Jaitly, N.; Senior, A.; Vanhoucke,
V.; Nguyen, P.; Sainath, T.; Kingsbury, B. (2012). "Deep Neural Networks for Acoustic
Modeling in Speech Recognition --- The shared views of four research groups". IEEE Signal
Processing Magazine.29(6):82- 97. doi:10.1109/msp.2012.2205597
[13] )Zoph, Barret; Le, Quoc V. (2016-11-04). "Neural Architecture Search with
Reinforcement Learning". arXiv:1611.01578
[14] R. J. Williams and D. Zipser. Gradient-based learning algorithms for recurrent networks
and their computational complexity. In Back-propagation: Theory, Architectures, and
Applications. Hillsdale, NJ: Erlbaum, 1994
[15] Xilinx. "HDL Synthesis for FPGAs Design Guide". section 3.13: "Encoding State
Machines". Appendix A: "Accelerate FPGA Macros with One-Hot Approach". 1995
[16]Yosinski, Jason; Clune, Jeff; Nguyen, Anh; Fuchs, Thomas; Lipson, Hod (2015-06-22).
"Understanding Neural Networks Through Deep Visualization". arXiv:1506.06579
[17] )Graupe, Daniel (2013). Principles of Artifircial Neural Networks. World Scientific. pp.
1–. ISBN 978- 981-4522-74-8.
[18] Dominik Scherer, Andreas C. Müller, and Sven Behnke: "Evaluation of Pooling
Operatiorns in Convolutional Architectures for Object Recognition," In 20th International
Conference Artificial Neural Networks (ICANN), pp. 92-101, 2010.
https://doi.org/10.1007/978-3-642-15825- 4_10
[19] Lawrence, Steve; C. Lee Giles; Ah Chung Tsoi; Andrew D. Back (1997). "Face
Recognition: A Convolutional Neural Network Approach". Neural Networks, IEEE
Transactions on. 8 (1): 98– 113. CiteSeerX 10.1.1.92.5813
[20]Zeiler, Matthew D.; Fergus, Rob (2013-01-15). "Stochastic Pooling for Regularization of
Deep Convolutional Neural Networks". arXiv:1301.3557
[21] Sutton, R. S., and Barto A. G. Reinforcement Learning: An Introduction. The MIT Press,
Cambridge, MA, 1998
[22] Hope, Tom; Resheff, Yehezkel S.; Lieder, Itay (2017- 08-09). Learning TensorFlow: A
Guide to Building Deep Learning Systems. "O'Reilly Media, Inc.". pp. 64–. ISBN
9781491978481.
[24]Graupe, Daniel (7 July 2016). Deep Learrning Neural Networks: Design and Case
Studies. World Scientific Publishing Co Inc. pp. 57–110. ISBN 978-9r81-314- 647-1
[25]Clevert, Djork-Arné; Unterthiner, Thomas; Hochreiter, Sepp (2015). "Fast and Accurate
Deep Network Learning by Exponential Linear Units (ELUs)". arXiv:1511.07289
[26]YoshuaBengio (2009). Learning Deep Architectures for AI. Now Publishers Inc. pp. 1–3.
ISBN 978-1- 60198-294-0.
[27] Christopher Bishop (1995). Neural Networks for Pattern Recognition, Oxford University
Press. ISBN 0-19-853864-2