DL Assisted Smart Glasses - PMU Univ
Abstract: Computer vision technology has played a significant role in assisting visually challenged people to carry out their day-to-day activities without much dependency on other people. Smart glasses are one such solution, enabling blind or visually challenged people to "read" images. This paper is an attempt in this direction to build novel smart glasses that can extract and recognize text captured from an image and convert it to speech. The system consists of a Raspberry Pi 3 B+ single-board computer that processes the image captured from a webcam mounted on the glasses of the blind person. Text detection is achieved using the OpenCV software and the open-source Optical Character Recognition (OCR) tools Tesseract and Efficient and Accurate Scene Text Detector (EAST), which are based on deep learning techniques. The recognized text is further processed by Google's Text to Speech (gTTS) API to convert it to an audible signal for the user. A second feature of this solution is to provide location-based services to blind people by identifying locations in an academic building using RFID technology. This solution has been extensively tested in a university environment for aiding visually challenged students. The novelty of the implemented solution lies in providing the desired computer vision functionalities of image/text recognition in a form that is economical, small-sized, accurate and built on open-source software tools. This solution can potentially be used for both educational and commercial applications.

Keywords: Image Recognition; Speech processing; Optical Character Recognition; Deep Learning; Raspberry Pi; Python.

I. INTRODUCTION

In our societies, there are many people who suffer from different diseases or handicaps. According to the World Health Organization (WHO), about 8% of the population in the Eastern Mediterranean region has vision difficulties, which include blindness, low vision and other kinds of visual impairment [1]. Such people need to be provided with special facilities so that they can live comfortably. Especially in the field of education, there are special schools and universities for people with special needs [2]. Most blind people and people with vision difficulties were not in a position to complete their studies because special schools for people with special needs are not available everywhere and most of them are private and expensive. So the only alternative was to study at home, acquiring basic knowledge from their parents. This education was not technical enough, and hence they could not compete with other people. There are different levels of needs, and not all levels require special places and special schools. For instance, people with vision difficulties can study with other students if they have an appropriate environment. To address this issue, we can use computer vision technology to make special aids with which visually impaired people can live as comfortably as possible.

It is observed that most blind people are intelligent and can study if they have the chance to attend regular government-administered schools, as these exist almost everywhere. It is a common misconception that people who are blind or have vision difficulties cannot live alone and need the help of other people at all times. In fact, they do not need help all the time; they can be independent most of the time and have the chance to live like other people.

One of the popular solutions in this scenario is to use smart glasses for visually impaired people [3]. These glasses make use of computer vision hardware and software tools (camera, image processing, image classification and speech processing). Such a solution gives visually impaired people a chance to lead a comfortable life with other people and to study in any school or university without needing help from other people every time. It has been observed that the use of smart glasses has increased the percentage of educated people. Most schools, colleges and universities are accepting students with vision difficulties. It is expected that from the next academic year Prince Mohammad bin Fahd University (PMU) will accept blind students for admission [4]. The college would like to start using smart glasses for the first time in this setup and help students improve their education level with minimum assistance from the instructor.

This was the motivation behind the design and development of smart glasses to help blind and visually impaired students with their studies. These glasses are designed to use computer vision technology to capture an image, extract English text and convert it into an audio signal with the aid of speech synthesis. It was also decided to add a feature for translating text/words from English to Arabic, as the majority of the students at PMU are Arabic speaking.

The main objectives of the proposed system can now be summarized as follows: capturing an image, extracting text from the image, identifying the correct text, converting text to speech, translating the text to another language, integrating the
Authorized licensed use limited to: PES University Bengaluru. Downloaded on March 18,2021 at 05:11:01 UTC from IEEE Xplore. Restrictions apply.
Table I: Comparative Summary of Smart Glasses Solutions
Fig. 2: Process Diagram of the Proposed System
group of individual symbols.
(ii) OpenCV Libraries
OpenCV is a library of programming functions for real-time
computer vision. The library is cross-platform and free for use
under the open-source BSD license [15]. To install the OpenCV 4
libraries, we first installed Raspbian Stretch, the recommended
operating system for the Raspberry Pi 3 B+. Win32 Disk Imager
was used to flash the operating system image to the SD card.
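As a minimal sketch of how the installed library might be exercised, the following Python snippet grabs one frame from the webcam and resizes it for a text detector. The function names, the camera index 0 and the 320-pixel default are illustrative assumptions, not values taken from the implemented system; the one fixed constraint is that the EAST model expects input dimensions that are multiples of 32.

```python
def east_input_size(width, height, max_side=320):
    """Return (w, h) scaled so the longer side is at most max_side,
    with both sides rounded down to a multiple of 32 (minimum 32)."""
    scale = min(1.0, max_side / max(width, height))
    new_w = max(32, int(width * scale) // 32 * 32)
    new_h = max(32, int(height * scale) // 32 * 32)
    return new_w, new_h


def capture_resized_frame(camera_index=0):
    """Capture one webcam frame and resize it (hypothetical helper)."""
    import cv2  # imported lazily so the size helper works without OpenCV

    cap = cv2.VideoCapture(camera_index)
    try:
        ok, frame = cap.read()
    finally:
        cap.release()
    if not ok:
        raise RuntimeError("could not read a frame from the camera")
    height, width = frame.shape[:2]
    # cv2.resize takes the target size as (width, height)
    return cv2.resize(frame, east_input_size(width, height))
```

Rounding down rather than up keeps the resized frame no larger than the requested bound, which may matter on a device with limited memory such as the Raspberry Pi.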
(iii) Google Text to Speech (gTTS) API
One of the most important functions of the smart glasses is
text-to-voice conversion. To implement this task, we installed
gTTS (Google Text-to-Speech), a Python library that interfaces
with the Google Translate text-to-speech API [13]. gTTS has many
features: it converts text of unlimited length to voice,
corrects pronunciation errors using customizable text
pre-processors, and supports many languages, which it can
retrieve when needed. We also used gTTS for the translation
feature from English to Arabic (referred to as Button 2; see
Fig. 4).
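The text-to-voice step described above can be sketched as follows. This is an illustrative example, not the code of the implemented system: the helper names `clean_ocr_text` and `speak_to_file` and the output file name are assumptions, while `gTTS(text=..., lang=...)` and `.save(...)` are the library's documented calls.

```python
import re


def clean_ocr_text(raw):
    """Collapse the whitespace and line breaks left by OCR into single spaces."""
    return re.sub(r"\s+", " ", raw).strip()


def speak_to_file(text, lang="en", out_path="speech.mp3"):
    """Synthesize text to an MP3 file with gTTS (requires network access)."""
    from gtts import gTTS  # lazy import: optional third-party dependency

    cleaned = clean_ocr_text(text)
    if not cleaned:
        raise ValueError("no text to speak")
    gTTS(text=cleaned, lang=lang).save(out_path)
    return out_path
```

For example, `speak_to_file("Exit\n door")` would write `speech.mp3` with the phrase "Exit door". The `lang` argument only selects the voice language (for instance `"ar"` for Arabic), so the text passed in is assumed to already be in that language.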
worthwhile to include a multi-lingual feature (e.g.
French or Urdu) in the speech translation module.
● To improve the direction and warning messages to the
user, we can include a GPS-based navigation and alert
system.
● To provide a wider field of view, we can include a
wide-angle camera (e.g. 270° as compared to the 60°
currently used).
● Finally, to provide a more real-time experience, we
can include video processing instead of still images.