A TECHNICAL SEMINAR REPORT
ON
VOICE BASED EMAIL SYSTEM FOR BLIND
Submitted in Partial Fulfilment of the Requirements for the Award of the Degree of
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING
BY
N. PAVANI
(15D21A05E8)
Department of Computer Science and Engineering
SRIDEVI WOMEN’S ENGINEERING COLLEGE
(Approved by AICTE and Affiliated to JNTU HYD )
V.N.PALLY, Gandipet, Hyderabad-75
2018-2019
CERTIFICATE
This is to certify that the TECHNICAL SEMINAR report entitled “VOICE
BASED EMAIL SYSTEM FOR BLIND” is being submitted by Ms. NEMALIKONDA
PAVANI (15D21A05E8) in partial fulfilment of the requirements for the award of the degree of
Bachelor of Technology in Computer Science and Engineering, and is a record of bonafide work
carried out by her.
GUIDE
HEAD OF DEPARTMENT
EXTERNAL EXAMINER
ACKNOWLEDGEMENT
First of all, I would like to express my deep gratitude to my internal guide, Mrs. D.
MADHAVI, ASSISTANT PROFESSOR in CSE, and to E. Krishnaveni, ASSISTANT
PROFESSOR in CSE, and DR. TKS. RATISHBABU, PROFESSOR in CSE, for their support
in the completion of my technical project.
I would like to thank all my faculty and friends for their guidance and constant
cooperation, and for extending all possible help to complete the task.
Finally, I am very much indebted to my parents for their moral support and
encouragement to achieve my goals.
N. PAVANI
(15D21A05E8)
TABLE OF CONTENTS
CERTIFICATE
ACKNOWLEDGEMENT
ABSTRACT
1. INTRODUCTION
1.1 LITERATURE SURVEY
1.2 EASE OF SCOPE
1.3 AIM AND OBJECTIVE
1.4 OVERALL DESCRIPTION
1.4.1 PROJECT DESCRIPTION
1.4.2 SOFTWARE INTERFACE
1.4.3 HARDWARE INTERFACE
1.4.4 PROJECT FUNCTION
1.4.5 USER CHARACTERISTICS
1.4.6 CONSTRAINTS
2. ARCHITECTURE DESIGN
3. ADVANTAGES & DISADVANTAGES
3.1 ADVANTAGES
3.2 DISADVANTAGES
4. IMPLEMENTATION
5. CONCLUSION
5.1 FUTURE ENHANCEMENTS
6. REFERENCES
LIST OF FIGURES
INTRODUCTION
The Internet is considered a major storehouse of information in today’s world, and hardly any
work can be done without it. It has even become one of the de facto methods of
communication, and of all the methods available, email is one of the most common,
especially in the business world. However, not everyone can use the Internet,
because accessing it requires knowing what is written on the
screen. If the screen is not visible, it is of little use. This makes the Internet an almost
unusable technology for visually impaired and illiterate people. Even the systems that are
currently available, such as screen readers, text-to-speech (TTS), and automatic speech
recognition (ASR), do not let blind people use the Internet with full efficiency. As nearly
285 million people worldwide are estimated to be visually impaired, it becomes necessary to
make Internet communication facilities usable for them as well. We have therefore come up
with this project, in which we develop a voice based email system that helps visually
impaired people who are new to computer systems to use email facilities in a hassle-free
manner. The users of this system do not need any prior knowledge of keyboard shortcuts or of
where the keys are located. All functions are based on simple mouse click operations, making
the system easy for any type of user. The user also need not worry about remembering which
mouse click operation to perform in order to use a given service, as the system
itself prompts them as to which click provides which operation. The
most common mail services that we use in our day-to-day life cannot be used by visually
challenged people, because they do not provide any facility for the user
to hear the content of the screen.
The basic function of the application is to provide the user with a simple way to perform email
operations on a phone without compromising security. The application is entirely voice-based,
allowing a blind person to send and receive emails on the go. It converts the user's spoken
words into text and performs the corresponding action. It includes voice confirmation, i.e.,
confirming whether the user actually spoke the recognized text, which minimizes
recognition errors.
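The voice confirmation step described above can be sketched roughly as follows. This is only an illustrative sketch: the function name `confirmed_input`, the injected `listen` and `speak` callables, and the accepted confirmation words are assumptions made for the example, not part of the reported system. A real build would pass in its own speech recognizer and speech synthesizer.

```python
from typing import Callable, Optional


def confirmed_input(listen: Callable[[], str],
                    speak: Callable[[str], None],
                    max_attempts: int = 3) -> Optional[str]:
    """Return recognized text only after the user confirms it by voice.

    `listen` stands in for the speech recognizer and `speak` for the
    text-to-speech engine; both are hypothetical hooks for this sketch.
    """
    for _ in range(max_attempts):
        heard = listen().strip()
        if not heard:
            speak("Sorry, I did not catch that. Please repeat.")
            continue
        # Read the recognized text back so the user can verify it.
        speak(f"You said: {heard}. Say yes to confirm or no to retry.")
        answer = listen().strip().lower()
        if answer in {"yes", "confirm"}:
            return heard
    return None  # the caller decides what to do after repeated failures


# Simulated session: the first recognition is wrong, the second is confirmed.
replies = iter(["send male", "no", "send mail", "yes"])
spoken = []
result = confirmed_input(lambda: next(replies), spoken.append)
```

Passing the recognizer and synthesizer in as callables keeps the confirmation logic testable without a microphone.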
Components:
1) Authentication: Since users tend to forget their passwords or simply use weak passwords
that allow an adversary to break into their email accounts, the application makes use of
fingerprints. The Secure Hash Algorithm (SHA) is used to hash the password, and the hash
value is stored in the database instead of the password itself, to enhance security. SQLite is a
software library that implements a self-contained, server-less, zero-configuration, transactional SQL
database engine. The Java Mail Application Programming Interface (API) provides a
platform-independent and protocol-independent framework to build mail and messaging
applications. The Java Mail API provides a set of abstract classes defining objects that
comprise a mail system. It is an optional package (standard extension) for reading,
composing, and sending electronic messages. Simple Mail Transfer Protocol (SMTP) is used
when email is delivered from an email client to an email server or when email is delivered
from one email server to another. Post Office Protocol (POP) allows a client to download an
email from mail server. Internet Message Access Protocol (IMAP) is an Internet standard
protocol used by e-mail clients to retrieve e-mail messages from a mail server over a TCP/IP
connection. IMAP is defined by RFC 3501.
2) Navigation: Here, the user speaks certain keywords, each of which performs a certain
action. The keywords include: Compose, Received Mails, Sent Mails, Go Back.
3) Speech to text (STT): Here whatever the user speaks is converted to text. There will be a
small microphone icon; after clicking it the user speaks, and the speech is converted to
text, which sighted people can see and read.
4) Text to speech (TTS): This is the reverse of STT. It converts the text
of the emails to synthesized speech.
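As a rough illustration of the authentication component, the sketch below stores a password hash in SQLite instead of the password itself. Several details are assumptions made for the example: the report names only "the Secure Hash Algorithm", so SHA-256 is picked here; the per-user random salt, the table name `users`, and the function names are hypothetical additions. The sketch uses Python's standard library rather than the Java stack the report mentions.

```python
import hashlib
import hmac
import os
import sqlite3


def hash_password(password: str, salt: bytes) -> str:
    """Salted SHA-256 digest of the password (SHA-256 is an assumed variant)."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


def register(db: sqlite3.Connection, username: str, password: str) -> None:
    salt = os.urandom(16)  # per-user salt: an addition not stated in the report
    db.execute(
        "INSERT INTO users (username, salt, pw_hash) VALUES (?, ?, ?)",
        (username, salt, hash_password(password, salt)),
    )
    db.commit()


def verify(db: sqlite3.Connection, username: str, password: str) -> bool:
    row = db.execute(
        "SELECT salt, pw_hash FROM users WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return False
    salt, stored = row
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(stored, hash_password(password, salt))


# In-memory database with a hypothetical schema, for demonstration only.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (username TEXT PRIMARY KEY, salt BLOB, pw_hash TEXT)")
register(db, "alice", "open sesame")
```

A plain SHA digest is fast to brute-force, so a production system would more likely use a deliberately slow key-derivation function such as `hashlib.pbkdf2_hmac`; the sketch keeps to a single SHA pass only to mirror the description above.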
For people who can see, emailing is not a big deal, but for people who are not blessed with the
gift of vision it poses a key concern because of its intersection with many vocational
responsibilities. This voice based email system has great application for blind
people because it lets them understand where they are on the screen. For example, whenever the
cursor moves to an icon on the website, say Register, the system speaks “Register Button”.
There are many screen readers available, but with them people have to remember mouse clicks.
This system reduces that problem, as the mouse pointer reads out where it lies. The system
focuses on user friendliness for all types of persons, including sighted users, visually
impaired people, and illiterate people. It lets disabled people feel like any other user. They
can hear the mails recently received in the Inbox, and the IVR technology proves very
effective for them in terms of guidance.
The project aims to develop a voice based email system that helps blind people
access email in a hassle-free manner with the help of a smart watch. The system does not require
the user to use the keyboard; instead it works on speech recognition. In today’s age
much communication takes place through the Internet. To let visually
challenged people benefit from the Internet, we propose a voice
based email system on a smart watch. The smart watch recognizes the speech and
converts it into text, which makes it user friendly for them. It is connected to the Internet via
Bluetooth, a Wi-Fi hotspot, or a stand-alone Internet connection so that the email can
be sent to the receiver. An Arduino smart-watch processor will be used to provide
access to Bluetooth, Wi-Fi, and battery status.
The aim is to provide a user friendly system to all visually impaired people: to help them
move forward in the challenging world of the Internet and to give them a facility to use these
technologies, through which they have a chance to overcome their visual disability.
2. 512 MB RAM.
3. Microphone.
This voice mail system is being developed to help visually impaired people feel like
normal users. Voice interaction escapes the physical limitations of the keypad and helps users
access mails easily. This system can be used by both sighted and visually impaired persons.
The proposed system is a desktop application that allows sending and receiving of mails via
the Internet. It uses artificial intelligence so that blind users can benefit from advanced
technology for their growth and improvement, which also makes the system cost-effective and
easy to maintain.
1.5 Constraints
The information of all the users must be stored in a database that is accessible by the
Administrator. The voice mail facility is available to all users 24 hours a day.
Users can access their accounts from any computer and can send messages or retrieve
previously stored ones.
CHAPTER-2
ARCHITECTURE DESIGN:
The design of this project is divided into three phases as described below:
A. User Interface Design: The user interface is designed in Eclipse using HTML, CSS, and
JavaScript. The website focuses more on how efficiently the Interactive Voice
Response (IVR) prompts can be understood than on the look and feel of the system, as the
system is primarily developed for blind people [3], to whom the look and feel is of less
importance than understanding the prompts.
B. Database Design: Our system maintains a database for user validation and for storing the
mails of the user. The database stores user information such as username, password, and
mails. When the user requests any information, it is retrieved from the database.
There are a total of five tables. The relationships between them were assigned after much
consideration. The database design is shown in Fig. 1.
C. System Design: Fig. 2 depicts the complete system design. It is the level-2 data flow
diagram, which gives the complete detailed flow of events in the system. As the diagram
shows, operations are driven by mouse click events, with voice input required at some
points.
FIG NO -2 ARCHITECTURE
METHODOLOGY
The Software Development Life Cycle includes models such as Waterfall Model,
Prototype Model, and Object-oriented Model, etc. for developing the correct software.
The Waterfall Model is the earliest method of structured system development.
CHAPTER-3
ADVANTAGES AND DISADVANTAGES
3.1 ADVANTAGES
Messages can be created in the user’s voice mailbox and then
transferred to another voice mailbox. Voice messaging is a viable alternative
to e-mail and fax systems as a business communication tool, and a voice-messaging
system improves a company's public relations.
Voice-messaging systems include many services, such as voice
messages, voice-mail distribution lists, fax-in and fax-on-demand in the
mailbox, interactive voice response, and voice forms that any user can
access from anywhere in the world.
CHAPTER-4
IMPLEMENTATION
This system is currently being developed by us. The following modules
have already been developed. Their working is as follows:
A. Registration:
This is the first module of the system. Any user who wishes to use the system should first
register to obtain a username and password. This module collects complete information about
the user by prompting the user as to what details need to be entered. The user
speaks the details, which the system then confirms by reading them back letter by letter. If
the information is not correct the user can re-enter it; otherwise the prompt specifies the
operation to perform to confirm.
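The letter-by-letter read-back used during registration could look roughly like the sketch below. The function name `spell_out` and the spoken names chosen for symbols, digits, and capitals are assumptions made for illustration; the report does not specify how individual characters are announced.

```python
# Hypothetical spoken names for characters that are hard to hear in isolation.
SYMBOL_NAMES = {" ": "space", "@": "at", ".": "dot", "_": "underscore", "-": "dash"}


def spell_out(text: str) -> str:
    """Turn recognized text into a character-by-character prompt string."""
    parts = []
    for ch in text:
        if ch in SYMBOL_NAMES:
            parts.append(SYMBOL_NAMES[ch])
        elif ch.isdigit():
            parts.append(f"digit {ch}")
        elif ch.isupper():
            parts.append(f"capital {ch}")
        else:
            parts.append(ch)
    return ", ".join(parts)


# The resulting string would be handed to the text-to-speech engine.
prompt = spell_out("Ann@x.in")
```

Spelling the entry back catches recognition errors in names and addresses, where a misheard word is otherwise hard to notice.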
B. Login: Once registration is done the user can log in to the system. This module asks
the user for the username and password, which are accepted as speech. The speech
is converted to text and the user is asked to validate whether the details were
entered correctly. Once the entry is confirmed, the database is checked for a matching entry.
If the user is authorized, he/she is directed to the home page.
C. Forgot Password: If an authorized user forgets the password and thus is not
able to log in, he/she can select the forgot password module. In this module the user is first
asked to enter the username. The security question for that username, which was
provided at the time of registration, is looked up in the database and spoken
out by the computer. The user must then give the answer that he/she provided
during registration. If the two match, the user is given the option to change the password.
D. Home Page: The user is redirected to this page once login succeeds. From this
page the user can perform the operations he/she wishes. The options
available are: 1. Inbox 2. Compose 3. Sent mail 4. Trash. Prompting announces the mouse
click operation to perform for each service. The double right click
event is specifically reserved for logging out of the system at any time. This is
announced by the prompt right at the beginning, after login.
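The click-driven home page can be pictured as a small dispatch table, sketched below. Only the double right click for logging out comes from the description above; every other click-to-option assignment, along with the names `CLICK_ACTIONS`, `build_prompt`, and `handle_click`, is a hypothetical choice made for the example.

```python
from typing import Dict, Tuple

# From the text: the double right click is reserved for logging out.
LOGOUT = ("right", 2)

# The remaining assignments are hypothetical; the report does not list them.
CLICK_ACTIONS: Dict[Tuple[str, int], str] = {
    ("left", 1): "Inbox",
    ("left", 2): "Compose",
    ("right", 1): "Sent mail",
    ("left", 3): "Trash",
    LOGOUT: "Log out",
}

COUNT_WORDS = {1: "single", 2: "double", 3: "triple"}


def build_prompt() -> str:
    """Sentence the system would speak right after login."""
    parts = [
        f"{COUNT_WORDS[count]} {button} click for {action}"
        for (button, count), action in CLICK_ACTIONS.items()
    ]
    return ". ".join(parts) + "."


def handle_click(button: str, count: int) -> str:
    """Map a click event to an action; unknown clicks repeat the prompt."""
    return CLICK_ACTIONS.get((button, count), "Repeat prompt")
```

Keeping the mapping in one table means the spoken prompt and the event handler cannot drift apart.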
All these functionalities have been implemented. The modules given below are yet to be included
in the system and will be implemented as a part of the proposed system. The complete
walkthrough of this system is given as follows:
E. Compose mail: This is one of the most important options provided by a mail service.
The compose function differs from that of existing mail systems.
Since the system is for visually challenged people and keyboard operations are completely
avoided, composing mail is done using only voice input and mouse operations. No typed
input is required. The user can directly record the message to be sent and
send it. This voice message goes in the form of an attachment. The receiver can play the
recording to get the message. The user does not need to attach the file manually:
a record option is provided in the compose window itself. Once recorded, the system lets
the user hear the recording to confirm it is correct, and if the user confirms, it
is automatically attached to the mail.
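A minimal sketch of a mail that carries the recording as an attachment is shown below, using Python's standard `email`, `wave`, and `smtplib` modules. The addresses, the file name, the WAV format, and the helper names are assumptions for illustration, and a silent clip stands in for a real microphone recording. The `send` function is defined but not called, since it needs a real SMTP server and credentials.

```python
import io
import smtplib
import wave
from email.message import EmailMessage


def make_silent_wav(seconds: int = 1, rate: int = 8000) -> bytes:
    """Stand-in for a microphone recording: mono 16-bit silence."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * rate * seconds)
    return buf.getvalue()


def build_voice_mail(sender: str, recipient: str, subject: str,
                     recording: bytes) -> EmailMessage:
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content("This mail carries a recorded voice message as an attachment.")
    # The confirmed recording is attached automatically, as described above.
    msg.add_attachment(recording, maintype="audio", subtype="wav",
                       filename="voice-message.wav")
    return msg


def send(msg: EmailMessage, host: str, port: int, user: str, password: str) -> None:
    """Deliver over SMTP; host and credentials are placeholders."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(user, password)
        server.send_message(msg)


mail = build_voice_mail("sender@example.com", "receiver@example.com",
                        "Voice message", make_silent_wav())
attachments = list(mail.iter_attachments())
```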
G. Sent mail: This option keeps track of all the mails sent by the user. If the user wants
to access these mails, this option provides them. To access the
sent mails the user performs the actions given by the prompt to navigate between
mails. When the control lands on a particular mail the user is told who the receiver
was and what the subject of the mail is. This helps the user efficiently identify
and retrieve the required mail.
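Navigation through the sent mails, with the receiver and subject announced for each one, might be organized as in the sketch below. The class and method names and the wording of the announcement are assumptions; the strings returned here would be passed to the text-to-speech engine.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SentMail:
    receiver: str
    subject: str


class SentFolder:
    """Keeps a cursor over the sent mails and builds the spoken summary."""

    def __init__(self, mails: List[SentMail]) -> None:
        self.mails = mails
        self.index = 0

    def announce(self) -> str:
        if not self.mails:
            return "Your sent folder is empty."
        mail = self.mails[self.index]
        return (f"Mail {self.index + 1} of {len(self.mails)}, "
                f"sent to {mail.receiver}, subject: {mail.subject}.")

    def next(self) -> str:
        if self.mails:
            # Stop at the last mail instead of wrapping around.
            self.index = min(self.index + 1, len(self.mails) - 1)
        return self.announce()

    def previous(self) -> str:
        self.index = max(self.index - 1, 0)
        return self.announce()


folder = SentFolder([
    SentMail("a@example.com", "Hello"),
    SentMail("b@example.com", "Report"),
])
```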
This project is designed using a set of APIs. SMTP (Simple Mail
Transfer Protocol) is used for the mailing service. Voice typing and dictation
speech interaction models are built using the Windows 7 LVCSR dictation engine.
To control speech accuracy, we turned off the (default) MLLR acoustic adaptation.
Error correction methods are implemented using the Windows 7 APIs and Windows
Presentation Foundation (WPF).
IVR systems can be used for mobile purchases, banking payments and services, retail
orders, utilities, travel information and weather conditions. A common misconception
refers to an automated attendant as an IVR. The terms are distinct and mean different
things to traditional telecommunications professionals—the purpose of an IVR is to
take input, process it, and return a result, whereas that of an automated attendant is to
route calls. The term voice response unit (VRU) is sometimes used as well. DTMF
decoding and speech recognition are used to interpret the caller's response to voice
prompts. DTMF tones are entered via the telephone keypad.
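For context on the DTMF tones mentioned above, each keypad key is signalled as the sum of one low-group and one high-group sine tone. The sketch below maps a key to its standard frequency pair and generates a short tone. The function names, the 50 ms duration, and the 8000 Hz sample rate are illustrative choices, and decoding (for example with the Goertzel algorithm) is not shown.

```python
import math
from typing import List, Tuple

# Standard DTMF frequency groups in Hz.
ROW_FREQS = (697, 770, 852, 941)       # low group, one per keypad row
COL_FREQS = (1209, 1336, 1477, 1633)   # high group, one per keypad column
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key: str) -> Tuple[int, int]:
    """Return the (low, high) frequency pair for a single keypad key."""
    if len(key) == 1:
        for r, row in enumerate(KEYPAD):
            c = row.find(key)
            if c != -1:
                return ROW_FREQS[r], COL_FREQS[c]
    raise ValueError(f"not a DTMF key: {key!r}")


def dtmf_samples(key: str, duration: float = 0.05, rate: int = 8000) -> List[float]:
    """Sum of the two sine tones for `key`, scaled to stay within [-1, 1]."""
    low, high = dtmf_frequencies(key)
    n = round(duration * rate)
    return [
        0.5 * math.sin(2 * math.pi * low * t / rate)
        + 0.5 * math.sin(2 * math.pi * high * t / rate)
        for t in range(n)
    ]
```

An IVR front end does the reverse: it detects which two frequencies are present in the incoming audio and looks up the key.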
Other technologies include using text-to-speech (TTS) to speak complex and dynamic
information, such as e-mails, news reports or weather information. IVR technology is
also being introduced into automobile systems for hands-free operation. TTS is
computer generated synthesized speech that is no longer the robotic voice
traditionally associated with computers. Real voices create the speech in fragments
that are spliced together (concatenated) and smoothed before being played to the
caller.
The process of converting spoken audio into text is called speech-to-text
conversion, more commonly speech recognition. The term speech recognition is sometimes
also used for the broader task of deriving meaning from speech, which is
known as speech understanding. It should not be confused with voice recognition or
speaker recognition, which identify a person from their voice. As the block diagram
shows, speech-to-text converters depend mostly on two models: 1. an acoustic model and
2. a language model. Systems generally also use a pronunciation model. It is important
to understand that there is no such thing as a universal speech recognizer.
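One common way to see how the acoustic model and the language model work together is the standard decision rule of statistical speech recognition. The notation here is introduced only for this illustration: $A$ is the sequence of acoustic observations and $W$ a candidate word sequence.

```latex
\hat{W} \;=\; \arg\max_{W} P(W \mid A)
        \;=\; \arg\max_{W} \frac{P(A \mid W)\, P(W)}{P(A)}
        \;=\; \arg\max_{W} \underbrace{P(A \mid W)}_{\text{acoustic model}} \;
                           \underbrace{P(W)}_{\text{language model}}
```

The denominator $P(A)$ does not depend on $W$ and so drops out of the maximization. The pronunciation model enters inside $P(A \mid W)$, where it links each word to the sound units that the acoustic model scores.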
To get the best transcription quality, these models can be specialized
for the given language and communication channel. Like any other pattern
recognition technology, speech recognition cannot be error free. The accuracy of a
speech transcript depends heavily on the speaker's voice, the style of
speech, and the environmental conditions. Speech recognition is a harder problem
than people commonly assume, even for a human being. Humans are born to
understand speech, not to transcribe it, and only well-formed speech
can be transcribed unambiguously. From the user's point of view, a speech-to-text
system can be categorized by its use.
Speech synthesis is the artificial production of speech. A computer
system used for this purpose is called a speech synthesizer, and may be implemented in
software or hardware products. A text-to-speech (TTS) system converts normal
language text into speech; other systems render symbolic linguistic
representations into speech. Synthesized speech can be created by concatenating pieces of
recorded speech that are stored in a database. Systems differ in the size of the stored
speech units; a system that stores phones or diphones provides the largest output
range, but may lack clarity. For specific usage domains, the storage of entire words or
sentences allows for high-quality output. Alternatively, a synthesizer can incorporate
a model of the vocal tract and other human voice characteristics to create a completely
"synthetic" voice output.
The quality of a speech synthesizer is judged by its similarity to the human voice and
by its ability to be understood clearly. An intelligible text-to-speech program allows
people with visual impairments or reading disabilities to listen to written words
on a computing device. Many computer operating systems have included speech
synthesizers since the early 1990s.
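The concatenative approach described above, in which stored speech units are joined and smoothed, can be illustrated with a toy sketch. The unit database `UNIT_DB`, its keys, the sample values, and the linear crossfade are all assumptions made for the example; real systems store recorded waveforms and use more careful joining.

```python
from typing import Dict, List

# Hypothetical unit database: short "recordings" as lists of samples.
UNIT_DB: Dict[str, List[float]] = {
    "he": [1.0, 1.0, 1.0, 1.0],
    "llo": [0.0, 0.0, 0.0, 0.0],
}


def crossfade_concat(units: List[List[float]], overlap: int) -> List[float]:
    """Join units, blending `overlap` samples at each boundary."""
    out = list(units[0]) if units else []
    for unit in units[1:]:
        n = min(overlap, len(out), len(unit))
        start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight ramps toward the incoming unit
            out[start + i] = (1 - w) * out[start + i] + w * unit[i]
        out.extend(unit[n:])
    return out


def synthesize(unit_names: List[str], overlap: int = 2) -> List[float]:
    """Look up each unit in the database and concatenate with smoothing."""
    return crossfade_concat([UNIT_DB[name] for name in unit_names], overlap)


wave_out = synthesize(["he", "llo"])
```

The blend at each boundary is what removes the audible click that a hard cut between two recordings would produce.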
A text-to-speech system consists of two parts: a front-end and a back-end. The
front-end has two major tasks. First, it converts raw text containing symbols
like numbers and abbreviations into the equivalent written-out words. This process is
commonly known as text normalization or pre-processing. The front-end then assigns
phonetic transcriptions to each word, and divides and marks the text into speech units,
like phrases, clauses, and sentences. The process of assigning phonetic transcriptions
to words is called text-to-phoneme or grapheme-to-phoneme conversion. Phonetic
transcriptions and prosody information together make up the symbolic linguistic
representation that is output by the front-end. The back-end, often referred to as the
synthesizer, then converts the symbolic linguistic representation into sound. In
certain systems, this part includes the computation of the target prosody (pitch
contour, phoneme durations), which is then imposed on the output speech.
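The first front-end task, turning numbers and abbreviations into written-out words, can be sketched as below. The abbreviation table, the function names, and the limited number range are assumptions made for the example; a real front-end covers dates, currency, ordinals, and far more abbreviations.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

# Hypothetical abbreviation table for the example.
ABBREVIATIONS = {"dr.": "doctor", "mr.": "mister", "no.": "number", "etc.": "et cetera"}


def number_to_words(n: int) -> str:
    """Spell out 0-999; larger numbers fall back to digit-by-digit reading."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("" if ones == 0 else "-" + ONES[ones])
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        tail = "" if rest == 0 else " " + number_to_words(rest)
        return ONES[hundreds] + " hundred" + tail
    return " ".join(ONES[int(d)] for d in str(n))


def normalize(text: str) -> str:
    """Replace known abbreviations and digit strings with written-out words."""
    words = []
    for token in text.split():
        low = token.lower()
        if low in ABBREVIATIONS:
            words.append(ABBREVIATIONS[low])
        elif re.fullmatch(r"\d+", token):
            words.append(number_to_words(int(token)))
        else:
            words.append(token)
    return " ".join(words)


spoken_text = normalize("Dr. Rao has 42 new mails")
```

The normalized string is what the second front-end task, grapheme-to-phoneme conversion, would then work on.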
FIG NO – 4.6. TEXT-TO-SPEECH
Text-to-speech (TTS) is a type of speech synthesis application that is used to create a
spoken sound version of the text in a computer document, such as a help file or a Web
page. TTS can enable the reading of computer display information for the visually
challenged person, or may simply be used to augment the reading of a text message.
Current TTS applications include voice-enabled e-mail and spoken prompts in voice
response systems. TTS is often used with voice recognition programs. There are
numerous TTS products available, including ReadPlease 2000, Proverbe Speech Unit,
and NextUp Technology's TextAloud. Lucent, Elan, and AT&T each have products
called “Text-to-Speech”.
A voice based email system helps visually challenged people access email services
efficiently. It has been observed that nearly 60% of the total blind population of the world
lives in India. This system overcomes difficulties faced by visually impaired people as
well as illiterate people. It reduces the drawbacks of existing systems, such as the
overhead of using screen readers and Automatic Speech Recognizers (ASR). The system
guides the user by prompting what needs to be done to obtain the desired results,
which reduces the user’s load of remembering keyboard shortcuts and key locations.
The user only needs to follow the instructions given by the system.
The system we are developing works only on desktops. As the use of mobile phones is
increasing day by day, there is a need to offer this facility as a mobile
application as well. Security features can also be added to the login phase to make the
system more secure.
CHAPTER-6
REFERENCES
1. The WHO website. [Online]. Available:
http://www.who.int/mediacentre/factsheets/fs282/en/
2. The Radicati website. [Online]. Available:
http://www.radicati.com/wp/wp-content/uploads/2014/01/Email-Statistics-Report-2014-2018-Executive-Summary.pdf
3. http://www.match-project.org.uk/resources/tutorial/Speech_Language/Speech_Recognition/Rec_6.html
4. http://webaim.org/articles/visual/blind
5. https://developers.google.com/gmail/api/?hl=en
6. T. Dasgupta and A. Basu. “A speech enabled Indian language text to Braille
transliteration system”. In Information and Communication Technologies and
Development (ICTD), 2009 International Conference on, pages 201. IEEE, 2009.
7. Jagtap Nilesh, Pawan Alai, Chavhan Swapnil and Bendre M.R. “Voice Based System in
Desktop and Mobile Devices for Blind People”. In
International Journal of Emerging Technology and Advanced Engineering (IJETAE),
2014, Pages 404-407 (Volume 4, Issue 2).
8. G. Shoba, G. Anusha, V. Jeevitha, R. Shanmathi. “An Interactive Email for Visually
Impaired”. In International Journal of Advanced Research in Computer and
Communication Engineering (IJARCCE), 2014, Pages 5089-5092 (Volume 3, Issue 1).