
International Journal of Innovative Technology and Exploring Engineering (IJITEE)

ISSN: 2278-3075, Volume-9 Issue-5, March 2020

Hand Gesture to Speech and Text Conversion Device

K. P. Vijayakumar, Ananthu Nair, Nishant Tomar

Revised Manuscript Received on March 04, 2020.
Dr. K. P. Vijayakumar, Assistant Professor, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Kattankulathur, Chennai.
Ananthu Nair, Undergraduate Student, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Kattankulathur, Chennai.
Nishant Tomar, Undergraduate Student, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Kattankulathur, Chennai.

Abstract: Sign language and facial expressions are the major means of communication for speech-impaired people. Most people can understand facial expressions to an extent but cannot understand sign language, so mute people are unable to express their thoughts to them. To reduce this communication gap, this paper presents an electronic system that helps mute people exchange their ideas with others in emergency situations. The system consists of a glove worn by the subject which converts hand gestures to speech and text. The displayed message also helps deaf people understand the subject's thoughts. The prototype uses a Raspberry Pi 3 as the microcontroller along with flex sensors and an accelerometer. The resistance of each flex sensor changes with the bending of the subject's fingers, and the accelerometer measures the angular displacement of the wrist along the y-axis. The microcontroller takes the input from the two sensors, matches it with pre-programmed values, and plays the respective message. The system uses Python and its libraries for the microcontroller programming.

Keywords: Flex sensor, Accelerometer, pyttsx3, Raspberry Pi, Analog to Digital Converter (ADC).

I. INTRODUCTION

A large portion of the global population is unable to speak, either partially or completely. In India, around 2.78 percent of the total population is speech impaired [1], and only a very small fraction of them communicate well through hand gestures. The chance of an ordinary person knowing sign language is very low, so to reduce the communication gap, research in the field of gesture-to-speech (G2S) systems becomes more important. In recent years many researchers have focused on hand gesture detection and developed techniques in the fields of robotics and artificial intelligence [2]. This project takes a similar approach but implements the idea differently, arriving at an important application in the domain of IoT. The device helps a mute person communicate with hearing people as well as with deaf people. Various methods for gesture-to-speech conversion have been implemented by researchers all over the world.

The motivating factors of the paper are: (i) a system that can interpret many messages using a minimum number of sensors, making it less complex to use; (ii) a method for designing a faster sensor-based system; and (iii) a design that is free from thermal injuries and is shock-proof. To fill the communication gap between hearing people, mute people, and deaf people, a system is designed to convert hand gestures into an audio message as well as a text message.

The objective of the proposed system is to interpret many messages using a small number of sensors, so that the system is lightweight and fast. Another main objective is to provide two types of output, text and audio, so that a mute person can communicate even with a deaf person.

The organization of the paper is as follows. Section 2 discusses the techniques and algorithms used in existing systems. The system modules, such as the flex sensors and accelerometer, are discussed in Section 3. The results and discussion are given in Section 4, which includes the parameters that define and test the quality and accuracy of the system. Finally, the conclusion, advantages, and future work are discussed in Section 5.

II. RELATED WORK

The system in [3] mainly consists of an Arduino microcontroller and flex sensors. The flex sensors are used for sensing the gestures, their output is processed by the Arduino, and the microcontroller output is transmitted via a Bluetooth module. A connected Android device, built with the MIT App Inventor, converts the gesture to speech. The work in [4] reviews human-wearable sensors that can measure different movements and activities of the human body; features such as weight and sensitivity levels are reviewed, and wearable sensors provide portability to the system. The system in [5] converts hand gestures to speech using images of the hand gestures of the mute person (subject) captured by a camera. The images are segmented using a skin region detection algorithm, in which the skin region remains white and every other part of the image becomes black based on the R/G ratio. A feature extraction technique is used to classify several types of hand gestures, and the classified values are matched with pre-recorded soundtracks in a database using MATLAB. A hand glove [6] is designed using flex sensors and an Advanced Virtual RISC (AVR) microcontroller, which receives the analog signal sent by the flex sensors attached to the glove fingers. As the subject makes a gesture with the fingers, the change in resistance is sent to the microcontroller, which converts the signal to an 8-bit binary code, according to which the respective message is played through the speaker system. The system in [7] converts hand gestures to speech and also displays the corresponding message. It uses 5 flex sensors, one on each finger of a glove, along with an Arduino Nano and a speaker amplifier.


The flex sensors measure the change in resistance caused by the bending of the fingers. These values are converted into digital parameters, which are then matched with values pre-entered in memory; the respective voice message is played and the same message is displayed on an LCD screen.

The main method used in the system of [8] is image processing. A camera captures the subject's hand images, which are processed using methods such as color splitting and feature extraction, and each recognized image triggers the respective pre-fed sound on the hardware. This is a vision-based method of converting gestures to audio.

The system in [9] is embedded with flex sensors that measure the resistance across the fingers. The Google text-to-speech library is used for text-to-speech conversion, so the device needs an active internet connection to convert text to speech. The system in [10] uses a Mandarin-Tibetan bilingual speech synthesizer supported by a deep neural network model and an SVM to identify different facial expressions and hand movements, enabling the device to add emotion to the output audio. The system in [11] makes use of a low-cost packaging material called Velostat, whose electrical resistance decreases when pressure is applied to it; this can be used to measure the bending of the fingers and identify the gesture. The system in [12] uses an accelerometer to measure the orientation of the wrist and a gyroscope to measure the angular velocity, together with a hidden Markov model, mainly for a regional language.

The inferences drawn from the related work are: (i) the number of voices/gestures is limited by the number of sensors, which in [13] allows only four messages to be interpreted; if more sensors are used, the system may slow down and the processing time increases gradually. (ii) Another popular technique for capturing gestures is image processing [14]. The problem with image processing is the additional overhead of processing the images, which makes the system time-consuming and complex; a camera must be used for image capture, and advanced computational resources are required to implement an efficient image processing system. Sensitivity depends on precision, and various approaches such as novel acquisition algorithms need to be analyzed. (iii) Another disadvantage, from the design perspective, is the risk of thermal injuries and the high application cost. (iv) A fabric data-collecting glove is needed to obtain high accuracy; it suffers from bending-sensor repeatability problems, requires full hand gestures and a thumb rotation sensor, and its sensor sensitivity is low.

Therefore, the challenges for the proposed system are to design an electronic glove that is shockproof and produces minimal thermal effects, so that it does not harm the subject. Portability is another major challenge, as the system should be lightweight. The main goal is to obtain the maximum number of messages with optimal processing time.

Table 1: Literature Review

III. METHODOLOGY

The system converts hand gestures to text and audio messages. The prototype built has modules such as a Raspberry Pi 3 microcontroller, bend-sensitive flex sensors, an analog-to-digital converter, and an accelerometer. The architecture diagram is shown in fig. 1. The working of the different modules of the system is discussed in the following subsections.

Fig 1: Architecture Diagram

A. Flex sensor

Flex sensors are bend-sensitive sensors whose electrical resistance changes with bending, as shown in fig. 2. The change in resistance is proportional to the amount of bending. These sensors come as strips ranging from 1 inch to 5 inches long, with resistance values varying from 10 kΩ to 50 kΩ. They are very thin and lightweight, which makes them comfortable for the subject.


This paper uses 4 flex sensors, one on each finger of a hand glove. These sensors detect the bending of the fingers during different hand gestures. As a flex sensor bends, its resistance changes, and for each specific gesture there is a specific output given by the voltage divider formula (i) below:

    Vout = (R1 * Vin) / (R1 + R2)        (i)

Fig 2: Flex Sensor
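For illustration, the short Python sketch below applies formula (i) to recover the flex sensor resistance from a raw ADC reading. The supply voltage, the fixed resistor value, and the assumption that the flex sensor is R2 in the divider are placeholders for a typical setup and are not taken from the paper.

    VIN = 3.3          # divider supply voltage in volts (assumed)
    R1_FIXED = 22000   # fixed series resistor in ohms (assumed)
    ADC_MAX = 4095     # full-scale count of a 12-bit ADC

    def adc_to_voltage(count):
        """Scale a raw ADC count to the divider output voltage Vout."""
        return (count / ADC_MAX) * VIN

    def flex_resistance(count):
        """Estimate the flex sensor resistance R2 from formula (i):
        Vout = (R1 * Vin) / (R1 + R2)  =>  R2 = R1 * (Vin - Vout) / Vout."""
        vout = adc_to_voltage(count)
        return R1_FIXED * (VIN - vout) / vout

    print(flex_resistance(1800))   # example: a mid-range bend reading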


B. Accelerometer

The accelerometer senses the angular displacement along the three axes x, y, and z, as shown in fig. 3. It is attached to the back of the hand to measure the angular displacement while different gestures are made. The capacitance of the accelerometer varies according to the displacement, and this change is also provided as an analog output.

Fig 3: Accelerometer

C. Analog to Digital Converter (ADC)

The ADC used is the 3208. The sensors give analog values as output, but the Raspberry Pi requires digital values as input. Since the Raspberry Pi has no internal analog-to-digital converter, an external ADC is needed to convert the analog values provided by the sensors into the digital values required by the Raspberry Pi, as shown in fig. 4.

Fig 4: Working of ADC
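The paper does not list the read-out code. The sketch below shows how a 3208-type 12-bit SPI ADC is commonly read from a Raspberry Pi with the spidev library; the SPI bus/device numbers, clock speed, and channel assignments are assumptions, not details taken from the paper.

    import spidev

    spi = spidev.SpiDev()
    spi.open(0, 0)                 # SPI bus 0, chip-select 0 (assumed wiring)
    spi.max_speed_hz = 1000000

    def read_channel(channel):
        """Return the 12-bit value (0-4095) of the given ADC channel (0-7)."""
        # Start bit + single-ended mode + 3-bit channel select, then clock out the data.
        reply = spi.xfer2([0x06 | (channel >> 2), (channel & 0x03) << 6, 0x00])
        return ((reply[1] & 0x0F) << 8) | reply[2]

    # Example: four flex sensors on channels 0-3 and the accelerometer on channel 4 (assumed).
    flex_values = [read_channel(ch) for ch in range(4)]
    accel_value = read_channel(4)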
D. Raspberry Pi 3B

D.1. Features:

• The Raspberry Pi 3 Model B is the first board of the third generation of Raspberry Pi, as shown in fig. 5.
• It has a 64-bit, quad-core, 1.2 GHz Broadcom BCM2837 processor.
• It has 1 GB of Random Access Memory (RAM) for faster processing and a 40-pin extended GPIO header.
• It works from a micro USB power source of up to 2.5 A.
• It also has a CSI camera port to support a Raspberry Pi camera if needed.
• Power can be supplied simply by connecting it to a PC with a USB cable or by using a battery.

Fig 5: Raspberry Pi 3B

D.2. Working:

A micro SD card holding the Raspbian operating system is mounted on the Raspberry Pi. The digital values obtained from the sensors are matched against the pre-programmed values in the Python program, and the respective messages are displayed and played through a speaker. Python libraries such as pyttsx3, RPi.GPIO, time, and os are used. The pyttsx3 library converts text to speech; different voice modulations can be obtained by varying the attribute values in this library, and the set of messages can easily be changed by editing the text in the code.
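The call pattern below follows the standard pyttsx3 API; the rate and volume values are illustrative and are not the ones used in the prototype.

    import pyttsx3   # offline text-to-speech, no internet connection required

    engine = pyttsx3.init()
    engine.setProperty('rate', 140)    # speaking rate in words per minute (illustrative)
    engine.setProperty('volume', 1.0)  # volume from 0.0 to 1.0

    def speak(message):
        """Queue a message and block until it has been spoken."""
        engine.say(message)
        engine.runAndWait()

    speak("I need water")   # example text; the actual message set is defined in the program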
E. Text and speech

A 16x2 display is used to show the output message from the Raspberry Pi board, and an external set of speakers can be used to listen to the output message.

F. Process Flow

The flow chart of the process is simple and is shown in fig. 6. (i) The user makes a gesture with the glove. (ii) The flex sensors and the accelerometer give their respective outputs according to the gesture made. (iii) The analog values are converted to digital values by the analog-to-digital converter (ADC). (iv) The Raspberry Pi matches those values with the pre-programmed values; if a match is found, the output message is played through the speakers and displayed on the LCD screen, otherwise the user has to make the gesture again to obtain an output.

Fig 6: Flow Diagram
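A minimal sketch of this flow is given below. The helper functions read_channel, lookup_gesture, speak, and show_on_lcd stand for the ADC, matching, audio, and display steps; they are hypothetical names, not the authors' code.

    import time

    def run_glove(read_channel, lookup_gesture, speak, show_on_lcd):
        """Loop mirroring the process flow of fig. 6 (hypothetical helper names)."""
        while True:
            flex = [read_channel(ch) for ch in range(4)]   # read the four flex sensors, already digitised
            accel = read_channel(4)                        # read the accelerometer output
            message = lookup_gesture(flex, accel)          # match against pre-programmed values
            if message is not None:
                speak(message)                             # audio output
                show_on_lcd(message)                       # text output on the LCD
            # otherwise no match: the user simply makes the gesture again
            time.sleep(0.2)                                # small pause between readings (assumed)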


IV. RESULT AND DISCUSSION

A. System Setup

The following steps are used to set up the system:

• The device works on the Raspbian operating system, so the Raspbian ISO image is written to a micro SD card, which is then mounted on the Raspberry Pi board.
• Install VNC Viewer, which works as a virtual application to access the Raspbian interface.
• Install and use the Fing app to find the IP address of the Raspberry Pi.
• Enter the obtained IP address in VNC Viewer and set a password for system protection.
• Install Python IDLE for writing the Python code that programs the Raspberry Pi.
• Open a new file and write the necessary Python code for the accelerometer, the ADC, the Raspberry Pi, and the conversion of gestures.
• Power can be supplied to the Pi through a USB cable from an external power source.
B. Process Flow

Four flex sensors are attached to the index, middle, ring, and pinky fingers and are named f1, f2, f3, and f4 respectively. The threshold value for the flex sensors is taken as 30: if the digital value of a flex sensor is above 30 the response is positive, otherwise it is negative. The accelerometer works in two positions, horizontal and vertical, which allows two gestures to be conveyed with the same finger positions. The output of the accelerometer in the horizontal position is less than 430 (< 430) and in the vertical position it is greater than 430 (> 430), so 430 is used as the threshold value for the accelerometer. There are two sets of hand gestures with 7 gestures each. The details of each gesture are given in table 2; according to table 2, a value below 30 (< 30) means no bending of the finger, while a value above 30 (> 30) means the finger is bent during gesture making.
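A sketch of how this threshold logic can be expressed in Python is shown below. Only the thresholds (30 for the flex sensors, 430 for the accelerometer) come from the paper; the finger patterns and message texts in the lookup table are illustrative placeholders, not the authors' 14-message mapping.

    # Threshold-based matching: flex value > 30 means the finger is bent,
    # accelerometer value > 430 means the hand is vertical (both from the paper).
    FLEX_THRESHOLD = 30
    ACCEL_THRESHOLD = 430

    # Placeholder lookup table: (f1, f2, f3, f4 bent?, vertical?) -> message.
    MESSAGES = {
        (True, False, False, False, False): "I need water",
        (True, False, False, False, True):  "I need food",
        (True, True, False, False, False):  "Please call a doctor",
    }

    def lookup_gesture(flex_values, accel_value):
        """Encode the sensor readings as a key and look up the message, if any."""
        fingers = tuple(v > FLEX_THRESHOLD for v in flex_values)   # f1..f4 bent or not
        vertical = accel_value > ACCEL_THRESHOLD                   # accelerometer mode
        return MESSAGES.get(fingers + (vertical,))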
Table 2: Flex values

The different gestures are shown in fig. 7. The figure shows 7 gestures, which are used to interpret 14 messages using the two modes of the accelerometer (horizontal and vertical).

Fig 7: Hand gestures

B.1. Accuracy Factor

Figures 8, 9, and 10 depict accuracy versus gesture, where the x-axis denotes the different gestures and the y-axis represents accuracy on a scale of 0 to 100. The accuracy of the 7 gestures with the accelerometer in the horizontal position is shown in fig. 8. The accuracy varies from 79 percent to 93 percent; gesture 7 is the least accurate and gesture 5 the most accurate. The average accuracy for this set is 86 percent.


Fig 8: Gestures vs Accuracy (Horizontal)

The accuracy of the remaining 7 gestures, with the accelerometer in the vertical position, is shown in fig. 9. The accuracy varies from 71 percent to 89 percent. The accuracy of the vertical gestures is lower than that of the horizontal gestures because of the difficulty of making a gesture while the hand is in the vertical position. The average accuracy of this set of gestures is 82 percent.

Fig 9: Gestures vs Accuracy (Vertical)

The combined accuracy of all 14 gestures, with both the horizontal and vertical modes active, is shown in fig. 10. There is a noticeable decrease in accuracy, as there are more chances of messages being mixed up and of inappropriate gestures. The overall average accuracy of the system is around 80 percent.

Fig 10: Gestures vs Accuracy (Horizontal + Vertical)

V. CONCLUSION

The proposed system can improve the daily life of speech-impaired people by letting them communicate effectively with hearing people as well as with hearing-impaired people. The system is effective and responds quickly for the 14 supported gestures using the 4 flex sensors. The messages can be modified according to the needs of the situation and of the subject (the disabled person). Each message is played as audio and also displayed on an LCD screen.

Future work includes the use of more flex sensors and a gyroscope to increase the number of messages and their precision. The system can be made portable by using a battery, or a small solar panel can be used as a power source. Multilingual support would add further flexibility to the system.

REFERENCES

1. V. Padmanabhan, M. Sornalatha, "Hand gesture recognition and voice conversion system for dumb people," International Journal of Scientific & Engineering Research, Volume 5, Issue 5, May 2014, p. 427, ISSN 2229-5518.
2. Z. Lei, Z. H. Gan, M. Jiang, and K. Dong, "Artificial robot navigation based on gesture and speech recognition," Proceedings 2014 IEEE International Conference on Security, Pattern Analysis, and Cybernetics (SPAC), Wuhan, 2014, pp. 323-327.
3. H. S. K., Rai S, S., Pal, S., Sulthana K, U., and Chakma, S., "Development of Device for Gesture to Speech Conversion for the Mute Community," 2018 International Conference on Design Innovations for 3Cs Compute Communicate Control (ICDI3C), IEEE, 2018.
4. S. C. Mukhopadhyay, "Wearable Sensors for Human Activity Monitoring: A Review," IEEE Sensors Journal, vol. 15, no. 3, pp. 1321-1330, March 2015, doi: 10.1109/JSEN.2014.2370945.
5. R. R. Itkarkar and A. V. Nandi, "Hand gesture to speech conversion using MATLAB," 2013 Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT), Tiruchengode, 2013, pp. 1-4, doi: 10.1109/ICCCNT.2013.6726505.
6. Ahmed, Syed Faiz, Ali, Syed, and Munawwar, Saqib, "Electronic Speaking Glove for Speechless Patients, A Tongue to a Dumb," IEEE Conference on Sustainable Utilization and Development in Engineering and Technology, 2010, pp. 56-60.
7. Safayet Ahmed, Rafiqul Islam, Md. Saniat Rahman Zishan, Mohammed Rabiul Hasan, Md. Nahian Islam, "Electronic speaking system for speech impaired people: Speak up," 2015 International Conference on Electrical Engineering and Information Communication Technology (ICEEICT), pp. 1-4, 2015.
8. S. K. Imam Basha, S. Ramasubba Reddy, "Speaking system to mute people using hand gestures," International Research Journal of Engineering and Technology, Volume 05, Issue 09, Sep 2018.
9. Vaibhav Mehra, Aakash Choudhury, Rishu Ranjan Choubey, "Gesture To Speech Conversion using Flex sensors, MPU6050 and Python," International Journal of Engineering and Advanced Technology (IJEAT), ISSN 2249-8958, Volume 8, Issue 6, August 2019.
10. Yang, H., An, X., Pei, D., and Liu, Y., "Towards realizing gesture-to-speech conversion with an HMM-based bilingual speech synthesis system," 2014 International Conference on Orange Technologies, pp. 97-100, IEEE, September 2014.
11. Preetham, C., Ramakrishnan, G., Kumar, S., Tamse, A., and Krishnapura, N., "Hand talk - implementation of a gesture recognizing glove," 2013 Texas Instruments India Educators' Conference, pp. 328-331, IEEE, April 2013.
12. Aiswarya, V., Raju, N. N., Joy, S. S. J., Nagarajan, T., and Vijayalakshmi, P., "Hidden Markov Model-Based Sign Language to Speech Conversion System in TAMIL," 2018 Fourth International Conference on Biosignals, Images and Instrumentation (ICBSII), pp. 206-212, IEEE, March 2018.
13. P. Vamsi Praveen, K. Satya Prasad, "Electronic Voice to Deaf & Dumb People Using Flex Sensor," International Journal of Innovative Research in Computer and Communication Engineering, Vol. 4, Issue 8, August 2016, ISSN (Online): 2320-9801.


14. K. Manikandan, Ayush Patidar, Pallav Walia, Aneek Barman Roy, "Hand Gesture Detection and Conversion to Speech and Text," International Journal of Pure and Applied Mathematics, Volume 120, No. 6, 2018, pp. 1347-1362, ISSN 1314-3395.

AUTHORS PROFILE

Dr. K. P. Vijayakumar, Assistant Professor, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Kattankulathur, Chennai. His interests include database systems, wireless sensor networks, IoT, and big data analytics.

Ananthu Nair, Undergraduate Student, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Kattankulathur, Chennai. He has a keen interest in database systems and IoT.

Nishant Tomar, Undergraduate Student, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Kattankulathur, Chennai. He has a keen interest in machine learning and data science.

