Published By:
Retrieval Number: E2260039520/2020©BEIESP Blue Eyes Intelligence Engineering
DOI: 10.35940/ijitee.E2260.039520 1241 & Sciences Publication
Hand Gesture to Speech and Text Conversion Device
The flex sensors change their resistance according to the bending moment of the fingers, and this change is measured.
These values are converted into digital parameters, which are then matched with the values pre-entered in the memory.
The respective voice message is played and the same message is displayed on the LCD screen.
Table 1: Literature Review
The main method used in the system in [8] is image processing. A camera captures images of the subject's hand, and those images are processed using methods such as color splitting and feature extraction. Each recognized image triggers the respective pre-fed sound on the hardware. This is a vision-based method of converting gestures to audio.
The system in [9] is embedded with flex sensors that measure the resistance across the fingers. The Google text-to-speech library is used for text-to-speech conversion, so the device needs an active internet connection. The system given in [10] uses a Mandarin-Tibetan bilingual speech synthesizer, supported by a deep neural network model and an SVM, to identify different facial expressions and hand movements so that the device can add emotion to the output audio. The system in [11] makes use of a low-cost packaging material called Velostat, whose electrical resistance decreases when pressure is applied to it. This technique can be utilized to measure the bending of fingers to identify the gesture. The system in [12] uses an accelerometer to measure the orientation of the wrist and a gyroscope to measure the angular velocity, together with a hidden Markov model, mainly for the regional language.
The inferences derived from the related work are as follows. (i) The number of voices/gestures is limited by the number of sensors, which allows only four messages [13] to be interpreted; if more sensors are used, the system may slow down and the processing time increases gradually. (ii) Another popular technique to capture gestures is image processing [14]. The problem with image processing is that it requires additional overhead to process the image, which makes the system time-consuming and complex. A camera is also necessarily required for image capture, and advanced computational resources are needed to implement an efficient image processing system. Sensitivity depends on precision, and various topologies such as novel algorithm acquisition need to be analyzed. (iii) Another disadvantage from the design perspective is the risk of thermal injuries, and the application cost is high. (iv) A fabric data-collecting glove is needed to get high accuracy. This brings the problem of bending sensor repeatability, requires full hand gestures as well as a thumb rotation sensor, and the sensitivity level of the sensor is low.
Therefore, the challenges for the proposed system are to design an electronic glove that is shockproof and produces minimal thermal effects, so that it does not seriously affect the subject. Portability is a major challenge, as the system should be lightweight. The main goal is to obtain the maximum number of messages with optimal processing time.
III. METHODOLOGY
The system comprises a method to convert hand gestures to text and audio messages. The prototype has modules such as a Raspberry Pi 3 board, bend-sensitive flex sensors, an analog to digital converter and an accelerometer. The architecture diagram is shown in fig. 1. The working of the different modules of the system is discussed in the following subsections.
Fig 1: Architecture Diagram
A. Flex sensor
Flex sensors are bend-sensitive sensors whose electrical resistance changes with bending, as shown in fig. 2. The change in resistance is proportional to the amount of bending. These sensors come as strips ranging from 1 inch to 5 inches long, and their resistance varies from 10 kΩ to 50 kΩ. They are very thin and lightweight, so they are comfortable for the subject.
International Journal of Innovative Technology and Exploring Engineering (IJITEE)
ISSN: 2278-3075, Volume-9 Issue-5, March 2020
This paper uses 4 flex sensors, one on each finger of a hand glove. These sensors detect the bending of the fingers during different hand gestures. As the flex sensors bend, their resistance changes, so each specific gesture produces a specific output according to the voltage divider formula (i) given below:
$V_{out} = \frac{R_1 \cdot V_{in}}{R_1 + R_2}$ (i)
Fig 3: Accelerometer
C. Analog to Digital Converter (ADC)
The ADC used is the 3208. The sensors give analog values as output, but the Raspberry Pi needs digital values as input. Because the Raspberry Pi does not have an internal analog to digital converter, an external one is needed to convert the analog values provided by the sensors into the digital values needed by the Raspberry Pi, as shown in fig. 4.
D. Raspberry PI 3B
D.1. Features:
Raspberry Pi 3 B is the first model of the 3rd generation Raspberry Pi, as shown in fig. 5.
It has a 64-bit, quad-core, 1.2 GHz Broadcom BCM2837 processor.
It has 1 gigabyte of Random Access Memory (RAM) for faster processing and a 40-pin extended GPIO header.
It works on a micro USB power source of up to a maximum of 2.5 A.
It also has a CSI camera port to support a Raspberry Pi camera if needed.
Power can be supplied by simply connecting it to a PC with a USB cable or from a battery.
Fig 5: Raspberry PI 3B
D.2. Working:
A micro SD card with the Raspbian operating system is mounted on the Raspberry Pi. The digital values obtained from the sensors are matched against pre-programmed values in the Python program, and the respective messages are displayed and played through a speaker.
Python libraries such as pyttsx3, RPi.GPIO, time and os are used. The pyttsx3 library converts text to speech, and the voice can be modulated by varying the attribute values in this library. The set of messages can easily be changed by changing the text in the code.
E. Text and speech
A 16x2 LCD display shows the output message from the Raspberry Pi board, and an external set of speakers can be used to listen to the output message.
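A 16x2 display can only show two rows of 16 characters, so an output message has to be wrapped before it is written to the screen. The following is a minimal sketch of that step using only the Python standard library. The function name `format_for_lcd` and the sample messages are hypothetical and do not come from the paper; the call that actually writes the rows to the display depends on the LCD driver in use and is not shown.

```python
import textwrap

LCD_COLS = 16  # characters per row on a 16x2 display
LCD_ROWS = 2   # number of rows on a 16x2 display


def format_for_lcd(message, cols=LCD_COLS, rows=LCD_ROWS):
    """Wrap a message into exactly `rows` strings of `cols` characters.

    Text that does not fit into rows * cols characters is dropped,
    and short rows are padded with spaces so that any text left over
    from a previous message is overwritten.
    """
    lines = textwrap.wrap(message, width=cols)[:rows]
    lines += [""] * (rows - len(lines))  # fill rows that are not needed
    return [line.ljust(cols) for line in lines]


if __name__ == "__main__":
    # Hypothetical message used only for illustration.
    for row in format_for_lcd("Please call the doctor now"):
        print("|" + row + "|")
```

Messages longer than 32 characters are truncated in this sketch; scrolling the text would be an alternative design.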
F. Process Flow
The flow chart of the process is represented in fig. 6. (i) The user makes a gesture with the glove. (ii) The flex sensors and the accelerometer give their respective outputs according to the gesture made. (iii) The analog values are converted to digital values by the analog to digital converter (ADC). (iv) The Raspberry Pi matches those values with the pre-programmed values; if a match is found, the output message is played through the speakers and displayed on the LCD screen. Otherwise, the user has to make the gesture again to obtain the output.
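The four steps of the process flow can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' program: the thresholds 30 and 430 are the values the paper reports for the flex sensors and the accelerometer, while the gesture table, the message texts and the function names are hypothetical. Sensor readings are passed in as plain numbers so that the matching logic can be run without the glove; `speak` uses the pyttsx3 calls `init`, `say` and `runAndWait` and is only needed on the device.

```python
FLEX_THRESHOLD = 30    # flex reading above this counts as a bent finger
ACCEL_THRESHOLD = 430  # accelerometer reading above this counts as vertical

# Hypothetical gesture table: (f1, f2, f3, f4, vertical) -> message.
MESSAGES = {
    (True, False, False, False, False): "I need water",
    (True, False, False, False, True): "I need food",
    (True, True, False, False, False): "Please call the doctor",
}


def classify(flex_values, accel_value):
    """Turn digitized readings into a tuple of five booleans."""
    bent = tuple(value > FLEX_THRESHOLD for value in flex_values)
    # Assumption: a reading of exactly 430 is treated as horizontal.
    vertical = accel_value > ACCEL_THRESHOLD
    return bent + (vertical,)


def match_message(flex_values, accel_value, table=MESSAGES):
    """Return the message for a gesture, or None if nothing matches."""
    return table.get(classify(flex_values, accel_value))


def speak(text):
    """Play a message through the speaker using pyttsx3."""
    import pyttsx3  # imported here so the matching logic runs without it
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()


def run(readings, output=print):
    """Process readings until one matches, then output that message.

    `readings` is an iterable of (flex_values, accel_value) pairs.
    A reading that matches nothing is skipped, which corresponds to
    the user making the gesture again.
    """
    for flex_values, accel_value in readings:
        message = match_message(flex_values, accel_value)
        if message is not None:
            output(message)
            return message
    return None


if __name__ == "__main__":
    samples = [([0, 0, 0, 0], 300), ([45, 50, 0, 0], 100)]
    run(samples)  # on the device, output=speak would play the message
```

Keeping the gesture table as a dictionary means the set of messages can be changed by editing one place in the code.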
Fig 4: Working of ADC
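The relation between an ADC reading and the voltage divider formula (i) can be checked numerically. The sketch below rests on several assumptions that the paper does not state: a 12-bit converter (as in the MCP3208; the paper only says "3208"), a 3.3 V reference and supply, a fixed 10 kΩ resistor as R1, and the flex sensor as R2. All names are hypothetical.

```python
ADC_BITS = 12    # assumption: 12-bit converter
V_REF = 3.3      # assumption: ADC reference voltage in volts
V_IN = 3.3       # assumption: voltage across the divider in volts
R1 = 10_000.0    # assumption: fixed resistor in ohms (flex sensor is R2)


def adc_to_voltage(count, bits=ADC_BITS, v_ref=V_REF):
    """Convert a raw ADC count to the voltage it represents."""
    if not 0 <= count < (1 << bits):
        raise ValueError("count outside the range of the converter")
    return count * v_ref / (1 << bits)


def divider_vout(r1, r2, v_in=V_IN):
    """Formula (i): Vout = R1 * Vin / (R1 + R2)."""
    return r1 * v_in / (r1 + r2)


def r2_from_vout(v_out, r1=R1, v_in=V_IN):
    """Solve formula (i) for R2 given a measured Vout."""
    if not 0 < v_out <= v_in:
        raise ValueError("v_out must be in the range (0, v_in]")
    return r1 * (v_in - v_out) / v_out


if __name__ == "__main__":
    # Flex sensor resistance range reported in the paper: 10 kOhm to 50 kOhm.
    for r2 in (10_000.0, 50_000.0):
        print(r2, round(divider_vout(R1, r2), 3))
```

Under these assumed values the divider output falls from 1.65 V to 0.55 V as the flex sensor goes from 10 kΩ to 50 kΩ, which is the change the ADC digitizes.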
A. System Setup
The following steps are used for setting up the system:
The device works on the Raspbian operating system, so burn the Raspbian image file to a micro SD card, which is then mounted on the Raspberry Pi board.
Install VNC Viewer, which provides remote access to the Raspbian interface.
Install and use the Fing app to find the IP address of the Raspberry Pi.
Enter the obtained IP address in VNC Viewer and set up the password for system protection.
Install Python IDLE for writing the Python code that programs the Raspberry Pi.
Open a new file and write the necessary Python code for the accelerometer, ADC, Raspberry Pi and conversion of gestures.
Power the Pi through a USB cable from an external power source.
B. Process Flow
Four flex sensors are attached to the index, middle, ring and pinky fingers and are named f1, f2, f3 and f4 respectively. The threshold value for the flex sensors is taken as 30: if the digital value of a flex sensor is above 30 the response is positive, otherwise it is negative. The accelerometer works in two positions, horizontal and vertical, which allows two gestures to be converted using the same finger positions. The output of the accelerometer is less than 430 in the horizontal position and greater than 430 in the vertical position, so 430 is used as the threshold value for the accelerometer. There are two sets of hand gestures with 7 gestures each. The details of each gesture are given in
Fig 7: Hand gestures
B.1. Accuracy Factor
Figures 8-10 depict the accuracy vs gesture graphs, where the x-axis denotes the different gestures and the y-axis represents accuracy on a scale of 0 to 100. Accuracy for 7 gestures with the accelerometer in the horizontal position is shown in fig. 8. The accuracy varies from 79 percent to 93 percent. Gesture 7 is the least accurate and gesture 5 is the most accurate. The average accuracy for this set is 86 percent.
Fig 8: Gestures vs Accuracy (Horizontal)
Accuracy for the remaining 7 gestures, with the accelerometer in the vertical position, is shown in fig. 9. The accuracy varies from 71 percent to 89 percent. The accuracy of the vertical gestures is lower than that of the horizontal gestures because the gestures are harder to make while the hand is in the vertical position. The average accuracy of this set of gestures is 82 percent.
Fig 9: Gestures vs Accuracy (Vertical)
The combined accuracy of all 14 gestures, with both the horizontal and vertical modes active, is shown in fig. 10. There is a noticeable decrease in accuracy, as there are more chances of mixing up messages and of making inappropriate gestures. The overall average accuracy of the system is around 80 percent.
V. CONCLUSION
The proposed system can improve the lives of people with speech impairments by helping them communicate effectively with people without such impairments as well as with other people with hearing impairments. The system is effective and gives a fast response for each of the 14 gestures using the 4 flex sensors. The messages can be modified according to the needs of the situation and the subject (disabled person). The message is played as well as displayed on an LCD screen.
Future work includes the use of more flex sensors and a gyroscope to increase the number of messages and their precision. The system can be made portable by using a battery, or a small solar panel can be used as a power source. The inclusion of multiple languages can provide flexibility to the system.
REFERENCES
1. V. Padmanabhan, M. Sornalatha, Hand gesture recognition and voice conversion system for dumb people, International Journal of Scientific & Engineering Research, Volume 5, Issue 5, May-2014, 427, ISSN 2229-5518.
2. Z. Lei, Z. H. Gan, M. Jiang, and K. Dong, "Artificial robot navigation based on gesture and speech recognition," Proceedings 2014 IEEE International Conference on Security, Pattern Analysis, and Cybernetics (SPAC), Wuhan, 2014, pp. 323-327.
3. H. S. K., Rai S, S., Pal, S., Sulthana K, U., & Chakma, S. Development of Device for Gesture to Speech Conversion for the Mute Community. 2018 International Conference on Design Innovations for 3Cs Compute Communicate Control (ICDI3C), IEEE Xplore (2018).
4. S. C. Mukhopadhyay, "Wearable Sensors for Human Activity Monitoring: A Review," IEEE Sensors Journal, vol. 15, no. 3, pp. 1321-1330, March 2015. doi: 10.1109/JSEN.2014.2370945.
5. R. R. Itkarkar and A. V. Nandi, "Hand gesture to speech conversion using MATLAB," 2013 Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT), Tiruchengode, 2013, pp. 1-4. doi: 10.1109/ICCCNT.2013.6726505.
6. Ahmed, Syed Faiz & Ali, Syed & Munawwar, Saqib. (2010). "Electronic Speaking Glove for Speechless Patients, A Tongue to a Dumb". Sustainable Utilization and Development in Engineering and Technology, IEEE Conference on. 56-60. 10.1109.
7. Safayet Ahmed, Rafiqul Islam, Md. Saniat Rahman Zishan, Mohammed Rabiul Hasan, Md. Nahian Islam, "Electronic speaking system for speech impaired people: Speak up", Electrical Engineering and Information Communication Technology (ICEEICT) 2015 International Conference on, pp. 1-4, 2015.
8. S.K. Imam Basha, S. Ramasubba Reddy, "Speaking system to mute people using hand gestures", International Research Journal of Engineering and Technology, Volume: 05 Issue: 09, Sep 2018.
9. Vaibhav Mehra, Aakash Choudhury, Rishu Ranjan Choubey. Gesture To Speech Conversion using Flex sensors, MPU6050 and Python. International Journal of Engineering and Advanced Technology (IJEAT) ISSN: 2249-8958, Volume-8 Issue-6, August 2019.
10. Yang, H., An, X., Pei, D. and Liu, Y., 2014, September. Towards realizing gesture-to-speech conversion with an HMM-based bilingual speech synthesis system. In 2014 International Conference on Orange Technologies (pp. 97-100). IEEE.
11. Preetham, C., Ramakrishnan, G., Kumar, S., Tamse, A. and Krishnapura, N., 2013, April. Hand talk-implementation of a gesture recognizing glove. In 2013 Texas Instruments India Educators' Conference (pp. 328-331). IEEE.
12. Aiswarya, V., Raju, N.N., Joy, S.S.J., Nagarajan, T. and Vijayalakshmi, P., 2018, March. Hidden Markov Model-Based Sign Language to Speech Conversion System in TAMIL. In 2018 Fourth International Conference on Biosignals, Images and Instrumentation (ICBSII) (pp. 206-212). IEEE.
13. P. Vamsi Praveen, K. Satya Prasad, Electronic Voice to Deaf & Dumb People Using Flex Sensor, International Journal of Innovative Research in Computer and Communication Engineering, Vol. 4, Issue 8, August 2016, ISSN(Online): 2320-9801.
AUTHORS PROFILE