Sign Language Recognition Using Sensor Gloves (December 2002)
Abstract
This paper examines the possibility of recognizing sign language gestures using sensor gloves. Previously, sensor gloves have been used in games or in applications with custom gestures. This paper explores their use in sign language recognition. This is done by implementing a project called "Talking Hands" and studying the results. The project uses a sensor glove to capture the signs of American Sign Language performed by a user and translates them into sentences of the English language. Artificial neural networks are used to recognize the sensor values coming from the sensor glove. These values are then categorized into 24 alphabets of the English language and two punctuation symbols introduced by the author. So, mute people can write complete sentences using this application.

1. Introduction

A gesture in a sign language is a particular movement of the hands with a specific shape made out of them. Facial expressions also count toward the gesture at the same time. A posture, on the other hand, is a static shape of the hand used to indicate a sign.

A sign language usually provides signs for whole words. It also provides signs of letters for performing words that don't have a corresponding sign in that sign language. So, although sentences can be made using the signs for letters, performing with signs of words is faster. The sign language chosen for this project is American Sign Language.

1.1. American Sign Language
3. Previous work
Previously, sensor gloves have been used in games for creating virtual 3D environments, where players can give input to the game using the gloves. Gloves, along with other sensor devices, have also been used in making games: actions of experts wearing the sensors are captured and translated into the game to give it a realistic look. Sensor gloves have also been used for giving commands to robots, where streams of hand shapes are defined and then recognized to control a robotic hand or vehicle.
Glove-TalkII is a system that translates hand gestures to speech through an adaptive interface. Currently, the best version of Glove-TalkII uses several input devices (including a Cyberglove, a ContactGlove, a Polhemus sensor, and a foot pedal), a parallel formant speech synthesizer and 3 neural networks. One subject was trained to use Glove-TalkII. After 100 hours of practice he is able to speak intelligibly. The subject passed through 8 distinct stages while he learned to speak. His speech is fairly slow (1.5 to 3 times slower than normal speech) and somewhat robotic. Reading novel passages intelligibly usually requires several attempts, especially with polysyllabic words. Intelligible spontaneous speech is possible but difficult.
4. Sign Language Recognition

Our system is aimed at maximum recognition of gestures without any user training. This makes the system usable at public places where there is no room for long training sessions. The speed of gesture capturing and recognition can be adjusted in the application to accommodate both slow and fast performers of ASL.

4.1. Neural Network Model

An artificial neural network with feed-forward and back-propagation algorithms has been used. The feed-forward algorithm is used to calculate the output for a specific input pattern; the back-propagation algorithm is used for learning of the network. Three layers of nodes have been used in the network. The first layer is the input layer, which takes the 7 sensor values from the sensors on the glove, so this layer has 7 nodes. This layer does not do any processing and just passes the values forward.

The next layer is the hidden layer, which takes the values from the input layer and applies weights to them. This layer has 52 nodes and passes its output to the third layer. The third layer is the output layer, which takes its input from the hidden layer and applies weights to it. There are 26 nodes in this layer, each denoting one alphabet of the sign language subset. This layer passes out the final output.

Figure 1: Model of Neural Network used in the project. Input, hidden and output layers contain 7, 54 and 26 neurons (nodes) respectively.
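As a rough illustration of this model, the forward pass can be sketched in NumPy as below. The layer sizes and the sigmoid activation follow the description in this section; the random weight initialization, the absence of bias terms, and all variable names are assumptions made for illustration, not the project's actual implementation.

```python
import numpy as np

# Layer sizes as described in the text: 7 sensor inputs, 52 hidden
# nodes (the figure caption says 54), 26 outputs, one per alphabet.
N_INPUT, N_HIDDEN, N_OUTPUT = 7, 52, 26

rng = np.random.default_rng(0)
# Illustrative random initial weights; the real network would learn
# these via back-propagation.
w_hidden = rng.uniform(-1.0, 1.0, (N_HIDDEN, N_INPUT))
w_output = rng.uniform(-1.0, 1.0, (N_OUTPUT, N_HIDDEN))

def sigmoid(x):
    # Activation applied at both processing layers.
    return 1.0 / (1.0 + np.exp(-x))

def forward(sensor_values):
    """Feed-forward pass: 7 glove readings -> 26 letter scores."""
    x = np.asarray(sensor_values, dtype=float)  # input layer: no processing
    hidden = sigmoid(w_hidden @ x)              # hidden layer: weights + sigmoid
    return sigmoid(w_output @ hidden)           # output layer: weights + sigmoid

scores = forward([0.2, 0.8, 0.5, 0.1, 0.9, 0.4, 0.3])  # dummy reading
```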
A threshold is applied to the final output, and only values above this threshold are considered. If none of the nodes gives an output above the threshold value, no letter is output; likewise, if more than one node gives a value above the threshold, no letter is output. The activation function used is the sigmoid function, applied at both of the processing layers (hidden and output) after the weights have been applied. This function is used in both processing and learning of the network.
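The thresholding rule just described can be made concrete with a short sketch; the threshold value of 0.5 is an assumed placeholder, as the paper does not state the value actually used.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
THRESHOLD = 0.5  # assumed value; the paper does not specify it

def classify(scores):
    """Return the recognized letter, or None when zero or several
    output nodes exceed the threshold."""
    above = [i for i, s in enumerate(scores) if s > THRESHOLD]
    return ALPHABET[above[0]] if len(above) == 1 else None
```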
Sampling is done 4 times a second. The user must hold a sign for 3/4th of a second to get it recognized. This limit can be lowered for faster performers.
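Continuing the sketches above, the timing scheme might look like the following: sampling at 4 Hz and emitting a letter only after it has been observed in 3 consecutive samples, i.e. 3/4th of a second. The read_glove() interface is a hypothetical stand-in for the glove hardware.

```python
import time

SAMPLE_HZ = 4      # sampling is done 4 times a second
HOLD_SAMPLES = 3   # 3 consecutive samples = 3/4th of a second

def recognize_stream(read_glove):
    """read_glove() -> 7 sensor values (assumed hardware interface).
    Uses forward() and classify() from the sketches above."""
    history = []
    while True:
        letter = classify(forward(read_glove()))
        history = (history + [letter])[-HOLD_SAMPLES:]
        # Emit only when the same letter was held across all samples.
        if len(history) == HOLD_SAMPLES and len(set(history)) == 1 and letter:
            yield letter
            history = []  # reset so the letter is not emitted twice
        time.sleep(1.0 / SAMPLE_HZ)
```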
4.2. Results
The accuracy rate of the software was found to be 88%. This figure is lower because training was done on samples from people who did not know sign language and were given a handout from which to read and perform the signs. So, there was a great deal of variation in the samples; some samples even gave completely wrong sensor readings. Testing was also done on the same kind of people.

4.3. Problems

One problem faced in the project was that some of the alphabets involve dynamic gestures, which may not be recognized using this glove, so these were left out of the domain of the project. Also, some gestures require the use of both hands, which would require two sensor gloves.
4.4. Proposed solutions
The problem of dynamic gestures can be resolved by using sensors on the arm as well. Sensors would be required at the elbow and perhaps the shoulder. Hidden Markov Models can then be employed to recognize the sequence of readings given by the moving hands.
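As one way to make this suggestion concrete, a per-gesture Hidden Markov Model could score a sequence of quantized glove readings with the standard forward algorithm, and the best-scoring model would name the gesture. This is only a sketch of the proposed direction; the model parameters, their training, and the quantization of readings are all outside the paper's scope.

```python
import numpy as np

def forward_prob(obs, start, trans, emit):
    """Standard HMM forward algorithm: P(observation sequence | model).
    obs: sequence of discrete symbols; start: (S,), trans: (S, S),
    emit: (S, V) probability arrays. For long sequences a log-space
    implementation would be needed to avoid underflow."""
    alpha = start * emit[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ trans) * emit[:, o]
    return alpha.sum()

def recognize_dynamic(obs, models):
    """Pick the gesture whose HMM best explains the reading sequence.
    models: dict mapping gesture name -> (start, trans, emit),
    trained elsewhere on recorded performances of that gesture."""
    return max(models, key=lambda g: forward_prob(obs, *models[g]))
```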
4.5. Future work

As mentioned above, signs of sign languages are usually performed not only with the hands but also with facial expressions. One big extension to the application would be the use of sensors (or cameras) to capture facial expressions.

Sign languages are also space-dependent. This means that the space (relative to the body) in which a gesture is performed also contributes to sentence formation. Sensors would be needed to detect the relative space in which gestures are performed.

Sign languages, like spoken languages, have certain rules of grammar for forming sentences. These rules must be taken into account while translating a sign language into a spoken language, and the rules of the targeted spoken language must be considered as well. In the end, adding a speech engine to speak the translated text would help enhance ease of use.

Figure 2: Model of an application that can fully translate a sign language into a spoken language. (Block diagram: data glove -> neural networks -> ASL recognition, drawing on an ASL lexicon of words and rules -> machine translation system, drawing on an English lexicon of words and rules -> text -> speech.)

5. Conclusion

This project was meant to be a prototype to check the feasibility of recognizing sign languages using sensor gloves. The completion of this prototype suggests that sensor gloves can be used for partial sign language recognition. More sensors can be employed to recognize full sign language.
6. Application
The product generated as a result can be used at public places like airports, railway stations, and the counters of banks, hotels, etc., where there is communication between different people. In addition to this, a mute person can deliver a lecture using it.