
AMERICAN SIGN LANGUAGE RECOGNITION

Nandkishore Mishra1, Nitin Saxena2, Shivam Singh3, Varun Garg4

Students, Department of Information Technology, IMS Engineering College, Ghaziabad, Uttar Pradesh, India

[email protected], [email protected], [email protected], [email protected]

ABSTRACT

Sign language is a form of communication that relies on hand signs, gestures, and expressions, and it is commonly used in the deaf and dumb community. However, people from outside the community find it hard, or almost impossible, to communicate with deaf people: they have to depend on an interpreter, who can be both expensive and very difficult to come by. In this project, we have implemented hand segmentation and tracking, feature extraction, and classification methods. In our method, the hand sign is first passed through a filter, and the filtered image is then passed through a classifier which predicts the class of the hand gesture. The project recognizes 24 gestures, corresponding to 24 letters of the alphabet (all except J and Z, which involve motion). We examine various issues such as signer dependence/independence, manual/non-manual signs, glove/device-based capture, vocabulary size, constraints in hand segmentation, and isolated/continuous signing. The purpose of this project is to present the progress made in the field of sign language recognition (SLR), specifically in ASL.

Keywords

Hand segmentation, Convolutional Neural Network, feature extraction, classification, hand gestures, hand tracking,
dataset, human-computer interaction (HCI), American Sign Language.

1. INTRODUCTION
American Sign Language (ASL) substantially facilitates communication for the hard-of-hearing and voiceless community. However, there are only about 300,000-500,000 interpreters, which significantly limits the number of people who can easily communicate with the deaf and voiceless community. The alternative to sign language is written communication, which is difficult and often impractical in an emergency, or even in day-to-day life. To overcome this drawback and to enable real-time conversation, we have built an ASL recognition system that uses a Convolutional Neural Network (CNN) to convert ASL signs into text. Our model performs three tasks concurrently (a code sketch follows the list):

1. Obtaining and filtering images of the user signing (as input).

2. Classifying each image as a letter based on its extracted features.

3. Reconstructing and displaying the most likely word from the classification scores and producing that word as output.
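
To make this flow concrete, here is a minimal Python sketch of the three-stage loop, assuming OpenCV for capture, a trained Keras classifier saved as asl_cnn.h5, and a preprocess() helper for the filtering step. All of these names are illustrative assumptions, not the authors' exact implementation; one possible preprocess() is sketched in Section 3.2.

    import string

    import cv2
    import numpy as np
    from tensorflow.keras.models import load_model

    # The 24 static letters: A-Y excluding J (J and Z involve motion).
    LETTERS = [c for c in string.ascii_uppercase if c not in ("J", "Z")]

    model = load_model("asl_cnn.h5")  # assumed path to the trained classifier
    cap = cv2.VideoCapture(0)
    word = ""

    while True:
        ok, frame = cap.read()                    # task 1: obtain an image
        if not ok:
            break
        hand = preprocess(frame)                  # task 1: filter the hand region
        scores = model.predict(hand[np.newaxis, ...], verbose=0)[0]
        word += LETTERS[int(np.argmax(scores))]   # task 2: classify to a letter
        # Task 3: display the running word (a real system would only commit a
        # letter once the prediction is stable across several frames).
        cv2.putText(frame, word, (10, 40), cv2.FONT_HERSHEY_SIMPLEX,
                    1, (0, 255, 0), 2)
        cv2.imshow("ASL", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    cap.release()
    cv2.destroyAllWindows()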

From a machine vision perspective, this problem is challenging for several reasons:

• environmental concerns (e.g., lighting sensitivity, background, and camera position)

• occlusion (e.g., some or all of the fingers, or the entire hand, can be out of the field of view)

While neural networks have been applied to ASL alphabet recognition in the past with consistently high accuracy, some approaches require a 3-D capture element such as motion-tracking gloves or a Microsoft Kinect, and only a few of them provide real-time output. The constraints imposed by these extra requirements reduce the scalability and feasibility of such solutions. ASL is the primary sign language of the Deaf and Mute (D&M) community, since the only disability D&M people have is in communication: they cannot use spoken languages, so the only way for them to communicate is through sign language. D&M people use their hands to make different gestures to express their ideas to other people. Gestures are nonverbally exchanged messages, and these gestures can only be understood with vision. This nonverbal communication of deaf and dumb people is called sign language. Sign language is a visual language and consists of 3 major components:

Fingerspelling: used to spell out words letter by letter.
Word-level sign vocabulary: used for the majority of communication.
Non-manual features: facial expressions and tongue, mouth, and body position.

In this project we focus on producing a model which can recognize fingerspelling hand gestures and output the letter corresponding to each sign.

1.1 Objective
• The objective is to create an application that detects any specific sign made by the user, so that the symbol can be verified.
• The main objective of this project is to design a system that can assist impaired people in communicating with hearing people.

This project also aims to meet the following objectives:

i. To develop a gesture recognition system that can recognize sign gestures of American Sign Language and translate them into text.

ii. To test the accuracy of the system.

iii. To design a solution that is intuitive and simple. Communication is not difficult for the majority of people; it should be the same for the deaf.

1.2 Features
1. We create a CNN classifier that can recognize sign language gestures signed by the user in real time with high accuracy.

2. It provides a better and more practical way of communication between deaf and hearing people.

3. Easy to implement.

4. Environment- and user-friendly.

5. No new hardware required.

6. Can be used in many places to make communication with deaf and dumb people easier.

2. TECHNOLOGY USED

2.1 Convolutional Neural Network


A Convolutional Neural Network (CNN) is a class of artificial neural network commonly used in object recognition tasks. A CNN is a deep learning algorithm that extracts the spatial and relational features which differentiate one image from another and produces the desired output. A CNN consists of various layers, such as convolutional, pooling, dropout, and fully connected layers, whose weights and biases are adjusted using gradient descent and back-propagation. CNNs are a fast way to classify different kinds of images, provide high accuracy, and learn features on their own rather than relying on hand-crafted preprocessing. The final output layer has one dimension per class, because by the end of the CNN architecture the full image has been reduced to a single vector of class scores.
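
As an illustration, a minimal Keras sketch of such a stack of layers might look like the following; the 64×64 grayscale input size, the layer sizes, and the 24-class output are assumptions for illustration, not the paper's reported architecture.

    from tensorflow.keras import layers, models

    NUM_CLASSES = 24  # assumed: one class per static letter

    # Convolution + pooling layers extract spatial features, dropout
    # regularizes, and the final dense layer emits one score per class.
    model = models.Sequential([
        layers.Input(shape=(64, 64, 1)),             # assumed 64x64 grayscale input
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(NUM_CLASSES, activation="softmax"),  # vector of class scores
    ])
    model.summary()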

2.2 TensorFlow


TensorFlow is a free and open-source software library for dataflow and differentiable programming across a range of tasks. It is a symbolic math library, and it is also used for machine learning applications such as neural networks. It can be used for many types of tasks but mostly focuses on the training of neural networks. TensorFlow provides considerable flexibility and trains models quickly. It can run on single-processor systems, on GPUs, and on mobile devices, and it scales to distributed systems of many machines.

2.3 Keras
Keras is a minimalist, open-source Python library for deep learning that can run on top of TensorFlow. It was developed to make implementing deep learning models as fast and simple as possible for research and development. The primary reason Keras appeals to so many people is its user-friendliness and ease of use: it provides a high-level abstraction over the backend libraries, so we do not have to define each and every function ourselves.
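
As an illustration of that abstraction, compiling and training the model sketched in Section 2.1 takes only two calls; the loss, the gradient-descent optimizer, and the metrics are configured by name, and the actual computation is delegated to TensorFlow. The optimizer, epoch count, and the train_images/train_labels arrays below are illustrative placeholders, not the paper's reported settings.

    # One call wires up the loss, optimizer, and metrics; Keras delegates
    # the gradient-descent/back-propagation computation to TensorFlow.
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])

    # train_images: float array of shape (N, 64, 64, 1), scaled to [0, 1]
    # train_labels: one-hot array of shape (N, NUM_CLASSES)
    history = model.fit(train_images, train_labels,
                        validation_split=0.1, epochs=10, batch_size=32)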

3. METHODOLOGY

3.1 Data Collection


Data for the hand signs were collected using a web camera. Photos of the hand signs were taken in various environments under different lighting conditions, with a slightly different orientation in each photo, so that we could simulate all kinds of real-time conditions. Our database contains about 10,000 images of the standard American Sign Language samples, all in .jpg format. We have about 400-500 images for each letter of the alphabet (A to Y, except J) and about 460 images for the blank character (i.e., for the space between words or sentences). It is important to mention that a bigger database means more accuracy, which increases the recognition percentage. However, when building software we must also take into consideration the quality of the images as well as the overall size of the program.

There are two types of dataset used in this project:

a. The first is a downloaded dataset [1], consisting of more than 10,000 images with 400-500 images of each letter.

Figure 1: Downloaded Dataset

b. The second is our own dataset, consisting of 100-200 images of similar-looking symbols.

Figure 2: Our own Dataset
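
As a sketch of how such an image database might be fed to the model, assuming the .jpg files are arranged in one sub-folder per class (A/, B/, ..., blank/), which is how the Kaggle dataset in [1] is laid out, Keras can stream and lightly augment the images directly from disk. The augmentation ranges below are illustrative, chosen to mimic the varied orientations and lighting described above.

    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    # Rescale pixels to [0, 1]; mild rotation/brightness jitter mimics the
    # slightly different orientations and lighting in the photos.
    datagen = ImageDataGenerator(rescale=1.0 / 255,
                                 rotation_range=10,
                                 brightness_range=(0.8, 1.2),
                                 validation_split=0.1)

    # Assumed layout: dataset/A/*.jpg, dataset/B/*.jpg, ..., dataset/blank/*.jpg
    train_gen = datagen.flow_from_directory(
        "dataset", target_size=(64, 64), color_mode="grayscale",
        class_mode="categorical", batch_size=32, subset="training")
    val_gen = datagen.flow_from_directory(
        "dataset", target_size=(64, 64), color_mode="grayscale",
        class_mode="categorical", batch_size=32, subset="validation")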

3.2 Method
• First, we extract the frames from the image of each gesture.
• Next, noise in the frames (i.e., the background and body parts other than the hand) is removed to extract the relevant features; a sketch of one possible filter follows Figure 3.
• The frames are then given to the CNN model for training on spatial features.
• We store the train- and test-frame predictions, using the previously obtained model to predict each frame.
• A text output is generated, namely the letter predicted for the symbol or sign by this procedure.

Figure 3: Modular Structure
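
The paper does not specify the exact filter, so as one plausible interpretation, the noise-removal step could be implemented as an HSV skin-color threshold plus a morphological clean-up before resizing to the CNN's input shape. The threshold values below are assumptions and would need tuning per lighting condition and skin tone.

    import cv2
    import numpy as np

    def preprocess(frame, size=(64, 64)):
        """Filter a BGR webcam frame down to a normalized hand image."""
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        # Assumed skin-tone range in HSV; needs tuning in practice.
        mask = cv2.inRange(hsv, np.array([0, 30, 60]), np.array([20, 150, 255]))
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
        hand = cv2.bitwise_and(frame, frame, mask=mask)   # drop background pixels
        gray = cv2.cvtColor(hand, cv2.COLOR_BGR2GRAY)
        gray = cv2.resize(gray, size) / 255.0             # match the CNN input
        return gray[..., np.newaxis]                      # shape (64, 64, 1)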

4. RESULTS
In this paper, we train and test the model on 24 different signs or symbols. Our system achieves 95.68% training accuracy with 1.87% loss when the model is trained on almost 10,000 images from the downloaded dataset and almost 4,000 images from our own dataset. From this project it is clear that, to increase the accuracy, we have to increase the number of training images, especially to differentiate between similar-looking symbols such as m & n or s & t. Finally, the model achieves 93.76% test accuracy across both datasets.

Table 1: Accuracy and loss (%)

Attribute               Value (%)
Training Accuracy       95.68
Training Loss            1.87
Validation Accuracy     94.16
Validation Loss          2.63
Test Accuracy           93.76

5. CONCLUSION
The ASL recognition system is feasible can be used by hearing persons and deaf and mute community because this system allows
them to communicate with each other. This project reduces the complications of communication between D&M community and
hearing people. This system can capture hand gestures and navigate the words in text format.CNN can be used as an efficient way to
convert sign language into text format. This model has obtained an accuracy of approximately 94% which shows that CNN can be
successfully used to learn features and classify Sign Language Gestures.

6. FUTURE WORK
Develop the model to process a continuous input stream of sign language and convert it into the corresponding sentences.

Future scope includes, but is not limited to:

• The implementation of our model for other sign languages, such as Indian Sign Language, as it is currently limited to American Sign Language.

• Enhancement of the model to recognize common words and expressions.

• Further training of the neural network to efficiently recognize symbols involving both hands.

• For the further convenience of society, conversion of the interpreted text to speech.

REFERENCES

[1] Dataset: https://www.kaggle.com/grassknoted/aslalphabet/notebooks?sortBy=hotness&group=everyone&pageSize=20&datasetId=23079

[2] Wikipedia, American Sign Language, https://en.wikipedia.org/wiki/American_Sign_Language (accessed Oct 2020).

[3] ASL Recognition (Apr 26, 2020), https://towardsdatascience.com/American-signlanguage-recognition-using-cnn-36910b86d651 (accessed Oct 2020).

[4] Oliver Theobald, Machine Learning for Absolute Beginners: A Plain English Introduction (2nd edition); independently published, Jan 2018.

[5] Beena M.V., Dr. M.N. Agnisarman Namboodiri, "Automatic Sign Language Finger Spelling Using Convolutional Neural Network", International Journal of Pure and Applied Mathematics, p-ISSN: 1311-8080, 2017.

[6] Kumud Alok, Anurag Mehra, Ankita Kaushik, Aman Verma, "Hand Sign Recognition Using Convolutional Neural Network", International Research Journal of Engineering and Technology, Jan 2020.

[7] Shailesh Bachani, Shubham Dixit, Rohin Chadha, Prof. Avinash Bagul, "Sign Language Recognition Using Neural Network", International Research Journal of Engineering and Technology, p-ISSN: 2395-0072, April 2020.

[8] S. Dinesh, S. Sivaprakash, M. Keshav, K. Ramya, "Real-Time American Sign Language Recognition with Faster Regional Convolutional Neural Networks", International Journal of Innovative Research in Science, Engineering and Technology, p-ISSN: 2347-6710, March 2018.

[9] Alhussain Akoum, Nour Al Mawla, "Hand Gesture Recognition Approach for ASL Language Using Hand Extraction Algorithm", Journal of Software Engineering and Applications, pp. 419-430, August 2015.

[10] Byeongkeun Kang, Subarna Tripathi, Truong Q. Nguyen, "Real-Time Sign Language Fingerspelling Recognition Using Convolutional Neural Networks from Depth Map", Journal of Software Engineering and Applications, 14 Oct 2015.
