
SYNOPSIS REPORT

ON
Real-Time Conversion of American Sign Language to Text
with Emotion using Machine Learning

B. TECH COMPUTER SCIENCE & ENGINEERING.

Submitted by

Harshit Garg 2000910100075


Pratham Dubey 2000910100126
Shaurya Gupta 2100910109012

Project Supervisor:
Dr. Rachna Jain
Department of Computer Science and Engineering
JSS Academy of Technical Education, Noida

October 2023
TABLE OF CONTENTS

S. No.   Topic
1.       INTRODUCTION
2.       MOTIVATION
3.       OBJECTIVE(S)
4.       SCOPE OF THE PROJECT
5.       RELATED WORK
6.       HARDWARE & SOFTWARE REQUIREMENTS
7.       PREDICTED TIMELINE FOR THE PROJECT (GANTT CHART PREPARED USING MS OFFICE TOOL)
8.       CONCLUSION
9.       REFERENCES
INTRODUCTION

American Sign Language (ASL) is a predominant sign language. Since the only
disability Deaf and Mute (hereby referred to as D&M) people have is
communication-related, and since they cannot use spoken languages, the only way
for them to communicate is through sign language. Communication is the process
of exchanging thoughts and messages in various ways, such as speech, signals,
behavior and visuals. D&M people use hand gestures to express their ideas to
other people. Gestures are non-verbally exchanged messages, and they are
understood through vision. This non-verbal communication of D&M people is
called sign language. A sign language is a language that uses gestures instead
of sound to convey meaning, combining hand shapes, the orientation and movement
of the hands, arms or body, facial expressions and lip patterns. Contrary to
popular belief, sign language is not international; it varies from region to
region.
Sign language is a visual language and consists of three major components:
fingerspelling, word-level sign vocabulary, and non-manual features such as
facial expressions and body posture.

Minimizing the communication gap between D&M and non-D&M people becomes a
necessity to ensure effective communication for all. Sign language translation
is one of the fastest-growing lines of research, and it enables the most
natural manner of communication for those with hearing impairments. A hand
gesture recognition system offers an opportunity for deaf people to communicate
with hearing people without the need for an interpreter. The system is built
for the automated conversion of ASL into text and speech.
In our project we primarily focus on producing a model which can recognize
fingerspelling-based hand gestures in order to form complete words by combining
each gesture. The gestures we aim to train are as given in the image below.
In recent years there has been tremendous research on hand gesture recognition.
With the help of a literature survey, we realized that the basic steps in hand
gesture recognition are:

 Data acquisition
 Data pre-processing
 Feature extraction
 Gesture classification

3.1 Data acquisition:

Data about the hand gesture can be acquired in the following ways:

1. Use of sensory devices:

Electromechanical devices (such as instrumented gloves) provide the exact hand
configuration and position. Different glove-based approaches can be used to
extract this information, but they are expensive and not user-friendly.

2. Vision based approach:


In vision-based methods, a computer webcam is the input device for observing the
information of the hands and/or fingers. Vision-based methods require only a
camera, thus realizing natural interaction between humans and computers without
any extra devices, thereby reducing cost. These systems tend to complement
biological vision by describing artificial vision systems that are implemented
in software and/or hardware. The main challenge of vision-based hand detection
is coping with the large variability of the human hand's appearance due to the
huge number of hand movements, different skin-color possibilities, and
variations in the viewpoint, scale, and speed of the camera capturing the
scene. A minimal capture sketch is given below.
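To make the vision-based acquisition concrete, here is a minimal OpenCV sketch
(assuming a default webcam; the region-of-interest coordinates are arbitrary
choices for illustration, not values from this report):

```python
import cv2

# Grab frames from the default webcam and crop a fixed region of interest (ROI)
# in which the signer is expected to place their hand.
cap = cv2.VideoCapture(0)          # 0 selects the default camera
try:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        roi = frame[100:400, 100:400]                       # arbitrary hand ROI
        cv2.rectangle(frame, (100, 100), (400, 400), (0, 255, 0), 2)
        cv2.imshow("frame", frame)
        cv2.imshow("hand ROI", roi)
        if cv2.waitKey(1) & 0xFF == ord('q'):               # press 'q' to quit
            break
finally:
    cap.release()
    cv2.destroyAllWindows()
```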

3.2 Data Pre-Processing and 3.3 Feature Extraction for the vision-based approach:

● In [1] the approach for hand detection combines threshold-based colour detection
with background subtraction. An AdaBoost face detector can be used to
differentiate between faces and hands, since both have a similar skin colour.

● We can also extract the image to be trained by applying a filter called Gaussian
Blur (also known as Gaussian smoothing). The filter can be easily applied using
the Open Source Computer Vision library (OpenCV) and is described in [3]; a
minimal sketch of this pre-processing is given after this list.

● Alternatively, the image to be trained can be captured using instrumented
gloves, as mentioned in [4]. This reduces the computation time needed for
pre-processing and gives more concise and accurate data than applying filters
to frames extracted from video.

● We tried hand segmentation using colour-segmentation techniques, but skin colour
and tone are highly dependent on lighting conditions, so the segmentation
results we obtained were not good. Moreover, we have a large number of symbols
to train for our project, many of which look similar to each other, such as the
gesture for the symbol 'V' and the digit '2'. To produce better accuracy across
this large set of symbols, rather than segmenting the hand out of an arbitrary
background, we keep the background behind the hand a stable single colour so
that we do not need to segment it on the basis of skin colour. This gives us
better results, and the pre-processing sketch after this list reflects that
choice.
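A minimal pre-processing sketch along these lines, assuming a hand ROI captured
against a plain single-colour background (the kernel size and threshold value
are illustrative, in the spirit of the OpenCV tutorial cited in [3]):

```python
import cv2

def preprocess_roi(roi_bgr):
    """Convert a hand ROI to a clean binary image ready for training/prediction."""
    gray = cv2.cvtColor(roi_bgr, cv2.COLOR_BGR2GRAY)
    # Gaussian smoothing suppresses sensor noise before thresholding.
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    # With a stable single-colour background, a simple global threshold separates
    # the hand without any skin-colour segmentation.
    _, binary = cv2.threshold(blurred, 127, 255, cv2.THRESH_BINARY_INV)
    return cv2.resize(binary, (128, 128))
```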

3.4 Gesture Classification:


 In [1] Hidden Markov Models (HMMs) are used for the classification of the
gestures. This model deals with the dynamic aspects of gestures. Gestures are
extracted from a sequence of video images by tracking the skin-colour blobs
corresponding to the hand in a body-face space centred on the face of the
user.

 The goal is to recognize two classes of gestures: deictic and symbolic. The
image is filtered using a fast look-up indexing table. After filtering,
skin-colour pixels are gathered into blobs. Blobs are statistical objects based
on the location (x, y) and the colorimetry (Y, U, V) of the skin-colour pixels,
used to determine homogeneous areas. In [2] a Naïve Bayes Classifier is used,
which is an effective and fast method for static hand gesture recognition. It
is based on classifying the different gestures according to geometric
invariants obtained from image data after segmentation.
 Thus, unlike many other recognition methods, this method is not dependent on
skin colour. The gestures are extracted from each frame of the video, with a
static background. The first step is to segment and label the objects of
interest and to extract geometric invariants from them. The next step is the
classification of gestures using a K-Nearest Neighbours algorithm aided with a
distance-weighting algorithm (KNNDW) to provide suitable data for a locally
weighted Naïve Bayes classifier.

 According to the paper "Human Hand Gesture Recognition Using a Convolution
Neural Network" by Hsien-I Lin, Ming-Hsiang Hsu, and Wei-Kai Chen (Institute of
Automation Technology, National Taipei University of Technology, Taipei,
Taiwan), the authors constructed a skin model to extract the hands from an
image and then applied a binary threshold to the whole image. After obtaining
the thresholded image, they calibrated it about the principal axis in order to
centre the image on that axis. This image is fed to a convolutional neural
network model for training and prediction. They trained their model on 7 hand
gestures and achieved an accuracy of around 95% for those gestures.
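The cited architecture is not reproduced in the report; as a rough illustration,
a small Keras CNN in the same spirit might look like the sketch below (the input
size, filter counts, and number of classes are assumptions, not values taken
from the paper):

```python
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

NUM_CLASSES = 26  # assumption: one class per ASL alphabet letter

# Small CNN for classifying 128x128 single-channel (binary) hand images.
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(NUM_CLASSES, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
# model.fit(x_train, y_train, epochs=10, validation_data=(x_val, y_val))
```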
MOTIVATION

The Sign-Language-to-Text Conversion project represents a transformative
endeavor, poised to revolutionize communication for the deaf and hard of hearing
community. It aspires to seamlessly translate the eloquence of sign language
into written words, transcending the limitations imposed by a lack of effective
communication.
This initiative holds the promise of dismantling the barriers that have isolated
individuals reliant on sign language. It goes beyond mere conversion; it
symbolizes a dedication to inclusivity and accessibility. Through this project, we
endeavor to empower individuals to participate fully in society, ensuring their
unique perspectives are valued and understood.
By developing a robust and accurate conversion system, we aim to grant access
to education and information with unprecedented ease. This means unlocking
opportunities for the community, allowing them to enter the workforce with
confidence, and forging connections on a level playing field. It acknowledges
that sign language is not a mere alternative to speech, but a rich and vibrant
language in its own right, deserving of a place of prominence in the global
linguistic landscape.
The Sign-Language-to-Text Conversion project is a testament to the resilience of
those who communicate through sign language. It invites a community of
researchers, engineers, and advocates to join forces, recognizing that every
innovation and milestone reached is a step towards a world where
communication is a universal right, not a privilege.
In this endeavor, we are driven by the belief that true communication knows no
bounds. Every line of code written, every breakthrough achieved, brings us
closer to a world where sign language is not just understood, but celebrated.
Together, we embark on this journey towards a future where every voice, in
whatever form it takes, is heard and valued. This is our collective mission, and
together, we have the power to make it a reality.
OBJECTIVES
The Proposed Work aims to meet the following objectives:

Since sign language is distinct from written language, a language barrier exists
between D&M people and people who do not sign. D&M people therefore rely on
vision-based communication for interaction.

People who are not familiar with sign language could understand the gestures if
there were a common interface that converts sign language to text. Efforts have
therefore been made to develop a visual interface that enables D&M and non-D&M
people to communicate without needing to know each other's languages.

The goal is to create an intuitive human computer interface (HCI) where the
computer understands the human sign language.

There are numerous sign languages used around the world, including American Sign
Language (ASL), French Sign Language, British Sign Language (BSL), Indian Sign
Language, and others.
SCOPE

The Sign-Language-To-Text-Conversion project is a groundbreaking initiative with
the primary goal of creating a seamless system to translate sign language
gestures into written text. This comprehensive endeavor encompasses the
development of specialized hardware, including advanced cameras and sensors,
designed to accurately capture and analyze the intricate movements of sign
language users. The captured data will then be processed in real time using
state-of-the-art machine learning algorithms, transforming it into coherent and
understandable written text.

Moreover, the project emphasizes accessibility and user-friendliness by aiming to
integrate the system into various devices such as smartphones, tablets, and
computers. Rigorous user testing and feedback mechanisms will be implemented
to ensure that the system is not only accurate but also intuitive and adaptable
across different environments and contexts.

Additionally, the project will be designed with scalability in mind, allowing for
future adaptations to accommodate emerging sign language variants and
advancements in technology. This forward-looking approach ensures that the
Sign-Language-To-Text-Conversion system remains relevant and effective in an
ever-evolving landscape.

Ultimately, this project is poised to revolutionize communication for the deaf and
hard of hearing community, setting a new standard for inclusive and accessible
language technologies that have the potential to benefit individuals worldwide.
LITERATURE SURVEY

1. Principal Component Analysis (PCA)
   Description: A method for compressing a lot of data into a representation that captures the essence of the original data.
   Objective: Static gestures.
   Advantages: Robust to noise in the representation of images.
   Limitations: Sensitive to the transition speed of hand gestures.
   Date: May 2018

2. OTSU Algorithm
   Description: Performs reduction of a grey-level image to a binary image.
   Objective: Dynamic gestures.
   Advantages: Commonly used because of its simple calculations and high stability.
   Limitations: Struggles with small-sized objects and backgrounds with few details.
   Date: May 2018

3. Hidden Markov Model (HMM)
   Description: A statistical Markov model; a simple dynamic Bayesian network combined with a Markov process with unobserved states.
   Objective: Dynamic gestures.
   Advantages: High accuracy rate over several recognition attempts.
   Limitations: Fails as the distance for image extraction varies.
   Date: May 2018

4. Naïve Bayes Classifier with Distance Weighting
   Description: Classification of gestures based on geometric invariants from image data, combined with weighted K-Nearest Neighbours and Naïve Bayes.
   Objective: Human-robot interaction, gesture recognition.
   Advantages: Effective and fast recognition method.
   Limitations: Dependent on a stable background; may require multiple users for robustness.
   Date: Not provided

5. Smoothing Images
   Description: Application of diverse linear filters (blur, Gaussian blur, median blur, bilateral filter) to smooth images using OpenCV functions.
   Objective: Smoothing to reduce noise.
   Advantages: Provides different smoothing options.
   Limitations: The code example focuses on image smoothing and does not cover gesture recognition.
   Date: Not provided

6. Real-time ASL Gesture Recognition
   Description: Real-time recognition of American Sign Language gestures using convolutional neural networks, thresholding, and Gaussian blur.
   Objective: Translation of sign language to text.
   Advantages: High accuracy in recognizing ASL gestures.
   Limitations: Dataset generation challenges; lighting and background constraints.
   Date: April 2021

7. CNN-Based Sign Language Recognition
   Description: Recognition of Italian sign language gestures using convolutional neural networks with feature extraction and classification.
   Objective: Improve communication for the Deaf community.
   Advantages: High recognition accuracy; generalization across users and surroundings.
   Limitations: Specific to Italian sign language; not international.
   Date: Not provided

8. Review of Indian Sign Language Recognition
   Description: A review paper on Indian Sign Language Recognition published in the International Journal of Computer Applications in 2013.
   Objective: To provide a comprehensive overview of research in Indian Sign Language Recognition.
   Advantages: Provides a comprehensive overview of research in the area.
   Limitations: Specific details on methodology, findings, and conclusions are not provided in the citation.
   Date: 2013
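As a small aside on the Otsu algorithm listed above, it reduces a grey-level
image to a binary one by choosing the threshold automatically; a minimal OpenCV
sketch (the file name below is a placeholder):

```python
import cv2

# Otsu's method picks the threshold that best separates foreground and
# background intensities; the fixed value passed here (0) is ignored.
gray = cv2.imread("hand_sample.png", cv2.IMREAD_GRAYSCALE)  # placeholder image path
otsu_value, binary = cv2.threshold(gray, 0, 255,
                                   cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print("Otsu threshold chosen:", otsu_value)
```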
HARDWARE & SOFTWARE REQUIREMENTS
 Python 3.6.6
 TensorFlow 1.11.0
 OpenCV 3.4.3.18
 NumPy 1.15.3
 Matplotlib 3.0.0
 Hunspell 2.0.2
 Keras 2.2.1
 PIL (Pillow) 5.3.0
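Assuming a standard pip-based environment, these pinned versions could be
captured in a requirements file like the one below (package names on PyPI may
differ slightly from the list above, e.g. OpenCV is published as opencv-python
and PIL as Pillow, and the hunspell binding usually also needs the system
Hunspell library installed separately):

```text
# requirements.txt -- versions as listed above
tensorflow==1.11.0
opencv-python==3.4.3.18
numpy==1.15.3
matplotlib==3.0.0
hunspell==2.0.2   # binding name/version may differ by platform
keras==2.2.1
Pillow==5.3.0
```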

TIMELINE CHART
CONCLUSION
In this report, a functional real-time, vision-based American Sign Language
recognition system for D&M people has been developed for the ASL alphabet.

We achieved a final accuracy of 98.0% on our dataset. We improved our prediction
by implementing two layers of algorithms, in which symbols that look similar to
each other are verified and then predicted.
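The report does not detail the two-layer scheme; one plausible reading is a
general classifier followed by a dedicated classifier that re-checks groups of
easily confused symbols. A hedged sketch (the confusable group shown reuses the
'V'/'2' example from Section 3.3 and is purely illustrative):

```python
import numpy as np

# Hypothetical two-layer prediction: a general model first, then a specialised
# model that re-checks symbols known to be confused with one another.
CONFUSABLE = {'V', '2'}   # illustrative group; real groups would come from error analysis

def predict_symbol(image, general_model, general_labels, refine_model, refine_labels):
    probs = general_model.predict(image)[0]
    symbol = general_labels[int(np.argmax(probs))]
    if symbol in CONFUSABLE:
        # Second layer: a classifier trained only on the confusable subset.
        refined = refine_model.predict(image)[0]
        symbol = refine_labels[int(np.argmax(refined))]
    return symbol
```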

This gives us the ability to detect almost all of the symbols, provided that they
are shown properly, there is no noise in the background, and the lighting is
adequate.

Future Scope:
We plan to achieve higher accuracy even in the case of complex backgrounds by
trying out various background subtraction algorithms, as sketched below.
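One family of candidates is OpenCV's built-in background subtractors; a minimal
sketch using MOG2 is shown below (an illustration of the direction only, not an
implemented feature of the project):

```python
import cv2

# MOG2 maintains a per-pixel Gaussian-mixture model of the background and returns
# a foreground mask, which could help isolate the hand from a cluttered scene.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=False)

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    fg_mask = subtractor.apply(frame)      # 255 where foreground/motion is detected
    cv2.imshow("foreground mask", fg_mask)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
```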
We are also considering improving the pre-processing so that gestures can be
predicted in low-light conditions with higher accuracy.
This project can be enhanced by building it as a web or mobile application so
that users can access it conveniently. Also, the existing project only works for
ASL; it can be extended to other native sign languages given the right amount of
data and training. This project implements a fingerspelling translator; however,
sign languages are also used contextually, where each gesture can represent an
object or a verb. Identifying this kind of contextual signing would require a
higher degree of processing and natural language processing (NLP).

REFERENCES
[1] T. Yang and Y. Xu, "Hidden Markov Model for Gesture Recognition," CMU-RI-TR-94-10,
Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, May 1994.

[2] Pujan Ziaie, Thomas Müller, Mary Ellen Foster, and Alois Knoll, "A Naïve Bayes
Classifier with Distance Weighting for Hand-Gesture Recognition," Technical University
of Munich, Dept. of Informatics VI, Robotics and Embedded Systems, Boltzmannstr. 3,
DE-85748 Garching, Germany.

[3] https://docs.opencv.org/2.4/doc/tutorials/imgproc/gausian_median_blur_bilateral_filter/gausian_median_blur_bilateral_filter.html

[4] Mohammed Waleed Kadous, "Machine Recognition of Auslan Signs Using PowerGloves:
Towards Large-Lexicon Recognition of Sign Language."

[5] https://adeshpande3.github.io/A-Beginner%27s-Guide-To-Understanding-Convolutional-Neural-Networks-Part-2/

[6] http://www-i6.informatik.rwth-aachen.de/~dreuw/database.php

[7] Pigou L., Dieleman S., Kindermans P.-J., Schrauwen B. (2015) "Sign Language
Recognition Using Convolutional Neural Networks." In: Agapito L., Bronstein M.,
Rother C. (eds) Computer Vision - ECCV 2014 Workshops. Lecture Notes in Computer
Science, vol 8925. Springer, Cham.

[8] Zaki, M.M., Shaheen, S.I., "Sign language recognition using a combination of
new vision-based features," Pattern Recognition Letters 32(4), 572-577 (2011).

[9] N. Mukai, N. Harada and Y. Chang, "Japanese Fingerspelling Recognition Based on
Classification Tree and Machine Learning," 2017 Nicograph International (NicoInt),
Kyoto, Japan, 2017, pp. 19-24. doi:10.1109/NICOInt.2017.9

[10] Byeongkeun Kang, Subarna Tripathi, Truong Q. Nguyen, "Real-time sign language
fingerspelling recognition using convolutional neural networks from depth map,"
2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR).

[11] Number System Recognition (https://github.com/chasinginfinity/number-sign-recognition)

[12] https://opencv.org/

[13] https://en.wikipedia.org/wiki/TensorFlow

[14] https://en.wikipedia.org/wiki/Convolutional_neural_network

[15] http://hunspell.github.io/
