
ANJUMAN POLYTECHNIC, NAGPUR

A PROJECT ON
“SIGN LANGUAGE TO TEXT”
Submitted in partial fulfillment of the term work of the
Final Year, Computer Branch
TEAM MEMBERS
1. MANDHAR PATIL (12)
2. KEDAR PIMPLE (15)
3. ALKESH SONTAKKE (20)
4. ARYAN ALI CHISHTY (42)

GUIDED BY
MR. JAHANGIR ANSARI
SENIOR LECTURER, COMPUTER BRANCH
Academic Year 2023-2024
CERTIFICATE
This is to certify that Mr. ALKESH SONTAKKE, MANDHAR PATIL,
KEDAR PIMPLE, and ARYAN ALI CHISHTY from Anjuman
Polytechnic, Sadar, Nagpur have completed the Project Planning
Report titled “SIGN LANGUAGE TO TEXT” in a group of four
candidates under the guidance of the faculty member
Mr. JAHANGIR ANSARI.

Mr. Jahangir Ansari                              Mr. Anwar Ahsan
(Project Guide)                                  (H.O.D.)

PRINCIPAL
Mr. Anwar Ahsan
ACKNOWLEDGEMENT
Success is the manifestation of perseverance and
motivation. We, the projectees, attribute our success in this
venture to our guide, Mr. Jahangir Ansari, and our Head of
Department, Mr. Anwar Ahsan, whose guidance, enthusiasm,
foresight, and innovation contributed to the completion of this
project. It is a reflection of their thoughts, ideas, and concepts,
and above all, their modest efforts.
We are deeply indebted to our Principal, Mr. Anwar Ahsan,
for the facilities provided, without which our project would not
have turned into reality.
We are also thankful to all the faculty members of our
department, who have helped directly or indirectly in our
endeavors.
Our thanks also go to all those who have shown keen
interest in this work and provided much-needed
encouragement.
ABSTRACT

Sign language detection has emerged as a crucial area
of research with profound implications for facilitating
communication and accessibility for the deaf and hard of
hearing community. This report presents a comprehensive
literature survey covering key advancements in sign language
detection methodologies. Beginning with early seminal works
utilizing hidden Markov models (HMMs) and neural networks
for real-time American Sign Language (ASL) recognition, we
trace the evolution of the field through the integration of
depth-based image sequences from sensors like the Kinect, to the
recent proliferation of deep learning techniques such as
convolutional neural networks (CNNs), recurrent neural
networks (RNNs), and transformer architectures.

We highlight significant contributions in both camera-based
and wearable sensor-based approaches, addressing challenges
in recognizing isolated signs, continuous sign language
sentences, and dynamic hand gestures. Furthermore, we
discuss the incorporation of multimodal data fusion strategies
and the exploration of novel architectures tailored to the
unique temporal and spatial characteristics of sign language.

Through this survey, we provide insights into the current
state-of-the-art methodologies, benchmark datasets, and
performance metrics in sign language detection. We also
identify avenues for future research, including the exploration
of interpretability, robustness to variations in signing styles, and
the development of real-world applications to enhance
communication accessibility and inclusivity for the deaf and
hard of hearing community.
CONTENTS

Chapter 1: Introduction
Chapter 2: Literature Survey
Chapter 3: Scope of the Project
Chapter 4: Methodology
Chapter 5: Detail of Design, Working and Process
Chapter 6: Result and Application
Chapter 7: Conclusion and Future Scope
Chapter 8: Reference and Bibliography
CHAPTER 1: INTRODUCTION

Sign language is the predominant means of communication for the
Deaf and Dumb (hereby referred to as D&M) community. Since the only
disability D&M people have is communication-related, and since they
cannot use spoken languages, the only way for them to communicate is
through sign language. Communication is the process of exchanging
thoughts and messages in various ways, such as speech, signals,
behavior, and visuals. D&M people use hand gestures to convey their
ideas to other people. Gestures are non-verbally exchanged messages,
and these gestures are understood with vision. This non-verbal
communication of D&M people is called sign language. A sign language
is a language which uses gestures instead of sound to convey meaning,
combining hand shapes, orientation and movement of the hands, arms
or body, facial expressions, and lip patterns. Contrary to popular
belief, sign language is not international; it varies from region to
region.
Minimizing the communication gap between D&M and non-D&M
people is therefore necessary to ensure effective communication for
all. Sign language translation is one of the fastest-growing lines of
research, and it enables the most natural manner of communication
for those with hearing impairments. A hand gesture recognition
system gives deaf people an opportunity to talk with hearing people
without the need for an interpreter. The system is built for the
automated conversion of ASL into text and speech.
In our project we primarily focus on producing a model which can
recognize fingerspelling-based hand gestures and combine each
recognized gesture to form a complete word. The gestures we aim to
train are as given in the image below.

Motivation:

For interaction between normal people and D&M people, a language
barrier exists because the structure of sign language is different
from normal text. So, they depend on vision-based communication for
interaction.

If there is a common interface that converts sign language to text,
the gestures can be easily understood by non-D&M people. So, research
has been carried out on vision-based interface systems through which
D&M people can communicate without each side really knowing the
other's language.

The aim is to develop a user-friendly Human Computer Interface (HCI)
where the computer understands human sign language.

There are various sign languages all over the world, namely American
Sign Language (ASL), French Sign Language, British Sign Language
(BSL), Indian Sign Language, and Japanese Sign Language, and work has
been done on other languages all around the world.
CHAPTER 2: LITERATURE SURVEY

Detecting sign language has been a topic of interest in
computer vision and machine learning research, with
applications ranging from assisting the deaf and hard of hearing
to human-computer interaction. Here is a brief literature survey
covering some key works in sign language detection:

1. Real-Time American Sign Language Recognition Based on
Neural Networks: This seminal work by Starner et al. (1998)
presented a real-time American Sign Language (ASL) recognition
system using hidden Markov models (HMMs) and neural
networks. It laid the groundwork for many subsequent studies
in sign language recognition.

2. Hand Gesture Recognition Using Depth-Based Image
Sequences: This paper by Keskin et al. (2012) proposed a
method for hand gesture recognition using depth data obtained
from a Kinect sensor. The system achieved high accuracy in
recognizing various hand gestures, including those used in sign
language.
3. Sign Language Recognition with Microsoft Kinect: Another
significant work by Ball et al. (2012) explored the feasibility of
using the Microsoft Kinect sensor for sign language recognition.
They developed a system capable of recognizing both isolated
signs and continuous sign language sentences.

4. Deep Learning-Based Sign Language Recognition: With the
rise of deep learning, several studies have applied convolutional
neural networks (CNNs) and recurrent neural networks (RNNs)
to sign language recognition. For example, Liwicki et al. (2014)
employed deep CNNs to recognize fingerspelling gestures, while
Puertas et al. (2019) utilized long short-term memory (LSTM)
networks for continuous sign language recognition.

5. 3D Convolutional Neural Networks for Dynamic Hand Gesture
Recognition: Tang et al. (2018) proposed a 3D CNN architecture
for recognizing dynamic hand gestures, which are common in
sign language. Their approach achieved state-of-the-art
performance on benchmark datasets.

6. Sign Language Recognition Using Wearable Sensors: In
addition to camera-based approaches, researchers have
explored wearable sensors for sign language recognition. For
instance, Zhang et al. (2019) developed a system based on
wrist-worn sensors that captured hand movements to recognize
sign language gestures.

7. Transformer-Based Sign Language Recognition: Inspired by
the success of transformer models in natural language
processing tasks, recent works have applied transformer
architectures to sign language recognition. These models can
effectively capture temporal dependencies and spatial
relationships in sign language sequences. Notable examples
include the work by Pu et al. (2021), who proposed a
transformer-based model for continuous sign language
recognition.

These are just a few examples from a vast and continuously
evolving field of research in sign language detection. The
literature encompasses various techniques, including traditional
machine learning, deep learning, sensor fusion, and multimodal
approaches, all aimed at improving the accuracy and robustness
of sign language recognition systems.
CHAPTER 3: SCOPE OF THE PROJECT

1. Objective Definition
• Primary Goal: Clearly state the main objective of the project, which is to convert sign language gestures into text for communication.
• Scope Limitation: Specify the scope boundaries, such as focusing on a particular sign language (e.g., American Sign Language - ASL) or a subset of gestures.

2. Functional Requirements
• Input Mechanism: Define how the sign language gestures will be captured (e.g., via a camera or sensor).
• Recognition Capability: Specify the types of gestures the system will recognize (e.g., alphabets, numbers, common words).
• Real-time Processing: Determine if the system needs to operate in real time to provide immediate text output.

3. System Components
• Gesture Recognition Model: Identify the machine learning or deep learning model to be used for recognizing sign language gestures.
• Text Conversion Mechanism: Decide how recognized gestures will be translated into text (e.g., mapping to a predefined vocabulary or using natural language processing techniques); a minimal mapping sketch is given after this list.
• User Interface: Define the user interface requirements for displaying the recognized text output.
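
As a hedged illustration of the "Text Conversion Mechanism" above, the sketch below maps recognized gesture labels to output text using a predefined vocabulary. The label names and the vocabulary entries are assumptions for illustration only, not the project's final implementation.

```python
# A minimal mapping sketch, assuming the recognizer outputs string labels
# such as "A", "B", "SPACE"; the actual label set is a project decision.
GESTURE_TO_TEXT = {
    "A": "A", "B": "B", "C": "C",             # fingerspelled letters
    "SPACE": " ",                              # hypothetical word-separator gesture
    "HELLO": "hello", "THANKS": "thank you",   # hypothetical word-level gestures
}

def gestures_to_sentence(labels):
    """Concatenate recognized gesture labels into readable text,
    ignoring labels that have no entry in the vocabulary."""
    return "".join(GESTURE_TO_TEXT.get(label, "") for label in labels)

# Example: gestures_to_sentence(["A", "B", "SPACE", "HELLO"]) == "AB hello"
```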


4. Data Requirements
• Training Dataset: Specify the dataset needed for training the gesture recognition model, including the types of gestures and annotations required.
• Testing Dataset: Plan for a separate dataset to evaluate the performance of the system during development.

5. Performance Metrics
• Accuracy: Define the acceptable level of accuracy for gesture recognition and text conversion.
• Speed: Determine the processing speed requirements, especially if real-time performance is needed.

6. Deployment Considerations
• Hardware Requirements: Identify the hardware specifications needed to deploy the system effectively (e.g., camera specifications, computational resources).
• Software Dependencies: List any software libraries or frameworks required for implementing the system.

7. User Considerations
• Accessibility: Ensure that the system is accessible to individuals with hearing impairments.
• User Training: Determine if users require any training to interact effectively with the system.

8. Ethical and Legal Considerations
• Privacy: Address privacy concerns related to data collection and usage.
• Compliance: Ensure compliance with relevant regulations and standards, especially if the system will be used in healthcare or educational settings.

9. Project Constraints
• Timeframe: Define the project timeline, considering development, testing, and deployment phases.
• Budget: Determine budgetary constraints for acquiring necessary resources and technologies.

10. Maintenance and Support
• Update Plan: Establish a plan for maintaining and updating the system to address issues and incorporate improvements over time.
• Technical Support: Consider provisions for providing technical support to users post-deployment.


CHAPTER 4: METHODOLOGY

1. Research and Define Requirements
• Understand Sign Language: Familiarize yourself with the basics of sign language, including common gestures and their meanings.
• Define Scope: Determine the scope of your project, including which sign language you are targeting (e.g., American Sign Language - ASL).
• Identify Input Source: Decide how you will capture sign language input (e.g., through a camera or sensor).

2. Data Collection and Preprocessing
• Gather Dataset: Collect a dataset of sign language gestures. This dataset should include videos or images of people performing various signs.
• Preprocessing: Clean and preprocess the dataset by standardizing formats, resizing images, and ensuring consistency in lighting and background (a minimal sketch follows this step).
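
A minimal preprocessing sketch is given below. The folder layout (one sub-folder per gesture label) and the 64x64 target size are assumptions for illustration; the project's actual dataset format may differ.

```python
import os
import cv2
import numpy as np

def load_and_preprocess(data_dir, size=(64, 64)):
    """Load gesture images from per-class folders (assumed layout:
    data_dir/<label>/<image>.jpg), resize them to a common size, and
    scale pixel values to [0, 1]."""
    images, labels = [], []
    for label in sorted(os.listdir(data_dir)):
        class_dir = os.path.join(data_dir, label)
        if not os.path.isdir(class_dir):
            continue
        for name in os.listdir(class_dir):
            img = cv2.imread(os.path.join(class_dir, name))
            if img is None:        # skip unreadable files
                continue
            img = cv2.resize(img, size)
            images.append(img.astype(np.float32) / 255.0)
            labels.append(label)
    return np.array(images), np.array(labels)
```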

3. Model Selection and Training
• Choose a Model: Select a suitable machine learning or deep learning model for sign language recognition. Common choices include Convolutional Neural Networks (CNNs) for image-based recognition.
• Training: Train your chosen model using the preprocessed dataset. Use techniques like transfer learning, if applicable, to leverage pre-trained models (a minimal sketch follows this step).
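
The sketch below shows one way the transfer-learning idea could look with Keras, using MobileNetV2 pre-trained on ImageNet as a frozen base. The 26-class output (one per ASL letter) and the hyperparameters are assumptions, not the project's final configuration.

```python
import tensorflow as tf

NUM_CLASSES = 26  # assumed: one class per ASL fingerspelling letter

# Pre-trained base used as a frozen feature extractor.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_images, train_labels, validation_split=0.2, epochs=10)
```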

4. Develop a Sign Language Recognition System
• Input Processing: Implement the system to process input from a camera or sensor to capture sign language gestures in real time.
• Gesture Recognition: Apply the trained model to recognize gestures from the input data.
• Mapping to Text: Convert recognized gestures into corresponding text using a predefined mapping (e.g., mapping each gesture to a specific word or phrase); an end-to-end sketch of this step follows this list.
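
Below is a minimal end-to-end sketch of this step: capture frames from the webcam, classify a hand region of interest (ROI), and append the predicted letter to the output text. The ROI coordinates, model file name, label list, and confidence threshold are illustrative assumptions.

```python
import cv2
import numpy as np
import tensorflow as tf

labels = [chr(ord("A") + i) for i in range(26)]        # assumed: ASL letters A-Z
model = tf.keras.models.load_model("sign_model.h5")    # hypothetical model file

cap = cv2.VideoCapture(0)
sentence = ""
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    roi = frame[100:324, 100:324]                      # hypothetical hand ROI
    x = cv2.resize(roi, (64, 64)).astype(np.float32) / 255.0
    probs = model.predict(x[np.newaxis, ...], verbose=0)[0]
    if probs.max() > 0.8:                              # assumed confidence threshold;
        sentence += labels[int(np.argmax(probs))]      # a real system would debounce repeats
    cv2.putText(frame, sentence[-30:], (10, 40),
                cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
    cv2.imshow("sign-to-text", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```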

5. Integration and Testing
• Integrate Components: Combine the input processing, gesture recognition, and text mapping components into a cohesive system.
• Testing: Test the system extensively using diverse sign language gestures to evaluate its accuracy and robustness.

6. User Interface Development
• UI Design: Create a user-friendly interface that displays recognized text output in real time.
• Accessibility Considerations: Ensure the interface is accessible and usable for individuals with hearing impairments.

7. Deployment and Optimization
• Deployment: Deploy the sign language to text system on suitable hardware (e.g., a computer or embedded device).
• Optimization: Optimize the system for performance and efficiency, considering factors like speed and accuracy.

8. Continuous Improvement and Maintenance
• Feedback Loop: Gather feedback from users to identify areas for improvement.
• Update Model: Periodically update and retrain your model with new data to enhance accuracy and recognize additional gestures.

Additional Considerations
• Privacy and Ethical Considerations: Ensure that the system respects user privacy and complies with ethical guidelines.
• Localization: Consider adapting the system to different sign languages and regional variations.
CHAPTER 5: DETAIL OF DESIGN, WORKING AND PROCESS

Hardware
• A standard laptop with a built-in webcam.

Software
• Python: the core programming language for development.
• OpenCV: for real-time video capture and processing.
• TensorFlow/Keras: deep learning libraries for model training and recognition.
• NumPy: for numerical operations and array handling.
• Teachable Machine (web tool): for simplified model training and export (a loading sketch follows this list).
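
As a hedged illustration of the Teachable Machine workflow listed above, the sketch below loads an exported Keras model and its label file and classifies one frame. The file names (keras_model.h5, labels.txt), the 224x224 input size, and the [-1, 1] scaling are typical of Teachable Machine image exports but should be checked against the actual export used.

```python
import numpy as np
import cv2
import tensorflow as tf

# Assumed file names from a Teachable Machine "image project" export.
model = tf.keras.models.load_model("keras_model.h5")
with open("labels.txt") as f:
    labels = [line.strip() for line in f]

def classify(frame_bgr):
    """Return (label, confidence) for one OpenCV BGR frame."""
    img = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
    img = cv2.resize(img, (224, 224)).astype(np.float32)
    img = img / 127.5 - 1.0                      # scale pixels to [-1, 1]
    probs = model.predict(img[np.newaxis, ...], verbose=0)[0]
    return labels[int(np.argmax(probs))], float(probs.max())
```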
TensorFlow:
TensorFlow is an end-to-end open-source platform for
Machine Learning. It has a comprehensive, flexible ecosystem
of tools, libraries, and community resources that lets
researchers push the state of the art in Machine Learning and
lets developers easily build and deploy Machine Learning-powered
applications.
TensorFlow offers multiple levels of abstraction so you can
choose the right one for your needs. Build and train models by
using the high-level Keras API, which makes getting started
with TensorFlow and machine learning easy.

If you need more flexibility, eager execution allows for
immediate iteration and intuitive debugging. For large ML
training tasks, use the Distribution Strategy API for distributed
training on different hardware configurations without changing
the model definition.
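
As an illustration of the Distribution Strategy point above, the sketch below builds a Keras model inside a MirroredStrategy scope so that it can train across the available GPUs on one machine without changing the model definition. The layer sizes are assumptions for illustration, not the project's actual model.

```python
import tensorflow as tf

def build_model():
    # The same model definition works with or without a distribution strategy.
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(64, 64, 3)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(26, activation="softmax"),   # e.g., 26 ASL letters
    ])

# MirroredStrategy replicates the model across local GPUs; only the model
# and optimizer creation move inside the scope.
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model = build_model()
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
# model.fit(train_dataset, epochs=10) then trains as usual.
```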

OpenCV:
OpenCV (Open Source Computer Vision) is an open-source
library of programming functions used for real-time computer
vision. It is mainly used for image processing, video capture, and
analysis, for features like face and object recognition. It is
written in C++, which is its primary interface; however, bindings
are available for Python, Java, and MATLAB/Octave.
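
A minimal sketch of the OpenCV capture-and-preprocess loop assumed in this project is given below; the region of interest (ROI) coordinates and the blur kernel size are illustrative choices.

```python
import cv2

cap = cv2.VideoCapture(0)                      # device 0 = built-in webcam
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    roi = frame[100:400, 100:400]              # hypothetical ROI for the hand
    gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray, (5, 5), 0)   # smooth before further processing
    cv2.imshow("hand", blur)
    if cv2.waitKey(1) & 0xFF == ord("q"):      # press 'q' to quit
        break
cap.release()
cv2.destroyAllWindows()
```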
CHAPTER 6: EXPERIMENTAL RESULT AND APPLICATION

[Images 1-5: experimental result screenshots (figures not reproduced in this text version).]
CHAPTER 7: CONCLUSION AND FUTURE SCOPES

Conclusion
In this report, a functional real-time vision-based sign language
recognition system for D&M people has been developed for the ASL
alphabet.
We achieved a final accuracy of 98.0% on our dataset. We improved
our predictions by implementing two layers of algorithms, in which
symbols that look similar to each other are verified and
re-predicted; a hedged sketch of this idea is given below.
This gives us the ability to detect almost all the symbols, provided
that they are shown properly, there is no noise in the background,
and the lighting is adequate.
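
The report does not give the details of the two-layer scheme, so the sketch below is only one plausible reading of it: a general classifier predicts a letter, and if that letter belongs to a group of easily confused symbols, a specialised classifier trained only on that group re-predicts it. The letter groups and model objects are hypothetical.

```python
import numpy as np

# Hypothetical groups of visually similar ASL letters; the grouping actually
# used in the project is not specified in this report.
SIMILAR_GROUPS = [frozenset("MNST"), frozenset("DRU"), frozenset("KVW")]

def two_layer_predict(x, base_model, group_models, labels):
    """First layer: general classifier. Second layer: if the prediction falls
    in a confusable group, verify it with that group's dedicated classifier."""
    probs = base_model.predict(x[np.newaxis, ...], verbose=0)[0]
    letter = labels[int(np.argmax(probs))]
    for group in SIMILAR_GROUPS:
        if letter in group and group in group_models:
            sub_model, sub_labels = group_models[group]
            sub_probs = sub_model.predict(x[np.newaxis, ...], verbose=0)[0]
            letter = sub_labels[int(np.argmax(sub_probs))]
            break
    return letter
```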

Future Scopes:
We are planning to achieve higher accuracy even in the case of
complex backgrounds by trying out various background subtraction
algorithms. We are also thinking of improving the preprocessing to
predict gestures in low-light conditions with higher accuracy; a
minimal OpenCV sketch of the background-subtraction idea is given
below.
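
The sketch below uses OpenCV's built-in MOG2 subtractor as one example of the background-subtraction idea mentioned above; the project has not yet settled on a specific algorithm, and the parameters shown are assumptions.

```python
import cv2

cap = cv2.VideoCapture(0)
subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=25)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)                        # foreground (moving hand) mask
    hand_only = cv2.bitwise_and(frame, frame, mask=mask)  # keep only foreground pixels
    cv2.imshow("foreground", hand_only)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```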
This project can be enhanced by building it as a web or mobile
application so that users can access it conveniently. Also, the
existing project only works for ASL; it can be extended to other
native sign languages given a sufficient dataset and training. This
project implements a fingerspelling translator; however, sign
languages are also used contextually, where each gesture can
represent an object or a verb. Identifying this kind of contextual
signing would require a higher degree of processing and natural
language processing (NLP).
CHAPTER 8: REFERENCES

References:

[1] T. Yang and Y. Xu, "Hidden Markov Model for Gesture Recognition," CMU-RI-TR-94-10, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA, May 1994.
[2] Pujan Ziaie, Thomas Müller, Mary Ellen Foster, and Alois Knoll, "A Naïve Bayes …," Munich, Dept. of Informatics VI, Robotics and Embedded Systems, Boltzmannstr. 3, DE-85748 Garching, Germany.
[3] https://docs.opencv.org/2.4/doc/tutorials/imgproc/gausian_median_blur_bilateral_filter/gausian_median_blur_bilateral_filter.html
[4] Mohammed Waleed Kadous, Machine recognition of Auslan signs using PowerGloves: Towards large-lexicon recognition of sign language.
[5] https://adeshpande3.github.io/A-Beginner%27s-Guide-To-Understanding-Convolutional-Neural-Networks-Part-2/
[6] http://www-i6.informatik.rwth-aachen.de/~dreuw/database.php
[7] Pigou L., Dieleman S., Kindermans P.-J., Schrauwen B. (2015). Sign Language Recognition Using Convolutional Neural Networks. In: Agapito L., Bronstein M., Rother C. (eds) Computer Vision - ECCV 2014 Workshops. ECCV 2014. Lecture Notes in Computer Science, vol 8925. Springer, Cham.
[8] Zaki, M.M., Shaheen, S.I.: Sign language recognition using a combination of new vision-based features. Pattern Recognition Letters 32(4), 572-577 (2011).
[9] N. Mukai, N. Harada and Y. Chang, "Japanese Fingerspelling Recognition Based on Classification Tree and Machine Learning," 2017 Nicograph International (NicoInt), Kyoto, Japan, 2017, pp. 19-24. doi:10.1109/NICOInt.2017.9
[10] Byeongkeun Kang, Subarna Tripathi, Truong Q. Nguyen, "Real-time sign language fingerspelling recognition using convolutional neural networks from depth map," 2015 3rd IAPR Asian Conference on Pattern Recognition (ACPR).
[11] Number System Recognition (https://github.com/chasinginfinity/number-sign-recognition)
[12] https://opencv.org/
[13] https://en.wikipedia.org/wiki/TensorFlow
[14] https://en.wikipedia.org/wiki/Convolutional_neural_network
[15] http://hunspell.github.io/
