
GestureConnect: A Sign Language

Translator
A PROJECT REPORT

Submitted in partial fulfilment of the requirements for the award of degree of

BACHELOR OF TECHNOLOGY

By

Aina Arun [SCM21CS010]


Aleena Johnson [SCM21CS022]
Amiya Shereef [SCM21CS017]
Christy Saji [SCM21CS040]

SCMS SCHOOL OF ENGINEERING AND TECHNOLOGY


(Affiliated to APJ ABDUL KALAM TECHNOLOGICAL UNIVERSITY)
VIDYA NAGAR, PALISSERY, KARUKUTTY
ERNAKULAM - 683576
March 24, 2025
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
SCMS SCHOOL OF ENGINEERING AND TECHNOLOGY
(Affiliated to APJ ABDUL KALAM TECHNOLOGICAL UNIVERSITY)
VIDYA NAGAR, PALISSERY, KARUKUTTY
ERNAKULAM - 683576

BONAFIDE CERTIFICATE
This is to certify that the report entitled ‘GestureConnect - A Sign Language Translator’ submitted by
Aina Arun [SCM21CS010], Aleena Johnson [SCM21CS017], Amiya Shereef [SCM21CS022],
Christy Saji [SCM21CS040] to the APJ Abdul Kalam Technological University in partial fulfillment
of the requirements for the award of the Degree of Bachelor of Technology in Computer Science and
Engineering is a bonafide record of the project work carried out by them under our guidance
and supervision. This report in any form has not been submitted to any other University or Institute for
any purpose.

COORDINATOR GUIDE HOD


Ms. Rinsu Sujith Ms. Sruthy K Joseph Dr. Manish T I
Assistant Professor Assistant Professor Professor
Department of CSE Department of CSE Department of CSE
SSET SSET SSET
DECLARATION

We, the undersigned, hereby declare that the project report on ‘GestureConnect - A Sign Language
Translator’, submitted in partial fulfilment of the requirements for the award of the degree of
Bachelor of Technology of the APJ Abdul Kalam Technological University, Kerala, is a bonafide
work done by us under the supervision of Dr. Mahesh K M. This submission represents our ideas in
our own words, and where ideas or words of others have been included, we have adequately and
accurately cited and referenced the original sources. We also declare that we have adhered to the
ethics of academic honesty and integrity and have not misrepresented or fabricated any data, idea,
fact, or source in our submission. We understand that any violation of the above will be cause for
disciplinary action by the institute and/or the University and can also evoke penal action from the
sources which have thus not been properly cited or from whom proper permission has not been
obtained. This report has not previously formed the basis for the award of any degree, diploma, or
similar title of any other University.

Place: Karukutty Aina Arun

Date: 27/03/2024 [SCM21CS010]
Amiya Shereef
[SCM21CS022]
Aleena Johnson
[SCM21CS017]
Christy Saji
[SCM21CS040]

ACKNOWLEDGEMENT
At the very outset of this report, we would like to extend our sincere and heartfelt gratitude to
everyone who helped us make this project a reality. We would like to extend our gratitude to our
principal, Dr. Anitha G. Pillai, for facilitating the development of the project by providing the
necessary resources from the college. We would like to put on record our deepest gratitude to the
Head of the Department of Computer Science, Dr. Manish T I, for his guidance and support
throughout the project. We are deeply indebted to our project guide, Ms. Sruthy K Joseph, and our
class coordinator, Ms. Rinsu Sujith, for their valuable inputs and mentoring, without which this
project would not have been completed successfully. We are grateful for the grace of God and
express our gratitude towards our parents and family members, who always supported us morally as
well as financially. We convey our heartfelt thanks to our team members, who put their earnest
efforts into the development of this project, and to the faculty members and classmates for their
cooperation, guidance, feedback, and encouragement since day one.
Any omission in this brief acknowledgement does not mean a lack of gratitude.

ABSTRACT

The ASL Language Translator is a web-based application designed to bridge the communication gap
between hearing-impaired individuals and those unfamiliar with American Sign Language (ASL).
This system utilizes computer vision to recognize ASL hand gestures and translate them into text in
real time. By providing an accessible and interactive platform, the project aims to facilitate smoother
communication and promote inclusivity for the hearing-impaired community.
The translation process begins with a webcam-based image capture system integrated into the
website. The captured frames are processed using image preprocessing techniques to enhance clarity
and improve recognition accuracy. A Convolutional Neural Network (CNN) model is then used to
classify the detected hand gestures. The recognized gestures are instantly converted into text and
displayed on the web interface, allowing users to interact with the system effortlessly.
This project is particularly beneficial for individuals with hearing impairments, enabling them to
communicate more effectively in digital spaces. It can also serve as a learning tool for those
interested in understanding ASL. The user-friendly web interface ensures accessibility for a wide
range of users, making the system easy to navigate and use.
By integrating sign language recognition into a web-based platform, this project enhances digital
accessibility. The ASL Language Translator aims to create a practical and inclusive communication
tool for both hearing and non-hearing individuals.

CONTENTS

Declaration .......................................................................... ii
Acknowledgement ...................................................................... iii
Abstract ............................................................................. iv
List of Figures ...................................................................... vi
Abbreviations ........................................................................ vii

1. INTRODUCTION                                                                         1
   1.1 Overview ...................................................................... 1
   1.2 Objectives .................................................................... 2
   1.3 Problem Statement ............................................................. 3
   1.4 Organization of Report ........................................................ 3

2. LITERATURE REVIEW                                                                    5

3. METHODOLOGY                                                                          6
   3.1 Proposed System ............................................................... 6
   3.2 Dataset ....................................................................... 8
   3.3 Libraries Used ................................................................ 9

4. SYSTEM DESIGN                                                                        11
   4.1 System Architecture ........................................................... 11
   4.2 Use Case Diagram .............................................................. 15

5. RESULTS                                                                              17
   5.1 Results ....................................................................... 17

6. CONCLUSION AND FUTURE SCOPE                                                          21
   6.1 Conclusion .................................................................... 21
   6.2 Future Scope .................................................................. 22

References                                                                              24

List of Figures

3.1 Proposed System .................................................................. 6
3.2 ASL Dataset ...................................................................... 8
3.3 Training Dataset ................................................................. 9
4.1 Use Case Diagram ................................................................. 15
5.1 ‘C’ sign ......................................................................... 18
5.2 ‘O’ sign ......................................................................... 18
5.3 ‘S’ sign ......................................................................... 18
5.4 Blank frame ...................................................................... 18
5.5 Login page ....................................................................... 19
5.6 Home page ........................................................................ 19


ABBREVIATIONS
● ASL - American Sign Language

●​ CNN - Convolutional Neural Networks

● AI - Artificial Intelligence

●​ GUI - Graphical User Interface

●​ ReLU - Rectified Linear Unit

●​ TTS - Text-to-Speech

●​ API - Application Programming Interface

●​ ROI - Region of Interest

●​ NLP - Natural Language Processing

CHAPTER 1

INTRODUCTION

1.1​ Overview
The ASL Language Translator is a web-based application designed to facilitate communication
between hearing-impaired individuals and those unfamiliar with American Sign Language (ASL).
Many individuals rely on ASL as their primary mode of communication, but due to a lack of
widespread knowledge about sign language, interactions with non-ASL users can be challenging.
This project aims to bridge this communication gap by providing a simple and accessible platform
that translates ASL hand gestures into text in real time. By making ASL more understandable to a
wider audience, this project promotes inclusivity and accessibility. The system functions through a
webcam-based image capture module that detects and processes hand gestures. When a user
performs an ASL sign, the system captures the image and applies preprocessing techniques such as
noise reduction and background removal to improve accuracy. The processed image is then analyzed
to classify the detected gesture, and the corresponding text is displayed on the web interface. This
real-time translation ensures that users can communicate effectively without significant delays.

One of the main advantages of this project is its web-based implementation, eliminating the need for
additional software installations. Users can access the platform through a web browser, making it
convenient and widely available on different devices. The interface is designed to be intuitive,
ensuring that users with minimal technical expertise can navigate it effortlessly. The system
primarily focuses on recognizing commonly used ASL gestures, making it practical for everyday
conversations. Future enhancements may include expanding the gesture vocabulary and improving
recognition accuracy.

This project holds significant value for hearing-impaired individuals, educators, and ASL learners. It
provides an effective communication tool for individuals who rely on sign language, helping them
interact more easily with non-ASL users.


Educators can use the system as a teaching aid for students learning ASL, while learners can
practice and verify their ASL signs through the platform. By increasing awareness and
understanding of sign language, this project contributes to fostering better communication
between different communities.
In conclusion, the ASL Language Translator is a user-friendly web application designed to
promote accessibility and inclusivity for hearing-impaired individuals. By offering a simple and
efficient way to translate ASL gestures into text, the project makes communication easier for
those who rely on sign language. With potential improvements such as expanded vocabulary
support and enhanced functionality, this system can continue to evolve as a valuable tool for
bridging language barriers and promoting inclusivity.

1.2​ Objectives

•​ Bridge the Communication Gap: One of the primary goals of this project is to reduce the
communication barriers between hearing-impaired individuals and those who do not understand
ASL. Many people with hearing impairments rely on ASL, but due to the lack of widespread
ASL knowledge, communication can be difficult. This project aims to provide a practical
solution by converting ASL hand gestures into readable text, allowing smooth and effective
interactions without the need for interpreters or written notes.

•​ Develop an Easy-to-Use Web-Based Platform: The ASL translator is designed as a web-based


application to ensure accessibility for a wide range of users. By making the platform available
through a browser, it eliminates the need for additional software or specialized hardware
installations. The user interface is designed to be simple and intuitive so that individuals,
regardless of their technical expertise, can easily navigate and use the system. The goal is to
make the platform widely accessible across multiple devices, including desktops, laptops, and
mobile phones.

•​ Implement Real-Time Gesture Recognition: For effective communication, the system captures
and translates ASL gestures into text instantly. The webcam-based module processes hand
gestures in real time, allowing users to receive immediate feedback. Real-time recognition is
essential to maintaining the flow of conversation and ensuring a smooth experience. By
minimizing delays in gesture recognition, the system enhances usability and practicality.


•​ Ensure Accurate Sign Recognition: To improve the accuracy of sign recognition, the system
integrates image preprocessing techniques such as noise reduction, background removal, and
contrast enhancement. These techniques help identify hand movements clearly, minimizing
gesture classification errors. The project focuses on recognizing commonly used ASL gestures
with high precision, ensuring that translations are reliable and meaningful. Future enhancements
may include expanding the gesture database to improve recognition capabilities further.

•​ Support ASL Learners and Educators: Beyond assisting hearing-impaired individuals, the project
serves as an educational tool for ASL learners and teachers. ASL students can use the system to
practice their gestures and receive instant feedback, helping them refine their skills. Educators can
incorporate the translator into their teaching methods to provide interactive learning experiences.
By making ASL more accessible, the project encourages a greater understanding and adoption of
sign language.

1.3​ Problem Statement


The lack of widespread knowledge of American Sign Language (ASL) creates communication
barriers for hearing-impaired individuals. This project aims to address this issue by developing a
web-based ASL translator that captures and converts hand gestures into text, enabling seamless
and accessible communication in real-time.

1.4​ Organization Of Report


The report is divided into six chapters. The overview, objectives, and problem statement are
covered in the introductory chapter. A variety of reviews of relevant literature are included in the
second chapter. Chapter three details the methodology and functioning of the proposed system.
The system architecture and use case diagram are discussed in the fourth chapter. The experimental
results and discussions are presented in the fifth chapter. The final chapter presents the project's
conclusion and future scope. The references are provided at the end of the report.

CHAPTER 2

LITERATURE REVIEW

" An Efficient Two-Stream Network for Isolated Sign Language Recognition Using Accumulative
Video Motion "[1]introduced a Hierarchical Sign Learning Model, a trainable deep learning
network designed for sign language recognition. The model uses a key posture extractor and
accumulative video motion to enhance recognition. It employs three specialized networks: DMN,
AMN, and SRN, and was evaluated on the KArSL-190 and KArSL-502 Arabic Sign Language
datasets. Experiments were conducted in both signer-dependent and signer-independent modes.
However, the model demonstrated limited generalization to different signers, particularly in the
signer-independent mode. Additionally, potential overfitting to extracted postures was observed,
reducing recognition accuracy for dynamic and nuanced gestures.

"BIM Sign Language Translator Using Machine Learning (TensorFlow)"[2], is an offline system
designed to convert sign gestures into text in multiple languages using a camera system. The
translator was built with TensorFlow, Python, OpenCV, and Qt as the core development libraries.
An interactive GUI was developed using PyQt, featuring a video feed display area to capture
real-time gestures. The system employs OpenCV to capture sign gestures via an HD camera,
processes them, and stores them locally as JPG images. It was distributed as a standalone
executable file, ensuring ease of use for end-users. However, the system faced limitations,
including an inability to interpret dynamic signs, sensitivity to background lighting and signer skin
tones, and a lack of diversified training data, which diminished accuracy. Additionally, the
restricted sign language dictionary constrained the system's translation capabilities.

"Static Sign Language Recognition Using Deep Learning"[3] employs a vision-based approach
using Convolutional Neural Networks (CNNs) for real-time recognition of static ASL gestures,
utilizing the Keras framework. A skin color modeling technique is used to enhance gesture
detection. Data collection involved capturing a diverse set of hand gesture images in a controlled
environment, ensuring consistency and reliability. These images were preprocessed by resizing and
converting them from the RGB to HSV color space, which improved skin detection and overall

accuracy. The system achieved an impressive average accuracy of 90.04% for letter recognition;
however, it is currently limited to recognizing static gestures. The performance of the model was
highly dependent on optimal lighting conditions, as better lighting significantly improved
accuracy. Additionally, the system struggled with complex backgrounds, highlighting the
importance of simpler and less cluttered environments for effective recognition.

"A Moroccan Sign Language Recognition Algorithm Using a Convolution Neural Network "[4]
The model integrates convolution and max pooling layers for real-time classification and
localization. It uses a database of 20 stored signs, including letters and numbers from Moroccan
Sign Language. Input images captured via a webcam are preprocessed by converting RGB images
to grayscale, followed by binary image conversion to extract hand gestures. While the model is
effective for static signs, it is less suitable for dynamic or continuous gestures that require
spatiotemporal analysis. Additionally, the system is highly dependent on the quality of the dataset
and exhibits significantly reduced accuracy when the datasets deviate from the norm.

"A CNN sign language recognition system with single & double-handed gestures"[5] The system
was implemented using Python and leveraged TensorFlow, Keras, and OpenCV for image capture
and analysis. A graphical user interface was designed using the OpenCV library. The testing
dataset comprised 2,375 images, with 125 images per gesture. While the system achieved effective
recognition of static signs, it was unable to process dynamic gestures, sequences, or motion.
Additionally, it exhibited sensitivity to hand positioning, where minor deviations in position or
orientation led to misclassifications. The model also showed a lower recognition rate for signs
outside the training dataset, highlighting its limited generalizability.

Together, these studies illustrate the transformative impact of deep learning and image processing
technologies on sign language recognition systems. Researchers have advanced traditional gesture
recognition methods by developing novel architectures, such as Convolutional Neural Networks,
hierarchical learning models, and real-time preprocessing techniques. These efforts aim to improve
the scalability, accuracy, and usability of sign language recognition systems, using innovations like
GUI-based interactive interfaces and real-time gesture analysis. However, challenges such as limited
datasets, sensitivity to environmental factors, and the inability to process dynamic gestures
emphasize the need for further refinement. Collectively, these research projects underscore the
interdisciplinary nature of this field, highlighting the collaboration required between computer
science, linguistics, and human-computer interaction to create inclusive communication tools.

CHAPTER 3

METHODOLOGY

3.1​ Proposed System

Figure 3.1: Proposed System

1. Video Capture and Preprocessing:

●​ The system begins by capturing real-time video input using a webcam.


●​ Preprocessing techniques, such as flipping, grayscale conversion, Gaussian blurring, and
adaptive thresholding, are applied to prepare the video frames for analysis.
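A minimal OpenCV sketch of this capture-and-preprocessing stage is shown below. The ROI coordinates, blur kernel, and threshold parameters are illustrative assumptions, not the project's exact settings.

```python
import cv2

cap = cv2.VideoCapture(0)  # open the default webcam

while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame = cv2.flip(frame, 1)                       # mirror the frame for a natural view

    # Illustrative region of interest (ROI) where the hand is expected
    x1, y1, x2, y2 = 320, 10, 620, 310
    roi = frame[y1:y2, x1:x2]

    gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)     # grayscale conversion
    blur = cv2.GaussianBlur(gray, (5, 5), 2)         # Gaussian blurring to suppress noise
    thresh = cv2.adaptiveThreshold(                  # adaptive thresholding to isolate the hand
        blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
        cv2.THRESH_BINARY_INV, 11, 2)

    cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
    cv2.imshow("Webcam", frame)
    cv2.imshow("Preprocessed ROI", thresh)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```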


2. Deep Learning Models for Gesture Recognition:

●​ Multiple pre-trained deep learning models based on Convolutional Neural Networks (CNNs)
are used to recognize and classify hand gestures.
●​ Specialized models handle ambiguous cases (e.g., distinguishing between similar gestures like
D, R, U or M, N, S) to improve accuracy.
●​ A hierarchical prediction mechanism is implemented to refine the output from these models.

3. Text Generation and Sentence Formation:

●​ Recognized gestures are converted into characters, which are dynamically grouped into words.
●​ The system forms complete sentences by concatenating recognized words, with a "blank"
gesture signaling the end of a word.

4. Graphical User Interface (GUI):

●​ A user-friendly GUI is built using Tkinter to display the video feed, predicted characters, words,
and sentences in real time.
●​ Buttons for functionalities such as speaking out the constructed sentence and clearing the
current word are provided.

5. Text-to-Speech Conversion:

●​ A text-to-speech engine is integrated to vocalize the constructed sentences, enabling effective


communication with non-signers.

6. Error Handling and Robustness:

●​ A counter-based mechanism ensures gesture stability by confirming predictions only after


consistent detection over a defined threshold.
●​ This reduces misclassifications caused by accidental or inconsistent gestures.
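The interplay between this stability counter (step 6) and the word/sentence formation logic (step 3) can be sketched roughly as follows; the threshold value and the helper name are illustrative assumptions.

```python
STABLE_THRESHOLD = 50          # assumed number of consecutive identical predictions required
counts = {}                    # per-symbol counters
current_word, sentence = "", ""

def on_prediction(symbol):
    """Accept a raw per-frame prediction ('A'-'Z' or 'blank') and update word/sentence state."""
    global current_word, sentence
    counts[symbol] = counts.get(symbol, 0) + 1
    # reset counters of all other symbols so only sustained gestures are confirmed
    for key in list(counts):
        if key != symbol:
            counts[key] = 0
    if counts[symbol] < STABLE_THRESHOLD:
        return
    counts[symbol] = 0
    if symbol == "blank":
        if current_word:                       # a blank gesture ends the current word
            sentence += current_word + " "
            current_word = ""
    else:
        current_word += symbol                 # a confirmed letter extends the word

# Example: feeding a stream of per-frame predictions
for frame_prediction in ["H"] * 50 + ["I"] * 50 + ["blank"] * 50:
    on_prediction(frame_prediction)
print(sentence)   # -> "HI "
```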

7. Extensibility and Modularity:

●​ The system's modular design facilitates the integration of additional features, such as extended
gesture libraries, support for dynamic gestures, and predictive text suggestions.


3.2​ Dataset
The dataset used for this American Sign Language (ASL) translation system consists of a structured
collection of images representing the ASL alphabet (A-Z). Each letter is associated with a set of
images depicting the corresponding hand gesture under various conditions, including different
lighting, angles, and hand positions. The dataset comprises approximately 31,500 images in total,
with approximately 1,100 images per class, organized into 27 class folders (the 26 letters plus a
blank gesture) to ensure systematic data processing and effective model training.

To enhance the accuracy of the recognition system, the dataset includes diverse images, addressing
variations in hand orientation, skin tone, and environmental factors, making the system more
inclusive and robust. Preprocessing steps, such as resizing images to 128x128 pixels, grayscale
conversion, and data augmentation (e.g., rotation, flipping, and contrast adjustment), were applied
to improve the model’s training performance and generalization capabilities.

Additionally, the dataset incorporates a standard ASL reference chart displaying each letter's hand
gesture, ensuring consistency during training and validation. This reference aids in cross-checking
the correctness of image classifications. The dataset plays a crucial role in developing an automated
ASL recognition system that converts sign language gestures into textual output, facilitating
seamless communication for individuals with hearing and speech impairments.

Figure 3.2: ASL Dataset


Figure 3.3: Training Dataset

3.3​ Libraries Used


1. Core Python Libraries

●​ numpy → For handling numerical data.


●​ operator → For sorting predictions.
● os, sys → For file handling and system operations.
●​ time → For handling time-based operations.

2. Computer Vision & Image Processing

●​ cv2 (OpenCV) → For:


○​ Capturing video from the webcam.
○​ Preprocessing images (grayscale, blurring, thresholding).
○​ Drawing bounding boxes for hand gestures.
●​ PIL (Pillow) → For handling and displaying images in Tkinter.
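As an illustration of how OpenCV frames reach the Tkinter GUI through Pillow, a minimal sketch is given below; the window layout and refresh interval are assumptions, not the project's actual GUI code.

```python
import cv2
import tkinter as tk
from PIL import Image, ImageTk

root = tk.Tk()
root.title("GestureConnect")
video_label = tk.Label(root)          # widget that will hold the video frames
video_label.pack()
cap = cv2.VideoCapture(0)

def show_frame():
    ok, frame = cap.read()
    if ok:
        rgb = cv2.cvtColor(cv2.flip(frame, 1), cv2.COLOR_BGR2RGB)   # OpenCV frames are BGR
        photo = ImageTk.PhotoImage(Image.fromarray(rgb))            # convert for Tkinter display
        video_label.configure(image=photo)
        video_label.image = photo      # keep a reference so it is not garbage-collected
    root.after(30, show_frame)         # refresh roughly every 30 ms

show_frame()
root.mainloop()
```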


3. Machine Learning & Deep Learning

●​ tensorflow.keras.models.model_from_json → Loads a trained deep learning model (CNN).


● model.load_weights → Loads pre-trained weights into the model.
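A minimal sketch of restoring and querying such a model with the Keras API is shown below; the file names, the placeholder input, and the class ordering are assumptions.

```python
import numpy as np
from tensorflow.keras.models import model_from_json

# Assumed file names: the architecture is stored as JSON and the weights as H5.
with open("model-bw.json", "r") as f:
    model = model_from_json(f.read())
model.load_weights("model-bw.h5")

# 'thresh' stands in for the 128x128 preprocessed ROI produced by the OpenCV stage.
thresh = np.zeros((128, 128), dtype="float32")          # placeholder input
probs = model.predict(thresh.reshape(1, 128, 128, 1))   # shape (1, 27)
labels = ["blank"] + [chr(ord("A") + i) for i in range(26)]  # assumed class order
print(labels[int(np.argmax(probs))])
```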

4. GUI (Graphical User Interface)

●​ tkinter → Used to create the user interface.

5. Text-to-Speech (TTS)

● pyttsx3 → Converts recognized ASL symbols into spoken words.
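A minimal pyttsx3 usage sketch (the sentence string is a placeholder):

```python
import pyttsx3

engine = pyttsx3.init()           # initialise the offline text-to-speech engine
engine.say("HELLO WORLD")         # queue the recognized sentence
engine.runAndWait()               # block until speech playback finishes
```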

CHAPTER 4

SYSTEM DESIGN

4.1​ System Architecture

The ASL Translator system is designed using a Convolutional Neural Network (CNN) to
efficiently recognize American Sign Language (ASL) gestures. The system follows a
well-structured architecture consisting of multiple components, including preprocessing, feature
extraction, classification, and result generation. This pipeline ensures seamless gesture recognition
and translation into text.

The model was trained using a large dataset consisting of 31,500 images of ASL gestures,
encompassing all 26 alphabet signs and a blank gesture. The dataset was divided into:

●​ Training Data: 25,200 images (80%)


●​ Testing Data: 6,300 images (20%)

The extensive size and diversity of the dataset, with variations in lighting, skin tones, hand
orientations and backgrounds, ensured that the model achieved robust generalization across
different real-world scenarios. The input to the system consists of real-time grayscale images
captured through a webcam.

The proposed system is as follows:

1. Input and Preprocessing

●​ Input Resolution: Each input frame is resized to 128x128 pixels to maintain uniformity across
all inputs and reduce computational complexity.
●​ Grayscale Conversion: The input image is converted to grayscale, reducing it to a
single-channel (128x128x1) image. This step captures essential shape and contour information
while minimizing unnecessary color information.


● Adaptive Thresholding: Enhances contrast between the hand gesture and the background and
effectively reduces noise, improving the accuracy of subsequent feature extraction.

2. Feature Extraction Using Convolutional Layers.

●​ The system utilizes two convolutional layers for extracting essential features from the input
image.
● First Convolutional Layer: 32 filters of size 3x3 are applied. ReLU (Rectified Linear Unit)
activation is used to introduce non-linearity, capturing edges, curves, and contours. The output
feature map remains at 128x128x32.
● Second Convolutional Layer: Another set of 32 filters of size 3x3 is applied. A MaxPooling2D
layer with a pool size of 2x2 and a stride of 2 follows, reducing the spatial dimensions to
64x64x32. This downsampling retains essential spatial information while lowering
computational costs.

3. Activation Function

The ReLU activation function is applied at each convolutional layer. It introduces non-linearity to
allow the model to learn complex patterns from the input data. The function is defined as:

f(x) = \max(0, x)

where:

● f(x) is the output of the neuron
● x is the input to the neuron

4. Dimensionality Reduction and Flattening

● The output from the second convolutional layer is passed through a Flattening Layer. This
converts the 64x64x32 feature maps into a one-dimensional vector of size 32768.
● The flattened data is then fed into fully connected dense layers for classification.


5. Fully Connected Layers

●​ The fully connected layers progressively reduce the dimensionality and refine the feature
representation:
● First Dense Layer: Contains 128 units with ReLU activation to capture deeper feature
relationships.
● Second Dense Layer: Contains 96 units, followed by a Dropout layer with a dropout rate of 0.4
to prevent overfitting.
● Third Dense Layer: Contains 64 units to reduce dimensionality further and extract refined
patterns.
●​ Output Layer: Comprises 27 units, representing the 26 ASL letters and a blank gesture.
● A softmax activation function generates the probability distribution over the classes. The softmax
function is defined as:

\sigma(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{n} e^{z_j}}

where:

● \sigma(z_i) is the probability for class i
● z_i is the input to the output neuron for class i
● n is the number of output classes
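Putting the layers described above together, a Keras sketch of this architecture might look like the following; padding and any hyperparameters not stated in the report are assumptions.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential([
    # Two 3x3 convolutional layers with 32 filters each ('same' padding assumed
    # so the first feature map keeps the 128x128 spatial size described above)
    Conv2D(32, (3, 3), activation="relu", padding="same", input_shape=(128, 128, 1)),
    Conv2D(32, (3, 3), activation="relu", padding="same"),
    MaxPooling2D(pool_size=(2, 2), strides=2),   # downsample the feature maps
    Flatten(),                                   # flatten into a one-dimensional vector
    Dense(128, activation="relu"),               # first dense layer
    Dense(96, activation="relu"),                # second dense layer
    Dropout(0.4),                                # dropout to reduce overfitting
    Dense(64, activation="relu"),                # third dense layer
    Dense(27, activation="softmax"),             # 26 letters plus the blank gesture
])
model.summary()
```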

6. Training and Optimization

● The model is trained using the categorical cross-entropy loss function, which is effective for
multi-class classification. The loss is defined as:

L = -\sum_{i=1}^{n} y_i \log(\hat{y}_i)

where:

● y_i is the actual class label (0 or 1)
● \hat{y}_i is the predicted class probability
●​ The Adam optimizer is used for adaptive learning and efficient gradient descent.


●​ The model was trained using a batch size of 10 over 5 epochs.
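Continuing the architecture sketch from the previous section, the training setup described here might be configured roughly as follows; the placeholder arrays and the validation split are assumptions standing in for the actual data pipeline.

```python
import numpy as np

# Placeholder arrays standing in for the preprocessed dataset
train_images = np.zeros((100, 128, 128, 1), dtype="float32")
train_labels = np.zeros((100, 27), dtype="float32")   # one-hot encoded classes

# 'model' is the Sequential network sketched in the previous section
model.compile(optimizer="adam",                       # Adam optimizer
              loss="categorical_crossentropy",        # multi-class cross-entropy loss
              metrics=["accuracy"])

model.fit(train_images, train_labels,
          batch_size=10, epochs=5,                    # values stated in the report
          validation_split=0.2)                       # assumed hold-out split
```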

7. Model Performance and Metrics

● The system's accuracy was evaluated using performance metrics defined in terms of TP (True
Positives), TN (True Negatives), FP (False Positives), and FN (False Negatives):

1. Accuracy: the proportion of all predictions that are correct.

   \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}

2. Precision: measures how many of the predicted positive results are correct.

   \text{Precision} = \frac{TP}{TP + FP}

3. Recall: measures how many actual positive results were correctly predicted.

   \text{Recall} = \frac{TP}{TP + FN}

4. F1 Score: provides a balance between precision and recall.

   F1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}


By the fifth epoch, the model achieved:

●​ Training Accuracy: 99.93%


●​ Validation Loss: 0.0370

8. Model Variants and Customization

●​ Specialized models such as model-bw_dru.json and model-bw_smn.json were developed to


recognize specific gesture sets.
●​ Each model retains the same CNN architecture but is fine-tuned for particular classes to reduce
misclassification rates.
● Model architecture is saved in JSON format, and model weights are stored in H5 format, ensuring
efficient deployment and easy future updates.
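A rough sketch of this hierarchical refinement is given below; the main model's file name, the weight-file names, and the class orderings are assumptions, while the specialized JSON file names follow those mentioned above.

```python
import numpy as np
from tensorflow.keras.models import model_from_json

def load_model(json_path, weights_path):
    """Restore a model whose architecture is stored as JSON and weights as H5."""
    with open(json_path, "r") as f:
        model = model_from_json(f.read())
    model.load_weights(weights_path)
    return model

# Main classifier plus the two specialized models (H5 names assumed to mirror the JSON names)
main_model = load_model("model-bw.json", "model-bw.h5")
dru_model = load_model("model-bw_dru.json", "model-bw_dru.h5")
smn_model = load_model("model-bw_smn.json", "model-bw_smn.h5")

def classify(roi_128x128):
    """Predict a symbol, re-checking ambiguous groups with the specialized models."""
    x = roi_128x128.reshape(1, 128, 128, 1)
    labels = ["blank"] + [chr(ord("A") + i) for i in range(26)]    # assumed class order
    symbol = labels[int(np.argmax(main_model.predict(x)))]
    if symbol in ("D", "R", "U"):                                  # ambiguous group 1
        symbol = ["D", "R", "U"][int(np.argmax(dru_model.predict(x)))]
    elif symbol in ("M", "N", "S"):                                # ambiguous group 2
        symbol = ["M", "N", "S"][int(np.argmax(smn_model.predict(x)))]
    return symbol
```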

4.2 Use Case Diagram

Figure 4.1: Use Case Diagram


The ASL Translator system offers an intuitive platform designed to bridge the communication gap
for individuals using American Sign Language. The system incorporates two distinct types of
users: Admin and Regular Users, each with unique roles and responsibilities. New users are
required to create an account upon first accessing the platform, providing their basic details. These
details can be updated by the user at any time. Admins are given the authority to access, review,
and modify user information as necessary, ensuring that the system remains organized and
up-to-date.

Once logged in, users can interact with the core feature of the system: real-time ASL translation.
This begins with navigating to the Camera Button located in the upper taskbar. Upon clicking the
button, a GUI window opens, displaying a live video feed. Within this window, there is a defined
Region of Interest (ROI) where users are instructed to position their hands to perform ASL
gestures. The system processes the captured video frames by converting the input image into a
binary format, which simplifies gesture detection. The trained Convolutional Neural Network
(CNN) model predicts the corresponding ASL letter based on the detected gesture.

The system is designed to dynamically build words and sentences. If a user holds a particular sign
steady for 60 seconds, the corresponding letter is added to the word in progress. Conversely, if no
sign is detected for 60 seconds, the system interprets this as the completion of the current word and
appends it to the sentence being formed. This process allows users to construct complete sentences
in an efficient and intuitive manner. Once a sentence is constructed, users have the option to
convert it into speech using a built-in Text-to-Speech (TTS) functionality. By clicking a button
within the GUI, the system employs a TTS API to vocalize the generated sentence, enabling
seamless communication with non-ASL users. This feature ensures that the translated content is
not only visible as text but also accessible as audible speech.

Admins, on the other hand, focus on the management aspect of the system. They can view all user
accounts, update user details if needed, and ensure the smooth functioning of the platform. This
administrative oversight ensures that the platform remains user-friendly and reliable for all. The
ASL Translator system combines robust gesture recognition, real-time processing, and
user-friendly design to create a powerful tool for accessible communication.

CHAPTER 5

RESULTS

5.1​ Results
The proposed ASL Translator was rigorously evaluated to validate its performance in recognizing
American Sign Language (ASL) gestures. The system underwent extensive testing under various
real-world conditions, including variations in lighting, hand positioning, skin tones, and
background complexities, to assess its accuracy, robustness, and adaptability. These evaluations
demonstrated the system's ability to reliably interpret static ASL gestures with an improved
accuracy.

The CNN model was trained on a diverse dataset of ASL signs, and its performance was measured
using standard evaluation metrics. During testing, the model exhibited exceptional accuracy in
recognizing specific signs, particularly those with distinct hand shapes and minimal
ambiguity. Among the tested signs, ‘C’, ‘O’, and ‘S’ demonstrated the highest recognition
performance, achieving an accuracy of approximately 95%.

● ‘C’ Sign: Characterized by its curved hand shape, the model accurately identified this sign
across different lighting conditions. Using the above performance metrics, the accuracy for the
‘C’ sign was calculated as follows:

\text{Accuracy}_{C} = \frac{95\,TP + 0\,TN}{95\,TP + 0\,TN + 3\,FP + 2\,FN} = \frac{95}{100} = 95\%

● ‘O’ Sign: The circular hand gesture led to fewer classification errors, achieving an accuracy of
95%.
●​ ‘S’ Sign: With a closed fist formation, the ‘S’ sign presented minimal ambiguity, resulting in a
consistently high accuracy.

In addition to these top-performing signs, other gestures also exhibited satisfactory recognition
accuracy. Signs such as ‘W’, ‘Q’, ‘T’, ‘F’, and ‘M’ were accurately identified with an approximate
accuracy of 89-92%. While these signs involve more intricate finger positioning and occasional


overlaps, the model’s convolutional layers effectively captured the subtle variations, enabling
successful classification.

Figure 5.1: 'C' sign Figure 5.2: 'O' sign

Figure 5.3: 'S' sign Figure 5.4: Blank frame

The blank input, represented as an absence of hand gestures, was also tested to evaluate the
model's robustness in recognizing non-sign scenarios. The system accurately identified blank
frames with an accuracy of approximately 97%, demonstrating its ability to differentiate between
active signs and noise.

\text{Accuracy}_{blank} = \frac{970\,TP + 20\,TN}{970\,TP + 20\,TN + 10\,FP + 30\,FN} \approx 97\%


The system features a user-friendly interface designed for seamless interaction. Upon accessing
the login page, users are prompted to enter their credentials, ensuring secure access to the
application. The login mechanism is straightforward and responsive, contributing to a positive
user experience. Following successful authentication, users are directed to the main web page,
where the sign recognition functionality is displayed in real-time.
The web interface presents a clean and organized layout, displaying the camera feed alongside the
predicted sign and its corresponding text translation. Additional options, such as resetting the
recognition process or accessing previous translations, are also available. The intuitive design
ensures accessibility for both deaf users and non-signers, promoting inclusivity and ease of use.

Figure 5.5: Login Page

Figure 5.6: Home Page

The model showcased consistent performance across diverse testing conditions. While occasional
misclassifications occurred, particularly with ambiguous or overlapping gestures, the multi-layered
prediction approach significantly reduced these errors. The use of separate models for handling
more complex gestures further bolstered the system's reliability. The translation GUI (Figure 5.3)
effectively showcases the real-time gesture detection and translation process, highlighting the
region of interest (ROI) and the live video feed used for input.

Future improvements to the system could include advanced data augmentation techniques,
expanded and more diverse datasets, and continuous learning mechanisms to address challenges
like gesture ambiguity. These enhancements would further minimize misclassification rates and
increase recognition accuracy in more dynamic environments.

The ASL Translator stands out as a practical, user-friendly tool designed to promote inclusivity for
the Deaf community. Its potential applications span multiple sectors, including education,
healthcare, and public services. By enabling seamless communication between sign language users
and non-signers, the system fosters greater social inclusion and accessibility. Continued
development, including the expansion of the gesture library and optimization of the model's
architecture, will ensure its long-term adaptability and success in real-world scenarios. This project
represents a significant step forward in creating accessible and inclusive communication tools
using artificial intelligence and machine learning.

CHAPTER 6

CONCLUSION AND FUTURE SCOPE

6.1​ Conclusion
The development of an American Sign Language (ASL) translator using Convolutional Neural
Networks (CNNs) represents a significant advancement in breaking communication barriers
between the Deaf and hearing communities. By integrating real-time video capture, image
preprocessing, and classification models, the system accurately identifies ASL gestures and
translates them into text. The addition of text-to-speech functionality further enhances
accessibility.

While the project successfully achieves real-time sign recognition and sentence formation, certain
limitations remain. Gesture ambiguity, lighting dependency, and restricted vocabulary pose
challenges to achieving higher accuracy. Additionally, the system's focus on static letter
recognition limits its capability to interpret continuous sign language.

Future improvements could involve developing a unified model for all ASL signs, implementing
advanced hand tracking using tools like TensorFlow, and incorporating Natural Language
Processing (NLP) for sentence-level prediction. Expanding the dataset to include diverse signers
and scenarios will further enhance accuracy and generalization.

Overall, this ASL translator serves as a strong foundation for further research and development.
By addressing current limitations and embracing innovative solutions, the system has the potential
to become an even more effective and inclusive communication tool for the Deaf community.


6.2​ Future Scope

1. Improving Accuracy & Performance

● Larger & More Diverse Dataset: Expanding the dataset with varied conditions will enhance
model accuracy. Adding dynamic gestures and two-handed signs will improve recognition.

●​ Better Preprocessing Techniques: Using background removal, adaptive thresholding, and


data augmentation will optimize gesture detection and model performance.

●​ Use Advanced Deep Learning Models: Replacing CNNs with EfficientNet, MobileNetV3,
or ViTs will boost accuracy. Pose estimation models and attention mechanisms will refine
complex gesture recognition.

2. Adding More Features

●​ Real-time Sentence Formation: NLP models will convert letter sequences into structured
sentences with context-aware word prediction.

●​ Support for ASL Words & Phrases: Expanding to ASL words and phrases using LSTMs will
enable natural and complete translations.

●​ Mobile App Integration: A mobile version using TensorFlow Lite or ML Kit will allow
real-time ASL recognition on smartphones.

●​ Multi-language Support: ASL translation into multiple languages using Google Translate
API will improve global accessibility.

3. Hardware & Wearable Integration

●​ AI-Powered Smart Gloves: Smart gloves with flex sensors will detect hand movements and
convert gestures into text or speech.

●​ AR/VR for ASL Learning: AR-based apps and VR tutorials using Oculus will create an
immersive, gamified ASL learning experience.


4. Accessibility & Real-World Applications

●​ Integration with Voice Assistants: Text-to-speech conversion will enable ASL users to
communicate through Alexa, Google Assistant, or Siri.

●​ Deaf-Mute Communication Tool: A two-way system with ASL recognition and


speech-to-text will bridge the gap between ASL and non-ASL users.

●​ Live ASL to Speech on Video Calls: Integrating ASL recognition with Zoom, Teams, and
Meet will enable real-time ASL-to-speech translation.

References
[1] Hamzah Luqman, "An Efficient Two-Stream Network for Isolated Sign Language Recognition
Using Accumulative Video Motion," 2022.

[2] H. B. D. Nguyen and H. N. Do, "BIM Sign Language Translator Using Machine Learning
(TensorFlow)," 2022.

[3] Lean Karlo, Ronnie O., August C., Maria Abigail B. Pamahoy, et al., "Static Sign Language
Recognition Using Deep Learning," 2022.

[4] Nourdine Herbaz, Hassan El Idrissi, and Abdelmajid Badri, "A Moroccan Sign Language
Recognition Algorithm Using a Convolution Neural Network," 2022.

[5] Neil Buckley, Lewis Sherret, and Emanuele Lindo Secco, "A CNN Sign Language Recognition
System with Single & Double-Handed Gestures," 2022.

[6] J. P. Sahoo, A. J. Prakash, and P. Plawiak, "Real-Time Hand Gesture Recognition Using
Fine-Tuned Convolutional Neural Network," 2022.

[7] J. A. Deja, P. Arceo, D. G. David, P. Lawrence, and R. C. Roque, "MyoSL: A Framework for
Measuring Usability of Two-Arm Gestural Electromyography for Sign Language," in Proc.
International Conference on Universal Access in Human-Computer Interaction, 2018, pp. 146-159.

[8] G. Joshi, S. Singh, and R. Vig, "Taguchi-TOPSIS Based HOG Parameter Selection for Complex
Background Sign Language Recognition," 2020.

