
Project Phase I:

GestureConnect-ASL Translator
By

ALEENA JOHNSON(SCM21CS017)
AINA ARUN(SCM21CS010)
AMIYA SHEREEF(SCM21CS022)
CHRISTY SAJI(SCM21CS040)
Under the Supervision of
Ms. SRUTHY K JOSEPH
Assistant Professor
Department of Computer Science and Engineering
SCMS School of Engineering & Technology, Ernakulam
1. INTRODUCTION
 Introduction to ASL Language Translator
• The ASL Language Translator is an advanced system designed to bridge the
communication gap between hearing-impaired individuals and those unfamiliar with
American Sign Language (ASL).
• Using image capturing techniques, the translator recognizes and translates ASL
gestures into text or speech in real-time.
• The system employs Convolutional Neural Networks (CNN) to analyze hand gestures,
ensuring accurate identification of signs.
• This project aims to make communication more inclusive and accessible for the
hearing-impaired community.
1.INTRODUCTION
 Technology Overview
• The translator starts by capturing hand gestures through a camera. A CNN processes
these images, identifying key features and patterns to interpret the gestures.
• The model is trained using TensorFlow, which allows the system to provide quick and
accurate real-time translation.
• This combination of technologies (image capturing, CNN, and TensorFlow) ensures a
seamless user experience, promoting efficient communication and fostering inclusivity
in everyday interactions (a rough model sketch follows).
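To make the model side concrete, here is a minimal sketch (not the project's finalized
architecture) of a small TensorFlow/Keras CNN that classifies fixed-size gesture images;
the 64x64 grayscale input and the 26 letter classes are illustrative assumptions.

# Minimal sketch of a gesture-classification CNN in TensorFlow/Keras.
# Input size (64x64 grayscale) and 26 output classes are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_gesture_cnn(input_shape=(64, 64, 1), num_classes=26):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),   # learn low-level edge/shape features
        layers.MaxPooling2D(),                     # downsample for translation tolerance
        layers.Conv2D(64, 3, activation="relu"),   # learn higher-level hand-shape patterns
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),  # probability per gesture class
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_gesture_cnn()
model.summary()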
2.PROBLEM STATEMENT
● Communication between hearing-impaired individuals and non-ASL speakers
poses significant challenges.
● Current solutions often require human interpreters or specialized devices, which
may not be accessible or efficient.
● There is a lack of real-time, automated systems that can accurately capture and
interpret ASL gestures.
● The need is for a system that translates ASL into text or speech instantly, enabling
seamless communication.
● This project aims to address these challenges by using image capturing techniques,
CNN, and TensorFlow for an efficient and inclusive translation system.
3.MOTIVATION
● Communication Barrier: Millions of hearing-impaired individuals face difficulties in
daily communication with non-ASL speakers, limiting their ability to interact freely.
● Lack of Accessibility: Existing ASL translation tools are either costly, limited in
availability, or not real-time, making them inefficient in daily use.
● Technological Advancement: With advancements in machine learning and image
processing, there is an opportunity to create an affordable, real-time solution that makes
communication smoother and more inclusive.
● Inclusivity: Developing this system will foster greater inclusivity by enabling hearing-
impaired individuals to communicate effortlessly with the world, promoting equality and
breaking down social barriers.
4.OBJECTIVES
● Real-Time ASL Translation: To develop a system that can accurately and efficiently
translate American Sign Language gestures into text or speech in real-time.
● Leverage Advanced Technologies: To utilize image capturing techniques,
Convolutional Neural Networks (CNN), and TensorFlow for precise gesture recognition
and translation.
● Promote Inclusivity: To create an accessible tool that bridges communication gaps,
making interactions between hearing-impaired individuals and non-ASL speakers more
inclusive and seamless.
5.LITERATURE SURVEY
Paper 1: An Efficient Two-Stream Network for Isolated Sign Language Recognition Using
Accumulative Video Motion, by Hamzah Luqman, 2022
Proposed Methodology:
• Proposes a Hierarchical Sign Learning Model, a trainable deep learning network for sign
language recognition that uses a key posture extractor and accumulative video motion.
• Utilizes three specialized networks: DMN, AMN, and SRN.
• Evaluated on the KArSL-190 and KArSL-502 Arabic Sign Language datasets.
• Experiments were conducted in two modes: signer dependent and signer independent.
Drawbacks:
• Limited generalization to different signers, especially in signer-independent mode.
• Potential overfitting to the extracted postures, leading to reduced recognition accuracy
on dynamic, nuanced gestures.
• Struggled to identify digits and individual letters, because similar signs differ only by
the slightest variation in finger positions.
5.LITERATURE SURVEY
Paper 2: BIM Sign Language Translator Using Machine Learning (TensorFlow), by Herrick Yeap
Han Lin and Norhanifah Murli, 2022
Proposed Methodology:
• An offline BIM sign language translator that uses a camera system to convert sign gestures
into text in multiple languages.
• Uses TensorFlow, Python, OpenCV, and Qt as the core development libraries.
• Develops an interactive GUI using PyQt, featuring a video-feed display area that captures
real-time gestures.
• Using OpenCV, the model captures sign gestures through an HD camera, processes them, and
stores them locally as JPG images.
• The system was distributed as a standalone executable file to ensure simplicity for users.
Drawbacks:
• Unable to sense dynamic signs; sensitive to background lighting and the signer's skin color.
• Lack of diversified training data, which diminished accuracy.
• The sign language dictionary was limited, which restricted translation capabilities.
5.LITERATURE SURVEY
Paper 3: Static Sign Language Recognition Using Deep Learning, by Lean Karlo, Ronnie O.,
August C., 2019
Proposed Methodology:
• Uses a CNN-based model for sign language recognition with real-time sign capture using
Keras, together with a skin-color modeling technique.
• A vision-based approach using Convolutional Neural Networks (CNNs) is employed for
real-time recognition of static ASL gestures.
• Data collection included capturing a diverse set of hand-gesture images in a controlled
environment.
• Images were then enhanced by resizing and converting from RGB to HSV color space for
better skin detection.
• The system achieved an average accuracy of 90.04% for letter recognition.
Drawbacks:
• Static gesture focus: the current model is limited to recognizing static gestures.
• Recognition accuracy was affected by lighting; optimal conditions led to better performance.
• The system struggled with complex backgrounds, indicating that simpler backgrounds are
essential for accurate recognition.
5.LITERATURE SURVEY
Paper 4: A Moroccan Sign Language Recognition Algorithm Using a Convolution Neural Network,
by Nourdine Herbaz, Hassan El Idrissi, 2022
Proposed Methodology:
• Utilizes a CNN and an image-preprocessing-based algorithm to classify single- and
double-handed static sign language.
• Combines convolution and max pooling for real-time classification and localization.
• Uses a database of 20 stored signs (letters and numbers) of Moroccan Sign Language.
• Input images captured from the webcam are first converted from RGB to grayscale.
• The hand gesture is then extracted from the processed image using binary image conversion
techniques.
Drawbacks:
• The model is primarily effective for static signs, making it less suitable for dynamic or
continuous gestures that require spatiotemporal analysis.
• Highly dependent on the quality of the dataset, with significantly lower accuracy when
datasets vary from the norm.
5.LITERATURE SURVEY
Paper 5: A CNN sign language recognition system with single & double-handed gestures, by
Neil Buckley, Lewis Sherret, 2022
Proposed Methodology:
• A webcam-based static sign language recognition system that uses a CNN architecture to
translate BSL.
• Uses TensorFlow, Keras, and OpenCV for image capture and analysis.
• The design was implemented in the Python programming language.
• The graphical user interface was implemented using the OpenCV library.
• The testing dataset consisted of a total of 2,375 images, i.e. 125 images per gesture.
Drawbacks:
• Limited to static signs; unable to recognize dynamic gestures, sequences, or motion.
• Sensitivity to hand positioning: minor deviations in hand position or orientation may
result in misclassification.
• Lower recognition rate for signs outside the training dataset.
6.PROPOSED ARCHITECTURE
 Block Diagram
7.DESCRIPTION
1.Data Collection Phase
 Capturing Images:
This block initiates the data collection process by capturing images of hand gestures. A camera
or digital imaging device is used to acquire multiple frames of each sign language gesture,
ensuring the system captures sufficient visual data for each gesture.
To create a robust dataset, multiple images are often captured from slightly different angles,
lighting conditions, and backgrounds to account for variability in real-world conditions.
 Collecting the Images:
After capturing, the images are collected and organized into a structured dataset. This dataset is
typically labeled according to the corresponding gestures, which could be letters, numbers, or
specific signs in American Sign Language.
Organizing images into labeled classes (e.g., ‘A’, ‘B’, ‘1’, etc.) is crucial as it enables supervised
learning in the training phase.
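As an illustration of this phase, the following sketch saves labelled gesture frames captured
from a camera. The dataset/<label> folder layout, the use of a default webcam, and the key
bindings are assumptions for illustration, not the project's actual collection tool.

# Illustrative data-collection sketch: press SPACE to save a frame of the current
# gesture into a class-labelled folder, ESC to quit.
import os
import cv2

label = "A"                               # assumed class label for this capture session
out_dir = os.path.join("dataset", label)  # assumed layout: dataset/<label>/img_<n>.jpg
os.makedirs(out_dir, exist_ok=True)

cap = cv2.VideoCapture(0)                 # default webcam
count = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imshow("capture", frame)
    key = cv2.waitKey(1) & 0xFF
    if key == 32:                         # SPACE: save the current frame
        cv2.imwrite(os.path.join(out_dir, f"img_{count}.jpg"), frame)
        count += 1
    elif key == 27:                       # ESC: stop capturing
        break
cap.release()
cv2.destroyAllWindows()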
7.DESCRIPTION
 Process the Gestures:
The preprocessing step involves several image-processing techniques that prepare the images
for feature extraction (a minimal code sketch follows the list):
• Resizing: Ensures all images are of uniform dimensions, which helps the neural network
handle the images more effectively.
• Grayscale Conversion: Converts images from RGB to grayscale to reduce the
computational load by focusing on intensity values rather than color.
• Thresholding: A binary thresholding operation separates the hand gesture (foreground) from
the background by setting a threshold value, turning pixels above this threshold white and
those below it black. This segmentation highlights the hand's shape and reduces background
noise.
• Normalization: This step may also involve normalizing pixel values to bring all images to a
similar range, facilitating more consistent neural network performance.
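A minimal OpenCV/NumPy sketch of these steps is shown below; the 64x64 target size and the
threshold value of 120 are illustrative assumptions rather than tuned project values.

# Minimal preprocessing sketch: resize, grayscale, binary threshold, normalize.
import cv2
import numpy as np

def preprocess(path, size=(64, 64), thresh=120):
    img = cv2.imread(path)                          # BGR image from disk
    img = cv2.resize(img, size)                     # uniform dimensions
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)    # drop colour, keep intensity
    _, binary = cv2.threshold(gray, thresh, 255,
                              cv2.THRESH_BINARY)    # hand (white) vs background (black)
    return binary.astype(np.float32) / 255.0        # normalize pixel values to [0, 1]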
7.DESCRIPTION
2. Database Phase
 Reception of New Gestures:
• In this block, new hand gesture images (test data) are fed into the system for recognition.
These images are similar in structure to the training images, having undergone the same
preprocessing steps.
• This stage is critical for real-time applications where new gestures are continuously
introduced for interpretation.
 Extraction of Gesture Characteristics:
• Feature extraction is a core process where the system identifies essential gesture features using
image processing and deep learning techniques.
• Edge Detection (e.g., using Sobel or Canny filters) and contour extraction can be applied
to emphasize the outline of the hand, capturing unique patterns like finger positions and
hand orientation (a minimal sketch follows).
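A possible form of this step, sketched with OpenCV; the Canny thresholds and the assumption
that the largest contour corresponds to the hand are illustrative choices.

# Edge map plus the dominant contour of an 8-bit grayscale gesture image.
import cv2

def hand_outline(gray):
    edges = cv2.Canny(gray, 50, 150)                       # Canny edge map of the hand
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None, edges
    hand = max(contours, key=cv2.contourArea)              # assume largest contour = hand
    return hand, edges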
7.DESCRIPTION
 Training/Testing Images:
• This block represents a repository where both training and testing images are stored in the database.
The training images contain labeled data that the Convolutional Neural Network (CNN) uses to
learn the patterns and associations between image features and specific gestures.
• Testing images, also preprocessed and labeled, are used to evaluate the model's accuracy and ability
to generalize to unseen data. The database ensures that the system has access to a large, diverse set of
images for both training and testing.
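One way such a repository could be loaded for training, sketched here under the assumption of
a dataset/<class>/image.jpg folder layout and the CNN defined in the earlier sketch:

# Illustrative sketch: build training/validation datasets from labelled folders.
# The 80/20 split, image size, and batch size are assumptions, not project settings.
import tensorflow as tf

train_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset", validation_split=0.2, subset="training", seed=42,
    color_mode="grayscale", image_size=(64, 64), batch_size=32)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset", validation_split=0.2, subset="validation", seed=42,
    color_mode="grayscale", image_size=(64, 64), batch_size=32)

train_ds = train_ds.map(lambda x, y: (x / 255.0, y))   # scale pixels to [0, 1]
val_ds = val_ds.map(lambda x, y: (x / 255.0, y))

# 'model' would be the CNN sketched earlier; the epoch count is arbitrary.
# model.fit(train_ds, validation_data=val_ds, epochs=10)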
3. Recognition Phase
 Make a Decision:
After processing and extracting features from a new gesture, the system uses a decision-making
algorithm (typically the CNN classifier) to determine the gesture’s identity.
7.DESCRIPTION
 Identify Gestures:
This block is responsible for identifying the gesture by comparing the features of the new gesture image
with known patterns in the model. The model uses a classification algorithm to match the gesture to one
of the pre-defined categories.
For example, a Softmax layer at the end of the CNN outputs a probability distribution over all gesture
classes, and the class with the highest probability is selected as the identified gesture.
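A minimal sketch of this decision step, reusing the preprocess helper and the CNN from the
earlier sketches; the A-Z class list and the file name are illustrative assumptions.

# Feed a new preprocessed gesture through the CNN and take the most probable class.
import numpy as np

CLASS_NAMES = [chr(c) for c in range(ord("A"), ord("Z") + 1)]   # assumed A-Z classes

binary = preprocess("new_gesture.jpg")                          # helper sketched earlier
probs = model.predict(binary[np.newaxis, ..., np.newaxis])[0]   # softmax distribution
best = int(np.argmax(probs))
print(f"Identified gesture: {CLASS_NAMES[best]} (p = {probs[best]:.2f})")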
 Comparison between Gestures:
To ensure accuracy, the system may perform a comparison step where it measures the similarity between
the new gesture’s features and those of known gestures stored in the database. Techniques like Euclidean
Distance or Cosine Similarity can quantify the similarity between feature vectors.
This comparison can act as a final validation step to confirm that the recognized gesture is indeed the
closest match to a known gesture in the database.
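Both measures can be written directly in NumPy, as in the sketch below; the feature vectors
are assumed to come from an intermediate CNN layer or another extractor, and the acceptance
threshold is illustrative.

# Similarity measures between two 1-D feature vectors.
import numpy as np

def euclidean_distance(a, b):
    return float(np.linalg.norm(a - b))        # smaller distance = more similar

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))  # near 1 = similar

# Example validation step: accept the prediction only if the new gesture is close enough
# to the stored reference vector for that class (the 0.9 threshold is an assumption).
# if cosine_similarity(new_features, reference_features) > 0.9: ...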
8.NOVELTY
• Real-Time Translation
• Achieving real-time ASL-to-text translation can be challenging, especially with the
complexity of hand gestures. Implementing an efficient CNN architecture that can process
gestures quickly allows for a smooth, immediate translation experience.
• User-Friendly Interface
• Designing an interface that is accessible and intuitive for both Deaf users and non-signers
enhances usability. For example, features like real-time feedback, adjustable speed, or
text-to-speech output (for hearing users) add value and make the tool versatile.
• Minimal Hardware Requirements
• Developing a model that can work effectively on low-power devices like smartphones or
tablets without specialized hardware makes the ASL translator accessible to a broader
audience.
• Data Efficiency
• Employing methods like transfer learning or data augmentation to achieve high accuracy
with limited labeled ASL data is novel, as sign language datasets are often small and
difficult to gather (a rough sketch of these ideas follows this list).
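A rough sketch of these data-efficiency ideas, combining light augmentation with a frozen
pretrained backbone; MobileNetV2 and the 96x96 RGB input are assumptions chosen to suit
low-power devices, not the project's finalized design.

# Data augmentation plus transfer learning with a frozen pretrained backbone.
import tensorflow as tf
from tensorflow.keras import layers, models

augment = tf.keras.Sequential([
    layers.RandomRotation(0.1),          # small rotations
    layers.RandomZoom(0.1),              # slight scale changes
    layers.RandomTranslation(0.1, 0.1),  # shifts in hand position
])

base = tf.keras.applications.MobileNetV2(input_shape=(96, 96, 3),
                                         include_top=False, weights="imagenet")
base.trainable = False                   # reuse pretrained features, train only the head

model = models.Sequential([
    layers.Input(shape=(96, 96, 3)),
    augment,                                      # active only during training
    layers.Rescaling(1.0 / 127.5, offset=-1),     # MobileNetV2 expects inputs in [-1, 1]
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(26, activation="softmax"),       # assumed 26 ASL letter classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])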
9.TIMELINE
Phase 1: Planning & Setup
Dec 15 - Dec 20: Finalize project scope, objectives, and requirements.
Dec 21 - Dec 26: Research ASL datasets; gather and preprocess data.
Dec 27 - Dec 31: Set up the development environment, libraries (TensorFlow, OpenCV), and
initial project structure.

Phase 2: Model Development & Training
Jan 1 - Jan 7: Build the initial data pipeline and data augmentation for better accuracy.
Jan 8 - Jan 21: Develop and fine-tune the machine learning model to recognize basic ASL signs.
Jan 22 - Jan 31: Test and validate the model's performance on test data.
9.TIMELINE
Phase 3: Real-time Integration & Testing
Feb 1 - Feb 10: Integrate the model with OpenCV for real-time webcam input.
Feb 11 - Feb 20: Conduct user testing, resolve errors, and optimize model performance.
Feb 21 - Feb 29: Perform final tests and prepare the model for deployment.

Phase 4: Documentation & Final Touches
Mar 1 - Mar 15: Write technical documentation covering setup, code structure, and
functionalities.
Mar 16 - Mar 25: Draft user documentation for setup, usage, and troubleshooting.
Mar 26 - Mar 31: Conduct a final review and prepare the project report for submission.
10.CONCLUSION
• Advancing the development of an ASL-to-text translator powered by Convolutional Neural
Networks (CNNs).
• Current achievements include effective data collection, preprocessing, and preliminary
gesture recognition.
• Initial results show promise for bridging communication gaps for Deaf and hard-of-hearing
communities.
• Ongoing efforts focus on:
• Expanding the ASL gesture dataset for comprehensive coverage.
• Enhancing model accuracy and speed with more advanced CNN architectures.
• Integrating real-time feedback for interactive and practical usage.
• Project aims to deliver a scalable, accessible, and user-friendly solution that promotes
inclusivity and accessibility.
11.REFERENCES
• Hamzah Luqman, "An Efficient Two-Stream Network for Isolated Sign Language Recognition
Using Accumulative Video Motion," 2022.

• Herrick Yeap Han Lin and Norhanifah Murli, "BIM Sign Language Translator Using Machine
Learning (TensorFlow)," 2022.

• Nourdine Herbaz, Hassan El Idrissi, and Abdelmajid Badri, "A Moroccan Sign Language
Recognition Algorithm Using a Convolution Neural Network," 2022.

• Lean Karlo, Ronnie O., August C., Maria Abigail B. Pamahoy et al., "Static Sign Language
Recognition Using Deep Learning," 2019.

• Neil Buckley, Lewis Sherret, and Emanuele Lindo Secco, "A CNN sign language recognition
system with single & double-handed gestures," 2022.
11.REFERENCES
• H. B. D. Nguyen and H. N. Do, "Deep learning for American sign language fingerspelling
recognition system," 2019.
• J. P. Sahoo, A. J. Prakash, and P. Plawiak, "Real Time Hand Gesture Recognition using
Fine-Tuned Convolution Neural Network," 2022.
• J. A. Deja, P. Arceo, D. G. David, P. Lawrence, and R. C. Roque, "MyoSL: A Framework for
measuring usability of two-arm gestural electromyography for sign language," in Proc.
International Conference on Universal Access in Human-Computer Interaction, 2018,
pp. 146-159.
• G. Joshi, S. Singh, and R. Vig, "Taguchi-TOPSIS based HOG parameter selection for complex
background sign language recognition," 2020.
THANK YOU!