
SignTone: Two-Way Sign Language Recognition and Multilingual Interpreter System using Deep Learning


• Problems Identified
• The deaf community faces several challenges that can significantly impact their daily lives, communication, and overall well-being.

• Communication methods such as sign language may not be universally understood, limiting interaction with those who do not know sign language.

• Deaf individuals may face challenges in accessing real-time communication tools, making it difficult to engage spontaneously in conversations.

• While technology has advanced, some deaf individuals may still lack access to affordable and effective assistive devices or software.

• Addressing these challenges requires a multifaceted approach involving technological innovation and improved accessibility.
Aim and Objective
Aim
The aim of this project is to develop a Deaf Companion System that utilizes AI technologies,
particularly Temporal Convolutional Networks (TCNs), to enhance communication between
individuals with hearing and mute disabilities and the broader community.

Objectives
• To develop and build the Deaf Companion System.
• To enable two-way communication between deaf and non-deaf individuals.
• To design and develop a high-performance sign recognition module.
• To generate high-quality speech from the text.
• To transform the text from a non-deaf individual into sign language through an Avatar.
Abstract
Deaf individuals face significant difficulties in communicating with others in society, as only a small
number of them possess knowledge of and utilize sign language for communication.

In general, deaf individuals use sign language or text to interact or communicate with others. While
these methods are effective within the deaf community, they face significant limitations when trying to
communicate with the hearing community.

This can lead to isolation, frustration, and discrimination. The main contribution of this project is to develop and build the Deaf Companion System (DCS) to enable two-way communication between deaf and non-deaf people in Indian Sign Language using a Temporal Convolutional Network (TCN).

The proposed system has three modules: the sign recognition module (SRM), which recognizes the signs of a deaf individual and is integrated with the multilingual interpreter for sign translation; the speech recognition and synthesis module (SRSM), which uses a Hidden Markov Model to process the speech of a non-deaf individual and convert it to text; and an Avatar module (AM), which generates and performs the sign corresponding to the non-deaf speech.
Introduction
Effective communication is a fundamental aspect of human interaction, enabling the exchange of
information, ideas, and emotions.

However, individuals with hearing and mute disabilities often face significant challenges in
expressing themselves and understanding others, leading to communication barriers.

Sign language has traditionally served as a crucial means of communication for the deaf community,
but its interpretation remains challenging for non-signers.

In response to these challenges, this project introduces the development of a Deaf Companion
System, leveraging advanced technologies such as Temporal Convolutional Networks (TCNs) to
enhance communication between deaf individuals and the wider community.

The system aims to bridge the gap by recognizing sign language gestures, converting spoken
language to text, and generating realistic sign language avatars.
Existing System
The traditional systems for communication and education for the deaf community have often
relied on established methods, but these methods may have limitations in addressing the
diverse needs of individuals with hearing and mute disabilities.
• Assistive Devices
Hearing Aids and Cochlear Implants: Traditional assistive devices include hearing aids and cochlear implants, which aim to improve auditory perception.
• Manual Communication Tools
Pen and Paper: Traditional tools such as writing or using pen and paper are often employed for basic communication.
• Interpreters
Sign Language Interpreters: In various situations, sign language interpreters are employed to bridge communication gaps between deaf and hearing individuals.
Existing Algorithms
Support Vector Machines (SVMs)
SVMs have been employed for classification tasks in SLR. They work well with high-dimensional
feature vectors extracted from sign language images.
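As a rough illustration of how such an SVM baseline can be set up, the hedged scikit-learn sketch below trains an RBF-kernel SVM on flattened pixel vectors; the CSV file name, the 80/20 split, and the hyperparameters are illustrative assumptions rather than values taken from any specific study.

```python
# Hedged sketch: SVM classification over flattened sign-image feature vectors.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

df = pd.read_csv("sign_mnist_train.csv")          # hypothetical local copy of the Kaggle CSV
y = df["label"].values
X = df.drop(columns=["label"]).values / 255.0     # 784 pixel features per 28x28 image

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
clf = SVC(kernel="rbf", C=10, gamma="scale").fit(X_train, y_train)
print("SVM accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```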

Ensemble Learning
Ensemble methods, such as Random Forests and Gradient Boosting, can be applied to combine
multiple SLR models to improve overall accuracy and robustness.

Gesture Recognition by Learning from Poses (GRBP)


GRBP is a method that focuses on learning spatial relations between body joint poses for sign
language recognition. It utilizes pose features extracted from skeletal data.
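GRBP itself is not reproduced here, but the sketch below illustrates the general idea of pose-based features: MediaPipe Pose extracts body joint landmarks, and pairwise joint distances are flattened into a feature vector that a downstream classifier could consume. The image path and the choice of distance features are assumptions for illustration only.

```python
# Illustrative sketch of pose-derived features for sign recognition.
import cv2
import numpy as np
import mediapipe as mp

mp_pose = mp.solutions.pose

def joint_distance_features(image_bgr):
    # Run MediaPipe Pose on a single image and collect (x, y) joint landmarks
    with mp_pose.Pose(static_image_mode=True) as pose:
        result = pose.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
    if not result.pose_landmarks:
        return None
    pts = np.array([[lm.x, lm.y] for lm in result.pose_landmarks.landmark])
    # Pairwise Euclidean distances between all joints, flattened into one vector
    diffs = pts[:, None, :] - pts[None, :, :]
    return np.linalg.norm(diffs, axis=-1).flatten()

features = joint_distance_features(cv2.imread("sample_sign.jpg"))  # hypothetical image
```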
Disadvantages
• Traditional systems lead to limited accessibility, hindering communication with non-signers.
• Real-time processing challenges persist, especially for resource-constrained devices.
• Handling natural variability in sign language gestures remains a challenge for rule-based models.
• Existing algorithms lack adaptability to diverse sign language variations.
• The decision-making process of these models is difficult to interpret.
• They face challenges in adapting to unseen variations or emerging sign language expressions.
• Modeling the context and semantic relationships between signs remains challenging.
Proposed System
The proposed system for the project, titled "Deaf Companion System," is designed to
revolutionize communication for individuals with hearing and mute disabilities.

Deaf Companion System


The proposed system aims to enhance communication for individuals with hearing and mute
disabilities through innovative technologies.

Two-Way Communication
Implementation of a comprehensive two-way communication system, fostering seamless
interaction between deaf individuals and the broader community.

SignNet Model Architecture


Development of the SignNet Model, combining Convolutional Neural Networks (CNN) and
Temporal Convolutional Networks (TCN) for robust sign language recognition.
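A minimal Keras sketch of such a CNN + TCN hybrid is shown below: a per-frame CNN extracts spatial features and a stack of dilated, causal 1-D convolutions models the temporal dynamics of the sign. The frame count, image size, layer widths, and class count are illustrative assumptions, not the actual SignNet configuration.

```python
# Hedged sketch of a CNN + TCN hybrid ("SignNet"-style) in Keras.
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_FRAMES = 30      # frames per sign clip (assumption)
IMG_SIZE = 64        # preprocessed frame size (assumption)
NUM_CLASSES = 24     # number of sign classes (assumption)

# Per-frame CNN feature extractor
frame_cnn = models.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(IMG_SIZE, IMG_SIZE, 1)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
])

# Temporal Convolutional Network: stacked causal, dilated 1-D convolutions
inputs = layers.Input(shape=(NUM_FRAMES, IMG_SIZE, IMG_SIZE, 1))
x = layers.TimeDistributed(frame_cnn)(inputs)          # (frames, 128) feature sequence
for dilation in (1, 2, 4):
    x = layers.Conv1D(64, kernel_size=3, padding="causal",
                      dilation_rate=dilation, activation="relu")(x)
x = layers.GlobalAveragePooling1D()(x)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

signnet = models.Model(inputs, outputs)
signnet.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                metrics=["accuracy"])
signnet.summary()
```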
Proposed System
Sign Language Recognition Module (SRM)
Integration of the SRM, utilizing the SignNet Model to accurately interpret sign language
gestures in real-time, ensuring contextual understanding.

Speech Recognition and Synthesis Module (SRSM)


Incorporation of an SRSM employing Hidden Markov Models to convert non-deaf speech into
text, facilitating comprehensive communication.

Avatar Module (AM)


Creation of an AM to generate realistic sign language avatars synchronized with non-deaf
speech, enhancing visual communication.
Advantages
• Precise sign language gesture recognition with superior accuracy.
• Enables real-time recognition, fostering dynamic communication interactions for deaf individuals and non-signers.
• Incorporates both sign language recognition and speech-to-text conversion, enabling seamless communication between deaf and non-deaf individuals.
• Avatar Module generates realistic sign language avatars synchronized with non-deaf speech, enhancing visual communication effectiveness.
• System trained on Indian Sign Language showcases cultural sensitivity, addressing specific needs and nuances of the target user group.
• Web-based interface ensures ease of deployment for the SignNet Model, providing a user-friendly experience for diverse users.
• Adaptable for widespread use in various environments and contexts.
System Architecture
[Architecture diagram] The DCS Admin logs in, builds and trains the SignNet Model, and handles system maintenance and user management. The processing pipeline covers sign language recognition, multilingual interpretation, speech recognition and synthesis, and avatar generation. A deaf or mute user's input sign is delivered to the non-deaf user as text and voice, while the non-deaf user's speech is delivered back to the deaf user as a sign-performing avatar.

Modules
1. Deaf Companion System
2. System User Dashboard
2.1. Admin
2.2. Deaf
2.3. Non-Deaf
3. SignNet Model: Build and Train
3.1. Import Dataset
3.2. Preprocessing
3.3. Segmentation
3.4. Feature Extraction
3.5. Classification
3.6. Build and Train
3.7. Deploy Model
Modules
4. Sign Language Recognition
4.1. Live Video with Sign
4.2. Recognize Sign
4.3. Multilanguage Interpretation
5. Speech Recognition and Synthesis
6. Avatar Generation
1. Deaf Companion System
The Deaf Companion System is designed to provide a groundbreaking solution for individuals with hearing and mute disabilities.

The User Authentication module establishes secure access, while the Dashboard serves as a central hub for seamless navigation.

The Sign Language Recognition interface allows users to input gestures for real-time
interpretation, complemented by the Speech-to-Text interface for comprehensive
communication.

The Avatar Customization module enables personalization, and language, cultural, and
accessibility settings enhance adaptability.

The Admin Dashboard provides tools for model training, monitoring, and system management.
This design ensures a holistic and accessible platform for individuals with hearing and mute
disabilities.
2. System User Dashboard
2.1. Admin

Login
Admin authentication for secure access to the system's administrative functions.

Build and Train SignNet Model


Dedicated functionality for the admin to initiate the building and training process of the SignNet
Model.

User Management
Admin-exclusive capabilities to manage user profiles, permissions, and system maintenance
tasks.
2.2. Deaf User

Show Sign into the Webcam


User interface allowing deaf individuals to display sign language gestures through their webcam
for real-time interpretation.

Deaf View Avatar with Predicted Sign


Deaf users receive visual representation through avatars synchronized with the predicted sign
language gestures based on the spoken words.

2.3. Non-Deaf User

Speak Using Mic


User interface enabling non-deaf individuals to communicate verbally using a microphone.

Non-Deaf Receive
View text representations and hear voice outputs corresponding to the predicted sign language
gesture.
3. SignNet Model: Build and Train
In the process of building and training the SignNet Model for the Deaf Companion System,
several essential steps are followed to ensure accuracy and effectiveness.

3.1. Import Dataset


In this initial phase, a diverse dataset is gathered and imported to serve as the foundational
training data for the SignNet Model. This dataset comprises high-quality images representing
various sign language gestures.

3.2. Preprocessing
This step involves a series of preprocessing tasks to enhance the dataset quality. It includes
standardizing image dimensions, converting images to grayscale for simplicity, applying a Gabor
Filter for noise reduction, and binarizing images to facilitate effective feature extraction.
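A hedged OpenCV sketch of this preprocessing chain is given below; the 64x64 target size, the Gabor kernel parameters, and the use of Otsu thresholding for binarization are illustrative choices rather than the project's exact settings.

```python
# Hedged sketch of the preprocessing chain: resize, grayscale, Gabor filter, binarize.
import cv2

def preprocess(frame, size=64):
    frame = cv2.resize(frame, (size, size))                  # standardize dimensions
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)           # grayscale conversion
    gabor = cv2.getGaborKernel((9, 9), 3.0, 0, 8.0, 0.5)     # Gabor kernel (illustrative params)
    filtered = cv2.filter2D(gray, cv2.CV_8U, gabor)          # noise-reduced response
    _, binary = cv2.threshold(filtered, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # Otsu binarization
    return binary

image = cv2.imread("sample_sign.jpg")                         # hypothetical input image
mask = preprocess(image)
```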

3.3. Segmentation
Employing a Region Proposal Network (RPN), this step focuses on identifying and isolating
distinct regions within the images. Segmentation is crucial for improving the model's ability to
accurately recognize and interpret individual sign language gestures.
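Training a full Region Proposal Network is beyond a short example, so the sketch below uses a deliberately simplified stand-in: skin-color masking and contour bounding boxes with OpenCV to propose candidate hand regions. The HSV range and the area threshold are rough, untuned assumptions.

```python
# Simplified stand-in for a learned RPN: propose candidate sign regions by
# skin-color masking and contour bounding boxes.
import cv2
import numpy as np

def propose_regions(frame, min_area=1500):
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Rough skin-tone range in HSV (illustrative values, not tuned)
    mask = cv2.inRange(hsv, np.array([0, 30, 60]), np.array([25, 180, 255]))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) > min_area]

frame = cv2.imread("sample_sign.jpg")                 # hypothetical input frame
for (x, y, w, h) in propose_regions(frame):
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
```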
3.4. Feature Extraction
Implementing a Fully Connected Layer, this phase captures key features and nuances of the sign
gestures.

3.5. Classification
Utilizing a Pooling Layer, this step categorizes the extracted features, aiding in the identification
of specific sign language gestures based on the captured features.

3.6. Build and Train


This phase involves the development and training of the SignNet Model using Convolutional
Neural Network (CNN) architecture. The model is trained on the preprocessed and segmented
dataset, with parameters adjusted for optimal performance.
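A minimal training sketch under these assumptions is shown below; it assumes the preprocessed frames and labels have already been saved as NumPy arrays (hypothetical file names) and uses a small CNN with illustrative hyperparameters rather than the project's tuned values.

```python
# Hedged sketch of the build-and-train step on preprocessed 28x28 sign images.
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

X = np.load("signs_preprocessed.npy")    # hypothetical preprocessed image array (N, 28, 28, 1)
y = np.load("signs_labels.npy")          # hypothetical integer labels

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(25, activation="softmax"),   # labels 0-24 (assumption)
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=10, batch_size=64)
model.save("signnet_cnn.h5")
```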

3.7. Deploy Model


The final step is the integration of the trained SignNet Model into the Deaf Companion System
Web App. This integration allows for real-time sign language recognition, contributing to a
seamless and inclusive communication experience for individuals with hearing and mute
disabilities.
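One plausible way to wire the trained model into the Flask web app is sketched below; the route name, JSON payload format, saved-model path, and label mapping are all assumptions for illustration.

```python
# Hedged sketch of serving the trained SignNet model from the Flask web app.
import numpy as np
import tensorflow as tf
from flask import Flask, request, jsonify

app = Flask(__name__)
model = tf.keras.models.load_model("signnet_cnn.h5")    # hypothetical saved model
LABELS = [chr(ord("A") + i) for i in range(25)]         # illustrative label map

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a flattened 28x28 grayscale frame in the JSON body (assumption)
    frame = np.array(request.json["pixels"], dtype="float32").reshape(1, 28, 28, 1) / 255.0
    probs = model.predict(frame)[0]
    return jsonify({"sign": LABELS[int(np.argmax(probs))],
                    "confidence": float(probs.max())})

if __name__ == "__main__":
    app.run(debug=True)
```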
4. Sign Language Recognition
4.1. Live Video with Sign
This feature facilitates real-time communication for deaf users who can express sign language
gestures through their webcams. The live video feed captures the dynamic nature of sign
language, allowing users to convey messages seamlessly.
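A minimal sketch of this live-capture step is shown below, using OpenCV for the webcam feed and MediaPipe Hands to extract and draw hand landmarks that can then be passed to the recognizer; the confidence thresholds are illustrative.

```python
# Hedged sketch: capture the live webcam feed and extract hand landmarks.
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
cap = cv2.VideoCapture(0)                       # default webcam

with mp_hands.Hands(max_num_hands=2, min_detection_confidence=0.5) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if result.multi_hand_landmarks:
            for hand in result.multi_hand_landmarks:
                mp.solutions.drawing_utils.draw_landmarks(
                    frame, hand, mp_hands.HAND_CONNECTIONS)
        cv2.imshow("Live sign input", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()
```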

4.2. Recognize Sign


Employing Temporal Convolutional Networks (TCN) integrated with the SignNet Model, this step
focuses on the recognition and interpretation of live sign language gestures. The TCN enhances
temporal modeling, ensuring the model captures the dynamic nature of signs over time for
accurate interpretation.
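In practice this amounts to classifying a sliding window of recent frames; the hedged sketch below buffers per-frame features and runs the trained model on the full window. The window length, feature shape, and model path are assumptions.

```python
# Hedged sketch: sliding-window inference over buffered per-frame features.
from collections import deque
import numpy as np
import tensorflow as tf

WINDOW = 30                                               # frames per clip (assumption)
signnet = tf.keras.models.load_model("signnet_tcn.h5")    # hypothetical trained CNN+TCN model
buffer = deque(maxlen=WINDOW)

def on_new_frame(frame_features):
    """frame_features: preprocessed per-frame array, e.g. a 64x64x1 image."""
    buffer.append(frame_features)
    if len(buffer) < WINDOW:
        return None                                       # not enough temporal context yet
    clip = np.expand_dims(np.stack(buffer), axis=0)       # shape (1, WINDOW, ...)
    probs = signnet.predict(clip, verbose=0)[0]
    return int(np.argmax(probs)), float(probs.max())
```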

4.3. Multilanguage Interpretation


The Multilanguage Interpretation feature broadens the system's accessibility by allowing
interpretation of sign language gestures in multiple languages. This ensures inclusivity and
accommodates users with diverse linguistic preferences, making the system adaptable to a
global audience.
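As a deliberately simple illustration of this step, the sketch below maps a recognized sign gloss to phrases in several output languages via a lookup table; a real deployment would presumably use a translation model or service instead, and the entries shown are illustrative only.

```python
# Toy illustration of multilingual interpretation via a gloss lookup table.
TRANSLATIONS = {
    "HELLO":  {"en": "Hello",     "hi": "Namaste",    "ta": "Vanakkam"},
    "THANKS": {"en": "Thank you", "hi": "Dhanyavaad", "ta": "Nandri"},
}

def interpret(gloss, language="en"):
    entry = TRANSLATIONS.get(gloss.upper())
    return entry.get(language, entry["en"]) if entry else gloss

print(interpret("hello", "hi"))   # -> "Namaste"
```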
5. Speech Recognition and Synthesis
5.1. Speech Recognition
The Speech Recognition Module is designed to convert non-deaf speech into text, facilitating
seamless communication with deaf individuals. Using Hidden Markov Models (HMM), the
module captures the nuances of spoken language, translating them into textual representations.
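A hedged sketch of the HMM approach for isolated words is given below, using librosa for MFCC features and hmmlearn for Gaussian HMMs: one model is trained per vocabulary word and the best-scoring model wins at test time. The file names, vocabulary, and model sizes are assumptions.

```python
# Hedged sketch: isolated-word speech recognition with per-word Gaussian HMMs.
import numpy as np
import librosa
from hmmlearn import hmm

def mfcc_sequence(path):
    audio, sr = librosa.load(path, sr=16000)
    return librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13).T    # (frames, 13)

def train_word_model(paths, n_states=5):
    seqs = [mfcc_sequence(p) for p in paths]
    model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=20)
    model.fit(np.vstack(seqs), lengths=[len(s) for s in seqs])
    return model

# Hypothetical training clips per vocabulary word
models = {"hello": train_word_model(["hello_01.wav", "hello_02.wav"]),
          "thanks": train_word_model(["thanks_01.wav", "thanks_02.wav"])}

test = mfcc_sequence("utterance.wav")                            # hypothetical test clip
recognized = max(models, key=lambda w: models[w].score(test))
print("Recognized word:", recognized)
```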

5.2. Speech Synthesis


The Speech Synthesis Module generates spoken language output based on the recognized text,
enhancing communication for non-deaf individuals. Leveraging text-to-speech (TTS) synthesis
techniques, the module creates natural-sounding spoken output. The synthesized voice is then
synchronized with predicted sign language gestures, providing a comprehensive and engaging
communication experience for users.
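A minimal text-to-speech sketch is shown below using the offline pyttsx3 package; this is an assumption, since the project's package list does not name a specific TTS engine, and the rate value is illustrative.

```python
# Hedged sketch: offline text-to-speech with pyttsx3 (assumed TTS engine).
import pyttsx3

engine = pyttsx3.init()
engine.setProperty("rate", 160)          # speaking speed in words per minute (illustrative)
engine.say("Hello, how can I help you today?")
engine.runAndWait()
```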
6. Avatar Generation
The Avatar Generation module within the Deaf Companion System is designed to elevate the experience of visual communication. Dynamic Avatar Generation transforms the text from a non-deaf individual into sign language, which is then performed through an Avatar.

Dynamic Avatar Generation


Utilizes the recognized speech to dynamically generate visual representations of sign language
avatars. The avatars are crafted in real-time, ensuring synchronization with the flow and nuances
of the communicated message.

Synchronization
Intricately synchronizes with the rhythm and content of non-deaf speech, creating a seamless
and harmonious connection between spoken words and visual representation.
System Requirements
Language : Python 3.7.4 (64-bit or 32-bit)
SN Design : HTML, CSS, Bootstrap
IDE : IDLE
Web Framework : Flask 1.1.1
Database : MySQL
Local Server : WampServer 2i
OS : Windows 10 (64-bit)
Packages : TensorFlow, Pandas, Scikit-learn, Matplotlib, MediaPipe
Dataset
The Sign Language MNIST data came from greatly extending a small number (1,704) of original color images.

Each training and test case represents a label (0-25) as a one-to-one map for each alphabetic letter A-Z (with no cases for 9=J or 25=Z, since those letters involve motion).

The Sign Language MNIST dataset can be obtained from the UCI Repository, Kaggle, or GitHub.
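A hedged sketch of loading these CSVs (as distributed on Kaggle) and reshaping the rows into 28x28 grayscale images is shown below; the local file path is an assumption.

```python
# Hedged sketch: load Sign Language MNIST CSV rows and reshape into images.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

train = pd.read_csv("sign_mnist_train.csv")                  # hypothetical local path
labels = train["label"].values                               # integer labels (no 9=J, 25=Z)
images = train.drop(columns=["label"]).values.reshape(-1, 28, 28) / 255.0

print(images.shape, labels.min(), labels.max())
plt.imshow(images[0], cmap="gray")                           # quick sanity check of one sample
plt.title(f"Label: {labels[0]}")
plt.show()
```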
