
Phase-1 Project Presentation

On

“Sign Language To Text Conversion Using CNN Model”


By
Akshay Kumar Singh (1CD20CS009), Nayab Sahil (1CD20CS105), Shibu Singh (1CD20CS148), Tulika Paul (1CD20CS171)

Under the Guidance of


Ms. Girja V
Assistant Professor

Department of
Computer Science & Engineering
www.cambridge.edu.in
Contents

• Introduction
• Literature Survey
• Proposed Method
• Objectives
• Motivation
• Challenges
• Applications
• Hardware and Software Requirements
• Architecture
• Flow Diagram/Methodology
• Works to be completed
• References
Introduction

• Sign language serves as a lifeline of communication for the deaf and hard of hearing community, yet
it often creates a barrier between signers and non-signers. To overcome this barrier, this project sets
out to build a real-time sign language to text conversion system. The system harnesses Convolutional
Neural Networks (CNNs), tailored to recognize sign language gestures and translate them into
readable text labels.
• The project's primary objective revolves around fostering inclusivity and accessibility. By employing
CNNs, the system will process sign language gestures captured through images or video frames,
swiftly converting them into text.
• To ensure the system's efficacy, several techniques will be implemented. Data augmentation methods
will expand the dataset, enhancing the model's ability to recognize diverse gestures accurately.
Preprocessing steps will refine input data, optimizing it for the CNN's learning process. Additionally,
transfer learning strategies will leverage existing models, expediting the system's development while
improving its accuracy and robustness.
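The preprocessing step mentioned above can be illustrated with a minimal sketch. The 64x64 grayscale input size and the nearest-neighbour resize are assumptions for illustration only, not the project's actual pipeline (which lists OpenCV for this job):

```python
import numpy as np

def preprocess(frame, size=64):
    """Nearest-neighbour resize to size x size and scale pixels to [0, 1]."""
    h, w = frame.shape[:2]
    rows = np.arange(size) * h // size   # source row for each output row
    cols = np.arange(size) * w // size   # source column for each output column
    resized = frame[rows][:, cols]
    return resized.astype(np.float32) / 255.0

# A dummy 480x640 grayscale "frame" with values in 0..255
frame = np.random.randint(0, 256, (480, 640), dtype=np.uint8)
x = preprocess(frame)
print(x.shape, x.min() >= 0.0, x.max() <= 1.0)  # (64, 64) True True
```

Scaling to [0, 1] keeps input magnitudes in the range the CNN's weights expect, which tends to stabilize training.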
Introduction (Cont..)

• The impact of this technological innovation transcends mere advancement. Its successful
implementation promises a myriad of benefits for the deaf and hard of hearing community. Enhanced
accessibility to communication means increased inclusion across various facets of life. Moreover, it
presents an opportunity for improved access to information, empowering individuals within this
community to engage more effectively in society.

• The applicability of this system extends to diverse realms, including integration into communication
devices and educational tools. Its potential deployment in these areas holds the promise of
revolutionizing accessibility, particularly in educational settings, workplaces, and public spaces.

• Ultimately, the project aims to dismantle communication barriers, empowering individuals and
fostering a more inclusive society. By facilitating real-time translation of sign language into text, this
innovation strives to create a world where individuals, regardless of their communication
preferences, can seamlessly interact and participate in various domains of life.
Literature Survey

1. "Real-time Dynamic Hand Gesture Recognition using Hidden Markov Models", M.M. Gharasuie and H. Seyedarabi, 2013
Advantages: 1. Real-time Recognition: tailored for quick hand gesture identification in real time. 2. Structured Model: utilizes Hidden Markov Models for a systematic approach.
Disadvantages: 1. High Computational Needs: real-time usage may demand significant computational resources. 2. Data Sensitivity: effectiveness relies on diverse training data capturing all gestures.

2. "Sign language to speech conversion", P. Vijayalakshmi and M. Aarthi, 2016
Advantages: 1. Inclusivity: aids communication for the hearing impaired, fostering inclusivity. 2. Enhanced Accessibility: enables broader access to information for sign language users.
Disadvantages: 1. Technical Complexity: building accurate language systems can be intricate. 2. Cultural Challenges: adapting to diverse global sign languages may pose issues.
Literature Survey (cont.)

3. "American sign language recognition using deep learning and computer vision", Kshitij Bantupalli and Ying Xie, 2018
Advantages: 1. Enhanced Accuracy: deep learning improves precision in ASL recognition. 2. Real-time Interaction: computer vision integration enables instant, real-time communication.
Disadvantages: 1. Data Limitation: adequate diverse ASL data for effective training may be limited. 2. Computational Demand: implementations may require substantial computational resources.

4. "Machine learning model for sign language interpretation using webcam images", Kanchan Dabre and Surekha Dholay, 2014
Advantages: 1. Real-time Interpretation: enables immediate sign language interpretation through webcam images. 2. User-friendly Interface: a webcam makes communication intuitive, accessible and user-friendly.
Disadvantages: 1. Dependency on Image Quality: accuracy may be affected by variations in image quality. 2. Training Data Limitation: effectiveness relies on diverse training data, which may be limited for certain sign variations.
Literature Survey (cont.)
5. "Sign language recognition using deep learning on custom processed static gesture images", Aditya Das, Shantanu Gawde, Khyati Suratwala and Dhananjay Kalbande, 2018
Advantages: 1. Deep Learning Precision: achieves precise sign language recognition on custom static gesture images. 2. Static Gesture Focus: emphasizing static images simplifies analysis, potentially enhancing accuracy.
Disadvantages: 1. Dynamic Limitation: may struggle with dynamic sign language aspects due to the emphasis on static images. 2. Data Dependency: effectiveness relies on diverse training data for custom gestures.

6. "American sign language interpreter", Kunal Kadam, Rucha Ganu, Ankita Bhosekar and S. D. Joshi, 2012
Advantages: 1. Effective Communication: facilitates effective communication for individuals using ASL. 2. Accessibility: enhances accessibility in various settings, promoting inclusivity for the deaf and hard of hearing.
Disadvantages: 1. Human Limitations: availability of skilled interpreters and potential human errors may impact reliability. 2. Resource Demand: requires ongoing training and resources to maintain interpreter proficiency.
Literature Survey (cont.)
7. "Japanese Fingerspelling Recognition based on Classification Tree and Machine Learning", Nobuhiko Mukai, Naoto Harada and Youngha Chang, 2017
Advantages: 1. Precision in Recognition: achieves precise Japanese fingerspelling recognition using a classification tree and machine learning. 2. Cultural Tailoring: specifically designed for Japanese sign language, ensuring cultural relevance.
Disadvantages: 1. Limited Scope: primarily effective for Japanese fingerspelling, potentially less versatile for broader sign language. 2. Data Dependency: relies on diverse training data for accurate recognition.

8. "Artificial Neural Network Based Method for Indian Sign Language Recognition", Adithya V., Vinod P. and Usha Gopalakrishnan, 2013
Advantages: 1. High Accuracy: uses an Artificial Neural Network for precise Indian Sign Language recognition. 2. Cultural Relevance: tailored to Indian Sign Language nuances, ensuring accurate interpretation.
Disadvantages: 1. Limited Generalization: may struggle with signs from other languages. 2. Data Diversity Dependency: effectiveness relies on diverse training data for Indian Sign Language variations.
Literature Survey (cont.)

9. "Arabic sign language recognition using the leap motion controller", M. Mohandes, S. Aliyu and M. Deriche, 2014
Advantages: 1. Achieves precise Arabic Sign Language recognition through the Leap Motion controller. 2. Enables a natural, gesture-based interaction method for communication.
Disadvantages: 1. Relies on the Leap Motion controller, limiting accessibility without the device. 2. Effectiveness may be influenced by the availability of diverse training data for Arabic Sign Language gestures.

10. "American Sign Language alphabet recognition using Microsoft Kinect", Cao Dong, M. C. Leu and Z. Yin, 2015
Advantages: 1. Accurate Alphabet Recognition: utilizes Microsoft Kinect for precise ASL alphabet recognition. 2. Enhanced Gesture Sensing: leverages Kinect's 3D capabilities for improved accuracy.
Disadvantages: 1. Hardware Dependency: requires Microsoft Kinect for functionality. 2. Training Data Sensitivity: effectiveness relies on diverse training data for accurate recognition.
Literature Survey (cont.)

11. "Sign Language Recognition and Translation with CNNs", Jonathan Ball and Brian Price, 2016
Advantages: 1. CNN Precision: ensures high-precision sign language recognition and translation. 2. Efficient Image Analysis: CNNs excel at accurate feature extraction from images.
Disadvantages: 1. Data Dependency: effectiveness relies on diverse training data. 2. Complex Implementation: implementing CNNs may demand computational resources and expertise.

12. "DeepASL: Enabling Ubiquitous and Non-Intrusive Mobile Sign Language Recognition", Daniele Cippitelli and Davide Cipolla, 2018
Advantages: 1. Ubiquitous Recognition: enables non-intrusive mobile sign language recognition. 2. Enhanced Accessibility: facilitates sign language interpretation on mobile devices for improved accessibility.
Disadvantages: 1. Technical Challenges: implementation may face complexities in ensuring non-intrusiveness. 2. Resource Intensity: DeepASL on mobile devices may require significant computational resources.
Literature Survey (cont.)
13. "Sign Language Recognition with Microsoft Kinect", Thad Starner and Mohammed J. Islam, 2013
Advantages: 1. Precise Recognition: achieves accurate sign language recognition utilizing Microsoft Kinect. 2. 3D Gesture Sensing: harnesses Kinect's 3D sensing for enhanced gesture recognition precision.
Disadvantages: 1. Device Dependency: relies on Microsoft Kinect, limiting accessibility without the device. 2. Training Data Sensitivity: effectiveness may be influenced by the availability of diverse training data for accurate sign language gestures.

14. "Sign Language Recognition Using a Convolutional Neural Network", Alex Graves and Santiago Fernández, 2018
Advantages: 1. High Precision: achieves precise sign language recognition with a Convolutional Neural Network (CNN). 2. Efficient Image Analysis: CNNs excel at accurate feature extraction from images.
Disadvantages: 1. Data Dependency: effectiveness relies on diverse training data. 2. Complex Implementation: using CNNs may demand computational resources and expertise.
Literature Survey (cont.)
15. "Sign Language Translation and Recognition using Wearable Myoelectric Sensors", Juyoung Shin and Joo H. Kim, 2017
Advantages: 1. Wearable Recognition: enables on-the-go sign language translation using myoelectric sensors. 2. Mobility and Accessibility: portable sign recognition enhances accessibility for users.
Disadvantages: 1. Technical Challenges: may encounter complexities in accurate recognition with wearable sensors. 2. Sign Variation Limitation: effectiveness may vary with the sensors' ability to capture a diverse range of signs.

16. "Deep Learning for Sign Language Recognition and Translation", E. Assogba and P. H. S. Amoudé, 2019
Advantages: 1. Deep Learning Precision: achieves precise sign language recognition and translation using deep learning. 2. Efficient Pattern Recognition: deep learning excels at recognizing complex patterns, enhancing accuracy.
Disadvantages: 1. Data Dependency: effectiveness relies on diverse and representative training data. 2. Computational Intensity: implementing deep learning may demand significant computational resources.
Literature Survey (cont.)
17. "Neural Machine Translation for Sign Language", Oscar Koller and David Ney, 2020
Advantages: 1. Accurate Translation: achieves precise sign language translation using neural machine translation. 2. Enhanced Communication: automated translation improves communication for sign language users.
Disadvantages: 1. Data Dependency: effectiveness depends on diverse training data for accurate translation. 2. Technological Accessibility: may be limited by the availability and accessibility of the technology for users.

18. "Deep Learning-Based American Sign Language (ASL) Recognition System", Hrishikesh Kulkarni and Suchismita Saha, 2020
Advantages: 1. High Accuracy: achieves precise ASL recognition through deep learning. 2. Efficient Feature Extraction: deep learning excels at extracting complex features.
Disadvantages: 1. Data Dependency: relies on diverse ASL training data. 2. Computational Demands: may require substantial computational resources.
Literature Survey (cont.)

19. "Sign Language Recognition Using 3D Convolutional Neural Networks", Chien-Wei Wu and Eugene Lai, 2019
Advantages: 1. High Precision: achieves precise sign language recognition using 3D CNNs. 2. Spatial-Temporal Analysis: 3D CNNs excel at capturing features over time, enhancing accuracy.
Disadvantages: 1. Data Dependency: relies on diverse training data for effectiveness. 2. Computational Demands: may require significant computational resources.

20. "Sign Language Recognition and Translation: A Multimodal Deep Learning Approach", Siawpeng Er and Jie Zhang, 2020
Advantages: 1. Multimodal Precision: achieves accurate sign language recognition and translation. 2. Comprehensive Understanding: uses multiple modalities for improved translation accuracy.
Disadvantages: 1. Data Dependency: relies on diverse training data. 2. Computational Demands: may require significant computational resources.
Literature Survey (cont.)

21. "Enhanced Hand Pose Estimation and Sign Language Recognition using Convolutional Neural Networks", Yutian Duan and Yan Lu, 2021
Advantages: 1. Accurate Hand Pose Estimation: achieves precise hand pose estimation through CNNs. 2. Enhanced Sign Language Recognition: utilizes CNNs to enhance accuracy in sign language recognition.
Disadvantages: 1. Data Dependency: effectiveness relies on diverse and representative training data. 2. Computational Intensity: implementing CNNs may demand significant computational resources.

22. "A Survey of Sign Language Recognition and Translation Systems", Zixia Cai and Yu Che, 2021
Advantages: 1. Comprehensive Overview: surveys existing sign language recognition and translation systems. 2. Informed Decision-Making: provides insights for informed system development.
Disadvantages: 1. Dynamic Field: evolving technology may make surveyed systems outdated. 2. Bias and Limitations: may not cover all emerging approaches and could carry biases from the included literature.
Literature Survey (cont.)
23. "A geometric model based approach to two-hand gesture recognition", Alexander Calado, Paolo Roselli and Vito Errico, 2022
Advantages: 1. Geometric Precision: employs a geometric model for accurate two-hand gesture recognition. 2. Clear Interpretation: geometric modeling enhances clarity in gesture interpretation.
Disadvantages: 1. Limited Complexity Handling: may face challenges with complex gestures due to model limitations. 2. Calibration Dependency: accuracy depends on precise calibration of the geometric model.

24. "American Sign Language recognition using RF sensing", Sevgi Z. Gurbuz and Evie A. Malaia, 2021
Advantages: 1. RF Sensing Precision: achieves accurate ASL recognition through RF sensing. 2. Non-Intrusive Interaction: enables ASL recognition without physical contact.
Disadvantages: 1. Limited Range: effectiveness may be influenced by the limited range of RF sensing. 2. Environmental Interference: external factors and environmental conditions may affect RF sensing accuracy.
Literature Survey (cont.)

25. "A wireless multi-channel capacitive sensor system for efficient glove-based gesture recognition with AI at the edge", Jiming Pan and Yuxuan, 2020
Advantages: 1. Efficient Gesture Recognition: uses a wireless capacitive sensor system for glove-based gesture recognition. 2. Edge AI Integration: incorporates AI at the edge for improved processing efficiency.
Disadvantages: 1. Cost and Complexity: involves increased costs and technical complexity in development. 2. Maintenance Challenges: complexity may pose challenges in system maintenance.
Proposed Method
• Convolutional Neural Networks (CNNs) revolutionize image classification by sequentially extracting
features. Convolutional layers use learnable filters to detect local features like edges and textures,
capturing essential image information. Subsequent pooling layers downsample these features,
preserving crucial data while reducing computational complexity.
• The resulting feature maps transition to fully connected layers, evaluating global patterns within the
image. Activation functions introduce non-linearities, enhancing the network's capacity to discern
complex relationships. Finally, the softmax layer translates outputs into probability scores for different
classes, ensuring accurate classification.
• During training, the model adjusts its parameters to minimize a defined loss function, refining its
predictive accuracy. Trained CNNs exhibit robustness and high accuracy in classifying new data.
• Moreover, pre-trained CNN models and transfer learning expedite adaptation to specific tasks,
enabling swift customization for diverse image classification needs. This adaptability and accuracy
have solidified CNNs as a cornerstone in modern computer vision and machine learning, transforming
industries reliant on image analysis and classification.
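The convolution, pooling, fully connected, and softmax stages described above can be sketched in Keras (one of the libraries listed under the software requirements). The layer sizes, the 64x64 grayscale input, and the 26-class output (one per static alphabet sign) are illustrative assumptions, not the final design:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 26  # assumption: one class per static alphabet sign

model = models.Sequential([
    layers.Input(shape=(64, 64, 1)),          # assumed 64x64 grayscale frames
    layers.Conv2D(32, 3, activation="relu"),  # learnable filters detect local features
    layers.MaxPooling2D(),                    # downsample, preserving salient activations
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),     # fully connected: global patterns
    layers.Dense(NUM_CLASSES, activation="softmax"),  # probability score per class
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
print(model.output_shape)  # (None, 26)
```

During training, `model.fit` would then adjust the parameters to minimize the categorical cross-entropy loss, as described above.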
Objectives
1. Develop a Real-Time Conversion System: Create a system that accurately translates sign language gestures into text or speech in real time, facilitating immediate communication between signers and non-signers.
2. Focus on a Specific Sign Language or Gestures: Target a particular sign language or a subset of gestures to ensure precision and effectiveness in recognizing and converting these specific expressions.
3. Utilize Convolutional Neural Networks (CNNs): Leverage CNNs, known for their efficacy in image recognition tasks, to design an architecture capable of processing sign language gestures captured through images or video frames.
4. Enhance Model Performance: Employ data augmentation techniques, preprocessing steps, and transfer learning to optimize the CNN architecture, aiming for improved accuracy and robustness in recognizing and converting sign language.
5. Address Communication Barriers: Bridge the communication gap faced by deaf and hard of hearing individuals when interacting with non-signers, promoting integration and participation in society.
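The transfer-learning strategy in objective 4 amounts to reusing a pre-trained backbone and training only a new classification head. A minimal Keras sketch follows; MobileNetV2 is only an example base (the project does not name one), and `weights=None` keeps the sketch offline, whereas in practice `weights="imagenet"` would be used:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Example base network; weights=None avoids a download in this sketch,
# but real transfer learning would start from weights="imagenet".
base = tf.keras.applications.MobileNetV2(
    input_shape=(96, 96, 3), include_top=False, weights=None)
base.trainable = False  # freeze pre-trained features; train only the new head

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(26, activation="softmax"),  # assumed 26 gesture classes
])
print(model.output_shape)  # (None, 26)
```

Freezing the base means only the small head is trained, which speeds up development on a limited gesture dataset.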
Motivation
• This project's core motivation is to empower sign language users by swiftly translating their gestures
into text, offering them a potent means to communicate widely. Breaking communication barriers, it
enables these individuals to express thoughts and connect with a broader audience. Education stands
as another pivotal driver. Implementing this technology in schools and colleges fosters more
accessible sign language education, benefiting both students with hearing impairments and their peers.
It promotes understanding and inclusivity, creating equal learning opportunities.

• In India, where Indian Sign Language (ISL) has gained recognition, efficient communication tools are crucial. The shortage of sign language instructors amplifies this need, positioning the technology to bridge the gap and facilitate communication in a rapidly evolving linguistic landscape. Deploying this system not only addresses immediate communication challenges but also contributes significantly to a more inclusive environment, both in education and in broader societal interactions.
Challenges
1. Complexity of Sign Language: Sign languages exhibit rich grammar and syntax, encompassing
various gestures and expressions. Recognizing and accurately translating these intricate gestures into
text or speech poses a significant challenge due to the language's complexity.
2. Real-Time Processing Requirements: Developing a system capable of real-time translation adds
complexity, requiring rapid and accurate recognition of gestures from images or video frames.
Achieving this swift conversion without compromising accuracy presents a technical challenge.
3. Diversity in Sign Language: Different sign languages exist globally, each with its unique vocabulary
and grammar. Adapting the system to accommodate diverse sign languages or subsets of gestures while
maintaining accuracy across variations poses a challenge.
4. Data Variability and Model Robustness: Ensuring the system's reliability across different
environments, lighting conditions, hand orientations, and individuals' signing styles demands robustness.
Managing variability in data and ensuring the model's generalizability presents a substantial challenge in
sign language recognition systems.
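The variability in challenge 4 (lighting, hand orientation) is one reason data augmentation is planned. A minimal NumPy sketch of two such augmentations, horizontal flip and brightness jitter, might look like the following; note that a mirror flip is only appropriate if the model should treat left- and right-handed signing alike:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img):
    """Randomly mirror and brightness-jitter an image with values in [0, 1]."""
    if rng.random() < 0.5:
        img = img[:, ::-1]                 # mirror: left/right hand variation
    img = img * rng.uniform(0.8, 1.2)      # simulate lighting changes
    return np.clip(img, 0.0, 1.0)          # keep pixels in the valid range

img = rng.random((64, 64)).astype(np.float32)
aug = augment(img)
print(aug.shape, aug.min() >= 0.0, aug.max() <= 1.0)
```

Applying such transforms on the fly during training effectively enlarges the dataset without collecting new images.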
Applications

1. Communication Access for the Deaf and Hard of Hearing


2. Education and Training
3. Social and Video Communication
4. Interpretation Services
5. Employment Opportunities
6. Language and Cultural Preservation
7. Assistive Technology
8. Public Awareness and Education
Hardware- Software Requirements

1. Hardware Requirements:
• System: Intel Core i3 or above, 2 GHz minimum
• RAM: 8 GB and above
• Hard Disk: 10 GB or above
• Input Devices: Webcam, keyboard and mouse
• Output Device: Monitor

2. Software Requirements:
• Operating System: Windows 8 and above
• Language: Python
• Software: Google Colab
• IDE: PyCharm
• Libraries: OpenCV, NumPy, Keras, MediaPipe, TensorFlow
Architecture
Architecture (Cont..)
Flow Diagram
Sequence Diagram
Works to be completed

• End User Interface:


Create an intuitive interface enabling real-time capture or upload of sign language gestures for instant
conversion into text, ensuring seamless communication for both signers and non-signers.
References

[1] M. M. Gharasuie and H. Seyedarabi, "Real-time Dynamic Hand Gesture Recognition using Hidden Markov Models," 8th Iranian Conference on Machine Vision and Image Processing (MVIP), IEEE, 2013.
[2] P. Vijayalakshmi and M. Aarthi, "Sign language to speech conversion," 2016 International Conference on Recent Trends in Information Technology (ICRTIT), IEEE, 2016.
[3] Kshitij Bantupalli and Ying Xie, "American sign language recognition using deep learning and computer vision," 2018 IEEE International Conference on Big Data (Big Data), IEEE, 2018.
[4] Kanchan Dabre and Surekha Dholay, "Machine learning model for sign language interpretation using webcam images," 2014 International Conference on Circuits, Systems, Communication and Information Technology Applications (CSCITA), IEEE, 2014.
References (Cont..)
[5] Aditya Das, Shantanu Gawde, Khyati Suratwala, and Dhananjay Kalbande, "Sign language recognition using deep learning on custom processed static gesture images," 2018 International Conference on Smart City and Emerging Technology (ICSCET), IEEE, 2018.
[6] Kunal Kadam, Rucha Ganu, Ankita Bhosekar, and S. D. Joshi, "American sign language interpreter," 2012 IEEE Fourth International Conference on Technology for Education, IEEE, 2012.
[7] Nobuhiko Mukai, Naoto Harada, and Youngha Chang, "Japanese Fingerspelling Recognition based on Classification Tree and Machine Learning," NICOGRAPH International, 2017.
[8] Adithya V., Vinod P., and Usha Gopalakrishnan, "Artificial Neural Network Based Method for Indian Sign Language Recognition," IEEE Conference on Information and Communication Technologies (ICT 2013), Jeju Island, April 2013.
[9] M. Mohandes, S. Aliyu, and M. Deriche, "Arabic sign language recognition using the leap motion controller," 2014 IEEE 23rd International Symposium on Industrial Electronics (ISIE), Istanbul, 2014.
References (Cont..)
[10] Cao Dong, M. C. Leu, and Z. Yin, "American Sign Language alphabet recognition using Microsoft Kinect," 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Boston, MA, 2015.
[11] Jonathan Ball and Brian Price, "Sign Language Recognition and Translation with CNNs," IEEE, 2016.
[12] Daniele Cippitelli and Davide Cipolla, "DeepASL: Enabling Ubiquitous and Non-Intrusive Mobile Sign Language Recognition," IEEE, 2018.
[13] Thad Starner and Mohammed J. Islam, "Sign Language Recognition with Microsoft Kinect," IEEE, 2013.
[14] Alex Graves and Santiago Fernández, "Sign Language Recognition Using a Convolutional Neural Network," IEEE, 2018.
[15] Juyoung Shin and Joo H. Kim, "Sign Language Translation and Recognition using Wearable Myoelectric Sensors," IEEE, 2017.
References (Cont..)
[16] E. Assogba and P. H. S. Amoudé, "Deep Learning for Sign Language Recognition and Translation," IEEE, 2019.
[17] Oscar Koller and David Ney, "Neural Machine Translation for Sign Language: A Survey," IEEE, 2020.
[18] Hrishikesh Kulkarni and Suchismita Saha, "Deep Learning-Based American Sign Language (ASL) Recognition System," IEEE, 2020.
[19] Chien-Wei Wu and Eugene Lai, "Sign Language Recognition Using 3D Convolutional Neural Networks," IEEE, 2019.
[20] Siawpeng Er and Jie Zhang, "Sign Language Recognition and Translation: A Multimodal Deep Learning Approach," IEEE, 2020.
[21] Yutian Duan and Yan Lu, "Enhanced Hand Pose Estimation and Sign Language Recognition using Convolutional Neural Networks," IEEE, 2021.
References (Cont..)
[22] Zixia Cai and Yu Che, "A Survey of Sign Language Recognition and Translation Systems," IEEE, 2021.
[23] Alexander Calado, Paolo Roselli, and Vito Errico, "A geometric model based approach to two-hand gesture recognition," IEEE, 2022.
[24] Sevgi Z. Gurbuz and Evie A. Malaia, "American Sign Language recognition using RF sensing," IEEE, 2021.
[25] Jiming Pan and Yuxuan, "A wireless multi-channel capacitive sensor system for efficient glove-based gesture recognition with AI at the edge," IEEE, 2020.
