TRIBHUVAN UNIVERSITY
INSTITUTE OF ENGINEERING
Kathmandu Engineering College
Department of Computer Engineering

Minor Project
On
SIGN LANGUAGE RECOGNITION USING
MACHINE LEARNING

Kathmandu, Nepal
Falgun 2079
ACKNOWLEDGEMENT
Firstly, we would like to express our gratitude to the Institute of Engineering (IOE) for including this Minor Project in the syllabus of the Bachelor of Computer Engineering course.
The experience of doing this project will surely enrich our technical and teamwork skills to a great extent.
ABSTRACT
Communication between a person from the hearing- or speech-impaired community and a person who does not understand sign language can be a tedious task. Sign language addresses this by conveying messages through hand gestures, but learning it takes time and money. We have developed software that bridges these two groups with the help of only a mobile phone. Our system classifies hand gestures using AI and translates them into English alphabet letters. This is achieved by collecting the required image data, extracting features, feeding them to the machine learning algorithm, and generating the classifier output.
The user is prompted to display a hand gesture in front of the camera. The video from the camera is fed to our program. Through MediaPipe we obtain the hand landmarks, which are pre-processed to keep only the necessary data. The processed data is then fed to the pre-trained model, which classifies the corresponding letter. The resulting letter is shown on the screen.
TABLE OF CONTENTS
ACKNOWLEDGEMENT.........................................................................................iii
ABSTRACT.................................................................................................................iv
TABLE OF CONTENTS............................................................................................v
LIST OF FIGURES...................................................................................................vii
LIST OF ABBREVIATIONS..................................................................................viii
CHAPTER 1: INTRODUCTION...............................................................................1
1.1. BACKGROUND THEORY.......................................................................................1
1.1.1 SIGN LANGUAGE............................................................................................1
1.1.2 MACHINE LEARNING...........................................................................................2
1.2. PROBLEM DEFINITION..........................................................................................2
1.3. PROJECT OBJECTIVES...........................................................................................2
1.4. PROJECT SCOPE AND APPLICATIONS...............................................................2
CHAPTER 2: LITERATURE REVIEW...................................................................3
2.1 EXISTING SYSTEMS AVAILABLE WORLDWIDE...................................................4
2.1.1 GNOSYS.......................................................................................................................4
2.1.2 ACE ASL.......................................................................................................................4
2.2 LIMITATIONS OF PREVIOUS SYSTEMS...................................................................4
2.3 SOLUTIONS PROPOSED BY OUR SYSTEM..............................................................4
CHAPTER 3: METHODOLOGY..............................................................................5
3.1 PROCESS MODEL..........................................................................................................5
3.1.1 INCREMENTAL MODEL.......................................................................................5
3.2 BLOCK DIAGRAM........................................................................................................6
3.3 ALGORITHMS................................................................................................................7
3.4 NECESSARY UML DIAGRAMS...................................................................................8
3.4.1 DFD LEVEL 0...........................................................................................................8
3.4.2 DFD LEVEL 1...........................................................................................................8
3.4.3 DFD LEVEL 2...........................................................................................................9
3.4.4 USE CASE DIAGRAM..........................................................................................10
3.4.5 ACTIVITY DIAGRAM..........................................................................................11
3.5 TOOLS USED................................................................................................................12
3.5.1 GOOGLE COLABORATORY...............................................................................12
3.5.2 PYTHON.................................................................................................................12
3.5.3 TENSORFLOW......................................................................................................12
3.5.4 OPENCV.................................................................................................................13
3.5.5 MEDIAPIPE............................................................................................................13
3.5.6 HTML/CSS:............................................................................................................13
3.5.7 PANDAS.................................................................................................................13
3.5.8 JAVASCRIPT.........................................................................................................13
3.6 VERIFICATION AND VALIDATION.........................................................................14
CHAPTER 4: EPILOGUE........................................................................................16
4.1. RESULTS AND CONCLUSION.................................................................................16
4.2 FUTURE ENHANCEMENT.........................................................................................16
REFERENCES...........................................................................................................17
SCREENSHOTS........................................................................................................18
LIST OF FIGURES
Figure 3.1: Block Diagram of Incremental Process Model ............................................5
Figure 3.2: System Block Diagram.................................................................................6
Figure 3.3: DFD Level 0.................................................................................................8
Figure 3.4: DFD Level 1.................................................................................................8
Figure 3.5: DFD Level 2.................................................................................................9
Figure 3.6: Use Case Diagram......................................................................................10
Figure 3.7: Activity Diagram........................................................................................11
Figure 3.8: Graph between training and validation accuracy.......................................14
Figure 3.9: Graph between training and validation loss...............................................15
LIST OF ABBREVIATIONS
AI Artificial Intelligence
CV Computer Vision
ML Machine Learning
UI User Interface
CHAPTER 1: INTRODUCTION
1.1. BACKGROUND THEORY
1.1.1 SIGN LANGUAGE
Sign language alphabets are formed through hand gestures, which ordinary people may not understand. Sign language is used by nearly 250,000 people from all around the world. ASL consists of 26 gestures, one for each of the 26 letters of the alphabet. Recognizing these gestures is a pattern recognition problem: the hand gesture image is pre-processed, features are extracted (an essential step in every pattern recognition task), and the features are fed to the classifier. Deep CNN-based algorithms are commonly used for alphabet recognition.
1.1.2 MACHINE LEARNING
Machine learning is defined as the field of study that gives computers the ability to
learn without being explicitly programmed. It is seen as a part of artificial intelligence.
Machine learning algorithms build a model based on sample data, known as training data, in order to make predictions or decisions without being explicitly programmed to perform the task. ML algorithms are widely used in areas such as speech recognition and e-mail filtering, where it is infeasible to develop conventional algorithms for the required tasks.
1.4 PROJECT SCOPE AND APPLICATIONS
This project can be used by anyone communicating in sign language. Hospitals can use the software to communicate with deaf or mute patients, and schools for the deaf can use it to teach their students. It can be used on a regular basis in day-to-day activities such as transportation, management, and tourism, wherever deaf and mute people are present. Different companies and organizations can also use it to improve communication with their employees or customers.
CHAPTER 2: LITERATURE REVIEW
The hand and arm receive the most attention among those who study gestures; in fact, many references consider only these two body parts for gesture recognition. The majority of automatic recognition systems target deictic gestures (pointing), emblematic gestures (isolated signs), and sign languages (with a limited vocabulary and syntax). Some are components of bimodal systems integrated with speech recognition. Some produce precise hand and arm configurations, while others capture only coarse motion.
Stark and Kohler developed the ZYKLOP system for recognizing hand poses and
gestures in real-time. After segmenting the hand from the background and extracting
features such as shape moments and fingertip positions, the hand posture is classified.
Temporal gesture recognition is then performed on the sequence of hand poses and
their motion trajectory. A small number of hand poses comprises the gesture catalog,
while a sequence of these makes a gesture.
Freeman developed a real-time system to recognize hand poses using image moments and orientation histograms, and applied it to interactive video games. Cutler and Turk described a system that lets children play virtual instruments and interact with lifelike characters by classifying measurements based on optical flow.
2.1 EXISTING SYSTEMS AVAILABLE WORLDWIDE
2.1.1 GNOSYS
2.1.2 ACE ASL
Ace ASL is the first AI-based ASL app to provide immediate feedback on sign language through photos. It uses AI to analyze the hand gesture and provide a translation immediately.
2.2 LIMITATIONS OF PREVIOUS SYSTEMS
Previous systems tried to provide quick and accurate sign language conversion, but there were limitations in accuracy and ease of use. Their output was not consistent. They also involved many steps and requirements, such as asking for your preferred hand and skin color before you could use the app. In addition, the UI was a bit complex and the response time was not quick.
2.3 SOLUTIONS PROPOSED BY OUR SYSTEM
After hours of research and review of existing systems, we compiled the functions needed to improve on those systems. Our system has a user-friendly UI, and its accuracy is much improved and more consistent. We increased our accuracy by training our model on complex and large datasets.
CHAPTER 3: METHODOLOGY
3.1 PROCESS MODEL
3.1.1 INCREMENTAL MODEL
The main strength of the incremental model is that it divides software development into submodules, and each submodule is developed by following the software development life cycle (SDLC) phases of analysis, design, coding, and testing. This ensures that no expected objective of the software is missed, however minor it may be. Because testing happens aggressively after each stage, the model helps achieve the full set of objectives while keeping the end software defect-free, and it keeps each stage compatible with previously developed and future stages.
We prepared the dataset by collecting landmarks from four of our group members, using MediaPipe to extract the landmarks. In total we collected data from 3,699 images, averaging about 140 images per letter, using both left and right hands.
The landmarks obtained from MediaPipe are then pre-processed. MediaPipe provides extra data that we did not need, so we extracted only the useful values, namely the x and y coordinates, and then normalized those coordinates.
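The pre-processing step described above can be sketched as follows. This is a minimal illustration, assuming MediaPipe's 21 hand landmarks are available as (x, y, z) tuples; the function name and normalization choice are ours, not taken from the project code.

```python
import numpy as np

def preprocess_landmarks(landmarks):
    """Keep only x and y from 21 (x, y, z) hand landmarks,
    make them relative to the wrist (landmark 0), and
    scale them into the range [-1, 1]."""
    pts = np.array([(x, y) for x, y, z in landmarks], dtype=np.float32)
    pts -= pts[0]                      # wrist-relative coordinates
    scale = np.abs(pts).max() or 1.0   # avoid division by zero
    return (pts / scale).flatten()     # 42-element feature vector

# Example with dummy landmark values:
dummy = [(0.5 + 0.01 * i, 0.5 - 0.01 * i, 0.0) for i in range(21)]
features = preprocess_landmarks(dummy)
```

Making the coordinates wrist-relative removes the hand's position in the frame, so only the gesture's shape reaches the classifier.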
There are 26 labels, one for each letter. We used 3,500 of the 3,699 images for training, and the remaining 199 images were used for testing. We trained our model using multiple linear regression.
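As a rough sketch of this training step (not the project's actual code), multiple linear regression over 26 classes can be realized by regressing one-hot targets on the landmark features and taking the largest output as the predicted class. The data here is randomly generated for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: 42 landmark features per sample, 26 letter classes.
n_train, n_features, n_classes = 3500, 42, 26
X = rng.normal(size=(n_train, n_features))
y = rng.integers(0, n_classes, size=n_train)

# One-hot targets, so each class gets its own linear regression.
Y = np.eye(n_classes)[y]

# Closed-form least-squares fit of W in X @ W ~ Y (with a bias column).
Xb = np.hstack([X, np.ones((n_train, 1))])
W, *_ = np.linalg.lstsq(Xb, Y, rcond=None)

def predict(x):
    """Return the class index with the largest linear output."""
    return int(np.argmax(np.append(x, 1.0) @ W))

pred = predict(X[0])
```

In practice a framework such as TensorFlow would fit the same linear layer iteratively rather than in closed form.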
After a letter is classified by the trained model, the index of the predicted class is mapped to the corresponding letter, and the result is displayed.
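The index-to-letter mapping described above can be as simple as indexing into the alphabet (a sketch, assuming class index 0 corresponds to 'A'):

```python
import string

LETTERS = string.ascii_uppercase  # 'A'..'Z', one letter per class index

def index_to_letter(class_index):
    """Map a classifier output index (0-25) to its letter."""
    return LETTERS[class_index]

letter = index_to_letter(0)  # -> 'A'
```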
3.3 ALGORITHMS
3.4 NECESSARY UML DIAGRAMS
3.4.1 DFD LEVEL 0
3.4.2 DFD LEVEL 1
3.4.3 DFD LEVEL 2
3.4.4 USE CASE DIAGRAM
3.4.5 ACTIVITY DIAGRAM
3.5 TOOLS USED
3.5.1 GOOGLE COLABORATORY
Google Colaboratory is a hosted Jupyter notebook service from Google that provides free access to computing resources, including GPUs, which is convenient for training machine learning models.
3.5.2 PYTHON
Python is a high-level, general-purpose programming language with a rich ecosystem of scientific and machine learning libraries. The core of our system is written in Python.
3.5.3 TENSORFLOW
TensorFlow is an open-source machine learning framework developed by Google for building and training models.
3.5.4 OPENCV
OpenCV is an open-source computer vision library that provides tools for capturing and processing images and video.
3.5.5 MEDIAPIPE
MediaPipe is an open-source framework from Google for building perception pipelines; its hand-tracking solution detects 21 landmarks per hand, which we use as features.
3.5.6 HTML/CSS
HTML and CSS are the standard markup and style sheet languages used to structure and present web pages.
3.5.7 PANDAS
It is a software library written for Python for data manipulation and analysis. It offers
data structures and operations for manipulating numerical tables and time series.
3.5.8 JAVASCRIPT
JavaScript often abbreviated as JS, is a programming language that is one of the core
technologies of the World Wide Web, alongside HTML and CSS. Websites use
JavaScript on the client side for webpage behavior, often incorporating third-party libraries. All major web browsers have a dedicated JavaScript engine to execute
the code on users' devices.
3.6 VERIFICATION AND VALIDATION
In total we collected data from about 3700 images and trained the model for 30
epochs.
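The accuracy and loss curves in Figures 3.8 and 3.9 can be produced from per-epoch metrics recorded during training. This is an illustrative sketch with made-up numbers standing in for the real 30-epoch training history:

```python
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

epochs = range(1, 31)  # the model was trained for 30 epochs
# Dummy per-epoch metrics standing in for the real training history.
train_acc = [1 - 0.9 * (0.85 ** e) for e in epochs]
val_acc = [1 - 0.95 * (0.85 ** e) for e in epochs]
train_loss = [2.0 * (0.8 ** e) for e in epochs]
val_loss = [2.2 * (0.8 ** e) for e in epochs]

# One panel for accuracy (Figure 3.8), one for loss (Figure 3.9).
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(epochs, train_acc, label="training")
ax1.plot(epochs, val_acc, label="validation")
ax1.set_title("Accuracy")
ax1.legend()
ax2.plot(epochs, train_loss, label="training")
ax2.plot(epochs, val_loss, label="validation")
ax2.set_title("Loss")
ax2.legend()
```

With a Keras model, the same lists come directly from `model.fit(...).history`.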
Figure 3.8: Graph between training and validation accuracy
Figure 3.9: Graph between training and validation loss
CHAPTER 4: EPILOGUE
4.1. RESULTS AND CONCLUSION
Sign Language Recognition System is a web-based application that recognizes American Sign Language in real time using machine learning. We implemented multiple linear regression to build the model and achieved 99.4% accuracy.
Our system helps deaf and mute people communicate with others. It provides a platform for them to communicate with the people around them and fulfill their requirements. With high accuracy and a fast response, the communication becomes seamless, giving the feeling of a natural conversation, which is the ultimate goal of our project.
REFERENCES
[1] Cui Y., Weng J. (2000), "Appearance-based hand sign recognition from intensity image sequences," Computer Vision and Image Understanding.
[2] Kumari S., Srivastav P. (2020), "Hand gesture-based recognition for interactive human computer using TensorFlow."
[3] Javed A. R. (2022), "Hyper-tuned deep convolutional neural network for sign language recognition."
[4] Panwar M., Mehra P. (2011), "Hand gesture recognition for human computer interaction."
SCREENSHOTS