
TRIBHUVAN UNIVERSITY

INSTITUTE OF ENGINEERING

Kathmandu Engineering College


Department of Computer Engineering

Minor Project
On
SIGN LANGUAGE RECOGNITION USING
MACHINE LEARNING

[Code No: CT 654]


By

Diwash Adhikari (BCT76023)


Gaurab Ghimire (BCT76025)
Nirajan Khadka (BCT76043)
Parakram Basnet (BCT76046)

Kathmandu, Nepal
Falgun 2079
TRIBHUVAN UNIVERSITY
INSTITUTE OF ENGINEERING
Kathmandu Engineering College

Department of Computer Engineering

SIGN LANGUAGE RECOGNITION USING MACHINE LEARNING

[Code No: CT654]

PROJECT REPORT SUBMITTED TO


THE DEPARTMENT OF COMPUTER ENGINEERING
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR
THE BACHELOR OF ENGINEERING

By

Diwash Adhikari (BCT76023)


Gaurab Ghimire (BCT76025)
Nirajan Khadka (BCT76043)
Parakram Basnet (BCT76046)

Kathmandu, Nepal
Falgun 2079
ACKNOWLEDGEMENT

Firstly, we would like to extend our regards to the Institute of Engineering (IOE) for including this Minor Project in the syllabus of the Bachelor of Computer Engineering course.

We would also like to thank the Department of Computer Engineering, Kathmandu Engineering College, for providing us with proper guidance and a wonderful learning atmosphere throughout our time at Kathmandu Engineering College, and for giving us this exciting opportunity to test our knowledge through this Minor Project.

The experience of doing this project will surely enrich our technical and teamwork
skills to a great extent.

Diwash Adhikari (KEC076BCT023)

Gaurab Ghimire (KEC076BCT025)

Nirajan Khadka (KEC076BCT043)

Parakram Basnet (KEC076BCT046)

ABSTRACT

Communication between a person from the hearing-impaired community and a person who does not understand sign language can be a tedious task. Sign language addresses this: it is the art of conveying a message using hand gestures. However, people must first learn a sign language, which takes time and money. We have developed software that bridges the gap between these people with the help of only a mobile phone. Our system classifies hand gestures using AI and translates them into English alphabet letters. This is achieved by collecting the required image data, extracting features, feeding them to machine learning algorithms, and generating the classifier output.

The user is prompted to display a hand gesture in front of the camera, and the camera's video is fed to our program. Through MediaPipe we obtain the hand landmarks, which are pre-processed to keep only the necessary data. The processed data is then fed to the pre-trained model, which classifies the corresponding letter. The resulting letter is shown on the screen.

Keywords: Hand Gesture, ML, AI

TABLE OF CONTENTS

ACKNOWLEDGEMENT.........................................................................................iii
ABSTRACT.................................................................................................................iv
TABLE OF CONTENTS............................................................................................v
LIST OF FIGURES...................................................................................................vii
LIST OF ABBREVIATIONS..................................................................................viii
CHAPTER 1: INTRODUCTION...............................................................................1
1.1. BACKGROUND THEORY.......................................................................................1
1.1.1 SIGN LANGUAGE............................................................................................1
1.1.2 MACHINE LEARNING...........................................................................................2
1.2. PROBLEM DEFINITION..........................................................................................2
1.3. PROJECT OBJECTIVES...........................................................................................2
1.4. PROJECT SCOPE AND APPLICATIONS...............................................................2
CHAPTER 2: LITERATURE REVIEW...................................................................3
2.1 EXISTING SYSTEMS AVAILABLE WORLDWIDE...................................................4
2.1.1 GNOSYS.......................................................................................................................4
2.1.2 ACE ASL.......................................................................................................................4
2.2 LIMITATIONS OF PREVIOUS SYSTEMS...................................................................4
2.3 SOLUTIONS PROPOSED BY OUR SYSTEM..............................................................4
CHAPTER 3: METHODOLOGY..............................................................................5
3.1 PROCESS MODEL..........................................................................................................5
3.1.1 INCREMENTAL MODEL.......................................................................................5
3.2 BLOCK DIAGRAM........................................................................................................6
3.3 ALGORITHMS................................................................................................................7
3.4 NECESSARY UML DIAGRAMS...................................................................................8
3.4.1 DFD LEVEL 0...........................................................................................................8
3.4.2 DFD LEVEL 1...........................................................................................................8
3.4.3 DFD LEVEL 2...........................................................................................................9
3.4.4 USE CASE DIAGRAM..........................................................................................10
3.4.5 ACTIVITY DIAGRAM..........................................................................................11
3.5 TOOLS USED................................................................................................................12
3.5.1 GOOGLE COLABORATORY...............................................................................12
3.5.2 PYTHON.................................................................................................................12
3.5.3 TENSORFLOW......................................................................................................12

3.5.4 OPENCV.................................................................................................................13
3.5.5 MEDIAPIPE............................................................................................................13
3.5.6 HTML/CSS:............................................................................................................13
3.5.7 PANDAS.................................................................................................................13
3.5.8 JAVASCRIPT.........................................................................................................13
3.6 VERIFICATION AND VALIDATION.........................................................................14
CHAPTER 4: EPILOGUE........................................................................................16
4.1. RESULTS AND CONCLUSION.................................................................................16
4.2 FUTURE ENHANCEMENT.........................................................................................16
REFERENCES...........................................................................................................17
SCREENSHOTS........................................................................................................18

LIST OF FIGURES
Figure 3.1: Block Diagram of Incremental Process Model...............................5
Figure 3.2: System Block Diagram.....................................................6
Figure 3.3: DFD Level 0..............................................................8
Figure 3.4: DFD Level 1..............................................................8
Figure 3.5: DFD Level 2..............................................................9
Figure 3.6: Use Case Diagram........................................................10
Figure 3.7: Activity Diagram........................................................11
Figure 3.8: Graph between training and validation accuracy..........................14
Figure 3.9: Graph between training and validation loss..............................15

LIST OF ABBREVIATIONS

AI Artificial Intelligence

CV Computer Vision

GPU Graphics Processing Unit

ML Machine Learning

UI User Interface

ASL American Sign Language

CHAPTER 1: INTRODUCTION
1.1. BACKGROUND THEORY

1.1.1 SIGN LANGUAGE

Sign language alphabets are formed through hand gestures, which ordinary people may not understand. Sign language is used by nearly 250,000 people around the world. ASL contains 26 gestures, one for each of the 26 letters of the alphabet. Recognizing these gestures is a pattern recognition task that begins with a preprocessing step, and feature extraction is an essential step in every such recognition task. The classifier requires an image of the hand gesture as input, and a deep-CNN-based algorithm is used for alphabet recognition.

1.1.2 MACHINE LEARNING

Machine learning is defined as the field of study that gives computers the ability to
learn without being explicitly programmed. It is seen as a part of artificial intelligence.
Machine learning algorithms build a model based on a sample data, known as training
data, in order to make predictions or decisions without being explicitly programmed to
do the tasks. ML algorithms are widely used in speech recognition and e-mail filtering,
among other areas, where it is unfeasible to develop conventional algorithms to do the
required tasks.

1.2. PROBLEM DEFINITION

Language is the medium through which people communicate, but communication with a mute or deaf person can be troublesome: we need to know sign language to communicate with them, yet most people do not have enough time to learn it. It is possible to communicate with deaf or mute people by building a translator that converts signs into written language. This can be achieved through AI.

1.3. PROJECT OBJECTIVES

The main objectives of our project are:


 To recognize hand gestures using a machine learning model.
 To translate hand gestures into English alphabet letters.

1.4. PROJECT SCOPE AND APPLICATIONS

This project can be used by anyone communicating in sign language. Hospitals can use this software to communicate with deaf or mute patients, and schools for the deaf can use it to teach their students. It can be used on a regular basis in day-to-day activities such as transportation, management, and tourism, and in any place where deaf and mute people are present. It can also be used by companies and organizations to improve communication with their employees or customers.

CHAPTER 2: LITERATURE REVIEW

Two parts of the body, the hand and the arm, receive the most attention among those who study gestures; in fact, many references consider only these two for gesture recognition. The majority of automatic recognition systems target deictic gestures (pointing), emblematic gestures (isolated signs), and sign languages (with a limited vocabulary and syntax). Some are components of bimodal systems, integrated with speech recognition. Some produce precise hand and arm configurations, while others capture only coarse motion.

Stark and Kohler developed the ZYKLOP system for recognizing hand poses and
gestures in real-time. After segmenting the hand from the background and extracting
features such as shape moments and fingertip positions, the hand posture is classified.
Temporal gesture recognition is then performed on the sequence of hand poses and
their motion trajectory. A small number of hand poses comprises the gesture catalog,
while a sequence of these makes a gesture.

Similarly, Maggioni and Kämmerer described the Gesture Computer, which


recognized both hand gestures and head movements. There has been a lot of interest in
creating devices to automatically interpret various sign languages to aid the deaf
community. One of the first to use computer vision without requiring the user to wear
anything special was built by Starner, who used HMMs to recognize a limited
vocabulary of ASL sentences. The recognition of hand and arm gestures has been
applied to entertainment applications.

Freeman developed a real-time system to recognize hand poses using image moments and orientation histograms, and applied it to interactive video games. Cutler and Turk described a system for children to play virtual instruments and interact with lifelike characters by classifying measurements based on optical flow.

2.1 EXISTING SYSTEMS AVAILABLE WORLDWIDE

2.1.1 GNOSYS

GNOSYS is a smartphone app powered by artificial intelligence (AI). Also referred to as a "Google Translator for the deaf," it works by placing a smartphone in front of the user. It uses neural networks and computer vision to interpret a photo of the sign language speaker, which smart algorithms then convert to text.

2.1.2 ACE ASL

Ace ASL is the first AI-based ASL app to provide immediate feedback on sign language through photos. It uses AI to analyze hand gestures and provide translations immediately.

2.2 LIMITATIONS OF PREVIOUS SYSTEMS

Previous systems tried to provide quick and accurate sign language conversion, but they had limitations in accuracy and ease of use, and their output was not consistent. They also involved many steps and requirements, such as asking for your preferred hand and skin color before you could use the app. The UI was also a bit complex, and the response time was not quick.

2.3 SOLUTIONS PROPOSED BY OUR SYSTEM

After hours of research and review of existing systems, we compiled the functions we needed to implement to improve on those systems. Our system has a very friendly UI, and its accuracy is much improved and more consistent. We increased our accuracy by training our model on a large and varied dataset.

CHAPTER 3: METHODOLOGY

3.1 PROCESS MODEL

3.1.1 INCREMENTAL MODEL

Figure 3.1: Block Diagram of Incremental Process Model


The incremental model is one of the most widely adopted software development process models, in which the software requirement is broken down into many standalone modules within the software development life cycle. Incremental development is carried out in steps covering analysis, design, implementation, all required testing or verification, and maintenance. In the incremental model, each iteration stage is developed in turn, so each stage goes through the requirements, design, coding, and testing phases of the software development life cycle. Functionality developed in each stage is added to the previously developed functionality, and this repeats until the software is fully developed. At each incremental stage there is a thorough review, on the basis of which the decision about the next stage is made.

The main strength of the incremental model is that it divides software development into submodules, each of which is developed by following the SDLC phases of analysis, design, coding, and testing. This ensures that no expected objective of the software is missed, however minor it may be. Thus, we achieve 100% of the software's objectives with this model, and since we test aggressively after each stage, we ensure that the final software is defect-free and that each stage is compatible with the previously developed and future stages.

3.2 BLOCK DIAGRAM

Figure 3.2: System block diagram

We prepared the dataset by collecting landmarks from the four members of our group, using MediaPipe to extract them. In total, we collected data from 3,699 images, averaging about 140 images per alphabet letter, using both the left and the right hand.

The landmarks obtained from MediaPipe are then preprocessed. MediaPipe provides extra data that we do not need, so we extracted only the useful data, namely the x and y coordinates. Those coordinates were then normalized.
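A minimal sketch of this pre-processing step follows. The report does not specify the exact normalization, so the scheme below, making each coordinate relative to the wrist landmark and scaling by the largest offset, is our assumption; the function name is ours as well.

```python
# Hypothetical sketch of landmark pre-processing (assumed scheme).
# Assumption: each hand is described by 21 MediaPipe landmarks, and we
# normalize relative to the wrist (landmark 0) and scale by the largest
# absolute offset, giving translation- and scale-invariant features.

def preprocess_landmarks(landmarks):
    """landmarks: list of 21 (x, y) tuples -> flat list of 42 floats in [-1, 1]."""
    base_x, base_y = landmarks[0]                      # wrist as the origin
    rel = [(x - base_x, y - base_y) for x, y in landmarks]
    max_abs = max(max(abs(x), abs(y)) for x, y in rel) or 1.0
    return [v / max_abs for pair in rel for v in pair]
```

The resulting 42-value vector is what would be fed to the classifier in place of raw pixel data.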

There are 26 labels, one for each alphabet letter. We used 3,500 of the 3,699 images for training, and the remaining 200 images were used for testing. We used multiple linear regression to train our model.

After the trained model classifies the gesture, the index of the classified letter is mapped to the corresponding alphabet letter, and the result is displayed.
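Assuming the 26 class indices follow alphabetical order (the report does not state the label ordering), this final mapping step can be sketched as:

```python
import string

# Assumed label order: class 0 = 'A', class 1 = 'B', ..., class 25 = 'Z'.
LABELS = string.ascii_uppercase

def index_to_letter(class_index):
    """Map the classifier's output index (0-25) to the displayed letter."""
    return LABELS[class_index]
```

For example, a classifier output of index 0 would be displayed as "A".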

3.3 ALGORITHMS

MULTIPLE LINEAR REGRESSION:


Multiple linear regression is the most common form of linear regression analysis. As a predictive analysis, multiple linear regression is used to explain the relationship between one continuous dependent variable and two or more independent variables.

Multiple linear regression implementation:


1. Import the necessary libraries, including TensorFlow, Keras, NumPy, pandas, and os.
2. Use pandas to read the CSV datasets and create the training and testing sets.
3. Create a Keras Sequential model using Dense, Dropout, and Flatten layers.
4. Add a final classification layer with softmax activation.
5. Compile the Sequential model with the sparse categorical cross-entropy loss function and the Adam optimizer.
6. Train the model for 30 epochs.
7. Save the trained model.
8. Evaluate the model's performance.
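The steps above rely on TensorFlow/Keras for the actual model. As an illustration of what the final classification layer computes (steps 4 and 5), here is a dependency-free sketch of the softmax activation and the sparse categorical cross-entropy loss; the function names are ours.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution over the classes."""
    m = max(logits)                           # subtract the max for stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(logits, true_index):
    """Loss from step 5: negative log-probability assigned to the true class."""
    probs = softmax(logits)
    return -math.log(probs[true_index])
```

Training drives this loss toward zero, which means the model assigns probability close to 1 to the correct letter.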

3.4 NECESSARY UML DIAGRAMS

3.4.1 DFD LEVEL 0

Figure 3.3: DFD Level 0


3.4.2 DFD LEVEL 1

Figure 3.4: DFD Level 1

3.4.3 DFD LEVEL 2

Figure 3.5: DFD Level 2

3.4.4 USE CASE DIAGRAM

Figure 3.6: Use Case Diagram

3.4.5 ACTIVITY DIAGRAM

Figure 3.7: Activity Diagram

3.5 TOOLS USED

3.5.1 GOOGLE COLABORATORY

Google Colaboratory is a web-based platform that allows a team of developers to work together online. It is also a great platform for training machine learning models, as it provides powerful hardware for fast compilation and training.

3.5.2 PYTHON

Python is an interpreted, object-oriented, high-level programming language with dynamic semantics. Python is easy and simple, so it takes little time to turn pseudo-code into a working program. Python's popularity these days can be attributed to its large number of modules and packages, which encourage code reusability. One might ask, "Why use Python?", since it is quite slow compared to a language like C++ when speed is a major concern. The answer is that Python is handy when it comes to deep learning: several high-level machine learning libraries exist only for Python. There are also many tutorials and communities that help learners take their first steps in deep learning with Python, focusing on the logic and understanding of neural networks rather than on syntax.

3.5.3 TENSORFLOW

TensorFlow is an end-to-end open-source platform for machine learning developed by the Google Brain team, originally for internal Google use. Released in 2015, it is the most widely used machine learning and deep learning package. Tensors are the multidimensional arrays commonly operated on in neural networks, hence the name 'TensorFlow'. It helps developers create multi-layer neural networks for classification, regression, prediction, and generation. TensorFlow provides a Python interface that makes it easy to use. The library can run on multiple CPUs and GPUs, is available on various platforms, and scales computation across machines.

3.5.4 OPENCV

OpenCV is an open-source computer vision and machine learning software library written in C and C++. It was designed for computational efficiency in real-time applications and takes advantage of multicore processors. With OpenCV we can process at least 30 frames per second, which is why it performs well in real-time detection. The library has more than 2,500 optimized machine learning and computer vision algorithms, with applications that include detecting and recognizing faces and objects, detecting colors, extracting 3D models, tracking movement in the camera, and determining human actions on camera.

3.5.5 MEDIAPIPE

MediaPipe is an open-source framework for building pipelines that perform computer vision inference over arbitrary sensory data such as video or audio. It readily provides recognition of hands, faces, and other objects, which makes it easier for developers to work on detection projects.

3.5.6 HTML/CSS:

HTML is a markup language that acts as the skeleton of a website, providing the basic structure of the pages we see. Alongside HTML, CSS is used to give an interactive user interface to that skeleton; in short, CSS adds visual design to the bare structure created with HTML.

3.5.7 PANDAS

It is a software library written for Python for data manipulation and analysis. It offers
data structures and operations for manipulating numerical tables and time series.

3.5.8 JAVASCRIPT

JavaScript, often abbreviated as JS, is a programming language that is one of the core technologies of the World Wide Web, alongside HTML and CSS. Websites use JavaScript on the client side for webpage behavior, often incorporating third-party libraries. All major web browsers have a dedicated JavaScript engine to execute the code on users' devices.

3.6 VERIFICATION AND VALIDATION

In total we collected data from about 3700 images and trained the model for 30
epochs.

Description of data used:


 Training data: 3700
 Validation data: 200
 Number of classes: 26
 Images per class: 140

The following results were obtained:

 Training accuracy: 99.4%
 Validation accuracy: 99.83%
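These accuracy figures are simply the fraction of samples whose predicted class matches the true label, as in this small sketch (the function name is ours):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground-truth labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)
```

For instance, 199 correct predictions out of 200 validation samples would give an accuracy of 0.995.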

Figure 3.8: Graph between training and validation accuracy

Figure 3.9: Graph between training and validation loss

CHAPTER 4: EPILOGUE
4.1. RESULTS AND CONCLUSION
The Sign Language Recognition System is a web-based application that recognizes American Sign Language in real time. The project was built using machine learning: we implemented multiple linear regression to build the model and achieved 99.4% accuracy.

Our system helps deaf and mute people communicate with others. It provides a platform through which they can communicate with the people around them and fulfill their requirements. With high accuracy and fast response times, communication becomes seamless, giving the feeling of a natural conversation, which is the ultimate goal of our project.

4.2 FUTURE ENHANCEMENT


 To develop a mobile application implementing this feature.
 To add features such as sentence formation and audio translation.

REFERENCES

[1] Cui Y. and Weng J. (2000), "Appearance-based hand sign recognition from intensity image sequences," Computer Vision and Image Understanding.

[2] Sangeeta Kumari and Parth Srivastav (2020), "Hand gesture-based recognition for interactive human-computer interaction using TensorFlow."

[3] Abdul Rehman Javed (2022), "Hyper-tuned deep convolutional neural network for sign language recognition."

[4] Panwar, Meenakshi and Mehra, Pawan (2011), "Hand gesture recognition for human computer interaction."

[5] S. Albawi, T. A. Mohammed and S. Al-Zawi, "Understanding of a convolutional neural network," 2017 International Conference on Engineering and Technology (ICET), Antalya, 2017.

SCREENSHOTS
