
TITLE (in Capital Letter)

MINI PROJECT REPORT

of

Submitted in partial fulfilment of the requirements for the award of the Degree of

BACHELOR OF TECHNOLOGY

IN

ELECTRONICS AND COMMUNICATION ENGINEERING

Submitted by

Member 1 Roll Number


Member 2 Roll Number
Member 3 Roll Number
Member 4 Roll Number

UNDER THE GUIDANCE OF


Guide Name
(Guide Post)
Department of ECE

2021-2025

I
2021-2025

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING

CERTIFICATE

This is to certify that the project entitled “Title” is the bonafide work done by Member 1
(Roll Number), Member 2 (Roll Number), Member 3 (Roll Number), Member 4 (Roll
Number) in partial fulfilment of the requirements for the award of the degree of B. Tech in
Electronics and Communication Engineering, during the academic year 2023-2024.

Internal Guide: Guide Name

Head of the Department: Dr. S. V. S. Prasad

External Examiner

II
ACKNOWLEDGEMENT

We express our profound thanks to the management of MLR Institute of Technology, Dundigal,
Hyderabad, for supporting us to complete this project.

We take immense pleasure in expressing our sincere thanks to Dr. K. Srinivasa Rao, Principal, MLR
Institute of Technology, for his kind support and encouragement.

We are very much grateful to Dr. S. V. S. Prasad, Professor and Head of the Department, MLR Institute
of Technology, for encouraging us with his valuable suggestions.

We are very much grateful to Guide Name, Guide Post for his unflinching cooperation throughout the
project.

We would like to express our sincere thanks to the teaching and nonteaching faculty members of ECE
Department of MLR Institute of Technology, who extended their help to us in making our project work
successful.

Project Associates:

Member 1 Roll Number

Member 2 Roll Number

Member 3 Roll Number

Member 4 Roll Number

III
ABSTRACT

This project will help many dumb and deaf citizens across the world. Most ordinary people
do not know sign language, so it becomes difficult when disabled people need to
communicate with them. We therefore developed a prototype that translates sign language
gestures into text and speech. The prototype recognizes hand signs and converts them into a
communicable language, specifically English words or phrases. The aim of our work is to
help the deaf and dumb community talk to those who are not able to understand sign
language. We develop software that records and tracks the motion of the hand signs
recognized by the camera and converts them. This sign language translator is implemented
using Python. It supports two-way communication, makes it easy to communicate with deaf
and dumb people, and achieves higher accuracy than other methodologies. After running the
Python code, entered text is automatically converted into a GIF, or the corresponding hand
signs are simply displayed on the screen. Text-to-sign conversion is easy to understand, and
the accuracy of this project is 99.3%.

IV
CONTENTS
Title Page Ⅰ
Certificate Ⅱ
Acknowledgement Ⅲ
Abstract Ⅳ

CHAPTER 1 – INTRODUCTION Page No.


1.1. Introduction 1
1.2. Difficulties faced by the blind 1-3
1.2.1. Navigating around places 2
1.2.2. Finding Reading Material 2
1.2.3. Getting devices to become independent 2-3
1.3. Introduction of Sign Language 3
1.4. Gestures of Sign Language 3-6
1.5. Motivation 6-8
1.6. Objective 8-9
1.7. Problem Statement 9-10
1.8. Project Overview 10

CHAPTER 2- LITERATURE SURVEY 11

Journal 1 11-12

Journal 2 12

Journal 3 12-13

Journal 4 13-14

Journal 5 14-15

CHAPTER 3 – METHODOLOGY

3.0 Introduction 16
3.1 Required Libraries 17-21

V
Page No.
3.1.1. OpenCV 17
3.1.1.1. OpenCV's application areas include 17-19
3.1.2. NumPy 19-21
3.1.2.1. Comparison between Core Python and NumPy 20
3.1.2.2. Advantages of using NumPy with Python 20-21
3.1.3. TensorFlow 21-22
3.1.3.1. Features of TensorFlow 21
3.1.3.2. Installation 21-22
3.1.4. Pillow 22
3.1.5. Tkinter 23
3.1.5.1. How to use Tkinter 23
3.1.6. Pyttsx3 23
3.1.6.1. Installation 23
3.1.7. Media pipe 24
3.1.7.1. Uses of media pipe 24
3.1.7.2. Media Pipe Hands 24
3.1.8. Text Blob 25
3.1.9. Keras 26-28
3.1.9.1. Importance of Keras 26-27
3.1.9.2. Backend of Keras 27
3.1.9.3. Advantages of Keras 27-28
3.2 Webcam 28
3.3 Flowchart 28-30
3.3.1 Flowchart: Figure 3.10- Text to sign conversion. 30
3.3.2 Flowchart: Figure 3.11- Sign to text conversion 29-30
3.4 Source Code 30-44
3.4.1 Main code 30-44

CHAPTER 4 – RESULTS AND DISCUSSIONS

4.1 Results 45
4.1.1. Text to Sign Practical Results 45-47
4.1.2. Sign to Text Conversion 47-49

VI
4.2 Table Representation 49-51
4.2.1. Comparison Table 50
4.2.2. System Symbols 51
4.3 Discussions 51-52
4.3.1. Advantages 51-52
4.3.2. Disadvantages 52
4.3.3. Applications 52

CHAPTER 5 – CONCLUSIONS AND FUTURE SCOPE

5.1 Conclusion 52
5.2 Future Scope 54

REFERENCES 55-56

VII
List of Figures

S. No. Fig. No. Description


1 1.1 Deaf people

2 1.2 Dumb people

3 1.3 Blind people

4 1.4 Hand gestures for deaf and dumb

5 3.1 OpenCV library

6 3.2 Applications of CV

7 3.3 NumPy library

8 3.4 NumPy installation

9 3.5 TensorFlow

10 3.6 Installation of TensorFlow

11 3.7 Palm detection

12 3.8 Text Blob

13 3.9 Keras

14 3.10 Webcam

15 4.1 Hand Gesture for letter h

16 4.2 Hand Gesture for letter e

17 4.3 Hand Gesture for letter l

18 4.4 Hand Gesture for letter o

19 4.5 Stop Gesture

20 4.6 Right Pointer gesture

21 4.7 Close gesture

22 4.8 Right Open gesture

VIII
List of Tables

S. No. Table No. Description


1 4.2.1 Comparison table between previous projects and our project

2 4.2.2 System Symbols of hand gestures

IX
Acronym

LED Light Emitting Diode

ECE Electronics and Communication Engineering

X
CHAPTER 1
INTRODUCTION
1.1 Introduction

There are around 70 million deaf people in the world, and India alone has a large population of about 2.4
million of them. Indian Sign Language is the primary means they use for communication [1]. For people
who do not know how to communicate with the deaf and dumb, it is a real challenge. These people are
able to communicate among themselves, but ordinary people usually do not understand the signs they use.
Many techniques have therefore been developed to bridge the communication gap between normal people
and the deaf and dumb [1].

Figure 1.1 and Figure 1.2 below illustrate the hearing and speech disabilities of the people for whom our
project is made; people who cannot hear or speak communicate through hand signs, and the images below
show such people.

Figure 1.1 Deaf people Figure 1.2 Dumb people

1.2 Difficulties faced by the blind


Blindness is one of the most misunderstood types of disability, if not the most. The general masses have
their own preconceived notions about blind people that they firmly believe to be true without ever
getting in touch with a blind person. Most members of the non-blind community believe that blind
people cannot do their work or live a normal life. There are many perfect examples of the contradiction
between society's perspective and the reality of a blind person's life [2].

1
Blind people do lead a normal life with their own style of doing things. But they face troubles due to
inaccessible infrastructure and social challenges. Let us have an empathetic look at some of the daily life
problems, struggles and challenges faced by the blind people [2].
1.2.1 Navigating around places
The biggest challenge for a blind person, especially one with complete loss of vision, is to navigate around
places. Obviously, blind people roam easily around their house without any help because they know the
position of everything in the house [3]. People living with or visiting blind people must make sure not to
move things around without informing or asking the blind person. Commercial places can be made easily
accessible for the blind with tactile tiles. But unfortunately, this is not done in most places, which creates a
big problem for blind people who might want to visit them [3].

1.2.2 Finding Reading Material


Blind people have a tough time finding good reading materials in accessible formats. Millions of people
in India are blind, but we do not even have proper textbooks in braille, let alone novels and other leisure
reading materials. The Internet, the treasure trove of information and reading material, is also mostly
inaccessible to blind people. Even though a blind person can use screen-reading software, it does not make
the Internet surfing experience very smooth if the websites are not designed accordingly. A blind person
depends on image descriptions to understand whatever is represented through pictures, but most of the
time websites do not provide clear image descriptions [3].

Figure 1.3 shows how blind people read books; it illustrates one of the techniques used by blind people for
reading.

Figure 1.3 Blind People Reading

2
1.2.3 Getting devices to become independent
The most valuable thing for a disabled person is gaining independence. A blind person can lead an
independent life with some adaptive devices specifically designed for them. There is a lot of adaptive
equipment that can enable a blind person to live independently, but it is not easily available in local shops
or markets. A refreshable braille display is an example of such a useful device. A blind person needs to
hunt and put in much effort to get each piece of equipment that can take them one step closer towards
independence [3].

Everyone faces challenges in their life; blind people face a lot more. But this certainly does not mean
that blind people should merely be pitied. They too, just like any individual, take up life's challenges and
live a normal life, even if it does not seem normal to sighted individuals. In spoken communication, words
are produced by the mouth and carried as sound; in this project we concentrate on sign language instead,
because vision is the clear channel available to deaf and dumb people [1,2].

1.3 Introduction of Sign Language

Sign Languages (also known as signed languages) are languages that use the visual-manual modality to
convey meaning, instead of spoken words. Sign languages are expressed through manual articulation in
combination with non-manual markers. Sign languages are full-fledged natural languages with their own
grammar and lexicon. Sign languages are not universal and are usually not mutually intelligible although
there are also similarities among different sign languages [4].

Linguists consider both spoken and signed communication to be types of natural language, meaning that
both emerged through an abstract, protracted ageing process and evolved over time without meticulous
planning. This is supported by the fact that there is substantial overlap between the neural substrates of
sign and spoken language processing, despite the obvious differences in modality. Sign language should
not be confused with body language, a type of non-verbal communication [16].

Wherever communities of deaf people exist, sign languages have developed as useful means of
communication and form the core of local deaf cultures. Although signing is used primarily by the deaf
and hard of hearing, it is also used by hearing individuals, such as those unable to physically speak, those
who have trouble with oral language due to a disability or condition (augmentative and alternative
communication), and those with deaf family members including children of deaf adults [17,18].

1.4 Gestures of Sign Language

Figure 1.4 shows the hand gesture for each letter of the alphabet. These alphabet signs are used by deaf and
dumb people to express their thoughts and ideas.

3
Figure 1.4 Hand gestures for deaf and dumb

Gestures are a form of nonverbal communication in which visible bodily actions are used to communicate
important messages, either in place of speech or together and in parallel with spoken words [1]. Gestures
include movement of the hands, face, or other parts of the body. Physical non-verbal communication such
as purely expressive displays, proxemics, or displays of joint attention differ from gestures, which
communicate specific messages. The number of sign languages worldwide is not precisely known. Each
country generally has its own native sign language; some have more than one. The 2021 edition of
Ethnologue lists 150 sign languages, while the SIGN-HUB Atlas of Sign Language Structures lists over
200 and notes that there are more which have not been documented or discovered yet. As of 2021, Indo-
Pakistani Sign Language is the most used sign language in the world, and Ethnologue ranks it as the 151st
most "spoken" language in the world [19]. Some sign languages have obtained some form of legal recognition.

Linguists distinguish natural sign languages from other systems that are precursors to them or obtained
from them, such as constructed manual codes for spoken languages, home sign and baby sign and signs
learned by non-human primates [20]. Despite the many problems these people face, Indian Sign Language
provides them with their best means of communication and support. This report presents a sign language
translation system implemented using Python software [3].

Through this translation, the sign language is converted into spoken English. The total system consists of
a standard laptop running the Windows operating system [3,4], operating with a camera; the visuals are
processed and the output is produced through an audio or video device. The speaker output also lets blind
people hear the words corresponding to the gestures performed by the other person with their hands. The
advantage of the overall project is that no electronic glove is required, because the entire system is
realized in software only [5].

4
As we know, people with hearing and speaking disabilities communicate with each other in sign language,
which they easily understand, and it has to be translated for normal people. These people are very fluent
in sign language communication because it is their only way of telling what they want to do. India is a
vast country in which a great number of people with different religions, castes and colours live together in
unity in diversity [6].

Some of them are disabled persons, yet every person has a unique talent and every person is different from
the next. Blind, deaf and dumb people can express themselves and communicate in their own way by
using different techniques, and special talents are very much present in these people. To advance the
communication between deaf and dumb people and others, the sign language translator has come into
existence. A vision-based approach [8] has been discussed for interpreting Indian Sign Language using the
hand modality, where the typical hand-gesture recognition pipeline consists of four modules: gesture
acquisition, tracking, segmentation and recognition.

The system will perform as follows:

1 Mute people perform the sign language in front of the camera


2 For normal people the output will be displayed as text
3 No dependency on others
4 To provide a new path of communication experience

The objective of our project is to develop a communication aid for deaf and dumb people that converts
sign language into text [5,21]. This helps normal people to communicate with them by using the software
system.

By doing this project we get a basic idea of how sign language is translated into text, and Python provides
built-in functions and libraries that make our code easier to write. Many languages could be used, but
Python is chosen for efficiency. This project will be a creative model for deaf and dumb people. The
framework relies on library code and does not need a hand-crafted algorithm [22,2]. Previous projects rely
on such algorithms and use hardware components, which makes them more costly, whereas our project is
simpler and easier to implement.

As we know, there are many people in the world who are deaf and dumb, and a complication arises when
they want to communicate with normal people: only a few people know the sign language that helps to
communicate with the deaf and dumb, and not everyone is familiar with that language, so even when we
want to help them we often cannot. There are also times when communicating in sign language can
prevent greater damage. Since this difficulty in communication arises from not knowing the language, we
are creating a translator to address it [9].

The proposed solution: since existing translators from one language to another work on written or spoken
language, we want to make a translator that works through computer vision, i.e., it tracks the motion and
position of our hands. Every position of our hands holds a desired meaning, which can be a word or a
letter [7]. The software we are designing will track the motion of the hand signs and translate it, as well as
create the sound of the pronunciation of that word or letter [9].

1.5 Motivation

Accessibility: Many people, such as the deaf or hard of hearing, cannot communicate in spoken language.
A two-way sign recognition system would allow these people to communicate more effectively with other
people who do not know sign language.

Efficiency: Traditional sign language interpretation methods require a human interpreter, which can be
time-consuming and expensive. A two-way hand sign recognition system can provide a faster and more
efficient way of communication.

Independence: A two-way manual sign recognition system would allow sign language users to
communicate independently without the help of others.

Accuracy: Human interpreters can make mistakes or misunderstand certain signs, which can lead to
miscommunication. A two-way hand sign recognition system, on the other hand, would provide a more
accurate way to communicate.

Innovation: Developing a two-way hand sign recognition system would represent an innovative use of
technology that could benefit many people. It can also be used in various environments such as schools,
hospitals, and workplaces.

Versatility: The two-way manual sign recognition system recognizes different sign languages, including
American Sign Language (ASL) and British Sign Language (BSL), allowing users to communicate with
people from different regions.

Privacy: Some people may not want a human interpreter during sensitive or personal conversations. A
two-way hand sign recognition system would provide a private means of communication.

Cost-effective: Hiring a human translator can be expensive, especially for those who require regular
translation. A two-way hand sign recognition system would be a one-time investment that could save
money in the long run.

Integration with other technologies: The two-way hand gesture recognition system can be integrated with
other technologies such as virtual assistants, speech recognition and machine learning, making it more
advanced and useful.

Educational benefits: The two-way hand recognition system can be used as a teaching tool to help people
learn sign language. It can also be used to facilitate communication between deaf and hearing people,
promoting inclusion and diversity.

Remote communication: A two-way hand gesture recognition system can facilitate communication
between people who are physically far from each other, such as during a video call or conference. This
can be particularly useful in situations where travel is difficult or impossible.

Emergency situations: In emergency situations where people may not be able to communicate verbally, a
two-way hand sign recognition system can provide a quick and effective way to communicate.

User-friendly: The two-way hand sign recognition system can be designed to be user-friendly with a
simple user interface that is easy to use for people of all ages and backgrounds.

Customizable: The two-way hand sign recognition system can be customized to meet the needs of
different users, such as recognizing certain signs or gestures that are unique to an individual or
community.

Research potential: The two-way hand recognition system can also be used for research purposes, such as
studying the similarities and differences between different sign languages or understanding how people
use non-verbal communication in different contexts.

Improved communication in noisy environments: A two-way hand sign recognition system can improve
communication in noisy environments where spoken language may be difficult to hear or understand.

Greater engagement: A two-way hand recognition system can help break down communication barriers
between people speaking different languages, promoting engagement and understanding.

Accessibility for people with physical disabilities: For people who may have physical disabilities that
make it difficult to use traditional methods of communication, such as speech, a two-way hand sign
recognition system can provide an accessible means of communication.

Real-time communication: The two-way hand gesture recognition system can provide real-time
communication, which allows faster and more efficient communication between users.

Better accuracy and clarity: a two-way hand sign recognition system can improve the accuracy and clarity
of communication, reducing the risk of misunderstandings or misinterpretations.

Multi-user support: The two-way hand gesture recognition system can be designed to support multiple
users simultaneously, enabling group communication and collaboration.

Increased security: In certain settings, such as construction sites or military operations, a two-way hand
sign recognition system can provide a secure method of communication where verbal communication
may not be possible or safe.

Future development potential: The development of a two-way hand recognition system has the potential
for future technological development, leading to better communication systems and the use of more
advanced sign language recognition technology.
8
1.6 Objective

The goal of the project is to develop a two-way hand sign recognition system that recognizes and
accurately translates sign language gestures into spoken language and vice versa. The system must be
user-friendly, accessible, versatile, and adaptable to the needs of different users. The project aims to
promote inclusion and accessibility, improve communication in noisy environments and provide a fast,
efficient, and cost-effective means of communication for the deaf or hard of hearing. The project also
aims to promote technological development in sign language recognition and pave the way for further
development in the field. To achieve this goal, the project designs and develops a robust and accurate sign
language recognition algorithm that can analyse hand gestures and convert them into corresponding text
or speech. The system is trained using various sign languages to ensure its accuracy and versatility.

The project also develops a user-friendly user interface that enables easy communication and interaction
between users regardless of their language or communication skills. The user interface must be adaptable
to the needs of different users and it must support real-time communication between multiple users.

The implementation of the project focuses on accessibility and inclusion, ensuring that the system meets
the needs of the deaf or hard of hearing. The system is tested and evaluated with input from sign language
users to ensure its effectiveness and usability. To ensure the success of the project, a thorough analysis of
existing sign language recognition systems and technologies will be conducted. This analysis informs the
design and development of a two-way hand sign recognition system, considering the strengths and
weaknesses of current systems and identifying areas for improvement.

The project also collaborates with sign language experts, including deaf and hard of hearing people, sign
language interpreters and linguists. Such cooperation ensures that the system is deeply developed to
understand sign language and the communication needs of its users. In addition to developing a two-way
hand sign recognition system, the project will create a detailed user manual and training materials to
enable users to use the system effectively. This includes providing technical support and maintenance
services to ensure the system is operational and up-to-date.

The goal of the project is to provide sign language users with a reliable and accessible means of
communication, improve their ability to communicate with others and promote inclusion and
understanding in society. The project aims to positively impact the lives of deaf or hard of hearing people
and promote sign language recognition technology.

9
1.7 Problem Statement

Communication is an integral part of our daily lives, and the ability to communicate effectively with
others is essential for social, academic, and professional success. However, traditional forms of
communication, such as spoken language, may not be accessible to the deaf or hard of hearing. Sign
language is an important means of communication for these people, but it can cause problems when
communicating with people who do not understand sign language.

Current solutions for sign language recognition are limited and most of them are designed for one-way
translation from sign language to spoken language. This is a major challenge for people who rely on sign
language to communicate because it limits their ability to communicate with people who do not
understand sign language. In addition, most sign language recognition systems are expensive, complex,
and not easy to use, which makes it difficult for many people to use them.

Thus, the problem of this project is to develop a two-way manual sign recognition system that is accurate,
user-friendly, and easy to achieve, and can provide a reliable and cost-effective means of communication
for sign language users. The system must be able to translate sign language gestures into spoken language
and vice versa, support real-time communication between multiple users, and be adaptable to the needs of
different users. The aim of the project is to provide a solution to the communication problems faced by
the deaf or hard of hearing, promoting inclusion and accessibility in society. The lack of reliable and
accessible two-way sign language recognition systems prevents people who rely on sign language from
communicating effectively. This often leads to social isolation and a lack of opportunities in academic
and professional settings. Therefore, the development of a two-way sign language recognition system that
enables effective communication is crucial to promote inclusion and accessibility in society.

In addition, most existing sign language recognition systems rely on complex and expensive hardware
and software, making them inaccessible to many people who would benefit from them. The goal of the
project is to develop an affordable, accessible, and easy-to-use system that allows people with different
levels of technical expertise to operate and benefit from it.

The project also addresses the limitations of existing one-way sign language recognition systems, which
are designed to translate sign language into spoken language but do not provide reverse translation. This
makes it difficult for people who use sign language to communicate with those who do not understand it.
The goal of the two-way sign language recognition system developed within this project is to provide a
reliable and accessible means of two-way communication, improve the quality of life of people who use
sign language, and promote inclusion in society. The need for an accurate and reliable two-way sign
language recognition system is particularly important in emergency situations and health services where
effective communication can be a matter of life or death. In such situations, not understanding or
misinterpreting the signs can lead to serious consequences. Therefore, the development of a two-way sign
language recognition system could save lives and improve the quality of care for people who use sign
language. In addition, the project aims to advance sign language recognition technology. This requires
addressing the limitations of current systems and identifying areas for improvement. By developing a
robust and accurate two-way sign language recognition system, the project aims to push the boundaries of
sign language recognition technology, paving the way for more innovative and effective solutions in the
future. In short, the design problem is to develop a reliable, accurate, and easily accessible two-way sign
language recognition system that would facilitate effective communication between sign language users
and non-users. The project aims to address the limitations of current systems, provide a cost-effective
and innovative solution, and promote inclusion and accessibility in society.

1.8 Project Overview

This project develops a sign language translator for disabled people that supports two-way communication.
We have used Python, which is an open-source and robust platform and is easier to work with than other
software. Previous projects, discussed in the literature survey, used image processing and Arduino boards.
After that we discuss the results and compare our project with other projects; compared to them, our
project achieves higher accuracy. Finally, we conclude the project.

11
CHAPTER 2

LITERATURE SURVEY

Journal 1:

It uses data processing methods to recognize sign gestures. The system starts by taking a raw image from
the user, processes it, and sends the data to a gesture classifier. Preloaded gesture data, recorded according
to hand position, finger position, shape and similar attributes, is saved in advance, and this saved data is
compared with the uploaded data to recognize the sign gesture. Many vision-based and sensor-based
techniques have been used for sign language recognition. Pavlovic et al. [22] discussed the visual
interpretation of hand gestures for Human-Computer Interaction. The paper, published in 1997, emphasizes
the advantages, shortcomings and important differences among gesture interpretation approaches
depending on whether a 3D model of the human hand or an image appearance model of the human hand is
used. At the time the survey was done, 3D hand models offered a more elaborate way of modelling hand
gestures but led to computational hurdles that had not been overcome given the real-time requirements of
HCI. They also discussed implemented gestural systems as well as other potential applications of vision-
based gesture recognition [12].

Images of various Indian Sign Language alphabets were collected using various webcams. Initially, the
data collection system was created using OpenCV and Python library packages. In this program, images
were captured manually with a camera; several images of every sign of a particular letter were collected
against different backgrounds and stored under that letter, so that, for example, images of the "S" sign are
stored against the letter S in the database, and so on. After that, work started on a data processing system
that converts the resized images to grayscale [2,23].

This article describes an American Sign Language recognition system that employs the Leap Motion
Controller to recognize 26 letters and 10 numbers. The study included a total of 23 characteristics, which
were then split into six distinct groups of combinations. According to the findings, the space between two
fingertips and the neighbouring fingers is an important characteristic for sign language understanding.
Anna Deza, Danial Hasan, et al. proposed a neural network model that can determine, from a given
picture of a signing hand, which letter of the American Sign Language (ASL) alphabet is being signed.
This translator would substantially reduce the barrier for many mute people who are unable to hear to
converse with others in everyday situations. Prediction of sign language must happen in real time, as it
involves dynamic gestures (facial, body pose, hand pose) [13]. Low-latency image processing and a high
frame rate are significant in improving the efficiency of sign recognition. The widely used procedural
image processing algorithms are computationally intensive, which leads to frame loss and a decrease in
video streaming quality due to delay in the computation. Multithreaded computation, which takes
advantage of the multicore processors present in most modern embedded devices, is therefore
introduced [12].

Journal 2:

The main aim of this project is to capture audio from the user, recognize the speech, convert it into text,
separate each letter, and display the specific hand sign for each one. To capture the audio, the PyAudio
module was used, and speech recognition is performed with the Google Speech API; the audio is
recognized by connecting to the Internet. The text is then pre-processed using NLP (Natural Language
Processing). A machine can only understand binary (i.e., 0 and 1), so NLP was introduced to make the
machine understand human language. Natural Language Processing is the ability of the machine to
process and structure the spoken text; it understands the meaning of the words said and produces the
output accordingly. Text pre-processing consists of three steps: tokenization, normalization and noise
removal. Finally, dictionary-based machine translation is performed, and at the output we can see the
video/GIF of the hand signs corresponding to the audio. This project will help many dumb and deaf
citizens across the world; most ordinary people cannot help with sign language, so it becomes difficult
when disabled people need to communicate with regular people. So, we developed a prototype that
translates sign language gestures into text and speech. This prototype translates the hand signs and
converts them into a communicable language, words or phrases, specifically in English. The aim of our
project is to help the deaf and dumb community talk to those who do not understand sign language; we
will develop software that records and tracks the motion of the hand signs recognized by the camera and
converts them [24].

Our objective is to help people suffering from hearing problems. Many projects have been done on sign
languages that take sign language as input and produce text or audio as output, but audio-to-sign-language
conversion systems have rarely been developed. Such a system is useful to both normal and deaf people.
In this project we introduce a new technology: an audio-to-sign-language translator using Python. It takes
audio as input, converts the recording to text using Google speech recognition, displays the text on screen,
and finally gives the sign code of the given input using an ISL (Indian Sign Language) generator [13]. All
the words in the sentence are then checked against the words in a dictionary containing images and GIFs
representing the words. If a word is not found, its corresponding synonym is used instead. A set of
gestures is predefined in the system.
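The following is a minimal sketch of the audio-to-text stage described above; it is not the surveyed paper's code. It assumes the third-party SpeechRecognition package (and PyAudio for microphone access), and the gif_for() helper and the signs/ folder are purely illustrative placeholders for the dictionary of sign GIFs.

import speech_recognition as sr   # pip install SpeechRecognition

def gif_for(word):
    # Hypothetical lookup: map a word to a GIF stored in a local "signs/" folder.
    return "signs/" + word.lower() + ".gif"

recognizer = sr.Recognizer()
with sr.Microphone() as source:   # needs PyAudio installed for microphone access
    print("Speak now...")
    audio = recognizer.listen(source)

try:
    text = recognizer.recognize_google(audio)   # online Google Speech API
    for word in text.split():                   # simple tokenization
        print(word, "->", gif_for(word))
except sr.UnknownValueError:
    print("Could not understand the audio")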

Journal 3:

In this project report we see control of multiple gestures. In most sign converters only the signs made by
the hands are translated, but here whole-body gestures are noted. The rest of the processing follows the
same basic format: taking the gestures as input, recognizing them by comparing against a preloaded
dataset, and converting them into words, so that at the end we get text as output. It uses a pose estimation
library, together with TensorFlow, to track whole-body movements and create the dataset; the rest of the
procedure is as mentioned above [3].

The main objective of this project is to recognize the gestures and display the corresponding word.
The first phase involves capturing the gesture using a webcam along with pose estimation library. The
webcam captures the image and image is processed with pose estimation algorithm in tensor-flow utility.
It shows how the webcam is reading the image and the skeleton mapped on the image is the result of the
pose estimation library. The skeleton obtained provides the values for creating the data set; the data set is
a collection of the values of the coordinates of the end points of the skeleton. These values are labelled
accordingly and are appended to the machine for predicting when the input is taken. The block diagram
explains how the work is carried out in the system [7].

Society has long preferred that hearing-impaired and deaf people have a sign language with which to
communicate. With the arrival of several technologies, it has become relatively easy, and quite popular, to
build a translator that converts sign language into an appropriate sentence. As shown in their paper [25],
the work is first based on Arabic sign language and automates the translation process to give a subtle way
of communication; they further describe the scope of their project, its usage and a defined set of
measurements. The application directly converts Arabic sign language into a meaningful sentence by
applying an automated machine learning algorithm, as they concluded. Capturing signs from the real
world and translating them is the core objective of this work. The real-world signs are read using a
webcam, which captures both static and moving images of the objects in front of it. The deaf and dumb
person who is signing stands in front of the webcam, and the captured image is processed with the
ft-pose-estimation library to map out the skeleton of the person signing. This is an example of how the
skeleton is mapped on the system [2].

Journal 4:

Most sign converters decode from hand movements, but here the process is reversed: the voice from the
microphone is taken as input, the received data is processed, and the sound is converted into letters, words
and sentences. The signal first passes through noise removal methods to remove channel noise, free-space
noise, microphone noise, etc., and is then given to the speech recognition system, which compares it with
the trained voice database to recognize the words. The text obtained from speech recognition is then
converted into signs with the help of rule-based matching: the system checks the text against the sign
database and displays the recognized text as well as the animated hand signs so the end user can see
them [1]. The whole project is aimed at helping normal people communicate with disabled people, so it
was made portable to run on mobile phones as an application.

ISL recognition system is a way by which Indian Sign language can be decoded and interpreted in the
local spoken language. In order to develop such systems, large database is required. Western countries
like the United States of America, United Kingdom have prepared their Sign Language database which is
available over internet. However, there is no authentic database of ISL available. Hence those who want
to work on development of ISL recognition system need to create their own database of gestural signs.
The linguistic studies of ISL began from 1978 onwards. ISL is a completely natural language with its own
grammar, syntax, phonetics, and morphology [1]. The gesture recognition system that was developed
examined the input gestures for a match with a known gesture in the gesture database.

Sign Language (SL) is the natural way of communication of the deaf community. More than 2.5% of the
world's population is deaf. According to the studies of The All-India Federation of the Deaf, in India around 4
million are deaf and more than 10 million people are hard of hearing. Every country has its own Sign
Language developed with several grammatical differences [9].

Journal 5:

Media Pipe is an open-source framework for computer vision solutions released by Google a couple of
years ago. Among these solutions, the Holistic model can track in real time the positions of the hand, pose,
and face landmarks. For now, the code only uses hand positions to make the prediction [9].

Media Pipe Hands is a reliable hand and finger tracking solution. It uses machine learning (ML) to infer
21 3D hand landmarks from just a single frame. Whereas current state-of-the-art approaches rely primarily
on powerful desktop environments for detection, this approach achieves real-time performance on mobile
phones and even scales to multiple hands. Making this hand perception functionality available to the wider
research and development community is expected to give rise to creative use cases, stimulating new
applications and new research avenues. Media Pipe Hands uses an integrated ML pipeline of many models
working together: a palm detection model that works on the full image and returns an oriented hand
bounding box, and a hand landmark model that is applied to the image region cut out by the palm detector
and returns 3D hand key points with high reliability. This strategy is similar to the one used in the Media
Pipe Face Mesh solution, which uses a face detector together with a face landmark model [8].

Media Pipe Hands uses a machine learning pipeline that integrates multiple co-operating models: a palm
detection model that works on the complete image and returns an oriented hand bounding box, and a hand
landmark model that works on the cropped image region defined by the palm detector and returns reliable
3D key points. To build the dataset we must photograph at least 25-30 images per sign, and with this
model we get 21 hand landmarks, i.e., points [x, y, z]. x and y are normalized to [0.0, 1.0] by the image
width and height respectively. z represents the depth of the landmark, with the depth at the wrist as the
origin; the smaller the value, the closer the landmark is to the camera. After the dataset is built, the system
can predict the sign with the help of an appropriate model; we will use the KNN algorithm [3,2].
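As a rough, hedged sketch of the landmark extraction described above (not the surveyed paper's code), the snippet below uses the MediaPipe Hands solution with OpenCV to turn one image into a 63-value feature vector (21 landmarks with x, y, z each) that could then be fed to a KNN classifier such as scikit-learn's KNeighborsClassifier; "frame.jpg" is a placeholder file name.

import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

# Static-image mode: detect at most one hand and return its 21 landmarks.
with mp_hands.Hands(static_image_mode=True, max_num_hands=1) as hands:
    image = cv2.imread("frame.jpg")                              # placeholder input image
    results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        landmarks = results.multi_hand_landmarks[0].landmark
        # Flatten the 21 (x, y, z) points into one feature vector for a KNN model.
        features = [value for p in landmarks for value in (p.x, p.y, p.z)]
        print(len(features))                                     # 63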

17
CHAPTER 3

METHODOLOGY

OpenCV is an open-source software library for computer vision and machine learning. The OpenCV full
form is Open-Source Computer Vision Library. It was created to provide a shared infrastructure for
applications for computer vision and to speed up the use of machine perception in consumer products.
OpenCV, as a BSD-licensed software, makes it simple for companies to use and change the code. There
are some predefined packages and libraries that make our life simple, and OpenCV is one of them. Figure
3.1 shows the OpenCV logo.

Figure 3.1 OpenCV Library

Gary Bradski invented OpenCV in 1999 and the first release soon came in 2000. This library is based on
optimised C / C++ and supports Java and Python along with C++ through an interface. The library has
more than 2500 optimised algorithms, including an extensive collection of computer vision and machine
learning algorithms, both classic and state-of-the-art. Using OpenCV it becomes easy to do complex tasks
such as identify and recognise faces, identify objects, classify human actions in videos, track camera
movements, track moving objects, extract 3D object models, generate 3D point clouds from stereo
cameras, stitch images together to generate an entire scene with a high-resolution image and many more.

Python is a user-friendly language and easy to work with, but this advantage comes at the cost of speed,
as Python is slower than languages such as C or C++. So, we extend Python with C/C++, which allows us
to write computationally intensive code in C/C++ and create Python wrappers that can be used as Python
modules. This way the code is fast, as it is written in original C/C++ code (since it is the actual C++ code
working in the background), and it is easier to code in Python than in C/C++. OpenCV-Python is a
Python wrapper for the original OpenCV C++ implementation.

18
3.1 Required Libraries

3.1.1 OpenCV

OpenCV (Open-Source Computer Vision) is a cross-platform library of programming functions mainly


aimed at real time computer vision. It is free to use, and provides an array of functions for machine
learning, neural networks, and other applications of artificial intelligence. To support machine learning,
OpenCV provides a statistical machine learning library that contains boosting, decision tree learning,
Artificial Neural Networks (ANN), Support Vector Machine (SVM), ensemble learning, etc. It is written
in C++ and the primary interface is in C++ as well. There are bindings in Java, Python, and
MATLAB/OCTAVE. To encourage adoption by a wider audience, wrappers in other languages such as
C#, Ch, Perl, and Ruby have been developed. All of the new developments and algorithms in OpenCV are
now developed in the C++ interface. All these objectives have been well satisfied by choosing the system
using appropriate classifiers in OpenCV for eye closure detection. In this algorithm, first a driver’s image
is acquired by the camera for processing.

In OpenCV, the face detection of the driver’s image is carried out first followed by eye detection. The
eye detection technique detects the open state of eyes only. Then the algorithm counts the number of open
eyes in each frame and calculates the criteria for detection of drowsiness. If the criteria are satisfied, then
the driver is said to be drowsy. The display and buzzer connected to the system perform actions to correct
the driver's abnormal behaviour. For this system, face and eye classifiers are required. The Haar Cascade
classifier files built into OpenCV include different classifiers for face detection and eye detection. The
built-in OpenCV XML file “haarcascade_frontalface_alt2.xml” is used to search for and detect the face in
individual frames. The classifier “haarcascade_eye_tree_eyeglasses.xml” is used to detect eyes in the open
state within the detected face.
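As a minimal illustration of how the two cascade files named above are typically used from Python (a generic sketch, not this project's source code; "frame.jpg" is a placeholder):

import cv2

# Load the Haar cascade classifiers shipped with OpenCV.
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_alt2.xml")
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye_tree_eyeglasses.xml")

frame = cv2.imread("frame.jpg")                    # placeholder input frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)     # cascades operate on grayscale images

faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    roi = gray[y:y + h, x:x + w]                   # search for eyes only inside the face region
    eyes = eye_cascade.detectMultiScale(roi)
    print("Face at", (x, y, w, h), "- open eyes detected:", len(eyes))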

3.1.1.1 OpenCV's application areas include:

1. 2D and 3D feature toolkits


2. Ego motion estimation
3. Facial recognition system
4. Gesture recognition
5. Human-computer interaction (HCI)
6. Mobile robotics
7. Motion understanding
8. Object identification
9. Segmentation and recognition
10. Stereopsis stereo vision: depth perception from 2 cameras
11. Structure from motion (SFM)

Figure 3.2 shows various applications of computer vision. The application areas include event detection,
self-driving vehicles, military applications and agricultural equipment.

Figure 3.2 Applications of CV

OpenCV runs on the following desktop operating systems: Windows, Linux, macOS, FreeBSD, NetBSD,
and OpenBSD. OpenCV runs on the following mobile operating systems: Android, iOS, Maemo, and
BlackBerry 10. The user can get official releases from SourceForge or take the latest sources from
GitHub. OpenCV uses CMake.

OpenCV is open-source software for creating computer vision related tasks, and it is available as an
extension for the C, C++, Java, and Python programming languages. Making a computer vision application in
real time is a challenging task and it needs efficient processing power. Raspberry-pi is an ARM11
controller based small sized open-source CPU with 512 MB RAM and supports 700 MHz processing
speed. It supports interfacing of various low level and high-level peripherals including digital camera and
GPIO’s. It can work with light weight Linux based operating system Raspbian which is loaded with
Python-IDLE programming software. OpenCV Linux version is installed to Raspberry-pi. Haar Cascade
Classifier technique is used for the detection of the face and eye regions. It is a learning-based approach
where a function is trained from many similar (positive) and dissimilar (negative) images. It is then used to
detect objects in other images. OpenCV comes packed with a trainer as well as a detector; the trainer is
used to create our own classifier for an object. For proper detection, 1000 eye-closing images were used as
positive images and 1000 dissimilar images as negatives. The resulting classifier is stored in a file with an
.xml extension, which is then used in the programming environment. Additionally, to detect alcohol intake,
an MQ-3 alcohol gas sensor (breathalyzer) is interfaced with the Arduino board, which checks whether the
person in the driving seat is drunk. Based on the detection of drowsiness or alcoholic intoxication, an alarm
is turned on, or the car's power source can be cut through a relay to stop the car or prevent the driver from
starting it.

18
3.1.2 NumPy

NumPy is a library for the Python programming language, that adds support for large, multi-dimensional
arrays and matrices, along with a large collection of high-level mathematical functions to operate on these
arrays. The ancestor of NumPy, Numeric, was created by Jim Hugunin with contributions from several
other developers. In 2005, Travis Oliphant created NumPy by incorporating features of the competing
Numarray into Numeric, with extensive modifications. NumPy is open-source software and has many
contributors. Figure 3.3 shows the logo of the NumPy library.

Figure 3.3 NumPy Library

The Python programming language was not initially designed for numerical computing but attracted the
attention of the scientific and engineering community early on so a special interest group called matrix-
sig was founded in 1995 to define an array computing package. Among its members were Python
designer and maintainer Guido van Rossum, who implemented extensions to Python's syntax (the
indexing syntax) to make array computing easier. An implementation of a matrix package was completed
by Jim Fulton, then generalized by Jim Hugunin to become Numeric, also variously called Numerical
Python extensions or NumPy. Hugunin, a graduate student at Massachusetts Institute of Technology
(MIT), joined the Corporation for National Research Initiatives (CNRI) to work on JPython in 1997
leaving Paul Dubois of Lawrence Livermore National Laboratory (LLNL) to take over as maintainer.
Other early contributors include David Ascher, Konrad Hinsen, and Travis Oliphant. In early 2005,
NumPy developer Travis Oliphant wanted to unify the community around a single array package and
ported Numarray's features to Numeric, releasing the result as NumPy 1.0 in 2006. This new project was
part of SciPy. To avoid installing the large SciPy package just to get an array object, this new package
was separated and called NumPy. Support for Python 3 was added in 2011 with NumPy version 1.5.0. In
2011, PyPy started development on an implementation of the NumPy API for PyPy. It is not yet fully
compatible with NumPy.

 NumPy provides a powerful N-dimensional array object
 It has sophisticated (broadcasting) functions
 It has tools for integrating C/C++ and Fortran code
 It has useful linear algebra, Fourier transform, and random number capabilities
 Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional
container of generic data. Arbitrary data types can be defined.
 This allows NumPy to integrate with a wide variety of databases seamlessly and speedily.
NumPy is licensed under the BSD license, enabling reuse with few restrictions.

3.1.2.1 Comparison between Core Python and NumPy:

When we say "Core Python", we mean Python without any special modules, i.e., especially without
NumPy. The advantages of Core Python:

 high-level number objects: integers, floating point


 containers: lists with cheap insertion and append methods, dictionaries with fast lookup
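As a small illustration of the difference (a generic sketch, not taken from the project code), the same element-wise operation written with a core Python list and with a NumPy array:

import numpy as np

# Core Python: element-wise work needs an explicit loop or comprehension.
values = [1.0, 2.0, 3.0]
doubled = [v * 2 for v in values]

# NumPy: the same operation is vectorized over the whole array at once (broadcasting).
values_np = np.array(values)
doubled_np = values_np * 2

print(doubled)        # [2.0, 4.0, 6.0]
print(doubled_np)     # [2. 4. 6.]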

3.1.2.2 Advantages of using NumPy with Python:

 array-oriented computing

 efficiently implemented multi-dimensional arrays

 designed for scientific computation

 Installation: NumPy 1.19.1 can be installed by using the pip command:

 pip install NumPy==1.19.1

Figure 3.4 shows the installation of NumPy in the terminal.

Figure 3.4 NumPy Installation

21
3.1.3 TensorFlow

TensorFlow is a software library or framework, designed by the Google team to implement machine
learning and deep learning concepts in the easiest manner. It combines the computational algebra of
optimization techniques for easy calculation of many mathematical expressions. Figure 3.5 shows the
TensorFlow logo. TensorFlow is one of the libraries most extensively used in deep learning and machine
learning applications.

Figure 3.5 TensorFlow

3.1.3.1 Features of TensorFlow

1. It includes programming support for deep neural networks and machine learning techniques.
2. It includes a highly scalable feature of computation with various data sets.
3. TensorFlow uses GPU computing with automated management. It also includes a unique feature
of optimizing memory and data usage.

3.1.3.2 Installation

To install TensorFlow, it is important to have “Python” installed in your system. Python version 3.4+ is
considered the best to start with TensorFlow installation.

Consider the following steps to install TensorFlow in Windows operating system.

Step 1 − Verify the python version being installed.

Step 2 − A user can pick up any mechanism to install TensorFlow in the system. We recommend “pip” and
“Anaconda.” Pip is a command used for executing and installing modules in Python.

Figure 3.6 shows the installation of tensor flow in python.

22
Figure 3.6 Installation of TensorFlow

Step 3 − Execute the following command to initialize the installation of TensorFlow:

conda create --name tensorflow python=3.5

It downloads the necessary packages needed for TensorFlow setup.

Step 4 − After successful environment setup, it is important to activate the TensorFlow environment.

activate tensorflow


Step 5 − Use pip to install “TensorFlow” in the environment. The command used for installation is
mentioned below −

pip install tensorflow
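Assuming the installation above succeeded, a quick sanity check (a generic sketch, not part of this report's source code) is to import the library and run a small computation; with TensorFlow 2.x in eager mode this prints the result directly:

import tensorflow as tf

print(tf.__version__)                          # confirm the installed version

# Small check: multiply two constant matrices.
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[1.0, 0.0], [0.0, 1.0]])
print(tf.matmul(a, b))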

3.1.4 Pillow
Python Imaging Library (PIL) is the de facto image processing package for the Python language. It incorporates lightweight image processing tools that aid in editing, creating, and saving images. Support for the original PIL was discontinued in 2011, but a project named Pillow forked the original PIL project and added Python 3.x support. Pillow was announced as a replacement for PIL for future usage. Pillow supports many image file formats including BMP, PNG, JPEG, and TIFF.
The library encourages adding support for newer formats in the library by creating new file
decoders [13].
This module is not preloaded with Python. So, to install it execute the following command

command: pip install pillow
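
A minimal usage sketch is given below; the file name gesture.png is only a placeholder, not part of the project dataset. It opens an image, resizes it, and saves it in another format:

from PIL import Image

img = Image.open("gesture.png")      # open any supported format (PNG, JPEG, BMP, ...)
img = img.convert("RGB")             # normalise the colour mode
small = img.resize((64, 64))         # resize to a fixed size
small.save("gesture_64x64.jpg")      # save in a different format
print(small.size, small.mode)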

3.1.5 Tkinter

Tkinter is the de facto way in Python to create Graphical User interfaces (GUIs) and is included in all
standard Python Distributions. In fact, it is the only framework built into the Python standard library.
This Python framework provides an interface to the Tk toolkit and works as a thin object-oriented layer on top of Tk. The Tk toolkit is a cross-platform collection of ‘graphical control elements’, aka widgets, for building application interfaces.

3.1.5.1 How to use Tkinter

This framework provides Python users with a simple way to create GUI elements using the widgets found
in the Tk toolkit. Tk widgets can be used to construct buttons, menus, data fields, etc. in a Python
application. Once created, these graphical elements can be associated with application features and functionality. For example, a button widget can accept mouse clicks and can also be programmed to perform some kind of action, such as exiting the application.
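
A minimal sketch of this pattern, wiring a button to an action (here, closing the window), is shown below:

import tkinter as tk

root = tk.Tk()
root.title("Demo")

label = tk.Label(root, text="Press the button to exit")
label.pack(padx=20, pady=10)

# the command callback runs when the button is clicked
quit_button = tk.Button(root, text="Exit", command=root.destroy)
quit_button.pack(pady=10)

root.mainloop()   # start the Tkinter event loop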

3.1.6 Pyttsx3

Pyttsx3 is a text-to-speech (TTS) conversion library for Python. It converts entered text into audible speech, works offline, and relies on the speech engines already available on the operating system, which makes it well suited for adding voice output to desktop applications.

3.1.6.1 Installation

Pyttsx3 is a cross-platform text-to-speech library. It works on Windows, Mac, and Linux, uses the native speech drivers of each operating system, and can be used offline. To use this package, make sure pip is available on your computer, then install pyttsx3 by running the following command: pip install pyttsx3.

The pyttsx3 library is an extremely popular and highly-recommended Text-to-Speech (TTS) conversion
library. It is fully supported by many popular operating systems and works offline with no delay. You can
install pyttsx3 using the pip package manager. Once installed, pyttsx3 will load the right driver for your
operating system. This includes sapi5 on Windows and espeak on Linux. Since it is compatible with any
platform, you can use it with any TTS device.
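
A minimal sketch of offline speech output with pyttsx3 is shown below; the spoken sentence is only an example:

import pyttsx3

engine = pyttsx3.init()          # loads sapi5 on Windows, espeak on Linux
engine.setProperty("rate", 150)  # speaking rate in words per minute
engine.say("Hello, welcome to the sign language translator.")
engine.runAndWait()              # block until the utterance has finished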

3.1.7 Media pipe

Media Pipe is an open-source framework for building pipelines to perform computer vision inference
over arbitrary sensory data such as video or audio. Using Media Pipe, such a perception pipeline can be
built as a graph of modular components. In computer vision pipelines, those components include model
inference, media processing algorithms, data transformations, etc. Sensory data such as video streams
enter the graph, and perceived descriptions such as object-localization or face-key point streams exit the
graph.

Media Pipe was built for machine learning (ML) teams and software developers who implement production-ready ML applications, or students and researchers who publish code and prototypes as part of their research works. The Media Pipe framework is mainly used for rapid prototyping of perception
pipelines with AI models for inferencing and other reusable components. It also facilitates the deployment
of computer vision applications into demos and applications on different hardware platforms.

3.1.7.1 Uses of media pipe

Every YouTube video we watch is processed with machine learning models built with Media Pipe. Google has not hired thousands of employees to watch every uploaded video, because the amount of data Google receives daily is far too large for humans to review. Machine learning and deep learning models were developed to handle such tasks in far less time, and they also save the cost of hiring reviewers. Google therefore uses machine learning and deep learning models to check whether videos comply with its policies and whether the content has copyright issues. Basically, Media Pipe is a framework for computer vision and deep learning that builds perception pipelines, in which streams of audio, video, or time-series data are processed through a pipeline of components.

3.1.7.2 Media Pipe Hands

Figure 3.7 Palm detection
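
Media Pipe Hands first runs a palm detector on the input frame (Figure 3.7) and then a hand-landmark model that returns 21 keypoints per detected hand. The sketch below (a minimal example using the MediaPipe "solutions" API together with OpenCV; exact parameters may differ between MediaPipe versions) draws these landmarks on a live webcam feed:

import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
mp_draw = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)
with mp_hands.Hands(max_num_hands=1, min_detection_confidence=0.7) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)   # MediaPipe expects RGB input
        results = hands.process(rgb)
        if results.multi_hand_landmarks:
            for hand_landmarks in results.multi_hand_landmarks:
                mp_draw.draw_landmarks(frame, hand_landmarks, mp_hands.HAND_CONNECTIONS)
        cv2.imshow("Hand landmarks", frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):          # press q to quit
            break
cap.release()
cv2.destroyAllWindows()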

3.1.8 Text Blob

Text Blob is a Python library that can be used to process textual data. It works well for tasks such as sentiment analysis, tokenization, spelling correction, and many other natural language processing tasks. Text Blob is open source and very easy to use, offering many built-in methods for common natural language processing work. It is often preferred over other Python libraries for spelling correction, part-of-speech tagging, and text classification, but it can be used for various NLP tasks like:
 Noun phrase extraction
 Part of speech tagging
 Sentiment Analysis
 Text Classification
 Tokenization
 Word and phrase frequencies
 Parsing
 n-grams
 Word inflexion
 Spelling Correction

Command line for installation: pip install textblob

Figure 3.8 shows the logo of the text blob.

Figure 3.8 Text Blob
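
A minimal sketch of the spelling-correction and tagging features is given below (some features require the corpora downloaded with python -m textblob.download_corpora):

from textblob import TextBlob

blob = TextBlob("I havv a spelr mistake")
print(blob.correct())                                 # spelling-corrected text
print(TextBlob("The quick brown fox jumps").tags)     # list of (word, part-of-speech) pairs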

3.1.9 Keras

Keras is an open-source high-level neural network library written in Python that is capable of running on Theano, TensorFlow, or CNTK. It was developed by Francois Chollet, a Google engineer. It is made user-friendly, extensible, and modular to facilitate faster experimentation with deep neural networks. It not only supports Convolutional Networks and Recurrent Networks individually but also their combination. It does not handle low-level computations itself, so it makes use of a backend library to resolve them. The backend library acts as a high-level API wrapper for the low-level API, which lets it run on TensorFlow, CNTK, or Theano.

Initially, Keras had over 4,800 contributors at launch, a number that has since grown to about 250,000 developers, roughly doubling every year. Big companies like Microsoft, Google, NVIDIA, and Amazon have actively contributed to its development. It has strong industry adoption and is used by popular firms like Netflix, Uber, Google, Expedia, etc.

Figure 3.9 shows the Keras logo.

Figure 3.9 Keras

3.1.9.1 Importance of Keras

 Focus on user experience has always been a major part of Keras.


 Large adoption in the industry.
 It is a multi-backend and supports multi-platform, which helps all the encoders come together for
coding.
 Research community present for Keras works amazingly with the production community.
 Easy to grasp all concepts.
 It supports fast prototyping.
 It seamlessly runs on CPU as well as GPU.
 It provides the freedom to design any architecture, which then later is utilized as an API for the
project.
 It is really very simple to get started with.

3.1.9.2 Backend of Keras

Keras being a model-level library helps in developing deep learning models by offering high-level
building blocks. All the low-level computations such as products of Tensor, convolutions, etc. are not
handled by Keras itself, rather they depend on a specialized tensor manipulation library that is well
optimized to serve as a backend engine. Keras has managed this so well that, instead of incorporating one single tensor library and performing operations tied to that library, it offers the ability to plug different backend engines into Keras.

Keras consists of three backend engines, which are as follows:

 TensorFlow
TensorFlow is a Google product, which is one of the most famous deep learning tools widely used
in the research area of machine learning and deep neural networks. It came onto the market on 9th November 2015 under the Apache License 2.0. It is built in such a way that it can easily run on multiple CPUs and GPUs as well as on mobile operating systems.
 Theano
Theano was developed at the University of Montreal, Quebec, Canada, by the MILA group. It is
an open-source Python library that is widely used for performing mathematical operations on multi-dimensional arrays by incorporating SciPy and NumPy. It utilizes GPUs for faster computation and efficiently computes gradients by building symbolic graphs automatically. It has turned out to be very suitable for unstable expressions, as it first evaluates them numerically and then computes them with more stable algorithms.
 CNTK
Microsoft Cognitive Toolkit is Microsoft's open-source deep learning framework. It consists of all the basic building blocks required to form a neural network.

3.1.9.3 Advantages of Keras

Keras encompasses the following advantages, which are as follows:

 It is very easy to understand and incorporate the faster deployment of network models.
 It has huge community support in the market as most of the AI companies are keen on using it.
 It supports multi backend, which means you can use any one of them among TensorFlow, CNTK,
and Theano with Keras as a backend according to your requirement.
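
As an illustration of how such a network can be defined, the sketch below shows a small Keras CNN for classifying alphabet gestures. This is not the project's exact architecture; the 64x64 grayscale input shape and the 26 output classes are assumptions made for the example:

from tensorflow.keras import layers, models

num_classes = 26                      # assumed: one class per English alphabet letter
model = models.Sequential([
    layers.Input(shape=(64, 64, 1)),  # assumed 64x64 grayscale region of interest
    layers.Conv2D(32, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()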

3.2 Webcam

Python provides various libraries for image and video processing; one of them is OpenCV. OpenCV is a vast library that provides various functions for image and video operations. With OpenCV, we can capture video from the camera. It lets you create a video capture object, which is helpful for capturing video through the webcam, on which you can then perform the desired operations.

Steps to capture a video:

Use cv2.VideoCapture() to get a video capture object for the camera. Set up an infinite while loop and use the read() method to read frames through the object created above. Use the cv2.imshow() method to show the frames of the video. A minimal sketch of this loop is shown below.
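
import cv2

cap = cv2.VideoCapture(0)                     # 0 selects the default webcam
while True:
    ret, frame = cap.read()                   # ret is False when no frame is available
    if not ret:
        break
    cv2.imshow("Webcam", frame)               # display the current frame
    if cv2.waitKey(1) & 0xFF == ord('q'):     # press q to stop
        break
cap.release()
cv2.destroyAllWindows()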

Figure 3.10 shows the webcam that captures and stores the images.

Figure 3.10 Webcam

3.3 Flowchart

A flowchart is a picture of the separate steps of a process in sequential order. It is a generic tool that can
be adapted for a wide variety of purposes, and can be used to describe various processes, such as a
manufacturing process, an administrative or service process, or a project plan. A flowchart is a type of
diagram that represents a workflow or process. A flowchart can also be defined as a diagrammatic
representation of an algorithm, a step-by-step approach to solving a task. The flowchart shows the steps as
boxes of various kinds, and their order by connecting the boxes with arrows. This diagrammatic
representation illustrates a solution model to a given problem. Flowcharts are used in analysing,
designing, documenting, or managing a process or program in various fields.

3.3.1 Flowchart

[Flowchart: Start → Input as text from user → Tracking letter by letter → Matching with database → if matched, Display hand gesture → Output; if not matched, return to tracking]

Figure 3.11 Text to sign conversion.

The flowchart in Figure 3.11 shows the text to sign conversion. When the code is initialized, it prompts the user to enter text. The given word is then processed letter by letter, and each letter is checked against the database. If a letter matches the database, the output is displayed in gesture form: each letter is separated from the word and its gesture is shown. If it does not match the database, the program returns to the home page, although in practice the code is able to produce gestures for almost all words entered by the user. The conversion depends on the user input, and the matching GIF is displayed as the output when the code is executed. Note that it accepts only English alphabets, not Telugu or Hindi. A sketch of this matching logic is given below.
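
The sketch below illustrates the matching step of this flowchart. The folder layout and file names (gifs/<word>.gif and gifs/letters/<letter>.gif) are assumptions made for the example, not the project's actual dataset paths:

import os

GIF_DIR = "gifs"

def gestures_for_text(text):
    """Return GIF paths for a sentence, falling back to letter GIFs per word."""
    paths = []
    for word in text.lower().split():
        word_gif = os.path.join(GIF_DIR, word + ".gif")
        if os.path.exists(word_gif):               # matched with the word database
            paths.append(word_gif)
        else:                                      # not matched: spell letter by letter
            for letter in word:
                letter_gif = os.path.join(GIF_DIR, "letters", letter + ".gif")
                if os.path.exists(letter_gif):
                    paths.append(letter_gif)
    return paths

print(gestures_for_text("hello"))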

3.3.2 Flowchart

[Flowchart: Start → Input from user → Tracking hand gesture → Matching with database → if matched, Display text → Output; if not matched, return to tracking]

Figure 3.12 Sign to text conversion

The flowchart in Figure 3.12 shows the sign to text conversion, where the output is produced in text form. When the user places a gesture in front of the camera, the code tracks the user's gesture and checks whether it matches the database. If it does not match the given database, an error is shown; if it matches, the output is displayed as text. Words such as call me, thumbs up, thumbs down, fist, and a few others are included. Every word has a unique gesture, and the output depends on the input given by the user. The recognized words are displayed within the frame as soon as the gesture is placed. The user must hold the gesture correctly, because an error is shown if the gesture is not placed properly in front of the camera. Here, too, only English words are displayed rather than Hindi or Telugu.

3.4 Source Code

3.4.1 Main code

Hand_gesture_detection.py

(Code will be written from here)

CHAPTER 4
RESULTS AND DISCUSSION
4.1 Results

Our project results are based on the user input. There are two practical results: text to sign and sign to text. When a user wants to interact with disabled people, the text to sign conversion method can be used. In text to sign, the user types a message in the box that appears when the code executes; for example, if the user types "hello" and clicks the convert button, each alphabet is shown with its own gesture, so h, e, l, l, and o each map to a different hand sign. Every letter has a unique gesture, which makes it easy for disabled people to identify. The practical results of the text to sign conversion are shown below.

4.1.1 Text to Sign practical results


Hello
Figure 4.1 shows the text to sign conversion. It is showing the hand gesture for the letter H. It is
converting the letter H into sign language.

Figure 4.1 Hand gesture for the letter h

Figure 4.2 shows the text to sign conversion. It is showing the hand gesture for the letter E. It is
converting the letter E into sign language.

Figure 4.2 Hand gesture for the letter e

Figure 4.3 shows the text to sign conversion. It is showing the hand gesture for the letter L. It is
converting the letter L into sign language.

Figure 4.3 Hand gesture for the letter l

Figure 4.4 shows the text to sign conversion. It is showing the hand gesture for the letter O. It is
converting the letter O into sign language.

Figure 4.4 Hand gesture for the letter o

4.1.2 Sign to Text conversion


Figures 4.5 and 4.6 show the sign to text conversion. They convert hand signs or gestures into text and display the text message on the screen. The figures below show the hand gestures for Stop and Pointer.

Figure 4.5 Stop Gesture

Figure 4.6 Right pointer gesture

Figure 4.7 represents the close gesture; the fingers are shown as open or closed according to the user's gesture.

Figure 4.7 Close gesture
Figure 4.8 shows the right hand open, with the back of the hand visible; this represents the stop or open gesture, according to the user input.

Figure 4.8 Right Open


These are the results of sign to text conversion. The system reports whether the user's hand is open or closed and whether the left or right hand is being used. The gestures included in this project are close, open, move, pointer, and stop, and the output is displayed in text form.

4.2 Table Representation
Table 4.2.1: Comparison table between previous works and the proposed work

COMPARISON TABLE

Parameters | Proposed Model | Sign Language Glove Translator Using Microcontroller [13] | Vision Based Sign Language Translation [14]
Language used | Python | Embedded C | MATLAB
One/Two-way communication | Two-way | One-way | One-way
Accuracy (%) for converting text/speech signals into hand signs | 100 | - | -
Accuracy (%) for converting hand signs to text | 98-99 | 98 | 97-99
Platform for coding | PyCharm/VS Code | Arduino IDE | MATLAB (version above 2015)
Components used for recording and translation | Cam, software | Glove, sensor, tx and rx | Cam, software

Methodology:

Proposed Model: Voice/text to sign language conversion: data is scraped from Giphy using a Chrome extension, the GIF files are filtered and named, and GIFs for single alphabets are added. Voice/text input is taken from the user and split into words; each word is checked against the GIF filenames, and if a word is not present, the alphabet GIFs are used to spell it out. The result is finally displayed on the Tkinter app. Sign language to voice/text conversion: the ASL alphabet dataset on Kaggle is used to train a CNN built in TensorFlow on a small amount of data; a live webcam feed of the user's hand is taken and the alphabet is predicted from a region of interest, with the result displayed on the Tkinter app.

Sign Language Glove Translator Using Microcontroller [13]: Development of a sign language glove translator using a microcontroller and Android technology for deaf and mute people. The methodology uses flex sensors and an accelerometer. The user chooses what to translate (numbers, letters, or words), each mode having its own data, and then signs the gesture using the hand glove. The Arduino reads the data from the sensors and compares it with the stored gestures; if the values match the inbuilt database, the device produces the output by displaying the text and using a text-to-speech feature for the displayed text.

Vision Based Sign Language Translation [14]: A video is captured at around 30 frames per second, then image acquisition is performed, followed by image pre-processing, which consists of hand segmentation and morphological operations. One proposed method uses an adaptive skin colour model for hand segmentation; another method for hand segmentation and tracking is based on an HSV histogram.

The above table presents a comparison between our project and previous projects. The previous projects used Arduino and MATLAB, whereas our project was done using Python, which makes it simpler by comparison. Using an Arduino board requires the Arduino IDE and flex sensors, so the budget would also be higher. Our project is cost effective because it uses Python; MATLAB could be used instead, but MATLAB is somewhat more difficult than Python. Python is an open platform, so the project is portable and very efficient. Our project has an accuracy of 99.3% when compared with the other projects. The table covers different parameters such as methodology, accuracy, one-way or two-way communication, and required components.

Table 4.2.2 System Symbols of hand gestures

Signs | Thumb | Index | Middle | Ring | Little

Thumbs up | 1 | 0 | 0 | 0 | 0

Thumbs down | 0 | 0 | 0 | 0 | 0

Peace | 0 | 1 | 1 | 0 | 0

Call me | 1 | 0 | 0 | 0 | 1

Fist | 0 | 0 | 0 | 0 | 0

In the table, the system symbols are displayed with respect to the fingers. There are five fingers, namely thumb, index, middle, ring, and little. Here 1 represents an open finger and 0 a closed finger: when the gesture is thumbs up (okay), only the thumb is open, so it is marked 1, while the remaining fingers are closed and marked 0. Each word has a unique gesture, and the pattern of 0s and 1s changes according to the gesture. An illustrative lookup based on this table is sketched below.
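
The sketch below shows how such a table of finger states can be used in code: a tuple of (thumb, index, middle, ring, little) open/closed flags is looked up to name the gesture. The dictionary is a hypothetical encoding based on Table 4.2.2; gestures sharing the same flags (such as fist and thumbs down) would need an extra feature, for example hand orientation, to be told apart:

# 1 = finger open, 0 = finger closed, in the order thumb, index, middle, ring, little
GESTURES = {
    (1, 0, 0, 0, 0): "Thumbs up",
    (0, 1, 1, 0, 0): "Peace",
    (1, 0, 0, 0, 1): "Call me",
    (0, 0, 0, 0, 0): "Fist",        # same flags as thumbs down in Table 4.2.2
}

def name_gesture(finger_states):
    """finger_states: five 0/1 flags, one per finger."""
    return GESTURES.get(tuple(finger_states), "Unknown gesture")

print(name_gesture((0, 1, 1, 0, 0)))   # prints: Peace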

4.3 Results and Discussions

In this sign language translation project, we create a sign detector that detects alphabets and can easily be extended to cover a vast multitude of other signs and hand gestures, including numbers. We developed this project using the OpenCV and Keras modules of Python. With this OpenCV-based technique we obtain 90% accuracy compared to other techniques, and compared to previously existing systems the OpenCV-based implementation also reduces the time consumed.

4.3.1 Advantages

The advantage is that the system provides all the data needed more accurately, since it also provides finger-movement data; the corresponding drawback is that such systems are costly and difficult to use commercially. The purpose of Sign Language Recognition (SLR) systems is to provide an efficient and accurate way to convert sign language into text or voice, serving as an aid for the hearing impaired or, for example, enabling very young children to interact with computers by recognizing sign language.


4.3.2 Disadvantages

The system only uses English as its default language; more languages need to be added to its libraries. Since nobody is perfect, the translator is likely to make mistakes at times, such as giving faulty translations or accidentally altering the final message in minor ways. Such errors can be very costly, especially if you rely on the translation to make serious decisions. And because you probably do not understand the target language, you will only realize the mistakes when the damage is already done. Remember that language translation involves reproducing the actual meaning of a message in the source language as accurately as possible. This is a delicate exercise, because you must be sure that the words in the translation are the most acceptable rendition of the original text.

4.3.3 Applications
1. Can be used in car systems.
2. Health care field
3. Military

CHAPTER 5
CONCLUSIONS AND FUTURE SCOPE

5.1 Conclusion
The sign translator created using Python is an excellent example of how technology can be used to break down communication barriers and create a more inclusive society. By providing a means for those who use sign language to communicate with those who do not, the software helps improve the lives of hearing-impaired individuals and makes it easier for them to participate in everyday activities. In addition to its practical applications, the sign translator also has the potential to increase awareness and understanding of sign language and the challenges faced by hearing-impaired individuals. By promoting inclusivity and diversity, this technology contributes to a more equitable and just society, and it is an important step forward in the ongoing effort to create a world that is accessible to all.

The sign translator is a remarkable tool that bridges the communication gap between hearing-impaired individuals and the rest of the world. With this software, it is possible to convert hand signs to text and vice versa. This technology has the potential to revolutionize the way we interact with hearing-impaired individuals and make their lives easier. The use of machine learning algorithms in the development of this software has made it possible to recognize complex hand gestures and translate them into text with high accuracy. With further improvements and advancements in technology, the sign translator is likely to become even more advanced and user-friendly, making it accessible to more people around the world. Overall, it is a significant step forward in the field of assistive technology and has the potential to change the lives of many people for the better.

Furthermore, the sign translator has immense potential to revolutionize the education sector by enabling teachers to communicate more effectively with hearing-impaired students. It can also help hearing-impaired students participate more actively in classroom discussions and feel more included in the learning process. With the software's ability to recognize complex hand gestures and translate them into text, it can also be used to enhance communication in other fields such as healthcare, law enforcement, and emergency services. The sign translator is a valuable tool that can save lives in critical situations where quick and accurate communication is essential. The software's versatility and flexibility make it a vital asset for many industries, and its impact on society can be far-reaching. In conclusion, the sign translator created using Python is a remarkable achievement that has the potential to transform the lives of many individuals and contribute significantly to creating a more inclusive and equitable society.

5.2 Future Scope

In future work, the proposed system can be developed and implemented using a Raspberry Pi. The image processing part should be improved so that the system can communicate in both directions, i.e., it should be capable of converting normal language to sign language and vice versa. We will try to recognize signs that involve motion. Moreover, we will focus on converting sequences of gestures into text, i.e., words and sentences, and then converting the text into speech that can be heard.

 Since deaf people are usually deprived of normal communication with other people, they must
rely on an interpreter or some visual communication. Now the interpreter cannot be available
always, so this project can help eliminate the dependency on the interpreter.
 The system can be extended to incorporate the knowledge of facial expressions and body
language too so that there is a complete understanding of the context and tone of the input speech.
 A mobile and web-based version of the application will increase the reach to more people.
 Integrating hand gesture recognition system using computer vision for establishing 2-way
communication system.

The system can be improved by allowing multiple languages to be displayed and converted to speech. Furthermore, other sensors (accelerometers, capacitive flex sensors, etc.) can be integrated with the system to recognize movements of the hand such as swiping, rotation, and tilting. Mobile applications can be developed to replace the LCD display and the speaker, which minimizes the hardware.

References

[1] D. N., Rupanagudi S. R., Sachin S. K., Sthuthi B., Pavithra R. and Raghavendra, "Novel segmentation algorithm for hand gesture recognition," Automation, Computing, Communication, Control and Compressed Sensing (iMac4s), 2013 International Multi-Conference on, pp. 383-388, 22-23 March 2013.

[2] Dawod A. Y., Abdullah J. and Alam M. J., "Adaptive skin color model for hand segmentation," Computer Applications and Industrial Electronics (ICCAIE), 2010 International Conference on, pp. 486-489, 5-8 Dec. 2010.

[3] Bhame, V.; Sreemathy, R.; Dhumal, H., "Vision based hand gesture recognition using eccentric approach for human
computer interaction," in Advances in Computing, Communications, and Informatics.

[4] Archana S. Ghotkar, Dr. Gajanan K. Kharate,”Study of vision-based hand gesture recognition using Indian sign
language,” International journal on smart sensing and intelligent systems vol. 7, no. 1, March 2014.

[5] Chenglong Yu, Member, IEEE, Xuan Wang, Member, IEEE, Hejiao Huang, Member, IEEE, Jianping Shen, Kun Wu,
“Vision-Based Hand Gesture Recognition Using Combinational Features”, 2010 Sixth International Conference on
Intelligent Information Hiding and Multimedia Signal Processing.

[6] Dominikus Willy, AryNoviyanto, AniatiMurniArymurthy, “Evaluation of SIFT and SURF Features in the Songket
Recognition”.

[7] Dawod A. Y., Abdullah J. and Alam M. J., "Adaptive skin color model for hand segmentation," Computer Applications and Industrial Electronics (ICCAIE), 2010 International Conference.

[8] Madhuri, Y.; Anitha, G.; Anburajan, M.,"Vision-based sign language translation device,"in Information
Communication and Embedded System.

[9] M. S. E. Mohammed Elmahgiubi Mohamed Ennajar, Nabil Drawil, “Sign language translator and gesture
recognition,” in 2015 Global Summit on Computer & Information Technology.

[10] Zhou, Qiangqiang; Zhao, Zhenbing, "Substation equipment image recognition based on SIFT feature matching," in
Image and Signal Processing (CISP), 2012 5th International Congress on , vol., no., pp.1344-1347, 16-18 Oct. 2012.

[11] N. C. Camgoz, S. Hadfield, O. Koller, H. Ney, and R. Bowden, “Neural sign language translation,” in Proceedings of
the IEEE conference on Computer Vision and Pattern Recognition (CVPR), pp. 7784–7793, IEEE, Salt Lake City,
UT, USA, March 2018.

[12] J. Huang, W. Zhou, Q. Zhang, H. Li, and W. Li, “Video-based sign language recognition without temporal
segmentation,” in Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI-18), AAAI, New
Orleans, LA, USA, February 2018.

[13] S.-K. Ko, C. J. Kim, H. Jung, and C. Cho, “Neural sign language translation based on human keypoint estimation,”
Applied Sciences, vol. 9, no. 13, p. 2683, 2019.

[14] “Hierarchical lstm for sign language translation,” in Proceedings of the Thirty-Second AAAI Conference on Artificial
Intelligence, AAAI, New Orleans, LA, USA, February 2018.

[15] O. Koller, C. Camgoz, H. Ney, and R. Bowden, “Weakly supervised learning with multi-stream cnn-lstm-hmms to
discover sequential parallelism in sign language videos,” IEEE Transactions on Pattern Analysis and Machine
Intelligence, vol. 42, no. 9, pp. 2306–2320, 2019.

[16] N.J. Ayache and O.D. Faugeras. A new approach for the recognition and positioning of twodimensional objects. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 8(1):44–54, January 1986.

[17] MykhayloAndriluka, Leonid Pishchulin, Peter Gehler, Bernt Schiele. 2D human pose estimation: New benchmark
and state of the art analysis. In CVPR, IEEE Conference on 25th September 2014.

[18] MykhayloAndriluka, Leonid Pishchulin, Peter Gehler, Bernt Schiele. Strong appearance and expressive spatial
models for human pose estimation. In ICCV, IEEE International Conference on 1-8 December, 2013.

[19] S. Ioffe and D.A. Forsyth. Probabilistic methods for finding people. International Journal of Computer Vision,
43(1):45– 68, June 2001.

[20] Fangfang Yuan, Fusheng Lian, Xingjian Xu. Decision tree algorithm optimization research based on MapReduce.
ICSESS, 6th IEEE International Conference on 23-25 September, 2015.

[21] Chen Jin, Luo De-lin and Mu Fen-xiang, "An improved ID3 decision tree algorithm," IEEE 4th International Conference on Computer Science & Education.

[22] Gordon V. Kass (1980), "An exploratory technique for investigating large quantities of categorical data," Applied Statistics, vol. 29, no. 2, pp. 119-127.

[23] Muhammad Aminur Rahaman, Mahmood Jasim, Md. Haider Ali and Md. Hasanuzzaman, “Real-Time Computer
Vision based Bengali Sign Language Recognition”, 2014 17th International Conference on Computer and Information
Technology (ICCIT).

[24] Rahat Yasir, Riasat Azim Khan, “Two Handed Hand Gesture Recognition for Bangla Sign Language using LDA and
ANN”, The 8th International Conference on Software, Knowledge, Information Management and Applications
(SKIMA 2014).

[25] P.Gajalaxmi, T, Sree Sharmila, “Sign Language Recognition for Invariant features based on multiclass Support
Vector Machine with BeamECOC Optimization”, IEEE International Conference on Power, Control, Signals and
Instrumentation Engineering(IPCSI-2017).

