Format Report
OF
GUIDED BY
PROF. CHANDGUDE A. S.
Submitted by
is a bonafide student of this institute and the work has been carried out by him/her
under the supervision of Prof. Chandgude A. S. and it is approved for the partial fulfill-
ment of the requirement of Savitribai Phule Pune University, for the award of the degree
of Bachelor of Engineering (Computer Engineering).
Place :
Date :
ACKNOWLEDGEMENT
It gives us great pleasure to present the project report on ‘REAL TIME SIGN LANGUAGE RECOGNIZER USING AI’.
We would like to take this opportunity to thank our internal guide, Prof. Chandgude A. S., for giving us all the help and guidance we needed. We are truly grateful for his kind support; his valuable suggestions were very helpful.
We would like to extend our sincere thanks to our family members. It is a privilege to acknowledge their cooperation during the course of this dissertation. We express our heartiest thanks to our known and unknown well-wishers for their unreserved cooperation, encouragement and suggestions during the course of this dissertation report.
We would also like to thank all our teachers and all our friends who helped with the ever-daunting task of gathering information for the dissertation.
ABSTRACT
Contents
1 INTRODUCTION
  1.1 Motivation
2 LITERATURE SURVEY
  3.1 Introduction
    3.3.1 User Interfaces
4 SYSTEM DESIGN
  4.3 UML Diagrams
5 Other Specification
    5.0.1 Advantages
    5.0.2 Limitations
    5.0.3 Applications
6 Conclusions
APPENDIX A
APPENDIX B
APPENDIX C
LIST OF ABBREVIATIONS
ABBREVIATION ILLUSTRATION
List of Figures
4.2 DFD-Level 0
4.3 DFD-Level 1
4.4 DFD-Level 2
Chapter 1
INTRODUCTION
Sign language is widely used by people who are deaf or hard of hearing as a medium of communication. A sign language is composed of various gestures formed by different hand shapes, movements and orientations, as well as facial expressions. There are around 466 million people worldwide with hearing loss, and 34 million of these are children. ‘Deaf’ people have very little or no hearing ability and use sign language for communication. People use different sign languages in different parts of the world; compared to spoken languages, they are far fewer in number. India has its own sign language, called Indian Sign Language (ISL). In developing countries there are very few schools for deaf students, and the unemployment rate among adults with hearing loss is very high. Data from Ethnologue states that among the deaf population in India, which is about 1 percent of the total population, the literacy rate and the number of children attending school are very low. It goes on to state that official recognition of sign languages, increasing the availability of interpreters and providing transcription in sign languages greatly improve accessibility.
Signs in sign languages are the equivalent of words in spoken languages. Signed languages appear to favour simultaneous sign-internal modification rather than the concatenation of morphemes. Learners in the initial stages of sign language (SL) learning use iconicity as a mnemonic aid to remember new signs, but the lack of iconicity makes it difficult to learn new signs for those who learn SL as a new language.
Finger spelling is the representation of the letters of a writing system, and sometimes of numeral systems. Indian Sign Language (ISL) can represent the English alphabet A-Z using finger spelling. Finger spelling can be one-handed or two-handed, and ISL follows the two-handed style. It is used to represent words that have no sign equivalent or to emphasize a word. Though finger spelling is used less in casual signing, it is an important component of sign language learning. This project aims at identifying alphabets in Indian Sign Language from the corresponding gestures.
Gesture recognition and sign language recognition have been well-researched topics for American Sign Language (ASL), but few research works have been published on Indian Sign Language (ISL). Instead of using high-end technology such as gloves or Kinect, we aim to solve this problem using state-of-the-art computer vision and artificial intelligence algorithms.
1.1 Motivation
Highly influenced by the teachings of the Shrimad Bhagavad-Gita, we decided to work for people with speaking disabilities. During our studies we met people with speaking disabilities and saw their problems closely. That was our biggest motivation for creating this project.
In the world of technology, everything and everyone is growing rapidly, so we believe that deafness should not be a barrier to the growth of deaf people.
As technology is changing the world so rapidly, it can also change the lives of such people, and artificial intelligence can play a major role in improving their communication.
Chapter 2
LITERATURE SURVEY
This chapter reviews the existing and established theory and research in the area of this report. It provides the context for the work to be done and explains the depth of the system. A review of the literature gives clarity and a better understanding of the project. A literature survey is a study of previously published material on the topic of the report, and this survey explains the basis of the proposed system.
Real Time Sign Language Interpreter: In this paper, Geethu Nath and Arun C. S. developed a system using an ARM Cortex-A8 processor for recognizing ASL symbols.
Implementation of Real Time Hand Gesture Recognition: In this paper, the authors used the codebook algorithm for background subtraction and generated binary images from the given image frames.
3.1 Introduction
The users of the Sign Language Recognition system are either non-profit organizations or individuals with vocal disabilities. The system gives them a facility to translate and share the thoughts of people who are deaf or unable to speak.
For example, the system could be used at gatherings or seminars, where every person with a vocal disability can convey his or her valuable thoughts to the audience. We developed the system with this scenario in mind.
The system requires real-time hardware that can capture video and split it into frames; each frame is then converted into a NumPy array for further processing.
After the NumPy arrays of the frames are obtained, the classification system applies a CNN to the arrays, analyzes the data, classifies the image and gives the corresponding output.
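The capture-and-classify flow described above can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the project's actual code: nested Python lists stand in for the NumPy arrays that a capture library would deliver, and the names `to_normalized_gray`, `recognize_stream` and `dummy_classifier` are hypothetical, with `dummy_classifier` standing in for the trained CNN.

```python
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple

Pixel = Tuple[int, int, int]        # (blue, green, red) channel order, assumed here
Frame = Sequence[Sequence[Pixel]]   # rows of pixels; stands in for an H x W x 3 array


def to_normalized_gray(frame: Frame) -> List[List[float]]:
    """Convert a BGR frame to grayscale values scaled to the range [0, 1]."""
    gray = []
    for row in frame:
        # ITU-R BT.601 luma weights, a common choice for colour-to-gray conversion.
        gray.append([(0.114 * b + 0.587 * g + 0.299 * r) / 255.0 for b, g, r in row])
    return gray


def recognize_stream(
    frames: Iterable[Frame],
    classify: Callable[[List[List[float]]], str],
) -> Iterator[str]:
    """Yield one predicted label per captured frame."""
    for frame in frames:
        yield classify(to_normalized_gray(frame))


def dummy_classifier(gray: List[List[float]]) -> str:
    """Hypothetical stand-in for the trained CNN: labels a frame by mean brightness."""
    pixels = [value for row in gray for value in row]
    return "A" if sum(pixels) / len(pixels) > 0.5 else "B"


# Two tiny synthetic 2 x 2 frames instead of real webcam input.
bright = [[(255, 255, 255)] * 2] * 2
dark = [[(0, 0, 0)] * 2] * 2
labels = list(recognize_stream([bright, dark], dummy_classifier))
```

Keeping the frame source and the classifier as parameters means the same loop can be exercised with synthetic frames, as here, and later fed with real webcam frames and a real model.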
The user has to interact with the system to access its features, so the interface should make communication with the system easy.
Frontend Interface: Tkinter graphical user interface, KivyMD
Backend Interface: Python ML
The Python Tkinter and Kivy libraries provide the software interface for the user.
A working built-in or external webcam is required to capture frames of the user.
The following measures are considered for improving the performance of the Sign Language Recognition system.
True positives and true negatives are the observations that are correctly predicted and are therefore shown in green. We want to minimize false positives and false negatives, so they are shown in red. These terms can be confusing, so let us take each term one by one.
True Positives (TP) - These are the correctly predicted positive values, which means that the value of the actual class is yes and the value of the predicted class is also yes.
True Negatives (TN) - These are the correctly predicted negative values, which means that the value of the actual class is no and the value of the predicted class is also no.
False Positives (FP) - When the actual class is no and the predicted class is yes.
False Negatives (FN) - When the actual class is yes but the predicted class is no.
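The four counts defined above can be obtained by comparing actual and predicted labels pair by pair. The sketch below assumes a binary yes/no labelling (for a multi-class recognizer, one sign would be treated as "yes" and all others as "no"); the function name `confusion_counts` and the sample labels are hypothetical.

```python
from typing import Sequence, Tuple


def confusion_counts(
    actual: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[int, int, int, int]:
    """Return (TP, TN, FP, FN) for paired actual and predicted labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1          # actual yes, predicted yes
        elif not a and not p:
            tn += 1          # actual no, predicted no
        elif not a and p:
            fp += 1          # actual no, predicted yes
        else:
            fn += 1          # actual yes, predicted no
    return tp, tn, fp, fn


# Hypothetical labels: True means "the frame shows the target sign".
actual = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]
tp, tn, fp, fn = confusion_counts(actual, predicted)
```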
Once these four parameters are understood, we can calculate Accuracy, Precision, Recall and F1 score.
Accuracy - Accuracy is the most intuitive performance measure: it is simply the ratio of correctly predicted observations to the total observations. One may think that a model with high accuracy is the best one. Accuracy is a great measure, but only for symmetric datasets where the numbers of false positives and false negatives are almost the same. Therefore, other parameters must also be examined to evaluate the performance of the model. For our model, we obtained 0.803, which means the model is approximately 80 percent accurate.
$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}$
$\text{Precision} = \frac{TP}{TP + FP}$
Recall (Sensitivity) - Recall is the ratio of correctly predicted positive observations to all observations in the actual class yes. The question recall answers is: of all the frames that truly show a given sign, how many did we label correctly? We obtained a recall of 0.631, which is good for this model as it is above 0.5.
$\text{Recall} = \frac{TP}{TP + FN}$
F1 score - The F1 score is the harmonic mean of Precision and Recall. Therefore, this score takes both false positives and false negatives into account. Intuitively it is not as easy to understand as accuracy, but F1 is usually more useful than accuracy, especially when the class distribution is uneven. Accuracy works best if false positives and false negatives have similar costs. If the costs of false positives and false negatives are very different, it is better to look at both Precision and Recall. In our case, the F1 score is 0.701.
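As a worked illustration of the four measures, the sketch below computes them from TP, TN, FP and FN counts, taking F1 as the harmonic mean of precision and recall, 2PR/(P + R). The function name `classification_metrics` and the counts used in the example are hypothetical and are not the results of our model.

```python
from typing import Dict


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The zero checks guard against division by zero when a class is never predicted or never present, in which case the corresponding measure is reported as 0.0.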
A working camera is an integral part of the Sign Language Recognition system. If the camera is damaged, the system can fail completely, so the camera must be kept in good condition for the system to work safely and reliably.
Software testing is a critical element of software quality assurance and represents the ultimate review of specification, design and coding. In fact, testing is the one step in the software engineering process that could be viewed as destructive rather than constructive.
A strategy for software testing integrates software test case design methods into a well-planned series of steps that result in the successful construction of software. Testing is a set of activities that can be planned in advance and conducted systematically. The underlying motivation of program testing is to affirm software quality with methods that can be applied economically and effectively to both large and small-scale systems.
Correctness: The system should classify the sign language gestures correctly and predict the output quickly. Low accuracy results in low reliability.
Usability: The system is very useful for organizations of deaf and speech-impaired people, and can reliably help bring change to their lives.
• Camera : min. 720p
• HDD/SSD : 256 GB
• Power Supply
1. Planning :- This is the first phase in the systems development process. It identifies whether or not there is a need for a new system to achieve a business's strategic objectives. This is a preliminary plan (or a feasibility study) for a company's business initiative to acquire the resources to build on an infrastructure to modify or improve a service. The company might be trying to meet or exceed expectations
3. Systems Design :- The third phase describes, in detail, the necessary specifications, features and operations that will satisfy the functional requirements of the proposed system. This is the step for end users to discuss and determine their specific business information needs for the proposed system. It is during this phase that they will consider the essential components (hardware and/or software), structure (networking capabilities), processing and procedures for the system to accomplish its objectives.
4. Development :- The fourth phase is when the real work begins, in particular when a programmer, network engineer and/or database developer are brought on to do the major work on the project. This work includes using a flow chart to ensure that the process of the system is properly organized. The development phase marks the end of the initial section of the process. Additionally, this phase signifies the start of production. The development stage is also characterized by installation and change. Focusing on training can be a huge benefit during this phase.
5. Integration and testing :- The fifth phase involves systems integration and system testing (of programs and procedures), normally carried out by a Quality Assurance (QA) professional, to determine if the proposed design meets the initial set of business goals. Testing may be repeated, specifically to check for errors, bugs and
6. Implementation :- The sixth phase is when the majority of the code for the pro-
gram is written. Additionally, this phase involves the actual installation of the
newly-developed system. This step puts the project into production by moving
the data and components from the old system and placing them in the new system
via a direct cut over. While this can be a risky (and complicated) move, the cut
over typically happens during off-peak hours, thus minimizing the risk. Both sys-
tem analysts and end-users should now see the realization of the project that has
implemented changes.
7. Operations and Maintenance :- The seventh and final phase involves maintenance
and regular required updates. This step is when end users can fine-tune the system,
if they wish, to boost performance, add new capabilities or meet additional user
requirements.
• exploration
• installation
• initial implementation
• full implementation
Chapter 4
SYSTEM DESIGN
• DFD Level 0 : A level 0 data flow diagram (DFD), also known as a context diagram, shows a data system as a whole and emphasizes the way it interacts with external entities.
• DFD Level 1 : A level 1 data flow diagram (DFD) is more detailed than a level 0
DFD but not as detailed as a level 2 DFD. It breaks down the main processes into
sub processes that can then be analyzed and improved on a more intimate level.
• DFD Level 2 : A level 2 data flow diagram (DFD) offers a more detailed look at
the processes that make up an information system than a level 1 DFD does. It can
• Class Diagram : The class diagram is the main building block of object-oriented
modeling. It is used for general conceptual modeling of the structure of the appli-
cation, and for detailed modeling translating the models into programming code.
Class diagrams can also be used for data modeling.
• Use Case Diagram : A UML use case diagram is the primary form of system/software requirements for a new software program under development. Use cases specify the expected behavior (what), and not the exact method of making it happen (how). Once specified, use cases can be given both a textual and a visual representation (i.e. a use case diagram). A key concept of use case modeling is that it helps us design a system from the end user’s perspective.
Chapter 5
Other Specification
5.0.1 Advantages
• People who do not know sign language need not learn it; they can communicate through the sign language recognizer.
• Easy to use.
5.0.2 Limitations
5.0.3 Applications
Chapter 6
Conclusions
Conclusion
Objective:
• Bridging the communication gap between people with disabilities and other people.
• To create software that will help deaf people communicate with other people.
• Making full use of artificial intelligence and the latest technologies to make human life easier.
Introduction
1. Input Image: OpenCV provides features for performing various operations on the captured hand images and for showing the required results.
1.1 Camera module: This module is responsible for interfacing with and capturing input from different kinds of image sensors, and it sends each image to the detection module for processing as frames. The commonly used means of capturing and recognizing input are hand belts, data gloves and cameras. In our system, we use the built-in webcam, which is cost-effective, to capture both static and dynamic signs.
1. Convolution:- The first layers that receive an input signal are called convolution
filters. Convolution is a process where the network tries to label the input signal by
referring to what it has learned in the past. If the input signal looks like previous
cat images it has seen before, the “cat” reference signal will be mixed into, or
convolved with, the input signal. The resulting output signal is then passed on to
the next layer.
3. Pooling: Its function is to progressively reduce the spatial size of the representation in order to reduce the number of parameters and the amount of computation in the network. The pooling layer operates on each feature map independently. The most common approach used in pooling is max pooling, in which the maximum of a region is taken as its representative. For example, in the following diagram a 2x2 region is replaced by the maximum value in it.
4. Activation: The activation layer controls how the signal flows from one layer
to the next, emulating how neurons are fired in our brain. Output signals which
are strongly associated with past references would activate more neurons, enabling
signals to be propagated more efficiently for identification. CNN is compatible with
a wide variety of complex activation functions to model signal propagation, the
most common function being the Rectified Linear Unit (ReLU), which is favored
for its faster training speed.
ReLU Activation Function: The Rectified Linear Unit is the most commonly used activation function in deep learning models. The function returns 0 if it receives any negative input, but for any positive value x it returns that value back. So, it can be written as
$f(x) = \max(0, x)$
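To make the convolution, pooling and ReLU steps listed above concrete, the following sketch applies them in the usual order (convolution, then ReLU, then max pooling) to a tiny image in plain Python. It is an illustration under stated assumptions, not our trained network: the kernel values, the image and the function names are hypothetical, and the convolution is the unflipped "valid" sliding-window form with stride 1 that CNN libraries commonly use.

```python
from typing import List

Matrix = List[List[float]]


def convolve2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the kernel over the image (no padding, stride 1) to get a feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map: Matrix) -> Matrix:
    """Apply f(x) = max(0, x) to every element."""
    return [[max(0.0, value) for value in row] for row in feature_map]


def max_pool(feature_map: Matrix, size: int = 2) -> Matrix:
    """Replace each non-overlapping size x size region by its maximum."""
    return [
        [
            max(feature_map[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


# A 5 x 5 image containing one bright vertical stripe.
image = [[0, 0, 1, 1, 0] for _ in range(5)]
# Hypothetical kernel that responds to dark-to-bright vertical edges.
edge_kernel = [[-1, 1], [-1, 1]]

features = convolve2d(image, edge_kernel)   # 4 x 4 feature map
activated = relu(features)                  # negative responses set to zero
pooled = max_pool(activated)                # 2 x 2 map after 2 x 2 max pooling
```

In this example the kernel gives a positive response at the left edge of the stripe and a negative one at the right edge; ReLU discards the negative response, and pooling halves each spatial dimension while keeping the strongest response in each region.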
• Archana S. Ghotkar and Gajanan K. Kharate explored rule-based and dynamic time warping (DTW) based methods to recognize ISL words. Their experiments showed that DTW performs considerably better for continuous word recognition.
• M.K. Bhuyan segments frames into video object planes (VOPs) to obtain a semantically meaningful hand position. Key VOPs and temporal information are tracked to form a complete gesture sequence. The test results concluded that, by using keyframes, a gesture could be uniquely represented as a finite state machine with keyframes and corresponding frame durations as states.
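For readers unfamiliar with the dynamic time warping mentioned in the first item above, the sketch below shows the textbook dynamic-programming form of DTW on one-dimensional sequences. It is not the implementation used by Ghotkar and Kharate: the function name `dtw_distance`, the absolute-difference cost and the sample sequences are assumptions made for illustration.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the DTW distance between two sequences (absolute-difference cost)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the cheapest alignment of the first i items of a
    # with the first j items of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical 1-D feature tracks, e.g. a hand coordinate over time.
template = [0.0, 1.0, 2.0, 1.0, 0.0]
slow = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 1.0, 0.0, 0.0]  # same gesture at half speed
other = [2.0, 2.0, 2.0, 2.0, 2.0]                          # a different gesture

same_gesture = dtw_distance(template, slow)
different_gesture = dtw_distance(template, other)
```

Because DTW allows one sequence to be stretched against the other, the same gesture performed at a different speed stays close to the template, which is one reason the method suits continuous signing.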
1. Geethu G. Nath and Arun C. S., "Real Time Sign Language Interpreter," 2017 International Conference on Electrical, Instrumentation, and Communication Engineering (ICEICE2017).
2. Kumud Tripathi, Neha Baranwal and G. C. Nandi, "Continuous Indian Sign Language Gesture Recognition and Sentence Formation," Eleventh International Multi-Conference on Information Processing 2015 (IMCIP-2015), Procedia Computer Science 54 (2015) 523-531.
4. Joyeeta Singha and Karen Das, "Automatic Indian Sign Language Recognition for Continuous Video Sequence," ADBU Journal of Engineering Technology 2015, Volume 2, Issue 1.
6. M.K. Bhuyan, "FSM-based recognition of dynamic hand gestures via gesture summarization using key video object planes," World Academy of Science, Engineering and Technology Vol: 6 2012-08-23.
8. Kairong Wang, Bingjia Xiao, Jinyao Xia, and Dan Li, "A Dynamic Hand Gesture Recognition Algorithm Using Codebook Model and Spatial Moments," 2015 7th International Conference on Intelligent Human-Machine Systems and Cybernetics.
9. Francke H., Ruiz-del-Solar J. and Verschae R., "Real-Time Hand Gesture Detection and Recognition Using Boosted Classifiers and Active Learning," Advances in Image and Video Technology. PSIVT 2007. Lecture Notes in Computer Science, vol 4872. Springer, Berlin, Heidelberg.
10. Hari Prabhat Gupta, Haresh S Chudgar, Siddhartha Mukherjee, Tanima Dutta, and Kulwant Sharma, "A Continuous Hand Gestures Recognition Technique for Human-Machine Interaction using Accelerometer and Gyroscope sensors," IEEE Sensors Journal (Volume: 16, Issue: 16, Aug. 15, 2016) Page(s): 6425-6432.