An Approach For Morse Code Translation From Eye Blinks Using Tree Based Machine Learning Algorithms and Opencv
G Sumanth Naga Deepak, B Rohit, Ch Akhil, D Sai Surya Chandra Bharath and Kolla
Bhanu Prakash
Abstract: For ages, human beings have communicated with one another through
different modes of communication. Communication is the process through which a person
conveys feelings and thoughts to another person, and it can take place through
either speech or sign language. Spoken language is used by people without disabilities, while
differently abled people (those with hearing or speech impairments) may find it difficult to
understand. Sign language was therefore developed for effective communication between
differently abled people and others. Morse code can be used for private communication between
two people and is well suited to exchanging secrets. It also helps in emergencies in which a
person cannot communicate through hand gestures. Morse code can be conveyed through
different methods, but our focus is on eye blinking. Our approach is to implement Morse code
using eye blinks in real time, with a webcam for input and tree-based machine learning
algorithms to predict the intended character.
1. INTRODUCTION
Everyone needs communication to express feelings or opinions or to pass on information.
Most people communicate by speaking, and people with hearing or speech impairments communicate
using sign language. Morse code is a type of sign language that is very useful for secret
communication. This type of communication is important in the army, navy, and air force, because
those departments handle a great deal of sensitive information that should not be revealed to, or
understood by, others. They therefore prefer a separate way of communicating,
which is called Morse code. It was invented by the American Samuel Finley Breese Morse
(1791-1872) [9].
Before the invention of Morse code, people communicated through handwritten papers that were
transported on horseback, so a lot of time was spent transporting information.
Morse code changed the way people communicate.
In this type of communication, people communicate using eye blinks, finger gestures, or head
gestures, according to their situation or convenience. Every gesture or eye blink has a certain
Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution
of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Published under licence by IOP Publishing Ltd 1
ICASSCT 2021 IOP Publishing
Journal of Physics: Conference Series 1921 (2021) 012070 doi:10.1088/1742-6596/1921/1/012070
meaning, and people communicate according to those gestures. At the time of its invention, Morse
code was the quickest method of long-distance communication, because a single gesture in Morse code
can convey a lot of information. Morse code played an important role in passing information during the
Second World War, since it increased the speed of communication.
In the above figure, we can see the Morse code for each letter and each digit. These dot-dash patterns
are decoded from the user input: for example, a short eye blink represents a dot and a long eye
blink represents a dash. Each combination of dots and dashes has a specific meaning.
The figure shows how the dot-dash patterns differ from one another. It
is built on logic similar to that of a binary search tree. In a binary search tree, an input smaller
than the root is assigned to the left subtree; otherwise it is assigned to the right subtree. In this
tree, if the input is a dot it is assigned to the left subtree, and if it is a dash it is assigned to the
right subtree. The figure helps to show how the Morse code differs from one letter to another or
from one digit to another.
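The tree logic described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in this work: the names Node, build_tree, and decode are hypothetical, and the table is the standard International Morse code for letters and digits. A dot moves to the left child and a dash moves to the right child, as in the figure.

```python
# Illustrative sketch: decode a dot/dash string by walking a binary tree.
# Names (Node, build_tree, decode) are hypothetical, not from the paper.

MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".",
    "F": "..-.", "G": "--.", "H": "....", "I": "..", "J": ".---",
    "K": "-.-", "L": ".-..", "M": "--", "N": "-.", "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.", "S": "...", "T": "-",
    "U": "..-", "V": "...-", "W": ".--", "X": "-..-", "Y": "-.--",
    "Z": "--..",
    "0": "-----", "1": ".----", "2": "..---", "3": "...--", "4": "....-",
    "5": ".....", "6": "-....", "7": "--...", "8": "---..", "9": "----.",
}


class Node:
    def __init__(self):
        self.symbol = None  # character stored at this node, if any
        self.left = None    # child reached by a dot
        self.right = None   # child reached by a dash


def build_tree(table):
    """Insert every code into the tree: dot goes left, dash goes right."""
    root = Node()
    for symbol, code in table.items():
        node = root
        for mark in code:
            if mark == ".":
                if node.left is None:
                    node.left = Node()
                node = node.left
            else:
                if node.right is None:
                    node.right = Node()
                node = node.right
        node.symbol = symbol
    return root


def decode(root, code):
    """Return the character for a dot/dash string, or None if unknown."""
    node = root
    for mark in code:
        node = node.left if mark == "." else node.right
        if node is None:
            return None
    return node.symbol
```

For example, after root = build_tree(MORSE), the call decode(root, ".-") walks left and then right and returns "A", while a sequence with no matching path returns None.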
This type of communication is useful not only for national security purposes but also for people with
hearing or speech impairments, because they can convey more information in less time and with less effort.
2. LITERATURE SURVEY
Technology makes many ways of communication possible. Communication differs from one
case to another. Most people communicate through speaking and hearing in an
understandable language. Differently abled people, such as those with hearing or speech
impairments, have alternative ways of communicating, such as sign language, but the problem with
sign language is that it is not comfortable for everyone [11]. For effective communication between
differently abled people and others, a sign language translator is needed. Such a translator is useful
not only for better understanding but also for passing secret information, which is very useful in
national security. Morse code is a type of sign language built on dot and dash symbols [12]. There are
several surveys on Morse code, which explain the possibilities and drawbacks of the
implementations [13].
A Morse code translator can be implemented using eye blinks decoded with OpenCV [1].
The input is a video that contains a sequence of eye blinks, and OpenCV is used to
decode those eye blinks and convert them to a specific meaning [14].
Morse code has also been implemented using data patterns, in which the input is a sequence of 0’s
and 1’s. Each sequence has a specific meaning, and information is conveyed according to that
meaning [2]. This kind of implementation is not efficient, since the input is given through
the keyboard. A good dynamic model would need to take dynamic input as a pattern of 0’s and 1’s,
which is difficult [15].
Another way of implementing a Morse code translator is to use finger gestures as input, converting
each gesture into the respective dot-dash pattern [3]. The problem with this model is that the
input must be given with high accuracy: if the position of the finger changes slightly, the whole
meaning changes, which is not acceptable when passing secret information. A model that takes head
gestures as input [4] may be preferable to one based on finger gestures, but it has a similar
difficulty: the model cannot always decode the user’s gesture exactly, and even the slightest change
in the gesture gives an entirely different meaning.
We therefore decided to implement a model that takes eye blinks as input, because of the high
possibility of getting an accurate output [16].
3. EXISTING SYSTEMS
For ages, Morse code has been used by many government officials during emergencies and when
normal means of communication are not available; it is also a type of sign language used by
people to communicate [17]. We found several existing models. One is Morse code translation using
sound, which works on the principle of sound clicks: one person communicates with the
eyes, another person converts the blinks into sound (clicks), and the clicks must then be decoded into
human-understandable language. In other models, the Morse code is generated using tongue
gestures, which is impractical. Some models use costly sensors that are not affordable for
ordinary people. A few existing systems have devised different representations (data patterns) of
0’s and 1’s to represent a character. The major disadvantage of this kind of system is that it is prone to
changes caused by external factors such as noise, strong electrical impulses, and changes due to internal
refraction of light in optical cables. In some models, input is taken in the form of head gestures, each
with a specific meaning: lifting the head upward means a dot, moving it downward
means a dash, turning it to the left means a backspace (to delete a dot or dash), and turning it to
the right confirms the input and converts it into text.
4. METHODOLOGY
In our research, we implement Morse code using eye blinks. First, we use OpenCV to
open the webcam for input. Then, with the help of dlib.get_frontal_face_detector(),
we detect the face. We then use dlib.shape_predictor to detect the eye region;
this predictor detects different facial landmarks, such as the tip of the nose and the edges of the eyes or
mouth or ears, and so on. We set different time limits to detect dashes and dots: 15
seconds for a dash and 30 seconds for a dot. So, to produce a dash, we blink our
eyes once and keep them closed for 15 seconds; similarly, for a dot, we blink our eyes once
and keep them closed for 30 seconds. After giving the input, we press the ‘q’ key to close the webcam.
We then store the dots and dashes in a list, which is passed on to our machine learning
models to predict the character. When the prediction is complete, we display the
character on the screen. The whole procedure is performed using the Flask framework as the
interface for the user. Flask bridges the gap between the machine learning
model and the front end (built with HTML and CSS), where the user gives
the input. When a user gives input, it goes to a proxy
server where our machine learning model is stored, and the output is fetched from there and
displayed on the screen.
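The step that turns eye closures into dots and dashes can be sketched as follows. This is a minimal sketch under stated assumptions, not the code used in this work. The text above does not say how eye closure is measured; the eye aspect ratio (EAR) used here is a common choice with six eye landmarks per eye and is an assumption. The function names, the closed-eye threshold of 0.2, and the limits min_frames, short_max, and long_max (counted in video frames) are all hypothetical illustration values. The sketch follows the convention stated in the introduction, where a short closure is a dot and a long closure is a dash.

```python
from math import dist


def eye_aspect_ratio(eye):
    """Eye aspect ratio from six (x, y) landmarks ordered p1..p6.

    p1 and p4 are the eye corners; p2, p3 lie on the upper lid and
    p6, p5 on the lower lid. The value drops towards 0 as the eye closes.
    Using EAR at all is an assumption, not a detail from the paper.
    """
    p1, p2, p3, p4, p5, p6 = eye
    vertical = dist(p2, p6) + dist(p3, p5)
    horizontal = dist(p1, p4)
    return vertical / (2.0 * horizontal)


def blinks_to_symbols(ear_per_frame, closed_below=0.2,
                      min_frames=3, short_max=15, long_max=30):
    """Convert a per-frame EAR series into a list of '.' and '-'.

    All thresholds are hypothetical. A run of closed frames with length
    in [min_frames, short_max] is a dot, one in (short_max, long_max]
    is a dash, and anything shorter or longer is ignored.
    """
    symbols = []
    run = 0  # length of the current run of closed-eye frames
    # A trailing open-eye value acts as a sentinel to flush the last run.
    for ear in list(ear_per_frame) + [1.0]:
        if ear < closed_below:
            run += 1
            continue
        if min_frames <= run <= short_max:
            symbols.append(".")
        elif short_max < run <= long_max:
            symbols.append("-")
        run = 0
    return symbols
```

In a webcam loop, one EAR value per frame would be appended to a list and passed to blinks_to_symbols once the user finishes; the resulting list of symbols is the kind of input that is handed to the prediction model.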
5. ALGORITHMS
Decision Tree
Decision Tree is a supervised machine learning technique. Supervised means that models of
this type require a label (the target column) to be available beforehand for training.
Unlike some other supervised techniques, it can be used for both classification and regression. A
decision tree has a tree structure in which every leaf node denotes an output, the branches denote the
decision rules, and the internal nodes denote features extracted from the data.
Decisions are taken based on these extracted features. A decision tree consists of two types of nodes:
decision nodes and leaf nodes. A decision node makes a decision and has
multiple branches, whereas a leaf node presents the outcome of those decisions and has
no further branches. In our decision tree, we used the Gini index to split the data and form the
tree structure. Among the different criteria used for splitting the data and forming the tree,
the Gini index performed best on our dataset, giving good splits and better
accuracy in the end.
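To make the splitting criterion concrete, the following minimal sketch computes the Gini impurity of a set of labels, defined as 1 minus the sum of the squared class proportions, and the weighted impurity of a candidate split. The function names are hypothetical and the sketch is only an illustration of the criterion, not the training code used in this work. A lower weighted value indicates a better split.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 - sum over classes of (class proportion)^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_gini(left, right):
    """Impurity of a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
```

For instance, a node holding two samples of "A" and two of "B" has impurity 0.5. Splitting it into ["A", "A"] and ["B", "B"] gives a weighted impurity of 0, while splitting it into ["A", "B"] and ["A", "B"] leaves it at 0.5, so a tree builder would prefer the first split.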
Random Forest
Random Forest is an ensemble learning technique. An ensemble technique creates
multiple models, holds a vote among them, and predicts the label or target that
receives the most votes. Accordingly, a random forest creates multiple tree models and combines
their predictions. A larger number of trees generally gives a more accurate output and also helps to
prevent overfitting. The random forest algorithm takes less training time than some
other algorithms. It can give high accuracy even when a large dataset is used or when a large
part of the data is missing. In this study, we created the model using 1000 random trees with a
maximum depth of 2.
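The voting step described above can be illustrated with a short sketch. It is not the model used in this work: the trees here are stand-in callables, and the names majority_vote and forest_predict are hypothetical. Each tree predicts a label for the same sample, and the label with the most votes is returned.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label with the most votes (ties: first seen wins)."""
    return Counter(predictions).most_common(1)[0][0]


def forest_predict(trees, sample):
    """Ask every tree for a label and combine the answers by voting.

    `trees` is any list of callables that map a sample to a label; in a
    real random forest each one would be a trained decision tree.
    """
    return majority_vote([tree(sample) for tree in trees])
```

If the models were built with scikit-learn, which the text does not state, the configuration described above would correspond to the n_estimators=1000 and max_depth=2 parameters of its random forest estimators.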
When we run the program, the webcam first opens and starts capturing video, which is
taken as input. We then process the video and analyse how long the eye blinks last in order to convert
them into dots and dashes, which are sent to a machine learning model for analysis and translation.
After the analysis, the output is displayed.
The existing systems have quite low accuracy in comparison with our proposed system. They are also
cumbersome to use, since the execution time is high: they
take seconds to predict a single character. The major drawback of the existing systems is that they
make the process tiresome and complicated, since a person performing Morse code with
eye blinks must also remember that a dot is represented by the right eye (R) and a dash by
the left eye (L). In our proposed system, we have reduced this burden by
removing the L-R eye pattern; instead, both eyes are used, with
a time limit of a 15-second blink for a dash and a 30-second blink for a dot. We have also used a
40-second closed-eye time window to remove extra dots and dashes, but a limitation remains:
extra dots and dashes still occur and sometimes hinder the prediction of
the character. Our proposed system implements machine learning techniques, with a
comparative analysis across four supervised models. The inclusion of machine
learning gives our proposed system the power to predict the character with high
accuracy, which was lacking in the existing models.
The above image depicts how to perform Morse code using eye blinks.
The above photo shows how the output is displayed on the screen.
Validation Graphs:
Figure 6. Graph of true values vs predicted values for the Random Forest Regressor
In the above graph, the blue line indicates the true values and the scattered points indicate the
predicted values. Some points do not coincide with the line, so the accuracy of this model is
comparatively low.
Figure 7. Graph of true values vs predicted values for the Random Forest Classifier
In the above graph, the blue line indicates the true values and the scattered points indicate the
predicted values. Some points do not coincide with the line, but the accuracy of this model is
better than that of the Random Forest Regressor: one point is above the line and one point is below.
Figure 8. Graph of true values vs predicted values for the Decision Tree Classifier
In the above graph, the blue line indicates the true values and the scattered points indicate the
predicted values. Some points do not coincide with the line, but the accuracy of this model is
better than that of the Random Forest Classifier: only one point is below the line.
Figure 9. Graph of true values vs predicted values for the Decision Tree Regressor
In the above graph, the blue line indicates the true values and the scattered points indicate the
predicted values. Some points do not coincide with the line, but the accuracy of this model is
better than that of the Decision Tree Classifier: here the points are scattered, but they are
equidistant from the blue line.
The above graphs represent how well our proposed system predicts the true values.
From these graphs, we can say that the Decision Tree Regressor has the best
graph representation, since its true values and predicted values are in sync with each other.
Model                       Accuracy
Random Forest Regressor     84%
Random Forest Classifier    92%
Decision Tree Classifier    91%
Decision Tree Regressor     96%
In the base paper we referred to, no machine learning algorithm was used; only OpenCV techniques
were used, and an accuracy of 51.16% was obtained. In this project, we therefore created four
different machine learning (ML) models and carried out a comparative study. Each model gave a
different accuracy, as shown in the above table. Among these four models, the Decision Tree
Regressor has the highest accuracy, so it is the best model for this project.
7. CONCLUSION
The existing system was quite complicated, and people using it had to remember
a lot, whereas our system resolves these complications and adds predictive power. The
only flaw in the proposed system is that time limits are associated with the dots and dashes. In
the future, we can improve the model by removing the time limits and implementing advanced
machine learning algorithms such as neural networks. The model could also be extended to words,
sentences, and even paragraphs.
REFERENCES
[1] Kranthi Kumar, V. Sai Srikar, Y. Swapnika, V. Sai Sravani, N. Aditya, “A Novel Approach for
Morse Code Detection from Eye Blinks and Decoding using OpenCV”.
[2] Nugaliyadde, K.N. Manatunga and K.H.D. Perera, “Compression Using Morse Code and Data
Patterns”.
[3] Ricky Li, “Computer input of morse code using finger gesture recognition”.
[4] Headspeak, “Morse code Based Head Gesture to Speech Conversion Using Intel”.
[5] Hari Singh, Jaswinder Singh, IKG Punjab Technical University, Kapurthala, India, Beant
College of Engineering and Technology, Gurdaspur, India, “Performance Analysis of Real-
Time Eye Blink Detector for Varying Lighting Conditions and User Distance from the
Camera”.
[6] Sukhwinder Kaur, Hari Singh, Realsense Technology Rupam Das, Kuderu B. ShivaKumar,
“Human Eye Blink Detection using YCbCr Color Model, Haar-Like Features and Template
Matching”.
[7] Paparao Nalajala, Bhavana Godavarth, M Lakshmi Raviteja, Deepthi Simhadri, “Morse code
generator using Microcontroller with alphanumeric keypad”.
[8] Luis Ricardo Sapaico and Makoto Sato Tokyo Institute of Technology, Tokyo, Japan,
“Analysis of Vision-based Text Entry using Morse Code generated by Tongue Gestures”.
[9] https://www.britannica.com/biography/Samuel-F-B-Morse
[10] https://en.wikipedia.org/wiki/Morse_code
[11] http://www.learnmorsecode.com/
[12] Prakash K.B., Content extraction studies using total distance algorithm, 2017, Proceedings of
the 2016 2nd International Conference on Applied and Theoretical Computing and
Communication Technology, iCATccT 2016, 7912085, 673-679,
10.1109/ICATCCT.2016.7912085
[13] Prakash K.B., Mining issues in traditional indian web documents, 2015, Indian Journal of
Science and Technology, 8(32), 10.17485/ijst/2015/v8i1/77056
[14] Babitha D., Jayasankar T., Sriram V.P., Sudhakar S., Prakash K.B., Speech emotion recognition
using state-of-art learning algorithms, 2020, International Journal of Advanced Trends in
Computer Science and Engineering, 9(2), 1340-1345, 10.30534/ijatcse/2020/67922020
[15] Prakash K.B., Rajaraman A., Lakshmi M., Complexities in developing multilingual on-line
courses in the Indian context, 2017, Proceedings of the 2017 International Conference On
Big Data Analytics and Computational Intelligence, ICBDACI 2017, 8070860, 339-342,
10.1109/ICBDACI.2017.8070860
[16] Prakash K.B., Kumar K.S., Rao S.U.M., Content extraction issues in online web education,
2017, Proceedings of the 2016 2nd International Conference on Applied and Theoretical
Computing and Communication Technology, iCATccT 2016, 7912086, 680-685,
10.1109/ICATCCT.2016.7912086
[17] Prakash K.B., Rajaraman A., Perumal T., Kolla P., Foundations to frontiers of big data analytics,
2016, Proceedings of the 2016 2nd International Conference on Contemporary Computing
and Informatics, IC3I 2016, 7917968, 242-247, 10.1109/IC3I.2016.7917968
[18] Bharadwaj Y.S.S., Rajaram P., Sriram V.P., Sudhakar S., Prakash K.B., Effective handwritten
digit recognition using deep convolution neural network, 2020, International Journal of
Advanced Trends in Computer Science and Engineering, 9(2), 1335-1339,
10.30534/ijatcse/2020/66922020