
A Survey of Punjabi Language Translation using OCR and ML

Shaveta Khepra, Priya Kumari, Raj Gupta
Department of Computer Science & Engineering, School of Engineering & Technology, Sharda University, Greater Noida, India
[email protected], [email protected], [email protected]

Abhishek
Department of Computer Science & Engineering, School of Engineering & Technology, Sharda University, Greater Noida, India
[email protected]

Vijendra Singh Bramhe
Department of Computer Science and Engineering, Shri Ramdeobaba College of Engineering and Management, Nagpur, India
[email protected]

Abstract— The use of Natural Language Processing (NLP) in machine translation has grown in significance as technology and computers have become more prevalent in our daily lives. This paper's goal is to review the work done in the field of NLP to translate the Gurmukhi script, which is primarily used to write the Punjabi language. Together, Hindi and Punjabi, the languages associated with the Gurmukhi script, have more than 1.5 billion native users, making them among the most widely used languages worldwide. However, there is limited research available on the Gurmukhi script. The study presented in this paper focuses on using machine learning and AI techniques to translate written Gurmukhi text into Hindi and English. The input for the translation process is machine-printed Gurmukhi text, while the output is provided in both Hindi and English. The goal of this study is to make translation from regional languages easier and more accurate, improving the flow of information and technology across the world's languages. The paper provides a comprehensive review of the work related to the translation of the Gurmukhi script using NLP techniques, with a focus on machine learning and AI. The findings of this study can contribute to the development of more advanced and accurate translation systems, making the exchange of information and knowledge more accessible for people who speak different languages.

Keywords— Machine Learning (ML), Artificial Intelligence (AI), Natural Language Processing (NLP), Gurmukhi Script, Machine Translation (MT).

I. INTRODUCTION

Machine learning (ML) is a branch of Artificial Intelligence (AI) in which computer programs learn to predict outcomes more accurately without needing explicit instructions for each task [1]. The term "machine learning" was coined by researchers who noticed computers detecting patterns and theorized that computers might learn without being taught to carry out particular tasks [2].

An MT system can accept a phrase, a chunk of text, or entire documents as input. The ambiguities in this problem range from the morphological (how to choose words and combine sub-word units) and the syntactic (related to grammar) to the semantic (the meanings of words, as the same term can refer to different things, such as the bank of a stream or a financial bank), and so forth. This makes the task quite difficult for a computer, which lacks both world knowledge and an intuitive grasp of human vocabulary [3]. In this paper, machine learning algorithms are used to translate the Gurmukhi script into a target language. The input is Gurmukhi text, which is then processed through pre-processing, text segmentation, recognition, and post-processing; the output is text in the target language. The pre-processing step consists of several processes applied sequentially to a text file: it accepts a text file as input and enhances it by minimizing noise and distortion [4]. For the Gurmukhi script, other authors have applied the noise detection and removal method created by Lehal and Dhir [5]. Additionally, Lehal and Singh have proposed a post-processing method for Gurmukhi script optical character recognition [6].

The LSTM neural network has been successfully used for a variety of natural language processing tasks, including machine translation, and can handle sequential data such as text. By applying LSTMs to the translation of large Gurmukhi texts, the model can learn and capture long-term dependencies and patterns in the text, which can result in more accurate and fluent translations. This approach can also address the challenge of translating large volumes of text, which is often encountered in real-world applications. Therefore, the use of LSTMs for the translation of large Gurmukhi texts can be considered a novel approach that can potentially improve the quality and efficiency of the translation process.
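The pre-processing → segmentation → recognition → translation flow described above can be sketched end to end. Everything below (the function names, the two-word lexicon) is an illustrative assumption for this survey, not the authors' implementation:

```python
# Illustrative sketch of the translation pipeline described above:
# pre-process -> segment -> recognize/translate. All names and the
# toy Gurmukhi->English lexicon are hypothetical, not the paper's code.

TOY_LEXICON = {"ਪਾਣੀ": "water", "ਘਰ": "house"}  # tiny illustrative map

def preprocess(text: str) -> str:
    # Real pre-processing removes noise and distortion [4];
    # here we only normalize whitespace.
    return " ".join(text.split())

def segment(text: str) -> list:
    # Real systems segment lines, words, and characters;
    # we simply split on whitespace.
    return text.split(" ")

def translate_tokens(tokens: list) -> list:
    # Recognition + translation collapsed into a lexicon lookup;
    # unknown tokens pass through unchanged.
    return [TOY_LEXICON.get(tok, tok) for tok in tokens]

def pipeline(text: str) -> str:
    return " ".join(translate_tokens(segment(preprocess(text))))

print(pipeline("ਪਾਣੀ  ਘਰ"))  # -> "water house"
```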

978-93-80544-47-2/23/©BVICAM, New Delhi, India 136


Authorized licensed use limited to: Thapar Institute of Engineering & Technology. Downloaded on February 28,2024 at 20:24:27 UTC from IEEE Xplore. Restrictions apply.
A. Types of MT (Machine Translation)

In machine translation, artificial intelligence is used to convert text from one language to another without the need for a human translator. Modern machine translation aims to convey the complete meaning of the source text in the target language, going beyond mere word-for-word translation.

Fig. 1. Machine Translation Approaches

1) Direct Translation:
Direct machine translation, also referred to as "pure" machine translation, uses algorithms to convert content from one language to another. This is the most basic form of machine translation and is often used for simple or routine translations.

The process of direct machine translation typically involves the following steps:

• Input: The machine translation system receives the source text, written in the source language.
• Analysis: The system analyzes the source text and breaks it down into smaller units, such as words or phrases.
• Translation: The system then applies algorithms to convert the content from the source language to the target language. This can be done using rule-based systems, statistical systems, or a combination of both.
• Output: The system generates the translated text in the target language, which can then be reviewed by the user.

2) Rule-Based Machine Translation (RBMT):
RBMT, one of the first commercialized machine translation strategies, is built on linguistic rules that allow words to be used in diverse contexts and to carry multiple interpretations. Large numbers of linguistic rules are applied in the RBMT methodology in three stages: analysis, transfer, and generation. The rules for accurately translating between two languages are established by experienced programmers and linguists, based on extensive research into and understanding of the principles underlying each language.

Because these methods have high precision but low recall, they produce the desired results only when the rules can be applied properly, which is not always the case. Careful sequencing of the rules is also required, since rules can contradict one another when more than one applies to a phrase [3].

Fig. 2. Rule-Based Machine Translation

Within rule-based MT, there are two types:

• Transfer-based machine translation: This type breaks the translation process down into several subtasks, such as morphological analysis, syntactic parsing, and semantic analysis, and then transfers the source text's meaning into the target language. This approach is useful for handling complex grammatical structures and idiomatic expressions.

2023 10th International Conference on Computing for Sustainable Global Development (INDIACom) 137
• Interlingual machine translation: This approach translates between the source and target languages through an intermediary language, often a constructed one: the source text is first mapped into the interlingua and then translated into the target language. The advantage of this approach is that it can handle multiple languages at once and may reduce errors in the final output.
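The analysis, transfer, and generation stages that rule-based systems apply can be illustrated with a deliberately tiny sketch. The part-of-speech tags, the single subject-object-verb reordering rule, and the romanized lexicon below are all invented for this example and are not taken from any real RBMT system:

```python
# Toy transfer-based translation: analyze tokens, apply one reordering
# rule (source order S-O-V, target order S-V-O), then generate output
# words from a lexicon. Rules and vocabulary are invented for illustration.

LEXICON = {"munda": "boy", "seb": "apple", "khanda": "eats"}
POS = {"munda": "N", "seb": "N", "khanda": "V"}

def analyze(sentence):
    # Analysis stage: tag each token with a part of speech.
    return [(tok, POS.get(tok, "?")) for tok in sentence.split()]

def transfer(tagged):
    # Transfer stage: move the final verb of an S-O-V clause
    # before the object to obtain S-V-O order.
    if len(tagged) == 3 and [p for _, p in tagged] == ["N", "N", "V"]:
        subj, obj, verb = tagged
        return [subj, verb, obj]
    return tagged

def generate(tagged):
    # Generation stage: emit target-language words.
    return " ".join(LEXICON.get(tok, tok) for tok, _ in tagged)

print(generate(transfer(analyze("munda seb khanda"))))  # -> "boy eats apple"
```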
3) Corpus-based machine translation:
This type of machine translation relies on large amounts of bilingual text (a parallel corpus) to train statistical models for translation. The models learn patterns in the data and use those patterns to produce translations. Within corpus-based machine translation, there are two types:

• Example-based machine translation: This type uses a database of bilingual sentence pairs to translate text. The system retrieves the most similar sentence pair from the database and adapts it to create the target sentence. This approach is useful for handling rare languages or less common language pairs.

• Statistical Machine Translation (SMT): In pure SMT, the model is built entirely from information in corpora, with no user intervention. Until the beginning of the 2010s, it was the dominant paradigm. To learn how to translate, a computer needs examples that provide the following information: the translation of the phrases (the bilingual word mappings), that is, which word(s) correspond to which, as well as the appropriate placement of the translated words in the target phrase (alignment) [3].

Fig. 3. Statistical Machine Translation
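The bilingual word mappings that pure SMT learns from a parallel corpus can be approximated, for illustration only, by simple co-occurrence counting. The two sentence pairs below are invented, and real systems estimate these probabilities with proper alignment models (e.g., the IBM models) rather than raw counts:

```python
# Toy estimate of word-translation probabilities p(t|s) from a tiny
# parallel corpus via co-occurrence counting. A stand-in for real SMT
# alignment models; the sentence pairs are invented for illustration.
from collections import Counter, defaultdict

parallel = [
    ("ghar vadda", "house big"),
    ("ghar chhota", "house small"),
]

counts = defaultdict(Counter)
for src, tgt in parallel:
    # Count every source word together with every target word
    # in the same sentence pair.
    for s in src.split():
        for t in tgt.split():
            counts[s][t] += 1

def p(t, s):
    # Normalize the counts for source word s into a distribution.
    total = sum(counts[s].values())
    return counts[s][t] / total

print(round(p("house", "ghar"), 2))  # -> 0.5
```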
4) Knowledge-based machine translation:
This kind of machine translation translates text based on a predetermined set of grammatical and lexical rules; another name for it is rule-based machine translation (RBMT). Knowledge-based machine translation is particularly advantageous for well-defined specific domains, such as technical or legal texts.

5) Neural Machine Translation (NMT):
Neural machine translation builds statistical translation systems using neural network architectures inspired by the human brain. One significant advantage of NMT is that a single system can be used to handle both the source and target languages, unlike other machine translation systems such as SMT. This means that NMT does not rely on certain components that are typically required in other systems.

Fig. 4. Neural Machine Translation

6) Hybrid machine translation:
Hybrid machine translation combines various translation methods to enhance the accuracy and fluency of the translated text.

A common technique is to start from a statistical machine translation (SMT) system that has been trained on massive amounts of parallel text. The SMT system is then augmented with additional information from other sources, such as a neural machine translation (NMT) system or a knowledge-based system.

The NMT system is based on deep learning and is trained on large amounts of monolingual data, which allows it to generate more fluent and natural-sounding translations. The knowledge-based system, on the other hand, uses linguistic and domain-specific knowledge to improve the translation of specific phrases or terms.

The different systems can be combined in a number of ways, such as:

• Cascading: one translation system's output is used as the input to another system.
• Ensemble: the outputs of multiple systems are integrated through a voting or weighted-average method.
• Hybrid: the outputs of multiple systems are combined at different stages of the translation process.

Hybrid machine translation systems can provide more accurate translations for specific domains and languages and can also adapt to new domains and languages more quickly.

7) Recurrent Neural Networks (RNN):
Recurrent neural networks (RNNs) are neural networks created especially for handling sequential data, including text, speech, and time series. The term "recurrent" refers to the network's ability to process input data sequentially, with the current time step receiving the results of the preceding time step. This recurrence enables RNNs to capture patterns in sequential data that other neural networks may miss.

An RNN consists of a series of recurrent layers, each of which contains a set of neurons (or "units"). Each neuron receives input from the previous layer, as well as from the previous state of the same layer. The output of each neuron

is then passed to the next layer, as well as to the next state of the same layer. This allows the RNN to maintain a "memory" of the previous inputs, which is particularly useful for processing sequences of data where the current input depends on the previous inputs. Applications of RNNs include time-series forecasting, voice recognition, machine translation, and natural language processing.

B. Types of RNN

There are different types of RNN, including the vanilla RNN, the Elman RNN, the Jordan RNN, Long Short-Term Memory (LSTM), and the Gated Recurrent Unit (GRU). Each of these has a different architecture and characteristics. RNNs are applied in various fields, including natural language processing, speech recognition, time-series prediction, and machine translation. Due to their ability to capture dependencies in sequential data, RNNs have become an essential tool in these areas of research, enabling researchers to develop more accurate and efficient models for complex tasks.

1) Vanilla RNN:
The vanilla RNN is the simplest type of recurrent neural network. It consists of a single layer of neurons, where each neuron receives input from the previous layer as well as from the previous state of the same layer. A vanilla RNN uses a mathematical formula to compute the hidden state at each time step t from the input at time step t (xt) and the hidden state at the previous time step (ht-1). The hidden state at time step t (ht) is computed as:

ht = f(Wxh xt + Whh ht-1)

where the weight matrices Wxh and Whh interconnect the inputs and the hidden states, respectively, and f is a nonlinear activation function such as sigmoid or tanh. Vanilla RNNs can process sequential data where the current input depends on the previous inputs. However, during backpropagation their gradients tend to vanish or explode. Because of this, vanilla RNNs are not commonly used in practice, and other types of RNNs, such as LSTMs and GRUs, are generally preferred.
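The vanilla RNN recurrence ht = f(Wxh xt + Whh ht-1) can be written out directly. The weights below are arbitrary toy values, not trained parameters:

```python
# One step of a vanilla RNN, h_t = tanh(Wxh*x_t + Whh*h_(t-1)),
# written with plain Python lists for clarity (toy weights).
import math

def rnn_step(x, h_prev, Wxh, Whh):
    # Each hidden unit i combines the current input and the
    # previous hidden state, then applies the tanh nonlinearity.
    h = []
    for i in range(len(Whh)):
        s = sum(Wxh[i][j] * x[j] for j in range(len(x)))
        s += sum(Whh[i][j] * h_prev[j] for j in range(len(h_prev)))
        h.append(math.tanh(s))
    return h

Wxh = [[0.5, 0.0], [0.0, 0.5]]   # input-to-hidden weights (toy)
Whh = [[0.1, 0.0], [0.0, 0.1]]   # hidden-to-hidden weights (toy)
h = [0.0, 0.0]
for x in ([1.0, 0.0], [0.0, 1.0]):   # a two-step input sequence
    h = rnn_step(x, h, Wxh, Whh)
print([round(v, 3) for v in h])
```

Note how the second step's state depends on the first step's state through Whh, which is exactly the "memory" the text describes.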
2) Elman RNN:
An Elman RNN extends the vanilla RNN with a context layer, connected to the hidden layer, that preserves the prior hidden state. The context layer serves as a "memory" of the prior inputs, enabling the network to interpret data sequences in which the present input depends on the prior inputs more effectively.

Elman RNNs use an architecture similar to that of vanilla RNNs but include an output layer that depends on the current hidden state. The input at time step t (xt) and the hidden state at the previous time step (ht-1) are used to compute the hidden state at each time step t:

ht = f(Wxh xt + Whh ht-1)

where the weight matrices Wxh and Whh connect the inputs and the hidden state, respectively, and f is a nonlinear activation function such as sigmoid or tanh. The network's output is computed as:

yt = Wyh ht

where Wyh is the weight matrix that connects the hidden state to the output.

The Elman RNN architecture is able to mitigate the vanishing gradient problem that occurs in vanilla RNNs, and it has a simpler architecture than LSTM and GRU, which makes it easier to train and understand. Elman RNNs are frequently used in a variety of applications, such as time-series forecasting, voice recognition, and natural language processing. Due to their ability to capture sequential dependencies and model complex patterns, Elman RNNs have proven to be an effective tool for problems in these fields, such as language modeling, speech recognition, and predicting stock prices.

3) Jordan RNN:
A Jordan RNN is an extension of the Elman RNN that includes a second recurrent connection, between the output layer and the hidden layer. This second connection allows the output of the network to affect the hidden state, which can improve the network's ability to process sequences of data in which the current input depends on the previous inputs and outputs.

In a Jordan RNN, the input at time step t (xt), the hidden state at time step t-1 (ht-1), and the output at the previous time step (yt-1) are used to compute the hidden state at each time step t. The hidden state at time step t (ht) is computed as:

ht = f(Wxh xt + Whh ht-1 + Wyh yt-1)

where the weight matrices Wxh and Whh connect the input and hidden states, Wyh is the weight matrix that links the previous output back to the hidden state, and f is an activation function such as sigmoid or tanh. The output is computed as a function of the hidden state and a weight matrix Why, which allows the RNN to capture patterns in sequential data by modeling dependencies through the hidden state:

yt = Why ht

The Jordan RNN architecture is able to mitigate the vanishing gradient problem that occurs in vanilla RNNs, but it has a more complex architecture than the Elman RNN. Jordan RNNs are not as popular as other types of RNNs, such as LSTM and GRU, due to their complexity and the availability of other RNN architectures that have proven more practical.

4) Long Short-Term Memory (LSTM):
LSTM networks function effectively for classification, evaluation, and forecasting on time-series data, even when there are delays of varying lengths between key events in the series. LSTMs were developed to address the vanishing gradients that can occur when training traditional RNNs [3]. An LSTM-based sequence-to-sequence model consists of three parts:

1. Encoder
2. Encoder Vector
3. Decoder

Fig. 5. Long Short-Term Memory (LSTM)

a) Encoder: The encoder is a stack of several recurrent units (LSTM or Gated Recurrent Unit (GRU) units, for better performance), each of which receives a single element of the input sequence and propagates the information for that element forward. In a question-answering problem, the input sequence is the collection of all the words in the question. Each word is represented as x_i, where i denotes the word's position in the sequence. The hidden states h_i are calculated as follows:

h_t = f(Whh h_(t-1) + Wxh x_t)

This is the formula of a typical recurrent neural network: we simply apply the appropriate weights to the input vector x_t and the previous hidden state h_(t-1).

b) Encoder Vector: This is the final hidden state produced by the encoder part of the model. This vector aims to encapsulate the information from every input element in order to help the decoder make accurate predictions. It serves as the initial hidden state of the decoder part of the model.

c) Decoder: The decoder is a stack of several recurrent units, each of which predicts an output y_t at time step t. Each recurrent unit accepts a hidden state from the preceding unit and produces an output as well as its own hidden state. In a question-answering problem, the output sequence is the collection of all the words in the answer. Each word is represented as y_i, where i denotes the word's position in the sequence. The internal state h_i is computed as follows:

h_t = f(Whh h_(t-1))

Notice that we are simply using the previous hidden state to compute the next one. The output y_t at time step t is calculated as:

y_t = softmax(Ws h_t)

We calculate the outputs using the hidden state at the current time step together with the corresponding weight matrix Ws. Softmax is used to create a probability vector that lets us predict the final output (e.g., a word in the question-answering problem). In this project, we will use an LSTM and will train a KNN on top of it.

Advantages and disadvantages of LSTM: Some advantages of Long Short-Term Memory (LSTM) networks include:

• Ability to process long sequences of data: LSTMs can remember information from earlier in the sequence and use it to inform their predictions, which makes them well suited to tasks that require processing long sequences of data.
• Robustness to noise and missing data: LSTMs can handle noise and missing data in the input sequence, which is advantageous when the data is inconsistent or incomplete.
• Ability to model complex dependencies: LSTMs have a complex architecture that allows them to model complex dependencies between elements of the input sequence.

Some disadvantages of LSTMs include:

• High computational cost: LSTMs are more computationally expensive than other types of recurrent neural networks, which can make them slower to train and less practical for large-scale applications.
• Difficulty in understanding and interpreting the model: the complex architecture of LSTMs can make it difficult to understand and interpret the model's predictions and decisions.
• Difficulty in debugging: the complex architecture of LSTMs can also make it difficult to debug the model when it is not performing as expected.

d) Gated Recurrent Unit (GRU):
The GRU is a type of RNN designed to address the vanishing gradient issue in traditional RNNs. It uses gates to regulate the flow of information and to improve the handling of sequential data: an update gate, which regulates how much information from the input signal is passed to the current hidden state, and a reset gate, which controls how much information from the previous hidden state carries over to the current one.

Mathematically, the update gate is computed as:

z_t = σ(Wz x_t + Uz h_(t-1))

where σ is the sigmoid activation function and Wz and Uz are the weight matrices that connect, respectively, the input and hidden states.

The reset gate is computed as:

r_t = σ(Wr x_t + Ur h_(t-1))

where Wr and Ur are the weight matrices that connect the input and hidden states, respectively.

The current hidden state is then computed as:

h_t = (1 - z_t) ⊙ h_(t-1) + z_t ⊙ tanh(W x_t + U (r_t ⊙ h_(t-1)))
where W and U are the weight matrices that connect, respectively, the input and hidden states.

The output of the network is computed as:

y_t = Wy h_t

where Wy is the weight matrix that connects the hidden state to the output. GRUs have been found to be more effective than traditional RNNs and are used in a variety of applications, including sequential data prediction, voice recognition, and natural language processing.
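A single GRU step, following the update-gate, reset-gate, and hidden-state equations above, can be sketched with scalar states for readability. The weights here are arbitrary toy values, not trained parameters:

```python
# One GRU step with scalar states: update gate z, reset gate r,
# candidate state, and interpolation between old and new state.
# Weights are toy values chosen for illustration.
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def gru_step(x, h_prev, Wz, Uz, Wr, Ur, W, U):
    z = sigmoid(Wz * x + Uz * h_prev)             # update gate
    r = sigmoid(Wr * x + Ur * h_prev)             # reset gate
    h_cand = math.tanh(W * x + U * (r * h_prev))  # candidate state
    return (1.0 - z) * h_prev + z * h_cand        # mix old and new

h = 0.0
for x in (1.0, -1.0, 0.5):  # toy input sequence
    h = gru_step(x, h, Wz=0.8, Uz=0.2, Wr=0.8, Ur=0.2, W=1.0, U=0.5)
print(round(h, 4))
```

Because the new state is a gated interpolation rather than a full overwrite, gradients can flow through the `(1 - z) * h_prev` path, which is how the GRU counters the vanishing gradient problem described above.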
e) Bi-directional RNN:
A bidirectional RNN (Recurrent Neural Network) is a type of RNN that processes the input sequence in two directions: from past to future and from future to past. This allows the network to take into account both past and future context when processing the input sequence, which can improve performance on certain tasks such as language modeling or machine translation. In a bidirectional RNN, there are typically two separate RNNs, each processing the input sequence in one direction; their outputs are then combined before being passed to the next layer or used to make a final prediction.
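The two directional passes and the combination of their outputs can be sketched as follows; a trivial running-sum recurrence stands in for the actual RNN cells:

```python
# Sketch of bidirectional processing: run one pass left-to-right and
# one right-to-left, then pair the two states at each position.
# The "RNN" here is a trivial running-sum stand-in, not a real cell.

def run(seq):
    # Stand-in recurrence: state = previous state + current input.
    states, s = [], 0
    for x in seq:
        s = s + x
        states.append(s)
    return states

def bidirectional(seq):
    fwd = run(seq)                                   # past -> future
    bwd = list(reversed(run(list(reversed(seq)))))   # future -> past
    return list(zip(fwd, bwd))                       # both contexts

print(bidirectional([1, 2, 3]))  # -> [(1, 6), (3, 5), (6, 3)]
```

At each position the pair holds a summary of everything to the left and everything to the right, which is exactly the "past and future context" the text describes.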
f) Deep RNN:
A deep recurrent neural network (deep RNN) is a recurrent neural network with multiple layers of hidden units, also known as depth. These layers allow the network to learn and represent more complex functions, which can help improve its performance on certain tasks.

A deep RNN passes the hidden state from one layer as input to the succeeding layer, allowing information to flow through the network. This allows the network to learn hierarchical representations of the input data, which can be useful for tasks such as natural language processing and speech recognition.

Deep RNNs can be created by stacking multiple layers of RNN cells, such as LSTM or GRU cells, on top of each other. Each layer can learn distinct features or representations of the input sequence; because each layer's output is passed as input to the succeeding layer, the network is able to learn more complicated representations of the training data. The final output is a combination of the outputs from all the layers.

However, the computational cost, along with the number of parameters in the network, rises as the number of layers and the number of hidden units per layer increase.

Training deep RNNs can also be difficult due to the vanishing gradient problem, which arises as the gradient signal propagates through many layers of the network. To overcome this issue, techniques such as gradient clipping can be used, as well as more advanced RNN architectures like LSTM and GRU.

g) Time-Delay Neural Network (TDNN):
The Time-Delay Neural Network is a feedforward neural network designed for processing sequential data, such as time-series data and speech. It is called a Time-Delay Neural Network because it uses time delays to combine information from multiple time steps in the input sequence. A TDNN consists of multiple layers of neurons, each of which has a set of fixed-size, context-dependent filters that are applied to different portions of the input sequence. These filters are designed to extract relevant features from the input data and can be thought of as having a local receptive field. The filters are applied at different time delays to the input sequence, which allows the network to combine information from multiple time steps. A TDNN architecture also includes pooling layers, which are used to reduce the dimensionality of the data. The pooling layers are applied after the filters, and they aggregate the information from the filters over a certain time window.

The output of a TDNN is typically a fixed-size vector, which can be used as features for a classifier or regressor. TDNNs are commonly used for tasks that involve temporal data with spread-out relevant information, such as speech recognition, natural language processing, and time-series prediction. They are effective at handling inputs with temporal structure and have been found to be more effective than traditional feed-forward neural networks on sequential data, due to their ability to combine information from multiple time steps in the input sequence.

II. LITERATURE SURVEY

A. Existing System
A brief history of the development of the Gurmukhi script, including its copybook variations utilizing classifiers, is provided in Jasuja's 1996 study, Examination of Gurumukhi Script [7]. A feature extraction and hybrid classification strategy for machine identification of Gurmukhi characters, using binary decision trees and nearest neighbors, is described in the 1999 study "Feature extraction and classification for OCR of Gurmukhi script" [22]. The purpose of the 2001 study "A shape-based post processor for Gurmukhi OCR" [6] is to employ post-processing as contextual information to clarify or correct issues in OCR results. A support vector machine (SVM), a pattern classifier that uses statistical learning and is ideally suited to binary classification problems, was utilized by Umapada Pal (2004) [8] for character recognition. The 2010 article "A Complete Machine-Printed Gurmukhi OCR System" [4] explores potential causes of the problems that occur during the various stages of developing a complete OCR system for the Gurmukhi script; the authors' multi-stage classification technique used the binary tree and the k-nearest neighbors classifier in a hierarchy. Dharam Veer Sharma describes the first use of the Neocognitron to build a system for identifying single handwritten Gurmukhi letters in his 2010 paper, Gurumukhi Script Isolated Handwritten Character Recognition Using Neocognitron [16]. In their 2011 study [5], G. S. Lehal and Renu Dhir provide information on techniques for recognizing skew induced during document scanning in Gurmukhi script. In his article on offline handwritten Gurmukhi letter identification, Munish Kumar (2012) [9] studied numerous features, including their pairings; the C-SVC type classifiers of the Lib-SVM programme were used for classification. In Gurmukhi Script to Braille Code, 2012 [24], the author of
network that has been created to deal with sequential data,

2023 10th International Conference on Computing for Sustainable Global Development (INDIACom) 141
Authorized licensed use limited to: Thapar Institute of Engineering & Technology. Downloaded on February 28,2024 at 20:24:27 UTC from IEEE Xplore. Restrictions apply.
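The layer-stacking and gradient-clipping ideas summarized in the deep RNN discussion above can be sketched in a few lines of NumPy. This is only a minimal illustration: the dimensions, random weights, and function names are assumptions for the sketch, not taken from any of the surveyed systems (which mostly use LSTM/GRU cells rather than the plain tanh cell shown here).

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_layer(x_seq, h0, W_xh, W_hh, b):
    """Run one simple (tanh) RNN layer over a sequence.
    x_seq: (T, input_dim), h0: (hidden_dim,) -> (T, hidden_dim)."""
    h, out = h0, []
    for x in x_seq:
        h = np.tanh(x @ W_xh + h @ W_hh + b)
        out.append(h)
    return np.stack(out)

def deep_rnn(x_seq, layers):
    """Stack layers: each layer's hidden-state sequence becomes the
    next layer's input sequence, as described in the text."""
    seq = x_seq
    for W_xh, W_hh, b in layers:
        seq = rnn_layer(seq, np.zeros(W_hh.shape[0]), W_xh, W_hh, b)
    return seq

def clip_by_norm(grad, max_norm=1.0):
    """Gradient clipping by global norm, one common remedy for the
    vanishing/exploding gradient issues mentioned above."""
    norm = np.linalg.norm(grad)
    return grad * (max_norm / norm) if norm > max_norm else grad

# Toy dimensions (illustrative only): T=6 time steps, 4 input features,
# two stacked layers with 8 and 5 hidden units.
T, d_in, d1, d2 = 6, 4, 8, 5
x = rng.normal(size=(T, d_in))
layers = [
    (0.1 * rng.normal(size=(d_in, d1)), 0.1 * rng.normal(size=(d1, d1)), np.zeros(d1)),
    (0.1 * rng.normal(size=(d1, d2)), 0.1 * rng.normal(size=(d2, d2)), np.zeros(d2)),
]
out = deep_rnn(x, layers)
print(out.shape)  # prints (6, 5): one top-layer hidden vector per time step
```

The forward pass alone shows the information flow between layers; a trained system would additionally backpropagate through time and apply `clip_by_norm` to the accumulated gradients before each parameter update.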
Blind individuals frequently use the Braille writing system. In research on text line detection and segmentation in handwritten Gurumukhi scripts, 2013 [23], the authors present a text line segmentation technique for handwritten Punjabi documents that addresses linked and overlapping text line components and extracts text lines from handwritten scanned documents; the technique identifies the most advantageous text line segments and connects them to their corresponding text lines by creating a space between adjacent text lines. For offline recognition of handwritten Gurmukhi characters, Munish Kumar, 2014 [10] proposed two alternative feature extraction strategies, namely parabolic-curve-fitting-based features and power-curve-fitting-based features. V. S. Dhaka, 2015 [11] gives an overview of approaches for detecting characters and words in recovered text with reduced error; a modified LeNet-5 CNN model is used. Munish Kumar, 2015 [12] suggested a feature extraction method for the identification of segmented handwritten Gurmukhi letters; to reduce the dimensionality of the data, this work also uses a feature selection method based on PCA, and SVM, MLP, and k-NN classifiers are all employed. Direct MT, RBMT, and data-driven MT (DDMT) were the three methods of machine translation used historically (Bhattacharyya, 2015) [13]. The author of "HMM-based Indic Handwritten Word Recognition", 2016 [18] put forward an algorithm for classifying strokes written in the top, middle, or lower zones of Gurmukhi, as well as for identifying the zones. Krishma Koundal, 2017 [14] used digitization, pre-processing, text segmentation, feature extraction, classification, and post-processing. Nadeem Jadoon Khan, 2017 [15] promoted the development of SMT (Statistical Machine Translation) and linguistic resources for these language pairs; the article provides a brief history of machine translation, summarizes Machine Translation (MT) approaches with the SMT approach being employed, and briefly explores the Indian languages used in the study. A new deep learning approach to recognizing isolated compound characters in handwritten Bangla, 2017 [21], establishes a new benchmark for character recognition; greedy layer-wise training of deep neural networks has significantly advanced the field of pattern recognition. The major goal of the paper on Devanagari and Gurmukhi handwritten script, 2018 [17] was to identify and evaluate document images in the context of machine learning classifiers using OCR, BPNN, PNN, and SVM. The Handwritten Optical Character Recognition (OCR), 2020 [25] paper presents OCR techniques for both the digitization of typewritten materials and the conversion of handwritten medieval manuscripts into digital format. The procedures used to translate the Gurmukhi script were examined in "A Comprehensive Study on the Recognition of Gurmukhi Script" using HCR, PCR, and the Hidden Markov Model (HMM), 2020 [26]. Finally, a 2021 study [19] compares the effectiveness of handwriting identification techniques in two different scripts, Gurumukhi and Latin.

TABLE I. THE EXISTING SYSTEMS OF GURMUKHI SCRIPT TRANSLATION USING CLASSIFIERS

S.No. | Title / Year | Technique Used | Accuracy
[1] | Examination of Gurumukhi script: a preliminary report, 1996 | Classifier | A very high percentage (92%) of skilled writing was found.
[2] | A shape-based post processor for Gurmukhi OCR, 2001 | OCR | On machine-printed images, the recognition rate increased by 3%, from 94.35% to 97.34%, employing post-processing techniques.
[3] | Indian script character recognition: a survey, 2004 | Handwritten OCR | 98% recognition accuracy was reported.
[4] | A Complete Machine-Printed Gurmukhi OCR System, 2010 | OCR | The paper presents a character-level accuracy of over 97%.
[5] | A Range Free Skew Detection Technique for Digitized Gurmukhi Script Documents, 2011 | OCR | This method is effective for all kinds of scripts.
[6] | Handwritten Optical Character Recognition (OCR): A Comprehensive Systematic Literature Review, 2013 | OCR | Accuracy percentage is 75.78%.
[7] | Punjabi Optical Character Recognition: A Survey, 2017 | OCR | The quantity of the training dataset and the test data determines recognition accuracy.
[8] | Handwritten isolated Bangla compound character recognition: A new benchmark using a novel deep learning approach, 2018 | OCR, BPNN, PNN, SVM | The highest handwritten Devanagari character identification accuracy was 95.19%, and the highest handwritten Devanagari numeral recognition accuracy was 95.56%.

The above papers outline the feature extraction and hybrid classification approach for machine identification of Gurmukhi characters using binary decision trees and nearest neighbours, and offer details on methods for identifying skew introduced during document scanning of Gurmukhi script. They also present text line detection and segmentation in handwritten Gurmukhi scripts, and OCR techniques for both the digitization of typewritten materials and the conversion of handwritten medieval manuscripts into digital format.

TABLE II. THE EXISTING SYSTEMS OF GURMUKHI SCRIPT TRANSLATION USING CLASSIFIERS

S.No. | Title / Year | Technique Used | Accuracy
[1] | Examination of Gurumukhi script: a preliminary report, 1996 | Classifiers | A very high percentage (92%) of skilled writing was found.
[2] | Offline handwritten Gurmukhi character recognition: Study of different feature-classifier combinations, 2012 | Polynomial SVM with iDCT2 features and a linear SVM classifier with DCT2 features | A DCT2 feature set with a linear SVM classifier achieved an accuracy of 95.8% for class 35; using a polynomial SVM with iDCT2 features, an accuracy of 82.5% was achieved.
[3] | Efficient Feature Extraction Techniques for Offline Handwritten Gurmukhi Character Recognition, 2014 | SVM and K-NN | The accuracy obtained with the SVM and K-NN classifiers is 98.10% and 97.14% respectively.
[4] | Offline Handwritten English Script Recognition: A Survey, 2015 | CNN | Experiments are evaluated on UNIPEN lowercase data sets, with a word recognition rate of 92.20%.
[5] | A Novel Hierarchical Technique for Offline Handwritten Gurmukhi Character Recognition, 2015 | SVM classifiers | The achieved recognition accuracy is 93.8%.
[6] | Machine translation, 2015 | Rule-Based Machine Translation (RBMT) | Gives a comparison.
[7] | Machine Translation Approaches and Survey for Indian Languages, 2017 | SMT, NLP, phrase-based translation | Average 10% accurate results for all the language pairs.
[8] | Literature Survey: Machine Translation, 2020 | NMT (neural machine translation) | Due to its attention mechanism, NMT produces translations of high quality.

An effective method for feature extraction from offline handwritten text, and techniques for identifying characters and words in recovered Gurmukhi text with lower error, were proposed; offline English character/word recognition is also discussed. The authors introduced a novel hierarchical technique for identifying isolated offline handwritten Gurmukhi characters. The principles and procedures of rule-based machine translation, statistical machine translation, and EBMT are compared and contrasted, and the creation of SMT and linguistic resources for the language pairings is encouraged. The above papers serve as a concise overview of the key publications in the field of neural machine translation.

III. CONCLUSION

In this paper, the several methods and approaches applied to translating the Gurmukhi script are thoroughly examined. The paper additionally discusses the available tools for translating text from the input languages into the target languages. We have also covered many techniques for estimating the translation accuracy of systems created for Gurmukhi script translation. The basic techniques for various script translation tasks, such as pre-processing, text segmentation, feature extraction, classification and recognition, and post-processing, are examined. We anticipate that research on the Gurmukhi script translation process will be aided by these specific points in future research and study of current methodologies.

REFERENCES
[1] https://www.geeksforgeeks.org/machine-learning/
[2] https://remitrix.com/blog/a-bried-history-of-ml/
[3] https://www.cfilt.iitb.ac.in/resources/surveys/shreya_shubham_nitish_lit_survey_mt.pdf
[4] Lehal, Gurpreet. (2010). A Complete Machine-Printed Gurmukhi OCR System. 10.1007/978-1-84800-330-9_3.
[5] G. S. Lehal and Renu Dhir, "A Range Free Skew Detection Technique for Digitized Gurmukhi Script Documents", Proceedings 5th International Conference on Document Analysis and Recognition, Bangalore, 1999, pp. 147-152.
[6] G. S. Lehal and C. Singh, "A shape-based post processor for Gurmukhi OCR", Proceedings 6th International Conference on Document Analysis and Recognition, Seattle, USA, 2001, pp. 1105-1109.
[7] Jasuja, O., & Singh, S. (1996). Examination of Gurumukhi script: a preliminary report. Science & Justice, 36(1), 9-13.
[8] Pal, U., & Chaudhuri, B. (2004). Indian script character recognition: a survey. Pattern Recognition, 37(9), 1887-1899.
[9] Kumar, Munish, Kumar, Rishav, & Jindal, M. (2012). Offline handwritten Gurmukhi character recognition: Study of different feature-classifier combinations. ACM International Conference Proceeding Series, 94-99. 10.1145/2432553.2432571.
[10] Kumar, Munish, Kumar, Rishav, & Jindal, M. (2014). Efficient Feature Extraction Techniques for Offline Handwritten Gurmukhi Character Recognition. National Academy Science Letters, 37, 381-391. 10.1007/s40009-014-0253-4.
[11] Dhaka, V. S., Kumar, M., & Chaudhary, P. (2015). Offline Handwritten English Script Recognition: A Survey in the Context of Machine Learning Classifiers. Journal of Artificial Intelligence, 11, 65-70. 10.3923/jai.2018.65.70.
[12] Kumar, Munish, et al. "A Novel Hierarchical Technique for Offline Handwritten Gurmukhi Character Recognition." National Academy Science Letters 37 (2015): 567-572.
[13] Pushpak Bhattacharyya. 2015. Machine Translation. CRC Press.
[14] Koundal, Krishma, Kumar, Munish, & Garg, Naresh. (2017). Punjabi Optical Character Recognition: A Survey. Indian Journal of Science and Technology, 10, 1-8.
[15] Roy, P. P., Bhunia, A. K., Das, A., Dey, P., & Pal, U. (2016). J. Clerk Maxwell, A Treatise on Electricity and Magnetism, 3rd ed., vol. 2. Oxford: Clarendon, 1892, pp. 68–73.
[16] Kaur, Amanpreet, Garg, Rakesh, & Mahajan, Meenakshi. (2021). Comparison of Gurumukhi and Latin scripts in handwriting identification: A forensic perspective. Problems of Forensic Sciences, 5-17. 10.4467/12307483PFS.20.001.14781.
[17] Das, Nibaran, Basu, Subhadip, Sarkar, Ram, Kundu, Mahantapas, & Nasipuri, Mita. (2009). Handwritten Bangla Compound character recognition: Potential challenges and probable solution. Proceedings of the 4th Indian International Conference on Artificial Intelligence, IICAI 2009, 1901-1913.
[18] Roy, Saikat, Nibaran Das, Mahantapas Kundu, and Mita Nasipuri. "Handwritten isolated Bangla compound character recognition: A new benchmark using a novel deep learning approach." Pattern Recognition Letters 90 (2017): 15-21.
[19] Lehal, Gurpreet & Singh, Chandan. (1999). Feature extraction and classification for OCR of Gurmukhi script.
[20] Modi, Namisha & Jindal, Khushneet. (2013). Text Line detection and
Segmentation in Handwritten Gurumukhi Scripts. International
Journal of Advanced Research in Computer Science and Software
Engineering. 3. 1075-1080.
[21] Er. Vandana, Er. Nidhi Bhalla, & Ms. Rupinderdeep Kaur. International Journal of Engineering Research and Applications (IJERA), ISSN: 2248-9622, www.ijera.com, Vol. 2, Issue 4, July-August 2012, pp. 1298-1302.
[22] Memon, Jamshed & Sami, Maira & Khan, Rizwan & Uddin, Mueen.
(2020). Handwritten Optical Character Recognition (OCR): A
Comprehensive Systematic Literature Review (SLR). IEEE Access. 1-
1. 10.1109/ACCESS.2020.3012542.
[23] Sharma, Sandhya & Gupta, Sheifali & Kumar, Neeraj. (2020). A
Comprehensive Study on the Recognition of Gurmukhi Script. Journal
of Computational and Theoretical Nanoscience.2674-2677.
10.1166/jctn.2020.8965.
[24] Sanjay Kumar Singh, Shrikant Tiwari, Ali Imam Abidi, and Aruni
Singh. 2017. Prediction of pain intensity using multimedia data.
Multimedia Tools Appl. 76, 18 (Sep 2017), 19317–19342.
https://doi.org/10.1007/s11042-017-4718-6.
[25] Kumar, Santosh & Singh, Sanjay & Abidi, Ali & Datta, Deepanwita &
Kumar, Arun. (2018). Group Sparse Representation Approach for
Recognition of Cattle on Muzzle Point Images. International Journal
of Parallel Programming. 46. 1-26. 10.1007/s10766-017-0550-x.
[26] Chhabra, Megha, Shukla, Manoj Kumar, and Ravulakollu, Kiran
Kumar. ‘Boosting the Classification Performance of Latent
Fingerprint Segmentation Using Cascade of Classifiers’. 1 Jan. 2020 :
359 – 371.
[27] Chhabra, Megha, Manoj K. Shukla, and Kiran Kumar Ravulakollu.
"State-of-the-art: A systematic literature review of image segmentation
in latent fingerprint forensics." Recent Advances in Computer Science
and Communications (Formerly: Recent Patents on Computer
Science) 13.6 (2020): 1115-1125.
[28] M. Khan, S. Chakraborty, R. Astya and S. Khepra, "Face Detection
and Recognition Using OpenCV," 2019 International Conference on
Computing, Communication, and Intelligent Systems (ICCCIS),
Greater Noida, India, 2019, pp. 116-119, doi:
10.1109/ICCCIS48478.2019.8974493.
[29] M. Chhabra and S. Khepra, "Image processing based Latent
Fingerprint Forensics - A Survey," 2020 International Conference on
Smart Electronics and Communication (ICOSEC), Trichy, India,
2020, pp. 78-84, doi: 10.1109/ICOSEC49089.2020.9215318.
[30] Raju, S. Sabari, Peeta Basa Pati, and A. G. Ramakrishnan. "Text
localization and extraction from complex color images." Advances in
Visual Computing: First International Symposium, ISVC 2005, Lake
Tahoe, NV, USA, December 5-7, 2005. Proceedings 1. Springer
Berlin Heidelberg, 2005.
[31] Pati, Peeta Basa, and A. G. Ramakrishnan. "Word level multi-script
identification." Pattern Recognition Letters 29.9 (2008): 1218-1229.
[32] Rababaah, Aaron Rasheed. "A Deep Learning based Process Model
for Crack Detection in Pavement Structures." 2022 9th International
Conference on Computing for Sustainable Global Development
(INDIACom). IEEE, 2022.
[33] Rudrappa, Naveenkumar T., and Mallamma V. Reddy. "Using
Machine Learning for Speech Extraction and Translation: HiTEK
Languages." 2022 9th International Conference on Computing for
Sustainable Global Development (INDIACom). IEEE, 2022.
[34] Bajpai, Saumya Tripathi, and Dilip Kumar Sharma. "A Review on
Identification of Unreliability and Fakeness in Social Media Posts
using Blockchain Technology." 2022 9th International Conference on
Computing for Sustainable Global Development (INDIACom). IEEE,
2022.
[35] Sahu, Biswajeet, Hemanta Kumar Palo, and Sachi Nandan Mohanty.
"A performance evaluation of machine learning algorithms for
emotion recognition through speech." 2021 8th International
Conference on Computing for Sustainable Global Development
(INDIACom). IEEE, 2021.
[36] Tiwari, Dimple, and Bharti Nagpal. "Ensemble sentiment model: bagging with linear discriminant analysis (BLDA)." 2021 8th International Conference on Computing for Sustainable Global Development (INDIACom). IEEE, 2021.

