Abstract: Machine Translation bridges communication barriers and eases interaction among people having
different linguistic backgrounds. Machine Translation mechanisms exploit a range of techniques and linguis-
tic resources for translation prediction. Neural machine translation (NMT), in particular, seeks optimality in
translation through training of a neural network, using a parallel corpus having a considerable number of instances in the form of parallel running source and target sentences. Easy availability of parallel corpora for major Indian languages and the ability of NMT systems to better analyze context and produce fluent
translation make NMT a prominent choice for the translation of Indian languages. We have trained, tested,
and analyzed NMT systems for English to Tamil, English to Hindi, and English to Punjabi translations. Pre-
dicted translations have been evaluated using Bilingual Evaluation Understudy and by human evaluators to
assess the quality of translation in terms of its adequacy, fluency, and correspondence with human-predicted
translation.
Keywords: Machine Translation, Neural Machine Translation, OpenNMT, BLEU, Indian languages.
1 Introduction
Machine Translation (MT), an application area of Natural Language Processing (NLP) and a subfield of com-
putational linguistics, facilitates automated translation of text or speech in a source natural language to
corresponding text or speech in a different target natural language. Language incomprehensibility has wide-
ranging adverse impacts on several aspects of human living, and the same can be reasonably alleviated with
effective use of MT. Besides, the crucial idea of MT is to bridge communication barriers among people from
different linguistic backgrounds.
Although MT-predicted translations differ from human-like translation, they are comprehensible and
the translation process is free from human intervention. The effectiveness of the translation approach is
manifested in its potential to ensure generation of semantically equivalent and grammatically sound tar-
get construct. An intelligent translation approach refrains from word-for-word translation behavior and instead delves into the concepts and essence of the languages prior to translation.
Classical approaches of MT are broadly categorized into rule-based, corpus-based, and hybrid
approaches. Rule-based approaches, namely transfer-based and interlingua-based approaches, rely on a set
of predefined translation rules, and investigate the syntax, semantics, and morphology of the two languages
to furnish target representation. Rule-based approaches resort to linguistic models to ensure comprehen-
sibility of translation rules and to produce a syntactically and semantically sound translation that is well
formed and less prone to grammatical errors [21]. However, interlingua-based approaches are inefficient, primarily because their central idea of using a language-independent representation for translation is impractical and infeasible.
*Corresponding author: Partha Pakray, Department of Computer Science and Engineering, National Institute of Technology
Mizoram, Mizoram, India, e-mail: [email protected]. https://orcid.org/0000-0003-3834-5154
Amarnath Pathak: Department of Computer Science and Engineering, National Institute of Technology Mizoram, Mizoram,
India. https://orcid.org/0000-0002-1666-4464
Corpus-based or data-driven approaches, namely Example-Based Machine Translation (EBMT) and Sta-
tistical Machine Translation (SMT), dynamically build a translation model that is characterized by a set of
translation rules learned from a parallel corpus. A considerable number of instances in the corpus and the ability of the translation approach to dynamically extract high-quality rules largely determine the quality of translation. Given the test data (text to be translated), EBMT investigates the translation model to seek the
optimal match using techniques of clustering or generalization. Parallel corpora, thesaurus for computing
semantic similarity, bilingual dictionary, and syntactic parser are the key resources that are crucial to the
functioning of EBMT. On the other hand, SMT employs Bayes’ theorem to reformulate the translation prob-
lem as a probability maximization problem. Considering S to be the source language sentence and T to be the
target language sentence, SMT attempts to find translation T that maximizes P(T|S). Using Bayes’ theorem,
P(T|S) can be rewritten as the right-hand side of Eq. (1):
P(T|S) = P(S|T) · P(T) / P(S).    (1)
With P(S) being fixed, the problem of translation boils down to finding a translation T, which maximizes
P(S|T)*P(T). Language model, P(T), and translation model, P(S|T), are learnt from the parallel corpora and
constitute indispensable components of SMT. The two components are exploited by the decoding algorithm
to predict target sentences. The inability of the SMT system to exploit context information of the source sen-
tence, system complexity, and use of many different independently trained components are the key factors
that add to the inefficiency of SMT systems.
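As an illustration of this decision rule (not part of any system described here), the following Python sketch selects, from a hypothetical set of candidate translations with hand-picked toy probabilities, the target sentence T that maximizes P(S|T) · P(T); P(S) is constant across candidates and is therefore dropped.

import math

def best_translation(source, candidates, translation_logprob, language_logprob):
    """Noisy-channel decision rule of Eq. (1): argmax_T log P(S|T) + log P(T).
    P(S) is identical for every candidate T, so it is dropped."""
    return max(candidates,
               key=lambda t: translation_logprob(source, t) + language_logprob(t))

# Toy illustration with hand-picked probabilities (not real model scores).
source = "the house is small"
candidates = ["das haus ist klein", "das haus ist gross"]
tm = {("the house is small", "das haus ist klein"): math.log(0.6),
      ("the house is small", "das haus ist gross"): math.log(0.1)}
lm = {"das haus ist klein": math.log(0.05),
      "das haus ist gross": math.log(0.05)}

print(best_translation(source, candidates,
                       lambda s, t: tm[(s, t)],
                       lambda t: lm[t]))    # -> "das haus ist klein"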
Although conventional approaches of translation have served the purpose for years, their underly-
ing demerits and the need for better-quality translation enforce exploration of new, better-performing
techniques. Neural Machine Translation (NMT), a fairly new and proficient approach to translation, incor-
porates the use of adequately trained large neural networks in the translation process. Encoder and decoder,
which are the networks of Long Short-Term Memory (LSTM) units, constitute key components of the NMT
system architecture. In a baseline system, encoder encodes the source sentence, one symbol at a time, and
stores the entire encoding in its last hidden state. Encoded representation is fed to decoder for translation
prediction. The comprehensibility, adequacy, and fluency of predicted translation are largely determined by
the implementation approach used for decoder. Recent past has witnessed the use of Convolutional Neural
Networks (CNNs) as well as Recurrent Neural Networks (RNNs) for implementing the decoder network [4, 23].
However, carrying the encoded representation along the decoder network is a primary requirement of the decoding process, and simple RNNs struggle with it. An improved implementation strategy for the decoder replaces simple RNNs with LSTM units, which memorize the encoded information and carry enough of it along the network. Use of
attention mechanism [14] furthers the effectiveness of translation by allowing decoder to access the entire
pool of encoder states for translation prediction. The NMT system offers scalability and marks a radical shift from phrase-based translation, as in SMT, to sentence-based translation. Unlike classical MT systems, NMT systems are end-to-end systems that discard the use of additional components for translation. The decoder in NMT exploits a comparatively larger context, comprising the source as well as partial target text, for accurate translation prediction. NMT is better at generating more fluent translations with fewer syntactic and semantic errors.
Motivated by the advantages of NMT over classical MT systems and the promising results produced by
NMT in recent years, we have investigated its effectiveness in the context of Indian languages. In particular,
we have trained and tested NMT systems for English to Tamil, English to Hindi, and English to Punjabi transla-
tions. Predicted translations have been evaluated by employing human evaluators and Bilingual Evaluation
Understudy (BLEU) evaluation [18]. Besides, we have comprehensively analyzed how the performance of the English-Hindi NMT system changes with the size of the training data, the number of epochs, and the length of sentences in the test set.
The rest of the paper is organized as follows: Section 2 reviews relevant literature on translation of Indian
languages. Section 3 details the system architecture for NMT. Section 4 describes details of different experi-
mental setups. Section 5 describes system results and their comprehensive analysis. Section 6 concludes the
paper and points the direction for future research.
2 Related Works
The era of NMT emerged in 1987 when English to Spanish translation was attempted using backpropagation
neural network and a highly limited vocabulary [1]. The translation process used backpropagation mecha-
nism for mapping from one language to another. The NMT system architecture has been subjected to a number
of modifications after its resurgence in 2013. An RNN encoder-decoder NMT system facilitates encoding of
a variable-length source sentence into a fixed-length vector and decoding of fixed-length vectors to obtain
the target sequence [3]. Gated Recurrent Units replaced CNNs [8] for implementing hidden units of decoder.
LSTM [22] offers improved propagation of the encoded source sentence along the decoder network, resulting
in improved translation quality of longer sentences.
Use of neural networks on top of phrase-based SMT has been explored only to a limited extent in the context of Indian languages such as Tamil and Punjabi. Neural networks have been used to learn sets of ordered rules for Hindi
to English translation, and the idea can be extended to other Indo-European languages by changing the
dictionary used for literal translation [2]. A feed-forward backpropagation Artificial Neural Network (ANN)
architecture uses nine separate modules for translating simple sentences of the English language into Hindi
[9]. A bilingual dictionary, along with the storage of word meanings and word features of the language pair, has been implemented using the ANN. A quantum neural network-based approach for English to Hindi translation learns the
pattern of the English-Hindi parallel corpus using part-of-speech information of each word in the corpus,
and later uses the gained knowledge to perform translation [16]. Use of a quantum neural network for reordering of words, part-of-speech tagging, and their alignment during MT has been found to increase translation accuracy to a significant extent. Translation accuracy can be further enhanced by increasing the
size of the training corpus through automatic extraction of parallel texts from comparable corpora and adding
them to the training data [17].
A direct translation methodology for Punjabi to Hindi translation exploits the syntactic and semantic
similarity between the two languages [7]. Using a number of lexicons and the word-for-word translation
approach, words in the source language are replaced by their target language equivalents. An extended
web-based Hindi to Punjabi MT system facilitates website and email translation [5].
Ambiguities of content words as well as some function words pose challenges to MT systems. A supervised
method for resolving prepositional ambiguity in English to Tamil translation performs disambiguation by
exploiting collocation occurrences and linguistic information of words [12]. English to Tamil translation con-
fronts issues of non-availability of parallel corpora and morphological difference between the two languages
[11]. Such issues have been tackled by incorporating linguistic knowledge in SMT. A formalism-based MT approach for English to Tamil translation uses synchronous TAG pairs derived from the XTAG English grammar [15]. The proposed TAG systems can be extended to other Indian languages as well. Furthermore, integrating
rule-based MT systems with functionality for complex sentence simplification has been found to improve the
quality of English to Tamil translation [20]. Complex sentences, connected using connectives, are split into
simpler sentences before feeding them to the translation system.
Owing to the diversities of Indian languages and the considerably better performance of NMT, Google has
recently employed NMT in multilingual translation of nine Indian languages (http://indianexpress.com/
article/technology/tech-news-technology/googles-neural-machine-translation-for-indian-languages-heres-
what-it-means/), namely Hindi, Bengali, Marathi, Tamil, Telugu, Gujarati, Punjabi, Malayalam, and Kan-
nada. Google’s multilingual NMT system is characterized by simplicity, low-resource language improvement,
and zero-shot translation features [6]. The zero-shot translation feature enables the system to do translation
prediction for a previously unseen language pair. Google's multilingual NMT system does not alter the basic encoder-decoder architecture but instead uses a special token at the beginning of the source sentence to specify the target language.
The Shared Task Cum Workshop on Machine Translation in Indian Languages (MTIL; http://
nlp.amrita.edu/mtil_cen/) offers research infrastructure for the development, evaluation, and comparison
of the MT systems [13]. The specific objectives of the workshop were to design high-quality parallel corpora
of Indian languages, to explore the role of state-of-the-art MT techniques in the context of Indian languages,
and to cope with the issues of language divergence. A total of five teams participated in the workshop. CDAC-
M was the top-ranked team in English-Malayalam, English-Tamil, and English-Hindi translation categories,
whereas our NIT-M team was top ranked in the English-Punjabi translation category. Human evaluation score
was preferred over BLEU score for ranking of the teams.
The factored SMT model of the CDAC-M team employs suffix separation (SS), source side reordering, and
transliteration for all the four translation categories [19]. Reordering rearranges the source-side sentences as per the word order of the target language, which leads to better alignments and parallel phrase extraction and eventually improves the translation quality. During SS, words are split into stem and suffixes, and a continuation symbol (@@) is added after the stem word. The continuation symbol helps combine suffixes after
having performed the translation. Transliteration, a post-processing step, translates an out-of-vocabulary
(OOV) word to the target language word. An unsupervised model based on expectation maximization has
been used to train a transliteration model. Eventually, language modeling selects the best translation from
the n-best transliterated output. Augmenting Moses (https://github.com/moses-smt)-based baseline system
with pre-processing and post-processing steps has led to improvement in BLEU score for the English-Hindi
and English-Tamil translation categories.
3 System Description
Data pre-processing, system training, and system testing/translation constitute key steps of system func-
tioning, and the same have been elaborated in the following subsections. We have exploited the OpenNMT
(http://opennmt.net/) system architecture, tuned its parameters, and trained and tested the system using
corpora provided by MTIL organizers [13].
3.1 Data Pre-processing
MTIL corpora have been used to train and test the OpenNMT system (refer to Section 4.1 for the corpora description). Raw data consist of parallel running source and target sentences, which are tokenized during the pre-processing step. Validation data, derived from the training data provided by the organizers, have been used to evaluate convergence of training and to check the validity of system parameters; the remaining training data have been employed in system training, so the training and validation data have no instances in common. Using the entire training data for validation could result in an undesirable overfitted model. A usual practice limits validation files to a maximum of 5000 sentences.
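As a rough illustration of such a split (the shuffling seed and the held-out fraction below are assumptions, not the exact procedure used here), a disjoint training/validation split with a 5000-sentence cap can be sketched as follows:

import random

def split_corpus(pairs, max_valid=5000, seed=13):
    """Hold out at most max_valid sentence pairs for validation and use the
    rest for training; the two sets are disjoint by construction."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    n_valid = min(max_valid, len(pairs) // 10)    # small held-out fraction, capped at 5000
    return pairs[n_valid:], pairs[:n_valid]       # (training data, validation data)

# Example with 100,000 synthetic pairs -> 95,000 training / 5,000 validation
corpus = [("src %d" % i, "tgt %d" % i) for i in range(100000)]
train, valid = split_corpus(corpus)
assert not set(train) & set(valid)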
The pre-processor primarily aims at building dictionaries that index the words present in the training and
validation datasets. Training and validation files are fed to the pre-processor module of OpenNMT to generate
two human-readable dictionary files and a serialized torch file. Dictionary files list out all unique words of
training data along with four extra words, namely <blank>, <unk>, <s>, and </s>. Each word is mapped to a
unique index that serves as the system’s internal representation for the word. The serialized torch file, which
embeds dictionaries, training data, and validation data, is used to train the system.
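A minimal sketch of the dictionary construction step is given below; the four special tokens match those listed above, whereas the frequency threshold and the sorting of the vocabulary are illustrative assumptions rather than the exact OpenNMT implementation.

from collections import Counter

SPECIALS = ["<blank>", "<unk>", "<s>", "</s>"]

def build_dictionary(tokenized_sentences, min_freq=1):
    """Index every word seen in the corpus, after the four special tokens,
    so that each word gets a unique integer index."""
    counts = Counter(w for sent in tokenized_sentences for w in sent)
    vocab = SPECIALS + sorted(w for w, c in counts.items() if c >= min_freq)
    return {word: idx for idx, word in enumerate(vocab)}

word2idx = build_dictionary([["the", "house", "is", "small"],
                             ["the", "car", "is", "red"]])
print(word2idx["<unk>"], word2idx["house"])   # index of <unk> and of "house"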
3.2 System Training
We have trained a sequence-to-sequence recurrent neural network model, using the attention mechanism, for translation prediction. Training data are shuffled and sorted prior to training: shuffling ensures that instances in a training batch come uniformly from different parts of the corpus, and sorting ensures uniform length of instances within a batch. Training can be accelerated by using multiple Graphics Processing Units (GPUs) to train different batches of training data in synchronous or asynchronous fashion.
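The shuffle-then-sort batching idea can be sketched as follows; the batch size and the use of source-sentence length as the sorting key are illustrative assumptions.

import random

def make_batches(pairs, batch_size=64, seed=7):
    """Shuffle the corpus, then sort by source-sentence length so that each
    batch contains instances of (nearly) uniform length."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)             # instances come from all parts of the corpus
    pairs.sort(key=lambda p: len(p[0].split()))    # group sentences of similar length
    return [pairs[i:i + batch_size] for i in range(0, len(pairs), batch_size)]

batches = make_batches([("a b c", "x y"), ("a b", "x"), ("a b c d e", "x y z")], batch_size=2)
print([len(b) for b in batches])    # [2, 1]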
An epoch in training refers to one forward and one backward pass over all the training instances. The
system is trained for some fixed number of epochs. An epoch comprises iterations, and in each iteration
one forward pass and one backward pass are performed over a set of training instances. The system has
been trained for 15 epochs in the first experimental setup and 19 epochs in the other three setups (refer to
Section 4.2 for details of the experimental setups). A validation score, dynamically computed using valida-
tion data, helps in checking the convergence of training. The learning rate of the network decays by a factor
of 0.7 if validation score improvement falls below 0.
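The epoch loop and the learning-rate schedule just described can be sketched as follows; the training and validation routines are abstracted into placeholder callables, and the toy validation scores are purely illustrative.

def train_with_decay(step_fn, valid_fn, n_epochs=19, lr=1.0, decay=0.7):
    """Run n_epochs epochs; decay the learning rate by `decay` whenever the
    validation score improvement falls below zero.

    step_fn(lr) performs one epoch (a forward and a backward pass over all
    training instances) at learning rate lr; valid_fn() returns the
    validation score computed on the held-out validation data."""
    best = float("-inf")
    for _ in range(n_epochs):
        step_fn(lr)
        score = valid_fn()
        if score - best <= 0:      # improvement fell below zero
            lr *= decay            # decay learning rate by a factor of 0.7
        best = max(best, score)
    return lr

# Toy run: the validation score stalls after epoch 5, so the learning
# rate decays from epoch 6 onwards.
scores = iter([0.1 * i for i in range(1, 6)] + [0.5] * 14)
final_lr = train_with_decay(step_fn=lambda lr: None, valid_fn=lambda: next(scores))
print(round(final_lr, 4))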
The primary components of the system architecture are discussed next.
3.2.1 Encoder
A unidirectional sequencer has been used as encoder for encoding variable-length input sequence of the
source language into fixed-size vectors. The architecture of encoder is characterized by a two-layer LSTM
recurrent neural network having 500 hidden units in each layer. LSTM, unlike conventional backpropaga-
tion neural networks, remembers encoded source representation and carries enough of it along the decoder
network.
To convert the input sequence into word embeddings, the encoder splits the input sequence into an array of
words. Each word in the array is mapped to its index in the vocabulary, and the index of the <unk> word is
used for OOV words. Sorting of the input sequences in a batch, prior to training, eliminates the need for zero
padding. Thereafter, each index value is transformed into a vector of fixed length and different word vectors
are combined to give a single fixed-length vector that is representative of the complete input sequence.
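A small sketch of this input preparation is given below; the toy dictionary and the 500-dimensional random embeddings are illustrative assumptions (in practice, the embeddings are learned during training).

import numpy as np

def encode_tokens(tokens, word2idx, embeddings):
    """Map each word to its dictionary index (<unk> for OOV words) and
    look up a fixed-length embedding vector for every index."""
    unk = word2idx["<unk>"]
    indices = [word2idx.get(w, unk) for w in tokens]
    return np.stack([embeddings[i] for i in indices])   # shape: (len(tokens), emb_dim)

# Toy dictionary in the spirit of the pre-processing step above.
word2idx = {"<blank>": 0, "<unk>": 1, "<s>": 2, "</s>": 3,
            "the": 4, "house": 5, "is": 6, "small": 7}
embeddings = np.random.rand(len(word2idx), 500)          # 500-dimensional embeddings (illustrative)
vectors = encode_tokens(["the", "lorry", "is", "small"], word2idx, embeddings)
print(vectors.shape)                                      # (4, 500); "lorry" was mapped to <unk>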
3.2.2 Decoder
Similar to encoder, a two-layer LSTM decoder, having 500 hidden units in each layer, has been used to decode
a fixed-size source vector using input feeding and global attention mechanism. A decoder using global atten-
tion mechanism consults the entire pool of source states at each step of decoding. Not only the last hidden
state but also the entire hidden states of the source are considered as representatives of sentence meaning. The score function takes the current hidden state of the decoder h_t and a source hidden state h_s as arguments to determine the attention score of each source state. The score function used by the system is given by Eq. (2):

score(h_t, h_s) = h_t^T · W_a · h_s.    (2)
The probabilistic attention score of a state is its score divided by the sum of all the attention scores. The variable-length weight alignment vector a_t uses these probabilistic attention scores and gives an estimate of the amount of attention to pay at different places in the source. A context vector c_t is then computed using the source hidden states h_s and the weight alignment vector a_t. The decoder uses its current hidden state h_t and the context vector c_t to compute the attentional vector h̃_t, which eventually predicts the current word y_t. Figure 1 describes the working of the decoder in a nutshell [14].
Moreover, the system uses the input feeding approach to feed the attentional vector h̃_t to the current hidden state, a mechanism that keeps the system informed about past alignment decisions.
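The decoding step just described can be sketched in NumPy as follows, following the general form of the global attention model of Luong et al. [14]; the tanh combination used to form the attentional vector and the random toy inputs are shown for illustration only and are not claimed to match the OpenNMT implementation in every detail.

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def global_attention_step(h_t, H_s, W_a, W_c):
    """One decoding step with global attention.

    h_t : current decoder hidden state, shape (d,)
    H_s : all encoder (source) hidden states, shape (n, d)
    W_a : attention weight matrix of Eq. (2), shape (d, d)
    W_c : output weight matrix combining c_t and h_t, shape (d, 2d)
    """
    scores = H_s @ (W_a.T @ h_t)        # score(h_t, h_s) = h_t^T W_a h_s for every source state
    a_t = softmax(scores)               # probabilistic attention scores (alignment vector)
    c_t = a_t @ H_s                     # context vector: attention-weighted sum of source states
    h_tilde = np.tanh(W_c @ np.concatenate([c_t, h_t]))   # attentional vector used to predict y_t
    return h_tilde, a_t

d, n = 500, 7                           # hidden size and source length (illustrative values)
rng = np.random.default_rng(0)
h_tilde, a_t = global_attention_step(rng.normal(size=d),
                                     rng.normal(size=(n, d)),
                                     rng.normal(size=(d, d)) / np.sqrt(d),
                                     rng.normal(size=(d, 2 * d)) / np.sqrt(2 * d))
print(a_t.shape, h_tilde.shape)         # (7,) (500,)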
Figure 2 illustrates the system architecture. Attention mechanism and input feeding are used to transform
input sequence “A B C D” into target sequence “X Y Z” [14].
Figure 1: LSTM Decoder Using Context Vector and Current Hidden State for Translation Prediction.
4 Experimental Design
This section contains detailed description about the experimental setup and corpora used for training and
testing the translation effectiveness of the NMT system. OpenNMT, an open source toolkit, facilitates the
required experimental framework, and provides a platform for training and deploying NMT models [10].
4.1 Corpora
The NMT system has been trained using Hindi, Punjabi, and Tamil MTIL training corpora, which comprise
parallel running source-target sentence pairs, with English being the source language in all the three corpora.
System training imposes certain constraints on the corpora format, which necessitate pre-processing of data
prior to training (refer to Section 3.1). Validation data, a subset of training corpus containing 4000 instances,
are used for checking the convergence of training. The MTIL test corpus containing 562 English sentences
has been used for testing the translation effectiveness of trained and validated models. Table 1 summarizes
the nature of corpus, the name of the corresponding corpus, and the number of instances present therein
[11, 12, 20].
4.2 Experimental Setups
We have used the following different experimental setups to train, test, and analyze the system’s performance
from different perspectives.
(i) Initially, we trained the NMT system using English-Hindi, English-Punjabi, and English-Tamil parallel
training corpora. The three different trained models were tested using MTIL-Test corpus. Result sets
containing predicted translations were provided to MTIL organizers for human and BLEU evaluation.
(ii) We have re-trained the NMT system using Hindi_MTIL2017-Training corpus and saved the trained model
obtained at 19 different epochs. Each of the 19 models has been tested using MTIL-Test corpus, and predic-
tion results have been subjected to BLEU evaluation using Gold Data provided by the organizers. Such
a setup helps in analyzing the change in translation behavior of the NMT system with increase in the
number of epochs.
(iii) Besides, we have re-trained the NMT system for 19 epochs using 80 k instances of Hindi_MTIL2017-
Training corpus, which is approximately half of the original corpus size. The prediction results from each
of the 19 models have been subjected to BLEU evaluation. Such a setup helps in analyzing change in the
translation behavior of the NMT system with change in the number of instances in the training data.
(iv) Furthermore, we have created four different test sets from the original test data, each test set containing
100 sentences. The average length of sentences in the four test datasets is 10, 15, 20, and 25, respec-
tively. The best epoch model, one having the highest BLEU score, is tested using the four test datasets,
and prediction results have been evaluated using BLEU evaluation. Such a setup helps in assessing the
relationship between translation performance and the average length of sentences in the test dataset.
The results of all these experimental setups have been detailed and analyzed in Section 5.
5 Results and Analysis
The prediction results of our first experimental setup were provided to MTIL organizers for human and BLEU
evaluation [13]. Human evaluators assessed the quality of translation with respect to adequacy, fluency, and
overall rating. They compared the prediction results and Gold Data to rate the three parameters on a scale of
1 to 5, with 1 being the least and 5 being the maximum parameter value.
The adequacy of translation is a measure of the amount of meaning expressed in a reference translation
that is also expressed in a translation sentence. Table 2 summarizes the adequacy ratings provided by three
independent human evaluators for all target languages and for all participating teams. Among all the partic-
ipating teams, our NIT-M team attained the highest adequacy rating of 3.38 and the least adequacy rating of
1.59 for the English-Punjabi and English-Tamil language pairs, respectively. In Tables 2–5, different scores of
our NIT-M team are highlighted in bold.
The fluency of translation concerns the well formedness of a translation sentence in the target language,
irrespective of sentence meaning. A fluent translation will be flawless, syntactically correct, and comprehen-
sible, but need not necessarily be a semantically correct translation. Table 3 summarizes the fluency ratings
provided by three independent human evaluators for the three language pairs. The highest fluency rating of
3.74 and lowest fluency rating of 1.65 were scored by our NIT-M team for the English-Punjabi and English-Tamil
language pairs.
For the overall rating, annotators rate the predicted translations on a scale of 1–5, where the least value of 1 refers to a completely wrong translation and the maximum value of 5 refers to an excellent translation. The overall ratings provided by the evaluators for the three language pairs are summarized in Table 4.
The lowest overall rating of 1.58 and highest overall rating of 3.26 were attained by our NIT-M team for the
English-Tamil and English-Punjabi translations, respectively.
Further, Table 5 summarizes the percentage measures and BLEU score for the three language pairs. Our
NIT-M team attained the highest BLEU score of 23.25 and lowest BLEU score of 1.31 for the English-Hindi and
English-Tamil language pairs, respectively.
As can be seen from Tables 2–5, among all the teams, our team ranked first in English to Punjabi and
English to Hindi translations from the human evaluation and BLEU evaluation perspectives, respectively.
Moreover, for all the language pairs and among all the three human evaluation parameters, the highest mea-
sures have been recorded for fluency. This is attributed to the fact that NMT systems are well known for
producing fluent translations.
Figures 3–5 show the comparison of adequacy, fluency, and BLEU scores of NIT-M and CDAC-M systems for
the three translation categories.
The three scores of the NIT-M team are lower than those of CDAC-M for the English-Tamil transla-
tion. This owes to the agglutinative nature of Tamil and the morphological divergence between the English
and Tamil languages. The majority of Tamil words are formed by a combination of multiple words, a phe-
nomenon referred to as agglutination. Agglutination and morphological divergence lead to generation of
more unknown (<unk>) words, which lowers the quality of translation. As discussed in Section 2, the
CDAC-M system employs SS, source side reordering, and transliteration to cope with the challenges of agglu-
tination and morphological divergence in the English-Tamil translation. However, our NIT-M system makes
no attempt to handle such linguistic issues, hence the poor performance.
The remarkable performance of the NIT-M system for the English-Hindi and English-Punjabi language pairs owes to the large-sized corpora. Training using a large corpus ensures effective tuning of system parame-
ters and generation of a sound translation model, which eventually leads to better predictions. Moreover, the
performances of two teams are comparable in these two translation categories. Although the Tamil training
corpus is also fairly large sized, agglutination and morphological divergence cause the NIT-M system to lag
behind in this translation category.
Furthermore, for English-Punjabi translation, the human and BLEU evaluation scores diverge, with a low BLEU score but high manual evaluation scores. This is attributed to the underlying working
principle of BLEU, which relies on the precision of n-grams in the reference and candidate translation. Even
minor lexical differences can cause a huge difference in n-gram precision, which eventually affects the BLEU
score. However, such minor lexical differences are often considered insignificant from the perspective of
human evaluation.
Moreover, the comparatively lower evaluation scores of the CDAC-M system for English-Punjabi trans-
lation can be attributed to the inability of preprocessing and post-processing steps to handle the language-
specific constructs of Punjabi language.
Furthermore, we have examined different experimental setups, as mentioned in Section 4.2, to analyze
a system’s translation performance with respect to number of epochs, size of training data, and aver-
age length of sentences in the test dataset. We have used a multi-BLEU (http://www.statmt.org/moses/
?n=Moses.SupportTools) evaluator, using 1-gram precision, to compare the Gold Data and predicted translations.
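For reference, the clipped (modified) unigram precision that underlies such a 1-gram BLEU evaluation can be sketched as follows; the brevity penalty and corpus-level aggregation of the actual multi-bleu script are omitted, so this is only an illustration of why minor lexical differences lower the score.

from collections import Counter

def unigram_precision(candidate, reference):
    """Clipped 1-gram precision: each candidate word is counted only as many
    times as it occurs in the reference translation."""
    cand, ref = Counter(candidate.split()), Counter(reference.split())
    clipped = sum(min(count, ref[word]) for word, count in cand.items())
    return clipped / max(1, sum(cand.values()))

# A single substituted word already lowers the score, although a human
# evaluator may judge the two sentences equally adequate.
print(unigram_precision("the house is very small", "the house is quite small"))   # 0.8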
Figure 6 shows the BLEU score versus epoch plot for English-Hindi MT. The highest BLEU score of 52.54 is
attained at epoch 18 and the BLEU score curve converges after epoch 16 (say, epoch_converge). The decreasing
curve between epoch 6 and epoch 7 is presumably because of decay in the learning rate of the network, which
leads to poor translation prediction. The learning rate of network is decayed if the validation score improve-
ment falls below zero. As the BLEU score versus epoch plot converges after a specific epoch (epoch_converge),
the plot can be helpful in selecting the maximum number of training epochs. The number of training epochs
should preferably not exceed the value epoch_converge.
Furthermore, Figure 7 shows a performance comparison between the NMT system trained using com-
plete training data (NMT-I) and the one trained using half the training data (NMT-II). The high BLEU scores
for the former indicate that the system’s performance considerably improves with increase in the number of
instances in the training data. As NMT-I has seen a larger number of sentence pairs during training, it predicts correct Hindi translations of many English words and generates fewer <unk> words in comparison to NMT-II.
To further analyze the system, we have selected the best-trained model, the one obtained at epoch 18 of
the second experimental setup, and tested it using four distinct test sets. The four test sets (each of size 100)
have been derived from the original test set provided by organizers, in such a way that the average lengths of
sentences in these sets are 10, 15, 20, and 25. Sentences longer than 25 words have been accommodated in the
fourth test set. Also, sentences of length less than 10 words have been dropped. Figure 8
shows the BLEU scores achieved by the NMT system for the four test sets.
It has been interesting and uncommon to observe that the performance of the NMT system (in terms of
BLEU score) improves with increase in the length of test sentences. This probably owes to the context-analyzing ability of the NMT system. Use of the attention mechanism and context vectors facilitates improvement in trans-
lation performance with increase in length of test sentences. At each translation step, decoder uses global
attention to investigate the entire pool of source states. Context vector generated from source states, current
hidden state of decoder, and partial target text are then used by decoder for translation prediction of the cur-
rent word. Thus, the entire decoding process in NMT relies on the context of the source sentence. As longer source (test) sentences are context rich, their predicted translations are better in quality.
Figure 8: BLEU Score Achieved by NMT System for Different Sentence Lengths.
Acknowledgment: The work presented here falls under Research Project Grant No. YSS/2015/000988,
Funder Id: 10.13039/501100001843, and is partially supported by the Department of Science & Technology (DST) and the Science and Engineering Research Board (SERB), Government of India. The authors would like to
acknowledge the Department of Computer Science & Engineering, National Institute of Technology Mizoram,
India, for providing infrastructural facilities and support.
Bibliography
[1] R. B. Allen, Several studies on natural language and back-propagation, in: Proceedings of the IEEE First International
Conference on Neural Networks, 2, IEEE Piscataway, NJ, pp. 335–341, San Diego, California, 1987.
[2] A. Chandola and A. Mahalanobis, Ordered rules for full sentence translation: a neural network realization and a case
study for Hindi and English, Pattern Recogn. 27 (1994), 515–521.
[3] K. Cho, B. V. Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk and Y. Bengio, Learning phrase representa-
tions using RNN encoder-decoder for statistical machine translation, in: Proceedings of the Empirical Methods in Natural
Language Processing (EMNLP 2014), pp. 1724–1734, Doha, Qatar, 2014.
[4] J. Gehring, M. Auli, D. Grangier, D. Yarats and Y. N. Dauphin, Convolutional sequence to sequence learning, in:
Proceedings of the 34th International Conference on Machine Learning, pp. 1243–1252, Sydney, Australia, 2017.
[5] V. Goyal and G. S. Lehal, Web based Hindi to Punjabi machine translation system, J. Emerg. Technol. Web Intell. 2 (2010),
148–151.
[6] M. Johnson, M. Schuster, Q. V. Le, M. Krikun, Y. Wu, Z. Chen, N. Thorat, F. Viégas, M. Wattenberg, G. Corrado, M. Hughes
and J. Dean, Google’s multilingual neural machine translation system: enabling zero-shot translation, Trans. Assoc.
Comput. Linguist. 5 (2017), 339–351.
[7] G. S. Josan and G. S. Lehal, A Punjabi to Hindi machine translation system, in: 22nd International Conference on
Computational Linguistics: Demonstration Papers, COLING ’08, pp. 157–160, Association for Computational Linguistics,
Stroudsburg, PA, USA, 2008.
[8] N. Kalchbrenner and P. Blunsom, Recurrent convolutional neural networks for discourse compositionality, in: Proceedings
of the 2013 Workshop on Continuous Vector Space Models and their Compositionality, pp. 119–126, Sofia, Bulgaria, 2013.
[9] S. Khan and R. B. Mishra, A neural network based approach for English to Hindi machine translation, Int. J. Comput. Appl.
53 (2012), 50–56.
[10] G. Klein, Y. Kim, Y. Deng, J. Senellart and A. Rush, OpenNMT: open-source toolkit for neural machine translation, in:
Proceedings of ACL 2017, System Demonstrations, pp. 67–72, Association for Computational Linguistics, Vancouver,
Canada, 2017.
[11] M. A. Kumar, V. Dhanalakshmi, K. P. Soman and S. Rajendran, Factored statistical machine translation system for English
to Tamil language, Pertan. J. Soc. Sci. Hum. 22 (2014), 1045–1061.
[12] M. A. Kumar, S. Rajendran and K. P. Soman, Cross-lingual preposition disambiguation for machine translation, Proc.
Comput. Sci. 54 (2015), 291–300.
[13] M. A. Kumar, B. Premjith, S. Singh, S. Rajendran and K. P. Soman, An overview of the shared task on machine translation
in Indian languages (MTIL) – 2017, J. Intell. Syst. 28 (2019), 455–464.
[14] M.-T. Luong, H. Pham and C. D. Manning, Effective approaches to attention-based neural machine translation, in:
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 1412–1421, Lisbon,
Portugal, 2015.
[15] V. K. Menon, S. Rajendran and K. P. Soman, A synchronised tree adjoining grammar for English to Tamil machine
translation, in: International Conference on Advances in Computing, Communications and Informatics (ICACCI), 2015,
pp. 1497–1501, Kerala, India, 2015.
[16] R. Narayan, S. Chakraverty and V. P. Singh, Quantum neural network based machine translator for Hindi to English, Appl.
Soft Comput. 38 (2016), 1060–1075.
[17] S. Pal, P. Pakray and S. K. Naskar, Automatic building and using parallel resources for SMT from Comparable Corpora, in:
Proceedings of the 3rd Workshop on Hybrid Approaches to Machine Translation (HyTra), pp. 48–57, Gothenburg, Sweden,
2014.
[18] K. Papineni, S. Roukos, T. Ward and W.-J. Zhu, BLEU: a method for automatic evaluation of machine translation, in:
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Association for Computational
Linguistics, pp. 311–318, Philadelphia, PA, 2002.
[19] R. N. Patel, P. B. Pimpale and M. Sasikumar, Machine translation in Indian languages: challenges and
resolution, J. Intell. Syst. 28 (2019), 437–445.
[20] C. Poornima, V. Dhanalakshmi, K. M. Anand and K. P. Soman, Rule based sentence simplification for English to Tamil
machine translation system, Int. J. Comput. Appl. 25 (2011), 38–42.
[21] K.-Y. Su and J.-S. Chang, Why corpus-based statistics-oriented machine translation, in: Proceedings of the Fourth
International Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages,
pp. 249–262, Montreal, Canada, 1992.
[22] I. Sutskever, O. Vinyals and Q. V. Le, Sequence to sequence learning with neural networks, in: Proceedings of Advances in
Neural Information Processing Systems, pp. 3104–3112, Montreal, Canada, 2014.
[23] H. Xiong, Z. He, X. Hu and H. Wu, Multi-channel encoder for neural machine translation, arXiv preprint arXiv:1712.02109
(2017).