
Machine Translation

Dr. Partha Pakray, Associate Professor
Dept. of CSE, NIT Silchar
Center for Natural Language Processing (CNLP)
cs.nits.ac.in/partha | [email protected] | +91-8259065018
Contents: Overview of Machine Translation (PAST, PRESENT, FUTURE)

● Introduction
● Challenges
● Types
● Evaluation Metrics
● World Perspective
● India Perspective
● Evaluation Tracks
● Machine Translation Tools
● Current Trends & Future Directions
● References
Machine Translation (MT) is the application of
computers to the task of translating texts from one
natural language to another.
Google Translate:

Launched in April 2006 as a statistical machine translation service.

In November 2016, Google announced that Google Translate would switch to a neural machine translation engine: Google Neural Machine Translation (GNMT).

GNMT is used in all 109 languages in the Google Translate roster as of October 2020.

Source: https://en.wikipedia.org/wiki/Google_Translate
Ref: https://translate.google.co.in/
● Based on the availability of resources (corpora, tools, speakers), natural languages fall into two categories: high-resource and low-resource languages.
● High-resource languages are resource-rich, like English, German, French, and Hindi.
● Low-resource languages are resource-poor, like many Indian languages, especially those of the north-eastern region of India such as Boro, Khasi, Kokborok, and Mizo.
Is Machine Translation hard or easy?

Challenges

● Lexical Ambiguity: Ambiguous words have multiple, related or unrelated, meanings.

■ Homonymy/Polysemy: The same word has different or multiple meanings, e.g. “bank”, “book”. The English word “book” has two meanings: a reading object, or recording something officially in a legal sense.
Challenges

● Choosing the Appropriate Inflection:

■ Number: all concrete nouns, e.g. book, books.

■ Gender: all adjectives.

■ Case: all arguments.

■ Tense: all verbs, e.g. buy, bought, will buy.
Challenges

● Inserting Spontaneous Words:

■ Determiners: a book, the book. Definite and indefinite articles (corresponding to the, a, an in English) do not exist in the Russian language. (https://en.wikipedia.org/wiki/Russian_grammar)

■ Prepositions: in November

■ Relative pronouns: The person who bought the book

■ Possessive pronouns: He raised his hand

■ Conjunctions: I run and play every day
Challenges

● Syntactic Translation Problems: The word order of various languages is not the same (S-V-O, S-O-V, V-S-O, etc.), e.g. the word order of English is S-V-O while that of Hindi is S-O-V.
Word order | Proportion of languages | Example languages
SOV        | 45%                     | Proto-Indo-European, Sanskrit, Hindi, Ancient Greek, Latin, Japanese
SVO        | 42%                     | English, French, Hausa, Indonesian, Malay, Mandarin, Russian
VSO        | 9%                      | Biblical Hebrew, Arabic, Irish, Filipino, Tuareg-Berber, Welsh
VOS        | 3%                      | Malagasy, Baure, Proto-Austronesian
OVS        | 1%                      | Apalaí, Hixkaryana
OSV        | 0%                      | Warao

About half of the world's languages deploy subject–object–verb (SOV) order; about one-third deploy subject–verb–object (SVO) order; a smaller fraction deploy verb–subject–object (VSO) order; the remaining three arrangements are rarer: verb–object–subject (VOS) is slightly more common than object–verb–subject (OVS), and object–subject–verb (OSV) is the rarest by a significant margin.

https://en.wikipedia.org/wiki/Word_order
Challenges

● Phrase Translation Problems:

■ Idiomatic phrases have hidden meanings; they cannot be translated word by word.

● Semantic Ambiguity:

■ This ambiguity arises when a sentence has more than one meaning, and context (pragmatics) is needed to resolve it.

■ Example: “My son dropped the glass and it broke into pieces” (here “it” refers to the glass).

Challenges

● Long-Term Dependency Problem: arises when the source sentence is very long.

● Context-analyzing ability.

● Low-Resource Languages: lack of sufficient corpora for low-resource languages.
Translation is Hard!

● Novels
● Word play, jokes, messages
● Concept gaps: “go Greek”, “bei fen”
● Other constraints: lyrics, dubbing, poems, etc.
● Sentiment-aware text
How do humans do translation?

• Learning a foreign language (training stage):
– Memorize word translations (translation lexicon)
– Learn some patterns (templates, transfer rules)
– Exercise (learning algorithm? reranking?)
• Passive activity: read, listen
• Active activity: write, speak

• Translation (decoding stage):
– Understand the sentence (parsing, semantic analysis?)
– Clarify or ask for help (optional)
– Translate the sentence (word-level? phrase-level? generate from meaning?)
What kinds of resources are available to MT?
➢ Translation lexicon:
○ Bilingual dictionary

➢ Templates, transfer rules:


○ Grammar books

➢ Parallel data
➢ Thesaurus, WordNet, FrameNet, etc.

➢ NLP tools: tokenizer, morph analyzer, parser, …

★ More resources for “major” languages, less for “minor” languages.


Types

Periods of the MT Approaches (Anoop, 2018)
Types

● Rule-based: Relies on a set of rules

■ Direct-based Machine Translation

■ Transfer-based Machine Translation

■ Interlingua-based Machine Translation

● Corpus-based: Relies on data (parallel, monolingual)

■ Example-based Machine Translation (EBMT)

■ Statistical Machine Translation (SMT)

■ Neural Machine Translation (NMT)
● Direct-based Machine Translation:

■ Also known as dictionary-based translation.

■ Each source word is simply translated into a target word using a bilingual lexicon, with predefined rules for reordering the output translation.
● Transfer-based Machine Translation:

■ The source sentence is analysed syntactically/semantically, and the resulting structure is mapped to a new syntactic/semantic structure in the target language by a set of rules.

■ The actual source words are then translated into target words following the direct translation approach.

(Anoop, 2018)
● Interlingua-based Machine Translation:

■ The source-language text is analysed and transformed into an abstract, language-independent representation (the interlingua), and the target language is then generated from that representation.

■ Interlingual machine translation is effective when the source and target languages are very different from each other, like Arabic and English.

(Anoop, 2018)
Drawbacks of Rule-based Machine Translation:

■ Good dictionaries are unavailable, and constructing new dictionaries and rule sets for all languages is expensive.

■ Therefore, corpus-based (also known as data-driven) approaches were introduced.
● Example-based Machine Translation (EBMT)

■ The key idea is analogy (text similarity).

■ Drawback: in real-world scenarios, we cannot cover all types of sentences with examples alone.

● Statistical Machine Translation (SMT)

■ Statistical models are generated whose parameters are estimated (learned) from the analysis of a parallel corpus. The obtained statistical model is used to predict the target sentence for a given source sentence.

■ SMT treats translation as a probabilistic task: predict the best translation for a given source sentence (see the formulation below).
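For reference, the classical noisy-channel formulation from the SMT literature (a standard result, not spelled out on these slides): the best target sentence maximizes the product of a translation model and a language model,

\hat{e} = \arg\max_{e} P(e \mid f) = \arg\max_{e} \underbrace{P(f \mid e)}_{\text{translation model}} \; \underbrace{P(e)}_{\text{language model}}

where f is the source sentence and e ranges over candidate target sentences; the denominator P(f) is dropped because it is constant for a given input.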
● Statistical Machine Translation

■ Variants include word-based, phrase-based, syntax-based, and hierarchical phrase-based translation.

■ The most accepted and widely used approach is phrase-based.

■ Prior to the neural approach, phrase-based SMT achieved state-of-the-art results [Anoop Kunchukuttan and Pushpak Bhattacharyya, 2016].

(Anoop, 2018)
● Neural Machine Translation: Basics of Modelling (seq2seq)

■ Inputs: parallel sentences.

■ Embedding layers: input sequences are fed into the model one word per time step. Each word is encoded as a unique integer or one-hot vector that maps to the vocabulary. Embeddings are used to convert each word to a vector; the size of the vector depends on the complexity of the vocabulary.

■ Encoder: takes the embeddings and prepares a learned source representation for the decoder.

Source: https://opennmt.net/
● Neural Machine Translation: Basics of Modelling (seq2seq)

■ Decoder: takes the encoded input as context and produces the correct translation sequence.

■ Outputs: returned as a sequence of integers or one-hot vectors, which can then be mapped to the dataset vocabulary.
● One-hot vector

https://www.researchgate.net/figure/Example-of-text-representation-by-one-hot-vector_fig2_301703031
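To make the representation concrete, a minimal sketch (not from the slides; the toy vocabulary is an assumption) of one-hot encoding with NumPy:

import numpy as np

# Toy vocabulary (an assumption, for illustration only).
vocab = ["i", "run", "and", "play", "every", "day"]
word_to_id = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    # A |V|-dimensional vector with a single 1 at the word's index.
    vec = np.zeros(len(vocab))
    vec[word_to_id[word]] = 1.0
    return vec

print(one_hot("play"))  # [0. 0. 0. 1. 0. 0.]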
● Embedding

■ Embeddings allow us to capture more precise semantic word relationships.

■ This is achieved by projecting each word into an n-dimensional space.

■ Words with similar meanings occupy similar regions of this space; the closer two words are, the more similar they are.
● Embedding

■ The vectors between words often represent useful relationships, e.g. Word2Vec (Mikolov et al., 2013), GloVe (Pennington et al., 2014), BERT (Devlin et al., 2018).

Source: https://towardsdatascience.com/language-translation-with-rnns-d84d43b40571
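The slides cite Word2Vec but do not prescribe a toolkit; below is a minimal sketch using the gensim library (an assumption). The toy corpus only demonstrates the API; useful embeddings require a large monolingual corpus.

from gensim.models import Word2Vec

# Tiny toy corpus (an assumption; real training needs millions of sentences).
sentences = [
    ["he", "bought", "the", "book"],
    ["she", "read", "the", "book"],
    ["he", "bought", "the", "pen"],
]
model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=50)

vec = model.wv["book"]                        # 50-dimensional embedding for "book"
print(model.wv.most_similar("book", topn=2))  # nearest neighbours in the vector space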
Recurrent Neural Network (RNN) Models

■ Vector-to-Sequence Model: image (vector representation) to caption generation (the output sequence is a sentence).

■ Sequence-to-Vector Model: text (the input sequence is a sentence) to sentiment, positive/negative (a fixed-size vector output).

■ Sequence-to-Sequence Model: source sentence (input sequence) to target sentence (output sequence).
Neural Machine Translation: Basic Recurrent Neural Network (RNN) Model [Sutskever et al., 2014]

■ Main components: encoder and decoder.

■ The encoder is responsible for compressing the entire source sentence into a context vector.

■ The decoder is used to decode the target sentence from the context vector. (See the sketch below.)

https://towardsdatascience.com/understanding-encoder-decoder-sequence-to-sequence-model-679e04af4346
Ref: https://www.youtube.com/watch?v=StOFwSRBwMo
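As a rough illustration of the encoder-decoder split, a sketch in PyTorch with GRU cells (not the authors' implementation; all sizes and the random batch are assumptions):

import torch
import torch.nn as nn

VOCAB, EMB, HID = 1000, 32, 64  # toy vocabulary and layer sizes (assumptions)

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)

    def forward(self, src):
        _, context = self.rnn(self.embed(src))
        return context  # (1, batch, HID): the whole source compressed into one vector

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, tgt, context):
        h, _ = self.rnn(self.embed(tgt), context)
        return self.out(h)  # per-step scores over the target vocabulary

src = torch.randint(0, VOCAB, (2, 7))  # batch of 2 source sentences, length 7
tgt = torch.randint(0, VOCAB, (2, 5))  # shifted target tokens (teacher forcing)
logits = Decoder()(tgt, Encoder()(src))
print(logits.shape)                    # torch.Size([2, 5, 1000])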
Neural Machine Translation: Basic Recurrent Neural Network (RNN) Model - Disadvantages

■ Slow to train.

■ Long sequences lead to vanishing/exploding gradients.

Source: https://www.youtube.com/watch?v=TQQlZhbC5ps
Neural Machine Translation: Recurrent Neural Network (RNN) Model - LSTM

■ Deals with variable-length source and target phrases.

■ Sequence-to-sequence learning.

■ Handles the long-term dependency issue using Long Short-Term Memory (LSTM).

Long Short-Term Memory gates:
Input gate: passes the previous hidden state and the current input through a sigmoid function.
Forget gate: decides what information should be thrown away or kept.
Output gate: decides what the next hidden state should be. Remember that the hidden state contains information on previous inputs.

LSTM-based Encoder-Decoder Architecture (h: hidden, c: context)
Source: https://images.app.goo.gl/QtvN431WEGZbpJgD6
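For reference, the standard LSTM gate equations from the general literature (the slides describe the gates only informally; \sigma is the logistic sigmoid and \odot is elementwise multiplication):

\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(new hidden state)}
\end{aligned}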
Neural Machine Translation: Recurrent Neural Network (RNN) Model - Embedding

■ Embeddings allow us to capture more semantic word relationships.

Embedding-based Encoder-Decoder Architecture
Source: https://towardsdatascience.com/language-translation-with-rnns-d84d43b40571
Neural Machine Translation: Recurrent Neural Network (RNN) Model - Attention Mechanism

■ Disadvantage of LSTM without attention: the context vector is a single vector that summarizes the entire input sequence.

● There may be input words that need more attention due to their impact on the translation.

■ The attention mechanism avoids attempting to learn a single vector representation for each sentence.

“Effective Approaches to Attention-based Neural Machine Translation”, 2015


Neural Machine Translation: Recurrent Neural Network (RNN) Model

■ The attention mechanism pays attention to certain vectors of the input sequence based on the attention weights.

■ This allows the decoder network to “focus” on a different part of the encoder outputs.

■ It does this for every step of the decoder's own outputs, using a set of attention weights. (A small sketch follows.)
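A minimal sketch of one decoder step with dot-product (“Luong-style”) scoring; the score function is an assumption, since the slides do not fix one:

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

enc_outputs = np.random.randn(7, 64)   # one encoder state per source token
dec_state = np.random.randn(64)        # current decoder hidden state

scores = enc_outputs @ dec_state       # one score per source position
weights = softmax(scores)              # attention weights, sum to 1
context = weights @ enc_outputs        # weighted sum: step-specific context vector
print(weights.round(2), context.shape) # which source words this step "focuses" on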
Neural Machine Translation: Recurrent Neural Network (RNN) Model - Attention

Attention Mechanism
Source: https://towardsdatascience.com/language-translation-with-rnns-d84d43b40571
Neural Machine Translation: Bidirectional Recurrent Neural Network (BRNN) Model [Sree Harsha Ramesh and Krishna Prasad Sankaranarayanan, 2018]

■ A BRNN utilizes two distinct RNNs: one layer for the forward direction and another layer for the backward direction.

Source: https://images.app.goo.gl/KrDpfoWswawkPNka7
Disadvantages of normal RNN- and LSTM-based MT:

● Sequential processing
● Slower

So the Transformer model was introduced by Vaswani et al., 2017.


Transformer Model:

● Moves from a sequential model to a parallelized model to accelerate the learning process.

Introduced by Vaswani et al., 2017. In the figure, the encoder network is on the left and the decoder network on the right.
Neural Machine Translation: Transformer Model [Ashish Vaswani et al., 2017]

■ Primarily used in language modelling, the self-attention mechanism allows the inputs to interact with each other (“self”).

■ Each input finds out which other inputs it should pay more attention to (“attention”).

■ Transformers do not rely on sequential processing; they process all tokens at the same time and calculate attention weights between them.

Source: https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html
Neural Machine Translation: Transformer Model

■ The outputs are aggregates of these interactions and attention scores.

■ Each position is encoded, and a self-attention mechanism connects any two different words; this can be parallelized to accelerate learning.

■ Each encoder consists of two layers: self-attention and a feed-forward neural network. (See the sketch below.)
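A minimal sketch of the scaled dot-product self-attention from Vaswani et al. (2017), softmax(QKᵀ/√d_k)V; the random projection matrices below are placeholders for learned parameters:

import numpy as np

def self_attention(X, W_q, W_k, W_v):
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)            # every token attends to every token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                         # contextualized token representations

X = np.random.randn(5, 16)                     # 5 tokens, 16-dim embeddings (toy sizes)
W_q, W_k, W_v = (np.random.randn(16, 16) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)  # (5, 16)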
Various Concepts of MT

● Multilingual concept
● Multimodal concept
● Zero-shot concept
Neural Machine Translation: Multilingual Concept [Tan et al., 2019]

■ By utilizing many-to-one, many-to-many, and one-to-many translation approaches, transfer learning is possible.

■ Transfer learning is very effective for low-resource language translation.

■ Existing works on Indian languages:

1. Sindhi, Magahi, Bhojpuri, Hindi [Pulkit et al., 2020]

2. Bengali, Hindi, Malayalam, Tamil, Telugu, Urdu, Sinhalese [Sukanta et al., 2018]
● Neural Machine Translation: Multimodal Concept [Calixto et al., 2017]

■ The multimodal (text/image/speech) concept is very effective for low-resource language translation.

■ Existing works on Indian languages:

1. English-Hindi [Koel et al., 2018]

2. Kannada, Malayalam, Tamil [Bharathi et al., 2019]

Image source: https://jlibovicky.github.io/2020/06/06/MT-Weekly-Unsupervised-Multimodal-MT.html
Neural Machine Translation: Zero-Shot

■ Zero-shot: translation without parallel training data for the language pair.

■ Google provides zero-shot translation using its multilingual NMT system for English-German and English-French [Melvin et al., 2017].

Source: https://ai.googleblog.com/2016/11/zero-shot-translation-with-googles.html
● SMT vs NMT: End-to-End Problem

■ SMT does not provide an end-to-end solution.

■ Given an input, we cannot directly get the output in SMT.

■ In SMT, if one of the modules is updated, then the entire translation system needs to be updated.

■ NMT provides an end-to-end solution because of its neural encoder-decoder architecture.
● SMT vs NMT: Long-Term Dependency and Context Analysis

■ SMT suffers from the problem of long-term dependency.

■ SMT lacks context-analyzing ability.

■ NMT handles such issues by utilizing Long Short-Term Memory (LSTM) and the attention mechanism.
● SMT vs NMT: Generalization

■ Generalization means that continuous representations (vectors of real numbers) are very effective at capturing various properties of languages.

■ SMT lacks generalization.

■ NMT provides generalization because of its continuous representations.
● SMT vs NMT: Word Embedding

■ Word embedding is a type of word representation that allows words with similar meanings to have similar representations, e.g. Word2Vec, GloVe, BERT.

■ Word embedding is not feasible in SMT.

■ In NMT, word embedding is possible.

■ The word embedding concept is useful for improving translation quality by utilizing a monolingual corpus.

■ Word embedding is very effective for low-resource language translation.

Photo credit: Chris Bail


● SMT vs NMT: Transfer Learning

■ Transfer learning is not feasible in SMT.

■ In NMT, the multilingual concept helps with transfer learning by utilizing many-to-one, many-to-many, and one-to-many translation approaches.
● SMT vs NMT: Mode (Text/Image/Speech)

■ SMT is limited to text-based translation.

■ The multimodal (text/image/speech) concept is possible in NMT.
Evaluation Metrics

Automatic Evaluation Metrics:

■ Bilingual Evaluation Understudy (BLEU) (Papineni et al., 2002)

■ Translation Edit Rate (TER) (Snover et al., 2006)

■ Metric for Evaluation of Translation with Explicit Ordering (METEOR) (Lavie et al., 2009)

■ F-measure (Lavie et al., 2009)
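A minimal sketch of computing sentence-level BLEU with NLTK (the toolkit choice and the smoothing setting are assumptions; the slides only cite the metric):

from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = ["he", "is", "a", "boy"]  # tokenized reference translation
candidate = ["he", "is", "a", "boy"]  # tokenized system output

# Smoothing avoids zero scores on short sentences with missing n-grams.
score = sentence_bleu([reference], candidate,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU = {score:.2f}")          # 1.00 for an exact match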


Human Evaluation Metric

■ A good translation of a sentence possesses two main aspects: adequacy and fluency.

■ The adequacy aspect measures the predicted translation quality in terms of the amount of meaning shared with the reference translation.

■ The fluency aspect measures the sentence formation of the predicted sentence, i.e. whether it is well-formed or not. It is not related to the source sentence.
Human evaluation metric, considering a 1-5 scale (English to Hindi translation):

Source:    He is a boy
Predicted: वह खेल रहा है      (Adequacy: 1, Fluency: 4)
Reference: वह एक लड़का है
Evaluation Tracks

Conference on Machine Translation (WMT)

■ http://www.statmt.org/

■ Hindi-Nepali, Hindi-Marathi, English-Tamil

■ 2005-2020
WMT Track             | Top Scorer                | Approach
2019 Hindi to Nepali  | NITS-CNLP (Laskar et al.) | NMT (Transformer)
2019 Nepali to Hindi  | NITS-CNLP (Laskar et al.) | NMT (Transformer)
2020 Hindi to Marathi | INFOSYS                   | NMT (Transformer)
2020 Marathi to Hindi | WIPRO-RIT                 | NMT (Transformer)
Workshop on Asian Translation (WAT)

■ http://lotus.kuee.kyoto-u.ac.jp/WAT/

■ English-Hindi, English-Odia, Bengali/Hindi/Malayalam/Tamil/Telugu/Marathi/Gujarati - English

■ 2014-2021
WAT Track                                         | Top Scorer                                   | Approach
2019 English to Hindi multimodal translation task | Team 683 (CNLP, NIT Silchar) (Laskar et al.) | NMT (BRNN)
2020 English to Hindi multimodal translation task | CNLP-NITS                                    | NMT (BRNN)
● Shared Task & Workshop on Machine Translation System in Indian Languages (MTIL)

■ https://nlp.amrita.edu/mtil_cen/

■ English-Tamil, English-Malayalam, English-Hindi, English-Punjabi

■ 2017

● Workshop on Low Resource Machine Translation (LoResMT)

■ http://sites.google.com/view/loresmt/

■ Bhojpuri/Magahi/Sindhi-English, Hindi-Bhojpuri, Hindi-Magahi, Russian-Hindi

■ 2019-2021
World Perspective

● Conference on Machine Translation (WMT)

● Workshop on Asian Translation (WAT)

● Workshop on Low Resource Machine Translation (LoResMT)

● Shared Task & Workshop on Machine Translation System in Indian Languages (MTIL)

India Perspective

● CFILT, IIT Bombay: http://www.cfilt.iitb.ac.in/

● IIIT Hyderabad: https://ltrc.iiit.ac.in/

● TDIL Data Center, Govt. of India: https://www.tdil-dc.in/index.php?option=com_vertical&parentid=8&Itemid=553&lang=en
Available Tools - Machine Translation

● SMT:
○ Moses: http://www.statmt.org/moses/

● NMT:
○ OpenNMT: https://opennmt.net/
○ Marian: https://marian-nmt.github.io/
○ Nematus: https://github.com/EdinburghNLP/nematus
Current Trends & Future Directions

● Low-Resource Indian Language Translation
○ Mizo
○ Assamese
○ Nyishi
○ Khasi

● Multilingual-based Translation

● Multimodal-based Translation
○ Hindi, Assamese
Current Trends & Future Directions

● Corpus creation and its introduction into MT

● Efficiently tackling the insufficient-data issue

● Dealing with:
○ Out-of-vocabulary words
○ Rare words
○ Multi-word expressions
Current Trends & Future Directions

● Transfer learning: how can high-resource languages be utilized effectively to improve low-resource pair translation?

● Multimodal embeddings: how can textual and visual features be utilized effectively to improve low-resource pair translation?
Current Trends & Future Directions

● Speech-to-Text Translation

● Text-to-Speech Translation

● Speech-to-Speech Translation
References
1. Anoop Kunchukuttan, Pushpak Bhattacharyya. Faster Decoding for Subword Level Phrase-based SMT between Related Languages.
Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3), COLING, pp-82-88 (2016).
2. Sutskever, Ilya and Vinyals, Oriol and Le, Quoc V. Sequence to Sequence Learning with Neural Networks. In proceedings of the 27th
International Conference on Neural Information Processing Systems - volume 2, MIT Press, Cambridge, MA, USA, NIPS’14, pp.
3104–3112 (2014).
3. Sree Harsha Ramesh and Krishna Prasad Sankaranarayanan. Neural Machine Translation for Low Resource Languages using Bilingual
Lexicon Induced from Comparable Corpora. In proceedings of the 2018 Conference of the North American Chapter of the Association for
Computational Linguistics: Student Research Workshop. Association for Computational Linguistics, New Orleans, Louisiana, USA, pp.
112–119 (2018).
4. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is All you Need. In Advances in Neural Information Processing Systems 30, Curran Associates, Inc., pp. 5998–6008 (2017).
5. Tan, X., Chen, J., He, D., Xia, Y., Qin, T., Liu, T.Y. Multilingual neural machine translation with language clustering. In proceedings of the
2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language
Processing (EMNLP-IJCNLP), pp. 963–973. Association for Computational Linguistics, Hong Kong, China (2019).
6. Pulkit Madaan, Fatiha Sadat. Multilingual Neural Machine Translation involving Indian Languages. Proceedings of the WILDRE5– 5th
Workshop on Indian Language Data: Resources and Evaluation, European Language Resources Association (ELRA), pp. 29–32 (2020).
7. Sukanta Sen, Kamal Kumar Gupta, Asif Ekbal, Pushpak Bhattacharyya. IITP-MT at WAT2018: Transformer-based Multilingual
Indic-English Neural Machine Translation System, Association for Computational Linguistics (2018).
8. Calixto, I., Liu, Q., Campbell, N. Doubly-attentive decoder for multi-modal neural machine translation. In proceedings of the 55th Annual
Meeting of the Association for Computational Linguistics (Volume 1:Long Papers), pp. 1913–1924. Association for Computational
Linguistics, Vancouver, Canada (2017).
9. Koel Dutta Chowdhury, Mohammed Hasanuzzaman, Qun Liu. Multimodal Neural Machine Translation for Low-resource Language Pairs
using Synthetic Data. Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP, Association for Computational
Linguistics, pp. 33-42 (2018).
10. Bharathi Raja Chakravarthi, Ruba Priyadharshini, Bernardo Stearns, Arun Jayapal, Sridevy S, Mihael Arcan, Manel Zarrouk, John P McCrae. Multilingual Multimodal Machine Translation for Dravidian Languages utilizing Phonetic Transcription. Proceedings of the 2nd Workshop on Technologies for MT of Low Resource Languages, European Association for Machine Translation, pp. 56-63 (2019).
11. Melvin Johnson, Mike Schuster, Quoc V. Le, Maxim Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat, Fernanda Viégas, Martin Wattenberg, Greg Corrado,
Macduff Hughes, Jeffrey Dean. Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation. Transactions of the Association
for Computational Linguistics, Volume 5, pp-339–351 (2017).
12. Giulia Mattoni, Pat Nagle, Carlos Collantes, Dimitar Sht. Shterionov. Zero-Shot Translation for Indian Languages with Sparse Data, MT Summit (2017).
13. Rashi Kumar, Piyush Jha, Vineet Sahula. An Augmented Translation Technique for low Resource language pair: Sanskrit to Hindi translation. Proceedings
of the 2019 2nd International Conference on Algorithms, Computing and Artificial Intelligence, pp 377-383 (2019)
14. Saurav Kumar, Saunack Kumar, Diptesh Kanojia, Pushpak Bhattacharyya. “A Passage to India”. Proceedings of the 1st Joint Workshop on Spoken Language Technologies for Under-resourced languages (SLTU) and Collaboration and Computing for Under-Resourced Languages (CCURL), pp. 352–357 (2020).
15. Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: A Method for Automatic Evaluation of Machine Translation. In proceedings
of the 40th Annual Meeting on Association for Computational Linguistics (ACL ’02). Association for Computational Linguistics, Stroudsburg, PA, USA,
pp-311–318.
16. Matthew Snover, Bonnie Dorr, Richard Schwartz, Linnea Micciulla, and John Makhoul. 2006. A Study of Translation Edit Rate with Targeted Human Annotation. In Proceedings of the Association for Machine Translation in the Americas, pp. 223–231.
17. Alon Lavie and Michael J. Denkowski. 2009. The Meteor Metric for Automatic Evaluation of Machine Translation. Machine Translation 23(2–3), pp. 105–115 (Sept. 2009).
18. Tomas Mikolov, Kai Chen, Greg Corrado and Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space. 1st International Conference on Learning Representations (ICLR), Scottsdale, Arizona, USA, May 2-4, 2013, Workshop Track Proceedings.
19. Jeffrey Pennington, Richard Socher, Christopher Manning. GloVe: Global Vectors for Word Representation. Proceedings of the 2014 Conference on
Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, pp. 1532–1543.
20. Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language
Technologies, Volume 1, Association for Computational Linguistics, pp. 4171–4186.
Hands-on:
Machine Translation
Mr. Sahinur Laskar, PhD Scholar, NIT Silchar
