Paraphrases as Foreign Languages in Multilingual Neural Machine Translation

Zhou, Zhong; Sperber, Matthias; Waibel, Alex

arXiv:1808.08438 (cs)

[Submitted on 25 Aug 2018 (v1), last revised 1 Oct 2021 (this version, v3)]

Abstract: Paraphrases, rewordings of the same semantic meaning, are useful for improving generalization and translation. Unlike previous works that only explore paraphrases at the word or phrase level, we use different translations of the whole training data that are consistent in structure as paraphrases at the corpus level. We treat paraphrases as foreign languages, tag source sentences with paraphrase labels, and train on parallel paraphrases in multiple languages from various sources in the style of multilingual Neural Machine Translation (NMT). Our multi-paraphrase NMT that trains only on two languages outperforms the multilingual baselines. Adding paraphrases improves rare word translation and increases entropy and diversity in lexical choice. Adding source paraphrases improves performance more than adding target paraphrases. Combining both source and target paraphrases lifts performance further; combining paraphrases with multilingual data helps but has mixed performance. We achieve a BLEU score of 57.2 for French-to-English translation using 24 corpus-level paraphrases of the Bible, which outperforms the multilingual baselines and is +34.7 above the single-source single-target NMT baseline.
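
The tagging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the `<2label>` token format, the function names `make_tag` and `build_tagged_pairs`, the labels such as `en_1`, the choice to pair every corpus with every other corpus in both directions, and the toy sentences are all assumptions made for illustration. The only idea taken from the abstract is that each paraphrase corpus is handled like a separate language and that source sentences carry a label token.

```python
from itertools import permutations
from typing import Dict, List, Tuple


def make_tag(label: str) -> str:
    """Build a label token. The "<2...>" format is an assumption, not the paper's."""
    return f"<2{label}>"


def build_tagged_pairs(corpora: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Create (tagged source, target) training pairs from line-aligned corpora.

    `corpora` maps a label (a language or paraphrase id) to sentences aligned
    by index. Every corpus is paired with every other corpus, and each source
    sentence is prefixed with a tag naming the corpus to translate into.
    """
    lengths = {len(sentences) for sentences in corpora.values()}
    if len(lengths) > 1:
        raise ValueError("all corpora must be aligned line by line")

    pairs: List[Tuple[str, str]] = []
    for src_label, tgt_label in permutations(corpora, 2):
        tag = make_tag(tgt_label)
        for src, tgt in zip(corpora[src_label], corpora[tgt_label]):
            pairs.append((f"{tag} {src}", tgt))
    return pairs


# Toy data invented for illustration: one French and two English renderings.
toy = {
    "fr_1": ["Au commencement, Dieu créa les cieux et la terre."],
    "en_1": ["In the beginning God created the heaven and the earth."],
    "en_2": ["In the beginning, God created the heavens and the earth."],
}
pairs = build_tagged_pairs(toy)
```

With three aligned corpora of one sentence each, this yields six directed pairs, including English-to-English paraphrase pairs. Whether same-language pairs are used, and in which directions, is a design choice that this sketch does not take from the paper.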

Subjects: Computation and Language (cs.CL)
Cite as: arXiv:1808.08438 [cs.CL] (or arXiv:1808.08438v3 [cs.CL] for this version)
https://doi.org/10.48550/arXiv.1808.08438
Journal reference: Proceedings of 57th Annual Meeting of the Association for Computational Linguistics Student Research Workshop, 2019

Submission history

From: Zhong Zhou
[v1] Sat, 25 Aug 2018 15:20:30 UTC (192 KB)
[v2] Tue, 25 Jun 2019 08:29:27 UTC (1,451 KB)
[v3] Fri, 1 Oct 2021 00:17:25 UTC (568 KB)
