On Long-Tailed Phenomena in Neural Machine Translation

Raunak, Vikas; Dalmia, Siddharth; Gupta, Vivek; Metze, Florian

arXiv:2010.04924 (cs)

[Submitted on 10 Oct 2020]

Abstract: State-of-the-art Neural Machine Translation (NMT) models struggle with generating low-frequency tokens, tackling which remains a major challenge. The analysis of long-tailed phenomena in the context of structured prediction tasks is further hindered by the added complexities of search during inference. In this work, we quantitatively characterize such long-tailed phenomena at two levels of abstraction, namely, token classification and sequence generation. We propose a new loss function, the Anti-Focal loss, to better adapt model training to the structural dependencies of conditional text generation by incorporating the inductive biases of beam search in the training process. We show the efficacy of the proposed technique on a number of Machine Translation (MT) datasets, demonstrating that it leads to significant gains over cross-entropy across different language pairs, especially on the generation of low-frequency words. We have released the code to reproduce our results.
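
The abstract names the Anti-Focal loss but does not give its form. As a rough illustration only, the sketch below compares per-token cross-entropy with the standard focal loss and with an anti-focal-style variant. The modulating factor (1 + alpha * p_t) ** gamma used for that variant is an assumption made for this sketch, not something stated on this page, and the function names and default values of alpha and gamma are hypothetical.

```python
import math


def cross_entropy(p_t: float) -> float:
    """Per-token cross-entropy, where p_t is the probability of the gold token."""
    return -math.log(p_t)


def focal_loss(p_t: float, gamma: float = 1.0) -> float:
    """Standard focal loss: down-weights tokens the model already predicts well."""
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def anti_focal_style_loss(p_t: float, alpha: float = 1.0, gamma: float = 1.0) -> float:
    """ASSUMED form (not given in the abstract): the weight grows with p_t
    instead of shrinking, so confident tokens are not down-weighted."""
    return -((1.0 + alpha * p_t) ** gamma) * math.log(p_t)


if __name__ == "__main__":
    # Compare the three losses at a low, medium, and high gold-token probability.
    for p in (0.1, 0.5, 0.9):
        print(p, cross_entropy(p), focal_loss(p), anti_focal_style_loss(p))
```

Under this assumed form, setting alpha to 0 recovers plain cross-entropy, and for any p_t strictly between 0 and 1 the focal loss falls below cross-entropy while the anti-focal-style loss sits above it.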

Comments:	Accepted to Findings of EMNLP 2020
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2010.04924 [cs.CL]
	(or arXiv:2010.04924v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2010.04924

Submission history

From: Vikas Raunak
[v1] Sat, 10 Oct 2020 07:00:57 UTC (8,088 KB)
