DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders

Ma, Shuming; Dong, Li; Huang, Shaohan; Zhang, Dongdong; Muzio, Alexandre; Singhal, Saksham; Awadalla, Hany Hassan; Song, Xia; Wei, Furu

Computer Science > Computation and Language

arXiv:2106.13736 (cs)

[Submitted on 25 Jun 2021 (v1), last revised 18 Aug 2021 (this version, v2)]

Title:DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders

Authors:Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, Alexandre Muzio, Saksham Singhal, Hany Hassan Awadalla, Xia Song, Furu Wei

View PDF

Abstract:While pretrained encoders have achieved success in various natural language understanding (NLU) tasks, there is a gap between these pretrained encoders and natural language generation (NLG). NLG tasks are often based on the encoder-decoder framework, where the pretrained encoders can only benefit part of it. To reduce this gap, we introduce DeltaLM, a pretrained multilingual encoder-decoder model that regards the decoder as the task layer of off-the-shelf pretrained encoders. Specifically, we augment the pretrained multilingual encoder with a decoder and pre-train it in a self-supervised way. To take advantage of both the large-scale monolingual data and bilingual data, we adopt the span corruption and translation span corruption as the pre-training tasks. Experiments show that DeltaLM outperforms various strong baselines on both natural language generation and translation tasks, including machine translation, abstractive text summarization, data-to-text, and question generation. The code and pretrained models are available at \url{this https URL}.

Comments:	Work in progress
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2106.13736 [cs.CL]
	(or arXiv:2106.13736v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2106.13736

Submission history

From: Shuming Ma [view email]
[v1] Fri, 25 Jun 2021 16:12:10 UTC (278 KB)
[v2] Wed, 18 Aug 2021 03:14:53 UTC (278 KB)

Full-text links:

Access Paper:

view license

Current browse context:

cs.CL

< prev | next >

new | recent | 2021-06

Change to browse by:

References & Citations

DBLP - CS Bibliography

listing | bibtex

Shuming Ma
Li Dong
Shaohan Huang
Dongdong Zhang
Hany Hassan Awadalla

…

export BibTeX citation

Computer Science > Computation and Language

Title:DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders

Submission history

Access Paper:

References & Citations

DBLP - CS Bibliography

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Computation and Language

Title:DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders

Submission history

Access Paper:

References & Citations

DBLP - CS Bibliography

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators