Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models

Blevins, Terra; Limisiewicz, Tomasz; Gururangan, Suchin; Li, Margaret; Gonen, Hila; Smith, Noah A.; Zettlemoyer, Luke

arXiv:2401.10440 (cs)

[Submitted on 19 Jan 2024 (v1), last revised 8 Oct 2024 (this version, v2)]

Abstract: Despite their popularity in non-English NLP, multilingual language models often underperform monolingual ones due to inter-language competition for model parameters. We propose Cross-lingual Expert Language Models (X-ELM), which mitigate this competition by independently training language models on subsets of the multilingual corpus. This process specializes X-ELMs to different languages while remaining effective as a multilingual ensemble. Our experiments show that when given the same compute budget, X-ELM outperforms jointly trained multilingual models across all considered languages and that these gains transfer to downstream tasks. X-ELM provides additional benefits over performance improvements: new experts can be iteratively added, adapting X-ELM to new languages without catastrophic forgetting. Furthermore, training is asynchronous, reducing the hardware requirements for multilingual training and democratizing multilingual modeling.

Comments:	EMNLP 2024
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2401.10440 [cs.CL]
	(or arXiv:2401.10440v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2401.10440

Submission history

From: Terra Blevins
[v1] Fri, 19 Jan 2024 01:07:50 UTC (2,258 KB)
[v2] Tue, 8 Oct 2024 11:44:49 UTC (2,268 KB)
