From English to Code-Switching: Transfer Learning with Strong Morphological Clues

Aguilar, Gustavo; Solorio, Thamar

arXiv:1909.05158 (cs)

[Submitted on 11 Sep 2019 (v1), last revised 1 May 2020 (this version, v3)]

Abstract: Linguistic code-switching (CS) is still an understudied phenomenon in natural language processing. The NLP community has mostly focused on monolingual and multilingual scenarios, and little attention has been given to CS in particular. This is partly due to the lack of resources and annotated data, despite the increasing occurrence of CS on social media platforms. In this paper, we aim to adapt monolingual models to code-switched text in various tasks. Specifically, we transfer English knowledge from a pre-trained ELMo model to different code-switched language pairs (i.e., Nepali-English, Spanish-English, and Hindi-English) using the task of language identification. Our method, CS-ELMo, is an extension of ELMo with a simple yet effective position-aware attention mechanism inside its character convolutions. We show the effectiveness of this transfer learning step by outperforming multilingual BERT and homologous CS-unaware ELMo models and by establishing a new state of the art in CS tasks such as NER and POS tagging. Our technique can be extended to more English-paired code-switched languages, providing more resources to the CS community.
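
The abstract names a position-aware attention mechanism inside the character convolutions but does not give its formulation. As a rough illustration only, the sketch below shows one common way such a mechanism could work: each character n-gram feature vector is scored together with a position embedding, the scores are normalized with a softmax, and the features are pooled by their attention weights. The function name, the additive tanh scoring form, and all parameter names (`W`, `v`, `pos_emb`) are assumptions made for this example, not details taken from the paper.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def position_aware_attention(features, pos_emb, W, v):
    """Hypothetical position-aware attention pooling (illustrative only).

    features: n vectors of size d (e.g. character n-gram convolution outputs)
    pos_emb:  n vectors of size d_a (one embedding per n-gram position)
    W:        d_a x d projection matrix (list of rows)
    v:        scoring vector of size d_a
    Returns (attention weights, attention-weighted sum of the features).
    """
    scores = []
    for x, p in zip(features, pos_emb):
        # Assumed scoring form: v . tanh(W x + p), so position shifts the score.
        hidden = [
            math.tanh(sum(w * x_k for w, x_k in zip(row, x)) + p_j)
            for row, p_j in zip(W, p)
        ]
        scores.append(sum(v_j * h_j for v_j, h_j in zip(v, hidden)))
    weights = softmax(scores)
    d = len(features[0])
    pooled = [sum(a * x[k] for a, x in zip(weights, features)) for k in range(d)]
    return weights, pooled


# Toy usage with random values standing in for learned parameters.
random.seed(0)
n, d, d_a = 5, 4, 3
features = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
pos_emb = [[random.uniform(-1, 1) for _ in range(d_a)] for _ in range(n)]
W = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(d_a)]
v = [random.uniform(-1, 1) for _ in range(d_a)]

weights, pooled = position_aware_attention(features, pos_emb, W, v)
print("attention weights:", [round(a, 3) for a in weights])
print("pooled vector size:", len(pooled))
```

In a real model the parameters would be learned jointly with the tagger, and the pooled vector would feed the rest of the network; here they are random placeholders so the example runs on its own.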

Comments:	Accepted to ACL 2020
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1909.05158 [cs.CL]
	(or arXiv:1909.05158v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1909.05158

Submission history

From: Gustavo Aguilar
[v1] Wed, 11 Sep 2019 15:53:21 UTC (1,793 KB)
[v2] Thu, 12 Sep 2019 18:54:22 UTC (1,794 KB)
[v3] Fri, 1 May 2020 20:56:37 UTC (1,468 KB)
