Syntactic Inductive Bias in Transformer Language Models: Especially Helpful for Low-Resource Languages?

Gessler, Luke; Schneider, Nathan

arXiv:2311.00268 (cs)

[Submitted on 1 Nov 2023]

Abstract: A line of work on Transformer-based language models such as BERT has attempted to use syntactic inductive bias to enhance the pretraining process, on the theory that building syntactic structure into the training process should reduce the amount of data needed for training. But such methods are often tested for high-resource languages such as English. In this work, we investigate whether these methods can compensate for data sparseness in low-resource languages, hypothesizing that they ought to be more effective for low-resource languages. We experiment with five low-resource languages: Uyghur, Wolof, Maltese, Coptic, and Ancient Greek. We find that these syntactic inductive bias methods produce uneven results in low-resource settings, and provide surprisingly little benefit in most cases.

Comments:	Accepted at CoNLL 2023
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2311.00268 [cs.CL]
	(or arXiv:2311.00268v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2311.00268

Submission history

From: Luke Gessler
[v1] Wed, 1 Nov 2023 03:32:46 UTC (8,575 KB)
