Exploring Strategies for Generalizable Commonsense Reasoning with Pre-trained Models

Ma, Kaixin; Ilievski, Filip; Francis, Jonathan; Ozaki, Satoru; Nyberg, Eric; Oltramari, Alessandro

arXiv:2109.02837 (cs)

[Submitted on 7 Sep 2021]

Abstract: Commonsense reasoning benchmarks have been largely solved by fine-tuning language models. The downside is that fine-tuning may cause models to overfit to task-specific data and thereby forget their knowledge gained during pre-training. Recent works only propose lightweight model updates as models may already possess useful knowledge from past experience, but a challenge remains in understanding what parts and to what extent models should be refined for a given task. In this paper, we investigate what models learn from commonsense reasoning datasets. We measure the impact of three different adaptation methods on the generalization and accuracy of models. Our experiments with two models show that fine-tuning performs best, by learning both the content and the structure of the task, but suffers from overfitting and limited generalization to novel answers. We observe that alternative adaptation methods like prefix-tuning have comparable accuracy, but generalize better to unseen answers and are more robust to adversarial splits.

Comments:	EMNLP 2021
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2109.02837 [cs.CL]
	(or arXiv:2109.02837v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2109.02837

Submission history

From: Kaixin Ma
[v1] Tue, 7 Sep 2021 03:13:06 UTC (244 KB)
