On the Robustness of Language Encoders against Grammatical Errors

Yin, Fan; Long, Quanyu; Meng, Tao; Chang, Kai-Wei

arXiv:2005.05683 (cs)

[Submitted on 12 May 2020]

Abstract: We conduct a thorough study to diagnose the behaviors of pre-trained language encoders (ELMo, BERT, and RoBERTa) when confronted with natural grammatical errors. Specifically, we collect real grammatical errors from non-native speakers and conduct adversarial attacks to simulate these errors on clean text data. We use this approach to facilitate debugging models on downstream applications. Results confirm that the performance of all tested models is affected but the degree of impact varies. To interpret model behaviors, we further design a linguistic acceptability task to reveal their abilities in identifying ungrammatical sentences and the position of errors. We find that fixed contextual encoders with a simple classifier trained on the prediction of sentence correctness are able to locate error positions. We also design a cloze test for BERT and discover that BERT captures the interaction between errors and specific tokens in context. Our results shed light on understanding the robustness and behaviors of language encoders against grammatical errors.
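
The idea of simulating grammatical errors on clean text can be illustrated with a minimal sketch. It is not the authors' method: the abstract describes error patterns collected from non-native speakers and an adversarial search, whereas this example uses small hand-picked confusion sets and a random choice. The names `ARTICLES`, `PREPOSITIONS`, `candidate_edits`, `apply_edit`, and `inject_error` are hypothetical and introduced only for illustration.

```python
import random

# Hypothetical confusion sets, chosen for illustration only; the paper
# derives its error patterns from real non-native writing.
ARTICLES = {"a", "an", "the"}
PREPOSITIONS = ["in", "on", "at", "for", "to", "of"]


def candidate_edits(tokens):
    """List single-token perturbations as (index, replacement-or-None) pairs."""
    edits = []
    for i, tok in enumerate(tokens):
        low = tok.lower()
        if low in ARTICLES:
            edits.append((i, None))  # article deletion
        elif low in PREPOSITIONS:
            # preposition substitution with every other preposition in the set
            edits.extend((i, p) for p in PREPOSITIONS if p != low)
    return edits


def apply_edit(tokens, edit):
    """Return a new token list with one edit applied."""
    i, repl = edit
    if repl is None:
        return tokens[:i] + tokens[i + 1:]
    return tokens[:i] + [repl] + tokens[i + 1:]


def inject_error(sentence, rng):
    """Inject one randomly chosen error; return the sentence unchanged if none applies."""
    tokens = sentence.split()
    edits = candidate_edits(tokens)
    if not edits:
        return sentence
    return " ".join(apply_edit(tokens, rng.choice(edits)))


rng = random.Random(0)
print(inject_error("The cat sat on the mat", rng))
```

An adversarial variant would replace `rng.choice` with a search over `candidate_edits` that keeps the edit which most degrades a downstream model's prediction; that scoring step is omitted here because it depends on the model under test.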

Comments:	ACL 2020
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2005.05683 [cs.CL]
	(or arXiv:2005.05683v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2005.05683

Submission history

From: Fan Yin
[v1] Tue, 12 May 2020 11:01:44 UTC (154 KB)
