Language Model Pre-training on True Negatives

Zhang, Zhuosheng; Zhao, Hai; Utiyama, Masao; Sumita, Eiichiro

arXiv:2212.00460 (cs)

[Submitted on 1 Dec 2022]

Abstract: Discriminative pre-trained language models (PLMs) learn to predict original texts from intentionally corrupted ones. Taking the original text as positive samples and the corrupted text as negative samples, a PLM can be trained effectively to produce contextualized representations. However, training this type of PLM relies heavily on the quality of the automatically constructed samples. Existing PLMs treat all corrupted texts as equally negative without any examination, so the resulting model inevitably suffers from the false negative issue: training is carried out on pseudo-negative data, which reduces both the efficiency and the robustness of the resulting PLMs. In this work, we define the false negative issue in discriminative PLMs, which has long been ignored, and design enhanced pre-training methods that counteract false negative predictions and encourage pre-training language models on true negatives by correcting the harmful gradient updates associated with false negative predictions. Experimental results on the GLUE and SQuAD benchmarks show that our counter-false-negative pre-training methods yield better performance together with stronger robustness.
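
To make the false negative issue concrete, the following is a minimal sketch and not the authors' implementation. It assumes a masked-token objective in which a prediction that differs from the original token but is an acceptable substitute should not be penalized. The abstract does not say how false negatives are detected or how the gradient updates are corrected; the synonym table `SYNONYMS` and the helpers `is_false_negative` and `masked_token_loss` are hypothetical names introduced only for illustration, and dropping the loss entirely is just one simple way such a correction could look.

```python
import math

# Hypothetical synonym table (an assumption for this sketch; the abstract
# does not specify how false negatives are identified).
SYNONYMS = {"big": {"large", "huge"}, "film": {"movie"}}


def is_false_negative(predicted, gold, synonyms=SYNONYMS):
    """Return True if a wrong prediction is a listed synonym of the gold token."""
    if predicted == gold:
        return False
    return (predicted in synonyms.get(gold, set())
            or gold in synonyms.get(predicted, set()))


def masked_token_loss(probs, gold, correct_false_negatives=True):
    """Cross-entropy loss for one masked position.

    probs maps each candidate token to its predicted probability.
    When correction is enabled and the top prediction is a false negative,
    the loss is set to zero so that this position contributes no update
    (an assumed, simplified form of correction).
    """
    predicted = max(probs, key=probs.get)
    if correct_false_negatives and is_false_negative(predicted, gold):
        return 0.0
    return -math.log(probs[gold])


if __name__ == "__main__":
    # The model prefers "large" where the original token was "big".
    probs = {"large": 0.6, "big": 0.3, "red": 0.1}
    print(masked_token_loss(probs, "big", correct_false_negatives=False))
    print(masked_token_loss(probs, "big", correct_false_negatives=True))
```

In this toy setting, the uncorrected objective penalizes the model for predicting "large" instead of "big", while the corrected one leaves that position out of the update. A prediction such as "red" is still treated as a true negative and incurs the usual loss.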

Comments:	Accepted by AAAI 2023
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2212.00460 [cs.CL]
	(or arXiv:2212.00460v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2212.00460

Submission history

From: Zhuosheng Zhang
[v1] Thu, 1 Dec 2022 12:24:19 UTC (1,082 KB)
