You Are What You Annotate: Towards Better Models through Annotator Representations

Deng, Naihao; Zhang, Xinliang Frederick; Liu, Siyang; Wu, Winston; Wang, Lu; Mihalcea, Rada

arXiv:2305.14663 (cs)

[Submitted on 24 May 2023 (v1), last revised 22 Oct 2023 (this version, v2)]

Abstract: Annotator disagreement is ubiquitous in natural language processing (NLP) tasks. Such disagreements have multiple causes, including the subjectivity of the task, difficult cases, and unclear guidelines. Rather than simply aggregating labels to obtain data annotations, we directly model the diverse perspectives of the annotators and explicitly account for their idiosyncrasies in the modeling process by creating representations for each annotator (annotator embeddings) and for their annotations (annotation embeddings). In addition, we propose TID-8, The Inherent Disagreement - 8 dataset, a benchmark that consists of eight existing language understanding datasets with inherent annotator disagreement. We test our approach on TID-8 and show that it helps models learn significantly better from disagreements on six of the datasets in TID-8 while increasing model size by fewer than 1% of the parameters. By capturing the unique tendencies and subjectivity of individual annotators through embeddings, our representations prime AI models to be inclusive of diverse viewpoints.
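To make the idea of an annotator embedding concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: the class name `AnnotatorAwareClassifier`, the choice to add the annotator vector to the text vector, the random (untrained) weights, and all sizes are assumptions made for illustration. It covers only the annotator side; annotation embeddings are left out.

```python
import random


def make_table(rows, dim, rng):
    """Return a rows x dim table of small random floats.

    Stands in for learned parameters; nothing here is trained.
    """
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rows)]


class AnnotatorAwareClassifier:
    """Toy model: text vector + annotator embedding -> label scores.

    Hypothetical illustration. Combining by addition is an assumption,
    not a detail taken from the abstract.
    """

    def __init__(self, num_annotators, dim, num_labels, seed=0):
        rng = random.Random(seed)
        # One vector per annotator, looked up by annotator id.
        self.annotator_emb = make_table(num_annotators, dim, rng)
        # One weight row per label for a linear scoring layer.
        self.weights = make_table(num_labels, dim, rng)

    def scores(self, text_vec, annotator_id):
        emb = self.annotator_emb[annotator_id]
        # Shift the text representation by the annotator's vector.
        combined = [t + e for t, e in zip(text_vec, emb)]
        # Linear layer: dot product of each label row with the combined vector.
        return [sum(w * c for w, c in zip(row, combined)) for row in self.weights]

    def predict(self, text_vec, annotator_id):
        s = self.scores(text_vec, annotator_id)
        # Index of the highest-scoring label.
        return max(range(len(s)), key=s.__getitem__)


model = AnnotatorAwareClassifier(num_annotators=3, dim=4, num_labels=2)
text_vec = [0.5, -0.2, 0.1, 0.0]  # placeholder for an encoder output
per_annotator = [model.scores(text_vec, a) for a in range(3)]
```

The same text yields a different score vector for each annotator id, which is what lets a model of this shape fit individual labels instead of a single aggregated one. The embedding table adds only `num_annotators * dim` parameters, small next to a typical text encoder.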

Comments:	Accepted to Findings of EMNLP 2023
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2305.14663 [cs.CL]
	(or arXiv:2305.14663v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2305.14663

Submission history

From: Naihao Deng
[v1] Wed, 24 May 2023 03:06:13 UTC (7,521 KB)
[v2] Sun, 22 Oct 2023 17:10:54 UTC (7,620 KB)
