Temporal-aware Language Representation Learning From Crowdsourced Labels

Hao, Yang; Zhai, Xiao; Ding, Wenbiao; Liu, Zitao

arXiv:2107.07958 (cs)

[Submitted on 15 Jul 2021]

Abstract: Learning effective language representations from crowdsourced labels is crucial for many real-world machine learning tasks. A challenging aspect of this problem is that crowdsourced labels suffer from high intra- and inter-observer variability. Since high-capacity deep neural networks can easily memorize all disagreements among crowdsourced labels, directly applying existing supervised language representation learning algorithms may yield suboptimal solutions. In this paper, we propose TACMA, a temporal-aware language representation learning heuristic for crowdsourced labels with multiple annotators. The proposed approach (1) explicitly models the intra-observer variability with an attention mechanism, and (2) computes and aggregates per-sample confidence scores from multiple workers to address the inter-observer disagreements. The heuristic is easy to implement, requiring around 5 lines of code, and is evaluated on four synthetic and four real-world data sets. The results show that our approach outperforms a wide range of state-of-the-art baselines in terms of prediction accuracy and AUC. To encourage reproducibility, we make our code publicly available at this https URL.
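The abstract does not give the formula behind the per-sample confidence scores, so the following is only a minimal sketch of the general idea of aggregating several workers' votes into a confidence and using it to weight a training loss. It assumes binary labels and uses majority-vote agreement as the confidence, which is an illustrative stand-in rather than TACMA's actual scoring; the function names `aggregate_votes` and `confidence_weighted_log_loss` are hypothetical, and the attention-based modeling of intra-observer variability is not covered here.

```python
import math
from typing import List, Sequence, Tuple


def aggregate_votes(votes: Sequence[int]) -> Tuple[int, float]:
    """Return (majority_label, confidence) for one sample's binary votes.

    Confidence is the share of workers who agree with the majority label.
    This is an assumed, simplified score, not the one defined in the paper.
    """
    if not votes:
        raise ValueError("each sample needs at least one vote")
    positive_share = sum(votes) / len(votes)
    label = 1 if positive_share >= 0.5 else 0  # ties resolve to label 1
    confidence = positive_share if label == 1 else 1.0 - positive_share
    return label, confidence


def confidence_weighted_log_loss(
    probs: Sequence[float],
    votes_per_sample: Sequence[Sequence[int]],
    eps: float = 1e-12,
) -> float:
    """Average binary log loss, with each sample weighted by its confidence."""
    total, weight_sum = 0.0, 0.0
    for p, votes in zip(probs, votes_per_sample):
        label, confidence = aggregate_votes(votes)
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        loss = -math.log(p) if label == 1 else -math.log(1.0 - p)
        total += confidence * loss
        weight_sum += confidence
    return total / weight_sum


# Hypothetical data: five workers label each of three samples.
votes: List[List[int]] = [[1, 1, 1, 0, 0], [0, 0, 0, 0, 1], [1, 1, 1, 1, 1]]
predicted = [0.7, 0.2, 0.9]  # model's predicted probability of label 1
print(confidence_weighted_log_loss(predicted, votes))
```

Under this weighting, samples on which workers disagree contribute less to the loss than samples with unanimous votes, which is one simple way to keep a high-capacity model from fitting contested labels as strongly as agreed ones.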

Comments:	The 59th Annual Meeting of the Association for Computational Linguistics Workshop on Representation Learning for NLP (ACL RepL4NLP 2021)
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2107.07958 [cs.CL]
	(or arXiv:2107.07958v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2107.07958

Submission history

From: Zitao Liu
[v1] Thu, 15 Jul 2021 05:25:56 UTC (1,850 KB)
