Too good to be true? Predicting author profiles from abusive language

van der Vegt, Isabelle; Kleinberg, Bennett; Gill, Paul

arXiv:2009.01126 (cs)

[Submitted on 2 Sep 2020 (v1), last revised 3 Sep 2020 (this version, v2)]

Abstract: The problem of online threats and abuse could potentially be mitigated with a computational approach in which sources of abuse are better understood or identified through author profiling. However, abusive language constitutes a specific domain of language, and it has not yet been tested whether differences in it emerge based on a text author's personality, age, or gender. This study examines statistical relationships between author demographics and abusive vs. normal language, and performs prediction experiments for personality, age, and gender. Although some statistical relationships were established between author characteristics and language use, these patterns did not translate into high prediction performance. Personality traits were predicted within 15% of their actual value, age was predicted with an error margin of 10 years, and gender was classified correctly in 70% of cases. These results are poor compared to previous research on author profiling; we therefore urge caution in applying author profiling within the context of abusive language and threat assessment.

Comments:	Pre-print
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2009.01126 [cs.CL]
	(or arXiv:2009.01126v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2009.01126

Submission history

From: Isabelle van der Vegt
[v1] Wed, 2 Sep 2020 15:05:43 UTC (231 KB)
[v2] Thu, 3 Sep 2020 13:23:58 UTC (231 KB)
