Abusive Language Detection in Online Conversations by Combining Content-And Graph-Based Features
In recent years, online social networks have allowed users worldwide to meet and discuss. As guarantors of these communities, the administrators of these platforms must prevent users from adopting inappropriate behaviors. This verification task, mainly done by humans, is increasingly difficult due to the ever-growing number of messages to check. Methods have been proposed to automate this moderation process, mainly approaches based on the textual content of the exchanged messages. Recent work has also shown that features derived from the structure of conversations, in the form of conversational graphs, can help detect these abusive messages. In this paper, we propose to take advantage of both sources of information by proposing fusion methods integrating content- and graph-based features. Our experiments on raw chat logs show that not only the content of the messages but also their dynamics within a conversation contain partially complementary information, allowing performance improvements on an abusive message classification task, with a final F-measure of 93.26%.

Keywords: automatic abuse detection, content analysis, conversational graph, online conversations, social networks

Edited by: Sabrina Gaito, University of Milan, Italy
Reviewed by: Roberto Interdonato, Territoires, Environnement, Télédétection et Information Spatiale (TETIS), France; Eric A. Leclercq, Université de Bourgogne, France
*Correspondence: Vincent Labatut, [email protected]
Specialty section: This article was submitted to Data Mining and Management, a section of the journal Frontiers in Big Data
Received: 01 April 2019; Accepted: 14 May 2019; Published: 04 June 2019
Citation: Cécillon N, Labatut V, Dufour R and Linarès G (2019) Abusive Language Detection in Online Conversations by Combining Content- and Graph-Based Features. Front. Big Data 2:8. doi: 10.3389/fdata.2019.00008

1. INTRODUCTION

The internet has widely impacted the way we communicate. Online communities, in particular, have grown to become important places for interpersonal communications. They get more and more attention from companies wanting to advertise their products, and from governments interested in monitoring public discourse. Online communities come in various shapes and forms, but they are all exposed to abusive behavior. The definition of what exactly is considered abuse depends on the community, but it generally includes personal attacks, as well as discrimination based on race, religion, or sexual orientation.

Abusive behavior is a risk, as it is likely to make important community members leave, thereby endangering the community, and may even trigger legal issues in some countries. Moderation consists in detecting users who act abusively, and in taking actions against them. Currently, this moderation work is mainly a manual process, and since it implies high human and financial costs, companies have a keen interest in its automation. One way of doing so is to consider this task as a classification problem consisting in automatically determining whether a user message is abusive or not.

A number of works have tackled this problem, or related ones, in the literature. Most of them focus only on the content of the targeted message to detect abuse or similar properties. For instance, Spertus (1997) applies this principle to detect hostility, Dinakar et al. (2011) for cyberbullying, and Chen et al. (2012) for offensive language. These approaches rely on a mix of
standard NLP features and manually crafted application-specific resources (e.g., linguistic rules). We also proposed a content-based method (Papegnies et al., 2017a) using a wide array of language features (Bag-of-Words, tf-idf scores, sentiment scores). Other approaches are more machine-learning intensive, but require larger amounts of data. Recently, Wulczyn et al. (2017) created three datasets containing individual messages collected from Wikipedia discussion pages, annotated for toxicity, personal attacks, and aggression, respectively. They have been leveraged in recent works to train recurrent neural networks operating on word embeddings and character n-gram features (Pavlopoulos et al., 2017; Mishra et al., 2018). However, the quality of these direct content-based approaches is very often tied to the training data used to learn the abuse detection models. In the case of online social networks, the great variety of users, including very different language registers, spelling mistakes, as well as intentional obfuscation by users, makes it almost impossible to obtain models robust enough to be applied in all cases. Hosseini et al. (2017) have shown that it is very easy to bypass automatic toxic comment detection systems by making the abusive content difficult to detect (intentional spelling mistakes, uncommon negatives, etc.).

Because the reactions of other users to an abuse case are completely beyond the abuser's control, some authors consider the content of the messages occurring around the targeted message, instead of focusing only on the targeted message itself. For instance, Yin et al. (2009) use features derived from the sentences neighboring a given message to detect harassment on the Web. Balci and Salah (2015) take advantage of user features such as gender, number of in-game friends, or number of daily logins to detect abuse in the community of an online game.

In our previous work (Papegnies et al., 2019), we proposed a radically different method that completely ignores the textual content of the messages and relies only on a graph-based modeling of the conversation. It is, to our knowledge, the only graph-based approach ignoring the linguistic content proposed in the context of abusive message detection. Our conversational network extraction process is inspired by other works leveraging such graphs for other purposes: modeling interactions in chat logs (Mutton, 2004) or online forums (Forestier et al., 2011), and user group detection (Camtepe et al., 2004). Additional references on abusive message detection and conversational network modeling can be found in Papegnies et al. (2019).

In this paper, based on the assumption that the interactions between users and the content of the exchanged messages convey different information, we propose a new method to perform abuse detection while leveraging both sources. For this purpose, we take advantage of the content-based (Papegnies et al., 2017b) and graph-based (Papegnies et al., 2019) methods that we previously developed. We propose three different ways to combine them, and compare their performance on a corpus of chat logs originating from the community of a French multiplayer online game. We then perform a feature study, identifying the most informative features and discussing their role. Our contribution is twofold: the exploration of fusion methods, and more importantly the identification of discriminative features for this problem.

The rest of this article is organized as follows. In section 2, we describe the methods and strategies used in this work. In section 3, we present our dataset, the experimental setup we use for this classification task, and the performance we obtain. Finally, we summarize our contributions in section 4 and present some perspectives for this work.

2. METHODS

In this section, we summarize the content-based method from Papegnies et al. (2017b) (section 2.1) and the graph-based method from Papegnies et al. (2019) (section 2.2). We then present the fusion methods proposed in this paper, which aim at taking advantage of both sources of information (section 2.3). Figure 1 shows the whole process, and is discussed throughout this section.

2.1. Content-Based Method

This method corresponds to the bottom-left part of Figure 1 (in green). It consists in extracting certain features from the content of each considered message, and training a Support Vector Machine (SVM) classifier to distinguish abusive (Abuse class) from non-abusive (Non-abuse class) messages (Papegnies et al., 2017b). These features are quite standard in Natural Language Processing (NLP), so we only describe them briefly here.

We use a number of morphological features. We use the message length, average word length, and maximal word length, all expressed in number of characters. We count the number of unique characters in the message. We distinguish between six classes of characters (letters, digits, punctuation, spaces, and others) and compute two features for each: number of occurrences, and proportion of the characters in the message. We proceed similarly with capital letters. Abusive messages often contain a lot of copy/paste. To deal with such redundancy, we apply the Lempel–Ziv–Welch (LZW) compression algorithm (Batista and Meira, 2004) to the message and take the ratio of its raw to compressed lengths, expressed in characters. Abusive messages also often contain extra-long words, which can be identified by collapsing the message: extra occurrences of letters repeated more than two times consecutively are removed. For instance, "looooooool" would be collapsed to "lool". We compute the difference between the raw and collapsed message lengths.

We also use language features. We count the numbers of words, unique words, and bad words in the message. For the latter, we use a predefined list of insults and symbols considered abusive, and we also count them in the collapsed message. We compute two overall tf–idf scores corresponding to the sums of the standard tf–idf scores of each individual word in the message: one is computed relative to the Abuse class, and the other to the Non-abuse class. We proceed similarly with the collapsed message. Finally, we lower-case the text and strip punctuation, in order to represent the message as a basic Bag-of-Words (BoW). We then train a Naive Bayes classifier to detect abuse using this sparse binary vector (as represented in the very bottom part of Figure 1). The output of this simple classifier is then used as an input feature for the SVM classifier.
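To make the morphological features more concrete, here is a minimal Python sketch of three of them: message collapsing, the capital-letter ratio, and the LZW-based redundancy ratio. The function names and the exact LZW variant are our own illustrative choices, not the paper's implementation.

```python
import re

def collapse(message):
    # Collapse letters repeated more than twice in a row:
    # "looooooool" -> "lool"
    return re.sub(r"(.)\1{2,}", r"\1\1", message)

def capital_ratio(message):
    # Proportion of upper-case characters among the letters
    letters = [c for c in message if c.isalpha()]
    return sum(c.isupper() for c in letters) / len(letters) if letters else 0.0

def lzw_length(message):
    # Minimal LZW compression; returns the number of output codes,
    # used as a proxy for the compressed length
    table = {chr(i): i for i in range(256)}
    w, n_codes = "", 0
    for c in message:
        if c not in table:          # handle characters outside Latin-1
            table[c] = len(table)
        if w + c in table:
            w += c
        else:
            n_codes += 1
            table[w + c] = len(table)
            w = c
    return n_codes + (1 if w else 0)

def redundancy_ratio(message):
    # Ratio of raw to compressed length: copy/pasted
    # (repetitive) messages score higher
    return len(message) / lzw_length(message) if message else 0.0
```

On a repetitive, copy/pasted message the redundancy ratio is noticeably higher than on non-repetitive text, which is precisely what the feature is meant to capture.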
2.2. Graph-Based Method

This method corresponds to the top-left part of Figure 1 (in red). It completely ignores the content of the messages, and focuses only on the dynamics of the conversation, based on the interactions between its participants (Papegnies et al., 2019). It consists of three steps: (1) extracting a conversational graph based on the considered message as well as the messages preceding and/or following it; (2) computing topological measures of this graph to characterize its structure; and (3) using these values as features to train an SVM to distinguish between abusive and non-abusive messages. The vertices of the graph model the participants of the conversation, whereas its weighted edges represent how intensely they communicate.

The graph extraction is based on a number of concepts illustrated in Figure 2, in which each rectangle represents a message. The extraction process is restricted to a so-called context period, i.e., a sub-sequence of messages including the message of interest, itself called the targeted message and represented in red in Figure 2. Each participant posting at least one message during this period is modeled by a vertex in the produced conversational graph. A mobile window is slid over the whole period, one message at a time. At each step, the network is updated either by creating new links, or by updating the weights of existing ones. This sliding window has a fixed length expressed in number of messages, which is derived from ergonomic constraints relative to the online conversation platform studied in section 3. It allows focusing on a smaller part of the context period. At a given time, the last message of the window (in blue in Figure 2) is called the current message, and its author the current author. The weight update method assumes that the current message is aimed at the authors of the other messages present in the window, and therefore connects the current author to them (or strengthens the corresponding edge weights if the edges already exist). It also takes chronology into account by favoring the most recent authors in the window. Three different variants of the conversational network are extracted for a given targeted message: the Before network is based on the messages posted before the targeted message, the After network on those posted after it, and the Full network on the whole context period.
FIGURE 1 | Representation of our processing pipeline. Existing methods refers to our previous work described in Papegnies et al. (2017b) (content-based method)
and Papegnies et al. (2019) (graph-based method), whereas the contribution presented in this article appears on the right side (fusion strategies). Figure available at
10.6084/m9.figshare.7442273 under CC-BY license.
FIGURE 2 | Illustration of the main concepts used during network extraction (see text for details). Figure available at 10.6084/m9.figshare.7442273 under CC-BY
license.
FIGURE 3 | Example of the three types of conversational networks extracted for a given context period: Before (Left), After (Center), and Full (Right). The author of
the targeted message is represented in red. Figure available at 10.6084/m9.figshare.7442273 under CC-BY license.
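The sliding-window extraction described above can be sketched as follows. This is a simplified illustration: the exact weight-update formula of Papegnies et al. (2019) differs, and the reciprocal-rank increment used here is only an assumption standing in for "favoring the most recent authors."

```python
def build_conversational_graph(messages, window=10):
    """messages: chronological list of (author, text) pairs.
    Returns a dict mapping (current_author, earlier_author) pairs
    to edge weights. Running it on the messages posted before
    (resp. after) the targeted message yields the Before (resp.
    After) network; the whole context period yields the Full one."""
    weights = {}
    for i, (author, _) in enumerate(messages):
        # Messages already present in the window, oldest first
        in_window = messages[max(0, i - window + 1):i]
        # The current message is assumed to address their authors,
        # with more recent authors weighted more heavily
        for rank, (other, _) in enumerate(reversed(in_window), start=1):
            if other != author:
                key = (author, other)
                weights[key] = weights.get(key, 0.0) + 1.0 / rank
    return weights

chat = [("ana", "hi"), ("bob", "yo"), ("cleo", "hey"), ("ana", "??")]
w = build_conversational_graph(chat, window=3)
```

In this toy run, the last message by "ana" links her more strongly to the most recent author ("cleo") than to the older one ("bob").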
Figure 3 shows an example of such networks, obtained for a message of the corpus described in section 3.1.

Once the conversational networks have been extracted, they must be described through numeric values in order to feed the SVM classifier. This is done through a selection of standard topological measures that describe a graph in a number of distinct ways, focusing on different scales and scopes. The scale denotes the nature of the characterized entity: in this work, the individual vertex and the whole graph are considered. When considering a single vertex, the measure focuses on the targeted author (i.e., the author of the targeted message). The scope can be either micro-, meso-, or macroscopic: it corresponds to the amount of information considered by the measure. For instance, the graph density is microscopic, the modularity is mesoscopic, and the diameter is macroscopic. All these measures are computed for each graph, and allow describing the conversation surrounding the message of interest. The SVM is then trained using these values as features. In this work, we use exactly the same measures as in Papegnies et al. (2019).

2.3. Fusion

We now propose a new method seeking to take advantage of both previously described ones. It is based on the assumption that the content- and graph-based features convey different information. Therefore, they could be complementary, and their combination could improve the classification performance. We experiment with three different fusion strategies, which are represented in the right-hand part of Figure 1.

The first strategy follows the principle of Early Fusion. It consists in constituting a global feature set containing all content- and graph-based features from sections 2.1 and 2.2, then training an SVM directly on these features. The rationale here is that the classifier has access to the whole raw data, and must determine which part is relevant to the problem at hand.

The second strategy is Late Fusion, and proceeds in two steps. First, we apply separately both methods described in sections 2.1 and 2.2, in order to obtain two scores corresponding to the output probability of each message being abusive, given by the content- and graph-based methods, respectively. Second, we feed these two scores to a third SVM, trained to determine whether a message is abusive or not. This approach relies on the assumption that these scores contain all the information the final classifier needs, without the noise present in the raw features.

Finally, the third fusion strategy can be considered as Hybrid Fusion, as it seeks to combine both previously proposed ones. We create a feature set containing the content- and graph-based features, as in Early Fusion, but also both scores used in Late Fusion. This whole set is used to train a new SVM. The idea is to check whether the scores fail to convey certain useful information present in the raw features, in which case combining scores and features should lead to better results.

3. EXPERIMENTS

In this section, we first describe our dataset and the experimental protocol (section 3.1). We then present and discuss our results, in terms of classification performance (section 3.2) and feature selection (section 3.3).

3.1. Experimental Protocol

The dataset is the same as in our previous publications (Papegnies et al., 2017b, 2019). It is a proprietary database containing 4,029,343 messages in French, exchanged on the in-game chat of SpaceOrigin1, a Massively Multiplayer Online Role-Playing Game (MMORPG). Among them, 779 have been flagged as abusive by at least one user in the game, and confirmed as such by a human moderator. They constitute what we call the Abuse class. Some inconsistencies in the database prevent us from retrieving the context of certain messages, which we remove from the set. After this cleaning, the Abuse class contains 655 messages. In order to keep a balanced dataset, we further extract the same number of messages at random from those that have not been flagged as abusive. This constitutes our Non-abuse class.

1 https://play.spaceorigin.fr/
Each message, whatever its class, is associated with its surrounding context (i.e., the messages posted in the same thread).

The graph extraction method used to produce the graph-based features requires setting certain parameters. We use the values matching the best performance, obtained during the greedy search of the parameter space performed in Papegnies et al. (2019). In particular, regarding the two most important parameters (see section 2.2), we fix the context period size to 1,350 messages and the sliding window length to 10 messages. Implementation-wise, we use the iGraph library (Csardi and Nepusz, 2006) to extract the conversational networks and process the corresponding features. We use the Sklearn toolkit (Pedregosa et al., 2011) to get the text-based features. We use the SVM classifier implemented in Sklearn under the name SVC (C-Support Vector Classification). Because of the relatively small dataset, we set up our experiments using a 10-fold cross-validation. Each fold is balanced between the Abuse and Non-abuse classes, with 70% of the dataset used for training and 30% for testing.

3.2. Classification Performance

Table 1 presents the Precision, Recall, and F-measure scores obtained on the Abuse class, for both baselines [Content-based (Papegnies et al., 2017b) and Graph-based (Papegnies et al., 2019)] and all three proposed fusion strategies (Early Fusion, Late Fusion, and Hybrid Fusion). It also shows the number of features used to perform the classification, the time required to compute the features and perform the cross-validation (Total Runtime), and the average time required to process one message (Average Runtime). Note that Late Fusion has only 2 direct inputs (the content- and graph-based SVMs), but these in turn have their own inputs, which explains the values displayed in the table.

TABLE 1 | Comparison of the performances obtained with the methods (Content-based, Graph-based, Fusion) and their subsets of Top Features (TF).

Our first observation is that we get higher F-measure values compared to both baselines when performing the fusion, independently of the fusion strategy. This confirms what we expected, i.e., that the information encoded in the interactions between the users differs from the information conveyed by the content of the messages they exchange. Moreover, this shows that both sources are at least partly complementary, since the performance increases when merging them. On a side note, the correlation between the scores of the graph- and content-based classifiers is 0.56, which is consistent with these observations.

Next, when comparing the fusion strategies, it appears that Late Fusion performs better than the others, with an F-measure of 93.26. This is a little surprising: we were expecting superior results from the Early Fusion, which has direct access to a much larger number of raw features (488). By comparison, the Late Fusion only gets 2 features, which are themselves the outputs of two other classifiers. This means that the content-based and graph-based classifiers do a good job of summarizing their inputs, without losing much of the information necessary to perform the classification task efficiently. Moreover, we assume that the Early Fusion classifier struggles to estimate an appropriate model when dealing with such a large number of features, whereas the Late Fusion one benefits from the pre-processing performed by its two predecessors, which act as if reducing the dimensionality of the data. This seems to be confirmed by the results of the Hybrid Fusion, which produces better results than the Early Fusion, but is still below the Late Fusion. This point could be explored by switching to a classification algorithm less sensitive to the number of features. Alternatively, the three SVMs used for the Late Fusion could be seen as a simpler form of a very basic Multilayer Perceptron, in which each neuron has been trained separately (without system-wide backpropagation). This could indicate that using a regular Multilayer Perceptron directly on the raw features could lead to improved results, especially if enough training data is available.

Regarding runtime, the graph-based approach takes more than 8 h to run on the whole corpus, mainly because of the feature computation step. This is due to the number of features, and to the compute-intensive nature of some of them. The content-based approach is much faster, with a total runtime of <1 min, for the exact opposite reasons. Fusion methods require computing both content- and graph-based features, so they have the longest runtimes.

3.3. Feature Study

We now want to identify the most discriminative features for all three fusion strategies. We apply an iterative method based on the Sklearn toolkit, which allows us to fit a linear-kernel SVM to the dataset and obtain a ranking of the input features reflecting their
importance in the classification process. Using this ranking, we identify the least discriminant feature, remove it from the dataset, and train a new model with the remaining features. The impact of this deletion is measured by the performance difference, in terms of F-measure. We reiterate this process until only one feature remains. We call Top Features (TF) the minimal subset of features allowing us to reach 97% of the original performance (i.e., that obtained with the complete feature set).

We apply this process to both baselines and all three fusion strategies. We then perform a classification using only their respective TF. The results are presented in Table 1. Note that the Late Fusion TF performance is obtained using the scores produced by the SVMs trained on the Content-based TF and Graph-based TF. These are also used as features when computing the TF for Hybrid Fusion (together with the raw content- and graph-based features). In terms of classification performance, by construction, the methods are ranked exactly as when considering all available features.

The Top Features obtained for each method are listed in Table 2. The last 4 columns specify which variants of the graph-based features are concerned. Indeed, as explained in section 2.2, most of these topological measures can handle or ignore edge weights and/or edge directions, can be vertex- or graph-focused, and can be computed for each of the three types of networks (Before, After, and Full).

There are three Content-Based TF. The first is the Naive Bayes prediction, which is not surprising as it comes from a fully fledged classifier processing BoWs. The second is the tf-idf score computed over the Abuse class, which shows that considering term frequencies indeed improves the classification performance. The third is the Capital Ratio (proportion of capital letters in the comment), which is likely due to abusive messages tending to be shouted, and therefore written in capitals. The Graph-Based TF are discussed in depth in our previous article (Papegnies et al., 2019). To summarize, the most important features help detect changes in the direct neighborhood of the targeted author (Coreness, Strength), in the average node centrality at the level of the whole graph in terms of distance (Closeness), and in the general reciprocity of exchanges between users (Reciprocity).

We obtain 4 features for the Early Fusion TF. One is the Naive Bayes feature (content-based), and the other three are topological measures (graph-based features). Two of the latter correspond to the Coreness of the targeted author, computed for the Before and After graphs. The third topological measure is his/her Eccentricity. This reflects important changes in the interactions around the targeted author, likely caused by angry users piling up on the abusive user after he has posted some inflammatory remark. For the Hybrid Fusion TF, we also get 4 features, but those include in first place both SVM outputs from the content- and graph-based classifiers. They are completed by 2 graph-based features, including Strength (also found in the Graph-based and Late Fusion TF) and Coreness (also found in the Graph-based, Early Fusion, and Late Fusion TF).

Besides a better understanding of the dataset and classification process, one interesting use of the TF is that they can decrease the computational cost of the classification. In our case, this is true for all methods: we can retain 97% of the performance while using only a handful of features instead of
hundreds. For instance, with the Late Fusion TF, we need only 3% of the total Late Fusion runtime.

4. CONCLUSION AND PERSPECTIVES

In this article, we tackle the problem of automatic abuse detection in online communities. We take advantage of the methods that we previously developed to leverage message content (Papegnies et al., 2017a) and interactions between users (Papegnies et al., 2019), and create a new method using both types of information simultaneously. We show that the features extracted from our content- and graph-based approaches are complementary, and that combining them noticeably improves the results, up to 93.26 (F-measure). One limitation of our method is the computational time required to extract certain features. However, we show that using only a small subset of relevant features dramatically reduces the processing time (down to 3%) while keeping more than 97% of the original performance.

Another limitation of our work is the small size of our dataset. We must find other corpora to test our methods at a much larger scale. However, all the available datasets are composed of isolated messages, whereas we need threads to make the most of our approach. A solution could be to start from datasets such as the Wikipedia-based corpus proposed by Wulczyn et al. (2017), and complete them by reconstructing the original conversations containing the annotated messages. This would also be an opportunity to test our methods on a language other than French. Our content-based method may be impacted by this change, but this should not be the case for the graph-based method, as it is independent from the content (and therefore the language). Besides language, a different online community is likely to behave differently from the one we studied. In particular, its members could react differently to abuse. The Wikipedia dataset would therefore allow assessing how such cultural differences affect our classifiers, and identifying which observations made for SpaceOrigin still apply to Wikipedia.

DATA AVAILABILITY

The datasets for this manuscript are not publicly available because they are proprietary. Requests to access the data should be addressed to the corresponding author, V. Labatut.

AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.