A Hierarchically-Labeled Portuguese Hate Speech Dataset
Proceedings of the Third Workshop on Abusive Language Online, pages 94–104, Florence, Italy, August 1, 2019. © 2019 Association for Computational Linguistics
and ‘sexism’ (Waseem and Hovy, 2016) – despite the fact that such broad distinctions unduly overgeneralize. For instance, by classifying discrimination against both black people and refugees simply as ‘racism’, we ignore that in this case different characteristics with a different motivation are targeted (which is also reflected in a different language style). In particular, we compile and annotate a new dataset composed of 5,668 tweets in Portuguese, which is one of the most commonly used languages online (Fox, 2013). Two types of annotation are carried out. For the first, non-expert annotators classify the messages in a binary fashion (‘hate’ vs. ‘no-hate’). For the second, we build a multilabel hierarchical hate speech annotation schema with 81 hate categories in total (available at https://github.com/paulafortuna/Portuguese-Hate-Speech-Dataset). To demonstrate the usefulness of our dataset, we carried out a baseline classification experiment with pre-trained word embeddings and an LSTM on the binary classified data, with a state-of-the-art outcome.

The remainder of the paper is structured as follows: Section 2 reviews the literature. Section 3 describes our crawling procedure. In Section 4, we present the two annotation schemas we work with: the binary and the hierarchical schema. Section 5 discusses a baseline hate speech experiment that we carried out to validate our new dataset. Section 6 presents some ethical considerations of this work. Finally, Section 7 presents our conclusions.

2 Related Work

2.1 Hate Speech Concepts

Fortuna and Nunes (2018) analyze and compare several aggression-related concepts. As a result of their analysis, they present the following definition of hate speech:

“Hate speech is language that attacks or diminishes, that incites violence or hate against groups, based on specific characteristics such as physical appearance, religion, descent, national or ethnic origin, sexual orientation, gender identity or other, and it can occur with different linguistic styles, even in subtle forms or when humour is used.”

We adopted this definition in our work. Our work has also been inspired by the taxonomy provided by Salminen et al. (2018), which includes 29 hate categories characterized in terms of hateful language, target, and sub-target types. To create their taxonomy, Salminen et al. followed an iterative and qualitative procedure called “open coding” (Glaser and Strauss, 2017).

There are obvious similarities between Salminen et al.’s approach and ours. However, there are also some significant differences. The first difference concerns the underlying definition of hate. While they use the very generic definition “hateful comments toward a specific group or target”, the definition we adopt is more specific (cf. above). This leads to differences in the taxonomy. For instance, they introduce ‘hate against media’ and ‘hate against religion’, which is hate against abstract entities and is not considered by us. Additionally, they merge the targets of hate and the type of discourse into the same hate speech taxonomy. In our case, we focus on the targets of hate speech only.

2.2 Dataset Annotation

Several hate speech datasets are publicly available, e.g., for English (Waseem and Hovy, 2016; Davidson et al., 2017; Nobata et al., 2016; Jigsaw, 2018), Spanish (Fersini et al., 2018), Italian (Poletto et al., 2017; Sanguinetti et al., 2018), German (Ross et al., 2016), Hindi (Kumar et al., 2018), and Portuguese (de Pelle and Moreira, 2017). In this section, we analyze the data collection strategy, the annotation method, and the dataset properties of three representative hate speech datasets: the Hate Speech, Racism and Sexism dataset by Waseem and Hovy (2016), the Offensive Language dataset by Davidson et al. (2017), and the Portuguese News Comments dataset by de Pelle and Moreira (2017). We have chosen the first two because they are the most widely used datasets for automatic hate speech classification in English. They show how Twitter can be used to retrieve information and how to conduct the manual classification relying on both expert and non-expert annotators. The third is another annotated and published dataset for Portuguese, which is rather different from ours.

Hate Speech, Racism and Sexism Dataset. This dataset (https://github.com/ZeerakW/hatespeech) by Waseem and Hovy (2016) contains 16,914 tweets in English, which were classified by two annotators using the classes “Racism”, “Sexism” and “Neither”. Regarding the tweet collection, an initial manual search was conducted on Twitter to collect common slurs and terms related to religious, sexual, gender, and ethnic minorities.
The authors identified frequently occurring terms in tweets that contain hate speech and used those terms to retrieve more messages. The messages were then annotated by the main researcher together with a gender studies student; in total, 3,383 tweets were labeled as sexist, 1,972 as racist, and 11,559 as neither sexist nor racist. The inter-annotator agreement had a Cohen’s Kappa of 0.84. The authors of the study concluded that the use of n-grams provides good results in the task of automatic hate speech detection, and that adding demographic information leads to little improvement in the performance of the classification model.

Offensive Language Dataset. Davidson et al. (2017) annotated a dataset (https://github.com/t-davidson/hate-speech-and-offensive-language) with 14,510 tweets in English, using the classes “Hate”, “Offensive” and “Neither”. Regarding the collection of the messages, they started with an English hate speech lexicon compiled by Hatebase.org and searched for tweets that contained terms from this lexicon. The outcome was a collection of tweets written by 33,458 Twitter users. The collected tweets were complemented with further tweets from these users, which resulted in a corpus of 85.4 million tweets. Finally, from this corpus, a random sample of 25,000 tweets containing terms from the lexicon was extracted and manually annotated by CrowdFlower workers. Three or more workers annotated each message, and majority voting was used to assign a label to each tweet. Tweets that did not have a majority class were discarded. This resulted in a sample of 24,802 labeled tweets. The inter-annotator agreement score provided by CrowdFlower was 92%. However, only 5% of the tweets were labeled as hate speech by the majority of the workers.

Portuguese News Comments Dataset. de Pelle and Moreira (2017) collected a dataset (https://github.com/rogersdepelle/OffComBR) with 1,250 random comments from the Globo news site on politics and sports news. Each comment was annotated by three annotators, who were asked to indicate whether it contained ‘racism’, ‘sexism’, ‘homophobia’, ‘xenophobia’, ‘religious intolerance’, or ‘cursing’. ‘Cursing’ was the most frequent label, while the other labels had few instances in the corpus. Regarding the annotator agreement, the value was 0.71.

In comparison to this work, the dataset that we have compiled provides more data and is not restricted to specific topics. Additionally, our annotation focuses only on hate speech, instead of general offensive content. We also use and provide a complete labeling schema.

Compared to the previous two datasets, our second annotation schema is considerably more fine-grained. As we will see below, our annotation procedure with the fine-grained schema is similar to that of Waseem and Hovy (2016).

2.3 Classification methods

Different studies conclude that deep learning approaches outperform classical machine learning algorithms in the task of hate speech detection; see, e.g., Mehdad and Tetreault (2016); Park and Fung (2017); Del Vigna et al. (2017); Pitsilis et al. (2018); Founta et al. (2018); Gambäck and Sikdar (2017). For instance, Badjatiya et al. (2017) compare the use of different types of neural networks (CNN, LSTM) and deep learning libraries such as FastText against classical machine learning techniques, and experiment with different types of word embeddings. The setup that achieved the best performance consists of the combination of deep techniques with standard ML classifiers, more precisely, of embeddings learned by an LSTM model combined with gradient boosted decision trees. We will follow a similar methodology for classification.

3 Message Collection

Our overall approach to message collection is outlined in Figure 1. In what follows, we introduce the individual steps in detail.

[Figure 1: Pipeline of the message collection procedure: enumeration of pages and keywords (hate-related keywords and hate profiles), crawling in Twitter, tweet filtering, and tweet sampling.]

Use of Keywords and Profiles. We used Twitter’s search API for keywords and profiles because the two are complementary as message sources. With the first, we access a wider range of tweets from different profiles, but we restrict the search to specific words or expressions that indicate hate. With the second, we obtain more spontaneous discourse, but from a more restricted number of users:

• Hate-related keywords: We used Twitter’s API search feature to look for keywords and hashtags related to hate speech, such as fufas, sapatão ‘dyke’ or #LugarDeMulherENaCozinha ‘#womensPlaceIsInTheKitchen’.
• Hate-related profiles: Using the profile search API, we queried with words like ódio ‘hate’, discurso de ódio ‘hate speech’ and ofensivo ‘offensive’ in order to find accounts that post hateful messages. In Portuguese, there are social media users whose profiles are built specifically for sharing hateful content against certain minorities. We collected the messages from those accounts with the expectation of finding hate speech messages. This search also allowed us to find counter-hate profiles, which use the same words in their description. It seemed adequate to keep these profiles because they reproduce hate speech messages from other users.

In total, we looked at 29 specific profiles and used 19 keywords and ten hashtags, for 58 search instances. (We use the term “search instance” to refer to a profile, keyword, or hashtag used for the Twitter search.) The goal has been to be exhaustive and to cover different types of discrimination, based on religion, gender, sexual orientation, ethnicity, and migration. We compiled this collection of search instances because there was no specific hate speech lexicon available for Portuguese; Hatebase, e.g., contains only generic hate terms (Hatebase, 2019).
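The paper only names the two search strategies; the crawler itself was written in R (see below). Purely as an illustration, a minimal Python sketch of both strategies could look as follows, assuming tweepy 3.x (where API.search still exists; it was renamed in later versions) and placeholder credentials and profile names:

# Illustrative only, not the original R crawler: keyword and profile
# search with tweepy 3.x. Credentials and the profile name are placeholders.
import tweepy

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
api = tweepy.API(auth, wait_on_rate_limit=True)

keywords = ["sapatão", "#LugarDeMulherENaCozinha"]  # hate-related search instances
profiles = ["some_hateful_account"]                 # hypothetical hate-related profile

collected = []
for kw in keywords:
    # Keyword search: tweets from a wide range of authors, restricted to Portuguese.
    collected += [t.text for t in tweepy.Cursor(api.search, q=kw, lang="pt").items(200)]
for user in profiles:
    # Profile timelines: more spontaneous discourse, but from fewer users.
    collected += [t.text for t in api.user_timeline(screen_name=user, count=200)]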
Crawling. We used R to crawl the content of both the keyword searches and the profiles on the 8th and 9th of March 2017. A total of 42,930 messages were collected.

Tweet Filtering. We kept tweets categorized by Twitter as written in Portuguese. To avoid duplication, we eliminated repetitions and retweets of already collected messages, and we removed HTML tags and discarded messages with fewer than three words.
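The filtering code is not part of the paper; a minimal Python sketch of the steps just described, assuming each crawled message comes with its text and Twitter’s language tag, could look like this:

import re

def keep_tweet(text, lang, seen):
    """Illustrative filter: keep Portuguese tweets, drop retweets,
    strip HTML tags, require at least three words, deduplicate."""
    if lang != "pt":
        return None
    if text.startswith("RT @"):                   # retweet of a collected message
        return None
    clean = re.sub(r"<[^>]+>", " ", text).strip()  # remove HTML tags
    if len(clean.split()) < 3:                     # fewer than three words
        return None
    if clean in seen:                              # exact repetition
        return None
    seen.add(clean)
    return clean

# Toy usage; in the paper the input is the 42,930 crawled messages.
seen = set()
raw = [("ola <b>mundo</b> cruel", "pt"), ("hello world", "en"), ("oi", "pt")]
filtered = [t for t in (keep_tweet(x, l, seen) for x, l in raw) if t]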
Tweet Sampling. The procedure previously described resulted in 33,890 tweets. We noticed that the search instances returned numbers of tweets of different magnitudes (e.g., some profiles had only around 30 messages while others had more than 3,000). We therefore decided to use a maximum of 200 tweets per search instance in order to keep a more diverse set of tweet sources.

Final Dataset. Our final dataset contains 5,668 tweets with content from 1,156 different users. The majority of the tweets (more than 95%) are from January, February, and March of 2017.

4 Annotation of Hate Speech

In what follows, we present the annotation procedures for binary and hierarchical hate speech annotation.

4.1 Binary annotation

Three annotators classified every message. 18 Portuguese native speakers (Information Science student volunteers) were given annotation guidelines to perform the task (cf. Appendix A.1). All of them received an equivalent number of messages. The annotation was binary: the annotators had to label each message as ‘hate speech’ or ‘not hate speech’.

To check the agreement between the three classifications of every message, we used Fleiss’s Kappa (Fleiss, 1971). We observed a low agreement, with a value of K = 0.17. We think that this low value is the result of relying exclusively on non-expert annotators for classifying hate speech. For instance, in Waseem and Hovy (2016), the two annotators were the author of the study plus a gender studies student. Moreover, the two other studies mentioned in Section 2 (de Pelle and Moreira, 2017; Davidson et al., 2017) are more generic in that they do not focus exclusively on hate speech (as we do), but rather consider offensive speech in general, which includes insults that are more explicit and easier to recognize, while hate speech is subtler and more difficult to identify.

For our final annotation, we applied the majority vote, which resulted in a dataset in which 31.5% of the messages are annotated as ‘hate speech’.
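For illustration, both the Fleiss’s Kappa statistic and the majority vote can be computed with statsmodels and numpy; this sketch assumes the three binary votes per tweet are stored as the rows of a matrix (the values below are toy data, not from the dataset):

import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# One row per tweet, one column per annotator (1 = hate speech, 0 = not);
# toy values only; the real matrix has 5,668 rows with three votes each.
votes = np.array([[1, 0, 1],
                  [0, 0, 0],
                  [1, 1, 1]])

table, _ = aggregate_raters(votes)   # per-tweet counts for each category
print(fleiss_kappa(table))           # agreement; the paper reports K = 0.17

majority = (votes.sum(axis=1) >= 2).astype(int)  # majority vote, final label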
4.2 Hierarchical annotation

When studying hate speech, it is possible to distinguish between different categories of it, like ‘racism’, ‘sexism’, or ‘homophobia’. A more fine-grained view can be useful in hate speech classification because each category has a specific vocabulary and specific ways of being expressed, such that creating a language model for each category may help to improve the automatic detection of hate speech (Warner and Hirschberg, 2012).

Another phenomenon we can observe when analyzing different categories of hate speech is their intersectionality. This concept appeared as an answer to the historical exclusion of black women from early women’s rights movements, which were often concerned with the struggles of white women alone. Intersectionality brings attention to the experiences of people who are subjected to multiple forms of discrimination within a society (e.g., being a woman and black) (Collins, 2015). Waseem (2016) introduces a hate speech labeling scheme that follows an intersectional approach. In addition to ‘racism’, ‘sexism’, and ‘neither’, they use the label “both”, arguing that the intersection of multiple oppression categories can differ from the forms of oppression it consists of (Crenshaw, 2018).

To better take different hate speech categories into account from an intersectional perspective, we define the hate speech annotation schema in terms of a hierarchical structure of classes.

4.2.1 Hate speech and hierarchical classification

In hierarchical classification, there is a structure defining the hierarchy between the categories of the problem (Dumais and Chen, 2000). This is opposed to flat classification, where categories are treated in isolation. Several structures can be used to represent a hierarchy of classes. One of them is a Rooted Directed Acyclic Graph (rooted DAG), where each class corresponds to a node and can have more than one parent. Another property of this graph is that documents can be assigned to terminal categories and to non-terminal node categories alike (Hao et al., 2007). In the specific case of hate speech classification, we propose to use a rooted DAG in order to be able to cover hate speech subtypes and their intersections, as exemplified in Figure 2. The graph of classes has the following properties (a code sketch follows the list and Table 1):

• The ‘hate speech’ class corresponds to the root of the graph.

• Since hate speech can be divided into several types of hate, several nodes descend from the root node. This gives rise to the second level of classes (Table 1), organized according to the targets of the hate (e.g., ‘racism’, ‘homophobia’, and ‘sexism’).

• This second level of nodes can in turn be divided into subgroups of targets. For instance, racist messages can be targeted against black people, Chinese people, Latinos, etc.

• The division of classes continues until no more distinct groups can be found, resulting in a terminal node.

• The lower nodes of the graph inherit the classes from the upper nodes, up to the root.

• The lower nodes of the graph can have one or more parents. In the second case, this gives rise to a class that intersects the parent classes.

• Instances are classified according to a multilabel approach and can belong to classes assigned to terminal and/or non-terminal nodes.

Class | Definition
Sexism | Hate speech based on gender; includes hate speech against women.
Body | Hate speech based on the body, such as against fat, thin, tall, or short people.
Origin | Hate speech based on the place of origin.
Homophobia | Hate speech based on sexual orientation.
Racism | Hate speech based on ethnicity.
Ideology | Hate speech based on a person’s ideas, such as feminist or left-wing ideology.
Religion | Hate speech based on religion.
Health | Hate speech based on health conditions, such as against disabled people.
Other-Lifestyle | Hate speech based on life habits, such as vegetarianism.

Table 1: Direct subtypes of the ‘hate speech’ type.
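To make these properties concrete, here is a small sketch (our own illustration, using node names from Tables 1 and 3; the representation is not from the paper) that encodes the rooted DAG as a child-to-parents map and expands an assigned label to all of its ancestors, as required by the inheritance and multilabel properties above:

# Sketch of the class DAG as a child -> parents map (an excerpt of Table 3).
PARENTS = {
    "Sexism": ["Hate speech"],
    "Racism": ["Hate speech"],
    "Women": ["Sexism"],
    "Black people": ["Racism"],
    "Black women": ["Women", "Black people"],  # two parents: an intersection class
}

def with_ancestors(labels):
    """Expand assigned labels upward: lower nodes inherit all upper classes."""
    expanded = set()
    stack = list(labels)
    while stack:
        node = stack.pop()
        if node not in expanded:
            expanded.add(node)
            stack.extend(PARENTS.get(node, []))
    return expanded

# A tweet labeled with the terminal node 'Black women' also carries
# 'Women', 'Black people', 'Sexism', 'Racism', and 'Hate speech'.
print(sorted(with_ancestors({"Black women"})))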
This annotation schema has several advantages compared to standard binary or disjoint flat classification. Firstly, it models the relationships between different subtypes of hate speech in a better way. Additionally, it preserves rare classes, while signaling them as part of more generic classes.
[...] observed K = 0.72. We also considered the agreement of the annotators by type of hate speech. We ranked the classes by best agreement and removed the classes with only one instance for any of the annotators. We found diverse values across the different categories (Table 2), which indicates that some specific types of hate speech can be more difficult to classify than others.
Class | ND | Parent nodes | Freq
Hate speech | 0 | – | 1228
Sexism | 1 | Hate speech | 672
Women | 2 | Sexism | 544
Homophobia | 1 | Hate speech | 322
Homossexuals | 2 | Homophobia | 288
Lesbians | 3 | Homossexuals, Women | 248
Body | 1 | Hate speech | 164
Fat people | 2 | Body | 160
Fat women | 3 | Women, Fat people | 153
Ugly people | 2 | Body | 131
Ugly women | 3 | Women, Ugly people | 130
Racism | 1 | Hate speech | 94
Ideology | 1 | Hate speech | 92
Migrants | 1 | Hate speech | 82
Men | 2 | Sexism | 70
Refugees | 2 | Migrants | 70
Feminists | 2 | Ideology, Sexism | 65
Gays | 3 | Homossexuals | 56
Black people | 2 | Racism | 52
Religion | 1 | Hate speech | 30
Left wing ideology | 2 | Ideology | 26
Origin | 1 | Hate speech | 26
Trans women | 3 | Women, Transexuals | 26
Other-Lifestyle | 1 | Hate speech | 20
Islamists | 2 | Religion | 17
Immigrants | 2 | Migrants | 15
Transexuals | 2 | Sexism | 14
Muslims | 2 | Religion | 11
Black women | 3 | Women, Black people | 8
Criminals | 2 | Other-Lifestyle | 8
Latins | 2 | Racism, Origin | 7
Health | 1 | Hate speech | 6
Rural people | 2 | Origin | 6
Travestis | 3 | Women | 6
Aborting women | 3 | Women | 5
Asians | 2 | Racism, Origin | 5
Brazilians | 3 | South Americans | 5
Disabled people | 2 | Health | 5
South Americans | 2 | Origin | 5
Africans | 2 | Origin | 4
Ageing | 1 | Hate speech | 4
Angolans | 3 | Africans | 4
Nordestines | 3 | Rural people, Brazilians | 4
Chinese | 3 | Asians | 3
Homeless | 2 | Other-Lifestyle | 3
Arabic | 2 | Origin | 2
Bissexuals | 2 | Homophobia | 2
Blond women | 2 | Women, Body | 2
East europeans | 2 | Origin | 2
Jews | 2 | Religion | 2
Jornalists | 2 | Other-Lifestyle | 2
Old people | 2 | Ageing | 2
Thin people | 2 | Body | 2
Thin women | 3 | Women, Thin people | 2
Vegetarians | 2 | Other-Lifestyle | 2
White people | 2 | Racism | 2
Young people | 2 | Ageing | 2
Agnostic | 2 | Ideology | 1
Argentines | 3 | Latins | 1
Autists | 2 | Health | 1
Brazilian women | 3 | Women, South Americans | 1
Egyptians | 3 | Arabic | 1
Football players women | 2 | Women, Other-Lifestyle | 1
Gamers | 2 | Other-Lifestyle | 1
Homeless women | 3 | Women, Homeless | 1
Indigenous | 2 | Racism | 1
Iranians | 3 | Arabic | 1
Japaneses | 3 | Asians | 1
Men Feminists | 3 | Feminists, Men | 1
Mexicans | 3 | Latins | 1
Muslim women | 3 | Muslims, Women | 1
Old women | 3 | Women, Old people | 1
Polyamorous | 2 | Other-Lifestyle | 1
Poor people | 2 | Other-Lifestyle | 1
Russians | 3 | East europeans | 1
Sertanejos | 3 | Rural people, Brazilians | 1
Street artists | 2 | Other-Lifestyle | 1
Ucranians | 3 | East europeans | 1
Venezuelans | 3 | Latins | 1

Table 3: Hate subclasses (Class) and their respective parent categories (Parent nodes), sorted by frequency (Freq). The node depth (ND) is also given.
[...] before the classification to extract 50 dimensions as input to the xgBoost algorithm, a gradient boosting implementation provided as a Python library (Chen and Guestrin, 2016). (We also experimented with a higher dimensionality, but this did not improve the performance of the classifier.)

For xgBoost, the default parameter settings were used, except for ‘eta’ and ‘gamma’. Here, we conducted a grid search combining several values of both (eta: 0, 0.3, 1; gamma: 0.1, 1, 10) in order to obtain the optimal eta and gamma settings. Figure 3 shows a graphical representation of our model.

[Figure 3: Classification method used as baseline for binary hate speech classification with the Portuguese dataset: raw data → pre-trained word embeddings → LSTM (last layer) → xgBoost.]
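Since the beginning of Section 5 is not preserved here, the following sketch reconstructs the pipeline of Figure 3 under stated assumptions: Keras stands in for the unnamed deep learning library, the LSTM’s last layer is fixed at 50 units to match the 50 extracted dimensions, and the data, the embedding size, and the number of epochs are placeholders; only the eta/gamma grid is taken from the text.

import numpy as np
from tensorflow.keras import layers, models
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# Toy stand-ins; in the paper, inputs are sequences over pre-trained
# word embeddings with binary hate/no-hate labels.
X = np.random.randint(0, 1000, size=(500, 30))
y = np.random.randint(0, 2, size=500)

# LSTM whose last recurrent layer has 50 units, matching the 50 dimensions
# extracted as features (all other sizes are placeholders).
inp = layers.Input(shape=(30,))
emb = layers.Embedding(1000, 100)(inp)  # would be initialized with pre-trained embeddings
hid = layers.LSTM(50)(emb)
out = layers.Dense(1, activation="sigmoid")(hid)
model = models.Model(inp, out)
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(X, y, epochs=1, verbose=0)

# Reuse the 50-dimensional hidden representation as input to xgBoost.
extractor = models.Model(inp, hid)
features = extractor.predict(X, verbose=0)

# Grid search over eta (learning_rate) and gamma, as stated in the text.
grid = GridSearchCV(
    XGBClassifier(),
    {"learning_rate": [0, 0.3, 1], "gamma": [0.1, 1, 10]},
    scoring="f1_micro", cv=5,
)
grid.fit(features, y)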
5.2 Results

In this section, we present the results of our experiment on the classification of hate speech in Portuguese. Table 4 shows the baseline results of the LSTM-based model on our new dataset. We provide the cross-validation and test set F1 scores, as well as the number of instances used in each of them (N). The results show a state-of-the-art outcome. We can thus assume that, even if annotated merely in terms of basic binary (‘hate’ vs. ‘not hate speech’) labels, our dataset already constitutes a valid hate speech resource.

Hate speech dataset (PT)
CV f1-score | 0.78
training data (N) | 5099
test set f1-score | 0.72
testing data (N) | 567

Table 4: Results of Portuguese hate speech classification with the new dataset presented in this paper for binary classification. We provide the micro-averaged F1 scores and the number of instances used in each of the datasets (N).
6 Ethical considerations

Regarding the ethical aspects of this study, we took into consideration the privacy of the authors of the collected messages. However, we acknowledge the limitations of our sampling procedure when studying online hate speech. The data was anonymized by omitting the tweet id. As a consequence, it is possible to reach the original tweet and user only through a search for the exact text of the tweet. To prevent this as well, we make our dataset available on GitHub only for research purposes, under the condition that no such search is performed. A disclaimer is attached, stating that any attempt to violate the privacy of Twitter users is against the established usage conditions, and that the authors of this paper cannot be held liable for such a violation.

As far as the quality of the data collection is concerned, sampling bias may have been introduced. Firstly, the Twitter API was used, and it provides only a subset of all the data posted on the platform. Secondly, we used a set of keywords and crawled profiles based on our own decision criteria, as explained in Section 3. However, we do not aim to have a representative sample of online hate speech on Twitter. We consider our method adequate for building a dataset with examples of hate speech, and we could find diverse hate speech instances belonging to 80 different classes.

7 Conclusions and Future Work

In this work, we built a Portuguese dataset for research in hate speech detection.

To gather our data, we crawled Twitter for messages and manually annotated them using guidelines. Firstly, we developed a method for binary classification using the classifications of three annotators per message as ground truth. With this dataset, we conducted a baseline classification experiment using pre-trained word embeddings and an LSTM, achieving very competitive performance.

Furthermore, we provided a hierarchical hate speech labeling schema that integrates the complexity of hate speech subtypes and their intersections. This allowed us to find out that distinct types of hate speech present different agreement levels between annotators.
Therefore, future guidelines for annotation may benefit from specifying the particularities of the different subtypes of hate speech.

As far as future work is concerned, in the context of the annotation procedure, the agreement between annotators can still be improved. We think that the subjectivity of the task makes the learning process challenging, and more specific training is necessary for the annotators. Additionally, based on our experiment, we suggest that future data collection procedures should ensure sampling of different subtypes of hate in order to improve the identification of less common subtypes.

Finally, in future explorations of this dataset, we will experiment with multilabel classification of hate speech to identify not only whether a message contains hate, but also the targeted groups.

Acknowledgments

This work was partially funded by the Google DNI project Stop PropagHate. Soler-Company and Wanner have been supported by the European Commission under the contract numbers H2020–7000024-RIA and H2020-786731-RIA. We would like to thank the anonymous reviewers for their insightful comments and the annotators for their contribution to this work.

References

Thomas Davidson, Dana Warmsley, Michael Macy, and Ingmar Weber. 2017. Automated hate speech detection and the problem of offensive language. In Proceedings of ICWSM.

Fabio Del Vigna, Andrea Cimino, Felice Dell'Orletta, Marinella Petrocchi, and Maurizio Tesconi. 2017. Hate me, hate me not: Hate speech detection on Facebook. In Proceedings of the First Italian Conference on Cybersecurity, pages 86–95.

Susan Dumais and Hao Chen. 2000. Hierarchical classification of web content. In Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 256–263. ACM.

Elisabetta Fersini, Paolo Rosso, and Maria Anzovino. 2018. Overview of the task on automatic misogyny identification at IberEval 2018.

Joseph L. Fleiss. 1971. Measuring nominal scale agreement among many raters. Psychological Bulletin, 76(5):378.

Paula Fortuna and Sérgio Nunes. 2018. A survey on automatic detection of hate speech in text. ACM Computing Surveys (CSUR), 51(4):85.

Antigoni-Maria Founta, Despoina Chatzakou, Nicolas Kourtellis, Jeremy Blackburn, Athena Vakali, and Ilias Leontiadis. 2018. A unified deep learning architecture for abuse detection. arXiv preprint arXiv:1802.00385.
[...] Proceedings of the 11th Brazilian Symposium in Information and Human Language Technology, pages 122–131, Uberlândia, Brazil. Sociedade Brasileira de Computação.

Hatebase. 2019. Hatebase. Available at https://www.hatebase.org/, last accessed February 2019.

Jigsaw. 2018. Toxic comment classification challenge: Identify and classify toxic online comments. Available at https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge, last accessed 23 May 2018.

Ritesh Kumar, Atul Kr. Ojha, Shervin Malmasi, and Marcos Zampieri. 2018. Benchmarking aggression identification in social media. In Proceedings of the First Workshop on Trolling, Aggression and Cyberbullying (TRAC), Santa Fe, USA.

Yashar Mehdad and Joel Tetreault. 2016. Do characters abuse more than words? In Proceedings of the SIGDIAL 2016 Conference: The 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 299–303.

Chikashi Nobata, Joel Tetreault, Achint Thomas, Yashar Mehdad, and Yi Chang. 2016. Abusive language detection in online user content. In Proceedings of the 25th International Conference on World Wide Web, pages 145–153. International World Wide Web Conferences Steering Committee.

Ji Ho Park and Pascale Fung. 2017. One-step and two-step classification for abusive language detection on Twitter. In Proceedings of the First Workshop on Abusive Language Online.

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. 2011. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830.

Rogers Prates de Pelle and Viviane P. Moreira. 2017. Offensive comments in the Brazilian web: A dataset and baseline results. In 6th Brazilian Workshop on Social Network Analysis and Mining (BraSNAM 2017), volume 6. SBC.

Radim Řehůřek and Petr Sojka. 2010. Software framework for topic modelling with large corpora. In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, pages 45–50, Valletta, Malta. ELRA. http://is.muni.cz/publication/884893/en.

Björn Ross, Michael Rist, Guillermo Carbonell, Ben Cabrera, Nils Kurowsky, and Michael Wojatzki. 2016. Measuring the reliability of hate speech annotations: The case of the European refugee crisis. In Proceedings of NLP4CMC III: 3rd Workshop on Natural Language Processing for Computer-Mediated Communication, pages 6–9.

Joni Salminen, Hind Almerekhi, Milica Milenković, Soon-gyo Jung, Jisun An, Haewoon Kwak, and Bernard J. Jansen. 2018. Anatomy of online hate: Developing a taxonomy and machine learning models for identifying and classifying hate in online news media. In Twelfth International AAAI Conference on Web and Social Media.

Manuela Sanguinetti, Fabio Poletto, Cristina Bosco, Viviana Patti, and Marco Stranisci. 2018. An Italian Twitter corpus of hate speech against immigrants. In Proceedings of LREC.

William A. Schabas. 2000. Hate speech in Rwanda: The road to genocide. McGill Law Journal, 46:141.

Anna Schmidt and Michael Wiegand. 2017. A survey on hate speech detection using natural language processing. SocialNLP 2017, page 1.

William Warner and Julia Hirschberg. 2012. Detecting hate speech on the world wide web. In Proceedings of the Second Workshop on Language in Social Media, pages 19–26. Association for Computational Linguistics.

Zeerak Waseem. 2016. Are you a racist or am I seeing things? Annotator influence on hate speech detection on Twitter. In Proceedings of the 1st Workshop on Natural Language Processing and Computational Social Science, pages 138–142.

Zeerak Waseem and Dirk Hovy. 2016. Hateful symbols or hateful people? Predictive features for hate speech detection on Twitter. In Proceedings of NAACL-HLT, pages 88–93.
A Appendices

A.1 Non-expert annotator guidelines translated to English

Analyse the tweets from the first set and evaluate whether, in your opinion, these tweets contain hate speech.

For every tweet, manually mark with 1 or 0 whether or not you think the tweet contains hate, respectively, in accordance with Table 5.

Tweet | HS
Black people should go back to their land!! | 1
Meat and black beans are delicious! | 0
Muslim people are terrorists! | 1

Table 5: Annotation examples from the guidelines (HS = hate speech label).