An Explainable Artificial Intelligence Model for Detecting
Xenophobic Tweets
Gabriel Ichcanziho Pérez-Landa 1 , Octavio Loyola-González 2, * and Miguel Angel Medina-Pérez 1,2
1 School of Science and Engineering, Tecnologico de Monterrey, Carretera al Lago de Guadalupe Km. 3.5,
Atizapán 52926, Mexico; [email protected] (G.I.P.-L.); [email protected] (M.A.M.-P.)
2 Altair Management Consultants, Calle de José Ortega y Gasset 22-24, 5th Floor, 28006 Madrid, Spain
* Correspondence: [email protected]
Abstract: Xenophobia is a social and political behavior that has been present in our societies since the beginning of humanity. Feelings of hatred, fear, or resentment arise toward people from communities different from our own. With the rise of social networks like Twitter, hate speech has spread swiftly because of the pseudo-anonymity that these platforms provide. Sometimes this violent behavior on social networks, which begins as threats or insults against third parties, breaks the Internet barrier to become an act of real physical violence. Hence, this proposal aims to correctly classify xenophobic posts on social networks, specifically on Twitter. In addition, we collected a xenophobic tweets database from which we also extracted new features by using a Natural Language Processing (NLP) approach. Then, we provide an Explainable Artificial Intelligence (XAI) model, allowing us to understand better why a post is considered xenophobic. Consequently, we provide a set of contrast patterns describing xenophobic tweets, which could help decision-makers prevent acts of violence caused by xenophobic posts on Twitter. Finally, our interpretable results, based on our new feature representation approach jointly with a contrast pattern-based classifier, obtain classification results similar to other feature representations jointly with prominent machine learning classifiers, which are not easy to understand by an expert in the application area.
Citation: Pérez-Landa, G.I.; Loyola-González, O.; Medina-Pérez, M.A. An Explainable Artificial Intelligence Model for Detecting Xenophobic Tweets. Appl. Sci. 2021, 11, 10801. https://doi.org/10.3390/app112210801
Keywords: Xenophobia; Twitter; Explainable Artificial Intelligence
The following examples of violence preceded by comments on social networks were extracted from the Citizens Crime Commission of New York City [9]:
• SUMMARY {Date: 23 July 2017, Place: Nashville, TN, USA, Platform: Facebook}:
“A 20-year-old man and their 37-year-old mother were shot and killed in their home hours after
the 20-year-old posted on Facebook multiple photos of himself with large wads of cash, jewelry,
and shopping bags.”
• SUMMARY {Date: 6 October 2016, Place: St. Louis, MO, USA, Platform: Twitter}:
“An 18-year-old man fatally shot a 33-year-old police officer who was responding to a distur-
bance call. The shooter had repeatedly threatened violence on their Twitter page for months
before the shooting.”
With the aim of stopping the hatred, racism, and Xenophobia present on the Internet, many web pages have rules for their users that prohibit these types of behavior. However, a post with violent content remains visible until it is flagged as abusive by an administrator or by a system that does not work in real time; while it remains online, it can create a wave of violence [4]. Facebook has announced that there is no place for hate speech on their social network and that they would battle against racism and Xenophobia. However, the solution proposed by Facebook and Twitter indicates that the problem depends on human effort, leaving users with the responsibility of reporting offensive comments [10].
According to Pitsilis et al. [11], detecting offensive posts requires a great deal of work from human annotators, but it is a subjective task prone to personal interpretation and bias. As Nobata et al. [12] mentioned, automating the detection of abusive posts has become very important due to the growth of communication among people on the Internet.
Each social network has its own privacy policy, which may or may not allow developers to analyze the publications that users make on their platforms. For example, Facebook does not permit the extraction of comments from publications unless these comments come from a page that you manage [13], although there are services such as Export Comments [14] that allow this information to be obtained. Even so, such services only allow downloading publications with fewer than 485 comments, for a price of USD 11. In contrast, Twitter natively has an API that enables developers to download their users' publications through the Twitter Streaming API and the Twitter REST API [15].
Twitter is a social network characterized by the briefness of its posts, with a maximum of 280 characters. In the first quarter of 2019, Twitter reported 330 million users and 500 million tweets per day [16]. In the United States, Twitter is a powerful communication tool for politicians since it allows them to express their position and share their thoughts with a large share of the country's population. Such an opinion can dramatically change citizens' behavior, even if it was only written on Twitter [17]. Based on the above, an open problem is detecting xenophobic tweets by using an automated Machine Learning model that allows experts to understand why a tweet has been classified as xenophobic.
Hence, this research focuses on developing an Explainable Artificial Intelligence (XAI) model for detecting xenophobic tweets. The main contribution of this research is to provide an XAI model in a language close to experts in the application area, such as psychologists, sociologists, and linguists. Consequently, this model can be used to analyze and predict the xenophobic behavior of users on social networks.
As a part of this research, we have created a Twitter database in collaboration with experts in international relations, sociology, and psychology. The experts have helped us to classify xenophobic posts in our Twitter database proposal. Then, based on this database, we have extracted new features using Natural Language Processing (NLP), jointly with the XAI approach, creating a robust and understandable model for experts in the field of Xenophobia classification, particularly experts in international relations.
This document is structured as follows: Section 2 provides preliminaries about Xeno-
phobia and contrast pattern-based classification. Section 3 shows a summary of works
related to Xenophobia and hate-speech classification. Section 4 introduces our approach for Xenophobia detection on Twitter. Section 5 describes our experimental setup. Section 6 contains our experimental results as well as a brief discussion of the results. Finally, Section 7 presents the conclusions and future work.
2. Preliminaries
2.1. Xenophobia
According to Wicker [18], Xenophobia is a hostile response that society has towards foreign people, producing stereotypes and prejudices against these unfamiliar people. Usually, this aggressive response has political or individual purposes, which seek to improve society's cohesion through discrimination against foreign people. This xenophobic response can become so strong as to be a distinctive feature of a population. Additionally, Wicker mentions that the fear and hatred with which Xenophobia is fostered are qualities based on subjective experiences, which have a background in the education and values that we receive from society. In this way, Xenophobia, being a hostile response, can also be considered social behavior, which can be used to control communities, in which hatred of the other becomes a way not only to generate identity but also to promote acts of violence against third parties [19].
For Crush [20], Xenophobia proceeds through dynamic public rhetoric, which shows contempt through verbal offenses or acts of violence. Xenophobia actively stigmatizes immigrants, calling them “threats” to society, and thus making them “the cause of social problems.”
It is essential to understand the differences between racism and Xenophobia. Although these phenomena are often intertwined, they can also present themselves separately, and each one of them entails different societal problems. Therefore, the means
to solve and correct these social behaviors are different. On the one hand, racism implies
discrimination against human beings based on their physical characteristics, such as skin
tone, weight, height, and facial features, among others. On the other hand, Xenophobia
denotes “behavior specifically based on the perception that the other is alien or originates from
outside the community or nation” [21].
Some of the definitions of Xenophobia were presented at the international conference
in Migration, Racism, Discrimination, and Xenophobia [21]:
• By the standard dictionary definition: Xenophobia is the intense dislike or fear of
strangers or people from other countries.
• From a sociologist’s point of view: Xenophobia is an attitudinal orientation of hostility
against non-natives in a given population.
• From the World Conference against Racism: Xenophobia describes attitudes, prejudices,
and behavior that reject, exclude, and often vilify persons, based on the perception
that they are outsiders or foreigners to the community, society, or national identity.
Xenophobia has taken hold on social networks, where it spreads quickly. The number of discourses of violence that motivate discrimination or violence against immigrants as well as minority groups has grown enormously [22]. Due to this growth, experts
have questioned states and social media companies on stopping this spread. Consequently,
these online and offline hate speeches have increased social tension, leading to violent
attacks that can end in death [23]. For these reasons and more than ever, social networks
must take action to mitigate this behavior on their platforms and reduce the probability
that people will be injured in the real world [24].
One existing problem is how to know when a comment is xenophobic. According to Bucio [25], in Mexico, posts with hate hashtags are published daily, among which are: #indio, #puto, #naco, among others. The posts with hate hashtags cause various social problems such as classism, homophobia, racism, sexism, Xenophobia, etc. [25]. The most
alarming thing about this xenophobic behavior on social networks is that public figures
write some of these xenophobic comments. Additionally, Bucio mentioned that public figures are not penalized because their xenophobic posts are treated as “black humor or harmless comments”, allowing several people to spread hate speech hidden in “humor” publications while downplaying the fact that they are normalizing xenophobic behaviors. Because such posts are framed as “humor”, the people who write them do not contemplate the consequences that their comments may have on people's lives, such as sadness, pain, distress, humiliation, isolation, and dignitary insult [26].
The problem with writing xenophobic posts is that we are unaware of how dangerous our behavior can be on social networks. The moment we start to spread publications that incite discrimination and that promote hatred and violence towards others, we become complicit in the consequences that these may have [27]. Threats, insults, blows, and even attacks that end in the death of third parties are caused day by day as a result of the normalization of xenophobic behavior on social networks [28].
Social networks are aware of xenophobic behavior; however, there are still no quick and precise measures that address this issue with the importance it needs. The lack of an automatic tool for detecting xenophobic publications means that such posts stay online longer and can harm third parties until they are deleted. There are even cases where, “after deleting offensive posts”, they tend to reappear after a while [4].
Finally, the classification of xenophobic comments on social networks is a very recent topic [29–31]. According to Plaza-Del-Arco et al. [32], the classification of xenophobic posts is a poorly addressed topic. Besides, Loyola-González [33] mentions that there is currently a trend of transforming unexplainable (black-box) models into explainable (white-box) models, particularly in sectors such as health care. Hence, our proposal aims to classify xenophobic posts through an Explainable Artificial Intelligence model. By using XAI models, experts can obtain a set of explainable patterns describing xenophobic posts.
Mining contrast patterns requires constructing a Decision Tree (DT). A DT is grounded in graph theory: it can be represented as a directed graph in which any two vertices are connected by exactly one path [44]. The top-down approach is the most used method for inducing a decision tree [33]. This approach is based on the divide-and-conquer method [45]. It begins by creating a root node containing all the objects of the training database D; afterwards, it splits the root node into two disjoint subsets, the left child Dl and the right child Dr; this process is repeated recursively until a halt criterion is reached [46].
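A minimal sketch of this top-down induction scheme follows, assuming numeric features and Gini impurity as the split-selection criterion; the impurity measure, function names, and stopping parameters are illustrative assumptions, not the exact procedure of the miners cited here:
```python
from collections import Counter

def gini(D):
    """Gini impurity of a set of (feature_vector, label) pairs."""
    if not D:
        return 0.0
    counts = Counter(y for _, y in D)
    return 1.0 - sum((c / len(D)) ** 2 for c in counts.values())

def best_split(D):
    """Exhaustively pick the (feature, value) pair minimizing weighted impurity."""
    best_f, best_v, best_score = None, None, float("inf")
    for f in range(len(D[0][0])):
        for v in sorted({x[f] for x, _ in D}):
            Dl = [(x, y) for x, y in D if x[f] <= v]
            Dr = [(x, y) for x, y in D if x[f] > v]
            score = (len(Dl) * gini(Dl) + len(Dr) * gini(Dr)) / len(D)
            if score < best_score:
                best_f, best_v, best_score = f, v, score
    return best_f, best_v

def induce_tree(D, depth=0, max_depth=5, min_objects=5):
    """Top-down (divide-and-conquer) decision-tree induction."""
    labels = {y for _, y in D}
    # Halt criteria: pure node, too few objects, or maximum depth reached.
    if len(labels) == 1 or len(D) <= min_objects or depth == max_depth:
        return {"leaf": True, "objects": D}
    f, v = best_split(D)
    Dl = [(x, y) for x, y in D if x[f] <= v]   # left child
    Dr = [(x, y) for x, y in D if x[f] > v]    # right child
    if not Dl or not Dr:                        # degenerate split: stop
        return {"leaf": True, "objects": D}
    return {"leaf": False, "feature": f, "value": v,
            "left": induce_tree(Dl, depth + 1, max_depth, min_objects),
            "right": induce_tree(Dr, depth + 1, max_depth, min_objects)}
```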
According to Loyola-González [33], extracting contrast patterns from a single decision tree generates very few patterns; conversely, extracting patterns from several identical decision trees produces many duplicates. Building a collection of diverse decision trees mitigates both problems [47]. Each pattern is the conjunction of the properties [fi # vj] along a path from the root node to a leaf node; that is, any path from the root to a leaf determines a conjunction of properties, forming a pattern. Finally, only those patterns satisfying the contrast pattern condition are kept [3]. As an example, Figure 1 shows a hypothetical decision tree for Xenophobia classification.
Figure 1. A hypothetical decision tree for Xenophobia classification (root node splits on Hate-speech ≤ 0.20 vs. Hate-speech > 0.20).
From this decision tree, we can extract the following six contrast patterns:
• P1 = [Hate-speech ≤ 0.20] ∧ [Positive < 0.53] ∧ [Foreigners ≠ “Present”]
• P2 = [Hate-speech ≤ 0.20] ∧ [Positive < 0.53] ∧ [Foreigners = “Present”]
• P3 = [Hate-speech ≤ 0.20] ∧ [Positive ≥ 0.53]
• P4 = [Hate-speech > 0.20] ∧ [Violent Foreigners = “Present”]
• P5 = [Hate-speech > 0.20] ∧ [Violent Foreigners ≠ “Present”] ∧ [Angry ≤ 0.20]
• P6 = [Hate-speech > 0.20] ∧ [Violent Foreigners ≠ “Present”] ∧ [Angry > 0.20]
from which P1, P3, and P5 correspond to the Non-Xenophobia class, and the remaining patterns (P2, P4, P6) correspond to the Xenophobia class.
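A small sketch of this path-to-pattern extraction, reusing the tree dictionaries from the induction sketch above; the support check that turns a candidate into a genuine contrast pattern is left as a comment, since its exact form varies by miner:
```python
def extract_patterns(node, conditions=()):
    """Yield one candidate pattern (a conjunction of items) per root-to-leaf path."""
    if node["leaf"]:
        # In practice, keep this candidate only if it satisfies the contrast
        # pattern condition, i.e., its support differs significantly per class.
        yield list(conditions), node["objects"]
        return
    f, v = node["feature"], node["value"]
    yield from extract_patterns(node["left"], conditions + ((f, "<=", v),))
    yield from extract_patterns(node["right"], conditions + ((f, ">", v),))
```
For instance, pattern P4 above is exactly the conjunction collected along the path Hate-speech > 0.20 followed by Violent Foreigners = “Present”.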
For filtering the contrast patterns, there exist two main approaches: (i) based on set theory and (ii) based on quality measures [33]. The first approach removes duplicated and overly specific patterns and also removes redundant items. The second approach generates a pattern ranking in which the ranking position is based on the discriminative power of the patterns. The three main steps of the set-theory-based contrast pattern filtering process can be explained as follows (and are sketched in code after the list):
• Removing duplicated contrast patterns: when two or more contrast patterns have the same items and cover the same objects, they are called duplicated patterns. This problem usually happens because the contrast patterns come from several decision trees built from the same training database. Then, all but one of the duplicated contrast patterns are removed.
• Removing specific contrast patterns: commonly, some overly specific patterns are extracted in the mining process. Let us suppose that there are two contrast patterns P1 and P2 that belong to the same class. The pattern P2 is considered more specific than P1 if all the items contained in P1 are also in P2 but not vice versa. For example, let P1 = [Hate-speech ≤ 0.20] ∧ [Positive ≥ 0.53] and P2 = [Hate-speech ≤ 0.20] ∧ [Positive ≥ 0.53] ∧ [Negative < 0.24]. Since all the items present in P1 are also in P2, but P2 has one more item, P2 is considered more specific than P1. Therefore, according to [48], P2 should be removed.
• Removing redundant items: an item I1 is more general than another item I2 if all of the objects covered by I2 are also covered by I1, but not the other way around. We also say that I2 is redundant with I1. If two items in a pattern are redundant, the most general item is eliminated [3]. We can provide the following example of a pattern with redundant items: [Hate-speech > 0.20] ∧ [Hate-speech > 0.41], which is simplified to [Hate-speech > 0.41] because an entry with a hate-speech value greater than 0.41 is also greater than 0.20.
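The three steps above can be sketched roughly as follows; items are (feature, operator, value) triples as in the extraction sketch, and the duplicate check is simplified to compare item sets only (the object-coverage check is noted in a comment):
```python
def covers(pattern, obj):
    """True if every item [f # v] in the pattern holds for the object `obj`."""
    ops = {"<=": lambda a, b: a <= b, ">": lambda a, b: a > b,
           "=": lambda a, b: a == b, "!=": lambda a, b: a != b}
    return all(ops[op](obj[f], v) for f, op, v in pattern)

def remove_duplicates(patterns):
    """Keep one copy per distinct item set (full duplicates also cover the same objects)."""
    seen, kept = set(), []
    for p in patterns:
        key = frozenset(p)
        if key not in seen:
            seen.add(key)
            kept.append(p)
    return kept

def remove_specific(patterns):
    """Drop any pattern whose item set strictly contains another pattern's item set."""
    return [p for p in patterns
            if not any(set(q) < set(p) for q in patterns if q is not p)]

def remove_redundant_items(pattern):
    """Keep only the tightest bound per (feature, operator) pair, e.g.,
    [h > 0.20] AND [h > 0.41] simplifies to [h > 0.41]."""
    best = {}
    for f, op, v in pattern:
        key = (f, op)
        if key not in best:
            best[key] = v
        elif op == ">":
            best[key] = max(best[key], v)
        elif op == "<=":
            best[key] = min(best[key], v)
    return [(f, op, v) for (f, op), v in best.items()]
```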
To explain the second approach of filtering contrast patterns, based on quality measures, let us take the support as our quality measure. Let p be a pattern and C = {C1, C2, C3, . . . , Cn} a set of classes such that C1 ∪ C2 ∪ C3 ∪ . . . ∪ Cn = U; then, the support of p for a class Ci is obtained by dividing the number of objects belonging to Ci that are described by p by the total number of objects in Ci [36]. After obtaining the support of each contrast pattern, the patterns can be ranked from higher to lower support. Then, only the first n patterns are selected, and the rest are removed [38]. Additionally, a minimum threshold can be set for considering a pattern to have enough quality, eliminating all those patterns that do not reach the minimum threshold.
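In code, this support-based filter can be sketched as follows, reusing the covers helper above; the cut-off n and the minimum threshold are illustrative values:
```python
def support(pattern, class_objects):
    """Fraction of the objects in one class that the pattern covers."""
    return sum(covers(pattern, obj) for obj in class_objects) / len(class_objects)

def filter_by_support(patterns, class_objects, n=100, min_support=0.05):
    """Rank patterns by support in their class; keep the best n above a threshold."""
    ranked = sorted(patterns, key=lambda p: support(p, class_objects), reverse=True)
    return [p for p in ranked[:n] if support(p, class_objects) >= min_support]
```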
There are two main strategies for classification with contrast patterns: (i) unweighted and (ii) weighted scores. The unweighted strategy is easy to compute and understand but is not suitable for all kinds of problems, such as class imbalance problems, because it tends to be biased towards the majority class [36]. The weighted approach is more computationally expensive but is ideal for handling both balanced and imbalanced problems [33]. Specifically, for our Xenophobia detection problem on social networks, we have an imbalanced database. Accordingly, we decided to use PBC4cip [36] as our contrast pattern-based classifier, not only because it has been proved to obtain the best classification results jointly with PBCQE [43] for class imbalance problems, but also because PBC4cip provides significantly fewer patterns than other contrast pattern-based classifiers [36]. PBC4cip weights the sum of supports in each class at the training stage by considering all contrast patterns covering a query object and the class imbalance level. This weighting scheme differs from conventional classifiers, which only sum the supports [3].
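A loose sketch of such weighted voting is shown below; it reuses the covers helper from the filtering sketch and compensates for imbalance by rescaling each class's vote by its size. This is a simplification for illustration, not the published PBC4cip weighting scheme [36]:
```python
def classify(query, patterns_by_class, class_sizes):
    """Weighted-vote sketch: sum the supports of the patterns covering `query`
    per class, then rescale by class size so the minority class is not drowned."""
    total = sum(class_sizes.values())
    scores = {}
    for cls, patterns in patterns_by_class.items():
        # `patterns` is a list of (pattern, support) pairs for this class.
        raw = sum(sup for p, sup in patterns if covers(p, query))
        scores[cls] = raw * (total / class_sizes[cls])
    return max(scores, key=scores.get)
```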
Contrast pattern-based classifiers have been used to solve real-world problems, where they have managed to obtain similar or better results than other classifiers. Among the most relevant applications where contrast pattern-based classifiers have been applied are: improvement of road safety [49], rule construction from crime patterns [50], the discovery of unusual rules within cerebrovascular examinations [51], describing political figures [52], the observation of sales trends in dynamic markets [53], bot detection on Twitter [3], bot detection in Web log files [54], detection of alarm patterns in industrial alarm floods [55], complex activity recognition in smart homes [56], discriminating deviant behaviors in MOBA games [57], and summarizing significant changes in network traffic [58], among others.
3. Related Work
In this section, we present previous works related to our research. All these works
have similar semantics since their objective is to identify undesirable behaviors in social
networks using Machine Learning.
Pitropakis et al. [59] addressed the issue of Xenophobia classification on Twitter. To this end, they created a Xenophobia database on Twitter using keywords associated with Xenophobia. Additionally, they used a geolocation filter to focus on the UK, the USA, and Canada. Their database consists of 6085 labeled tweets, of which 3971 belong to the Non-Xenophobia class and 2114 to the Xenophobia class. Finally, to classify the tweets, they used Term Frequency–Inverse Document Frequency (TFIDF) [60] as their feature extraction method, and they also used word n-grams of length one to three and character n-grams of size one to four to create their tokens. They used Support Vector Machines (SVM) [61], Naïve–Bayes (NB) [62], and Logistic Regression (LR) [63] as their classifier models. They obtained an F1 score of 0.84, a recall of 0.87, and a precision of 0.85.
Plaza-Del-Arco et al. [32] compared three different approaches to deal with Spanish hate speech on social networks. The first approach used supervised machine learning classifiers, the second used deep learning techniques, and the last was performed using lexicon-based techniques. The problems addressed in their investigation were misogyny and Xenophobia classification on Twitter. To accomplish that, Plaza-Del-Arco et al. used a supervised machine learning approach based on Term Frequency–Inverse Document Frequency [60] jointly with the Naïve–Bayes [62], Support Vector Machines [61], Logistic Regression [63], Decision Tree, and Ensemble Voting (EV) machine learning classifiers. Furthermore, the FastText word embedding jointly with Recurrent Neural Networks (RNN) [64] and Long-Short-Term Memory (LSTM) [65] were used. The last approach was to build an emotion lexicon dictionary made of words related to misogyny and Xenophobia. Finally, using the supervised machine learning approach, they obtained their best results: an accuracy of 0.754, a precision of 0.747, a recall of 0.739, and an F1 score of 0.742. These results were obtained by using the Ensemble Voting classifier with unigrams and bigrams.
Charitidis et al. [66] proposed an ensemble of classifiers for the classification of tweets
that threaten the integrity of journalists. They brought together a group of specialists to
define which posts had a violent intention against journalists. Something worth noting is
that they used five different Machine Learning models among which are: Convolutional
Neural Network (CNN) [67], Skipped CNN (sCNN) [68], CNN+Gated Recurrent Unit
(CNN+GRU) [69], Long-Short-Term Memory [65], and LSTM+Attention (aLSTM) [70].
Charitidis et al. used those models to create an ensemble and tested their architecture
in different languages obtaining an F1 Score result of 0.71 for the German language and
0.87 for the Greek language. Finally, with the use of Recurrent Neural Networks [64] and
Convolutional Neural Networks [67], they extracted essential features such as the word or
character combinations and the word or character dependencies in sequences of words.
Pitsilis et al. [11] used Long-Short-Term Memory [65] classifiers to detect racist and sexist short posts, such as those found on the social network Twitter. Their innovation was to use a deep learning architecture with Word Frequency Vectorization (WFV) [11]. They obtained a precision of 0.71 for classifying racist posts and 0.76 for sexist posts. To train the proposed model, they collected a database of 16,000 tweets labeled as neutral, sexist, or racist.
Sahay et al. [71] proposed a model using NLP and Machine Learning techniques to identify cyberbullying comments and abusive posts in social media and online communities. They proposed to use four classifiers: Logistic Regression [63], Support Vector Machines [61], Random Forest (RF), and Gradient Boosting Machine (GB) [72]. They concluded that SVM and gradient boosting machines trained on the feature stack performed better than logistic regression and random forest classifiers. Additionally, Sahay et al. used Count Vector Features (CVF) [71] and Term Frequency–Inverse Document Frequency [60] features.
Nobata et al. [12] focused on the classification of abusive posts as neutral or harmful,
for which they collected two databases, both of which were obtained from Yahoo!. They
used the Vowpal Wabbit regression model [73] that uses the following Natural Language
Processing features: N-grams, Linguistic, Syntactic and Distributional Semantics (LS, SS,
DS). By combining all of them, they obtained a performance of 0.783 in the F1-score test
and 0.9055 AUC.
It is essential to highlight that all the investigations above collected their own databases; therefore, their results are not directly comparable. A summary of the publications mentioned above can be seen in Table 1. The related works described above seek the classification of hate posts on social networks through Machine Learning models. These investigations report relatively similar results, ranging between 0.71 and 0.88 in the F1-score test.
Beyond the performance that these classifiers can have, the problem of using black-box
models is that we cannot be sure what factors determine whether a message is abusive.
Today we need to understand the background of the behavior of ML models to make better
decisions [33,74]. This is why this work takes on the characteristics of previous works but
proposes a radical change in its intelligibility, offering experts in the field the possibility of
having a transparent tool that helps them classify xenophobic posts and understand why
these posts are considered in this way.
Table 1. Summary of previous work in terms of the problem they address, the data source used, features extracted, classifiers used, evaluation metrics, and the result obtained in the evaluation.
• Pitropakis et al. — Problem: Xenophobia. Database origin: Twitter. Extracted features: word n-grams, char n-grams, TF-IDF. Methods: LR, SVM, NB. Evaluation metrics: F1, Rec, Prec. Performance: 0.84 F1, 0.87 Rec, 0.85 Prec.
• Plaza-Del-Arco et al. — Problem: misogyny and Xenophobia. Database origin: Twitter. Extracted features: TF-IDF, FastText, emotion lexicon. Methods: LR, SVM, NB, Vote, DT. Evaluation metrics: F1, Rec, Prec, Acc. Performance: 0.742 F1, 0.739 Rec, 0.747 Prec, 0.754 Acc.
• Charitidis et al. — Problem: hate speech against journalists. Database origin: Wikipedia, Twitter, Facebook, other. Extracted features: word or character combinations; word or character dependencies in sequences of words. Methods: CNN, sCNN, CNN+GRU, LSTM, aLSTM. Evaluation metric: F1. Performance: English: 0.82, German: 0.71, Spanish: 0.72, French: 0.84, Greek: 0.87.
• Pitsilis et al. — Problem: sexism and racism. Database origin: Twitter. Extracted features: Word Frequency Vectorization. Methods: LSTM, RNN. Evaluation metric: F1. Performance: sexism: 0.76, racism: 0.71.
• Sahay et al. — Problem: cyberbullying. Database origin: train: Twitter and YouTube; test: Kaggle. Extracted features: Count Vector Features, TF-IDF. Methods: LR, SVM, RF, GB. Evaluation metrics: AUC, Acc. Performance: 0.779 AUC, 0.974 Acc.
• Nobata et al. — Problem: abusive language. Database origin: Yahoo! Finance and News. Extracted features: n-grams, linguistic semantics, syntactic semantics, distributional semantics. Method: Vowpal Wabbit's regression. Evaluation metrics: F1, AUC. Performance: 0.783 F1, 0.906 AUC.
Figure 2. The creation of the Xenophobia database consisted of downloading tweets through the Twitter API jointly with the Python Tweepy library. Then, Xenophobia experts took it upon themselves to manually label the tweets. [Figure panel titles: DATABASE CREATION; FEATURE REPRESENTATION CREATION; MINING CONTRAST PATTERNS.]
We decided to keep only the raw text of each tweet to make a Xenophobia classifier based only on text. We made this decision so that we could extrapolate this approach to other platforms, because each social network has additional information that might not exist or is difficult to access on other platforms [76]. For example, detailed profile information such as geopositioning, account creation date, and preferred language, among others, are characteristics challenging to obtain (or even not provided) on other social networks. In this way, the exclusion of additional information from the text allows focusing on its classification based solely on natural language processing techniques such as sentiment, semantic, and syntactic analysis [77], which is more versatile for applying to any platform containing posts. As an additional configuration for obtaining the analyzed tweets, we used the Tweepy geo_search method with the parameters (query=“USA”, granularity=“country”); consequently, it allowed us to collect tweets issued from the USA and written in English.
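A sketch of this collection step with Tweepy 3.x-style calls is shown below; the credentials are placeholders, and the keyword and count values are illustrative (the full keyword list is described later in this section):
```python
import tweepy

# Placeholder credentials; real values come from a Twitter developer account.
auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")
api = tweepy.API(auth, wait_on_rate_limit=True)

# Resolve the country place id used to restrict the search to the USA.
places = api.geo_search(query="USA", granularity="country")
usa_place_id = places[0].id

# One keyword per request; a single call returns at most 100 tweets.
tweets = api.search(q=f"immigration place:{usa_place_id}",
                    lang="en", count=100, tweet_mode="extended")
for tweet in tweets:
    print(tweet.full_text)
```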
These data were collected over five weeks, from 27 June to 31 August 2021. The tweets' publication dates correspond with their collection dates. Each week, 2000 tweets were downloaded. For the labeling process, we were supported by five experts: two psychologists, two experts in international relations, and a sociologist. These experts were in charge of labeling the tweets manually.
Since a single Twitter API call can return, at most, 100 tweets per searched term, we followed the same scheme used by Pitropakis et al. [59]: we used a set of keywords regarding Xenophobia instead of a single immigration term. Some of our xenophobic keywords were the same as the ones used by Pitropakis et al., such as immigration, migrant, and deport them all, while our experts proposed a new set of keywords, among which are illegal aliens, backcountry, and violent. Nevertheless, we also used a set of neutral terms to make our database more diversified, such as sports, food, travel, love, and money, among others. As a result, a total of 10,073 tweets were annotated.
The collected tweets were labeled into two categories: 8056 tweets were labeled as non-xenophobic and 2017 as xenophobic; hence, 79.97% of the labels correspond to the non-Xenophobia class and the remaining 20.03% belong to the Xenophobia class. Table 2 shows two random examples of tweets belonging to each class. Finally, our collected database was divided into 20 batches of 504 tweets each. Each expert was in charge of labeling four batches, for a total of 2016 tweets. After the first labeling process, a second pass was done by one of our experts in international relations. This second pass consisted of inspecting again all the tweets labeled as xenophobic and looking for any discrepancy.
Table 2 (fragment). Examples of tweets per class.
Class: Non-xenophobic
• “No wonder why the 4Chan CHUDs have misunderstood the meaning of this movie and then made it their foundational text. https://t.co/96M7rHy3fc”
• “i just received the best text in the world i truly love my friends so fucking much”
Figure 3. The creation of our feature representation proposal consisted of using three NLP APIs (Parallel Dots [80], Meaning Cloud [81], IBM NLU [82]) to perform sentiment analysis and the Python spaCy library [83] for extracting the syntactic features and the keywords of the tweets.
The second step was to perform the syntactic analysis, for which we used the spaCy Python library [83]. We employed the en_core_web_lg pipeline, which is specifically designed for blogs, news, and comments. This pipeline is pre-trained on the OntoNotes Release 5.0 data [86]. With this library, we were able to extract different linguistic features using the spaCy part-of-speech tagging implemented in its pipeline. For more information related to Universal POS tags, see https://universaldependencies.org/docs/u/pos/ (accessed on 15 April 2021). The linguistic features extracted from spaCy are listed below (a sketch of this extraction follows the list):
• ADJ: (INT) The number of adjectives presented in the tweet.
• AUX: (INT) The number of auxiliary verbs presented in the tweet.
• NUM: (INT) The number of numbers presented in the tweet.
• PROPN: (INT) The number of proper nouns presented in the tweet.
• ALPHAS: (INT) The number of words presented in the tweet that are not stopwords.
• HASHTAGS: (INT) The number of hashtags presented in the tweet.
• URLs: (BOOL) If the tweet has a URL or not.
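One plausible implementation of this counting step is sketched below; the exact counting rules (e.g., how hashtags and URLs are detected) are assumptions for illustration:
```python
import spacy

# Requires: python -m spacy download en_core_web_lg
nlp = spacy.load("en_core_web_lg")

def syntactic_features(tweet: str) -> dict:
    """Count the POS-based features described above for one tweet."""
    doc = nlp(tweet)
    counts = {"ADJ": 0, "AUX": 0, "NUM": 0, "PROPN": 0}
    for token in doc:
        if token.pos_ in counts:
            counts[token.pos_] += 1
    counts["ALPHAS"] = sum(1 for t in doc if t.is_alpha and not t.is_stop)
    counts["HASHTAGS"] = tweet.count("#")
    counts["URLs"] = any(t.like_url for t in doc)
    return counts

print(syntactic_features("Immigrant families deserve to live without fear in #Massachusetts https://t.co/x"))
```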
Finally, the third step was to extract the most representative xenophobic words of the database according to their frequency. Even though our database was created using xenophobic keywords, not all the tweets containing a xenophobic word were labeled as xenophobic. Furthermore, there are new terms related to Xenophobia that were not proposed when creating the database. The process to extract the new xenophobic keywords was as follows:
was as follows:
• Clean the tweet: To clean the tweet, we normalize the tweets by removing stopwords,
unknown characters, numbers, URLs, user mentions, and then apply lemmatization.
Lemmatization is a normalization technique [87], generally defined as “the transfor-
mation of all inflected word forms contained in a text to their dictionary look-up
form” [88].
• Get the frequency of the words: for each class (xenophobic and non-xenophobic), we generated a list of all the words belonging to the class, counted the frequency of each term, and obtained a dictionary in which each word is a key and its frequency is the value.
• Extract the xenophobic keywords: after getting the frequency of the words, they were sorted from the highest to the lowest frequency, and only the 20 most used words were selected. We considered two conditions to determine whether a word might be regarded as a xenophobic keyword. The first condition: the word only belongs to the xenophobic class, meaning that the term is present in the 20-most-used-words list of the Xenophobia class and does not belong to the other list. The second condition: the word is present in both lists, but the frequency of the word, relative to the class size, is greater in the Xenophobia list than in the non-Xenophobia list.
When we consider the proportion of tweets belonging to the Xenophobia and non-Xenophobia classes, we can see that for each tweet labeled as xenophobic, there are four tweets labeled as non-xenophobic. Hence, if a word has the same absolute frequency in both classes, the word is, relatively, four times more used in the xenophobic class. The above process was repeated to obtain bigrams, i.e., sequences of two words that appear together or near each other. As a result, the following list of ten terms was obtained, five unigrams and five bigrams: country, illegal, foreigners, alien, criminal, back country, illegal alien, violent foreigners, criminal foreigners, criminal migrant. A sketch of this keyword-selection process is shown below.
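A rough sketch of the unigram selection follows, assuming the ~4:1 class ratio above is used to re-weight raw frequencies; the clean stand-in and the exact comparison rule are simplifications of the process described in the text:
```python
import re
from collections import Counter

def clean(text):
    """Minimal stand-in for the full normalization (stopwords, lemmas, etc.)."""
    return re.sub(r"[^a-z\s]", " ", text.lower())

def class_keywords(xeno_tweets, other_tweets, top=20, ratio=4.0):
    """Select keywords per the two conditions described above."""
    xeno = Counter(w for t in xeno_tweets for w in clean(t).split())
    other = Counter(w for t in other_tweets for w in clean(t).split())
    top_other = {w for w, _ in other.most_common(top)}
    keywords = []
    for w, freq in xeno.most_common(top):
        only_in_xeno = w not in top_other
        # Re-weight by the class imbalance before comparing frequencies.
        relatively_more_frequent = w in top_other and freq * ratio > other[w]
        if only_in_xeno or relatively_more_frequent:
            keywords.append(w)
    return keywords
```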
Table 4 shows the number of features grouped by different key labels for our INTER feature representation. In total, 37 features were used to construct our new feature representation proposal, of which 20 came from the sentiment analysis, seven were extracted from the syntactic analysis, and the last ten came from the xenophobic keyword extraction process described above. Finally, Table 5 shows an example of two tweets extracted from EXD, one belonging to the non-Xenophobia class and the other to the Xenophobia class. These tweets were transformed using our interpretable feature representation, and Table 6 shows each feature grouped by different key labels.
Table 4. Distribution of the features presented in our INTER feature representation. The overall column shows the total number of features.
Table 5. Example tweets from EXD per class.
• Non-Xenophobia: “Immigrant families deserve to live without fear in Massachusetts, especially amid the #COVID19 pandemic. It’s a moral imperative. Let us align our laws with our values! Pass the #SafeCommunitiesAct ASAP! @MassGovernor @KarenSpilka @SpeakerDeLeo #MALeg”
• Xenophobia: “@EUTimesNET I do not know what liberal idiot runs your site but the USA is not a hellhole. We may have racist terrorists running around burning things but Europe has violent migrants raping women, vandalizing churches and attacking Christians. You’re far from a model region.”
Table 6. Example of our interpretable feature representation for tweets belonging to the Xenophobia
and non-Xenophobia class grouped by different key labels.
Figure 4. The extraction of the contrast patterns consists of three phases: mining, filtering, and classification.
According to Dong and Bailey [38], a pattern is a condition on data tuples that evaluates to either true or false. To be considered a pattern, the statement must be much simpler and shorter than the original data. Ordinarily, a pattern is represented by a conjunction of relational statements, each with the form [fi # vj], where vj is a value within the space of feature fi, and # is a relational operator taken from the set {=, ≠, ≤, >, ∈, ∉} [33,36,38]. For example, [violent foreigners = “present”] ∧ [hate-speech ≥ 0.11] is a pattern describing xenophobic posts.
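As a concrete illustration, this example pattern can be encoded and evaluated against a tweet's feature values as follows (the feature names and values are hypothetical):
```python
# The example pattern: [violent_foreigners = "present"] AND [hate_speech >= 0.11].
pattern = [("violent_foreigners", "=", "present"), ("hate_speech", ">=", 0.11)]

# Hypothetical feature values computed for one tweet.
tweet_features = {"violent_foreigners": "present", "hate_speech": 0.35}

OPS = {"=": lambda a, b: a == b, "!=": lambda a, b: a != b,
       "<=": lambda a, b: a <= b, ">": lambda a, b: a > b,
       ">=": lambda a, b: a >= b}

covered = all(OPS[op](tweet_features[f], v) for f, op, v in pattern)
print(covered)  # True: the pattern describes this tweet
```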
In comparison, contrast patterns are a type of pattern whose supports differ significantly among the analyzed databases [38]. There are three steps to build a contrast pattern-based classifier: mining, filtering, and classification [3,33]:
• Mining: it looks for a set of candidate patterns through an exploratory examination of a search space characterized by a group of inductive constraints given by the user.
• Filtering: it chooses a set of high-quality patterns from the mining stage; this step permits equal or superior results compared with using all the patterns extracted at the mining step.
• Classification: it is responsible for finding the best methodology to combine the information provided by a subset of patterns and construct an accurate pattern-based model.
We decided to use the Random Forest Miner (RFMiner) [91] as our algorithm for mining contrast patterns during the first step. García-Borroto et al. [92] conducted a large number of experiments comparing several well-known contrast pattern mining algorithms based on decision trees. According to the results obtained in their experiments, García-Borroto et al. have shown that RFMiner is capable of creating a diversity of trees. This feature allows RFMiner to obtain more high-quality patterns compared with other known pattern miners. The filtering algorithms can be divided into two groups: based on set theory and based on quality measures [33]. For our filtering process, we started with the set-theory approach: we removed redundant items from patterns as well as duplicated patterns, and we kept only the most general patterns. After this filtering process, we kept the patterns with the highest support.
Finally, we decided to use PBC4cip [36] as our contrast pattern-based classifier for the classification phase due to the good results that PBC4cip has reached in class imbalance problems. This classifier uses 150 trees by default; nevertheless, after many experiments classifying the patterns, we used only 15 trees, looking for the simplest model with good classification results in the AUC score metric. We repeated this process, reducing the number of trees while minimizing the AUC loss. The stop criterion was triggered when the AUC score obtained in our experiments fell more than 1% below the results that PBC4cip reaches with the default number of trees.
5. Experimental Setup
This section shows the methodology designed to evaluate the performance of the
tested classifiers. For our experiments, we use two databases: our Experts Xenophobia Database (EXD), which consists of 10,057 tweets labeled by experts in the fields of international relations, psychology, and sociology.
Figure 5. Flow diagram of the procedure for getting the classification results of the Xenophobia databases; its six stages are: (1) Database, (2) Cleaning, (3) Feature Representation, (4) Partition, (5) Classifier, (6) Evaluation.
1. Database: The first step consisted of obtaining the Xenophobia databases used to train
and validate all the tested machine learning classifiers detailed in step number five.
2. Cleaning: For each database, our proposed cleaning method was used to obtain a clean version of the database. Our cleaning method was specially designed to work with databases built from Twitter. It removes unknown characters, hyperlinks, retweet text, and user mentions. Additionally, our cleaning method converts the most used slang into the original words, removes stop words, and applies lemmatization. Finally, it removes tweets that are identical before and after applying the normalization process described above (a sketch of this method appears after this list).
3. Feature representation: For each database, different feature representations were
used to convert tweets into representations used by the tested machine learning
classifiers. We use the following well-known feature representations: Bag Of Words
(BOW) [93], Term Frequency-Inverse Document Frequency (TFIDF) [60], Word To
Vector (W2V) [94], and our interpretable proposal (INTER).
4. Partition: For each feature representation, five partitions were generated. Each partition was performed using the Distribution Optimally Balanced Stratified Cross-Validation (DOB-SCV) method [95]. According to Zeng and Martinez [95], the principal advantage of DOB-SCV is that it keeps a better distribution balance in the feature space when splitting a sample into groups called folds. This property empowers the cross-validation training sets to better capture the distribution features of the actual data set (a sketch of this partitioning appears after this list).
5. Classifier: For each partition, the following machine learning classifiers were used:
C4.5 (C45) [96], k-Nearest Neighbor (KNN) [97], Rusboost (RUS) [98], UnderBagging
(UND) [99], and PBC4cip [36]. Except for KNN, the other classifiers are based on
decision trees. The classifiers mentioned above have been implemented in the KEEL
software [100], except for PBC4cip, which is a package available for the Weka Data-
Mining software tool [101]; it can be taken from https://sites.google.com/view/
leocanetesifuentes/software/multivariate-pbc4cip (accessed on 20 October 2020).
6. Evaluation: For each classifier, we used the following performance evaluation metrics: F1 score and Area Under the ROC Curve (AUC) [102]. These metrics are widely used in the literature for class imbalance problems [103,104] (a sketch of their computation appears after this list).
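A minimal sketch of the cleaning method in step 2 is shown below; the slang dictionary is a tiny hypothetical sample, and spaCy is used here as one possible lemmatizer:
```python
import re
import spacy

nlp = spacy.load("en_core_web_lg")

# Tiny hypothetical sample of the slang-to-word mapping.
SLANG = {"u": "you", "ur": "your", "b4": "before"}

def clean_tweet(text: str) -> str:
    """Normalize one tweet as described in step 2 of the methodology."""
    text = re.sub(r"\bRT\b", " ", text)           # retweet marker
    text = re.sub(r"https?://\S+", " ", text)     # hyperlinks
    text = re.sub(r"@\w+", " ", text)             # user mentions
    text = re.sub(r"[^\x00-\x7F]", " ", text)     # unknown characters
    tokens = [SLANG.get(t.lower(), t) for t in text.split()]
    doc = nlp(" ".join(tokens))
    return " ".join(t.lemma_.lower() for t in doc if t.is_alpha and not t.is_stop)
```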
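A sketch of DOB-SCV (step 4) under the assumption of numeric feature matrices follows; it implements the core idea of spreading each local neighborhood of a class across the k folds, without the refinements of the published algorithm [95]:
```python
import numpy as np

def dob_scv_folds(X, y, k=5, seed=0):
    """Assign each example to one of k folds, neighborhood by neighborhood."""
    rng = np.random.default_rng(seed)
    folds = -np.ones(len(y), dtype=int)
    for cls in np.unique(y):
        unassigned = set(np.flatnonzero(y == cls).tolist())
        while unassigned:
            e0 = int(rng.choice(sorted(unassigned)))        # random seed example
            pool = np.array(sorted(unassigned - {e0}))
            group = [e0]
            if len(pool):                                    # k-1 nearest same-class
                dists = np.linalg.norm(X[pool] - X[e0], axis=1)
                group += pool[np.argsort(dists)][: k - 1].tolist()
            for fold, i in enumerate(group):                 # one per fold
                folds[i] = fold
                unassigned.discard(i)
    return folds
```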
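Finally, the two metrics in step 6 can be computed, for example, with scikit-learn; the labels and scores below are toy values for illustration only:
```python
from sklearn.metrics import f1_score, roc_auc_score

# Toy ground truth, predictions, and positive-class scores (1 = xenophobic).
y_true = [0, 0, 1, 1, 0, 1]
y_pred = [0, 0, 1, 0, 0, 1]
y_score = [0.10, 0.30, 0.90, 0.40, 0.20, 0.80]

print(f"F1  = {f1_score(y_true, y_pred):.3f}")
print(f"AUC = {roc_auc_score(y_true, y_score):.3f}")
```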
Table 7. Comparison between the number of tweets belonging to the non-xenophobic and xenophobic
classes before and after using the cleaning method. The class imbalance ratio (IR) is calculated as the
proportion between the number of objects belonging to the majority class and the number of objects
belonging to the minority class [36]. The higher the IR value, the more imbalanced the database is.
On the one hand, our INTER feature representation proposal is designed to be interpretable and to provide a set of feelings, emotions, and keywords from a given text. On the other hand, the BOW, TFIDF, and W2V feature representations transform an input text into a numeric vector [105]. According to Luo et al. [79], these numeric transformations are considered black-box, which prevents them from being human-readable.
We can also mention that there are methods based on neural networks, built on top of these numeric feature representations, that achieve interpretable results [106]. The interpretability of such neural networks is based on highlighting the keywords that make a text belong to a class [106]; in contrast, our approach seeks to obtain more interpretable features, such as feelings, emotions, and intentions, which can allow an expert to understand in more detail why a text is considered xenophobic.
Table 8 shows a summary of the information presented above, synthesizing which classifiers and feature representations are interpretable and which classifiers use contrast patterns. From Table 8a and the definition of C4.5 stated by Ting et al. [96], García et al. [107], and Dong and Bailey [38] (see Section 4 for more detail), we can see that the tree-based classifiers are interpretable; however, only PBC4cip uses contrast patterns. Finally, Table 8b shows that the only interpretable feature representation is our INTER proposal.
Table 8. Summary of the characteristics of the classifiers and the interpretability of the feature repre-
sentations.