Cyberbullying Detection Through Sentiment Analysis
Abstract—In recent years, with the widespread use of social media platforms across the globe, especially among young people, cyberbullying and aggression have become a serious and annoying problem that communities must deal with. Such platforms provide various ways for bullies to attack and threaten others in their communities. Various techniques and methodologies have been used or proposed to combat cyberbullying through early detection and alerts to discover and/or protect victims from such attacks. Machine learning (ML) techniques have been widely used to detect language patterns that are exploited by bullies to attack their victims. Also, Sentiment Analysis (SA) of social media content has become one of the growing areas of research in machine learning, and SA provides the ability to detect cyberbullying in real time. This paper proposes an SA model for identifying cyberbullying texts on the Twitter social media platform. Support Vector Machines (SVM) and Naïve Bayes (NB) are used in this model as supervised machine learning classification tools. The results of the experiments conducted on this model showed encouraging outcomes when a higher n-gram language model is applied to such texts, in comparison with similar previous research. Also, the results showed that SVM classifiers have better performance measures than NB classifiers on such tweets.

Keywords — Cyberbullying, sentiment analysis, machine learning, social media

I. INTRODUCTION

Social media has been used by almost all people, especially young adults, as a major medium of communication. In [1], young adults were among the earliest social media adopters and continue to use it at high levels; usage by older adults has also increased in recent years, as shown in Fig. 1.

Figure 1. Percentage of U.S. Adults Who Use Social Media Sites by Age.

As a result of such wide usage of social media among adults, cyberbullying or cyber aggression has become a major problem for social media users. This has led to an increasing number of cyber victims who have suffered emotionally, mentally, and/or physically. Cyberbullying can be defined as a type of harassment that takes place online on social networks. Criminals rely on such networks to collect data and information that enable them to execute their crimes, for example, by identifying a vulnerable victim [2]. Therefore, researchers have been working on finding methods and techniques that would detect and prevent cyberbullying. Recently, cyberbullying monitoring systems have gained a considerable amount of research attention; their goal is to efficiently identify cyberbullying cases [3]. The major idea behind such systems is the extraction of features from social media texts and then building classifier algorithms to detect cyberbullying based on the extracted features. Such features could be based on users, content, emotions, and/or social networks. Furthermore, machine learning methods have been used to detect language pattern features from texts written by bullies.

Research in detecting cyberbullying has mostly been done either through filtration techniques or through machine learning techniques. In filtration techniques, profane words or idioms have to be detected in texts to identify cyberbullying [4]. Machine learning techniques, in contrast, build classifiers that are capable of detecting cyberbullying using corpora of data collected from social networks such as Facebook and Twitter. For instance, in [5], data were collected from Formspring and then labeled using Amazon Mechanical Turk [6]; WEKA toolkit [7] machine learning methods were also employed to train and test these classifiers. Such techniques suffer from an inability to detect indirect language harassment [8].

Chen et al. [9] proposed a technique to detect offensive language constructs on social networks through the analysis of features related to users' writing styles, structures, and certain cyberbullying contents, in order to identify potential bullies. The basic technique used in that study is the Lexical Syntactic Feature, which was successfully able to detect offensive content in texts sent by bullies. Their results indicated a very high precision rate (98.24%) and a recall of 94.34%.

Nandhini and Sheeba [10] proposed a technique for detecting cyberbullying based on an NB classifier using data collected from MySpace; they reported an achieved accuracy of 91%. Romsaiyud et al. [11] employed an enhanced NB classifier to extract cyberbullying words and clustered loaded patterns. They achieved an accuracy of
Figure 2. Proposed Tweets Preprocessing Stages.
and gives better concepts to the data. Moreover, N-gram is a traditional method that takes into consideration the occurrences of N words in a tweet and can identify formal expressions [20]. Hence, we have used N-grams in our SA.

In this research, we have implemented term frequency using WEKA [21]. Term frequency assigns a weight to each term in a document depending on the number of occurrences of the term in that document; it gives more weight to those terms that appear more frequently in tweets, because these terms represent words and language patterns that are more used by the tweeters.

TABLE 2. Tweets Statistics

Total number of Tweets                          5628
Number of positive (cyberbullying) Tweets       1187
Number of negative (no cyberbullying) Tweets    2342
Number of neutral Tweets                        2099
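The n-gram extraction and term-frequency weighting described above can be sketched as follows (a minimal pure-Python illustration, not the WEKA implementation used in this work; the whitespace tokenizer is a simplifying assumption):

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the list of word n-grams in a token sequence."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def term_frequencies(tweet, n=2):
    """Count how often each n-gram occurs in a single tweet.

    Terms that occur more often receive a higher weight, mirroring
    the term-frequency idea described in the text.
    """
    tokens = tweet.lower().split()  # naive whitespace tokenizer (assumption)
    return Counter(ngrams(tokens, n))

tweet = "you are a loser you are pathetic"
tf = term_frequencies(tweet, n=2)
# the repeated bigram "you are" receives the highest weight
```

In practice the per-tweet counts would be collected over the whole corpus to form the feature vectors fed to the classifiers.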
H. Feature Selection

Feature selection techniques have been used successfully in SA [22], [23]. Features are ranked according to some measure, such that non-useful or non-informative features can be removed to improve the accuracy and efficiency of the classification process. In this study, we have used the Chi-square and Information Gain techniques to remove such irrelevant features.
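As an illustration of the chi-square ranking step, the score of a single binary term feature against a binary class label can be computed from its 2x2 contingency table (a simplified sketch using the standard observed-vs-expected formulation; the actual implementation in this work relies on WEKA's attribute selection):

```python
def chi_square_score(docs, labels, term):
    """Chi-square statistic for one term vs. a binary class label.

    docs   : list of token lists
    labels : parallel list of 0/1 class labels
    term   : the feature (word) being scored
    Higher scores mean the term is more informative about the class,
    so low-scoring terms can be removed before classification.
    """
    n = len(docs)
    # observed counts of the 2x2 contingency table (presence x class)
    obs = {(p, c): 0 for p in (0, 1) for c in (0, 1)}
    for doc, label in zip(docs, labels):
        obs[(int(term in doc), label)] += 1
    score = 0.0
    for p in (0, 1):
        for c in (0, 1):
            # expected count under independence of term and class
            row = obs[(p, 0)] + obs[(p, 1)]
            col = obs[(0, c)] + obs[(1, c)]
            expected = row * col / n
            if expected:
                score += (obs[(p, c)] - expected) ** 2 / expected
    return score
```

Ranking all candidate n-grams by this score and dropping the lowest-ranked ones implements the removal of irrelevant features described above.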
IV. EXPERIMENTS AND RESULTS

To evaluate the performance of the machine learning methods used in this research, namely the Naïve Bayes (NB) and the Support Vector Machine (SVM), we have collected a total of 5628 tweets (positive: cyberbullying; negative: no cyberbullying; and neutral). This set of tweets was manually classified into 1187 cyberbullying tweets, 2342 no-cyberbullying tweets, and the remaining 2099 neutral tweets. Table 2 presents the distribution of these tweets.

Before conducting our experiments, the set of tweets had gone through the various phases of cleaning, preprocessing, normalization, tokenization, named entity recognition, stemming, and feature selection, as discussed in the previous section. Then this data set was split into a ratio of (70, 30) for training and testing the NB and SVM classifiers. Finally, cross-validation is used, in which 10 equal-sized folds are produced.
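The 70/30 split and 10-fold partitioning described above can be sketched as follows (a minimal pure-Python illustration of the data partitioning only; random shuffling before the split is an assumption, since the text does not specify how the split was drawn):

```python
import random

def train_test_split(items, train_ratio=0.7, seed=42):
    """Shuffle and split items into (train, test) by the given ratio."""
    rng = random.Random(seed)       # fixed seed for reproducibility
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

def k_folds(items, k=10):
    """Partition items into k near-equal-sized folds; each fold serves
    once as the test set while the remaining folds form the training set."""
    return [items[i::k] for i in range(k)]

tweets = list(range(5628))          # stand-in for the 5628 labeled tweets
train, test = train_test_split(tweets)
folds = k_folds(tweets, k=10)
```

Each classifier would then be trained on the 70% portion and evaluated on the held-out 30%, with the 10 folds used for cross-validated estimates.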
Several experiments have been conducted to compare the performance of the NB and SVM classifiers on the above collected set of tweets. In the first experiment, tweets with 2-gram, 3-gram, and 4-gram language models are used to evaluate the NB and SVM classifiers in terms of accuracy, precision, recall, F-measure, and ROC. Table 3 presents the results of this experiment. Fig. 3 illustrates the averages of the measures obtained over the different n-gram models for both NB and SVM classifiers.

TABLE 3. NB and SVM Measures for Different N-gram Language Models

Measure     Classifier   2-gram   3-gram   4-gram   Average
Accuracy    NB           82.35    81.7     81.1     82.025
            SVM          91.21    91.7     92.02    91.64
Precision   NB           78.46    78.68    78.42    78.52
            SVM          88.92    89.1     89.3     89.11
Recall      NB           77.31    79.4     79.71    78.81
            SVM          86.28    87.36    88.04    87.23
F-Measure   NB           77.88    79.04    79.06    78.66
            SVM          87.58    88.22    88.66    88.16
ROC         NB           78.61    77.9     78.03    77.9
            SVM          88.2     88.56    89.3     88.93

Figure 3. Graphical Comparisons of NB and SVM Measures.

From Table 3 and Fig. 3, we can conclude that the SVM classifiers achieved higher average results than the NB classifiers in all n-gram language models in terms of accuracy, precision, recall, F-measure, and ROC. For instance, the SVM classifiers achieved an accuracy of 92.02% with the 4-gram language model, whereas the NB classifiers achieved an accuracy of 81.1% on the same language model. Also, the 4-gram language model outperformed all other n-gram language models in all measures for both the SVM and NB classifiers. This is because a higher n-gram leads to an increase in the probability of estimation.

Another experiment was conducted to compare our proposed classifiers to the work presented in [12], using the two major classification techniques, namely Naïve Bayes (NB) and Support Vector Machine (SVM), on the same data set presented earlier. Table 4 presents the summarized performance measures of our proposed NB and SVM classifiers in comparison with the implementation of [12]. It is very clear from Table 4 and Fig. 4 that in most measures we obtained slightly better results.
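The reported measures can be reproduced from a classifier's predictions using the standard definitions (a generic sketch, not the WEKA output behind Table 3; the three-class task is simplified here to the binary cyberbullying-vs-not case, which is an assumption):

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F-measure for binary labels
    (1 = cyberbullying, 0 = not), from parallel prediction lists."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0      # true positive rate
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return accuracy, precision, recall, f_measure
```

The ROC measure discussed later additionally uses the false positive rate, fp / (fp + tn), alongside the true positive rate computed here.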
TABLE 4. Averages of NB and SVM Measures for Different N-gram Language Models

                        Avg.     Avg.     Avg.     Avg.      Avg.
                        Accur.   Recall   Prec.    F-Meas.   ROC
Proposed Work    NB     81.71    78.8     78.52    78.66     77.9
                 SVM    91.64    87.22    89.1     88.15     88.93
Previous work    NB     80.9     79.1     77.04    78.06     77.02
                 SVM    83.46    85.3     84.32    84.81     85.71

Figure 4. Averages of Graphical Comparisons of NB and SVM Measures.

Furthermore, as shown in Fig. 4, the performance measures of our SVM classifiers are better than those of the SVM classifiers of the previous work. For instance, we obtained an average accuracy of 91.64 in the proposed work, in contrast to an average accuracy of 83.46 in the previous work. Also, the average ROC of our SVM classifier is 88.93, compared to 85.71 for the SVM of the previous work. This is an impressive result, since ROC compares the true positive rate and the false positive rate, where the true positive rate corresponds to the sensitivity or recall in machine learning.

V. CONCLUSIONS

In this research, we have proposed an approach to detect cyberbullying on the Twitter social media platform based on sentiment analysis employing machine learning techniques, namely Naïve Bayes and Support Vector Machine. The data set used in this research is a collection of tweets that have been classified as positive, negative, or neutral with respect to cyberbullying. Before training and testing these machine learning techniques, the collected set of tweets had gone through several phases of cleaning, annotation, normalization, tokenization, named entity recognition, stop-word removal, stemming, n-gram extraction, and feature selection.

The results of the conducted experiments have indicated that SVM classifiers outperformed NB classifiers in almost all performance measures over all language models. Specifically, the SVM classifiers achieved an accuracy of 92.02%, while the NB classifiers achieved an accuracy of 81.1%, on the 4-gram language model.

Furthermore, more experiments have been conducted to compare our proposed work to the similar work of [12]. These experiments also indicated that our SVM and NB classifiers had slightly better performance measures when compared to this previous work.

Finally, as a future research direction in cyberbullying detection, we would like to explore other machine learning techniques, such as neural networks and deep learning, with larger sets of tweets, and to adopt proven methods for an automated annotation process to handle such a large set of tweets.

ACKNOWLEDGMENT

I would like to gratefully acknowledge the support of East Central University, Ada, Oklahoma, and the Department of Mathematics and Computer Science at ECU for providing the funds for my paper registration at this conference.

REFERENCES

[1] Pew Research Center, "Social Media Fact Sheet", Internet & Technology, June 12, 2019. https://www.pewresearch.org/internet/fact-sheet/social-media/, accessed March 28, 2020.
[2] H. T. Tavani, "Introduction to Cyberethics: Concepts, Perspectives, and Methodological Frameworks", in Ethics and Technology: Controversies, Questions, and Strategies for Ethical Computing, Fourth Edition, Rivier University, Wiley, pp. 1-2, 2013.
[3] S. Salawu, Y. He, and J. Lumsden, "Approaches to Automated Detection of Cyberbullying: A Survey", Vol. 3045, no. C, pp. 1-20, 2017.
[4] "Internet Monitoring and Web Filtering Solutions", Pearl Software, 2015. [Online]. Available: http://www.pearlsoftware.com/solutions/cyber-bullying-inschools.html. [Accessed Feb. 20, 2020].
[5] K. Reynolds, "Using Machine Learning to Detect Cyberbullying", 2012.
[6] "Amazon Mechanical Turk", Aug. 15, 2014. [Online]. Available: http://ocs.aws.amazon.com/AWSMMechTurk/latest/AWSMechanical-TurkGetingStartedGuide/SvcIntro.html. [Accessed July 3, 2020].
[7] S. Garner, "WEKA: The Waikato Environment for Knowledge Analysis", New Zealand, 1995.
[8] V. Nahar, X. Li, and C. Pang, "An Effective Approach for Cyberbullying Detection", in Communications in Information Science and Management Engineering, May 2013.
[9] Y. Chen, Y. Zhou, S. Zhu, and H. Xu, "Detecting Offensive Language in Social Media to Protect Adolescent Online Safety", in Privacy, Security, Risk and Trust (PASSAT), 2012 International Conference on Social Computing (SocialCom), pp. 71-80, 2012.
[10] B. Sri Nandhini and J. I. Sheeba, "Online Social Network Bullying Detection Using Intelligence Techniques", International Conference on Advanced Computing Technologies and Applications (ICACTA-2015), Procedia Computer Science 45 (2015), pp. 485-492.
[11] W. Romsaiyud, K. na Nakornphanom, P. Prasertslip, P. Nurarak, and P. Konglerd, "Automated Cyberbullying Detection Using Clustering Appearance Patterns", in Knowledge and Smart Technology (KST), 2017 9th International Conference on, pp. 242-247, IEEE, 2017.
[12] D. Jiandani, R. Karkera, M. Manglani, M. Ahuja, and A. Tewari, "Comparative Analysis of Different Machine Learning Algorithms to Detect Cyber-bullying on Facebook", International Journal for Research in Applied Science & Engineering Technology (IJRASET), Volume 6, Issue IV, April 2018, pp. 2322-2328.
[13] C. Bosco, V. Patti, and A. Bolioli, "Developing Corpora for Sentiment Analysis: The Case of Irony and Senti-TUT", Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2015), pp. 4158-4162.
[14] N. Friedman, D. Geiger, and M. Goldszmidt, "Bayesian Network Classifiers", Machine Learning, Vol. 29, No. 2-3, pp. 131-163, 1997.
[15] C. Cortes and V. N. Vapnik, "Support-Vector Networks", Machine Learning, 20(3): 273-297, 1995, CiteSeerX 10.1.1.15.9362, doi:10.1007/BF00994018.
[16] C. Cortes and V. Vapnik, "Support-Vector Networks", Machine Learning, Vol. 20, No. 3, pp. 273-297, 1995, doi:10.1007/BF00994018.
[17] L. Tian, C. Lai, and J. D. Moore, "Polarity and Intensity: The Two Aspects of Sentiment Analysis", Proceedings of the First Grand Challenge and Workshop on Human Multimodal Language (Challenge-HML), pp. 40-47, Melbourne, Australia, July 20, 2018, Association for Computational Linguistics.
[18] M. Bordoloi and S. K. Biswas, "Sentiment Analysis of Product using Machine Learning Technique: A Comparison among NB, SVM and MaxEnt", International Journal of Pure and Applied Mathematics, 118(18): 71-83, July 2018.
[19] B. S. Rajput and N. Khare, "A Survey of Stemming Algorithms for Information Retrieval", IOSR Journal of Computer Engineering (IOSR-JCE), e-ISSN: 2278-0661, p-ISSN: 2278-8727, Volume 17, Issue 3, Ver. VI (May-Jun. 2015), pp. 76-8.
[20] L. Chen, W. Wang, M. Nagaraja, S. Wang, and A. Sheth, "Beyond Positive/Negative Classification: Automatic Extraction of Sentiment Clues from Microblogs", Kno.e.sis Center, Technical Report, 2011.
[21] G. Holmes, A. Donkin, and I. Witten, "WEKA: A Machine Learning Workbench", in Proceedings of the 1994 Second Australian and New Zealand Conference on Intelligent Information Systems, Brisbane, 29 November-2 December 1994, pp. 357-361.
[22] K. Khalifa and N. Omar, "A Hybrid Method Using Lexicon-Based Approach and Naïve Bayes Classifier for Arabic Opinion Question Answering", Journal of Computer Science, 10(11): 1961-1968, 2014, ISSN: 1549-3636.
[23] M. A. Fattah, "A Novel Statistical Feature Selection Approach for Text Categorization", J Inf Process Syst, 13: 1397-1409, 2017, https://doi.org/10.3745/JIPS.02.0076
[24] I. Guyon and A. Elisseeff, "An Introduction to Variable and Feature Selection", J Mach Learn Res, 3: 1157-1182, 2003, https://doi.org/10.1016/j.aca.2011.07.027