Deep Learning Algorithms for Cyberbullying Detection in Social Media Platforms
ABSTRACT Social media platforms are among the most widely used means of communication. However,
some individuals exploit these platforms for nefarious purposes, with ‘‘cyberbullying’’ being particularly
prevalent. Cyberbullying, which involves using electronic means to harass or harm others, is especially
common among young people. Consequently, this study aims to propose a model for detecting cyberbullying
using a deep learning algorithm. Three datasets from Twitter, Instagram, and Facebook were utilized to
predict instances of bullying using the Long Short-Term Memory (LSTM) method. The resulting model detects cyberbullying effectively, addressing challenges faced by previous detection techniques, and achieved accuracies of approximately 96.64%, 94.49%, and 91.26% for the Twitter, Instagram, and Facebook datasets, respectively.
machines (SVM) and extreme gradient boosting (XGB) are commonly used as classifiers in this domain, where vectorized representations of the textual data are typically essential. Commonly employed approaches include bag-of-words models in conjunction with TF-IDF (term frequency-inverse document frequency). Progress in embedding methodologies rooted in deep learning has brought forth tools such as fastText, GloVe, word2vec, and transformer-based approaches, which have been employed to acquire more complex representations. These embedding-based representations, pre-trained using representation learning tools, can be utilized with both traditional and sophisticated classifiers. This expands the array of techniques available for identifying hate speech, providing a diverse set of potential solutions applicable to various real-world situations [6].

The increase in cyberbullying on social networking sites and the diversity of its forms have had negative effects on victims. These effects are numerous, including harm to physical and mental health, such as anxiety, depression, impaired thinking, and low self-esteem, and in some cases cyberbullying has led to suicide [27]. With the emergence of these negative effects and the rise of bullying on social media sites, it has become necessary to find a solution that reduces and prevents the phenomenon of cyberbullying [28].

A comparative study published by [29], covering cyberbullying detection in social media over the last five years, presented a group of previous studies that used machine learning and deep learning algorithms to detect and classify cyberbullying. From their observations of the results of these related works, the authors concluded that, to achieve better results in future research, deep learning algorithms (the BiLSTM classifier and BERT) are recommended, while among machine learning algorithms, SVM and NB are the preferred classifiers.

The survey of previous studies revealed a number of limitations, such as the handling of multi-class cyberbullying categories, the need for experiments with larger datasets for aggression detection, and the lack of datasets drawn from multiple social media platforms. The primary aim of this research was therefore to develop a detection model that enhances the performance of classifiers on a large generic dataset by combining feature extraction techniques. The study presents an LSTM deep learning detection model capable of identifying cyberbullying content in user comments across three distinct social media platforms in real time. The experimental setup for the new datasets closely follows the methodology outlined in a prior publication [7] in an attempt to address these limitations, allowing us to examine how well the suggested model performs on the chosen datasets and how adaptable it is to other datasets; using other datasets makes the model more flexible in its detection of bullying. Numerous experiments were conducted with a wide range of time steps on a real-world dataset, while adhering to a time-conscious evaluation technique that demonstrates the enhancement in performance compared to standard references.

II. RELATED WORKS
Several methods offer models for cyberbullying detection. In the study of [8], a model is suggested to provide a dual definition of cyberbullying by utilizing a creative CNN idea for content analysis as well as a method to deal with arrangements of lower accuracy. When compared to other studies, the collected data are shown to provide superior precision and categorization.

A systematic review of n=186 entries from internet databanks was published by [9]. In this article, 10 literature reviews were chosen to assess and debate the data regarding the effectiveness of ML in preventing cyberbullying. To predict cyberbullying, most models take advantage of content-based features; the most prevalent algorithms are support vector machines, naive Bayes, and convolutional neural networks, and the majority of these features are based on text from social media posts. ML is a cutting-edge preventative technique that might enhance and complement adolescent education programs and serve as the foundation for the creation of technology-based automated screening methods.

According to studies by [10], a technique to detect cyberbullying was created using fuzzy logic, in which the communication between two users is continuously observed and each message's emotional content is identified. Depending on the emotions, each user's behaviour is classified as either decent or bullying. The user's account is automatically terminated and reported if the amount of observed bullying exceeds a predetermined threshold value. The authors concluded that, if used alongside social networking sites, the technique could be a helpful tool for avoiding online harassment. The created algorithm can also be used for surveillance and for studying human behaviour.

A novel pre-trained BERT model was developed by [11] and assessed using two social media datasets. One dataset featured a comparatively small network layer at the top functioning as a classifier, while the other had a larger network layer at the top serving as a classifier. The primary objective of the study was to detect instances of cyberbullying on various social media platforms. When compared to earlier methods, this one performs better in terms of dimensions and model training.

A study [12] presented a new model known as DEA-RNN, which combines Elman-type recurrent neural networks (RNNs) with a refined dolphin echolocation algorithm (DEA). A dataset of 10,000 tweets was used for evaluation, comparing the model's performance with various advanced algorithms such as bidirectional long short-term memory (Bi-LSTM), RNN, SVM, multinomial naive Bayes (MNB), and random forests (RF). Results from the experiments indicated that DEA-RNN outperformed all other methods across different scenarios, achieving an average accuracy of 90.45%, precision of 89.52%, recall of 88.98%, F1-score of 89.25%, and specificity of 90.94%.
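The bag-of-words/TF-IDF weighting discussed above can be illustrated with a short, self-contained sketch. The documents, tokenization, and function name here are illustrative only, not the paper's implementation:

```python
# Minimal TF-IDF sketch (illustrative):
#   tf(t, d)  = count of t in d / number of terms in d
#   idf(t)    = log(N / number of documents containing t)
import math
from collections import Counter

docs = [
    "you are awesome",
    "nobody likes you loser",
    "great game last night",
]

def tfidf(docs):
    N = len(docs)
    tokenized = [d.split() for d in docs]
    # Document frequency: in how many documents each term appears.
    df = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        n = len(toks)
        vectors.append({t: (c / n) * math.log(N / df[t]) for t, c in tf.items()})
    return vectors

vecs = tfidf(docs)
# "you" appears in 2 of the 3 documents, so its idf (and weight) is low;
# "loser" is document-specific, so it receives a higher weight.
print(vecs[1]["loser"] > vecs[1]["you"])  # True
```

Vectors produced this way (in practice by a library vectorizer over a large corpus) are what classical classifiers such as SVM or XGB consume.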
Precision can be described as the ratio of true positives to the sum of true positives and false positives.

Precision = TP / (TP + FP)    (5)

Recall, in the context of information retrieval, refers to the ratio of correctly retrieved results to the total number of results that should have been retrieved. In binary classification, recall is also known as sensitivity. It can be interpreted as the likelihood that a relevant document will be returned by a search query.

Recall = TP / (TP + FN)    (6)

The F-measure, also known as the F1 score or F score, quantifies the accuracy of a test by calculating the weighted harmonic mean of the test's precision and recall.

F-measure = 2 * TP / (2 * TP + FP + FN)    (7)

F. RESULTS AND DISCUSSION
The laptop used for all experiments was a DELL 3000 series with a 2.20 GHz Intel(R) Core(TM) i5-5200U processor, 12.0 GB of RAM, and a 64-bit operating system. The models were run multiple times across various epochs to ensure consistent evaluation parameters. To validate and ensure the reliability of the results, the performance of the cyberbullying detection scheme was assessed using metrics such as accuracy, precision, recall, and F1 measure on a dataset with minimal skewness.

Data splitting refers to the process of dividing the available data into two segments, typically for the purpose of cross-validation. One portion is dedicated to developing a predictive model, while the other is utilized to evaluate the model's performance. Splitting data into training and testing sets is a pivotal stage in the assessment of data mining models. Typically, a significant portion of the data is designated for training, with a smaller fraction set aside for testing.

In this research, the data was partitioned into various ratios for training and testing: 80% training and 20% testing, 70% and 30%, 60% and 40%, and 50% and 50%. The sets were constructed using a randomized array, enabling the model to adapt across various data samples. This methodology also aids in revealing the model's dependability and the uniformity of outcomes through multiple iterations. The random state is produced through numpy.random to facilitate random selection during the division of data, ensuring consistent and replicable splits.

The obtained accuracy was at its best for the split ratio of 80% training and 20% testing: 96.64, 95.49, and 91.42 for the Twitter, Instagram, and Facebook datasets, respectively, as illustrated in Table 1, which reveals that the performance of the suggested method decreased as the amount of training data decreased across the remaining split ratios.

TABLE 1. Performance of the proposed model in terms of the various data set split ratios.

Similar results were obtained by [17]. The researchers indicated that their model was trained with 4,590,756 parameters, with distinct input and output specifically designed for textual data; the model attains 85% accuracy using sequentially dense LSTM layers. Additionally, the researchers in the study conducted by [24] utilized an 80:20 ratio for dividing their data and indicated that their model consistently performed well in the early detection of cyberbullying. They utilized various characteristics to differentiate between positive and negative cases, setting low thresholds to facilitate early detection and using simpler features, such as profile owner traits, for the negative model. A study by [21] introduced an innovative approach to detecting cyberbullying, incorporating three advanced deep learning structures: a multichannel architecture with BiGRU, a transformer block, and CNN models. The effectiveness of their methodology was evaluated, with the experimental results highlighting the value of their strategy in categorizing short messages (tweets). When the dataset was divided into 75% for training and 25% for testing, an accuracy rate of approximately 88% was attained.

Once the model is defined, it undergoes a compilation process in order to enable execution on the Keras backend, which is based on TensorFlow. This compilation of the model includes the incorporation of optimizers, loss functions, and metrics. Optimizers play a crucial role in updating the weights of the model during the training phase. An important parameter utilized in this process is the number of epochs, which denotes how many times the model encounters the training dataset. This hyperparameter specifies the number of iterations the learning algorithm will make through the complete training dataset; during each epoch, all samples in the training dataset are utilized to adjust the internal model parameters. It is important to note that an epoch consists of at least one batch.

The accuracy of the model improved with each subsequent epoch; however, it reached a point of stability after 40 epochs, as demonstrated in Table 2. The model loss during different epochs for the used datasets is shown in Fig. 2, Fig. 3, and Fig. 4. The cross-entropy loss evaluated throughout the various epochs demonstrated effective convergence, suggesting an optimal level of performance for the model.

FIGURE 3. Accuracy and loss of LSTM model for the dataset2 (Instagram).
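The evaluation metrics of Eqs. (5)-(7) can be verified with a short computation. The confusion counts below are illustrative only, not results from the paper:

```python
# Precision, recall, and F-measure from confusion counts, per Eqs. (5)-(7).
def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f_measure(tp, fp, fn):
    # Equivalent to the harmonic mean 2*P*R / (P + R).
    return 2 * tp / (2 * tp + fp + fn)

tp, fp, fn = 80, 10, 20  # illustrative counts
print(round(precision(tp, fp), 3))      # 0.889
print(round(recall(tp, fn), 3))         # 0.8
print(round(f_measure(tp, fp, fn), 3))  # 0.842
```

Note that Eq. (7) and the harmonic-mean form agree term by term, since 2PR/(P+R) simplifies to 2TP/(2TP+FP+FN).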
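The randomized splitting procedure described in the results section (a shuffled index array with a fixed random state, giving replicable splits such as 80:20) can be sketched as follows. The helper name and toy data are illustrative, and the standard-library `random` module stands in for the numpy.random calls the paper refers to:

```python
# Reproducible train/test split via a shuffled index array (illustrative sketch).
import random

def train_test_split(samples, labels, test_ratio=0.2, seed=42):
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)  # fixed seed -> consistent, replicable split
    cut = int(len(idx) * (1 - test_ratio))
    train_idx, test_idx = idx[:cut], idx[cut:]
    return ([samples[i] for i in train_idx], [labels[i] for i in train_idx],
            [samples[i] for i in test_idx], [labels[i] for i in test_idx])

comments = [f"comment {i}" for i in range(10)]  # toy stand-in for user comments
labels = [i % 2 for i in range(10)]             # toy bullying / non-bullying labels
X_tr, y_tr, X_te, y_te = train_test_split(comments, labels, test_ratio=0.2)
print(len(X_tr), len(X_te))  # 8 2
```

Varying `test_ratio` (0.2, 0.3, 0.4, 0.5) reproduces the four split configurations compared in Table 1, while the fixed seed keeps each split identical across repeated runs.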