Machine Learning Based Cyber Bullying Detection
ISSN No:-2456-2165
Abstract:- Cyberbullying is a serious issue that affects individuals of all ages, particularly children and teenagers, who are more vulnerable to online harassment. With the growing use of social media and other online platforms, it has become increasingly important to develop effective methods to detect and prevent cyberbullying. In this project, we propose a machine learning-based approach for cyberbullying detection. The proposed system uses natural language processing (NLP) techniques to analyse text messages and identify patterns of abusive and aggressive behaviour. We apply various classification algorithms, such as Logistic Regression, Decision Tree Classifier and Gaussian Naïve Bayes, to train our model and evaluate its performance. We also explore the use of ensemble methods, such as Random Forest classifier and AdaBoost classifier, to improve the accuracy of our model. We use publicly available datasets to test our system and compare its performance with other existing approaches. Our results show that the proposed machine learning-based approach can effectively identify cyberbullying with high accuracy, sensitivity, and specificity. This project has significant implications for the development of automated systems that can help protect individuals from online harassment and promote a safer and more inclusive online environment.

Keywords:- Cyberbullying, Harassment, Machine Learning, Natural Language Processing, Social Media Analysis, Text Classification, Logistic Regression, Decision Tree Classifier, Gaussian Naïve Bayes, Ensemble Methods, AdaBoost Classifier, Random Forest Classifier, Sentiment Analysis and Behavioural Analysis.

I. INTRODUCTION

Cyberbullying is a type of online harassment that involves the use of electronic communication to bully, intimidate, or threaten others. It can take various forms, such as sending threatening messages, sharing personal information without consent, spreading rumours, or posting insulting comments on social media platforms. Cyberbullying can have severe consequences, including depression, anxiety, low self-esteem, and even suicide. It is therefore essential to detect and prevent cyberbullying to ensure the safety and well-being of individuals who use online platforms.

Traditional approaches to detecting cyberbullying involve manual monitoring of online platforms, which can be time-consuming and expensive. With the growing volume of online content, it is becoming increasingly challenging to monitor and moderate online platforms effectively. Therefore, there is a need for automated systems that can identify and flag potentially abusive content quickly and accurately. In recent years, machine learning techniques have shown great promise in detecting cyberbullying. These techniques use natural language processing (NLP) algorithms to analyse text messages and identify patterns of abusive and aggressive behaviour. A significant advantage of machine learning-based methods over traditional rule-based systems is their ability to adjust to evolving trends and patterns of cyberbullying, making them more efficient.

In this project, we propose a machine learning-based approach for cyberbullying detection. We aim to develop an automated system that can accurately detect and flag potentially abusive content on online platforms. We apply various classification algorithms, such as Logistic Regression, Decision Trees, and Gaussian Naïve Bayes, to train our model and evaluate its performance. We also explore the use of ensemble methods, such as the Random Forest classifier, to improve the accuracy of our model. In the next section, we provide a brief overview of related work in cyberbullying detection. We then describe our proposed machine learning-based approach in detail and discuss the datasets and evaluation metrics used in our experiments. We present and analyse the results of our experiments and compare our approach's performance with existing approaches. Finally, we conclude the project and discuss future work.

II. RELATED WORK

[1] Cyberbullying is a growing concern with the increased use of social media and online communities. Detecting and preventing cyberbullying is crucial in ensuring the mental and physical well-being of individuals, especially children and women. [2] To address this issue, various studies have proposed the use of machine learning and natural language processing techniques to automatically detect cyberbullying.

[3] In a study conducted in May 2022, the authors proposed the use of Support Vector Machines (SVM) to identify cyberbullying on Twitter, and Optical Character Recognition (OCR) to detect image-based cyberbullying. [4] They categorized existing approaches into four main classes: supervised learning, lexicon-based, rule-based, and mixed-initiative approaches.

[5] Another study, conducted in December 2021, highlighted the research gap in resource-poor languages such as Roman Urdu, which is widely used in South Asian countries. The authors performed extensive pre-processing on the Roman Urdu microtext, including the creation of a slang-phrase dictionary and the elimination of cyberbullying domain-specific stop words. [6] They experimented with different models, including RNN-LSTM, RNN-BiLSTM, and CNN models, achieving validation accuracy of up to 85.5%.
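The classifier comparison proposed in the Introduction (Logistic Regression, Decision Tree, Gaussian Naïve Bayes, and Random Forest, scored on accuracy, sensitivity, and specificity) might be sketched as follows. This is a minimal illustration only, assuming scikit-learn is available. The TF-IDF features, the toy corpus, the split size, and the helper names `evaluate` and `compare_classifiers` are all assumptions introduced here, not the authors' pipeline or dataset.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy corpus (label 1 = bullying, 0 = benign); not the paper's data.
TEXTS = [
    "you are so stupid and ugly", "nobody likes you loser",
    "go away you worthless idiot", "everyone hates you, just quit",
    "you are a pathetic loser", "shut up you ugly idiot",
    "you are stupid and worthless", "nobody wants you here, loser",
    "great game last night, well played", "thanks for helping with homework",
    "see you at practice tomorrow", "happy birthday, have a great day",
    "that was a really nice photo", "well done on your exam results",
    "thanks, see you at the party", "have a nice weekend everyone",
]
LABELS = [1] * 8 + [0] * 8


def evaluate(y_true, y_pred):
    """Return accuracy, sensitivity (recall on bullying) and specificity."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


def compare_classifiers(texts, labels, seed=0):
    """Train each classifier on TF-IDF features and score it on a held-out split."""
    x_train, x_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.25, stratify=labels, random_state=seed
    )
    vectorizer = TfidfVectorizer(lowercase=True)
    # GaussianNB needs dense input, so convert the sparse TF-IDF matrices.
    f_train = vectorizer.fit_transform(x_train).toarray()
    f_test = vectorizer.transform(x_test).toarray()

    models = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Gaussian Naive Bayes": GaussianNB(),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    results = {}
    for name, model in models.items():
        model.fit(f_train, y_train)
        results[name] = evaluate(y_test, model.predict(f_test))
    return results


if __name__ == "__main__":
    for name, scores in compare_classifiers(TEXTS, LABELS).items():
        print(name, {k: round(float(v), 2) for k, v in scores.items()})
```

Scores on a corpus this small are not meaningful; the sketch only shows how the same features and the same held-out split can be shared across classifiers so that their metrics are directly comparable.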
VII. RESULT

Of all the algorithms used in this project, Random Forest scored highest on accuracy, precision, and support. We therefore used Random Forest to predict cyberbullying.

Table 1: Algorithm Result
S.No.  Algorithm Name             Accuracy (%)
1      Logistic Regression        81
2      Decision Tree              84
3      Gaussian Naïve Bayes       62
4      Random Forest Classifier   92

VIII. CONCLUSION AND FUTURE SCOPE

This paper compares several supervised machine learning algorithms and ensemble methods for detecting cyberbullying. The Random Forest classifier performed best, with 92% accuracy, while Gaussian Naïve Bayes was the least accurate, at 62% (Table 1). Future work includes deploying the system in real time and collaborating with companies.

REFERENCES

[4.] https://journalofbigdata.springeropen.com/articles/10.1186/s40537-021-00550-7
[6.] https://ieeexplore.ieee.org/document/9411601/authors#authors
[8.] https://www.researchgate.net/publication/351131976_CSyberbullying_Detection_on_Social_Networks_Using_Machine_Learning_Approaches
[10.] https://engineering.ucdenver.edu/docs/librariesprovider29/college-of-engineering-and-applied-science/sp2020-capstone/csci14-report.pdf?sfvrsn=d3731fb9_2
[11.] Monirah AAA., Mourad Y., “International Journal of Advanced Computer Science and Applications (IJACSA)”, 2018.
[12.] John H., Mohamed N., Mostafa A., Zeyad E., Eslam A., Ammar M., “International Journal of Advanced Computer Science and Applications (IJACSA)”, 2019.
[13.] Mangala K., Anvitha K., Deepa, Deepika K V., Divya C H, “Cyber-Bullying Detection using Machine Learning Algorithms”, “IJCRT”.
[14.] https://jpinfotech.org/detection-of-cyberbullying-on-social-media-using-machine-learning/
[15.] https://www.irjet.net/archives/V9/i5/IRJET-V9I5562.pdf
[16.] https://www.mdpi.com/2079-9292/11/20/3273/pdf
[17.] https://link.springer.com/article/10.1007/s40747-022-00772-z