International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056

Volume: 10 Issue: 03 | Mar 2023 www.irjet.net p-ISSN: 2395-0072

Cyberbullying Detection Using Machine Learning


Polasa Jahnavi¹, Siliveri Rohith Vardhan², Shashank Kandhaktla³

¹,²,³ Student, Computer Science & Engineering, Anurag University, Telangana, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - In the current digital era, cyberbullying is spreading and has the potential to seriously harm people's mental health, interpersonal relationships, and academic achievement. To recognise and stop such activity, there is a growing demand for automated cyberbullying detection systems. This paper proposes a machine learning-based approach for detecting cyberbullying in all forms. Cyberbullying is a developing problem that can take many different forms, including text, photos, and PDF documents. The proposed detection system evaluates these forms of content and spots instances of the behaviour using machine learning techniques such as SVM[15], k-nearest neighbours, Decision Tree[17], and Random Forest[18]. To identify cyberbullying in photographs and text, we used image recognition algorithms, OCR[19], and natural language processing approaches. We trained the aforementioned machine learning algorithms on a sizeable dataset of labeled cyberbullying content. The suggested method might considerably reduce the prevalence of cyberbullying, and the online space could become safer and more welcoming as a result.

Key Words: Machine learning, Natural language processing, Cyberbullying, Text, Image, Documents.

1. INTRODUCTION

Cyberbullying is the intentional use of electronic communication tools like social media, texting, email, or instant messaging to harass, threaten, or hurt someone. Cyberbullying may take many different forms, such as sending threatening or unpleasant messages, spreading rumours or gossip online, disclosing private or embarrassing information, making up false accounts or posing as someone else, or even cyberstalking. Cyberbullying may have severe repercussions for the victim, such as mental discomfort, sadness, anxiety, and in severe cases, suicide. Cyberbullying is a major problem that has to be acknowledged, addressed and prevented.

The practice of finding and highlighting instances of cyberbullying using technology is referred to as cyberbullying detection. Cyberbullying detection can be done manually, with human moderators reviewing online material and identifying instances of cyberbullying, or using automated techniques, such as machine learning algorithms. Automatic cyberbullying detection technologies examine online information such as social media postings, comments, and messages to discover patterns of behaviour that are suggestive of cyberbullying. These systems may be trained on big datasets of labeled data containing instances of cyberbullying and non-cyberbullying material.

Once trained, the cyberbullying detection tool may automatically flag instances of cyberbullying and act on them, either by notifying a human moderator for additional review or by blocking or removing the offending content. Automatic cyberbullying detection technologies can be effective in recognising and reducing cyberbullying on a large scale, but they must be accurate, unbiased, and respectful of individual privacy.

2. RELATED WORKS

To inform the design of our model, we surveyed published articles on cyberbullying detection. This section reviews recent automated cyberbullying detection and classification methods.

Table -1: Literature survey on Cyberbullying Detection.

Research Papers on Cyberbullying Detection

S.no | Title | Dataset | Methodology
1 | Cyberbullying Detection on Social Networks Using Machine Learning Approaches[14] (Adya Bansal, Akash Baliyan, Akash Yadav) | Datasets from Twitter comments and remarks. | NLP (Natural Language Processing) and ML (Machine Learning)

© 2023, IRJET | Impact Factor value: 8.226 | ISO 9001:2008 Certified Journal | Page 1193

2 | Cyberbullying Detection in Social Networks Using Deep Learning Based Models[13] (Maral Dadvar and Kai Eckert) | Datasets from Wikipedia, Twitter comments and remarks, and Formspring. | Deep Neural Network-Based Models
3 | A multilingual system for detecting cyberbullying in Arabic content using machine learning.[4] (Batoul Haidar, Maroun Chamoun, and Ahmed Serhrouchni.) | Datasets are Arabic language text data. | ML (Machine Learning)

3. THE PROPOSED MODEL

This paper proposes a machine learning-based approach to detect cyberbullying in various forms, including text, photos, and PDF documents. Machine learning techniques such as SVM, k-nearest neighbours, Decision Tree, and Random Forest are used, as well as image recognition algorithms, OCR[19], and natural language processing approaches. The suggested strategy has the potential to greatly reduce cyberbullying.

The model detects cyberbullying in the following:

• Text
• Image
• Text file (.txt)
• PDF (.pdf)

The purpose of this work is to identify cyberbullying and to make the internet a safer and more inclusive place by preventing cyberbullying and safeguarding individuals from its negative consequences. This is accomplished by reporting instances of cyberbullying for assessment and subsequent action by human moderators, who may then take the necessary actions to address the problem and help victims.

For text analysis, the system will use topic modelling techniques to identify the emotional content and themes of the text. The system will also analyse the text for the use of derogatory language and other harmful expressions commonly used in cyberbullying. The approach is to train a classification model on a labeled dataset of cyberbullying incidents. The trained model may then be used to predict whether or not new text contains cyberbullying. To train the model, features such as profanity, insulting language, and hostile tone may be extracted from the text.

NLP approaches may also be used to preprocess text data and extract aspects such as sentiment, grammar, and semantics. These characteristics can be utilised to increase the classification model's accuracy.

For image analysis, the system will use deep learning techniques to recognise patterns and identify visual cues associated with cyberbullying. The system will look for specific types of images, such as those with derogatory captions or those depicting harmful acts, to identify instances of cyberbullying.

For PDF documents, the system will use OCR to convert the text into a machine-readable format, and then apply the same text analysis techniques used for other types of content.

Once the system identifies an instance of cyberbullying, it will flag the content for review and further action by human moderators, such as reporting the content to appropriate authorities or removing it from the platform.

3.1 ALGORITHMS

• Support Vector Machine (SVM[15]) is a powerful supervised learning algorithm used for classification and regression analysis. The goal of SVM[15] is to find the best hyperplane that separates the data points into different classes with the largest possible margin.

• K-Nearest Neighbours (KNN[16]) is a simple and widely-used non-parametric classification algorithm in machine learning. KNN[16] works by finding the K closest data points in the training set to a given input data point and assigning a label to the input data point based on the most common label among its K nearest neighbours.

• Decision Tree[17] is a popular machine learning algorithm used for classification and regression analysis. It works by recursively splitting the data into subsets based on the most informative
features until a decision is made about the class label or predicted value of a given input data point.

• Random Forest[18] is an ensemble learning algorithm used for both classification and regression analysis. Random Forest works by constructing a multitude of Decision Trees at training time and outputting the class that is the mode of the classes of the individual trees.
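
The k-nearest-neighbour rule described above can be illustrated with a minimal sketch in plain Python. Everything in it is an assumption made for illustration and is not taken from the paper: the six toy training sentences, the whitespace bag-of-words features, the use of cosine similarity as the closeness measure, and the function names. The `majority_vote` helper returns the mode of a list of labels, which is also the rule a Random Forest uses to combine the predictions of its individual trees.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each token."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def majority_vote(labels):
    """Return the most common label (the mode) of a list of labels."""
    return Counter(labels).most_common(1)[0][0]


def knn_predict(train, text, k=3):
    """Label `text` with the majority label of its k most similar training texts."""
    query = bag_of_words(text)
    ranked = sorted(
        train,
        key=lambda example: cosine_similarity(bag_of_words(example[0]), query),
        reverse=True,
    )
    return majority_vote([label for _, label in ranked[:k]])


# Hypothetical toy training set, invented for this example only.
TRAIN = [
    ("you are so stupid and ugly", "bullying"),
    ("nobody likes you loser", "bullying"),
    ("shut up you stupid loser", "bullying"),
    ("great game last night", "not_bullying"),
    ("thanks for the help today", "not_bullying"),
    ("see you at the game tomorrow", "not_bullying"),
]

if __name__ == "__main__":
    print(knn_predict(TRAIN, "you stupid loser"))
    print(knn_predict(TRAIN, "great game today"))
```

A real system would replace the whitespace tokeniser with proper NLP preprocessing and train on a large labeled dataset. The sketch only shows how the neighbour search and the vote fit together.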

3.2 OPTICAL CHARACTER RECOGNITION (OCR)


OCR stands for Optical Character Recognition, which is a
technology used to convert printed or handwritten text
into a machine-readable format. OCR involves scanning
the text using an optical scanner or a smartphone camera
and then using image processing techniques to extract the
text from the image. OCR works by analysing the shape
and size of the characters in the image and comparing
them to a database of known characters. The OCR
software then uses pattern recognition algorithms to
identify the characters in the image and convert them into
machine-readable text.
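
The idea of comparing character shapes against a database of known characters can be sketched as simple template matching. This is a toy illustration, not the OCR engine used in this work: the 3x5 bitmaps, the three-letter template set, and the function names are all hypothetical, and production OCR software relies on far more robust pattern recognition.

```python
# Hypothetical 3x5 bitmap templates; '#' is an ink pixel, '.' is background.
TEMPLATES = {
    "I": ("###",
          ".#.",
          ".#.",
          ".#.",
          "###"),
    "L": ("#..",
          "#..",
          "#..",
          "#..",
          "###"),
    "T": ("###",
          ".#.",
          ".#.",
          ".#.",
          ".#."),
}


def pixel_distance(a, b):
    """Count the pixels at which two equally sized bitmaps differ."""
    return sum(
        pixel_a != pixel_b
        for row_a, row_b in zip(a, b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )


def recognise_glyph(glyph, templates=TEMPLATES):
    """Return the template character whose bitmap is closest to the glyph."""
    return min(templates, key=lambda ch: pixel_distance(glyph, templates[ch]))


def recognise_line(glyphs):
    """Recognise a sequence of glyph bitmaps and join the results into text."""
    return "".join(recognise_glyph(glyph) for glyph in glyphs)


# An "L" with one noisy pixel, as might result from an imperfect scan.
NOISY_L = ("#..",
           "#..",
           "#.#",
           "#..",
           "###")

if __name__ == "__main__":
    print(recognise_line([TEMPLATES["T"], NOISY_L, TEMPLATES["I"]]))
```

Once the characters are recovered as text, the output can be passed to the same text analysis steps that are applied to other content.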

4. EXPERIMENT AND RESULTS


4.1 DATASET

There are several publicly available datasets for cyberbullying detection research, including:

1. Fine-Grained Balanced Cyberbullying Dataset[1]: It was created by academics at the University of Cagliari and includes 25,000 Facebook and Twitter posts. The dataset includes both cyberbullying and non-cyberbullying messages, and it is balanced to provide an equal number of positive and negative examples.

2. Aggression Parsed Dataset[2]: This dataset contains 20,000 tweets, labeled as containing cyberbullying or not, and is often used for evaluating machine learning models for cyberbullying detection.

3. Hate Speech and Offensive Language dataset: This dataset contains tweets that are labeled as containing hate speech or offensive language. It can be used for cyberbullying detection as well.

Fig 1: Dataset statistics

4.2 PERFORMANCE EVALUATION

The SVM algorithm achieved an accuracy of about 0.862, the k-NN algorithm achieved a higher accuracy of about 0.962, and the Decision Tree and Random Forest algorithms achieved the highest accuracies, both about 0.972 (Table 2).

Overall, the data indicate that the Decision Tree and Random Forest algorithms performed the best for cyberbullying detection, outperforming SVM and k-NN.

Table -2: Accuracy of different Algorithms

Accuracy vs Algorithms
Algorithms | Accuracy
Support vector machine (SVM) | 0.861998703343521
K-Nearest Neighbors (k-NN) | 0.9622117254793
Decision Tree | 0.972307122348801
Random Forest | 0.972260813188849
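
For reference, a score of the kind reported in Table 2 can be computed as in the sketch below. It assumes the usual definition of accuracy (correct predictions divided by all predictions) and also computes recall, the share of actual cyberbullying examples that were caught. The label vectors are invented for illustration and are not the paper's data, and the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of all examples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive examples that were predicted positive."""
    true_positives = sum(
        t == positive and p == positive for t, p in zip(y_true, y_pred)
    )
    actual_positives = sum(t == positive for t in y_true)
    return true_positives / actual_positives if actual_positives else 0.0


# Invented labels: 1 = cyberbullying, 0 = not cyberbullying.
Y_TRUE = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
Y_PRED = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

if __name__ == "__main__":
    print("accuracy:", accuracy(Y_TRUE, Y_PRED))
    print("recall:", recall(Y_TRUE, Y_PRED))
```

On imbalanced data the two measures can diverge sharply: in this toy case accuracy is 0.8 although only one of the three bullying examples is detected, which is why reporting recall alongside accuracy is often useful.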


Chart -1: Comparative Study of Machine Learning Algorithms for Cyberbullying Detection

Each algorithm was tested using a dataset containing labeled cyberbullying content, and the reported accuracy is the fraction of test examples that were classified correctly.

5. CONCLUSIONS

In conclusion, cyberbullying is a serious problem that can have a significant impact on individuals and society. It is important to develop effective methods for detecting and preventing cyberbullying, and machine learning has emerged as a promising approach for this task. The existing literature on cyberbullying detection using machine learning techniques highlights the potential for deep learning models, natural language processing techniques, and social network analysis to identify patterns of negative behaviour and potential sources of cyberbullying. However, more research is needed to develop more accurate and efficient models that can be deployed in real-world settings. Furthermore, it is important to consider the ethical implications of using machine learning for cyberbullying detection and to ensure that any system developed is fair, transparent, and accountable. Overall, cyberbullying detection using machine learning is a rapidly evolving field, and we will likely see continued progress and innovation in the years to come.

6. FUTURE ENHANCEMENT

Several potential future enhancements could improve the effectiveness of cyberbullying detection using machine learning techniques. Many cyberbullying detection systems rely solely on text analysis to identify patterns of negative behaviour. However, incorporating other modalities such as audio and video could provide additional context and improve accuracy. Cyberbullying behaviours and language can evolve, and static models may not be effective at capturing these changes. Dynamic models that can adapt to changing patterns of behaviour could improve the effectiveness of cyberbullying detection systems. Cyberbullying is a global problem, and many existing systems only analyse text in a single language. Multilingual analysis could improve the ability of these systems to identify cyberbullying across different cultures and languages. Social media platforms have access to vast amounts of data on user behaviour, and collaborating with these platforms to develop more effective cyberbullying detection systems could be a promising avenue for future research.

7. REFERENCES

1. J. Wang, K. Fu, C.T. Lu, “SOSNet: A Graph Convolutional Network Approach to Fine-Grained Cyberbullying Detection,” Proceedings of the 2020 IEEE International Conference on Big Data (IEEE BigData 2020), pp. 1699-1708, December 10-13, 2020.

2. Fatma Elsafoury, “Cyberbullying datasets,” Mendeley Data, V1, 2020, doi: 10.17632/jf4pzyvnpj.1.

3. Aaminah Ali, Adeel M. Syed, “Cyberbullying Detection Using Machine Learning,” Software Engineering Department, Bahria University, Islamabad, Pakistan.

4. Batoul Haidar, Maroun Chamoun, Ahmed Serhrouchni, “A Multilingual System for Cyberbullying Detection: Arabic Content Detection using Machine Learning,” Saint Joseph University, Lebanon, and Telecom ParisTech, France.

5. M. Dadvar, D. Trieschnigg, F. de Jong, “Experts and machines against bullies: a hybrid approach to detecting cyberbullies,” in M. Sokolova, P. van Beek (eds.), AI 2014, LNCS (LNAI), vol. 8436, pp. 275–281, Springer, Cham, 2014.

6. X. Zhang, et al., “Cyberbullying detection with a pronunciation-based convolutional neural network,” 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 740–745, 2016.

7. K. Reynolds, A. Kontostathis, L. Edwards, “Using machine learning to detect cyberbullying,” Proceedings of the 10th International Conference on Machine Learning and Applications, ICMLA 2011, vol. 2, pp. 241–244, December 2011.


8. Neelakandan S, Sridevi M, Saravanan Chandrasekaran, Murugeswari K, Aditya Kumar Singh Pundir, Sridevi R, and T. Bheema Lingaiah, “Deep Learning Approaches for Cyberbullying Detection and Classification on Social Media.”

9. N. Yuvaraj, K. Srihari, G. Dhiman, et al., “Nature-inspired-based approach for automated cyberbullying classification on multimedia social networking,” Mathematical Problems in Engineering, vol. 2021, Article ID 6644652, 12 pages, 2021.

10. S. Mahbub, E. Pardede, and A. S. M. Kayes, “Detection of Harassment Type of Cyberbullying: A Dictionary of Approach Words and its Impact,” Security and Communication Networks, vol. 2021, Article ID 5594175, 12 pages, 2021.

11. Md Manowarul Islam, Md Ashraf Uddin, Linta Islam, Arnisha Akter, Selina Sharmin, Uzzal Kumar Acharjee, “Cyberbullying Detection on Social Networks Using Machine Learning Approaches.”

12. J. Wang, R. J. Iannotti, and T. R. Nansel, “School bullying among US adolescents: Physical, verbal, relational and cyber,” Journal of Adolescent Health, vol. 45, pp. 368-375, 2009.

13. Maral Dadvar and Kai Eckert, “Cyberbullying Detection in Social Networks Using Deep Learning Based Models,” Web-based Information Systems and Services, Stuttgart Media University, Nebenstrasse 8, 70569 Stuttgart, Germany.

14. Adya Bansal, Akash Baliyan, Akash Yadav, Aman Kamlesh, Hemant Kumar Baranwal, “Cyberbullying Detection on Social Networks Using Machine Learning Approaches,” Dept. of Computer Science and Engineering, Meerut Institute of Engineering and Technology, Meerut, U.P., India.

15. Yingjie Tian, Yong Shi, Xiaohui Liu, “Recent Advances on Support Vector Machines Research.”

16. Gongde Guo, Hui Wang, David Bell, Yaxin Bi, and Kieran Greer, “KNN Model-Based Approach in Classification.”

17. Shivam Sarawat, “Cyberbullying detection from text using ensemble classifier technique with base classifier decision trees.”

18. Novalita Novalita, Anisa Herdiani, Diyas Puspandari, “Cyberbullying identification on Twitter using random forest classifier.”

19. Niraj Nirmal, Pranil Sable, Prathamesh Patil, Prof. Satish Kuchiwale, “Automated Detection of Cyberbullying Using Machine Learning and OCR.”
