Cyberbullying and Toxic Language Detection on Social Media for Bangla Language

Presented by:
Parom Guha Neogi - 20101562
Anjel Haidar Fahim - 19101093
Faisal Khan - 19101443
Fahim Kabir Khan - 19101557
Md. Fahim Faisal - 19101161

Supervisor: Najeefa Nikhat Choudhury
Senior Lecturer, Department of Computer Science and Engineering, Brac University

Co-supervisor: Md. Faisal Ahmed
Lecturer, Department of Computer Science and Engineering, Brac University

Co-supervisor 2: Md. Moynul Asik Moni
C. Lecturer, Department of Computer Science and Engineering, Brac University
Introduction
Research Problem: Challenges in Cyberbullying Detection

Data Collection and Labeling
• Difficulty in obtaining sufficient labeled data
• Inconsistencies due to platform policies and repeated data
• Enhanced social media policies

Bias in Data
• Demographic and cultural biases
• Subjective labeling and language nuances

Overfitting
• Model specialization to training data
• Poor adaptation to new data

Generalization
• Struggle with new, unseen data
• Domain shift and distribution mismatches
• Degradation of model performance over time
Introduction
Research Problem: Challenges in Cyberbullying Detection (Cont.)

Ethical Concerns
• Privacy and security issues
• Balancing harmful behavior recognition and free expression

Interpretability
• Understanding model decisions
• Dealing with symbols and local slang

Scalability
• Efficient processing of large datasets
• Real-time detection and adaptability
Introduction
Research Objective

Create a Framework for Linguistically Sensitive Detection


• Compile or improve datasets labeled with Bengali cyberbullying examples
• Adapt machine learning algorithms for Bengali language challenges

Utilize Cutting-Edge Machine Learning Innovations


• Research and implement recent developments in machine learning
• Focus on transformer-based models
Introduction
Research Objective (Cont.)

Comprehensive Assessment of System Performance


• Evaluate the system's ability to detect Bengali cyberbullying
• Prioritize metrics like recall, precision, and discrimination between content types

Improve Awareness of Cyberbullying and Preventive Measures


• Enhance digital safety for Bengali-speaking social media users
• Raise awareness about cyberbullying among Bengali-speaking communities
Limitation of Existing Works

- Requires large datasets for optimal results; smaller datasets do not yield the best outcomes due to model complexity.
- Focuses only on text, ignoring other mediums such as images or videos.
- Limited by dataset size and may not be representative of all types of online content.
- Initial experimentation focuses only on specific platforms, such as tweets.
- Narrow focus on just two datasets limits generalizability.
Dataset Description
Data Collection

• Initial Dataset: 44,001 comments compiled and publicly available on Mendeley Data.
• Source of Comments: Gathered from social media platforms, primarily Facebook.
• Diverse Interactions: Comments from actors, influencers, politicians, athletes, etc.
• Expansion of Dataset: Additional 16,000 data points collected from Instagram, YouTube, etc.

Dataset Description
Data Collection (Cont.)

Manual Collection: Human sourcing for a thorough perspective on online debate.

Labeling and Validation: Robust peer validation process for accuracy and reliability.

Final Dataset Composition: 58,448 comments categorized into specific types.

[Figures: Original dataset vs. final (cleaned) dataset composition]
Dataset Description
Data Analysis

Gender
• Distribution: Around 36,000 comments relate to females, with the rest to males.
• Insights: Provides nuanced views on online discourse across social spheres.
• Comprehensiveness: Dataset spans diverse topics, offering insights into online interactions.
• Validation: Rigorous validation ensures data accuracy, reliability, and trustworthiness.
• Potential Insights: Offers understanding of harassment, discrimination, social dynamics, and gender attitudes.

[Figure: Gender Distribution of the Dataset]
Dataset Description
Data Analysis (Cont.)

Category Distribution by Profession:
• Acting Industry: Highly targeted due to public scrutiny.
• Singing Industry: Faces significant harassment under the spotlight.
• Politicians: Criticized and threatened due to divisive politics.
• Sports Personalities: Subject to fan and critic scrutiny.
• Scholars: Vulnerable to harassment in public debates.
• Other Professions: Experience diverse forms of online harassment.

[Figure: Category Distribution]
Dataset Description
Data Analysis (Cont.)

Label Distribution
• Sexual Comments: 17.19% of the dataset; often include offensive language and unwanted approaches.
• Not Bully Comments: 32.98% of the dataset; may contribute to a hostile environment despite seeming benign.
• Troll Comments: 25.15% of the dataset; aimed at evoking strong emotions or sabotaging conversations.
• Religious Comments: 14.87% of the dataset; often polarizing and promoting hate speech.
• Threat Comments: 9.83% of the dataset; pose a direct risk to individuals' safety and well-being.

[Figure: Label Distribution (Type of Bully)]
Methodology
Preprocessing

Data Cleaning:
- Remove noise: missing values, emojis, non-Bengali characters, whitespace.
- Use the `preprocess_bengali_text` function (sketched below).
- Convert emojis, remove residuals, filter non-Bengali letters.

Tokenization with BERT:
- Tokenize text with the BERT tokenizer.
- Add start/end tokens, handle padding/truncation.
- Generate attention masks.

Dataset Splitting:
- Split data: training (80%), validation (10%), testing (10%).
- Validate model performance, test generalization.

Label Encoding & Class Weights:
- Encode labels with `LabelEncoder`.
- Compute class weights with `compute_class_weight`.
- Ensure balanced performance across classes.
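A minimal sketch of these preprocessing steps, assuming a pandas DataFrame with hypothetical `comment` and `label` columns; the file name, the cleaning rules inside `preprocess_bengali_text`, and constants such as `max_length=128` are assumptions rather than the exact implementation.

```python
import re

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.utils.class_weight import compute_class_weight
from transformers import BertTokenizer


def preprocess_bengali_text(text: str) -> str:
    """Strip emojis and non-Bengali characters, collapse whitespace (assumed rules)."""
    text = re.sub(r"[^\u0980-\u09FF\s]", " ", str(text))  # keep the Bengali Unicode block only
    return re.sub(r"\s+", " ", text).strip()


df = pd.read_csv("bengali_cyberbullying.csv").dropna(subset=["comment", "label"])  # hypothetical file/columns
df["comment"] = df["comment"].apply(preprocess_bengali_text)

# 80/10/10 stratified split into training, validation, and test sets
train_df, temp_df = train_test_split(df, test_size=0.2, stratify=df["label"], random_state=42)
val_df, test_df = train_test_split(temp_df, test_size=0.5, stratify=temp_df["label"], random_state=42)

# Label encoding and class weights to counter class imbalance
le = LabelEncoder()
y_train = le.fit_transform(train_df["label"])
class_weights = compute_class_weight("balanced", classes=np.unique(y_train), y=y_train)

# BERT tokenization: special tokens, padding/truncation, attention masks
tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
train_enc = tokenizer(list(train_df["comment"]), padding="max_length",
                      truncation=True, max_length=128, return_tensors="pt")
```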
Methodology
Model Training Setup

Cross-Validation Setup:
- Use stratified k-fold cross-validation.
- Dataset split into 5 equal folds.
- Train for 20 epochs per fold.
- Ensures robust performance evaluation.

GPU/CPU Configuration:
- Detect GPU availability for faster training.
- Use `torch.device` to manage hardware resources.
- Model moved to the designated device with `.to(device)`.

Pre-Trained BERT Initialization:
- Used `bert-base-multilingual-cased` from the transformers library (see the setup sketch below).
- BERT optimized for Bengali text classification.
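A sketch of this setup, assuming `texts` and `y_all` hold the cleaned comments and encoded labels from the preprocessing sketch; the optimizer and learning rate are assumptions, since the slides do not state them.

```python
import torch
from sklearn.model_selection import StratifiedKFold
from transformers import BertForSequenceClassification

# Prefer a GPU when available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
NUM_EPOCHS = 20

for fold, (train_idx, val_idx) in enumerate(skf.split(texts, y_all), start=1):
    # Re-initialize the pre-trained multilingual BERT for each fold and move it to the device
    model = BertForSequenceClassification.from_pretrained(
        "bert-base-multilingual-cased", num_labels=len(set(y_all))).to(device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # assumed hyperparameter
    for epoch in range(NUM_EPOCHS):
        ...  # per-fold training and validation steps (see the loop sketch later)
```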
Methodology
Model Training Initialization
Methodology
Model Architecture

BERT Overview:
- Model: `bert-base-multilingual-cased`.
- Understands word context bidirectionally.

Input Representation:
- Token Embeddings
- Segment Embeddings
- Position Embeddings

Transformer Layers:
- Multi-Head Self-Attention
- Feed-Forward Neural Network
- Residual Connections and Layer Normalization

Output Representations:
- [CLS] Token
- [SEP] Token

Sequence Classification with BERT:
- Input Layer
- BERT Encoder
- Classification Layer (illustrated in the sketch below)
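To show how the [CLS] representation feeds the classification layer, here is an illustrative sketch of an equivalent custom head on top of `bert-base-multilingual-cased`; the class name `BanglaBullyClassifier` and the dropout rate are assumptions, and `BertForSequenceClassification` bundles the same structure internally.

```python
import torch.nn as nn
from transformers import BertModel


class BanglaBullyClassifier(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, num_classes: int = 5):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-multilingual-cased")
        self.dropout = nn.Dropout(0.1)  # assumed dropout rate
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_classes)

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        cls_embedding = outputs.last_hidden_state[:, 0, :]  # [CLS] token representation
        return self.classifier(self.dropout(cls_embedding))  # raw logits over the classes
```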
Result Analysis

1. Data Loading and Preprocessing
2. Tokenization
3. Splitting Data
4. Label Encoding
5. Model Initialization
6. Training Loop
7. Testing Phase
8. Visualization and Reporting

(Steps 6 and 7 are sketched below.)
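A condensed sketch of steps 6 and 7, assuming DataLoaders that yield `(input_ids, attention_mask, labels)` batches, the `class_weights` and `device` from the earlier sketches, and the custom classifier from the Model Architecture slide (which returns raw logits).

```python
import torch
import torch.nn as nn

# Class-weighted loss, built from the weights computed with `compute_class_weight`
criterion = nn.CrossEntropyLoss(
    weight=torch.tensor(class_weights, dtype=torch.float).to(device))


def train_one_epoch(model, loader, optimizer):
    model.train()
    total_loss = 0.0
    for input_ids, attention_mask, labels in loader:
        optimizer.zero_grad()
        logits = model(input_ids.to(device), attention_mask.to(device))
        loss = criterion(logits, labels.to(device))
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    return total_loss / len(loader)


@torch.no_grad()
def evaluate(model, loader):
    """Testing phase: collect predicted and true labels for the reports below."""
    model.eval()
    preds, golds = [], []
    for input_ids, attention_mask, labels in loader:
        logits = model(input_ids.to(device), attention_mask.to(device))
        preds.extend(logits.argmax(dim=1).cpu().tolist())
        golds.extend(labels.tolist())
    return preds, golds
```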
Result Analysis
Training Phase (Cont.)

Fold-wise Classification Reports:

Fold 1 Performance at a Glance:
- Achieved an accuracy of approximately 88.99%
- Support: 18,704
- F1-score: 0.89
- Total Accuracy: 0.89

Here is the detailed report on Fold 1.


Result Analysis
Training Phase (Cont.)

Fold-wise Classification Reports:

Fold 2 Performance at a Glance:
- Achieved an accuracy of around 96.46%
- Support: 18,703
- F1-score: 0.96
- Total Accuracy: 0.96

Here is the detailed report on Fold 2.
Result Analysis
Training Phase (Cont.)

Fold-wise Classification Reports:

Fold 3 Performance at a Glance:
- Achieved an accuracy of almost 97.11%
- Support: 18,703
- F1-score: 0.97
- Total Accuracy: 0.97

Here is the detailed report on Fold 3.


Result Analysis
Training Phase (Cont.)

Fold-wise Classification Reports:

Fold 4 Performance at a Glance:
- Achieved an exceptional accuracy of about 97.49%
- Support: 18,703
- F1-score: 0.97
- Total Accuracy: 0.97

Here is the detailed report on Fold 4.
Result Analysis
Training Phase (Final Fold)

Fold-wise Classification Reports:

Fold 5 Performance at a Glance:
- Achieved an outstanding precision of approximately 98.37%
- Support: 18,703
- F1-score: 0.98
- Total Accuracy: 0.98

Here is the detailed report on Fold 5.


Result Analysis
Training and Validation Losses & Accuracies

[Figures: Training and Validation Losses; Training and Validation Accuracies]


Result Analysis
Overall Metrics

Accuracy Assessment: Achieved 93.47%.

Class-wise Evaluation:
- Not Bully: 94% precision, recall, and F1.
- Religious: Consistent 94% metrics.
- Sexual: 93% accuracy, 94% precision and recall.
- Threat: Impressive 94% accuracy.
- Troll: Over 92% accuracy.

Interpretation:
- Overall Metrics: Above 94%.
- Balanced Performance.

[Figure: Testing Phase Results]
Result Analysis
Comprehensive Performance Analysis of the Online Behavior Classification Model

Performance Analysis: Evaluates the model's comment classification.

Classes Evaluated:
- Not Bully: 94.43% Precision, 93.80% Recall, 94.12% F1-score
- Religious: 94.07% Precision, 94.40% Recall, 94.24% F1-score
- Sexual: 93.46% Precision, 94.29% Recall, 93.87% F1-score
- Threat: 93.81% Precision, 93.46% Recall, 93.63% F1-score
- Troll: 91.76% Precision, 92.09% Recall, 91.88% F1-score

Insights: The model shows high accuracy and reliability (see the reporting sketch below).
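The per-class figures above are the kind of output scikit-learn's `classification_report` produces; a short sketch, assuming `golds` and `preds` come from the testing-phase sketch and `le` is the fitted `LabelEncoder`.

```python
from sklearn.metrics import classification_report

# Per-class precision, recall, F1-score, and support on the test set
print(classification_report(golds, preds, target_names=list(le.classes_), digits=4))
```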
Result Analysis
Confusion Matrix Analysis

- The confusion matrix summarizes the classification model's performance on the test data (a plotting sketch follows).

Improvement Areas:
- Enhance accuracy and robustness; refine the architecture.
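A minimal plotting sketch, assuming `golds`, `preds`, and `le` from the earlier sketches; the colormap and label rotation are cosmetic choices.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

cm = confusion_matrix(golds, preds)
ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=le.classes_).plot(
    cmap="Blues", xticks_rotation=45)
plt.title("Confusion Matrix - Test Set")
plt.tight_layout()
plt.show()
```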
Result Analysis
ROC Curve and AUC

ROC Curve:
- Plots the True Positive Rate against the False Positive Rate at different classification thresholds.

AUC:
- Summarizes a classifier's overall performance as the area under its ROC curve (a plotting sketch follows).

Conclusion: The model effectively identifies dangerous remarks, promoting online safety.
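Because the task has five classes, a common approach is one-vs-rest ROC curves with a per-class AUC; a sketch assuming `probs` holds the softmax probabilities collected during the testing phase (not shown in the earlier loop) and `golds` the encoded true labels.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import auc, roc_curve
from sklearn.preprocessing import label_binarize

# One-vs-rest ROC curve and AUC for each of the five classes
y_true = label_binarize(golds, classes=list(range(len(le.classes_))))
for i, name in enumerate(le.classes_):
    fpr, tpr, _ = roc_curve(y_true[:, i], probs[:, i])
    plt.plot(fpr, tpr, label=f"{name} (AUC = {auc(fpr, tpr):.2f})")
plt.plot([0, 1], [0, 1], linestyle="--", color="grey")  # chance-level reference
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.legend()
plt.show()
```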
Conclusion
• Successful use of transformer models in Bengali cyberbullying detection.
• Acknowledgment of challenges in evolving cyberbullying landscape.
• Commitment to continuous improvement for prevention.
• Emphasis on collaboration for safer online spaces.
• Optimism about technology's role in combating cyberbullying.
• Call for sustained efforts in research, intervention, and education.

Future Work Plans
• Partner with social platforms for real-time application.
• Incorporate user feedback for algorithm refinement.
• Explore advanced NLP techniques.
• Develop automated moderation and response.
• Conduct longitudinal studies on cyberbullying trends.
• Enhance system for linguistic and cultural nuances.
• Collaborate with mental health organizations for support.
• Launch educational campaigns for awareness.
THANK YOU

QUESTIONS?