A13 Final


KLE Society's

KLE Technological University, Hubballi.

A Minor Project -2 Report


On

Toxic Comment
Classification

submitted in partial fulfillment of the requirement for the degree of

Bachelor of Engineering
In
School of Computer Science and Engineering

Submitted By

Vinay Kademani 01FE21BCS015


Manjunath P M 01FE21BCS071

Rajat Badiger 01FE21BCS082

Sandeep D Angadi 01FE21BCS293

Under the guidance of


Mrs Lalitha Madanbhavi

SCHOOL OF COMPUTER SCIENCE & ENGINEERING

HUBBALLI – 580 031

Academic year 2023-24


KLE Society's
KLE Technological University, Hubballi.

2023 - 2024

SCHOOL OF COMPUTER SCIENCE & ENGINEERING

CERTIFICATE

This is to certify that Minor Project -2 entitled “Machine Learning Approach for Detecting
Toxic Comments” is a bona fide work carried out by the student team Vinay Kademani
[01FE21BCS015], Manjunath P M [01FE21BCS071], Rajat Badiger [01FE21BCS082], Sandeep
D Angadi [01FE21BCS293] in partial fulfillment of completion of Sixth semester B. E. in School of
Computer Science and Engineering during the year 2023-2024. The project report has been
approved as it satisfies the academic requirement with respect to the project work prescribed
for the above said program.

Guide Head, SoCSE

Mrs Lalitha Madanbhavi Dr. Vijayalakshmi.M

External Viva -Voce:

Name of the Examiners Signature with date


1.
2.
Acknowledgement
We would like to thank our faculty and management for their professional guidance towards
the completion of the project work. We take this opportunity to thank Dr. Ashok Shettar,
Vice-Chancellor, Dr. B.S.Anami, Registrar, and Dr. P.G Tewari, Dean Academics, KLE
Technological University, Hubballi, for their vision and support.
We also take this opportunity to thank Dr. Meena S. M, Professor and Dean of Faculty,
SoCSE and Dr. Vijayalakshmi M, Professor and Head, SoCSE for having provided us direction
and facilitated for enhancement of skills and academic growth.
We thank our guide Prof Lalitha Madanbhavi, for the constant guidance during interaction
and reviews.
We extend our acknowledgement to the reviewers for critical suggestions and inputs. We
also thank Project Co-ordinator Dr. Uday Kulkarni, and reviewers for their suggestions during
the course of completion.
We express gratitude to our beloved parents for constant encouragement and support.

Vinay Kademani - 01FE21BCS015


Manjunath P M - 01FE21BCS071
Rajat Badiger - 01FE21BCS082
Sandeep D Angadi - 01FE21BCS293

ABSTRACT
In recent years, the proliferation of toxic comments on online platforms has become a significant
concern, necessitating the development of automated systems to detect and mitigate
harmful content. This project presents a machine learning-based approach to classifying toxic
comments using the Jigsaw Toxic Comment Classification dataset. The implemented solution
leverages TensorFlow for model development and Gradio for creating an interactive user
interface. The model is trained to identify various types of toxicity, including severe toxicity,
obscenity, threats, insults, and identity-based hate.
The preprocessing phase involves text vectorization to convert comments into numerical
representations suitable for input into the neural network. The model architecture, comprising
multiple layers, is optimized to achieve high accuracy in toxicity detection. A user-friendly
interface is designed using Gradio, allowing users to input comments and receive real-time
toxicity assessments.
The project demonstrates significant advancements in automated content moderation, providing
an effective tool for maintaining safer online communities. Future work aims to enhance
the model’s accuracy, incorporate multilingual support, and improve the contextual understanding
of comments. The integration of user feedback and advancements in explainability
will further refine the system, making it a robust solution for combating online toxicity.
Keywords: Toxic Comment Classification, Machine Learning, TensorFlow, Gradio, Text
Vectorization, Content Moderation, Neural Networks, Online Safety.

CONTENTS

Acknowledgement

CONTENTS

1 INTRODUCTION
1.1 Motivation
1.2 Literature Review / Survey
1.3 Problem Statement
1.4 Applications
1.5 Objectives and Scope of the project
1.5.1 Objectives
1.5.2 Scope of the project

2 REQUIREMENT ANALYSIS
2.1 Functional Requirements
2.2 Non Functional Requirements
2.3 Hardware Requirements
2.4 Software Requirements

3 SYSTEM DESIGN
3.0.1 Data Preprocessing
3.0.2 Model Development
3.0.3 Model Training
3.0.4 User Interface Creation
3.1 Architecture Design
3.1.1 Overview
3.2 Architecture
3.2.1 Components
3.2.2 Workflow
3.2.3 Technologies Used
3.3 Data Design
3.3.1 Dataset Overview
3.3.2 Data Preprocessing
3.3.3 Data Splitting
3.3.4 Feature Engineering
3.3.5 Data Augmentation
3.3.6 Data Storage
3.4 User Interface Design
3.4.1 Interface Components
3.4.2 Design Considerations
3.4.3 Implementation Details
3.5 Conclusion

4 IMPLEMENTATION
4.1 Implementation

5 RESULTS AND DISCUSSIONS
5.0.1 Model Performance
5.0.2 Confusion Matrix Analysis
5.0.3 User Interface Evaluation
5.0.4 Discussion

6 CONCLUSION AND FUTURE SCOPE
6.0.1 Conclusion
6.0.2 Future Scope

REFERENCES

Toxic Comment Classification

Chapter 1

INTRODUCTION
The surge of online communication has ushered in an era where user-generated content is
abundant, but so is the prevalence of toxic comments, which include hate speech, threats, and
harassment. Addressing this challenge is crucial for maintaining healthy online environments.
This project explores an advanced machine learning approach to classify toxic comments
into various categories using Bidirectional Long Short-Term Memory (BiLSTM) networks.
We leverage comprehensive text preprocessing techniques and vectorization methods, such as
TF-IDF and word embeddings, to convert textual data into numerical formats suitable for
neural networks. By employing TensorFlow and Keras, along with optimization techniques
like dropout regularization and early stopping, our model aims to achieve high accuracy and
robustness in identifying toxic comments, thereby contributing to the development of more
effective real-time moderation systems.

1.1 Motivation
The motivation for this project is based on the urgent need to foster safe and respectful
online environments. With the exponential growth of social networks and online platforms,
toxic comments - comprising hate speech, threats, and harassment - have become pervasive,
causing significant harm to individuals and communities. Current moderation techniques often
fall short due to their inability to handle the sheer volume and complexity of user-generated
content. By developing an advanced machine learning model that uses bidirectional LSTM
networks and sophisticated text preprocessing methods, we aim to provide a more accurate and
efficient solution for detecting and filtering toxic comments. Specifically, our model classifies
comments into various toxicity categories: Toxic, Severe Toxic, Obscene, Threat, Insult, and
Identity Hate. This project aims to improve the quality of online interactions and contribute
to the broader efforts to promote digital well-being.

1.2 Literature Review / Survey


The research paper "UTILIZING SUBJECTIVITY LEVEL TO MITIGATE IDENTITY
TERM BIAS IN TOXIC COMMENTS CLASSIFICATION" by Zhao, Zhang, and Hopfgartner
addresses the issue of bias in toxic comment classification models, particularly toward

School of Computer Science & Engineering, KLE Technological University, Hubballi - 31 1


identity terms such as "Muslim" or "black," which often lead to higher false positive rates.
The authors propose a novel approach leveraging the subjectivity level of comments containing
identity terms to mitigate this bias. Using a model called Subdentity-Sensitive BERT (SS-BERT),
the study incorporates subjectivity features based on lexicon tools and an innovative
method that calculates embedding similarity between comments and related Wikipedia content.
Evaluated across multiple datasets from different social media platforms, their method
consistently outperformed state-of-the-art baselines, displaying significant improvements in
reducing false positives and enhancing classification accuracy. The findings highlight that
considering both subjectivity and identity terms is more effective than addressing either feature
in isolation, establishing a new direction for reducing biases in automated toxicity detection
systems.

The paper delves into the challenge of identity term bias in toxic comment classification
(TCC) tasks. Traditional methods relying on BERT are often plagued by false positives,
particularly when identity terms are present. The authors propose SS-BERT, an enhanced
model that integrates the subjective nature of comments with the presence of identity terms to
better handle this bias. By comparing SS-BERT to BERT and SO-BERT (Subjectivity-Only
BERT), it is demonstrated that SS-BERT significantly reduces false positives and outperforms
its counterparts by leveraging a Wikipedia-based proxy for subjectivity. This method
surpasses state-of-the-art lexicon tools in effectiveness, suggesting that assessing the subjectivity
level against reference texts can offer substantial improvements. The research underscores
the necessity of incorporating both subjectivity and identity term presence to mitigate bias
in TCC and presents a novel approach that consistently enhances performance across various
datasets. The paper also explores related work in the domains of hate speech detection,
transfer learning, and bias mitigation, providing a comprehensive examination of existing
methodologies and their limitations.

In recent years, there have been notable advancements in detecting offensive language
through hybrid deep learning models. Aldhyani et al. created a cyberbullying detection
system that merges CNN and BiLSTM architectures to identify abusive behavior on social
media. Their approach improves text representation using data augmentation techniques such
as Doc2vec and TF-IDF, and employs the LIME method for prediction explanations. Other
significant contributions include Yin et al.’s combination of BERT with BiGRU for sentiment
analysis, which achieved high accuracy with Twitter data, and Basarslan et al.’s MBi-GRUMCONV
model that integrates Word2Vec for better performance. Das et al. showcased
the effectiveness of MuRIL and XLM-Roberta in recognizing hate speech in Bengali, while
Velankar et al. utilized deep learning to detect hate speech in Hindi and Marathi, noting that
transformer-based models often surpass BERT-based models. Patankar et al. investigated
offensive remark detection in Tamil and Tamil-English code-mixed content, emphasizing the
success of ensemble models and recurrent neural networks. The varied methodologies across
different languages highlight the potential of hybrid deep learning models to enhance the
accuracy and effectiveness of toxic comment detection systems.

The research on toxicity detection in online discussions highlights various methodologies
and advances in text classification and sentiment analysis, particularly for non-English
languages. Previous work includes Khachidze et al.’s (2015) heuristic-based text classification
system for Georgian documents, demonstrating an 11[1]

1.3 Problem Statement


The problem addressed by this project is the prevalence of toxic comments on online platforms,
which can harm user experience and community health. Detecting and mitigating such
comments is crucial for maintaining a positive environment.

1.4 Applications
• Social Media Platforms: Automatically filter and flag toxic comments to maintain a
healthy and respectful environment.

• Online Forums and Communities: Enhance community health by identifying and mitigating
toxic behavior.

• News Websites and Blogs: Moderate user comments to prevent the spread of hate speech
and harassment.

• Gaming Platforms: Monitor and moderate in-game chat to prevent bullying and abusive
language.

• Customer Support Services: Automatically detect and address toxic language in customer
interactions.

• Educational Platforms: Maintain respectful and constructive discussions in educational
forums and online classrooms.

• Corporate Intranets: Monitor internal communication channels to maintain professionalism
and respect.

• Government and Public Sector: Monitor public forums and social media channels to
detect and address harmful speech.


1.5 Objectives and Scope of the project

1.5.1 Objectives
• Toxic Comment Classification: Develop a model capable of accurately classifying text
comments into various toxicity categories such as toxic, severe toxic, obscene, threat,
insult, and identity hate.

• Utilization of Bidirectional LSTM: Implement bidirectional LSTM neural networks to
capture both past and future context dependencies in text sequences, enhancing the
model’s ability to understand and classify toxic language.

• Enhanced Preprocessing with Toxic Vectorization: Use specialized vectorization techniques
tailored for toxic comment classification to preprocess text data effectively before
feeding it into the neural network model.

• Optimization Techniques: Implement optimization strategies including dropout regularization
and early stopping to improve model generalization and prevent overfitting on
the training data.

• Framework and Tools: Utilize TensorFlow and Keras frameworks for efficient model
development and training, leveraging their capabilities for deep learning tasks and neural
network architectures.

• Evaluation Metrics: Evaluate model performance using appropriate metrics such as accuracy,
precision, recall, and F1-score for each toxicity category, ensuring comprehensive
assessment of classification effectiveness.
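
To make the per-category evaluation concrete, the following is a minimal sketch of how precision, recall, and F1-score can be computed separately for each toxicity label in a multi-label setting. The toy label matrices and the helper name per_label_metrics are illustrative assumptions, not results or code from this project; the label names follow the column names of the Jigsaw dataset.

```python
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]


def per_label_metrics(y_true, y_pred, labels=LABELS):
    """Return {label: (precision, recall, f1)} for rows of 0/1 label vectors."""
    results = {}
    for j, label in enumerate(labels):
        pairs = [(t[j], p[j]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        # Define each ratio as 0.0 when its denominator is zero.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results[label] = (precision, recall, f1)
    return results


# Toy data (made up for illustration): four comments, six labels each.
y_true = [[1, 0, 1, 0, 1, 0], [0, 0, 0, 0, 0, 0], [1, 1, 1, 0, 1, 0], [1, 0, 0, 0, 0, 0]]
y_pred = [[1, 0, 1, 0, 0, 0], [1, 0, 0, 0, 0, 0], [1, 0, 1, 0, 1, 0], [0, 0, 0, 0, 0, 0]]

metrics = per_label_metrics(y_true, y_pred)
for label, (p, r, f1) in metrics.items():
    print(f"{label:14s} precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Reporting the metrics per label matters here because rare categories such as threat can score poorly even when overall accuracy looks high. In practice, scikit-learn's classification_report could likely replace this hand-written helper.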

1.5.2 Scope of the project


• Textual Data Characteristics: The methods assume that the textual data, primarily
comments extracted from online platforms, adhere to conventional patterns of language
use and syntactic structure. Deviations such as non-standard dialects, heavily abbreviated
text, or highly context-dependent slang may impact model performance.

• Label Quality and Consistency: The accuracy and consistency of toxicity labels assigned
to comments in the dataset are crucial. Variability or ambiguity in labeling, especially
across multiple annotators or sources, may affect the model’s ability to generalize effectively.

• Model Training and Optimization: The scope includes the training of bidirectional
LSTM networks using TensorFlow and Keras frameworks, incorporating techniques like
dropout regularization and early stopping. Model performance is optimized based on
standard metrics such as accuracy, precision, recall, and F1-score.

• Toxicity Categories: The classification scope encompasses categories including toxic,
severe toxic, obscene, threat, insult, and identity hate. The methods aim to differentiate
between these categories accurately, reflecting varying degrees and types of toxic
behavior.

• Generalization and Scalability: While efforts are made to generalize the model’s capability
across diverse datasets and comment types, the scope acknowledges potential
limitations in scalability and generalization to entirely new or drastically different contexts
beyond the training dataset.

• Computational Resources: The methods assume access to sufficient computational resources
for training and evaluation processes, including GPU acceleration where feasible,
to optimize model performance within reasonable time frames.

• Ethical and Legal Considerations: The project adheres to ethical guidelines regarding
the use of potentially sensitive data and the implications of automated content moderation.
Legal boundaries, including compliance with data privacy regulations and platform
policies, are respected throughout the project lifecycle.


Chapter 2

REQUIREMENT ANALYSIS
Toxic comment classification aims to automatically detect and categorize harmful online
comments into types such as toxic, severe toxic, obscene, threat, insult, and identity hate.
Leveraging bidirectional LSTM networks, this approach captures contextual dependencies in
text data. The model is trained and optimized using TensorFlow and Keras, with preprocessing
techniques like tokenization and vectorization. Key optimization strategies, including
dropout regularization and early stopping, enhance performance. Evaluation metrics like accuracy,
precision, recall, and F1-score ensure robust assessment of the model’s effectiveness.

2.1 Functional Requirements


• The system shall ingest text data from various online platforms and support multiple
data formats.

• The system shall tokenize, normalize, and vectorize input text data, handling noise
reduction and removal of special characters and stop words.

• The system shall train a bidirectional LSTM model using labeled datasets, supporting
hyperparameter tuning for optimal performance.

• The system shall classify comments into multiple toxicity categories (toxic, severe toxic,
obscene, threat, insult, identity hate) using a sigmoid activation function in the output
layer.

• The system shall implement dropout regularization and early stopping to enhance model
generalization and prevent overfitting.

• The system shall evaluate the model using metrics such as accuracy, precision, recall,
and F1-score, generating a detailed performance report.

• The system shall provide a user-friendly interface for inputting text and viewing classification
results, displaying model performance metrics and evaluation results.
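
As a rough illustration of the model-related requirements above, the following sketch builds a bidirectional LSTM with dropout and a six-unit sigmoid output layer, and prepares an early-stopping callback. The function name build_bilstm_classifier, the vocabulary size, layer widths, dropout rate, and patience value are all illustrative assumptions, not the settings used in this project.

```python
import tensorflow as tf

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]


def build_bilstm_classifier(vocab_size=20000, embed_dim=32, lstm_units=32,
                            dropout_rate=0.3):
    """Multi-label classifier: one independent sigmoid output per toxicity label."""
    # Variable-length sequences of integer token ids.
    inputs = tf.keras.Input(shape=(None,), dtype="int64")
    x = tf.keras.layers.Embedding(vocab_size + 1, embed_dim)(inputs)
    # The bidirectional wrapper reads the sequence left-to-right and right-to-left.
    x = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(lstm_units))(x)
    x = tf.keras.layers.Dropout(dropout_rate)(x)
    x = tf.keras.layers.Dense(64, activation="relu")(x)
    # Sigmoid (not softmax): a comment may carry several labels at once.
    outputs = tf.keras.layers.Dense(len(LABELS), activation="sigmoid")(x)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=[tf.keras.metrics.BinaryAccuracy()])
    return model


# Stop training once validation loss stops improving (patience is an assumption).
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=2, restore_best_weights=True)

model = build_bilstm_classifier()
model.summary()
```

The callback would be passed to model.fit through its callbacks argument together with a validation set. Binary cross-entropy is paired with the sigmoid outputs so that each label is treated as its own yes/no decision.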


2.2 Non Functional Requirements


• The system shall process and classify at least 1000 comments per second to ensure
real-time analysis capability.

• The system shall be able to scale horizontally to handle increased data loads, supporting
up to 10 million comments per day without degradation in performance.

• The system shall have an uptime of 99.9 percent, ensuring high availability for users and
continuous operation.

• The system shall support the storage and retrieval of up to 1 terabyte of comment data
for historical analysis and model training.

• The system shall classify each comment with an average latency of no more than 100
milliseconds to maintain responsiveness.

• The system shall maintain a throughput of at least 500 concurrent classifications without
significant performance degradation.

• The system shall retain processed comment data for a minimum of one year, enabling
longitudinal analysis and retraining of models with historical data.
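
The throughput, volume, and latency targets above can be checked against one another with a short back-of-the-envelope calculation. The sketch below assumes steady-state load and applies Little's law (requests in flight = arrival rate × time in system); only the four target figures come from the requirements, and the interpretation is ours.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Targets taken from the non-functional requirements.
daily_volume = 10_000_000      # comments per day
peak_rate_per_s = 1000         # comments per second
avg_latency_s = 0.100          # 100 milliseconds per comment
concurrency_target = 500       # concurrent classifications

# Average rate implied by the daily volume if traffic were spread evenly.
avg_rate_per_s = daily_volume / SECONDS_PER_DAY

# Little's law: average number of requests in flight at the peak rate.
in_flight_at_peak = peak_rate_per_s * avg_latency_s

print(f"average rate: {avg_rate_per_s:.1f} comments/s")
print(f"peak/average headroom: {peak_rate_per_s / avg_rate_per_s:.1f}x")
print(f"requests in flight at peak: {in_flight_at_peak:.0f} (target {concurrency_target})")
```

Under these assumptions the daily volume corresponds to roughly 116 comments per second on average, so the 1000 comments per second figure acts as peak headroom, and the concurrency implied at that peak stays below the 500-classification target.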

2.3 Hardware Requirements


• CPU:

– A multi-core processor is recommended for faster computation.

• RAM:

– At least 8 GB of RAM for smaller datasets.


– 16 GB or more for larger datasets to avoid memory issues.

• Storage:

– Sufficient disk space to store datasets and model files (at least 10 GB free space).

• GPU (Optional but recommended):

– A compatible NVIDIA GPU with CUDA support for faster training times.
– At least 4 GB of GPU memory is recommended.


• Disk Space:

– The dataset and model files might take up significant disk space. Ensure you have
enough free space (at least 10 GB).

2.4 Software Requirements

– Operating System:
∗ Windows, macOS, or Linux
– Python:
∗ Python 3.6 or higher
– Python Libraries:
∗ TensorFlow (including TensorFlow GPU support if using a GPU)
∗ Pandas
∗ NumPy
∗ Matplotlib (optional, for plotting)
∗ scikit-learn
∗ Gradio (for creating interfaces)
These can be installed using the following command:

pip install tensorflow pandas numpy matplotlib scikit-learn gradio

– Jupyter Notebook:
∗ Jupyter Notebook or Jupyter Lab to run and edit the notebook


Chapter 3

SYSTEM DESIGN
The system design of the toxic comment classifier encompasses several critical components
and processes that work together to deliver an effective solution for detecting
toxic comments. The system is structured into four main phases: data preprocessing,
model development, model training, and user interface creation.

3.0.1 Data Preprocessing

This phase involves loading the dataset, cleaning the text data, and converting it into
numerical representations using text vectorization. This ensures that the data is in a
suitable format for machine learning model training.

3.0.2 Model Development

The model is built using TensorFlow/Keras, with a neural network architecture designed
to handle multi-label classification tasks. Key layers include embedding, convolutional,
and dense layers, which help capture complex patterns in the text.

3.0.3 Model Training

The training process includes splitting the data into training and test sets, fitting the
model on the training data, and evaluating its performance on the test data. Techniques
to monitor and prevent overfitting are also applied.

3.0.4 User Interface Creation

An interactive user interface is developed using the Gradio library. This interface allows
users to input comments and receive real-time toxicity evaluations. The design ensures
ease of use and accessibility.
This system design ensures a coherent workflow from data preprocessing to user interaction,
facilitating an efficient and user-friendly approach to identifying toxic comments.


3.1 Architecture Design

3.1.1 Overview

The architecture of the toxicity classification system is designed to effectively process and
classify textual comments based on their toxicity. It consists of several key components,
including data preprocessing, model training, and deployment. The architecture leverages
the capabilities of deep learning using TensorFlow and integrates with a web-based
interface for real-time comment scoring.

3.2 Architecture

Figure 3.1: Architecture

3.2.1 Components

– Data Collection and Preprocessing:


∗ The data is collected from the Jigsaw Toxic Comment Classification Challenge
dataset. This dataset contains a large number of comments labeled with different
types of toxicity such as toxic, severe toxic, obscene, threat, insult, and
identity hate.
∗ Preprocessing involves cleaning the text data, such as removing special characters
and converting text to lowercase. The comments are then tokenized and
transformed into sequences of integers using TensorFlow’s TextVectorization
layer.
– Model Training:
∗ The core of the system is a deep learning model built using TensorFlow. The
model architecture includes an embedding layer, followed by convolutional layers
and fully connected layers. This allows the model to learn meaningful
representations of the comments and accurately classify them.
∗ The model is trained using the preprocessed data, optimizing for classification
accuracy. The training process involves iterating over the dataset and updating
the model’s weights using backpropagation and gradient descent.
– Model Evaluation:
∗ After training, the model is evaluated on a separate validation set to ensure it
generalizes well to new, unseen data. Key metrics such as accuracy, precision,
recall, and F1 score are used to assess the model’s performance.
– Model Deployment:
∗ Once the model is trained and evaluated, it is saved and deployed using a
web-based interface powered by Gradio. This interface allows users to input
comments and receive real-time toxicity scores.
∗ The deployment includes setting up a server to host the model and the Gradio
interface, ensuring it is accessible to end-users.

3.2.2 Workflow

1. Data Ingestion: The raw data is ingested and preprocessed to create a clean
dataset suitable for training.
2. Model Training: The preprocessed data is used to train the TensorFlow model,
which learns to classify comments based on their toxicity.
3. Evaluation: The trained model is evaluated on a validation set to measure its
performance and make necessary adjustments.
4. Deployment: The final model is deployed using Gradio, providing an interactive
interface for real-time toxicity classification.

3.2.3 Technologies Used

1. TensorFlow: A powerful deep learning framework used for building and training
the model.
2. Pandas and NumPy: Libraries used for data manipulation and preprocessing.
3. Gradio: A library for creating web-based interfaces, allowing users to interact with
the model in real time.

4. Jupyter Notebook: An interactive environment used for developing and documenting
the project.

The architecture design ensures that the system is scalable, efficient, and easy to use,
providing accurate toxicity classification for user-generated comments.

3.3 Data Design

The data design for the toxicity classification project is crucial for ensuring that the
machine learning model can effectively learn and make accurate predictions. This section
outlines the structure, preprocessing, and transformation of the dataset used in the
project.

3.3.1 Dataset Overview

The dataset utilized for this project is the Jigsaw Toxic Comment Classification dataset,
which is a comprehensive and widely used dataset for building models to detect toxic
online comments. The dataset contains a large collection of comments from Wikipedia’s
talk page edits, each annotated for six different types of toxicity: toxic, severe toxic,
obscene, threat, insult, and identity hate. These annotations provide a multi-label
classification problem where a single comment can belong to multiple toxicity categories
simultaneously. The dataset is split into a training set and a test set, allowing for
effective model training and evaluation. Each comment in the dataset is paired with
binary labels indicating the presence or absence of each toxicity type. This rich labeling
allows for nuanced model training that can capture the complex nature of toxic language.
The diversity and size of the dataset make it an ideal choice for developing robust
machine learning models aimed at improving content moderation and ensuring safer
online communities.
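
The multi-label structure described above can be illustrated with a small sketch that turns each record into a comment paired with a six-element binary vector. The two sample rows are invented for illustration and are not taken from the dataset; the column layout (id, comment_text, then six label columns) is assumed to match the public Kaggle release of the training file.

```python
import csv
import io

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Invented rows in the assumed column layout of the training file.
SAMPLE_CSV = """id,comment_text,toxic,severe_toxic,obscene,threat,insult,identity_hate
a1,"Thanks for fixing the citation.",0,0,0,0,0,0
b2,"You are a worthless idiot.",1,0,0,0,1,0
"""


def read_examples(handle):
    """Return a list of (comment_text, [six 0/1 labels]) pairs."""
    examples = []
    for row in csv.DictReader(handle):
        labels = [int(row[name]) for name in LABELS]
        examples.append((row["comment_text"], labels))
    return examples


examples = read_examples(io.StringIO(SAMPLE_CSV))
for text, labels in examples:
    active = [name for name, flag in zip(LABELS, labels) if flag]
    print(f"{text!r} -> {labels} ({', '.join(active) or 'clean'})")
```

The second row carries two labels at once, which is why the task is framed as multi-label rather than multi-class classification.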

3.3.2 Data Preprocessing


Data preprocessing is essential to convert the raw text data into a format suitable for machine
learning. The preprocessing steps include:

1. Text Cleaning:

• Special characters, punctuation, and numbers are removed to clean the text data.
• All text is converted to lowercase to ensure uniformity.


• HTML tags and URLs, if any, are removed.

2. Tokenization:

• The cleaned text is tokenized into individual words or tokens using TensorFlow’s
TextVectorization layer.

3. Padding:

• The tokenized sequences are padded to ensure that all sequences have the same
length, which is necessary for batch processing in neural networks.

4. Vectorization:

• The tokenized text is converted into sequences of integers, with each integer representing
a specific word in the vocabulary.
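
The text cleaning step listed above can be sketched with a few regular expressions. The patterns and the function name clean_text are illustrative assumptions rather than the exact rules used in this project; in particular, the tag pattern is deliberately simple and would also strip any other text enclosed in angle brackets.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")                 # HTML tags such as <b> or </p>
URL_RE = re.compile(r"https?://\S+|www\.\S+")   # bare URLs
NON_LETTER_RE = re.compile(r"[^a-z\s]")         # digits, punctuation, symbols
WHITESPACE_RE = re.compile(r"\s+")


def clean_text(text):
    """Strip tags and URLs, lowercase, drop non-letters, collapse whitespace."""
    text = TAG_RE.sub(" ", text)
    text = URL_RE.sub(" ", text)
    text = text.lower()
    text = NON_LETTER_RE.sub(" ", text)
    return WHITESPACE_RE.sub(" ", text).strip()


raw = "<b>Hello</b> WORLD!! Visit http://example.com/page now 123"
print(clean_text(raw))
```

Tags and URLs are removed before the character filter so that their fragments (for example "http" or "com") do not survive as spurious tokens.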

3.3.3 Data Splitting


The dataset is split into training and test subsets to facilitate effective model training and
evaluation. By dividing the data, we ensure that the model’s performance can be assessed on
unseen data, providing a measure of how well the model generalizes to new, unseen comments.
Typically, a common split ratio is used, such as 80
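
A shuffled train/test split can be sketched as follows. The 20 percent held-out share, the fixed seed, and the helper name train_test_split_indices are assumptions made for illustration; they are not stated as the settings of this project.

```python
import random


def train_test_split_indices(n_examples, test_fraction=0.2, seed=42):
    """Shuffle example indices and return (train_indices, test_indices)."""
    indices = list(range(n_examples))
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_examples * test_fraction))
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = train_test_split_indices(10)
print("train:", sorted(train_idx))
print("test: ", sorted(test_idx))
```

Shuffling before splitting avoids any ordering in the source file leaking into the evaluation. scikit-learn's train_test_split offers the same behavior and could be used instead.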

3.3.4 Feature Engineering


Feature engineering involves transforming raw text data into numerical representations that
can be used by machine learning models. In this project, we use TensorFlow’s TextVectorization
layer to perform essential text preprocessing tasks. This includes tokenization, where the text
is split into individual words or tokens; vocabulary creation, where unique tokens are indexed;
and sequence padding, which ensures that all input sequences are of uniform length. These
steps convert the text into a format that neural networks can process effectively, enabling the
extraction of meaningful patterns and features.
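
The behavior of the TextVectorization layer described above can be seen on a toy corpus. The vocabulary cap and sequence length below are small illustrative values chosen for readability, not the values used in this project.

```python
import tensorflow as tf

MAX_TOKENS = 1000        # illustrative vocabulary cap
SEQUENCE_LENGTH = 8      # illustrative fixed output length

vectorizer = tf.keras.layers.TextVectorization(
    max_tokens=MAX_TOKENS,
    output_mode="int",
    output_sequence_length=SEQUENCE_LENGTH,
)

corpus = tf.constant(["you are great", "you are awful and rude"])

# adapt() builds the vocabulary (the token-to-integer index) from the corpus.
vectorizer.adapt(corpus)

# Calling the layer tokenizes, maps tokens to integers, and pads with zeros.
encoded = vectorizer(corpus)
vocab = vectorizer.get_vocabulary()

print(vocab)
print(encoded.numpy())
```

Index 0 is reserved for padding and index 1 for out-of-vocabulary tokens, so real words start at index 2. By default the layer also lowercases the text and strips punctuation before splitting on whitespace.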

3.3.5 Data Augmentation


To enhance the model’s robustness and generalizability, data augmentation techniques are
employed. These techniques artificially increase the diversity of the training dataset by making
slight modifications to the existing data. Methods such as synonym replacement, where words
are substituted with their synonyms, and random word insertion, where additional relevant
words are added, are used. These augmentations help the model to better handle variations in
the input text, improving its ability to generalize and perform well on new, unseen comments.
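Both augmentation methods can be sketched on a tokenized comment. The synonym table below is a hypothetical toy dictionary, and the function names and probabilities are assumptions for illustration; a real pipeline would draw synonyms from a proper lexical resource. The sketch also assumes that an augmented comment keeps the labels of the original.

```python
import random

# Hypothetical toy synonym table, for illustration only.
SYNONYMS = {"stupid": ["dumb", "foolish"], "hate": ["despise", "loathe"]}

def synonym_replace(words, rng, p=0.5):
    """Swap each word that has a known synonym with probability p."""
    return [rng.choice(SYNONYMS[w]) if w in SYNONYMS and rng.random() < p else w
            for w in words]

def random_insert(words, rng, n=1):
    """Insert a synonym of a randomly chosen known word at a random position, n times."""
    out = list(words)
    candidates = [w for w in out if w in SYNONYMS]
    for _ in range(n if candidates else 0):
        new_word = rng.choice(SYNONYMS[rng.choice(candidates)])
        out.insert(rng.randint(0, len(out)), new_word)
    return out

rng = random.Random(0)
tokens = "i hate this stupid thread".split()
print(synonym_replace(tokens, rng))
print(random_insert(tokens, rng))
```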


3.3.6 Data Storage


Processed data and the trained model are stored using TensorFlow’s data handling
utilities. The data is kept in a structured format that allows easy retrieval and processing
during model training and evaluation. The trained model is saved with the Keras
Model.save method, which supports later loading and deployment. This allows
the model to be integrated into various applications and platforms, so that it can be
readily used in real-world scenarios to detect toxic comments.

3.4 User Interface Design


The user interface for the toxicity comment classifier is designed to be simple and intuitive,
utilizing the Gradio library to create an interactive web-based application. The key components
and design considerations are as follows:

3.4.1 Interface Components


• Input Textbox: A multiline textbox where users can input the comment they wish to
evaluate for toxicity. The textbox provides placeholder text to guide the user on what
to enter.

• Output Text: The results are displayed in a textual format, indicating the presence
or absence of various types of toxicity in the comment.

3.4.2 Design Considerations


• Ease of Use: The interface is designed to be straightforward, requiring minimal input
from the user. Users simply type or paste their comment into the textbox and receive
an immediate evaluation.

• Clarity of Results: The output is formatted clearly, with each type of toxicity (e.g.,
toxic, severe toxic, obscene, threat, insult, identity hate) listed alongside a boolean
indicator of its presence.

• Real-time Interaction: The Gradio interface allows for real-time interaction, enabling
users to see the results immediately after submitting their comment.

• Accessibility: The interface relies on simple text inputs and outputs, which keeps it
compatible with screen readers and other assistive technologies and helps make it
usable by people with disabilities.


3.4.3 Implementation Details


The implementation of the interface in the Jupyter Notebook is straightforward, leveraging
the functionality provided by Gradio. Below is the code snippet that defines and launches the
interface:

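import gradio as gr

# Assumes `vectorizer`, `model`, and `df` were defined earlier in the notebook.
def score_comment(comment):
    # Vectorize the single comment and predict one probability per label.
    vectorized_comment = vectorizer([comment])
    results = model.predict(vectorized_comment)

    text = ''
    # Columns from index 2 onward are taken to be the toxicity labels.
    for idx, col in enumerate(df.columns[2:]):
        text += '{}: {}\n'.format(col, results[0][idx] > 0.5)

    return text

interface = gr.Interface(fn=score_comment,
                         inputs=gr.inputs.Textbox(lines=2, placeholder='Comment to scor'),
                         outputs='text')

interface.launch(share=True)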

Function Definition

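• score_comment: This function takes a comment as input, vectorizes it using the pre-defined text vectorizer,
passes the result to the trained model for prediction, and returns one line of text per toxicity label
indicating whether the predicted score exceeds 0.5.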

Interface Creation

– gr.Interface: This class from the Gradio library is used to build the user interface.
It defines the function to be used for predictions (score_comment), the input type
(gr.inputs.Textbox), and the output type ('text').
– interface.launch: This method starts the interface, making it available through a web
browser. The share=True parameter generates a shareable link to the application.
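The thresholding and formatting performed inside score_comment can be tried in isolation, without a trained model or Gradio. In the sketch below, the helper name format_predictions, the label names, and the probability values are assumptions made up for illustration; they are not outputs of the project's model.

```python
# Assumed label names, following the six toxicity categories listed earlier.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

def format_predictions(probabilities, labels=LABELS, threshold=0.5):
    """Return one 'label: True/False' line per toxicity category."""
    return "".join(f"{label}: {p > threshold}\n"
                   for label, p in zip(labels, probabilities))

# Illustrative probabilities only.
report = format_predictions([0.91, 0.07, 0.66, 0.01, 0.58, 0.03])
print(report)
```

Keeping the threshold as a parameter makes it easy to experiment with stricter or more lenient cut-offs than 0.5.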

3.5 Conclusion

The user interface for the toxicity comment classifier is designed to provide a seamless
and user-friendly experience. By leveraging the capabilities of the Gradio library, the
interface offers real-time interaction and clear presentation of results, making it a practical
tool for users to evaluate the toxicity of comments quickly and easily.


Chapter 4

IMPLEMENTATION

4.1 Implementation

The implementation of the toxicity comment classifier involves several steps, from data
preprocessing to model deployment. Initially, the dataset is loaded and preprocessed
using TensorFlow’s TextVectorization layer, which tokenizes the text, creates a
vocabulary, and pads sequences to ensure uniform input lengths. This preprocessed
data is then used to train a neural network model built with TensorFlow/Keras.
The model architecture includes embedding layers to convert words into dense vectors,
convolutional layers to capture local patterns in the text, and dense layers for learning
complex interactions. The model is compiled with the binary cross-entropy loss function
and the Adam optimizer to handle the multi-label classification task.

During the training phase, the dataset is split into training and testing subsets to
evaluate the model’s performance on unseen data. The model is trained using the training
set, with hyperparameters such as batch size and the number of epochs chosen to
optimize performance while limiting overfitting. After training, the model’s
performance is evaluated on the test set to check its accuracy and generalizability.

To provide a user-friendly interface, the Gradio library is employed to create an
interactive web application. A function named score_comment is defined to process input
comments, vectorize them, and generate predictions using the trained model. This
function formats the results into a readable output that is displayed on the Gradio interface.
The interface consists of an input textbox where users can enter comments and an output
section that shows the result for each toxicity category in real time.

The final implementation step involves deploying the model and the interface. The
trained model and Gradio application are hosted on a server, making them accessible
to users via a web browser. The share=True parameter in Gradio allows for easy
sharing of the application link, enabling users to interact with the model conveniently.
Together, these steps make the system accessible and easy to use for detecting toxic
comments on various online platforms.
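Binary cross-entropy suits the multi-label setting because each toxicity category is treated as its own yes/no decision, so a comment can be positive for several categories at once. The following sketch computes the loss for a single comment by hand; the function name, the clipping constant, and the example values are assumptions for illustration, and the framework's built-in loss may differ in details such as how it averages.

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over the labels of one comment."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Three labels: the first and third are truly present.
loss = binary_cross_entropy([1, 0, 1], [0.9, 0.2, 0.6])
print(round(loss, 4))
```

Confident correct predictions contribute little to the loss, while the hesitant 0.6 on a true label contributes the most.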


Chapter 5

RESULTS AND DISCUSSIONS


The toxicity comment classifier demonstrated effective performance in identifying various
types of toxic comments through rigorous testing and evaluation. The model’s results
were evaluated based on several metrics, including accuracy, precision, recall, and F1-
score, which provided insights into its performance across different toxicity categories.

5.0.1 Model Performance

The model achieved high accuracy in classifying toxic comments, with particularly strong
performance in detecting severe toxicity, obscenity, and insults. Precision and recall
scores indicated that the model was able to accurately identify toxic comments while
minimizing false positives and false negatives. The F1-score, which balances precision
and recall, further highlighted the model’s effectiveness in handling the multi-label
classification task.
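For reference, the relationship between these metrics can be written out from raw counts of true positives, false positives, and false negatives. The helper below is a minimal sketch, and the counts passed to it are invented for illustration; they are not results from this project.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from raw counts for one label."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Illustrative counts only.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(precision, recall, f1)
```

Because F1 is the harmonic mean of precision and recall, it falls sharply when either of the two is low, which makes it more informative than accuracy for rare categories.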

5.0.2 Confusion Matrix Analysis

A detailed analysis using confusion matrices for each toxicity category revealed the
model’s strengths and areas for improvement. The confusion matrices showed that
the model was particularly adept at identifying clear cases of toxicity but occasionally
struggled with comments that had more nuanced or borderline toxic content. This
suggests that while the model is robust, there is room for enhancement in understanding
subtle language variations.
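A per-category confusion matrix can be tallied directly from the true and predicted label rows. The sketch below is one possible way to do this in plain Python; the function name and the tiny two-label example are assumptions for illustration and do not reproduce the project's evaluation data.

```python
def confusion_counts(y_true, y_pred):
    """Return one dict of tp/fp/fn/tn counts per label from rows of 0/1 values."""
    n_labels = len(y_true[0])
    counts = [{"tp": 0, "fp": 0, "fn": 0, "tn": 0} for _ in range(n_labels)]
    for true_row, pred_row in zip(y_true, y_pred):
        for k, (t, p) in enumerate(zip(true_row, pred_row)):
            # First letter: was the prediction right? Second: what was predicted?
            key = ("t" if t == p else "f") + ("p" if p == 1 else "n")
            counts[k][key] += 1
    return counts

# Three comments, two labels each (illustrative values).
y_true = [[1, 0], [0, 0], [1, 1]]
y_pred = [[1, 0], [1, 0], [0, 1]]
counts = confusion_counts(y_true, y_pred)
print(counts)
```

Inspecting the fp and fn entries per label shows which categories the borderline cases fall into.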

5.0.3 User Interface Evaluation

The Gradio interface provided a user-friendly platform for real-time toxicity assessment.
User feedback indicated that the interface was intuitive and easy to use, allowing users
to quickly input comments and receive toxicity scores. The real-time feedback capability
of the interface was particularly appreciated, as it facilitated immediate evaluation and
understanding of the model’s predictions.


Figure 5.1: Result

5.0.4 Discussion

The results underscore the potential of machine learning models in automated content
moderation, demonstrating significant advancements in detecting and categorizing
toxic comments. The combination of a robust neural network model and an interactive
user interface presents a comprehensive solution for online platforms looking to enhance
their moderation capabilities. However, the analysis also identified areas for further
improvement. For instance, the model’s occasional difficulty with nuanced comments
highlights the need for more sophisticated techniques, such as incorporating contextual
understanding and leveraging advanced neural network architectures like transformers.
Future enhancements could also include expanding the model to handle multiple
languages, which would make the system more versatile and applicable to a broader range
of online communities. Additionally, integrating user feedback into the training process
could help the model adapt to evolving language patterns and emerging forms of toxicity.
Overall, the project successfully demonstrates the feasibility and effectiveness of using
machine learning for toxicity detection, providing a valuable tool for improving online
interactions and promoting healthier digital environments.


Chapter 6

CONCLUSION AND FUTURE SCOPE

6.0.1 Conclusion

The toxicity comment classifier presented in this project successfully demonstrates the
application of machine learning techniques in identifying toxic comments from textual
data. By leveraging TensorFlow and the Jigsaw Toxic Comment Classification dataset,
we have developed a robust model capable of detecting various types of toxicity, including
severe toxicity, obscenity, threats, insults, and identity-based hate. The implementation
of the Gradio interface ensures that the model is accessible to users, providing real-time
toxicity assessment through an intuitive web-based application.
The key achievements of this project include:

– Effective Data Preprocessing: Efficiently processed and vectorized text data
to feed into the neural network model.
– Model Development: Built a neural network model using TensorFlow/Keras,
achieving satisfactory performance in classifying toxic comments.
– User-Friendly Interface: Created an interactive and easy-to-use interface with
Gradio, allowing users to input comments and receive immediate toxicity
evaluations.

Overall, the project highlights the potential of machine learning in enhancing content
moderation systems and creating safer online communities by automatically identifying
and flagging toxic comments.

6.0.2 Future Scope

While the current implementation provides a strong foundation, there are several avenues
for future enhancements and research to improve the toxicity comment classifier:


– Model Improvement: Experiment with more advanced neural network architectures,
such as transformers and attention mechanisms, to enhance the accuracy and
robustness of toxicity detection.
– Handling Imbalanced Data: Implement techniques to address class imbalances
in the dataset, such as oversampling minority classes or using more sophisticated
loss functions.
– Real-Time Processing: Optimize the model for real-time processing, enabling it
to handle a larger volume of comments efficiently and with minimal latency.
– Multilingual Support: Extend the model’s capabilities to support multiple
languages, allowing it to detect toxicity in non-English comments.
– Contextual Understanding: Incorporate contextual understanding to improve
the detection of nuanced toxicity that may depend on the surrounding text or
conversational context.
– User Feedback Integration: Develop mechanisms to integrate user feedback,
enabling the model to learn and adapt over time based on real-world usage and
corrections.
– Explainability and Transparency: Enhance the explainability of the model’s
predictions, providing users with insights into why a comment was classified as
toxic, which can build trust and aid in model refinement.

By addressing these future directions, the toxicity comment classifier can evolve into
a more comprehensive and reliable tool for maintaining healthy online environments,
ensuring respectful and constructive interactions across various platforms.
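As one illustration of the imbalanced-data item in the list above, random oversampling for a single binary label could look like the sketch below. The function name and the toy data are assumptions; this is not part of the current implementation, and a multi-label dataset would need additional care because one comment can carry several labels at once.

```python
import random

def oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class items until both classes match in size."""
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for s, y in zip(samples, labels):
        by_class[y].append(s)
    target = max(len(items) for items in by_class.values())
    balanced = []
    for y, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))] if items else []
        balanced.extend((s, y) for s in items + extra)
    rng.shuffle(balanced)
    return balanced

# Toy data: four non-toxic comments and one toxic comment.
data = ["a", "b", "c", "d", "e"]
flags = [0, 0, 0, 0, 1]
balanced = oversample(data, flags)
print(balanced)
```

Oversampling is normally applied to the training split only, so that the test set still reflects the real class distribution.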


