Volume 9, Issue 5, May – 2024 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165 https://doi.org/10.38124/ijisrt/IJISRT24MAY1751

A Comparative Study of Some Selected Classifiers on an Imbalanced Dataset for Sentiment Analysis
Mohammed Ali Kawo1; Dr. Garba Muhammad2; Dr. Danlami Gabi3 and Dr. Musa Sule Argungu4
1 Department of Computer Science, Federal University Gusau, Zamfara State, Nigeria
1, 2, 3, 4 Department of Computer Science, Kebbi State University of Science and Technology, Aliero, Nigeria

Abstract:- Sentiment analysis makes it considerably easier to extract subjective information from online user-generated text. When different algorithms are applied to the same review dataset for a classification task, some classifiers produce accurate results while others produce limited and inaccurate predictions. This research evaluates several machine learning algorithms for online dataset classification by testing four of them on the same data: Naive Bayes, Support Vector Machine, K-Nearest Neighbor and Decision Tree. Determining which machine learning model performs best in sentiment analysis remains an open issue, so our primary goal is to identify the most effective model for sentiment analysis of English texts among these classifiers. Their robustness is tested on an imbalanced dataset obtained from Kaggle.com, a machine learning repository. The dataset first undergoes preprocessing to enable analysis, followed by feature extraction for the base classifiers; both are carried out in a Jupyter notebook from Anaconda. The performance of each algorithm is measured using the confusion matrix, precision, recall and F1-score.

Keywords:- Machine Learning Algorithms, Sentiment Analysis, Imbalanced, Confusion Matrix.

I. INTRODUCTION

Machine learning is the concept of self-learning (George & Srividhya, 2022). It is a subset of artificial intelligence in which computers are trained to learn and improve from data without being explicitly programmed. It relies on algorithms and statistical models to recognize patterns and to make predictions and decisions based on the input data. Machine learning processes large amounts of data to discover insights and develop automated responses and actions, enabling computers to perform tasks and improve their performance over time (George & Srividhya, 2022).

Sentiment analysis examines how individuals express their ideas, sentiments, assessments, attitudes, and emotions in written language (Kumar et al., 2023). The inspection of views or feelings from text data is known as sentiment analysis or opinion mining. Sentiment analysis determines a person's opinion or sentiment toward a particular incident (Kawade & Oza, 2017). To perform sentiment analysis, we must provide a text or document that can be examined, together with a system or model that summarizes the opinions expressed in the text (Krishna, 2020). Customers' sentiment about a company's goods and services is determined from the comments and reviews of other users, and this has proven extremely helpful in practically every business and social arena (Kumar et al., 2023). Sentiment analysis involves a variety of techniques, including Natural Language Processing (NLP), Machine Learning (ML), Deep Learning (DL), Ensemble Methods and Hybrid Techniques.

Many studies have concentrated on using standard classifiers such as maximum entropy, naive Bayes, decision tree, K-nearest neighbor and support vector machine to handle most problems (Kasthuri & Jebaseeli, 2020). However, improving classification accuracy in sentiment analysis requires a substantial and robust classifier.

As a text classifier that can categorize text into different sentiments, sentiment analysis, also known as opinion mining, is useful for reviews of movies, products and customer services, and for opinions about events such as politics and societal activities (Kawade & Oza, 2017). It is also of interest to academics and practitioners in human-computer interaction, as well as to those in other disciplines such as sociology, marketing, economics and advertising (Bahwari, 2019). It can also be used to determine whether a particular item or service is good or bad, preferred or not preferred, and to determine the polarity of a text (positive, negative, or neutral).

Due to the recent rapid rise of social platforms, a great deal of research in the field of sentiment analysis has focused on social media. To improve businesses or find solutions to a variety of real-world issues, practitioners and researchers have been working tirelessly to investigate and analyze this huge amount of data. They have done this by utilizing the daily interactions and ever-growing user-generated material that these websites facilitate (Agustini, 2021).

IJISRT24MAY1751 www.ijisrt.com 2826


Educators must comprehend the views and feelings of their students, just as organizations must comprehend the thoughts of their clients. Sentiment analysis is also very useful in an educational setting, where teachers and students are the driving forces behind the advancement of every nation's educational infrastructure (Alade & Nwankpa, 2022). In most cases, opinion mining or sentiment analysis systems in education are created to find out what students think about education and how to improve the sector.

Sentiment analysis is the act of assessing people's ideas, imaginations, and personalities based on their written words, feelings, various picture types including emoticons, behavior, artwork, and other visual signs. Even though sentiment analysis is extensively used in many domains, there are still areas where its application is needed, and the models that can perform the analysis and predictions most accurately are yet to be defined.

II. LITERATURE REVIEW

Bahwari (2019) performed sentiment analysis of review datasets using Naïve Bayes and K-NN as two supervised methods on two datasets, film and hotel. The more training data was entered, the better the accuracy obtained by the NB algorithm on the film dataset, whereas for the K-NN method accuracy varied randomly. The authors of (Tan et al., 2023) set out to develop a sentiment analyzer that could accurately classify the polarity of text with outstanding precision. To do this, they employed five distinct machine learning techniques, including Logistic Regression, Bernoulli Naive Bayes, Naive Bayes, and linear support vector classification, among which Naïve Bayes outperformed all other classifiers.

SVM is used to identify slogs. It was determined which models were most useful for logging web frameworks that used web indexes (Meenu, 2019). In (Meenu, 2019) the authors suggested several grouping computations for Sequential Minimal Optimization (SMO), Logistic Regression, Decision Trees, Naïve Bayes, and Classification and Regression Trees to identify phishing mails in a coordinated manner across controlled and unsupervised methods.

In (Zishumba, 2019), machine learning techniques such as Support Vector Machine, the bag-of-words model, and Naïve Bayes are used for sentiment analysis of digital texts. In (Agustini, 2021), the author applied a number of classifiers to a dataset of movie reviews to divide them into positive and negative categories. On 85,600 user reviews, Logistic Regression delivered the second-best results, with an accuracy of 99.46%.

(George & Srividhya, 2022) provide a successful method for creating precise classifiers for the Usenet2 dataset. The base classifiers used in the recommended approach are Naïve Bayes, Support Vector Machine, and Genetic Algorithm. In their work, both homogeneous and heterogeneous models are constructed, and the suggested bagged ensemble techniques improved classification accuracy significantly compared to the base classifiers.

According to (Mostafa et al., 2021), Support Vector Machine, Bayesian, and Entropy classifiers were used to determine the sentiment polarity of tweets, yielding positive, negative and impartial tweets. These three distinct supervised machine learning methods for classifying Twitter material according to phrases were applied to training datasets in three different ways. However, in order to obtain precise and trustworthy predictions, many classifiers are combined using ensemble approaches.

(Kumar et al., 2023) conduct sentiment analysis on the Twitter140 dataset using Decision Tree, Logistic Regression, and Support Vector Machine. Within the biased techniques, these algorithms are very common. One of the struggles in machine learning sentiment analysis is acquiring a large enough amount of data for better classification (Lazrig & Humpherys, 2022).

III. BASE CLASSIFIERS FOR SENTIMENT ANALYSIS

Machine learning is predicted to be among the most innovative cutting-edge technologies of the twenty-first century (Jordan & Mitchell, 2020). Although the future cannot be predicted, society must start considering ways to optimize its advantages. To gain more insight for our research, current and advanced reviews in machine learning were explored in order to establish which widely used machine learning algorithms to use in our research. These are:

• Naive Bayes:
The Naive Bayes model can handle large amounts of data and is robust against complicated classification methods (George & Srividhya, 2022). Naïve Bayes theory is explained by the following equation: P(H|E) = (P(E|H) * P(H))/P(E), where P(H|E) is the posterior probability of the hypothesis given that the evidence is true, P(E|H) is the likelihood of the evidence given that the hypothesis is true, P(H) is the prior probability of the hypothesis, and P(E) is the prior probability that the evidence is true. The main advantages of this classifier are that it predicts the correct class for a newly produced instance and that it is simple to use (Patel, 2017).
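The Bayes equation P(H|E) = (P(E|H) * P(H))/P(E) can be illustrated with a small worked example. This is only a sketch with invented numbers, not the authors' implementation: the hypothesis H ("the review is positive"), the evidence E ("the review contains a given word"), and all three probabilities are assumptions chosen for illustration. P(E) is expanded with the law of total probability.

```python
def posterior(prior_h, likelihood_e_given_h, likelihood_e_given_not_h):
    """Return P(H|E) using Bayes' rule.

    P(E) is expanded as P(E|H) * P(H) + P(E|not H) * P(not H).
    """
    evidence = (likelihood_e_given_h * prior_h
                + likelihood_e_given_not_h * (1.0 - prior_h))
    return likelihood_e_given_h * prior_h / evidence


# Hypothetical values (not taken from the paper):
# P(H) = 0.5, P(E|H) = 0.4, P(E|not H) = 0.1
p_positive = posterior(prior_h=0.5,
                       likelihood_e_given_h=0.4,
                       likelihood_e_given_not_h=0.1)
print(f"P(H|E) = {p_positive:.2f}")
```

With these assumed values the evidence raises the probability of the hypothesis from the prior of 0.5 to a posterior of 0.8, which is the update a Naive Bayes classifier performs for every feature under its independence assumption.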


Fig 1 Naive Bayes (Gaussian Distribution)

• Support Vector Machine:
Support vector machines (SVMs) are essentially binary classifiers used for classification and regression tasks, and they work well at categorizing both linear and nonlinear data (Patel, 2017). Because of their global optimization basis, SVMs handle the overfitting problems that occur in high-dimensional settings, which makes them helpful in a variety of applications (Liakos et al., 2018). Each data point is represented as a point in an n-dimensional space, where n is the total number of features, and the value of each feature is a coordinate. The next stage in the classification process is finding the hyperplane that separates the classes (George & Srividhya, 2022).

Fig 2 Support Vector Machine

• Decision Tree:
Both regression and classification can be performed with a decision tree, owing to its tree-like structure. Using decision trees, one can create a training model that can be used to predict the class or value of the target variable. The application of decision trees is very advantageous in many fields, such as databases, taxonomy and identification, machine diagnosis, switching theory, pattern recognition, decision table programming, and algorithm analysis (Moret, 2019). The diagrammatic representation of a decision tree is illustrated in Figure 3 below.

Fig 3 Decision Tree Diagram
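To make the tree-building idea concrete, the following sketch shows how a single node of a decision tree might choose its split. The paper does not state which splitting criterion was used, so Gini impurity is an assumption here, and the toy feature `ages` and label `interested` are hypothetical values invented for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'."""
    best = (None, float("inf"))
    for threshold in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        if not left or not right:
            continue  # a split must send samples to both children
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical toy data: one numeric feature and a binary target.
ages = [22, 25, 31, 38, 44, 52]
interested = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_split(ages, interested)
print(threshold, impurity)
```

A full decision tree repeats this search recursively on each child node and over every feature until the nodes are pure or a stopping rule is reached.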


• K-Nearest Neighbors:
One of the simplest machine learning algorithms, and a theoretically well-founded one, is the K-nearest neighbors (KNN) technique, which was first proposed by Cover and Hart in 1967. The idea behind KNN is straightforward: given a sample, if most of its K closest neighbors (i.e., the most similar samples) in the feature space belong to a class, then the sample is also assigned to that class. The classification outcome for the sample is directly affected by the choice of K (Feng et al., 2023). Figure 4 below illustrates how K-Nearest Neighbor classifies a new sample.

Fig 4 K-Nearest Neighbor

IV. DATASET

Datasets are essential in data analysis and machine learning. A rigorous decision-making process for organizations and for the entire nation is greatly enhanced by the meaningful utilization of data. To carry out the proposed research, an imbalanced dataset from a life and health insurance company was chosen from Kaggle.com, a machine learning repository. The dataset is in 'csv' format and comprises 267,507 records with fourteen input features.

Machine learning practitioners working on binary classification and sentiment analysis tasks frequently find imbalanced datasets to be a barrier in detection tasks such as fraud, spam, disease and hardware fault detection. An imbalanced dataset comprises two distinct groups of observations, one belonging to the majority class and the other to the minority class (George & Srividhya, 2022). Figure 5 below shows the schema of an imbalanced dataset.

Fig 5 An Imbalanced Dataset

When the data are evenly distributed across the classes, the majority of conventional classification techniques can be applied proficiently in terms of total classification accuracy. However, when an imbalanced dataset contains only a few examples from the class of interest, these classifiers perform poorly on that class (Thesis, 2023).

V. METHODOLOGY

This section contains the most essential part of our research: the techniques and algorithms that are applied to the dataset in order to obtain the desired results. In our research, the task of the aforementioned classifiers is to predict, from the provided input features of the imbalanced dataset, whether insured customers will be willing to sign up for the vehicle insurance newly provided by the company. All four base classifiers are used individually and their results are compared in order to ascertain which classifier has the highest accuracy. Figure 6 below illustrates our methodological workflow.


Fig 6 Proposed Workflow Diagram

• Data Preprocessing
Once data has been collected, data preparation, mostly known as preprocessing, is an essential step in sentiment analysis. It cleans, organizes, and scrubs raw data into a format that machine learning models can use for training and evaluation. Preprocessing, also known as text filtering (Arya et al., 2019), is the process of removing noisy, unreliable and partial data through the use of tokenization, stemming, and vectorization techniques, all of which are essential sub-steps in the process.

Scrubbing the data is a crucial first step in preprocessing for sentiment analysis. Scrubbing is the technical process of enhancing the dataset to increase its utility. It requires data that is redundant, incomplete, incorrectly formatted, or irrelevant to be edited and occasionally removed (Theobald, 2017).

• Separating Training and Testing Data Set
When using machine learning, we typically divide the original dataset into two subsets, the training set and the testing set. We then fit the model on the training data in order to produce predictions for the test set. Because the dataset is imbalanced, we divided it into training and testing sets using both stratified sampling and k-fold cross-validation with k equal to 5. This gives each subset a similar, representative share of samples from each class and improves the quality and efficiency of the models, enabling a smooth comparison among them.

• Training Model
In our research, we employed four widely used, distinct models, namely K-Nearest Neighbors (KNN), Decision Trees (DT), Support Vector Machines (SVM) and Naïve Bayes (NB), to classify the dataset in order to ascertain which classifier performs best in terms of accuracy, precision, recall and F1-score.

We encountered a problem while using the SVM classifier because of the size of the dataset: the SVM classifier took approximately two and a half hours to classify a dataset of over three hundred thousand records with just fourteen features. As a consequence, we deployed the DSVM classifier, known as the dual support vector machine. The DSVM classifies large datasets quickly, with less time consumption.

• Testing the Model
Performance evaluation is a critical component of every research study, since it is essential to examine the behavior of the system. In machine learning, a confusion matrix is used to evaluate the effectiveness of a classification model (Zishumba, 2019). Where the true values are known, it compares them with the test predictions in tabular form. The performance of the suggested models in this research is also evaluated using the confusion matrix.
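The evaluation step described above can be sketched as follows: the four cells of a binary confusion matrix are counted from the true and predicted labels, and accuracy, precision, recall and F1-score are derived from them. This is a minimal sketch, not the authors' notebook code; the function names and the ten toy labels are hypothetical and were chosen to mimic an imbalanced test set.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def scores(tp, fp, fn, tn):
    """Derive the four metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced test set: 3 positives, 7 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
counts = confusion_counts(y_true, y_pred)
result = scores(*counts)
print(counts, result)
```

In this toy case the accuracy is 0.7 while the recall is only about 0.33, which shows why accuracy alone can be misleading on an imbalanced dataset and why precision, recall and F1-score are reported alongside it.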


VI. RESULT AND DISCUSSION

We applied four classifiers, Naïve Bayes, Support Vector Machine, K-Nearest Neighbor and Decision Tree, to the imbalanced Kaggle dataset of the life and health insurance company, each with the same data split into a training set and a testing set. The experiment that serves as the basis for the findings presented here was run in a Jupyter notebook from Anaconda. The classifiers are compared using several performance metrics, including classification accuracy; the Precision, Recall, and F1-score values obtained from each classifier serve as an additional means of assessing them.

After the training phase of each classifier (NB, KNN, SVM, and DT), the test data is used to measure classification performance. Table 1 below displays the confusion-matrix-based results of all the classifiers under the stratified sampling technique, which was applied to the dataset so that all classes are proportionally represented. Figure 7 below shows a bar chart of the accuracy results of all four classifiers.

Table 1 Performance Evaluation based on Confusion Matrix

Classifiers           Accuracy   Precision   Recall   F1-score
Naïve Bayes           77%        41%         84%      55%
K-Nearest Neighbor    80%        31%         16%      21%
SVM                   84%        48%         10%      16%
Decision Tree         81%        43%         45%      44%

Table 1 above shows the scores of all four classifiers. Support Vector Machine, with an accuracy of 84 percent, outperforms all other classifiers on accuracy even though its recall (10 percent) and F1-score (16 percent) are low. Decision Tree comes next with 81 percent accuracy and moderate recall and F1-score (45 and 44 percent respectively).
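The F1-scores in Table 1 can be cross-checked from the reported precision and recall, since F1 is their harmonic mean. The snippet below is a sketch of that check only; the dictionary `table1` simply restates the rounded percentages from the table, so recomputed values can differ from the reported ones by about one percentage point.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in Table 1, written as fractions.
table1 = {
    "Naive Bayes": (0.41, 0.84),
    "K-Nearest Neighbor": (0.31, 0.16),
    "SVM": (0.48, 0.10),
    "Decision Tree": (0.43, 0.45),
}

for name, (precision, recall) in table1.items():
    print(f"{name}: F1 = {f1_score(precision, recall):.3f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a classifier with very low recall, such as SVM here, ends up with a low F1-score even when its accuracy is the highest.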

Fig 7 Bar Chart Accuracy Result for the Four Classifiers

Table 2 Performance Comparisons of Four Classifiers based on K-fold cross Validation

Classifiers     Accuracy K1   Accuracy K2   Accuracy K3   Accuracy K4   Accuracy K5
Naïve Bayes     0.76955       0.77583       0.77150       0.77039       0.77184
KNN             0.80439       0.80821       0.80617       0.80524       0.80613
SVM             0.83555       0.83713       0.83565       0.83674       0.83586
Decision Tree   0.81260       0.81791       0.81325       0.81303       0.81345

Table 2 above shows that Support Vector Machine has the highest accuracy in every fold, at approximately 84 percent, followed by the Decision Tree classifier at approximately 81 to 82 percent.
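The per-fold accuracies in Table 2 can be summarized by their mean and standard deviation, which makes the ranking and the stability of each classifier across folds easier to read. This is an illustrative sketch using only the values printed in Table 2; the variable names are hypothetical.

```python
from statistics import mean, stdev

# Fold accuracies K1..K5 copied from Table 2.
fold_accuracy = {
    "Naive Bayes": [0.76955, 0.77583, 0.77150, 0.77039, 0.77184],
    "KNN": [0.80439, 0.80821, 0.80617, 0.80524, 0.80613],
    "SVM": [0.83555, 0.83713, 0.83565, 0.83674, 0.83586],
    "Decision Tree": [0.81260, 0.81791, 0.81325, 0.81303, 0.81345],
}

summary = {name: (mean(acc), stdev(acc)) for name, acc in fold_accuracy.items()}
for name, (avg, spread) in summary.items():
    print(f"{name}: mean = {avg:.4f}, stdev = {spread:.4f}")

# Classifier with the highest mean accuracy across the five folds.
best = max(summary, key=lambda name: summary[name][0])
print("Highest mean accuracy:", best)
```

The small spread within each row suggests that the accuracy ranking does not depend on which fold is held out.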


VII. CONCLUSION AND FUTURE WORKS

In summary, insurance companies, product companies, industries of all kinds, institutions and health practitioners can use machine learning methods to analyze sentiments. The life and health insurance company dataset was used to predict whether customers are willing to apply for vehicle insurance with that same company. The results in Table 1 show low precision scores, which indicate a low proportion of customers who are willing to take up vehicle insurance with the same company. Even though our target is to compare the classifiers, we still have to predict the outcome of the dataset. Frequent changes in vocabulary and cultural diversity pose a great challenge in the field of sentiment analysis, since every culture has its own way of expressing emotions, be it happiness or sadness. Contextual sensitivity is another factor that contributes to the challenges of sentiment analysis, since language continues to evolve every day. The research carried out shows that support vector machine has the highest classification accuracy compared to naïve Bayes, k-nearest neighbor and decision tree. This indicates that even on imbalanced data, support vector machine still achieves the highest accuracy, although its recall and F1-score are low (Table 1). Support vector machine has the strength of handling complex regression and classification problems. In the future, deep learning should be compared with some of the base machine learning classifiers, such as support vector machine, on sentiment analysis tasks so as to standardize a model for sentiment analysis.

REFERENCES

[1]. Agustini, T. (2021). Sentiment Analysis on Social Media using Machine Learning-Based Approach. June, 544437.
[2]. Arya, P., Bhagat, A., & Nair, R. (2019). Improved Performance of Machine Learning Algorithms via Ensemble Learning Methods of Sentiment Analysis. 10(2), 110–116.
[3]. Bahwari. (2019). Sentiment Analysis Using Random Forest Algorithm - Online Social Media Based. Journal of Information Technology and Its Utilization, 2(2), 29–33. https://www.researchgate.net/publication/338548518_SENTIMENT_ANALYSIS_USING_RANDOM_FOREST_ALGORITHM_ONLINE_SOCIAL_MEDIA_BASED
[4]. Feng, W., Gou, J., Fan, Z., & Chen, X. (2023). An ensemble machine learning approach for classification tasks using feature generation. Connection Science, 35(1). https://doi.org/10.1080/09540091.2023.2231168
[5]. George, S., & Srividhya, V. (2022). Performance Evaluation of Sentiment Analysis on Balanced and Imbalanced Dataset Using Ensemble Approach. Indian Journal of Science and Technology, 15(17), 790–797. https://doi.org/10.17485/ijst/v15i17.2339
[6]. Ghosh, S., Hazra, A., & Raj, A. (2020). A Comparative Study of Different Classification Techniques for Sentiment Analysis. International Journal of Synthetic Emotions, 11(1), 49–57. https://doi.org/10.4018/ijse.20200101.oa
[7]. Jawale, S. (2019). Sentiment Analysis using Ensemble Learning. May.
[8]. Jordan, M. I., & Mitchell, T. M. (2020). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 255–260. https://doi.org/10.1126/science.aaa8415
[9]. Kawade, D. R., & Oza, D. K. S. (2017). Sentiment Analysis: Machine Learning Approach. International Journal of Engineering and Technology, 9(3), 2183–2186. https://doi.org/10.21817/ijet/2017/v9i3/1709030151
[10]. Kumar, S., Kaur, N., Kavita, & Joshi, A. (2023). Tweet sentiment analysis using logistic regression. July, 332–336. https://doi.org/10.1049/icp.2023.1801
[11]. Lazrig, I., & Humpherys, S. L. (2022). Using Machine Learning Sentiment Analysis to Evaluate Learning Impact. Information Systems Education Journal (ISEDJ), 20(1), 20. https://isedj.org/; https://iscap.info
[12]. Liakos, K. G., Busato, P., Moshou, D., Pearson, S., & Bochtis, D. (2018). Machine learning in agriculture: A review. Sensors (Switzerland), 18(8), 1–29. https://doi.org/10.3390/s18082674
[13]. Meenu, S. G. (2019). 154. Sunila. International Journal of Electronics Engineering (ISSN: 0973-7383), Volume 11, Issue 1, 965–970.
[14]. Mostafa, G., Ahmed, I., & Junayed, M. S. (2021). Investigation of Different Machine Learning Algorithms to Determine Human Sentiment Using Twitter Data. International Journal of Information Technology and Computer Science, 13(2), 38–48. https://doi.org/10.5815/ijitcs.2021.02.04
[15]. Patel, R. (2017). Sentiment Analysis on Twitter Data Using Machine Learning. Thesis submitted in partial fulfillment of the requirements for the degree of MSc Computational Sciences, The Faculty of Graduate Studies.
[16]. Tan, K. L., Lee, C. P., & Lim, K. M. (2023). A Survey of Sentiment Analysis: Approaches, Datasets, and Future Research. Applied Sciences (Switzerland), 13(7). https://doi.org/10.3390/app13074550
[17]. Theobald, O. (2017). Machine Learning For Absolute Beginners.
[18]. Zishumba, K. (2019). Sentiment Analysis Based on Social Media Data. Journal of Information and Telecommunication, 1–48. http://repository.aust.edu.ng/xmlui/bitstream/handle/123456789/4901/Kudzai Zishumba.pdf?sequence=1&isAllowed=y

Design and Implementation of Smart Dustbin for Automated Wet and Dry Waste Segregation
5 pages
Thriller English
No ratings yet
Thriller English
69 pages
Unlocking Twitter's Sentiments: A Deep Dive Into Sentiment Analysis
No ratings yet
Unlocking Twitter's Sentiments: A Deep Dive Into Sentiment Analysis
8 pages
Product Fake Reviews Detection With Sentiment Analysis Using Machine Learning
No ratings yet
Product Fake Reviews Detection With Sentiment Analysis Using Machine Learning
8 pages
A Survey On Sentimental Analysis Techniques and Its Usage in Recommendation Systems
No ratings yet
A Survey On Sentimental Analysis Techniques and Its Usage in Recommendation Systems
6 pages
Sentiment Analysis in Financial Markets
No ratings yet
Sentiment Analysis in Financial Markets
6 pages
Finalreview 1
No ratings yet
Finalreview 1
4 pages
Bab3 Matrikulasi
No ratings yet
Bab3 Matrikulasi
31 pages
45 Ijmtst0806103
No ratings yet
45 Ijmtst0806103
4 pages
A Comprehensive Insight into Adult Congenital Heart Disease: A Battle of Survival into Adulthood
No ratings yet
A Comprehensive Insight into Adult Congenital Heart Disease: A Battle of Survival into Adulthood
11 pages
Sentimental Analysis Using NLP
No ratings yet
Sentimental Analysis Using NLP
5 pages
Troubleshooting GEFANUC 90 30
No ratings yet
Troubleshooting GEFANUC 90 30
18 pages
Alzheimer's Disease: Advances in Early Diagnosis and Emerging Therapeutics
No ratings yet
Alzheimer's Disease: Advances in Early Diagnosis and Emerging Therapeutics
4 pages
Sentiments of Public Opinion
No ratings yet
Sentiments of Public Opinion
3 pages
Sentiment Analysis of A Product Based On User Reviews Using Random Forests Algorithm
No ratings yet
Sentiment Analysis of A Product Based On User Reviews Using Random Forests Algorithm
5 pages
Tourism Web App With Aspect Based Sentiment Classification Framework For Tourist Review
No ratings yet
Tourism Web App With Aspect Based Sentiment Classification Framework For Tourist Review
6 pages
Monthly Bill
No ratings yet
Monthly Bill
1 page
Twitter Sentiment Analysis Using Different Algorithms
No ratings yet
Twitter Sentiment Analysis Using Different Algorithms
6 pages
Twitter Sentiment Analysis With Textblob
No ratings yet
Twitter Sentiment Analysis With Textblob
6 pages
Machine Learning Based Sentiment Analysis For Text Messages
No ratings yet
Machine Learning Based Sentiment Analysis For Text Messages
7 pages
Sentiment Analysis Techniques A Review
No ratings yet
Sentiment Analysis Techniques A Review
5 pages
Sentiment Analysis of Twitter Data: A Survey of Techniques: Vishal A. Kharde S.S. Sonawane
No ratings yet
Sentiment Analysis of Twitter Data: A Survey of Techniques: Vishal A. Kharde S.S. Sonawane
11 pages
Asm Note
No ratings yet
Asm Note
1 page
V4I9201545
No ratings yet
V4I9201545
8 pages
Introduction To Environmental Science
No ratings yet
Introduction To Environmental Science
40 pages
Reviving Chettinad Architecture: A Cultural Legacy of Tamil Nadu
No ratings yet
Reviving Chettinad Architecture: A Cultural Legacy of Tamil Nadu
9 pages
Perception and Readiness of Graduate Level Students Toward E-Governance Implementation in Nepal: A Study at Far Western University
No ratings yet
Perception and Readiness of Graduate Level Students Toward E-Governance Implementation in Nepal: A Study at Far Western University
15 pages
Electrostatic Lens (10 Points) : Theory
No ratings yet
Electrostatic Lens (10 Points) : Theory
4 pages
Cementing “Optimization Techniques” in Social Sciences Research: Towards Non-Mathematical Optimization Techniques for the Social Sciences
No ratings yet
Cementing “Optimization Techniques” in Social Sciences Research: Towards Non-Mathematical Optimization Techniques for the Social Sciences
10 pages
Molecular Insights into Prion Degradation in Creutzfeldt Jakob Disease’s Challenges and Future Directions: A Review
No ratings yet
Molecular Insights into Prion Degradation in Creutzfeldt Jakob Disease’s Challenges and Future Directions: A Review
13 pages
Ginkgo Biloba-Derived Flavonoids as Metal Chelators in Alzheimer’s Neurochemistry: A Biochemical Approach
No ratings yet
Ginkgo Biloba-Derived Flavonoids as Metal Chelators in Alzheimer’s Neurochemistry: A Biochemical Approach
7 pages
NPAs and Profitability in Indian Private Sector Banks: Evidence from a Panel Study
No ratings yet
NPAs and Profitability in Indian Private Sector Banks: Evidence from a Panel Study
7 pages
Biblio Tatla Aspects of Universality in Modern and Postmodern Architecture
No ratings yet
Biblio Tatla Aspects of Universality in Modern and Postmodern Architecture
3 pages
The Impact of Artificial Intelligence Interventions on Adolescent Mental Health: A Multidimensional Study Using ChatGPT, Gemini, and DeepSeek
No ratings yet
The Impact of Artificial Intelligence Interventions on Adolescent Mental Health: A Multidimensional Study Using ChatGPT, Gemini, and DeepSeek
8 pages
An Analysis of Cognitive Flexibility and Student Engagement: Reimagining Teaching Strategies in Post-Pandemic Higher Education
No ratings yet
An Analysis of Cognitive Flexibility and Student Engagement: Reimagining Teaching Strategies in Post-Pandemic Higher Education
9 pages
2017 Microprocessor and Interface
No ratings yet
2017 Microprocessor and Interface
3 pages
Temperature-Energy Relationships and Spatial Distribution Analysis for Nano-Enhanced Phase Change Materials Via Thermal Energy Storage
No ratings yet
Temperature-Energy Relationships and Spatial Distribution Analysis for Nano-Enhanced Phase Change Materials Via Thermal Energy Storage
18 pages
From Global Standards to Local Fields: Redefining Labour Through MGNREGS in Kerala’s Tribal Heartlands – An Interrogation of ILO Norms
No ratings yet
From Global Standards to Local Fields: Redefining Labour Through MGNREGS in Kerala’s Tribal Heartlands – An Interrogation of ILO Norms
7 pages
Innovation of Detector Score Plaque Sensor Based to Improve the Effectiveness and Afficiency of Dental Health Services
No ratings yet
Innovation of Detector Score Plaque Sensor Based to Improve the Effectiveness and Afficiency of Dental Health Services
7 pages
Managing Cardiovascular Toxicities in Cancer Therapy
No ratings yet
Managing Cardiovascular Toxicities in Cancer Therapy
9 pages
Promptsecure: Secure Prompt Engineering Protocols for Regulated Genai Environments
No ratings yet
Promptsecure: Secure Prompt Engineering Protocols for Regulated Genai Environments
9 pages
IMPROVE Floodeye: Integrated Mobile System for Predictive Routing and Optimized Vehicle Navigation Using Ensemble Algorithm
No ratings yet
IMPROVE Floodeye: Integrated Mobile System for Predictive Routing and Optimized Vehicle Navigation Using Ensemble Algorithm
6 pages
Safety Data Sheet: 1 Identification of The Substance/Mixture and of The Company/Undertaking
No ratings yet
Safety Data Sheet: 1 Identification of The Substance/Mixture and of The Company/Undertaking
6 pages
Zinner Syndrome: A Radiological Case Report with Multimodal Imaging Insights
No ratings yet
Zinner Syndrome: A Radiological Case Report with Multimodal Imaging Insights
6 pages
2023 2024 SPGBHS Main Teaching Load
No ratings yet
2023 2024 SPGBHS Main Teaching Load
2 pages
An Overview of Evans Syndrome–A Rare Disease
No ratings yet
An Overview of Evans Syndrome–A Rare Disease
5 pages
Impact of Yogic Intervention on Refractive Error Among Adolescents: An Experimental Study
No ratings yet
Impact of Yogic Intervention on Refractive Error Among Adolescents: An Experimental Study
5 pages
Abhishek Dhiman
No ratings yet
Abhishek Dhiman
3 pages
Bringing India to the Global Table: The Transformative Power of International Joint Ventures
No ratings yet
Bringing India to the Global Table: The Transformative Power of International Joint Ventures
4 pages
Integrative Approach to Type 1 Diabetes Mellitus: An Unani Perspective on Asbab-E-Sitta Zaruriya
No ratings yet
Integrative Approach to Type 1 Diabetes Mellitus: An Unani Perspective on Asbab-E-Sitta Zaruriya
3 pages
Cream and Brown Illustration Social Science Class Education Presentation
No ratings yet
Cream and Brown Illustration Social Science Class Education Presentation
18 pages
1.1 Propositional Logic (EX) .4111.1534320746.8969
No ratings yet
1.1 Propositional Logic (EX) .4111.1534320746.8969
2 pages
Chapter 6
No ratings yet
Chapter 6
10 pages
Exp 1a Determine The Resultant of Two Non-Linear Force Vectors
No ratings yet
Exp 1a Determine The Resultant of Two Non-Linear Force Vectors
7 pages
Kuba Raffia Technology, A Symbol of Authenticity for the Dress Code of Ancestral Value in Congo-Kinshasa
No ratings yet
Kuba Raffia Technology, A Symbol of Authenticity for the Dress Code of Ancestral Value in Congo-Kinshasa
3 pages
Managing Performance and Building Digital Trust in Remote Teams Through Cybersecurity-Conscious HRM Policies and the Economics of Remote Work
No ratings yet
Managing Performance and Building Digital Trust in Remote Teams Through Cybersecurity-Conscious HRM Policies and the Economics of Remote Work
14 pages
Understanding Students’ Entrepreneurial Mindset in Sorsogon State University
No ratings yet
Understanding Students’ Entrepreneurial Mindset in Sorsogon State University
16 pages
The School of Talents as an Empowerment Catalyst in Transforming Women’s Lives and Promoting Gender Equality in Pentecostal Communities
No ratings yet
The School of Talents as an Empowerment Catalyst in Transforming Women’s Lives and Promoting Gender Equality in Pentecostal Communities
11 pages
The Role of Streptococci in Infective Endocarditis
No ratings yet
The Role of Streptococci in Infective Endocarditis
6 pages
Acknowledgementslip S1365262679000
No ratings yet
Acknowledgementslip S1365262679000
1 page

IJISRT24MAY1751 www.ijisrt.com 2826

II. LITERATURE REVIEW

(Kumar et al., 2023) conducts sentiment analysis on

Fig 1 Naive Bayes (Gaussian Distribution)

Fig 2 Support Vector Machine

• Support Vector Machine:

• Decision Tree:

Fig 3 Decision Tree Diagram

Fig 5 An Imbalanced Dataset

When most of the data in each class is evenly

IV. DATASET

V. METHODOLOGY

Fig 6 Proposed Workflow Diagram

• Data Preprocessing

• Separating Training and Testing Data Set

• Training Model

• Testing the Model

Table 1 Performance Evaluation based on Confusion Matrix
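
As a rough illustration of confusion-matrix-based evaluation on an imbalanced dataset, the sketch below derives accuracy, precision, recall, and F1 from the four confusion-matrix counts. The function name `confusion_metrics` and all counts are hypothetical and chosen only for illustration; they are not the paper's results. The example assumes a binary task in which the minority class is treated as the class of interest.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp, fp, fn, tn are counts with respect to the class of interest.
    Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical imbalanced test set: 900 majority-class and 100 minority-class reviews.
# A classifier that always predicts the majority class never detects the minority class.
always_majority = confusion_metrics(tp=0, fp=0, fn=100, tn=900)

# A hypothetical classifier that recovers 60 of the 100 minority-class reviews
# at the cost of 50 false alarms.
some_minority = confusion_metrics(tp=60, fp=50, fn=40, tn=850)

print(always_majority)
print(some_minority)
```

In this sketch the always-majority classifier still reaches high accuracy while its recall on the minority class is zero, which is why accuracy alone can be misleading on imbalanced data and why precision, recall, and F1 are usually reported alongside it.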

Fig 7 Bar Chart Accuracy Result for the Four Classifiers

Table 2 Performance Comparisons of Four Classifiers based on K-fold cross Validation
