Sentiment Analysis of Amazon Reviews Using Machine Learning Algorithms
K.E Hemapriya1, Siva Ranjini.C2, Siva Roopini.C3
Sri Krishna Arts and Science College, Coimbatore, India
Abstract
Amazon is the world's largest online retailer and marketplace by revenue and market
share. The web-based firm also leads the smart speaker industry, offers cloud computing
services through AWS, and provides live streaming through Twitch. This paper uses Weka's
machine learning methods to give a thorough analysis of the sentiment found in Amazon
product reviews. Because e-commerce giants like Amazon have grown rapidly, businesses
must now analyze customer sentiment from reviews to make informed decisions. In this work,
we use three widely used machine learning algorithms, the Enhanced predictive accuracy
classifier (EPAC), the Efficient J48 decision tree classifier (EJ48DTC), and Probabilistic
classifiers reverend features independence efficiency machine learning (PCRFIEML), to
categorize reviews as positive or negative. To ascertain the efficacy of these algorithms
in sentiment analysis tasks, we assess their performance in terms of accuracy, precision,
recall, and F1 score. We also investigate how different text preprocessing methods affect
classification results. Our results offer insight into the applicability of various machine
learning techniques for sentiment analysis of Amazon reviews, aiding businesses in
extracting actionable intelligence from vast amounts of customer feedback.
1. Introduction
Online retailers like Amazon have completely changed how customers engage with
goods and services in the age of digital commerce. Amazon.com is an e-commerce platform
that sells a wide range of products, including media (books, movies, music, and software),
apparel, baby products, consumer electronics, beauty products, gourmet food, groceries,
health and personal care products, industrial and scientific supplies, kitchen items, watches,
jewelry, turf and garden products, musical instruments, outdoor equipment, tools, automotive
products, toys and activities, farm supplies, and consulting services. Customer reviews help
shoppers learn more about a product and decide whether it suits them. Reviews should give
shoppers honest product feedback from other buyers, and Amazon states a zero-tolerance
policy for any review that attempts to mislead or manipulate customers.
Because millions of people contribute to enormous collections of reviews, understanding
the attitude expressed in these texts has become critical for organizations looking to
improve customer satisfaction and refine their offerings. Sentiment analysis, a branch of
natural language processing, automatically classifies attitudes as neutral, positive, or
negative, providing a methodical way to draw conclusions from such textual data. It is one
of the most widely used techniques for analyzing text-based data and identifying sentiment
content, and it is also known as opinion mining. A diverse body of text data is being
created in the form of recommendations, feedback, tweets, and comments. E-commerce
platforms such as Amazon.com generate a large amount of data every day in the form of
customer reviews.
Naive Bayes is a probabilistic classifier that is easy to use and effective since it
assumes independence between features. However, it has trouble identifying intricate
correlations between variables, which might result in less accurate sentiment analysis,
particularly when dealing with sarcasm or subtle language.
The Efficient J48 decision tree classifier is an interpretable decision tree technique that
can handle both categorical and numerical input. Unfortunately, it often overfits noisy
data, which reduces its efficacy in sentiment analysis, especially with huge feature spaces
or unbalanced datasets. The Enhanced predictive accuracy classifier is an ensemble
technique that combines several decision trees to increase resilience and decrease
overfitting. However, because it relies on majority voting across numerous trees, the
model's decisions can be hard to interpret, and it may fail to pick up on finer subtleties
in language. This paper aims to enhance these algorithms by addressing these drawbacks.
The study's results have major significance for organizations functioning in the digital
marketplace since they provide actionable insights into customer attitudes that may guide
strategic decision-making processes. By shedding light on the efficacy of various machine
learning approaches in the context of Amazon reviews, this study hopes to advance the state-
of-the-art in sentiment analysis and pave the way for more nuanced and sophisticated
methodologies for extracting meaningful intelligence from textual data. Through meticulous
investigation and empirical analysis, we hope to provide a comprehensive understanding of
the complexities involved in sentiment analysis of Amazon reviews, ultimately contributing
to the improvement of customer-centric strategies and the optimization of business outcomes
in the digital age.
2. Literature Survey
The study by Muhammad Ali et al. presents an approach that incorporates sentiment
elements based on an item's attributes. Amazon consumer data was used to develop and verify
the approach [1]. The work is described as the world's first data center to measure public
mood. The system performs pre-processing steps such as stemming, tokenization, packing, and
stop-word removal on the dataset.
Rezaul Haque, Naimul Islam, Mayisha Tasneem, and Amit Kumar Das discuss multi-class
sentiment analysis of social media comments, a challenging problem that has garnered
scholarly interest [2]. They use multi-class sentiment analysis to understand the sentiment
behind social media messages written in Bangla. Bangla is one of the most frequently spoken
languages in the world, yet studies conducted in the language have not been sufficiently
significant or effective in predicting textual mood.
Manish Bhargava and Himanshu Arora highlight the importance of the Support Vector
Machine (SVM) in analyzing a tweet dataset with the Weka tool, and they analyze the
performance of the model [3].
The work in [4] seeks to create a Flexible Learning Experience analyzer model using five
supervised machine learning techniques. The model's efficacy was evaluated in WEKA using a
10-fold cross-validation strategy, and the results were then compared.
Mihir P. Mehta, Gopal Kumar, and M. Ramkumar created a new scale for evaluating
consumer feedback by applying topic modeling and sentiment analysis to TripAdvisor hotel
reviews. This method enhanced the study of the hospitality experience by better classifying
the feedback [5]. The results showed that consumer satisfaction decreased during the
COVID-19 pandemic, with North America and Europe doing particularly well. Among Asian
nations, Sri Lanka had the highest rate of consumer satisfaction.
Jin Zhou and Jun-Min Ye conclude that sentiment analysis has a significant influence on
education research and that qualitative methods should be used to verify the results and
look into the psychological underpinnings of emotion learning [6].
Vasundhara and Suraiya Parveen suggest that business development improves by reviewing
products and understanding client needs. Combining relevant features such as Stanford
part-of-speech (POS) tagging, the SentiWordNet lexicon, and classifier methods can improve
results and accuracy [7]. They analyzed a dataset using the WEKA tool and concluded that
support vector machines outperform alternative classification techniques.
Tanjim Ul Haque et al. compared their findings with comparable studies on product
reviews. They analyzed a small sample of Amazon product reviews to identify polarized
opinions towards the goods [8]. They achieved over 90% on accuracy, precision, recall, and
the F1 measure.
The technique is beneficial for customers searching for items, organizations tracking
brand sentiment, and other applications. The use of classifier ensembles and lexicons for
sentiment categorization in microblogging services like Twitter has received limited attention
in the literature[9]. Experiments using public tweet sentiment datasets demonstrate that
classifier ensembles combining Multinomial Naive Bayes, SVM, Enhanced predictive
accuracy classifier, and Logistic Regression increase classification accuracy.
In [10], opinion mining was conducted on a small dataset of Amazon product reviews
to identify polarized sentiments regarding the goods.
3. Methodology
Sentiment analysis is performed on the Amazon data using three algorithms: the Efficient
J48 decision tree classifier, Probabilistic classifiers reverend features independence
efficiency machine learning, and the Enhanced predictive accuracy classifier. Performing
sentiment analysis on an Amazon dataset with the Weka tool involves the following steps.
STEP 1: Data Preparation
Obtain the Amazon dataset, which includes reviews and sentiment labels (positive or
negative). Ensure that the dataset is correctly prepared, with each review paired with its
sentiment label.
STEP 2: Data pre-processing
Load the dataset into Weka in the proper file format (e.g., CSV or ARFF).
To clean up the text data, apply preprocessing steps such as tokenization, lowercasing,
stop-word removal, punctuation removal, and stemming. Convert the preprocessed text data to
a Weka-compatible representation, such as bag-of-words or TF-IDF vectors.
STEP 3: Feature Extraction
Select characteristics from the preprocessed text data to represent each review.
Depending on the representation used, this stage may include constructing a feature vector for
each review based on word frequencies or other linguistic properties.
STEP 4: Model Training
Divide the dataset into training and testing groups (e.g., 70% training and 30%
testing). Train sentiment analysis models with three machine learning algorithms: EJ48DTC,
PCRFIEML, and EPAC. Use Weka's built-in classifiers for EJ48DTC, PCRFIEML, and
EPAC, or create custom classifiers if necessary.
STEP 5: Model Evaluation
Assess the trained models using relevant measures such as accuracy, precision, recall,
and F1-score. Compare the efficacy of the EJ48DTC, PCRFIEML, and EPAC classifiers in
sentiment analysis on the Amazon dataset. Analyze any substantial performance
discrepancies and determine each model's strengths and flaws.
STEP 6: Result Interpretation
Interpret the sentiment analysis model outputs and draw inferences based on their
performance on the Amazon dataset. Identify trends and insights from the data, such as
which algorithm is best for sentiment categorization and what factors influence its
performance.
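The preprocessing, feature extraction, and splitting steps above can be sketched in a few lines of plain Python. This is only an illustrative sketch outside Weka, not the pipeline used in this study: the stop-word list, the sample reviews, and the helper names `preprocess`, `bag_of_words`, and `train_test_split` are all assumptions made for the example, and stemming is omitted for brevity.

```python
import random
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for illustration).
STOP_WORDS = {"the", "a", "an", "is", "it", "this", "and", "of", "to", "was"}


def preprocess(text):
    """Lowercase, drop punctuation, tokenize, and remove stop words."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(tokens):
    """Represent one review as a word -> frequency mapping."""
    return Counter(tokens)


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and split them (70% / 30% by default)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Made-up reviews, not taken from the paper's dataset.
reviews = [
    ("This phone is great, great battery!", "positive"),
    ("The case broke and it was useless.", "negative"),
    ("Great value and fast delivery.", "positive"),
    ("Terrible sound, a waste of money.", "negative"),
]
vectors = [(bag_of_words(preprocess(text)), label) for text, label in reviews]
train, test = train_test_split(vectors)
print(len(train), "training rows,", len(test), "testing rows")
```

A fixed seed keeps the split repeatable, which makes comparisons between classifiers fairer because every model sees the same training and testing rows.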
3.1. Efficient J48 decision tree classifier (EJ48DTC):
EJ48DTC in Weka creates decision trees by recursively splitting data based on
attribute values, aiming for homogeneous subsets. It chooses features that provide
considerable information gain or entropy reduction. For sentiment analysis, the Amazon
reviews are preprocessed, features are extracted, and EJ48DTC is trained on the labeled
data. Its performance is evaluated using criteria such as accuracy, and the resulting tree
is then inspected to gain insight into sentiment categorization. EJ48DTC provides a
straightforward and effective technique for assessing sentiment in Amazon reviews, finding
influential variables and their thresholds, and predicting positive or negative sentiment
in a streamlined process.
The key metrics used for attribute selection in the EJ48DTC method implemented in Weka
for sentiment analysis on the Amazon dataset are Information Gain and Entropy Reduction.
Information Gain quantifies the decrease in uncertainty about the class label (sentiment)
when the data is divided according to a given attribute: it compares the entropy of the
original class distribution with the entropy after the split. Entropy Reduction expands on
this idea by taking the weighted average of the entropy reduction over all potential
attribute values. EJ48DTC selects the most informative attributes for classification based
on these measures.
Parameter    Description
H(B)         Entropy of the target variable B.
H(B|A)       Conditional entropy of B given the attribute A.
H(Bi|A)      Conditional entropy of B given the subset A = ai.
Ni           Number of instances in the subset.
N            Total number of instances.
Table 1: Formula Description
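To make the quantities in Table 1 concrete, the following minimal sketch computes H(B), H(B|A) as the (Ni / N)-weighted average of the subset entropies, and their difference as the information gain. It illustrates the measure as described in the text and is not Weka's internal code; note that C4.5, on which Weka's J48 is based, additionally normalizes this value into a gain ratio, which is not shown here. The function names and the toy attribute are assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """H(B): Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def conditional_entropy(attribute_values, labels):
    """H(B|A): sum over subsets A = a_i of (N_i / N) * H(B_i|A)."""
    total = len(labels)
    subsets = {}
    for a, b in zip(attribute_values, labels):
        subsets.setdefault(a, []).append(b)
    return sum(len(s) / total * entropy(s) for s in subsets.values())


def information_gain(attribute_values, labels):
    """Reduction in label uncertainty obtained by splitting on attribute A."""
    return entropy(labels) - conditional_entropy(attribute_values, labels)


# Toy data: does the review contain the word "great"? (hypothetical attribute)
contains_great = [1, 1, 0, 0]
sentiment = ["positive", "positive", "negative", "negative"]
print(information_gain(contains_great, sentiment))
```

In this toy case the attribute separates the two classes perfectly, so the gain equals the full entropy of the labels; an attribute unrelated to the label would yield a gain of zero.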
$$P(b \mid a) = \frac{P(a \mid b) \times P(b)}{P(a)} \qquad (3)$$
Parameter    Description
P(a)         Probability of observing the input features a; it acts as the normalization factor.
P(b)         Prior probability of class b, which shows the likelihood that each sentiment label will appear in the dataset.
P(b|a)       Probability of class b given the input features a, that is, the likelihood of a sentiment label (such as positive or negative) given the input text.
P(a|b)       Probability of the input features a given class b, that is, the likelihood of the input text given a sentiment label (such as positive or negative).
Table 2: Formula Description
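The Bayes rule whose terms Table 2 describes can be illustrated with a small word-count classifier. The sketch below is a generic multinomial Naive Bayes, not the PCRFIEML implementation itself, whose internals are not given here. The add-one (Laplace) smoothing, the function names, and the two training documents are assumptions introduced for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs):
    """docs: list of (tokens, label). Returns priors P(b), per-class word counts, vocabulary."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    priors = {b: n / len(docs) for b, n in class_counts.items()}
    return priors, word_counts, vocab


def posterior(tokens, priors, word_counts, vocab):
    """P(b|a) for every class b, assuming independent features and add-one smoothing."""
    log_joint = {}
    for b, prior in priors.items():
        total = sum(word_counts[b].values())
        score = math.log(prior)  # log P(b)
        for t in tokens:
            # log P(a_i|b), smoothed so unseen words do not zero the product
            score += math.log((word_counts[b][t] + 1) / (total + len(vocab)))
        log_joint[b] = score
    # P(a) is the normalization factor: the sum of P(a|b) * P(b) over all classes.
    evidence = sum(math.exp(s) for s in log_joint.values())
    return {b: math.exp(s) / evidence for b, s in log_joint.items()}


# Made-up training documents for illustration.
docs = [(["great", "phone"], "positive"), (["bad", "phone"], "negative")]
priors, word_counts, vocab = train_naive_bayes(docs)
print(posterior(["great"], priors, word_counts, vocab))
```

Working in log space avoids numeric underflow when a review contains many words, which is the usual reason the product of likelihoods is not computed directly.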
Step 2: Split the dataset into training and testing sets to facilitate model evaluation.
Step 3: Perform extensive feature engineering to extract relevant features from the Amazon
review dataset. Consider features like word frequency, n-grams, sentiment lexicons, etc.
Step 4: Preprocess the text data, including steps like tokenization, lowercasing, stop-word
removal, and possibly stemming. Also, handle negation and context in the text using
techniques like negation handling and part-of-speech tagging.
Step 5: Utilize techniques such as information gain or chi-square test to select the most
informative features that contribute significantly to sentiment analysis.
Step 6: Convert the preprocessed text data into a suitable format, such as bag-of-words
representation or TF-IDF vectors.
Step 7: Train the PCRFIEML classifier on the training set, specifying the target variable
(sentiment labels) and input features (selected textual characteristics).
Step 8: Evaluate the trained PCRFIEML model on the testing set using performance metrics
like accuracy, precision, recall, and F1-score to assess its effectiveness in sentiment analysis.
Step 9: Analyze the learned probabilities from the PCRFIEML model to understand the
importance of different words/features in predicting sentiment.
Step 10: Combine multiple PCRFIEML classifiers using techniques like bagging or boosting
to improve classification accuracy.
Step 11: Compare the performance of the PCRFIEML classifier against alternative machine
learning techniques to determine the most suitable approach for sentiment analysis on the
Amazon dataset.
Step 12: Generate a detailed report including performance metrics, feature importance
analysis, and any insights gained from model evaluation.
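Step 6 above mentions TF-IDF vectors as one possible representation. As a rough illustration, the sketch below uses one common TF-IDF variant (term frequency as a fraction of document length, inverse document frequency as the natural log of N over the document frequency). Weka's own filter may weight terms differently, so the exact formula and the function name `tf_idf` are assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """documents: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in documents:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up tokenized reviews for illustration.
docs = [["great", "phone"], ["bad", "phone"]]
for vector in tf_idf(docs):
    print(vector)
```

A word that appears in every review, such as "phone" here, receives a weight of zero, which reflects that it carries no information for telling the reviews apart.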
Parameter            Description
ŷRF                  Sentiment label predicted by the Improved Random Forest ensemble.
Mode()               Function that returns the mode (most common) sentiment label among all decision tree predictions.
ŷ1, ŷ2, ŷ3, …, ŷn    Sentiment labels predicted by the individual decision trees in the Improved Random Forest ensemble.
Table 3: Parameter Description for EPAC algorithm
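The mode-based vote described in Table 3 can be sketched as follows. The three "trees" are hypothetical one-rule stand-ins rather than trained decision trees, and the helper name `majority_vote` is an assumption; the point is only to show how the ensemble label is taken as the most common individual prediction.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return mode(y_1, ..., y_n); ties go to the label that appears first."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Hypothetical stand-ins for trained trees: each maps a token list to a label.
trees = [
    lambda tokens: "positive" if "great" in tokens else "negative",
    lambda tokens: "negative" if "broken" in tokens else "positive",
    lambda tokens: "positive" if "love" in tokens else "negative",
]

tokens = ["great", "phone", "broken", "screen"]
predictions = [tree(tokens) for tree in trees]  # positive, negative, negative
print(majority_vote(predictions))
```

Using an odd number of trees avoids ties in a two-class problem, so the tie-breaking rule only matters for even-sized ensembles.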
Step 3: Preprocess the text data by lowercasing, tokenizing, and removing stop words.
Step 4: Convert the preprocessed text data into a format appropriate for Weka, such as TF-
IDF vectors or bag-of-words representation.
Step 5: Choose the EPAC classifier from the Weka interface under the "Classify" tab,
indicating parameters like the number of trees in the forest, the number of features to take
into account at each split, and other considerations.
Step 6: Utilize ensemble techniques within EPAC, such as feature subset selection or random
feature selection, to introduce diversity among the trees and improve model performance.
Step 8: Assess the trained EPAC model's efficacy in sentiment analysis by utilizing
performance measures such as accuracy, precision, recall, and F1-score on the testing set.
Step 9: Analyze the feature importances acquired via the EPAC model to determine the
important features in predicting sentiment.
Step 10: Compare the EPAC classifier's performance against that of other machine learning
approaches, such as Naive Bayes or Support Vector Machines.
Step 11: Analyze the outcomes and learnings from the EPAC model on sentiment analysis on
the Amazon dataset.
Step 12: Interpret the decisions made by the EPAC model by visualizing individual decision
trees or feature importance plots, which can provide insights into how the model is analyzing
the text data for sentiment analysis.
Step 13: Generate a detailed report including performance metrics, feature importance
analysis, and any insights gained from model evaluation.
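Steps 5 and 6 above refer to the number of features considered at each split and to random feature selection as a source of diversity among trees. The sketch below shows the two sampling operations usually associated with random-forest-style ensembles: drawing rows with replacement and drawing a random feature subset. Whether EPAC uses exactly this scheme is an assumption, as are the helper names and the toy vocabulary.

```python
import random


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (the bagging step)."""
    return [rng.choice(rows) for _ in rows]


def random_feature_subset(features, k, rng):
    """Pick k distinct features for one tree or split to consider."""
    return rng.sample(features, k)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
rows = list(range(10))   # stand-ins for review indices
features = ["great", "bad", "battery", "refund", "love", "broken"]  # hypothetical vocabulary

for tree_id in range(3):
    sample = bootstrap_sample(rows, rng)
    subset = random_feature_subset(features, 2, rng)
    print(tree_id, sorted(sample), subset)
```

Because each tree sees a different sample and a different feature subset, the trees make partly independent errors, which is what allows the majority vote to reduce overfitting.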
4. Performance Evaluation
In this study, we used three classifiers from the WEKA tool (EJ48DTC, PCRFIEML, and
the EPAC Classifier) to compare sentiment analysis on Amazon product reviews. The
experiment used 400 Amazon product reviews, each labeled as expressing positive or negative
sentiment. To evaluate the classifiers' performance, a range of assessment measures were
used during training and testing.
According to the findings, Naive Bayes (the PCRFIEML classifier) was the
best-performing model for sentiment analysis on the Amazon data. It attained the highest
accuracy, 88.25%, correctly classifying 353 out of 400 instances. In addition, Naive Bayes
fared better than the EPAC Classifier and the EJ48DTC decision tree in terms of the Kappa
statistic, mean absolute error, root mean squared error, relative absolute error, and root
relative squared error.
In particular, its Kappa score of 0.7647 suggests substantial agreement beyond chance.
Compared with the other classifiers, it also showed lower mean absolute error and root mean
squared error, as well as the lowest relative absolute error and root relative squared
error, indicating more reliable sentiment classification.
The comparison demonstrates the accuracy and robustness of Naive Bayes in identifying
both positive and negative attitudes expressed in product reviews, highlighting its
usefulness in sentiment analysis on Amazon data. In light of this, we suggest that Naive
Bayes be used as the model of choice for sentiment analysis in comparable situations. This
will provide useful information for companies and scholars who want to use sentiment
analysis to examine customer feedback and inform decision-making.
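The evaluation measures named in this section can all be derived from a two-class confusion matrix, as the sketch below illustrates. Only the totals (353 correct out of 400) come from the results above; the individual cell counts are hypothetical, chosen so that the correct predictions sum to 353, and the resulting precision, recall, F1, and Kappa values are therefore not the figures obtained in this study. The function assumes no denominator is zero.

```python
def evaluate(tp, fp, fn, tn):
    """Accuracy, precision, recall, F1, and Cohen's kappa for a binary classifier."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # Agreement expected by chance, from the row and column totals.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (accuracy - p_expected) / (1 - p_expected)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "kappa": kappa,
    }


# Hypothetical cell counts: 180 + 173 = 353 correct out of 400 instances.
metrics = evaluate(tp=180, fp=27, fn=20, tn=173)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Kappa corrects accuracy for the agreement a classifier would reach by chance, which is why it is reported alongside accuracy when class proportions may differ.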
[Figures: charts comparing the J48 Classifier, Naive Bayes, and Random Forest Classifier, including panels titled "Kappa Statistic" and "Accuracy"; legend: negative, positive.]
6. Conclusion
Using WEKA classifiers for sentiment analysis on Amazon data, the best model was
found to be PCRFIEML, which demonstrated strong performance and high accuracy across a
range of assessment measures. In the future, improving the project may entail investigating
sophisticated feature engineering techniques, experimenting with ensemble approaches,
adjusting classifier settings, resolving class imbalance problems, tailoring analysis for certain
domains, and putting ongoing learning strategies into practice. With these improvements,
sentiment analysis should become even more accurate and efficient, giving academics and
companies more useful information about what customers believe.
7. References
[1] Muhammad Ali, Faqeer Hussain, Bilal Ahmad, and Muhammad Usman, The Natural Language
Processing Based Approach for Sentiment Analysis of User Reviews of Amazon Product by Using
Machine and Deep Learning Algorithms, Volume 13, Issue 3, June 2023, International Journal
of Current Engineering and Technology.
[2] Rezaul Haque, Naimul Islam, Mayisha Tasneem, and Amit Kumar Das, Multi Class Sentiment
Classification on Bengali Social Media Comments Using Machine Learning, Volume 4, June
2023, International Journal of Cognitive Computing in Engineering.
[3] Manish Bhargava and Himanshu Arora, Comparative Analysis and Design of Different
Approaches for Twitter Sentiment Analysis and Classification of SVM, Volume 10, Issue 9,
30 September 2022, International Journal of Recent Innovation Trends in Computing and
Communication.
[4] Archolito V Pahuriray, Joe D. Basanta, Jan Carlo T. Arroyo, and Allemar Jhone P.
Delima, Flexible Learning Experience Analyzer (FLExA): Sentiment Analysis of College
Students through Machine Learning Algorithms with Comparative Analysis using WEKA,
Volume 12, Issue 12, December 2022, International Journal of Emerging Technology and
Advanced Engineering.
[5] Mihir P. Mehta, Gopal Kumar, and M. Ramkumar, Customer Expectations in the Hotel
Industry During the COVID-19 Pandemic: A Global Perspective Using Sentiment Analysis,
Volume 48, Issue 1, 18 March 2021, Tourism Recreation Research.
[6] Jin Zhou and Jun-Min Ye, Sentiment Analysis in Education Research: A Review of
Journal Publications, Volume 31, Issue 3, 01 October 2020, Interactive Learning
Environments.
[9] N.F.F. da Silva, Tweet Sentiment Analysis with Classifier Ensembles, 2014, Decision
Support Systems.
[10] Callen Rain, Sentiment Analysis in Amazon Reviews Using Probabilistic Machine
Learning, 2013, Swarthmore College.