applied

sciences
Article
Multi-Aspect Oriented Sentiment Classification: Prior
Knowledge Topic Modelling and Ensemble Learning
Classifier Approach
Najwa AlGhamdi 1, * , Shaheen Khatoon 2 and Majed Alshamari 1

1 Department of Information Systems, King Faisal University, Al-Ahsa 31982, Saudi Arabia; [email protected]
2 School of AI and Advanced Computing, Xi’an Jiaotong-Liverpool University, Suzhou 215000, China;
[email protected]
* Correspondence: [email protected]

Abstract: User-generated content on numerous sites is indicative of users’ sentiment towards many
issues, from daily food intake to using new products. Amid the active usage of social networks and
micro-blogs, notably during the COVID-19 pandemic, we may glean insights into any product or
service through users’ feedback and opinions. Thus, it is often difficult and time consuming to go
through all the reviews and analyse them in order to recognize the notion of the overall goodness
or badness of the reviews before making any decision. To overcome this challenge, sentiment
analysis has been used as an effective rapid way to automatically gauge consumers’ opinions.
Large reviews will possibly encompass both positive and negative opinions on different features of
a product/service in the same review. Therefore, this paper proposes an aspect-oriented sentiment
classification using a combination of the prior knowledge topic model algorithm (SA-LDA), automatic
labelling (SentiWordNet) and ensemble method (Stacking). The framework is evaluated using the
dataset from different domains. The results have shown that the proposed SA-LDA outperformed
the standard LDA. In addition, the suggested ensemble learning classifier has increased the accuracy
of the classifier by more than ~3% when it is compared to baseline classification algorithms. The
study concluded that the proposed approach is equally adaptable across multi-domain applications.

Keywords: sentiment classification; prior knowledge; topic models; data labelling; ensemble learning;
stacked generalization

Citation: AlGhamdi, N.; Khatoon, S.; Alshamari, M. Multi-Aspect Oriented Sentiment Classification:
Prior Knowledge Topic Modelling and Ensemble Learning Classifier Approach. Appl. Sci. 2022, 12, 4066.
https://doi.org/10.3390/app12084066

Academic Editors: Kristina Yordanova and Emma Tonkin
Received: 14 March 2022
Accepted: 12 April 2022
Published: 18 April 2022

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and
institutional affiliations.

Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction
Amid the active usage of social networks and micro-blogs, especially during the
COVID-19 pandemic, we may glean insights into any product or service through users’ feedback
and opinions. Platforms such as micro-blogs, social media sites, online reviews, and
discussion forums are rapidly growing. Therefore, it is challenging and time-consuming to
go through all the reviews and analyse them with the intention of discovering the notion of
the overall goodness or badness of these reviews. Accordingly, the essential endeavours to
automatically analyse the sentiments of the users’ reviews are increasingly needed.
Opinion mining and sentiment analysis are automatic classifications of textual information
that focus on classifying data according to polarity (positive or negative). These
automatic techniques could possibly be among the adopted ways to gauge both user impressions
and satisfaction. User-generated content usually contains unstructured text that
is used in classification tasks such as information extraction (IE), text analysis and natural
language processing (NLP). It is applied to a vast number of reviews. Therefore, there is
urgent demand for an advanced framework and formulas that can deal with the massive
amount of information in order to precisely handle them and provide the most accurate
related results.

Appl. Sci. 2022, 12, 4066. https://doi.org/10.3390/app12084066 https://www.mdpi.com/journal/applsci

However, predicting overall polarities for each review is not enough since the review
could provide comments on various aspects of the corresponding product or service. For
instance, one review about a restaurant may mention the prices, cleanliness, services
and more. Analysing these aspects, rather than the overall review, constructs a better
understanding of the exact leading pros and cons of the product or service. Therefore, this
study focuses on aspect-level sentiment classification, which predicts the sentiment polarity
of every aspect mentioned in a review.
This paper proposes a multi-aspect-oriented sentiment classification model by using a
combination of the prior knowledge topic model algorithm (SA-LDA), automatic labelling
(SentiWordNet) and ensemble method (Stacking). In this study, the multi-aspect sentiment
analysis is addressed by using a topic model and an ensemble learning method. However,
the challenge of the models is that documents are rich in excessively informal and colloquial
language. Thus, this research aims to identify an approach that depends on the combination
of probabilistic topic modelling, namely, Seeded Aspect Latent Dirichlet Allocation (SA-
LDA) and an ensemble learning method, to analyse and visualize noticeable aspects of text
documents and classify them afterwards. Different domains, methods and classifiers have
been used to address the aspect extraction and sentiment analysis tasks.
Furthermore, to evaluate the effectiveness of the proposed model, we conduct extensive
experiments on three different domains of online reviews (movie, restaurant, and
domestic Saudi Airlines reviews). The proposed model shows promising results. As far as
we know, no previous research has proposed a model similar to ours, which
consists of three main modules: (1) LDA-based topic modelling; (2) sentiment lexicon
(SentiWordNet); (3) the ensemble classifier (stacked generalization method).
Section 2 of this paper describes several related works, whereas the description of data
collection, multi-aspect extraction model, proposed methodology, and ensemble learning
algorithm are presented in Section 3. The findings and the conclusion with future works
are presented in Sections 4 and 5, respectively.

2. Related Work
In this section, we offer a brief summary of the previous work in the context of
aspect extraction via prior knowledge topic modelling, sentiment lexicon classification, and
ensemble learning methods for Sentiment Analysis.

2.1. Multi-Aspect Topic Modelling for Aspect Extraction (Prior Knowledge Models)
Aspect extraction is one of the central phases in analysing the opinions, emotions and
viewpoints expressed in textual data shared on a certain topic. Although current
aspect extraction procedures are based on topic models, relying on topic models alone
tends to generate unrelated and incoherent aspects. Prior knowledge semi-supervised
models were introduced to improve the correctness of aspect extraction using
topic models with minimal user involvement. These models use domain-specific
knowledge to guide the topic extraction task and thereby limit the number of
unrelated extracted topics.
Several studies revealed that employing prior knowledge of a topic model has raised
the aspect extraction accuracy. However, existing research studies have concentrated on
a single domain using knowledge to extract aspects from a specific domain. For instance,
Shi et al. [1] proposed a novel clustering method by leveraging prior knowledge to enhance
the web services clustering task accuracy using a semi-supervised technique. The results
have confirmed that the approach provides a major improvement in the clustering accuracy.
There is a considerable amount of literature on the prior knowledge topic model,
especially with the LDA model, for instance, Concept-LDA [2], MC-LDA [3], SLM [4],
MDK-LDA [5], GK-LDA [6], AKL [7], LTM [8], UFL-LDA [9], and many more.
All of these, and most other prior knowledge topic modelling techniques, use LDA-based
methods for aspect extraction. They report that the extracted aspects are more coherent
and more accurate, and that they significantly improve on the performance of
the baseline topic models [10,11].

2.2. Sentiment Lexicon Classification


Sentiment lexicon classification (sentiment analysis) is the computational analysis of
people’s thoughts, ideas, and feelings towards an entity [12], and it involves classifying
them into positive, neutral, or negative categories. Sentiment lexicon approaches are
applied to label data and to measure the sentiment polarity. Sentiment lexicon classification
relies on two sorts of approaches which are corpus-based and dictionary-based [13].
Many existing studies have applied sentiment lexicons to different domains and languages
[14–18]. Most of these studies used the SentiWordNet lexicon to extract
sentiments with little manual intervention. They report that the chosen
lexicon improved accuracy in terms of topic-specific lexical sentiments.

2.3. Ensemble Learning Method


Ensemble learning methods are among the top current research topics in machine
learning [19]. Machine learning models are used for performing predictive classification in
order to achieve a good performance, and special attention has been drawn to sentiment
classification tasks. Some of the common ensemble learning methods include Averaging,
Bagging, AdaBoost and Stacking.
Many research studies investigated applying sentiment classification using ensem-
ble methods [20–27]. Experiments were conducted on different domains such as restau-
rants [27–29], movies [30–33], products [34–37] and more. Additionally, the proposed
ensemble models [23,38–42], with various characteristics such as domains, languages,
and datasets have indicated that utilizing ensemble methods led to achieving optimized
performance in the tasks of sentiment classification.

3. Materials and Methods


An overview of the proposed methodology is shown in Figure 1. It consists of data pre-
processing followed by three core modules: (1) aspect extraction using the prior knowledge
topic model (SA-LDA) algorithm; (2) automatic labelling (SentiWordNet); (3) ensemble
learning classifier (Stacking). The details of each component are described in the follow-
ing subsections.

3.1. Dataset and Pre-Processing


The first module of the proposed methodology consists of data collection and pre-
processing. In this module, the data about users’ opinions towards different aspects
is collected from different online reviews on several domains. Table 1 shows the basic
descriptive information of the three datasets used in the experimental analysis.

Table 1. Summary of the datasets.

Datasets | No. of Reviews | Source of Datasets
Movie Reviews | 2000 | IMDB
Restaurant Reviews | 2000 | TripAdvisor
Domestic Saudi Airlines Reviews | 2000 | TripAdvisor
Total | 6000 reviews |

A step-by-step procedure for data collection and pre-processing is outlined in Algorithm 1.
The result is a pre-processed textual corpus of opinion units (sentences) that are ready
to be handled to extract aspects and opinion aspects in the next step.

Algorithm 1: Algorithm for data collection and pre-processing

Input: Online reviews (Ri)
Output: Cleaned reviews (CRi)
For each Review in Ri, where i = 1, 2, 3, 4 . . . n
Apply:
1. Remove unwanted contents (Ri).
2. Remove stop-words (Ri).
3. Convert to lowercase (Ri).
4. Tokenization (Ri).
5. Replace conjunction words with full-stop (Ri).
6. Detect sentence boundaries (Ri).
7. Repeat steps 2 to 5 until Ri = Rn.
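
The steps of Algorithm 1 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the stop-word and conjunction lists are tiny hypothetical stand-ins, "unwanted contents" is taken to mean markup and URLs, and the function name `preprocess` is ours rather than the authors'. Stop-word filtering is applied while sentences are assembled, which is an implementation convenience and not necessarily the order used in the paper.

```python
import re

# Assumed, deliberately tiny word lists; a real pipeline would use fuller ones.
STOP_WORDS = {"the", "a", "an", "is", "was", "were", "of", "to", "it"}
CONJUNCTIONS = {"and", "but", "or", "however", "although"}
SENTENCE_END = {".", "!", "?"}


def preprocess(review):
    """Return a list of opinion units, each a list of cleaned tokens."""
    # Remove unwanted contents (assumed here to be HTML tags and URLs).
    text = re.sub(r"<[^>]+>|https?://\S+", " ", review)
    # Convert to lowercase, then tokenize into words and sentence marks.
    tokens = re.findall(r"[a-z']+|[.!?]", text.lower())
    # Replace conjunction words with a full stop.
    tokens = ["." if tok in CONJUNCTIONS else tok for tok in tokens]
    # Detect sentence boundaries and drop stop-words along the way.
    units, current = [], []
    for tok in tokens:
        if tok in SENTENCE_END:
            if current:
                units.append(current)
            current = []
        elif tok not in STOP_WORDS:
            current.append(tok)
    if current:
        units.append(current)
    return units


units = preprocess("The food was great but the service was slow. <br> Visit http://x.y")
print(units)
```

Splitting on conjunctions means a sentence such as "great food but slow service" yields two opinion units, so each unit is more likely to carry a single aspect and a single polarity.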

Figure 1. The architecture of the proposed framework.

3.2. Aspect Extraction
The next step of the proposed model pipeline is automatically extracting semantic
aspects (which are also called topics) from the pre-processed textual corpus. In this paper,
a modified LDA model, called Seeded-Aspects LDA (SA-LDA), is proposed. It has an
unlabelled pre-processed textual corpus that contains opinion units of a specific domain
and an aspect specification as an input. An aspect specification is known as predefined
aspects (seed words). In basic LDA, the model tends to only detect the most obvious aspects
of a text corpus which may not cover the expected and desired aspects. Thus, we proposed
a modified LDA model by providing seed words (seed aspects) to guide the model to only
generate words from analogous seed aspects as presented in Figure 2.

Figure 2. The proposed model in plate notation.

The SA-LDA at its basis comprises an LDA-based topic modelling, and it is extended
with biased topic modelling hyper-parameters (β and α) that are based on continuous word
embeddings. The number of aspects (k) is set based on the number of unique main aspects
needed. Each review is modelled by an aspect and contains a sentence. The proposed
model in plate notation is illustrated in Figure 2, where the generative hypothesis algorithm
is described in Algorithm 2.

Algorithm 2: Algorithm for the generative hypothesis

1. For each aspect $k = 1, \ldots, K$:
   • Choose seed aspect $\phi_k^{A} \sim \mathrm{Dir}(\beta)$.
2. For each review $d$:
   • Choose $\theta_d \sim \mathrm{Dir}(\alpha)$.
   • For each token $w_n$, $n = 1, \ldots, N_d$:
     • Sample $\pi_{d,n} \sim \mathrm{MaxEnt}(\lambda, x_{w_{d,n}})$.
     • Draw a topic $Z_n \sim \mathrm{Multi}(\theta_d)$.
     • Draw an indicator $y_{d,n} \sim \mathrm{Bern}(\pi_{d,n})$.
     • If $y_{d,n} = A$: sample a word $w_n \sim \mathrm{Multi}(\phi_{Z_n}^{A})$.

We provided the model with several seed words for each main aspect as shown in
Table 2. After feeding in unique aspects and seeded words for each dataset, each review
sentence becomes ready for the next phase of the sentiment analysis task as described in
the next subsection.

Table 2. Aspects and seed words for each domain.

Domain | Aspect | Seed Words
Movie | ACT | performance, actor, play, character, role, scene
Movie | PLOT | story, script, sequence, scenario
Movie | SOUNDTRACK | sound, audio, music, playlist, effect
Restaurant | FOOD | taste, dish, dinner, appetizer, menu
Restaurant | SERVICE | internet, parking, delivery, location, seating, staff
Restaurant | PRICE | payment, price, discount, cost, pay offer
Domestic Saudi Airlines | FLIGHT | meal, internet, entertainment, seat, cleanliness, drink
Domestic Saudi Airlines | SERVICE | lounge, ticket, baggage, upgrade, punctuality
Domestic Saudi Airlines | STAFF | crew, captain, pilot, service, steward
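
To give a rough feel for how seed words such as those in Table 2 can steer a topic model, the sketch below builds an asymmetric Dirichlet prior per aspect in which seed words receive extra prior mass, then draws one word distribution per aspect. It is a simplified stand-in and not the authors' SA-LDA: the paper derives its biased hyper-parameters from continuous word embeddings, whereas here the bias is a fixed additive boost. The function names, the toy vocabulary, the seed subsets, and the `base` and `boost` values are all assumptions made for illustration.

```python
import random


def seeded_beta(vocab, seeds_by_aspect, base=0.1, boost=1.0):
    """Build one Dirichlet prior vector per aspect; seed words get extra mass."""
    return {
        aspect: [base + (boost if word in seeds else 0.0) for word in vocab]
        for aspect, seeds in seeds_by_aspect.items()
    }


def sample_dirichlet(alpha, rng):
    """Draw from Dir(alpha) by normalising independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


# Toy vocabulary and a subset of the restaurant seed words from Table 2.
vocab = ["taste", "dish", "menu", "staff", "delivery", "parking",
         "price", "discount", "cost", "view"]
seeds = {
    "FOOD": {"taste", "dish", "menu"},
    "SERVICE": {"staff", "delivery", "parking"},
    "PRICE": {"price", "discount", "cost"},
}

rng = random.Random(0)
priors = seeded_beta(vocab, seeds)
phi = {aspect: sample_dirichlet(alpha, rng) for aspect, alpha in priors.items()}

for aspect, dist in phi.items():
    top = sorted(zip(vocab, dist), key=lambda pair: -pair[1])[:3]
    print(aspect, [word for word, _ in top])
```

Because the prior for each aspect concentrates on its own seed words, sampled word distributions tend to place most of their probability there, which is the intuition behind guiding the model towards the desired aspects.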

3.3. Automatic Labelling System


Automatic labelling uses the sentiment lexicon approach to label data and to measure
the sentiment polarity. In order to label a dataset in this work, SentiWordNet is applied.
SentiWordNet is obtained from the WordNet dictionary where each word is associated
with a numerical score. In this phase, for each sentence, the SentiWordNet dictionary
is applied to determine the polarity of each word, and then the polarity of the whole
sentence is calculated by adding the polarity of each word. If the word is not in the
SentiWordNet dictionary, it is searched for in the WordNet dictionary. WordNet is an
English language dictionary that contains synonym words gathered into a set called syn-set.
Thus, the analogous words related to the word in WordNet are fetched and searched in the
SentiWordNet dictionary such that their sentiment score is selected for polarity calculation.
This procedure increases the efficiency and effectiveness of automatic labelling.
Furthermore, some words, called negation words, may affect the sentiment orientation
of other words in the sentence. Negation words are those words that reverse the polarity
of the sentence when occurring in it. For example, in the text “the food is not good”, the
negation word “not” reverses the polarity of the sentence. To handle this issue, a negation
is considered in the polarity calculation. The algorithm of the automatic labelling phase is
illustrated in Algorithm 3.

Algorithm 3: Algorithm of the automatic labelling

Input: Sentences, SentiWordNet, WordNet, NegationWords
Output: Labelled Dataset
for each sentence S:
    taggedSentence = POS(S)
    PolarityScore = 0
    TotalWordCandidateCount = 0
    for each WordCandidate (verb, adverb, and adjective) in taggedSentence:
        if WordCandidate in SentiWordNet:
            score = LookupSentiWordNet(WordCandidate)
        else:
            score = LookupSentiWordNet(LookupWordNet(WordCandidate))  // score of a syn-set word
        if score > 0:
            polarity(WordCandidate) ← positive
        else if score < 0:
            polarity(WordCandidate) ← negative
        else:
            polarity(WordCandidate) ← neutral
        if there is a NegationWord near WordCandidate:
            polarity(WordCandidate) ← opposite(polarity(WordCandidate))
            score = −score
        PolarityScore += score
        TotalWordCandidateCount++
    AveragePolarity = PolarityScore / TotalWordCandidateCount
    if AveragePolarity > 0:
        return 1
    else:
        return 0
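
The labelling logic can be illustrated with a small self-contained sketch. It is not the authors' implementation: the lexicon scores and synonym sets below are hypothetical toy values standing in for SentiWordNet and WordNet, POS tagging is omitted (every word found in the toy lexicon is treated as a candidate), and "near" is assumed to mean within the two preceding tokens.

```python
# Hypothetical toy scores standing in for SentiWordNet; real scores differ.
TOY_LEXICON = {"good": 0.6, "great": 0.8, "bad": -0.6, "slow": -0.4, "like": 0.5}
# Hypothetical synonym sets standing in for WordNet syn-sets.
TOY_SYNONYMS = {"tasty": ["good"], "awful": ["bad"]}
NEGATIONS = {"not", "no", "never"}


def word_score(word):
    """Look the word up directly, then fall back to its synonyms."""
    if word in TOY_LEXICON:
        return TOY_LEXICON[word]
    for synonym in TOY_SYNONYMS.get(word, []):
        if synonym in TOY_LEXICON:
            return TOY_LEXICON[synonym]
    return None  # not an opinion word in this toy lexicon


def label_sentence(tokens, window=2):
    """Return 1 (positive) or 0 (negative) from the average signed score."""
    total, count = 0.0, 0
    for i, token in enumerate(tokens):
        score = word_score(token)
        if score is None:
            continue
        # Reverse the polarity when a negation word precedes the candidate.
        if any(prev in NEGATIONS for prev in tokens[max(0, i - window):i]):
            score = -score
        total += score
        count += 1
    average = total / count if count else 0.0
    return 1 if average > 0 else 0


print(label_sentence("the food is not good".split()))
print(label_sentence("the food is tasty".split()))
```

In this sketch a sentence with no opinion words averages to zero and is therefore labelled 0, mirroring the binary output of Algorithm 3; a real system might prefer to mark such sentences as neutral.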

The output is the sentiment polarity of each sentence together with its label (1 for positive
and 0 for negative). The labelled dataset is then used in the next phase to train the
ensemble learning classifier, which performs the sentiment classification. Specifically,
stacked generalization is employed on different classifier algorithms, as explained in the
next sub-section.

3.4. Predicting Polarity of Large-Scale Social Data Using Supervised Learning (The Ensemble
Learning Classifier Method)
An ensemble algorithm is trained on the labelled dataset to classify unseen reviews
as positive or negative on the go. To date, numerous ensemble learning methods have
been developed and introduced to enhance the performance of classification tasks. The
major purpose of the ensemble models is to combine a set of classifiers with the intention of
achieving a better and more reliable predictive performance than a single classifier [43]. The
focus will be on the capability of an ensemble model to generate a better result compared
to each baseline classifier. In this experiment, a stacked generalization method was used, as
shown in Figure 3, because it minimizes generalization error.

Figure 3. Steps of the ensemble learning.

The idea of stacked generalization is to combine the prediction results of several
base classifiers in the first level using a meta classifier in the next level in order to minimize
the generalization error. The process of performing a stacked generalization with k-fold
cross-validation is shown in Figure 3.
The first step includes training the base classifiers in the first level, which are support
vector machine, logistic regression, random forest, decision tree, naïve Bayes, and K-nearest
neighbours, by employing k-fold cross-validation on each classifier. The dataset is divided
into k subsets. In each of k sequential rounds, one of the k subsets is used as the test
set and the other k − 1 subsets form the training set. After that, each base classifier
generates a prediction. Then, the prediction values from each classifier are combined and
provided as the dataset for the second level. Finally, a meta classifier is trained on the
second level with the first level dataset to produce the final prediction.
Algorithm 4 describes the stacked generalization with k-fold cross-validation with k = 10.
Algorithm 4: Stacked Generalization with k-fold cross-validation

Input: Dataset $D$, base classifiers $t$, base classifier prediction $p$, meta classifier $m$
Output: Ensemble classifier prediction $P$
Apply k-fold CV, $k = 10$, $D_n = \{D_1, D_2, \ldots, D_{10}\}$ //Split the dataset into 10 subsets
for $k \leftarrow 1$ to $n$ do
    for each $t \leftarrow 1$ to $T$ //base classifiers
        train the classifier $p_{kt}$ from $D_n$.
    end for
    for $D_p$ do //generate first level dataset
        get a dataset $D_p$, where $D_p = \{p_{t1}, p_{t2}, \ldots, p_T\}$.
    end for
train $m$ from $D_p$ //meta classifier
return $P$ //final prediction
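
The flow of Algorithm 4 can be sketched in plain Python as follows. This is an illustrative toy, not the paper's configuration: the two base learners (a nearest-centroid rule and a 1-nearest-neighbour rule) and the lookup-table meta classifier are small stand-ins chosen so the example needs no external libraries, whereas the paper uses SVM, logistic regression, random forest, decision tree, naïve Bayes and K-nearest neighbours. All class and function names, and the synthetic two-cluster data, are assumptions.

```python
import random
from collections import Counter, defaultdict


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroid:
    def fit(self, X, y):
        groups = defaultdict(list)
        for x, label in zip(X, y):
            groups[label].append(x)
        self.centroids = {c: [sum(col) / len(rows) for col in zip(*rows)]
                          for c, rows in groups.items()}
        return self

    def predict(self, X):
        return [min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))
                for x in X]


class OneNearestNeighbour:
    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        return [self.y[min(range(len(self.X)), key=lambda i: sq_dist(x, self.X[i]))]
                for x in X]


class LookupMeta:
    """Meta classifier: most frequent label for each tuple of base predictions."""
    def fit(self, Z, y):
        votes = defaultdict(Counter)
        for z, label in zip(Z, y):
            votes[tuple(z)][label] += 1
        self.table = {z: c.most_common(1)[0][0] for z, c in votes.items()}
        self.default = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, Z):
        return [self.table.get(tuple(z), self.default) for z in Z]


def fit_stacking(X, y, base_factories, meta, k=10, seed=0):
    n = len(X)
    order = list(range(n))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]           # k disjoint subsets
    Z = [[None] * len(base_factories) for _ in range(n)]
    for fold in folds:                                # each subset is held out once
        held_out = set(fold)
        train = [i for i in range(n) if i not in held_out]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        for j, make in enumerate(base_factories):
            model = make().fit(X_tr, y_tr)
            for i, pred in zip(fold, model.predict([X[i] for i in fold])):
                Z[i][j] = pred                        # first-level dataset
    meta.fit(Z, y)                                    # second level
    bases = [make().fit(X, y) for make in base_factories]

    def predict(X_new):
        columns = [b.predict(X_new) for b in bases]
        return meta.predict(list(zip(*columns)))
    return predict


# Synthetic, well-separated two-class data (purely illustrative).
X = [[0.1 * i, 0.05 * i] for i in range(20)] + [[5 + 0.1 * i, 5 + 0.05 * i] for i in range(20)]
y = [0] * 20 + [1] * 20
predict = fit_stacking(X, y, [NearestCentroid, OneNearestNeighbour], LookupMeta())
print(predict([[0.1, 0.2], [5.1, 4.9]]))
```

The key design point is that the meta classifier is trained only on out-of-fold predictions, so it never sees a base prediction made on data that the base model was trained on.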

4. Evaluation Criteria and Experimental Results


The evaluation methods for classification models used in this paper are precision,
recall and F-measure, as in [44]. They were used to estimate the performance result of each
classifier. We evaluated our classifiers and models according to a 10-fold cross-validation
scheme on the datasets.
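
For reference, the three measures can be computed from true and predicted binary labels as in the short sketch below. The function name and the choice of label 1 as the positive class are assumptions made for illustration; they follow the standard definitions, with precision = TP / (TP + FP), recall = TP / (TP + FN) and F-measure as their harmonic mean.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Two true positives, one false positive, one false negative.
p, r, f1 = precision_recall_f1([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
print(round(p, 3), round(r, 3), round(f1, 3))
```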
In this section, we will evaluate and discuss the three main modules of the proposed
model. In the first module (aspect extraction), we evaluated the proposed model, named
SA-LDA topic modelling. This evaluation relies on two parts: (1) manual evaluation of
each extracted aspect; (2) comparison of results with the baseline topic modelling algorithm
for each domain.
In the second module (automatic labelling), we tested the accuracy of the proposed
lexicon-based approach and verified the results with the manually labelled dataset. We also
compared three lexicon-based approaches with the related works and the present results.
In the third module (ensemble classifier), we illustrated the performance of the pro-
posed classifier model for the purpose of aspect sentiment analysis. This evaluation relies
on two parts: (1) evaluating the performance and accuracy of the proposed model on three
different domains; (2) comparing the proposed model to the baseline classifiers as well as
another ensemble method.

4.1. Aspect Extraction (SA-LDA Model)


The result shows that SA-LDA extracts valuable aspects and relates them to the main
aspect, whereas LDA extracts many unrelated aspects along with some adjectives
that are better regarded as opinion words than as aspects. Table 3 compares the results
obtained from both models for each domain. The words coloured in ‘red’ indicate errors
or unrelated aspects. We manually evaluated the models based on the number of extracted
words that are related to the seed words/aspect. Even with these noisy words, the proposed
model produces better results. Moreover, the proposed model is flexible in a way that
enables it to be adapted to any domain by specifying the seed words for the needed aspects.
Additionally, when the two results are compared, it is obvious that the proposed model
outperforms the baseline model. Tables 3 and 4 illustrate the performance of the
two models across the three domains. Concerning the accuracy of SA-LDA, as illustrated
in Table 4, Restaurant has the highest score with 86.7%, Movie comes second with 83.3%,
and Domestic Saudi Airlines has the lowest score of 80%. Conversely, the standard model
(LDA) scored lower accuracy, with 54%, 41% and 32% for Movie, Restaurant and Domestic
Saudi Airlines, respectively. In conclusion, these results indicate that the proposed model
has been more successful in detecting more correlated aspects, and it is likely to yield
improved results with better performance.

4.2. Automatic Labelling (SentiWordNet)


Sentiment classification is a core task of sentiment analysis, which is a sub-field
of natural language processing. The lexicon approach is applied to extract the opinion
of each aspect by using SentiWordNet, which determines whether the text content specifies
a positive or a negative review. Opinion extraction and automatic labelling are carried out
in three steps: (1) applying part-of-speech tagging to each sentence; (2) extracting all the
opinion words and detecting the polarity of each opinion word; (3) looking for a negation
word that is close to any opinion word, and once it is found, reversing the polarity.
Opinion words usually appear as adjectives, adverbs, and verbs, such
as “like” or “really”, which affect the final result. For instance, the sentences “I like pizza”
and “I really like pizza” both contain positive opinions, but the second sentence is more
positive. Opinion words can be identified after applying POS tagging to each sentence,
and they are typically found near the aspect.
The accuracy of SentiWordNet was measured by applying an SVM classifier
with five-fold cross-validation. The overall accuracy results for each domain are
shown in Table 5. The results are compared with related work in which SentiWordNet
and an SVM classifier have been used for different sentiment analysis tasks.
The results indicate that ‘Restaurant’ recorded the highest accuracy
with 69.4%, ‘Movie’ comes second with 65%, and the lowest score is
recorded by ‘Domestic Saudi Airlines’ with 63.2%. The percentage distribution of the
sentiment polarity for each aspect of the three domains is presented in Figure 4.

Table 3. Comparison between the proposed topic modelling results and the baseline topic
modelling algorithm.

Domain | Aspect | LDA Model | SA-LDA Model
Movie | Act | (actor, show, play, character, art, life, movie, appear, sound, only) | (performance, act, actor, character, play, actress, part, scene, role, do)
Movie | Plot | (story, play, series, know, role, line, set, sound, voice, say) | (story, series, play, script, sequence, role, scenario, line, do, text)
Movie | Soundtrack | (music, song, sound, great, play, musical, product, show, story, movie) | (music, sound, audio, song, show, effect, playlist, soundtrack, play, movie)
Restaurant | Food | (food, restaurant, menu, good, chicken, dishes, street, visit, dinner, service) | (restaurant, menu, food, taste, appetizer, dinner, dishes, view, cook, flavor)
Restaurant | Service | (staff, manager, ask, service, food, said, friendly, restaurant, told, eat) | (service, staff, order, internet, view, location, delivery, parking, menu, seating)
Restaurant | Price | (price, card, feel, money, charged, payment, cheap, night, really, like) | (price, cost, payment, offer, card, discount, pay, bill, charge, worth)
Domestic Saudi Airlines | Flight | (ticket, check, flight, bad, lounge, schedule, time, only, food, hour) | (flight, seat, meal, internet, food, entertainment, movie, clean, drink, ticket)
Domestic Saudi Airlines | Service | (staff, desk, check, airport, out, arrival, hour, ready, late, wait) | (ticket, service, lounge, flight, baggage, schedule, staff, upgrade, offer, punctuality)
Domestic Saudi Airlines | Staff | (staff, friendly, service, direct, talk, helpful, front, desk, time, late) | (staff, service, facility, schedule, time, crew, ticket, pilot, flight, plane)

Table 4. The performance of the proposed model and standard model across three domains.

Domain | Accuracy of the Proposed Model SA-LDA (LDA with Seed Words) | Accuracy of the Baseline Model (LDA without Seed Words)
Movie | 83.3% | 54%
Restaurant | 86.7% | 41%
Domestic Saudi Airlines | 80% | 32%

Table 5. Performance evaluation of reviews using SVM and comparison of related work results.

Domain | Accuracy (SentiWordNet) | Accuracy of Related Work
Movie | 65% | 53.33% [45], 79% [46]
Restaurant | 69.4% | 53% [47], 56% [48]
Domestic Saudi Airlines | 63.2% | 65.2% [49], 46.41% [50]

Figure 4. Distribution of sentiment polarity.

4.3. Ensemble Classifier (Stacking Generalization)
The performance evaluation of the proposed ensemble classifier model for the purpose
of aspect sentiment analysis relies on two parts: (1) making a comparison between the
proposed model and the baseline classifiers, as well as other ensemble methods, on
three different domains; (2) evaluating the performance and accuracy of the proposed
model on three different domains.
Tables 6 and 7 illustrate the comparison between the proposed model and the baseline
classifiers as well as three other ensemble methods, namely bagging, AdaBoost
and majority voting, for the selected domains.
As outlined in Tables 6 and 7, the proposed model has scored better results compared
to the baseline classifiers and other ensemble classifier methods, with an accuracy level
of 81.2%, precision of 81.1%, recall of 80.4%, and F1-score of 81%. On the restaurant
reviews, the lowest accuracy among the other ensemble methods is that of majority voting
with 77.5%. The lowest accuracy among the baseline classifiers is that of the decision tree
with 68.8%, whereas the highest accuracy is 80.4% for the naïve Bayes classifier.

Table 6. Performance comparison of baseline classifiers across three domains.

Domain                       Base Classifier  Accuracy (%)  Precision (%)  Recall (%)  F1 (%)
Restaurant                   SVM              80.2          80.9           79.1        80
Restaurant                   LR               80.3          80.9           79          80
Restaurant                   RF               76            74.5           79          76.7
Restaurant                   DT               68.8          71.2           63.4        67.1
Restaurant                   NB               80.4          81             78.9        80.1
Restaurant                   KNN              68.9          65.9           78.5        71.6
Movie                        SVM              81.2          81             80.7        80.3
Movie                        LR               80            71             73.5        72
Movie                        RF               73            73             72.6        72.7
Movie                        DT               72.2          69             75          71.8
Movie                        NB               71.5          80.5           64          71.3
Movie                        KNN              71            74             68.4        71
The Domestic Saudi Airlines  SVM              73.6          73             73          73.5
The Domestic Saudi Airlines  LR               72            71             72.3        73
The Domestic Saudi Airlines  RF               62            56             75          57.5
The Domestic Saudi Airlines  DT               63.5          70.1           57          63
The Domestic Saudi Airlines  NB               72            81             65          72
The Domestic Saudi Airlines  KNN              73.2          76             70          73.4
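
The four measures reported in Table 6 can be illustrated with a minimal Python sketch. It assumes a binary setting in which the counts of true positives, false positives, false negatives and true negatives for one class are already known; this section does not state how the measures were averaged across classes, so the function name and the single-class formulation are illustrative assumptions rather than the authors' evaluation code.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 as percentages for one class.

    tp, fp, fn, tn are confusion-matrix counts (hypothetical inputs).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": round(100 * accuracy, 1),
        "precision": round(100 * precision, 1),
        "recall": round(100 * recall, 1),
        "f1": round(100 * f1, 1),
    }


# Made-up counts for 100 reviews, used only to exercise the function.
example = classification_metrics(tp=45, fp=10, fn=12, tn=33)
print(example)
```

Because F1 combines precision and recall, a classifier with high recall but low precision (or the reverse) can still obtain a modest F1, which is why the tables report all four measures side by side.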

Table 7. Performance comparison of different ensemble methods with the proposed method across
three domains.

Domain                       Ensemble Method                    Acc. (%)  P (%)  R (%)  F1 (%)
Restaurant                   Bagging                            80.3      80.1   79     80.1
Restaurant                   AdaBoost                           79.5      81     76.4   78.8
Restaurant                   Majority Voting                    77.5      76.4   79.4   77.9
Restaurant                   Stacked Generalization (Proposed)  83.2      83     82.4   83.1
Movie                        Bagging                            80        80.7   79.6   79.4
Movie                        AdaBoost                           77.8      79     77.7   77.7
Movie                        Majority Voting                    77.6      78     77.6   77.2
Movie                        Stacked Generalization (Proposed)  84        83.5   83     84
The Domestic Saudi Airlines  Bagging                            74.1      74     74.3   74
The Domestic Saudi Airlines  AdaBoost                           77.5      77.3   76.2   76.3
The Domestic Saudi Airlines  Majority Voting                    75        74     75.3   75.7
The Domestic Saudi Airlines  Stacked Generalization (Proposed)  84.4      83.1   82     84.7
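
The difference between majority voting and stacked generalization in Table 7 can be sketched in a few lines of standard-library Python. Everything below is a toy illustration under stated assumptions: the two small lexicons, the three rule-based base classifiers, the held-out reviews and the logistic-regression meta-learner are all hypothetical and are not the classifiers or data used in the paper. The point is only the mechanism: majority voting counts the base predictions equally, whereas stacking trains a second-level model on those predictions and can learn which base classifiers to trust.

```python
import math

# Hypothetical toy lexicons; not taken from the paper.
POSITIVE = {"great", "friendly", "tasty"}
NEGATIVE = {"bad", "rude", "late"}


def has_positive_word(text):
    return int(any(word in POSITIVE for word in text.split()))


def has_no_negative_word(text):
    return int(not any(word in NEGATIVE for word in text.split()))


def always_positive(text):
    return 1  # deliberately weak base classifier


BASE_CLASSIFIERS = [has_positive_word, has_no_negative_word, always_positive]


def level0_outputs(text):
    """Predictions of every base classifier; these become the meta-features."""
    return [clf(text) for clf in BASE_CLASSIFIERS]


def majority_vote(text):
    votes = level0_outputs(text)
    return int(2 * sum(votes) > len(votes))


def train_meta_learner(rows, labels, epochs=2000, lr=0.5):
    """Fit logistic regression on level-0 outputs with batch gradient descent."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(rows, labels):
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = 1.0 / (1.0 + math.exp(-z)) - y
            grad_w = [g + error * xi for g, xi in zip(grad_w, x)]
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def stacked_predict(text, weights, bias):
    x = level0_outputs(text)
    return int(sum(w * xi for w, xi in zip(weights, x)) + bias > 0)


# Held-out reviews used only to fit the meta-learner (1 = positive, 0 = negative).
meta_train = [
    ("great food", 1), ("friendly staff", 1),
    ("bad service", 0), ("rude waiter", 0),
    ("we waited an hour", 0), ("portion was tiny", 0),
]
rows = [level0_outputs(text) for text, _ in meta_train]
labels = [label for _, label in meta_train]
weights, bias = train_meta_learner(rows, labels)

review = "the soup was cold"  # contains no lexicon word
print(majority_vote(review), stacked_predict(review, weights, bias))
```

In this toy setting two of the three base classifiers predict "positive" for any review without a negative lexicon word, so majority voting follows them, while the meta-learner learns from the held-out labels that only the first base classifier is informative. With trained base classifiers, the meta-features would normally come from out-of-fold predictions so that the meta-learner does not see predictions made on the base classifiers' own training data.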

5. Conclusions
The main aim of this paper is to develop an efficient model that discovers the sentiments
associated with different aspects of a given text, in order to support more accurate decisions
from the users' perspective. The main objectives of the proposed system are: (1) Designing
an efficient model to identify and extract all the possible aspects from given textual data.
This is achieved by using natural language processing (NLP) to prepare the text in the
format required by a topic model, and then using the topic model to extract the main
topics/aspects in that text. (2) Mapping the extracted aspects to their opinions
using linguistic and statistical techniques, by means of a topic model and lexicon
classification. (3) Developing a sentiment classification model that identifies the
sentiment orientation of each extracted aspect using an ensemble learning classifier.
To evaluate the performance of the proposed framework, we compared each
component to baseline algorithms for topic modelling, lexicon-based classification and
ensemble learning. The results show that the proposed framework predicts the labels
of the three review domains (restaurant, movie, and Saudi airlines) with
accuracies of 83.2%, 84% and 84.4%, respectively. Furthermore, compared with the
baseline algorithms, the proposed system predicted labels correctly more often, with
accuracy more than 2 percentage points higher.
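
The size of this improvement can be checked directly against the accuracy columns of Tables 6 and 7. The short Python sketch below transcribes those values and, for each domain, subtracts the best competing accuracy (base classifier or other ensemble method) from that of stacked generalization; the variable and function names are illustrative only.

```python
# Accuracy (%) values transcribed from Tables 6 and 7.
base_accuracy = {
    "Restaurant": {"SVM": 80.2, "LR": 80.3, "RF": 76.0, "DT": 68.8, "NB": 80.4, "KNN": 68.9},
    "Movie": {"SVM": 81.2, "LR": 80.0, "RF": 73.0, "DT": 72.2, "NB": 71.5, "KNN": 71.0},
    "The Domestic Saudi Airlines": {
        "SVM": 73.6, "LR": 72.0, "RF": 62.0, "DT": 63.5, "NB": 72.0, "KNN": 73.2,
    },
}
other_ensemble_accuracy = {
    "Restaurant": {"Bagging": 80.3, "AdaBoost": 79.5, "Majority Voting": 77.5},
    "Movie": {"Bagging": 80.0, "AdaBoost": 77.8, "Majority Voting": 77.6},
    "The Domestic Saudi Airlines": {"Bagging": 74.1, "AdaBoost": 77.5, "Majority Voting": 75.0},
}
stacking_accuracy = {
    "Restaurant": 83.2,
    "Movie": 84.0,
    "The Domestic Saudi Airlines": 84.4,
}


def gain_over_best_rival(domain):
    """Return the strongest competing method and the accuracy gain over it."""
    rivals = {**base_accuracy[domain], **other_ensemble_accuracy[domain]}
    best_name = max(rivals, key=rivals.get)
    gain = round(stacking_accuracy[domain] - rivals[best_name], 1)
    return best_name, gain


for domain in stacking_accuracy:
    name, gain = gain_over_best_rival(domain)
    print(f"{domain}: +{gain} percentage points over {name}")
```

On these figures the smallest margin is 2.8 percentage points (restaurant and movie reviews), which is consistent with the improvement of more than 2% stated above.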
This study has shown promising results in the field of aspect-based sentiment
analysis and opens the way for further research to enhance and expand this area.
In future work, the proposed framework could be extended to handle
Arabic texts, which will be a challenging task. Likewise, future studies could apply more
resources to the proposed framework to further enhance the results.

Author Contributions: Conceptualization, S.K. and M.A.; methodology, S.K. and N.A.; software,
N.A.; validation, N.A.; formal analysis, N.A.; investigation, N.A. and S.K.; resources, N.A., M.A. and
S.K.; data curation, N.A.; writing—original draft preparation, N.A.; writing—review and editing, S.K.;
visualization, N.A.; supervision, S.K. and M.A.; funding acquisition, M.A.; project administration,
M.A. All authors have read and agreed to the published version of the manuscript.
Funding: The authors extend their appreciation to the Deputyship for Research and Innovation,
Ministry of Education in Saudi Arabia for funding this research work through project number 523.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Not applicable.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Shi, M.; Liu, J.; Cao, B.; Wen, Y.; Zhang, X. A Prior Knowledge Based Approach to Improving Accuracy of Web Services Clustering.
In Proceedings of the 2018 IEEE International Conference on Services Computing (SCC), San Francisco, CA, USA, 2–7 July 2018.
2. Ekinci, E.; İlhan Omurca, S. Concept-LDA: Incorporating Babelfy into LDA for Aspect Extraction. J. Inf. Sci. 2020, 46, 406–418.
[CrossRef]
3. Chen, Z.; Mukherjee, A.; Liu, B.; Hsu, M.; Castellanos, M.; Ghosh, R. Exploiting Domain Knowledge in Aspect Extraction. In
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, Seattle, WA, USA, 18–21 October
2013; pp. 1655–1667.
4. Fang, L.; Huang, M. Fine Granular Aspect Analysis Using Latent Structural Models. In Proceedings of the 50th Annual Meeting
of the Association for Computational Linguistics, Jeju, Korea, 8–14 July 2012; Volume 2, pp. 333–337.
5. Chen, Z.; Mukherjee, A.; Liu, B.; Hsu, M.; Castellanos, M.; Ghosh, R. Leveraging Multi-Domain Prior Knowledge in Topic Models.
In Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, Beijing, China, 3–9 August 2013;
pp. 2071–2077.
6. Chen, Z.; Mukherjee, A.; Liu, B.; Hsu, M.; Castellanos, M.; Ghosh, R. Discovering Coherent Topics Using General Knowledge. In
Proceedings of the 22nd ACM International Conference on Conference on Information & Knowledge Management—CIKM ’13,
San Francisco, CA, USA, 27 October–1 November 2013; pp. 209–218.
7. Chen, Z.; Mukherjee, A.; Liu, B. Aspect Extraction with Automated Prior Knowledge Learning. In Proceedings of the 52nd
Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Baltimore, MD, USA, 22–27 June
2014; pp. 347–358.
8. Chen, Z.; Liu, B. Topic Modeling Using Topics from Many Domains, Lifelong Learning and Big Data. In Proceedings of the
31st International Conference on Machine Learning, Beijing, China, 21–26 June 2014; Volume 32, pp. II-703–II-711.
9. Wang, T.; Cai, Y.; Leung, H.; Lau, R.Y.K.; Li, Q.; Min, H. Product Aspect Extraction Supervised with Online Domain Knowledge.
Knowl.-Based Syst. 2014, 71, 86–100. [CrossRef]
10. Rana, T.A.; Cheah, Y.-N.; Letchmunan, S. Topic Modeling in Sentiment Analysis: A Systematic Review. J. ICT Res. Appl. 2016, 10,
76–93. [CrossRef]
11. Majumder, N.; Bhardwaj, R.; Poria, S.; Zadeh, A.; Gelbukh, A.; Hussain, A.; Morency, L.-P. Improving Aspect-Level Sentiment
Analysis with Aspect Extraction. Neural Comput. Appl. 2020, 2021, 1–14. [CrossRef]
12. Medhat, W.; Hassan, A.; Korashy, H. Sentiment Analysis Algorithms and Applications: A Survey. Ain Shams Eng. J. 2014, 5,
1093–1113. [CrossRef]
13. Khatoon, S.; Romman, L.A. Domain Independent Automatic Labeling System for Large-Scale Social Data Using Lexicon and
Web-Based Augmentation. ITC 2020, 49, 36–54. [CrossRef]
14. Keshavarz, H.; Abadeh, M.S. ALGA: Adaptive Lexicon Learning Using Genetic Algorithm for Sentiment Analysis of Microblogs.
Knowl.-Based Syst. 2017, 122, 1–16. [CrossRef]
15. Yang, L.; Li, Y.; Wang, J.; Sherratt, R.S. Sentiment Analysis for E-Commerce Product Reviews in Chinese Based on Sentiment
Lexicon and Deep Learning. IEEE Access 2020, 8, 23522–23530. [CrossRef]
16. Liapakis, A. A Sentiment Lexicon-Based Analysis for Food and Beverage Industry Reviews. The Greek Language Paradigm.
SSRN J. 2020, 9, 21–42. [CrossRef]
17. Zhang, S.; Wei, Z.; Wang, Y.; Liao, T. Sentiment Analysis of Chinese Micro-Blog Text Based on Extended Sentiment Dictionary.
Future Gener. Comput. Syst. 2018, 81, 395–403. [CrossRef]
18. Bandhakavi, A.; Wiratunga, N.; Padmanabhan, D.; Massie, S. Lexicon Based Feature Extraction for Emotion Text Classification.
Pattern Recognit. Lett. 2017, 93, 133–142. [CrossRef]
19. Kuncheva, L.I. Combining Pattern Classifiers: Methods and Algorithms, 2nd ed.; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2014;
ISBN 978-1-118-91456-4.
20. Onan, A.; Korukoğlu, S.; Bulut, H. A Multiobjective Weighted Voting Ensemble Classifier Based on Differential Evolution
Algorithm for Text Sentiment Classification. Expert Syst. Appl. 2016, 62, 1–16. [CrossRef]
21. Oussous, A.; Lahcen, A.A.; Belfkih, S. Improving Sentiment Analysis of Moroccan Tweets Using Ensemble Learning. In Big Data,
Cloud and Applications; Tabii, Y., Lazaar, M., Al Achhab, M., Enneya, N., Eds.; Communications in Computer and Information
Science; Springer International Publishing: Cham, Switzerland, 2018; Volume 872, pp. 91–104, ISBN 978-3-319-96291-7.
22. Nehe, M.P.B.; Nawathe, A. Aspect Based Sentiment Classification Using Machine Learning for Online Reviews. 2020. Available
online: https://easychair.org/publications/preprint_download/xnVW (accessed on 13 March 2022).
23. Shoukry, A.; Rafea, A. Machine Learning and Semantic Orientation Ensemble Methods for Egyptian Telecom Tweets Sentiment
Analysis. JWE 2020, 19, 195–214. [CrossRef]
24. Sultana, N.; Islam, M.M. Meta Classifier-Based Ensemble Learning for Sentiment Classification. In Proceedings of International Joint
Conference on Computational Intelligence; Uddin, M.S., Bansal, J.C., Eds.; Algorithms for Intelligent Systems; Springer: Singapore,
2020; pp. 73–84, ISBN 9789811375637.
25. Basiri, M.E.; Abdar, M.; Cifci, M.A.; Nemati, S.; Acharya, U.R. A Novel Method for Sentiment Classification of Drug Reviews
Using Fusion of Deep and Machine Learning Techniques. Knowl.-Based Syst. 2020, 198, 105949. [CrossRef]
26. Khalid, M.; Ashraf, I.; Mehmood, A.; Ullah, S.; Ahmad, M.; Choi, G.S. GBSVM: Sentiment Classification from Unstructured
Reviews Using Ensemble Classifier. Appl. Sci. 2020, 10, 2788. [CrossRef]

27. Tharwat, A. Classification Assessment Methods. ACI 2021, 17, 168–192. [CrossRef]
28. Raju, K.D.; Jayasingh, B.B. Machine Learning for Sentiment Analysis for Twitter Restaurant. JES 2018, 9, 21–27.
29. Waikul, V.; Ravgan, O.; Pavate, A. Restaurant Review Analysis and Classification Using SVM. IOSR JEN 2019, 1, 49–52.
30. Sharieff, H.; Sindhu, T.; SaiRamesh, L. Comparison of Machine Learning Techniques for Sentimental Analysis on Restaurant
Reviews. IJAEM 2020, 2, 740–743.
31. Bandana, R. Sentiment Analysis of Movie Reviews Using Heterogeneous Features. In Proceedings of the 2nd International
Conference on Electronics, Materials Engineering & Nano-Technology (IEMENTech), Kolkata, India, 4–5 May 2018; pp. 1–4.
32. Ghosh, M.; Sanyal, G. An Ensemble Approach to Stabilize the Features for Multi-Domain Sentiment Analysis Using Supervised
Machine Learning. J. Big Data 2018, 5, 44. [CrossRef]
33. Untawale, T.M.; Choudhari, G. Implementation of Sentiment Classification of Movie Reviews by Supervised Machine Learning
Approaches. In Proceedings of the 2019 3rd International Conference on Computing Methodologies and Communication
(ICCMC), Erode, India, 27–29 March 2019; pp. 1197–1200.
34. Chang, J.-R.; Liang, H.-Y.; Chen, L.-S.; Chang, C.-W. Novel Feature Selection Approaches for Improving the Performance of
Sentiment Classification. J. Ambient. Intell. Humaniz. Comput. 2020, 2021, 1–14. [CrossRef]
35. Jagdale, R.S.; Shirsat, V.S.; Deshmukh, S.N. Sentiment Analysis on Product Reviews Using Machine Learning Techniques. In
Cognitive Informatics and Soft Computing; Advances in Intelligent Systems and Computing Book Series; Springer: Berlin/Heidelberg,
Germany, 2019; Volume 768, pp. 639–647.
36. Shaheen, M. Sentiment Analysis on Mobile Phone Reviews Using Supervised Learning Techniques. IJMECS 2019, 11, 32–43.
[CrossRef]
37. Choudhari, P.; Veenadhari, S. Sentiment Classification of Online Mobile Reviews Using Combination of Word2vec and Bag-of-
Centroids. In Machine Learning and Information Processing; Swain, D., Pattnaik, P.K., Gupta, P.K., Eds.; Advances in Intelligent
Systems and Computing; Springer: Singapore, 2020; Volume 1101, pp. 69–80, ISBN 9789811518836.
38. Xu, F.; Pan, Z.; Xia, R. E-Commerce Product Review Sentiment Classification Based on a Naïve Bayes Continuous Learning
Framework. Inf. Process. Manag. 2020, 57, 102–221. [CrossRef]
39. Al-Azani, S.; El-Alfy, E.-S.M. Using Word Embedding and Ensemble Learning for Highly Imbalanced Data Sentiment Analysis in
Short Arabic Text. Procedia Comput. Sci. 2017, 109, 359–366. [CrossRef]
40. Khan, J.; Alam, A.; Hussain, J.; Lee, Y.-K. EnSWF: Effective Features Extraction and Selection in Conjunction with Ensemble
Learning Methods for Document Sentiment Classification. Appl. Intell. 2019, 49, 3123–3145. [CrossRef]
41. Khai Tran; Thi Phan Deep Learning Application to Ensemble Learning—The Simple, but Effective, Approach to Sentiment
Classifying. Appl. Sci. 2019, 9, 2760. [CrossRef]
42. İzmir Katip Çelebi Üniversitesi; Onan, A. Ensemble of Classifiers and Term Weighting Schemes for Sentiment Analysis in Turkish.
SRC 2021, 1, 1–12. [CrossRef]
43. Ruta, D.; Gabrys, B. Classifier Selection for Majority Voting. Inf. Fusion 2005, 6, 63–81. [CrossRef]
44. Novaković, J.D.; Veljović, A.; Ilić, S.S.; Papić, Ž.; Milica, T. Evaluation of Classification Models in Machine Learning. Theory Appl.
Math. Comput. Sci. 2017, 7, 39–46.
45. Bhoir, P.; Kolte, S. Sentiment Analysis of Movie Reviews Using Lexicon Approach. In Proceedings of the 2015 IEEE International
Conference on Computational Intelligence and Computing Research (ICCIC), Madurai, India, 10–12 December 2015; pp. 1–6.
46. Rajeswari, A.M.; Mahalakshmi, M.; Nithyashree, R.; Nalini, G. Sentiment Analysis for Predicting Customer Reviews Using a
Hybrid Approach. In Proceedings of the 2020 Advanced Computing and Communication Technologies for High Performance
Applications (ACCTHPA), Cochin, India, 2–4 July 2020; pp. 200–205.
47. Guha, S.; Joshi, A.; Varma, V. SIEL: Aspect Based Sentiment Analysis in Reviews. In Proceedings of the 9th International Workshop
on Semantic Evaluation (SemEval 2015), Denver, CO, USA, 4–5 June 2015; pp. 759–766.
48. Fikri, M.; Sarno, R. A Comparative Study of Sentiment Analysis Using SVM and SentiWordNet. IJEECS 2019, 13, 902–909.
[CrossRef]
49. Yuan, P. Sentiment Classification and Opinion Mining on Airline Reviews. 2016. Available online:
https://www.semanticscholar.org/paper/Sentiment-Classification-and-Opinion-Mining-on-Yuan/daf1d9de4066eed1d193847cae578389da16c5e8
(accessed on 13 March 2022).
50. Mehta, P.; Chandra, S. Enhancement of SentiWordNet Using Contextual Valence Shifters. IJDATS 2019, 11, 337. [CrossRef]
