Challenges and Issues in Sentiment Analysis - A Comprehensive Survey
This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2023.3293041
ABSTRACT Sentiment analysis, a specialization of natural language processing (NLP), has witnessed
significant progress since its emergence in the late 1990s, owing to the swift advances in deep learning
techniques and the abundance of vast digital datasets. Though sentiment analysis has reached a relatively
advanced stage in the area of NLP, it is erroneously assumed that sentiment analysis has reached its
pinnacle, leaving no room for further improvement. However, it is important to acknowledge that numerous
challenges that require attention persist. This survey paper provides a comprehensive overview of sentiment
analysis, including its applications, approaches to sentiment classification, and commonly used evaluation
metrics. The survey primarily focuses on the challenges associated with different types of data for
sentiment classification, namely cross-domain data, multimodal data, cross-lingual data, and small-scale
data, and provides a review of the state-of-the-art in sentiment analysis to address these challenges. The
paper also addresses the challenges faced during sentiment classification irrespective of the type of data
available. It aims at a better understanding of sentiment analysis to enable practitioners and researchers
to select suitable methods for sentiment classification depending on the type of data being analyzed.
INDEX TERMS Machine Learning, Sentiment Analysis, Natural Language Processing, Cross-domain data,
multimodal data, cross-lingual data, and small-scale data.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License. For more information, see https://creativecommons.org/licenses/by-nc-nd/4.0/
This article has been accepted for publication in IEEE Access. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2023.3293041
approach that is based on the types of sentiment classification that are best suited for different types of data.
Approach 1 - Based on the technique: One common approach employed in surveys is to base the survey paper on a particular technique for sentiment classification. For example, [30] explores deep learning techniques for sentiment classification. However, if a researcher intends to tackle a real-world problem, the approach to the problem would depend on the available data. If a problem has voluminous data from a different domain, it may require the use of deep learning and transfer learning techniques. In such cases, the researcher may need to refer to two survey papers, where the first survey focuses on aspect-level sentiment classification, and the second survey deals with cross-domain sentiment classification. Our survey takes an approach that depends on the type and source of available data to solve these types of issues. We discuss the challenges of the given data along with the most effective techniques to solve the problem, regardless of whether it is a machine learning or deep learning algorithm that should be used.
Approach 2 - Based on the type of sentiment classification: Some surveys focus on a particular type of sentiment classification [31]–[34]; however, in real-world scenarios, obtaining clean and comprehensive datasets for sentiment analysis can be challenging. Ideally, we would like to have a single dataset containing all the necessary reviews, but this is rarely the case. This can lead to challenges related to multi-source cross-domain and aspect-level cross-lingual sentiment classification. While some works focus on aspect-level sentiment analysis, they may not address the challenges of cross-domain sentiment classification. To address these challenges, researchers may need to refer to multiple surveys, each addressing a specific type of sentiment classification. Our survey discusses the challenges of different types of sentiment classification, including those that are often overlooked in other surveys.
To summarize, our goal is a survey that covers both ‘Approach 1’ and ‘Approach 2’ discussed above. In other words, we aim to cover both the techniques for and the types of sentiment classification, so that the challenges in real-world problems are more efficiently addressed.
Fig. 1 shows the breakup of the papers surveyed. To ensure a fair representation, the span was chosen to be two years. To give more weight to recent work, 32.7% of the papers are those published in the last two years (2021-22), and 50% of the papers are those published within the last four years (2019-22). However, at 17.6%, significant contributions prior to the year 2015 have also been included in the paper.
E. STRUCTURE OF THIS SURVEY
This survey is structured to provide a comprehensive review of the field of sentiment classification. Beginning with an overview of the various sentiment classification approaches, the evaluation metrics used to assess the performance of these models are examined next. The challenges associated with different kinds of sentiment classification, categorized on the type of data, are then delved into, and the past methods used to address these challenges are highlighted. The paper culminates with a broad outlook on the field of sentiment classification.
FIGURE 1. Timeline of papers surveyed.
II. APPROACHES TO SENTIMENT CLASSIFICATION
The two primary approaches to sentiment analysis are machine learning and lexicon-based methods, which are explored in further detail below.
A. MACHINE LEARNING-BASED APPROACH
Machine learning (ML) allows a system to learn from its experiences and enhance its performance. This approach requires training a machine learning model on a labeled dataset, where each text sample is associated with a sentiment label. The model learns to recognize features and patterns in the text which are indicative of each sentiment class. Machine learning has proven its ability to handle the intricacies of human language [35]–[38]. The ML-based approach can be further classified into four subgroups, explained next.
1) SUPERVISED LEARNING
Supervised learning is a specific category of machine learning techniques that involves instructing a computer program to recognize and categorize patterns in labeled datasets. These labeled datasets have pre-assigned labels that represent the correct output or desired response. The primary objective of supervised learning is to teach the model to associate input data with the corresponding labeled output data, which in turn enables the model to make accurate predictions on new data that it has not seen before. Supervised machine learning algorithms are often preferred because they can leverage the labeled examples in the training dataset [39], especially when such data is available. These algorithms are very popular in sentiment classification and were among the earliest approaches to the task. Supervised learning can be further classified into the following two categories:
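The supervised setup described in this section (fit a model on labeled text, then predict labels for unseen text) can be illustrated with a minimal sketch. It assumes a multinomial Naive Bayes classifier with add-one smoothing, chosen here only for brevity and not prescribed by the survey; the training sentences, labels, and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(samples):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    label_counts = Counter()            # label -> number of training texts
    vocab = set()
    for text, label in samples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab

def predict(model, text):
    """Return the label with the highest log-posterior for an unseen text."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in text.lower().split():
            if token in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labeled training data.
TRAIN = [
    ("great film loved it", "positive"),
    ("wonderful acting great story", "positive"),
    ("awful plot hated it", "negative"),
    ("boring and awful acting", "negative"),
]
model = train_naive_bayes(TRAIN)
print(predict(model, "great story"))
print(predict(model, "boring plot"))
```

The two test phrases do not appear verbatim in the training data, which is the point of the exercise: the model generalizes from word-level evidence learned from the labeled examples.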
strengths of several models to enhance classification performance and is often used as the core of hybrid models [49]–[51]. Ensemble learning techniques can be broadly categorized into three types - boosting, bagging, and stacking - each with its own unique approach to combining the base models.
2) UNSUPERVISED LEARNING
Unsupervised machine learning algorithms are used where there are no labeled data to train the classifier. These techniques rely on self-learning and have been demonstrated to be effective in the field of NLP, particularly in sentiment classification [52]–[53]. The majority of the current unsupervised sentiment categorization approaches can be divided into two stages [54]–[55]. In the first stage, the sentiment intensity of the text is calculated by estimating the sentiment strength of the terms and expressions used to express emotions. In the second stage, the sentiment categorization of data is achieved by comparing the sentiment strength of the data against the baseline value of ‘0’.
3) SEMI-SUPERVISED LEARNING
Supervised learning has shown a lot of success in many sentiment analysis tasks, but to boost the generalizability of the learning model, a substantial quantity of labeled data is required [56]. To leverage large amounts of unlabeled data, unsupervised learning is presented as a feasible option. By blending unsupervised and supervised learning techniques, semi-supervised learning provides a suitable solution. Semi-supervised learning is accomplished by using the unlabeled data for unsupervised learning and the labeled data for supervised learning and then integrating both to enhance the learning model’s performance [57]–[59].
4) DEEP LEARNING
Deep learning has risen to prominence in sentiment classification during the last decade. To overcome the challenge of handling many hidden layers in a neural network, deep learning adopts a multilayer approach. While conventional machine learning techniques rely on feature selection techniques or explicit feature extraction methods, deep learning models automatically learn and extract features, thereby enhancing accuracy and performance. Additionally, hyperparameters of classifier models are also automatically evaluated in most cases. Consequently, deep learning has gained widespread adoption in sentiment classification tasks [60]–[63].
B. LEXICON-BASED APPROACH
A lexicon is an aggregation of words that are related to a specific emotional orientation. A sentiment lexicon is used in the lexicon-based approach to ascertain the polarity of a text document. In the pre-processed data, each word is identified and assigned a part of speech (POS), such as a verb, adverb, noun, pronoun, etc. This POS labeling is then utilized as a feature to extract the emotional content of a given text statement. Finally, the polarity is calculated by looking up the tokenized words in a lexical resource to obtain their scores. The final score for the input data is obtained by totaling the individual scores of each word [64]–[66].
The lexicon-based approach can be further classified into two sub-groups explained below.
1) DICTIONARY-BASED APPROACH
The dictionary-based technique for sentiment analysis is an unsupervised method that involves utilizing a lexicon of terms to determine the sentiment of a given text [67]. This approach is rule-based and relies on a sentiment dictionary to statistically determine the sentiment weights of the words that express various emotions in the text [68]. However, the effectiveness of this approach relies on the accuracy and completeness of the pre-existing sentiment lexicons used.
2) CORPUS-BASED APPROACH
As opposed to the dictionary-based method, the corpus-based approach makes use of co-occurrence metrics or language structures in a corpus. This approach employs two different techniques:
i. Statistical Approach
This corpus-based approach utilizes statistical methods to identify opinion seed words based on their frequency in writings with positive or negative tones. If their frequencies are equal, they are considered neutral. Contemporary techniques rely on the observation that words with similar sentiments often co-occur in corpora, and their polarity can be estimated by measuring their relative frequency of co-occurrence with other words in the same context [69]–[71].
ii. Semantic Approach
This approach allows for semantically related phrases to be assigned similar emotional evaluations by utilizing a database of emotional words that can be recursively expanded with synonyms and antonyms. The sentiment polarity of a lexical item is ascertained by analyzing the proportional number of positive and negative counterparts for that term [72]–[74].
III. EVALUATION METRICS
For any growing field of research, it is necessary to establish a commonly accepted evaluation methodology that is widely used within the field. This holds true for sentiment classification as well. At present, the bulk of the studies surveyed adopt the following standard measures.
A. ACCURACY
Accuracy is a measure of the number of correct predictions, given the total number of predictions. It is calculated by taking the ratio of the number of true positives and true negatives to the total number of samples. Mathematically, the accuracy, ACC, is given by
$$ACC = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$
where TN is the tally of true negatives, TP is the tally of true positives, FP is the tally of false positives, and FN is the tally of false negatives.
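The accuracy definition above can be checked numerically with a short sketch. The label lists below are made-up illustrations, and the helper names `confusion_counts` and `accuracy` are assumptions of this example rather than anything defined in the survey.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Tally TP, TN, FP and FN for a binary sentiment task."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """ACC = (TP + TN) / (TP + TN + FP + FN); 0.0 when there are no samples."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0

# Hypothetical gold labels and predictions (1 = positive, 0 = negative).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
counts = confusion_counts(y_true, y_pred)
print(counts, accuracy(*counts))
```

With these lists the tallies are TP = 3, TN = 3, FP = 1, FN = 1, so six of the eight predictions are correct.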
B. PRECISION
Precision indicates the extent to which the model’s positive predictions are correct. It is the fraction of true positives among all samples predicted as positive. Mathematically, the precision, PRE, is given by
$$PRE = \frac{TP}{TP + FP}$$
Confusion matrix (rows: predicted values; columns: actual values):
Predicted Values | Actual Negative (0) | Actual Positive (1)
Negative (0) | TN | FN
Positive (1) | FP | TP
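Precision, the share of predicted positives that are truly positive, can be read directly off a confusion matrix. The sketch below assumes the layout suggested by the confusion-matrix fragment on this page (rows are predicted values, columns are actual values); the counts are hypothetical and the function name is an assumption of this example.

```python
def precision_from_matrix(matrix):
    """PRE = TP / (TP + FP), read from a 2x2 confusion matrix.

    Assumed layout (rows = predicted, columns = actual):
        [[TN, FN],
         [FP, TP]]
    """
    (_tn, _fn), (fp, tp) = matrix
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0

# Hypothetical counts for an imbalanced test set of 100 reviews.
matrix = [[90, 4],
          [4, 2]]
print(precision_from_matrix(matrix))
```

In this made-up case 92 of the 100 predictions are correct, yet only 2 of the 6 positive predictions are right, which is why precision is usually reported alongside accuracy on imbalanced data.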
FIGURE 3. Obstacles specific to different types of sentiment classification and the shared challenges encountered by all of them.
Pivot elements are sentiment features that can be transferred across different domains, while non-pivot elements are domain-specific features that cannot be easily transferred. Pivot elements are words that have matching sentiment orientation irrespective of the domain they appear in, such as ‘sad’, ‘happy’, ‘bad’, and ‘good’. Hence, pivot elements are used for comparing sentiments across domains. Non-pivot elements, on the other hand, are words that are domain-specific, like ‘sweet’, which can refer to taste, a person's character, or happiness. They are used for accurate sentiment classification in that particular domain and cannot be easily compared.
Therefore, it is important to understand the difference between pivot and non-pivot elements, as the use of pivot elements can increase the performance across different domains and non-pivot elements improve the accuracy in that specific domain. Yanping et al. [82] proposed a novel approach to improve cross-domain sentiment classification by adopting a hierarchical attention network called KPE-net. The KPE-net extracts the transferable pivot elements from different domains and constructs a joint attention learning network called non-KPE-net (NKPE-net) using pivot elements as a bridge. The NKPE-net captures the domain-specific non-pivot elements to enhance the sentiment classification performance. Finally, a Sentiment-Sensitive Network Model (SSNM) is built by combining both the KPE-net and NKPE-net to improve the sentiment classification accuracy across different domains. This approach addresses the challenges of transfer learning and domain adaptation in cross-domain sentiment classification, and the results from experiments establish its effectiveness when compared against other state-of-the-art methods (see Table II).
2) MULTI-SOURCE
In many cases, data sources used for sentiment analysis are highly diverse, originating from different source domains. Traditional domain adaptation techniques are often used to bridge the gap between the target and source domains, but they are generally not effective in selecting critical domain sources and do little to mitigate the negative transfer that can result, ultimately leading to decreased model performance. To handle this concern, Yanping et al. [83] proposed a novel contrastive transformer-based domain adaptation method (CTDA). This method utilizes a mixed selection technique to choose the top-k sources and an adaptor to obtain domain-invariant information. Additionally, a discriminator is employed to extract
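As a rough illustration of the pivot / non-pivot distinction drawn earlier in this section, the sketch below treats a word as a pivot candidate when it appears in two domains with the same polarity, and as non-pivot otherwise. This is a simple dictionary comparison and not the attention-based KPE-net or NKPE-net; the domain names, word polarities, and the function name `split_pivots` are hypothetical.

```python
def split_pivots(source_polarity, target_polarity):
    """Separate shared-polarity words (pivot candidates) from the rest."""
    pivots, non_pivots = set(), set()
    for word in set(source_polarity) | set(target_polarity):
        src = source_polarity.get(word)
        tgt = target_polarity.get(word)
        if src is not None and src == tgt:
            pivots.add(word)      # same orientation in both domains
        else:
            non_pivots.add(word)  # missing in one domain or conflicting
    return pivots, non_pivots

# Hypothetical word polarities per domain (+1 positive, -1 negative).
FOOD = {"good": 1, "bad": -1, "happy": 1, "sweet": 1, "hot": 1}
ELECTRONICS = {"good": 1, "bad": -1, "happy": 1, "hot": -1}

pivots, non_pivots = split_pivots(FOOD, ELECTRONICS)
print(sorted(pivots), sorted(non_pivots))
```

Here "hot" is positive for food but negative for electronics, and "sweet" occurs in only one domain, so both end up in the non-pivot set.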
details and phrases, which is a critical part of aspect-based sentiment categorization. For instance, consider "The theatre had a huge screen, but the sound effects were awful." This sentence contains two aspects that require polarity classification: "screen" and "sound." The opinion words "huge" and "awful" are associated with the polarities "positive" and "negative", respectively.
Academic attention has been focused on developing models for aspect-level sentiment classification. In the context of supervised learning, ML algorithms are commonly employed by researchers to build classifiers [117]–[119]. One of the conventional approaches is the feature-based support vector machine (SVM) [120]–[121]. Another promising technique is end-to-end neural network-based word representation learning that utilizes an attention mechanism [122]. This approach is preferred by many researchers as it avoids the need for time-consuming feature engineering and produces high-quality results. Among these methods, the LSTM fusion attention mechanism has shown promising results in previous studies [123]–[124].
BERT provides multilingual word embeddings for more than a hundred languages and is more effective in sentiment prediction than other models, especially for languages with limited resources [125]. Attention mechanisms are used among various neural networks to extract specific contexts in some studies [124], [126]–[127]. Despite several research studies on aspect-based sentiment analysis, nouns, adjectives, and verbs, which are significantly related to aspects and their sentiments, are often overlooked as POS in these studies. Additionally, aspect-based sentiment classification across languages does not adequately address homographic terms because of the limited volume of the cross-lingual data vocabulary.
In sentiment analysis, there have been several attempts to develop models that can classify sentiments in multiple languages. One such approach, described in [128], utilizes a multi-layer CNN architecture that builds upon the monolingual techniques outlined in [129]. Additionally, [130] attempted to categorize sentiments in Spanish, German, and French, using three distinct machine translation techniques: Google, Bing, and Moses.
Cross-lingual word embedding has been explored through an adversarial learning process using a supervised technique [131]. This approach involves mapping embeddings of the target and source languages into a shared vector space. This multilingual word embedding approach can support up to thirty languages. However, it has been observed that the accuracy of mapping target and source languages in a common vector space decreases significantly when transitioning from resource-rich to resource-poor languages, mainly as a result of insufficient data for languages with inadequate resources.
A multilingual n-gram-based approach was proposed in [132] for aspect category detection of online reviews, using multilingual word embedding to handle multilingual data. This approach divides aspect category detection into three sub-activities: aspect category detection, attribute detection, and entity detection, and operates independently of translation techniques. While one of its key advantages is its ability to recognize pre-established aspect categories, the approach is less successful in dealing with languages that have a shortage of resources, such as limited digital language resources.
By using multilingual BERT and bilingual dictionaries, a deep learning strategy for cross-lingual aspect-based sentiment categorization is proposed, which extracts POS tagging data [133]. The reviews are pre-processed, turned into tokens, and mapped from one language to another using bilingual dictionaries. The multilingual BERT is then utilized to generate vectors, which are then fed into the deep-learning classifier for training. The effectiveness of the proposed method is assessed using a multilingual dataset for aspect-based sentiment classification.
2) REPRESENTATIONAL FLEXIBILITY AND ALIGNMENT DIFFERENCES
Feature representation learning-based methods aim to mitigate distributional discrepancies by inducing a suitable feature characterization between the target and source language domains. Several variations of topic models have been suggested to address cross-lingual categorization issues. To achieve perfect alignment between the required domains, it has been suggested that the domains should share the same underlying topics [134]–[137], and to learn a projection matrix among several domains, it is proposed to use common topics [138].
The techniques mentioned above suffer from an inherent flaw, namely, the requirement of a perfect alignment of topics over language domains. The optimal alignment is attained through either one-to-one topic alignment or matrix projection. However, imposing strict alignment constraints can limit the flexibility of the representation and result in decreased model performance [139], particularly when there are significant dissimilarities in the distributions between the target and source language domains. Put differently, the presumption of perfect alignment is often erroneous since language domains often have varying underlying distributions. Therefore, it is essential to relax the constraints on perfect alignment to address cross-lingual categorization challenges.
Ref. [140] proposed a coarse alignment technique that uses group-to-group topic alignment to improve the model's representation and then fine-tunes it into a fine-grained model at the aspect level. The resulting aspect, opinion, and sentiment unification model (AOS) is an unsupervised model that unifies aspects, opinions, and sentiments of reviews from various domains and uses coarse alignment to capture a more accurate latent feature representation. To enhance AOS further, a partially supervised AOS model (ps-AOS) is employed, in which tagged source language data are used in conjunction with logistic regression to diminish the variation in feature representations across the two language domains. Finally, a framework for expectation-maximization with
Gibbs sampling is suggested as a way to improve the model's performance.
Table VI shows the challenges encountered in cross-lingual sentiment classification, the techniques used to address these challenges, and the most effective of these techniques.
TABLE VI
SUMMARY OF CHALLENGES TO CROSS-LINGUAL SENTIMENT CLASSIFICATION
Challenge | Techniques | Most Optimal Technique | Comment
Aspect-level cross-lingual classification | LCF-BERT [133]; AEN-BERT [133]; SPC-BERT [133]; CNN-BERT without Attention [133] | CNN-BERT with Attention [133] | CNN-BERT with Attention has a better precision, recall, and f1 score.
Representational Flexibility and Alignment Difference | LR; SVM; TRiTL [141]; DTL [142]; CL-SCL [143]; Co-Training [144]; TSU [145]; TCA [138]; PSCCLDA [136]; AOS [140] | Ps-AOS [140] | ps-AOS has better accuracy than the other models
Table VII presents an in-depth examination of the metrics employed to determine the optimal technique for addressing each challenge in cross-lingual classification.
TABLE VII
COMPLEMENTARY ANALYSIS OF TABLE VI
Reference ID | Technique | Accuracy | Precision | Recall | F-measure
[133] | CNN-BERT with Attention | - | | |
[140] | Ps-AOS | | - | - | -
D. SHORT-TERM AND SMALL-SCALE SENTIMENT CLASSIFICATION
The information exchange facilitated by the widespread use of the Internet has led to unparalleled convenience, enabling hot topics to generate massive online debates.
For small-scale short-term sentiment classification tasks, classifiers with a smaller number of layers such as TextCNN [146], FastText [147], TextRNN [148], and TextCRNN [149] are commonly employed. These classifiers have the advantage of not needing extensive training or massive amounts of data. However, they tend to perform poorly in classification due to their limited structure, notwithstanding efforts aimed at optimization and other modifications.
Various techniques have been employed to augment data information to improve sentiment categorization in such situations. Data size expansion is a simple solution, which was achieved by introducing a hybrid neural network model that uses data expansion methods [150]. This technique can enlarge the data size, enhance the model's generalization ability, and ultimately increase its accuracy. Although intensive training on large-scale data is still required, this approach artificially increases the data scale. Another popular and practical strategy is to use multi-modal information. A hybrid classifier trained on a combination of images and text from social media was proposed instead of relying on a single input [151]. To achieve high accuracy, [152] incorporated video information on top of this foundation and extracted features from data of different modalities.
Finally, user attributes convolutional and recurrent neural networks (UCRNN), a sentiment categorization method based on text data from multi-modal social media, uses parallel RNN and CNN to analyze text information and user attributes, respectively.
E. OTHER CHALLENGES
The earlier sections focused on challenges that are specific to certain forms of sentiment classification. In addition, there are other issues that are relevant to sentiment classification in general, and these are discussed in the following subsection.
1) FEATURE SELECTION
Feature selection is a key step in improving the accuracy and reducing the training time of models used for prediction tasks. Typically, data used for prediction has numerous features, some of which may be unrelated or even damaging to the model's performance. To address this issue, three strategies have been developed for feature selection: filter, wrapper, and hybrid. Recent studies have demonstrated the advantages of the hybrid strategy for sentiment classification tasks compared to the wrapper and filter strategies [153].
The evaluation of the relevance of words to a document is a crucial aspect of sentiment analysis. One commonly used technique for this purpose is term frequency-inverse document frequency (TF-IDF). However, this method can be improved by incorporating Next Word Negation (NWN) [154]. This modification handles negation words such as "not," "no," and "never," which reverse the polarity of the word that follows.
To further enhance the technique, [155] integrated the chi-squared statistic selection method and used SVM as the classifier. The chi-squared statistic measures the dependence between a word feature and its associated class or category. Later, Syafiqah et al. [156] proposed a model that combines TF-IDF with support vector machine-recursive feature elimination (SVM-RFE) as an improved hybrid feature selection method. TF-IDF selects the features, and SVM-RFE ranks them in the order of their importance. Jawad Khan et al. [157] used a different hybrid strategy by integrating the wrapper-based backward feature selection (BFS) method with the ensemble of multiple filters feature selection (EMFFS) method.
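The TF-IDF weighting and chi-squared scoring discussed in this subsection can be sketched as follows. The sketch assumes one common TF-IDF variant (relative term frequency times log(N / document frequency)) and the standard 2x2 contingency form of the chi-squared statistic; the cited works may use different variants. The toy documents, labels, and function names are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

def chi_squared(docs, labels, term, target):
    """Chi-squared statistic of `term` versus class `target` (2x2 table)."""
    a = b = c = d = 0
    for doc, label in zip(docs, labels):
        has_term = term in doc
        in_class = label == target
        if has_term and in_class:
            a += 1  # term present, target class
        elif has_term:
            b += 1  # term present, other class
        elif in_class:
            c += 1  # term absent, target class
        else:
            d += 1  # term absent, other class
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

# Hypothetical tokenized reviews with sentiment labels.
DOCS = [["good", "film", "good"], ["good", "plot"],
        ["bad", "film"], ["bad", "plot", "film"]]
LABELS = ["pos", "pos", "neg", "neg"]

weights = tf_idf(DOCS)
good_score = chi_squared(DOCS, LABELS, "good", "pos")
film_score = chi_squared(DOCS, LABELS, "film", "pos")
print(weights[0], good_score, film_score)
```

In this toy corpus "good" occurs only in positive reviews and therefore receives a higher chi-squared score than "film", which occurs in both classes; a filter-style selector would keep the higher-scoring terms.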
approach for feature selection is information gain (IG), in which each feature is given a weight that
is used for selection [158] - [159]. Sparsity-adjusted information gain (SAIG) is a modified IG
algorithm for feature selection that outperforms the traditional IG algorithm in terms of accuracy and
efficiency, especially in datasets with high sparsity [160].

2) FEATURE EXTRACTION
Although statistical machine learning algorithms are effective for less sophisticated applications of
sentiment classification, they cannot be extended to more demanding text categorization problems
[161] - [162]. Deep learning algorithms, on the other hand, produce noteworthy outcomes in sentiment
analysis [163]. Deep learning allows CNNs to grasp intricate, non-linear structures, which enables the
CNN to learn high-dimensional complexities.
However, CNN is not capable of association, and the success of the CNN model largely depends on
choosing an appropriate window size [164]. RNN is effective in learning sequential patterns; however,
it cannot extract local features at the same time. Consequently, RNN can be used in conjunction with
CNN.
The LSTM is an RNN-based model that aims to address the inability to perform local feature extraction
and sequential learning concurrently. The LSTM alters the RNN structure by transforming the recurrent
layer into a structure with a memory cell and gates. The goal of the LSTM is to preserve the data in
the memory cell for future use and updating. The exploding and vanishing gradient problems in RNN
[164] are addressed by LSTM with the help of this new structure. Additionally, since LSTM variants can
capture long-range dependencies, applying them to sentiment analysis problems is more promising.
In text sentiment categorization, the text is often represented as vectors in a high-dimensional
space. Bi-LSTM cannot emphasize crucial information while extracting context from features [165]. To
overcome these limitations, a new deep-learning text classification model merging the CNN and Bi-LSTM
structures has been proposed, which addresses the shortcomings of Bi-LSTM [166]. By placing a
convolutional layer in front of the Bi-LSTM, the new ConvBiLSTM structure seeks to address the
restriction of Bi-LSTM. The one-dimensional convolutional layer reduces the size of the input text by
extracting n-gram features at various positions in the sentence. These features are fed to the
Bi-LSTM, which extracts contextual information to classify the sentiment.

3) STAGNANT ACCURACY
The development of pre-trained language models (PLMs), such as ALBERT [167], BERT [92], and RoBERTa
[168], has led to significant advances in various NLP applications, including document-level sentiment
categorization. However, recent research has focused on improving text modeling further by
incorporating user information into these models. While some studies have added user identity to
traditional models, little investigation has been done on the combination of PLMs and user identity
for improved performance.
To deal with this problem, Jawad et al. [169] proposed a new approach called user-enhanced pre-trained
language models (U-PLMs), which combines user identification with PLMs. Two strategies, attention-based
personalization and embedding-based personalization, were used to personalize the PLMs by incorporating
user identification into different parts of the model. By injecting user identification into various
aspects of the PLMs, the U-PLMs are able to achieve personalized text modeling from multiple
perspectives, leading to improved performance.

4) ARCHITECTURE OF THE MODEL
The advancement of technology has led to the extensive use of deep learning methods for sentiment
categorization. However, the increasing complexity of deep learning models has led to longer training
times, which is a drawback for real-time applications that require minimal computation time. To
address this issue, the authors of [170] put forward an elementary deep-learning architecture that
uses a single-layered Bi-LSTM. Isnanini [171] also proposed two sentiment categorization models with a
straightforward design, including a one-layer CNN model with fastText embedding and a BiGRU model. The
study demonstrates that the CNN model can deliver comparable outcomes when used instead of the BiLSTM
and BiGRU models, and also shows that the single-layer Bi-LSTM model can perform better when combined
with fastText embedding.

V. CONCLUSION
In conclusion, sentiment analysis has grown into a crucial domain of research in natural language
processing owing to its vast areas of application. Unlike many other survey papers in the field, this
study adopted a unique approach of not limiting itself to one particular kind of sentiment
classification or technique but instead focused on the kinds of sentiment classification that are best
suited for different types of data. This approach allowed us to present a more comprehensive and
nuanced understanding of sentiment analysis. The paper discussed two major approaches to sentiment
analysis - machine learning-based and lexicon-based approaches. Standardized evaluation metrics were
also presented to ensure consistency and comparability across studies. The central focus of our survey
was to address the challenges that arise during the sentiment analysis of various kinds of data and to
provide the latest overview of the ongoing research in sentiment analysis aimed at overcoming these
challenges. The survey discusses challenges in four different types of sentiment classification -
cross-domain sentiment classification, multimodal sentiment classification, cross-lingual sentiment
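For reference, the memory cell and gates of the LSTM discussed in the feature extraction subsection are commonly written as below. This is the standard textbook formulation, not notation taken from the survey or from [164]; the symbols (input x_t, hidden state h_t, cell state c_t, weight matrices W and U, biases b) are assumptions introduced here.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) && \text{(input gate)} \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) && \text{(output gate)} \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(memory cell update)} \\
h_t &= o_t \odot \tanh\left(c_t\right) && \text{(hidden state)}
\end{aligned}
```

The additive form of the cell update is what lets information, and gradients, pass across many time steps when the forget gate stays close to one.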
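The convolutional front end of the ConvBiLSTM idea can be sketched as follows: a one-dimensional convolution slides n-gram filters over the token embeddings, so a sentence of L tokens becomes a shorter sequence of L - n + 1 feature vectors that a Bi-LSTM would then consume. This is only a minimal sketch under stated assumptions: the embeddings and filter weights are hypothetical, ReLU is assumed as the activation, and the Bi-LSTM stage itself is omitted. It is not the implementation from [166].

```python
def ngram_features(embeddings, filters):
    """Apply every n-gram filter at every valid position of the sentence.

    embeddings: list of token vectors (L x d).
    filters: list of kernels, each of shape n x d (all with the same n).
    Returns a sequence of length L - n + 1 with one ReLU-activated feature
    per filter at each position.
    """
    n = len(filters[0])
    sequence = []
    for start in range(len(embeddings) - n + 1):
        window = embeddings[start:start + n]  # the current n-gram
        step = []
        for kernel in filters:
            score = sum(
                w * x
                for k_row, token in zip(kernel, window)
                for w, x in zip(k_row, token)
            )
            step.append(max(0.0, score))  # ReLU
        sequence.append(step)
    return sequence


# Hypothetical 6-token sentence with 2-dimensional embeddings.
sentence = [
    [1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
    [0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
]
# Two hypothetical trigram filters (n = 3).
trigram_filters = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0], [0.0, 0.0]],
]
features = ngram_features(sentence, trigram_filters)
```

Here the six input tokens are reduced to four positions, each described by two n-gram features; in a full model this shorter sequence would be passed to the Bi-LSTM in both directions.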
classification, and short-term small-scale sentiment classification.
The main challenge of cross-domain sentiment classification (CDSC) is developing a model that can
classify sentiment across different domains. The major sub-challenges in CDSC discussed in this study
are the classification and transfer of sentiment features, multisource data, and sentiment prediction
of the target domain. The principal difficulty of multimodal sentiment classification (MSC) is
developing a model that uses multimodal data to categorize the sentiment. The major sub-challenges in
MSC discussed in this study are improper correlation and handling noisy data.
The main difficulty of cross-lingual sentiment classification (CLSC) is developing a model that uses
data from different languages. The major sub-challenges of CLSC discussed in this study are
aspect-level classification of cross-lingual data, representational flexibility, and alignment
differences among different languages. The main challenge of short-term and small-scale sentiment
classification is developing models that can classify sentiment in real time with smaller-scale data
in a shorter time.
Additionally, the paper addressed the challenges associated with most sentiment classification
techniques regardless of the type of data available. These challenges included feature selection,
feature extraction, stagnant accuracy, and complex model architecture. Overall, this survey paper
provided an in-depth analysis of sentiment analysis, its challenges, and the state-of-the-art in the
field. The findings of this survey paper can be used as a reference for researchers and practitioners
in sentiment analysis to overcome the challenges and improve the accuracy of their models. Further
investigation in sentiment analysis is needed to address the challenges and make sentiment analysis
more reliable and effective in different applications.

REFERENCES
[1] Statista. (2020). Total data volume worldwide 2010-2025. [Online]. Available: https://www.statista.com/statistics/871513/worldwide-data-created/.
[2] DataReportal – Global Digital Insights. (2018). Digital Around the World. [Online]. Available: https://datareportal.com/global-digital-overview.
[3] Chaffey, D. (2022, June). Global social media statistics research summary 2022 [June 2022]. Smart Insights. Retrieved January 30, 2023, from https://www.smartinsights.com/social-media-marketing/social-media-strategy/new-global-social-media-research/
[4] O. Octaria, D. Manongga, A. Iriani, H. D. Purnomo and I. Setyawan, "Mining Opinion Based on Tweets about Student Exchange with Tweepy and TextBlob," 2022 9th International Conference on Information Technology, Computer, and Electrical Engineering (ICITACEE), Semarang, Indonesia, 2022, pp. 102-106, doi: 10.1109/ICITACEE55701.2022.9924013.
[5] V. Jagadishwari and N. Shobha, "Sentiment analysis of Covid 19 Vaccines using Twitter Data," 2022 Second International Conference on Artificial Intelligence and Smart Energy (ICAIS), Coimbatore, India, 2022, pp. 1121-1125, doi: 10.1109/ICAIS53314.2022.9742995.
[6] P. Vyas, M. Reisslein, B. P. Rimal, G. Vyas, G. P. Basyal and P. Muzumdar, "Automated Classification of Societal Sentiments on Twitter With Machine Learning," in IEEE Transactions on Technology and Society, vol. 3, no. 2, pp. 100-110, June 2022, doi: 10.1109/TTS.2021.3108963.
[7] F. Arias, M. Zambrano Núñez, A. Guerra-Adames, N. Tejedor-Flores and M. Vargas-Lombardo, "Sentiment Analysis of Public Social Media as a Tool for Health-Related Topics," in IEEE Access, vol. 10, pp. 74850-74872, 2022, doi: 10.1109/ACCESS.2022.3187406.
[8] L. Liu, Z. Cao, P. Zhao, P. J. -H. Hu, D. D. Zeng and Y. Luo, "A Deep Learning Approach for Semantic Analysis of COVID-19-Related Stigma on Social Media," in IEEE Transactions on Computational Social Systems, vol. 10, no. 1, pp. 246-254, Feb. 2023, doi: 10.1109/TCSS.2022.3145404.
[9] N. Seddari, A. Derhab, M. Belaoued, W. Halboob, J. Al-Muhtadi and A. Bouras, "A Hybrid Linguistic and Knowledge-Based Analysis Approach for Fake News Detection on Social Media," in IEEE Access, vol. 10, pp. 62097-62109, 2022, doi: 10.1109/ACCESS.2022.3181184.
[10] S. Ni, J. Li and H. -Y. Kao, "MVAN: Multi-View Attention Networks for Fake News Detection on Social Media," in IEEE Access, vol. 9, pp. 106907-106917, 2021, doi: 10.1109/ACCESS.2021.3100245.
[11] "E-commerce Market Share, Growth & Trends Report, 2020-2027," Grand View Research, 2020. [Online]. Available: https://www.grandviewresearch.com/industry-analysis/e-commerce-market.
[12] S. Maurya and V. Pratap, "Sentiment Analysis on Amazon Product Reviews," 2022 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COM-IT-CON), Faridabad, India, 2022, pp. 236-240, doi: 10.1109/COM-IT-CON54601.2022.9850758.
[13] H. He, G. Zhou and S. Zhao, "Exploring E-Commerce Product Experience Based on Fusion Sentiment Analysis Method," in IEEE Access, vol. 10, pp. 110248-110260, 2022, doi: 10.1109/ACCESS.2022.3214752.
[14] K. Chitra, A. Tamilarasi, S. G. Dharani, P. Keerthana and T. Madhumitha, "Opinion Mining and Sentiment Analysis on Product Reviews," 2022 International Conference on Computer Communication and Informatics (ICCCI), Coimbatore, India, 2022, pp. 1-7, doi: 10.1109/ICCCI54379.2022.9740777.
[15] C. Sindhu, S. Thejaswin, S. Harikrishnaa and C. Kavitha, "Mapping Distinct Source and Target Domains on Amazon Product Customer Critiques with Cross Domain Sentiment Analysis," 2022 Second International Conference on Artificial Intelligence and Smart Energy (ICAIS), Coimbatore, India, 2022, pp. 782-786, doi: 10.1109/ICAIS53314.2022.9742732.
[16] S. Rathor and Y. Prakash, "Application of Machine Learning for Sentiment Analysis of Movies Using IMDB Rating," 2022 IEEE 11th International Conference on Communication Systems and Network Technologies (CSNT), Indore, India, 2022, pp. 196-199, doi: 10.1109/CSNT54456.2022.9787663.
[17] M. B and C. S, "An Approach of Sentiment Analysis for Movie Reviews," in 2022 International Conference on Communication, Computing and Internet of Things (IC3IoT), Chennai, India, 2022, pp. 01-04, doi: 10.1109/IC3IOT53935.2022.9767915.
[18] K. Soni, P. Yadav, and Rahul, "Comparative Analysis of Rotten Tomatoes Movie Reviews using Sentiment Analysis," in 2022 6th International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, 2022, pp. 1494-1500, doi: 10.1109/ICICCS53718.2022.9788287.
[19] A. Gehlor and R. Singh, "Estimation of Sentiment Analysis Base Stock Market Crisis," 2022 IEEE 3rd Global Conference for Advancement in Technology (GCAT), Bangalore, India, 2022, pp. 1-5, doi: 10.1109/GCAT55367.2022.9972059.
[20] F. Alzazah, X. Cheng and X. Gao, "Predict Market Movements Based on the Sentiment of Financial Video News Sites," 2022 IEEE 16th International Conference on Semantic Computing (ICSC), Laguna Hills, CA, USA, 2022, pp. 103-110, doi: 10.1109/ICSC52841.2022.00022.
[21] A. Nabil and N. Magdi, "A New Model for Stock Market Predication Using a Three-Layer Long Short-Term Memory," 2022 2nd International Mobile, Intelligent, and Ubiquitous Computing Conference (MIUCC), Cairo, Egypt, 2022, pp. 421-424, doi: 10.1109/MIUCC55081.2022.9781741.
[22] S. Sridhar and S. Sanagavarapu, "Analysis of the Effect of News Sentiment on Stock Market Prices through Event Embedding," 2021 16th Conference on Computer Science and Intelligence Systems (FedCSIS), Sofia, Bulgaria, 2021, pp. 147-150, doi: 10.15439/2021F79.
[23] T. T. Dang, Y. Cheng and K. Hawick, "Market-Aware Sentiment Analysis for Stock Microblogs," 2021 26th International Conference on Automation and Computing (ICAC), Portsmouth, United Kingdom, 2021, pp. 1-6, doi: 10.23919/ICAC50006.2021.9594230.
[24] Y. Mehta, A. Malhar and R. Shankarmani, "Stock Price Prediction using Machine Learning and Sentiment Analysis," 2021 2nd International Conference for Emerging Technology (INCET), Belagavi, India, 2021, pp. 1-4, doi: 10.1109/INCET51464.2021.9456376.
[25] A. Costa and A. Veloso, "Employee Analytics through Sentiment Analysis," in Proceedings of the Brazilian Symposium on Databases, Petrópolis - RJ - Brazil, Oct. 2015, vol. 30, doi: 10.13140/RG.2.1.1623.3688
[26] T. D. Chungade and S. Kharat, "Employee performance assessment in virtual organization using domain-driven data mining and sentiment analysis," 2017 International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), Coimbatore, India, 2017, pp. 1-7, doi: 10.1109/ICIIECS.2017.8276093.
[27] R. Liu, Y. Shi, C. Ji and M. Jia, "A Survey of Sentiment Analysis Based on Transfer Learning," in IEEE Access, vol. 7, pp. 85401-85412, 2019, doi: 10.1109/ACCESS.2019.2925059.
[28] F. Alattar and K. Shaalan, "A Survey on Opinion Reason Mining and Interpreting Sentiment Variations," in IEEE Access, vol. 9, pp. 39636-39655, 2021, doi: 10.1109/ACCESS.2021.3063921.
[29] K. Chakraborty, S. Bhattacharyya and R. Bag, "A Survey of Sentiment Analysis from Social Media Data," in IEEE Transactions on Computational Social Systems, vol. 7, no. 2, pp. 450-464, April 2020, doi: 10.1109/TCSS.2019.2956957.
[30] J. Zhou, J. X. Huang, Q. Chen, Q. V. Hu, T. Wang and L. He, "Deep Learning for Aspect-Level Sentiment Classification: Survey, Vision, and Challenges," in IEEE Access, vol. 7, pp. 78454-78483, 2019, doi: 10.1109/ACCESS.2019.2920075.
[31] K. Schouten and F. Frasincar, "Survey on Aspect-Level Sentiment Analysis," in IEEE Transactions on Knowledge and Data Engineering, vol. 28, no. 3, pp. 813-830, 1 March 2016, doi: 10.1109/TKDE.2015.2485209.
[32] A. Nazir, Y. Rao, L. Wu and L. Sun, "Issues and Challenges of Aspect-based Sentiment Analysis: A Comprehensive Survey," in IEEE Transactions on Affective Computing, vol. 13, no. 2, pp. 845-863, 1 April-June 2022, doi: 10.1109/TAFFC.2020.2970399.
[33] P. K. Soni and R. Rambola, "A Survey on Implicit Aspect Detection for Sentiment Analysis: Terminology, Issues, and Scope," in IEEE Access, vol. 10, pp. 63932-63957, 2022, doi: 10.1109/ACCESS.2022.3183205.
[34] H. Liu, I. Chatterjee, M. Zhou, X. S. Lu and A. Abusorrah, "Aspect-Based Sentiment Analysis: A Survey of Deep Learning Methods," in IEEE Transactions on Computational Social Systems, vol. 7, no. 6, pp. 1358-1375, Dec. 2020, doi: 10.1109/TCSS.2020.3033302.
[35] K. Dhola and M. Saradva, "A Comparative Evaluation of Traditional Machine Learning and Deep Learning Classification Techniques for Sentiment Analysis," 2021 11th International Conference on Cloud Computing, Data Science & Engineering (Confluence), Noida, India, 2021, pp. 932-936, doi: 10.1109/Confluence51648.2021.9377070.
[36] A. Rawat, H. Maheshwari, M. Khanduja, R. Kumar, M. Memoria and S. Kumar, "Sentiment Analysis of Covid19 Vaccines Tweets Using NLP and Machine Learning Classifiers," 2022 International Conference on Machine Learning, Big Data, Cloud and Parallel Computing (COM-IT-CON), Faridabad, India, 2022, pp. 225-230, doi: 10.1109/COM-IT-CON54601.2022.9850629.
[37] D. Guan, "Sentiment Classification of Financial Online Reviews Based on Machine Learning Algorithm," 2022 International Conference on Education, Network and Information Technology (ICENIT), Liverpool, United Kingdom, 2022, pp. 289-293, doi: 10.1109/ICENIT57306.2022.00070.
[38] A. A. Wadhe and S. S. Suratkar, "Tourist Place Reviews Sentiment Classification Using Machine Learning Techniques," 2020 International Conference on Industry 4.0 Technology (I4Tech), Pune, India, 2020, pp. 1-6, doi: 10.1109/I4Tech48345.2020.9102673.
[39] J. Han, M. Kamber, and J. Pei, "Data Mining: Concepts and Techniques," 3rd ed., Elsevier, Amsterdam, Netherlands, 2012.
[40] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273-297, 1995.
[41] C. C. Aggarwal and C. Zhai, "Mining Text Data," Springer, 2012.
[42] M. Goldburd, A. Khare, and C. D. Tevet, "Generalized linear models for insurance rating," in Proceedings of the Casualty Actuarial Society, pp. 1-122, 2016.
[43] P. McCullagh, "Generalized linear models," Eur. J. Oper. Res., vol. 16, no. 3, pp. 370-384, Nov. 1984, doi: 10.1016/0377-2217(84)90084-1.
[44] I. Trofimov and A. Genkin, "Distributed coordinate descent for generalized linear models with regularization," Pattern Recognition and Image Analysis, vol. 27, no. 2, pp. 349-364, 2017.
[45] M. Abbas, K. Ali, S. Memon, A. Jamali, S. Memonullah, and A. Ahmed, "Multinomial Naive Bayes Classification Model for Sentiment Analysis," Mar. 2019. [Online]. Available: 10.13140/RG.2.2.30021.40169.
[46] A. A. Farisi, Y. Sibaroni, and S. A. Faraby, "Sentiment analysis on hotel reviews using Multinomial Naïve Bayes classifier," in Journal of Physics Conference Series, vol. 1192, no. 1, p. 012024, March 2019, doi: 10.1088/1742-6596/1192/1/012024.
[47] M.-R. Amini and N. Usunier, "Semi-Supervised Learning," in Learning with Partially Labeled and Interdependent Data, Springer, Cham, 2015, pp. 3-18, doi: 10.1007/978-3-319-15726-9_3.
[48] J. Kazmaier and J. H. van Vuuren, "The power of ensemble learning in sentiment analysis," Expert Systems with Applications, vol. 187, 2022, article no. 115819, ISSN: 0957-4174, https://doi.org/10.1016/j.eswa.2021.115819.
[49] K. L. Tan, C. P. Lee, K. M. Lim and K. S. M. Anbananthen, "Sentiment Analysis With Ensemble Hybrid Deep Learning Model," in IEEE Access, vol. 10, pp. 103694-103704, 2022, doi: 10.1109/ACCESS.2022.3210182.
[50] M. S. Akhtar, D. Ghosal, A. Ekbal, P. Bhattacharyya and S. Kurohashi, "All-in-One: Emotion, Sentiment and Intensity Prediction Using a Multi-Task Ensemble Framework," in IEEE Transactions on Affective Computing, vol. 13, no. 1, pp. 285-297, 1 Jan.-March 2022, doi: 10.1109/TAFFC.2019.2926724.
[51] N. Aslam, F. Rustam, E. Lee, P. B. Washington and I. Ashraf, "Sentiment Analysis and Emotion Detection on Cryptocurrency Related Tweets Using Ensemble LSTM-GRU Model," in IEEE Access, vol. 10, pp. 39313-39324, 2022, doi: 10.1109/ACCESS.2022.3165621.
[52] S. M. Al-Ghuribi, S. A. Mohd Noah, and S. Tiun, "Unsupervised Semantic Approach of Aspect-Based Sentiment Analysis for Large-Scale User Reviews," IEEE Access, vol. 8, pp. 218592-218613, 2020, doi: 10.1109/ACCESS.2020.3042312.
[53] A. Yadav, C. K. Jha, A. Sharan, and V. Vaish, "Sentiment analysis of financial news using unsupervised approach," Procedia Comput. Sci., vol. 167, pp. 589-598, Mar. 2020, doi: 10.1016/j.procs.2020.03.325.
[54] J. Rothfels and J. Tibshirani, "Unsupervised sentiment classification of English movie reviews using automatic selection of positive and negative sentiment items," CS224N-Final Project, pp. 52-56, 2010.
[55] Y. Dai, J. Liu, J. Zhang, H. Fu, and Z. Xu, "Unsupervised Sentiment Analysis by Transferring Multi-source Knowledge," Cognitive Computation, vol. 13, no. 5, pp. 1185-1197, Sep. 2021, doi: 10.1007/s12559-020-09792-8.
[56] A. García-Pablos, M. Cuadros, and G. Rigau, "W2VLDA: Almost unsupervised system for Aspect Based Sentiment Analysis," Expert Systems with Applications, vol. 96, pp. 229-241, Jan. 2018, doi: 10.1016/j.eswa.2017.08.049.
[57] J. J. E. Macrohon, C. N. Villavicencio, X. A. Inbaraj and J.-H. Jeng, "A Semi-Supervised Approach to Sentiment Analysis of Tweets during the 2022 Philippine Presidential Election," in IEEE Access, vol. 10, pp. 52217-52228, 2022, doi: 10.1109/ACCESS.2022.3051484.
[58] R. W. Acuña Caicedo, J. M. Gómez Soriano, and H. A. Melgar Sasieta, "Bootstrapping semi-supervised annotation method for potential suicidal messages," Internet Interventions, vol. 28, article
no. 100519, 2022, ISSN 2214-7829, doi: 10.1016/j.invent.2022.100519.
[59] N. H. Cahyana, S. Saifullah, Y. Fauziah, A. Sasmito, and R. Drezewski, "Semi-supervised Text Annotation for Hate Speech Detection using K-Nearest Neighbors and Term Frequency-Inverse Document Frequency," International Journal of Advanced Computer Science and Applications, vol. 13, no. 10, pp. 147-151, Oct. 2022, doi: 10.14569/IJACSA.2022.0131020.
[60] V. Tyagi, A. Kumar and S. Das, "Sentiment Analysis on Twitter Data Using Deep Learning approach," 2020 2nd International Conference on Advances in Computing, Communication Control and Networking (ICACCCN), Greater Noida, India, 2020, pp. 187-190, doi: 10.1109/ICACCCN51052.2020.9362853.
[61] B. Seetharamulu, B. N. K. Reddy and K. B. Naidu, "Deep Learning for Sentiment Analysis Based on Customer Reviews," 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT), Kharagpur, India, 2020, pp. 1-5, doi: 10.1109/ICCCNT49239.2020.9225665.
[62] A. C. Mazari and A. Djeffal, "Deep Learning-Based Sentiment Analysis of Algerian Dialect during Hirak 2019," 2020 2nd International Workshop on Human-Centric Smart Environments for Health and Well-being (IHSH), Boumerdes, Algeria, 2021, pp. 233-236, doi: 10.1109/IHSH51661.2021.9378753.
[63] X. Glorot, A. Bordes, and Y. Bengio, "Domain adaptation for large-scale sentiment classification: A deep learning approach," in Proceedings of the 28th International Conference on Machine Learning (ICML-11), 2011, pp. 513-520.
[64] R. D. Tan et al., "LMS Content Evaluation System with Sentiment Analysis Using Lexicon-Based Approach," 2022 10th International Conference on Information and Education Technology (ICIET), Matsue, Japan, 2022, pp. 93-98, doi: 10.1109/ICIET55102.2022.9778976.
[65] M. Huang, H. Xie, Y. Rao, Y. Liu, L. K. M. Poon and F. L. Wang, "Lexicon-Based Sentiment Convolutional Neural Networks for Online Review Analysis," in IEEE Transactions on Affective Computing, vol. 13, no. 3, pp. 1337-1348, 1 July-Sept. 2022, doi: 10.1109/TAFFC.2020.2997769.
[66] W. Suwanpipob, N. Arch-int and M. Wattana, "A Sentiment Classification from Review Corpus using Linked Open Data and Sentiment Lexicon," 2021 13th International Conference on Information Technology and Electrical Engineering (ICITEE), Chiang Mai, Thailand, 2021, pp. 19-23, doi: 10.1109/ICITEE53064.2021.9611898.
[67] S. Lee, S. Ma, J. Meng, J. Zhuang, and T.-Q. Peng, "Detecting Sentiment toward Emerging Infectious Diseases on Social Media: A Validity Evaluation of Dictionary-Based Sentiment Analysis," International Journal of Environmental Research and Public Health, vol. 19, no. 11, article no. 6759, 2022, doi: 10.3390/ijerph19116759.
[68] E. Okango and H. Mwambi, "Dictionary Based Global Twitter Sentiment Analysis of Coronavirus (COVID-19) Effects and Response," Annals of Data Science, vol. 9, no. 1, pp. 175-186, Feb. 2022, doi: 10.1007/s40745-021-00358-5.
[69] A. Husna, H. Rahman and E. K. Hashi, "Statistical Approach for Classifying Sentiment Reviews by Reducing Dimension using Truncated Singular Value Decomposition," 2019 1st International Conference on Advances in Science, Engineering and Robotics Technology (ICASERT), Dhaka, Bangladesh, 2019, pp. 1-5, doi: 10.1109/ICASERT.2019.8934507.
[70] A. P. Kumar, A. Nayak and M. S. K, "A Statistical approach to evaluate the efficiency and effectiveness of the Machine Learning algorithms analyzing Sentiments," 2019 IEEE International Conference on Distributed Computing, VLSI, Electrical Circuits and Robotics (DISCOVER), Manipal, India, 2019, pp. 1-6, doi: 10.1109/DISCOVER47552.2019.9008028.
[71] R. Gupta, J. Kumar, H. Agrawal and Kunal, "A Statistical Approach for Sarcasm Detection Using Twitter Data," 2020 4th International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, 2020, pp. 633-638, doi: 10.1109/ICICCS48265.2020.9120917.
[72] S. Choudhary and J. Godara, "Semantic Analysis on Social Media," 2021 International Conference on Computing Sciences (ICCS), Phagwara, India, 2021, pp. 239-243, doi: 10.1109/ICCS54944.2021.00054.
[73] Y. S. Mehanna and M. B. Mahmuddin, "A Semantic Conceptualization Using Tagged Bag-of-Concepts for Sentiment Analysis," in IEEE Access, vol. 9, pp. 118736-118756, 2021, doi: 10.1109/ACCESS.2021.3107237.
[74] G. Gautam and D. Yadav, "Sentiment analysis of twitter data using machine learning approaches and semantic analysis," 2014 Seventh International Conference on Contemporary Computing (IC3), Noida, India, 2014, pp. 437-442, doi: 10.1109/IC3.2014.6897213.
[75] L. Jiang, M. Yu, M. Zhou, X. Liu, and T. Zhao, "Target-dependent Twitter sentiment classification," in Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, vol. 1, 2011, pp. 151-160.
[76] S. Kiritchenko, X. Zhu, C. Cherry, and S. Mohammad, "NRC-Canada2014: Detecting aspects and sentiment in customer reviews," in Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), January 2014, pp. 437-442, doi: 10.3115/v1/S14-2076.
[77] J. Wagner, P. Arora, S. Cortes, U. Barman, D. Bogdanova, J. Foster, and L. Tounsi, "DCU: Aspect-based Polarity Classification for SemEval Task 4," in Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), Dublin, Ireland, Aug. 2014, pp. 223-229, doi: 10.3115/v1/S14-2036.
[78] A. Neviarouskaya, H. Prendinger and M. Ishizuka, "SentiFul: Generating a reliable lexicon for sentiment analysis," 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops, Amsterdam, Netherlands, 2009, pp. 1-6, doi: 10.1109/ACII.2009.5349575.
[79] G. Qiu, L. Bing, J. Bu, and C. Chen, "Expanding domain sentiment lexicon through double propagation," in Proceedings of the International Joint Conference on Artificial Intelligence, 2009, pp. 1199-1204.
[80] M. Taboada, J. Brooke, M. Tofiloski, K. Voll, and M. Stede, "Lexicon-based methods for sentiment analysis," Computational Linguistics, vol. 37, no. 2, pp. 267-307, 2011. [Online]. Available: https://doi.org/10.1162/COLI_a_00049
[81] M. Yang, W. Yin, Q. Qu, W. Tu, Y. Shen and X. Chen, "Neural Attentive Network for Cross-Domain Aspect-Level Sentiment Classification," in IEEE Transactions on Affective Computing, vol. 12, no. 3, pp. 761-775, 1 July-Sept. 2021, doi: 10.1109/TAFFC.2019.2897093.
[82] Y. Fu and Y. Liu, "Cross-domain sentiment classification based on key pivot and non-pivot extraction," in Knowledge-Based Systems, vol. 228, 107280, 2021. doi: 10.1016/j.knosys.2021.107280.
[83] Y. Fu and Y. Liu, "Contrastive transformer based domain adaptation for multi-source cross-domain sentiment classification," Knowledge-Based Systems, vol. 245, p. 108649, Jun. 2022. [Online]. Available: https://doi.org/10.1016/j.knosys.2022.108649.
[84] H. Tang, Y. Mi, F. Xue and Y. Cao, "Graph Domain Adversarial Transfer Network for Cross-Domain Sentiment Classification," in IEEE Access, vol. 9, pp. 33051-33060, 2021, doi: 10.1109/ACCESS.2021.3061139.
[85] Z. Li, Y. Wei, Y. Zhang, and Q. Yang, "Hierarchical Attention Transfer Network for Cross-Domain Sentiment Classification," in Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, vol. 32, no. 1, Main Track: NLP and Text Mining, 2018, DOI: 10.1609/aaai.v32i1.12055.
[86] K. Zhang, H. Zhang, Q. Liu, H. Zhao, H. Zhu, and E. Chen, "Interactive attention transfer network for cross-domain sentiment classification," in Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence (AAAI'19/IAAI'19/EAAI'19), January 2019, Article No.: 708, pp. 5773-5780, DOI: 10.1609/aaai.v33i01.33015773.
[87] T. Manshu and Z. Xuemin, "CCHAN: An End to End Model for Cross Domain Sentiment Classification," in IEEE Access, vol. 7, pp. 50232-50239, 2019, doi: 10.1109/ACCESS.2019.2910300.
[88] P. Liu, X. Qiu, and X. Huang, "Adversarial multi-task learning for text classification," Proceedings of the 55th Annual Meeting of the
Association for Computational Linguistics (Volume 1: Long Papers) [Preprint]. Available at: https://doi.org/10.18653/v1/p17-1001.
[89] X. Chen and C. Cardie, "Multinomial adversarial networks for multi-domain text classification," Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) [Preprint]. Available at: https://doi.org/10.18653/v1/n18-1111.
[90] J. Chen, X. Qiu, and X. Huang, "Meta Multi-Task Learning for Sequence Modeling," Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, no. 1, February 2018, pp. 12007-12014, doi: 10.1609/aaai.v32i1.12007.
[91] Y. Dai, J. Liu, X. Ren, and Z. Xu, "Adversarial training based multi-source unsupervised domain adaptation for sentiment analysis," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 5, pp. 7618-7625, 2020.
[92] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, "BERT: Pre-training of deep bidirectional transformers for language understanding," arXiv preprint arXiv:1810.04805, 2018.
[93] V. Sanh, L. Debut, J. Chaumond, and T. Wolf, "DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter," arXiv preprint arXiv:1910.01108, 2019.
[94] J. Blitzer, M. Dredze, and F. Pereira, "Biographies, bollywood, boomboxes and blenders: Domain adaptation for sentiment classification," in Proc. 45th Annu. Meeting Assoc. Comput. Linguistics (ACL), 2007, vol. 1, pp. 440-447.
[95] Y. Ganin, E. Ustinova, H. Ajakan, P. Germain, H. Larochelle, F. Laviolette, M. Marchand, and V. Lempitsky, "Domain-Adversarial Training of Neural Networks," in Domain Adaptation in Computer Vision Applications, G. Csurka, Ed. Cham: Springer, 2017, pp. 79-87. doi: 10.1007/978-3-319-58347-1_10.
[96] Z. Li, Y. Zhang, Y. Wei, Y. Wu, and Q. Yang, "End-to-End Adversarial Memory Network for Cross-domain Sentiment Classification," in Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI), Melbourne, Australia, 2017, pp. 3897-3903, doi: 10.24963/IJCAI.2017/311.
[97] Y. Du, M. He, L. Wang, and H. Zhang, "Wasserstein based transfer network for cross-domain sentiment classification," Knowledge-Based Systems, vol. 204, pp. 1-11, Sep. 2020, doi: 10.1016/j.knosys.2020.106162.
[98] G. R. Wang, K. Z. Wang, and L. Lin, "Adaptively connected neural networks," in Proc. of the 2019 IEEE Conf. Computer Vision and Pattern Recognition, Long Beach, CA, USA, 2019, pp. 1781–1790.
[99] R. Cadene, C. Dancette, H. Ben-younes, M. Cord, and D. Parikh, "RUBi: Reducing unimodal biases for visual question answering," in Proc. Conf. Neural Inf. Process. Syst. (NeurIPS), Vancouver, Canada, Dec. 2019, pp. 841-852.
[100] N. Xu and W. J. Mao, "MultiSentiNet: A deep semantic network for multimodal sentiment analysis," in Proc. 2017 ACM Conf. Information and Knowledge Management, Singapore, 2017, pp. 2399-2402. [Online]. Available: https://doi.org/10.1145/3132847.3133142
[101] J. C. Xu, D. L. Chen, X. P. Qiu, and X. J. Huang, "Cached long short-term memory neural networks for document-level sentiment classification," in Proc. 2016 Conf. Empirical Methods in Natural Language Processing, Austin, TX, USA, 2016, pp. 1660–1669.
[102] C. Yang, X. Wang and B. Jiang, "Sentiment Enhanced Multi-Modal Hashtag Recommendation for Micro-Videos," in IEEE Access, vol. 8, pp. 78252-78264, 2020, doi: 10.1109/ACCESS.2020.2989473.
[103] F. R. Huang, K. M. Wei, J. Weng, and Z. J. Li, "Attention-based modality-gated networks for image-text sentiment analysis," ACM Trans. Multimed. Comput. Commun. Appl., vol. 16, no. 3, p. 79, 2020. [Online]. Available: https://doi.org/10.1145/3388861
[104] C. Peng, C. Zhang, X. Xue, J. Gao, H. Liang and Z. Niu, "Cross-modal complementary network with hierarchical fusion for multimodal sentiment classification," in Tsinghua Science and Technology, vol. 27, no. 4, pp. 664-679, Aug. 2022, doi: 10.26599/TST.2021.9010055.
[105] Y. Du, Y. Liu, Z. Peng, and X. Jin, "Gated attention fusion network for multimodal sentiment classification," Knowledge-Based Systems, vol. 240, pp. 108107, Mar. 2022. [Online]. Available: https://doi.org/10.1016/j.knosys.2021.108107
[106] Z. Yu, J. Yu, J. Fan and D. Tao, "Multi-modal Factorized Bilinear Pooling with Co-attention Learning for Visual Question Answering," 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 2017, pp. 1839-1848, doi: 10.1109/ICCV.2017.202.
[107] J. Arevalo, T. Solorio, M. Montes-y-Gomez, and F. A. Gonzalez, "Gated multimodal units for information fusion," arXiv:1702.01992v1 [cs.CL], Feb. 2017. [Online]. Available: https://arxiv.org/abs/1702.01992v1.
[108] D. Borth, R. Ji, T. Chen, T. Breuel, and S.-F. Chang, "Large-scale visual sentiment ontology and detectors using adjective noun pairs," in Proceedings of the 21st ACM international conference on Multimedia (MM '13), October 2013, pp. 223-232. DOI: 10.1145/2502081.2502282.
[109] X. Yang, S. Feng, D. Wang and Y. Zhang, "Image-Text Multimodal Emotion Classification via Multi-View Attentional Network," in IEEE Transactions on Multimedia, vol. 23, pp. 4014-4026, 2021, doi: 10.1109/TMM.2020.3035277.
[110] K. Zhang, Y. Geng, J. Zhao, J. Liu, and W. Li, "Sentiment Analysis of Social Media via Multimodal Feature Fusion," Symmetry, vol. 12, no. 12, p. 2010, Dec. 2020. DOI: 10.3390/sym12122010.
[111] N. Xu, W. Mao, and G. Chen, "A Co-Memory Network for Multimodal Sentiment Analysis," in Proc. SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, June 2018, pp. 929-932. DOI: 10.1145/3209978.3210093.
[112] N. Xu, "Analyzing multimodal public sentiment based on hierarchical semantic attentional network," 2017 IEEE International Conference on Intelligence and Security Informatics (ISI), Beijing, China, 2017, pp. 152-154, doi: 10.1109/ISI.2017.8004895.
[113] Y. Yu, H. Lin, J. Meng, and Z. Zhao, "Visual and Textual Sentiment Analysis of a Microblog Using Deep Convolutional Neural Networks," in Algorithms, vol. 9, no. 2, p. 41, 2016. DOI: 10.3390/a9020041.
[114] G. Cai and B. Xia, "Convolutional Neural Networks for Multimedia Sentiment Analysis," in Natural Language Processing and Chinese Computing, J. Li, H. Ji, D. Zhao, and Y. Feng, Eds. Cham: Springer, 2015, pp. 149-158. DOI: 10.1007/978-3-319-25207-0_14.
[115] C. Baecchi, T. Uricchio, M. Bertini, et al., "A multimodal feature learning approach for sentiment analysis of social network multimedia," Multimed Tools Appl, vol. 75, no. 5, pp. 2507-2525, Mar. 2016. DOI: 10.1007/s11042-015-2646-x.
[116] D. Hazarika, S. Poria, R. Mihalcea, E. Cambria, and R. Zimmermann, "ICON: Interactive Conversational Memory Network for Multimodal Emotion Detection," in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Oct.-Nov. 2018, Brussels, Belgium. [Online]. Available: https://aclanthology.org/D18-1280. DOI: 10.18653/v1/D18-1280.
[117] A. Go, R. Bhayani, and L. Huang, "Twitter sentiment classification using distant supervision," CS224N Project Rep., Stanford, vol. 1, no. 12, p. 2009, 2009. [Online]. Available: https://cs.stanford.edu/people/alecmgo/papers/TwitterDistantSupervision09.pdf.
[118] B. Pang, L. Lee, and S. Vaithyanathan, "Thumbs up? Sentiment Classification using Machine Learning Techniques," in Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002), July 2002, pp. 79-86, doi: 10.3115/1118693.1118704.
[119] A. Kennedy and D. Inkpen, "Sentiment Classification of Movie Reviews Using Contextual Valence Shifters," Computational Intelligence, vol. 22, no. 2, pp. 110-125, May 2006, doi: 10.1111/j.1467-8640.2006.00277.x.
[120] S. Mohammad, S. Kiritchenko, and X. Zhu, "NRC-Canada: Building the State-of-the-Art in Sentiment Analysis of Tweets," in Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013), Atlanta, Georgia, USA, June 2013, pp. 321-327.
[121] S. Liu, F. Li, F. Li, X. Cheng, and H. Shen, "Adaptive co-training SVM for sentiment classification on tweets," in Proceedings of the 22nd ACM International Conference on Information & Knowledge Management (CIKM '13), October 2013, pp. 2079-2088, doi: 10.1145/2505515.2505569.
[122] H. T. Nguyen and M. Le Nguyen, "Effective Attention Networks for Aspect-level Sentiment Classification," 2018 10th International Conference on Knowledge and Systems Engineering (KSE), Ho Chi Minh City, Vietnam, 2018, pp. 25-30, doi: 10.1109/KSE.2018.8573324.
[123] Y. Tay, L. A. Tuan, and S. C. Hui, "Learning to attend via word-aspect associative fusion for aspect-based sentiment analysis," in Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence (AAAI'18/IAAI'18/EAAI'18), February 2018, Article No.: 731, pp. 5956-5963.
[124] D. Ma, S. Li, X. Zhang, and H. Wang, "Interactive attention networks for aspect-level sentiment classification," in Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI'17), August 2017, pp. 4068-4074. doi: 10.24963/ijcai.2017/568
[125] K. Karthikeyan, Z. Wang, S. Mayhew, and D. Roth, "Cross-lingual ability of multilingual bert: An empirical study," in Proc. Int. Conf. Learn. Represent., 2019, pp. 1–12.
[126] Y. Wang, M. Huang, X. Zhu, and L. Zhao, "Attention-based LSTM for aspect-level sentiment classification," in Proc. Conf. Empirical Methods Natural Lang. Process., Nov. 2016, pp. 606-615, doi: 10.18653/v1/d16-1058.
[127] Y. Song, J. Wang, T. Jiang, Z. Liu, and Y. Rao, "Targeted sentiment classification with attentional encoder network," in Proc. ICANN 2019: Text and Time Series, I. Tetko, V. Kůrková, P. Karpov, and F. Theis, Eds., vol. 11730, Cham: Springer, Sep. 2019, pp. 82-91. doi: 10.1007/978-3-030-30490-4_9
[128] J. Deriu, A. Lucchi, V. De Luca, A. Severyn, S. Müller, M. Cieliebak, T. Hofmann, and M. Jaggi, "Leveraging large amounts of weakly supervised data for multi-language sentiment classification," in Proc. WWW '17: 26th Int. Conf. World Wide Web, Apr. 2017, pp. 1045-1052. doi: 10.1145/3038912.3052611
[129] K. Dashtipour, S. Poria, A. Hussain, et al., "Multilingual Sentiment Analysis: State of the Art and Independent Comparison of Techniques," Cogn. Comput., vol. 8, no. 4, pp. 757-771, Aug. 2016. doi: 10.1007/s12559-016-9415-7
[130] A. Balahur and M. Turchi, "Multilingual sentiment analysis using machine translation," in Proc. 3rd Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA), Stroudsburg, PA, USA: Association for Computational Linguistics, 2012, pp. 52-60.
[131] A. Conneau, G. Lample, M. Ranzato, L. Denoyer, and H. Jégou, "Word Translation Without Parallel Data," arXiv preprint arXiv:1710.04087, Oct. 2017. [Online]. Available: https://arxiv.org/abs/1710.04087. DOI: 10.48550/arXiv.1710.04087.
[132] E. Ghadery, S. Movahedi, H. Faili, and A. Shakery, "MNCN: A Multilingual Ngram-Based Convolutional Network for Aspect Category Detection in Online Reviews," Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, pp. 6441-6448, 2019. DOI: 10.1609/aaai.v33i01.33016441.
[133] K. Sattar, Q. Umer, D. G. Vasbieva, S. Chung, Z. Latif and C. Lee, "A Multi-Layer Network for Aspect-Based Cross-Lingual Sentiment Classification," in IEEE Access, vol. 9, pp. 133961-133973, 2021, doi: 10.1109/ACCESS.2021.3116053.
[134] Z. Lin, X. Jin, X. Xu, W. Wang, X. Cheng, and Y. Wang, "A Cross-Lingual Joint Aspect/Sentiment Model for Sentiment Analysis," Proceedings of the 23rd ACM International Conference on Information and Knowledge Management (CIKM '14), November 2014, pp. 1089-1098. DOI: 10.1145/2661829.2662019.
[135] M. Paul and R. Girju, "Cross-cultural analysis of blogs and forums with mixed-collection topic models," in Proc. Conf. Empirical Methods Natural Lang. Process. (EMNLP), 2009, pp. 1408–1417.
[136] Y. Bao, N. Collier, and A. Datta, "A partially supervised cross-collection topic model for cross-domain text classification," Proceedings of the 22nd ACM International Conference on Information & Knowledge Management (CIKM '13), October 2013, pp. 239-248. DOI: 10.1145/2505515.2505556.
[137] F. Zhuang, P. Luo, Z. Shen, Q. He, Y. Xiong, Z. Shi, and H. Xiong, "Collaborative Dual-PLSA: mining distinction and commonality across multiple domains for text classification," Proceedings of the 19th ACM International Conference on Information and Knowledge Management (CIKM '10), October 2010, pp. 359-368. DOI: 10.1145/1871437.1871486.
[138] L. Li, X. Jin, and M. Long, "Topic correlation analysis for cross-domain text classification," Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence (AAAI'12), July 2012, pp. 998-1004.
[139] G. Zhou, T. He, J. Zhao, and W. Wu, "A subspace learning framework for cross-lingual sentiment classification with partial parallel data," in Proc. IJCAI, 2015, pp. 1426–1433.
[140] D. Wang et al., "Coarse Alignment of Topic and Sentiment: A Unified Model for Cross-Lingual Sentiment Classification," in IEEE Transactions on Neural Networks and Learning Systems, vol. 32, no. 2, pp. 736-747, Feb. 2021, doi: 10.1109/TNNLS.2020.2979225.
[141] F. Zhuang, P. Luo, C. Du, Q. He, Z. Shi and H. Xiong, "Triplex Transfer Learning: Exploiting Both Shared and Distinct Concepts for Text Classification," in IEEE Transactions on Cybernetics, vol. 44, no. 7, pp. 1191-1203, July 2014, doi: 10.1109/TCYB.2013.2281451.
[142] M. Long, J. Wang, G. Ding, W. Cheng, X. Zhang, and W. Wang, "Dual Transfer Learning," in Proceedings of the 2012 SIAM International Conference on Data Mining (SDM), 2012, pp. 47-58, doi: 10.1137/1.9781611972825.47.
[143] P. Prettenhofer and B. Stein, "Cross-Lingual Adaptation Using Structural Correspondence Learning," ACM Transactions on Intelligent Systems and Technology, vol. 3, no. 1, pp. 13:1-13:22, October 2011, doi: 10.1145/2036264.2036277.
[144] X. Wan, "Co-training for cross-lingual sentiment classification," in Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, 2009, pp. 235-243.
[145] C. Ma, M. Wang, and X. Chen, "Topic and Sentiment Unification Maximum Entropy Model for Online Review Analysis," in Proceedings of the 24th International Conference on World Wide Web Companion, May 2015, pp. 649-654, doi: 10.1145/2740908.2741704.
[146] M. Xuanyuan, L. Xiao and M. Duan, "Sentiment Classification Algorithm Based on Multi-Modal Social Media Text Information," in IEEE Access, vol. 9, pp. 33410-33418, 2021, doi: 10.1109/ACCESS.2021.3061450.
[147] A. Joulin, E. Grave, P. Bojanowski, and T. Mikolov, "Bag of Tricks for Efficient Text Classification," in Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, Valencia, Spain, Apr. 2017, pp. 427-431.
[148] F. Li, M. Zhang, G. Fu, T. Qian, and D. Ji, "A Bi-LSTM-RNN model for relation classification using low-cost sequence features," 2016, arXiv:1608.07720. [Online]. Available: http://arxiv.org/abs/1608.07720
[149] R. Wang, Z. Li, J. Cao, T. Chen and L. Wang, "Convolutional Recurrent Neural Networks for Text Classification," 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 2019, pp. 1-6, doi: 10.1109/IJCNN.2019.8852406.
[150] X. Sun and J. He, "A novel approach to generate a large scale of supervised data for short text sentiment analysis," Multimedia Tools Appl., vol. 79, no. 9, pp. 5439-5459, Feb. 2018. doi: 10.1007/s11042-018-5748-4.
[151] A. Kumar, K. Srinivasan, C. Wen-Huang, and A. Y. Zomaya, "Hybrid context enriched deep learning model for fine-grained sentiment analysis in textual and visual semiotic modality social data," Information Processing & Management, vol. 57, no. 1, pp. 102141, Jan. 2020. doi: 10.1016/j.ipm.2019.102141.
[152] S. Bairavel and M. Krishnamurthy, "Novel OGBEE-based feature selection and feature-level fusion with MLP neural network for social media multimodal sentiment analysis," Soft Computing, vol. 24, pp. 18431–18445, Dec. 2020. doi: 10.1007/s00500-020-05049-6.
[153] T. Sabbah, M. Ayyash, and M. Ashraf, "Hybrid support vector machine-based feature selection method for text classification,"
International Arab Journal of Information Technology, vol. 15, no. 3, pp. 599–609, 2018.
[154] L. Dey, S. Chakraborty, A. Biswas, B. Bose, and S. Tiwari, "Sentiment Analysis of Review Datasets Using Naïve Bayes' and K-NN Classifier," International Journal of Information Engineering and Electronic Business, vol. 8, no. 4, pp. 54-62, July 2016. DOI: 10.5815/ijieeb.2016.04.07.
[155] U. I. Larasati, M. A. Muslim, R. Arifudin, and A. Alamsyah, "Improve the accuracy of support vector machine using chi square statistic and term frequency inverse document frequency on movie review sentiment analysis," Scientific Journal of Informatics, vol. 6, no. 1, pp. 138-149, May 2019. DOI: 10.15294/sji.v6i1.14244.
[156] N. S. Mohd Nafis and S. Awang, "An Enhanced Hybrid Feature Selection Technique Using Term Frequency-Inverse Document Frequency and Support Vector Machine-Recursive Feature Elimination for Sentiment Classification," in IEEE Access, vol. 9, pp. 52177-52192, 2021, doi: 10.1109/ACCESS.2021.3069001.
[157] J. Khan, A. Alam and Y. Lee, "Intelligent Hybrid Feature Selection for Textual Sentiment Classification," in IEEE Access, vol. 9, pp. 140590-140608, 2021, doi: 10.1109/ACCESS.2021.3118982.
[158] A. I. Pratiwi and A. Adiwijaya, "On the Feature Selection and Classification Based on Information Gain for Document Sentiment Analysis," Hindawi Journal, vol. 2018, Article ID 1407817, Feb. 2018. DOI: 10.1155/2018/1407817.
[159] R. Maulana, P. A. Rahayuningsih, W. Irmayani, D. Saputra, and W. E. Jayanti, "Improved Accuracy of Sentiment Analysis Movie Review Using Support Vector Machine Based Information Gain," Journal of Physics Conference Series, vol. 1641, no. 1, p. 012060, Nov. 2020. DOI: 10.1088/1742-6596/1641/1/012060.
[160] B. Y. Ong, S. W. Goh and C. Xu, "Sparsity adjusted information gain for feature selection in sentiment analysis," 2015 IEEE International Conference on Big Data (Big Data), Santa Clara, CA, USA, 2015, pp. 2122-2128, doi: 10.1109/BigData.2015.7363995.
[161] Q. Huang, R. Chen, X. Zheng and Z. Dong, "Deep Sentiment Representation Based on CNN and LSTM," 2017 International Conference on Green Informatics (ICGI), Fuzhou, China, 2017, pp. 30-33, doi: 10.1109/ICGI.2017.45.
[162] M. S. Neethu and R. Rajasree, "Sentiment analysis in twitter using machine learning techniques," 2013 Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT), Tiruchengode, India, 2013, pp. 1-5, doi: 10.1109/ICCCNT.2013.6726818.
[163] J. Schmidhuber, "Deep learning in neural networks: An overview," Neural Networks, vol. 61, pp. 85-117, Jan. 2015. DOI: 10.1016/j.neunet.2014.09.003.
[164] G. Xu, Y. Meng, X. Qiu, Z. Yu and X. Wu, "Sentiment Analysis of Comment Texts Based on BiLSTM," in IEEE Access, vol. 7, pp. 51522-51532, 2019, doi: 10.1109/ACCESS.2019.2909919.
[165] G. Liu and J. Guo, "Bidirectional LSTM with attention mechanism and convolutional layer for text classification," Neurocomputing, vol. 337, pp. 325-338, Apr. 2019. DOI: 10.1016/j.neucom.2019.01.078.
[166] S. Tam, R. B. Said and Ö. Ö. Tanriöver, "A ConvBiLSTM Deep Learning Model-Based Approach for Twitter Sentiment Classification," in IEEE Access, vol. 9, pp. 41283-41293, 2021, doi: 10.1109/ACCESS.2021.3064830.
[167] P.-H. Chi et al., "Audio Albert: A Lite Bert for Self-Supervised Learning of Audio Representation," 2021 IEEE Spoken Language Technology Workshop (SLT), Shenzhen, China, 2021, pp. 344-350, doi: 10.1109/SLT48900.2021.9383575.
[168] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer, and V. Stoyanov, "RoBERTa: A Robustly Optimized BERT Pretraining Approach," arXiv:1907.11692 [cs.CL], 2019. [Online]. Available: https://doi.org/10.48550/arXiv.1907.11692.
[169] X. Cao, J. Yu and Y. Zhuang, "Injecting User Identity Into Pretrained Language Models for Document-Level Sentiment Classification," in IEEE Access, vol. 10, pp. 30157-30167, 2022, doi: 10.1109/ACCESS.2022.3158975.
[170] Z. Hameed and B. Garcia-Zapirain, "Sentiment Classification Using a Single-Layered BiLSTM Model," in IEEE Access, vol. 8, pp. 73992-74001, 2020, doi: 10.1109/ACCESS.2020.2988550.
[171] I. N. Khasanah, "Sentiment Classification Using fastText Embedding and Deep Learning Model," Procedia Computer Science, vol. 181, pp. 682-688, May 2021, doi: 10.1016/j.procs.2021.05.103.

NILAA RAGHUNATHAN is currently pursuing the BTech degree in computer science, with a specialization in data science, at the School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India. Her research interests include data science, machine learning, and natural language processing.

SARAVANAKUMAR KANDASAMY received the PhD degree in computer science and engineering from the Vellore Institute of Technology, Vellore, Tamil Nadu, India, and the MTech degree in computer science engineering from the Indian Institute of Technology, Guwahati, India. Since 2008, he has been with the Vellore Institute of Technology, where he is currently an Associate Professor in the Department of Software Systems. His research interests include natural language processing, machine learning, and advanced database systems.