
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/261875740

Sentiment Analysis Algorithms and Applications: A Survey

Article in Ain Shams Engineering Journal · May 2014
DOI: 10.1016/j.asej.2014.04.011
Citations: 1,608 · Reads: 10,469

Authors:
Walaa Medhat, Nile University
Ahmed Hassan Yousef, Nile University
Hoda K. Mohamed, Ain Shams University

All content following this page was uploaded by Ahmed Hassan Yousef on 24 September 2014.


Ain Shams Engineering Journal (2014) xxx, xxx–xxx

Ain Shams University
Ain Shams Engineering Journal
www.elsevier.com/locate/asej
www.sciencedirect.com

ELECTRICAL ENGINEERING

Sentiment analysis algorithms and applications: A survey

Walaa Medhat a,*, Ahmed Hassan b, Hoda Korashy b

a School of Electronic Engineering, Canadian International College, Cairo Campus of CBU, Egypt
b Ain Shams University, Faculty of Engineering, Computers & Systems Department, Egypt

Received 8 September 2013; revised 8 April 2014; accepted 19 April 2014

KEYWORDS: Sentiment analysis; Sentiment classification; Feature selection; Emotion detection; Transfer learning; Building resources

Abstract: Sentiment Analysis (SA) is an ongoing field of research in the text mining field. SA is the computational treatment of opinions, sentiments and subjectivity of text. This survey paper presents a comprehensive overview of the latest developments in this field. Many recently proposed algorithm enhancements and various SA applications are investigated and presented briefly in this survey. These articles are categorized according to their contributions to the various SA techniques. The fields related to SA (transfer learning, emotion detection, and building resources) that have attracted researchers recently are discussed. The main target of this survey is to give a nearly full picture of SA techniques and the related fields with brief details. The main contributions of this paper include the sophisticated categorization of a large number of recent articles and the illustration of the recent trend of research in sentiment analysis and its related areas.

© 2014 Production and hosting by Elsevier B.V. on behalf of Ain Shams University.

* Corresponding author. Address: School of Electronic Engineering, Canadian International College, 22 Emarate Khalf El-Obour, Masr Elgedida, Cairo, Egypt. Tel.: +20 24049568.
E-mail address: [email protected] (W. Medhat).
Peer review under responsibility of Ain Shams University.
Production and hosting by Elsevier.
2090-4479 © 2014 Production and hosting by Elsevier B.V. on behalf of Ain Shams University. http://dx.doi.org/10.1016/j.asej.2014.04.011
Please cite this article in press as: Medhat W et al., Sentiment analysis algorithms and applications: A survey, Ain Shams Eng J (2014), http://dx.doi.org/10.1016/j.asej.2014.04.011

1. Introduction

Sentiment Analysis (SA) or Opinion Mining (OM) is the computational study of people's opinions, attitudes and emotions toward an entity. The entity can represent individuals, events or topics; these topics are most likely to be covered by reviews. The two expressions SA and OM are interchangeable and express the same meaning. However, some researchers have stated that OM and SA have slightly different notions [1]: Opinion Mining extracts and analyzes people's opinion about an entity, while Sentiment Analysis identifies the sentiment expressed in a text and then analyzes it. Therefore, the target of SA is to find opinions, identify the sentiments they express, and then classify their polarity, as shown in Fig. 1.

Sentiment Analysis can be considered a classification process, as illustrated in Fig. 1. There are three main classification levels in SA: document-level, sentence-level, and aspect-level SA. Document-level SA aims to classify an opinion document as expressing a positive or negative opinion or sentiment; it considers the whole document a basic information unit (talking about one topic). Sentence-level SA aims to classify the sentiment expressed in each sentence. The first step is to identify whether the sentence is subjective or objective; if the sentence is subjective, sentence-level SA determines whether it expresses positive or negative opinions. Wilson et al. [2] have pointed out that sentiment expressions

are not necessarily subjective in nature. However, there is no fundamental difference between document- and sentence-level classifications, because sentences are just short documents [3].

Figure 1  Sentiment analysis process on product reviews.

Classifying text at the document level or at the sentence level does not provide the necessary detail about opinions on all aspects of the entity, which is needed in many applications; to obtain these details, we need to go to the aspect level. Aspect-level SA aims to classify the sentiment with respect to the specific aspects of entities. The first step is to identify the entities and their aspects. The opinion holders can give different opinions for different aspects of the same entity, as in this sentence: "The voice quality of this phone is not good, but the battery life is long". This survey tackles the first two kinds of SA.

The data sets used in SA are an important issue in this field. The main sources of data are product reviews. These reviews are important to business holders, as they can take business decisions according to the analysis results of users' opinions about their products. The review sources are mainly review sites. SA is not only applied to product reviews but can also be applied to stock markets [4,5], news articles [6] or political debates [7]. In political debates, for example, we could figure out people's opinions on certain election candidates or political parties. Election results can also be predicted from political posts. Social network sites and micro-blogging sites are considered a very good source of information because people share and discuss their opinions about a certain topic freely. They are also used as data sources in the SA process.

There are many applications and enhancements of SA algorithms that were proposed in the last few years. This survey aims to give a closer look at these enhancements and to summarize and categorize some articles presented in this field according to the various SA techniques. The authors have collected fifty-four articles which presented important enhancements to the SA field lately. These articles cover a wide variety of SA fields and were all published in the last few years. They are categorized according to the target of the article, illustrating the algorithms and data used in their work. According to Fig. 1, the authors have discussed the Feature Selection (FS) techniques in detail along with their related articles, referring to some originating references. The Sentiment Classification (SC) techniques, as shown in Fig. 2, are discussed in more detail, illustrating related articles and originating references as well.

This survey can be useful for newcomer researchers in this field, as it covers the most famous SA techniques and applications in one research paper. This survey uniquely gives a refined categorization of the various SA techniques which is not found in other surveys. It also discusses new related fields in SA which have attracted researchers lately, and their corresponding articles. These fields include Emotion Detection (ED), Building Resources (BR) and Transfer Learning (TL). Emotion detection aims to extract and analyze emotions, where the emotions could be explicit or implicit in the sentences. Transfer learning, or cross-domain classification, is concerned with analyzing data from one domain and then using the results in a target domain. Building resources aims at creating lexica, corpora in which opinion expressions are annotated according to their polarity, and sometimes dictionaries. In this paper, the authors give a closer look at these fields.

Numerous articles are presented every year in the SA fields, and their number is increasing through the years. This creates a need for survey papers that summarize the recent research trends and directions of SA. The reader can find some sophisticated and detailed surveys including [1,3,8–11]. Those surveys have discussed the problem of SA from the applications point of view, not from the SA techniques point of view.

Two long and detailed surveys were presented by Pang and Lee [8] and Liu [3]. They focused on the applications and challenges in SA and mentioned the techniques used to solve each problem in SA. Cambria and Schuller et al. [9], Feldman [10] and Montoyo and Martínez-Barco [11] have given short surveys illustrating the new trends in SA. Tsytsarau and Palpanas [1] have presented a survey which discussed the main topics of SA in detail. For each topic they have illustrated its definition, problems and development, and categorized the articles with the aid of tables and graphs. The analysis of the articles presented in this survey is similar to what was given by [1], but with another perspective and a different categorization of the articles.

The contribution of this survey is significant for many reasons. First, this survey provides a sophisticated categorization of a large number of recent articles according to the techniques used. This angle could help researchers who are familiar with certain techniques to use them in the SA field and choose the appropriate technique for a certain application. Second, the various techniques of SA are categorized with brief details of the algorithms and their originating references. This can help newcomers to the SA field to have a panoramic view of the entire field. Third, the available benchmark data sets are discussed and categorized according to their use in certain applications. Finally, the survey is enhanced by discussing the fields related to SA, including emotion detection, building resources and transfer learning.

This paper is organized as follows: Section 2 includes the survey methodology and a summary of the articles. Section 3 tackles the FS techniques and their related articles, and Section 4 discusses the various SC techniques and the corresponding articles. In Section 5, the related fields to SA and their corresponding articles are presented. Section 6 presents the results and discussions, and finally the conclusion and future trend in research are tackled in Section 7.


Figure 2 Sentiment classification techniques.

2. Methodology

The fifty-four articles presented in this survey are summarized in Table 1. Table 1 contains the articles' references [4–7] and [12–61]. The objectives of the articles are illustrated in the third column. They are divided into six categories: SA, ED, SC, FS, TL and BR. The BR category can be classified into lexica, corpora or dictionaries. The authors categorized the articles that solve the sentiment classification problem as SC. Other articles that solve a general sentiment analysis problem are categorized as SA. The articles that contribute to the feature selection phase are categorized as FS. The authors then categorized the articles that represent the SA-related fields: Emotion Detection (ED), Building Resources (BR) and Transfer Learning (TL).

The fourth column specifies whether the article is domain-oriented by means of Yes/No answers (Y or N). Domain-oriented means that domain-specific data are used in the SA process. The fifth column shows the algorithms used and specifies their categories as shown in Fig. 2. Some articles use algorithms other than the SC techniques which are presented in Section 4; this applies, for example, to the work presented by Steinberger [43]. In this case, only the algorithm name is written. The sixth column specifies whether the article uses SA techniques for general analysis of text (G) or solves the problem of binary classification (Positive/Negative). The seventh column illustrates the scope of the data used for evaluating the article's algorithms; the data could be reviews, news articles, web pages, micro-blogs and others. The eighth column specifies the benchmark data set or the well-known data source used, if available, as some articles do not give that information. This could help the reader who is interested in a certain scope of data. The last column specifies whether any languages other than English are analyzed in the article.

The survey methodology is as follows: a brief explanation of the famous FS and SC algorithms and of some fields related to SA is given; then the contribution of these articles to these algorithms is presented, illustrating how they use these algorithms to solve special problems in SA. The main target of this survey is to present a unique categorization of these SA-related articles.

3. Feature selection in sentiment classification

The Sentiment Analysis task is considered a sentiment classification problem. The first step in the SC problem is to extract and select text features. Some of the current features are [62]:

Terms presence and frequency: These features are individual words or word n-grams and their frequency counts. They either give the words binary weighting (one if the word appears, zero otherwise) or use term frequency weights to indicate the relative importance of features [63].

Parts of speech (POS): finding adjectives, as they are important indicators of opinions.

Opinion words and phrases: these are words commonly used to express opinions, including good or bad, like or hate. On the other hand, some phrases express opinions without using opinion words, for example: cost me an arm and a leg.

Negations: the appearance of negative words may change the opinion orientation: not good is equivalent to bad.

3.1. Feature selection methods

Feature selection methods can be divided into lexicon-based methods, which need human annotation, and statistical methods, which are automatic and more frequently used. Lexicon-based approaches usually begin with a small set of 'seed' words. They then bootstrap this set through synonym detection or on-line resources to obtain a larger lexicon. This proved to have many difficulties, as reported by Whitelaw et al. [64]. Statistical approaches, on the other hand, are fully automatic.

The feature selection techniques treat the documents either as a group of words (Bag of Words (BOWs)) or as a string


Table 1  Articles Summary.

| Ref. | Year | Task | Domain-oriented | Algorithms used | Polarity | Data scope | Data set/source | Other language |
|---|---|---|---|---|---|---|---|---|
| [12] | 2010 | SA | Y | Rule-Based | G | Web Forums | automotvieforums.com | — |
| [13] | 2010 | ED | N | Web-based, semantic labeling and rule-based | Pos/Neg | Web pages | N/A | — |
| [14] | 2010 | ED | N | Lexicon-based, semantic | G | Personal stories | experienceproject.com | — |
| [15] | 2010 | SC | N | Markov Blanket, SVM, NB, ME | Pos/Neg | Movie Reviews, News Articles | IMDB | — |
| [16] | 2010 | SC | N | Graph-Based approach, NB, SVM | Pos/Neg | Camera Reviews | Chinese Opinion Analysis Domain (COAE) | Chinese |
| [17] | 2010 | SC | Y | Graph-Based approach | Pos/Neg | Movie, Product Reviews | N/A | Chinese |
| [18] | 2010 | SA | N | Semantic, LSA-based | G | Software programs users' feedback | CNETD | — |
| [19] | 2010 | SC | Y | Weakly and semi-supervised classification | Pos/Neg | Movie Reviews, Multi-domain sentiment data set | IMDB, Amazon.com | — |
| [20] | 2011 | BR | Y | Random walk algorithm | G | Electronics, Stock, Hotel Reviews | Domain-specific Chinese corpus | Chinese |
| [21] | 2011 | TL | Y | Entropy-based algorithm | G | Education, Stock, Computer Reviews | Domain-specific Chinese data set | Chinese |
| [22] | 2011 | TL | Y | Ranking algorithm | G | Book, Hotel, Notebook Reviews | Domain-specific Chinese data set | Chinese |
| [23] | 2011 | SC | N | CRFs | Pos/Neg | Car, Hotel, Computer Reviews | N/A | Chinese |
| [24] | 2011 | TL | Y | SAR | G | Movie Reviews, QA | MPQA, RIMDB, CHES | — |
| [25] | 2011 | SA | N | 2-level CRF | G | Mobile Customer Reviews | amazon.com, epinions.com, blogs, SNS and emails in CRM | — |
| [26] | 2011 | SA | N | Multi-class SVM | G | Digital Cameras, MP3 Reviews | N/A | — |
| [27] | 2011 | SA | Y | SVM, Chi-square | G | Buyers' posts web pages | ebay.com, wikipedia.com, epinions.com | — |
| [28] | 2011 | SA | N | Semantic | G | Chinese training data | NTCIR7 MOAT | Chinese |
| [29] | 2011 | SC | N | Lexicon-based, semantic | Pos/Neg | Movie Reviews | IMDB | — |
| [30] | 2011 | SC | N | Statistical (MM), semantic | Pos/Neg | Product Reviews | amazon.com | — |
| [31] | 2011 | SA | N | Statistical | G | Book Reviews | amazon.com | — |
| [32] | 2011 | TL | Y | Shared learning approach | G | Social Media, News data | Blogspot, Flicker, youtube, CNN-BBC | — |
| [33] | 2012 | FS | N | Statistical (HMM-LDA), ME | Pos/Neg | Movie Reviews | N/A | — |
| [34] | 2012 | BR | Y | Semantic | G | Restaurant Reviews | N/A | Spanish |
| [35] | 2012 | SA | Y | Context-based method, NLP | G | Restaurant Reviews | N/A | — |
| [36] | 2012 | SC | N | NB, SVM | Pos/Neg | Restaurant Reviews | N/A | — |
| [37] | 2012 | SA | N | Lexicon-Based, NLP | G | News | N/A | — |
| [38] | 2012 | SA | N | PMI, semantic | G | Product Reviews | N/A | Chinese |
| [39] | 2012 | SA | N | NLP | G | Product Reviews | amazon.com | — |
| [40] | 2012 | SC | N | Semi-supervised, BN | G | Artificial data sets | N/A | — |
| [41] | 2012 | BR | Y | NLP | G | blogs | ISEAR | Spanish, Italian |
| [42] | 2012 | ED | Y | Corpus-based | G | Blogs data | Live Journals Blogs, Text Affect, Fairy tales, Annotated Blogs | — |
| [6] | 2012 | SA | N | S-HAL, SO-PMI | G | News Pages | Sogou CS corpus | Chinese |
| [43] | 2012 | BR | N | Triangulation | G | News Articles | sentiment Dictionaries | Other Latin, Arabic |
| [44] | 2012 | SC | Y | NB, SVM, rule-based | G | 2-sided debates | convinceme.net | — |
| [45] | 2012 | ED | N | Lexicon-Based, SVM | G | Emotions corpus | ISEAR, Emotinet | — |
| [7] | 2012 | SA | N | Semantic | Pos/Neg | Lexicons | Dutch wordnet | Dutch |
| [46] | 2012 | SA | N | SVM, K-nearest neighbor, NB, BN, DT, a Rule learner | Pos/Neg | Media | media-analysis company | — |
which retains the sequence of words in the document. BOW is used more often because of its simplicity for the classification process. The most common feature selection step is the removal of stop-words and stemming (returning the word to its stem or root, i.e. flies → fly).

In the next subsections, we present three of the most frequently used statistical methods in FS and their related articles. There are other methods used in FS, like information gain and the Gini index [62].

Table 1 (continued): further entries for Refs. [47]–[61], [5] and [4] (years 2012–2013; tasks ED, BR, SA, SC and FS).
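As a concrete illustration of the preprocessing step described above (stop-word removal and stemming), the following sketch uses a tiny hand-picked stop list and a naive suffix stemmer. Both are invented for illustration only; they are not resources used by the surveyed articles:

```python
# Sketch of the common preprocessing step: stop-word removal followed
# by a naive suffix stemmer. The stop list and suffix rules below are
# illustrative assumptions; real systems use curated resources.

STOP_WORDS = {"the", "is", "a", "an", "of", "to", "and", "are"}

def naive_stem(word):
    # Strip a few common English suffixes (e.g. "flies" -> "fly").
    for suffix, replacement in (("ies", "y"), ("ing", ""), ("es", ""), ("s", "")):
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return word[: -len(suffix)] + replacement
    return word

def preprocess(text):
    # Lowercase, drop stop-words, then stem the remaining tokens.
    tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
    return [naive_stem(t) for t in tokens]

print(preprocess("The flies are flying near the light"))
# ['fly', 'fly', 'near', 'light']
```

A production pipeline would typically swap the hand-written rules for a standard stemmer and stop list, but the structure of the step is the same.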

3.1.1. Point-wise Mutual Information (PMI)

The mutual information measure provides a formal way to model the mutual information between the features and the classes. This measure was derived from information theory [65]. The point-wise mutual information M_i(w) between the word w and the class i is defined on the basis of the level of co-occurrence between the class i and the word w. The expected co-occurrence of class i and word w, on the basis of mutual independence, is given by P_i \cdot F(w), and the true co-occurrence is given by F(w) \cdot p_i(w). The mutual information is defined in terms of the ratio between these two values and is given by the following equation:

M_i(w) = \log\left(\frac{F(w) \cdot p_i(w)}{F(w) \cdot P_i}\right) = \log\left(\frac{p_i(w)}{P_i}\right) \qquad (1)

The word w is positively correlated to the class i when M_i(w) is greater than 0, and negatively correlated to the class i when M_i(w) is less than 0.
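Eq. (1) can be computed directly from document counts. The sketch below estimates p_i(w) and P_i from a toy labeled corpus; the documents and labels are made up for illustration and are not data from any surveyed article:

```python
import math

# Toy evaluation of Eq. (1): M_i(w) = log(p_i(w) / P_i).

def pmi(docs, labels, word, cls):
    # p_i(w): fraction of documents containing `word` whose class is `cls`.
    with_word = [lab for doc, lab in zip(docs, labels) if word in doc]
    p_i_w = sum(1 for lab in with_word if lab == cls) / len(with_word)
    # P_i: global fraction of documents of class `cls`.
    P_i = sum(1 for lab in labels if lab == cls) / len(labels)
    # Note: log(0) is undefined when `word` never occurs with class `cls`.
    return math.log(p_i_w / P_i)

docs = [{"good", "movie"}, {"good", "plot"}, {"bad", "movie"}, {"boring", "plot"}]
labels = ["pos", "pos", "neg", "neg"]

print(pmi(docs, labels, "good", "pos"))   # log 2 > 0: positively correlated
print(pmi(docs, labels, "movie", "pos"))  # log 1 = 0: uncorrelated
```

Here "good" occurs only in positive documents, so its PMI with the positive class is positive, while "movie" is split evenly and scores zero.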
PMI is used in many applications, and there are some enhancements applied to it. PMI considers only the co-occurrence strength. Yu and Wu [4] have extended the basic PMI by developing a contextual entropy model to expand a set of seed words generated from a small corpus of stock market news articles. Their contextual entropy model measures the similarity between two words by comparing their contextual distributions using an entropy measure, allowing for the discovery of words similar to the seed words. Once the seed words have been expanded, both the seed words and the expanded words are used to classify the sentiment of the news articles. Their results showed that their method can discover more useful emotion words, whose corresponding intensity improves classification performance. Their method outperformed the PMI-based expansion methods, as it considers both co-occurrence strength and contextual distribution, thus acquiring more useful emotion words and fewer noisy words.

3.1.2. Chi-square (χ²)

Let n be the total number of documents in the collection, p_i(w) be the conditional probability of class i for documents which contain w, P_i be the global fraction of documents containing the class i, and F(w) be the global fraction of documents which contain the word w. Then the χ²-statistic between word w and class i is defined as [62]:

\chi_i^2(w) = \frac{n \cdot F(w)^2 \cdot (p_i(w) - P_i)^2}{F(w) \cdot (1 - F(w)) \cdot P_i \cdot (1 - P_i)} \qquad (2)

χ² and PMI are two different ways of measuring the correlation between terms and categories. χ² is better than PMI as it is a normalized value; therefore, these values are more comparable across terms in the same category [62].
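Eq. (2) is straightforward to evaluate once the four quantities are estimated. The sketch below plugs in made-up values; n, F(w), p_i(w) and P_i here are illustrative, not taken from any surveyed article:

```python
# Direct evaluation of Eq. (2), the chi-square statistic between a
# word w and a class i. The example values are invented for illustration.

def chi_square(n, F_w, p_i_w, P_i):
    # n: total documents; F_w: fraction of documents containing w;
    # p_i_w: P(class i | w present); P_i: global fraction of class i.
    numerator = n * F_w ** 2 * (p_i_w - P_i) ** 2
    denominator = F_w * (1 - F_w) * P_i * (1 - P_i)
    return numerator / denominator

# 100 documents, w appears in 20% of them, 90% of those are class i,
# and the classes are balanced overall:
print(chi_square(100, 0.2, 0.9, 0.5))  # approx. 16.0
```

Note that when p_i(w) equals P_i the statistic is zero, matching the intuition that the word then carries no information about the class.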


χ² is used in many applications; one of them is contextual advertising, as presented by Fan and Chang [27]. They discovered bloggers' immediate personal interests in order to improve online contextual advertising. They worked on real ads and actual blog pages from ebay.com, wikipedia.com and epinions.com. They used SVM (illustrated with details in the next section) for classification and χ² for FS. Their results showed that their method could effectively identify those ads that are positively correlated with a blogger's personal interests.

Hagenau and Liebmann [5] used feedback features by employing market feedback as part of their feature selection process regarding stock market data. They then used them with χ² and Bi-Normal Separation (BNS). They showed that a robust feature selection allows lifting classification accuracies significantly when combined with complex feature types. Their approach allows selecting semantically relevant features and reduces the problem of over-fitting when applying a machine learning approach. They used SVM as a classifier. Their results showed that the combination of advanced feature extraction methods and their feedback-based feature selection increases classification accuracy and allows improved sentiment analytics. This is because their approach reduces the number of less-explanatory features, i.e. noise, and limits negative effects of over-fitting when applying machine learning approaches to classify text messages.

3.1.3. Latent Semantic Indexing (LSI)

Feature selection methods attempt to reduce the dimensionality of the data by picking from the original set of attributes. Feature transformation methods create a smaller set of features as a function of the original set of features. LSI is one of the famous feature transformation methods [66]. The LSI method transforms the text space to a new axis system which is a linear combination of the original word features. Principal Component Analysis (PCA) techniques are used to achieve this goal [67]. It determines the axis system which retains the greatest level of information about the variations in the underlying attribute values. The main disadvantage of LSI is that it is an unsupervised technique which is blind to the underlying class distribution. Therefore, the features found by LSI are not necessarily the directions along which the class distribution of the underlying documents can be best separated [62].

There are other statistical approaches which could be used in FS, like the Hidden Markov Model (HMM) and Latent Dirichlet Allocation (LDA). They were used by Duric and Song [33] to separate the entities in a review document from the subjective expressions that describe those entities in terms of polarities; these were their proposed new feature selection schemes. LDA is a generative model that allows documents to be explained by unobserved (latent) topics. HMM-LDA is a topic model that simultaneously models topics and syntactic structures in a collection of documents [68]. The feature selection schemes proposed by Duric and Song [33] achieved competitive results for document polarity classification, especially when using only the syntactic classes and reducing the overlaps with the semantic words in their final feature sets. They worked on movie reviews and used the Maximum Entropy (ME) classifier (illustrated with details in the next section).

3.2. Challenging tasks in FS

A very challenging task in extracting features is irony detection. The objective of this task is to identify ironic reviews. This work was proposed by Reyes and Rosso [48]. They aimed to define a feature model in order to represent part of the subjective knowledge which underlies such reviews and attempts to describe salient characteristics of irony. They established a model to represent verbal irony in terms of six categories of features: n-grams, POS-grams, funny profiling, positive/negative profiling, affective profiling, and pleasantness profiling. They built a freely available data set with ironic reviews from news articles, satiric articles and customer reviews collected from amazon.com. These were posted on the basis of an online viral effect, i.e. contents that trigger a chain reaction in people. They used NB, SVM, and DT for classification purposes (illustrated with details in the next section). Their results with the three classifiers are satisfactory in terms of accuracy as well as precision, recall, and F-measure.

4. Sentiment classification techniques

Sentiment Classification techniques can be roughly divided into the machine learning approach, the lexicon-based approach and the hybrid approach [69]. The Machine Learning (ML) approach applies the famous ML algorithms and uses linguistic features. The lexicon-based approach relies on a sentiment lexicon, a collection of known and precompiled sentiment terms. It is divided into the dictionary-based approach and the corpus-based approach, which use statistical or semantic methods to find sentiment polarity. The hybrid approach combines both approaches and is very common, with sentiment lexicons playing a key role in the majority of methods. The various approaches and the most popular algorithms of SC are illustrated in Fig. 2, as mentioned before.

The text classification methods using the ML approach can be roughly divided into supervised and unsupervised learning methods. The supervised methods make use of a large number of labeled training documents. The unsupervised methods are used when it is difficult to find these labeled training documents.

The lexicon-based approach depends on finding the opinion lexicon which is used to analyze the text. There are two methods in this approach. The dictionary-based approach depends on finding opinion seed words and then searches the dictionary for their synonyms and antonyms. The corpus-based approach begins with a seed list of opinion words and then finds other opinion words in a large corpus to help in finding opinion words with context-specific orientations. This could be done by using statistical or semantic methods. A brief explanation of both approaches' algorithms and related articles is given in the next subsections.

4.1. Machine learning approach

The machine learning approach relies on the famous ML algorithms to solve SA as a regular text classification problem that makes use of syntactic and/or linguistic features.

Text Classification Problem Definition: We have a set of training records D = {X1, X2, ..., Xn} where each record is


labeled to a class. The classification model relates the features in the underlying record to one of the class labels. Then, for a given instance of unknown class, the model is used to predict a class label for it. The hard classification problem is when only one label is assigned to an instance. The soft classification problem is when a probabilistic value of labels is assigned to an instance.
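The distinction between hard and soft classification can be sketched with a toy opinion-word counter: the soft version returns a probabilistic value for every label, while the hard version commits to a single label. The lexicon and the add-one smoothing below are illustrative assumptions, not a method from the survey:

```python
# Toy contrast between soft classification (a probability per label)
# and hard classification (a single committed label). The opinion-word
# lexicon and the add-one smoothing are illustrative assumptions only.

LEXICON = {"good": "pos", "great": "pos", "bad": "neg", "awful": "neg"}

def soft_classify(tokens, lexicon=LEXICON):
    # Soft classification: assign a probabilistic value to every label.
    votes = {"pos": 1, "neg": 1}  # start at 1 so no label gets probability 0
    for tok in tokens:
        if tok in lexicon:
            votes[lexicon[tok]] += 1
    total = sum(votes.values())
    return {label: count / total for label, count in votes.items()}

def hard_classify(tokens, lexicon=LEXICON):
    # Hard classification: commit to exactly one label.
    probs = soft_classify(tokens, lexicon)
    return max(probs, key=probs.get)

print(soft_classify(["good", "great", "bad"]))  # {'pos': 0.6, 'neg': 0.4}
print(hard_classify(["good", "great", "bad"]))  # pos
```

Most probabilistic classifiers discussed below (e.g. Naïve Bayes) are soft classifiers by nature; a hard decision is obtained by taking the most probable label.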

4.1.1. Supervised learning


The supervised learning methods depend on the existence of labeled training documents. There are many kinds of supervised classifiers in the literature. In the next subsections, we present in brief detail some of the most frequently used classifiers in SA.

Figure 3  Using support vector machine on a classification problem.

4.1.1.1. Probabilistic classifiers. Probabilistic classifiers use
mixture models for classification. The mixture model assumes
that each class is a component of the mixture. Each mixture
component is a generative model that provides the probability
of sampling a particular term for that component. These kinds
of classifiers are also called generative classifiers. Three of the
most famous probabilistic classifiers are discussed in the next
subsections.

4.1.1.1.1. Naïve Bayes Classifier (NB). The Naïve Bayes
classifier is the simplest and most commonly used classifier.
The Naïve Bayes classification model computes the posterior
probability of a class based on the distribution of the words in
the document. The model works with the BOWs feature
extraction, which ignores the position of the word in the
document. It uses Bayes' Theorem to predict the probability
that a given feature set belongs to a particular label:

    P(label|features) = P(label) × P(features|label) / P(features)    (3)

P(label) is the prior probability of a label, or the likelihood
that a random feature set has the label. P(features|label) is the
prior probability that a given feature set is classified as a label.
P(features) is the prior probability that a given feature set
occurs. Given the naïve assumption, which states that all
features are independent, the equation can be rewritten as
follows:

    P(label|features) = P(label) × P(f1|label) × ... × P(fn|label) / P(features)    (4)

An improved NB classifier was proposed by Kang and Yoo
[36] to solve the problem of the tendency for the positive
classification accuracy to appear up to approximately 10%
higher than the negative classification accuracy. This creates a
problem of decreasing the average accuracy when the
accuracies of the two classes are expressed as an average
value. They showed that using this algorithm with restaurant
reviews narrowed the gap between the positive accuracy and
the negative accuracy compared to NB and SVM. The
accuracy is improved in recall and precision compared to both
NB and SVM.

4.1.1.1.2. Bayesian Network (BN). The main assumption of
the NB classifier is the independence of the features. The other
extreme assumption is to assume that all the features are fully
dependent. This leads to the Bayesian Network model, which is
a directed acyclic graph whose nodes represent random
variables and whose edges represent conditional dependencies.
BN is considered a complete model for the variables and their
relationships; therefore, a complete joint probability
distribution (JPD) over all the variables is specified for a
model. In text mining, the computation complexity of BN is
very expensive; that is why it is not frequently used [62].

BN was used by Hernández and Rodríguez [40] to consider
a real-world problem in which the attitude of the author is
characterized by three different (but related) target variables.
They proposed the use of multi-dimensional Bayesian network
classifiers, which join the different target variables in the same
classification task in order to exploit the potential relationships
between them. They extended the multi-dimensional
classification framework to the semi-supervised domain in
order to take advantage of the huge amount of unlabeled
information available in this context. They showed that their
semi-supervised multi-dimensional approach outperforms the
most common SA approaches, and that their classifier is the
best solution in a semi-supervised framework because it
matches the actual underlying domain structure.

4.1.1.1.3. Maximum Entropy Classifier (ME). The Maxent
classifier (also known as a conditional exponential classifier)
converts labeled feature sets to vectors using encoding. This
encoded vector is then used to calculate weights for each
feature that can then be combined to determine the most likely
label for a feature set. This classifier is parameterized by a set
of weights, which is used to combine the joint features that are
generated from a feature set by an encoding. In particular, the
encoding maps each (featureset, label) pair to a vector. The
probability of each label is then computed using the following
equation:

    P(fs|label) = dotprod(weights, encode(fs, label)) / sum(dotprod(weights, encode(fs, l)) for l in labels)    (5)

The ME classifier was used by Kaufmann [52] to detect
parallel sentences between any language pairs with small
amounts of training data. The other tools that were developed
to automatically extract parallel data from non-parallel
corpora use language-specific techniques or require large
amounts of


training data. Their results showed that ME classifiers can
produce useful results for almost any language pair. This can
allow the creation of parallel corpora for many new languages.

4.1.1.2. Linear classifiers. Given X = {x1, ..., xn}, the
normalized document word frequencies, A = {a1, ..., an}, a
vector of linear coefficients with the same dimensionality as
the feature space, and b, a scalar, the output of the linear
predictor is defined as p = A · X + b, which is the output of
the linear classifier. The predictor p is a separating hyperplane
between different classes. There are many kinds of linear
classifiers; among them is Support Vector Machines (SVM)
[70,71], a form of classifiers that attempt to determine good
linear separators between different classes. Two of the most
famous linear classifiers are discussed in the following
subsections.

4.1.1.2.1. Support Vector Machines Classifiers (SVM). The
main principle of SVMs is to determine linear separators in the
search space which can best separate the different classes. In
Fig. 3 there are two classes, x and o, and three hyperplanes, A,
B and C. Hyperplane A provides the best separation between
the classes, because the normal distance of any of the data
points is the largest, so it represents the maximum margin of
separation.

Text data are ideally suited for SVM classification because
of the sparse nature of text, in which few features are
irrelevant, but they tend to be correlated with one another and
generally organized into linearly separable categories [72].
SVM can construct a nonlinear decision surface in the original
feature space by mapping the data instances non-linearly to an
inner product space where the classes can be separated linearly
with a hyperplane [73].

SVMs are used in many applications, among them
classifying reviews according to their quality. Chen and Tseng
[26] have used two multiclass SVM-based approaches,
One-versus-All SVM and Single-Machine Multiclass SVM, to
categorize reviews. They proposed a method for evaluating the
quality of information in product reviews, considering it a
classification problem. They also adopted an information
quality (IQ) framework to find an information-oriented
feature set. They worked on digital camera and MP3 reviews.
Their results showed that their method can accurately classify
reviews in terms of their quality, and that it significantly
outperforms state-of-the-art methods.

SVMs were used by Li and Li [57] as a sentiment polarity
classifier. Unlike in the binary classification problem, they
argued that opinion subjectivity and expresser credibility
should also be taken into consideration. They proposed a
framework that provides a compact numeric summarization
of opinions on micro-blog platforms. They identified and
extracted the topics mentioned in the opinions associated with
the queries of users, and then classified the opinions using
SVM. They worked on Twitter posts for their experiment.
They found out that the consideration of user credibility and
opinion subjectivity is essential for aggregating micro-blog
opinions. They proved that their mechanism can effectively
discover market intelligence (MI) for supporting
decision-makers by establishing a monitoring system to track
external opinions on different aspects of a business in real time.

4.1.1.2.2. Neural Network (NN). A Neural Network consists
of many neurons, where the neuron is its basic unit. The inputs
to the neurons are denoted by the vector Xi, which is the
word frequencies in the ith document. There is a set of
weights A associated with each neuron, used in order to
compute a function of its inputs f(). The linear function of
the neural network is pi = A · Xi. In a binary classification
problem, it is assumed that the class label of Xi is denoted
by yi, and the sign of the predicted function pi yields the class
label.

Multilayer neural networks are used for non-linear
boundaries. These multiple layers are used to induce multiple
piecewise linear boundaries, which are used to approximate
enclosed regions belonging to a particular class. The outputs
of the neurons in the earlier layers feed into the neurons in
the later layers. The training process is more complex because
the errors need to be back-propagated over different layers.
Implementations of NNs for text data are found in [74,75].

There is an empirical comparison between SVM and
artificial neural networks (ANNs) presented by Moraes and
Valiati [53] regarding document-level sentiment analysis. They
made this comparison because SVM has been widely and
successfully used in SA, while ANNs have attracted little
attention as an approach for sentiment learning. They
discussed the requirements, resulting models and contexts in
which both approaches achieve better levels of classification
accuracy. They also adopted a standard evaluation context
with popular supervised methods for feature selection and
weighting in a traditional BOWs model. Their experiments
indicated that ANN produced superior results to SVM except
for some unbalanced data contexts. They tested three
benchmark data sets on movie, GPS, camera and book
reviews from amazon.com. They proved that in the
experiments on movie reviews, ANN outperformed SVM by a
statistically significant difference. They confirmed some
potential limitations of both models which have been rarely
discussed in the SA literature, like the computational cost of
SVM at running time and of ANN at training time. They
proved that using information gain (a computationally cheap
feature selection method) can reduce the computational effort
of both ANN and SVM without significantly affecting the
resulting classification accuracy.

SVM and NN can also be used for the classification of
personal relationships in biographical texts, as presented by
van de Camp and van den Bosch [47]. They marked relations
between two persons (one being the topic of a biography, the
other being mentioned in this biography) as positive, neutral,
or unknown. Their case study was based on historical
biographical information describing people in a particular
domain, region and time frame. They showed that their
classifiers were able to label these relations above a majority
class baseline score. They found that a training set containing
relations surrounding multiple persons produces more
desirable results than a set that focuses on one specific entity.
They proved that SVM and the one-layer NN (1-NN)
algorithm achieve the highest scores.

4.1.1.3. Decision tree classifiers. The decision tree classifier
provides a hierarchical decomposition of the training data
space in which a condition on the attribute value is used to
divide the data [76]. The condition or predicate is the presence
or absence of one or more words. The division of the data
space is done recursively until the leaf nodes contain certain
minimum numbers of records which are used for the purpose
of classification.
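The linear decision rule discussed above (p = A · X + b, with the sign of p giving the class label) can be sketched with the classic perceptron training update; the toy reviews, function names and learning rate below are illustrative assumptions, not data or code from the surveyed papers.

```python
# Sketch of the linear decision rule p = A·X + b over bag-of-words vectors,
# trained with the classic perceptron update. Toy data for illustration only.

def train_perceptron(docs, labels, epochs=20, lr=1.0):
    """docs: bag-of-words frequency dicts; labels: +1 (positive) / -1 (negative)."""
    vocab = sorted({w for d in docs for w in d})
    A = {w: 0.0 for w in vocab}   # vector of linear coefficients
    b = 0.0                       # scalar bias
    for _ in range(epochs):
        for d, yi in zip(docs, labels):
            p = sum(A[w] * f for w, f in d.items()) + b   # p = A·X + b
            if (1 if p > 0 else -1) != yi:                # sign(p) is the class label
                for w, f in d.items():                    # nudge the hyperplane
                    A[w] += lr * yi * f
                b += lr * yi
    return A, b

def predict(A, b, doc):
    p = sum(A.get(w, 0.0) * f for w, f in doc.items()) + b
    return 1 if p > 0 else -1

docs = [{"good": 2, "movie": 1}, {"bad": 2, "movie": 1},
        {"good": 1, "plot": 1}, {"bad": 1, "plot": 1}]
labels = [1, -1, 1, -1]
A, b = train_perceptron(docs, labels)
print(predict(A, b, {"good": 1, "movie": 1}))   # prints 1 (positive side)
```

Note that the perceptron stops at any separating hyperplane; an SVM, by contrast, picks the maximum-margin separator among them (hyperplane A in Fig. 3).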


There are other kinds of predicates which depend on the
similarity of documents to correlate sets of terms, which may
be used for further partitioning of documents. The different
kinds of splits are: the single attribute split, which uses the
presence or absence of particular words or phrases at a
particular node in the tree in order to perform the split [77];
the similarity-based multi-attribute split, which uses
documents or frequent word clusters and the similarity of the
documents to these word clusters in order to perform the split;
and the discriminant-based multi-attribute split, which uses
discriminants such as the Fisher discriminant for performing
the split [78].

The decision tree implementations in text classification tend
to be small variations on standard packages such as ID3 and
C4.5. Li and Jain [79] have used the C5 algorithm, which is a
successor to the C4.5 algorithm. Depending on the concept of
a tree, an approach was proposed by Hu and Li [17] to mine
the content structures of topical terms in sentence-level
contexts by using the Maximum Spanning Tree (MST)
structure to discover the links among the topical term "t" and
its context words. Accordingly, they developed the so-called
Topical Term Description Model for sentiment classification.
In their definition, "topical terms" are those specified entities
or certain aspects of entities in a particular domain. They
introduced an automatic extraction of topical terms from text
based on their domain term-hood. Then, they used these
extracted terms to differentiate document topics. This
structure conveys sentiment information. Their approach is
different from the regular machine learning tree algorithms
but is able to learn the positive and negative contextual
knowledge effectively.

A graph-based approach was presented by Yan and Bing
[16]. They have presented a propagation approach to
incorporate the inside and outside sentence features. These
two sentence features are intra-document evidence and
inter-document evidence. They said that determining the
sentiment orientation of a review sentence requires more than
the features inside the sentence itself. They have worked on
the camera domain and compared their method to both
unsupervised and supervised approaches (NB, SVM). Their
results showed that their proposed approach performs better
than both approaches without using outside sentence features
and outperforms other representative previous approaches.

4.1.1.4. Rule-based classifiers. In rule-based classifiers, the
data space is modeled with a set of rules. The left-hand side
represents a condition on the feature set expressed in
disjunctive normal form, while the right-hand side is the class
label. The conditions are on term presence. Term absence is
rarely used because it is not informative in sparse data.

There are a number of criteria for generating rules; the
training phase constructs all the rules depending on these
criteria. The two most common criteria are support and
confidence [80]. The support is the absolute number of
instances in the training data set which are relevant to the
rule. The confidence refers to the conditional probability that
the right-hand side of the rule is satisfied if the left-hand side
is satisfied. Some combined rule algorithms were proposed in
[113].

Both decision trees and decision rules tend to encode rules
on the feature space, but the decision tree tends to achieve this
goal with a hierarchical approach. Quinlan [76] has studied the
decision tree and decision rule problems within a single
framework, as a certain path in the decision tree can be
considered a rule for classification of the text instance. The
main difference between the decision trees and the decision
rules is that DT is a strict hierarchical partitioning of the data
space, while rule-based classifiers allow for overlaps in the
decision space.

4.1.2. Weakly, semi and unsupervised learning

The main purpose of text classification is to classify
documents into a certain number of predefined categories. In
order to accomplish that, a large number of labeled training
documents are used for supervised learning, as illustrated
before. In text classification, it is sometimes difficult to create
these labeled training documents, but it is easy to collect
unlabeled documents. The unsupervised learning methods
overcome these difficulties. Many research works were
presented in this field, including the work presented by Ko
and Seo [81]. They proposed a method that divides the
documents into sentences and categorizes each sentence using
keyword lists of each category and a sentence similarity
measure.

The concept of weak and semi-supervision is used in many
applications. Youlan and Zhou [19] have proposed a strategy
that works by providing weak supervision at the level of
features rather than instances. They obtained an initial
classifier by incorporating prior information extracted from
an existing sentiment lexicon into sentiment classifier model
learning. They refer to prior information as labeled features
and use them directly to constrain the model's predictions on
unlabeled instances using generalized expectation criteria. In
their work, they were able to identify domain-specific polarity
words, clarifying the idea that the polarity of a word may
differ from one domain to another. They worked on movie
reviews and a multi-domain sentiment data set from IMDB
and amazon.com. They showed that their approach attained
better performance than other weakly supervised sentiment
classification methods and is applicable to any text
classification task where some relevant prior knowledge is
available.

The unsupervised approach was used too by Xianghua and
Guo [50] to automatically discover the aspects discussed in
Chinese social reviews and also the sentiments expressed in
different aspects. They used the LDA model to discover
multi-aspect global topics of social reviews, then they
extracted the local topic and associated sentiment based on a
sliding-window context over the review text. They worked on
social reviews that were extracted from a blog data set
(2000-SINA) and a lexicon (300-SINA Hownet). They showed
that their approach obtained good topic partitioning results
and helped to improve SA accuracy. It also helped to
discover multi-aspect fine-grained topics and associated
sentiment.

There are other unsupervised approaches that depend on
semantic orientation using PMI [82] or lexical association
using PMI, semantic spaces, and distributional similarity to
measure the similarity between words and polarity prototypes
[83].

4.1.3. Meta classifiers

In many cases, researchers use one kind or more of
classifiers to test their work. One of these articles is the work
proposed by Lane and Clarke [46]. They presented a ML
approach to solve the problem of locating documents carrying
positive or negative favorability within media analysis. The
imbalance in the distribution of positive and negative samples,
changes in the documents over time, and effective training and
evaluation procedures for the models are the challenges they
faced to reach their goal. They worked on three data sets


generated by a media-analysis company. They classified
documents in two ways: detecting the presence of favorability,
and assessing negative vs. positive favorability. They have
used five different types of features to create the data sets from
the raw text. They tested many classifiers to find the best one
(SVM, K-nearest neighbor, NB, BN, DT, a rule learner and
others). They showed that balancing the class distribution in
training data can be beneficial in improving performance, but
NB can be adversely affected.

Applying ML algorithms on streaming data from Twitter
was investigated by Rui and Liu [56]. In their work, they were
investigating whether and how Twitter word of mouth
(WOM) affects movie sales by estimating a dynamic panel
data model. They used NB and SVM for classification
purposes. Their main contribution was classifying the tweets
taking into consideration the unique characteristics of tweets.
They distinguished between pre-consumer opinions (from
those who have not bought the product yet) and
post-consumer opinions (from those who bought the
product). They worked on the benchmark movie reviews and
Twitter data. They collected Twitter WOM data using the
Twitter API and movie sales data from BoxOfficeMojo.com.
Their results suggest that the effect of WOM on product sales
from Twitter users with more followers is significantly larger
than that from Twitter users with fewer followers. They found
that the effect of pre-consumption WOM on movie sales is
larger than that of post-consumption WOM.

Another article compared many classifiers after applying a
statistical Markov-model-based classifier. This was to capture
the dependencies among words and provide a vocabulary that
enhanced the predictive performance of several popular
classifiers. This was presented by Bai [15], who has presented
a two-stage prediction algorithm. In the first stage, his
classifier learned conditional dependencies among the words
and encoded them into a Markov Blanket Directed Acyclic
Graph for the sentiment variable. In the second stage, he used
a meta-heuristic strategy to fine-tune his algorithm to yield a
higher cross-validated accuracy. He has worked on two
collections of online movie reviews from IMDB and three
collections of online news, then compared his algorithm with
SVM, NB, ME and others. He illustrated that his method was
able to identify a parsimonious set of predictive features and
obtained better prediction results about sentiment
orientations compared to other methods. His results suggested
that sentiments are captured by conditional dependencies
among words as well as by keywords or high-frequency
words. The complexity of his model is linear in the number of
samples.

Supervised and unsupervised approaches can be combined
together. This was done by Valdivia and Cámara [54]. They
proposed the use of meta-classifiers in order to develop a
polarity classification system. They worked on a Spanish
corpus of film reviews along with its parallel corpus translated
into English (MCE). First, they generated two individual
models using these two corpora, applying machine learning
algorithms (SVM, NB, C4.5 and others). Second, they
integrated the SentiWordNet sentiment corpus into the
English corpus, generating a new unsupervised model using
the semantic orientation approach. Third, they combined the
three systems using a meta-classifier. Their results
outperformed the results of using an individual corpus and
showed that their approach could be considered a good
strategy for polarity classification when parallel corpora are
available.

ML classifiers are used by Walker and Anand [44] to
classify stance. Stance is defined as an overall position held by
a person towards an object, idea or position [84]. Stance is
similar to a point of view or perspective; it can be seen as
identifying the "side" that a speaker is on, e.g. for or against
political decisions. Walker and Anand [44] have classified the
stance that people hold and applied this to political debates.
They utilized 104 two-sided debates from convinceme.net for
14 different debate topics and tried to identify the stance or
attitude of the speakers. Their main target was to determine
the potential contribution of contextual dialogue features to
debate-side classification performance. The main effect for
context appears when comparing their results with no context
to those with context, where only 5 feature-topic pairs show a
decrease from no context to context. They used SVM, NB
and a rule-based classifier for classification purposes. They
achieved debate-side classification accuracies, on a per-topic
basis, higher than the unigram baselines when using sentiment,
subjectivity, dependency and dialogic features.

4.2. Lexicon-based approach

Opinion words are employed in many sentiment classification
tasks. Positive opinion words are used to express some desired
states, while negative opinion words are used to express some
undesired states. There are also opinion phrases and idioms,
which together are called the opinion lexicon. There are three
main approaches to compiling or collecting the opinion word
list. The manual approach is very time consuming and is not
used alone; it is usually combined with the other two
automated approaches as a final check to avoid the mistakes
that result from automated methods. The two automated
approaches are presented in the following subsections.

4.2.1. Dictionary-based approach

[85,86] presented the main strategy of the dictionary-based
approach. A small set of opinion words is collected manually
with known orientations. Then, this set is grown by searching
the well-known corpora WordNet [87] or thesaurus [88] for
their synonyms and antonyms. The newly found words are
added to the seed list, then the next iteration starts. The
iterative process stops when no new words are found. After
the process is completed, manual inspection can be carried
out to remove or correct errors.

The dictionary-based approach has a major disadvantage,
which is the inability to find opinion words with domain- and
context-specific orientations. Qiu and He [12] used the
dictionary-based approach to identify sentiment sentences in
contextual advertising. They proposed an advertising strategy
to improve ad relevance and user experience. They used
syntactic parsing and a sentiment dictionary and proposed a
rule-based approach to tackle topic word extraction and
consumers' attitude identification in advertising keyword
extraction. They worked on web forums from
automotvieforums.com. Their results demonstrated the
effectiveness of the proposed approach on advertising
keyword extraction and ad selection.

4.2.2. Corpus-based approach

The corpus-based approach helps to solve the problem of
finding opinion words with context-specific orientations. Its
methods depend on syntactic patterns or patterns that occur


together along with a seed list of opinion words to find other
opinion words in a large corpus. One of these methods was
presented by Hatzivassiloglou and McKeown [89]. They
started with a list of seed opinion adjectives, and used them
along with a set of linguistic constraints to identify additional
adjective opinion words and their orientations. The
constraints are for connectives like AND, OR, BUT and
EITHER-OR; the conjunction AND, for example, says that
conjoined adjectives usually have the same orientation. This
idea is called sentiment consistency, which is not always
consistent in practice. There are also adversative expressions,
such as but and however, which indicate opinion changes. In
order to determine whether two conjoined adjectives have the
same or different orientations, learning is applied to a large
corpus. Then, the links between adjectives form a graph, and
clustering is performed on the graph to produce two sets of
words: positive and negative.

The Conditional Random Fields (CRFs) method [90] was
used as a sequence learning technique for extracting opinion
expressions. It was used too by Jiaoa and Zhoua [23] in order
to discriminate sentiment polarity by a multi-string pattern
matching algorithm. Their algorithm was applied on Chinese
online reviews. They established many emotional dictionaries.
They worked on car, hotel and computer online reviews.
Their results showed that their method achieved high
performance. Xu and Liao [25] have used a two-level CRF
model with unfixed interdependencies to extract comparative
relations. This was done by utilizing the complicated
dependencies between relations, entities and words, and the
unfixed interdependencies among relations. Their purpose
was to make a graphical model to extract and visualize
comparative relations between products from customer
reviews. They displayed the results as comparative relation
maps for decision support in enterprise risk management.
They worked on mobile customer reviews from amazon.com,
epinions.com, blogs, SNS and emails. Their results showed
that their method can extract comparative relations more
accurately than other methods, and their comparative relation
map is potentially a very effective tool to support enterprise
risk management and decision making.

A taxonomy-based approach for extracting feature-level
opinions and mapping them into a feature taxonomy was
proposed by Cruz and Troyano [60]. This taxonomy is a
semantic representation of the opinionated parts and
attributes of an object. Their main target was a
domain-oriented OM. They defined a set of domain-specific
resources which capture valuable knowledge about how
people express opinions on a given domain. They used
resources which were automatically induced from a set of
annotated documents. They worked on three different
domains (headphone, hotel and car reviews) from
epinions.com. They compared their approach to other
domain-independent techniques. Their results proved the
importance of the domain in building accurate opinion
extraction systems, as it led to an improvement of accuracy
with respect to the domain-independent approaches.

Using the corpus-based approach alone is not as effective as
the dictionary-based approach, because it is hard to prepare a
huge corpus covering all English words, but this approach has
a major advantage: it can help to find domain- and
context-specific opinion words and their orientations using a
domain corpus. The corpus-based approach is performed
using a statistical or a semantic approach, as illustrated in the
following subsections:

4.2.2.1. Statistical approach. Finding co-occurrence patterns
or seed opinion words can be done using statistical techniques.
This could be done by deriving posterior polarities using the
co-occurrence of adjectives in a corpus, as proposed by Fahrni
and Klenner [91]. It is possible to use the entire set of indexed
documents on the web as the corpus for the dictionary
construction. This overcomes the problem of the
unavailability of some words if the used corpus is not large
enough [82].

The polarity of a word can be identified by studying the
occurrence frequency of the word in a large annotated corpus
of texts [83]. If the word occurs more frequently among
positive texts, then its polarity is positive. If it occurs more
frequently among negative texts, then its polarity is negative.
If it has equal frequencies, then it is a neutral word.

Similar opinion words frequently appear together in a
corpus. This is the main observation that the state-of-the-art
methods are based on: if two words appear together frequently
within the same context, they are likely to have the same
polarity. Therefore, the polarity of an unknown word can be
determined by calculating the relative frequency of
co-occurrence with another word. This could be done using
PMI [82].

Statistical methods are used in many applications related to
SA. One of them is detecting review manipulation by
conducting a statistical test of randomness called the runs test.
Hu and Bose [31] expected that the writing style of the reviews
would be random, due to the various backgrounds of the
customers, if the reviews were actually written by customers.
They worked on book reviews from amazon.com and
discovered that around 10.3% of the products are subject to
online review manipulation.

Latent Semantic Analysis (LSA) is a statistical approach
which is used to analyze the relationships between a set of
documents and the terms mentioned in these documents in
order to produce a set of meaningful patterns related to the
documents and terms [66]. Cao and Duan [18] have used LSA
to find the semantic characteristics from review texts to
examine the impact of the various features. The objective of
their work is to understand why some reviews receive many
helpfulness votes, while others receive few or no votes at all.
Therefore, instead of predicting a helpfulness level for reviews
that have no votes, they investigated the factors that
determine the number of helpfulness votes which a particular
review receives (including both "yes" and "no" votes). They
worked on software program users' feedback from CNET
Download.com. They showed that the semantic characteristics
are more influential than other characteristics in affecting how
many helpfulness votes reviews receive.

Semantic orientation of a word is a statistical approach used
along with the PMI method. There is also an implementation of
a semantic space called Hyperspace Analogue to Language
(HAL), which was proposed by Lund and Burgess [93]. A
semantic space is a space in which words are represented by
points; the position of each point along each axis is somehow
related to the meaning of the word. Xu and Peng [6] have
developed an approach based on HAL called Sentiment
Hyperspace Analogue to Language (S-HAL). In their model,
the semantic orientation information of words is characterized
by a specific vector space, and then a classifier was trained to
identify the semantic orientation of terms (words or phrases).
The hypothesis was verified by the method of semantic
orientation inference from PMI (SO-PMI). Their approach
produced a set of

Please cite this article in press as: Medhat W et al., Sentiment analysis algorithms and applications: A survey, Ain Shams Eng J (2014), http://
dx.doi.org/10.1016/j.asej.2014.04.011
12 W. Medhat et al.

weighted features based on surrounding words. They worked on news pages and used a Chinese corpus. Their results showed that they outperformed SO-PMI and showed advantages in modeling semantic orientation characteristics when compared with the original HAL model.

4.2.2.2. Semantic approach. The semantic approach gives sentiment values directly and relies on different principles for computing the similarity between words; these principles give similar sentiment values to semantically close words. WordNet, for example, provides different kinds of semantic relationships between words that can be used to calculate sentiment polarities. WordNet can also be used to obtain a list of sentiment words by iteratively expanding an initial seed set with synonyms and antonyms, and then determining the sentiment polarity of an unknown word by the relative count of its positive and negative synonyms [86].

The semantic approach is used in many applications to build a lexicon model for the description of verbs, nouns and adjectives to be used in SA, as in the work presented by Maks and Vossen [7]. Their model described the detailed subjectivity relations among the actors in a sentence, expressing separate attitudes for each actor. These subjectivity relations are labeled with information concerning both the identity of the attitude holder and the orientation (positive vs. negative) of the attitude. Their model included a categorization into semantic categories relevant to SA, and provided means for identifying the attitude holder and the polarity of the attitude, as well as for describing the emotions and sentiments of the different actors involved in the text. They used Dutch WordNet in their work. Their results showed that the speaker's subjectivity, and sometimes the actor's subjectivity, can be reliably identified.

The semantics of electronic WOM (eWOM) content were used for eWOM content analysis as proposed by Pai and Chu [59]. They extracted both positive and negative appraisals to help consumers in their decision making. Their method can be utilized as a tool to assist companies in better understanding product or service appraisals, and accordingly in translating these opinions into business intelligence to be used as the basis for product/service improvements. They worked on Taiwanese fast food reviews. Their results showed that their approach is effective in providing eWOM appraisals related to services and products.

Semantic methods can be mixed with statistical methods to perform the SA task, as in the work presented by Zhang and Xu [38], who used both methods to find product weaknesses from online reviews. Their weakness finder extracted and grouped explicit features by using a morpheme-based method to identify feature words in the reviews. They used a HowNet-based similarity measure to find the frequent and infrequent explicit features which describe the same aspect, and identified implicit features with the collocation statistics-based selection method PMI. They grouped product feature words into corresponding aspects by applying semantic methods, and utilized a sentence-based SA method to determine the polarity of each aspect in sentences, taking into consideration the impact of adverbs of degree. They could then find the weaknesses of a product, as the aspect about which customers are most dissatisfied in reviews, or the aspect that compares less favorably with a competitor's product reviews. Their results demonstrated the good performance of the weakness finder.

4.2.3. Lexicon-based and natural language processing techniques

Natural Language Processing (NLP) techniques are sometimes used with the lexicon-based approach to find the syntactical structure and help in finding the semantic relations [94]. Moreo and Romero [37] have used NLP techniques as a preprocessing stage before applying their proposed lexicon-based SA algorithm. Their proposed system consists of an automatic focus detection module and a sentiment analysis module capable of assessing user opinions on topics in news items; it uses a taxonomy-lexicon that is specifically designed for news analysis. Their results were promising in scenarios where colloquial language predominates.

The approach for SA presented by Caro and Grella [35] was based on a deep NLP analysis of the sentences, using dependency parsing as a pre-processing step. Their SA algorithm relied on the concept of Sentiment Propagation, which assumed that each linguistic element, like a noun or a verb, can have an intrinsic sentiment value that is propagated through the syntactic structure of the parsed sentence. They presented a set of syntactic-based rules that aimed to cover a significant part of the sentiment salience expressed by a text. They also proposed a data visualization system in which some data objects had to be filtered out, or the data contextualized, so that only the information relevant to a user query is shown to the user. To accomplish that, they presented a context-based method to visualize opinions by measuring the distance, in the textual appraisals, between the query and the polarity of the words contained in the texts themselves, and they extended their algorithm by computing context-based polarity scores. Their approach proved highly efficient when applied to a manually annotated corpus of 100 restaurant reviews.

Min and Park [39] have used NLP from a different perspective. They used NLP techniques to identify tense and time expressions, along with mining techniques and a ranking algorithm. Their proposed metric has two parameters that capture time expressions related to the use of products and product entities over different purchasing time periods. They identified important linguistic clues for the parameters through an experiment with crawled review data, with the aid of NLP techniques. They worked on product reviews from amazon.com. Their results showed that their metric was helpful and free from undesirable biases.

4.2.3.1. Discourse information. The importance of discourse in SA has been increasing recently. Discourse information can be found either among sentences or among clauses in the same sentence. Sentiment annotation at the discourse level was studied in [95,96]. Asher et al. [95] have used five types of rhetorical relations: Contrast, Correction, Support, Result, and Continuation, with attached sentiment information for annotation. Somasundaran et al. [96] have proposed a concept called the opinion frame, whose components are opinions and the relationships between their targets [3]. They later enhanced their work and investigated design choices in modeling a discourse scheme for improving sentiment classification [97].

Rhetorical Structure Theory (RST) [98] describes how to split a text into spans, each representing a meaningful part of the text. Heerschop et al. [29] have proposed a framework that performed document SA (partly) based on a document's discourse structure, which was obtained by applying RST at the sentence level. They hypothesized that they can improve the


performance of a sentiment classifier by splitting a text into important and less important text spans. They used a lexicon-based approach for the classification of movie reviews. Their results showed an improvement in SC accuracy compared to a baseline that does not take discourse structure into account.

A novel unsupervised approach for discovering intra-sentence level discourse relations for eliminating polarity ambiguities was presented by Zhou et al. [28]. First, they defined a discourse scheme with discourse constraints on polarity based on RST. Then, they utilized a small set of cue phrase-based patterns to collect a large number of discourse instances, which were converted to semantic sequential representations (SSRs). Finally, they adopted an unsupervised method to generate, weigh and filter new SSRs without cue phrases for recognizing discourse relations. They worked on Chinese training data. Their results showed that the proposed methods effectively recognized the defined discourse relations and achieved significant improvement.

Zirn et al. [30] have presented a fully automatic framework for fine-grained SA at the sub-sentence level, combining multiple sentiment lexicons with neighborhood and discourse relations. They used Markov logic to integrate polarity scores from different sentiment lexicons with information about relations between neighboring segments. They worked on product reviews. Their results showed that the use of structural features improved the accuracy of polarity predictions, achieving accuracy scores of up to 69%.

The usefulness of RST in large-scale polarity ranking of blog posts was explored by Chenlo et al. [61]. They applied sentence-level methods to select the key sentences that conveyed the overall on-topic sentiment of a blog post. Then, they applied RST analysis to these core sentences to guide the classification of their polarity and thus to generate an overall estimation of the document polarity with respect to a specific topic. They discovered that bloggers tend to express their sentiment more apparently in elaborating and attributing text segments rather than in the core of the text itself. Their results showed that RST provides valuable information about the discourse structure of texts that can be used to make a more accurate ranking of documents in terms of their estimated sentiment in multi-topic blogs.

4.3. Other techniques

There are techniques that cannot readily be categorized as ML or lexicon-based approaches. Formal Concept Analysis (FCA) is one of those techniques. FCA was proposed by Wille [99] as a mathematical approach for structuring, analyzing and visualizing data, based on a notion of duality called a Galois connection [100]. The data consist of a set of entities and their features, structured into formal abstractions called formal concepts; together these form a concept lattice ordered by a partial order relation. The concept lattices are constructed by identifying the objects and their corresponding attributes for a specific domain, called conceptual structures, and then displaying the relationships among them. Fuzzy Formal Concept Analysis (FFCA) was developed in order to deal with uncertainty and unclear information, and it has been successfully applied in various information domain applications [101].

FCA and FFCA were used in many SA applications, as presented by Li and Tsai [51]. In their work they proposed a classification framework based on FFCA to conceptualize documents into a more abstract form of concepts. They used training examples to improve the arbitrary outcomes caused by ambiguous terms, and used FFCA to train a classifier on concepts instead of documents in order to reduce the inherent ambiguities. They worked on a benchmark test bed (Reuters 21578) and two opinion polarity data sets of movie and eBook reviews. Their results indicated superior performance on all data sets and proved the framework's ability to decrease sensitivity to noise, as well as its adaptability in cross-domain applications.

Kontopoulos et al. [55] have also used FCA, in their case to build an ontology domain model. In their work, they proposed the use of ontology-based techniques toward a more efficient sentiment analysis of Twitter posts by breaking down each tweet into a set of aspects relevant to the subject. They worked on the domain of smart phones. Their architecture gives a more detailed analysis of post opinions regarding a specific topic, as it distinguishes the features of the domain and assigns respective scores to them.

Other concept-level sentiment analysis systems have been developed recently. Mudinas et al. [114] have presented the anatomy of pSenti, a concept-level sentiment analysis system that integrates the lexicon-based and learning-based approaches to opinion mining. Their system achieved higher accuracy in sentiment polarity classification as well as sentiment strength detection compared with pure lexicon-based systems. They worked on two real-world data sets (CNET software reviews and IMDb movie reviews). Their hybrid approach outperformed state-of-the-art systems like SentiStrength.

Cambria and Havasi have introduced SenticNet 2 in [115]. They developed SenticNet 2, a publicly available semantic and affective resource for opinion mining and sentiment analysis, in order to bridge the cognitive and affective gap between word-level natural language data and the concept-level sentiments conveyed by them. Their system was built by means of sentic computing, a paradigm that exploits both Artificial Intelligence and Semantic Web techniques. They showed that their system can easily be embedded in real-world applications in order to effectively combine and compare structured and unstructured information.

Concept-level sentiment analysis systems have also been used in other applications such as e-health, including patients' opinion analysis [116] and crowd validation [117].

5. Related fields to sentiment analysis

There are some topics that work under the umbrella of SA and have attracted researchers recently. In the next subsections, three of these topics are presented in some detail with related articles.

5.1. Emotion detection

Sentiment analysis is sometimes considered an NLP task for discovering opinions about an entity. Because there is some ambiguity about the difference between opinion, sentiment and emotion, opinion has been defined as a transitional concept that reflects attitude towards an entity, where sentiment reflects feeling or emotion, and emotion reflects attitude [1].
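The distinction drawn above between coarse sentiment polarity and emotion categories can be made concrete with a small lexicon-based sketch. The word lists and the `analyze` function below are hypothetical illustrations, not taken from any of the surveyed systems:

```python
# Toy lexicon-based analyzer that returns both a coarse sentiment polarity
# and a detected emotion category. The word lists are hypothetical stand-ins
# for a real affect resource.

POLARITY = {"great": 1, "happy": 1, "love": 1,
            "bad": -1, "sad": -1, "terrible": -1, "angry": -1}

EMOTIONS = {
    "joy": {"happy", "love", "great"},
    "sadness": {"sad", "terrible"},
    "anger": {"angry"},
}

def analyze(text):
    tokens = text.lower().split()
    # Polarity: sum per-word scores, then reduce to a three-way label.
    score = sum(POLARITY.get(t, 0) for t in tokens)
    polarity = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    # Emotion: the category whose word list overlaps the text the most.
    hits = {emo: sum(t in words for t in tokens) for emo, words in EMOTIONS.items()}
    emotion = max(hits, key=hits.get) if any(hits.values()) else None
    return polarity, emotion

print(analyze("I was so sad and angry about the terrible service"))
# → ('negative', 'sadness')
```

Real systems differ mainly in where the word lists come from (dictionary-based versus corpus-based construction) and in how negation, intensifiers and context are handled; the two-part output simply mirrors the point above that polarity classification and emotion detection are related but distinct tasks.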


It was argued by Plutchik [102] that there are eight basic and prototypical emotions: joy, sadness, anger, fear, trust, disgust, surprise, and anticipation. Emotion Detection (ED) can be considered a SA task. SA is concerned mainly with specifying positive or negative opinions, whereas ED is concerned with detecting various emotions in text. As a sentiment analysis task, ED can be implemented using the ML approach or the lexicon-based approach, but the lexicon-based approach is more frequently used.

ED at the sentence level was proposed by Lu and Lin [13]. They proposed a web-based text mining approach for detecting the emotion of an individual event embedded in English sentences. Their approach was based on the probability distribution of common mutual actions between the subject and the object of an event. They integrated web-based text mining and semantic role labeling techniques, together with a number of reference entity pairs and hand-crafted emotion generation rules, to build an event emotion detection system. They did not use any large-scale lexical sources or knowledge base. They showed that their approach produced satisfactory results in detecting positive, negative and neutral emotions, and demonstrated that the emotion sensing problem is context-sensitive.

Using both the ML and lexicon-based approaches was presented by Balahur et al. [45]. They proposed a method based on commonsense knowledge stored in the EmotiNet knowledge base. They observed that emotions are not always expressed by words with an affective meaning (e.g., happy), but also by describing real-life situations which readers detect as being related to a specific emotion. They used SVM and SVM-SO algorithms to achieve their goal. They showed that the approach based on EmotiNet is the most appropriate for detecting emotions in contexts where no affect-related words are present, and argued that the task of emotion detection from texts such as those in the emotion corpus ISEAR (where little or no lexical clues of affect are present) can best be tackled using approaches based on commonsense knowledge. They showed that by using EmotiNet, they obtained better results compared to methods that employ supervised learning on a much greater training set or on lexical knowledge.

Affect Analysis (AA) is the task of recognizing the emotions elicited by a certain semiotic modality. Neviarouskaya et al. [103] have suggested an Affect Analysis Model (AAM) consisting of five stages: symbolic cue, syntactical structure, word-level, phrase-level and sentence-level analysis. This AAM was used in many applications presented in Neviarouskaya's work [104-106].

Classifying sentences using fine-grained attitude types is another work presented by Neviarouskaya et al. [14]. They developed a system that relied on the compositionality principle and a novel approach dealing with the semantics of verbs in attitude analysis. They worked on 1000 sentences from http://www.experienceproject.com, a site where people share personal experiences, thoughts, opinions, feelings, passions, and confessions through a network of personal stories. Their evaluation showed that their system achieved reliable results in the task of textual attitude analysis.

Affect emotion words could be used as presented by Keshtkar and Inkpen [42] using a corpus-based technique. In their work, they introduced a bootstrapping algorithm based on contextual and lexical features for identifying paraphrases and to extract them for emotion terms from nonparallel corpora. They started with a small number of seeds (WordNet-Affect emotion words). Their approach learned extraction patterns for six classes of emotions. They used annotated blogs and other data sets as texts from which to extract paraphrases, working on data from LiveJournal blogs, Text Affect, fairy tales and annotated blogs. They showed that their algorithm achieved good performance results on their data set.

Ptaszynski et al. [50] have worked on text-based affect analysis (AA) of Japanese narratives from Aozora Bunko. In their research, they addressed the problem of person/character-related affect recognition in narratives. They first extracted the emotion subject from a sentence based on an analysis of anaphoric expressions; the affect analysis procedure then estimated what kind of emotional state each character was in for each part of the narrative.

Studying AA in mails and books was introduced by Mohammad [49]. He analyzed the Enron email corpus and showed that there were marked differences across genders in how emotion words are used in work-place email. He created a lexicon containing manual annotations, obtained by crowd-sourcing, of a word's associations with positive/negative polarity and with the eight basic emotions, and used it to analyze and track the distribution of emotion words in books and mails. He introduced the concept of emotion word density by studying novels and fairy tales, and showed that fairy tales have a much wider distribution of emotion word densities than novels.

5.2. Building resources

Building Resources (BR) aims at creating lexica, dictionaries and corpora in which opinion expressions are annotated according to their polarity. Building resources is not a SA task in itself, but it can help to improve both SA and ED. The main challenges that confront work in this category are the ambiguity of words, multilinguality, granularity and the differences in opinion expression among textual genres [11].

Building a lexicon was presented by Tan and Wu [20]. In their work, they proposed a random walk algorithm to construct a domain-oriented sentiment lexicon by simultaneously utilizing sentiment words and documents from both an old domain and the target domain. They conducted their experiments on three domain-specific sentiment data sets. Their experimental results indicated that their proposed algorithm improved the performance of the automatic construction of domain-oriented sentiment lexicons.

Building a corpus was introduced by Robaldo and Di Caro [34]. They proposed Opinion Mining-ML, a new XML-based formalism for tagging textual expressions that convey opinions on objects considered relevant in the state of affairs; it is a new standard beside Emotion-ML and WordNet. Their work consisted of two parts. First, they presented a standard methodology for the annotation of affective statements in text that was strictly independent of any application domain. Second, they considered a domain-specific adaptation that relied on the use of a supporting ontology, which is domain-dependent. They started with a data set of restaurant reviews, applying a query-oriented extraction process. They evaluated their proposal by means of a fine-grained analysis of the disagreement between different annotators. Their results indicated that their proposal represented an effective annotation scheme that


was able to cover high complexity while preserving good agreement among different people.

Boldrini et al. [41] have focused on the creation of EmotiBlog, a fine-grained annotation scheme for labeling subjectivity in nontraditional textual genres. They focused on annotation at different levels: document, sentence and element. They also presented the EmotiBlog corpus, a collection of blog posts comprising 270,000 tokens about three topics in three languages: Spanish, English and Italian. They checked the robustness of the model and its applicability to NLP tasks, testing it on several corpora such as ISEAR, and their experiments provided satisfactory results. They applied EmotiBlog to sentiment polarity classification and emotion detection, and showed that their resource improved the performance of systems built for these tasks.

Building a dictionary was presented by Steinberger et al. [43]. In their work they proposed a semi-automatic approach to creating sentiment dictionaries in many languages. They first produced high-level gold-standard sentiment dictionaries for two languages and then translated them automatically into a third language. Words that can be found in both target-language word lists are likely to be useful, because their word senses are likely to be similar to those of the two source languages. They addressed two issues during their work: morphological inflection, and the subjectivity involved in the human annotation and evaluation effort. They worked on news data. They compared their triangulated lists with the non-triangulated machine-translated word lists and verified their approach.

5.3. Transfer learning

Transfer learning extracts knowledge from an auxiliary domain to improve the learning process in a target domain; for example, it transfers knowledge from Wikipedia documents to tweets, or from a search in English to one in Arabic. Transfer learning is considered a new cross-domain learning technique, as it addresses the various aspects of domain differences. It is used to enhance many text mining tasks such as text classification [107], sentiment analysis [108], Named Entity Recognition [109], part-of-speech tagging [110], etc.

In sentiment analysis, transfer learning can be applied to transfer sentiment classification from one domain to another [21] or to build a bridge between two domains [22]. Tan and Wang [21] proposed an entropy-based algorithm to pick out high-frequency domain-specific (HFDS) features, as well as a weighting model which weighted both the features and the instances. They assigned a smaller weight to HFDS features and a larger weight to instances with the same label as the involved pivot feature. They worked on education, stock and computer reviews that come from a domain-specific Chinese data set. They proved that their proposed model could overcome the adverse influence of HFDS features, and showed that their model is a better choice for SA applications that require high-precision classification but have hardly any labeled training data.

Wu and Tan [22] have proposed a two-stage framework for cross-domain sentiment classification. In the first stage they built a bridge between the source domain and the target domain to obtain some of the most confidently labeled documents in the target domain. In the second stage they exploited the intrinsic structure, revealed by these labeled documents, to label the target-domain data. They worked on book, hotel, and notebook reviews that came from a domain-specific Chinese data set. They proved that their proposed approach could improve the performance of cross-domain sentiment classification.

The Stochastic Agreement Regularization algorithm deals with cross-domain polarity classification [111]. It is a probabilistic agreement framework based on minimizing the Bhattacharyya distance between models trained using two different views. It regularizes the models from each view by constraining the amount by which it allows them to disagree on unlabeled instances from a theoretical model. The Stochastic Agreement Regularization algorithm was used as a base for the work presented by Lambova et al. [24], which discussed the problem of cross-domain text subjectivity classification. They proposed three new algorithms based on multi-view learning and the co-training algorithm strategy constrained by agreement [112]. They worked on movie reviews and question answering data that came from three well-known data sets. They showed that their proposed work gives improved results compared to the Stochastic Agreement Regularization algorithm.

Diversity among data sources is a problem for the joint modeling of multiple data sources, and joint modeling is important to transfer learning; that is why Gupta et al. [32] have tried to solve this problem. In their work, they proposed a regularized shared subspace learning framework which can exploit the mutual strengths of related data sources while being unaffected by the changeability of each source. They worked on social media news data that come from popular social media sites such as Blogspot, Flickr and YouTube, and also from news sites such as CNN and BBC. They proved that their approach achieved better performance compared to others.

6. Discussion and analysis

Figure 4 Number of articles targeting different sentiment analysis tasks over years.

In this section, we analyze the trends among researchers in using the various algorithms and data, and in accomplishing the different SA tasks. The following graphs illustrate the number of the articles


(which were presented in Table 1) through years according to


their contributions in many criteria.
Fig. 4 illustrates the number of the articles that give contri-
bution to the six categories of SA tasks among years and the
overall count. This figure shows that still SA and SC attract
researchers more frequently. It can be noticed that they have
almost equal number of contributions among years and the
biggest amount in the overall count. The related fields ED,
TL and BR have attracted researchers more recently as they
are emerging fields of search.
ML algorithms are usually used to solve the SC problem because of their simplicity and their ability to exploit training data, which gives them the advantage of domain adaptability. Lexicon-based algorithms are frequently used to solve general SA problems because of their scalability; they are also simple and computationally efficient. Fig. 5 shows the algorithms used. The number and percentage of articles that use ML and lexicon-based algorithms vary over the years. The overall work of the recent few years shows that researchers are using the lexicon-based approach more frequently, because it solves many SA tasks despite its complexity. ML approaches are still an open field of research. Tsytsarau and Palpanas [1] found that most of the work they presented used ML approaches, which means that, in the last few years, researchers have been heading toward the general analysis of texts. Hybrid methods are not yet frequent because their computational complexity is higher.

Figure 5  Number and percentage of articles according to the algorithmic approach over years.

Fig. 6 illustrates that the recent trend of research has been to make a general categorization of sentiments rather than a pos/neg classification in the overall count. It shows that the number and percentage of articles making general classification in the last four years is greater than those making pos/neg classification. In the last year the numbers are almost equal, which means that interest in pos/neg classification is still ongoing. However, the increase in the percentage of general classification implies that the field of SA is maturing. In the past, the binary classification problem was a natural first step, as it involves distinguishing between the two extremes of the polarity spectrum. Binary polarity classification is therefore a comparably easy problem to tackle, due to its inherently crisp nature, as well as the availability of (lots of) data that can easily be used for this purpose. Identifying a general mood is a more difficult research area.

Figure 6  Number and percentage of articles according to the sentiment representation over years.
Figure 7  Number and percentage of articles targeting different text domains over years.
Figure 8  Number and percentage of articles targeting domain dependent and independent text over years.
Figure 9  Number and percentage of articles using different natural languages over years.
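The contrast between the two families of approaches discussed above can be made concrete with a small sketch: a lexicon-based scorer needs no training data but depends entirely on its word list, while an ML classifier (here a minimal multinomial Naive Bayes, one of the algorithms most used in the surveyed work) learns domain-specific cues from labeled examples. The lexicon and the training sentences below are invented for illustration only; they do not come from the surveyed corpora.

```python
import math
from collections import Counter, defaultdict

# --- Lexicon-based route: no training data, just a polarity word list. ---
# Toy stand-in for real resources such as SentiWordNet (illustrative only).
LEXICON = {"good": 1, "great": 1, "love": 1, "excellent": 1,
           "bad": -1, "poor": -1, "hate": -1, "terrible": -1}

def lexicon_score(text):
    """Sum the polarity of every known word; the sign gives pos/neg."""
    return sum(LEXICON.get(w, 0) for w in text.lower().split())

# --- ML route: a minimal multinomial Naive Bayes with Laplace smoothing. ---
def train_nb(docs):
    """docs: list of (text, label) pairs. Returns log-priors and
    per-label word log-likelihoods over the training vocabulary."""
    label_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for text, label in docs:
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    loglik = {}
    for label in label_counts:
        total = sum(word_counts[label].values())
        # Add-one smoothing so unseen words do not zero out a class.
        loglik[label] = {w: math.log((word_counts[label][w] + 1) /
                                     (total + len(vocab))) for w in vocab}
    priors = {l: math.log(n / len(docs)) for l, n in label_counts.items()}
    return priors, loglik

def nb_classify(priors, loglik, text):
    """Pick the label maximizing log P(label) + sum of word log-likelihoods.
    Words outside the training vocabulary are ignored (same for all labels)."""
    words = text.lower().split()
    return max(priors, key=lambda label: priors[label] +
               sum(loglik[label].get(w, 0.0) for w in words))

# Invented toy training set (not from the surveyed data sets).
train = [("great phone love the screen", "pos"),
         ("excellent battery good camera", "pos"),
         ("terrible battery poor screen", "neg"),
         ("hate the camera bad phone", "neg")]
priors, loglik = train_nb(train)

print(lexicon_score("love the great camera"))               # → 2
print(nb_classify(priors, loglik, "poor terrible camera"))  # → neg
```

Swapping the four training sentences for in-domain reviews is what gives the ML route its domain adaptability; the lexicon route, by contrast, works out of the box on any domain that its word list happens to cover.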

Please cite this article in press as: Medhat W et al., Sentiment analysis algorithms and applications: A survey, Ain Shams Eng J (2014), http://
dx.doi.org/10.1016/j.asej.2014.04.011
Research on this more general analysis has an increasing trend with time, and the SA field is expanding to absorb other related fields beyond binary (pos/neg) classification.

We can notice that in the year 2012, most of the articles targeted fields related to SA rather than the standard SC problem. This explains why lexicon-based approaches have been used more often recently, as general classification is not frequently tackled with ML algorithms.

The data used in SA are mostly product reviews in the overall count, as shown in Fig. 7. The other kinds of data, such as news articles or news feeds, web blogs, and social media, have been used more frequently over recent years, especially social media.

We are also interested in seeing whether the data used in the articles are domain dependent or not. Many articles have shown that using domain-dependent data gives more accurate results than domain-independent data, as in [35,60]. Fig. 8 shows that researchers usually work in a domain-independent setting for its simplicity. This leaves domain-dependent, or so-called context-based, SA as an ongoing field of research.

SA using non-English languages has attracted researchers recently, as shown in Fig. 9. The non-English languages include other Latin languages (Spanish, Italian), Germanic languages (German, Dutch), Far East languages (Chinese, Japanese, Taiwanese), and Middle East languages (Arabic). Fig. 9 shows that English is still the most frequently used language due to the availability of its resources, including lexica, corpora and dictionaries. This opens a new challenge to researchers: building lexica, corpora and dictionary resources for other languages.

6.1. Open problems

The analysis illustrated above gives a closer look at the recent and future trends of research. While studying the recent articles, we discovered some points that could be considered open research problems.

The Data Problem: There is a lack of benchmark data sets in this field. It was stated in [1] that few of the most famous data sets are in the field of SA. Table 2 illustrates some well-known data sources and data sets which were used to accomplish the different tasks of SA. It can be noticed that ISEAR and Emotinet are used in the ED and BR articles. These tasks do not use the well-known customer reviews as their data source; they may instead use novels, narratives or mails in their study, which are not used in other SA tasks.

Table 2  Data sets.

References  Task  Data Set/Source
[41]        BR    ISEAR
[43]        BR    Sentiment dictionaries
[42]        ED    LiveJournal blogs, Text Affect, Fairy tales, Annotated blogs
[45]        ED    ISEAR, Emotinet
[49]        ED    Enron Email corpus
[24]        TL    MPQA, RIMDB, CHES
[32]        TL    Blogspot, Flickr, YouTube, CNN-BBC
[48]        FS    amazon.com
[12]        SA    automotiveforums.com
[18]        SA    CNETD
[25]        SA    amazon.com, epinions.com, blogs, SNS
[27]        SA    ebay.com, wikipedia.com, epinions.com
[31]        SA    amazon.com
[39]        SA    amazon.com
[55]        SA    Twitter
[15]        SC    IMDB
[19]        SC    IMDB, amazon.com
[44]        SC    convinceme.net
[50]        SC    2000-SINA blog data set, 300-SINA HowNet lexicon
[51]        SC    Reuters 21578
[53]        SC    amazon.com
[56]        SC    Twitter
[57]        SC    Twitter
[60]        SC    epinions.com

IMDB and Amazon.com are very well-known sources of review data: IMDB provides movie reviews, while amazon.com provides reviews of many kinds of products. These data sources are used in SA and SC tasks. Twitter was used frequently in the last year; it is a very popular social network site whose tweets express people's opinions within a maximum length of 140 characters. The debate site convinceme.net is also considered a good data set and was used in the SC task. The other sources are illustrated in the rest of the table.

The Language Problem: It was noticed in the articles presented in this survey that the Far East languages, especially Chinese, have been used more often recently. Accordingly, many data sources are being built for these languages, and researchers are now in the phase of building resources for other Latin (European) languages. There is still a lack of resources for the Middle East languages, including Arabic. The resources built for the Arabic language are not yet complete and are not easily found as open source, which makes this a very promising direction of research now.

NLP: Natural language processing tools can be used to facilitate the SA process. They give better natural language understanding and thus can help produce more accurate SA results. These tools were used to help in the BR, ED and SA tasks in the last two years. This opens a new trend of research: using NLP as a preprocessing stage before sentiment analysis.

Although [1] mentioned the problems of opinion aggregation and contradiction analysis, they were not found in the recent articles presented by this survey. This means that they have not attracted researchers recently, despite the fact that they are still open fields of research.

It is noticed that working on a domain-specific corpus gives better results than working on a domain-independent corpus. There is still a lack of research in the field of domain-specific SA, sometimes called context-based SA, because building a domain-specific corpus is more complicated than using a domain-independent one. It is noticed that the ED and BR tasks usually work on domain-independent sources, while TL always uses domain-dependent sources.

7. Conclusion and future work

This survey paper presented an overview of the recent updates in SA algorithms and applications. Fifty-four of the recently


published and cited articles were categorized and summarized. These articles give contributions to many SA-related fields that use SA techniques for various real-world applications. After analyzing these articles, it is clear that enhancements of SC and FS algorithms are still an open field for research. Naïve Bayes and Support Vector Machines are the most frequently used ML algorithms for solving the SC problem, and they are considered reference models to which many proposed algorithms are compared.

The interest in languages other than English in this field is growing, as there is still a lack of resources and research concerning these languages. The most commonly used lexicon source is WordNet, which exists in languages other than English. Building resources used in SA tasks is still needed for many natural languages.

Information from micro-blogs, blogs and forums, as well as news sources, has been widely used in SA recently. This media information plays a great role in expressing people's feelings and opinions about a certain topic or product. Using social network sites and micro-blogging sites as data sources still needs deeper analysis. There are some benchmark data sets, especially of reviews such as IMDB, which are used for algorithm evaluation.

In many applications it is important to consider the context of the text and the user preferences, which is why more research on context-based SA is needed. Using TL techniques, we can use data related to the domain in question as training data. Using NLP tools to reinforce the SA process has attracted researchers recently and still needs some enhancements.

References

[1] Tsytsarau Mikalai, Palpanas Themis. Survey on mining subjective data on the web. Data Min Knowl Discov 2012;24:478–514.
[2] Wilson T, Wiebe J, Hoffman P. Recognizing contextual polarity in phrase-level sentiment analysis. In: Proceedings of HLT/EMNLP; 2005.
[3] Liu B. Sentiment analysis and opinion mining. Synth Lect Human Lang Technol 2012.
[4] Yu Liang-Chih, Wu Jheng-Long, Chang Pei-Chann, Chu Hsuan-Shou. Using a contextual entropy model to expand emotion words and their intensity for the sentiment classification of stock market news. Knowl-Based Syst 2013;41:89–97.
[5] Hagenau Michael, Liebmann Michael, Neumann Dirk. Automated news reading: stock price prediction based on financial news using context-capturing features. Decis Support Syst 2013.
[6] Tao Xu, Peng Qinke, Cheng Yinzhao. Identifying the semantic orientation of terms using S-HAL for sentiment analysis. Knowl-Based Syst 2012;35:279–89.
[7] Maks Isa, Vossen Piek. A lexicon model for deep sentiment analysis and opinion mining applications. Decis Support Syst 2012;53:680–8.
[8] Pang B, Lee L. Opinion mining and sentiment analysis. Found Trends Inform Retriev 2008;2:1–135.
[9] Cambria E, Schuller B, Xia Y, Havasi C. New avenues in opinion mining and sentiment analysis. IEEE Intell Syst 2013;28:15–21.
[10] Feldman R. Techniques and applications for sentiment analysis. Commun ACM 2013;56:82–9.
[11] Montoyo Andrés, Martínez-Barco Patricio, Balahur Alexandra. Subjectivity and sentiment analysis: an overview of the current state of the area and envisaged developments. Decis Support Syst 2012;53:675–9.
[12] Qiu Guang, He Xiaofei, Zhang Feng, Shi Yuan, Bu Jiajun, Chen Chun. DASA: dissatisfaction-oriented advertising based on sentiment analysis. Expert Syst Appl 2010;37:6182–91.
[13] Lu Cheng-Yu, Lin Shian-Hua, Liu Jen-Chang, Cruz-Lara Samuel, Hong Jen-Shin. Automatic event-level textual emotion sensing using mutual action histogram between entities. Expert Syst Appl 2010;37:1643–53.
[14] Neviarouskaya Alena, Prendinger Helmut, Ishizuka Mitsuru. Recognition of affect, judgment, and appreciation in text. In: Proceedings of the 23rd international conference on computational linguistics (COLING 2010), Beijing; 2010. p. 806–14.
[15] Bai X. Predicting consumer sentiments from online text. Decis Support Syst 2011;50:732–42.
[16] Zhao Yan-Yan, Qin Bing, Liu Ting. Integrating intra- and inter-document evidences for improving sentence sentiment classification. Acta Automatica Sinica 2010;36.
[17] Yi Hu, Li Wenjie. Document sentiment classification by exploring description model of topical terms. Comput Speech Lang 2011;25:386–403.
[18] Cao Qing, Duan Wenjing, Gan Qiwei. Exploring determinants of voting for the "helpfulness" of online user reviews: a text mining approach. Decis Support Syst 2011;50:511–21.
[19] He Yulan, Zhou Deyu. Self-training from labeled features for sentiment analysis. Inf Process Manage 2011;47:606–16.
[20] Tan Songbo, Wu Qiong. A random walk algorithm for automatic construction of domain-oriented sentiment lexicon. Expert Syst Appl 2011:12094–100.
[21] Tan Songbo, Wang Yuefen. Weighted SCL model for adaptation of sentiment classification. Expert Syst Appl 2011;38:10524–31.
[22] Qiong Wu, Tan Songbo. A two-stage framework for cross-domain sentiment classification. Expert Syst Appl 2011;38:14269–75.
[23] Jiao Jian, Zhou Yanquan. Sentiment polarity analysis based multi-dictionary. In: Presented at the 2011 International Conference on Physics Science and Technology (ICPST'11); 2011.
[24] Lambov Dinko, Pais Sebastião, Dias Gãel. Merged agreement algorithms for domain independent sentiment analysis. In: Presented at the Pacific Association for Computational Linguistics (PACLING'11); 2011.
[25] Xu Kaiquan, Liao Stephen Shaoyi, Li Jiexun, Song Yuxia. Mining comparative opinions from customer reviews for competitive intelligence. Decis Support Syst 2011;50:743–54.
[26] Chin Chen Chien, Tseng You-De. Quality evaluation of product reviews using an information quality framework. Decis Support Syst 2011;50:755–68.
[27] Fan Teng-Kai, Chang Chia-Hui. Blogger-centric contextual advertising. Expert Syst Appl 2011;38:1777–88.
[28] Zhou L, Li B, Gao W, Wei Z, Wong K. Unsupervised discovery of discourse relations for eliminating intra-sentence polarity ambiguities. In: Presented at the 2011 conference on Empirical Methods in Natural Language Processing (EMNLP'11); 2011.
[29] Heerschop B, Goossen F, Hogenboom A, Frasincar F, Kaymak U, de Jong F. Polarity analysis of texts using discourse structure. In: Presented at the 20th ACM Conference on Information and Knowledge Management (CIKM'11); 2011.
[30] Zirn C, Niepert M, Stuckenschmidt H, Strube M. Fine-grained sentiment analysis with structural features. In: Presented at the 5th International Joint Conference on Natural Language Processing (IJCNLP'11); 2011.
[31] Hu Nan, Bose Indranil, Koh Noi Sian, Liu Ling. Manipulation of online reviews: an analysis of ratings, readability, and sentiments. Decis Support Syst 2012;52:674–84.
[32] Gupta Sunil Kumar, Phung Dinh, Adams Brett, Venkatesh Svetha. Regularized nonnegative shared subspace learning. Data Min Knowl Discov 2012;26:57–97.
[33] Duric Adnan, Song Fei. Feature selection for sentiment analysis based on content and syntax models. Decis Support Syst 2012;53:704–11.


[34] Robaldo Livio, Di Caro Luigi. OpinionMining-ML. Comput Stand Interfaces 2012.
[35] Di Caro Luigi, Grella Matteo. Sentiment analysis via dependency parsing. Comput Stand Interfaces 2012.
[36] Kang Hanhoon, Yoo Seong Joon, Han Dongil. Senti-lexicon and improved Naïve Bayes algorithms for sentiment analysis of restaurant reviews. Expert Syst Appl 2012;39:6000–10.
[37] Moreo A, Romero M, Castro JL, Zurita JM. Lexicon-based comments-oriented news sentiment analyzer system. Expert Syst Appl 2012;39:9166–80.
[38] Zhang Wenhao, Hua Xu, Wan Wei. Weakness finder: find product weakness from Chinese reviews by using aspects based sentiment analysis. Expert Syst Appl 2012;39:10283–91.
[39] Min Hye-Jin, Park Jong C. Identifying helpful reviews based on customer's mentions about experiences. Expert Syst Appl 2012;39:11830–8.
[40] Ortigosa-Hernández Jonathan, Rodríguez Juan Diego, Alzate Leandro, Lucania Manuel, Inza Iñaki, Lozano Jose A. Approaching sentiment analysis by using semi-supervised learning of multi-dimensional classifiers. Neurocomputing 2012;92:98–115.
[41] Boldrini Ester, Balahur Alexandra, Martínez-Barco Patricio, Montoyo Andrés. Using EmotiBlog to annotate and analyse subjectivity in the new textual genres. Data Min Knowl Discov 2012;25:603–34.
[42] Keshtkar Fazel, Inkpen Diana. A bootstrapping method for extracting paraphrases of emotion expressions from texts. Comput Intell 2012.
[43] Steinberger Josef, Ebrahim Mohamed, Ehrmann Maud, Hurriyetoglu Ali, Kabadjov Mijail, Lenkova Polina, Steinberger Ralf, Tanev Hristo, Vázquez Silvia, Zavarella Vanni. Creating sentiment dictionaries via triangulation. Decis Support Syst 2012;53:689–94.
[44] Walker Marilyn A, Anand Pranav, Abbott Rob, Fox Tree Jean E, Martell Craig, King Joseph. That is your evidence?: Classifying stance in online political debate. Decis Support Syst 2012;53:719–29.
[45] Balahur Alexandra, Hermida Jesús M, Montoyo Andrés. Detecting implicit expressions of emotion in text: a comparative analysis. Decis Support Syst 2012;53:742–53.
[46] Lane Peter CR, Clarke Daoud, Hender Paul. On developing robust models for favourability analysis: model choice, feature sets and imbalanced data. Decis Support Syst 2012;53:712–8.
[47] van de Camp Matje, van den Bosch Antal. The socialist network. Decis Support Syst 2012;53:761–9.
[48] Reyes Antonio, Rosso Paolo. Making objective decisions from subjective data: detecting irony in customer reviews. Decis Support Syst 2012;53:754–60.
[49] Mohammad SM. From once upon a time to happily ever after: tracking emotions in mail and books. Decis Support Syst 2012;53:730–41.
[50] Xianghua Fu, Guo Liu, Yanyan Guo, Zhiqiang Wang. Multi-aspect sentiment analysis for Chinese online social reviews based on topic modeling and HowNet lexicon. Knowl-Based Syst 2013;37:186–95.
[51] Li Sheng-Tun, Tsai Fu-Ching. A fuzzy conceptualization model for text mining with application in opinion polarity classification. Knowl-Based Syst 2013;39:23–33.
[52] Kaufmann JM. JMaxAlign: a maximum entropy parallel sentence alignment tool. In: Proceedings of COLING'12: Demonstration Papers, Mumbai; 2012. p. 277–88.
[53] Moraes Rodrigo, Valiati João Francisco, Gavião Neto Wilson P. Document-level sentiment classification: an empirical comparison between SVM and ANN. Expert Syst Appl 2013;40:621–33.
[54] Martín-Valdivia María-Teresa, Martínez-Cámara Eugenio, Perea-Ortega Jose-M, Alfonso Ureña-López L. Sentiment polarity detection in Spanish reviews combining supervised and unsupervised approaches. Expert Syst Appl 2013.
[55] Kontopoulos Efstratios, Berberidis Christos, Dergiades Theologos, Bassiliades Nick. Ontology-based sentiment analysis of twitter posts. Expert Syst Appl 2013.
[56] Rui Huaxia, Liu Yizao, Whinston Andrew. Whose and what chatter matters? The effect of tweets on movie sales. Decis Support Syst 2013.
[57] Li Yung-Ming, Li Tsung-Ying. Deriving market intelligence from microblogs. Decis Support Syst 2013.
[58] Ptaszynski Michal, Dokoshi Hiroaki, Oyama Satoshi, Rzepka Rafal, Kurihara Masahito, Araki Kenji, Momouchi Yoshio. Affect analysis in context of characters in narratives. Expert Syst Appl 2013;40:168–76.
[59] Pai Mao-Yuan, Chu Hui-Chuan, Wang Su-Chen, Chen Yuh-Min. Electronic word of mouth analysis for service experience. Expert Syst Appl 2013;40:1993–2006.
[60] Cruz Fermín L, Troyano José A, Enríquez Fernando, Javier Ortega F, Vallejo Carlos G. 'Long autonomy or long delay?' The importance of domain in opinion mining. Expert Syst Appl 2013;40:3174–84.
[61] Chenlo J, Hogenboom A, Losada D. Sentiment-based ranking of blog posts using rhetorical structure theory. In: Presented at the 18th international conference on applications of Natural Language to Information Systems (NLDB'13); 2013.
[62] Aggarwal Charu C, Zhai ChengXiang. Mining text data. Springer Science+Business Media; 2012.
[63] Mejova Yelena, Srinivasan Padmini. Exploring feature definition and selection for sentiment classifiers. In: Proceedings of the fifth international AAAI conference on weblogs and social media; 2011.
[64] Whitelaw Casey, Garg Navendu, Argamon Shlomo. Using appraisal groups for sentiment analysis. In: Proceedings of the ACM SIGIR Conference on Information and Knowledge Management (CIKM); 2005. p. 625–31.
[65] Cover TM, Thomas JA. Elements of information theory. New York: John Wiley and Sons; 1991.
[66] Deerwester S, Dumais S, Landauer T, Furnas G, Harshman R. Indexing by latent semantic analysis. JASIS 1990;41:391–407.
[67] Jolliffe IT. Principal component analysis. Springer; 2002.
[68] Griffiths Thomas L, Steyvers Mark, Blei David M, Tenenbaum Joshua B. Integrating topics and syntax. Adv Neural Inform Process Syst 2005:537–44.
[69] Maynard Diana, Funk Adam. Automatic detection of political opinions in tweets. In: Proceedings of the 8th international conference on the semantic web, ESWC'11; 2011. p. 88–99.
[70] Cortes C, Vapnik V. Support-vector networks. Machine Learning; 1995.
[71] Vapnik V. The nature of statistical learning theory. New York; 1995.
[72] Joachims T. Probabilistic analysis of the Rocchio algorithm with TFIDF for text categorization. In: Presented at the ICML conference; 1997.
[73] Aizerman M, Braverman E, Rozonoer L. Theoretical foundations of the potential function method in pattern recognition learning. Autom Rem Cont 1964:821–37.
[74] Ruiz M, Srinivasan P. Hierarchical neural networks for text categorization. In: Presented at the ACM SIGIR conference; 1999.
[75] Ng Hwee Tou, Goh Wei, Low Kok. Feature selection, perceptron learning, and a usability case study for text categorization. In: Presented at the ACM SIGIR conference; 1997.


[76] Quinlan JR. Induction of decision trees. Machine Learn 1986;1:81–106.
[77] Lewis David D, Ringuette Marc. A comparison of two learning algorithms for text categorization. SDAIR 1994.
[78] Chakrabarti Soumen, Roy Shourya, Soundalgekar Mahesh V. Fast and accurate text classification via multiple linear discriminant projections. VLDB J 2003;2:172–85.
[79] Li Y, Jain A. Classification of text documents. Comput J 1998;41:537–46.
[80] Liu Bing, Hsu Wynne, Ma Yiming. Integrating classification and association rule mining. In: Presented at the ACM KDD conference; 1998.
[81] Ko Youngjoong, Seo Jungyun. Automatic text categorization by unsupervised learning. In: Proceedings of COLING-00, the 18th international conference on computational linguistics; 2000.
[82] Turney P. Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews. In: Proceedings of annual meeting of the Association for Computational Linguistics (ACL'02); 2002.
[83] Read J, Carroll J. Weakly supervised techniques for domain-independent sentiment classification. In: Proceedings of the 1st international CIKM workshop on topic-sentiment analysis for mass opinion; 2009. p. 45–52.
[84] Somasundaran S, Wiebe J. Recognizing stances in online debates. In: Proceedings of the joint conference of the 47th annual meeting of the ACL and the 4th international joint conference on natural language processing of the AFNLP; 2009. p. 226–34.
[85] Hu Minqing, Liu Bing. Mining and summarizing customer reviews. In: Proceedings of ACM SIGKDD international conference on Knowledge Discovery and Data Mining (KDD'04); 2004.
[86] Kim S, Hovy E. Determining the sentiment of opinions. In: Proceedings of international conference on Computational Linguistics (COLING'04); 2004.
[87] Miller G, Beckwith R, Fellbaum C, Gross D, Miller K. WordNet: an on-line lexical database. Oxford Univ. Press; 1990.
[88] Mohammad S, Dunne C, Dorr B. Generating high-coverage semantic orientation lexicons from overly marked words and a thesaurus. In: Proceedings of the conference on Empirical Methods in Natural Language Processing (EMNLP'09); 2009.
[89] Hatzivassiloglou V, McKeown K. Predicting the semantic orientation of adjectives. In: Proceedings of annual meeting of the Association for Computational Linguistics (ACL'97); 1997.
[90] Lafferty J, McCallum A, Pereira F. Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of International Conference on Machine Learning (ICML'01); 2001.
[91] Fahrni A, Klenner M. Old wine or warm beer: target-specific sentiment analysis of adjectives. In: Proceedings of the symposium on affective language in human and machine, AISB; 2008. p. 60–3.
[93] Lund K, Burgess C. Producing high-dimensional semantic spaces from lexical co-occurrence. Behav Res Methods 1996;28:203–8.
[94] Bolshakov Igor A, Gelbukh Alexander. Computational linguistics (models, resources, applications); 2004.
[95] Asher N, Benamara F, Mathieu Y. Distilling opinion in discourse: a preliminary study. Presented at COLING'08; 2008.
[96] Somasundaran S, Wiebe J, Ruppenhofer J. Discourse level opinion interpretation. Presented at COLING'08; 2008.
[97] Somasundaran S, Namata G, Wiebe J, Getoor L. Supervised and unsupervised methods in employing discourse relations for improving opinion polarity classification. In: Presented at the 2009 conference on Empirical Methods in Natural Language Processing (EMNLP'09); 2009.
[98] Mann W, Thompson S. Rhetorical structure theory: toward a functional theory of text organization. Text 1988;8:243–81.
[99] Wille R. Restructuring lattice theory: an approach based on hierarchies of concepts. In: Rival I, editor. Reidel, Dordrecht-Boston; 1982. p. 445–70.
[100] Priss U. Formal concept analysis in information science. In: Presented at the annual review of information science and technology; 2006.
[101] Li S, Tsai F. Noise control in document classification based on fuzzy formal concept analysis. In: Presented at the IEEE International Conference on Fuzzy Systems (FUZZ); 2011.
[102] Plutchik R. A general psychoevolutionary theory of emotion. Emotion: Theory Res Exp 1980;1:3–33.
[103] Neviarouskaya Alena, Prendinger Helmut, Ishizuka Mitsuru. Recognition of affect conveyed by text messaging in online communication. Presented at Online Communities and Social Computing, HCII'07; 2007.
[104] Neviarouskaya Alena, Prendinger Helmut, Ishizuka Mitsuru. Compositionality principle in recognition of fine-grained emotions from text. In: Proceedings of the third international ICWSM conference; 2009.
[105] Neviarouskaya Alena, Prendinger Helmut, Ishizuka Mitsuru. EmoHeart: automation of expressive communication of emotions in Second Life. Online Communities, LNCS 2009;5621:584–92.
[106] Neviarouskaya Alena, Tsetserukou Dzmitry, Prendinger Helmut, Kawakami Naoki, Tachi Susumu, Ishizuka Mitsuru. Emerging system for affectively charged interpersonal communication. In: Presented at the ICROS-SICE international joint conference, Fukuoka International Congress Center, Japan; 2009.
[107] Joachims T. Learning to classify text using support vector machines: methods, theory and algorithms. Norwell, MA, USA; 2002.
[108] Pang Bo, Lee Lillian. Opinion mining and sentiment analysis. Found Trends Inform Retriev 2008.
[109] Zhang Tong, Johnson David. A robust risk minimization based named entity recognition system. In: Presented at the seventh conference on Natural Language Learning at HLT-NAACL; 2003.
[110] Ratnaparkhi Adwait. A maximum entropy model for part-of-speech tagging. In: Proceedings of the conference on Empirical Methods in Natural Language Processing; 1996.
[111] Ganchev K, Graca J, Blitzer J, Taskar B. Multi-view learning over structured and non-identical outputs. In: Proceedings of the 24th conference on Uncertainty in Artificial Intelligence (UAI'08); 2008. p. 204–11.
[112] Wan X. Co-training for cross-lingual sentiment classification. In: Proceedings of the joint conference of the 47th annual meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing (ACL/IJCNLP'09); 2009. p. 235–43.
[113] Medhat W, Hassan A, Korashy H. Combined algorithm for data mining using association rules. Ain Shams J Electric Eng 2008;1(1).
[114] Mudinas Andrius, Zhang Dell, Levene Mark. Combining lexicon and learning based approaches for concept-level sentiment analysis. Presented at WISDOM'12, Beijing, China; 2012.


[115] Cambria Erik, Havasi Catherine, Hussain Amir. SenticNet 2: a semantic and affective resource for opinion mining and sentiment analysis. In: Proceedings of the twenty-fifth international Florida Artificial Intelligence Research Society conference; 2012.
[116] Cambria Erik, Benson Tim, Eckl Chris, Hussain Amir. Sentic PROMs: application of sentic computing to the development of a novel unified framework for measuring health-care quality. Expert Syst Appl 2012;39:10533–43.
[117] Cambria Erik, Hussain Amir, Havasi Catherine. Towards crowd validation of the UK National Health Service. Presented at the Web Science Conf, Raleigh, NC, USA; 2010.

Walaa Medhat is an Engineering Lecturer in the School of Electronic Engineering, Canadian International College, Cairo campus of CBU. She received her M.Sc. and B.Sc. from the Computers and Systems Engineering Department, Ain Shams University, in 2008 and 2002, respectively. Fields of interest: text mining, data mining, software engineering, programming languages and artificial intelligence.

Ahmed Hassan has been an associate professor in the Computers and Systems Engineering Department, Ain Shams University, since 2009. He is the executive director of the Information and Communication Technology Project (ICTP), Ministry of Higher Education, Egypt. He received his Ph.D., M.Sc. and B.Sc. from Ain Shams University in 2004, 2000 and 1995, respectively. He has also served as the secretary of the IEEE Egypt section since 2012. His research interests include data mining, software engineering, programming languages, artificial intelligence and automatic control.

Hoda Korashy is a professor at the Department of Computers & Systems, Faculty of Engineering, Ain Shams University, Cairo, Egypt. Her major interests are in database systems, data mining, web mining, the semantic web and intelligent systems.
