
Sentiment Analysis of Movie Reviews and Blog Posts

Evaluating SentiWordNet with different Linguistic Features and Scoring Schemes

V.K. Singh, R. Piryani, A. Uddin
Department of Computer Science
South Asian University
New Delhi, India
[email protected], [email protected], [email protected]

P. Waila
DST Centre for Interdisciplinary Mathematical Sciences
Banaras Hindu University
Varanasi, India
[email protected]

Abstract—This paper presents our experimental work on performance evaluation of the SentiWordNet approach for document-level sentiment classification of movie reviews and blog posts. We have implemented the SentiWordNet approach with different variations of linguistic features, scoring schemes and aggregation thresholds. We used two pre-existing large datasets of movie reviews and two blog post datasets on the revolutionary changes in Libya and Tunisia. We have computed sentiment polarity, and also its strength, for both movie reviews and blog posts. The paper also presents an evaluative account of the performance of the SentiWordNet approach against two popular machine learning approaches for sentiment classification: Naïve Bayes and SVM. The comparative performance of the approaches for both movie reviews and blog posts is illustrated through the standard performance evaluation metrics of Accuracy, F-measure and Entropy.

Keywords: Sentiment Analysis, SentiWordNet, Blog Sentiment, Machine Learning Classifiers.

I. INTRODUCTION

According to recent statistics by the social media tracking company Technorati, four out of every five Internet users use social media in some form. This includes friendship networks, blogging and micro-blogging sites, content and video sharing sites, etc. It is worth observing that the World Wide Web (hereafter referred to simply as the Web) has now completely transformed into a more participative and co-creative Web. It allows a large number of users to contribute in a variety of forms. Even those who are virtual novices in the technicalities of Web publishing are creating content on the Web. In fact, the value of a website is now determined largely by its user base, which in turn decides the amount of data available on it. It may perhaps be true to say that data is the new "Intel inside".

One such interesting form of user contribution on the Web is the review. Many sites on the Web allow users to write up their experiences or opinions about a product or service in the form of a review. The Web is now full of user reviews for different items, ranging from mobile phones, holiday trips and hotel services to movies. It is interesting to observe that these reviews not only express the opinions of a group of users but are also a valuable source for harnessing collective intelligence [1]. For example, a user looking for a hotel in a particular tourist city may prefer to go through the reviews of the available hotels in that city before deciding to book one of them. Or a user willing to buy a particular model of digital camera may first look at reviews posted by many other users about that camera before making a buying decision. This not only allows the user to get more, and more relevant, information about different products and services at a mouse click, but also helps in arriving at a more informed decision. Sometimes users prefer to write about their experiences with a product or service in the form of a blog post rather than an explicit review. However, in both cases the data is essentially textual.

Popular sites like carwale.com and imdb.com are now full of user reviews, in these cases reviews of cars and movies respectively. The users writing on these sites are a diverse group, ranging from persons who recently bought a product or used a service to those who are regulars there. A look at the Internet Movie Database website (www.imdb.com) will show how useful it can be when a person is interested in movies produced and released in virtually any part of the world. Similarly, the posts on blog sites reflect the opinions of a large number of users. A blog post, however, is a relatively difficult source for sentiment analysis, as it often does not contain explicit statements that can be harnessed for sentiment. Many times blog posts contain more factual or episodic information and are not opinionated to the desired extent. Nevertheless, they still constitute a tremendous source of user opinions and views and should be harnessed for sentiment-oriented and other useful analysis.

Though these reviews and posts are beyond doubt very useful and valuable, at the same time it is quite difficult for a new user (or a prospective customer) to read all the reviews/posts in a short span of time. Fortunately we have a solution to this information overload problem, one which can present a comprehensive summary result out of a large number of reviews. The new IR formulations, popularly called sentiment classifiers, not only allow a review to be automatically labeled as positive or negative, but also extract and highlight positive and negative aspects of a product/service. Sentiment analysis is now an important part of IR-based formulations in a variety of domains. It is traditionally used for automatic extraction of opinion
types about a product and for highlighting positive or negative aspects/features of a product. Recently we have seen the use of sentiment analysis for opinion-based clustering of text documents and for providing better and more focused recommendations by a recommender system [2], [3].

II. SENTIMENT ANALYSIS APPROACHES

There are primarily three types of approaches for sentiment classification of opinionated texts: (a) using a machine learning based text classifier, such as Naïve Bayes, SVM or kNN; (b) using a Semantic Orientation scheme of extracting relevant n-grams of the text and then labeling them, and consequently the document, as either positive or negative; and (c) using the publicly available SentiWordNet library, which provides positive, negative and neutral scores for words. Some of the relevant past work on sentiment classification can be found in [4], [5], [6], [7], [8], [9], [10], [11], [12] and [13].

The machine learning based text classifiers belong to the supervised machine learning paradigm, where the classifier needs to be trained on some labeled training data before it can be applied to the actual classification task. The training data is usually an extracted portion of the original data, hand labeled manually. After suitable training the classifiers can be used on the actual test data. Naïve Bayes is a statistical classifier, whereas SVM is a kind of vector space classifier. The statistical text classifier scheme of Naïve Bayes (NB) can be adapted to the sentiment classification problem, since it can be visualized as a two-class text classification problem with positive and negative classes. A more detailed description can be found in [14]. The Support Vector Machine (SVM) is a vector space model based classifier which requires that the text documents be transformed into feature vectors before they are used for classification. Usually the text documents are transformed into multidimensional tf.idf vectors. The classification problem is then to classify every text document, represented as a vector, into a particular class. SVM is a type of large-margin classifier: the goal is to find a decision boundary between the two classes that is maximally far from any document in the training data. We have implemented the Naïve Bayes algorithm as Java code and used the Sequential Minimal Optimization (SMO) implementation available in Weka [15] for the SVM. These two methods were implemented originally in [16], and we used them here to compare their performance vis-à-vis the SentiWordNet implementations described here.
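As a concrete point of reference, the sketch below shows how such a two-class baseline could be set up. It is an illustration only, not the implementation evaluated here: scikit-learn's MultinomialNB and LinearSVC stand in for the Java Naïve Bayes implementation and Weka's SMO mentioned above, and the split ratio, random seed and macro-averaged F-measure are assumptions of the sketch.

    # Illustrative baseline only; MultinomialNB and LinearSVC are stand-ins for the
    # Java Naive Bayes and Weka SMO implementations referred to above.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.svm import LinearSVC
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score, f1_score

    def evaluate_baselines(texts, labels):
        """Train NB and SVM baselines on tf.idf vectors and report Accuracy / F-measure."""
        x_train, x_test, y_train, y_test = train_test_split(
            texts, labels, test_size=0.3, random_state=42, stratify=labels)
        vectorizer = TfidfVectorizer()                # documents -> multidimensional tf.idf vectors
        xv_train = vectorizer.fit_transform(x_train)
        xv_test = vectorizer.transform(x_test)
        for name, clf in (("NB", MultinomialNB()), ("SVM", LinearSVC())):
            clf.fit(xv_train, y_train)                # supervised training on the labeled portion
            predicted = clf.predict(xv_test)
            print(name,
                  "accuracy:", accuracy_score(y_test, predicted),
                  "F-measure:", f1_score(y_test, predicted, average="macro"))
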

III. SENTIWORDNET

The SentiWordNet approach utilizes the publicly available SentiWordNet library [17], which provides sentiment polarity values for every term occurring in a document. In this lexical resource each term t occurring in WordNet is associated with three numerical scores obj(t), pos(t) and neg(t), describing the objective, positive and negative polarities of the term, respectively. These three scores are computed by combining the results produced by eight ternary classifiers. To make use of SentiWordNet we need to first extract the relevant opinionated terms and then look up their scores in SentiWordNet. Therefore, the key issues in using a SentiWordNet based approach are to decide: (a) which terms of the text should be extracted, and (b) how to combine/aggregate the individual sentiment values obtained from the library for the extracted terms so as to arrive at a final sentiment label for the whole document. We have experimented with different linguistic features and scoring schemes. Computational linguists suggest that adjectives are good markers of opinions. For example, if a review sentence says "The movie was excellent", the use of the adjective 'excellent' tells us that the movie was liked by the reviewer and possibly that he had a wonderful experience watching it. Sometimes adverbs further modify the opinion expressed in review sentences. For example, the sentence "The movie was extremely good" expresses a more positive opinion about the movie than the sentence "The movie was good". We have therefore explored two linguistic feature selection schemes: in one we extract only 'adjectives', and in the other we extract both 'adjectives' and 'adverbs'. Though adverbs are of various kinds, for sentiment classification only adverbs of degree seem useful.

After deciding on the type of terms to be extracted, the remaining task was to obtain their SentiWordNet scores. SentiWordNet version 3.0 gives us a sentiment polarity value between -1 and +1 for every term that we extract and look up. A value towards +1 denotes that the term is an indicator of positivity and a value towards -1 indicates that it is an indicator of negativity. After we obtain the sentiment values for all extracted terms in a particular text document, we have to aggregate these scores to obtain a sentiment score for the whole document. In order to find this score we aggregated the values for the positive terms and the negative terms of a document separately to obtain a 'positivity score' and a 'negativity score' for the document. The magnitudes of these two values are then compared, and whichever score is higher determines the sentiment polarity of the whole document. An indicative pseudo-code denoting the key steps for computing the sentiment scores, aggregating them and computing the polarity strength for the 'adjective only' scheme is illustrated below.

• Extract the adjectives (adj) from the text document
• For each extracted adj
  o Senti_Score = SWN_Score(adj)
  o If a 'NOT' precedes the adj
      Senti_Score = -(Senti_Score)
  o If Senti_Score > 0 then
      Pos_Score += Senti_Score
  o Else
      Neg_Score += (-Senti_Score)
• If Pos_Score > Neg_Score then
      Sentiment = Positive
  Else
      Sentiment = Negative
• Compute polarity strength
  o Polarity_Strength = Pos_Score - Neg_Score

In the pseudo-code above, only adjectives (denoted as adj) are extracted. We compute their SentiWordNet scores and then check for the presence of a 'Not' immediately before each adjective. If a 'Not' is present, the Senti_Score is negated. This was done to incorporate the effect of negated expressive words.
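To make the scheme above concrete, here is a minimal sketch of the adjective-only scoring using NLTK's SentiWordNet interface; it is an illustration rather than the implementation evaluated in this paper. Taking the first adjective sense of each word and using its positive score minus its negative score as SWN_Score are simplifying assumptions, and the NLTK corpora listed in the comments must be downloaded first.

    # A minimal sketch of the 'adjective only' scheme, assuming NLTK's SentiWordNet interface.
    # Needs: nltk.download('punkt'), nltk.download('averaged_perceptron_tagger'),
    #        nltk.download('wordnet'), nltk.download('sentiwordnet')
    import nltk
    from nltk.corpus import sentiwordnet as swn

    def swn_score(word):
        """Score in [-1, 1] for the first adjective sense of the word, or 0 if none is found."""
        senses = list(swn.senti_synsets(word, 'a'))
        return senses[0].pos_score() - senses[0].neg_score() if senses else 0.0

    def classify_adjective_only(text):
        tagged = nltk.pos_tag(nltk.word_tokenize(text))
        pos_total, neg_total = 0.0, 0.0
        for i, (word, tag) in enumerate(tagged):
            if not tag.startswith('JJ'):                      # keep adjectives only
                continue
            score = swn_score(word.lower())
            if i > 0 and tagged[i - 1][0].lower() in ("not", "n't"):
                score = -score                                # negate when a 'NOT' precedes the adjective
            if score > 0:
                pos_total += score
            else:
                neg_total += -score
        sentiment = "positive" if pos_total > neg_total else "negative"
        return sentiment, pos_total - neg_total               # polarity label and polarity strength

    # Example: classify_adjective_only("The movie was not good, but the acting was excellent.")
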
In the other variant we extracted an 'Adverb+Adjective' combination rather than adjectives alone. In this variant we first locate each adjective and then check the two terms (a 2-gram) preceding it for the occurrence of an adverb, and for any 'Not' preceding that adverb. The rest of the computation is as illustrated in the pseudo-code below. Adjectives are represented as adj and adverbs as adv.

• Extract the 'adv+adj' combinations from the text document
• For each extracted combination
  o Senti_Score = SWN_Score(adj) + SWN_Score(adv)
  o If a 'NOT' precedes the combination
      Senti_Score = -(Senti_Score)
  o If Senti_Score > 0 then
      Pos_Score += Senti_Score
  o Else
      Neg_Score += (-Senti_Score)
• If Pos_Score > Neg_Score then
      Sentiment = Positive
  Else
      Sentiment = Negative
• Compute polarity strength
  o Polarity_Strength = Pos_Score - Neg_Score
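A corresponding sketch of this 'adverb+adjective' variant is given below, again assuming NLTK's SentiWordNet interface. The way the preceding 2-gram window is searched for an adverb and for a negation, and the use of the first SentiWordNet sense of each word, are assumptions of the sketch rather than a prescription from the scheme above.

    # A sketch of the 'adverb+adjective' variant, assuming NLTK's SentiWordNet interface.
    import nltk
    from nltk.corpus import sentiwordnet as swn

    def first_sense_score(word, pos):
        """pos is 'a' for adjectives and 'r' for adverbs; returns pos_score - neg_score, or 0."""
        senses = list(swn.senti_synsets(word, pos))
        return senses[0].pos_score() - senses[0].neg_score() if senses else 0.0

    def classify_adv_adj(text):
        tagged = nltk.pos_tag(nltk.word_tokenize(text))
        pos_total, neg_total = 0.0, 0.0
        for i, (word, tag) in enumerate(tagged):
            if not tag.startswith('JJ'):
                continue
            score = first_sense_score(word.lower(), 'a')
            window = tagged[max(0, i - 2):i]                  # the 2-gram preceding the adjective
            adverbs = [w for w, t in window
                       if t.startswith('RB') and w.lower() not in ("not", "n't")]
            if adverbs:
                score += first_sense_score(adverbs[-1].lower(), 'r')  # SWN_Score(adj) + SWN_Score(adv)
            if any(w.lower() in ("not", "n't") for w, _ in window):
                score = -score                                # a preceding 'NOT' negates the combination
            if score > 0:
                pos_total += score
            else:
                neg_total += -score
        return ("positive" if pos_total > neg_total else "negative"), pos_total - neg_total
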

In both the above schemes, we gave equal weightage to the adjective and adverb scores obtained from SentiWordNet. It is however relevant to note that in the English language adjectives are largely used in an opinionated tone, whereas adverbs are usually used as complements or modifiers. A few examples of adverb usage are: he ran quickly, only adults, very dangerous trip, very nicely, etc. In all these examples the adverbs modify the adjectives. It may therefore be interesting to explore assigning different weightage to the adjective and adverb SentiWordNet scores before combining them to produce a combined sentiment score. A related previous work [18] explored this philosophy and concluded that an 'Adverb+Adjective' combination produces better results than using adjectives alone. Hence we decided to explore this possibility by assigning different weightage to the adjective and adverb SentiWordNet scores and evaluating the resulting performance. We explored two variants of this idea, hereafter referred to as SentiWordNet (Variable Scoring) and SentiWordNet (Adjective Priority Scoring). An indicative pseudo-code of the key steps for SentiWordNet (Variable Scoring) is illustrated below.

• If adv is affirmative, then
  o If score(adj) > 0
      fsVS(adv, adj) = score(adj) + (1 - score(adj)) * score(adv)
  o If score(adj) < 0
      fsVS(adv, adj) = score(adj) - (1 - score(adj)) * score(adv)
• If adv is negative, then
  o If score(adj) > 0
      fsVS(adv, adj) = score(adj) + (1 - score(adj)) * score(adv)
  o If score(adj) < 0
      fsVS(adv, adj) = score(adj) - (1 - score(adj)) * score(adv)

The idea behind the variable scoring scheme is to modify the sentiment strength of an adjective by a factor dependent on the type and score of the preceding adverb. For example, for extracted adv+adj combinations like 'very good' and 'very bad', the sentiment scores of the adjectives 'good' and 'bad' are strengthened (increased/decreased) by a value proportional to the adverb score, whereas for adv+adj combinations like 'barely good' and 'barely bad', the scores for 'good' and 'bad' are modified to reduce their strength. The value by which the SentiWordNet scores are strengthened is variable, since it is the adverb score multiplied by (1 - score(adj)). This results in strong adjectives being strengthened in polarity by a smaller value, whereas weaker adjectives are strengthened in polarity by a relatively larger value.
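A small worked example of the affirmative-adverb, positive-adjective case may help; the scores used below are made-up values for illustration only.

    # Variable Scoring, affirmative adverb and positive adjective:
    #   fsVS(adv, adj) = score(adj) + (1 - score(adj)) * score(adv)
    def fs_vs_affirmative_positive(adj_score, adv_score):
        return adj_score + (1 - adj_score) * adv_score

    # With the same adverb score (0.5), a strong adjective gains less than a weak one:
    print(fs_vs_affirmative_positive(0.8, 0.5))   # 0.8 + 0.2 * 0.5 = 0.90
    print(fs_vs_affirmative_positive(0.2, 0.5))   # 0.2 + 0.8 * 0.5 = 0.60
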
In our last variation we decided to fix the scaling factor so as to give a fixed weight to adjective priority. We chose a scaling factor sf = 0.35, which is equivalent to giving only 35% weight to the adverb scores. The modifications to the adjective scores are now in a fixed proportion to the adverb scores, and since sf = 0.35, the adjective scores get a higher priority in the combined score. The indicative pseudo-code of the key steps for this scheme, i.e. SentiWordNet (Adjective Priority Scoring), is illustrated below.

• If adv is affirmative, then
  o If score(adj) > 0
      fsAPS(adv, adj) = min(1, score(adj) + sf * score(adv))
  o If score(adj) < 0
      fsAPS(adv, adj) = min(1, score(adj) - sf * score(adv))
• If adv is negative, then
  o If score(adj) > 0
      fsAPS(adv, adj) = max(-1, score(adj) + sf * score(adv))
  o If score(adj) < 0
      fsAPS(adv, adj) = max(-1, score(adj) - sf * score(adv))

Here the final sentiment values (fsAPS) are scaled forms of the adverb and adjective SentiWordNet scores, where the adverb score is given 35% weightage. The presence of 'Not' in both the Variable Scoring and the Adjective Priority Scoring methods was handled in the same way as in the two previous schemes of using adjectives only and using the 'adverb+adjective' combination. We have thus implemented four scoring schemes over the two feature selection variants, namely using adjectives only and using the 'adverb+adjective' combination.
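The Adjective Priority Scoring rules above translate directly into a small function; the sketch below is one such transcription, in which the decision of whether an adverb is affirmative or negative (weakening) is assumed to be made elsewhere, for example from a hand-built list of weakening adverbs such as 'barely' or 'hardly'.

    # A direct transcription of the Adjective Priority Scoring rules, with sf = 0.35.
    SF = 0.35

    def fs_aps(adj_score, adv_score, adv_is_affirmative=True):
        """Combined score with adjective priority, clamped to [-1, 1] as in the rules above."""
        if adv_is_affirmative:
            if adj_score > 0:
                return min(1.0, adj_score + SF * adv_score)
            return min(1.0, adj_score - SF * adv_score)
        if adj_score > 0:
            return max(-1.0, adj_score + SF * adv_score)
        return max(-1.0, adj_score - SF * adv_score)

    # Example with made-up scores: 'very good', score(good) = 0.75, score(very) = 0.5
    # -> 0.75 + 0.35 * 0.5 = 0.925
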

IV. DATASETS AND PERFORMANCE MEASURES

We aimed to evaluate the performance of the different variants of the SentiWordNet approach for sentiment classification of explicit review texts (movie reviews here) and of blog posts. We have used the following four datasets.

A. Collecting Datasets

We wanted to evaluate our implementations on both explicit review texts and blog data. For this, we used two existing standard movie review datasets obtained from the Cornell sentiment polarity data [19]. We downloaded polarity dataset v1.0 (referred to as dataset 1)
and v2.0 (referred to as dataset 2). Dataset 1 comprises 700 positive and 700 negative processed reviews, whereas dataset 2 comprises 1000 positive and 1000 negative processed reviews. Our other data source was blogging data. We used blog posts on the Libyan revolution and the Tunisian revolution (referred to as dataset 3 and dataset 4), originally collected for analytical work in [20]. In dataset 3 and dataset 4, we filtered out blog posts not in English and sentiment labeled the remaining posts using Alchemy [21]. A summary of the four datasets collected is given in Table I.

TABLE I. DETAILS OF DATASETS USED

Dataset                              No. of reviews/blog posts    Avg. length (in words)
700+700 Movie Reviews                1400                         655
1000+1000 Movie Reviews              2000                         656
Blog posts on Libyan Revolution      1486                         1130
Blog posts on Tunisian Revolution    807                          1171

B. Performance Metrics Computed

In order to evaluate the accuracy and performance of the different variants of the SentiWordNet based approaches, we computed the standard performance metrics of Accuracy, F-measure and Entropy. Accuracy is measured as a percentage, whereas the F-measure and Entropy values range from 0 to 1. The best Accuracy value is 100%, the best F-measure value is 1, and the best Entropy value is 0.
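As a point of reference, the sketch below shows how Accuracy and F-measure could be computed from gold and predicted labels with scikit-learn; it is illustrative only, and the exact Entropy formulation used for the tables that follow is not reproduced here.

    # Illustrative metric computation; the label names and the macro-averaged F-measure
    # are assumptions of this sketch, and the Entropy metric is not reproduced.
    from sklearn.metrics import accuracy_score, f1_score

    def report_metrics(gold_labels, predicted_labels):
        accuracy = accuracy_score(gold_labels, predicted_labels)             # fraction of correct labels
        f_measure = f1_score(gold_labels, predicted_labels, average="macro")
        print(f"Accuracy: {accuracy:.1%}  F-measure: {f_measure:.4f}")

    # Example: report_metrics(["pos", "neg", "pos", "neg"], ["pos", "neg", "neg", "neg"])
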

V. RESULTS

We computed the results of the four SentiWordNet based approaches for the two movie review and the two blog post datasets. We refer to these implementations as SWN-1 (the first scheme, simple aggregation of adjective SentiWordNet scores), SWN-2 (the second scheme, aggregation of the SentiWordNet scores of adv+adj combinations), SWN (VS), the Variable Scoring method, and SWN (APS), the Adjective Priority Scoring scheme. We have also compared the results for the movie review datasets with the NB and SVM based machine learning classifiers. Tables II, III, IV and V present the performance measures computed for dataset 1 to dataset 4, respectively.

TABLE II. COMPUTED SCORES ON DATASET 1

Method       Performance measure    Value
SWN-1        Accuracy               64.3%
             F-measure              0.6399407
             Entropy                0.28240487
SWN-2        Accuracy               63.2%
             F-measure              0.6268331
             Entropy                0.28468996
SWN (VS)     Accuracy               64.9%
             F-measure              0.6478626
             Entropy                0.28102395
SWN (APS)    Accuracy               65.3%
             F-measure              0.6510046
             Entropy                0.2799192
NB           Accuracy               81.07%
             F-measure              0.811
SVM          Accuracy               76.78%
             F-measure              0.768

TABLE III. COMPUTED SCORES ON DATASET 2

Method       Performance measure    Value
SWN-1        Accuracy               65.6%
             F-measure              0.6523113
             Entropy                0.27848914
SWN-2        Accuracy               63.2%
             F-measure              0.62643987
             Entropy                0.28467706
SWN (VS)     Accuracy               65.2%
             F-measure              0.64960706
             Entropy                0.2802905
SWN (APS)    Accuracy               65.9%
             F-measure              0.65685844
             Entropy                0.27805692
NB           Accuracy               82.9%
             F-measure              0.829
SVM          Accuracy               77.95%
             F-measure              0.779

TABLE IV. COMPUTED SCORES ON DATASET 3

Method       Performance measure    Value
SWN-1        Accuracy               63.7%
             F-measure              0.65272033
             Entropy                0.24667716
SWN-2        Accuracy               61.6%
             F-measure              0.6335779
             Entropy                0.25067705
SWN (VS)     Accuracy               63.1%
             F-measure              0.6469193
             Entropy                0.2483084
SWN (APS)    Accuracy               64.3%
             F-measure              0.6444416
             Entropy                0.24841921

TABLE V. COMPUTED SCORES ON DATASET 4

Method       Performance measure    Value
SWN-1        Accuracy               64.4%
             F-measure              0.6478957
             Entropy                0.26944947
SWN-2        Accuracy               62.1%
             F-measure              0.62462026
             Entropy                0.27585214
SWN (VS)     Accuracy               64.8%
             F-measure              0.6516693
             Entropy                0.2686239
SWN (APS)    Accuracy               65.1%
             F-measure              0.65399694
             Entropy                0.26749235

We have also computed the percentage of reviews and blog posts assigned to the 'positive' and 'negative' classes. Table VI presents a detailed summary of the assignment percentages. Table VII presents detailed statistics of how many positive texts were classified as 'positive' and how many negative texts were classified as 'negative', for all four datasets and all four schemes implemented. The Accuracy, F-measure and Entropy results obtained by the four different SentiWordNet approaches for both the movie review datasets and the blog post datasets are illustrated in Figures 1 to 3, with the datasets marked on the x-axis and the values obtained for Accuracy, F-measure and Entropy marked on the y-axis. Figure 4 presents an example of the computed sentiment strength for the 700 positive reviews of dataset 1.
TABLE VI. TOTAL PERCENTAGE OF 'POSITIVE' AND 'NEGATIVE' LABELS ASSIGNED BY ALL FOUR METHODS

Method       Label    Dataset1    Dataset2    Dataset3    Dataset4
SWN-1        POS      73.3        75.9        67.4        73
             NEG      55.3        55.3        62.1        58.9
SWN-2        POS      75.1        75.4        65.5        69.8
             NEG      51.3        51          60          57.1
SWN (VS)     POS      71.3        72.5        66.2        73
             NEG      58.6        57.8        61.8        59.5
SWN (APS)    POS      72.6        73.8        66.7        74
             NEG      58          58          61.2        59.3

TABLE VII. RESULTS OF THE FOUR DIFFERENT VARIANTS OF SENTIWORDNET IMPLEMENTATIONS ON DATASET 1 TO DATASET 4

Method       Dataset     Classified Positive    Classified Negative
SWN-1        Dataset1    513/700                387/700
             Dataset2    759/1000               553/1000
             Dataset3    293/435                653/1051
             Dataset4    232/318                288/489
SWN-2        Dataset1    526/700                359/700
             Dataset2    754/1000               510/1000
             Dataset3    285/435                631/1051
             Dataset4    222/318                279/489
SWN (VS)     Dataset1    499/700                410/700
             Dataset2    725/1000               578/1000
             Dataset3    288/435                649/1051
             Dataset4    232/318                291/489
SWN (APS)    Dataset1    508/700                406/700
             Dataset2    738/1000               580/1000
             Dataset3    290/435                643/1051
             Dataset4    235/318                290/489

Fig. 1. Accuracy values for the four SentiWordNet schemes.
Fig. 2. F-measure values for the four SentiWordNet schemes.
Fig. 3. Entropy values for the four SentiWordNet schemes.
Fig. 4. Sentiment strength for the positive reviews of dataset 1, using SentiWordNet scheme 1.

VI. OBSERVATIONS AND CONCLUSION

We performed a detailed evaluation of the performance of the four SentiWordNet schemes on four different textual datasets (two of them being explicit reviews and the other two having latent opinionated content). We also checked the
accuracy level of these four schemes against our earlier machine learning based sentiment classifier implementations. We observe that for movie reviews none of the SentiWordNet based approaches is able to match the performance of the NB and SVM machine learning classifiers. A primary reason for this is that the texts in movie reviews use similar term occurrence patterns, and therefore NB and SVM are able to achieve a higher level of performance. Out of the four SentiWordNet implementations, SentiWordNet (Adjective Priority Scoring) obtains the best results on all four datasets. SentiWordNet (Variable Scoring) is the next best in performance, followed by the SentiWordNet (Adjective only) scheme. We conclude that adjectives constitute the most important linguistic feature for sentiment analysis. However, an adjective-only scheme is less effective than using the 'Adverb+Adjective' combination. An important issue, however, is that adverbs and adjectives should not be given equal weightage in computing the sentiment polarity of a text; rather, adjective scores should be modified by adverb scores, but only to a limited extent and using an appropriate weightage/scaling factor.

The analytical results obtained thus present a detailed evaluative account of the performance of the four SentiWordNet based implementations. The datasets comprised both explicit reviews (datasets 1 and 2) and latent opinionated text (datasets 3 and 4). We observe that the SentiWordNet based implementations achieve almost comparable performance values on the blog data as well. This is in contrast to the popularly held belief that sentiment analysis of blog posts (which do not have an explicit review tone) is more difficult than that of explicit reviews. Further, work on sentiment analysis using machine learning based classifier approaches shows that they do not achieve the same level of performance on blog posts as they do on explicit review texts. The SentiWordNet approach has the advantage of being capable of achieving good performance levels on both kinds of data. Moreover, it requires no training data and no prior training, in contrast to the machine learning based classifiers. We plan to explore some more linguistic features, particularly 'verbs' in an appropriate combination with adjectives and adverbs, to see if we can improve the performance levels further. Another variation that may be explored is to use bigger n-grams (than the 3-gram used here) to achieve better performance levels with reviews written in relatively informal and grammatically less accurate language.

This experimental work helps in arriving at two important conclusions: (a) the SentiWordNet based approaches can obtain the same level of performance for both explicit reviews and relatively less explicit, latent opinionated texts; and (b) adjectives are no doubt the most important linguistic feature to exploit for sentiment analysis, but adverbs, which modify the adjectives, improve the performance levels further if they are combined with the adjective scores with an appropriate weightage. The ease of implementation of SentiWordNet not only allows us to perform sentiment analysis, but also makes a very reasonable case for using it as an added level of filtering for movie recommendations (to further refine the results of item-based collaborative algorithms) on the go, as it requires no prior training.

REFERENCES

[1] V. K. Singh, D. Gautam, R. R. Singh and A. K. Gupta, "Agent-based Computational Modeling of Emergent Collective Intelligence", Proceedings of the International Conference on Computational Collective Intelligence, LNAI 5796, pp. 240-251, Springer-Verlag, 2009.
[2] V. K. Singh, M. Mukherjee and G. K. Mehta, "Combining Collaborative Filtering and Sentiment Analysis for Improved Movie Recommendations", in C. Sombattheera et al. (Eds.): Multi-disciplinary Trends in Artificial Intelligence, LNAI 7080, pp. 38-50, Springer-Verlag, Berlin-Heidelberg, 2011.
[3] V. K. Singh, M. Mukherjee and G. K. Mehta, "Combining a Content Filtering Heuristic and Sentiment Analysis for Movie Recommendations", in K. R. Venugopal and L. M. Patnaik (Eds.): ICIP 2011, CCIS 157, pp. 659-664, Springer, Heidelberg, Aug. 2011.
[4] K. Dave, S. Lawrence and D. Pennock, "Mining the Peanut Gallery: Opinion Extraction and Semantic Classification of Product Reviews", Proceedings of the 12th International World Wide Web Conference, pp. 519-528, 2003.
[5] P. Turney, "Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews", Proceedings of ACL-02, 40th Annual Meeting of the Association for Computational Linguistics, pp. 417-424, Philadelphia, US, 2002.
[6] A. Esuli and F. Sebastiani, "Determining the Semantic Orientation of Terms through Gloss Analysis", Proceedings of CIKM-05, 14th ACM International Conference on Information and Knowledge Management, pp. 617-624, Bremen, DE, 2005.
[7] B. Pang, L. Lee and S. Vaithyanathan, "Thumbs up? Sentiment Classification Using Machine Learning Techniques", Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 79-86, Philadelphia, US, 2002.
[8] S. M. Kim and E. Hovy, "Determining Sentiment of Opinions", Proceedings of the COLING Conference, Geneva, 2004.
[9] K. T. Durant and M. D. Smith, "Mining Sentiment Classification from Political Web Logs", Proceedings of WEBKDD, ACM, 2006.
[10] F. Sebastiani, "Machine Learning in Automated Text Categorization", ACM Computing Surveys, 34(1): 1-47, 2002.
[11] P. Turney and M. L. Littman, "Unsupervised Learning of Semantic Orientation from a Hundred-Billion-Word Corpus", NRC Publications Archive, 2002.
[12] V. K. Singh, M. Mukherjee and G. K. Mehta, "Sentiment and Mood Analysis of Weblogs using a POS Tagging Based Approach", in S. Aluru et al. (Eds.): IC3 2011, CCIS 168, pp. 313-324, Springer-Verlag, Berlin-Heidelberg, 2011.
[13] B. Liu, "Web Data Mining: Exploring Hyperlinks, Contents and Usage Data", Springer-Verlag, Berlin-Heidelberg, pp. 411-416, 2002.
[14] C. D. Manning, P. Raghavan and H. Schütze, "Introduction to Information Retrieval", Cambridge University Press, New York, USA, 2008.
[15] Weka Data Mining Software in Java, http://www.cs.waikato.ac.nz/ml/weka/
[16] P. Waila, Marisha, V. K. Singh and M. K. Singh, "Evaluating Machine Learning and Unsupervised Semantic Orientation Approaches for Sentiment Analysis of Textual Reviews", Proceedings of the International Conference on Computational Intelligence and Computing Research, Coimbatore, India, 2012.
[17] SentiWordNet, available at http://www.sentiwordnet.isti.cnr.it
[18] F. Benamara, C. Cesarano and D. Reforgiato, "Sentiment Analysis: Adjectives and Adverbs are Better than Adjectives Alone", Proceedings of ICWSM 2006, CO, USA, 2006.
[19] Cornell movie review data, http://www.cs.cornell.edu/people/pabo/movie-review-data/
[20] D. Mahata and N. Agarwal, "What Does Everybody Know? Identifying Event-specific Sources from Social Media", Proceedings of the Fourth International Conference on Computational Aspects of Social Networks, Sao Carlos, Brazil, 2012.
[21] Alchemy API, retrieved from www.alchemyapi.com on 15 Dec. 2012.