Prediction of Election Result by Enhanced Sentiment Analysis on Twitter Data
Abstract—Sentiment analysis is the computational study of opinions, sentiments, evaluations, attitudes, views and emotions expressed in text. It is treated as a classification problem in which the main focus is to predict the polarity of words and then classify them into positive or negative sentiment. Sentiment analysis over Twitter offers people a fast and effective way to measure the public’s feelings towards their party and politicians. The primary issue in previous sentiment analysis techniques is determining the most appropriate classifier for a given classification problem. If a single classifier is chosen from the available classifiers, there is no guarantee that it will perform best on unseen data. To reduce the risk of selecting an inappropriate classifier, we combine the outputs of a set of classifiers. In this paper, we use an approach that automatically classifies the sentiment of tweets by combining machine learning classifiers with a lexicon-based classifier. The combination consists of a SentiWordNet classifier, a naive Bayes classifier and a hidden Markov model classifier. The positivity or negativity of each tweet is determined by applying the majority voting principle to the results of these three classifiers. We used this sentiment classifier to find political sentiment in real-time tweets and obtained improved accuracy in sentiment analysis using the classifier ensemble approach. Our method also uses negation handling and word sense disambiguation to achieve high accuracy.

Index Terms—Sentiment Analysis; WordNet; SentiWordNet; Word Sense Disambiguation; Machine Learning Methods; Ensemble Approach.

I. INTRODUCTION

Twitter is currently becoming one of the most popular micro-blogging platforms. Millions of users can share their thoughts and opinions about different events and people on the platform. Twitter is therefore considered a rich source of information for sentiment analysis.

Sentiment analysis can be considered as the use of natural language processing, text analysis and computational linguistics to identify and extract sentiment information in source materials. Generally, sentiment analysis aims to find the attitude of a writer with respect to some relevant topic, or the overall contextual polarity of a document. The main task in sentiment analysis is classifying the polarity of a given text at the document, sentence, or feature level [1], that is, deciding whether the opinion expressed in a document, a sentence or a feature is positive, negative, or neutral. The accuracy of a sentiment analysis is based on how well it agrees with human judgements. This can be measured using precision and recall [2].

In this paper, we introduce an accurate sentiment classifier that combines machine learning classifiers with a lexicon-based classifier to find political sentiment and sentiment towards newly released movies in real-time tweets. Section 2 contains a detailed study of the method. Section 3 includes implementation details and results. Section 4 is the conclusion.

II. METHODOLOGY

The system mainly deals with tweet extraction and sentiment classification. We use a sentiment classifier that combines machine learning classifiers with a lexicon-based classifier. The classifiers used are a SentiWordNet classifier, a naive Bayes classifier and a hidden Markov model classifier. The proposed system consists of six main modules:
1) Data acquisition
2) Pre-processing
3) Sentiment Classification using SentiWordNet
4) Sentiment Classification using Naive Bayes
5) Sentiment Classification using HMM
6) Sentiment Classification using ensemble Approach.

A block diagram of the proposed architecture for sentiment classification is shown in Figure 1. This classifier is used to find political sentiment from extracted tweets in real time.

... the extracted tweet. It then removes all private usernames, identified by @user. It then removes all hashtags, identified by the # symbol, and all special characters. The refined tweets are then classified using the classification scheme.

Negation handling is one of the factors that contributed significantly to the accuracy of our classifiers. A major problem during sentiment classification is the handling of negation. Since we use each word as a feature, the word “win” in the phrase “not win” contributes to positive sentiment rather than negative sentiment, which leads to classification errors. This type of error arises because the presence of “not” is not taken into account. To solve this problem, we applied a simple algorithm for handling negations using state variables and bootstrapping. We built on the idea of using an alternate representation of negated forms [4]. The algorithm stores the negation state in a state variable. It transforms a word that follows “n't” or “not” into the form “not ” + word. Whenever the negation state variable is set, the words read are treated as “not ” + word. When a punctuation mark is encountered, or when there is a double negation, the state variable is reset. We applied negation handling to each of our three classifiers separately for accurate classification.
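
The negation rule described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work. The tokenizer regular expression, the underscore in the "not_" prefix, the choice to drop the negation words themselves from the feature list, and the function names mark_negations and is_negation are all assumptions made for illustration.

```python
import re

# Assumed tokenizer: lowercase words (optionally with an apostrophe part,
# e.g. "didn't") or one of the punctuation marks that end a negation scope.
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?|[.,;:!?]")
PUNCTUATION = set(".,;:!?")


def is_negation(token):
    """True for the triggers named in the text: "not" and words ending in "n't"."""
    return token == "not" or token.endswith("n't")


def mark_negations(text):
    """Return word features, prefixing words inside a negated scope with "not_"."""
    # Normalise curly apostrophes so that "didn’t" is recognised as "didn't".
    tokens = TOKEN_RE.findall(text.lower().replace("\u2019", "'"))
    features = []
    negated = False  # the negation state variable
    for token in tokens:
        if token in PUNCTUATION:
            negated = False        # a punctuation mark resets the state
        elif is_negation(token):
            negated = not negated  # a second negation cancels the first
        elif negated:
            features.append("not_" + token)
        else:
            features.append(token)
    return features


print(mark_negations("They will not win the seat, but they campaign well."))
# ['they', 'will', 'not_win', 'not_the', 'not_seat', 'but', 'they', 'campaign', 'well']
```

In this sketch the word “win” becomes the separate feature not_win, so a word-based classifier no longer counts it towards positive sentiment. The resulting feature list would be passed to each of the three classifiers in turn.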
Fig. 2. Political sentiment analysis from tweets using classifier ensemble approach

From this graph, it is clear that public sentiment towards Arvind Kejriwal is higher than that towards Kiran Bedi.

We extracted tweets about two newly released movies, namely Baahubali and Bajrangi Bhaijaan, for 3 weeks. We then applied sentiment analysis to these tweets using the classifier ensemble approach. The resulting graph is shown in Figure 3.

Fig. 3. Sentiment analysis on new released movies using classifier ensemble approach

From this graph, it is clear that public sentiment towards the film ’Baahubali’ is higher than that towards ’Bajrangi Bhaijaan’ up to the end of the first week. During the second and third weeks, ’Bajrangi Bhaijaan’ has higher public sentiment than ’Baahubali’. The accuracy improvement of this sentiment analysis method is shown in Table 1. In terms of accuracy, classification using the ensemble approach is observed to be more accurate than the individual classifiers.

IV. CONCLUSION

In this work, we have implemented a real-time Twitter sentiment analyser using a classifier ensemble approach. It combines machine learning classifiers with a lexicon-based classifier. We thus take advantage of three classifiers, namely a SentiWordNet classifier, a naive Bayes classifier and a hidden Markov model classifier, for accurate classification of political data. We developed a novel, accurate sentiment classifier for finding political sentiment in real-time tweets. We then compared political sentiment towards two politicians by plotting graphs of the results of sentiment analysis on the extracted Twitter data. The classifier was also used to compare two newly released movies through sentiment analysis of Twitter data. Finally, we analysed the improvement in the accuracy of our classifier compared to the individual ones.

REFERENCES

[1] N. D. Valakunde, M. S. Patwardhan, Multi-Aspect and Multi-Class Based Document Sentiment Analysis of Educational Data Catering Accreditation Process, IEEE International Conference on Cloud and Ubiquitous Computing and Emerging Technologies, 2013.
[2] Bing Liu, Sentiment Analysis and Opinion Mining, Morgan and Claypool Publishers, May 2012.
[3] A. Bifet, G. Holmes, B. Pfahringer, MOA-TweetReader: real-time analysis in twitter streaming data, LNCS 6926, Springer-Verlag, Berlin Heidelberg, 2011, pp. 46-60.
[4] Vivek Narayanan, Ishan Arora, Arjun Bhatia, Fast and accurate sentiment classification using an enhanced Naive Bayes model, 2012.
[5] Pulkit Kathuria, Kiyoaki Shirai, Example Based Word Sense Disambiguation towards Reading Assistant System, The Association for Natural Language Processing, 2012.
[6] Farhan Hassan Khan, Saba Bashir, Usman Qamar, TOM: Twitter opinion mining framework using hybrid classification scheme, Decision Support Systems 57 (2014) 245-257, 2013 Elsevier B.V.