Sentiment Analysis Data
Identifying sarcasm helps improve sentiment analysis on microblogging websites
such as Twitter. Sentiment analysis and opinion mining rely on emotional words in a text to detect its polarity
(i.e., whether it deals “positively” or “negatively” with its theme). However, the surface of the text can be
misleading, typically when the text is sarcastic. On Twitter, such sarcastic texts are very
common. “All your products are incredibly amazing!!!” might be taken as a compliment. However,
in the following tweet, “Did I say incredibly?? Well, it’s true, nobody would believe that. They break the
second day you buy them -_-”, the user explicitly
explains that he did not mean what he said. Although some users indicate that they are being sarcastic, most
do not. It may therefore be essential to find a way to detect sarcastic messages automatically.
In their work, Rajadesingan et al. [12] highlighted the limitations of some state-of-the-art sentiment analysis
tools when more sophisticated forms of speech such as sarcasm are present. They explained
why sarcasm is hard to detect even for humans, and showed how the nature of tweets makes it even more
complicated. Hence the importance of detecting sarcastic utterances on Twitter.
However, several challenges make the task complicated. Joshi et al. [13] highlighted three main challenges:
i) the identification of common knowledge, ii) the intent to ridicule, and iii) the speaker-listener (or
reader, in the case of written text) context.
On a related note, even though Brown [4] stated that sarcasm “is not a discrete logical or linguistic
phenomenon”, works such as [8] and [9] proposed identifying sarcastic writing patterns to decide whether
or not an utterance is sarcastic. During our experiments, as well as while manually annotating tweets, we noticed
that such patterns exist, in particular among non-native speakers of English. Therefore, we focus on detecting and
collecting such patterns from a manually annotated dataset, and we quantify them so that we can judge whether
or not a given tweet is sarcastic by comparing the patterns extracted from it against the collected ones.
In this work, we present a pattern-based framework for sarcasm detection that is
relatively easy to implement and whose performance is competitive with that of more complex
ones.
Introduction:
Sentiment analysis, or opinion mining, a subset of data mining, provides a vehicle by which thousands
of opinions can be analysed. This non-trivial technology has been used successfully as a business intelligence tool
by many businesses in areas such as customer relationship management, targeted marketing, political campaigns,
mass movements, disaster and crisis response, and news reporting (Gundecha & Liu, 2012). The success of this
technology has informed its application to the relationship between governments and their citizens, using
a similar approach to support better decision making by governments.
This research therefore seeks to explore the application of sentiment analysis to the relatively new dimension of
government sentiment analysis. The research work involves finding how governments can put in place a system
to mine data from social media. The focus of the research will be to build a model that can be used to analyse
sentiments of citizens concerning government policies, programs and projects.
Motivation:
Research indicates that sentiment analysis presents a much more complex challenge than traditional topic
modelling (Pang, Lee, & Vaithyanathan, 2002). This is despite the fact that sentiment analysis
classifies text into only three main classes, while topic modelling can involve an arbitrary number (n) of topics
(Pang & Lee, 2008).
Sentiment classification classifies an opinion document (e.g., a product review) as expressing a
positive, negative, or neutral sentiment. The task is also commonly known as document-level
sentiment classification because it considers the whole document as the basic information unit
(Liu & Zhang, 2012).
The main reason why sentiment analysis is more difficult than topic-based classification is that
topic-based classification can be done with the use of keywords, while this does not work well in
sentiment analysis (Turney, 2002).
Other reasons that make sentiment analysis difficult include the difficulty of determining
whether a given text is objective or subjective (there is always a thin line between the two). It is
also difficult to determine the opinion holder. Sentiment can be expressed in subtle ways without
any ostensible use of negative words. For example, “how could anyone sit through this movie?” contains
no single word that is obviously negative, yet it would be classified as a negative review of
a movie. Thus sentiment analysis requires more understanding than the usual topic-based classification.
Other factors include dependency on the domain and on other words (Pang & Lee, 2008), as well as opinions
expressed with sarcasm, irony, and negation.
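
The limitation of keyword matching described above can be illustrated with a minimal sketch. The word lists and
the function name keyword_polarity are hypothetical and chosen only for illustration; they are not taken from
any of the cited works.

```python
import re

# Hypothetical, tiny keyword lists chosen only for illustration.
POSITIVE = {"good", "great", "amazing", "nice", "excellent"}
NEGATIVE = {"bad", "terrible", "awful", "boring", "poor"}


def keyword_polarity(text: str) -> int:
    """Return (#positive keywords - #negative keywords) found in text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positives = sum(t in POSITIVE for t in tokens)
    negatives = sum(t in NEGATIVE for t in tokens)
    return positives - negatives


# An explicitly negative review is caught by the keyword lists.
print(keyword_polarity("The plot was boring and the acting was terrible"))  # -2
# The subtle example from the text contains no listed keyword, so it scores 0.
print(keyword_polarity("How could anyone sit through this movie?"))  # 0
```

The second sentence receives a neutral score even though a human reader would consider it clearly negative, which
is the kind of case where keyword counting alone is insufficient.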
Objective
The objectives of sentiment analysis as described by Liu & Zhang (2012) include the following:
entity extraction and grouping, aspect extraction and grouping, opinion holder and time
extraction, aspect sentiment classification, and opinion quintuple generation. A widely researched
task is sentiment or opinion detection, which is viewed as the classification of text as objective or
subjective. Opinion detection is usually based on the examination of adjectives in sentences. For
example, the polarity of the sentence “this is a nice car” can be determined easily by looking at
the adjective (Hatzivassiloglou & Wiebe, 2000). They also examined the effects of adjectives on
sentiment subjectivity. Later studies (Benamara, Cesarano, Picariello, Reforgiato, &
Subrahmanian, 2007) have shown that adverbs may be used for a similar purpose.
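
As a rough sketch of the adjective-based idea, the following code scores a sentence from the adjectives it
contains and lets a preceding adverb scale the adjective's contribution. The lexicon entries, the adverb
weights, the multiplicative weighting, and the name adjective_score are all assumptions made for illustration;
this is not the scoring scheme of the cited studies.

```python
import re

# Hypothetical prior polarities for a few adjectives (illustration only).
ADJECTIVE_POLARITY = {"nice": 1.0, "good": 1.0, "bad": -1.0, "ugly": -1.0}
# Hypothetical adverb weights that scale the adjective that follows them.
ADVERB_WEIGHT = {"very": 1.5, "extremely": 2.0, "slightly": 0.5}


def adjective_score(sentence: str) -> float:
    """Sum adjective polarities, each scaled by the adverb directly before it."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    score = 0.0
    for i, token in enumerate(tokens):
        if token in ADJECTIVE_POLARITY:
            previous = tokens[i - 1] if i > 0 else ""
            weight = ADVERB_WEIGHT.get(previous, 1.0)
            score += weight * ADJECTIVE_POLARITY[token]
    return score


print(adjective_score("this is a nice car"))          # 1.0
print(adjective_score("this is a very nice car"))     # 1.5
print(adjective_score("this is a slightly bad car"))  # -0.5
```

A positive score suggests a positive sentence and a negative score a negative one; a real system would need
part-of-speech tagging and a much larger lexicon instead of the fixed word lists assumed here.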
The second task is polarity classification. Given an opinionated piece of text, the goal is to classify
the opinion as belonging to one of two opposing sentiment polarities, or locate its position on the
continuum between these two polarities (Pang & Lee, 2008).
Literature review:
Research indicates that sentiment analysis presents a much more complex challenge
than traditional topic modelling (Pang, Lee, & Vaithyanathan, 2002). This is despite the fact that
sentiment analysis classifies text into only three main classes, while topic modelling can involve an
arbitrary number (n) of topics (Pang & Lee, 2008).