Sarcastic Sentiment Detection in Tweets Streamed in Real Time: A Big Data Approach - 2016 - Digital Communications and Networks
Article history:
Received 20 February 2016
Received in revised form 16 May 2016
Accepted 15 June 2016
Available online 12 July 2016

Keywords: Big data, Flume, Hadoop, Hive, MapReduce, Sarcasm, Sentiment, Tweets

Abstract

Sarcasm is a type of sentiment where people express their negative feelings using positive or intensified positive words in the text. While speaking, people often use heavy tonal stress and certain gestural clues, like rolling of the eyes or hand movements, to reveal sarcasm. In textual data, these tonal and gestural clues are missing, making sarcasm detection very difficult for an average human. Due to these challenges, researchers have shown interest in sarcasm detection in social media text, especially in tweets. The rapid growth of tweets in volume, and the need to analyze them, pose major challenges. In this paper, we propose a Hadoop-based framework that captures real-time tweets and processes them with a set of algorithms that identify sarcastic sentiment effectively. We observe that the elapsed time for analyzing and processing under the Hadoop-based framework significantly outperforms conventional methods, making it better suited for real-time streaming tweets.

© 2016 Chongqing University of Posts and Telecommunications. Production and Hosting by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Corresponding author. E-mail addresses: [email protected] (S.K. Bharti), [email protected] (B. Vachha), [email protected] (R.K. Pradhan), [email protected] (K.S. Babu), [email protected] (S.K. Jena).
Peer review under responsibility of Chongqing University of Posts and Telecommunications.
http://dx.doi.org/10.1016/j.dcan.2016.06.002

1. Introduction

With the advent of smart mobile devices and the high-speed Internet, users are able to engage with social media services like Facebook, Twitter, Instagram, etc. The volume of social data being generated is growing rapidly. Statistics from GlobalWebIndex show a 17% yearly increase in mobile users, with the total number of unique mobile users reaching 3.7 billion people [1]. Social networking websites have become a well-established platform for users to express their feelings and opinions on various topics, such as events, individuals or products. Social media channels have become a popular platform to discuss ideas and to interact with people worldwide. For instance, Facebook claims to have 1.59 billion monthly active users, each one being a friend with 130 people on average [2]. Similarly, Twitter claims to have more than 500 million users, out of which more than 332 million are active [1]. Users post more than 340 million tweets and 1.6 billion search queries every day [1].

With such large volumes of data being generated, a number of challenges are posed. Some of them are accessing, storing, processing, verification of data sources, dealing with misinformation, and fusing various types of data [3]. Moreover, almost 80% of the generated data is unstructured [4]. As technology developed, people were given more and more ways to interact, from simple text messaging and message boards to more engaging and engrossing channels such as images and videos. These days, social media channels are usually the first to get feedback about current events and trends from their user base, allowing them to provide companies with invaluable data that can be used to position their products in the market as well as gather rapid feedback from customers.

When an event commences or a product is launched, people start tweeting, writing reviews, posting comments, etc. on social media. People turn to social media platforms to read reviews from other users about a product before they decide whether to purchase it or not. Organizations also depend on these sites to know the response of users to their products and subsequently use the feedback to improve them. However, finding and verifying the legitimacy of opinions or reviews is a formidable task. It is difficult to manually read through all the reviews and determine which of the opinions expressed are sarcastic. In addition, the common reader will have difficulty recognizing sarcasm in tweets or product reviews, which may end up misleading them.

A tweet or a review may not state the exact orientation of the user directly, i.e., it may be sarcastically expressed. Sarcasm is a kind of sentiment which acts as an interfering factor in any text
S.K. Bharti et al. / Digital Communications and Networks 2 (2016) 108–121

that can flip the polarity [5]. For example, consider 'I love being ignored #sarcasm'. Here, "love" expresses a positive sentiment in a negative context. Therefore, the tweet is classified as sarcastic. Unlike a simple negation, sarcastic tweets contain positive words or even intensified positive words to convey a negative opinion, or vice versa. This creates a need for the large volumes of reviews, tweets or feedback messages to be analyzed rapidly to predict their exact orientation. Moreover, each tweet may have to pass through a set of algorithms to be accurately classified.

In this paper, we propose a Hadoop-based framework [6] that allows the user to acquire and store tweets in a distributed environment [7] and process them for detecting sarcastic content in real time using the MapReduce [8] programming model. The mapper class works as a partitioner: it divides a large volume of tweets into small chunks and distributes them among the nodes in the Hadoop cluster. The reducer class works as a combiner: it collects the processed tweets from each node in the cluster and assembles them to produce the final output. Apache Flume [9,10] is used for capturing tweets in real time, as it is highly reliable, distributed and configurable. Flume uses an elegant design to make data loading easy and efficient from several sources into the Hadoop Distributed File System (HDFS) [11]. For processing the tweets stored in the HDFS, we use Apache Hive [12]. It provides an SQL-like language called HiveQL that converts queries into mapper and reducer classes [12]. Further, we use natural language processing (NLP) techniques such as POS tagging [13], parsing [14], text mining [15,16] and sentiment analysis [17] to identify sarcasm in the processed tweets.

This paper compares the time requirements of our approach when run on a standard non-Hadoop implementation as well as on a Hadoop deployment, to quantify the performance improvement gained by using Hadoop. For real-time applications where millions of tweets need to be processed as fast as possible, we observe that the time taken by the single-node approach grows much faster than that of the Hadoop implementation. This suggests that for higher volumes of data it is more advantageous to use the proposed deployment for sarcasm analysis.

The contributions of this paper are as follows:

1. Capturing and processing real-time tweets using Apache Flume and Hive under the Hadoop framework.
2. We propose a set of algorithms to detect sarcasm in tweets under the Hadoop framework.
3. We propose another set of algorithms to detect sarcasm in tweets using conventional (non-Hadoop) methods.

The rest of this paper is organized as follows. Section 2 presents related work on capturing and processing data acquired through the Twitter streaming API, followed by sarcasm analysis of the captured data. Section 3 explains the preliminaries of this research. The proposed scheme is described in Section 4. Section 5 presents the performance analysis of the proposed schemes. Finally, conclusions and recommendations for future work are drawn in Section 6.

2. Related work

2.1. Capturing and processing tweets

Processing of such large data sets becomes a complex problem. Twitter is one such social networking platform that generates data continuously. In the existing literature, most researchers used Tweepy (an easy-to-use Python library for accessing the Twitter API) and Twitter4J (a Java library for accessing the Twitter API) for the aggregation of tweets from Twitter [5,18–22]. The Twitter Application Programming Interface (API) [23] provides a streaming API [24] to allow developers to obtain real-time access to tweets. Bifet and Frank [25] discuss the challenges of capturing Twitter data streams. Tufekci [26] examined the methodological and conceptual challenges for social media based big data operations, with special attention to the validity and representativeness of big data analysis of social media. Due to some restrictions placed by Twitter on the use of their retrieval APIs, one can only download a limited amount of tweets in a specified time frame using these APIs and libraries. Getting a larger amount of tweets in real time is a challenging task. There is a need for efficient techniques to acquire a large amount of tweets from Twitter. Researchers are evaluating the feasibility of using the Hadoop ecosystem [6] for the storage and processing [22,27–29] of large amounts of tweets from Twitter. Shirahatti et al. [27] used Apache Flume [10] with the Hadoop ecosystem to collect tweets from Twitter. Ha et al. [22] used Topsy with the Hadoop ecosystem for gathering tweets from Twitter; furthermore, they analyzed the sentiment and emotion information of the collected tweets in their research. Taylor et al. [28] used the Hadoop framework in applications in the bioinformatics domain.

2.2. Sarcasm sentiment analysis

Sarcasm sentiment analysis is a rapidly growing area of NLP, with research ranging from word, phrase and sentence level classification [5,18,19,30] to document [31] and concept level classification [21]. Research is progressing in finding ways for efficient analysis of sentiments with better accuracy in written text, as well as analyzing irony, humor and sarcasm within social media data. Sarcastic sentiment detection is classified into three categories based on the text features used for classification, namely lexical, pragmatic and hyperbolic, as shown in Fig. 1.

2.2.1. Lexical feature based classification

Text properties such as unigrams, bigrams, n-grams, etc. are classified as lexical features of a text. Several authors used these features to identify sarcasm. Kreuz et al. [32] introduced this concept for the first time and observed that lexical features play a vital role in detecting irony and sarcasm in text. Kreuz et al. [33], in their subsequent work, used these lexical features along with syntactic features to detect sarcastic tweets. Davidov et al. [30] used pattern-based (high-frequency words and content words) and punctuation-based methods to build a weighted k-nearest
neighbor (kNN) classification model to perform sarcasm detection. Tsur et al. [34] observed that bigram-based features produce better results in detecting sarcasm in tweets and Amazon product reviews. González-Ibánez et al. [18] explored numerous lexical features (derived from LIWC [35] and WordNet-Affect [36]) to identify sarcasm. Riloff et al. [5] used a well-constructed lexicon-based approach to detect sarcasm, using unigram, bigram and trigram features for lexicon generation. Bharti et al. [19] considered bigrams and trigrams to generate bags of lexicons for sentiment and situation in tweets. Barbieri et al. [37] considered seven lexical features to detect sarcasm through its inner structure, such as unexpectedness, the intensity of the terms, or an imbalance between registers.

2.2.2. Pragmatic feature based classification

The use of symbolic and figurative text in tweets is frequent due to the limitation on the message length of a tweet. These symbolic and figurative texts are called pragmatic features (such as smilies, emoticons, replies, @user, etc.). This is one of the most powerful features for identifying sarcasm in tweets, and several authors have used it in their work to detect sarcasm. Pragmatic features are one of the key features used by Kreuz et al. [33] to detect sarcasm in text. Carvalho et al. [38] used pragmatic features like emoticons and special punctuation to detect irony in newspaper text data. González-Ibánez et al. [18] further explored this feature with some more parameters, like smilies and replies, and developed a sarcasm detection system using the pragmatic features of Twitter data. Tayal et al. [39] also used pragmatic features in political tweets to predict which party would win the election. Similarly, Rajadesingan et al. [40] used psychological and behavioral features of users' present and past tweets to detect sarcasm.

2.2.3. Hyperbole feature based classification

Hyperbole is another key feature often used in sarcasm detection from textual data. A hyperbolic text contains one of the text properties such as an intensifier, interjection, quotes, punctuation, etc. Previous authors used these hyperbole features and achieved good accuracy in detecting sarcasm in tweets. Utsumi [41] discussed extreme adjectives and adverbs and how the presence of these two intensifies the text; most often, this provides an implicit way to display negative attitudes, i.e., sarcasm. Kreuz et al. [33] discussed other hyperbolic terms, such as interjections and punctuation, and showed how hyperbole is useful in sarcasm detection. Filatova [31] used hyperbole features in document-level text. According to her, the phrase or sentence level is not sufficient for good accuracy, and she considered the textual context of the document to improve the accuracy. Liebrecht et al. [42] explained hyperbole features with example utterances: 'Fantastic weather' when it rains is identified as sarcastic with more ease than the utterance without a hyperbole ('the weather is good' when it rains). Lunando et al. [20] declared that tweets containing interjection words such as wow, aha, yay, etc. have a higher chance of being sarcastic; they developed a sarcasm detection system for Indonesian social media. Tungthamthiti et al. [21] explored concept-level knowledge using the hyperbolic words in sentences and the indirect contradiction between sentiment and situation; for example, 'raining' and 'bad weather' are conceptually the same, so if 'raining' is present in a sentence, then one can assume 'bad weather'. Bharti et al. [19] considered interjections as a hyperbole feature to detect sarcasm in tweets that start with an interjection.

Based on this classification, a consolidated summary of previous studies related to sarcasm identification is shown in Table 1. It provides the types of approaches used by previous authors (denoted as A1 and A2), the various types of sarcasm occurring in tweets (denoted as T1 through T7), the text features (denoted as F1, F2, and F3), and the datasets from different domains (denoted as D1 through D5), mostly Twitter data. The details are shown in Table 2.

From Table 1, it is observed that only Bharti et al. [19] have worked on sarcasm types T2 and T3. Lunando et al. [20] discussed that tweets with interjections are classified as sarcastic. Further, Rajadesingan et al. [40] are the only authors who worked on sarcasm type T4. Most of the researchers identified sarcasm in
Table 1
Previous studies in sarcasm detection in text.
A1 A2 T1 T2 T3 T4 T5 T6 T7 F1 F2 F3 D1 D2 D3 D4 D5
Kreuz et al.(1995) ✓ ✓ ✓ ✓ ✓ ✓
Utsumi et al. (2000) ✓ ✓ ✓ ✓ ✓
Verma et al. (2004) ✓ ✓ ✓ ✓ ✓
Bhattacharyya et al. (2004) ✓ ✓ ✓ ✓ ✓
Kreuz et al. (2007) ✓ ✓ ✓ ✓ ✓ ✓
Chaumartin et al. (2007) ✓ ✓ ✓ ✓
Carvalho et al. (2009) ✓ ✓ ✓ ✓
Tsur et al. (2010) ✓ ✓ ✓ ✓
Davidov et al. (2010) ✓ ✓ ✓ ✓ ✓ ✓
González-Ibánez (2011) ✓ ✓ ✓ ✓ ✓
Filatova et al. (2012) ✓ ✓ ✓ ✓ ✓ ✓
Riloff et al. (2013) ✓ ✓ ✓ ✓ ✓
Lunando et al. (2013) ✓ ✓ ✓ ✓ ✓
Liebrecht et al. (2013) ✓ ✓ ✓ ✓ ✓
Lukin et al. (2013) ✓ ✓ ✓ ✓ ✓
Tungthamthiti et al. (2014) ✓ ✓ ✓ ✓ ✓
Peng et al. (2014) ✓ ✓ ✓ ✓ ✓
Raquel et al. (2014) ✓ ✓ ✓ ✓ ✓
Kunneman et al. (2014) ✓ ✓ ✓ ✓ ✓ ✓ ✓
Barbieri et al. (2014) ✓ ✓ ✓ ✓
Tayal et al. (2014) ✓ ✓ ✓ ✓ ✓
Pielage et al. (2014) ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Rajadesingan et al. (2015) ✓ ✓ ✓ ✓ ✓ ✓
Bharti et al. (2015) ✓ ✓ ✓ ✓ ✓ ✓ ✓
Table 2
Types, features and domains of sarcasm detection.
Types of features
F1 Lexical – unigram, bigram, trigram, n-gram, #hashtag
F2 Pragmatic – smilies, emoticons, replies
F3 Hyperbole – Interjection, Intensifier, Punctuation Mark, Quotes
F31 Interjection – yay, oh, wow, yeah, nah, aha, etc.
F32 Intensifier – adverb, adjectives
F33 Punctuation Mark – !!!!!, ????
F34 Quotes – “ ” , ‘ ’
Types of domains
D1 Tweets of Twitter
D2 Online product reviews
D3 Website comments
D4 Google Books
D5 Online discussion forums
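As a rough illustration of how the three feature families in Table 2 can be operationalized, the sketch below extracts a few lexical (F1), pragmatic (F2) and hyperbole (F3) features from a tweet. The word lists, regular expressions and function name are illustrative assumptions, not the authors' implementation.

```python
import re

# Illustrative lists based on Table 2: F31 interjections; F32 is
# approximated with a tiny intensifier list rather than a POS tagger.
INTERJECTIONS = {"yay", "oh", "wow", "yeah", "nah", "aha"}
INTENSIFIERS = {"so", "really", "very", "totally", "absolutely"}

def extract_features(tweet):
    """Return a dict of simple lexical, pragmatic and hyperbole features."""
    words = re.findall(r"[a-z']+", tweet.lower())
    return {
        # F1 lexical: unigrams and bigrams
        "unigrams": words,
        "bigrams": list(zip(words, words[1:])),
        # F2 pragmatic: emoticons, @-replies, hashtags
        "has_emoticon": bool(re.search(r"[:;]-?[)(DP]", tweet)),
        "has_reply": "@" in tweet,
        "hashtags": re.findall(r"#(\w+)", tweet),
        # F3 hyperbole: leading interjection, intensifiers,
        # repeated punctuation (!!!! / ????), quoted phrases
        "starts_with_interjection": bool(words) and words[0] in INTERJECTIONS,
        "has_intensifier": any(w in INTENSIFIERS for w in words),
        "repeated_punct": bool(re.search(r"[!?]{2,}", tweet)),
        "has_quotes": bool(re.search(r'["\u201c].+["\u201d]', tweet)),
    }

f = extract_features("Wow, I really love waiting forever!!! #sarcasm")
```

In a real pipeline these feature values would feed a classifier; here they simply show what each row of Table 2 refers to.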
Fig. 5. Parse tree for a tweet: I love waiting forever for my doctor.

3.5. Parsing

Capturing and processing real-time tweets using Flume and Hive.
An HMM-based algorithm for POS tagging.
MapReduce functions for three approaches to detect sarcasm in tweets:
1. Parsing_based_lexicon_generation_algorithm.
2. Interjection_word_start.
3. Positive_sentiment_with_antonym_pair.
Other approaches to detect sarcasm in tweets:
1. Tweet_contradicting_universal_facts.
2. Tweet_contradicting_time_dependent_facts.
3. Likes_dislikes_contradiction.

In this paper, an HMM-based POS tagger is deployed to evaluate accurate POS tag information for the Twitter dataset, as shown in Algorithms 1 and 2. Algorithm 1 trains the system using 500,000 pre-tagged (according to the Penn Treebank style) American English words from the American National Corpus (ANC) [45,46]. Algorithm 2 evaluates the POS tag information of the words in the given dataset.

Algorithm 1. POS_training.
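A minimal sketch of the counting scheme that Algorithm 1 is described as using (dictionaries WT, TT and T, with '$' marking the sentence start), together with the Eq. (4)-style tag selection of Algorithm 2, might look as follows. The toy corpus and helper names are illustrative, not the paper's code, and no smoothing is applied.

```python
from collections import defaultdict

# WT: (word, tag) counts; TT: (prev_tag, tag) counts; T: tag counts.
WT = defaultdict(int)
TT = defaultdict(int)
T = defaultdict(int)

def train(tagged_sentences):
    """Algorithm-1-style counting over pre-tagged sentences."""
    for sentence in tagged_sentences:
        prev = "$"                      # '$' marks the sentence start
        T["$"] += 1
        for word, tag in sentence:
            WT[(word.lower(), tag)] += 1
            TT[(prev, tag)] += 1
            T[tag] += 1
            prev = tag

def best_tag(word, prev_tag="$"):
    """Pick the tag maximizing [TT(prev,t)/T(prev)] * [WT(word,t)/T(t)],
    in the spirit of Eqs. (4)-(5)."""
    word = word.lower()
    candidates = {t for (w, t) in WT if w == word}
    if not candidates:
        return None                     # unseen word; no smoothing here
    return max(
        candidates,
        key=lambda t: (TT[(prev_tag, t)] / max(T[prev_tag], 1))
                      * (WT[(word, t)] / T[t]),
    )

# Toy training corpus (Penn-Treebank-style tags).
corpus = [
    [("the", "DT"), ("can", "NN"), ("rusted", "VBD")],
    [("you", "PRP"), ("can", "MD"), ("go", "VB")],
    [("the", "DT"), ("dog", "NN"), ("ran", "VBD")],
]
train(corpus)
print(best_tag("can", prev_tag="DT"))   # prints NN: after a determiner, 'can' scores higher as a noun
```

The same disambiguation is what the text's 'the can' example describes: the preceding DT tag shifts the probability mass toward the noun reading.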
According to Algorithm 1, the HMM uses pre-tagged American English words [45,46] as input and creates three dictionary objects, namely WT, TT and T. WT stores the number of occurrences of each word with its associated tag in the training corpus. Similarly, TT stores the number of occurrences of the bigram tags in the corpus, and T stores the number of occurrences of the unigram tags. For each word in a sentence, the algorithm checks whether the word is the starting word of the sentence. If it is, the previous tag is assumed to be '$'; otherwise, the previous tag is the tag of the previous word in the respective sentence. It increments the occurrence counts of the various tags through the dictionary objects WT, TT and T. Finally, it creates a probability table using the dictionary objects WT, TT and T.

Algorithm 2 finds all the possible tags of a given word (for tag evaluation) using the pre-tagged corpus [45,46] and applies Eq. (4) [47] if the word is the starting word of the respective sentence; otherwise it applies Eq. (5) [47]. Next, it selects the tag whose probability value is maximum. For example, once a POS tag determiner (DT) such as 'the' is encountered, the probability that the next word is a noun might be 40% and that it is a verb 20%. Once the model finishes its training, it is used to determine whether 'can' in 'the can' is a noun (as it should be) or a verb:

argmax_{t ∈ APT} [TT($, t)/T($)] × [WT(word, t)/T(t)]    (4)

The output of the mapper class (the sentiment phrase file and the situation phrase file) passes to the reducer class as input. The reducer class calculates the sentiment score (as explained in Section 3.6) of each phrase in both the sentiment and the situation phrase file. Then it outputs an aggregated positive or negative score for each phrase in terms of the sentiment and situation of the tweet. Based on whether the score is positive or negative, the phrases are stored in the corresponding phrase file, as shown in the reducer class of Fig. 7. PBLGA generates four files as output, namely the positive sentiment, negative sentiment, positive situation and negative situation files. Furthermore, we use these four files to detect sarcasm in tweets via a tweet-structure contradiction between positive sentiment and negative situation, and vice versa, as shown in Algorithm 3.

Algorithm 3. PBLGA_testing.
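Algorithm 3's contradiction check over the four PBLGA lexicon files can be sketched roughly as below. The phrase sets stand in for the generated lexicon files, and simple substring matching replaces whatever phrase matching the authors actually perform.

```python
# Stand-ins for the four PBLGA lexicon files described in the text.
POSITIVE_SENTIMENT = {"love", "enjoy"}
NEGATIVE_SENTIMENT = {"hate", "dislike"}
POSITIVE_SITUATION = {"being on holiday"}
NEGATIVE_SITUATION = {"waiting forever", "being ignored"}

def is_sarcastic(tweet):
    """Algorithm-3-style check: positive sentiment with a negative
    situation, or negative sentiment with a positive situation."""
    text = tweet.lower()
    pos_sent = any(p in text for p in POSITIVE_SENTIMENT)
    neg_sent = any(p in text for p in NEGATIVE_SENTIMENT)
    pos_sit = any(p in text for p in POSITIVE_SITUATION)
    neg_sit = any(p in text for p in NEGATIVE_SITUATION)
    return (pos_sent and neg_sit) or (neg_sent and pos_sit)

print(is_sarcastic("I love waiting forever for my doctor"))  # prints True
print(is_sarcastic("I love being on holiday"))               # prints False
```

In the actual pipeline these checks run inside MapReduce over the full lexicon files rather than over hand-picked phrase sets.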
Fig. 8. Procedure to detect sarcasm in tweets that start with an interjection word.

According to Algorithm 3, it takes the testing tweets and the four bags of lexicons generated using PBLGA. If a testing tweet matches any positive sentiment phrase from the positive sentiment file, it subsequently checks for matches against the negative situation file. If both checks match, the testing tweet is sarcastic; similarly, it checks for sarcasm with a negative sentiment in a positive situation. Otherwise, the given tweet is not sarcastic. Both algorithms are executed under the Hadoop framework as well as without the Hadoop framework to compare the running time.

all the positive sentiment tweets as sarcastic, as shown in the reducer class of Fig. 9. In this approach, the antonym pairs of nouns, verbs, adjectives and adverbs are taken from the NLTK WordNet [48]. The PSWAP algorithm is executed under the Hadoop framework as well as without the Hadoop framework to compare the running time.

4.4. Other approaches for sarcasm detection in tweets
Table 3
Experimental environment.

Table 4
Datasets captured for experiment and analysis.

Dataset  Number of tweets  Crawling time (h)
Set 1    5,000             1
Set 2    51,000            9
Set 3    100,000           21
Set 4    250,000           50
Set 5    1,050,000         187
Algorithm 5. TCUF_testing_tweets.
Fig. 10. Elapsed time for POS tagging under the Hadoop framework vs without the
Hadoop framework.
Algorithm 6. Tweet_contradict_time_dependent_facts.
Algorithm 7. TCTDF_testing_tweets.
Algorithm 9. LDC_testing_tweets.
Fig. 12. Processing time to analyze sarcasm in tweets using IWS under the Hadoop
framework vs without the Hadoop framework.
Fig. 13. Processing time to analyze sarcasm in tweets using PBLGA under the Hadoop framework vs without the Hadoop framework.
Fig. 14. Processing time to analyze sarcasm in tweets using PBLGA, IWS and PSWAP
(combined approach) under the Hadoop framework vs without the Hadoop
framework.
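To make the Hadoop vs non-Hadoop comparisons in Figs. 10-14 concrete, the elapsed times reported in Section 5 imply speedups of roughly 2.4x to 2.9x; a quick check, with the numbers taken from the text:

```python
# Elapsed times (seconds) reported in Section 5.4 for 1.4 million tweets:
# (without Hadoop, with Hadoop).
times = {
    "PBLGA": (3386, 1400),
    "IWS": (25, 9),
    "PSWAP": (7786, 2663),
    "Combined": (11609, 4147),
}

# Speedup = single-node time / Hadoop-cluster time.
speedups = {name: round(single / hadoop, 2)
            for name, (single, hadoop) in times.items()}
print(speedups)
```

The fairly uniform ratios are consistent with the work being distributed across the cluster's worker nodes, with the remainder lost to coordination overhead.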
5. Results and discussion

This section describes the experimental results of the proposed scheme. We start with the experimental setup, where a five-node cluster is deployed under the Hadoop framework. Five datasets were crawled using Apache Flume and the Twitter streaming API. We also discuss the time consumption of the proposed approach under the Hadoop framework as well as without the Hadoop framework and make a comparison. We also discuss all the approaches with the precision, recall and F-score measures.

5.1. Experimental environment

Our experimental setup consists of a five-node cluster with the specifications shown in Table 3. The master node consists of an Intel Xeon E5-2620 (6 cores, v3 @ 2.4 GHz) processor running the Ubuntu 14.04 operating system with 24 GB of main memory. The remaining four nodes were virtual machines, all running on a single machine. The secondary name node server is another Ubuntu 14.04 machine running on an Intel Xeon E5-2620 with 8 GB of main memory. The remaining three slave nodes, responsible for processing the data, are three Ubuntu 14.04 machines running on an Intel Xeon E5-2620 with 4 GB of main memory.

5.2. Datasets collection for experiment and analysis

The datasets for the experimental analysis are shown in Table 4. Five sets of tweets were crawled from Twitter using the Twitter streaming API and processed through Flume before being stored in the HDFS. In total, 1.45 million tweets were collected using keywords such as #sarcasm, #sarcastic, sarcasm, sarcastic, happy, enjoy, sad, good, bad, love, joyful, hate, etc. After preprocessing, approximately 156,000 tweets were found to be sarcastic (tweets ending with #sarcasm or #sarcastic). The remaining tweets, approximately 1.294 million, were not sarcastic. Every set contained a different number of tweets; depending on the number of tweets in each set, the crawling time (in hours) is given in Table 4.

5.3. Execution time for POS tagging

In this paper, POS tagging is an essential phase for all the proposed approaches. Therefore, we used Algorithms 1 and 2 to find the POS information for all the datasets (approximately 1.45 million tweets). We deployed the algorithms both with and without the Hadoop framework and estimated the elapsed time as shown in Fig. 10. The solid line shows the time taken for POS tagging without the Hadoop framework, while the dotted line shows the time under the Hadoop framework. The tweets were in different sets, and we ran the POS tagging algorithm separately for each set. Therefore, the graph in Fig. 10 shows the maximum time, approximately 674 s without the Hadoop framework versus approximately 225 s under it, for the largest set of 1.05 million tweets.

5.4. Execution time for sarcasm detection algorithm

The three proposed approaches, namely PBLGA, IWS and PSWAP, are deployed under the Hadoop framework to analyze the estimated time for sarcasm detection in tweets. We pass tagged tweets as input to all three approaches; therefore, the tagging time is not counted in the sarcasm analysis. We then compared the elapsed time under the Hadoop framework versus without the Hadoop framework for all three approaches, as shown in Figs. 11-13. The PBLGA approach takes approx. 3,386 s to analyze sarcasm in 1.4 million tweets without the Hadoop framework and approx. 1,400 s under the Hadoop framework. The IWS approach takes approx. 25 s without the Hadoop framework and approx. 9 s under the Hadoop framework. The PSWAP approach takes approx. 7,786 s without the Hadoop framework and approx. 2,663 s under the Hadoop framework. Finally, we combined all three approaches and ran them on the 1.4 million tweets. The elapsed time of the combined approach under the Hadoop framework versus without it is shown in Fig. 14: it takes approx. 11,609 s without the Hadoop framework (indicated with the solid line) and approx. 4,147 s under the Hadoop framework (indicated with the dotted line).

5.5. Statistical evaluation metrics

Three statistical parameters, namely precision, recall and F-score, are used to evaluate our proposed approaches. Precision shows how much of the extracted information is relevant, and recall shows how much of the relevant information is extracted correctly. The F-score is the harmonic mean of precision and recall. Eqs. (6), (7) and (8) show the formulas for calculating precision, recall and F-score, respectively:

Precision = Tp / (Tp + Fp)    (6)

Recall = Tp / (Tp + Fn)    (7)

F-score = (2 × Precision × Recall) / (Precision + Recall)    (8)

where Tp is true positive, Fp is false positive, and Fn is false negative.

The experimental datasets consist of a mixture of sarcastic and non-sarcastic tweets. In this paper, we assume the tweets with the hashtag #sarcasm or #sarcastic to be sarcastic. The datasets consist of a total of 1.4 million tweets, among which 156,000 were sarcastic and the rest were non-sarcastic. The experimental results in terms of precision, recall and F-score were the same under both the Hadoop and the non-Hadoop frameworks; the only difference was the algorithm processing time, due to the parallel architecture of HDFS. The experimental results are shown in Table 5.

Table 5
Precision, recall and F-score values for the proposed approaches.

Approach                                    Precision  Recall  F-score
PBLGA approach                              0.84       0.81    0.82
IWS approach                                0.83       0.91    0.87
PSWAP approach                              0.92       0.89    0.90
Combined (PBLGA, IWS, and PSWAP) approach   0.97       0.98    0.97
LDC (first user's account)                  0.92       0.72    0.81
LDC (second user's account)                 0.91       0.77    0.84
LDC (third user's account)                  0.92       0.73    0.82
TCUF approach                               0.96       0.57    0.72
TCTDF approach                              0.93       0.62    0.74
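Eqs. (6)-(8) can be verified directly in a few lines; for instance, the PBLGA row of Table 5 (precision 0.84, recall 0.81) reproduces the reported F-score of 0.82:

```python
def precision(tp, fp):
    """Eq. (6): fraction of extracted items that are relevant."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Eq. (7): fraction of relevant items that are extracted."""
    return tp / (tp + fn)

def f_score(p, r):
    """Eq. (8): harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# PBLGA row of Table 5: precision 0.84 and recall 0.81.
print(round(f_score(0.84, 0.81), 2))  # prints 0.82
```

The same arithmetic reproduces the other rows of Table 5 from their precision and recall columns.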
Among the six proposed approaches, PBLGA and IWS were implemented and discussed earlier in [19] with a small set of test data (approx. 3,000 tweets for each experiment), deployed in a non-Hadoop framework. In this work, we deployed PSWAP (a novel approach) along with PBLGA and IWS in both a Hadoop and a non-Hadoop framework to check their efficiency in terms of time. PBLGA generates four lexicon files, namely positive sentiment, negative situation, positive situation, and negative sentiment, using the 156,000 sarcastic tweets. The PBLGA algorithm used 1.45 million tweets as test data. While testing, PBLGA checks each tweet's structure for a contradiction between positive sentiment and negative situation, and vice versa, to classify it as sarcastic or non-sarcastic. For 1.45 million tweets, PBLGA takes approx. 3,386 s in the non-Hadoop framework and approx. 1,400 s in the Hadoop framework; it spends most of this time accessing the four lexicon files for every tweet to test the tweet-structure condition. IWS does not require any training set to identify tweets as sarcastic; therefore, it takes minimal processing time in both frameworks (25 s without Hadoop and 9 s with the Hadoop framework). PSWAP requires a list of antonym pairs for nouns, adjectives, adverbs, and verbs to identify sarcasm in tweets. It takes approx. 7,786 s for 1.45 million tweets in the non-Hadoop framework and approx. 2,663 s in the Hadoop framework; it spends most of this time searching antonym pairs for all four tags (noun, adjective, adverb, and verb) for every tweet. Finally, we combined all three approaches and tested them together. The combined approach attains an F-score of 97%, but its execution time is higher, as it checks all three approaches sequentially for every tweet until one of them flags the tweet as sarcastic.

Three more novel algorithms were proposed, namely TCUF, TCTDF and LDC. These three algorithms are implemented using conventional methods with small datasets; presently, we do not have sufficient datasets available to deploy them under the Hadoop framework. TCUF requires a corpus of universal facts, and the accuracy of this approach depends on the universal facts set. We crawled approximately 5,000 universal facts from

References

[1] D. Chaffey, Global Social Media Research Summary 2016. URL 〈http://www.smartinsights.com/social-media-marketing/social-media-strategy/new-global-social-media-research/〉.
[2] W. Tan, M.B. Blake, I. Saleh, S. Dustdar, Social-network-sourced big data analytics, Internet Comput. 17 (5) (2013) 62-69.
[3] Z.N. Gastelum, K.M. Whattam, State-of-the-Art of Social Media Analytics Research, Pacific Northwest National Laboratory, 2013, pp. 1-9.
[4] P. Zikopoulos, C. Eaton, Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data, McGraw-Hill Osborne Media, 2011.
[5] E. Riloff, A. Qadir, P. Surve, L. De Silva, N. Gilbert, R. Huang, Sarcasm as contrast between a positive sentiment and negative situation, in: Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2013, pp. 704-714.
[6] Hadoop. URL 〈http://hadoop.apache.org/〉.
[7] S. Fitzgerald, I. Foster, C. Kesselman, G. Von Laszewski, W. Smith, S. Tuecke, A directory service for configuring high-performance distributed computations, in: Proceedings on High Performance Distributed Computing, IEEE, 1997, pp. 365-375.
[8] J. Dean, S. Ghemawat, MapReduce: simplified data processing on large clusters, Commun. ACM 51 (1) (2008) 107-113.
[9] S. Hoffman, Apache Flume: Distributed Log Collection for Hadoop, Packt Publishing Ltd, 2013.
[10] Flume. URL 〈http://flume.apache.org/〉.
[11] K. Shvachko, H. Kuang, S. Radia, R. Chansler, The Hadoop distributed file system, in: Proceedings of the 26th Symposium on Mass Storage Systems and Technologies (MSST), IEEE, 2010, pp. 1-10.
[12] A. Thusoo, J.S. Sarma, N. Jain, Z. Shao, P. Chakka, S. Anthony, H. Liu, P. Wyckoff, R. Murthy, Hive: a warehousing solution over a map-reduce framework, Proc. VLDB Endow. 2 (2) (2009) 1626-1629.
[13] S.M. Thede, M.P. Harper, A second-order hidden Markov model for part-of-speech tagging, in: Proceedings of the 37th Annual Meeting on Computational Linguistics, ACL, 1999, pp. 175-182.
[14] D. Klein, C.D. Manning, Accurate unlexicalized parsing, in: Proceedings of the 41st Annual Meeting on Association for Computational Linguistics, ACL, 2003, pp. 423-430.
[15] K. Park, K. Hwang, A bio-text mining system based on natural language processing, J. KISS: Comput. Pract. 17 (4) (2011) 205-213.
[16] Q. Mei, C. Zhai, Discovering evolutionary theme patterns from text: an exploration of temporal text mining, in: Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, ACM, 2005, pp. 198-207.
[17] B. Liu, Sentiment analysis and opinion mining, Synth. Lect. Hum. Lang. Technol. 5 (1) (2012) 1-167.
[18] R. González-Ibánez, S. Muresan, N. Wacholder, Identifying sarcasm in Twitter: a closer look, in: Proceedings of the 49th Annual Meeting on Human Language Technologies, ACL, 2011, pp. 581-586.
[19] S.K. Bharti, K.S. Babu, S.K. Jena, Parsing-based sarcasm sentiment recognition in Twitter data, in: Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), ACM, 2015,
Google and Wikipedia for experimentation. TCTDF requires a cor- pp. 1373–1380.
[20] E. Lunando, A. Purwarianti, Indonesian social media sentiment analysis with
pus of time-dependent facts. Accuracy of this approach is depen- sarcasm detection, in: International Conference on Advanced Computer Sci-
dent on the time-dependent facts. Presently, we trained TCTDF ence and Information Systems (ICACSIS), IEEE, 2013, pp. 195–198.
with 10,000 news article headlines as time-dependent facts. LDC [21] P. Tungthamthiti, S. Kiyoaki, M. Mohd, Recognition of sarcasm in tweets based
on concept level sentiment analysis and supervised learning approaches, in:
requires Twitter users’ profile information and their past tweet 28th Pacific Asia Conference on Language, Information and Computation,
history. In this work, we tested LDC using ten Twitter users profile 2014, pp. 404–413.
and their past tweet history. [22] I. Ha, B. Back, B. Ahn, Mapreduce functions to analyze sentiment information
from social big data, Int. J. Distrib. Sens. Netw. 2015 (1) (2015) 1–11.
[23] Twitter streaming api. URL 〈http://apiwiki.twitter.com/〉, 2010.
[24] J. Kalucki, Twitter streaming api. URL 〈http://apiwiki.twitter.com/Streaming-
API-Documentation/〉, 2010.
6. Conclusion and future work [25] A. Bifet, E. Frank, Sentiment knowledge discovery in twitter streaming data, in:
13th International Conference on Discovery Science, Springer, 2010, pp. 1–15.
Sarcasm detection and analysis in social media provides in- [26] Z. Tufekci, Big questions for social media big data: representativeness, validity
and other methodological pitfalls, arXiv preprint arXiv:1403.7400.
valuable insight into the current public opinion on trends and [27] A.P. Shirahatti, N. Patil, D. Kubasad, A. Mujawar, Sentiment Analysis on Twitter
events in real time. In this paper six algorithms, namely PBLGA, Data Using Hadoop.
IWS, PSWAP, TCUF, TCTDF, and LDC, were proposed to detect sar- [28] R.C. Taylor, An overview of the Hadoop/mapreduce/hbase framework and its
current applications in bioinformatics, BMC Bioinform. 11 (Suppl 12) (2010) 1–6.
casm in tweets collected from Twitter. Three algorithms were run [29] M. Kornacker, J. Erickson, Cloudera Impala: Real Time Queries in Apache Ha-
with and without the Hadoop framework. The running time of doop, for Real. URL 〈http://blog〉. cloudera. com/blog/2012/10/cloudera-im-
each algorithm was shown. The processing time under the Hadoop pala-real-time-queries-in-apache-hadoop-for-real.
[30] D. Davidov, O. Tsur, A. Rappoport, Semi-supervised recognition of sarcastic
framework with data nodes reduced up to 66% on 1.45 million sentences in twitter and amazon, in: Proceedings of the Fourteenth Con-
tweets. ference on Computational Natural Language Learning, ACL, 2010, pp. 107–116.
In the future, sufficient datasets suitable for the other three [31] E. Filatova, Irony and sarcasm: Corpus generation and analysis using crowd-
sourcing, in: Proceedings of Language Resources and Evaluation Conference,
algorithms namely LDC, TCUF and TCTDF need to be attained and 2012, pp. 392–398.
deployed under the Hadoop framework. [32] R.J. Kreuz, R.M. Roberts, Two cues for verbal irony: hyperbole and the ironic
S.K. Bharti et al. / Digital Communications and Networks 2 (2016) 108–121 121
tone of voice, Metaphor Symb. 10 (1) (1995) 21–31. [48] J. Perkins, Python Text Processing with NLTK 2.0 Cookbook, Packt Publishing
[33] R.J. Kreuz, G.M. Caucci, Lexical influences on the perception of sarcasm, in: Ltd, 2010.
Proceedings of the Workshop on Computational Approaches to Figurative [49] D. Rusu, L. Dali, B. Fortuna, M. Grobelnik, D. Mladenic, Triplet extraction from
Language, ACL, 2007, pp. 1–4. sentences, in: Proceedings of the 10th International Multiconference on In-
[34] O. Tsur, D. Davidov, A. Rappoport, Icwsm—a great catchy name: Semi-su- formation Society—IS, 2007, pp. 8–12.
pervised recognition of sarcastic sentences in online product reviews, in:
Proceedings of International Conference on Weblogs and Social Media, 2010,
pp. 162–169.
[35] J.W. Pennebaker, M.E. Francis, R.J. Booth, Linguistic Inquiry and Word Count:
Liwc 2001, vol. 71, no. 1, Lawrence Erlbaum Associates, Mahway, 2001, pp. 1–11.
Santosh Kumar Bharti is currently pursuing his Ph.D. in Computer Science & En-
[36] C. Strapparava, A. Valitutti, et al., Wordnet affect: an affective extension of
gineering from National Institute of Technology Rourkela, India. His research in-
wordnet, in: Proceedings of Language Resources and Evaluation Conference,
terest includes opinion mining and sarcasm sentiment detection.
vol. 4, 2004, pp. 1083–1086.
[37] F. Barbieri, H. Saggion, F. Ronzano, Modelling sarcasm in twitter a novel ap-
proach, in: Proceedings of the 5th Workshop on Computational Approaches to
Subjectivity, Sentiment and Social Media Analysis, 2014, pp. 50–58.
[38] P. Carvalho, L. Sarmento, M.J. Silva, E. De Oliveira, Clues for detecting irony in
user-generated contents: oh...!! it's so easy;-), in: Proceedings of the 1st In- Bakhtyar Vachha is currently pursuing his M.Tech in Computer Science & En-
ternational CIKM Workshop on Topic-Sentiment Analysis for Mass Opinion, gineering from National Institute of Technology Rourkela, India. His research in-
ACM, 2009, pp. 53–56. terest includes network security and big data.
[39] D. Tayal, S. Yadav, K. Gupta, B. Rajput, K. Kumari, Polarity detection of sarcastic
political tweets, in: Proceedings of International Conference on Computing for
Sustainable Global Development (INDIACom), IEEE, 2014, pp. 625–628.
[40] A. Rajadesingan, R. Zafarani, H. Liu, Sarcasm detection on twitter: a behavioral
modeling approach, in: Proceedings of the Eighth ACM International Con-
ference on Web Search and Data Mining, ACM, 2015, pp. 97–106. Ramkrushna Pradhan is currently pursuing his M.Tech duel degree in Computer
[41] A. Utsumi, Verbal irony as implicit display of ironic environment: distin- Science & Engineering from National Institute of Technology Rourkela, India. His
guishing ironic utterances from nonirony, J. Pragmat. 32 (12) (2000) research interest includes speech translation, social media analysis and big data.
1777–1806.
[42] C. Liebrecht, F. Kunneman, A. van den Bosch, The perfect solution for detecting
sarcasm in tweets# not, in: Proceedings of the 4th Workshop on Computa-
tional Approaches to Subjectivity, Sentiment and Social Media Analysis, ACL,
New Brunswick, NJ, 2013, pp. 29–37. Korra Sathya Babu is working as an Assistant Professor in the Department of
[43] M.P. Marcus, M.A. Marcinkiewicz, B. Santorini, Building a large annotated Computer Science & Engineering, National Institute of Technology Rourkela, India.
corpus of English: the Penn treebank, Comput. Linguist. 19 (2) (1993) 313–330.
[44] A. Esuli, F. Sebastiani, Sentiwordnet: A publicly available lexical resource for
opinion mining, in: Proceedings of Language Resources and Evaluation Con-
ference, 2006, pp. 417–422.
[45] N. Ide, K. Suderman, The american national corpus first release, in: Proceed-
ings of Language Resources and Evaluation Conference, Citeseer, 2004. Sanjay Kumar Jena is working as Professor in the Department of Computer Science
[46] N. Ide, C. Macleod, The american national corpus: a standardized resource of & Engineering, National Institute of Technology Rourkela, India.
American English, in: Proceedings of Corpus Linguistics, 2001.
[47] E. Charniak, Statistical techniques for natural language parsing, AI Mag. 18 (4)
(1997) 33–43.