Inset Lexicon: Evaluation of A Word List For Indonesian Sentiment Analysis in Microblogs
2 authors, including:
Fajri Koto
University of Melbourne
All content following this page was uploaded by Fajri Koto on 12 December 2017.
Abstract—In this study, we propose InSet, an Indonesian sentiment lexicon built to identify written opinions and categorize them as positive or negative, which can be used to analyze public sentiment towards a particular topic, event, or product. Composed from a collection of words from Indonesian tweets, InSet was constructed by manually weighting each word and enhanced by adding stemming and synonym sets. As a result, we obtained 3,609 positive words and 6,609 negative words with scores ranging between –5 and +5. In experiments using InSet, our method outperforms the few other available Indonesian lexicons that we used as baselines.

Keywords- sentiment analysis; lexicon; Indonesian; microblog; Twitter

I. INTRODUCTION

As Indonesia is an archipelagic country with thousands of islands, diverse cultures, and a high population density, online social media has become a medium for most Indonesian citizens to communicate and share thoughts and ideas across the country. These days, microblogging tools such as Facebook, Twitter and Instagram have become an inseparable part of modern Indonesian culture. In fact, it has become very easy to create viral news on the Internet and to gather Indonesian public opinion on a certain topic.

With the massive number of social media users in Indonesia, it would be interesting, if not profitable, to analyze netizens' sentiment on any topic, a task known as Sentiment Analysis. Sentiment Analysis is a classification task that determines the polarity of a text and relates to the broad areas of natural language processing, computational linguistics and text mining [1]. According to [2], there are two kinds of sentiment classification task: 1) polarity classification with classes = {positive, negative}, and 2) subjectivity classification with classes = {subjective, objective}. The positive or negative class indicates the positive or negative polarity of a sentence. An objective sentence, on the other hand, conveys information containing facts or news with little argumentation, while a subjective sentence reflects a private point of view, emotion or belief [3].

Though sentiment analysis has been one of the most studied fields in computer science, and specifically in natural language processing, not much research has been conducted to improve Sentiment Analysis, or opinion mining, for the Indonesian language. Therefore, in this work we tried to improve sentiment analysis techniques specifically for the Indonesian language by building a precise Indonesian lexicon aimed particularly at microblogs.

Our sentiment lexicon is composed of words gathered from Twitter, as a representative of the social media commonly used in Indonesia. We built the lexicon by classifying the polarity of each word and enhanced it with some previously proven methods. The results of the tests and evaluations conducted in this study show that InSet performs satisfactorily as an Indonesian sentiment lexicon for predicting the negative and positive polarity of short written opinions.

II. RELATED WORKS

In computer science, the study of sentiment analysis over microblogs has been one of the main focuses in the fields of natural language processing, information retrieval and data mining. This research has mainly focused on constructing English lexicons such as SentiWordNet [6], Liu Lexicon [11], AFINN Lexicon [12], Opinion Finder [9], Senti-Strength [13], HBE Lexicon [14], and NRC Emotion Lexicon [15]. Koto et al. summarized a performance comparison among these features and revealed that AFINN and Senti-Strength are the current best features for English Twitter sentiment analysis [16].

Research on sentiment analysis over the microblog Twitter was done by Go et al., who utilized emoticons to annotate tweets with sentiment labels [10]. A subsequent study by Agarwal et al. used tweets manually annotated with sentiment and applied a unigram model for classification [17]. In another study, Koto et al. analyzed sentence patterns of tweets with sentiment labels, in both the subjectivity and polarity domains [18].

Despite the large number of works related to English sentiment analysis, the number of studies that focus on the Indonesian language is limited. In [4], Wicaksono et al. proposed a methodology to automatically construct a dataset for Twitter sentiment analysis, while in [5] Lunando et al. tried to combine sarcasm and sentiment analysis tasks for the Indonesian language. However, the features used for sentiment analysis were produced only from the translated SentiWordNet [6].
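To make the emoticon-based annotation attributed to Go et al. [10] more concrete, the following is a minimal sketch of how tweets could be labeled from emoticons. The emoticon lists, the function names, and the choices to discard tweets with mixed emoticons and to strip emoticons from the text are assumptions made for illustration, not details taken from this paper.

```python
# Hypothetical emoticon lists; the exact sets used in [10] are not given in this paper.
POSITIVE_EMOTICONS = (":)", ":-)", ":D", "=)")
NEGATIVE_EMOTICONS = (":(", ":-(")


def emoticon_label(tweet):
    """Return 'positive', 'negative', or None when the emoticons give no clear signal."""
    has_pos = any(e in tweet for e in POSITIVE_EMOTICONS)
    has_neg = any(e in tweet for e in NEGATIVE_EMOTICONS)
    if has_pos == has_neg:
        # Either no emoticon at all or both kinds present: leave the tweet unlabeled.
        return None
    return "positive" if has_pos else "negative"


def strip_emoticons(tweet):
    """Remove the labeling emoticons so that a classifier cannot simply memorize them."""
    for emoticon in POSITIVE_EMOTICONS + NEGATIVE_EMOTICONS:
        tweet = tweet.replace(emoticon, "")
    return " ".join(tweet.split())


if __name__ == "__main__":
    for text in ["Senang sekali hari ini :)", "Macet lagi :(", "Biasa saja"]:
        print(emoticon_label(text), "|", strip_emoticons(text))
```

Labels obtained this way are noisy, which is one reason a manually weighted lexicon can be a useful complementary resource.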
978-1-5386-1981-0/17/$31.00 © 2017 IEEE
Figure 1. Stages of InSet Construction
with score 0, while for the negative set the agreement
2 http://www.sinonimkata.com/
Figure 3. Histogram of Valencies for InSet Lexicon

Finally, we obtained a word list comprising 10,218 words, which we call InSet (Indonesian Sentiment), as described in Figure 3. The word list has a bias towards negative words (6,609, corresponding to 65%) compared to positive words (3,609). However, this bias corresponds closely to the bias found in the Opinion Finder sentiment lexicon (4,911 (64%) negative and 2,718 positive words) and in the AFINN Lexicon (1,598 (65%) negative and 878 positive words).

IV. EXPERIMENT AND EVALUATION

A. Experimental Set-Up

To evaluate InSet, we used a Twitter dataset from 2015 that had been manually annotated by a native Indonesian speaker. The data was crawled through the Twitter API and contains 1,259 positive and 1,371 negative tweets. The InSet lexicon was used by constructing two values, InSetPos and InSetNeg, where InSetPos (InSetNeg) is the sum of the scores of the positive (negative) words in a tweet that match the lexicon. We then performed binary classification using several supervised algorithms: Logistic Regression (LR), Naive Bayes (NB), Support Vector Machine (SVM), and Neural Network (NN). Preprocessing was conducted before training and testing and includes 1) converting tweets to lowercase, 2) removing URLs and Twitter entities, and 3) removing special characters.

B. Experiment Result

To evaluate our method, we performed cross validation (with k=10) and show the results in Table III. The value in each cell indicates the accuracy of each method with a particular classifier. Our experimental results show that traditional techniques such as TF and TF-IDF do not work well in classifying tweets in the Indonesian language. The translated SentiWordNet, Liu and AFINN lexicons show similar results. This might be caused by 1) errors of the translation system, 2) the translation system not covering OOV (Out of Vocabulary) or slang words of the Indonesian language, and 3) the lexicons themselves containing too many uncommon words that are rarely used on user-generated platforms such as Twitter.

In Table III, InSet has the highest accuracy for each classifier compared to the other baselines. It reaches 65.78% at the highest, better than the Vania Lexicon with 61.48% accuracy. The Vania Lexicon is an Indonesian lexicon that was built by translating the Opinion Finder Lexicon and then seeding the words for enhancement. Factors that may cause the difference in performance are: 1) the Vania Lexicon was produced using a translation system, while the word candidates of InSet were selected directly from an Indonesian language source, 2) the Vania Lexicon contains many formal words, while InSet contains both formal and informal words, as it was generated from Twitter data, 3) InSet was constructed through manual labeling by 2 native speakers, where the bias and the precision of word polarity were the priority, while the Vania Lexicon was constructed automatically. Even though our lexicon is less reproducible, we argue that
Baseline
Classifier Translated English Lexicon InSet
TF TF-IDF Vania Lex.
SentiWordNet Liu Lex. AFINN
NB 59.52 58.71 61.48 52.66 61.48 57.18 65.13
LR 53.58 52.89 61.44 53.83 61.48 52.13 65.78
SVM 55.56 56.21 61.48 53.76 61.48 59.51 65.51
NN 57.64 54.68 61.41 52.51 59.84 58.97 65.71
the manual effort is still needed, as it works very well, as with AFINN. 4) InSet has polarity scores that indicate the positive or negative sentiment of a word, while the Vania Lexicon is only a list of words in two different classes: a positive and a negative set.

V. CONCLUSION

In this study we have constructed the InSet Lexicon, which comprises 3,609 positive and 6,609 negative words in the Indonesian language. Each word was manually labeled based on its polarity, and the lexicon was then enhanced by adding stemming and synonym sets. Our research contributions lie in two points. First, we constructed a sentiment lexicon for the Indonesian language that is more suitable for microblogs. Second, our approach outperforms all of the existing baseline methods, with a highest accuracy of 65.78%. For future work, the lexicon can be enhanced by incorporating the translated English lexicons into InSet.

[8] C. Vania, M. Ibrahim, and M. Adriani, “Sentiment Lexicon Generation for an Under-Resourced Language”. In International Journal Comput. Linguistics Appl., 20154, pp. 59–72.

[9] T. Wilson, J. Wiebe, and P. Hoffmann, “Recognizing contextual polarity in phrase-level sentiment analysis”. In Proc. of the conference on human language technology and empirical methods in natural language processing, 2005, pp. 347–354.

[10] A. Go, R. Bhayani, and L. Huang, “Twitter sentiment classification using distant supervision”. In CS224N Project Report, Stanford, 2009, pp. 1–12.

[11] B. Liu, M. Hu, and J. Cheng, “Opinion observer: analyzing and comparing opinions on the web”. In Proceedings of the 14th international conference on World Wide Web, 2005, pp. 342–351.

[12] F. A. Nielsen, “A new ANEW: Evaluation of a word list for sentiment analysis in microblogs”. 2011, Available at http://arxiv.org/abs/1103.2903