Sutabri 2018
All content following this page was uploaded by Edi Surya Negara on 07 March 2020.
Abstract—In online ordering, buyers of services often have trouble determining which service best matches their own characteristics. The ratings used by some marketplaces are sometimes not consistent with the content of the reviews written by users, which reduces the trust users place in those ratings. This study therefore attempts a comprehensive analysis that reads and analyzes every review related to a given service. The burden for users is that the number of reviews is large and that the reviews use widely varying language styles. This study proposes a method that produces a rating more in line with the content of a review, based on the sentiment expressed in it. The method was developed using a corpus built on a topic model for a hotel management site. Sentiment analysis was performed using the Naïve Bayesian method together with probabilistic values from the corpus. Test results showed that the method analyzed sentiment with a success rate of 89%. The results of the sentiment analysis are used as the basis for calculating ratings.

Keywords—sentiment analysis, corpus, naïve Bayesian, topic model, hotel review.

I. INTRODUCTION

The development of online media has had a positive impact through the emergence of virtually unlimited textual information, which creates a need to represent that information without reducing its value. Textual information is divided into two kinds: facts and opinions. Facts are objective expressions about an entity or an event, whereas opinions are subjective expressions of a person's sentiments about an entity or event (Y. Nur & D. Santika, 2011). A large amount of information takes the form of user testimonials on items ranging from computer products, smartphones, holiday services, and hotel services to movie reviews. This valuable source of knowledge helps other users find the information they need and make accurate decisions.

The tourism industry has great potential to be promoted widely and developed online through websites. Most tourist destinations now make it easy for tourists to obtain accommodation and comfort during their holidays (E. Indrayuni, 2016). Hotels are a very important tourism product, judged in terms of facilities, quality of service, and distance to the hotel. Currently there are many e-traveling sites, such as wisatakita.com, pegipegi.com, booking.com, tripadvisor.co.id, traveloka.com, trivago.com, and misteraladin.com, which let tourists write testimonials about their opinions and personal experiences on the site.

The e-traveling site studied in this research is traveloka.com, chosen because Traveloka was the first online travel site in Indonesia, operating since March 2012. In addition, the number of hotel guests and tourists who use Traveloka is very large: 19,272 hotels in Indonesia are promoted through traveloka.com, making it the best-selling and most popular e-traveling site among domestic tourists. One problem that arises is that tourists or visitors must read all the testimonials in full, which takes a long time. In addition, the ratings or scores attached to testimonials were sometimes found to be inconsistent with what the tourists or hotel guests actually wrote.

When tourists or hotel service users search for destinations and hotels, they usually look for online testimonials about hotels in the destination city before making a booking decision. Tourists and new users of hotel services sometimes doubt these testimonials, because it is very difficult to read and understand all of them in a short time.

This study looked at the limitations of hotel testimonials and analyzed their sentiment to determine whether each testimonial is positive or negative, and then ranked the hotel testimonials. It applies a topic model approach, which uses generative techniques to model the topics contained in a testimonial document. The topic model was built to fit the satisfaction categories used on e-traveling sites, such as cleanliness, comfort, food, location, and service. The corpus is built on expert knowledge and is used to analyze the sentiment of online hotel testimonials with a classification method.

Previous research on sentiment analysis includes the work of J. Samudra, S. Supeno, and M. Hariyadi in 2009, who classified text as an alternative way of processing digital documents in order to simplify and accelerate the search for needed information. The method used is Naïve
Bayes. Text documents are represented as a set of words, and each word in a document is considered independent of the others.

Research by T.B. Adji, G.A. Buntoro, and A.E. Purnamasari in 2014 analyzed public sentiment on social media issues, especially Twitter, using a combination of Lexicon-based and Double Propagation methods. It produced 7 classes (very positive, positive, somewhat positive, neutral, somewhat negative, negative, and very negative) with an accuracy rate of 23.44%.

No | Author, Year | Research Topic | Theory Developed
4 | Brob, J., 2013 | Aspect Oriented Sentiment Analysis of Customer Reviews Using Distant Supervision Techniques | Distant supervision technique to reduce human supervision in the annotation process.
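The bag-of-words view described above, in which every word is treated as independent of the others given the class, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy reviews, the function names, and the use of add-one (Laplace) smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (tokens, label). Returns (priors, per-class token counts, vocabulary)."""
    class_docs = Counter(label for _, label in docs)
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        token_counts[label].update(tokens)
        vocab.update(tokens)
    priors = {c: n / len(docs) for c, n in class_docs.items()}
    return priors, token_counts, vocab

def score(tokens, label, priors, token_counts, vocab):
    """Log prior plus the sum of per-token log likelihoods (independence assumption)."""
    total = sum(token_counts[label].values())
    log_p = math.log(priors[label])
    for t in tokens:
        if t in vocab:  # tokens never seen in training are skipped
            # Add-one smoothing is an assumption; it avoids log(0) for unseen pairs.
            log_p += math.log((token_counts[label][t] + 1) / (total + len(vocab)))
    return log_p

def classify(tokens, priors, token_counts, vocab):
    """Return the class with the highest score."""
    return max(priors, key=lambda c: score(tokens, c, priors, token_counts, vocab))

# Hypothetical toy training data, not taken from the study.
TRAIN = [
    (["clean", "room", "friendly", "staff"], "positive"),
    (["great", "location", "clean"], "positive"),
    (["dirty", "room", "rude", "staff"], "negative"),
    (["noisy", "dirty", "bathroom"], "negative"),
]
MODEL = train_nb(TRAIN)
```

Calling `classify(["clean", "friendly"], *MODEL)` scores the review under each class and returns the higher-scoring label. Working in log space keeps the product of many small probabilities from underflowing.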
The performance of the Naïve Bayes classifier can be improved by using the corpus data created and developed in the previous stage. The corpus is used to give more weight to the probability values of each token listed in it. The corpus used covers the hotel topic parameters, namely comfort, cleanliness, hotel location, food, and friendly service.

Corpus weights are obtained from probabilistic values, namely the occurrence of term t in an existing topic, with the goal of normalizing the weight. This study uses the proportion of tokens in each class c, with p+ = 0.65 for the positive class and p- = 0.35 for the negative class in the data. condprob can then be calculated by a formula such as:

To get a score for each class c, the following formula can be used:

According to Tan and Zhang (2008), data acquisition (abbreviated DAQ) is the process of sampling real-world physical conditions and converting the resulting samples into digital numeric values that can be manipulated by a computer.

In this research, data acquisition goes through several stages: scraping, labeling, tokenization, stop-word filtering, and stemming. The scraping process obtains text data from popular e-traveling sites. Each review is given to an expert, who labels its sentiment as positive or negative.

In addition to sentiment labels, each review can also be labeled as training data or test data. This assignment is done randomly, without looking at the review content, to maintain data independence. Labeling does not change the structure of the review content. To turn the reviews into a usable data set, pre-processing is then applied. The pre-processing stages consist of tokenization, stop-word filtering, and stemming.
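The pre-processing stages and the random training/test assignment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the English stop-word list, the suffix-stripping stemmer, the 20% test ratio, and all function names are hypothetical, and a real system for Indonesian reviews would need a language-appropriate stop list and stemmer.

```python
import random
import re

# Hypothetical stop list and suffixes, for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "very", "of", "in", "to"}
SUFFIXES = ("ing", "ed", "ly", "s")

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def remove_stop_words(tokens):
    """Drop tokens that appear in the stop list."""
    return [t for t in tokens if t not in STOP_WORDS]

def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Tokenize, filter stop words, then stem, in that order."""
    return [stem(t) for t in remove_stop_words(tokenize(text))]

def split_train_test(reviews, test_ratio=0.2, seed=0):
    """Assign reviews to training and test sets by shuffling, without reading their content."""
    rng = random.Random(seed)
    shuffled = list(reviews)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]
```

For example, `preprocess("The room was very clean and the staff helped quickly")` yields `["room", "clean", "staff", "help", "quick"]`. Fixing the seed in `split_train_test` makes the random assignment reproducible while still keeping it independent of the review text.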