An Interpretation of Sentiment Analysis
Abstract—Sentiment analysis plays a very important role in Business Intelligence (BI) applications, as has been evident in recent market activity. Most popular websites, such as Amazon, Facebook, and Twitter, rely on customer reviews as feedback. Such reviews play a very important role in product evaluation, business intelligence, and decision making. The main problem from the point of view of users and customers is that it is practically infeasible to read all of those online reviews one by one, because some products may have tens of thousands of reviews. In this paper, reviews are collected from sources such as Amazon and Flipkart, and a method that combines Natural Language Processing (NLP) and machine learning is applied to them. Word sense disambiguation is also considered in this study: an improvised Lesk algorithm is used to remove noise from the data. Different types of data have different properties and are therefore suited to different techniques. This problem is closely related to the large-scale nature of social networks and the need to perform aggregation operations, whose results are presented in the form of a pie chart. Thus, we aggregate millions of reviews into a more user-friendly format.

Index Terms—Sentiment analysis, Machine learning, Opinion Mining, Web Mining, Pie-Chart, Business Intelligence.

978-1-5090-2597-8/16/$31.00 © 2016 IEEE

I. INTRODUCTION

In the current era, the Web has transformed itself from read-only to read-write. This evolution has attracted interested users who work together and share content through social networking sites (such as Facebook, Twitter, and LinkedIn), blogs, wikis, online communities, and other collaborative media. A screenshot of reviews given by users on amazon.com for a Nikon camera is shown in Figure 1. Collaborative and collective knowledge has spread throughout the Web, especially in fields that relate to everyday life. Despite significant progress, opinion mining and sentiment analysis are still not recognized as new interdisciplinary branches by the research community [1] [2]. Computer scientists and engineers apply machine learning algorithms for explicit classification of streaming data, voice, video, and text [3] [4]. The terms opinion mining and sentiment analysis are often used interchangeably in some applications, but sentiment analysis and opinion mining basically focus on emotion recognition and polarity detection, respectively. Sentiment analysis and opinion mining are emerging fields for the above reasons [1] [5].

Recently, sentiment analysis has been attracting attention from psychologists, who bring to it the long tradition of emotion research. Opinion mining and sentiment analysis cannot be separated from the sentiment knowledge that attempts to recognize human emotions [5].

However, online data have several flaws that can hinder sentiment analysis. The first flaw is that, since people can freely post their own content, the quality of their opinions cannot be guaranteed. For example, instead of sharing topic-related opinions, online spammers post spam on forums that is irrelevant to the item. Some of this spam is entirely meaningless, while other posts contain irrelevant opinions, also known as fake opinions [6] [7] [8]. The second flaw is that ground truth for such online data is not always available. A ground truth is essentially a tag attached to an opinion, indicating whether the opinion is positive, negative, or neutral. Therefore, research on sentiment analysis, psychological emotion, and affect-sensitive systems must be carried out simultaneously.

Figure 1. A screenshot of reviews given by users for a Nikon camera.

There are hundreds of websites offering the same product at different prices and with different user feedback. An ordinary shopper usually decides what to buy, but where to buy it remains the main dilemma. If in
some way we can tell the user which website offers both an affordable price and excellent user feedback, then we have hit the jackpot. Various e-commerce sites, such as junglee.com and buyhatke.com, compare only the prices offered by different sellers and completely ignore user reviews of the buying experience. We explore this interesting problem and offer consumers suggestions based on general opinion. In this paper, we focus on the reviews given by users to particular items, products, or gadgets.

Due to the unexpected growth rate of e-business websites and online users, the number of reviews written by shoppers runs into the millions. Therefore, automatic review mining that can handle these reviews has become a prerequisite and a topic of recent research. We aim to explore several data mining techniques and implement them to mine the data of various social networking websites. In recent years, more and more social networking platforms have been established. Some prominent examples are Facebook, Flipkart, Amazon, Twitter, Google+, and LinkedIn.

These platform providers collect a huge amount of data on each of their users. Opinion mining is important mainly because reviews belong to different categories and contain different types of information that are very useful for business intelligence. The main aim of this work is to collect information from different sources with the help of a web crawler and to remove redundant, noisy, and ambiguous data. For this purpose, data pre-processing, data cleaning, and pruning play a very important role, and the summarized result is provided in the form of a pie chart. The visual interpretation of thousands of lines of reviews by thousands of reviewers plays a very important role in business intelligence and decision making.

Kosala et al. [9] recommended a breakdown of Web mining into the following tasks:
• Resource finding: the task of filtering projected Web data sets,
• Information collection and pre-processing,
• Generalization: explicitly determines general patterns at particular Web sites as well as across various sites, and
• Analysis: validation or explanation of the mined patterns.

Web mining has been classified [10] into three classes: Web content, Web structure, and Web usage mining. In addition, some researchers have classified Web mining using two different approaches. In both, the classes are reduced from three to two, Web content and Web usage mining: depending on the application, Web structure is sometimes considered part of Web content [10]; alternatively, Web usage is considered part of Web structure. As the core of data mining, all three classes attract attention in the process of discovering hidden patterns of previously unidentified important information in Web data. All three are mainly centered on different mining objects, based on the contents of the Web. Figure 2 depicts the Web mining classes and their objects.

In this manner, we present a concise introduction to each of the classes. In Web content mining, knowledge is discovered from data that are the traditional collections of text documents, transactional records, and microarray data and, more recently, owing to the growth of video, image, and audio sites and their usage, multimedia data embedded in or linked to the Web page. From the study of various related topics, researchers have divided Web content mining according to two points of view: the database approach and/or the agent-based approach, as discussed in [11].

Figure 3. Task of opinion mining.

In Section I, we provided an introduction to the background of Web mining classification [12]. In Section II, we describe sentiment analysis and the background related to it.
In this part, we review several papers that we considered in our work and mention the limitations of some of them. The proposed method is described in Section III. In Section IV, we describe the experimental work, including a description of the dataset, how the method was implemented, and what tools were used. We also present the graphical results of our work. The paper is concluded in Section V.
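
As a rough illustration of the summarization step outlined in this introduction (removing redundant reviews and reporting the share of each polarity as a pie chart), the following minimal Python sketch may help. It is not the method proposed in this paper: the function names `clean_reviews` and `polarity_shares`, the toy reviews, and the assumption that each review already carries a positive, negative, or neutral tag are all hypothetical and introduced here only for illustration.

```python
from collections import Counter

# The three polarity tags named in the text; treating them as a fixed
# label set is an assumption made for this sketch.
LABELS = ("positive", "negative", "neutral")


def clean_reviews(reviews):
    """Drop empty reviews and exact duplicates (ignoring case and spacing)."""
    seen = set()
    cleaned = []
    for text, label in reviews:
        # Normalize case and collapse whitespace before comparing.
        key = " ".join(text.lower().split())
        if not key or key in seen:
            continue
        seen.add(key)
        cleaned.append((text, label))
    return cleaned


def polarity_shares(reviews):
    """Return the percentage of reviews per polarity tag (pie-chart slices)."""
    counts = Counter(label for _, label in reviews)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: 100.0 * counts[label] / total for label in LABELS}


# Toy data invented for illustration only.
sample = [
    ("Great camera, sharp pictures", "positive"),
    ("great camera,  sharp pictures", "positive"),  # duplicate once normalized
    ("Battery died within a week", "negative"),
    ("It is a camera", "neutral"),
    ("Love the zoom", "positive"),
]
shares = polarity_shares(clean_reviews(sample))
```

On the toy data, the duplicate review is dropped and the remaining four reviews yield shares of 50% positive, 25% negative, and 25% neutral. These percentages could then be passed to any plotting routine (for example, matplotlib's `pyplot.pie`) to draw the chart.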