A Review On Sentiment Analysis Methodologies Practices and Applications
All content following this page was uploaded by Pooja Mehta on 06 October 2020.
Abstract: Sentiment analysis is a technique that examines information in the form of text and determines the opinions contained in that text. It is also termed emotion or opinion mining. Online communication channels such as Twitter, Facebook and YouTube have become a significant part of human life, and people share their thoughts and feelings on them. In this review paper, we focus on opinion mining, or sentiment assessment, which is an area of web data mining and machine learning. The paper presents the results of studies that use different ML and lexicon-based approaches. The outcomes are compared to carry out an evaluation study and to assess the present literature. In this way, the paper will help future researchers understand the current state of sentiment analysis.
Index Terms: Sentiment Analysis, opinion, emotions, Machine Learning, Accuracy, NLP, support vector machine
gained wide acceptance in various areas such as politics [9], business [10], and marketing, selling and advertising (to estimate sales of specific products). Identifying the type of a sentence is therefore the most important part of opinion mining: each sentence has to be classified as either subjective or objective. Existing research uses both supervised and unsupervised learning to provide techniques for the various purposes of sentiment analysis. Early research used one or a combination of the following supervised techniques:
1. Support vector machine
2. Maximum Entropy
3. Naive Bayes
Unsupervised techniques used by early research are:
1. Exploiting sentiment lexicons
2. Grammatical analysis
3. Syntactic patterns

1) Document Level: This is the first level of opinion mining or sentiment analysis, and it is based on the document as a whole. At this level the whole document is taken into consideration to work out its polarity, that is, to classify whether the opinion or emotion it expresses is a positive or a negative sentiment [4]. For this to work, the document should be on a single topic, and its main content should be sentiment or emotion. For example, if a text file contains the review of a single product, the system calculates whether the review as a whole expresses a positive or negative opinion about that product. This level is therefore not valid for a document that reviews many products. Its main advantage is that it gives the overall polarity for the item under review; its drawback is that people's specific likes and dislikes are not captured.
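As an illustration of the supervised route applied at document level, the following is a minimal Naive Bayes sketch, not the method of any cited paper. The training reviews, the function names `train_naive_bayes` and `classify_document`, and the choice of whitespace tokenization with add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents):
    """Count words per label. documents is a list of (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word frequencies
    label_counts = Counter()            # label -> number of documents
    vocabulary = set()
    for text, label in documents:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocabulary.update(tokens)
    return {"words": word_counts, "labels": label_counts, "vocab": vocabulary}


def classify_document(model, text):
    """Return the label with the highest log-probability for the whole document."""
    total_docs = sum(model["labels"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, float("-inf")
    for label, doc_count in model["labels"].items():
        score = math.log(doc_count / total_docs)  # log prior of the label
        total_words = sum(model["words"][label].values())
        for token in text.lower().split():
            count = model["words"][label][token]
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            score += math.log((count + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical single-product reviews, one label per whole document.
training_reviews = [
    ("great phone with excellent battery", "positive"),
    ("excellent camera and great screen", "positive"),
    ("terrible battery and poor screen", "negative"),
    ("poor build and terrible support", "negative"),
]
model = train_naive_bayes(training_reviews)
print(classify_document(model, "great camera"))
```

Because the classifier returns one label per document, it reflects the limitation noted above: a review that praises one feature and criticizes another still receives a single overall polarity.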
IJSTR©2020
www.ijstr.org
INTERNATIONAL JOURNAL OF SCIENTIFIC & TECHNOLOGY RESEARCH VOLUME 9, ISSUE 02, FEBRUARY 2020 ISSN 2277-8616
2. APPLICATIONS

A. Decision making support:
Building a website that can support decision making is a crucial application. Sentiment analysis can surface different ideas that help us make decisions in day-to-day life, such as choosing a good restaurant for dinner, buying a new car or selecting a good movie to watch.
B. Business related application:
Because the market changes every day, competition in the corporate world has increased considerably. Every organization wants to create an innovative, up-to-date product that fully satisfies its customers. To increase the value of its product, an organization can gather the needs of its users and improve the product using the feedback collected from its customers.
C. Predictions and trend analysis:
Tracking public views through sentiment analysis enables a person to predict the market scenario, which helps with trading and polls. By using all these opinions, a user can predict market trends.

3. MATERIALS, METHODS AND APPROACHES
Numerous methodologies are available for opinion mining, but two main groups are used. The first group solves the problems of SA by implementing the machine learning approach. The second group uses the lexicon-based method, which is a linguistically inclined method. Many techniques exist in both groups. The features of a text or sentence can be extracted in the following ways.
1) N-Gram: Words can be taken one at a time (unigram), two at a time (bigram), and so on up to n words. Some opinions cannot be captured by unigram features. For example, in a comment such as "this book is not fascinating", a unigram model that looks only at "fascinating" reads the comment as positive, whereas taking the words together ("not fascinating") shows that it is negative.
2) POS tagging: POS tagging marks each word in a text (corpus) with its part of speech, based on both its definition and its context, i.e. its relationship with adjacent words. Nouns, pronouns, adjectives, adverbs, etc. are examples of different parts of speech.
3) Stemming: Stemming eliminates prefixes and suffixes. For example, 'running' and 'ran' can be reduced to 'run', and 'sleeping' to 'sleep'. It generally helps classification, but it sometimes also decreases classification accuracy.
4) Stop words: Stop words are pronouns (he/she, it), articles (a, an, the) and prepositions (above, in, near, under, besides). These words offer little or no information about the emotions. Lists of stop words are available on the internet and can be used to remove them in the pre-processing step.
5) Conjunction handling: In general, each sentence has only one meaning at a time. However, certain conjunctions such as but, and, while, although and however change the whole meaning of a sentence. For example: "even though the ride was good, it was not up to my hopes." Using these rules can improve performance by 5% [6].
6) Negation handling: Negation words like 'not' invert the meaning of the whole sentence. For example, in "the movie was not good", the word 'good' is positive, but 'not' reverses the polarity to negative.
Identifying emotion or opinion words is an important task in many opinion mining applications. Classifying the polarity of a given feature is the basic task, and polarity is categorized into three classes: positive, negative and neutral. Starting from polarity identification, sentiment strength, sentiment score, etc. can be calculated using lexicon techniques. Various techniques are available for opinion mining, falling mainly into two groups that address the problems of SA: 1) lexicon methods and 2) machine learning methods.
1) Lexicon based approach: In this approach, the given text is first separated into words using the available lexicon techniques, and a score is assigned to each word. The scores are then aggregated: the scores of subjective words (positive, negative, neutral, etc.) are summed separately for each class. In the end four scores are generated, and the class with the maximum score gives the overall polarity of the text [10]. The approach is mainly divided into two parts: a) dictionary-based and b) corpus-based.

A) Dictionary-based approach: In this approach, the user collects a set of sentiment words and prepares a seed list from them. The user then searches dictionaries and lexicons for synonyms and antonyms of these words, and the newly found words are added to the seed list. This process continues until no new words are found.
Disadvantage: It is difficult to find context-oriented or domain-oriented emotion words.
B) Corpus-based approach: A corpus is a collection of writings, often on a very specific subject. In this approach, the user draws on a corpus text to extend the seed list in an organized manner [9].
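The feature-extraction steps and the lexicon-based scoring described in this section can be sketched together as follows. This is a minimal illustration under stated assumptions, not the procedure of any cited work: the tiny `LEXICON`, the `NEGATIONS` set and all function names are hypothetical, the stop-word list holds only the examples given in the text, and negation is simplified to flipping the next lexicon word.

```python
# Stop words taken from the examples in the text (pronouns, articles, prepositions).
STOP_WORDS = {"he", "she", "it", "a", "an", "the", "above", "in", "near", "under", "besides"}
# Hypothetical negation words and sentiment lexicon, for illustration only.
NEGATIONS = {"not", "no", "never"}
LEXICON = {"good": 1, "fascinating": 1, "great": 1, "bad": -1, "boring": -1, "poor": -1}


def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    tokens = []
    for raw in text.lower().split():
        word = raw.strip(".,!?;:\"'()")
        if word:
            tokens.append(word)
    return tokens


def remove_stop_words(tokens):
    """Drop words that carry little or no sentiment information."""
    return [token for token in tokens if token not in STOP_WORDS]


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens (n=1 unigrams, n=2 bigrams, ...)."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def lexicon_polarity(text):
    """Sum positive and negative lexicon scores separately and compare them."""
    positive = negative = 0
    negate = False
    for token in remove_stop_words(tokenize(text)):
        if token in NEGATIONS:
            negate = True  # simplification: flips the next lexicon word only
            continue
        score = LEXICON.get(token, 0)
        if score == 0:
            continue
        if negate:
            score = -score
            negate = False
        if score > 0:
            positive += score
        else:
            negative += -score
    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    return "neutral"


print(ngrams(tokenize("The movie was not good."), 2))
print(lexicon_polarity("The movie was not good."))
print(lexicon_polarity("This book is fascinating"))
```

The bigram list contains the pair ("not", "good"), which is the information a unigram model misses; the scorer recovers the same effect through its negation flag rather than through bigram features.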
is the basic component. Neurons are categorized into three types: input, hidden and output.

3. Decision tree classifier: A condition is used to divide the data. One class consists of the data that satisfy the condition, and the other class consists of the remaining data. The technique is recursive and has two forms: single attribute split and multi attribute split.
4. Rule based classifier: This is a condition-based classifier that makes use of conditions or rules of the form IF, THEN. A rule can be written as
"IF condition THEN decision"
The rules can be produced according to our requirements during the training phase [2].

4. RESOURCES OF SENTIMENT ANALYSIS
Collecting data is a main step of sentiment analysis. Sources include social communication channels such as Twitter and Facebook, or any pre-existing resources.
A) Blogs & Forums: Web forums and blogs are a source of opinions and emotions from which researchers obtain information for research purposes. Forums are generally designed for a single subject; thus, using forums ensures that sentiment mining stays within a single domain. It is also the trend that bloggers update their blogs and reviews every day following activities in and around their areas, their countries and the world.
B) Reviews: Many studies are dedicated only to reviews because of their usefulness for opinions and sentiment. Movie and product reviews have been studied most by researchers, where the main purpose is to obtain feedback from the sentiments and opinions.
C) News Articles: News articles, such as financial articles and political reviews, are a popular source for sentiment analysis [51]. The texts of news articles are mainly structured and formal.
D) Social Networks: Many social networking sites, such as Twitter and Facebook, are available from which opinions and reviews can be taken for sentiment analysis.
Twitter:
Twitter is a micro-blogging service on which users post messages, called tweets, restricted to 140 characters. Users can read one another's tweets. Tweets can serve as opinions and reviews for predicting future patterns, for example to generate poll results.
Facebook:
Facebook, the most famous social networking service, lets users post a personal profile, photos, videos and other related information. It became popular right after its launch in 2004.
With such an ample amount of information available in the form of users' messages, a computing technology that works on the sentiment behind these messages was introduced, known as sentiment analysis.

5. RELATED WORK
Scholars have done many studies to analyze emotions or opinions, and many methods are used to extract the data.

Comparison of various Approaches and Methods
The research works in [22] and [24] used a feed-forward neural network for: 1) identifying online users who often express their feelings, show their perspective and tweet; and 2) characterizing these tweets into different categories based on positive and negative keywords. For this purpose they also used the Twitter API. The convolutional neural network is one of the methods used for sentiment analysis. Combining sentiment analysis with the Morphological Sentence Pattern Model can give good outcomes. Other techniques such as tokenization, stemming and preprocessing, the self-organizing map (SOM), and the recursive neural network can also be used for sentiment analysis.

In [21], the proposed system preprocesses the data, classifies it at sentence level and then extracts the features of the data. It then applies coreference resolution and uses SentiWordNet. It applies the SVM machine learning approach to compute the accuracy for the product feature. Finally, the overall sentiment or accuracy regarding the product feature is found.

In [23], different machine learning methods are used to extract emotions. The work applies preprocessing, subjectivity classification and feature classification to different Twitter data. Finally, it computes the accuracy of all the machine learning methods used, such as SVM, Naïve Bayes and ME.

In [25], a different method related to artificial intelligence is proposed. The proposed model works on the VADER method, which differs from traditional methods such as SVM or ME. Using VADER (Valence Aware Dictionary and sEntiment Reasoner), the opinions are categorized into positive, negative and neutral. The result shows which of three artificial intelligence assistants ranks highest.

The best resource for SA is review data. In [26], researchers use techniques of NLP and computational linguistics to classify the sentiments of hotel review data. The outcome shows results for satisfaction, security, comfort, luxury and lodging services for tourists. It would help hotel managers learn what customers need, discover areas for further development and increase service quality. NLP techniques are applied to the review data to process the text for sentiment analysis. Here, the researchers used a Sentimental Polarity Based Model (SPBM) for their work. It uses the multinomial algorithm from the Naïve Bayes method, which gave good prediction results compared to other classification algorithms.

In opinion mining, e-commerce and news datasets are available. The research in [27] and [28] took its datasets from Amazon and the BBC online news channel. In handling the datasets, the proposed works divide the text into positive and negative, from the feature reviews and the articles respectively, and perform different analysis steps such as preprocessing, where the data is cleaned for analysis. The first goal is to calculate the polarity of the textual data, whether it is positive or negative. Naïve Bayes and SVM, which are supervised learning methods, are used to find the accuracy and precision on the data.

A collection of known and defined words is called a sentiment lexicon. The two types of sentiment analysis are: 1) lexicon based and 2) machine learning. Polarity shift is the main concern at any aspect or feature level, and research such as [29][31] has been done to detect polarity shifts. That research uses the bag-of-words model, which handles text data as a vector of different words, and various ML techniques are used to categorize these words.
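Two ideas from this section, the "IF condition THEN decision" rule form and the bag-of-words vector representation, can be sketched as follows. This is an illustrative example only: the vocabulary, the two hand-written rules and the function names are hypothetical and are not taken from the cited works.

```python
from collections import Counter


def bag_of_words(text, vocabulary):
    """Represent a text as a vector of word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def rule_based_classify(text):
    """Apply hand-written IF condition THEN decision rules in order."""
    counts = Counter(text.lower().split())
    # Hypothetical rules; each pair is (condition on the word counts, decision).
    rules = [
        (lambda c: c["excellent"] > 0 or c["good"] > 0, "positive"),
        (lambda c: c["terrible"] > 0 or c["bad"] > 0, "negative"),
    ]
    for condition, decision in rules:
        if condition(counts):  # IF condition
            return decision    # THEN decision
    return "neutral"           # no rule fired


print(bag_of_words("good food good service", ["good", "food", "bad"]))
print(rule_based_classify("the room was terrible"))
```

The first matching rule wins, so rule order matters. Neither representation keeps word order, which is one reason a bag-of-words model struggles with polarity shifts such as negation.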
This model also suffers considerably from the polarity shift issue. A technique called PSDEE (Polarity Shift Detection, Elimination, and Ensemble) is used to address the polarity shift issue in document-level sentiment analysis. ML techniques such as Naïve Bayes and SVM are then used to determine sentiment polarity (positive, negative or neutral) after preprocessing the data.

Twitter is a micro-blogging site, and its data are not in a structured format. The data are shared by different users in the form of their feelings or accounts of their daily life. Twitter data are appropriate for data stream mining because the messages are small and continuous. The research works [30][32][33][36][41] analyze sentiment from the short texts of Twitter data. Categorizing text data into positive or negative opinion, in general or for a specific item feature, is called sentiment classification. The views of different people about a specific product can be divided into positive, negative or neutral sentiments. Supervised machine learning is the most reliable method for sentiment analysis. The machine learning algorithms that are useful for sentiment analysis, for example to find the accuracy for a product feature, are Naive Bayes, Maximum Entropy and Support Vector Machine (SVM). Sentiment analysis is a very challenging and important task that builds on machine learning.

In [35], the authors propose sentiment classification for the Arabic language. They argue that Arabic tweets present a good opportunity for opinion mining research, but that such research has been held back by the lack of sentiment analysis resources and by the difficulties of Arabic text analysis. Classification is done at two levels. The first level is a subjectivity analyzer, based on supervised approaches, that filters the reviews into relevant and irrelevant. The second level is a sentiment analyzer, also based on supervised approaches and ensemble techniques, that classifies the relevant reviews into positive, negative and neutral. Tests conducted with different weighting schemes, stemming and n-gram techniques showed that an SVM classifier using TF-IDF with bigram features performed better than the Naive Bayes classifier. Naïve Bayes (NB), Support Vector Machine (SVM) and Rocchio classifiers were included in this classification.

In [34], the authors propose sentiment analysis for retrieving, from related documents, opinions on any topic, giving the output as positive or negative. For opinion retrieval, topic-related structures are built with the help of query-dependent features. The researchers use SVMRank to implement the ranking algorithm for data retrieval. Many methods can be used to calculate ranking performance; Mean Average Precision (MAP) is the evaluation metric used by the TREC community.

In [37], a Naïve Bayes classifier is used to detect the polarity of English tweets, i.e. whether tweets are positive, negative or neutral. Two variations of the Naive Bayes classifier were constructed: 1) Baseline and 2) Binary (which makes use of a lexicon and classifies into positive and negative). Multiwords from various sources and valence shifters are identified by this approach.

The work in [38] discusses social media sites such as Twitter and Facebook, which are very popular social networks. The authors propose a new framework for finding the polarity of opinions or emotions in a web dataset. It combines this system with manually labelled data from Twitter; the Twitter API is used to gather data. It classifies the data into positive, negative and neutral. Unigram Naive Bayes, a variant of the Naive Bayes approach, is used for this.

In [39], the authors propose a supervised sentiment classification framework based on data from Twitter to find the accuracy on the data. It uses Twitter user-defined hashtags in tweets, single words and n-grams, which are then combined into a single feature vector for sentiment classification. The K-Nearest Neighbor algorithm is used to assign sentiment labels by building a feature vector for every example in the training and test sets.
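Mean Average Precision, mentioned above as a ranking metric, can be computed as in the following minimal sketch. It assumes each ranking is given as a list of relevance flags in ranked order and that every relevant document for the query appears somewhere in that list; the function names are hypothetical.

```python
def average_precision(ranked_relevance):
    """Average of precision@k over the ranks k at which a relevant item appears.

    ranked_relevance: list of booleans in ranked order, True where the
    result at that rank is relevant.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of the average precision over all queries."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Two hypothetical queries: relevant results at ranks 1 and 3, then at rank 2.
rankings = [[True, False, True], [False, True]]
print(mean_average_precision(rankings))
```

For the first query the average precision is (1/1 + 2/3) / 2, and for the second it is 1/2, so the mean over both queries is 2/3.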
Articles[28]
No. | Year | Paper | Method | Dataset | Accuracy
9 | 2017 | Polarity Shift Detection Approaches in Sentiment Analysis: A survey [29] | Lexicon-based and Supervised Machine Learning-based | Product Review | 84.6%
10 | 2017 | A Sentiment Analysis Method of Short Texts in Microblog [30] | Language Technology Platform (LTP) for dependency syntax analysis | COAE2014 (BBC DataSet) | 86.5%
11 | 2016 | SemEval-2016 Task 4: Sentiment Analysis in Twitter [32] | SVM | Twitter Dataset | 84.5%
12 | 2016 | A Topic-based Approach for Sentiment Analysis on Twitter Data [33] | SVM | Twitter Dataset | 74.09%
13 | 2013 | Ensemble of Classification Algorithms for Subjectivity and Sentiment Analysis of Arabic Customers' Reviews [35] | Naive Bayes, SVM | Arabic Reviews from jeeran.com (service and product reviews) | 97.06%, 89.1%
14 | 2014 | Citius: A Naive-Bayes Strategy for Sentiment Analysis on English Tweets [37] | Naïve Bayes | Training Dataset of Tweets by SEMEVAL2014 | 76.54%
15 | 2013 | Opinion Mining on Social Media Data [38] | Naïve Bayes | Twitter Dataset | 76.8%
16 | 2010 | Sentiment Knowledge Discovery in Twitter streaming Data [41] | Multinomial Naïve Bayes | Twitter API | 82.45%
17 | 2010 | Twitter as a Corpus for Sentiment Analysis and Opinion Mining [42] | SRF | Twitter Dataset | 56.4%
Overview." Systems, Man, and Cybernetics (SMC), IEEE International Conference on. IEEE, 2015.
[5] Rasika Wagh, Payal Punde, "Survey on Sentiment Analysis using Twitter Dataset", Proceedings of the 2nd International Conference on Electronics, Communication and Aerospace Technology (ICECA 2018), IEEE Xplore, ISBN: 978-1-5386-0965-1.
[6] Anchal Kathuria, Dr. Saurav Upadhyay, "A Novel Review of Various Sentimental Analysis Techniques", International Journal of Computer Science and Mobile Computing, Vol. 6, Issue 4, April 2017, pp. 17-22.
[7] D. M. E.-D. M. Hussein, "A survey on sentiment analysis challenges", J. King Saud Univ. - Eng. Sci., vol. 34, no. 4, 2016.
[8] Liu, B., Sentiment analysis: mining opinions, sentiments, and emotions. The Cambridge University Press, 2015.
[9] Bilal Saberi, Saidah Saad, "Sentiment Analysis Or Opinion Mining: A Review", International Journal of Advanced Science Engineering Information Technology, Vol. 7 (2017), ISSN: 2088-5334.
[10] J. Bollen, H. Mao, and X. Zeng, "Twitter mood predicts the stock market", Journal of Computational Science, 2(1): 1-8, 2011.
[11] T. Xu, Q. Peng and Y. Cheng, "Identifying the semantic orientation of terms using S-HAL for sentiment analysis", Knowledge-Based Systems, 35: 279-289, 2012.
[12] T.T. Dang, N. T. X. Huong, A.C. Le and V.N. Huynh, "Automatically Learning Patterns in Subjectivity Classification for Vietnamese", Knowledge and Systems Engineering, Springer, pp. 629-640, 2015.
[13] Arora, Piyush, "Sentiment Analysis for Hindi Language", Diss., International Institute of Information Technology Hyderabad, 2013.
[14] T. Wilson, P. Hoffmann, S. Somasundaran, J. Kessler, J. Wiebe, Y. Choi, C. Cardie, E. Riloff and S. Patwardhan, "Opinion Finder: A system for subjectivity analysis", In Proceedings of hlt/emnlp on interactive demonstrations, pp. 34-35.
[15] E. Riloff, J. Wiebe and W. Phillips, "Exploiting subjectivity classification to improve information extraction", In Proceedings of the National Conference On Artificial Intelligence, pp. 1106.
[16] P. D. Turney, "Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews", In Proceedings of the 40th annual meeting on association for computational linguistics, pp. 417-424.
[17] Math Alrefai, Hossam Faris, Ibrahim Aljarah, "Sentiment analysis for Arabic language: A brief survey of approaches and techniques", 2018.
[18] Emma Haddi, Xiaohui Liu, Yong Shi, "The Role of Text Preprocessing in Sentiment Analysis", ELSEVIER, Procedia Computer Science 17 (2013) 26-32.
[19] Jagdale, Rajkumar S., Vishal S. Shirsat, and Sachin N. Deshmukh, "Sentiment Analysis of Events from Twitter Using Open Source Tool" (2016).
[20] Kang Hanhoon, Yoo Seong Joon, Han Dongil, "Senti-lexicon and improved Naïve Bayes algorithms for sentiment analysis of restaurant reviews", Expert Syst Appl, 39:6000-10, 2012.
[21] Hari Krishna M, Rahamathulla K, Ali Akbar, "A Feature Based Approach for Sentiment Analysis using SVM and Coreference Resolution", International Conference on Inventive Communication and Computational Technologies, ICICCT 2017.
[22] Brett Duncan and Yanqing Zhang, "Neural Networks for Sentiment Analysis on Twitter", IEEE 14th International Conference on Cognitive Informatics & Cognitive Computing (ICCICC 2015).
[23] Monika Negi, Kanika Vishwakarma, Goldi Rawat, Priyanka Badhani, Bhumika Gupta, "Study of Twitter Sentiment Analysis using Machine Learning Algorithms on Python", International Journal of Computer Applications (0975-8887), Volume 165, No. 9, May 2017.
[24] Shiv Dhar, S. Pednekar, K. Borad, Prof. Ashwini Save, "Sentiment Analysis using Neural Networks: A New Approach", International Conference on Inventive Communication and Computational Technologies (ICICCT 2018).
[25] Chae Won Park, Dae Ryong Seo, "Sentiment Analysis of Twitter Corpus Related to Artificial Intelligence Assistants", 5th International Conference on Industrial Engineering and Applications, 2018.
[26] Kudakwashe Zvarevashe, Oludayo O. Olugbara, "A framework for sentiment analysis with opinion mining of hotel reviews", Conference on Information Communications Technology and Society (ICTAS) 2018.
[27] Satuluri Vanaja, Meena Belwal, "Aspect-Level Sentiment Analysis on E-Commerce Data", International Conference on Inventive Research in Computing Applications (ICIRCA 2018).
[28] Vishal S. Shirsat, Rajkumar S. Jagdale, S. N. Deshmukh, "Document Level Sentiment Analysis from News Articles", International Conference on Computing, Communication, Control and Automation (ICCUBEA) 2017.
[29] Sayali Zirpe, Bela Joglekar, "Polarity Shift Detection Approaches in Sentiment Analysis: A survey", International Conference on Inventive Systems and Control, 2017.
[30] Jie Li, Lirong Qiu, "A Sentiment Analysis Method of Short Texts in Microblog", International Conference on Computational Science and Engineering (CSE) and IEEE International Conference on Embedded and Ubiquitous Computing (EUC) 2017.
[31] Erik Cambria, Nanyang Technological University, "Affective Computing and Sentiment Analysis", IEEE Intelligent Systems, 2016.
[32] Preslav Nakov, Alan Ritter, Sara Rosenthal, Fabrizio Sebastiani, Veselin Stoyanov, "SemEval-2016 Task 4: Sentiment Analysis in Twitter", Proceedings of SemEval-2016.
[33] Pierre Ficamos, Yan Liu, "A Topic based Approach for Sentiment Analysis on Twitter Data", International Journal of Advanced Computer Science and Applications, 2016.
[34] Zhunchen Luo, Miles Osborne, Ting Wang, "An effective approach to tweets opinion retrieval", Springer Science+Business Media New York, 2013.
[35] Nazlia Omar, Mohammed Albared, Adel Qasem Al-Shabi, Tareq Al-Moslmi, "Ensemble of Classification Algorithms for Subjectivity and Sentiment Analysis of Arabic Customers' Reviews", International Journal of Advancements in Computing Technology (IJACT), 2013.
[36] Neha Upadhyay, Prof. Angad Singh, "Sentiment Analysis on Twitter by using Machine Learning Technique",