Aspect-Based Sentiment Analysis of Movie Reviews
2021
Samuel Onalaja, Southern Methodist University
Eric Romero, Southern Methodist University
Bosang Yun, Southern Methodist University
[email protected], [email protected],
[email protected], [email protected]
1 Introduction
Sentiment analysis of user-generated reviews supports a range of applications, including the following:
• Recommendation System
• Product and Service Quality Improvement
• Organizational Decision Making
• Customer Decision Making
• Organizational Marketing Research
The dataset used in this research was obtained by scraping IMDB reviews. Using this data, several models will be built to perform sentiment analysis on a variety of aspects within several film genres. This paper aims to build classification models that show whether viewers' sentiments are positive or negative. Since traditional sentiment analysis focuses only on classifying overall sentiment without specifying which parts of a review drive it, Aspect-Based Sentiment Analysis (ABSA) will be utilized in this study. This approach will also help analyze common words, slang, emoticons, and typographical errors related to different movies.
Aspect Extraction (AE) is the process of extracting all the terms related to an aspect of a product or service, including slang, abbreviations, emoticons, and typographical errors. Performing this task requires sequence labeling in which each input word is labeled as B, I, or O, where B means Beginning, I means Inside, and O means Outside. This positional labeling is necessary because aspects can sometimes contain two or more words.
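To make the labeling scheme concrete, the short sketch below hand-assigns BIO labels to a hypothetical review sentence; the sentence and labels are illustrative only and are not output from the taggers used in this study.

```python
# Minimal illustration of BIO sequence labels for aspect extraction.
# The sentence and labels are hypothetical, chosen only to show the scheme.
tokens = ["The", "background", "music", "was", "haunting", "but", "the", "plot", "dragged"]
labels = ["O",   "B",          "I",     "O",   "O",        "O",   "O",   "B",    "O"]

# "background music" is a two-word aspect: B marks its beginning, I its inside;
# every non-aspect token is tagged O (outside).
for token, label in zip(tokens, labels):
    print(f"{token:12s} {label}")
```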
Aspect Sentiment Classification (ASC) is the process of classifying sentiments into their categories: positive or negative. The task involves scraping review data from the IMDB website, checking the dataset for missing data and outliers, and ensuring the dataset is not biased so that near-accurate results can be obtained, before building models on the cleaned dataset. Support Vector Machine, Logistic Regression, Naïve Bayes, and Long Short-Term Memory neural networks will be implemented in conjunction with the aspect components AE and ASC and with genre driving factors.
Most past research and papers focused on binary classification, but this paper will cover a more in-depth understanding of sentiment analysis, including various aspects of movie reviews such as cast, music, location, technology, and quality. This will help in instances where reviewers are interested in the quality or cast of a movie, or where a reviewer is only interested in watching movies by a particular producer and wants to read reviews of that producer's movies.
Since previous papers have discussed different methods for sentiment analysis on movie review datasets, this study will supplement existing research on movie reviews by focusing on Aspect-Based Sentiment Analysis (ABSA) with aspect- and genre-specific driving factors to develop a more granular level of sentiment understanding and prediction. This research will contribute to existing work on sentiment analysis and ABSA for movie reviews and help inform business decisions regarding movie quality and customer satisfaction.
2 Literature Review
As people spend more time watching movies through streaming services, the need increases for efficient assessment of which movies to recommend. One proven method is sentiment analysis of movie reviews, which have become more available to the public through developments in online media. In the article "A difference of multimedia consumer's rating and review through sentiment analysis," published by Lee, Jiang, Kong, and Liu in 2020, the authors address a strong need for review-based sentiment analysis in the consumer world: "This is so because services are difficult to predict until they are experienced" [23].
The first film critiques came soon after the dawn of film media in the early 1900s.
As films became more popular, newspapers began hiring professional critics to write
more serious analysis of the films to add more than just entertainment value [6]. New
styles of film analysis developed over time and eventually became a standard feature
for prominent magazines.
In more modern times, film critique was further popularized through television media. Established critics Roger Ebert and Gene Siskel were notable for developing the show "Siskel & Ebert At the Movies" in the 1980s, which not only reviewed films but also conducted interviews with film actors. The main task for most review media is to explain a film's premise in addition to its artistic or entertainment merits. Film evaluations are often summarized through a rating system such as numeric scales, grades, image representations, or "thumbs" in the case of Siskel and Ebert.
Online blogs were one of the first internet media to be used for film criticism,
allowing any person to write their opinion of a film for others to read. However,
audience size was limited by the popularity of amateur writers and the sites they used.
More modern platforms such as YouTube function in a similar fashion but provide access to a wider audience and broader interests, with the use of videos, cut-scenes, animations, and actors to express film critiques.
Specialized websites were also developed to provide a direct source for film
critiques and reviews. Specific types of criticism have developed within online media
that focus on particular aspects such as scientific realism, plot holes, and theories on
possible sequels. Other sites may be specifically tailored to offer analysis on aspects
such as content advisories, for parents concerned with their children watching the film.
Several sites are dedicated to providing a venue for the general public to express their views on films. These typically incorporate written commentary from the user that can vary greatly in length depending on its depth and breadth. Additionally, a scaled rating system is commonly included, which is then used to calculate an average rating and a rank for comparison with other movies.
The modern film criticism industry has been shown to exhibit some bias, particularly regarding gender. Reviews are often male-dominated, with women underrepresented. For example, male reviewers authoring articles in Time magazine or serving as radio critics on NPR have been shown to represent approximately 70-80 percent of those formats [8]. Changes initially introduced by the internet led to a decrease in women working as film critics in newspapers, which eventually developed into a shortage of women as opinion columnists overall. Men were more likely to retain these roles and therefore became the more prevalent voice among reviewers [20].
In prior work, a sentiment lexicon was built using unigram, bigram, and trigram patterns; these included negative words and intensifying adverbs used as features. An improved Naive Bayes algorithm was introduced to address the imbalance between positive and negative classification accuracy, which may prove useful for distinguishing between review ratings within this research [4][18].
In a 2018 publication, Khaleghi, Cannon, and Srinivas evaluated commonly used hotel review recommender algorithms by comparing their accuracy. Collaborative filtering is a method in which predictions are made based on user ratings, using the theory that users with similar ratings will like similar things. Matrix factorization uses singular value decomposition, which is typically more accurate except in the case of sparse datasets. In their experiment they found significantly better accuracy for matrix factorization, but at the cost of roughly ten times longer processing time. They also concluded that the limited size of their sample data for linking commonalities between users affected the outcomes. This procedure can be emulated for the purposes of this research; data obtained from professional reviewers who typically review all types of movies would be more easily separated to avoid sparsity [19].
The general audience seeks entertainment with a good story that fulfills their satisfaction, whereas professional critics dissect movies against their own critical standards. IMDB reflects the general audience's views, while Rotten Tomatoes reflects a small group of selected critics. Overall, the IMDB dataset is more representative of the general audience, whose reviews inform the decisions of both ordinary and avid moviegoers, and is therefore concluded to be more suitable than Rotten Tomatoes for this study.
3 Data
An API request was made to IMDB to get 3,000 movie titles (600 movie reviews per genre). Using the BeautifulSoup package, website URLs that link to each movie's review page were scraped. For each movie, one positive and one negative review were extracted using the minimum and maximum review scores. This ensures that the reviews carry clear sentiment on either side. For each review, the genre of the movie was imported as well. A total of 3,000 movie reviews, with an equal number of positive and negative reviews, were collected. Movies were scraped and collected so that the genres are equally distributed among Horror, Comedy, Action, Romance, and Sci-fi. The resulting data frame is then randomly shuffled before being fed into the preprocessing for the models.
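The sketch below illustrates the scraping step with BeautifulSoup. The URL pattern and the CSS class used to locate review text are assumptions for illustration; IMDB's actual endpoints and markup may differ.

```python
# Minimal sketch of the review-scraping step (hypothetical URL and CSS
# selector; IMDB's actual markup and endpoints may differ).
import requests
from bs4 import BeautifulSoup

def scrape_review_texts(review_page_url):
    """Return the review texts found on a single IMDB review page."""
    response = requests.get(review_page_url, headers={"User-Agent": "Mozilla/5.0"})
    soup = BeautifulSoup(response.text, "html.parser")
    # The class name below is an assumption for illustration only.
    return [div.get_text(strip=True) for div in soup.select("div.text.show-more__control")]

# Example usage with a placeholder title ID:
# reviews = scrape_review_texts("https://www.imdb.com/title/tt0000000/reviews")
```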
It is crucial to keep the dimensionality of the text low to improve the performance of the machine learning classifier [14]. Thus, it is highly recommended to remove as much noise as possible and to properly preprocess the text in this pre-analysis stage.
Punctuation and special characters: punctuation marks, special characters, and HTML special characters will be removed.
Tokenization: Each review will be separated into smaller units called tokens using the NLTK tokenization package.
POS tagging: A generic POS tagging is applied to classify words into four categories: adjective, verb, noun, and adverb. More detailed tagging techniques will be developed to further increase the accuracy of lemmatization. As an alternative, a voting-style algorithm will be developed that combines several different POS techniques and selects the tag chosen by the majority.
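The following is a minimal sketch of this preprocessing stage using NLTK (tokenization, POS tagging, and POS-aware lemmatization). The study's exact pipeline and the majority-vote POS variant described above are not reproduced here.

```python
# Minimal preprocessing sketch using NLTK: tokenize, POS tag, then lemmatize
# with the tag mapped to one of the four WordNet categories.
import nltk
from nltk.corpus import wordnet
from nltk.stem import WordNetLemmatizer

# One-time resource downloads (safe to re-run).
for resource in ["punkt", "averaged_perceptron_tagger", "wordnet"]:
    nltk.download(resource, quiet=True)

def to_wordnet_pos(treebank_tag):
    """Map Penn Treebank tags to the four WordNet categories."""
    if treebank_tag.startswith("J"):
        return wordnet.ADJ
    if treebank_tag.startswith("V"):
        return wordnet.VERB
    if treebank_tag.startswith("R"):
        return wordnet.ADV
    return wordnet.NOUN

lemmatizer = WordNetLemmatizer()

def preprocess(review):
    tokens = nltk.word_tokenize(review.lower())
    tagged = nltk.pos_tag(tokens)
    return [lemmatizer.lemmatize(tok, to_wordnet_pos(tag)) for tok, tag in tagged]

print(preprocess("The actors were amazing but the plot dragged badly."))
```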
https://scholar.smu.edu/datasciencereview/vol5/iss3/10 6
Onalaja et al.: Aspect-based Sentiment Analysis
3.3 EDA
The average number of adjectives per rating and the most common words for each sentiment have been studied. The following graph shows an approximately normal distribution of sentiment and subjectivity. The lemmatized dataset was then plotted by the most frequent adjectives in each sentiment to show the separation in word representations.
Fig. 2. Subjectivity distribution: an approximately normal curve with slight right skewness
As expected from the curve pictured above, the words are slightly skewed toward the positive sentiment, as shown in the word cloud below. More refined word preprocessing will be implemented to see how this composition changes accordingly.
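A minimal sketch of the adjective-frequency portion of this EDA is shown below; the toy rows and their layout are assumptions for illustration rather than the study's actual data structures.

```python
# Minimal sketch of counting the most frequent adjectives per sentiment class.
# Each row is a (lemmas, Penn Treebank tags, sentiment label) triple; this
# layout is hypothetical and only illustrates the counting step.
from collections import Counter

def top_adjectives(rows, sentiment, n=20):
    """Count the most frequent adjective lemmas for one sentiment class."""
    counts = Counter()
    for lemmas, tags, label in rows:
        if label != sentiment:
            continue
        counts.update(lemma for lemma, tag in zip(lemmas, tags) if tag.startswith("JJ"))
    return counts.most_common(n)

rows = [
    (["great", "movie", "fantastic", "plot"], ["JJ", "NN", "JJ", "NN"], "positive"),
    (["boring", "plot", "terrible", "acting"], ["JJ", "NN", "JJ", "NN"], "negative"),
]
print(top_adjectives(rows, "positive"))
```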
After dominant aspects are identified in each review, negative and positive occurrences of each dominant aspect are counted and plotted for EDA purposes. Using the timestamp of each review, the count of the most frequent words for each year will be measured and plotted for trend analysis over time. After the top five most and least frequent words are identified for each year, they will be put in a lexicon list and assigned special weights (lower weights for frequent words, higher weights for the least frequent words).
Each genre was examined to see which words occur in it most frequently. Even though the two opposite genres, "Comedy" and "Horror," were expected to show a clear distinction in words, the general words and their frequencies were similar to each other. The similarity in the lexicon lists between the two genres underscores the necessity of applying aspect-based analysis for a granular level of sentiment understanding across different aspects and genres.
The following graphs are plotted using the lexicon-based package Spacy. Spacy classifies the lemmatized text into nine different categories, and the most frequently occurring words are collected for each genre. Representative actors and directors for each genre appear, such as the famous romance actress "Audrey Hepburn" for Romance and a famous horror movie director "James Carpenter" for Horror. The tags also show representative places such as "Japan," "New York," and "France" for Romance and "Texas" for Horror. This performs much better than the general EDA conducted previously in terms of bringing out the separation among different aspects, which leads to better sentiment analysis.
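A minimal sketch of this entity-tagging step is shown below; the small English model and the example review are assumptions for illustration and may differ from the study's configuration.

```python
# Minimal sketch of named-entity tagging with spaCy and counting the most
# frequent (entity, label) pairs across a genre's reviews.
import spacy
from collections import Counter

nlp = spacy.load("en_core_web_sm")  # requires: python -m spacy download en_core_web_sm

def entity_counts(reviews):
    """Count (entity text, entity label) pairs across a list of reviews."""
    counts = Counter()
    for doc in nlp.pipe(reviews):
        counts.update((ent.text, ent.label_) for ent in doc.ents)
    return counts

reviews = ["Audrey Hepburn is luminous in this Paris-set romance."]
print(entity_counts(reviews).most_common(10))
```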
Fig. 6. Top 30 most frequent tags shown for Romance Genre using Spacy
Fig. 7. Top 30 most frequent tags shown for Horror Genre using Spacy
Fig. 8. Interactive plot of the LDA result shows that a group of three topics is optimal.
4 Methods
Spacy is one of the best open-source NER tools available and provides several predefined entity categories. After each word is tagged, only the nouns that are semantically similar to the predefined aspects, measured using word2vec, are extracted and stored in a bag of words for each aspect.
Uncommon terms such as actor and director names were manually searched online and added to the lexicon. This process ensures that the directing and acting aspect-based components filter the terms that are associated with them.
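The sketch below illustrates how candidate nouns could be matched to predefined aspects by vector similarity. It uses spaCy's medium model as a stand-in for word2vec, and the aspect seed lists and similarity threshold are illustrative assumptions rather than the study's exact configuration.

```python
# Illustrative sketch: assign a candidate noun to the most similar predefined
# aspect using word vectors. Seeds and threshold are assumptions.
import spacy

nlp = spacy.load("en_core_web_md")  # medium English model ships with word vectors

ASPECT_SEEDS = {
    "acting": nlp("actor actress performance cast"),
    "music": nlp("music score soundtrack song"),
    "plot": nlp("plot story storyline script"),
}

def assign_aspect(noun, threshold=0.5):
    """Return the most similar aspect for a candidate noun, or None if too dissimilar."""
    doc = nlp(noun)
    if not doc.has_vector:
        return None
    best_aspect, best_sim = max(
        ((aspect, doc.similarity(seeds)) for aspect, seeds in ASPECT_SEEDS.items()),
        key=lambda pair: pair[1],
    )
    return best_aspect if best_sim >= threshold else None

print(assign_aspect("soundtrack"))  # expected to map to "music"
```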
Each feature extraction method listed above will generate aspects that differ in content and scale. By aggregating the resulting aspects, each review is reduced to its dominant aspect-specific chunk. The textual data then has to be converted into vectors in order to be fed into machine learning algorithms. Term Frequency-Inverse Document Frequency (TF-IDF) and CountVectorizer perform the essential task of text vectorization, associating each word with a number that represents how relevant that word is in the document. TF-IDF differs from CountVectorizer in that TF-IDF considers the overall document weighting of a word and penalizes the most frequent words. Both methods will be implemented and compared for their effect on model accuracy.
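The following sketch contrasts the two vectorizers on a toy corpus using scikit-learn; the parameters used in the study are not reproduced here.

```python
# Minimal comparison of the two vectorization options on a toy corpus.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

corpus = [
    "great acting and a great score",
    "weak plot and terrible acting",
]

count_vec = CountVectorizer()
tfidf_vec = TfidfVectorizer()

X_counts = count_vec.fit_transform(corpus)   # raw term counts
X_tfidf = tfidf_vec.fit_transform(corpus)    # counts reweighted by inverse document frequency

print(count_vec.get_feature_names_out())
print(X_counts.toarray())
print(X_tfidf.toarray().round(2))
```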
After the reviews are split and reduced to the dominant aspect specific to each review, the text is converted into vectors using both TF-IDF and CountVectorizer as described above. Different machine learning tools are then applied, with standard training and testing conducted on those vectors, and the reviews are classified into sentiments of -1 and 1. Genre driving factors are taken into consideration when determining the final sentiment prediction. All the steps leading up to sentiment analysis using machine and deep learning are displayed in Figure 10 below.
Fig. 10. General workflow leading up to sentiment analysis using various feature extraction and machine and deep learning tools. LDA = Latent Dirichlet Allocation, LR = Logistic Regression, NB = Naive Bayes, SVM = Support Vector Machine, RNN-LSTM = Recurrent Neural Network with Long Short-Term Memory
Several machine and deep learning models, namely Naïve Bayes, Logistic Regression, Support Vector Machine, and a Recurrent Neural Network with LSTM, will be used to train and test on the review content and predetermined sentiments.
5 Results
The use of weighted driving factors to identify movie aspects in previous work was notable in that the highest importance was given to certain aspects such as "movie," "acting," and "plot" rather than equal weights, since equal weighting resulted in a slight drop in metric performance. Using weighted aspects also worked well to suppress opinions in the reviews on other factors, such as when a review was longer, which added to its sentiment influence. Accuracies also varied between genres, ranging from 63.3% to 87.3%, the latter for the "crime" genre. Driving factor importance also shifted between genres; "crime," for example, found "movie," "screenplay," and "plot" to be more influential. The genre-specific accuracies were found to increase when reviewers had commented on those driving factors [31].
In prior research, extraction of aspects was found to be most successful using a frequency-based TF-IDF approach, though this was limited by the very large amounts of data required. A rule-based approach worked best in prior publications for extraction, at 92% precision [42]. However, this also typically required a large set of hand-crafted rules, which tend to perform badly on undefined instances and specifically in cases of named entity recognition for different languages [32].
The hypothesis of this research is that assigning higher driving factors to certain aspects of a movie results in higher model accuracy. The driving factors were randomly assigned to various movie aspects, and their impact, tied to each aspect and genre, on the sentiment classification was fully investigated based on the accuracy of each model. Figure 11 shows the distribution of average aspect driving factors for each genre for the highest accuracy observed in this research. After numerous iterations using cross validation to ensure the generalization of the driving factors over the entire dataset, the study revealed that assigning higher driving factors to plot and acting for Action, directing and music for Horror, screenplay and acting for Comedy, screenplay and acting for Romance, and plot and directing for Sci-fi results in higher accuracy for the models utilized in this research. Figure 12 shows the generalized result of the top contributing movie aspects per genre group based on model performance.
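The sketch below illustrates one way genre-specific driving factors could weight aspect-level sentiment scores into a final review-level prediction. The weights and the weighted-sum rule are illustrative assumptions, not the values or combination scheme used in this study.

```python
# Illustrative sketch of combining aspect-level sentiment scores with
# genre-specific driving factors. Weights and the weighted-sum rule are
# assumptions for illustration only.
GENRE_DRIVING_FACTORS = {
    "Action": {"plot": 0.35, "acting": 0.35, "directing": 0.10, "music": 0.10, "screenplay": 0.10},
    "Horror": {"directing": 0.35, "music": 0.35, "plot": 0.10, "acting": 0.10, "screenplay": 0.10},
}

def weighted_sentiment(aspect_scores, genre):
    """Combine per-aspect scores in [-1, 1] into a single review-level label."""
    weights = GENRE_DRIVING_FACTORS[genre]
    total = sum(weights.get(aspect, 0.0) * score for aspect, score in aspect_scores.items())
    return 1 if total >= 0 else -1

# A horror review with positive directing/music but a weak plot:
print(weighted_sentiment({"directing": 0.8, "music": 0.6, "plot": -0.9}, "Horror"))
```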
Fig. 11. Driving factor distribution among various aspects and genres
Fig. 12. Top contributing movie aspects per genre based on model performance
10-fold cross validation (CV) was performed on the entire dataset to measure model effectiveness. Figure 13 shows model performance when TF-IDF vectorization was used. SVM performed best with TF-IDF vectorization in terms of average accuracy and standard deviation across its CV folds. Naïve Bayes performed the worst when ingesting TF-IDF vectors, which are continuous rather than discrete values. When using CountVectorizer, the performance of LR and SVM dropped by 3-4%. Figure 14 shows that NB performs best in terms of average accuracy and standard deviation across its CV results. This shows that properly pairing the vectorizer with the model is imperative for performance. The research proceeded using CountVectorizer for NB and TF-IDF for the LR and SVM models.
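The following sketch illustrates this pairing of vectorizers and models under 10-fold CV on a toy corpus; the study's preprocessing, aspect reduction, and hyperparameters are not reproduced here.

```python
# Minimal sketch of the 10-fold CV comparison with vectorizer/model pairings.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy corpus repeated to give the folds something to work with.
texts = ["great acting", "terrible plot", "wonderful score", "boring and dull",
         "loved the cast", "awful directing", "brilliant screenplay", "weak story"] * 20
labels = [1, -1, 1, -1, 1, -1, 1, -1] * 20

pipelines = {
    "NB + CountVectorizer": make_pipeline(CountVectorizer(), MultinomialNB()),
    "LR + TF-IDF": make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000)),
    "SVM + TF-IDF": make_pipeline(TfidfVectorizer(), LinearSVC()),
}

for name, pipe in pipelines.items():
    scores = cross_val_score(pipe, texts, labels, cv=10)
    print(f"{name}: mean={scores.mean():.3f}, std={scores.std():.3f}")
```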
Fig. 13. 10-Fold CV accuracy compared among ML models used with TF-IDF vectorizer
Fig. 14. 10-Fold CV accuracy compared among ML models used with CountVectorizer
Figure 15 shows the result of running the models without accounting for driving factors. The result shows that incorporating aspect and genre driving factors increases accuracy by 3 to 4% on average. Incorporating driving factors resulted in the highest observed sentiment prediction accuracy of 68%, compared to 63% without them. Thus, the hypothesis of this research is reasonably accepted and can be further developed by refining the lexicon base and delving into deep learning models.
Fig. 15. 10-Fold CV accuracy results with no driving factors accounted for. NB with CountVectorizer; LR and SVM with TF-IDF vectorizer
6 Discussion
6.1 Ethics
In section 3, the dataset and its extraction method were described. The ethical scope of this project was to scrape the data from the IMDB website properly. It is acknowledged that the data was scraped without permission from IMDB; however, the authors of this research paper do not seek any monetary benefit or intend to commercialize applications in the future, and thus consider there to be no ethical issues of concern.
There are a few other biases that need to be addressed in this dataset, such as population distribution and nationality. Even though the dataset seems to be limited to movies that were released in the United States, the opinions gathered are international. It is also not clear when and for how long the audience was exposed to the movies, and this might have affected the reviews due to generational differences. Gender was also largely skewed toward male reviewers, as is common throughout this type of media. To decrease this form of bias, the site with the highest female representation was selected for data collection. The dataset was randomly sampled and is therefore treated as non-biased for the purposes of this study.
Furthermore, statistical methods were deployed to transform the dataset into a form more suitable for proper EDA and to ensure the dataset is clean. To achieve the aims and objectives of the research, machine learning models were used to perform the following tasks.
• It was discovered that some of these aspects are more important than others. These aspects matter because they can help identify reviewers' areas of concentration.
There were several challenges that needed to be addressed over the course of the research. As noted in sections 2.6 and 6.1, securing a dataset that is unbiased and representative of the general audience was a concern. A dataset generated by the general audience rather than professional movie critics was targeted, because the public audience is both the main revenue stream and the source of a movie's overall sentiment. During the process of identifying the right dataset for this study, the lack of information on demographics, scoring metrics, and the nationality of reviewers hindered a full understanding of the data. However, a few general assumptions were established to mitigate the bias, as noted in section 2.6.
Another challenge encountered during the study was properly splitting the text into different aspects with the established lexicon base. Part-of-speech (POS) tagging was utilized to filter out the adjectives, verbs, nouns, and their combinations that belong to the lexicon base. Even though supervised learning and a traditional lexicon method (Spacy) were utilized to establish a lexicon base for the input text, it appeared that not every word contributing to the sentiment was listed in the lexicon base. Future work could examine how the models perform with a refined lexicon base that inclusively covers the input text words.
7 Conclusion
The project was conducted to find which movie aspects drive the sentiment of reviews using different driving factors. Four sentiment classification methods were utilized: three supervised machine learning models, Logistic Regression, Naive Bayes, and Support Vector Machine, and one deep learning model, a Recurrent Neural Network with LSTM. Despite the challenges of establishing a proper lexicon list for the aspect-related words, the added feature of aspect and genre driving factors boosted the accuracy of the aspect-based models and thus reasonably improved the predictions of review sentiment. The study results also suggest that these driving-factor-assisted models can deliver insights into which aspects under a given genre drive the most sentiment for any unseen test dataset.
The models presented here provide a framework for various other applications. As discussed earlier, the study results are useful not only in the movie industry but also in other industries whose business is driven by user-generated reviews. Analysis of reviews that incorporates driving factors can increase model accuracy while providing insight into customer motivations and concerns.
Acknowledgments.
References
1. Akter, Shariar; McGarthy, Grace; Sajib, Shariar; Michael, Katina; Dwivedi, Yogesh; Ambra, John;
Shen, K. “Algorithmic bias in data-driven innovation in the age of AI”
International Journal of Information Management Volume 60, October 2021, 102387.
https://www-sciencedirect-com.proxy.libraries.smu.edu/science/article/pii/S0268401221000803
2. Ali, Nehal & Hamid, Marwa & Youssif, Aliaa. (2019). “Sentiment Analysis For Movies
Reviews Dataset Using Deep Learning Models”. International Journal of Data Mining &
Knowledge Management Process. 09. 19-27. 10.5121/ijdkp.2019.9302.
https://www.researchgate.net/publication/333607586_SENTIMENT_ANALYSIS_FOR_MOVIES_REVIEWS_DATASET_USING_DEEP_LEARNING_MODELS
3. Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and
Christopher Potts. (2011). “Learning Word Vectors for Sentiment Analysis.” The 49th
Annual Meeting of the Association for Computational Linguistics (ACL 2011).
http://ai.stanford.edu/~amaas/data/sentiment/
4. Atif Khan, Muhammad Adnan Gul, M. Irfan Uddin, Syed Atif Ali Shah, Shafiq Ahmad,
Muhammad Dzulqarnain Al Firdausi, Mazen Zaindin, (2020) "Summarizing Online
Movie Reviews: A Machine Learning Approach to Big Data Analytics", Scientific
Programming, vol. 2020, Article ID 5812715.
https://doi.org/10.1155/2020/5812715
5. Baid, Palak & Gupta, Apoorva & Chaplot, Neelam. (2017). “Sentiment Analysis of
Movie Reviews Using Machine Learning Techniques”. International Journal of
ComputerApplications. 179. 45-49. 10.5120/ijca2017916005.
https://www.researchgate.net/publication/321843804_Sentiment_Analysis_of_Movie_Reviews_using_Machine_Learning_Techniques
6. Battaglia, James, "Everyone’s a Critic: Film Criticism Through History and Into the
Digital Age" (2010). Senior Honors Theses. 32.
https://digitalcommons.brockport.edu/honors/32
7. Brar, Gurshobit; “Sentiment Analysis of Movie Review Using Supervised Machine
Learning Techniques” International Journal of Applied Engineering Research ISSN
0973-4562 Volume 13, Number 16 (2018) pp. 12788-12791.
https://www.ripublication.com/ijaer18/ijaerv13n16_53.pdf
8. Cambria, E; Schuller, B; Xia, Y; Havasi, C (2013). "New avenues in opinion mining and sentiment
analysis". IEEE Intelligent Systems. 28 (2): 15–21. CiteSeerX 10.1.1.688.1384.
https://doi.org/10.1109%2FMIS.2013.30
9. Coggan, D. (2016, June 23). Male film critics greatly outnumber female critics, study finds.
EW.Com.
https://ew.com/article/2016/06/23/film-criticism-gender-study/
10. Collazo, M. (2014, April 30). How Movie Critics and Moviegoers View Films Differently. The
Artifice.
https://the-artifice.com/movie-critics-and-moviegoers-view-films-differently/
11. Ernoul,Lisa; Wardell,Angela (2016) “Representing the Greater Flamingo in Southern
France: A semantic analysis of newspaper articles showing change over time”. Ocean
and Coastal Management Vol 133 pg 105-113
https://www.sciencedirect.com/science/article/abs/pii/S0964569116302101
12. Fang, X., Zhan, J. (2015) “Sentiment analysis using product review data”, Journal of Big
Data 2,5.
https://journalofbigdata.springeropen.com/articles/10.1186/s40537-015-0015-2
13. Gruhl, D., R. Guha, Ravi Kumar, Jasmine Novak, and Andrew Tomkins. 2005. "The predictive
power of online chatter." KDD '05: Proceedings of the eleventh ACM SIGKDD international
conference on Knowledge discovery in data mining, pp. 78-87, August. doi:
10.1145/1081870.1081883.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.90.8553&rep=rep1&type=pdf
14. Haddi, Emma; Liu, Xiaohui; Shi, Yong. (2013) “The Role of Text Pre-processing in
Sentiment Analysis”, Procedia Computer Science: Vol 17, Page 26-32
https://www.sciencedirect.com/science/article/pii/S1877050913001385
15. Hatzivassiloglou, V; McKeown, K, 1997. "Predicting the Semantic Orientation of Adjectives." 35th
Annual Meeting of the Association for Computational Linguistics and 8th Conference of the
European Chapter of the Association for Computational Linguistics, July, pp. 174-181.
https://www.aclweb.org/anthology/P97-1023/
16. IMDb | Help. (n.d.). IMDB. Retrieved August 29, 2021, from
https://help.imdb.com/article/imdb/track-movies-tv/ratings-faq/G67Y87TFYYP6TWAV?ref_=helpms_helpart_inline#
17. IMDb vs Rotten Tomatoes: The Wisdom of Crowd Goes to The Movies. (2018, November 28).
Wordpress.
https://learncuriously.wordpress.com/2018/11/25/wisdom-of-crowd-goes-to-the-movies/
18. Kang, Hanhoon ; Yoo, Seong Joon ; Han, Dongil (2012) “Senti-lexicon and improved
Naïve Bayes algorithms for sentiment analysis of restaurant reviews”, Elsevier Ltd Expert
systems with applications, Vol.39 (5), p.6000-6010
https://www-sciencedirect-com.proxy.libraries.smu.edu/science/article/pii/S0957417411016538
19. Khaleghi, Ryan; Cannon, Kevin; and Srinivas, Raghuram (2018) "A Comparative
Evaluation of Recommender Systems for Hotel Reviews," SMU Data Science Review:
Vol. 1 : No. 4 , Article 1.
https://scholar.smu.edu/datasciencereview/vol1/iss4/1
20. Kilkenny, K. "How the Internet Led to the Decline of Female Film Critics". The Atlantic.
2015-12-27. Retrieved 2018-06-21
https://www.theatlantic.com/entertainment/archive/2015/12/female-film-critics/421629/
21. Kulkarni, Vivek; Perozzi, Bryan; Al-Rfou, Rami; Skiena,Steven “Statistically Significant
Detection of Linguistic Change”(2020) Github
http://viveksck.github.io/langchangetrack/data/kulkarni.pdf
22. Lakshmi Devi B., Varaswathi Bai V., Ramasubbareddy S., Govinda K. (2020) Sentiment
Analysis on Movie Reviews. In: Venkata Krishna P., Obaidat M. (eds) Emerging Research
in Data Engineering Systems and Computer Communications. Advances in Intelligent
Systems and Computing, vol 1054.
https://doi.org/10.1007/978-981-15-0135-7_31
23. Lee, Sung-Won ; Jiang, Guangbo ; Kong, Hai-Yan ; Liu, Chang(2020), “ A difference of
multimedia consumer's rating and review through sentiment analysis”, Multimedia tools
and applications
https://link-springer-com.proxy.libraries.smu.edu/article/10.1007/s11042-020-08820-x
24. Ligthart, A., Catal, C. & Tekinerdogan, B. (2021) “Systematic reviews in sentiment
analysis: a tertiary study,” Artif Intell Rev.
https://doi.org/10.1007/s10462-021-09973-3
25. Lochmiller, Chase; “A Survey of Techniques for Sentiment Analysis in Movie Reviews
and Deep Stochastic Recurrent Nets”, Department of Computer Science Stanford
University, Reports 2016.
https://cs224d.stanford.edu/reports/chase.pdf
26. Mamtesh, Seema Mehla (National Institute of Technology Kurukshetra, India) “Sentiment
Analysis of Movie Reviews using Machine Learning Classifiers”, International Journal of
Computer Applications (0975–8887) Volume 182 – No. 50, April 2019.
https://www.ijcaonline.org/archives/volume182/number50/mamtesh-2019-ijca-918756.pdf
27. Nguyen, Heidi; Veluchamy, Aravind; Diop, Mamadou; and Iqbal, Rashed (2018)
"Comparative Study of Sentiment Analysis with Product Reviews Using Machine
Learning and Lexicon-Based Approaches," SMU Data Science Review: Vol. 1 : No. 4 ,
Article 7.
https://scholar.smu.edu/datasciencereview/vol1/iss4/7
28. Oswal,Sangeeta; Soni,Ravikumar, Narvekar, Omka (2019) “Named Entity Recognition
and Aspect based Sentiment Analysis”, International Journal of Computer Applications:
Vol 178 - No 46
https://www.ijcaonline.org/archives/volume178/number46/30859-2019919367
29. Ortony, Andrew; Clore, G; Collins, A (1988). The Cognitive Structure of Emotions (PDF).
Cambridge Univ. Press.
http://www.cogsci.northwestern.edu/courses/cg207/readings/Cognitive_Structure_of_Emotions_exerpt.pdf
30. Pang, B., Lee, L. 2004. "A Sentimental Education: Sentiment Analysis Using Subjectivity
Summarization Based on Minimum Cuts." Proceedings of the 42nd Annual Meeting of the
Association for Computational Linguistics (ACL-04), pp. 271-278, July.
https://www.aclweb.org/anthology/P04-1035/
31. Parkje, Viraj; Biswas, Bhaskar; “Aspect Based Sentiment Analysis of Movie Reviews”
(2014) International Conference on Soft Computing & Machine Intelligence
https://ieeexplore.ieee.org/document/7079348
32. Patil,Nita; Patil, Ajay; Pawar, B.V. (2019) “Named Entity Recognition using Conditional
Random fields”, International Conference on Computational Intelligence and Data