Applications of Deep Learning to Sentiment Analysis of Movie Reviews
Houshmand Shirani-Mehr
Department of Management Science & Engineering
Stanford University
[email protected]
Abstract
Sentiment analysis is one of the main challenges in natural language processing. Recently, deep learning applications have shown impressive results across different NLP tasks. In this work, I explore the performance of different deep learning architectures for sentiment analysis of movie reviews, using the Stanford Sentiment Treebank as the main dataset. Recurrent, Recursive, and Convolutional neural networks are implemented on the dataset and the results are compared to a baseline Naive Bayes classifier. Finally, the errors are analyzed and compared. This work can act as a survey on applications of deep learning to sentiment analysis.
1 Introduction
Sentiment analysis, or opinion mining, is the automated extraction of a writer's attitude from text [1], and is one of the major challenges in natural language processing. It has been a major point of focus for the scientific community, with over 7,000 articles written on the subject [2]. As an important part of the user interface, sentiment analysis engines are utilized across multiple social and review aggregation websites. However, the domain of applications for sentiment analysis reaches far beyond that. It provides insight for businesses, giving them immediate feedback on products and measuring the impact of their social marketing strategies [3]. In the same manner, it can be highly applicable in political campaigns, or any other platform that concerns public opinion. It even has applications to stock markets and algorithmic trading engines [4]-[5].
It should be noted that adequate sentiment analysis is not just about understanding the overall sentiment of a document or a single paragraph. For instance, in product reviews the author usually does not limit their view to a single aspect of the product. The most informative and valuable reviews are the ones that discuss different features and provide a comprehensive list of pros and cons. Therefore, it is important to be able to extract sentiments on a very granular level and relate each sentiment to the aspect it corresponds to. At a more advanced level, the analysis can go beyond only positive or negative attitude and identify complex attitude types.
Even on the level of understanding a single sentiment for the whole document, sentiment analysis is
not a straightforward task. Traditional approaches involve building a lexicon of words with positive
and negative polarities, and identifying the attitude of the author by comparing words in the text
with the lexicon [6]. In general, the baseline algorithm [7] consists of tokenization of the text,
feature extraction, and classification using different classifiers such as Naive Bayes, MaxEnt, or
SVM. The features used can be engineered, but mostly involve the polarity of the words according
to the gathered lexicon. Supervised [8] and semi-supervised [9] approaches for building high quality
lexicons have been explored in the literature.
However, traditional approaches fall short in the face of structural and cultural subtleties in written language. For instance, negating a highly positive phrase can completely reverse its sentiment, but unless we can efficiently represent the structure of the sentence in the feature set, we will not be able to
capture this effect. On a more abstract level, it will be quite challenging for a machine to understand sarcasm in a review. The classic approaches to sentiment analysis and natural language processing are heavily based on engineered features, but it is very difficult to hand-craft features that capture the properties mentioned above. And indeed, due to the dynamic nature of language, those features might become obsolete in a short span of time.
Recently, deep learning algorithms have shown impressive performance in natural language processing applications, including sentiment analysis, across multiple datasets [10]. These models do not need to be provided with pre-defined features hand-picked by an engineer; they can learn sophisticated features from the dataset by themselves. Although each single unit in these neural networks is fairly simple, by stacking layers of non-linear units on top of each other, these models are capable of learning highly sophisticated decision boundaries. Words are represented in a high-dimensional vector space, and the feature extraction is left to the neural network [11]. As a result, these models can map words with similar semantic and syntactic properties to nearby locations in their coordinate system, in a way which is reminiscent of understanding the meaning of words. Architectures like Recursive Neural Networks are also capable of efficiently representing the structure of sentences [12]. These characteristics make deep learning models a natural fit for a task like sentiment analysis.
In this work, I explore the performance of different deep learning architectures for sentiment analysis of movie reviews. First, a preliminary investigation of the dataset is done: statistical properties of the data are explored, a Naive Bayes baseline classifier is implemented on the dataset, and the performance of this classifier is studied. Then different deep learning architectures are applied to the dataset, and their performance and errors are analyzed; namely, deep dense networks with no particular structure, Recurrent Neural Networks, Recursive Neural Networks, and Convolutional Neural Networks are investigated. At the end, a novel approach is explored by using bagging and random forests for convolutional neural networks.
Dataset
The dataset used for this work is the Stanford Sentiment Treebank dataset [13], which contains 11,855 sentences extracted from movie reviews. These sentences contain 215,154 unique phrases and have fully labeled parse trees. The sentences are already parsed by the Stanford Parser, and the sentiment of each phrase on the tree is provided. The dataset has five classes for its labels, ranging from strongly negative to strongly positive, and a cross-validation split of 8,544 training examples, 1,101 validation samples, and 2,210 test cases is already provided with the data. Figure 1 shows a sample of this dataset.
2 Preliminary Analysis & Baseline Results
The first step in exploring the performance of different classifiers on a dataset is to identify an effective performance measure. In many cases, especially when the dataset is heavily biased towards one of the label classes, using accuracy is not the best way to measure performance. However, as shown in figure 2, the distribution of sample labels in the Stanford Sentiment Treebank (SST) dataset is not dominated by any single class. Additionally, predicting none of the classes carries a bigger weight compared to the others. The distribution of labels in the validation set shows the same structure. Therefore, accuracy can be used here as an effective measure to compare the results of different classifiers.
Figure 2: Distribution of sentiment labels in the SST dataset.
Although SST provides sentiments for the phrases in the dataset as well, and we are able to train our models using that information, sentiment analysis engines are usually evaluated on the whole sentence as a unit. Therefore, in this work the final performance is measured on sentences, which corresponds to the sentiment at the root of a tree in SST.
To have a baseline result for comparing how well the deep learning models perform, and to get a better understanding of the dataset, a Naive Bayes classifier is implemented on the data. The results of this classifier are shown in table 1. While the training accuracy is high, the test accuracy is around 40%. Figure 3 is a visualization of the confusion matrix of the classifier. The figure shows that the Naive Bayes classifier performs relatively well in separating positive and negative sentiments; however, it is not very successful in modeling the finer separation between "strong" and regular sentiment. Therefore, making the decision boundaries more complex seems like a viable option for improving the performance of the classifier. This option is explored in the following sections.
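For concreteness, the following is a minimal sketch of such a bag-of-words Naive Bayes baseline using scikit-learn; the tokenization, unigram features, and smoothing value are illustrative assumptions rather than the exact setup behind the reported numbers.

```python
# Bag-of-words Naive Bayes baseline (sketch; feature choices and smoothing are illustrative).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score

def naive_bayes_baseline(train_sentences, train_labels, test_sentences, test_labels):
    # train_sentences / test_sentences: lists of raw SST sentences (root level)
    # train_labels / test_labels: their 5-class sentiment labels
    vectorizer = CountVectorizer(lowercase=True)            # unigram count features
    X_train = vectorizer.fit_transform(train_sentences)
    X_test = vectorizer.transform(test_sentences)

    clf = MultinomialNB(alpha=1.0)                           # Laplace smoothing
    clf.fit(X_train, train_labels)
    return accuracy_score(test_labels, clf.predict(X_test))
```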
The simplest model to apply to the sentiment analysis problem in a deep learning setting is to use an average of word vectors trained by a word2vec model. This average can be perceived as a representation of the meaning of a sentence and can be used as the input to a classifier. However, this approach is not very different from the bag-of-words approach used in traditional algorithms, since it only considers single words and ignores the relations between the words in the sentence. Therefore, such a model cannot be expected to perform well. The results in [13] show that this intuition is indeed correct, and the performance of this model is fairly distant from state-of-the-art classifiers. Therefore, I skip this model and start my implementation with more complex ones.
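For reference, such an averaged sentence representation can be computed as in the sketch below; the embedding dimension and the handling of out-of-vocabulary words are assumptions.

```python
# Average-of-word-vectors sentence representation (sketch).
import numpy as np

def sentence_vector(tokens, word_vectors, dim=300):
    """Average the word2vec vectors of the tokens; word_vectors maps word -> np.ndarray."""
    vecs = [word_vectors[w] for w in tokens if w in word_vectors]
    if not vecs:                      # no known words: fall back to a zero vector
        return np.zeros(dim)
    return np.mean(vecs, axis=0)      # fixed-size vector usable by any standard classifier
```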
The next natural choice is to use a deep dense neural network. As the input, the vectors of the words in the sentence are fed into the model. Various options, like averaging the word vectors or padding the sentences, were explored, yet none of them achieved satisfactory results. The models either did not converge or overfit the data with poor performance on the validation set. None of these models achieved accuracy higher than 35%. The intuition for these results is that while these models have many parameters, they do not effectively represent the structure of the sentence and the relations between words. While in theory they can represent very complex decision boundaries, their extracted features do not generalize well to the validation and test sets. This motivates using different classes of neural networks, networks whose architecture can represent the structure of the sentences in a more elegant way.
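As a rough illustration only (the exact architectures and hyperparameters tried are not given in the text, so the layer sizes below are assumptions), a dense network of this kind can be run on averaged sentence vectors with scikit-learn:

```python
# Deep dense (feed-forward) classifier on averaged word vectors (sketch; sizes are illustrative).
from sklearn.neural_network import MLPClassifier

def dense_baseline(X_train, y_train, X_val, y_val):
    # X_* are matrices of sentence vectors, e.g. the averaged word2vec vectors from above.
    model = MLPClassifier(hidden_layer_sizes=(200, 200), activation="relu",
                          early_stopping=True, max_iter=200)
    model.fit(X_train, y_train)
    return model.score(X_val, y_val)   # validation accuracy
```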
4 Recurrent Neural Networks

Figure 5: Recurrent Neural Network: Learning Curve
Figure 6: Recurrent Neural Network: Effect of Word Vector Dimension
Figure 7: Recurrent Neural Network: Effect of Batch Size
Recurrent neural networks are not the most natural fit for representing sentences (Recursive neural networks, for instance, are a better fit for the task); however, it is beneficial to explore how well they perform for classifying sentiments. Figure 4 shows the structure of a vanilla recurrent neural network. The inputs are the successive word vectors from the sentence, and the outputs can be formulated as follows:

$$h^{(t)} = f\left(H h^{(t-1)} + L x^{(t)}\right)$$
$$y^{(t)} = \mathrm{softmax}\left(U h^{(t)}\right)$$

where $f$ is the non-linearity, which is initially the sigmoid function, and $y^{(t)}$ is the predicted probability for each class. One possible direction is to use $y$ at the last word in the sentence as the prediction for the whole sentence, since the effect of all the words has been applied to this prediction. However, this approach did not yield higher than 35% accuracy in my experiments.
Motivated by [14], I added a pooling layer between the softmax layer and the hidden layer, which increases the accuracy to 39.3% on the validation set. The pooling is done on the $h^{(t)}$ values, and mean pooling achieves almost 1% higher accuracy compared to max pooling. As a further improvement, an LSTM unit was used as the non-linearity in the network. With only this change, the performance does not improve, and the model overfits due to the larger number of parameters in the LSTM unit. However, by using Dropout [19] as a better regularization technique, the model is able to achieve 40.2% accuracy on the validation set and 40.3% accuracy on the test set. This accuracy is almost the same as the baseline model.

Figure 5 shows the learning curve for the recurrent neural network model, and figures 6 and 7 show the effect of changing different hyperparameters on the accuracy of the model.
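The exact wiring of the pooling layer is not spelled out in the text, but a plausible sketch of the mean-pooling variant described above, in which the hidden states are averaged before the softmax layer, is:

```python
# Mean pooling over hidden states before the softmax classifier (sketch).
import numpy as np

def predict_with_mean_pooling(hidden_states, U):
    # hidden_states: array of shape (T, k), one row per word position
    h_mean = hidden_states.mean(axis=0)       # average h(t) over the sentence
    scores = U @ h_mean                       # U: (num_classes, k)
    e = np.exp(scores - scores.max())         # numerically stable softmax
    return e / e.sum()                        # sentence-level class probabilities
```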
5 Recursive Neural Networks
Figure 9: Recursive Neural Network: Learning Curve
Figure 10: Recursive Neural Network: Effect of Word Vector Dimension
Figure 8 shows the structure of a recursive neural network. The structure of the network is based on the structure of the parse tree for the sentence. The vanilla model for this network can be formulated as follows:

$$h = f\left(W \begin{bmatrix} h_{\text{Left}} \\ h_{\text{Right}} \end{bmatrix} + b\right)$$
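Read off the composition rule above, a bottom-up pass over a binarized parse tree might look like this sketch (the tree encoding and the tanh non-linearity are assumptions):

```python
# Bottom-up composition over a binarized parse tree (sketch; tree encoding is an assumption).
import numpy as np

def compose(node, W, b, word_vectors):
    """node is either a token string (leaf) or a (left, right) pair of sub-trees."""
    if isinstance(node, str):                       # leaf: look up the word vector
        return word_vectors[node]
    h_left = compose(node[0], W, b, word_vectors)
    h_right = compose(node[1], W, b, word_vectors)
    children = np.concatenate([h_left, h_right])    # [h_Left; h_Right]
    return np.tanh(W @ children + b)                # h = f(W [h_Left; h_Right] + b)
```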
Since this model was already studied in detail in the assignments, and especially since Convolutional Neural Networks achieve higher accuracy, I did not experiment with Recursive neural networks at length. A single-layer model with a tanh non-linearity already performs well, while a two-layer variant overfits and calls for dropout regularization, and Recursive Neural Tensor Networks, with their larger number of parameters, do not converge. The learning curve and some experimentation with the hyperparameters of the model are shown in figures 9 and 10. The accuracy of the model is 42.2% on the test set, which is higher than both the recurrent neural networks and the baseline results.
6 Convolutional Neural Networks

In convolutional neural networks, a filter with a specific window size is run over the sentence, generating different results. These results are summarized using a pooling layer to generate one vector as the output of the filter layer. Different filters can be applied to generate different outputs, and these outputs can be used with a softmax layer to generate prediction probabilities. Figure 11 (from [20]) shows the structure of this network. The model can be described using the following equations:
$$c_i^{(j)} = f\left(W^{(j)} x_{i:i+h-1} + b^{(j)}\right)$$
$$c^{(j)} = \max\left(c_1^{(j)}, c_2^{(j)}, \dots, c_{n-h+1}^{(j)}\right)$$
$$y = \mathrm{softmax}\left(W^{(s)} c + b^{(s)}\right)$$
where $h$ is the length of the filter and $c$ collects the pooled outputs $c^{(j)}$ of the different filters. For this work, I have used the model proposed by Kim [20], which uses Dropout and a constraint on the size of the gradients to help the model converge better.
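A small numpy sketch of these equations, with one scalar feature per filter and max-over-time pooling (the shapes and the tanh non-linearity are illustrative assumptions):

```python
# Convolution with max-over-time pooling, following the equations above (sketch).
import numpy as np

def conv_max_pool(X, W, b):
    """X: (n, d) word vectors; W: (h, d) filter over a window of h words; b: scalar bias."""
    n, h = X.shape[0], W.shape[0]
    # c_i = f(W . x_{i:i+h-1} + b) at every window position i
    c = np.array([np.tanh(np.sum(W * X[i:i + h]) + b) for i in range(n - h + 1)])
    return c.max()                          # max-over-time pooling: one feature per filter

def cnn_predict(X, filters, W_s, b_s):
    """filters: list of (W, b) pairs; W_s, b_s: softmax layer parameters."""
    c = np.array([conv_max_pool(X, W, b) for W, b in filters])   # pooled feature vector
    scores = W_s @ c + b_s
    e = np.exp(scores - scores.max())
    return e / e.sum()                      # class probabilities
```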
Figure 12 shows the learning curve of the Convolutional neural network, and figure 13 shows that 50 is the locally optimal dimension for the word vectors used in the model.
Figure 12: Convolutional Neural Network: Learning Curve
Figure 13: Convolutional Neural Network: Effect of Word Vector Dimension
While we observe a slight improvement over Recurrent neural networks, the results are not significantly better than the baseline classifier. The significant gap between the training error and the test error shows that there is serious overfitting in the model. As a solution, instead of training the word vectors along with the other parameters using the samples, predefined 300-dimensional vectors from the word2vec model (available from https://code.google.com/p/word2vec/) are used and are kept fixed during the training phase. These vectors are trained on a huge dataset of news articles. The resulting model shows a significant improvement in accuracy. Figure 14 shows the learning curve for this model. The model trains very fast (the highest validation accuracy is at epoch 5), and the final accuracy on the test set is 46.4%.
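As an illustrative sketch of how such fixed vectors can be obtained (the loading library and file name below are assumptions; the text only states that pre-trained 300-dimensional word2vec vectors trained on news articles were used):

```python
# Loading pre-trained 300-dimensional word2vec vectors and keeping them fixed (sketch).
import numpy as np
from gensim.models import KeyedVectors

# Assumed file name: the binary archive distributed on the word2vec page.
vectors = KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin", binary=True)

def embed_sentence(tokens, dim=300):
    """Stack the fixed pre-trained vectors for a sentence; unknown words get zero vectors."""
    rows = [vectors[w] if w in vectors else np.zeros(dim) for w in tokens]
    return np.array(rows)   # shape (sentence_length, 300); never updated during training
```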
Table 1 shows the comparison of results for different approaches explored in this work.
Figure 14: Convolutional Neural Network with word vectors fixed from word2vec model: Learning
Curve
Recurrent neural networks are not an efficient model for representing the structural and contextual properties of a sentence, and their performance is close to that of the baseline Naive Bayes algorithm.
Recursive neural networks are built on the structure of the parse tree of a sentence, so they can capture the relations between words in the sentence more adequately. Additionally, they can use the phrase-level sentiment labels provided with the SST dataset for their training. Therefore, we expect Recursive networks to outperform Recurrent networks and the baseline results.
Convolutional neural networks can be seen as a generalized version of recursive neural networks. However, like recurrent neural networks, they have the disadvantage of losing the phrase-level labels as training data. On the other hand, using word vectors from the word2vec model results in a significant improvement in performance. This change can be attributed to the fact that, due to their large number of parameters, neural networks have a high potential for overfitting. Therefore, they require a large amount of data in order to find generalizable decision boundaries. Learning the word vectors along with the other parameters from the sentence-level labels in the SST dataset results in overfitting and degraded performance on the validation set. However, once we use pre-trained word2vec vectors to represent words and do not update them during training, the overfitting decreases and the performance improves.
Figures 15 and 16 show the confusion matrices of the two best models from the experiments. Compared to the confusion matrix for Naive Bayes, we can see that the correct predictions are distributed more evenly across the different classes. The Naive Bayes classifier is not as consistent as the deep learning models in predicting classes on a more granular level. As mentioned before, this is due to the capacity of deep neural networks to learn complex decision boundaries. While it is possible to engineer and add features in such a way that the performance of the Naive Bayes classifier improves, the deep learning model extracts features by itself and gains significantly higher performance.
Table 1: Comparison of results for the different models (accuracy, %).

Model                                       Training   Validation   Test
Naive Bayes                                   78.3        38.0      40.3
Recurrent Neural Network                      56.8        40.2      40.3
Recursive Neural Network                      54.0        38.6      42.2
Convolutional Neural Network                  72.7        41.1      40.5
Convolutional Neural Network + word2vec       88.2        44.1      46.4