
Communicated by Prof. H. Zhang

Accepted Manuscript

LSTM with sentence representations for Document-level Sentiment Classification

Weihang Huang, Guozheng Rao, Zhiyong Feng, Qiong Cong

PII: S0925-2312(18)30479-X
DOI: 10.1016/j.neucom.2018.04.045
Reference: NEUCOM 19512

To appear in: Neurocomputing

Received date: 30 November 2017


Revised date: 6 March 2018
Accepted date: 20 April 2018

Please cite this article as: Weihang Huang, Guozheng Rao, Zhiyong Feng, Qiong Cong, LSTM with
sentence representations for Document-level Sentiment Classification, Neurocomputing (2018), doi:
10.1016/j.neucom.2018.04.045

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service
to our customers we are providing this early version of the manuscript. The manuscript will undergo
copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please
note that during the production process errors may be discovered which could affect the content, and
all legal disclaimers that apply to the journal pertain.

LSTM with sentence representations for Document-level Sentiment Classification

Weihang Huang
School of Computer Science and Technology, Tianjin University, Tianjin, China

Guozheng Rao (a,*), Zhiyong Feng (a,b), Qiong Cong (a)

a Tianjin Key Laboratory of Cognitive Computing and Application, Tianjin, China
b School of Computer Software, Tianjin University, Tianjin, China

Abstract

Recently, owing to their ability to handle sequences of different lengths, neural
networks, and long short-term memory networks in particular, have achieved
great success in sentiment classification. However, one remaining challenge is to
model long texts so as to exploit the semantic relations between sentences in
document-level sentiment classification. Existing neural network models are not
powerful enough to capture sufficient sentiment information across relatively long
time-steps. To address this problem, we propose a new neural network model
(SR-LSTM) with two hidden layers. The first layer learns sentence vectors that
represent the semantics of sentences with a long short-term memory network,
and in the second layer the relations between sentences are encoded in the
document representation. Further, we propose an approach to improve the model
that first cleans the datasets and removes sentences with little emotional polarity,
yielding a better input for our model. The proposed models outperform
state-of-the-art models on three publicly available document-level review datasets.

Keywords: Sentiment Classification, LSTM, Neural networks, Sentence vectors

∗ Corresponding author
Email address: [email protected] (Guozheng Rao)

Preprint submitted to Elsevier May 3, 2018



1. Introduction

Sentiment classification is one of the most widely used natural language
processing techniques in many areas, such as e-commerce websites, stock
forecasting, and political orientation analysis [1]. Document-level sentiment
classification is a fundamental task in sentiment analysis [2]. Recently, neural
network approaches have achieved great success in sentiment classification.
Neural networks were adopted for sentiment classification early on because they
free researchers from handcrafted feature engineering [3]. Among these methods,
Recurrent Neural Networks (RNNs) are one of the most prevalent architectures
because of their ability to handle variable-length texts.

Paragraph-level or sentence-level sentiment analysis expects the model to extract
features from a limited source of information [4], whereas document-level
sentiment analysis demands more in selecting and storing global sentiment
information from long texts containing noise and redundant local patterns.
Simple recurrent neural networks are not powerful enough to handle this
overflow and to capture key sentiment information from relatively distant
time-steps.

Efforts have been made to model long texts, and LSTM has recently become
popular for sentiment classification. The long short-term memory network
(LSTM), proposed by Hochreiter and Schmidhuber in 1997 [5], is a typical
recurrent neural network that alleviates the problems of gradient vanishing and
explosion. LSTM can capture long dependencies in a sequence by introducing a
memory unit and a gate mechanism that decides how to utilize and update the
information kept in the memory cell. A cached long short-term memory model
has been proposed for document-level sentiment analysis [6]. Tang et al. used a
gated recurrent neural network to model documents [7]. These models all exploit
neural networks to learn semantic relations between all words in a document, but
they are not capable of modeling the intrinsic relations between sentences.

Partially inspired by the structure of LSTM and by semantic
compositionality [8], we propose a deep recurrent neural network with two
hidden layers. It first produces continuous sentence vectors from word
representations; these sentence vectors are what we put forward to represent
sentence composition. All the sentence vectors of the whole document are then
the input of the second LSTM layer, whose last output vector represents the
document, taking into consideration semantics at different granularities. These
document representations are used as features to classify the sentiment label of
each document. We conduct document-level sentiment classification on three
large-scale review datasets drawn from IMDB, Yelp 2014, and Yelp 2015. We
compare our model to three classes of models: machine learning methods [9],
recurrent neural network models, and bi-directional neural network models.

Our main contributions are:

• We introduce a neural network model with two hidden layers to learn
continuous document representations. The documents are divided into a certain
number of sentences so that we can handle shorter sequences at each step and
retain as much key sentiment information as possible.

• The proposed model can encode the relations between sentences in the
document representation. It is a neural network model that can capture semantic
information both between words and between sentences.

• Our model outperforms state-of-the-art methods for document-level sentiment
classification on three document-level datasets from IMDB and the Yelp Dataset
Challenge.

2. Related work
In this section, we introduce related work: we first discuss the meaning of
document-level sentiment classification and existing approaches, then methods
for modeling distributed representations, and finally recurrent neural networks
for document-level sentiment classification.


2.1. Document-level Sentiment Classification

Sentiment classification is a relatively new research topic in NLP with great
research and application value. Document-level sentiment classification is a
difficult task within sentiment classification that aims to identify the sentiment
label of a document [10]. Pang and Lee put forward the concept of
document-level sentiment for the first time [11]. The biggest challenge of
document-level sentiment classification is not only to consider the semantics
between words and sentences, but also to consider the overall context so as to
represent the document composition. We cannot simply use the sum of all word
representations to represent the whole document; this is clearly not justified.
Various methods have been investigated and explored over the years, and most
of them depend on traditional machine learning algorithms. Pang first used a
supervised machine learning method, building an SVM classifier that represents
documents with bag-of-words features [2]. Turney used sentiment phrases
extracted from syntactic patterns for document-level sentiment
classification [12]. Goldberg used a graph-based method in a semi-supervised
setting for this task [13]. Many research results show that SVM and Naive Bayes
classifiers perform better than other machine learning methods. Subsequently,
various approaches focused on designing handcrafted features to make machine
learning methods perform well, including word n-grams [14], text topics [15],
and bags of opinions [16], but they rely heavily on effective handcrafted features.

2.2. Methods for modeling distributed representations


Most models for distributed representations fall into three classes in which
real-valued vectors are used to represent meaning: bag-of-words models,
sequence models, and tree-structured models. In bag-of-words models, phrase
and sentence representations are independent of word order; they can be
generated by averaging the constituent word representations [17]. Sequence
models, in contrast, construct sentence representations as an order-sensitive
function of the sequence of tokens [18]. More recently, tree-structured models
compose the representation of each phrase and sentence from its constituent
subphrases according to a given syntactic structure over the sentence [19].

Recurrent neural networks are a natural choice for sequence modeling tasks due
to their capability of processing arbitrary-length sequences. Long short-term
memory networks in particular have emerged as a popular model due to their
effectiveness at capturing long-term dependencies. LSTM networks have been
successfully applied to a variety of fields, including machine translation [20],
speech recognition [21], subtitle understanding [22], and image caption
generation [23].

2.3. Neural networks for Document-level Sentiment Classification

Recently, the use of neural network based methods for sentiment classification
has gradually become popular. They are prevalent due to their ability to learn
discriminative features from data [23], and they can also take the overall
context information into account. With the development of distributed
representations, neural networks have advanced sentiment classification
substantially. It is known that good word embeddings as inputs can improve
neural network models [9]; a simple and effective approach to learning
distributed representations was proposed by Mikolov [24], which introduced the
CBOW and Skip-gram models. GloVe is an unsupervised learning algorithm for
obtaining global context [25]. These distributed representations can greatly
enhance the ability of neural networks. Neural network models include
Recursive Neural Networks [26], Recurrent Neural Networks [27], and LSTM [5].
Among them, RNNs can handle sequences better because they take the context
information of sequences into account. Unfortunately, a problem with RNNs is
that during training, components of the gradient vector can grow or vanish
exponentially over long sequences [28]. Long short-term memory networks,
however, solve this biggest problem of RNNs, the vanishing gradient problem.
Variant models based on LSTM have been proposed to increase its ability, such
as a tree-structured LSTM for better semantic composition [19] and an extra
sparse matrix added to LSTMs to obtain better results [29]. Most of these models
work well on sentence-level and paragraph-level sentiment classification. When
it comes to document-level sentiment classification, the effect is not always
perfect.

Although it is widely accepted that LSTM has longer-lasting memory units than
RNNs, it will still forget information that is too far away from the current point.
This problem is more pronounced when we deal with document-level sentiment
classification. In order for LSTM to store longer-range information, various
LSTM-based models have been proposed. For example, [30] adds an external
memory to LSTM, but its time performance is poor because of the huge external
memory matrix. [31] uses a bidirectional LSTM with an attention model for
document-level sentiment classification, and [6] proposed CLSTM, which defines
a concept of forgetting rates and divides memory into several groups; different
forgetting rates, regarded as filters, are assigned to different groups.

We can conclude that these methods are built on a single-layer LSTM model and
modify the LSTM structure. In contrast, we propose an LSTM model with two
hidden layers. The first layer is trained to acquire sentence vectors. A simple
way to get sentence vectors is to ignore the order of sentences and average word
embeddings, but this fails to capture complex semantic relations between words;
one could also use a CNN to get sentence vectors [7]. We instead use the first
LSTM layer to achieve this, and the second layer uses the sentence vectors to
derive the overall sentiment polarity of the document as the document
composition. This overcomes the fact that a standard LSTM cannot store
information over very long ranges, and, through the two-layer LSTM structure,
also allows us to take more specific semantics into consideration.

3. The Proposed Method


In this section, we review LSTM and some of its variants, and then introduce our
method and an approach to improve it.

3.1. Long Short-term Memory Networks


The long short-term memory network (LSTM) [5] is a typical recurrent neural
network. It modifies the structure of the memory cell in the RNN by transforming
the tanh layer of the RNN into a structure containing a memory unit and a gate
mechanism, which decides how to utilize and update the information kept in the
memory cell. Because of this structure, it alleviates the problems of gradient
vanishing and explosion. Figure 1 shows the structure of a standard LSTM at
time step t. Here i, o, f denote the input gate, the output gate, and the forget
gate, and c is the memory cell. The inputs of the input gate it, the output gate ot,
and the forget gate ft are the vector xt that the network receives at time t and
the previous hidden state ht−1.
Figure 1: The structure of a standard LSTM unit, with input gate i, output gate o,
forget gate f, and memory cell c.

Formally, each LSTM component can be formalized as:

ft = σ(Wf xt + Uf ht−1 + bf )    (1)
it = σ(Wi xt + Ui ht−1 + bi )    (2)
ot = σ(Wo xt + Uo ht−1 + bo )    (3)
ĉt = tanh(Wc xt + Uc ht−1 + bc )    (4)
ct = ft ◦ ct−1 + it ◦ ĉt    (5)
ht = ot ◦ tanh(ct )    (6)

where σ is the logistic sigmoid function and ft, it, ot, ct are the forget gate, the
input gate, the output gate, and the memory cell activation vector at time-step t.
The entries of the gating vectors it, ft, ot are in [0, 1], and ◦ denotes element-wise
multiplication. The biases bf, bi, bo, bc ∈ RH have the same size as ht ∈ RH;
Wf, Wi, Wo, Wc ∈ RH×d and Uf, Ui, Uo, Uc ∈ RH×H, where H is the
dimensionality of the hidden layer and d is the dimensionality of the input.
AN
3.2. Some variants of LSTM

Since LSTM was put forward, it has received wide attention. In recent years,
LSTM has been used more and more widely in the field of natural language
processing, and many variants of LSTM have been proposed.

• PC-LSTM

One of the popular variants was proposed by Gers and Schmidhuber in
2000 [32]. It adds peephole connections to the memory cell so that every gate
can also accept the input information of the cell state. Compared with the
standard LSTM, in the PC-LSTM the forget gate, the input gate, and the output
gate can also access the full information in the memory cell at every time-step.
In this structure, three connections are added that connect the current memory
cell to the forget gate, the input gate, and the output gate.

• CIFG-LSTM

In order to simplify the structure of the LSTM, the CIFG-LSTM couples the input
gate and the forget gate into one uniform gate; coupling them avoids producing
redundant information [33]. A new value is added to the state exactly when
some old information is forgotten. We therefore define it = 1 − ft and use ft to
denote the coupled gate, and Eq. (5) is replaced as follows (a small code sketch
of this coupled gate is given after this list):

ct = ft ◦ ct−1 + (1 − ft ) ◦ ĉt    (7)
• GRU

Another popular variant is the GRU [34], proposed by Cho et al. in 2014. It
combines the forget gate and the input gate into a single update gate. The cell
state and the hidden state are also merged, so the GRU model is simpler than
the standard LSTM model. The GRU has only two gates, the reset gate and the
update gate; r and z jointly control how the new hidden state st is obtained from
the previous hidden state st−1, and the output gate of the LSTM is removed. If
the reset gate is 1 and the update gate is 0, the GRU degrades completely to a
plain RNN.

• Bi-LSTM
A bidirectional LSTM [21] consists of two LSTMs and utilizes additional
backward information, thus enhancing the memory capability. The basic idea of
the bidirectional LSTM is to run two LSTMs over the forward and backward
training sequences, both connected to the same output layer. This structure
provides complete past and future context information for each point in the
input sequence.

We also employ the bidirectional mechanism on SSR-LSTM, which utilizes
additional backward information and thus enhances the memory capability.
ity.

3.3. LSTM with sentence representations (SR-LSTM)


We propose an end-to-end LSTM model with two layers designed to obtain a
document representation for sentiment classification, one which can handle
documents of variable length. It captures a more delicate relationship between
the word embeddings and the sentence representations in a document. Figure 2
shows the structure of LSTM with sentence representations.

Figure 2: The LSTM with sentence representations model for document-level
sentiment classification (the word embeddings of each sentence feed a
word-level LSTM that produces sentence vectors; a sentence-level LSTM over
these vectors produces the document representation, which is fed to a softmax
classifier).

The idea comes from compositionality: a document consists of a list of sentences
and each sentence consists of a list of words; the meaning of a sentence comes
from the meanings of its words and the rules used to combine them, and the
representation of a document comes from the meanings of its sentences. We use
LSTMs with two layers. The input of the first layer is all the word
embeddings [9] in the whole document; in word embeddings, every word is
represented as a low-dimensional vector. All the word vectors are stacked in a
word embedding matrix Lw ∈ Rd×|V|, where |V| is the size of the vocabulary and
d is the dimension of the word embeddings. These word vectors can be
pre-trained from a text corpus with embedding learning algorithms such as
Word2vec [24] or GloVe [25]. In our model, we adopt GloVe to make better use
of the semantic and grammatical associations of words. Our model first produces
continuous sentence vectors from word representations. Because the current
time-step input of an RNN or LSTM contains both the output of the previous
time-step and the input of the current time-step, we can use the output of the
last time-step in each sentence to represent the sentence; this is the sentence
representation we put forward to represent sentence composition. All the
sentence representations are then the input of the second LSTM layer. After
computing the hidden vectors of the second layer, we regard the last hidden
vector as the document representation [35]. Document representations are then
used as features for document-level sentiment classification. We feed the
document representation to a linear layer whose output length is the number of
classes, and add a softmax layer to output the probability of classifying the
document as positive, negative, or neutral. The softmax function is calculated as
follows, where C is the number of sentiment categories:

softmax_i = exp(x_i) / Σ_{i'=1}^{C} exp(x_{i'})    (8)

Our neural network model can handle shorter sequences and reduce the rate at
which information is forgotten. At the same time, it not only considers the
semantic information between words, but also combines the semantic
information between sentences in the document, which encodes the relations
between sentences into the semantic meaning of the document.
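
To make the two-layer structure concrete, the following is a minimal PyTorch
sketch of the architecture described above; it is our own illustration, not the
authors' released code. The zero-padding, the hidden size of 120, and the
optional GloVe initialization follow settings mentioned in this paper, while the
class name, argument names, and vocabulary handling are assumptions of ours.

```python
import torch
import torch.nn as nn

class SRLSTM(nn.Module):
    """Word-level LSTM producing sentence vectors, then a sentence-level LSTM
    producing the document representation (sketch of SR-LSTM)."""

    def __init__(self, vocab_size, embed_dim=300, hidden_dim=120, num_classes=5,
                 pretrained=None):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        if pretrained is not None:                  # e.g. a GloVe matrix, if available
            self.embedding.weight.data.copy_(pretrained)
        self.word_lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.sent_lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, docs):
        # docs: word ids of shape (batch, max_sentences, max_words), 0 = padding
        batch, max_sents, max_words = docs.shape
        words = self.embedding(docs.reshape(batch * max_sents, max_words))
        _, (h_word, _) = self.word_lstm(words)      # last hidden state = sentence vector
        sent_vecs = h_word[-1].reshape(batch, max_sents, -1)
        _, (h_sent, _) = self.sent_lstm(sent_vecs)  # last hidden state = document vector
        return self.classifier(h_sent[-1])          # logits; softmax (Eq. 8) in the loss
```

The returned logits are passed to a cross-entropy loss, which applies the softmax
of Eq. (8).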

3.4. A tool for sentence emotional polarity


Before introducing an approach to improve our model, let us discuss the
sentiment dictionary, which is a tool for estimating sentence emotional polarity.
A sentiment dictionary contains sentiment scores of words. Popular sentiment
dictionaries include GI, NTU, HowNet, and SentiWordNet, a lexical resource for
opinion mining. Sentiment dictionaries are generally constructed by expanding a
seed sentiment dictionary into a large-scale one [36]. Words in a sentiment
dictionary are divided into three categories: emotional words, degree words, and
negation words. Emotional words can be further divided into positive evaluation
words, positive emotion words, negative evaluation words, and negative emotion
words; degree words are divided into several levels, for example with "most"
defined as the highest level and "least" as the lowest. The effect of negation
words is to determine whether to reverse the polarity of words.

Figure 3 shows the data structure of a general sentiment dictionary. We assume
that degree words are divided into three categories; the number of categories of
degree words differs between sentiment dictionaries.

Figure 3: The data structure of a general sentiment dictionary (negation words,
degree words with most/medium/least levels, and emotional words with
positive/negative polarity).

First, we split sentences into words and phrases; we use an NLP pipeline to
achieve this and to tag the part of speech of each word. This method of
sentiment classification has the advantages of simple reasoning and a small
workload. Judging the polarity of a sentence is usually done by comparing the
number of emotional words in it: if there are more positive words than negative
words, the sentence can be considered positive, and vice versa. Another way is
to use the sum of the scores of all words or phrases in a sentence to represent
the emotional polarity of the sentence. In our model, we use the second
approach. Note that, unlike machine learning and neural network methods, this
method does not take the semantic relations between words or sentences into
account, so it is not appropriate for document-level sentiment classification and
cannot achieve good results there, but its performance on paragraph-level or
sentence-level sentiment analysis is still acceptable. In our model (LSTM with
sorted sentence representations), we use this method to obtain the polarity of
sentences in each document.
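
The following toy sketch illustrates the second (score-summing) approach; the
mini-lexicon, negation set, and adverb weights are hypothetical placeholders
standing in for a real dictionary such as SentiWordNet, and the whole helper is
ours, not the authors' implementation.

```python
# Hypothetical mini-lexicon: positive-minus-negative polarity per word.
LEXICON = {"good": 0.6, "great": 0.8, "bad": -0.7, "terrible": -0.9}
NEGATIONS = {"not", "never", "no"}
DEGREE = {"very": 1.5, "slightly": 0.5}   # adverb weights

def sentence_score(tokens):
    """Sum word polarity scores, flipping sign after a negation word and
    scaling by the preceding degree adverb."""
    score, sign, weight = 0.0, 1.0, 1.0
    for tok in tokens:
        t = tok.lower()
        if t in NEGATIONS:
            sign = -sign
        elif t in DEGREE:
            weight = DEGREE[t]
        elif t in LEXICON:
            score += sign * weight * LEXICON[t]
            sign, weight = 1.0, 1.0          # reset after an emotional word
    return score

print(sentence_score("The food was not very good".split()))   # negative score
```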


Figure 4: The structure of SSR-LSTM (a preprocessing step with a sentiment
dictionary selects Maxn sentences when a document has more than Maxn
sentences, or pads up to Maxn otherwise, before the data enters the SR-LSTM
model).
3.5. LSTM with sorted sentence representations (SSR-LSTM)

Theoretically, LSTM can handle sequences of any length and avoid gradient
vanishing. To get a better training result, we specify a maximum sequence
length [37] and truncate the excess; in our model, we define Maxn as the
maximum number of sentences in each document. When a document exceeds
the maximum number of sentences, we cut off the extra sentences to fit the
input. If the document has fewer than the maximum number of sentences, we
pad the remaining places with 0.

When we deal with a document whose number of sentences exceeds the
maximum, we would otherwise truncate one or a few sentences from the back. If
the subjective comments are at the end of the document, we may lose very
important sentiment information, so we propose LSTM with sorted sentence
representations to overcome this. We do not change the order of sentences; we
choose Maxn sentences according to their sentiment polarity in each document.
To judge the sentiment polarity of each sentence, we use a sentiment dictionary.
We use SentiWordNet 3.0 [38] to get the sentiment score of each word in a
sentence and use the NLP pipeline to divide the sentence into phrases. The
sentiment score of a phrase is obtained by averaging the sentiment values of all
synonyms of the same word, and if there is an adverb, the sentiment score of the
phrase is multiplied by a weight; finally, the scores of all the parts in a sentence
are added up as the sentence sentiment score. Positive values represent positive
sentiment, negative values represent negative sentiment, and we compare
absolute values to rank sentences by polarity strength. Figure 4 shows the
structure of SSR-LSTM: before entering the SR-LSTM neural network model,
there is an additional layer that processes the data with the sentiment dictionary.
If the number of sentences contained in a document is more than Maxn, we
select the Maxn sentences with the strongest polarity, and if the number is less
than Maxn, we pad with 0 vectors up to Maxn; then we input the processed data
into the SR-LSTM model.
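
A minimal sketch of this selection-and-padding step, as we read the description
above; `score_fn` is assumed to be a sentence polarity scorer such as the
dictionary-based one of Section 3.4, and each sentence is a list of tokens.

```python
def select_sentences(doc_sentences, maxn, score_fn):
    """Keep at most `maxn` sentences per document for SSR-LSTM input.

    If the document has more than maxn sentences, keep the maxn sentences with
    the strongest absolute polarity, preserving their original order; otherwise
    pad with empty sentences (later encoded as zero vectors).
    """
    if len(doc_sentences) > maxn:
        ranked = sorted(range(len(doc_sentences)),
                        key=lambda i: abs(score_fn(doc_sentences[i])),
                        reverse=True)[:maxn]
        return [doc_sentences[i] for i in sorted(ranked)]   # original order
    return doc_sentences + [[] for _ in range(maxn - len(doc_sentences))]
```
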
The entire model is trained end-to-end with stochastic gradient descent, where
the loss function is the cross-entropy error of supervised sentiment classification.
To avoid overfitting (overfitting means the model over-fits the training data,
including the noise, so that the training cost is minimal while the overall
regularity is ignored and the model cannot perform well on unknown data such
as the test data), we add an L2 regularization term over all parameters. L2
regularization limits the size of the weights so that the model cannot freely fit
the random noise in the training data. Let y be the target distribution for each
document and z be the predicted document distribution. The goal of training is
to minimize the cross-entropy error between y and z for all training documents:

loss = − Σ_i Σ_j y_i^j log z_i^j + (λ/2) ||θ||²    (9)

where i is the index of the document, j is the index of the class, λ is the L2
regularization weight, and θ is the parameter set.
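
A minimal PyTorch sketch of one training step with the objective of Eq. (9), i.e.
cross-entropy plus an L2 penalty; it assumes a model such as the SR-LSTM
sketch of Section 3.3 and a torch optimizer (e.g. Adagrad), and the value of
`l2_lambda` is a placeholder, not a setting reported in this paper.

```python
import torch.nn.functional as F

def training_step(model, docs, labels, optimizer, l2_lambda=1e-5):
    """One optimization step minimizing cross-entropy plus (lambda/2)*||theta||^2."""
    optimizer.zero_grad()
    logits = model(docs)                          # document-level class scores
    loss = F.cross_entropy(logits, labels)        # first term of Eq. (9)
    l2 = sum((p ** 2).sum() for p in model.parameters())
    loss = loss + (l2_lambda / 2) * l2            # second term of Eq. (9)
    loss.backward()
    optimizer.step()
    return loss.item()
```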

4. Experiment

In this section, we study the results of our model on three real-world datasets
for document-level sentiment classification, and we compare the effect of
different word embeddings and different dimensions of word embeddings on our
proposed model.

Dataset     Train Size   Valid Size   Test Size   Ws/Doc   Sens/Doc   Class
Yelp 2014   183019       22745        25399       196.9    11.41      5
Yelp 2015   194360       23652        25341       151.9    8.97       5
IMDB        67426        8381         9112        394.6    16.08      10

Table 1: Statistical information of the IMDB and Yelp 2014/2015 datasets. Train
Size, Valid Size, and Test Size are the sizes of the training, validation, and test
sets. Ws/Doc and Sens/Doc are the average numbers of words and sentences per
document. Class is the number of classes.

Datasets    Hidden layer units   Maxn   Batch size
Yelp 2014   120                  11     64
Yelp 2015   120                  9      64
IMDB        160                  16     128

Table 2: Optimal hyper-parameter configuration for the three datasets.

4.1. Experimental Setting


For document-level sentiment classification, we evaluate our model on three
popular real-world datasets. IMDB is a large movie review dataset, while Yelp
2014 and Yelp 2015 are two restaurant review datasets. All three datasets are
publicly accessible. We use Stanford CoreNLP [39] to tokenize and split
sentences in these datasets. Table 1 shows the statistical information of the
three datasets. We split the three datasets into training, validation, and testing
sets with an 80/10/10 ratio. The training set is used to train the model; to avoid
overfitting, we do not tune parameters directly against the test set. We use the
validation set to determine the hyperparameters of the model and to evaluate its
behavior under different parameters. We use accuracy and MSE to evaluate our
models, where accuracy is a standard metric for overall sentiment classification
performance. MSE (Mean Squared Error) is a convenient way to measure the
average error; smaller MSE values indicate that the predictive model describes
the experimental data more accurately. The MSE is calculated as follows:

MSE = Σ_{j=1}^{N} (standard_j − predicted_j)² / N    (10)
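
For concreteness, a small helper (ours, not from the paper) showing how
accuracy and the MSE of Eq. (10) are computed from predicted and gold class
labels.

```python
import numpy as np

def evaluate(gold, pred):
    """Accuracy and MSE (Eq. 10) over integer class labels."""
    gold, pred = np.asarray(gold), np.asarray(pred)
    acc = float((gold == pred).mean())
    mse = float(((gold - pred) ** 2).mean())
    return acc, mse

print(evaluate([5, 4, 3], [5, 3, 3]))   # (0.666..., 0.333...)
```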

For the configuration of parameters, we select the parameters that give our
models the best performance. We use publicly available 300-dimensional GloVe
vectors [25] as pre-trained word embeddings, and the dimension of the hidden
units is set to 120. We use Adagrad [40] as the optimizer with an initial learning
rate of 0.01. For IMDB, Yelp 2014, and Yelp 2015 we set the batch size to 128,
64, and 64, respectively. The number of hidden layer units is 120 for the three
datasets. Maxn represents the maximum number of sentences per document;
this parameter is selected based on the average number of sentences per
document in each dataset. For example, we set Maxn to 16 for IMDB because
the average number of sentences in each IMDB document is 16.08. Finally,
Maxn is chosen among (16, 11, 10), which gives the best parameters for the
three datasets. The configurations are shown in Table 2.

For model initialization, we randomly initialize all matrices by sampling from a
uniform distribution in [-0.1, 0.1] and update all parameters with stochastic
gradient descent.
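
A brief PyTorch sketch of this initialization and optimizer setup; the stand-in
model is only a placeholder (in practice it would be the SR-LSTM sketch of
Section 3.3), while the [-0.1, 0.1] uniform initialization and the Adagrad learning
rate of 0.01 follow the settings above.

```python
import torch
import torch.nn as nn

# Stand-in model; in practice this would be the SR-LSTM model.
model = nn.LSTM(input_size=300, hidden_size=120, batch_first=True)

# Uniform initialization in [-0.1, 0.1] for all parameters, as described above.
for p in model.parameters():
    nn.init.uniform_(p, -0.1, 0.1)

# Adagrad with the initial learning rate of 0.01.
optimizer = torch.optim.Adagrad(model.parameters(), lr=0.01)
```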

4.2. Baseline models


We compare our methods with the following baseline methods for
document-level sentiment classification. We divide our baseline models into
three categories. In the first class, we exploit machine learning algorithms to
build sentiment classifiers.

• Naive Bayesian

Naive Bayesian is a popular machine learning classification algorithm; we use
bags of words [11] as features.

• SVM

We also use bag-of-words features and train an SVM classifier with
LibLinear [41].

In the second class, we use recurrent neural networks to model long sequences
for document-level sentiment classification.

• RNN

RNN is a basic method for modeling sequential texts [27].

• LSTM

LSTM is a recurrent neural network with memory cells and a three-gate
mechanism (Hochreiter and Schmidhuber, 1997).

• PC-LSTM

Compared with the standard LSTM, PC-LSTM [32] adds peephole connections to
the memory cell so that every gate can also accept the input information of the
cell state.

• CIFG-LSTM

CIFG-LSTM [33] combines the input and forget gates of the LSTM and requires
a smaller number of parameters than LSTM.

• GRU

The GRU combines the forget gate and the input gate into a single update
gate [34]. The cell state and the hidden state are also merged, so the GRU model
is simpler than the standard LSTM model.

• 2-layer LSTM

In 2-layer LSTM architectures, the hidden state of an LSTM unit in the first layer
is used as input to the LSTM unit in the second layer at the same time step [19].
Here, the idea is to let the second layer capture longer-term dependencies of the
input sequence.


Model          IMDB          Yelp 2014     Yelp 2015
               Acc    MSE    Acc    MSE    Acc    MSE
NB             0.353  4.36   0.583  0.83   0.613  0.73
SVM            0.404  3.54   0.608  0.75   0.589  0.81
RNN            0.232  5.82   0.473  1.15   0.479  1.09
LSTM           0.398  3.21   0.610  0.57   0.617  0.55
PC-LSTM        0.402  3.23   0.612  0.55   0.615  0.56
CIFG-LSTM      0.395  3.17   0.607  0.59   0.610  0.57
GRU            0.405  3.15   0.609  0.55   0.611  0.56
2-layer LSTM   0.401  3.18   0.613  0.51   0.625  0.45
CLSTM          0.429  2.67   0.624  0.49   0.627  0.47
SR-LSTM        0.440  2.24   0.632  0.46   0.639  0.46
SSR-LSTM       0.443  2.25   0.639  0.45   0.638  0.44
Bi-LSTM        0.432  2.21   0.625  0.49   0.625  0.48
SSR-BiLSTM     0.463  2.13   0.651  0.41   0.653  0.40

Table 3: Results of our models against baseline models on IMDB, Yelp 2014, and
Yelp 2015. Acc and MSE are evaluation metrics: Acc means accuracy (higher is
better) and MSE means mean squared error (lower is better). Best results in
each group are in gold.
• CLSTM

CLSTM [6] aims at capturing long-range information with a cache mechanism
that divides memory into several groups; different forgetting rates, regarded as
filters, are assigned to different groups.

4.3. Results

We compare classification accuracy and MSE (mean squared error) with other
competitive models. The results are shown in Table 3, from which we have
several findings.

1. First, we compare the two machine learning methods, NB and SVM (Naive
Bayesian and Support Vector Machine). SVM performs better than NB on the
IMDB and Yelp 2014 datasets, while NB is better on the Yelp 2015 dataset; they
reach 0.404 and 0.608 accuracy, respectively. From Table 3, we can see that the
machine learning methods perform almost as well as LSTM, but they need a
large number of features as input. Designing effective features is fundamental
work, and the performance of a machine learning classifier depends heavily on
the choice of data representations and features, whereas neural network models
can automatically learn features from the characteristics of the data; this is the
reason they have become widely used for sentiment classification recently. From
our experiment we can conclude that LSTM and the machine learning classifiers
have almost the same performance, which is good news for us.

2. Among the recurrent neural networks selected as baseline models, RNN has
the worst performance in modeling long texts due to the vanishing gradient
problem. In comparison, LSTM, PC-LSTM, CIFG-LSTM, and GRU perform
better, which shows that an internal memory and the three-gate structure play a
key role in modeling long texts. The four LSTM models have almost the same
performance on the three datasets.

3. The proposed LSTM with sentence representations (SR-LSTM) and LSTM
with sorted sentence representations (SSR-LSTM) have the best performance on
the three datasets and beat CLSTM [6] and the 2-layer LSTM [19]. In particular,
on Yelp 2014 SSR-LSTM achieves 0.639 accuracy, which is 0.015 better than
CLSTM and 0.026 better than the 2-layer LSTM. On the IMDB and Yelp 2014
datasets, SSR-LSTM performs better than SR-LSTM; on the Yelp 2015 dataset,
however, SSR-LSTM has almost the same performance as SR-LSTM, which
suggests that subjective sentences rarely appear at the end of documents in the
Yelp 2015 dataset.

4. With the help of the bidirectional architecture, models can look forward and
backward to capture features when modeling long texts, so Bi-LSTM performs
better than single-directional models. Among the bidirectional models, our
model performs well, achieving 0.463, 0.651, and 0.653 accuracy on IMDB, Yelp
2014, and Yelp 2015, with 2.13, 0.41, and 0.40 MSE.

5. In terms of time complexity and the number of parameters, the two proposed
models (SR-LSTM and SSR-LSTM) and the 2-layer LSTM all have two hidden
layers, but our models require less computation and time than the 2-layer LSTM
while achieving higher accuracy. Compared with a fully connected connection
between the two layers, our models only use the output sentence vector of the
first layer as the input of the second layer, so they have fewer parameters and
lower computational cost.

4.4. The effect of Word Embeddings

We know that the input of neural networks is word embeddings [9]. It is well
accepted that good word embeddings are crucial to composing text
representations at a higher level. We therefore want to know the effect of
different word embeddings on our models. We choose IMDB as our
document-level dataset for sentiment classification. We compare randomly
initialized vectors, the two word2vec models (CBOW and Skip-gram) [42], and
GloVe [25] on three models: LSTM, SR-LSTM, and SSR-LSTM. All the word
vectors are 300-dimensional and learned from Twitter.
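
As a sketch of how such pre-trained vectors are typically loaded into the
embedding matrix Lw described in Section 3.3 (our own helper, not the authors'
code): `path` points to a GloVe text file with one word and its vector per line,
and `vocab` is assumed to map words to row indices; words missing from the
file keep a random initialization.

```python
import numpy as np

def load_glove(path, vocab, dim=300):
    """Build a |V| x dim embedding matrix from a GloVe text file."""
    emb = np.random.uniform(-0.1, 0.1, (len(vocab), dim)).astype("float32")
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, values = parts[0], parts[1:]
            if word in vocab and len(values) == dim:
                emb[vocab[word]] = np.asarray(values, dtype="float32")
    return emb
```
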
Next, we discuss the effect of word embeddings of different dimensions on our
model's performance and time cost.
            Glove.50d   Glove.100d   Glove.200d   Glove.300d
LSTM        0.358       0.377        0.391        0.398
SR-LSTM     0.405       0.422        0.436        0.440
SSR-LSTM    0.403       0.424        0.438        0.443

Table 4: Classification accuracy of LSTM, SR-LSTM, and SSR-LSTM with word
embeddings of different dimensions. We compare Glove.twitter.50d,
Glove.twitter.100d, Glove.twitter.200d, and Glove.twitter.300d.
Randomly initialized vectors mean that the model treats word embeddings the
same as other parameters: we randomly initialize the word embeddings and the
other parameters by sampling from a uniform distribution in [-0.1, 0.1] and
update all parameters with stochastic gradient descent. Distributed
representations, another name for word embeddings, were proposed by Hinton
in 1986, and many methods of learning distributed representations have since
been proposed [43]. CBOW and Skip-gram are the two models contained in
Word2vec [42]; both are trained with neural networks and take the semantic
relations of context information into account. GloVe is an unsupervised learning
algorithm for obtaining global context [25].

From Table 6, we can see that the two word2vec models (CBOW and Skip-gram)
and GloVe perform better than randomly initialized vectors, especially with
SSR-LSTM. This shows the importance of context information for word
embedding learning methods such as Word2vec and GloVe. In addition, GloVe
gives a slight increase in accuracy on all three models, which indicates the
importance of global context for estimating good word representations.

We also compare GloVe vectors of different dimensions (50/100/200/300).
Classification accuracy and time cost are given in Table 4 and Table 5,
respectively. We find that 200-dimensional word vectors perform better than
50-dimensional and 100-dimensional word vectors, while 300-dimensional word
embeddings do not show significant further improvement. Furthermore,
SR-LSTM and SSR-LSTM have similar time costs, and both cost more time than
LSTM because SR-LSTM and SSR-LSTM have more parameters; in return, they
achieve higher classification accuracy.

            50dms   100dms   200dms   300dms
LSTM        7.8     23.2     48.9     113.1
SR-LSTM     12.5    40.6     71.3     132.2
SSR-LSTM    11.4    37.2     72.0     124.2

Table 5: Time cost of each model with 50dms, 100dms, 200dms, and 300dms
GloVe vectors. Each value is the number of minutes it takes to train the model.

            Random   CBOW    Skip-gram   Glove
LSTM        0.343    0.382   0.375       0.398
SR-LSTM     0.398    0.432   0.438       0.440
SSR-LSTM    0.397    0.428   0.439       0.443

Table 6: Classification accuracy of LSTM, SR-LSTM, and SSR-LSTM with
different word embeddings. We compare randomly initialized vectors, CBOW,
Skip-gram, and GloVe vectors.

Figure 5: The classification accuracy (a) and time cost in minutes (b) of
SSR-LSTM on the IMDB dataset, as a function of Maxn (from 8 to 24).
4.5. Effectiveness on Maxn

For our proposed model SSR-LSTM, Maxn (the maximum number of sentences
per document) is a key parameter, so we compare the classification accuracy
and time cost for different selected values of Maxn on the IMDB dataset; the
model we use is SSR-LSTM. The classification accuracy and time cost are shown
in Figure 5. From the trends in the two panels, we find that as the selected Maxn
grows, the classification accuracy gradually improves; from Maxn of 10 to 16 the
increase in accuracy is obvious, but as Maxn approaches 24 the accuracy
increases only slowly, and it even declines between 22 and 24. The reason is
that when Maxn is too large, fewer and fewer documents contain more than
Maxn sentences, so the impact on accuracy becomes smaller and smaller. From
Figure 5, we can also see that the training time grows as Maxn increases, and it
grows faster and faster, because the neural network model needs to train more
parameters as Maxn increases, which lengthens the training time.

So, in our model SSR-LSTM, it is reasonable to set Maxn to the average number
of sentences per document on the three datasets, because this does not consume
too much time and still ensures high classification accuracy.

5. Conclusion

We introduce new neural network models (SR-LSTM and SSR-LSTM) for
document-level sentiment classification. SR-LSTM exploits a two-layer LSTM
model; the approach encodes the semantics of sentences and their relations in
the document representation. Since a document consists of many sentences and
each sentence is a list of words, our model models the document in two steps:
the first layer uses word embeddings to produce sentence vectors, and the
sentence vectors are treated as inputs of the second layer to obtain document
representations. SSR-LSTM is an approach to improve SR-LSTM that first
removes sentences with little emotional polarity from the datasets. Before data
is input to the SR-LSTM model, we clean the three datasets; we do not change
the order of sentences and choose a fixed number of sentences according to
their sentiment polarity in each document. For SSR-LSTM, such a sorted input
builds a better model. Both models are trained end-to-end with supervised
sentiment classification objectives. Empirical results on three datasets (IMDB,
Yelp 2014, Yelp 2015) show that our models outperform state-of-the-art models.

For future work, we want to find a better way to obtain sentence vectors. In our
current model, we simply model the document in a sequential way; one could
instead compose the document representation over discourse tree structures,
such as with a tree-structured LSTM [19]. We plan to pursue this direction.

way, such as tree-structured LSTM[19]. We are going to achieve it.

6. Acknowledgments
We thank Guozheng Rao for his constructive work and the fruitful discussions;
we would also like to thank the anonymous reviewers for their valuable
comments. This work is supported by the National Natural Science Foundation
of China (NSFC) under grants 61373165, 61373035, and 61672377.


References

[1] J. Z. Wang, J. F. Jia, X. Liu, W. D. Chen, Q. K. Xue, Recognizing contextual
polarity: An exploration of features for phrase-level sentiment analysis 35 (3)
(2009) 399–433.

[2] T. B. Pang, B. Pang, L. Lee, Thumbs up? Sentiment classification using
machine learning, Proceedings of EMNLP (2002) 79–86.

[3] P. Liu, X. Qiu, X. Huang, Recurrent neural network for text classification with
multi-task learning, in: International Joint Conference on Artificial Intelligence,
2016, pp. 2873–2879.

[4] B. O'Connor, R. Balasubramanyan, B. R. Routledge, N. A. Smith, From tweets
to polls: Linking text sentiment to public opinion time series, in: International
Conference on Weblogs and Social Media, ICWSM 2010, Washington, DC, USA,
May 2010.

[5] A. Graves, Long short-term memory, Neural Computation 9 (8) (1997) 1735.

[6] J. Xu, D. Chen, X. Qiu, X. Huang, Cached long short-term memory neural
networks for document-level sentiment classification (2016) 1660–1669.

[7] D. Tang, B. Qin, T. Liu, Document modeling with gated recurrent neural
network for sentiment classification, in: Conference on Empirical Methods in
Natural Language Processing, 2015, pp. 1422–1432.

[8] R. Socher, A. Perelygin, J. Y. Wu, J. Chuang, C. D. Manning, A. Y. Ng,
C. Potts, Recursive deep models for semantic compositionality over a sentiment
treebank.

[9] Y. Bengio, H. Schwenk, J. S. Senécal, F. Morin, J. L. Gauvain, Neural
probabilistic language models, Journal of Machine Learning Research 3 (6)
(2003) 1137–1155.

[10] B. Pang, L. Lee, Seeing stars: exploiting class relationships for sentiment
categorization with respect to rating scales, in: Meeting on Association for
Computational Linguistics, 2005, pp. 115–124.

[11] B. Pang, L. Lee, Opinion Mining and Sentiment Analysis, Now Publishers
Inc., 2008.

[12] P. D. Turney, Thumbs up or thumbs down? Semantic orientation applied to
unsupervised classification of reviews, Proceedings of the Annual Meeting of the
Association for Computational Linguistics (2002) 417–424.

[13] A. B. Goldberg, X. Zhu, Seeing stars when there aren't many stars:
graph-based semi-supervised learning for sentiment categorization, in: The
Workshop on Graph Based Methods for Natural Language Processing, 2006,
pp. 45–52.

[14] S. Wang, C. D. Manning, Baselines and bigrams: simple, good sentiment and
topic classification, in: Meeting of the Association for Computational Linguistics:
Short Papers, 2012, pp. 90–94.

[15] R. Xia, C. Zong, Exploring the use of word relation features for sentiment
classification, 2 (2010) 1336–1344.

[16] G. Ifrim, G. Weikum, The bag-of-opinions method for review rating
prediction from sparse text patterns, in: COLING 2010, International Conference
on Computational Linguistics, Proceedings of the Conference, 23-27 August
2010, Beijing, China, 2010, pp. 913–921.

[17] P. W. Foltz, W. Kintsch, T. K. Landauer, The measurement of textual
coherence with latent semantic analysis.

[18] T. A. Mikolov, Statistical language models based on neural networks.

[19] K. S. Tai, R. Socher, C. D. Manning, Improved semantic representations from
tree-structured long short-term memory networks, Computer Science 5 (1)
(2015) 36.

[20] D. Bahdanau, K. Cho, Y. Bengio, Neural machine translation by jointly
learning to align and translate, Computer Science.

[21] A. Graves, N. Jaitly, A. R. Mohamed, Hybrid speech recognition with deep
bidirectional LSTM, in: Automatic Speech Recognition and Understanding, 2014,
pp. 273–278.

[22] H. Zhang, J. Li, Y. Ji, H. Yue, Understanding subtitles by character-level
sequence-to-sequence learning, IEEE Transactions on Industrial Informatics
13 (2) (2017) 616–624.

[23] I. Sutskever, O. Vinyals, Q. V. Le, Sequence to sequence learning with neural
networks 4 (2014) 3104–3112.

[24] T. Mikolov, I. Sutskever, K. Chen, G. Corrado, J. Dean, Distributed
representations of words and phrases and their compositionality, Advances in
Neural Information Processing Systems 26 (2013) 3111–3119.

[25] J. Pennington, R. Socher, C. Manning, GloVe: Global vectors for word
representation, in: Conference on Empirical Methods in Natural Language
Processing, 2014, pp. 1532–1543.

[26] Q. Qian, B. Tian, M. Huang, Y. Liu, X. Zhu, X. Zhu, Learning tag embeddings
and tag-specific composition functions in recursive neural network, in: Meeting
of the Association for Computational Linguistics and the International Joint
Conference on Natural Language Processing, 2015, pp. 1365–1374.

[27] T. Mikolov, M. Karafiát, L. Burget, J. Černocký, S. Khudanpur, Recurrent
neural network based language model, in: INTERSPEECH 2010, Conference of
the International Speech Communication Association, Makuhari, Chiba, Japan,
September 2010, pp. 1045–1048.

[28] Y. Bengio, P. Simard, P. Frasconi, Learning long-term dependencies with
gradient descent is difficult, IEEE Transactions on Neural Networks 5 (2) (2002)
157–166.

[29] P. Bhatia, Y. Ji, J. Eisenstein, Better document-level sentiment analysis from
RST discourse parsing, Computer Science.

[30] T. Ke, A. Bisazza, C. Monz, Recurrent memory networks for language
modeling.

[31] D. Tang, B. Qin, X. Feng, T. Liu, Effective LSTMs for target-dependent
sentiment classification, Computer Science.

[32] F. A. Gers, J. Schmidhuber, Recurrent nets that time and count, in:
IEEE-INNS-ENNS International Joint Conference on Neural Networks, 2000,
pp. 189–194, vol. 3.

[33] J. Chung, C. Gulcehre, K. H. Cho, Y. Bengio, Empirical evaluation of gated
recurrent neural networks on sequence modeling, Eprint Arxiv.

[34] K. Cho, B. V. Merrienboer, C. Gulcehre, D. Bahdanau, F. Bougares,
H. Schwenk, Y. Bengio, Learning phrase representations using RNN
encoder-decoder for statistical machine translation, Computer Science.

[35] D. Tang, B. Qin, T. Liu, Learning semantic representations of users and
products for document level sentiment classification, in: Meeting of the
Association for Computational Linguistics and the International Joint Conference
on Natural Language Processing, 2015, pp. 1014–1023.

[36] S. M. Kim, E. Hovy, Automatic detection of opinion bearing words and
sentences, Proceedings of IJCNLP.

[37] A. Karpathy, J. Johnson, L. Fei-Fei, Visualizing and understanding recurrent
networks.

[38] K. Denecke, Using SentiWordNet for multilingual sentiment analysis, in:
IEEE International Conference on Data Engineering Workshop, 2008,
pp. 507–512.

[39] C. D. Manning, M. Surdeanu, J. Bauer, J. Finkel, S. J. Bethard, D. McClosky,
The Stanford CoreNLP natural language processing toolkit, in: Meeting of the
Association for Computational Linguistics: System Demonstrations, 2014.

[40] J. Duchi, E. Hazan, Y. Singer, Adaptive subgradient methods for online
learning and stochastic optimization, Journal of Machine Learning Research
12 (7) (2011) 257–269.

[41] R. E. Fan, K. W. Chang, C. J. Hsieh, X. R. Wang, C. J. Lin, LIBLINEAR: A
library for large linear classification, Journal of Machine Learning Research 9 (9)
(2008) 1871–1874.

[42] T. Mikolov, K. Chen, G. Corrado, J. Dean, Efficient estimation of word
representations in vector space, Computer Science.

[43] A. Paccanaro, G. E. Hinton, Learning Distributed Representations of
Concepts Using Linear Relational Embedding, IEEE Educational Activities
Department, 2001.

Biography

Weihang Huang, School of Computer Science and Technology, Tianjin University,
Tianjin, China. Corresponding author. Mail: [email protected]

Guozheng Rao, Tianjin Key Laboratory of Cognitive Computing and Application,
Tianjin, China. Mail: [email protected]

Zhiyong Feng, Tianjin Key Laboratory of Cognitive Computing and Application,
Tianjin, China, and School of Computer Software, Tianjin University, Tianjin,
China.

Qiong Cong, Tianjin Key Laboratory of Cognitive Computing and Application,
Tianjin, China.
cation, Tianjin, China *Biography of the aut
M
ED
PT
CE
AC

29

You might also like