
A Deep Neural Network Model for Target-based Sentiment Analysis

Siyuan Chen, Chao Peng, Linsen Cai, Lanying Guo
School of Computer Science and Software Engineering, East China Normal University, Shanghai, China
Chao Peng is also with the Shanghai Key Lab of Trustworthy Computing, East China Normal University.

Abstract—In recent years, with the development of social networks, sentiment analysis has become one of the most important research topics in the field of natural language processing. Deep neural network models that incorporate an attention mechanism have achieved remarkable success in the task of target-based sentiment analysis. In current research, however, the attention mechanism is mostly combined with LSTM networks; such architectures generally rely on complex computation and focus on a single target, so it is difficult for them to effectively distinguish the polarities of different targets in the same sentence. To address this problem, we propose a deep neural network model combining a convolutional neural network and a regional long short-term memory network (CNN-RLSTM) for the task of target-based sentiment analysis. The approach reduces the training time of the neural network model through a regional LSTM. At the same time, the CNN-RLSTM uses a sentence-level CNN to extract sentiment features of the whole sentence and controls the transmission of information through different weight matrices, which allows it to effectively infer the sentiment polarities of different targets in the same sentence. Finally, experimental results on multi-domain datasets in two languages, from SemEval2016 and an automotive corpus, show that our approach yields better performance than SVM and several other neural network models.

Index Terms—Deep learning, Sentiment analysis, Target-based sentiment analysis, Convolutional neural network, Long short-term memory network, Deep neural network model

I. INTRODUCTION

Target-based sentiment analysis is a fundamental task in the field of sentiment analysis [1]. Different from conventional sentiment classification, it requires analyzing the sentiment polarity of different targets in the same sentence, since there may be more than one object in a sentence. For example, there are two targets in the sentence "Good food but dreadful service at that restaurant". The sentiment polarity of the target "food" is positive while the polarity of the target "service" is negative. Different targets in the same sentence may have opposite sentiment polarities.

In traditional machine learning methods, feature engineering is the key task. Different compositions of n-gram features, POS (part-of-speech) features and TF-IDF features have a great impact on a method's accuracy. These machine-learning based methods also tend to predict that different targets in the same sentence have the same sentiment polarity. What is more, the feature engineering relies on human design and does not transfer well to datasets from different fields.

In recent years, deep learning methods have achieved great progress in many fields, and more and more researchers have applied deep neural network models such as the convolutional neural network (CNN) [4] and the long short-term memory (LSTM) network [5] to NLP tasks. In sentiment analysis tasks, deep neural network models are used to represent a sentence as a feature vector, from which a softmax function produces the sentiment classification. However, these methods pay little attention to the fact that different context words can contribute differently to a target's sentiment polarity [8].

Inspired by the success of the attention mechanism in image recognition applications, researchers have tried to utilize it in NLP tasks. The attention mechanism enables models to focus on a target's specific features during training and to explore more potential correlations between words. In recent years, attention-based neural networks have achieved high performance in many target-based NLP tasks, as demonstrated in relation classification [5], modeling sentence pairs [6], machine translation [7] and aspect-level sentiment classification [8].

However, an LSTM receives the sentence as sequential input, which costs much training time. Furthermore, the attention mechanism requires each neuron to do extra computation, which can double the training time. On the other hand, a common CNN takes little time to train, but an attention-based CNN [6] needs to build the attention matrix and analyze sentiment features, which requires much more work in feature engineering.

In this paper we propose a deep neural network model named CNN-RLSTM, which combines a convolutional neural network and a regional long short-term memory network. To overcome the long training time of LSTM, we segment the sentence


according to the given targets, which reduces the length of the input sentence. The intuition behind sentence segmentation is that useful words are usually not far away from the target. Meanwhile, to prevent the loss of polarity information in case of a bad region segmentation, we use a convolutional network to extract the sentiment features of the whole sentence. Experimental results show that CNN-RLSTM improves performance compared with several other models on both English and Chinese datasets.

Our main contributions are summarized as follows:
• We propose a deep neural network model named CNN-RLSTM for target-based sentiment analysis. The model is able to distinguish different targets' sentiment polarities, requires less training time than attention-based LSTM, and needs less feature engineering than attention-based CNN.
• The regional segmentation approach keeps the important features of a given target and reduces the interference of context words that contribute to other targets in the sentence.
• Experimental results on multi-domain datasets in two languages, including the SemEval2016 datasets and the AUTO dataset, show that CNN-RLSTM achieves outstanding accuracy compared with other methods.

The rest of our paper is organized as follows: Section 2 discusses related work, Section 3 describes the deep neural network model combining a convolutional neural network and a regional long short-term memory network, Section 4 presents extensive experiments to justify the effectiveness of our proposals, and Section 5 summarizes this work.

II. RELATED WORK

Target-based sentiment analysis is a more fine-grained classification task and has attracted many researchers' attention [11], [12]. The task needs to concentrate on the correlation between specific context words and targets, since different targets in the same sentence may have opposite sentiment polarities. Kiritchenko et al. use an SVM classifier that combines multiple features for aspect-level sentiment analysis [13]. The method adds unigram, bigram, sentiment-dictionary and other features to exploit multiple sentiment polarities in a sentence, enabling the classifier to identify sentiments from different aspects.

Deep neural network models are also effective in target-based sentiment analysis. Nguyen and Shirai propose a target-based model based on a recursive neural network (RNN) and a dependency tree [14]. The method uses a binary phrase dependency tree, which combines the dependency and constituent trees of a sentence, to extract the correlation between the target and other phrases in the sentence, and thus decreases the work in feature engineering. Dong et al. use an adaptive recursive neural network model (AdaRNN) to solve the target-based sentiment analysis task; their model adaptively propagates the sentiment information in sentences through the connections between context words and target words [15].

The attention mechanism can deal effectively with different targets' sentiments in the same sentence. It was first proposed to solve tasks in image recognition: with an attention mechanism, a neural network model can focus on important information when dealing with the input, since one picture contains thousands of pixels, most of which are not important to the recognition result. Bahdanau et al. first applied the attention mechanism to NLP tasks. They combine an RNN model with the attention mechanism to solve the machine translation problem; with attention, the model learns to align automatically when translating [7]. Yin et al. combine the attention mechanism with the BCNN model to solve sentence-pair tasks. Their model achieves great performance by adding attention matrices in the convolutional and pooling layers to build connections between two sentences [6]. Wang et al. bring the attention mechanism into the LSTM network model to find the words that contribute to the sentiment of the corresponding target. The model achieves good classification accuracy on various datasets [8].

III. DEEP NEURAL NETWORK MODEL CNN-RLSTM

Figure 1 shows the overall framework of CNN-RLSTM. The model mainly contains four parts (a skeletal sketch follows the list):
• Input matrix of the CNN. The matrix is composed of the word vectors of the whole sentence and is fed into the convolutional neural network.
• Input matrix of the regional LSTM. The sentence is segmented into regions with different central targets. Important regions are organized into a regional matrix, which is part of the sentence matrix.
• Convolutional neural network. To simplify the model's structure, we use one convolutional layer and one pooling layer to extract important features of the whole sentence.
• Regional LSTM network. We use a bidirectional LSTM network to extract information in each segment, which is usually highly connected with the corresponding target, and to find the potential correlations between words. Meanwhile, the network accepts the feature vector extracted by the CNN and the target's word embedding as extra inputs, which enables the model to concentrate on the correlation between targets and context.
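To fix the data flow between these four parts, the following PyTorch sketch may help. It is our own minimal reconstruction under stated assumptions, not the authors' released code: the class and parameter names (CNNRLSTM, emb_dim, hidden), the ReLU nonlinearity and the projection sizes are illustrative, and the weighting of the hidden state in the RLSTM input (Section III-C) is absorbed into the standard LSTM recurrence.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CNNRLSTM(nn.Module):
    """Skeletal CNN-RLSTM: sentence-level CNN + target-centered regional LSTM."""
    def __init__(self, vocab_size, emb_dim=300, n_filters=100,
                 windows=(2, 3, 4, 5), hidden=128, n_classes=3):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # Sentence-level CNN: one convolutional layer per window size,
        # max-over-time pooled in forward().
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, n_filters, w) for w in windows])
        feat_dim = n_filters * len(windows)
        # Projections of the target vector and the CNN feature that are fed
        # to the regional LSTM as extra inputs.
        self.w_a = nn.Linear(emb_dim, emb_dim, bias=False)
        self.w_c = nn.Linear(feat_dim, emb_dim, bias=False)
        self.rlstm = nn.LSTM(3 * emb_dim, hidden,
                             batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, n_classes)

    def forward(self, sentence, region, target):
        # sentence: (B, n) word ids of the whole sentence; region: (B, l)
        # ids centered on one target; target: (B,) id of that target word.
        s = self.emb(sentence).transpose(1, 2)            # (B, d, n)
        feats = torch.cat([F.relu(c(s)).max(dim=2).values
                           for c in self.convs], dim=1)   # sentence feature
        r = self.emb(region)                              # (B, l, d)
        extra = torch.cat([self.w_a(self.emb(target)),
                           self.w_c(feats)], dim=1)       # (B, 2d)
        extra = extra.unsqueeze(1).expand(-1, r.size(1), -1)
        h, _ = self.rlstm(torch.cat([r, extra], dim=2))   # (B, l, 2*hidden)
        return self.out(h[:, -1])                         # polarity logits

One such forward pass is run per target, so two targets in the same sentence are classified from two different regions and can receive different polarities.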
A. Task Definition and Notation

Given a sentence s = {w_1, w_2, ..., t_i, ..., t_j, ..., w_n} consisting of n words, two of which are distinct target words t_i and t_j, target-based sentiment classification aims to determine the sentiment polarities of sentence s towards the targets t_i and t_j. For example, in the sentence "Good food but dreadful service at that restaurant", the sentiment polarity towards the target "food" is positive, while the polarity towards the target "service" is negative. Different targets in the same sentence may have completely opposite sentiment polarities. When dealing with input sentences, we map each word w_i to its word embedding vector, denoted x_i, and denote the target embedding vector a_i. In practice, targets may consist of multiple words, like "battery life". For simplicity we consider


Fig. 1. Overall framework of CNN-RLSTM

a target as a single word in the definition. A sentence of length n is represented as a matrix defined by

    x_{1:n} = x_1 ⊕ x_2 ⊕ ... ⊕ a_i ⊕ ... ⊕ a_j ⊕ ... ⊕ x_n    (1)

where ⊕ is the concatenation operator.

B. Convolutional Neural Network

We use a convolutional neural network to extract the features of a whole sentence, following the procedure used in [4]. First we apply a convolution operation to a window of h words by

    c_i = f(w · x_{i:i+h−1} + b)    (2)

where w ∈ R^{h×d} is a convolution filter, b ∈ R is a bias term and f is a non-linear function such as the hyperbolic tangent. For a sentence of length n, we can get a feature map

    c = [c_1, c_2, ..., c_{n−h+1}]    (3)

where c ∈ R^{n−h+1}. We use a max-over-time pooling operation over the feature map and take the maximum value ĉ = max{c} to capture the most important feature. With multiple filters we can get the feature vector ĉ of size m

    ĉ = [ĉ_1, ĉ_2, ..., ĉ_m]    (4)

Then we incorporate ĉ into the input of the regional long short-term memory network in the form of an attention vector.
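As a concrete illustration of Eqs. (2)-(4), the following numpy sketch runs one filter over a sentence matrix and pools the result. The filter values and the sentence matrix are random stand-ins; tanh follows the non-linearity suggested above.

import numpy as np

rng = np.random.default_rng(0)
n, d, h = 20, 300, 3                 # sentence length, embedding dim, window
x = rng.normal(size=(n, d))          # sentence matrix x_{1:n}
w = rng.normal(size=(h, d))          # convolution filter w ∈ R^{h×d}
b = 0.1                              # bias term

# Eqs. (2)-(3): one feature per window position gives the feature map c
c = np.array([np.tanh(np.sum(w * x[i:i + h]) + b) for i in range(n - h + 1)])

# Max-over-time pooling: the single most important feature for this filter.
c_hat = c.max()

# Repeating this for m filters yields the feature vector ĉ = [ĉ_1, ..., ĉ_m]
# of Eq. (4) that is passed to the regional LSTM as an attention vector.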
C. Regional Long Short-term Memory Network

In order to reduce the training time of the LSTM network, we propose a regional segmentation method that reduces the input length. The intuition behind this proposal is that the context words that are important to a target's sentiment polarity generally lie near the corresponding target, so the regional segmentation retains the target's important feature information.

Given a sentence s = {w_1, w_2, ..., t_i, ..., t_j, ..., w_n}, we segment the sentence into regions according to the targets. For example, with two targets t_i and t_j, we treat each target as a center and segment the sentence into two regions r_1 and r_2 of length l, where r_1 = {w_m, ..., t_i, ..., w_{l+m−1}} and r_2 = {w_{m'}, ..., t_j, ..., w_{l+m'−1}}. By reducing the length of the input we decrease the model's training time.
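A pure-Python sketch of this segmentation is shown below: each target becomes the center of a window of l words, clipped at the sentence boundaries. The helper name and the clipping policy are our assumptions; the paper only states that each region has length l and is centered on its target.

def extract_region(words, target_idx, l=11):
    """Return the l-word region centered on the target at target_idx."""
    half = l // 2
    start = max(0, min(target_idx - half, len(words) - l))
    return words[start:start + l]

sentence = "Service was slow , the food was good".split()
region_food = extract_region(sentence, sentence.index("food"), l=5)
# -> [',', 'the', 'food', 'was', 'good']: the nearby words that carry the
#    sentiment of "food", with the other target's context cut away.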


The regional LSTM network takes the regional segmentation as its sequential input; every neuron accepts the previous hidden layer's output and the word vector at the current position as inputs. To make the model focus on the target information, we incorporate the feature vector extracted by the CNN layer and the target word vector into the input of the LSTM network in the form of an attention vector. The framework of the regional long short-term memory (RLSTM) network is given in Figure 2.

Fig. 2. Framework of RLSTM

The input of the RLSTM network is composed of the word vector, the previous output of the hidden layer and the target's attention vector:

    e = W_h · h_i ⊕ W_a · a ⊕ W_c · ĉ    (5)

where W_h ∈ R^l is the weight matrix of the hidden layer output h_i ∈ R^l, l is the dimension of the hidden layer, W_a ∈ R^d is the weight matrix of the target's word vector a, and W_c ∈ R^m is the weight matrix of the feature vector ĉ. The RLSTM network updates these weight matrices to adjust the influence of each vector on the classification results.
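A numpy sketch of Eq. (5) follows. The paper gives W_h ∈ R^l, W_a ∈ R^d and W_c ∈ R^m, which we read here as learned elementwise (diagonal) scalings of each part; a full-matrix projection is an equally plausible reading. All values are random placeholders.

import numpy as np

rng = np.random.default_rng(1)
l_dim, d, m = 128, 300, 400           # hidden, embedding, CNN-feature dims
W_h = rng.normal(size=l_dim)          # weight of hidden output h_i ∈ R^l
W_a = rng.normal(size=d)              # weight of target vector a ∈ R^d
W_c = rng.normal(size=m)              # weight of feature vector ĉ ∈ R^m

h_i = rng.normal(size=l_dim)
a = rng.normal(size=d)
c_hat = rng.normal(size=m)

# e = W_h·h_i ⊕ W_a·a ⊕ W_c·ĉ, with ⊕ the same concatenation as in Eq. (1)
e = np.concatenate([W_h * h_i, W_a * a, W_c * c_hat])
assert e.shape == (l_dim + d + m,)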
D. Model Training

We regard the last hidden vector as the target's context representation and feed it to a linear layer. Then a softmax function is used to predict the sentiment polarity towards the target:

    y = softmax(W_s h_N + b)    (6)

where W_s is a weight matrix, b is a bias term and h_N is the last hidden layer output. The model is trained by backpropagation, where the loss function is the cross-entropy error of the sentiment classification:

    loss = − Σ_{i∈D} Σ_{j∈C} ŷ_i^j log y_i^j + λ‖θ‖²    (7)

where D is the training dataset, C is the set of sentiment categories, y is the predicted sentiment polarity, ŷ is the actual polarity and λ‖θ‖² is the regularization term.
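The following numpy sketch spells out Eqs. (6)-(7) for a single example: a linear layer over the last hidden state, a softmax over the three polarities, and the cross-entropy loss with an L2 penalty. Shapes and the value of λ are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(2)
hidden, n_classes, lam = 128, 3, 1e-4
W_s = rng.normal(size=(n_classes, hidden)) * 0.01   # weight matrix W_s
b = np.zeros(n_classes)                             # bias term
h_N = rng.normal(size=hidden)                       # last hidden output

logits = W_s @ h_N + b
y = np.exp(logits - logits.max())
y /= y.sum()                          # Eq. (6): softmax prediction y

y_true = np.array([1.0, 0.0, 0.0])    # one-hot actual polarity ŷ
loss = -np.sum(y_true * np.log(y)) + lam * np.sum(W_s ** 2)   # Eq. (7)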
IV. EXPERIMENT

We apply the proposed model to three datasets from different fields. The experimental results evaluate the effectiveness of CNN-RLSTM in target-dependent sentiment classification by comparison with several baselines. In our experiments, the English word vectors are initialized with GloVe [18]. For Chinese word embeddings, we crawl users' comments from a car-commerce website¹ and train the word vectors with word2vec [19]. The dimension of all word vectors is set to 300. Other parameters are initialized from the uniform distribution U(−0.01, 0.01). We use the "jieba"² toolkit to segment Chinese sentences. In the Chinese dataset, all target phrases are included in the user dictionary, so they can be looked up in the word embeddings directly. For the English datasets, as mentioned in Section 3.1, targets may consist of multiple words, like "battery life"; we use the average of the constituent word vectors as the target embedding.

¹ https://www.autohome.com.cn
² https://pypi.python.org/pypi/jieba/
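A short sketch of this multi-word target handling is shown below: the embedding of a target like "battery life" is the average of its words' vectors. The lookup dict and the fallback for unknown words are our stand-ins for the GloVe / word2vec tables used in the paper.

import numpy as np

def target_embedding(target, emb, d=300):
    """Average the embeddings of the words that make up a target phrase."""
    vecs = [emb[w] for w in target.split() if w in emb]
    return np.mean(vecs, axis=0) if vecs else np.zeros(d)

emb = {"battery": np.ones(300), "life": np.zeros(300)}   # toy vectors
a = target_embedding("battery life", emb)                # -> all entries 0.5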
A. Experiment Data

We conduct experiments on both English and Chinese datasets. The English datasets are adopted from SemEval 2016³, including REST from the restaurant domain and LAPT from the laptop domain; the Chinese dataset is AUTO, from the car domain, provided by the BDCI competition⁴. The SemEval dataset is the most widely adopted public dataset for target-based tasks. The targets of each sentence are given by the dataset and the sentiment polarity is classified as positive, negative or neutral. Statistics of the datasets are shown in Table 1.

³ http://alt.qcri.org/semeval2016/
⁴ http://www.wid.org.cn/data/science/player/competition/detail/description/237/

TABLE I
STATISTICS OF DATASETS

Dataset      Positive  Negative  Neutral
REST-Train   1647      741       101
REST-Test    601       202       40
LAPT-Train   1631      1076      183
LAPT-Test    480       268       42
AUTO-Train   2686      1089      543
AUTO-Test    854       243       138

We use multiple kernels in the CNN to extract more information from the sentence, with window sizes of 2, 3, 4 and 5, and the count of each kernel set to 100. In the regional LSTM network, the length of the segmentation is set to 11. To prevent overfitting, we use the dropout mechanism and regularization. The parameter details are given in Table 2.

TABLE II
SETUP OF PARAMETERS

Hyperparameter                       Value
Window size of convolution kernel    2, 3, 4, 5
Kernel count                         100
Regularization                       3
Mini-batch                           32
Dropout                              0.5
Regional length                      11

B. Comparison to Other Methods

We compare the proposed CNN-RLSTM model with several baseline methods that achieve good performance in target-based sentiment classification tasks, including:
• SVM: an SVM classifier built with unigram and bigram features; the model achieves the best performance among machine learning methods [12].
• CNN: the model proposed in [4] is the foundation of all convolutional neural networks applied to sentiment classification tasks. It extracts the most important feature and thus fails to focus on the target's information.
• LSTM: a common LSTM network model captures long-term dependencies among context words but ignores the relatedness of the target to its context words [16].
• ATT-CNN: Yin et al. bring the attention mechanism into the CNN model by building an attention feature map, which enables the convolution operation to focus on the target's information [6].
• ATT-LSTM: the attention-based LSTM network model captures important information towards a given target and achieves good performance in aspect-level sentiment classification [8].
• MATT-CNN: the model augments the basic CNN with three types of attention matrix to extract the targets' sentiment information [10].
• ATT-RLSTM: the model is a part of CNN-RLSTM that retains only the RLSTM with the attention mechanism.


• CNN-RLSTM: our proposed model, as described in the previous sections.

We evaluate all eight models on the three datasets REST, LAPT and AUTO. The experimental results of the baseline models and our method are shown in Table 3.

TABLE III
CLASSIFICATION ACCURACY OF DIFFERENT MODELS

Model       REST   LAPT   AUTO
SVM         76.63  66.58  81.62
CNN         66.55  60.25  73.04
LSTM        69.28  62.41  74.82
ATT-CNN     75.56  65.70  80.32
ATT-LSTM    78.29  67.59  83.24
MATT-CNN    78.89  67.22  84.21
ATT-RLSTM   78.05  67.09  83.89
CNN-RLSTM   78.53  67.85  85.51

From Table 3 we can see that CNN-RLSTM performs well in target-based sentiment classification tasks, especially on the AUTO dataset, where the model achieves an accuracy of 85.51%, improving on the MATT-CNN model by 1.30% and on the traditional SVM model by 3.89%. This shows the model to be effective in target-based sentiment analysis.

The plain CNN and LSTM models perform poorly on all three datasets (73.04% and 74.82% on AUTO), since they pay little attention to the targets. By comparison, the attention-based models ATT-CNN and ATT-LSTM achieve accuracies of 80.32% and 83.24% on AUTO, improvements of 7.28% and 8.42% respectively. To clarify the effectiveness of the attention mechanism in target-based sentiment classification, we randomly choose typical sentences containing multiple targets for detailed analysis. The examples are given in Table 4.

From Table 4 we can see that all four models produce the correct result when the targets in sentence 1 have the same polarity. Sentence 2, however, contains two targets with opposite sentiment polarities; both ATT-CNN and ATT-LSTM predict the right result, whereas the basic CNN and LSTM models predict the same polarity for the different targets.

What is more, Table 3 shows that ATT-RLSTM, which lacks the feature vector from the CNN, achieves 83.89% on the AUTO dataset, a 0.65% improvement over the ATT-LSTM model, but 0.32% lower than the MATT-CNN model. The model achieves only 78.05% and 67.09% on the REST and LAPT datasets, both lower than the MATT-CNN and ATT-LSTM models. In contrast, the CNN-RLSTM model performs well on all three datasets. Although its accuracy on the REST dataset is 0.36% lower than the MATT-CNN model, it achieves the best performance on the LAPT and AUTO datasets. On the AUTO dataset in particular, CNN-RLSTM improves on ATT-LSTM, MATT-CNN and ATT-RLSTM by 2.27%, 1.30% and 1.62% respectively.

We conduct another experiment on the three datasets with only four models: ATT-LSTM, MATT-CNN, ATT-RLSTM and CNN-RLSTM. This time we keep only the negative and positive samples, transforming the sentiment analysis task into a two-class classification problem to verify the effectiveness of CNN-RLSTM more clearly. The experimental results are given in Table 5.

From Table 5 we can see that the CNN-RLSTM model achieves the best performance on all datasets, with an accuracy on AUTO as high as 94.35%, improvements of 1.92%, 1.19% and 0.91% over the other three models. We compare the performance of the four models from Table 3 and Table 5 in Figure 3 and Figure 4.

Fig. 3. Classification accuracy of different models
Fig. 4. Binary classification of different models

From Figure 3 and Figure 4 we can see that CNN-RLSTM achieves the best overall performance, especially in binary classification. Comparing CNN-RLSTM with MATT-CNN, we find that CNN-RLSTM performs better than MATT-CNN in all experiments except one on the REST dataset. This is because CNN-RLSTM can learn the long-term dependencies between words through the LSTM network while MATT-CNN ignores such relations, so the CNN-RLSTM model achieves higher accuracy than MATT-CNN when only word embedding vectors are used.

Comparing ATT-RLSTM with ATT-LSTM, we find that ATT-RLSTM performs worse on REST and LAPT, which


TABLE IV
ANALYSIS OF TYPICAL SENTENCES. "+" MEANS POSITIVE AND "−" MEANS NEGATIVE

Sequence  Sentence                               Targets       Sentiment  CNN  LSTM  ATT-CNN  ATT-LSTM
1         I get good food and good service       food/service  +/+        +/+  +/+   +/+      +/+
2         Service was slow, the food was good    food/service  +/−        −/−  +/+   +/−      +/−

TABLE V
BINARY CLASSIFICATION ACCURACY OF DIFFERENT MODELS

Model       REST   LAPT   AUTO
ATT-LSTM    88.92  86.10  92.43
MATT-CNN    89.66  85.56  93.16
ATT-RLSTM   88.67  85.43  93.44
CNN-RLSTM   90.16  86.36  94.35

shows that regional segmentation according to targets can help the model distinguish the sentiment polarities of different targets to some extent, but when the regional segmentation is not appropriate the model may lose important words that contribute to the target's sentiment polarity. Comparing CNN-RLSTM with ATT-RLSTM, we find that CNN-RLSTM performs better than ATT-RLSTM in all experiments, which is not surprising because the features extracted by the CNN layer help the model utilize more of the sentence's sentiment information.

We conduct another two experiments on ATT-RLSTM and CNN-RLSTM to illustrate how regional segmentations of different sizes influence the classification accuracy. The experimental results are given in Figure 5 and Figure 6.

Fig. 5. Classification accuracy with different region lengths
Fig. 6. Binary classification accuracy with different region lengths

From Figure 5 and Figure 6 we can see that CNN-RLSTM performs better than ATT-RLSTM at all sizes of regional segmentation. Both figures show that the accuracy of the ATT-RLSTM model decreases rapidly when the size of the regional segmentation is 5, dropping by almost 10%. That is because the segmentation is so small that words important to the targets are lost. The CNN-RLSTM model, by contrast, decreases by at most 3.46% in these experiments, which shows that CNN-RLSTM can use the features extracted by the CNN to ease the influence of a bad regional segmentation. Both figures also show that the two models achieve their best accuracy when the size of the regional segmentation is 11, and a larger regional segmentation costs the model more training time, so we set the size of the regional segmentation to 11 in the previous experiments.

C. Training Time Analysis

We compare the training time of the proposed models with the baseline methods under the same GPU, CPU and network framework. We record the time the models need to finish one training epoch; the results are given in Table 6.

TABLE VI
RUNTIME OF ONE TRAINING EPOCH ON THE AUTO DATASET

Model       Time (s)
CNN         11
LSTM        126
ATT-CNN     28
ATT-LSTM    386
MATT-CNN    82
ATT-RLSTM   136
CNN-RLSTM   153

From Table 6 we can see that, under the same circumstances, the LSTM network takes 126 seconds to finish one training epoch, more than 11 times longer than the CNN. The ATT-LSTM model needs 386 seconds, the longest of all the methods, because every neuron in the ATT-LSTM network must consider the attention information. The proposed CNN-RLSTM model takes 153 seconds to finish one epoch, far less than ATT-LSTM, which shows that the regional segmentation proposed in Section 3.3 reduces the training time. Moreover, CNN-RLSTM takes only 17 seconds more than ATT-RLSTM. Combined with the result of the previous section that CNN-RLSTM performs better than ATT-RLSTM, this shows that the CNN layer in the CNN-RLSTM model improves the model's performance without significantly increasing training time.


V. CONCLUSION

We propose a deep neural network model combining a convolutional neural network and a regional long short-term memory network (CNN-RLSTM) for the task of target-based sentiment analysis. The experimental results show that our CNN-RLSTM model performs better than existing methods such as SVM, attention-based LSTM and multi-attention based CNN, which corroborates the effectiveness of the model. What is more, the CNN-RLSTM model takes less than half the training time of the original attention-based LSTM network model.

The experimental results also show that the performance of the CNN-RLSTM model suffers when the segmentation is not reasonable; in particular, when the length of the segmentation is short, the classification accuracy drops considerably. We will try to find a more effective method of regional segmentation to improve the model's classification accuracy and stability.

REFERENCES

[1] B. Pang, L. Lee, "Opinion mining and sentiment analysis," Foundations and Trends in Information Retrieval, vol. 2, no. 1-2, pp. 1-135, July 2008.
[2] M. Pontiki, D. Galanis, J. Pavlopoulos, et al., "SemEval-2014 task 4: Aspect based sentiment analysis," Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), pp. 19-30, 2014.
[3] E. Boiy, M. F. Moens, "A machine learning approach to sentiment analysis in multilingual web texts," Information Retrieval, vol. 12, no. 5, pp. 526-558, 2009.
[4] Y. Kim, "Convolutional neural networks for sentence classification," Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Stroudsburg, pp. 1746-1751, 2014.
[5] P. Zhou, W. Shi, J. Tian, et al., "Attention-based bidirectional long short-term memory networks for relation classification," Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, vol. 2, pp. 207-212, 2016.
[6] W. Yin, H. Schütze, B. Xiang, et al., "ABCNN: Attention-based convolutional neural network for modeling sentence pairs," Transactions of the Association for Computational Linguistics, vol. 4, pp. 259-272, 2016.
[7] D. Bahdanau, K. Cho, Y. Bengio, "Neural machine translation by jointly learning to align and translate," arXiv preprint arXiv:1409.0473, 2014.
[8] Y. Wang, M. Huang, L. Zhao, et al., "Attention-based LSTM for aspect-level sentiment classification," Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, pp. 606-615, 2016.
[9] D. Tang, B. Qin, T. Liu, "Aspect level sentiment classification with deep memory network," Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, pp. 214-224, 2016.
[10] B. Liang, Q. Liu, J. Xu, Q. Zhou, P. Zhang, "Aspect-based sentiment analysis based on multi-attention CNN," Journal of Computer Research and Development, vol. 54, no. 8, pp. 1724-1735, 2017.
[11] M. Hu, B. Liu, "Mining and summarizing customer reviews," Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York, pp. 168-177, 2004.
[12] M. Hu, B. Liu, "Mining opinion features in customer reviews," Proceedings of AAAI 2004. AAAI Press, Menlo Park, vol. 4, no. 4, pp. 755-760, 2004.
[13] S. Kiritchenko, X. Zhu, C. Cherry, et al., "NRC-Canada-2014: Detecting aspects and sentiment in customer reviews," Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014). Stroudsburg, pp. 437-442, 2014.
[14] T. H. Nguyen, K. Shirai, "PhraseRNN: Phrase recursive neural network for aspect-based sentiment analysis," Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, pp. 2509-2514, 2015.
[15] L. Dong, F. Wei, C. Tan, et al., "Adaptive recursive neural network for target-dependent Twitter sentiment classification," Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. Stroudsburg, pp. 49-54, 2014.
[16] X. Wang, Y. Liu, C. Sun, et al., "Predicting polarities of tweets by composing word embeddings with long short-term memory," Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Stroudsburg, vol. 1, pp. 1343-1353, 2015.
[17] R. Collobert, J. Weston, L. Bottou, et al., "Natural language processing (almost) from scratch," Journal of Machine Learning Research, vol. 12, pp. 2493-2537, August 2011.
[18] J. Pennington, R. Socher, C. Manning, "GloVe: Global vectors for word representation," Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, vol. 14, pp. 1532-1543, 2014.
[19] T. Mikolov, K. Chen, G. Corrado, et al., "Efficient estimation of word representations in vector space," Proceedings of Workshop at ICLR, 2013.
