Cricket Sentiment Analysis From Bangla Text Using Recurrent Neural Network With Long Short Term Memory Model
International Conference on Bangla Speech and Language Processing (ICBSLP), 27-28 September 2019
Abstract— Nowadays, people express their feelings, thoughts, suggestions and opinions on different social platforms and video sharing media. Many discussions take place on Twitter, Facebook and various forums about sports, especially cricket and football. An opinion may express criticism in different manners and notations and may carry a different polarity, such as positive, negative or neutral, so understanding the sentiment of each opinion is a challenging and time-consuming task even for humans. This problem can be solved by analyzing the sentiment of the respective comments through natural language processing (NLP). Following its success in many deep learning domains, the Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM) is widely used in NLP tasks such as sentiment analysis. We have prepared a dataset of real people's cricket comments in Bangla text, labelled in three categories (positive, negative and neutral), and processed it by removing unnecessary words. We then used a word embedding method to vectorize each word and an LSTM to capture long-term dependencies. This approach achieves an accuracy of 95%, which exceeds the accuracy of all previous methods.

Keywords— sentiment analysis, natural language processing, deep learning, word embedding, RNN, LSTM.

I. INTRODUCTION

In the present era, people across the globe express their opinions or feelings through social media and the web on different entities such as events, products, social issues, organizations, etc. Hence, at every instant a massive amount of text data is generated on various entities over the Internet. By analyzing these data, business organizations can understand the sentiment of people about their products and find new opportunities, governments can understand people's perception of elections and manage their reputation, event organizers can understand people's expectations of public events, and so on. Thus, there is a strong need to summarize the unstructured data created by people on social media and extract relevant insights in order to understand people's thoughts. Therefore, sentiment analysis, which performs contextual mining of text data that conveys the emotions, sentiments or opinions of an individual, has become a major point of focus in the field of NLP.

In recent times, cricket has gained immense popularity in Bangladesh, so people have diverse emotions about this sport. They most often express their opinions and emotions regarding this sport through social networks in the Bangla language. By processing these reviews, it is possible to understand the sentiment of people about cricket. However, very few attempts have been made at sentiment analysis from Bangla text because of the unavailability of well-structured resources for Bangla language processing. Hence, cricket sentiment analysis on Bangla text drawn from real people's opinions has become an exciting field for us. Nowadays, deep learning techniques are widely used to analyze the sentiment of text and have proven to be an effective tool in terms of accuracy, because they consider the past and future words relative to a target word for text classification. Thus, we were motivated to classify cricket sentiment from Bangla text using a deep learning technique.

In this research, a Recurrent Neural Network with LSTM model is proposed to identify cricket sentiment from Bangla texts. We collected real people's sentiments about cricket from different social media and news portals and categorized them as positive, negative or neutral. Then, each word was vectorized by a word embedding method and an LSTM was used to capture long-term dependencies. Finally, an accuracy of 95% was attained in cricket sentiment analysis using the proposed model.

II. RELATED WORK

Sentiment analysis from Bangla text has become a major point of focus in NLP for researchers with the increasing use of social media. Hasan et al. [1] proposed a model utilizing LSTM with binary cross-entropy and categorical cross-entropy loss functions for sentiment analysis of Bangla and Romanized Bangla Text (BRBT). Sharfuddin et al. [2] developed an approach combining a deep RNN with bidirectional LSTM to classify the sentiment of Bengali text, which achieved 85.67% accuracy on a dataset containing 10,000 Facebook status comments. Baktha et al. [3] explored RNN architectures on three datasets and obtained the best accuracy from Gated Recurrent Units (GRU) for sentiment analysis. Tripto et al. [4] suggested deep learning based models for detecting multilabel sentiment and emotions from Bengali YouTube comments. They used Convolutional Neural Network (CNN), LSTM, Support Vector Machine (SVM) and Naïve Bayes (NB) architectures to identify three-label (positive, negative, neutral) and five-label (strongly positive, positive, neutral, negative, strongly negative) sentiment as well as emotions, with SVM and NB as their baseline methods. Term Frequency Inverse Document Frequency (TF-IDF) with n-gram tokens was used to extract a set of features from each sentence. They obtained 65.97% and 54.24% accuracy for three and five labels
after 15 epochs with batch size 30. Finally, the performance
[Fig. 2 diagram labels: inputs X1 ... X300; hidden units h1 ... h100; units D1 ... D100; output layer with σ units producing yt]
Fig. 2. LSTM network for cricket comments sentiment classification
Fig. 3. Optimal accuracy of proposed model
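As an illustration of the kind of network named in the Fig. 2 caption, the following is a minimal pure-Python sketch of an LSTM forward pass followed by a three-class output layer. It is not the authors' implementation. The class and function names (LSTMCell, classify, rand_matrix) are hypothetical, the weights are random rather than trained, the softmax output layer is an assumption, and the sizes are kept tiny so the example runs quickly; the figure labels suggest 300 input values and 100 hidden units in the actual model.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these instead."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class LSTMCell:
    """A single LSTM layer with hypothetical, randomly initialised weights."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        width = input_size + hidden_size
        # One weight matrix and bias per gate: input (i), forget (f),
        # output (o) and candidate (g). Each acts on [x_t, h_prev].
        self.W = {k: rand_matrix(hidden_size, width, rng) for k in "ifog"}
        self.b = {k: [0.0] * hidden_size for k in "ifog"}

    def step(self, x, h_prev, c_prev):
        z = x + h_prev  # concatenate current input and previous hidden state
        pre = {
            k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])]
            for k in "ifog"
        }
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        # The cell state carries long-term information across time steps.
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h, c

    def run(self, sequence):
        """Process a list of word vectors and return the final hidden state."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in sequence:
            h, c = self.step(x, h, c)
        return h


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(sequence, cell, W_out, labels=("positive", "negative", "neutral")):
    """Map the final hidden state to class probabilities (softmax assumed)."""
    probs = softmax(matvec(W_out, cell.run(sequence)))
    best = max(range(len(labels)), key=lambda k: probs[k])
    return labels[best], probs


# Usage: a "comment" of 5 words, each a 4-dimensional stand-in embedding.
rng = random.Random(1)
cell = LSTMCell(input_size=4, hidden_size=3)
W_out = rand_matrix(3, 3, rng)
comment = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
label, probs = classify(comment, cell, W_out)
print(label, [round(p, 3) for p in probs])
```

Because the weights are untrained, the predicted label here is arbitrary; the sketch only shows how word vectors flow through the gates and the cell state to a three-way prediction.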
VI. CONCLUSIONS
In this paper we present an approach to analyze the sentiment of cricket comments in Bangla text. The model is based on a deep learning variant, the RNN. To retain the recurrent property and the contextual meaning of a sentence we used LSTM, which makes the model effective and yields a prediction accuracy of about 95%. Spell-checking and stemming are not included in the preprocessing of our collected dataset. In future, we will include more preprocessing steps along with these two in order to improve the structure of our dataset. We also plan to increase the number of target classes to build a more accurate NLP model within this problem domain.
Ref No.          Dataset        Model  Prediction Accuracy
Proposed method  ABSA_EXTENDED  LSTM   95%