CNN LSTM Hybrid Approach For Sentiment Analysis
https://doi.org/10.22214/ijraset.2023.52191
International Journal for Research in Applied Science & Engineering Technology (IJRASET)
ISSN: 2321-9653; IC Value: 45.98; SJ Impact Factor: 7.538
Volume 11 Issue V May 2023- Available at www.ijraset.com
Abstract: In recent years, sentiment analysis has been one of the most popular research topics. It is employed to ascertain the actual intent behind a text and is primarily concerned with the processing and analysis of natural language data. The development of technology and the phenomenal rise of social media have produced a vast volume of unstructured textual information, and it is critical to examine the sentiments that underlie such texts. Sentiment analysis reveals the subjective opinions held in enormous volumes of text. The primary objective is to make the computer comprehend the context of the data so that it can be classified as positive or negative. In this study, (i) several machine learning models, including Naive Bayes, XGBoost, Random Forest, and LightGBM, are trained; (ii) the deep learning model Bi-LSTM, whose accuracy has shown promise, is implemented; and (iii) Bidirectional Encoder Representations from Transformers (BERT), a pre-trained language model combined with an external Bi-LSTM model, is implemented. Finally, a new CNN-LSTM hybrid approach is applied to the IMDb dataset and performs better than all the other models.
Keywords: Sentiment analysis, natural language processing, machine learning, deep learning, BERT, CNN-LSTM.
I. INTRODUCTION
Nowadays, individuals like to make decisions based on recommendations to save time, whether they are purchasing a product or watching a movie. Understanding customer behavior is crucial for successful marketing, so companies allow customers to leave reviews in order to better understand the decisions their customers make. Managing such a massive volume of data, however, is a difficult process. Sentiment analysis is a practical approach for resolving the question of whether a product achieves its goal or not. Besides the benefits that consumers gain from this user-generated material, a large number of business sectors are also effectively utilizing this developing technology and employing sentiment analysis to examine the preferences of their clients. This is why it is crucial to understand the motivation underlying any text. Figure 1 illustrates three different approaches to sentiment analysis: machine learning methods, and deep learning methods such as Bidirectional LSTM and BERT.
A. Challenges
In sentiment analysis there are some considerable challenges that must be addressed to obtain the best results.
The majority of the available data is written in English, while other languages are severely underrepresented, which makes analyzing and training on such data troublesome. Relying on previously stored data can also be a problem, since customer opinions may change over time.
Traditional machine learning algorithms do not perform up to the mark because of the way they handle larger datasets; their performance on large datasets is lower than that of deep learning models.
2) BERT Approach
BERT, which stands for Bidirectional Encoder Representations from Transformers, is an advanced deep learning technique used in
the field of natural language processing (NLP). Created by Google AI Language, BERT is a model based on neural networks that
utilizes the Transformer architecture to understand the contextual relationships among words in a given text dataset. Unlike
conventional NLP models that process text in a unilateral, sequential manner, BERT is a bidirectional model that accounts for the
entire input sentence or paragraph to generate context-sensitive word embeddings. BERT undergoes pre-training using massive
volumes of textual data, followed by fine-tuning on particular natural language processing (NLP) assignments like text
categorization, answering questions, and identifying named entities. BERT has attained impressive results on a diverse range of
NLP benchmarks and has emerged as a prevalent choice for various NLP applications.
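As an illustration, the following is a minimal sketch of fine-tuning a pre-trained BERT model for binary sentiment classification with the Hugging Face Transformers library. The model name, learning rate, and the single training step shown are illustrative assumptions; the external Bi-LSTM head used in this work is not reproduced here.

```python
# Minimal sketch: fine-tuning a pre-trained BERT encoder for binary
# sentiment classification with Hugging Face Transformers. Model name
# and hyperparameters are illustrative assumptions, not the exact
# configuration used in this paper.
import torch
from transformers import BertTokenizerFast, BertForSequenceClassification

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # 2 classes: negative / positive

texts = ["A wonderful, moving film.", "Dull plot and wooden acting."]
labels = torch.tensor([1, 0])

# Tokenize into context-aware sub-word ids with padding and truncation.
batch = tokenizer(texts, padding=True, truncation=True,
                  max_length=128, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
outputs = model(**batch, labels=labels)   # forward pass with cross-entropy loss
outputs.loss.backward()                   # one illustrative training step
optimizer.step()
```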
4) CNN-LSTM Model
This hybrid model is trained on the IMDb dataset, which contains English and French language text, because of its large size. More data provides more information to the model and therefore yields a more generalized model.
III. METHODOLOGY
A. Dataset and Preprocessing
The datasets used were the Amazon Reviews dataset and the IMDb Reviews dataset; both were created by combining data from different sources. The Amazon Reviews data consists of a total of 10,000 English and German language reviews with three feature columns: text, sentiment, and title. In this research, the text column, which contains the full review of the product, is used.
The IMDb dataset contains a total of 75,000 English and French language user reviews of various movies and consists of two columns: review and sentiment. For both datasets, the task is a binary classification problem with two classes, positive and negative, encoded as 1 and 0 respectively.
Although reviews of a product or movie originally range from 1 to 5 stars, the metadata provided with both datasets states that the reviews have already been compiled into the two classes, positive and negative.
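As a small illustration, the following sketch shows how such class labels could be encoded as 0/1 targets with pandas, assuming the column names described above; the CSV file names are hypothetical.

```python
# Hedged sketch: encoding the sentiment column as 0/1 labels with pandas.
# File names are hypothetical; column names follow the dataset description above.
import pandas as pd

imdb = pd.read_csv("imdb_reviews.csv")       # columns: review, sentiment
amazon = pd.read_csv("amazon_reviews.csv")   # columns: text, sentiment, title

# Map the textual class labels to the binary targets used for training.
imdb["label"] = imdb["sentiment"].map({"positive": 1, "negative": 0})
amazon["label"] = amazon["sentiment"].map({"positive": 1, "negative": 0})

print(imdb["label"].value_counts())
```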
1) Data Cleaning
The initial step in training a model involves data cleaning, which aims to eliminate redundant words and phrases from texts. The
objective of this process is to enhance the machine learning model's performance by removing unnecessary elements from the data.
For example, text in the raw data: “#5 star is My review regarding the movie Titanic! which I watched @ hall/cinema.</p>”
The following items are eliminated at this stage:
● Punctuation: redundant punctuation is removed. After this step: “5 star is My review regarding the movie Titanic which I watched hall cinema </p>”
● HTML tags and emojis: HTML tags and emojis are removed from the text, and it is converted to lower case. After this step: “5 star is my review regarding the movie titanic which i watched hall cinema”
After performing the above steps, pre-processing is carried out on the cleaned text data; a minimal cleaning sketch is shown below.
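The exact cleaning rules used in this work are not specified, so the regular expressions below are illustrative assumptions that reproduce the example above.

```python
# Hedged sketch of the cleaning step: strip HTML tags, emojis and
# punctuation, then lower-case the text. The patterns are assumptions.
import re

def clean_text(text: str) -> str:
    text = re.sub(r"<[^>]+>", " ", text)             # drop HTML tags such as </p>
    text = text.encode("ascii", "ignore").decode()   # drop emojis / non-ASCII symbols
    text = re.sub(r"[^\w\s]", " ", text)             # drop punctuation
    text = re.sub(r"\s+", " ", text).strip()         # collapse extra whitespace
    return text.lower()

raw = "#5 star is My review regarding the movie Titanic! which I watched @ hall/cinema.</p>"
print(clean_text(raw))
# -> "5 star is my review regarding the movie titanic which i watched hall cinema"
```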
2) Text Pre-processing
The text pre-processing step is also a crucial step in natural language processing, since textual data is not directly recognized by a machine learning model and must be transformed into numerical data. Some preprocessing steps are:
● Lemmatization: A lemmatizer from the nltk.stem.wordnet library is used to remove tenses from sentences, and, as the example shows, stop words such as “is” and “the” are also dropped. It is used instead of stemming when the dataset size is large. After this step: “5 star review regarding movie titanic watch hall cinema.”
● TF-IDF Vectorizer: TF-IDF (Term Frequency–Inverse Document Frequency) is a mathematical technique in natural language processing that is applied to the cleaned text columns, after splitting the data into training and testing sets, to tokenize the text and generate word-frequency scores. The TF-IDF vectorizer takes an array of documents as input and assigns each unique word a weight scaled by its importance across all documents or sentences in the corpus. The output is an array containing a value for each word relative to all the words present in the corpus or document. A short sketch of both steps is given after this list.
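A minimal sketch of these two pre-processing steps, assuming NLTK's WordNet lemmatizer and scikit-learn's TfidfVectorizer; the split ratio and example documents are illustrative.

```python
# Hedged sketch of the pre-processing described above: WordNet
# lemmatization followed by TF-IDF vectorization on the training split.
import nltk
from nltk.stem.wordnet import WordNetLemmatizer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

nltk.download("wordnet")  # one-time download of the WordNet corpus

lemmatizer = WordNetLemmatizer()

def lemmatize(text: str) -> str:
    # Lemmatize each token as a verb so tense is stripped ("watched" -> "watch").
    return " ".join(lemmatizer.lemmatize(tok, pos="v") for tok in text.split())

docs = ["5 star review regarding movie titanic watched hall cinema",
        "terrible movie wasted evening"]
labels = [1, 0]
docs = [lemmatize(d) for d in docs]

X_train, X_test, y_train, y_test = train_test_split(docs, labels, test_size=0.5)

# Fit TF-IDF only on the training split, then transform both splits.
vectorizer = TfidfVectorizer(stop_words="english")
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)
print(X_train_vec.shape)
```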
1) CNN
The convolutional layer's primary job is to extract meaningful features from the text data. This is accomplished by performing convolution operations on the word vectors generated by the embedding layer. The Rectified Linear Unit (ReLU) is used as the nonlinear activation function. It is defined as f(x) = max(0, x): ReLU returns x if the value is positive; otherwise, it returns 0.
2) MaxPooling
The convolution operation generates feature maps that form a high-level vector representation. A max-pooling layer is placed after the CNN layer to aid in the selection of meaningful information by discarding weak activations. This helps avoid overfitting caused by outlier text.
3) LSTM
Long Short-Term Memory (LSTM) is a particular type of RNN designed to integrate current and past information. It consists of a memory block and three gates that govern the flow of data through the LSTM module at a given time step. These gates control how the current memory cell and the current hidden state are updated. A sketch of the complete hybrid architecture is given below.
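The following is a minimal Keras sketch of such a CNN-LSTM hybrid for binary sentiment classification; the vocabulary size, sequence length, and layer widths are illustrative assumptions rather than the exact hyperparameters used in this work.

```python
# Hedged sketch of a CNN-LSTM hybrid for binary sentiment classification.
# Layer sizes are illustrative assumptions.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Input, Embedding, Conv1D, MaxPooling1D,
                                     LSTM, Dense)

VOCAB_SIZE, MAX_LEN = 20000, 200

model = Sequential([
    Input(shape=(MAX_LEN,)),                       # padded sequences of word ids
    Embedding(VOCAB_SIZE, 128),                    # word vectors
    Conv1D(64, kernel_size=5, activation="relu"),  # local n-gram features (ReLU)
    MaxPooling1D(pool_size=2),                     # keep the strongest activations
    LSTM(64),                                      # sequential context
    Dense(1, activation="sigmoid"),                # positive / negative
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
```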
F. Evaluation Metrics
Once the model is built, its efficacy is evaluated using several performance indicators such as accuracy, precision score, ROC, and AUC score. Evaluation metrics are used to assess any statistical or machine learning model, and such an assessment is necessary for any undertaking. A model can be evaluated using a variety of measures. In this research, models are compared on the basis of accuracy and ROC score, since a single metric alone cannot determine the optimal strategy. The measures used are described below:
● Accuracy Score: the proportion of correct predictions out of all predictions, i.e., Accuracy = (TP + TN) / (TP + TN + FP + FN).
● Precision Score: the proportion of predicted positive-class instances that are actually positive, i.e., Precision = TP / (TP + FP).
● Receiver Operating Characteristic (ROC) Score: the area under the ROC curve, which measures how well a classification system can distinguish between the distinct classes. Its value ranges from 0 to 1, with 1 denoting optimal performance and 0 signifying the lowest possible effectiveness.
● Early Stopping: a technique used to avoid overfitting. By specifying a patience value, the model continues to train only until the validation loss stops decreasing. A short sketch of these metrics and the early-stopping callback is given after this list.
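The sketch below shows how these metrics could be computed with scikit-learn and how early stopping could be configured in Keras; the patience value is an assumption.

```python
# Hedged sketch: accuracy, precision and ROC-AUC with scikit-learn,
# plus an early-stopping callback for Keras training.
from sklearn.metrics import accuracy_score, precision_score, roc_auc_score
from tensorflow.keras.callbacks import EarlyStopping

y_true = [1, 0, 1, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1]               # predicted positive-class probabilities
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # threshold at 0.5

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("ROC AUC  :", roc_auc_score(y_true, y_prob))

# Stop training once validation loss has not improved for 3 consecutive epochs.
early_stop = EarlyStopping(monitor="val_loss", patience=3,
                           restore_best_weights=True)
# model.fit(X_train, y_train, validation_split=0.2, callbacks=[early_stop])
```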
Table 1: Accuracy and ROC score of machine learning models on the Amazon Reviews dataset
2) IMDb Reviews dataset: Table 2 clearly shows that the three most effective models when trained on the IMDb Reviews dataset were LightGBM, Naive Bayes, and Linear SVC. An ensemble of these three models was built to obtain the best possible results among the machine learning models for the IMDb Reviews dataset; a sketch of such an ensemble is given after Table 2.
With the ensemble model we obtain an accuracy of 0.894, a precision score of 0.898, and an ROC score of 0.894.
Table 2: Accuracy and ROC score of machine learning models on the IMDb Reviews dataset
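The following is a hedged sketch of how the three models could be combined with a scikit-learn voting ensemble over the TF-IDF features; the voting scheme and hyperparameters are assumptions, as the exact ensembling method is not specified.

```python
# Hedged sketch: ensemble of the three best machine learning models
# (LightGBM, Naive Bayes, Linear SVC) over TF-IDF features.
from sklearn.ensemble import VotingClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from lightgbm import LGBMClassifier

ensemble = VotingClassifier(
    estimators=[
        ("lgbm", LGBMClassifier()),
        ("nb", MultinomialNB()),
        ("svc", LinearSVC()),
    ],
    voting="hard",  # majority vote; LinearSVC has no predict_proba for soft voting
)
# ensemble.fit(X_train_vec, y_train)
# y_pred = ensemble.predict(X_test_vec)
```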
The BERT approach gave an accuracy score of 0.86 and an ROC score of 0.935 on the IMDb dataset, while on the Amazon dataset it gave an accuracy score of 0.79 and an ROC score of 0.86, both better than the Bi-LSTM numbers. For classifying IMDb reviews, an integrated approach constructed from CNN and LSTM components can be used. The experimental findings show that the proposed deep learning model, built on a combination of CNN and LSTM blocks, beat all other models with about 90% accuracy and an ROC score of 0.96, revealing its outstanding effectiveness. This method could greatly help in differentiating positive from negative feedback, in order to better comprehend the preferences and interests of individuals from varied backgrounds, and to strengthen the link between consumers and enterprises. The hybrid model was not applied to the Amazon dataset because of its small size.
For future work, better accuracy could be obtained by tuning and by adding methods such as dropout to avoid overfitting, and by applying different optimizers such as SGD and Adam. The proposed model can also be trained on multi-class sentiment analysis problems with a slight modification of the last layer, i.e., using a softmax function instead of the sigmoid function.