Tweet
Tweet
Given the conditions that were there in the past around the world, social media has
become the primary form of contact for most individuals. Social media platforms
and their users have a close relationship, and as a result, these platforms on many
levels reflect the users' personal lives. As a result, many persons with mental
illnesses who are unable to receive in-person assistance are now isolated.
Considering this, we offer a solution to identify and categorize a person's mental
health status, allowing them to get the care they need. Among a few of the chosen
architectures that were successfully applied to natural language processing
applications, we seek to determine the best efficient deep neural network
architecture. Given the limited unstructured text data retrieved from the Twitter
social media site, the chosen architectures are utilized to identify persons who
exhibit symptoms of mental diseases (in our example, depression).
1
Chapter 1
Introduction
We give the readers an overview of the project's background, motivation, problem
statement, difficulties encountered, project goals, and report structure in this
chapter.
1.1 Background
Social media platforms such as twitter Facebook Instagram are now an everyday
habit of one’s life. An average internet users spends around 2 hours and 30 minutes
daily on social media platforms [1].. The daily average usage is increasing rapidly.
The image down shows us the average time spent by a user on social networking
site worldwide from 2012 to 2022[2]
People are publicly connecting with each other through those platforms by
publishing images, messages, and other types of media. . On such platforms, most
users express their opinions and sentiments; while doing so is a great way to get
emotional support from other users, it may also demoralize people. Understanding
2
the feelings that users of these social networks express is essential for preventing a
depressed user from going on a bad mental health path
Now since people express their feeling openly on these public domains and twitter
being the most popular domain, the regular participation of users creates a great
volume of data The geographical location, timestamp, and digital footprint of the
content author are all included in this data. In order to entice others to follow,
watch, and comment, people also have a habit of putting the URLs of their social
media profiles in all their correspondence. This increased expansion allowed
people to get to know one another, converse, and give advice. Considering the vast
number of people who are both directly and indirectly affected by mental disease,
it is essential to do research on the methods that help in the identification of mental
illness and track and forecast the appearance of new mental illness tactics.
Twitter sentiment analysis refers to the use of natural language processing (NLP)
techniques to analyze the sentiment or emotion expressed in tweets, which are
short messages posted on the social media platform Twitter. This type of analysis
has gained widespread attention in recent years due to the increasing popularity of
Twitter and the potential for real-time analysis of public opinion on a wide range
of topics.
There are many applications for Twitter sentiment analysis, including brand
management, market research, political analysis, and customer service. By
analyzing the sentiment of tweets about a particular topic or product, companies
and organizations can gain insight into public opinion and identify areas of
improvement or concern.
There are several challenges to performing Twitter sentiment analysis, including
the limited length of tweets, the use of slang, emojis, and other informal language,
and the presence of noise and spam. In order to overcome these challenges,
researchers have developed a range of NLP techniques and tools, including text
preprocessing, feature extraction, and machine learning algorithms.
Overall, Twitter sentiment analysis has the potential to provide valuable insights
into public opinion and can be used in a variety of applications. However, it is
important to consider the limitations of this approach and to ensure that the results
are interpreted carefully in the context of the data and the specific research
question.
3
1.2 Motivation
This project allowed me to study and confirm my interest in the topic of machine
learning, which has long been a source of great fascination for me. Machines that
can learn, estimate situations, and make predictions will be extremely powerful and
have a wide range of applications. Machine learning has several applications in
practically every industry, including finance, health, automobile, etc. The fact that
machine learning is so widely used inspired me to adopt it as the foundation for my
idea.
This project was motivated by the fact that how information present on the web,
majorly tweets, can shape user’s feeling and how to analyze them to detect
depression at early stages
4
1.4 Report Overview
A brief outline or guideline to the readers. The report is structured as follows:
Chapter 2 – Literature Review, presents its readers with an overview of the
background research on the topic, Machine Learning and a go thorough to Natural
Language Processing and Sentimental Analysis, a review of literature work
referred to while performing the research
Chapter 3 – Design & Methodology, provides its readers with an outline of the
methodology pursued by the author in designing the architecture
Chapter 4 - Implementation The author describes key features, research, and
documentation undertaken in the implementation.
Chapter 5 - Evaluation, provides an analysis on evaluating the training and
validation accuracy of all the models used in this project.
Chapter 6 - Conclusion, provides a summary concluding the project report
Chapter 7 – References, provides references which doing the project
5
Chapter 2
Literature Review
2.1 Machine Learning Techniques
Machine learning is a subfield of artificial intelligence that involves the use of
algorithms and statistical models to enable computers to learn from data and make
predictions or decisions without being explicitly programmed. Machine learning
algorithms can be classified into two main categories: supervised learning and
unsupervised learning.
In supervised learning, the algorithm is trained on a labeled dataset, which consists
of input data and corresponding correct output labels. The algorithm learns to
predict the correct output label for a given input data by adjusting the weights and
biases of the model based on the error between the predicted and correct labels.
Examples of supervised learning include classification tasks, such as identifying
the type of object in an image, and regression tasks, such as predicting the price of
a house based on its characteristics.
In unsupervised learning, the algorithm is not provided with correct output labels
and must find patterns and relationships in the data on its own. Examples of
unsupervised learning include clustering tasks, such as grouping similar documents
together, and dimensionality reduction tasks, such as identifying the most
important features in a dataset.
There are many different types of machine learning algorithms and techniques,
including:
Decision trees: These are tree-like models that make decisions based on a
series of binary splits. Decision trees are simple to understand and interpret,
but they can be prone to overfitting if the tree becomes too complex.
Random forests: These are ensembles of decision trees that are trained on
different subsets of the data and combined to make predictions. Random
forests are more accurate and less prone to overfitting than individual
decision trees.
Support vector machines (SVMs): These are linear models that find the
hyperplane in a high-dimensional space that maximally separates different
6
classes. SVMs are effective for classification tasks, but they can be
computationally expensive for large datasets.
Naive Bayes: These are probabilistic models that use Bayes' theorem to
make predictions based on the probability of different events occurring.
Naive Bayes models are simple and efficient, but they make the assumption
that all features are independent, which may not always be the case.
K-means: This is a clustering algorithm that divides a dataset into a specified
number of clusters by minimizing the within-cluster sum of squares. K-
means is fast and efficient, but it can be sensitive to the initial placement of
the centroids.
Neural networks: These are models that are inspired by the structure and
function of the human brain. Neural networks can be used for a wide range
of tasks, including classification, regression, and feature learning. There are
many different types of neural networks, including convolutional neural
networks (CNNs) and recurrent neural networks (RNNs).
Deep learning: This is a type of machine learning that uses deep neural
networks with many layers, which are trained using large amounts of data.
Deep learning has been successful in many tasks, including image and
speech recognition, natural language processing, and machine translation.
These are just a few examples of the many different machine learning algorithms
and techniques that are available. In practice, the choice of algorithm and
technique depends on the nature of the data and the specific task at hand. It is often
necessary to try out multiple different approaches and tune the parameters of the
model to find the best solution.
2.2 Neural Networks
We will quickly discuss the various structures we employ in our models in this
section. These structures can be used in a variety of ways, including text analysis,
sequence prediction, and sequence categorization.
Deep neural network topologies are made to learn by connecting numerous layers,
with each layer only connected to the one before it in the hidden part of the
network. Vectors are already integrated in the input layer. There should be an equal
number of neurons in the output layer for each class in a multi-class classification
issue, and just one neuron is required for a binary class.
7
1) Recurrent Neural Networks (RNN)
Recurrent neural networks (RNN) for text mining and classification are also of
interest to researchers. RNN gives the earlier data points in the sequence more
weight. Thus, this method is efficient for classifying text, sequence, and string
data. The very advanced way the neural network in RNN considers information
from earlier nodes enables a better semantic analysis of the data set's structures.
8
3) LSTM
Long-Short Term Memory (LSTM) cells are used in most of our networks. They
offer a means of retaining a specific number of previous values. The input gate, the
output gate, and the forget gate are typically the three components that make them
up. By approving or disapproving the update, the input gate regulates the flow of
the new values. Whether or not the state is conveyed as the output is determined by
the output gate. The forget gate can finally restore the status of the cell.
4) CNN
The neural connections in Convolutional Neural Networks (CNN), a feed-forward
network, are modelled after the connections in the animal neural visual cortex. It
has primarily been utilized for voice and picture analysis. Input patterns can be
thought of in terms of simpler, smaller patterns rather than the entire input.
It typically consists of a series of convolution layers and sub sampling levels, with
the pooling layer being the most common. A group of filters (often of modest size)
that will be convolved across the input volume is the primary parameter of the
convolutional layer. The output of each convolution for each filter makes up the
convolutional layer.
9
2.3 An overview to Natural Language Processing
11
Here is an illustration of how the various emotions might be seen as a whole.
When deciding which emotions to categorise, their job is fascinating. Knowing
how to separate them is necessary for classifying emotions. A difficulty that would
be too challenging for our deep learning models would emerge from taking
emotions that are too similar (i.e., near together on the graph). If we choose to
categorise both the emotions sad and sluggish, for example, it would be quite
difficult to determine the emotions of a message like "I don't feel good."
It's important to avoid selecting too similar feelings. However, it's equally crucial
to address the full range of emotions. For instance, using only the two feelings of
boredom and tension would make it impossible to comprehend the emotions of
users. It is then advised to select at least one emotion for each of the Arousal
Valence model's four quadrant
2.5 Related Work
The Related Work can be found to be divided into two categories that is
Classification of Sentiments and Classification of Emotions
2.5.1 Classification of Sentiments
Many works is done in this field Business companies and individuals look to the
material posted on these sites, such as forum conversations, blogs, tweets, etc. to
make decisions as the number of users on the internet and the popularity of social
media sites like Facebook, Instagram, Twitter, etc. Through sentimental analysis,
the writer's emotions or the priority of the context in the posts are identified. The
sentimental analysis, which can be positive, negative, or neutral, seeks to ascertain
the writer's attitude toward specific issues or the entire book.
The hierarchy of sentiment analysis can be seen below
12
Finding the polarity of the sentiment in the text and classifying it as positive,
negative, or neutral is the main objective of sentiment analysis. This categorization
can take place at the phrase, document, entity, or aspect levels. The polarity of the
emotion can be extracted from user-generated text using a number of different
techniques. According to Firmino Alves et al. [3], machine learning, statistical
approach, and semantic approach—an technique based on lexical analysis or
thesaurus—are the three main methods for identifying sentiment polarity. In
opinion mining, sentiment polarity can be predicted from the text manually or
automatically by using experts to identify the polarity, according to Augustyniak,
ukasz et al. (2015). [4]
13
noun, adjective), and the phrases that represent product descriptions and the
adjectives as opinions are quickly re-read. Any statement was then selected, and
the characteristics of the product were described in order to determine whether a
word was classified as positive, negative, or neutral.
In the context of sentiment analysis of Twitter data, (Sosa and P. M., 2017)
proposed the combination of two NNs (Neural networks), such as CNN-LSTM
(Long Short Term Memory) and LSTM-CNN. 10,000 tweets, both favourable and
negative, were used as training data. About 75.2% of the time, LSTM-CNN is
accurate. [9].
"Deep learning for emotion recognition: A review" by Kim and Kim (2017)[10].
This is a review paper that provides an overview of the use of deep learning for
emotion recognition, and discusses the challenges and limitations of this approach.
The paper begins by discussing the importance of emotion recognition and the
various applications in which it can be used, including affective computing,
human-computer interaction, and healthcare. The authors then describe the
different methods that have been used for emotion recognition, including
traditional machine learning approaches and deep learning approaches.
The main part of the paper is focused on the use of deep learning for emotion
recognition. The authors describe several different deep learning architectures that
have been used for this purpose, including convolutional neural networks (CNNs),
recurrent neural networks (RNNs), and autoencoders. They also discuss the use of
transfer learning and data augmentation to improve the performance of these
models.
The authors conclude by discussing the challenges and limitations of using deep
learning for emotion recognition, including the need for large amounts of
annotated data and the potential for bias in the training data. They also suggest
some directions for future research in this area.
"Deep learning for emotion recognition: A survey" (Zhang, Yin, & Li, 2019)[11],
is a survey paper that provides a comprehensive overview of the use of deep
learning for emotion recognition. The paper covers a wide range of applications
and datasets, and discusses the different deep learning architectures that have been
used for emotion classification, including convolutional neural networks (CNNs),
recurrent neural networks (RNNs), and long short-term memory (LSTM) networks.
14
The paper also discusses the challenges and limitations of using deep learning for
emotion recognition, including the need for large amounts of annotated data, the
risk of overfitting, and the potential for bias in the training data. The paper also
considers the ethical implications of using deep learning for emotion recognition,
and discusses the importance of responsible and transparent use of these systems.
Overall, the paper provides a useful overview of the current state of the field and
highlights the potential of deep learning for emotion recognition.
"Deep affect prediction in-the-wild: Aff-wild database and challenge" by Kollias et
al. (2018)[12], presents a database and challenge for emotion classification using
deep learning. The database, called Aff-Wild, consists of a large collection of
annotated video sequences with a wide range of emotions and expressions. The
authors used this database to evaluate the performance of several deep learning
models for emotion classification. They found that the best performing model was
able to achieve an accuracy of 70.8% on the Aff-Wild dataset, and that the model
was able to generalize well to new data. The authors also discuss some of the
challenges and limitations of using deep learning for emotion classification,
including the need for large amounts of annotated data and the potential for bias in
the training data.
The paper "A deep learning approach for multimodal emotion recognition" (Poria
et al., 2017) describes the use of a deep learning model for emotion classification
based on multiple modalities, including text, audio, and visual data. The authors
used a dataset of annotated videos to train and evaluate the model. The model was
15
able to achieve high accuracy rates, with an average classification accuracy of
86.5% on the test set.[14]
The authors also conducted an ablation study to evaluate the importance of each
modality for emotion classification. They found that the visual modality was the
most important for accurate classification, followed by the audio and text
modalities. The authors also found that using multiple modalities improved the
performance of the model compared to using a single modality.
Overall, the results of this study suggest that deep learning approaches can be
effective for emotion classification based on multiple modalities, and can achieve
high accuracy rates when trained on a large and diverse dataset.
In Deep learning for text-based emotion classification using distant
supervisionThis paper describes the use of a deep learning model for emotion
classification based on text data. The model was trained on a large dataset of
annotated social media posts, and was able to achieve good accuracy rates for
emotion classification.
The authors of the paper used a recurrent neural network (RNN) with an attention
mechanism to classify the emotions expressed in the text data. They also used a
technique called "distant supervision" to automatically annotate the training data,
which allowed them to use a larger dataset without the need for manual annotation.
Overall, the results of the study demonstrate the effectiveness of deep learning for
emotion classification based on text data, and highlight the potential of using
distant supervision to improve the performance of such models.
In Transfer learning for emotion recognition in tweets This paper presents a deep
learning model for emotion classification based on text data, and demonstrates the
effectiveness of transfer learning for improving the performance of the model on a
dataset of tweets.[16]
The authors of the paper used a convolutional neural network (CNN) to classify
emotions in tweets, and trained the model on a large dataset of annotated tweets.
They then applied transfer learning to the model, using pre-trained word
embeddings and a CNN model trained on a large dataset of general-purpose text
data.
The results of the study showed that the use of transfer learning significantly
improved the performance of the model on the task of emotion classification in
16
tweets, compared to training the model from scratch. The authors also found that
the use of pre-trained word embeddings had a larger impact on the performance of
the model than the use of a pre-trained CNN model.
Overall, the paper demonstrates the effectiveness of transfer learning for improving
the performance of deep learning models for emotion classification based on text
data, and highlights the potential of this approach for a variety of NLP tasks.
2.6 RESEARCH GAPS
17
in cases where the systems are used in multi-cultural settings or for cross-
cultural comparison.
5. Limited consideration of individual differences: Emotion is often highly
personal and can vary significantly from person to person, but many of the
existing NLP-based emotion classification systems do not take this into
account. This can lead to inaccurate results, especially in cases where the
systems are used for individual-level analysis.
6. Limited consideration of multimodal data: Emotion can be conveyed
through a range of modalities, including text, audio, and visual data, but
many of the existing NLP-based emotion classification systems do not take
this into account. This can limit the accuracy and reliability of the systems,
especially in cases where multiple modalities are available.
Overall, there are many research gaps in the field of emotion classification using
NLP techniques that need to be addressed in order to improve the accuracy and
reliability of these systems. Further research is needed to address these gaps and to
develop more robust and generalizable emotion classification systems that can be
used in a variety of applications.
18
Chapter 3
DESIGN AND METHODOLOGY
3.1 Design
For the project, machine learning classifiers and LSTM models have been
employed to carry out the process of classification. Our goal is to know what are
the depressive and non-depressive tweets when the model is provided with an
input. First, let us understand the machine learning models are used in the project.
4. Support Vector Machine (SVM): This is a linear model that is used for
binary classification. It works by finding the hyperplane in the feature space
that maximally separates the two classes.
19
5. K-Nearest Neighbours (KNN) Classifier: This is a non-parametric model
that makes predictions based on the class of the K nearest data points. It
works by calculating the distance between the new data point and all the
training data, and then choosing the K data points that are closest to the new
data point. The class of the new data point is then determined by a majority
vote of the K nearest neighbours.
For tweet sentiment analysis, the above-mentioned models have been used for
classifying tweets into depressive and non-depressive sentiment. The baseline
score of all the ML models will be compared based on training accuracy and
testing accuracy. If some of the model seems too overfit, they will be excluded
from the final model inference testing process. The machine learning models have
not be hyper tuned or optimized in this project to preserve the simplicity. If the
need is required to boost the performance of the models, the models could be hyper
tuned.
To classify tweets using an RNN, the tweets are first converted into a numerical
form that can be input to the model. This can be done using techniques such as
word embeddings or one-hot encoding. The numerical data is then input to the
RNN, which processes the data in a sequential manner, taking into account the
order of the words in the tweet. The output of the RNN is a predicted class label
for each tweet (e.g., positive, or negative). The RNN can learn to make these
predictions by being trained on a labelled dataset of tweets, where the class labels
20
are provided by human annotators. RNNs are useful for tweet sentiment analysis
because they can capture the context and dependencies between words in a tweet,
which is important for understanding the sentiment of the tweet. They are also able
to handle long sequences of data, which is important for tweets, which can be very
long.
3.2.2 Architecture
The architecture of an RNN consists of a repeating module, called a cell, that has a
small number of internal states and processes one element of the sequence at a
time. The internal states of the cell are passed from one time step to the next,
allowing the RNN to retain information about the past and use it to process the
current input. In the context of tweet sentiment analysis, the input to the RNN at
each time step could be a word in the tweet, and the output could be a prediction of
the sentiment of the tweet. The internal states of the RNN cell would be updated at
each time step based on the previous states and the current input, allowing the
model to consider the context of the words in the tweet as it makes its prediction.
Overall, the architecture of an RNN for tweet sentiment analysis would consist of
an input layer that takes in the words of the tweet, one or more RNN cells that
process the words and retain information about the context of the tweet, and an
output layer that makes a prediction of the sentiment of the tweet based on the
processed input.
21
or not related to depression. This could be done by training the model on a labelled
dataset of tweets, where some tweets are labelled as being related to depression
and others are labelled as not related to depression. The LSTM model would then
learn to identify patterns in the language and structure of the tweets that are
indicative of depression. Once the model is trained, it can be used to classify new
tweets as being related to depression or not related to depression. This could be
useful for detecting and addressing mental health issues in social media users, as
well as for research on depression and other mental health conditions.
3.3.2 Architecture
The architecture of an LSTM consists of input, output, and forget gates that control
the flow of information into, out of, and within the cell, respectively. The input
gate determines what information to store in the cell, the forget gate determines
what information to throw away, and the output gate determines what information
to output. Each LSTM cell also has a cell state, which is like a buffer that can store
information for a long period of time. The cell state is updated at each time step
based on the previous cell state, the current input, and the output of the forget and
input gates. LSTMs are often used in natural language processing tasks because
they are able to capture long-term dependencies in language. They are also used in
other areas such as stock price prediction and language translation.
22
Here is a general outline of how you could use Tweepy to perform text mining on
tweets containing the hashtag "#depressed":
1. Install Tweepy and obtain the necessary API keys from Twitter.
2. Use Tweepy to authenticate your API requests and search for tweets
containing the hashtag "#depressed." You can use the tweepy.Cursor class
and the .search() method to retrieve a list of tweets that match the search
criteria.
3. Pre-process the tweets to prepare them for analysis. This may involve tasks
such as tokenization, stemming, and removing stop words.
4. Use text mining techniques, such as NLP and machine learning algorithms,
to analyse the content of the tweets and extract useful information and
insights.
5. Visualize and interpret the results of the text mining process to draw
meaningful conclusions about the data. For example, you might create word
clouds or perform sentiment analysis to understand the overall tone of the
tweets.
Keep in mind that this is just a general outline, and the specific details of your text
mining process will depend on your specific goals and the data you are working
with. You can find more information about text mining and how to perform it in
Python in the relevant documentation and online resources. Similarly, the search
query can consist of other hashtags which can be commonly found in depressive
tweets such as suicide, anxiety.
23
For the project, tweets have been fetched using Tweepy python library and text
mining process is carried out for specific depressive words such as depression,
mental health, suicide, anxiety, and phrase such as it is okay to be not okay. The
keywords are referred as search queries and the search space is the fetched tweets.
After the process of text mining has been carried out for various keywords
mentioned above, the data is compiled and classified as depressive sentiment
tweets. The data is complied in the form of comma-delimited file format with
following columns:
There are in total 24,147 tweets which are classified as depressive sentiment based
on the keywords which was used to carry out the process of text mining. For the
project, the non-depressive sentiment tweets have been collected using open-
source Kaggle data. There is total 20,952 tweets which are classified as non-
depressive sentiment tweet.
24
There are several ways to fetch data using the Twitter API, depending on what you
want to achieve. Some common methods include:
1. Searching for tweets: You can use the "search" endpoint to search for tweets
that match certain criteria, such as keywords or hashtags.
2. Accessing a user's timeline: You can use the "timeline" endpoint to retrieve
the most recent tweets from a particular user's timeline.
There are many other endpoints available in the Twitter API, which allow you to
access various types of data and perform various actions. You can find more
information about the different endpoints and how to use them in the Twitter API
documentation.
When you create a developer account on Twitter and apply for access to the
Twitter API, you will be given a set of API keys. These keys consist of a
"consumer key" and a "consumer secret," which are used to identify your
application, and an "access token" and "access token secret," which are used to
authenticate your API requests. The "consumer key" and "consumer secret" are
used to identify your application when making API requests. They are used to
authenticate your application and ensure that it has the necessary permissions to
access the data it is requesting.
The "access token" and "access token secret" are used to authenticate your API
requests on behalf of a specific user. When a user grants your application access to
their data, you will be given an "access token" and "access token secret" for that
user. You can use these keys to make API requests on behalf of that user and
access their data. You will need to include these keys in the header of your API
25
requests in order to authenticate them. The specific format of the header will
depend on the authentication method you are using. You can find more information
about this in the Twitter API documentation.
2. Removing stop words: Stop words are words that are commonly used in a
language but do not convey much meaning, such as "a," "an," "the," "and,"
etc. In the context of tweet sentiment analysis, stop words may be removed
as a pre-processing step in order to focus on the more meaningful words in
the tweets. There are several reasons why stop words might be removed
from text data in sentiment analysis:
Stop words are very common and do not contribute much to the
meaning of a sentence. Removing them can reduce the size of the
dataset and make it easier to process.
26
Stop words can be noisy and distract the model from the more
important words. Removing them can improve the model's ability to
identify the sentiment of the tweet.
Stop words are not always used in the same way across languages.
Removing them can make it easier to compare the sentiment of
tweets in different languages.
To remove stop words from text data, you can use a pre-defined list of stop words
and filter out any words that are on the list. There are also many libraries and tools
available that can perform stop word removal automatically.
3. Removing special characters: Special characters are characters that are not
letters or numbers, such as punctuation marks, emoji, and other symbols. In
the context of tweet sentiment analysis, special characters may be removed
as a pre-processing step in order to focus on the words in the tweets and
reduce noise in the data. There are several reasons why special characters
might be removed from text data in sentiment analysis:
Special characters can be noisy and distract the model from the
more important words. Removing them can improve the model's
ability to identify the sentiment of the tweet.
27
To remove special characters from text data, you can use a regular expression to
match and remove any characters that are not letters or numbers. There are also
many libraries and tools available that can perform this type of text cleaning
automatically.
Lowercasing can reduce the size of the dataset, since the same word
in different cases will be counted as a single word rather than
multiple words.
28
from words, and more sophisticated algorithms that use machine learning to
learn the base form of a word from its usage in context. To stem text data,
you can use one of the many stemmers that are available in programming
libraries and tools, such as the Porter stemmer or the Snowball stemmer.
6. Removing links and numbers: Links and numbers may not be relevant to the
sentiment of the tweet and can be removed.
For example, the contraction "I'm" could be expanded to "I am." It is also a good
idea to consider common contractions that include possessives, such as "it's" (it is),
"they're" (they are), and "you're" (you are).
To create a word cloud for depressive and non-depressive tweets, you would first
need to collect and classify a set of tweets as either depressive or non-depressive.
This could be done manually, or you could use a pre-trained machine learning
model to classify the tweets for you.
29
Once you have your tweets classified, you can use a word cloud generator tool to
create the word clouds. There are many word cloud generator tools available
online, such as WordClouds.com and WordCloudGenerator.net. Some
programming libraries, such as wordcloud in Python, also offer word cloud
generation capabilities.
To create the word clouds, you would need to pass the text of the tweets to the
word cloud generator, along with any relevant parameters such as the size and
color of the word cloud. The generator will then create the word cloud, with the
most common words displayed in larger font sizes.
By comparing the word clouds for the depressive and non-depressive tweets, you
can get a sense of the most common words used in each class of tweets and how
they differ. This can be useful for understanding the characteristics of each class of
tweets and for developing machine learning models for tweet sentiment analysis.
FIGURE: (Left) Word cloud generated for depressive tweets. (Right) Word cloud
generated for non-depressive tweets.
30
Word embeddings are a type of representation for text data that can be used as
input to a machine learning model. In the context of tweet sentiment analysis, word
embeddings can be used to convert the text of each tweet into a numerical form
that can be input to the model.
There are several ways to create word embeddings, but a common approach is to
use a neural network to learn the embeddings from the data. The neural network
takes a sequence of words as input, and outputs a fixed-length vector for each
word. The vectors for each word are learned such that words that have similar
meanings or are used in similar contexts are represented by similar vectors.
Once the word embeddings have been learned, they can be used as input to a
machine learning model for tasks such as sentiment analysis. The model can learn
to identify patterns in the embeddings that are indicative of positive or negative
sentiment. Word embeddings are useful because they capture the meaning of
words in a numerical form that can be input to a machine learning model. They
also have the advantage of being able to handle large vocabularies, since they do
not require a separate dimension for each word in the vocabulary.
Chapter 4
Implementation
As we explained in the design and literature review sections, in this chapter we will
now proceed to explain the models' implementation. Let us first get a firm grasp on
our model's critical flow charts and operation. As previously stated, we collected
two types of data, in first dataset there are depressive tweets are present and, in
another dataset, there are non-depressive tweets are present. After we loaded both
datasets in python notebook. After the data cleaning and pre-processing part, we fit
machine learning Supervised models like logistic regression, KNN classifier,
decision tree classifier, SVM, Neutral network and random forest classifier. First,
we describe data and attributes, which will be discussed further below, before
fitting the model and obtaining the output for the dataset. Once the data has been
processed and the changes have been made, the file is saved as processed_data.csv.
This file is used to fitting different types of models, comparing accuracy and
31
predicti ng. The
below figure illustrates the training and the flow cycle of our implementation.
32
There are different types of depressed emotions and we collect data using different
types of hashtags which people used on twitter. These emotions-hasgtags are ##
depress_tags = ["#depressed", "#anxiety", "#depression", "#suicide",
"#mentalhealth" "#loneliness", "#hopelessness", "#itsokaynottobeokay", "#sad"]
We will see one example of collection of data using hashtags
33
In above picture of python code, we can see that how to collect data of tweets
which are relate to depression emotion. As above code I have collected data of
other emotions also.
34
Using above code, I have combined all data.
Here one another dataset also included which we taken from Kaggle open-source
platform. In this dataset some sentiment tweets are present means non-depressive
tweets.
35
Check for duplicates - Primary key ('tweets.id')
36
4.2.2 Word Cloud
To find the most common words used in depressive and random datasets, the POS-
tag (parts of speech tagging) module in the NLTK library was used. Using Word
Cloud library, one can generate a word cloud based on word frequency and
superimpose these words on any image. In this case I had used the twitter logo and
matplotlib to display the image. The word cloud shows the words with higher
frequency in bigger text size while the “not-so” common words are in smaller text
size.
Make text all lower case
Remove links and images
Remove hashtags
37
Remove @ mentions
Remove emojis
Remove stop words
Remove punctuation
Get rid of stuff like "what's" and making it "what is'
Stemming / lemmatization
In above graph there are two types which are Depressive and random (non-
depressive) tweets are almost evenly distributed.
Distribution of tweet lengths
38
From the above graph we can say that there is no outliers present dataset.
Cleaning Combined dataset
I have to remove those rows with tweets that has been completely deleted in the
cleaning process.
39
In above code we can see that final cleaned dataset is saved. Cleaning of dataset is
good for increasing accuracy of model and also get the better results from study.
4.4 Model Building
As we discussed above in this study, we want to fit machine learning models like
SVM, RF, KNN and etc.
Firstly, we install some libraries for build different types supervised machine
learning models.
40
Now import required libraries
41
Classification models as well as LSTM with pretrained model (Spacy):
In order to run a supervised learning model, we first need to convert the clean text
into feature representation.
Split the dataset into two parts train and test, because we train model on train
dataset.
42
Above code is give the output of name of model, accuracy, train result and test
result. It will give result of confusion matrix and from these results we conclude
that the best fitted model.
Above values tells us accuracy of SVM model is good than others, also we can see
the confusion matrix and from this value True-True is good than other models and
its nothing but the 3195.
43
For better understand and compare algorithms we draw the graph
In above graph the best model with the highest accuracy is Support Vector
Machine (SVM) with 85.79% accuracy on test dataset. Logistic Regression
performed good also but we can see overfitting problem with CART, NN and RF.
44
LSTM model
45
4.5 Test Results
Support Vector Machine (SVM) with word embedding:
Above code running the SVM classifier again and train the model and give the
accuracy of model.
46
After running above code it shows that the following confusion matrix and graph
of accuracy of classification.
47
Model Inference-Testing
Firstly, we install and load the required libraries which are already installed for
model fitting.
Now expand data
48
In above code we made a function to perform stepwise cleaning process.
Here we loaded model which was best fitted using above code.
Now we want to text to model and would see the result.
49
In above code we can see that the test some text using SVM model and it gave
correct output of classification i.e., which text is depressed text and which is not.
Chapter 5
Discussion
In this dissertation, we explored the use of various architectures, including neural
networks, convolutional neural networks, and transformer architectures, for
sentiment analysis. We analysed the training time, accuracy, precision, and F1
score of these models and found that the transformer architecture, while powerful,
can be inefficient due to its dependence on large amounts of memory and stack
size. We also identified the challenge of limited maximum RAM memory for
training transformer models and discussed the potential for transformer models to
50
create a hyper-dimensional kernel space to separate positive and negative
sentiments. However, it is worth noting that the transformer architecture also has
some limitations, including the issue of memory leakage.
Challenges
Deep learning involves extracting features from large data sequences, which can be
computationally intensive and require parallel processing using multiple GPUs. It
is important to carefully understand the data and optimize the model to avoid
wasting GPU time and resources. Training a model can also be time-consuming,
taking approximately 3-4 hours of GPU time on an Amazon EC2 high-end server.
It is important to consider these factors when working with deep learning models.
In addition to the points mentioned, deep learning also involves choosing the
appropriate architecture and hyperparameters for the specific task at hand. This can
be a challenging process, as there is no one-size-fits-all solution and finding the
optimal configuration requires experimentation and trial-and-error. It is also
important to have a sufficient amount of labelled data for training, as well as to
properly split the data into training, validation, and testing sets. Overall, deep
learning requires careful planning and consideration of various factors to achieve
successful results.
Future Scope
The potential of sentiment analysis is vast, and the use of transformer architectures
has shown great promise in this field. However, the transformer architecture also
has its limitations, particularly in terms of its high memory usage. This makes it
difficult to deploy transformer models on a wider scale, as they require access to
powerful graphical processing units and tensor processing units which are not
widely available.
One solution to this problem is the use of cloud architecture, which allows for the
utilization of transformer models even without access to this specialized hardware.
However, it would be ideal to find a way to reduce the memory requirements of
transformer models in order to make them more widely accessible.
51
In the current landscape of sentiment analysis, long short-term memory networks
are often used in conjunction with systems like Amazon's Dynamo DP. While
these models have proven effective, there is still room for improvement and the
development of more efficient models. Overall, the future of sentiment analysis is
bright, and the field is likely to continue evolving as advancements are made in
artificial intelligence and deep learning. The goal of developing highly accurate
models remains, but there is also a growing focus on creating models that are more
widely deployable and accessible. It is possible that future developments in these
areas will lead to the creation of low-memory cost networks, which would be a
significant step forward in the field of sentiment analysis.
Evaluation
In this chapter, we will understand and evaluate the different research objectives of
this dissertation. The chapter is divided into sections that examine various models
which we fitted on dataset and interpretation of those models.
The first section of the evaluation discusses the understanding of data and
visualization part of the study. How to collect data from Twitter and another taken
from kaggle and combined both datasets and part of data cleaning and pre-
processing of main dataset.
After the data pre-processing, we build some supervised machine learning models.
Output and results of those models will understand in this chapter. Using the code,
we train some models and model will give output of classification which is divided
into two types depressive tweets and non-depressive tweets. Furthermore, best
performing models and logic behind the model’s working protocol is explain in
details.
In the third section, we will predict some values using fitted model and will check
accuracy using test data and will compare with original value. It’s given us
accuracy of models and from the accuracy we can conclude best model build on
our dataset.
At the end of study, we will give some input for model for model testing and will
check the output and conclude the study.
52
5.1 Data pre-processing and Visualisation
There are 6384 Null values in the location columns but since location will not be
used in our analysis or as a feature in our model, we don’t need to replace them.
Outliers
53
In above both graph there is present equal range of data. We can see that there is
no outlier present.
54
Distribution
In above graph we can see that two types of classification are present which are
shows the Depressive tweets and random (non-depressive) tweets are almost
evenly distributed.
Model Summary
Model Mean(cv_results) Std(cv_results) Train Test
result result
LR 0.834244 0.005465 0.846696 0.839569
KNN 0.84620 0.006422 0.843659 0.779337
Decision 0.721817 0.009442 0.999150 0.734269
tree
SVM 0.849854 0.008233 0.890306 0.850057
NN 0.830539 0.006755 0.997510 0.832341
RF 0.815538 0.009021 0.999150 0.819870
55
In above table we can see the model’s name with accuracy of model. From the
above table we can conclude that the SVM model is best fitted as we as Logistic
also good fitted as compare with other models. From the column of Train result we
can say that Decision tree, Neural network and Random Forest are overfitted.
Confusion Matrix
Mod LR KNN CART SVM NN RF
el
Tru Fals T F T F T F T F T F
e e
True 310 560 257 454 268 884 319 577 306 573 316 755
4 3 5 5 6 0
False 572 282 110 292 991 249 481 280 610 280 516 262
0 3 6 6 3 7 5
In above table we can see the accuracy of classification in every training model.
Comparison
Above graph shows that the best model with the highest accuracy is Support
Vector Machine (SVM) with 85.79% accuracy on test dataset. Logistic regression
performed good also but we can see the graph of Cart, NN and Rf there is problem
of overfitting.
56
LSTM model
Model Summary
Epoch 1/3
309/309 [==============================] - 35s 108ms/step - loss:
0.4131 - accuracy: 0.7974 - val_loss: 0.3325 - val_accuracy: 0.8545
Epoch 2/3
309/309 [==============================] - 33s 108ms/step - loss:
0.1983 - accuracy: 0.9213 - val_loss: 0.3564 - val_accuracy: 0.8553
Epoch 3/3
309/309 [==============================] - 33s 108ms/step - loss:
0.1071 - accuracy: 0.9593 - val_loss: 0.4389 - val_accuracy: 0.8468
In above note we can see that the model accuracy increasing at Epoch 3 and it goes
to near 85%.
Model Accuracy
train 0.9273566569484937
result:
test result: 0.8432539682539683
Confusion matrix
array([[3020, 450],
[ 656, 2930]])
57
5.3 Compare All the models
Note: For understand the test results read text mentioned above
58
5.4 Support Vector Machine (SVM) with word embedding:
Test results
=========================================
Accuracy score is : 0.8500566893424036
=========================================
Detail:
precision recall f1-score support
Test results shows the accuracy of model as well as precision and recall.
From confusion matrix we can understand the results of outcomes True-True,
False-False, True-False and False-True.
59
AUC score is : 0.9257555807380031
In Above graph ROC curve is on left side and Precision-recall curve in left side
Based on this graph we can decide how much the model is performing. Curve from
the graph is far away from the dotted line therefore we can say that model is good
performing.
In precision-recall graph there is 80% to 90% area acquired by the curve.
Area under curve is 92.57%.
5.6 Model Testing
Examples,
['hate life']
DepressedTweet
['love life']
NonDepressedTweet
['get leave alone']
DepressedTweet
In above examples we can se that the if we give input text as “hate life” then model
gives output as “Depressed tweet” and if input is “love life” then output is “Non-
depressed tweet”.
60
Chapter 6
Conclusion
Sentiment analysis is a valuable tool for understanding the sentiment surrounding a
particular product or topic. In this dissertation, we employed a range of machine
learning techniques, including recurrent neural networks, and long short-term
memory networks, to analyse the sentiment of tweets.
We fitted different types of supervised machine learning models to solve our
problem of classification of tweets which are into two types one is depressed
tweets and another is non depressed tweets on our collected and pre-processed
data.
To streamline the implementation process, we utilized jupyter notebook and
installed require libraries to run our python scripts. This allowed us to scrape data
from a given link, fit the appropriate models, and display the results on a
dashboard. Among the machine learning techniques we used, the support vector
machine with word embedding achieved particularly impressive results, with an
accuracy rate of around 85%. However, both the Logistic regression and KNN
performed well on temporal data, with accuracies of around 84% and 78% on the
test set, respectively. In both cases, we observed a consistent decrease in loss and
increase in accuracy as the models were trained, indicated that they were
effectively learning to differentiate between positive and negative sentiments.
61
References
[2] https://www.statista.com/statistics/433871/daily-social-media-usage-
worldwide/
[1 https://techjury.net/blog/time-spent-on-social-media/#:~:text=1.,on%20social
%20media%20in%202022.&text=This%20differs%20only%20slightly%20from,
%E2%80%942%20hours%2C%2022%20minutes.
[3] André Luiz Firmino Alves, Cláudio de Souza Baptista, Anderson Almeida
Firmino, Maxwell Guimarães de Oliveira, and Anselmo Cardoso de Paiva. 2014.
A Comparison of SVM Versus Naive-Bayes Techniques for Sentiment Analysis in
Tweets: A Case Study with the 2013 FIFA Confederations Cup. In Proceedings of
the 20th Brazilian Symposium on Multimedia and the Web (WebMedia '14).
Association for Computing Machinery, New York, NY, USA, 123–130.
https://doi.org/10.1145/2664551.2664561
[4] Augustyniak, Ł., Szymański, P., Kajdanowicz, T., & Tuligłowicz, W. (2015).
Comprehensive study on lexicon-based ensemble classification sentiment analysis.
Entropy, 18(1),
[5] [6] P. K. Patil and K. P. Adhiya, “Automatic Sentiment Analysis of Twitter
Messages Using Lexicon Based Approach and Naive Bayes Classifier with
Interpretation of,” pp. 9025–9034, 2015.
[6] Elli, Maria Soledad, and Yi-Fan Wang. "Amazon Reviews, business analytics
with sentiment analysis." 2016
[7] Nowak, J., Taspinar, A. and Scherer, R., 2017, June. LSTM recurrent neural
networks for short text and sentiment classification. In International Conference
on Artificial Intelligence and Soft Computing (pp. 553-562). Springer, Cham.
[8] Gundla, A.V. and Otari, M.S., 2015. A review on sentiment analysis and
visualization of customer reviews. vol, 4, pp.2062-2067.
[9] Sosa, P. M. (2017). Twitter sentiment analysis using combined LSTM-CNN
models. Eprint Arxiv, 1-9.
[10] Kim, Jaebok, Gwenn Englebienne, Khiet P. Truong, and Vanessa Evers.
"Towards speech emotion recognition" in the wild" using aggregated corpora and
deep multi-task learning." arXiv preprint arXiv:1708.03920 (2017).
62
[12] Kollias, Dimitrios, Panagiotis Tzirakis, Mihalis A. Nicolaou, Athanasios
Papaioannou, Guoying Zhao, Björn Schuller, Irene Kotsia, and Stefanos Zafeiriou.
"Deep affect prediction in-the-wild: Aff-wild database and challenge, deep
architectures, and beyond." International Journal of Computer Vision 127, no. 6
(2019): 907-929.
[13] Suttles, Jared, and Nancy Ide. "Distant supervision for emotion classification
with discrete binary values." In International Conference on Intelligent Text
Processing and Computational Linguistics, pp. 121-136. Springer, Berlin,
Heidelberg, 2013.
[14] Zadeh, Amir, Minghai Chen, Soujanya Poria, Erik Cambria, and Louis-
Philippe Morency. "Tensor fusion network for multimodal sentiment
analysis." arXiv preprint arXiv:1707.07250 (2017).
[15] Chen, X., & Duan, L. (2017).[15] Deep learning for text-based emotion
classification using distant supervision. In Proceedings of the International
Conference on Affective Computing and Intelligent Interaction (pp. 107-114).
Springer, Cham.
[16] Kim, H., & André, E. (2018). Transfer learning for emotion recognition in
tweets. In Proceedings of the International Conference on Multimodal Interaction
(pp. 657-665). ACM.
63