Depression Detection Emotion AI
Abstract—Depression is a leading cause of mental ill health, which has been found to increase the risk of early death. Moreover, it is a major cause of suicidal ideation and leads to significant impairment in daily life. Emotion artificial intelligence is a field of ongoing research in emotion detection, specifically in the field of text mining. The advent of internet-based media sources has resulted in significant user data being available for sentiment analysis of text and images. This paper aims to apply natural language processing on Twitter feeds for conducting emotion analysis focusing on depression. Individual Tweets are classified as neutral or negative, based on a curated word-list, to detect depression tendencies. In the process of class prediction, support vector machine and Naive-Bayes classifiers have been used. The results are presented using the primary classification metrics, including F1-score, accuracy and the confusion matrix.

Keywords—Emotion Artificial Intelligence, Support Vector Machine, Naive Bayes, Depression Detection, Machine Learning, Natural Language Processing

I. INTRODUCTION

Depression is a mental disorder which can impair many facets of human life. Though not easily detected, it has profound and varied impacts [8]. In today's world, the stresses of daily life events may increase the chances of depression. Its diagnosis is made if at least five of the below symptoms occur almost every day for at least two weeks:
1. Depressed Mood
2. Loss of interest in activities
3. Suicidal thoughts
4. Feeling of worthlessness or hopelessness
5. Worsened ability to think and concentrate
There may be other factors, such as genes and family history, which can also lead to depression.

Nowadays people tend to express their emotions and opinions and to disclose their daily lives [9] through a variety of social media platforms like Twitter, Facebook and Instagram. This expression can be through images, videos and, mainly, text. Due to the widespread presence and reach of these social media platforms, there is a plethora of user data available for undertaking explorative analysis. Textual data, being the most widely used form of communication, offers a set of characteristics which makes it the best choice for data analysis and emotion AI. Text data has the following benefits:
1. Easy to handle
2. Simple and quick to pre-process
3. Quantitative and qualitative availability
4. Significantly smaller memory storage size compared to image and video data
Twitter, which has a fixed limit on the number of characters allowed in a single Tweet [7], proves to provide the best platform on which to apply emotion artificial intelligence for depression detection.

Emotion AI is an upcoming field of research in sentiment analysis, which aims to utilize machine learning techniques and algorithms for emotion detection. Successive work in the domain of emotion artificial intelligence will ultimately lead to breakthroughs in large-scale opinion mining [13], market research and the diagnosis of medical conditions [10]. Emotion AI is not limited to textual data, but can have wide applications in computer vision, through image and video data, for facial expression detection. Furthermore, advancements in recurrent neural network based models have led to state-of-the-art results in speech emotion artificial intelligence.

Determining the sentiment of an entire document is referred to as coarse-level analysis, while fine-level analysis deals with attribute-level sentiment [6]. Sentence-level emotion AI falls between these two. Twitter, the data source of choice in this paper, deals mostly with Tweets, which are short messages bound by a 140-character limit [2]. In this concise format, users express their emotions and feelings about ongoing happenings in their life and the world around them.

Emotion AI has been applied on the collected and pre-processed Tweet data, which is classified into a negative or neutral emotion state. Supervised learning is the machine learning task of providing the algorithm with a labelled dataset, which is then used to learn model parameters (weights, biases). This paper implements Naive-Bayes and Support Vector Machine classifiers for detecting Tweets which demonstrate signs of depression and emotional ill-health.

The structure of the remaining part of the research paper is as follows. A brief description of the classifiers used in the implementation is presented in Section II. Section III deals with the methodology of the research paper. Experimental results and discussion are presented in Section IV. A conclusion is provided in the final Section V.
Here,
H is the classification hypothesis whose probability is being estimated,
E1 to En are the evidence variables,
M is the set of all evidence.
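The definitions above correspond to the standard Naive Bayes posterior. A hedged reconstruction of the formula they describe, assuming the usual conditional-independence form (the notation in the original may differ):

P(H \mid M) = \frac{P(H) \prod_{i=1}^{n} P(E_i \mid H)}{P(M)}, \qquad M = \{E_1, \ldots, E_n\}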
incoming streams, a unique access token and secret token need to be supplied. This two-way communication is handled using the Twitter API.
2. Keyword list: Using a pre-created word-list for detecting trigger words symbolizing poor mental well-being, Tweets from all over the world are collected at random. These keyword-specific Tweets are mixed with a general batch of non-weighted Tweets, in the form of JSON objects (a sketch of this collection and cleaning pipeline is given after step 5).
3. Extracting text from JSON: The collected tweets in the
JSON objects are parsed to extract only the text field of the
Tweets. Other meta-data related to any particular Tweet is
removed.
4. Data Cleaning: To avoid errors in encoding the textual data, the text is purged of links (http) and non-ASCII characters such as emoticons. The result is a clean dataset, rid of non-conforming character types.
5. Generate csv file for train and test set: The cleaned text
data from individual Tweets is added to the training and test
dataset, in a vectorized format. Classification labels for the
training and test datasets are manually added, to create a csv
file using comma as the delimiter.
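A minimal sketch of the keyword-filtered collection and cleaning pipeline described in steps 1-5, assuming the streaming endpoint is accessed through the tweepy library (pre-4.0 streaming interface). The credentials, trigger-word list and file name are placeholders rather than values from the paper, and only the keyword-filtered part of the stream is shown; manual labelling and the mixing with non-weighted Tweets happen afterwards.

import csv
import json
import re

import tweepy  # assumed Twitter API client (pre-4.0 streaming interface)

# Placeholder credentials and trigger-word list (hypothetical values).
CONSUMER_KEY, CONSUMER_SECRET = "consumer-key", "consumer-secret"
ACCESS_TOKEN, ACCESS_SECRET = "access-token", "access-secret"
TRIGGER_WORDS = ["depressed", "hopeless", "worthless"]

def clean(text):
    """Purge links and non-ASCII characters (e.g. emoticons) from a Tweet."""
    text = re.sub(r"http\S+", "", text)              # remove links
    text = text.encode("ascii", "ignore").decode()   # drop non-ASCII characters
    return " ".join(text.split())

class TweetCollector(tweepy.StreamListener):
    """Receive Tweets as JSON objects and append their cleaned text to a csv."""

    def on_data(self, raw):
        tweet = json.loads(raw)                      # each Tweet arrives as JSON
        text = clean(tweet.get("text", ""))          # keep only the text field
        if text:
            with open("tweets.csv", "a", newline="") as f:
                csv.writer(f).writerow([text, ""])   # label column filled in manually
        return True                                  # keep the stream open

if __name__ == "__main__":
    auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
    auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)
    stream = tweepy.Stream(auth, TweetCollector())
    stream.filter(track=TRIGGER_WORDS)               # keyword-filtered live stream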
B. Data Preprocessing
The csv file is read and several data preprocessing steps are
performed on it. Natural language processing [11] has been
utilized for preprocessing methods applied on the extracted
data:
1. Tokenization: Tokenization is the process of dividing a string into several meaningful substrings, such as units of words, sentences, or themes [5]. In this case, the first column of the csv file, containing the Tweet, is extracted and converted into individual tokens (a combined sketch of preprocessing steps 1-4 is given after the POS tagger step).
2. Stemming: Stemming involves reducing words to their root form. This helps to group similar words together. For the implementation, the Porter Stemmer is used.
Fig. 2: Training phase
3. Stop Words Removal: The commonly used words, known as stop words, need to be removed since they are of no use in training and could lead to erratic results if not ignored. The NLTK library has a set of stop words which can be used as a reference to remove stop words from the Tweet.
4. POS Tagger: To improve the quality of the training data, the tokenized text is assigned the respective parts of speech using a POS tagger. The tags are used to retain only the adjectives, nouns and adverbs, since other parts of speech are not of much significance. Example: in 'I love coding', 'coding', functioning as a noun, is retained, while the rest is removed.
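A combined sketch of the four preprocessing steps, assuming NLTK for tokenization, stemming, stop-word removal and POS tagging (the paper names NLTK only for stop words, and the required corpora must be fetched once with nltk.download). Tagging is done before stemming here, a practical ordering choice rather than the paper's stated order.

from nltk import pos_tag, word_tokenize
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer

# requires: nltk.download("punkt"), nltk.download("stopwords"),
# nltk.download("averaged_perceptron_tagger")
STOP_WORDS = set(stopwords.words("english"))
KEEP_TAGS = ("JJ", "NN", "RB")        # adjectives, nouns and adverbs
stemmer = PorterStemmer()

def preprocess(tweet):
    tokens = word_tokenize(tweet.lower())                        # 1. tokenization
    tokens = [t for t in tokens
              if t.isalpha() and t not in STOP_WORDS]             # 3. stop-word removal
    tagged = pos_tag(tokens)                                      # 4. POS tagging
    kept = [word for word, tag in tagged if tag.startswith(KEEP_TAGS)]
    return [stemmer.stem(word) for word in kept]                  # 2. stemming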
After all these pre-processing steps, a bag of words is formed. The bag of words counts the number of occurrences of each word, and these counts are then used as features to train a classifier.
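A brief sketch of the bag-of-words step with scikit-learn's CountVectorizer, a plausible counterpart of the "Count Vectorizer object" mentioned in the training section; the file and column names are placeholders.

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# csv produced in the data-collection step: one Tweet per row plus a manual label
train = pd.read_csv("train.csv", names=["text", "label"])

vectorizer = CountVectorizer()                       # builds the vocabulary
X_train = vectorizer.fit_transform(train["text"])    # sparse matrix of word counts
y_train = train["label"]                             # labels as a vector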
C. Training
The classifier requires two parameters: the training set and the labels. The training set in this case is the set of Tweets, which needs to be further processed in order to be fed into a classifier. The set of Tweets needs to be converted into vector format for further processing. The set of labels corresponding to each Tweet is also fed into the classifier in the form of a vector.
Saving the Classifier and the Count Vectorizer Object: Since training needs to be done only once, the trained classifier object is saved into a pickle file. The same applies to the Count Vectorizer object. Thus both these objects are dumped into pickle files for further use.
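A minimal training sketch, assuming the X_train, y_train and vectorizer objects from the bag-of-words sketch above; scikit-learn's MultinomialNB and LinearSVC stand in for the Naive-Bayes and SVM classifiers, since the paper does not name its exact implementations, and the pickle file names are placeholders.

import pickle

from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

nb = MultinomialNB().fit(X_train, y_train)       # Naive-Bayes classifier
svm = LinearSVC().fit(X_train, y_train)          # SVM classifier

# Training is done only once, so the classifiers and the Count Vectorizer
# object are dumped to pickle files for reuse in the testing phase.
with open("classifiers.pkl", "wb") as f:
    pickle.dump({"naive_bayes": nb, "svm": svm}, f)
with open("vectorizer.pkl", "wb") as f:
    pickle.dump(vectorizer, f)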
Fig. 3: Testing phase

D. Testing
Testing the classifier involves the following steps:
1. Loading saved models: The trained classification models are loaded from the pickle file, to be used for prediction on the test dataset.
2. Data Preprocessing: The test dataset is preprocessed in
a manner similar to the training data.
3. Class Prediction on Test Tweets: Each tweet is classified
into a depressed or neutral class.
4. Computation of Confusion Matrix: Based on the values of true and false positives and negatives, the confusion matrix is computed for the evaluation of classification performance (a sketch of the testing phase follows).
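A sketch of the four testing steps under the same assumptions as the training sketch (pickled classifiers and vectorizer, a test csv with manually added labels); the "depressed" and "neutral" label strings are placeholders.

import pickle

import pandas as pd
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

# 1. load the saved models and the Count Vectorizer object
with open("classifiers.pkl", "rb") as f:
    models = pickle.load(f)
with open("vectorizer.pkl", "rb") as f:
    vectorizer = pickle.load(f)

# 2. preprocess the test data in the same way as the training data
test = pd.read_csv("test.csv", names=["text", "label"])
X_test = vectorizer.transform(test["text"])      # reuse the training vocabulary

# 3. predict the depressed / neutral class for each Tweet
predicted = models["naive_bayes"].predict(X_test)

# 4. compute the confusion matrix and the evaluation metrics
print(confusion_matrix(test["label"], predicted))
print("F1:", f1_score(test["label"], predicted, pos_label="depressed"))
print("Accuracy:", accuracy_score(test["label"], predicted))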
IV. RESULTS
The results are evaluated on the basis of F1 score and accuracy. The F1 score is the primary performance measure and accuracy is the secondary measure. The F1 score is calculated based on the precision and recall:

F1 score = 2 * (P * R) / (P + R)

Here, P stands for Precision and R is the Recall.
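As a quick arithmetic check of this formula against the values reported in Table I: for Multinomial Naive Bayes, 2 * (0.836 * 0.83) / (0.836 + 0.83) = 1.38776 / 1.666 ≈ 0.833, and for SVM, 2 * (0.804 * 0.79) / (0.804 + 0.79) = 1.27032 / 1.594 ≈ 0.797, which agrees with the reported F1 scores of 0.8329 and 0.7973 up to rounding of the precision and recall values.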
It can be noticed from the results that Multinomial Naive Bayes has performed the best, with an F1 score of 0.8329, whereas SVM has achieved a lower F1 score of 0.7973. The precision and recall follow the same trend, with Multinomial Naive Bayes outperforming SVM. Fig. 4 and Fig. 5 show the normalized confusion matrices. Each consists of two rows and two columns, in which the false positives, false negatives, true positives and true negatives can be analyzed. The accuracy of the Multinomial Naive Bayes is 83%, and that of SVM is 79%.

Fig. 4: Confusion Matrix for SVM
Fig. 5: Confusion Matrix for Naive Bayes
The accuracy of the above classifiers is somewhat low because the Tweets contain text which is not in a standard format; for example, people write "ty" instead of "thank you". It is therefore a challenging task to train the classifier and achieve significant results, and this calls for further research in this area to improve the accuracy of the model.

TABLE I: Results
Comparison of performance metrics
Name                      Precision  Recall  F1 score  Accuracy
Multinomial Naive Bayes   0.836      0.83    0.8329    83%
Support Vector Machine    0.804      0.79    0.7973    79%

V. CONCLUSION
Text based emotion AI has successfully been applied to the task of depression detection using Twitter data. The results delivered in this paper are on par with previous results achieved in this domain. Supervised learning classification has a limitation and cannot grant human-level accuracy in the prediction of depression from text data. Moreover, there is significant noise in the Tweets collected before pre-processing, which eliminates about a third of the data due to third-person and news references. In future, a layer of expert-based suggestion can be added to the model to reduce the number of false positives. This would increase the precision of sentiment analysis for depression detection.

REFERENCES
[1] A. N. Hasan, B. Twala, and T. Marwala, "Moving Towards Accurate Monitoring and Prediction of Gold Mine Underground Dam Levels," IEEE IJCNN (WCCI) proceedings, Beijing, China, 2014.
[2] A. K. Jose, N. Bhatia, and S. Krishna, "Twitter Sentiment Analysis," National Institute of Technology, Calicut, 2010.
[3] T. Mitchell, "Machine Learning, Second Edition, Chapter One," McGraw Hill, January 2010.
[4] C. D. Manning, P. Raghavan, and H. Schutze, Introduction to Information Retrieval, Cambridge UP, 2008.
[5] P. Taylor, Text-to-Speech Synthesis, Cambridge, U.K.: Cambridge University Press, 2009.
[6] Y. Mejova, "Sentiment analysis: An overview," http://www.cs.uiowa.edu/ymejova/publications/CompsYelenaMejova.pdf, 2009.