Proceedings of the International Conference on Intelligent Sustainable Systems (ICISS 2017)

IEEE Xplore Compliant - Part Number:CFP17M19-ART, ISBN:978-1-5386-1959-9

Depression Detection using Emotion Artificial Intelligence

Mandar Deshpande
Electrical and Electronics Engineering Department
Visvesvaraya National Institute of Technology
Nagpur, India
Email: [email protected]

Vignesh Rao
Department of Computer Science and Engineering
Visvesvaraya National Institute of Technology
Nagpur, India
Email: [email protected]

Abstract—Depression is a leading cause of mental ill health and has been found to increase the risk of early death. Moreover, it is a major cause of suicidal ideation and leads to significant impairment in daily life. Emotion artificial intelligence is a field of ongoing research in emotion detection, specifically in the field of text mining. The advent of internet-based media sources has resulted in significant user data being available for sentiment analysis of text and images. This paper aims to apply natural language processing on Twitter feeds for conducting emotion analysis focusing on depression. Individual tweets are classified as neutral or negative, based on a curated word-list, to detect depression tendencies. In the process of class prediction, support vector machine and Naive Bayes classifiers have been used. The results are presented using the primary classification metrics, including F1-score, accuracy and confusion matrix.

Keywords—Emotion Artificial Intelligence, Support Vector Machine, Naive Bayes, Depression Detection, Machine Learning, Natural Language Processing

I. INTRODUCTION

Depression is a mental disorder which can impair many facets of human life. Though not easily detected, it has profound and varied impacts [8]. In today's world, the stresses of daily life events may increase the chances of depression. Its diagnosis is made if at least five of the symptoms below occur almost every day for at least 2 weeks:

1. Depressed mood
2. Loss of interest in activities
3. Suicidal thoughts
4. Feeling of worthlessness or hopelessness
5. Worsened ability to think and concentrate

Other factors, like genes and family history, might also lead to depression.

Nowadays people tend to express their emotions and opinions and disclose their daily lives [9] through a variety of social media platforms like Twitter, Facebook and Instagram. This expression can be through images, videos and, mainly, through text. Due to the widespread presence and reach of these social media platforms, there is a plethora of user data available for undertaking explorative analysis. Textual data, being the most widely used form of communication, offers a number of characteristics which make it the best choice for data analysis in emotion AI.

Text data has the following benefits:
1. Easy to handle
2. Simple and quick to pre-process
3. Quantitative and qualitative availability
4. Significantly smaller memory storage size compared to image and video data

Twitter, which has a fixed limit on the number of characters allowed in a single Tweet [7], proves to be the best platform to apply emotion artificial intelligence for depression detection.

Emotion AI is an upcoming field of research in sentiment analysis, which aims to utilize machine learning techniques and algorithms for emotion detection. Successive work in the domain of emotion artificial intelligence will ultimately lead to breakthroughs in large-scale opinion mining [13], market research and the diagnosis of medical conditions [10]. Emotion AI is not limited to textual data, but can have wide applications in computer vision through image and video data, for facial expression detection. Furthermore, advancements in recurrent neural network based models have led to state-of-the-art results in speech emotion artificial intelligence.

Determining the sentiment of an entire document is referred to as coarse-level analysis, while fine-level analysis deals with attribute-level sentiment [6]. Sentence-level emotion AI comes in between these two. Twitter, being the data source of choice in this paper, mostly deals with Tweets, which are short messages bound by a 140-character limit [2]. In this concise format, users express their emotions and feelings about ongoing happenings in their lives and the world around them.

Emotion AI has been applied to the collected and pre-processed Tweet data, which is classified into a negative or neutral emotion state. Supervised learning is the machine learning task in which the algorithm is provided with a labelled dataset, which is then used to learn the model parameters (weight, bias). This paper implements Naive Bayes and Support Vector Machine classifiers for detecting Tweets which demonstrate signs of depression and emotional ill-health.

The remaining part of the paper is structured as follows. A brief description of the classifiers used in the implementation is presented in Section II. Section III deals with the methodology of the research paper. Experimental results and discussion are presented in Section IV. A conclusion is provided in the final Section V.

978-1-5386-1959-9/17/$31.00 ©2017 IEEE 858



II. BACKGROUND

A. Dataset
The dataset comprises tweets collected using the Twitter API. A total of 10,000 Tweets were collected for generating the training and test datasets. A ratio of 80:20 was adopted for splitting the collected data into training and test datasets.
Two word-lists were compiled for the training and test datasets. The training word-list comprised a curated list of words suggesting depression tendencies, like 'depressed', 'hopeless' and 'suicide'. For the test dataset, tweets were collected at random, and these included neutral as well as negative components.

B. Naive Bayes Classifier

The Naive Bayes classifier implements Bayes' theorem with strong (naive) independence assumptions [1], in particular an independent feature model. Bayes' theorem works on conditional probability, which gives the probability of an event given that some other event has already occurred. The classifier estimates the conditional probability of each class given the set of evidence and selects the class with the highest probability. Naive Bayes is a popular technique because it is very fast and gives high accuracy [3]. The conditional probability is defined as follows:

$$P(H \mid M) = \frac{P(E_1 \mid H)\, P(E_2 \mid H) \cdots P(E_n \mid H)\, P(H)}{P(M)}$$

Here,
H is the hypothesis, i.e. the class being predicted
E1 to En are the evidence variables
M is the set of all evidence

There are three types of Naive Bayes classifier:

1. Gaussian Naive Bayes
2. Multinomial Naive Bayes
3. Bernoulli Naive Bayes

In this paper, Multinomial Naive Bayes has been used as the classifier. Multinomial Naive Bayes works well on multinomially distributed data and is widely used in text classification.

C. Support Vector Machine

Support Vector Machine (SVM) is a supervised learning algorithm that analyzes data and recognizes patterns used for classification [4]. Given a set of inputs, SVM assigns each one to one of two categories. SVM can deal with both linear and non-linear classification.
With the kernel trick, it can efficiently perform non-linear classification. It does so by mapping the input set to a high-dimensional feature space. The types of kernel include the polynomial kernel, the Gaussian radial basis function (RBF) kernel, the Laplace RBF kernel, the hyperbolic tangent kernel and the sigmoid kernel. SVM performs the classification by constructing a separating hyperplane.

In the context of this paper, the goal of a text classification system is to determine whether a given tweet belongs to a set of predefined categories [12]. An optimal SVM algorithm for text classification does this via multiple optimal strategies.

III. METHODOLOGY

The approach taken in this paper is modular in its organization. Individual components of the workflow have been segregated into stand-alone steps to improve the quality of implementation.
The workflow starts with the data collection step, which utilizes the Twitter API for generating the dataset. Natural language processing enables classification that is much better than the average sentiment analysis done by humans [14]. Following the creation of the datasets, the data preprocessing module systematically passes the data through tokenization, stemming and stop-word removal. A POS tagger then identifies the essential pieces of the text to be utilized. After this, the text classifier is trained on the processed text data from Twitter in the training phase. In the testing phase, class prediction is made on the test dataset to identify potential Tweets demonstrating depression tendencies.

Fig. 1: Data Collection

A. Data Collection
Every machine learning or sentiment analysis task starts with the collection of relevant data from various sources. In this paper, Twitter is considered as the data source for analysis, in the form of user Tweets. This portion covers the tasks from streaming the data from the Twitter servers to compiling the training and test datasets.

1. App Authentication: Application-only authentication involves communication between the application request and the Twitter API, without a user context. This requires the creation of a Twitter app, which is assigned a unique consumer key and secret key. Further, to access the Twitter data from
incoming streams, a unique access token and secret token need to be supplied. This two-way communication is handled using the Twitter API.
2. Keyword list: Using a pre-created word-list for detecting trigger words symbolizing poor mental well-being, Tweets from all over the world are collected at random. These keyword-specific tweets are mixed with a general batch of non-weighted Tweets, in the form of JSON objects.
3. Extracting text from JSON: The collected tweets in the JSON objects are parsed to extract only the text field of the Tweets. Other meta-data related to any particular Tweet is removed.
4. Data Cleaning: To avoid errors in encoding textual data, the text is purged of links (http) and non-ASCII characters like emoticons. The result is a clean dataset, free of non-conforming character types.
5. Generate csv file for train and test set: The cleaned text data from individual Tweets is added to the training and test datasets, in a vectorized format. Classification labels for the training and test datasets are added manually, to create a csv file using a comma as the delimiter.

B. Data Preprocessing
The csv file is read and several data preprocessing steps are performed on it. Natural language processing [11] has been utilized for the preprocessing methods applied to the extracted data:
1. Tokenization: Tokenization is the process of dividing a string into several meaningful substrings, such as units of words, sentences, or themes [5]. In this case, the first column of the csv file, containing the tweet, is extracted and converted into individual tokens.
2. Stemming: Stemming involves reducing words to their root form. This helps to group similar words together. For the implementation, the Porter Stemmer is used.
3. Stop Words Removal: Commonly used words, known as stop words, need to be removed since they are of no use in training and could also lead to erratic results if not ignored. The NLTK library has a set of stop words which can be used as a reference to remove stop words from the tweet.
4. POS Tagger: To improve the quality of the training data, the tokenized text is assigned its respective parts of speech using a POS tagger. This is used to extract only the adjectives, nouns and adverbs, since other parts of speech are not of much significance. Example: from 'I love coding', the word 'love' is extracted and the rest are removed.
After all these pre-processing steps, a bag of words is formed. The bag of words counts the number of occurrences of each word, which is then used as a feature to train a classifier.

C. Training
The classifier requires two parameters: the training set and the labels. The training set in this case is the set of tweets, which needs to be processed further before it can be fed into a classifier. The set of tweets needs to be converted into vector format for further processing. The set of labels corresponding to each tweet is also fed into the classifier in the form of a vector.

Fig. 2: Training phase

Saving the Classifier and the Count Vectorizer Object: Since training needs to be done only once, the trained classifier object needs to be saved to a pickle file. The same applies to the Count Vectorizer object. Thus both these objects are dumped into a pickle file for further use.

Fig. 3: Testing phase
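As a rough illustration of the training and saving steps in Section III-C, the sketch below builds bag-of-words counts, fits a multinomial Naive Bayes model with Laplace smoothing, and round-trips the model through pickle. It is a plain-Python stand-in and not the implementation used in this paper: the function names, the toy tweets, the smoothing constant `alpha`, and the in-memory `pickle.dumps` call (used here instead of a pickle file on disk) are all assumptions made for the example.

```python
import math
import pickle
from collections import Counter


def bag_of_words(tokens):
    """Count how often each token occurs in one pre-processed tweet."""
    return Counter(tokens)


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit log priors and Laplace-smoothed log likelihoods per class."""
    vocab = sorted({word for tokens in docs for word in tokens})
    class_docs = Counter(labels)
    word_counts = {label: Counter() for label in class_docs}
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)

    model = {"log_prior": {}, "log_lik": {}}
    for label in class_docs:
        model["log_prior"][label] = math.log(class_docs[label] / len(docs))
        total = sum(word_counts[label].values()) + alpha * len(vocab)
        model["log_lik"][label] = {
            word: math.log((word_counts[label][word] + alpha) / total)
            for word in vocab
        }
    return model


def predict(model, tokens):
    """Return the class with the highest posterior log score."""
    counts = bag_of_words(tokens)
    scores = {}
    for label, log_prior in model["log_prior"].items():
        log_lik = model["log_lik"][label]
        # Words never seen in training are skipped (an assumption of this sketch).
        scores[label] = log_prior + sum(
            n * log_lik[word] for word, n in counts.items() if word in log_lik
        )
    return max(scores, key=scores.get)


# Hypothetical, already tokenized and stemmed tweets with manual labels.
train_docs = [
    ["feel", "hopeless", "tired"],
    ["depress", "alone", "hopeless"],
    ["great", "lunch", "friend"],
    ["new", "phone", "great"],
]
train_labels = ["negative", "negative", "neutral", "neutral"]

model = train_multinomial_nb(train_docs, train_labels)

# Serialize once after training, then restore for later prediction.
saved_bytes = pickle.dumps(model)
restored = pickle.loads(saved_bytes)
prediction = predict(restored, ["so", "hopeless", "alone"])
```

Working in log space avoids numerical underflow when many word likelihoods are multiplied, which is why the product in the Naive Bayes formula appears here as a sum.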


D. Testing
Testing the classifier involves the following steps:
1. Loading saved models: The trained classification models are loaded from the pickle file, to be used for prediction on the test dataset.
2. Data Preprocessing: The test dataset is preprocessed in the same manner as the training data.
3. Class Prediction on Test Tweets: Each tweet is classified into a depressed or neutral class.
4. Computation of Confusion Matrix: Based on the counts of true and false positives and negatives, the confusion matrix is computed to evaluate classification performance.
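The bookkeeping behind step 4 can be sketched as follows. This is an illustrative helper and not code from this paper: the function name `confusion_counts`, the label strings and the toy predictions are assumptions, with 'depressed' treated as the positive class.

```python
def confusion_counts(y_true, y_pred, positive="depressed"):
    """Tally true/false positives and negatives for a two-class problem."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, guess in zip(y_true, y_pred):
        if guess == positive and truth == positive:
            counts["tp"] += 1  # depressed tweet correctly flagged
        elif guess == positive:
            counts["fp"] += 1  # neutral tweet wrongly flagged
        elif truth == positive:
            counts["fn"] += 1  # depressed tweet missed
        else:
            counts["tn"] += 1  # neutral tweet correctly passed
    return counts


# Hypothetical manual labels and classifier outputs for six test tweets.
y_true = ["depressed", "depressed", "neutral", "neutral", "depressed", "neutral"]
y_pred = ["depressed", "neutral", "neutral", "depressed", "depressed", "neutral"]

counts = confusion_counts(y_true, y_pred)
```

The four tallies are the entries of the 2x2 confusion matrix, and every metric reported in the results can be derived from them.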

IV. RESULTS

Fig. 4: Confusion Matrix for SVM

The results are evaluated on the basis of F1 score and accuracy. The F1 score is the primary performance measure and accuracy is the secondary measure. The F1 score is calculated from the precision and recall:

$$F_1 = 2 \cdot \frac{P \cdot R}{P + R}$$

Here, P stands for precision and R for recall.
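A small sketch of this calculation is given below. The helper names are assumptions made for illustration; the precision and recall values passed in at the end are the rounded Multinomial Naive Bayes figures reported in Table I, so the result only approximately reproduces the tabulated F1 score.

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f1_from_counts(tp, fp, fn):
    """Derive precision and recall from confusion-matrix counts first."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return f1_from_pr(precision, recall)


# Rounded precision and recall for Multinomial Naive Bayes from Table I.
nb_f1 = f1_from_pr(0.836, 0.83)

# Toy counts (hypothetical): 2 true positives, 1 false positive, 1 false negative.
toy_f1 = f1_from_counts(tp=2, fp=1, fn=1)
```

Because the F1 score is a harmonic mean, it stays close to the smaller of precision and recall, which makes it a stricter summary than their arithmetic average.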
It can be seen from the results that Multinomial Naive Bayes performed best, with an F1 score of 83.29%, whereas SVM achieved a lower F1 score of 79.73%. Precision and recall follow the same trend, with Multinomial Naive Bayes outperforming SVM. Fig. 4 and Fig. 5 show the normalized confusion matrices. Each consists of two rows and two columns, from which the false positives, false negatives, true positives and true negatives can be read. The accuracy of Multinomial Naive Bayes is 83%, compared with 79% for SVM.
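How a normalized confusion matrix and the accuracy relate to raw counts can be sketched as below. The sketch assumes row-wise normalization (each true class sums to 1), which the text does not state explicitly, and the counts are invented for illustration rather than taken from the experiments in this paper.

```python
def normalize_rows(matrix):
    """Divide each row by its sum so that every true class sums to 1."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([value / total if total else 0.0 for value in row])
    return normalized


def accuracy(matrix):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical counts. Rows: true neutral, true depressed.
# Columns: predicted neutral, predicted depressed.
raw_counts = [
    [40, 10],
    [8, 42],
]

normalized = normalize_rows(raw_counts)
overall_accuracy = accuracy(raw_counts)
```

With row-wise normalization the diagonal entries are the per-class recall values, so the matrix shows directly how often each true class is recognized regardless of how many tweets it contains.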
The accuracy of the above classifiers is somewhat limited because tweets contain text that is not in a standard format. For example, people write 'ty' instead of 'thank you'. This makes it challenging to train the classifier and achieve significant results, and it calls for further research in this area to improve the accuracy of the model.

Fig. 5: Confusion Matrix for Naive Bayes

TABLE I: Results
Comparison of performance metrics
Name                      Precision   Recall   F1 score   Accuracy
Multinomial Naive Bayes   0.836       0.83     0.8329     83%
Support Vector Machine    0.804       0.79     0.7973     79%

V. CONCLUSION
Text-based emotion AI has successfully been applied to the task of depression detection using Twitter data. The results delivered in this paper are on par with previous results achieved in this domain. Supervised classification has limitations and cannot grant human-level accuracy in the prediction of depression from text data. Moreover, the collected Tweets contain significant noise: pre-processing eliminates about a third of the data because of third-person and news references. In future, a layer of expert-based suggestion can be added to the model to reduce the number of false positives. This would increase the precision of sentiment analysis for depression detection.

REFERENCES
[1] A. N. Hasan, B. Twala, and T. Marwala, "Moving Towards Accurate Monitoring and Prediction of Gold Mine Underground Dam Levels", IEEE IJCNN (WCCI) proceedings, Beijing, China, 2014.
[2] A. K. Jose, N. Bhatia, and S. Krishna, "Twitter Sentiment Analysis", National Institute of Technology, Calicut, 2010.
[3] T. Mitchell, H. McGraw, "Machine Learning, Second Edition, Chapter One", January 2010.
[4] C. D. Manning, P. Raghavan, H. Schutze, "Introduction to Information Retrieval", Cambridge UP, 2008.
[5] P. Taylor, "Text-to-Speech Synthesis", Cambridge, U.K.: Cambridge University Press, 2009.
[6] Y. Mejova, "Sentiment analysis: An overview", http://www.cs.uiowa.edu/ymejova/publications/CompsYelenaMejova.pdf, 2009.
[7] M. S. Neethu, Rajsree, "Sentiment analysis in twitter using machine learning techniques", Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT), 2013.
[8] Benjamin L. Cook, Ana M. Progovac, Pei Chen, Brian Mullin, Sherry Hou and Enrique Baca-Garcia, "Novel Use of Natural Language Processing (NLP) to Predict Suicidal Ideation and Psychiatric Symptoms in a Text-Based Mental Health Intervention in Madrid", Computational and Mathematical Methods in Medicine, 2016.
[9] M. Rambocas and J. Gama, "Marketing Research: The Role of Sentiment Analysis", The 5th SNA-KDD Workshop11, University of Porto, 2013.
[10] Stéphane Meystre, Peter J. Haug, "Natural language processing to extract medical problems from electronic clinical documents: Performance evaluation", Journal of Biomedical Informatics, Volume 39, Issue 6, December 2006.
[11] G. Chowdhury, "Natural language processing", Annual Review of Information Science and Technology, 2003.
[12] Zi-qiang Wang, Xia Sun, De-xian Zhang, Xin Li, "An Optimal SVM-Based Text Classification Algorithm", International Conference on Machine Learning and Cybernetics, 2006.
[13] Mitali Desai, Mayuri A. Mehta, "Techniques for sentiment analysis of Twitter data: A comprehensive survey", International Conference on Computing, Communication and Automation (ICCCA), 2016.
[14] Krystian Horecki and Jacek Mazurkiewicz, "Natural Language Processing Methods Used for Automatic Prediction Mechanism of Related Phenomenon", Springer International Publishing Switzerland, ICAISC, 2015.
