Using Machine Learning Algorithms To Detect Suicide Risk Factors On Twitter

This document describes a study presented at the 2019 International Conference on Data Mining Workshops that aimed to identify suicide risk factors on Twitter using machine learning algorithms. The researchers collected over 12,000 public tweets from nearly 4,000 users and applied topic discovery algorithms like latent semantic analysis and latent Dirichlet allocation to detect underlying suicide risk factors. They were able to classify users as "HighRisk" or "AtRisk" with over 80% precision and 90% sensitivity by using a decision tree model incorporating the detected risk factors. The framework could supplement suicide prevention efforts by enabling automated risk identification and intervention.


2019 International Conference on Data Mining Workshops (ICDMW)
DOI: 10.1109/ICDMW.2019.00137

Using Machine Learning Algorithms to Detect Suicide Risk Factors on Twitter

Samah J. Fodeh, Department of Emergency Medicine & Yale Center for Medical Informatics, Yale University, New Haven, USA
Taihua Li, College of Computing and Digital Media, DePaul University, Chicago, USA
Kevin S. Menczynski, College of Computing and Digital Media, DePaul University, Chicago, USA
Todd M. Burgette, College of Computing and Digital Media, DePaul University, Chicago, USA
Andrew K. Harris, College of Computing and Digital Media, DePaul University, Chicago, USA
Georgeta I. Ilie, College of Computing and Digital Media, DePaul University, Chicago, USA
Satyan K. Rao, College of Computing and Digital Media, DePaul University, Chicago, USA
Jonathan F. Gemmell, College of Computing and Digital Media, DePaul University, Chicago, USA
Daniela S. Raicu, College of Computing and Digital Media, DePaul University, Chicago, USA

Abstract: The goal of this study is to identify suicide risk factors on Twitter. We propose a machine learning framework that could be potentially useful for suicide prevention interventions. We applied search terms from the suicidal ideation tracking framework proposed by Jashinsky et al. and downloaded 12,066 public tweets from 3,873 users via Twitter's application programming interface (API). We created "HighRisk" or "AtRisk" labels for users based on their usage of suicidal ideation terms and applied three topic discovery algorithms to find underlying suicide risk factors among users, which were subsequently used to classify users as "HighRisk" or "AtRisk". The algorithms applied included Latent Semantic Analysis, Latent Dirichlet Allocation, Non-negative Matrix Factorization, Decision Tree, and K-means Clustering. Our topic discovery approach detected 7 out of the 12 suicide risk factors proposed by Jashinsky et al. Using a decision tree classification model that utilized these factors, we achieved 0.844 precision, 0.912 sensitivity, and 0.829 specificity in classifying users into the "HighRisk" and "AtRisk" groups. This framework can supplement suicide researchers' work and suicide prevention efforts, with the potential to be employed at run-time.

Keywords: Suicide ideation, Topic modeling, Text analysis, Latent Semantic Analysis

1. Introduction
Suicide is the 10th leading cause of death in the United States, with an estimated cost of $51 billion annually [1], making suicide prevention not only a public health issue but also an economic one. The Centers for Disease Control and Prevention (CDC) reported that for youth between the ages of 10 and 24, suicide is the third leading cause of death [1]. Even more concerning, a recent study of 32 U.S. children's hospitals showed that rates of suicide and serious self-harm in children and adolescents increased steadily from 2008 to 2015 [2]. Social media has been identified as playing a possible role in contributing to suicide through copycat actions, mainly in vulnerable and impressionable youth [3]. Twitter is a social network platform where users share messages limited to 140 characters. It has been shown that 21% of Americans use Twitter, with 36% of them aged between 18 and 29 [4]. Twitter has been known to be used as a platform for suicidal messages and suicide notes [5]. Social media platforms have even been used as venues to live-stream suicide attempts, showing that these warning signs need to be taken seriously [6]. Twitter has recognized this serious risk and has put a service in place for people who are, or know of somebody who is, suicidal to reach out and get help [7]. However, Twitter is not proactive in identifying users at risk, and reporting is at the discretion of the user and not in real time. To make suicide prevention timely and effective, suicide-related data has to be collected, analyzed, and reported in a timely manner, so that interventions can be made before the person commits suicide.

Prior research has identified specific terms or phrases in tweets indicative of suicide risk factors [5]. While most research has focused on using machine learning in conjunction with human annotators and/or suicide research experts, our study dropped the human factor to reduce cost and utilized machine learning approaches to increase efficiency in identifying users at high risk of suicide. For example, in [5], human annotators determined the suicide risk factors and linked them to certain terms and phrases in the tweets. In our previous work [8, 9], we utilized the suicide risk factor framework of [5] to detect relational features and language patterns indicative of suicidal ideation. In this study, we applied reproducible machine learning algorithms, including Latent Semantic Analysis (LSA), Latent Dirichlet Allocation (LDA), and Non-Negative Matrix Factorization (NMF), to identify suicide-related topics and themes of discussion in tweets, and we studied user cluster formations and user profile classifications.

2. Related Work
The study of Twitter data in suicide analysis has become more prevalent as there are known issues with recall and context bias in the psychological assessments that have been studied in the past [10]. There is a lack of textual data for review in traditional analysis of suicide, and the review of Twitter data offers an insight into the day-to-day feelings of individual users [11].

This analysis seeks to find quantifiable signals of suicide, ways to visualize linguistic data, and automatic emotion classification of tweets to help create a possible map of emotions leading up to a suicide attempt, and it looks to incorporate results into current work in the field of psychology [11]. The overarching goal of the research is to create a tangible screening process by which users could be analyzed in a way that is non-invasive and cost effective for different research groups [11]. As social media has become a focus of suicide detection and prevention, various effective techniques have emerged to perform this analysis. Linguistic Inquiry and Word Count (LIWC) has been used to analyze single users for both classification and time-series analysis [12, 13]. These analyses were centered around the linguistic characteristics of users' tweets, leveraging previously categorized types of words (pronouns, past tense, positive emotions, etc.).

Logistic regression and linguistic analysis have been used to identify suicidal users, with LIWC analysis applied to pre-process the data [14]. O'Dea's research has expanded feature sets from purely word counts into other high-level semantic features, such as the number of total words and the number of pronouns in each post. The goal of this work was to identify different populations of suicidal users with different levels of suicide risk and to distinguish the linguistic profiles of strongly concerning users. Strongly concerning suicide-related tweets have a higher word count, higher usage of the words defined in the library, increased use of first-person singular pronouns, and increased references to time and death. It was also noted that this group of tweets generally has fewer user-tags, indicating isolation in the Twitter social network [14]. Similar analysis has been conducted to define a level of concern for different Twitter users through the use of a Support Vector Machine (SVM) classifier on word frequencies with Term Frequency-Inverse Document Frequency (TF-IDF) transformations performed on the data [15]. SVM has been used to identify tweets that warrant further investigation or review; during testing, the SVM and the human coders achieved the same level of accuracy. When performing this type of classification, an identified challenge is that it can be difficult to determine whether a tweet is sarcasm or requires actual intervention [15]. Despite this issue, this type of study gained validation when researchers were able to identify 53 users who went on to attempt suicide, based on their analysis of a Chinese social media site similar to Twitter [16]. Previous research has also tried to obtain labels for suicide-related tweets through crowdsourcing [17]. This is one of the most difficult tasks researchers face when studying suicide, as there are no definitive labels to be used for classification. Crowdsourcing is the most common solution for generating labels, but the labels are still subject to the reviewers' bias. In a different study, the authors acquired confirmed suicide cases by scanning traditional media outlets and then traced back the deceased users' tweets [18], but this is still not a perfect science, as Twitter account information cannot guarantee the identity of the user. Another approach that has been undertaken is topic identification, which is the concept of identifying broader ideas or sentiments across multiple tweets and users. Rather than identifying users who have specific risk levels, this type of analysis seeks to describe groups of users in ways that can aid in feature extraction. Latent Dirichlet Allocation (LDA) is one technique that has proven successful for topic mining in Twitter data [19]. LDA, along with Latent Semantic Analysis (LSA), can also be used to compare similarities between words and sentences, which can help identify word vectors and allow comparison across suicidal users at different risk levels [20]. These techniques also provide probabilities (LDA) and loadings (LSA) for each word as it contributes to each topic. These are unsupervised methods, but they can also be used for classifying users into one of the discovered topics [20].
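To make the related-work pipeline concrete, the sketch below shows how a tweet-level classifier of the kind described in [15] (word frequencies with a TF-IDF transformation feeding an SVM) is typically assembled with scikit-learn; the example tweets and labels are placeholders, not data from that study.

# Illustrative sketch of a TF-IDF + SVM tweet classifier in the spirit of [15];
# the tweets and annotations below are hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

tweets = ["I feel so hopeless and empty tonight",
          "great game last night, feeling fine"]
labels = ["concerning", "not_concerning"]   # placeholder human annotations

# Term frequencies re-weighted by inverse document frequency, then a linear SVM.
model = make_pipeline(TfidfVectorizer(stop_words="english"), LinearSVC())
model.fit(tweets, labels)
print(model.predict(["everything feels worthless lately"]))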
3. Data
3.1 Twitter Data Collection
Previous suicide-related research has shown that people who are at risk of committing suicide can be detected and tracked using terms related to the twelve suicide risk factors [5]. Using these search terms, we collected tweets posted between January 1st, 2015 and June 8th, 2016 from 3,873 unique users.

3.2 Data Preprocessing
Processed tweets were filtered based on the suicide risk factors and their associated language, and were then used to create a user-term frequency matrix. Since most of the tweets contained a keyword indicative of the feeling or behavior in combination with a descriptor to complete the meaning, we filtered the keywords into 48 representative terms, as shown in Table 1. After filtering, 12,066 tweets remained and were used to build the final user-term frequency matrix.

Table 1: Terms Used in the User-Term Matrix
Abused, Prozac, Panic Disorder, Cut
Depressed, Pills, Social Anxiety, Bully
Hopeless, Bullied, Suicide Abused, Fight Dad
Worthless, Suicide Pain, Suicide Gun, Fight Mom
Empty, Suicide Tried, Suicide Shoot, Fight Parents
Anxious, Suicide Mom, Schizophrenia, Fight Sister
Sleeping, Suicide Sister, Anorexia, Fight Brother
Irritable, Suicide Brother, Bulimia, Argue Dad
Restless, Suicide Friend, OCD, Argue Mom
Alcohol, Suicide Thought, Bipolar, Argue Parents
Sertraline, Suicide Kill, PTSD, Impulsive
Zoloft, Suicide Think, Borderline Personality, Suicide Before

3.3 User Labeling: "HighRisk" and "AtRisk"
Users were labeled and divided into two groups, "HighRisk" and "AtRisk", following [8]. "HighRisk" users were those whose tweets contained language pertaining to suicide-related behavior, and the other users were annotated as "AtRisk". If a user did not have any risk factor term from Table 1, he or she was removed. In the filtered dataset, there were 280 "HighRisk" users and 1,614 "AtRisk" users.
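A minimal sketch of how a user-term frequency matrix of this kind can be assembled, together with a simplified stand-in for the labeling rule above, is shown below; the user documents are hypothetical, the vocabulary is an abbreviated subset of Table 1, and the set of "suicide-behavior" terms used for labeling is illustrative rather than the exact rule adopted from [8].

# Illustrative sketch: per-user term counts over a fixed vocabulary (subset of
# Table 1) plus a simplified "HighRisk"/"AtRisk" labeling rule.
from sklearn.feature_extraction.text import CountVectorizer

terms = ["depressed", "hopeless", "worthless", "cut", "bullied",
         "suicide thought", "suicide kill"]           # abbreviated term list

# One concatenated document per user (users and tweets are made up).
user_docs = {
    "user_a": "feeling hopeless and worthless again, that suicide thought keeps coming back",
    "user_b": "got bullied at school today and now I feel depressed",
}

vectorizer = CountVectorizer(vocabulary=terms, ngram_range=(1, 2))
A = vectorizer.fit_transform(user_docs.values())      # user-term frequency matrix

# Simplified rule: users whose tweets contain suicide-behavior terms are "HighRisk".
behavior_terms = {"suicide thought", "suicide kill"}
labels = ["HighRisk" if any(t in doc for t in behavior_terms) else "AtRisk"
          for doc in user_docs.values()]
print(A.toarray())
print(labels)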
3.4 Addressing Class Imbalance
To deal with the class imbalance issue (the small number of "HighRisk" users), two additional balanced datasets were created using random down-sampling and K-means clustering. The first balanced dataset contained all 280 "HighRisk" users and 280 randomly selected "AtRisk" users. The second balanced dataset contained the 280 "HighRisk" users and 285 "AtRisk" users selected from 15 clusters found using K-means clustering; for each cluster, we selected the 19 most representative users.
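The two balancing strategies can be sketched as follows, assuming X_at_risk holds the 1,614 "AtRisk" user vectors; the data here is synthetic, and picking the users closest to each cluster centroid is only one plausible reading of "most representative", which the text above does not define precisely.

# Illustrative sketch of the two down-sampling strategies in Section 3.4.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X_at_risk = rng.random((1614, 48))       # synthetic stand-in for "AtRisk" user vectors

# Strategy 1: random down-sampling to match the 280 "HighRisk" users.
random_idx = rng.choice(len(X_at_risk), size=280, replace=False)

# Strategy 2: cluster the "AtRisk" users into 15 groups and keep 19 users per
# cluster (15 x 19 = 285), here taken as the users closest to each centroid.
kmeans = KMeans(n_clusters=15, n_init=10, random_state=0).fit(X_at_risk)
selected = []
for c in range(15):
    members = np.where(kmeans.labels_ == c)[0]
    dists = np.linalg.norm(X_at_risk[members] - kmeans.cluster_centers_[c], axis=1)
    selected.extend(members[np.argsort(dists)[:19]])
print(len(random_idx), len(selected))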
4. Methodology
We first applied three topic clustering algorithms in this study to discover topics discussed by users on Twitter: Latent Semantic Analysis (LSA), Latent Dirichlet Allocation (LDA), and Non-Negative Matrix Factorization (NMF). Second, we explored cluster formations and user classifications into "HighRisk" or "AtRisk" using the identified topics. LDA is a probabilistic approach that maximizes the log likelihood of each term appearing in each topic, while LSA is a matrix factorization approach that rotates topic vectors in the term space to best capture the variation of terms. However, LSA produces topics containing terms that are negatively correlated with those topics, and such results are hard to interpret. To account for this disadvantage, NMF, a non-negative rank factorization approach, was performed to ensure that the topics are more interpretable and well separated.

In order to classify users as either "HighRisk" or "AtRisk", two machine learning approaches were employed: Decision Tree classification and K-means clustering. While Decision Tree is a supervised approach that learns to classify instances based on relations between the data points and the corresponding ground truth labels, K-means is an unsupervised approach that partitions users based on their similarities. Since the ground truth labels are unknown when such applications are deployed in the real world, we applied K-means clustering to examine the effectiveness of the discovered topics in separating "HighRisk" and "AtRisk" users.
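A condensed sketch of the topic discovery step is given below, assuming A is the user-term frequency matrix from Section 3.2; the scikit-learn estimators (TruncatedSVD as LSA, LatentDirichletAllocation, NMF) are common stand-ins, and the settings shown are illustrative rather than the exact configuration used in this study.

# Illustrative sketch: derive 12 topics with LSA, LDA, and NMF and project the
# users into topic space. A is a user-term count matrix (users x 48 terms).
import numpy as np
from sklearn.decomposition import TruncatedSVD, LatentDirichletAllocation, NMF

rng = np.random.default_rng(0)
A = rng.integers(0, 5, size=(200, 48)).astype(float)   # synthetic user-term matrix

n_topics = 12   # one per suicide risk factor in the framework of [5]
lsa = TruncatedSVD(n_components=n_topics, random_state=0)
lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
nmf = NMF(n_components=n_topics, init="nndsvda", max_iter=500, random_state=0)

# fit_transform returns a user-topic matrix; components_ holds the term weights
# (loadings or probabilities) that define each topic.
users_lsa = lsa.fit_transform(A)
users_lda = lda.fit_transform(A)
users_nmf = nmf.fit_transform(A)
print(users_lsa.shape, lda.components_.shape, nmf.components_.shape)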
5. Experiments and Results
5.1 Topic Identification
Based on the framework proposed in [5], the three topic identification algorithms were applied with the assumption that there are 12 underlying suicide-related topics, corresponding to the 12 suicide risk factors. As shown in Table 2, each topic identified from the results of LDA, LSA, and NMF is associated with one of the 12 risk factors. To determine the topic definition using one of the twelve suicide risk factors, a threshold of 0.25 is applied to the term weights associated with the topics. For example, in the first topic discovered by LDA, sleeping and cut had loadings greater than the threshold, and since the term sleeping has the greatest loading value, the topic is assigned Depressive Feelings, according to the framework in [5]. Furthermore, topics that are assigned Self-harm and Bullying must have dominating terms such as Cut and Bullied, respectively. Comparing the suicide risk factor assignments across the three algorithms, 5 topics are found to be consistent in the results: Depressive Feelings, Drug Abuse, Psychological Disorders, Self-harm, and Bullying. Depressive Symptoms appears to be another topic found by LSA and NMF, and the top contributing terms for that topic include sleeping, alcohol, and empty. In addition, Family Violence/Discord is discovered by LSA. In total, using the three topic identification algorithms, 7 out of the 12 suicide risk factors were identified.

5.2 User Classification
After confirming the definitions of the topics, the user-term matrix A was transformed into the user-topic matrix, on which Decision Tree and K-means Clustering were applied to classify users.

5.2.1 Decision Tree
Decision Tree models were built using the Chi-square Automatic Interaction Detector (CHAID) algorithm with gain ratio as the splitting criterion, which produces a more balanced tree that is less likely to overfit than other splitting criteria. With these parameter settings, the minimum number of members in child and parent nodes was varied across experiments, where the minimum membership of a child node was set to half of that of its parent node, and the parent node membership restriction was varied between 10 and 180.

5.2.2 K-means Clustering
In total, 54 K-means Clustering experiments were performed. For each of the prepared datasets, two K-means Clustering analyses were conducted for each similarity measure, under two assumptions. First, the clusters are well separated and can be partitioned into two groups labeled "AtRisk" and "HighRisk"; therefore, K=2 is chosen for the clustering analysis. Second, the clusters are disjoint; in order to separate all groups and label each of them as "AtRisk" or "HighRisk", K=N is chosen for the clustering analysis, where N is determined by the number of terminal nodes in the corresponding Decision Tree. For example, for the LDA unbalanced dataset, there are 27 terminal nodes in the best decision tree trained; therefore, the K-means clustering is conducted with K=27.

5.3 Classification Results
After the clusters are formed, each cluster's label is based on the majority class label within the cluster. For example, if a cluster is dominated by "AtRisk" users, that cluster will be labeled "AtRisk" and all of its users will be assigned "AtRisk" as their predicted label. Four metrics are used to evaluate the performance of the decision tree and K-means clustering models: precision, sensitivity (recall), specificity, and AUC; to calculate each of these metrics, the "HighRisk" class is treated as the positive class.
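The majority-vote cluster labeling and the metric conventions can be made concrete with the small synthetic example below; the arrays are invented solely to illustrate treating "HighRisk" as the positive class.

# Illustrative sketch: label each cluster by its majority class, then score the
# resulting predictions with "HighRisk" as the positive class.
import numpy as np
from collections import Counter
from sklearn.metrics import precision_score, recall_score

y_true = np.array(["HighRisk", "AtRisk", "AtRisk", "HighRisk", "AtRisk"])
cluster_ids = np.array([0, 0, 1, 1, 1])      # synthetic K-means assignments

# Majority vote inside each cluster becomes the predicted label of its members.
majority = {c: Counter(y_true[cluster_ids == c]).most_common(1)[0][0]
            for c in np.unique(cluster_ids)}
y_pred = np.array([majority[c] for c in cluster_ids])

precision = precision_score(y_true, y_pred, pos_label="HighRisk")
sensitivity = recall_score(y_true, y_pred, pos_label="HighRisk")    # recall
specificity = recall_score(y_true, y_pred, pos_label="AtRisk")      # true negative rate
print(precision, sensitivity, specificity)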

Table 2: Topic Clustering Results

LDA Topics (top contributing terms with probabilities)
1 Depressive Feelings: Sleeping (0.605), Cut (0.347)
2 Drug Abuse: Zoloft (0.328), Prozac (0.298)
3 Psychological Disorders: Panic Disorder (0.411)
4 Self-harm: Cut (0.31)
5 Bullying: Bullying (0.796)
6 Self-harm: Cut (0.395)
7 Self-harm: Cut (0.354), Depressed (0.324)
8 Drug Abuse: Alcohol (0.768)
9 Depressive Feelings: Empty (0.964), Worthless (0.375)
10 Drug Abuse: Pills (0.747)
11 Self-harm: Cut (0.995)
12 Family Violence/Discord: Suicide (0.295)

LSA Topics with variance explained (top contributing terms with loadings)
1 Self-harm (0.463): Cut (0.995)
2 Bullying (0.353): Bully (0.988)
3 Drug Abuse (0.031): Zoloft (0.56), Alcohol (0.475), Prozac (0.372)
4 Psychological Disorders (0.023): Panic Disorder (0.825)
5 Bullying (0.021): Bullied (0.978)
6 Depressive Symptoms (0.017): Sleeping (0.54), Alcohol (0.438), Empty (0.417)
7 Depressive Symptoms (0.015): Sleeping (0.698)
8 Drug Abuse (0.014): Pills (0.75)
9 Drug Abuse (0.012): Pills (0.533), Empty (0.33)
10 Depressive Feelings (0.012): Empty (0.687), Depressed (0.415)
11 Bullying (0.012): Abused (0.997)
12 Psychological Disorders (0.008): Bipolar (0.65), Schizophrenia (0.442)

NMF Topics (top contributing terms with loadings)
1 Self-harm: Cut (2.22)
2 Psychological Disorders: Panic Disorder (0.89)
3 Drug Abuse: Alcohol (0.651)
4 Bullying: Bully (2.275)
5 Depressive Symptoms: Sleeping (0.961)
6 Drug Abuse: Pills (2.154)
7 Depressive Feelings: Empty (2.002)
8 Depressive Feelings: Worthless (2.106), Bullied (0.77)
9 Depressive Feelings: Depressed (1.143)
10 Psychological Disorders: Bipolar (0.572)
11 Depressive Feelings: Anxious (1.513), Hopeless (0.421)
12 Depressive Feelings: Abused (1.246)
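The topic-naming rule from Section 5.1 (keep terms whose weight exceeds 0.25 and name the topic after the risk factor linked to its highest-weighted term) can be sketched as below; the term-to-risk-factor mapping shown is a small illustrative subset, not the full framework of [5].

# Illustrative sketch of the threshold-based topic assignment in Section 5.1.
terms = ["sleeping", "cut", "empty", "bullied"]
term_to_factor = {                    # illustrative subset of the mapping in [5]
    "sleeping": "Depressive Feelings",
    "cut": "Self-harm",
    "empty": "Depressive Feelings",
    "bullied": "Bullying",
}

def assign_topic(weights, threshold=0.25):
    """Name a topic after the risk factor of its highest-weighted term above threshold."""
    kept = [(w, t) for t, w in zip(terms, weights) if w > threshold]
    if not kept:
        return "Unassigned"
    return term_to_factor[max(kept)[1]]

# First LDA topic in Table 2: Sleeping (0.605), Cut (0.347) -> Depressive Feelings.
print(assign_topic([0.605, 0.347, 0.02, 0.01]))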

As shown in Table 3, in general, Decision Tree models perform better than K-means Clustering in identifying "HighRisk" users, as measured by precision. Using the unbalanced dataset, neither Decision Tree nor K-means Clustering is able to clearly distinguish between "AtRisk" and "HighRisk" users.

Decision Tree tends to classify most users as "HighRisk", and therefore all Decision Tree models built on the unbalanced dataset have sensitivity close to one while the specificity is low. On the other hand, K-means Clustering tends to assign all users to the "AtRisk" class, and therefore the K-means Clustering results have zero sensitivity and a specificity of one. Using the balanced dataset whose "AtRisk" users were randomly sampled, Decision Tree models also tend to perform better than K-means Clustering. K-means Clustering with K=N, where N is determined as described in Section 5.2.2, can capture more "HighRisk" users than the Decision Tree model; the K-means Clustering result has a sensitivity of 0.771 while the Decision Tree model has a sensitivity of 0.736. Overall, both K-means clustering and decision tree perform better using the balanced dataset generated by the K-means approach compared to the other datasets. Decision Tree outperforms K-means Clustering in precision, sensitivity, specificity, and AUC. However, using LDA's balanced dataset generated with the K-means approach, K-means Clustering is able to separate the "AtRisk" users better than the Decision Tree; the K-means Clustering result has a specificity of 0.86 compared to 0.829 for the Decision Tree. In addition, K-means Clustering achieves a precision (0.828) that is close to that of the Decision Tree model (0.844).

The K-means Clustering results show that the "AtRisk" and "HighRisk" user clusters are disjoint in the risk factor space and can be properly separated using the clustering approach; specifically, the clustering approach can almost perfectly separate the "AtRisk" users, with 0.993 specificity. Furthermore, such disjoint clusters can be better separated using the supervised approach, which in this study is the Decision Tree model. Among all the Decision Tree models, the one using NMF's balanced dataset generated with the K-means Clustering approach has the best performance of all, with a precision of 0.853, a sensitivity of 0.933, a specificity of 0.836, and an AUC of 0.885.

6. Discussion
The results indicate that by utilizing supervised and unsupervised machine learning algorithms combined with topic identification techniques, users who are "AtRisk" and "HighRisk" of suicidal ideation can be identified from their Twitter data. Using the topic clustering techniques, the results also show that, with minimal human interpretation, 7 out of the 12 suicide risk factors confirmed by suicide researchers were discovered. Without looking at any additional linguistic features of tweets, decision tree models are able to distinguish the "AtRisk" and "HighRisk" users. Additionally, the classification results in combination with the topic clustering provide a qualitative interpretation of the users' psychological state, which could be potentially useful in future prevention efforts. Among the three topic discovery approaches, Latent Semantic Analysis is able to provide information on the dominating topics and the popularly used suicide-related terms. In Table 2, the variance captured by, or the popularity of, each topic is reported next to the topic definition. For example, topic 1, which is "Self-harm" (0.463), indicates that it is a self-harm related topic that captures 0.463 of the variance of the original dataset, and its dominating term is "Cut", which has a weight of 0.994. By observing these variances and term weights, the LSA results show that "Self-harm" is the dominating topic in the dataset, followed by "Bullying" as the second most discussed topic. Between these two topics, "Cut" and "Bully" are the most commonly used terms. While LDA and NMF do not produce ordered topics with a parametric approach, the popularity of the topics discussed can be measured by how frequently a topic is discovered. Since LDA is a generative probabilistic approach and Self-harm appears in 4 out of the 12 topics discovered, the highest frequency compared to other topics, it can be argued that Self-harm is the most probable topic being discussed in the dataset. Among NMF's discovered topics, Depressive Feelings is the dominating topic. Such results, however, do not provide insights into which topics are important in distinguishing between "AtRisk" and "HighRisk" users. Such information can be observed from the decision tree plots.

Among the classification experiments, the decision tree models built using the balanced datasets generated with the K-means approach performed the best. Therefore, conclusions concerning the importance of topics should be drawn from those trees. In a Decision Tree model, the topics that are used to first partition the users are considered the most important topics. Observing the three Decision Tree models' top nodes, LDA's most important topics are Self-harm, Depressive Feelings, Drug Abuse, and Bullying; LSA's most important topics are Drug Abuse, Self-harm, Psychological Disorders, Bullying, and Depressive Symptoms; and NMF's most important topics are Drug Abuse, Psychological Disorders, and Self-harm. Overall, it can be concluded that Self-harm, Drug Abuse, Bullying, Depressive Feelings, Depressive Symptoms, and Psychological Disorders play important roles in determining "AtRisk" and "HighRisk" users. Furthermore, since the ground truth labels are usually unknown in real-world applications, and given the 0.993 specificity achieved by applying K-means clustering to the identified topics, we believe this suicidal ideation detection technique can be studied further and eventually deployed as a real-time application to accomplish timely suicide prevention work.

7. Limitations
The suicide risk factor framework designed by [5] was based on previous research ranging from 1994 to 2012. With the evolving language usage on social media, an up-to-date suicide-related lexicon should be considered for this type of framework development and incorporated into suicide detection and prevention research. This suggests that the dataset used in this study might be missing information that conveys the ideation of the other 5 suicide risk topics. Furthermore, emoticons, hashtags, and other Twitter metadata that could potentially indicate suicide ideation are not included in the dataset. In addition, there is an unsolved issue in the selection of tweets and users who are displaying sarcasm or making statements in jest compared to users who have actual suicidal intent.

As advancements are made in natural language processing techniques, this part of the study can be improved. This study is evaluated against the previously established framework in suicide-related research using Twitter data rather than against the ground truth: whether the Twitter user committed suicide. For this work to be applied in the suicide prevention domain, ground truth data should be collected and used to evaluate the true performance of this approach. Other possibilities could include working to determine the number of different classes within suicidal users. This study operates on the "AtRisk" and "HighRisk" structure and could miss other insights.

8. Conclusion
Suicide is a serious social and economic problem in the United States. Many efforts have been made to study the language formation of suicide-related tweets, and a few have been made to detect suicidal ideation using open social media data. In the most recent studies addressing this task [5, 18], tweets are manually tagged by human annotators before machine learning algorithms are applied to classify whether users are at risk of committing suicide, which is not efficient enough to detect suicidal ideation in support of suicide prevention. In this study, we propose a suicidal ideation detection framework that requires minimal human effort in annotating data by incorporating unsupervised topic discovery algorithms. Three techniques are tested in this study: Latent Semantic Analysis, Latent Dirichlet Allocation, and Non-Negative Matrix Factorization. Using these algorithms, we were able to discover 7 out of the 12 suicide risk factors proposed by [5], and using those topics, we were able to represent user profiles in a more compact format. Furthermore, by conducting K-means clustering analysis on the transformed datasets, we concluded that the "AtRisk" and "HighRisk" user groups are disjoint but cannot be well distinguished by partitioning them into clusters. However, as shown in the experimental results using Decision Tree, we were able to achieve 0.844 precision, 0.912 sensitivity, and 0.829 specificity, where "HighRisk" users are the positive class. This framework shows that with minimal human interpretation of social media data, it is possible to detect suicidal ideation using a combination of supervised and unsupervised machine learning algorithms.

Table 3: User Classification Results
(Balanced sets: R = random sampled, K = K-means sampled. Column groups: K-Means with K=2, K-Means with K=N, and Decision Trees, each over the Unbalanced, Balanced (R), and Balanced (K) datasets.)

                      K-Means (K=2)               K-Means (K=N)               Decision Trees
                      Unbal.  Bal.(R) Bal.(K)     Unbal.  Bal.(R) Bal.(K)     Unbal.  Bal.(R) Bal.(K)
LSA  Precision        0.000   0.511   0.546       0.000   0.631   0.647       0.865   0.747   0.845
     Sensitivity      0.000   0.886   0.739       0.000   0.689   0.746       0.993   0.750   0.839
     Specificity      1.000   0.154   0.396       1.000   0.596   0.600       0.111   0.746   0.843
     AUC              0.500   0.520   0.568       0.500   0.643   0.673       0.552   0.748   0.841
LDA  Precision        0.000   0.538   0.648       0.686   0.554   0.828       0.881   0.640   0.844
     Sensitivity      0.000   0.432   0.571       0.086   0.771   0.686       0.979   0.736   0.912
     Specificity      1.000   0.629   0.695       0.993   0.379   0.860       0.239   0.586   0.829
     AUC              0.500   0.530   0.633       0.539   0.575   0.773       0.609   0.661   0.870
NMF  Precision        0.000   0.520   0.606       0.556   0.575   0.734       0.873   0.711   0.853
     Sensitivity      0.000   0.882   0.604       0.018   0.661   0.700       0.984   0.739   0.933
     Specificity      1.000   0.186   0.614       0.998   0.511   0.751       0.171   0.700   0.836
     AUC              0.500   0.534   0.609       0.508   0.586   0.725       0.578   0.720   0.885

References
1. Centers for Disease Control and Prevention, "Suicide Data Sheet," 1 January 2015. Available: https://www.cdc.gov/violenceprevention/pdf/suicide-datasheet-a.pdf (accessed 26 December 2017).
2. G. Plemmons, M. Hall, W. Browning, et al., "Trends in Suicidality and Serious Self-Harm for Children 5-17 Years at 32 U.S. Children's Hospitals, 2008-2015," in Pediatric Academic Societies 2017, Toronto, 2017.
3. D. Luxton, J. D. June, and J. M. Fairall, "Social media and suicide: a public health perspective," American Journal of Public Health, vol. 102, no. S2, pp. S195-S200, 2012.
4. Pew Research Center, "24% of online adults (21% of all Americans) use Twitter," 10 November 2016. Available: http://www.pewinternet.org/2016/11/11/social-media-update-2016/pi_2016-11-11_social-media-update_0-04/ (accessed 26 December 2017).
5. J. Jashinsky, S. H. Burton, C. L. Hanson, et al., "Tracking suicide risk factors through Twitter in the US," Crisis, vol. 35, no. 1, pp. 51-9, 2014.
6. B. Stelter, "Web Suicide Viewed Live and Reaction Spur a Debate," 24 November 2008. Available: http://www.nytimes.com/2008/11/25/us/25suicides.html (accessed 26 December 2017).
7. Twitter, "About self-harm and suicide," 1 January 2017. Available: https://support.twitter.com/articles/20170313 (accessed 26 December 2017).
8. S. Fodeh, J. Goulet, C. Brandt, et al., "Leveraging Twitter to better identify suicide risk," in Machine Learning Research, Halifax, 2017.
9. R. Grant, D. Kucher, A. León, et al., "Automatic Extraction of Informal Topics from Online Suicidal Ideation," in 11th International Workshop on Data and Text Mining in Biomedical Informatics, Singapore, 2017.
10. S. Shiffman, A. A. Stone, and M. R. Hufford, "Ecological momentary assessment," Annual Review of Clinical Psychology, vol. 4, no. 1, pp. 1-32, 2008.
11. G. Coppersmith, K. Ngo, R. Leary, et al., "Exploratory analysis of social media prior to a suicide attempt," in Third Workshop on Computational Linguistics and Clinical Psychology, San Diego, 2016.
12. J. F. Gunn and D. Lester, "Twitter postings and suicide: An analysis of the postings of a fatal suicide in the 24 hours prior to death," Suicidologi, vol. 17, no. 3, pp. 28-30, 2012.
13. S. R. Braithwaite, C. Giraud-Carrier, J. West, et al., "Validating machine learning algorithms for Twitter data against established measures of suicidality," JMIR Mental Health, vol. 3, no. 2, p. e21, 2016.
14. B. O'Dea, M. E. Larsen, P. J. Batterham, et al., "A linguistic analysis of suicide-related Twitter posts," Crisis, vol. 38, no. 5, pp. 319-329, 2017.
15. B. O'Dea, S. Wan, P. J. Batterham, et al., "Detecting suicidality on Twitter," Internet Interventions, vol. 2, no. 2, pp. 183-188, 2015.
16. X. Huang, L. Zhang, D. Chiu, et al., "Detecting suicidal ideation in Chinese microblogs with psychological lexicons," in 2014 IEEE International Conferences on Ubiquitous Intelligence and Computing, Autonomic and Trusted Computing, and Scalable Computing and Communications and Its Associated Workshops (UTC-ATC-ScalCom), Bali, 2014.
17. T. Liu, Q. Cheng, C. M. Homan, et al., "Learning from various labeling strategies for suicide-related messages on social media: An experimental study," CoRR, vol. abs/1701.08796, 2017.
18. P. Burnap, W. Colombo, and J. Scourfield, "Machine classification and analysis of suicide-related communication on Twitter," in 26th ACM Conference on Hypertext & Social Media, New York, 2015.
19. K. D. Rosa, R. Shah, B. Lin, et al., "Topical clustering of tweets," in 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, Beijing, 2011.
20. V. Rus, N. Niraula, and R. Banjade, "Similarity measures based on latent dirichlet allocation," in International Conference on Intelligent Text Processing and Computational Linguistics, Samos, 2013.
