Using Machine Learning Algorithms to Enhance the Management of
Suicide Ideation
Sinisa Colic*, J. D. Richardson, James P. Reilly and Gary M. Hasey
Abstract— Combat veterans, especially those with mental health conditions, are an at-risk group for suicidal ideation and behaviour. This study uses machine learning algorithms to predict suicidal ideation (SI) in a treatment-seeking veteran population. Questionnaire data from 738 patients consisting of veterans, still-serving members of the Canadian Forces (CF), and the Royal Canadian Mounted Police (RCMP) were examined to determine the likelihood of suicide ideation and to identify key variables for tracking the risk of suicide. Unlike conventional approaches, we use pattern recognition methods, known collectively as machine learning (ML), to examine multivariate data and identify patterns associated with suicidal ideation. Our findings show that accurate prediction of SI of over 84.4% can be obtained with 25 variables, and 81% using as few as 10 variables, primarily drawn from the Patient Health Questionnaire (PHQ). Surprisingly, the best identifiers of SI came not from occupational experiences but from patient quality of health, suggesting that these findings could be applied to the general population. Our results suggest that ML could assist clinicians in developing a better screening aid for suicidal ideation and behaviour.

I. INTRODUCTION

Suicide attempts are a major public health concern, especially for the veteran population and military personnel. In the USA, the number of military personnel who die by suicide exceeds the number killed in combat [1]. Our ability to predict suicide attempts has been near chance levels for several decades [2]. To date there have been several attempts at identifying risk factors for suicidality.

In a dataset consisting of young adults in the military, it has been shown that sleep problems outperform depression and hopelessness as predictors of suicide [3]. Another key factor believed to be linked to suicide attempts is deployment experience, with 7-13% of Canadian military personnel developing post-traumatic stress disorder (PTSD) [4]. Of those with PTSD symptoms, only about two thirds receive treatment, at least in part because the diagnosis was never made [5]. Other factors linked to suicide include anxiety, depression, hopelessness, and drug and alcohol abuse [3]. However, the search for variables predictive of suicide has mainly been conducted with standard multivariate statistical approaches using general linear models [2, 3]. The limitation of many of these studies has been that they tested predictors in isolation [2].

An alternative approach, using machine learning (ML) techniques, holds more promise as a predictor for the individual subject. Physicians are limited to examining a few variables at a time, whereas ML techniques have the ability to search for relevant diagnostic patterns using all available information, not just information related to a particular hypothesis. This process produces results which may be regarded as reliable, as evidenced by successful use in oncology, e.g. to assist with diagnosis [6], to predict response to chemotherapeutic agents [7], and to estimate survival [8], where accuracies range from 85% to 95%. Our own group has created ML algorithms that can diagnose major depression, bipolar disorder, and schizophrenia, and separate these from healthy volunteers with 90% accuracy [9]. Our algorithms can also predict, with 78-88% accuracy, an individual patient's response to psychotropic drugs [10], repetitive transcranial magnetic stimulation [11] or psychotherapy. Others have shown that ML approaches can identify suicide attempters from non-attempters with prediction accuracy ranging between 65% and 75% [12] in a dataset of 144 patients with mood disorders. In the military context, ML analysis has yielded algorithms predictive of the appearance of PTSD with accuracy of 75% using emergency room records [13] and 80-90% using pre- and early post-deployment questionnaires [14]. Others have employed ML analysis of resting functional magnetic resonance imaging data to accurately differentiate subjects with PTSD from those without [15]. The range of deployment-related events that are viewed as traumatic is huge, but ML analysis has been able to identify the subtypes most closely associated with the development of PTSD [16]. These studies highlight the utility of ML methods as a means of using large databases to accurately predict the risk of psychopathology such as PTSD, depression and suicide in an individual.

In this study we examined a dataset of 738 patients with 224 variables ranging from demographic information and general health and well-being to deployment-related experience and a PTSD scale. Applying variable reduction techniques, we select a subset of critical variables for diagnosing the risk of suicide and train random forest (RF) algorithms to predict suicide ideation in at-risk personnel.

Manuscript sent Feb 15, 2018. Asterisk indicates corresponding author.
*S. Colic and J. P. Reilly are with the Electrical and Computer Engineering Department, McMaster University, Hamilton, ON, L8S 4K1, Canada (emails: [email protected], [email protected]).
*S. Colic and G. M. Hasey are with the Department of Psychiatry and Behavioural Neurosciences, McMaster University, and also with the Mood Disorders Program, St. Joseph Hospital, Hamilton, ON (emails: [email protected], [email protected]).
J. D. Richardson is with the Parkwood OSI Clinic, St. Joseph's Health Care London, ON, Canada N6A 4V2 (email: [email protected]).

978-1-5386-3646-6/18/$31.00 ©2018 IEEE
II. METHODS

A. Participants

A database of 738 subjects was obtained from the Parkwood Operational Stress Injury (OSI) Clinic in London, Ontario, in collaboration with Western University. The Parkwood OSI Clinic is funded by Veterans Affairs Canada and provides specialized mental health services for veterans, the Canadian Forces, and the Royal Canadian Mounted Police (RCMP) experiencing service-related mental health conditions.

B. Questionnaire Variables

The dataset contains 224 variables covering the demographic information shown in Table I; the SF-36, a general health and well-being scale; the Patient Health Questionnaire (physical symptoms, energy, anxiety and stress); alcohol and drug use questionnaires; the PCL-M, the military version of a posttraumatic stress disorder (PTSD) scale; and questions related to deployment experiences, including head injuries.

TABLE I
DEMOGRAPHIC CHARACTERISTICS

                              not SI (n = 407)    SI (n = 331)
Age at Intake                 43.9 +/- 14.9       45.0 +/- 12.6
Years of Service              14.6 +/- 9.8        13.4 +/- 9.0
Gender
  Male                        90.9%               89.7%
  Female                      9.1%                10.3%
Relationship Status
  In a relationship           57.8%               54.9%
  Not in a relationship       42.2%               45.1%
Number of Deployments         2.22 +/- 2.37       2.38 +/- 3.27
Duty
  Regular                     57.8%               69.4%
  Reserves                    20.5%               10.4%
  Civilian                    21.7%               20.2%
Post-Secondary Education      54.7%               52.8%
Income < $40,000              25.6%               32.6%

C. Preprocessing of Data Variables

To facilitate the use of machine learning techniques, the data were preprocessed to ensure each variable was represented by a numerical value. A significant portion of the data were incomplete, and default values were generated based on expert opinion. For example, deployment history was broken out by region; if a region had not been ticked off, it was assumed the participant had not been deployed to that region. There were several variables for which default values could not be provided, and these were left incomplete. To address this, a second stage of preprocessing was applied in which any variable with 60 or more missing entries was excluded. Then any participant sample that still had missing variable data was excluded from the computations. This resulted in a final data set of 620 participants with 224 variables. Once the top variables were identified using feature reduction techniques, it was possible to go back and use all 738 samples in the training. The comparison of these results is provided in the results section.

D. Selection of Essential Variables

The next step of the preprocessing involved reducing the number of features from 224 to identify the most important variables. For this purpose we used the minimum-redundancy-maximum-relevance (mRMR) feature selection technique [17] to reduce the number of variables, for two purposes: 1) lower-complexity models show greater discriminative ability and reduce computational load; 2) feature selection can aid in the interpretation of the model by highlighting important features for discriminating between the different classes.

The mRMR technique achieves feature selection by reducing the number of candidate features Nc to a smaller subset of chosen or relevant features Nr. The procedure iterates through Nr steps, selecting one feature at each step by measuring mutual information to find the feature with maximum statistical dependence on the label data, while at the same time minimizing dependence with respect to the set of features already chosen in the preceding steps.

The feature selection process was combined with a bootstrapping technique which selected batches of samples using 5-fold subsampling to identify key features. Batches were randomly selected, the process was repeated 1000 times, and top candidate features were tracked by the number of times they were selected and by their order of ranking. Using this process we selected the top 25 features.

E. Random Forest

Random forest (RF) classifiers, proposed initially by Breiman [18], were used to model the data for suicidal ideation. An RF is made up of individual classification trees, each independently able to make classification decisions. The core of the random forest classifier is the binary decision tree, a data structure that stores elements hierarchically in nodes. Each decision tree is grown on a different bootstrapped sample collection (i.e., instances randomly drawn with replacement from the original dataset) using a randomly selected subset of all available predictors. The random selection of predictors increases the generalizability of the individual decision trees, whereas the aggregation of multiple decision trees in one forest increases model performance. As a result, RFs are well suited to nonlinear, high-dimensional feature spaces. RFs also have fewer parameters, which greatly reduces the parameter search required.

Unlike many other available classifiers, RF algorithms can be trained on unbalanced data and can handle missing data, which makes them well suited to our dataset.

F. Testing and Validation of Model

Training of the RF model consisted of first separating the data using the standard 5-fold cross-validation technique, where 4/5 of the data were used for training and 1/5 for validation.
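The greedy mRMR selection loop described in Section D can be sketched in Python. This is a minimal illustration, not the authors' implementation: the plug-in mutual-information estimator, the toy features, and the simple relevance-minus-mean-redundancy score are all assumptions for the sketch.

```python
import numpy as np

def mutual_info(x, y):
    """Plug-in mutual information (in nats) between two discrete 1-D arrays."""
    n = len(x)
    joint = {}
    for xi, yi in zip(x, y):
        joint[(xi, yi)] = joint.get((xi, yi), 0) + 1
    px = {v: np.mean(x == v) for v in set(x)}
    py = {v: np.mean(y == v) for v in set(y)}
    mi = 0.0
    for (xi, yi), count in joint.items():
        pxy = count / n
        mi += pxy * np.log(pxy / (px[xi] * py[yi]))
    return mi

def mrmr_select(X, y, n_selected):
    """Greedy mRMR: at each step pick the feature maximizing
    relevance to the label minus mean redundancy with chosen features."""
    n_features = X.shape[1]
    relevance = [mutual_info(X[:, j], y) for j in range(n_features)]
    chosen = [int(np.argmax(relevance))]  # start with the most relevant feature
    while len(chosen) < n_selected:
        best_j, best_score = None, -np.inf
        for j in range(n_features):
            if j in chosen:
                continue
            redundancy = np.mean([mutual_info(X[:, j], X[:, k]) for k in chosen])
            score = relevance[j] - redundancy
            if score > best_score:
                best_j, best_score = j, score
        chosen.append(best_j)
    return chosen

# Toy data: feature 0 is a noisy copy of the label, feature 1 duplicates
# feature 0 (redundant), feature 2 is pure noise.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 500)
f0 = np.where(rng.random(500) < 0.9, y, 1 - y)  # informative
f1 = f0.copy()                                   # redundant with f0
f2 = rng.integers(0, 2, 500)                     # noise
X = np.column_stack([f0, f1, f2])

picked = mrmr_select(X, y, 2)
print(picked)  # [0, 2]: redundancy penalty skips the duplicate feature
```

Note how the redundancy term makes mRMR prefer an uninformative but novel feature over an exact duplicate of one already chosen, which is the behaviour that distinguishes it from ranking features by relevance alone.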
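The random forest training and 5-fold AUC evaluation of Sections E and F can likewise be sketched with scikit-learn. The synthetic dataset below is an illustrative stand-in for the questionnaire variables; only the 100-tree setting and the 5-fold ROC AUC scoring follow the procedure described in the text.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the questionnaire data: 620 samples, 25 variables,
# with mildly unbalanced classes as in Table I.
X, y = make_classification(n_samples=620, n_features=25, n_informative=10,
                           weights=[0.55, 0.45], random_state=0)

# 100 trees, per the parameter search (no improvement observed beyond 100).
clf = RandomForestClassifier(n_estimators=100, random_state=0)

# Standard 5-fold cross-validation: 4/5 train, 1/5 validation per fold,
# scored by area under the ROC curve.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auc_scores = cross_val_score(clf, X, y, cv=cv, scoring="roc_auc")
print(f"mean AUC: {auc_scores.mean():.3f} +/- {auc_scores.std():.3f}")
```

Averaging the per-fold AUC scores mirrors how the final ROC result in the paper is reported as the mean over the five validation folds.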
Suicidal ideation labels were obtained using the self-reported score on item PHQ-2i (thinking you would be better off dead or of hurting yourself in the past two weeks) from the Patient Health Questionnaire (PHQ).

The number-of-trees parameter of the RF model was optimized by iteratively searching from 50 to 400 trees. As performance did not improve beyond 100 trees, the number of trees was fixed at 100 for the remainder of the testing.

Sensitivity and specificity were evaluated using receiver operating characteristic (ROC) curves for each cross-validation fold. The final ROC result represents the average over all five folds.

Fig. 1. Variable reduction through a combined mRMR and bootstrapping technique. Samples are randomized 1000 times into 5 equal batches. A) The occurrence count of each of the 224 variables; the top 25 variables were selected using a threshold. B) The top 25 variables were ranked by their order of occurrence in the mRMR variable selection. A sample result for PHQ-2f is shown.

III. RESULTS

The results of the batch bootstrapping technique using mRMR are shown in Fig. 1a. A threshold of 900 selections was used to identify the top 25 variables. Ranking the top 25 variables showed that the top indicators of suicide ideation come from the PHQ questions relating to patient quality of health. These include PHQ-2f (feeling bad/like a failure/let people down), PHQ-4g (nausea/upset stomach during last attack), PHQ-8 (are you taking medication for anxiety, depression, or stress), and PHQ-1j (feeling heart race/pound): all signs that emotional difficulty has started to affect the physical operation of the body.

Two-dimensional feature projections using t-SNE are shown in Fig. 2a. The projections show that the data are separable even with only 25 variables, suggesting that the additional variables can be omitted when making the diagnosis.

Training an RF classifier on the 25 variables and 643 samples showed favourable classification results (Fig. 2b), with the area under the curve (AUC) achieving an average score of 83.6%, an improvement over the 81.3% achieved when using all 224 variables. Reducing the number of variables further suggested that AUC scores greater than 81% could be achieved with only the 10 variables identified in Table II.

Comparing the classifier performances pre- and post-processing of missing data (Fig. 2b) shows a slight improvement post-processing, with AUC scores of 84.4% using 25 variables and 81.0% using only 10 variables.

TABLE II
TOP 10 RANKED VARIABLES

Rank  Variable  Description
1     PHQ_2f    Feeling bad/like a failure/let people down
2     PHQ_4g    Nausea/upset stomach during last attack
3     PHQ_8     Are you taking any medication for anxiety, depression, or stress?
4     PHQ_1j    Feeling heart race/pound
5     PHQ_6j    Thinking or dreaming about something terrible that happened to you in the past
6     PCL_3     Suddenly acting or feeling as if a stressful military experience was happening again
7     PHQ_9     Have you ever seen or talked to a health professional about your mental or emotional health?
8     PCL_12    Feeling as if your future will somehow be cut short
9     PHQ_3d    Worried about having another one
10    PHQ_6d    Difficulties with spouse/partner/lover/etc.

IV. DISCUSSION

Using machine learning analysis of intake questionnaire data from 738 military veterans, we identified 25 non-suicide-related questions that can be used to identify at-risk personnel with suicidal ideation, with an average ROC AUC score of 84.4%, or 81.0% for the top 10 non-suicide-related questions. In addition, we show that our algorithm can handle missing data with no reduction in performance. This allows us to increase the number of samples we can test on, as in practice such datasets are never complete.

It is interesting to note that suicide ideation had little relation to the deployment history of the soldiers; rather, it was quality of health that showed the highest indication of SI. These questions could also be applied to the general public to determine whether the same performance would be achieved. Our results suggest that we can predict SI without directly asking questions related to suicide itself. This would allow the identification of at-risk individuals in situations where admission of SI might be a deterrent to career
advancement, such as with pilots.

An algorithm that could identify vulnerable individuals at risk for suicidal behaviour would be extremely helpful to clinicians and administrators. The benefits, in terms of preventing suicides and reducing "collateral damage" to military families, would be immense.

These results suggest that ML could be used to screen veterans for suicidal risk and provide an opportunity for early preventative interventions. Ultimately, we hope that ML-based techniques will provide a useful tool to improve the health and well-being of Canadian military personnel, veterans and their families by maximizing evidence-informed practices, policies and programs.

Fig. 2. Validation of the variable reduction and machine learning methodology. A) Variable reduction shows good separation of the SI and not-SI groups using 25 variables. B) Training a random forest classifier on all the data variables showed inferior results compared to training on the top variables. Further improvements were obtained by taking advantage of random forests' ability to deal with missing data, resulting in an AUC score of 84.4% +/- 4.4 with 25 variables and 81% +/- 2.7 with only 10 variables.

ACKNOWLEDGMENT

This work was supported by the Interdisciplinary Research Fund at McMaster University. We would like to thank the Parkwood OSI Clinic for providing access to their data, without which this work would not be possible.

REFERENCES

[1] G. Zoroya, "Suicide surpassed war as the military's leading cause of death," USA Today, 2014.
[2] J. C. Franklin, J. D. Ribeiro, K. R. Fox, K. H. Bentley, E. M. Kleiman, X. Huang, K. M. Musacchio, A. C. Jaroszewski, B. P. Chang, and M. K. Nock, "Risk factors for suicidal thoughts and behaviors: A meta-analysis of 50 years of research," Psychological Bulletin, vol. 143, no. 2, pp. 187, 2017.
[3] J. D. Ribeiro, J. L. Pease, P. M. Gutierrez, C. Silva, R. A. Bernert, M. D. Rudd, and T. E. Joiner, "Sleep problems outperform depression and hopelessness as cross-sectional and longitudinal predictors of suicidal ideation and behavior in young adults in the military," Journal of Affective Disorders, vol. 136, no. 3, pp. 743-750, 2012.
[4] L. A. Hines, J. Sundin, R. J. Rona, S. Wessely, and N. T. Fear, "Posttraumatic stress disorder post Iraq and Afghanistan: prevalence among military subgroups," The Canadian Journal of Psychiatry, vol. 59, no. 9, pp. 468-479, 2014.
[5] D. Fikretoglu and A. Liu, "Prevalence, correlates, and clinical features of delayed-onset posttraumatic stress disorder in a nationally representative military sample," Social Psychiatry and Psychiatric Epidemiology, vol. 47, no. 8, pp. 1359-1366, 2012.
[6] L. Li, Q. Zhang, Y. Ding, H. Jiang, B. H. Thiers, and J. Z. Wang, "Automatic diagnosis of melanoma using machine learning methods on a spectroscopic system," BMC Medical Imaging, vol. 14, no. 1, pp. 36, 2014.
[7] S. N. Dorman, K. Baranova, J. H. Knoll, B. L. Urquhart, G. Mariani, M. L. Carcangiu, and P. K. Rogan, "Genomic signatures for paclitaxel and gemcitabine resistance in breast cancer derived by machine learning," Molecular Oncology, vol. 10, no. 1, pp. 85-100, 2016.
[8] M. Montazeri, M. Montazeri, M. Montazeri, and A. Beigzadeh, "Machine learning models in breast cancer survival prediction," Technology and Health Care, vol. 24, no. 1, pp. 31-42, 2016.
[9] A. Khodayari-Rostamabad, J. P. Reilly, G. Hasey, and D. MacCrimmon, "Diagnosis of psychiatric disorders using EEG data and employing a statistical decision model," pp. 4006-4009.
[10] A. Khodayari-Rostamabad, J. P. Reilly, G. M. Hasey, H. de Bruin, and D. J. MacCrimmon, "A machine learning approach using EEG data to predict response to SSRI treatment for major depressive disorder," Clinical Neurophysiology, vol. 124, no. 10, pp. 1975-1985, 2013.
[11] A. Khodayari-Rostamabad, J. P. Reilly, G. M. Hasey, H. de Bruin, and D. MacCrimmon, "Using pre-treatment electroencephalography data to predict response to transcranial magnetic stimulation therapy for major depression," pp. 6418-6421.
[12] I. C. Passos, B. Mwangi, B. Cao, J. E. Hamilton, M.-J. Wu, X. Y. Zhang, G. B. Zunta-Soares, J. Quevedo, M. Kauer-Sant'Anna, and F. Kapczinski, "Identifying a clinical signature of suicidality among patients with mood disorders: a pilot study using a machine learning approach," Journal of Affective Disorders, vol. 193, pp. 109-116, 2016.
[13] K.-I. Karstoft, I. R. Galatzer-Levy, A. Statnikov, Z. Li, and A. Y. Shalev, "Bridging a translational gap: using machine learning to improve the prediction of PTSD," BMC Psychiatry, vol. 15, no. 1, pp. 30, 2015.
[14] K.-I. Karstoft, A. Statnikov, S. B. Andersen, T. Madsen, and I. R. Galatzer-Levy, "Early identification of posttraumatic stress following military deployment: application of machine learning methods to a prospective study of Danish soldiers," Journal of Affective Disorders, vol. 184, pp. 170-175, 2015.
[15] F. Liu, B. Xie, Y. Wang, W. Guo, J.-P. Fouche, Z. Long, W. Wang, H. Chen, M. Li, and X. Duan, "Characterization of post-traumatic stress disorder using resting-state fMRI with a multi-level parametric classification approach," Brain Topography, vol. 28, no. 2, pp. 221-237, 2015.
[16] R. C. Kessler, S. Rose, K. C. Koenen, E. G. Karam, P. E. Stang, D. J. Stein, S. G. Heeringa, E. D. Hill, I. Liberzon, and K. A. McLaughlin, "How well can post-traumatic stress disorder be predicted from pre-trauma risk factors?," World Psychiatry, vol. 13, no. 3, pp. 265-274, 2014.
[17] H. Peng, F. Long, and C. Ding, "Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 8, pp. 1226-1238, 2005.
[18] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5-32, 2001.