Machine Learning Approaches To Predict Asthma Exac
https://doi.org/10.1007/s12325-023-02743-3
REVIEW
N. A. Molfino (corresponding author)
Global Development, Amgen Inc., One Amgen
Center Dr, Thousand Oaks, CA 91320, USA
e-mail: [email protected]
G. Turcatel
Digital Health and Innovation, Amgen Inc.,
Thousand Oaks, CA, USA
D. Riskin
Verantos, Inc., Palo Alto, CA, USA
Adv Ther
with future attacks included the following: history of exacerbations, lung function, smoking status, blood eosinophilia, age, sex and history of rhinitis, nasal polyps, eczema, gastroesophageal reflux disease, and obesity. The final validated models utilized 19 and 16 risk factors for two or more (area under the receiver operating characteristic curve (AUC) 0.785) and four or more (AUC 0.867) attacks over 2 years, respectively [18].

In addition, Noble et al. trained a multivariable logistic regression to predict asthma events (accident and emergency attendance, hospitalization, or death) on a dataset of 61,861 people with asthma extracted from England and Scotland using the Clinical Practice Research Datalink. The fitted model was validated on 174,240 patients with asthma from Wales and had an AUC of 0.71. Older and underweight patients who smoked and had a history of blood eosinophilia and past asthmatic attacks had the highest probability of experiencing asthma-related crisis events [19]. Since clinicians are not as well suited to the implementation of complex algorithms as machines, assessment of the aforementioned features may not be sufficient to alert patients to their risk of asthma exacerbation. Machines may be unable to implement complex algorithms because data may be inconsistently captured or maintained in an unstructured format [20]. Indeed, additional factors beyond past medical history, current symptoms, PFTs, and even biomarkers may be required for predicting future exacerbations. The patient's access to medical care, genetic makeup and expression, habits, and environmental exposures may also impact their asthma prognosis. Thus, there is a need to integrate the many characteristics of the "asthmatic patient" using algorithms trained and tested in large, independent datasets to arrive at an accurate prognosis of the disease and potentially augment our ability to predict exacerbations for each patient individually.

Machine learning (ML) and deep learning (DL) are fields within artificial intelligence (AI) that are increasingly being used in medicine [21, 22]. However, one of the challenges in ML is the "no free lunch" theorem, which states that there is no single algorithm that will fit all datasets or solve all questions [23]. There have been efforts to, at a minimum, correlate symptoms and airway function with asthma exacerbations using ML [24]. Although the analysis of lung images was one of the first areas of respiratory medicine to demonstrate increased diagnostic accuracy with the use of neural networks [25], the prediction of acute medical events has more recently become a major unmet need, particularly during the COVID-19 pandemic [26]. This need to establish an accurate prognosis of serious conditions, such as asthma exacerbations, still persists today. In this review, we summarize recent studies that have demonstrated the ability to predict asthma exacerbations using different algorithms and propose a few next steps for this field.

COMPLEXITY OF EXACERBATIONS

More than 10 million asthma exacerbations occurred in the USA in 2020 [3]. There is a need not only to accurately predict who is at risk of exacerbations [27] but also to avoid overdiagnosis and overtreatment of asthma [28]. Asthma exacerbations are defined by symptoms of increased severity that require treatment with systemic corticosteroids for at least three consecutive days [29]. Short bursts of systemic corticosteroids are effective to treat acute asthma, but they tend to be overprescribed and can cause serious side effects [28]. Furthermore, patients with poorly controlled asthma can progress to develop exacerbations due to lack of access to medical care and proper therapies [30], or lack of compliance to prescribed treatments [14]. Indeed, providing proper medications has been shown to reduce the incidence of exacerbations in some geographies [31–33]. However, when applied broadly, such therapeutic approaches incorrectly assume that every person with asthma is at risk of exacerbations, resulting in overtreatment, which in turn can lead to the buildup of unnecessary costs and an increased incidence of treatment-emergent adverse events. Incidentally, among well-treated patients with asthma, exacerbations are ultimately triggered by environmental insults
Fig. 1 The constellation of factors influencing the asthmatic patient's journey that may be analyzed using artificial intelligence to augment clinical decision-making. Patients with asthma present a unique collection of factors that influence their disease severity and outcome. The cornerstones of traditional patient management consist of assessments of pulmonary function (i.e., spirometry measuring lung capacity and FEV1), biomarker levels (e.g., blood eosinophils, IgE, FeNO), and patient-reported symptoms (e.g., cough, wheezing, shortness of breath, sleep). However, a myriad of factors together influence the individual patient's risk of asthma exacerbation and disease progression, including both intrinsic factors such as genomics, comorbidities, and allergic status, and extrinsic factors such as occupational and environmental exposures, access to the healthcare system and medications, and geography. Because asthma exacerbations are complex and multidimensional phenomena, the use of artificial intelligence may better predict such events if proper inputs (including intrinsic and extrinsic factors) are provided to train, validate, and test next-generation treatment algorithms. IgE immunoglobulin E, FeNO fractional exhaled nitric oxide, FEV1 forced expiratory volume in 1 s
reward, or not (Fig. 2c). The latter has not been used in medicine yet, as far as we know.

Deep learning is a type of ML that involves training artificial neural networks (ANNs) containing several hidden layers on a large dataset. These ANNs learn to predict the outcome variable, which can be either continuous or categorical, on the basis of the input. In asthma, DL models have been trained on patient data for demographics, medical history, asthma
Table 1 Commonly used machine learning algorithms (with Python and R codes)

| Algorithms | Objectives | References |
|---|---|---|
| Linear regression | Used to estimate real values (the dependent variable, such as predicted FEV1) based on continuous variables (independent variables, such as height, age, etc.) [46]. It can be simple, or multiple if there is more than one independent variable | Frank E. Harrell Jr. Regression modeling strategies: with applications to linear models, logistic and ordinal regression, and survival analysis. Springer International; 2015 |
| Logistic regression^a | A classification algorithm used to estimate discrete values (yes/no, true/false, etc.) [46]. Since it predicts the probability of developing a certain condition, the values are between 0 and 1 | Frank E. Harrell Jr. Regression modeling strategies: with applications to linear models, logistic and ordinal regression, and survival analysis. Springer International; 2015 |
| Support vector machine (SVM) | A classification algorithm in which each data item is plotted as a point whose coordinates are the values of its features; the points closest to the separating boundary are known as support vectors [47]. Data are split by a line (called the classifier) equidistant to the closest point from each group | Ingo Steinwart, Andreas Christmann. Support vector machines. Springer New York; 2008 |
| Naïve Bayes | A classification algorithm that assumes independence between predictors [48]. It is simple, useful in large datasets, and can outperform more sophisticated classification methods in some settings | Thomas Mitchell. Machine learning. McGraw-Hill Education; 1997 |
| Decision tree | A supervised learning algorithm mostly used for classification problems, with both categorical and continuous dependent variables [49]. It splits populations into homogeneous sets based on significant attributes (independent variables) | Clinton Sheppard. Tree-based machine learning algorithms: decision trees, random forests, and boosting. CreateSpace; 2017 |
| Random forest^a | An ensemble of decision trees. It classifies a new object on the basis of attributes; each tree "votes", with the forest choosing the classification having the most votes. Combining trees improves the accuracy of the model [50] | Chris Smith, Mark Koning. Decision trees and random forests: a visual introduction for beginners. Blue Windmill Media; 2017 |
| Gradient boosting machines (GBM)^a | Has high prediction power and is used on large datasets. By combining learning algorithms, gradient boosting works well with scientific data. XGBoost has high predictive power and accuracy; regularized boosting helps reduce overfitting [51]. LightGBM is a faster algorithm that uses tree-based algorithms [52] | Corey Wade, Kevin Glynn. Hands-on gradient boosting with XGBoost and Scikit-learn: perform accessible machine learning and extreme gradient boosting with Python. Packt; 2020. Ke G, Meng Q, Finley T, et al. Lightgbm: a highly efficient gradient boosting decision tree. Adv Neural Inf Process Syst 2017;30:3149–3157 |
| Recurrent neural networks (RNN)^a | Designed to process sequential data such as time series. Have been used to predict hospital readmissions [53] | Ralf C. Staudemeyer, Eric Rothstein Morris. Understanding LSTM – a tutorial into long short-term memory recurrent neural networks. arXiv 1909.09586 [preprint]. 2019 |
| Long short-term memory (LSTM)^a | A type of RNN designed to handle long-term dependencies in sequential data [53]. Has been used to predict disease progression | Ralf C. Staudemeyer, Eric Rothstein Morris. Understanding LSTM – a tutorial into long short-term memory recurrent neural networks. arXiv 1909.09586 [preprint]. 2019 |
| K-nearest neighbors (KNN) | Can be used for classification or regression problems. In classification, it stores all cases and classifies new cases by a majority vote of k-neighbors measured by a distance function [54] | Antonio Mucherino, Petraq J. Papajorgji, Panos M. Pardalos. Data mining in agriculture. Springer Science & Business Media; 2009 |
| K-means clustering | An unsupervised algorithm that solves clustering problems by picking k points known as centroids; each data point forms a cluster with the closest centroid [55]. The sum of the squared differences between the centroid and the data points within a cluster is used to assess the choice of k | Swati Patel. K-means clustering algorithm: implementation and critical analysis. Scholars; 2019 |
| Dimensionality reduction | It identifies highly significant variables from vast data sets where the data are unstructured or in great detail [56] | Benyamin Ghojogh, Mark Crowley, Fakhri Karray, Ali Ghodsi. Elements of dimensionality reduction and manifold learning. Springer Nature; 2023 |

^a Most commonly used to predict medical events
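As an illustration of one row of Table 1, the following is a minimal sketch of K-nearest neighbors classification by majority vote. It is not code from the cited references. The function name `knn_predict`, the two features, and the toy values are hypothetical and chosen only to make the voting step concrete; a real application would standardize features first so that no single unit dominates the distance.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training cases."""
    # Euclidean distance from the query to every stored training case
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest cases and let their labels vote
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (FEV1 % predicted, blood eosinophils in cells/uL).
# Values are invented for illustration and are unscaled.
train_points = [(95, 150), (90, 200), (88, 180), (60, 600), (55, 550), (65, 700)]
train_labels = ["no_exacerbation"] * 3 + ["exacerbation"] * 3

prediction = knn_predict(train_points, train_labels, query=(62, 580), k=3)
print(prediction)
```

With an odd k and two classes the vote cannot tie, which is one common reason to pick an odd number of neighbors in binary problems.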
phenotype, environmental factors, biomarkers, as well as PFT results. The trained model can then be used to predict the likelihood of asthma exacerbations in the near future on the basis of the patient's data in the baseline period. This approach holds the potential for application in personalized medicine in asthma (Fig. 2d).

Fig. 2 Supervised, unsupervised, reinforcement, and deep learning algorithms. (a) Linear regression and logistic regression are among many supervised learning algorithms and are trained on labeled data. The label can be either a continuous outcome (linear regression) or a category (logistic regression). (b) K-means clustering and hierarchical clustering are examples of unsupervised learning algorithms and use unlabeled data to discover hidden patterns and data clusters. The output of unsupervised learning can be the clusters' information, a dendrogram, or other visuals that highlight notable patterns within the data. (c) Reinforcement learning algorithms train an agent to find the optimal succession of actions that optimize a given task. The agent acts within an environment, according to defined rules and a reward system, and through trial and error, the agent explores the whole solution space and eventually identifies the optimal solution. (d) Deep learning models are composed of multiple layers of artificial neural networks and can be trained to predict continuous and categorical variables, using both structured (e.g., tables) and unstructured data (e.g., images, time series)

STUDIES USING MACHINE LEARNING TO PREDICT ASTHMA EXACERBATIONS

According to the "no free lunch theorem" there is no single algorithm that can be applied to all datasets homogeneously, making algorithm selection a key decision when applying ML [57, 58]. Thus, the use of ML to analyze features of asthma creates the possibility to uncover interactions between all potential features that can produce clinical outcomes and that may not have been considered until now, thereby leading to better asthma care without increasing the burden on patients and healthcare practitioners.

Original research efforts in ML and asthma were done in small clinical datasets and/or using only genomics data. Results were compared with predictions made by clinicians [59], or those made in genome-wide association studies (GWAS) [60], resulting in ML models with lower accuracy in predicting asthma exacerbations than those based on physicians' opinions. In 2017, however, Finkelstein and Jeong interrogated data from 7001 self-reported records during a 7-day home telemonitoring window and found that ML algorithms (a naïve Bayesian classifier, an adaptive Bayesian network, and support vector machines [SVM]) were able to predict asthma exacerbations on day 8 with a sensitivity of 0.80, 1.00, and 0.84; a specificity of 0.77, 1.00, and 0.80; and an accuracy of 0.77, 1.00, and 0.80, respectively [61]. Predictive modeling included asthma symptoms, self-reported medication consumption, asthma trigger exposure, and lung function assessed by peak expiratory flow (PEF). Although using these parameters demonstrated promising results, future predictive models for asthma exacerbation should include other potentially relevant predictive factors such as genomic, clinical, sociodemographic, behavioral, and environmental data.

Additional studies have also used ML to predict hospitalizations in children and adults with acute asthma [62, 63]. One study in children with acute asthma seen in the ED compared four ML approaches using features from 29,392 patients (mean age ± SD, 7 ± 4.2 years) [62]. Data used to predict hospitalizations included clinical data available at the time of triage, weather reports, neighborhood characteristics, and viruses circulating in the community. Four models were trained and validated, including decision trees, least absolute shrinkage and selection operator (LASSO) logistic regression, random forests, and gradient boosting machines (GBMs), and the area under the receiver operating characteristic curve (AUC) was calculated for each model. The AUCs were as follows: decision tree, 0.72 (95% confidence interval [CI] 0.66–0.77); logistic regression, 0.83 (95% CI 0.82–0.83); random forests, 0.82 (95% CI 0.81–0.83); and GBM, 0.84 (95% CI 0.83–0.85). Patient vital signs and acuity, age, and weight, followed by socioeconomic status and weather-related features, were
the most important features predicting hospitalizations, with GBM identified as the best predictor of hospitalization in the study.

In another study, Goto et al. tested four ML algorithms (LASSO logistic regression, random forest, GBM, and deep neural network [DNN]) on data from 3206 adults with asthma and chronic obstructive pulmonary disease (COPD) exacerbations who visited the ED [63]. In both patient populations (asthma and COPD), GBM was the best-performing approach for the critical care outcome and achieved the highest discriminative ability (C-statistics 0.80 vs 0.68), reclassification improvement (net reclassification improvement [NRI] 53%; P = 0.002), and sensitivity (0.79 vs 0.53) over the reference model. For the prediction of hospitalizations, random forest provided the highest discriminative ability (C-statistics 0.83 vs 0.64), reclassification improvement (NRI 92%; P < 0.001), and sensitivity (0.75 vs 0.33).

Zein et al. analyzed data extracted from electronic health records (EHRs) of 60,302 patients with asthma and tested three different models (logistic regression, random forests, and GBM) to predict three different outcomes occurring in 19,772 patients with acute asthma [64]. Of the three asthma outcomes, 32.8% required oral glucocorticoid bursts, 2.9% required ED visits, and 1.5% required hospitalizations. The three outcomes were predicted best by light gradient boosting machine, with an AUC of 0.88 (95% CI 0.86–0.89). Risk factors for all three outcomes included age and use of long-acting β-agonists, high-dose inhaled glucocorticoids, or chronic oral glucocorticoid therapies. In a subgroup analysis of 9448 patients with spirometry data, adding results from PFTs did not improve the models' predictive performances.

More recently, a predictive model of asthma exacerbations used data downloaded from ProAir Digihaler® together with clinical and demographic information [65]. The generated model predicted an impending exacerbation over the following 5 days with high diagnostic accuracy (AUC 0.83). The most significant factors contributing to the model were features based on the mean number of inhaler puffs during the 4 days before prediction. Similarly, Zhang et al. analyzed 728,535 records of daily peak expiratory flow and asthma symptoms in 2010 patients [24]. The authors used logistic regression, decision tree, naïve Bayes, and perceptron algorithms to assess the primary outcome of exacerbation detection on the same day, or up to 3 days in the future. The best model used logistic regression, which showed an AUC of 0.85 with a sensitivity of 90% and specificity of 83% for severe asthma exacerbations.

Tong et al. published results from an accurate model using data from patients with asthma treated at Intermountain Healthcare between 2011 and 2018 [66]. The authors inputted 234 candidate features leading to the analysis of 82,888 data instances to predict asthma exacerbations in the following 12 months. The XGBoost algorithm was used because it deals with missing values and calculates the importance of each feature according to its contribution to the model. The final model, using XGBoost and 71 features in descending order of their importance, yielded an AUC of 0.90 with an accuracy of 91%, sensitivity of 70%, and specificity of 91%. The top 10 features associated with exacerbations were all related to loss of asthma control and African American race. To calculate the probability of an asthma exacerbation occurring in children, Overgaard et al. developed a model that yielded an AUC of 0.80 using logistic regression, whereas random forest showed an AUC of 0.79 and perceptron showed an AUC of 0.78 [67]. Their purpose was to further conduct a clinical trial to assess safety, clinical outcomes, care quality, care cost, and satisfaction of end users (e.g., clinicians, patients, and caregivers) comparing ML algorithms versus standard care for clinical decision-making.

ML has also been used to assess the influence of environmental exposures on asthma exacerbations using conditional random forest, conditional tree, and generalized linear models. Conditional random forest and conditional tree identified five features (prednisone use, race, particulate matter exposure, obesity, and gender) to be of significant importance in the association with exacerbations. The same features (except for gender) were replicated using a generalized linear model [68]. In addition,
Haque et al. used a DNN regression (DNNR) model to predict asthma exacerbations on the basis of Asthma Control Test (ACT) scores, weather triggers (temperature, humidity, pressure, and windspeed) and demographic data, with their approach achieving an accuracy of 94% [69].

Another aspect of acute asthma is the number of readmissions that can occur after the initial treatment of acute asthma in the ED. In a study of 81 patients with asthma, Halner et al. developed multivariate random forest models to identify predictors of treatment failure defined as the need for additional systemic corticosteroids and/or antibiotics, hospital readmission, or death within 30 days of the initial ED visit [70]. Forty-three patients failed treatment, which was predicted by an 11-variable random forest model (including medication, history of exacerbations, symptoms, and quality of life) with an AUC of 0.81. In this regard, de Hond et al. recently used home monitoring of peak flow rates and symptoms in 266 patients to adjust their asthma medications and avert exacerbations [71]. These authors compared ML algorithms (XGBoost vs one-class SVM) with a clinical rule and logistic regression. The AUCs were 0.85 for XGBoost and 0.88 for logistic regression. Clinical application, however, remains a challenge owing to the low incidence of events, which can make models overly sensitive and prone to false alarms [72]. Additionally, because asthma is the result of the patient interacting with the environment, both patient activity and environmental data should be included in any ML approach to augment clinical decision-making.

The ML algorithms used in the aforementioned studies found that the most common clinical features associated with asthma exacerbations were loss of asthma control and poor air quality or weather changes; intake of medications and use of PFTs were less clearly linked to asthma exacerbations, with their relevance depending on the patient populations and settings. Table 2 contains a concise summary of the studies described above.

PROS AND CONS OF USING MACHINE LEARNING TO PREDICT ASTHMA EXACERBATIONS

There are clear advantages to using ML approaches to predict asthma exacerbations with greater accuracy than is achieved today. Such advantages include the more accurate prediction of asthma exacerbation and prognosis through the analyses of large databases (as mentioned in the studies discussed above). ML approaches are faster and less expensive than clinical trials as they reduce the need for manual data entry and analyses, and detect patterns that may be more difficult for human experts to detect [73, 74]. The ability to automatically integrate the different dimensions that are part of the asthma syndrome and its timeline of deterioration leading to exacerbations can only be achieved using ML owing to its lesser scope for human error. In this regard, new features or data patterns that can predict asthma exacerbations could be detected (as has been done in other areas of medical care). For instance, studies have demonstrated that ML can prevent adverse events such as healthcare-associated infections, adverse drug events, venous thromboembolism, surgical complications, pressure ulcers, falls, decompensation, and diagnostic errors [75]. Another advantage of using ML is the ability to implement therapeutic measures by predicting the future risk of deterioration at the point-of-care [76]. Finally, ML may also be applied retrospectively in specific patient subgroups to understand/identify likely clinical outcomes depending on treatment selected, sometimes described as real-world evidence (RWE).

There exist some ethical challenges that need to be taken into consideration when planning to use AI in healthcare. First, informed consent must explain when (and if) AI has been used to provide a therapeutic recommendation or to inform prognosis and risk of future events. Explaining to the patient the type of AI used and how it has augmented the accuracy of the prediction is needed, while "black box" algorithms (when the developer cannot share details of the algorithm) should be avoided [77]. Black box approaches are those in
Table 2 continued

| References | Algorithms implemented | Predictive features | Notes |
|---|---|---|---|
| | Random forests, Gradient boosting machines | | |
| Goto et al. [63] | Lasso regression, Random forest, XGBoost, Deep neural network | Critical care outcome: arrhythmia, respiratory rate, congestive heart failure, temperature, oxygen saturation, arrival mode (ambulance vs walk-in), asthma status. Hospitalization outcome: age, congestive heart failure, arrival mode, asthma status, COPD status, oxygen saturation, respiratory rate | XGBoost and random forest were the best-performing algorithms for critical care and hospitalization prediction, respectively |
| Zein et al. [64] | Logistic regression, Random forest, LightGBM | Non-severe exacerbation outcome: history of sinusitis, treatment with combination iCS and LABA or with HDiCS, and leukotriene inhibitors, high BMI, eosinophilia, low blood albumin. Emergency department visit outcome: age, Black/African American race, a history of non-severe exacerbations, history of severe asthma, eosinophilia, low blood albumin. Hospitalization outcome: a history of non-severe exacerbations, low hemoglobin, high BMI | LightGBM generated the best predictions for non-severe exacerbation, emergency department visit, and hospitalization |
| Noble et al. [19] | Logistic regression | Previous hospitalization, older age, being underweight, smoking, history of asthma attacks and blood eosinophilia | |
| Lugogo et al. [65] | Gradient boosting machines | Mean number of daily albuterol inhalations during the 4 days prior to the prediction, inhalation parameters in the 4 days prior to prediction (PIF, inhalation volume, and inhalation duration), and comparison to the baseline values for these inhalation parameters | |
| Zhang et al. [24] | Logistic regression, Decision tree, Naïve Bayes, Perceptron algorithms | Not assessed | Logistic regression was the best-performing model |
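The studies summarized above report sensitivity, specificity, accuracy, and AUC, and the conclusion of this review discusses precision and recall (recall is another name for sensitivity). As a minimal sketch of how these quantities relate, the following computes them from scratch. The labels, risk scores, and the 0.5 decision threshold are hypothetical values invented for illustration and do not come from any of the cited studies. The code assumes both classes are present and at least one case is predicted positive, so no denominator is zero.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity (recall), specificity, precision, and accuracy for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),  # share of true events that were flagged
        "specificity": tn / (tn + fp),  # share of non-events correctly left unflagged
        "precision": tp / (tp + fp),    # share of flagged cases that were true events
        "accuracy": (tp + tn) / len(y_true),
    }


def auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = exacerbation occurred, 0 = no exacerbation
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.35, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed decision threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
model_auc = auc(y_true, scores)
print(metrics, round(model_auc, 3))
```

Lowering the threshold in this sketch raises sensitivity at the cost of precision, which is the trade-off a clinical system accepts when it prioritizes finding all patients at risk. AUC does not depend on the threshold, which is one reason it is reported so often when comparing models.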
which the inner workings of the system are not easily explained, but they are often used in ML algorithms that are trained on large amounts of data. These algorithms can be very accurate at making predictions but can also be difficult to interpret, meaning that it is not always clear why they make the predictions they do. Therefore, there are several reasons why black box approaches should be avoided in healthcare. First, they can lead to a lack of trust between patients and healthcare providers. Second, black box approaches can make it difficult to identify and address bias in healthcare algorithms. Third, black box approaches can make it difficult to improve healthcare algorithms over time. If it is not clear how an algorithm works, it can be difficult to identify the factors that contribute to its accuracy and to make changes that can improve its performance. This can lead to algorithms that are less accurate and less effective over time.

Additionally, the data currently available for use in AI are often unstructured, simply not in any documented form, and/or disputed. High-quality, large, annotated databases may prove to be quite fruitful in minimizing patient harm in the future (caused by potentially incorrect predictions) [78]. Until this happens, patients must be informed of potential pitfalls when implementing AI in healthcare decisions. Because all AI input is developed by humans, it is subject to human bias. This can occur when algorithms are trained in limited demographic groups leading to a lack of generalizability and unintended social bias. If the data are biased, the algorithm will perpetuate these biases, leading to incorrect predictions [79]. Finally, with regards to data privacy, EHRs contain confidential patient information; searching such data using algorithms may constitute a breach of patient privacy.

CONCLUSION

Despite the use of inhaled corticosteroids, bronchodilators, leukotriene modifiers, and biologics for T2-high and T2-low asthma, some patients with asthma continue to suffer from acute exacerbations. Predicting asthma exacerbations with better accuracy using disease management programs and RWE can significantly reduce asthma-related morbidity and healthcare costs.

Each patient's asthma journey and disease severity are influenced by a constellation of factors, including their medical history, biomarker phenotype, pulmonary function, level of healthcare system support, compliance to prescribed therapy, comorbidities, personal habits, and environmental conditions. Such aspects of the patient's experience can be analyzed using ML to augment clinical decision-making and provide appropriate treatment to
improve both asthma prognosis and the overall Funding. This study was sponsored by
quality of life. Amgen Inc, including the funding of the jour-
It must be noted that there are two metrics nal’s Rapid Service Fee and Open Access Fee.
used to evaluate performance of an ML model
in binary classification tasks: precision and recall. Data Availability. Qualified researchers
These measure how accurate and how complete may request data from Amgen clinical studies.
the model’s positive predictions are, respec- Complete details are available at https://
tively. In general, it is desirable to have both wwwext.amgen.com/science/clinical-trials/
high precision and high recall. However, in a clinical-data-transparency-practices/.
clinical diagnosis system, one may prioritize
recall over precision because it is more impor- Declarations
tant to identify all patients at risk of exacerba-
Conflict of Interest. NA Molfino and G
tions, even if some of the patients with less severe asthma may present with comparatively lower risk of exacerbations.

Additionally, while ML works well in very large and/or well-labeled datasets, it may be difficult to apply ML directly in smaller and/or less defined datasets. In such circumstances, an initial step of natural language processing (NLP) could help provide a curated dataset as input to the ML model. Clinical NLP can be applied to clinician notes to extract clinical features such as clinic visits, hospital treatments, and PFTs among other features in the patient timeline that can serve as ML inputs. Furthermore, the interpretability of ML algorithms by patients and healthcare professionals can be improved by using simpler algorithms that can explain their predictions in plain language, or by visualizing the algorithm’s decision-making process that can then be audited by humans.

Turcatel are employees and stockholders of Amgen Inc. D Riskin is the Founder and Chief Executive Officer of Verantos Inc.

Ethical Approval. This article is based on previously conducted studies and does not contain any new studies with human participants or animals performed by any of the authors.

Open Access. This article is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, which permits any non-commercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc/4.0/.

Medical Writing and Editorial Assistance. Medical writing support was provided by Meenakshi Mukherjee, PhD, CMPP (Cactus Life Sciences, part of Cactus Communications) with funding from Amgen Inc.

Author Contributions. Conception or design of the work: Nestor A Molfino. Acquisition, analysis, or interpretation of data for the work: Nestor A Molfino and Gianluca Turcatel. Drafting the work or reviewing it critically for important intellectual content: Nestor A Molfino, Gianluca Turcatel, and Daniel Riskin. Final
approval of the version to be published: Nestor A Molfino, Gianluca Turcatel, and Daniel Riskin.
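The NLP pre-processing step described in the discussion above (extracting clinic visits, hospital treatments, and PFTs from clinician notes as ML inputs) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method of any study reviewed here: the keyword patterns, the feature names, the `extract_features` helper, the choice of FEV1 percent predicted as the example PFT value, and the sample notes are all hypothetical, and a real clinical NLP system would rely on a validated pipeline rather than hand-written regular expressions.

```python
import re

# Hypothetical keyword patterns for two event types named in the text.
# These are illustrative only and far from clinically complete.
EVENT_PATTERNS = {
    "clinic_visit": re.compile(
        r"\b(clinic visit|office visit|follow-up visit)\b", re.IGNORECASE
    ),
    "hospital_treatment": re.compile(
        r"\b(admitted|hospitali[sz]ed|emergency department)\b", re.IGNORECASE
    ),
}

# Assumed PFT mention format, e.g. "FEV1 62% predicted" or "FEV1 of 80 %".
FEV1_PATTERN = re.compile(r"\bFEV1\D{0,20}?(\d{2,3})\s*%", re.IGNORECASE)


def extract_features(notes):
    """Convert a list of free-text notes into a flat feature dictionary.

    Each event feature counts the notes that mention that event at least
    once; PFT features summarize the FEV1 percentages found in the notes.
    """
    features = {name: 0 for name in EVENT_PATTERNS}
    fev1_values = []
    for note in notes:
        for name, pattern in EVENT_PATTERNS.items():
            if pattern.search(note):
                features[name] += 1
        fev1_values.extend(int(value) for value in FEV1_PATTERN.findall(note))
    features["pft_count"] = len(fev1_values)
    features["min_fev1_pct"] = min(fev1_values) if fev1_values else None
    return features


if __name__ == "__main__":
    # Invented example notes for one patient timeline.
    example_notes = [
        "Follow-up visit. Spirometry today: FEV1 62% predicted.",
        "Patient admitted overnight for wheeze; given oral prednisone.",
        "Office visit, symptoms controlled. FEV1 of 80 % predicted.",
    ]
    print(extract_features(example_notes))
```

The resulting dictionary is the kind of curated, structured record that could then be passed to an ML model in place of the raw notes. Keeping the extraction rules explicit also means that each derived feature can be traced back to the text that produced it.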