
ORIGINAL ARTICLE

Machine Learning for Predicting Patient Wait Times and Appointment Delays

Catherine Curtis, MS a, Chang Liu, MS a, Thomas J. Bollerman, MS a, Oleg S. Pianykh, PhD a

Abstract
Being able to accurately predict waiting times and scheduled appointment delays can increase patient satisfaction and enable staff
members to more accurately assess and respond to patient flow. In this work, the authors studied the applicability of machine learning
models to predict waiting times at a walk-in radiology facility (radiography) and delay times at scheduled radiology facilities (CT, MRI,
and ultrasound). In the proposed models, a variety of predictors derived from data available in the radiology information system were
used to predict waiting or delay times. Several machine-learning algorithms, such as neural network, random forest, support vector
machine, elastic net, multivariate adaptive regression splines, k-th nearest neighbor, gradient boosting machine, bagging, classification
and regression tree, and linear regression, were evaluated to find the most accurate method. The elastic net model performed best among
the 10 proposed models for predicting waiting times or delay times across all four modalities. The most important predictors were also
identified.
Key Words: Machine learning, radiology information system, regression, predictive model, operations management, elastic net
J Am Coll Radiol 2017;-:---. Copyright © 2017 American College of Radiology

a Department of Radiology, Massachusetts General Hospital, Boston, Massachusetts.
Corresponding author and reprints: Oleg S. Pianykh, PhD, Department of Radiology, Massachusetts General Hospital, 25 New Chardon Street, Office #470, Boston, MA 02114; e-mail: [email protected].
The authors have no conflicts of interest related to the material discussed in this article.
© 2017 American College of Radiology. 1546-1440/17/$36.00. http://dx.doi.org/10.1016/j.jacr.2017.08.021

INTRODUCTION

Being able to accurately predict waiting times and scheduled appointment delays can increase patient satisfaction and enable staff members to more accurately assess and respond to patient flow [1,2].

A few studies have already acknowledged the importance of this issue and proposed models attempting to predict waiting time with a range of statistical methods. For example, linear regression on the basis of the current time and the mean wait time of the past three and past five patients seen immediately prior has been suggested [3]. However, limited numbers of predictors often lead to considerable discrepancies between predicted and actual waiting times, as we demonstrate later in this report. Predicting wait time on the basis of patient acuity category, patient queue sizes, and flow rates has also been investigated [4], using quantile regression to provide patients with a range of expected values, from the median to the 95th percentile. Range outputs are much more likely to include the true waiting time than a single predicted value. However, the width of the prediction intervals can render them ultimately unhelpful in assuaging patient concerns about lack of waiting time information.

In our previous research, we developed predictive models that use current and recent patient waiting line sizes [5]. With these models, we created applications that show estimated waiting times on displays visible in the reception areas of our hospital. One year after the implementation, we conducted a 10-day survey to gauge patients' opinions of the waiting time displays. Most (82% of those surveyed) liked the displays and wanted to see them expanded to all waiting rooms [2]. We noticed that most patients who were dissatisfied with the displayed waiting times had been delayed for longer than predicted, so the need for more accurate models became imminent. We also wanted to predict not only waiting times for the walk-in facilities (our original design) but also delays for the scheduled facilities.

To achieve this goal, we needed more sophisticated and flexible algorithms, and machine learning (ML) was a very logical choice. We also knew that ML could be one
of the best practical ways of dealing with the extreme complexity and randomness of wait and delay time patterns. At its core, ML provides a set of tools for efficient data mining and modeling, especially for large and imperfect data sets [6-9]. Models built with ML can reflect sophisticated trends that are hard to capture with conventional regression approaches. They can also resist noise and abnormal outliers, adapt to changing environments, and run without human supervision. As a result, the ultimate objective of this work was to create a universal model that improves the accuracy of predictions for both walk-in and scheduled-appointment facilities using more advanced ML methods.

METHODS

Data Preparation

This was a retrospective study of patient examinations performed in the Massachusetts General Hospital Department of Radiology between July 2016 and January 2017. We considered examinations of the following modalities: CT, MRI, ultrasound, and radiography. Among the four modalities, only radiography had walk-in examinations; the others had scheduled appointments. Using our radiology information system (RIS) (Epic Radiant, Verona, Wisconsin, USA), we extracted nine principal examination parameters: patient arrival time, examination begin and complete times, time of the first image acquisition, examination code, examination description, scanner name, modality, and division of examination. The scheduled appointment times of CT, MRI, and ultrasound examinations were also recorded. The time of the first image was captured automatically by the imaging modality. Other time stamps were recorded manually in real time by medical staff members using the RIS interface.

Initially, our analysis considered all finalized examinations (Fig. 1, Step 1). A few observations had missing values because of manual data entry errors and were excluded from our analysis (Fig. 1, Step 2). We also discarded records with illogical discrepancies, such as an arrival time after the first image time (Fig. 1, Step 3). Additionally, many patients had multiple examinations on a given day, so we grouped these and used the earliest arrival and first image times as the definitive arrival and first image times for that visit. Similarly, we assigned the latest completion time as the completion time of the patient's whole visit (Fig. 1, Step 4). The vast majority of removed observations came in this step.

We defined delay time as the time between the scheduled time and the first image time for modalities with appointments, and wait time as the time elapsed between patient arrival and the first image for walk-ins. After computing the delay or wait time for each visit, we found some observations with extreme values. For example, in some cases the date of the appointment differed from the date the examination was performed. To consistently exclude errors such as these, we discarded visits with waiting or delay times more than 5 SDs away from that modality's mean waiting or delay time (Fig. 1, Step 5).

Fig 1. Number of observations in each step of data cleaning.
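The cleaning steps described above (Fig. 1, Steps 3-5) can be sketched as a small pipeline. The record layout, field order, and numbers below are invented for illustration; they are not the study's actual RIS export.

```python
from statistics import mean, stdev

def clean_visits(exams):
    """Sketch of cleaning Steps 3-5 for a walk-in facility.
    exams: (patient_id, day, arrival_min, first_image_min) tuples."""
    # Step 3: drop illogical rows where arrival comes after the first image.
    exams = [e for e in exams if e[2] <= e[3]]
    # Step 4: merge multiple exams per patient per day into one visit,
    # keeping the earliest arrival and earliest first-image times.
    visits = {}
    for pid, day, arr, img in exams:
        a, i = visits.get((pid, day), (arr, img))
        visits[(pid, day)] = (min(a, arr), min(i, img))
    # Wait time of a walk-in visit = first image time - arrival time.
    waits = [img - arr for arr, img in visits.values()]
    # Step 5: discard visits more than 5 SDs from the mean wait time.
    m, s = mean(waits), stdev(waits)
    return [w for w in waits if abs(w - m) <= 5 * s]

# Toy data: 200 ordinary visits, one duplicate exam, one illogical row,
# and one extreme outlier that the 5-SD rule should remove.
exams = [("p%d" % k, "2016-07-01", 540, 560 + k % 21) for k in range(200)]
exams.append(("p0", "2016-07-01", 545, 600))       # second exam, same visit
exams.append(("p_bad", "2016-07-01", 600, 590))    # image before arrival
exams.append(("p_out", "2016-07-01", 540, 10540))  # 10,000-min "wait"
clean = clean_visits(exams)  # 200 waits of 20-40 min; the outlier is gone
```

Note that the 5-SD cutoff only removes points when the sample is large; on a handful of visits no point can sit 5 SDs from its own sample mean.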
Variable Creation

A number of predictor variables were derived from the original RIS data. On the basis of their values and underlying logic, we grouped them as follows:

- Predictors related to date and time: season, month, day of the week, hour of the day, whether the day was a workday immediately before a holiday, and whether the day was a workday immediately after a holiday.
- Predictors related to the scheduled appointment: next scheduled slot, whether the most recent appointment time was fully booked, total number of examinations scheduled on that day, number of patients with examinations scheduled before the most recently arrived patient, order in which the most recent patient arrived in relation to other patients with the same appointment time, and number of delayed appointments since the beginning of the day.
- Patient flow-based predictors: number of patients currently waiting; number of patients waiting 15, 30, 45, and 60 min (5, 10, 15, and 20 min for ultrasound and radiography); and number of patients taken in the past 30 and 60 min (10 and 30 min for ultrasound and radiography).
- Predictors related to current and past examinations: maximum examination time of ongoing examinations, minimum examination time of ongoing examinations, and median wait time of the most recent five examinations.
- Predictors related to the current status of examination rooms: number of scanners currently in use and number of unique scanners that had been used that day.
- Examination-specific predictors: number of patients with abdominal examinations on the floor (either in the waiting room or the examination room), number of patients with neurologic examinations on the floor, and number of patients with vascular examinations on the floor.

Note that a few of these variables were specific to only certain modalities. For example, variables related to scheduled appointments did not apply to walk-in radiography, and the number of neurologic examinations was not applicable to ultrasound. However, we wanted to create a universal variable set and then let the modality-specific ML models decide which variables they needed most. In this way, we also wanted to avoid any a priori assumptions about which variables would be the most important; such assumptions, reasonable for small models, tend to fail with much larger ML models.

Thus, we compiled a list of 40 predictors. Despite this large number, we were able to justify all of them on the basis of our experience and observation, including the following:

- Contrast-related predictors: Some abdominal CT or MRI examinations require patients to drink contrast liquid one hour before the examination starts, so the number of patients with abdominal CT or MRI examinations in the waiting room might affect the flow rate of the facility. Similarly, some neurologic or vascular CT examinations require that contrast fluid be injected through an intravenous line in the examination room, which lengthens the examination time relative to noncontrast examinations.
- Proximity to a holiday: In our data, the number of examinations scheduled on the days immediately before and after federal holidays differs from the number for ordinary workdays.
- Arrival order: According to our observations, the arrival order of patients with the same appointment time can affect delays. This relationship is especially pronounced for the ultrasound facility, where up to six examinations can be scheduled for the same time. The final two patients to arrive for a given slot were consistently seen significantly later than their scheduled appointment times, while the first two patients to arrive for the same slot had a high probability of being taken on time or earlier.

ML Models

After data cleaning and variable creation, the next step was to build and evaluate the models. A standard approach to avoid overfitting is to take a random subsample of the data for model building and use the rest to measure model performance. For each modality, we randomly assigned 70% of the data to the training set, on which we created our models, and the remaining 30% to the test set, on which we evaluated our models' predictive capabilities. All nine ML algorithms were then trained, with their parameters fine-tuned for the best possible training data fit: neural network, random forest, support vector machine (SVM), elastic net, multivariate adaptive regression splines (MARS), k-th nearest neighbor (KNN), gradient boosting machine (GBM), bagging, and classification and regression tree (CART).
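The 70/30 split and the two scores used to compare models on held-out data, root-mean-square error (RMSE) and R², can be written out directly. The toy data and the one-parameter stand-in model below are illustrative only; they are not the study's R pipeline.

```python
import random
from math import sqrt

def rmse(y_true, y_pred):
    """Root-mean-square error over paired observations."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    m = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Random 70/30 split: fit on the training set, score on the test set.
random.seed(0)
data = [(x, 2.0 * x + random.gauss(0, 1)) for x in range(100)]
random.shuffle(data)
cut = int(0.7 * len(data))
train, test = data[:cut], data[cut:]

# Stand-in "model": a single slope fitted on the training data only.
slope = sum(x * y for x, y in train) / sum(x * x for x, y in train)
y_true = [y for _, y in test]
y_pred = [slope * x for x, _ in test]
score_rmse, score_r2 = rmse(y_true, y_pred), r2(y_true, y_pred)
```

Fitting on the 70% and scoring only on the untouched 30% is what keeps the reported RMSE and R² honest estimates of out-of-sample accuracy.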
After that, we applied the trained models to the test set and computed the root-mean-square error (RMSE) and R² value for each model to determine predictive accuracy. Finally, we compared the performance of the models on the test set and selected the one that worked best across all four modalities, as indicated by a low RMSE. We also compared the ML models with the simple linear regression from our previous work [5] to see whether the new ML approach could deliver better results.

The most important predictors were identified on the basis of rules appropriate to the model we selected. All data manipulation, modeling, training, parameter tuning, and testing were performed using R (R Development Core Team, Vienna, Austria).

RESULTS

The overall performance of all 10 models was assessed with RMSE, as shown in Table 1. As one can see, the models' performance varied across the modalities. For example, elastic net and GBM were the only two techniques that consistently performed in the top 50% of all models.

Table 1. Comparison of RMSE of prediction for four modalities (best on top)

Method              CT     MRI    Ultrasound  Radiography
GBM                 38.85  24.53  13.03       6.900
Elastic net         38.98  24.46  13.09       6.867
Random forest       39.01  25.47  13.01       6.816
Linear regression   39.20  24.86  13.11       6.863
MARS                36.30  25.32  12.96       6.915
SVM                 39.33  24.87  12.84       6.941
Bagging             41.23  25.60  14.29       6.922
CART                42.68  27.21  17.94       7.306
Neural network      43.43  27.36  14.89       7.047
KNN                 45.07  27.42  15.90       7.008

Note: CART = classification and regression tree; GBM = gradient boosting machine; KNN = k-th nearest neighbor; MARS = multivariate adaptive regression splines; RMSE = root-mean-square error; SVM = support vector machine.

The prediction accuracy in the form of R², shown in Table 2, also varied across the modalities. For instance, by comparing the prediction accuracy of the elastic net model on data from all four modalities, we found that the model was most accurate for predicting ultrasound delays, followed by radiography, CT, and finally MRI. This ranking is not surprising given our prior knowledge of the flow-rate randomness of the waiting rooms.

Table 2. Comparison of R² of prediction for four modalities (best on top)

Method              CT      MRI     Ultrasound  Radiography
GBM                 0.3084  0.2646  0.5895      0.4483
Elastic net         0.3220  0.2971  0.5630      0.4642
Random forest       0.2959  0.2470  0.5790      0.4582
Linear regression   0.3205  0.2980  0.5626      0.4631
MARS                0.3196  0.2541  0.5793      0.4520
SVM                 0.3071  0.2694  0.5952      0.4543
Bagging             0.2384  0.2215  0.5016      0.4414
CART                0.1723  0.1318  0.2115      0.3793
Neural network      0.2003  0.1706  0.4767      0.4475
KNN                 0.0742  0.1388  0.4177      0.4287

Note: CART = classification and regression tree; GBM = gradient boosting machine; KNN = k-th nearest neighbor; MARS = multivariate adaptive regression splines; RMSE = root-mean-square error; SVM = support vector machine.

Figure 2 visualizes these results by plotting the average RMSE versus the average R² value of each algorithm across the four modalities. The top left corner corresponds to the best model performance. From the figure, we can see that the accuracy of elastic net was not far from that of GBM. This is very impressive performance for elastic net: GBM assembles weak prediction models (in our case, decision trees) and involves many parameters, whereas elastic net is a simple regularization algorithm based on an ordinary linear regression model. Therefore, on the basis of prediction accuracy across all modalities and computational simplicity, we selected elastic net as the best predictive model.

Fig 2. Comparison of average prediction accuracy. CART = classification and regression tree; GBM = gradient boosting machine; KNN = k-th nearest neighbor; MARS = multivariate adaptive regression splines; RMSE = root-mean-square error; SVM = support vector machine.

We also ranked the importance of predictors on the basis of the absolute value of the t statistic for each model parameter.
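Ranking parameters by the absolute value of the t statistic can be sketched for a plain least-squares fit. The predictor names, coefficients, and data below are invented for illustration; the study applies its own importance rules to the selected elastic net model.

```python
import random
from math import sqrt

def invert(a):
    """Gauss-Jordan inverse of a small square matrix (no external libraries)."""
    n = len(a)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def t_statistics(X, y):
    """t statistic of each OLS coefficient: beta_j / se(beta_j)."""
    n, p = len(X), len(X[0])
    xtx = [[sum(r[a] * r[b] for r in X) for b in range(p)] for a in range(p)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    inv = invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(p)) for a in range(p)]
    resid = [y[i] - sum(X[i][a] * beta[a] for a in range(p)) for i in range(n)]
    sigma2 = sum(e * e for e in resid) / (n - p)   # residual variance
    return [beta[a] / sqrt(sigma2 * inv[a][a]) for a in range(p)]

# Toy data: wait time driven strongly by queue length, weakly by hour of day.
random.seed(1)
X, y = [], []
for _ in range(200):
    queue, hour = random.randint(0, 10), random.randint(8, 17)
    X.append([1.0, float(queue), float(hour)])   # leading 1.0 = intercept
    y.append(5.0 + 4.0 * queue + 0.1 * hour + random.gauss(0, 2))
names = ["intercept", "queue_length", "hour_of_day"]
t = t_statistics(X, y)
ranking = sorted(zip(names[1:], [abs(v) for v in t[1:]]),
                 key=lambda kv: kv[1], reverse=True)
```

The larger |t| is, the harder it is to explain that coefficient away as noise, which is why it serves as a simple importance score.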
We found that the most important predictors were as follows:

- patient queue length (current and most recent);
- examination queue length (the number of examinations scheduled to be done before the most recently arrived patient);
- the order in which the most recent patient arrived in relation to other patients with the same appointment time; and
- the median examination time of the five most recent examinations (particularly for MRI and ultrasound).

However, only 4 or 5 of the 40 predictors were eliminated by elastic net when applied to the different modalities, meaning that the majority of predictors did contribute to the final predicted value.

Finally, one should never rely on a single statistic (such as RMSE or R²) to evaluate the quality of complex ML models. Therefore, we also evaluated the models by visualizing their actual predictions. Figure 3 shows 200 successive patient visits for both the radiography and CT facilities, comparing the predictions of the linear regression model from our previous work with those of our new elastic net model. For the radiography facility, the linear model and elastic net predictions are roughly comparable, although the elastic net predictions are marginally better. This is mostly because the radiography facility ran nine devices, efficiently balancing operational load and randomness. As a result, facilities with more processing units are more predictable, with only marginal improvements provided by more complex ML prediction algorithms.

Fig 3. Wait time and delay time predictability patterns. LM represents the simple regression predictor previously described [5], and elastic net corresponds to our present model.
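The predictor elimination noted above, where only 4 or 5 of the 40 predictors were zeroed out, is the soft-thresholding effect of elastic net's L1 penalty term. Below is a minimal coordinate-descent sketch on made-up data, not the study's R implementation: a noise-only feature gets a coefficient of exactly zero, while the informative one survives in shrunken form.

```python
import random

def soft_threshold(v, t):
    """The L1 part of the update: shrink v toward zero, clipping at zero."""
    return v - t if v > t else (v + t if v < -t else 0.0)

def elastic_net(X, y, lam, alpha, iters=200):
    """Coordinate descent for
    (1/2n)||y - Xb||^2 + lam * (alpha*||b||_1 + (1 - alpha)/2 * ||b||_2^2).
    Minimal sketch: assumes centered y, roughly unit-variance columns,
    and no intercept."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        for j in range(p):
            # Correlation of feature j with the partial residual
            # (the prediction with beta_j temporarily removed).
            rho = sum(X[i][j] * (y[i] - sum(X[i][k] * beta[k]
                      for k in range(p) if k != j)) for i in range(n)) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, lam * alpha) / (z + lam * (1 - alpha))
    return beta

# Toy data: y depends only on the first feature; the second is pure noise.
random.seed(2)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(100)]
y = [3.0 * x1 + random.gauss(0, 0.5) for x1, _ in X]
beta = elastic_net(X, y, lam=0.5, alpha=0.9)   # beta[1] lands at exactly 0.0
```

The `alpha` knob blends lasso-style feature elimination (alpha near 1) with ridge-style shrinkage (alpha near 0), which is what lets elastic net drop a few predictors while keeping the rest.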
On the contrary, our CT facility ran only two CT scanners, and although the examination durations were comparable with those for radiography, their delay variability pattern became more complex (CT chart, Fig. 3) and challenging for a simple regression model. As a result, we can see that the prediction accuracy of the elastic net model is much higher than that of the linear model; the elastic net model tracks the real data and their varying trends visibly better. This difference in accuracy is especially pronounced for delay times longer than 50 minutes, which are not even captured by the much "flatter" regression model. Keep in mind that not only did elastic net use a much larger predictor set, but it was also trained to avoid overfitting, which contributed to its superior performance.

DISCUSSION

The concept of ML is certainly not new: it was introduced with the early advances of computer science in the 1960s as an attempt to capture complex data patterns with equally complex computational models. Since then, driven by scientific research and growth in computing power, ML has evolved rapidly, enriching itself with a wealth of diverse computational approaches [7,8,10,11].

ML relies on complex multidimensional and multiparametric functions that can be fit to large volumes of real, often noisy, and imperfect data. This fit, also known as "training" or "learning," enables one to discover hidden patterns within the data that would often be invisible otherwise. This has transformed ML into a major "big data" mining tool with a set of powerful learning techniques. For example, ML linear models, such as elastic nets and lasso regression, were developed as extensions of classical linear regression with additional robustness to resist noise and outliers [8]. ML classification models, such as trees and clustering algorithms, were created to discover hidden patterns as groupings (clusters) within data so that the data can be understood as classes with certain properties (consider clustering patients into different groups on the basis of their response to various treatments). Finally, ML neural network models, as their name suggests, were proposed to mimic the structure of a human brain with nonlinear, neuron-like connected functions; fine-tuning this artificial brain with already classified data can presumably make it efficient in processing new observations.

The computational power of ML, already outperforming the human brain in many data-processing tasks, makes it an excellent choice for recognizing and predicting sophisticated and noisy phenomena, such as the waiting times described in this work. This is why ML came as a very natural and logical choice to us. On the other hand, ML has its own limitations. The most noticeable of them is reduced interpretability: the complexity of ML functions can make them virtually impossible to explain in simple terms. And although these interpretation limitations do not affect our work on predicting patient times, they may become "showstoppers" in other areas in which interpretation is crucial. As a result, ML should never be applied blindly just for the sake of a popular buzzword; on the contrary, it should be treated as one of many problem-solving tools, used only when it makes the most sense.

This was exactly the approach we adopted in this work. Our study investigated the use of ML models to predict patient wait and facility delay times for radiology examinations. Parameters from the RIS were used to calculate predictors for 10 proposed models: 9 from ML and a simple linear regression. These methods were compared with one another, and with our earlier linear regression approach, in terms of predictive performance. This comparison demonstrated that ML with elastic net outperformed the other ML algorithms in prediction accuracy and model simplicity. It was also found to be more accurate than the linear regression model we developed previously. This demonstrates ML's potential in predicting workflow events.

TAKE-HOME POINTS

- Patient wait and facility delay time prediction is an essential tool in managing clinical practices.
- Accurate delay and wait prediction is important for patient satisfaction.
- ML models (such as elastic nets) applied to RIS data provide accurate and efficient wait and delay time predictors.
- The least predictable outliers should be studied by facility management to improve patient-processing strategies.

REFERENCES
1. Camacho F, Anderson F, Safrit A, Jones AS, Hoffmann P. The relationship between patient's perceived waiting time and office-based practice satisfaction. N Carolina Med J 2006;67:409-13.
2. Jaworsky C, Pianykh O, Oglevee C. Patient feedback on waiting time displays. Am J Med Qual 2017;32:108.

3. Hemaya SAK, Locker TE. How accurate are predicted waiting times, determined upon a patient's arrival in the emergency department? Emerg Med J 2012;29:316-8.
4. Sun Y, Teow KL, Heng BH, Ooi CK, Tay SY. Real-time prediction of waiting time in the emergency department, using quantile regression. Ann Emerg Med 2012;60:299-308.
5. Pianykh OS, Rosenthal DI. Can we predict patient wait time? J Am Coll Radiol 2015;12:1058-66.
6. SAS Institute. Machine learning: what it is and why it matters. Available at: https://www.sas.com/en_us/insights/analytics/machine-learning.html. Accessed August 27, 2017.
7. Kuhn M, Johnson K. Applied predictive modeling. New York: Springer; 2013.
8. Murphy KP. Machine learning: a probabilistic perspective. Cambridge, MA: MIT Press; 2012.
9. James G, Witten D, Hastie T, Tibshirani R. An introduction to statistical learning. New York: Springer; 2013.
10. Machine learning. Available at: https://en.wikipedia.org/wiki/Machine_learning. Accessed January 7, 2017.
11. Hastie T, Tibshirani R, Friedman J. The elements of statistical learning: data mining, inference, and prediction. 2nd ed. New York: Springer; 2009.
