2.7 Potassium 1 Moonesinghe - Risk Stratification Tools For Predicting Morbidity and Mortality
Suneetha Ramani Moonesinghe, F.R.C.A.,* Michael G. Mythen, M.D.,† Priya Das, M.B.B.S.,‡
Kathryn M. Rowan, Ph.D.,§ Michael P. W. Grocott, M.D.║
outcome.3 Examples of risk scores are the American Society of Anesthesiologists’ Physical Status score (ASA-PS)4 and the Lee Revised Cardiac Risk Index.5

By contrast, risk prediction models estimate an individual probability of risk for a patient by entering the patient’s data into the multivariable risk prediction model. Although risk prediction models may be more accurate predictors of an individual patient’s risk than risk scores, they are more complex to use in the day-to-day clinical setting.

Despite increasing interest in more sophisticated risk prediction methods, such as the measurement of functional capacity by exercise testing,6 risk stratification tools remain the most readily accessible option for this purpose. However, clinical experience tells us that they are not commonly used in everyday practice. Lack of use may be due to poor awareness amongst clinicians of the available options and concerns regarding their complexity and accuracy.7 In other clinical settings, low uptake of risk stratification tools has been ascribed to a lack of clarity on the precision of available tools, resulting from perhaps unnecessary efforts to make minor refinements to existing methods, or to developing novel methods, with the aim of achieving greater predictive accuracy.8

With the aim of summarizing the available risk stratification tools in perioperative care, in order to make recommendations about which methods are appropriate for use both in clinical practice and in research, we have undertaken a qualitative systematic review on the available evidence. The specific question we sought to answer was “What is the performance of risk stratification tools, validated for morbidity and/or mortality, in heterogeneous cohort of surgical (noncardiac, nonneurological) patients?” The review had three main objectives as follows: to summarize the available risk prediction methods, to report on their performance, and to comment on their strengths and weaknesses, with particular focus on accuracy and ease of application.

Materials and Methods
Previously published standards for reporting systematic reviews of observational studies were adhered to when undertaking this study.9 A Preferred Reporting Items for Systematic reviews and Meta-analyses checklist10 was used in the preparation of this report (appendix 1).

Definitions for the Purposes of This Study
A “risk stratification tool” was defined as a scoring system or model used to predict or adjust for either mortality or morbidity after surgery, and which contained at least two different risk factors. “Major surgery” was defined as a procedure taking place in an operating theatre and conducted by a surgeon; thus, studies of cohorts of patients undergoing endoscopic, angiographic, dental, and interventional radiological procedures were excluded. A “heterogeneous patient cohort” was defined as a cohort of patients including at least two different surgical specialities. Studies of gastrointestinal surgery, which included hepatobiliary surgery, were included. We excluded studies that consisted entirely of cohorts undergoing ambulatory (day case) surgery and cohorts that included cardiac or neurological surgery.

Search Strategy and Study Eligibility
A search for articles published between January 1, 1980 and August 6, 2011 was undertaken using MEDLINE, Embase, and Web of Science. No language restriction was applied. The search strategy and inclusion and exclusion criteria are detailed in appendix 2. Of note, articles reporting development studies were excluded, unless the article included validation in a separate cohort.

Data Extraction and Quality Assessment of Studies
Data extraction was independently undertaken by Drs. Moonesinghe and Das, using standardized tables relating to the study characteristics, quality, and outcomes. Where there was disagreement in the data extraction between these two authors, Dr. Moonesinghe resolved the query by referring again to the original articles. Study characteristics extracted from each article included the number of patients, the country where the study was conducted, the outcome measures and endpoints of each study, and the risk stratification tools being assessed. Data were also extracted regarding the most detailed description of the types of surgery included in each study cohort reported in the articles. We also extracted clinical outcome data (morbidity and mortality) for the cohorts in each study.

Assessment of study quality was based on the framework for assessing the internal validity of articles dealing with prognosis developed by Altman.11,12 The following criteria were used: the number of patients included in analyses, whether the study was conducted on a single or multiple sites, the timing of data collection (prospective vs. retrospective), whether a description of baseline characteristics for the cohort was included (including comorbidities, type of surgery, and demographic data), and selection criteria for patients included in the study (to assess for selection bias). Selection bias was judged to be present if a study restricted the type of patient who could be enrolled based on age, ethnicity, sex, premorbid condition, urgency of surgery, or postoperative destination (e.g., critical care). In addition, we reported the setting of each validation study, i.e., whether the validation was conducted in a split sample of the original development cohort or whether the validation cohort was entirely different from that in which the tool was developed.13 Finally, as a measure of their clinical usability and reproducibility, we reported whether each risk stratification tool used variables which were objective (e.g., blood results), subjective (e.g., chest radiograph interpretation), or both.14

Data Analysis and Statistical Considerations
The performance of each risk stratification tool was evaluated using measures of discrimination and, where appropriate, calibration. Discrimination (how well a model or score correctly identifies a particular outcome) was reported using
either the area under the receiver operating characteristic curve (AUROC) or the concordance (c-) statistic. We considered an AUROC of less than 0.7 to indicate poor performance, 0.7–0.9 to be moderate, and greater than 0.9 to reflect high performance.15 Calibration is defined as how well the prognostic estimation of a model matches the probability of the event of interest across the full range of outcomes in the population being studied. Where reported, either Hosmer–Lemeshow or Pearson chi-square statistics were extracted as an evaluation of calibration; P value of more than 0.05 was taken to indicate that there was no evidence of lack-of-fit.

Results
Search Results
In the initial search, 139,775 articles on MEDLINE and 71,841 on Embase were listed, and the titles and abstracts of these were screened to identify articles which described risk stratification tools used in any adult noncardiac, nonneurological surgery. Seven hundred fifty-one articles then underwent a review. Hand searching of reference lists and citations identified a further 432 studies which were also reviewed in detail.

Three studies were identified that graphically displayed receiver operating characteristic curves in their results but did not report AUROCs.16–18 The authors of these studies were contacted for additional information; none responded, so these studies were excluded from the analysis. Six foreign language studies, which may have been eligible for inclusion based on review of the abstracts, but for which we were unable to obtain translations, were also omitted from the analysis.19–24 The flow chart for the review is detailed in figure 1.
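As an illustration of the discrimination measure defined under Data Analysis and Statistical Considerations, the following is a minimal sketch, not the authors' own code. It computes the c-statistic by pairwise comparison (for a binary outcome this equals the AUROC) and applies the review's thresholds of less than 0.7, 0.7–0.9, and greater than 0.9. The function names and the toy data are hypothetical.

```python
from itertools import product


def c_statistic(scores, outcomes):
    """Concordance (c-) statistic, equal to the AUROC for a binary outcome.

    scores:   predicted risk (or ordinal score) for each patient
    outcomes: 1 if the event (e.g., death) occurred, 0 otherwise
    """
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    # A pair is concordant when the patient who had the event scored higher;
    # tied scores count as half a concordant pair.
    concordant = sum(
        1.0 if e > n else 0.5 if e == n else 0.0
        for e, n in product(events, non_events)
    )
    return concordant / (len(events) * len(non_events))


def classify_auroc(auroc):
    """Map an AUROC to the performance categories used in this review."""
    if auroc < 0.7:
        return "poor"
    if auroc <= 0.9:
        return "moderate"
    return "high"


# Hypothetical toy cohort: six patients, three of whom had the event.
toy_scores = [1, 2, 2, 3, 4, 5]
toy_outcomes = [0, 0, 1, 0, 1, 1]
toy_auroc = c_statistic(toy_scores, toy_outcomes)  # 7.5 concordant of 9 pairs
toy_category = classify_auroc(toy_auroc)           # "moderate"
```

The pairwise form is quadratic in cohort size, which is acceptable for illustration; a rank-based calculation would be preferable for large administrative datasets.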
A total of 27 studies evaluating 34 risk stratification tools were included in the analysis. All were cohort studies. Eight tools were validated in multiple studies; the most commonly reported were the ASA-PS (four studies, total number of patients, n = 4,014), the Acute Physiology and Chronic Health Evaluation II (APACHE II) scoring system (four studies, n = 5,897), the Physiological and Operative Severity Score for the enUmeration of Mortality and Morbidity (POSSUM; three studies, n = 2,915), the Portsmouth variation of POSSUM (P-POSSUM; five studies, n = 10,648; mortality model only), the Surgical Risk Scale (three studies, n = 5,244; mortality model only), the Surgical Apgar Score (three studies, n = 10,795), the Charlson Comorbidity Index (two studies, n = 2,463,997), and Donati Surgical Risk Score (two studies, n = 7,121). The accuracy of a further 26 tools was evaluated in single-validation studies. A comparison of tools that were validated in multiple studies is detailed in tables 1 and 2. The general characteristics of all included studies are summarized in table 3.

Quality Assessment
The quality assessment of included studies is summarized in table 3. Seven studies were multicenter and 21 were single center. The data collection was prospective in 19 studies, retrospective in 7, and based on administrative data in 2 studies. Sixteen studies used mortality as an outcome measure, four used morbidity, and eight used both. The study endpoints included 30-day outcome in 12 articles, hospital discharge in 15 articles, and 3 articles also included shorter or longer follow-up times ranging from 1 day to 1 yr. Nineteen studies of the total 28 reported baseline patient characteristics of physiology or comorbidity, surgery, and demographics; selection bias was evident in 12 studies.

Outcomes Reporting
Outcomes are summarized in table 4. Surgical mortality at 30 days varied between 1.25 and 12.2% and at hospital discharge between 0.8 and 24.7%.

All but one25 of the six studies which separately tested the discrimination of stratification tools for morbidity and mortality reported that morbidity prediction was less accurate. There was considerable heterogeneity in the definition of morbidity in the 12 studies that reported this outcome (see appendix 3 for summary), and in keeping with this, there was wide variation in complication rates in different studies (between 6.726 and 50.4%).25

Risk Stratification Tools Using Preoperative Data Only
Four entirely preoperative risk stratification tools (ASA-PS, Surgical Risk Scale, Surgical Risk Score, and the Charlson Comorbidity Index) were validated in multiple studies. The Surgical Risk Scale and the Surgical Risk Score both contain the ASA-PS, and the urgency and severity of surgery; both have also been multiply validated. The Surgical Risk Score28,29 was developed and originally validated in Italy29 and contains the ASA-PS, a 3-point scale modification of the Johns Hopkins surgical severity criteria and a binary definition of surgical urgency (elective vs. emergency). The only published study evaluating the Surgical Risk Score after its initial validation found it to be poorly predictive of inpatient mortality.28 The Surgical Risk Scale30–32 uses the ASA-PS alongside United Kingdom definitions of operative urgency (a 4-point scale defined by the United Kingdom National Confidential Enquiry into Postoperative Death and Outcome) and severity (the British United Provident Association classification which is used to rank surgical procedures for the purposes of financial billing in the private sector). Both studies validating this system after its initial development found it to be a moderately discriminant tool (AUROC >0.8).30,32

A further 18 different risk stratification tools using solely preoperative data were validated in single publications. Several of these were originally derived and validated for purposes other than the prediction of generic morbidity and mortality: these include cardiac risk prediction scores,27,32,33 measures of nutritional status,34 and frailty indices.27 These tools are described in appendix 4.

Risk Stratification Tools Incorporating Intra- and Postoperative Data
The POSSUM and P-POSSUM scores were the most frequently used tools in heterogeneous surgical cohorts. The POSSUM score was derived by multivariable logistic regression analysis and contains 18 variables, of which 12 were measured preoperatively and 6 at hospital discharge; two separate equations, for morbidity and mortality, were developed and validated.17,35 After recognition that the POSSUM model overpredicted adverse outcome, the Portsmouth variation (P-POSSUM) was developed to predict mortality, using the same composite variables but a different calculation.36 P-POSSUM has been used in a larger number of more recent studies28–30,32,37 than the original POSSUM25,29,30 and has been found to be of moderate to high discriminant accuracy (AUROC varying between 0.68 and 0.92) with the exception of one Australian study.37
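To make concrete what "the same composite variables but a different calculation" means for POSSUM and P-POSSUM, the sketch below converts the two composite scores (the physiological score and the operative severity score) into a predicted risk through a logistic equation. The coefficients are not given in this review; they are an assumption, taken from the commonly quoted forms of the published POSSUM and P-POSSUM equations, and should be checked against the original publications before any use. The function names are hypothetical.

```python
import math


def _logistic(log_odds):
    """Convert log-odds, ln(R / (1 - R)), into a probability R."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# ASSUMPTION: coefficients below are the commonly quoted published values,
# not values reported in this review.
def possum_mortality(physiological_score, operative_score):
    """Original POSSUM mortality equation."""
    return _logistic(-7.04 + 0.13 * physiological_score + 0.16 * operative_score)


def possum_morbidity(physiological_score, operative_score):
    """Original POSSUM morbidity equation."""
    return _logistic(-5.91 + 0.16 * physiological_score + 0.19 * operative_score)


def p_possum_mortality(physiological_score, operative_score):
    """Portsmouth (P-POSSUM) mortality equation: same inputs, new coefficients."""
    return _logistic(-9.065 + 0.1692 * physiological_score + 0.1550 * operative_score)


# Lowest possible composite scores (12 physiological, 6 operative points).
lowest_possum = possum_mortality(12, 6)      # roughly 1.1%
lowest_p_possum = p_possum_mortality(12, 6)  # roughly 0.2%
```

Under these assumed coefficients, the lowest-risk patient is assigned a predicted mortality of roughly 1.1% by POSSUM and roughly 0.2% by P-POSSUM, which illustrates the overprediction in low-risk patients that motivated the Portsmouth revision.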
APACHE II 16 Postoperative Critical care patients; 3 Jones25 117 Gastrointestinal, vascular, renal, and All 30 d HDU admission score:
all diagnoses (not urology 0.539 (+/−0.083)
just surgical); Osler63 5,322 Noncardiac All Hospital ICU admission score:
hospital discharge 0.806
mortality38 Stachon40 271 Ortho, spinal, trauma, visceral surgery, All Hospital First 24-h worst
limb surgery discharge score: 0.777
ASA-PS 1 Preoperative General surgical 2 Sutton31 1,946 Gastrointestinal, vascular, trauma All Hospital discharge 0.93 (0.90–0.97)
Donati29 1,849 Abdominal, vascular, orthopedics, urology, All Hospital 0.912 (0.898–0.924)
endocrine, otolaryngology, neurosurgery, discharge
gynecology, eye, thoracic, other
Brooks30 949 General, colorectal, upper gastrointestinal, All 30 d 0.92 (0.90–0.95)
urology, head, and neck
Neary32 2,349 General, vascular, otolaryngology, Emergent 30 d 0.90 (0.87–0.93)
urology, orthopedics, other and urgent 1 yr 0.90 (0.8–1.0)
Haga28 5,272 Gastrointestinal and hepatobiliary Elective 30 d 0.74 (0.63–0.86)
Hospital discharge 0.81 (0.75–0.88)
Surgical 3 Intraoperative Colorectal; 30-d 2 Regenbogen66 4,119 General and vascular All 30 d 0.81
Apgar mortality65 Haynes67 5,909 Any noncardiac All Inpatient 0.77
* Varied between year and whether ICD-9 or ICD-10 administrative data used (table 3).
APACHE II = Acute Physiology and Chronic Health Evaluation II; ASA-PS = American Society of Anesthesiologists’ Physical Status score; AUROC = area under receiver operating characteristic curve; HDU = high dependency unit; ICD = International Classification of Diseases; ICU = intensive care unit; (P)-POSSUM = (Portsmouth)-Physiological and Operative Severity Score for the enUmeration of Mortality and Morbidity.
Moonesinghe et al.
Table 2. Morbidity Models Validated in Multiple Studies
ASA-PS 3 Preoperative General surgery4 Goffi68 187 General All 30 d (mortality and 0.777
morbidity combined)
Hightower69 32 Major abdominal Elective 7d 0.688 (0.523–0.851)
(gastrointestinal,
urology)
Makary27 594 Unselected inpatient All Hospital discharge 0.626
APACHE II 1 Postoperative Critical care patients; Goffi68 187 General All 30 d (mortality and Hospital admission score:
any diagnosis (not just morbidity 0.866
surgical); hospital combined) Preoperative score: 0.894*
mortality38
POSSUM 2 Pre- and General surgery; 30-d Jones25 117 Gastrointestinal, vascular, All 30 d 0.82
intraoperative morbidity17 renal, and urology
Brooks30 949 General, colorectal, All 30 d 0.92
upper gastrointestinal,
urology, head, and
neck
Surgical 3 Intraoperative Colorectal; 30-d Gawande65 767 General and vascular All 30 d (mortality 0.72
Apgar mortality65 and morbidity
combined)
Regenbogen66 4,119 General and vascular All 30 d 0.73
Haynes67 5,909 Any noncardiac All Inpatient 0.70
EDUCATION
critical care; the score consists of 12 physiological variables and an assessment of chronic health status. This approach has face validity, as APACHE II is a summary measure of acute physiology and chronic health, both of which may influence surgical outcome. Only one of the four studies reporting the APACHE II score’s predictive accuracy used it in the way originally intended: by incorporating the most deranged physiological results within 24 h of critical care admission.40

The Charlson comorbidity score was developed to predict 10-yr mortality in medical patients.39 A combined age-comorbidity score was subsequently validated for the prediction of long-term mortality in a population of patients who had essential hypertension or diabetes and were undergoing elective surgery.41 It is the original Charlson score, however, which is used in two studies identified in our search to stratify risk of short-term outcome.42,43 These two studies reported very different predictive accuracy for the Charlson score; however, the largest single study included in this entire review found the Charlson score (measured using administrative data) to be a moderately accurate tool.44

Discussion
The purpose of this systematic review was to identify all risk stratification tools that have been validated in heterogeneous patient cohorts, and to report and summarize their discrimination and calibration. We have found a plethora of instruments that have been developed and validated in single studies, which unfortunately limits any assessment of their usefulness and generalizability. A smaller number of tools that could be used universally for perioperative risk prediction have been multiply validated; of these, the P-POSSUM and Surgical Risk Scale have been demonstrated to be the most consistently accurate systems.

Risk Stratification Tools in Practice: Complexity versus Parsimony
There are two key considerations when assessing the clinical utility of the various risk stratification tools reviewed in our study. First, what level of predictive accuracy is fit for the purposes of risk stratification? Second, what is the likelihood that each of the described instruments may be used in everyday practice by clinicians? Although the answer to the first question may be to aim as “high” (accurate) as possible, this must also be balanced against the issues raised by the second question. Risk models incorporating over 30 variables may be highly accurate but are less likely to be routinely incorporated into preoperative assessment processes than scores of similar performance that use only a few data points. Furthermore, clinical experience tells us that the clinician is less likely to use complex mathematical formulae, as opposed to additive scores, when attempting to risk stratify patients at the bedside or in the preoperative clinic.1

P-POSSUM
The P-POSSUM model was developed in the United Kingdom and has since been validated in Japan, Australia, and Italy. Although this is the most frequently and widely validated model identified by our study, it has some limitations. First, it includes both preoperative and intraoperative variables, and therefore cannot be used for preoperative risk prediction. Second, several of the variables are subjective (e.g., chest radiograph interpretation), carrying the risk of measurement error. Third, in common with the original POSSUM, the P-POSSUM tends to overestimate risk in low-risk patients. Fourth, it contains 18 variables, which must be entered into a regression equation to obtain a predicted percentage risk value, and clinicians may not wish to use such a complex system. Finally, the inclusion of intraoperative variables, particularly blood loss, which may be influenced by surgical technique, runs the risk of concealing poor surgical performance, thereby jeopardizing its face validity as a risk adjustment model for comparative audit of surgeons or institutions.

Surgical Risk Scale
The Surgical Risk Scale consists entirely of variables that are available before surgery, making it a useful tool for preoperative risk stratification for the purposes of clinical decision making. However, there are also some limitations. First, it incorporates the ASA-PS, which may be subject to interobserver variability and therefore measurement error.44–46 Second, the surgical severity coding is not intuitive, and some familiarity with the British United Provident Association system would be required for bedside estimation, unless a reference manual was available. Finally, it has only been validated in single-center studies within the United Kingdom; therefore, its generalizability to patient populations in the United States and worldwide is unknown.

Other Options
The ASA-PS is widely used as an indicator of whether or not a patient falls into a high-, medium-, or low-risk population, but it was not originally intended to be used for the prediction of adverse outcome in individual subjects.4 It is perhaps surprising that the ASA-PS was reported as having good discrimination for predicting postoperative mortality, as it is a very simple scoring system, which has been demonstrated to have only moderate to poor interrater reliability.44–47 Nevertheless, the ASA-PS has face validity as an assessment of functional capacity, which is increasingly thought to be a significant predictor of patient outcome, as demonstrated by more sophisticated techniques such as cardiopulmonary exercise testing.48 Although it is possible that this provides some explanation for the high discriminant accuracy for ASA-PS found in this systematic review, it is possible that publication bias, favoring studies with “positive” results, may also be a factor.

The Biochemistry and Hematology Outcome Model is a parsimonious version of POSSUM, which omits the subjective variables such as chest radiography and electrocardiogram results. It also has the advantage of consisting of variables which are all available preoperatively, with the exception of operative severity. Given the Biochemistry and Hematology
Table 3.
First Author | Region | N | No. of Centers | Data Acquisition | Selection Bias | Subject Description | Type of Surgery | Surgical Urgency | Models Used | Validation Cohort: Internal vs. External vs. Temporal* | Outcome | Endpoint
Atherly42 | United States | 2,167 | M | Administrative (ICD-9 codes) | N | N | General, vascular | All | Charlson Comorbidity Index based on ICD-9 codes | External | Mortality | 30 d
Brooks30 | United Kingdom | 949 | S | Prospective | N | N | General, colorectal, upper GI, urology, head, and neck | All | POSSUM, P-POSSUM, Surgical Risk Scale | Temporal | Mortality | 30 d
Goffi68 | Italy | 187 | S | Prospective | N | N | General | All | ASA-PS, APACHE II on hospital admission, APACHE II immediately preoperative | External | Combined endpoint: mortality, morbidity | 30 d
Hadjianastassiou70 | United Kingdom | 4,494 | S | Retrospective | N | Y | Maxillofacial, general, orthopedic, renal, urology, neuro | All | Surgical Mortality Score | Internal | Mortality | Hospital discharge
Haga28 | Japan | 5,272 | M | Prospective | N | Y | Gastrointestinal, hepatobiliary | Elective | E-PASS, mE-PASS, P-POSSUM, Surgical Risk Score (Donati) | External | Mortality | Hospital discharge
Haynes67 | International | 5,909 | M | Prospective | N | Y | Any noncardiac | All | Surgical Apgar | External | Mortality, morbidity | Hospital discharge
Liebman49 | The Netherlands | 33,224 | S | Prospective | N | Y | General and trauma | Emergent | Identification of Risk In Surgical patients | Internal | Mortality, morbidity | Hospital discharge
Makary27 | United States | 594 | S | Prospective | Y: elective only | Y | Unselected inpatient | Elective | ASA-PS, Lee RCRI, and Eagle Scores alone and in combination with Frailty Index | External | Morbidity | Hospital discharge
Pillai73 | New Zealand | 6,492 | M | Retrospective | N | Y | GI, breast, endocrine, vascular, gynecology, orthopedic, hepatobiliary | All | Otago Surgical Audit Score | External | Morbidity | Hospital discharge
Regenbogen66 | United States | 4,119 | S | Prospective | N | Y | General and vascular | All | Surgical Apgar Score | External | Mortality, morbidity | 30 d
Stachon40 | Germany | 271 | S | Prospective | Y: ICU only | Y | Orthopedic, spinal, trauma, visceral surgery, limb surgery | All | APACHE II, SAPS II, APACHEN, SAPSN | External | Mortality | Hospital discharge
Stachon74 | Germany | 283 | S | Prospective | Y: ICU only | Y | Orthopedic, spinal, trauma, visceral surgery, limb surgery | All | DELAWARE, APACHE II, SAPS II | Temporal/external | Mortality | Hospital discharge
Story75 | Australia | 256 | S | Retrospective | Y: >70 y only | Y | General, colorectal, orthopedic, plastics, | All | Perioperative Mortality Risk Score | Internal | Mortality | 30 d
* Definitions of validation cohorts: External = validation in new cohort unrelated to derivation study; Internal = validation in split sample of same study population as derivation cohort; Temporal = validation in new cohort from derivation study but
same institution(s).
APACHE II = Acute Physiology and Chronic Health Evaluation II; APACHEN = Acute Physiology and Chronic Health Evaluation-Nucleated; ASA-PS = American Society of Anesthesiologists’ Physical Status score; BHOM = Biochemistry and
Hematology Outcome Model; CPET = cardiopulmonary exercise testing; DELAWARE = Dense Laboratory Whole Blood Applied Risk Estimation; E-PASS = Estimation of Physiologic Ability and Surgical Stress; GI = gastrointestinal; HDU = high
dependency unit; ICD = International Classification of Diseases; ICISS = International Classification of Disease Illness Severity Score; ICU = intensive care unit; M = multicenter; (m)E-PASS = (modified) Estimation of Physiologic Ability and Surgical
Stress; MPM0 = Mortality Prediction Model; N = no; (P)-POSSUM = (Portsmouth)-Physiology and Operative Severity Score for the enUmeration of Morbidity and Mortality; RCRI = Revised Cardiac Risk Index; SAPS = Simplified Acute Physiology
Score; SAPSN = Simplified Acute Physiology Score-Nucleated; S = Single center; Y = Yes.
Table 4. Outcomes, Discrimination, and Calibration
Calibration (P Value
for Hosmer–Leme-
AUROC Morbidity Mortality show Statistic Unless
Author Models Used Endpoint Morbidity (%) (95% CI) (%) AUROC Mortality (95% CI) Otherwise Stated)
Atherly42 Charlson Comorbidity 30 d NR NR 1.3 0.47 NR
Index using ICD-9
coding
Brooks30 POSSUM 30 d NR NR 8.4 POSSUM: 0.92 NR
P-POSSUM P-POSSUM: 0.92 NR
Surgical Risk Scale Surgical Risk Scale: 0.89 NR
fit: 0.57
Goffi68 ASA-PS, Preoperative 30 d Overall: 26.7, NR Overall: 8.6, Combined outcome of mortality and morbidity: NR
APACHE II Elective: Elective: 4.3, ASA-PS: 0.777 NR
15.9, Emergent:
Hospital Admission APACHE II: 0.866 NR
Emergent: 20.4
57.1 Immediate preoperative APACHE II: overall: 0.894, NR
elective
surgery: 0.826, emergent surgery: 0.873, cancer
surgery: 0.915, noncancer surgery: 0.869
Hadjianastassiou70 Surgical Mortality Score Hospital discharge NR NR 4.1 0.82 (0.78–0.85) 0.10
Haga28 E-PASS, mE-PASS, Hospital discharge NR NR NR Hospital 30 d
P-POSSUM, 30 d discharge
Surgical Risk Score E-PASS 0.86 (0.79–0.93) 0.82 (0.69–0.95) NR
(Donati)
mE-PASS 0.86 (0.79–0.92) 0.81 (0.66–0.96)
Nutritional Risk Index later) Nutritional Risk Index: Nutritional Risk Index:0.797
0.659
Maastricht Index Maastricht Index: 0.671 Maastricht Index: 0.743
combined
Pillai73 Otago Surgical Audit Hospital discharge NR for 0.86 NR NR Good fit
Score validation
cohort
Regenbogen66 Surgical Apgar Score 30 d 14.1 0.73 2.3 0.81 NR
Stachon40 APACHE II Hospital discharge NR NR 24.7 APACHE II: 0.777 NR
SAPS II SAPS II: 0.785
APACHEN APACHEN: 0.829
SAPSN SAPSN: 0.823
Stachon74 DELAWARE Hospital discharge NR NR 23.3 DELAWARE: 0.813 0.44
APACHE II 0.777 NR
SAPS II 0.785 NR
APACHE II = Acute Physiology and Chronic Health Evaluation II; APACHEN = Acute Physiology and Chronic Health Evaluation-Nucleated; ASA-PS = American Society of Anesthesiologists’ Physi-
cal Status score; AUROC = area under receiver operating characteristic curve; BHOM = Biochemistry and Hematology Outcome Model; DELAWARE = Dense Laboratory Whole Blood Applied
Risk Estimation; ICD = International Classification of Diseases; ICISS = International Classification of Disease Illness Severity Score; (m)E-PASS = (modified) Estimation of Physiologic Abil-
ity and Surgical Stress; MPM0 = Mortality Prediction Model; NR = not reported; NSQIP = National Surgical Quality Improvement Program; (P)-POSSUM = (Portsmouth)-Physiology and Oper-
ative Severity Score for the enUmeration of Morbidity and Mortality; RCRI = Revised Cardiac Risk Index; SAPS = Simplified Acute Physiology Score; SAPSN = Simplified Acute Physiology Score-Nucleated.
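Table 4 reports calibration as the P value of the Hosmer–Lemeshow statistic. The sketch below is a minimal illustration, not the method of any included study: it computes the statistic over risk-ordered groups and shows why, for a fixed degree of miscalibration, the statistic grows with cohort size. The cohort is synthetic and hypothetical (ten risk strata of 20 patients, each with one more event than predicted), and the 5% critical value of 15.507 for 8 degrees of freedom is taken from standard chi-square tables.

```python
def hosmer_lemeshow(predicted, observed, groups=10):
    """Hosmer-Lemeshow chi-square statistic over risk-ordered groups.

    predicted: model-predicted probabilities (strictly between 0 and 1)
    observed:  1 if the event occurred, 0 otherwise
    Returns (statistic, degrees_of_freedom); the P value is read from a
    chi-square distribution with groups - 2 degrees of freedom.
    """
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        events = sum(y for _, y in chunk)
        # (O - E)^2 / (n_g * pbar * (1 - pbar)), where pbar = E / n_g
        statistic += (events - expected) ** 2 / (expected * (1 - expected / size))
    return statistic, groups - 2


# Hypothetical cohort: predicted risks 5%, 10%, ..., 50%, 20 patients per
# stratum, each stratum observing one more event than the model expects.
pred, obs = [], []
for i in range(1, 11):
    pred += [i / 20] * 20
    obs += [1] * (i + 1) + [0] * (19 - i)

CRITICAL_5PCT_8DF = 15.507  # chi-square critical value, 8 degrees of freedom

hl_small, dof = hosmer_lemeshow(pred, obs)       # 200 patients
hl_large, _ = hosmer_lemeshow(pred * 5, obs * 5)  # same pattern, 1,000 patients
```

With 200 patients the statistic is about 3.6, well below the critical value, so there is no evidence of lack of fit. Repeating exactly the same pattern in 1,000 patients multiplies the statistic by five and pushes it past the critical value, even though the model is no more miscalibrated than before.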
Risk Stratification Tools for Major Surgery
Outcome Model’s similarity in predictive accuracy to P-POSSUM in the one study we identified that made a direct comparison,32 this system warrants further evaluation. Finally, the Identification of Risk In Surgical patients score was developed in The Netherlands and consists of four variables (age, acuity of admission, acuity of surgery, and severity of surgery). In the study that developed and validated it on separate cohorts, the validation AUROC was 0.92.49 Again, further investigation of this simple system would be useful.

Generalizability of Findings
Clinical and Methodological Heterogeneity. Clinical heterogeneity (both within- and between-cohort patient heterogeneity) and methodological heterogeneity (between-study differences in the outcome measures used) are both likely to have had a significant influence on some of our findings. For example, between-cohort heterogeneity, and variation in how morbidity is defined (appendix 3), may explain the wide range of morbidity rates reported in different studies. Heterogeneity of morbidity definitions may also in part explain the lower accuracy of models for predicting morbidity compared with mortality. On a different note, our study included all populations of patients who were determined to be heterogeneous, using the definitions described in our methods. However, the degree of heterogeneity varied among studies, including whether or not patients of all surgical urgency categories were included, and this may have affected the predictive accuracy of models in different studies.

Objective versus Subjective Variables and Issues Surrounding Data Collection Methodology. The variables included in risk stratification tools may be classified as objective (e.g., biochemistry and hematology assays), subjective (e.g., interpretation of chest radiographs), and patient-reported (e.g., smoking history). In some clinical settings, the reliability of nonobjective data may be questionable; for example, previous reports have demonstrated significant interrater variation in the interpretation of both chest radiographs50 and electrocardiograms.51 Patients may also under- or overestimate various elements of their clinical or social history when questioned in the hospital setting. Despite these concerns, the discrimination of predictors incorporating patient-reported and patient-subjective variables was high in the studies included. This may be due to publication bias; it may

Classification of Diseases 9 and 10 administrative coding data to define the Charlson Index variables.

Limitations of This Study
This study has a number of limitations. First, the focus was on studies that measured the discrimination and/or calibration of risk stratification tools in cohorts that were heterogeneous in terms of surgical specialities; therefore, a large number of single-speciality cohort studies identified in the search were excluded from the analysis.

Second, although the inclusion criteria for our review ensured that a standard measure of discrimination was reported (AUROC or c-statistic), many studies did not report measures of calibration. However, in a systematic review such as this, calibration may be seen to be a less important measure of goodness-of-fit than discrimination for a number of reasons. Calibration can only be used as a measure of performance for models that generate an individualized predicted percentage risk of an outcome (e.g., the POSSUM systems) as opposed to summative scores, which use an ordinal scale to indicate increasing risk (e.g., the ASA-PS). Calibration drift is likely to occur over time and will be affected by changes in healthcare delivery; good calibration in a study over 30 yr ago may be unlikely to correspond to good calibration today.55,56 Although such calibration drift may affect the usefulness of a model for predicting an individual patient’s risk of outcome, poorly calibrated but highly discriminant models will still be of value for risk adjustment in comparative audit. Finally, the probability of the Hosmer–Lemeshow statistic being significant (thereby indicating poor calibration) increases with the size of the population being studied.57 This may explain why many of the large high-quality studies we evaluated did not report calibration or reported that calibration was poor.

Third, by using the AUROC as the sole measure of discrimination, a number of studies were excluded, particularly earlier articles that used correlation coefficients between risk scores and postoperative outcomes. This was felt to be necessary, as a uniform outcome measure provides clarity to the reader. Fourth, publication bias, where studies are preferentially submitted and accepted for publication if the results are positive, is likely to be a particular problem in cohort studies. Finally, despite an extensive literature search, it is possible
also be explained by the fact that in all of these studies, data that some studies which would have been eligible for inclusion
were collected prospectively by trained staff. Previous work may have been missed. Multiple strategies have been used to
has demonstrated an association between interobserver vari- prevent this; however, in a review of this size, it is possible that
ability in the recording of risk and outcome measures, and a small number of appropriate articles may have been omitted.
the level of training that data collection staff have received.52
These caveats are important when considering the generaliz- Future Directions
ability of our findings to the everyday clinical setting, where Undertaking clinical risk prediction should be a key tenet of
data reporting and interpretation may be conducted by dif- safe high-quality patient care, it facilitates informed consent and
ferent types and grades of clinical staff. Finally, concerns enables the perioperative team to plan their clinical manage-
have also been raised over the clinical accuracy of admin- ment appropriately. Equally, accurate risk adjustment is required
istrative data used for case-mix adjustment purposes.53,54 to enable meaningful comparative audit between teams and
However, one large study included in our review43 showed institutions, to facilitate quality improvement for patients and
high discriminant performance when using International providers. Although we identified dozens of scores and models
which have been used to predict or adjust for risk, very few of these achieved the aspiration of being derived from entirely preoperative data, and of being accurate, parsimonious, and simple to implement. The Surgical Risk Scale is the system that comes closest to achieving these goals; the P-POSSUM score is more accurate, but its value is limited by the fact that some of the variables are only available after surgery has been completed. Future work which might be of value would include further comparison of the Surgical Risk Scale, P-POSSUM, and objective models such as the Biochemistry and Hematology Outcome Model in international multicenter cohorts and further investigation of models which combine novel variables such as measures of functional capacity, nutritional status, and frailty.
There is another possible approach. The American College of Surgeons’ National Surgical Quality Improvement Program was created in the 1990s to facilitate risk-adjusted surgical outcomes reporting in Veterans’ Affairs hospitals, and now also includes a number of private sector institutions. Risk adjustment models are produced annually, and observed-to-expected ratios of surgical outcomes are reported back to institutions and surgical teams to facilitate quality improvement. This organization has published a number of risk calculators to help clinicians to provide informed consent and plan perioperative care. However, none of these calculators have been included in our review, as they have all been developed and validated for use in either specific types of surgery (e.g., pancreatectomy,58 bariatric,59,60 or colorectal60 surgery) or for specific outcomes (e.g., cardiac morbidity and mortality).61 A parsimonious, entirely preoperative National Surgical Quality Improvement Program model for predicting mortality in heterogeneous cohorts would be of value in the United States; its validation in international multicenter studies would also be a worthwhile endeavor.
Finally, although there are multiple studies aimed at developing and validating risk stratification tools, we do not know how widely such tools are used. Use of mobile technology, such as apps to enable risk calculation using complex equations at the bedside, might increase the use of accurate risk stratification tools in day-to-day practice. Importantly, in surgical outcomes research, there is an absence of impact studies, measuring the effect of using risk stratification tools on clinician behavior, patient outcome, and resource utilization. Randomized, controlled trials to evaluate impact, further validation of existing models across healthcare systems, and establishing the infrastructure required to facilitate such work, including the routine data collection of risk and outcome data, should be of the highest priority in health services research into surgical outcome.62

The authors thank Judith Hulf, F.R.C.A., Past President, Royal College of Anaesthetists, London, United Kingdom.
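The distinction drawn in the limitations between discrimination and calibration can be illustrated numerically. The following is a minimal sketch, not a method used in this review: the cohort is synthetic, the "drifted" model that predicts twice the true risk is a hypothetical stand-in for an outdated model, and the use of 10 equal-sized risk groups for the Hosmer–Lemeshow statistic is an assumed convention. All function names are illustrative.

```python
from typing import List, Sequence, Tuple


def auroc(risks: Sequence[float], outcomes: Sequence[int]) -> float:
    """Probability that a random patient with the event was given a higher
    predicted risk than a random patient without it (ties count one half)."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                wins += 1.0
            elif e == n:
                wins += 0.5
    return wins / (len(events) * len(non_events))


def hosmer_lemeshow(risks: Sequence[float], outcomes: Sequence[int],
                    groups: int = 10) -> float:
    """Hosmer-Lemeshow chi-square: sort by predicted risk, split into
    equal-sized groups, and sum (O - E)^2 / (E * (1 - E / n)) over groups."""
    pairs = sorted(zip(risks, outcomes))
    total = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * total // groups:(g + 1) * total // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(r for r, _ in chunk)
        observed = sum(y for _, y in chunk)
        variance = expected * (1.0 - expected / size)
        if variance > 0:
            stat += (observed - expected) ** 2 / variance
    return stat


def observed_to_expected(risks: Sequence[float], outcomes: Sequence[int]) -> float:
    """Observed events divided by the sum of predicted risks."""
    return sum(outcomes) / sum(risks)


def synthetic_cohort(copies: int = 1) -> Tuple[List[float], List[int]]:
    """Synthetic data: ten risk strata of 100 * copies patients; stratum k
    has exactly 2k deaths per 100 patients, so its true risk is 0.02 * k."""
    risks: List[float] = []
    outcomes: List[int] = []
    for k in range(1, 11):
        size = 100 * copies
        deaths = 2 * k * copies
        risks += [2 * k / 100] * size
        outcomes += [1] * deaths + [0] * (size - deaths)
    return risks, outcomes


if __name__ == "__main__":
    true_risk, died = synthetic_cohort()
    drifted = [2 * r for r in true_risk]  # hypothetical model that overpredicts
    print("AUROC, calibrated model:", auroc(true_risk, died))
    print("AUROC, drifted model:   ", auroc(drifted, died))
    print("HL, calibrated model:   ", hosmer_lemeshow(true_risk, died))
    print("HL, drifted model:      ", hosmer_lemeshow(drifted, died))
    print("O:E, drifted model:     ", observed_to_expected(drifted, died))
    big_risk, big_died = synthetic_cohort(copies=10)
    big_drifted = [2 * r for r in big_risk]
    print("HL, drifted, 10x cohort:", hosmer_lemeshow(big_drifted, big_died))
```

Because the drifted model preserves the rank order of patients, its AUROC is unchanged while its Hosmer–Lemeshow statistic and observed-to-expected ratio reveal the miscalibration. Enlarging the same cohort tenfold multiplies the Hosmer–Lemeshow statistic by ten for the same degree of miscalibration, which is the sample-size effect described above.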
Appendix 1. Preferred Reporting Items for Systematic reviews and Meta-analyses Checklist12
Reported on
Section/Topic # Checklist Item Page No.
TITLE
Title 1 Identify the report as a systematic review, meta-analysis, or both. 959
ABSTRACT
Structured summary 2 Provide a structured summary including, as applicable: background; 959
objectives; data sources; study eligibility criteria, participants, and
interventions; study appraisal and synthesis methods; results; limita-
tions; conclusions and implications of key findings; systematic review
registration number.
INTRODUCTION
Rationale 3 Describe the rationale for the review in the context of what is already 959–60
known.
Objectives 4 Provide an explicit statement of questions being addressed with refer- 960
ence to participants, interventions, comparisons, outcomes, and
study design (PICOS).
METHODS
Protocol and registration 5 Indicate if a review protocol exists, if and where it can be accessed n/a
(e.g., Web address), and, if available, provide registration information
including registration number.
Eligibility criteria 6 Specify study characteristics (e.g., PICOS, length of follow-up) and 960–1
report characteristics (e.g., years considered, language, publication
status) used as criteria for eligibility, giving rationale.
Information sources 7 Describe all information sources (e.g., databases with dates of cover- 960–1
age, contact with study authors to identify additional studies) in the
search and date last searched.
Search 8 Present full electronic search strategy for at least one database, includ- Appendix 2
ing any limits used, such that it could be repeated.
Study selection 9 State the process for selecting studies (i.e., screening, eligibility, 960
included in systematic review, and, if applicable, included in the
meta-analysis).
Data collection process 10 Describe method of data extraction from reports (e.g., piloted forms, 960
independently, in duplicate) and any processes for obtaining and
confirming data from investigators.
Data items 11 List and define all variables for which data were sought (e.g., PICOS, 960
funding sources) and any assumptions and simplifications made.
Risk of bias in individual 12 Describe methods used for assessing risk of bias of individual studies 960
studies (including specification of whether this was done at the study or outcome
level), and how this information is to be used in any data synthesis.
Summary measures 13 State the principal summary measures (e.g., risk ratio, difference in 961
means).
Synthesis of results 14 Describe the methods of handling data and combining results of n/a
studies, if done, including measures of consistency (e.g., I²) for each
meta-analysis.
Risk of bias across studies 15 Specify any assessment of risk of bias that may affect the cumulative 960
evidence (e.g., publication bias, selective reporting within studies).
Additional analyses 16 Describe methods of additional analyses (e.g., sensitivity or subgroup n/a
analyses, meta-regression), if done, indicating which were prespecified.
RESULTS
Study selection 17 Give numbers of studies screened, assessed for eligibility, and included Figure 1
in the review, with reasons for exclusions at each stage, ideally with a
flow diagram.
Study characteristics 18 For each study, present characteristics for which data were extracted Tables 1–3
(e.g., study size, PICOS, follow-up period) and provide the citations.
Risk of bias within studies 19 Present data on risk of bias of each study and, if available, any out- Table 3
come level assessment (see item 12).
Results of individual studies 20 For all outcomes considered (benefits or harms), present, for each n/a
study: (a) simple summary data for each intervention group (b) effect
estimates and CIs, ideally with a forest plot.
Synthesis of results 21 Present results of each meta-analysis done, including CIs and meas- n/a
ures of consistency.
Risk of bias across studies 22 Present results of any assessment of risk of bias across studies (see 962
Item 15).
Additional analysis 23 Give results of additional analyses, if done (e.g., sensitivity or subgroup n/a
analyses, meta-regression [see Item 16]).
DISCUSSION
Summary of evidence 24 Summarize the main findings including the strength of evidence for 965
each main outcome; consider their relevance to key groups (e.g.,
healthcare providers, users, and policy makers).
Limitations 25 Discuss limitations at study and outcome level (e.g., risk of bias), 970
and at review-level (e.g., incomplete retrieval of identified research,
reporting bias).
Conclusions 26 Provide a general interpretation of the results in the context of other 970–1
evidence, and implications for future research.
FUNDING
Funding 27 Describe sources of funding for the systematic review and other sup- 959
port (e.g., supply of data); role of funders for the systematic review.
n/a = not applicable.
Appendix 2. Search Strategy

MEDLINE
Risk adjustment.mp. or exp Health Care Reform/or exp Risk Adjustment/or exp “Outcome Assessment (Health Care)”/or exp Models, Statistical/or exp Risk/OR exp Risk Assessment/or risk prediction.mp. or exp Risk/or exp Risk Factors/OR predictive value of tests.mp. or exp “Predictive Value of Tests”/OR exp Prognosis/or risk stratification.mp. OR case mix adjustment.mp. or exp Risk Adjustment/OR severity of illness index.mp. or exp “Severity of Illness Index”/OR scoring system.mp.

Combined with:
Surgical Procedures, Operative/OR surgery.mp. or General Surgery/OR operation.mp. or exp Postoperative Complications/

Combined with:
mortality.mp. or exp Hospital Mortality/or exp Mortality/OR morbidity.mp. or exp Morbidity/OR outcome.mp. or exp Fatal Outcome/or exp “Outcome Assessment (Health Care)”/or exp “Outcome and Process Assessment (Health Care)”/or exp Treatment Outcome/OR postoperative complications.mp. or exp Postoperative Complications/OR intraoperative complications.mp. or exp Intraoperative Complications/OR exp Perioperative Care/or perioperative complications.mp. OR prognosis.mp. or exp Prognosis/.

Embase
Risk Factor/or risk adjust$.mp. OR cardiovascular risk/or high risk patient/or high risk population/or risk assessment/or risk factor OR risk stratification.mp. [mp=title, abstract, subject headings, heading word, drug trade name, original title, device manufacturer, drug manufacturer] OR *“Scoring System”/OR “Severity of Illness Index”/OR Multivariate Logistic Regression Analysis/or Logistic Regression Analysis OR logistic models/or risk assessment/or risk factors/OR exp Scoring System OR Prediction/or possum.mp. or Scoring System/OR exp Risk Assessment/or risk stratification.mp. OR predict$.mp. OR exp Quality Indicators, Health Care/OR Risk Adjustment/.

Combined with:
exp Surgery/OR exp Surgical Procedures, Operative/OR specialties, surgical/or surgery/OR surg$.mp. [mp=title, abstract, subject headings, heading word, drug trade name, original title, device manufacturer, drug manufacturer] OR peri-operative period.mp. [mp=title, abstract, subject headings, heading word, drug trade name, original title, device manufacturer, drug manufacturer] OR perioperative.mp. [mp=title, abstract, subject headings, heading word, drug trade name, original title, device manufacturer, drug manufacturer] OR postoperative.mp. [mp=title, abstract, subject headings, heading word, drug trade name, original title, device manufacturer, drug manufacturer] OR perioperative care/or intraoperative care/or postoperative care/or preoperative care.

Combined with:
complicat$.mp. [mp=title, abstract, subject headings, heading word, drug trade name, original title, device manufacturer, drug manufacturer] OR adverse outcome/or prediction/or prognosis/OR exp Postoperative Complication/co, di, ep, su, th [Complication, Diagnosis, Epidemiology, Surgery, Therapy] OR exp Perioperative Complication/or exp Perioperative Period/OR exp Mortality/or exp Surgical Mortality/OR exp Morbidity/OR outcome.mp. or “Outcome Assessment (Health Care)”/or “Outcome and Process Assessment (Health Care)” OR treatment outcome/.

Limits
1980 to August 31, 2011

Exclusions:
(“all infant (birth to 23 months)” or “all child (0 to 18 years)” or “newborn infant (birth to 1 month)” or “infant (1 to 23 months)” or “preschool child (2 to 5 years)” or “child (6 to 12 years)” or “adolescent (13 to 18 years)”) or (cats or cattle or chick embryo or dogs or goats or guinea pigs or hamsters or horses or mice or rabbits or rats or sheep or swine) or (communication disorders journals or dentistry journals or “history of medicine journals” or “history of medicine journals non index medicus” or “national aeronautics and space administration (nasa) journals” or reproduction journals) or Angioplasty, Balloon/or Angioplasty, Laser/or Angioplasty/or Angioplasty, Balloon, Laser-Assisted/or Angioplasty, Transluminal, Percutaneous Coronary/or ANGIOPLASTY.mp. OR Eye/or Ophthalmology/or Eye Diseases/or OPTHALMOLOGY.mp. or Hearing Loss OR CARDIAC SURGERY.mp. or HEART SURGERY.mp. or Myocardial Revascularization/or Coronary Artery Bypass/or CORONARY SURGERY.mp. or Coronary Artery Bypass, Off-Pump/.

Hand Searching of Reference Lists
The following keywords were searched separately on MEDLINE, Embase, and ISI Web of Science:
POSSUM + surgery
NSQIP
E-PASS
ACE-27
APACHE
In addition, the original development studies for all risk prediction models identified in the initial search were then snowballed by hand searching for citations on MEDLINE, Embase and ISI Web of Science.
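The block structure of the search strategy (alternatives within a concept block, with blocks then "combined with" one another) can be sketched as a small query builder. This is an illustrative sketch only: it assumes that "Combined with" denotes a Boolean AND between blocks, the helper names are hypothetical, and the term lists are abbreviated subsets of the MEDLINE blocks rather than the full strategy.

```python
from typing import List


def or_block(terms: List[str]) -> str:
    """Join the alternative terms of one concept block with OR."""
    if not terms:
        raise ValueError("a concept block needs at least one term")
    return "(" + " OR ".join(terms) + ")"


def combine_blocks(blocks: List[List[str]]) -> str:
    """Combine concept blocks with AND (assumed meaning of 'Combined with')."""
    return " AND ".join(or_block(block) for block in blocks)


# Abbreviated subsets of the three MEDLINE concept blocks listed in this appendix.
RISK_TERMS = ["risk adjustment.mp.", "exp Risk Assessment/", "risk stratification.mp."]
SURGERY_TERMS = ["surgery.mp.", "exp Postoperative Complications/"]
OUTCOME_TERMS = ["mortality.mp.", "morbidity.mp.", "exp Treatment Outcome/"]

if __name__ == "__main__":
    print(combine_blocks([RISK_TERMS, SURGERY_TERMS, OUTCOME_TERMS]))
```

Keeping each concept block as its own list makes it straightforward to see which terms broaden a concept (more OR alternatives) and which blocks narrow the overall result set (each additional AND).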
Dasgupta35 Detsky Index Cardiac: ischemia, congestive heart failure, new arrhythmia, or sudden death.
Edmonton Frail Respiratory: pneumonia, significant bronchospasm, deep venous thrombosis or
Scale pulmonary embolism, or the excessive need for respiratory support.
Delirium: required the acute onset and fluctuating course of at least one of the follow-
ing symptoms as outlined in the Diagnostic and Statistical Manual of Mental Disor-
ders, Revised third edition, occurring anytime on or after postoperative day 1.
Disorganized thinking or inattention, altered level of consciousness, psychomotor
agitation, disorientation or memory impairment, new perceptual disturbances, or
new sleep disturbances (e.g., agitation at night or excessive drowsiness during
the day).
If patients had a known diagnosis of dementia or were on cholinesterase inhibi-
tors, the occurrence of delirium required more than just disorientation or
memory impairment.
Davenport28 ASA-PS One or more of 21 specific NSQIP defined complications: not listed
Gawande66 Surgical Apgar According to NSQIP’s established definitions:
Score Cardiovascular: cardiac arrest requiring cardiopulmonary resuscitation, myocar-
dial infarction
Respiratory: pneumonia, unplanned intubation, pulmonary embolism, failure to
wean from the ventilator 48 h after operation
Renal: acute renal failure
Neurological: coma for 24 h or longer, stroke
Infectious: septic shock, sepsis, systemic inflammatory response syndrome
Wound: wound disruption, deep- or organ-space surgical site infection
Other: bleeding requiring >4 U red cell transfusion within 72 h after operation,
deep venous thrombosis, and vascular graft failure
Goffi69 ASA-PS Major: cardiac failure; abdominal sepsis; hemoperitoneum; respiratory failure;
APACHE II intestinal obstruction; renal failure
Minor: urinary infection; respiratory infection; wound infection
Haynes72 Surgical Apgar NSQIP defined (see study by Gawande66)
Hightower70 ASA-PS Cardiac events: myocardial ischemia without myocardial infarction; myocardial
infarction; dysrhythmias and conduction abnormalities; congestive heart fail-
ure; postoperative vasopressors; cardiac arrest with successful resuscitation
Respiratory events: prolonged intubation (>24 h from end of surgery); reintubation; acute
respiratory distress syndrome; hypoxemia; pneumonia; acute respiratory failure
Vascular events: venous thrombus; pulmonary emboli
Renal events: renal insufficiency; acute renal failure
Infectious events: wound infection; sepsis
Gastrointestinal events: gastrointestinal obstruction and/or paralytic ileus
Reoperation
Readmission
(Continued)
Appendix 3. (Continued)
Other
IHD or Preoperative Intraoperative Postoperative
Haem Biochem arrhythmia CCF COPD Neuro Renal Diabetes Cancer Factors Factors Factors
X X General poor
functional
status
Onset time
of surgery,
duration of
surgery
X* X* X X Body weight, Blood loss,
performance duration
status of surgery,
incision type
X* X* X X Performance
status
Alb Normal weight,
current weight
Height, weight,
BMI, nutritional
history,
subjective
assessments
of general
well-being, and
comorbidities
Lymphocytes Alb, Prealbumin Ideal weight
Hospital
admission
status (acute
vs. nonacute)
X† X DM
Shrinking,
decreased
grip strength,
exhaustion,
low physical
activity, slow
walking speed
Hb Ur
WCC Na
K
X X CVAR RD ID
Product of
survival
risk ratios
of all ICD-9
classification
codes
Admission type, Duration of
number of surgery,
operations, operator
preoperative grade, wound
length of stay, category
day case
vs. inpatient
surgery
(Continued)
Appendix 4. (Continued)
* Cardiac comorbidity classed as single variable. † Eagle criteria: score separately for history of angina vs. history of myocardial
infarction.
Alb = serum albumin; ALT = alanine transaminase; ASA-PS = American Society of Anesthesiologists’ Physical Status score;
APACHE II = Acute Physiology and Chronic Health Evaluation II; APACHEN = Acute Physiology and Chronic Health Evaluation-
Nucleated; BHOM = Biochemistry and Hematology Outcome Model; BMI = body mass index; CCF = congestive cardiac failure;
Chol = cholesterol; CK = creatine kinase; COPD = chronic obstructive pulmonary disease; CRP = C-reactive protein; CVAR = cerebrovascular
accident with residual deficit; DELAWARE = Dense Laboratory Whole Blood Applied Risk Estimation; DM = Any definition of
diabetes mellitus; Hb = hemoglobin; ICD = International Classification of Diseases; ICISS = International Classification of Disease
Illness Severity Score; ICU = intensive care unit; ID = insulin dependent; IHD = ischemic heart disease; IRIS = Identification of Risk
In Surgical Patients; K = potassium; (m)E-PASS = (modified) Estimation of Physiologic Ability and Surgical Stress; MPM0 = Mortality
Prediction Model; Na = serum sodium; Plt = platelet count; RCRI = Revised Cardiac Risk Index; RD = Other definition of renal
dysfunction; SAPS = Simplified Acute Physiology Score; SAPSN = Simplified Acute Physiology Score-Nucleated; TGC = serum triglycerides;
Ur = serum urea; WCC = white cell count.
Other
IHD or Preoperative Intraoperative Postoperative
Haem Biochem arrhythmia CCF COPD Neuro Renal Diabetes Cancer Factors Factors Factors
Nucleated red cell
assay added
to APACHE II
score as an
independent
variable
Nucleated red cell
assay added
to SAPS II
score as an
independent
variable
Plts ALT
WCC CK
Chol
K
TGC
CRP
Alb Acute renal
impairment,
unplanned
ICU
admission,
inflammation
17. Copeland GP, Jones D, Walters M: POSSUM: A scoring system for surgical audit. Br J Surg 1991; 78:355–60
18. Ding LA, Sun LQ, Chen SX, Qu LL, Xie DF: Modified physiological and operative score for the enumeration of mortality and morbidity risk assessment model in general surgery. World J Gastroenterol 2007; 13:5090–5
19. Carneiro AV, Leitão MP, Lopes MG, De Pádua F: [Risk stratification and prognosis in critical surgical patients using the Acute Physiology, Age and Chronic Health III System (APACHE III)]. Acta Med Port 1997; 10:751–60
20. Zhang H, Zhu D-M, Xue Z-G, Luo J-F, Jiang H: Performance of APACHE II models in surgical intensive care unit. Fudan Univ J Med Sci 2004; 31:417–20
21. Saba V, Goffi L, Jassem W, Ghiselli R, Necozione S, Mattei A, Carle F: Prognostic value of the Apache II scoring system daily preoperative use in major general surgery. Chirurgia 1997; 10:187–94
22. Martin Graczyk AI, Molina Hernandez MJ, Vazquez PC, Mora FJ, Hierro VM, Gomez PJ, Ribera Casado JM: Preoperative geriatric assessment in major surgery in the aged. Anales de Medicina Interna 1995; 12:270–4
23. Kuo HS, Chuang JH, Tang GJ, Hou CC, Chou SS, Lui WY, P’eng FK: Development of a new prognostic system and validation of APACHE II for surgical ICU mortality: A multicenter study in Taiwan. Chung Hua i Hsueh Tsa Chih - Chin Med J 1999; 62:673–81
24. Krenzien J, Roding H, Mummelthey R: Surgical risk in old age: Prospective evaluation of a prognosis index. Zentralblatt fur Chirurgie 1990; 115:717–27
25. Jones DR, Copeland GP, de Cossart L: Comparison of POSSUM with APACHE II for prediction of outcome from a surgical high-dependency unit. Br J Surg 1992; 79:1293–6
26. Davenport DL, Bowe EA, Henderson WG, Khuri SF, Mentzer RM Jr: National Surgical Quality Improvement Program (NSQIP) risk factors can be used to validate American Society of Anesthesiologists Physical Status Classification (ASA PS) levels. Ann Surg 2006; 243:636–41; discussion 641–4
27. Makary MA, Segev DL, Pronovost PJ, Syin D, Bandeen-Roche K, Patel P, Takenaga R, Devgan L, Holzmueller CG, Tian J, Fried LP: Frailty as a predictor of surgical outcomes in older patients. J Am Coll Surg 2010; 210:901–8
28. Haga Y, Ikejiri K, Wada Y, Takahashi T, Ikenaga M, Akiyama N, Koike S, Koseki M, Saitoh T: A multicenter prospective study of surgical audit systems. Ann Surg 2011; 253:194–201
29. Donati A, Ruzzi M, Adrario E, Pelaia P, Coluzzi F, Gabbanelli V, Pietropaoli P: A new and feasible model for predicting operative risk. Br J Anaesth 2004; 93:393–9
30. Brooks MJ, Sutton R, Sarin S: Comparison of Surgical Risk Score, POSSUM and p-POSSUM in higher-risk surgical patients. Br J Surg 2005; 92:1288–92
31. Sutton R, Bann S, Brooks M, Sarin S: The Surgical Risk Scale as an improved tool for risk-adjusted analysis in comparative surgical audit. Br J Surg 2002; 89:763–8
32. Neary WD, Prytherch D, Foy C, Heather BP, Earnshaw JJ: Comparison of different methods of risk stratification in urgent and emergency surgery. Br J Surg 2007; 94:1300–5
33. Dasgupta M, Rolfson DB, Stolee P, Borrie MJ, Speechley M: Frailty is associated with postoperative complications in older adults with medical problems. Arch Gerontol Geriatr 2009; 48:78–83
34. Kuzu MA, Terzioğlu H, Genç V, Erkek AB, Ozban M, Sonyürek P, Elhan AH, Torun N: Preoperative nutritional risk assessment in predicting postoperative outcome in patients undergoing major surgery. World J Surg 2006; 30:378–90
35. Copeland GP, Sagar P, Brennan J, Roberts G, Ward J, Cornford P, Millar A, Harris C: Risk-adjusted analysis of surgeon performance: A 1-year study. Br J Surg 1995; 82:408–11
36. Whiteley MS, Prytherch DR, Higgins B, Weaver PC, Prout WG: An evaluation of the POSSUM surgical scoring system. Br J Surg 1996; 83:812–5
37. Organ N, Morgan T, Venkatesh B, Purdie D: Evaluation of the P-POSSUM mortality prediction algorithm in Australian surgical intensive care unit patients. ANZ J Surg 2002; 72:735–8
38. Knaus WA, Draper EA, Wagner DP, Zimmerman JE: APACHE II: A severity of disease classification system. Crit Care Med 1985; 13:818–29
39. Charlson ME, Pompei P, Ales KL, MacKenzie CR: A new method of classifying prognostic comorbidity in longitudinal studies: Development and validation. J Chronic Dis 1987; 40:373–83
40. Stachon A, Becker A, Kempf R, Holland-Letz T, Friese J, Krieg M: Re-evaluation of established risk scores by measurement of nucleated red blood cells in blood of surgical intensive care patients. J Trauma 2008; 65:666–73
41. Charlson M, Szatrowski TP, Peterson J, Gold J: Validation of a combined comorbidity index. J Clin Epidemiol 1994; 47:1245–51
42. Atherly A, Fink AS, Campbell DC, Mentzer RM Jr, Henderson W, Khuri S, Culler SD: Evaluating alternative risk-adjustment strategies for surgery. Am J Surg 2004; 188:566–70
43. Sundararajan V, Henderson T, Perry C, Muggivan A, Quan H, Ghali WA: New ICD-10 version of the Charlson comorbidity index predicted in-hospital mortality. J Clin Epidemiol 2004; 57:1288–94
44. Haynes SR, Lawler PG: An assessment of the consistency of ASA physical status classification allocation. Anaesthesia 1995; 50:195–9
45. Grocott MP, Levett DZ, Matejowsky C, Emberton M, Mythen MG: ASA scores in the preoperative patient: Feedback to clinicians can improve data quality. J Eval Clin Pract 2007; 13:318–9
46. Aronson WL, McAuliffe MS, Miller K: Variability in the American Society of Anesthesiologists Physical Status Classification Scale. AANA J 2003; 71:265–74
47. Mak PHK, Campbell RCH, Irwin MG: The ASA physical status classification: Inter-observer consistency. Anaesth Intensive Care 2002; 30:633–40
48. Snowden CP, Prentis JM, Anderson HL, Roberts DR, Randles D, Renton M, Manas DM: Submaximal cardiopulmonary exercise testing predicts complications and hospital length of stay in patients undergoing major elective surgery. Ann Surg 2010; 251:535–41
49. Liebman B, Strating RP, van Wieringen W, Mulder W, Oomen JL, Engel AF: Risk modelling of outcome after general and trauma surgery (the IRIS score). Br J Surg 2010; 97:128–33
50. Robinson PJ, Wilson D, Coral A, Murphy A, Verow P: Variation between experienced observers in the interpretation of accident and emergency radiographs. Br J Radiol 1999; 72:323–30
51. Trzeciak S, Erickson T, Bunney EB, Sloan EP: Variation in patient management based on ECG interpretation by emergency medicine and internal medicine residents. Am J Emerg Med 2002; 20:188–95
52. Dindo D, Hahnloser D, Clavien PA: Quality assessment in surgery: Riding a lame horse. Ann Surg 2010; 251:766–71
53. Mohammed MA, Deeks JJ, Girling A, Rudge G, Carmalt M, Stevens AJ, Lilford RJ: Evidence of methodological bias in hospital standardised mortality ratios: Retrospective database study of English hospitals. BMJ 2009; 338:b780
54. Hall BL, Hirbe M, Waterman B, Boslaugh S, Dunagan WC: Comparison of mortality risk adjustment using a clinical data algorithm (American College of Surgeons National Surgical Quality Improvement Program) and an administrative data algorithm (Solucient) at the case level within a single institution. J Am Coll Surg 2007; 205:767–77
55. Copeland GP: The POSSUM system of surgical audit. Arch Surg 2002; 137:15–9
56. Tilford JM, Roberson PK, Lensing S, Fiser DH: Differences in pediatric ICU mortality risk over time. Crit Care Med 1998; 26:1737–43
57. Kramer AA, Zimmerman JE: Assessing the calibration of mortality benchmarks in critical care: The Hosmer-Lemeshow test revisited. Crit Care Med 2007; 35:2052–6
58. Parikh P, Shiloach M, Cohen ME, Bilimoria KY, Ko CY, Hall BL, Pitt HA: Pancreatectomy risk calculator: An ACS-NSQIP resource. HPB (Oxford) 2010; 12:488–97
59. Gupta PK, Franck C, Miller WJ, Gupta H, Forse RA: Development and validation of a bariatric surgery morbidity risk calculator using the prospective, multicenter NSQIP dataset. J Am Coll Surg 2011; 212:301–9
60. Cohen ME, Bilimoria KY, Ko CY, Hall BL: Development of an American College of Surgeons National Surgery Quality Improvement Program: Morbidity and mortality risk calculator for colorectal surgery. J Am Coll Surg 2009; 208:1009–16
61. Gupta PK, Gupta H, Sundaram A, Kaushik M, Fang X, Miller WJ, Esterbrooks DJ, Hunter CB, Pipinos II, Johanning JM, Lynch TG, Forse RA, Mohiuddin SM, Mooss AN: Development and validation of a risk calculator for prediction of cardiac risk after surgery/clinical perspective. Circulation 2011; 124:381–7
62. Grocott MP: Improving outcomes after surgery. BMJ 2009; 339:b5173
63. Osler TM, Rogers FB, Glance LG, Cohen M, Rutledge R, Shackford SR: Predicting survival, length of stay, and cost in the surgical intensive care unit: APACHE II versus ICISS. J Trauma 1998; 45:234–7; discussion 237–8
64. Prytherch DR, Whiteley MS, Higgins B, Weaver PC, Prout WG, Powell SJ: POSSUM and Portsmouth POSSUM for predicting mortality. Physiological and Operative Severity Score for the enUmeration of Mortality and morbidity. Br J Surg 1998; 85:1217–20
65. Gawande AA, Kwaan MR, Regenbogen SE, Lipsitz SA, Zinner MJ: An Apgar score for surgery. J Am Coll Surg 2007; 204:201–8
66. Regenbogen SE, Ehrenfeld JM, Lipsitz SR, Greenberg CC, Hutter MM, Gawande AA: Utility of the surgical apgar score: Validation in 4119 patients. Arch Surg 2009; 144:30–6; discussion 37
67. Haynes AB, Regenbogen SE, Weiser TG, Lipsitz SR, Dziekan G, Berry WR, Gawande AA: Surgical outcome measurement for a global patient population: Validation of the Surgical Apgar Score in 8 countries. Surgery 2011; 149:519–24
68. Goffi L, Saba V, Ghiselli R, Necozione S, Mattei A, Carle F: Preoperative APACHE II and ASA scores in patients having major general surgical operations: Prognostic value and potential clinical applications. Eur J Surg 1999; 165:730–5
69. Hightower CE, Riedel BJ, Feig BW, Morris GS, Ensor JE Jr, Woodruff VD, Daley-Norman MD, Sun XG: A pilot study evaluating predictors of postoperative outcomes after major abdominal surgery: Physiological capacity compared with the ASA physical status classification system. Br J Anaesth 2010; 104:465–71
70. Hadjianastassiou VG, Tekkis PP, Poloniecki JD, Gavalas MC, Goldhill DR: Surgical mortality score: Risk management tool for auditing surgical performance. World J Surg 2004; 28:193–200
71. Hobson SA, Sutton CD, Garcea G, Thomas WM: Prospective comparison of POSSUM and P-POSSUM with clinical assessment of mortality following emergency surgery. Acta Anaesthesiol Scand 2007; 51:94–100
72. Nathanson BH, Higgins TL, Kramer AA, Copes WS, Stark M, Teres D: Subgroup mortality probability models: Are they necessary for specialized intensive care units? Crit Care Med 2009; 37:2375–86
73. Pillai SB, van Rij AM, Williams S, Thomson IA, Putterill MJ, Greig S: Complexity- and risk-adjusted model for measuring surgical outcome. Br J Surg 1999; 86:1567–72
74. Stachon A, Becker A, Holland-Letz T, Friese J, Kempf R, Krieg M: Estimation of the mortality risk of surgical intensive care patients based on routine laboratory parameters. Eur Surg Res 2008; 40:263–72
75. Story DA, Fink M, Leslie K, Myles PS, Yap SJ, Beavis V, Kerridge RK, McNicol PL: Perioperative mortality risk score using pre- and postoperative risk factors in older patients. Anaesth Intensive Care 2009; 37:392–8