Accuracy of Diagnostic Tests For Cushing's Syndrome: A Systematic Review and Metaanalyses
Clinical Review
Mohamed B. Elamin, M. Hassan Murad, Rebecca Mullan, Dana Erickson, Katherine Harris,
Sarah Nadeem, Robert Ennis, Patricia J. Erwin, and Victor M. Montori
Knowledge and Encounter Research Unit (M.B.E., M.H.M., R.M., P.J.E., V.M.M.), Division of Preventive Medicine (M.H.M.), Division of
Endocrinology, Diabetes, Metabolism, Nutrition (D.E., S.N., V.M.M.), Department of Medicine (M.H.M., D.E., K.H., R.E., V.M.M.), Mayo
Clinic, and Mayo Clinic Libraries (P.J.E.), Mayo Clinic, Rochester, Minnesota 55905
Context: The diagnosis of Cushing's syndrome (CS) requires the use of tests of unregulated hypercortisolism that have unclear accuracy.
Objective: Our objective was to summarize evidence on the accuracy of common tests for diagnosing CS.
Data Sources: We searched electronic databases (MEDLINE, EMBASE, Web of Science, Scopus, and citation search for key articles) from 1975 through September 2007 and sought additional references from experts.
Study Selection: Eligible studies reported on the accuracy of urinary free cortisol (UFC), dexamethasone suppression test (DST), and midnight cortisol assays vs. reference standard in patients suspected of CS.
Data Extraction: Reviewers working in duplicate and independently extracted study characteristics and quality and data to estimate the likelihood ratio (LR) and the 95% confidence interval (CI) for each result.
Data Synthesis: We found 27 eligible studies, with a high prevalence [794 (9.2%) of 8631 patients had CS] and severity of CS. The tests had similar accuracy: UFC (n = 14 studies; LR+ 10.6, CI 5.5-20.5; LR- 0.16, CI 0.08-0.33), salivary midnight cortisol (n = 4; LR+ 8.8, CI 3.5-21.8; LR- 0.07, CI 0-1.2), and the 1-mg overnight DST (n = 14; LR+ 16.4, CI 9.3-28.8; LR- 0.06, CI 0.03-0.14). Combined testing strategies (e.g. a positive result in both UFC and 1-mg overnight DST) had similar diagnostic accuracy (n = 3; LR+ 15.4, CI 0.7-358; LR- 0.11, CI 0.007-1.57).
Conclusions: Commonly used tests to diagnose CS appear highly accurate in referral practices with samples enriched with patients with CS. Their performance in usual clinical practice remains unclear. (J Clin Endocrinol Metab 93: 1553-1562, 2008)
0021-972X/08/$15.00/0. Printed in U.S.A.
Abbreviations: CS, Cushing's syndrome; DST, dexamethasone suppression test; ROC, receiver operator characteristic; UFC, urinary free cortisol.
FIG. 1. Results of the systematic review with flow of studies for eligibility into the review and into each metaanalysis. ODST, Overnight DST.
accurate tests that are able to discriminate patients with and without hypercortisolism (3-5).
To summarize the available evidence of diagnostic accuracy of tests of abnormal cortisol overproduction, The Endocrine Society Cushing's Syndrome Task Force commissioned us to conduct a systematic review of diagnostic accuracy of diagnostic tests for CS.
Materials and Methods
The protocol of this review, approved by the Task Force, adheres to current methodological guidelines on the conduct of systematic reviews of diagnostic accuracy (6).
Eligibility criteria
We included cross-sectional and longitudinal studies that enrolled participants with true diagnostic uncertainty. Therefore, the diagnosis of CS could not be a criterion for enrollment in these studies, so-called phase II and III diagnostic studies (7). These studies may have included individuals selected because they had physical findings or comorbid conditions suggestive of CS.
Tests of interest were urinary free cortisol (UFC), serum and salivary midnight/bedtime cortisol, 1-mg overnight dexamethasone suppression test (DST) or the 2-d 2 mg DST. Eligible studies had a reference standard for diagnosing CS. Eligible reference standards included a pathological diagnosis, response to therapy targeting CS, or clinical follow-up (i.e. consensus among treating clinicians about a diagnosis of CS). Eligible studies measured the accuracy of test results with results expressed as 1) both sensitivity and specificity or 2) likelihood ratio. We included studies regardless of their publication status, language, or size.
Study identification
An expert reference librarian (P.J.E.) designed and conducted the electronic search strategy with input from study investigators with expertise in conducting systematic reviews. To identify eligible studies, we searched electronic databases (MEDLINE, EMBASE, Web of Science, Scopus, and citation search for key articles) from 1975 through September 2007. The detailed search strategy is available upon request. We also sought references from experts from The Endocrine Society Cushing's Syndrome Task Force.
Quality assessment
Reviewers working independently and in duplicate analyzed the eligible articles to assess the reported quality of the methods. We followed the tool for quality assessment of studies of diagnostic accuracy included in systematic reviews (QUADAS) (8).
Data extraction
Reviewers working independently and in pairs used a standardized form to extract a full description of study participants, including judgments about the extent of diagnostic uncertainty, the presence of comorbid conditions as eligibility criteria (not as characteristics of the sample), the tests and the procedures followed to conduct them, the cutoff or range definitions of diagnostic tests, whether these cutoffs were derived from previous research or determined by study authors, and the nature and characteristics of the reference standard used. To extract data to estimate diagnostic accuracy measures, we used the cutoffs authors chose to use in the primary studies. If more than one cutoff was reported or if the results were reported at the individual patient level, then we chose to use cutoffs that offered the best test performance.
Author contact
We sent letters to the corresponding authors (or any other author with contact address listed on the main manuscript) of each of the eligible studies by electronic mail (regular mail if we could not obtain an active e-mail). We asked these authors to verify the data we extracted and to complete missing data we could not identify in the published record. In case of no response, we repeated the request 2 wk later.
Statistical analysis
We used Meta-DiSc Software for Meta-analysis for Screening and Diagnostic tests version 1.4 (9). Using random effects metaanalyses, we pooled the sensitivities, specificities, likelihood ratios, and diagnostic odds ratio and estimated the 95% confidence intervals for the outcomes. Because the pooled sensitivity and the pooled specificity are interrelated, we focused our analyses on estimating and pooling likelihood ratios and diagnostic odds ratios. The diagnostic odds ratio of a test describes the ratio of the odds of a positive test result in patients with disease compared with patients without disease (10) and can be calculated as the ratio of the likelihood ratios for a positive and a negative test. It has the advantage of being a single indicator of test performance that provides a global meaning of agreement between a test and a reference standard and allows for pooling across studies when the main source of inconsistency is the threshold to consider a test positive [i.e. when there is a common receiver operator characteristic (ROC) curve across all studies].
Summary ROC curves allow readers to visually inspect the consistency of results across studies (answering the question of whether there is a single ROC curve across all these studies) and the accuracy of the test, as judged by the area under the summary ROC curve, in discriminating between patients with and without CS. In contrast to ROC curves in which individual data points represent different test cutoffs, in summary ROC curves, each point represents a study (11). We assessed the inconsistency among studies using the I² statistic, which represents the proportion of variability across studies that is not due to chance. I² values of
Papanicolaou, 1998 (32) | Suspicion referral^d | 35 (5-77) | 263 (75) | 240 (91.3, 83, 6, 11, 0) | 0 (0) | 59 (18.3) | Path and clinical | 21 | UFC (immunoassay, single outcome-driven cutoff), MSerC (outcome-driven cutoff)
Raff, 1998 (35) | Suspicion referral^d | 44 | 78 (NR) | 39 (50, 76.9, 12.8, 10.2, 0) | 2 (2.6) | 0 (0) | Path and clinical | NR | MSalC (assay-driven cutoff with assay sensitivity of 0.4 nmol/liter)
Ness-Abramof, 2002 (29) | No suspicion^a | 42.9 (26-69) | 86 (85) | 5 (6, 60, 20, 0, 20) | 0 (0) | 0 (0) | Path and clinical | NR | UFC (immunoassay, single assay-driven cutoff), ODST (assay-driven cutoff with assay sensitivity of 5.8 nmol/liter), LDDST (assay-driven cutoff with assay sensitivity of 5.8 nmol/liter)
Papanicolaou, 2002 (31) | Suspicion referral^d | NR | 156 (NR) | 122 (78.21, 80.33, 9.84, 9.84, 0) | 4 (2.6) | 0 (0) | Path and clinical | NR | UFC (immunoassay, single outcome-driven cutoff), MSerC (outcome-driven cutoff), MSalC (outcome-driven cutoff)
Catargi, 2003 (17) | No suspicion^a | 58.6 (22-84) | 200 (75.5) | 11 (5.5, 27.3, 72.7, 0, 0) | 3 (1.5) | NR | Path | NR | ODST (assay-driven cutoff)
Omura, 2004 (30) | No suspicion | NR | 1020 (NR) | 11 (1.1, 45.5, 54.5, 0, 0) | 0 (0) | NR | Path | NR | ODST (assay-driven cutoff)
Holleman, 2005 (23) | Suspicion referral | 40 (17-76) | 144 (78) | 17 (12, 47, 29, 24, 0) | 0 (0) | 10 (6.9) | Path and clinical | 41.2 | UFC (liquid chromatography, ROC/multiple outcome-driven cutoffs), ODST (assay-driven cutoff with assay sensitivity of 50 nmol/liter)
Liu, 2005 (26)^b | No suspicion | 61.8 | 141 (0) | 0 (0, 0, 0, 0, 0) | 0 (0) | 1 (0.7) | Path and clinical | NR | UFC, MSalC, ODST, LDDST
Reimondo, 2005 (36) | Suspicion referral | NR | 106 (71.7) | 78 (73.6, 56.4, 23.1, 19.2, 1) | 0 (0) | 0 (0) | Path and clinical | 12 | UFC (ROC/multiple outcome-driven cutoffs), MSerC (outcome-driven cutoff), ODST (outcome-driven)
Viardot, 2005 (39) | Suspicion referral | 48.8 (18-68) | 26 (69.23) | 12 (46.2, 41.67, 25, 33.3, 0) | 0 (0) | 0 (0) | Path and clinical | 6 | UFC (RIA, ROC/multiple outcome-driven cutoffs), MSalC (outcome-driven cutoff with assay sensitivity of 0.8 nmol/liter), ODST (outcome-driven cutoff)
Martin, 2006 (27) | Suspicion referral^d | 44 (17-77) | 36 (61) | 12 (33.3, 66.6, 33.3, 0, 0) | 0 (0) | 0 (0) | Path and clinical | 12 | LDDST (assay-driven cutoff with assay sensitivity of 15 nmol/liter)
Erickson, 2007 (21) | Suspicion referral | 45 | 51 (72.5) | 21 (41, 100, 0, 0, 0) | 0 (0) | 15 (27.7) | Path and clinical | 11.5-13.5 | UFC (liquid chromatography, ROC/multiple outcome-driven cutoffs)
Friedman, 2007 (22) | Suspicion referral | 36.1 | 87 (96) | 24 (27.6, 100, 0, 0, 0) | 0 (0) | 35 (40.2) | Path and clinical | 12 | UFC (liquid chromatography, single assay-driven cutoff), MSerC (assay-driven cutoff), MSalC (assay-driven cutoff)
Giraldi, 2007 (33) | Suspicion referral | 41.7 (13-92) | 4126 (76.3) | 22 (0.5, 90.9, 9.1, 0, 0) | 0 (0) | 0 (0) | Path and clinical | 29 | UFC (immunoassay, ROC/multiple outcome-driven cutoffs), MSerC (outcome-driven cutoff), ODST (outcome-driven cutoff), UFC+ODST
Giraldi, 2007 (34) | Suspicion referral^d | 36.6 (12-65) | 55 (83.6) | 32 (58.2, 91, 9, 0, 0) | 0 (0) | 0 (0) | Path and clinical | 37 | UFC (immunoassay, single assay-driven cutoff), MSerC (assay-driven cutoff with assay sensitivity of 13.5 nmol/liter), ODST (assay-driven cutoff with assay sensitivity of 13.5 nmol/liter), LDDST (assay-driven cutoff with assay sensitivity of 13.5 nmol/liter)
Reimondo, 2007 (37) | No suspicion | 61 (30-87) | 99 (37) | 1 (1, 100, 0, 0, 0) | 0 (0) | 1 (1) | Path and clinical | NR | ODST (assay-driven cutoff)
For cohort characteristics, suspicion referral indicates clinicians referred patients for further evaluation for CS, and suspicion nonreferral indicates clinicians suspected CS because of history (of diabetes or hypertension), physical examination (central obesity, easy bruising, striae, cervical or supraclavicular fat pad), or laboratory findings. For CS definition, pathological finding refers to a pituitary tumor or other tumor that stained for ACTH or cortisol, and clinical indicates clinical and laboratory follow-up leading to overt syndrome (postoperative adrenal crisis or adrenal insufficiency, need for steroid replacement, follow-up confirmation of Cushing through symptoms, signs, or tests) or rule out of the condition. CD, Cushing disease; LDDST, 2-d 2-mg DST; MSalC, midnight salivary cortisol; MSerC, midnight serum cortisol; ODST, 1-mg overnight DST; Path, pathological finding.
^a Milder CS cases with mean cortisol levels less than the median value across the studies.
^b Excluded from analysis.
^c Outcome-driven cutoff refers to investigators setting cutoffs maximizing sensitivities, specificities, or both.
^d More severe CS cases with mean cortisol levels more than the median value across studies.
1556 Elamin et al. Metaanalysis of Tests for Cushing's Syndrome J Clin Endocrinol Metab, May 2008, 93(5):1553-1562
TABLE 2. General quality assessment of studies of diagnostic accuracy included in systematic reviews (QUADAS)
Author, year (Ref.)
N, No; N, there were no uninterpretable or indeterminate results; NR, test was not reported; U, unclear; Y, yes; Y, yes, there were no withdrawals.
Downloaded from jcem.endojournals.org at Hospital Lluis Alcanyis Biblioteca on May 9, 2008
TABLE 2. (Continued)
Author, year (Ref.)
Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y
Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y
Y U Y U Y U Y Y Y Y Y Y Y Y Y Y Y
Y Y Y Y N N Y N N Y Y Y Y Y Y N N
Y Y Y Y Y Y Y N Y Y Y Y Y Y Y Y Y
Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y
Y Y Y Y Y Y Y Y N Y Y Y N Y Y Y U
Y NR N N N NR Y N N Y N Y N N N NR NR
N NR N Y N NR Y Y Y Y Y Y Y Y Y NR NR
Y NR Y Y Y NR Y Y Y Y Y Y Y Y Y NR NR
Y NR Y N Y NR Y Y U Y Y U Y U Y NR NR
N NR N N N NR Y U U N U U U N N NR NR
U NR U Y N NR N Y Y N U U N U Y NR NR
Y NR NR Y N NR NR NR N NR NR NR N N N NR NR
Y NR NR Y N NR NR NR Y NR NR NR Y Y Y NR NR
Y NR NR Y Y NR NR NR Y NR NR NR Y N Y NR NR
Y NR NR N Y NR NR NR U NR NR NR Y U Y NR NR
N NR NR N N NR NR NR U NR NR NR U N N NR NR
N NR NR Y Y NR NR NR Y NR NR NR N U Y NR NR
NR Y NR Y NR NR NR N NR Y NR NR N NR NR NR NR
NR Y NR Y NR NR NR Y NR Y NR NR Y NR NR NR NR
NR Y NR Y NR NR NR Y NR Y NR NR Y NR NR NR NR
NR N NR N NR NR NR U NR Y NR NR Y NR NR NR NR
NR N NR N NR NR NR U NR Y NR NR U NR NR NR NR
NR Y NR Y NR NR NR N NR N NR NR N NR NR NR NR
NR NR Y NR N N Y N N Y NR NR NR N N N NR
NR NR Y NR Y Y Y Y Y Y NR NR NR Y Y N NR
NR NR Y NR Y Y Y Y Y Y NR NR NR N Y Y NR
NR NR Y NR Y N Y Y U Y NR NR NR U Y N NR
NR NR U NR N N Y U U Y NR NR NR N N N NR
NR NR N NR Y Y N Y Y N NR NR NR U Y N NR
NR NR N NR NR NR NR N NR NR N NR NR NR N NR N
NR NR N NR NR NR NR Y NR NR Y NR NR NR Y NR N
NR NR Y NR NR NR NR Y NR NR Y NR NR NR Y NR N
NR NR Y NR NR NR NR Y NR NR Y NR NR NR Y NR Y
NR NR Y NR NR NR NR U NR NR U NR NR NR N NR U
NR NR Y NR NR NR NR Y NR NR Y NR NR NR Y NR N
25, 50, and 75% indicate low, moderate, and high heterogeneity, respectively (12).
Subgroup analyses
A priori hypotheses to explain potential heterogeneity among studies included severity of CS, selection bias (i.e. samples of consecutive patients with high prevalence of CS), type of patients (referred because of clinicians' suspicion of CS vs. no CS suspicion), cutoff rationale (driven by outcomes in the same sample, e.g. chosen to maximize specificity, or by the upper limit of the assay), and tests characteristics (sensitivity of assay, use of liquid chromatography vs. RIA). We tested these hypotheses using a test for interaction considering P < 0.05 as significant (13), because we did not have enough studies to conduct meta-regression (14).
Results
Study identification
Initial search of the literature yielded 1791 publications, of which 124 were potentially relevant to this review based on titles and abstracts (Fig. 1). After full text review, we found 27 eligible studies (15-41). We excluded one study from analyses because there were no CS cases in the sample (26) and excluded another study because we could not obtain essential data from the author (15).
We contacted all of the corresponding authors (another author in two studies) by electronic mail or, in five instances, by regular mail, of which 70% were successfully contacted. Ninety percent of the authors successfully contacted either contributed missing data (where these data had been collected but not reported in the format we needed for analyses) or confirmed study characteristics, quality assessments, and data as collected.
Study characteristics
Table 1 summarizes the baseline characteristics of eligible studies. Fourteen studies assessed the diagnostic accuracy of UFC, six midnight serum cortisol, four midnight salivary cortisol, 14 the 1-mg overnight DST, and eight the 2-d 2 mg DST. Of the 8631 patients enrolled in these studies, 794 (9.2%) had CS.
Study quality
Table 2 summarizes the methodological quality of the 27 included studies. Almost all studies enrolled patients with apparent diagnostic uncertainty of spectrum similar to the population in whom clinicians would use the tests in clinical practice (42). However, there is a strikingly broad range in the prevalence of CS across these studies, suggesting some degree of selection or referral bias. Their selection criteria were clearly described, and all received a reference standard that either diagnosed or excluded CS.
Metaanalyses
The appendix (published as supplemental data on The Endocrine Society's Journals Online web site at http://jcem.endojournals.org) includes tables with the test accuracy data from each included study (supplemental Tables 1-6). Table 3 shows pooled likelihood ratios for test results considered positive and negative. Table 3 also reports the diagnostic odds ratio and
Subgroup analyses
Except where noted in Table 3, all other subgroup analyses explored were not associated with significant test accuracy-subgroup interactions (see supplemental Tables 7-11).
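The test for interaction used for these comparisons (13) contrasts two estimates on the log scale. The sketch below is a minimal illustration of that arithmetic and not the authors' analysis code: the function names are hypothetical, the standard errors are back-calculated from reported 95% CIs, and the inputs are the pooled LR+ values for UFC and the 1-mg overnight DST from the abstract rather than an actual subgroup pair from Table 3. Because some studies contributed to both pooled estimates, the independence assumed by the test is not strictly met, so the example shows the calculation only.

```python
import math

def log_se_from_ci(lower, upper, z_crit=1.96):
    """Approximate SE of a log ratio from its 95% CI (assumes a symmetric CI on the log scale)."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)

def interaction_test(est1, ci1, est2, ci2):
    """Compare two independent ratio estimates on the log scale.

    Returns the ratio of the two estimates, the z statistic, and the two-sided P value.
    """
    diff = math.log(est1) - math.log(est2)
    se_diff = math.sqrt(log_se_from_ci(*ci1) ** 2 + log_se_from_ci(*ci2) ** 2)
    z = diff / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided P from the standard normal
    return math.exp(diff), z, p

# Illustrative inputs only: pooled LR+ for UFC vs. 1-mg overnight DST (abstract values)
ratio, z, p = interaction_test(10.6, (5.5, 20.5), 16.4, (9.3, 28.8))
print(f"ratio of LR+ = {ratio:.2f}, z = {z:.2f}, P = {p:.2f}")
```

With P < 0.05 as the threshold, these two pooled estimates would not be considered significantly different, which agrees with the statement in the abstract that the tests had similar accuracy.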
Sensitivity analyses
Most patients included in the metaanalysis were enrolled in a single study (33). A sensitivity analysis, in which we removed this study, revealed similar pooled accuracy results (data not shown).
Zwinderman and Bossuyt (44) have proposed the use of bivariate random-effects metaanalysis to analyze the sensitivities and specificities together from which one could derive pooled likelihood ratios, rather than pooling the likelihood ratios directly; in this data set, however, the bivariate approach yields results consistent with those presented here (data not shown).
FIG. 2. Fagan nomogram summarizing the likelihood ratios of selected tests. Use a straight edge to link pretest probability of CS with the posttest probability by crossing the likelihood ratio line at a point that describes the results obtained. The colored shadows represent the 95% confidence interval around the likelihood ratios for each of the tests. o, Overnight DST; s, midnight salivary cortisol; u, UFC. Adapted from Fagan (43).
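The nomogram performs graphically what Bayes' rule does on the odds scale: posttest odds equal pretest odds multiplied by the likelihood ratio. The sketch below is a minimal illustration under stated assumptions, not part of the published analysis: the function name is hypothetical, the pretest probability is simply the overall prevalence in the included studies (9.2%), and the likelihood ratios are the pooled values for the 1-mg overnight DST from the abstract.

```python
def posttest_probability(pretest, likelihood_ratio):
    """Convert a pretest probability to a posttest probability using a likelihood ratio."""
    if not 0 < pretest < 1:
        raise ValueError("pretest probability must be strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio cannot be negative")
    pretest_odds = pretest / (1 - pretest)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)

# Illustrative inputs only: 9.2% prevalence as pretest probability,
# pooled LR+ 16.4 and LR- 0.06 for the 1-mg overnight DST
pretest = 0.092
p_pos = posttest_probability(pretest, 16.4)  # after a positive result
p_neg = posttest_probability(pretest, 0.06)  # after a negative result
print(f"posttest probability after positive test: {p_pos:.3f}")
print(f"posttest probability after negative test: {p_neg:.3f}")
```

Under these assumptions a positive result raises the probability of CS to about 62%, and a negative result lowers it to under 1%. Chaining the function across several of these tests would treat them as independent, which the close biological relationship between the tests makes questionable.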
Discussion
Summary of findings
We conducted a systematic review and metaanalyses of studies that enrolled patients with diagnostic uncertainty and conducted a test for hypercortisolism and a satisfactory reference standard test. This review offers 1) a survey comprised of mostly small studies with high prevalence of CS from referral centers, 2) pooled test characteristics that represent the best estimates of test accuracy for each of the tests assessed and their combinations, and 3) inconsistent results across studies that are not explained by the choice of test thresholds but likely represent differences in the spectrum of patients with and without CS, in the characteristics of the tests used, and in the definitions of CS. These inconsistencies remain unexplained given the limitations in our ability to explore these differences with few studies.
In all, we found that the UFC and the overnight DST have the most evidence supporting their use for the detection of CS, with limited evidence supporting the use of salivary and serum midnight cortisol tests. Limited evidence also supports the use of these tests in combination to both identify and exclude patients with CS. In two instances in which the inconsistency across studies was important, we were able to identify potential explanations. For the midnight serum cortisol test compared with assay-driven thresholds, outcome-driven thresholds overestimated test accuracy (i.e. test interpretation was fitted to the data in the studies). For the 1-mg overnight DST, studies in which the prevalence of CS was greater than 50% (the median across studies) reported more modest test characteristics, especially more false-positive test results. This paradoxical result may be due to
chance, to a lower cortisol threshold for positivity, or to patients without CS who had other syndromes associated with impaired cortisol suppression.
Limitations and strengths
The key limitations of this review refer to the relative paucity of evidence of test accuracy for the evaluated tests and to the methodological quality of the included studies. In particular, the prevalence and severity of CS varies importantly across studies despite the authors' representation of their populations as consecutive samples of patients referred without clear diagnosis. It is also striking that these studies rarely report indeterminate cases, given how often there is residual diagnostic uncertainty even among patients evaluated in centers of excellence. Finally, the report of a single cutoff in many of these studies precludes the estimation of likelihood ratios for ranges of test results. The arbitrary choice of test threshold and the dichotomy of the test results into positive and negative may contribute to a dichotomous view of diagnosis in which patients either have or do not have CS rather than a Bayesian approach in which additional test results modify the probability that a given patient has CS.
Incomplete searching, arbitrary study selection, poor quality of the primary studies, misguided analyses, and results that cannot be applied in practice represent potential limitations of systematic reviews. The extent to which publication bias affects studies of test accuracy is unknown, and the performance of tests of publication bias in the context of heterogeneous results is problematic (45); the accuracy of the indexing of such studies in the electronic databases is also unclear (46). Yet, our overlapping search strategies and extensive input from clinical experts should have minimized the chances that we missed studies that could substantially change the inferences drawn from this study.
Our review has the strengths of systematic reviews that summarize the totality of the available evidence following a protocol-driven procedure with explicit eligibility criteria, reproducible judgments about study quality and selection, and focused analyses (47). We also provide in the appendix the data from each of the studies to facilitate readers' secondary analyses. Given our focus on samples of patients in whom there was diagnostic uncertainty (phase II and III diagnostic studies) (7), we may have successfully ameliorated the overestimation of test accuracy that results from so-called phase I diagnostic accuracy studies in which investigators evaluate the accuracy of the test in distinguishing patients with clear confirmed disease and individuals who are clearly free of disease. We were forced to use a single cutoff when many were reported from a given study with the subsequent loss of information and gain in simplicity and transparency. Yet, our analyses take into account inconsistencies associated with the choice of threshold (i.e. using the diagnostic odds ratio).
Because of our study selection criteria, this review's results do not apply to patients with adrenal incidentaloma or to patients with suspected intermittent or so-called cyclical CS. Because of the high prevalence of CS in the included studies, the applicability of this study to general practice settings or to general endocrine practices is unclear.
With these limitations and strengths, clinicians seeking to apply these results in their practice can use a Fagan nomogram to update their estimates of the probability their patients have CS (Fig. 2). Given the close biological relationship between the tests assessed here, it may be unwise to use this procedure to estimate the posttest probability when several of these tests are performed in series.
Implications for practice and research
The accompanying Endocrine Society practice guideline on the diagnosis of CS contains the practical implications of the results of this review. The Task Force recommends a particular algorithm that seeks to balance diagnostic accuracy with practical and logistical considerations.
Our systematic review has uncovered several research gaps in this area. From the laboratory perspective, laboratory and test manufacturers should seek and maintain standards for measuring cortisol in urine, serum, and saliva. Variability today introduces variability in the literature and in clinical practice and impairs clinicians' ability to apply published cutoffs and results to their practice.
From the diagnostic accuracy perspective, prospective studies of the proposed algorithm may uncover further advantages and disadvantages of the proposed approach, including the downstream consequences of patient misclassification. Further work to evaluate the accuracy of testing algorithms in consecutive patients in whom clinical features suggest CS should 1) yield more accurate estimates of the diagnostic power of test results, 2) report findings using likelihood ratios for test result ranges rather than forcing a single cutoff on the data, and 3) use diagnostic categories that include those who clearly have and do not have CS and those with indeterminate results (48). Given the low incidence of CS and the increasing incidence of conditions with similar features (truncal obesity, bone loss, hyperglycemia, and hypertension), rigorous research is likely to yield more conservative estimates of test performance than those summarized here.
For stronger recommendations in the future, guideline panels will require evidence that patients are better off in important ways when they receive a diagnosis when the disease is subtle and mild rather than when it is florid and severe. The paucity of both patients and resources mandates collaboration across centers of excellence (i.e. endocrinologists with an interest in CS working in academic medical centers) tightly integrated with their referral sources (i.e. primary care and internal medicine clinicians) to generate this much-needed research evidence.
Conclusions
Commonly used tests to diagnose CS appear highly accurate, particularly when used in combination, in referral practices with samples enriched with patients with CS. Their performance in usual clinical practice remains unclear.
Acknowledgments
We are grateful to the authors of primary studies who responded to our requests for data confirmation and missing data (F. Cavagnini, R. L. Eddy, D. Erickson, T. Friedman, W. E. Grizzle, F. Holleman, G. Leibowitz, H. Liu, K. Meeran, A. W. Meikle, R. Ness-Abramof, H. Raff, G. Reimondo, T. Reinehr, A. Viardot, and G. Vidal Trecan). We are also
grateful to the members of The Endocrine Society Task Force on Cushing's Syndrome for their expert input into the conduct and interpretation of our review.
Address all correspondence and requests for reprints to: Victor M. Montori, M.D., M.Sc., Mayo Clinic, W18A, 200 First Street SW, Rochester, Minnesota 55905. E-mail: [email protected].
This work was supported by a contract from The Endocrine Society.
Disclosure Statement: M.B.E., M.H.M., R.M., D.E., K.H., S.N., R.E., P.J.E., and V.M.M. have nothing to declare.
References
1. Lindholm J, Juul S, Jorgensen JOL, Astrup J, Bjerre P, Feldt-Rasmussen U, Hagen C, Jorgensen J, Kosteljanetz M, Kristensen LO, Laurberg P, Schmidt K, Weeke J 2001 Incidence and late prognosis of Cushing's syndrome: a population-based study. J Clin Endocrinol Metab 86:117-123
2. Arnaldi G, Angeli A, Atkinson AB, Bertagna X, Cavagnini F, Chrousos GP, Fava GA, Findling JW, Gaillard RC, Grossman AB, Kola B, Lacroix A, Mancini T, Mantero F, Newell-Price J, Nieman LK, Sonino N, Vance ML, Giustina A, Boscaro M 2003 Diagnosis and complications of Cushing's syndrome: a consensus statement. J Clin Endocrinol Metab 88:5593-5602
3. Newell-Price J, Bertagna X, Grossman AB, Nieman LK 2006 Cushing's syndrome. Lancet 367:1605-1617
4. Nieman LK, Ilias I 2005 Evaluation and treatment of Cushing's syndrome. Am J Med 118:1340-1346
5. Raff H, Findling JW 2003 A physiologic approach to diagnosis of the Cushing syndrome. Ann Intern Med 138:980-991
6. Deville WL, Buntinx F, Bouter LM, Montori VM, de Vet HC, van der Windt DA, Bezemer PD 2002 Conducting systematic reviews of diagnostic studies: didactic guidelines. BMC Med Res Methodol 2:9
7. Sackett DL, Haynes RB 2002 The architecture of diagnostic research. BMJ 324:539-541
8. Whiting P, Rutjes AW, Reitsma JB, Bossuyt PM, Kleijnen J 2003 The development of QUADAS: a tool for the quality assessment of studies of diagnostic accuracy included in systematic reviews. BMC Med Res Methodol 3:25
9. Zamora J, Abraira V, Muriel A, Khan K, Coomarasamy A 2006 Meta-DiSc: a software for meta-analysis of test accuracy data. BMC Med Res Methodol 6:31
10. Glas AS, Lijmer JG, Prins MH, Bonsel GJ, Bossuyt PM 2003 The diagnostic odds ratio: a single indicator of test performance. J Clin Epidemiol 56:1129-1135
11. Deeks JJ 2001 Systematic reviews in health care: systematic reviews of evaluations of diagnostic and screening tests. BMJ 323:157-162
12. Higgins JP, Thompson SG, Deeks JJ, Altman DG 2003 Measuring inconsistency in meta-analyses. BMJ 327:557-560
13. Altman DG, Bland JM 2003 Interaction revisited: the difference between two estimates. BMJ 326:219
14. Lijmer JG, Bossuyt PM, Heisterkamp SH 2002 Exploring sources of heterogeneity in systematic reviews of diagnostic tests. Stat Med 21:1525-1537
pituitary MRI has high sensitivity and specificity for the diagnosis of mild Cushing's syndrome and should be part of the initial workup. Horm Metab Res 39:451-456
23. Holleman F, Endert E, Prummel MF, van Vessem-Timmermans M, Wiersinga WM, Fliers E 2005 Evaluation of endocrine tests. B: screening for hypercortisolism. Neth J Med 63:348-353
24. Kreze A, Veleminsky J, Spirova E 1983 A follow-up of the low dose suppressible hypercortisolism. Endocrinol Exp 17:119-123
25. Leibowitz G, Tsur A, Chayen SD, Salameh M, Raz I, Cerasi E, Gross DJ 1996 Pre-clinical Cushing's syndrome: an unexpected frequent cause of poor glycaemic control in obese diabetic patients. Clin Endocrinol (Oxf) 44:717-722
26. Liu H, Bravata DM, Cabaccan J, Raff H, Ryzen E 2005 Elevated late-night salivary cortisol levels in elderly male type 2 diabetic veterans. Clin Endocrinol (Oxf) 63:642-649
27. Martin NM, Dhillo WS, Banerjee A, Abdulali A, Jayasena CN, Donaldson M, Todd JF, Meeran K 2006 Comparison of the dexamethasone-suppressed corticotropin-releasing hormone test and low-dose dexamethasone suppression test in the diagnosis of Cushing's syndrome. J Clin Endocrinol Metab 91:2582-2586
28. Meikle AW 1982 Dexamethasone suppression tests: usefulness of simultaneous measurement of plasma cortisol and dexamethasone. Clin Endocrinol (Oxf) 16:401-408
29. Ness-Abramof R, Nabriski D, Apovian CM, Niven M, Weiss E, Shapiro MS, Shenkman L 2002 Overnight dexamethasone suppression test: a reliable screen for Cushing's syndrome in the obese. Obes Res 10:1217-1221
30. Omura M, Saito J, Yamaguchi K, Kakuta Y, Nishikawa T 2004 Prospective study on the prevalence of secondary hypertension among hypertensive patients visiting a general outpatient clinic in Japan. Hypertens Res 27:193-202
31. Papanicolaou DA, Mullen N, Kyrou I, Nieman LK 2002 Nighttime salivary cortisol: a useful test for the diagnosis of Cushing's syndrome. J Clin Endocrinol Metab 87:4515-4521
32. Papanicolaou DA, Yanovski JA, Cutler GB, Jr., Chrousos GP, Nieman LK 1998 A single midnight serum cortisol measurement distinguishes Cushing's syndrome from pseudo-Cushing states. J Clin Endocrinol Metab 83:1163-1167
33. Pecori Giraldi F, Ambrogio AG, De Martin M, Fatti LM, Scacchi M, Cavagnini F 2007 Specificity of first-line tests for the diagnosis of Cushing's syndrome: assessment in a large series. J Clin Endocrinol Metab 92:4123-4129
34. Pecori Giraldi F, Pivonello R, Ambrogio AG, De Martino MC, De Martin M, Scacchi M, Colao A, Toja PM, Lombardi G, Cavagnini F 2007 The dexamethasone-suppressed corticotropin-releasing hormone stimulation test and the desmopressin test to distinguish Cushing's syndrome from pseudo-Cushing's states. Clin Endocrinol (Oxf) 66:251-257
35. Raff H, Raff JL, Findling JW 1998 Late-night salivary cortisol as a screening test for Cushing's syndrome. J Clin Endocrinol Metab 83:2681-2686
36. Reimondo G, Allasino B, Bovio S, Paccotti P, Angeli A, Terzolo M 2005 Evaluation of the effectiveness of midnight serum cortisol in the diagnostic procedures for Cushing's syndrome. Eur J Endocrinol 153:803-809
37. Reimondo G, Pia A, Allasino B, Tassone F, Bovio S, Borretta G, Angeli A, Terzolo M 2007 Screening of Cushing's syndrome in adult patients with newly diagnosed diabetes mellitus. Clin Endocrinol (Oxf) 67:225-229
38. Reinehr T, Hinney A, de Sousa G, Austrup F, Hebebrand J, Andler W 2007
15. Ashcraft MW, Van Herle AJ, Vener SL, Geffner DL 1982 Serum cortisol levels
Definable somatic disorders in overweight children and adolescents. J Pediatr
in Cushings syndrome after low- and high-dose dexamethasone suppression.
150:618 622, 622.e1 e5
Ann Intern Med 97:2126
39. Viardot A, Huber P, Puder JJ, Zulewski H, Keller U, Muller B 2005 Repro-
16. Barbarino A, de Marinis L, Liberale I, Menini E 1979 Evaluation of steroid
ducibility of nighttime salivary cortisol and its use in the diagnosis of hyper-
laboratory tests and adrenal gland imaging with radiocholesterol in the aetio-
logical diagnosis of Cushings syndrome. Clin Endocrinol (Oxf) 10:107121 cortisolism compared with urinary free cortisol and overnight dexamethasone
17. Catargi B, Rigalleau V, Poussin A, Ronci-Chaix N, Bex V, Vergnot V, Gin H, suppression test. J Clin Endocrinol Metab 90:5730 5736
Roger P, Tabarin A 2003 Occult Cushings syndrome in type-2 diabetes. J Clin 40. Vidal Trecan G, Laudat MH, Thomopoulos P, Luton JP, Bricaire H 1983
Endocrinol Metab 88:5808 5813 Urinary free corticoids: an evaluation of their usefulness in the diagnosis of
18. Cronin C, Igoe D, Duffy MJ, Cunningham SK, McKenna TJ 1990 The over- Cushings syndrome. Acta Endocrinol (Copenh) 103:110 115
night dexamethasone test is a worthwhile screening procedure. Clin Endocri- 41. Yanovski JA, Cutler GB, Jr., Chrousos GP, Nieman LK 1993 Corticotropin-
nol (Oxf) 33:2733 releasing hormone stimulation following low-dose dexamethasone adminis-
19. Dunlap NE, Grizzle WE, Siegel AL 1985 Cushings syndrome. Screening meth- tration. A new test to distinguish Cushings syndrome from pseudo-Cushings
ods in hospitalized patients. Arch Pathol Lab Med 109:222229 states. JAMA 269:22322238
20. Eddy RL, Jones AL, Gilliland PF, Ibarra JD, Jr., Thompson JQ, MacMurry Jr 42. Montori VM, Wyer P, Newman TB, Keitz S, Guyatt G 2005 Tips for learners
JF 1973 Cushings syndrome: a prospective study of diagnostic methods. Am J of evidence-based medicine. 5. The effect of spectrum of disease on the per-
Med 55:621 630 formance of diagnostic tests. CMAJ 173:385390
21. Erickson D, Natt N, Nippoldt T, Young Jr WF, Carpenter PC, Petterson T, 43. Fagan TJ 1975 Letter: nomogram for Bayes theorem. N Engl J Med
Christianson T 2007 Dexamethasone-suppressed corticotropin-releasing hor- 293:257
mone stimulation test for diagnosis of mild hypercortisolism. J Clin Endocrinol 44. Zwinderman AH, Bossuyt PM 2007 We should not pool diagnostic likelihood
Metab 92:29722976 ratios in systematic reviews. Stat Med 27:687 697
22. Friedman TC, Zuckerbraun E, Lee ML, Kabil MS, Shahinian H 2007 Dynamic 45. Deeks JJ, Macaskill P, Irwig L 2005 The performance of tests of publication
bias and other sample size effects in systematic reviews of diagnostic test ac- 47. Montori VM, Guyatt GH 2003 Summarizing studies of diagnostic test per-
curacy was assessed. J Clin Epidemiol 58:882 893 formance. Clin Chem 49:17831784
46. Leeflang MM, Scholten RJ, Rutjes AW, Reitsma JB, Bossuyt PM 2006 Use of 48. Montori VM, Guyatt GH 2003 Evidence-based medicine and the diagnostic
methodological search filters to identify diagnostic accuracy studies can lead process. In: Price C, Christenson R, eds. Evidence-based laboratory medicine.
to the omission of relevant studies. J Clin Epidemiol 59:234 240 Washington, DC: AACC Press; 119