11.1 Diagnostic Tests 1-S2.0-S0001299818300941-Main
The medical community often assumes that the tests we use to diagnose various diseases are accurate, safe, and effective. However, the study designs traditionally used to determine whether a diagnostic test is indeed accurate, safe, and effective are often at higher risk of bias and of lower methodological quality than those evaluating the efficacy of therapeutic interventions. Several designs can be used to study diagnostic tests, such as diagnostic accuracy cross-sectional studies, diagnostic accuracy case-control studies, and diagnostic accuracy comparative studies. Clinicians, researchers, and policy-makers may wish to consider moving toward higher quality study designs when studying new diagnostic modalities prior to their implementation in routine practice, and diagnostic randomized trials are one such alternative.
Semin Nucl Med 49:87-93 © 2018 Elsevier Inc. All rights reserved.
Introduction
The medical community often assumes that the tests we use to diagnose various diseases are accurate, safe, and effective.
https://doi.org/10.1053/j.semnuclmed.2018.11.005 87
0001-2998/© 2018 Elsevier Inc. All rights reserved.
88 M. Chasse and D.A. Fergusson
In the following sections, we will provide a brief overview of three different designs that can be used to study a new diagnostic test: (a) diagnostic accuracy cross-sectional (cohort) studies, (b) diagnostic accuracy case-control studies, and (c) diagnostic accuracy comparative studies. We will illustrate this review with a case example of a complex diagnostic situation (brainstem death) where several imaging modalities have been described to aid in the diagnosis and where an investigator may wish to identify the best imaging modality to help with clinical decision-making.

Target Condition: Neurological Determination of Death
To retrieve a vital organ from a donor for transplantation, clinicians must be 100% certain that the donor is deceased. The diagnosis of neurological death (NDD) is based on the concept of irreversible loss of the capacity for consciousness combined with the irreversible loss of all brainstem functions, including the capacity to breathe. When patients fulfill NDD criteria, they are legally declared “dead.” Traditionally, NDD is predominantly a clinical diagnosis, made by physical examination at the bedside [3,4]. There are, however, many situations where a complete and accurate clinical evaluation is impossible, and clinicians must order additional tests to confirm NDD. Various imaging modalities have been proposed to support clinicians in such situations. Radionuclide blood flow examinations (planar or SPECT), computed tomography angiography (CTA), and digital subtraction angiography are modalities that are often used by clinicians. Their use for confirming NDD raises diagnostic challenges, such as in patients with clinical NDD but with negative ancillary testing (ie, with persistence of brain blood flow). When such a scenario occurs, it is sometimes unclear whether the flow observed is significant enough to translate into brain function. It is currently recommended that in this clinical context the selected modality should be able to demonstrate the presence or absence of brain blood flow within the cerebral hemispheres and in structures within the posterior fossa [5]. It is, however, also increasingly recognized that when these tests are applied to clinically confirmed neurologically deceased patients, a proportion of them demonstrate detectable brain blood flow, suggesting an imperfect correlation between the findings at physical examination and the findings of the confirmatory test. It is thus important to correctly assess these confirmatory tests, as this imperfect correlation could leave a population of patients who would have received a diagnosis of NDD based on current clinical criteria but in whom residual blood flow actually remains (not deceased by actual diagnostic criteria) [6,7].

The Clinical Question
A critical care physician faces a situation where he cannot complete a clinical neurological determination of death because of significant facial trauma that limits the evaluation of brainstem function. He decides to order a CTA to check whether there is residual intracranial blood flow to confirm that the patient is neurologically deceased. The critical care fellow in charge of the unit that day, however, cites a case report [8] and a recent systematic review [9] stating that CTA may not be the appropriate test for this medical situation and suggests a radionuclide examination instead, citing a different review [10] and a recent study [11] that suggest better accuracy of the SPECT modality. The staff critical care physician is, however, well aware of the studies cited by the fellow and emphasizes that, when compared to clinical evaluation, SPECT had a sensitivity of only 83% in that particular study (in 17% of NDD patients flow was demonstrated) and similarly has not been appropriately validated. They agree that both tests require additional validation to decide which one is the most appropriate in this setting. Although they face a complex clinical situation in which they will have to decide with only limited quality evidence available, they contemplate how a study validating the accuracy of a diagnostic test to help with clinical decision-making around neurological death could be performed.

Definition of the Research Question
In this clinical situation, the researcher will want to compare the diagnostic accuracy of CTA and the radionuclide study with a reference standard. The population of interest in the study should be as close as possible to the population on which the tests will be applied in practice. It must also include a mix of patients who do present the “disease of interest” (neurological death in this context) as well as patients who are not diseased but are close to the target population. One could therefore consider including patients with a severe brain injury who are at high risk of neurological death. Doing so, both the sensitivity (death by imaging criteria when the patient meets clinical NDD criteria) and specificity (alive by imaging criteria when the patient does not meet clinical NDD criteria) could be determined.

To evaluate the identified imaging modalities, one must carefully select the reference standard. The reference gold standard for neurological death is considered to be based on clinical examination performed at the bedside [12]. A patient would be considered neurologically deceased when presenting with (1) an established etiology capable of causing neurological death; (2) an absence of confounders that can mimic neurological death; (3) an absence of all brainstem reflexes; and (4) a positive apnea test [3,4]. For this validation study, we would therefore have to exclude patients with contraindications to CTA or SPECT and, because the reference standard for this study will be the clinical evaluation, any patient with a confounding factor precluding complete clinical neurological evaluation. The study would therefore compare the capacity of the test to correctly classify patients that have a clinical absence of brainstem function (the clinical definition of neurological death) as deceased, and patients with residual brainstem function as alive.
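The sensitivity and specificity defined above can be computed from a 2x2 table of imaging results against the clinical reference standard. The following is a minimal sketch, not the authors' analysis: the `TwoByTwo` class, the function names, and the example counts are hypothetical, and the Wilson score interval is an added assumption used only to illustrate how uncertainty around each estimate might be reported.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class TwoByTwo:
    """Index-test results cross-classified against the clinical reference standard."""
    tp: int  # imaging reads "deceased", clinical NDD criteria met
    fn: int  # imaging reads "alive" (residual flow), clinical NDD criteria met
    fp: int  # imaging reads "deceased", clinical NDD criteria not met
    tn: int  # imaging reads "alive", clinical NDD criteria not met


def sensitivity(t: TwoByTwo) -> float:
    """Proportion of reference-positive patients classified as deceased by imaging."""
    return t.tp / (t.tp + t.fn)


def specificity(t: TwoByTwo) -> float:
    """Proportion of reference-negative patients classified as alive by imaging."""
    return t.tn / (t.tn + t.fp)


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


if __name__ == "__main__":
    # Hypothetical counts chosen for illustration only, not taken from any study.
    counts = TwoByTwo(tp=83, fn=17, fp=2, tn=98)
    sens_low, sens_high = wilson_interval(counts.tp, counts.tp + counts.fn)
    spec_low, spec_high = wilson_interval(counts.tn, counts.tn + counts.fp)
    print(f"sensitivity {sensitivity(counts):.2f} ({sens_low:.2f} to {sens_high:.2f})")
    print(f"specificity {specificity(counts):.2f} ({spec_low:.2f} to {spec_high:.2f})")
```

Because the denominators are the reference-positive and reference-negative groups, both estimates depend directly on which patients the inclusion criteria admit, which is why the choice of study population matters so much in the designs discussed here.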
Diagnostic Accuracy Studies 89
target population, and the tests were applied in different clinical settings or time frames. It is thus often hard, if not impossible, to compare the accuracy of two tests if not compared directly, just as it may be difficult to compare the accuracy of two drugs that have only been compared to placebo, but not to each other directly. It is also possible in some circumstances that the new test is more accurate than the reference standard (more sensitive or specific). When this is the case, it can be difficult or impossible to know for sure if the reference or the new test provided the correct diagnosis.

Diagnostic Accuracy Case-Control Studies
It is sometimes hard to find a unique set of criteria that will allow enrollment of patients in diagnostic test studies. Sometimes, researchers will use two sets of criteria to conduct a diagnostic accuracy study: one set of criteria to identify known cases, and one set of criteria to identify healthy controls. This design is often referred to as a diagnostic accuracy case-control study (Fig. 2) [14,17]. Both groups of patients will undergo the studied test and the reference standard. The group of known cases will be used to calculate sensitivity, the group of healthy controls (or negative cases) will be used to compute specificity, and the results will be reported as one cohort. Although it may sound advantageous for research feasibility reasons, this design is at higher risk of bias than a diagnostic accuracy cross-sectional study or a comparative study [17,18].

When applied to our clinical scenario, the researchers may decide to conduct a study that will enroll patients that have already been declared neurologically deceased by clinical criteria, then enroll patients who are comatose but are known to still have residual brainstem function in the control group. Scintigraphy would then be applied to all enrolled patients and accuracy statistics calculated. It is thus obvious that in this context, the patients with “borderline” criteria for neurological death would not be included in that study. It would therefore be unclear how the test would perform in this population, limiting its generalizability in clinical practice where these tests are mostly applied in such “borderline cases.”

Figure 2 Diagnostic accuracy case-control studies.

Strengths and Limitations
This design is very similar to the diagnostic cross-sectional study except for patient selection. The decision to enroll patients with different inclusion criteria may simplify patient identification and optimize the number of patients to be enrolled by ensuring an equal mix of diseased and nondiseased patients. Doing so will, however, have important effects on the interpretation of the results. When designing a diagnostic accuracy study, researchers need to select a population that is representative of the population the test will be applied to in the clinical setting. Such is the case when using a unique set of inclusion criteria that represents the target population to obtain a mix of deceased and not deceased patients at different stages of the disease of interest, thus giving information about the accuracy of the test for a larger scope of patients. When using a case-control design, the researcher will select two different populations, thus restricting the spectrum of disease included in the study. This will result in inflated sensitivity and specificity measures and lead to overestimation of the test accuracy. The additional challenges of selecting an appropriate unique set of inclusion criteria (and thus avoiding this design) are usually worth the effort to ensure the most accurate and clinically meaningful results.

Diagnostic Accuracy Comparative Studies
Comparative diagnostic accuracy studies can, in addition to measuring the accuracy of a new diagnostic test, provide comparative accuracy between two tests, and can allow for the measurement of meaningful clinical outcomes.

There are three main design variations that will allow direct comparisons of diagnostic tests [2,17]. All designs use only one set of inclusion criteria to enroll patients. In the first situation, the included patients will all undergo the two tests to be compared, as well as the accepted reference standard (Fig. 3). It will then be possible to obtain accuracy statistics for each test by comparing them to the reference standard, as well as to compare the two tests directly. For this design to be feasible, the conduct of each individual test must not affect the course of the target disease, and the result of one of the tests must not affect the performance of the other [16]. Such interference may occur, for example, when the two tests require the use of a contrast agent that will then affect the interpretation of the other test. This design is also more demanding for the patient, who may be exposed to several tests for the same target condition and thus to additional potential risks of adverse events. Also, the patients willing to consent and be enrolled in such a study may be different from those who decline, thus changing the characteristics of the enrolled patients and making the enrolled cohort of patients different from the intended target population.
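The fully paired design described above, in which every patient undergoes both index tests and the reference standard, can be sketched as follows. This is an illustrative sketch under stated assumptions: the `Patient` record, the function names, and the small cohort are hypothetical, and the exact McNemar test on discordant pairs is a technique added here for illustration; the article does not prescribe a specific statistical method for the direct comparison.

```python
from math import comb
from typing import NamedTuple


class Patient(NamedTuple):
    cta_dead: bool      # CTA read as "no intracranial flow"
    spect_dead: bool    # SPECT read as "no intracranial flow"
    clinical_ndd: bool  # reference standard: clinical NDD criteria met


def sens_spec(pairs: list[tuple[bool, bool]]) -> tuple[float, float]:
    """Sensitivity and specificity from (index_positive, reference_positive) pairs.

    Assumes at least one reference-positive and one reference-negative patient.
    """
    tp = sum(1 for idx, ref in pairs if idx and ref)
    fn = sum(1 for idx, ref in pairs if not idx and ref)
    tn = sum(1 for idx, ref in pairs if not idx and not ref)
    fp = sum(1 for idx, ref in pairs if idx and not ref)
    return tp / (tp + fn), tn / (tn + fp)


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant-pair counts."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2**n
    return min(1.0, 2 * tail)


def compare_sensitivity(cohort: list[Patient]) -> tuple[int, int, float]:
    """Compare the two tests among reference-positive patients only."""
    diseased = [p for p in cohort if p.clinical_ndd]
    b = sum(1 for p in diseased if p.cta_dead and not p.spect_dead)
    c = sum(1 for p in diseased if not p.cta_dead and p.spect_dead)
    return b, c, mcnemar_exact(b, c)


if __name__ == "__main__":
    # Hypothetical cohort for illustration only.
    cohort = (
        [Patient(True, True, True)] * 6
        + [Patient(True, False, True)] * 1
        + [Patient(False, True, True)] * 3
        + [Patient(False, False, False)] * 5
        + [Patient(True, False, False)] * 1
    )
    print("CTA  ", sens_spec([(p.cta_dead, p.clinical_ndd) for p in cohort]))
    print("SPECT", sens_spec([(p.spect_dead, p.clinical_ndd) for p in cohort]))
    print("discordant pairs and p-value:", compare_sensitivity(cohort))
```

Only the discordant pairs carry information about which test is more sensitive, which is one reason a paired design can be efficient when both tests can safely be performed in the same patient.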
provide it differently, or they may affect the clinical decision-making in ways not always easy to predict. By randomizing patients in two or more groups, and by blinding the clinicians to the study group (but not the result of the diagnostic test), it becomes possible to measure patient clinical outcomes. Randomization will distribute unmeasured confounding factors between groups, hopefully isolating the effect of the diagnostic test on the decision-making and clinical outcome, as in standard intervention studies. When possible, blinding will also greatly reduce the risk of biases associated with the clinician knowing the study group. A new test may, for example, be more sensitive for a target condition. Although this may seem desirable at first sight, this hypothetical test may only detect the disease at stages where it has no clinical consequences. The clinicians may decide to treat these preclinical conditions. One can imagine circumstances where a patient could be exposed to a potentially unnecessary treatment (which may have some risks), with no outcome improvement, thus increasing risks to the patient and costs to the health system. For example, a current controversy exists around this issue in the realm of imaging of pulmonary emboli. While tomographic SPECT imaging is more accurate and will detect additional and smaller emboli than planar imaging, there is a real concern that treating patients with more trivial disease will expose them to the risks of anticoagulation without any real clinical benefits [19,20].

For our clinical scenario, in addition to the diagnostic accuracy of scintigraphy compared to the clinical brainstem evaluation, the clinicians were also interested in the diagnostic accuracy of CTA. Because neither scintigraphy nor CTA has been appropriately validated and considered suitable enough to be used as a reference standard, a comparative design could be selected. If feasible, comatose patients with no factors limiting a complete clinical brainstem evaluation could be enrolled, and scintigraphy, CTA, and a clinical brainstem evaluation could be performed consecutively. This would allow the assessment of the comparative accuracy of both tests between each other and compared to the reference standard. One must be careful that the conduct of the two tests is not too much of a burden for such patients. The included patients could then be randomized into two groups. One group would be assessed using CTA and the other group using scintigraphy, and all patients would undergo a formal clinical brainstem evaluation. In addition to standard accuracy statistics, the clinician could then be informed of only the result of the CTA or the result of scintigraphy, while kept blinded to the study group, and then use these results for clinical decision-making. In this example, the number of patients who undergo organ donation, the time from patient identification to organ donation, or clinician and family satisfaction could be compared between groups to provide meaningful clinical outcomes, in addition to determination of the accuracy of the tests.

Discussion
To make the best medical decisions for their patients, clinicians rely heavily on the interpretation of diagnostic test studies. Unfortunately, the quality requirements of diagnostic studies are different than for medical interventions. Yet incorrectly interpreting the results of such tests has the potential to greatly affect patient care, as it affects the medical diagnosis itself and thus all the therapeutic decisions that inevitably follow. We described a number of potential diagnostic accuracy study designs, each having strengths and weaknesses. We however argue that in most circumstances, randomized designs will provide all the required information to first assess the accuracy of a test, with the added benefit of reduced risk of biases, and can provide additional valuable information regarding patient outcomes.

Although it is possible to obtain a high-quality diagnostic cross-sectional study by selecting a representative sample of the target population using a unique set of inclusion criteria, by keeping the interpretation of the studied test and reference standard blinded to each other, and by following strict study flow and procedure, it is very hard to directly compare the accuracy of two tests if they were not assessed in the same study, and it is nearly impossible to assess patient clinical outcomes. By including randomization in the study design, it becomes possible to directly measure clinical outcomes in addition to diagnostic accuracy. This enables the medical community to confirm whether, in addition to a comparable or better accuracy, these results translate into improved clinical outcome. It also becomes easier to compare different randomized studies to each other by contrasting the observed clinical outcomes of each trial.

It is important to note that the conduct of a diagnostic accuracy study that meets all quality criteria (including appropriate patient selection and study flow, and blinding) can be complex. Adding randomization to this design may not significantly increase its complexity while significantly increasing its expected output. Randomization will inevitably reduce the power of the study and require an increased sample size to detect clinically meaningful outcomes. The study will therefore be more expensive and require additional data management. To improve efficiency, it has recently been suggested that rather than randomizing patients between two test groups, all patients would undergo both tests of interest. Patients would then undergo treatment A if both test results were positive and treatment B if both test results were negative. For pairs where the two tests disagree, treatment assignment would be decided by randomization between the two treatment arms [21].

The added benefits of conducting a randomized diagnostic test study may, however, be worth these limitations, and such studies should be the first design considered when planning to study a new diagnostic test to be used in clinical practice.

Conclusion
There are a high number of new diagnostic tests made available each year, with most of these tests being associated with
higher costs, for an unclear improvement in patient outcome. Traditional diagnostic accuracy case-control or cross-sectional study designs can be at high risk of bias (for the former) or at risk of providing a limited amount of clinically meaningful information (for the latter). It is critical that diagnostic tests be considered like any other intervention. They should be studied using the same standards as new therapeutic interventions, standards that not only ensure that the new test is “as accurate as” the former test, but also that its use is associated with improved patient outcomes. Randomized diagnostic test studies should be considered as the primary design to study new tests to ensure the best and appropriate care for our patients.

References
1. Rodger M, Ramsay T, Fergusson D: Diagnostic randomized controlled trials: The final frontier. Trials 13:137, 2012. https://doi.org/10.1186/1745-6215-13-137
2. Bossuyt PM, Lijmer JG, Mol BW: Randomised comparisons of medical tests: Sometimes invalid, not always efficient. Lancet 356:1844-1847, 2000. https://doi.org/10.1016/S0140-6736(00)03246-3
3. Shemie SD, Doig C, Dickens B, et al: Severe brain injury to neurological determination of death: Canadian forum recommendations. CMAJ 174:S1-S13, 2006. https://doi.org/10.1503/cmaj.045142
4. Gardiner D, Shemie S, Manara A, et al: International perspective on the diagnosis of death. Br J Anaesth 108 (suppl 1):i14-i28, 2012. https://doi.org/10.1093/bja/aer397
5. Shemie SD, Lee D, Sharpe M, et al: Brain blood flow in the neurological determination of death: Canadian expert report. Can J Neurol Sci 35:140-145, 2008. http://www.ncbi.nlm.nih.gov/pubmed/18574925. Accessed February 4, 2013
6. Roberts DJ, MacCulloch KA, Versnick EJ, et al: Should ancillary brain blood flow analyses play a larger role in the neurological determination of death? Can J Anaesth 57:927-935, 2010. https://doi.org/10.1007/s12630-010-9359-4
7. Lessard MR, Brochu JG: Challenges in diagnosing brain death. Can J Anaesth 57:882-887, 2010. https://doi.org/10.1007/s12630-010-9361-x
8. Greer DM, Strozyk D, Schwamm LH: False positive CT angiography in brain death. Neurocrit Care 11:272-275, 2009. https://doi.org/10.1007/s12028-009-9220-1
9. Taylor T, Dineen RA, Gardiner DC, et al: Computed tomography (CT) angiography for confirmation of the clinical diagnosis of brain death. Cochrane Database Syst Rev 3:CD009694, 2014. http://ovidsp.ovid.com/ovidweb.cgi?T=JS&PAGE=reference&D=medl&NEWS=N&AN=24683063
10. Joffe AR, Lequier L, Cave D: Specificity of radionuclide brain blood flow testing in brain death: Case report and review. J Intensive Care Med 25:53-64, 2010. http://ovidsp.ovid.com/ovidweb.cgi?T=JS&PAGE=reference&D=medl&NEWS=N&AN=20095080
11. Suarez-Kelly LP, Patel DA, Britt PM, et al: Dead or alive? New confirmatory test using quantitative analysis of computed tomographic angiography. J Trauma Acute Care Surg 79:995-1003, 2015. https://doi.org/10.1097/TA.0000000000000831. discussion 1003
12. Heran MKS, Heran NS, Shemie SD: A review of ancillary tests in evaluating brain death. Can J Neurol Sci 35:409-419, 2008. http://www.ncbi.nlm.nih.gov/pubmed/18973057. Accessed February 4, 2013
13. Bossuyt PMM: Interpreting diagnostic test accuracy studies. Semin Hematol 45:189-195, 2008. https://doi.org/10.1053/j.seminhematol.2008.04.001
14. Bossuyt P, Leeflang M: The Cochrane Collaboration. Developing criteria for including studies. Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy Version 0.4 [Updated September 2008]. The Cochrane Collaboration, 1-7, 2008
15. Chasse M, Glen P, Doyle M-A, et al: Ancillary testing for diagnosis of brain death: A protocol for a systematic review and meta-analysis. Syst Rev 2:100, 2013. https://doi.org/10.1186/2046-4053-2-100
16. Whiting PF, Rutjes AWS, Westwood ME, et al: QUADAS-2: A revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med 155:529-536, 2011. https://doi.org/10.1059/0003-4819-155-8-201110180-00009
17. Leeflang MM, Davenport CF, Takwoingi YDJ: Eligibility designs. Lesson 2.2: Cochrane Collaboration DTA Online Learning Materials. 2014 The Cochrane Collaboration. http://training.cochrane.org. Published Accessed August 1, 2018
18. Whiting PF, Rutjes AWS, Westwood ME, et al: Quadas-2: A Quality Assessment Tool for Diagnostic Accuracy Studies. University of Bristol, http://www.bris.ac.uk/quadas/resources/ Published 2018
19. Le Gal G, Righini M, Parent F, et al: Diagnosis and management of subsegmental pulmonary embolism. J Thromb Haemost 4:724-731, 2006. https://doi.org/10.1111/j.1538-7836.2006.01819.x
20. Yoo HHB, Queluz THAT, El Dib R: Anticoagulant treatment for subsegmental pulmonary embolism. Cochrane Database Syst Rev 2016:CD010222. https://doi.org/10.1002/14651858.CD010222.pub3
21. Lu B, Gatsonis C: Efficiency of study designs in diagnostic randomized clinical trials. Stat Med 32:1451-1466, 2013. https://doi.org/10.1002/sim.5655