Diagnostic accuracy – Part 1

Basic concepts: sensitivity and specificity, ROC analysis, STARD statement
June 2009

Ana-Maria Simundic
University Department of Chemistry
University Hospital SESTRE MILOSRDNICE
School of Medicine, Faculty of Pharmacy and Biochemistry,
Zagreb University
Vinogradska 29
10 000 Zagreb
CROATIA

The discriminative ability of a diagnostic procedure is called diagnostic accuracy. It can be expressed by a number of quantitative measures, of which sensitivity and specificity are the most widely used in the biomedical literature.

Each diagnostic-accuracy measure relates to some specific aspect of a diagnostic procedure. While some measures are used to assess the discriminative property of the test, others are used to assess its predictive ability.

Discriminative measures are mostly used by health-policy decision makers; predictive measures are most useful for predicting the probability of a disease in an individual.

Some measures assess the global performance of a test, whereas others are related to its ability to detect or exclude the disease, or to the clinical significance of a positive or negative test result in a specific patient.

Furthermore, measures of test performance are not fixed indicators of test quality, but are very sensitive to the characteristics of the population in which the test accuracy is being evaluated.

Some measures largely depend on the disease prevalence, while others are highly sensitive to the spectrum of the disease in the studied population.

It is therefore of utmost importance to understand the meaning of the different measures of diagnostic accuracy and to know how to interpret them and under what conditions they may be used.

What is diagnostic accuracy

To discriminate the diseased from those who are healthy is the ultimate goal of every diagnostic procedure. What we would expect from an ideal biochemical marker is

Ana-Maria Simundic: Diagnostic accuracy – Part 1 Basic concepts: sensitivity and specificity, ROC ... Article downloaded from acutecaretesting.org
that almost all healthy individuals have values somewhere within the reference limits, whereas those who have a disease have significantly higher (less frequently lower) values of the measured parameter.

What we would expect to observe rather rarely are healthy individuals with an elevated marker concentration (the so-called false positives) as well as diseased individuals with values falling within the reference interval (false negatives).

Even though it may seem like an easy "mission", the absolutely ideal marker does not exist, and we therefore always end up with a certain proportion of individuals having a falsely elevated or falsely lowered marker concentration.

The fewer false positives and false negatives observed, the better the marker.

The only question is: how to measure this discriminative potential of a diagnostic procedure (biochemical parameter, panel of parameters, radiologic analysis or clinical exam)? How to know which procedure is better?

The discriminative ability of a diagnostic procedure is called diagnostic accuracy. It can be expressed by a number of quantitative measures, of which sensitivity and specificity are the most widely used in the biomedical literature.

Measures of diagnostic accuracy are:

• Sensitivity (Se)
• Specificity (Sp)
• Positive predictive value (PPV)
• Negative predictive value (NPV)
• Likelihood ratio (LR)
• Area under the ROC curve (AUC)
• Youden index
• Diagnostic odds ratio (DOR)

Why do we have so many measures of diagnostic accuracy

Each measure of diagnostic accuracy relates to some specific aspect of a diagnostic procedure. While some measures are used to assess the discriminative property of the test, others are used to assess its predictive ability.

Discriminative measures are mostly used by health-policy decision makers, whereas predictive measures are most useful for predicting the probability of a disease in an individual.

Some measures assess the global performance of a test, whereas others are related to its ability to detect or exclude the disease, or to the clinical significance of a positive or negative test result in a specific patient.

What is also important is that measures of test performance are not fixed indicators of test quality. On the contrary, measures of diagnostic accuracy are very sensitive to the characteristics of the population in which the test accuracy is being evaluated.

Some measures largely depend on the disease prevalence, while others are highly sensitive to the spectrum of the disease in the studied population.

It is therefore of utmost importance to understand the meaning of the different measures of diagnostic accuracy and to know how to interpret them and under what conditions they may be used.

How to assess the diagnostic accuracy of a biochemical marker

Let us imagine that we want to evaluate the diagnostic accuracy of S-100B, a new potential marker for acute ischemic stroke. How would you assess its diagnostic accuracy?

Measures of diagnostic accuracy are extremely sensitive to the design of the study that assesses the diagnostic accuracy of a given marker.

Studies suffering from major methodological shortcomings can severely over- or underestimate the indicators of test performance and limit the external validity of the study, i.e. the generalizability of its results.

The easiest and most appealing way to design a diagnostic-accuracy study is a so-called "two-gate" (case-control) study design. In such studies, patients are compared with healthy individuals.

Studies of this design have been shown to overestimate measures of diagnostic accuracy severalfold, compared with properly designed studies that use a single series of consecutive patients to evaluate the same test. The case-control study design is therefore not recommended.

In a properly designed study, patients are collected as a consecutive series of individuals in whom the target condition is suspected. The biochemical marker under evaluation is measured in all individuals presenting with disease symptoms.

Subsequently, the presence of disease is determined by performing the reference standard method for diagnosis.

In our example with a new marker (S-100B) for acute ischemic stroke, the ideal design would be as follows:

All individuals with acute ischemic stroke symptoms presenting to the Emergency department of our Neurology clinic are consecutively recruited into the study. Blood samples are drawn immediately and sent to the laboratory for S-100B concentration measurement.

All individuals undergo the same diagnostic work-up and a stroke diagnosis is made based on established criteria, equal for all patients.

Subsequently, statistical analysis is performed and measures are estimated in order to assess the power of the S-100B marker to discriminate between individuals with and without acute ischemic stroke.

A collaborative group of researchers has developed the STARD (Standards for Reporting of Diagnostic Accuracy) statement, aimed at improving the quality of reporting of studies of diagnostic accuracy.

The statement consists of a checklist of 25 items and a flow diagram that authors can use to ensure that all relevant information is present.

The aim and history of STARD as well as the STARD checklist, STARD flow diagram and many other related documents can be accessed at the official STARD website: stard-statement.org. The STARD initiative was a very important step toward the improvement of the quality of reporting of studies of diagnostic accuracy.

According to the STARD statement, a simple example of the flow diagram for our study of the diagnostic accuracy of S-100B for acute ischemic stroke would be as presented in FIGURE 1.

Calculating and interpreting sensitivity and specificity

A perfect diagnostic marker for acute ischemic stroke would have the potential to completely discriminate individuals with and without stroke. Unfortunately, as was already pointed out, such a perfect diagnostic test does not exist.

Therefore, by using a cut-off for S-100B of 0.5 µg/L, for example, we may classify study participants into four subgroups according to their marker concentrations:

• True positive (TP) – subjects having stroke and S-100B > 0.5 µg/L
• False positive (FP) – subjects without stroke and S-100B > 0.5 µg/L
• True negative (TN) – subjects without stroke and S-100B < 0.5 µg/L
• False negative (FN) – subjects having stroke and S-100B < 0.5 µg/L

The first step in calculating sensitivity and specificity is to make a 2 × 2 table with groups of subjects divided
according to a gold standard or reference method (diagnostic criteria) in columns, and categories according to the test (S-100B) in rows (TABLE 1).

Eligible stroke patients (N = 200)
  S-100B assay
    S-100B > cut-off (N = 130)
      stroke diagnostic criteria: not stroke (N = 40), stroke (N = 90)
    S-100B normal (N = 70)
      stroke diagnostic criteria: not stroke (N = 60), stroke (N = 10)

FIGURE 1: Flow diagram according to the STARD statement

                      Individuals with stroke    Individuals without stroke
S-100B > 0.5 µg/L     TP (N = 90)                FP (N = 40)
S-100B < 0.5 µg/L     FN (N = 10)                TN (N = 60)

TABLE 1: 2 × 2 table for calculating measures of diagnostic accuracy

Sensitivity (%) is the proportion of subjects with a positive test result among all subjects with the disease (TP / (TP + FN)). In other words, sensitivity is defined as the probability of getting a positive test result in subjects with the disease.

Hence, it relates to the potential of a test to identify subjects with the disease.

In our example the sensitivity is 90 % (90 / (90 + 10)) at a cut-off value for serum S-100B protein of 0.5 µg/L.

What does it mean? It means that if we measure the S-100B concentration in every individual presenting with stroke symptoms at the Emergency department of our Neurology clinic, we shall observe S-100B > 0.5 µg/L in nine out of 10 individuals in whom stroke was subsequently diagnosed, according to standard diagnostic criteria for acute ischemic stroke (gold standard).

Moreover, it also means that if we rely solely on the S-100B result, in the absence of other diagnostic options, we would miss one out of every 10 stroke
patients. The question is: are we willing to accept such diagnostic uncertainty?

So, sensitivity is a very useful measure that gives us an idea about the discriminative power of the marker and the proportion of diseased individuals missed by the marker.

However, what would be far more informative for the physician is: if a concentration of S-100B > 0.5 µg/L is measured in an individual presenting with stroke symptoms, how sure can I be that this patient has a stroke?

Unfortunately, sensitivity tells us nothing about it.

Specificity (%) is another measure of diagnostic test accuracy, complementary to sensitivity. It is defined as the proportion of subjects with a negative test result among all subjects without the disease (TN / (TN + FP)).

Analogous to sensitivity, specificity represents the probability of a negative test result in a subject without the disease.

Therefore, we can postulate that specificity relates to the aspect of diagnostic accuracy that describes the ability of the test to identify subjects without the disease, i.e. to exclude the condition of interest.

Again, let us look back at the example with stroke patients and the S-100B diagnostic marker. The specificity in our study turned out to be 60 % (60 / (60 + 40)) at a cut-off value for serum S-100B protein of 0.5 µg/L. What does it mean?

A specificity of 60 % means that if we measure the S-100B concentration in every individual presenting with stroke symptoms at the Emergency department of our Neurology clinic, a concentration of S-100B < 0.5 µg/L shall be observed in six out of 10 individuals in whom stroke was subsequently ruled out.

It also means that four out of 10 individuals without stroke shall have a falsely elevated marker concentration.

These individuals would be exposed to further diagnostic work-up and to the psychological stress related to the (spurious) probability of having a disease.

The question again is: are we willing to accept this diagnostic uncertainty? The answer is not an easy one, nor is there a unique answer to this question.

The decision on the acceptable level of diagnostic uncertainty depends on the disease characteristics, healthcare costs, the psychological impact of a missed diagnosis and many other issues.

If a disease is a serious life-threatening condition, we may not want to miss it, so maximum sensitivity shall be most suitable.

So, specificity also gives us an idea about the discriminative power of the marker. Again, as with sensitivity, what the physician would like to know is: if a concentration of S-100B < 0.5 µg/L is measured in an individual presenting with stroke symptoms, how sure can I be that this patient does not have a stroke?

Knowledge of the marker specificity does not provide exact evidence for such clinical judgments.

ROC curves

The specificity and sensitivity of every diagnostic test depend on the selected cut-off level. Therefore, a pair of diagnostic sensitivity and specificity values exists for every individual cut-off. The ROC (Receiver Operating Characteristic) curve is constructed by plotting these pairs of values on a graph with 1-specificity on the x-axis and sensitivity on the y-axis.

The shape of the ROC curve and the area under the curve (AUC) help us estimate the discriminative power of a test. The closer the curve follows the upper left-hand corner and the larger the area under the curve, the better the test is at discriminating between those with and without the disease.
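
The ROC construction described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the marker values below are invented for the example and are not data from the S-100B study, the function names roc_points and auc are hypothetical, and the area is obtained with the trapezoidal rule, which is one common choice rather than a method prescribed by the article.

```python
import math


def roc_points(diseased, healthy):
    """Return one (1 - specificity, sensitivity) pair per candidate cut-off.

    A result counts as positive when the marker value is strictly above
    the cut-off, mirroring the "S-100B > 0.5 µg/L" rule used in the text.
    """
    # Every observed value is a candidate cut-off; -inf makes everyone positive.
    cutoffs = sorted(set(diseased) | set(healthy), reverse=True) + [-math.inf]
    points = []
    for cutoff in cutoffs:
        tp = sum(value > cutoff for value in diseased)
        fp = sum(value > cutoff for value in healthy)
        sensitivity = tp / len(diseased)           # TP / (TP + FN)
        one_minus_specificity = fp / len(healthy)  # FP / (FP + TN)
        points.append((one_minus_specificity, sensitivity))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = sorted(points)
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))


# Hypothetical marker concentrations (µg/L), invented for illustration only.
stroke_values = [0.3, 0.6, 0.7, 0.9, 1.2]
no_stroke_values = [0.1, 0.2, 0.4, 0.6, 0.8]

points = roc_points(stroke_values, no_stroke_values)
print(points)
print(round(auc(points), 2))
```

Lowering the cut-off moves a point up and to the right (more true positives, but also more false positives), which is why each cut-off yields its own sensitivity and specificity pair.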

FIGURE 2: ROC curve (sensitivity on the y-axis, 1-specificity on the x-axis, both from 0 to 1; curves labelled AUC = 0.9, AUC = 0.7 and AUC = 0.5)

AUC is a global measure of diagnostic accuracy. The area under the curve may be any value between 0 and 1 and it is a good indicator of the overall quality of the test.

By comparing the areas under two ROC curves we can estimate which test is better at diagnosing a disease. A perfect diagnostic test has an AUC of 1.0, whereas a useless test has an area ≤0.5. The interpretation of the AUC is described in TABLE 2.

AUC        Diagnostic accuracy
0.9-1.0    Excellent
0.8-0.9    Very good
0.7-0.8    Good
0.6-0.7    Sufficient
0.5-0.6    Bad
<0.5       Test not useful

TABLE 2: The interpretation of AUC values

Conclusion

It is important to mention that neither sensitivity nor specificity is influenced by the disease prevalence, meaning that results from one study could easily be transferred to some other setting with a different prevalence of the disease in the population.

Nonetheless, sensitivity and specificity may vary greatly depending on the spectrum of the disease in the studied group. Sensitivity and specificity are commonly used estimates of diagnostic accuracy.

They should be well understood and carefully interpreted in order to serve as valid evidence for health care providers, clinicians and laboratory professionals, for the benefit of patient care.
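
The remark in the conclusion that sensitivity and specificity do not depend on prevalence can be checked numerically with the counts from TABLE 1. The sketch below is an illustration, not part of the original study: the second setting (ten times as many individuals without stroke, with the test behaving identically within each group) is a hypothetical scenario, and the positive predictive value is computed with its standard definition TP / (TP + FP), which this part of the article lists but does not define.

```python
def sensitivity(tp, fn):
    """Proportion of diseased subjects with a positive result: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of non-diseased subjects with a negative result: TN / (TN + FP)."""
    return tn / (tn + fp)


def ppv(tp, fp):
    """Positive predictive value (standard definition, assumed here): TP / (TP + FP)."""
    return tp / (tp + fp)


# Counts from TABLE 1 (cut-off 0.5 µg/L); prevalence in the study is 100 / 200.
tp, fp, fn, tn = 90, 40, 10, 60

# Hypothetical second setting: ten times as many individuals without stroke,
# same proportion of false positives among them; prevalence is 100 / 1100.
fp_low, tn_low = 10 * fp, 10 * tn

for label, fp_i, tn_i in [("study", fp, tn), ("low prevalence", fp_low, tn_low)]:
    print(label,
          "Se =", round(sensitivity(tp, fn), 2),
          "Sp =", round(specificity(tn_i, fp_i), 2),
          "PPV =", round(ppv(tp, fp_i), 2))
```

Sensitivity and specificity come out the same in both settings because each is calculated within a single column of the 2 × 2 table, whereas the predictive value mixes the two columns and therefore shifts with prevalence.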


Data subject to change without notice.


© Radiometer Medical ApS, 2700 Brønshøj, Denmark, 2009. All Rights Reserved.
