Examining Embedded Validity Indicators in Conners Continuous Performance Test-3 (CPT-3)
Anna S. Ord, Holly M. Miskey, Sagar Lad, Beth Richter, Kristina Nagy & Robert D. Shura
To cite this article: Anna S. Ord, Holly M. Miskey, Sagar Lad, Beth Richter, Kristina Nagy & Robert D. Shura (2021) Examining embedded validity indicators in Conners continuous performance test-3 (CPT-3), The Clinical Neuropsychologist, 35:8, 1426-1441, DOI: 10.1080/13854046.2020.1751301
CONTACT Anna S. Ord [email protected] W.G. Hefner VA Medical Center, 1601 Brenner Ave.,
Salisbury, NC 28144-2515, USA
This work was authored as part of the Contributor’s official duties as an Employee of the United States Government and is therefore a work
of the United States Government. In accordance with 17 U.S.C. 105, no copyright protection is available for such works under U.S. Law.
THE CLINICAL NEUROPSYCHOLOGIST 1427
Similarly, Lange et al. (2013) compared the CPT-II scores of individuals who passed symptom validity tests with those who failed in a TBI sample. They found that the T scores with the highest AUCs for identifying invalid performance in individuals with mild TBI were Commission errors (COM; AUC = 0.786), Perseverations (PER; AUC = 0.767), and Omission errors (OMI; AUC = 0.749). When these measures were combined, there was only a very small increase in sensitivity (from .26 to .29). Further, Busse and Whiteside (2012) examined the classification accuracy of OMI and COM on the CPT-II and found that classification accuracy was acceptable for OMI (AUC = 0.76) but poor for COM (AUC = 0.64). When the variables were combined (OMI + COM), classification accuracy improved slightly (AUC = 0.77), but sensitivity did not significantly increase (Busse & Whiteside, 2012). Finally, using a veteran sample, Shura et al. (2016) examined several CPT-II scores (Total Errors, PER, OMI, and COM), all of which significantly predicted pass/fail status on Green's Word Memory Test (COM AUC = 0.68; OMI AUC = 0.73; PER AUC = 0.73; Total Errors Raw AUC = 0.75), with Total Errors showing the best sensitivity (.41) while maintaining specificity of .92 (Shura et al., 2016).
The most extensive work evaluating EVIs within the CPT-II has been conducted by Erdodi and colleagues. First, Erdodi et al. (2014) compared the CPT-II scores of participants who passed validity measures with those who failed, and found the largest effect sizes for the following subtests: OMI, COM, HRT SE, Variability (VAR), and PER. Detection properties were examined for all five scales, and AUCs ranged between 0.61 and 0.74 at cutoffs that produced acceptable specificity of 0.90 or above. COM had the lowest AUC (0.61-0.62), whereas OMI had the highest (0.74). The authors also developed two composite scores based on these five subscales, representing a conservative approach (CVI-5A, aimed at maximizing specificity) and a more liberal approach (CVI-5B, aimed at maximizing sensitivity). The aggregate indices (CVI-5s) slightly increased overall classification accuracy, with AUC values ranging from 0.67 to 0.75 at cutoffs that produced acceptable specificity above 0.90 (Erdodi et al., 2014).
In a later pediatric study, Erdodi et al. (2017) examined five clinical scales of the CPT-II (OMI, COM, HRT, PER, and HRT Block Change [HRT-BC]) and concluded that most of these scores (with the exception of COM) had acceptable signal detection properties, with sensitivity values ranging between 0 and 0.67 (OMI = 0.67, HRT = 0.33, PER = 0, HRT-BC = 0.33) and specificity between 0.86 and 1.00 (AUC values were not provided). The investigators used these four scores (OMI, HRT, PER, and HRT-BC) to create a composite CPT-II Validity Indicator (CVI-4), which produced a sensitivity of 0.50 at a specificity of 0.86 (Erdodi et al., 2017). Finally, Erdodi et al. (2018) examined the classification accuracy of OMI, COM, HRT SE, VAR, PER, CVI-5A, and CVI-5B. AUC values ranged between 0.55 and 0.84 for the individual CPT-II subscales against several reference PVTs, with OMI producing the highest AUC values (0.68-0.84) and COM the lowest (0.55-0.70). The composite indicators produced AUCs ranging from 0.74 to 0.80, with CVI-5A producing slightly higher AUC values than CVI-5B (Erdodi et al., 2018).
In summary, previous studies have examined the ability of several CPT-II indices to differentiate between valid and invalid performance. However, no published studies to date have replicated these findings using the updated, third version of this measure, the CPT-3 (Conners, 2014). Given that the Conners CPT and CPT-II have been identified among the most commonly used psychological assessment instruments (Rabin et al., 2005), it is important to cross-validate and update the extant CPT/CPT-II EVI literature using the CPT-3. Thus, the primary aim of the present study was to evaluate the utility of the EVIs identified in the CPT-II using the updated version of this measure, the CPT-3.
Method
Data were collected by retrospective chart review in a specialty outpatient ADHD
evaluation clinic at a Mid-Atlantic VA Medical Center. The study was reviewed and
approved by the VA Medical Center IRB.
Participants
Participants were 197 veterans evaluated in an outpatient ADHD specialty clinic between August 2014 and June 2017. The study sample consisted of consecutive referrals to the standardized, evidence-based specialty evaluation clinic. There were no exclusion criteria for referral to the specialty clinic other than referral sources not completing the consult correctly or patients being inappropriate for the clinic (e.g., a diagnostic question of attention problems related to PTSD or dementia). The only exclusion criterion for this study was failure to complete the Test of Memory Malingering Trial 1 (TOMM1) or the CPT-3. The specialty clinic was structured to provide an efficient, standardized, and evidence-based assessment of ADHD for veterans (Shura et al., 2017) for treatment purposes (e.g., stimulant medications) but not to serve as a basis for academic accommodations. Patients were mostly referred by psychiatry or primary care. The evaluation included a review of medical records, clinical interview, mental status examination, objective tests of attention, and self-report symptom measures, though the evaluation was adjusted at times based on clinical need. All participants in the current study were administered the CPT-3 and the TOMM1, which was utilized as the primary stand-alone PVT in the clinic. The evaluations took approximately 3 hours and were completed by neuropsychologists or supervised pre-doctoral interns and post-doctoral trainees. Demographics of the final sample are presented in Table 1. Participants ranged in age from 23 to 78, were predominantly male (85%), and on average had some college education (range: 8 to 19 years of education). Demographic data were collected from ADHD assessment reports and electronic medical records. A total of 107 participants (54.31%) were diagnosed with ADHD. In the present sample, 43 participants (21.8%) failed TOMM1. No significant differences on any demographic variables were observed between participants who passed TOMM1 and those who failed. One participant failed the embedded validity check outlined in the CPT-3 manual (Conners, 2014) by exceeding the 25% threshold for omission error rate.
Measures
Conners continuous performance test, 3rd edition (CPT-3)
The CPT-3 is a 14-minute, computer-based assessment that examines various characteristics of an individual's attention, including impulsivity, inattention, sustained attention, and vigilance (Conners, 2014). During administration, respondents are asked to press the space bar or mouse button when any target letter is presented on the screen and to inhibit responding when the non-target letter (X) is presented (Conners, 2014). Results yield several T scores describing the examinee's response style, ability to discriminate targets from non-targets, error types, and reaction time statistics (Conners, 2014). Specifically, the CPT-3 yields the following nine scores: detectability (d'), omission errors (OMI), commission errors (COM), perseverations (PER), hit reaction time (HRT), HRT Standard Deviation (SD), variability (VAR), HRT block change (BC), and HRT inter-stimulus interval change (ISIC). The first score, d', is a measure of detectability, or the ability to discriminate targets (non-X) from non-targets (X). Omission errors represent the rate of missed targets, whereas commission errors are incorrect responses to non-targets. Perseverations are responses provided in less than 100 milliseconds following the presentation of a stimulus. HRT reflects average response speed, while HRT SD reflects the consistency of response speed.
Table 2. Independent samples t-tests comparing valid and invalid neuropsychological profiles on CPT-3 subscales (total sample N = 197).

CPT-3 T scores                                  Pass TOMM1 (n = 154) M (SD)   Fail TOMM1 (n = 43) M (SD)   t       p       d
d'                                              52.86 (8.96)                  57.28 (9.76)                 -2.80   .006    0.47
Omissions                                       49.77 (8.80)                  53.88 (12.78)                -2.43   .016    0.37
Commissions                                     53.80 (9.37)                  57.53 (9.02)                 -2.33   .021    0.40
Perseverations                                  51.18 (10.14)                 54.37 (13.99)                -1.40   .168    0.26
Hit reaction time (HRT)                         52.22 (10.06)                 55.47 (12.35)                -1.78   .077    0.29
HRT standard deviation (SD)                     50.40 (9.76)                  59.26 (12.86)                -4.89   <.001   0.78
Variability                                     50.19 (9.87)                  53.88 (10.33)                -2.14   .033    0.37
HRT block change                                51.01 (9.68)                  52.42 (12.15)                -0.79   .428    0.13
HRT inter-stimulus interval change (HRT ISIC)   51.58 (9.89)                  59.12 (10.43)                -4.36   <.001   0.74

Note. TOMM1 = Test of Memory Malingering Trial 1. Negative t values reflect higher (worse) mean T scores in the TOMM1-fail group. Scores set in bold in the original (d', Omissions, Commissions, HRT SD, and HRT ISIC) are significant after adjustment for familywise error using the False Discovery Rate (FDR).
cut-off score of 42. Denning (2012) validated the use of Trial 1 of the TOMM with veterans and determined that this single-trial administration provides a more efficient measure of test validity compared with the full administration (specificity = 92%, sensitivity = 72%, overall hit rate = 87%). Similarly, a more recent study by Fazio et al. (2017) found that Trial 1 was more diagnostically accurate than the full administration of the TOMM. In the present study, participants were administered Trial 1 of the TOMM (TOMM1) using the published cutoff of <42 for invalid performance (Denning, 2012, 2014; Martin et al., 2020).
Results
All analyses were conducted using SAS Enterprise Guide 7.1. To reduce Type I error due to multiple comparisons, the false discovery rate (FDR; Benjamini & Hochberg, 1995) was used to determine significant outcomes (step-down approach), controlling the FDR at p < .05. Independent samples t-tests were conducted to compare the CPT-3 scores of participants who passed TOMM1 with those who failed. Compared with the group who passed TOMM1, the group who failed performed significantly worse on the following subscales: d', OMI, COM, HRT SD, and HRT ISIC. Effect sizes (Cohen's d) ranged from 0.37 to 0.78. Results of all independent t-tests are presented in Table 2.
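The FDR correction above can be sketched in a few lines. This is a generic illustration of the standard Benjamini-Hochberg procedure with invented p-values, not the authors' SAS code:

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg FDR procedure: returns a reject/retain flag
    for each p-value, controlling the false discovery rate at q."""
    m = len(p_values)
    # Rank the p-values from smallest to largest, tracking positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject the k_max smallest p-values; retain the rest.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        reject[idx] = rank <= k_max
    return reject

# Hypothetical p-values from a family of five comparisons.
flags = benjamini_hochberg([0.005, 0.011, 0.02, 0.04, 0.5])
# flags -> [True, True, True, True, False]
```

Unlike a Bonferroni correction, the threshold grows with the rank of each p-value, which preserves power when several comparisons in the family are genuinely significant.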
Next, logistic regression analyses were conducted to determine whether each of
the nine CPT-3 scores individually would significantly predict pass/fail status on the
TOMM1. Area-under-the-curve (AUC) analyses were also conducted for each measure.
The following five variables were found to significantly predict validity status on the
TOMM1: d’, OMI, COM, HRT SD, and HRT ISIC. Among these measures, HRT SD and
HRT ISIC were identified as the scores with the highest AUC values (0.72 and 0.71
respectively). Results of these logistic regression analyses are presented in Table 3.
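Although the analyses were run in SAS, the AUC reported for each score can be illustrated with a small rank-based sketch: the AUC of a single predictor against a binary pass/fail outcome equals the probability that a randomly chosen TOMM1-fail case has a higher (worse) T score than a randomly chosen pass case. The T scores and function name below are invented for illustration:

```python
def auc_from_groups(fail_scores, pass_scores):
    """Empirical AUC: fraction of (fail, pass) pairs in which the failing
    examinee has the higher T score; ties count as half."""
    wins = 0.0
    for f in fail_scores:
        for p in pass_scores:
            if f > p:
                wins += 1.0
            elif f == p:
                wins += 0.5
    return wins / (len(fail_scores) * len(pass_scores))

# Hypothetical HRT SD T scores for two small groups.
fail = [66, 59, 72, 55]
passed = [48, 52, 61, 50, 45]
print(auc_from_groups(fail, passed))  # -> 0.9
```

An AUC of 0.5 indicates chance-level discrimination; the values near .72 reported here for HRT SD and HRT ISIC represent modest but acceptable separation.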
Table 3. Logistic regression and AUC results for CPT-3 scores and composite EVIs (N = 197).

CPT-3 T scores                                  b        S.E.    Wald    p       AUC     LLCI    ULCI
d'                                              -0.051   0.019   7.32    .007    .648    .554    .742
Omissions (OMI)                                 -0.035   0.015   5.27    .022    .616    .519    .714
Commissions (COM)                               -0.042   0.018   5.19    .023    .631    .541    .720
Perseverations (PER)                            -0.023   0.014   2.68    .102    .564    .470    .658
Hit reaction time (HRT)                         -0.027   0.015   3.05    .081    .572    .470    .673
HRT standard deviation (HRT SD)                 -0.068   0.016   17.73   <.001   .720    .637    .804
Variability (VAR)                               -0.034   0.016   4.33    .038    .617    .525    .708
HRT block change (HRT BC)                       -0.014   0.017   0.63    .426    .526    .414    .639
HRT inter-stimulus interval change (HRT ISIC)   -0.067   0.017   15.31   <.001   .714    .627    .801

Composite EVI
CEVI-1                                          -0.040   0.009   18.88   <.001   .741    .662    .820
CEVI-2                                          -0.029   0.010   7.81    .005    .655    .566    .745
CEVI-3                                          -0.025   0.006   17.33   <.001   .744    .668    .819
CEVI-4                                          -0.019   0.005   16.05   <.001   .727    .647    .808
CEVI-5                                          -0.010   0.003   12.43   <.001   .687    .668    .819
CEVI-6                                          -0.382   0.156   5.96    .015    .612    .526    .697
CEVI-7                                          -0.266   0.121   4.81    .028    .601    .509    .694
CEVI-8                                          -0.442   0.123   12.93   <.001   .679    .593    .766
CEVI-9                                          -0.428   0.155   7.61    .006    .636    .550    .722
CEVI-10                                         -0.467   0.151   9.56    .002    .629    .542    .716

Note. AUC = area under the curve; EVI = embedded validity indicator; CEVI = composite embedded validity indicator; LLCI/ULCI = lower/upper limits of the 95% confidence interval for the AUC; bold font in the original indicates significance after adjustment for familywise error using the False Discovery Rate (FDR).
CEVI-1 = HRT SD + HRT ISIC (sum of T scores).
CEVI-2 = OMI + COM (sum of T scores).
CEVI-3 = OMI + COM + HRT SD + HRT ISIC (sum of T scores).
CEVI-4 = d' + OMI + COM + HRT SD + HRT ISIC (sum of T scores).
CEVI-5 = d' + OMI + COM + PER + HRT + HRT SD + VAR + HRT BC + HRT ISIC (sum of T scores).
CEVI-6 = OMI > 65T, COM > 65T, HRT SD > 65T, VAR > 65T, PER > 70T.
CEVI-7 = OMI > 60T, COM > 60T, HRT SD > 60T, VAR > 60T, PER > 60T.
CEVI-8 = d' > 60T, OMI > 60T, COM > 60T, HRT SD > 60T, HRT ISIC > 60T.
CEVI-9 = d' > 65T, OMI > 65T, COM > 65T, HRT SD > 65T, HRT ISIC > 65T.
CEVI-10 = d' > 65T, OMI > 60T, COM > 68T, HRT SD > 63T, HRT ISIC > 63T.
Subsequently, sensitivity, specificity, and positive and negative predictive values were calculated for the five significant predictors (Table 4). The base rates included in Table 4 were drawn from a paper synthesizing stand-alone PVT failure rates across 50 veteran and military studies (Denning & Shura, 2019), with the lowest rates found in research samples, the highest in forensic samples, and clinical samples in between. We examined the cutoffs previously used in the literature for the CPT-II (> 60 and > 65) for each score, and we also identified optimal cutoffs for our sample. These optimal cutoffs maximized sensitivity at a specificity level of at least 0.90 (Lippa, 2018). The only exception to this approach was the OMI score: although the cutoff that maximized sensitivity at a specificity of 0.90 would have been 58, we decided to increase this cutoff to avoid the "invalid before impaired" phenomenon (Erdodi & Lichtenstein, 2017). A score of 58 falls within the normal performance range, whereas the "elevated" range starts at a score of 60 (Conners, 2014); we therefore moved the optimal cutoff for this variable to 60. The optimal T-score cutoffs for all significant predictors within our sample were as follows: d' > 65, OMI > 60, COM > 68, HRT SD > 63, and HRT ISIC > 63. At these cutoffs, all scores produced acceptable specificity of at least 0.90, with sensitivities ranging
Table 4. Sensitivity, specificity, and predictive values at various cutoffs of CPT-3 embedded validity indicators (N = 197).

                                                    Base rate 50%   Base rate 30%   Base rate 15%
CPT-3 variables     % above cutoff   SN     SP      PPV    NPV      PPV    NPV      PPV    NPV      LR+    LR-
d'
  > 60T             24.9             0.40   0.79    0.66   0.57     0.45   0.75     0.25   0.88     1.90   0.53
  > 65T             12.2             0.19   0.90    0.66   0.53     0.45   0.72     0.25   0.86     1.90   0.53
Omissions
  > 58T             13.2             0.23   0.90    0.70   0.54     0.50   0.73     0.29   0.87     2.30   0.43
  > 60T             9.6              0.16   0.92    0.67   0.52     0.46   0.72     0.26   0.86     2.00   0.50
  > 65T             —                0.12   0.94    0.67   0.52     0.46   0.71     0.26   0.86     2.00   0.50
Commissions
  > 60T             27.9             0.37   0.75    0.60   0.54     0.39   0.74     0.21   0.87     1.48   0.68
  > 65T             14.2             0.19   0.87    0.59   0.52     0.39   0.71     0.21   0.86     1.46   0.68
  > 68T             10.2             0.16   0.92    0.67   0.52     0.46   0.72     0.26   0.86     2.00   0.50
HRT standard deviation (SD)
  > 60T             19.8             0.35   0.84    0.69   0.56     0.48   0.75     0.28   0.88     2.19   0.46
  > 63T             12.7             0.26   0.91    0.74   0.55     0.55   0.74     0.34   0.87     2.89   0.35
  > 65T             9.1              0.23   0.95    0.82   0.55     0.66   0.74     0.45   0.87     4.60   0.22
HRT ISI change
  > 60T             19.3             0.42   0.87    0.76   0.60     0.58   0.78     0.36   0.89     3.23   0.31
  > 63T             13.7             0.28   0.90    0.74   0.56     0.55   0.74     0.33   0.88     2.80   0.36
  > 65T             12.2             0.23   0.91    0.72   0.54     0.52   0.73     0.31   0.87     2.56   0.39

Note. CPT-3 = Conners Continuous Performance Test, 3rd Edition; SN = sensitivity; SP = specificity; PPV = positive predictive value; NPV = negative predictive value; LR+ = positive likelihood ratio; LR- = negative likelihood ratio; HRT = hit reaction time; ISI = inter-stimulus interval. Bold font in the original indicates an "optimal" recommended cutoff value that produces maximum sensitivity at specificity of at least 0.90.
from 0.16 to 0.28. HRT SD and HRT ISIC produced the highest sensitivity values (0.26
and 0.28 respectively).
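The predictive values in Table 4 follow from Bayes' theorem once sensitivity, specificity, and an assumed base rate are fixed. A minimal sketch, using the HRT SD > 65T row as input (the function names are ours, not part of the published analysis):

```python
def predictive_values(sn, sp, base_rate):
    """Convert sensitivity/specificity to PPV and NPV at a given
    base rate of invalid performance (Bayes' theorem)."""
    tp = sn * base_rate              # true positives
    fp = (1 - sp) * (1 - base_rate)  # false positives
    fn = (1 - sn) * base_rate        # false negatives
    tn = sp * (1 - base_rate)        # true negatives
    return tp / (tp + fp), tn / (tn + fn)

def positive_lr(sn, sp):
    """Positive likelihood ratio: SN / (1 - SP)."""
    return sn / (1 - sp)

# HRT SD > 65T (Table 4): SN = .23, SP = .95.
ppv, npv = predictive_values(0.23, 0.95, 0.50)
# ppv rounds to 0.82 and npv to 0.55, matching the 50% base-rate
# columns; positive_lr(0.23, 0.95) rounds to 4.6, matching LR+.
```

Because PPV and NPV shift with the assumed base rate, the same cutoff that looks useful in a forensic setting (high base rate) can be uninformative in a research sample (low base rate), which is why Table 4 reports all three.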
To examine whether various combinations of scores would improve classification accuracy, a follow-up exploratory analysis was conducted. At this step, a number of composite embedded validity indicators (CEVIs) were created and examined, broadly based on approaches utilized in previous studies with the CPT-II (e.g., Busse & Whiteside, 2012; Erdodi et al., 2014). CEVI-1 combined the T scores of the two significant predictors with the highest AUC values in the previous analysis (HRT SD and HRT ISIC). CEVI-2 summed the T scores of the two scores (OMI and COM) that have been consistently identified in the published literature as potential EVIs in the CPT-II, with some research indicating that combining these two variables increases classification accuracy (e.g., Busse & Whiteside, 2012). For CEVI-3, the T scores of the four aforementioned indices (HRT SD, HRT ISIC, OMI, and COM) were combined. CEVI-4 combined the T scores of all five significant predictors identified in the logistic regression analyses. CEVI-5 summed the T scores of all nine CPT-3 indices. To our knowledge, this approach utilizing sums of T scores is somewhat innovative and has not yet been widely used in studies examining composite EVIs within CPTs.
CEVI-6 and CEVI-7 were generally based on the methodology Erdodi et al. (2014) used when creating the CVI-5 indices for the CPT-II, which included OMI, COM, HRT SE, VAR, and PER at various cutoffs. Using a similar methodology, we created the CEVI-8, CEVI-9, and CEVI-10 indices from the significant predictors identified in our sample (d', OMI, COM, HRT SD, and HRT ISIC) at various cutoffs.
Table 5. Sensitivity, specificity, and predictive values at various cutoffs for CPT-3 composite embedded validity indicators (N = 197).

                                                       Base rate 50%   Base rate 30%   Base rate 15%
Composite EVI   Cutoff   % above cutoff   SN     SP    PPV    NPV      PPV    NPV      PPV    NPV      LR+    LR-
CEVI-1          >126     13.7             0.28   0.90  0.74   0.56     0.55   0.74     0.34   0.88     2.86   0.35
CEVI-2          >123     11.7             0.19   0.90  0.66   0.53     0.45   0.72     0.25   0.86     1.91   0.52
CEVI-3          >237     13.7             0.28   0.90  0.74   0.56     0.55   0.74     0.34   0.88     2.86   0.35
CEVI-4          >299     13.2             0.26   0.90  0.72   0.55     0.53   0.74     0.32   0.87     2.62   0.38
CEVI-5          >535     12.7             0.23   0.90  0.70   0.54     0.51   0.73     0.30   0.87     2.38   0.42
CEVI-6          >1       9.1              0.21   0.94  0.78   0.54     0.61   0.74     0.39   0.87     3.60   0.28
CEVI-7          >2       11.2             0.19   0.91  0.67   0.53     0.47   0.72     0.27   0.86     2.04   0.49
CEVI-8          >3       7.11             0.23   0.90  0.70   0.54     0.50   0.73     0.29   0.87     2.33   0.43
CEVI-9          >1       15.7             0.30   0.89  0.73   0.56     0.54   0.75     0.32   0.88     2.73   0.37
CEVI-10         >1       18.3             0.35   0.87  0.73   0.57     0.53   0.76     0.32   0.88     2.68   0.37

Note. CPT-3 = Conners Continuous Performance Test, 3rd Edition; EVI = embedded validity indicator; CEVI = composite embedded validity indicator; SN = sensitivity; SP = specificity; PPV = positive predictive value; NPV = negative predictive value; LR+ = positive likelihood ratio; LR- = negative likelihood ratio.
CEVI-1 = HRT SD + HRT ISIC (sum of T scores).
CEVI-2 = OMI + COM (sum of T scores).
CEVI-3 = OMI + COM + HRT SD + HRT ISIC (sum of T scores).
CEVI-4 = d' + OMI + COM + HRT SD + HRT ISIC (sum of T scores).
CEVI-5 = d' + OMI + COM + PER + HRT + HRT SD + VAR + HRT BC + HRT ISIC (sum of T scores).
CEVI-6 = OMI > 65T, COM > 65T, HRT SD > 65T, VAR > 65T, PER > 70T.
CEVI-7 = OMI > 60T, COM > 60T, HRT SD > 60T, VAR > 60T, PER > 60T.
CEVI-8 = d' > 60T, OMI > 60T, COM > 60T, HRT SD > 60T, HRT ISIC > 60T.
CEVI-9 = d' > 65T, OMI > 65T, COM > 65T, HRT SD > 65T, HRT ISIC > 65T.
CEVI-10 = d' > 65T, OMI > 60T, COM > 68T, HRT SD > 63T, HRT ISIC > 63T.
CEVI-8 and CEVI-9 used the > 60 and > 65 cutoffs, respectively, for all measures, and CEVI-10 used the optimal cutoffs identified in the previous analyses (see Table 4). A summary of all composite EVIs is as follows:

CEVI-1 = HRT SD + HRT ISIC (sum of T scores)
CEVI-2 = OMI + COM (sum of T scores)
CEVI-3 = OMI + COM + HRT SD + HRT ISIC (sum of T scores)
CEVI-4 = d' + OMI + COM + HRT SD + HRT ISIC (sum of T scores)
CEVI-5 = d' + OMI + COM + PER + HRT + HRT SD + VAR + HRT BC + HRT ISIC (sum of T scores)
CEVI-6 = OMI > 65T, COM > 65T, HRT SD > 65T, VAR > 65T, PER > 70T
CEVI-7 = OMI > 60T, COM > 60T, HRT SD > 60T, VAR > 60T, PER > 60T
CEVI-8 = d' > 60T, OMI > 60T, COM > 60T, HRT SD > 60T, HRT ISIC > 60T
CEVI-9 = d' > 65T, OMI > 65T, COM > 65T, HRT SD > 65T, HRT ISIC > 65T
CEVI-10 = d' > 65T, OMI > 60T, COM > 68T, HRT SD > 63T, HRT ISIC > 63T
All composite EVIs were able to significantly differentiate between pass and fail sta-
tus on the TOMM1 after adjusting for familywise error (Table 3). The three scores with
the highest AUC values were: CEVI-3 (0.744), CEVI-1 (0.741), and CEVI-4 (0.727). Overall,
results indicated that combining individual scores improved classification accuracy.
Optimal cutoffs for composite EVIs were also identified (Table 5). At acceptable specifi-
city levels (0.87-0.94), sensitivity values ranged from 0.19 to 0.35.
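For concreteness, the two styles of composite can be computed directly from a T-score profile. The profile below is hypothetical; the cutoffs are those reported in Tables 4 and 5 (CEVI-1 flags at a sum > 126, CEVI-10 at more than one index above its cutoff):

```python
def cevi_1(t):
    """CEVI-1: sum of the HRT SD and HRT ISIC T scores (flag if > 126)."""
    return t["HRT SD"] + t["HRT ISIC"]

def cevi_10_failures(t):
    """CEVI-10: number of indices above the sample-optimal cutoffs
    (flag if more than one index exceeds its cutoff)."""
    cutoffs = {"d'": 65, "OMI": 60, "COM": 68, "HRT SD": 63, "HRT ISIC": 63}
    return sum(t[name] > cut for name, cut in cutoffs.items())

# Hypothetical CPT-3 T-score profile.
profile = {"d'": 62, "OMI": 61, "COM": 55, "HRT SD": 66, "HRT ISIC": 64}
print(cevi_1(profile), cevi_1(profile) > 126)  # 130 True -> flagged
print(cevi_10_failures(profile))               # 3 -> flagged (> 1)
```

Note that the sum-of-T composites retain information about how far each score falls above or below its cutoff, whereas the failure-count composites discard it, which is consistent with the poorer AUCs observed for the count-based indices.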
Discussion
The present study aimed to update the CPT-II embedded validity indicators for the newer version, the CPT-3, using a sample of veterans referred for treatment at an ADHD clinic. On the CPT-3, veterans who failed TOMM1 made significantly more omission (OMI) and commission errors (COM), demonstrated poorer discrimination between targets and non-targets (d'), were more inconsistent in their response speed (HRT SD), and were less efficient at processing stimuli (HRT ISIC) than those who passed TOMM1. Consistent with previous research, OMI and COM significantly predicted PVT failure (Lange et al., 2013; Ord et al., 2010) but earned poor AUC values. HRT SD and HRT ISIC were the strongest individual predictors and the only indices with acceptable AUC values (.720 and .714, respectively). Perseverations, a variable that has previously discriminated between passing and failing PVTs (e.g., Lange et al., 2013), was not significant in this study. These deviations may be due to differences in sample characteristics (an ADHD clinical sample rather than a TBI sample). In addition, HRT SD is a newer CPT-3 variable that replaced the CPT-II's HRT SE, a variable that was among the best predictors (e.g., Ord et al., 2010). A change in the ratio of targets to non-targets on the CPT-3 (Conners, 2014) may also have produced different outcomes for several CPT-3 variables, including d', OMI, and COM. These results highlight that previously established CPT-II validity indicators do not translate directly to the CPT-3 and indicate the need for updated research.
The individual CPT-3 variables that significantly predicted PVT performance were combined in several ways in an effort to identify the most accurate embedded validity indicators. Based on AUC, the approaches that summed T scores were the best predictors, particularly those including HRT SD and HRT ISIC (CEVI-1 = .741; CEVI-3 = .744; CEVI-4 = .727). CEVI-1 and CEVI-3 additionally had the best sensitivity and specificity of the multi-variable composite EVIs. However, those models did not outperform HRT ISIC alone at > 63T. Models that relied on cut scores for each included index (e.g., > 65T) performed poorly, with AUC values under .70. Two of these models were designed to parallel the Erdodi et al. (2014) CVI-5 models for the CPT-II. Our composite index based on their more conservative CVI-5A model (CEVI-6) performed better than our index based on their liberal CVI-5B model (CEVI-7); however, both earned AUC values below .70. Inclusion of the Perseverations score in these models likely reduced their performance, as Perseverations was not a significant predictor in our analyses. In the spirit of the Erdodi et al. (2014) design, we also tested models built from the five variables that were significant in our sample with both conservative (CEVI-8) and liberal (CEVI-9) cutoffs. Although these CEVIs significantly differentiated participants who passed TOMM1 from those who failed, their AUCs were below .70.
Overall, CEVI-1, CEVI-3, CEVI-4, HRT SD, and HRT ISIC performed best and were relatively equivalent. Of note, this study presents three types of EVIs: cutoffs applied to clinical scales (e.g., HRT SD), cutoffs applied to continuous variables based on combinations of scores (e.g., CEVI-1, which adds the T scores of HRT SD and HRT ISIC), and cutoffs applied to a number of failures (e.g., CEVI-6). The number-of-failures method has been established in prior studies with the CPT-II; however, all indices that used this approach in this sample produced poorer AUC values than the other two types of indices. Additionally, this approach allows two minimally failed scores to result in an overall failure. Thus, these options performed worse than the other indices and might be considered less useful in practical situations. The use of a single clinical scale has the benefit of ease of use, as additional scores do not need to be calculated by hand. Thus, in most clinical situations, interpreting HRT SD or HRT ISIC independently might be most acceptable. However, such an approach does not consider multivariate base rates, and an individual who is actually impaired but valid might have a personal weakness in one domain only and could be falsely identified as invalid when only one score is interpreted. In the current study, the indices that combine T scores provide the best all-around options given that they have the highest AUCs and sensitivities, help address the multivariate base-rate issue, and partially account for severity of failure. Based on these data, we recommend the use of CEVI-1 (the sum of the HRT SD and HRT ISIC T scores) at a cutoff of > 126 until additional research suggests otherwise. However, clinicians should use their own judgment to select a priori which value to use based on the specific context of the evaluation, as well as clinical or research needs. Tables 4 and 5 provide predictive values at different base rates to help clinicians make informed decisions.
The present study is not free of limitations. First, the study utilized only one stand-alone performance validity test, the TOMM1. Although prior research has suggested that failure on even one PVT may indicate performance invalidity (Proto et al., 2014), future studies examining EVIs within the CPT-3 might include combinations of multiple PVTs or the Slick criteria to strengthen the "known groups" design. Additional PVTs would provide stronger evidence regarding the validity of each participant's performance. Moreover, deriving and validating novel PVT metrics in the same dataset carries the inherent limitation of overfitting the data and thus overestimating sensitivity. Findings of the present study should therefore be interpreted in this context: the present study provides a preliminary first step in the examination of EVIs within the CPT-3, but further research is needed to replicate and validate our findings.
Limitations imposed by the characteristics of the study participants should also be considered. Findings will be most applicable to adult veteran samples and adult ADHD outpatient clinics. Future studies might consider different populations, as there is a need to validate EVIs across populations and clinical presentations. Although the sample included individuals with various mental health diagnoses, mild TBI histories, combat exposure, and ADHD, the mean scores across CPT-3 indices were in the average range regardless of pass/fail status on the TOMM1, suggesting relatively good overall attention functioning. This was somewhat unexpected in an ADHD clinic and likely lowered the suggested CEVI cutoffs; replication with additional samples that have higher variability in CPT-3 scores is recommended to confirm or adjust these values. Although the mean for the invalid TOMM1 group was within the average range, the standard deviation suggests that many participants did score in the elevated ranges on CPT-3 indices. Finally, in the present study only one participant failed the embedded validity check described in the CPT-3 manual (Conners, 2014) by exceeding the 25% threshold for omission error rate. Consequently, we were unable to compare the CPT-3 scores of participants who passed and failed this embedded validity check in the present sample. Future studies may compare this validity check with validated stand-alone and embedded PVTs.
In conclusion, the Conners CPT-II and CPT-3 are widely used by neuropsychologists to assess attention and related disorders. In line with previously published research (Lange et al., 2013), our results suggest that several CPT-3 variables may double as embedded validity indicators. However, due to their low sensitivity, they should not be used in isolation; as Lange et al. (2013) observed, these scores would be "largely useful to rule in, not rule out" invalid performance. In test batteries using the CPT-3, including one of the identified embedded validity measures can help achieve the standard of comprehensive validity assessment across an evaluation (Boone, 2009). Given the preliminary nature of these findings, clinicians are cautioned against assuming invalid performance based solely on the indicators described in this study. The identified CPT-3 scores may be useful as one component in a multivariate, multi-point continuous approach to performance validity sampling.
Author note
Anna S. Ord, PsyD, Mid-Atlantic Mental Illness Research, Education, and Clinical Center
(MA-MIRECC), Research & Academic Affairs Service Line, Salisbury VA Medical Center,
Salisbury, North Carolina; Holly M. Miskey, PhD, MA-MIRECC, Mental Health &
Behavioral Sciences Service Line, Salisbury VA Medical Center, Salisbury, North
Carolina, and Department of Neurology, Wake Forest School of Medicine, Winston-
Salem, North Carolina, and Department of Psychiatry, Via College of Osteopathic
Medicine, Blacksburg, VA; Sagar S. Lad, PsyD, MA-MIRECC, Research & Academic Affairs
Service Line, Salisbury VA Medical Center, Salisbury, North Carolina; Beth Richter, MA,
Mental Health & Behavioral Sciences Service Line, Salisbury VA Medical Center,
Salisbury, North Carolina; Kristina Nagy, MS, Mental Health & Behavioral Sciences
Service Line, Salisbury VA Medical Center, Salisbury, North Carolina; Robert D. Shura,
PsyD, MA-MIRECC, Research & Academic Affairs Service Line, Salisbury VA Medical
Center, Salisbury, North Carolina, Department of Neurology, Wake Forest School of
Medicine, Winston-Salem, North Carolina, and Department of Psychiatry, Via College of
Osteopathic Medicine, Blacksburg, VA. Correspondence concerning this article should
be addressed to Anna S. Ord, PsyD, Salisbury VA Medical Center, 11 M-2, 1601 Brenner
Ave., Salisbury, NC 28144, 704-638-9000 12939, [email protected].
Subsamples of data have been previously published. Please contact the corresponding
author for a complete list of publications.
Acknowledgements
We would like to thank G. Melissa Evans, MA for her contributions to this project.
Disclosure statement
The views, opinions and/or findings contained in this article are those of the authors and should
not be construed as an official position, policy or decision of the Department of Veterans Affairs
or the US Government unless so designated by other official documentation.
Funding
This work was supported by resources of Salisbury Veterans Affairs Medical Center; the Mid-
Atlantic Mental Illness Research, Education, and Clinical Center; and the Department of Veterans
Affairs Office of Academic Affiliations Advanced Fellowship Program in Mental Illness Research
and Treatment.
ORCID
Anna S. Ord http://orcid.org/0000-0003-3373-2016
Holly M. Miskey http://orcid.org/0000-0002-5139-4586
Robert D. Shura http://orcid.org/0000-0002-9505-0080
References
American Academy of Clinical Neuropsychology (AACN). (2007). Practice guidelines for neuro-
psychological assessment and consultation. The Clinical Neuropsychologist, 21(2), 209–231.
https://doi.org/10.1080/13825580601025932
Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and power-
ful approach to multiple testing. Journal of the Royal Statistical Society: Series B
(Methodological), 57(1), 289–300. https://doi.org/10.1111/j.2517-6161.1995.tb02031.x
Boone, K. B. (2009). The need for continuous and comprehensive sampling of effort/response
bias during neuropsychological examinations. The Clinical Neuropsychologist, 23(4), 729–741.
https://doi.org/10.1080/13854040802427803
Brennan, A. M., Meyer, S., David, E., Pella, R., Hill, B. D., & Gouvier, W. D. (2009). The vulnerability
to coaching across measures of effort. The Clinical Neuropsychologist, 23(2), 314–328. https://
doi.org/10.1080/13854040802054151
Bush, S. S., Heilbronner, R. L., & Ruff, R. M. (2014). Psychological assessment of symptom and
performance validity, response bias, and malingering: Official position of the Association for
Scientific Advancement in Psychological Injury and Law. Psychological Injury and Law, 7(3),
197–205. https://doi.org/10.1007/s12207-014-9198-7
Bush, S., Ruff, R., Tröster, A., Barth, J., Koffler, S., Pliskin, N., Reynolds, C., & Silver, C. (2005).
Symptom validity assessment: Practice issues and medical necessity. NAN policy & planning
committee. Archives of Clinical Neuropsychology, 20(4), 419–426. https://doi.org/10.1016/j.acn.
2005.02.002
Busse, M., & Whiteside, D. (2012). Detecting suboptimal cognitive effort: Classification accuracy
of the Conner’s Continuous Performance Test-II, Brief Test of Attention, and Trail Making Test.
The Clinical Neuropsychologist, 26(4), 675–687. https://doi.org/10.1080/13854046.2012.679623
Conners, C. K. (2014). Conners CPT-3: Manual. MHS Inc.
Denning, J. H. (2012). The efficiency and accuracy of the Test of Memory Malingering trial 1,
errors on the first 10 items of the test of memory malingering, and five embedded measures
in predicting invalid test performance. Archives of Clinical Neuropsychology, 27(4), 417–432.
https://doi.org/10.1093/arclin/acs044
Denning, J. H. (2014). The efficiency and accuracy of the Test of Memory Malingering Trial 1,
errors on the first 10 items of the Test of Memory Malingering, and five embedded measures
in predicting invalid test performance. Archives of Clinical Neuropsychology, 29(7), 729–730.
https://doi.org/10.1093/arclin/acu051
Denning, J. H., & Shura, R. D. (2019). Cost of malingering mild traumatic brain injury-related cog-
nitive deficits during compensation and pension evaluations in the Veterans Benefits
Administration. Applied Neuropsychology: Adult, 26(1), 1–16. https://doi.org/10.1080/23279095.
2017.1350684
Erdodi, L. A., & Lichtenstein, J. D. (2017). Invalid before impaired: An emerging paradox of
embedded validity indicators. The Clinical Neuropsychologist, 31(6-7), 1029–1046. https://doi.
org/10.1080/13854046.2017.1323119
Erdodi, L. A., Lichtenstein, J. D., Rai, J. K., & Flaro, L. (2017). Embedded validity indicators in
Conners’ CPT-II: Do adult cutoffs work the same way in children? Applied Neuropsychology:
Child, 6(4), 355–363. https://doi.org/10.1080/21622965.2016.1198908
Erdodi, L. A., Pelletier, C. L., & Roth, R. M. (2018). Elevations on select Conners’ CPT-II scales indi-
cate noncredible responding in adults with traumatic brain injury. Applied Neuropsychology:
Adult, 25(1), 19–28. https://doi.org/10.1080/23279095.2016.1232262
Erdodi, L. A., Roth, R. M., Kirsch, N. L., Lajiness-O'Neill, R., & Medoff, B. (2014). Aggregating
validity indicators embedded in Conners' CPT-II outperforms individual cutoffs at separating valid
from invalid performance in adults with traumatic brain injury. Archives of Clinical
Neuropsychology, 29(5), 456–466. https://doi.org/10.1093/arclin/acu026
Fazio, R. L., Denning, J. H., & Denney, R. L. (2017). TOMM Trial 1 as a performance validity indica-
tor in a criminal forensic sample. The Clinical Neuropsychologist, 31(1), 251–267. https://doi.
org/10.1080/13854046.2016.1213316
Greve, K. W., Etherton, J. L., Ord, J., Bianchini, K. J., & Curtis, K. L. (2009). Detecting malingered
pain-related disability: Classification accuracy of the Test of Memory Malingering. The Clinical
Neuropsychologist, 23(7), 1250–1271. https://doi.org/10.1080/13854040902828272
Greve, K. W., Ord, J., Curtis, K. L., Bianchini, K. J., & Brennan, A. (2008). Detecting malingering in
traumatic brain injury and chronic pain: A comparison of three forced-choice symptom valid-
ity tests. The Clinical Neuropsychologist, 22(5), 896–918. https://doi.org/10.1080/
13854040701565208
Heilbronner, R. L., Sweet, J. J., Morgan, J. E., Larrabee, G. J., & Millis, S. R. (2009). American
Academy of Clinical Neuropsychology consensus conference statement on the neuropsycho-
logical assessment of effort, response bias, and malingering. The Clinical Neuropsychologist,
23(7), 1093–1129. https://doi.org/10.1080/13854040903155063
Hervey, A. S., Epstein, J. N., & Curry, J. F. (2004). Neuropsychology of adults with attention-def-
icit/hyperactivity disorder: A meta-analytic review. Neuropsychology, 18(3), 485–503. https://
doi.org/10.1037/0894-4105.18.3.485
Lange, R. T., Iverson, G. L., Brickell, T. A., Staver, T., Pancholi, S., Bhagwat, A., & French, L. M.
(2013). Clinical utility of the Conners’ Continuous Performance Test-II to detect poor effort in
U.S. military personnel following traumatic brain injury. Psychological Assessment, 25(2),
339–352. https://doi.org/10.1037/a0030915
Larrabee, G. J. (2012). Assessment of malingering. In G. J. Larrabee (Ed.), Forensic neuropsych-
ology: A scientific approach (2nd ed., pp. 116–159). Oxford University Press.
Lippa, S. M. (2018). Performance validity testing in neuropsychology: A clinical guide, critical
review, and update on a rapidly evolving literature. The Clinical Neuropsychologist, 32(3),
391–421. https://doi.org/10.1080/13854046.2017.1406146
Losier, B. J., McGrath, P. J., & Klein, R. M. (1996). Error patterns on the Continuous Performance
Test in non-medicated and medicated samples of children with and without ADHD: A meta-
analytic review. Journal of Child Psychology and Psychiatry, 37(8), 971–987. https://doi.org/10.
1111/j.1469-7610.1996.tb01494.x
Martin, P. K., Schroeder, R. W., & Odland, A. P. (2015). Neuropsychologists’ validity testing beliefs
and practices: A survey of North American professionals. The Clinical Neuropsychologist, 29(6),
741–776. https://doi.org/10.1080/13854046.2015.1087597
Martin, P. K., Schroeder, R. W., Olsen, D. H., Maloy, H., Boettcher, A., Ernst, N., & Okut, H. (2020).
A systematic review and meta-analysis of the Test of Memory Malingering in adults: Two dec-
ades of deception detection. The Clinical Neuropsychologist, 34(1), 88–119. https://doi.org/10.
1080/13854046.2019.1637027
Miele, A. S., Gunner, J. H., Lynch, J. K., & McCaffrey, R. J. (2012). Are embedded validity indices
equivalent to free-standing symptom validity tests? Archives of Clinical Neuropsychology, 27(1),
10–22. https://doi.org/10.1093/arclin/acr084
Ord, J. S., Boettcher, A. C., Greve, K. W., & Bianchini, K. J. (2010). Detection of malingering in
mild traumatic brain injury with the Conners’ Continuous Performance Test–II. Journal of
Clinical and Experimental Neuropsychology, 32(4), 380–387. https://doi.org/10.1080/
13803390903066881
Proto, D. A., Pastorek, N. J., Miller, B. I., Romesser, J. M., Sim, A. H., & Linck, J. F. (2014). The dan-
gers of failing one or more performance validity tests in individuals claiming mild traumatic
brain injury-related postconcussive symptoms. Archives of Clinical Neuropsychology, 29(7),
614–624. https://doi.org/10.1093/arclin/acu044
Rabin, L. A., Barr, W. B., & Burton, L. A. (2005). Assessment practices of clinical neuropsycholo-
gists in the United States and Canada: A survey of INS, NAN, and APA Division 40 members.
Archives of Clinical Neuropsychology, 20(1), 33–65. https://doi.org/10.1016/j.acn.2004.02.005
Rüsseler, J., Brett, A., Klaue, U., Sailer, M., & Münte, T. F. (2008). The effect of coaching on the
simulated malingering of memory impairment. BMC Neurology, 8(1), 37. https://doi.org/10.
1186/1471-2377-8-37
Shura, R. D., Denning, J. H., Miskey, H. M., & Rowland, J. A. (2017). Symptom and performance
validity with veterans assessed for attention-deficit/hyperactivity disorder (ADHD).
Psychological Assessment, 29(12), 1458–1465. https://doi.org/10.1037/pas0000436
Shura, R. D., Miskey, H. M., Rowland, J. A., Yoash-Gantz, R. E., & Denning, J. H. (2016). Embedded
performance validity measures with postdeployment veterans: Cross-validation and efficiency
with multiple measures. Applied Neuropsychology: Adult, 23(2), 94–104. https://doi.org/10.
1080/23279095.2015.1014556
Sollman, M. J., & Berry, D. T. (2011). Detection of inadequate effort on neuropsychological test-
ing: A meta-analytic update and extension. Archives of Clinical Neuropsychology, 26(8),
774–789. https://doi.org/10.1093/arclin/acr066
Suhr, J., & Gunstad, J. (2007). Coaching and malingering: A review. In G. J. Larrabee (Ed.),
Assessment of malingered neuropsychological deficits (pp. 287–311). Oxford University Press.
Tombaugh, T. N. (1996). TOMM: Test of memory malingering. MHS Inc.