
Journal of Applied Research in Memory and Cognition 7 (2018) 199–208


Are Forensic Scientists Experts?


Alice Towler∗ and David White
University of New South Wales, Australia

Kaye Ballantyne
Victoria Police Forensic Services Department, Australia

Rachel A. Searston
University of Melbourne, Australia; The University of Adelaide, Australia

Kristy A. Martire and Richard I. Kemp
University of New South Wales, Australia

This document is copyrighted by the American Psychological Association or one of its allied publishers. Content may be shared at no cost, but any requests to reuse this content in part or whole must go through the American Psychological Association.
Despite playing a critical role in our criminal justice system, very little is known about the expertise of forensic
scientists. Here, we review three disciplines where research has begun to investigate such expertise: handwriting
analysis, fingerprint examination, and facial image comparison. We assess expertise against the scientific standard,
but conclude that meeting this standard does not provide a sufficiently high benchmark for the forensic sciences.
Forensic scientists must demonstrate a minimum standard of performance, the ability to defer judgement in cases at
high risk of error, and the ability to effectively communicate the strength of their evidence to factfinders. We discuss
the limitations of current forensic science expertise research to adequately capture factors affecting operational
accuracy and outline crucial differences between studies assessing perceptual skill and operational accuracy. Finally,
we identify key areas for future research and encourage cognitive scientists to engage in forensic science research.

General Audience Summary


Forensic scientists provide investigators and courts with information about the source of traces left at crime
scenes, such as fingerprints, hair, and blood. However, with the exception of DNA, there is limited scientific
evidence that the methods used by forensic scientists can link evidence to a source with high levels of certainty,
or whether forensic scientists themselves are experts at making those decisions. Here, we describe research in
three forensic disciplines—handwriting analysis, fingerprint examination, and facial image comparison—and
consider whether forensic scientists in those disciplines should be considered experts. We identify key issues
related to how we define and measure expertise in the forensic sciences, the adequacy of current research to
assess expertise in real-world settings, and key areas for future research.

Keywords: Forensic science, Expertise, Handwriting analysis, Fingerprint examination, Facial image comparison

Author Note
Alice Towler, David White, Kristy A. Martire, Richard I. Kemp: School of Psychology, University of New South Wales, Australia; Kaye Ballantyne: Office of the Chief Forensic Scientist, Victoria Police Forensic Services Department, Australia; Rachel A. Searston: Melbourne Graduate School of Education, Melbourne Centre for the Study of Higher Education, University of Melbourne, Australia; School of Psychology, The University of Adelaide, Australia.
∗ Correspondence concerning this article should be addressed to Alice Towler, School of Psychology, University of New South Wales, NSW 2052, Australia. Contact: [email protected]

In 1992, three-year-old Christine Jackson was abducted from her home, brutally raped, and murdered. Her body was found in a nearby creek two days later. Attention quickly turned to the victim's stepfather Kennedy Brewer, who had been looking after Christine in the hours before she went missing. An autopsy revealed suspected bite-marks on the victim's body and a forensic odontologist testified these were inflicted by Brewer. Brewer was found guilty of capital murder and sexual battery and sentenced to death.

But Kennedy Brewer was innocent. Advances in DNA analysis allowed archived biological evidence to be examined. As a result, Brewer was exonerated before he could be executed, but not before serving 13 years on death row (Innocence Project, 2017).

In the years since the Brewer case, forensic bite-mark analysis has been classified as "junk science" (see PCAST, 2016). Indeed, studies have shown that forensic odontologists cannot reliably determine whether a bite-mark was left by a human, let alone identify which human (Freeman & Pretty, 2016; Page, Taylor, & Blenkin, 2012).

Forensic bite-mark analysis is not the only forensic science discipline with questionable reliability. In 2009, a scathing report by the National Research Council (hereafter, NRC Report) stated,

"With the exception of nuclear DNA analysis . . . no forensic method has been rigorously shown to have the capacity to consistently, and with a high degree of certainty, demonstrate a connection between evidence and a specific individual or source." (p. 7)

More recently, President Barack Obama commissioned the Presidents' Council of Advisors on Science and Technology (PCAST) to conduct a comprehensive investigation into the status of forensic pattern-matching disciplines (see PCAST, 2016). Pattern-matching disciplines are those where a forensic scientist compares two samples "by eye" to determine if they have the same or different origin, for example fingerprint examination and hair comparison. PCAST reported that in five of the seven examined disciplines—bite-marks, firearms, footwear, complex-mixture DNA and hair analysis—there was little evidence that forensic scientists were able to reliably link samples of unknown origin to their source. In some forensic science disciplines research has revealed poor accuracy and reliability in forensic scientists' judgements, but more commonly, foundational research establishing their expertise has simply not been carried out.1

There is limited scientific evidence supporting the validity and reliability of the techniques forensic scientists use. In addition, very few studies have examined the question of whether forensic scientists show expert-level performance. Despite this, forensic scientists regularly provide their opinions in court as expert witnesses. Ordinarily, witnesses are only permitted to testify to their first-hand experiences relevant to the facts at issue. However, a common legal exception in many countries permits opinion evidence if the opinion is based on "specialised knowledge" acquired through training, study, or experience (e.g., s79 Evidence Act, 1995). It is exceptions of this kind that allow forensic scientists to share their "expertise" with the court and which have established precedent for future admissions. However, simply having training, study, or experience in a particular forensic discipline is insufficient to guarantee expertise (Edmond, 2016; Edmond & Martire, 2017; PCAST, 2016).

Cognitive scientists have studied expert performance for many decades, and as such, are well-placed to examine the question of whether forensic scientists are experts. Prominent researchers in this field have defined expertise as "consistently superior performance on a specified set of representative tasks for a domain" (Ericsson & Lehmann, 1996). It is curious that this definition is not the one used by forensic scientists to benchmark their abilities. Instead, they rely on the presence of "specialised knowledge" together with legal assent as evidence of expert status. Furthermore, the court's willingness to accept "expertise" in unvalidated forensic science disciplines has been credited as a source of serious miscarriages of justice (see Edmond, 2016; Edmond, Found, et al., 2017; Edmond & Martire, 2017; Edmond et al., 2014; Edmond & San Roque, 2016; Koehler, 2016; Martire & Edmond, 2017; Mnookin et al., 2011; PCAST, 2016; Saks & Koehler, 2005). In fact, the Innocence Project estimates that nearly half of all wrongful convictions overturned by DNA evidence involved unvalidated or improper forensic science evidence (Innocence Project, 2017).

The NRC and PCAST reports prioritise empirical validation (or "black box") studies to establish (a) whether methods routinely used by forensic scientists allow them to make accurate determinations of the source of questioned samples, and (b) whether forensic scientists demonstrate expertise in using these methods compared to untrained novices. But forensic scientists do not necessarily know how to design, run, and analyse human performance studies, and thus may lack the skills necessary to undertake this critical research (Martire & Kemp, 2016; Mnookin et al., 2011). There is also a conflict of interest for forensic scientists who wish to establish the validity and reliability of their discipline's methods to provide evidence that they and their colleagues are in fact experts. We argue that cognitive scientists possess the skills needed to design, administer, and statistically analyse fair tests of human performance without being invested in the results. As such, cognitive scientists are particularly well-suited to conducting research on human expertise in the forensic sciences (see also Edmond, Towler, et al., 2017; Koehler, 2013, 2016; Martire & Kemp, 2016; Mnookin et al., 2011).

Here, we review research from three forensic science disciplines—handwriting analysis, fingerprint examination, and facial image comparison—where efforts to assess expertise have already begun. We draw on this research to determine whether forensic scientists in those disciplines should be considered experts. We then discuss some of the broad issues related to establishing expertise in the forensic sciences.

1 See Koehler (2016) for a discussion of the reasons why this kind of basic research in forensic science disciplines has not been conducted.
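The "fair tests of human performance" described above can be as simple as comparing correct-decision rates between examiner and novice groups with an appropriate uncertainty estimate. A minimal Python sketch of such an analysis, using a Wilson score interval; the trial counts below are invented purely for illustration and come from no study:

```python
from math import sqrt

def wilson_interval(correct, total, z=1.96):
    """95% Wilson score interval for a proportion of correct decisions."""
    p = correct / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / total + z**2 / (4 * total**2))
    return centre - half, centre + half

# Hypothetical counts for a black-box expertise study: each group makes
# decisions on ground-truth-known comparison trials.
groups = [("examiners", 180, 200), ("novices", 150, 200)]
for label, correct, trials in groups:
    lo, hi = wilson_interval(correct, trials)
    print(f"{label}: {correct / trials:.1%} correct, 95% CI [{lo:.1%}, {hi:.1%}]")
```

Non-overlapping intervals across the two groups would be evidence of a genuine group difference, which is the basic logic behind the examiner-versus-novice comparisons reviewed in the following sections.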

Handwriting Analysis

Forensic handwriting examiners determine the authorship and manner of production of handwriting and signatures. Authorship decisions require examiners to decide whether the known writer of a handwriting sample was also the author of a questioned sample (e.g., a signature on a legal document; see Figure 1). Production decisions require examiners to decide how the writing was produced. Handwriting may be genuine, forged by someone attempting to mimic another person's handwriting, disguised in an attempt to deny authorship, or in some cases, written to look as though someone else has forged their handwriting.

Figure 1. Handwriting examiners compare handwriting samples side-by-side to determine if they were authored by the same person or different people. They may also decide whether the writing is genuine, forged, or disguised. Here, the reference signature of a known person (left) is accompanied by a signature written by the same person (top right) and a signature forged by a different person (bottom right).

Many studies have investigated handwriting examiners' expertise in making authorship and production decisions. Unlike other forensic science disciplines where cognitive scientists have led research efforts to establish expertise, research in handwriting examination has been largely led by practitioners. As a result, studies are designed to reflect casework as closely as possible, and participants are often not required to make a definitive decision. Instead, they are able to make "inconclusive" responses to indicate that they are not prepared to make a decision, usually because of insufficient information in the samples.

Kam, Fielding, and Conn (1997) investigated the accuracy of authorship decisions in 105 handwriting examiners. Participants were given six reference handwriting samples and asked to identify samples authored by the same person in a comparison set of 24. Handwriting examiners and novices made a similar number of correct decisions, correctly identifying 87.9% and 87.7% of matching samples in the comparison set. There were, however, large differences in false positive errors between the groups. Handwriting examiners made 6.5% false positive errors, where they incorrectly declared one of the comparison samples as matching a reference sample. In contrast, novices were far more error-prone, making false positive errors on 38.3% of trials.

Bird, Found, and Rogers (2010) investigated the accuracy of production decisions in 11 handwriting examiners. Participants were given 140 pairs of handwriting samples and were asked to determine which of each pair was genuine and which was disguised. Handwriting examiners made correct decisions on 73.4% of trials whereas novices made correct decisions on 80.1%. However, handwriting examiners made fewer errors (3.4%) compared to novices (11.4%), and were more likely to declare samples as inconclusive (23.1% vs. 8.4%).

Critically, research across the discipline consistently shows that the difference between handwriting examiner and novice performance is not due to the examiners making a greater proportion of correct decisions. Instead, group differences lie in the frequency of inconclusive and incorrect decisions, whereby examiners avoid making many of the errors novices make (see Bird, Found, Ballantyne, & Rogers, 2010; Found & Rogers, 2003; Found & Rogers, 2008; Kam, Abichandani, & Hewett, 2015; Kam, Gummadidala, Fielding, & Conn, 2001; Kam, Wetstein, & Conn, 1994; Sita, Found, & Rogers, 2002). Although handwriting examiners are not necessarily more accurate than novices, we argue that effective use of inconclusive decisions to avoid errors is an important component of forensic science expertise. We return to this point in the discussion.

Fingerprint Examination

Fingerprint examiners judge whether or not a print found at a crime scene (i.e., a latent) was left by a particular known person. In a typical case, a fingerprint examiner will systematically compare a latent print side-by-side with a list of suspect prints by eye on a computer screen (see Figure 2). The examiner will arrive at one of three decisions: the prints are a match (i.e., they originate from the same finger), the prints are a non-match (i.e., the prints come from different fingers), or inconclusive.

Fingerprint examination is a challenging perceptual task because matching prints can look very different, due to variation in surface, positioning, pressure, movement, moisture, distortion of the finger or substrate, and general wear and tear. Conversely, non-matching prints can look very similar, particularly because computer algorithms are increasingly being used to rapidly search very large fingerprint databases and return lists of highly similar suspect prints for examiners to compare (Dror & Mnookin, 2010). As a result of this within-finger variability and between-finger similarity, even experienced examiners make mistakes.

A study by Ulery, Hicklin, Buscaglia, and Roberts (2011), for example, measured examiners' ability to match challenging fingerprints representative of casework. The majority of the 169 examiners made at least one false negative error where they

incorrectly declared matching prints as non-matches (a false negative error rate of 7.5%), but only five made a more serious false identification error where they incorrectly declared non-matching prints as matches (a false positive error rate of 0.1%). Just as in handwriting analysis, fingerprint examiners made many inconclusive decisions, judging 23% of the latent fingerprints to be of no value for comparison.

Figure 2. Fingerprint examiners compare fingerprints side-by-side to determine if they originated from the same finger or two different fingers. Here, a latent print (left) is accompanied by a matching fingerprint (middle) and a non-matching fingerprint (right). These example fingerprints were sourced from the Forensic Informatics Biometric Repository (FIB-R; Tangen & Thompson, n.d.).

Tangen, Thompson, and McCarthy (2011) sought to establish fingerprint examiners' expertise by comparing them to novices. Tangen et al. did not allow participants to make inconclusive judgements, and thus their study provides a direct measure of examiners' and novices' perceptual sensitivity to matching and non-matching fingerprints. Examiners were significantly more accurate at matching fingerprints than were novices. Examiners correctly declared 92% of matching prints as matches (compared to 75% for novices), and 99% of highly similar non-matching prints as non-matches (compared to 45% for novices).

These studies, alongside other work by cognitive scientists, provide compelling converging evidence that trained fingerprint examiners are demonstrably more accurate than untrained novices at judging whether or not two fingerprints were left by the same person (Busey & Vanderkolk, 2005; Searston & Tangen, 2017a, 2017b, 2017c; Thompson & Tangen, 2014; Thompson, Tangen, & McCarthy, 2014; Vogelsang, Palmeri, & Busey, 2017). However, they also show low intra-examiner repeatability and, as a group, demonstrate a surprisingly wide range of performance (Dror et al., 2011; Ulery, Hicklin, Buscaglia, & Roberts, 2012).

Facial Image Comparison

Forensic facial examiners compare photographs of faces to decide if they show the same person or different people (see Figure 3). Despite decades of research showing novices make large proportions of error on these tasks when faces are unfamiliar (see Burton, White, & McNeill, 2010; Henderson, Bruce, & Burton, 2001; Megreya & Burton, 2006; Megreya & Burton, 2007), it is only very recently that researchers have begun to examine the accuracy of trained forensic scientists on this task.

Norell et al. (2015) tested 17 forensic facial examiners on a matching task that modelled forensic casework. Participants decided whether a high-quality mugshot style photograph, taken under controlled lighting conditions, matched a low-quality still image taken from CCTV footage. Facial examiners made correct decisions on 76% of trials, inconclusive decisions on 21% and errors on 3%. By comparison, novices made correct decisions on 73% of trials, inconclusive decisions on 7% and errors on 20%. These results parallel those found in handwriting examiners, showing that examiners' reduced error rate can mostly be attributed to them making more inconclusive decisions.

In the largest evaluation to date, White, Phillips, Hahn, and O'Toole (2015) tested 27 forensic facial examiners on three challenging face matching tasks that modelled the types of decisions they encounter in their daily work. In each task, examiners decided if two high-quality face photographs showed the same person or different people. Examiners showed superior accuracy to novice participants, making an average of 5% errors on a standardised test of face matching ability, compared to novices who made 20% errors (Glasgow Face Matching Test; Burton et al., 2010). Interestingly, on one test, the difference between examiners and the novice participants was most apparent when given 30 s to make their decision compared to 2 s. It seems that, given sufficient time, examiners were able to extract additional diagnostic information from faces. This suggests that the source of examiners' superior ability relative to novices originates in a slow, deliberate comparison strategy rather than quick, intuitive judgements (see also Towler, White, & Kemp, 2017).

Consistent with White, Phillips, et al. (2015), more recent work has also found superior accuracy in forensic facial examiners compared to untrained novices (Towler, White, & Kemp, 2017; White, Dunn, et al., 2015), suggesting that they are indeed experts. Importantly however, all of these studies focused on group-level differences between novices and examiners. When comparing individual examiners on these tasks, large differences in their performance emerge, with some examiners making 25% errors and others achieving almost perfect accuracy. These individual differences reflect similar levels of inter-individual variation in the novice population, suggesting that underlying skill in these tasks can arise not only from forensic training, but also from an

individual's natural talent in face identification (Balsdon et al., in press; Noyes, Phillips, & O'Toole, 2017; Russell, Duchaine, & Nakayama, 2009).

Figure 3. Forensic facial examiners compare face images to determine if they show the same person or different people. Here, an image from CCTV (left) is accompanied by a photograph of the same person (middle) and a photograph of a different person (right).

Discussion

At the beginning of this paper we argued that forensic scientists' expertise should be assessed against the conventional scientific standard, which requires experts to demonstrate superior performance relative to novices (see Ericsson & Lehmann, 1996). If we consider superior performance to mean higher levels of accuracy or fewer errors compared to novices, forensic scientists in all three of the reviewed disciplines meet the criteria for expertise. Nevertheless, forensic scientists still make substantial numbers of errors and show high levels of variability across different examiners. Given the significant consequences of errors, is this conventional criterion a sufficiently high benchmark for forensic scientists? Should we consider a forensic scientist an expert, and allow them to testify in court, just because they can perform better than novices? We do not think so.

Rethinking the Definition of Expertise in the Forensic Sciences

A forensic scientist who outperforms a novice may still make many errors and could be wrong more often than they are right. Given the pervasiveness of forensic science evidence and the serious risk of wrongful convictions, we argue that courts should only allow forensic scientists to testify if both the discipline, and the individual forensic scientist, have met a minimum standard of performance.2 This minimum standard should be determined by the courts in collaboration with forensic and cognitive scientists.

2 Some would argue that routine proficiency testing of forensic scientists ensures a minimum standard of performance. However, current proficiency tests are entirely inadequate to assess expertise. See Koehler (2013, 2016) for a detailed discussion.

In addition to meeting this minimum standard of performance, we argue that expertise in the forensic sciences requires two additional skills: knowing when not to make a decision because of increased risk of error, and the ability to communicate evidence to factfinders (e.g., judges, jurors) in a manner that enables them to understand the probative weight of that evidence.

Knowing when not to make a decision. Expert forensic scientists need to be able to determine when a reliable decision is impossible, either because there is insufficient information in the samples (e.g., because of ageing, smudging, wear, low resolution imagery), or because the difficulty of the comparison exceeds their abilities. In fact, some forensic scientists would argue that the ability to defer judgement is the crux of expertise in the forensic sciences.

There is, however, mixed evidence that forensic scientists possess sensitivity to these limits. For example, when given the same 42 signatures, handwriting examiners made inconclusive judgements on 0–100% of trials (Found & Rogers, 2008). Similarly, Ulery et al. (2011) found fingerprint examiners made inconclusive judgements on 5–64% of matching fingerprints and 7–96% of non-matching fingerprints. Such large disagreement between examiners suggests that assessing the suitability of a comparison for analysis is highly subjective. The extent to which forensic scientists are able to determine which cases are likely to lead them to make errors, and the consistency with which they can make these judgements, are important questions for future research to examine.

Communication of evidence. Expert forensic scientists also need to effectively communicate the meaning of their evidence (Martire, 2018). For instance, many forensic science bodies have advocated the use of standardised conclusion scales to convey their evidence to jurors (e.g., Association of Forensic Science Providers, 2009). However, research by Martire, Kemp, Watkins, Sayle, and Newell (2013) shows that the verbal label attached to one point of one scale (i.e., "Weak or limited support") conveys the opposite meaning to that intended by experts, with jurors interpreting weak evidence in favour of the hypothesis as evidence for the alternative proposition. Forensic scientists must be able to communicate their evidence in a way that facilitates factfinder understanding and allows them to properly weigh and combine pieces of evidence. Cognitive scientists are particularly well-placed to help forensic scientists develop better ways of conveying the meaning of their conclusions

and aiding comprehension of the evidence by judges and investigation into the effects of time pressure on facial image
jurors. comparison (e.g., Bindemann, Fysh, Cross, & Watts, 2016), but
the effects of limited resources across the forensic sciences are
Adequacy of Current Research to Assess Expertise in Foren- largely unexplored.
sic Science Decision chains. In practice, decisions are not necessar-
ily made by just one forensic scientist. Sometimes a forensic
Research into forensic scientists’ expertise has largely
scientist will refer difficult cases to senior analysts or spe-
ignored the fact that they are analysts operating within a large,
cialist teams, or a colleague will conduct a peer-review (see
complex decision-making system (e.g., see Towler, Kemp, &
Ballantyne, Edmond, & Found, 2017; Towler, Kemp, & White,
White, 2017). Studies usually investigate expertise using tightly
2017). These processes may serve to catch errors made by
controlled experimental methodology and model one or two
individual forensic scientists, and may help calibrate deci-
real-world aspects of the task, such as by using stimuli represen-
sions within disciplines. However, little is known about how
tative of casework (e.g., Tangen et al., 2011; White, Dunn, et al.,
Content may be shared at no cost, but any requests to reuse this content in part or whole must go through the American Psychological Association.

these procedures work or whether some types of review detect


2015) or introducing contextual information about a case (e.g.,
errors—or errors of a particular type—more effectively than
Dror, Charlton, & Péron, 2006). However, this approach does
others.
not help us understand how forensic scientists’ expertise inter-
Knowledge of being tested. In most studies forensic scien-
acts with the complex and high-stakes environments in which
tists are aware they are being tested and this is likely to affect
they operate. The lack of research investigating decision-making
This document is copyrighted by the American Psychological Association or one of its allied publishers.

their performance (see Risinger, Saks, Thompson, & Rosenthal,


in these complex environments highlights the naiveté of current
2002). Covert testing, whereby ground-truth-known cases are
research in the forensic sciences. Below, we outline some of the
surreptitiously inserted into routine casework, is therefore nec-
system-level factors that are often neglected in research and thus
essary to avoid testing effects (Dror & Cole, 2010; Mnookin
limit our ability to make conclusions regarding the expertise of
et al., 2011). This approach is regularly used to maintain accept-
forensic scientists in operational settings.
able levels of accuracy in other domains such as airline baggage
Realistic target prevalence. Unlike most experiments, the
screening (see Wolfe, Brunelli, Rubinstein, & Horowitz, 2013),
ratio of targets to non-targets in forensic contexts is almost
and is just starting to be used in the forensic sciences (see
certainly not 50:50. For instance, forensic scientists often com-
Kerkhoff et al., 2015).
pare crime scene samples against samples taken from suspects.
Investigators submit samples from suspects for analysis because,
Measuring Accuracy in the Forensic Sciences
based on other evidence, they have reason to believe these indi-
viduals are implicated in the crime. This process means that forensic scientists analyse samples that are more likely to be matches than non-matches. We know that base rates can affect performance on perceptual tasks (e.g., Wolfe, Horowitz, & Kenner, 2005), so studies that do not adequately model realistic rates of target prevalence may underestimate or overestimate operational levels of accuracy.

Contextual information. Forensic scientists are rarely blind to case information. They often know details about the nature of the crime, the history of the suspect, and the presence or absence of other evidence. This contextual information can lead to confirmation bias, whereby forensic scientists seek out information consistent with their expectations and prior similar experiences (see Kassin, Dror, & Kukucka, 2013; Saks, Risinger, Rosenthal, & Thompson, 2003). Compared to other system-level factors, considerable research attention has focused on identifying and mitigating confirmation bias (e.g., Dror & Charlton, 2006; Dror et al., 2006; Kukucka & Kassin, 2014). However, confirmation bias is usually studied in isolation, without other system-level factors such as those described here.

Limited resources. Many forensic departments report that they are under growing time pressure, with large casework backlogs and diminishing resources to meet these demands (Kobus, Houck, Speaker, Riley, & Witt, 2011). Working under these conditions is likely to impair performance. There has been some

The fact that research does not yet capture the full range of factors likely to affect a forensic scientist's ability to complete their job accurately causes problems when we attempt to use that research to report operational accuracy and error rates. We therefore need to be very clear about the purpose of individual studies and what they allow us to conclude about expertise in the forensic sciences. Studies assessing the expertise of forensic scientists fall into two distinct categories: those assessing perceptual skill and those assessing operational accuracy.

Perceptual skill. Perceptual skill refers to a person's raw ability to accurately classify or discriminate between samples representative of casework. For example, in fingerprint examination, this skill refers to an examiner's ability to discriminate between fingerprints left by the same finger and fingerprints left by different fingers. Studies of this kind typically test performance in controlled, lab-based experiments under conditions that do not reflect forensic scientists' daily work. For example, in these tests examiners are typically not allowed access to the tools, procedural documentation, or response scales they use in casework (e.g., White, Phillips, et al., 2015). Studies of perceptual skill thereby enable direct comparison of underlying skill in forensic scientists and novices, but do not provide valid tests of operational accuracy.
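The target-prevalence point made above lends itself to a simple illustration. The sketch below uses hypothetical hit and false-positive rates (invented for the example, not figures from any study cited here) to show how an examiner's overall percent-correct score shifts with the proportion of same-source pairs in a test set, even though the examiner's underlying skill never changes:

```python
# Hypothetical illustration of the target-prevalence point. The hit rate and
# false-positive rate below are invented for this sketch, not taken from any
# study cited in this article.

def percent_correct(prevalence, hit_rate, false_positive_rate):
    # Overall accuracy = P(same source) * P(hit)
    #                  + P(different source) * P(correct rejection)
    return prevalence * hit_rate + (1 - prevalence) * (1 - false_positive_rate)

HIT_RATE = 0.95             # assumed P("match" response | same source)
FALSE_POSITIVE_RATE = 0.10  # assumed P("match" response | different source)

# The same hypothetical examiner scores differently depending on how many
# same-source pairs the test set contains.
for prevalence in (0.1, 0.5, 0.9):
    accuracy = percent_correct(prevalence, HIT_RATE, FALSE_POSITIVE_RATE)
    print(f"same-source prevalence {prevalence:.0%}: percent correct {accuracy:.1%}")
```

Under these assumed rates, hits are more likely than correct rejections, so percent correct rises as same-source prevalence rises; a 50/50 test set would therefore misestimate accuracy in casework where most comparisons are matches.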
Operational accuracy. Operational accuracy refers to a forensic scientist's ability to accurately classify or discriminate between samples representative of casework in operational settings. For example, in fingerprint examination, operational accuracy refers to an examiner's ability to discriminate between fingerprints left by the same finger and fingerprints left by different fingers, in cases where they have decided there is sufficient information in the samples, and in the context of the aforementioned system-level factors. Operational accuracy experiments allow us to assess forensic scientists' expertise as it applies in practice.

There is a tendency for those interested in operational accuracy to undervalue tests of perceptual skill (e.g., see the review of existing fingerprint studies in PCAST, 2016). However, we believe this is shortsighted. A thorough understanding of expertise in a discipline cannot be achieved without both types of study, and each informs the other. For example, studies of perceptual skill demonstrated that identifying unfamiliar people's faces is error-prone, even for trained professionals (e.g., White, Kemp, Jenkins, Matheson, & Burton, 2014). Operational accuracy studies then sought to model the working conditions of facial examiners and found strikingly low levels of accuracy in real-world conditions (i.e., 50%), even without taking into consideration the full complement of system-level factors mentioned earlier (e.g., White, Dunn, et al., 2015). Researchers have now returned to tightly controlled, lab-based perceptual skill experiments to investigate ways of boosting accuracy through various procedures (e.g., Towler, White, & Kemp, 2017; White, Burton, Kemp, & Jenkins, 2013). Once these techniques are refined, they will be implemented in operational settings and tested to determine their effectiveness in practice. Our knowledge in this discipline, like many others, advances through a series of interrelated studies of both perceptual skill and operational accuracy.

However, it is important to understand that perceptual skill and operational accuracy studies cannot be substituted for each other. Perceptual skill studies should not be used to make specific claims about operational accuracy. Similarly, operational accuracy studies should not be used to make claims about perceptual skill. Furthermore, operational accuracy studies that model only part of the "real world" should not be used to make definitive claims about operational accuracy; the generalisability of claims is constrained by the factors included in the study. The purpose of, and differences between, perceptual skill and operational accuracy studies will be particularly important moving forward as we seek to understand expertise in the forensic sciences.

The Future of Expertise Research in the Forensic Sciences

The research efforts reported here represent a tiny fraction of the research needed in the forensic sciences. For many disciplines, there have been no empirical assessments of forensic scientists' ability to perform the tasks they undertake on a daily basis. For disciplines where some research exists, such as the three reviewed here, we know far less than we should given the proliferation, importance, and impact of the evidence.

Because expertise research in the forensic sciences is in its infancy, there is tremendous scope for psychology-driven basic and applied research (see Koehler & Meixner, 2016). For example, we still have a very poor understanding of how expertise develops in individuals: whether it emerges through training or experience, or is predetermined by genetics (see Searston & Tangen, 2017a; Shakeshaft & Plomin, 2015; White et al., 2014). We also know very little about individual differences in the development of expertise in forensic science, or about the cognitive mechanisms underlying expertise in each discipline. Indeed, we should expect that the critical differences between experts and novices may be accounted for by different cognitive mechanisms in different disciplines. For example, preliminary work in fingerprint examination suggests that expertise is characterised by fast, automatic and holistic processing of fingerprints (Busey & Vanderkolk, 2005; Thompson & Tangen, 2014), as is typical of experts in domains outside forensic science (see Kahneman, 2011; Klein, 1998; Kundel, Nodine, Conant, & Weinstein, 2007). However, facial image comparison shows the exact opposite pattern: what appears to distinguish forensic facial examiners' cognitive processing from novices' is their slow, deliberate and piecemeal processing strategy (Towler, White, & Kemp, 2017; White, Phillips, et al., 2015).

Many important and interesting research questions remain to be addressed in the forensic sciences, and we encourage cognitive and forensic scientists to build the collaborations necessary to tackle them. This, of course, does not come without challenges. Building interdisciplinary relationships and mutual trust takes time. It is also important to manage the expectations of partners who have a vested interest in study outcomes and may be reluctant to publish unfavourable results. However, overcoming these challenges is a worthwhile endeavour given the profound societal impact of a productive collaboration between cognitive and forensic scientists.

Author Contributions

Introduction (AT), handwriting (KB, KM), fingerprints (RS), faces (DW), discussion (RK, AT), with AT acting as coordinating author. All authors then contributed to the production of the final manuscript.

Conflict of Interest Statement

The authors declare no conflict of interest.

References

Association of Forensic Science Providers. (2009). Standards for the formulation of evaluative forensic science expert opinion. Science & Justice, 49, 161–164.

Ballantyne, K., Edmond, G., & Found, B. (2017). Peer review in forensic science. Forensic Science International, 277, 66–76.

Balsdon, T., Summersby, S., Kemp, R. I., & White, D. (in press). Improving face identification with specialist teams. Cognitive Research: Principles and Implications.
Bindemann, M., Fysh, M., Cross, K., & Watts, R. (2016). Matching faces against the clock. i-Perception, 7(5). http://dx.doi.org/10.1177/2041669516672219

Bird, C., Found, B., Ballantyne, K., & Rogers, D. K. (2010). Forensic handwriting examiners' opinions on the process of production of disguised and simulated signatures. Forensic Science International, 195(1), 103–107.

Bird, C., Found, B., & Rogers, D. K. (2010). Forensic document examiners' skill in distinguishing between natural and disguised handwriting behaviors. Journal of Forensic Sciences, 55(5), 1291–1295.

Burton, A. M., White, D., & McNeill, A. (2010). The Glasgow face matching test. Behavior Research Methods, 42(1), 286–291. http://dx.doi.org/10.3758/BRM.42.1.286

Busey, T. A., & Vanderkolk, J. R. (2005). Behavioral and electrophysiological evidence for configural processing in fingerprint experts. Vision Research, 45, 431–448.

Dror, I. E., Champod, C., Langenburg, G., Charlton, D., Hunt, H., & Rosenthal, R. (2011). Cognitive issues in fingerprint analysis: Inter- and intra-expert consistency and the effect of a 'target' comparison. Forensic Science International, 208, 10–17.

Dror, I. E., & Charlton, D. (2006). Why experts make errors. Journal of Forensic Identification, 56(4), 600–616.

Dror, I. E., Charlton, D., & Péron, A. (2006). Contextual information renders experts vulnerable to making erroneous identifications. Forensic Science International, 156, 174–178. http://dx.doi.org/10.1016/j.forsciint.2005.10.017

Dror, I. E., & Cole, S. A. (2010). The vision in "blind" justice: Expert perception, judgment, and visual cognition in forensic science pattern recognition. Psychonomic Bulletin & Review, 17(2), 161–167.

Dror, I. E., & Mnookin, J. L. (2010). The use of technology in human expert domains: Challenges and risks arising from the use of automated fingerprint identification systems in forensic science. Law, Probability and Risk, 9, 47–67. http://dx.doi.org/10.1093/lpr/mgp031

Edmond, G. (2016). Legal versus non-legal approaches to forensic science evidence. International Journal of Evidence and Proof, 20(1), 3–28.

Edmond, G., Found, B., Martire, K. A., Ballantyne, K., Hamer, D., Searston, R. A., . . . & Roberts, A. (2017). Model forensic science. Australian Journal of Forensic Sciences, 57(2), 144–154. http://dx.doi.org/10.1016/j.scijus.2016.11.005

Edmond, G., & Martire, K. A. (2017). Knowing experts? Section 79, forensic science evidence and the limits of 'training, study or experience'. In A. Roberts & J. Gans (Eds.), Critical perspectives on the uniform evidence law (p. 80). Sydney: Federation Press.

Edmond, G., Martire, K. A., Kemp, R. I., Hamer, D., Hibbert, B., Ligertwood, A., . . . & White, D. (2014). How to cross-examine forensic scientists: A guide for lawyers. Australian Bar Review, 39, 174–197.

Edmond, G., & San Roque, M. (2016). Before the High Court. Honeysett v The Queen: Forensic science, 'specialised knowledge' and the uniform evidence law. Sydney Law Review, 36, 323–344.

Edmond, G., Towler, A., Growns, B., Ribeiro, G., Found, B., White, D., . . . & Martire, K. A. (2017). Thinking forensics: Cognitive science for forensic practitioners. Science and Justice, 57, 144–154.

Ericsson, K. A., & Lehmann, A. C. (1996). Expert and exceptional performance: Evidence of maximal adaptation to task constraints. Annual Review of Psychology, 47, 273–305.

Found, B., & Rogers, D. K. (2003). The initial profiling trial of a program to characterise forensic handwriting examiners' skill. Journal of the American Society of Questioned Document Examiners, 6, 77–81.

Found, B., & Rogers, D. K. (2008). The probative character of forensic handwriting examiners' identification and elimination opinions on questioned signatures. Forensic Science International, 178(1), 54–60.

Freeman, A., & Pretty, I. (2016). Construct validity of bitemark assessments using the ABFO decision tree. Paper presented at the 2016 Annual Meeting of the American Academy of Forensic Sciences.

Henderson, Z., Bruce, V., & Burton, A. M. (2001). Matching the faces of robbers captured on video. Applied Cognitive Psychology, 15, 445–464. http://dx.doi.org/10.1002/acp.718

Innocence Project. (2017). The Innocence Project. Retrieved from www.innocenceproject.org

Kahneman, D. (2011). Thinking, fast and slow. New York: Farrar, Straus and Giroux.

Kam, M., Abichandani, P., & Hewett, T. (2015). Simulation detection in handwritten documents by forensic document examiners. Journal of Forensic Sciences, 60(4), 936–941.

Kam, M., Fielding, G., & Conn, R. (1997). Writer identification by professional document examiners. Journal of Forensic Sciences, 42(5), 1–33.

Kam, M., Gummadidala, K., Fielding, G., & Conn, R. (2001). Signature authentication by forensic document examiners. Journal of Forensic Sciences, 46(4), 884–888.

Kam, M., Wetstein, J., & Conn, R. (1994). Proficiency of professional document examiners in writer identification. Journal of Forensic Sciences, 39(1), 5–14.

Kassin, S. M., Dror, I. E., & Kukucka, J. (2013). The forensic confirmation bias: Problems, perspectives, and proposed solutions. Journal of Applied Research in Memory and Cognition, 2(1), 42–52.

Kerkhoff, W., Stoel, R. D., Berger, C. E. H., Mattijssen, E. J. A. T., Hermsen, R., Smits, N., & Hardy, H. J. J. (2015). Design and results of an exploratory double blind testing program in firearms examination. Science & Justice, 55(6), 514–519.

Klein, G. A. (1998). Sources of power: How people make decisions. Cambridge, MA: MIT Press.

Kobus, H., Houck, M. M., Speaker, P., Riley, R., & Witt, T. (2011). Managing performance in the forensic sciences: Expectations in light of limited budgets. Forensic Science Policy and Management, 2(1), 36–43.

Koehler, J. J. (2013). Proficiency tests to estimate error rates in the forensic sciences. Law, Probability & Risk, 12, 89–98.

Koehler, J. J. (2016). Forensics or fauxrensics? Ascertaining accuracy in the forensic sciences. Arizona State Law Journal, 49(4), 1–41.

Koehler, J. J., & Meixner, J. B. (2016). An empirical research agenda for the forensic sciences. Journal of Criminal Law and Criminology, 106(1), 1–34.

Kukucka, J., & Kassin, S. M. (2014). Do confessions taint perceptions of handwriting evidence? An empirical test of the forensic confirmation bias. Law and Human Behavior, 38, 256–270.
Kundel, H. L., Nodine, C. F., Conant, E. F., & Weinstein, S. P. (2007). Holistic component of image perception in mammogram interpretation: Gaze-tracking study. Radiology, 242(2), 396–402.

Martire, K. A. (2018). Clear communication through clear purpose: Understanding statistical statements made by forensic scientists. Australian Journal of Forensic Sciences, 1–9. http://dx.doi.org/10.1080/00450618.2018.1439101

Martire, K. A., & Edmond, G. (2017). Re-thinking expert opinion evidence. Melbourne University Law Review, 40, 967–998.

Martire, K. A., & Kemp, R. I. (2016). Considerations when designing human performance tests in the forensic sciences. Australian Journal of Forensic Sciences, 1–17. http://dx.doi.org/10.1080/00450618.2016.1229815

Martire, K. A., Kemp, R. I., Watkins, I., Sayle, M. A., & Newell, B. R. (2013). The expression and interpretation of uncertain forensic science evidence: Verbal equivalence, evidence strength, and the weak evidence effect. Law and Human Behavior, 37(3), 197–207.

Megreya, A. M., & Burton, A. M. (2006). Unfamiliar faces are not faces: Evidence from a matching task. Memory and Cognition, 34(4), 865–876.

Megreya, A. M., & Burton, A. M. (2007). Hits and false positives in face matching: A familiarity-based dissociation. Perception and Psychophysics, 69(7), 1175–1184.

Mnookin, J. L., Cole, S. A., Dror, I. E., Fisher, B. A. J., Houck, M. M., Inman, K., . . . & Stoney, D. A. (2011). The need for a research culture in the forensic sciences. UCLA Law Review, 58, 725–779.

National Research Council. (2009). Strengthening forensic science in the United States: A path forward. Washington, DC: National Academies Press.

Norell, K., Brorsson Lathen, K., Bergstrom, P., Rice, A., Natu, V., & O'Toole, A. J. (2015). The effect of image quality and forensic expertise in facial image comparisons. Journal of Forensic Sciences, 60(2), 331–340. http://dx.doi.org/10.1111/1556-4029.12660

Noyes, E., Phillips, P. J., & O'Toole, A. J. (2017). What is a super-recogniser? In M. Bindemann & A. M. Megreya (Eds.), Face processing: Systems, disorders and cultural differences. New York: Nova Science Publishers, Inc.

Page, M., Taylor, J., & Blenkin, M. (2012). Context effects and observer bias – Implications for forensic odontology. Journal of Forensic Sciences, 57(1), 108–112. http://dx.doi.org/10.1111/j.1556-4029.2011.01903

PCAST. (2016). Forensic science in criminal courts: Ensuring scientific validity of feature-comparison methods.

Risinger, D. M., Saks, M. J., Thompson, W. C., & Rosenthal, R. (2002). The Daubert/Kumho implications of observer effects in forensic science: Hidden problems of expectation and suggestion. California Law Review, 90(1), 1–56.

Russell, R., Duchaine, B., & Nakayama, K. (2009). Super-recognizers: People with extraordinary face recognition ability. Psychonomic Bulletin and Review, 16(2), 252–257. http://dx.doi.org/10.3758/PBR.16.2.252

Saks, M. J., & Koehler, J. J. (2005). The coming paradigm shift in forensic identification science. Science, 309(5736), 892–895. http://dx.doi.org/10.1126/science.1111565

Saks, M. J., Risinger, D. M., Rosenthal, R., & Thompson, W. C. (2003). Context effects in forensic science: A review and application of the science of science to crime laboratory practice in the United States. Science & Justice, 43, 77–90.

Searston, R. A., & Tangen, J. M. (2017a). The emergence of perceptual expertise with fingerprints over time. Journal of Applied Research in Memory and Cognition, 6(4), 442–451.

Searston, R. A., & Tangen, J. M. (2017b). Expertise with unfamiliar objects is flexible to changes in task but not changes in class. PLoS ONE, 12(6), e0178403.

Searston, R. A., & Tangen, J. M. (2017c). The style of a stranger: Identification expertise generalizes to coarser level categories. Psychonomic Bulletin & Review, 24(4), 1324–1329.

Shakeshaft, N. G., & Plomin, R. (2015). Genetic specificity of face recognition. Proceedings of the National Academy of Sciences of the United States of America, 112(41), 12887–12892. http://dx.doi.org/10.1073/pnas.1421881112

Sita, J., Found, B., & Rogers, D. K. (2002). Forensic handwriting examiners' expertise for signature comparison. Journal of Forensic Sciences, 47(5), 1117–1124.

Tangen, J. M., Thompson, M. B., & McCarthy, D. J. (2011). Identifying fingerprint expertise. Psychological Science, 22(8), 995–997. http://dx.doi.org/10.1177/0956797611414729

Thompson, M. B., & Tangen, J. M. (2014). The nature of expertise in fingerprint matching: Experts can do a lot with a little. PLoS ONE, 9(12), 1–23. http://dx.doi.org/10.1371/journal.pone.0114759

Thompson, M. B., Tangen, J. M., & McCarthy, D. J. (2014). Human matching performance of genuine crime scene latent fingerprints. Law and Human Behavior, 38(1), 84–93. http://dx.doi.org/10.1037/lhb0000051

Towler, A., Kemp, R. I., & White, D. (2017). Unfamiliar face matching systems in applied settings. In M. Bindemann & A. M. Megreya (Eds.), Face processing: Systems, disorders and cultural differences. New York: Nova Science Publishers, Inc.

Towler, A., White, D., & Kemp, R. I. (2017). Evaluating the feature comparison strategy for forensic face identification. Journal of Experimental Psychology: Applied, 23(1), 47–58. http://dx.doi.org/10.1037/xap0000108

Ulery, B. T., Hicklin, R. A., Buscaglia, J., & Roberts, M. A. (2011). Accuracy and reliability of forensic latent fingerprint decisions. Proceedings of the National Academy of Sciences of the United States of America, 108(19), 7733–7738.

Ulery, B. T., Hicklin, R. A., Buscaglia, J., & Roberts, M. A. (2012). Repeatability and reproducibility of decisions by latent fingerprint examiners. PLoS ONE, 7(3), 1–12. http://dx.doi.org/10.1371/journal.pone.0032800

Vogelsang, M. D., Palmeri, T. J., & Busey, T. A. (2017). Holistic processing of fingerprints by expert forensic examiners. Cognitive Research, 2(1), 1–12.

White, D., Burton, A. M., Kemp, R. I., & Jenkins, R. (2013). Crowd effects in unfamiliar face matching. Applied Cognitive Psychology, 27, 769–777.

White, D., Dunn, J. D., Schmid, A. C., & Kemp, R. I. (2015). Error rates in users of automatic face recognition software. PLoS ONE, 10(10), 1–14. http://dx.doi.org/10.1371/journal.pone.0139827

White, D., Kemp, R. I., Jenkins, R., Matheson, M., & Burton, A. M. (2014). Passport officers' errors in face matching. PLoS ONE, 9(8), 1–6.
White, D., Phillips, P. J., Hahn, C. A., Hill, M., & O'Toole, A. J. (2015). Perceptual expertise in forensic facial image comparison. Proceedings of the Royal Society of London B: Biological Sciences, 282, 1814–1822.

Wolfe, J. M., Brunelli, D. N., Rubinstein, J., & Horowitz, T. S. (2013). Prevalence effects in newly trained airport checkpoint screeners: Trained observers miss rare targets, too. Journal of Vision, 13(3), 1–9 (33).

Wolfe, J. M., Horowitz, T. S., & Kenner, N. M. (2005). Rare items often missed in visual searches. Nature, 435(7041), 439–440.

Received 5 December 2017; received in revised form 30 March 2018; accepted 30 March 2018