

Aalborg University

Department of Mathematical Sciences


PhD Thesis
Statistical Aspects of Forensic Genetics
Models for Qualitative and Quantitative STR Data
August 2010
Torben Tvedebrink
Department of Mathematical Sciences, Aalborg University,
Fredrik Bajers Vej 7 G, 9220 Aalborg East, Denmark
Department of Mathematical Sciences
Fredrik Bajers Vej 7G
DK-9220 Aalborg East
Denmark
Telephone: +45 99 40 88 00
Fax: +45 98 15 81 29
Web: http://www.math.aau.dk
Title:
Statistical Aspects of Forensic Genetics
Models for Qualitative and Quantitative STR Data
Author:
Torben Tvedebrink
Supervisor:
Poul Svante Eriksen
Department of Mathematical Sciences
Aalborg University
PhD Assessment Committee:
Prof. Thore Egeland, Dr. Scient
Institute of Forensic Medicine
University of Oslo
Prof. Dr. Peter M. Schneider
Institute of Legal Medicine
University of Cologne
Prof. Rasmus Waagepetersen, PhD
Department of Mathematical Sciences
Aalborg University
Thesis number of pages: 183
Submitted: August 2010
Preface
Summary in English
This PhD thesis deals with statistical models intended for forensic genetics, which is the part of
forensic medicine concerned with analysis of DNA evidence from criminal cases together with
calculation of alleged paternity and affinity in family reunification cases. The main focus of the
thesis is on crime cases as these differ from the other types of cases since the biological material
often is used for person identification contrary to affinity.
Common to all cases, however, is that the DNA is used as evidence in order to assess the
probability of observing the biological material given different hypotheses. Most countries use
commercially manufactured DNA kits for typing a person's DNA profile. Using these kits the DNA
profile is constituted by the state of 10-15 DNA loci which have a large variation from person
to person in the population. Thus, only a small fraction of the genome is typed, but due to the
large variability, it is possible to identify individuals with very high probability. These
probabilities are used when calculating the weight of evidence, which in some cases corresponds to the
likelihood of observing a given suspect's DNA profile in the population.
By assessing the probability of the DNA evidence under competing hypotheses the biological
evidence may be used in the court's deliberation and trial on equal footing with other evidence
and expert statements. These probabilities are based on population genetic models whose
assumptions must be validated. The thesis's first two articles describe the θ-correction which
compensates for possible population structures and remote coancestry that could affect the
models' accuracy. The Danish reference database with nearly 52,000 DNA profiles is analysed and
the number of near-matches is compared to the expected numbers under the model.
A frequent event in connection with crime cases is the detection of more than one person's DNA
in a sample from the crime scene. In such cases, the DNA profile is called a DNA mixture as it
is not possible mechanically or chemically to separate the biological traces into their contributing
parts. To ascribe an evidentiary weight to a DNA mixture, the quantitative part (expressed
as signal intensities in a so-called electropherogram, EPG) of the result from the biotechnological
analysis is used. Two models for handling DNA mixtures are presented together with an efficient
algorithm to separate the DNA mixture into the most probable contributing profiles. Furthermore,
it is discussed how the quantitative part of the evidence is included in calculating the evidential
weight.
In criminal cases, the biological traces are often found at crime scenes in conditions which
can degrade and contaminate the DNA strand, which complicates the subsequent biochemical
analysis. Furthermore, the amount of DNA may be limited which may challenge the sensitivity
of the biotechnology applied in the analysis. Models to evaluate the degree of degradation and
estimate the probability of an allelic drop-out are discussed in the thesis. Furthermore, it is
exemplified how to incorporate the probability of degradation and drop-out when calculating the
weight of evidence.
Finally, the thesis contains an article which deals with post-processing of the data after the
signal is processed by the PCR thermal cycler and detected by the electrophoresis apparatus. Central is the
detection of a signal-to-noise limit, which currently is a fixed limit recommended by the
manufacturer of the typing kit. This article discusses how this threshold can be determined from the
noise such that it may be specific to each case and locus. Additionally, two filters are presented
that handle specific types of artifacts in the data generation process which are manifested as
increased signals in the EPG.
Summary in Danish
This PhD thesis concerns statistical models with applications in forensic genetics, the part of
forensic medicine that deals with analyses of DNA traces from criminal cases and with the
calculation of alleged kinship in paternity and family reunification cases used in a legal
context. The thesis has a particular focus on criminal cases, as these differ from the other
case types in that the biological material is often used for person identification rather than
for establishing relatedness.
Common to the cases, however, is that DNA is used as evidence to assess the plausibility of
different hypotheses put forward in the case in question. The vast majority of countries use
commercial DNA kits to type a person's DNA profile. These kits determine the DNA profile from
10 to 15 DNA markers, which vary greatly from person to person in the population. Thus, only a
fraction of the genome is typed, but owing to the large variability it is possible to identify
individuals with very high probability from these few markers. These probabilities are used to
compute the weight of evidence, which for example describes the probability of observing a
given suspect's DNA profile in the population.
By assessing the probability of the DNA evidence under competing hypotheses, the biological
evidence can be included in the court's deliberation and judgment on an equal footing with other
evidence and expert statements. These probabilities rest on population genetic models whose
assumptions must be substantiated. The first two articles of the thesis describe the so-called
θ-correction, which compensates for possible population structures and remote kinship that may
affect the correctness of the models. Among other things, the Danish reference database with
nearly 52,000 DNA profiles is analysed to examine how much these DNA profiles differ from one
another and whether the number of near matches can be explained by the models applied.
A frequently occurring event in criminal cases is the detection of more than one person's DNA
in a sample from a crime scene. In such cases the crime scene profile is called a DNA mixture,
as it is not possible by purely mechanical or chemical means to separate the biological trace
into the contributing DNA profiles. To ascribe an evidential weight to a DNA mixture, the
quantitative part (consisting of the signal intensities expressed in a so-called
electropherogram, EPG) of the result from the biotechnological analyses of the DNA trace is used.
Two models for handling DNA mixtures are presented together with an efficient algorithm for
separating DNA mixtures into the most probable contributing profiles. Furthermore, it is
discussed how the quantitative part of the evidence is included in the calculation of the
evidential weight.
In criminal cases the biological trace is often found at crime scenes under conditions that can
degrade and contaminate the DNA strand, which complicates the subsequent biochemical analysis.
Moreover, the amount of DNA may be limited, which may challenge the sensitivity of the
biotechnology used in the DNA analyses. Models for assessing the degree of degradation and for
estimating the probability of an allelic drop-out in the DNA analysis are treated in the thesis,
and examples are presented of how this is incorporated into the evidential weight.
Finally, the thesis contains an article on the processing of the quantitative data observed from
the EPG as detected by the electrophoresis machines after the PCR process. Central is the
detection of a signal-to-noise limit, which until now has been a fixed limit recommended by the
manufacturer of the commercial kit. The article discusses how the limit can be determined from
the noise level so that it can be specific to each case and DNA marker. Two further filters are
presented for handling particular types of artefacts that appear in the EPG as amplified
signals.
Acknowledgements
My biggest thanks go to my supervisor through five years, Poul Svante Eriksen, from whom I
have learned so much. Thank you for inspiring discussions and proposing solutions to many of
the problems I have worked on. For always being encouraging and reading all my manuscript
drafts of dubious quality and for debugging R-code during the past many years.
I also would like to thank my very good friend and office mate Ege for great times and
discussions with and without beers involved. Thanks to the staff and colleagues at the Department of
Mathematical Sciences for a friendly and inspiring place to work, and in particular the Head of
Department E. Susanne Christensen for coming to Oxford in the first place and convincing me
to work with forensic genetics in my MSc thesis and giving me the opportunity to write this PhD
thesis.
I also would like to thank Helle Smidt Mogensen and Niels Morling, for sharing their insights
in forensic genetics. For always proposing interesting problems and providing data in order to
put statistics into forensic genetics. More inspirational and committed collaborators are hard
to find. Furthermore, I would like to thank the entire Section of Forensic Genetics for friendly
discussions, for the staff's interest in my work and making my visits in Copenhagen pleasant and
fruitful. Finally, thanks to the University of Copenhagen for co-funding my PhD position.
Thanks to the New Zealanders in forensic genetics: Bruce Weir, John Buckleton and James
Curran. Bruce for hosting my stay at the University of Washington, Seattle, during the end of 2008.
The discussions and work in Seattle initiated my interest in population genetics, substructures
and IBD. John and James for inviting me to summery Auckland in the cold European winter of
2010 to collaborate on our common interests in forensic genetics. I look forward to our future
meetings and discussions. In addition I would like to thank Knud Højgaards Fond, Oticon
Fonden and Christian og Ottilia Brorsons Rejselegat for yngre videnskabsmænd og -kvinder
who supported my travels to USA and New Zealand. I would also like to thank Ellen og Aage
Andersens Foundation for financial support during my PhD studies.
Finally, I would like to thank my family and friends who have put up with me and shown an
interest in my research over the past years. Last but not least I would like to thank Tenna for her
everlasting sympathy and love over the years - and those to come. For letting me focus on my
work during the final period of my PhD project and accepting my distant moments in the few
times of higher enlightenment. Your great cooking skills are a constant inspiration to me.
Aalborg, August 2010 Torben Tvedebrink
Contents
Preface
  Summary in English
  Summary in Danish
  Acknowledgements
1 Introduction
  1.1 Qualitative models
  1.2 Quantitative models
  1.3 Outline
2 Overdispersion in allelic counts and θ-correction in forensic genetics
  2.1 Introduction
  2.2 Overdispersion in allelic counts
  2.3 Parameter estimation
  2.4 Results
  2.5 Discussion
  2.6 Conclusion
  2.A Mathematical details
  Bibliography
  2.7 Supplementary remarks
3 Analysis of matches and partial-matches in Danish DNA database
  3.1 Introduction
  3.2 Materials and methods
  3.3 Results
  3.4 Discussion
  3.5 Conclusion
  3.A Derivation and computation of the variance
  Bibliography
  3.6 Supplementary remarks
4 Evaluating the weight of evidence using quantitative STR data in DNA mixtures
  4.1 Introduction
  4.2 Material and methods
  4.3 Impact on the likelihood ratio
  4.4 Parameter estimation
  4.5 Discussion
  4.6 Conclusion
  4.A The model
  4.B EM-estimators
  4.C Model reduction
  Bibliography
  4.7 Supplementary remarks
5 Identifying contributors of DNA mixtures by means of quantitative information of STR typing
  5.1 Introduction
  5.2 Data
  5.3 Modelling peak areas of a two-person mixture
  5.4 Finding best matching pair of profiles
  5.5 Likelihood ratio
  5.6 Importance sampling of the likelihood ratio
  5.7 Results
  5.8 Discussion
  5.9 Conclusion
  5.A The general case with m contributors
  Bibliography
  5.10 Supplementary remarks
6 Estimating the probability of allelic drop-out of STR alleles in forensic genetics
  6.1 Introduction
  6.2 Material and methods
  6.3 Results and discussion
  6.4 Conclusion
  6.A Examples
  Bibliography
  6.5 Supplementary remarks
7 Sample and investigation specific filtering of quantitative data from STR DNA analysis
  7.1 Introduction
  7.2 Materials and methods
  7.3 Results
  7.4 Discussion
  7.5 Conclusion
  7.A Double stutters
  Bibliography
  7.6 Supplementary remarks
8 Statistical model for degraded DNA samples and adjusted probabilities for allelic drop-out
  8.1 Introduction
  8.2 Materials and methods
  8.3 Results
  8.4 Discussion
  8.5 Conclusion
  Bibliography
  8.6 Supplementary remarks
9 Epilogue
  9.1 Conclusion
  9.2 Weight of evidence calculations
  9.3 Unifying likelihood ratio
  9.4 Future research
Bibliography
CHAPTER 1
Introduction
Forensic genetics is about drawing conclusions from biological evidence related to various types
of crimes and legal disputes. It is the task of the forensic geneticists to present the genetic
evidence as scientifically and impartially as possible. The scientific aspects comprise thorough
investigation of the various components in the analysis process of biological evidence. The analysis
consists of several sub-analyses handling specific tasks on the route from tissue or body fluid
to data used for interpretation. Since there are many sources of variability and uncertainty, the
interpreter must be able to quantify the amount of uncertainty and include this when reporting
the evidential weight.
Evidence from a scene of crime is subject to more sources of variability than samples taken in
relation to family disputes. In the former case issues of contaminated samples or degraded DNA
due to non-optimal conditions raise problems for the typing technology. The DNA might be
too damaged for analysis or it might only be possible to obtain results from a subset of the DNA
markers used for identification, yielding partial DNA profiles. In paternity disputes or family
reunification cases the problems facing the forensic geneticists are mainly related to population
genetics and pedigree analysis since in these cases the reference samples are often of high quality
and in sufficient amounts such that the risks of contamination, allelic drop-out or degradation of
the biological material are minimal. However, the tissue used for identification of body remains
found in the debris from a mass disaster or in mass graves is often severely degraded.
1.1 Qualitative models
Even before it was possible to obtain DNA profiles, biological features or phenotypes were used
for evidential calculations. The blood type of a child is determined by the blood types of the
parents. Hence, this information may be used in paternity disputes, where the
alleged father can be excluded if there are inconsistencies in the constitution of the trio's blood
types. However, the few possible states of the blood type imply that the power of discrimination
is low since many men unrelated to the child will share blood type with the true father.
Hence, the more polymorphic and diverse the biological marker, the more informative and
powerful it is for discriminating among individuals. The development of DNA markers has
minimised the problem of low discriminating power. By selecting DNA markers on different
chromosomes forensic geneticists have obtained a powerful tool for making statements about
paternity, relatedness and identity. The prevailing DNA typing technology used in forensic genetics is based
on the short tandem repeat (STR) typing technique.
The STR repeat sequences used in forensic genetics are typically made up of motifs of four or
five base pairs, e.g. the typical repeat motif for TH01 is given by TCAT (Butler, 2005, Table 5.2).
This implies that for locus TH01 an allele designated 6 has this motif repeated consecutively
six times, which is often denoted $[\mathrm{TCAT}]_6$.
Excluding abnormalities, every individual has two alleles per locus - one maternal and one
paternal. However, it is impossible to determine the origin of the alleles and they may possibly be
identical (homozygote), which implies only one allelic type is detected. Otherwise two distinct
alleles are observed (heterozygote) and in either case at least one of the individual's parents shares
at least one allele with their common offspring, assuming no mutations.
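The repeat notation and the homozygote/heterozygote distinction described above can be made concrete with a minimal Python sketch. The function names and the flanking bases are hypothetical and chosen for illustration only; partial repeats such as allele 9.3 are not handled by this simple counter.

```python
def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of consecutive copies of motif found in sequence."""
    if not motif:
        raise ValueError("motif must be non-empty")
    best = 0
    for start in range(len(sequence)):
        run, pos = 0, start
        # Count back-to-back copies of the motif beginning at this position.
        while sequence.startswith(motif, pos):
            run += 1
            pos += len(motif)
        best = max(best, run)
    return best


def zygosity(allele_a, allele_b) -> str:
    """Classify a single-locus genotype by whether its two alleles coincide."""
    return "homozygote" if allele_a == allele_b else "heterozygote"


# Made-up flanking bases around six TCAT repeats (illustration only).
fragment = "GGAC" + "TCAT" * 6 + "AGGT"
print(longest_repeat_run(fragment, "TCAT"))  # 6
print(zygosity(6, 9.3))                      # heterozygote
print(zygosity(6, 6))                        # homozygote
```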
The commercial STR kits genotype 10 to 15 autosomal STR loci each having 10 to 25
frequently occurring alleles in the Danish population. That is, the qualitative part of the DNA
profile consists of a set of loci where the DNA profile is specified by the states of the alleles. The
heterozygous DNA profile with the highest probability in the Danish population using the SGM
Plus kit (Applied Biosystems) is reported in Table 1.1.
Table 1.1: The heterozygous DNA profile with the highest probability in the Danish population.
Locus    D3     vWA    D16    D2     D8     D21    D18    D19    TH01   FGA
Alleles  15,16  16,17  11,12  17,20  13,14  29,30  14,15  13,14  6,9.3  21,22
Since the STR loci are located on different chromosomes the laws of inheritance suggest that the
allelic distribution over loci multiplies:
$$P(A_{1i_1}A_{1j_1}, \ldots, A_{Li_L}A_{Lj_L}) = \prod_{l=1}^{L} P(A_{li_l}A_{lj_l}),$$
where $A_{li_l}$ is the $i$th allele on locus $l$ and $L$ is the total number of typed STR loci. Using the allele probabilities
estimated for the Danish population, the probability of observing the DNA profile of Table 1.1
when sampling a random person from the population is $1.327 \times 10^{-10}$.
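The multiplication over loci can be sketched in a few lines of Python. The allele frequencies below are made-up placeholders, not the Danish estimates, and the per-locus genotype probabilities assume Hardy-Weinberg proportions ($2p_ip_j$ for a heterozygote, $p_i^2$ for a homozygote), which is an assumption of this sketch rather than something stated in the text above.

```python
from math import prod


def genotype_probability(p_i: float, p_j: float, homozygous: bool) -> float:
    """Per-locus genotype probability under assumed Hardy-Weinberg proportions."""
    return p_i * p_i if homozygous else 2.0 * p_i * p_j


def profile_probability(loci) -> float:
    """Multiply per-locus genotype probabilities across independent loci."""
    return prod(genotype_probability(p_i, p_j, hom) for p_i, p_j, hom in loci)


# Hypothetical allele frequencies for a three-locus profile (illustration only).
example_profile = [
    (0.25, 0.20, False),  # heterozygous locus: 2 * 0.25 * 0.20
    (0.30, 0.30, True),   # homozygous locus:   0.30 ** 2
    (0.10, 0.15, False),  # heterozygous locus: 2 * 0.10 * 0.15
]
print(profile_probability(example_profile))
```

With 10 loci instead of three, products of this kind quickly reach the order of magnitude reported for the profile of Table 1.1.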
When a crime is committed, DNA evidence is often considered in the court of law when deciding
whether a suspect is guilty or innocent. Let $H_p$ and $H_d$ denote the hypotheses relating to
the guilt and innocence of the suspect, respectively, and $E$ the evidence relevant for the hypotheses. Then the
court is interested in the posterior ratio $P(H_p|E)/P(H_d|E)$. However, such statements are impossible
for the forensic geneticist to quantify since this involves the prior ratio $P(H_p)/P(H_d)$, which is
unknown to the forensic expert. What can be evaluated by the expert witness is the likelihood
ratio $P(E|H_p)/P(E|H_d)$ using a model for the occurrence of the evidence given that the hypothesis
is true.
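The three ratios are linked by the odds form of Bayes' theorem, a standard identity written here with the symbols defined above:

```latex
\frac{P(H_p \mid E)}{P(H_d \mid E)}
  = \frac{P(E \mid H_p)}{P(E \mid H_d)} \times \frac{P(H_p)}{P(H_d)}
```

As a purely hypothetical illustration, a likelihood ratio of $10^6$ combined with prior odds of $1/10^4$ gives posterior odds of $100$. The expert supplies only the first factor on the right-hand side; the prior odds are left to the court.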
The likelihood ratio, LR, is the essential quantity in forensic genetics and this thesis discusses
several ways to include more of the available information in its evaluation. Consider a crime
case with an identified suspect. Let $G_S$ denote the suspect's DNA profile and $E_c$ the DNA
stain obtained from the scene of crime, and assume that $E_c$ is consistent with $G_S$. That is, all
alleles in $G_S$ are present in $E_c$, which we denote $G_S \subseteq E_c$. The two competing hypotheses state,
respectively, $H_p$: "The suspect is the donor of the DNA stain" and $H_d$: "An unknown person unrelated to the
suspect is the donor of the DNA stain". The latter hypothesis is what is called a
"random man" hypothesis. Let $G_U$ denote the DNA profile of the random man, which assuming
no typing errors implies that $G_U \equiv G_S$. In this case the LR is given by:
$$\mathrm{LR} = \frac{P(E|H_p)}{P(E|H_d)} = \frac{P(E_c, G_S|H_p)}{P(E_c, G_S|H_d)} = \frac{P(E_c|G_S)P(G_S)}{P(E_c|U)P(G_S|G_U)P(G_U)} = \frac{1}{P(G_U|G_S)}$$
where $P(G_U|G_S)$ under some model assumptions is the probability of observing the crime scene
profile at random in the population. Hence, the evidence enables the forensic geneticist to make
statements like "The probability of observing this particular profile at random in the reference
population is 1 in 1,000,000" or equivalently "the DNA evidence is 1,000,000 times more likely
under $H_p$ than under $H_d$". There is an ongoing debate in the forensic genetic community on
which probability is of relevance to the court. Over the last decade the "match probability"
(Balding, 2005) that takes subpopulation structures or common coancestry into account has
become prevalent. That is, rather than considering profiles of the suspect and random man as
independent, one computes the probability of the observed profile conditioned on the suspect's
profile. Hence, using the "posterior distribution" of alleles rather than the "prior distribution",
rare alleles are less extreme. This implies a more conservative evaluation of the evidence since
one accounts for the possibility that an allele that is rare in an admixed population is more
commonly observed in one of its subpopulations to which the suspect (and possibly the culprit) might
belong.
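The effect of conditioning on the suspect's profile can be illustrated with the widely used Balding-Nichols form of the θ-corrected match probability. This section does not state the formula, so the sketch below is an assumption about the kind of correction meant, and the allele frequencies and the value of θ are hypothetical.

```python
def match_probability(p_i: float, p_j: float, theta: float, homozygous: bool) -> float:
    """Balding-Nichols probability of a genotype given that one copy of the
    same genotype (the suspect's) has already been observed."""
    denom = (1.0 + theta) * (1.0 + 2.0 * theta)
    if homozygous:
        # Genotype A_i A_i, conditioned on two observed copies of allele i.
        return ((2.0 * theta + (1.0 - theta) * p_i)
                * (3.0 * theta + (1.0 - theta) * p_i)) / denom
    # Genotype A_i A_j, conditioned on one observed copy of each allele.
    return (2.0 * (theta + (1.0 - theta) * p_i)
            * (theta + (1.0 - theta) * p_j)) / denom


# Hypothetical rare alleles: theta = 0 recovers the product 2 * p_i * p_j.
uncorrected = match_probability(0.01, 0.02, theta=0.0, homozygous=False)
corrected = match_probability(0.01, 0.02, theta=0.03, homozygous=False)
print(uncorrected, corrected)
```

For rare alleles the corrected probability is markedly larger than the uncorrected one, so the resulting likelihood ratio is smaller, which is the conservative behaviour described above.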
Over recent years the national databases of STR profiles have grown in size due to the
success of forensic DNA analysis in solving crimes. With these vast numbers of profiles available, it
is possible to test the validity and applicability of population models to forensic genetics (Weir,
2004, 2007; Curran et al., 2007; Mueller, 2008). Furthermore, the accumulation of DNA
profiles implies that the probability of observing a random match or near match between some pair of
DNA profiles in the database increases. If all pairs of profiles in the database are compared to each
other, this corresponds to $\binom{n}{2} = n(n-1)/2$ pairwise comparisons in a database with $n$ DNA
profiles. In the Danish DNA reference database there are approximately 52,000 DNA profiles,
which yield 1,351,974,000 pairwise comparisons. With this large number of comparisons it is
likely to observe DNA profiles that coincide on many loci, which has concerned some
commentators and raised questions about overstating the power of DNA evidence. Hence, it is important
to demonstrate that the observed and expected numbers of matches are sufficiently close in order
to retain the confidence in DNA typing in general and the population genetic models used for
evidential calculations in particular.
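The number of pairwise comparisons quoted above is easy to verify. The short Python sketch below also shows how an expected number of matching pairs follows from a per-pair match probability; that probability is a hypothetical placeholder, not a value from the thesis.

```python
from math import comb

n_profiles = 52_000
n_pairs = comb(n_profiles, 2)  # n(n-1)/2 unordered pairs
print(n_pairs)  # 1351974000

# Hypothetical probability that one given pair of profiles matches.
pair_match_probability = 1e-10
# By linearity of expectation this holds even though the pairs are dependent.
expected_matching_pairs = n_pairs * pair_match_probability
print(expected_matching_pairs)
```

Even a very small per-pair probability gives a non-negligible expected count once it is multiplied by more than a billion comparisons.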
1.2 Quantitative models
The commercial kits used for analysis of DNA evidence provide quantitative and qualitative
information to the analyst. The qualitative information reports which alleles are present in
the data (as in Table 1.1), whereas the quantitative part gives information on peak intensities in
terms of height and area of the peaks obtained from the electropherogram (EPG). An example
of an EPG is given in Figure 1.1 where the peak intensities (peak heights and areas) are plotted
in relative fluorescent units (rfu) against the base pair (bp) length. Peaks with a low bp value
correspond to alleles with short amplicons (amplicons are made up of the primer binding site
and STR repeat structure). The peak intensities are measured by a CCD camera where the
observed intensity corresponds to the amount of light emitted from the fluorescent dye.
[Figure 1.1 plot: horizontal axis "Base pair (bp)", 100 to 400; vertical axis "Peak height (rfu)", 0 to 4000; legend: blue, green, yellow and red fluorescent dye.]
Figure 1.1: An example of an electropherogram (EPG) for the SGM-Plus kit (Applied
Biosystems) with peak height (in rfu) plotted against the base pair (bp) length.
Thus the crime scene evidence, $E_c$, consists of two components: the qualitative (or genetic)
part, $G$; and the quantitative part with peak intensities, $Q$. The peak intensities reflect the amount
of DNA contributed to the particular allele, and since the technology is indifferent to the origin
of the various DNA fragments, the DNA amounts contributed to shared alleles add up. The
resulting peak intensities are registered via a CCD camera that detects the light emitted from a
fluorochrome attached to DNA molecules corresponding to an STR allele. A difference in electric
potential forces the DNA molecules to move in the capillary, where the size difference of the
molecules implies that some DNA fragments pass the CCD camera before others.
Since the lengths of the repeat sequences of the STR loci under investigation overlap,
most commercial STR kits apply 3-5 different fluorochrome dyes in order to concurrently detect
signals from multiple alleles and loci. One of these dyes labels DNA fragments of known
length, which are used for fragment size determination of the observed peak intensities. These
fixed lengths are used to align the observed peak intensities to an allelic ladder which converts
an observed fragment length to an allelic repeat number. For the SGM-Plus kit this size marker
is given a red fluorochrome (represented by dashed lines in Figure 1.1).
In a single contributor DNA sample it is possible to observe one or two alleles per locus
depending on whether the DNA profile is homozygous or heterozygous, respectively. However, when $m$
DNA profiles contribute to the same sample it is possible to observe one to $2m$ alleles per locus,
since the individuals may share all or no alleles. The peak intensities associated with the alleles
reflect the amount of DNA contributed to that particular allele. Hence, in a two-person mixture
the peaks of alleles to which the major component (the DNA profile with the largest amount of DNA contributed
to the sample) contributes are often larger than those of the minor component. However, if the
DNA profiles share alleles the peak intensities of the common alleles are approximately the sum
of the contributions.
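The additivity of contributions to shared alleles can be sketched as follows. This is a deliberately simplified proportional model, assumed here for illustration only: each contributor's DNA amount is split equally between its two allele copies, and stutter, noise and degradation are ignored. The function name, genotypes and amounts are hypothetical and do not represent the peak models developed in the thesis.

```python
from collections import Counter


def expected_peak_heights(profiles, dna_amounts):
    """Expected peak height per allele at one locus.

    profiles:    list of (allele_a, allele_b) genotypes, one per contributor
    dna_amounts: contributor-specific signal totals (in rfu) at this locus
    """
    heights = Counter()
    for (allele_a, allele_b), amount in zip(profiles, dna_amounts):
        # Each allele copy carries half of the contributor's signal, so a
        # homozygote puts its full amount on a single allele.
        heights[allele_a] += 0.5 * amount
        heights[allele_b] += 0.5 * amount
    return dict(heights)


major, minor = (14, 15), (15, 17)  # the two contributors share allele 15
mixture = expected_peak_heights([major, minor], [1500.0, 500.0])
print(mixture)  # {14: 750.0, 15: 1000.0, 17: 250.0}
```

Here the shared allele 15 receives the sum of both contributions, and the number of distinct alleles (three) lies between one and $2m = 4$.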
When assigning weight to the evidence under a given hypothesis the methodology needs to
consider both parts of the data. This is particularly important when the data originate from a
DNA mixture, since the quantitative evidence currently is the only way used to separate the
observed alleles into contributing proles. Often the peak intensities are used only to reduce
the number of possible combinations entering the likelihood ratio. This approach is sometimes
called the binary model in the forensic literature, e.g. by Bill et al. (2005); Buckleton et al.
(2005). However, a more correct approach would be to attach a likelihood to each combination
of proles measuring the agreement between the observed peak intensities and the expected
intensities under some model. Let the evidence E = (E
c
, K) where K are the known proles
associated to the crime, then the extended LR taking Q into account is given by:
$$LR = \frac{P(E \mid H_p)}{P(E \mid H_d)} = \frac{P(Q, G, K \mid H_p)}{P(Q, G, K \mid H_d)} = \frac{P(Q \mid G, K, H_p)\,P(G \mid K, H_p)\,P(K \mid H_p)}{P(Q \mid G, K, H_d)\,P(G \mid K, H_d)\,P(K \mid H_d)}, \tag{1.1}$$
where $P(Q \mid \cdot)$ measures the agreement between the observed and expected peak intensities. Ideally
the model for $P(Q \mid \cdot)$ should take the entire EPG signal into account which includes the noise
component (pictured as a rug close to 0 rfu in Figure 1.1), adjustment and correction for technical
artefacts (stutters and pull-up effects, cf. below), detection of degradation (discussed at the
end of this section), the genotypes of the contributors, etc.
However, evaluating the LR under such a model is computationally intensive and complicated.
That is, for each locus every pair of alleles constructed as a Cartesian product of the allelic ladder
should be considered even if the peak height imbalances (ratio of peak heights) within and
between loci were extreme. For practical purposes such an approach would be infeasible and
too computationally intensive for standard case work. Hence, it is common to reduce Q to a smaller
set of observations by using a criterion to separate the noise and signal into two parts, such that
the number of possible combinations of DNA profiles decreases. A limit of detection is often
used to discriminate between the noise and signal. However, such a threshold approach induces
the risk of wrongly assigning noise and signal, i.e. false positive and false negative calls.
In forensic genetics, these are commonly denoted drop-ins and drop-outs, which refer to
extra alleles in the signal not contributed by the true donors of the stain and to alleles of
the true donors that are missing because they fall below the limit of detection, respectively.
Let Q denote the part of the EPG that is classified as true signal. As mentioned above Q is currently
the basis for separating DNA mixtures into their contributing components. That is, by defining
a model for Q given a set of contributing profiles, it is possible to determine the goodness-of-fit
between a hypothesised combination of DNA profiles and the observed peak intensities. Methods
exist for modelling $P(Q \mid G, \mathcal{G}, H)$ of which some are more heuristic than statistical (Bill et
al., 2005; Wang et al., 2006), but progress is made towards models based on statistical methods
(Perlin and Szabady, 2001; Cowell et al., 2007a, b, 2010; Curran, 2008; Tvedebrink et al., 2010).
In cases where the amount of DNA contributed by the donor of the profile is low, there is a
risk of the peak heights being below a limit of detection. The limit of detection is introduced
in order to distinguish between noise and true signals. This may imply allelic drop-out which
causes only a partial (or no) profile to be typed. Hence, a true contributing profile to an observed
stain may have one or more alleles not present in the case sample. Not taking allelic drop-out into
consideration could imply that the true donor is erroneously excluded from further consideration.
In order to include the possibility for drop-out in the evidence evaluation it is necessary to be
able to quantify this possibility in terms of a probability.
In contrast to drop-out, which is missing alleles, the biotechnology used in the typing of DNA
profiles may cause additional peaks to be present in the observed stain. The PCR process, which
amplifies the DNA by making multiple copies of the present alleles, causes extra peaks in the
position in front of the true peak. These peaks are called stutters and are due to mispairings
between the Taq enzyme and amplicon. This creates a DNA product one repeat unit shorter than
the true amplicon. Stutters may be produced in any cycle of the PCR process and a rule of thumb
says that the stutter peak height is about 10-15% of the true peak height. This percentage is an
overall value across alleles and loci, but shorter alleles tend to have lower stutter percentage than
longer alleles.
Another systematic component caused by the typing technology is the so-called pull-up (or
bleed-through) effect, where the light emitted from one fluorescent dye is detected in the spectra
of a different fluorochrome. This implies false detection of peaks with similar fragment length as
the parental peak, but on a different dye band. Furthermore, using a fixed limit of detection, of 50
rfu say, neglects important information about the noise level in a sample. If a peak in the interval
40 rfu to 49 rfu is observed, the fixed threshold-protocol determines this peak as undetected.
However, by using a model for the threshold, it might be reasonable to have a variable limit set
such that e.g. 99.95% of all noise peaks are removed. This may for some cases imply a threshold
as low as 25 rfu allowing for a more flexible analysis scheme which may be valuable for samples
of low amounts of DNA.
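The idea of a variable limit of detection can be illustrated with a minimal sketch. It assumes, purely for illustration, that the heights of the noise peaks in a sample are available as a list of rfu values and that the threshold is taken as their empirical 99.95% quantile. The function name and the simulated noise are hypothetical, and this is not the distribution analysis developed in Chapter 7.

```python
import math
import random

def floating_threshold(noise_rfu, coverage=0.9995):
    """Smallest observed noise height such that at least `coverage` of the
    noise peaks are at or below it (an empirical quantile)."""
    if not noise_rfu:
        raise ValueError("need at least one noise observation")
    ordered = sorted(noise_rfu)
    # 1-based rank of the order statistic covering the requested fraction
    rank = max(1, math.ceil(coverage * len(ordered)))
    return ordered[rank - 1]

# Hypothetical noise heights (simulated, not real EPG data).
rng = random.Random(1)
noise = [rng.gammavariate(4.0, 2.0) for _ in range(20000)]
threshold = floating_threshold(noise)
fraction_removed = sum(h <= threshold for h in noise) / len(noise)
```

A sample with a quiet baseline then receives a lower threshold than a noisy one, which is the behaviour a fixed 50 rfu limit cannot provide.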
When DNA is exposed to inhibitors such as chemicals, moisture, sunlight and heat, the DNA
molecules are prone to degrade and the DNA strands to be damaged. This causes the results of the
DNA investigation to have a characteristic profile with decreasing peak intensities as a function
of the DNA fragment length. The longer the amplicon, the more likely it is that the peaks will
have low emission values. This implies that the risk of allelic drop-out increases for longer
amplicons and may result in partial DNA profiles since some loci fail to produce any signal.
Degraded biological material is pronounced in samples taken from the debris of mass disasters
or mass graves.
1.3 Outline
The following seven chapters (Chapters 2-8) present the seven journal papers constituting this
PhD thesis. The organisation of each chapter is such that the paper is presented in its journal
form (including bibliography) followed by supplementary remarks about the results, how
they relate to the previous chapters, further discussion and additional data analysis. As a
consequence notation is not necessarily consistent between the chapters and some of the material is
repeated in different chapters. On the other hand, the chapters may be read independently of
each other. Each chapter has its own bibliography with the references used in there, and on the
last pages of the thesis there is a complete list of all references. In chapters where there is a
reference to supplementary material, e.g. as in journal papers, the material is available on-line at
http://people.math.aau.dk/tvede/thesis.
The order of the chapters is such that the number of factors considered in the evaluation of the
evidence increases. First only the qualitative part of the data is considered in the likelihood ratio
with the correction for population stratification effects. Later the quantitative data is added to
the likelihood ratio where each model relaxes the assumptions made in the preceding chapters.
Finally the last chapter combines the results and suggests topics for future research.
Chapter 2 discusses the topic of substructures in populations and how to account for this in
evidential calculations. Identical-by-descent and subpopulation effects are common
concepts from population genetics. The idea of measuring population stratification goes back
to Wright (1951) who defined three quantities measuring the degree of relatedness between
individuals, subpopulations and the total population. The model discussed in the chapter handles
this from a statistical point of view by defining the correlation among individuals' DNA profiles
as overdispersion and shows how it is manifested in the so-called θ-correction used in forensic
genetics.
Chapter 3 is an analysis of the Danish DNA profile reference database. By the beginning of
2009 the database included 51,517 unique DNA profiles typed on ten forensic autosomal STR
loci. We investigated the methodology of Weir (2004, 2007) who made pairwise comparisons of
every pair of DNA profiles in the database. We derived an efficient way to compute the expected
number of matches and partial matches for a given θ, cf. above. Furthermore, in line with Curran
et al. (2007) we extended the model to allow for closer familial relationships (full-siblings,
first-cousins, parent-child and avuncular) and we derived expressions for the variance of the number
of matches and partial matches in the database.
Chapter 4 is the first of five papers on the quantitative part of the data available from STR results.
The paper is an extension of the work I did in my MSc thesis where the peak intensities of the
EPG were modelled by a multivariate normal distribution. The challenging part of the model
is the fact that the dimensions of the data vector (and sub-vectors hereof) vary among DNA
mixtures due to the different number of shared alleles between individuals. An EM-algorithm
was proposed for optimisation and we demonstrated that the model in fact is a mixed effects
model.
Chapter 5 discusses a simpler and more operational model for DNA mixtures than the one from
the previous chapter. In order to separate an observed DNA mixture into the contributing DNA
profiles we derived a statistical model, which was suited for a greedy optimisation algorithm.
The algorithm is very efficient, separating complex DNA mixtures in a few seconds. It is
implemented as an on-line tool which provides valuable graphical output for further interpretation by
the forensic geneticists.
Chapter 6 addresses an important question in forensic genetics and evidential calculations:
estimating the probability of allelic drop-out. We define a proxy for the amount of DNA contributed
to a sample and use this quantity to derive a logistic regression model to estimate the probability
of allelic drop-out.
Chapter 7 presents a methodology for filtering the quantitative data from STR results. The observed
data is a conversion of emitted light from a fluorochrome detected by a CCD camera. This
implies that the signal consists of a noise component and further systematic components, the
so-called pull-up effects and stutters. We demonstrate how to determine a floating threshold
using distribution analysis of the noise component. Pull-up and stutter corrections were performed
by regression analysis. The methodology significantly decreases the number of allelic
drop-outs compared to the standard protocol.
Chapter 8 is a short communication on how to model degraded DNA in a simple and intuitive
manner. Degraded DNA is a common problem in crime case samples since the biological
material from which the DNA is extracted has often been exposed to non-optimal conditions.
Sunlight, humidity, bacteria and chemicals are some of the reasons for observing degraded DNA,
which complicates the succeeding analysis and interpretation. The model presented in the paper
is used to modify the drop-out model discussed in Chapter 6 by adjusting the proxy for the
amount of DNA taking the level of degradation into account.
Chapter 9 summarises the results from the preceding seven chapters by forming a unifying
likelihood ratio. The terms in this likelihood ratio consist of:
$$P(Q_{\mathrm{mis}} \mid Q_{\mathrm{obs}}, G_{\mathrm{mis}}, G_{\mathrm{obs}}, \mathcal{G})\, P(Q_{\mathrm{obs}} \mid G_{\mathrm{mis}}, G_{\mathrm{obs}}, \mathcal{G})\, P(G_{\mathrm{mis}}, G_{\mathrm{obs}} \mid K, \mathcal{G})\, P(K \mid \mathcal{G})\, P(\mathcal{G}),$$
where $\mathcal{G}$ is a combination of DNA profiles consistent with the hypothesis under consideration.
Furthermore, $Q$ and $G$ symbolise the quantitative and qualitative parts of the evidence, respectively.
The first term, $P(Q_{\mathrm{mis}} \mid \cdot)$, evaluates the probability of allelic drop-out using the models of
Chapters 6, 7 and 8, $P(Q_{\mathrm{obs}} \mid \cdot)$ is evaluated by one of the models for DNA mixtures (Chapters 4
and 5), while the last terms are evaluated using the θ-correction discussed in Chapters 2 and 3.
CHAPTER 2
Overdispersion in allelic counts and
θ-correction in forensic genetics
Publication details
Co-authors: None
Journal: Theoretical Population Biology (In Press)
DOI: 10.1016/j.tpb.2010.07.002
Abstract:
We present a statistical model for incorporating the extra variability in allelic counts due to
subpopulation structures. In forensic genetics, this effect is modelled by the identical-by-descent
parameter θ, which measures the relationship between pairs of alleles within a population relative
to the relationship of alleles between populations (Weir, 2007). In our statistical approach,
we demonstrate that θ may be defined as an overdispersion parameter capturing the subpopulation
effects. This formulation allows derivation of maximum likelihood estimates of the allele
probabilities and θ together with computation of the profile log-likelihood, confidence intervals
and hypothesis testing.
In order to compare our method with existing methods, we reanalysed FBI data from Budowle
and Moretti (1999) with allele counts in six US subpopulations. Furthermore, we investigate
properties of our methodology from simulation studies.
Keywords:
θ-correction; Forensic genetics; Subpopulation; Dirichlet-multinomial distribution; Maximum
likelihood estimate; Confidence interval.
2.1 Introduction
Attaching probabilities to different levels of relatedness in paternity disputes or evaluating the
weight of evidence in crime cases with biological traces present at the scene of crime are essential
tasks in forensic genetics. To this purpose, the difference in the genetic constitution of individuals
in the population is used to assess the probabilities of the evidence under competing hypotheses.
Currently, 10 to 20 locations on the genome (loci) are investigated for identification purposes
and an individual's DNA profile is made up by the different states (alleles) of the loci.
It is well known that allele frequencies may vary between ethnic groups, geographically remote
populations and subpopulations. However, due to a common evolutionary past it is assumed that
the allele frequencies of the subpopulations have a common mean, and that the variation between
subpopulations is due to genetic sampling (Weir, 1996).
In forensic genetics, population structures are of great importance when the probability of observing
a given suspect's DNA profile is assessed under various hypotheses. Budowle and
Moretti (1999) published allele frequencies from six different US subpopulations (African American,
Bahamian, Jamaican, Trinidad, Caucasian and Hispanic) for 13 CODIS Core STR loci. In
this study, the authors obtained allele frequency estimates varying significantly across subpopulations.
For example, the frequencies range from 6.9% (Hispanic) to 27.3% (Jamaican) for
allele 28 in locus D21, indicating that a homozygote on this locus could be 16 times more likely
in the Jamaican than in the Hispanic subpopulation (when assuming Hardy-Weinberg equilibrium).
The ability to distinguish true genetic differences from sampling effects depends on the
sample size. That is, testing the significance of such allele frequency differences depends on the
database sizes, since the variance of the estimates scales inversely with the number of sampled
DNA profiles.
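The factor of 16 quoted for allele 28 in locus D21 follows from squaring the two allele frequencies, since a homozygote has probability $p^2$ under Hardy-Weinberg equilibrium. A minimal sketch of the arithmetic (the function name is ours, chosen for illustration):

```python
def homozygote_probability(allele_freq):
    """Hardy-Weinberg probability of carrying two copies of the allele."""
    return allele_freq ** 2

# Allele 28 in locus D21: 27.3% (Jamaican) against 6.9% (Hispanic).
ratio = homozygote_probability(0.273) / homozygote_probability(0.069)
```

The ratio evaluates to roughly 15.7, which the text rounds to 16.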
In order to correct for subpopulation structure, Nichols and Balding (1991) suggested the θ-correction
to be used when inferring the weight of evidence in forensic genetics. Our approach
acknowledges the extra variability in the allelic counts and addresses this as overdispersion. The
statistical model of the present paper has the same properties as the genetic model. We exploit
results from the statistical literature in order to obtain maximum likelihood estimates (MLEs)
of the allele frequencies and θ-parameter, and compute profile log-likelihoods for providing
approximate confidence intervals.
The basic idea and principle of overdispersion in allelic counts formulated in Section 2.2 has
previously been noted in the forensic literature, although not called overdispersion, by Balding
(2005). However, the terminology of overdispersion (or heterogeneity) explicitly underlines that
a simple assumption of the sampling distribution (multinomial distribution) is insufficient to
model the data. Framing the problem as overdispersion makes it easier for statisticians with limited
knowledge of population genetics to appreciate the concept of variability between population
groups. Hence, these rather specialised types of model are put into a more general statistical
framework.
2.2 Overdispersion in allelic counts
Our set-up assumes that the allelic counts in a given subpopulation $X$ follow a multinomial
distribution with some unknown allele probabilities. Due to an evolutionary past, there exists
some variation among different subpopulations in terms of allele probabilities. However, these
allele probabilities have a common distribution across subpopulations with a mean and variance.
For now, we just let $E(P) = \pi$ be the mean of this distribution and $V(P)$ its covariance matrix.
Note that this parametrisation of $E(P)$ implies that $\pi$ are the allele probabilities in the reference
population from which the subpopulations are assumed to have descended.
Let n be the number of alleles sampled from a given subpopulation with k alleles. Then the
model can be formulated as
$$P(X = x \mid P = p) = \binom{n}{x} \prod_{j=1}^{k} p_j^{x_j}, \quad \text{where} \quad \binom{n}{x} = \frac{n!}{\prod_{j=1}^{k} x_j!}, \tag{2.1}$$
is the multinomial coefficient. Thus $X$ follows a multinomial distribution when conditioned
on $P = p$. This implies that $E(X) = E(E\{X \mid P\}) = E(nP) = n\pi$ from the assumption of
$E(P) = \pi$.
2.2.1 Dirichlet-multinomial distribution
In line with other authors (Lange, 1995b,a; Weir, 1996; Rannala and Hartigan, 1996; Balding,
2003), we assume the distribution of allele probabilities to be a Dirichlet distribution. The
assumption of a Dirichlet distribution is based on theoretical arguments from population genetics
together with the convenience that the Dirichlet distribution is the conjugate prior of the
multinomial distribution. The Dirichlet distribution has density function
$$f(p_1, \dots, p_k; \gamma_1, \dots, \gamma_k) = \frac{\Gamma(\gamma_+)}{\prod_{j=1}^{k} \Gamma(\gamma_j)} \prod_{j=1}^{k} p_j^{\gamma_j - 1}, \tag{2.2}$$
where $\gamma_+ = \sum_{j=1}^{k} \gamma_j$. When assuming a Dirichlet distribution of $P$, we can derive the marginal
distribution of $X$ by multiplying (2.2) and (2.1) and integrating over $p$. The resulting distribution
is called the Dirichlet-multinomial distribution (Johnson et al., 1997) or multivariate Pólya
distribution (from its relation to the Pólya urn scheme, Green and Mortera (2009)) with density
$$P(X = x) = \binom{n}{x} \frac{\Gamma(\gamma_+)}{\Gamma(n + \gamma_+)} \prod_{j=1}^{k} \frac{\Gamma(x_j + \gamma_j)}{\Gamma(\gamma_j)}. \tag{2.3}$$
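The density (2.3) is straightforward to evaluate on the log scale. The following is a minimal sketch using only the standard library; the function name and the example parameters are illustrative assumptions and are not taken from the thesis or from any package.

```python
import math
from itertools import product

def dirmult_logpmf(x, gamma):
    """Log of (2.3) for allele counts x and Dirichlet parameters gamma."""
    n, g_plus = sum(x), sum(gamma)
    # log multinomial coefficient n! / prod(x_j!)
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(xj + 1) for xj in x)
    log_ratio = math.lgamma(g_plus) - math.lgamma(n + g_plus)
    log_terms = sum(math.lgamma(xj + gj) - math.lgamma(gj)
                    for xj, gj in zip(x, gamma))
    return log_coef + log_ratio + log_terms

# Illustrative parameters: k = 3 alleles, n = 6 sampled alleles.
gamma = [0.5, 1.5, 3.0]
n = 6
support = [x for x in product(range(n + 1), repeat=3) if sum(x) == n]
total = sum(math.exp(dirmult_logpmf(x, gamma)) for x in support)
mean_first = sum(x[0] * math.exp(dirmult_logpmf(x, gamma)) for x in support)
```

Summing over all count vectors with $\sum_j x_j = n$ gives a simple sanity check that the probabilities add to one, and the mean of the first count can be compared with $n\gamma_1/\gamma_+$.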
Using the results of Mosimann (1962), the mean of $X_j$ may be computed as $E(X_j) = n\gamma_j/\gamma_+$,
where $\gamma_j/\gamma_+$ is the mean of $P_j$, $E(P_j) = \pi_j = \gamma_j/\gamma_+$. Furthermore, the covariance matrix of $X$ is
given by $V(X) = cn[\mathrm{diag}(\pi) - \pi\pi^\top]$, where $c = (n + \gamma_+)/(1 + \gamma_+)$ and $\pi^\top$ is the transpose of $\pi$.
Hence, the covariance matrix of the Dirichlet-multinomial distribution is inflated by the factor $c$
compared to an ordinary multinomial covariance.
The Dirichlet-multinomial distribution derived in (2.3) is almost identical to Eq. (8) in Curran
et al. (1999) except for the multinomial coefficient, which is merely a constant with respect to the
parameters of the model. Furthermore, by introducing $\theta$ as in Curran et al. (1999), $\gamma_+ = (1 - \theta)/\theta$
or equivalently $\theta = (1 + \gamma_+)^{-1}$, we may rewrite $c$ in terms of $\theta$:
$$c = \frac{n + \gamma_+}{1 + \gamma_+} = \theta(n + \gamma_+) = \theta n + (1 - \theta) = 1 + \theta(n - 1).$$
This implies that $V(X) = n[\mathrm{diag}(\pi) - \pi\pi^\top][1 + \theta(n - 1)]$ which is identical to the variance
in Curran et al. (1999). In Curran et al. (1999), this expression was derived by letting $\theta$ denote
the identical-by-descent (IBD) parameter, whereas in the statistical model $\theta$ is an overdispersion
parameter.
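The inflation factor can be checked numerically for a single allele. The sketch below uses the fact that one count of a Dirichlet-multinomial vector is Beta-binomial distributed with parameters $(\gamma_j, \gamma_+ - \gamma_j)$, and compares its exact variance with $n\pi_j(1 - \pi_j)[1 + \theta(n - 1)]$. The function name and the numerical values are illustrative assumptions.

```python
import math

def betabinom_pmf(x, n, a, b):
    """Beta-binomial probability of x successes in n draws, parameters a, b."""
    log_p = (math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
             + math.lgamma(x + a) + math.lgamma(n - x + b)
             - math.lgamma(n + a + b)
             + math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))
    return math.exp(log_p)

n, pi_j, theta = 50, 0.2, 0.03
g_plus = (1 - theta) / theta                 # gamma_+ = (1 - theta) / theta
a, b = pi_j * g_plus, (1 - pi_j) * g_plus    # gamma_j and gamma_+ - gamma_j

pmf = [betabinom_pmf(x, n, a, b) for x in range(n + 1)]
mean = sum(x * p for x, p in enumerate(pmf))
var = sum((x - mean) ** 2 * p for x, p in enumerate(pmf))

c_gamma = (n + g_plus) / (1 + g_plus)        # inflation factor via gamma_+
c_theta = 1 + theta * (n - 1)                # inflation factor via theta
expected_var = n * pi_j * (1 - pi_j) * c_theta
```

With these values the multinomial variance $n\pi_j(1 - \pi_j) = 8$ is inflated by $c = 2.47$, so even a modest $\theta$ more than doubles the variance of an allele count when 50 alleles are sampled.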
A direct implication from $X$ being Dirichlet-multinomial distributed is that the vector of proportions
$\hat{P} = \{\hat{P}_j\}_{j=1}^{k} = \{X_j/n\}_{j=1}^{k}$ is an unbiased estimator of $\pi$ with covariance matrix
$n^{-1}[\mathrm{diag}(\pi) - \pi\pi^\top][1 + \theta(n - 1)]$. When $X$ follows a multinomial distribution, $\hat{P}$ is the maximum likelihood
estimator of $\pi$. However, under the Dirichlet-multinomial model this variance does not go to
zero even for very large sample sizes $n$,
$$\lim_{n \to \infty} \left[\mathrm{diag}(\pi) - \pi\pi^\top\right] \left(\theta + \frac{1 - \theta}{n}\right) = \theta\left[\mathrm{diag}(\pi) - \pi\pi^\top\right],$$
as opposed to the asymptotic behaviour under the multinomial model where
$\lim_{n \to \infty} n^{-1}[\mathrm{diag}(\pi) - \pi\pi^\top] = O$, with $O$ being the null matrix.


Let $Y^n = (Y_1, \dots, Y_n)$ denote the vector of sampled alleles, of which $X^n$ is the sufficient statistic,
where the superscript is added to stress that $n$ alleles were sampled. Consider the probability
$P(Y_{n+1} = j \mid Y^n = y^n)$, i.e. the probability of a future $j$ allele given the alleles previously
sampled:
$$P(Y_{n+1} = j \mid Y^n = y^n) = \frac{\int f(p)\, P(Y_{n+1} = j \mid p) \prod_{i=1}^{n} P(Y_i = y_i \mid p)\, dp}{\int f(p) \prod_{i=1}^{n} P(Y_i = y_i \mid p)\, dp} = \frac{\Gamma(\gamma_j + x_j^{n+1})\, \Gamma(\gamma_+ + n)}{\Gamma(\gamma_j + x_j^{n})\, \Gamma(\gamma_+ + n + 1)} = \frac{x_j^{n}\theta + (1 - \theta)\pi_j}{1 + (n - 1)\theta}, \tag{2.4}$$
where we used $f(p)$ from (2.2) and $x_j^{n+1} = x_j^{n} + 1$. This expression emphasises that the probability
of observing a future $j$ allele only depends on the previously sampled alleles through the total allele
count, $n$, and how many of these alleles were of type $j$, $x_j^{n}$. Hence, we also apply the notation
$P(j \mid x_j^{n})$ for this probability, which is identical to $P_n(A) = (n_A\theta + \{1 - \theta\}p_A)/(1 + \{n - 1\}\theta)$ in the
recursion equation of Balding and Nichols (1997, equation (1), where we changed their notation
from $F$ to $\theta$), which is the probability of observing an $A$ allele after $n_A$ of $n$ alleles being of type
$A$.
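The recursion (2.4) is simple to apply in practice. The following minimal sketch evaluates it for every allele at a locus; the function name, the allele labels and the frequencies are hypothetical values chosen only for illustration.

```python
def next_allele_prob(x_j, n, pi_j, theta):
    """Equation (2.4): probability that the next sampled allele is of type j
    when x_j of the n previously sampled alleles were of type j."""
    return (x_j * theta + (1 - theta) * pi_j) / (1 + (n - 1) * theta)

# Hypothetical locus with three alleles and four alleles already observed.
pi = {"a": 0.1, "b": 0.1, "c": 0.8}
counts = {"a": 1, "b": 1, "c": 2}
n = sum(counts.values())
probs = {j: next_allele_prob(counts[j], n, pi[j], theta=0.03) for j in pi}
```

The probabilities over all alleles sum to one, and setting `theta=0` recovers the plain allele probability $\pi_j$, i.e. independent sampling.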
2.2.2 Application to paternity testing
Forensic genetics is widely used in paternity disputes or when a person applies for a family
reunification. In the setting of a paternity dispute, let $H_1$ be the hypothesis "The alleged father
is the true father" and $H_2$ the hypothesis "A man unrelated to the alleged father is the true
father". The paternity index (PI) is defined as $PI = P(E \mid H_1)/P(E \mid H_2)$, where the evidence, $E$, is
the DNA profiles of the involved individuals, i.e. child, mother and alleged father.
In paternity testing, the θ-correction enters the PI through the assumption of correlated individuals
in the population due to subpopulation structures (Balding and Nichols, 1995; Evett and
Weir, 1998). Consider only one locus where a child's DNA profile is $(ac)$ and its mother's
profile is $(ab)$. Assuming no mutations, the true father must pass on a $c$ allele to the child.
If the alleged father's DNA profile is $(cd)$, the $H_1$-hypothesis implies that the parents $(ab, cd)$
have offspring $(ac)$. The probability of the evidence, given $H_1$, is computed as
$P(E \mid H_1) = P(ac \mid ab, cd)P(ab, cd)$, where $P(ac \mid ab, cd)$ is the probability that a child is $ac$ when its parents
are $(ab, cd)$, i.e. $P(ac \mid ab, cd) = \tfrac{1}{4}$, and $P(ab, cd)$ is the probability for observing alleles $a$, $b$, $c$
and $d$ in the population. The other hypothesis, $H_2$, claims that the child got its $c$ allele from a
man unrelated to the alleged father. Then the paternity index, as derived in Appendix 2.A.1, is
given by
$$PI(\theta) = \frac{1 + 3\theta}{2[\theta + (1 - \theta)\pi_c]}, \tag{2.5}$$
where $PI(\theta)$ is used to emphasise PI's dependence on $\theta$. Tables 1 and 2 in Balding and Nichols
(1995) give the (reciprocal) $PI(\theta)$ for other parent-child scenarios (with $\theta$ denoted by $F$).
As an example, let us assume that this specific trio scenario is replicated for all $S$ loci used for
DNA profile testing. The difference between applying $\theta > 0$ and using the simple
$PI(0) = 1/(2\pi_c)$ for independent profiles is very pronounced even for reasonably common alleles. If
$\pi_c = 0.025$ and $\theta = 0.03$, then $PI(0.03) \approx 10$ while $PI(0) = 20$. Hence, the numerical difference
between the two paternity indexes is for $S$ independent DNA markers $\{PI(0)/PI(0.03)\}^S \approx 2^S$,
which for the typical forensic typing kits with $S \ge 10$ yields a ratio of approximately 1,000 or more.
That is, the evidential weight may decrease by several orders of magnitude by correcting for possible IBD
or population stratification.
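The numbers in this example can be reproduced directly from (2.5). A minimal sketch (function and variable names are ours, chosen for illustration):

```python
def paternity_index(pi_c, theta):
    """Equation (2.5): child (ac), mother (ab), alleged father (cd)."""
    return (1 + 3 * theta) / (2 * (theta + (1 - theta) * pi_c))

pi_c, theta, n_loci = 0.025, 0.03, 10
index_plain = paternity_index(pi_c, 0.0)        # reduces to 1 / (2 * pi_c)
index_corrected = paternity_index(pi_c, theta)
# Ratio of the two indexes when the same scenario is repeated on every locus.
ratio_all_loci = (index_plain / index_corrected) ** n_loci
```

For a single locus the ratio between the two indexes is just below 2, so over ten loci the accumulated ratio is slightly below $2^{10} = 1024$.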
2.2.3 Application to DNA mixtures
When two or more individuals contribute to a biological stain, the observable DNA profile is a
mixture of the various alleles contributing to the stain, and is therefore called a DNA mixture. In
an $m$-person DNA mixture, it is possible to observe 1 to $2m$ alleles per locus, since the involved
DNA profiles may share all or no alleles (see e.g. Tvedebrink et al., 2010, for a further discussion
of DNA mixtures). Assume for a two-person mixture, e.g. a rape case, that we observe the alleles
$(abc)$ and that the victim's DNA profile is $(ab)$ and the suspect's DNA profile is $(cc)$. Then, in
line with the paternity index, the likelihood ratio is defined as $LR = P(E \mid H_1)/P(E \mid H_2)$, where
$E$ is the evidence $(abc)$ and the known DNA profiles, and $H_1$ and $H_2$ are the prosecutor's and
defence's hypotheses, respectively (in the literature $H_p$ and $H_d$ are commonly used for the same
hypotheses). The hypothesis $H_1$ states "The victim and suspect constitute the DNA mixture"
whereas $H_2$ acquits the suspect: "The victim and an unknown individual constitute the DNA
mixture". Let $P(abc \mid ab, ij)$ be the probability of observing the crime scene stain $(abc)$ given the
mixture originates from genotypes $ab$ and $ij$. When assuming no typing errors this probability
is 1 if $(ij) \in \{(ac), (bc), (ca), (cb), (cc)\}$ and 0 otherwise. In line with the derivation of $PI(\theta)$ (see
Appendix 2.A.1 for the details of $PI(\theta)$), we get
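$$LR(\theta) = \frac{P(abc \mid ab, cc)\,P(ab, cc)}{\sum_{ij} P(abc \mid ab, ij)\,P(ab, cc, ij)} = \frac{(1 + 3\theta)(1 + 4\theta)}{(7\theta + \{1 - \theta\}[2\pi_a + 2\pi_b + \pi_c])(2\theta + \{1 - \theta\}\pi_c)} \tag{2.6}$$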
In Figure 2.1, we have plotted the $LR(\theta)$ function for the DNA mixture above with $\pi_a$ and $\pi_b$
fixed at 0.1. The solid line represents the uncorrected $LR(\theta = 0)$ and the broken lines show the
corrected $LR(\theta)$ for $\theta$-values as described by the legend. The inserted plot shows the behaviour
close to the value $\pi_c = 0.71$ where the effect of the θ-correction is reversed. We see that the
effect of $\theta$ is minimal for common alleles and more pronounced for rare ones. Hence, in practice
the larger $\theta$ is the more conservative the LR-estimates are.
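The closed form (2.6) can be cross-checked by enumerating the genotypes of the unknown contributor and applying the sequential sampling probability (2.4) to each of its two alleles. The sketch below does this under the assumptions stated above (no typing errors, typed alleles $a$, $b$, $c$, $c$); all function names and the allele frequencies are illustrative.

```python
from itertools import product

def next_allele_prob(x_j, n, pi_j, theta):
    """Equation (2.4): next allele is of type j given x_j of n were type j."""
    return (x_j * theta + (1 - theta) * pi_j) / (1 + (n - 1) * theta)

def lr_closed_form(pi, theta):
    """Equation (2.6); pi maps 'a', 'b', 'c' to allele probabilities."""
    num = (1 + 3 * theta) * (1 + 4 * theta)
    den = ((7 * theta + (1 - theta) * (2 * pi["a"] + 2 * pi["b"] + pi["c"]))
           * (2 * theta + (1 - theta) * pi["c"]))
    return num / den

def lr_by_enumeration(pi, theta):
    """Reciprocal of the summed probability of all ordered genotypes (i, j)
    that explain (abc) with the victim (ab), given the alleles a, b, c, c
    typed in the victim and the suspect."""
    typed = ["a", "b", "c", "c"]
    denom = 0.0
    for i, j in product("abc", repeat=2):
        if "c" not in (i, j):
            continue  # allele c in the stain would be left unexplained
        seen, p = list(typed), 1.0
        for allele in (i, j):
            p *= next_allele_prob(seen.count(allele), len(seen),
                                  pi[allele], theta)
            seen.append(allele)
        denom += p
    return 1.0 / denom

pi = {"a": 0.1, "b": 0.1, "c": 0.3}
lr_plain = lr_closed_form(pi, 0.0)
lr_corrected = lr_closed_form(pi, 0.03)
```

For this moderately rare allele $c$ the corrected LR is smaller than the uncorrected one, in line with the conservative effect of $\theta$ described above.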
The use of $\theta$ in evidential computations can be seen as a means of smoothing the allele probabilities
over possible subpopulations and thereby adjusting for the uncertainty associated with
unobserved or unobservable substructures in the larger database. This latent structure may
be seen as a reason for overdispersion in statistical terms, i.e. inhomogeneity due to
unobserved/unobservable variables.
[Figure 2.1: plot of the likelihood ratio, LR(θ), against π_c for θ = 0.00, 0.01, 0.02, 0.03 and 0.04, with an inset covering π_c from 0.68 to 0.76.]
Figure 2.1: The effect of θ on LR(θ) for a single locus as exemplified. LR(θ) is plotted for
various θ-values ranging from 0.00 (no subpopulation effect) to 0.04 (large subpopulation effect)
against the allele frequency of the allele in question (here allele c) with the other probabilities
(π_a and π_b) fixed at 0.1. Inserted is a blow-up of the curve near π_c = 0.71 (this point is marked in the plot).
If the suspect or alleged father in the two situations considered above has an ethnicity or
nationality that indicates that a specific database is representative for his genetic origin then allele
frequencies estimated from this database are the most appropriate reference sample to use for
evidential weight calculations. However, the database and the population that it resembles may
be constituted by several subpopulations or groups, which causes this conceptual population to
be heterogeneous. That is, geopolitical or tribal structures together with marital and religious
preferences may induce genetic diversity causing overdispersion. Hence, a database that seems
to be the most appropriate for a particular suspect may not be sampled on a sufficiently high
resolution to obtain a homogeneous reference subpopulation. In fact, it may not even be possible
to obtain samples with this property. Thus, genetic diversity and the resulting overdispersion in
allele counts must be accounted for by the θ-correction.
2.3 Parameter estimation
Assume that we have allelic counts from $N$ different subpopulations such that $x_{ij}$ denotes the
number of allele $j$ in subpopulation $i$ and that for each subpopulation $i$, $i = 1, \dots, N$, there is a
total of $n_i$ counts, $n = (n_1, \dots, n_N)$. In addition, we assume that the subpopulations are independent,
implying that the likelihoods of the counts from the subpopulations multiply.
The likelihood may then be derived by multiplying over the terms of (2.3). This likelihood
implies differentiation of $\Gamma$-functions in order to solve the likelihood equations. A useful
observation about the $\Gamma$-function is that
$$\prod_{r=1}^{y} \{\gamma + (r - 1)\} = \gamma(\gamma + 1) \cdots (\gamma + y - 1) = \frac{\Gamma(\gamma + y)}{\Gamma(\gamma)}, \tag{2.7}$$
using the fact that $x\Gamma(x) = \Gamma(x + 1)$. Hence, an equivalent way of expressing the distribution in
(2.3) using the rising factorials of (2.7) is given as
$$P(X = x) = \binom{n}{x} \frac{\prod_{j=1}^{k} \prod_{r=1}^{x_j} \{\pi_j(1 - \theta) + (r - 1)\theta\}}{\prod_{r=1}^{n} \{1 - \theta + (r - 1)\theta\}}.$$
From this probability function we can compute the log-likelihood function $\ell(\pi, \theta; x)$. Discarding
the multinomial constant (which is a constant with respect to the parameters), the log-likelihood
is
$$\ell(\pi, \theta; x) = \sum_{i=1}^{N} \sum_{j=1}^{k} \sum_{r=1}^{x_{ij}} \log\{\pi_j(1 - \theta) + (r - 1)\theta\} - \sum_{i=1}^{N} \sum_{r=1}^{n_i} \log\{1 - \theta + (r - 1)\theta\}. \tag{2.8}$$
The corresponding likelihood equations, $\partial\ell(\pi, \theta; x)/\partial(\pi, \theta) = 0$, cannot be solved analytically
for the parameters; hence numerical methods need to be invoked for parameter estimation. Let
$\phi$ denote the parameter vector $\phi = (\pi, \theta) = (\{\pi_j\}_{j=1}^{k-1}, \theta)$, since $\pi_k = 1 - \sum_{j=1}^{k-1} \pi_j$. A possible
numerical method for solving the likelihood equations is Fisher-scoring, where the parameter
estimates in each iteration are updated using
$\hat{\phi}^{(m+1)} = \hat{\phi}^{(m)} + \{I(\hat{\phi}^{(m)})\}^{-1} u(\hat{\phi}^{(m)})$, where
$\hat{\phi}^{(m)}$ is the estimate in the $m$th iteration, $u(\hat{\phi}^{(m)})$ is the score function, $\partial\ell(\phi; x)/\partial\phi$, and $I(\hat{\phi}^{(m)})$ is
the expected Fisher Information Matrix (FIM), both evaluated in $\hat{\phi}^{(m)}$. Paul et al. (2005) derived
exact expressions for the expected FIM entries, $I(\phi)$. The results of Paul et al. (2005) imply that
the expected FIM may be computed using expressions only involving the marginal distributions
of $X_j$. Similar results were obtained by Neerchal and Morel (2005).
2.3.1 Computational considerations
Most of the methodology discussed in this section and subsections hereof has been implemented
in the R-package dirmult available on-line in the CRAN repository at http://www.r-project.org
(Tvedebrink, 2009).
Even though the expressions for the expected FIM, $I(\pi, \theta)$, given in Paul et al. (2005, p. 232)
are compact, they cause the parameter estimation to be computationally inefficient. Numerical
work has shown that it is much more convenient to estimate the $\gamma$-parameters and transform the
estimates, rather than estimate $\pi$ and $\theta$ directly. The log-likelihood $\ell(\gamma; x)$ is
$$\ell(\gamma; x) = \sum_{i=1}^{N} \sum_{j=1}^{k} \sum_{r=1}^{x_{ij}} \log\{\gamma_j + r - 1\} - \sum_{i=1}^{N} \sum_{r=1}^{n_i} \log\{\gamma_+ + r - 1\}, \tag{2.9}$$
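where we used (2.3) and (2.7). The first-order and second-order derivatives of the log-likelihood
$\ell(\gamma; x)$ are given by
$$\frac{\partial\ell(\gamma; x)}{\partial\gamma_j} = \sum_{i=1}^{N} \left\{ \sum_{r=1}^{x_{ij}} \frac{1}{\gamma_j + r - 1} - \sum_{r=1}^{n_i} \frac{1}{\gamma_+ + r - 1} \right\} \tag{2.10}$$
$$\frac{\partial^2\ell(\gamma; x)}{\partial\gamma_j^2} = \sum_{i=1}^{N} \left\{ \sum_{r=1}^{n_i} \frac{1}{(\gamma_+ + r - 1)^2} - \sum_{r=1}^{x_{ij}} \frac{1}{(\gamma_j + r - 1)^2} \right\} \tag{2.11}$$
$$\frac{\partial^2\ell(\gamma; x)}{\partial\gamma_j\,\partial\gamma_l} = \sum_{i=1}^{N} \sum_{r=1}^{n_i} \frac{1}{(\gamma_+ + r - 1)^2}, \tag{2.12}$$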
where (2.10) gives the elements of the score function $u(\gamma)$. Furthermore, this implies that the
diagonal elements of the expected FIM, $I(\gamma)$, are
$$I(\gamma_j, \gamma_j) = \sum_{i=1}^{N} \left\{ \sum_{r=1}^{x_{ij}} \frac{P(X_{ij} \ge r)}{(\gamma_j + r - 1)^2} - \sum_{r=1}^{n_i} \frac{1}{(\gamma_+ + r - 1)^2} \right\},$$
for $j = 1, \dots, k$, and the off-diagonal elements, $I(\gamma_j, \gamma_l)$, equal $-$(2.12). However, for most practical
purposes using the observed FIM, $J(\gamma)$, rather than the expected FIM, $I(\gamma)$, in the Newton-Raphson
scoring ensures much lower computational time. Numerical investigations indicate that
the $J(\gamma)$-implementation converges to the same extrema, and much more quickly, since the diagonal
elements, $J(\gamma_j, \gamma_j)$, for this matrix are as in (2.11), i.e. the terms $P(X_{ij} \ge r)$, $r = 1, \dots, x_{ij}$, where
$X_{ij} \sim \text{Beta-Binomial}(\gamma_j, \gamma_+ - \gamma_j)$, need not be computed.
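The log-likelihood (2.9) and the score (2.10) in the $\gamma$-parametrisation translate almost literally into code. The sketch below is a plain standard-library illustration, not the implementation of the dirmult package; the function names and the count matrix are hypothetical.

```python
import math

def loglik_gamma(counts, gamma):
    """Equation (2.9); counts[i][j] is x_ij and gamma[j] is gamma_j."""
    g_plus = sum(gamma)
    total = 0.0
    for row in counts:
        for x_ij, g_j in zip(row, gamma):
            total += sum(math.log(g_j + r - 1) for r in range(1, x_ij + 1))
        total -= sum(math.log(g_plus + r - 1) for r in range(1, sum(row) + 1))
    return total

def score_gamma(counts, gamma):
    """Equation (2.10): gradient of loglik_gamma with respect to gamma."""
    g_plus = sum(gamma)
    grad = [0.0] * len(gamma)
    for row in counts:
        common = sum(1.0 / (g_plus + r - 1) for r in range(1, sum(row) + 1))
        for j, (x_ij, g_j) in enumerate(zip(row, gamma)):
            grad[j] += sum(1.0 / (g_j + r - 1)
                           for r in range(1, x_ij + 1)) - common
    return grad

def to_pi_theta(gamma):
    """Transform gamma to (pi, theta): pi_j = gamma_j / gamma_+ and
    theta = 1 / (1 + gamma_+)."""
    g_plus = sum(gamma)
    return [g / g_plus for g in gamma], 1.0 / (1.0 + g_plus)

# Hypothetical counts: N = 3 subpopulations (rows), k = 3 alleles (columns).
counts = [[12, 5, 3], [7, 9, 4], [15, 2, 3]]
gamma = [2.0, 1.5, 1.0]
value = loglik_gamma(counts, gamma)
gradient = score_gamma(counts, gamma)
pi_hat, theta_hat = to_pi_theta(gamma)
```

A finite-difference comparison of `score_gamma` against `loglik_gamma` is a cheap way to validate the derivatives before plugging them into a Newton-Raphson or Fisher-scoring iteration.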
The inverse of the expected FIM is the asymptotic covariance matrix of the MLE. As our interest
is in $(\pi, \theta)$, we exploit that $I(\pi, \theta) = \Delta^\top I(\gamma) \Delta$, where $\{\Delta\}_{ij} = \{\partial\gamma/\partial(\pi, \theta)\}_{ij}$.
Simulations
Standard asymptotic theory assures that the MLE is the most efficient estimator. However, inference
about $\theta$ depends mainly on the number of subpopulations sampled, $N$, and only to a minor
degree on the subpopulation sample sizes, $n$. Hence, in order to verify our implementation and
the performance of the maximum likelihood estimator for different numbers of subpopulations,
we simulated data with known allele frequencies, $\pi$, and $\theta$-value. When simulating the $m$th data
matrix, $x_m$, for $m = 1, \dots, M$, we used the following sampling scheme:
1. Draw p

i,m
Dirichlet({
j
(1)/}
k
j=1
), i = 1, . . . , N.
2. Draw x
i,m
Multinomial(n
i
, p

i,m
), i = 1, . . . , N.
3. The mth data matrix is x
m
= [x
1,m
, . . . , x
N,m
]

.
This ensures that the random variable $X_i$, of which $x_{i,m}$ is a realisation, follows a Dirichlet-multinomial distribution with parameters $\pi$ and $\theta$. Note that the concept of $N$ subpopulations is a theoretical one. In practice only an overall database would exist which neglects the present substructure. However, the intention is to account for this partitioning using the $\theta$-correction.
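The three-step sampling scheme above can be sketched as follows. This is an illustrative Python version assuming NumPy; the function name `simulate_counts` and its arguments are hypothetical, and the allele frequencies in the usage note are made up rather than taken from the D13 data.

```python
import numpy as np


def simulate_counts(pi, theta, n, rng):
    """Simulate one data matrix: one row of allele counts per subpopulation."""
    pi = np.asarray(pi, dtype=float)
    alpha = pi * (1.0 - theta) / theta      # Dirichlet parameters pi_j (1 - theta) / theta
    rows = []
    for n_i in n:
        p_i = rng.dirichlet(alpha)          # step 1: subpopulation allele probabilities
        rows.append(rng.multinomial(n_i, p_i))  # step 2: allele counts given p_i
    return np.vstack(rows)                  # step 3: stack into an N x k matrix
```

For example, `simulate_counts([0.2, 0.3, 0.5], 0.03, [200] * 8, np.random.default_rng(1))` returns an 8 x 3 matrix whose rows each sum to 200.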
In Weir and Hill (2002), the authors argue that if the expectation of a ratio was the ratio of expectations then the method of moment (MoM) estimator, $\hat\theta_{MoM}$, of Weir and Hill (2002, equation 5) was an unbiased estimator of $\theta$:
$$\hat\theta_{MoM} = \frac{\sum_{j=1}^{k} (MSP_j - MSG_j)}{\sum_{j=1}^{k} (MSP_j + (n_c - 1) MSG_j)},$$
where $n_c = (N-1)^{-1} \left[ \sum_{i=1}^{N} n_i - n_+^{-1} \sum_{i=1}^{N} n_i^2 \right]$ and $n_+ = \sum_{i=1}^{N} n_i$. The quantities $MSG_j$ and $MSP_j$ are two mean squares defined as
$$MSP_j = \frac{1}{N-1} \sum_{i=1}^{N} n_i (\tilde p_{ij} - \bar p_j)^2 \quad \text{and} \quad MSG_j = \frac{1}{\sum_{i=1}^{N} (n_i - 1)} \sum_{i=1}^{N} n_i \tilde p_{ij} (1 - \tilde p_{ij})$$
with $\tilde p_{ij} = x_{ij}/n_i$ and $\bar p_j = n_+^{-1} \sum_{i=1}^{N} x_{ij}$. Even though the expectation does not satisfy the property mentioned above, the $\hat\theta_{MoM}$-estimator seems to perform reasonably well on average.
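Because the MoM estimator is a closed-form expression in the counts, it translates into a few lines of code. The sketch below assumes NumPy and a count matrix `x` with subpopulations in rows and alleles in columns; the function name `theta_mom` is hypothetical.

```python
import numpy as np


def theta_mom(x):
    """Method-of-moments estimate of theta from an N x k count matrix."""
    x = np.asarray(x, dtype=float)
    N = x.shape[0]
    n = x.sum(axis=1)                       # subpopulation sample sizes n_i
    n_plus = n.sum()
    p_tilde = x / n[:, None]                # within-subpopulation frequencies
    p_bar = x.sum(axis=0) / n_plus          # overall frequencies
    msp = (n[:, None] * (p_tilde - p_bar) ** 2).sum(axis=0) / (N - 1)
    msg = (n[:, None] * p_tilde * (1.0 - p_tilde)).sum(axis=0) / (n - 1.0).sum()
    n_c = (n_plus - (n ** 2).sum() / n_plus) / (N - 1)
    return (msp - msg).sum() / (msp + (n_c - 1.0) * msg).sum()
```

Two subpopulations with identical counts give a negative estimate, which illustrates that the MoM estimator is not restricted to the parameter space.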
More recently, Zhou and Lange (2010) have derived MM (minorisation-maximisation) algorithms for some discrete multivariate distributions, among these the Dirichlet-multinomial distribution. The authors have provided Matlab scripts (online supplementary material available at the website of the Journal of Computational and Graphical Statistics) for estimating parameters in the MM set-up.
In the following we compare the MLE, MoM and MM estimates on simulated data using the relative frequencies in locus D13 from data published in Budowle and Moretti (1999) as $\pi$ and $\theta = 0.03$. The box plots in Figure 2.2 show $\theta$-estimates of 100 simulated datasets ($M = 100$) with sample sizes, $n_i$, of 200 and an increasing number of databases (increasing number of subpopulations, $N$).
From the box plot it is evident that the MLE has a lower variance, but also that on average the MoM and MM estimates are closer to the true value. However, as the number of databases increases so does the accuracy of the estimates, as one would expect. In addition to the accuracy of the estimation procedure, it is relevant to compare the computational speed and ease of implementation of the various methods. Naturally, the MoM estimator is the easiest to implement, and since no iterations are applied, convergence happens immediately. Both MLE and MM estimates are based on iterative procedures. Whereas several statistical tools exist for easy implementation of Newton-Raphson iterations, a little more code needs to be written for MM algorithms. However, the script files of Zhou and Lange (2010) elegantly demonstrated how these obstacles can be handled in Matlab. We compared the computation times for the various iterative methods (Zhou and Lange (2010) implemented simple and more advanced MM methods in their paper) and the number of iterations needed to satisfy the convergence criteria. The MLE method implemented in R is always faster and needs fewer iterations for convergence compared to the standard MM implementation. However, the more advanced MM updating schemes are more efficient than the MLE for small database counts. We tested the same algorithms on larger datasets (Danish and Greenlandic forensic databases of 20,000 and 2,000 DNA profiles). For these larger databases, the MLE implementation was 10 times faster than the specialised MM algorithms and up to 1,000 times faster than the standard MM implementation. However, this is only true when using the observed FIM, $J(\gamma)$, while the computation of the expected FIM, $I(\gamma)$, is very slow even for databases of moderate size.

Figure 2.2: Box plots of 100 $\theta$-estimates based on simulated data with $\theta = 0.03$ for an increasing number of databases (2, 4, 8, 16, 32 and 64) with a fixed number of observations per database ($n_i = 200$ for all $i$). White boxes are MLE, grey boxes are MoM estimates, dark grey boxes are MM estimates, and the light grey boxes are posterior means. A plotting symbol indicates the average of the estimates within each block.
Profile log-likelihood
From the box plots in Figure 2.2, there seems to be a tendency for the MLE to underestimate the $\theta$-parameter. In order to investigate the reason for this behaviour and compute the confidence intervals for $\theta$, we derived the profile log-likelihood, $\hat\ell(\theta) = \max_{\pi} \ell(\pi, \theta; x)$, for $\theta$. That is, fixing $\theta$ at some value $\tilde\theta$ and finding the maximum likelihood value under this constraint. By fixing $\theta$ at $\tilde\theta$ we also fix $\gamma_+$ at $\tilde\gamma_+ = (1 - \tilde\theta)/\tilde\theta$. Hence, we are maximising the regular log-likelihood under the constraint that $\gamma_+ = \tilde\gamma_+$.
Since the analytical form of the log-likelihood is complicated, the only way to evaluate the profile log-likelihood is by numerical methods as for the maximum likelihood estimation. Applying a Lagrange multiplier, $\lambda$, we need to find the stationary points of $\tilde\ell(\gamma) = \ell(\gamma; x) + \lambda(\tilde\gamma_+ - \gamma_+)$. The partial derivatives yield
$$\frac{\partial \tilde\ell(\gamma)}{\partial \gamma_i} = \frac{\partial \ell(\gamma; x)}{\partial \gamma_i} - \lambda; \qquad \frac{\partial \tilde\ell(\gamma)}{\partial \lambda} = \tilde\gamma_+ - \gamma_+,$$
which implies that the score function for this new system is $u(\gamma, \lambda) = (u(\gamma) - \lambda 1_k,\ \tilde\gamma_+ - \gamma_+)^\top$, where $u(\gamma)$ is the score function from (2.10) and $1_k$ is a $k$-dimensional vector of ones. The observed FIM, $J(\gamma, \lambda)$, is also almost preserved from the likelihood equations,
$$J(\gamma, \lambda) = \begin{pmatrix} J(\gamma) & 1_k \\ 1_k^\top & 0 \end{pmatrix},$$
where $J(\gamma)$ is the observed FIM from Section 2.3.1. Hence, we may apply Newton-Raphson iterations in order to maximise $\ell(\gamma; x)$ under the constraint $\gamma_+ = \tilde\gamma_+$. Alternatively, this constrained optimisation problem could have been solved using (recursive) quadratic programming. However, for this particular log-likelihood function the Newton-Raphson procedure works very well with Lagrange multipliers, and the existing code for maximisation is easily extended for handling the extra terms induced by the constraints.
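A single constrained Newton-Raphson update of the kind described above might look as follows. This is a sketch only, assuming NumPy and the sign convention in which the Lagrangian is $\ell(\gamma; x) + \lambda(\tilde\gamma_+ - \gamma_+)$; the function names and the argument `gamma_plus_target` (playing the role of $\tilde\gamma_+$) are hypothetical, and no step-halving or positivity safeguard is included.

```python
import numpy as np


def _inv_power_sum(base, count, power):
    """Return sum_{r=1}^{count} (base + r - 1)^(-power)."""
    r = np.arange(1, int(count) + 1)
    return float(((base + r - 1.0) ** (-power)).sum())


def score_and_observed_fim(gamma, x):
    """Score u(gamma) from (2.10) and observed FIM J(gamma) = -Hessian."""
    gamma = np.asarray(gamma, dtype=float)
    x = np.asarray(x)
    k = gamma.size
    g_plus = gamma.sum()
    n = x.sum(axis=1)
    s1 = sum(_inv_power_sum(g_plus, n_i, 1) for n_i in n)
    s2 = sum(_inv_power_sum(g_plus, n_i, 2) for n_i in n)
    u = np.array([sum(_inv_power_sum(gamma[j], c, 1) for c in x[:, j]) - s1
                  for j in range(k)])
    d = np.array([sum(_inv_power_sum(gamma[j], c, 2) for c in x[:, j])
                  for j in range(k)])
    return u, np.diag(d) - s2 * np.ones((k, k))


def constrained_newton_step(gamma, lam, x, gamma_plus_target):
    """One Newton-Raphson update of (gamma, lambda) under sum(gamma) = target."""
    gamma = np.asarray(gamma, dtype=float)
    k = gamma.size
    u, J = score_and_observed_fim(gamma, x)
    ones = np.ones(k)
    # Augmented observed FIM: J(gamma) bordered by a vector of ones.
    J_aug = np.block([[J, ones[:, None]], [ones[None, :], np.zeros((1, 1))]])
    # Augmented score: (u - lambda * 1_k, target - gamma_+).
    rhs = np.append(u - lam * ones, gamma_plus_target - gamma.sum())
    step = np.linalg.solve(J_aug, rhs)
    return gamma + step[:k], lam + step[k]
```

Since the constraint is linear in $\gamma$, the updated $\gamma$ sums to the target value after a single step, whatever the starting point; the remaining iterations only have to drive $u(\gamma) - \lambda 1_k$ to zero.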
In Figure 2.3 the profile log-likelihood for simulated data with $\theta = 0.03$ is plotted. Each panel is standardised such that the maximum value of $\hat\ell(\theta)$ is zero, $2[\hat\ell(\theta) - \hat\ell(\hat\theta)]$. The intersection of the dotted line and the profile log-likelihood indicates a 95%-confidence interval for $\theta$ based on a $\chi^2_1$-approximation of $-2\hat\ell(\theta)$. In each panel the associated MLE, MoM and MM estimates are plotted together with the true $\theta$-value, each marked by its own plotting symbol. In all six panels the true value is included in the confidence intervals. As one would expect, the width of the confidence intervals decreases as the number of datasets increases. There are profound arguments for using the $\chi^2$-approximation of the partially maximised log-likelihood as opposed to using asymptotic results relying on approximate normality of the MLE with a covariance matrix asymptotically equal to the inverse FIM (Barndorff-Nielsen and Cox, 1994).
From Figure 2.3, it is evident that the profile log-likelihood is skew for small numbers of databases. This pronounced departure from symmetry explains the bias of the MLE and MM estimate for small numbers of databases. Using a Bayesian approach, one may assume a uniform prior on $\theta$. This implies that the posterior distribution of $\theta$ approximately equals the profile likelihood, $p(\theta|x) \propto \exp[\hat\ell(\theta)]$. The posterior mean, $E(\theta|x)$, may be evaluated using a numerical approximation,
$$E(\theta|x) = \int \theta\, p(\theta|x)\, d\theta \approx \frac{\sum_{i=1}^{n} \theta_i \exp[\hat\ell(\theta_i)]}{\sum_{i=1}^{n} \exp[\hat\ell(\theta_i)]}. \qquad (2.13)$$
The $n$ different $\theta$-values used in the sums of (2.13) are the same as those used for computing the profile log-likelihood, e.g. equidistant points covering the 95%-confidence interval. Table 2.1 lists the posterior means and estimates for the data in Figure 2.3, where the data points used for computing each posterior mean lie within the 95%-confidence interval.
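The approximation in (2.13) is a weighted average over the grid of $\theta$-values. A minimal Python sketch, using only the standard library, is given below; the function name `posterior_mean` is hypothetical, and subtracting the maximum before exponentiating is a numerical safeguard that leaves the ratio unchanged.

```python
import math


def posterior_mean(theta_grid, profile_loglik):
    """Approximate E(theta | x) as in (2.13) from profile log-likelihood values."""
    top = max(profile_loglik)
    # exp(l - top) rescales numerator and denominator by the same factor.
    weights = [math.exp(l - top) for l in profile_loglik]
    return sum(t * w for t, w in zip(theta_grid, weights)) / sum(weights)
```

For a profile log-likelihood that is symmetric around its maximum, the posterior mean coincides with the MLE; for a right-skewed profile it is pulled above the MLE.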
Figure 2.3: Profile log-likelihoods, $2[\hat\ell(\theta) - \hat\ell(\hat\theta)]$ plotted against $\theta$, for simulated data for an increasing number of databases (panels for 2, 4, 8, 16, 32 and 64 databases) with $\theta = 0.03$ for all simulations. The true value, the MLE, MoM, MM and posterior mean (+) are each marked by a plotting symbol and are plotted together with a 95%-confidence interval (intersection of the dotted line and the profile log-likelihood curve). The horizontal dashed and solid lines represent bootstrap confidence intervals based on randomisation and cluster resampling, respectively.
Table 2.1: Posterior means and estimates for the data in Figure 2.3 ($\theta = 0.03$).
Number of databases   MLE      MoM      MM       Posterior mean
 2                    0.0395   0.1315   0.0411   0.0532
 4                    0.0271   0.0374   0.0278   0.0310
 8                    0.0243   0.0349   0.0249   0.0263
16                    0.0258   0.0247   0.0265   0.0267
32                    0.0297   0.0345   0.0306   0.0303
64                    0.0286   0.0316   0.0295   0.0290
We see that the posterior mean in most situations improves on the MLE (except for the first row) and reduces the amount of bias for small numbers of databases. In Figure 2.2 the light grey boxes (rightmost box and whiskers for each stratum) represent the posterior means for the simulated data computed using $\theta$-values within the 95%-confidence interval for the associated MLE. Table 2.1 indicates that the bias is reduced for the posterior means, with only a minor increment in the variance (see Figure 2.2).
A full Bayesian implementation with prior distributions on $\pi$ and $\theta$ (or equivalently on the $\gamma$-parameters) was not pursued in this study. However, several authors (see e.g. Holsinger, 1999) have discussed estimation of $\theta$ (and other population genetics diversity measures) from a Bayesian perspective. We refer to the review paper by Holsinger and Weir (2009) for further results and discussions on Bayesian methodologies.
Bootstrapping confidence intervals
In addition to computing a confidence interval for $\theta$ using the $\chi^2_1$-approximation of the profile log-likelihood, we also investigated the performance of bootstrap methods to construct the confidence intervals. However, there are some problems when bootstrapping clustered data in order to assess the variability of the intra-cluster correlation parameter $\theta$.
Several studies (Davison and Hinkley, 1997; Ukoumunne et al., 2003; Fields and Welsh, 2007) indicate that special attention needs to be paid when one applies the bootstrap methodology to this problem. The general recommendation is to sample on a subpopulation (cluster) level rather than an individual (randomised) level due to the dependence structure implied by the intra-cluster correlation factor. In Figure 2.3, we have superimposed bootstrap confidence intervals (horizontal solid and dashed lines) based on both kinds of bootstrap regime. The general picture is that the cluster sampling underestimates $\theta$ (solid line, missing in the first two panels due to few databases), whereas the randomised bootstrap provides overestimated values (dashed line).
From numerical studies we recommend the use of the profile log-likelihood method in order to estimate the confidence intervals for $\theta$ since this method is valid for any number of subpopulations in the data. This might not be surprising (Davison and Hinkley, 1997; Ukoumunne et al., 2003; Fields and Welsh, 2007). However, bootstrapping is often applied when assessing the variability of estimates, but for $\theta$ this is inappropriate.
Significance test
Testing whether $\theta$ satisfies certain numerical properties is interesting since equality of $\theta$ across loci simplifies PI and LR computations. Further simplifications are possible if $\theta = 0$ is supported by data. This implies that there is no detectable difference among the databases, where the reasons for this may be small sample sizes (and thus large variation), or that the databases are as if sampled from a homogeneous population.
Samanta et al. (2009, Section 3: Hypothesis testing) derived hypothesis tests for inference about $\theta$ under various population assumptions. Here we begin by testing for equality of $\theta_s$ for the various loci, $s = 1, \dots, S$. The null hypothesis is $\theta_1 = \cdots = \theta_S = \tilde\theta$ for some unknown $\tilde\theta$, where $\theta_s$ is the $\theta$-value for locus $s$, with the alternative hypothesis specifying that at least one $\theta_s$ is different from $\tilde\theta$. In order to test this hypothesis, we evaluate
$$\tilde\ell(\{\pi_s\}_{s=1}^{S}, \tilde\theta; x) = \sum_{s=1}^{S} \ell_s(\pi_s, \tilde\theta; x_s) \qquad (2.14)$$
where $\ell_s$ is the regular log-likelihood in (2.8) with $\theta_s = \tilde\theta$ for all $s$. The test statistic is given by
$$-2 \log Q = -2 \left[ \tilde\ell(\{\pi_s\}_{s=1}^{S}, \tilde\theta; x) - \sum_{s=1}^{S} \ell_s(\pi_s, \hat\theta_s; x_s) \right],$$
and is approximately $\chi^2_{S-1}$-distributed, with $S - 1$ degrees of freedom (DoF). Details of finding stationary points of (2.14) are given in Appendix 2.A.2.
Furthermore, testing whether $\theta = 0$ is another interesting hypothesis test. Under the null hypothesis there is no evident substructure in the data. Having support for $\theta = 0$ implies that DNA profiles may be regarded as independent, which has a high influence on the estimation of the evidential weight (see Sections 2.2.2 and 2.2.3). The Dirichlet-multinomial model with $\theta = 0$ is equivalent to the simpler multinomial model. However, testing the hypothesis that $\theta = 0$ cannot be based on asymptotic theory, nor inferred from the inclusion/exclusion of zero in the confidence intervals from the profile log-likelihood, since $\theta = 0$ lies on the boundary of the parameter space.
A possible method is to use a parametric bootstrap, where we simulate data $x^*_1, \dots, x^*_M$ under the null hypothesis $\theta = 0$. From these simulated data we estimate $\hat\theta^*_m$ and obtain an approximate distribution of $\hat\theta$ under the null hypothesis, which we apply in order to test the significance of $\hat\theta > 0$ for the observed data, $x$. Hence, the parametric bootstrap comprises two steps: (1) draw $x^*_m \sim \text{Multinomial}(\{x_{i+}\}_{i=1}^{N}, \{x_{+j}\}_{j=1}^{k}/n_+)$ and (2) estimate $\hat\theta^*_m$.
By choosing $M$ large, e.g. $M = 1000$, one gets $M$ estimates of $\theta$, most of which should be smaller than $\hat\theta$ when the hypothesis $\theta = 0$ is false. An empirical p-value is computed by $\#\{\hat\theta^*_m > \hat\theta\}/M$, i.e. the ratio of the number of larger parametric bootstrap estimates to the total number of bootstraps.
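The two bootstrap steps and the empirical p-value can be sketched as below. This is an illustration only, assuming NumPy; for self-containedness it plugs in a MoM estimator, whereas the text applies the test with both the MLE and the MoM. The names `theta_mom` and `bootstrap_p_value` are hypothetical.

```python
import numpy as np


def theta_mom(x):
    """Method-of-moments estimate of theta (used here as the plug-in estimator)."""
    x = np.asarray(x, dtype=float)
    N = x.shape[0]
    n = x.sum(axis=1)
    n_plus = n.sum()
    p_tilde = x / n[:, None]
    p_bar = x.sum(axis=0) / n_plus
    msp = (n[:, None] * (p_tilde - p_bar) ** 2).sum(axis=0) / (N - 1)
    msg = (n[:, None] * p_tilde * (1.0 - p_tilde)).sum(axis=0) / (n - 1.0).sum()
    n_c = (n_plus - (n ** 2).sum() / n_plus) / (N - 1)
    return (msp - msg).sum() / (msp + (n_c - 1.0) * msg).sum()


def bootstrap_p_value(x, estimator=theta_mom, M=1000, seed=0):
    """Empirical p-value for H0: theta = 0 from a parametric bootstrap."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    n = x.sum(axis=1)                  # row totals x_{i+}
    p0 = x.sum(axis=0) / x.sum()       # pooled allele frequencies x_{+j} / n_+
    theta_obs = estimator(x)
    theta_null = np.empty(M)
    for m in range(M):
        # Step (1): multinomial counts with common probabilities (theta = 0).
        x_star = np.vstack([rng.multinomial(n_i, p0) for n_i in n])
        # Step (2): re-estimate theta on the simulated data.
        theta_null[m] = estimator(x_star)
    return float((theta_null > theta_obs).mean())
```

With strongly differentiated subpopulations, such as counts `[[90, 10], [10, 90]]`, essentially no bootstrap estimate exceeds the observed one, so the empirical p-value is close to zero.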
2.4 Results
The paper of Budowle and Moretti (1999) presents allele frequencies of 13 CODIS Core STR loci in six US subpopulations. The data have previously been used to estimate the magnitude of $\theta$ used for forensic purposes; see e.g. Weir (2007). Henceforth we refer to these data as FBI data.
Estimates of $\theta$ based on the MoM, MLE and MM are given in Table 2.2. There are some distinct differences between $\hat\theta_{MLE}$ and $\hat\theta_{MoM}$, often by a factor of two; furthermore, the standard errors are often very much smaller for the MLE than for the MoM estimates. The standard errors are asymptotic, where SE($\hat\theta_{MoM}$) is based on a Taylor series approximation by Li (Weir and Hill, 2002, pp. 730), and SE($\hat\theta_{MLE}$) $= \{(I^{-1})_{\theta,\theta}\}^{1/2}$ from Section 2.3.1. Standard errors of the MM estimates are not readily obtained from the Matlab scripts of the supplementary material of Zhou and Lange (2010), hence these are not provided in Table 2.2. The ratio $\hat\theta_{MoM}/\hat\theta_{MLE}$ of the estimates in Table 2.2 repeats the pattern which was indicated by the plots in Figures 2.2 and 2.3.

Table 2.2: Locus-specific estimates of $\theta$ based on MLE, MoM, MM and posterior mean (PM). The confidence interval for the MLE is based on the $\chi^2_1$-approximation of the profile log-likelihood.
Locus   MoM      SE(MoM)   MLE      SE(MLE)   95%-CI for MLE      MM       PM
D3      0.0108   0.0085    0.0056   0.0020    (0.0028; 0.0110)    0.0057   0.0061
vWA     0.0107   0.0085    0.0053   0.0017    (0.0027; 0.0098)    0.0053   0.0056
FGA     0.0050   0.0051    0.0037   0.0010    (0.0021; 0.0061)    0.0037   0.0038
D8      0.0140   0.0106    0.0084   0.0024    (0.0049; 0.0145)    0.0085   0.0089
D21     0.0126   0.0097    0.0053   0.0013    (0.0031; 0.0086)    0.0053   0.0055
D18     0.0142   0.0107    0.0086   0.0019    (0.0056; 0.0133)    0.0087   0.0089
D5      0.0226   0.0157    0.0161   0.0042    (0.0097; 0.0276)    0.0163   0.0170
D13     0.0264   0.0180    0.0147   0.0040    (0.0088; 0.0254)    0.0149   0.0156
D7      0.0061   0.0056    0.0035   0.0013    (0.0015; 0.0072)    0.0036   0.0038
CSF     0.0050   0.0049    0.0091   0.0026    (0.0049; 0.0167)    0.0092   0.0097
TPOX    0.0306   0.0205    0.0248   0.0066    (0.0147; 0.0433)    0.0254   0.0263
TH01    0.0328   0.0217    0.0189   0.0054    (0.0110; 0.0340)    0.0193   0.0202
D16     0.0117   0.0091    0.0069   0.0023    (0.0036; 0.0131)    0.0070   0.0074

For most loci, the MoM estimate lies within the 95%-confidence interval. The MM estimates coincide with the MLE for all loci. The posterior means are for most loci close to the MLE, which is due to the rather symmetric shape of the profile log-likelihoods plotted in Figure 2.4, where the profile log-likelihoods for the FBI data are plotted together with the MLE, MoM, MM and posterior mean (+), each marked by its own plotting symbol.
We tested the hypothesis of equality of $\theta$ for all loci in the FBI data. From Table 2.2, it is clear that there are differences among loci, but also some clustering of the estimates. In Table 2.3, we have listed the results from testing different hypotheses.
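Table 2.3: Results from testing hypothesis of equality of $\theta$ for multiple loci. Here $\bar\theta$ is the mean of the locus-specific estimates and $\tilde\theta$ the estimated common value.
Loci                        $\bar\theta$   $\tilde\theta$   -2 log Q   DoF   p-value
All                         0.0101         0.0090           62.8011    12    <0.0001
D5, D13, TPOX and TH01      0.0186         0.0183            2.2630     3    0.5196
Remaining loci              0.0063         0.0061           12.7175     8    0.1219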
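The p-values in Table 2.3 follow from comparing $-2 \log Q$ with a $\chi^2$-distribution with the stated degrees of freedom. The sketch below uses only the Python standard library and the closed-form tail probabilities available for integer degrees of freedom; the function name `chi2_sf` is hypothetical.

```python
import math


def chi2_sf(x, dof):
    """Upper tail P(X > x) of a chi-square distribution with integer dof."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if dof % 2 == 0:
        # Even dof: exp(-y) * sum_{i=0}^{dof/2 - 1} y^i / i!
        term, total = 1.0, 1.0
        for i in range(1, dof // 2):
            term *= y / i
            total += term
        return math.exp(-y) * total
    # Odd dof: erfc(sqrt(y)) + exp(-y) * sum_{i=1}^{(dof-1)/2} y^(i-1/2) / Gamma(i+1/2)
    total = 0.0
    term = math.sqrt(y) / math.gamma(1.5)
    for i in range(1, (dof - 1) // 2 + 1):
        total += term
        term *= y / (i + 0.5)
    return math.erfc(math.sqrt(y)) + math.exp(-y) * total


# Test statistics and degrees of freedom taken from Table 2.3.
for stat, dof in [(62.8011, 12), (2.2630, 3), (12.7175, 8)]:
    print(dof, round(chi2_sf(stat, dof), 4))
```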
Figure 2.4: Profile log-likelihoods, $2[\hat\ell(\theta) - \hat\ell(\hat\theta)]$ plotted against $\theta$, for the 13 CODIS loci (D3, vWA, FGA, D8, D21, D18, D5, D13, D7, CSF, TPOX, TH01 and D16) from the FBI data of Budowle and Moretti (1999). The MLE, MoM and MM estimates are each marked by a plotting symbol. For all loci the MLE and MM estimate coincide. For most loci, the MoM estimate lies within the MLE confidence interval. The + indicates the posterior mean.
The tests indicate that there are groups of loci with similar $\theta$-values. The mean, $\bar\theta$, of the four loci (D5, D13, TPOX and TH01) with the largest $\theta$-estimates in Table 2.2 is $\bar\theta = 0.0186$, and the mean of the remaining loci is $\bar\theta = 0.0063$. In both groups, the estimated common value, $\tilde\theta = 0.0183$ (95%-CI: [0.0126; 0.0269]) and $\tilde\theta = 0.0061$ (95%-CI: [0.0043; 0.0088]), is almost equal to $\bar\theta$ (Table 2.3).
Furthermore, using the methodology described in Section 2.3.1 for testing if $\theta = 0$, the test yielded that for no loci was $\theta$ equal to zero. This was true for both the MLE and MoM estimates. However, the test based on the MLE is more powerful than using the MoM estimate. The estimated $\theta$-values for the Caribbean subsample (subset of FBI data: Bahamian, Jamaican and Trinidad subpopulations) are given in Table 2.4 together with empirical p-values and 95%-confidence intervals under the null hypothesis, $H_0: \theta = 0$.
In locus D8, the test based on the MLE rejects the null hypothesis, whereas the MoM test accepts that $\theta = 0$. For this locus (and D3, D21, D7, CSF, D16) the MoM estimate is negative, and so is the lower bound of the confidence intervals for all loci.
Conceptually, we could imagine that we only had observed a common Caribbean database without information on the specific island of origin. Thus for loci with $\theta = 0$ (see Table 2.4) this collapse of the observed databases would in principle not be a problem. However, for the other loci the present substructure would potentially cause the LR to be anti-conservative depending on a particular suspect's DNA profile and origin.

Table 2.4: $\theta$-estimates for Caribbean sample (three databases from the FBI data) together with empirical p-values and 95%-confidence intervals when $\theta = 0$. For each locus, the first row is MLE and the second row MoM estimates. Note that for locus D8 the test based on MLE rejects the hypothesis while the MoM-based test does not.
Locus   Estimator   theta-estimate    p-value   95%-Confidence interval            H0-decision
D3      MLE          2.582×10^-11     0.099     ( 3.975×10^-12; 7.243×10^-11)      Accept
        MoM         -1.440×10^-3      0.745     (-2.755×10^-3;  5.715×10^-3)       Accept
vWA     MLE          3.904×10^-4      0.000     ( 3.966×10^-12; 9.639×10^-11)      Reject
        MoM          3.488×10^-3      0.030     (-2.226×10^-3;  3.656×10^-3)       Reject
FGA     MLE          3.944×10^-12     0.507     ( 1.772×10^-12; 1.470×10^-11)      Accept
        MoM          6.496×10^-4      0.284     (-2.021×10^-3;  3.221×10^-3)       Accept
D8      MLE          4.351×10^-9      0.010     ( 3.973×10^-12; 8.016×10^-11)      Reject
        MoM          1.286×10^-3      0.214     (-2.581×10^-3;  4.404×10^-3)       Accept
D21     MLE          3.510×10^-12     0.680     ( 2.654×10^-12; 1.711×10^-11)      Accept
        MoM         -4.964×10^-4      0.567     (-2.196×10^-3;  3.710×10^-3)       Accept
D18     MLE          6.262×10^-4      0.000     ( 2.655×10^-12; 2.230×10^-11)      Reject
        MoM          6.657×10^-3      0.001     (-2.066×10^-3;  3.058×10^-3)       Reject
D5      MLE          3.367×10^-3      0.000     ( 3.964×10^-12; 5.651×10^-11)      Reject
        MoM          8.452×10^-3      0.000     (-2.314×10^-3;  4.449×10^-3)       Reject
D13     MLE          1.405×10^-11     0.197     ( 3.962×10^-12; 9.433×10^-11)      Accept
        MoM          3.693×10^-3      0.060     (-2.449×10^-3;  5.637×10^-3)       Accept
D7      MLE          4.776×10^-12     0.712     ( 3.964×10^-12; 5.566×10^-11)      Accept
        MoM         -1.062×10^-3      0.727     (-2.337×10^-3;  4.600×10^-3)       Accept
CSF     MLE          4.399×10^-12     0.163     ( 3.971×10^-12; 8.619×10^-11)      Accept
        MoM         -2.049×10^-3      0.924     (-2.494×10^-3;  4.102×10^-3)       Accept
TPOX    MLE          1.478×10^-3      0.000     ( 5.950×10^-12; 2.149×10^-10)      Reject
        MoM          7.890×10^-3      0.002     (-2.533×10^-3;  4.135×10^-3)       Reject
TH01    MLE          8.026×10^-12     0.735     ( 5.949×10^-12; 1.429×10^-10)      Accept
        MoM          6.922×10^-4      0.295     (-2.700×10^-3;  4.298×10^-3)       Accept
D16     MLE          7.002×10^-12     0.704     ( 3.976×10^-12; 1.371×10^-10)      Accept
        MoM         -1.925×10^-3      0.903     (-2.403×10^-3;  4.100×10^-3)       Accept
In Table 2.5, the estimated allele probabilities ($\pi_j$, for appropriate subscript $j$) are presented for each locus. Note that the estimated allele probabilities are estimates of the allele probabilities in the reference population from which each of the six subpopulations is assumed to have descended. Owing to lack of space, only alleles with integer values are presented, i.e. common alleles such as 9.3 in TH01 are not reported in Table 2.5.
2.5 Discussion
The model based on the Dirichlet-multinomial distribution has previously been discussed, for example, by Lange (1995b). However, the estimation methods suggested there relied on approximations of the trigamma function, which were avoided here due to similar results as those of Paul et al. (2005).
The maximum likelihood estimation of parameters discussed in this paper is much more involved than that of the method of moments (MoM). However, the properties of the MLE ensure reduced variance of the estimates. In general, the $\pi$-estimates based on MoM and MLE did coincide, indicating that the usual relative frequency estimate is adequate in order to obtain point estimates for the allele probabilities. However, as pointed out by Curran et al. (2002), the uncertainty of these point estimates needs to be carefully considered when assessing the weight of the evidence. If allele probabilities are estimated from limited databases the estimates of the rare alleles are subject to large standard errors. This may lead to overestimates of the (point estimates of) LR or PI.
Having a joint model for the allele probabilities and $\theta$-parameter increases the belief in the estimates of the latter. However, since $\pi$ may be estimated by the empirical probability $\bar p$, the simpler one-dimensional maximisation problem $\max_{\theta} \ell(\theta, \bar p; x)$ may be adequate for estimating $\theta$ and assessing its variance. Simulations have shown that this method underestimates $\theta$ even for large numbers of databases; hence this estimator is inefficient as opposed to the joint likelihood approach, which therefore is recommended for estimation.
Balding (2003, pp. 229) argues that one should expect variability of $\theta$ across the STR loci used in forensic genetics. This may be due to different mutation rates in the various loci and selection or indirect selection from linkage between the STR loci and genes/genetic regions subject to selection.
It is possible to test the hypothesis of equal $\theta$ across loci using our model. For the FBI data there were two groups of loci with common $\theta$-estimates. Figure 2.1 showed that increased $\theta$-values weaken the evidence in most cases. Hence, for a conservative evaluation of the evidence, it may be reasonable to use the largest $\theta$-value. This supports the use of the upper 95%-confidence limit (see Table 2.2) of the $\theta$-estimate, which in most cases does not disagree with the commonly used value 0.03 for $\theta$ (Phillips et al., 2010). Furthermore, Balding (2005, pp. 97) argues that plug-in values (of $\theta$) should tend to be towards the higher end of the range of plausible values in order to incorporate uncertainty from higher-order terms of $\theta$.
However, in paternity disputes it is not common practice to evaluate the evidence conservatively since in most circumstances these are civil lawsuits. Hence, in paternity cases it may be more appropriate to use the locus-specific MLEs (or the common $\theta$-values for groups of loci in Table 2.3) when computing PI($\theta$).

Table 2.5: Estimates of $\pi$ for each locus. The first line of each cell entry gives the MLE and the second line is the MoM estimate. In small font the associated standard errors (×10). Only integer-valued alleles are reported, for compactness. FGA: add 10 units to each allele designation. D21: add 20 units to each allele designation.
(Table body omitted: rows are alleles 5 to 24; columns are the loci D3, vWA, FGA, D8, D21, D18, D5, D13, D7, CSF, TPOX, TH01 and D16.)
2.6 Conclusion
We have demonstrated how the genetic dependence caused by the identical-by-descent assumption can be modelled as overdispersion from a statistical point of view. This allowed for maximum likelihood estimation of allele probabilities in the reference population, $\pi$, and the identical-by-descent measure, $\theta$. By using recent results from the statistical literature the FIM was computed analytically and confidence intervals based on profile log-likelihoods were provided.
Acknowledgements
I would like to thank my PhD supervisor Associate Professor Poul Svante Eriksen (Aalborg University, Denmark), Professor Niels Morling (University of Copenhagen, Denmark) and Professor Bruce S. Weir (University of Washington, USA) for comments and valuable discussions. I am thankful to Professor Weir for inviting me as visiting scientist to The Department of Biostatistics, University of Washington, which I was visiting while working on this paper. Furthermore, I would like to thank Associate Professor Esben Høg (Aalborg University, Denmark) and three anonymous reviewers for their comments, which have significantly improved the final version of this paper.
Appendix
2.A Mathematical details
In Appendix 2.A.1, we give some mathematical details on how to derive the paternity index, PI($\theta$), of (2.5), and Appendix 2.A.2 is about testing for equality of $\theta$ across loci.
2.A.1 Deriving paternity index (PI)
We demonstrate how to derive the paternity index, PI, in (2.5) using $P(Y_{n+1} = j \mid Y_n = y_n) = P(j \mid x^n_j)$
in (2.4). In a given locus, the child's profile is $(ac)$ and the mother is heterozygous $(ab)$, where
$c$ is different from $a$ and $b$. Discarding the possibility of mutations, the true father needs to pass
on a $c$ allele to the child. Assume that the alleged father is heterozygous $(cd)$, which implies
$P(ac \mid ab, cd) = \tfrac{1}{4}$, i.e. under hypothesis $H_1$ the probability of the child's profile given its parents'
profiles is $\tfrac{1}{4}$. The PI is determined by:

\[
\mathrm{PI} = \frac{P(ac, ab, cd \mid H_1)}{P(ac, ab, cd \mid H_2)}
= \frac{P(ac \mid ab, cd)\,P(ab, cd)}{\sum_{i,j}^{k} P(ac \mid ab, cd, ij)\,P(ab, cd, ij)},
\]

where $(ij)$ denotes the profile of the true father under $H_2$ and summation is over all $k$ alleles
in the given locus. However, when omitting the possibility of mutations, unless $i$ or $j$ equals $c$
the child cannot be the true father's offspring, i.e. $P(ac \mid ab, cd, ij) = 0$ for $(i, j)$ where $c \neq i$
and $c \neq j$. Hence, we fix $j = c$ and sum over all $i = 1, \ldots, k$, where $P(ac \mid ab, cd, cc) = \tfrac{1}{2}$ and
$P(ac \mid ab, cd, ic) = \tfrac{1}{4}$ for all $i \neq c$ under $H_2$. This implies that the expression for the PI is given by

\[
\mathrm{PI} = \frac{\tfrac{1}{4} P(ab, cd)}{\tfrac{1}{2} P(ab, cd, cc) + 2 \sum_{i \neq c} \tfrac{1}{4} P(ab, cd, ic)}
= \frac{P(ab, cd)}{2 P(ab, cd, c) \left[ P(c \mid ab, cd, c) + \sum_{i \neq c} P(i \mid ab, cd, c) \right]}
= \frac{1}{2 P(c \mid ab, cd)},
\]

where the sum in square brackets by definition is one. Using the expression $P(j \mid x^n_j)$ in (2.4) with
$x^4 = (x^4_a, x^4_b, x^4_c, x^4_d) = (1, 1, 1, 1)$, we have

\[
\mathrm{PI} = \frac{1}{2 P(c \mid x^4_c)}
= \frac{1 + (n - 1)\theta}{2 [x_c \theta + (1 - \theta)\pi_c]}
= \frac{1 + 3\theta}{2 [\theta + (1 - \theta)\pi_c]}.
\]
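The closed form above is easy to evaluate numerically. The following is a minimal sketch, assuming the recursion in (2.4) has the form $[x_j\theta + (1-\theta)\pi_j]/[1 + (n-1)\theta]$ implied by the last display; the function names `allele_probability` and `paternity_index` are hypothetical and chosen for illustration only.

```python
def allele_probability(x_j: int, n: int, theta: float, pi_j: float) -> float:
    """Probability of drawing allele j after seeing x_j copies among n alleles.

    Assumed form of (2.4): [x_j*theta + (1 - theta)*pi_j] / [1 + (n - 1)*theta].
    """
    return (x_j * theta + (1.0 - theta) * pi_j) / (1.0 + (n - 1) * theta)


def paternity_index(theta: float, pi_c: float) -> float:
    """PI for child (ac), mother (ab), alleged father (cd), no mutations.

    Four alleles (a, b, c, d) have been observed once each, so x_c = 1, n = 4.
    """
    if not 0.0 <= theta < 1.0:
        raise ValueError("theta must lie in [0, 1)")
    if not 0.0 < pi_c <= 1.0:
        raise ValueError("pi_c must lie in (0, 1]")
    return 1.0 / (2.0 * allele_probability(1, 4, theta, pi_c))


if __name__ == "__main__":
    # Compare the uncorrected PI (theta = 0) with a theta-corrected PI.
    for theta in (0.0, 0.01, 0.03):
        print(theta, paternity_index(theta, pi_c=0.1))
```

For $\theta = 0$ the expression reduces to the familiar $1/(2\pi_c)$, so the sketch also shows how the $\theta$-correction lowers the PI for a rare allele.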
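2.A.2 Testing equality of $\theta$ for multiple loci
In order to find stationary points for the log-likelihood of (2.14), we use Fisher-scoring with
Lagrange multipliers, $\lambda = \{\lambda_s\}_{s=1}^{S}$, ensuring equal $\theta$ for all loci. Translating the common parameter
$\theta$ to $\tau$ ensures computational simplicity. The observed FIM, $J(\cdot)$, associated with (2.14) is

\[
J(\cdot) =
\begin{bmatrix}
[J(\gamma_1)] & O_{1,2} & \cdots & O_{1,S} & g_{k_1} \\
O_{2,1} & [J(\gamma_2)] & \cdots & O_{2,S} & g_{k_2} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
O_{S,1} & \cdots & O_{S,S-1} & [J(\gamma_S)] & g_{k_S} \\
g_{k_1}^{\top} & g_{k_2}^{\top} & \cdots & g_{k_S}^{\top} & 0
\end{bmatrix},
\]

where $O_{s,t}$ is a $(k_s + 1) \times (k_t + 1)$-matrix of zeros,

\[
[J(\gamma_s)] =
\begin{bmatrix}
J(\gamma_s) & 1_{k_s} \\
1_{k_s}^{\top} & 0
\end{bmatrix}
\quad\text{and}\quad
g_{k_s} =
\begin{bmatrix}
0_{k_s} \\
1
\end{bmatrix}.
\]

Furthermore, the score function is

\[
u\left(\{\gamma_s, \lambda_s\}_{s=1}^{S}, \tau\right)
= \left( \left\{ u(\gamma_s) - \lambda_s 1_{k_s},\; (\tau - \gamma_{s+}) \right\}_{s=1}^{S},\; \lambda_{+} \right),
\]

where $u(\gamma_s)$ is the score function of (2.10).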
Bibliography
Balding, D. J. (2003). Likelihood-based inference for genetic correlation coefficients. Theoretical Population Biology 63, 221-230.
Balding, D. J. (2005). Weight-of-evidence for Forensic DNA Profiles. Chichester, West Sussex: John Wiley & Sons, Ltd.
Balding, D. J. and R. A. Nichols (1995). A method for quantifying differentiation between populations at multi-allelic loci and its implications for investigating identity and paternity. Genetica 96, 3-12.
Balding, D. J. and R. A. Nichols (1997). Significant genetic correlations among Caucasians at forensic DNA loci. Heredity 78(6), 583-589.
Barndorff-Nielsen, O. E. and D. R. Cox (1994). Inference and Asymptotics. Number 52 in Monographs on Statistics and Applied Probability. London: Chapman & Hall.
Box, G. E. P. and N. R. Draper (1987). Empirical model-building and response surfaces. Wiley.
Budowle, B. and T. R. Moretti (1999). Genotype profiles for six population groups at the 13 CODIS short tandem repeat core loci and other PCR-based loci. Forensic Science Communications.
Cockerham, C. C. (1969). Variance of gene frequencies. Evolution 23(1), 72-84.
Cockerham, C. C. (1973). Analysis of gene frequencies. Genetics 74(4), 679-700.
Curran, J. M., J. S. Buckleton, C. M. Triggs, and B. S. Weir (2002). Assessing uncertainty in DNA evidence caused by sampling effects. Science and Justice 42(1), 29-37.
Curran, J. M., C. M. Triggs, J. S. Buckleton, and B. S. Weir (1999). Interpreting DNA mixtures in structured populations. Journal of Forensic Science 44(5), 987-995.
Davison, A. C. and D. V. Hinkley (1997). Bootstrap Methods and their Application. Cambridge University Press.
Evett, I. W. and B. S. Weir (1998). Interpreting DNA Evidence: Statistical Genetics for Forensic Scientists. Sunderland, MA: Sinauer Associates.
Field, C. A. and A. H. Welsh (2007). Bootstrapping clustered data. Journal of the Royal Statistical Society. Series B, Statistical methodology 69(3), 369-390.
Green, P. J. and J. Mortera (2009). Sensitivity of inferences in forensic genetics to assumptions about founding genes. Annals of Applied Statistics 3(2), 731-763.
Hardy, G. H. (1908). Mendelian proportions in a mixed population. Science 28(706), 49-50.
Holsinger, K. E. (1999). Analysis of genetic diversity in geographically structured populations: A Bayesian perspective. Hereditas 130, 245-255.
Holsinger, K. E. and B. S. Weir (2009). Genetics in geographically structured populations: defining, estimating and interpreting $F_{ST}$. Nature Reviews. Genetics 10(9), 639-650.
Johnson, N. L., S. Kotz, and N. Balakrishnan (1997). Discrete Multivariate Distributions. Wiley.
Lange, K. (1995a). Applications of the Dirichlet distribution to forensic match probabilities. Genetica 96, 107-117.
Lange, K. (1995b). Mathematical and Statistical Methods for Genetic Analysis (2 ed.). Springer.
Mosimann, J. E. (1962). On the compound multinomial distribution, the multivariate $\beta$-distribution, and correlations among proportions. Biometrika 49(1-2), 65-82.
Neerchal, N. K. and J. G. Morel (2005). An improved method for the computation of maximum likelihood estimates for multinomial overdispersion models. Computational Statistics & Data Analysis 49, 33-43.
Nichols, R. A. and D. J. Balding (1991). Effects of population structure on DNA fingerprint analysis in forensic science. Heredity 66, 297-302.
Paul, S. R., U. Balasooriya, and T. Banerjee (2005). Fisher information matrix for the Dirichlet-multinomial distribution. Biometrical Journal 47(2), 230-236.
Phillips, C., T. Tvedebrink, et al. (2010). Analysis of global variability in 15 established and 5 new European Standard Set (ESS) STRs using the CEPH human genome diversity panel. Forensic Science International: Genetics. In Press.
Rannala, B. and J. A. Hartigan (1996). Estimating gene flow in island populations. Genetical Research 67, 147-158.
Samanta, S., Y.-J. Li, and B. S. Weir (2009). Drawing inferences about the coancestry coefficient. Theoretical Population Biology 75, 312-319.
Tvedebrink, T. (2009). dirmult: Estimation in Dirichlet-Multinomial distribution. R package version 0.1.3.
Tvedebrink, T., P. S. Eriksen, H. S. Mogensen, and N. Morling (2010). Evaluating the weight of evidence using quantitative STR data in DNA mixtures. Journal of the Royal Statistical Society. Series C, Applied statistics. In Press.
Ukoumunne, O. C., A. C. Davison, M. C. Gulliford, and S. Chinn (2003). Non-parametric bootstrap confidence intervals for the intraclass correlation coefficient. Statistics in Medicine 22, 3805-3821.
Weinberg, W. (1908). Über den Nachweis der Vererbung beim Menschen. Jahreshefte des Vereins für vaterländische Naturkunde in Württemberg 64, 368-382.
Weir, B. S. (1996). Genetic Data Analysis II. Sinauer Associates, Inc.
Weir, B. S. (2007). The rarity of DNA profiles. The Annals of Applied Statistics 1(2), 358-370.
Weir, B. S. and C. C. Cockerham (1984). Estimating F-statistics for the Analysis of Population Structure. Evolution 38(6), 1358-1370.
Weir, B. S. and W. G. Hill (2002). Estimating F-statistics. Annual Review of Genetics 36, 721-750.
Wright, S. (1951). The genetical structure of populations. Annals of Eugenics 15, 323-354.
Zhou, H. and K. Lange (2010). MM algorithms for some discrete multivariate distributions. Journal of Computational and Graphical Statistics. In Press.
2.7 Supplementary remarks
The methodology presented above for estimating $\theta$ and computing the profile log-likelihood has
been applied in the publication by Phillips et al. (2010). I performed some of the computations
of that paper using the dirmult package and made plots similar to those of Figures 2.3 and
2.4 (Fig. 4 in Phillips et al., 2010). Plots in higher resolution are available from my web page
(http://people.math.aau.dk/tvede under List of publication).
In population genetics the Hardy-Weinberg equilibrium (HWE) constitutes a fundamental point of
reference. Proposed independently by Hardy (1908) and Weinberg (1908), the HWE states that
assuming random mating, no selection, no mutations and infinite population size the probability
of a diploid genotype is the product of allele probabilities, $P(A_iA_j) = 2p_ip_j$ and $P(A_iA_i) = p_i^2$.
We know immediately from these assumptions that HWE fails to hold, since no real-world
population satisfies these restrictions. However, the remark of Box and Draper (1987, p. 74), "Remember
that all models are wrong; the practical question is how wrong do they have to be to not be
useful", applies also to HWE. In fact, testing for HWE is often done to test for data quality, where
the test is performed on genetic data to detect possible over-representation of homozygotes due
to typing errors.
Over the last 100 years since the publication of the Hardy-Weinberg principle several genetic
models have been proposed to relax the assumptions mentioned above. One such attempt was made by
Wright (1951), who defined the F-statistics ($F_{ST}$, $F_{IT}$ and $F_{IS}$), which are measures of
population differentiation (Holsinger and Weir, 2009). In forensic genetics the most interesting of the
parameters is $F_{ST}$, which measures the divergence between a subpopulation, $S$, and the total
population, $T$. Cockerham (1969, 1973) showed that for most interesting assumptions made about
the population structure and breeding patterns $\theta$ is identical to $F_{ST}$ (Weir and Cockerham, 1984,
p. 1358). The use of the $\theta$-correction alters the genotype probabilities to $P(A_iA_j) = 2p_ip_j(1 - \theta)$
and $P(A_iA_i) = p_i\theta + p_i^2(1 - \theta)$, where the magnitude of $\theta$ controls the deviation from HWE.
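As a small illustration of the $\theta$-corrected genotype probabilities stated above, the following sketch tabulates them for a locus and checks that they reduce to HWE for $\theta = 0$. The function name `genotype_probabilities` and the example allele frequencies are hypothetical and not taken from the thesis.

```python
from itertools import combinations_with_replacement


def genotype_probabilities(p, theta):
    """Theta-corrected genotype probabilities for allele probabilities p.

    Heterozygote A_i A_j: 2 * p_i * p_j * (1 - theta)
    Homozygote   A_i A_i: p_i * theta + p_i**2 * (1 - theta)
    Returns a dict keyed by the unordered allele index pair (i, j), i <= j.
    """
    probs = {}
    for i, j in combinations_with_replacement(range(len(p)), 2):
        if i == j:
            probs[(i, j)] = p[i] * theta + p[i] ** 2 * (1.0 - theta)
        else:
            probs[(i, j)] = 2.0 * p[i] * p[j] * (1.0 - theta)
    return probs


if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]  # illustrative allele probabilities
    for theta in (0.0, 0.03):
        g = genotype_probabilities(p, theta)
        homozygous = sum(v for (i, j), v in g.items() if i == j)
        print(theta, round(sum(g.values()), 12), round(homozygous, 6))
```

The probabilities sum to one for any $\theta$, and the total homozygote probability increases with $\theta$, which is the deviation from HWE referred to in the text.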
In the following chapter, a paper discussing the $\theta$-correction in relation to a DNA reference
profile database is presented. In that setting only one database is available, hence there is no
point of reference for the extent to which a particular subsampled database differs in allelic constitution
from another. Therefore different means of estimating $\theta$ need to be considered. In the setting
above $\theta$ was a measure of subpopulation structure in a larger database, whereas in the subsequent
setting $\theta$ is a measure of correlation between gametes (within and between individuals). Hence,
by making pairwise comparisons of all individuals in the database we may be able to quantify
$\theta$ by analysing the difference between expected and observed counts of matching loci.
CHAPTER 3
Analysis of matches and partial-matches
in Danish DNA reference profile database
Publication details
Co-authors: Poul Svante Eriksen*, James Curran†, Helle Smidt Mogensen‡ and Niels Morling‡
* Department of Mathematical Sciences, Aalborg University
† Department of Statistics, University of Auckland
‡ Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health Science, University of Copenhagen
Journal: Forensic Science International: Genetics (Under preparation)
Abstract:
In this paper we analyse the Danish reference database accumulated over approximately 40 years
with 51,517 DNA profiles, which is close to 1% of the Danish adult population size. Each entry
in the database is associated with a civil registration number such that twins are identified and
potential near matches due to typing errors are removed.
We investigated the methodology of Weir (2004, 2007), who derived expressions for the expected
number of matches and near matches in a database when every DNA profile is compared to all other
profiles in the database, and the extensions by Curran et al. (2007) to allow for close relatives.
We extended the methodology by computing the covariance matrix of the summary statistic and
used it to estimate the identical-by-descent parameter $\theta$ for the Danish database.
Keywords:
DNA database; $\theta$-correction; Subpopulation; Close relatives; Covariance matrix.
3.1 Introduction
In order to accommodate the pressure from the legal community, Weir (2007) commented on
the rarity of DNA profiles and in particular on the number of profile matches and near
profile matches one should expect as the DNA databases increase in size. The fact that a pair
of profiles matches at 9 out of 13 loci in an Arizonian database of 65,493 profiles (Troyer et al.,
2001) is not unexpected. In fact Weir (2007) suggests that 163 such pairs would be expected
under his population genetic model with the coancestry parameter $\theta = 0.03$. However, if one
compares the expected counts and observed counts in Weir (2004), it is evident that the expected
number of partially-matching loci is much larger than what is observed. A possible explanation
is that the population is subdivided, which increases the number of homozygous profiles. That is,
profiles that are homozygous are either similar or different, which is not captured in the model
discussed by Weir (2004, 2007).
Mueller (2008) investigated the performance of simple population genetic models further. He
also focused on the Arizona database and discussed how likely it was to observe the reported 122
pairs matching on 9 loci and 20 pairs matching on 10 loci out of 13 loci. By means of simulations
he increased the complexity of the model to include five ethnic groups, each with four possible
subpopulations, and a number of relatives. He concluded that in order to obtain sufficiently
high probabilities for the observed counts, there needed to be between 1,000 and 3,000 pairs
of full-siblings in a substructured population. Several other authors have discussed multi-locus
matching and the influence of population structure on match probabilities, e.g. Lange (1993, 1995);
Donnelly (1995b,a); Balding and Nichols (1995); Ayres (2000); Laurie and Weir (2003); Song
and Slatkin (2007).
The main focus of this paper is the examination and validation of the model proposed by Weir
(2004, 2007) and the modifications hereof by Curran et al. (2007) to allow for closely related
individuals in the database. For this purpose we model and analyse the distribution of matches and
partial-matches in the reference DNA database at the Section of Forensic Genetics, Department
of Forensic Medicine, Faculty of Health Sciences, University of Copenhagen.
3.2 Materials and methods
3.2.1 Data
The Danish reference DNA profile database contains 51,517 STR DNA profiles accumulated
from 1971 to the beginning of 2009, typed at the 10 autosomal loci included in the SGM Plus
kit (Applied Biosystems, CA, USA). The database constitutes a little more than 1% of the Danish
adult population (approx. 4 million people). Each entry in the database is associated with a civil
registration number such that twins are identified and potential near matches due to typing errors
are removed.
The database was analysed such that every profile was compared to every other profile in the
database. For each pairwise comparison the number of matching loci (agreement on both alleles), $m$,
and partially-matching loci (sharing exactly one allele), $p$, were registered. Let $G_i$ and $G_j$ be two
DNA profiles in the database. Then $M(G_i, G_j)$ is an $11 \times 11$ indicator matrix with zeros except for
the $(m, p)$-entry corresponding to $m$ matching and $p$ partially-matching loci between profiles
$G_i$ and $G_j$.
Hence, the summary statistic $M = \{M_{m/p}\}_{m,p}$ is formed by

\[
M = \sum_{i=1}^{n-1} \sum_{j>i}^{n} M(G_i, G_j), \tag{3.1}
\]

which corresponds to $N = \binom{n}{2} = n(n-1)/2$ pairwise comparisons of $n$ DNA profiles. With the
database size of $n = 51{,}517$ this results in $N = 1{,}326{,}974{,}886$ comparisons.
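The construction of the summary statistic in (3.1) can be sketched in a few lines. The code below is an illustrative pure-Python version and not the `compare`-function of the DNAtools-package; the profile representation (one pair of allele labels per locus) and the function names are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def shared_alleles(g1, g2):
    """Number of alleles two single-locus genotypes share (0, 1 or 2).

    Uses multiset intersection, so (a, a) versus (a, b) shares one allele.
    """
    return sum((Counter(g1) & Counter(g2)).values())


def match_table(profiles):
    """Summary matrix M with M[m][p] = number of profile pairs having
    m matching and p partially-matching loci."""
    n_loci = len(profiles[0])
    table = [[0] * (n_loci + 1) for _ in range(n_loci + 1)]
    for prof_a, prof_b in combinations(profiles, 2):
        m = p = 0
        for g1, g2 in zip(prof_a, prof_b):
            shared = shared_alleles(g1, g2)
            if shared == 2:
                m += 1
            elif shared == 1:
                p += 1
        table[m][p] += 1
    return table


if __name__ == "__main__":
    # Three toy profiles typed at two loci (allele labels are arbitrary).
    profiles = [
        ((12, 14), (9, 9)),
        ((12, 14), (9, 10)),
        ((15, 16), (7, 8)),
    ]
    print(match_table(profiles))
    # Number of comparisons for the Danish database size quoted above.
    n = 51517
    print(n * (n - 1) // 2)
```

This quadratic loop is only meant to make the definition concrete; for a database of the size analysed here a compiled implementation is needed in practice.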
The result of analysing the Danish database with $n = 51{,}517$ DNA profiles is summarised in
Table 3.1, where $M_{m/p}$ corresponds to the number of pairs with $m$ matching loci and $p$
partially-matching loci. From Table 3.1 we find that e.g. the number of pairs of profiles with 5 matching
loci and 4 partially-matching loci out of ten autosomal loci is $M_{5/4} = 17{,}060$. Figure 3.1 shows
the summary statistic in an informative way, where we have plotted the observed counts on
$\log_{10}$-scale.
Two of the authors (T. Tvedebrink and J. Curran) implemented computationally efficient
functions for constructing the M-table in the statistical software R (R Development Core Team,
2009). The compare-function from the DNAtools-package (Curran and Tvedebrink, 2010b)
took less than 5 minutes to perform all 1,326,974,886 pairwise comparisons on a 2.50 GHz
laptop computer. Most of the methodology in this paper has been implemented in the
DNAtools-package together with specialised plotting functions. The package is described in more detail
elsewhere (Curran and Tvedebrink, 2010a).
Table 3.1: Summary matrix M for the Danish reference DNA profile database with 51,517 DNA
profiles. $M_{m/p}$ is the number of pairs of profiles with $m$ matching (where $m$ is the row number) and
$p$ partially-matching (where $p$ is the column number) loci.

m\p              0            1            2            3            4            5            6            7            8            9           10
0          906,881    8,707,969   37,632,872   96,157,037  160,570,778  182,820,115  143,627,613   76,852,119   26,786,782    5,486,572      501,671
1        1,100,493    9,484,061   36,229,766   80,292,877  113,733,413  106,635,954   66,164,365   26,183,818    5,992,415      604,900
2          595,135    4,531,792   14,996,133   28,165,271   32,810,688   24,271,278   11,132,519    2,887,555      325,493
3          188,146    1,237,733    3,467,281    5,353,738    4,913,791    2,683,854      805,798      103,305
4           38,094      212,192      487,484      592,929      401,832      143,202       21,490
5            5,114       23,490       42,459       37,933       17,060        3,100
6              470        1,685        2,272        1,414          378
7               26           96           91           64
8                3            6           21
9                0            0
10               0
[Figure 3.1 omitted. Horizontal axis: Match/Partial match categories 0/0 to 10/0; vertical axis: Counts on $\log_{10}$-scale.]
Figure 3.1: Plot of observed counts versus the number of matching and partially-matching
loci (counts on $\log_{10}$-scale) for the Danish database. The superimposed points
represent the expected counts (under the model described in Section 3.2.2) and the vertical
bars indicate an approximate 95%-confidence interval computed by $N\pi(\theta) \pm 2\sqrt{\operatorname{diag}\{\Sigma(\theta)\}}$ (see
Sections 3.3 and 3.3.2).
3.2.2 Population genetic model
The model proposed by Weir (2004, 2007) defines for each of the $L$ loci three probabilities
$(P_{0/0}, P_{0/1}, P_{1/0})$, which are the probabilities for two randomly selected profiles sharing none,
one or both alleles at a given locus (Weir denoted the probabilities $P_0, P_1, P_2$; the reason for the
change of subscript will become clear in the following). The probabilities $P_{m/p}$ depend on the
coancestry coefficient $\theta$ through the match probability equations (Nichols and Balding, 1991)
that are derived using the recursion formula $P(A_i \mid x^n) = [x^n_i\theta + (1 - \theta)p_i]/[1 + (n - 1)\theta]$, which
is the probability of observing an $A_i$ allele after having seen $x^n_i$ alleles of type $A_i$ among $n$ sampled
alleles.
The expected values associated with the observed counts in $M$ under this model are computed
as $N\pi$, where $\pi = \{\pi_{m/p}\}_{m,p}$ is the matrix of probabilities for the match/partial-match events
$(m, p)$. The elements of $\pi$, $\pi_{m/p}$, $m = 0, \ldots, L$; $p = 0, \ldots, L - m$, may be computed using
recursion over loci: Let $\pi^{\ell}_{m/p}$ denote the probability based on $\ell$ loci, i.e. using only a subset of
size $\ell$ of the $L$ loci. Then the following equation shows how to compute $\pi^{\ell+1}_{m/p}$ for $\ell = 1, \ldots, L - 1$:

\[
\pi^{\ell+1}_{m/p} = P^{\ell+1}_{0/0}\,\pi^{\ell}_{m/p} + P^{\ell+1}_{0/1}\,\pi^{\ell}_{m/p-1} + P^{\ell+1}_{1/0}\,\pi^{\ell}_{m-1/p}, \tag{3.2}
\]

where the sum of the subscripts for each term on the right hand side equals the subscript on
the left hand side, and $P^{\ell}_{m/p}$ refers to the $P_{m/p}$ probabilities for the $\ell$th added locus. When
$m = 0$ and/or $p = 0$ we have these boundary equations:

\[
\pi^{\ell+1}_{0/0} = P^{\ell+1}_{0/0}\,\pi^{\ell}_{0/0}, \qquad
\pi^{\ell+1}_{0/p} = P^{\ell+1}_{0/0}\,\pi^{\ell}_{0/p} + P^{\ell+1}_{0/1}\,\pi^{\ell}_{0/p-1}
\qquad\text{and}\qquad
\pi^{\ell+1}_{m/0} = P^{\ell+1}_{0/0}\,\pi^{\ell}_{m/0} + P^{\ell+1}_{1/0}\,\pi^{\ell}_{m-1/0},
\]

where $\pi^{1}_{1/0} = P^{1}_{1/0}$, $\pi^{1}_{0/1} = P^{1}_{0/1}$ and $\pi^{1}_{0/0} = P^{1}_{0/0}$. These equations are easily implemented in
computer software and efficiently compute the expected numbers for various $\theta$-values.
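A minimal sketch of the recursion in (3.2) is given below. It takes the per-locus probabilities $(P_{0/0}, P_{0/1}, P_{1/0})$ as input rather than computing them from $\theta$, and it starts from a hypothetical "zero loci" state with $\pi^{0}_{0/0} = 1$, which yields the same values as the initial conditions stated for $\ell = 1$. The function name and the example numbers are illustrative assumptions.

```python
def match_probability_table(locus_probs):
    """Matrix pi with pi[m][p] = probability of m matching and p
    partially-matching loci, built by the recursion over loci in (3.2).

    locus_probs: iterable of (P00, P01, P10) tuples, one per locus.
    """
    locus_probs = list(locus_probs)
    n_loci = len(locus_probs)
    size = n_loci + 1
    # Zero loci: the only possible event is 0 matches / 0 partial matches.
    pi = [[0.0] * size for _ in range(size)]
    pi[0][0] = 1.0
    for p00, p01, p10 in locus_probs:
        new = [[0.0] * size for _ in range(size)]
        for m in range(size):
            for p in range(size - m):
                value = p00 * pi[m][p]
                if p > 0:  # added locus is a partial match
                    value += p01 * pi[m][p - 1]
                if m > 0:  # added locus is a full match
                    value += p10 * pi[m - 1][p]
                new[m][p] = value
        pi = new
    return pi


if __name__ == "__main__":
    # Two loci with identical, purely illustrative probabilities.
    table = match_probability_table([(0.5, 0.3, 0.2), (0.5, 0.3, 0.2)])
    for row in table:
        print([round(v, 4) for v in row])
```

Multiplying the resulting matrix by $N$ gives the expected counts that are compared with the observed table; the boundary equations are handled by the two `if` guards.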
Weir (2007) focused in his survey paper primarily on comparison between the observed counts
and the expected number, $N\pi(\theta)$, for different values of $\theta$. However, as Curran et al. (2007)
discussed, one needs to consider normalisation of these differences for a proper comparison between
the observed and expected counts. In this paper we show how to compute the covariance matrix
of $M$ in order to make a more rigorous comparison taking the correlation between cell counts
into consideration.
Close relatedness
Weir (2007) showed that for a specified family relationship of a pair of profiles, $P_{m/p}$ is updated
using the probabilities, $k_I$, that the two individuals share $I$ alleles identical-by-descent (IBD):

\[
\begin{aligned}
\tilde{P}_{0/0} &= k_0 P_{0/0}, \\
\tilde{P}_{0/1} &= k_1 (1 - \theta)(1 - S_2) + k_0 P_{0/1} \quad\text{and} \\
\tilde{P}_{1/0} &= k_2 + k_1 [\theta + (1 - \theta) S_2] + k_0 P_{1/0},
\end{aligned}
\]

where $S_2 = \sum_{i=1}^{K} p_i^2$ is the sum of squared allele probabilities at a given locus with $K$ different
alleles, and $\tilde{P}_{m/p}$ denotes the probability that two individuals with the specified family relationship
will match as $m/p$ in a given locus. In order to compute $\tilde{\pi}$, $P_{m/p}$ is replaced by $\tilde{P}_{m/p}$ in (3.2).
In Table 3.2 we have listed the five types of relatedness considered in this paper. The avuncular
class covers half-siblings, grandparent-grandchild and uncle-nephew (independent of gender)
since these have identical $k$-vectors and are as such indistinguishable using only unlinked genetic
markers.
Table 3.2: Probability of sharing $I$ alleles IBD for the specified relationship (Weir, 2007, Table
4).

Relationship     $k = (k_2, k_1, k_0)$
Full-siblings    (0.25, 0.5, 0.25)
First-cousins    (0, 0.25, 0.75)
Parent-child     (0, 1, 0)
Avuncular        (0, 0.5, 0.5)
Unrelated        (0, 0, 1)
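The relatedness adjustment of the per-locus probabilities, combined with the $k$-vectors of Table 3.2, may be sketched as follows. The function name `related_locus_probs` and the numerical inputs are hypothetical; the unrelated probabilities $(P_{0/0}, P_{0/1}, P_{1/0})$ are taken as given rather than derived from $\theta$.

```python
# k-vectors (k2, k1, k0) as listed in Table 3.2.
K_VECTORS = {
    "Full-siblings": (0.25, 0.5, 0.25),
    "First-cousins": (0.0, 0.25, 0.75),
    "Parent-child": (0.0, 1.0, 0.0),
    "Avuncular": (0.0, 0.5, 0.5),
    "Unrelated": (0.0, 0.0, 1.0),
}


def related_locus_probs(p00, p01, p10, k, theta, s2):
    """Per-locus probabilities for a pair with IBD-sharing vector k.

    p00, p01, p10: probabilities for an unrelated pair at the locus.
    s2: sum of squared allele probabilities at the locus.
    Returns (P~00, P~01, P~10).
    """
    k2, k1, k0 = k
    t00 = k0 * p00
    t01 = k1 * (1.0 - theta) * (1.0 - s2) + k0 * p01
    t10 = k2 + k1 * (theta + (1.0 - theta) * s2) + k0 * p10
    return t00, t01, t10


if __name__ == "__main__":
    unrelated = (0.70, 0.20, 0.10)  # illustrative values only
    for name, k in K_VECTORS.items():
        adjusted = related_locus_probs(*unrelated, k, theta=0.03, s2=0.15)
        print(name, [round(v, 4) for v in adjusted])
```

Since $k_2 + k_1 + k_0 = 1$, the adjusted probabilities sum to one whenever the unrelated ones do, and the parent-child case gives $\tilde{P}_{0/0} = 0$ as expected.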
The effect of these types of relatedness is represented graphically in Figure 3.2, where $\tilde{\pi}_{m/p}$ is
plotted for the possible combinations of $m$ and $p$ for $\theta = 0.03$. Note that parent-child (marked by
+ in Figure 3.2) must share at least one allele per locus, implying that $\tilde{\pi}_{m/p} = 0$ when $m + p \neq L$.
[Figure 3.2 omitted. Horizontal axis: Match/Partial categories 0/0 to 10/0; vertical axis: $\tilde{\pi}_{m/p}$ on $\log_{10}$-scale from $10^{-12}$ to 1. Legend: Full-siblings, First-cousins, Parent-child, Avuncular, Unrelated.]
Figure 3.2: Effect on $\tilde{\pi}$ for the five types of relatedness with $\theta = 0.03$. The legend explains the
plot characters.
The inclusion of related pairs of profiles was investigated by Curran et al. (2007) using
Australian data of Caucasian and Aborigine origin. Using that
$E(\pi) = E(E[\pi \mid R]) = \sum_{r \in R} E(\pi \mid R = r) P(R = r)$,
they computed the expected number of matches by stratifying on close relationships, $R$.
They formulated the model with $R = \{\text{Full-siblings}, \text{First-cousins}, \text{Parent-child}, \text{Unrelated}\}$:

\[
\pi = \alpha\,\pi_{\text{Full-siblings}} + \beta\,\pi_{\text{First-cousins}} + \gamma\,\pi_{\text{Parent-child}} + \delta\,\pi_{\text{Unrelated}}, \tag{3.3}
\]

where $\alpha + \beta + \gamma + \delta = 1$ and the parameters refer to the fraction of the total comparisons that are
made between pairs of full-siblings, first-cousins, parent-child and unrelated, respectively.
After fitting the model to the data, we have estimates of the various parameters in the
(3.3)-model. Thus we have an overall estimate of the probability that a random pair of profiles
in the database has a certain familial relationship, e.g. the probability of a pair of profiles
originating from a pair of full-siblings in the Western Australia database is $6.91 \times 10^{-6}$ (Curran
et al., 2007, $\alpha$-estimate in caption of Fig. 1).
These probabilities might be used in relation to crime cases where a suspect, $S$, declares that a
close relative is the culprit, $C$. Let $G_S$ be the suspect's profile (known to the investigator) and
$G_C$ the profile of the culprit (unknown, but may be identical to $G_S$). For some crime cases the
defence may claim that the circumstances of the crime are such that the true offender is a close
relative of $S$. Given a specific familial relationship, $r$, it is possible to compute the probability
that $S$ and $C$ share the same DNA profile. We need to distinguish between the situations of
$G_S$ being heterozygous or homozygous, and let $P(G_C = A_iA_j \mid G_S = A_iA_j, R = r)$ and
$P(G_C = A_iA_i \mid G_S = A_iA_i, R = r)$ denote these probabilities, where $r$ is the specified familial relationship
of $C$ and $S$. Furthermore, the information about $r$ implies knowledge of $k$, which gives these
expressions for the two probabilities:

\[
\begin{aligned}
P(G_C = A_iA_j \mid G_S = A_iA_j, R = r)
&= k_2 + \frac{k_1}{2}\left[ P(A_i \mid A_j, A_iA_j) + P(A_j \mid A_i, A_iA_j) \right] + k_0 P(A_iA_j \mid A_iA_j) \\
&= k_2 + \frac{k_1}{2}\,\frac{2\theta + (1 - \theta)(p_i + p_j)}{1 + 2\theta}
 + 2k_0\,\frac{\theta^2 + \theta(1 - \theta)(p_i + p_j) + (1 - \theta)^2 p_i p_j}{(1 + 2\theta)(1 + \theta)}
\end{aligned}
\tag{3.4}
\]

\[
\begin{aligned}
P(G_C = A_iA_i \mid G_S = A_iA_i, R = r)
&= k_2 + k_1 P(A_i \mid A_i, A_iA_i) + k_0 P(A_iA_i \mid A_iA_i) \\
&= k_2 + k_1\,\frac{3\theta + (1 - \theta)p_i}{1 + 2\theta}
 + k_0\,\frac{6\theta^2 + 5\theta(1 - \theta)p_i + (1 - \theta)^2 p_i^2}{(1 + 2\theta)(1 + \theta)}
\end{aligned}
\tag{3.5}
\]

If the suspect is not the true culprit, then the probability that $G_S \equiv G_C$ (share the same DNA
profile) is given by $\tilde{\pi}_{10/0}$. For the five types of relatedness considered here, the probabilities are
plotted in the right-most category in Figure 3.2 for $\theta = 0.03$.
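The single-locus probabilities in (3.4) and (3.5) follow from repeated use of the recursion formula, which the sketch below makes explicit. It assumes the recursion $P(A_i \mid x^n) = [x^n_i\theta + (1-\theta)p_i]/[1 + (n-1)\theta]$ from Section 3.2.2; the function names are hypothetical and the numerical inputs are illustrative only.

```python
def allele_prob(x_i, n, theta, p_i):
    """Recursion formula: probability of allele A_i after having seen
    x_i copies of it among n sampled alleles."""
    return (x_i * theta + (1.0 - theta) * p_i) / (1.0 + (n - 1) * theta)


def het_match_prob(k, theta, p_i, p_j):
    """P(G_C = A_i A_j | G_S = A_i A_j, R = r), cf. (3.4); k = (k2, k1, k0)."""
    k2, k1, k0 = k
    # One allele IBD: the other allele is drawn after three observed alleles.
    one_ibd = 0.5 * (allele_prob(1, 3, theta, p_i) + allele_prob(1, 3, theta, p_j))
    # No alleles IBD: draw A_i after two alleles, then A_j after three.
    none_ibd = 2.0 * allele_prob(1, 2, theta, p_i) * allele_prob(1, 3, theta, p_j)
    return k2 + k1 * one_ibd + k0 * none_ibd


def hom_match_prob(k, theta, p_i):
    """P(G_C = A_i A_i | G_S = A_i A_i, R = r), cf. (3.5)."""
    k2, k1, k0 = k
    one_ibd = allele_prob(3, 3, theta, p_i)
    none_ibd = allele_prob(2, 2, theta, p_i) * allele_prob(3, 3, theta, p_i)
    return k2 + k1 * one_ibd + k0 * none_ibd


if __name__ == "__main__":
    full_siblings = (0.25, 0.5, 0.25)
    print(het_match_prob(full_siblings, 0.03, 0.1, 0.2))
    print(hom_match_prob(full_siblings, 0.03, 0.1))
```

Under the assumption of independent loci, a full-profile probability would be the product of such terms over loci; that step is not shown here.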
3.3 Results
3.3.1 Simulations
We used the model discussed above to simulate DNA profile databases with known allele
frequencies (the estimated allele frequencies from the Danish database) and various values for $\theta$.
For a specified number of DNA profiles, we used the recursive formula of Nichols and Balding
(1991) for individuals only remotely related, $P(A_i \mid x^n) = [x^n_i\theta + (1 - \theta)p_i]/[1 + (n - 1)\theta]$, to
simulate alleles with a correlation governed by $\theta$, where $p_i$ in the formula is the allele frequency
of allele $A_i$ and the vector $x^n = (x^n_1, \ldots, x^n_K)$ is the sufficient summary statistic (Tvedebrink,
2010). In order to take close relationships among the individuals into consideration, we
simulated the number of individuals with a specified relationship $n_R = (n_{FS}, n_{1C}, n_{PC}, n_{AV}, n - n_{+})$,
where all $n_r$ are even numbers. The subscripts relate to full-siblings (FS), first-cousins (1C),
parent-child (PC) and avuncular (AV). The last entry in $n_R$ refers to the remaining number of
unrelated DNA profiles (UN). Since the comparisons $M(G_i, G_j)$ only consider pairs of profiles,
the closely related DNA profiles are simulated in pairs such that:
1. Simulate the first relative $R_1$: $R_1 \sim P(A_iA_j \mid x^n) = P(A_i \mid x^{n+1}) P(A_j \mid x^n)$, where $x^{n+1} = x^n + e_j$
and $e_j$ is a vector of zeros except for a one in entry $j$.
2. Simulate the number of alleles the second relative $R_2$ shares IBD with $R_1$: $I \sim P(k)$.
3. Profile $R_2$ is simulated conditioned on the value of $I$:
$I = 0$: $R_2$ is generated unrelated to $R_1$: $R_2 \sim P(A_kA_l \mid A_iA_j, x^n)$, and may be identical (by
state) to $R_1$.
$I = 1$: The first allele of $R_2$ is drawn randomly from the alleles of $R_1$, e.g. $A_i$ is sampled. The
second allele is then sampled from $P(A_k \mid A_i, A_iA_j, x^n)$.
$I = 2$: $R_2$ is identical to $R_1$. Note that only full-siblings have this possibility in our simulations.
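The three steps above can be sketched for a single locus as follows. This is an illustrative reading of the scheme and not the code used for the reported simulations: the function names are hypothetical, and it is an assumption of this sketch that every allele of both profiles (including alleles copied IBD) is added to the running allele counts $x^n$.

```python
import random


def draw_allele(counts, theta, p, rng):
    """Draw one allele index using the recursion formula given counts x^n."""
    n = sum(counts)
    weights = [
        (counts[a] * theta + (1.0 - theta) * p[a]) / (1.0 + (n - 1) * theta)
        for a in range(len(p))
    ]
    return rng.choices(range(len(p)), weights=weights)[0]


def simulate_pair(counts, theta, p, k, rng):
    """Simulate single-locus genotypes (R1, R2) for a related pair.

    counts: mutable list of allele counts, updated in place.
    k: (k2, k1, k0), probabilities of sharing 2, 1 or 0 alleles IBD.
    """
    def draw():
        allele = draw_allele(counts, theta, p, rng)
        counts[allele] += 1
        return allele

    r1 = (draw(), draw())                           # step 1
    ibd = rng.choices([2, 1, 0], weights=k)[0]      # step 2
    if ibd == 2:                                    # step 3, I = 2
        r2 = r1
        for allele in r2:
            counts[allele] += 1                     # assumption: copies are counted
    elif ibd == 1:                                  # step 3, I = 1
        shared = rng.choice(r1)
        counts[shared] += 1
        r2 = (shared, draw())
    else:                                           # step 3, I = 0
        r2 = (draw(), draw())
    return tuple(sorted(r1)), tuple(sorted(r2))


if __name__ == "__main__":
    rng = random.Random(2010)
    p = [0.2, 0.3, 0.5]          # illustrative allele frequencies
    counts = [0, 0, 0]
    print(simulate_pair(counts, 0.03, p, (0.25, 0.5, 0.25), rng), counts)
```

A multi-locus profile would repeat this per locus with locus-specific allele frequencies and counts.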
By using this sampling scheme we make $n_r/2$ pairwise comparisons for relatedness on level
$r$, since all other pairs of simulated relatives are mutually unrelated to each other. Hence, the
known vector $p_r = \{P(R = r)\}_{r \in R}$ is for each simulated database:

\[
p_r = \left( \frac{n_{FS}}{n(n-1)},\; \frac{n_{1C}}{n(n-1)},\; \frac{n_{PC}}{n(n-1)},\; \frac{n_{AV}}{n(n-1)},\; 1 - \frac{n_{+}}{n(n-1)} \right).
\]

From the expressions above it is clear that for increasing database sizes the number of
comparisons between relatives is $o(n^2)$. However, the impact on $M$ depends on the product of the
matching probabilities and the fraction of comparisons, $\tilde{\pi}_r p_r$. Mueller (2008) argued that the
number of full-sibling pairs in the Arizonian database ($n = 65{,}493$) needed to be between 1,000
and 3,000 pairs. This gives that the fraction of pairwise comparisons attributed to full-siblings is
between $4.73 \times 10^{-7}$ and $1.42 \times 10^{-6}$ for the Arizonian database.
In the formulation of Weir (2004, 2007) $\theta$ was assumed constant across loci. However, this need
not be the case due to different mutation rates, and possibly selection or indirect selection by
linkage to other genes/markers subject to selection (Tvedebrink, 2010). In our simulations we
used a constant $\theta$ across loci for simplicity. For each simulated database we estimated $\theta$ using
five optimisation criteria:

\[
C_1(\theta) = \sum \left( M - N\pi(\theta) \right)^2, \qquad
C_2(\theta) = \sum \frac{\left( M - N\pi(\theta) \right)^2}{N\pi(\theta)}, \qquad
C_3(\theta) = \sum \frac{\left| M - N\pi(\theta) \right|}{N\pi(\theta)}, \tag{3.6}
\]

\[
T_1(\theta) = \sum \frac{\left( M - N\pi(\theta) \right)^2}{\operatorname{diag}\{\Sigma(\theta)\}}, \qquad
T_2(\theta) = \{M - N\pi(\theta)\}^{\top}\, \Sigma(\theta)^{-}\, \{M - N\pi(\theta)\}, \tag{3.7}
\]

where summation is over the vector entries. The object functions in (3.6) were investigated by
Curran et al. (2007) as a means to compare the expected and observed counts. The authors argued
that numerical work indicated that $C_3(\theta)$ yielded good results since special emphasis is placed
on the upper tail of the distribution (large number of matching loci). The functions in (3.7) use
the covariance matrix, $\Sigma(\theta)$, computed in this paper (cf. below). The first function, $T_1(\theta)$, does
not take correlations into account, whereas $T_2(\theta)$ is a natural measure of similarity (a so-called
Mahalanobis distance) incorporating the covariance matrix.
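For a fixed $\theta$, the five criteria can be evaluated from the vectorised observed counts, the expected counts and the covariance matrix. The sketch below is a hedged illustration: it reads $C_1$ as the unweighted sum of squared differences, takes the generalised inverse of $\Sigma(\theta)$ as a precomputed input, and uses hypothetical names and toy numbers throughout.

```python
def criteria(observed, expected, sigma_diag, sigma_ginv):
    """Evaluate C1, C2, C3, T1 and T2 for one value of theta.

    observed:   vectorised counts M
    expected:   vectorised expected counts N * pi(theta), all positive
    sigma_diag: diagonal of Sigma(theta), all positive
    sigma_ginv: generalised inverse of Sigma(theta) as a list of rows
    """
    diff = [m - e for m, e in zip(observed, expected)]
    c1 = sum(d * d for d in diff)
    c2 = sum(d * d / e for d, e in zip(diff, expected))
    c3 = sum(abs(d) / e for d, e in zip(diff, expected))
    t1 = sum(d * d / v for d, v in zip(diff, sigma_diag))
    # Quadratic form diff' * Sigma^- * diff (Mahalanobis-type distance).
    t2 = sum(
        diff[i] * sigma_ginv[i][j] * diff[j]
        for i in range(len(diff))
        for j in range(len(diff))
    )
    return {"C1": c1, "C2": c2, "C3": c3, "T1": t1, "T2": t2}


if __name__ == "__main__":
    # Toy two-cell example with a diagonal covariance matrix.
    print(criteria([3.0, 5.0], [2.0, 6.0], [4.0, 4.0],
                   [[0.25, 0.0], [0.0, 0.25]]))
```

With a diagonal covariance matrix $T_1$ and $T_2$ coincide; they differ once the off-diagonal covariances between cell counts are included.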
Let $M$ be the M-matrix written in vector format (see Appendix 3.A for details on the
transformation). We derived the expression for the variance of $M$, $\Sigma(\theta)$, such that
$T_2(\theta) = \{N\pi(\theta) - M\}^{\top} \Sigma(\theta)^{-} \{N\pi(\theta) - M\}$ may be compared for various values of $\theta$ in order to obtain the minimal
$T_2(\theta)$. We use the generalised inverse of $\Sigma(\theta)$ since $\Sigma(\theta)$ is not of full rank due to the linear
constraint $N = M_{+/+}$, where the +-notation indicates summation over the index. Let all the
DNA profile identifiers, $(i_1, i_2, i_3, i_4)$, be different, then the variance is computed as:

\[
\begin{aligned}
\Sigma(\theta) = {} & \binom{n}{2} V\left[ M(G_{i_1}, G_{i_2}) \right]
+ 6 \binom{n}{3} C\left[ M(G_{i_1}, G_{i_2}), M(G_{i_1}, G_{i_3}) \right] \\
& + 6 \binom{n}{4} C\left[ M(G_{i_1}, G_{i_2}), M(G_{i_3}, G_{i_4}) \right],
\end{aligned}
\tag{3.8}
\]

where the covariances $C\left[ M(G_{i_1}, G_{i_2}), M(G_{i_1}, G_{i_3}) \right]$ and $C\left[ M(G_{i_1}, G_{i_2}), M(G_{i_3}, G_{i_4}) \right]$ are the most
involved terms to compute, since $V\left[ M(G_{i_1}, G_{i_2}) \right] = \operatorname{diag}\{\pi(\theta)\} - \pi(\theta)\pi(\theta)^{\top}$. The full details are
given in Appendix 3.A.
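The coefficients $\binom{n}{2}$, $6\binom{n}{3}$ and $6\binom{n}{4}$ in (3.8) count the ordered pairs of comparisons that share two, one and no profiles, respectively. A brute-force check for small $n$, written as a hedged illustration with a hypothetical function name, is:

```python
from itertools import combinations
from math import comb


def count_pair_overlaps(n):
    """Count ordered pairs of comparisons {i1, i2}, {i3, i4} among n profiles
    by the number of profiles the two comparisons have in common."""
    pairs = list(combinations(range(n), 2))
    counts = {0: 0, 1: 0, 2: 0}
    for a in pairs:
        for b in pairs:
            counts[len(set(a) & set(b))] += 1
    return counts


if __name__ == "__main__":
    for n in (4, 5, 6, 7):
        counts = count_pair_overlaps(n)
        print(n, counts, (comb(n, 2), 6 * comb(n, 3), 6 * comb(n, 4)))
```

The three counts add up to $\binom{n}{2}^2$, the total number of terms in the double sum that defines the variance of $M$.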
Simulations of unrelated DNA profiles
We simulated 1,000 databases for varying $\theta$-values, $\theta \in \{0.00; 0.01; 0.02; 0.03; 0.04\}$, with 10,000
DNA profiles per database. For each database we computed the summary statistic $M$, and
Figure 3.3 shows box plots of the summary statistics on logarithmic scale for each $m/p$-category for
$\theta = 0.03$. The superimposed vertical boxes (dark grey) represent an approximate 95%-confidence
interval computed by $N\pi(\theta) \pm 2\sqrt{\operatorname{diag}\{\Sigma(\theta)\}}$, where the approximation relies on an
approximation to normality for the counts. The performance of this approximation increases with the cell
counts, i.e. the smaller the counts the less accurate is the approximation. The light grey boxes
represent the 95% sample confidence interval based on the 2.5% and 97.5% quantiles in the
distribution of the simulated values. The expected value for each category is also inserted. It is
evident that the median for most categories is identical to the expected value, except for cases
with $N\pi_{m/p}$ small. Here, the box plot is of limited use since the observations are all or nothing.
For each method the minimum was found by evaluating the function for $\theta_i$ on a fine grid of
$\theta$-values with step length 0.0001 for the interval [0, 0.12]. The box plot of Figure 3.4 compares the
performance of the five measures of similarity between the observed and expected numbers for
the various $\theta$-values. In the box plot the known $\theta_0$ is subtracted from the estimated $\hat{\theta}$ such that
the box plot shows the deviation of $\hat{\theta}$ from the true value.
The box plot of Figure 3.4 indicate that there is hardly no dierence among the methods. How-
ever, the mean squared errors (MSE) in Table 3.3 show that the T
2
()-method has a slightly better
overall performance compared to the four other methods. Both the box plot and MSE show an
increase in the deviation for increasing -values. This is due to the larger variability (from the
higher correlation of the proles) in the simulated data, and hence the available information for
inference about decreases.
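The grid search and the MSE summary described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: `toy_objective` is a hypothetical stand-in for one of the functions in (3.6) and (3.7), which are not reproduced here, and the helper names are illustrative.

```python
def grid_minimise(objective, lower=0.0, upper=0.12, step=0.0001):
    """Return the grid point in [lower, upper] that minimises `objective`."""
    n_steps = round((upper - lower) / step)
    grid = [lower + k * step for k in range(n_steps + 1)]
    return min(grid, key=objective)


def mean_squared_error(estimates, truth):
    """Average squared deviation of the estimates from the known value."""
    return sum((e - truth) ** 2 for e in estimates) / len(estimates)


def toy_objective(theta, target=0.03):
    # Hypothetical stand-in for C_i(theta) or T_i(theta); NOT the paper's functions.
    return (theta - target) ** 2


if __name__ == "__main__":
    theta_hat = grid_minimise(toy_objective)
    print(theta_hat, mean_squared_error([theta_hat], 0.03))
```

An exhaustive grid is a simple choice for a one-dimensional parameter: with 1,201 grid points per database it avoids any dependence on starting values, at the price of a resolution limited to the step length.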
[Figure 3.3: box plots of the counts (horizontal axis "Counts", log10 scale) for each Match/Partial category 0/0, 0/1, ..., 10/0 (vertical axis). Legend: 95% simulated confidence interval; 95% theoretical confidence interval; box whiskers (simulations); expected value; extreme observations (simulations).]
Figure 3.3: Box plots of the cell counts (on $\log_{10}$-scale) for the various categories for 1,000 simulated databases with 10,000 DNA profiles and $\theta = 0.03$. The legend explains the plot characters.
[Figure 3.4: box plots of $\hat\theta - \theta_0$ (vertical axis, from -0.04 to 0.04) for the methods C1, C2, C3, T1 and T2, in five panels $\theta_0 = 0.00, 0.01, 0.02, 0.03, 0.04$.]
Figure 3.4: Comparisons of the performance of the objective functions in (3.6) and (3.7).
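Table 3.3: Mean squared errors for the five different measures of similarity stratified on $\theta$.

                  $C_1(\theta)$          $C_2(\theta)$          $C_3(\theta)$          $T_1(\theta)$          $T_2(\theta)$
$\theta = 0.00$   $1.072\times10^{-7}$   $1.136\times10^{-7}$   $1.078\times10^{-7}$   $1.077\times10^{-7}$   $1.205\times10^{-7}$
$\theta = 0.01$   $3.432\times10^{-5}$   $3.418\times10^{-5}$   $3.457\times10^{-5}$   $3.280\times10^{-5}$   $3.264\times10^{-5}$
$\theta = 0.02$   $7.509\times10^{-5}$   $7.456\times10^{-5}$   $7.601\times10^{-5}$   $7.538\times10^{-5}$   $7.460\times10^{-5}$
$\theta = 0.03$   $1.213\times10^{-4}$   $1.205\times10^{-4}$   $1.231\times10^{-4}$   $1.222\times10^{-4}$   $1.208\times10^{-4}$
$\theta = 0.04$   $1.711\times10^{-4}$   $1.697\times10^{-4}$   $1.730\times10^{-4}$   $1.727\times10^{-4}$   $1.702\times10^{-4}$
Overall           $8.034\times10^{-5}$   $7.977\times10^{-5}$   $8.132\times10^{-5}$   $8.061\times10^{-5}$   $7.963\times10^{-5}$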
Simulations including close relatives
The simulations in the previous section only considered remote relatedness through allelic correlation governed by $\theta$. However, most realistic reference DNA profile databases will contain DNA profiles from closely related individuals, e.g. brothers and father-son pairs. Hence, we also investigated the performance of the $C(\theta)$- and $T(\theta)$-functions for databases with pairs of close relatives. For each $\theta$-value we simulated databases with the number of relatives as specified in Table 3.4.
As in the previous section, we want to minimise the deviation between the observed and expected counts. However, for these simulations the expected value depends on $\theta$ and $p_r$ through the expression

$$E(M; \theta, p_r) = \sum_{r \in \mathcal{R}} P(R = r)\, E(M \mid \theta; R = r) = \sum_{r \in \mathcal{R}} p_r N_r,$$

as discussed in relation to (3.3). Let $\tilde{C}(\theta)$ and $\tilde{T}(\theta)$ be as in (3.6) and (3.7), but with $N(\theta)$ replaced by $\sum_{r \in \mathcal{R}} p_r N_r$; then we seek $(\hat\theta, \hat{p}_r) = \arg\min_{(\theta, p_r)} \tilde{F}(\theta)$ for $\tilde{F}$ being either $\tilde{C}$ or $\tilde{T}$.

Table 3.4: The number of simulated relatives for the various $\theta$-values with a total of 10,000 DNA profiles. The numbers in brackets are the relative frequencies of pairwise comparisons between DNA profiles with the specified relationship, i.e. the known $P(r)$-values.

Full-siblings                First-cousins                Parent-child                 Avuncular                    Unrelated
2,000 ($2\times10^{-5}$)     2,000 ($2\times10^{-5}$)     2,000 ($2\times10^{-5}$)     2,000 ($2\times10^{-5}$)     2,000 (0.99992)
5,000 ($5\times10^{-5}$)     1,000 ($1\times10^{-5}$)     1,000 ($1\times10^{-5}$)     1,000 ($1\times10^{-5}$)     2,000 (0.99992)
1,000 ($1\times10^{-5}$)     5,000 ($5\times10^{-5}$)     1,000 ($1\times10^{-5}$)     1,000 ($1\times10^{-5}$)     2,000 (0.99992)
1,000 ($1\times10^{-5}$)     1,000 ($1\times10^{-5}$)     5,000 ($5\times10^{-5}$)     1,000 ($1\times10^{-5}$)     2,000 (0.99992)
1,000 ($1\times10^{-5}$)     1,000 ($1\times10^{-5}$)     1,000 ($1\times10^{-5}$)     5,000 ($5\times10^{-5}$)     2,000 (0.99992)

It should be noted that for consistency the variance of $M$ should in this case be computed as $\tilde\Sigma(\theta) = E(V(M \mid R)) + V(E(M \mid R))$. However, we argue that the complexity and cost of computing $\tilde\Sigma(\theta)$ far outweigh the gain. Hence, when minimising $\tilde{T}_1(\theta)$ and $\tilde{T}_2(\theta)$ we use $\Sigma(\theta)$ in the computations.
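The mixture form of the expected counts, $\sum_r p_r N_r$, can be illustrated with a small Python sketch. The relationship labels, the two categories and all numbers below are made up for illustration; only the weighted-sum structure follows the text.

```python
def mixture_expected_counts(p_r, n_r):
    """Expected counts sum_r p_r * N_r, one value per m/p category."""
    if abs(sum(p_r.values()) - 1.0) > 1e-9:
        raise ValueError("relationship probabilities must sum to one")
    categories = list(next(iter(n_r.values())))
    return {c: sum(p_r[r] * n_r[r][c] for r in p_r) for c in categories}


if __name__ == "__main__":
    # Hypothetical toy input: two relationships and two m/p categories.
    p_r = {"unrelated": 0.9, "full_siblings": 0.1}
    n_r = {
        "unrelated": {"0/0": 10.0, "1/0": 2.0},
        "full_siblings": {"0/0": 4.0, "1/0": 8.0},
    }
    print(mixture_expected_counts(p_r, n_r))
```

Because the unrelated weight is one minus the sum of the other weights, only the probabilities of the close relationships are free parameters in the minimisation.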
The performance of the different optimisation measures is summarised in Figure 3.5 and Table 3.5. The pattern of larger variation of the $\theta$-estimates for increasing $\theta_0$ is repeated in the simulations with relatives. Figure 3.5 shows a marked spread in the estimates of P(First-cousins) for the $C_i(\theta)$-methods, $i = 1, 2, 3$. The MSEs for $P(r)$ are generally smaller for $C_i(\theta)$, whereas $T_2(\theta)$ has smaller MSE for $\theta$.
Assuming that the estimators of $\theta$ and $p_r$ are unbiased, the expected values are given in Table 3.4 and the estimated variances in Table 3.5. Overall the mean is of order $10^{-5}$ while the standard errors are of order $10^{-4}$, indicating that not all parameters seem to be significant. Since the minimisation is computationally intensive, we dropped all close relationships but P(FS) and re-fitted the model. Naturally, P(FS) overestimates the actual fraction of full-siblings since it needs to compensate for first-cousins, parent-child and avuncular pairs. However, the estimate of P(Full-siblings) is for this reduced model significantly different from zero.
3.3.2 Danish database
The Danish reference DNA profile database was analysed using the described methods and gave the summary statistic presented in Table 3.1 and Figure 3.1. We used the $T_2(\theta)$ method to estimate $\theta$ and $p_r$ for the Danish database. The minimum was obtained with $\hat\theta = 0.0107$ and $\hat{p}_r$ as reported in Table 3.6. It is noteworthy that $\hat\theta = 0$ for all of the $C_i(\theta)$-methods. It seems rather unlikely that there is no effect of subpopulation after allowing for close relatives.
Note that the estimated P(Full-siblings) for $T_2(\theta)$ is about a factor 10 larger than $2\times10^{-7}$, which is the approximate value obtained if one assumes that every individual of the Danish adult population has exactly one full-sibling. However, it is likely that the frequency of full-siblings is larger in the reference database than in the population due to various factors, e.g. the police's sampling criteria and social factors. Inserting these values into the expressions for the expected value and $\Sigma(\theta)$ gives the expected values and covariance matrix, and given these quantities we computed marginal 95% confidence intervals (superimposed in Figure 3.1).

[Figure 3.5: box plots of the deviations (vertical axis, 0.00 to 0.10) for the methods $\tilde{C}_1$, $\tilde{C}_2$, $\tilde{C}_3$, $\tilde{T}_1$ and $\tilde{T}_2$, with panels for $\theta_0 = 0.00, 0.01, 0.02, 0.03, 0.04$ and for the parameters P(Full-siblings), P(First-cousins), P(Parent-child) and P(Avuncular).]
Figure 3.5: Box plot of the differences between $\theta_0$ and $\hat\theta$ (with $\theta$ replaced for the relevant parameters) for various $\theta$-values and number of relatives in the simulated databases.
The argument for using the $\theta$-correction when assessing the evidential weight of a given DNA profile is to adjust for possible subpopulation effects in the population from which the suspect and the profiles for estimating allele probabilities are drawn. A structured population causes the probability of observing a specific DNA profile to be heterogeneous, since the prevalence of its constituent alleles may be higher in some subpopulations than in the entire population. Taking the argument further, one could argue that adjustment should also be made for close relatedness between the suspect and the "random man". Hence, when forming the likelihood ratio, LR, the hypothesis in the denominator could be $H_d$: "A man possibly related to the suspect is the true donor of the biological stain". The evaluation of $P(E \mid H_d)$ would then be a sum $\sum_{r \in \mathcal{R}} P(E \mid H_d, R = r) P(R = r)$, where $(H_d, R = r)$ concretises the specific relationship $r$ between suspect and culprit.
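Table 3.5: Mean squared errors (MSE) for various number of relatives stratified by $\theta$-values.

$\theta_0$   Parameter   $C_1(\theta)$           $C_2(\theta)$           $C_3(\theta)$           $T_1(\theta)$           $T_2(\theta)$
0.00         $\theta$    $1.354\times10^{-7}$    $1.316\times10^{-7}$    $1.539\times10^{-7}$    $2.514\times10^{-4}$    $1.202\times10^{-7}$
             P(FS)       $1.008\times10^{-11}$   $1.897\times10^{-8}$    $1.683\times10^{-9}$    $1.653\times10^{-10}$   $5.238\times10^{-6}$
             P(1C)       $4.813\times10^{-6}$    $8.078\times10^{-6}$    $5.762\times10^{-7}$    $1.505\times10^{-10}$   $3.630\times10^{-6}$
             P(PC)       $6.835\times10^{-11}$   $5.191\times10^{-10}$   $1.362\times10^{-10}$   $4.788\times10^{-9}$    $7.962\times10^{-7}$
             P(AV)       $1.835\times10^{-8}$    $7.247\times10^{-8}$    $6.895\times10^{-9}$    $1.693\times10^{-10}$   $7.217\times10^{-8}$
0.01         $\theta$    $3.221\times10^{-5}$    $3.273\times10^{-5}$    $3.049\times10^{-5}$    $2.032\times10^{-4}$    $2.984\times10^{-5}$
             P(FS)       $2.878\times10^{-11}$   $8.088\times10^{-8}$    $8.669\times10^{-10}$   $1.453\times10^{-10}$   $1.995\times10^{-6}$
             P(1C)       $3.825\times10^{-5}$    $6.430\times10^{-5}$    $7.402\times10^{-6}$    $1.431\times10^{-10}$   $6.560\times10^{-7}$
             P(PC)       $1.708\times10^{-10}$   $1.428\times10^{-9}$    $2.053\times10^{-10}$   $8.610\times10^{-9}$    $1.052\times10^{-6}$
             P(AV)       $7.710\times10^{-9}$    $3.483\times10^{-9}$    $1.590\times10^{-8}$    $3.275\times10^{-9}$    $6.465\times10^{-6}$
0.02         $\theta$    $7.006\times10^{-5}$    $7.165\times10^{-5}$    $6.542\times10^{-5}$    $2.472\times10^{-4}$    $6.853\times10^{-5}$
             P(FS)       $4.029\times10^{-11}$   $1.727\times10^{-8}$    $1.131\times10^{-9}$    $1.719\times10^{-10}$   $6.694\times10^{-7}$
             P(1C)       $7.252\times10^{-5}$    $1.055\times10^{-4}$    $9.885\times10^{-6}$    $1.050\times10^{-9}$    $3.598\times10^{-7}$
             P(PC)       $1.976\times10^{-10}$   $7.711\times10^{-10}$   $2.924\times10^{-10}$   $1.024\times10^{-8}$    $5.935\times10^{-7}$
             P(AV)       $4.547\times10^{-9}$    $5.264\times10^{-10}$   $1.201\times10^{-8}$    $1.250\times10^{-8}$    $5.160\times10^{-6}$
0.03         $\theta$    $1.232\times10^{-4}$    $1.263\times10^{-4}$    $1.153\times10^{-4}$    $2.237\times10^{-4}$    $1.213\times10^{-4}$
             P(FS)       $5.841\times10^{-11}$   $5.695\times10^{-9}$    $7.739\times10^{-10}$   $1.714\times10^{-10}$   $7.768\times10^{-7}$
             P(1C)       $1.688\times10^{-4}$    $1.649\times10^{-4}$    $1.854\times10^{-5}$    $2.294\times10^{-7}$    $3.728\times10^{-7}$
             P(PC)       $2.294\times10^{-10}$   $3.160\times10^{-10}$   $3.701\times10^{-10}$   $1.490\times10^{-8}$    $7.187\times10^{-7}$
             P(AV)       $3.361\times10^{-10}$   $5.053\times10^{-10}$   $5.101\times10^{-8}$    $6.679\times10^{-9}$    $4.757\times10^{-6}$
0.04         $\theta$    $1.661\times10^{-4}$    $1.698\times10^{-4}$    $1.532\times10^{-4}$    $2.839\times10^{-4}$    $1.463\times10^{-4}$
             P(FS)       $8.469\times10^{-11}$   $1.109\times10^{-9}$    $1.195\times10^{-9}$    $2.560\times10^{-10}$   $1.063\times10^{-6}$
             P(1C)       $1.886\times10^{-4}$    $1.542\times10^{-4}$    $2.180\times10^{-5}$    $3.568\times10^{-7}$    $5.585\times10^{-7}$
             P(PC)       $2.318\times10^{-10}$   $2.240\times10^{-10}$   $4.195\times10^{-10}$   $1.665\times10^{-8}$    $1.028\times10^{-6}$
             P(AV)       $5.348\times10^{-10}$   $8.605\times10^{-11}$   $1.799\times10^{-8}$    $1.416\times10^{-8}$    $5.296\times10^{-6}$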
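Table 3.6: Estimated values for the Danish database using various objective functions.

Method          $\hat\theta$   P(Full-siblings)       P(First-cousins)       P(Parent-child)         P(Avuncular)
$C_1(\theta)$   0.0000         $2.592\times10^{-6}$   $8.413\times10^{-9}$   $1.072\times10^{-12}$   $1.930\times10^{-9}$
$C_2(\theta)$   0.0000         $3.700\times10^{-7}$   $5.100\times10^{-7}$   $1.000\times10^{-8}$    $4.600\times10^{-7}$
$C_3(\theta)$   0.0000         $5.005\times10^{-6}$   $3.534\times10^{-7}$   $6.089\times10^{-13}$   $2.475\times10^{-7}$
$T_1(\theta)$   0.0125         $1.072\times10^{-6}$   $4.573\times10^{-8}$   $5.197\times10^{-5}$    $5.930\times10^{-9}$
$T_2(\theta)$   0.0107         $2.263\times10^{-6}$   $1.757\times10^{-7}$   $1.491\times10^{-6}$    $5.882\times10^{-9}$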
The problem with this approach would be to quantify $P(R = r)$ for a given suspect. One approach could be to take $\hat{p}_r$ as estimated from the database and then form a weighted sum in the denominator. By doing so for the Danish database with the estimated $\theta$, frequencies for alleles and pairs of relatives, we obtained LR and $LR_r$, where $LR_r$ denotes the LR taking close relatives into account:

$$LR_r = \frac{P(E \mid H_p)}{P(E \mid H_d)} = \frac{P(E \mid H_p)}{\sum_{r \in \mathcal{R}} P(E \mid H_d, R = r) P(R = r)} = \frac{1}{\sum_{r \in \mathcal{R}} P(C \mid S, R = r) P(R = r)},$$

where $P(C \mid S, R = r)$ is computed by multiplying (3.4) and (3.5) over loci.
For each profile in the database we computed LR assuming that the profile was that of a suspect in a single-contributor crime case, i.e. $LR = 1/P(U \mid S)$, where $P(U \mid S)$ is the probability of observing an unknown profile (the defence hypothesis) given the suspect's profile. Similarly, we computed $LR_r$ under the same circumstances, except that the unknown profile may be that of a close relative of $S$.
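The weighted-sum denominator of $LR_r$ is straightforward to evaluate once the conditional match probabilities are available. The Python sketch below assumes they are supplied as a dictionary; the function name and the toy numbers are hypothetical, and the per-locus computation via (3.4) and (3.5) is not reproduced.

```python
def lr_with_relatives(match_prob, p_r):
    """LR_r = 1 / sum_r P(C | S, R = r) * P(R = r)."""
    if set(match_prob) != set(p_r):
        raise ValueError("both inputs must cover the same relationships")
    denominator = sum(match_prob[r] * p_r[r] for r in p_r)
    return 1.0 / denominator


if __name__ == "__main__":
    # Hypothetical values, chosen only to make the arithmetic easy to follow.
    match_prob = {"unrelated": 0.5, "full_siblings": 0.25}
    p_r = {"unrelated": 0.5, "full_siblings": 0.5}
    print(lr_with_relatives(match_prob, p_r))
```

With realistic inputs the unrelated term is many orders of magnitude smaller than the full-sibling term, so even a tiny $P(R = \mathrm{FS})$ can dominate the denominator.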
In Figure 3.6, we have plotted $\log_{10} LR_r$ against $\log_{10} LR$ and see that the relationship is close to linear: $\log_{10} LR_r = \alpha + \beta \log_{10} LR$. With the estimated intercept $\hat\alpha = 8.59$ and slope $\hat\beta = 0.115$ we obtain a simple formula to calculate $LR_r$ from LR: $LR_r = 10^{8.59} LR^{0.115}$. In Figure 3.6, we have superimposed the predicted value (solid line) with the uncertainty represented by the predictive interval (dashed lines). The estimated mean and standard deviation of $\log_{10}(LR/LR_r)$ are 3.128 and 0.97, respectively. Hence, an approximate confidence interval for the ratio is given as $10^{3.128 \pm 1.96 \times 0.97} = [17; 106{,}955]$, i.e. taking close relatives into account decreases the LR by up to five orders of magnitude. The dominating contribution to the sum $P(E \mid H_d)$ is that of full-siblings, $P(E \mid H_d, R = \mathrm{FS})\, p_{\mathrm{FS}}$, which accounts for approximately 99.5% of $LR_r$. In Figure 3.2 this was also the category with the largest $\pi_{10/0}$. Hence, for practical purposes the only relevant type of close relatedness to include in $LR_r$ is full-siblings, since the decrease in $P(E \mid H_d, R)$ for the remaining types of relatives is minimal relative to $p_r$. Furthermore, we saw previously that the model including only full-siblings and unrelated pairs increased P(Full-siblings). Thus, this would decrease $LR_r$ further, yielding a more conservative evaluation of the evidence.
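The fitted relation and the interval for the ratio can be evaluated directly from the quoted estimates. The following Python sketch uses the reported intercept 8.59, slope 0.115, mean 3.128 and standard deviation 0.97; the function names are illustrative and not part of the original analysis.

```python
def lr_r_from_lr(lr, intercept=8.59, slope=0.115):
    """Fitted relation log10(LR_r) = intercept + slope * log10(LR)."""
    return 10.0 ** intercept * lr ** slope


def ratio_interval(mean_log10=3.128, sd_log10=0.97, z=1.96):
    """Approximate 95% interval for LR / LR_r, back-transformed from log10 scale."""
    half_width = z * sd_log10
    return 10.0 ** (mean_log10 - half_width), 10.0 ** (mean_log10 + half_width)


if __name__ == "__main__":
    lower, upper = ratio_interval()
    print(round(lower), round(upper))
    print(lr_r_from_lr(1e14))
```

Because the interval is symmetric on the log scale, it is strongly asymmetric on the original scale, which is why the upper limit is several thousand times the lower limit.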
3.4 Discussion
It is evident from the analysis of the Danish reference DNA profile database that a $\theta$-correction close to 1% is sufficient to capture the effects of substructure among the typed DNA profiles. Furthermore, the analysis indicated the presence of close relatives in the database. This was known beforehand, but the number of close relatives was unknown. However, the significance of the estimated probabilities, $\hat{p}_r$, was not assessed, implying that some of them may be zero.
It is unknown whether it makes sense to present $LR_r$ in court, since the judge and jury are often more interested in the LR for a specific relationship than in a mean over common relationships with numerical impact on $P(E \mid H_d)$. However, $LR_r$ may be used in order to accommodate the fact that the "unrelated" man may in fact be an unknown close relative of the suspect.
[Figure 3.6: hexagonal-bin scatter plot of $\log_{10} LR_r$ (vertical axis, 9.6 to 10.6) against $\log_{10} LR$ (horizontal axis, 10 to 18); the legend gives bin counts from 1 to 1200.]
Figure 3.6: Relationship between LR and $LR_r$ with a predictive interval superimposed (solid line: mean, dashed lines: predictive limits). The shaded hexagons indicate bin counts.
3.5 Conclusion
The main objective of the work presented in this paper was to analyse the Danish reference DNA profile database of 51,517 different individuals. This was to accommodate the fact that at some point two apparently unrelated individuals in the Danish population will share DNA profiles at all ten loci. If a specified relationship is determined, it is straightforward to calculate the probability of identical DNA profiles; however, one still needs to account for remote coancestry for both related and unrelated pairs of profiles.
Furthermore, only modelling the expected value or calculating the mean is never satisfactory in statistics. A measure of precision or variability is needed in order to discuss the extremity of an observation relative to the expectation under a given model. Hence, deriving and computing the covariance matrix of $M$ was essential. However, the simulations exemplified that there was no pronounced improvement by using the Mahalanobis distance, $T_2(\theta) = [M - N(\theta)]^\top \Sigma(\theta)^{-} [M - N(\theta)]$, rather than the $C(\theta)$-functions for estimating $\theta$.
Acknowledgements
The authors would like to thank Ms. Kirstine Kristensen and Ms. Line Maria Irlund Pedersen (both from the Section of Forensic Genetics, University of Copenhagen) for their assistance in verifying the familial relationships of the twins in the database, and validating some near matches due to typing errors.
Appendix
3.A Derivation and computation of the variance
In order to compute the variance of the summary matrix, we use the definition of variance and covariance for random variables. First, note that $M(G_i, G_j)$ may be listed as a vector, where the mapping operates on the m/p values: $f(m, p; L) = m[(L + 1) - (m - 1)/2] + (p + 1)$, where $L$ is the total number of loci. Next, we expand the expression $V(M) = \Sigma(\theta)$:

$$\begin{aligned}
\Sigma(\theta) &= V\Big[\sum_{i=1}^{n-1} \sum_{j>i}^{n} M(G_i, G_j)\Big] \\
&= \sum_{i=1}^{n-1} \sum_{j>i}^{n} V\big[M(G_i, G_j)\big] + 6 \sum_{i=1}^{n-2} \sum_{j>i}^{n-1} \sum_{k>j}^{n} C\big[M(G_i, G_j), M(G_i, G_k)\big] \\
&\quad + \sum_{i=1}^{n-1} \sum_{j>i}^{n} \sum_{k \notin \{i,j\}}^{n-1} \sum_{\substack{l>k \\ l \notin \{i,j\}}}^{n} C\big[M(G_i, G_j), M(G_k, G_l)\big] \\
&= \binom{n}{2} V\big[M(G_{i_1}, G_{i_2})\big] + 6\binom{n}{3} C\big[M(G_{i_1}, G_{i_2}), M(G_{i_1}, G_{i_3})\big] + 6\binom{n}{4} C\big[M(G_{i_1}, G_{i_2}), M(G_{i_3}, G_{i_4})\big],
\end{aligned}$$

where $(i_1, i_2, i_3, i_4)$ in the last line relates to any of the DNA profiles in the database as long as they are different profiles. We go from the first to the second line by expanding the sum and observing that $C[M(G_i, G_j), M(G_i, G_k)] = C[M(G_i, G_j), M(G_j, G_k)] = C[M(G_i, G_k), M(G_j, G_k)]$ since $M(\cdot, \cdot)$ is symmetric. The sum over the last term in the expansion, $C[M(G_i, G_j), M(G_k, G_l)]$ with all profile indexes different, also contains several symmetries, implying the weights in the final expression. In order to compute the covariances, we need to compute $E[M(G_i, G_j) M(G_i, G_k)^\top]$ and $E[M(G_i, G_j) M(G_k, G_l)^\top]$, respectively, given that the DNA profile indexes $i$, $j$, $k$ and $l$ are all different.
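The vectorisation of the m/p categories introduced at the start of this appendix can be checked numerically. The Python sketch below assumes that the categories are listed in the order 0/0, 0/1, ..., 0/L, 1/0, ..., L/0 (the order used on the figure axes), so that the index of m/p is $m(L+1) - m(m-1)/2 + (p+1)$; the function names are illustrative.

```python
def category_index(m, p, n_loci):
    """1-based position of category m/p in the order 0/0, 0/1, ..., 0/L, 1/0, ..., L/0."""
    if m < 0 or p < 0 or m + p > n_loci:
        raise ValueError("need m, p >= 0 and m + p <= n_loci")
    # m * [(L + 1) - (m - 1) / 2] + (p + 1); m * (m - 1) is even, so // is exact.
    return m * (n_loci + 1) - m * (m - 1) // 2 + p + 1


def all_categories(n_loci):
    """All (m, p) with m + p <= n_loci, in the listing order assumed above."""
    return [(m, p) for m in range(n_loci + 1) for p in range(n_loci - m + 1)]


if __name__ == "__main__":
    indexes = [category_index(m, p, 10) for m, p in all_categories(10)]
    print(indexes[:3], indexes[-1])
```

For ten loci this enumerates the $(L+1)(L+2)/2 = 66$ categories without gaps, which is what makes the vector representation of $M$ well defined.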
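For computing $E[M(G_i, G_j) M(G_i, G_k)^\top]$ we need to account for the fact that profile $G_i$ enters in both pairwise comparisons. Hence, we need to condition on $G_i$ when deriving the probabilities $\pi_{m/p,\tilde{m}/\tilde{p}} = \sum_{i \le j} P(m/p, \tilde{m}/\tilde{p} \mid G_i = A_i A_j) P(A_i A_j)$ for all combinations of $m/p$, $\tilde{m}/\tilde{p}$, where $m/p$ relates to the number of matches/partial-matches of $G_i$ and $G_j$, with a similar definition of $\tilde{m}/\tilde{p}$ for profiles $G_i$ and $G_k$.
As for the mean, we use a recursion formula over loci to compute $\pi_{m/p,\tilde{m}/\tilde{p}}$. However, in this setting there are nine terms on the right-hand side:

$$\begin{aligned}
\pi^{\ell+1}_{m/p,\tilde{m}/\tilde{p}} ={}& \pi^{\ell}_{m/p,\tilde{m}/\tilde{p}} P^{\ell+1}_{0/0,0/0} + \pi^{\ell}_{m/p-1,\tilde{m}/\tilde{p}} P^{\ell+1}_{0/1,0/0} + \pi^{\ell}_{m-1/p,\tilde{m}/\tilde{p}} P^{\ell+1}_{1/0,0/0} \\
&+ \pi^{\ell}_{m/p,\tilde{m}/\tilde{p}-1} P^{\ell+1}_{0/0,0/1} + \pi^{\ell}_{m/p,\tilde{m}-1/\tilde{p}} P^{\ell+1}_{0/0,1/0} + \pi^{\ell}_{m/p-1,\tilde{m}/\tilde{p}-1} P^{\ell+1}_{0/1,0/1} \\
&+ \pi^{\ell}_{m-1/p,\tilde{m}/\tilde{p}-1} P^{\ell+1}_{1/0,0/1} + \pi^{\ell}_{m/p-1,\tilde{m}-1/\tilde{p}} P^{\ell+1}_{0/1,1/0} + \pi^{\ell}_{m-1/p,\tilde{m}-1/\tilde{p}} P^{\ell+1}_{1/0,1/0}.
\end{aligned}$$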
When one or more of the subscripts are zero there are similar boundary conditions for $\pi_{m/p,\tilde{m}/\tilde{p}}$ as those specified in Section 3.2.2. The probabilities $P_{m/p,\tilde{m}/\tilde{p}}$ are found by considering the events separately. For each configuration of $(m/p, \tilde{m}/\tilde{p}) \in \{(x_0/y_0, x_1/y_1) : x_i, y_i \in \{0, 1\},\ 0 \le x_i + y_i \le 1\}$ we compute the probabilities:

$$P_{m/p,\tilde{m}/\tilde{p}} = P(m/p, \tilde{m}/\tilde{p}) = \sum_{i \le j} P(m/p, \tilde{m}/\tilde{p} \mid G_i = A_i A_j) P(A_i A_j).$$

Each of the probabilities in the sums is expanded such that the events specified by $m/p$ and $\tilde{m}/\tilde{p}$ are satisfied, e.g. $m/p = 1/0$ and $\tilde{m}/\tilde{p} = 1/0$ imply that both profiles $G_j$ and $G_k$ match profile $G_i$ at that particular locus:

$$\begin{aligned}
P(1/0, 1/0) &= \sum_{i < j} P(A_i A_j, A_i A_j \mid A_i A_j) P(A_i A_j) + \sum_{i} P(A_i A_i, A_i A_i \mid A_i A_i) P(A_i A_i) \\
&= 2 \sum_{i, j \ne i} P(A_i A_i A_j A_j \mid A_i A_j) P(A_i A_j) + \sum_{i} P(A_i A_i A_i A_i \mid A_i A_i) P(A_i A_i) \\
&= 4 \sum_{i, j \ne i} P(A_i A_i A_i A_j A_j A_j) + \sum_{i} P(A_i A_i A_i A_i A_i A_i).
\end{aligned}$$
From the recursive formula $P(A_i \mid x^n) = [x^n_i \theta + (1 - \theta) p_i]/[1 + (n - 1)\theta]$, we see that the denominator depends only on the total number of sampled alleles and not on their types. Hence, for a probability like $P(A_i A_j A_k A_j A_i A_i)$ that involves six alleles, the denominator will always be $\prod_{n=1}^{5} [1 + (n - 1)\theta]$. Hence, to keep the formulae simple, we only consider the numerator in the following derivations. First, we observe that:

$$\begin{aligned}
P(A_i A_j A_k A_j A_i A_i) &= P(A_i \mid A_j A_k A_j A_i A_i) P(A_j A_k A_j A_i A_i) \\
&= [(\delta_i - 1)\theta + (1 - \theta) p_i]\, P(A_j A_k A_j A_i A_i) \\
&= (\delta_i - 1)\theta\, P(A_j A_k A_j A_i A_i) + (1 - \theta) p_i\, P(A_j A_k A_j A_i A_i), \qquad (3.9)
\end{aligned}$$

where $\delta_i$ counts the number of $A_i$ alleles in the expression on the left-hand side. Now, the term $(\delta_i - 1)\theta\, P(A_j A_k A_j A_i A_i)$ follows a similar expansion as the left-hand side of (3.9). However, the latter term of (3.9) involves $p_i$, which needs to be taken into account when evaluating $P(A_j A_k A_j A_i A_i)$. By following the recursion to the end, that is when the left-hand side of (3.9) is, say, $P(A_i A_j) = P(A_i \mid A_j) P(A_j) = [(\delta_i - 1)\theta + (1 - \theta) p_i] p_j = (1 - \theta) p_i p_j$, we end up with terms of the form $a_0 \theta^{a_1} (1 - \theta)^{a_2} p_1^{\alpha_1} \cdots p_K^{\alpha_K}$ for some constants $a = (a_0, a_1, a_2)$ and $\alpha = (\alpha_1, \ldots, \alpha_K)$. The values of $a$ and $\alpha$ are built up during the recursion, hence determining the actual value is only a matter of bookkeeping.
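The sampling recursion for alleles can be evaluated numerically, which gives a convenient check on any symbolic expansion. The Python sketch below applies the recursive formula allele by allele, including the denominators; the function name and the example allele probabilities are hypothetical.

```python
def sequence_probability(alleles, p, theta):
    """Probability of sampling `alleles` in the given order under the theta-recursion."""
    if not 0.0 <= theta < 1.0:
        raise ValueError("theta must lie in [0, 1)")
    seen = {}
    prob = 1.0
    for n, allele in enumerate(alleles):
        x = seen.get(allele, 0)  # copies of this allele among the n already sampled
        prob *= (x * theta + (1.0 - theta) * p[allele]) / (1.0 + (n - 1) * theta)
        seen[allele] = x + 1
    return prob


if __name__ == "__main__":
    p = {"A": 0.2, "B": 0.3, "C": 0.5}  # hypothetical allele probabilities
    print(sequence_probability("ABCBAA", p, 0.03))
    print(sequence_probability("AAABBC", p, 0.03))
```

The two calls in the usage example contain the same alleles in different orders; the recursion is exchangeable, so the sampling order should not matter, which is a useful property to test when implementing the bookkeeping of the constants.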
Furthermore, consider the case where the product of allele probabilities is $p_i^2 p_j^2 p_k^2$ where the indexes are different. A first step would be to replace the sum over $p_k^2$ by $S_2 - p_i^2 - p_j^2$, with $S_2 = \sum_k p_k^2$, and sum over $p_i^2 p_j^2 (S_2 - p_i^2 - p_j^2)$ for $i \ne j$. However, such calculations are very cumbersome to do by hand, and from the equation below we see that there is a lot of repeated structure that may be exploited:

$$\sum_{i \ne j \ne k} p_i^2 p_j^2 p_k^2 = \sum_{i \ne j} p_i^2 p_j^2 (S_2 - p_i^2 - p_j^2) = S_2 \sum_{i \ne j} p_i^2 p_j^2 - \sum_{i \ne j} p_i^4 p_j^2 - \sum_{i \ne j} p_i^2 p_j^4,$$

where the notation implies summation over different values of the indexes. Rewriting the expression above with the powers replaced by the $\alpha$-parameters, we get this more general expression:

$$\sum_{i \ne j \ne k} p_i^{\alpha_i} p_j^{\alpha_j} p_k^{\alpha_k} = \Big(\sum_{k} p_k^{\alpha_k}\Big) \sum_{i \ne j} p_i^{\alpha_i} p_j^{\alpha_j} - \sum_{i \ne j} p_i^{\alpha_i + \alpha_k} p_j^{\alpha_j} - \sum_{i \ne j} p_i^{\alpha_i} p_j^{\alpha_j + \alpha_k},$$
where all $\alpha$-parameters were 2 in the previous example. The formula can be programmed as a recursion. Hence, in contrast to the simpler situations involving only a pair of DNA profiles, where a few equations give the necessary probabilities (Weir, 2004, 2007), we let the computer compute the expectations $E[M(G_{i_1}, G_{i_2}) M(G_{i_1}, G_{i_3})^\top]$ and $E[M(G_{i_1}, G_{i_2}) M(G_{i_3}, G_{i_4})^\top]$. We have implemented efficient functions in R to compute these and other expectations, such that the variances are computed within 10 to 30 seconds on a 2.5 GHz laptop computer for each $\theta$-value. In order to get an impression of the structure in the matrix, we have plotted a heat-map of the correlation matrix $\rho(\theta)$ computed by:

$$\rho(\theta) = \mathrm{diag}\Big\{1/\sqrt{\mathrm{diag}\{\Sigma(\theta)\}}\Big\}\, \Sigma(\theta)\, \mathrm{diag}\Big\{1/\sqrt{\mathrm{diag}\{\Sigma(\theta)\}}\Big\}.$$

In Figure 3.7 we have plotted $\rho(0.03)$ in grey-scale colours. The online supplementary material has a coloured animation showing the change in pattern in $\rho(\theta)$ for $\theta = 0, 0.001, \ldots, 1$.
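The recursion for sums over pairwise different indexes can be sketched as follows. This Python version is not the authors' R implementation; it is a minimal illustration that peels off the last index, using the power sum minus the terms where the last index coincides with an earlier one, and compares the result with direct enumeration. All names are illustrative.

```python
from itertools import permutations


def power_sum(p, a):
    """S_a = sum_k p_k ** a."""
    return sum(x ** a for x in p)


def distinct_sum(p, alphas):
    """Sum of prod_t p[i_t] ** alphas[t] over pairwise different indexes i_1, ..., i_K."""
    alphas = list(alphas)
    if len(alphas) == 1:
        return power_sum(p, alphas[0])
    head, last = alphas[:-1], alphas[-1]
    total = power_sum(p, last) * distinct_sum(p, head)
    for t in range(len(head)):
        merged = list(head)
        merged[t] += last  # the last index coincides with index t
        total -= distinct_sum(p, merged)
    return total


def brute_force(p, alphas):
    """Direct enumeration over all ordered tuples of different indexes."""
    total = 0.0
    for idx in permutations(range(len(p)), len(alphas)):
        term = 1.0
        for i, a in zip(idx, alphas):
            term *= p[i] ** a
        total += term
    return total


if __name__ == "__main__":
    p = [0.1, 0.2, 0.3, 0.4]  # hypothetical allele probabilities
    print(distinct_sum(p, (2, 2, 2)), brute_force(p, (2, 2, 2)))
```

The recursion only ever needs power sums of the allele probabilities, so its cost does not grow with the number of alleles in the way the direct enumeration does.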
[Figure 3.7: heat-map with the m/p categories 0/0, 0/1, ..., 10/0 on both axes and a grey scale from -1.0 to 1.0.]
Figure 3.7: Graphical representation of the correlation matrix $\rho(\theta)$ computed for $\theta = 0.03$ and $n = 10{,}000$.
Bibliography
Ayres, K. L. (2000). A two-locus forensic match probability for subdivided populations. Genetica 108, 137-143.
Balding, D. J. and R. A. Nichols (1995). A method for quantifying differentiation between populations at multi-allelic loci and its implications for investigating identity and paternity. Genetica 96, 3-12.
Curran, J. M. and T. Tvedebrink (2010a). DNAtools - a R package for forensic DNA database analysis. Journal of Computational Statistics. Manuscript in preparation.
Curran, J. M. and T. Tvedebrink (2010b). DNAtools: Statistical functions for analysing forensic DNA databases. R package version 0.1.
Curran, J. M., S. J. Walsh, and J. S. Buckleton (2007). Empirical testing of estimated DNA frequencies. Forensic Science International: Genetics 1, 267-272.
Donnelly, P. (1995a). Match probability calculations for multi-locus DNA profiles. Genetica 96, 55-67.
Donnelly, P. (1995b). Nonindependence of matches at different loci in DNA profiles: quantifying the effect of close relatives on the match probability. Heredity 75, 26-34.
Lange, K. (1993). Match probabilities in racially admixed populations. American Journal of Human Genetics 52, 305-311.
Lange, K. (1995). Applications of the Dirichlet distribution to forensic match probabilities. Genetica 96, 107-117.
Laurie, C. and B. S. Weir (2003). Dependency effects in multi-locus match probabilities. Theoretical Population Biology 63, 207-219.
Mueller, L. D. (2008). Can simple population genetic models reconcile partial match frequencies observed in large forensic databases? Journal of Genetics 87(2), 101-107.
Nichols, R. A. and D. J. Balding (1991). Effects of population structure on DNA fingerprint analysis in forensic science. Heredity 66, 297-302.
R Development Core Team (2009). R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. ISBN 3-900051-07-0.
Song, Y. S. and M. Slatkin (2007). A graphical approach to multi-locus match probability computation: Revisiting the product rule. Theoretical Population Biology 72, 96-110.
Troyer, K., T. Gilroy, and B. Koeneman (2001). A nine STR locus match between two apparent unrelated individuals using AmpFlSTR Profiler Plus and COfiler. Proceedings of the Promega 12th International Symposium on Human Identification.
Tvedebrink, T. (2010). Overdispersion in allelic counts and $\theta$-correction in forensic genetics. Theoretical Population Biology. In Press.
Weir, B. S. (2004). Matching and partially-matching DNA profiles. Journal of Forensic Science 49(5), 1-6.
Weir, B. S. (2007). The rarity of DNA profiles. The Annals of Applied Statistics 1(2), 358-370.
3.6 Supplementary remarks
It is relevant to be aware of and acknowledge the power of the DNA typing technology and its role in society. To most people DNA evidence is thought of as flawless and superior to any other sort of evidence. However, due to the very nature of DNA profiles there is a possibility that a pair of apparently unrelated individuals share a DNA profile. As pointed out by Weir (2007, pp. 360-361), this is related to the "birthday problem", where one computes the probability that at least two individuals out of $n$ have the same unspecified birthday. The fact that $n = 23$ gives more than 50% probability of at least two individuals sharing a birthday is surprising to many at first glance. However, this is due to the fact that the birthday is not specified. Similarly, when computing the probability that any two individuals share a DNA profile, the actual alleles of their common profile are not specified. If the profile were specified, the computed probability would in fact be the match probability of two DNA profiles. When summing over the possible DNA profiles we obtained $N_{L/0}(\theta)$, which was the expected number of pairs of individuals with identical DNA profiles. For the allele probabilities estimated from the Danish reference database we obtain $\pi_{10/0}(\theta) \approx (\beta_0 + \beta_1 \theta)^{\beta_2}$, which for non-negative parameters, $\beta = (0.13, 0.87, 14.71)$, is a monotonically increasing function. That is, the probability increases with $\theta$, i.e. the more heterogeneous the population is, the larger is the probability of coinciding DNA profiles.
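Both observations in this paragraph are easy to reproduce numerically. The Python sketch below computes the birthday probability for a group of $n$ people and evaluates the approximation to the full-match probability, assuming it has the form $(b_0 + b_1\theta)^{b_2}$ with the quoted parameters (0.13, 0.87, 14.71); the function names are illustrative.

```python
def shared_birthday_probability(n, days=365):
    """Probability that at least two of n people share an (unspecified) birthday."""
    p_all_different = 1.0
    for k in range(n):
        p_all_different *= (days - k) / days
    return 1.0 - p_all_different


def full_match_probability(theta, beta=(0.13, 0.87, 14.71)):
    """Assumed approximation (b0 + b1 * theta) ** b2 for a match at all ten loci."""
    b0, b1, b2 = beta
    return (b0 + b1 * theta) ** b2


if __name__ == "__main__":
    print(shared_birthday_probability(22), shared_birthday_probability(23))
    for theta in (0.0, 0.01, 0.03):
        print(theta, full_match_probability(theta))
```

Under the assumed form the approximation equals one at $\theta = 1$, since $0.13 + 0.87 = 1$, which matches the intuition that all profiles coincide in a completely homogeneous subpopulation.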
However, this fact does not imply that DNA profiling is overrated nor that the weight of evidence reported in court is overstated. When using the LR-approach the reported evidential value relates to the specific DNA profile of a suspect. The pairwise comparisons of each pair in the DNA database were used to validate the population genetic model. The diagnostics presented above indicated that the differences between the observed and expected counts were not too extreme, and thus we may still have confidence in the models used for reporting the evidential weight in court.
CHAPTER 4
Evaluating the weight of evidence using
quantitative STR data in DNA mixtures
Publication details
Co-authors: Poul Svante Eriksen*, Helle Smidt Mogensen† and Niels Morling†
* Department of Mathematical Sciences, Aalborg University
† Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health Science, University of Copenhagen
Journal: Applied Statistics (In Press)
DOI: 10.1111/j.1467-9876.2010.00722.x
Abstract:
The evaluation of results from mixtures of DNA from two or more persons in crime case investigations may be improved by taking not only the qualitative but also the quantitative part of the results into consideration. We present a statistical likelihood approach to assess the probability of observed peak heights and peak areas information for a pair of profiles matching the DNA mixture. Furthermore, we demonstrate how to incorporate this probability into the evaluation of the weight of the evidence by a likelihood ratio approach.
Our model is based on a multivariate normal distribution of peak areas for assessing the weight of the evidence. Based on data from analyses of controlled experiments with mixed DNA samples, we exploited the linear relationship between peak heights and peak areas, and the linear relations of the means and variances of the measurements. Furthermore, the contribution from one individual's allele to the mean area of this allele is assumed to be proportional to the average of peak height measurements of alleles where the individual is the only contributor.
For shared alleles in mixed DNA samples, it is possible to observe only the cumulative peak heights and areas. Complying with this latent structure, we used the EM-algorithm to impute the missing variables based on a compound symmetry model. The measurements were subject to intra- and inter-locus correlations not depending on the actual alleles of the DNA profiles. Due to factorisation of the likelihood, properties of the normal distribution and use of auxiliary variables, an ordinary implementation of the EM-algorithm solved the missing data problem.
Keywords:
STR DNA mixture; Forensic genetics; Missing data; EM-algorithm; Compound symmetry model; Multivariate normal distribution
4.1 Introduction
4.1.1 DNA mixtures
The model presented in this paper is intended to be used in forensic genetics when facing DNA
data from biological stains with more than one contributor (see Gill et al. (2006) for a detailed
description of the DNA mixture problem). This specic problemhas received increasing interest
from both forensic geneticists and statisticians over the last decade, e.g. Evett and Weir (1998);
Gill et al. (1998, 2006); Perlin and Szabady (2001); Bill et al. (2005); Cowell et al. (2007a).
When a crime has been committed, biological stains are often found at the scene of crime. DNA
is present in almost all human cells and by using biochemical procedures, forensic geneticists
are able to extract the DNA from the body uids for further analysis. In many cases, more
than one individual has contributed to a stain, which is then called a DNA mixture. Mixtures
of DNA often appear in relation to crime cases, e.g. rape cases with one or more rapists, and
cases involving violence. DNA may be extracted from semen obtained by a vaginal swab or
from blood present on the victims clothing.
4.1 Introduction 61
In crime casework today, there is an international consensus to investigate DNA from short
tandem repeat regions - STRs. The STR regions are situated between the coding regions in
the DNA. The polymorphism of an STR region mainly results from dierences in the number
of repeated sequences. This leads to variations in the total lengths of the STR regions from
person to person. In many European countries, ten STR systems and the sex-specic marker
amelogenin are routinely investigated in crime cases by means of the SGMPlus STR kit (Applied
Biosystems). The loci are located on dierent chromosomes. This is generally assumed to be
sucient to ensure statistical independence of alleles at dierent loci.
For the most common STR technologies used in forensic DNA analyses, the alleles are read from
an electropherogram (pictured in Fig. 4.4) as peaks on a given scale. This makes two types of
data available: qualitative allele type data, determined by the position of the peak (measured in
DNA base pairs), and quantitative peak intensity data summarised by the height and area of the
peak (measured in relative fluorescence units, rfu). The set of observed alleles is termed a DNA
profile.
The shaded cones in Fig. 4.4 show a typical picture of a DNA mixture comprising ten STR
loci (denoted D3, vWA, . . . , FGA in Fig. 4.4) used in forensic genetics. The peak height and
area associated with an allele reflect the amount of DNA contributed to that particular allele.
The potential peak positions of some loci overlap, which makes it necessary to use different
fluorescent dyes (the different rows in Fig. 4.4 correspond to blue (D3, vWA, D16, D2), green
(D8, D21, D18) and yellow (D19, TH0, FGA) fluorescent dyes) with a subsequent spectral
deconvolution of the signal (Butler, 2005).
Depending on the DNA profiles mixed in the sample, the number of alleles present for each locus
in a two-person DNA mixture ranges from one to four alleles since an individual may be either
homozygous (carrying two identical copies of the same allele) or heterozygous (two different
alleles), and the individuals may share one or both alleles. This implies that the amount of DNA
contributed to each allele varies and the peaks are therefore expected to vary in height and area.
In this paper, we present a statistical model for the peak areas for a given pair of profiles while
taking into account the variable dimension (sub-vectors of dimension one to four for different
loci) of the measurements.
4.1.2 Evaluating the weight of evidence
A complete DNA investigation is a very effective tool for excluding individuals who are not
very closely related to the person from whom the stain material originated. A match between
complete DNA profiles of a stain and a person is very strong evidence for the assumption that the
stain came from that person compared with the assumption that the stain came from a random
person. The weight of DNA evidence can be calculated in each case based on assumptions
about the setting and knowledge of the distribution of the DNA characteristics in the relevant
population. The weight of evidence from DNA investigations is generally accepted in almost all
countries in which DNA investigations are used.
62 Evaluating the weight of evidence using quantitative STR data in DNA mixtures
Methods are available to estimate the weight of the evidence of the qualitative results (Balding
and Nichols, 1994; Evett and Weir, 1998). However, we do not have good mathematical methods
to take into consideration the quantitative aspects of the DNA results in order to answer questions
like: Can the two DNA profiles in a crime scene DNA mixture be identified based on the strength
of the DNA results? Are the strengths of the various DNA results in a crime scene mixed DNA
profile (that seems to consist of a major and a minor DNA profile) compatible with the hypothesis
that the DNA comes from two persons with known DNA profiles?
Estimating the weight of evidence in forensic sciences is often done in terms of a likelihood
ratio, which is the ratio of the probability of the evidence, $E$, under two competing and mutually
distinct but not exhaustive hypotheses. In the literature these two hypotheses are often denoted
$H_p$ and $H_d$ for the prosecutor's hypothesis and defence hypothesis respectively (Evett and
Weir, 1998). Even though the hypotheses may have different origin than those of the prosecutor
and defence, we apply the notation of $H_p$ and $H_d$ in this paper to denote the two disjoint events
claimed in the hypothesis, i.e. the likelihood ratio is given by $LR = P(E|H_p)/P(E|H_d)$, where
large values of $LR$ support the $H_p$-hypothesis. For example in case of a rape the $H_p$-hypothesis
may be: "The victim and the suspect are the contributors to the stain", whereas the $H_d$-hypothesis
states: "The victim and an unknown individual unrelated to the suspect are the contributors
to the stain". We denote the crime scene evidence from the mixture $E_c = (G, Q)$, where $G$
denotes the qualitative allele information, and $Q$ represents the quantitative peak information as
measurements of peak heights and areas. The most frequent way to assess the probability $P(E|H)$
is by solely using the qualitative information $G$ in terms of allele probabilities. In DNA mixtures,
however, this may discard important quantitative information of the DNA evidence. Thus, the
probability of the evidence $E$ given a hypothesis $H$ needs to include both parts of the evidence $G$
and $Q$.
We define $G_V$, $G_S$ and $G_U$ to be the profiles of the victim, the suspect and a potential unknown
and unrelated contributor, respectively. Both hypotheses $H_p$ and $H_d$ in our rape example are
formulated such that they are consistent with $G$, i.e. all alleles in $G$ are accounted for and only
alleles in $G$ appear in the included profiles $G_V$, $G_S$ and $G_U$. When fixing only one profile, $G_1$, of
a two-person mixture, the consistency with $G$ induces the set $C = \{G_2 : (G_1, G_2) \equiv G\}$, which are
all profiles, $G_2$, that together with $G_1$ are consistent with $G$. If the $H_p$-hypothesis claims $G$ to be
a mixture of $G_V$ and $G_S$, $H_p : (G_V, G_S)$, while the $H_d$-hypothesis claims it is a mixture of $G_V$ and
$G_U$, $H_d : (G_V, G_U)$, the likelihood ratio is
\[
LR = \frac{P(E_c, G_S, G_V \mid H_p)}{P(E_c, G_S, G_V \mid H_d)}
   = \frac{P(Q, G, G_S, G_V \mid H_p)}{P(Q, G, G_S, G_V \mid H_d)}
   = \frac{P(Q \mid G, G_S, G_V, H_p)\, P(G, G_S, G_V \mid H_p)}{P(Q \mid G, G_S, G_V, H_d)\, P(G, G_S, G_V \mid H_d)},
\]
where $G_S$ and $G_V$ enter as evidence as these are determined from the case circumstances. Let
$C_d = \{G_U : (G_V, G_U) \equiv G\}$ be the set of unknown profiles that together with $G_V$ are consistent
with $G$, then $P(G \mid G_V, G_U) = 1$ for $G_U \in C_d$ and 0 otherwise, i.e. the set of possible unknowns
under $H_d$. We expand the denominator of the LR using hypothesis $H_d$,
\[
P(E, G_S, G_V \mid H_d) = \sum_{G_U \in C_d} P(Q \mid G, G_S, G_V, G_U)\, P(G, G_S, G_V, G_U),
\]
Table 4.1: The four DNA profiles used in the controlled pairwise two-person mixture experiments.
     D3     vWA    D16    D2     D8     D21        D18    D19    TH0  FGA
A    14,18  17,19  12,14  20,24  10,13  30.2,32.2  13,13  12,13  8,9  20,22
B    15,16  14,16  10,12  17,25  13,16  30,30      13,13  14,15  6,9  19,23
C    15,16  15,17  11,11  19,25  8,12   29,31      15,17  13,13  6,8  23,24
D    16,19  15,17  10,12  23,25  13,13  28,30      12,16  13,15  6,7  20,23
where $P(Q \mid G, G_S, G_V, G_U) = P(Q \mid G_V, G_U)$ and $P(G, G_S, G_V, G_U) = P(G_S, G_V, G_U)$ due to
$(G_V, G_U) \equiv G$ and $H_d$ is assumed. Hence,
\[
P(E, G_S, G_V \mid H_d) = \sum_{G_U \in C_d} P(Q \mid G_V, G_U)\, P(G_S, G_V, G_U).
\]
Similar arguments apply to the numerator of LR, and assuming independence between the
profiles involved, i.e. unrelated individuals such that $P(G_S, G_V, G_U) = P(G_S)P(G_V)P(G_U)$, the final
LR expression is:
\[
LR = \frac{P(Q \mid G_S, G_V)}{\sum_{G_U \in C_d} P(Q \mid G_V, G_U)\, P(G_U)}, \tag{4.1}
\]
where the factors $P(G_S)P(G_V)$ have cancelled out. The numerator $P(Q \mid G_S, G_V)$ of (4.1) assesses
the probability of observing the quantitative information given that the mixture consists of
genetic material from the profiles $G_S$ and $G_V$. The denominator equals the mean value of the
quantitative likelihood among the pairs of profiles that are consistent with the genetic trace. If
we assume $P(Q \mid G_S, G_V) = P(Q \mid G_V, G_U)$ for all $G_U$, i.e. the observed quantitative information
has equal probability for all profiles paired with $G_V$, then (4.1) reduces to the usual likelihood
ratio as in Evett and Weir (1998), since $P(Q \mid G_S, G_V)$ and $P(Q \mid G_V, G_U)$ then cancel each other in
(4.1). The assumption that the profiles $G_S$, $G_V$ and $G_U$ are independent is rather strong. The
so-called $\theta$-correction incorporates the correlation from shared ancestry (Balding and Nichols,
1994) and closer familial relationships induce further correlation of the genetic profiles.
However, for the purpose of introducing the factorisation of the qualitative and quantitative evidence
the assumption used in (4.1) is adequate.
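The structure of (4.1) can be illustrated with a minimal Python sketch. The function name, the argument layout and all numerical values below are hypothetical and chosen only for illustration; in practice the quantitative likelihoods would come from the fitted model and the profile probabilities from allele frequencies.

```python
def likelihood_ratio(q_lik_suspect, candidates):
    """Evaluate the ratio in (4.1).

    q_lik_suspect: P(Q | G_S, G_V), the quantitative likelihood under H_p.
    candidates:    iterable of (P(Q | G_V, G_U), P(G_U)) pairs, one for every
                   unknown profile G_U in the consistent set C_d.
    """
    denominator = sum(q_lik * prior for q_lik, prior in candidates)
    if denominator <= 0.0:
        raise ValueError("denominator of the likelihood ratio must be positive")
    return q_lik_suspect / denominator


# Hypothetical values: three unknown profiles consistent with the stain.
candidates = [(2.0e-5, 1.0e-6), (4.0e-7, 3.0e-6), (1.0e-9, 5.0e-7)]
lr_quantitative = likelihood_ratio(2.0e-5, candidates)

# If Q is equally likely for every profile pair, the quantitative terms cancel
# and only the profile probabilities remain: LR = 1 / sum of P(G_U).
flat = [(1.0, prior) for _, prior in candidates]
lr_qualitative = likelihood_ratio(1.0, flat)
```

In this made-up example the suspect's pair fits the peaks as well as the best candidate does, so down-weighting the poorly fitting candidates makes the quantitative ratio larger than the purely qualitative one.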
The objective of the present paper is to develop a methodology and an adequate statistical model
to describe $P(Q \mid G_1, G_2)$, where both of the true profiles $G_1$ and $G_2$ are known. This comprises
a mathematical formalism of inter-locus dependencies of the quantitative evidence, the
relationships between a sample's peak heights, peak areas, and the amount of DNA contributed to the
individual peak.
[Figure 4.1: one panel per locus (FGA, TH0, D19, D2, vWA, D3, D16, D21, D8, D18); abscissa: amount of DNA; ordinate: peak area; both axes on a square-root scale.]
Figure 4.1: Proportionality of peak areas and amounts of DNA of square root transformed data.
4.2 Material and methods
4.2.1 Experimental data
The assumptions made as to the amplification behaviour of mixed DNA samples were based
on data exploration of controlled experiments conducted at The Section of Forensic Genetics,
Department of Forensic Medicine, Faculty of Health Sciences, University of Copenhagen. The
experiments consisted of pairwise two-person mixtures in various mixture ratios of the four
profiles in Table 4.1. The data were prepared as described in Tvedebrink et al. (2009). The
assumptions made did not contradict the assumptions made in e.g. Cowell et al. (2007a):
1. proportionality of the peak areas and the amount of DNA in the sample,
2. linearity of the observed peak areas and peak heights,
3. proportionality of the means and variances of peak areas.
These assumptions were supported by the plots in Fig. 4.1 and Fig. 4.2, which were based on data
from the experiments described in Tvedebrink et al. (2009). The validity of the last assumption
was emphasised by fitting a linear model: $\mathrm{Area}/\sqrt{\mathrm{DNA}} = \beta_s \sqrt{\mathrm{DNA}} + \varepsilon$, with $\varepsilon \sim N(0, \sigma^2)$ and
with $\beta_s$ being a locus specific proportionality factor. Graphical inspections show no systematic
dependence of squared residuals and DNA.
Figure 4.2: Proportionality of peak heights and peak areas. The proportionality factor depends
on loci.
4.2.2 Model description
In a DNA mixture of profiles C and D (Table 4.1) we would observe peaks for alleles 15, 16 and
19 in locus D3. The peaks are expected to vary in height and area due to the different amounts of
DNA contributed to the alleles, e.g. both profiles contribute to the peak of allele 16. For identical
alleles we assume that the peak areas of each individual are additive resulting in an observable
cumulative vector of peak areas, $M$. Similarly, for homozygous profiles, the contribution to the
observable peak area is the sum of two identical peak areas.
The unobservable peak areas, $A$, from each individual are input for modelling the observable
quantitative data, $Q$, from DNA traces for the assessment of $P(Q \mid G_V, G_P)$, where $G_V$ and $G_P$
are the profiles of the victim and true perpetrator, respectively. We use the EM-algorithm in
addressing the DNA mixture problem because it can be formulated as a missing data problem
(Little and Rubin, 2002). The model is derived for two-person mixtures but can be extended to
cope with more than two contributors.
In the following, we let $\mathcal{S}$ denote the set of loci and $S$ the number of loci used for identification,
i.e. $|\mathcal{S}| = S$. For parameter estimation, we have access to data from $C$ mixtures of known profiles.
The amount of DNA contributed to the mixture by person $k$, $k = 1, 2$, was modelled by $H^{(k)}$. This
is a sum of the observed peak heights with person $k$ as the only contributor divided by a sum
of indicators with value two and one for alleles from loci where person $k$ is homozygous and
heterozygous, respectively. Let $h^{(k)}_i$ be the $i$th peak height with person $k$ as the only contributor,
\[
H^{(k)} = \left( n^{(k)}_{\mathrm{het}} + 2 n^{(k)}_{\mathrm{hom}} \right)^{-1} \sum_{i=1}^{n^{(k)}} h^{(k)}_i ,
\]
where $n^{(k)} = n^{(k)}_{\mathrm{het}} + n^{(k)}_{\mathrm{hom}}$ is the number of person $k$'s alleles from
heterozygous, $n^{(k)}_{\mathrm{het}}$, and homozygous loci, $n^{(k)}_{\mathrm{hom}}$, for which person $k$ is the only contributor. Thus, $H^{(k)}$
is an estimate of the average peak height associated with person $k$'s alleles.
Figure 4.3: Linearity of the H-ratio and DNA-ratio with the identity line (y = x) superimposed.
The outlier at (2.77, 0.55) was due to entry error of the laboratory.
Fig. 4.3 shows a plot of the ratio $H^{(1)}/H^{(2)}$ against the DNA ratio reported by the laboratory. The
data demonstrate that it is reasonable to use $H^{(k)}$ as a proxy for the amount of DNA contributed
by person $k$. Furthermore, for each pair of profiles, the quantities $H^{(1)}$ and $H^{(2)}$ can be computed
using only the peak height observations.
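The definition of the proxy for the amount of DNA can be sketched in a few lines of Python. The function name and the dictionary layout are assumptions made for this illustration and are not taken from the R implementation. The peak heights are those of Table 4.2 and the genotypes are the minor and major profiles of Table 4.3, restricted to the loci D3, vWA and D16 for brevity.

```python
def average_unshared_height(own, other, heights):
    """Average peak height per allele copy over peaks owned by one person only.

    own, other: dict locus -> (allele, allele), genotypes of person k and of
                the other contributor.
    heights:    dict locus -> {allele: observed peak height}.
    """
    total, copies = 0.0, 0
    for locus, genotype in own.items():
        unshared = set(genotype) - set(other[locus])
        for allele in unshared:
            total += heights[locus][allele]
            # A homozygous locus puts two allele copies into a single peak.
            copies += 2 if genotype[0] == genotype[1] else 1
    if copies == 0:
        raise ValueError("person k has no unshared alleles")
    return total / copies


heights = {
    "D3": {15: 1135, 16: 1031},
    "vWA": {14: 371, 15: 921, 16: 395, 17: 804},
    "D16": {10: 485, 11: 2110, 12: 417},
}
minor = {"D3": (15, 16), "vWA": (14, 16), "D16": (10, 12)}
major = {"D3": (15, 16), "vWA": (15, 17), "D16": (11, 11)}

h_minor = average_unshared_height(minor, major, heights)
h_major = average_unshared_height(major, minor, heights)
```

Locus D3 contributes nothing here because both alleles are shared, while the homozygous peak of the major profile in D16 counts as two allele copies.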
We assumed independence among the components of $A$ and that they followed a normal
distribution with both mean and variance proportional to the amount of DNA. The components of $A$
are $A^{(k)}_{s,i}$ for person $k$, locus $s$ and allele $i$. We have $A^{(k)}_{s,i} \sim N(\alpha_s H^{(k)}, \tau_s^2 H^{(k)})$ which implies the
same distribution of both alleles of locus $s$ for person $k$.
The parameters $\alpha_s$ and $\tau_s$, $s \in \mathcal{S}$, are locus dependent and shared for all cases, $c = 1, \dots, C$.
This parameterisation ensures the proportionality of the mean and variance and that both are
proportional to the amount of DNA modelled by $H^{(k)}$. Since $H^{(k)}$ is the same for all loci, the
$\alpha_s$'s ensure that the amplification efficiency may vary between loci. Hence, the magnitude of
$\alpha_s$ reflects the emission intensity of locus $s$. Furthermore, the variation modelled by $\tau_s^2$ can be
interpreted as the data preprocessing variation of the STR allele signals, e.g. variability from
pipetting the samples.
The relation between $M$ and $A$ is expressed as a linear transformation, $T$, adding together
peak areas from identical alleles, and an additional error term related to the measurement error,
$M = TA + \varepsilon$. For the measurement errors, $\varepsilon$, we assume independence of $A$ and multivariate
normal distribution with some dependencies within and across loci. We denote the covariance of
$\varepsilon$ as $\mathrm{Cov}(\varepsilon) = \Sigma$. Let the dimension of $M$ be $n = \sum_{s \in \mathcal{S}} n_s$, where $n_s$, $1 \le n_s \le 4$, is the number
of observed alleles in locus $s$. The transformation, $T$, is an $n \times 4S$ block diagonal matrix with
block matrices $T_s$ with 0 and 1 entries according to the profiles in the mixture. For each locus,
$s$, we sort the unobservable peak areas, $A_s$, by allelic number of each person, whereas $M_s$ is
sorted by allelic number. For a mixture of profile B and D from Table 4.1, the genotypes in locus
$s$ = D3 are $P^{(1)}_s = (15, 16)$ and $P^{(2)}_s = (16, 19)$, and the associated matrix $T_s$ is
\[
M_s = \left( M_{s,15}, M_{s,16}, M_{s,19} \right)^{\top}
    = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
      \left( A^{(1)}_{s,15}, A^{(1)}_{s,16}, A^{(2)}_{s,16}, A^{(2)}_{s,19} \right)^{\top} + \varepsilon_s ,
\]
adding together the entries in $A_s$ that relate to the same allele, i.e. allele 16.
The number of allelic measurements within each locus varies from case to case since different
pairs of profiles will share a different number of alleles. A mixture of person A and B would
have $n_{\mathrm{D3}} = 4$, and B and C has $n_{\mathrm{D3}} = 2$ (see Table 4.1). Not only will the number of alleles vary,
the specific alleles present in a given mixture depend on the profiles in the mixture, e.g. A and
B give alleles {14, 15, 16, 18}, and B and C give {15, 16}. This makes it difficult to incorporate a
covariance structure covering all allele combinations.
We standardised the residual, $\varepsilon$, by the observed peak heights, $h = (h_s)_{s \in \mathcal{S}}$ with $h_s = (h_{s,i})_{i=1}^{n_s}$,
by defining the scaled residual, $\tilde{\varepsilon} = (\tilde{\varepsilon}_s)_{s \in \mathcal{S}}$, where $\tilde{\varepsilon}_s = (\varepsilon_{s,i}/\sqrt{h_{s,i}})_{i=1}^{n_s}$. To make the model
operational, we assumed a compound symmetry model for the covariance of $\tilde{\varepsilon}$, $\mathrm{Cov}(\tilde{\varepsilon}) = \tilde{\Sigma}$, and
that this does not depend on the specific alleles in the mixture. The only case specific adjustment
made was to make the dimensions of the compound symmetry concordant with the number of
observed peaks for each locus. The compound symmetry structure of $\tilde{\Sigma}$ implies that sub-vectors
of $\tilde{\varepsilon}$ share some properties with respect to the scaled covariance $\tilde{\Sigma}$. There are three different
types of correlation in our setting:
Different loci ($s \neq t$): $\mathrm{Cov}(\tilde{\varepsilon}_{s,i}, \tilde{\varepsilon}_{t,j}) = \sigma_{st}$.
Same locus, different alleles ($s = t$, $i \neq j$): $\mathrm{Cov}(\tilde{\varepsilon}_{s,i}, \tilde{\varepsilon}_{s,j}) = \sigma_{ss}$.
Same locus, same allele ($s = t$, $i = j$): $\mathrm{Cov}(\tilde{\varepsilon}_{s,i}, \tilde{\varepsilon}_{s,i}) = \mathrm{Var}(\tilde{\varepsilon}_{s,i}) = \sigma_{ss} + \delta_s$.
Hence, we can parameterise $\tilde{\Sigma}$ by $\delta = \{\delta_s\}_{s \in \mathcal{S}}$ and $\sigma = \{\sigma_{st}\}_{s,t \in \mathcal{S}}$. The interpretation of $\sigma_{st}$ is
that the correlations between observations at different loci depend only on the loci and not on
the specific alleles present on each locus. Similarly, the correlation between alleles on the same
locus, $s$, is independent of the specific alleles, whereas for identical elements, the covariance
corresponds to the variance, and the addition of $\delta_s$ allows for a larger variance than that given by
the intra-locus covariance.
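The three cases of the compound symmetry structure can be made concrete with a small Python sketch that assembles the covariance matrix of the scaled residuals for one mixture. The function name and the names `sigma` (locus-level covariances) and `delta` (additional variance per locus) are assumptions of this illustration, and the numerical values are hypothetical rather than estimates.

```python
def scaled_covariance(peaks_per_locus, sigma, delta):
    """Covariance matrix of the scaled residuals for one mixture.

    peaks_per_locus: list of (locus, number of observed peaks), in the order
                     in which the peaks appear in the observation vector.
    sigma:           dict (s, t) -> covariance between loci s and t; the entry
                     (s, s) is the covariance between alleles of locus s.
    delta:           dict s -> additional variance of a single peak at locus s.
    """
    labels = [locus for locus, n_s in peaks_per_locus for _ in range(n_s)]
    n = len(labels)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s, t = labels[i], labels[j]
            # Accept either ordering of the locus pair.
            value = sigma[(s, t)] if (s, t) in sigma else sigma[(t, s)]
            if i == j:
                value += delta[s]  # same locus, same allele
            cov[i][j] = value
    return cov


# Hypothetical parameter values for a mixture with two peaks in D3 and three in vWA.
sigma = {("D3", "D3"): 4.0, ("vWA", "vWA"): 9.0, ("D3", "vWA"): 1.5}
delta = {"D3": 0.5, "vWA": 1.0}
cov = scaled_covariance([("D3", 2), ("vWA", 3)], sigma, delta)
```

Only the block sizes depend on the case; the parameter values themselves do not change with the alleles observed, which is the point of the simplification.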
4.2.3 Implementation of the EM-algorithm
In order to handle the latent structure of $A$ and the associated missing data problem, we used
the EM-algorithm to impute the missing observations and estimate the parameters in the
conditional distribution of $A$ given $M$. However, since the dimensions of $M$ and sub-vectors hereof
varied from case to case, we obtained a likelihood, which was not very well suited for the
implementation of the EM-algorithm. The problem was solved by introducing appropriate auxiliary
variables.
This allowed for an implementation of the EM-algorithm in the usual full exponential family
framework with the constraint that the $\sigma_{ss}$-parameters should be positive, i.e. this method implies
positive intra-locus covariances. However, the inter-locus covariances $\sigma_{st}$ are not constrained.
The parameters estimated using the EM-algorithm are not case specific but reflect the distribution
of the quantitative STR DNA in the laboratory.
Appendices 4.A and 4.B give mathematical details on the model and the implementation of the
EM-algorithm.
4.3 Impact on the likelihood ratio
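As mentioned in Section 4.1.2, both the qualitative and quantitative evidence need to be evaluated
for proper use of the available information from a crime scene. The probability $P(Q \mid G_1, G_2)$ in
the likelihood ratio of (4.1) is evaluated by using the fitted model to calculate
\[
L(M \mid G_1, G_2) = \left| \Sigma_{(G_1,G_2)} \right|^{-1/2}
  \exp\left\{ -\tfrac{1}{2} \left( M - \mu_{(G_1,G_2)} \right)^{\top} \Sigma^{-1}_{(G_1,G_2)} \left( M - \mu_{(G_1,G_2)} \right) \right\},
\]
the likelihood of $(G_1, G_2)$ yielding the observed signal $M$, where $\mu_{(G_1,G_2)}$ and $\Sigma_{(G_1,G_2)}$ denote the
mean and covariance of $M$ for that pair of profiles, whereas $P(G_1)$ as usual is assessed using the
allele frequencies (Evett and Weir, 1998).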
Consider a more complicated case with no identified victim where the crime scene stain is
assumed to be a mixture of two DNA profiles, e.g. DNA extracted from a cigarette butt found
at the scene of crime. Then, given a suspect profile $G_S$, the $H_p$-hypothesis claims the stain to
be a mixture of the suspect and an unrelated unknown profile, $H_p : (G_U, G_S)$, whereas the
$H_d$-hypothesis states it is a mixture of two unrelated unknown profiles, $H_d : (G_{U_1}, G_{U_2})$. We form two
sets $C_p = \{G_U : (G_S, G_U) \equiv G\}$ and $C_d = \{(G_{U_1}, G_{U_2}) : (G_{U_1}, G_{U_2}) \equiv G\}$, consistent with each
hypothesis. Similar arguments as used for obtaining (4.1) imply that LR is:
\[
LR = \frac{P(E, G_S \mid H_p)}{P(E, G_S \mid H_d)}
   = \frac{\sum_{G_U \in C_p} L(M \mid G_U, G_S)\, P(G_U)}
          {\sum_{(G_{U_1}, G_{U_2}) \in C_d} L(M \mid G_{U_1}, G_{U_2})\, P(G_{U_1})\, P(G_{U_2})} .
\]
Note that the sum in the denominator involves $7^{S_2} 12^{S_3} 6^{S_4}$ terms, where $S_i$ is the number of loci
with $i$ observed peaks. This follows from the fact that there are 7, 12 and 6 possible combinations
for two, three and four alleles to be assigned to two individuals, respectively. However, this often
yields an intractable number of combinations, where only a limited number of pairwise profiles
actually have a likelihood value, $L(M \mid G_{U_1}, G_{U_2})$, large enough to have numerical impact on LR.
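The counts 7, 12 and 6 can be checked by brute-force enumeration. The following Python sketch (function name hypothetical) lists all ordered pairs of genotypes that use exactly the observed alleles of a locus, and then evaluates the number of terms for the peak pattern of Table 4.2, which has one locus with two peaks, seven loci with three peaks and two loci with four peaks.

```python
from itertools import combinations_with_replacement, product


def consistent_pairs(observed):
    """Ordered genotype pairs that use exactly the observed alleles of a locus."""
    alleles = sorted(observed)
    genotypes = list(combinations_with_replacement(alleles, 2))
    return [(g1, g2) for g1, g2 in product(genotypes, repeat=2)
            if set(g1) | set(g2) == set(alleles)]


# Number of consistent pairs for one to four observed peaks.
counts = {n: len(consistent_pairs(range(n))) for n in (1, 2, 3, 4)}

# Peaks per locus in Table 4.2: S_2 = 1, S_3 = 7, S_4 = 2.
terms = counts[2] ** 1 * counts[3] ** 7 * counts[4] ** 2
```

For the data of Table 4.2 this gives roughly nine billion profile pairs, which is why some form of pre-screening of the candidate pairs is needed.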
Table 4.2: Data stratified according to STR locus.
Locus  Dye     Allele  Height  Area
D3     Blue    15      1135    10301
D3     Blue    16      1031    9405
vWA    Blue    14      371     3365
vWA    Blue    15      921     8654
vWA    Blue    16      395     3610
vWA    Blue    17      804     7382
D16    Blue    10      485     4913
D16    Blue    11      2110    21651
D16    Blue    12      417     4304
D2     Blue    17      196     2121
D2     Blue    19      700     7713
D2     Blue    25      951     11209
D8     Green   8       774     7052
D8     Green   12      1006    9297
D8     Green   13      344     3166
D8     Green   16      291     2675
D21    Green   29      774     7152
D21    Green   30      789     7240
D21    Green   31      982     9174
D18    Green   13      593     6455
D18    Green   15      1002    10758
D18    Green   17      865     9458
D19    Yellow  13      1614    13532
D19    Yellow  14      211     1849
D19    Yellow  15      182     1647
TH0    Yellow  6       797     6894
TH0    Yellow  8       505     4334
TH0    Yellow  9       198     1751
FGA    Yellow  19      173     1606
FGA    Yellow  23      880     8720
FGA    Yellow  24      647     6682
4.3.1 Example
We illustrate that the inclusion of the quantitative peak information, Q, is important when evalu-
ating the weight of evidence in a mixture. In the example, we demonstrate the properties of our
approach when the data of Table 4.2 are observed.
In order to limit the number of profiles in LR, we applied the guidelines of Bill et al. (2005).
These guidelines evaluate each mixture using heuristic rules about peak height balances and
mixture proportions. The authors define the heterozygote balance $Hb$ as the ratio of two
non-shared peaks of an assumed heterozygous profile, and provide estimators of mixture proportions
within each locus, $\hat{M}^s_x$. If a two-person mixture is to pass the guideline criteria, it must satisfy
$3/5 \le Hb \le 5/3$ and $\hat{M}_x - 0.35 \le \hat{M}^s_x \le \hat{M}_x + 0.35$, where $\hat{M}_x = S^{-1} \sum_{s \in \mathcal{S}} \hat{M}^s_x$ is an estimate of the
overall mixture proportion. We used $\pm 0.25$ as limits on $\hat{M}^s_x$ which resulted in 860 pairs satisfying
the heuristic rules of Bill et al. (2005).
However, instead of assigning equal weight to all these pairs, we evaluate $L(M \mid G_1, G_2)$ for each
pair of profiles. As mentioned in Bill et al. (2005), this approach will not yield the correct LR
as all possible combinations should be weighted by their associated $L(M \mid G_1, G_2)$-value. This
attempt to evaluate the LR aims at including more of the available information and thus yielding
a better approximation to the actual LR, since each pair of profiles has its own weight reflecting
how well it fits the quantitative data.
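The two screening rules can be written as a short predicate. This Python sketch only illustrates the inequalities stated above; the function name and arguments are hypothetical, the per-locus mixture proportion estimates are taken as given (the estimators of Bill et al. (2005) are not reproduced here), and the mixture proportions in the example are made up. The peak heights are those of locus vWA in Table 4.2.

```python
def passes_guidelines(hb_values, mx_by_locus, tolerance=0.35):
    """Heuristic screen of one candidate pair of profiles.

    hb_values:   heterozygote balances, each the ratio of the two non-shared
                 peak heights of an assumed heterozygous profile.
    mx_by_locus: per-locus estimates of the mixture proportion.
    tolerance:   allowed deviation of each locus from the overall mean.
    """
    if not mx_by_locus:
        raise ValueError("at least one locus is required")
    balanced = all(3 / 5 <= hb <= 5 / 3 for hb in hb_values)
    mx_overall = sum(mx_by_locus) / len(mx_by_locus)
    stable = all(abs(mx - mx_overall) <= tolerance for mx in mx_by_locus)
    return balanced and stable


# vWA peaks of Table 4.2: pairing (14, 16) with (15, 17) is well balanced.
ok = passes_guidelines([371 / 395, 921 / 804], [0.30, 0.28, 0.35], tolerance=0.25)

# Pairing (14, 15) with (16, 17) gives heterozygote balances outside [3/5, 5/3].
rejected = passes_guidelines([371 / 921, 395 / 804], [0.30, 0.28, 0.35], tolerance=0.25)
```

Pairs that pass such a screen still differ in how well they explain the peaks, which is what the quantitative likelihood then measures.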
In the example, we demonstrate the effect of including the quantitative information in the
evidence evaluation for three different suspect profiles. The suspect profiles used in the example
are given in Table 4.3, together with the unknown profile $G_U$ that maximises $L(M \mid G_S, G_U)$ for
each suspect profile, $G_S$. For each suspect profile, only one of the 860 pairs of profiles satisfies
$(G_U, G_S) \equiv G$ which implies a product of $L(M \mid G_S, G_U)$ and $P(G_U)$ in the numerator for each
suspect profile, and 860 terms in the sum of the denominator of which the combination of
Minor and Major of Table 4.3, part (*) has the largest quantitative likelihood value. Throughout
the example, the main focus will be on the suspect of part (a) in Table 4.3, with comparisons to
the results obtained using the suspects of part (b) and (c).
Table 4.3: Profiles of the suspects (a)-(c), unknowns and best matching pairs of profiles (*)
in example of Section 4.3.1. For all the suspects, only one unknown matches the chosen
suspect among the 860 combinations. In loci where the suspect combination differs from the best
matching combination in part (*), allelic numbers are in bold font.
Part  Profile  D3     vWA    D16    D2     D8     D21    D18    D19    TH0  FGA
(a)   Suspect  15,16  14,16  10,12  17,17  13,16  29,31  15,15  14,15  9,9  19,19
(a)   Unknown  15,16  15,17  11,11  19,25  8,12   30,31  13,17  13,13  6,8  23,24
(b)   Suspect  15,16  14,16  10,12  17,25  13,16  30,30  17,17  14,15  6,9  19,19
(b)   Unknown  15,16  15,17  11,11  19,25  8,12   29,31  13,15  13,13  6,8  23,24
(c)   Suspect  15,16  14,16  10,12  17,25  13,16  30,30  13,15  14,15  6,9  19,23
(c)   Unknown  15,16  15,17  11,11  19,25  8,12   29,31  15,17  13,13  6,8  23,24
(*)   Minor    15,16  14,16  10,12  17,25  13,16  29,29  13,13  14,15  6,9  19,23
(*)   Major    15,16  15,17  11,11  19,25  8,12   30,31  15,17  13,13  6,8  23,24
In Fig. 4.4 and Fig. 4.5, the observed quantitative peaks are plotted together with the expected
peaks for the profiles of part (a) and (*) of Table 4.3, respectively. The expected peaks are
given by $\hat{M} = T\mu$, where $T$ and $H^{(k)}$ in $\mu_{s,k} = \alpha_s H^{(k)}$ are computed for the specific pair of
profiles. It is clear from Fig. 4.4 that the imbalances induced by the suspect combination in part
(a) imply substantial deviation from the observed data for loci D2, D21, D18, TH0 and FGA.
These are also the loci where the two pairs of profiles of part (a) and (*) in Table 4.3 differ.
First, we make a non-quantitative evaluation of the LR using only allele probabilities for the
suspect of part (a). Since there is only one combination among the 860 that includes this suspect,
the likelihood ratio is $LR = P(G_U) / \left[ \sum P(G_{U_1}) P(G_{U_2}) \right]$, where the sum in the denominator is over
the set $C_d$, but here this set consists of 860 combinations satisfying $Hb \in [3/5 ; 5/3]$ and
$\hat{M}^s_x \in [\hat{M}_x \pm 0.25]$ for computational simplicity. This yields a non-quantitative likelihood ratio, $LR_G$,
estimate of $4.527 \times 10^{13}$, which is very strong evidence in favour of the hypothesis that the suspect
is a contributor to the stain.
The dominating values of the quantitative likelihood in the numerator and denominator are given
by $L(M \mid G^{(a)}_S, G^{(a)}_U) = 5.9 \times 10^{-119}$ and $L(M \mid G^{(*)}_{U_1}, G^{(*)}_{U_2}) = 5.57 \times 10^{-100}$ respectively. A large
difference in the quantitative likelihood values was expected from the difference in fit to the observed
peaks pictured in Figs. 4.4 and 4.5. Thus, including the quantitative evidence, the quantitative
likelihood ratio estimate, $LR_{GQ}$, decreased by a factor $10^{17}$ to $7.63 \times 10^{-4}$ which is strongly in
favour of the suspect not having contributed to the stain.
[Figure 4.4: three panels, one per fluorescent dye, showing the observed alleles of loci D3, vWA, D16, D2, D8, D21, D18, D19, TH0 and FGA; legend: expected peak, observed peak.]
Figure 4.4: Observed and expected peaks assuming a two-person mixture of the suspect
and unknown in Table 4.3, part (a). Abscissa: Basepair (bp) values computed using the allelic
number and STR locus, Ordinate: Peak heights in rfu.
Table 4.4: Likelihood ratios for the three different suspects in Table 4.3. Here, $LR_G$ and $LR_{GQ}$
denote the non-quantitative and quantitative likelihood ratios, respectively, and $LR_{GQ}/LR_G$ is
the relative change in the weight of the evidence. The allele frequencies used in the calculations
were provided by The Section of Forensic Genetics, University of Copenhagen.
             $LR_G$                  $LR_{GQ}$               $LR_{GQ}/LR_G$
Suspect (a)  $4.527 \times 10^{13}$  $7.630 \times 10^{-4}$  $1.685 \times 10^{-17}$
Suspect (b)  $4.216 \times 10^{13}$  $5.185 \times 10^{8}$   $1.230 \times 10^{-5}$
Suspect (c)  $3.596 \times 10^{13}$  $9.744 \times 10^{13}$  $2.710$
Together with similar computations for the suspects of parts (b) and (c), this information is given
in Table 4.4. Here, we see that for suspects of part (b) and (c), the change in the weight of
evidence is a moderate decrease and small increase, respectively. Note that part (b) differs from
the best matching pair of profiles in three loci (D21, D18 and FGA) and part (c) in the two loci
D21 and D18.
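The relative changes reported in Table 4.4 can be recomputed directly from the two likelihood ratio columns. The short Python sketch below does only this arithmetic; the dictionary name is hypothetical and the values are the table entries for suspects (a), (b) and (c), with the exponents read as $10^{13}$, $10^{-4}$ and $10^{8}$.

```python
# (LR_G, LR_GQ) per suspect, as reported in Table 4.4.
table_4_4 = {
    "a": (4.527e13, 7.630e-4),
    "b": (4.216e13, 5.185e8),
    "c": (3.596e13, 9.744e13),
}

# Relative change in the weight of evidence when the peak information is included.
relative_change = {suspect: lr_gq / lr_g
                   for suspect, (lr_g, lr_gq) in table_4_4.items()}

for suspect, change in sorted(relative_change.items()):
    print(f"Suspect ({suspect}): LR_GQ / LR_G = {change:.3e}")
```

The ordering of the three ratios mirrors the number of loci in which each suspect pair departs from the best matching pair: five loci for (a), three for (b) and two for (c).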
[Figure 4.5: three panels, one per fluorescent dye, showing the observed alleles of loci D3, vWA, D16, D2, D8, D21, D18, D19, TH0 and FGA; legend: expected peak, observed peak.]
Figure 4.5: Observed and expected peaks assuming a two-person mixture of the minor
and major profiles in Table 4.3, part (*). Abscissa and ordinate as in Fig. 4.4.
The non-quantitative likelihood ratio estimates, $LR_G$, of Table 4.4 will in many legal systems
point towards conviction of any of the suspects. When including the quantitative information,
we see that the change in the weight of evidence may add further to the evidence against the
suspect (as in part (c)), or may decrease the likelihood ratio estimate such that it provides strong
evidence in favour of the suspect (part (a)). Situations in between these two extremes will also
occur (part (b)). This example shows that, even when a person's genotype matches
the genetic stain, imbalanced STR DNA profiles judged by the observed quantitative data may
speak strongly in favour of the suspect. However, weighing each pair of genotypes by the
associated quantitative likelihood value may add further to the evidence against the suspect when the
suspect's profile only causes a few or small imbalances with respect to the observed peaks.
4.4 Parameter estimation
The EM-algorithm and the specific expressions as derived in Appendix 4.B were implemented
in the statistical software package R (R Development Core Team, 2009). In order to validate the
implementation, we simulated peak area data given the peak heights from controlled experiments
and known model parameters. After 30,000 iterations, the parameter estimates were close to the
true values indicating a successful implementation of the fitting algorithm.
In order to estimate the model parameters, we used a training set consisting of results of
investigations of DNA mixtures from 71 controlled experiments conducted at The Section of Forensic
Genetics, University of Copenhagen. These 71 cases were chosen such that all alleles from
the contributing profiles were present in the data, i.e. no drop-out events occurred (see
Tvedebrink et al., 2009, for discussion on allelic drop-out). The algorithm was executed using several
different sets of initial values. For each set, we ran 30,000 iterations of the EM-algorithm, all
converging to the same parameter estimates.
In order to monitor the convergence of the EM-algorithm, we computed the deviance after each
iteration. After 1,100 iterations, the absolute improvement for successive deviances was less
than 0.01.
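The stopping criterion based on successive deviances can be sketched generically. The Python code below is not the EM-algorithm of Appendix 4.B: the function name is hypothetical, and the update and deviance functions in the example are a toy fixed-point iteration chosen only to show how the change in deviance is monitored.

```python
def iterate_until_converged(update, deviance, theta, tol=0.01, max_iter=30000):
    """Repeat `update` until successive deviances differ by less than `tol`.

    update:   function mapping the current parameter value to the next one
              (one full iteration of the fitting algorithm).
    deviance: function returning the deviance at a parameter value.
    Returns the final parameter value and the number of iterations used.
    """
    previous = deviance(theta)
    for iteration in range(1, max_iter + 1):
        theta = update(theta)
        current = deviance(theta)
        if abs(previous - current) < tol:
            return theta, iteration
        previous = current
    return theta, max_iter


# Toy example: the update contracts towards 2 and the "deviance" is the
# squared distance to that fixed point.
theta_hat, n_iter = iterate_until_converged(
    update=lambda t: 0.5 * t + 1.0,
    deviance=lambda t: (t - 2.0) ** 2,
    theta=10.0,
)
```

A small change in deviance does not by itself guarantee that the parameters have stabilised, which is one reason for running far more iterations than the criterion requires and for comparing several sets of initial values.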
In the $\tilde{\Sigma}$ part of Table 4.5, the shading shows the locus correlations, $\sigma_{st}/\sqrt{\sigma_{ss}\sigma_{tt}}$, while the
above-diagonal part shows the locus covariances, $\sigma_{st}$, when $\delta_s = 0$ (see Section 4.5.2). Most of the
loci were highly correlated. This indicates that evaluation of quantitative DNA evidence with
the assumption of independence across loci is a considerable simplification.
The different signal intensities of the fluorescent dyes were also identifiable in the parameter
estimates. The strong signals of the green dye band and the weaker signals of the yellow dye
band (Butler, 2005) were reflected in the parameter estimates of $\alpha_s$. In Table 4.5, we see that the
magnitude of the $\alpha_s$'s of the yellow fluorescence was smaller than that of the blue fluorescence,
which again was smaller than that of the green fluorescence (except for loci D16 and D21).
In addition to the parameter estimates and deviance, we also computed the asymptotic variances
of the estimates by the normality approximation of the MLE with the inverse Fisher Information
as covariance matrix. We found that the estimated standard deviation of both $\alpha$ and $\tau^2$ indicated
reasonably good estimates of these parameters. Large asymptotic standard deviations of did,
however, indicate the possibility of model reductions.
4.5 Discussion
4.5.1 Validity of the hypothesis of a two-person mixture
When analysing the STR results of a crime scene stain, we need to be able to determine whether
the stain is likely to originate from a two-person mixture or not. In this section, we
demonstrate how this is possible using our model for the quantitative STR DNA data. In order to
verify the hypothesis of a given two-person mixture, we simulated 1,000 vectors of peak areas,
$M^*_1, \dots, M^*_{1000}$, for each of the 71 cases from the controlled experiments.
Simulations of the peak areas were conditioned on the observed peak heights and true profiles of
the mixture, and we used the parameter estimates from Table 4.5. This corresponds to simulating
under a null hypothesis with the $T$-matrix, $H = (H^{(1)}, H^{(2)})$ and $h$ known together with fixed
parameters $\alpha$, $\tau^2$ and $\tilde{\Sigma}$, i.e. assuming that the stain originates from a two-person mixture.
Table 4.5: Parameter estimates after 30,000 iterations of the EM-algorithm with $\psi = 0$ (Section
4.5.2). The $\Omega$-matrix shows the covariances $\omega_{st}$ (on and above the diagonal) and the
correlations $\omega_{st}/\sqrt{\omega_{ss}\omega_{tt}}$ (below the diagonal; shaded in the original).

Dye        Yellow   Yellow   Yellow   Blue     Blue     Blue     Blue     Green    Green    Green
Locus      FGA      TH0      D19      D2       vWA      D3       D16      D21      D8       D18
$\Omega$-matrix
FGA        1151.54  773.91   1441.00  1492.01  1044.48  857.46   1305.36  1033.50  397.34   1461.59
TH0        0.75     925.69   1042.03  1090.16  587.58   664.55   527.21   654.26   582.88   1085.37
D19        0.94     0.76     2052.83  2151.76  1319.49  1050.95  1716.85  1279.93  619.22   1964.68
D2         0.88     0.72     0.95     2481.70  1438.36  1237.07  1848.14  1354.94  765.35   2077.81
vWA        0.94     0.59     0.89     0.88     1082.10  821.78   1339.64  975.91   340.35   1449.91
D3         0.86     0.74     0.79     0.85     0.85     864.07   954.08   776.64   536.56   1142.77
D16        0.88     0.40     0.87     0.85     0.93     0.74     1915.78  1201.83  246.50   1653.91
D21        0.99     0.70     0.92     0.88     0.96     0.86     0.89     952.27   380.72   1353.25
D8         0.41     0.67     0.48     0.54     0.36     0.64     0.20     0.43     821.00   750.02
D18        0.92     0.76     0.93     0.89     0.94     0.83     0.81     0.94     0.56     2196.65
$\gamma_s$ 5.53     5.99     6.15     7.01     7.64     8.25     9.10     8.92     10.19    10.18
$\tau_s^2$ 2596.53  730.29   1002.93  1236.31  1146.78  1331.79  1821.04  1797.39  1854.94  3208.73
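The relation between the two triangles of the covariance matrix in Table 4.5 can be checked directly from the reported numbers. The following is a minimal sketch, not part of the original analysis; the function name `correlation` is an assumption made for illustration, and only the FGA, TH0 and D19 entries are used.

```python
import math


def correlation(cov_st, var_s, var_t):
    """Correlation omega_st / sqrt(omega_ss * omega_tt) from covariance entries."""
    return cov_st / math.sqrt(var_s * var_t)


# Variances (diagonal) and covariances (above the diagonal) as reported in Table 4.5.
var = {"FGA": 1151.54, "TH0": 925.69, "D19": 2052.83}
cov = {("FGA", "TH0"): 773.91, ("FGA", "D19"): 1441.00, ("TH0", "D19"): 1042.03}

rounded = {
    pair: round(correlation(c, var[pair[0]], var[pair[1]]), 2)
    for pair, c in cov.items()
}
print(rounded)
```

The rounded values agree with the corresponding below-diagonal entries of the table (0.75, 0.94 and 0.76).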
For each of the simulated peak area vectors, $M^*_i$, we found the pair of profiles maximising the
likelihood, $\hat{G}_i = (\hat{G}_{i1}, \hat{G}_{i2})$, using the approach of Tvedebrink et al. (2010, Chapter 5 of this
thesis) and computed $T$ and $H$ associated with $\hat{G}_i$. Using these quantities, we can determine the
Mahalanobis distance,

$$M_d(M^*_i, \hat{G}_i) = (M^*_i - \hat{M}_{\hat{G}_i})^\top \, \mathrm{Var}(M_{\hat{G}_i})^{-1} \, (M^*_i - \hat{M}_{\hat{G}_i}), \qquad (4.2)$$

where $\hat{M}_{\hat{G}_i}$ and $\mathrm{Var}(M_{\hat{G}_i})$ are the expected peak areas and variance, respectively, assuming a
mixture of $\hat{G}_i$. If $\hat{G}_i$ were equal to the true profiles of the mixture, then $M_d$ would follow a
$\chi^2_n$-distribution with $n$ being the number of observations in the mixture. However, the true mixture
profiles may not always be identical to the pair of profiles maximising the likelihood. This may
be due to stochastic variations and systematic components, e.g. stutter and pull-up effects.
Stutter is caused by artefacts in the polymerase chain reaction resulting in an increase of peak
intensities, typically in the allelic position before the true allele. Pull-up effects are manifested
as an increase of the true peaks caused by overlap of the spectra of the light emitted from the
various fluorochromes, which are detected by a CCD camera in the data generating process
(Butler, 2005). Hence, on average we expect $M_d$ for $\hat{G}_i$ to be smaller than for the true profiles,
which implies fewer degrees of freedom in the $\chi^2$-distribution. Fig. 4.6 shows a histogram
of 1,000 simulated Mahalanobis distances for the data given in Table 4.2. The superimposed
curves indicate that the expectation of fewer than $n$ degrees of freedom for the $\chi^2$-distribution is
reasonable, where $n = 31$ in this example. The hypothesis that the Mahalanobis distance follows
a $\chi^2_{29}$-distribution is supported by a Kolmogorov-Smirnov test (p-value of 0.2410), whereas both
30 and 31 degrees of freedom are rejected (p-values are 0.0307 and $1.966 \times 10^{-8}$, respectively).
In crime casework the DNA may be degraded or partly degraded, which implies that results
only are obtained for short STR loci/alleles (loci/alleles with low base pair numbers), but not (or
weak results) with longer STR loci/alleles (loci/alleles for high base pair numbers). This is a
potential problem since this is not incorporated in the model due to the assumptions on inter-loci
correlation.
However, the Mahalanobis distance $M_d$ in (4.2) can be decomposed into two parts evaluating the
quality of the sample, $M_d^{(q)}$ in (4.3), and the goodness of fit of a proposed mixture $G = (G_1, G_2)$
of two profiles, $M_d^{(m)}$ in (4.4). Let $\Sigma_{M|M_+} = \mathrm{Var}(M_G \mid M_+)$ and $\Sigma_{M_+} = \mathrm{Var}(M_{+,G})$, then

$$M_d^{(q)}(M, G) = (M_+ - \mu_{+,G})^\top \, \Sigma_{M_+}^{-1} \, (M_+ - \mu_{+,G}), \qquad (4.3)$$

$$M_d^{(m)}(M, G) = (M - \mu_{G|+})^\top \, \Sigma_{M|M_+}^{-} \, (M - \mu_{G|+}), \qquad (4.4)$$

where $M_+$ is the vector of loci peak area sums and $\mu_{G|+}$ ($\mu_{+,G}$) are the expected peak areas
(sums) conditioned on the loci sums for profiles $G$. The reason for this decomposition follows
from the normality assumption, where $f(M) = f_{M|+}(M \mid M_+) \, f_+(M_+)$, which in density
functions yields

$$|\Sigma_M|^{-1/2} \, e^{-\frac{1}{2} M_d(M,G)} = |\Sigma_{M|M_+}|^{-1/2} \, |\Sigma_{M_+}|^{-1/2} \, e^{-\frac{1}{2}\left\{ M_d^{(m)}(M,G) + M_d^{(q)}(M,G) \right\}},$$

where $\Sigma_M = \mathrm{Var}(M_G)$. We note that $M \mid M_+$ is a distribution restricted to the affine subspace
with fixed peak area sums.
Figure 4.6: Histogram of Mahalanobis distances for simulations based on data from Table 4.2.
Superimposed are a $\chi^2_{31}$-distribution (solid), Gaussian based kernel density estimate (dashed) and
a $\chi^2_{29}$-distribution (dotted).
Since $|\Sigma_M|^{-1/2} = |\Sigma_{M|M_+}|^{-1/2} \, |\Sigma_{M_+}|^{-1/2}$, taking $-2\log$ on both sides of the equation gives the
decomposition of the Mahalanobis distance (4.2) into the two parts (4.3) and (4.4). Both Mahalanobis
distances, $M_d^{(q)}(M, G)$ and $M_d^{(m)}(M, G)$, follow $\chi^2$-distributions with $S$ and $n - S$ degrees of
freedom, respectively.
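The additivity of the two parts can be illustrated numerically in the smallest possible setting: a single locus ($S = 1$) with two peaks ($n = 2$). The sketch below is only an illustration under that assumption; the function names and the numbers are hypothetical, and the conditional part is evaluated through the first peak given the locus sum, which is one way of handling the degenerate conditional distribution.

```python
def mahalanobis_2d(x, mu, cov):
    """Squared Mahalanobis distance of a 2-vector x from mu (symmetric 2x2 covariance)."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form written out with the explicit inverse of [[a, b], [b, d]].
    return (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det


def decompose(x, mu, cov):
    """Split the distance into a part for the locus sum and a part given the sum."""
    (a, b), (_, d) = cov
    resid_sum = (x[0] + x[1]) - (mu[0] + mu[1])
    var_sum = a + 2.0 * b + d              # Var(M_1 + M_2)
    quality = resid_sum ** 2 / var_sum     # analogue of (4.3), S = 1 degree of freedom

    cov_1_sum = a + b                      # Cov(M_1, M_1 + M_2)
    cond_mean = mu[0] + cov_1_sum / var_sum * resid_sum
    cond_var = a - cov_1_sum ** 2 / var_sum
    mixture = (x[0] - cond_mean) ** 2 / cond_var   # analogue of (4.4), n - S = 1
    return quality, mixture


# Hypothetical peak areas, expected values and covariance for one locus.
x, mu = (12.0, 7.0), (10.0, 8.0)
cov = ((4.0, 1.0), (1.0, 3.0))
total = mahalanobis_2d(x, mu, cov)
quality, mixture = decompose(x, mu, cov)
print(total, quality + mixture)
```

Both printed numbers coincide up to rounding error, which mirrors the decomposition of (4.2) into (4.3) and (4.4).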
In Fig. 4.7, we have plotted histograms of the p-values for $M_d^{(m)}$ and $M_d^{(q)}$ for 66 real crime cases
made available by The Section of Forensic Genetics, University of Copenhagen. In none of the cases
are the contributors known for certain. However, the circumstances of the crime cases made
a victim and suspect profile available for each case. The two profiles matched and completely
explained the mixed profile of the stain.
The left panel shows the histogram of the p-values from $M_d^{(m)}$ assessing how well the proposed
pair of profiles matched the mixture given the assumptions of the model. The histogram of the
p-values indicated that the model is applicable to STR results in real crime cases, since large
p-values, or equivalently small Mahalanobis distances, imply that $H_p$ is supported by the evidence.
The right panel of Fig. 4.7 shows that more than half (35 cases) of the p-values from the test of
the sample quality were less than 0.01. This indicates that most of the crime case samples had
been subject to degradation of the DNA material. Degradation of the DNA often complicates
the interpretation of DNA mixtures. It is worth emphasising that imbalances caused by degraded
DNA may imply that no pair of profiles has $M_d \le \chi^2_{n,(1-\alpha)}$, where $\chi^2_{k,(1-\alpha)}$ is the critical value
on significance level $\alpha$ (e.g. $\alpha = 0.01$) for a $\chi^2_k$-distributed variable. However, conditioned on
the loci sums, such imbalances do not affect the evaluation of a particular pair of profiles, i.e.
$M_d^{(m)} \le \chi^2_{n-S,(1-\alpha)}$ is possible.
Figure 4.7: Histogram of p-values of the Mahalanobis distances of 66 crime cases in which we
had found the pair of profiles maximising the likelihood. For these profiles, we have decomposed
the overall Mahalanobis distance $M_d$ into $M_d^{(m)}$ and $M_d^{(q)}$.
In order to investigate whether an observed stain may originate from a two-person mixture, the
evaluation of $M_d^{(m)}(M, \hat{G})$ needs to be less than $\chi^2_{n-S,(1-\alpha)}$. If this is not the case for the observed
stain, it may be a mixture of more than two contributors or the results are strongly influenced by
DNA degradation, drop-outs, stutters, pull-up effects, etc. With $M_d^{(m)}(M, \hat{G}) \le \chi^2_{n-S,(1-\alpha)}$, it is
plausible for the observed stain to be a mixture of two individuals since, for the pair of profiles
maximising the likelihood, the conditional Mahalanobis distance is sufficiently small. Then the
quality of the sample may be investigated by evaluating $M_d^{(q)}$ and observing if it falls above
the critical value $\chi^2_{S,(1-\alpha)}$, e.g. $\alpha = 0.01$. If so, this indicates unexpected imbalances between
loci, which may be due to e.g. degraded DNA, inhibitors affecting only certain loci or allelic
drop-outs.
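The two-step assessment described above can be sketched as follows. This is a minimal illustration rather than the implementation used for the thesis: the function names, the example distances and the default $\alpha = 0.01$ are assumptions, and the $\chi^2$ tail probability is computed with the standard series and continued-fraction expansions of the regularised incomplete gamma function so that only the Python standard library is needed. Comparing a p-value with $\alpha$ is equivalent to comparing the distance with the critical value.

```python
import math


def chi2_sf(x, k):
    """Upper tail probability P(X > x) for X chi-square distributed with k degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = k / 2.0, x / 2.0
    log_prefactor = -z + a * math.log(z) - math.lgamma(a)
    if z < a + 1.0:
        # Series expansion of the lower regularised incomplete gamma function.
        term = 1.0 / a
        total = term
        n = a
        for _ in range(10000):
            n += 1.0
            term *= z / n
            total += term
            if abs(term) < abs(total) * 1e-15:
                break
        return 1.0 - total * math.exp(log_prefactor)
    # Continued fraction (modified Lentz method) for the upper part.
    tiny = 1e-300
    b = z + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, 10000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return h * math.exp(log_prefactor)


def assess_stain(md_mixture, md_quality, n, S, alpha=0.01):
    """Compare the two distances with their chi-square reference distributions."""
    p_mixture = chi2_sf(md_mixture, n - S)   # goodness of fit of the best pair of profiles
    p_quality = chi2_sf(md_quality, S)       # balance between loci
    return {
        "two_person_plausible": p_mixture >= alpha,
        "quality_flag": p_quality < alpha,
        "p_mixture": p_mixture,
        "p_quality": p_quality,
    }


# Hypothetical distances for a stain with n = 31 peaks on S = 10 loci.
result = assess_stain(md_mixture=25.0, md_quality=40.0, n=31, S=10)
print(result)
```

In this hypothetical example the conditional distance is compatible with a two-person mixture, while the quality part is flagged, which is the pattern reported for the degraded crime case samples.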
4.5.2 Model reductions
When fitting the parameters of the model, we found for our specific data set that the additional
variance components, $\psi_s$, $s \in \mathcal{S}$, were infinitesimally small compared to the contributions of $\omega_{ss}$.
A $\chi^2$-test indicated that the goodness of fit was not significantly improved by this parameter.
Hence, the results reported in Table 4.5 corresponded to the model with $\psi_s = 0$ for all $s \in \mathcal{S}$.
Investigations showed that further reduction of the covariance structure was not supported by the
data (see Appendix 4.C for more details).
4.6 Conclusion
In the example of Section 4.3.1, the usual evaluation of the likelihood by considering
$LR_G = P(G|H_p)/P(G|H_d)$ gave a likelihood ratio supporting the $H_p$-hypothesis with a likelihood ratio
larger than $10^{13}$. However, when including the quantitative information, the weight of evidence
was decreased to a likelihood ratio, $LR_{GQ}$, less than one. This was true even with limits of
0.25 for the mixture proportion balances in the setup of Bill et al. (2005). The likelihood ratio
without taking the quantitative information into account corresponded to the situation where all
combinations passing the guidelines of Bill et al. (2005) were given identical weights. Hence,
excluding possible combinations from entering the likelihood ratio based on the quantitative
information was not sufficient for an accurate estimate of the likelihood ratio based on quantitative
information.
For cases where the qualitative results strongly support that the suspect contributed to a mixed
stain, the inclusion of the quantitative information may further support the conclusion. Conversely,
the likelihood ratio may decrease, supporting the $H_d$-hypothesis. Both situations were
demonstrated by the example of Section 4.3.1. Hence, the evaluation of the quantitative
information using a statistical model is of great importance in order to assess the weight of evidence
obtained from DNA mixtures.
The model derived in this paper incorporates both information on qualitative traits (STR alleles)
and on quantitative aspects of the STR alleles (peak heights and areas). Graphical diagnostics
(not included in this manuscript) indicate that the model is well suited for the evaluation
of $P(Q|G, H)$. Furthermore, assuming independence of the peak areas of the various STR loci is a
simplification that cannot be supported by the work carried out in this paper. Hence, inter-locus
correlations or other means of correction need to be considered when assessing the weight of
evidence from quantitative data in forensic DNA STR settings.
The concordance between the model properties and prior knowledge of differences in amplification
efficiency of various STR loci and in emission intensities of various fluorescent dyes adds
further support to the model.
The model described in the present paper is also applicable in other fields of science. A useful
property is the handling of variable dimension of the observations while exploiting compound
symmetries (Votaw, 1948). For example, similar problems with modelling covariance structures
may arise in animal breeding studies, where the litter size varies and offspring may be related
through the same breeding lines.
Appendices
4.A The model
In this section, we provide more mathematical details than given in Section 4.2.2. The model
assumes proportionality of the mean and variance of $A \sim N_{4S}(\mu, \Sigma)$. The covariance, $\Sigma$, is a
diagonal matrix with elements $\tau_s^2 H^{(k)}$ and $\mu$ is a vector partitioned in a similar way with the
element $\gamma_s H^{(k)}$ for both peak areas associated with locus $s$ and person $k$.
The observable peak area measurements, $M$, were defined as a linear transformation, $T$, such
that $M = TA + \varepsilon$. In order to model the proportionality of the mean and variance of $M$, we
defined the scaled residuals $\nu = (\varepsilon_i/\sqrt{h_i})_{i=1}^{n}$, where $n = \sum_{s \in \mathcal{S}} n_s$. For $\nu$, we assumed a compound
symmetry covariance matrix $C_\nu$ (Votaw, 1948). Since $\varepsilon = \mathrm{diag}(h)^{1/2}\nu$, the covariance of
$\varepsilon$ is $\mathrm{Cov}(\varepsilon) = C_\varepsilon = \mathrm{diag}(h)^{1/2} \, C_\nu \, \mathrm{diag}(h)^{1/2}$. We parametrised the covariance, $C_\nu$, as an additive
structure using $\Omega = \{\omega_{st}\}_{s,t \in \mathcal{S}}$ and $\psi = (\psi_s)_{s \in \mathcal{S}}$, such that

$$\mathrm{Cov}(\nu_s, \nu_t) = \omega_{st} \, 1_{n_s} 1_{n_t}^\top + \delta_{st} \, \psi_s \, I_{n_s},$$

where $\nu_s$ are the scaled residuals of locus $s$, $1_k$ is a $k$-dimensional vector of ones, and $\delta_{st}$ is the
Kronecker delta. For implementation of the EM-algorithm, we need the conditional distribution
of $A|M$. Using Lauritzen (1996, Proposition C.5), this is

$$A|M \sim N_{4S}\left( \mu + \Sigma T^\top (T \Sigma T^\top + C_\varepsilon)^{-1} (M - T\mu), \; \Sigma - \Sigma T^\top (T \Sigma T^\top + C_\varepsilon)^{-1} T \Sigma \right). \qquad (4.5)$$

The model for $M$ corresponds to a linear mixed effects model:

$$M = X\beta + Z(\varepsilon_1, \varepsilon_2)^\top, \quad \text{where } \varepsilon_1 \sim N(0, \mathrm{diag}(1_4 \tau_s^2)_{s \in \mathcal{S}}) \text{ and } \varepsilon_2 \sim N(0, C_\varepsilon) \qquad (4.6)$$

for some case specific design matrices $X$ and $Z$. However, estimation of the variance components
is complicated due to the varying dimensions of $M$ and $M_s$, $s \in \mathcal{S}$, from case to case.
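The conditional distribution in (4.5) is the usual Gaussian conditioning formula. A scalar special case may help to read it: one unobserved area, one measurement, and all matrices replaced by numbers. The sketch below is only an illustration under that simplification; the function name and all numerical values are hypothetical.

```python
def conditional_moments(mu, sigma2, t, c, m):
    """Mean and variance of A given M = m, for A ~ N(mu, sigma2) and M = t*A + eps, eps ~ N(0, c)."""
    var_m = t * sigma2 * t + c        # scalar version of T Sigma T' + C
    gain = sigma2 * t / var_m         # scalar version of Sigma T' (T Sigma T' + C)^{-1}
    cond_mean = mu + gain * (m - t * mu)
    cond_var = sigma2 - gain * t * sigma2
    return cond_mean, cond_var


# Hypothetical numbers: prior mean 10, prior variance 4, unit transformation,
# measurement variance 1 and an observed measurement of 12.
cond_mean, cond_var = conditional_moments(mu=10.0, sigma2=4.0, t=1.0, c=1.0, m=12.0)
print(cond_mean, cond_var)
```

The conditional mean moves from the prior mean towards the observation, and the conditional variance is smaller than the prior variance, as expected from (4.5).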
4.B EM-estimators
In order to handle the complete structure of $A$ that includes the missing data problem, we used
the EM-algorithm to impute the unobservable data. However, since the dimensions of $M$ and
sub-vectors hereof varied from case to case, we obtained a likelihood that was not very well
suited for implementation of the EM-algorithm. This was due to the dependence on $n_s$ in the
covariance of the locus-wise average of the scaled residuals $\bar{\nu} = (\bar{\nu}_1, \ldots, \bar{\nu}_S)$,

$$\mathrm{Cov}(\bar{\nu}) = \mathrm{diag}(\psi_s/n_s)_{s \in \mathcal{S}} + \Omega = \mathrm{diag}(\psi/n) + \Omega,$$

where $n = (n_s)_{s \in \mathcal{S}}$ and the vector division is done component-wise, $x/y = (x_i/y_i)_{i=1}^{n}$.
The problem was solved using appropriate auxiliary variables $v$ and $u$, which we assumed to
be independent and zero-mean normal distributed variables with covariances $\Omega$ and $\mathrm{diag}(\psi/n)$,
respectively. By introducing $v$ and $u$, we obtained a likelihood of a full exponential family,
where the estimation of $\Omega$ and $\psi$ may be done separately. The use of auxiliary variables is
equivalent to adding constraints on the diagonal elements of $\Omega$. By assuming $\mathrm{Cov}(v) = \Omega$, we get
the constraint that $\omega_{ss} > 0$, $s \in \mathcal{S}$. In (4.6), this corresponds to splitting $\varepsilon_2$ into two independent
parts $\varepsilon_{21}$ and $\varepsilon_{22}$, $\varepsilon_2 = \varepsilon_{21} + \varepsilon_{22}$, where $\varepsilon_{21} \sim N(0, Q_c \Omega Q_c^\top)$ and
$\varepsilon_{22} \sim N(0, \mathrm{diag}(\psi_s 1_{n_s})_{s \in \mathcal{S}})$ with $Q_c$ defined in (4.7).
Hence, the E-step consisted of imputing $A$, $u$ and $v$ given the observations $M$. In the M-step,
we used that the full likelihood factorises into two terms modelling the biological part of the data
given the measurement noise, $(A, M)|(u, v, \nu)$, and the noise, $(u, v, \nu)$, respectively:

$$f(A, M, u, v, \nu; \gamma, \tau^2, \psi, \Omega \mid H, h) = g(A, M; \gamma, \tau^2 \mid u, v, \nu, H, h) \, h(u, v, \nu; \psi, \Omega \mid H, h)$$

with $g$ and $h$ being the density functions of the two multivariate normal distributions below:

$$g: \begin{pmatrix} A \\ M \end{pmatrix} \sim N\left( \begin{pmatrix} \mu \\ T\mu + \varepsilon \end{pmatrix}, \begin{pmatrix} \Sigma & \Sigma T^\top \\ T\Sigma & T \Sigma T^\top \end{pmatrix} \right)$$

$$h: \begin{pmatrix} u \\ v \\ \nu \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \mathrm{diag}(\psi/n) & O & \mathrm{diag}(\psi/n) Q_c^\top \\ O & \Omega & \Omega Q_c^\top \\ Q_c \, \mathrm{diag}(\psi/n) & Q_c \Omega & C_\nu \end{pmatrix} \right),$$

where $Q_c$ is defined in (4.7). In order to derive the estimators of the parameters entering the
functions $g$ and $h$, we defined two matrices $Q$ and $Q_c$,

$$Q = \begin{pmatrix} 1_4 & \cdots & O \\ \vdots & \ddots & \vdots \\ O & \cdots & 1_4 \end{pmatrix} \quad \text{and} \quad Q_c = \begin{pmatrix} 1_{n_{1c}} & \cdots & O \\ \vdots & \ddots & \vdots \\ O & \cdots & 1_{n_{Sc}} \end{pmatrix}, \qquad (4.7)$$

where subscript $c$ refers to case $c$, $c = 1, \ldots, C$. Furthermore, the DNA proxy $H = (H^{(1)}, H^{(2)})$ is
expanded to a $4S$-dimensional vector, $H = (H_s)_{s \in \mathcal{S}}$, where the components $H_s$ are fixed for all
loci, $H_s = (H^{(1)}, H^{(1)}, H^{(2)}, H^{(2)})$. Note that the compound symmetry structure of the covariance
of $\nu$ with $\psi = 0$ can be written as $C_\nu = Q_c \Omega Q_c^\top$. The estimators of $\gamma$ and $\tau^2$ can be found as

$$\hat{\gamma} = \frac{\sum_c Q^\top E(A_c|M_c)}{\sum_c Q^\top H_c}$$

$$\hat{\tau}^2 = (4C - 1)^{-1} \sum_c Q^\top \left[ \frac{\{E(A_c|M_c) - \hat{\mu}_c\}^2 + \mathrm{diag}\{\mathrm{Cov}(A_c|M_c)\}}{H_c} \right],$$

where the squaring of a vector is done component-wise, $x^2 = (x_i^2)_{i=1}^{n}$, and $\mathrm{diag}\{B\}$ extracts the
diagonal vector of $B$, $\mathrm{diag}\{B\} = (B_{ii})_{i=1}^{n}$. Furthermore, the moments of $A_c|M_c$ are given in (4.5).
The estimators of $\psi = (\psi_s)_{s \in \mathcal{S}}$ and $\Omega$ are

$$\hat{\psi}_s = n_{s+}^{-1} \sum_c \left[ E(u_{sc}^2|M_c) \, n_{sc} + E(\nu_s^\top \nu_s - n_s \bar{\nu}_s^2 \mid M_c) \right]$$

$$\hat{\Omega} = C^{-1} \sum_c \left[ E(v_c|M_c) E(v_c|M_c)^\top + \mathrm{Cov}(v_c|M_c) \right].$$

For both $v$ and $u$, the covariance with $M$ is expressed as $\mathrm{Cov}(x, M) = \mathrm{Cov}(x) \, Q_c^\top \, \mathrm{diag}(h)^{1/2}$,
for $x$ replaced by $v$ or $u$. The conditional moments entering the estimation
equations may be found using the formulae for computing conditional moments in the multivariate
normal distribution, $E(X|Y) = \mu_X + \Sigma_{12} \Sigma_{22}^{-1} (Y - \mu_Y)$ and $\mathrm{Cov}(X|Y) = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}$
for $(X, Y)^\top \sim N\left( (\mu_X, \mu_Y)^\top, \Sigma \right)$ with
$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$ (Lauritzen, 1996, Proposition C.5).
4.C Model reduction
As mentioned in Section 4.4, the large asymptotic standard deviations indicated that the covariance
structure of $C_\nu$ could be simplified. The estimated $\psi$ parameters for nearly all loci were
negligible compared to $\omega_{ss}$. Let $\mathrm{Diag}(A_i)_{i=1}^{n}$ be a block-diagonal matrix with matrices $A_i$, $i = 1, \ldots, n$
as elements and the square root of a vector defined as $\sqrt{x} = (\sqrt{x_i})_{i=1}^{n}$. Then, we may write the
covariance matrix of $M$, $\Sigma_M$, as:

$$\Sigma_M = T \Sigma T^\top + C_\varepsilon = \mathrm{Diag}\left( \tau_s^2 \, T_s \, \mathrm{diag}(H_s) \, T_s^\top + \psi_s \, \mathrm{diag}(h_s) \right)_{s \in \mathcal{S}} + \left( \omega_{st} \sqrt{h_s} \sqrt{h_t}^\top \right)_{s,t \in \mathcal{S}}.$$

From the equation above, we see that setting $\psi = 0$ does not introduce any singularities in $\Sigma_M$.
Hence, the asymptotic theory is not violated. In order to test whether $\psi$ was statistically
significant, we used an approximately $\chi^2$-distributed test-statistic with the difference in parameters as
degrees of freedom (Cox and Hinkley, 1974). In the full model, there were $S(S + 3)/2$ parameters.
By restricting $\psi = 0$, we removed $S$ parameters and the $\chi^2_S$-test yielded a p-value of 0.9999
supporting the hypothesis of $\psi = 0$. The reported parameter estimates in Table 4.5 were based
on this restricted model.
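Data exploration and the estimated parameters of $\Omega$ from Table 4.5 suggest that further model
reductions may be feasible. Possible parametrisations of $C_\nu$ may be,

$$\mathrm{Cov}(\nu_s, \nu_t) = \omega_{d(s),d(t)} \, 1_{n_s} 1_{n_t}^\top + \delta_{st} \, \psi_s \, I_{n_s} \qquad (4.8)$$

$$\mathrm{Cov}(\nu_s, \nu_t) = \omega_{d(s),d(t)} \, 1_{n_s} 1_{n_t}^\top + \delta_{d(s)d(t)} \, \psi_{d(s)} \, I_{n_s} \qquad (4.9)$$

$$\mathrm{Cov}(\nu_s, \nu_t) = \omega \, 1_{n_s} 1_{n_t}^\top + \delta_{st} \, \psi_s \, I_{n_s}, \qquad (4.10)$$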
where $d$ maps locus to fluorescence dye colour, e.g. $d(\mathrm{FGA}) = \mathrm{Yellow}$. The covariance
structures in (4.8)-(4.10) all use fewer parameters in $C_\nu$ than the restricted model, namely
$D(D+1)/2 + S$, $D(D+3)/2$ and $1 + S$ parameters, respectively, where $D$ is the number of dye colours.
In our data $D = 3$ and $S = 10$, and thus we removed 39, 46 and 44 parameters, respectively, relative to
the $S(S+1)/2 = 55$ parameters of the restricted model. The three tests
indicated that there were significant differences between the full model and any of the reduced
models, all with p-values < 0.0001. Hence, the model with the best fit included locus dependent
parameters for the between and within covariance on the measurement errors. Inspection
of the correlation matrix in Table 4.5 indicated that locus D8 was the only locus with an average
between-locus-correlation less than 0.5. This may well cause the dye covariance models to have
a poor fit.
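The parameter counts quoted above follow from simple arithmetic, which can be sketched as below. The function name is an assumption made for illustration; the formulas are those stated in the text, and the number of removed parameters is taken relative to the restricted model with the additional variance components set to zero.

```python
def covariance_parameter_counts(S, D):
    """Number of covariance parameters for the full, restricted and reduced models."""
    full = S * (S + 3) // 2          # S(S+1)/2 between-locus terms plus S additional components
    restricted = full - S            # additional variance components set to zero
    reduced = {
        "(4.8)": D * (D + 1) // 2 + S,
        "(4.9)": D * (D + 3) // 2,
        "(4.10)": 1 + S,
    }
    removed = {name: restricted - count for name, count in reduced.items()}
    return full, restricted, reduced, removed


full, restricted, reduced, removed = covariance_parameter_counts(S=10, D=3)
print(full, restricted, reduced, removed)
```

With $S = 10$ and $D = 3$ this gives 65 parameters in the full model, 55 in the restricted model, and 39, 46 and 44 removed parameters for (4.8), (4.9) and (4.10).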
However, one has to bear in mind that the parameter estimates were based on a limited training
set. Hence, the rejections of the hypotheses of simpler models may be biased towards the four
profiles included in the training set. In order to fully verify the model we need to increase the
proportion of alleles from each locus and also the number of homozygous profiles. This will
reduce the possible individual specific effect that may exist in the training set. Such work is in
progress.
A more detailed description of the model and the implementation of the EM-algorithm with full
R-source code are available online at http://people.math.aau.dk/tvede/dna. The programs can
also be obtained from http://www.blackwellpublishing.com/rss.
Acknowledgements
The authors would like to thank Prof. Bruce S. Weir, University of Washington, for some clarifying
comments on an earlier version of the manuscript. We also thank Ms. Catharina Steentoft
for collecting the DNA profiles from the crime case work used in Section 4.5.1, and Ms. Lisbeth
Grubbe Nielsen for thorough review of language and grammar. Furthermore, very helpful
comments were made by the journal's editors and anonymous reviewers. The 22nd ISFG
Congress proceedings (Tvedebrink et al., 2008) has a brief model description.
Bibliography
Balding, D. J. and R. A. Nichols (1994). DNA profile match probability calculation: how to
allow for population stratification, relatedness, database selection and single bands. Forensic
Science International 64, 125–140.
Bill, M. et al. (2005). PENDULUM - a guideline-based approach to the interpretation of STR
mixtures. Forensic Science International 148, 181–189.
Butler, J. M. (2005). Forensic DNA Typing: Biology, Technology, and Genetics of STR Markers
(2 ed.). Burlington, MA: Elsevier Academic Press Inc., U.S.
Cowell, R. G. (2009). Validation of an STR peak area model. Forensic Science International:
Genetics 3(3), 193–199.
Cowell, R. G., S. L. Lauritzen, and J. Mortera (2007a). A gamma model for DNA mixture
analyses. Bayesian Analysis 2(2), 333–348.
Cowell, R. G., S. L. Lauritzen, and J. Mortera (2007b). Identification and separation of DNA
mixtures using peak area information. Forensic Science International 166, 28–34.
Cowell, R. G., S. L. Lauritzen, and J. Mortera (2010). Probabilistic expert systems for handling
artifacts in complex DNA mixtures. Forensic Science International: Genetics. In Press.
Cox, D. R. and D. V. Hinkley (1974). Theoretical Statistics. Chapman and Hall Ltd.
Curran, J. M. (2008). A MCMC method for resolving two person mixtures. Science & Justice 48,
168–177.
Evett, I. W. and B. S. Weir (1998). Interpreting DNA Evidence: Statistical Genetics for Forensic
Scientists. Sunderland, MA: Sinauer Associates.
Gill, P. D. et al. (1998). Interpreting simple STR mixtures using allele peak areas. Forensic
Science International 91(1), 41–53.
Gill, P. D. et al. (2006). DNA commission of the International Society of Forensic Genetics:
Recommendations on the interpretation of mixtures. Forensic Science International 160(2-3),
90–101.
Lauritzen, S. L. (1996). Graphical models. Oxford University Press.
Little, R. and D. Rubin (2002). Statistical Analysis with missing data (2 ed.). Wiley.
Perlin, M. W. and B. Szabady (2001). Linear mixture analysis: A mathematical approach to
resolving mixed DNA samples. Journal of Forensic Science 46(6), 1372–1378.
R Development Core Team (2009). R: A Language and Environment for Statistical Computing.
Vienna, Austria: R Foundation for Statistical Computing. ISBN 3-900051-07-0.
Tvedebrink, T., P. S. Eriksen, H. S. Mogensen, and N. Morling (2008). Amplification of DNA
mixtures - Missing data approach. Forensic Science International: Genetics Supplement
Series 1, 664–666.
Tvedebrink, T., P. S. Eriksen, H. S. Mogensen, and N. Morling (2009). Estimating the probability
of allelic drop-out of STR alleles in forensic genetics. Forensic Science International:
Genetics 3(4), 222–226.
Tvedebrink, T., P. S. Eriksen, H. S. Mogensen, and N. Morling (2010). Identifying contributors
of DNA mixtures by means of quantitative information of STR typing. Journal of Computational
Biology. Accepted for publication.
Votaw, D. F. (1948). Testing compound symmetry in a normal multivariate distribution. Annals
of Mathematical Statistics 19(4), 447–473.
Wang, T., N. Xue, and J. D. Birdwell (2006). Least-square deconvolution: A framework for
interpreting short tandem repeat mixtures. Journal of Forensic Science 51(6), 1284–1297.
4.7 Supplementary remarks
As briefly mentioned in Appendix 4.A, the model presented above is a case of the larger class of linear
mixed effects models. However, what distinguishes the model from other types of linear mixed
effects models is the property of handling varying dimensions of the observation matrix and
subvectors hereof under the assumed mean and covariance structure. Typically an experimental
design is set up such that $n_s$ and $n$ (as defined above) are constant over the various factors of the
experiment. In order to construct interesting and realistic experiments useful to forensic genetics
it is not possible to fulfil such restrictions. However, by restricting the intra-locus correlations to
be positive, the EM-algorithm may be used to fit the model to data where the subvectors of the
response vary across samples.
The model extends the LR by including the quantitative information in the evidence calculations.
By evaluating $L(M|G)$ for a given pair of DNA profiles, $G$, it is possible to assess the
goodness-of-fit for a proposed pair of DNA profiles versus the observed peak intensities. However, since
the model presented above assumes intra-locus correlations, it is very time consuming and
computationally intensive to search for a pair of best matching profiles $\hat{G} = \arg\max_G L(M|G)$, since the
configurations on the various loci affect each other through the non-zero correlations.
Hence, in order to perform such a task, we need to relax some of the assumptions for fast
computation and evaluation. In the following chapter we present a statistical model and an efficient
algorithm for finding a pair of best matching profiles. The basic assumptions are similar to those
discussed above, with the difference that the peak intensities within each locus are assumed
conditionally independent. That is, by conditioning on an ancillary statistic (for the mixture ratio)
we assume that the configuration of the DNA profiles in locus $s$ is independent of configurations
in locus $t$ for all $t \neq s$.
The methodology differs from previous approaches since it is frequentist and based on a statistical
model taking the present proportionality of mean and variance of the peak intensities into
account. There are several Bayesian methods for modelling and separating DNA mixtures, e.g.
Cowell et al. (2007a,b, 2010) and Cowell (2009) discussed the use of probabilistic expert systems to
model DNA mixtures using first a normal distribution (the 2007b paper) and later a gamma
distribution, and Curran (2008) also took a Bayesian approach and modelled the peak intensities using a
multivariate normal distribution. However, Curran (2008) did not include the proportionality of
the mean and variance, which is an intrinsic feature of the gamma models of Cowell et al.
Earlier, Perlin and Szabady (2001) and Wang et al. (2006) used linear models to model the peak
intensities of DNA mixtures using a frequentist approach. However, their models did not take
the mentioned proportionalities of the first two moments into account, and their methods did
not allow for efficient and consistent modelling of all loci simultaneously. For example, Wang
et al. (2006) did not incorporate a common mixture ratio across loci even though there are strong
biological and biochemical arguments for this assumption. Furthermore, the method of Wang
et al. (2006) called for a reasonably large amount of manual labour in order to use the output from
their method.
CHAPTER 5
Identifying contributors of DNA mixtures by
means of quantitative information of STR typing
Publication details
Co-authors: Poul Svante Eriksen (1), Helle Smidt Mogensen (2) and Niels Morling (2)
(1) Department of Mathematical Sciences, Aalborg University
(2) Section of Forensic Genetics, Department of Forensic Medicine,
Faculty of Health Science, University of Copenhagen
Journal: Journal of Computational Biology (Accepted for publication)
Abstract:
Estimating the weight of evidence in forensic genetics is often done in terms of a likelihood ratio,
LR. The LR evaluates the probability of the observed evidence under competing hypotheses.
Most often probabilities used in the LR only consider the evidence from the genomic variation
identified using polymorphic genetic markers. However, modern typing techniques supply
additional quantitative data, which contain very important information about the observed evidence.
This is particularly true for cases of DNA mixtures, where more than one individual has
contributed to the observed biological stain.
This paper presents a method for including the quantitative information of STR DNA mixtures
in the LR. Also, an efficient algorithmic method for finding the best matching combination of
DNA mixture profiles is derived and implemented in an on-line tool for two- and three-person
DNA mixtures.
Finally, we demonstrate for two-person mixtures how this best matching pair of profiles can
be used in estimating the likelihood ratio using importance sampling. The reason for using
importance sampling for estimating the likelihood ratio is the often vast number of combinations
of profiles needed for the evaluation of the weight of evidence.
Keywords:
Forensic genetics; STR DNA; DNA mixture; Greedy algorithm; Finding best pair of matching
profiles; Importance sampling.
5.1 Introduction
When a crime has been committed, biological traces are often found at the scene of crime. In
many cases, more than one individual has contributed to the stain, which is then termed a
DNA mixture. The evaluation of DNA mixtures is often complex and laborious, taking
experienced case workers lots of time and effort to analyse.
Most modern DNA typing techniques are based upon polymerase chain reaction (PCR) producing
millions of copies of the DNA string. The amount of DNA in the PCR vessel pre-PCR is
reflected in the concentration of target molecules post-PCR. The targets used in forensic genetics
are selected such that they are highly polymorphic (large number of possible alleles), which gives
a high power of discrimination. Furthermore, the genetic markers used for forensic purposes are
non-coding and should ideally be neutral with respect to selection.
The prevalent technology used in forensic genetics to perform genetic identification uses short
tandem repeat (STR) polymorphisms. This method relies on variability in the length of certain
repeat motifs in the genome. The STR DNA profile is observed via a so-called electropherogram
(EPG), where the alleles are identified as signal peaks above a signal to noise threshold (shaded
cones of Figure 5.2). For a single person DNA profile one can observe either one or two peaks
referring to the situation where the DNA profile is either homozygous (identical alleles on both
chromosomes) or heterozygous (different alleles on each chromosome). The commercial kits
used for identification purposes typically contain between 10 and 15 genetic markers (also called
loci: plural for locus). Within each locus the number of alleles varies from 5 to 20. For the
kit (SGM Plus kit, Applied Biosystems, AB) depicted in Figure 5.2, the labels D3, vWA,
. . . , FGA refer to locus names and the integer values above the locus name correspond to the
observed allele types for that particular locus.
It is only possible to observe the cumulative peaks in the EPG. That is, the peak heights are
expected to be twice the height for homozygous loci relative to the heterozygous loci, since the
two identical alleles double the amount of pre-PCR product for the homozygous peaks. This
is also true for DNA mixtures, where alleles shared by two or more contributors will reflect the
contribution from more donors as higher peaks. Hence, for DNA mixtures with two contributors
the number of observable peaks ranges from one to four alleles depending on the particular
profiles in the mixture.
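The cumulative nature of the peaks can be made concrete with a small sketch. It is an idealised illustration only: each allele copy is assumed to contribute in proportion to the amount of DNA from its donor, and stutter, drop-out and measurement noise are ignored. The function name and the amounts are hypothetical.

```python
from collections import Counter


def expected_doses(profiles, amounts):
    """Expected relative peak intensity per allele at one locus.

    profiles: one genotype per contributor, e.g. ("a", "b") or ("a", "a").
    amounts:  relative amount of DNA per contributor (hypothetical units).
    """
    doses = Counter()
    for genotype, amount in zip(profiles, amounts):
        for allele in genotype:
            doses[allele] += amount   # a homozygote adds its amount twice
    return dict(doses)


# Contributor 1 is homozygous (a, a), contributor 2 is heterozygous (a, b)
# and contributes three times as much DNA.
doses = expected_doses([("a", "a"), ("a", "b")], [1.0, 3.0])
print(doses)
```

Here the shared allele a accumulates contributions from both donors, whereas allele b reflects only the second contributor.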
The kit used for STR typing comprises a set of loci, $\mathcal{S}$, used for discrimination. For an arbitrary
two-person mixture the number of possible combinations is given by $1^{S_1} \, 7^{S_2} \, 12^{S_3} \, 6^{S_4}$, where $S_i$
is the number of loci with $i$ observations and $S = \sum_{i=1}^{4} S_i$ is the total number of loci used for
discrimination, i.e. $S$ is the size of $\mathcal{S}$. The numbers 1, 7, 12 and 6 come from the number of
possible combinations (see Table 5.1) when observing 1, 2, 3 and 4 alleles, respectively.
Table 5.1: Possible combinations in a two-person mixture with one to four alleles.
Alleles Possible combinations
a (aa, aa)
a, b (aa, ab) (aa, bb) (ab, aa) (ab, ab) (ab, bb) (bb, ab) (bb, aa)
a, b, c (aa, bc) (ab, ac) (ab, bc) (ab, cc) (ac, ab) (ac, bb)
(ac, bc) (bb, ac) (bc, ab) (bc, ac) (bc, aa) (cc, ab)
a, b, c, d (ab, cd) (ac, bd) (ad, bc) (bc, ad) (bd, ac) (cd, ab)
In most cases, this leads to an intractable number of combinations. However, using the quantitative STR data (peak heights and peak areas), the number of plausible combinations often decreases substantially. In this paper, we develop a statistical model for STR DNA mixtures. The statistical model is intended to measure the agreement between the expected peak intensities for a proposed combination of DNA profiles and the actual observed peak intensities. Hence, we use an objective criterion to discriminate among the possible combinations in Table 5.1.
In order to incorporate the peak intensities in the likelihood ratio (LR), we first demonstrate how to find a best matching pair of profiles for a given two-person mixture using an efficient algorithmic approach. This algorithm iteratively builds up a best matching combination of profiles using the statistical model for the peak intensities. The algorithm has been implemented in a free on-line tool available at the first author's web-site. The statistical model and algorithmic construction are different from previously proposed methods for DNA mixture separation (e.g. Perlin and Szabady, 2001; Bill et al., 2005; Wang et al., 2006; Cowell et al., 2007a,b; Curran, 2008).
The quantitative information is included in the LR by assigning a weight to each combination of DNA profiles consistent with the observed STR types. The weight reflects the probability of observing the observed peak intensities given a specific combination of profiles. The denominator in the LR will in most cases yield a sum over an intractable number of combinations. By sampling close to the best matching combination returned by the algorithm, we show how importance sampling may be used to estimate the LR.
5.2 Data
The model is based on exploration of controlled experiments of two-person mixtures conducted at The Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health Sciences, University of Copenhagen, Denmark. From the data exploration it is evident that the mean and covariance structure of the peak areas must satisfy proportionality of:
• peak areas and peak heights,
• peak area and amount of DNA in the mixture,
• the mean and variance of the peak areas.
These assumptions are supported by Figures 1 and 2 in Tvedebrink et al. (2010). The experiments consisted of pairwise two-person mixtures in various mixture ratios of the four profiles in Table 5.2. The data were prepared as described in Tvedebrink et al. (2009).
Table 5.2: The four STR profiles used in the controlled pairwise two-person mixture experiments.
D3 vWA D16 D2 D8 D21 D18 D19 TH0 FGA
A 14,18 17,19 12,14 20,24 10,13 30.2,32.2 13,13 12,13 8,9 20,22
B 15,16 14,16 10,12 17,25 13,16 30,30 13,13 14,15 6,9 19,23
C 15,16 15,17 11,11 19,25 8,12 29,31 15,17 13,13 6,8 23,24
D 16,19 15,17 10,12 23,25 13,13 28,30 12,16 13,15 6,7 20,23
5.3 Modelling peak areas of a two-person mixture
For the search for a pair of best matching profiles to be feasible, we assume the peak areas of the various loci to be conditionally independent given the locus area sums, $A_+$. Performing the inference conditioned on $A_+$ satisfies the reasoning of Cox (1958) as $A_+$ is an ancillary statistic for the mixture ratio, i.e. $A_+$ is fixed for all values of the mixture ratio. Furthermore, we assume that the peak areas are multivariate normally distributed with conditional mean vector, $E(A_s \mid A_{s,+})$, and covariance matrix, $\mathrm{Cov}(A_s \mid A_{s,+})$, defined as
$$E(A_s \mid A_{s,+}) = \left[\alpha P_{s,1} + (1-\alpha) P_{s,2}\right] \frac{A_{s,+}}{2} \quad\text{and}\quad \mathrm{Cov}(A_s \mid A_{s,+}) = \tau^2 C_s\, \mathrm{diag}(h_s)\, C_s^\top, \tag{5.1}$$
where $\alpha$ denotes the proportion with which person 1 contributes to the mixture, and $C_s = I_{n_s} - n_s^{-1} 1_{n_s} 1_{n_s}^\top$ with $n_s$, $1 \le n_s \le 4$, being the number of observed peaks at locus $s$. Note that $\alpha$ is supposed to be common to all loci. The definition of the covariance matrix is close to the ordinary covariance when conditioning on the vector sum. However, as the variance of the peak area is assumed proportional to the mean, we use the diagonal matrix $\mathrm{diag}(h_s)$, where $h_s$ is the vector of associated peak heights on locus $s$, to obtain weighted observations that stabilise the variance. Furthermore, $\tau^2$ is a common variance parameter for all loci, $s \in \mathcal{S}$.
The $P_{s,k}$-vector is a vector of indicators taking values 0, 1 or 2 referring to the number of copies that person $k$ has of each allele in the mixture on locus $s$. For example, if the two individuals contributing to the mixture have genotypes (10, 12) and (14, 14), respectively, we will have $P_{s,1} = (1, 1, 0)^\top$ and $P_{s,2} = (0, 0, 2)^\top$. Assuming no chromosomal anomalies, each individual carries two alleles at each locus, which implies that the sum of $P_{s,k}$ is 2 for all $k$.
The model presented here is different from e.g. the ones of Cowell et al. (2007a,b) and Curran (2008), who all take a Bayesian approach. The model of Cowell et al. (2007a) assumes the peak heights to be gamma-distributed to ensure proportionality of the mean and variance, whereas Cowell et al. (2007b) assume normality of the peak heights with parameters chosen to ensure proportionality of mean and variance. As mentioned in Curran (2008), the model of Cowell et al. (2007a) makes a crude adjustment for a repeat number effect, which is no longer relevant. In Curran (2008) the peak heights are assumed multivariate normal, but no attempt is made to ensure proportionality of the mean and variance. Furthermore, by conditioning on the peak area sums within each locus we acknowledge the strong inter-locus correlation.
In addition to the methods based on statistical models, there are several methods that rely on heuristics and guidelines (Gill et al., 1998; Clayton et al., 1998; Perlin and Szabady, 2001; Bill et al., 2005; Wang et al., 2006; Gill et al., 2006). Cowell et al. (2007b) give a nice review of most of these methods in their introductory section.
5.4 Finding best matching pair of profiles
In order to find the most likely pair of profiles matching the observed mixture under the assumptions made by the model, one can decrease the number of possibilities using the following arguments. Let the observed peak areas within each locus, $s$, be sorted such that $A_{s,(1)} < \cdots < A_{s,(n_s)}$, and assume that $\mathrm{DNA}_1 < \mathrm{DNA}_2$, where $\mathrm{DNA}_k$ is the amount of DNA contributed by person $k$. Then, for a locus with four observed peaks ($n_s = 4$), the only likely pair of profiles given the model relates the alleles with peak areas $(A_{s,(1)}, A_{s,(2)})$ and $(A_{s,(3)}, A_{s,(4)})$ to person 1 and person 2, respectively. For loci with one observation ($n_s = 1$), the two individuals both need to be homozygous for the observed allele, while for two ($n_s = 2$) or three observations ($n_s = 3$), the possible profiles are listed in Table 5.3 (the notation $J_2$ and $J_3$ is used in Section 5.4.1).
Table 5.3: Possible profiles for loci with two and three observations.
$J_2$:         P_{s,1} P_{s,2} | P_{s,1} P_{s,2} | P_{s,1} P_{s,2} | P_{s,1} P_{s,2}
A_{s,(1)}         1       1    |    2       0    |    1       0    |    0       1
A_{s,(2)}         1       1    |    0       2    |    1       2    |    2       1
$J_3$:         P_{s,1} P_{s,2} | P_{s,1} P_{s,2} | P_{s,1} P_{s,2} | P_{s,1} P_{s,2}
A_{s,(1)}         2       0    |    1       0    |    1       0    |    0       1
A_{s,(2)}         0       1    |    1       0    |    0       1    |    0       1
A_{s,(3)}         0       1    |    0       2    |    1       1    |    2       0
In Table 5.3, $P_{s,1}$ and $P_{s,2}$ refer to the profiles of person 1 and person 2 on the particular locus $s$, respectively, and the cell values to the number of alleles associated with the profiles. The reason for not considering the three and eight other combinations for loci with two and three observations (Table 5.1), respectively, is that, for any of these combinations, one of the four combinations listed in Table 5.3 will be more likely under the model assumptions, i.e. have a better fit to the observed data. For example, $P_{s,1} = (0, 2)^\top$ and $P_{s,2} = (2, 0)^\top$ would be unlikely, as we assumed person 1 to have the lowest contribution and the second area to be the larger.
The numbers of possible pairs of profiles for loci with two, three and four observations are respectively 7, 12 and 6, when discarding the information from peak areas and only using combinatorics. Thus, using the assumptions of the model, we decrease the number of profiles which need to be examined in order to find the most likely profiles forming the observed mixture.
We assume the peak areas to be normally distributed with conditional means and covariances as specified in (5.1). Due to the conditional independence of the loci, the overall estimates of $\alpha$ and $\tau^2$ are found as sums over the loci. Let $W_s = C_s\, \mathrm{diag}(h_s)\, C_s^\top$, then we can write the conditional distribution as $A_s \mid A_{s,+} \sim N_{n_s}(\alpha x_{s0} + x_{s1}, \tau^2 W_s)$, where $x_{s0} = (P_{s,1} - P_{s,2}) A_{s,+}/2$ and $x_{s1} = P_{s,2} A_{s,+}/2$ are the terms of the mean, linear and constant in $\alpha$, respectively. Solving the likelihood equations with respect to $\alpha$ and $\tau^2$ yields the unbiased estimators
$$\hat\alpha = \frac{\sum_{s\in\mathcal{S}} x_{s0}^\top W_s^{-} (A_s - x_{s1})}{\sum_{s\in\mathcal{S}} x_{s0}^\top W_s^{-} x_{s0}} \quad\text{and} \tag{5.2}$$
$$\hat\tau^2 = N^{-1} \sum_{s\in\mathcal{S}} (A_s - \hat\alpha x_{s0} - x_{s1})^\top W_s^{-} (A_s - \hat\alpha x_{s0} - x_{s1}),$$
where $N = n_+ - S - 1 = \sum_{s\in\mathcal{S}} (n_s - 1) - 1$ and $W_s^{-}$ is the generalised inverse of $W_s$. We have to use the generalised inverse of $W_s$ as $W_s$ has rank $n_s - 1$. An approximation to this model assumes that the precision matrix, $\tau^{-2} W_s^{-1}$, is given by $\tau^{-2} C_s\, \mathrm{diag}(h_s)^{-1} C_s^\top$. Hence, we have a closed form expression for the inverse covariance matrix yielding simple expressions for the estimators of $\alpha$ and $\tau^2$,
$$\tilde\alpha = \frac{\sum_{s\in\mathcal{S}} \sum_{i=1}^{n_s} x_{s0,i} (A_{s,i} - x_{s1,i})\, h_{s,i}^{-1}}{\sum_{s\in\mathcal{S}} \sum_{i=1}^{n_s} x_{s0,i}^2\, h_{s,i}^{-1}} \quad\text{and}\quad \tilde\tau^2 = N^{-1} \sum_{s\in\mathcal{S}} \sum_{i=1}^{n_s} (A_{s,i} - \tilde\alpha x_{s0,i} - x_{s1,i})^2\, h_{s,i}^{-1},$$
where $A_{s,i}$, $h_{s,i}$, $x_{s0,i}$ and $x_{s1,i}$ are the $i$th components of the respective bold faced vectors. We denote the unbiased maximum likelihood estimates for the two models as $(\hat\alpha, \hat\tau)$ and $(\tilde\alpha, \tilde\tau)$, respectively. The latter version is what is implemented in an on-line tool as discussed in Section 5.4.2.
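The simplified estimators can be sketched in a few lines. The example below is an illustration only: it uses just the two four-peak loci (vWA and D8) from Table 5.4 together with the minor and major genotypes of Table 5.5, so the resulting values need not agree with the estimates reported for the full profile. The function and variable names are assumptions made for this sketch.

```python
def simplified_estimates(loci):
    """Closed-form estimates (alpha-tilde, tau-tilde squared) for fixed genotypes.

    Each locus is a tuple (alleles, heights, areas, minor_genotype, major_genotype).
    """
    num = den = 0.0
    rows = []
    for alleles, heights, areas, minor, major in loci:
        half = sum(areas) / 2                      # A_{s,+} / 2
        p1 = [minor.count(a) for a in alleles]     # indicator vector, person 1
        p2 = [major.count(a) for a in alleles]     # indicator vector, person 2
        for c1, c2, h, area in zip(p1, p2, heights, areas):
            x0 = (c1 - c2) * half                  # term linear in alpha
            x1 = c2 * half                         # term constant in alpha
            num += x0 * (area - x1) / h
            den += x0 * x0 / h
            rows.append((area, x0, x1, h))
    alpha = num / den
    n_df = sum(len(locus[0]) - 1 for locus in loci) - 1   # N = sum(n_s - 1) - 1
    tau2 = sum((a - alpha * x0 - x1) ** 2 / h for a, x0, x1, h in rows) / n_df
    return alpha, tau2

# vWA and D8 peaks from Table 5.4; genotypes from Table 5.5 (minor, major).
loci = [
    ((14, 15, 16, 17), (712, 725, 626, 830), (6128, 6620, 5637, 7362), (14, 16), (15, 17)),
    ((8, 12, 13, 16), (1284, 1232, 903, 638), (10782, 10359, 7891, 5291), (13, 16), (8, 12)),
]
alpha_tilde, tau2_tilde = simplified_estimates(loci)
print(round(alpha_tilde, 3), round(tau2_tilde, 1))
```

Because both genotype indicator vectors sum to 2, the centring matrix $C_s$ leaves $x_{s0}$ and the residuals unchanged, which is why the sketch can work with plain weighted sums.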
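In addition to the estimate of $\alpha$, we are also interested in determining a confidence interval for $\alpha$. The conditional variance of $\hat\alpha$ given $A_+$ is found using the covariance operator on both sides of (5.2),
$$\mathrm{Var}(\hat\alpha \mid A_+) = \frac{\mathrm{Cov}\left\{\sum_{s\in\mathcal{S}} x_{s0}^\top W_s^{-} (A_s - x_{s1}) \,\middle|\, A_+\right\}}{\left(\sum_{s\in\mathcal{S}} x_{s0}^\top W_s^{-} x_{s0}\right)^2} = \frac{\sum_{s\in\mathcal{S}} x_{s0}^\top W_s^{-}\, \mathrm{Cov}(A_s \mid A_+)\, W_s^{-} x_{s0}}{\left(\sum_{s\in\mathcal{S}} x_{s0}^\top W_s^{-} x_{s0}\right)^2} = \tau^2 \left(\sum_{s\in\mathcal{S}} x_{s0}^\top W_s^{-} x_{s0}\right)^{-1}, \tag{5.3}$$
where we from the first to the second equality used the conditional independence of $A_s$ and $A_t$ given $A_+$, and from the second to the third properties of the covariance together with the expression of $\mathrm{Cov}(A_s \mid A_+)$ in (5.1). The confidence interval of $\alpha$ given $A_+$ is then given by
$$\mathrm{CI}_\beta(\alpha) = \hat\alpha \pm t_{1-\beta/2,N}\, \frac{\hat\tau}{\sqrt{\sum_{s\in\mathcal{S}} x_{s0}^\top W_s^{-} x_{s0}}},$$
where $t_{1-\beta/2,N}$ is the critical value on significance level $\beta$ for a t-distribution with $N = n_+ - S - 1$ degrees of freedom. A similar confidence interval using the $(\tilde\alpha, \tilde\tau)$-estimates is obtained by inserting the $(\tilde\alpha, \tilde\tau)$-estimates instead of $(\hat\alpha, \hat\tau)$ and replacing $W^{-}$ with $W^{-1}$. From the expression of $\mathrm{CI}_\beta(\alpha)$, it is obvious that a small $\tau$-estimate decreases the width of the confidence interval and thus increases the trust in the estimated mixture proportion.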
5.4.1 Greedy algorithm
This model was used in an algorithm for finding the most likely pair of profiles contributing to an observed mixture where the STR profiles of both individuals were assumed unknown. First, define the set $J = \{J_1, \ldots, J_4\}$, where $J_i$ is the set of plausible profiles for loci with $n_s = i$. These sets were defined in Section 5.4 (Table 5.3). The pseudo code for a greedy algorithm finding a pair of profiles (locally) maximising the likelihood of the model specified by (5.1) is given in Figure 5.1. A greedy algorithm is any algorithm that solves a problem by making the locally optimal choice at each stage in the hope of finding the global optimum. A graphical representation of the algorithm is given in Figure 5.7 for a general number of contributors, $m$. The algorithm works with either $(\hat\alpha, \hat\tau)$ or $(\tilde\alpha, \tilde\tau)$ as estimates of $(\alpha, \tau)$.
The greedy algorithm initiates by estimating $\alpha$ based on a locus $s$ with four present alleles. The loci of $\mathcal{S}_4$ contain full information on the mixture ratio, $\alpha$, and are thus used for assessing this quantity. In succession, the loci with three and two observations ($\mathcal{S}_3$ and $\mathcal{S}_2$, respectively) are analysed and the combination with the smallest contribution to $\hat\tau^2$ and best concordance to the previously determined mixture proportion is chosen. The set $T$ contains a list of the optimal combinations on previously visited loci and is updated after each iteration. On termination, the greedy algorithm returns the best matching pair of profiles together with the estimates of $\alpha$ and $\tau$. The algorithm is designed to perform calculations and decisions similar to those of a forensic geneticist when analysing a two-person mixture.
Algorithm: Find best matching pair of STR profiles.
Let $T = \emptyset$, $\hat\alpha = 0$ and $\hat\tau^2 = \infty$.
While $\hat\tau^2$ decreases or $T \subset \mathcal{S}$
    For $i \in \{4, 3, 2\}$
        For $s \in \mathcal{S}_i = \{s : s \in \mathcal{S} \text{ and } n_s = i\}$
            Choose combination $j \in J_i$ minimising $\hat\tau^2$
            Set $T = \{T \setminus (s, \cdot)\} \cup (s, j)$ and compute $\hat\alpha$
Return $\hat\alpha$, $\hat\tau^2$ and $T$.
Figure 5.1: Greedy algorithm for finding a pair of profiles (locally) maximising the likelihood of (5.1).
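The greedy search can be sketched as follows. This is a minimal illustration, not the implementation behind the on-line tool: it uses the simplified estimator of $\alpha$, minimises the weighted residual sum of squares instead of the variance estimate (the two differ only by the divisor $N$ once all loci are visited), and is run on hypothetical noise-free data generated with a mixture proportion of 0.3. Peaks are assumed to be sorted by increasing area within each locus, as in Table 5.3.

```python
# Candidate indicator pairs (P1, P2) per number of peaks, following Table 5.3
# and the rules stated for one-peak and four-peak loci.
J = {
    1: [((2,), (2,))],
    2: [((1, 1), (1, 1)), ((2, 0), (0, 2)), ((1, 1), (0, 2)), ((0, 2), (1, 1))],
    3: [((2, 0, 0), (0, 1, 1)), ((1, 1, 0), (0, 0, 2)),
        ((1, 0, 1), (0, 1, 1)), ((0, 0, 2), (1, 1, 0))],
    4: [((1, 1, 0, 0), (0, 0, 1, 1))],
}

def fit(loci, choice):
    """Return (alpha, rss) for the loci in `choice` under the simplified model."""
    num = den = 0.0
    rows = []
    for s, (p1, p2) in choice.items():
        areas, heights = loci[s]
        half = sum(areas) / 2
        for a, h, c1, c2 in zip(areas, heights, p1, p2):
            x0, x1 = (c1 - c2) * half, c2 * half
            num += x0 * (a - x1) / h
            den += x0 * x0 / h
            rows.append((a, x0, x1, h))
    alpha = num / den if den > 0 else 0.5   # alpha is not identifiable if den == 0
    rss = sum((a - alpha * x0 - x1) ** 2 / h for a, x0, x1, h in rows)
    return alpha, rss

def greedy(loci):
    """Visit loci with 4, 3, 2, 1 peaks, keeping the locally best combination."""
    choice, best_rss = {}, float("inf")
    while True:
        for n in (4, 3, 2, 1):
            for s, (areas, _) in loci.items():
                if len(areas) == n:
                    choice[s] = min(J[n], key=lambda c: fit(loci, {**choice, s: c})[1])
        alpha, rss = fit(loci, choice)
        if rss >= best_rss - 1e-9:          # stop once a full sweep gives no improvement
            return alpha, rss, choice
        best_rss = rss

# Hypothetical noise-free loci: (areas sorted increasingly, matching heights).
loci = {
    "L4": ([1500, 1500, 3500, 3500], [150, 150, 350, 350]),
    "L3": ([3000, 3500, 3500], [300, 350, 350]),
    "L2": ([1500, 8500], [150, 850]),
}
alpha, rss, choice = greedy(loci)
print(round(alpha, 3), choice)
```

On real data the sweep may settle in a local optimum, which is the caveat the text raises about greedy algorithms in general.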
The optimisation problem is complicated since the inputs of the function that we are interested in minimising depend on each other, $f(\alpha, (P_{s,1}, P_{s,2})_{s\in\mathcal{S}}) = \sum_{s\in\mathcal{S}} D_s$, where $D_s = (A_s - \alpha x_{s0} - x_{s1})^\top W_s^{-} (A_s - \alpha x_{s0} - x_{s1})$. Here, $f$ denotes the objective function and $(P_{s,1}, P_{s,2})_{s\in\mathcal{S}}$ the set of possible combinations for all loci, $s \in \mathcal{S}$. It is easy to see that, for a fixed $\alpha$, we can minimise $D_s$ for each locus $s$ by choosing the combination yielding the smallest squared distance. Similarly, fixing the combinations for all loci, $\alpha$ is estimated using (5.2). However, from the construction of the greedy algorithm, the algorithm chooses the combination that minimises $\hat\tau^2$ for locus $s$ given $\hat\alpha$ and the configurations on previously visited loci, $t \in \{T \setminus s\}$. This ensures locally optimal solutions, and for most practical purposes, the algorithm returns a global maximum. One should note that when the algorithm recovers the best matching pair of profiles, we still need to consider all profiles close to these profiles consistent with the evidence for likelihood ratio evaluation (see Section 5.5 for further details).
5.4.2 On-line implementation
The greedy algorithm of Figure 5.1 together with the methods for evaluating the goodness of fit for a given pair of profiles are implemented in an on-line application. The on-line implementation applies the $(\tilde\alpha, \tilde\tau)$-estimates when finding the best matching pair of profiles. The two-person (and three-person) mixture separator is available on-line at the first author's website (http://people.math.aau.dk/tvede/dna/). The script can plot the expected and observed peak areas for visual inspection of the fit (see Figure 5.2).
The script allows for user uploads of csv-files containing information about loci, alleles, peak heights and peak areas. The loci implemented are those contained in the SGM Plus and Identifiler kits (AB) excluding amelogenin.
Apart from finding the best matching pair of unknown profiles, the user can specify a suspect profile, and the script finds the best matching unknown profile for two-person mixtures.
Figure 5.2: Plot produced by the on-line implementation of the algorithm (http://people.math.aau.dk/tvede/dna/ - sample data file "Paper case"). The observed peaks are based on data from Table 5.4, and the expected peaks assume a mixture of the best matching pair of STR profiles (Table 5.5). The observed and expected peaks coincide for nearly all peaks.
Example of a two-person mixture separation in a 1:1 mixture ratio
We demonstrate the algorithm and implementation on data from a controlled experiment conducted at the Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health Sciences, University of Copenhagen, Denmark. The data are presented in Table 5.4 together with information on the true profiles of the mixture (marked by separate symbols for profile 1 and profile 2).
The algorithm found that the two profiles of Table 5.5 are the best matching pair of profiles. The profiles are consistent with the true profiles of the mixture except for loci TH0 and FGA. In Figure 5.2, we have plotted the data from Table 5.4 (solid cones) together with the best matching pair of profiles as listed in Table 5.5.
In Figure 5.3, the traces of the parameter estimates of $\alpha$ (dashed) and $\tau^2$ (solid) are plotted for each successive iteration, with the final estimate of $\alpha$ being 0.43 (95%-CI: [0.40 ; 0.45]) and that of $\tau^2$ being 1134.04. Evaluating the mixture of the true profiles (as marked in Table 5.4), the estimate of $\alpha$ is almost unchanged (0.42), but with an increase in the estimate of $\tau^2$ to 1266.34, indicating a slightly worse fit.
Table 5.4: Data used in demonstrating the algorithm. The two marker symbols represent profile 1 and 2, respectively.
Locus Allele Height Area
D3 15 1802 15410
D3 16 1939 16282
vWA 14 712 6128
vWA 15 725 6620
vWA 16 626 5637
vWA 17 830 7362
D16 10 824 7910
D16 11 1772 17231
D16 12 586 6101
D2 17 434 4558
D2 19 612 6563
D2 25 843 9257
D8 8 1284 10782
D8 12 1232 10359
D8 13 903 7891
D8 16 638 5291
D21 29 1073 9454
D21 30 1469 12828
D21 31 798 6992
D18 13 1247 12302
D18 15 899 9104
D18 17 726 7549
D19 13 1332 10534
D19 14 416 3478
D19 15 504 3968
TH0 6 820 6739
TH0 8 668 5573
TH0 9 486 4004
FGA 19 490 4415
FGA 23 865 7968
FGA 24 527 5036
Table 5.5: Best matching pair of profiles for the data in Table 5.4. This pair of profiles is pictured in Figure 5.2 as the expected peaks.
Locus D3 vWA D16 D2 D8 D21 D18 D19 TH0 FGA
Minor 15,16 14,16 10,12 17,25 13,16 30,30 13,13 14,15 6,6 23,23
Major 15,16 15,17 11,11 19,25 8,12 29,31 15,17 13,13 8,9 19,24
The fact that a combination different from the true one has a better fit indicates that there are multiple explanations of the trace since it is a 1:1-mixture ($\alpha$ close to 0.5). However, the difference in $\tau^2$-estimates for the two combinations will only have a minor influence in the evaluation of the evidence.
Example of a two-person mixture separation in a 1:2 mixture ratio
Wang et al. (2006, Table 10) presented data from a two-person DNA mixture with known minor (victim) and major (suspect) profiles. Curran (2008) and others have analysed these data in order to demonstrate their models for separating two-person DNA mixtures. Using the on-line implementation we obtained the true profiles with an estimated $\alpha$ of 0.30 (95%-CI: [0.28 ; 0.32]) and an estimated $\tau^2$ of 124.87.
Figure 5.3: Trace of the parameter estimates of $\alpha$ (dashed/right ordinate labels) and $\tau^2$ (solid/left ordinate labels). The plot is produced by the on-line tool available at http://people.math.aau.dk/tvede/dna/.
5.4.3 Dropping non-tting loci
In some cases, the stain may be contaminated, and it may be subject to drop-in or drop-out. Drop-ins are allelic peaks present in the DNA profile not belonging to the true profiles. Drop-ins may occur at random (contamination) or by more systematic mechanisms such as stuttering or pull-up effects. Stutters are caused by artefacts in the polymerase chain reaction resulting in an increase of peak intensities typically in the allelic position before the true peaks. Pull-up effects are manifested as an increase of true peaks caused by overlap of the spectra of the light emitted from the various fluorochromes, which are detected by a CCD camera in the data generating process (Butler, 2005). Drop-outs are allelic peaks of the true profiles that are absent in the DNA profile due to, e.g., a low amount of DNA or degradation of the DNA. In such cases, the observed peak heights and peak areas no longer originate solely from a two-person mixture. Hence, the proportionalities of Section 5.2 need no longer be satisfied and the mean structure of (5.1) may not explain the observed peak heights and peak areas in all loci.
We use an F-test approach to evaluate whether any of the included loci $s \in \mathcal{S}$ has significantly unexpected balances due to e.g. stutters, degradation or contamination. The purpose is to return a list of loci in which the hypothesis of a two-person mixture can be supported.
For each locus, the contribution to $\hat\tau^2$ is computed by $D_s$, which we assume to follow a $\chi^2_{n_s-1}$-distribution. Hence, to test whether any locus contributes significantly to the overall variance, $\hat\tau^2$, we evaluate for each locus $s \in \mathcal{S}$ the ratio
$$\frac{(n_s - 1)^{-1} D_s}{(n_+ - S - n_s - 1)^{-1} \sum_{t\in\{\mathcal{S}\setminus s\}} D_t} \sim F_{(n_s-1),(n_+-S-n_s-1)},$$
where $F_{(\nu_1),(\nu_2)}$ is an F-distribution with $\nu_1$ numerator and $\nu_2$ denominator degrees of freedom. Since we perform this test for all loci, we make a Bonferroni-correction to compensate for multiple testing. We apply this procedure successively and drop the most significant locus (if any) until no locus has a significant test-value. This facility is also available in the on-line implementation.
If the variance contribution from multiple loci is large, the test-value will not indicate any significant locus as the overall noise of the sample is large or the sample may be a mixture of more than two individuals. This will result in large values for the overall $\hat\tau^2$.
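The per-locus test statistic can be sketched as below. The values of $D_s$ and $n_s$ are hypothetical, and the sketch stops at the ratio and its degrees of freedom; turning the ratio into a p-value requires the F-distribution, which the Python standard library does not provide (a statistics package would be needed for that step). Loci with a single peak have zero numerator degrees of freedom and are assumed to be excluded.

```python
def f_ratios(d, n):
    """Per-locus ratio and degrees of freedom for the locus-dropping test.

    d[s] is the contribution D_s and n[s] the number of peaks n_s at locus s.
    """
    n_plus, n_loci = sum(n.values()), len(n)
    total = sum(d.values())
    result = {}
    for s in d:
        df1 = n[s] - 1                          # numerator degrees of freedom
        df2 = n_plus - n_loci - n[s] - 1        # denominator degrees of freedom
        ratio = (d[s] / df1) / ((total - d[s]) / df2)
        result[s] = (ratio, df1, df2)
    return result

# Hypothetical contributions: locus "C" fits much worse than "A" and "B".
d = {"A": 2.0, "B": 3.0, "C": 30.0}
n = {"A": 3, "B": 4, "C": 3}
ratios = f_ratios(d, n)
worst = max(ratios, key=lambda s: ratios[s][0])
print(worst, ratios[worst])
```

In the successive procedure described above, the locus with the most significant ratio would be dropped and the ratios recomputed on the remaining loci, with the significance level divided by the number of loci tested.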
5.5 Likelihood ratio
Let $G$ be the DNA profile of the crime stain, and $G_S$ and $G_{U_i}$ the profiles of the suspect and unknown contributor $i$, respectively. Furthermore, the evidence, $E$, consists of both quantitative information (peak heights and areas), $Q$, and the genetic crime stain (allelic information), $G$. The probability $P(E \mid H)$ factorises as $P(Q, G \mid H) = P(Q \mid G, H) P(G \mid H)$ using the definition of conditional probabilities. Since $Q$ is a continuous stochastic variable, we use the likelihood of our model, $L(A \mid G_1, G_2) = \prod_{s\in\mathcal{S}} \{|W_s|^{-1/2} \exp(-\tfrac{1}{2} D_s)\}$, to evaluate $P(Q \mid G, H)$, where the hypothesis $H$ involves profiles $G_1$ and $G_2$.
Let $C_p = \{G_U : (G_S, G_U) \equiv G\}$ be the set of unknown profiles that together with $G_S$ are consistent with $G$, then $P(G \mid G_S, G_U) = 1$ for $G_U \in C_p$ and 0 otherwise, i.e. $C_p$ is the set of possible unknowns under $H_p$. Similarly, let $C_d = \{(G_{U_1}, G_{U_2}) : (G_{U_1}, G_{U_2}) \equiv G\}$ be the set of pairs of unknown profiles consistent with $G$, i.e. possible pairs of profiles under $H_d$. This partitioning of the set of profiles is equivalent to Assumption 2 in Evett et al. (1998), where the authors argue that the only genotype configurations of interest are those profiles $(G_1, G_2)$ inducing the observation of allelic peaks in $G$, i.e. $P(G \mid G_1, G_2) = 1$ and 0 otherwise. The $LR = P(E \mid H_p)/P(E \mid H_d)$ can be formed as:
$$LR = \frac{\sum_{G_U \in C_p} L(A \mid G_S, G_U) P(G_U)}{\sum_{(G_{U_1}, G_{U_2}) \in C_d} L(A \mid G_{U_1}, G_{U_2}) P(G_{U_1}, G_{U_2})}. \tag{5.4}$$
The $P(G)$ is the profile probability as applied in the regular likelihood ratio (Evett and Weir, 1998), where $P(G)$ may be computed using the $\theta$-correction (Nichols and Balding, 1991; Buckleton et al., 2005). The expression in (5.4) is similar to equations (5) and (6) of Evett et al. (1998), who made a Bayesian formulation of the LR for DNA mixtures.
If a case includes a victim with profile $G_V$, the set $C_p = \{(G_S, G_V) \equiv G\}$ only contains one element, $(G_S, G_V)$. Hence, the likelihood ratio simplifies further
$$LR = \frac{L(A \mid G_S, G_V)}{\sum_{G_U \in C_d} L(A \mid G_V, G_U) P(G_U)},$$
where for this simpler case $C_d = \{G_U : (G_V, G_U) \equiv G\}$.
Table 5.6: Expected peak areas for a two-person mixture (expressed in terms of $\alpha$). The list is minimal such that equivalent combinations up to numeration of alleles are avoided. The expected peak areas are ordered by lexicographic order of the allele designation.
n_s  Observed alleles  Combinations  Expected peak areas in terms of $\alpha$
1    $a^4$             (aa, aa)      $(2)^\top A_{s,+}/2$
2    $a^3 b$           (aa, ab)      $(1+\alpha,\ 1-\alpha)^\top A_{s,+}/2$
                       (ab, aa)      $(2-\alpha,\ \alpha)^\top A_{s,+}/2$
     $a^2 b^2$         (aa, bb)      $(2\alpha,\ 2(1-\alpha))^\top A_{s,+}/2$
                       (ab, ab)      $(1,\ 1)^\top A_{s,+}/2$
3    $a^2 bc$          (aa, bc)      $(2\alpha,\ 1-\alpha,\ 1-\alpha)^\top A_{s,+}/2$
                       (ab, ac)      $(1,\ \alpha,\ 1-\alpha)^\top A_{s,+}/2$
                       (bc, aa)      $(2(1-\alpha),\ \alpha,\ \alpha)^\top A_{s,+}/2$
4    $abcd$            (ab, cd)      $(\alpha,\ \alpha,\ 1-\alpha,\ 1-\alpha)^\top A_{s,+}/2$
                       (ac, bd)      $(\alpha,\ 1-\alpha,\ \alpha,\ 1-\alpha)^\top A_{s,+}/2$
In some cases, the value of $L(A \mid G_S, G_V)$ may be very much lower than the likelihood value for the pair of best matching profiles. This indicates that it is inappropriate to assume that the evidence is a mixture of $G_S$ and $G_V$, even though the profiles $(G_S, G_V)$ are consistent with $G$.
The sums involved in the evaluation of the likelihood ratio will often involve an intractable number of terms depending on the number of loci and number of observed peaks in each locus. As the inclusion of all possible combinations is infeasible, we need at least to include combinations with a numerical impact on the likelihood ratio for the approximation of the true likelihood ratio to be satisfactory for forensic use.
The best matching pair of profiles will provide an estimate, $\hat\alpha$, of the mixture proportion $\alpha$. The expected peak areas in Table 5.6 (expressed in terms of $\alpha$) indicate that alternative combinations need to have an $\alpha$-estimate close to the estimate of the best matching pair in order to have a reasonable fit. We exploit this result when defining our proposal distribution in the section on importance sampling.
5.6 Importance sampling of the likelihood ratio
An exact assessment of the weight of evidence comprises evaluation of every term of the numerator and denominator of (5.4). However, this is infeasible and other methods of evaluating the evidence need to be considered. In this section, we show how importance sampling can be used for estimation of the weight of evidence by assigning weights to the individual combinations. Maimon (2010) also considered importance sampling in a Bayesian context for modelling DNA mixtures.
Let $C_d = \{(G_{U_1}, G_{U_2}) \equiv G\}$, and let $\mathbf{G} = (G_1, G_2)$ refer to a pair of profiles $(G_1, G_2)$. The expression of $P(E \mid H_d)$ can be interpreted as an expectation of $Q$ with respect to the probability measure $P$ on $C_d$:
$$P(E \mid H_d) = \sum_{\mathbf{G} \in C_d} L(A \mid \mathbf{G}) P(\mathbf{G}) = E(h(E); P). \tag{5.5}$$
Hence, simulating combinations $\mathbf{G}$ from $C_d$ with respect to $P$ may be used to estimate $P(E \mid H_d)$. However, simulation with respect to $P$ does not take the quantitative evidence, $Q$, into account and will thus yield a poor estimate of $P(E \mid H_d)$ due to the possibly larger numerical impact from $L(A \mid \mathbf{G})$ compared to $P(\mathbf{G})$ in (5.4). To handle this, we use importance sampling based on the marginal likelihood values of each combination.
Let $q(\mathbf{G}) = \prod_{s\in\mathcal{S}} q_s(\mathbf{G}_s)$, where $\mathbf{G}_s = (G_{1,s}, G_{2,s})$ is the pair of profiles on locus $s$ and
$$q_s(\mathbf{G}_s) = \frac{L(A \mid \mathbf{G}_s, \hat{\mathbf{G}}_{-s}) P(\mathbf{G}_s)}{\sum_{i=1}^{N_s} L(A \mid \mathbf{G}_{s,i}, \hat{\mathbf{G}}_{-s}) P(\mathbf{G}_{s,i})}, \tag{5.6}$$
where $N_s$ is the number of combinations for the observed number of alleles, $(\mathbf{G}_s, \hat{\mathbf{G}}_{-s})$ is the particular combination on locus $s$ merged with the best matching combination, $\hat{\mathbf{G}}$, in the remaining loci, $t \in \{\mathcal{S} \setminus s\}$, and the sum in the denominator is over all possible combinations, $N_s$, in locus $s$ merged with the best matching combination in the remaining loci. Hence, $L(A \mid \mathbf{G}_s, \hat{\mathbf{G}}_{-s})$ is called the marginal likelihood as it gives the likelihood for the particular combination on locus $s$ with the combinations on the remaining loci identical to the best matching pair of profiles. Furthermore, the denominator of (5.6) is a constant, $B_s$, for each locus. Using this proposal distribution, $P(E \mid H_d)$ may be expressed as an expectation with respect to $q$,
$$P(E \mid H_d) = \sum_{\mathbf{G} \in C_d} L(A \mid \mathbf{G}) \frac{P(\mathbf{G})}{q(\mathbf{G})} q(\mathbf{G}) = E(h(E) W(E); q),$$
where $W(E) = P(\mathbf{G})/q(\mathbf{G})$ is the importance weight. Since $P(\mathbf{G}) = \prod_{s\in\mathcal{S}} P(\mathbf{G}_s)$ and $B = \prod_{s\in\mathcal{S}} B_s$, the ratio $L(A \mid \mathbf{G}) P(\mathbf{G})/q(\mathbf{G})$ is nearly constant:
$$\frac{L(A \mid \mathbf{G}) P(\mathbf{G})}{\prod_{s\in\mathcal{S}} \{L(A \mid \mathbf{G}_s, \hat{\mathbf{G}}_{-s}) P(\mathbf{G}_s)\}} \prod_{s\in\mathcal{S}} B_s = \frac{L(A \mid \mathbf{G})\, B}{\prod_{s\in\mathcal{S}} L(A \mid \mathbf{G}_s, \hat{\mathbf{G}}_{-s})},$$
where the product in the denominator in many cases is a good approximation to $L(A \mid \mathbf{G})$. This near constancy of $h(E)W(E)$ improves the performance of importance sampling and reduces the number of samples needed for results with low variance (Robert and Casella, 2004).
In order to estimate $P(E \mid H_d)$, we draw combinations $\mathbf{G}_i$, $i = 1, \ldots, M$, from $q(\mathbf{G})$ and compute the Monte Carlo estimate,
$$\hat{P}(E \mid H_d) = \frac{1}{M} \sum_{i=1}^{M} L(A \mid \mathbf{G}_i) W(\mathbf{G}_i), \quad \mathbf{G}_i \sim q(\mathbf{G}),$$
where $W(\mathbf{G}_i) = P(\mathbf{G}_i)/q(\mathbf{G}_i)$ are the importance weights.
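The sampling scheme can be illustrated on a toy problem that is small enough to sum exactly. Everything in the sketch below is hypothetical: three loci with four candidate combinations each, a made-up misfit score per combination, made-up prior probabilities, and a likelihood that deliberately does not factorise over loci. The proposal is built as in (5.6), by varying one locus at a time around the best matching combination.

```python
import itertools
import math
import random

# Hypothetical misfit scores; index 0 is the best matching combination per locus.
misfit = [[0.0, 0.5, 1.0, 2.0], [0.0, 0.3, 1.5, 2.5], [0.0, 0.8, 1.2, 3.0]]
prior = [[0.4, 0.3, 0.2, 0.1]] * 3          # hypothetical P(G_s) per locus
n_loci, n_comb = 3, 4
best = (0, 0, 0)

def likelihood(combo):
    """Toy likelihood L(A | G); the squared term couples the loci."""
    t = sum(misfit[s][k] for s, k in enumerate(combo))
    return math.exp(-t - 0.1 * t * t)

def prior_prob(combo):
    return math.prod(prior[s][k] for s, k in enumerate(combo))

# Proposal q_s as in (5.6): vary locus s, keep the best combination elsewhere.
q = []
for s in range(n_loci):
    w = [likelihood(best[:s] + (k,) + best[s + 1:]) * prior[s][k] for k in range(n_comb)]
    b_s = sum(w)                             # the constant B_s
    q.append([x / b_s for x in w])

def importance_estimate(m, rng):
    """Monte Carlo estimate of sum_G L(A|G) P(G) using draws from q."""
    total = 0.0
    for _ in range(m):
        combo = tuple(rng.choices(range(n_comb), weights=q[s])[0] for s in range(n_loci))
        q_prob = math.prod(q[s][k] for s, k in enumerate(combo))
        total += likelihood(combo) * prior_prob(combo) / q_prob
    return total / m

exact = sum(likelihood(c) * prior_prob(c)
            for c in itertools.product(range(n_comb), repeat=n_loci))
approx = importance_estimate(20000, random.Random(1))
print(exact, approx)
```

In this toy setting the weighted terms are bounded by the product of the $B_s$, which is the near-constancy property that keeps the variance of the estimate low.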
The estimate is unbiased as the terms are independently simulated from $q(\mathbf{G})$ and all have expectation $E(h(E)W(E); q) = P(E \mid H_d)$. For the variance of $\hat{P}(E \mid H_d)$, we compute
$$\mathrm{Var}(\hat{P}(E \mid H_d)) = \frac{1}{M-1} \sum_{i=1}^{M} \left\{ [L(A \mid \mathbf{G}_i) W(\mathbf{G}_i)]^2 - \hat{P}(E \mid H_d)^2 \right\}.$$
The numerator of the LR, $P(E \mid H_p)$, can be handled similarly, taking into consideration that we are summing over a restricted set of combinations, $C_p$, all including the suspect's profile, $C_p = \{G_U : (G_S, G_U) \equiv G\}$. The greedy algorithm of Figure 5.1 is also applicable when specifying a suspect. We only need another ordering of the observations and a different set of $J$-matrices using the extra information of the suspect's profile. This implies that there exists a best matching combination, $\hat{\mathbf{G}}^{(S)}$, in $C_p$ having the same properties as $\hat{\mathbf{G}}$ for the unrestricted set, $C_d$. Hence, importance sampling may also be used in estimating $P(E \mid H_p)$ with similar formulae as those for estimating $P(E \mid H_d)$.
5.6.1 Example of estimating LR using importance sampling
The best matching pair of profiles for the data in Table 5.4 was given in Table 5.5 and was used for estimating $q(\mathbf{G})$ and the constant $B$. In the computations, we assumed uniform distributions of the allele probabilities. Table 5.7 lists the profile of a fictitious suspect, $G_S$, together with the unknown profile maximising the likelihood with $G_S$ fixed. This pair plays the role of $\hat{\mathbf{G}}^{(S)}$ in this example. In Figure 5.4 the observed and expected peak heights, assuming a mixture of these profiles, are plotted.
Table 5.7: Suspect's STR profile together with best matching STR profile of an unknown person.
Locus D3 vWA D16 D2 D8 D21 D18 D19 TH0 FGA
Suspect 16,16 15,17 11,11 25,25 13,16 30,30 15,17 15,15 8,9 24,24
Unknown 15,15 14,16 10,12 17,19 8,12 29,31 13,13 13,14 6,6 19,23
In order to verify the validity of our methodology and implementation of the importance sampler, we limited our data to include only loci on the blue fluorescent dye band (D3, vWA, D16 and D2). The total number of possible combinations for the blue loci is $7^1 \cdot 12^2 \cdot 6^1 = 6{,}048$ and it is therefore possible to compute the correct value of $P(E \mid H_d) = 0.481335 \times 10^{-10}$. For the suspect's profile specified in Table 5.7, locus D3 is the only blue locus for which it is possible to alter the unknown profile and still have consistency with $G$. Hence, there are only two terms in $P(E \mid H_p)$ when restricting the analysis to the blue dye band. The value of $P(E \mid H_p)$ is $0.225730 \times 10^{-13}$, indicating that the suspect is not likely to be a true contributor to the DNA mixture since $P(E \mid H_p) < P(E \mid H_d)$.
In order to evaluate the performance of the importance sampler, we computed 1,000 estimates of $P(E \mid H_d)$ each based on 10,000 samples. The estimates are plotted together with the correct value in the histogram of Figure 5.5. The distribution of the estimates tends to be skewed for this particular example, but with most of the estimates close to the true value of $P(E \mid H_d)$. The mean of the estimates, $\hat{P}(E \mid H_d)$, is $0.483731 \times 10^{-10}$ with a standard deviation of $0.184432 \times 10^{-11}$. From the central limit theorem we may approximate the (positive) distribution of $\hat{P}(E \mid H_d)$ with a normal distribution and compute an approximate 95%-confidence interval: $[0.122243, 0.8452178] \times 10^{-10}$.
Figure 5.4: Plot of the observed peaks and the expected peaks, assuming a mixture of the suspect and best matching unknown (STR profiles of Table 5.7).
In forensic genetics it is common practice to evaluate the evidence conservatively, meaning that the estimates and approximations are favourable to the suspect/defendant (Balding, 2005). For a conservative LR the estimate of the denominator should be larger than the true value, $\hat{P}(E|H_d) > P(E|H_d)$. However, 66% of the importance sample estimates are smaller than the true value for this particular example. A likely explanation for this is that the sampling scheme places too much of the probability mass close to the best matching pair of profiles. Hence, the (very) large set of less likely combinations is not included in the estimate.
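The effect described above can be illustrated with a toy importance sampler. This is a minimal sketch and not the sampling scheme of the manuscript: the target is simply the sum of many positive terms, and the proposal distribution (an assumption made for illustration) is deliberately concentrated on the largest terms, so that the many small terms are rarely visited.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy target: a sum of K positive terms of very different magnitude.
K = 1000
terms = np.exp(-np.arange(K) / 50.0)
true_sum = terms.sum()

# Hypothetical proposal that over-weights the largest terms (q proportional to terms**2).
q = terms**2 / np.sum(terms**2)

def importance_estimate(n_samples):
    """Unbiased importance sampling estimate of sum(terms)."""
    idx = rng.choice(K, size=n_samples, p=q)
    return np.mean(terms[idx] / q[idx])

estimates = np.array([importance_estimate(10_000) for _ in range(200)])
share_below = np.mean(estimates < true_sum)
print(f"true sum: {true_sum:.2f}")
print(f"median estimate: {np.median(estimates):.2f}")
print(f"share of estimates below the true sum: {share_below:.2f}")
```

The estimator is unbiased, but because the rarely sampled terms carry very large weights, a single run can miss them entirely; the printed share indicates how often that leaves the estimate below the true value in this toy setting.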
5.7 Results
The algorithm was tested on data from 71 controlled two-person mixtures with known profiles. Hence, it was possible to validate the suggested profiles returned by the separation algorithm. Table 5.8 summarises the comparisons with the best matching pair of profiles and the true mixture profiles.
Figure 5.5: Histogram of 1,000 estimates of $P(E|H_d)$, each based on 10,000 samples. (Horizontal axis: estimates, with ticks at $0.5 \times 10^{-10}$, $1 \times 10^{-10}$ and $1.5 \times 10^{-10}$; vertical axis: frequency; the correct value of $P(E|H_d)$ is marked.)
Table 5.8: Detailed summary table with the number of correctly separated loci, x, stratified by mixture ratio. Each block of eight columns gives the number of cases that are correct in x = 3, ..., 10 of the 10 loci.

         Both profiles correct       Major profile correct       Minor profile correct
Ratio    3  4  5  6  7  8  9 10      3  4  5  6  7  8  9 10      3  4  5  6  7  8  9 10
1:1      1  3  0  2  2  2  1  0      1  2  1  2  2  2  1  0      1  3  0  2  2  2  1  0
1:2      0  0  0  0  2  5  7  8      0  0  0  0  0  2  8 12      0  0  0  0  2  4  8  8
1:4      0  0  0  1  3  2  6  7      0  0  0  0  0  0  0 19      0  0  0  1  3  2  6  7
1:8      0  0  0  2  5  4  0  4      0  0  0  0  0  0  0 15      0  0  0  2  5  4  0  4
1:16     0  0  1  0  0  0  0  3      0  0  0  0  0  0  0  4      0  0  1  0  0  0  0  3
Total    1  3  1  5 12 13 14 22      1  2  1  2  2  4  9 50      1  3  1  5 12 12 15 22
From the bottom row of Table 5.8, we see that the separation algorithm returned the true mixture profiles as the best matching combination 22 times. The numbers of cases where one (14 cases), two (13 cases) or three (12 cases) loci were wrongly separated were almost the same. In five cases, half or fewer of the loci were correctly separated.
In 50 cases, the true major profile was correctly identified, and in another 13 there was inconsistency in at most two loci between the major profile of the best matching pair and the true major profile. Furthermore, Table 5.8 shows that the eight remaining cases with incorrect identification of the major profile had mixture ratio 1:1. Hence, in these cases, there were no obvious major profiles as the amounts of DNA contributed were (almost) equal. Furthermore, for 1:1 mixtures, there are many pairs of profiles yielding similar goodness of fit to the observed peak intensities, as previously exemplified in Section 5.4.2. The algorithm is less successful in identifying the minor profile. However, in most cases, the minor profile was separated correctly in seven or more loci.
In addition to the 71 DNA mixtures from controlled experiments, the separation algorithm was used to separate 64 two-person DNA mixtures from real crime cases. For each of the 64 crime cases the laboratory had two reference samples that were consistent with the observed stain. Three experienced forensic geneticists tried to identify both the major and minor profiles of each mixture without knowing its true profiles (blinded experiment). In Table 5.9, the results from the separation using the separation algorithm are compared to those of the forensic geneticists.
Table 5.9: Comparison of the performance of the separation algorithm and forensic geneticists. The counts show the number of loci with the minor and major profiles correctly identified.

                 Geneticists        Algorithm
Correct loci     Minor   Major      Minor   Major
10                 8      31         16      36
9                 16       8         16       9
8                 13       8         14       5
7                 13       4         10       6
6                  6       7          1       2
5                  3       2          3       2
4                  2       2          2       2
3                  2       1          2       2
2                  0       0          0       0
1                  1       1          0       0
The total number of correctly separated mixtures was 16 for the separation algorithm and 8 for the forensic geneticists. The samples where the minor contributor was correctly identified in all loci also had the major component correct (see Table 5.9). As for the controlled experiments, the success rate depended on the mixture ratio, with the number of correctly separated loci decreasing as $\alpha$ increased towards 0.5.
Furthermore, it should be noted that the forensic geneticists were forced to call some pairs of profiles, resulting in some inconclusive statements. That is, the forensic geneticists were forced to deduce major and minor profiles in cases where the regular protocol of the laboratory would not support the separation of profiles.
5.8 Discussion
Using the quantitative information from STR DNA analysis in terms of peak intensities is presently the only way to separate STR mixture results. Based on a statistical model, we developed a simple greedy algorithm for finding the best matching pair of profiles.
Our model is based on few assumptions that are widely accepted among forensic geneticists. The statistical model made it possible to make objective comparisons of various combinations by evaluating the likelihood values. From the normal distribution assumption, this value is computed by $N\hat{\tau}^2$, which implies that the lower the estimate, the better the concordance between observed and expected peaks.
Importance sampling was used in order to estimate the likelihood ratio since this becomes computationally difficult when $7^{S_2} 12^{S_3} 6^{S_4}$ terms need to be evaluated in the denominator of the LR with $H_d: (G_{U_1}, G_{U_2})$. The method proved to be efficient, and future work will consist of implementing sampling schemes that explore more of the sample space. Such an implementation would ideally result in fewer estimates that are less than the true value.
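The factors 7, 12 and 6 in the number of terms can be reproduced by brute-force enumeration. The sketch below assumes two unknown contributors with distinguishable roles (ordered pairs of genotypes), no drop-out, and that every observed peak must be explained by at least one contributor; the function names and the example locus counts are illustrative.

```python
from itertools import combinations_with_replacement, product

def count_genotype_pairs(n_peaks):
    """Number of ordered genotype pairs (G_U1, G_U2) explaining exactly n_peaks alleles."""
    alleles = set(range(n_peaks))
    genotypes = list(combinations_with_replacement(sorted(alleles), 2))
    return sum(
        1
        for g1, g2 in product(genotypes, repeat=2)
        if set(g1) | set(g2) == alleles
    )

def number_of_terms(s2, s3, s4):
    """Terms in the sum when s2, s3 and s4 loci show 2, 3 and 4 peaks."""
    return (
        count_genotype_pairs(2) ** s2
        * count_genotype_pairs(3) ** s3
        * count_genotype_pairs(4) ** s4
    )

print([count_genotype_pairs(k) for k in (2, 3, 4)])
# Hypothetical ten-locus profile with 3, 5 and 2 loci showing 2, 3 and 4 peaks.
print(number_of_terms(3, 5, 2))
```

Even for ten loci the product runs into the billions, which is why exhaustive evaluation quickly becomes impractical.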
5.9 Conclusion
By using the greedy algorithm of Section 5.4.1, we demonstrated that it is possible to automate the separation of DNA mixtures. However, due to the assumption of no occurrence of drop-out or stutters, the model may be too simple for more complicated cases. Hence, this methodology is applicable to cases where the analysis today is standard but time-consuming. This allows the forensic geneticists to focus on more complex crime cases.
Future work comprises the development of a methodology for handling drop-outs and stutters. Since stutters are profile independent (stutters from parental peaks are constant for all alleged combinations of profiles), it is possible to remove stutters from the data prior to separation and interpretation. Allowing for drop-outs while finding a best matching pair of profiles is also possible. Using the methodology of Tvedebrink et al. (2009), the probability of drop-out, $P(D|\hat{H})$, is assessed conditioned on a given profile.
Acknowledgements
The authors would like to thank Dr. Jakob Larsen and Dr. Frederik Torp Petersen (both from
The Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health Sci-
ences, University of Copenhagen, Denmark) for their assistance in manually analysing the DNA
mixtures from real crime cases.
Appendix
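5.A The general case with m contributors
In the general case with $m$ contributors to the mixed stain, our method can be generalised by assuming the mixture proportions $\alpha_1, \ldots, \alpha_m$ to be strictly increasing,

$$\alpha_1 < \cdots < \alpha_{m-1} < \alpha_m = 1 - \alpha_+, \qquad \alpha_+ = \sum_{i=1}^{m-1} \alpha_i. \tag{5.7}$$

The conditional covariance structure is the same as specified in (5.1), where the conditional mean is:

$$E(A_s \mid A_{s,+}) = \frac{A_{s,+}}{2}\left[\sum_{i=1}^{m-1} \alpha_i P_{s,i} + P_{s,m}\left(1 - \sum_{i=1}^{m-1} \alpha_i\right)\right] = X_{s,m} + \sum_{i=1}^{m-1} \alpha_i X_{s,i}, \tag{5.8}$$

where $X_{s,i} = (P_{s,i} - P_{s,m})A_{s,+}/2$ for $i = 1, \ldots, m-1$ and $X_{s,m} = P_{s,m}A_{s,+}/2$. In order to find the MLE of $\alpha = (\alpha_i)_{i=1}^{m-1}$, we solve the likelihood equations for $\ell(\alpha, \tau^2; (A_s)_{s \in S})$ with respect to $\alpha$. This implies that the MLE of $\alpha$ is:

$$\hat{\alpha} = \left[\sum_{s \in S} X_s^{\top} W_s^{-} X_s\right]^{-1} \left[\sum_{s \in S} X_s^{\top} W_s^{-} (A_s - X_{s,m})\right].$$

Furthermore, the estimate of $\tau^2$ in the general setting is

$$\hat{\tau}^2 = N^{-1} \sum_{s \in S} (A_s - X_s\hat{\alpha} - X_{s,m})^{\top} W_s^{-} (A_s - X_s\hat{\alpha} - X_{s,m}),$$

where $N = n_+ - S - m + 1 = \sum_{s \in S}(n_s - 1) - (m - 1)$.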
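The estimators above amount to a weighted least squares fit, which can be sketched numerically. The code below is a minimal illustration and makes several assumptions: the weight matrices are passed in by the caller and default to the identity (the covariance structure of (5.1) is not reproduced here), each locus is given as a profile matrix with one column per contributor together with its peak areas, and the function names are hypothetical. The example data are those of Table 5.10.

```python
import numpy as np

def design(P, A):
    """Return (X_s, X_sm) for one locus.

    P: (n_s, m) matrix of allele counts per contributor; A: (n_s,) peak areas.
    """
    half_total = A.sum() / 2.0
    X = (P[:, :-1] - P[:, [-1]]) * half_total  # columns X_{s,i}, i = 1..m-1
    X_m = P[:, -1] * half_total                # X_{s,m}
    return X, X_m

def estimate(loci, weights=None):
    """Weighted least squares estimates of alpha and tau^2.

    loci: list of (P, A) pairs; weights: list of weight matrices
    (identity if None, which is an assumption of this sketch).
    """
    m = loci[0][0].shape[1]
    lhs = np.zeros((m - 1, m - 1))
    rhs = np.zeros(m - 1)
    parts = []
    for k, (P, A) in enumerate(loci):
        X, X_m = design(P, A)
        W = np.eye(len(A)) if weights is None else weights[k]
        lhs += X.T @ W @ X
        rhs += X.T @ W @ (A - X_m)
        parts.append((X, X_m, W, A))
    alpha = np.linalg.solve(lhs, rhs)
    N = sum(len(A) - 1 for _, A in loci) - (m - 1)
    rss = sum(
        (A - X @ alpha - X_m) @ W @ (A - X @ alpha - X_m)
        for X, X_m, W, A in parts
    )
    return alpha, rss / N

# Data of Table 5.10: one locus, four peaks, three contributors.
areas = np.array([200.0, 450.0, 550.0, 800.0])
P_comb1 = np.array([[1, 0, 0], [0, 1, 0], [1, 0, 1], [0, 1, 1]])
P_comb2 = np.array([[1, 0, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1]])

alpha_1, tau2_1 = estimate([(P_comb1, areas)])
alpha_2, tau2_2 = estimate([(P_comb2, areas)])
print(alpha_1, tau2_1)
print(alpha_2, tau2_2)
```

Both combinations reproduce the observed areas without residual error, so the fit alone cannot separate them.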
5.A.1 Greedy algorithm
The greedy algorithm of Figure 5.1 needs only a few modifications to be applicable to the general case. Most important is the specification of the number of contributors, $m$. This needs to be decided before running the algorithm. For the algorithm to be successful, there should preferably be at least one locus with $2m$ peaks as this increases the confidence in the estimate $\hat{\alpha}$. The modified greedy algorithm for $m$ contributors to a DNA mixture is given in Figure 5.6.
Furthermore, it is necessary to check if the $\alpha$-estimate satisfies the inequalities of (5.7) for each combination. In Table 5.10, we list fictitious data together with two combinations both implying a perfect fit. Both matrices are valid as the orders in the $\alpha$-sum columns satisfy the condition (5.7). However, for Combination 1, the estimate $\hat{\alpha}_1 = (0.2, 0.45)$ does not satisfy (5.7), while the estimate for Combination 2, $\hat{\alpha}_2 = (0.2, 0.35)$, does. Hence, Combination 2 is chosen over Combination 1.
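The ordering check of (5.7) is simple to automate. The helper below is a hypothetical sketch: it takes the first $m-1$ estimated proportions, appends the last one as one minus their sum, and tests for strict increase.

```python
def satisfies_ordering(alpha_hat):
    """Check alpha_1 < ... < alpha_{m-1} < alpha_m, with alpha_m = 1 - sum(alpha_hat)."""
    full = list(alpha_hat) + [1.0 - sum(alpha_hat)]
    return all(a < b for a, b in zip(full, full[1:]))

print(satisfies_ordering((0.2, 0.45)))  # Combination 1: False
print(satisfies_ordering((0.2, 0.35)))  # Combination 2: True
```

For Combination 1 the implied third proportion is 0.35, which is smaller than the second, so the combination is rejected.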
Algorithm: Find best matching set of m profiles
  Specify the number $m$ of contributors.
  Let $T = \emptyset$, $\hat{\alpha} = 0$ and $\hat{\tau}^2 = \infty$.
  While $\hat{\tau}^2$ decreases or $T \subset S$
    For $i \in \{2m, \ldots, 2\}$
      For $s \in S_i = \{s : s \in S \text{ and } n_s = i\}$
        Choose combination $j \in J_i$ minimising $\hat{\tau}^2$ and satisfying restrictions of (5.7)
        Set $T = \{T \setminus (s, \cdot)\} \cup (s, j)$ and compute $\hat{\alpha}$
  Return $\hat{\alpha}$, $\hat{\tau}$ and $T$.
Figure 5.6: Greedy algorithm for finding a set of profiles (locally) maximising the likelihood of (5.8).

Table 5.10: Fictitious data showing the importance of ensuring (5.7) is satisfied.

         Combination 1                          Combination 2
Area     P_1  P_2  P_3  α-sum                   P_1  P_2  P_3  α-sum
200      1    0    0    $\alpha_1$              1    0    0    $\alpha_1$
450      0    1    0    $\alpha_2$              0    0    1    $1 - \alpha_+$
550      1    0    1    $1 - \alpha_2$          1    1    0    $\alpha_1 + \alpha_2$
800      0    1    1    $1 - \alpha_1$          0    1    1    $1 - \alpha_1$
In Figure 5.7, the greedy algorithm of Figure 5.6 is described by a diagram emphasising the various steps in the procedure of finding the best matching combination.
In step A, the parameters $\alpha$ and $\tau$ are estimated using only the loci with $2m$ observed peaks. Step B determines the profile combination (see step D) on the current locus that minimises $\tau$ given the combinations on the already visited loci. The algorithm visits the blocks of loci with equal numbers of observed alleles in decreasing order: $2m-1, \ldots, 2$. If any of the blocks is empty, the algorithm skips forward to the next non-empty block. The order within each block of loci with $2m-i$ observed peaks is arbitrary. When reaching the last locus, the combination and estimates of $\alpha$ and $\tau$ are saved.
In step C, the algorithm visits each locus searching for a combination that might decrease $\tau$ with all remaining loci combinations fixed. If $\tau$ is unchanged the algorithm stops. Otherwise step C is looped until a fixed $\tau$-value is obtained. On termination the algorithm returns the combination and estimates of $\alpha$ and $\tau$.
Step D pictures that, for each locus with fewer than $2m$ peaks, there are several combinations of profiles that need to be investigated. In the figure, one symbol depicts a combination and another symbolises the current optimal configuration. The black arrow shows which combination is currently tested. When all the combinations are tested the one with the smallest $\tau$ is returned.
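The control flow of steps B and C can be sketched in a few lines. This is a simplified illustration, not the implementation used in the manuscript: the objective is an arbitrary `score` function supplied by the caller (in the actual method it would be the variance estimate after re-estimating the mixture proportions and checking (5.7)), and the loci, candidate options and toy objective below are hypothetical.

```python
def greedy_search(order, candidates, score):
    """Greedy single-locus search.

    order: loci in visiting order (e.g. by decreasing number of peaks).
    candidates: dict mapping each locus to its list of candidate combinations.
    score: callable taking a dict {locus: combination}; lower is better.
    """
    chosen = {}
    # Step B: first pass, fixing one locus at a time given the loci already visited.
    for locus in order:
        chosen[locus] = min(
            candidates[locus],
            key=lambda option: score({**chosen, locus: option}),
        )
    best = score(chosen)
    # Step C: revisit the loci until no single-locus change lowers the score.
    improved = True
    while improved:
        improved = False
        for locus in order:
            for option in candidates[locus]:
                trial = score({**chosen, locus: option})
                if trial < best:
                    chosen[locus], best = option, trial
                    improved = True
    return chosen, best

# Toy problem: pick one number per locus so that the total is as close to 6 as possible.
toy_candidates = {"L1": [0, 1, 2], "L2": [0, 5], "L3": [1, 3]}
toy_score = lambda assignment: abs(sum(assignment.values()) - 6)
solution, value = greedy_search(["L1", "L2", "L3"], toy_candidates, toy_score)
print(solution, value)
```

In the toy run the first pass ends with a score of 2, and the first sweep of step C repairs the choice at the first locus, which mirrors how the algorithm can revise early decisions once all loci have been assigned.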
Figure 5.7: Diagram describing the greedy algorithm for resolving DNA mixtures (panels A to D; the loci are grouped in blocks with $2m$, $2m-1$, ..., $2m-i$, ..., 2 observed peaks). The shaded boxes show the loci previously visited by the algorithm. The bold lined box shows the current locus under investigation.
Bibliography
Balding, D. J. (2005). Weight-of-evidence for Forensic DNA Profiles. Chichester, West Sussex: John Wiley & Sons, Ltd.
Bill, M. et al. (2005). PENDULUM - a guideline-based approach to the interpretation of STR mixtures. Forensic Science International 148, 181–189.
Buckleton, J. S., C. M. Triggs, and S. J. Walsh (2005). Forensic DNA evidence interpretation, pp. 217–274. Boca Raton, FL: CRC Press.
Butler, J. M. (2005). Forensic DNA Typing: Biology, Technology, and Genetics of STR Markers (2 ed.). Burlington, MA: Elsevier Academic Press Inc., U.S.
Clayton, T. M., J. P. Whitaker, R. Sparkes, and P. D. Gill (1998). Analysis and interpretation of mixed forensic stains using DNA STR profiling. Forensic Science International 91, 55–70.
Cowell, R. G., S. L. Lauritzen, and J. Mortera (2007a). A gamma model for DNA mixture analyses. Bayesian Analysis 2(2), 333–348.
Cowell, R. G., S. L. Lauritzen, and J. Mortera (2007b). Identification and separation of DNA mixtures using peak area information. Forensic Science International 166, 28–34.
Cox, D. R. (1958). Some problems connected with statistical inference. Annals of Mathematical Statistics 29(2), 357–372.
Curran, J. M. (2008). A MCMC method for resolving two person mixtures. Science & Justice 48, 168–177.
Evett, I. W., P. D. Gill, and J. A. Lambert (1998). Taking account of peak areas when interpreting mixed DNA profiles. Journal of Forensic Sciences 43(1), 62–69.
Evett, I. W. and B. S. Weir (1998). Interpreting DNA Evidence: Statistical Genetics for Forensic Scientists. Sunderland, MA: Sinauer Associates.
Gill, P. D. et al. (1998). Interpreting simple STR mixtures using allele peak areas. Forensic Science International 91(1), 41–53.
Gill, P. D. et al. (2006). DNA commission of the International Society of Forensic Genetics: Recommendations on the interpretation of mixtures. Forensic Science International 160(2-3), 90–101.
Maimon, G. (2010). A Bayesian approach to the statistical interpretation of DNA evidence. Ph.D. thesis, Department of Mathematics and Statistics, McGill University, Montreal, Canada.
Nichols, R. A. and D. J. Balding (1991). Effects of population structure on DNA fingerprint analysis in forensic science. Heredity 66, 297–302.
Perlin, M. W. and B. Szabady (2001). Linear mixture analysis: A mathematical approach to resolving mixed DNA samples. Journal of Forensic Science 46(6), 1372–1378.
Robert, C. P. and G. Casella (2004). Monte Carlo Statistical Methods (2 ed.). Springer.
Tvedebrink, T., P. S. Eriksen, H. S. Mogensen, and N. Morling (2009). Estimating the probability of allelic drop-out of STR alleles in forensic genetics. Forensic Science International: Genetics 3(4), 222–226.
Tvedebrink, T., P. S. Eriksen, H. S. Mogensen, and N. Morling (2010). Evaluating the weight of evidence using quantitative STR data in DNA mixtures. Journal of the Royal Statistical Society. Series C, Applied Statistics. In Press.
Wang, T., N. Xue, and J. D. Birdwell (2006). Least-square deconvolution: A framework for interpreting short tandem repeat mixtures. Journal of Forensic Science 51(6), 1284–1297.
5.10 Supplementary remarks
In the above manuscript only two-person mixtures were analysed in practice, but the appendix demonstrated how to extend the model and algorithm to handle m-person mixtures. The Section of Forensic Genetics, University of Copenhagen, also prepared three-person mixtures. Five different DNA profiles were mixed in trios in the mixture ratios 1:2:4. The five DNA profiles are listed in Table 5.11. There are $\binom{5}{3} = 10$ different triple-wise combinations and each triple is analysed in six different mixture ratios (permutations of the three profiles). This gives 120 samples since each case is analysed in duplicate. However, 17 samples were discarded due to pipetting and amplification errors, leaving 103 samples to be analysed.
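The sample count can be verified by enumerating the experimental design. The sketch below only reproduces the arithmetic of the paragraph above; the variable names are illustrative.

```python
from itertools import combinations, permutations

profiles = ["A", "B", "C", "D", "E"]

# All triples of profiles, and every assignment of the ratios 1:2:4 within a triple.
triples = list(combinations(profiles, 3))
designs = [ordering for triple in triples for ordering in permutations(triple)]

n_planned = 2 * len(designs)   # each design analysed in duplicate
n_analysed = n_planned - 17    # samples discarded due to laboratory errors
print(len(triples), len(designs), n_planned, n_analysed)
```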
Table 5.11: The five DNA profiles used in the three-person mixtures.

     D3      vWA     D16     D2      D8      D21        D18     D19       TH0     FGA
A    14,18   17,19   12,14   20,24   10,13   30.2,32.2  13,13   12,13     8,9     20,22
B    17,18   14,17   9,9     17,23   13,15   28,28      14,19   14,15.2   8,9.3   20,24
C    16,18   16,19   10,13   16,23   11,14   31,32.2    15,19   12,15     9,9.3   20,24
D    15,18   16,18   9,11    20,21   12,13   29,32.2    13,14   13,14     6,7     22,22
E    15,19   15,17   12,13   16,19   12,13   27,30      13,15   13,14     9,9.3   19,25
The on-line implementation is programmed such that it handles the analysis of single source stains as well as two- and three-person mixtures. In Figure 5.8 the peak intensities for a mixture of profiles B, D and C (see Table 5.11) are plotted together with the expected values for the best matching combination.
Since we know the true profiles, we are able to compare the best matching combination with the true profiles as for the two-person mixtures. In Table 5.12 the three inferred profiles are listed. The major profile coincides with profile B while the mid profile differs from profile D by one allele in locus D18. The minor profile has five correct and four partially correct loci compared to profile C.
Table 5.12: The estimated profiles from the separation of the three-person mixture of Figure 5.8. The major profile coincides with profile B in all loci, the mid profile differs by one allele from profile D in locus D18, and the minor is correctly identified in five loci (compared to profile C).

Locus           D3      vWA     D16     D2      D8      D21       D18     D19       TH0     FGA
Minor profile   16,17   17,19   10,13   16,17   11,14   28,31     14,14   12,15     9,9.3   20,24
Mid profile     15,18   16,18   9,11    20,21   12,13   29,32.2   13,15   13,14     6,7     22,22
Major profile   17,18   14,17   9,9     17,23   13,15   28,28     14,19   14,15.2   8,9.3   20,24
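The comparison between inferred and true profiles can be reproduced mechanically. The sketch below counts, locus by locus, how many alleles the inferred genotype shares with the true genotype (two for a full match, one for a partial match); the profiles are copied from Tables 5.11 and 5.12, while the helper names are hypothetical.

```python
from collections import Counter

loci = ["D3", "vWA", "D16", "D2", "D8", "D21", "D18", "D19", "TH0", "FGA"]

def parse(row):
    """Turn a whitespace-separated row of 'a,b' genotypes into a list of tuples."""
    return [tuple(genotype.split(",")) for genotype in row.split()]

# True profiles (Table 5.11) and inferred profiles (Table 5.12).
true_C = parse("16,18 16,19 10,13 16,23 11,14 31,32.2 15,19 12,15 9,9.3 20,24")
true_D = parse("15,18 16,18 9,11 20,21 12,13 29,32.2 13,14 13,14 6,7 22,22")
minor = parse("16,17 17,19 10,13 16,17 11,14 28,31 14,14 12,15 9,9.3 20,24")
mid = parse("15,18 16,18 9,11 20,21 12,13 29,32.2 13,15 13,14 6,7 22,22")

def shared_alleles(inferred, truth):
    """Number of alleles shared per locus (multiset intersection, 0 to 2)."""
    return [
        sum((Counter(a) & Counter(b)).values())
        for a, b in zip(inferred, truth)
    ]

minor_shared = shared_alleles(minor, true_C)
mid_shared = shared_alleles(mid, true_D)

n_full = minor_shared.count(2)
n_partial = minor_shared.count(1)
mid_mismatch = [locus for locus, k in zip(loci, mid_shared) if k < 2]
print(n_full, n_partial, mid_mismatch)
```

Using a multiset intersection means that homozygous genotypes such as 9,9 are counted as two matching alleles rather than one.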
Figure 5.8: Three-person DNA mixture of profiles B, D and C in mixture ratio 4:2:1 (see Table 5.11).
The performance of the mixture separator for the three-person DNA mixtures is summarised in Table 5.13. For each three-person mixture the numbers of correctly (both alleles correctly identified) and semi-correctly (exactly one allele correctly identified) separated loci are computed. This is done separately for the major, mid and minor profiles, where the medians of the corresponding amounts of DNA for these classifications are 335 pg, 168 pg and 84 pg. The first count in each cell refers to the major profile, the second to the mid profile and the last to the minor component.
In 76 cases (73.8%) the major profile was correctly identified on at least eight loci (and partially correct on the remaining ones), while 52 cases (50.5%) had the mid profile correct on at least six loci. The success rate for the minor component was unsatisfactorily low. However, the low amounts of DNA compared to the other two components imply that the contributions from the minor profile are within the limits of variation one would expect for the larger peak intensities. That is, the imbalances induced by adding the fraction from the minor component to the peaks of the mid and major profiles are masked by the variability of these peaks.
The authors have, in collaboration with Aalborg University and University of Copenhagen, applied for a patent for the intellectual rights of the mixture separating algorithm presented above:
Name of invention: A Computer-Assisted Method of Analyzing a DNA Mixture.
Application details: U.S. Provisional Application 61/148221 filed Jan. 29, 2009.
Table 5.13: Summary table of the separation of three-person mixtures. Each cell gives the major/mid/minor counts for the number of correctly identified loci, stratified on matches and partial matches. Rows are the number of full matches and columns the number of partial matches.

      0       1       2       3       4       5       6       7       8       9       10
0     0/0/0   0/0/0   0/0/0   0/0/0   0/0/0   0/0/0   0/0/0   0/0/0   0/1/0   0/1/1   0/0/0
1     0/0/0   0/0/0   0/0/0   0/0/0   0/0/0   0/1/1   0/0/1   0/1/3   0/0/1   0/0/1
2     0/0/0   0/0/0   0/0/0   0/0/0   0/0/0   0/1/2   0/2/6   0/5/3   0/2/5
3     0/0/0   0/0/0   0/0/0   0/0/0   0/2/5   0/2/6   0/2/10  0/5/4
4     0/0/0   0/0/0   0/0/1   0/1/1   0/4/5   0/2/7   0/4/6
5     0/0/0   0/0/0   0/2/1   1/1/0   0/5/8   3/7/6
6     0/0/0   0/0/0   0/1/0   2/6/8   8/9/4
7     0/0/0   0/2/0   0/6/2   13/8/2
8     0/0/0   0/2/0   26/11/3
9     0/0/0   25/6/0
10    25/1/0
CHAPTER 6
Estimating the probability of allelic drop-out of STR alleles in forensic genetics
Publication details
Co-authors: Poul Svante Eriksen, Helle Smidt Mogensen and Niels Morling
Affiliations: Department of Mathematical Sciences, Aalborg University; Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health Science, University of Copenhagen
Journal: Forensic Science International: Genetics 3 (2009) 222-226
DOI: doi:10.1016/j.fsigen.2009.02.002
Abstract:
In crime cases with available DNA evidence, the amount of DNA is often sparse due to the setting of the crime. In such cases, allelic drop-out of one or more true alleles in STR typing is possible. We present a statistical model for estimating the per locus and overall probability of allelic drop-out using the results of all STR loci in the case sample as reference. The methodology of logistic regression is appropriate for this analysis, and we demonstrate how to incorporate this in a forensic genetic framework.
Keywords:
Drop-out probability; forensic genetics; logistic regression; STR.
6.1 Introduction
When assessing the weight of the evidence of STR typing in forensic genetics, the arguments depend on the observable alleles in the crime stain. However, due to technical and biochemical issues, it is possible that a true allele in the sample is not detected by the genetic typing method, i.e. allelic drop-out (Gill et al., 2006). The probability of this event will affect the weight of evidence, with a decrease in the power of discrimination as the drop-out probability increases, since fewer individuals can be excluded as possible contributors.
It is well-known that in samples of high quality, i.e. high amount of DNA (for all contributors if it is a mixture) and no contamination or degradation, the probability of observing a drop-out is practically zero. Using logistic regression, we formalised this intuition by using the results of all STR loci in the sample as an indicator of the amount of DNA. The statistical analysis showed that the drop-out probability is locus dependent.
The DNA commission of the ISFG stressed the importance of considering allelic drop-out in the recommendation on mixture interpretation (Gill et al., 2006, recommendation 7). In recommendation 7, the intuition of the logistic model was explained, but how to assess $P(D)$ was not formalised. The estimation of $P(D)$ is important because it influences the estimation of the weight of the evidence in the calculation of the likelihood ratio (LR).
6.2 Material and methods
6.2.1 Data
The analysis was based on 175 controlled experiments conducted at The Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health Sciences, University of Copenhagen, Denmark. The experiments consisted of pairwise mixtures of four profiles and samples with only one contributor diluted in water.
Genomic DNA from blood samples from two males and two females was extracted by a standard phenol-chloroform extraction method. DNA was quantified in triplicate using the Quantifiler Human DNA Quantification kit (Applied Biosystems) with Human Genomic DNA Male (Promega) as the quantification standard on an ABI Prism 7000. The median DNA concentrations were used. Each sample was diluted in water to 500 pg DNA/µl. The DNA concentrations in the diluted samples were measured again in triplicate and the median DNA concentration was used.
Six two-person mixtures of DNA (w/v) in proportions 16:1, 8:1, 4:1, 2:1, 1:1, 1:2, 1:4, 1:8 and 1:16 were made of DNA from each of the four persons. The amount of DNA from each person in the mixtures was calculated based on the DNA concentration in each sample. Each of the four samples was serially diluted with water in the proportions 16:1, 8:1, 4:1, 2:1 and 1:1.
The amount of DNA in each mixture ranged from 328 to 528 pg DNA, and from 24.6 to 410 pg DNA in the diluted samples. The samples were amplified twice with the AmpFlSTR SGM Plus kit (Applied Biosystems) as recommended by the manufacturer in an ABI GeneAmp 9700 PCR thermocycler.
One µl of the amplificates in 15 µl HiDi Formamide (Applied Biosystems) was analysed on an ABI Prism 3100 Genetic Analyzer using POP4 as the polymer and 5 kV injection voltage for 6 seconds. DNA fragments were detected and fragment sizes were estimated with GeneScan 3.7 with a detection threshold of 50 rfu. Genotypes were assigned using GenoTyper 3.7 with the Kazam macro (Applied Biosystems) with no stutter filter applied.
We excluded all alleles in stutter positions of true alleles to avoid complications of masked drop-outs due to stutter effects. Table 6.1 presents the number of observed alleles, drop-outs and the proportion of drop-outs for each locus.
Table 6.1: Observed drop-outs in the data set stratified by locus. All drop-outs were single contributor alleles.

             D3     vWA    D16    D2     D8     D21    D18    D19    TH0    FGA
Observed     306    356    322    398    362    375    315    220    258    313
Drop-outs    10     11     11     14     11     7      10     10     17     18
Proportion   0.03   0.03   0.03   0.04   0.03   0.02   0.03   0.05   0.07   0.06
There was a tendency for the high molecular weight loci to have more drop-outs than the remaining ones within each fluorescent dye colour. This indicates a locus dependence of the probability of drop-out.
6.2.2 Logistic regression model
Let $D$ be the event "The contributor's allele has dropped out", and $\bar{D}$ the event that no drop-out occurs, implying that $P(\bar{D}) = 1 - P(D)$. For evidence evaluation, we are interested in quantifying the probability of allelic drop-out $P(D)$. As mentioned in Section 6.1, we wish to model this probability conditioned on the observed stain.

Figure 6.1: Locus specific logistic curves (solid) together with an overall estimate (dashed). The plot is on log-scale ensuring $P(D|\hat{H} = 0) = 1$. At each panel box-plots are added, summarising the empirical distribution of $\hat{H}$ for $D$ and $\bar{D}$. (Panels: D3, vWA, D16, D2, D8, D21, D18, D19, TH0 and FGA; horizontal axis: $\hat{H}$ with ticks at 20, 50, 150, 400, 1100 and 3000; vertical axis: $P(D|\hat{H})$.)
We define $H$ as the sum of observed peak heights divided by a sum of indicators with value two for homozygous alleles and one for heterozygous, i.e. for $h_i$ being the $i$th height measurement, $H = (n_{het} + 2n_{hom})^{-1}\sum_{i=1}^{n} h_i$, where $n = n_{het} + n_{hom}$ is the number of heterozygous and homozygous alleles in the profile. This was previously demonstrated to be a good proxy for the amount of DNA contributed to a stain (Tvedebrink et al., 2010). If the stain is a mixture assumed to have $K$ contributors, we only use the alleles where person $k$, $k = 1, \ldots, K$, is a single contributor for estimating $H^{(k)}$. We use $\hat{H}$ as a summary statistic for the observed stain in our analysis and use logistic regression to model $P(D|\hat{H})$, where $\hat{H}$ is found from $H$ as (for $K = 2$),

$$P(D \mid \hat{H}) = \begin{cases} P(D \mid H), & \text{Non-shared het allele} \\ P(D \mid 2H), & \text{Non-shared hom allele} \\ P(D \mid H^{(1)} + H^{(2)}), & \text{Shared het allele,} \end{cases}$$

where $H^{(1)}$ and $H^{(2)}$ may be weighted by 2 if the contributors of the shared alleles are homozygous.
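The proxy and the three cases above translate directly into code. The following is a minimal sketch with hypothetical function names and made-up peak heights; it assumes one peak height per observed allele of a single contributor.

```python
def dna_proxy(peak_heights, n_het, n_hom):
    """H = sum of peak heights / (n_het + 2 * n_hom)."""
    if len(peak_heights) != n_het + n_hom:
        raise ValueError("expected one peak height per observed allele")
    return sum(peak_heights) / (n_het + 2 * n_hom)

def effective_proxy(H, kind, H_other=0.0):
    """Value plugged into P(D | .) for the three cases with K = 2 contributors."""
    if kind == "non_shared_het":
        return H
    if kind == "non_shared_hom":
        return 2 * H
    if kind == "shared_het":
        return H + H_other
    raise ValueError(f"unknown allele kind: {kind}")

# Made-up example: two heterozygous peaks and one homozygous peak.
H = dna_proxy([100.0, 120.0, 210.0], n_het=2, n_hom=1)
print(H)
print(effective_proxy(H, "non_shared_hom"))
print(effective_proxy(H, "shared_het", H_other=50.0))
```

The optional doubling for homozygous contributors of a shared allele is left to the caller in this sketch.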
Logistic regression is a standard way to estimate the probabilities for a dichotomous response variable when explanatory variables are assumed to change the probability of the event (McCullagh and Nelder, 1989). The logistic model is particularly simple in this case since we only have one explanatory variable, $\hat{H}$,

$$P(D \mid \hat{H}) = \frac{\exp(\beta_0 + \beta_1 \log \hat{H})}{1 + \exp(\beta_0 + \beta_1 \log \hat{H})},$$

where $\beta_1$ proved to be negative such that $P(D|\hat{H})$ decreases as $\hat{H}$ increases, and the use of $\log \hat{H}$ rather than $\hat{H}$ ensures that, with $\beta_1$ being negative, $P(D|\hat{H} = 0) = 1$. When we condition on $\hat{H}$, we assume that the events of two allelic drop-outs of the same contributor are independent, which is also an underlying assumption of the logistic regression. That is, $P(D_1, D_2|\hat{H}) = P(D_1|\hat{H})P(D_2|\hat{H})$, where $D_i$: "Allele $i$ of the contributor with DNA proxy $\hat{H}$ has dropped out".
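The model can be evaluated in a few lines. The sketch below uses the locus specific intercepts reported in Table 6.2 and assumes a common slope of -4.35 (the text states that the slope is negative); the function names are illustrative, and the numbers are only meaningful for the laboratory set-up described in Section 6.2.1.

```python
import math

# Locus specific intercepts from Table 6.2; common slope assumed to be -4.35.
BETA0 = {
    "D3": 18.26, "vWA": 18.43, "D16": 18.75, "D2": 18.31, "D8": 18.28,
    "D21": 17.45, "D18": 18.07, "D19": 19.40, "TH0": 19.40, "FGA": 19.21,
}
BETA1 = -4.35

def p_dropout(H, locus):
    """P(D | H) for a single allele under the logistic model (H > 0)."""
    eta = BETA0[locus] + BETA1 * math.log(H)
    return 1.0 / (1.0 + math.exp(-eta))

def p_two_dropouts(H, locus):
    """P(D_1, D_2 | H), assuming conditional independence of the two drop-outs."""
    return p_dropout(H, locus) ** 2

for height in (50.0, 100.0, 500.0):
    print(height, p_dropout(height, "D3"), p_two_dropouts(height, "D3"))
```

Writing the probability as `1 / (1 + exp(-eta))` is algebraically the same as the ratio in the displayed formula.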
6.3 Results and discussion
The analysis showed that the intercept parameter, $\beta_0$, varied between loci with a p-value of 0.01, indicating a significant difference between loci (Venables and Ripley, 2002). A similar test for the slope parameter, $\beta_1$, indicated that this parameter did not vary significantly across loci (p-value of 0.49). In addition, there was no significant change of the drop-out probability caused by the allelic number, indicating that larger alleles within the same locus have the same drop-out probability as smaller alleles. However, in the data set, the largest allelic difference was eight repeat units. This variability may be too small to demonstrate that a possible allelic effect is significant.
The parameters for locus $s$ are thus $\beta_{0,s}$ and $\beta_1$ for computing $P(D|\hat{H})$, where we use the same $\hat{H}$ for all loci. The parameter estimate $\hat{\beta}_1$ is $-4.35$ and the estimates of $\beta_{0,s}$ are given in Table 6.2.

Table 6.2: Estimates of $\beta_{0,s}$ and $\beta_1$ based on the experiments of Section 6.2.1.

Locus                  D3      vWA     D16     D2      D8      D21     D18     D19     TH0     FGA
$\hat{\beta}_{0,s}$    18.26   18.43   18.75   18.31   18.28   17.45   18.07   19.40   19.40   19.21

Note that the $\beta_{0,s}$ are larger for the loci of the yellow fluorescent dye band, indicating their larger drop-out probability as observed in Table 6.1. The corresponding logistic curves for the parameters of Table 6.2 are plotted in Figure 6.1 together with an overall estimate not stratifying on loci. The parameters for the overall curve are $\beta_0 = 17.56$ and $\beta_1 = -4.14$.
In Figure 6.1, the box-plot added to each panel shows the DNA proxy $\hat{H}$ for the drop-outs ($D$) and observed alleles ($\bar{D}$). The boxes indicate the inter-quartile range (middle fifty percent of the data) of the observations and the whiskers extend to the most extreme data points within 1.5 times the lengths of the boxes. Remaining points are marked by dots.
It is clear from Figure 6.1 that there is an overlap of the whiskers in the box-plots. This implies that the classification of drop-outs is associated with uncertainty, as one would expect. In particular, this is true for D21, where all drop-outs observed had a mean height, $\hat{H}$, above 70. This may be due to the specific alleles in our data set (for D21, these were 28, 29, 30, 30.2, 31 and 32.2) and possible individual specific effects from having only four different profiles in the data.
We used the estimated parameters of Table 6.2 in order to create a table of the mean peak heights that correspond to specific drop-out probabilities. For the ten different loci included in our data set, these mean heights are presented in Table 6.3.
Table 6.3: Mean peak heights (rfu) for various drop-out probabilities for ten STR loci.

P(D|Ĥ)   D3    vWA   D16   D2    D8    D21   D18   D19   TH0   FGA   Overall
0.0001   556   577   622   562   558   461   531   722   723   692   648
0.0005   384   399   430   388   385   318   367   499   499   478   439
0.0010   327   340   366   331   328   271   313   425   426   407   371
0.0050   226   235   253   228   226   187   216   293   294   281   251
0.0100   192   200   215   194   193   159   184   250   250   239   212
0.0500   132   137   147   133   132   109   126   171   171   164   142
0.1000   111   115   124   112   111   92    106   144   144   138   119
0.2000   92    95    103   93    92    76    88    119   120   114   98
0.3000   81    84    91    82    81    67    78    105   106   101   86
0.4000   73    76    82    74    74    61    70    95    95    91    77
0.5000   67    69    75    68    67    55    64    87    87    83    70
0.6000   61    63    68    62    61    50    58    79    79    76    63
0.7000   55    57    62    56    55    46    53    71    71    68    57
0.8000   49    50    54    49    49    40    46    63    63    60    50
0.9000   40    42    45    41    40    33    39    52    52    50    41
0.9500   34    35    38    34    34    28    32    44    44    42    34
0.9900   23    24    26    23    23    19    22    30    30    29    23
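Entries of Table 6.3 can be approximated by inverting the logistic model, i.e. solving $\operatorname{logit}(p) = \beta_0 + \beta_1 \log \hat{H}$ for $\hat{H}$. The sketch below assumes negative slopes of -4.35 (locus specific model) and -4.14 (overall model) and uses the rounded estimates reported in the text, so small deviations from the tabulated values are to be expected; the function name is illustrative.

```python
import math

def height_for_probability(p, beta0, beta1):
    """Mean peak height H at which the model gives P(D | H) = p."""
    logit = math.log(p / (1.0 - p))
    return math.exp((logit - beta0) / beta1)

# Locus D3 (intercept 18.26, slope -4.35) and the overall curve (17.56, -4.14).
h_d3_half = height_for_probability(0.5, 18.26, -4.35)
h_d3_one_percent = height_for_probability(0.01, 18.26, -4.35)
h_overall_half = height_for_probability(0.5, 17.56, -4.14)
print(round(h_d3_half, 1), round(h_d3_one_percent, 1), round(h_overall_half, 1))
```

The three values land within about one rfu of the corresponding table entries (67, 192 and 70).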
Computing the Brier Score (Brier, 1950) for the estimated locus specific model, we find that the Brier Score $= n^{-1}\sum_{i=1}^{n} (D_i - P(D|\hat{H}_i))^2 = 0.02$, where $D_i$ is the indicator for drop-out of the $i$th allele of the data and $\hat{H}_i$ is the associated proxy for the amount of DNA. A Brier Score close to zero indicates that the model is adequate. A simulated p-value of 0.156 indicates a satisfactory fit of the model. Furthermore, we tried to improve the model by using linear splines (Harrell Jr., 2001) with knots at log(75) and log(100), but these model extensions were not supported by the data.
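The Brier Score is the mean squared difference between the observed drop-out indicators and the predicted probabilities. A minimal sketch with made-up outcomes and predictions (not the data of this study) is:

```python
def brier_score(outcomes, probabilities):
    """Mean squared difference between 0/1 outcomes and predicted probabilities."""
    if len(outcomes) != len(probabilities):
        raise ValueError("outcomes and probabilities must have equal length")
    return sum((d - p) ** 2 for d, p in zip(outcomes, probabilities)) / len(outcomes)

# Made-up example: four alleles, two of which dropped out.
example_outcomes = [1, 0, 0, 1]
example_probabilities = [0.9, 0.1, 0.2, 0.6]
example_score = brier_score(example_outcomes, example_probabilities)
print(example_score)
```

A perfect prediction gives a score of zero, while always predicting 0.5 gives 0.25.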
The use of the logit function implies that the interpretation is made in terms of log odds. The log odds of the drop-out probability conditioned on $\hat{H}$ is linear in $\log \hat{H}$,

$$\operatorname{logit} P(D \mid \hat{H}) = \log \frac{P(D \mid \hat{H})}{P(\bar{D} \mid \hat{H})} = \beta_{0,s} + \beta_1 \log \hat{H}.$$

Using $H$ as the explanatory variable implies lower variability on the DNA proxy than if only using a single peak height observation, e.g. the peak height on the same locus of a heterozygous allele that has not dropped out. Furthermore, in real crime cases such an allele might not be observed, since both alleles of a heterozygote might have dropped out or the other allele may be shared with another contributor if the stain is a mixture.
Gill et al. (2000) discussed the importance of addressing the risk of allelic drop-out and how to incorporate this into the likelihood ratio. Combining our approach for estimating $P(D|\hat{H})$ with the methodology of Gill et al. (2000) may be a feasible approach for better assessment of the weight of evidence when the level of the peak heights indicates the possibility of drop-outs.
6.4 Conclusion
We have demonstrated a simple and applicable way of assessing the drop-out probabilities of
STR alleles in forensic genetics. The drop-out probabilities computed using the model concur
with the prior knowledge of the drop-out behaviour varying with the observed peak heights.
Future work consists of testing the model on a larger data set including more alleles. With a
larger data set, it may also be possible to test whether alleles or fragment length have a significant
effect on the drop-out probability, as the individual specific effect decreases with the number of
different profiles.
It is worth emphasising that the drop-out probabilities may vary between laboratories, machinery
within the same laboratory and typing kits used for profiling. This is due to differences in e.g.
the ability to amplify the DNA in the PCR and in the potential to measure the light intensities
for the electropherogram. Hence, before applying this methodology in the likelihood ratio for
evidence calculations, the laboratory needs to perform experiments with known profiles in order
to estimate the parameters in the logistic regression model.
Appendix
6.A Examples
In forensic genetics it is common to use the likelihood ratio $LR = P(E \mid H_p)/P(E \mid H_d)$ as a means to
assess the weight of evidence. Here $P(E \mid H)$ is the probability of observing the evidence $E$ given
the hypothesis $H$. The prosecutor's hypothesis, $H_p$, often includes more profiles from identified
individuals than the defence hypothesis $H_d$. For a single contributor stain, $H_p$ may state
"The suspect is the only contributor to the crime stain", whereas $H_d$: "An unknown individual
unrelated to the suspect is the only contributor to the crime stain".
In the situation where the hypotheses imply that an allelic drop-out has occurred, one needs
to specify the profiles that constitute the observed stain in order to compute the profile specific
drop-out probability for both $H_p$ and $H_d$.
6.A.1 Example with data from a controlled experiment
We used the data in Table 6.4 to demonstrate the technique of computing the drop-out probability
of a given allele. The data originated from a controlled experiment with a mixture of the two
profiles A and B, where A contributed 31.4 pg/µl and B 424.6 pg/µl.
Table 6.4: Data used in the example of Appendix 6.A. The sample was a mixture of the two
profiles A and B contributing 31.4 pg/µl and 424.6 pg/µl, respectively. A dash marks an allele
without an observed peak.
Locus  Allele  Height  Area
D3     15      766     7264
D3     16      991     9165
D3     19      -       -
vWA    15      788     7631
vWA    17      710     6678
D16    10      117     1201
D16    11      1765    18858
D16    12      -       -
D2     19      746     8816
D2     23      -       -
D2     25      696     8432
D8     8       967     9145
D8     12      895     8350
D8     13      -       -
D21    28      70      660
D21    29      767     7169
D21    30      102     1024
D21    31      889     8283
D18    12      70      736
D18    15      766     8501
D18    16      127     1341
D18    17      687     7856
D19    13      1525    12862
D19    15      -       -
TH0    6       836     7333
TH0    7       82      736
TH0    8       595     5249
FGA    20      -       -
FGA    23      638     6507
FGA    24      549     5542
Under the assumption that the data in Table 6.4 originated from a two-person mixture, we need
to specify a possible pair of profiles explaining the observed alleles. We compute the individual
DNA proxies $H^{(A)}$ and $H^{(B)}$ as defined in Section 6.2.2 for the two profiles A and B of Table 6.4,
$$H^{(A)} = \frac{117 + 70 + 102 + 70 + 127 + 82}{6} = 94.67,$$
$$H^{(B)} = \frac{766 + 1765 + 746 + 967 + 895 + 767 + 889 + 766 + 687 + 595 + 549}{10 + (2 \times 1)} = 782.67.$$
Let allele 19 in locus D3 be denoted by $\mathrm{D3}_{19}$, then from Table 6.4 we found that the homozygous
allele $\mathrm{D8}_{13}$ and the following non-shared heterozygous alleles of profile A had dropped out:
$\mathrm{D3}_{19}$, $\mathrm{D16}_{12}$, $\mathrm{D2}_{23}$, $\mathrm{D19}_{15}$, and $\mathrm{FGA}_{20}$.
The DNA proxy was the same for all the heterozygous drop-outs, $\hat{H} = H^{(A)}$, and for the homozygous
allele $\hat{H} = 2H^{(A)}$. The parameter estimates of Table 6.2 were then used in order to compute
the locus specific drop-out probabilities. Below, we demonstrate how to compute the drop-out
probabilities for $\mathrm{D3}_{19}$, $\mathrm{D19}_{15}$ and $\mathrm{D8}_{13}$:
$$P(D_{\mathrm{D3}_{19}} \mid \hat{H}) = \frac{\exp(18.26 - 4.35 \log(94.67))}{1 + \exp(18.26 - 4.35 \log(94.67))} = 0.177,$$
$$P(D_{\mathrm{D19}_{15}} \mid \hat{H}) = \frac{\exp(19.40 - 4.35 \log(94.67))}{1 + \exp(19.40 - 4.35 \log(94.67))} = 0.403,$$
$$P(D_{\mathrm{D8}_{13}} \mid \hat{H}) = \frac{\exp(18.28 - 4.35 \log(189.33))}{1 + \exp(18.28 - 4.35 \log(189.33))} = 0.011.$$
Suppose we only had information on profile B, e.g. B being the victim of a crime, and that the
profile of the suspect S only gave a partial match. For simplicity, we use the same mean height
estimate for the suspect as for A, i.e. $H^{(S)} = H^{(A)}$. In locus D19, only allele 13 was observed
and a shared allele may have dropped out. Assuming suspect S is homozygous for allele 11
and profile B is heterozygous with alleles 11 and 13, the DNA proxy is
$\hat{H} = 2H^{(S)} + H^{(B)} = 189.33 + 782.67 = 972$ and the drop-out probability is
$$P(D_{\mathrm{D19}_{11}} \mid \hat{H}) = \frac{\exp(19.40 - 4.35 \log(972))}{1 + \exp(19.40 - 4.35 \log(972))} = 2.69 \times 10^{-5}.$$
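The arithmetic of this example can be retraced with a short Python sketch. It assumes the logistic model with the intercepts 18.26 (D3), 19.40 (D19) and 18.28 (D8) and the slope 4.35 on the natural logarithm of the proxy, exactly as they appear in the worked equations; the function and variable names are hypothetical.

```python
import math

SLOPE = -4.35  # coefficient of log(H) in the worked equations
INTERCEPTS = {"D3": 18.26, "D19": 19.40, "D8": 18.28}  # locus specific intercepts


def dropout_probability(locus, h_proxy):
    """Logistic drop-out probability for a locus given the DNA proxy in rfu."""
    eta = INTERCEPTS[locus] + SLOPE * math.log(h_proxy)
    return 1.0 / (1.0 + math.exp(-eta))


# Peak heights of the non-shared alleles of profile A (Table 6.4).
minor_heights = [117, 70, 102, 70, 127, 82]
h_a = sum(minor_heights) / len(minor_heights)

# Peak heights used for profile B, with the denominator 10 + (2 x 1).
major_heights = [766, 1765, 746, 967, 895, 767, 889, 766, 687, 595, 549]
h_b = sum(major_heights) / (10 + 2 * 1)

p_d3_19 = dropout_probability("D3", h_a)             # heterozygous drop-out
p_d19_15 = dropout_probability("D19", h_a)           # heterozygous drop-out
p_d8_13 = dropout_probability("D8", 2 * h_a)         # homozygous drop-out
p_d19_11 = dropout_probability("D19", 2 * h_a + h_b)  # allele shared by S and B

print(f"H(A) = {h_a:.2f}, H(B) = {h_b:.2f}")
print(f"D3_19: {p_d3_19:.3f}, D19_15: {p_d19_15:.3f}, D8_13: {p_d8_13:.3f}")
print(f"D19_11: {p_d19_11:.2e}")
```

Doubling the proxy for a homozygous allele, or adding the proxies of two contributors sharing an allele, only changes the argument passed to the same logistic function.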
6.A.2 Example in the recommendation of the ISFG Commission
Following the idea of Example 1 given in Gill et al. (2006, Appendix B.2), we compute the
likelihood ratio using our model for assessing the drop-out probabilities.
Assume that the genetic stain is $G = (a, c, d)$ and that the prosecutor's hypothesis claims that the
suspect, $G_S = (a, b)$, is a contributor to the stain. For this hypothesis to be true, the $b$ allele must
have dropped out. In this example, we only consider data from one locus as in Table 3 of the
ISFG recommendations. We re-use the data from TH0 in Table 6.4 in order to exemplify how to
evaluate the LR. For consistency with the example of Gill et al. (2006), denote allele 7 by $a$ and
let $c$ and $d$ be allele 6 and 8, respectively.
From Table 6.4, we compute the following estimates of $\hat{H}$ and the associated $P(D \mid \hat{H})$ for every
combination of the alleles assuming a two-person mixture. Suppose that a contributor has
non-shared alleles $mn$ and that the DNA proxy for this combination is $H_{mn}$. Then $P(D_{mn}) = P(D \mid H_{mn})$
is the drop-out probability of either $m$ or $n$. Alternatively, in the actual case there may be one
shared allele, $m$, and in this case $P(D_{m,m}) = P(D \mid \hat{H}_{m,m}) = P(D \mid H_{mn} + H_{mo})$ is the drop-out
probability for allele $m$ when shared by two individuals with the combinations $mn$ and $mo$. The
probability $P(G \mid H_p)$ is
$$P(G \mid H_p) = 2P(cd)\,P(\bar{D}_{cd})^2\,P(D_{ab})\,P(\bar{D}_{ab}),$$
since allele $b$ is assumed to have dropped out.
Assume that an allele, $Q$, has dropped out, implying that the two profiles are heterozygous and do not
share any allele. That is, $Q$ is any allele of $\mathcal{A}_{\mathrm{TH0}} \setminus \{a, c, d\}$ with allele probability
$P(Q) = 1 - [P(a) + P(c) + P(d)]$, where $\mathcal{A}_{\mathrm{TH0}}$ is the set of alleles for locus TH0. All of the observed alleles
must be paired with the missing allele in order to compute the specific drop-out probabilities, as
these differ due to the different peak heights. From Table 6.5, it is clear that $P(D_{aQ})$ is the largest
of the three, as expected, since the peak height of $a$ is only 82 rfu. When paired with any of $c$ or
$d$, the drop-out probabilities are practically zero as one would require. Hence, the terms $P(D_{cQ})$
and $P(D_{dQ})$ are also indicators of a poor agreement with the heterozygote balance when pairing
$ad$ and $ac$, respectively.
Table 6.5: DNA proxies and drop-out probabilities for various profiles
Profile(s)  Notation    H              \hat{H}   P(D|\hat{H})
aQ          P(D_aQ)     82.0           82.0      5.57 × 10^-1
ac          P(D_ac)     459.0          459.0     7.02 × 10^-4
ad          P(D_ad)     338.5          338.5     2.63 × 10^-3
cQ          P(D_cQ)     836.0          836.0     5.17 × 10^-5
dQ          P(D_dQ)     595.0          595.0     2.27 × 10^-4
cd          P(D_cd)     715.5          715.5     1.02 × 10^-4
aa          P(D_aa)     82.0           164.0     5.82 × 10^-2
cc          P(D_cc)     836.0          1672.0    2.54 × 10^-6
dd          P(D_dd)     595.0          1190.0    1.11 × 10^-5
ac, ad      P(D_a,a)    459.0, 338.5   797.5     6.35 × 10^-5
ac, cd      P(D_c,c)    459.0, 715.5   1174.5    1.18 × 10^-5
ad, cd      P(D_d,d)    338.5, 715.5   1054.0    1.89 × 10^-5
The probability of the evidence given the defence hypothesis and that one allele has dropped out
is given as
$$P_1(G \mid H_d) = 8P(acdQ)\bigl[P(\bar{D}_{ac})^2 P(D_{dQ})P(\bar{D}_{dQ}) + P(\bar{D}_{ad})^2 P(D_{cQ})P(\bar{D}_{cQ}) + P(\bar{D}_{cd})^2 P(D_{aQ})P(\bar{D}_{aQ})\bigr],$$
where the multiplication by 8 is due to the number of pairwise combinations of the alleles,
e.g. pairing the alleles $ac$ and $dQ$ may be done as $(ac)(dQ)$, $(ac)(Qd)$, $(ca)(dQ)$ and $(ca)(Qd)$;
interchanging the profiles yields the eight combinations.
The defence hypothesis, $H_d$, also comprises the scenario where no alleles have dropped out. This
implies that either an allele is shared or one contributor is homozygous. The probability
$P_0(G \mid H_d)$ is:
$$P_0(G \mid H_d) = P(acd)\Bigl\{P(a)\bigl[4P(\bar{D}_{aa})P(\bar{D}_{cd})^2 + 8P(\bar{D}_{a,a})P(\bar{D}_{ac})P(\bar{D}_{ad})\bigr]$$
$$\qquad + P(c)\bigl[4P(\bar{D}_{cc})P(\bar{D}_{ad})^2 + 8P(\bar{D}_{c,c})P(\bar{D}_{ac})P(\bar{D}_{cd})\bigr]$$
$$\qquad + P(d)\bigl[4P(\bar{D}_{dd})P(\bar{D}_{ac})^2 + 8P(\bar{D}_{d,d})P(\bar{D}_{ad})P(\bar{D}_{cd})\bigr]\Bigr\}.$$
It is worth noting that the probabilities $P(D_{ac})$ and $P(D_{ad})$ are misleading as the combination of
$a$ together with $c$ or $d$ causes substantial imbalances in the profile's peak heights.
In order to compute the likelihood ratio, LR, we need only to compute the ratio of $P(G \mid H_p)$ to
$P_1(G \mid H_d) + P_0(G \mid H_d)$. As in Gill et al. (2006), we assume uniform allele probabilities of 0.1 for
the observed alleles, implying that $P(Q) = 0.7$, yielding a LR of
$$LR = \frac{P(G \mid H_p)}{P_1(G \mid H_d) + P_0(G \mid H_d)} = \frac{0.0049}{0.0014 + 0.0035} = 1.0007.$$
For the same scenario, Gill et al. (2006) considered uniform probabilities of 0.02 of the observed
alleles. Using our model, this implies a LR of 9.6111.
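The structure of this likelihood ratio can be sketched in Python from the rounded entries of Table 6.5. Two points are assumptions of the sketch rather than statements of the text: genotype probabilities such as $P(cd)$, $P(acd)$ and $P(acdQ)$ are taken as plain products of allele probabilities, and $P(D_{ab})$ is set equal to $P(D_{aQ})$ because $b$ is paired with the 82 rfu allele $a$. Since the inputs are rounded, the sketch is not expected to reproduce the quoted values digit for digit.

```python
# Drop-out probabilities P(D | H) from Table 6.5 (rounded values).
P_DROP = {
    "aQ": 5.57e-1, "ac": 7.02e-4, "ad": 2.63e-3,
    "cQ": 5.17e-5, "dQ": 2.27e-4, "cd": 1.02e-4,
    "aa": 5.82e-2, "cc": 2.54e-6, "dd": 1.11e-5,
    "a,a": 6.35e-5, "c,c": 1.18e-5, "d,d": 1.89e-5,
}


def nd(key):
    """Probability of no drop-out for the given profile combination."""
    return 1.0 - P_DROP[key]


def likelihood_ratio(p_obs):
    """LR for stain (a, c, d) and suspect (a, b) with uniform allele probability."""
    p_a = p_c = p_d = p_obs
    p_q = 1.0 - (p_a + p_c + p_d)
    p_ab = P_DROP["aQ"]  # assumption: b is paired with a, as Q is in aQ

    # Prosecution hypothesis: b has dropped out.
    numerator = 2 * p_c * p_d * nd("cd") ** 2 * p_ab * (1.0 - p_ab)

    # Defence hypothesis, one allele Q has dropped out.
    p1 = 8 * p_a * p_c * p_d * p_q * (
        nd("ac") ** 2 * P_DROP["dQ"] * nd("dQ")
        + nd("ad") ** 2 * P_DROP["cQ"] * nd("cQ")
        + nd("cd") ** 2 * P_DROP["aQ"] * nd("aQ")
    )

    # Defence hypothesis, no drop-out: one homozygote or one shared allele.
    p0 = p_a * p_c * p_d * (
        p_a * (4 * nd("aa") * nd("cd") ** 2 + 8 * nd("a,a") * nd("ac") * nd("ad"))
        + p_c * (4 * nd("cc") * nd("ad") ** 2 + 8 * nd("c,c") * nd("ac") * nd("cd"))
        + p_d * (4 * nd("dd") * nd("ac") ** 2 + 8 * nd("d,d") * nd("ad") * nd("cd"))
    )
    return numerator / (p1 + p0), numerator, p1, p0


lr_01, num_01, p1_01, p0_01 = likelihood_ratio(0.1)
lr_002, _, _, _ = likelihood_ratio(0.02)
print(f"p = 0.10: numerator {num_01:.4f}, P1 {p1_01:.4f}, P0 {p0_01:.4f}, LR {lr_01:.3f}")
print(f"p = 0.02: LR {lr_002:.3f}")
```

The LR grows as the observed alleles become rarer, because the no-drop-out explanations under the defence hypothesis require more copies of rare alleles.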
Bibliography
Balding, D. J. and J. S. Buckleton (2009). Interpreting low template DNA profiles. Forensic
Science International: Genetics 4(1), 1–10.
Brier, G. W. (1950). Verification of forecasts expressed in terms of probability. Monthly Weather
Review 78, 1–3.
Gill, P. D. et al. (2006). DNA commission of the International Society of Forensic Genetics:
Recommendations on the interpretation of mixtures. Forensic Science International 160(2-3),
90–101.
Gill, P. D. and J. S. Buckleton (2010a). A universal strategy to interpret DNA profiles that does
not require a definition of low-copy-number. Forensic Science International: Genetics 4(4),
221–227.
Gill, P. D. and J. S. Buckleton (2010b). Mixture interpretation: defining the relevant features
for guidelines for the assessment of mixed DNA profiles in forensic casework. Journal of
Forensic Sciences 55(1), 265–268.
Gill, P. D., J. M. Curran, and K. Elliot (2005). A graphical simulation model of the entire DNA
process associated with the analysis of short tandem repeat loci. Nucleic Acids Research 33(2),
632–643.
Gill, P. D., J. Whitaker, C. Flaxman, N. Brown, and J. S. Buckleton (2000). An investigation
of the rigor of interpretation rules for STRs derived from less than 100 pg of DNA. Forensic
Science International 112(1), 17–40.
Harrell Jr., F. E. (2001). Regression Modeling Strategies. Springer.
McCullagh, P. and J. Nelder (1989). Generalized Linear Models. Chapman and Hall.
Petricevic, S. et al. (2009). Validation and development of interpretation guidelines for low
copy number (LCN) DNA profiling in New Zealand using the AmpFlSTR SGM Plus(TM)
multiplex. Forensic Science International: Genetics In Press, Corrected Proof.
Tvedebrink, T., P. S. Eriksen, H. S. Mogensen, and N. Morling (2009). Estimating the
probability of allelic drop-out of STR alleles in forensic genetics. Forensic Science International:
Genetics 3(4), 222–226.
Tvedebrink, T., P. S. Eriksen, H. S. Mogensen, and N. Morling (2010). Evaluating the weight
of evidence using quantitative STR data in DNA mixtures. Journal of the Royal Statistical
Society. Series C, Applied Statistics. In Press.
Venables, W. N. and B. D. Ripley (2002). Modern Applied Statistics with S (4th ed.). Springer.
6.5 Supplementary remarks
Several authors and commentators in forensic genetics have already accepted the model above as
a means to estimate the probability of allelic drop-out (Balding and Buckleton, 2009; Petricevic
et al., 2009; Gill and Buckleton, 2010a,b). However, as with any piece of science and any
model, criticism has also been put forward. Balding and Buckleton (2009) argue that the
drop-out probability of a homozygous allele, $P(D_2)$, should satisfy the property that $P(D_2) < P(D)^2$,
where $P(D)$ is the drop-out probability of a heterozygous allele for the same DNA profile. Their
argument is based on the fact that the superposition of two low intensity peaks should have a
smaller drop-out probability than when the peaks are considered separately. That is, allelic
drop-out may occur due to absence of molecules associated with a particular allele, but may also be
due to an insufficient amount of molecules to trigger the observation of an allele. In the latter
case, the amount of DNA might imply that heterozygous alleles yield peak height observations
close to 50 rfu while homozygous alleles are closer to 80 rfu.
Balding and Buckleton (2009) suggested that $P(D_2) = \alpha P(D)^2$ for some value of $\alpha < 1$, which was
chosen since it satisfies their requirement. Based on a survey of some forensic laboratories,
Balding and Buckleton (2009) suggest $\alpha = 0.5$. However, there are at least two problems with
the $\alpha$-approach. First, there is no solid model behind the suggestion, and second, how does one
choose the correct value for $\alpha$? The model fitted to the experimental data in Tvedebrink et al.
(2009) has $P(D_2; H) > P(D; H)^2$ for $H > 136$ rfu. However, the differences are in the fourth
decimal place and have no practical implications. D. J. Balding (personal communication, 2010)
suggested using $\sqrt{H}$ rather than $\log(H)$. This transformation yields a slightly better fit to the
data and postpones the issue of $P(D_2; H) > P(D; H)^2$ to $H$-values > 201 rfu.
Gill et al. (2005) demonstrated how to simulate DNA mixtures by mimicking the procedure
carried out by a forensic laboratory: DNA extraction, aliquot sampling, PCR efficiency and
measurement variability. A similar approach is listed below:
(1) Assume that there are $N$ chromosomes extracted for typing.
(2) Of these, $n^{(0)}$ carry the specific allele of interest, where $n^{(0)} \sim \mathrm{bin}(N, x/46)$ with $x = 1$
for heterozygous and $x = 2$ for homozygous alleles, respectively.
(3) The PCR process is assumed to be a binomial process: $n^{(c)} = n^{(c-1)} + \mathrm{bin}(n^{(c-1)}, \pi_{\mathrm{PCR}})$, for
$c = 1, \ldots, C$ cycles, where $\pi_{\mathrm{PCR}}$ is the PCR efficiency for each cycle in the PCR process.
(4) If $n^{(C)}$ measured with noise gives rise to peak heights lower than a given threshold, we
declare a drop-out.
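Steps (1)-(4) can be sketched with NumPy as follows. The binomial sampling and the PCR recursion follow the list above; the conversion from molecule counts to peak heights (the `rfu_per_template` scale), the log-normal measurement noise and the 50 rfu threshold are hypothetical choices made only for illustration and are not the settings used for the simulations in the text.

```python
import numpy as np


def simulate_dropout(n_chromosomes, x, n_sims=2000, pi_pcr=0.85, cycles=28,
                     rfu_per_template=10.0, noise_sd=0.1, threshold=50.0,
                     seed=0):
    """Estimate P(D) by simulating steps (1)-(4) n_sims times."""
    rng = np.random.default_rng(seed)
    # Step (2): number of extracted chromosomes carrying the allele.
    n = rng.binomial(n_chromosomes, x / 46.0, size=n_sims).astype(np.int64)
    # Step (3): each molecule is copied with probability pi_pcr in every cycle.
    for _ in range(cycles):
        n = n + rng.binomial(n, pi_pcr)
    # Step (4), hypothetical measurement model: scale so that one starting
    # template gives rfu_per_template on average, then add log-normal noise.
    scale = rfu_per_template / (1.0 + pi_pcr) ** cycles
    height = scale * n * rng.lognormal(mean=0.0, sigma=noise_sd, size=n_sims)
    return float(np.mean(height < threshold))


for n_chrom in (46, 230, 920):
    p_het = simulate_dropout(n_chrom, x=1)
    p_hom = simulate_dropout(n_chrom, x=2)
    print(f"N = {n_chrom:4d}: heterozygous {p_het:.3f}, homozygous {p_hom:.3f}")
```

Under these assumptions the estimated drop-out probability decreases with the number of extracted chromosomes and is lower for homozygous than for heterozygous alleles.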
By running (1)-(4) several times with varying initial values $N$ we get a simulated distribution
of $P(D)$. In Figure 6.2, heterozygous and homozygous alleles are simulated for
varying amounts of DNA. In these simulations $\pi_{\mathrm{PCR}} = 0.85$, $C = 28$ and each point is based
on 5,000 simulations. The solid curve is fitted to the heterozygous data points (open points) by
$\operatorname{logit} P(D; H) = \beta_0 + \beta_1 \log(H)$ and demonstrates that the model fits the data well over the whole
range of the response. Dashed curves show the same regression with $\sqrt{H}$ as covariate, and the
probit approach is discussed below. The fitted parameters $(\hat{\beta}_0, \hat{\beta}_1)$ were used to draw the curves
for the homozygous simulations (closed points) with $\log(2H)$ as covariate. The plot shows good
agreement between the simulated homozygous data points and the model predictions. The
dotted curve represents the $\alpha P(D; H)^2$-model of Balding and Buckleton (2009) with $\alpha = 0.5$.
The impression is quite different from the logistic regression fitted to the data.
[Figure 6.2. x-axis: Amount of DNA (ticks 5, 10, 20, 50, 100, 200); y-axis: P(D) (0.0 to 1.0). Legend: P(D; H) with log(H); P(D; H) with √H; P(D; H) using probit; Balding and Buckleton (2009).]
Figure 6.2: Simulations using (1)-(4) for varying amounts of DNA. Open points are heterozygous
simulations, and closed points homozygous. The curves are explained by the legend.
Another way to model the probability of allelic drop-out may be derived taking a slightly different
approach than above. Let $X$ denote the number of molecules in an aliquot sampled for PCR.
We assume that if $X$ is less than some threshold $M$ the signal will not be sufficiently strong to
trigger the CCD camera and thus the signal will be undetected, implying allelic drop-out.
Assume that the aliquot is sampled from a total number of molecules $N$ in the extract.
Furthermore, the spatial pattern is Poisson distributed with intensity $\lambda$ (the parameter reflects the
concentration of molecules), which implies that the positions of the molecules are independent of each
other. The aliquot proportion $p$ of molecules for PCR processing is sampled from this extract,
which again is Poisson distributed with intensity $p\lambda$.
Now, $X$ is Poisson distributed with some unknown intensity $\mu = p\lambda$, since $p$ is also unknown.
We assume that the average peak height, $H$, is proportional to the number of sampled molecules,
$X$, such that $H \approx kX$, $X \sim \mathrm{Poisson}(\mu)$, which implies that $E(X) = k^{-1}H = cH$. Rather than
assuming $P(D) = P(X = 0)$, i.e. drop-out only happens when no molecules are sampled, we
allow a positive number of molecules to be sampled:
$$P(D; H) = P(X \le M) = P\left(\frac{X - cH}{\sqrt{cH}} \le \frac{M - cH}{\sqrt{cH}}\right) \approx \Phi\left(\frac{M - cH}{\sqrt{cH}}\right),$$
where the approximation of the Poisson distribution by the normal distribution is justified by the
large value of $cH$. This implies that $\operatorname{probit}[P(D; H)] = \Phi^{-1}[P(D; H)] = \gamma_1 H^{-1/2} + \gamma_2 H^{1/2}$. Fitting
this model using the same dataset as used in Tvedebrink et al. (2009) yields a fit similar to that
of the original article. Similarly, this approach indicated good agreement with the simulations
discussed above (as shown in Figure 6.2). Hence, this method, which is more closely related
to the biochemistry than the logistic assumption, adds further support to the logistic regression
approach through the similarity in results.
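The normal approximation behind the probit model can be checked numerically with a small Python sketch that compares the exact Poisson tail $P(X \le M)$ with $\Phi\bigl((M - cH)/\sqrt{cH}\bigr)$. The values of $c$ and $M$ below are hypothetical and chosen only for illustration. Expanding the argument of $\Phi$ gives $(M/\sqrt{c})\,H^{-1/2} - \sqrt{c}\,H^{1/2}$, which is why the probit of the drop-out probability is linear in $H^{-1/2}$ and $H^{1/2}$.

```python
import math


def poisson_cdf(m, lam):
    """P(X <= m) for X ~ Poisson(lam), with terms computed in log-space."""
    return sum(
        math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
        for k in range(m + 1)
    )


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def dropout_normal_approx(h, c, m):
    """Normal approximation to P(X <= m) when E(X) = c * h."""
    lam = c * h
    return normal_cdf((m - lam) / math.sqrt(lam))


C_HYPOTHETICAL = 0.5  # hypothetical number of molecules per rfu
M_HYPOTHETICAL = 30   # hypothetical detection threshold in molecules

for h in (40, 80, 120):
    exact = poisson_cdf(M_HYPOTHETICAL, C_HYPOTHETICAL * h)
    approx = dropout_normal_approx(h, C_HYPOTHETICAL, M_HYPOTHETICAL)
    print(f"H = {h:3d} rfu: exact {exact:.4f}, normal approximation {approx:.4f}")
```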
After the publication of Tvedebrink et al. (2009), the Section of Forensic Genetics, University of
Copenhagen, conducted more experiments with dilutions of DNA profiles. These experiments
investigated the applicability of the drop-out model to different DNA genotyping kits and varying
numbers of cycles in the PCR process (see the summary of the results in Table 6.6).
Table 6.6: Summary of the experiments with diluted samples using the SEfiler kit (Applied
Biosystems). Samples from identical aliquots were used in order to compare the effect of
increasing the number of PCR cycles.
Cycles  Classification  D3   vWA  D16  D2   D8   SE33  D19  TH0  FGA  D21  D18
28      Observed        151  152  116  139  153  127   134  148  115  130  108
28      Drop-outs       17   16   11   29   15   19    15   20   10   16   17
29      Observed        165  156  125  162  165  141   144  160  120  141  122
29      Drop-outs       7    16   5    10   7    9     8    12   8    9    6
30      Observed        170  168  127  168  168  148   151  167  126  148  124
30      Drop-outs       2    4    3    4    4    2     1    5    2    2    4
The overall properties of the model did not change; only an extra term caused by the cycle-factor
was included:
$$\operatorname{logit} P(D; H, C) = (\beta_{0,s} + \beta_{0,C}) + (\beta_1 + \beta_{1,C}) \log(H).$$
Hence, the overall interpretation of the model is the same. It is worth mentioning that
$\beta_{0,30} < \beta_{0,29} < \beta_{0,28} = 0$ and $0 = \beta_{1,28} < \beta_{1,29} < \beta_{1,30}$. This implies that for the same locus and
fixed $H_0 > 60$ rfu:
$$P(D; \hat{H} = H_0, C = 28) < P(D; \hat{H} = H_0, C = 29) < P(D; \hat{H} = H_0, C = 30).$$
This seems counter-intuitive since more PCR cycles imply higher peaks. However, with $\hat{H} = H_0$
for all three levels of $C$, it is more likely to have drop-out for $C = 30$ than $C = 28$ since one
would expect that peaks with 30 cycles on average are higher than peaks from a 28 cycle PCR
process. In Figure 6.3 a plot similar to Figure 6.1 summarises the estimated model. Each panel
shows a box plot of the drop-out events for the associated $\hat{H}$-estimate with the fitted logistic
curves superimposed.
[Figure 6.3. One panel per locus: D3, vWA, D16, D2, D8, SE33, D19, TH0, FGA, D21, D18. x-axis: $\hat{H}$ (ticks 20, 50, 150, 400, 1100, 3000); y-axis: P(D; H).]
Figure 6.3: Box-plots of the $\hat{H}$-estimates stratified by drop-out event. The boxes are vertically
shifted for visual comprehension. Black: 28 cycles, dark gray: 29 cycles and light gray: 30
cycles.
For low amounts of DNA there might be a potential bias when estimating $H$. This is due to the
fact that for a profile with many drop-outs, the peaks with heights above the detection threshold
are outliers with respect to the peak height distribution. Hence, estimates based on these
observations tend to systematically overestimate the amount of DNA through biased estimates
of $H$. However, for a moderate number of drop-outs the bias is of minor concern.
6.5.1 Mixture separation allowing for allelic drop-out
The models for DNA mixtures presented in Chapters 4 and 5 do not allow for allelic drop-out in
their original formulation. However, they are both extendable to handle this sort of issue, in
particular the mixture separating model, which is discussed in detail below.
For two-person mixtures, the possibility of allelic drop-out implies that for loci with three and
two observed alleles, the list of possible contributing DNA profiles should be extended with
"wild cards". That is, observing alleles $a$, $b$ and $c$, genotypes involving an additional allele
different from the three need to be considered. In practice, this implies that $J_2$ and $J_3$ from
Table 5.3 should be extended as shown in Table 6.7 (denoted $J^*_2$ and $J^*_3$). Note that the columns
in $J^*_2$ assuming allelic drop-out are identical to those of $J_3$ and $J_4$ with the first row(s) (lowest
peak heights/smallest peak areas) removed. Similarly, the additional column in $J^*_3$ relative to $J_3$
refers to the case where the lowest peak intensity of the minor contributor has dropped out (first row of $J_4$). The
number of expected alleles, $N_s$, is given over each block, where the number of drop-outs equals
$N_s - n_s$.
Table 6.7: Extension of $J_2$ and $J_3$ of Table 5.3 allowing for drop-outs. The number of drop-outs
is equal to $N_s - n_s$, where $N_s$ is the expected number of alleles.

J*_2
Expected number of alleles:  N_s = 2                     N_s = 3                     N_s = 4
Number of drop-outs:         0                           1                           2
                             P1 P2  P1 P2  P1 P2  P1 P2  P1 P2  P1 P2  P1 P2  P1 P2  P1 P2
A_s,(1)                       1  1   2  0   1  0   0  1   0  1   1  0   0  1   0  1   0  1
A_s,(2)                       1  1   0  2   1  2   2  1   0  1   0  2   1  1   2  0   0  1

J*_3
Expected number of alleles:  N_s = 3                     N_s = 4
Number of drop-outs:         0                           1
                             P1 P2  P1 P2  P1 P2  P1 P2  P1 P2
A_s,(1)                       2  0   1  0   1  0   0  1   1  0
A_s,(2)                       0  1   1  0   0  1   0  1   0  1
A_s,(3)                       0  1   0  2   1  1   2  0   0  1
The statistical formulation of the mean and variance is identical whether drop-out is allowed or not. However,
due to the missing data problem induced by the assumption of allelic drop-out we must impute
the missing data. An ad hoc way to do this has been implemented and showed reasonably good
results:
$N_s = 4$ and $n_s = 3$: The missing data is imputed by repeating the $A_{s,(1)}$-data row.
$N_s = 4$ and $n_s = 2$: The algorithm will always choose the leftmost configuration of $J^*_2$ since $D_s$
will be similar to that of the rightmost configuration, i.e. the difference between the observed
and expected peak areas is similar when conditioned on the locus sum, $A_{s,+}$. However, the
ratio of the likelihood values will approximately be $P(D_s \mid H^{(1)})^2$ due to the two drop-outs for
$N_s = 4$.
$N_s = 3$ and $n_s = 2$: There are four different configurations that need to be considered (numbers
refer to the order in $J^*_2$ where $N_s = 3$):
(1) If a homozygous allele drops out the situation is the same as above for $N_s = 4$ and $n_s = 2$.
(2) The missing data is imputed by repeating the $A_{s,(1)}$-row.
(3) The missing data is imputed as the difference of the $A_{s,(3)}$- and $A_{s,(2)}$-row.
(4) The missing data is imputed by repeating the $A_{s,(1)}$-row.
A more rigorous approach would be to use the EM-algorithm; however, for practical purposes it
is believed that there would be no substantial difference.
Furthermore, the likelihood now includes an extra term $P(D; H)$, where $H$ is calculated as
described above. Assume that only alleles of the minor contributor have dropped out; then
allowing for drop-outs in mixture separation implies that the selection criterion needs to evaluate
$P(D; \hat{H} = H^{(1)})^{n_{D_1}}\, P(D; \hat{H} = 2H^{(1)})^{n_{D_2}}\, L_N$, where $n_{D_1}$ and $n_{D_2}$ respectively are the number of
heterozygous and homozygous drop-outs.
Example
Table 6.8 lists the STR data of a two-person mixture where several allelic drop-outs have occurred.
In fact, only three of the minor profile's alleles not shared by the major
profile had peak heights above the 50 rfu limit of detection.
Table 6.8: Observed STR data of a two-person DNA mixture. The estimated $H$-values were
respectively 60.3 rfu (minor profile) and 765.8 rfu (major profile). The profiles denoted by squares and triangles
are respectively identified without drop-out allowed and taking drop-out into consideration.
Locus  Allele  Profiles  Height  Area
D3     15                884     7787
D3     16                816     7140
D3     19                -       -
vWA    15                519     5067
vWA    17                530     4928
D16    10                -       -
D16    11                1373    14302
D16    12                -       -
D2     19                565     6635
D2     23                -       -
D2     25                518     6120
D8     8                 993     8720
D8     12                807     7320
D8     13                78      891
D21    28                -       -
D21    29                773     6867
D21    30                -       -
D21    31                637     5867
D18    12                -       -
D18    15                762     8449
D18    16                52      663
D18    17                644     7316
D19    13                1163    9550
D19    15                51      631
TH0    6                 553     4691
TH0    7                 -       -
TH0    8                 572     4936
FGA    20                -       -
FGA    23                403     4024
FGA    24                363     3651
The drop-out probabilities of one of the minor profile's alleles for the various loci are listed in
Table 6.9. From this table we see that it is likely that one or more of the minor profile's alleles
have dropped out.
Table 6.9: Drop-out probabilities for the minor contributor (see Table 6.8). The probabilities
were computed using $(\hat{\beta}_{s,0}, \hat{\beta}_1)$ from Tvedebrink et al. (2009).
Locus                  D3    vWA   D16   D2    D8    D21   D18   D19   TH0   FGA
P(D | H^(1) = 60.3)    0.61  0.65  0.72  0.62  0.61  0.41  0.56  0.83  0.83  0.80
We may analyse the data of Table 6.8 using the mixture separating algorithm. We first
analyse the data assuming no allelic drop-out and next allowing for allelic drop-out. The
likelihood value when not allowing for allelic drop-outs is $1.908 \times 10^{-10}$, which corresponds to
$\hat{\tau} = 7.65$. The identified profiles when drop-outs are neglected are denoted by square-symbols
in Table 6.8. Similarly, when allowing for drop-outs $\hat{\tau} = 5.82$ and the likelihood value is
$1.178 \times 10^{-9}$. Since the best matching profiles when allowing for drop-outs have three drop-outs
(see the identified profiles in Table 6.8 marked by triangles) the likelihood value is computed as
$L_N\, P(D_{\mathrm{D3}} \mid H^{(1)})\, P(D_{\mathrm{D2}} \mid H^{(1)})\, P(D_{\mathrm{FGA}} \mid H^{(1)})$, with the locus specific drop-out probabilities listed in
Table 6.9.
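The drop-out part of this likelihood value is simply the product of the locus specific probabilities from Table 6.9 for the loci where a drop-out was identified. A minimal Python sketch, with hypothetical names and with the remaining likelihood factor left out, is:

```python
import math

# Locus specific drop-out probabilities for the minor contributor (Table 6.9).
DROPOUT_PROB = {
    "D3": 0.61, "vWA": 0.65, "D16": 0.72, "D2": 0.62, "D8": 0.61,
    "D21": 0.41, "D18": 0.56, "D19": 0.83, "TH0": 0.83, "FGA": 0.80,
}


def dropout_factor(loci_with_dropout):
    """Product of the drop-out probabilities over the loci with a drop-out."""
    return math.prod(DROPOUT_PROB[locus] for locus in loci_with_dropout)


# The three identified drop-outs of the example: D3, D2 and FGA.
factor = dropout_factor(["D3", "D2", "FGA"])
print(f"Drop-out factor: {factor:.4f}")
```

With the tabulated values the factor is about 0.30, so the three assumed drop-outs reduce the likelihood value by roughly a factor of three.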
Even though the algorithm allowing for drop-out correctly identified three allelic drop-outs, the
number of correctly identified loci for the two methods is almost the same. This is due to the
peak height imbalances of the peaks of the major profile. Often the shared allele of the two
true contributors is smaller than the one where the major profile is the only contributor (e.g. loci
D3, D2 and TH0 in Table 6.8). Hence, for D3 and D2 the larger of the two observed alleles is
assumed to be a shared allele.
In addition to the example above, we also simulated data mimicking a two-person DNA mixture.
The minor contributor had a fixed mean peak height of 60 rfu, while the major component had
peak heights of 2000, 1500, 1000, 750, 500, 250, 150, 100 and 75 rfu. The reason for decreasing
the peak height of the major contributor is that since the variance is proportional to the mean, the
contribution from the minor component is masked in the variability for large peak intensities.
That is, it is not possible to detect whether the minor component has dropped out or if it shares
alleles with the major profile. Hence, the smaller the peak intensities of the major, the easier
it should become to detect allelic drop-out. For each mean value of the major peak height, the
standard deviation takes integer values from 0 to 10. For a four-allele locus where an allele
has dropped out, the method detects most of the drop-outs for small values of the standard deviation and of the
mean peak height of the major profile. For three-allele loci the method is less successful even for
moderate values of the standard deviation and low major peak heights.
Both experimental data and the simulations indicate that it is difficult to identify allelic drop-out
of a contributor to a DNA mixture. The problem with allelic drop-out is not only due to the limited
amount of DNA in the sample. Lowering the limit of detection naturally decreases the number of
drop-outs. However, this might come at the cost of more drop-in peaks. In the following
paper we discuss how the background noise can be used to determine a limit of detection using
the sample itself as reference.
CHAPTER 7
Sample and investigation specific filtering
of quantitative data from STR DNA analysis
Publication details
Co-authors: Poul Svante Eriksen (Department of Mathematical Sciences, Aalborg University),
Helle Smidt Mogensen and Niels Morling (Section of Forensic Genetics, Department of Forensic
Medicine, Faculty of Health Science, University of Copenhagen)
Journal: International Journal of Legal Medicine (Under preparation)
Abstract:
The discrimination between positive and negative results in forensic genetic STR DNA analyses
is of utmost importance. We present a method for identification of STR alleles that is based on
(1) discrimination between positive and negative STR results that are specific for the sample and
each STR locus, (2) correction of stutter effects and (3) correction of pull-up effects. The sample
and STR locus specific discrimination was based on a floating threshold that was estimated by
means of distribution analysis of the true negative data elements, i.e. the noise component. The
correction of stutter effects and pull-ups was based on regression analysis. The method was
developed on the basis of STR data of serial dilutions of DNA from four persons in amounts
ranging from 24.6 to 410 pg DNA. The method was tested on two types of data: (1) controlled
experiments with two-person mixtures of DNA in proportions 16:1, 8:1, 4:1, 2:1, 1:1, 1:2, 1:4,
1:8 and 1:16 with a total of 328 to 528 pg DNA in the two-person mixtures, and (2) data from
fingernail swabs from real crime cases.
The method yielded a 16% increase in allele assignment compared to that of a conventional
assignment of STR alleles for the two-person DNA mixtures and a 24% increase for the fingernail
data. A further gain from the method was a more precise identification of the STR types of
contaminated or otherwise compromised DNA samples.
Keywords:
STR typing; Allele assignment; Investigation specific floating threshold; Stutters; Pull-up
effects.
7.1 Introduction
DNA typing with Short TandemRepeat (STR) alleles is typically based on multiplex Polymerase
Chain Reaction (PCR) amplication of the relevant STR DNA, capillary electrophoresis and
uorescence detection of the resulting PCR products. In European DNA crime case laboratories,
the AmpFSTR SGM Plus kit (Applied Biosystems - AB) is widely used. The discrimination
between positive and negative STR results may rely on the individual judgement of the scientist
responsible for the STR typing or on xed criteria like a cut-o of 50 relative uorescent units
(rfu) between positive and negative responses as recommended by the supplier of the kit. A
xed cut-o level may be very useful for practical routine work, but xed cut-o values may
ignore specic circumstances of the investigation being carried out and introduce errors in the
interpretation of the results.
A fixed cut-off of 50 rfu is used in many laboratories for the analysis of routine results although
other methods based on e.g. the signal-to-noise ratio may be used (Gilder et al., 2007). A number
of factors influence the general magnitude of the fluorescent signal, e.g. (1) the amount of
amplifiable DNA in the PCR, (2) the amount of fluorochrome molecules bound to the oligo-DNA
molecules acting as primers in the PCR, (3) the number of PCR cycles, (4) the amount of
detectable, amplified PCR products injected into the electrophoresis capillary, typically controlled
by the injection voltage and injection time, (5) the sensitivity of the fluorescent detection system
and (6) other factors. The level of irrelevant signals, the noise component, is determined by
factors like impurities of the fluorochrome not attached to the amplified DNA molecules, impurities
of the primers and conglomerates of the primers, conglomerates of fluorochromes and other
substances that may be present in the post-PCR reaction volume that is injected into the capillaries.
The PCR amplification with the fluorescent primers is usually responsible for the majority of the
noise signal. Large variations may exist from kit to kit and from batch to batch. Other
contributors to irrelevant noise signals include impurities in the DNA preparations and chemical
reagents other than the STR kit, the DNA sequencer equipment, including the electronic detection and
amplification system, and other components of minor importance.
In multiplex PCR STR kits, a number of fluorochromes are typically used, and the balance
between signal and noise varies between the fluorochromes. The signal intensities of the various
STR loci also vary. Thus, systematic variations such as the STR kit, the batch, batches of other
reagents, the DNA sample, the DNA sequencers with attached electronic equipment, etc., may
influence the level of discrimination between positive and negative reactions in STR typing.
Therefore, it is desirable to develop methods that can determine the threshold between positive
and negative reactions for each of the investigated DNA samples and for each STR locus.
Systematic extra reactions must also be handled during the interpretation of the STR results.
These include stutters, which are caused by infidelity of the Taq polymerase in the PCR resulting
in amplification products typically 4 base pairs (bp) shorter than the true PCR products, and
pull-ups, which are caused by spectral overlap of the fluorochromes used for the detection of the
PCR products. Stutters are often compensated for by ignoring results below a certain ratio of the signal
of the parental peak, i.e. the DNA fragment supposed to cause the stutter signal. Stutter filter
ratios are usually decided for each STR locus based on average data from initial investigations
of small numbers of samples performed by the supplier of the STR kit and are, thus, not necessarily
optimal for all laboratories and/or all alleles in an STR locus. Correction for pull-ups is most
often done by visual inspection although IT-based expert programmes may be used to remove
signals that most likely are caused by pull-up effects.
We have developed a new method for the discrimination between positive and negative STR
results based primarily on analyses of the noise component of the data that represent true
negative STR results. The positive results were further analyzed to identify and correct for stutters
and pull-up effects. We used distribution analysis of the noise component in order to separate the
negative and positive results. Algorithms based on regression analysis were developed to correct
for stutter effects of each STR locus and pull-up effects of each sample. The method makes it
possible to analyze the STR results of a sample according to the results that are specific for the
sample and each STR locus.
The results of the method were compared to those obtained by the method recommended by the
manufacturer of the SGM Plus kit with fixed cut-offs for positive reactions (50 rfu) and stutters.
7.2 Materials and methods
7.2.1 Data
Controlled experiments
The laboratory investigations were performed at the Section of Forensic Genetics, Department
of Forensic Medicine, Faculty of Health Sciences, University of Copenhagen and approved by
the local ethical committee (KF-01-037/03). Genomic DNA from blood samples from two males
and two females was extracted by a standard phenol-chloroform extraction method. The DNA
was quantified in triplicates using the Quantifiler Human DNA Quantification kit (AB) with
Human Genomic DNA Male (Promega) as the quantification standard on an ABI Prism 7000
(AB). The median DNA concentrations were used. Each sample was diluted in water to
approximately 500 pg DNA/µl. The DNA concentrations in the diluted samples were measured again
in triplicates and the medians of the DNA concentrations were recorded for further use.
DNA from each of the four persons was serially diluted with water in the proportions 16:1, 8:1,
4:1, 2:1 and 1:1 for the training STR data set, cf. below. Pair-wise two-person mixtures of DNA
(w/v) in proportions 16:1, 8:1, 4:1, 2:1, 1:1, 1:2, 1:4, 1:8 and 1:16 were made of DNA from each
pair of the four persons for the validation STR data set, cf. below.
The amount of DNA from each person in the diluted single donor samples and the two-person
mixtures was calculated based on the measured DNA concentration of each sample. The total
amount of DNA ranged from 24.6 to 410 pg DNA in the dilutions of single donor samples and
from 328 to 528 pg DNA in two-person mixtures. The DNA was amplified with the AmpFlSTR
SGM Plus kit (AB) in a GeneAmp 9700 PCR thermocycler (AB) as recommended by the
manufacturer.
One µl of the amplificate in 15 µl HiDi Formamide (AB) was analysed on an ABI Prism 3100
Genetic Analyzer (AB) using POP4 (AB) as the polymer and 3 kV injection voltage for 6
seconds.
DNA fragments were detected and fragment sizes were estimated with GeneScan 3.7 (AB).
Genotypes were assigned using GenoTyper 3.7 (AB). The data were analyzed in two ways: (1)
with the method recommended by the manufacturer of the SGM Plus kit with a fixed cut-off of
50 Relative Fluorescence Units (rfu) for the discrimination between positive and negative results
and the recommended stutter filter, and (2) by the presented floating threshold method. The data
for the floating method were generated by GeneScan 3.7 with a detection threshold of 5 rfu.
Genotypes were assigned using GenoTyper 3.7 with no stutter filter applied.
Crime scene fingernail swabs
In addition to the DNA mixtures from controlled experiments with known contributors, the
methodology was also tested on samples from real crime cases. DNA transfer between a victim
and suspect frequently occurs during violent crimes. Debris from fingernail swabs is routinely
analysed in many crime cases such as rapes, assaults and other violent crimes. Often the
contribution of DNA from the suspect is limited and low peak intensities of alleles associated with the
suspect's DNA profile will be produced on analysis.
Data from fingernail swabs from 98 real crime cases were analysed (1) using the standard
protocol with a 50 rfu cut-off and the recommended stutter filter, and (2) using a 5 rfu detection
threshold and no stutter filter, as for the experimental data.
With DNA from a single person, the SGM Plus STR kit detects 10 STR loci and X- and Y-
specific (amelogenin) DNA fragments resulting in a maximum of 22 data elements representing
DNA fragments. The data from single donor dilutions (training data) were used for developing
the various mathematical models, while the data from the two-person mixtures and crime case
samples from fingernail swabs were used to evaluate the performance/efficiency of the
mathematical models.
7.2.2 Data model
The statistical model was derived from STR data from each of four donors (training data). The
model assumes the following major components of STR data:
(1) True positive results from the STRs.
(2) So-called stutters (cf. below) that are DNA artefacts created during the investigations.
(3) So-called pull-ups (cf. below) that are artefacts created during the detection of the various
signals of the fluorochromes that contribute to the identification of the signals of each of the
STR alleles.
(4) Background noise of various kinds.
The quantitative STR data is a mixture of contributions from various sources. Apart from the
signals from true alleles, the signals consist of at least two error components from (1) the PCR
amplification (stutters) and (2) the measurement technique (pull-ups). Stutters are PCR products
typically four base pairs (bp) shorter and, to some extent, four bp longer (back-stutters) than the
true PCR product. Stutters originate from primer mis-pairing in the PCR amplification
creating PCR products that mimic alleles typically one repeat shorter than the true peak (Butler,
2005). In Figure 7.1, the stutter products are shown with effects on both the true peaks (Signal
+ stutter) and on the noise. In the double stutter situation, the stutter peak is caused by stutter
effects from both peaks to the right of the stutter peak. This causes the double stutter peak to
be larger than single stutter peaks because a double-stutter peak is the stutter product of two
peaks. However, the effect of double stutters is not directly implemented in our model, but it is
accounted for in the parameter estimates used in the algorithm (Section 7.2.4).
The quantitative signals are obtained by a very sensitive photocell (CCD camera) detecting the
intensity of light emitted from a fluorochrome on DNA molecules corresponding to alleles of
each STR locus. The signal intensities are measured as rfu. Due to noise in the apparatus,
the observed signal contains a continuous noise part that we denote background noise (peaks
designated Noise in Figure 7.1). The light-detecting system also causes a systematic error
component, namely the pull-up effects. This is caused by overlap of the spectra of the light
emitted from the various fluorochromes as illustrated in Figure 7.2. The pull-up effect is observed
[Figure 7.1 (image not reproduced). Labelled peaks: Noise, Signal, Stutter, Double stutter, Signal + stutter, Pull-up effect.]
Figure 7.1: Picture of the non-signal components of an STR DNA trace.
[Figure 7.2 (plot not reproduced). Horizontal axis: wavelength [nm], 520 to 640; vertical axis: normalised fluorescent intensity [%], 0 to 100.]
Figure 7.2: Fluorescent dye bands: Blue (dashed/semi-gray), green (solid/dark-gray), yellow
(dot-dashed/light-gray) and red (dotted/white). The shaded areas under each curve indicate the
amount of spectral overlap between the various dyes. Reproduced from Applied Biosystems
(2000).
as an increase in the intensities of both the background noise and true peaks. The shaded areas
in Figure 7.2 represent the amount of overlapping light frequencies of the four different colours
(blue, green, yellow and red) used in the SGM Plus kit. The increases caused by pull-up effects
are pictured in Figure 7.1 as Pull-up effect.
In our model, pull-up effects cannot cause stutters, whereas stutters may induce pull-up effects
on other dye bands.
7.2.3 Determination of floating threshold
For each STR locus and amelogenin of a sample, we wanted to obtain a set of negative data in
order to model the negative data elements and develop a threshold for discrimination of positive
and negative signals based on the distribution of the negative signals. We used the training data
set for the development of the mathematical model. The peak height observations (intensities
of fluorescent signals) below 5 rfu were removed at the first step of analysis with the GeneScan
software because the software was unable to handle the large amount of data elements below 5
rfu. The remaining signals comprised both background noise and more systematic components.
We removed (1) all peaks on the allelic ladder that primarily represent true alleles and (2) all
off-ladder signals in pull-up positions. This ensured that the remaining data points represented true
noise. The peaks designated Noise in Figure 7.1 illustrate the data used for the determination
of the threshold.
Inspection of the data indicated that the noise followed a right-skewed distribution. In order
to obtain a normal distribution, the peak heights were transformed by
$\log_e(\text{peak height} - 4.5)$.
The distribution of the noise data fitted the log-normal distribution, and the fit was not better
with distributions like the exponential, Fisher-Tippett, Pareto, Rayleigh, or Weibull distributions.
Figure 7.3 shows the distribution of the observed peak heights of the noise for each locus of a
sample after the data had been transformed by $\log_e(\text{peak height in rfu} - 4.5)$, plotted
against a standard normal distribution in a QQ-plot. Note that the outliers in the upper tail of
the distribution are in fact the true positive signals. The plots demonstrated that the noise
(shifted by 4.5) followed a log-normal distribution with individual parameters $\mu_s$ and
$\sigma_s$ for each locus, s. These parameters determined the intercept and slope of the
superimposed QQ-line and were estimated by
$$\hat\sigma_s = \frac{x_{s(q_1)} - x_{s(q_0)}}{z_{(q_1)} - z_{(q_0)}} \quad\text{and}\quad \hat\mu_s = x_{s(q_0)} - \hat\sigma_s z_{(q_0)},$$
where $x_{s(q)}$ and $z_{(q)}$ are the empirical and standard normal q-quantiles, respectively. We used
these quantile estimators rather than the ordinary maximum likelihood estimators in order to
increase the robustness of the method.
Figure 7.3 shows that the fit to normality was better for the higher values of
$\log_e(\text{peak height in rfu} - 4.5)$ than for the lower ones. The observations in the upper part of
the peak heights of the noise are those of main interest for the estimation of the threshold. Thus,
we chose to use the $(q_0, q_1) = (50\%, 90\%)$-interval for the estimation of the threshold. The
threshold was determined by the mean plus 3.29 times the standard deviation. The locus specific
threshold can be written as:
$$\text{Threshold for locus } s = \exp(3.29\,\hat\sigma_s + \hat\mu_s) + 4.5.$$
Approximately 99.95% of the noise will be below the threshold and, thus, will be categorized
as noise. However, 0.05% of the true negative results will be above the threshold and, thus, will
be categorized as positive signals. For practical purposes, the majority of such false positive
assignments will be in off-ladder positions rather than in allele positions.
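The quantile estimators and the locus specific threshold can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the analyses: the function names, the linear-interpolation quantile rule and the example peak heights are hypothetical, while the shift of 4.5, the factor 3.29 and the (50%, 90%) quantile pair are taken from the text.

```python
import math
from statistics import NormalDist

SHIFT = 4.5      # shift applied before the log transform (from the text)
Z_FACTOR = 3.29  # multiplier on the standard deviation (from the text)


def empirical_quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list (assumed rule)."""
    pos = q * (len(sorted_values) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def floating_threshold(noise_heights_rfu, q0=0.50, q1=0.90):
    """Return (mu_hat, sigma_hat, threshold in rfu) for one locus of one sample.

    noise_heights_rfu: off-ladder peak heights not in pull-up positions,
    all of which must exceed SHIFT (peaks below 5 rfu were removed upstream).
    """
    x = sorted(math.log(h - SHIFT) for h in noise_heights_rfu)
    z0 = NormalDist().inv_cdf(q0)  # standard normal quantiles
    z1 = NormalDist().inv_cdf(q1)
    x0 = empirical_quantile(x, q0)  # empirical quantiles, transformed scale
    x1 = empirical_quantile(x, q1)
    sigma_hat = (x1 - x0) / (z1 - z0)  # slope of the QQ-line
    mu_hat = x0 - sigma_hat * z0       # intercept of the QQ-line
    threshold = math.exp(Z_FACTOR * sigma_hat + mu_hat) + SHIFT
    return mu_hat, sigma_hat, threshold


# Hypothetical noise peak heights (rfu) for a single locus.
example_noise = [5.2, 5.9, 6.4, 7.1, 7.8, 8.5, 9.9, 11.3, 13.0, 16.2]
mu_hat, sigma_hat, threshold = floating_threshold(example_noise)
print(f"mu = {mu_hat:.3f}, sigma = {sigma_hat:.3f}, threshold = {threshold:.1f} rfu")
```

With $q_0 = 50\%$ the standard normal quantile is zero, so the intercept estimate reduces to the median of the transformed noise, which is one reason the estimator is insensitive to true alleles in the upper tail.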
Figure 7.3 shows that only few noise data elements were recorded for amelogenin. This is due
to the fact that the interval, in which noise could be recorded around the X- and Y-windows of
[Figure 7.3 (plots not reproduced). One QQ-plot per locus; horizontal axis: standard normal quantiles (-3 to 3); vertical axis: peak height (5 to 2000 rfu). Legend: true allele, stutter, pull-up (on-ladder), pull-up (off-ladder), on-ladder peak, off-ladder peak, locus specific threshold, fixed 50 rfu threshold, QQ-line. Thresholds shown in the panels: D3 36.4; vWA 37.89; D16 27.29; D2 27.38; AME 24.73; D8 24.73; D21 37.12; D18 34.82; D19 37.54; TH0 32.61; FGA 29.7.]
Figure 7.3: QQ-plots of the observed peaks. Note the different thresholds computed using the
data of the locus itself as reference. For this particular sample, the fixed 50 rfu threshold causes
five drop-outs (in loci D3, D2, D19, D2 and FGA), whereas the locus specific threshold causes
two drop-outs (in loci D19 and D21); one true peak in locus D21 has a peak height of 22 rfu and is
embedded in the noise.
amelogenin, is small. However, amelogenin and D8 are marked with the same fluorochrome
and D8 alleles are only slightly longer than the DNA fragments of X- and Y-amelogenin. The
distributions of noise in amelogenin and D8 were rather similar to each other and, therefore, the
threshold of D8 was also used for amelogenin.
7.2.4 Stutter correction
Figure 7.4 shows three different situations involving stutters. The background noise (grey peaks)
is the same in all three scenarios, but the parental peaks and, thus, the stutter peaks (black peaks)
differ in size.
We used the training data set to develop the mathematical model based on a regression model on
the peak intensities of the parental peaks. Assuming additivity of the noise and stutter product,
we take into account that peaks in stutter positions in front of small peaks mainly consist of noise
Figure 7.4: Stutter peaks caused by different parental peaks (in black). The grey peaks picture
the noise and the dashed line the median of the noise.
as pictured in Figure 7.4. The model of the expected stutter height, $h_{\text{Stutter}}$, is given by
$$h_{\text{Stutter}} = h_{\text{Noise},s} + (\alpha_s + \beta_s \widetilde{bp}_s)\, h_{\text{Parent}}, \tag{7.1}$$
where $h_{\text{Noise},s}$ is the known/determined median of the off-ladder peaks not in pull-up position
on locus s (see Section 7.2.3) and $h_{\text{Parent}}$ is the parental peak's height. The parameters were
estimated by a weighted least squares fit with weights $1/h_{\text{Parent}}$, due to the proportionality of the
mean and variance of the peak heights (Tvedebrink et al., 2010). In the latter term, $\widetilde{bp}_s$ is the
base pair deviation from the mean base pair, $\overline{bp}_s$, on locus s,
$\widetilde{bp}_s = bp_s - \overline{bp}_s$. The parameter $\alpha_s$
is the average stutter effect at a given locus, s. By including the base pairs in the model, we are
able to have different stutter fractions for various alleles within a locus, if necessary.
Table 7.1: Estimates of $\alpha_s$ and $\beta_s$ in the stutter model (7.1).
Locus                                 D3     vWA    D16    D2     D8     D21    D18    D19    TH0    FGA
$\hat\alpha_s \times 10^2$            7.209  6.714  6.101  7.712  4.996  6.359  7.031  7.252  2.031  6.508
SE($\hat\alpha_s$) $\times 10^3$      0.930  0.883  0.874  0.848  0.645  0.836  0.776  1.348  1.310  1.221
$\hat\beta_s \times 10^2$             0.104  0.203  0.212  0.091  0.096  0.075  0.172  0.215  0.069  0.123
SE($\hat\beta_s$) $\times 10^3$       0.112  0.118  0.137  0.064  0.069  0.140  0.099  0.215  0.184  0.139
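As an illustration of how the stutter model (7.1) turns the estimates of Table 7.1 into an expected peak height, the following Python sketch evaluates the model for a single parental peak. The names `alpha` and `beta` for the two locus specific parameters, the function name, and the example noise level, parental height and base pair values are assumptions made for the example; only the numerical estimates are taken from Table 7.1.

```python
# Estimates from Table 7.1, rescaled from the tabulated "x 10^2" values.
# The parameter names alpha (average stutter effect) and beta (base pair
# effect) are assumed labels for the two rows of the table.
STUTTER_ESTIMATES = {
    "D3":  (7.209e-2, 0.104e-2), "vWA": (6.714e-2, 0.203e-2),
    "D16": (6.101e-2, 0.212e-2), "D2":  (7.712e-2, 0.091e-2),
    "D8":  (4.996e-2, 0.096e-2), "D21": (6.359e-2, 0.075e-2),
    "D18": (7.031e-2, 0.172e-2), "D19": (7.252e-2, 0.215e-2),
    "TH0": (2.031e-2, 0.069e-2), "FGA": (6.508e-2, 0.123e-2),
}


def expected_stutter_height(locus, h_parent, h_noise, bp, mean_bp):
    """Expected height of the peak in the stutter position of a parental peak.

    h_parent : height of the parental peak (rfu)
    h_noise  : median noise level of the locus in this sample (rfu)
    bp       : fragment length of the parental allele (base pairs)
    mean_bp  : mean fragment length on the locus (base pairs)
    """
    alpha, beta = STUTTER_ESTIMATES[locus]
    bp_deviation = bp - mean_bp
    return h_noise + (alpha + beta * bp_deviation) * h_parent


# Hypothetical example: a 1000 rfu parental peak on D3 at the mean fragment
# length, with a median noise level of 10 rfu.
print(expected_stutter_height("D3", h_parent=1000, h_noise=10, bp=120, mean_bp=120))
# The same peak 8 bp above the locus mean has a larger expected stutter.
print(expected_stutter_height("D3", h_parent=1000, h_noise=10, bp=128, mean_bp=120))
```

For small parental peaks the noise term dominates the prediction, which matches the remark that peaks in stutter positions in front of small peaks consist mainly of noise.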
The previously observed increase in stutter percentage as a function of allele number (Applied
Biosystems, 2006) was reproduced by the positive estimates of $\beta_s$ in Table 7.1. The STR locus
specific $\alpha_s$ parameters in Table 7.1 are in accordance with the picture in the manufacturer's kit
documentation (Applied Biosystems, 2006, Figure 9-5, 9-6 and 9-7), where e.g. the average
stutter effect, $\alpha_{\text{TH0}}$, in TH0 is the weakest.
Figure 7.5 shows the stutter peak heights predicted by the model compared to the observed stutter
peak heights. The plot demonstrates that the model in (7.1) is sufficient to describe
the stutters. For adjacent heterozygous alleles, the base pairs typically differ by only a limited
number of bp, which minimizes the effect of the length of the DNA fragment estimated by $\beta_s$.
Additional examinations of the data also made it clear that back-stutters were present, typically
in the position 4 bp larger than the parental peak. The model for back-stutters and correction
[Figure 7.5 (plots not reproduced). One panel per locus (D3, vWA, D16, D2, D8, D21, D18, D19, TH0, FGA); horizontal axis: observed stutter height; vertical axis: predicted stutter height; tick marks at 25, 100 and 225 rfu.]
Figure 7.5: Predicted stutter peak heights plotted against observed stutter peak heights with the
identity line superimposed. The scale of the plot is the variance stabilising square-root
transformation.
is based on the same idea as that for conventional stutters, with a noise level and an
additional effect from the parental peak, i.e.
$$h_{\text{Backstutter}} = h_{\text{Noise},s} + \gamma_s h_{\text{Parent}}. \tag{7.2}$$
Table 7.2 shows the parameter estimates. The lack of homozygous alleles in some of the loci in
the actual data set implied that the estimates of $\gamma_s$ were insignificant for these loci. This is due to
the fact that the parental peak needs to reach a certain height (typically well above 1,000 rfu) for
the back-stutter to exceed the noise level. For the same reason, base pairs were not included in
the back-stutter model as only a few base pair lengths were represented in the back-stutter data.
Double stutters originating from two adjacent alleles separated by 4 bp in a heterozygous
individual behave slightly differently from single stutters. In Appendix 7.A, we evaluate the ratio
of the stutter peak to the mean of the two parental peaks. In situations where the heterozygous
alleles are not adjacent (separated by more than 4 bp) or when a stutter originates from a
homozygous allele, for practical purposes, we only need to consider the ratio of the stutter peak to
the parental peak.
Table 7.2: Parameter estimates of the back-stutter model (7.2). Loci with insignificant $\gamma_s$
estimates due to lack of homozygous alleles were removed.
Locus                                 D3     vWA    D16    D2     D8     D21    D18    FGA
$\hat\gamma_s \times 10^2$            0.233  0.211  0.428  0.187  0.293  0.615  0.560  0.638
SE($\hat\gamma_s$) $\times 10^3$      0.643  0.738  0.628  0.627  0.551  0.625  0.561  0.747
7.2.5 Pull-up correction
We defined pull-ups as peaks on different dye bands within 0.5 bp of the parental bp lengths.
Only the peaks not being true alleles or possible stutters on a different dye band than the parental
peak were included in the data analyses. Figure 7.1 shows an increase of the noise level
(rightmost on the upper band) and a true peak (heterozygote imbalance, leftmost on the upper band).
We used the training data set to develop the mathematical model based on regression.
Figure 7.6 shows examples of pull-up values as function of the values of the parental peaks for
the various colours. The magnitudes of the observed pull-up effects were in accordance with the
spectral overlap in Figure 7.2, i.e. the effects of green signals in the yellow spectrum and of green
signals in the blue spectrum were the two largest, and yellow signals had the smallest effect in
the blue spectrum.
For predictive purposes, we fitted a linear model to the observed data in Figure 7.6. Of the
included data points, only a limited subset comprised detectable pull-up peaks, while the remaining
observations were background noise in pull-up positions. Our model takes this into account by
having a noise dependent intercept, $h_{\text{Noise},s}$, for locus s (median of the noise data described in
Section 7.2.3). This approach is similar to the one used in the model for correction of stutter
effects. In the formulation of the model, the notation $D \to d$ reflects that the parental peak is in
fluorescent dye band D and the pull-up peak is located in the fluorescent dye band d,
$$h_{\text{Pull-up}} = h_{\text{Noise},s} + \delta_{D \to d}\, h_{\text{Parent}}, \tag{7.3}$$
where parameters were estimated by a weighted least squares fit. Table 7.3 shows the parameter
estimates of $\delta_{D \to d}$. The superimposed lines in Figure 7.6 were based on the parameter estimates
of Table 7.3. Thus, the superimposed lines are in accordance with the spectral overlaps in
Figure 7.2 except for $\delta_{G \to B}$, which is smaller than expected. This may be due to the particular alleles
included in our data set.
Table 7.3: Parameter estimates of the various overlapping fluorescent dyes.
Dye $\to$ dye                                  B→G    B→Y    G→B    G→Y    Y→B    Y→G
$\hat\delta_{D \to d} \times 10^2$             1.039  0.449  0.342  0.978  0.322  0.597
SE($\hat\delta_{D \to d}$) $\times 10^3$       0.405  0.357  0.411  0.341  0.560  0.482
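The pull-up model (7.3) and the back-stutter model (7.2) share the same structure: a known noise intercept plus a single slope on the parental peak height. A weighted least squares fit of such a slope has a simple closed form, sketched below in Python. This is an illustration under assumptions rather than the fitting code used for the thesis: the function names and the data are hypothetical, and the weights $1/h_{\text{Parent}}$ are those stated for the stutter model, assumed here to carry over to the pull-up fit.

```python
def fit_slope_known_intercept(parent_heights, artefact_heights, h_noise, weights=None):
    """Weighted least squares slope for y = h_noise + slope * x.

    Minimises sum_i w_i * (y_i - h_noise - slope * x_i)^2 with h_noise
    treated as known. The default weights 1 / x_i are an assumption taken
    from the stutter model.
    """
    if weights is None:
        weights = [1.0 / x for x in parent_heights]
    numerator = sum(w * x * (y - h_noise)
                    for w, x, y in zip(weights, parent_heights, artefact_heights))
    denominator = sum(w * x * x for w, x in zip(weights, parent_heights))
    return numerator / denominator


def predicted_pull_up(h_parent, h_noise, slope):
    """Expected peak height in a pull-up position, following model (7.3)."""
    return h_noise + slope * h_parent


# Hypothetical training data: parental peaks (rfu) in one dye band and the
# peaks observed at the same fragment length in another dye band.
parents = [400.0, 1600.0, 3600.0, 6400.0]
pull_ups = [9.5, 20.0, 42.0, 70.5]
noise_median = 5.0  # hypothetical median noise level of the locus

slope = fit_slope_known_intercept(parents, pull_ups, noise_median)
print(f"estimated slope: {slope:.5f}")

# Prediction with the blue-to-green estimate of Table 7.3 (1.039 x 10^-2).
print(predicted_pull_up(h_parent=2000.0, h_noise=noise_median, slope=1.039e-2))
```

With weights $1/h_{\text{Parent}}$ the estimator reduces to the sum of the noise-corrected artefact heights divided by the sum of the parental heights, so large parental peaks do not dominate the fit as they would under ordinary least squares.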
[Figure 7.6 (plots not reproduced). One panel per dye combination (parental dye → pull-up dye): blue → green, blue → yellow, green → blue, green → yellow, yellow → blue, yellow → green; horizontal axis: parental peak height (0 to 6400 rfu); vertical axis: pull-up peak height (4 to 196 rfu).]
Figure 7.6: Pull-up effects stratified by overlapping fluorescent dyes. The superimposed lines
indicate the estimated model. The scale of the plot is the variance stabilising square-root
transformation.
7.3 Results
The parameters for correction of pull-up and stutter effects were estimated using the dilutions of
non-mixture samples, whereas the overall performance of the filter was based on analysis of all
possible combinations of pairwise two-person mixtures of four profiles in mixture ratios ranging
from 1:16 to 1:1.
The sequence of steps was as follows:
(1) Determination of the floating threshold: Determine the threshold and detect potential
stutters, pull-up effects and true peaks, i.e. alleles with peak heights above the threshold.
(2) Pull-up correction: Correct for pull-up effects caused by peaks above the threshold
determined in (1).
(3) Stutter correction: Correct for stutter effects caused by peaks above the threshold
determined in (1).
(4) Allele assignment: Assign alleles according to the floating threshold determined in
(1) and the allelic ladder.
Note that the corrections for pull-up effects were made before the stutter correction was applied,
because stutters may cause pull-ups while pull-ups cannot cause stutters.
7.3.1 DNA mixtures from controlled experiments
We used our floating threshold, stutter and pull-up correction method on 107 two-person
mixtures. In Table 7.4, we summarise the performance of the overall filter. It is worth emphasising
that 263 of the true alleles dropped out and that the stutter filter let 6 stutters and no back-stutters
slip through. In addition to the stutter peaks, another 25 on-ladder peaks (21 drop-ins and 4
pull-ups) were classified as proper peaks of the samples.
Table 7.4: Filtered and passed peaks classified by type.
Classification     Assigned negative result   Assigned positive result
True allele                  263                      3,308
Stutter                    2,167                          6
Backstutter                1,260                          0
Noise                     62,669                        324
  On-ladder               11,619                         21
  Off-ladder              51,050                        303
Pull-up                    3,825                         14
  On-ladder                  982                          4
  Off-ladder               2,843                         10
The remaining peaks passing the filter were all in off-ladder positions and were removed from the
analysis afterwards. The data were also analysed following the standard protocol of the Section
of Forensic Genetics, Department of Forensic Medicine, Faculty of Health Sciences, University
of Copenhagen. Using the technique recommended by the manufacturer, 312 drop-outs were
observed together with 27 stutters and 7 pull-up peaks. Thus, the number of drop-outs of true
alleles was 16% lower with locus specific filtering (263) than with a fixed 50 rfu threshold (312).
The classification tables for the two methods are listed in Table 7.5. In the classification tables
each observation is categorised by its classification and actual class. In Table 7.5 the diagonals
are the correctly classified observations, while the off-diagonals are the misclassified ones. The lower
the counts in the off-diagonal cells, the better the classification methodology.
To summarize the classification table in a single value we suggest the misclassification rate,
which is the ratio of the number of misclassified observations to the number of correctly
classified observations. From the misclassification rates (bottom lines in Table 7.5) we see that
the floating threshold method yields a better classification than the fixed 50 rfu threshold.
Table 7.5: Classification tables for the two methods: Fixed 50 rfu and floating threshold.
Floating threshold
                Expected +   Expected −
Observed +         3,308          31
Observed −           263      69,921
Misclassification rate: 0.401%
Fixed 50 rfu threshold
                Expected +   Expected −
Observed +         3,259          34
Observed −           312      69,918
Misclassification rate: 0.473%
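The misclassification rates in Table 7.5 can be reproduced directly from the four cell counts of each classification table. The short Python sketch below does this; the function name and argument names are chosen for the example, while the counts are those of Table 7.5.

```python
def misclassification_rate(true_pos, false_pos, false_neg, true_neg):
    """Misclassified observations relative to correctly classified ones.

    Note that the denominator is the number of correct classifications,
    not the total number of observations.
    """
    return (false_pos + false_neg) / (true_pos + true_neg)


# Counts from Table 7.5 (two-person mixtures).
floating = misclassification_rate(true_pos=3308, false_pos=31,
                                  false_neg=263, true_neg=69921)
fixed = misclassification_rate(true_pos=3259, false_pos=34,
                               false_neg=312, true_neg=69918)

print(f"floating threshold: {100 * floating:.3f}%")
print(f"fixed 50 rfu:       {100 * fixed:.3f}%")

# Relative reduction in drop-outs of true alleles (312 versus 263).
drop_out_reduction = (312 - 263) / 312
print(f"drop-out reduction: {100 * drop_out_reduction:.0f}%")
```

Because almost all observations are correctly classified noise, dividing by the total number of observations instead would change the reported percentages only marginally.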
7.3.2 Fingernail swabs from crime cases
Data from 98 crime cases were analysed using the approach presented. The Section of Forensic
Genetics, University of Copenhagen, supplied the data from the crime cases together with
reference samples associated with the crime. These reference samples may explain the observed
stain, but since contamination from other biological material and debris may accumulate under
the fingernails, the number of random drop-ins may be misleading (Cook and Dixon, 2006).
From Table 7.6 we see that the number of drop-outs decreased by 220 events from 912 using the
standard protocol to 692 using the sample specific setup (a decrease of 24%). However, this gain
in fewer drop-outs comes at the cost of more drop-ins. The standard protocol gave 15 drop-ins
versus 90 using our approach. In the experiment conducted by Cook and Dixon (2006), foreign
DNA was detected in 13% of the fingernail swabs taken from the participating individuals.
Hence, the higher number of drop-ins using our more sensitive methodology may be caused
by foreign DNA. Such alleles would be true alleles rather than drop-ins; however, this
cannot be determined from the available data.
Table 7.6: Classification tables for the two methods: Fixed 50 rfu and floating threshold.
Floating threshold
                Expected +   Expected −
Observed +         1,460          90
Observed −           692      82,497
Misclassification rate: 0.931%
Fixed 50 rfu threshold
                Expected +   Expected −
Observed +         1,240          15
Observed −           912      82,572
Misclassification rate: 1.106%
The misclassification rates in Table 7.6 are more than twice the rates of Table 7.5. This is
a consequence of the data being from real crime cases with many degraded samples and low
amounts of DNA. Hence, the number of drop-outs is larger and so is the number of partial DNA
profiles.
In addition, Table 7.7 compares the drop-outs and drop-ins of the two methods. We see that 236
of the drop-outs under the standard protocol were correctly declared as true alleles using the
floating threshold method. However, sixteen of the allelic drop-outs from the floating threshold
method did not drop out using the standard method. More than half of these new drop-outs were
located in locus D3, which tends to have a higher noise level compared to the other loci in this
dataset. This may be due to primer residue increasing the background noise for the shorter loci
in the electrophoresis.
Table 7.7: Comparisons of the drop-ins and drop-outs produced by the two methods.
                                     Dropped out         Dropped in
Fixed 50 rfu threshold               Yes      No         Yes     No
Floating threshold      Yes          676      16           7     83
                        No           236   1,224           8      -
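The cross-tabulation in Table 7.7 can be checked against the totals of Table 7.6 by summing its margins. The following Python sketch performs this check; the dictionary layout and the helper name are choices made for the example, and the counts are those printed in the two tables.

```python
# Keys are (floating threshold, fixed 50 rfu threshold) outcomes.
DROP_OUT = {("yes", "yes"): 676, ("yes", "no"): 16,
            ("no", "yes"): 236, ("no", "no"): 1224}
# The (no, no) cell is not reported for drop-ins.
DROP_IN = {("yes", "yes"): 7, ("yes", "no"): 83, ("no", "yes"): 8}

FLOATING, FIXED = 0, 1  # position of each method within the key


def margin(table, method, outcome="yes"):
    """Total count for one method having the given outcome."""
    return sum(n for key, n in table.items() if key[method] == outcome)


floating_drop_outs = margin(DROP_OUT, FLOATING)  # drop-outs, floating threshold
fixed_drop_outs = margin(DROP_OUT, FIXED)        # drop-outs, fixed 50 rfu
floating_drop_ins = margin(DROP_IN, FLOATING)
fixed_drop_ins = margin(DROP_IN, FIXED)

relative_decrease = (fixed_drop_outs - floating_drop_outs) / fixed_drop_outs

print(floating_drop_outs, fixed_drop_outs, floating_drop_ins, fixed_drop_ins)
print(f"decrease in drop-outs: {100 * relative_decrease:.0f}%")
```

The margins agree with the off-diagonal cells of Table 7.6, and the four drop-out cells add up to the total number of expected alleles in that table.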
7.4 Discussion
Previously, Gilder et al. (2007) indicated that observations from the negative controls of
the same run as the samples could be used to extract information about the noise level. However,
their approach did not take variation between the capillaries into account. Our analysis shows
significant differences between the capillaries with negative controls within each run,
and also significant differences between the same capillaries with negative controls for different
runs. This suggests that the noise distribution is constant neither within runs nor for the same
capillary over consecutive runs. Hence, our approach, where we use the sample itself in order to
determine the noise distribution, is recommended, as it eliminates the between-run and
between-capillary variation. Furthermore, the stratification on loci for determining the threshold
clearly improves the noise filtering as indicated by Figure 7.3.
The fixed 50 rfu threshold yields in many cases the same number of drop-outs as the locus
specific floating threshold. In Figure 7.7, the box-plots show the thresholds of the 107 mixture
samples. For all loci, the median of the floating threshold is lower than the fixed 50 rfu limit.
Note that within each dye band, the threshold median tends to decrease with the base pair length.
[Figure 7.7 (plot not reproduced). Horizontal axis: locus, grouped by dye: blue (D3, vWA, D16, D2), green (D8, D21, D18), yellow (D19, TH0, FGA); vertical axis: threshold [rfu], 20 to 120.]
Figure 7.7: Box-plots of the estimated locus specific floating threshold for the 107 mixture
cases.
An advantage of the locus specific threshold is that it enables the case worker to assess the noise
level of the sample. Hence, for cases where a peak lies just below 50 rfu, the magnitude of the
locus specific threshold indicates whether it is reasonable to include the peak in the signal or not.
Furthermore, in cases where the transformed peak heights, $\log_e(\text{peak height} - 4.5)$, deviate
substantially from normality, the data indicate that the sample may be subject to extensive noise or
contamination of some kind. This may be used as a sample quality diagnostic in order to
determine if a re-analysis is necessary. This deviation may be observed from the QQ-plots and other
usual diagnostics to validate assumptions of normality.
Since the stutter and pull-up corrections are based on a regression model, the parameters have
been tuned for this specific data set. In general, the parameters must be determined for each
laboratory, kit and DNA sequencer.
However, the trend in parameter magnitudes for the different pull-up directions is expected to
hold in general, possibly with an increase in the $\delta_{G \to B}$ parameter estimate. It is also worth
emphasising the dependency on the kit used for DNA typing. The data used in our analyses were
obtained using the SGM Plus kit from Applied Biosystems. That is, the parameters of the stutter and
pull-up filters are not directly applicable to other kits.
7.5 Conclusion
The methodology of regression and distributional analysis of the noise yielded satisfactory
results for deducing a sample and investigation specific filter for STR DNA typing. Comparison
of the results with those based on the recommendations of the manufacturers indicated that
the number of drop-outs for the two validation data sets decreased by 16% and 24%, respectively.
Studies of different data sets supported this improvement and suggest that the methodology of
the threshold determination is adequate for the noise filtering of quantitative STR data.
The filters for pull-up effects and stutters, based on regression analysis trained on non-mixture
data, also showed applicability to mixed DNA samples. As mentioned in Section 7.4, the
parameter estimates in the filter were tuned for this specific data set and the alleles of the included
profiles. Hence, the estimation of the parameters must be part of a laboratory's internal quality
assessment, where the consistency of the estimates over time is a quality indicator.
Appendix
7.A Double stutters
In Gill et al. (2005), the authors argue that, once a stutter has been formed, it replicates during
subsequent PCR cycles as an ordinary allele. We investigate the behaviour of stutters
when we have two adjacent alleles of a heterozygous profile, i.e. what we called a double stutter.
Let $h_i$ denote the expected value of the pre-PCR peak height of allele $i$. Then for two adjacent alleles
$n$ and $n+1$ from the same contributor, we have $h_n = h_{n+1} = h$ and $h_{n-1} = 0$ for the stutter position
$n-1$. Let $P$ denote the effect of one PCR cycle. Then, after $t$ PCR cycles, the expected value of
the post-PCR peak heights $h^{(t)}_i$ is given by
$$\left(h^{(t)}_{n+1},\, h^{(t)}_{n},\, h^{(t)}_{n-1}\right)^{\top} = P^{t}\,(h, h, 0)^{\top},$$
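where $x^{\top}$ denotes the transpose of the vector $x$. $P$ may be specified in terms of the PCR
efficiency in one cycle, $p$, and the one-cycle stutter percentage, $\xi$,
$$
\begin{pmatrix} h^{(t)}_{n+1} \\ h^{(t)}_{n} \\ h^{(t)}_{n-1} \end{pmatrix}
=
\begin{pmatrix} 1+p & 0 & 0 \\ \xi & 1+p & 0 \\ 0 & \xi & 1+p \end{pmatrix}^{t}
\begin{pmatrix} h \\ h \\ 0 \end{pmatrix}
=
\begin{pmatrix}
(1+p)^{t} & 0 & 0 \\
t\xi(1+p)^{t-1} & (1+p)^{t} & 0 \\
\binom{t}{2}\xi^{2}(1+p)^{t-2} & t\xi(1+p)^{t-1} & (1+p)^{t}
\end{pmatrix}
\begin{pmatrix} h \\ h \\ 0 \end{pmatrix}.
$$
The second equality can be shown using some linear algebra: writing $P = (1+p)I + \xi N$, where $N$ has ones on the subdiagonal and satisfies $N^{3} = 0$, the binomial expansion of $P^{t}$ terminates after the $N^{2}$ term. Define $\tau = t\xi/(1+p)$ to be the
stutter percentage for the entire PCR process comprising $t$ cycles. This definition ensures that
the stutter percentage increases with the number of cycles as noted in the literature (Gill et al.,
2000). The expression can then be rewritten as
$$
\begin{pmatrix} h^{(t)}_{n+1} \\ h^{(t)}_{n} \\ h^{(t)}_{n-1} \end{pmatrix}
\approx
\begin{pmatrix} 1 & 0 & 0 \\ \tau & 1 & 0 \\ \tau^{2}/2 & \tau & 1 \end{pmatrix}
\begin{pmatrix} h_0 \\ h_0 \\ 0 \end{pmatrix}
=
\begin{pmatrix} h_0 \\ h_0(1+\tau) \\ h_0\left(\tau + \tau^{2}/2\right) \end{pmatrix}
\tag{7.4}
$$
where $h_0 = (1+p)^{t}h$ and the $\approx$ is due to $t(t-1)/2 \approx t^{2}/2$ from the binomial coefficient. The error
induced by this approximation is negligible for $t \approx 28$ cycles.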
The peak height, $h_0$, can be interpreted as the actual peak height after the PCR process. In Gill
et al. (2005), the authors use $p = 0.8$ as the efficiency of a PCR cycle, indicating that the
theoretical doubling effect of each cycle (which requires $p = 1$) is not met in practice. Note that
our approach differs from the work of Gill et al. (2005), who model the PCR process at the
nucleic level, whereas we work in terms of quantitative measures of peak heights.
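The closed form of the $t$-cycle matrix and the size of the approximation in (7.4) can be checked numerically. The following is a minimal sketch in plain Python, not part of the original analysis. It uses $p = 0.8$ and $t = 28$ from the text; the one-cycle stutter value `xi = 0.004` is purely an illustrative assumption (chosen so that the whole-process stutter is about 6%), and the names `xi`, `tau`, `mat_mul` and `mat_pow` are hypothetical.

```python
from math import comb


def mat_mul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_pow(m, t):
    """Raise a 3x3 matrix to the integer power t by repeated multiplication."""
    result = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    for _ in range(t):
        result = mat_mul(result, m)
    return result


# p and t follow the text; xi is an assumed, illustrative one-cycle stutter rate.
p, xi, t, h = 0.8, 0.004, 28, 1.0

# One-cycle matrix; rows and columns are ordered as positions n+1, n, n-1.
P = [[1 + p, 0.0, 0.0],
     [xi, 1 + p, 0.0],
     [0.0, xi, 1 + p]]
Pt = mat_pow(P, t)

# Closed form of the first column of P^t (binomial expansion).
closed = [(1 + p) ** t,
          t * xi * (1 + p) ** (t - 1),
          comb(t, 2) * xi ** 2 * (1 + p) ** (t - 2)]

# Exact expected post-PCR heights for pre-PCR heights (h, h, 0).
start = (h, h, 0.0)
exact = [sum(Pt[i][j] * start[j] for j in range(3)) for i in range(3)]

# Approximation (7.4) with tau = t*xi/(1+p) and h0 = (1+p)^t * h.
tau = t * xi / (1 + p)
h0 = (1 + p) ** t * h
approx = [h0, h0 * (1 + tau), h0 * (tau + tau ** 2 / 2)]

# Relative error of (7.4) for each of the three positions.
rel_err = [abs(a - e) / e for a, e in zip(approx, exact)]
print(rel_err)
```

Only the stutter position $n-1$ is affected by replacing $t(t-1)/2$ with $t^{2}/2$; for these inputs its relative error is of the order of one part in a thousand.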
Often it is assumed that the peak at position $n+1$, $\tilde{h}^{(t)}_{n+1}$, equals some true height, $\tilde{h}$, after $t$ PCR
cycles. Due to stuttering, the peak at position $n$ equals $\tilde{h}$ plus an additional fraction, $\tau$, from the
$n+1$-position peak, $\tilde{h}^{(t)}_{n} = (1+\tau)\tilde{h}^{(t)}_{n+1} = (1+\tau)\tilde{h}$. Furthermore, the peak height of the stutter peak at
position $n-1$ is $\tilde{h}^{(t)}_{n-1} = \tau\tilde{h}^{(t)}_{n} = (\tau + \tau^{2})\tilde{h}$. This can be written using matrices as
$$
\begin{pmatrix} \tilde{h}^{(t)}_{n+1} \\ \tilde{h}^{(t)}_{n} \\ \tilde{h}^{(t)}_{n-1} \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 \\ \tau & 1 & 0 \\ \tau^{2} & \tau & 1 \end{pmatrix}
\begin{pmatrix} \tilde{h} \\ \tilde{h} \\ 0 \end{pmatrix}
\tag{7.5}
$$
where the difference ($\tau^{2}/2 \neq \tau^{2}$) between the matrices in (7.4) and (7.5) is induced by the delay
of one cycle in the stutter product from the $n+1$-position peak to the stutter peak in position $n$.
Hence, the relative contribution from the $n+1$-peak is smaller than modelled in (7.5) since, once
formed, the stutter peak is amplified as a regular peak (Gill et al., 2005), which is captured by
(7.4).
When referring to the stutter percentage, $\tau$, we define it as the percentage of the parental peak
that is transferred to the stutter peak, $\tau = h^{(t)}_{n-1}/h^{(t)}_{n}$. However, with two true alleles located at
positions $n$ and $n+1$, we find
$$
\frac{h^{(t)}_{n-1}}{h^{(t)}_{n}} = \frac{h_0\,\tau\left(1+\frac{\tau}{2}\right)}{h_0(1+\tau)}.
$$
In this situation, the ratio of the stutter peak to the mean of the two parental peaks yields the
stutter percentage,
$$
\frac{h^{(t)}_{n-1}}{\frac{1}{2}\left(h^{(t)}_{n}+h^{(t)}_{n+1}\right)}
= \frac{h_0\,\tau\left(1+\frac{\tau}{2}\right)}{\frac{1}{2}\left(h_0(1+\tau)+h_0\right)}
= \frac{h_0\,\tau\left(1+\frac{\tau}{2}\right)}{h_0\left(1+\frac{\tau}{2}\right)}
= \tau.
$$
In situations where the heterozygous alleles are not adjacent (separated by more than 4 base
pairs) or when the stutter originates from a homozygous allele, we need for practical purposes only
to consider the direct ratio $h^{(t)}_{n-1}/h^{(t)}_{n}$ in order to estimate $\tau$.
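The effect of the two estimators can be illustrated with a small numerical sketch. It assumes the expected heights of (7.4) for two adjacent alleles; the values `h0 = 1000.0` and `tau = 0.06` and the function name `double_stutter_heights` are hypothetical choices made only for illustration.

```python
def double_stutter_heights(h0, tau):
    """Expected heights at positions (n+1, n, n-1) according to (7.4)."""
    return h0, h0 * (1 + tau), h0 * (tau + tau ** 2 / 2)


h0, tau = 1000.0, 0.06  # illustrative values, not estimates from the data
h_np1, h_n, h_nm1 = double_stutter_heights(h0, tau)

# Direct ratio to the neighbouring parental peak: biased downwards here,
# because the peak at n already contains stutter from the peak at n+1.
direct_ratio = h_nm1 / h_n

# Ratio to the mean of the two parental peaks: recovers tau.
mean_ratio = h_nm1 / (0.5 * (h_n + h_np1))

print(direct_ratio, mean_ratio)
```

Under these assumptions the direct ratio underestimates the stutter percentage, whereas the ratio to the mean parental peak returns the value of `tau` that was put in.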
Bibliography
Applied Biosystems (2000). GeneScan Reference Guide - Chemistry Reference for the ABI PRISM 310 Genetic Analyzer. Applied Biosystems. Figure "Virtual Filter Set F", pp. 4-10.
Applied Biosystems (2006). AmpFlSTR SGM Plus PCR Amplification Kit User's Manual. Applied Biosystems.
Butler, J. M. (2005). Forensic DNA Typing: Biology, Technology, and Genetics of STR Markers (2 ed.). Burlington, MA: Elsevier Academic Press Inc., U.S.
Cook, O. and L. Dixon (2006). The prevalence of mixed DNA profiles in fingernail samples taken from individuals in the general population. Forensic Science International: Genetics 1(1), 62–68.
Gilder, J. R., T. E. Doom, K. Inman, and D. E. Krane (2007). Run-Specific Limits of Detection and Quantitation for STR-based DNA Testing. Journal of Forensic Science 52(1), 97–101.
Gill, P. D., J. M. Curran, and K. Elliot (2005). A graphical simulation model of the entire DNA process associated with the analysis of short tandem repeat loci. Nucleic Acids Research 33(2), 632–643.
Gill, P. D., J. Whitaker, C. Flaxman, N. Brown, and J. S. Buckleton (2000). An investigation of the rigor of interpretation rules for STRs derived from less than 100 pg of DNA. Forensic Science International 112(1), 17–40.
Tvedebrink, T., P. S. Eriksen, H. S. Mogensen, and N. Morling (2010). Evaluating the weight of evidence using quantitative STR data in DNA mixtures. Journal of the Royal Statistical Society. Series C, Applied statistics. In Press.
7.6 Supplementary remarks
The three-person mixtures discussed in Section 5.10 were also analysed using the floating threshold
methodology. As in Section 5.10 we discarded 17 out of the 120 samples due to preparation
or run errors. For the remaining cases, the minimum amount of DNA contributed to a true allele
was approximately 77.5 pg. Hence, the number of peak heights close to the limit of detection
(50 rfu) is expected to be low, since experience shows that with about 50 pg pre-PCR product the
average peak heights are close to this limit.
When using the standard protocol with a fixed 50 rfu threshold, 8 drop-outs and 80 extra peaks
not assigned to the contributors were observed. The extra peaks were distributed as 47 stutters,
10 pull-ups and 23 drop-ins. For the floating threshold there were 4 drop-outs and 85 extra peaks, which
were categorised as 27 stutters, 2 pull-ups and 56 drop-ins. Hence, the performance of the two
methods was almost identical with respect to the misclassification rates, which were 0.123%
and 0.124%, respectively. This non-significant difference in the assignment of alleles indicates that
the fixed 50 rfu threshold is very reasonable for standard applications.
However, the methodology may be useful in situations where the amount of DNA contributed by
a suspect is limited. Under such circumstances the peak intensities associated with the suspect's
profile may be close to the fixed limit of detection, e.g. with the majority of peak heights in the
range 40 rfu to 60 rfu. Peaks below the limit of detection, 50 rfu say, would conventionally be
declared as drop-outs. However, if the level of the noise supports a floating threshold limit of 30
rfu, such considerations need not be made, since no alleles would drop out in this case. Often
a case worker is able to visually detect peaks belonging to the suspect in the EPG below the limit
of detection. However, lowering the limit of detection in order to include the suspect is clearly
erroneous and unfavourable to the defendant since, taken to the extreme, any DNA profile
could be included in the crime related stain.
Furthermore, the method of adjusting for the contribution of stutter and pull-up effects is more
accurate than just removing the peaks due to so-called masking. Keeping all relevant information in the
system is desirable, since having a peak in stutter position that after adjustment has a peak height
of 35 rfu, say, is more informative than having an NA observation due to removal of a potential
stutter.
CHAPTER 8
Statistical model for degraded DNA samples
and adjusted probabilities for allelic drop-out
Publication details
Co-authors: Poul Svante Eriksen (1), Helle Smidt Mogensen (2) and Niels Morling (2)
(1) Department of Mathematical Sciences, Aalborg University
(2) Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health Science, University of Copenhagen
Journal: Forensic Science International: Genetics (Under preparation)
Abstract:
DNA samples found at a scene of crime or obtained from the debris of a mass disaster accident
are often subject to degradation. When using STR DNA technology, the DNA profile is
observed via a so-called electropherogram (EPG), where the alleles are identified as signal peaks
above a signal-to-noise threshold. Degradation implies that these peak intensities decrease in
strength for longer repeat sequences. Consequently, long STR loci may fail to produce peak
heights above the limit of detection, resulting in allelic drop-outs.
In this paper we present a method for measuring the degree of degradation of a sample and
demonstrate how to incorporate this in estimating the probability of allelic drop-out. This is
done by extending an existing method derived for non-degraded samples. The performance of
the methodology is evaluated using data from degraded DNA, where cases with varying amounts
of DNA and levels of degradation are investigated.
Keywords:
Forensic genetics; STR DNA; Degraded DNA; Allelic dropout.
8.1 Introduction
This paper presents a statistical analysis of degraded STR DNA samples. The model derived
in the subsequent sections is based on analysis of degraded DNA from body tissue kept under
various non-optimal conditions. This implies that the DNA is affected by degradation, which
is a commonly occurring event in crime cases, where evidence is collected after it has been
exposed to e.g. sunlight, humidity and other degrading conditions (see e.g. Alaeddini et al.,
2010). Furthermore, when identifying body remains in mass disaster cases, the samples are often
found in the debris of the accident or in mass graves. Samples taken under these circumstances
are often highly degraded and it is often hard to obtain full DNA profiles from longer STR loci
(Schneider et al., 2004; Bender et al., 2004; Alonso et al., 2005; Dixon et al., 2006; Irwin et al.,
2007; Prinz et al., 2007; Colotte et al., 2009).
In samples with degraded DNA, the signal intensities for the STR fragments decrease with the
fragment length, because longer fragments are more likely to be degraded than
shorter fragments. Consequently, signals for the longest alleles are frequently missing, a
phenomenon called allelic drop-out. Allelic drop-out of the long alleles can also occur in samples
with an apparently moderate amount of DNA, since the available quantification kits (Plexor,
Applied Biosystems, and QHum, Qiagen) are based on amplicons shorter than 200 bp (Green et al.,
2005), which is about half the length of the longest amplicons in e.g. the SGM Plus kit (Applied
Biosystems).
In order to assign weight to the evidence in cases involving degraded samples, the case worker
needs to be able to account for the fact that alleles or loci have dropped out, i.e. that alleles of the
true DNA profile fail to cause peak heights large enough to pass a limit of detection. Tvedebrink
et al. (2009) presented a method for estimating the probability of allelic drop-out based on a
logistic regression. However, the analysis was based on diluted samples from healthy DNA
samples where degradation was absent. Here we show how to extend the drop-out model of
Tvedebrink et al. (2009) to handle degraded samples by adjusting the proxy for the amount of
DNA to correct for degradation.
8.2 Materials and methods
8.2.1 Data
The data used in this study were investigated by The Section of Forensic Genetics, Department
of Forensic Medicine, Faculty of Health Sciences, University of Copenhagen. DNA profiles
from 47 crime case samples were identified as degraded due to the decreasing signal intensities
in the electropherogram (EPG) for longer fragments in the SGM Plus analysis. Of these, eight
samples were discarded due to obvious inhibition, and for five samples the amount of DNA was
so limited that the observed peak heights were in the range 40 to 15 rfu. The remaining 34
samples originated from saliva on a shirt (three samples), blood stains on paper (eight samples),
blood samples from a decomposed body (five samples), a spleen from a decomposed body (one
sample) and paraffin-embedded tissue (17 samples). The amounts of DNA varied from 17 pg to
1244 pg as quantified with the Plexor (Applied Biosystems) and QHum (Qiagen) quantification kits.
We used the methodology of Chapter 7 for filtering the raw signal, where the detection limit was
set to 5 rfu in GeneScan. The Kazam macro in Genotyper was used for allele designation.
8.2.2 Model
It is well known to case workers investigating DNA from crime scenes that the DNA is often
subject to some degree of degradation. The most common effect is an observable decrease in
peak intensities for increasing base pair length of the amplicons. A probable explanation for
this is that the longer the repeat sequence, the higher the probability of a breakage in the primer
binding sequence. Let $p$ denote the probability that there is no breakage between two DNA
bases (A, T, C or G). For simplicity we assume $p$ to be constant with respect to length and the
two adjacent DNA bases. That is, the probability of breakage between A and G is the same as between T
and C, and so on. Furthermore, by a constant probability of breakage as a function of base pair,
bp, we do not assume any region of the genome to be more susceptible to breakage than others.
Hence, this simple model does not include the possibility that proteins may protect some regions
or segments of the DNA from degradation. Therefore, the longer the primer binding site, the
more possibilities exist for the occurrence of just one breakage, i.e. longer sequences increase
the probability of damaged DNA:
$$
\begin{aligned}
P(\text{No degradation}) &= P(\text{No breakage between any base pair}) \\
&= P(\text{No breakage between a given base pair})^{\mathrm{bp}} = p^{\mathrm{bp}},
\end{aligned}
$$
where, from the first to the second line, we used that the probability of breakage is constant and
that breakage between any two bases is assumed independent of the state
between all other pairs. Since $p \leq 1$, the function $p^{\mathrm{bp}}$ is a decreasing function of
bp. This implies that the longer the amplicon (larger bp-values), the smaller the probability of
no degradation. Conversely, this increases the probability of degradation, as
$P(\text{Degradation}) = 1 - P(\text{No degradation}) = 1 - p^{\mathrm{bp}}$.
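To give a feeling for how quickly the probability of an intact fragment falls with length, the formula can be evaluated for a few fragment lengths. This is a minimal sketch only: the value `p = 0.98` is an illustrative assumption, and the function name `prob_no_degradation` is hypothetical.

```python
def prob_no_degradation(p, bp):
    """Probability that a fragment of length bp has no breakage: p ** bp."""
    return p ** bp


p = 0.98  # assumed per-base probability of no breakage (illustrative)
lengths = (100, 150, 200, 250, 300, 350)
intact = [prob_no_degradation(p, bp) for bp in lengths]

for bp, prob in zip(lengths, intact):
    # P(Degradation) is the complement of the intact probability.
    print(f"bp = {bp:3d}: P(no degradation) = {prob:.5f}, "
          f"P(degradation) = {1 - prob:.5f}")
```

For any $p < 1$ the intact probability decays geometrically in bp, which is what produces the linear trend on the log scale used later in the chapter.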
For healthy DNA samples the peak heights for the various loci are almost constant. This is due
to the fact that there is no degradation acting in healthy samples, i.e. $p \approx 1$. This led Tvedebrink
et al. (2009) to argue that the amount of DNA is well modelled using the average peak height $H$:
$$
\text{Amount of DNA} \propto H = (n_{\mathrm{het}} + 2n_{\mathrm{hom}})^{-1} \sum_{i=1}^{n} h_i, \tag{8.1}
$$
where $n = n_{\mathrm{het}} + n_{\mathrm{hom}}$ is the number of observed heterozygous and homozygous alleles in the
profile. This was previously demonstrated to be a good proxy for the amount of DNA contributed
to a stain (Tvedebrink et al., 2010).
However, in degraded samples one needs to take the varying bp into account when modelling the
mean peak height. Rather than being constant, the peak heights are affected by $p$ and bp. By
modelling the mean peak height as
$$
H(\mathrm{bp}) = c\,p^{\mathrm{bp}}, \tag{8.2}
$$
where $c$ is some proportionality factor, depending e.g. on the amount of DNA in the sample,
we obtain an expression for the mean peak heights in degraded samples. Note that for healthy
samples $p \approx 1$, which implies that $c \approx H$ as defined in (8.1).
Hence, $p$ may be taken as a measure of the level of degradation of a given sample: the smaller $p$, the
more severe the degradation, whereas values close to 1 indicate only moderate degradation.
Taking logs on both sides of (8.2), we get:
$$
\log H(\mathrm{bp}) = \log(c) + \log(p)\,\mathrm{bp} = \alpha_0 + \alpha_1\,\mathrm{bp}. \tag{8.3}
$$
This implies a linear relationship between bp and $\log H(\mathrm{bp})$. However, the assumption of linearity
does not correct for the possibility of homozygosity. Hence, peak heights from homozygous
loci need to be divided by 2 in order for the model to be applicable to all loci. In Figure 8.1
we see that the model is supported by the data given in Table 8.1. Analysis of all the samples
described in Section 8.2.1 indicated that linearity was satisfied for all samples (plots similar to
Figure 8.1 are provided as supplementary on-line material). When applying this methodology to
analyse degraded samples, it is important to verify the model fit by graphical diagnostics (as in
Figure 8.1) and the $R^{2}$-statistic of the linear model in (8.3). This is because a linear
model may be fitted to any data set, but without reasonable validity the interpretation might be
dubious.
This model formulation is in itself simple and intuitive. Furthermore, equation (8.3) enables
direct implementation in the model of Tvedebrink et al. (2009) for estimating drop-out probabilities
$P(D)$, where $D$ indicates a drop-out event. Since the probability of drop-out is primarily
determined by the amount of DNA, it is natural to implement this into the model for drop-out
probabilities. In Tvedebrink et al. (2009) the authors demonstrated that a logistic regression with
$H$ as explanatory variable yields an applicable model for estimating the drop-out probability for a
given value of $H$:
$$
\operatorname{logit} P(D; H) = \log\frac{P(D; H)}{P(\bar{D}; H)} = \beta_{0,s} + \beta_1 \log \tilde{H}, \tag{8.4}
$$
[Figure 8.1: scatter plot. Horizontal axis: base pair (bp), from 100 to 350. Vertical axis: log(peak height), from 0 to 8, with the corresponding rfu values in brackets: 0 (1), 2 (7), 4 (55), 6 (403), 8 (2981).]
Figure 8.1: Peak heights on logarithmic scale plotted against base pair (bp). The black points
are assigned true peaks by the floating threshold methodology (Chapter 7) and the grey points
are assigned noise (negative signal). The numbers in brackets on the ordinate are the rfu values.
The dashed horizontal line shows the fixed 50 rfu detection threshold.
where $\tilde{H} = H$ and $\tilde{H} = 2H$ for heterozygous and homozygous loci, respectively. The subscript
$s$ in $\beta_{0,s}$ indicates that this parameter is locus specific, whereas $\beta_1$ is not (Tvedebrink et al., 2009).
Since the model in (8.3) measures the degree of degradation, we may replace the estimate of $H$ by
$H(\mathrm{bp})$, such that the drop-out model is also applicable to degraded DNA samples, with $H(\mathrm{bp})$
as explanatory variable.
8.2.3 Implementation of degradation in drop-out probability estimation
The definition of $H$ in Tvedebrink et al. (2009) assumes that $H$ is estimated based on all peak
height observations (see (8.1)). However, since the peak height decreases for increasing bp in a
degraded DNA sample, the assumptions of the drop-out model are not satisfied. To compensate
for the decrease in peak heights, and thereby the increase in drop-out probability, we incorporate the
level of degradation in the drop-out model. Let $G_S$ be the profile of a given individual, e.g. the
suspect in a crime case. First, the peaks related to the alleles originating from $G_S$ are determined,
and the $\alpha$-parameters of the degradation model, (8.3), are estimated based on a linear regression
of log peak height on bp. The $G_S$-specific regression parameters $(\alpha_0, \alpha_1)$ are inserted in (8.3)
together with the bp-value for the allele under investigation for drop-out. For the $i$th allele of $G_S$
the adjusted $H$-estimate is $H(\mathrm{bp}_i) = \exp(\alpha_0 + \alpha_1 \mathrm{bp}_i)$, which we then insert in (8.4):
$$
\operatorname{logit} P[D_i; \tilde{H}(\mathrm{bp}_i)] = \beta_{0,s} + \beta_1 \log \tilde{H}(\mathrm{bp}_i), \tag{8.5}
$$
where $\tilde{H}(\mathrm{bp}_i) = H(\mathrm{bp}_i)$ or $\tilde{H}(\mathrm{bp}_i) = 2H(\mathrm{bp}_i)$ depending on whether $D_i$ represents a drop-out
on a heterozygous or a homozygous locus, respectively. The information about $\mathrm{bp}_i$ and whether the
locus is homozygous or heterozygous is given by the specified profile. Hence, as for the drop-out model of
Tvedebrink et al. (2009), the drop-out probabilities are determined for a specific profile, since the
drop-out probability depends on the observed peak heights associated with that particular profile.

Table 8.1: Data used in Figures 8.1 and 8.2. The drop-out probability of allele 24 in locus
D2 (shaded row) is assessed in the example of Section 8.3. The column "Corrected height" is
computed using the method discussed in Section 8.6.

Dye     Locus  bp      Allele  Height [rfu]  log(Height)  Corrected height [rfu]
Blue    D3     120.22  14      3349.00       8.12         4790.12
Blue    D3     140.66  19      2295.52       7.74         4714.49
Blue    vWA    178.58  17      1272.69       7.15         5114.07
Blue    vWA    186.85  19      1470.00       7.29         6838.08
Blue    D16    246.78  9        627.00       6.44         8424.95
Blue    D16    262.74  13       377.00       5.93         6719.33
Blue    D2     307.57  19        77.40       4.35         3050.41
Blue    D2     327.87  24        42.00       3.74         2370.77
Green   AME    103.27  1       7188.76       8.88         7617.12
Green   D8     140.74  12      3303.01       8.10         6793.26
Green   D8     144.74  13      3026.21       8.02         6680.60
Green   D21    205.27  29       581.91       6.37         3750.30
Green   D21    209.17  30       737.10       6.60         5090.00
Green   D18    307.99  18       175.50       5.17         6967.80
Green   D18    312.12  19       173.54       5.16         7412.75
Yellow  D19    122.67  14      2853.63       7.96         4262.48
Yellow  D19    126.67  15      2546.79       7.84         4083.25
Yellow  TH0    185.90  9.3     1217.00       7.10         5566.79
Yellow  FGA    224.93  20       460.00       6.13         4198.52
Yellow  FGA    241.76  24       355.00       5.87         4364.55
8.3 Results
In Figure 8.2, the observed peak heights of Table 8.1 are plotted against their base pair lengths
(the peak heights of homozygous loci are divided by 2 in Figure 8.2). The superimposed curves
represent the adjusted mean peak height, $H(\mathrm{bp})$, the observed mean peak height, $H$, and the fixed 50 rfu detection
threshold, respectively. Note that the profile of Table 8.1 is homozygous for Amelogenin (female)
and TH0. This implies that the observed peak heights for these two loci are divided by 2
before the linear model (8.3) is fitted to the data.
[Figure 8.2: plot. Horizontal axis: Base pair (bp), from 100 to 300. Vertical axis: Peak height, with ticks at 50 and from 500 to 3500. Legend: Adjusted mean peak height, H(bp); Mean peak height, H; Fixed 50 rfu detection threshold.]
Figure 8.2: Peak heights plotted against base pair length. The solid curve shows the adjusted
mean peak height, the dotted line the observed mean peak height and the dashed line the fixed
50 rfu detection threshold.
In Table 8.1, the row with grey shading shows the peak below the fixed 50 rfu detection threshold,
which is represented by the dashed line in Figures 8.1 and 8.2. However, using the methodology
of Chapter 7, it is possible to have locus specific thresholds enabling detection of all the alleles
in Table 8.1. In the following we assume that the identified alleles in Table 8.1 represent a true
DNA profile, e.g. identified from a blood stain left by the suspect and found at the scene of crime.
The drop-out model and fitted parameters in Tvedebrink et al. (2009) are calibrated for the event
$D = \{\text{peak height} < 50 \text{ rfu}\}$ for non-degraded DNA. Hence, the $H$-estimate must only be based
on the peak heights above 50 rfu for the drop-out model to be applicable. Under the assumption
that the alleles in Table 8.1 constitute the profile, i.e. the suspect profile is heterozygous for all
loci but Amelogenin and TH0, we find that $H = 1460.41$ rfu. Most of the observed peak heights
deviate substantially from $H$ due to degradation (see dotted line in Figure 8.2).
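The mean peak height in (8.1) is simple to recompute from the heights in Table 8.1. The sketch below is an illustration rather than the original computation; the function name `mean_peak_height` and the variable names are hypothetical. It suggests that the quoted value of 1460.41 rfu is reproduced when all 20 peaks of the table are included, while restricting the sum to peaks of at least 50 rfu gives a somewhat larger value.

```python
# (height in rfu, homozygous locus?) for the 20 peaks of Table 8.1.
peaks = [
    (3349.00, False), (2295.52, False), (1272.69, False), (1470.00, False),
    (627.00, False), (377.00, False), (77.40, False), (42.00, False),
    (7188.76, True),   # Amelogenin, homozygous
    (3303.01, False), (3026.21, False), (581.91, False), (737.10, False),
    (175.50, False), (173.54, False), (2853.63, False), (2546.79, False),
    (1217.00, True),   # TH0, homozygous
    (460.00, False), (355.00, False),
]


def mean_peak_height(peaks):
    """Average peak height H of (8.1): sum of heights / (n_het + 2 * n_hom)."""
    n_het = sum(1 for _, hom in peaks if not hom)
    n_hom = sum(1 for _, hom in peaks if hom)
    return sum(height for height, _ in peaks) / (n_het + 2 * n_hom)


# Using every peak in the table.
H_all = mean_peak_height(peaks)

# Using only peaks at or above the fixed 50 rfu detection threshold.
H_above_50 = mean_peak_height([pk for pk in peaks if pk[0] >= 50.0])

print(round(H_all, 2), round(H_above_50, 2))
```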
For evidential computations we need the probability that allele 24 in locus D2 has dropped out.
The methodology of Tvedebrink et al. (2009) makes this computation straightforward using
the estimated $H$-value. By plugging the estimated $H$ into (8.4) and taking the inverse of the
logit function, $\operatorname{logit}^{-1}(x) = \exp(x)/[1+\exp(x)]$, we obtain the drop-out probability
$P(D^{D2}_{24}; \tilde{H} = 1460.41) = 1.54 \times 10^{-6}$, where we used $\beta_{0,D2} = 18.31$ and $\beta_1 = -4.35$ from Table 2 of Tvedebrink
et al. (2009). This is an extremely low drop-out probability considering that allele
19 in the same locus has a peak height of 77 rfu.
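The computation above is an inverse logit applied to a linear function of $\log \tilde{H}$. As a minimal sketch (the function name `dropout_probability` is hypothetical), the quoted parameter values can be plugged in as follows. The slope is entered as $-4.35$: a negative slope is what makes the drop-out probability decrease with increasing peak height, and it is the sign that reproduces the probability quoted above.

```python
from math import exp, log


def dropout_probability(h_tilde, beta0_s, beta1):
    """Inverse logit of beta0_s + beta1 * log(h_tilde), following (8.4)."""
    x = beta0_s + beta1 * log(h_tilde)
    return exp(x) / (1.0 + exp(x))


BETA0_D2 = 18.31  # locus specific intercept for D2 quoted in the text
BETA1 = -4.35     # common slope; negative sign assumed as explained above

# Heterozygous locus, so h_tilde equals the unadjusted mean peak height H.
p_unadjusted = dropout_probability(1460.41, BETA0_D2, BETA1)
print(f"{p_unadjusted:.3g}")
```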
From graphical inspection of the (simplified) EPG in Figure 8.2 it is obvious that the DNA
sample is subject to degradation. In order to take the degradation of the DNA into account we
adjust the estimated $H$. The solid line in Figure 8.1 has $(\alpha_0, \alpha_1) = (10.262, -0.0177)$ with $R^{2} = 0.931$,
which together with Figures 8.1 and 8.2 and other graphical diagnostics indicates good
agreement with the model. Since the fragment length, bp, of allele 24 in locus D2 is $\mathrm{bp}^{D2}_{24} = 327.87$
(see Table 8.1), the adjusted $H$-value is $H(\mathrm{bp}) = \exp(10.262 - 0.0177 \times 327.87) = 85.25$ rfu
(evaluated with the unrounded parameter estimates). This estimated peak height is reasonably close to the observed peak height (77 rfu)
for the other allele in the same locus (Table 8.1). The estimated peak height is plugged into
(8.5), which gives $P[D^{D2}_{24}; \tilde{H}(\mathrm{bp}) = 85.25] = 0.26$. This drop-out probability is more
reasonable than $P(D; \tilde{H})$, which does not take degradation into account.
Note that from (8.3) we may compute $p$ from the estimate of $\alpha_1$: $p = \exp(\alpha_1) = \exp(-0.0177) = 0.982$.
From experience (see the supplementary material) this sample is moderately degraded.
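The whole worked example can be retraced from Table 8.1 with an ordinary least squares fit. The sketch below is an illustration under stated assumptions, not the authors' implementation: it fits $\log(\text{height})$ on bp using all 20 peaks, halves the heights of the two homozygous loci before taking logs, and uses the drop-out parameters 18.31 and $-4.35$ (negative slope assumed). The names `fit_degradation`, `a0`, `a1` and the other variables are hypothetical.

```python
from math import exp, log

# (bp, height in rfu, homozygous locus?) for the peaks of Table 8.1.
table = [
    (120.22, 3349.00, False), (140.66, 2295.52, False),
    (178.58, 1272.69, False), (186.85, 1470.00, False),
    (246.78, 627.00, False), (262.74, 377.00, False),
    (307.57, 77.40, False), (327.87, 42.00, False),
    (103.27, 7188.76, True),   # Amelogenin
    (140.74, 3303.01, False), (144.74, 3026.21, False),
    (205.27, 581.91, False), (209.17, 737.10, False),
    (307.99, 175.50, False), (312.12, 173.54, False),
    (122.67, 2853.63, False), (126.67, 2546.79, False),
    (185.90, 1217.00, True),   # TH0
    (224.93, 460.00, False), (241.76, 355.00, False),
]


def fit_degradation(rows):
    """Least squares fit of log(height) on bp; homozygous heights are halved."""
    xs = [bp for bp, _, _ in rows]
    ys = [log(height / 2 if hom else height) for _, height, hom in rows]
    n = len(rows)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


a0, a1, r2 = fit_degradation(table)

# Adjusted mean peak height at the fragment length of allele 24 in D2.
h_adjusted = exp(a0 + a1 * 327.87)

# Per-base probability of no breakage implied by the slope.
p_hat = exp(a1)

# Drop-out probability from (8.5) for a heterozygous locus.
x = 18.31 - 4.35 * log(h_adjusted)
p_dropout = exp(x) / (1.0 + exp(x))

print(round(a0, 3), round(a1, 4), round(r2, 3))
print(round(h_adjusted, 2), round(p_hat, 3), round(p_dropout, 2))
```

Evaluating the exponential with the fitted (unrounded) coefficients matters here: because the slope is multiplied by a fragment length above 300, rounding it to four decimals shifts the adjusted height by roughly one rfu.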
8.4 Discussion
Since most DNA samples are analysed in replicates (or at least in duplicates), an additional
source of information is the consistency of the estimated degradation parameter across replicates.
For replicates the amount of DNA may vary; however, this affects (in principle) only $\alpha_0$, whereas
$\alpha_1$ should remain constant. For most of the samples analysed in this paper there was no significant
difference between the levels of degradation $p_{R_i}$ and $p_{R_j}$ for different replicates $R_i$ and $R_j$, $i \neq j$.
Similarly, for samples originating from the same body tissue or fluid, the degradation pattern
should be reasonably similar across samples taken from the same source at the crime scene. This
was supported by the data; however, some cases had significant differences between tissue/fluid
samples.
The likelihood ratio is defined as $LR = P(E|H_p)/P(E|H_d)$, where $H_p$ and $H_d$ are two competing
hypotheses that could represent the statements of the prosecutor and the defence. Let $G_S$ be the
DNA profile of the suspect, which in the example of Section 8.3 equals the profile in Table 8.1.
In order to evaluate $P(E|H_p)$, where $H_p$ claims that $G_S$ is the donor of the observed stain, an allelic
drop-out needs to have occurred in order to explain the missing allele 24 in locus D2. Hence, the
probability $P(D^{D2}_{24})$ enters in the numerator of the LR. Thus, the smaller this probability, the smaller
the LR. Hence, the prosecutor will claim that degradation is present, since the probability of
allelic drop-out is approximately $10^{5}$ times larger when assuming degradation, compared to the
non-degraded probability of allelic drop-out.
$P(E|H_d)$ is evaluated by summation over the set of possible unknown profiles with or without
allelic drop-out. Whether or not it is favourable for the defence to consider unknown profiles
with drop-outs depends on the allele probabilities for the homozygous loci. That is, if
$$
P(\bar{D}; 2H)\,P(A_iA_i) < P(D; H)\,P(\bar{D}; H) \sum_{j \neq i}^{k} P(A_iA_j)
$$
then $P(E|H_d)$ is increased by allowing for drop-out, which results in a decreased LR. This
consideration applies whether or not the sample is degraded. However, the drop-out probabilities
will only increase when considering degradation, since $H(\mathrm{bp}) = c\,p^{\mathrm{bp}} \leq H$ and $P(D; H)$ increases
as $H$ decreases. On the other hand, the probability of alleles not dropping out is possibly larger
when correcting for possible degradation, $P(\bar{D}; H(\mathrm{bp})) > P(\bar{D}; H)$, since $H(\mathrm{bp})$ may be larger
than $H$ for short amplicons.
8.5 Conclusion
We presented a method for modelling the decay in the peak intensities of forensic STR loci as a function
of increasing base pair length, bp. The model showed satisfactory agreement with the data and is simple and
intuitive. Furthermore, we demonstrated how to implement the information on degradation in the
computation of the probability of allelic drop-out for degraded samples.
Bibliography
Alaeddini, R., S. J. Walsh, and A. Abbas (2010). Forensic implications of genetic analyses from degraded DNA - A review. Forensic Science International: Genetics 4(3), 148–157.
Alonso, A. et al. (2005). Challenges of DNA profiling in mass disaster investigations. Croatian Medical Journal 46(4), 540–548.
Bender, K., M. J. Farfan, and P. M. Schneider (2004). Preparation of degraded human DNA under controlled conditions. Forensic Science International 139(2-3), 135–140.
Bill, M. et al. (2005). PENDULUM - a guideline-based approach to the interpretation of STR mixtures. Forensic Science International 148, 181–189.
Colotte, M., V. Couallier, S. Tuffet, and J. Bonnet (2009). Simultaneous assessment of average fragment size and amount in minute samples of degraded DNA. Analytical Biochemistry 388(2), 345–347.
Dixon, L. A. et al. (2006). Analysis of artificially degraded DNA using STRs and SNPs - results of a collaborative European (EDNAP) exercise. Forensic Science International 164(1), 33–44.
Green, R., I. Roinestad, C. Boland, and L. Hennessy (2005). Developmental Validation of the Quantifiler Real-Time PCR kits for the Quantification of Human Nuclear DNA samples. Journal of Forensic Science 50(4), 809–825.
Irwin, J. A. et al. (2007). Application of low copy number STR typing to the identification of aged, degraded skeletal remains. Journal of Forensic Sciences 52(6), 1322–1327.
Prinz, M. et al. (2007). DNA Commission of the International Society for Forensic Genetics (ISFG): Recommendations regarding the role of forensic genetics for disaster victim identification (DVI). Forensic Science International: Genetics 1(1), 3–12.
Schneider, P. M. et al. (2004). STR analysis of artificially degraded DNA - results of a collaborative European exercise. Forensic Science International 139(2-3), 123–134.
Tvedebrink, T., P. S. Eriksen, H. S. Mogensen, and N. Morling (2009). Estimating the probability of allelic drop-out of STR alleles in forensic genetics. Forensic Science International: Genetics 3(4), 222–226.
Tvedebrink, T., P. S. Eriksen, H. S. Mogensen, and N. Morling (2010). Evaluating the weight of evidence using quantitative STR data in DNA mixtures. Journal of the Royal Statistical Society. Series C, Applied statistics. In Press.
8.6 Supplementary remarks
Degradation affects the mean of the peak heights and areas. Since degradation is very common
in forensic case work, the models developed should be able to handle degradation. As with the
extension of the mixture separation method to allow for allelic drop-out, the method can be
extended to correct for degradation.
Assume that the biological material contributed by the donors is of similar type, e.g. blood,
tissue, body fluids, etc., and that the material has been exposed to similar conditions over an
approximately identical time span. Based on these assumptions it is reasonable to assume that
the level of degradation is common for the DNA and that the peak intensities may be modelled by
$c_k p^{bp}$, where $c_k$ reflects the amount of DNA contributed by the $k$th individual and
$p$ is common for all $k = 1, \dots, m$.
In cases of degradation, the effect on a four peak locus might be such that the highest and
lowest peak heights relate to the major component and the two alleles with intermediate peak
heights belong to the minor contributor of a two-person DNA mixture. This could happen if the
highest and lowest peaks are at each end of the ladder interval and the intermediate peaks lie
in between. Figure 8.3 shows examples of this situation for the base pair interval from 125 bp
to 280 bp for the SGM Plus kit.
The plot in Figure 8.3 exemplifies a two-person DNA mixture with $p = 0.98$ and amounts of
DNA corresponding approximately to a 1:2 mixture. That is, the expected peak heights of
heterozygous loci are given by $h^{(k)}_{s,i} = c_k p^{bp^{(k)}_{s,i}}$, which implies that
$$\sum_{i=1}^{4} \log h_{s,i} = 2\beta_0^{(+)} + \beta_1 bp_{s,+},$$
where $\beta_0^{(+)} = \beta_0^{(1)} + \beta_0^{(2)}$, $\beta_0^{(i)} = \log c_i$ and
$\beta_1 = \log p$. Hence, a regression of $\sum_{i=1}^{4} \log h_{s,i}$ on $bp_{s,+}$ would
give estimates of $(\beta_0^{(+)}, \beta_1)$. However, additivity of peak heights on the natural
scale does not transfer to additivity on the log scale. Homozygous allele peak heights are
$\log h^{(k)}_{s,1} = \log 2 + \beta_0^{(k)} + \beta_1 bp^{(k)}_{s,1}$ and shared alleles have
$\log h_{s,i} = \log(c_1 + c_2) + \beta_1 bp^{(\cdot)}_{s,i}$, where in particular the shared
alleles imply that the regression of $\sum_{i=1}^{n_s} \log h_{s,i}$ on $bp_{s,+}$ would yield
a locus dependent intercept.
Thus, different means for estimating $p$ for DNA mixtures need to be considered. By simulating
a large number (e.g. 1,000) of DNA mixtures with known profiles, amounts of DNA and level of
degradation, $p_0$, it was possible to compare numerically the performance of different
estimators of $p$. Of the investigated methods, a regression of the logarithm of the mean peak
height, $\log \bar{h}$, on the mean of base pairs, $\overline{bp}_s$, yielded a good
approximation based on simulations, with a 95% confidence interval of
$(-8.58 \times 10^{-5}, 4.62 \times 10^{-5})$ for the difference between $p_0$ and $\hat{p}$.
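The regression described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `estimate_degradation` and the nested-list input layout (one list of peak heights and one list of base pairs per locus) are hypothetical and not taken from the thesis, and the fit is an ordinary least squares line of the log mean peak height on the mean base pair per locus.

```python
from math import exp, log


def estimate_degradation(heights_by_locus, bp_by_locus):
    """Least squares fit of log(mean peak height) on mean base pair per locus.

    Returns (beta0, beta1, p_hat) where p_hat = exp(beta1) is the estimated
    degradation parameter. Input layout is an assumption of this sketch.
    """
    xs = [sum(bp) / len(bp) for bp in bp_by_locus]          # mean bp per locus
    ys = [log(sum(h) / len(h)) for h in heights_by_locus]   # log mean height per locus
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta1 = sxy / sxx                 # slope, interpreted as log p
    beta0 = y_bar - beta1 * x_bar     # intercept
    return beta0, beta1, exp(beta1)


# Noise-free toy data: one contributor, heterozygous at every locus, p = 0.98
p0, c = 0.98, 1000.0
bp_by_locus = [[m - 4, m + 4] for m in (120, 160, 200, 240, 280)]
heights_by_locus = [[c * p0 ** b for b in bps] for bps in bp_by_locus]
beta0, beta1, p_hat = estimate_degradation(heights_by_locus, bp_by_locus)
```

With equally spaced alleles within each locus the log mean peak height is exactly linear in the mean base pair, so the toy data recover `p0` up to rounding; for real mixtures the fit is only the approximation discussed in the text.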
The relevant observation window for the loci included in the SGM Plus kit (AB) starts around
100 bp. Using this offset the observed peak intensities may be adjusted for degradation by
compensating by the fitted decay. Given $\hat\beta_1$ and the peak height $h$ it is possible to
compute the degradation corrected peak height $\tilde{h}$. By multiplying the observed peak
heights by $\exp[-\hat\beta_1 (bp - 100)]$ the effect of degradation is inverted, resulting in
less imbalance between loci,
$$\tilde{h}_{s,i} = h_{s,i} \exp[-\hat\beta_1 (bp_{s,i} - 100)].$$
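As a small illustration of the correction, the sketch below rescales a peak height by the fitted decay relative to the 100 bp offset. It assumes that the slope `beta1` equals log p (so it is negative under degradation); the function name and the default `offset` argument are hypothetical choices for this example.

```python
from math import exp


def correct_peak_height(height, bp, beta1, offset=100.0):
    """Invert the fitted decay: multiply by exp(-beta1 * (bp - offset)).

    beta1 is assumed to be the regression slope (log p), negative when the
    sample is degraded; offset is the start of the observation window.
    """
    return height * exp(-beta1 * (bp - offset))


# Toy check: a peak generated with decay beta1 from a level of 1000 at 100 bp
beta1 = -0.0177
observed = 1000.0 * exp(beta1 * (250 - 100))
corrected = correct_peak_height(observed, 250, beta1)
```

The same function applies unchanged to peak areas, since the correction is a multiplicative factor depending only on the base pair.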
[Figure 8.3: peak height (0-2500) plotted against base pair (150-250) for the blue, green and
yellow fluorescent dyes, with the major and minor components marked.]
Figure 8.3: Degradation of a two-person DNA mixture. The highest and lowest peaks belong to
the major component. The shaded areas below the first axis show the range of the allelic ladder
for the various STR loci in the 125-280 bp window of the SGM-Plus kit (Applied Biosystems,
AB).
In the example of Section 8.3, $\hat\beta_1 = -0.0177$ and by using the approach above we get
the peak heights reported in the "Corrected height"-column of Table 8.1. There is still evidence
of peak height imbalances within loci. However, the heterozygote balance, Hb, which is the ratio
of the heterozygous peak heights (see e.g. Bill et al., 2005), is improved by the correction:
the range of Hb is (0.54, 0.99) for the observed peak heights and (0.74, 0.98) after the peak
height correction.
In order for the models for DNA mixtures of Chapters 4 and 5 to be valid, the proportionalities
of peak heights and peak areas need to be preserved. However, the correction of peak heights
is also applicable to peak areas, hence
$\tilde{a}_{s,i} = a_{s,i} \exp[-\hat\beta_1 (bp_{s,i} - 100)]$, which ensures the same
proportionality as before the correction. Therefore, no changes are needed in the sets $J_i$
for the mixture separator when the peak intensities are adjusted for degradation.
In the next chapter the model for degraded DNA is combined with the models from the previous
chapters in a unifying likelihood ratio. That is, a likelihood ratio where all the discussed
complications can be included and accounted for when assessing the weight of the DNA evidence
in crime cases.
CHAPTER 9
Epilogue
9.1 Conclusion
In the preceding seven chapters (Chapters 2-8) the core content of this PhD thesis has been
presented. The main focus of the PhD project has been to develop statistical models applicable
to the quantitative part of the STR analysis and in particular DNA mixtures. However, since the
genetic part (qualitative allelic data) of the evidence constitutes the fundamental input in
evidential weight calculations, it was difficult not to treat this topic. This led to the
interest in IBD and the effect of population structures when computing the evidential weight.
As pointed out by one of the reviewers of the paper in Chapter 2 ("Overdispersion in allelic
counts and $\theta$-correction in forensic genetics", to appear in Theoretical Population
Biology), the forensic databases do not constitute the databases of interest. More general
population surveys should be used when making inference about $\theta$, e.g. random samples
taken from well-defined subpopulations at a high resolution. For the Danish population this
could be samples taken from small villages or islands since these subpopulations may cause
large allelic divergence and thus yield $\theta$-estimates in the higher end of the plausible
range (Balding, 2005). From Figure 2.1 we saw that this in practice would lead to conservative
evaluation of the evidence. Furthermore, this is equivalent to the fact that the probability of
a random match (a profile match of two unrelated individuals) increases with $\theta$.
For the quantitative part, the work was initiated by assuming no complications of stutters,
pull-up effects, allelic drop-out or degradation. Under these settings it was possible to derive
two models for DNA mixtures, where the simpler of the two was wrapped into a greedy algorithm
which efficiently separated DNA mixtures. For the particular data used in the paper the
algorithm was at least as successful as three experienced case workers. However, the analysis
also emphasised that the results should be interpreted with caution. This was especially
important for samples close to a 1:1 mixture proportion and when the interest was in the minor
profile. The analysis of three-person mixtures repeated this picture, where the success rate for
the 1:2:4 mixture proportion was rather low for the mid and minor profiles.
Having done this, a natural extension of the models was to handle allelic drop-out and degraded
DNA samples as these phenomena occur frequently in real crime case work. The remarks of
Sections 6.5.1 and 8.6 demonstrated how the presented models may be combined in order to handle
these complications. In the remarks it was only exemplified how to modify the statistical model
and mixture separating algorithm for two-person DNA mixtures. However, the cases of more
contributors follow along the same lines. The work with allelic drop-out also made it evident
that there were possibilities for refinement of the determination of the signal-to-noise ratio.
The use of a fixed threshold may in some cases discard important information regarding the
distribution of the noise component from the measurement technique. The model proposed to
determine this threshold was based on a simple analysis of quartiles in order to estimate the
parameters of the log-normal distribution. However, for some situations this approach seemed to
be too simple as a sudden increase of the noise level was detected for a short bp-interval.
This temporary increase in the background noise caused non-linearity in the QQ-plots and in
some cases increased the variance estimate substantially. Loess-curves were investigated to
handle this non-linearity. However, they did not improve the overall performance significantly.
Further work may suggest ways to adjust for this fact, but one has to focus the attention on
newer typing kits, as these should have better signal-to-noise ratios than the SGM-Plus kit
(Applied Biosystems).
9.2 Weight of evidence calculations
In the preceding chapters it has been demonstrated how to incorporate the quantitative part of
the STR typing results in the likelihood ratio approach. The principle was to assign a weight to
each quantitative term of the LR, where the weight should reflect the compliance between the
expected and observed peak intensities. Terms with minor disagreement (e.g. due to measurement
errors) should receive a large weight whereas profile combinations leading to substantial
differences would be weighted by a quantity close to zero.
This extendability of the LR is one of the many arguments for using this approach rather than
the "Random man not excluded"-approach (often abbreviated RMNE-approach in the literature). I
will not discuss the philosophical differences or many advantages of LR over RMNE, since these
are irrelevant at this point. However, it should be noted that the models discussed above are
of no use when assessing the weight of evidence through RMNE. In line with many others (e.g.
Evett and Weir, 1998; Balding, 2005; Buckleton et al., 2005; Gill et al., 2006; Buckleton and
Curran, 2008) I strongly recommend the LR-approach in evidential calculations carried out in
forensic genetics.
The LR is formed by evaluating the evidence (crime scene evidence and identified profiles)
under competing hypotheses, often denoted $H_p$ and $H_d$ for the prosecutor and defence
hypotheses. Since $H_p$ and $H_d$ are only mutually exclusive, and not exhaustive, one needs to
recall that there are several LRs, one for every $(H_p', H_d')$-pair of hypotheses. Hence, the
fact that the LR favours $H_p$ over $H_d$ does not imply that there cannot exist an $H_d'$ for
which $LR' = P(E|H_p)/P(E|H_d') < 1$ (Balding, 2005).
The extensions of the LR derived in Chapters 4 and 5 only considered cases assuming no allelic
drop-outs. However, as previously argued, this assumption often fails together with the "no
degradation"-assumption. Hence, for proper inclusion of the available data and applicability
to most types of crime cases, the LR needs to be extended further. Let
$Q = (Q_{mis}, Q_{obs})$ and $\mathcal{G} = (\mathcal{G}_{mis}, \mathcal{G}_{obs})$, where the
subscripts refer to dropped-out and observed alleles. That is, $Q_{obs}$ are the observed peak
intensities, whereas $Q_{mis}$ denotes the event of an allelic drop-out, i.e. the peak failed
to be detected. Similarly, $\mathcal{G}_{obs}$ and $\mathcal{G}_{mis}$ are the associated types
of alleles, where the need for $Q_{mis}$ and $\mathcal{G}_{mis}$ is induced by the hypothesis
under consideration.
Given a specific hypothesis the set of plausible profile combinations $C$ is induced. That is,
if the prosecutor's hypothesis $H_p$ claims that "The observed crime scene stain originates
from a victim, $V$, and suspect $S$" then $C_p = \{(G_V, G_S)\}$, where $G_V$ and $G_S$ are the
profiles of $V$ and $S$, respectively. In connection to this hypothesis the defence states that
"The observed crime scene stain originates from the victim and an unknown profile", and then
$C_d = \{G_U : (G_V, G_U) \equiv H_d\}$, with $G_U$ being the profile of the unknown
contributor $U$. This definition of $C_d$ does not limit $G_U$ to be consistent with
$(Q_{obs}, \mathcal{G}_{obs})$, hence drop-out of $U$'s alleles is allowed with this
formulation. Note that this definition of $C$ is different from that of Sections 4.3 and 5.5
where the plausible profiles in $C$ needed to be consistent with the observed alleles, i.e. no
allelic drop-outs were allowed.
Additionally, drop-ins, stutters and pull-up peaks possibly cause more alleles to be observed
than those of the true contributors. However, as claimed in Section 5.9, stutters (and pull-up
peaks) are profile independent. Hence, given the peak intensity information in allele position
$n$ it is (in principle) possible to predict and adjust for the stutter contribution to the
peak in position $n - 1$. Similarly, the pull-up contribution can be removed from peaks with
overlapping bp-values. However, not all such peaks were successfully removed as 6 stutters and
4 pull-ups were observed above the signal-to-noise threshold (Table 7.4) for the two-person
mixtures, and for the three-person mixtures 27 stutters and 2 pull-ups were detected. Hence,
peaks other than those belonging to the true donors must be incorporated in the unifying model
to be consistent with the observed data.
9.3 Unifying likelihood ratio
The evaluation of the LR consists of computing the probability of the evidence under the two
hypotheses and forming their ratio. Since the $C$-sets are discrete the probability $P(E|H)$
may be evaluated using the law of total probability $P(E|H) = \sum_{G \in C} P(E|G)P(G)$, where
$G$ is short for the profiles involved, e.g. $G = (G_V, G_S)$ under the prosecutor's hypothesis
in the example above.
In order to discuss the evidential weight there needs to be at least one identified DNA
profile, namely the suspect's profile $G_S$. For general purposes let $K$ be the known DNA
profiles associated with the case, e.g. $K = (G_V, G_S)$ above. Then the evidence $E$ consists
of $E_c$ and $K$, where $E_c$ is the crime scene evidence including both the quantitative and
qualitative parts, $E_c = (Q, \mathcal{G})$. First we note that the crime scene evidence, $E$,
and the known profiles, $K$, are assumed conditionally independent given $(\mathcal{G}, G)$.
That is, given $G \in C$ and $\mathcal{G}$ the known profiles $K$ have no influence on the
crime scene stain. Hence, using the definition of conditional probability we can for $G \in C$
factorise $P(E|G)$ as:
$$P(E|G) = P(E_c, K|G) = P(Q, \mathcal{G}, K|G) = P(Q|\mathcal{G}, G)P(\mathcal{G}|K, G)P(K|G). \qquad (9.1)$$
In (9.1) the $P(Q|\mathcal{G}, G)$-term measures the agreement of the observed and expected
peak intensities under some model. If the detected alleles in $\mathcal{G}$ equal those of $G$,
neither drop-out nor drop-in (including stutters and pull-ups) has caused missing or additional
alleles to be present in the signal. Hence, $P(Q|\mathcal{G}, G) = P(Q|G)$ may be evaluated by
one of the models presented in Chapters 4 or 5, and since
$\mathcal{G} = (\mathcal{G} \cap G) = G$, i.e. the profiles are consistent,
$P(\mathcal{G}|G) = 1$. However, in cases with stutters, pull-ups or drop-ins present $Q$ is
split into two parts ascribed respectively to $G = \mathcal{G} \cap G$ and
$\bar{G} = \mathcal{G} \setminus (\mathcal{G} \cap G)$. The evaluation is done by
$P(Q|\mathcal{G}, G) = P(Q_{\bar{G}}|Q_G, \mathcal{G}, G)P(Q_G|G)$, where in cases of possible
stuttering $P(Q_{\bar{G}}|Q_G, \mathcal{G}, G)$ assigns probability to this event. In this
thesis such models have not been discussed; however, a logistic regression (similar to that of
the drop-out model) may be derived, where the explanatory variable for stutters and pull-ups
would be the parental peak's intensity. For drop-ins (additional peaks not possible to
categorise as stutters or pull-ups), the noise level of the sample might be an appropriate
covariate.
Furthermore, if $G$ implies allelic drop-out, $Q$ can be decomposed into $(Q_{mis}, Q_{obs})$
and the quantitative term then factorises further:
$P(Q|\mathcal{G}, G) = P(Q_{mis}|Q_{obs}, \mathcal{G}, G)P(Q_{obs}|\mathcal{G}, G)$. The
probability of an allelic drop-out, $P(Q_{mis}|Q_{obs}, \mathcal{G}, G)$, is computed given the
observations and information about the sample's genotypes. An allelic drop-out is equivalent to
the event that the peak height is less than the limit of detection. Hence, $P(Q_{mis}|\cdot)$
could be evaluated by $\int_0^T P(h|\cdot)\,dh$, where $T$ and $h$ are the limit of detection
and peak height, respectively. However, the drop-out model of Chapter 6 is an approximation to
this integral and since it is easier to compute we use $P(D; H)$ to quantify
$P(Q_{mis}|Q_{obs}, \mathcal{G}, G)$.
Thus, combining (9.1) with the extension for drop-outs and additional alleles compared to $G$,
the unifying likelihood ratio can be defined as:
$$LR = \frac{P(E|H_p)}{P(E|H_d)} = \frac{\sum_{G \in C_p} P(Q_{mis}|Q_{obs}, G)\,P(Q_{obs,\bar{G}}|Q_{obs,G}, \mathcal{G}, G)\,P(Q_{obs,G}|G)\,P(\mathcal{G}|K, G)\,P(K|G)\,P(G)}{\sum_{G' \in C_d} P(Q_{mis}|Q_{obs}, G')\,P(Q_{obs,\bar{G}}|Q_{obs,G}, \mathcal{G}, G')\,P(Q_{obs,G}|G')\,P(\mathcal{G}|K, G')\,P(K|G')\,P(G')} \qquad (9.2)$$
This LR is constructed such that it (in principle) is applicable in all possible scenarios
arising from crime cases.
For the example above with $H_p : (G_V, G_S)$ and $H_d : (G_V, G_U)$ the known profiles are
thus $K = (G_V, G_S)$. Assume that $G_S$ has alleles not present in $\mathcal{G}$, implying
that allelic drop-out must have occurred if the suspect is a true contributor to the stain.
Furthermore, all alleles in $\mathcal{G}$ are accounted for by $(G_V, G_S)$. Then the LR is
given by:
$$LR = \frac{P(Q|\mathcal{G}, G_V, G_S)\,P(\mathcal{G}|G_V, G_S)\,P(G_V, G_S)}{\sum_{G_U \in C_d} P(Q|\mathcal{G}, G_V, G_U)\,P(\mathcal{G}|G_V, G_U)\,P(G_S|G_V, G_U)\,P(G_U, G_V)}$$
$$= \frac{P(Q_{mis}|Q_{obs}, \mathcal{G}, G_V, G_S)\,P(Q_{obs}|\mathcal{G}, G_V, G_S)\,P(\mathcal{G}_{mis}, \mathcal{G}_{obs}|G_V, G_S)}{\sum_{G_U \in C_d} P(Q_{mis}|Q_{obs}, \mathcal{G}, G_V, G_U)\,P(Q_{obs}|\mathcal{G}, G_V, G_U)\,P(\mathcal{G}_{mis}, \mathcal{G}_{obs}|G_V, G_U)\,P(G_U|G_V, G_S)},$$
where $P(\mathcal{G}_{mis}, \mathcal{G}_{obs}|G_V, G_S) = 1$ since
$(G_V, G_S) \equiv (\mathcal{G}_{mis}, \mathcal{G}_{obs})$. Assume further that
$C_d = \{G_U : (G_V, G_U) \equiv \mathcal{G}_{obs}\}$, i.e. the set of possible unknown profiles
is restricted to be consistent with the observed alleles when combined with $G_V$. Thus,
$P(\mathcal{G}_{obs}, \mathcal{G}_{mis}|G_V, G_U) = 1$ and the LR reduces further:
$$LR = \frac{P(Q_{mis}|Q_{obs}, \mathcal{G}, G_V, G_S)\,P(Q_{obs}|\mathcal{G}, G_V, G_S)}{\sum_{G_U \in C_d} P(Q_{obs}|G_V, G_U)\,P(G_U|G_V, G_S)}$$
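The structure of the reduced LR, a single numerator term divided by a prior-weighted sum over the unknown profiles in the defence set, can be sketched as follows. The function name, argument names and the toy numbers are hypothetical; the individual probabilities would have to come from the peak intensity, drop-out and population genetic models of the earlier chapters, which are not reproduced here.

```python
def reduced_likelihood_ratio(p_dropout, p_obs_hp, defence_terms):
    """Ratio of the prosecution term to a weighted sum over unknown profiles.

    p_dropout     : P(Q_mis | Q_obs, ...) under H_p (assumed given)
    p_obs_hp      : P(Q_obs | ...) under H_p (assumed given)
    defence_terms : iterable of pairs (P(Q_obs | G_V, G_U), P(G_U | G_V, G_S)),
                    one pair per unknown profile G_U in C_d
    """
    numerator = p_dropout * p_obs_hp
    denominator = sum(p_obs * p_profile for p_obs, p_profile in defence_terms)
    if denominator <= 0.0:
        raise ValueError("the defence hypothesis assigns zero probability to the evidence")
    return numerator / denominator


# Toy numbers only, chosen for illustration
lr = reduced_likelihood_ratio(0.2, 0.5, [(0.4, 0.01), (0.1, 0.02)])
```

Keeping the defence terms as explicit pairs makes it visible which unknown profiles dominate the denominator, which is also the motivation for the importance sampling idea mentioned later in the chapter.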
9.4 Future research
9.4.1 Replicates
When a sample is taken from a crime scene the number of molecules may be limited, e.g. dead
hair follicles only contain limited amounts of DNA, and similarly for "touch DNA", which is
biological material transferred by physical contact (Gill and Buckleton, 2010a). Let $N$ be
the number of DNA molecules present after extraction and $n$ be the number of replicates,
$R_i$, made based on the $N$ molecules. For the $n$ replicates to be comparable in terms of
drop-outs (and possibly stutters and contamination) it is desirable for the amount of DNA to be
evenly distributed among $R_1, \dots, R_n$, e.g. for $n = 3$ one could imagine having
approximately 30% in each replicate, leaving 10% of the extracted DNA in the tube.
Let $\pi_A$ denote the aliquot proportion, then this sampling scheme implies that
$R_1 \sim \mathrm{bin}(N, \pi_A)$ and
$(R_i | R_1, \dots, R_{i-1}) \sim \mathrm{bin}(N - \sum_{j=1}^{i-1} R_j, \pi_A/[1 - (i-1)\pi_A])$
for $i = 2, \dots, n$. It is easy to verify that this construction yields the expected values
as desired: $E(R_1) = N\pi_A$ and
$$E(R_i) = E[E(R_i | R_1, \dots, R_{i-1})] = \frac{[N - (i-1)N\pi_A]\,\pi_A}{1 - (i-1)\pi_A} = N\pi_A.$$
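The claim about the expected values can also be checked numerically. The sketch below is a minimal illustration, assuming the sequential binomial scheme just described with a common aliquot proportion; the names `expected_aliquots` and `pi_a` are hypothetical. It tracks the exact distribution of the number of molecules already removed and accumulates the mean of each replicate.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability mass P(X = k) for X ~ bin(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def expected_aliquots(N, n, pi_a):
    """Exact E(R_1), ..., E(R_n) under the sequential binomial aliquot scheme."""
    dist = {0: 1.0}        # number of molecules already removed -> probability
    means = []
    for i in range(1, n + 1):
        p_i = pi_a / (1 - (i - 1) * pi_a)   # conditional success probability
        new_dist = {}
        mean_i = 0.0
        for used, prob in dist.items():
            left = N - used
            for r in range(left + 1):
                w = prob * binom_pmf(r, left, p_i)
                mean_i += r * w
                new_dist[used + r] = new_dist.get(used + r, 0.0) + w
        means.append(mean_i)
        dist = new_dist
    return means


means = expected_aliquots(N=20, n=3, pi_a=0.3)
```

For N = 20 and an aliquot proportion of 0.3 each of the three means should equal 6 up to rounding, in line with the derivation.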
Furthermore, this implies that
$(R, Q) = (R_1, \dots, R_n, Q) \sim \mathrm{mult}(N, \{\mathbf{1}\pi_A, 1 - n\pi_A\})$, where
$Q$ is the remaining extract. Assume that there need to be $M$ molecules of an allele prior to
PCR in order for it to be detected by the CCD camera in the electrophoresis machine post-PCR.
Hence, for the allele to be detected in each replicate we require that $R_i > M$ for all $i$:
$$P(R_1 > M, R_2 > M, \dots, R_n > M) = \sum_{r_1 > M}^{N} \sum_{r_2 > M}^{N - r_1} \cdots \sum_{r_n > M}^{N - r_+} P(R_1 = r_1, R_2 = r_2, \dots, R_n = r_n), \qquad (9.3)$$
where $r_+ = \sum_{i=1}^{n-1} r_i$. This probability depends on several factors but most
importantly $\Delta = N - nM$. For small $\Delta$ the probability that the allele has dropped
out in at least one of the replicates is considerable, and for negative $\Delta$ we are sure to
have drop-outs. However, when $\Delta \gg 0$ the probability of drop-outs in any of the
replicates is minimal, i.e. when the amount of DNA is large all the replicates should have all
alleles present.
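For small N the probability in (9.3) can be evaluated by direct enumeration of the multinomial distribution. The following is a sketch under the assumption that each of the n replicates receives the same aliquot proportion and the remainder stays in the tube; the function name `prob_all_detected` and the argument `pi_a` are hypothetical. The run time grows quickly with N and n, so this is meant for illustration rather than for realistic molecule counts.

```python
from math import comb


def prob_all_detected(N, M, n, pi_a):
    """P(R_1 > M, ..., R_n > M) for multinomial aliquots with common proportion pi_a."""
    rest_p = 1.0 - n * pi_a   # proportion left in the tube

    def recurse(i, remaining, weight):
        # weight = multinomial coefficient times aliquot probabilities so far
        if i == n:
            # whatever remains stays in the tube (0.0 ** 0 evaluates to 1.0)
            return weight * rest_p ** remaining
        total = 0.0
        for r in range(M + 1, remaining + 1):
            total += recurse(i + 1, remaining - r,
                             weight * comb(remaining, r) * pi_a ** r)
        return total

    return recurse(0, N, 1.0)


p_two = prob_all_detected(N=4, M=1, n=2, pi_a=0.5)    # only r1 = r2 = 2 qualifies
p_none = prob_all_detected(N=3, M=2, n=2, pi_a=0.5)   # N - n*M < 0
```

In the first toy case only the split (2, 2) satisfies both inequalities, giving 6/16; in the second the difference N - nM is negative and the probability is zero, as stated in the text.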
In low template DNA (LT-DNA, formerly known as Low Copy Number DNA, LCN-DNA, Gill and
Buckleton (2010a)) it is common to use the "biological model" to form a so-called consensus
profile (Buckleton et al., 2005, Chapter 8). That is, only alleles present in at least two
replicates are reported in the consensus profile (Gill et al., 2000). However, from the
probability in (9.3) it is for small $N$ very likely that an allele present in some replicates
is absent in others. Hence, the definition of a consensus profile may not be the best approach
when it is expected that the replicates will show different alleles for small amounts of DNA,
which is the case for LT-DNA. A better method would be to model the negative correlation
between peak intensities of replicates.
In the left panel of Figure 9.1 the probability that the consensus profile excludes a true
allele is plotted for two and three replicates against the total amount of extracted DNA, i.e.
$2 \cdot P(R_1 < M, R_2 > M)$ and $3 \cdot P(R_1 < M, R_2 < M, R_3 > M)$, where permutation of
replicates induces the multiplication by weights. It is assumed that in order to trigger the
observation of an allele using a 50 rfu threshold it is required to have 50 pg of DNA material
prior to PCR. Furthermore, for the two replicate case all of the extracted DNA is used in equal
amounts. For the three replicate situation it is intended to assign 30% of the total DNA to
each replicate.
[Figure 9.1 (three panels, each plotted against the amount of extracted DNA in pg, 50-250, for
two and three replicates): probability that exactly one replicate shows the allele; probability
that at least two replicates show the allele; pairwise correlation of replicates.]
Figure 9.1: Left: Probability that the consensus profile excludes a true allele for two and
three replicates. Centre: Probability that the allele will be included in the consensus profile
for two and three replicates. Right: Pairwise correlation between consensus profile inducing
replicates.
From the curves in the left panel of Figure 9.1 it is evident that for small and large amounts
of DNA the probabilities are effectively zero. For the small values this is because neither of
the replicates has observed alleles (no allele in the consensus profile due to drop-out in both
replicates), whereas for the large values it is because the allele is observed in all
replicates (allele in the consensus profile). The maxima are at 101 pg and 160 pg,
respectively, while the ranges where the probabilities are larger than $10^{-3}$ are 75-136 pg
and 112-201 pg, for two and three replicates.
In the centre panel of Figure 9.1 the probability that the consensus profile will include the
allele for two and three replicates is plotted against the amount of extracted DNA. For an
allele to be present in the consensus profile it must be detected at least twice:
$$P(R_1 > M, R_2 > M) \quad \text{and} \quad 3 \cdot P(R_1 < M, R_2 > M, R_3 > M) + P(R_1 > M, R_2 > M, R_3 > M).$$
To be 99.9% certain that an allele is present in the consensus profile the minimum required
amount of extracted DNA is 140 pg for two replicates and 245 pg for three replicates using the
assumed aliquot sampling scheme.
Define the indicator variables $T_i$ which are 1 if $R_i > M$ and 0 otherwise. Hence, $T_i$
indicates whether replicate $i$ triggers the observation of an allele above the threshold. The
consensus inducing correlations are thus $\mathrm{Cor}(T_1, T_2)$ for two replicates and
similarly $\mathrm{Cor}(T_1, T_2 | T_3 = 0)$ for three replicates. The latter correlation is
naturally subject to permutation of replicates, but since the amount of DNA for one replicate,
here $R_3$, is less than $M$, the two other replicates need to show the allele for it to be
included in the consensus profile. The right panel of Figure 9.1 shows the negative
correlations as expected due to the limited amount of DNA. For two replicates the pairwise
correlation is approximately equal to the negative probability of the left panel of
Figure 9.1.
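For the two-replicate case the quantities behind the three panels can be computed exactly. The sketch below assumes that all N molecules are split between the two replicates with probability 1/2 each, so that the first count is binomial and the second is the remainder; a replicate with exactly M molecules is counted as not detected, which is a convention of this sketch. The function name is hypothetical and the units are molecule counts rather than pg.

```python
from math import comb, sqrt


def consensus_two_replicates(N, M):
    """Return (P(both detect), P(exactly one detects), Cor(T1, T2)) for two replicates."""
    pmf = [comb(N, r) * 0.5 ** N for r in range(N + 1)]   # R1 ~ bin(N, 1/2), R2 = N - R1
    p1 = sum(pmf[r] for r in range(N + 1) if r > M)               # P(T1 = 1)
    p2 = sum(pmf[r] for r in range(N + 1) if N - r > M)           # P(T2 = 1)
    p_both = sum(pmf[r] for r in range(N + 1) if r > M and N - r > M)
    p_exactly_one = p1 + p2 - 2 * p_both
    var1, var2 = p1 * (1 - p1), p2 * (1 - p2)
    if var1 > 0 and var2 > 0:
        cor = (p_both - p1 * p2) / sqrt(var1 * var2)
    else:
        cor = float("nan")    # correlation undefined for a constant indicator
    return p_both, p_exactly_one, cor


p_both, p_one, cor = consensus_two_replicates(N=4, M=1)
```

Even in this tiny example the correlation between the two indicators is negative, reflecting that molecules taken by one replicate are unavailable to the other.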
The general picture from the model and analysis of replicates indicates that the concept of the
consensus profile (or biological model) is flawed, due to the disproportion between expected
peak intensities and consensus profile construction. However, it should be added that the
figures above are computed without taking measurement error, PCR efficiency variation,
quantification inaccuracy, etc. into account. A more refined model should include these and
other factors to be applicable to real STR data.
9.4.2 The number of contributors
When evaluating DNA mixtures a source of uncertainty is the number of contributors. Lauritzen
and Mortera (2002) derived an upper bound on the number of unknown contributors worth
considering (typically) under $H_d$. That is, the bound $b$ is computed such that if the number
of unknown profiles $x$ is larger than $b$, the evidence is less favourable to the defendant
than with $x = b$. However, this bound is computed without taking the quantitative part of the
evidence into consideration and may therefore yield an inaccurate bound for
$LR = P(Q, \mathcal{G}, K|H_p)/P(Q, \mathcal{G}, K|H_d)$.
9.4.3 Distribution of $\max_G L(Q|G)$ - optimisation over a discrete space
In relation to the problem above, it is relevant to be able to quantify the distribution of
$L(Q|G)$. How does one measure the significance in the $L(Q|G)$-value when changing the number
of contributors $m$? And how is this related to the mixture proportions $\alpha$? For a fixed
combination of profiles, going from $m$ to $m - 1$ contributors is equivalent to setting
$\alpha_1 = 0$. However, since the greedy algorithm searches over all possible combinations in
the discrete space $G$, it may be inappropriate to rely on asymptotic theory or other common
approaches to test $H_0 : \alpha_1 = 0$ against $H_1 : \alpha_1 > 0$.
9.4.4 Estimation of P(D) using the floating threshold methodology
In the drop-out model of Chapter 6 the limit of detection threshold was fixed at 50 rfu.
However, if the STR signal is assigned positive and negative by the floating threshold
methodology (Chapter 7), the threshold is not fixed and the previous definition of a drop-out,
$D = \{h < 50\}$, does not apply. The definition of the drop-out probability as an integral in
Section 9.3 may be used in this setting. That is, the quantitative data is split into two
disjoint parts, where the noise part (off-ladder observations not in pull-up position) is used
to determine $T$ and is therefore independent of the quantitative signal in the remaining part.
Hence, it would be possible to estimate a mean, $\mu_h$, and standard deviation, $\sigma_h$, for
the peak heights and evaluate
$P(D; \mu_h, \sigma_h) = P(h < T; \mu_h, \sigma_h) = \int_0^T f(h; \mu_h, \sigma_h)\,dh$.
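To make the integral concrete, the sketch below evaluates it for one possible choice of density. The thesis does not specify the family of f here, so the normal density used below is purely an assumption of this example, as are the function name and the toy parameter values; any other density with a computable distribution function could be substituted.

```python
from statistics import NormalDist


def dropout_probability(threshold, mu_h, sigma_h):
    """Integral of an assumed normal peak height density from 0 to the threshold T."""
    dist = NormalDist(mu=mu_h, sigma=sigma_h)
    return dist.cdf(threshold) - dist.cdf(0.0)


# Toy values: expected peak height equal to a floating threshold of 50 rfu
p_d = dropout_probability(threshold=50.0, mu_h=50.0, sigma_h=10.0)
```

When the expected peak height coincides with the threshold the drop-out probability is close to one half, and it decreases as the expected height moves above the threshold.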
9.4.5 Evaluating the entire signal
As mentioned in Chapter 1 the use of a threshold or limit of detection implies the possibility
of drop-out. In that chapter the argument for using a threshold strategy in this thesis was to
limit the set of possible combinations that needed to be evaluated in the LR. However, it may
be possible to evaluate the entire STR signal by including all observations above a given
limit, 5 rfu say. This would lead to more complicated expressions for the LR, however with a
gain in conceptual clarity since assignment of positive/negative alleles is superfluous. Using
this methodology, especially $P(E|H_d)$ could imply a summation over a huge set, which would be
computationally intensive. However, the terms in $P(E|H_p)$ and $P(E|H_d)$ that would have
numerical impact on the LR would be those including the observed alleles with the strongest
signals. Often these would be those associated with the alleles in $K$. This need not be the
case, but searching for a best matching pair of profiles would still be possible. For the
evaluation of the LR to be operational, it might be necessary to use importance sampling in
order to evaluate the sum in the denominator since fewer known profiles are specified by $H_d$
than by $H_p$. Assume that the hypothesis $H_d$ states that the observed crime scene stain was
a two-person DNA mixture; then, correcting for stutters and pull-up effects, it may be possible
to determine a best matching pair of profiles $\hat{G}$. This best matching configuration is
then applicable as reference profiles for importance sampling similar to the construction in
Section 5.6.
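The importance sampling idea can be illustrated generically. The sketch below estimates a sum over a discrete set by drawing states from a proposal distribution and averaging the ratio of the summand to the proposal probability. It is not the construction of Section 5.6: the function name, the integer-valued toy states and the proposal weights are all hypothetical stand-ins for profiles, the terms P(E, K|G)P(G) and a proposal centred on a best matching configuration.

```python
import random


def importance_sum(states, summand, proposal_weight, n_draws, seed=0):
    """Estimate sum(summand(g) for g in states) by sampling from the proposal.

    proposal_weight(g) must be positive wherever summand(g) is non-zero.
    """
    weights = [proposal_weight(g) for g in states]
    total_w = sum(weights)
    rng = random.Random(seed)
    draws = rng.choices(states, weights=weights, k=n_draws)
    # Each draw g has probability proposal_weight(g) / total_w under the proposal
    return sum(summand(g) * total_w / proposal_weight(g) for g in draws) / n_draws


states = list(range(10))
exact = sum(0.1 * g for g in states)
estimate = importance_sum(states, lambda g: 0.1 * g, lambda g: g + 1.0, n_draws=50000)
```

A proposal that concentrates on the states with the largest summands reduces the variance of the estimate, which is why a well-chosen reference configuration would matter in practice.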
Let $E$ denote the signal obtained from the EPG based on a crime related sample, e.g. a sample
taken from a scene of crime. When evaluating the sample we are interested in $P(E|H_a)$ for
some $H_a$-hypothesis. $H_a$ induces a discrete set of DNA profiles and we denote this
$C_a = \{G : G \equiv H_a\}$. Furthermore, $H_a$ may specify further evidence in terms of DNA
profiles of identified individuals. Let $K$ denote the common set of known profiles of the two
hypotheses evaluated in the LR. For example, in a two-person DNA mixture $K$ may be the
profiles of a victim and the suspect, $K = (G_V, G_S)$. Thus the likelihood ratio is
$LR = P(E, K|H_p)/P(E, K|H_d)$. This LR is evaluated by summing in both numerator and
denominator over profiles in $C_p$ and $C_d$, respectively. That is,
$P(E, K|H_a) = \sum_{G \in C_a} P(E, K|G)P(G)$.
We assume that given $G$ no other profiles affect the observed signal. In particular this is
true for the known profiles, $K$. Hence, $E$ and $K$ are conditionally independent given $G$:
$P(E, K|G) = P(E|G)P(K|G)$. For each set of profiles $G \in C_a$ a set of stutters and
on-ladder pull-up peaks is induced. Let $S_G$ and $P_G$ denote these derivatives, where $S_G$
includes both stutters (first, second, third, etc.) and back-stutters. Furthermore, for each
$G$ the allelic ladder, $L$, is known and fixed.
Given $G$ the observed signal, $E$, may be decomposed into five parts that constitute an STR
signal:
- Off-ladder noise, $E^{\bar{L}}_n$, which are all intensity observations in off-ladder
  position and not in possible pull-up position. $E^{\bar{L}}_n$ is fixed for all $G$ since it
  only relies on the fixed ladder, $L$.
- The signal due to the proposed profiles in $G$: $E_G$.
- The signal due to stutters induced by profiles in $G$: $E_{S_G}$.
- The signal due to pull-up peaks induced by profiles in $G$ and $S_G$: $E_{P_G}$.
- On-ladder noise, $E^{L}_n$, which are all on-ladder observations not ascribed to $G$ and its
  derivatives.
Using this decomposition we have for $G \in C_a$:
$$P(E|G) = P(E^{L}_n|E_{P_G}, E_{S_G}, E_G, E^{\bar{L}}_n, G)\,P(E_{P_G}|E_{S_G}, E_G, E^{\bar{L}}_n, G)\,P(E_{S_G}|E_G, E^{\bar{L}}_n, G)\,P(E_G|E^{\bar{L}}_n, G)\,P(E^{\bar{L}}_n|G)$$
$$= P(E^{L}_n|E^{\bar{L}}_n, G)\,P(E_{P_G}|E_{S_G}, E_G, E^{\bar{L}}_n, G)\,P(E_{S_G}|E_G, E^{\bar{L}}_n, G)\,P(E_G|E^{\bar{L}}_n, G)\,P(E^{\bar{L}}_n), \qquad (9.4)$$
where $P(E^{\bar{L}}_n|G) = P(E^{\bar{L}}_n)$ since it is fixed for all profiles $G$ and thus
cancels out when forming the likelihood ratio. It is likely that some of the terms in (9.4) can
be simplified due to conditional independence given $G$. For example, the on-ladder noise,
$E^{L}_n$, may be independent of the off-ladder noise, $E^{\bar{L}}_n$, given $G$ when the
parameters of $P(E^{\bar{L}}_n)$ are determined, i.e.
$P(E^{L}_n|E^{\bar{L}}_n, G) = P(E^{L}_n|E^{\bar{L}}_n, G)$.
The LR is formed by a hypothesis specific ratio of the expression in (9.4):
$$LR = \frac{\sum_{G \in C_p} P(E^{L}_n|E^{\bar{L}}_n, G)\,P(E_{P_G}|E_{S_G}, E_G, E^{\bar{L}}_n, G)\,P(E_{S_G}|E_G, E^{\bar{L}}_n, G)\,P(E_G|E^{\bar{L}}_n, G)\,P(K|G)\,P(G)}{\sum_{G' \in C_d} P(E^{L}_n|E^{\bar{L}}_n, G')\,P(E_{P_{G'}}|E_{S_{G'}}, E_{G'}, E^{\bar{L}}_n, G')\,P(E_{S_{G'}}|E_{G'}, E^{\bar{L}}_n, G')\,P(E_{G'}|E^{\bar{L}}_n, G')\,P(K|G')\,P(G')}$$
As in Section 9.3 we consider a two-person DNA mixture with known victim profile $G_V$ and
suspect profile $G_S$, where $H_p : (G_V, G_S)$ and $H_d : (G_V, G_U)$. Due to limited space we
define $G_{V,S} = (G_V, G_S)$ and $G_{V,U} = (G_V, G_U)$; then the likelihood ratio is
$$LR = \frac{P(E^{L}_n|E^{\bar{L}}_n, G_{V,S})\,P(E_{P_{G_{V,S}}}|E_{S_{G_{V,S}}}, E_{G_{V,S}}, E^{\bar{L}}_n, G_{V,S})\,P(E_{S_{G_{V,S}}}|E_{G_{V,S}}, E^{\bar{L}}_n, G_{V,S})\,P(E_{G_{V,S}}|E^{\bar{L}}_n, G_{V,S})}{\sum_{G_U \in C_d} P(E^{L}_n|E^{\bar{L}}_n, G_{V,U})\,P(E_{P_{G_{V,U}}}|E_{S_{G_{V,U}}}, E_{G_{V,U}}, E^{\bar{L}}_n, G_{V,U})\,P(E_{S_{G_{V,U}}}|E_{G_{V,U}}, E^{\bar{L}}_n, G_{V,U})\,P(E_{G_{V,U}}|E^{\bar{L}}_n, G_{V,U})}.$$
Bibliography
Alaeddini, R., S. J. Walsh, and A. Abbas (2010). Forensic implications of genetic analyses from
degraded DNA - A review. Forensic Science International: Genetics 4(3), 148157.
Alonso, A. et al. (2005). Challenges of DNA proling in mass disaster investigations. Croatian
Medical Journal 46(4), 540548.
Applied Biosystems (2000). GeneScan Reference Guide - Chemistry Reference for the ABI
PRISM 310 Genetic Analyzer. Applied Biosystems. Figure Virtual Filter Set F, pp. 4-10.
Applied Biosystems (2006). AmpFSTR SGM Plus PCR Amplication Kit Users Manual. Ap-
plied Biosystems.
Ayres, K. L. (2000). A two-locus forensic match probability for subdivided populations. Genet-
ica 108, 137143.
Balding, D. J. (2003). Likelihood-based inference for genetic correlation coecients. Theoreti-
cal Population Biology 63, 221230.
Balding, D. J. (2005). Weight-of-evidence for Forensic DNA Proles. Chichester, West Sussex:
John Wiley & Sons, Ltd.
Balding, D. J. and J. S. Buckleton (2009). Interpreting low template DNA proles. Forensic
Science International: Genetics 4(1), 110.
Balding, D. J. and R. A. Nichols (1994). DNA prole match probability calculation: how to
allow for population stratication, relatedness, database selection and single bands. Forensic
Science International 64, 125140.
Balding, D. J. and R. A. Nichols (1995). A method for quantifying dierentiation between
177
178 Bibliography
populations at multi-allelic loci and its implications for investigating identity and paternity.
Genetica 96, 312.
Balding, D. J. and R. A. Nichols (1997). Signicant genetic correlations among caucasians at
forensic DNA loci. Heredity 78(6), 583589.
Barndor-Nielsen, O. E. and D. R. Cox (1994). Inference and Asymptotics. Number 52 in
Monographs on Statistics and Applied Probability. London: Chapman & Hall.
Bender, K., M. J. Farfan, and P. M. Schneider (2004). Preparation of degraded human DNA
under controlled conditions. Forensic Science International 139(2-3), 135140.
Bill, M. et al. (2005). PENDULUM - a guideline-based approach to the interpretation of STR
mixtures. Forensic Science International 148, 181189.
Box, G. E. P. and N. R. Draper (1987). Empirical model-builing and response surfaces. Wiley.
Brier, G. W. (1950). Verication of forecasts expressed in terms of probability. Monthly Weather
Review 78, 13.
Buckleton, J. S. and J. M. Curran (2008). A discussion of the merits of randomman not excluded
and likelihood ratios. Forensic Science International: Genetics 2, 343348.
Buckleton, J. S., C. M. Triggs, and S. J. Walsh (2005). Forensic DNA evidence interpretation,
pp. 217274. Boca Raton, FL: CRC Press.
Budowle, B. and T. R. Moretti (1999). Genotype proles for six population groups at the 13
CODIS short tandem repeat core loci and other PCR-based loci. Forensic Science Communi-
cations.
Butler, J. M. (2005). Forensic DNA Typing: Biology, Technology, and Genetics of STR Markers
(2 ed.). Burlington, MA: Elsevier Academic Press Inc., U.S.
Clayton, T. M., J. P. Whitaker, R. Sparkes, and P. D. Gill (1998). Analysis and interpretation of
mixed forensic stains using DNA STR proling. Forensic Science International 91, 5570.
Cockerham, C. C. (1969). Variance of gene frequencies. Evolution 23(1), 7284.
Cockerham, C. C. (1973). Analysis of gene frequencies. Genetics 74(4), 679700.
Colotte, M., V. Couallier, S. Tuet, and J. Bonnet (2009). Simultaneous assessment of aver-
age fragment size and amount in minute samples of degraded DNA. Analytical Biochem-
istry 388(2), 345347.
Cook, O. and L. Dixon (2006). The prevalence of mixed DNAproles in ngernail samples taken
fromindividuals in the general population. Forensic Science International: Genetics 1(1), 62
68.
Cowell, R. G. (2009). Validation of an STR peak area model. Forensic Science International:
Genetics 3(3), 193199.
Cowell, R. G., S. L. Lauritzen, and J. Mortera (2007a). A gamma model for DNA mixture
analyses. Bayesian Analysis 2(2), 333348.
Bibliography 179
Cowell, R. G., S. L. Lauritzen, and J. Mortera (2007b). Identication and separation of DNA
mixtures using peak area information. Forensic Science International 166, 2834.
Cowell, R. G., S. L. Lauritzen, and J. Mortera (2010). Probabilistic expert systems for handling
artifacts in complex DNA mixtures. Forensic Science International: Genetics. In Press.
Cox, D. R. (1958). Some problems connected with statistical inference. Annals of Mathematical
Statistics 29(2), 357372.
Cox, D. R. and D. V. Hinkley (1974). Theoretical Statistics. Chapman and Hall Ltd.
Curran, J. M. (2008). A MCMC method for resolving two person mixtures. Science &Justice 48,
168177.
Curran, J. M., J. S. Buckleton, C. M. Triggs, and B. S. Weir (2002). Assessing uncertainty in
DNA evidence caused by sampling eects. Science and Justice 42(1), 2937.
Curran, J. M., C. M. Triggs, J. S. Buckleton, and B. S. Weir (1999). Interpreting DNA mixtures
in structured populations. Journal of Forensic Science 44(5), 987995.
Curran, J. M. and T. Tvedebrink (2010a). DNAtools - a R package for forensic DNA database
analysis. Journal of Computational Statistics. Manuscript in preparation.
Curran, J. M. and T. Tvedebrink (2010b). DNAtools: Statistical functions for analysing forensic
DNA databases. R package version 0.1.
Curran, J. M., S. J. Walsh, and J. S. Buckleton (2007). Empirical testing of estimated DNA
frequencies. Forensic Sciences International: Genetics 1, 267272.
Davison, A. C. and D. V. Hinkley (1997). Bootstrap Methods and their Application. Cambridge
University Press.
Dixon, L. A. et al. (2006). Analysis of articially degraded DNA using STRs and SNPs - results
of a collaborative European (EDNAP) exercise. Forensic Science International 164(1), 3344.
Donnelly, P. (1995a). Match probability calculations for multi-locus DNA proles. Genetica 96,
5567.
Donnelly, P. (1995b). Nonindependence of matches at dierence loci in DNA proles: quanti-
fying the eect of close relatives on the match probability. Heredity 75, 2634.
Evett, I. W., P. D. Gill, and J. A. Lambert (1998). Taking account of peak areas when interpreting
mixed DNA proles. Journal of Forensic Sciences 43(1), 6269.
Evett, I. W. and B. S. Weir (1998). Interpreting DNA Evidence: Statistical Genetics for Forensic
Scientists. Sunderland, MA: Sinauer Associates.
Fields, C. A. and A. H. Welsh (2007). Bootstrapping clustered data. Journal of the Royal
Statistical Society. Series B, Statistical methodology 69(3), 369390.
Gilder, J. R., T. E. Doom, K. Inman, and D. E. Krane (2007). Run-Specic Limits of Detection
and Quantitation for STR-based DNA Testing. Journal of Forensic Science 52(1), 97101.
180 Bibliography
Gill, P. D. et al. (1998). Interpreting simple STR mixtures using allele peak areas. Forensic
Science International 91(1), 4153.
Gill, P. D. et al. (2006). DNA commission of the International Society of Forensic Genetics:
Recommendations on the interpretation of mixtures. Forensic Science International 160(2-3),
90101.
Gill, P. D. and J. S. Buckleton (2010a). A universal strategy to interpret DNA proles that does
not require a denition of low-copy-number. Forensic Science International: Genetics 4(4),
221227.
Gill, P. D. and J. S. Buckleton (2010b). Mixture interpretation: dening the relevant features
for guidelines for the assessment of mixed DNA proles in forensic casework. Journal of
Forensic Sciences 55(1), 265268.
Gill, P. D., J. M. Curran, and K. Elliot (2005). A graphical simulation model of the entire DNA
process associated with the analysis of short tandemrepeat loci. Nucleic Acids Research 33(2),
632643.
Gill, P. D., J. Whitaker, C. Flaxman, N. Brown, and J. S. Buckleton (2000). An investigation
of the rigor of interpretation rules for STRs derived from less than 100 pg of DNA. Forensic
Science International 112(1), 1740.
Green, P. J. and J. Mortera (2009). Sensitivity of inferences in forensic genetics to assumptions
about founding genes. Annals of Applied Statistics 3(2), 731763.
Green, R., I. Roinestad, C. Boland, and L. Hennessy (2005). Developmental Validation of the
Quantiler Real-Time PCR kits for the Quantication of Human Nuclear DNA samples.
Journal of Forensic Science 50(4), 809825.
Hardy, G. H. (1908). Mendelian proportions in a mixed population. Science 28(706), 4950.
Harrell Jr., F. E. (2001). Regression Modeling Strategies. Springer.
Holsinger, K. E. (1999). Analysis of genetic diversity in geographically structure populations:
A bayesian perspective. Hereditas 130, 245255.
Holsinger, K. E. and B. S. Weir (2009). Genetics in geographically structured populations:
dening, estimating and interpreting F
S T
. Nature Reviews. Genetics 10(9), 639650.
Irwin, J. A. et al. (2007). Application of low copy number STR typing to the identication of
aged, degraded skeletal remains. Journal of Forensic Sciences 52(6), 13221327.
Johnson, N. L., S. Kotz, and N. Balakrishnan (1997). Discrete Multivariate Distributions. Wiley.
Lange, K. (1993). Match probabilities in racially admixed populations. American Journal of
Human Genetics 52, 305311.
Lange, K. (1995a). Applications of the Dirichlet distribution to forensic match probabilities.
Genetica 96, 107117.
Lange, K. (1995b). Mathematical and Statistical Methods for Genetic Analysis (2 ed.). Springer.
Bibliography 181
Laurie, C. and B. S. Weir (2003). Dependency eects in multi-locus match probabilities. Theo-
retical Population Biology 63, 207219.
Lauritzen, S. L. (1996). Graphical models. Oxford University Press.
Lauritzen, S. L. and J. Mortera (2002). Bounding the number of contributors to mixed DNA
stains. Forensic Science International 130(2-3), 125126.
Little, R. and D. Rubin (2002). Statistical Analysis with missing data (2 ed.). Wiley.
Maimon, G. (2010). A Bayesian approach to the statistical interpretation of DNA evidence. Ph.
D. thesis, Department of Mathematics and Statistics, McGill University, Montreal, Canada.
McCullagh, P. and J. Nelder (1989). Generalized Linear Models. Chapman and Hall.
Mosimann, J. E. (1962). On the compound multinomial distribution, the multivariate -
distribution, and correlations among proportions. Biometrika 49(1-2), 6582.
Mueller, L. D. (2008). Can simple populations genetic models reconcile partial match frequen-
cies observed in large forensic databases? Journal of Genetics 87(2), 101107.
Neerchal, N. K. and J. G. Morel (2005). An improved method for the computation of maximum
likelihood estimates for multinomial overdispersion models. Computational Statistics &Data
Analysis 49, 3343.
Nichols, R. A. and D. J. Balding (1991). Eects of population structure on DNA ngerprint
analysis in forensic science. Heredity 66, 297302.
Paul, S. R., U. Balasooriya, and T. Banerjee (2005). Fisher information matrix for the Dirichlet-
multinomial distribution. Biometrical Journal 47(2), 230236.
Perlin, M. W. and B. Szabady (2001). Linear mixture analysis: A mathematical approach to
resolving mixed DNA samples. Journal of Forensic Science 46(6), 13721378.
Petricevic, S. et al. (2009). Validation and development of interpretation guidelines for low
copy number (LCN) DNA proling in New Zealand using the AmpFSTR SGM Plus(TM)
multiplex. Forensic Science International: Genetics In Press, Corrected Proof.
Phillips, C., T. Tvedebrink, et al. (2010). Analysis of global variability in 15 established and
5 new European Standard Set (ESS) STRs using the CEPH human genome diversity panel.
Forensic Science International: Genetics. In Press.
Prinz, M. et al. (2007). DNA Commision of the International Society for Forensic Genetics
(ISFG): Recommendations regarding the role of forensic genetics for disaster victim identi-
cation (DVI). Forensic Science International: Genetics 1(1), 312.
R Development Core Team (2009). R: A Language and Environment for Statistical Computing.
Vienna, Austria: R Foundation for Statistical Computing. ISBN 3-900051-07-0.
Rannala, B. and J. A. Hartigan (1996). Estimating gene ow in island populations. Genetical
Research 67, 147158.
Robert, C. P. and G. Casella (2004). Monte Carlo Statistical Methods (2 ed.). Springer.
182 Bibliography
Samanta, S., Y.-J. Li, and B. S. Weir (2009). Drawing inferences about the coancestry coecient.
Theoretial Population Biology 75, 312319.
Schneider, P. M. et al. (2004). STR analysis of articially degraded DNA - results of a collabo-
rative European exercise. Forensic Science International 139(2-3), 123134.
Song, Y. S. and M. Slatkin (2007). A graphical approach to multi-locus match probabilitiy
computation: Revisiting the product rule. Theoretical Population Biology 72, 96110.
Troyer, K., T. Gilroy, and B. Koeneman (2001). A nine STR locus match between two apparent
unrelated individuals using AmpFSTR Proler Plus and COler. Proceedings of the
Promega 12th International Symposium on Human Identication.
Tvedebrink, T. (2009). dirmult: Estimation in Dirichlet-Multinomial distribution. R package
version 0.1.3.
Tvedebrink, T. (2010). Overdispersion in allelic counts and -correction in forensic genetics.
Theoretical Population Biology. In Press.
Tvedebrink, T., P. S. Eriksen, H. S. Mogensen, and N. Morling (2008). Amplication of DNA
mixtures - Missing data approach. Forensic Science International: Genetics Supplement Se-
ries 1, 664666.
Tvedebrink, T., P. S. Eriksen, H. S. Mogensen, and N. Morling (2009). Estimating the proba-
bility of allelic drop-out of STR alleles in forensic genetics. Forensic Science International:
Genetics 3(4), 222226.
Tvedebrink, T., P. S. Eriksen, H. S. Mogensen, and N. Morling (2010a). Evaluating the weight
of evidence using quantitative STR data in DNA mixtures. Journal of the Royal Statistical
Society. Series C, Applied statistics. In Press.
Tvedebrink, T., P. S. Eriksen, H. S. Mogensen, and N. Morling (2010b). Identifying contributors
of DNA mixtures by of quantitative information of STR typing. Journal of Computational
Biology. Accepted for publication.
Ukoumunne, O. C., A. C. Davison, M. C. Gulliford, and S. Chinn (2003). Non-parametric boot-
strap condence intervals for the intraclass correlation coecient. Statistics in Medicine 22,
38053821.
Venables, W. N. and B. D. Ripley (2002). Modern Applied Statistics with S (4 ed.). Springer.
Votaw, D. F. (1948). Testing compound symmetry in a normal multivariate distribution. Annals
of Mathematical Statistics 19(4), 447473.
Wang, T., N. Xue, and J. D. Birdwell (2006). Least-square deconvolution: A framework for
interpreting short tandem repeat mixtures. Journal of Forensic Science 51(6), 12841297.
Weinberg, W. (1908).

Uber den nachweis der vererbung beimmenschen. Jahreshefte des Vereins
f ur vaterl andische Naturkunde in W urttemberg 64, 368382.
Weir, B. S. (1996). Genetic Data Analysis II. Sinauer Associates, Inc.
Bibliography 183
Weir, B. S. (2004). Matching and partially-matching DNA proles. Journal of Forensic Sci-
ence 49(5), 16.
Weir, B. S. (2007). The rarity of DNA proles. The Annals of Applied Statistics 1(2), 358370.
Weir, B. S. and C. C. Cockerham (1984). Estimating F-statistics for the Analysis of Population
Structure. Evolution 38(6), 13581370.
Weir, B. S. and W. G. Hill (2002). Esimating F-statistics. Annual Review of Genetics 36, 721
750.
Wright, S. (1951). The genetical structure of populations. Annals of eugenics 15, 323354.
Zhou, H. and K. Lange (2010). MM algorithms for some discrete multivariate distributions.
Journal of Computational and Graphical Statistics. In Press.
