the drop-out/drop-in model
Illustration of the LR principle applied to DNA mixtures
Outline
• Illustration of the LR principle applied to DNA mixtures
• Two-person mixtures to explain the principle (but no general formula is given!)
• Example with and without allelic dropout
The LR model— Avila June 2013 1
DNA mixtures
• Two or more individuals contributing to the sample
• More than two peaks per locus
Why are mixtures challenging?
What genotypes created the mixture?
• 12,12/13,15
• 12,15/13,15
• 12,13/13,15
• ...
ISFG DNA commission recommendations
The likelihood ratio is the preferred approach to mixture interpretation. DNA commission 2005
Probabilistic approaches and likelihood ratio principles are superior to classical methods. DNA commission 2012
The Bayesian framework: likelihood ratios
$LR = \frac{\Pr(\text{data} \mid H_{\text{prosecution}})}{\Pr(\text{data} \mid H_{\text{defence}})}$
• data: alleles and their peaks
• ratio of two probabilities or, equivalently, ratio of two likelihoods
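As a reminder of why the LR sits inside a Bayesian framework, the odds form of Bayes' theorem can be written as below. This is the standard textbook identity, not something taken from the slides; $H_p$ and $H_d$ abbreviate the prosecution and defence propositions.

```latex
\frac{\Pr(H_p \mid \text{data})}{\Pr(H_d \mid \text{data})}
  = \underbrace{\frac{\Pr(\text{data} \mid H_p)}{\Pr(\text{data} \mid H_d)}}_{LR}
    \times \frac{\Pr(H_p)}{\Pr(H_d)}
```

The LR is the factor by which the data update the prior odds into posterior odds.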
Interpretation
• Need for an interpretation framework that applies to all types of samples:
  - High template
  - Low template: PCR-related stochastic effects are exacerbated, creating uncertainty about the composition of the crime-sample
Reporting officers make pre-case assessments and formulate the propositions to be evaluated within the likelihood ratio framework.
Dropout/Drop-in definitions
Allele or locus dropout is defined as a signal that is below the limit of detection threshold. It occurs when one or both alleles of a heterozygote fail to PCR-amplify.
Allele drop-in is an allele that is not associated with the crime-sample and remains unexplained by the contributors under either Hp or Hd.
Low/High template DNA
High template DNA
• The epg (electropherogram) reflects the composition of the sample:
  - no dropout
  - no drop-in
Low level DNA
• The epg does not reflect the composition of the sample:
  - allele dropout
  - allele drop-in
  - stutters
  - ...
Part 1: High template DNA, the epg reflects the composition of the sample.
Two-person mixture example
• Two-person mixture
Two-person mixture example
Locus 1
Evidence  9,11,12
Suspect   9,11
Victim    11,12
• Hp: Suspect + Victim contributed to the sample
• Hd: Victim + Unknown person (unrelated to the suspect) contributed to the sample
Two-person mixture: Under Hp
Locus 1
Evidence  9,11,12
Suspect   9,11
Victim    11,12
Hp: Suspect + Victim contributed to the sample
The suspect (9,11) and the victim (11,12) together explain exactly the alleles 9, 11 and 12, so:
$\Pr(\text{Evidence} \mid H_p) = 1$
Two-person mixture: Under Hd
Locus 1
Evidence  9,11,12
Victim    11,12
Unknown   ?
Hd: Unknown + Victim contributed to the sample
Two-person mixture: Under Hd
• The victim's profile explains 11 and 12
• The unknown has to have allele 9: allele 9 is constrained
Locus 1
Evidence  9,11,12
Victim    11,12
Unknown   9,11 or 9,12 or 9,9
$\Pr(\text{Evidence} \mid H_d) = 2p_9p_{11} + 2p_9p_{12} + p_9^2$
Two-person mixture: LR
• Hp: Suspect + Victim contributed to the sample
• Hd: Victim + Unknown person (unrelated to the suspect) contributed to the sample
$\Pr(\text{Evidence} \mid H_p) = 1$
$\Pr(\text{Evidence} \mid H_d) = 2p_9p_{11} + 2p_9p_{12} + p_9^2$
$LR = \frac{1}{2p_9p_{11} + 2p_9p_{12} + p_9^2}$
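The one-unknown calculation can be sketched in a few lines of Python. The function name and the allele frequencies are hypothetical and chosen only for illustration; they do not come from the slides.

```python
def pr_evidence_hd(p9, p11, p12):
    """Pr(Evidence | Hd) when the victim (11,12) is known and one unknown
    must carry allele 9: genotypes 9,11 / 9,12 / 9,9 under Hardy-Weinberg."""
    return 2 * p9 * p11 + 2 * p9 * p12 + p9 ** 2


# Hypothetical allele frequencies, for illustration only.
p9, p11, p12 = 0.1, 0.2, 0.3

# Under Hp the suspect and victim explain the evidence fully, so the numerator is 1.
lr = 1.0 / pr_evidence_hd(p9, p11, p12)
print(f"Pr(E|Hd) = {pr_evidence_hd(p9, p11, p12):.4f}, LR = {lr:.2f}")
```

The rarer allele 9 is in the population, the smaller the denominator and the larger the LR.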
What is the underlying model?
• LR is a function of the genotypic frequencies
• Assumes independent association of the alleles within loci: Hardy-Weinberg equilibrium
• Multiply between loci: linkage equilibrium
The product rule
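A minimal sketch of the between-locus step of the product rule: per-locus LRs are multiplied, which assumes linkage equilibrium as stated above. The function name and the per-locus values are hypothetical.

```python
import math


def overall_lr(per_locus_lrs):
    """Combine per-locus LRs by multiplication (assumes independent loci)."""
    return math.prod(per_locus_lrs)


# Hypothetical per-locus LRs, for illustration only.
locus_lrs = [9.1, 4.0, 2.5]
total = overall_lr(locus_lrs)
print(f"overall LR = {total:.1f}, log10 LR = {math.log10(total):.2f}")
```

Working with the sum of log10 LRs instead of the raw product is a common choice when many loci are combined, because it avoids very large or very small numbers.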
Summary
• Derive the possible genotypes for the unknowns
• Determine the genotypic probabilities
• Sum up the probabilities for all plausible genotypes
• Calculate the ratio of the probabilities under Hp and under Hd
You should not do this by hand!
• usually, analysis of 15 or more loci simultaneously
• calculations get complicated with two or more unknowns
What happens if there are two unknowns under Hd?
• Hp: Suspect + Victim contributed to the sample
• Hd: Two unknown individuals (unrelated to the suspect) contributed to the sample
Locus 1
Evidence   9,11,12
Unknown 1  ?
Unknown 2  ?
• Have to consider all the plausible genotypic combinations for the unknowns that explain alleles 9, 11, 12 observed in the crime-sample.
Under Hd: two unknowns
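Unknown 1   Unknown 2
9,9         11,12
11,11       9,12
12,12       9,11
9,11        9,12
9,11        11,12
9,12        11,12
$\Pr(\text{Evidence} \mid H_d) = 2\,(p_9^2 \cdot 2p_{11}p_{12} + p_{11}^2 \cdot 2p_9p_{12} + p_{12}^2 \cdot 2p_9p_{11} + 2p_9p_{11} \cdot 2p_9p_{12} + 2p_9p_{11} \cdot 2p_{11}p_{12} + 2p_9p_{12} \cdot 2p_{11}p_{12})$
The leading factor 2 accounts for the two ways of assigning each pair of genotypes to Unknown 1 and Unknown 2.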
LR: two unknowns
$LR = \frac{1}{2\,(p_9^2 \cdot 2p_{11}p_{12} + p_{11}^2 \cdot 2p_9p_{12} + p_{12}^2 \cdot 2p_9p_{11} + 2p_9p_{11} \cdot 2p_9p_{12} + 2p_9p_{11} \cdot 2p_{11}p_{12} + 2p_9p_{12} \cdot 2p_{11}p_{12})}$
• Increasing the number of unknowns increases the number of terms under Hd
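The enumeration behind the two-unknown formula can be sketched by brute force: list every ordered pair of genotypes whose alleles together give exactly the evidence (no dropout, no drop-in), and sum the Hardy-Weinberg products. Function names and allele frequencies are hypothetical; this is an illustration, not the method of any particular software.

```python
from itertools import combinations_with_replacement, product


def genotype_prob(genotype, freqs):
    """Hardy-Weinberg probability: p^2 for a homozygote, 2pq for a heterozygote."""
    a, b = genotype
    return freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]


def pr_evidence_two_unknowns(evidence, freqs):
    """Sum over ordered genotype pairs whose union is exactly the evidence."""
    genotypes = list(combinations_with_replacement(sorted(evidence), 2))
    total = 0.0
    for g1, g2 in product(genotypes, repeat=2):  # ordered: (Unknown 1, Unknown 2)
        if set(g1) | set(g2) == set(evidence):
            total += genotype_prob(g1, freqs) * genotype_prob(g2, freqs)
    return total


# Hypothetical allele frequencies, for illustration only.
freqs = {9: 0.1, 11: 0.2, 12: 0.3}
pr_hd = pr_evidence_two_unknowns({9, 11, 12}, freqs)
print(f"Pr(E|Hd) = {pr_hd:.4f}, LR = {1.0 / pr_hd:.2f}")
```

Because the loop runs over ordered pairs, each of the six combinations in the table is counted twice, which reproduces the leading factor 2 in the formula.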
Part 2: Low template DNA, the epg does not reflect the composition of the sample.
Likelihood ratios vs. Low template DNA
• Classical approach of the LR: the product rule
• Main source of uncertainty in previous examples: genotypes of unknown contributors
We will now see how we can modify the classical LR approach to account for uncertainty in the data due to low template DNA conditions.
Uncertainty in the data: single-source example
Locus 1
Evidence  11
Suspect   9,11
• Hp: Suspect contributed to the sample
• Hd: Unknown person (unrelated to the suspect) contributed to the sample
• Classical LR: $\Pr(\text{Evidence} \mid H_p) = 0$ (the suspect's allele 9 is missing from the evidence)
• LR with dropout and drop-in: $\Pr(\text{Evidence} \mid H_p) \neq 0$
LR with dropout and drop-in
• Main theory described by:
  - Haned et al., FSIG, 2012
  - DNA commission ISFG, FSIG, 2012
  - Gill et al., FSI, 2007
  - Curran et al., FSI, 2005
• Two key parameters in the model:
  - dropout: heterozygote, homozygote
  - drop-in: not treated here
Basic model: qualitative data only, also called the drop-model.
LR with dropout and drop-in
• An allele drops out with a probability of $d$
• An allele does not drop out with a probability of $1 - d$
• Allele dropout from a heterozygote: $d$
• Allele dropout from a homozygote: $d'$
Single-source example: Under Hp
• Hp: Suspect contributed to the sample
Allele   Dropout
9        yes
11       no
$\Pr(\text{Evidence} \mid H_p) = \Pr(\text{dropout of 9}) \times \Pr(\text{non-dropout of 11}) = d \times (1 - d)$
Single-source example: Under Hd
• Hd: Unknown contributed to the sample
Locus 1
Evidence  11
Unknown   ?
The Q alleles
• What are the possible genotypes for the unknown?
  - The dropped-out alleles are gathered under a virtual allele Q
  - Q is a 'place-holder' that stands in for all the possible genotypes!
  - The unknown's genotype has to explain allele 11 (no drop-in)
Under Hd
Locus 1
Evidence  11
Unknown   11,11 or 11,Q
• Q can be anything except 11
• Unknown genotype must explain 11
• This leaves us with two possibilities:
  - Homozygote: 11,11
  - Heterozygote: 11,Q
Q allele
• Locus L has four alleles: {9, 10, 11, 12}
  - $p_9 + p_{10} + p_{11} + p_{12} = 1$
  - $p_Q = 1 - p_{11}$
  - $p_Q = p_9 + p_{10} + p_{12}$
• 11,Q can be:
  - 9,11
  - 10,11
  - 11,12
No need to worry about deriving all the genotypes!
• All three genotypes are regrouped under 11,Q with frequency $2p_{11}p_Q$
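A short check that grouping under Q loses nothing, using the four-allele locus above: expanding $p_Q$ recovers the three heterozygote probabilities one by one.

```latex
2\,p_{11}\,p_Q
  = 2\,p_{11}\,(p_9 + p_{10} + p_{12})
  = \underbrace{2\,p_9\,p_{11}}_{9,11}
  + \underbrace{2\,p_{10}\,p_{11}}_{10,11}
  + \underbrace{2\,p_{11}\,p_{12}}_{11,12}
```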
Summary
• Two possible genotypes: 11,11 and 11,Q
Genotype   Dropout term   Genotype probability
11,11      $(1 - d')$     $p_{11}^2$
11,Q       $(1 - d)\,d$   $2p_{11}p_Q$
$LR = \frac{d(1 - d)}{(1 - d')\,p_{11}^2 + (1 - d)\,d \cdot 2p_{11}p_Q}$
LR vs. probability of dropout
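The single-source LR with dropout can be evaluated over a range of dropout probabilities with a short sketch. The slides do not say how $d'$ relates to $d$; the default `d_hom = d**2` below is an assumption made only so the example runs with a single parameter, and the allele frequency is hypothetical.

```python
def lr_single_source(d, p11, d_hom=None):
    """LR for evidence {11} and suspect 9,11 under the drop-out model.

    d     : dropout probability for a heterozygote allele
    d_hom : dropout probability for a homozygote (d' on the slides);
            ASSUMPTION: defaults to d**2, which the slides do not specify
    """
    if d_hom is None:
        d_hom = d ** 2
    p_q = 1.0 - p11                      # Q = any allele except 11
    numerator = d * (1 - d)              # 9 dropped out, 11 did not
    denominator = (1 - d_hom) * p11 ** 2 + (1 - d) * d * 2 * p11 * p_q
    return numerator / denominator


# Hypothetical frequency of allele 11, for illustration only.
for d in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"d = {d:.1f}  LR = {lr_single_source(d, p11=0.2):.2f}")
```

Passing `d_hom` explicitly lets the homozygote dropout probability be set independently of `d`.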
Low-template mixture
• Low-template DNA mixture
Two-person mixture example: one dropout, no drop-in
Locus 1
Evidence  9,10,12
Suspect   9,11
Victim    10,12
• Hp: Suspect + Victim
• Hd: Two unknowns (unrelated to suspect/victim)
Under Hp: Dropout from the suspect
Contributor  Genotype  Probability
Suspect      9,11      $d(1 - d)$ (allele 11 dropped out, allele 9 did not)
Victim       10,12     $(1 - d)^2$ (no dropout)
$\Pr(\text{Evidence} \mid H_p) = d(1 - d)^3$
Under Hd: dropout is possible
• Hd: two unknowns
• Dropout is possible: Q allele, can be anything except 9, 10, 12
Case          Unknown 1   Unknown 2
No dropout    9,9         10,12
No dropout    10,10       9,12
No dropout    12,12       9,10
No dropout    9,12        9,10
No dropout    9,12        10,12
No dropout    10,12       9,10
One dropout   9,Q         10,12
One dropout   10,Q        9,12
One dropout   12,Q        9,10
Under Hd: dropout is possible
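• Hd: two unknowns
• Dropout is possible: Q allele, can be anything except 9, 10, 12
Unknown 1   Unknown 2   Dropout term            Genotype probability
9,9         10,12       $(1 - d')(1 - d)^2$     $p_9^2 \times 2p_{10}p_{12}$
10,10       9,12        $(1 - d')(1 - d)^2$     $p_{10}^2 \times 2p_9p_{12}$
12,12       9,10        $(1 - d')(1 - d)^2$     $p_{12}^2 \times 2p_9p_{10}$
9,12        9,10        $(1 - d)^4$             $2p_9p_{12} \times 2p_9p_{10}$
9,12        10,12       $(1 - d)^4$             $2p_9p_{12} \times 2p_{10}p_{12}$
10,12       9,10        $(1 - d)^4$             $2p_{10}p_{12} \times 2p_9p_{10}$
9,Q         10,12       $d(1 - d)^3$            $2p_9p_Q \times 2p_{10}p_{12}$
10,Q        9,12        $d(1 - d)^3$            $2p_{10}p_Q \times 2p_9p_{12}$
12,Q        9,10        $d(1 - d)^3$            $2p_{12}p_Q \times 2p_9p_{10}$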
Likelihood ratio
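The slide's formula did not survive, so the following is only a sketch of how the LR could be assembled from the Hp term and the Hd table above. Two points are assumptions rather than statements from the slides: the sum over the nine table rows is doubled to account for the assignment of genotypes to Unknown 1 and Unknown 2 (as in the no-dropout two-unknown formula), and the allele frequencies are hypothetical.

```python
def het(pa, pb):
    """Hardy-Weinberg probability of a heterozygote."""
    return 2 * pa * pb


def pr_evidence_hp(d):
    """Suspect 9,11 loses allele 11; victim 10,12 is fully observed."""
    return d * (1 - d) ** 3


def pr_evidence_hd(d, d_hom, p9, p10, p12):
    """Sum of the table rows, each weighted by its dropout term."""
    p_q = 1.0 - (p9 + p10 + p12)         # Q = anything except 9, 10, 12
    hom_rows = (p9 ** 2 * het(p10, p12) + p10 ** 2 * het(p9, p12)
                + p12 ** 2 * het(p9, p10))
    het_rows = (het(p9, p12) * het(p9, p10) + het(p9, p12) * het(p10, p12)
                + het(p10, p12) * het(p9, p10))
    q_rows = (het(p9, p_q) * het(p10, p12) + het(p10, p_q) * het(p9, p12)
              + het(p12, p_q) * het(p9, p10))
    total = ((1 - d_hom) * (1 - d) ** 2 * hom_rows
             + (1 - d) ** 4 * het_rows
             + d * (1 - d) ** 3 * q_rows)
    return 2 * total                     # ASSUMPTION: two orderings of the unknowns


# Hypothetical inputs, for illustration only.
d, d_hom = 0.5, 0.25
lr = pr_evidence_hp(d) / pr_evidence_hd(d, d_hom, p9=0.1, p10=0.2, p12=0.3)
print(f"LR = {lr:.2f}")
```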
LR vs. dropout probability
[Figure: LR vs. drop-out. x-axis: d, from 0.0 to 1.0; y-axis: LR, from 5 to 30.]
How about drop-in probability?
Under Hp: Dropout from the suspect
Contributor  Genotype  Probability
Suspect      9,11      $d(1 - d)$
Victim       10,12     $(1 - d)^2$
• If drop-in = 0: $\Pr(\text{Evidence} \mid H_p) = d(1 - d)^3$
• If drop-in $\neq$ 0: $\Pr(\text{Evidence} \mid H_p) = d(1 - d)^3 \times (1 - c)$
• $c$ is the probability of drop-in
Under Hd: two unknowns
• Dropout is possible, no drop-in: Q allele, can be anything except 9, 10, 12
• If drop-in is possible: Q allele can be anything!
• So the genotypes of the unknowns no longer have to explain alleles 9, 10, 12.
• This increases the number of terms under Hd
Think of drop-in as a scaling factor
• If an allele is a drop-in: multiply by $c \times$ frequency of allele $i$.
• If an allele is not a drop-in: multiply by $(1 - c)$
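The scaling rule can be sketched as a small helper. This follows one reading of the two bullets and of the Hp formula above: a single factor $(1 - c)$ per locus when no allele is explained by drop-in, and a factor $c \times p_i$ for each allele that is. The function name and the numbers are hypothetical.

```python
def dropin_factor(dropin_alleles, freqs, c):
    """Scaling factor for one locus under one genotype hypothesis.

    dropin_alleles : observed alleles not explained by the contributors
    freqs          : allele frequencies, keyed by allele
    c              : probability of drop-in
    """
    if not dropin_alleles:
        return 1 - c                     # no drop-in at this locus
    factor = 1.0
    for allele in dropin_alleles:
        factor *= c * freqs[allele]      # each drop-in allele: c * p_i
    return factor


# Hp from the mixture example: every observed allele is explained, so no drop-in.
d, c = 0.5, 0.05                         # hypothetical values
pr_hp = d * (1 - d) ** 3 * dropin_factor([], {}, c)
print(f"Pr(E|Hp) = {pr_hp:.6f}")
```

Under Hd the same helper would be applied row by row, with `dropin_alleles` listing whichever of 9, 10 and 12 the hypothesised genotypes leave unexplained.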
LR vs. dropout and drop-in probability
[Figure: LR vs. drop-out, one curve each for drop-in = 0, 0.01 and 0.05. x-axis: d, from 0.0 to 1.0; y-axis: LR, from 5 to 30.]
Summary
• Derive the possible genotypes for the unknowns
• Determine the genotypic probabilities
• Sum up the probabilities for all plausible genotypes
• Determine the corresponding dropout probabilities
• Calculate the ratio of the probabilities under Hp and under Hd
Software
• Deriving the genotypes of the unknowns is the key issue
• Assign a genotype probability to each genotype
• The number of possibilities increases with the number of contributors: deriving LRs for mixtures by hand is not realistic!
Casework example 1: A 3-person mixture
• Victim is major contributor
• At least two minor contributors
Casework example 1: A 3-person mixture
• Hp: Victim + Suspect + Unknown
• Hd: Victim + two unknowns
Sensitivity analysis: Overall LR
Same dropout probability for all contributors
[Figure: Overall LR for the 10 SGM+ loci. x-axis: d, from 0.01 to 0.99; y-axis: log10 LR, from 7.5 to 11.0.]
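A sensitivity analysis of this kind amounts to evaluating the LR over a range of dropout probabilities and reporting how far log10 LR moves. The sketch below is generic: `lr_func` is any function of d, and the toy LR used here is the single-locus, single-source formula with the assumption $d' = d^2$ and a hypothetical allele frequency. It is not the casework model behind the figure.

```python
import math


def log10_lr_range(lr_func, d_low, d_high, steps=50):
    """Return (min, max) of log10 LR over an evenly spaced grid of d values."""
    ds = [d_low + (d_high - d_low) * i / steps for i in range(steps + 1)]
    logs = [math.log10(lr_func(d)) for d in ds]
    return min(logs), max(logs)


def toy_lr(d, p11=0.2):
    """Toy single-locus LR; ASSUMES d' = d**2 and a hypothetical p11."""
    p_q = 1 - p11
    return d * (1 - d) / ((1 - d ** 2) * p11 ** 2 + (1 - d) * d * 2 * p11 * p_q)


low, high = log10_lr_range(toy_lr, 0.01, 0.99)
print(f"log10 LR ranges from {low:.2f} to {high:.2f} for 0.01 <= d <= 0.99")
```

A narrow range means the conclusion is robust to the choice of d; a wide range means d needs to be informed more carefully.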
Sensitivity analysis: Overall LR
Average probability vs. splitting dropout per contributor ⇒ no significant differences between the models!
[Figure: Overall LR for the 10 SGM+ loci, Basic model vs. SplitDrop model. x-axis: d, from 0.01 to 0.99; y-axis: log10 LR, from 7.5 to 11.0.]
Plausible ranges for PrD?
LR                    Dropout
$\le 10^{10}$         $0.01 \le D \le 0.50$
$[10^9, 10^8]$        $0.50 < D \le 0.99$
[Figure: Overall LR for the 10 SGM+ loci. x-axis: d, from 0.01 to 0.99; y-axis: log10 LR, from 7.5 to 11.0.]
Casework example 2: two-person mixture
Zone   LR                     Dropout
(1)    $[10^{10}, 10^9]$      $0 \le D \le 0.50$
(2)    $[10^9, 10^6]$         $0.50 < D \le 0.76$
(3)    $[10^6, 10^4]$         $0.76 < D \le 0.84$
(4)    $[10^4, 1]$            $D > 0.84$
[Figure: log10 LR (0 to 10) vs. probability of dropout, with zones (1) to (4) marked; x-axis ticks at 0.01, 0.50, 0.76, 0.93.]
Casework example 3: three-person mixture
Zone   LR                     Dropout
(1)    $[10^{14}, 10^9]$      $0 \le D \le 0.08$
(2)    $[10^9, 10^6]$         $0.08 < D \le 0.53$
(3)    $[10^6, 10^4]$         $0.53 < D \le 0.75$
(4)    $[10^4, 100]$          $0.75 < D \le 0.86$
(5)    $[100, 1]$             $0.86 < D \le 0.93$
[Figure: log10 LR (0 to 15) vs. probability of dropout, with zones (1) to (5) marked; x-axis ticks at 0.08, 0.53, 0.75, 0.86.]
All models are wrong...
• Continuous models are expected to extract more information from the data, but their implementation is difficult and tedious in practice
• Semi-continuous methods are easier to implement and can serve as a good approximation
How to inform dropout probabilities?
• Estimate dropout probabilities via logistic regression
  - difficult to extend to > 2-person mixtures
• Define plausible ranges of dropout
  - based on expert belief
  - based on maximum likelihood principle
• Bayesian approach: combine prior belief and likelihood to yield a posterior distribution
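The maximum likelihood and Bayesian options can be illustrated on a grid of dropout values. This is a toy sketch only: it uses the single-locus likelihood $d(1 - d)^3$ from the mixture example under Hp, a uniform prior, and hypothetical function names. A real analysis would use the likelihood of the whole profile.

```python
def ml_dropout(likelihood, grid_size=999):
    """Grid search for the d in (0, 1) that maximises likelihood(d)."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return max(grid, key=likelihood)


def posterior_mean(likelihood, prior=lambda d: 1.0, grid_size=999):
    """Posterior mean of d on a grid; posterior is proportional to prior * likelihood."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    weights = [prior(d) * likelihood(d) for d in grid]
    return sum(d * w for d, w in zip(grid, weights)) / sum(weights)


def likelihood_hp(d):
    """Pr(Evidence | Hp) from the mixture example: one dropout, three non-dropouts."""
    return d * (1 - d) ** 3


print(f"ML estimate of d: {ml_dropout(likelihood_hp):.3f}")
print(f"Posterior mean of d (uniform prior): {posterior_mean(likelihood_hp):.3f}")
```

For this toy likelihood the maximum can be checked by hand: setting the derivative $(1 - d)^2 (1 - 4d)$ to zero gives $d = 1/4$. With a uniform prior the posterior is a Beta(2, 4) distribution, whose mean is $1/3$.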