The statistical evaluation of DNA crime stains in R

Similar documents
THE RARITY OF DNA PROFILES 1. BY BRUCE S. WEIR University of Washington

arxiv: v1 [stat.ap] 7 Dec 2007

Stochastic Models for Low Level DNA Mixtures

Interpreting DNA Mixtures in Structured Populations

Breeding Values and Inbreeding. Breeding Values and Inbreeding

Workshop: Kinship analysis First lecture: Basics: Genetics, weight of evidence. I.1

Examensarbete. Rarities of genotype profiles in a normal Swedish population. Ronny Hedell

Forensic Genetics. Summer Institute in Statistical Genetics July 26-28, 2017 University of Washington. John Buckleton:

Statistical Methods and Software for Forensic Genetics. Lecture I.1: Basics

LECTURE # How does one test whether a population is in the HW equilibrium? (i) try the following example: Genotype Observed AA 50 Aa 0 aa 50

Relatedness 1: IBD and coefficients of relatedness

Lecture 9. QTL Mapping 2: Outbred Populations

The Lander-Green Algorithm. Biostatistics 666 Lecture 22

1 Introduction. Probabilistic evaluation of low-quality DNA profiles. K. Ryan, D. Gareth Williams a, David J. Balding b,c

Population Structure

Modeling IBD for Pairs of Relatives. Biostatistics 666 Lecture 17

Abstract. BEECHAM Jr., GARY WAYNE. Statistical methods for the Analysis of Forensic DNA Mixtures. (Under the direction of Bruce S. Weir.

Population genetics snippets for genepop

Covariance between relatives

A consideration of the chi-square test of Hardy-Weinberg equilibrium in a non-multinomial situation

Adventures in Forensic DNA: Cold Hits, Familial Searches, and Mixtures

Reporting LRs - principles

STRmix - MCMC Study Page 1

Problems for 3505 (2011)

2. Map genetic distance between markers

The problem of Y-STR rare haplotype

19. Genetic Drift. The biological context. There are four basic consequences of genetic drift:

Identification of Pedigree Relationship from Genome Sharing

Y-STR: Haplotype Frequency Estimation and Evidence Calculation

Affected Sibling Pairs. Biostatistics 666

Notes on Population Genetics

Bayesian Updating: Odds Class 12, Jeremy Orloff and Jonathan Bloom

Lecture 13: Population Structure. October 8, 2012

8. Genetic Diversity

Mechanisms of Evolution

Lecture 2: Introduction to Quantitative Genetics

Introductory seminar on mathematical population genetics

EXERCISES FOR CHAPTER 3. Exercise 3.2. Why is the random mating theorem so important?

Major Genes, Polygenes, and

Forensic Genetics Module. Evaluating DNA evidence

Characterization of Error Tradeoffs in Human Identity Comparisons: Determining a Complexity Threshold for DNA Mixture Interpretation

Chem 4331 Name : Final Exam 2008

Match probabilities in a finite, subdivided population

Calculation of IBD probabilities

arxiv: v3 [stat.ap] 4 Nov 2014

Statistical Genetics I: STAT/BIOST 550 Spring Quarter, 2014

Case Studies in Ecology and Evolution

STAT 536: Migration. Karin S. Dorman. October 3, Department of Statistics Iowa State University

... x. Variance NORMAL DISTRIBUTIONS OF PHENOTYPES. Mice. Fruit Flies CHARACTERIZING A NORMAL DISTRIBUTION MEAN VARIANCE

Variance Component Models for Quantitative Traits. Biostatistics 666

GBLUP and G matrices 1

Bayesian analysis of the Hardy-Weinberg equilibrium model

A gamma model for DNA mixture analyses

1. What is the definition of Evolution? a. Descent with modification b. Changes in the heritable traits present in a population over time c.

DNA commission of the International Society of Forensic Genetics: Recommendations on the interpretation of mixtures

1.5.1 ESTIMATION OF HAPLOTYPE FREQUENCIES:

Evaluation of the STR typing kit PowerPlexk 16 with respect to technical performance and population genetics: a multicenter study

Analyzing the genetic structure of populations: a Bayesian approach

Inference on pedigree structure from genome screen data. Running title: Inference on pedigree structure. Mary Sara McPeek. The University of Chicago

2018 SISG Module 20: Bayesian Statistics for Genetics Lecture 2: Review of Probability and Bayes Theorem

Outline of lectures 3-6

EXERCISES FOR CHAPTER 7. Exercise 7.1. Derive the two scales of relation for each of the two following recurrent series:

Outline of lectures 3-6

DNA Evidence in the Ferguson Grand Jury Proceedings

Population Genetics. with implications for Linkage Disequilibrium. Chiara Sabatti, Human Genetics 6357a Gonda

A paradigm shift in DNA interpretation John Buckleton

Question: If mating occurs at random in the population, what will the frequencies of A 1 and A 2 be in the next generation?

The E-M Algorithm in Genetics. Biostatistics 666 Lecture 8

Optimal Allele-Sharing Statistics for Genetic Mapping Using Affected Relatives

Bayesian Regression (1/31/13)

A maximum likelihood method to correct for allelic dropout in microsatellite data with no replicate genotypes

Appendix A Evolutionary and Genetics Principles

Methods for Cryptic Structure. Methods for Cryptic Structure

Processes of Evolution

Match Likelihood Ratio for Uncertain Genotypes

Alignment Algorithms. Alignment Algorithms

ABSTRACT. ZAYKIN, DMITRI V. Statistical Analysis of Genetic Associations (Advisor: Bruce S. Weir)

Goodness of Fit Goodness of fit - 2 classes

The Wright-Fisher Model and Genetic Drift

Outline for today s lecture (Ch. 14, Part I)

Demography April 10, 2015

Lecture 6. QTL Mapping

Calculation of IBD probabilities

STAT 536: Genetic Statistics

Estimation of Parameters in Random. Effect Models with Incidence Matrix. Uncertainty

Genotype Imputation. Biostatistics 666

arxiv: v1 [q-bio.qm] 26 Feb 2016

Lecture 1 Hardy-Weinberg equilibrium and key forces affecting gene frequency

Selection Page 1 sur 11. Atlas of Genetics and Cytogenetics in Oncology and Haematology SELECTION

Observation: we continue to observe large amounts of genetic variation in natural populations

SOLUTIONS TO EXERCISES FOR CHAPTER 9

Homework Assignment, Evolutionary Systems Biology, Spring Homework Part I: Phylogenetics:

Big Idea #1: The process of evolution drives the diversity and unity of life

Mathematical Population Genetics II

Applications of Mathematics

Outline of lectures 3-6

Module 4: Bayesian Methods Lecture 9 A: Default prior selection. Outline

. Also, in this case, p i = N1 ) T, (2) where. I γ C N(N 2 2 F + N1 2 Q)

Lecture 1: Case-Control Association Testing. Summer Institute in Statistical Genetics 2015

Bootstrap Procedures for Testing Homogeneity Hypotheses

Transcription:

The statistical evaluation of DNA crime stains in R Miriam Marušiaková Department of Statistics, Charles University, Prague Institute of Computer Science, CBI, Academy of Sciences of CR UseR! 2008, Dortmund Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 1 / 36

Introduction Single Crime Scene Relatedness R package forensic Mixed stain People versus Simpson More general situation Conclusion Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 2 / 36

Introduction Outline Introduction Single Crime Scene Relatedness R package forensic Mixed stain People versus Simpson More general situation Conclusion Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 3 / 36

Introduction Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 4 / 36

Introduction Single crime scene stain Blood stain at the crime scene Believed it was left by offender Suspect arrested for reasons unconnected with his DNA profile Crime sample, suspect sample Hypothesis H p (prosecution): The suspect left the crime stain. H d (defense): Some other person left the crime stain. Notation DNA typing results E = {G C, G S } non-dna evidence I Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 5 / 36

Introduction Evidence interpretation Prior odds Posterior odds Bayes theorem Pr(H p I) Pr(H d I) Pr(H p E, I) Pr(H d E, I) Pr(H p E, I) Pr(H d E, I) = Pr(E H p, I) Pr(H p I) Pr(E H d, I) Pr(H }{{} d I) LR Balding and Donelly (1995), Robertson and Vignaux (1995) Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 6 / 36

Introduction Evidence interpretation LR = Pr(E H p, I) Pr(E H d, I) = Pr(G S, G C H p, I) Pr(G S, G C H d, I) = Pr(G C G S, H p, I) Pr(G C G S, H d, I) Pr(G S H p, I) Pr(G S H d, I) = Pr(G C G S, H p, I) Pr(G C G S, H d, I) 1 = 1 Pr(G C G S, H d, I) = 1 Pr(G C H d, I) if independence assumed Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 7 / 36

Introduction Errors and fallacies Statement The probability of observing this type if the blood came from someone other than suspect is 1 in 100. Pr(G C H d, I) = 1/100 Common error The probability that the blood came from someone else is 1 in 100. Pr(H d G C, I) = 1/100 There is a 99% probability that it came from the suspect. Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 8 / 36

Introduction Errors and fallacies (cont. d) Statement The evidence is 100 times more probable if the suspect left the crime stain than if some unknown person left it. 1 Pr(G C H d, I) = 100 Common error It is 100 times more probable that the suspect left the crime stain than some unknown person. Pr(H p G C, I) Pr(H d G C, I) = 100 Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 9 / 36

Single Crime Scene Outline Introduction Single Crime Scene Relatedness R package forensic Mixed stain People versus Simpson More general situation Conclusion Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 10 / 36

Single Crime Scene Outline Introduction Single Crime Scene Relatedness R package forensic Mixed stain People versus Simpson More general situation Conclusion Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 11 / 36

Single Crime Scene Product rule Model of an ideal population Infinite size Random mating Model reliable for most real-world problems Hardy-Weinberg equilibrium Likelihood ratio LR = Pr(G = A i A i ) = p 2 i Pr(G = A i A j ) = 2p i p j, i j 1 Pr(G C G S, H d, I) = 1 Pr(G C H d, I) = 1 p 2 i ( 1 2p i p j ) Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 12 / 36

Single Crime Scene Outline Introduction Single Crime Scene Relatedness R package forensic Mixed stain People versus Simpson More general situation Conclusion Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 13 / 36

Single Crime Scene F - measure of uncertainty about allele proportions in the population of suspects Genetic interpretation of F (Wright, 1951) How to estimate F (Weir and Cockerham, 1984) Recommendations (National Research Council, 1996) F = 0.01 large subpopulations (USA) F = 0.03 small isolated subpopulations Match probabilities (Balding and Nichols, 1994) P(G C = A i A i G S = A i A i ) = [2F + (1 F)p i] [3F + (1 F)p i ] (1 + F)(1 + 2F) P(G C = A i A j G S = A i A j ) = 2 [F + (1 F)p i] [F + (1 F)p j ] (1 + F)(1 + 2F) Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 14 / 36

Single Crime Scene Effects of F corrections Likelihood ratio - some numerical values Heterozygotes A i A j, p i = p j = p F = 0 F = 0.001 F = 0.01 F = 0.03 p = 0.01 5 000 4 152 1 295 346 p = 0.05 200 193 145 89 p = 0.10 50 49 43 34 Homozygotes A i A i, p i = p F = 0 F = 0.001 F = 0.01 F = 0.03 p = 0.01 10 000 6 439 863 157 p = 0.05 400 364 186 73 p = 0.10 100 96 67 37 Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 15 / 36

Single Crime Scene Relatedness Outline Introduction Single Crime Scene Relatedness R package forensic Mixed stain People versus Simpson More general situation Conclusion Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 16 / 36

Single Crime Scene Relatedness Inbreeding Individuals with common ancestors - related Their children - inbred Alleles are ibd (identical by descent) - copies of the same allele Example Alleles h 1, h 2 transmitted from parent H to X, Y who transmit a, b to I h 1 X a H h 2 Y b I Pr(h 1 is ibd to h 2 ) = Pr(h 1 h 2 ) = 0.5 Pr(a b) = Pr(a h 1, b h 2 h 1 h 2 )Pr(h 1 h 2 ) = 0.5 3 = 0.125 Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 17 / 36

Single Crime Scene Relatedness Match probabilities for close relatives Balding and Nichols (1994) { ( k0 p 4 i + k 1 p 3 i + k 2 p 2 ) i /p 2 i Pr(G C G S, H d, I) = ) (4k 0 p 2 i p2 j + k 1 p i p j (p i + p j ) + 2k 2 p i p j /2p i p j Kinship coefficients Relationship k 0 k 1 k 2 Parent - child 0 1 0 Siblings 1/4 1/2 1/4 Grandparent - grandchild 1/2 1/2 0 Uncle - nephew 1/2 1/2 0 Cousins 3/4 1/4 0 Unrelated 1 0 0 k i - probability that two persons will share i alleles ibd i = 0, 1, 2 Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 18 / 36

Single Crime Scene R package forensic Outline Introduction Single Crime Scene Relatedness R package forensic Mixed stain People versus Simpson More general situation Conclusion Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 19 / 36

Single Crime Scene R package forensic R function Pmatch Usage Pmatch(prob, k = c(1, 0, 0), theta = 0) Example > p <- c(0.057, 0.160, 0.182, 0, 0.024, 0.122) > Pmatch(p) $prob locus 1 locus 2 locus 3 allele 1 0.057 0.182 0.024 allele 2 0.160 0.000 0.122 $match locus 1 locus 2 locus 3 [1,] 0.01824 0.033124 0.005856 $total_match [1] 3.538088e-06 Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 20 / 36

Mixed stain Outline Introduction Single Crime Scene Relatedness R package forensic Mixed stain People versus Simpson More general situation Conclusion Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 21 / 36

Mixed stain Mixtures Prosecution and defense hypothesis H p : Contributors were the victim and the suspect. H d : Contributors were the victim and an unknown person. Likelihood ratio for the mixture LR = Pr(E C, G V, G S H p, I) Pr(E C, G V, G S H d, I) = Pr(E C G V, G S, H p, I) Pr(E C G V, G S, H d, I) Pr(G V, G S H p, I) Pr(G V, G S H d, I) = Pr(E C G V, G S, H p, I) Pr(E C G V, G S, H d, I) = 1 Pr(E C G V, G S, H d, I) 1 = (if independence assumed) Pr(E C G V, H d, I) Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 22 / 36

Mixed stain Outline Introduction Single Crime Scene Relatedness R package forensic Mixed stain People versus Simpson More general situation Conclusion Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 23 / 36

Mixed stain Match probability (independence assumption) Weir et al (1997) P x (U, C) = (T 0 ) 2x i U (T 1i ) 2x + T 0 = l C p l T 1i = l C\{i} p l, i U T 2ij = l C\{i,j} p l, i, j U, i < j i,j U:i<j (T 2ij ) 2x... U - set of alleles from the crime sample C not carried by known contributors x - number of unknown contributors R Pevid.ind(alleles, prob, x, u = NULL) LR.ind(alleles, prob, x1, x2, u1 = NULL, u2 = NULL) Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 24 / 36

Mixed stain Outline Introduction Single Crime Scene Relatedness R package forensic Mixed stain People versus Simpson More general situation Conclusion Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 25 / 36

Mixed stain Match probability for structured population Assumption All the involved people come from the same subpopulation with parameter F Fung and Hu (2000), Zoubková and Zvárová (2004) MP = r r r 1 r 1 =0 r 2 =0 r r 1 r c 2 r c 1 =0 ti +u i +v i 1 i=1 j=t i +v i (2n U )! c [(1 F)p i + jf] c i=1 u i! 2n T +2n U +2n V 1 j=2n T +2n V [(1 F) + jf] R Pevid.gen(alleles, prob, x, T = NULL, V = NULL, theta = 0) T, V - genotypes of known contributors, known non-contributors Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 26 / 36

Mixed stain People versus Simpson Outline Introduction Single Crime Scene Relatedness R package forensic Mixed stain People versus Simpson More general situation Conclusion Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 27 / 36

Mixed stain People versus Simpson People versus Simpson (Los Angeles County Case) DNA evidence Material at the crime scene - alleles a, b, c (locus D2S44) Suspect s genotype ab Victim s genotype ac Hypotheses (Weir et al, 1997, Fung and Hu, 2000) H p : Contributors were the victim, suspect and m unknowns. H d : Contributors were n unknowns. R a = c( a, b, c ) p = c(0.0316, 0.0842, 0.0926) suspect <- a/b, victim <- a/c Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 28 / 36

Mixed stain People versus Simpson Likelihood ratios for the Simpson case Defense n = 2 n = 3 Prosecution F = 0 0.01 0.03 F = 0 0.01 0.03 m = 0 1623 739 276 21606 5853 1150 m = 1 70 44 26 938 345 107 Prosecution proposition (F = 0) Pevid.ind(alleles = a, prob = p, x = m) Pevid.gen(alleles = a, prob = p, x = m, T = c(victim, suspect)) Defense proposition (F = 0) Pevid.ind(alleles = a, prob = p, x = n, u = c( a, b, c )) Pevid.gen(alleles = a, prob = p, x = n) Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 29 / 36

Mixed stain More general situation Outline Introduction Single Crime Scene Relatedness R package forensic Mixed stain People versus Simpson More general situation Conclusion Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 30 / 36

Mixed stain More general situation More general situation Contributors from different ethnic groups Fung and Hu (2001) Pevid.ind, LR.ind (independence within and between ethnic groups) Presence of related people Hu and Fung (2003) Example Suspect not typed, his relative is tested Two related people among unknown contributors Pevid.rel Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 31 / 36

Conclusion Outline Introduction Single Crime Scene Relatedness R package forensic Mixed stain People versus Simpson More general situation Conclusion Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 32 / 36

Conclusion Conclusion Single crime stain Suspect and offender are unrelated are members of the same subpopulation are close relatives Pmatch Mixed crime stain Contributors unrelated members of the same subpopulation may be related from different ethnic groups Pevid.ind, LR.ind Pevid.gen Pevid.rel Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 33 / 36

Conclusion Thank you for your attention! Acknowledgement The work was supported by GAČR 201/05/H007 and the project 1M06014 of the Ministry of Education, Youth and Sports of the Czech Republic. Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 34 / 36

Important publications BALDING AND NICHOLS(1994) DNA profile match probability calculation Forensic Science International 64, p. 125-140 WEIR ET AL(1997) Interpreting DNA mixtures Journal of Forensic Sciences 42, p. 213-222 FUNG AND HU(2000) Interpreting forensic DNA mixtures: allowing for uncertainty in population substructure and dependence Journal of Royal Statistical Society A 163, p. 241-254 ZOUBKOVÁ, SUPERVISOR ZVÁROVÁ (2004) Master thesis (in Czech) Charles University, Prague Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 35 / 36

Software R DEVELOPMENT CORE TEAM (2006) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria ISBN 3-900051-07-0, URL http://www.r-project.org MARUŠIAKOVÁ, M (2007) forensic: Statistical methods in forensic genetics R package version 0.2 Miriam Marušiaková (Prague) The statistical evaluation of DNA crime stains UseR! 2008, August 12-14 36 / 36