
Am. J. Hum. Genet. 43:197-205, 1988

An Expository Review of Two Methods of Calculating the Paternity Probability

C. C. Li and A. Chakravarti

Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh

Received July 16, 1987; revision received March 15, 1988.
Address for correspondence and reprints: Dr. C. C. Li, Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA 15261.
© 1988 by The American Society of Human Genetics. All rights reserved. 0002-9297/88/4302-0014$02.00

Summary

There are two methods for calculating the posttests probability of paternity, viz., the nonexclusion probability method (E method) and the paternity index method ($\lambda$ method). This report reviews these two methods and explains the reasons behind them, in the hope that it might alleviate the current controversy between the advocates of these two methods. The emphasis throughout the paper is on exposition, using simple examples to illustrate certain principles or properties. A discussion follows the presentation of the two methods. The calculation of the paternity index is based on the genotype (or phenotype) of the accused man; and the value of the paternity index remains the same whether the accusation itself is true or false.

Introduction

The present communication is not a review of the literature concerning paternity tests but an exposition of the reasons and procedures adopted by the two methods of calculating the posterior probability of paternity after a number of genetic tests, all resulting in nonexclusions. Only the literature concerning the current controversy is cited, plus two or three references on basic methodology.

When the mother of a child accuses a certain man of being the father of her child, the purpose of conducting genetic marker tests on the trio (mother, child, and the accused man) is to ascertain whether the accusation is true or false. This will be our central topic in the present communication. Since both methods use Bayes theorem on posterior probabilities, we shall begin the exposition by reviewing Bayes theorem briefly.

1. Bayes Theorem

This theorem deals with the relationship between the initial (prior) probabilities before certain events have occurred and the modified (posterior) probabilities after a particular event has occurred. The essence of Bayes theorem is shown in table 1, in which the $Y$'s denote the initial probabilities of the three states. The conditional probabilities of the event are denoted by $\gamma_i = \text{Prob}(\text{event} \mid \text{state } i)$. The sum of the joint probabilities is denoted by $\bar{\gamma} = \sum_i \gamma_i Y_i$ and is the unconditional probability of the event. Dividing each $Y_i\gamma_i$ by their sum $\bar{\gamma}$, we obtain the posterior probabilities $X_i$, which add up to unity. The relationship between $X$ and $Y$ is indicated in the last column of table 1, which is Bayes theorem. Obviously, the table may be extended to any number of states.

Table 1
Bayes Theorem Linking Initial (Prior) Probabilities of States to Modified (Posterior) Probabilities after an Event Has Occurred and Become Known

| State of Nature ($i$) | Prior Probability of State ($Y_i$) | Conditional Probability of Event ($\gamma_i$) | Joint Probability: Prior × Conditional ($Y_i\gamma_i$) | Posterior Probability of State ($X_i$) |
|---|---|---|---|---|
| 1 | $Y_1$ | $\gamma_1$ | $Y_1\gamma_1$ | $X_1 = Y_1\gamma_1/\bar{\gamma}$ |
| 2 | $Y_2$ | $\gamma_2$ | $Y_2\gamma_2$ | $X_2 = Y_2\gamma_2/\bar{\gamma}$ |
| 3 | $Y_3$ | $\gamma_3$ | $Y_3\gamma_3$ | $X_3 = Y_3\gamma_3/\bar{\gamma}$ |
| Total | 1.00 | | $\bar{\gamma}$ | 1.00 |

To facilitate subsequent generalization of Bayes theorem to cover more than one event, we shall introduce column vectors and use Smith's (1976) notation $\wedge$ for element-by-element multiplication of two column vectors. Thus, the calculations shown in table 1 may be summarized as follows:

$$\begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix} \wedge \begin{pmatrix} \gamma_1 \\ \gamma_2 \\ \gamma_3 \end{pmatrix} = \begin{pmatrix} Y_1\gamma_1 \\ Y_2\gamma_2 \\ Y_3\gamma_3 \end{pmatrix} \rightarrow \begin{pmatrix} X_1 \\ X_2 \\ X_3 \end{pmatrix} \qquad (1)$$

or

$$Y \wedge \gamma \rightarrow X \qquad (1')$$

where the arrow denotes the "normalization" process, i.e., dividing each element by the sum of the elements so that $\sum_i X_i = 1$. Expression (1) is a convenient form for us to use in studying the properties of $X$ and $Y$. We shall note a few obvious properties: (a) If all conditional probabilities $\gamma_i$ are the same, then $Y = X$, so that the prior and posterior probabilities are identical. (b) If the $Y$'s are multiplied by a positive constant, the posterior probabilities $X$ are unaltered because of the normalization process. (c) The same is true with the $\gamma$'s. Hence, for numerical convenience, we may multiply the $Y$'s and $\gamma$'s by any convenient positive constant. (d) If all states are initially equally likely ($Y_i$ constant for all $i$), then the $Y$ vector may be ignored and we need only to normalize the $\gamma$ vector to obtain $X$. And finally, (e) if $\gamma_i$ is the largest of all $\gamma$'s, then $X_i > Y_i$; if $\gamma_i$ is the smallest, then $X_i < Y_i$. The states with intermediate $\gamma$ values may increase or decrease or remain unchanged in frequency.

2. Multiple Independent Events

The generalization of Bayes theorem with the occurrence of several independent events is straightforward. Suppose $t$ independent events, all of which depend on the same states of nature, have occurred; each of these events has a $\gamma$ vector of conditional probabilities: $\gamma^{(j)}$ for the $j$th event. The conditional probabilities of the independent events may then be combined simply by element-by-element multiplication of the vectors $\gamma^{(j)}$, as shown by Smith (1976).
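The normalization step and the element-by-element combination of several events can be sketched in a few lines of Python. This is only an illustration: the function names `normalize` and `posterior` and the three-state numbers are assumptions made here, not part of the paper.

```python
from fractions import Fraction


def normalize(weights):
    """Divide each element by the total so the result sums to 1 (the 'arrow' step)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("all joint probabilities are zero; the event is impossible")
    return [w / total for w in weights]


def posterior(prior, *gammas):
    """Multiply the prior element by element with each gamma vector, then normalize once."""
    joint = list(prior)
    for gamma in gammas:
        if len(gamma) != len(joint):
            raise ValueError("every gamma vector needs one entry per state")
        joint = [y * g for y, g in zip(joint, gamma)]
    return normalize(joint)


# Hypothetical three-state example (numbers chosen only for illustration).
prior = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
gamma_1 = [Fraction(1), Fraction(1, 2), Fraction(0)]
gamma_2 = [Fraction(3, 4), Fraction(1, 4), Fraction(1, 2)]

x_single = posterior(prior, gamma_1)                         # one event
x_both = posterior(prior, gamma_1, gamma_2)                  # two events, one final normalization
x_stepwise = posterior(posterior(prior, gamma_1), gamma_2)   # normalizing after each event

print(x_single)
print(x_both == x_stepwise)
```

Exact fractions are used so that the equality between the one-shot and the stepwise calculation can be checked without rounding error.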
The modified (posterior) probabilities of the states of nature after the occurrence of the $t$ events are therefore given by

$$\begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix} \wedge \begin{pmatrix} \gamma_1 \\ \gamma_2 \\ \gamma_3 \end{pmatrix}^{(1)} \wedge \begin{pmatrix} \gamma_1 \\ \gamma_2 \\ \gamma_3 \end{pmatrix}^{(2)} \wedge \cdots \wedge \begin{pmatrix} \gamma_1 \\ \gamma_2 \\ \gamma_3 \end{pmatrix}^{(t)} \rightarrow \begin{pmatrix} X_1 \\ X_2 \\ X_3 \end{pmatrix} \qquad (2)$$

There is no need for the normalization process after each single event; one final normalization after all $t$ events will yield the correct answer $X$. Again, expression (2) is a convenient form for us to study the effect of the $t$ independent events. First, the ordering or sequence of the events is immaterial. Second, the $t$ independent events may be regarded as a single event with a single vector of conditional probabilities $\Gamma = \gamma^{(1)} \wedge \cdots \wedge \gamma^{(t)}$. In this form, expression (2) reduces to expression (1): $Y \wedge \Gamma \rightarrow X$.

In the following we shall give examples of application of Bayes theorem to genetical problems, first in genetic counseling, to pave the way for consideration of paternity problems.

3. Sex-linked Recessive Disease

Genetic counselors have long used Bayes theorem to calculate the risk for a child to have a genetic disease by calculating the probabilities of the parents' genotypes (Murphy and Mutalik 1969; Bolling et al. 1976). The sex-linked recessive diseases (e.g., hemophilia) provide a simple example, and it bears obvious resemblance to the paternity-testing situation to be described in the following section.

Consider a woman M who has an unaffected father and two affected (hemophilic) brothers. This shows that M's mother is a heterozygote (Aa, carrier), and the initial probabilities of the states M = AA and M = Aa are $(Y_1, Y_2) = (.50, .50)$ before she has any male offspring. In the event that M has one affected son, this is conclusive evidence that M is a carrier, and the probability of having another affected son in the future is 1/2. On the other hand, if M has a normal son, the evidence is inconclusive, as both M = AA and M = Aa can produce a normal son. In general, even if M has $t > 1$ normal sons in succession, this still does not prove M = AA; but the probability that it is so becomes high, because the probability for M = Aa to have $t$ normal sons is low.

The probability of the state M = AA may be calculated by Bayes theorem. Since $(Y_1, Y_2) = (.50, .50)$, it may be ignored in the calculations, as we have noted above. The conditional probability for state 1 (M = AA) to have a normal son is $\gamma_1 = 1$, and that for state 2 (M = Aa) to have a normal son is $\gamma_2 = 1/2$. This is true for every one of the sons. To obtain the posterior probabilities we need only to normalize the $\gamma$ vectors:

$$\begin{pmatrix} 1 \\ \tfrac{1}{2} \end{pmatrix}^{(1)} \wedge \cdots \wedge \begin{pmatrix} 1 \\ \tfrac{1}{2} \end{pmatrix}^{(t)} = \begin{pmatrix} 1 \\ (\tfrac{1}{2})^t \end{pmatrix} \rightarrow \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}, \quad \text{where } X_1 = X_1(t) = \frac{1}{1 + (\tfrac{1}{2})^t} = \frac{2^t}{2^t + 1} \qquad (3)$$

Thus, for $t = 0, 1, 2, 3, 4, \ldots$

$$X_1(t) = \tfrac{1}{2},\ \tfrac{2}{3},\ \tfrac{4}{5},\ \tfrac{8}{9},\ \tfrac{16}{17},\ \ldots \qquad (3')$$

Clearly, $X_1(t)$ is a monotonically increasing function of $t$; it is necessarily so from the Bayes theorem because $\gamma_1 = 1$ is greater than $\gamma_2 = 1/2$. In general, the state with the largest $\gamma_i$ will yield $X_i > Y_i$ for each event.

The result, expression (3'), may also be visualized as a process of successive elimination of carrier mothers at the birth of each son. For M = Aa, the birth of each son provides a probability of 1/2 of being identified as a carrier and eliminated from further consideration. Only those mothers who have all normal sons constitute a mixture of AA and Aa genotypes. The relative frequencies of these two genotypes in the mixture are shown in table 2, in which the initial numbers of AA and Aa are taken to be $Y_1 = Y_2 = 32$. The last row of table 2 gives the successive probabilities as shown in expression (3').

4. Paternity Probability Based on Nonexclusions

We shall first review the nonexclusion method of calculating the modified or posttests probability of paternity, as it follows directly from the previous three sections. Let M = mother, C = child, F = father (unknown), and G = the accused man. The purpose of conducting genetic marker tests is to ascertain whether the accusation is true (G = F) or false (G ≠ F).
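The carrier calculation of section 3 can be reproduced numerically. The sketch below is an illustration only; the function name `prob_noncarrier` and its `prior_noncarrier` parameter are names assumed here, and the default prior of 1/2 is the one stated for the woman M in the text.

```python
from fractions import Fraction


def prob_noncarrier(t, prior_noncarrier=Fraction(1, 2)):
    """P(M = AA | t normal sons and no affected son), normalizing once after t events."""
    if t < 0:
        raise ValueError("t must be non-negative")
    # gamma_1 = 1 for every normal son when M = AA
    joint_aa = prior_noncarrier * 1 ** t
    # gamma_2 = 1/2 for every normal son when M = Aa
    joint_carrier = (1 - prior_noncarrier) * Fraction(1, 2) ** t
    return joint_aa / (joint_aa + joint_carrier)


series = [prob_noncarrier(t) for t in range(5)]
print(series)
```

With the equal prior, each value agrees with the closed form $2^t/(2^t + 1)$ of expression (3), which the function never uses directly; it only multiplies and normalizes.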
Table 2
Successive Elimination of Carrier Mothers (M = Aa) at Birth of Each Son

| State of M | $t$ = 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| M = AA | 32 | 32 | 32 | 32 | 32 |
| M = Aa | 32 | 16 | 8 | 4 | 2 |
| (Aa eliminated) | | (16) | (8) | (4) | (2) |
| Total | 64 | 48 | 40 | 36 | 34 |
| $X_1$ = P(M = AA) | 1/2 | 2/3 | 4/5 | 8/9 | 16/17 |

NOTE: $t$ is the number of normal sons. At any birth period, half of the carrier mothers will be identified as carriers and eliminated from further consideration. Only mothers having all normal sons constitute a mixture of AA and Aa genotypes.

When the genotypes (or phenotypes, as the case may be) of the tested trio (M, C, G) are found to be incompatible with Mendelian laws of heredity, we conclude that the accusation is false, as G is "excluded" from paternity. The exclusion is conclusive evidence that the accusation is false, and G is exonerated. The conclusive evidence of exclusion is analogous to the birth of a hemophilic son in the example given in the previous section. On the other hand, if the genotypes or phenotypes of the tested trio (M, C, G) are compatible with Mendelian laws, the evidence is inconclusive; G may or may not be F, and we say that G is nonexcluded from paternity. The case of nonexclusion in paternity testing is analogous to the birth of a normal son, which is inconclusive evidence for discriminating between a homozygous normal mother and a carrier mother in the example of the previous section.

Each genetic locus has a certain "capability" of excluding falsely accused men, depending on the degree of polymorphism of the locus. In the simplest case of two codominant alleles of an autosomal locus, for example, the average exclusion probability of falsely accused men over all possible MC pairs in the population is $E = pq(1 - pq)$, where $(p, q)$ are the frequencies of the two alleles. The concrete meaning of $E$ may be illustrated by a numerical example with $(p, q) = (.95, .05)$ and $E = .045$.
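The exclusion capability of the two-allele codominant locus is a one-line formula. As a small sketch (the function name `exclusion_probability` is an assumption made here for illustration), it can be evaluated for the allele frequencies quoted in the text:

```python
def exclusion_probability(p):
    """Average exclusion probability E = pq(1 - pq) for two codominant alleles."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    q = 1.0 - p
    return p * q * (1.0 - p * q)


e = exclusion_probability(0.95)
print(round(e, 3))        # 0.045
print(round(1.0 - e, 3))  # 0.955, the share of falsely accused men left nonexcluded
```

Because the formula depends on $p$ and $q$ only through the product $pq$, swapping the two allele frequencies leaves $E$ unchanged.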
Suppose there are a large number of falsely accused men to be tested by this locus; then only 4.5% of them will be excluded and exonerated, and the remaining 95.5% are nonexcluded and inconclusive. They have to wait for further tests using other genetic loci. The exclusion capabilities of all the frequently used marker loci are known, but consideration of them is beyond the scope of this review. The general conclusion is that the higher the degree of polymorphism of a locus, the higher will be its exclusion capability.

In the case of sex-linked recessive disease, we used Bayes theorem to calculate the posterior probability only for those mothers having all normal sons. Analogously, in the case of paternity testing, we calculate the posterior probability only for those accused men nonexcluded by genetic tests. To use Bayes theorem, we require the prior probability of paternity before testing; we denote this prior probability by $\pi_0$, and let $\bar{\pi}_0 = 1 - \pi_0$. In words, $\pi_0$ is the probability of a father among accused men. The quantity $\pi_0$ is a parameter that may be estimated by the use of empirical data from long-term records of blood-typing laboratories (Chakravarti and Li 1984). We note that $\pi_0$ should be estimated only for a specific ethnic and social group, as it varies from group to group.

We next consider the conditional probability denoted by $\gamma$ in the previous sections. Let the true accusation be state 1 and the false accusation be state 2. If G = F, the accused man can never be excluded by any genetic test, so that the nonexclusion probability is $\gamma_1 = 1$ for all genetic tests, irrespective of the genotype of F. If G ≠ F, the nonexclusion probability is $\gamma_2 = 1 - E = \bar{E} < 1$ for a test locus with exclusion capability $E$.

For the final calculation of the posterior probability, consider accused men who have been subjected to $t$ genetic tests, all resulting in nonexclusions. We wish to calculate the probability of true fathers among this group of nonexcluded accused men. Let $\bar{E}_j$ be the nonexclusion probability of the $j$th test. Straightforward application of Bayes theorem (2) yields

$$\begin{pmatrix} \pi_0 \\ \bar{\pi}_0 \end{pmatrix} \wedge \begin{pmatrix} 1 \\ \bar{E}_1 \end{pmatrix} \wedge \cdots \wedge \begin{pmatrix} 1 \\ \bar{E}_t \end{pmatrix} \rightarrow \begin{pmatrix} \pi_t \\ \bar{\pi}_t \end{pmatrix}, \quad \text{where } \pi_t = \frac{\pi_0}{\pi_0 + \bar{\pi}_0 \bar{E}_1 \cdots \bar{E}_t} \qquad (4)$$

which is the desired probability of paternity after $t$ nonexclusions. The result, expression (4), was first given by Wiener (1976), except for the notation.
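Expression (4) translates directly into code. The following is a minimal sketch; the function name `paternity_probability` is assumed here, and the prior of 1/2 in the usage lines is a hypothetical value chosen only for illustration, combined with the nonexclusion probability .955 of the two-allele locus discussed above.

```python
from math import prod


def paternity_probability(prior, nonexclusion_probs):
    """Expression (4): pi_t = pi_0 / (pi_0 + (1 - pi_0) * E1_bar * ... * Et_bar)."""
    if not 0 < prior < 1:
        raise ValueError("prior must lie strictly between 0 and 1")
    if any(not 0 <= e <= 1 for e in nonexclusion_probs):
        raise ValueError("each nonexclusion probability must lie in [0, 1]")
    joint_false = (1 - prior) * prod(nonexclusion_probs)
    return prior / (prior + joint_false)


# Hypothetical prior of 1/2; each locus leaves 95.5% of falsely accused men nonexcluded.
prior = 0.5
for t in range(4):
    print(t, round(paternity_probability(prior, [0.955] * t), 3))
```

With no tests the function returns the prior unchanged, and every further nonexclusion with $\bar{E}_j < 1$ can only raise the result.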
The method based on expression (4) may be called Wiener's method, or, more descriptively, the nonexclusion method, or simply the E method. It is also the method we have preferred (Li and Chakravarti 1983, 1985, 1986).

5. Examples and Properties of $\pi_t$

The most important and obvious property of $\pi_t$ in expression (4) is that it is a monotonically increasing function of $t$, the number of nonexclusions experienced by the accused individual. We may recall that $X_1(t)$ in expression (3) has the same property, where $t$ is the number of normal sons. In both cases the key point is $\gamma_1 = 1 > \gamma_2 = 1/2$ or $\bar{E}_j$, as the case may be. In fact, expression (4) is a more general form of expression (3), since, when $\pi_0 = \bar{\pi}_0$ and each $\bar{E}_j = 1/2$, expression (4) reduces to expression (3). The monotonically increasing function implies that each additional nonexclusion on an additional genetic test increases the probability of paternity for the accused man. In view of the genetic meaning of nonexclusion, we believe this to be a fundamental property of a paternity probability. However, we shall review the reasons against it when we describe an entirely different approach to the paternity problem.

The monotonic property of $\pi_t$ may also be viewed as a consequence of successive elimination of falsely accused men by successive tests. As a numerical example, consider a group of 200 accused men with initial $\pi_0 = .60$. A series of four genetic tests is to be conducted, where the four genetic loci have the following exclusion and nonexclusion probabilities:

| Test locus $j$ | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| Exclusion $E_j$ | 1/4 | 1/2 | 2/3 | 1/2 |
| Nonexclusion $\bar{E}_j$ | 3/4 | 1/2 | 1/3 | 1/2 |

(5)

$$\text{Joint nonexclusion } \bar{E}^* = \bar{E}_1\bar{E}_2\bar{E}_3\bar{E}_4 = \tfrac{1}{16} \qquad (5')$$

The successive values of the paternity probabilities are shown in table 3, which is similar to table 2. The body of the table shows the number of men with nonexclusions. The number of falsely accused men excluded by the test is shown in brackets. It is seen from the bottom row of table 3 that the paternity probability $\pi_t$ is monotonically increasing with $t$. For men with four nonexclusions, the paternity probability is $\pi_4 = 120/125 = .96$, while for men with only the first two nonexclusions the paternity probability is $\pi_2 = .80$.

Table 3
Successive Elimination of Falsely Accused Men, and Monotonic Increase of the Paternity Probability for the Remaining Nonexcluded Men

| | $t$ = 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| G = F | 120 | 120 | 120 | 120 | 120 |
| G ≠ F | 80 | 60 | 30 | 10 | 5 |
| (Excluded) | | (20) | (30) | (20) | (5) |
| Total nonexcluded | 200 | 180 | 150 | 130 | 125 |
| $\pi_t$ | .600 | .667 | .800 | .923 | .960 |

NOTE: $t$ is the number of nonexclusions. The nonexclusion probabilities of the four tests are shown in expression (5).

The numerical example above may also be used to illustrate the concept of equivalent tests. The four nonexclusions have raised $\pi_0 = .60$ to $\pi_4 = .96$. If we had used a single highly polymorphic genetic locus with nonexclusion probability $\bar{E}^* = \bar{E}_1\bar{E}_2\bar{E}_3\bar{E}_4 = 1/16$ of (5'), we would have accomplished the same result:

$$\begin{pmatrix} \pi_0 \\ \bar{\pi}_0 \end{pmatrix} \wedge \begin{pmatrix} 1 \\ \bar{E}^* \end{pmatrix} = \begin{pmatrix} 120 \\ 80 \end{pmatrix} \wedge \begin{pmatrix} 1 \\ 1/16 \end{pmatrix} \rightarrow \begin{pmatrix} .96 \\ .04 \end{pmatrix}$$

This is due to the fact that any number of independent events may be combined and regarded as a single event, as we have observed in section 2 above. The ABO locus has a low capability of exclusion, with $(E, \bar{E}) = (.15, .85)$ approximately; the HLA locus has a rather high capability of exclusion, with $(E, \bar{E}) = (.90, .10)$ for illustration purposes. Noting that $(.85)^{14} \approx .10$, we may say that one test by a locus such as HLA is equivalent to 14 tests by loci such as ABO. This is one way to evaluate the relative value or usefulness of genetic tests.

6. Probabilities of Fathers and Nonfathers

We now begin to review a different method of calculating the paternity probability after a number of nonexclusions. This method does not use the nonexclusion probabilities of the loci tested at all. Instead, it uses a quantity known as the "paternity index," based on the probabilities of fathers and nonfathers for any given mother-child pair (Baur et al. 1986; Elston 1986; Mickey et al. 1986; Thompson 1986; Valentin 1986). In this section we shall deal with such probabilities only. The paternity index will then be introduced in the following section.
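Returning briefly to the numerical example of section 5, the head counts of table 3 and the "equivalent tests" comparison can be recomputed with a short sketch. The variable names are assumptions made here; the counts 120 and 80 and the four nonexclusion probabilities are those given in the text.

```python
from fractions import Fraction
from math import log

true_fathers = Fraction(120)      # G = F: never excluded
falsely_accused = Fraction(80)    # G != F: thinned out by each test
nonexclusion = [Fraction(3, 4), Fraction(1, 2), Fraction(1, 3), Fraction(1, 2)]

remaining_false = [falsely_accused]
for e_bar in nonexclusion:
    falsely_accused *= e_bar      # only the nonexcluded share stays in the group
    remaining_false.append(falsely_accused)

# Paternity probability among the nonexcluded men after t = 0..4 tests
pi_t = [true_fathers / (true_fathers + f) for f in remaining_false]
print([round(float(x), 3) for x in pi_t])

# Number of ABO-like tests (E_bar = .85) matching one HLA-like test (E_bar = .10)
equivalent_tests = log(0.10) / log(0.85)
print(round(equivalent_tests, 1))
```

The ratio of logarithms is simply the exponent $n$ solving $(.85)^n = .10$, which is why it lands close to the 14 tests quoted in the text.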
In the paternity literature, the term "nonfathers" is synonymous with random individuals of a population. We shall continue to use the simplest genetic system, two codominant alleles at an autosomal locus, in a random mating population to illustrate the basic methodology of the paternity index method. We use (A, a) for the two alleles with frequencies $(p, q)$, without implication of dominance. Further, we assume throughout the subsequent sections that the mother-child pair has been found to be (M, C) = (AA, AA), which is the conditional "event" in the context of Bayes theorem.

Let G be the accused man. Before we know the genotypes of the mother and child, the probabilities of G's being (AA, Aa, aa) are simply $(p^2, 2pq, q^2)$. Then, at a particular test locus it has been found that (M, C) = (AA, AA). When this "event" becomes known, what would be the genotype probabilities of the true fathers? The conditional probabilities ($\gamma_i$) for the event are calculated as follows:

| Mother | Father | Child | Segregation Probability |
|---|---|---|---|
| AA | AA | AA | $\gamma_1 = 1$ |
| AA | Aa | AA | $\gamma_2 = 1/2$ |
| AA | aa | AA | $\gamma_3 = 0$ |

The complete calculations are shown in table 4, which follows table 1 step by step. If G = aa, he is excluded. So our remaining discussion pertains to the cases G = AA or Aa, which are nonexclusions.

Table 4
Probability Distribution of Fathers after the Event (M, C) = (AA, AA)

| State of Father ($i$) | Initial Probability of State ($Y_i$) | Conditional Probability of Event ($\gamma_i$) | Joint Probability = Initial × Conditional ($Y_i\gamma_i$) | Posterior Probability ($X_i$) |
|---|---|---|---|---|
| 1. AA | $p^2$ | $\gamma_1 = 1$ | $p^2$ | $p$ |
| 2. Aa | $2pq$ | $\gamma_2 = 1/2$ | $pq$ | $q$ |
| 3. aa | $q^2$ | $\gamma_3 = 0$ | 0 | 0 |
| Total | 1.00 | | $\bar{\gamma} = p$ | 1.00 |
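The steps of table 4 can be checked for any allele frequency. In this sketch the function name `father_genotype_posterior` is an assumption made here, and $p = 1/20$ is used purely as an example value.

```python
from fractions import Fraction


def father_genotype_posterior(p):
    """Genotype distribution of true fathers given (M, C) = (AA, AA), as in table 4."""
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    q = 1 - p
    prior = {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}                   # Y: random mating
    gamma = {"AA": Fraction(1), "Aa": Fraction(1, 2), "aa": Fraction(0)}  # segregation
    joint = {g: prior[g] * gamma[g] for g in prior}
    gamma_bar = sum(joint.values())                                       # P(event)
    return {g: joint[g] / gamma_bar for g in joint}, gamma_bar


post, gamma_bar = father_genotype_posterior(Fraction(1, 20))
print(post, gamma_bar)
```

For this input the posterior column comes out as $(p, q, 0)$ and the unconditional probability of the event as $p$, matching the bottom row of table 4.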

To obtain the general relationships between the initial probabilities $Y$ and the modified or posterior probabilities $X$, tables 1 and 4 must be read together. Thus, with the first equality in each line as given in table 1 and the second as given in table 4,

$$X_1 = \left(\frac{\gamma_1}{\bar{\gamma}}\right) Y_1 = \left(\frac{1}{p}\right) p^2 = p, \qquad X_2 = \left(\frac{\gamma_2}{\bar{\gamma}}\right) Y_2 = \left(\frac{1}{2p}\right) 2pq = q \qquad (6)$$

The coefficients in brackets, such as $(\gamma_i/\bar{\gamma})$ or $(1/p)$, may be regarded as the factor converting $Y$ to $X$. In the genetic example, it is obvious that $X' = (p, q, 0)$ and $Y' = (p^2, 2pq, q^2)$ are related by Mendelian segregation probabilities $\gamma' = (1, 1/2, 0)$. Further, since $\gamma_1 = 1$ is the largest of the three $\gamma$'s, we know that $X_1$ is always larger than $Y_1$, as the converting factor $(\gamma_1/\bar{\gamma}) > 1$. To summarize, $X$ and $Y$ have fixed relationships, fixed by the laws of heredity.

7. Paternity Index and Inclusion Probability

The paternity index method focuses its attention on the genotype of the accused individual. Suppose the accused man is G = AA, a nonexclusion. Then, the argument is as follows:

Probability (G = AA | Father) = $X_1 = X = p$
Probability (G = AA | Nonfather) = $Y_1 = Y = p^2$

where we have dropped the subscript 1 of $X$ and $Y$ for convenience, as G = AA is the only genotype under consideration. Further, it is argued that these two probabilities correspond to two distinct states of nature, which may be considered as two hypotheses, viz., $H_1$: G = AA = father and $H_2$: G = AA = nonfather. In an attempt to distinguish between these two hypotheses, a likelihood ratio is defined:

$$\lambda = \frac{X}{Y}; \quad \text{e.g., } \lambda(\text{AA}) = \frac{p}{p^2} = \frac{1}{p} \qquad (7.1)$$

The ratio $\lambda$ is called the paternity index, because when $\lambda > 1$, the evidence favors $H_1$: G = father; and when $\lambda < 1$, the evidence is against his being the father. The final step of the method is to calculate a paternity probability, known as the inclusion probability ($W$), on the basis of $X$ and $Y$. It is defined as

$$W = \frac{X}{X + Y} = \frac{\lambda}{\lambda + 1}; \quad \text{e.g., } W(\text{AA}) = \frac{1}{1 + p} \qquad (7.2)$$

when Bayes theorem is used and the assumed initial probabilities of paternity are $\pi_0 = \bar{\pi}_0 = 1/2$.
Recently, some authors have used the more general form:

$$W = \frac{\pi_0 X}{\pi_0 X + \bar{\pi}_0 Y}$$

However, we are only interested in certain properties of $\lambda$ and $W$ rather than in their numerical values for practical decisions; we shall continue to use the simpler form, expression (7.2), in illustrations.

As noted before, this method concentrates its attention on one genotype at a time. If the accused individual G is Aa, we have to repeat the procedure by using $X_2 = X = q$ and $Y_2 = Y = 2pq$, as shown in table 4, so that

$$\lambda(\text{Aa}) = \frac{q}{2pq} = \frac{1}{2p} \quad \text{and} \quad W(\text{Aa}) = \frac{1}{1 + 2p} \qquad (7.3)$$

If there are $t$ nonexclusions on $t$ independent genetic tests, the cumulative likelihood ratio $\lambda_1\lambda_2 \cdots \lambda_t$ is calculated, where $\lambda_j$ is the paternity index obtained from the $j$th locus. The paternity probability or the inclusion probability is

$$W_t = \frac{\lambda_1\lambda_2 \cdots \lambda_t}{\lambda_1\lambda_2 \cdots \lambda_t + 1} \qquad (7.4)$$

Note that both equation (4) and equation (7.2) use Bayes theorem; however, the states of nature and the "event" being considered are different. This summarizes the procedure of the paternity index method of calculating the paternity probabilities after a number of nonexclusions. It may be simply referred to as the $\lambda$ method.

8. Examples and Properties of $\lambda$ and $W$

To facilitate numerical comparisons, we have collected expressions (7.1), (7.2), and (7.3) into table 5, in which four numerical examples are given. In all cases we have assumed that the initial probability of paternity is $\pi_0 = 1/2$.

Table 5
General Procedure of the Paternity Index Method to Calculate the Paternity Probability: Four Examples, Given (Mother, Child) = (AA, AA)

A. General Procedure

| Accused Genotype | Father ($X$) | Random ($Y$) | Paternity Index ($\lambda = X/Y$) | Inclusion Probability ($W = \lambda/[\lambda + 1]$) |
|---|---|---|---|---|
| G = AA | $p$ | $p^2$ | $\lambda(\text{AA}) = 1/p$ | $W(\text{AA}) = 1/(1 + p)$ |
| G = Aa | $q$ | $2pq$ | $\lambda(\text{Aa}) = 1/2p$ | $W(\text{Aa}) = 1/(1 + 2p)$ |
| G = aa | 0 | $q^2$ | 0 | Excluded |

B. Examples

| Accused | Gene Frequency | Paternity Index | Inclusion Probability |
|---|---|---|---|
| i: G = AA | $p = .05$ | $\lambda(\text{AA}) = 20$ | $W(\text{AA}) = 20/21 = .952$ |
| ii: G = Aa | $p = .05$ | $\lambda(\text{Aa}) = 10$ | $W(\text{Aa}) = 10/11 = .909$ |
| iii: G = AA | $p = .95$ | $\lambda(\text{AA}) = 1/.95$ | $W(\text{AA}) = 1/1.95 = .513$ |
| iv: G = Aa | $p = .95$ | $\lambda(\text{Aa}) = 1/1.90$ | $W(\text{Aa}) = 1/2.90 = .345$ |

The first example shows that a nonexclusion by this two-allele codominant system has raised the paternity probability from .500 to .952, a very substantial increase indeed. In the second example, $W(\text{Aa}) = .909$, also very high. This means that in a population with a low value of $p$, an accused individual, if not excluded, will always have a high paternity probability, whether he is AA or Aa and whether the accusation is true or false.

In example iv, where G = Aa and $p = .95$, the paternity index is $\lambda(\text{Aa}) = 10/19$ and $W(\text{Aa}) = .345$, considerably smaller than the initial $\pi_0 = .50$ before the nonexclusion. Thus, a nonexclusion actually makes G less likely to be the father than he was before the test. Supporters of the $\lambda$ method argue that this is precisely the superior aspect of the method. By comparing $X$ and $Y$, they obtain more information about paternity. Since $\lambda = 10/19 < 1$, the evidence is against G's being the father, in spite of the nonexclusion. The implication is that the evidence from $\lambda < 1$ is more important than the evidence of a nonexclusion.

As a comparison, we shall obtain the results for the four examples in table 5 by the E method. As noted in section 4 above, the exclusion capability of this locus is $E = pq(1 - pq) = .045$, whether $p = .05$ or $.95$, because of its symmetry. Since the E method is based not on the genotype of the accused but on the nonexclusion probability of a locus, the answers for all four examples are the same.
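The four examples of table 5 can be recomputed from expressions (7.1) to (7.4). The sketch below is illustrative only: `paternity_index` and `inclusion_probability` are names assumed here, and the `prior` parameter implements the more general form of $W$ quoted in section 7, defaulting to the value 1/2 used in the table.

```python
from fractions import Fraction
from math import prod


def paternity_index(genotype, p):
    """lambda = X / Y for the accused's genotype, given (M, C) = (AA, AA)."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    q = 1 - p
    father = {"AA": p, "Aa": q, "aa": 0}                       # X, posterior column of table 4
    random_man = {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}   # Y, population frequencies
    return Fraction(father[genotype]) / random_man[genotype]


def inclusion_probability(indices, prior=Fraction(1, 2)):
    """W = pi0 * L / (pi0 * L + (1 - pi0)), where L is the product of the indices."""
    cumulative = prod(indices)
    return prior * cumulative / (prior * cumulative + (1 - prior))


examples = [("i", "AA", Fraction(1, 20)), ("ii", "Aa", Fraction(1, 20)),
            ("iii", "AA", Fraction(19, 20)), ("iv", "Aa", Fraction(19, 20))]
for label, genotype, p in examples:
    lam = paternity_index(genotype, p)
    w = inclusion_probability([lam])
    print(label, genotype, lam, w, round(float(w), 3))
```

Passing several indices to `inclusion_probability` gives the cumulative form (7.4), since the function multiplies them before converting to a probability.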
Thus, with $\bar{E} = 1 - E = .955$, we obtain

$$\frac{\pi_1}{1 - \pi_1} = \frac{\pi_0}{1 - \pi_0} \cdot \frac{1}{\bar{E}} = \frac{.50}{.50} \cdot \frac{1}{.955} = \frac{.512}{.488}.$$

Since 95.5% of the falsely accused individuals cannot be excluded by this simple locus, a nonexclusion can only raise the paternity probability slightly over the initial probability. Here, it changes from .500 to .512. We believe that this answer is closer to the truth than any of the four answers listed in table 5. The close agreement between this answer and that in example iii of table 5 is purely incidental.

For a series of t tests resulting in t nonexclusions, the value of $W_t$ in expression (7.4) has no monotonic property, as each $\lambda$ may be greater or less than 1. So the series $W_1, W_2, W_3, \ldots$ for t = 1, 2, 3, ... may oscillate in principle. When an additional test is conducted resulting in another nonexclusion, the paternity probability $W_{t+1}$ may turn out to be smaller than $W_t$. Similarly, $W_{t+2}$ may be smaller than $W_{t+1}$. As an extreme example, suppose there are eight nonexclusions by eight loci, each like the one in example iv with $\lambda = 10/19$ (table 5); we would obtain a paternity probability by expression (7.4):

$$W_8 = \frac{(10/19)^8}{(10/19)^8 + 1} = .0059, \tag{8}$$

which is below the customary .01 significance point; hence we would reject the hypothesis that G is the father. The astonishing conclusion would be that a series of eight nonexclusions could amount to an exclusion. We are not suggesting that this happens frequently, but it is a mathematically possible result and it is a property of the $\lambda$ method. In contrast, the paternity probability $\pi_t$ in expression (4) derived by the E method approaches monotonically toward the limiting value of 1, for $\bar{E}_1 \cdots \bar{E}_t \to 0$ as t increases, provided that only nonexclusions are observed.

9. Discussion

The values X and Y are assumed to be two independent sets of probabilities under two distinct hypotheses; hence the ratio $\lambda = X/Y$ is taken as an observed value of a likelihood ratio, a random variable. But, as shown in expression (6), X is obtained from Y by the relationship $X_i = (y_i/\bar{y}) Y_i$. The paternity index is merely the original factor converting Y to X:

$$\lambda_i = X_i / Y_i = y_i / \bar{y} \tag{9.1}$$

We agree with some of the $\lambda$ method supporters that whether $\lambda_i = y_i/\bar{y}$ is a random variable is not a critical element of the method; the important point is whether it leads to a valid probability of paternity.

The term "probability" as used in statistics refers to a class of objects, some of which possess a certain characteristic. The probability of that characteristic is the frequency of objects having that characteristic in the class under consideration. We shall use this basic definition to illustrate the difference between the $\lambda$ and E methods.

As before, let F = father (unknown) and G = accused man (known). Note carefully that when we say F = AA, it is a very different statement from saying G = AA. The latter is known by blood typing, but F is unknown, still less his genotype. Now, let us hypothetically assume that we know the father is AA, although not all AA individuals are fathers. Under this hypothetical situation, the inclusion probability W defined in expression (7.2) would mean

$$W(AA) = \frac{X}{X + Y} = \frac{AA\text{-fathers}}{AA\text{-fathers} + AA\text{-nonfathers}} \tag{9.2}$$

In this case, the probability W(AA) refers to the class of individuals of genotype AA only. The expression (9.2) is meaningful only when the fact F = AA is known. Similarly, if we know F = Aa, then W(Aa) is obtained from expression (9.2) by replacing AA by Aa, and it now refers to the class of individuals of genotype Aa only.
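The behavior of the two series described in section 8 can be illustrated numerically. The sketch below rests on two assumptions, since expressions (7.4) and (4) are not reproduced in this part of the text: that $W_t = L_t/(L_t + 1)$ with $L_t$ the product of the first t paternity indices (the form used in expression (8)), and that $\pi_t = \pi_0 / [\pi_0 + (1 - \pi_0)\bar{E}_1 \cdots \bar{E}_t]$ (the form that reproduces the value .512 above). The function names are illustrative.

```python
from itertools import accumulate
from operator import mul


def w_series(lambdas):
    """W_t for t = 1..len(lambdas), assuming W_t = L_t / (L_t + 1),
    where L_t is the running product of the paternity indices."""
    return [prod_t / (prod_t + 1.0) for prod_t in accumulate(lambdas, mul)]


def pi_series(nonexclusion_probs, prior=0.5):
    """pi_t for t = 1..len(nonexclusion_probs), assuming
    pi_t = prior / (prior + (1 - prior) * Ebar_1 * ... * Ebar_t)."""
    return [prior / (prior + (1.0 - prior) * prod_t)
            for prod_t in accumulate(nonexclusion_probs, mul)]


# Eight nonexclusions at loci like example iv (lambda = 10/19).
w = w_series([10.0 / 19.0] * 8)
print("W_8  =", round(w[-1], 4))

# The same eight nonexclusions under the E method (Ebar = .955 per locus).
pi = pi_series([0.955] * 8)
print("pi_1 =", round(pi[0], 3), " pi_8 =", round(pi[-1], 3))

# A mixed series (examples i, iv, i) shows that W_t can go down and up again.
print("mixed W:", [round(v, 3) for v in w_series([20.0, 10.0 / 19.0, 20.0])])
```

Under these assumptions the W series falls with every additional nonexclusion at such a locus, while the $\pi$ series can only rise, because each factor $\bar{E}_i$ is at most 1.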
W(AA) and W(Aa) are probabilities for two different classes; thus $W(AA) + W(Aa) \neq 1$ in the examples of table 5.

However, in practice the $\lambda$ method proceeds to calculate the value of expression (9.2) without knowing the state of accusation. The value of W(AA) remains the same whether the accusation itself is true or false, because it is based on G = AA. That is, the probability calculated from G = AA in table 5 is unrelated to the state of accusation.

In contrast, the E method is concerned solely with the state of accusation. The "measure" to be used must be different for the two states of accusation. If the accusation is true, we use the nonexclusion probability $Y_1 = 1$, which applies to all trios whatever the father's genotype. The value of E of a locus is the average exclusion probability over all possible trios in the population. The nonexclusion probability $Y_2 = \bar{E}$ also applies to all trios in the population. Hence, all falsely accused men have the probability $Y_2 = \bar{E}$ of not being excluded. The paternity probability $\pi_t$ given in expression (4) may be rewritten as

$$\pi_t = \frac{\text{True accusations}}{\text{True accusations} + \text{False accusations (undetected)}} \tag{9.3}$$

This expression refers to the class not of any particular genotype but of all individuals with nonexclusions. It is the probability of paternity = P(accusation true) = P(G = F), whatever their genotypes. The difference between expressions (9.2) and (9.3) constitutes the critical difference between the $\lambda$ and E methods. The former is unrelated to the state of accusation; the latter studies nothing but the state of accusation.

The final point we wish to discuss is the matter of paternity information provided by a trio. Users of the paternity index have pointed out that the nonexclusion method has neglected using the specific genotypes of a trio and have claimed that the $\lambda$ method is more efficient because it uses such information. It is desirable to have this point clarified. For simplicity we continue to use codominant systems.
A trio consists of three genotypes as follows:

$$\begin{array}{c} \text{Mother } (1, 2) \qquad \text{Accused } (3, 4) \\ \text{Child } (1, c) \text{ or } (2, c) \end{array} \tag{9.4}$$

where 1, 2, 3, and 4 indicate the four independent genes and c is the child's gene that must come from his (true) father. If c = 3 or 4, the test result is a nonexclusion. If c is neither 3 nor 4, the test result is an exclusion. To reach the decision of exclusion or nonexclusion, all the information provided by the three genotypes has been utilized.

It should be clear that the decisions on exclusion or nonexclusion are based solely on Mendelian laws of heredity and on nothing else. There is no consideration of gene frequencies, still less of mating systems. Mendelian laws are purely family laws, governing the genetic relationship, in any population whatsoever, between two parents and their offspring. Obviously, a trio does not contain any information on gene frequencies or on the mating system of the population. The three genotypes of expression (9.4) contain no information for or against the paternity of any particular genotype. The procedure (starting with the initial Y, finding the modified $X = \lambda Y$, formulating the index $\lambda = X/Y$, and obtaining the paternity probability $W = \lambda/(\lambda + 1)$ for the specific genotype of the accused) must be regarded as the methodology developed by investigators using the extraneous information about the population (e.g., gene frequencies and mating system), not the "information provided by the trio" at all. A trio, being strictly a family affair, can be useful only for testing the compatibility or incompatibility with Mendelian inheritance.

A corollary of the discussion above is that the specific genotype of the accused has no particular significance in paternity problems. As a concrete example, let the child in expression (9.4) be of genotype (2, 3), so that the accused cannot be excluded. The relevant feature of the accused genotype (3, 4) is the presence of gene 3, not that of gene 4. The decision of nonexclusion would have remained the same if the accused genotype were (3, 5), (3, 6), etc.
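The exclusion decision for the trio of expression (9.4) can be sketched in a few lines. This is a minimal illustration under the assumptions of the text (a codominant system, strict Mendelian transmission, no mutation or silent alleles); the function names are hypothetical. Note that no gene frequency appears anywhere in the decision.

```python
def paternal_candidates(mother, child):
    """Alleles of the child that may have come from the father.

    For each way of assigning one of the child's two alleles to the
    mother, the remaining allele is a candidate paternal gene (the
    gene c of expression (9.4)).
    """
    first, second = child
    candidates = set()
    if first in mother:
        candidates.add(second)
    if second in mother:
        candidates.add(first)
    return candidates


def is_excluded(mother, child, accused):
    """True if the accused carries none of the candidate paternal alleles."""
    candidates = paternal_candidates(mother, child)
    if not candidates:
        raise ValueError("child's genotype is incompatible with the mother")
    return not any(allele in accused for allele in candidates)


mother, child = (1, 2), (2, 3)

# The decision depends only on whether the accused carries gene 3.
for accused in [(3, 4), (3, 5), (3, 6), (4, 5)]:
    verdict = "exclusion" if is_excluded(mother, child, accused) else "nonexclusion"
    print(accused, "->", verdict)
```

The accused genotypes (3, 4), (3, 5), and (3, 6) all give the same decision, which is the point made in the paragraph above: only the presence of the required gene matters.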
Thus, in the examples of table 5, both accused AA and Aa are cases of nonexclusion not because of their specific genotype per se but because of the presence of gene A, the other gene being irrelevant. The $\lambda$ method uses the whole genotype of the accused as the basis of calculation and obtains different answers for paternity probabilities W(AA) and W(Aa). In contrast, the exclusion probability E of a locus is based on the presence or absence of the required allele in the accused genotype of all possible trios in the population and yields one answer for any given test. The information provided by trios has been embodied in the value of E and thus in that of $\bar{E}$.

References

Baur, M. P., R. C. Elston, H. Gurtler, K. Henningsen, K. Hummel, H. Matsumoto, W. Mayr, J. W. Morris, L. Nijenhuis, H. Polesky, D. Salmon, J. Valentin, and R. H. Walker. 1986. No fallacies in the formulation of the paternity index. Am. J. Hum. Genet. 39:528-536.

Bolling, D. R., G. A. Chase, and E. A. Murphy. 1976. A matrix method for calculating recurrence risks of unilocal disorders for genetic counseling. Ann. Hum. Genet. 40:25-36.

Chakravarti, A., and C. C. Li. 1984. Estimating the prior probability of paternity from the results of exclusion tests. Forensic Sci. Int. 24:143-147.

Elston, R. C. 1986. Probability and paternity testing. Am. J. Hum. Genet. 39:112-122.

Li, C. C., and A. Chakravarti. 1983. On the exclusion and paternity probabilities. Pp. 609-618 in R. H. Walker, ed. Inclusion probabilities in parentage testing. American Association of Blood Banks, Arlington, VA.

Li, C. C., and A. Chakravarti. 1985. Basic fallacies in the formulation of the paternity index. Am. J. Hum. Genet. 37:809-818.

Li, C. C., and A. Chakravarti. 1986. Some fallacious thinking about the paternity index: a reply to Dr. Jack Valentin's comments. Am. J. Hum. Genet. 38:586-589.

Mickey, M. R., D. W. Gjertson, and P. I. Terasaki. 1986. Empirical validation of the Essen-Moller probability of paternity. Am. J. Hum. Genet. 39:123-132.

Murphy, E. A., and G. S. Mutalik. 1969. The application of Bayesian methods in genetic counseling. Hum. Hered. 19:126-151.

Smith, C. A. B. 1976. The use of matrices in calculating mendelian probabilities. Ann. Hum. Genet. 40:37-54.

Thompson, E. A. 1986. Likelihood inference of paternity. Am. J. Hum. Genet. 39:285-287.

Valentin, J. 1986. Some fallacious thinking about the paternity index. Am. J. Hum. Genet. 38:582-585.

Wiener, A. S. 1976. Likelihood of parentage. Pp. 124-131 in L. N. Sussman, ed. Paternity testing in blood grouping. Charles C Thomas, Springfield, IL.