Mapping quantitative trait loci in oligogenic models


Biostatistics (2001), 2, 2, pp.
Printed in Great Britain

HSIU-KHUERN TANG, D. SIEGMUND
Department of Statistics, 390 Serra Mall, Sequoia Hall, Stanford University, Stanford, CA, USA

SUMMARY

We discuss strategies for mapping quantitative trait loci with emphasis on certain issues of study design that have recently received attention: e.g. genotyping only selected pedigrees and the comparative value of large pedigrees versus sib pairs. We use a standard variance components model and a parametrization of the genetic effects in which the segregation parameters are locally orthogonal to the linkage parameters. This permits simple explicit expressions for the expectation of the score statistic, which we use to compare the power of different strategies. We also discuss robustness of the score statistic.

Keywords: Gene mapping; Genome scan; Quantitative trait; Variance components.

1. INTRODUCTION

The goal of genetic mapping is to locate the genes affecting particular traits by analysis of the correlation between phenotypic values and genetic markers distributed throughout the genome. The traits can involve a 0-1 phenotype (e.g. human diseases) or can be based on quantitative measurement. One expects relatives who have similar phenotypes to have similar genotypes at marker loci close to genetic loci that influence the trait, while the markers behave stochastically according to the rules of Mendelian inheritance at distant loci. Until recently, the theory and practice of mapping quantitative trait loci (QTLs) in humans have been relatively undeveloped. The purpose of this paper is to discuss, essentially from first principles, the statistical theory for mapping QTLs in humans under the simplifying assumptions of an oligogenic model of inheritance and completely informative genetic markers.
Although this model has the deficiency of a strong assumption of normality, its interpretability and its computational tractability in the incorporation of covariates, multivariate phenotypes, interactions, etc. are sufficiently attractive to have encouraged substantial recent development (e.g. Amos (1994); Fulker and Cardon (1994); Kruglyak and Lander (1995); Fulker and Cherny (1996); Almasy and Blangero (1998); Page et al. (1998); Williams and Blangero (1999)). We try insofar as possible to give explicit analytic accounts of a number of issues that have usually been treated in the literature by numerical methods or by simulation, especially the relative values of large versus small pedigrees (cf. Almasy and Blangero (1998); Page et al. (1998); Williams and Blangero (1999)) and of genotyping selected versus random sibships (Risch and Zhang, 1995; Eaves and Meyer, 1994). In the process we generalize the notion of a discordant sib pair to a discordant sibship.

In the following section we describe the model to be studied. We discuss sib pairs in Section 3. Sibships of arbitrary size are the subject of Section 4. Selective genotyping is the subject of Section 5. In Section 6 we give a brief discussion of robustness of the components of variance model when a biallelic major gene model is correct. Our main conclusions are discussed in Section 7.

To facilitate our calculations we introduce a parametrization in which the segregation parameters (those that can be estimated from segregation data) and linkage parameters (those requiring data from linked markers for their estimation) are orthogonal under the null hypothesis of no linkage (Cox and Hinkley, 1974, p. 324). We also use the standard asymptotic framework of statistical large sample theory, that the sample size $N$ is large and the noncentrality parameter for a single observation is inversely proportional to $N^{1/2}$. These techniques allow us to compute score statistics, their asymptotic expectations, and Fisher information matrices comparatively explicitly.

© Oxford University Press (2001)

2. MODEL

We assume Hardy-Weinberg equilibrium throughout. Suppose there is a QTL at the locus $\tau$, which is an unknown parameter. The phenotypic value $Y$ is assumed to be given by

$$Y = \mu + \alpha_U + \alpha_V + \delta_{U,V} + e. \tag{2.1}$$

The mean value $\mu$ may be expanded as a linear model with only minor changes to what follows. The parameter $\alpha_a$ denotes the additive genetic effect of allele $a$ at locus $\tau$, and $\delta_{a,b}$ the dominance deviation of alleles $a$, $b$. A subscript $U$ denotes the allele contributed by the mother while a subscript $V$ refers to the father. The phenotypic variance is $\sigma_Y^2 = E[(Y - \mu)^2]$. The variances of the additive and dominance effects associated with the QTL at $\tau$ are $\sigma_A^2 = 2E\alpha_U^2$ and $\sigma_D^2 = E\delta_{U,V}^2$.

Implicitly we expect that there are several QTLs, which may interact. For this paper we assume that other QTLs are on other chromosomes, are in linkage equilibrium with the QTL at $\tau$, and do not interact with it. Under these assumptions their contribution to the phenotype $Y$ can be assumed to be a part of the residual term $e$, which we assume is uncorrelated with the other terms in (2.1) and has variance $\sigma_e^2$.
Then $\sigma_Y^2 = \sigma_A^2 + \sigma_D^2 + \sigma_e^2$. The locus-specific heritability associated with the QTL at $\tau$ is $h^2 = (\sigma_A^2 + \sigma_D^2)/\sigma_Y^2$. The term oligogenic of the title is intended to suggest that at least one QTL has a substantial locus-specific heritability, say $h^2 > 0.1$, and hence might be detectable by linkage analysis with reasonable sample sizes.

In order to have some idea of the magnitudes of different components of variance and their relations to more directly interpretable genetic parameters, we shall occasionally be interested in the special case of two alleles $A_1$ and $A_2$ with frequencies of $p$ and $q = 1 - p$. Using $a$ and $-a$ for the genotypic values of the homozygotes $A_1A_1$ and $A_2A_2$, respectively, and $d$ for the dominance deviation of the heterozygote, we have the standard formulas:

$$\sigma_A^2 = 2pq[a + (q - p)d]^2, \qquad \sigma_D^2 = 4p^2q^2d^2. \tag{2.2}$$

Consider a pair of siblings satisfying the model (2.1). Recall that at any locus two relatives share alleles identical by descent if they inherit the same alleles from a common ancestor. Two siblings can share 2, 1 or 0 alleles identical by descent depending on whether they inherit the same alleles from both mother and father, from one but not both, or from neither. Let $\nu = \nu(\tau)$ denote the number of alleles identical by descent at $\tau$. Letting $Y_i$ denote the phenotypic value of the $i$th sibling ($i = 1, 2$), we have (Fisher, 1918; Kempthorne, 1955)

$$\operatorname{Cov}[Y_1, Y_2 \mid \nu] = \sigma_e^2 r + \sigma_A^2\,\nu/2 + \sigma_D^2\,1_{\{\nu=2\}}, \tag{2.3}$$

where $r = \operatorname{corr}(e_1, e_2)$ accounts for the correlation between sibs arising from other QTLs and from a shared environment.
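The formulas (2.2) and (2.3) can be sketched numerically as follows. This is a minimal illustration, not code from the paper: the function names and the example values of a, d, p, the residual variance and r are assumptions chosen only for demonstration. The second covariance function uses the identity that the indicator of ν = 2 equals (ν − 1)/2 − (1/2)[1{ν=1} − 1/2] + 1/4 for ν in {0, 1, 2}, which recentres (2.3) around the unconditional covariance.

```python
def biallelic_variances(a, d, p):
    """Additive and dominance variances of a biallelic locus, formula (2.2)."""
    q = 1.0 - p
    sigma_a2 = 2.0 * p * q * (a + (q - p) * d) ** 2
    sigma_d2 = 4.0 * p * p * q * q * d * d
    return sigma_a2, sigma_d2


def sib_cov_given_ibd(nu, sigma_a2, sigma_d2, sigma_e2, r):
    """Conditional sib covariance given nu alleles shared IBD, formula (2.3)."""
    return sigma_e2 * r + sigma_a2 * nu / 2.0 + sigma_d2 * (1.0 if nu == 2 else 0.0)


def sib_cov_centered(nu, sigma_a2, sigma_d2, sigma_e2, r):
    """Same covariance, written as the unconditional covariance plus two
    mean-zero terms in nu (uses E[nu] = 1 and P(nu = 2) = 1/4)."""
    unconditional = sigma_e2 * r + sigma_a2 / 2.0 + sigma_d2 / 4.0
    additive_part = (sigma_a2 + sigma_d2) / 2.0 * (nu - 1)
    dominance_part = sigma_d2 / 2.0 * ((1.0 if nu == 1 else 0.0) - 0.5)
    return unconditional + additive_part - dominance_part


if __name__ == "__main__":
    # Hypothetical example values, not taken from the paper.
    sa2, sd2 = biallelic_variances(a=1.0, d=0.5, p=0.3)
    print("sigma_A^2 =", sa2, "sigma_D^2 =", sd2)
    for nu in (0, 1, 2):
        print(nu, sib_cov_given_ibd(nu, sa2, sd2, 1.0, 0.2),
              sib_cov_centered(nu, sa2, sd2, 1.0, 0.2))
```

The two covariance functions agree for every value of ν, which is the algebraic step behind rewriting the conditional covariance in terms of centred IBD variables.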

Taking the expectation of (2.3), and using $E\nu = 1$ and $P(\nu = 2) = 1/4$, we find the unconditional covariance $\operatorname{Cov}(Y_1, Y_2) = \sigma_e^2 r + \sigma_A^2/2 + \sigma_D^2/4$. Hence we can rewrite (2.3) as

$$\operatorname{Cov}[Y_1, Y_2 \mid \nu] = \operatorname{Cov}(Y_1, Y_2) + [(\sigma_A^2 + \sigma_D^2)/2](\nu - 1) - (\sigma_D^2/2)[1_{\{\nu=1\}} - \tfrac{1}{2}], \tag{2.4}$$

so the terms involving $\nu$ have mean 0 and are uncorrelated.

In the following we will be interested in marker loci $t$ distributed throughout the genome and $\nu(t)$ for a sib pair as a stochastic process in $t$. For markers $t_1$ and $t_2$ on different chromosomes, $\nu(t_1)$ and $\nu(t_2)$ are stochastically independent. For markers on the same chromosome $\operatorname{Cov}[\nu(t_1), \nu(t_2)] = 2^{-1}[1 - 2\varphi]$, where $\varphi$ is a function of the recombination frequency. We find it convenient to assume that recombination follows the Haldane model of no interference, so $1 - 2\varphi = \exp(-4\lambda|t_1 - t_2|)$. Here the marker location $t$ denotes genetic distance in centimorgans (cM) from a designated end of the chromosome, and $\lambda = 0.01$/cM.

3. LIKELIHOOD THEORY FOR SIB PAIRS

In this section we consider genome scans to detect the QTL that is explicitly modelled in (2.1), although we acknowledge that several QTLs may contribute to the trait. Since the location $\tau$ of the QTL is unknown, our statistic takes the form of a stochastic process indexed by markers at loci $t$ distributed throughout the genome. Large values of this process indicate the likelihood of a QTL located near the markers where the large values occur. Initially we consider $N$ pairs of siblings, and for simplicity we assume that markers are completely informative, so the values of $\nu(t)$ are known with certainty.

Consider a sample of $N$ sib pairs with phenotypes $(Y_{11}, Y_{12}), \ldots, (Y_{N1}, Y_{N2})$ and data $\nu_1(t_i), \ldots, \nu_N(t_i)$ from markers $t_i$ distributed throughout the genome. Our basic modelling assumption is that the $(Y_{n1}, Y_{n2})$ are independent with a common distribution, which conditional on $\nu = \nu_n(\tau)$ is bivariate normal with common means $\mu$, variances $\sigma_Y^2$, and correlation (cf. (2.4))

$$\rho_\nu = \rho + \{\alpha_0(\nu - 1) - \delta_0[1_{\{\nu=1\}} - \tfrac{1}{2}]\}/\sigma_Y^2, \tag{3.1}$$

where $\rho = \operatorname{Cov}(Y_1, Y_2)/\sigma_Y^2$ is the unconditional sib correlation and

$$\alpha_0 = [\sigma_A^2 + \sigma_D^2]/2, \qquad \delta_0 = \sigma_D^2/2. \tag{3.2}$$

It is convenient to define $D_n = (Y_{n1} - Y_{n2})/2^{1/2}$ and $S_n = (Y_{n1} + Y_{n2} - 2\mu)/2^{1/2}$. For simplicity we assume initially that the parameters $\mu$ and $\sigma_Y^2$ are known and equal to 0 and 1, respectively. As we show below, this has no effect on the asymptotic theory. We also make the working assumption that the QTL $\tau$ is one of the $t_i$ (cf. Remark (iii) below), so the marginal log likelihood function, $l = l(\tau, \alpha_0, \delta_0, \rho)$, is given by

$$l = -2^{-1}\sum_n \left[\log(1 - \rho_{\nu_n}^2) + D_n^2/(1 - \rho_{\nu_n}) + S_n^2/(1 + \rho_{\nu_n})\right]. \tag{3.3}$$

Here $\nu_n = \nu_n(\tau)$, and $\rho_\nu$ is given by (3.1). This can be regarded as the conditional likelihood of the phenotypic data given $(\nu_n, n = 1, \ldots, N)$ or as the unconditional joint log likelihood of the phenotypic data and the $(\nu_n)$. All expectations are taken with respect to the joint distribution. Partial derivatives with respect to unknown parameters are denoted by appropriate subscripts. Let $C_n = C_n(\alpha_0, \delta_0, \rho)$ be defined by

$$C_n = \rho_{\nu_n}/(1 - \rho_{\nu_n}^2) + S_n^2/2(1 + \rho_{\nu_n})^2 - D_n^2/2(1 - \rho_{\nu_n})^2. \tag{3.4}$$
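The quantities in (3.1), (3.3) and (3.4) can be sketched in code as follows. This is an illustrative sketch only, assuming a known mean and unit phenotypic variance as in the text; the function names and the example numbers are hypothetical. The check at the end illustrates that the term in (3.4) is the derivative of one pair's contribution to (3.3) with respect to its correlation.

```python
import math


def rho_nu(nu, rho, alpha0, delta0, sigma_y2=1.0):
    """Conditional sib correlation given nu alleles IBD, formula (3.1)."""
    indicator = 1.0 if nu == 1 else 0.0
    return rho + (alpha0 * (nu - 1) - delta0 * (indicator - 0.5)) / sigma_y2


def pair_loglik(y1, y2, rho_n, mu=0.0):
    """Contribution of one sib pair to the log likelihood (3.3)."""
    d = (y1 - y2) / math.sqrt(2.0)
    s = (y1 + y2 - 2.0 * mu) / math.sqrt(2.0)
    return -0.5 * (math.log(1.0 - rho_n ** 2) + d * d / (1.0 - rho_n) + s * s / (1.0 + rho_n))


def c_term(y1, y2, rho_n, mu=0.0):
    """The quantity C_n of (3.4) for one sib pair."""
    d = (y1 - y2) / math.sqrt(2.0)
    s = (y1 + y2 - 2.0 * mu) / math.sqrt(2.0)
    return (rho_n / (1.0 - rho_n ** 2)
            + s * s / (2.0 * (1.0 + rho_n) ** 2)
            - d * d / (2.0 * (1.0 - rho_n) ** 2))


if __name__ == "__main__":
    # Hypothetical phenotypes and correlation, for illustration only.
    y1, y2, rho, h = 0.3, -1.1, 0.25, 1e-6
    numeric = (pair_loglik(y1, y2, rho + h) - pair_loglik(y1, y2, rho - h)) / (2.0 * h)
    print("numerical derivative:", numeric)
    print("C_n from (3.4):      ", c_term(y1, y2, rho))
```

Because the conditional correlation is linear in the linkage parameters, derivatives of the log likelihood with respect to those parameters are simple IBD-dependent multiples of this term.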

Regarding $\tau$ momentarily as known, we obtain the components of the efficient score:

$$l_\alpha(\tau) = \sum_n [\nu_n(\tau) - 1]\,C_n, \tag{3.5}$$

$$l_\delta(\tau) = \sum_n [\tfrac{1}{2} - 1_{\{\nu_n(\tau)=1\}}]\,C_n, \tag{3.6}$$

and

$$l_\rho = \sum_n C_n. \tag{3.7}$$

When $\alpha_0 = 0$ (hence also $\delta_0 = 0$), the three coordinates of the score vector are uncorrelated, so the Fisher information matrix is diagonal, with easily computed entries.

When it is assumed that there is no dominance variance, so $\delta_0 = 0$, the score statistic for testing $\alpha_0 = 0$ at a putative QTL $\tau = t$ is

$$Z_1(t) = l_\alpha(t)/I_{\alpha\alpha}^{1/2}, \tag{3.8}$$

where now $C_n = C_n(0, 0, \hat\rho)$ and $\hat\rho = N^{-1}\left[\sum_n Y_{1,n}Y_{2,n}\right]^+$ is the maximum likelihood estimator under the null hypothesis. Since the true value of $\tau$ is unknown, to scan an entire genome for linkage, we can use

$$\max_t Z_1(t), \tag{3.9}$$

where the max is taken over all marker loci. Linkage is detected whenever $\max_t Z_1(t) \geq b$ for a suitable threshold $b$. Thresholds to control the false detection rate have been discussed by Feingold et al. (1993) for markers equally spaced at distance $\Delta \geq 0$. See also Lander and Kruglyak (1995) and Lander and Schork (1994).

To allow for the possibility that $\delta_0 > 0$, we can define

$$Z_2(t) = l_\delta(t)/I_{\delta\delta}^{1/2} \tag{3.10}$$

and use the two degree of freedom statistic

$$[Z_1^2(t) + Z_2^2(t)]^{1/2}, \tag{3.11}$$

constrained by the relation that $0 \leq \delta_0 \leq \alpha_0$ (cf. (3.2)). Dupuis and Siegmund (2000) discuss this possibility in the context of sib pair analysis of qualitative traits.

REMARKS

(i) For unknown $\sigma_Y^2$, $\mu$, and $\rho$ one uses $\hat\mu = (2N)^{-1}\sum_n (Y_{1,n} + Y_{2,n})$, $\hat\sigma_Y^2 = (2N)^{-1}\sum_n [(Y_{1,n} - \hat\mu)^2 + (Y_{2,n} - \hat\mu)^2]$ and $\hat\rho = N^{-1}\left[\sum_n (Y_{1,n} - \hat\mu)(Y_{2,n} - \hat\mu)\right]^+ / \hat\sigma_Y^2$. The asymptotic results are unchanged since the scores for the segregation parameters $\sigma_Y^2, \mu, \rho$ are uncorrelated with those for the linkage parameters $\alpha_0, \delta_0$ under $\alpha_0 = \delta_0 = 0$ and asymptotically for local alternatives.
(ii) The components of variance analysis uses the relation for normally distributed $X$ that the variance of $X^2$ is twice the square of the variance of $X$, and hence it is not robust to violations of the assumed normality. A simple device to obtain a more robust test would be to refer the statistic given above to its conditional distribution given $(C_1, \ldots, C_N)$, where $C_n = C_n(0, 0, \hat\rho)$. This would make the type I error probability nonparametric with respect to the distribution of the phenotypes while maintaining full asymptotic efficiency if the normality hypothesis is satisfied. Asymptotically it amounts to replacing (3.8) and (3.10) by

$$Z_1(t) = l_\alpha(t)\Big/\Big[\sum_n C_n^2/2\Big]^{1/2}, \qquad Z_2(t) = l_\delta(t)\Big/\Big[\sum_n C_n^2/4\Big]^{1/2}, \tag{3.12}$$

which has the effect of using fourth moments of the phenotypic values to estimate variability.

(iii) We have assumed completely informative markers in order to simplify the analysis and have made the working assumption that the QTL $\tau$ is one of the markers. If either of these assumptions fails to be true, the likelihood function involves a mixture based on the conditional distribution of $\nu_n(\tau)$ given the marker data, say $M_n$, in the $n$th family. A convenient representation for the likelihood function is

$$E_0[\exp(l(\tau, \alpha_0, \delta_0, \rho) - l(0, 0, \rho)) \mid M, Y], \tag{3.13}$$

where $M = (M_1, \ldots, M_N)$, $Y = (Y_{11}, Y_{21}, \ldots, Y_{1N}, Y_{2N})$, $l$ is given by (3.3), and the subscript 0 denotes that the expectation is computed under the assumption that $\alpha_0 = \delta_0 = 0$. For the case of partially informative markers, on the evidence of simulations, Fulker and Cherny (1996) report very good success with the test to detect an additive effect that simply replaces $\nu_n$ in (3.3) by $\hat\nu_n = E_0(\nu_n \mid M_n, Y_n)$. Equation (3.5) provides an explanation for the Fulker-Cherny results, since the efficient score for testing $\alpha_0 = 0$ is precisely (3.5) with $C_n = C_n(0, 0, \hat\rho)$ and with $\nu_n$ replaced by $\hat\nu_n$. To prove this, differentiate the logarithm of (3.13) with respect to $\alpha_0$ and evaluate the result at $\alpha_0 = \delta_0 = 0$. See Teng and Siegmund (1998) for a discussion of the impact of marker informativeness and intermarker distance on the power to detect linkage.

The same calculation gives the efficient score when the QTL $\tau$ is located between markers. In principle one might extend the max in (3.9) from marker loci to all loci $t$, but in most cases there seems to be very little power gained by this device (cf. Darvasi et al. (1993); Dupuis and Siegmund (1999)), so we do not pursue the more complicated analysis.

Display (4.8) derived below gives an expression for the noncentrality $\xi = E[Z_1(\tau)]$. At a marker $t$ linked to $\tau$, $E[Z_1(t)] = \xi\exp[-4\lambda|t - \tau|]$. These relations are illustrated in Figure 1, where $\xi > b$, so the probability of detection is large. The asymptotic squared noncentrality of (3.11) at the QTL $\tau$ is given by (4.8) with $\alpha_0^2/2$ replaced by $\alpha_0^2/2 + \delta_0^2/4$.

Fig. 1. Expected value of the score statistic: QTL location is $\tau$, expectation at $\tau$ is $\xi$, detection threshold is $b$, and the plotted symbols denote simulated values at equally spaced marker loci.

In Table 1 we have used (4.8) to evaluate the sample size necessary to achieve 90% power for the score statistic (3.8) and various values of $\alpha_0$ and $\rho$. We consider a genome scan at 1 cM and assume a genome of 23 chromosomes of average length 140 cM. This yields a detection threshold of $b \approx 3.91$ (cf. Feingold et al. (1993) or the Appendix). We assume the trait locus is at zero recombination distance from the nearest marker. For 90% power the approximation of Feingold et al. (1993), given in the Appendix, leads to the value of $N$ that makes the squared noncentrality parameter given by (4.8) about equal to 25. The required sample size would increase by about 2% if the QTL is midway between markers. We have also included the sample size required for 90% power when markers are spaced at 10 cM and the QTL is midway between markers. The detection threshold decreases to approximately $b = 3.6$; but for a QTL midway between markers, we need an approximately 16-18% increase in sample size compared to the 1 cM spacing. Simulations indicate that these approximations are very accurate.

Table 1. Number of sib pairs for 90% power. The overall correlation between two siblings is $\rho$, and the locus-specific heritability is $2\alpha_0/\sigma_Y^2$. For $\Delta = 1$, the QTL is assumed to lie at a marker; a 2% larger sample size is required for a QTL midway between two flanking markers. For $\Delta = 10$, the QTL is assumed to lie midway between two markers.

$\rho$ | $\alpha_0/\sigma_Y^2$ | $N(\Delta = 1)$ | $N(\Delta = 10)$

An interesting comparison is available from Page et al. (1998), who have estimated the sample size required to have a probability of 0.9 for the LOD score evaluated at a single marker to exceed the conventional level of 3, or equivalently $b = 3.72$ on the normal scale, which we are using here. Although this is in principle quite different from the situation that arises when multiple, closely spaced markers are used, in the special case of 1 cM spacing and the QTL exactly at a marker, by a numerical accident their definition also leads to the value of $N$ that makes the squared noncentrality parameter equal to 25. Hence our analytic approximations can be compared with the results in Table 4 of Page et al. (1998), which were based on simulations of an additive model with one biallelic major gene and a polygenic component. The agreement is generally excellent.
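The sample-size rule described above (choose N so that the squared noncentrality is about 25) can be sketched for sib pairs as follows. This is a first-order illustration only: it uses the sib-pair case (s = 2) of the squared noncentrality (4.8), the function names are hypothetical, the example inputs are not taken from Table 1, and it does not include the refinements for marker spacing or the Appendix approximations.

```python
import math


def per_pair_sq_noncentrality(alpha0, rho, sigma_y2=1.0):
    """Squared noncentrality contributed by one sib pair: (4.8) with s = 2, N = 1."""
    return (alpha0 ** 2 / (2.0 * sigma_y2 ** 2)) * (1.0 + rho ** 2) / ((1.0 - rho) * (1.0 + rho)) ** 2


def sib_pairs_needed(alpha0, rho, target_sq_noncentrality=25.0, sigma_y2=1.0):
    """Smallest N for which N times the per-pair value reaches the target."""
    return math.ceil(target_sq_noncentrality / per_pair_sq_noncentrality(alpha0, rho, sigma_y2))


if __name__ == "__main__":
    # Hypothetical inputs: alpha0 / sigma_Y^2 = 0.1 and sib correlation 0.25.
    print(sib_pairs_needed(alpha0=0.1, rho=0.25))
```

Because the per-pair value scales with the square of the linkage parameter, halving the locus-specific effect roughly quadruples the required number of pairs.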
If the true mode of inheritance is roughly additive, the two-dimensional statistic based on (3.11), which under the conditions given above requires a threshold of about $b = 4.11$ (Dupuis and Siegmund, 2000), has less power than the corresponding one-dimensional statistic. Some numerical experimentation indicates that (3.11) is more efficient than (3.8) only for a fairly rare recessive allele, which must have a large effect for the QTL to be detectable with a reasonable sample size. Hence the simpler one-dimensional statistic would seem adequate in most situations.

4. LIKELIHOOD THEORY FOR SIBSHIPS OF ARBITRARY SIZE

Starting with Blackwelder and Elston (1982), a number of authors have observed on the basis of simulations that sibships of size $s$ provide considerably more power than sib pairs, perhaps as much as $s(s-1)/2$ independent sib pairs. See also Page et al. (1998); Williams and Blangero (1999). Suppose we have a sample of $N$ sibships, each of size $s$. We index sibs within a sibship by $i$ and $j$ and sibships by $n = 1, \ldots, N$. The subscript $n$ is often suppressed in our notation. For ease of exposition we again assume that $\mu = 0$ and $\sigma_Y^2 = 1$. Let $\nu_{ij}(t)$ denote the number of alleles shared identical by descent at the marker locus $t$ by the $i$th and $j$th sibs in the $n$th sibship. Let $A_\nu$ denote the $s \times s$ matrix with entries $\nu_{ij} - 1$ for $i \neq j$ and zeros along the diagonal. Let $\Sigma_\nu = E(YY' \mid A_\nu)$. The log likelihood for a single QTL at $\tau$ is $l = l(\tau, \alpha_0, \delta_0, \rho)$ given by

$$l = -2^{-1}\sum_{n=1}^{N}\{\log|\Sigma_\nu| + \operatorname{tr}(\Sigma_\nu^{-1}YY')\}, \tag{4.1}$$

where $\nu = \nu(\tau)$. Recall that if $G$ is a nonsingular matrix depending on $x$, then $\partial\log|G|/\partial x = \operatorname{tr}(G^{-1}\,\partial G/\partial x)$ and $\partial G^{-1}/\partial x = -G^{-1}(\partial G/\partial x)G^{-1}$. By differentiation of (4.1) we obtain the score equations

$$l_\alpha = 2^{-1}\sum_n \{-\operatorname{tr}(\Sigma_\nu^{-1}A_\nu) + \operatorname{tr}(\Sigma_\nu^{-1}A_\nu\Sigma_\nu^{-1}YY')\} \tag{4.2}$$

and

$$l_\rho = 2^{-1}\sum_n \{-\operatorname{tr}(\Sigma_\nu^{-1}B) + \operatorname{tr}(\Sigma_\nu^{-1}B\,\Sigma_\nu^{-1}YY')\}, \tag{4.3}$$

where $B = \partial\Sigma_\nu/\partial\rho = \mathbf{1}\mathbf{1}' - I$. We omit the similar expression for $l_\delta$ (cf. (3.6)). Also

$$l_{\alpha\alpha} = \sum_n \{2^{-1}\operatorname{tr}(\Sigma_\nu^{-1}A_\nu\Sigma_\nu^{-1}A_\nu) - \operatorname{tr}(\Sigma_\nu^{-1}A_\nu\Sigma_\nu^{-1}A_\nu\Sigma_\nu^{-1}YY')\}. \tag{4.4}$$

Similar expressions for $l_{\rho\rho}$, $l_{\alpha\rho}$, $l_{\delta\delta}$, etc. are easily obtained.

Let $E_0$ denote expectation under the hypothesis that $\alpha_0 = 0$ (hence also $\delta_0 = 0$) and let $\Sigma = E_0(YY') = (1 - \rho)I + \rho\mathbf{1}\mathbf{1}'$. It is easy to see that

$$I_{\alpha\alpha} = E_0(-l_{\alpha\alpha}) = (N/2)\operatorname{tr}E_0(\Sigma^{-1}A_\nu\Sigma^{-1}A_\nu) \tag{4.5}$$

and $I_{\alpha\rho} = E_0(l_\alpha l_\rho) = 0$. Hence the asymptotic noncentrality of the score statistic

$$Z_t = l_\alpha(t, 0, 0, \hat\rho)/I_{\alpha\alpha}^{1/2}(0, 0, \hat\rho) \tag{4.6}$$

can be evaluated by taking the expectation of (4.6) with $\hat\rho$ replaced by $\rho$. From (3.1) and (4.2) we obtain for $t = \tau$

$$E[l_\alpha(\tau, 0, 0, \rho)] = (N/2)\,\alpha_0\operatorname{tr}E(\Sigma^{-1}A_\nu\Sigma^{-1}A_\nu).$$

Hence by (4.5) the noncentrality of (4.6) is

$$\alpha_0\{(N/2)\operatorname{tr}E_0(\Sigma^{-1}A_\nu\Sigma^{-1}A_\nu)\}^{1/2}. \tag{4.7}$$

To evaluate (4.7), we first observe that

$$\Sigma^{-1} = K_\rho\{[1 + (s-1)\rho]I - \rho\mathbf{1}\mathbf{1}'\},$$

where $K_\rho = \{(1 - \rho)[1 + (s-1)\rho]\}^{-1}$. See, for example, Rao (1973, p. 67). It is easy to see that $\nu_{ij} = \nu_{ji}$, and for $i < j$ the $\nu_{ij}$ are pairwise independent whenever they differ for at least one subscript. Hence $EA_\nu^2 = [(s-1)/2]I$. By straightforward and somewhat tedious algebra we find that $E(A_\nu\mathbf{1}\mathbf{1}'A_\nu) = (s/2 - 1)I + (1/2)\mathbf{1}\mathbf{1}'$. Combining these results, we obtain

$$\operatorname{tr}E_0(\Sigma^{-1}A_\nu)^2 = K_\rho^2\binom{s}{2}\{[1 + (s-2)\rho]^2 + \rho^2\},$$

which can be substituted into (4.7) to obtain the square of the asymptotic noncentrality parameter for $N$ sibships of size $s$:

$$\xi^2 = \frac{N\alpha_0^2}{2\sigma_Y^4}\binom{s}{2}\frac{[1 + (s-2)\rho]^2 + \rho^2}{\{(1 - \rho)[1 + (s-1)\rho]\}^2}. \tag{4.8}$$

Although (4.8) increases rapidly with $s$, there are dependencies among the $\nu_{ij}$ within a sibship, so $l_\alpha$ has a skewed distribution when $s > 2$, and hence a larger threshold is required to maintain a fixed false positive error rate. In addition, standard asymptotic theory to the effect that the asymptotic variance of the score statistic for small, positive $\alpha_0$ is effectively the same as when $\alpha_0 = 0$ is not a reasonable approximation for large $s$ (hence relatively small sample sizes). Consequently, the increase in power with increasing $s$ is less than is suggested by considering only the increase in the asymptotic noncentrality parameter.

Numerical examples are given in Table 2, for which we used a more refined asymptotic analysis, which we describe in an Appendix and have spot checked for accuracy by simulations. (Even for $s = 2$ there is a small discrepancy between the sample sizes in Tables 1 and 2 because of the different approximations used.) We see from Table 2 that, except for large $s$ or large $\alpha_0$, a sibship of size $s$ turns out to be roughly as powerful as $\binom{s}{2}$ independent sib pairs.

For small sibships Williams and Blangero have also obtained the noncentrality parameter (4.8) and have computed sample sizes directly from (4.8) without the corrections mentioned in the preceding paragraph.
These can be misleading, especially if carried out for larger $s$. For example, for the third row of Table 2 their method would yield $N = 100$; for the sixth row it would yield $N = 12$. Our more precise asymptotic analysis seems to be consistent with the simulations of Page et al. (1998).

Table 2. Sample sizes required for 90% power. Sibships of size $s$; sib pair correlation is $\rho$; sample size is $N$; threshold is $b$; noncentrality is $\xi$; rel. eff. is the number of independent sib pairs needed to have the same power as one sibship of size $s$; see the text for definitions of other parameters.

$\rho$ | $\alpha_0/\sigma_Y^2$ | $s$ | $b$ | $\xi$ | $N$ | $\sigma_0^2$ | $\sigma_1^2$ | rel. eff.

Exactly as for sib pairs, we can obtain a distribution-free false positive rate if we consider the conditional distribution of $l_\alpha$ given the phenotypic values. Asymptotically that means that $l_\alpha$ should be standardized by $\{E_0[l_\alpha^2 \mid Y_1, \ldots, Y_N]\}^{1/2}$, which is given by the square root of a sum of terms of the form

$$
\begin{aligned}
E_0\{[-\operatorname{tr}(\Sigma^{-1}A_\nu) + Y'\Sigma^{-1}A_\nu\Sigma^{-1}Y]^2 \mid Y\}
={}& -\frac{s\mu_4}{(1-\rho)^4} - \frac{4s\mu_3\bar Y}{[1+(s-1)\rho](1-\rho)^3} + \frac{2s(s-3)\mu_2\bar Y^2}{[1+(s-1)\rho]^2(1-\rho)^2} \\
&+ \frac{s^2\mu_2^2}{(1-\rho)^4} - \frac{2s\rho\mu_2}{[1+(s-1)\rho](1-\rho)^3} + \frac{s(s-1)\{\rho[1+(s-1)\rho] + (1-\rho)\bar Y^2\}^2}{[1+(s-1)\rho]^4(1-\rho)^2},
\end{aligned}
\tag{4.9}
$$

where $\mu_k = s^{-1}\sum_i (Y_i - \bar Y)^k$ for $k = 2, 3, 4$. In the following section we find (4.9) useful for a completely different purpose.

Our methods can be adapted to study extended pedigrees, although it is more difficult to obtain explicit analytic results. An exception is the case of nuclear families consisting of parents and their children. In addition to the inter-sib correlation $\rho$, we let $\rho_p$ denote the parental correlation (due to shared environment only) and $\tilde\rho = (\sigma_A^2/2 + r\sigma_e^2)/\sigma_Y^2$ the parent-sib correlation. For $N$ nuclear families, each containing $s$ sibs, the squared noncentrality equals

$$
\frac{N\alpha_0^2}{2\sigma_Y^4}\binom{s}{2}
\frac{[1 - 2\tilde\rho^2/(1+\rho_p) + (s-2)(\rho - 2\tilde\rho^2/(1+\rho_p))]^2 + (\rho - 2\tilde\rho^2/(1+\rho_p))^2}
{[1 - 2\tilde\rho^2/(1+\rho_p) + (s-1)(\rho - 2\tilde\rho^2/(1+\rho_p))]^2(1-\rho)^2}.
$$

This is somewhat larger than (4.8) for the siblings alone. For a completely additive trait with $\rho = 0.25$, for two sibs the squared noncentrality with parents included is 15% larger than without parents. For five sibs it is only 7% larger.
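The squared noncentrality (4.8) and its nuclear-family counterpart can be compared numerically with a short sketch. The function and argument names are hypothetical. For the "completely additive trait with sib correlation 0.25" example, the code assumes a parent-sib correlation of 0.25 and a parental correlation of 0 (no shared environment); these two values are an interpretation of the example, not stated explicitly in the text.

```python
from math import comb


def sq_noncentrality_sibs(alpha0, rho, s, n_families=1, sigma_y2=1.0):
    """Squared noncentrality (4.8) for n_families sibships of size s."""
    numerator = (1.0 + (s - 2) * rho) ** 2 + rho ** 2
    denominator = ((1.0 - rho) * (1.0 + (s - 1) * rho)) ** 2
    return n_families * alpha0 ** 2 / (2.0 * sigma_y2 ** 2) * comb(s, 2) * numerator / denominator


def sq_noncentrality_with_parents(alpha0, rho, rho_parent_sib, rho_p, s,
                                  n_families=1, sigma_y2=1.0):
    """Nuclear-family version: sib variance and covariance are both reduced by
    2 * rho_parent_sib**2 / (1 + rho_p) after conditioning on the two parents."""
    adjustment = 2.0 * rho_parent_sib ** 2 / (1.0 + rho_p)
    var_adj = 1.0 - adjustment
    cov_adj = rho - adjustment
    numerator = (var_adj + (s - 2) * cov_adj) ** 2 + cov_adj ** 2
    denominator = ((var_adj + (s - 1) * cov_adj) * (1.0 - rho)) ** 2
    return n_families * alpha0 ** 2 / (2.0 * sigma_y2 ** 2) * comb(s, 2) * numerator / denominator


if __name__ == "__main__":
    for s in (2, 5):
        without = sq_noncentrality_sibs(0.1, 0.25, s)
        with_parents = sq_noncentrality_with_parents(0.1, 0.25, 0.25, 0.0, s)
        print(s, round(with_parents / without, 3))
```

Under these assumptions the ratio is close to 1.15 for two sibs and 1.07 for five sibs, in line with the percentages quoted above. Setting the parent-sib correlation to zero recovers (4.8).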

5. SELECTIVE GENOTYPING

In some cases it may be relatively easy and inexpensive to phenotype individuals. When the cost of phenotyping is indeed small compared to the cost of genotyping, it is possible to achieve an increase in power by genotyping only sib pairs with particularly favourable phenotypes. See, for example, Risch and Zhang (1995), who recommend using sib pairs with discordant phenotypes, and Xu et al. (1999) for an application. In this section we study Risch and Zhang's suggestion.

We begin by considering instead of $l_\alpha$ the simpler statistic suggested by Risch and Zhang,

$$-\sum(\nu - 1)/(N/2)^{1/2}, \tag{5.1}$$

where the summation extends over all genotyped sib pairs. Also assume for simplicity that $\mu = 0$, $\sigma_Y^2 = 1$. For a given sib pair, by Bayes' formula one can easily show that $E(\nu - 1 \mid D, S)$ is given by a fraction, the numerator of which equals

$$\sum_i P(\nu = i)(i - 1)\,\phi[S/(1 + \rho_i)^{1/2}]\,\phi[D/(1 - \rho_i)^{1/2}]/(1 - \rho_i^2)^{1/2},$$

while the denominator is a similar expression without the factor $(i - 1)$. For small values of $\alpha_0$, we see from the first term of a Taylor series expansion that

$$E(\nu - 1 \mid D, S) \approx \alpha_0 E_0(\nu - 1)^2\,[\rho/(1 - \rho^2) + S^2/2(1 + \rho)^2 - D^2/2(1 - \rho)^2]. \tag{5.2}$$

It is evident from (5.2) that sib pairs with large values of $|D|$ are particularly informative. If we genotype only those sib pairs whose phenotypes satisfy $|D| > t$, for each genotyped pair we have at the QTL an asymptotic noncentrality of

$$-2^{1/2}E(\nu - 1 \mid |D| > t) \approx [\alpha_0/2 \cdot 2^{1/2}(1 - \rho)]\,t'\phi(t')/\{1 - \Phi(t')\}, \tag{5.3}$$

where $t' = t/(1 - \rho)^{1/2}$. For a numerical example suppose that $\rho = 0.25$ (corresponding to a heritability of 0.50 for a purely additive trait) and $t$ = [...]. Then (5.3) equals $2.16\alpha_0$, while the noncentrality of a random sib pair is $0.78\alpha_0$. Hence only about one-eighth as many discordant sib pairs need be genotyped as random sib pairs. On the other hand, roughly 20 random sib pairs must be phenotyped to find one discordant sib pair.
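The numerical example for (5.3) can be sketched as follows. The threshold value is missing from the text as transcribed, so the code assumes that a discordant pair is one in the most extreme 5% of the two-sided distribution of D, which is an inference from the statement that roughly 20 random pairs must be phenotyped to find one discordant pair. The function names are hypothetical, and the random-pair value is taken from (4.8) with s = 2 and N = 1.

```python
import math
from statistics import NormalDist


def discordant_pair_coefficient(t, rho):
    """Coefficient of alpha0 in (5.3) for pairs selected by |D| > t."""
    std_normal = NormalDist()
    t_std = t / math.sqrt(1.0 - rho)
    hazard = std_normal.pdf(t_std) / (1.0 - std_normal.cdf(t_std))
    return t_std * hazard / (2.0 * math.sqrt(2.0) * (1.0 - rho))


def random_pair_coefficient(rho):
    """Coefficient of alpha0 in the noncentrality of one random sib pair."""
    return math.sqrt((1.0 + rho ** 2) / 2.0) / ((1.0 - rho) * (1.0 + rho))


if __name__ == "__main__":
    rho = 0.25
    # Assumed selection rule: P(|D| > t) = 0.05 under the null, with Var(D) = 1 - rho.
    t = NormalDist().inv_cdf(0.975) * math.sqrt(1.0 - rho)
    selected = discordant_pair_coefficient(t, rho)
    unselected = random_pair_coefficient(rho)
    print("threshold t:", round(t, 3))
    print("selected:", round(selected, 2), "random:", round(unselected, 2))
    print("pairs ratio:", round((selected / unselected) ** 2, 2))
```

Under that assumption the two coefficients agree with the values 2.16 and 0.78 quoted in the text, and the squared ratio is close to eight.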
Other methods of selecting the sib pairs to be genotyped can be handled by modifications of the preceding argument. For the situation described above, the preferred definition of Risch and Zhang (1995) is about 82% as efficient as our suggestion. The case of concordant sib pairs, defined by $\max(Y_1, Y_2) < -t$ or $\min(Y_1, Y_2) > t$, can be treated similarly. Now the value of $t$ corresponding to the preceding examples is 1.19, and the asymptotic noncentrality is $1.49\alpha_0$.

Instead of the ad hoc statistic (5.1), one might consider the score statistic (3.8). Now the unknown nuisance parameter $\rho$ (and in general $\mu$ and $\sigma_Y$ also) must be estimated. This poses no problem if, as we assume, the sib pairs to be genotyped are selected from a random sample of sib pairs of known phenotypes, which are available to estimate the nuisance parameters. This will typically be a very large sample, ensuring that the nuisance parameters are estimated accurately. By some routine Taylor series expansions one sees that for purposes of asymptotic analysis one can regard them as known. Before proceeding, it is worth noting, however, that the situation would be quite different if the sib pairs are ascertained through their phenotypes, so the natural estimates of nuisance parameters are biased, perhaps severely. See, for example, Beaty and Liang (1987) for ascertainment corrections.

An advantage of the score statistic over (5.1) is that it generalizes naturally to the case of larger sibships. For the score statistic (4.2), the arguments given above show that

$$E(l_\alpha \mid Y_1, \ldots, Y_N) \approx (\alpha_0/4)\sum_{n=1}^{N}E_0\{[-\operatorname{tr}(\Sigma^{-1}A_\nu) + Y'\Sigma^{-1}A_\nu\Sigma^{-1}Y]^2 \mid Y\}. \tag{5.4}$$

We define a discordant sibship to be one in which the squared norm of the $(s-1)$-dimensional vector of orthogonal contrasts exceeds a threshold $t$. To facilitate the analysis of (5.4) we make the (Helmert) orthogonal transformation from $Y = (Y_1, \ldots, Y_s)'$ defined by $Z_s = (Y_1 + \cdots + Y_s)/s^{1/2} = s^{1/2}\bar Y$, and for $i = 1, \ldots, s-1$

$$Z_i = [(i+1)i]^{-1/2}\Big[\sum_{j=1}^{i}Y_j - iY_{i+1}\Big].$$

Under the probability $P_0$, $Z_1, \ldots, Z_s$ are independent and normally distributed with $\operatorname{Var}_0 Z_s = 1 + (s-1)\rho$, and $\operatorname{Var}_0 Z_i = 1 - \rho$, $i = 1, \ldots, s-1$. The variables $Z_1, \ldots, Z_{s-1}$ are the orthogonal contrasts in the $Y$, so our definition of discordant is that $Z_1^2 + \cdots + Z_{s-1}^2 > t$.

An expression for the expectation in (5.4) is given in (4.9). To find the asymptotic noncentrality parameter, we evaluate the expectation of (4.9) given that $Z_1^2 + \cdots + Z_{s-1}^2 > t$. It is straightforward to compute each term, except possibly those involving $\sum_{i=1}^{s}Y_i^k$ for $k = 3, 4$. Since our definition of discordance is symmetric in the $Y$, all terms in the sum have the same (conditional) expectation, and from the inverse of the Helmert transformation, we see that $Y_s = -(s-1)^{1/2}Z_{s-1}/s^{1/2} + Z_s/s^{1/2}$. Hence these expectations are also readily evaluated, and we obtain the following expression:

$$
\begin{aligned}
E_0\{[-\operatorname{tr}(\Sigma^{-1}A_\nu) + Y'\Sigma^{-1}A_\nu\Sigma^{-1}Y]^2 \mid Z_1^2 + \cdots + Z_{s-1}^2 > t\}
={}& \frac{1}{(1-\rho)^2}\Big[1 - \frac{3(s-1)}{s(s+1)}\Big]\frac{2t'^2 f_{s-1}(t')}{F_{s-1}(t')} \\
&+ \frac{(s^3 - 3s^2 + s + 3)\rho + s^2 - 3}{s[1 + (s-1)\rho](1-\rho)^2}\,\frac{2t' f_{s-1}(t')}{F_{s-1}(t')} \\
&+ \frac{s(s-1)\{[1 + (s-2)\rho]^2 + \rho^2\}}{[1 + (s-1)\rho]^2(1-\rho)^2},
\end{aligned}
$$

where $t' = t/(1 - \rho)$, and $f_s$, $F_s$ are respectively the density and right tail distribution functions of a $\chi_s^2$ variable. Some examples of the efficiency gained by selective genotyping of the most discordant 10% of a sample of sibships of size $s$ are given in Figure 2.
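The Helmert transformation and the resulting definition of a discordant sibship can be sketched as follows. The function names and the example phenotypes are hypothetical; the code simply implements the contrasts defined above and illustrates that the squared norm of the first s − 1 contrasts equals the sum of squared deviations from the sibship mean.

```python
import math


def helmert(y):
    """Helmert transform of a sibship's phenotypes.

    Returns [Z_1, ..., Z_{s-1}, Z_s]: the first s - 1 entries are the
    orthogonal contrasts and the last is sqrt(s) times the sibship mean.
    """
    s = len(y)
    z = []
    for i in range(1, s):
        # Z_i = [sum of the first i values - i * Y_{i+1}] / sqrt(i (i + 1))
        z.append((sum(y[:i]) - i * y[i]) / math.sqrt(i * (i + 1)))
    z.append(sum(y) / math.sqrt(s))
    return z


def is_discordant(y, t):
    """True if the squared norm of the orthogonal contrasts exceeds t."""
    contrasts = helmert(y)[:-1]
    return sum(v * v for v in contrasts) > t


if __name__ == "__main__":
    y = [1.0, 2.0, 4.0, -1.0]  # hypothetical phenotypes, mean 0 and variance 1 assumed
    z = helmert(y)
    y_bar = sum(y) / len(y)
    print("contrast norm:", sum(v * v for v in z[:-1]))
    print("sum of squared deviations:", sum((v - y_bar) ** 2 for v in y))
    print("discordant at t = 10:", is_discordant(y, 10.0))
```

Because the transformation is orthogonal, selection on the contrast norm is symmetric in the sibs, which is the property used above to reduce the conditional moments to those of a single sib.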
For small $s$ the gain in efficiency compared to genotyping random sibships is quite large; but as the size of the sibship increases the relative value of selective genotyping decreases while, as we saw in the preceding section, the unconditional value of the sibship increases. For example, a random sibship of size 4 is about as powerful as a selected sib pair from the most discordant 10% of the population.

6. ROBUSTNESS

As we indicated above, by conditioning on the phenotypic values it is possible to make the statistics we have considered nonparametric with respect to the false-positive error rate. In this section we make a brief study of robustness of the power of these tests when a different model is assumed to be true, in particular for the model having a major gene with two alleles and a normally distributed residual. For simplicity we consider only sib pairs.

The standardized nonparametric version of the score statistic is $l_\alpha/[\sum_n C_n^2/2]^{1/2}$. Since the expected value of the numerator, $E(l_\alpha)$, is computed without making distributional assumptions, to evaluate the noncentrality parameter we need only evaluate $E_0C^2$, which under the assumption of normality equals $(1 + \rho^2)/(1 - \rho^2)^2$. Consider the excess $E_0C^2 - (1 + \rho^2)/(1 - \rho^2)^2$ over this normal-theory value. Algebraic expressions for the excess are straightforward, albeit somewhat complicated, for the case of a single biallelic QTL and normally distributed residual $e$. Table 3 contains numerical examples of the percentage increase in $E_0C^2$, which is also the percentage increase in sample size that would be required to maintain the same power as determined for the assumed components of variance model. For the most part, using the components of variance model when a biallelic major gene model is correct has a negligible effect on the power, but the effect can be substantial in the case of a rare recessive allele of large effect.
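The normal-theory value of the second moment used above, and the way a different true value translates into a percentage change in sample size, can be sketched as follows. This is an illustrative sketch with hypothetical function names. It relies only on the facts that, under normality and no linkage, S and D are independent with variances 1 + ρ and 1 − ρ, that C has mean zero, and that the variance of the square of a normal variable is twice the square of its variance. The value passed as the true second moment in the usage lines is a made-up number, not an entry of Table 3.

```python
def c_second_moment_normal(rho):
    """E_0[C^2] under normality, computed term by term from (3.4).

    C has mean zero under the null, so E_0[C^2] = Var(C), and the S^2 and
    D^2 terms are independent with Var(S^2) = 2(1 + rho)^2 and
    Var(D^2) = 2(1 - rho)^2.
    """
    var_s_term = 2.0 * (1.0 + rho) ** 2 / (2.0 * (1.0 + rho) ** 2) ** 2
    var_d_term = 2.0 * (1.0 - rho) ** 2 / (2.0 * (1.0 - rho) ** 2) ** 2
    return var_s_term + var_d_term


def percent_sample_size_increase(true_c_second_moment, rho):
    """Percentage by which E_0[C^2] under the true model exceeds the normal value."""
    normal_value = (1.0 + rho ** 2) / (1.0 - rho ** 2) ** 2
    return 100.0 * (true_c_second_moment - normal_value) / normal_value


if __name__ == "__main__":
    rho = 0.25
    print(c_second_moment_normal(rho), (1.0 + rho ** 2) / (1.0 - rho ** 2) ** 2)
    # Hypothetical non-normal second moment, 10% above the normal-theory value.
    print(percent_sample_size_increase(1.1 * c_second_moment_normal(rho), rho))
```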

Fig. 2. Number of genotyped sibships for selected (proportion = 0.1) and unselected genotyping.

It is possible that a likelihood analysis of the correct model would produce a completely different and more efficient statistic. To simplify the notation we assume there is no dominance deviation. Let $g_i$ denote the indicator that the allele inherited from the $i$th parent is $A_1$. Then we can write $Y = \mu + a(g_1 + g_2 - 2p) + e$. Let $p(g_1, g_2 \mid \nu)$ denote the conditional distribution of $(g_1, g_2)$ given $\nu$. The likelihood function for a single pedigree is the mixture

$$\sum_{g_1, g_2}\frac{p(g_1, g_2 \mid \nu)}{\sigma_e^2(1 - r^2)^{1/2}}\,\phi\left[\frac{S - a(g_1 + g_2 - 2p)/2^{1/2}}{\sigma_e(1 + r)^{1/2}}\right]\phi\left[\frac{D - a(g_1 - g_2)/2^{1/2}}{\sigma_e(1 - r)^{1/2}}\right]. \tag{6.1}$$

If we take the first two terms of the Taylor series expansion of (6.1) about $a = 0$, we obtain after some calculation that

$$
\begin{aligned}
(6.1) \approx{}& \frac{\phi[S/\sigma_e(1 + r)^{1/2}]\,\phi[D/\sigma_e(1 - r)^{1/2}]}{\sigma_e^2(1 - r^2)^{1/2}} \\
&\times\left\{1 + \alpha_0\left(\frac{D^2 - \sigma_e^2(1 - r)}{(1 - r)^2\sigma_e^4}\Big[\frac{1}{2} - (\nu - 1)/2\Big] + \frac{S^2 - \sigma_e^2(1 + r)}{(1 + r)^2\sigma_e^4}\Big[\frac{3}{2} + (\nu - 1)/2\Big]\right)\right\},
\end{aligned}
\tag{6.2}
$$

where, as above, $\alpha_0 = \sigma_A^2/2 = pqa^2$.

The efficient score for testing $\alpha_0 = 0$ is the logarithmic derivative of the likelihood function evaluated at $\alpha_0 = 0$. This is just the coefficient of $\alpha_0$ in (6.2). Any term not involving $\nu$ can be omitted, and unknown parameters must be estimated, so after summing over all sib pairs the statistic becomes

$$\sum(\nu - 1)\left\{-\frac{D^2 - \hat\sigma_e^2(1 - \hat r)}{2\hat\sigma_e^4(1 - \hat r)^2} + \frac{S^2 - \hat\sigma_e^2(1 + \hat r)}{2\hat\sigma_e^4(1 + \hat r)^2}\right\}. \tag{6.3}$$

Table 3. Percentage increase in sample size for biallelic major gene. The column headed % change gives the percentage increase in sample size for a biallelic major gene (cf. (2.2)) relative to the assumed oligogenic model with the same variance components. The major gene contributes 25% of the trait variance; its additive effect is $a$, dominance deviation is $d$, and allele frequency is $p$; the sib correlation is $\rho$. Columns: $a$, $d$, $p$, $\rho$, % change.

The statistic (6.3) is exactly of the form of the score statistic for the components of variance model, except that $\hat\sigma_e^2$ appears in place of $\hat\sigma_Y^2$ and $\hat r$ appears in place of $\hat\rho$ (cf. (3.5) ff.). However, the estimates for these parameters are calculated under the condition $\alpha_0 = 0$, and in spite of the difference in notation the parameters, hence the estimates, are the same under that hypothesis. Thus the score statistics for the two-allele model with normal residuals and for our components of variance model are the same statistic. This provides further evidence of the robustness of our components of variance test.

7. DISCUSSION

For an oligogenic model with normally distributed phenotypic data, we have introduced a parametrization that makes the linkage parameters orthogonal to the segregation parameters and hence allows us to compute explicitly score statistics, Fisher information matrices, and noncentrality parameters in a number of important special cases. We have evaluated the asymptotic noncentrality parameter for sibships of arbitrary size, which suggests what others have observed as a result of simulations, that a sibship of size $s$ can be roughly as powerful as $\binom{s}{2}$ independent sib pairs. Our more precise analysis shows that this assessment is overly optimistic when $s$ and $\alpha_0$ are large, but large sibships are, nevertheless, extremely valuable even in these cases.

We have evaluated the power of genotyping a selected subset of sibships defined by their phenotypes, which we select from a large random sample of sibships. The relative value of selective genotyping decreases rapidly as the sibship size increases.

The Haseman–Elston regression statistic (Haseman and Elston, 1972) can be derived as a special case of the calculations of Section 3. One ignores $S_1, \ldots, S_N$ and starts from the likelihood function for $D_1, \ldots, D_N$, then uses the robust version of the score statistic suggested in (3.12). Compared to the fully efficient likelihood analysis, for moderate phenotypic correlation (0.25) between sibs, Haseman–Elston regression is about 75% efficient when the mode of inheritance is additive. The modified Haseman–Elston statistic (Elston et al., 2000) is more (less) efficient than the classical statistic for small (large) correlation between sibs. It also is about 75% efficient for moderate correlation. See also Teng (1996) and Wright (1997).

In sibships the dominance component of variance contributes to the noncentrality of the score statistic designed to detect an additive component. Consequently, even if there is a large dominance component but we model only the additive components, our loss of efficiency is usually relatively modest. Even for rare recessively acting alleles of relatively large effect, the loss rarely exceeds 10–20% of the sample size.

Based on a conditioning argument, we have suggested a modified statistic, which is nonparametric under the hypothesis of no linkage and can be expected to be robust to moderate departures from normality. We have briefly discussed robustness against a true model involving a (biallelic) major gene. It appears that our model is robust if the allelic substitutions have small phenotypic effect, or modest effect but small dominance deviation. An interesting case deserving more careful attention involves a major gene having rare alleles of large effect.
We expect to discuss gene–gene and gene–environment interactions in a future paper.

ACKNOWLEDGEMENTS

This research was partly supported by NIH Grant HG. The authors thank two referees and the associate editor for their thoughtful suggestions.

APPENDIX: BETTER APPROXIMATIONS FOR SIBSHIPS OF SIZE s

Because of the dependence among the $\nu_{ij}$ (which are pairwise, but not mutually, independent), the null distribution of $l_\alpha$ is skewed when the number of siblings is $s \ge 3$. To deal with a similar problem involving qualitative traits, Tu and Siegmund (1999) suggested a p-value approximation that uses the third moment to correct for skewness. Let $\beta$ be the one-sided derivative at 0 of $\mathrm{Cov}(Z_0, Z_t)$, $\gamma$ be $N^{1/2}$ times the third moment of $Z_t$ under the hypothesis of no linkage, $\theta = [-1 + (1 + 2b\gamma/N^{1/2})^{1/2}]/\gamma$, and $\nu = \nu[b(2\beta\Delta)^{1/2}]$ the special function defined by Siegmund (1985, p. 82). The following is a slight modification of the approximation of Tu and Siegmund (1999): for a chromosome of length $L$ with markers equally spaced at distance $\Delta$,

$$P_0\Big\{ \max_{0 \le i\Delta \le L} Z_{i\Delta} > b \Big\} \approx [2\pi(1 + \gamma\theta)]^{-1/2} \{ 1/(\theta N^{1/2}) + \nu \beta L b \} \exp[-N\theta^2 (1 + 2\gamma\theta/3)/2].$$

Substantial calculation shows that the value of $\gamma$ equals

$$\frac{[3/2] \binom{s}{3} \{ [1 + (s - 2)\rho]^3 + (3s - 10)\rho^3 + 3\rho^2 \}}{\{ [\binom{s}{2}/2] [(1 + (s - 2)\rho)^2 + \rho^2] \}^{3/2}}.$$

As a function of $\rho$ this ratio is practically constant, so in evaluating the thresholds in Table 2 we have used the value for $\rho = 0$.
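The quantities in this approximation are simple to evaluate numerically. The sketch below is illustrative only: the function names are hypothetical, $\beta$, $\Delta$ and $L$ must be supplied by the user in consistent units, and the special function $\nu(x)$ is replaced by the commonly used approximation $\nu(x) \approx \exp(-0.583x)$, which is an assumption of this sketch and is adequate only for small $x$ (roughly $x < 2$).

```python
import math


def skewness_gamma(s, rho):
    """Skewness parameter gamma for sibships of size s >= 3."""
    num = 1.5 * math.comb(s, 3) * (
        (1 + (s - 2) * rho) ** 3 + (3 * s - 10) * rho**3 + 3 * rho**2
    )
    den = ((math.comb(s, 2) / 2)
           * ((1 + (s - 2) * rho) ** 2 + rho**2)) ** 1.5
    return num / den


def tilt_theta(b, gamma, n):
    """theta = [-1 + (1 + 2 b gamma / sqrt(n))^(1/2)] / gamma.

    Algebraically, theta solves theta + gamma * theta^2 / 2 = b / sqrt(n);
    as gamma -> 0 it reduces to b / sqrt(n).
    """
    if gamma == 0:
        return b / math.sqrt(n)
    return (-1 + math.sqrt(1 + 2 * b * gamma / math.sqrt(n))) / gamma


def nu_approx(x):
    """Assumed small-x approximation to Siegmund's nu function."""
    return math.exp(-0.583 * x)


def skew_corrected_pvalue(b, n, gamma, beta, spacing, length):
    """Skewness-corrected tail approximation for max Z over the markers."""
    theta = tilt_theta(b, gamma, n)
    nu = nu_approx(b * math.sqrt(2 * beta * spacing))
    front = (2 * math.pi * (1 + gamma * theta)) ** -0.5
    middle = 1 / (theta * math.sqrt(n)) + nu * beta * length * b
    return front * middle * math.exp(
        -n * theta**2 * (1 + 2 * gamma * theta / 3) / 2
    )
```

Evaluating `skewness_gamma(s, rho)` over a grid of `rho` for fixed `s` is one way to see how little the ratio varies with the sib correlation. With `gamma = 0` the p-value function reduces to the uncorrected Gaussian form $\varphi(b)\{1/b + \nu \beta L b\}$.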

To determine the sample sizes in Table 2, we have used suitable versions of the power approximations provided by Feingold et al. (1993):

$$P\Big( \max_k Z_{k\Delta} > b \Big) \approx 1 - \Phi((b - \xi)/\sigma_0) + \varphi((b - \xi)/\sigma_0) \left[ \frac{2\nu}{b\sigma_0/\sigma_1^2 + (\xi - b)/\sigma_0} - \frac{\nu^2}{2b\sigma_0/\sigma_1^2 + (\xi - b)/\sigma_0} \right],$$

where $\nu = \nu[b(2\beta\Delta)^{1/2}/\sigma_1]$. This approximation, which is valid when there is a marker at the QTL $\tau$, involves (i) the probability that the statistic $Z_\tau$ is above the detection threshold $b$ (or that the statistic exceeds $b$ at one or both of the two markers flanking $\tau$ in the case that $\tau$ is not itself a marker) and (ii) the probability that the statistic is below the threshold at $\tau$ but $Z_t \ge b$ at some nearby marker $t$. To implement this approximation we require the mean and variance ($\sigma_0^2$) of $Z_\tau$ and the conditional mean and variance ($\sigma_1^2$) of $Z_t - Z_\tau$ given $Z_\tau$. See Tang (2000) for details.

REFERENCES

ALMASY, L. AND BLANGERO, J. (1998). Multipoint quantitative-trait linkage analysis in general pedigrees. American Journal of Human Genetics 62.
AMOS, C. I. (1994). Robust variance-components approach for assessing genetic linkage in pedigrees. American Journal of Human Genetics 54.
BEATY, T. H. AND LIANG, K. Y. (1987). Robust inference for variance components models in families ascertained through probands: I. Conditioning on the proband's phenotype. Genetic Epidemiology 4.
BLACKWELDER, W. C. AND ELSTON, R. C. (1982). Power and robustness of sib-pair linkage tests and extension to larger sibships. Commun. Statist.-Theor. Meth. 11.
COX, D. R. AND HINKLEY, D. V. (1974). Theoretical Statistics. London: Chapman and Hall.
DARVASI, A., WEINREB, A., MINKE, V., WELLER, J. I. AND SOLLER, M. (1993). Detecting marker-QTL linkage and estimating QTL gene effect and map location using a saturated genetic map. Genetics 134.
DUPUIS, J. AND SIEGMUND, D. (2000). Boundary crossing probabilities in linkage analysis. In Thomas Bruss, F. and Le Cam, L. (eds), Game Theory, Optimal Stopping, Probability and Statistics, Hayward, CA: Institute of Mathematical Statistics.
DUPUIS, J. AND SIEGMUND, D. (1999). Statistical methods for mapping quantitative trait loci from a dense set of markers. Genetics 151.
DUPUIS, J., BROWN, P. AND SIEGMUND, D. (1995). Statistical methods for linkage analysis of complex traits from high resolution maps of identity by descent. Genetics 140.
EAVES, L. AND MEYER, J. (1994). Locating human quantitative trait loci: guidelines for the selection of sibling pairs for genotyping. Behavior Genetics 24.
ELSTON, R., BUXBAUM, S., JACOBS, K. B. AND OLSON, J. M. (2000). Haseman and Elston revisited. Genetic Epidemiology 19.
FEINGOLD, E., BROWN, P. O. AND SIEGMUND, D. (1993). Gaussian models for genetic linkage analysis using complete high resolution maps of identity-by-descent. American Journal of Human Genetics 53.
FISHER, R. A. (1918). The correlation of relatives on the assumption of Mendelian inheritance. Proc. Roy. Soc. Edinburgh.
FULKER, D. W. AND CHERNY, S. S. (1996). An improved multipoint sib pair analysis of quantitative traits. Behavior Genetics 26.
FULKER, D. W. AND CARDON, L. R. (1994). A sib-pair approach to interval mapping of quantitative trait loci. American Journal of Human Genetics 54.
HASEMAN, J. K. AND ELSTON, R. C. (1972). The investigation of linkage between a quantitative trait and a marker locus. Behavior Genetics 2.
KEMPTHORNE, O. (1955). Genetic Statistics. New York: Wiley.
KRUGLYAK, L. AND LANDER, E. S. (1995). Complete multipoint sib pair analysis of qualitative and quantitative traits. American Journal of Human Genetics 57.
LANDER, E. S. AND KRUGLYAK, L. (1995). Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nature Genetics 11.
LANDER, E. S. AND SCHORK, N. J. (1994). Genetic dissection of complex traits. Science 265.
PAGE, G. P., AMOS, C. I. AND BOERWINKLE, E. (1998). The quantitative LOD score: test statistic and sample size for exclusion and linkage of quantitative traits in human sibships. American Journal of Human Genetics 62.
RAO, C. R. (1973). Linear Statistical Inference and Its Applications, 2nd edn. New York: Wiley.
RISCH, N. AND ZHANG, H. P. (1995). Extreme discordant sib pairs for mapping quantitative trait loci in humans. Science 268.
SIEGMUND, D. (1985). Sequential Analysis: Tests and Confidence Intervals. New York: Springer.
TANG, H.-K. (2000). Using variance components to map quantitative trait loci in humans, Ph.D. Thesis, Stanford University.
TENG, J. (1996). Statistical methods in linkage analysis, Ph.D. Thesis, Stanford University.
TENG, J. AND SIEGMUND, D. (1998). Multipoint linkage analysis using affected relative pairs and partially informative markers. Biometrics 54.
TU, I-PING AND SIEGMUND, D. (1999). The maximum of a function of a Markov chain and application to linkage analysis. Advances in Applied Probability 31.
WILLIAMS, J. T. AND BLANGERO, J. (1999). Power of variance component linkage analysis to detect quantitative trait loci. Annals of Human Genetics 63.
WRIGHT, F. (1997). The phenotypic difference discards sib-pair QTL linkage information. American Journal of Human Genetics 60.
XU, X., ROGUS, J. J., TERWEDOW, H. A., YANG, J., WANG, Z., CHEN, C., NIU, T., WANG, B., XU, H., WEISS, S., SCHORK, N. J. AND FANG, Z. (1999). An extreme-sib-pair genome scan for genes regulating blood pressure. American Journal of Human Genetics 64.

[Received March 6, 2000; revised June 28, 2000; accepted for publication June 29, 2000]


More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

FULL LIKELIHOOD INFERENCES IN THE COX MODEL October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach

More information

The Lander-Green Algorithm. Biostatistics 666 Lecture 22

The Lander-Green Algorithm. Biostatistics 666 Lecture 22 The Lander-Green Algorithm Biostatistics 666 Lecture Last Lecture Relationship Inferrence Likelihood of genotype data Adapt calculation to different relationships Siblings Half-Siblings Unrelated individuals

More information

Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems

Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems Analysis of the AIC Statistic for Optimal Detection of Small Changes in Dynamic Systems Jeremy S. Conner and Dale E. Seborg Department of Chemical Engineering University of California, Santa Barbara, CA

More information

Partitioning the Genetic Variance

Partitioning the Genetic Variance Partitioning the Genetic Variance 1 / 18 Partitioning the Genetic Variance In lecture 2, we showed how to partition genotypic values G into their expected values based on additivity (G A ) and deviations

More information

Lecture 11: Multiple trait models for QTL analysis

Lecture 11: Multiple trait models for QTL analysis Lecture 11: Multiple trait models for QTL analysis Julius van der Werf Multiple trait mapping of QTL...99 Increased power of QTL detection...99 Testing for linked QTL vs pleiotropic QTL...100 Multiple

More information

Contrasts for a within-species comparative method

Contrasts for a within-species comparative method Contrasts for a within-species comparative method Joseph Felsenstein, Department of Genetics, University of Washington, Box 357360, Seattle, Washington 98195-7360, USA email address: joe@genetics.washington.edu

More information

Methods for QTL analysis

Methods for QTL analysis Methods for QTL analysis Julius van der Werf METHODS FOR QTL ANALYSIS... 44 SINGLE VERSUS MULTIPLE MARKERS... 45 DETERMINING ASSOCIATIONS BETWEEN GENETIC MARKERS AND QTL WITH TWO MARKERS... 45 INTERVAL

More information

to be tested with great accuracy. The contrast between this state

to be tested with great accuracy. The contrast between this state STATISTICAL MODELS IN BIOMETRICAL GENETICS J. A. NELDER National Vegetable Research Station, Wellesbourne, Warwick Received I.X.52 I. INTRODUCTION THE statistical models belonging to the analysis of discontinuous

More information

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data

Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone Missing Data Journal of Multivariate Analysis 78, 6282 (2001) doi:10.1006jmva.2000.1939, available online at http:www.idealibrary.com on Inferences on a Normal Covariance Matrix and Generalized Variance with Monotone

More information

Linkage Disequilibrium Mapping of Quantitative Trait Loci by Selective Genotyping

Linkage Disequilibrium Mapping of Quantitative Trait Loci by Selective Genotyping 1 Linkage Disequilibrium Mapping of Quantitative Trait Loci by Selective Genotyping Running title: LD mapping of QTL by selective genotyping Zehua Chen 1, Gang Zheng 2,KaushikGhosh 3 and Zhaohai Li 3,4

More information

Introduction to QTL mapping in model organisms

Introduction to QTL mapping in model organisms Introduction to QTL mapping in model organisms Karl Broman Biostatistics and Medical Informatics University of Wisconsin Madison kbroman.org github.com/kbroman @kwbroman Backcross P 1 P 2 P 1 F 1 BC 4

More information

Lecture 2. Basic Population and Quantitative Genetics

Lecture 2. Basic Population and Quantitative Genetics Lecture Basic Population and Quantitative Genetics Bruce Walsh. Aug 003. Nordic Summer Course Allele and Genotype Frequencies The frequency p i for allele A i is just the frequency of A i A i homozygotes

More information

EXERCISES FOR CHAPTER 7. Exercise 7.1. Derive the two scales of relation for each of the two following recurrent series:

EXERCISES FOR CHAPTER 7. Exercise 7.1. Derive the two scales of relation for each of the two following recurrent series: Statistical Genetics Agronomy 65 W. E. Nyquist March 004 EXERCISES FOR CHAPTER 7 Exercise 7.. Derive the two scales of relation for each of the two following recurrent series: u: 0, 8, 6, 48, 46,L 36 7

More information

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations John R. Michael, Significance, Inc. and William R. Schucany, Southern Methodist University The mixture

More information

ORDER RESTRICTED STATISTICAL INFERENCE ON LORENZ CURVES OF PARETO DISTRIBUTIONS. Myongsik Oh. 1. Introduction

ORDER RESTRICTED STATISTICAL INFERENCE ON LORENZ CURVES OF PARETO DISTRIBUTIONS. Myongsik Oh. 1. Introduction J. Appl. Math & Computing Vol. 13(2003), No. 1-2, pp. 457-470 ORDER RESTRICTED STATISTICAL INFERENCE ON LORENZ CURVES OF PARETO DISTRIBUTIONS Myongsik Oh Abstract. The comparison of two or more Lorenz

More information

For 5% confidence χ 2 with 1 degree of freedom should exceed 3.841, so there is clear evidence for disequilibrium between S and M.

For 5% confidence χ 2 with 1 degree of freedom should exceed 3.841, so there is clear evidence for disequilibrium between S and M. STAT 550 Howework 6 Anton Amirov 1. This question relates to the same study you saw in Homework-4, by Dr. Arno Motulsky and coworkers, and published in Thompson et al. (1988; Am.J.Hum.Genet, 42, 113-124).

More information

Lecture 2. Fisher s Variance Decomposition

Lecture 2. Fisher s Variance Decomposition Lecture Fisher s Variance Decomposition Bruce Walsh. June 008. Summer Institute on Statistical Genetics, Seattle Covariances and Regressions Quantitative genetics requires measures of variation and association.

More information

Partitioning Genetic Variance

Partitioning Genetic Variance PSYC 510: Partitioning Genetic Variance (09/17/03) 1 Partitioning Genetic Variance Here, mathematical models are developed for the computation of different types of genetic variance. Several substantive

More information

VARIANCE-COMPONENTS (VC) linkage analysis

VARIANCE-COMPONENTS (VC) linkage analysis Copyright Ó 2006 by the Genetics Society of America DOI: 10.1534/genetics.105.054650 Quantitative Trait Linkage Analysis Using Gaussian Copulas Mingyao Li,*,,1 Michael Boehnke, Goncxalo R. Abecasis and

More information

QTL model selection: key players

QTL model selection: key players Bayesian Interval Mapping. Bayesian strategy -9. Markov chain sampling 0-7. sampling genetic architectures 8-5 4. criteria for model selection 6-44 QTL : Bayes Seattle SISG: Yandell 008 QTL model selection:

More information

The Annals of Human Genetics has an archive of material originally published in print format by the Annals of Eugenics ( ).

The Annals of Human Genetics has an archive of material originally published in print format by the Annals of Eugenics ( ). The Annals of Human Genetics has an archive of material originally published in print format by the Annals of Eugenics (11). This material is available in specialised libraries and archives. e believe

More information

Case-Control Association Testing. Case-Control Association Testing

Case-Control Association Testing. Case-Control Association Testing Introduction Association mapping is now routinely being used to identify loci that are involved with complex traits. Technological advances have made it feasible to perform case-control association studies

More information

Mixture Models. Pr(i) p i (z) i=1. It is usually assumed that the underlying distributions are normals, so this becomes. 2 i

Mixture Models. Pr(i) p i (z) i=1. It is usually assumed that the underlying distributions are normals, so this becomes. 2 i Mixture Models The Distribution under a Mixture Model Assume the distribution of interest results from a weighted mixture of several underlying distributions. If there are i = 1,,n underlying distributions,

More information

[y i α βx i ] 2 (2) Q = i=1

[y i α βx i ] 2 (2) Q = i=1 Least squares fits This section has no probability in it. There are no random variables. We are given n points (x i, y i ) and want to find the equation of the line that best fits them. We take the equation

More information

Psychology 282 Lecture #4 Outline Inferences in SLR

Psychology 282 Lecture #4 Outline Inferences in SLR Psychology 282 Lecture #4 Outline Inferences in SLR Assumptions To this point we have not had to make any distributional assumptions. Principle of least squares requires no assumptions. Can use correlations

More information

On the mapping of quantitative trait loci at marker and non-marker locations

On the mapping of quantitative trait loci at marker and non-marker locations Genet. Res., Camb. (2002), 79, pp. 97 106. With 3 figures. 2002 Cambridge University Press DOI: 10.1017 S0016672301005420 Printed in the United Kingdom 97 On the mapping of quantitative trait loci at marker

More information

EXERCISES FOR CHAPTER 3. Exercise 3.2. Why is the random mating theorem so important?

EXERCISES FOR CHAPTER 3. Exercise 3.2. Why is the random mating theorem so important? Statistical Genetics Agronomy 65 W. E. Nyquist March 004 EXERCISES FOR CHAPTER 3 Exercise 3.. a. Define random mating. b. Discuss what random mating as defined in (a) above means in a single infinite population

More information

Change-Point Detection and Copy Number Variation

Change-Point Detection and Copy Number Variation Change-Point Detection and Copy Number Variation Department of Statistics The Hebrew University IMS, Singapore, 2009 Outline 1 2 3 (CNV) Classical change-point detection At each monitoring period we observe

More information

Causal Graphical Models in Systems Genetics

Causal Graphical Models in Systems Genetics 1 Causal Graphical Models in Systems Genetics 2013 Network Analysis Short Course - UCLA Human Genetics Elias Chaibub Neto and Brian S Yandell July 17, 2013 Motivation and basic concepts 2 3 Motivation

More information

Lecture 3. Inference about multivariate normal distribution

Lecture 3. Inference about multivariate normal distribution Lecture 3. Inference about multivariate normal distribution 3.1 Point and Interval Estimation Let X 1,..., X n be i.i.d. N p (µ, Σ). We are interested in evaluation of the maximum likelihood estimates

More information

A note on profile likelihood for exponential tilt mixture models

A note on profile likelihood for exponential tilt mixture models Biometrika (2009), 96, 1,pp. 229 236 C 2009 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asn059 Advance Access publication 22 January 2009 A note on profile likelihood for exponential

More information

Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances

Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances Advances in Decision Sciences Volume 211, Article ID 74858, 8 pages doi:1.1155/211/74858 Research Article A Nonparametric Two-Sample Wald Test of Equality of Variances David Allingham 1 andj.c.w.rayner

More information

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification,

Normal distribution We have a random sample from N(m, υ). The sample mean is Ȳ and the corrected sum of squares is S yy. After some simplification, Likelihood Let P (D H) be the probability an experiment produces data D, given hypothesis H. Usually H is regarded as fixed and D variable. Before the experiment, the data D are unknown, and the probability

More information

Model comparison and selection

Model comparison and selection BS2 Statistical Inference, Lectures 9 and 10, Hilary Term 2008 March 2, 2008 Hypothesis testing Consider two alternative models M 1 = {f (x; θ), θ Θ 1 } and M 2 = {f (x; θ), θ Θ 2 } for a sample (X = x)

More information