Analysis of litter size and average litter weight in pigs using a recursive model


Genetics: Published Articles Ahead of Print, published on August 4, 2007 as /genetics

Luis Varona (1), Daniel Sorensen (2)

July 18

(1) Area de Producció Animal - Centre UdL-IRTA, 5198 Lleida, Spain
(2) Department of Genetics and Biotechnology, Danish Institute of Agricultural Sciences, PB 50, DK-8830 Tjele, Denmark

Abstract

An analysis of litter size and average piglet weight at birth in Landrace and Yorkshire using a standard two-trait mixed model (SMM) and a recursive mixed model (RMM) is presented. The RMM establishes a one-way link from litter size to average piglet weight. It is shown that there is a one-to-one correspondence between the parameters of the SMM and the RMM and that they generate equivalent likelihoods. As parametrized in this work, the RMM tests for the presence of a recursive relationship between additive genetic values, permanent environmental effects and specific environmental effects of litter size, on average piglet weight. The equivalent standard mixed model tests whether or not the covariance matrices of the random effects have a diagonal structure. In Landrace, posterior predictive model checking supports a model without any form of recursion or, alternatively, an SMM with diagonal covariance matrices of the three random effects. In Yorkshire, the same criterion favours a model with recursion at the level of specific environmental effects only or, in terms of the SMM, the association between traits is shown to be exclusively due to an environmental (negative) correlation. It is argued that the choice between an SMM and an RMM should be guided by the availability of software, by ease of interpretation, or by the need to test a particular theory or hypothesis that may best be formulated under one parameterization and not the other.

Contents

1 INTRODUCTION
2 MATERIAL and METHODS
2.1 Data
2.2 Models and likelihoods
2.3 Likelihood identification under the SMM and the RMM
2.4 Generating an identifiable likelihood model to address the nature of the relationship between traits
2.5 Prior and Posterior distributions
2.6 Implementation
2.7 Model testing
3 RESULTS
4 DISCUSSION

1 INTRODUCTION

Mixed linear models (Henderson, 1984) are broadly used to predict breeding values and to estimate variance components for traits of interest in livestock and plant breeding, and they play an important role in evolutionary and theoretical quantitative genetics (Cheverud, 1984; Lande, 1979; Walsh, 2003). In genetic improvement programs, the objective of selection typically includes several correlated traits. The classical approach to a multiple-trait analysis is to use models in which the correlation between response variables (phenotypes) arises from linear associations between unobservables, such as additive genetic values or non-genetic sources like permanent or temporary environmental effects. Structural equation models extend the standard linear model to account for links (feedback and/or recursiveness) involving either the phenotypes directly or latent variables; they are well established in econometrics and sociology (Goldberger, 1972; Jöreskog, 1973; Duncan, 1975). These models were discussed in the early genetics literature by Wright (1921), but this work has not received much attention in quantitative genetics. Recently, Xiong et al. (2004) proposed the use of structural equation models for modeling and identifying genetic networks. In a quantitative genetics context, Gianola and Sorensen (2004) studied the consequences of the existence of simultaneous and recursive relationships between phenotypes on genetic parameters and presented statistical methods for inference. A recent application, a study of the relationship between somatic cell score and milk yield in goats, is in de los Campos et al. (2006).
Here we illustrate the implementation of structural equation models for the analysis of litter size and average litter weight in two breeds of Danish pigs. Litter size is an important trait in pig genetic improvement programmes (Rothschild and Bidanel, 1998) and there is now convincing evidence that it has responded successfully to selection (e.g. Sorensen et al., 2000; Noguera et al., 2002). Several studies have also reported negative associations between litter size and individual birth weight (Kerr and Cameron, 1995; Roehe, 1999; Sorensen et al., 2000). Further, Sorensen et al. (2000) report an increase in the proportion of piglets born dead at higher litter size values. Litter size is basically determined by ovulation rate and embryo mortality (Blasco et al., 1995); these processes take place mainly at the early stages of gestation. Piglet weight at birth is mostly determined by growth in late gestation. One could then postulate a one-way causal path establishing an effect of litter size on piglet weight at birth. This specification defines a recursive two-trait system. Simultaneity, on the other hand, occurs when trait 1 affects trait 2 and vice versa.

The objective of this study is, first, to show that recursive models can be interpreted as alternative parameterizations of standard linear models. We discuss identifiability of dispersion parameters, a topic that is intimately connected to the possibility of drawing inferences from the various parametric forms of a given model. Secondly, we address the statistical problems involved in deciding whether the association between traits is mediated by additive genetic and/or environmental covariances, or via recursion only. The results are illustrated using data on litter size and average litter weight in pigs.

2 MATERIAL and METHODS

2.1 Data

Data from two breeds were analysed: Landrace and Yorkshire. The traits analysed were total number born per litter and average litter weight at birth (referred to as litter size and average piglet weight, hereinafter). The Landrace dataset included 5, 178 litter size records and a pedigree file of 8, 800 individuals. The raw means for litter size and average piglet weight were 14.3 piglets and 1.36 kg, respectively, with standard deviations 3.6 piglets and 0.35 kg. The Yorkshire dataset consisted of 3, 938 litter size records and a pedigree file of 7, 143 individuals. The raw means for litter size and average piglet weight were piglets and 1.30 kg., respectively, with standard deviations 3.40 piglets and 0. kg. The raw correlations between traits were 0.01 in Landrace and 0.43 in Yorkshire. Piglet weight at birth is strongly genetically determined by maternal effects (Grandinson et al., 2002), and, as a consequence, average piglet weight (as well as litter size) was considered a trait of the sow.
2.2 Models and likelihoods

A description is provided of a standard mixed model (SMM) and a recursive mixed model (RMM). The SMM postulates the following linear structures for $y_{Lij}$ (subscript $L$ represents litter size) and $y_{Wij}$ (subscript $W$ represents average piglet weight) of the $j$th pair of records from female $i$:

$$y_{Lij} = x'_{Lij} b_L + u_{Li} + p_{Li} + e_{Lij}, \qquad (1a)$$
$$y_{Wij} = x'_{Wij} b_W + u_{Wi} + p_{Wi} + e_{Wij}, \qquad (1b)$$

where $x'_{kij}$ ($k = L, W$) is the appropriate row of a known incidence matrix, $b_k$ is a vector containing effects of herd-years, seasons and parity number, $u_{ki}$ is an additive genetic effect of individual $i$, $p_{ki}$ is a permanent environmental effect of individual $i$ and $e_{kij}$ is a residual effect. (The lengths of the vectors of additive genetic effects and data are different, but to simplify notation it is assumed throughout that, after an appropriate relabeling, a common subindex $i$ can be used for $y$, $u$ and $p$.)

The following distributions were assigned to the location parameters:

$$(b'_L, b'_W)' \sim N\left((0, 0)', I\,10^{5}\right),$$
$$(u_{Li}, u_{Wi})' \mid G \sim N\left((0, 0)', G\right),$$
$$(p_{Li}, p_{Wi})' \mid P \sim N\left((0, 0)', P\right),$$
$$(e_{Lij}, e_{Wij})' \mid R_{ij} \sim N\left((0, 0)', R_{ij}\right). \qquad (2)$$

Above, $I$ is the identity matrix (of appropriate order),

$$G = \begin{bmatrix} \sigma^2_{u_L} & \sigma_{u_L u_W} \\ \sigma_{u_L u_W} & \sigma^2_{u_W} \end{bmatrix} \qquad (3)$$

and

$$P = \begin{bmatrix} \sigma^2_{p_L} & \sigma_{p_L p_W} \\ \sigma_{p_L p_W} & \sigma^2_{p_W} \end{bmatrix}. \qquad (4)$$

A possible approach to modelling the residual term $R_{ij}$ is as follows. Assume that the residual terms for individual piglet weight at birth that contribute to a given average piglet weight are conditionally normally and independently distributed, given litter size, with residual variance $\sigma^2_{e_W}\left(1 - \rho^2_{e_L e_W}\right)$, where $\rho_{e_L e_W}$ is the residual correlation between litter size and individual piglet weight at birth. Also assume that the residual terms for litter size are normally distributed with variance $\sigma^2_{e_L}$. Then the marginal (with respect to litter size) residual covariance between two individual piglet weight at birth records is $\rho^2_{e_L e_W}\sigma^2_{e_W}$ and the residual covariance matrix is equal to

$$R_{ij} = \begin{bmatrix} \sigma^2_{e_L} & \rho_{e_L e_W}\sigma_{e_L}\sigma_{e_W} \\ \rho_{e_L e_W}\sigma_{e_L}\sigma_{e_W} & \dfrac{\sigma^2_{e_W}}{n_{ij}}\left(1 + (n_{ij} - 1)\rho^2_{e_L e_W}\right) \end{bmatrix}. \qquad (5)$$

The terms $\sigma^2_{x_m}$ and $\sigma_{x_L x_W}$ ($x = u, p, e$; $m = L, W$) in (3), (4) and (5) are variance and covariance components associated with the distribution of additive genetic effects ($x = u$), permanent environmental effects ($x = p$) and residual effects ($x = e$), for litter size and for average piglet weight. In (5), the off-diagonal term is $\rho_{e_L e_W}\sigma_{e_L}\sigma_{e_W} = \sigma_{e_L e_W}$, and $n_{ij}$ is the known number of records contributing to the average piglet weight of female $i$ in parity $j$. There are three identifiable parameters in the likelihood based on (5).
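As a numerical check on (5), the sketch below builds the residual covariance matrix in two ways: directly from the closed form, and by averaging an explicit covariance matrix of one litter-size residual and $n$ individual piglet-weight residuals. The function names and the numerical values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def residual_cov(var_eL, var_eW, rho, n):
    """Closed form of eq. (5) for (litter size, average piglet weight)."""
    cov = rho * np.sqrt(var_eL * var_eW)
    var_avg = var_eW / n * (1.0 + (n - 1) * rho**2)
    return np.array([[var_eL, cov], [cov, var_avg]])

def residual_cov_from_piglets(var_eL, var_eW, rho, n):
    """Same matrix, derived by averaging n individual piglet residuals."""
    # Covariance of (e_L, e_W1, ..., e_Wn): piglet residuals share
    # covariance rho^2 * var_eW once litter size is integrated out.
    V = np.full((n + 1, n + 1), rho**2 * var_eW)
    np.fill_diagonal(V, var_eW)
    V[0, 0] = var_eL
    V[0, 1:] = V[1:, 0] = rho * np.sqrt(var_eL * var_eW)
    # T maps the individual residuals to (e_L, mean of e_W1..e_Wn).
    T = np.zeros((2, n + 1))
    T[0, 0] = 1.0
    T[1, 1:] = 1.0 / n
    return T @ V @ T.T

# Hypothetical values, chosen only for illustration.
R_closed = residual_cov(9.0, 0.09, -0.5, 12)
R_averaged = residual_cov_from_piglets(9.0, 0.09, -0.5, 12)
```

Both routes give the same matrix, and it is positive definite whenever the residual correlation is strictly between -1 and 1.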
(Rather than assuming conditional independence of individual piglet weight residuals, given litter size, a more general model would include an extra term to account for a residual correlation between individual piglet weight residuals in their conditional distribution. However, this would lead to 4 parameters in (5) and to problems of identifiability in the likelihood.) The residual dispersion matrix can also be written as

$$R_{ij} = \begin{bmatrix} \sigma^2_{e_L} & \beta_{e_L e_W}\sigma^2_{e_L} \\ \beta_{e_L e_W}\sigma^2_{e_L} & \dfrac{\sigma^2_{e_W}}{n_{ij}} + \dfrac{n_{ij} - 1}{n_{ij}}\beta^2_{e_L e_W}\sigma^2_{e_L} \end{bmatrix}, \qquad (6)$$

where $\beta_{e_L e_W} = \sigma_{e_L e_W} / \sigma^2_{e_L}$ is the residual regression of individual piglet weight at birth on litter size. Matrix $R_{ij}$ is positive definite since

$$\sigma^2_{e_L}\frac{\sigma^2_{e_W}}{n_{ij}}\left(1 + (n_{ij} - 1)\rho^2_{e_L e_W}\right) > \rho^2_{e_L e_W}\sigma^2_{e_L}\sigma^2_{e_W}.$$

The residual covariance matrix (5) for $n_{ij} = 1$ is denoted by $R$. The heritabilities for the two traits are

$$h^2_L = \frac{\sigma^2_{u_L}}{\sigma^2_{u_L} + \sigma^2_{p_L} + \sigma^2_{e_L}}, \qquad h^2_W = \frac{\sigma^2_{u_W}}{\sigma^2_{u_W} + \sigma^2_{p_W} + \dfrac{\sigma^2_{e_W}}{n_{ij}}\left(1 + (n_{ij} - 1)\rho^2_{e_L e_W}\right)}, \qquad (7)$$

and the coefficients of correlation are

$$\rho_x = \frac{\sigma_{x_L x_W}}{\sigma_{x_L}\sigma_{x_W}}, \qquad x = u, p, e. \qquad (8)$$

Writing $y_{ij} = (y_{Lij}, y_{Wij})'$, equations (1) can be expressed as

$$y_{ij} = X_{ij} b + u_i + p_i + e_{ij}, \qquad (9)$$

where

$$X_{ij} = \begin{bmatrix} x'_{Lij} & 0 \\ 0 & x'_{Wij} \end{bmatrix}, \quad b = (b'_L, b'_W)', \quad u_i = (u_{Li}, u_{Wi})', \quad p_i = (p_{Li}, p_{Wi})', \quad e_{ij} = (e_{Lij}, e_{Wij})'.$$

It follows that the sampling model for $y_{ij}$ is the Gaussian process

$$y_{ij} \mid b, u_i, p_i, R_{ij} \sim N\left(X_{ij} b + u_i + p_i, R_{ij}\right) \qquad (10)$$

and the contribution to the likelihood by $y_{ij}$ is

$$y_{ij} \mid b, G, P, R_{ij} \sim N\left(X_{ij} b, G + P + R_{ij}\right). \qquad (11)$$

The RMM assumes the following linear relationships between the $j$th pair of records from individual $i$ and location parameters:

$$y_{Lij} = x'_{Lij} b_L + u^*_{Li} + p^*_{Li} + e^*_{Lij}, \qquad (12a)$$
$$y_{Wij} = \lambda\left(y_{Lij} - x'_{Lij} b_L\right) + x'_{Wij} b_W + u^*_{Wi} + p^*_{Wi} + e^*_{Wij}, \qquad (12b)$$

where $\lambda$ is the recursive parameter. The first term on the right hand side of (12b) indicates that, according to the model, average piglet weight is linearly related to the deviation of litter size from its group mean, and the strength of this relationship is measured by $\lambda$. Gianola and Sorensen (2004), on the other hand, postulate recursiveness or simultaneity between traits involving the observed phenotypes, rather than the unobserved deviations. We return to this point in the Discussion.

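The system defined by (12) can be retrieved by subtracting the mean on both sides of (9) and multiplying by $\Lambda$, to get

$$\Lambda\left(y_{ij} - X_{ij} b\right) = \Lambda u_i + \Lambda p_i + \Lambda e_{ij} = u^*_i + p^*_i + e^*_{ij}. \qquad (13)$$

The reduced form of (13) is

$$y_{ij} = X_{ij} b + \Lambda^{-1} u^*_i + \Lambda^{-1} p^*_i + \Lambda^{-1} e^*_{ij}, \qquad (14)$$

which is the same as (9), where

$$\Lambda^{-1} = \begin{bmatrix} 1 & 0 \\ -\lambda & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 0 \\ \lambda & 1 \end{bmatrix}$$

and

$$z^*_i = \begin{bmatrix} z^*_{Li} \\ z^*_{Wi} \end{bmatrix} = \begin{bmatrix} z_{Li} \\ z_{Wi} - \lambda z_{Li} \end{bmatrix}, \qquad z_i = u_i, p_i, e_{ij}; \quad z = u, p, e.$$

It follows from the Gaussian form of the distributions (2) that

$$u^*_i \mid G^* \sim N\left((0, 0)', G^*\right), \quad p^*_i \mid P^* \sim N\left((0, 0)', P^*\right), \quad e^*_{ij} \mid R^*_{ij} \sim N\left((0, 0)', R^*_{ij}\right), \qquad (15)$$

where

$$G^* = \Lambda G \Lambda', \qquad P^* = \Lambda P \Lambda', \qquad R^*_{ij} = \Lambda R_{ij} \Lambda'. \qquad (16)$$

Therefore the sampling model for $y_{ij}$ under the RMM is the Gaussian process

$$y_{ij} \mid b, u^*_i, p^*_i, R^*_{ij}, \lambda \sim N\left(X_{ij} b + \Lambda^{-1} u^*_i + \Lambda^{-1} p^*_i,\; \Lambda^{-1} R^*_{ij} \left(\Lambda^{-1}\right)'\right), \qquad (17)$$

and the contribution to the likelihood by $y_{ij}$ is

$$y_{ij} \mid b, G^*, P^*, R^*_{ij}, \lambda \sim N\left(X_{ij} b,\; \Lambda^{-1}\left(G^* + P^* + R^*_{ij}\right)\left(\Lambda^{-1}\right)'\right). \qquad (18)$$

If $\lambda$ were known, this would be the same likelihood as (11), due to the one-to-one relationship

$$\Lambda^{-1}\left(G^* + P^* + R^*_{ij}\right)\left(\Lambda^{-1}\right)' = G + P + R_{ij}. \qquad (19)$$

However, with unknown $\lambda$, the left hand side of (19) contains 10 parameters and the right hand side 9. There is thus an infinite number of matrices involving the left hand side of (19) that satisfy the equality, for any given $G + P + R_{ij}$. In other words, disregarding identifiability at the level of the mean for both models, the RMM as defined above generates an unidentifiable likelihood.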
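The lack of identification expressed by (19) can be seen numerically. In the sketch below, the same SMM matrices are mapped to the recursive scale with two different values of $\lambda$; the starred matrices differ, yet both sets give the same marginal covariance. The function names and all matrix entries are hypothetical and serve only as an illustration.

```python
import numpy as np

def lam_matrix(lam):
    """Lambda = [[1, 0], [-lam, 1]], as in eqs. (13)-(14)."""
    return np.array([[1.0, 0.0], [-lam, 1.0]])

def to_recursive(G, P, R, lam):
    """Eq. (16): starred matrices Lambda M Lambda' for M = G, P, R."""
    L = lam_matrix(lam)
    return tuple(L @ M @ L.T for M in (G, P, R))

def marginal_cov(Gs, Ps, Rs, lam):
    """Left hand side of eq. (19)."""
    Linv = np.linalg.inv(lam_matrix(lam))
    return Linv @ (Gs + Ps + Rs) @ Linv.T

# Hypothetical SMM dispersion matrices.
G = np.array([[1.0, -0.02], [-0.02, 0.010]])
P = np.array([[0.8, -0.01], [-0.01, 0.005]])
R = np.array([[9.0, -0.30], [-0.30, 0.030]])

# Two different recursive parameters, one and the same likelihood.
star_a = to_recursive(G, P, R, 0.1)
star_b = to_recursive(G, P, R, -0.5)
V_a = marginal_cov(*star_a, 0.1)
V_b = marginal_cov(*star_b, -0.5)
```

Since `V_a` and `V_b` both equal `G + P + R`, the data cannot distinguish the two parameter sets without a further constraint.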

2.3 Likelihood identification under the SMM and the RMM

The subject of identifiability of the SMM and the RMM at the level of the mean is well known (e.g. Searle, 1971) and will not be discussed. In likelihood (11) of the SMM there are 9 dispersion parameters associated with $G$, $P$ and $R_{ij}$. When the data include repeated records of related individuals, 9 is the maximum number of dispersion parameters that can be identified. This saturated model with non-diagonal covariance matrices for $u$, $p$ and $e$ is labeled SMM$_{upe}$. The RMM has an extra parameter, and a constraint needs to be introduced to achieve identification. One possible constraint is to assume that the phenotypic covariance on the recursive scale is zero. That is, denoting the mean of $y_L$ by $\mu_L$,

$$\begin{aligned}
Cov\left(y_L, y_W - \lambda\left(y_L - \mu_L\right)\right) &= Cov\left(u^*_L, u^*_W\right) + Cov\left(p^*_L, p^*_W\right) + Cov\left(e^*_L, e^*_W\right) \\
&= Cov\left(u_L, u_W - \lambda u_L\right) + Cov\left(p_L, p_W - \lambda p_L\right) + Cov\left(e_L, e_W - \lambda e_L\right) \\
&= \sigma_{u_L u_W} + \sigma_{p_L p_W} + \sigma_{e_L e_W} - \lambda\left(\sigma^2_{u_L} + \sigma^2_{p_L} + \sigma^2_{e_L}\right) = 0. \qquad (20)
\end{aligned}$$

This places the following interpretation on $\lambda$:

$$\lambda = \frac{\sigma_{u_L u_W} + \sigma_{p_L p_W} + \sigma_{e_L e_W}}{\sigma^2_{u_L} + \sigma^2_{p_L} + \sigma^2_{e_L}}, \qquad (21)$$

the phenotypic regression of average litter weight on litter size. Expanding (19), it is easy to show that the constraint (20) guarantees a one-to-one relationship between the dispersion parameters of the RMM and those of the SMM$_{upe}$, and the likelihoods become equivalent. In this setting the RMM subject to the chosen constraint and the unconstrained SMM$_{upe}$ are two different identifiable parameterizations of the same likelihood model. From the point of view of a likelihood analysis, inferences on the recursive scale can be obtained by fitting the SMM$_{upe}$ and transforming the estimated parameters appropriately, and vice versa.
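A minimal sketch of the constraint in (20) and (21): taking $\lambda$ as the phenotypic regression computed from the SMM matrices makes the three covariances on the recursive scale sum to zero. The matrices are hypothetical values chosen for illustration and the helper names are assumptions.

```python
import numpy as np

def phenotypic_regression(G, P, R):
    """Eq. (21): sum of covariances over sum of litter size variances."""
    V = G + P + R
    return V[0, 1] / V[0, 0]

def recursive_scale(M, lam):
    """Lambda M Lambda' with Lambda = [[1, 0], [-lam, 1]]."""
    L = np.array([[1.0, 0.0], [-lam, 1.0]])
    return L @ M @ L.T

# Hypothetical SMM dispersion matrices.
G = np.array([[1.0, -0.02], [-0.02, 0.010]])
P = np.array([[0.8, -0.01], [-0.01, 0.005]])
R = np.array([[9.0, -0.30], [-0.30, 0.030]])

lam = phenotypic_regression(G, P, R)
Gs, Ps, Rs = (recursive_scale(M, lam) for M in (G, P, R))

# Constraint (20): phenotypic covariance on the recursive scale.
constraint = Gs[0, 1] + Ps[0, 1] + Rs[0, 1]
```

With this choice of $\lambda$ the value of `constraint` is zero up to rounding error, which is the extra restriction that brings the parameter count of the RMM down from 10 to 9.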
However, it is not statistically meaningful to ask whether the data have been generated by the SMM$_{upe}$ or by the recursive process described by the RMM subject to constraint (20), since both specifications lead to the same likelihood.

2.4 Generating an identifiable likelihood model to address the nature of the relationship between traits

Here we present a statistically meaningful way to address the question of whether the data have been generated by a recursive mechanism. The starting point is the SMM defined in (3), (4), (5) and (9), but with a diagonal matrix for all the dispersion structures; that is,

$$G = \begin{bmatrix} \sigma^2_{u_L} & 0 \\ 0 & \sigma^2_{u_W} \end{bmatrix}, \qquad (22)$$

$$P = \begin{bmatrix} \sigma^2_{p_L} & 0 \\ 0 & \sigma^2_{p_W} \end{bmatrix}, \qquad (23)$$

and

$$R_{ij} = \begin{bmatrix} \sigma^2_{e_L} & 0 \\ 0 & \dfrac{\sigma^2_{e_W}}{n_{ij}} \end{bmatrix}. \qquad (24)$$

The contribution to the likelihood by the pair of records $y_{ij}$ is the same as in (11), that is,

$$y_{ij} \mid b, G, P, R_{ij} \sim N\left(X_{ij} b, G + P + R_{ij}\right), \qquad (25)$$

with $G$, $P$ and $R_{ij}$ appropriately interpreted in the light of (22), (23) and (24). There are 6 dispersion parameters associated with this model (the covariance matrices of $u$, $p$ and $e$ have zero off-diagonal elements), which is labeled SMM$_0$. The RMM that is developed here postulates that the relationship between data and location parameters is now

$$y_{ij} = X_{ij} b + \Lambda_u u_i + \Lambda_p p_i + \Lambda_e e_{ij} = X_{ij} b + u^*_i + p^*_i + e^*_{ij}, \qquad (26)$$

where $u_i$, $p_i$ and $e_{ij}$ are the same stochastic variables as in the SMM$_0$, with covariance matrices (22), (23) and (24), and with

$$\Lambda_u = \begin{bmatrix} 1 & 0 \\ \lambda_u & 1 \end{bmatrix}, \qquad u^*_i = \left(u^*_{Li}, u^*_{Wi}\right)' = \left(u_{Li}, u_{Wi} + \lambda_u u_{Li}\right)', \qquad (27)$$

and similarly for $\Lambda_p$, $\Lambda_e$, $p^*_i$ and $e^*_{ij}$. Notice that the $\Lambda$'s in (26) have the same structure as the $\Lambda^{-1}$ in (14). Contrary to the generation of recursion in (13), the recursive model defined by (26) is not obtained by a linear transformation of the SMM, and the two models lead to different marginal (with respect to random effects) distributions of the data. The linear structure specified by (26) and (27) has an interesting property: the components of average litter weight, $z_{Wi} + \lambda_z z_{Li}$, $z = u, p, e$, have a term $z_{Wi}$ independent of litter size and a component $\lambda_z z_{Li}$ dependent on litter size. The sampling model for $y_{ij}$ is

$$y_{ij} \mid b, u^*_i, p^*_i, R^*_{ij} \sim N\left(X_{ij} b + u^*_i + p^*_i, R^*_{ij}\right), \qquad (28)$$

and the contribution to the likelihood from $y_{ij}$ is

$$y_{ij} \mid b, G^*, P^*, R^*_{ij} \sim N\left(X_{ij} b, G^* + P^* + R^*_{ij}\right), \qquad (29)$$

where $G^* = \Lambda_u G \Lambda'_u$, $P^* = \Lambda_p P \Lambda'_p$ and $R^*_{ij} = \Lambda_e R_{ij} \Lambda'_e$. This form of recursive (saturated) model is labeled RMM$_{upe}$. There are 9 identifiable parameters in the dispersion matrix of this likelihood, and when $\lambda_u = \lambda_p = \lambda_e = 0$ (or when $\Lambda_u = \Lambda_p = \Lambda_e = I$), likelihood (29) is equal to (25).
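The construction $G^* = \Lambda_u G \Lambda'_u$ with a diagonal $G$ can be sketched as follows, together with the reverse step that recovers the recursive parameter as a covariance divided by a variance. The helper names and the numerical values are illustrative assumptions.

```python
import numpy as np

def recursive_cov(var_l, var_w, lam):
    """Lambda_z diag(var_l, var_w) Lambda_z' with Lambda_z = [[1, 0], [lam, 1]]."""
    Lam = np.array([[1.0, 0.0], [lam, 1.0]])
    return Lam @ np.diag([var_l, var_w]) @ Lam.T

def recover(cov_star):
    """Invert the map: (var_l, var_w, lam) from a 2 x 2 starred matrix."""
    var_l = cov_star[0, 0]
    lam = cov_star[0, 1] / var_l
    var_w = cov_star[1, 1] - lam**2 * var_l
    return var_l, var_w, lam

# Hypothetical additive genetic variances and recursive parameter.
G_star = recursive_cov(1.0, 0.01, -0.02)
var_l, var_w, lam_u = recover(G_star)

# Correlation implied on the standard (SMM) scale.
rho_u = G_star[0, 1] / np.sqrt(G_star[0, 0] * G_star[1, 1])
```

The map between (two variances, one recursive parameter) and a 2 x 2 covariance matrix is one-to-one, and the implied correlation is zero exactly when the recursive parameter is zero.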
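A comparison between RMM$_{upe}$ and the RMM with $\lambda_u = \lambda_p = \lambda_e = 0$, which is labeled RMM$_0$, jointly tests whether or not there is recursion at the level of the unobservable additive genetic values, permanent environmental effects and environmental effects. Alternatively, since likelihoods (29) and (11) are equivalent, the comparison can be interpreted as testing whether or not the covariance matrices of the random effects of the SMM$_{upe}$ have a diagonal structure. Indeed, note that

$$G^* = \begin{bmatrix} \sigma^2_{u^*_L} & \sigma_{u^*_L u^*_W} \\ \sigma_{u^*_L u^*_W} & \sigma^2_{u^*_W} \end{bmatrix} = \begin{bmatrix} \sigma^2_{u_L} & \lambda_u \sigma^2_{u_L} \\ \lambda_u \sigma^2_{u_L} & \sigma^2_{u_W} + \lambda^2_u \sigma^2_{u_L} \end{bmatrix}, \qquad (30)$$

$$P^* = \begin{bmatrix} \sigma^2_{p^*_L} & \sigma_{p^*_L p^*_W} \\ \sigma_{p^*_L p^*_W} & \sigma^2_{p^*_W} \end{bmatrix} = \begin{bmatrix} \sigma^2_{p_L} & \lambda_p \sigma^2_{p_L} \\ \lambda_p \sigma^2_{p_L} & \sigma^2_{p_W} + \lambda^2_p \sigma^2_{p_L} \end{bmatrix}, \qquad (31)$$

$$R^*_{ij} = \begin{bmatrix} \sigma^2_{e_L} & \lambda_e \sigma^2_{e_L} \\ \lambda_e \sigma^2_{e_L} & \dfrac{\sigma^2_{e_W}}{n_{ij}} + \lambda^2_e \sigma^2_{e_L} \end{bmatrix}. \qquad (32)$$

The lower diagonal element in (32) is very similar to the corresponding element in (6). However, when the trait is not an average (that is, when $n_{ij} = 1$), the second term in the lower diagonal element of (6) vanishes. Since, for example, $\sigma_{u^*_L u^*_W} = \beta_{u_L u_W}\sigma^2_{u_L}$, by inspection of (30), (31) and (32) together with (3), (4) and (6) it is obvious that the $\beta$'s under the SMM are identical to the $\lambda$'s in the RMM. We shall also need

$$R^* = \begin{bmatrix} \sigma^2_{e^*_L} & \sigma_{e^*_L e^*_W} \\ \sigma_{e^*_L e^*_W} & \sigma^2_{e^*_W} \end{bmatrix} = \begin{bmatrix} \sigma^2_{e_L} & \lambda_e \sigma^2_{e_L} \\ \lambda_e \sigma^2_{e_L} & \sigma^2_{e_W} + \lambda^2_e \sigma^2_{e_L} \end{bmatrix}, \qquad (33)$$

which is matrix $R^*_{ij}$ for $n_{ij} = 1$. When $\lambda_u = \lambda_p = \lambda_e = 0$, the above covariance matrices become equal to (22), (23) and (24). Under the RMM$_{upe}$, the heritability of average litter weight for $n_{ij} = n_T$ for all $i, j$ is defined as

$$h^2_W = \frac{\sigma^2_{u^*_W}}{\sigma^2_{u^*_W} + \sigma^2_{p^*_W} + \dfrac{\sigma^2_{e_W}}{n_T} + \lambda^2_e \sigma^2_{e_L}}. \qquad (34)$$

2.5 Prior and Posterior distributions

For the RMM$_{upe}$, the joint prior distribution of all parameters is assumed to admit the factorization

$$p\left(b, u^*, p^*, G^*, P^*, R^*\right) = p(b)\, p\left(u^* \mid G^*\right) p\left(p^* \mid P^*\right) p\left(G^*\right) p\left(P^*\right) p\left(R^*\right), \qquad (35)$$

where $u^*$ is the vector that contains the pairs $\left(u^*_{Li}, u^*_{Wi}\right)$ for all individuals in the pedigree, and $p^*$ is the vector that contains all permanent environmental effects $\left(p^*_{Li}, p^*_{Wi}\right)$ of females with records. The vector $b$ is allocated an improper uniform distribution and vectors $u^*$ and $p^*$ are assumed to be normally distributed:

$$u^* \mid G^*, A \sim N\left(0, A \otimes G^*\right),$$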

where $A$ is the known additive genetic relationship matrix, and

$$p^* \mid P^* \sim N\left(0, I \otimes P^*\right).$$

The matrices $G^*$, $P^*$ and $R^*$ follow inverse Wishart distributions

$$G^* \mid G^*_0, v_G \sim IW\left(G^*_0, v_G\right), \qquad P^* \mid P^*_0, v_P \sim IW\left(P^*_0, v_P\right), \qquad R^* \mid R^*_0, v_R \sim IW\left(R^*_0, v_R\right),$$

where the hyperparameters $G^*_0$, $P^*_0$ and $R^*_0$ are known matrices of dimension $2 \times 2$ and the $v$'s are known degrees of freedom. The conditional density for the whole data $y = \{y_{ij}\}$ is equal to

$$p\left(y \mid b, u^*, p^*, \Sigma^*\right) = \prod_{i,j} p\left(y_{ij} \mid b, u^*_i, p^*_i, R^*_{ij}\right), \qquad (36)$$

where $\Sigma^*$ is block diagonal with blocks $R^*_{ij}$ associated with each pair of records $y_{ij}$. The posterior distribution of the RMM$_{upe}$, up to a proportionality constant, is obtained by multiplying the joint prior (35) by (36), giving

$$p\left(b, u^*, p^*, G^*, P^*, R^* \mid y\right) \propto p\left(y \mid b, u^*, p^*, \Sigma^*\right) p\left(u^* \mid G^*\right) p\left(p^* \mid P^*\right) p\left(G^*\right) p\left(P^*\right) p\left(R^*\right), \qquad (37)$$

which is also the posterior distribution of SMM$_{upe}$, the standard two-trait mixed model with non-diagonal covariance matrices associated with all the random effects. Inferences based on RMM$_{upe}$ can be drawn from the posterior distribution (37), and the recursive parameters can easily be constructed from (30), (31) and (33):

$$\lambda_u = \frac{\sigma_{u^*_L u^*_W}}{\sigma^2_{u^*_L}}, \qquad (38)$$

$$\lambda_p = \frac{\sigma_{p^*_L p^*_W}}{\sigma^2_{p^*_L}}, \qquad (39)$$

and

$$\lambda_e = \frac{\sigma_{e^*_L e^*_W}}{\sigma^2_{e^*_L}}. \qquad (40)$$

A variety of submodels can be generated either by assuming that some or all of the $\lambda$'s are equal, or by setting some of them equal to zero.

2.6 Implementation

If the number of piglets born were the same for all litters, $n_T$, say, then $\Sigma^* = I \otimes R^*_{n_T}$, where $R^*_{n_T}$ denotes the residual covariance matrix (32) with $n_{ij}$ replaced by $n_T$. In this case, the structure of $p\left(y \mid b, u^*, p^*, \Sigma^*\right)$ in (37) simplifies considerably. To take advantage of this simplification in the computations one can augment the piglet weight data with the so-called missing single records $y^{mis}_W$, so that $n_{ij} = n_T$ for all $ij$, where $n_T$ is the largest number of records contributing to average piglet weight in the dataset. This technique is known as data augmentation (Tanner and Wong, 1987) and the general idea is as follows. Given observed data $y$ and a model indexed by parameters $\theta$, the posterior distribution $p(\theta \mid y)$ is proportional to $p(y \mid \theta)\, p(\theta)$. When the model is fitted using McMC, drawing samples from this posterior distribution may be computationally demanding. However, it may be easy to draw samples from

$$p\left(\theta \mid y, y^{mis}\right) \propto p\left(y, y^{mis} \mid \theta\right) p(\theta) \propto p\left(y^{mis}, \theta \mid y\right),$$

where $y^{mis}$ stands for the missing data. The strategy requires generating $y^{mis}$ from $\left[y^{mis} \mid \theta, y\right]$. In the present case, $y^{mis}_W$ is generated from

$$N\left(E\left(y^{mis}_W \mid y_W, y_L, \theta\right), Var\left(y^{mis}_W \mid y_W, y_L, \theta\right)\right),$$

where $\theta$ is the vector of all parameters indexing the model. After a little experimentation, a length of the Gibbs chain equal to one million was chosen. In Table 1 and Table 2 we report Monte Carlo standard errors of estimates of various posterior means to give an idea of the accuracy of the Monte Carlo computations.

2.7 Model testing

Checking for systematic differences between a given model and the observed data discloses the quality of fit of the posed model. An attractive way to study the fit of a model is to use posterior predictive model checking (Gelman et al., 1996, 2004). The approach is simple to implement, it is flexible, and it provides a graphical exploration of residual-type diagnostics. The key feature is the construction of so-called discrepancy measures that describe particular putative features of the data that the model may fail to account for. To be more specific, consider testing for the presence of recursion at the level of permanent environmental effects. Absence or presence of recursion at the level of additive genetic effects or residuals is studied in a similar way.
Let $\left(y_{Lij}, y_{Wij}\right)$, $i = 1, 2, \ldots$, denote observed data and, for parity $j = 1$, define the discrepancy measure

$$b_p = \frac{\sum_i \left(y_{Wi1} - x'_{Wi1} b_W\right)\left(p_{Li} - \bar{p}_L\right)}{\sum_i \left(p_{Li} - \bar{p}_L\right)^2}, \qquad (41)$$

the change of average piglet weight per unit change of permanent environmental effect associated with litter size. In (41), the sum is over all females with first parity records, and $\bar{p}_L$ is the average $p_{Li}$ across females. If the observed data had been generated under RMM$_0$ one would expect a value of $b_p$ in the vicinity of zero. If parameters were known, one could compare the observed value of $b_p$ to its sampling distribution, with a significant difference indicating model failure with respect to the discrepancy measure. This is equivalent to simulating data $\left(y^{rep}_{Li1}, y^{rep}_{Wi1}\right)$, $i = 1, 2, \ldots$, under the RMM$_0$, if parameters were known, computing $b^{rep}_p$ in each replicate, and deciding whether the observed value of $b_p$ is an atypical value in the distribution of $b^{rep}_p$. Specifically, and in the current context, one is testing whether the null model RMM$_0$ is failing to account for a recursive mechanism present in the observed data. Since parameters are not known, we use the idea of posterior predictive model checking (Gelman et al., 1996, 2004) and consider the posterior predictive distribution of $b_p - b^{rep}_p$. This distribution reflects uncertainty about the parameters that enter in the discrepancy measure (41) as well as sampling variation. Notice that the parameters are inferred from the null model RMM$_0$, which assumes absence of recursion. The presence of recursion not accounted for by model RMM$_0$ would result in a distribution of $b_p - b^{rep}_p$ shifted from zero. This can also be construed as a test for a non-zero covariance between permanent environmental effects affecting litter size and those affecting average piglet weight. The exploration of recursion at the level of additive genetic effects and of residuals involves constructing $b_u - b^{rep}_u$ and $b_e - b^{rep}_e$ along the same lines.

Often the diagnostic results of posterior predictive model checking are apparent visually, as is the case in the present work. At other times it can be useful to compute a posterior predictive p value to see whether the results could have arisen by chance under the null model (Gelman et al., 1996, 2004). These can be computed very easily from the McMC output.

3 RESULTS

The familiar parameterization in a two-trait mixed model analysis is based on the saturated SMM$_{upe}$. We therefore show in Table 1 Monte Carlo estimates of posterior means and standard deviations for chosen parameters based on the SMM$_{upe}$ for Landrace and Yorkshire. Due to the symmetry of all the posterior distributions referred to below, standard deviations rather than posterior intervals are reported.
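The discrepancy measure (41) and the posterior predictive p value mentioned in Section 2.7 can be sketched as below. This is a minimal illustration, not the authors' implementation: the function names are assumptions, the inputs are toy numbers, and in practice `b_obs` and `b_rep` would each hold one value per McMC draw from the null model.

```python
import numpy as np

def discrepancy_b(resid_w, effect_l):
    """Eq. (41): regression of weight deviations on centred litter size effects."""
    resid_w = np.asarray(resid_w, dtype=float)
    effect_l = np.asarray(effect_l, dtype=float)
    dev = effect_l - effect_l.mean()
    return float(np.sum(resid_w * dev) / np.sum(dev**2))

def posterior_predictive_pvalue(b_obs, b_rep):
    """Share of draws in which the replicated discrepancy is at least the observed one."""
    return float(np.mean(np.asarray(b_rep) >= np.asarray(b_obs)))

# Toy check: weight deviations exactly proportional to the effects.
p_l = np.array([-1.0, 0.5, 2.0, 0.0, -0.5])
b_toy = discrepancy_b(-0.03 * p_l, p_l)

# Toy draws of observed and replicated discrepancies.
p_toy = posterior_predictive_pvalue([0.1, 0.2, 0.3, 0.4], [0.0, 0.3, 0.2, 0.5])
```

In the toy check `b_toy` recovers the slope of -0.03. A p value close to 0 or 1 would indicate that the null model fails to reproduce the feature captured by the discrepancy.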
The figures in the table indicate that there is a striking difference between the breeds, especially in the size and sign of the correlation coefficients. For Landrace, a value in the vicinity of zero for all three correlation coefficients is in an area of high probability mass. For Yorkshire, only for the environmental correlation is the value of zero excluded from the 95% posterior interval.

Table 2 shows Monte Carlo estimates of posterior means and standard deviations for chosen parameters based on the RMM$_{upe}$ parameterization for Landrace and Yorkshire. There is a one-to-one relation between the parameters of the RMM$_{upe}$ and those of the SMM$_{upe}$. The conclusions based on the recursive parameters are the same as those based on the correlation coefficients from Table 1.

Figures 1 and 2 show the posterior predictive distribution of discrepancies $b_u - b^{rep}_u$, $b_p - b^{rep}_p$ and $b_e - b^{rep}_e$ for Landrace and Yorkshire generated under RMM$_0$. For Landrace, the Monte Carlo estimates of the posterior means (posterior standard deviations) for the three discrepancy measures are (0.80), (0.00) and (0.003), reflecting lack of recursion at all levels. There is therefore no evidence of conflict between the data and the null model RMM$_0$ with respect to the feature described by the discrepancy measure. For Yorkshire, the corresponding figures are (0.05), (0.019) and (0.00), supporting recursion at the level of the residual term only, a feature of the data that the null model fails to account for.

Table 1: Monte Carlo estimates of posterior means of chosen parameters (posterior standard deviations in brackets) based on SMM$_{upe}$. $h^2$: heritability with subscripts L: litter size, W: average piglet weight for $n_T$ = 5 individuals; $\rho$: correlations with subscripts u, p and e involving additive genetic, permanent and environmental effects; L: Landrace; Y: Yorkshire; MSE: Monte Carlo standard error.

       h2_L       h2_W       rho_u       rho_p       rho_e
L      .08(.0)    .4(.06)    .16(.19)    .07(.3)     .01(.06)
Y      .07(.0)    .9(.04)    -.1(.16)    -.4(.44)    -.73(.03)
MSE

Table 2: Monte Carlo estimates of posterior means of chosen parameters (posterior standard deviations in brackets) based on RMM$_{upe}$. $h^2$: heritability with subscripts L: litter size, W: average piglet weight for $n_T$ = 5 individuals; $\lambda$ with subscripts u, p and e involving additive genetic, permanent and environmental effects; L: Landrace; Y: Yorkshire; MSE: Monte Carlo standard error.

       h2_L       h2_W       lambda_u    lambda_p      lambda_e
L      .08(.0)    .4(.06)    .0(.06)     .050(.189)    .000(.003)
Y      .07(.0)    .(.03)     -.08(.0)    -.08(.067)    -.034(.003)
MSE

4 DISCUSSION

In a recent article, Gianola and Sorensen (2004) discussed the use of simultaneous equation models to analyse and interpret systems of traits that may be subject to feedback and recursive relationships. Here we report an application of a recursive mixed model for the analysis of litter size and average piglet weight in two breeds of Danish pigs. The recursive relationship defined by model (12) establishes that average piglet weight is linearly related to the deviation of litter size from its group mean. The traditional specification, as in Gianola and Sorensen (2004), postulates that average piglet weight is linearly related to litter size, rather than to its deviation from the mean.
The system defined by (12) is free of some identifiability problems at the level of parameters entering the mean that are common to both traits. It also seems appealing that deviations from a mid value, rather than absolute values, exert an influence on average piglet weight. Ultimately, these are two different models and a way of discerning between them is by computing their posterior probabilities in the light of the data. This was not studied in the present work.

Figure 1: (Landrace) Estimates of posterior distributions (under RMM$_0$) of discrepancies $b_u - b^{rep}_u$ (left), $b_p - b^{rep}_p$ (center) and $b_e - b^{rep}_e$ (right).

Figure 2: (Yorkshire) Estimates of posterior distributions (under RMM$_0$) of discrepancies $b_u - b^{rep}_u$ (left), $b_p - b^{rep}_p$ (center) and $b_e - b^{rep}_e$ (right).

The saturated recursive model used in this work has 9 identifiable dispersion parameters. A more parsimonious alternative with 7 parameters postulates that the three recursive parameters $\lambda_u$, $\lambda_p$ and $\lambda_e$ are equal. In general, the recursive parameterization can be an attractive approach to arrive at parsimonious models, especially in analyses involving many traits.

Special attention has been given here to identifiability at the level of the likelihood, despite the fact that inferences were based on posterior distributions. In principle, a Bayesian analysis with a non-identifiable likelihood is possible if proper prior distributions are specified for all the parameters (Bernardo and Smith, 1994). In fact, depending on the prior distributions, a Bayesian analysis with a non-identifiable likelihood may result in Bayesian learning, in the sense that the posterior and prior distributions of the nonidentified parameters are different (see, for example, Sorensen and Gianola (2002), page 543). However, an McMC implementation of a Bayesian model with barely identified parameters can lead to poor inferences due to extremely slow convergence and very short effective chain lengths. Achieving identifiability of parameters at the level of the likelihood will always lead to Bayesian learning and in general to better behaviour of the McMC algorithm. However, there may be situations where the constraints needed for identifiability restrict inferences, and an unconstrained model using a careful prior specification could be considered instead.

The analyses of Yorkshire and Landrace data lead to markedly different inferences; we are not disturbed by this result. The breeds are distinct in various behavioural, physiological and anatomical traits, as well as in outward appearance. From a breeding point of view, in Landrace, changes in litter size should not lead to associated changes in average litter weight. In Yorkshire, a change in environmental deviation of litter size of one unit (for example, due to culling) should result in a temporary reduction of average piglet weight equal to 36 g. In neither breed, but especially in Landrace, should successful selection for litter size have a direct effect on average piglet weight.

There is a rich literature dealing with various transformations of the data or reparameterizations that can lead to computationally more tractable analyses of the multivariate linear model (for example, Meyer, 1987; Quaas, 1988; Jensen and Mao, 1988; Ducrocq and Besbes, 1993; Groeneveld, 1994; Thompson et al., 1994; Gelfand et al., 1995; Ducrocq and Chapuis, 1997). While the recursive model can be viewed in this framework, the focus of the present work is that a recursive model whose likelihood is identifiable is an alternative parameterization of a standard mixed model. The two models provide different interpretations of the results, but are statistically equivalent. There is a one-to-one relationship between the parameters entering the likelihood in both models. This also applies, in principle, to simultaneous equation models, which in general require a larger number of constraints to achieve identifiability.
However, it may not always be easy to define the standard model that is equivalent to, say, a model involving complex simultaneous and recursive relationships among many traits. Ultimately, the choice of parameterization should be guided by the availability of software (in simple situations such as the present work), by ease of interpretation, or by the need to test a particular theory or hypothesis. The mathematical formulation of such a hypothesis may be more naturally accomplished using one parameterization and not the other.

Acknowledgement: We are grateful to Gustavo de los Campos, Robin Thompson and Daniel Gianola for discussions and comments on an earlier version of this paper.

References

Bernardo, J. M. and A. F. M. Smith (1994). Bayesian Theory. Wiley.
Blasco, A., J. P. Bidanel, and C. Haley (1995). Genetics and neonatal survival. In M. A. Varley (Ed.), The Neonatal Pig. Development and Survival, Wallingford, Oxon, UK. CAB International.
Cheverud, J. M. (1984). Quantitative genetics and developmental constraints on evolution by selection. Journal of Theoretical Biology 110.

de los Campos, G., D. Gianola, P. Boettcher, and P. Moroni (2006). A structural equation model for describing relationships between somatic cell score and milk yield in dairy goats. Journal of Animal Science 84.
Ducrocq, V. and B. Besbes (1993). Solution of multiple trait animal models with missing data on some traits. Journal of Animal Breeding and Genetics 110.
Ducrocq, V. and H. Chapuis (1997). Generalising the use of the canonical transformation for the solution of multivariate mixed model equations. Genetics, Selection, Evolution 29.
Duncan, O. D. (1975). Introduction to Structural Equation Models. Academic Press, San Diego, CA.
Gelfand, A. E., S. K. Sahu, and B. P. Carlin (1995). Efficient parameterization for normal linear mixed models. Biometrika 82.
Gelman, A., J. B. Carlin, H. S. Stern, and D. B. Rubin (2004). Bayesian Data Analysis. Chapman and Hall.
Gelman, A., X. L. Meng, and H. Stern (1996). Posterior predictive assessment of model fitness via realized discrepancies (with discussion). Statistica Sinica 6.
Gianola, D. and D. Sorensen (2004). Quantitative genetic models describing simultaneous and recursive relationships between phenotypes. Genetics 167.
Goldberger, A. S. (1972). Structural equation methods in the social sciences. Econometrica 40.
Grandinson, K., M. S. Lund, L. Rydhmer, and E. Strandberg (2002). Genetic parameters for piglet mortality traits crushing, stillbirth and total mortality, and their relation to birth weight. Acta Agriculturae Scandinavica, Series A, Animal Science 52.
Groeneveld, E. (1994). A reparameterization to improve numerical optimization in multivariate REML (co)variance component estimation. Genetics, Selection, Evolution 26.
Henderson, C. R. (1984). Applications of Linear Models in Animal Breeding. University of Guelph.
Jensen, J. and I. L. Mao (1988). Transformation algorithms in analysis of single trait and of multiple trait models with equal design matrices and one random factor per trait: a review. Journal of Animal Science 66.
Jöreskog, K. G. (1973). A general method for estimating a linear structural equation system. In A. S. Goldberger and O. D. Duncan (Eds.), Structural Equation Models in the Social Sciences. New York: Seminar.

Kerr, J. C. and N. D. Cameron (1995). Reproductive performance of pigs selected for components of efficient lean growth. Animal Science 60.
Lande, R. (1979). Quantitative genetic analysis of multivariate evolution, applied to brain:body allometry. Evolution 33.
Meyer, K. (1987). A note on the use of an equivalent model to account for relationships between animals in estimating variance components. Journal of Animal Breeding and Genetics 104.
Noguera, J. L., L. Varona, D. Babot, and J. Estany (2002). Multivariate analysis of litter size for multiple parities with production traits in pigs: II. Response to selection for litter size and correlated responses to production traits. Journal of Animal Science 80.
Quaas, R. L. (1988). Transformed mixed model equations: a recursive algorithm to eliminate $A^{-1}$. Journal of Dairy Science 72.
Roehe, R. (1999). Genetic determination of individual birthweight and its association with sows productivity traits using Bayesian analysis. Journal of Animal Science 77.
Rothschild, M. F. and J. P. Bidanel (1998). Biology and genetics of reproduction. In M. F. Rothschild and A. Ruvinsky (Eds.), The Genetics of the Pig, Wallingford, Oxon, UK. CAB International.
Searle, S. R. (1971). Linear Models. Wiley.
Sorensen, D. and D. Gianola (2002). Likelihood, Bayesian, and MCMC Methods in Quantitative Genetics. Springer-Verlag.
Sorensen, D., A. Vernersen, and S. Andersen (2000). Bayesian analysis of response to selection: A case study using litter size in Danish Yorkshire pigs. Genetics 156.
Tanner, M. A. and W. Wong (1987). The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association 82.
Thompson, R., R. E. Crump, J. Juga, and P. M. Visscher (1994). Estimating variances and covariances for bivariate animal models using scaling and transformation. Genetics, Selection, Evolution 27.
Walsh, B. (2003). Evolutionary quantitative genetics. In D. J. Balding, M. Bishop, and C. Cannings (Eds.), Handbook of Statistical Genetics, Volume I, Chichester, UK. John Wiley.

Wright, S. (1921). Correlation and causation. Journal of Agricultural Research 20.
Xiong, M., J. Li, and X. Fang (2004). Identification of genetic networks. Genetics 166.


More information

ASA Section on Survey Research Methods

ASA Section on Survey Research Methods REGRESSION-BASED STATISTICAL MATCHING: RECENT DEVELOPMENTS Chris Moriarity, Fritz Scheuren Chris Moriarity, U.S. Government Accountability Office, 411 G Street NW, Washington, DC 20548 KEY WORDS: data

More information

The Bayesian Approach to Multi-equation Econometric Model Estimation

The Bayesian Approach to Multi-equation Econometric Model Estimation Journal of Statistical and Econometric Methods, vol.3, no.1, 2014, 85-96 ISSN: 2241-0384 (print), 2241-0376 (online) Scienpress Ltd, 2014 The Bayesian Approach to Multi-equation Econometric Model Estimation

More information

A Search and Jump Algorithm for Markov Chain Monte Carlo Sampling. Christopher Jennison. Adriana Ibrahim. Seminar at University of Kuwait

A Search and Jump Algorithm for Markov Chain Monte Carlo Sampling. Christopher Jennison. Adriana Ibrahim. Seminar at University of Kuwait A Search and Jump Algorithm for Markov Chain Monte Carlo Sampling Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj Adriana Ibrahim Institute

More information

Supplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements

Supplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements Supplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements Jeffrey N. Rouder Francis Tuerlinckx Paul L. Speckman Jun Lu & Pablo Gomez May 4 008 1 The Weibull regression model

More information

Biol 206/306 Advanced Biostatistics Lab 12 Bayesian Inference Fall 2016

Biol 206/306 Advanced Biostatistics Lab 12 Bayesian Inference Fall 2016 Biol 206/306 Advanced Biostatistics Lab 12 Bayesian Inference Fall 2016 By Philip J. Bergmann 0. Laboratory Objectives 1. Learn what Bayes Theorem and Bayesian Inference are 2. Reinforce the properties

More information

Spatial Statistics with Image Analysis. Outline. A Statistical Approach. Johan Lindström 1. Lund October 6, 2016

Spatial Statistics with Image Analysis. Outline. A Statistical Approach. Johan Lindström 1. Lund October 6, 2016 Spatial Statistics Spatial Examples More Spatial Statistics with Image Analysis Johan Lindström 1 1 Mathematical Statistics Centre for Mathematical Sciences Lund University Lund October 6, 2016 Johan Lindström

More information

CTDL-Positive Stable Frailty Model

CTDL-Positive Stable Frailty Model CTDL-Positive Stable Frailty Model M. Blagojevic 1, G. MacKenzie 2 1 Department of Mathematics, Keele University, Staffordshire ST5 5BG,UK and 2 Centre of Biostatistics, University of Limerick, Ireland

More information

1 Data Arrays and Decompositions

1 Data Arrays and Decompositions 1 Data Arrays and Decompositions 1.1 Variance Matrices and Eigenstructure Consider a p p positive definite and symmetric matrix V - a model parameter or a sample variance matrix. The eigenstructure is

More information

MEASUREMENT UNCERTAINTY AND SUMMARISING MONTE CARLO SAMPLES

MEASUREMENT UNCERTAINTY AND SUMMARISING MONTE CARLO SAMPLES XX IMEKO World Congress Metrology for Green Growth September 9 14, 212, Busan, Republic of Korea MEASUREMENT UNCERTAINTY AND SUMMARISING MONTE CARLO SAMPLES A B Forbes National Physical Laboratory, Teddington,

More information

A note on Reversible Jump Markov Chain Monte Carlo

A note on Reversible Jump Markov Chain Monte Carlo A note on Reversible Jump Markov Chain Monte Carlo Hedibert Freitas Lopes Graduate School of Business The University of Chicago 5807 South Woodlawn Avenue Chicago, Illinois 60637 February, 1st 2006 1 Introduction

More information

November 2002 STA Random Effects Selection in Linear Mixed Models

November 2002 STA Random Effects Selection in Linear Mixed Models November 2002 STA216 1 Random Effects Selection in Linear Mixed Models November 2002 STA216 2 Introduction It is common practice in many applications to collect multiple measurements on a subject. Linear

More information

STRUCTURAL EQUATION MODELING. Khaled Bedair Statistics Department Virginia Tech LISA, Summer 2013

STRUCTURAL EQUATION MODELING. Khaled Bedair Statistics Department Virginia Tech LISA, Summer 2013 STRUCTURAL EQUATION MODELING Khaled Bedair Statistics Department Virginia Tech LISA, Summer 2013 Introduction: Path analysis Path Analysis is used to estimate a system of equations in which all of the

More information

1 Mixed effect models and longitudinal data analysis

1 Mixed effect models and longitudinal data analysis 1 Mixed effect models and longitudinal data analysis Mixed effects models provide a flexible approach to any situation where data have a grouping structure which introduces some kind of correlation between

More information

The lmm Package. May 9, Description Some improved procedures for linear mixed models

The lmm Package. May 9, Description Some improved procedures for linear mixed models The lmm Package May 9, 2005 Version 0.3-4 Date 2005-5-9 Title Linear mixed models Author Original by Joseph L. Schafer . Maintainer Jing hua Zhao Description Some improved

More information

Appendix: Modeling Approach

Appendix: Modeling Approach AFFECTIVE PRIMACY IN INTRAORGANIZATIONAL TASK NETWORKS Appendix: Modeling Approach There is now a significant and developing literature on Bayesian methods in social network analysis. See, for instance,

More information

A Bayesian perspective on GMM and IV

A Bayesian perspective on GMM and IV A Bayesian perspective on GMM and IV Christopher A. Sims Princeton University sims@princeton.edu November 26, 2013 What is a Bayesian perspective? A Bayesian perspective on scientific reporting views all

More information

Multilevel Analysis, with Extensions

Multilevel Analysis, with Extensions May 26, 2010 We start by reviewing the research on multilevel analysis that has been done in psychometrics and educational statistics, roughly since 1985. The canonical reference (at least I hope so) is

More information

Multiple QTL mapping

Multiple QTL mapping Multiple QTL mapping Karl W Broman Department of Biostatistics Johns Hopkins University www.biostat.jhsph.edu/~kbroman [ Teaching Miscellaneous lectures] 1 Why? Reduce residual variation = increased power

More information

Marginal Specifications and a Gaussian Copula Estimation

Marginal Specifications and a Gaussian Copula Estimation Marginal Specifications and a Gaussian Copula Estimation Kazim Azam Abstract Multivariate analysis involving random variables of different type like count, continuous or mixture of both is frequently required

More information

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω ECO 513 Spring 2015 TAKEHOME FINAL EXAM (1) Suppose the univariate stochastic process y is ARMA(2,2) of the following form: y t = 1.6974y t 1.9604y t 2 + ε t 1.6628ε t 1 +.9216ε t 2, (1) where ε is i.i.d.

More information

Statistical Inference and Methods

Statistical Inference and Methods Department of Mathematics Imperial College London d.stephens@imperial.ac.uk http://stats.ma.ic.ac.uk/ das01/ 31st January 2006 Part VI Session 6: Filtering and Time to Event Data Session 6: Filtering and

More information

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis. 401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis

More information

Introduction to Gaussian Processes

Introduction to Gaussian Processes Introduction to Gaussian Processes Iain Murray murray@cs.toronto.edu CSC255, Introduction to Machine Learning, Fall 28 Dept. Computer Science, University of Toronto The problem Learn scalar function of

More information

Measurement Error and Causal Discovery

Measurement Error and Causal Discovery Measurement Error and Causal Discovery Richard Scheines & Joseph Ramsey Department of Philosophy Carnegie Mellon University Pittsburgh, PA 15217, USA 1 Introduction Algorithms for causal discovery emerged

More information