Mixed-Model Estimation of genetic variances. Bruce Walsh lecture notes Uppsala EQG 2012 course version 28 Jan 2012

Estimation of Var(A) and Breeding Values in General Pedigrees
The designs above (ANOVA, parent-offspring regression) are simple, each involving only a single type of relative comparison. Further, we assumed balanced designs, with the same number of offspring in each family. In the real world, we often have a pedigree of relatives with a very unbalanced design. Fortunately, the general mixed model (so called because it includes both fixed and random effects) offers an ideal platform for both estimating genetic variances and predicting the breeding values of individuals. Almost all animal breeding is based on such models, with REML (restricted maximum likelihood) used to estimate variances and BLUP (best linear unbiased predictors) used to predict breeding values (BV).

The general mixed model
$$y = X\beta + Zu + e$$
where
$y$ = vector of observations (phenotypes)
$\beta$ = vector of fixed effects (to be estimated), e.g., year, sex and age effects
$X$ = incidence matrix for fixed effects
$u$ = vector of random effects, such as individual breeding values (to be estimated)
$Z$ = incidence matrix for random effects
$e$ = vector of residual errors (random effects)
We observe $y$, $X$ and $Z$. We estimate the fixed effects $\beta$ and the random effects $u$ and $e$.

Example
Suppose we wish to estimate the breeding values of three sires, each of which is mated to a random dam, producing two offspring, some reared in environment one, others in environment two. The data are

Observation    Value   Sire   Environment
$Y_{1,1,1}$      9      1        1
$Y_{1,2,1}$     12      1        2
$Y_{2,1,1}$     11      2        1
$Y_{2,1,2}$      6      2        1
$Y_{3,1,1}$      7      3        1
$Y_{3,2,1}$     14      3        2

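Here the basic model is $Y_{ijk} = \beta_j + u_i + e_{ijk}$, where $\beta_j$ is the effect of environment $j$ and $u_i$ is the breeding value of sire $i$. The mixed model vectors and matrices become
$$
y = \begin{pmatrix} y_{1,1,1} \\ y_{1,2,1} \\ y_{2,1,1} \\ y_{2,1,2} \\ y_{3,1,1} \\ y_{3,2,1} \end{pmatrix}
  = \begin{pmatrix} 9 \\ 12 \\ 11 \\ 6 \\ 7 \\ 14 \end{pmatrix}, \quad
X = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
Z = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \quad
\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}, \quad
u = \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}
$$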
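The incidence matrices of this example can be generated mechanically from the data table. A minimal sketch in Python with NumPy follows; the names `records`, `n_env` and `n_sire` are illustrative choices, not part of the notes.

```python
import numpy as np

# Records of the sire example: (value, sire, environment)
records = [(9, 1, 1), (12, 1, 2), (11, 2, 1), (6, 2, 1), (7, 3, 1), (14, 3, 2)]
n_env, n_sire = 2, 3

y = np.array([value for value, _, _ in records], dtype=float)

# Incidence matrices: one row per record, with a single 1 marking
# the environment (X) or the sire (Z) that the record belongs to.
X = np.zeros((len(records), n_env))
Z = np.zeros((len(records), n_sire))
for row, (_, sire, env) in enumerate(records):
    X[row, env - 1] = 1.0
    Z[row, sire - 1] = 1.0

print(X.sum(axis=0))  # number of records in each environment
print(Z.sum(axis=0))  # number of records for each sire
```

The column sums give a quick sanity check on the design: four records in environment one, two in environment two, and two records per sire.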

Means and variances for $y = X\beta + Zu + e$
Means: $E(u) = E(e) = 0$, so $E(y) = X\beta$.
Variances: Let $R$ be the covariance matrix for the residuals. We typically assume $R = \sigma^2_e I$. Let $G$ be the covariance matrix for the breeding values (the vector $u$). The covariance matrix for $y$ becomes
$$V = ZGZ^T + R$$

Effects of model misspecification
Suppose we simply used a general linear model (only fixed effects) for this example. Here $y = X\beta + e^*$, where $e^* \sim \mathrm{MVN}(0, V)$, implying $y \sim \mathrm{MVN}(X\beta, V)$. The effect of using a mixed model is that it partitions the residual $e^*$ as $e^* = Zu + e$, so that $\mathrm{Var}(e^*) = ZGZ^T + R = V$.

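Estimating fixed effects and predicting random effects
For a mixed model, we observe $y$, $X$, and $Z$, while $\beta$, $u$, $R$, and $G$ are generally unknown. There are two complementary estimation issues. The first is the estimation of $\beta$ and the prediction of $u$:
$$\hat{\beta} = (X^T V^{-1} X)^{-1} X^T V^{-1} y$$
Estimation of fixed effects: BLUE = Best Linear Unbiased Estimator
$$\hat{u} = G Z^T V^{-1} (y - X\hat{\beta})$$
Prediction of random effects: BLUP = Best Linear Unbiased Predictor
Recall that $V = ZGZ^T + R$. The second issue, estimation of the covariance matrices $R$ and $G$, is taken up later.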

Let's return to our example
Assume the residuals are uncorrelated and homoscedastic, $R = \sigma^2_e I$. Hence, we need $\sigma^2_e$ to solve the BLUE/BLUP equations. Suppose $\sigma^2_e = 6$, giving $R = 6I$. Now consider $G$, the covariance matrix for $u$ (the vector of the three sire breeding values). Assume the sires are unrelated, so $G$ is diagonal with elements $\sigma^2_G$ = sire variance, where $\sigma^2_G = \sigma^2_A/4$. Suppose $\sigma^2_A = 8$, giving $G = (8/4) I$.

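With $G = (8/4) I$ and $R = 6I$,
$$
V = \frac{8}{4} Z Z^T + 6 I =
\begin{pmatrix}
8 & 2 & 0 & 0 & 0 & 0 \\
2 & 8 & 0 & 0 & 0 & 0 \\
0 & 0 & 8 & 2 & 0 & 0 \\
0 & 0 & 2 & 8 & 0 & 0 \\
0 & 0 & 0 & 0 & 8 & 2 \\
0 & 0 & 0 & 0 & 2 & 8
\end{pmatrix}
$$
giving
$$
V^{-1} = \frac{1}{30}
\begin{pmatrix}
4 & -1 & 0 & 0 & 0 & 0 \\
-1 & 4 & 0 & 0 & 0 & 0 \\
0 & 0 & 4 & -1 & 0 & 0 \\
0 & 0 & -1 & 4 & 0 & 0 \\
0 & 0 & 0 & 0 & 4 & -1 \\
0 & 0 & 0 & 0 & -1 & 4
\end{pmatrix}
$$
Solving, recalling that $V = ZGZ^T + R$:
$$
\hat{\beta} = \begin{pmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \end{pmatrix}
= (X^T V^{-1} X)^{-1} X^T V^{-1} y
= \frac{1}{18} \begin{pmatrix} 148 \\ 235 \end{pmatrix}
$$
$$
\hat{u} = \begin{pmatrix} \hat{u}_1 \\ \hat{u}_2 \\ \hat{u}_3 \end{pmatrix}
= G Z^T V^{-1} (y - X\hat{\beta})
= \frac{1}{18} \begin{pmatrix} -1 \\ 2 \\ -1 \end{pmatrix}
$$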
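The BLUE and BLUP values for the sire example can be checked numerically. The sketch below, in Python with NumPy, rebuilds $V$ from the assumed variances $\sigma^2_e = 6$ and $\sigma^2_A = 8$ and applies the two formulas directly; the variable names are illustrative.

```python
import numpy as np

y = np.array([9, 12, 11, 6, 7, 14], dtype=float)
X = np.array([[1, 0], [0, 1], [1, 0], [1, 0], [1, 0], [0, 1]], dtype=float)
# Each sire contributes two consecutive records, so Z = I_3 (x) (1, 1)^T
Z = np.kron(np.eye(3), np.ones((2, 1)))

sigma2_e, sigma2_A = 6.0, 8.0
R = sigma2_e * np.eye(6)
G = (sigma2_A / 4.0) * np.eye(3)   # sire variance = sigma2_A / 4

V = Z @ G @ Z.T + R
V_inv = np.linalg.inv(V)

# BLUE of the fixed effects (generalized least squares)
beta_hat = np.linalg.solve(X.T @ V_inv @ X, X.T @ V_inv @ y)
# BLUP of the random effects
u_hat = G @ Z.T @ V_inv @ (y - X @ beta_hat)

print(18 * beta_hat)  # compare with the numerators 148 and 235
print(18 * u_hat)     # compare with the numerators -1, 2 and -1
```

Scaling the solutions by 18 makes them directly comparable with the fractions in the worked example.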

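Henderson's Mixed Model Equations
$y = X\beta + Zu + e$, with $u \sim (0, G)$, $e \sim (0, R)$, and $\mathrm{cov}(u, e) = 0$. If $X$ is $n \times p$ and $Z$ is $n \times q$, then
$$
\begin{pmatrix}
X^T R^{-1} X & X^T R^{-1} Z \\
Z^T R^{-1} X & Z^T R^{-1} Z + G^{-1}
\end{pmatrix}
\begin{pmatrix} \hat{\beta} \\ \hat{u} \end{pmatrix}
=
\begin{pmatrix} X^T R^{-1} y \\ Z^T R^{-1} y \end{pmatrix}
$$
The upper-left block is $p \times p$, the upper-right block is $p \times q$, and the lower-right block is $q \times q$, so the whole coefficient matrix is $(p+q) \times (p+q)$. By contrast, the direct expressions
$$\hat{\beta} = (X^T V^{-1} X)^{-1} X^T V^{-1} y, \qquad \hat{u} = G Z^T V^{-1} (y - X\hat{\beta}), \qquad V = ZGZ^T + R,$$
require inversion of an $n \times n$ matrix.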

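Let's redo our previous example using Henderson's equations
$$
X^T R^{-1} X = \frac{1}{6} \begin{pmatrix} 4 & 0 \\ 0 & 2 \end{pmatrix}, \qquad
X^T R^{-1} Z = (Z^T R^{-1} X)^T = \frac{1}{6} \begin{pmatrix} 1 & 2 & 1 \\ 1 & 0 & 1 \end{pmatrix}
$$
$$
G^{-1} + Z^T R^{-1} Z = \frac{5}{6} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
X^T R^{-1} y = \frac{1}{6} \begin{pmatrix} 33 \\ 26 \end{pmatrix}, \qquad
Z^T R^{-1} y = \frac{1}{6} \begin{pmatrix} 21 \\ 17 \\ 21 \end{pmatrix}
$$
so that the mixed model equations are
$$
\frac{1}{6}
\begin{pmatrix}
4 & 0 & 1 & 2 & 1 \\
0 & 2 & 1 & 0 & 1 \\
1 & 1 & 5 & 0 & 0 \\
2 & 0 & 0 & 5 & 0 \\
1 & 1 & 0 & 0 & 5
\end{pmatrix}
\begin{pmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \\ \hat{u}_1 \\ \hat{u}_2 \\ \hat{u}_3 \end{pmatrix}
= \frac{1}{6}
\begin{pmatrix} 33 \\ 26 \\ 21 \\ 17 \\ 21 \end{pmatrix}
$$
Taking the inverse gives
$$
\begin{pmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \\ \hat{u}_1 \\ \hat{u}_2 \\ \hat{u}_3 \end{pmatrix}
= \frac{1}{18}
\begin{pmatrix} 148 \\ 235 \\ -1 \\ 2 \\ -1 \end{pmatrix}
$$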
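Henderson's equations for the sire example can be assembled block by block and solved as one linear system. This is a minimal NumPy sketch under the same assumed variances ($R = 6I$, $G = 2I$); `lhs` and `rhs` are illustrative names for the coefficient matrix and right-hand side.

```python
import numpy as np

y = np.array([9, 12, 11, 6, 7, 14], dtype=float)
X = np.array([[1, 0], [0, 1], [1, 0], [1, 0], [1, 0], [0, 1]], dtype=float)
Z = np.kron(np.eye(3), np.ones((2, 1)))

R_inv = np.eye(6) / 6.0   # R = 6 I
G_inv = np.eye(3) / 2.0   # G = (8/4) I = 2 I

# Coefficient matrix of the mixed model equations, (p + q) x (p + q) = 5 x 5
lhs = np.block([
    [X.T @ R_inv @ X, X.T @ R_inv @ Z],
    [Z.T @ R_inv @ X, Z.T @ R_inv @ Z + G_inv],
])
rhs = np.concatenate([X.T @ R_inv @ y, Z.T @ R_inv @ y])

solution = np.linalg.solve(lhs, rhs)
beta_hat, u_hat = solution[:2], solution[2:]

print(6 * lhs)        # integer pattern of the coefficient matrix
print(18 * solution)  # compare with 148, 235, -1, 2, -1
```

Only the 5 x 5 system is solved here; the 6 x 6 matrix $V$ is never formed or inverted, which is the practical appeal of this formulation when $n$ is large.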

The Animal Model, $y_i = \mu + a_i + e_i$
Here, the individual is the unit of analysis, with $y_i$ the phenotypic value of individual $i$ and $a_i$ its breeding value (BV):
$$
X = \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}, \quad
\beta = \mu, \quad
u = \begin{pmatrix} a_1 \\ \vdots \\ a_k \end{pmatrix}, \quad
G = \sigma^2_A A
$$
where the additive genetic relationship matrix $A$ is given by $A_{ij} = 2\theta_{ij}$, namely twice the coefficient of coancestry. Assume $R = \sigma^2_e I$, so that $R^{-1} = (1/\sigma^2_e) I$. Likewise, $G = \sigma^2_A A$, so that $G^{-1} = (1/\sigma^2_A) A^{-1}$.

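Henderson's mixed model equations here become
$$
\begin{pmatrix}
X^T X & X^T Z \\
Z^T X & Z^T Z + \lambda A^{-1}
\end{pmatrix}
\begin{pmatrix} \hat{\beta} \\ \hat{u} \end{pmatrix}
=
\begin{pmatrix} X^T y \\ Z^T y \end{pmatrix},
\qquad \text{where } \lambda = \sigma^2_e / \sigma^2_A = (1 - h^2)/h^2
$$
With a single record per individual ($Z = I$) and the mean as the only fixed effect ($X = 1$, a vector of ones), this reduces to
$$
\begin{pmatrix}
n & 1^T \\
1 & I + \lambda A^{-1}
\end{pmatrix}
\begin{pmatrix} \hat{\mu} \\ \hat{u} \end{pmatrix}
=
\begin{pmatrix} \sum_{i=1}^{n} y_i \\ y \end{pmatrix}
$$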

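Example
Suppose our pedigree involves five individuals, numbered 1 to 5 [pedigree diagram not recovered], with relationship matrix
$$
A = \begin{pmatrix}
1 & 0 & 0 & 1/2 & 0 \\
0 & 1 & 0 & 1/2 & 1/2 \\
0 & 0 & 1 & 0 & 1/2 \\
1/2 & 1/2 & 0 & 1 & 1/4 \\
0 & 1/2 & 1/2 & 1/4 & 1
\end{pmatrix}
$$
This matrix is consistent with individuals 1, 2 and 3 being unrelated founders, 4 the offspring of 1 and 2, and 5 the offspring of 2 and 3. Suppose $\lambda = 1$ (corresponds to $h^2 = 0.5$). In this case,
$$
I + \lambda A^{-1} = \begin{pmatrix}
5/2 & 1/2 & 0 & -1 & 0 \\
1/2 & 3 & 1/2 & -1 & -1 \\
0 & 1/2 & 5/2 & 0 & -1 \\
-1 & -1 & 0 & 3 & 0 \\
0 & -1 & -1 & 0 & 3
\end{pmatrix}
$$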
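To show how a relationship matrix of this kind can be obtained from a pedigree, the sketch below uses the standard tabular method and then forms $I + \lambda A^{-1}$. The notes do not say how $A$ was computed, so this method is a swapped-in illustration. The pedigree itself (individuals 1 to 3 as founders, 4 from parents 1 and 2, 5 from parents 2 and 3) is an assumption inferred from the entries of $A$. The function name `relationship_matrix` is hypothetical.

```python
import numpy as np

def relationship_matrix(parents):
    """Additive relationship matrix by the tabular method.

    parents[i] = (sire, dam) as 0-based indices, or None when unknown.
    Individuals must be ordered so that parents precede their offspring.
    """
    n = len(parents)
    A = np.zeros((n, n))
    for i, (s, d) in enumerate(parents):
        for j in range(i):
            # Relationship with an earlier individual j:
            # half the sum of j's relationships with i's known parents
            a_ij = sum(0.5 * A[j, p] for p in (s, d) if p is not None)
            A[i, j] = A[j, i] = a_ij
        # Diagonal: 1 + inbreeding = 1 + half the relationship between the parents
        both_known = s is not None and d is not None
        A[i, i] = 1.0 + (0.5 * A[s, d] if both_known else 0.0)
    return A

# Assumed pedigree: three founders, then 4 = (1 x 2) and 5 = (2 x 3)
parents = [(None, None), (None, None), (None, None), (0, 1), (1, 2)]
A = relationship_matrix(parents)

lam = 1.0  # lambda = (1 - h2) / h2 with h2 = 0.5
coefficient_block = np.eye(5) + lam * np.linalg.inv(A)

print(A)
print(coefficient_block)
```

For large pedigrees $A^{-1}$ is normally built directly rather than by inverting $A$; the explicit inverse is used here only because the example has five individuals.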

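Suppose the vector of observations is
$$
y = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{pmatrix}
  = \begin{pmatrix} 7 \\ 9 \\ 10 \\ 6 \\ 9 \end{pmatrix}
$$
Here $n = 5$, $\sum y_i = 41$, and Henderson's equation becomes
$$
\begin{pmatrix}
5 & 1 & 1 & 1 & 1 & 1 \\
1 & 5/2 & 1/2 & 0 & -1 & 0 \\
1 & 1/2 & 3 & 1/2 & -1 & -1 \\
1 & 0 & 1/2 & 5/2 & 0 & -1 \\
1 & -1 & -1 & 0 & 3 & 0 \\
1 & 0 & -1 & -1 & 0 & 3
\end{pmatrix}
\begin{pmatrix} \hat{\mu} \\ \hat{a}_1 \\ \hat{a}_2 \\ \hat{a}_3 \\ \hat{a}_4 \\ \hat{a}_5 \end{pmatrix}
=
\begin{pmatrix} 41 \\ 7 \\ 9 \\ 10 \\ 6 \\ 9 \end{pmatrix}
$$
Solving gives
$$
\hat{\mu} = \frac{440}{53} \approx 8.30, \qquad
\begin{pmatrix} \hat{a}_1 \\ \hat{a}_2 \\ \hat{a}_3 \\ \hat{a}_4 \\ \hat{a}_5 \end{pmatrix}
= \begin{pmatrix} -662/689 \\ 4/53 \\ 610/689 \\ -732/689 \\ 381/689 \end{pmatrix}
\approx \begin{pmatrix} -0.961 \\ 0.075 \\ 0.885 \\ -1.062 \\ 0.553 \end{pmatrix}
$$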
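The animal-model solution for this five-individual example can be reproduced by assembling the reduced mixed model equations and solving them. A minimal NumPy sketch, with $\lambda = 1$ as in the example and illustrative variable names:

```python
import numpy as np

A = np.array([
    [1.0, 0.0, 0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0, 0.0, 0.5],
    [0.5, 0.5, 0.0, 1.0, 0.25],
    [0.0, 0.5, 0.5, 0.25, 1.0],
])
y = np.array([7.0, 9.0, 10.0, 6.0, 9.0])
n = len(y)
lam = 1.0  # lambda = sigma2_e / sigma2_A, i.e. h2 = 0.5

ones = np.ones((n, 1))
# Reduced mixed model equations: X is a column of ones and Z = I
lhs = np.block([
    [np.array([[float(n)]]), ones.T],
    [ones, np.eye(n) + lam * np.linalg.inv(A)],
])
rhs = np.concatenate(([y.sum()], y))

solution = np.linalg.solve(lhs, rhs)
mu_hat, a_hat = solution[0], solution[1:]

print(mu_hat)       # estimate of the mean, compare with 440/53
print(689 * a_hat)  # predicted breeding values scaled by the common denominator
```

Individuals 1 and 4 receive negative predicted breeding values, reflecting phenotypes below the estimated mean, while the prediction for individual 2 is pulled toward zero by its two offspring, one of which is below the mean.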

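More on the animal model
Under the animal model, $y = X\beta + Za + e$, with $a \sim (0, \sigma^2_A A)$ and $e \sim (0, \sigma^2_e I)$, so that
$$\mathrm{BLUP}(a) = \sigma^2_A A Z^T V^{-1} (y - X\beta), \qquad \text{where } V = ZGZ^T + R = \sigma^2_A ZAZ^T + \sigma^2_e I$$
Consider the simplest case of a single observation on one individual, where the only fixed effect is the mean $\mu$, which is assumed known. Here $Z = A = I = (1)$, so $V = \sigma^2_A + \sigma^2_e$ and
$$\sigma^2_A A Z^T V^{-1} = \frac{\sigma^2_A}{\sigma^2_A + \sigma^2_e} = h^2, \qquad \mathrm{BLUP}(a) = h^2 (y - \mu)$$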

More generally, with single observations on $n$ unrelated individuals, $A = Z = I_{n \times n}$, so that
$$V = \sigma^2_A ZAZ^T + \sigma^2_e I = (\sigma^2_A + \sigma^2_e) I, \qquad \sigma^2_A A Z^T V^{-1} = h^2 I$$
$$\mathrm{BLUP}(a) = \sigma^2_A A Z^T V^{-1} (y - X\beta) = h^2 (y - \mu)$$
Hence, the predicted breeding value of individual $i$ is just $\mathrm{BLUP}(a_i) = h^2 (y_i - \mu)$. When at least some individuals are related and/or inbred (so that $A \neq I$), or when there are missing or multiple records (so that $Z \neq I$), the estimates of the BVs differ from this simple form, but BLUP fully accounts for this.
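The reduction of BLUP to $h^2 (y_i - \mu)$ for unrelated individuals, and its failure once individuals are related, can be checked numerically. In this sketch the variances, the known mean `mu`, and the data are illustrative values chosen for the demonstration, not figures from the notes. The matrix `A_related` is the relationship matrix of the five-individual pedigree example.

```python
import numpy as np

sigma2_A, sigma2_e = 4.0, 6.0            # illustrative variance components
h2 = sigma2_A / (sigma2_A + sigma2_e)    # heritability, here 0.4
mu = 8.0                                 # mean, assumed known
y = np.array([7.0, 9.0, 10.0, 6.0, 9.0])

def blup(A, Z, y, mu):
    """BLUP(a) = sigma2_A A Z^T V^{-1} (y - mu), with V = sigma2_A Z A Z^T + sigma2_e I."""
    V = sigma2_A * Z @ A @ Z.T + sigma2_e * np.eye(len(y))
    return sigma2_A * A @ Z.T @ np.linalg.solve(V, y - mu)

identity = np.eye(5)

# Unrelated individuals, one record each: A = Z = I
a_unrelated = blup(identity, identity, y, mu)
print(a_unrelated)     # equals h2 * (y - mu)

# Related individuals: same data, but A != I
A_related = np.array([
    [1.0, 0.0, 0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0, 0.0, 0.5],
    [0.5, 0.5, 0.0, 1.0, 0.25],
    [0.0, 0.5, 0.5, 0.25, 1.0],
])
a_related = blup(A_related, identity, y, mu)
print(a_related)       # differs from h2 * (y - mu): relatives contribute information
```

With relatives in the data, each prediction borrows information from the records of related individuals, so it is no longer a simple rescaling of the individual's own deviation.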

Estimation of R and G
The second estimation issue is the covariance matrix for the residuals, $R$, and for the breeding values, $G$. As we have seen, both matrices have the form $\sigma^2 B$, where the variance $\sigma^2$ is unknown but the matrix $B$ is known. For example, for residuals, $R = \sigma^2_e I$. For breeding values, $G = \sigma^2_A A$, where $A$ is given by the pedigree.

REML Variance Component Estimation
REML = Restricted Maximum Likelihood. Standard ML variance estimation treats the fixed effects as known without error, ignoring the degrees of freedom used to estimate them, which results in a downward bias in the variance estimates. REML maximizes that portion of the likelihood that does not depend on the fixed effects. Basic idea: use a transformation to remove the fixed effects, then perform ML on the transformed vector.

Simple variance estimate under ML vs. REML
$$
\text{ML: } \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
\text{REML: } \hat{\sigma}^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2
$$
REML adjusts for the estimated fixed effect, in this case the mean. With a balanced design, ANOVA variance estimates are equivalent to REML variance estimates.
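The two divisors can be compared on a small sample, and the REML idea of transforming away the fixed effect can be illustrated for the mean-only model. In this sketch the data are illustrative, and the contrast matrix `K` (orthonormal columns orthogonal to the vector of ones, obtained here from a QR decomposition) is one convenient choice among many.

```python
import numpy as np

x = np.array([7.0, 9.0, 10.0, 6.0, 9.0])   # illustrative data
n = len(x)
sum_sq = np.sum((x - x.mean()) ** 2)

var_ml = sum_sq / n          # ML: divisor n
var_reml = sum_sq / (n - 1)  # REML: divisor n - 1, adjusting for the estimated mean

# REML as ML on error contrasts: build K with K^T 1 = 0, so that
# w = K^T x no longer depends on the mean and has n - 1 elements.
basis = np.column_stack([np.ones(n), np.eye(n)[:, : n - 1]])
Q, _ = np.linalg.qr(basis)   # first column of Q spans the vector of ones
K = Q[:, 1:]
w = K.T @ x

# ML estimate of the variance from the transformed vector (its mean is zero)
var_from_contrasts = w @ w / (n - 1)

print(var_ml, var_reml, var_from_contrasts)
```

The estimate computed from the contrasts agrees with the $n - 1$ divisor, which is the sense in which REML performs ML after the fixed effect has been removed.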