Mixed-Models. version 30 October 2011


Mixed models
Mixed models estimate a vector β of fixed effects and one (or more) vectors u of random effects. Both fixed- and random-effects models always include a vector e of residuals; these are random effects as well. We speak of estimating fixed effects (BLUE = best linear unbiased estimator) and predicting random effects (BLUP = best linear unbiased predictor).

Random effects
Why use random effects? Biological motivation: we consider our sample of values as being drawn from some much larger universe of interest. Statistical motivation: treating effects as random generally uses far fewer degrees of freedom. Nuisance parameters are often treated as random.

Example
Suppose you are measuring individuals of interest from different locations and/or different years. Year, location, and year × location effects may be important. However, treating all of these as fixed uses up degrees of freedom. Conversely, simply ignoring them gives a more complicated residual error structure.

Example (cont.)
Suppose you have measured 3 individuals from location 1, 2 from location 2, and 4 from location 3. Let u_1, u_2, and u_3 denote the location values for this particular sample. Ignoring these effects creates a covariance between residuals from the same location (as they have u_i in common). A common reason to include random effects is to reduce the residual error structure back to the OLS form, σ²_e I.

Focusing just on the random effects in this model, we have y = Zu + e. Z is called the incidence matrix for the random effects. While Z is critical (the analogue of the design matrix X associated with the fixed effects), it is NOT sufficient to characterize the random effects. Most importantly, we need their covariance structure (the variance-covariance matrix).

Example (cont.)
Suppose we assume that the location effects are drawn from the same distribution (and hence have the same variance σ²_u) and are uncorrelated. Then cov(u) = σ²_u I. The resulting covariance matrix for y = Zu + e (making the OLS assumption for e) becomes

Var(y) = Z cov(u) Z^T + σ²_e I = Z (σ²_u I) Z^T + σ²_e I = σ²_u ZZ^T + σ²_e I

The covariance among observations from the same location is fully accounted for by this covariance matrix.
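A minimal numpy sketch of this example (3, 2, and 4 individuals from locations 1, 2, and 3; the variance values are assumed purely for illustration) shows how the shared u_i induces covariance between records from the same location:

```python
import numpy as np

# Illustrative variance components (assumed values)
sigma2_u, sigma2_e = 4.0, 1.0

# Incidence matrix Z: row i has a 1 in the column of individual i's location
Z = np.zeros((9, 3))
Z[0:3, 0] = 1   # 3 individuals from location 1
Z[3:5, 1] = 1   # 2 individuals from location 2
Z[5:9, 2] = 1   # 4 individuals from location 3

# Var(y) = sigma2_u * Z Z^T + sigma2_e * I
V = sigma2_u * Z @ Z.T + sigma2_e * np.eye(9)

print(V[0, 0])  # variance of one record: sigma2_u + sigma2_e = 5.0
print(V[0, 1])  # same location -> covariance sigma2_u = 4.0
print(V[0, 3])  # different locations -> 0.0
```

Records sharing a location get covariance σ²_u, exactly the structure an OLS residual term alone cannot represent.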

The general mixed model

y = Xβ + Zu + e

y: vector of observations (phenotypes)
β: vector of fixed effects (to be estimated), e.g., year, sex, and age effects
X: incidence matrix for the fixed effects
u: vector of random effects, such as individual breeding values (to be estimated)
Z: incidence matrix for the random effects
e: vector of residual errors (random effects)

The general mixed model

y = Xβ + Zu + e

We observe y, X, and Z. We estimate the fixed effects β and predict the random effects u and e.

Means & variances for y = Xβ + Zu + e
Means: E(u) = E(e) = 0, so E(y) = Xβ.
Variances: Let R be the covariance matrix for the residuals; we typically assume R = σ²_e I. Let G be the covariance matrix for the breeding values (the vector u). The covariance matrix for y then becomes V = ZGZ^T + R.

Effects of model misspecification
Suppose we simply used a general linear model (only fixed effects) for this example. Then y = Xβ + e*, where e* ~ MVN(0, V), implying y ~ MVN(Xβ, V). The effect of using a mixed model is that it partitions the residual e* as e* = Zu + e.

Estimating fixed effects & predicting random effects
For a mixed model, we observe y, X, and Z, while β, u, R, and G are generally unknown. Recall V = ZGZ^T + R. There are two complementary estimation issues:

(i) Estimation of the fixed effects β:

β̂ = (X^T V^-1 X)^-1 X^T V^-1 y

This is the BLUE = best linear unbiased estimator.

(ii) Prediction of the random effects u:

û = G Z^T V^-1 (y − Xβ̂)

This is the BLUP = best linear unbiased predictor.
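These two formulas can be sketched directly in numpy on simulated data (the balanced design and variance values below are assumptions for illustration, with a single mean as the only fixed effect):

```python
import numpy as np

rng = np.random.default_rng(1)
n, q = 9, 3
X = np.ones((n, 1))                      # fixed effect: overall mean
Z = np.kron(np.eye(q), np.ones((3, 1)))  # 3 records per random-effect level
G = 4.0 * np.eye(q)                      # cov(u), assumed
R = 1.0 * np.eye(n)                      # cov(e), assumed

u = rng.normal(0.0, 2.0, size=q)         # u ~ (0, G) since G = 4 I
e = rng.normal(0.0, 1.0, size=n)
y = 10.0 + Z @ u + e                     # true beta = 10

V = Z @ G @ Z.T + R
Vi = np.linalg.inv(V)

# BLUE of beta and BLUP of u, exactly as on the slide
beta_hat = np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ y)
u_hat = G @ Z.T @ Vi @ (y - X @ beta_hat)
```

With this balanced design the BLUE reduces to the simple mean of y, and the BLUPs sum to zero, i.e. they are shrunken deviations around the estimated mean.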

Henderson's mixed model equations
y = Xβ + Zu + e, with u ~ (0, G), e ~ (0, R), and cov(u, e) = 0. If X is n × p and Z is n × q, then

[ X^T R^-1 X    X^T R^-1 Z        ] [ β̂ ]   [ X^T R^-1 y ]
[ Z^T R^-1 X    Z^T R^-1 Z + G^-1 ] [ û ] = [ Z^T R^-1 y ]

The whole matrix is (p + q) × (p + q). By contrast, the direct solutions

β̂ = (X^T V^-1 X)^-1 X^T V^-1 y,   û = G Z^T V^-1 (y − Xβ̂),   V = ZGZ^T + R

require inversion of the n × n matrix V.
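A short numpy sketch (on a made-up balanced design) verifies that solving Henderson's (p + q) × (p + q) system gives exactly the same β̂ and û as the V-inverse formulas:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, q = 9, 1, 3
X = np.ones((n, p))
Z = np.kron(np.eye(q), np.ones((3, 1)))
G = 4.0 * np.eye(q)                       # assumed variance components
R = 1.0 * np.eye(n)
y = rng.normal(10.0, 3.0, size=n)

Ri, Gi = np.linalg.inv(R), np.linalg.inv(G)

# Henderson's mixed model equations, (p+q) x (p+q)
lhs = np.block([[X.T @ Ri @ X, X.T @ Ri @ Z],
                [Z.T @ Ri @ X, Z.T @ Ri @ Z + Gi]])
rhs = np.concatenate([X.T @ Ri @ y, Z.T @ Ri @ y])
sol = np.linalg.solve(lhs, rhs)           # stacked (beta_hat, u_hat)

# Direct route: requires inverting the n x n matrix V
V = Z @ G @ Z.T + R
Vi = np.linalg.inv(V)
beta_gls = np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ y)
u_blup = G @ Z.T @ Vi @ (y - X @ beta_gls)

assert np.allclose(sol, np.concatenate([beta_gls, u_blup]))
```

The practical payoff is the size of the system: the MME involve R^-1 and G^-1 (often diagonal or sparse) and a (p + q)-dimensional solve, rather than the dense n × n inverse of V.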

The animal model, y_i = µ + a_i + e_i
Here the individual is the unit of analysis, with y_i the phenotypic value of the individual and a_i its BV. With k individuals,

X = (1, 1, ..., 1)^T,   β = µ,   u = (a_1, a_2, ..., a_k)^T,   G = σ²_A A

where the additive genetic relationship matrix A is given by A_ij = 2Θ_ij, namely twice the coefficient of coancestry. Assume R = σ²_e I, so that R^-1 = (1/σ²_e) I. Likewise, G = σ²_A A, so that G^-1 = (1/σ²_A) A^-1.

Henderson's mixed model equations here become

[ X^T X    X^T Z          ] [ µ̂ ]   [ X^T y ]
[ Z^T X    Z^T Z + λA^-1  ] [ û ] = [ Z^T y ]

where λ = σ²_e / σ²_A = (1 − h²)/h². This reduces to

[ n    1^T        ] [ µ̂ ]   [ Σ y_i ]
[ 1    I + λA^-1  ] [ û ] = [ y     ]
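As a sketch, the reduced equations can be solved for a toy pedigree: two unrelated parents (animals 1 and 2) and their offspring (animal 3), each with one record. The heritability and the records are assumed for illustration.

```python
import numpy as np

# Toy pedigree: animals 1 and 2 are unrelated parents of animal 3
A = np.array([[1.0, 0.0, 0.5],
              [0.0, 1.0, 0.5],
              [0.5, 0.5, 1.0]])   # additive relationship matrix

h2 = 0.5                          # assumed heritability
lam = (1 - h2) / h2               # lambda = sigma2_e / sigma2_A = 1 here

y = np.array([10.0, 12.0, 13.0])  # one record per animal (illustrative)
n = len(y)

# Reduced MME: [n, 1^T; 1, I + lambda*A^-1] [mu; a] = [sum(y); y]
lhs = np.block([[np.array([[float(n)]]), np.ones((1, n))],
                [np.ones((n, 1)), np.eye(n) + lam * np.linalg.inv(A)]])
rhs = np.concatenate([[y.sum()], y])
sol = np.linalg.solve(lhs, rhs)
mu_hat, a_hat = sol[0], sol[1:]
# mu_hat = 11.5, a_hat = [-0.5, 0.5, 0.5]
```

Note how the predicted breeding values are shrunken deviations from µ̂, and how the offspring's EBV combines its own record with information from its parents through A^-1.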

Estimation of R and G
The second estimation issue is obtaining the covariance matrix R for the residuals and G for the breeding values. As we have seen, both matrices have the form σ²B, where the variance σ² is unknown but B is known. For example, for the residuals, R = σ²_e I; for the breeding values, G = σ²_A A, where A is given from the pedigree.

REML variance component estimation
REML = restricted maximum likelihood. Standard ML variance estimation assumes the fixed factors are known without error, which results in a downward bias in the variance estimates. REML maximizes the portion of the likelihood that does not depend on the fixed effects. Basic idea: use a transformation to remove the fixed effects, then perform ML on this transformed vector.

Simple variance estimate under ML vs. REML

ML:   σ̂² = (1/n) Σ_{i=1}^n (x_i − x̄)²
REML: σ̂² = (1/(n − 1)) Σ_{i=1}^n (x_i − x̄)²

REML adjusts for the estimated fixed effect, in this case the mean. With a balanced design, ANOVA variance estimates are equivalent to REML variance estimates.
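The two estimators on this slide can be checked directly in numpy (the sample values are arbitrary); np.var's ddof argument switches between the ML and REML divisors:

```python
import numpy as np

x = np.array([3.0, 5.0, 7.0, 9.0])  # illustrative sample
n = len(x)
dev2 = np.sum((x - x.mean())**2)    # corrected sum of squares, = 20 here

ml = dev2 / n           # 5.0      -- ML, biased downward
reml = dev2 / (n - 1)   # 6.666... -- REML, adjusts for the estimated mean

assert np.isclose(ml, np.var(x))             # ddof = 0 (ML divisor)
assert np.isclose(reml, np.var(x, ddof=1))   # ddof = 1 (REML divisor)
```

The gap between the two shrinks as n grows, reflecting that the bias from estimating the mean is of order 1/n.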