Hierarchical generalized linear models: a Lego approach to mixed models


1 Hierarchical generalized linear models: a Lego approach to mixed models. Lars Rönnegård, Högskolan Dalarna and the Swedish University of Agricultural Sciences. Trondheim Seminar.

2 Affiliations: Borlänge and Uppsala.

3 Outline. Hierarchical Generalized Linear Models: principle, method, and the iterative GLM fitting algorithm. The hglm package in R: fitting a spatial CAR model. Using extensions of GLMs (DGLM, HGLM and DHGLM) in genetics: finding genes for uniformity; modelling uniformity in animal breeding.

4 Principles, Methods and Algorithms. Table: Overview of statistical principles.
Principle | Method | Algorithm
Bayesian | min. loss / max. posterior | MCMC, INLA
Extended likelihood | h-likelihood | N-R, IRWLS
Likelihood | maximum likelihood | N-R, Fisher scoring
Frequentist | method of moments | Monte Carlo

5 Books on the extended likelihood principle: Pawitan Y (2001) In All Likelihood; Lee Y, Nelder JA, Pawitan Y (2006) Generalized Linear Models with Random Effects.

6 Some h-likelihood people

7 Definition of the h-likelihood. Lee & Nelder's (1996) hierarchical log-likelihood (h-likelihood): $h(\beta, \theta, u) = \log f(y \mid u) + \log f(u)$. Classical inference: uses the marginal likelihood (random effects integrated out), $\int f(y, u)\,du$; includes fixed parameters, and only observations are treated as random. Bayesian inference: a probabilistic framework that combines likelihood and prior information; treats all parameters and observations as random. Extended likelihood inference: the h-likelihood is based on the extended likelihood principle, which states that all information in the data about the random and fixed effects is included in a joint likelihood (of which the h-likelihood is an implementation); includes fixed parameters, unobserved random effects, and observations as random.

9 Extended likelihood. Likelihood Principle: Birnbaum (1962) showed that the classical likelihood function contains all the information about the value of the fixed parameter. Extended Likelihood Principle: Bjørnstad (1996) showed that all information in the data $y$ for parameters $\theta$ and unobservables $u$ is in the extended likelihood.

10 The h-likelihood is the extended likelihood applied to HGLMs. Hierarchical Generalized Linear Models: generalized linear models with random effects. Both the response $y$ and the random effects $u$ can come from a wide range of distributions. Inference and model selection tools are available.

11 h-likelihood estimation for HGLM. Estimating fixed and random effects: $\partial h/\partial\beta = 0$ and $\partial h/\partial u = 0$. Estimating variance components using the adjusted profile likelihood: $$h_p = \left(h + \tfrac{1}{2}\log|2\pi D^{-1}|\right)\Big|_{\beta=\hat\beta,\,u=\hat u},$$ solving $\partial h_p/\partial\theta = 0$, where $D$ is the matrix of second derivatives of $h$ around $\beta = \hat\beta$, $u = \hat u$.

13 The h-likelihood for a linear mixed model. For a linear mixed model $y = X\beta + Zu + e$ with $u \sim N(0, I_k\sigma_u^2)$ and $e \sim N(0, I_n\sigma_e^2)$, all we need is the normal density function ($n$ iid observations with mean 0): $$f(x) = (2\pi\sigma^2)^{-n/2}\exp\left(-\frac{1}{2\sigma^2}x'x\right), \quad \text{i.e.} \quad \log f(x) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}x'x.$$ We have $e = y - X\beta - Zu$, so $$h(\beta,\theta,u) = \log f(y \mid u) + \log f(u) = \log f(e) + \log f(u) = \left\{-\frac{n}{2}\log(\sigma_e^2) - \frac{1}{2\sigma_e^2}e'e\right\} + \left\{-\frac{k}{2}\log(\sigma_u^2) - \frac{1}{2\sigma_u^2}u'u\right\}.$$

16 h-likelihood estimation for a linear mixed model. For the linear mixed model we have $$h = -\frac{n}{2}\log(\sigma_e^2) - \frac{1}{2\sigma_e^2}e'e - \frac{k}{2}\log(\sigma_u^2) - \frac{1}{2\sigma_u^2}u'u.$$ Setting the first derivatives equal to zero (i.e. $\partial h/\partial\beta = 0$ and $\partial h/\partial u = 0$) gives the standard (Henderson's) mixed model equations for estimating fixed and random effects. Estimating the variance components (REML): solve $\partial h_p/\partial\theta = 0$ with $$h_p = \left(h + \tfrac{1}{2}\log|2\pi D^{-1}|\right)\Big|_{\beta=\hat\beta,\,u=\hat u} = -\frac{n}{2}\log(\sigma_e^2) - \frac{1}{2\sigma_e^2}\hat e'\hat e - \frac{k}{2}\log(\sigma_u^2) - \frac{1}{2\sigma_u^2}\hat u'\hat u - \frac{1}{2}\log|D| + \text{const},$$ where $$D = \begin{pmatrix} X'X/\sigma_e^2 & X'Z/\sigma_e^2 \\ Z'X/\sigma_e^2 & Z'Z/\sigma_e^2 + I_k/\sigma_u^2 \end{pmatrix}.$$

18 A linear model. To start with, consider a linear model with only fixed effects: $y \sim N(X\beta, \sigma_e^2)$, which can also be written as $y = X\beta + e$, $e \sim N(0, \sigma_e^2)$. How can this model be fitted? Maximum likelihood: $\hat\beta = (X'X)^{-1}X'y$ and $\hat\sigma_e^2 = \frac{1}{n}(y - X\hat\beta)'(y - X\hat\beta)$. Unbiased residual variance estimate: $\hat\sigma_e^2 = \frac{1}{n-p}(y - X\hat\beta)'(y - X\hat\beta)$.
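
As a concrete illustration, here is a minimal R sketch of these two estimators on simulated data (variable names and sizes are illustrative, not from the slides):

set.seed(1)
n <- 100; p <- 2
X <- cbind(1, rnorm(n))                    # design matrix with intercept
y <- drop(X %*% c(2, 0.5)) + rnorm(n)      # simulate y = X beta + e
beta.hat <- solve(t(X) %*% X, t(X) %*% y)  # (X'X)^{-1} X'y
rss <- sum((y - X %*% beta.hat)^2)
rss / n        # maximum likelihood estimate of sigma_e^2 (biased)
rss / (n - p)  # unbiased residual variance estimate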

20 Using GLM. Basic idea: if the fixed effects in the mean part of the model (i.e. $\beta$) were known, then the squared residuals satisfy $e_i^2 \sim \sigma_e^2\chi_1^2$ (for observation $i$), i.e. they are gamma distributed. So the squared residuals may be fitted using a GLM with a gamma distribution and a log link function. But $\beta$ is estimated, not known, and $V(\hat e_i) = (1 - h_{ii})\sigma_e^2$, where $h_{ii}$ are the diagonal elements of the hat matrix $H = X(X'X)^{-1}X'$ (so that $\hat y = Hy$). So $\hat e_i^2/(1 - h_{ii})$ can be fitted using a GLM with a gamma distribution, a log link function, and weights $(1 - h_{ii})/2$.
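
The following R sketch shows the idea on simulated data (a hedged example; with an intercept-only gamma GLM, exp of the intercept recovers a single sigma_e^2):

set.seed(1)
n <- 100
x <- rnorm(n)
y <- 2 + 0.5 * x + rnorm(n)
fit <- lm(y ~ x)
h <- hatvalues(fit)          # diagonal elements of the hat matrix
d <- resid(fit)^2 / (1 - h)  # leverage-corrected squared residuals
disp <- glm(d ~ 1, family = Gamma(link = "log"), weights = (1 - h) / 2)
exp(coef(disp))              # estimate of sigma_e^2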

22 A heteroscedastic linear model. Consider now a linear model with only fixed effects, both in the mean and dispersion parts: $y \sim N(X\beta, \exp(X_d\beta_d))$, which can also be written as $y = X\beta + e$, $e \sim N(0, \sigma_e^2)$, $\log(\sigma_e^2) = X_d\beta_d$. How can this model be fitted? Iterate between a linear model and a GLM (implemented in the R package dglm): estimate $\beta$ for a given residual variance; estimate $\beta_d$ by fitting $\hat e_i^2/(1 - h_{ii})$ as response variable in a gamma GLM (log link) with linear predictor $X_d\beta_d$.
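
A minimal R sketch of this iteration on simulated data (variable names, effect sizes and the fixed number of iterations are assumptions for illustration; the dglm package packages the same loop):

set.seed(2)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = exp(0.25 + 0.5 * x))  # log-linear variance
w <- rep(1, n)                   # working weights, 1/sigma_i^2
for (i in 1:10) {
  mfit <- lm(y ~ x, weights = w) # mean model given the variances
  h <- hatvalues(mfit)
  d <- resid(mfit)^2 / (1 - h)
  dfit <- glm(d ~ x, family = Gamma(link = "log"), weights = (1 - h) / 2)
  w <- 1 / fitted(dfit)          # update weights from fitted variances
}
coef(mfit)  # estimates of beta
coef(dfit)  # estimates of beta_d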

24 Can this be used for linear mixed models? $y = X\beta + Zu + e$, $V = ZZ'\sigma_u^2 + I_n\sigma_e^2$. Re-write it as an augmented weighted linear model: $y_a = T\delta + e_a$, where $$y_a = \begin{pmatrix} y \\ 0 \end{pmatrix}, \quad T = \begin{pmatrix} X & Z \\ 0 & I_k \end{pmatrix}, \quad \delta = \begin{pmatrix} \beta \\ u \end{pmatrix}, \quad e_a = \begin{pmatrix} e \\ -u \end{pmatrix}.$$ The variance-covariance matrix of the augmented residual vector is given by $$V(e_a) = W^{-1} = \begin{pmatrix} I_n\sigma_e^2 & 0 \\ 0 & I_k\sigma_u^2 \end{pmatrix}.$$ The estimates from weighted least squares are given by $T'WT\hat\delta = T'Wy_a$. This is identical to Henderson's mixed model equations, where the left-hand side can be verified to be $$T'WT = \begin{pmatrix} X'X/\sigma_e^2 & X'Z/\sigma_e^2 \\ Z'X/\sigma_e^2 & Z'Z/\sigma_e^2 + I_k/\sigma_u^2 \end{pmatrix}.$$
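
A small self-contained R sketch of this augmented construction, with the variance components taken as known for simplicity (all names are illustrative):

set.seed(3)
n <- 120; k <- 10
grp <- rep(1:k, each = n / k)
Z <- model.matrix(~ factor(grp) - 1)  # incidence matrix (n x k)
X <- cbind(1, rnorm(n))
y <- drop(X %*% c(1, 0.5) + Z %*% rnorm(k, sd = 2) + rnorm(n))
sig2e <- 1; sig2u <- 4                # assumed known here
ya <- c(y, rep(0, k))                 # augmented response
Taug <- rbind(cbind(X, Z),
              cbind(matrix(0, k, ncol(X)), diag(k)))
W <- diag(c(rep(1 / sig2e, n), rep(1 / sig2u, k)))
delta.hat <- solve(t(Taug) %*% W %*% Taug, t(Taug) %*% W %*% ya)
delta.hat  # first ncol(X) elements: beta-hat; last k: BLUPs of u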

28 Use the same method as before. $\sigma_e^2$ is estimated by applying a gamma GLM to the response $\hat e_i^2/(1 - h_{ii})$ with weights $(1 - h_{ii})/2$, where the index $i$ goes from 1 to $n$; similarly for $\sigma_u^2$. Hat values are given by the diagonal elements of $H = T(T'WT)^{-1}T'W$. It is possible to have fixed effects in the linear predictor for estimating $\sigma_e^2$ (and $\sigma_u^2$). Random effects can also be added in this gamma GLM: Double Hierarchical Generalized Linear Models (DHGLM), which can be estimated using a second layer in the iterative GLM algorithm.
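
Continuing the augmented-model sketch above, the hat values and the two gamma-GLM variance updates could look as follows (a sketch of one update step under those assumptions, not the hglm internals):

H <- Taug %*% solve(t(Taug) %*% W %*% Taug, t(Taug) %*% W)
hv <- diag(H)                            # augmented hat values
ea <- drop(ya - Taug %*% delta.hat)      # augmented residuals
de <- ea[1:n]^2 / (1 - hv[1:n])          # responses for sigma_e^2
du <- ea[n + 1:k]^2 / (1 - hv[n + 1:k])  # responses for sigma_u^2
sig2e <- exp(coef(glm(de ~ 1, family = Gamma(link = "log"),
                      weights = (1 - hv[1:n]) / 2)))
sig2u <- exp(coef(glm(du ~ 1, family = Gamma(link = "log"),
                      weights = (1 - hv[n + 1:k]) / 2)))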

30 Hierarchical Generalized Linear Models

31

32 hglm notation. Linear mixed model with heteroscedastic residual variance:

library(hglm)
model2 <- hglm(fixed = y ~ x, disp = ~ x, random = ~ 1 | ID,
               family = gaussian(link = identity))
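
To make this call reproducible, a hedged simulated-data setup could be (the names y, x and ID simply match the formula above):

library(hglm)
set.seed(4)
n <- 200
ID <- factor(rep(1:20, each = 10))          # grouping factor
x <- rnorm(n)
y <- 1 + 0.5 * x + rnorm(20, sd = 1)[ID] +  # random intercept per ID
     rnorm(n, sd = exp(0.2 * x))            # heteroscedastic residual
model2 <- hglm(fixed = y ~ x, disp = ~ x, random = ~ 1 | ID,
               family = gaussian(link = identity))
summary(model2)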

33 Other possibilities in hglm. Notation using design matrices:

model2 <- hglm(X, y, Z, X.disp, family = gaussian(link = identity))

Possible to fit an animal model, random regression, etc. For instance:

animal.model <- hglm(X, y, Z = t(chol(A)), family = gaussian(link = identity))

Possible to fit several random effects:

model3 <- hglm(X, y, Z = cbind(Z1, Z2), RandC = c(ncol(Z1), ncol(Z2)),
               family = gaussian(link = identity))

Possible to fit other distributions:

model4 <- hglm(X, y, Z, family = poisson(link = log))

Possible to fit other distributions for the random effects too:

negative_binomial.model <- hglm(X, y, Z, family = poisson(link = log),
                                rand.family = Gamma(link = log))

38 Playing with Lego: Fitting a DHGLM using the hglm package. Model: $y = X\beta + Zu + e$ with $u \sim N(0, I\sigma_u^2)$, $e_i \sim N(0, \sigma_{e,i}^2)$, $\log(\sigma_e^2) = X\beta_d + Zu_d$ and $u_d \sim N(0, I\sigma_{u_d}^2)$. Easy to fit using the hglm package:

w <- rep(1, length(y))
for (i in 1:20) {
  mmean <- hglm(y = y, X = X, Z = Z, weights = w)
  mdisp <- hglm(y = mmean$resid^2, X = X, Z = Z,
                family = Gamma(link = "log"),
                weights = (1 - mmean$hv) / 2)
  w <- 1 / mdisp$fv  # prior weights = inverse fitted variances
}

39 Playing with Lego: Fitting a spatial CAR model. Linear mixed model $y = X\beta + Zu + e$ with $e \sim N(0, I_n\sigma_e^2)$ and $u \sim N(0, \Sigma)$, where $\Sigma = \tau(I_n - \rho D)^{-1}$. Here $D$ is the neighbourhood matrix specifying which areas have common borders; $\tau$ and $\rho$ are the parameters to be estimated. Eigen-decompose $D$, with eigenvalues $w$ and eigenvectors $\Gamma$. Then the eigen decomposition of the covariance matrix is $\Sigma = \Gamma\Lambda\Gamma^T$, with the diagonal matrix $\Lambda$ having elements $\tau/(1 - \rho w_i)$.

40 Playing with Lego: Fitting a spatial CAR model. Re-write the model as $y = X\beta + \Gamma^T Z\tilde u + e$ with $e \sim N(0, I_n\sigma_e^2)$ and $\tilde u \sim N(0, \Lambda)$. Use a gamma GLM with inverse link and linear predictor $\theta_0 + \theta_1 w$ to estimate the random-effect variances. Then the estimates of $\tau$ and $\rho$ are $\hat\tau = 1/\hat\theta_0$ and $\hat\rho = -\hat\theta_1/\hat\theta_0$. Possible to fit in hglm:

G <- eigen(nbr)$vectors
w <- eigen(nbr)$values
CAR.model_ugly <- hglm(X, y, Z = t(G) %*% Z,
                       X.rand.disp = model.matrix(~ w),
                       rand.family = Gamma(link = "inverse"))

Implementation in version 2.0 of hglm:

CAR.model_nice <- hglm(X, y, Z = diag(n), rand.family = CAR(D = nbr))

43 A tree genetics trial example. Figure 1: Location of each tree, with height shown in grey scale. Darkness increases with height and white indicates missing phenotype.

44 Figure 2: Estimated spatial and genetic random effects for each tree. Darkness increases with higher values.

45 Ordinary (mean-controlling) genes

46 Variance-controlling genes

47 GWAS for variance-controlling genes. Shen, Pettersson, Rönnegård and Carlborg (2012). Inheritance beyond plain heritability: variance-controlling genes in Arabidopsis thaliana. PLoS Genetics 8(8). An Arabidopsis thaliana study including 199 individuals; trait: molybdenum content; 216,130 SNPs. The most significant SNP is located within the ion transporter gene MOT1.

48 Figure 2: A gene controlling robustness of molybdenum contents in Arabidopsis. The top figure (a) shows $-\log_{10} P$ values for mean-controlling SNPs (yellow) and variance-controlling SNPs (different colours for different chromosomes). The bottom figure (b) shows the substitution effect of the MOT1 allele. (Shen et al. 2012, PLoS Genetics)

49 Using Double GLM to fit a parametric model

50 Model. Traditional model for SNP regression: $y = \mu + x_j b + e$, $e \sim N(0, \sigma_e^2)$:

model1 <- glm(y ~ SNP)

Model to detect variance-controlling genes: $y = \mu + x_j b + e$, $e_i \sim N(0, \sigma_{e,i}^2)$, $\log(\sigma_e^2) = c + x_j v$. This model is easy to fit using Gordon K. Smyth's dglm package in R:

library(dglm)
model2 <- dglm(y ~ SNP, ~ SNP)
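
A self-contained simulated example of this call (genotype coding and effect sizes are illustrative assumptions):

library(dglm)
set.seed(5)
n <- 500
SNP <- sample(0:2, n, replace = TRUE)  # genotype coded 0/1/2
y <- 10 + 0.3 * SNP + rnorm(n, sd = exp(0.2 * SNP))  # mean and variance effects
model2 <- dglm(y ~ SNP, ~ SNP)         # mean formula, then dispersion formula
summary(model2)                        # dispersion part tests v = 0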

53 Using DHGLM for animal breeding models. Rönnegård, L., Felleki, M., Fikse, W.F., Mulder, H.A. & Strandberg, E. (2010) Genetic heterogeneity of residual variance: estimation of variance components using double hierarchical generalized linear models. Genetics Selection Evolution 42:8. Rönnegård, L., Felleki, M., Fikse, W.F., Mulder, H.A. & Strandberg, E. Variance component and breeding value estimation for genetic heterogeneity of residual variance in Swedish Holstein dairy cattle. Journal of Dairy Science 96. Felleki, M., Lee, D., Lee, Y., Gilmour, A. & Rönnegård, L. Estimation of breeding values for mean and dispersion, their variance and correlation using double hierarchical generalized linear models. Genetics Research 94. Rönnegård, L. & Lee, Y. (2013) Editorial: Exploring the potential of hierarchical generalized linear models in animal breeding and genetics. Journal of Animal Breeding and Genetics 130.

54 Linear mixed model using pedigree information. Animal model: $y = X\beta + Za + e$, $a \sim N(0, A\sigma_a^2)$, $e \sim N(0, \sigma_e^2)$, where $a_i$ is the additive genetic effect for individual $i$ and $A$ is the relationship matrix (calculated from pedigree information). Estimated breeding values = best linear unbiased predictor (BLUP) of $a_i$.

55 Extending the animal model. $y = X\beta + Za + e$, $a \sim N(0, A\sigma_a^2)$, $e_i \sim N(0, \sigma_{e,i}^2)$, $\log(\sigma_e^2) = X_d\beta_d + Za_d$, $a_d \sim N(0, A\sigma_{a_d}^2)$, $\rho = \mathrm{cor}(a, a_d)$. We fit a DHGLM to the pig litter size data previously studied with MCMC in Sorensen and Waagepetersen (2003). The DHGLM can be fitted using existing variance-component estimation software (ASReml).

57 Data description. Data from Danish Pig Production: pig litter size from 4,149 sows (mean litter size 10.3). The data include 10,060 records from these 4,149 sows in 82 farms. Fixed effects: farm, season, type of insemination, parity. The number of litters per sow varies from 1 to 9.

58 Simulation results

59 Thank you! Lars Rönnegård. Special thanks to my student Majbritt Felleki and collaborators: Dalarna University: Moudud Alam; Carlborg lab, SLU: Xia Shen, Örjan Carlborg; Animal Breeding and Genetics, SLU: Erling Strandberg, Freddy Fikse; Seoul National University, Korea: Youngjo Lee; Wageningen University, The Netherlands: Herman A. Mulder; University of North Carolina at Chapel Hill, USA: William Valdar; Reindeer Unit, SLU: Anna Skarin.

60 A last illustrative example - Image reconstruction

63 Noise added

64 70% of pixels missing at random

65 Clustered 4x4 pixels missing

66

67 Deriving the algorithm directly from the h-likelihood. Estimating the variance components: solve $\partial h_p/\partial\theta = 0$ with $$h_p = \left(h + \tfrac{1}{2}\log|2\pi D^{-1}|\right)\Big|_{\beta=\hat\beta,\,u=\hat u} = C - \frac{n}{2}\log(\sigma_e^2) - \frac{1}{2\sigma_e^2}\hat e'\hat e - \frac{k}{2}\log(\sigma_u^2) - \frac{1}{2\sigma_u^2}\hat u'\hat u - \frac{1}{2}\log|D|.$$ When we take the first derivative of $\log|D|$, the hat values of the augmented model appear: $$\frac{\partial}{\partial\sigma_e^2}\log|D| = \mathrm{tr}\left(D^{-1}\frac{\partial D}{\partial\sigma_e^2}\right) = -\frac{1}{(\sigma_e^2)^2}\,\mathrm{tr}\left([X,Z](T'WT)^{-1}[X,Z]'\right) = -\frac{1}{\sigma_e^2}\sum_{i=1}^n h_{ii}.$$ So, for the residual variance we have $$\frac{\partial h_p}{\partial\sigma_e^2} = -\frac{n}{2\sigma_e^2} + \frac{1}{2(\sigma_e^2)^2}\sum_{i=1}^n \hat e_i^2 + \frac{1}{2\sigma_e^2}\sum_{i=1}^n h_{ii},$$ which can be re-written as the score function of a gamma GLM with response $\hat e_i^2/(1 - h_{ii})$. And similarly for $\sigma_u^2$.

70 Introduction to genome-wide association studies. Example: 3 individuals and 5 SNPs. Linear model $y = \mu + x_j b + e$, with an example response vector $y$ shown on the slide.

72 GWAS. Example: 3 individuals and 5 SNPs. Linear model $y = \mu + x_j b + e$, with the first SNP coded $x_1 = (1, 0, 2)'$.

73 GWAS. Fitting the first SNP, $x_1 = (1, 0, 2)'$, gives $\hat b = 6.5$ (P = 0.56).

74 GWAS. The second SNP is coded $x_2 = (0, 1, 2)'$. Calculate a P-value for each $\hat b$ and plot $-\log_{10} P$; a sketch of the full scan follows below.
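
A compact R sketch of this per-SNP scan on simulated data (the genotype matrix, trait and plotting details are illustrative):

set.seed(6)
n <- 100; m <- 5
geno <- matrix(sample(0:2, n * m, replace = TRUE), n, m)  # n individuals, m SNPs
y <- rnorm(n) + 0.8 * geno[, 3]          # SNP 3 carries a true effect
pvals <- sapply(1:m, function(j) {
  fit <- lm(y ~ geno[, j])               # y = mu + x_j b + e
  summary(fit)$coefficients[2, 4]        # P-value for b
})
plot(-log10(pvals), type = "h", xlab = "SNP", ylab = "-log10(P)")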

76 GWAS plot: $-\log_{10}(P)$ plotted against SNP index.

77 Manhattan plot. Example from Weedon et al. 2008, Nature Genetics 40.

78 Classical likelihood inference. A solution to inference for fixed unknowns $\theta$ was proposed by Fisher (1922), who developed a likelihood theory expressing the probability of observing the data as a function of the parameter value. Consider a statistical model consisting of two types of objects, data $y$ and parameter $\theta$, and two related processes on them. Statistical model for data generation: generate an instance of the data $y$ from a probability function with fixed parameters $\theta$, $f_\theta(y)$. Statistical inference: given the data $y$, make an inference about the unknown fixed $\theta$ in the stochastic model by using the likelihood $L(\theta; y)$. The connection between these two processes is $L(\theta; y) \equiv f_\theta(y)$, where $L$ and $f$ are algebraically identical, but on the left-hand side $y$ is fixed while $\theta$ varies, and on the right-hand side $\theta$ is fixed while $y$ varies. However, this approach avoids inference about any random effects.

80 Classical marginal likelihood for a linear mixed model. Consider the linear mixed model $y = X\beta + Zu + e$. In the classical likelihood approach the random effects are integrated out, $f_\theta(y) = \int f_\theta(u) f_\theta(y \mid u)\,du$, and the data generation process is given by a multivariate normal distribution $N(X\beta, \sigma_u^2 ZZ' + \sigma_e^2 I)$, with the corresponding likelihood $$L(\theta; y) = |2\pi V|^{-0.5}\exp\left(-0.5\,(y - X\beta)'V^{-1}(y - X\beta)\right).$$ Note, however, that the random effect $u$ is not included, and that the classical likelihood does not give estimates of, or inference about, the random effects.

81 Inference. For model comparisons and testing, Lee & Nelder (1996) proposed to use: the h-likelihood for random effects; the marginal likelihood $f_\theta(y)$ for fixed effects; and $f_\theta(y \mid \hat\beta)$ for the dispersion parameters. When the Laplace approximation is used, $f_\theta(y)$ is replaced by $p_v(h)$ and $f_\theta(y \mid \hat\beta)$ by $p_{\beta,v}(h)$. Model selection for nested HGLMs: we can use the above likelihoods to perform likelihood ratio tests. Model selection for non-nested HGLMs: the conditional AIC (cAIC) is defined as $-2\log f(y \mid v) + 2p_D$, where $p_D$ is the estimated number of parameters (computed from the trace of the hat matrix).
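
As a rough sketch of the cAIC computation, assuming a Gaussian model where the fitted quantities from the augmented-model sketches earlier (X, Z, delta.hat, sig2e, H, n) are available:

mu <- drop(cbind(X, Z) %*% delta.hat)  # conditional mean given u-hat
cond.loglik <- sum(dnorm(y, mean = mu, sd = sqrt(sig2e), log = TRUE))
pD <- sum(diag(H)[1:n])                # effective number of parameters
cAIC <- -2 * cond.loglik + 2 * pD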

83 Criticism of h-likelihood - example


Regression Estimation - Least Squares and Maximum Likelihood. Dr. Frank Wood Regression Estimation - Least Squares and Maximum Likelihood Dr. Frank Wood Least Squares Max(min)imization Function to minimize w.r.t. β 0, β 1 Q = n (Y i (β 0 + β 1 X i )) 2 i=1 Minimize this by maximizing

More information

Multilevel Statistical Models: 3 rd edition, 2003 Contents

Multilevel Statistical Models: 3 rd edition, 2003 Contents Multilevel Statistical Models: 3 rd edition, 2003 Contents Preface Acknowledgements Notation Two and three level models. A general classification notation and diagram Glossary Chapter 1 An introduction

More information

Chapter 4 - Fundamentals of spatial processes Lecture notes

Chapter 4 - Fundamentals of spatial processes Lecture notes Chapter 4 - Fundamentals of spatial processes Lecture notes Geir Storvik January 21, 2013 STK4150 - Intro 2 Spatial processes Typically correlation between nearby sites Mostly positive correlation Negative

More information

Linear Methods for Prediction

Linear Methods for Prediction This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Business Statistics. Tommaso Proietti. Model Evaluation and Selection. DEF - Università di Roma 'Tor Vergata'

Business Statistics. Tommaso Proietti. Model Evaluation and Selection. DEF - Università di Roma 'Tor Vergata' Business Statistics Tommaso Proietti DEF - Università di Roma 'Tor Vergata' Model Evaluation and Selection Predictive Ability of a Model: Denition and Estimation We aim at achieving a balance between parsimony

More information

REGRESSION WITH SPATIALLY MISALIGNED DATA. Lisa Madsen Oregon State University David Ruppert Cornell University

REGRESSION WITH SPATIALLY MISALIGNED DATA. Lisa Madsen Oregon State University David Ruppert Cornell University REGRESSION ITH SPATIALL MISALIGNED DATA Lisa Madsen Oregon State University David Ruppert Cornell University SPATIALL MISALIGNED DATA 10 X X X X X X X X 5 X X X X X 0 X 0 5 10 OUTLINE 1. Introduction 2.

More information

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012

Lecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 Lecture 3: Linear Models Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector of observed

More information

Outline of GLMs. Definitions

Outline of GLMs. Definitions Outline of GLMs Definitions This is a short outline of GLM details, adapted from the book Nonparametric Regression and Generalized Linear Models, by Green and Silverman. The responses Y i have density

More information

Linear Regression. September 27, Chapter 3. Chapter 3 September 27, / 77

Linear Regression. September 27, Chapter 3. Chapter 3 September 27, / 77 Linear Regression Chapter 3 September 27, 2016 Chapter 3 September 27, 2016 1 / 77 1 3.1. Simple linear regression 2 3.2 Multiple linear regression 3 3.3. The least squares estimation 4 3.4. The statistical

More information

Random vectors X 1 X 2. Recall that a random vector X = is made up of, say, k. X k. random variables.

Random vectors X 1 X 2. Recall that a random vector X = is made up of, say, k. X k. random variables. Random vectors Recall that a random vector X = X X 2 is made up of, say, k random variables X k A random vector has a joint distribution, eg a density f(x), that gives probabilities P(X A) = f(x)dx Just

More information

Estimation in Generalized Linear Models with Heterogeneous Random Effects. Woncheol Jang Johan Lim. May 19, 2004

Estimation in Generalized Linear Models with Heterogeneous Random Effects. Woncheol Jang Johan Lim. May 19, 2004 Estimation in Generalized Linear Models with Heterogeneous Random Effects Woncheol Jang Johan Lim May 19, 2004 Abstract The penalized quasi-likelihood (PQL) approach is the most common estimation procedure

More information

Linear Regression (1/1/17)

Linear Regression (1/1/17) STA613/CBB540: Statistical methods in computational biology Linear Regression (1/1/17) Lecturer: Barbara Engelhardt Scribe: Ethan Hada 1. Linear regression 1.1. Linear regression basics. Linear regression

More information

Modelling heterogeneous variance-covariance components in two-level multilevel models with application to school effects educational research

Modelling heterogeneous variance-covariance components in two-level multilevel models with application to school effects educational research Modelling heterogeneous variance-covariance components in two-level multilevel models with application to school effects educational research Research Methods Festival Oxford 9 th July 014 George Leckie

More information

MS&E 226: Small Data. Lecture 11: Maximum likelihood (v2) Ramesh Johari

MS&E 226: Small Data. Lecture 11: Maximum likelihood (v2) Ramesh Johari MS&E 226: Small Data Lecture 11: Maximum likelihood (v2) Ramesh Johari ramesh.johari@stanford.edu 1 / 18 The likelihood function 2 / 18 Estimating the parameter This lecture develops the methodology behind

More information

Bayes: All uncertainty is described using probability.

Bayes: All uncertainty is described using probability. Bayes: All uncertainty is described using probability. Let w be the data and θ be any unknown quantities. Likelihood. The probability model π(w θ) has θ fixed and w varying. The likelihood L(θ; w) is π(w

More information

BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation

BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation Yujin Chung November 29th, 2016 Fall 2016 Yujin Chung Lec13: MLE Fall 2016 1/24 Previous Parametric tests Mean comparisons (normality assumption)

More information

The Poisson transform for unnormalised statistical models. Nicolas Chopin (ENSAE) joint work with Simon Barthelmé (CNRS, Gipsa-LAB)

The Poisson transform for unnormalised statistical models. Nicolas Chopin (ENSAE) joint work with Simon Barthelmé (CNRS, Gipsa-LAB) The Poisson transform for unnormalised statistical models Nicolas Chopin (ENSAE) joint work with Simon Barthelmé (CNRS, Gipsa-LAB) Part I Unnormalised statistical models Unnormalised statistical models

More information

F & B Approaches to a simple model

F & B Approaches to a simple model A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 215 http://www.astro.cornell.edu/~cordes/a6523 Lecture 11 Applications: Model comparison Challenges in large-scale surveys

More information