Hierarchical generalized linear models: a Lego approach to mixed models
1 Hierarchical generalized linear models: a Lego approach to mixed models. Lars Rönnegård, Högskolan Dalarna and the Swedish University of Agricultural Sciences. Trondheim Seminar
2 Affiliations: Borlänge and Uppsala
3 Outline: Hierarchical Generalized Linear Models (principle, method, iterative GLM fitting algorithm); the hglm package in R; fitting a spatial CAR model; using extensions of GLMs (DGLM, HGLM and DHGLM) in genetics (finding genes for uniformity; modelling uniformity in animal breeding)
4 Principles, Methods and Algorithms. Table: Overview of statistical principles.
Principle | Method | Algorithm
Bayesian | min. loss / max. posterior | MCMC, INLA
Extended likelihood | h-likelihood | N-R, IRWLS
Likelihood | maximum likelihood | N-R, Fisher scoring
Frequentist | method of moments | Monte Carlo
5 Books on the extended likelihood principle: Pawitan Y (2001) In All Likelihood; Lee Y, Nelder JA, Pawitan Y (2006) Generalized Linear Models with Random Effects
6 Some h-likelihood people
7 Definition of the h-likelihood. Lee & Nelder's (1996) hierarchical log-likelihood (h-likelihood): h(β, θ, u) = log f(y|u) + log f(u). Classical inference: uses the marginal likelihood (random effects integrated out), ∫ f(y, u) du; includes fixed parameters, and only observations are treated as random. Bayesian inference: a probabilistic framework that combines likelihood and prior information; treats all parameters and observations as random. Extended likelihood inference: the h-likelihood is based on the extended likelihood principle, namely that all information in the data about the random and fixed effects is included in a joint likelihood (of which the h-likelihood is an implementation); includes fixed parameters, unobserved random effects, and observations as random.
9 Extended likelihood. Likelihood principle: Birnbaum (1962) showed that the classical likelihood function contains all the information in the data about the value of the fixed parameter. Extended likelihood principle: Bjørnstad (1996) showed that all information in the data y about the parameters θ and the unobservables u is in the extended likelihood.
10 The h-likelihood is the extended likelihood applied to HGLMs. Hierarchical Generalized Linear Models are generalized linear models with random effects. Both the response y and the random effects u can come from a wide range of distributions, and inference and model selection tools are available.
11 h-likelihood estimation for HGLM. Estimating fixed and random effects: ∂h/∂β = 0 and ∂h/∂u = 0. Estimating variance components using the adjusted profile likelihood:
h_p = (h + (1/2) log|2πD⁻¹|) evaluated at β = β̂, u = û, solving ∂h_p/∂θ = 0,
where D = −∂²h/∂(β,u)² is the (negative) matrix of second derivatives of h around β = β̂, u = û.
13 The h-likelihood for a linear mixed model. For a linear mixed model y = Xβ + Zu + e with u ~ N(0, I_k σ²_u) and e ~ N(0, I_n σ²_e), all we need is the normal density function (n iid observations with mean 0):
f(x) = (2πσ²)^(−n/2) exp(−x'x / (2σ²)), i.e. log f(x) = −(n/2) log(σ²) − x'x/(2σ²), up to an additive constant.
We have e = y − Xβ − Zu, so
h(β, θ, u) = log f(y|u) + log f(u) = log f(e) + log f(u) = {−(n/2) log(σ²_e) − e'e/(2σ²_e)} + {−(k/2) log(σ²_u) − u'u/(2σ²_u)}.
16 h-likelihood estimation for a linear mixed model. For the linear mixed model we have
h = −(n/2) log(σ²_e) − e'e/(2σ²_e) − (k/2) log(σ²_u) − u'u/(2σ²_u).
Setting the first derivatives equal to zero (i.e. ∂h/∂β = 0 and ∂h/∂u = 0) gives the standard (Henderson's) mixed model equations for estimating fixed and random effects. Estimating the variance components (REML): solve ∂h_p/∂θ = 0, where
h_p = (h + (1/2) log|2πD⁻¹|)|_{β=β̂, u=û} = −(n/2) log(σ²_e) − ê'ê/(2σ²_e) − (k/2) log(σ²_u) − û'û/(2σ²_u) − (1/2) log|D|, up to a constant, and
D = [ X'X/σ²_e , X'Z/σ²_e ; Z'X/σ²_e , Z'Z/σ²_e + I_k/σ²_u ].
18 A linear model. To start with, consider a linear model with only fixed effects: y ~ N(Xβ, Iσ²_e), which can also be written as y = Xβ + e with e ~ N(0, σ²_e). How can this model be fitted? Maximum likelihood: β̂ = (X'X)⁻¹X'y and σ̂²_e = (1/n)(y − Xβ̂)'(y − Xβ̂). Unbiased residual variance estimate: σ̂²_e = (1/(n − p))(y − Xβ̂)'(y − Xβ̂).
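The two variance estimates on this slide differ only in the divisor; a minimal numeric check in base R (simulated data, all object names illustrative):

```r
# ML divides the residual sum of squares by n; the unbiased estimate by n - p.
set.seed(42)
n <- 50
X <- cbind(1, rnorm(n))                    # n x 2 design matrix with intercept
beta <- c(1, 2)
y <- X %*% beta + rnorm(n)
beta_hat <- solve(t(X) %*% X, t(X) %*% y)  # (X'X)^{-1} X'y
r <- y - X %*% beta_hat                    # residuals
sigma2_ml <- sum(r^2) / n                  # ML estimate (biased downwards)
sigma2_ub <- sum(r^2) / (n - ncol(X))      # unbiased estimate, divisor n - p
```

The closed-form β̂ agrees with `lm()`, and `sigma2_ub` matches the residual variance `lm()` reports.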
20 Using GLM. Basic idea: if the fixed effects in the mean part of the model (i.e. β) were known, then the squared residuals would satisfy e_i²/σ²_e ~ χ²_1 (for observation i), i.e. they are gamma distributed. So the squared residuals may be fitted using a GLM with a gamma distribution and a log link function. But β is estimated, not known: V(ê_i) = (1 − h_ii)σ²_e, where h_ii are the diagonal elements of the hat matrix H = X(X'X)⁻¹X' (so that ŷ = Hy). So ê_i²/(1 − h_ii) can be fitted using a GLM with a gamma distribution and a log link function (and prior weights (1 − h_ii)/2).
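This idea can be sketched with base R only (simulated data, names illustrative): fit the leverage-corrected squared residuals with an intercept-only gamma GLM, log link, prior weights (1 − h_ii)/2, and the fitted value reproduces the unbiased residual variance estimate.

```r
set.seed(1)
n <- 100
x <- rnorm(n)
y <- 2 + x + rnorm(n, sd = 2)
fit <- lm(y ~ x)
h <- hatvalues(fit)                  # diagonal of H = X(X'X)^{-1}X'
d <- resid(fit)^2 / (1 - h)          # leverage-corrected squared residuals
disp <- glm(d ~ 1, family = Gamma(link = "log"), weights = (1 - h) / 2)
sigma2_hat <- exp(coef(disp))        # equals sum(resid^2) / (n - p)
```

The stationary point of this gamma GLM is the weighted mean of the responses, which algebraically reduces to ê'ê/(n − p).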
22 A heteroscedastic linear model. Consider now a linear model with fixed effects in both the mean and dispersion parts: y ~ N(Xβ, exp(X_d β_d)), which can also be written as y = Xβ + e with e ~ N(0, σ²_e) and log(σ²_e) = X_d β_d. How can this model be fitted? Iterate between a linear model and a GLM (implemented in the R package dglm): estimate β for a given residual variance; then estimate β_d by fitting ê_i²/(1 − h_ii) as the response variable in a gamma GLM (log link) with linear predictor X_d β_d.
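The iteration that the dglm package automates can be sketched with base R only (simulated data, names illustrative): alternate a weighted lm for the mean with a gamma GLM for the log-variance until the weights stabilise.

```r
set.seed(2)
n <- 500
x <- rnorm(n)
sd_true <- exp(0.5 * (0.2 + 0.8 * x))      # true model: log(sigma_e^2) = 0.2 + 0.8 x
y <- 1 + 2 * x + rnorm(n, sd = sd_true)
w <- rep(1, n)
for (it in 1:20) {
  mfit <- lm(y ~ x, weights = w)           # mean model given current weights
  h <- hatvalues(mfit)
  d <- resid(mfit)^2 / (1 - h)             # dispersion responses
  dfit <- glm(d ~ x, family = Gamma(link = "log"),
              weights = (1 - h) / 2)       # dispersion model: log-variance ~ x
  w <- 1 / fitted(dfit)                    # new prior weights = 1 / fitted variance
}
beta_d <- coef(dfit)                       # estimates of (0.2, 0.8)
```

With this sample size the dispersion coefficients land close to the simulated values (0.2, 0.8).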
24 Can this be used for linear mixed models? y = Xb + Zu + e, V = ZZ'σ²_u + I_n σ²_e. Re-write it as an augmented weighted linear model: y_a = Tδ + e_a, where
y_a = (y ; 0), T = [ X Z ; 0 I_k ], δ = (b ; u), e_a = (e ; −u).
The variance-covariance matrix of the augmented residual vector is given by V(e_a) = W⁻¹ = [ I_n σ²_e , 0 ; 0 , I_k σ²_u ]. The estimates from weighted least squares are given by T'WT δ̂ = T'W y_a. This is identical to Henderson's mixed model equations, where the left-hand side can be verified to be
T'WT = [ X'X/σ²_e , X'Z/σ²_e ; Z'X/σ²_e , Z'Z/σ²_e + I_k/σ²_u ].
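The identity between the augmented weighted least squares system and the marginal GLS/BLUP formulation can be checked numerically in base R (simulated data, names illustrative):

```r
set.seed(3)
n <- 30; k <- 5
X <- cbind(1, rnorm(n))
grp <- rep(1:k, length.out = n)
Z <- model.matrix(~ factor(grp) - 1)       # n x k incidence matrix
su2 <- 2; se2 <- 1                         # sigma_u^2 and sigma_e^2, taken as known
y <- X %*% c(1, -1) + Z %*% rnorm(k, sd = sqrt(su2)) + rnorm(n, sd = sqrt(se2))
# augmented model: y_a = T delta + e_a
Ta <- rbind(cbind(X, Z), cbind(matrix(0, k, ncol(X)), diag(k)))
ya <- c(y, rep(0, k))
W  <- diag(c(rep(1 / se2, n), rep(1 / su2, k)))
delta <- solve(t(Ta) %*% W %*% Ta, t(Ta) %*% W %*% ya)   # MME solution
# marginal formulation: GLS for b, BLUP for u
V <- su2 * Z %*% t(Z) + se2 * diag(n)
b_gls  <- solve(t(X) %*% solve(V) %*% X, t(X) %*% solve(V) %*% y)
u_blup <- su2 * t(Z) %*% solve(V) %*% (y - X %*% b_gls)
```

Both blocks of δ̂ agree with the marginal-model quantities to machine precision, since the two systems are algebraically identical.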
28 Use the same method as before. σ²_e is estimated by applying a gamma GLM to the response ê_i²/(1 − h_ii) with weights (1 − h_ii)/2, where the index i goes from 1 to n. Similarly for σ²_u. The hat values are given by the diagonal elements of H = T(T'WT)⁻¹T'W. It is possible to have fixed effects in the linear predictor for estimating σ²_e (and σ²_u). Random effects can also be added in this gamma GLM: Double Hierarchical Generalized Linear Models (DHGLM), which can be estimated using a second layer in the iterative GLM algorithm.
30 Hierarchical Generalized Linear Models
31
32 hglm notation. Linear mixed model with heteroscedastic residual variance:
library(hglm)
model2 <- hglm(fixed = y ~ x, disp = ~ x, random = ~ 1 | ID,
               family = gaussian(link = identity))
33 Other possibilities in hglm. Notation using design matrices:
model2 <- hglm(X = X, y = y, Z = Z, X.disp = X.disp, family = gaussian(link = identity))
Possible to fit an animal model, random regression, etc. For instance:
animal.model <- hglm(X = X, y = y, Z = t(chol(A)), family = gaussian(link = identity))
Possible to fit several random effects:
model3 <- hglm(X = X, y = y, Z = cbind(Z1, Z2), RandC = c(ncol(Z1), ncol(Z2)),
               family = gaussian(link = identity))
Possible to fit other distributions:
model4 <- hglm(X = X, y = y, Z = Z, family = poisson(link = log))
Possible to fit other distributions for the random effects too:
negative_binomial.model <- hglm(X = X, y = y, Z = Z, family = poisson(link = log),
                                rand.family = Gamma(link = log))
38 Playing with Lego: fitting a DHGLM using the hglm package.
y = Xβ + Zu + e, u ~ N(0, Iσ²_u), e_i ~ N(0, σ²_{e,i}), log(σ²_e) = Xβ_d + Zu_d, u_d ~ N(0, Iσ²_{u_d}).
Easy to fit using the hglm package:
w <- rep(1, length(y))
for (i in 1:20) {
  mmean <- hglm(y = y, X = X, Z = Z, weights = w)
  mdisp <- hglm(y = mmean$resid^2, X = X, Z = Z,
                family = Gamma(link = 'log'),
                weights = (1 - mmean$hv)/2)
  w <- 1 / mdisp$fv   # prior weights = inverse fitted variances
}
39 Playing with Lego: fitting a spatial CAR model. Linear mixed model y = Xβ + Zu + e with e ~ N(0, I_n σ²_e) and u ~ N(0, Σ), where Σ = τ(I_n − ρD)⁻¹. Here D is the neighbourhood matrix specifying which areas have common borders; τ and ρ are the parameters to be estimated. Eigen-decompose D, with eigenvalues w_i and eigenvectors Γ. Then the eigen decomposition of the covariance matrix is Σ = ΓΛΓ', with the diagonal matrix Λ having elements τ(1 − ρw_i)⁻¹.
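The inverse-link trick used on the following slide drops out of this decomposition directly; a short derivation in the notation above:

```latex
\Sigma = \Gamma \Lambda \Gamma', \qquad
\Lambda_{ii} = \frac{\tau}{1 - \rho w_i}
\;\Longrightarrow\;
\frac{1}{\Lambda_{ii}} = \frac{1 - \rho w_i}{\tau}
= \underbrace{\tfrac{1}{\tau}}_{\theta_0}
+ \underbrace{\bigl(-\tfrac{\rho}{\tau}\bigr)}_{\theta_1} w_i .
```

So the precision of each transformed random effect is exactly linear in the eigenvalue w_i, which is why a gamma GLM with inverse link and linear predictor θ_0 + θ_1 w_i recovers τ̂ = 1/θ̂_0 and ρ̂ = −θ̂_1/θ̂_0.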
40 Playing with Lego: fitting a spatial CAR model. Re-write the model as y = Xβ + ZΓũ + e with e ~ N(0, I_n σ²_e) and ũ ~ N(0, Λ). Use a gamma GLM with inverse link and linear predictor θ_0 + θ_1 w to estimate the random-effect variances. Then the estimates of τ and ρ are τ̂ = 1/θ̂_0 and ρ̂ = −θ̂_1/θ̂_0. Possible to fit in hglm:
G <- eigen(nbr)$vectors
w <- eigen(nbr)$values
CAR.model_ugly <- hglm(X = X, y = y, Z = Z %*% G,
                       X.rand.disp = model.matrix(~ w),
                       rand.family = Gamma(link = "inverse"))
Implementation in version 2.0 of hglm:
CAR.model_nice <- hglm(X = X, y = y, Z = diag(n), rand.family = CAR(D = nbr))
43 A tree genetics trial example Figure 1: Location of each tree and their height given in grey scale. Darkness increases with height and white indicates missing phenotype.
44 Figure 2: Estimated spatial and genetic random effects for each tree. Darkness increases with higher values.
45 Ordinary (mean-controlling) genes
46 Variance-controlling genes
47 GWAS for variance-controlling genes. Shen, Pettersson, Rönnegård and Carlborg (2012). Inheritance beyond plain heritability: variance-controlling genes in Arabidopsis thaliana. PLoS Genetics 8(8):e. Arabidopsis thaliana study including 199 individuals. Trait: molybdenum content. 216,130 SNPs. The most significant SNP is located within the ion transporter gene MOT1.
48 Figure 2: A gene controlling robustness of molybdenum content in Arabidopsis. Top figure (a) shows log P values for mean-controlling SNPs (yellow) and variance-controlling SNPs (different colours for different chromosomes). Bottom figure (b) shows the substitution effect of the MOT1 allele. (Shen et al., PLoS Genetics)
49 Using a double GLM to fit a parametric model
50 Model. Traditional model for SNP regression: y = µ + x_j b + e, e ~ N(0, σ²_e).
model1 <- glm(y ~ SNP)
Model to detect variance-controlling genes: y = µ + x_j b + e, e_i ~ N(0, σ²_{e,i}), log(σ²_e) = c + x_j v. This model is easy to fit using Gordon K. Smyth's dglm package in R:
library(dglm)
model2 <- dglm(y ~ SNP, ~ SNP)
53 Using DHGLM for animal breeding models.
Rönnegård, L., Felleki, M., Fikse, W.F., Mulder, H.A. & Strandberg, E. (2010). Genetic heterogeneity of residual variance - estimation of variance components using double hierarchical generalized linear models. Genetics Selection Evolution 42:8.
Rönnegård, L., Felleki, M., Fikse, W.F., Mulder, H.A. & Strandberg, E. Variance component and breeding value estimation for genetic heterogeneity of residual variance in Swedish Holstein dairy cattle. Journal of Dairy Science 96:
Felleki, M., Lee, D., Lee, Y., Gilmour, A. & Rönnegård, L. Estimation of breeding values for mean and dispersion, their variance and correlation using double hierarchical generalized linear models. Genetics Research 94:
Rönnegård, L. & Lee, Y. (2013). Editorial: Exploring the potential of hierarchical generalized linear models in animal breeding and genetics. Journal of Animal Breeding and Genetics 130:
54 Linear mixed model using pedigree information. Animal model: y = Xβ + Za + e, a ~ N(0, Aσ²_a), e ~ N(0, Iσ²_e). Here a_i is the additive genetic effect for individual i, and A is the relationship matrix (calculated from pedigree information). Estimated Breeding Values = Best Linear Unbiased Predictor (BLUP) of a_i.
55 Extending the animal model: y = Xβ + Za + e, a ~ N(0, Aσ²_a), e_i ~ N(0, σ²_{e,i}), log(σ²_e) = X_d β_d + Za_d, a_d ~ N(0, Aσ²_{a_d}), ρ = cor(a, a_d). We fit a DHGLM to the pig litter size data previously studied in Sorensen and Waagepetersen (2003) using MCMC. The DHGLM is possible to fit using existing variance-component estimation software (ASReml).
57 Data description. Data from Danish Pig Production: pig litter size from 4,149 sows, with mean litter size 10.3. The data include 10,060 records from these 4,149 sows in 82 farms. Fixed effects: farm, season, type of insemination, parity. The number of litters per sow varies from 1 to 9.
58 Simulation results
59 Thank you! Lars Rönnegård. Special thanks to my student Majbritt Felleki and collaborators. Dalarna University: Moudud Alam; Carlborg lab, SLU: Xia Shen, Örjan Carlborg; Animal Breeding and Genetics, SLU: Erling Strandberg, Freddy Fikse; Seoul National University, Korea: Youngjo Lee; Wageningen University, The Netherlands: Herman A. Mulder; University of North Carolina at Chapel Hill, USA: William Valdar; Reindeer Unit, SLU: Anna Skarin
60 A last illustrative example - Image reconstruction
61 A last illustrative example - Image reconstruction
62 A last illustrative example - Image reconstruction
63 Noise added
64 70% of pixels missing at random
65 Clustered 4x4 pixels missing
66
67 Deriving the algorithm directly from the h-likelihood. Estimating the variance components: solve ∂h_p/∂θ = 0, where
h_p = (h + (1/2) log|2πD⁻¹|)|_{β=β̂, u=û} = C − (n/2) log(σ²_e) − ê'ê/(2σ²_e) − (k/2) log(σ²_u) − û'û/(2σ²_u) − (1/2) log|D|.
When we take the first derivative of log|D|, the hat values of the augmented model appear:
∂ log|D| / ∂σ²_e = tr(D⁻¹ ∂D/∂σ²_e) = −(1/(σ²_e)²) tr([X, Z] (T'WT)⁻¹ [X, Z]') = −(1/σ²_e) Σ_{i=1..n} h_ii.
So, for the residual variance we have
∂h_p/∂σ²_e = −n/(2σ²_e) + ê'ê/(2(σ²_e)²) + (1/(2σ²_e)) Σ_{i=1..n} h_ii.
This can be re-written as the score function for a gamma GLM with response ê_i²/(1 − h_ii) and prior weights (1 − h_ii)/2. And similarly for σ²_u.
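To make the gamma GLM connection explicit, set the score to zero and multiply by 2σ²_e (a short derivation in the slide's notation):

```latex
-\,n + \frac{\hat e'\hat e}{\sigma_e^2} + \sum_{i=1}^{n} h_{ii} = 0
\;\Longrightarrow\;
\hat\sigma_e^2 = \frac{\sum_i \hat e_i^2}{\,n - \sum_i h_{ii}\,}.
```

This is exactly the stationary point of an intercept-only gamma GLM with log link, responses d_i = ê_i²/(1 − h_ii) and prior weights (1 − h_ii)/2:

```latex
\sum_{i=1}^{n} \frac{1-h_{ii}}{2}\,\frac{d_i - \mu}{\mu} = 0
\;\Longrightarrow\;
\hat\mu = \frac{\sum_i \hat e_i^2 / 2}{\sum_i (1-h_{ii})/2}
        = \frac{\sum_i \hat e_i^2}{\,n - \sum_i h_{ii}\,} = \hat\sigma_e^2 .
```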
70 Introduction to genome-wide association studies. Example: 3 individuals and 5 SNPs. Linear model y = µ + x_j b + e.
72 GWAS. Example: 3 individuals and 5 SNPs. Linear model y = µ + x_j b + e with x1 = (1, 0, 2)'.
73 GWAS. Example: 3 individuals and 5 SNPs. Linear model y = µ + x_j b + e with x1 = (1, 0, 2)', giving b̂ = 6.5 (P = 0.56).
74 GWAS. Example: 3 individuals and 5 SNPs. Linear model y = µ + x_j b + e with x2 = (0, 1, 2)'. Calculate the P-value for each b and plot −log₁₀ P.
76 GWAS plot: −log₁₀(P) against SNP position.
77 Manhattan plot. Example from Weedon et al. 2008, Nature Genetics 40,
78 Classical likelihood inference. A solution to inference for fixed unknowns θ was proposed by Fisher (1922). He developed a likelihood theory, expressing the probability of observing the data as a function of the parameter value. Consider a statistical model consisting of two types of objects, data y and parameter θ, and two related processes on them. Statistical model for data generation: generate an instance of the data y from a probability function with fixed parameters θ, f_θ(y). Statistical inference: given the data y, make an inference about the unknown fixed θ in the stochastic model by using the likelihood L(θ; y). The connection between these two processes is L(θ; y) ≡ f_θ(y), where L and f are algebraically identical, but on the left-hand side y is fixed while θ varies, and on the right-hand side θ is fixed while y varies. However, this approach avoids inference about any random effects.
80 Classical marginal likelihood for a linear mixed model. Consider the linear mixed model y = Xβ + Zu + e. In the classical likelihood approach the random effects are integrated out,
f_θ(y) = ∫ f_θ(u) f_θ(y|u) du,
and the data generation process is given by a multivariate normal distribution N(Xβ, σ²_u ZZ' + σ²_e I), with the corresponding likelihood
L(θ; y) = |2πV|^(−1/2) exp(−(1/2)(y − Xβ)'V⁻¹(y − Xβ)), where V = σ²_u ZZ' + σ²_e I.
Note, however, that the random effect u is not included, and that the classical likelihood does not give estimates of, nor inference about, the random effects.
81 Inference. For model comparisons and testing, Lee & Nelder (1996) proposed to use the h-likelihood f_θ(y, v) for random effects, the marginal likelihood f_θ(y) for fixed effects, and f_θ(y|β̂) for the dispersion parameters. When the Laplace approximation is used, f_θ(y) is replaced by p_v(h) and f_θ(y|β̂) by p_{β,v}(h). Model selection for nested HGLMs: we can use the above likelihoods to perform likelihood ratio tests. Model selection for non-nested HGLMs: the conditional AIC (cAIC) is defined as −2f(y|v) + 2p_D, where p_D is the estimated number of parameters (computed from the trace of the hat matrix).
83 Criticism of h-likelihood - example
More informationLinear Mixed Models. One-way layout REML. Likelihood. Another perspective. Relationship to classical ideas. Drawbacks.
Linear Mixed Models One-way layout Y = Xβ + Zb + ɛ where X and Z are specified design matrices, β is a vector of fixed effect coefficients, b and ɛ are random, mean zero, Gaussian if needed. Usually think
More informationMIT Spring 2015
Regression Analysis MIT 18.472 Dr. Kempthorne Spring 2015 1 Outline Regression Analysis 1 Regression Analysis 2 Multiple Linear Regression: Setup Data Set n cases i = 1, 2,..., n 1 Response (dependent)
More informationLecture 5: BLUP (Best Linear Unbiased Predictors) of genetic values. Bruce Walsh lecture notes Tucson Winter Institute 9-11 Jan 2013
Lecture 5: BLUP (Best Linear Unbiased Predictors) of genetic values Bruce Walsh lecture notes Tucson Winter Institute 9-11 Jan 013 1 Estimation of Var(A) and Breeding Values in General Pedigrees The classic
More informationThe linear model is the most fundamental of all serious statistical models encompassing:
Linear Regression Models: A Bayesian perspective Ingredients of a linear model include an n 1 response vector y = (y 1,..., y n ) T and an n p design matrix (e.g. including regressors) X = [x 1,..., x
More informationBayesian construction of perceptrons to predict phenotypes from 584K SNP data.
Bayesian construction of perceptrons to predict phenotypes from 584K SNP data. Luc Janss, Bert Kappen Radboud University Nijmegen Medical Centre Donders Institute for Neuroscience Introduction Genetic
More informationSimple Regression Model Setup Estimation Inference Prediction. Model Diagnostic. Multiple Regression. Model Setup and Estimation.
Statistical Computation Math 475 Jimin Ding Department of Mathematics Washington University in St. Louis www.math.wustl.edu/ jmding/math475/index.html October 10, 2013 Ridge Part IV October 10, 2013 1
More informationESTIMATING THE MEAN LEVEL OF FINE PARTICULATE MATTER: AN APPLICATION OF SPATIAL STATISTICS
ESTIMATING THE MEAN LEVEL OF FINE PARTICULATE MATTER: AN APPLICATION OF SPATIAL STATISTICS Richard L. Smith Department of Statistics and Operations Research University of North Carolina Chapel Hill, N.C.,
More informationTopic 12 Overview of Estimation
Topic 12 Overview of Estimation Classical Statistics 1 / 9 Outline Introduction Parameter Estimation Classical Statistics Densities and Likelihoods 2 / 9 Introduction In the simplest possible terms, the
More informationHeritability estimation in modern genetics and connections to some new results for quadratic forms in statistics
Heritability estimation in modern genetics and connections to some new results for quadratic forms in statistics Lee H. Dicker Rutgers University and Amazon, NYC Based on joint work with Ruijun Ma (Rutgers),
More informationST 740: Linear Models and Multivariate Normal Inference
ST 740: Linear Models and Multivariate Normal Inference Alyson Wilson Department of Statistics North Carolina State University November 4, 2013 A. Wilson (NCSU STAT) Linear Models November 4, 2013 1 /
More informationMixed-Models. version 30 October 2011
Mixed-Models version 30 October 2011 Mixed models Mixed models estimate a vector! of fixed effects and one (or more) vectors u of random effects Both fixed and random effects models always include a vector
More informationOn the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models
On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Institute of Statistics and Econometrics Georg-August-University Göttingen Department of Statistics
More information21. Best Linear Unbiased Prediction (BLUP) of Random Effects in the Normal Linear Mixed Effects Model
21. Best Linear Unbiased Prediction (BLUP) of Random Effects in the Normal Linear Mixed Effects Model Copyright c 2018 (Iowa State University) 21. Statistics 510 1 / 26 C. R. Henderson Born April 1, 1911,
More information1 Mixed effect models and longitudinal data analysis
1 Mixed effect models and longitudinal data analysis Mixed effects models provide a flexible approach to any situation where data have a grouping structure which introduces some kind of correlation between
More informationOn the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models
On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Department of Mathematics Carl von Ossietzky University Oldenburg Sonja Greven Department of
More informationWeighted Least Squares
Weighted Least Squares The standard linear model assumes that Var(ε i ) = σ 2 for i = 1,..., n. As we have seen, however, there are instances where Var(Y X = x i ) = Var(ε i ) = σ2 w i. Here w 1,..., w
More informationLecture 2: Linear Models. Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011
Lecture 2: Linear Models Bruce Walsh lecture notes Seattle SISG -Mixed Model Course version 23 June 2011 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector
More informationBayesian linear regression
Bayesian linear regression Linear regression is the basis of most statistical modeling. The model is Y i = X T i β + ε i, where Y i is the continuous response X i = (X i1,..., X ip ) T is the corresponding
More informationUsing Estimating Equations for Spatially Correlated A
Using Estimating Equations for Spatially Correlated Areal Data December 8, 2009 Introduction GEEs Spatial Estimating Equations Implementation Simulation Conclusion Typical Problem Assess the relationship
More informationLikelihood Methods. 1 Likelihood Functions. The multivariate normal distribution likelihood function is
Likelihood Methods 1 Likelihood Functions The multivariate normal distribution likelihood function is The log of the likelihood, say L 1 is Ly = π.5n V.5 exp.5y Xb V 1 y Xb. L 1 = 0.5[N lnπ + ln V +y Xb
More informationBAYESIAN KRIGING AND BAYESIAN NETWORK DESIGN
BAYESIAN KRIGING AND BAYESIAN NETWORK DESIGN Richard L. Smith Department of Statistics and Operations Research University of North Carolina Chapel Hill, N.C., U.S.A. J. Stuart Hunter Lecture TIES 2004
More informationLecture 8 Genomic Selection
Lecture 8 Genomic Selection Guilherme J. M. Rosa University of Wisconsin-Madison Mixed Models in Quantitative Genetics SISG, Seattle 18 0 Setember 018 OUTLINE Marker Assisted Selection Genomic Selection
More informationBest unbiased linear Prediction: Sire and Animal models
Best unbiased linear Prediction: Sire and Animal models Raphael Mrode Training in quantitative genetics and genomics 3 th May to th June 26 ILRI, Nairobi Partner Logo Partner Logo BLUP The MME of provided
More informationConjugate Analysis for the Linear Model
Conjugate Analysis for the Linear Model If we have good prior knowledge that can help us specify priors for β and σ 2, we can use conjugate priors. Following the procedure in Christensen, Johnson, Branscum,
More informationMultiple regression. CM226: Machine Learning for Bioinformatics. Fall Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar
Multiple regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Multiple regression 1 / 36 Previous two lectures Linear and logistic
More informationFoundations of Statistical Inference
Foundations of Statistical Inference Julien Berestycki Department of Statistics University of Oxford MT 2015 Julien Berestycki (University of Oxford) SB2a MT 2015 1 / 16 Lecture 16 : Bayesian analysis
More informationLinear Methods for Prediction
Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we
More informationAnimal Models. Sheep are scanned at maturity by ultrasound(us) to determine the amount of fat surrounding the muscle. A model (equation) might be
Animal Models 1 Introduction An animal model is one in which there are one or more observations per animal, and all factors affecting those observations are described including an animal additive genetic
More informationStat 5101 Lecture Notes
Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random
More informationOutline for today. Computation of the likelihood function for GLMMs. Likelihood for generalized linear mixed model
Outline for today Computation of the likelihood function for GLMMs asmus Waagepetersen Department of Mathematics Aalborg University Denmark likelihood for GLMM penalized quasi-likelihood estimation Laplace
More informationRegularized PCA to denoise and visualise data
Regularized PCA to denoise and visualise data Marie Verbanck Julie Josse François Husson Laboratoire de statistique, Agrocampus Ouest, Rennes, France CNAM, Paris, 16 janvier 2013 1 / 30 Outline 1 PCA 2
More informationGeneralized Linear Models. Kurt Hornik
Generalized Linear Models Kurt Hornik Motivation Assuming normality, the linear model y = Xβ + e has y = β + ε, ε N(0, σ 2 ) such that y N(μ, σ 2 ), E(y ) = μ = β. Various generalizations, including general
More informationStat 579: Generalized Linear Models and Extensions
Stat 579: Generalized Linear Models and Extensions Linear Mixed Models for Longitudinal Data Yan Lu April, 2018, week 15 1 / 38 Data structure t1 t2 tn i 1st subject y 11 y 12 y 1n1 Experimental 2nd subject
More informationRonald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California
Texts in Statistical Science Bayesian Ideas and Data Analysis An Introduction for Scientists and Statisticians Ronald Christensen University of New Mexico Albuquerque, New Mexico Wesley Johnson University
More informationBayesian Linear Models
Bayesian Linear Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Department of Forestry & Department of Geography, Michigan State University, Lansing Michigan, U.S.A. 2 Biostatistics, School of Public
More informationModeling Real Estate Data using Quantile Regression
Modeling Real Estate Data using Semiparametric Quantile Regression Department of Statistics University of Innsbruck September 9th, 2011 Overview 1 Application: 2 3 4 Hedonic regression data for house prices
More informationPh.D. Qualifying Exam Friday Saturday, January 6 7, 2017
Ph.D. Qualifying Exam Friday Saturday, January 6 7, 2017 Put your solution to each problem on a separate sheet of paper. Problem 1. (5106) Let X 1, X 2,, X n be a sequence of i.i.d. observations from a
More informationRecent advances in statistical methods for DNA-based prediction of complex traits
Recent advances in statistical methods for DNA-based prediction of complex traits Mintu Nath Biomathematics & Statistics Scotland, Edinburgh 1 Outline Background Population genetics Animal model Methodology
More informationOn the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models
On the Behavior of Marginal and Conditional Akaike Information Criteria in Linear Mixed Models Thomas Kneib Department of Mathematics Carl von Ossietzky University Oldenburg Sonja Greven Department of
More informationIntroduction to General and Generalized Linear Models
Introduction to General and Generalized Linear Models Mixed effects models - Part II Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7
MA 575 Linear Models: Cedric E. Ginestet, Boston University Midterm Review Week 7 1 Random Vectors Let a 0 and y be n 1 vectors, and let A be an n n matrix. Here, a 0 and A are non-random, whereas y is
More informationMatrix Approach to Simple Linear Regression: An Overview
Matrix Approach to Simple Linear Regression: An Overview Aspects of matrices that you should know: Definition of a matrix Addition/subtraction/multiplication of matrices Symmetric/diagonal/identity matrix
More informationAn Introduction to Bayesian Linear Regression
An Introduction to Bayesian Linear Regression APPM 5720: Bayesian Computation Fall 2018 A SIMPLE LINEAR MODEL Suppose that we observe explanatory variables x 1, x 2,..., x n and dependent variables y 1,
More informationIntegrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University
Integrated Likelihood Estimation in Semiparametric Regression Models Thomas A. Severini Department of Statistics Northwestern University Joint work with Heping He, University of York Introduction Let Y
More informationChapter 3: Maximum Likelihood Theory
Chapter 3: Maximum Likelihood Theory Florian Pelgrin HEC September-December, 2010 Florian Pelgrin (HEC) Maximum Likelihood Theory September-December, 2010 1 / 40 1 Introduction Example 2 Maximum likelihood
More information1 Data Arrays and Decompositions
1 Data Arrays and Decompositions 1.1 Variance Matrices and Eigenstructure Consider a p p positive definite and symmetric matrix V - a model parameter or a sample variance matrix. The eigenstructure is
More informationRestricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model
Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model Xiuming Zhang zhangxiuming@u.nus.edu A*STAR-NUS Clinical Imaging Research Center October, 015 Summary This report derives
More informationWU Weiterbildung. Linear Mixed Models
Linear Mixed Effects Models WU Weiterbildung SLIDE 1 Outline 1 Estimation: ML vs. REML 2 Special Models On Two Levels Mixed ANOVA Or Random ANOVA Random Intercept Model Random Coefficients Model Intercept-and-Slopes-as-Outcomes
More informationLinear Models A linear model is defined by the expression
Linear Models A linear model is defined by the expression x = F β + ɛ. where x = (x 1, x 2,..., x n ) is vector of size n usually known as the response vector. β = (β 1, β 2,..., β p ) is the transpose
More informationBayesian Linear Models
Bayesian Linear Models Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department
More informationMaximum Likelihood Estimation
Maximum Likelihood Estimation Merlise Clyde STA721 Linear Models Duke University August 31, 2017 Outline Topics Likelihood Function Projections Maximum Likelihood Estimates Readings: Christensen Chapter
More informationLinear Models for the Prediction of Animal Breeding Values
Linear Models for the Prediction of Animal Breeding Values R.A. Mrode, PhD Animal Data Centre Fox Talbot House Greenways Business Park Bellinger Close Chippenham Wilts, UK CAB INTERNATIONAL Preface ix
More informationAsymptotic Multivariate Kriging Using Estimated Parameters with Bayesian Prediction Methods for Non-linear Predictands
Asymptotic Multivariate Kriging Using Estimated Parameters with Bayesian Prediction Methods for Non-linear Predictands Elizabeth C. Mannshardt-Shamseldin Advisor: Richard L. Smith Duke University Department
More informationModels for spatial data (cont d) Types of spatial data. Types of spatial data (cont d) Hierarchical models for spatial data
Hierarchical models for spatial data Based on the book by Banerjee, Carlin and Gelfand Hierarchical Modeling and Analysis for Spatial Data, 2004. We focus on Chapters 1, 2 and 5. Geo-referenced data arise
More informationQuantitative Genomics and Genetics BTRY 4830/6830; PBSB
Quantitative Genomics and Genetics BTRY 4830/6830; PBSB.5201.01 Lecture16: Population structure and logistic regression I Jason Mezey jgm45@cornell.edu April 11, 2017 (T) 8:40-9:55 Announcements I April
More informationBayesian Linear Models
Eric F. Lock UMN Division of Biostatistics, SPH elock@umn.edu 03/07/2018 Linear model For observations y 1,..., y n, the basic linear model is y i = x 1i β 1 +... + x pi β p + ɛ i, x 1i,..., x pi are predictors
More informationCourse topics (tentative) The role of random effects
Course topics (tentative) random effects linear mixed models analysis of variance frequentist likelihood-based inference (MLE and REML) prediction Bayesian inference The role of random effects Rasmus Waagepetersen
More informationA short introduction to INLA and R-INLA
A short introduction to INLA and R-INLA Integrated Nested Laplace Approximation Thomas Opitz, BioSP, INRA Avignon Workshop: Theory and practice of INLA and SPDE November 7, 2018 2/21 Plan for this talk
More information-A wild house sparrow population case study
Bayesian animal model using Integrated Nested Laplace Approximations -A wild house sparrow population case study Anna M. Holand Ingelin Steinsland Animal model workshop with application to ecology at Oulu
More informationPeter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8
Contents 1 Linear model 1 2 GLS for multivariate regression 5 3 Covariance estimation for the GLM 8 4 Testing the GLH 11 A reference for some of this material can be found somewhere. 1 Linear model Recall
More informationFractional Imputation in Survey Sampling: A Comparative Review
Fractional Imputation in Survey Sampling: A Comparative Review Shu Yang Jae-Kwang Kim Iowa State University Joint Statistical Meetings, August 2015 Outline Introduction Fractional imputation Features Numerical
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationDistinctive aspects of non-parametric fitting
5. Introduction to nonparametric curve fitting: Loess, kernel regression, reproducing kernel methods, neural networks Distinctive aspects of non-parametric fitting Objectives: investigate patterns free
More informationHierarchical Modeling for Univariate Spatial Data
Hierarchical Modeling for Univariate Spatial Data Geography 890, Hierarchical Bayesian Models for Environmental Spatial Data Analysis February 15, 2011 1 Spatial Domain 2 Geography 890 Spatial Domain This
More informationFinal Review. Yang Feng. Yang Feng (Columbia University) Final Review 1 / 58
Final Review Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Final Review 1 / 58 Outline 1 Multiple Linear Regression (Estimation, Inference) 2 Special Topics for Multiple
More information1 Appendix A: Matrix Algebra
Appendix A: Matrix Algebra. Definitions Matrix A =[ ]=[A] Symmetric matrix: = for all and Diagonal matrix: 6=0if = but =0if 6= Scalar matrix: the diagonal matrix of = Identity matrix: the scalar matrix
More informationAssociation Testing with Quantitative Traits: Common and Rare Variants. Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5
Association Testing with Quantitative Traits: Common and Rare Variants Timothy Thornton and Katie Kerr Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5 1 / 41 Introduction to Quantitative
More informationBIOS 2083 Linear Models c Abdus S. Wahed
Chapter 5 206 Chapter 6 General Linear Model: Statistical Inference 6.1 Introduction So far we have discussed formulation of linear models (Chapter 1), estimability of parameters in a linear model (Chapter
More informationarxiv: v1 [math.st] 22 Dec 2018
Optimal Designs for Prediction in Two Treatment Groups Rom Coefficient Regression Models Maryna Prus Otto-von-Guericke University Magdeburg, Institute for Mathematical Stochastics, PF 4, D-396 Magdeburg,
More informationA measurement error model approach to small area estimation
A measurement error model approach to small area estimation Jae-kwang Kim 1 Spring, 2015 1 Joint work with Seunghwan Park and Seoyoung Kim Ouline Introduction Basic Theory Application to Korean LFS Discussion
More informationRegression Estimation - Least Squares and Maximum Likelihood. Dr. Frank Wood
Regression Estimation - Least Squares and Maximum Likelihood Dr. Frank Wood Least Squares Max(min)imization Function to minimize w.r.t. β 0, β 1 Q = n (Y i (β 0 + β 1 X i )) 2 i=1 Minimize this by maximizing
More informationMultilevel Statistical Models: 3 rd edition, 2003 Contents
Multilevel Statistical Models: 3 rd edition, 2003 Contents Preface Acknowledgements Notation Two and three level models. A general classification notation and diagram Glossary Chapter 1 An introduction
More informationChapter 4 - Fundamentals of spatial processes Lecture notes
Chapter 4 - Fundamentals of spatial processes Lecture notes Geir Storvik January 21, 2013 STK4150 - Intro 2 Spatial processes Typically correlation between nearby sites Mostly positive correlation Negative
More informationLinear Methods for Prediction
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More informationBusiness Statistics. Tommaso Proietti. Model Evaluation and Selection. DEF - Università di Roma 'Tor Vergata'
Business Statistics Tommaso Proietti DEF - Università di Roma 'Tor Vergata' Model Evaluation and Selection Predictive Ability of a Model: Denition and Estimation We aim at achieving a balance between parsimony
More informationREGRESSION WITH SPATIALLY MISALIGNED DATA. Lisa Madsen Oregon State University David Ruppert Cornell University
REGRESSION ITH SPATIALL MISALIGNED DATA Lisa Madsen Oregon State University David Ruppert Cornell University SPATIALL MISALIGNED DATA 10 X X X X X X X X 5 X X X X X 0 X 0 5 10 OUTLINE 1. Introduction 2.
More informationLecture 3: Linear Models. Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012
Lecture 3: Linear Models Bruce Walsh lecture notes Uppsala EQG course version 28 Jan 2012 1 Quick Review of the Major Points The general linear model can be written as y = X! + e y = vector of observed
More informationOutline of GLMs. Definitions
Outline of GLMs Definitions This is a short outline of GLM details, adapted from the book Nonparametric Regression and Generalized Linear Models, by Green and Silverman. The responses Y i have density
More informationLinear Regression. September 27, Chapter 3. Chapter 3 September 27, / 77
Linear Regression Chapter 3 September 27, 2016 Chapter 3 September 27, 2016 1 / 77 1 3.1. Simple linear regression 2 3.2 Multiple linear regression 3 3.3. The least squares estimation 4 3.4. The statistical
More informationRandom vectors X 1 X 2. Recall that a random vector X = is made up of, say, k. X k. random variables.
Random vectors Recall that a random vector X = X X 2 is made up of, say, k random variables X k A random vector has a joint distribution, eg a density f(x), that gives probabilities P(X A) = f(x)dx Just
More informationEstimation in Generalized Linear Models with Heterogeneous Random Effects. Woncheol Jang Johan Lim. May 19, 2004
Estimation in Generalized Linear Models with Heterogeneous Random Effects Woncheol Jang Johan Lim May 19, 2004 Abstract The penalized quasi-likelihood (PQL) approach is the most common estimation procedure
More informationLinear Regression (1/1/17)
STA613/CBB540: Statistical methods in computational biology Linear Regression (1/1/17) Lecturer: Barbara Engelhardt Scribe: Ethan Hada 1. Linear regression 1.1. Linear regression basics. Linear regression
More informationModelling heterogeneous variance-covariance components in two-level multilevel models with application to school effects educational research
Modelling heterogeneous variance-covariance components in two-level multilevel models with application to school effects educational research Research Methods Festival Oxford 9 th July 014 George Leckie
More informationMS&E 226: Small Data. Lecture 11: Maximum likelihood (v2) Ramesh Johari
MS&E 226: Small Data Lecture 11: Maximum likelihood (v2) Ramesh Johari ramesh.johari@stanford.edu 1 / 18 The likelihood function 2 / 18 Estimating the parameter This lecture develops the methodology behind
More informationBayes: All uncertainty is described using probability.
Bayes: All uncertainty is described using probability. Let w be the data and θ be any unknown quantities. Likelihood. The probability model π(w θ) has θ fixed and w varying. The likelihood L(θ; w) is π(w
More informationBIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation
BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation Yujin Chung November 29th, 2016 Fall 2016 Yujin Chung Lec13: MLE Fall 2016 1/24 Previous Parametric tests Mean comparisons (normality assumption)
More informationThe Poisson transform for unnormalised statistical models. Nicolas Chopin (ENSAE) joint work with Simon Barthelmé (CNRS, Gipsa-LAB)
The Poisson transform for unnormalised statistical models Nicolas Chopin (ENSAE) joint work with Simon Barthelmé (CNRS, Gipsa-LAB) Part I Unnormalised statistical models Unnormalised statistical models
More informationF & B Approaches to a simple model
A6523 Signal Modeling, Statistical Inference and Data Mining in Astrophysics Spring 215 http://www.astro.cornell.edu/~cordes/a6523 Lecture 11 Applications: Model comparison Challenges in large-scale surveys
More information