Covariance Models (*)


Covariance Models (*)

Mixed Models: Laird & Ware (1982)

Y_i = X_i β + Z_i b_i + e_i

Y_i : (n_i x 1) response vector
X_i : (n_i x p) design matrix for fixed effects
β : (p x 1) regression coefficient for fixed effects

Note: see pg 60 for specific examples.
Note: FLW Appendix A = Gentle Intro to Matrices.

139 Heagerty, 2006

Covariance Models (*)

Mixed Models: Laird & Ware (1982)

Z_i : (n_i x q) design matrix for random effects
b_i : (q x 1) vector of random effects
e_i : (n_i x 1) vector of errors

For the random components of the model we typically assume:

b_i ~ N(0, D)
e_i ~ N(0, R_i)

140 Heagerty, 2006

140-1 Heagerty, 2006
[Photo slide: Laird & Ware. Chair, Dept. of Biostatistics, HSPH, 1990-1999; Associate Dean, HSPH]

LMM and components of variation (*)

This yields a covariance structure:

cov(Y_i) = Z_i D Z_i^T + R_i
           (between-cluster variance) + (within-cluster variance)

We assume that observations on different subjects are independent.

Note: This is a compact (matrix) way of writing the covariance for any possible pair Y_ij, Y_ik, and represents the variance and covariance details that we presented on pp 60-1 and 60-2.

141 Heagerty, 2006
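As a concrete instance of this formula, a random-intercept model has Z_i equal to a column of ones and D = ν², so Z_i D Z_i^T + R_i with R_i = σ²I gives compound symmetry: variance ν² + σ² on the diagonal and covariance ν² everywhere off it. A minimal Python sketch (pure standard library; the function name is illustrative, not from the slides):

```python
# cov(Y_i) = Z_i D Z_i^T + R_i for a random-intercept model:
# Z_i is a column of ones, D = nu2 (scalar), R_i = sigma2 * I.

def random_intercept_cov(n, nu2, sigma2):
    """n x n covariance matrix: nu2 + sigma2 on the diagonal, nu2 off it."""
    return [[nu2 + (sigma2 if j == k else 0.0) for k in range(n)]
            for j in range(n)]

cov = random_intercept_cov(3, nu2=4.0, sigma2=1.0)
# Diagonal: between- plus within-cluster variance (5.0).
# Off-diagonal: shared random-intercept variance only (4.0).
```

With these toy values the within-subject correlation is ν²/(ν² + σ²) = 0.8 for every pair, which is exactly the "exchangeable" pattern of the independence-within model on the next slide.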

LMM and components of variation

Within-Subject: Independence Model

R_i = σ² I, or a general diagonal matrix.

Then, assuming normal errors, Y_i = (Y_i1, Y_i2, ..., Y_i,n_i) are conditionally independent given b_i.

This model assumes that the within-subject errors do not have any serial correlation.

142 Heagerty, 2006

More on Covariance Models

Within-Subject: Serial Models

Linear mixed models assume that each subject follows his/her own line. In some situations the dependence is more local, meaning that observations close in time are more similar than those far apart in time.

One model that we introduced is called the autoregressive model, where:

cov(e_ij, e_ik) = σ² ρ^|j - k|

143 Heagerty, 2006

143-1 Heagerty, 2006
[Figure: simulated AR(1) trajectories, y versus months, for 20 subjects (ID = 1 to 20), each with ρ = 0.9]

More on Covariance Models

Autoregressive Correlation

Assume t_ij = j, n_i = 4:

corr(e_i) =
[ 1    ρ    ρ²   ρ³ ]
[ ρ    1    ρ    ρ² ]
[ ρ²   ρ    1    ρ  ]
[ ρ³   ρ²   ρ    1  ]

144 Heagerty, 2006

More on Covariance Models

Autoregressive Correlation

Assume t_ij = j:

corr(e_i) =
[ 1        ρ        ρ²       ...  ρ^(n-1) ]
[ ρ        1        ρ        ...  ρ^(n-2) ]
[ ρ²       ρ        1        ...  ρ^(n-3) ]
[ ...      ...      ...      ...  ...     ]
[ ρ^(n-1)  ρ^(n-2)  ρ^(n-3)  ...  1       ]

145 Heagerty, 2006

More on Covariance Models

Autoregressive Correlation

Assume t_ij unique:

corr(e_i) =
[ 1              ρ^|t_i1-t_i2|  ρ^|t_i1-t_i3|  ...  ρ^|t_i1-t_in| ]
[ ρ^|t_i2-t_i1|  1              ρ^|t_i2-t_i3|  ...  ρ^|t_i2-t_in| ]
[ ρ^|t_i3-t_i1|  ρ^|t_i3-t_i2|  1              ...  ρ^|t_i3-t_in| ]
[ ...            ...            ...            ...  ...           ]
[ ρ^|t_in-t_i1|  ρ^|t_in-t_i2|  ρ^|t_in-t_i3|  ...  1             ]

146 Heagerty, 2006
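All three correlation matrices above are instances of one rule: entry (j, k) is ρ raised to the separation between t_ij and t_ik. A small Python sketch of that rule (illustrative names, not from the slides):

```python
def ar_corr(times, rho):
    """Correlation matrix with (j, k) entry rho^|t_j - t_k| (serial/AR model)."""
    return [[rho ** abs(tj - tk) for tk in times] for tj in times]

# Equally spaced times reproduce the familiar AR(1) pattern rho^|j - k|:
R = ar_corr([1, 2, 3, 4], rho=0.5)
# First row: 1, 0.5, 0.25, 0.125
```

Passing irregular, subject-specific times gives the "t_ij unique" matrix directly, which is why this form is natural for unbalanced longitudinal data.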

More on Covariance Models

Mixed + Serial

Diggle (1988) proposed the following model:

Y_ij = X_ij β + b_i,0 + W_i(t_ij) + ε_ij

147 Heagerty, 2006

Covariance Models

Mixed + Serial

The most general type of covariance model combines some random effects with additional components that characterize within-subject serial correlation. One such model contains three sources of random variation:

random intercept      b_i,0
serial process        W_i(t_ij)
measurement error     ε_ij

148 Heagerty, 2006

We assume:

var(b_i,0) = ν²
cov[W(s), W(t)] = σ² ρ^|s - t|
var(ε_ij) = τ²

Then:

Total Variance = ν² + σ² + τ²
cov(Y_ij, Y_ik) = ν² + σ² ρ^|t_ij - t_ik|

149 Heagerty, 2006

Covariance Models

Mixed + Serial

Q: How do we biologically interpret these three sources of variation?

random intercept: represents a trait of the subject
  FEV1: child size not captured by age and height
  CD4: subject's normal steady-state level

serial variation: represents a state of the subject
  FEV1: child's current health status (infected with PseudoA)
  CD4: subject's current immune status (diet? treatment?)

measurement error: represents the instrumentation or process used to generate the final quantitative measurement
  FEV1: result of only one trial with expiration
  CD4: blood sample, lab processing

150 Heagerty, 2006


EDA for Covariance Structure

Numerical Summaries:
  Empirical covariance & correlation
  Variogram

Define:

R_ij = Y_ij - X_ij β = b_i,0 + W(t_ij) + ε_ij

151 Heagerty, 2006

Note: var(r ij ) = ν 2 + σ 2 + τ 2 E [ ] 1 2 (R ij R ik ) 2 = σ 2 (1 ρ t ij t ik ) + τ 2 R ij R ik = (b i,0 + W i (t ij ) + ɛ ij ) (b i,0 + W i (t ik ) + ɛ ik ) = [W i (t ij ) W i (t ik )] + [ɛ ij ɛ ik ] Plot: 1 2 ( R ij R ik ) 2 versus t ij t ik 152 Heagerty, 2006

152-1 Heagerty, 2006
[Figure: theoretical variogram, E(dR²) from 0.0 to 1.2 versus delta-time from 0 to 10, with a horizontal reference line at the total variance]

152-2 Heagerty, 2006

Variogram: Key Features

When t_ij = t_ik:

E[ (1/2) (R_ij - R_ik)² ] = σ² (1 - ρ^|t_ij - t_ik|) + τ²
                          = σ² (1 - ρ⁰) + τ²
                          = τ²
                          = measurement error variance

152-3 Heagerty, 2006

Variogram: Key Features

When t_ij >> t_ik (large time separation):

E[ (1/2) (R_ij - R_ik)² ] = σ² (1 - ρ^|t_ij - t_ik|) + τ²
                          ≈ σ² (1 - ρ^∞) + τ²
                          = σ² + τ²
                          = serial plus measurement error variance

152-4 Heagerty, 2006
[Figure: FEV1 residual variogram, dR² from 0 to 600 versus dt from 0 to 8]

Recall Simple Linear Regression (**)

In simple linear regression we fit the model:

E(Y_i | X_i) = β_0 + β_1 X_i

We can write the estimate of the slope, β_1, as follows:

β̂_1 = [ Σ_i (X_i - X̄)(Y_i - Ȳ) ] / [ Σ_i (X_i - X̄)² ]

This method is sometimes called ordinary least squares, or OLS.

153 Heagerty, 2006

Recall Simple Linear Regression (**)

In some applications we still want to fit the regression model:

E(Y_i | X_i) = β_0 + β_1 X_i

but now we want to assign weights, w_i, to each observation. Using the weights leads to weighted least squares (WLS). We can write the estimate of the slope as:

β̂_1(w) = [ Σ_i w_i (X_i - X̄)(Y_i - Ȳ) ] / [ Σ_i w_i (X_i - X̄)² ]

With longitudinal data we have a method of estimation that generalizes this to allow covariance weights.

154 Heagerty, 2006
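The WLS slope can be sketched in a few lines of Python. One detail is left implicit on the slide: this sketch takes X̄ and Ȳ as weighted means, an assumption that keeps the estimator exact for data lying on a line under any positive weights.

```python
def wls_slope(x, y, w):
    """Weighted least squares slope:
    sum_i w_i (x_i - xbar)(y_i - ybar) / sum_i w_i (x_i - xbar)^2,
    with xbar, ybar taken as weighted means (assumption of this sketch)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    num = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    den = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    return num / den

# Data on the exact line y = 1 + 2x: the slope is 2 for any positive weights.
slope = wls_slope([0, 1, 2], [1, 3, 5], [1.0, 0.5, 1.0])
```

With all weights equal this reduces to the OLS slope of the previous slide.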

Estimation of β with known Σ_i (**)

Weighted least squares: in univariate regression, WLS yields estimates of β that minimize the objective function

Q(β) = Σ_{i=1}^N w_i (Y_i - X_i β)²

Analogously, the multivariate version of WLS finds the value of the parameter, β̂(W), that minimizes

Q_W(β) = Σ_{i=1}^N (Y_i - X_i β)^T W_i (Y_i - X_i β)

where W_i is an (n_i x n_i) positive definite symmetric matrix.

155 Heagerty, 2006

Estimation of β with known Σ_i (**)

It is straightforward to see that

U(β) = (∂/∂β) Q_W(β) = -2 Σ_{i=1}^N X_i^T W_i (Y_i - X_i β)

This is a general way to statistically define the regression estimator: as a solution to estimating equations. In general, W_i is chosen as the inverse of Σ_i.

156 Heagerty, 2006

The solution to the minimization solves U(β) = 0 and yields:

β̂(W) = ( Σ_{i=1}^N X_i^T W_i X_i )^{-1} ( Σ_{i=1}^N X_i^T W_i Y_i )

157 Heagerty, 2006
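This closed form can be sketched in pure Python by accumulating X_i^T W_i X_i and X_i^T W_i Y_i over clusters and solving the resulting p x p system (illustrative code, not the course software):

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def gls_beta(clusters):
    """beta_hat(W) = (sum_i X_i^T W_i X_i)^{-1} (sum_i X_i^T W_i Y_i).

    `clusters` is a list of (X_i, W_i, Y_i): design matrix (n_i x p),
    weight matrix (n_i x n_i), and response vector (length n_i)."""
    p = len(clusters[0][0][0])
    XtWX = [[0.0] * p for _ in range(p)]
    XtWY = [0.0] * p
    for X, W, Y in clusters:
        ni = len(Y)
        WX = [[sum(W[r][s] * X[s][c] for s in range(ni)) for c in range(p)]
              for r in range(ni)]
        WY = [sum(W[r][s] * Y[s] for s in range(ni)) for r in range(ni)]
        for a in range(p):
            for b_ in range(p):
                XtWX[a][b_] += sum(X[r][a] * WX[r][b_] for r in range(ni))
            XtWY[a] += sum(X[r][a] * WY[r] for r in range(ni))
    return solve(XtWX, XtWY)

# One cluster, identity weights, data on the line y = 1 + 2x:
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
Y = [1.0, 3.0, 5.0]
beta = gls_beta([(X, I3, Y)])  # recovers intercept 1, slope 2
```

With W_i = I this is OLS; plugging in W_i = Σ_i^{-1} gives the GLS estimator of the slide.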

Properties of β(w ) (**) Given X 1, X 2, X N and W 1, W 2, W N ( ] N ) 1 ( N ) E [ β(w ) = X T i W i X i X T i W i E[Y i ] i=1 i=1 ( N ) 1 ( N ) = X T i W i X i X T i W i X i β = β i=1 i=1 Notice that the estimate β(w ) is unbiased no matter what weighting scheme is used 158 Heagerty, 2006

Properties of β(w ) (**) 1 When W i is correctly specified as the inverse of the variance of Y i then: W i = Σ 1 i ] var [ β(σ 1 ) = ( i ) 1 X T i Σ 1 i X i When we use gllamm, SAS PROC MIXED, or S+ lme this is what is returned to provide standard errors for the estimated regression coefficients 159 Heagerty, 2006

Properties of β(w ) (**) 2 When W i is not the inverse of the variance of Y i then: W i Σ 1 i ] var [ β(w ) = bread ( ) {}}{ A 1 X T i W i Σ i W i X i i }{{} cheese bread {}}{ A 1 A = i X T i W i X i More on this sandwich later 160 Heagerty, 2006

Likelihood Estimation for Linear Mixed Models (**)

Parameters:

β : regression parameter, fixed effects coefficients
α : variance components, α → D(α) and R_i(α), where cov(Y_i) = Z_i D Z_i^T + R_i

161 Heagerty, 2006

Normality:

E(Y_i) = X_i β,  cov(Y_i) = Σ(α)

f(Y_i; β, α) = (2π)^{-n_i/2} |Σ|^{-1/2} exp[ -(1/2) (Y_i - X_i β)^T Σ^{-1} (Y_i - X_i β) ]

Maximum Likelihood: find the values of the regression coefficients, β, and the variance components that maximize the likelihood, e.g. put the highest available probability on the observed data.

162 Heagerty, 2006

162-1 Heagerty, 2006
[Photo slide: R. A. Fisher]

ML versus REML

There is a variant of ML estimation known as REML (Residual ML, or Restricted ML). REML is used to provide slightly less biased estimates of variance components.

However, be careful using REML when you change the covariates in your model, since one cannot use changes in REML log-likelihoods to test for fixed effects. REML is useful for a single fitted model, or to compare covariance models with a fixed regression model.

163 Heagerty, 2006

Inference in the Linear Mixed Model

In practice:
(1) Saturated mean model & explore the covariance
(2) Fix the covariance & explore the mean

164 Heagerty, 2006

Likelihood Ratio Tests: Fixed Effects

Standard likelihood theory can be applied to test H_0: β_2 = 0, where

E[Y] = [X_1, X_2] (β_1, β_2)^T = X_1 β_1 + X_2 β_2

[1] Full Model:    E[Y] = X_1 β_1 + X_2 β_2
[0] Reduced Model: E[Y] = X_1 β_1

165 Heagerty, 2006

Likelihood Ratio Tests: Fixed Effects

In this case we have (when the null hypothesis is true):

Likelihood Ratio = L_ML(β̂_1, β̂_2, α̂; ML using model 1) / L_ML(β̂_1, 0, α̂; ML using model 0)

LR statistic = 2 log(Likelihood Ratio) = 2 log L_ML,1 - 2 log L_ML,0 ~ χ²(q)

where q is the number of coefficients that are set to zero in the reduced model.

166 Heagerty, 2006
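The computation is just arithmetic on the two maximized log-likelihoods. A sketch with hypothetical log-likelihood values; the χ²(1) 5% critical value 3.84 is standard:

```python
def lr_statistic(loglik_full, loglik_reduced):
    """LR statistic = 2*logL_full - 2*logL_reduced, ~ chi^2(q) under H0,
    where q is the number of coefficients set to zero in the reduced model."""
    return 2.0 * (loglik_full - loglik_reduced)

# Hypothetical maximized log-likelihoods for two nested ML fits:
stat = lr_statistic(-100.0, -104.5)   # = 9.0
# With q = 1, compare against the chi^2(1) 5% critical value 3.84:
reject_at_5pct = stat > 3.84
```

Note that both fits must use ML (not REML) for this comparison to be valid, per the ML-versus-REML caution above.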

Other Tests: Fixed Effects (*)

We also have, for a general linear contrast A and a hypothesis H_0: Aβ = 0:

Wald Test:

(A β̂)^T [ A var̂(β̂) A^T ]^{-1} (A β̂) ~ χ²(q)

F Test:

F = (A β̂)^T [ A var̂(β̂) A^T ]^{-1} (A β̂) / rank(A) ~ F(ndf = rank(A), ddf)

167 Heagerty, 2006

LMM: Selection of the Covariance Matrix

A model that fits the data: compare the fitted covariance to the empirical assessment of it.

Σ_i = Z_i D Z_i^T + R_i(α)     versus  empirical cov(Y_i - μ_i)
γ(Δ) = τ² + σ² [1 - ρ(Δ)]      versus  empirical variogram
var(Y_ij) = τ² + σ² + ν²       versus  empirical variance

168 Heagerty, 2006

Look at the maximized likelihood: compare -2 log L, AIC, BIC.

Don't lose sight of the goals of the analysis. If covariance selection is done to obtain valid model-based standard errors, then we can assess the impact on β̂ and its standard errors. We can also calculate an empirical (sandwich) variance estimate.

169 Heagerty, 2006

Inference in the Linear Mixed Model (*)

Likelihood Ratio Tests: Variance Components

We may want to test whether we have random intercepts and slopes, or just random intercepts:

H_0: D = [ D_11  0 ]     versus     H_1: D = [ D_11  D_12 ]
         [ 0     0 ]                         [ D_21  D_22 ]

Q: What is the distribution of the likelihood ratio statistic

LRstat = 2 log [ L_ML(θ̂_ML, model 1) / L_ML(θ̂_ML, model 0) ]  ?

170 Heagerty, 2006

LR Testing for Variance Components (*)

D_22 = 0 is on the boundary of the parameter space!!! This violates the standard assumption that we use to justify the χ²(p_1 - p_0) distribution of the LR statistic.

We appeal to results in Stram and Lee (1994), which build upon results in Self & Liang (1987), showing that the LR statistic is a mixture of χ² distributions.

Note: For a fixed mean structure we can use the LR based on either ML or REML. (Why?)

See: Verbeke and Molenberghs (1997), pages 108-111.

171 Heagerty, 2006
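For the intercepts-only versus intercepts-plus-slopes test, the Stram and Lee (1994) result gives a 50:50 mixture of χ²(1) and χ²(2) as the null distribution. Both survival functions have closed forms computable from the Python standard library (P(χ²_1 > x) = erfc(√(x/2)); P(χ²_2 > x) = exp(-x/2)), so the boundary-corrected p-value can be sketched as:

```python
import math

def chi2_sf(x, df):
    """Survival function P(chi^2_df > x), closed forms for df = 1 or 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or 2 supported in this sketch")

def mixture_pvalue(lr_stat):
    """p-value under the 0.5*chi^2(1) + 0.5*chi^2(2) mixture (Stram & Lee, 1994)."""
    return 0.5 * chi2_sf(lr_stat, 1) + 0.5 * chi2_sf(lr_stat, 2)

# The naive chi^2(2) p-value is conservative relative to the mixture:
p_mix = mixture_pvalue(5.0)
p_naive = chi2_sf(5.0, 2)
```

Because the mixture p-value is always smaller than the naive χ²(2) p-value, using the naive reference distribution makes it too hard to detect a real random slope.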


171-3 Heagerty, 2006

S+ LMM Program:

#
# cfkids-cda-newlmm.q
#
# ------------------------------------------------------------
#
# PURPOSE: Use linear mixed models to characterize longitudinal
#          change by gender and genotype
#
# AUTHOR: P. Heagerty
#
# DATE: 00/07/10  Revised 14Feb2002
#
# ------------------------------------------------------------
#
#####
##### Read data
#####
#
source("cfkids-read.q")
#
#####
##### Trellis plots of individuals and groups
#####

#
# Create Grouped Data Set
#
ntotal <- cumsum( unlist( lapply( split( cfkids$id, cfkids$id ), length ) ) )
cfsubset <- groupedData( fev1 ~ age | id,
                         outer = ~ factor(f508)*female,
                         data = cfkids[ 1:ntotal[(8*4*1)], ] )
#
cfkids <- groupedData( fev1 ~ age | id,
                       outer = ~ factor(f508)*female,
                       data = cfkids )
#
# trellis plot, by id, first 1 pages, 8x4
#
postscript( file="cfkids-trellis1.ps", horiz=F )
plot( cfsubset, layout = c(4,8) )
graphics.off()
postscript( file="cfkids-trellis2.ps", horiz=T )
par( pch="." )
plot( cfkids, outer = ~ factor(f508)*factor(female), layout=c(3,2), aspect=1 )
graphics.off()
#
#####
##### Linear Mixed Models
#####
#
options( contrasts=c("contr.treatment","contr.helmert") )

171-4 Heagerty, 2006

171-5 Heagerty, 2006

#
### Intercept only
fit0 <- lme( fev1 ~ age0 + agel + female*agel + factor(f508)*agel,
             method = "ML",
             random = reStruct( ~ 1 | id, pdClass="pdSymm", REML=F ),
             data = cfkids )
summary( fit0 )
### Intercept plus Slope
fit1 <- lme( fev1 ~ age0 + agel + female*agel + factor(f508)*agel,
             method = "ML",
             random = reStruct( ~ 1 + agel | id, pdClass="pdSymm", REML=F ),
             data = cfkids )
summary( fit1 )
### EDA for serial correlation
postscript( file="cfkids-variogram.ps", horiz=T )
plot( Variogram( fit0, form = ~ age | id, resType="response" ) )
graphics.off()
### Intercept plus AR(1)
fit2a <- lme( fev1 ~ age0 + agel + female*agel + factor(f508)*agel,

171-6 Heagerty, 2006

              method = "ML",
              random = reStruct( ~ 1 | id, pdClass="pdSymm", REML=F ),
              correlation = corAR1( form = ~ 1 | id ),
              data = cfkids )
summary( fit2a )
### another way
fit2b <- lme( fev1 ~ age0 + agel + female*agel + factor(f508)*agel,
              method = "ML",
              random = reStruct( ~ 1 | id, pdClass="pdSymm", REML=F ),
              correlation = corExp( form = ~ agel | id, nugget=F ),
              data = cfkids )
summary( fit2b )
### another way
fit2c <- lme( fev1 ~ age0 + agel + female*agel + factor(f508)*agel,
              method = "ML",
              random = reStruct( ~ 1 | id, pdClass="pdSymm", REML=F ),
              correlation = corCAR1( form = ~ agel | id ),
              data = cfkids )
summary( fit2c )
fit2 <- fit2b
### Intercept plus AR(1) plus measurement error

171-7 Heagerty, 2006

fit3 <- lme( fev1 ~ age0 + agel + female*agel + factor(f508)*agel,
             method = "ML",
             random = reStruct( ~ 1 | id, pdClass="pdSymm", REML=F ),
             correlation = corExp( form = ~ agel | id, nugget=T ),
             data = cfkids )
summary( fit3 )
#
##### compare these models
#
anova( fit0, fit1, fit2, fit3 )
#
#####
##### Residual Analysis -- using fit3
#####
#
popres <- resid( fit3, level=0 )
clusterres <- resid( fit3, level=1 )
print( var( popres ) )
print( var( clusterres ) )
#
postscript( file="cfkids-newresiduals.ps", horiz=F )
par( mfrow=c(2,1) )
plot( cfkids$age0, popres, pch="." )
lines( smooth.spline( cfkids$age0, popres, df=5 ) )
title("Residuals (pop) vs Age0")
abline( h=0, lty=2 )

171-8 Heagerty, 2006

plot( cfkids$agel, popres, pch="." )
lines( smooth.spline( cfkids$agel, popres, df=5 ) )
title("Residuals (pop) vs AgeL")
abline( h=0, lty=2 )
graphics.off()
#
postscript( file="cfkids-newresiduals2.ps", horiz=F )
par( mfrow=c(2,1) )
plot( cfkids$age0, clusterres, pch="." )
lines( smooth.spline( cfkids$age0, clusterres, df=5 ) )
abline( h=0, lty=2 )
title("Residuals (cluster) vs Age0")
b0 <- unlist( fit2$coefficients$random )
age0 <- unlist( lapply( split( cfkids$age0, cfkids$id ), min ) )
plot( age0, b0 )
lines( smooth.spline( age0, b0, df=5 ) )
abline( h=0, lty=2 )
title("EB b0 versus Age0")
graphics.off()
#
##### Do we need a quadratic age0???
#
fit4 <- lme( fev1 ~ age0 + I(age0^2) + agel + female*agel + factor(f508)*agel,
             method = "ML",
             random = reStruct( ~ 1 | id, pdClass="pdSymm", REML=F ),
             correlation = corExp( form = ~ agel | id, nugget=T ),
             data = cfkids )

summary( fit4 )
#
anova( fit3, fit4 )
#
# end-of-file

171-9 Heagerty, 2006

171-10 Heagerty, 2006
[Figure: trellis plot of fev1 (20 to 140) versus age (5 to 30), one panel per child, for 32 subject ids]

171-11 Heagerty, 2006
[Figure: trellis plot of fev1 (0 to 150) versus age (5 to 30), grouped by factor(f508) x female in a 3x2 panel layout]

171-12 Heagerty, 2006

Fit 0: Random Intercepts

Linear mixed-effects model fit by maximum likelihood
 Data: cfkids
      AIC      BIC    logLik
 12532.01 12590.55 -6255.005

Random effects:
 Formula: ~ 1 | id
        (Intercept) Residual
StdDev:    22.30435 12.17881

Fixed effects: fev1 ~ age0 + agel + female * agel + factor(f508) * agel
                       Value Std.Error   DF  t-value p-value
(Intercept)         103.8074  6.644683 1309 15.62262  <.0001
age0                 -1.8553  0.331087  195 -5.60372  <.0001
agel                 -0.5881  0.393625 1309 -1.49417  0.1354
female               -1.1620  3.367634  195 -0.34505  0.7304
factor(f508)1        -4.2821  5.561343  195 -0.76998  0.4422
factor(f508)2        -6.7427  5.592591  195 -1.20564  0.2294
female:agel          -0.8257  0.249786 1309 -3.30577  0.0010
agel:factor(f508)1   -0.4873  0.423076 1309 -1.15191  0.2496
agel:factor(f508)2   -0.6568  0.421964 1309 -1.55655  0.1198

Number of Observations: 1513
Number of Groups: 200

171-13 Heagerty, 2006
[Figure: empirical semivariogram of the fit0 residuals, semivariogram (0 to 200) versus distance (0 to 6)]

171-14 Heagerty, 2006

Fit 1: Random Intercepts and Slopes

Linear mixed-effects model fit by maximum likelihood
 Data: cfkids
      AIC      BIC    logLik
 12430.34 12499.52 -6202.168

Random effects:
 Formula: ~ 1 + agel | id
 Structure: General positive-definite
               StdDev   Corr
(Intercept) 22.330642 (Inter
agel         2.087337 -0.156
Residual    10.865196

Fixed effects: fev1 ~ age0 + agel + female * agel + factor(f508) * agel
                       Value Std.Error   DF  t-value p-value
(Intercept)         104.5162  6.581633 1309 15.87997  <.0001
age0                 -1.9104  0.327870  195 -5.82672  <.0001
agel                 -0.6019  0.585611 1309 -1.02777  0.3042
female               -1.2969  3.338418  195 -0.38847  0.6981
factor(f508)1        -4.2382  5.511575  195 -0.76896  0.4428
factor(f508)2        -6.6550  5.542114  195 -1.20081  0.2313
female:agel          -0.7636  0.378110 1309 -2.01965  0.0436
agel:factor(f508)1   -0.5003  0.630694 1309 -0.79323  0.4278
agel:factor(f508)2   -0.7451  0.629441 1309 -1.18370  0.2367

Number of Observations: 1513
Number of Groups: 200

171-15 Heagerty, 2006

Fit 2a: Random Intercepts + AR(1) errors

Linear mixed-effects model fit by maximum likelihood
 Data: cfkids
      AIC      BIC    logLik
 12425.01 12488.87 -6200.503

Random effects:
 Formula: ~ 1 | id
        (Intercept) Residual
StdDev:    21.57121 13.25308

Correlation Structure: AR(1)
 Formula: ~ 1 | id
 Parameter estimate(s):
       Phi
 0.3760331

Fixed effects: fev1 ~ age0 + agel + female * agel + factor(f508) * agel
                       Value Std.Error   DF  t-value p-value
(Intercept)         104.2928  6.704807 1309 15.55493  <.0001
age0                 -1.8599  0.328983  195 -5.65355  <.0001
agel                 -0.6754  0.507081 1309 -1.33188  0.1831
female               -1.2954  3.428658  195 -0.37781  0.7060
factor(f508)1        -4.6714  5.664678  195 -0.82465  0.4106
factor(f508)2        -6.8660  5.695078  195 -1.20560  0.2294
female:agel          -0.8151  0.325773 1309 -2.50206  0.0125
agel:factor(f508)1   -0.3865  0.546674 1309 -0.70696  0.4797
agel:factor(f508)2   -0.6224  0.545237 1309 -1.14152  0.2539

Number of Observations: 1513
Number of Groups: 200

171-16 Heagerty, 2006

Fit 2b: Random Intercepts + corExp errors

Linear mixed-effects model fit by maximum likelihood
 Data: cfkids
      AIC      BIC    logLik
 12412.76 12476.62 -6194.378

Random effects:
 Formula: ~ 1 | id
        (Intercept) Residual
StdDev:    21.70338 13.08926

Correlation Structure: Exponential spatial correlation
 Formula: ~ agel | id
 Parameter estimate(s):
     range
 0.9136573

Fixed effects: fev1 ~ age0 + agel + female * agel + factor(f508) * agel
                       Value Std.Error   DF  t-value p-value
(Intercept)         104.1874  6.700086 1309 15.55015  <.0001
age0                 -1.8560  0.329094  195 -5.63975  <.0001
agel                 -0.5882  0.499669 1309 -1.17720  0.2393
female               -1.2584  3.430314  195 -0.36684  0.7141
factor(f508)1        -4.7019  5.662454  195 -0.83036  0.4073
factor(f508)2        -6.7593  5.691155  195 -1.18769  0.2364
female:agel          -0.8559  0.321526 1309 -2.66206  0.0079
agel:factor(f508)1   -0.4242  0.539225 1309 -0.78677  0.4316
agel:factor(f508)2   -0.7042  0.537553 1309 -1.30999  0.1904

Number of Observations: 1513
Number of Groups: 200

171-17 Heagerty, 2006

Fit 2c: Random Intercepts + CAR(1) errors

Linear mixed-effects model fit by maximum likelihood
 Data: cfkids
      AIC      BIC    logLik
 12412.76 12476.62 -6194.378

Random effects:
 Formula: ~ 1 | id
        (Intercept) Residual
StdDev:    21.69983 13.09153

Correlation Structure: Continuous AR(1)
 Formula: ~ agel | id
 Parameter estimate(s):
       Phi
 0.3350188

Fixed effects: fev1 ~ age0 + agel + female * agel + factor(f508) * agel
                       Value Std.Error   DF  t-value p-value
(Intercept)         104.1877  6.699496 1309 15.55158  <.0001
age0                 -1.8560  0.329056  195 -5.64039  <.0001
agel                 -0.5883  0.499812 1309 -1.17704  0.2394
female               -1.2584  3.430073  195 -0.36686  0.7141
factor(f508)1        -4.7025  5.662058  195 -0.83053  0.4073
factor(f508)2        -6.7597  5.690753  195 -1.18784  0.2363
female:agel          -0.8559  0.321620 1309 -2.66136  0.0079
agel:factor(f508)1   -0.4241  0.539381 1309 -0.78625  0.4319
agel:factor(f508)2   -0.7041  0.537708 1309 -1.30942  0.1906

Number of Observations: 1513
Number of Groups: 200

171-18 Heagerty, 2006

Fit 3: Random Intercepts + corExp + measurement error

Linear mixed-effects model fit by maximum likelihood
 Data: cfkids
      AIC      BIC    logLik
 12384.92 12454.11 -6179.461

Random effects:
 Formula: ~ 1 | id
        (Intercept) Residual
StdDev:    19.69916 15.86913

Correlation Structure: Exponential spatial correlation
 Formula: ~ agel | id
 Parameter estimate(s):
    range    nugget
 5.116653 0.2878059

Fixed effects: fev1 ~ age0 + agel + female * agel + factor(f508) * agel
                       Value Std.Error   DF  t-value p-value
(Intercept)         104.7466  6.760173 1309 15.49466  <.0001
age0                 -1.8739  0.328085  195 -5.71153  <.0001
agel                 -0.7095  0.577371 1309 -1.22886  0.2193
female               -1.2089  3.486725  195 -0.34671  0.7292
factor(f508)1        -5.0121  5.753669  195 -0.87112  0.3848
factor(f508)2        -7.0993  5.782222  195 -1.22778  0.2210
female:agel          -0.8259  0.372683 1309 -2.21612  0.0269
agel:factor(f508)1   -0.3159  0.622931 1309 -0.50710  0.6122
agel:factor(f508)2   -0.5817  0.621290 1309 -0.93627  0.3493

Number of Observations: 1513
Number of Groups: 200

171-19 Heagerty, 2006

ANOVA

     Model df      AIC      BIC    logLik   Test  L.Ratio p-value
fit0     1 11 12532.01 12590.55 -6255.005
fit1     2 13 12430.34 12499.52 -6202.168 1 vs 2 105.6739  <.0001
fit2     3 12 12412.76 12476.62 -6194.378 2 vs 3  15.5802   1e-04
fit3     4 13 12384.92 12454.11 -6179.461 3 vs 4  29.8354  <.0001

LMM Summary

Observe Y_i, i = 1, 2, ..., m independent clusters.

Model (Laird & Ware, 1982):

Y_i = X_i β + Z_i b_i + e_i

β is the coefficient vector that is common to all clusters (fixed across clusters).
b_i is the deviation of the coefficients that varies from cluster to cluster (random across clusters).
(β_j + b_i,j) is the coefficient of X_i,j for cluster i.

172 Heagerty, 2006

b_i ~ N(0, D)    between-cluster
e_i ~ N(0, R_i)  within-cluster

Estimation/Inference: WLS, ML

The covariance model choice leads to WLS, but the estimated regression coefficient is unbiased for any choice of weight (covariance).

The covariance model choice determines the standard error estimates for the regression coefficients: a correct covariance model is needed for correct standard errors.

173 Heagerty, 2006