Modelling the Covariance
Jamie Monogan
Washington University in St Louis
February 9, 2010
Objectives
By the end of this meeting, participants should be able to:
- Use orthogonal polynomials to capture higher-order time effects
- Define the structure of common covariance pattern models
- Choose a covariance pattern model for real data
Longitudinal Variance and Covariance
- We expect between-individual variance to be greater than within-individual variance
- This is because repeated observations on the same individual ought to be similar
- The more similar observations are, the higher their covariance
- The more similar observations are, the lower the variance of their difference
- Illustration:
\[ \mathrm{Var}(Y_{i2} - Y_{i1}) = \sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2 \]
- If we do not account for the correlation among repeated observations, our standard errors will usually be too large
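A small numeric sketch of the illustration above. The standard deviations and the correlation are hypothetical values chosen only to make the arithmetic easy to follow, and `var_of_difference` is an illustrative helper name, not something from the course materials.

```python
import math

def var_of_difference(sd1, sd2, rho):
    """Var(Y2 - Y1) = sd1^2 + sd2^2 - 2 * rho * sd1 * sd2."""
    return sd1 ** 2 + sd2 ** 2 - 2.0 * rho * sd1 * sd2

# Hypothetical values, for illustration only.
sd1, sd2 = 2.0, 2.0

# Pretending the two waves are uncorrelated (rho = 0).
var_independent = var_of_difference(sd1, sd2, rho=0.0)

# Accounting for a positive within-individual correlation.
var_correlated = var_of_difference(sd1, sd2, rho=0.75)

# Factor by which the standard error of the change is overstated
# when the correlation is ignored.
se_ratio = math.sqrt(var_independent / var_correlated)

print(var_independent, var_correlated, se_ratio)
```

With these made-up inputs, ignoring the correlation doubles the standard error of the estimated change.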
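Longitudinal Variance and Covariance
- In regression terms, we account for this by defining \Omega within the GLS estimator:
\[ \hat{\beta}_{GLS} = [x'\Omega^{-1}x]^{-1} x'\Omega^{-1}y \]
- Typically:
\[ \Omega = \begin{pmatrix} \Sigma & O & \cdots & O \\ O & \Sigma & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & \Sigma \end{pmatrix} \]
- Where Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{in})', \mathrm{Cov}(Y_i) = \Sigma, and:
\[ O = \begin{pmatrix} 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix} \]
- Note: for n waves, both \Sigma and O are n \times n matrices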
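The block-diagonal layout of \Omega can be sketched in a few lines. This is only an illustration of the structure, assuming every individual shares the same \Sigma; the function name `block_diagonal` and the 2-wave \Sigma below are hypothetical.

```python
def block_diagonal(sigma, n_units):
    """Stack the same n x n matrix sigma along the diagonal n_units times.

    Off-diagonal blocks stay zero: observations on different
    individuals are treated as uncorrelated.
    """
    n = len(sigma)
    size = n * n_units
    omega = [[0.0] * size for _ in range(size)]
    for u in range(n_units):
        offset = u * n
        for j in range(n):
            for k in range(n):
                omega[offset + j][offset + k] = sigma[j][k]
    return omega

# Hypothetical Sigma for n = 2 waves, repeated for 3 individuals.
sigma = [[1.0, 0.5],
         [0.5, 1.0]]
omega = block_diagonal(sigma, n_units=3)

for row in omega:
    print(row)
```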
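Unstructured Covariance
- No constraints on \Sigma other than symmetry
\[ \mathrm{Cov}(Y_i) = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \cdots & \sigma_n^2 \end{pmatrix} \]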
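Compound Symmetry
- Random effects for units
- I.e., how to deal with unit effects using GLS
\[ \mathrm{Cov}(Y_i) = \begin{pmatrix} \sigma^2 & \sigma^2\rho & \sigma^2\rho & \cdots & \sigma^2\rho \\ \sigma^2\rho & \sigma^2 & \sigma^2\rho & \cdots & \sigma^2\rho \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \sigma^2\rho & \sigma^2\rho & \sigma^2\rho & \cdots & \sigma^2 \end{pmatrix} \]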
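A minimal sketch of the compound symmetry pattern, together with its link to a random-intercept model: if the unit effect has variance `var_between` and the residual has variance `var_within`, then \sigma^2 is their sum and \rho is the share due to the unit effect. The function names and numbers are hypothetical, chosen for illustration.

```python
def compound_symmetry(n, sigma2, rho):
    """n x n matrix with sigma2 on the diagonal and sigma2 * rho elsewhere."""
    return [[sigma2 if j == k else sigma2 * rho for k in range(n)]
            for j in range(n)]

def from_random_intercept(n, var_between, var_within):
    """Compound symmetry implied by a random intercept plus iid noise."""
    sigma2 = var_between + var_within
    rho = var_between / sigma2
    return compound_symmetry(n, sigma2, rho)

# Hypothetical parameter values.
cs = compound_symmetry(3, sigma2=4.0, rho=0.5)
cs_ri = from_random_intercept(3, var_between=2.0, var_within=2.0)

for row in cs:
    print(row)
```

Both calls produce the same matrix here, which is the sense in which compound symmetry handles unit effects through the covariance.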
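Toeplitz
- Cannot be directly estimated in R
- For n waves, AR(n - 1) is equivalent
\[ \mathrm{Cov}(Y_i) = \begin{pmatrix} \sigma^2 & \sigma_1 & \sigma_2 & \cdots & \sigma_{n-1} \\ \sigma_1 & \sigma^2 & \sigma_1 & \cdots & \sigma_{n-2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \sigma_{n-1} & \sigma_{n-2} & \sigma_{n-3} & \cdots & \sigma^2 \end{pmatrix} \]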
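The defining feature of the Toeplitz pattern is that the covariance depends only on the lag between two waves. A short sketch, with a hypothetical helper `toeplitz_cov` and made-up lag covariances:

```python
def toeplitz_cov(gammas):
    """Build a Toeplitz covariance matrix.

    gammas[0] is the common variance (sigma^2 on the slide) and
    gammas[h] is the covariance at lag h (sigma_h on the slide).
    """
    n = len(gammas)
    return [[gammas[abs(j - k)] for k in range(n)] for j in range(n)]

# Hypothetical values for n = 4 waves: one variance, three lag covariances.
toep = toeplitz_cov([4.0, 2.0, 1.0, 0.5])

for row in toep:
    print(row)
```

For n waves this uses n free parameters, compared with n(n + 1)/2 for the unstructured pattern.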
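First-Order Autoregressive
- Shorthand: AR(1)
- A special case of Toeplitz that reduces to two parameters
- Assumption: e_{ij} = \rho e_{i,j-1} + w_{ij}, where w_{ij} is iid normal
\[ \mathrm{Cov}(Y_i) = \begin{pmatrix} \sigma^2 & \sigma^2\rho & \sigma^2\rho^2 & \cdots & \sigma^2\rho^{n-1} \\ \sigma^2\rho & \sigma^2 & \sigma^2\rho & \cdots & \sigma^2\rho^{n-2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \sigma^2\rho^{n-1} & \sigma^2\rho^{n-2} & \sigma^2\rho^{n-3} & \cdots & \sigma^2 \end{pmatrix} \]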
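A sketch of the AR(1) pattern, in which the covariance at lag h is \sigma^2 \rho^h. The function name `ar1_cov` and the parameter values are hypothetical.

```python
def ar1_cov(n, sigma2, rho):
    """n x n AR(1) covariance: entry (j, k) is sigma2 * rho ** |j - k|."""
    return [[sigma2 * rho ** abs(j - k) for k in range(n)] for j in range(n)]

# Hypothetical values: 4 equally spaced waves, sigma^2 = 1, rho = 0.5.
ar1 = ar1_cov(4, sigma2=1.0, rho=0.5)

for row in ar1:
    print(row)
```

Only two parameters are involved, and the correlation decays geometrically as waves get further apart, which presumes equally spaced measurements.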
Banded
- Simplification of Toeplitz that imposes zero covariance beyond a certain order
- Not easily estimated in R
- Toeplitz (2):
\[ \mathrm{Cov}(Y_i) = \begin{pmatrix} \sigma^2 & \sigma^2\rho_1 & 0 & \cdots & 0 \\ \sigma^2\rho_1 & \sigma^2 & \sigma^2\rho_1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \sigma^2 \end{pmatrix} \]
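A sketch of the Toeplitz (2) band, assuming the "(2)" counts the two non-zero bands (the diagonal and the first off-diagonal), as in the matrix above. The name `banded_toeplitz2` and the values are hypothetical.

```python
def banded_toeplitz2(n, sigma2, rho1):
    """Toeplitz pattern with zero covariance beyond lag 1."""
    def entry(j, k):
        lag = abs(j - k)
        if lag == 0:
            return sigma2          # variance on the diagonal
        if lag == 1:
            return sigma2 * rho1   # adjacent waves
        return 0.0                 # everything further apart is set to zero
    return [[entry(j, k) for k in range(n)] for j in range(n)]

# Hypothetical values for n = 4 waves.
band = banded_toeplitz2(4, sigma2=2.0, rho1=0.25)

for row in band:
    print(row)
```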
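Exponential
- Also called continuous autoregressive
- A better choice for irregularly spaced measurement intervals
\[ \mathrm{Cov}(Y_i) = \begin{pmatrix} \sigma^2 & \sigma^2\rho^{|t_1 - t_2|} & \sigma^2\rho^{|t_1 - t_3|} & \cdots & \sigma^2\rho^{|t_1 - t_n|} \\ \sigma^2\rho^{|t_2 - t_1|} & \sigma^2 & \sigma^2\rho^{|t_2 - t_3|} & \cdots & \sigma^2\rho^{|t_2 - t_n|} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \sigma^2\rho^{|t_n - t_1|} & \sigma^2\rho^{|t_n - t_2|} & \sigma^2\rho^{|t_n - t_3|} & \cdots & \sigma^2 \end{pmatrix} \]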
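A sketch of the exponential pattern, where the exponent is the elapsed time between two measurements rather than the number of waves between them. The measurement times and parameter values are hypothetical, and `exponential_cov` is an illustrative name.

```python
def exponential_cov(times, sigma2, rho):
    """Entry (j, k) is sigma2 * rho ** |t_j - t_k|."""
    return [[sigma2 * rho ** abs(tj - tk) for tk in times] for tj in times]

# Hypothetical, irregularly spaced measurement times.
times = [0.0, 1.0, 4.0, 6.0]
expo = exponential_cov(times, sigma2=1.0, rho=0.5)

for row in expo:
    print(row)
```

With equally spaced times 0, 1, 2, ... this reduces to the AR(1) pattern; with unequal spacing, pairs of waves separated by longer gaps get weaker correlation.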
Estimation in R: corClasses
- corSymm: general correlation matrix, with no additional structure
- corCompSymm: compound symmetry structure (random effects, in econometrics terms)
- corAR1: autoregressive process of order 1
- corARMA: autoregressive moving average process
- corCAR1: continuous autoregressive process (exponential)
Choosing a Covariance Pattern Model
- Covariance of repeated observations can depend on the specification of the mean model (a case for response profiles)
- Likelihood Ratio Test:
  - H_0: the r constraints are true
  - G^2 = 2(\hat{l}_f - \hat{l}_s) \sim \chi^2(r) for r restrictions, where \hat{l}_f and \hat{l}_s are the maximized log-likelihoods of the full and the restricted models
  - For covariance testing: use REML!
  - Not ideal if imposing many zeros
- AIC: -2\hat{l} + 2c
  - For \hat{l} the maximized REML log-likelihood and c the number of covariance parameters
  - The lowest value is the best fit, contingent on a parsimony penalty
- BIC (Schwarz's Criterion): -2\hat{l} + \log(N^*) c
  - For \hat{l} the maximized REML log-likelihood, N^* = N - p (for p the length of \beta), and c the number of covariance parameters
  - The lowest value is the best fit, contingent on a larger parsimony penalty
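The three criteria above are simple functions of the maximized REML log-likelihood. The sketch below only exercises the formulas: the log-likelihoods, N, and p are made-up numbers, not results from any data set, and the function names are hypothetical. The parameter counts assume n = 4 waves, so the unstructured pattern has 10 covariance parameters and AR(1) has 2.

```python
import math

def lrt_statistic(loglik_full, loglik_restricted):
    """G^2 = 2 * (l_f - l_s); compare to chi-square with r degrees of freedom."""
    return 2.0 * (loglik_full - loglik_restricted)

def aic(loglik, n_cov_params):
    """AIC = -2 * l + 2 * c."""
    return -2.0 * loglik + 2.0 * n_cov_params

def bic(loglik, n_cov_params, n, p):
    """BIC = -2 * l + log(N*) * c, with N* = N - p."""
    n_star = n - p
    return -2.0 * loglik + math.log(n_star) * n_cov_params

# Made-up REML log-likelihoods, for illustration only.
ll_unstructured, c_unstructured = -1200.0, 10
ll_ar1, c_ar1 = -1210.0, 2
n, p = 400, 8  # hypothetical N and length of beta

g2 = lrt_statistic(ll_unstructured, ll_ar1)  # r = 10 - 2 = 8 restrictions
aic_un, aic_ar1 = aic(ll_unstructured, c_unstructured), aic(ll_ar1, c_ar1)
bic_un, bic_ar1 = bic(ll_unstructured, c_unstructured, n, p), bic(ll_ar1, c_ar1, n, p)

print("G^2:", g2)
print("AIC:", aic_un, aic_ar1)
print("BIC:", round(bic_un, 2), round(bic_ar1, 2))
```

With these invented inputs AIC favours the unstructured pattern while BIC favours AR(1), which illustrates how the larger BIC penalty can change the choice.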
For Next Time
- Read FLW chapter 8
- With the Lead-Exposed Children data from Feb 2:
  - Run a response profile model
  - Do so with four covariance structures: unstructured, exponential, Toeplitz, and AR(1) (Don't report the results)
  - Report the AIC for the four models and explain which covariance structure you would use
  - Regardless of the AIC, would you believe Toeplitz or AR(1) could be appropriate for modeling the covariance of these data?