Stat 579: Generalized Linear Models and Extensions

Linear Mixed Models for Longitudinal Data
Yan Lu
April 2018, week 12

Correlated data
- multivariate observations
- clustered data
- repeated measurements
- longitudinal data
- spatially correlated data

Example:
- Multivariate: a subject's systolic and diastolic blood pressure are measured simultaneously (measurement of different characteristics).
- Cluster setting: for a number of families, diastolic blood pressure is measured for all of their members.
- Repeated measures: for each subject, diastolic blood pressure is recorded under several experimental conditions (measurement of the same characteristic under different conditions).
- Longitudinal data: diastolic blood pressure is measured repeatedly over time for each subject.

Problems of repeated measures and longitudinal designs

Order effect
- Example: in evaluating five different advertisements, subjects may tend to give higher (or lower) ratings to advertisements shown toward the end of the sequence than to those shown at the beginning.
- Remedy: randomize the treatment order.

Carry-over effect
- Example: in evaluating five different soup recipes, a bland recipe may get a higher (or lower) rating when preceded by a highly spiced recipe than when preceded by a blander recipe.
- Remedy: allow sufficient time between treatments.
- Example: in evaluating a new drug, it is often noticed that the effect of the drug begins to wear off over time.
- Remedy: consider a mixed model that incorporates a time effect. Typical models: the random intercepts model, and the random slopes and intercepts model.

Example: Childhood obesity
- $n$ children are followed longitudinally (over time)
- body mass index (BMI = WT/HT$^2$) has been recorded at each time (age)
- interested in the relationship of BMI as a function of age
- data structure:

             time (age) 1   time (age) 2   ...   time (age) n_i
  1st child  y_{11}         y_{12}         ...   y_{1 n_1}
  2nd child  y_{21}         y_{22}         ...   y_{2 n_2}
  ...
  nth child  y_{n1}         y_{n2}         ...   y_{n n_n}

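Model 1: Simple linear regression model
$$y_{ij} = \beta_0 + \beta_1\,\mathrm{age}_{ij} + \varepsilon_{ij}$$
where $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, n_i$ and $\varepsilon_{ij} \overset{iid}{\sim} N(0, \sigma^2)$.
- Assumes that gender is unimportant and that the same relationship holds for all children.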

Model 2: Intercepts and slopes vary by child
$$y_{ij} = \beta_{0i} + \beta_{1i}\,\mathrm{age}_{ij} + \varepsilon_{ij}$$
where $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, n_i$ and $\varepsilon_{ij} \overset{iid}{\sim} N(0, \sigma^2)$.
- Model 1: 2 regression parameters $(\beta_0, \beta_1)$
- Model 2: $2n$ regression parameters $(\beta_{0i}, \beta_{1i})$
- Responses on the same child are likely to be correlated, so the iid assumption may not hold.

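Model 3: Mixed model
Treat the subject-specific intercepts $\beta_{0i}$ and slopes $\beta_{1i}$ as random variables:
$$\beta_{0i} = \beta_0 + \beta^*_{0i}, \qquad \beta^*_{0i} \overset{iid}{\sim} N(0, \sigma_0^2)$$
$$\beta_{1i} = \beta_1 + \beta^*_{1i}, \qquad \beta^*_{1i} \overset{iid}{\sim} N(0, \sigma_1^2)$$
$\beta^*_{0i}$ and $\beta^*_{1i}$ measure the difference between the population mean intercept and slope $(\beta_0, \beta_1)$ and the subject-specific intercept and slope $(\beta_{0i}, \beta_{1i})$.
$$y_{ij} = \beta_0 + \beta^*_{0i} + (\beta_1 + \beta^*_{1i})\,\mathrm{age}_{ij} + \varepsilon_{ij} = \beta_0 + \beta_1\,\mathrm{age}_{ij} + \beta^*_{0i} + \beta^*_{1i}\,\mathrm{age}_{ij} + \varepsilon_{ij}$$
- $\beta_0 + \beta_1\,\mathrm{age}_{ij}$: population average regression line
- $\beta^*_{0i} + \beta^*_{1i}\,\mathrm{age}_{ij}$: subject-specific deviation from the population average
A special case of a linear mixed model (LMM): fixed effects $(\beta_0, \beta_1)$ + random effects $(\beta^*_{0i}, \beta^*_{1i})$.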
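To see why the mixed model induces correlation among responses on the same child, the sketch below computes the marginal covariance matrix of one child's responses. It is only an illustration: the ages and the three standard deviations are hypothetical values, and it assumes the random intercept and slope deviations are independent of each other and of the errors (consistent with the five-parameter count, but not stated explicitly in the slides).

```r
# Marginal covariance of one child's responses under a random intercept
# and random slope model. All numeric values are hypothetical.
ages   <- c(8, 10, 12, 14)   # measurement ages for one child
sigma0 <- 2                  # SD of the random intercept deviations
sigma1 <- 0.3                # SD of the random slope deviations
sigma  <- 1                  # SD of the within-child errors

Z <- cbind(1, ages)                  # random-effects design: intercept and age
D <- diag(c(sigma0^2, sigma1^2))     # assumption: intercept and slope deviations independent
V <- Z %*% D %*% t(Z) + sigma^2 * diag(length(ages))   # Var(y_i)

print(round(V, 2))
print(round(cov2cor(V), 3))          # implied within-child correlations
```

The off-diagonal entries are nonzero because every response on the child shares the same random intercept and slope.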

Advantages:
- Responses on the same individual are correlated: $y_{i1}, y_{i2}, \ldots, y_{in_i}$ all depend on $\beta_{0i}, \beta_{1i}$.
- Model 2: $2n$ parameters. Model 3: 5 parameters $(\beta_0, \beta_1, \sigma_0^2, \sigma_1^2, \sigma^2)$.
- Model 2 applies only to the individuals selected for the study. In Model 3, individuals are sampled as representatives of an underlying population, so the model applies to that population.

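General Linear Models for Longitudinal Data
$$y_i = \begin{pmatrix} y_{i1} \\ y_{i2} \\ \vdots \\ y_{in_i} \end{pmatrix}, \qquad \mu_i = \begin{pmatrix} \mu_{i1} \\ \mu_{i2} \\ \vdots \\ \mu_{in_i} \end{pmatrix}, \qquad \varepsilon_i = \begin{pmatrix} \varepsilon_{i1} \\ \varepsilon_{i2} \\ \vdots \\ \varepsilon_{in_i} \end{pmatrix}$$
- the elements of $y_i$ are the responses at times $t_{i1}, t_{i2}, \ldots, t_{in_i}$
- $y_i = \mu_i + \varepsilon_i$
- the number of time points can vary by individual
- assume that $\mu_i = X_i \beta$, where $X_i$ is $n_i \times p$ and $\beta$ is $p \times 1$; $X_i$ may be individual specific
- assume $E(\varepsilon_i) = 0$, $\mathrm{Var}(\varepsilon_i) = \mathrm{Var}(y_i) = \Sigma_i$
- often assume $\varepsilon_i \sim N_{n_i}(0, \Sigma_i)$ or, equivalently, $y_i \sim N_{n_i}(\mu_i, \Sigma_i)$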

Example: dental data
Orthodontic distance is measured at ages $t_{i1} = 8$, $t_{i2} = 10$, $t_{i3} = 12$, $t_{i4} = 14$ ($n_i = 4$) for each subject from 2 groups (boys (red) and girls (black)); gender is coded 0 for girls and 1 for boys.

  > ex.data
      obs subject age distance gender
  1     1       1   8     21.0      0
  2     2       1  10     20.0      0
  3     3       1  12     21.5      0
  4     4       1  14     23.0      0
  5     5       2   8     21.0      0
  6     6       2  10     21.5      0
  7     7       2  12     24.0      0
  8     8       2  14     25.5      0
  ...
  106 106      27  10     21.5      1
  107 107      27  12     23.5      1
  108 108      27  14     25.0      1

  > aa <- aggregate(distance ~ age + gender, data = ex.data, mean)  # cell means
  > aa
    age gender distance
  1   8      0 21.18182
  2  10      0 22.22727
  3  12      0 23.09091
  4  14      0 24.09091
  5   8      1 22.87500
  6  10      1 23.81250
  7  12      1 25.71875
  8  14      1 27.46875

Figure 1: Sample means at each time across children, by gender (x-axis: age, 8 to 14; y-axis: distance, about 21 to 27).
Increasing linear trend with time; males (red) and females (black) have different intercepts and possibly different slopes.

Starting model
$i$: subject, $j$: time
$$y_{ij} = \mu_{ij} + \varepsilon_{ij}$$
Let
$$\delta_i = \begin{cases} 1 & \text{if boy} \\ 0 & \text{if girl} \end{cases}$$
$$\mu_{ij} = \beta_0 + \beta_1 \delta_i + \beta_2 t_j + \beta_3 (\delta_i t_j)$$
- $\delta_i$: indicator effect for gender
- $t_j$: linear time effect
- $(\delta_i t_j)$: time and gender interaction

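$$\mu_{ij} = \beta_0 + \beta_1 \delta_i + \beta_2 t_j + \beta_3 (\delta_i t_j)$$
For $\delta_i = 0$ (girls):
$$\mu_{ij} = \beta_0 + \beta_1 (0) + \beta_2 t_j + \beta_3 (0) = \beta_0 + \beta_2 t_j$$
For $\delta_i = 1$ (boys):
$$\mu_{ij} = \beta_0 + \beta_1 (1) + \beta_2 t_j + \beta_3 (1 \cdot t_j) = (\beta_0 + \beta_1) + (\beta_2 + \beta_3) t_j$$

  gender   intercept             slope
  girls    $\beta_0$             $\beta_2$
  boys     $\beta_0 + \beta_1$   $\beta_2 + \beta_3$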
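As a rough check of the intercept and slope table, the sketch below fits the starting model by ordinary least squares to the eight cell means reported for the dental data. It ignores the within-subject correlation entirely, so it yields point estimates for illustration only, not a substitute for a longitudinal analysis. The data frame `aa` is retyped from the `aggregate()` output, and `lines_tab` is a name introduced here.

```r
# Cell means copied from the aggregate() output for the dental data
aa <- data.frame(
  age      = rep(c(8, 10, 12, 14), times = 2),
  gender   = rep(c(0, 1), each = 4),          # 0 = girls, 1 = boys
  distance = c(21.18182, 22.22727, 23.09091, 24.09091,
               22.87500, 23.81250, 25.71875, 27.46875)
)

# Ordinary least squares fit of mu = b0 + b1*gender + b2*age + b3*gender*age
fit <- lm(distance ~ gender + age + gender:age, data = aa)
b   <- unname(coef(fit))                      # order: beta0, beta1, beta2, beta3

# Group-specific lines implied by the coefficients
lines_tab <- data.frame(
  gender    = c("girls", "boys"),
  intercept = c(b[1], b[1] + b[2]),
  slope     = c(b[3], b[3] + b[4])
)
print(lines_tab)
```

With these cell means the girls' line has intercept about 17.37 and slope about 0.48, and the boys' line has intercept about 16.34 and slope about 0.78.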

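The coefficients for gender and the gender * time interaction measure the difference in intercept and slope between boys and girls. This is the group difference parametrization (GDP).
For the model $\mu_i = X_i \beta$:
$$X_i = \begin{pmatrix} 1 & \delta_i & t_1 & \delta_i t_1 \\ 1 & \delta_i & t_2 & \delta_i t_2 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & \delta_i & t_n & \delta_i t_n \end{pmatrix}, \qquad \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix}$$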

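Separate group parametrization
An equivalent model is to specify
$$\mu_{ij} = \beta_0 (1 - \delta_i) + \beta_1 \delta_i + \beta_2 (1 - \delta_i) t_j + \beta_3 (\delta_i t_j)$$
For $\delta_i = 0$ (girls), $\mu_{ij} = \beta_0 + \beta_2 t_j$.
For $\delta_i = 1$ (boys), $\mu_{ij} = \beta_1 + \beta_3 t_j$.
In matrix form, $\mu_i = X_i \beta$ under the separate group parametrization (SGP):
$$\begin{pmatrix} \mu_{i1} \\ \vdots \\ \mu_{in_i} \end{pmatrix} = \begin{pmatrix} 1 - \delta_i & \delta_i & (1 - \delta_i) t_1 & \delta_i t_1 \\ \vdots & \vdots & \vdots & \vdots \\ 1 - \delta_i & \delta_i & (1 - \delta_i) t_{n_i} & \delta_i t_{n_i} \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix}$$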

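For girls:
$$X_i = \begin{pmatrix} 1 & 0 & t_1 & 0 \\ 1 & 0 & t_2 & 0 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & 0 & t_{n_i} & 0 \end{pmatrix}$$
For boys:
$$X_i = \begin{pmatrix} 0 & 1 & 0 & t_1 \\ 0 & 1 & 0 & t_2 \\ \vdots & \vdots & \vdots & \vdots \\ 0 & 1 & 0 & t_{n_i} \end{pmatrix}$$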
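The equivalence of the two parametrizations can be checked numerically. The sketch below builds both design matrices for one subject and verifies that the GDP matrix equals the SGP matrix times a fixed invertible matrix `A`, so both describe the same set of mean vectors. The function names and the matrix `A` are introduced here for illustration; `A` simply encodes that the SGP intercepts and slopes are $\beta_0$, $\beta_0 + \beta_1$, $\beta_2$, $\beta_2 + \beta_3$ in GDP terms.

```r
times <- c(8, 10, 12, 14)   # measurement ages in the dental example

# Design matrix under the group difference parametrization (GDP)
design_gdp <- function(delta, t) cbind(1, delta, t, delta * t)

# Design matrix under the separate group parametrization (SGP)
design_sgp <- function(delta, t) cbind(1 - delta, delta, (1 - delta) * t, delta * t)

# Maps GDP coefficients to SGP coefficients: beta_sgp = A %*% beta_gdp
A <- rbind(c(1, 0, 0, 0),   # girls' intercept = beta0
           c(1, 1, 0, 0),   # boys' intercept  = beta0 + beta1
           c(0, 0, 1, 0),   # girls' slope     = beta2
           c(0, 0, 1, 1))   # boys' slope      = beta2 + beta3

for (delta in c(0, 1)) {
  same <- all.equal(unname(design_gdp(delta, times)),
                    unname(design_sgp(delta, times) %*% A))
  cat("delta =", delta, "-> X_gdp equals X_sgp %*% A:", isTRUE(same), "\n")
}
```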

Split-plot model
Alternatively, we could consider the split-plot model
$$\mu_{ij} = \mu + \tau_i + \delta_j + (\tau\delta)_{ij}$$
The models that relate the mean to a linear function of time would appear to be more appealing.

Quadratic curve fitting
Hip-replacement study:
- 30 patients underwent hip replacement surgery, 13 males and 17 females.
- The response is the ratio of the volume of packed red blood cells to the volume of whole blood, recorded as a percentage.
- It was supposed to be measured for each patient at week 0, before the replacement, and then at weeks 1, 2 and 3 after the replacement.
- Data were collected at 4 equally spaced time points (0, 1, 2, 3 weeks), but some responses are missing.
- Mean profiles appear to be quadratic in time.
- Also of interest is whether age might affect the response; given the short duration of the study, it makes sense to treat a subject's age as constant.

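Let
$$\delta_i = \begin{cases} 1 & \text{males} \\ 0 & \text{females} \end{cases}$$
$$y_{ij} = \mu_{ij} + \varepsilon_{ij}$$
$$\mu_{ij} = \beta_0 + \beta_1 t_{ij} + \beta_2 t_{ij}^2 + \beta_3 \delta_i + \beta_4 \delta_i t_{ij} + \beta_5 \delta_i t_{ij}^2$$
For females:
$$\mu_{ij} = \beta_0 + \beta_1 t_{ij} + \beta_2 t_{ij}^2$$
For males:
$$\mu_{ij} = (\beta_0 + \beta_3) + (\beta_1 + \beta_4) t_{ij} + (\beta_2 + \beta_5) t_{ij}^2$$
With age included as an additive covariate, the quadratic curves shift up or down with age, but the amount of shift is the same for males and females. We could allow the shift to be sex dependent by including a gender*age interaction.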

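Modeling the covariance matrix $\Sigma_i$
$$y_i = \mu_i + \varepsilon_i, \qquad \mathrm{Var}(\varepsilon_i) = \mathrm{Var}(y_i) = \Sigma_i \quad (n_i \times n_i)$$
For the balanced case:
- $n_i = n$, with common times $t_1, t_2, \ldots, t_n$
- assume no missing data
- further, assume that $\Sigma_i = \Sigma$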

Equicorrelation
$$\Sigma = \begin{pmatrix} \sigma^2 & \sigma^2\rho & \cdots & \sigma^2\rho \\ \sigma^2\rho & \sigma^2 & \cdots & \sigma^2\rho \\ \vdots & \vdots & \ddots & \vdots \\ \sigma^2\rho & \sigma^2\rho & \cdots & \sigma^2 \end{pmatrix}$$
- This structure is also called spherical or exchangeable.
- It might be applicable to cluster data, where $\rho$ is called the intra-class correlation coefficient between two members of the same cluster and is a relative measure of the within-cluster similarity.
- Each response has variance $\sigma^2$ and all pairwise correlations are $\rho$.
- Appropriate if one believes that the correlation between responses at any two time points is constant.

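Compound symmetry
A special case of equicorrelation, which arises by enforcing $\rho = \sigma_B^2 / (\sigma_B^2 + \sigma_e^2)$ for some $\sigma_B^2$ and $\sigma_e^2$. In this case, $\sigma^2 = \sigma_B^2 + \sigma_e^2$, and
$$\Sigma = \begin{pmatrix} \sigma_B^2 + \sigma_e^2 & \sigma_B^2 & \cdots & \sigma_B^2 \\ \sigma_B^2 & \sigma_B^2 + \sigma_e^2 & \cdots & \sigma_B^2 \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_B^2 & \sigma_B^2 & \cdots & \sigma_B^2 + \sigma_e^2 \end{pmatrix}_{n \times n}$$
Recall that a split-plot design has this variance structure.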
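The link between compound symmetry and equicorrelation can be verified with a few lines of base R. This is a minimal sketch with hypothetical variance components; it builds the matrix both ways and confirms they agree.

```r
# Hypothetical variance components, for illustration only
sigma2_B <- 4   # between-subject variance
sigma2_e <- 1   # within-subject variance
n        <- 4   # number of time points

# Compound symmetry: sigma2_B everywhere, plus sigma2_e on the diagonal
cs <- matrix(sigma2_B, n, n) + diag(sigma2_e, n)

# Equivalent equicorrelation form with sigma^2 and rho
sigma2 <- sigma2_B + sigma2_e
rho    <- sigma2_B / sigma2
equi   <- sigma2 * ((1 - rho) * diag(n) + rho * matrix(1, n, n))

print(cs)
print(rho)                    # intra-class correlation
print(all.equal(cs, equi))    # both constructions give the same matrix
```

Because $\sigma_B^2 \ge 0$, this construction only produces $\rho \ge 0$, which is why compound symmetry is a special case rather than the whole equicorrelation family.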

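Unstructured
Arbitrary variances and covariances:
$$\Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \sigma_2^2 & \cdots & \sigma_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \cdots & \sigma_n^2 \end{pmatrix}$$
- The variance-covariance matrix contains $n(n-1)/2 + n$ nuisance parameters to be estimated.
- Estimation of this structure may only converge for $N \gg n$.
- The statistical power under this structure is reduced, since the only constraint on $\Sigma_i$ is that it be symmetric.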

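1-dependent
$$\Sigma = \sigma^2 \begin{pmatrix} 1 & \rho & 0 & \cdots & 0 \\ \rho & 1 & \rho & \cdots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & \rho & 1 & \rho \\ 0 & \cdots & 0 & \rho & 1 \end{pmatrix}$$
- constant variance $\sigma^2$
- lag 1 correlation $= \rho$, i.e., $\mathrm{corr}(y_{ij}, y_{i(j \pm 1)}) = \rho$
- lag $k$ correlation $= 0$, i.e., $\mathrm{corr}(y_{ij}, y_{i(j \pm k)}) = 0$ if $k > 1$
- observations taken more closely together in time might tend to be more alike than those taken further apart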
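A 1-dependent matrix is a banded Toeplitz matrix, so it can be sketched with base R's `toeplitz()`. The function name `one_dependent` and the numeric values are hypothetical. The eigenvalue check is included because not every $\rho$ in $(-1, 1)$ gives a valid (positive definite) covariance matrix under this structure.

```r
# 1-dependent covariance: sigma2 on the diagonal, sigma2 * rho at lag 1, 0 beyond
one_dependent <- function(n, sigma2, rho) {
  first_row <- c(1, rho, rep(0, n - 2))   # requires n >= 2
  sigma2 * toeplitz(first_row)
}

S <- one_dependent(n = 4, sigma2 = 2, rho = 0.4)   # hypothetical values
print(S)

# Check that this choice of rho yields a positive definite matrix
print(min(eigen(S, symmetric = TRUE)$values) > 0)
```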

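AR(1): autoregressive of order 1 (equally spaced in time)
$$\Sigma = \sigma^2 \begin{pmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{n-1} \\ \rho & 1 & \rho & \cdots & \rho^{n-2} \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ \rho^{n-1} & \rho^{n-2} & \cdots & \rho & 1 \end{pmatrix}$$
- constant variance $\sigma^2$
- $\mathrm{corr}(y_{ij}, y_{i(j \pm k)}) = \rho^k$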
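The AR(1) pattern depends only on the lag between two time indices, which makes it easy to generate. A minimal sketch, with a hypothetical function name and hypothetical parameter values:

```r
# AR(1) covariance for n equally spaced times: entry (j, h) is sigma2 * rho^|j - h|
ar1_cov <- function(n, sigma2, rho) {
  lag <- abs(outer(seq_len(n), seq_len(n), "-"))   # matrix of lags |j - h|
  sigma2 * rho^lag
}

S <- ar1_cov(n = 4, sigma2 = 1, rho = 0.5)
print(S)   # correlation halves with each additional lag when rho = 0.5
```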

Unequally spaced in time
The above models make sense for equally spaced times. What if the times are unequally spaced, but are still the same for each subject?
- compound symmetry: ok, if you believe that the correlation between responses at any two time points is constant
- unstructured: ok
- 1-dependent: questionable. If $t_3 = t_2 + \epsilon$ with $\epsilon$ small, should $\mathrm{corr}(y_{t_1}, y_{t_2}) = \rho$, yet $\mathrm{corr}(y_{t_1}, y_{t_3}) = 0$?
- AR(1): same issue as 1-dependent

Generalization of AR(1) to unequally spaced times
Markov model: correlation is a function of the distance between times.
Let $d_{jh} = |t_j - t_h|$; then the Markov model has
- $\mathrm{var}(y_{t_j}) = \sigma^2$
- $\mathrm{corr}(y_{t_j}, y_{t_h}) = \rho^{d_{jh}}$
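The Markov structure replaces the lag between indices with the actual distance between times. The sketch below uses a hypothetical function name and hypothetical unequally spaced times; it assumes $0 < \rho < 1$, since a negative $\rho$ raised to a non-integer distance is not a real number.

```r
# Markov covariance: entry (j, h) is sigma2 * rho^|t_j - t_h|
markov_cov <- function(times, sigma2, rho) {
  stopifnot(rho > 0, rho < 1)            # assumption needed for non-integer distances
  d <- abs(outer(times, times, "-"))     # distances d_jh between time points
  sigma2 * rho^d
}

# Hypothetical unequally spaced measurement times
S_unequal <- markov_cov(times = c(0, 1, 3, 3.5), sigma2 = 1, rho = 0.5)
print(round(S_unequal, 3))

# With equally spaced unit times the structure reduces to AR(1)
S_equal <- markov_cov(times = 1:4, sigma2 = 1, rho = 0.5)
print(S_equal)
```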

Deciding among covariance models
- inspect the sample covariance matrices
- formal tests can be used in some cases
- model selection methods (AIC, BIC) for non-nested models

Unbalanced case: some observations are missing on some units
Example: $(t_1, t_2, t_3, t_4) = (0, 1, 2, 3)$. For unit $i$, the observation at time $t_3$ is not available, so $n_i = 3$. Let
$$y_i = \begin{pmatrix} y_{i1} \\ y_{i2} \\ y_{i4} \end{pmatrix}$$
denote the observations at times $(t_1, t_2, t_4) = (0, 1, 3)$. Assume that $\mathrm{var}(y_{ij}) = \sigma^2$ for all $j$; we thus want a model for
$$\Sigma_i = \mathrm{var}(y_i) = \begin{pmatrix} \sigma^2 & \mathrm{cov}(y_{i1}, y_{i2}) & \mathrm{cov}(y_{i1}, y_{i4}) \\ \mathrm{cov}(y_{i2}, y_{i1}) & \sigma^2 & \mathrm{cov}(y_{i2}, y_{i4}) \\ \mathrm{cov}(y_{i4}, y_{i1}) & \mathrm{cov}(y_{i4}, y_{i2}) & \sigma^2 \end{pmatrix}$$

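Compound symmetry: represented in the same way regardless of the missing value, since observations any distance apart have the same correlation.
Unstructured:
$$\Sigma_i = \mathrm{var}(y_i) = \begin{pmatrix} \sigma_1^2 & \sigma_{12} & \sigma_{14} \\ \sigma_{21} & \sigma_2^2 & \sigma_{24} \\ \sigma_{41} & \sigma_{42} & \sigma_4^2 \end{pmatrix}$$
One-dependent, which says that only observations adjacent in time are correlated. The matrix becomes
$$\Sigma_i = \mathrm{var}(y_i) = \begin{pmatrix} \sigma^2 & \rho\sigma^2 & 0 \\ \rho\sigma^2 & \sigma^2 & 0 \\ 0 & 0 & \sigma^2 \end{pmatrix}$$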

AR(1):
$$\Sigma_i = \mathrm{var}(y_i) = \sigma^2 \begin{pmatrix} 1 & \rho & \rho^3 \\ \rho & 1 & \rho^2 \\ \rho^3 & \rho^2 & 1 \end{pmatrix}$$
Comments: if all observations were intended to be taken at the same times, but some are not available, the covariance matrix must be carefully constructed.
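One way to construct these matrices carefully is to build the full covariance matrix for the intended times and then drop the rows and columns of the missing observations. The sketch below does this for the one-dependent and AR(1) structures with times (0, 1, 2, 3) and the observation at $t_3$ missing; the values of `rho` and `sigma2` and all variable names are hypothetical.

```r
rho    <- 0.5          # hypothetical correlation parameter
sigma2 <- 1            # hypothetical common variance
keep   <- c(1, 2, 4)   # observed positions: t3 (third time point) is missing

# Full 4 x 4 matrices for the intended times 0, 1, 2, 3
full_ar1  <- sigma2 * rho^abs(outer(0:3, 0:3, "-"))
full_1dep <- sigma2 * toeplitz(c(1, rho, 0, 0))

# Covariance matrices of the observed responses (y_i1, y_i2, y_i4)
ar1_i  <- full_ar1[keep, keep]
dep1_i <- full_1dep[keep, keep]

print(ar1_i)    # off-diagonals rho, rho^3, rho^2: not a 3 x 3 AR(1) pattern
print(dep1_i)   # y_i4 is uncorrelated with both y_i1 and y_i2
```

Simply fitting a 3 x 3 AR(1) or one-dependent matrix to the three observed values would treat times 1 and 3 as adjacent, which is the mistake the comment above warns about.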

Independence assumption
Assume that observations within a unit are uncorrelated; under multivariate normality they are then independent:
$$\Sigma_i = \mathrm{var}(\varepsilon_i) = \sigma^2 I_{n_i}$$
- Independence says that observations on the same unit are no more alike than those across units, which is unrealistic for longitudinal data.
- Independence also means no correlation induced by within-unit fluctuations over time.
  - This is ok if observations are all taken sufficiently far apart in time from one another.
  - In most situations, it causes model misspecification.