Correlated data. Further topics. Sources of random variation. Specification of mixed models. Faculty of Health Sciences.

Faculty of Health Sciences

Correlated data: Further topics
Lene Theil Skovgaard
December 11, 2012

Further topics
- Specification of mixed models
- Model check and diagnostics
- Explained variation, $R^2$
- Missing values
- Several series for each individual
- Additional examples:
  - Visual acuity example (from Variance components II)
  - Baseline revisited: simple before and after study
  - Reading ability

Specification of mixed models
Systematic variation:
- Between-individual covariates: treatment, sex, age, baseline value, ...
- Within-individual covariates: time, cumulative dose, temperature, ...
is specified as usual, including possible interactions.
Random variation:
- Interactions between systematic and random effects are always random.

Sources of random variation
1. Random effects
2. Serial correlation
3. Measurement error

SAS, PROC MIXED
- model describes the systematic part (fixed effects, mean value structure)
- random describes the random effects
- repeated describes the serial correlation
- local adds an additional measurement error

General model specification
Sorry..., matrix notation for brevity. Let $Y_i = (y_{i1}, \dots, y_{ik})$ denote all outcome values for subject $i$:
$$Y_i = X_i \beta + Z_i b_i + \varepsilon_i$$
where
- $\beta$ denotes systematic effects, $\beta = (\beta_1, \dots, \beta_p)$
- $b_i$ denotes random effects, $b_i = (b_{i1}, \dots, b_{iq})$
- $\varepsilon_i$ is the serially dependent residual variation, $\varepsilon_i = (\varepsilon_{i1}, \dots, \varepsilon_{ik})$
We assume all $b_i$'s and $\varepsilon_i$'s to be independent, with mean zero and
$$\mathrm{Var}(b_i) = G, \qquad \mathrm{Var}(\varepsilon_i) = R_i$$

For non-normal data
$Y_i$ follows some distribution from the exponential family, with mean value
$$E(Y_i) = g^{-1}(\eta_i), \qquad \text{where } \eta_i = X_i \beta + Z_i b_i,$$
and the variance $V(Y_i) = V_i$ is determined by the distribution.

Residual variance
In general, there is no free variance parameter, since the variance is determined from the mean value:
- Normal (link=identity): free variance parameter $\sigma^2$
- Binomial (link=logit): variance $np(1-p)$
- Poisson (link=log): variance $\lambda = E(Y)$
Overdispersion: the variance is seen to be larger than determined by the distribution.
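Because the $b_i$'s and $\varepsilon_i$'s are independent, the general specification implies the marginal covariance $\mathrm{Var}(Y_i) = Z_i G Z_i^\top + R_i$. The following is a minimal Python sketch of that calculation for a random-intercept model. The function name `marginal_cov` and all numeric values are illustrative assumptions, not taken from the slides.

```python
def marginal_cov(Z, G, R):
    """Return V = Z G Z^T + R for one subject (matrices as lists of rows)."""
    k, q = len(Z), len(G)
    # ZG has dimension k x q
    ZG = [[sum(Z[r][a] * G[a][c] for a in range(q)) for c in range(q)]
          for r in range(k)]
    # V[r][c] = sum_a ZG[r][a] * Z[c][a] + R[r][c]
    return [[sum(ZG[r][a] * Z[c][a] for a in range(q)) + R[r][c]
             for c in range(k)]
            for r in range(k)]


# Hypothetical values: k = 3 occasions, random intercept with variance 2.0,
# independent residuals with variance 1.0.
k = 3
Z = [[1.0] for _ in range(k)]                                        # random-intercept design
G = [[2.0]]                                                          # Var(b_i)
R = [[1.0 if r == c else 0.0 for c in range(k)] for r in range(k)]   # Var(eps_i)

V = marginal_cov(Z, G, R)
for row in V:
    print(row)
```

With these assumed values every pair of observations on the same subject shares the covariance 2.0 (the intercept variance), which is the compound-symmetry pattern a random intercept induces.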

Overdispersion
can be caused by
- omitted covariates (isn't that always the case?)
- unrecognized clusters
- heterogeneity, e.g. a "zero"-group (non-susceptibles)
Traditional solution: an over-dispersion parameter $\phi$ is estimated and multiplied onto the variance.
Cheating... we would like instead a model with a free parameter:
$$\eta_i = X_i \beta + Z_i b_i + \varepsilon_i, \qquad \mathrm{Var}(\varepsilon_i) = \Sigma_i$$

Swabs example from last week
    proc glimmix data=swab;
      class crowding family name;
      model swab=crowding name / dist=poisson link=log s cl;
      random family(crowding);
    run;
gave as part of the output
    Fit Statistics
    -2 Res Log Pseudo-Likelihood
    Gener. Chi-Square / DF          1.96   <--
When this number is greater than 1, it indicates overdispersion, and the P-values in the Poisson analysis might be too small.

Model with overdispersion
    proc glimmix data=swab;
      class crowding family name;
      model swab=crowding name / dist=poisson link=log s cl;
      random intercept / subject=family(crowding);
      random _residual_;                                  /* <-- */
    run;
providing the output
    The GLIMMIX Procedure
    Fit Statistics
    -2 Res Log Pseudo-Likelihood
    Generalized Chi-Square
    Gener. Chi-Square / DF          2.12

    Covariance Parameter Estimates
    Cov Parm         Subject             Estimate   Standard Error
    Intercept        family(crowding)
    Residual (VC)

Output, continued
    Type III Tests of Fixed Effects
    Effect      Num DF   Den DF   F Value   Pr > F
    crowding
    name                                    <.0001
with P-values more or less identical to the normality-based results.
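The chi-square/DF ratio used above as an overdispersion check can be illustrated with its simplest analogue: the Pearson chi-square of a Poisson fit divided by the residual degrees of freedom. This is a sketch under stated assumptions. The counts and fitted means are made up, and GLIMMIX computes its generalized chi-square from the fitted mixed model rather than from this plain formula.

```python
def pearson_chi2_over_df(y, mu, n_params):
    """Pearson chi-square / residual DF for counts y with fitted Poisson means mu."""
    if len(y) != len(mu):
        raise ValueError("y and mu must have the same length")
    df = len(y) - n_params
    if df <= 0:
        raise ValueError("need more observations than parameters")
    # For the Poisson distribution the variance equals the mean,
    # so each squared residual is scaled by the fitted mean.
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / df


# Hypothetical counts and fitted means (not the swab data).
counts = [0, 2, 4, 10]
fitted = [1.0, 2.0, 4.0, 5.0]
ratio = pearson_chi2_over_df(counts, fitted, n_params=1)
print(ratio)  # values clearly above 1 point towards overdispersion
```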

But...
The model is not a real model, since no such distribution exists...
If we should build a real model with overdispersion, we would write:
$$y_{cfn} \sim \mathrm{Poisson}(\lambda_{cfn})$$
$$\log(\lambda_{cfn}) = \mu + \beta_c + \gamma_n + A_{cf} + \varepsilon_{cfn}$$
$$A_{cf} \sim N(0, \omega^2), \qquad \varepsilon_{cfn} \sim N(0, \sigma^2)$$

Real model for overdispersion
This will require two random-statements, like this:
    proc glimmix data=swab;
      class crowding family name;
      model swab=crowding name / dist=poisson link=log s cl;
      random intercept / subject=family(crowding);
      random intercept / subject=name*family(crowding);
    run;
but this model unfortunately cannot be estimated by glimmix:

Convergence problems
    The GLIMMIX Procedure
    Iteration History
    Iteration   Restarts   Subiterations   Objective Function   Change   Max Gradient
    Did not converge.

    Covariance Parameter Estimates
    Cov Parm     Subject                 Estimate   Standard Error
    Intercept    family(crowding)
    Intercept    family*name(crowdin)
It probably requires true replicates.

Model check and diagnostics (normal models)
- Ordinary residuals, from systematic effect, $Y_i - X_i\hat\beta$. These will be correlated.
- Conditional residuals, $Y_i - (X_i\hat\beta + Z_i\hat b_i)$
- Normality of random effects, estimated BLUPs, $\hat b_i$
- Investigation of covariance structure (variogram)
- Detection of influential observations
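Why a normal term on the log scale produces overdispersion can be seen from the marginal moments. If $Y \mid \varepsilon \sim \mathrm{Poisson}(e^{\mu + \varepsilon})$ with $\varepsilon \sim N(0, \sigma^2)$, then $E(Y) = e^{\mu + \sigma^2/2}$ and $\mathrm{Var}(Y) = E(Y) + E(Y)^2 (e^{\sigma^2} - 1)$. The sketch below evaluates these two formulas. It treats the family effect $A_{cf}$ as fixed (absorbed into `mu`), and the numeric values are assumptions chosen only for illustration.

```python
import math


def lognormal_poisson_moments(mu, sigma2):
    """Marginal mean and variance of Y | eps ~ Poisson(exp(mu + eps)), eps ~ N(0, sigma2)."""
    mean = math.exp(mu + sigma2 / 2.0)
    variance = mean + mean ** 2 * (math.exp(sigma2) - 1.0)
    return mean, variance


# Hypothetical log-mean and residual variances on the log scale.
for sigma2 in (0.0, 0.25, 1.0):
    m, v = lognormal_poisson_moments(1.0, sigma2)
    print(sigma2, round(m, 3), round(v, 3), round(v / m, 3))
```

With `sigma2 = 0` the variance equals the mean (plain Poisson). Any positive `sigma2` gives a variance-to-mean ratio above 1.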

Normality of random effects
Model checks
- Histogram of BLUPs from the model is not worth much
- Instead, build a model with a mixture of normal distributions and make a test...
If normality does not apply:
- no large effect on estimates for $\beta$, $G$ and $R$
- standard errors become biased, but may be corrected in various ways
- BLUPs $\hat b_i$ become invalid, especially when the residual variance $\sigma^2$ is large

Variogram
Variance of difference between time points:
$$\gamma(u) = \frac{1}{2} E(\varepsilon_t - \varepsilon_{t-u})^2 = \tau^2 (1 - \rho(u)) + \sigma^2$$
- Nugget: $\sigma^2$
- Sill: $\tau^2 + \sigma^2$
- Variance: $\omega^2 + \tau^2 + \sigma^2$

Local influence
Idea: Put some infinitesimal extra weight on a single observation ($i$), i.e. use a weight vector $\omega_i$ that equals 1 everywhere except for a slightly larger weight at position $i$.
Look at the change in likelihood:
$$LD(\omega_i) = 2\left(l(\hat\theta) - l(\hat\theta_{\omega_i})\right)$$
Make a local influence plot $(i, C_i)$, where $C_i$ is derived from $LD(\omega_i)$.

Reasons for a large influence
- unusual combination of covariates $X_i$
- large ordinary residuals (from $X_i \beta$)
- unusual combination of covariates $Z_i$
- bad choice of covariance pattern $V_i(\alpha)$
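The variogram formula can be evaluated directly once a correlation function is chosen. The sketch below assumes the power form $\rho(u) = \rho^u$, which corresponds to the `type=sp(pow)` structure used in the exercise example later in these slides. The parameter values are hypothetical.

```python
def variogram(u, tau2, rho, sigma2):
    """gamma(u) = tau2 * (1 - rho**u) + sigma2, assuming rho(u) = rho**u.

    At u = 0 this returns the nugget sigma2, i.e. the limit of gamma(u)
    as the lag shrinks towards zero.
    """
    if u < 0:
        raise ValueError("lag u must be non-negative")
    return tau2 * (1.0 - rho ** u) + sigma2


# Hypothetical variance components.
tau2, rho, sigma2 = 2.0, 0.5, 1.0
for lag in (0, 1, 2, 5, 50):
    print(lag, variogram(lag, tau2, rho, sigma2))
```

The curve starts at the nugget $\sigma^2$ and levels off at the sill $\tau^2 + \sigma^2$ as the lag grows.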

Explained variation in percent, $R^2$
We have two (or more) different variances to explain!
Residual variation (variation within individuals, $\sigma_w^2$)
- decreases (as usual) when we include an important x-covariate (level 1)
- may decrease when we include an important z-covariate (level 2)
Variation between individuals, $\omega_b^2$
- decreases when we include an important z-covariate (level 2)
- may increase or decrease when we include an important x-covariate (level 1)

Hypothetical example
The x's vary between individuals, and the average outcomes ($\bar y$) are mostly due to this variation:
Levels of y, for fixed x, are quite alike! $\omega^2$ decreases.

Another hypothetical example
The x's vary between individuals, but the average outcomes ($\bar y$) are almost identical:
Levels of y, for fixed x, are very different! $\omega^2$ increases.

Missing values
Most investigations are planned to be balanced but almost inevitably turn out to have missing values, or drop-out patients:
- just by coincidence (blood sample lost or ruined)
- because of exclusion (the patient has recovered)
- we lost track of the patient (may be worrisome)
- the patient is too ill to show up (very serious, i.e. carrying information)
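One common way to put numbers on "explained variation" with two variance levels is to report the percent reduction in each variance component separately. This convention is an assumption here, since the slides do not give a formula. The variance values below are hypothetical and chosen to mirror the case where the between-individual variance increases.

```python
def explained_variation(var_without, var_with):
    """Percent reduction in a variance component when a covariate is added."""
    if var_without <= 0:
        raise ValueError("reference variance must be positive")
    return 100.0 * (1.0 - var_with / var_without)


# Hypothetical estimates before and after adding a level-1 covariate x.
within_pct = explained_variation(var_without=4.0, var_with=3.0)    # sigma_w^2
between_pct = explained_variation(var_without=2.0, var_with=2.5)   # omega_b^2
print(within_pct, between_pct)
```

A negative percentage for the between-individual component is not an error. It reflects the situation where levels of y for fixed x differ more than the raw averages do.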

Types of missing data
- Single missing values
- Drop-outs

Possible missing mechanism, I
Low values are good: when the patient is well treated, he drops out.

Possible missing mechanism, II
Low values are bad: below some threshold, the patient is too ill to show up (informative missing).

Notation
- Outcome: $Y_{git}$
- Parameters: $\theta$ (level, slope etc.)
- Covariates: $x_{git}$
- Indicator of missing: $c_{git}$
- Missing outcomes: $Y_{git}$ not observed, i.e. corresponding to $c_{git} = 1$

Types of missingness
MCAR: Missing completely at random
- $P(c_{git} = 1)$ depends only on the parameter $\theta$
MAR: Missing at random
- CDEP: $P(c_{git} = 1)$ depends upon $\theta$ and covariates $x_{git}$
- YDEP: $P(c_{git} = 1)$ depends upon $\theta$, covariates $x_{git}$ and observed outcomes $Y_{git}$
NI: Non-ignorable (informative missing)
- $P(c_{git} = 1)$ depends upon the unobserved (missing) outcome $Y_{git}$

Hypothetical TLC-observations
Lung capacity measured at regular time intervals for two groups that we want to compare.
[Figure: individual profiles]

Average for the two groups
[Figure: group averages]

Hypothetical example: Informative missing
Patients who get below 3.5 drop out, averages change.
[Figure: group averages after drop-out]
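The effect of informative drop-out on the averages can be reproduced on a toy data set. The sketch below uses made-up lung capacity profiles and assumes that a value below the threshold 3.5, and everything after it, is unobserved. Function names are illustrative.

```python
def apply_informative_dropout(profile, threshold):
    """Set the first value below threshold, and everything after it, to None."""
    observed = []
    dropped = False
    for value in profile:
        if value < threshold:
            dropped = True
        observed.append(None if dropped else value)
    return observed


def mean_per_time(profiles):
    """Average of the available (non-None) values at each time point."""
    means = []
    for column in zip(*profiles):
        available = [v for v in column if v is not None]
        means.append(sum(available) / len(available) if available else None)
    return means


# Hypothetical profiles, one list per patient, equally spaced time points.
full = [
    [4.5, 4.2, 3.9, 3.6],
    [4.0, 3.7, 3.4, 3.1],
    [3.8, 3.4, 3.0, 2.6],
]
observed = [apply_informative_dropout(p, threshold=3.5) for p in full]
full_means = mean_per_time(full)
observed_means = mean_per_time(observed)
print(full_means)
print(observed_means)
```

The averages of the available data decline more slowly than the true averages, because only the patients who are doing well remain in the later time points.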

Traditional handling of missing data
- Complete case analysis
- LOCF: Last observation carried forward
- Time average imputation
- Model prediction imputation
- Likelihood methods

Complete case analysis
Make an analysis including only those individuals who are observed at all available time points.
- Information loss
- Potential bias, if there is a specific reason for the missingness

LOCF: Last observation carried forward
If an individual has no observed value at time $t_k$, replace the missing value by the previous observation, at $t_{k-1}$.
For drop-outs, all subsequent values will equal this $t_{k-1}$ value.
- The time effect will be less pronounced
- Large residuals, i.e. overestimation of residual variation

Time average imputation
- Subject effect will be underestimated
- Large residuals, i.e. overestimation of residual variation
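The LOCF rule described above is short enough to write out. This is a minimal sketch, with `None` standing for a missing value. Leaving a leading missing value untouched (nothing to carry forward) is an assumption the slides do not address.

```python
def locf(values):
    """Replace each None by the most recent observed value, if there is one."""
    filled = []
    last = None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


# Hypothetical series for one individual: a single gap, then a drop-out.
series = [5.0, None, 4.0, None, None]
print(locf(series))
```

For the drop-out at the end, every later value is frozen at the last observation, which is why the time effect becomes less pronounced.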

Model prediction imputation
- Two-step procedure
- Too small residuals, i.e. downwards bias of SD

Likelihood methods
- Mixed models for all available observations

MCAR: Missing completely at random
- Complete case analysis OK, but inefficient
- If only few observations are missing, imputations could work, but the variations will be affected
- Likelihood approaches (mixed models) OK, uses all available information

MAR: Missing at random
- Complete case analysis is biased: we disregard subjects with special characteristics
- If only few observations are missing, imputations could work, but the variations will be affected
- Likelihood approaches (mixed models) OK, uses all available information

Mixed models for MAR

Non-ignorable
Nothing works!
Many attempts have been made to model the missing mechanisms, but they all rely on assumptions that cannot be checked.

Example: Effect of exercise on appetite
53 subjects, in three groups:
- Control
- Moderate exercise: 1/2 hour a day
- Extensive exercise: 1 hour a day
Before and after exercise/placebo: exercise test, with blood samples taken every half hour from baseline until 3 hours.
Several hormones are measured, e.g. ghrelin.
Mads Rosenkilde

[Figure: HIGH, week=pre]

[Figure: HIGH, week=post]

Baseline differences, before exercise?
No, not really.
Can we then disregard these?

Aim of investigation
- Does the exercise change something?
- Does the time course change from the first to the second week (pre/post exercise)?
- If so, does it change more than for the control group?
- And does it apply equally to the two exercise groups?

Model for each group separately
Fixed effects
- week: Pre or Post
- time: 0, 30, 60, 90, 120, 150, 180 minutes
- Interaction week*time: a change in the pattern from before to after exercise
Random effects
- Patients, Sub: 18 (HIGH, MOD), 17 (XCON)
- Interaction Sub*week
Serial covariance structure
- Autoregressive?
- Local error term?

Analysis for each group separately
    proc mixed data=a0 covtest;
      by grp;
      class Sub week time;
      model log_ghrelin=time week*time / ddfm=satterth s cl;
      random intercept / subject=sub vcorr v;
      repeated time / subject=sub*week type=sp(pow)(numtime) local rcorr r;
      lsmeans week*time / slice=time;
    run;

Output, exercise HIGH
    grp=high
    The Mixed Procedure

    Model Information
    Data Set                     WORK.A0
    Dependent Variable           log_ghrelin
    Covariance Structures        Variance Components, Spatial Power
    Subject Effects              Sub, Sub*week
    Estimation Method            REML
    Residual Variance Method     Profile
    Fixed Effects SE Method      Model-Based
    Degrees of Freedom Method    Satterthwaite

    Class Level Information
    Class   Levels   Values
    Sub     18       ALMA ANAP ANLO ANTF BRFR CASC CHBE DAHA DERJ GRPE HEJE JAKU MIFH MIMR MIMÂİ MINI NIHA THES
    week    2        Post Pre
    time

Output, exercise HIGH, II
    Dimensions
    Covariance Parameters         4
    Columns in X                 24
    Columns in Z Per Subject      1
    Subjects                     18
    Max Obs Per Subject          14

    Covariance Parameter Estimates
    Cov Parm    Subject     Estimate   Standard Error   Z Value   Pr Z
    Intercept   Sub
    Variance    Sub*week
    SP(POW)     Sub*week
    Residual                                                      <.0001

Output, exercise HIGH, III
    Type 3 Tests of Fixed Effects
    Effect      Num DF   Den DF   F Value   Pr > F
    time                                    <.0001
    week*time

    Type 3 Tests of Fixed Effects
    Effect      Num DF   Den DF   F Value   Pr > F
    week
    time                                    <.0001
    week*time

Output, exercise HIGH, IV
    Tests of Effect Slices
    Effect      time   Num DF   Den DF   F Value   Pr > F
    week*time
    week*time
    week*time
    week*time
    week*time
    week*time
    week*time

    Solution for Fixed Effects
    Effect      week   time   Estimate   Standard Error   DF   t Value   Pr > |t|
    week*time   Post
    week*time   Post
    week*time   Post
    week*time   Post
    week*time   Post
    week*time   Post
    week*time   Post

Results from all three groups
            $\omega^2$   $\tau^2$   $\rho$   $\sigma^2$   P: week*time (id/int)
    HIGH                                                  /0.056
    MOD                                                   /0.23
    XCON                                                  /0.092
    All                                                   *
*: Test for second-order interaction: grp*week*time

All groups simultaneously
    proc mixed data=a0 covtest;
      class Sub grp week time;
      model log_ghrelin=grp week time grp*week grp*time week*time
                        grp*week*time / ddfm=satterth s cl;
      random intercept / subject=sub(grp) vcorr v;
      repeated time / subject=week*sub(grp) type=sp(pow)(numtime) local rcorr r;
      lsmeans grp*week*time / slice=week;
      lsmeans grp*week*time / slice=grp;
    run;

    Class Level Information
    Class   Levels   Values
    Sub     53       ALJE ALMA ANAP ANBR ANKR ANLI ANLO ANMO ANTF ASOL ASSA BRFR THBR THES THLU TUMY ULRA
    grp     3        HIGH MOD XCON
    week    2        Post Pre
    time

    Dimensions
    Covariance Parameters         4
    Columns in X                 96
    Columns in Z Per Subject      1
    Subjects                     53
    Max Obs Per Subject          14

    Number of Observations
    Number of Observations Read        736
    Number of Observations Used        736
    Number of Observations Not Used      0

    The Mixed Procedure
    Covariance Parameter Estimates
    Cov Parm    Subject     Estimate   Standard Error   Z Value   Pr Z
    Intercept   Sub                                               <.0001
    Variance    Sub*week                                          <.0001
    SP(POW)     Sub*week                                          <.0001
    Residual                                                      <.0001

    Type 3 Tests of Fixed Effects
    Effect           Num DF   Den DF   F Value   Pr > F
    grp
    week
    hours                                        <.0001
    grp*week
    grp*hours
    week*hours
    grp*week*hours

Baseline differences
Even though the baseline levels are not significantly different, we know that we ought to include the baseline as a covariate, but which baseline?
- The observation at time 0 in the Pre-exercise week?
- The observation at time 0 at each of the two weeks (i.e. separate values for the variable baseline, depending upon the week)?
Think about which question you want to answer!

Including first baseline as covariate
    proc sort data=a0; by grp Sub; run;

    * keep the time 0 observation from the Pre week as baseline;
    data baseline;
      set a0;
      if week='Pre' and time=0;
      ghrelin0=ghrelin;
      log_ghrelin0=log_ghrelin;
    run;

    data test;
      merge a0 baseline;
      by grp Sub;
    run;

    proc mixed data=test covtest;
      where week='Post' or time>0;
      class Sub grp week time;
      model log_ghrelin=log_ghrelin0 grp week time grp*week grp*time
                        week*time grp*week*time / ddfm=satterth s cl;
      random intercept / subject=sub vcorr v;
      repeated time / subject=sub*week type=sp(pow)(numtime) local rcorr r;
    run;

Output, with baseline as covariate
    Dimensions
    Covariance Parameters         4
    Columns in X                 85
    Columns in Z Per Subject      1
    Subjects                     52
    Max Obs Per Subject          12

    Number of Observations
    Number of Observations Read        632
    Number of Observations Used        621
    Number of Observations Not Used     11

    Covariance Parameter Estimates
    Cov Parm    Subject     Estimate   Standard Error   Z Value   Pr Z
    Intercept   Sub
    Variance    Sub*week                                          <.0001
    SP(POW)     Sub*week
    Residual                                                      <.0001

    Solution for Fixed Effects
    Effect         grp   week   hours   Estimate   Standard Error   DF   t Value
    Intercept
    log_ghrelin0

    Type 3 Tests of Fixed Effects
    Effect           Num DF   Den DF   F Value   Pr > F
    log_ghrelin0                                 <.0001
    grp
    week
    time                                         <.0001
    grp*week
    grp*time
    week*time
    grp*week*time                                         <--
Proceed with some model reduction and model checks...

Model reduction
    Type 3 Tests of Fixed Effects
    Effect           Num DF   Den DF   F Value   Pr > F
    log_ghrelin0                                 <.0001
    grp
    week
    time                                         <.0001
    week*time

    Type 3 Tests of Fixed Effects
    Effect           Num DF   Den DF   F Value   Pr > F
    log_ghrelin0                                 <.0001
    week
    time                                         <.0001
    week*time
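The data step logic above (pick out the Pre-week, time 0 value per subject, attach it as a covariate, and drop that observation from the outcomes) can be sketched in Python as follows. The record layout, the function name `add_first_baseline` and the values are hypothetical. The sketch only mirrors the selection rules, not the mixed model itself.

```python
def add_first_baseline(records):
    """Attach each subject's Pre-week, time-0 value as covariate log_ghrelin0
    and keep only rows with week == 'Post' or time > 0."""
    baseline = {
        r["sub"]: r["log_ghrelin"]
        for r in records
        if r["week"] == "Pre" and r["time"] == 0
    }
    result = []
    for r in records:
        if not (r["week"] == "Post" or r["time"] > 0):
            continue  # mirrors: where week='Post' or time>0
        row = dict(r)
        row["log_ghrelin0"] = baseline.get(r["sub"])  # None if no baseline exists
        result.append(row)
    return result


# Hypothetical records: subject S2 has no Pre-week, time-0 measurement.
records = [
    {"sub": "S1", "week": "Pre", "time": 0, "log_ghrelin": 6.0},
    {"sub": "S1", "week": "Pre", "time": 30, "log_ghrelin": 5.8},
    {"sub": "S1", "week": "Post", "time": 0, "log_ghrelin": 6.1},
    {"sub": "S2", "week": "Pre", "time": 30, "log_ghrelin": 5.5},
]
analysis_rows = add_first_baseline(records)
for row in analysis_rows:
    print(row)
```

Rows whose baseline is missing end up with an empty covariate. In a SAS analysis such observations are excluded, which is one possible reason for observations being read but not used.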

Could this have been done simpler?
- Analysis of differences, because we have only two occasions and exactly the same time points
- Reducing the time courses to averages and/or slopes, to give a simpler model for derived quantities

Effect of the lens strength on visual acuity
7 individuals are looking at a screen, where a light flash appears. They are looking through 4 lenses, with powers 6/6, 6/18, 6/36 and 6/60, i.e. 4 magnifications: 1, 3, 6 and 10, with 2 eyes.
Outcome: Visual acuity, the time lag (milliseconds) between the stimulus and the electrical response at the back of the cortex.

Data
[Figure: data from Crowder & Hand (1990)]

Factors to take into account
Main effects:
- 7 individuals (person), $A_p$
- 2 eyes for each individual (eye), $\alpha_e$
- 4 lens magnifications (power), $\beta_m$
Interactions?
- person*eye, $B_{pe}$
- person*power, $C_{pm}$
- eye*power, $\gamma_{em}$
- second-order interaction person*eye*power = Residual, $\varepsilon_{pem}$

Model formulation
$$Y_{pem} = \mu_{em} + A_p + B_{pe} + C_{pm} + \varepsilon_{pem}$$
where $p = 1, \dots, 7$, $e = 1, 2$, $m = 1, 2, 3, 4$ and
$$A_p \sim N(0, \omega^2), \quad B_{pe} \sim N(0, \tau_e^2), \quad C_{pm} \sim N(0, \tau_m^2), \quad \varepsilon_{pem} \sim N(0, \sigma^2)$$

Factor diagram
[Factor diagram with nodes: [I] = [Pa*Ey*Po], [Pa*Ey], Ey*Po, [Pa*Po], Ey, [Pa], Po, 0]

Not quite a multilevel model, but...
    Level   Unit                    Covariates
    1       single measurements     Ey*Po
    2       interactions
    2e        [Pa*Ey]               Ey
    2m        [Pa*Po]               Po
    3       individuals, [Pa]       overall level

    proc mixed data=visual covtest;
      class patient eye power;
      model acuity=eye power eye*power / s ddfm=satterth;
      * random patient patient*eye patient*power;
      random intercept eye power / subject=patient;
    run;

    Covariance Parameter Estimates
    Cov Parm    Subject    Estimate   Standard Error   Z Value   Pr > Z
    Intercept   patient
    eye         patient
    power       patient
    Residual

    Type 3 Tests of Fixed Effects
    Effect      Num DF   Den DF   F Value   Pr > F
    eye
    power
    eye*power

    Solution for Fixed Effects
    Effect      eye     power   Estimate   Standard Error   DF   t Value   Pr > |t|
    Intercept                                                              <.0001
    eye         left
    eye         right
    power
    power
    power
    power
    eye*power   left
    eye*power   left
    eye*power   left
    eye*power   left
    eye*power   right
    eye*power   right
    eye*power   right
    eye*power   right

Predicted mean profiles
[Figure: predicted mean profiles]

Individual predictions
[Figure: individual predictions]

Residual plot
[Figure: residual plot]

Omit the interaction eye*power
    Covariance Parameter Estimates
    Cov Parm    Subject    Estimate   Standard Error   Z Value   Pr > Z
    Intercept   patient
    eye         patient
    power       patient
    Residual

    Solution for Fixed Effects
    Effect      eye     power   Estimate   Standard Error   DF   t Value   Pr > |t|
    Intercept                                                              <.0001
    eye         left
    eye         right
    power
    power
    power
    power

    Type 3 Tests of Fixed Effects
    Effect      Num DF   Den DF   F Value   Pr > F
    eye
    power

Eye comparisons
Model:
$$Y_{pem} = \mu_{em} + A_p + B_{pe} + C_{pm} + \varepsilon_{pem}$$
where $A_p \sim N(0, \omega^2)$, $B_{pe} \sim N(0, \tau_e^2)$, $C_{pm} \sim N(0, \tau_m^2)$, $\varepsilon_{pem} \sim N(0, \sigma^2)$.
Difference between eye averages:
$$\bar Y_{\cdot e_1 \cdot} - \bar Y_{\cdot e_2 \cdot} = \mu\text{-stuff} + \bar B_{\cdot e_1} - \bar B_{\cdot e_2} + \bar\varepsilon_{\cdot e_1 \cdot} - \bar\varepsilon_{\cdot e_2 \cdot}$$
$$\mathrm{Var}(\bar Y_{\cdot e_1 \cdot} - \bar Y_{\cdot e_2 \cdot}) = \frac{2}{7}\left(\tau_e^2 + \frac{1}{4}\sigma^2\right)$$

Consequence for eye comparisons
- $\tau_e^2$ is rather large (people have different eye preferences)
- We have to demand a larger difference in order to detect it
- P-values rather large

Magnification comparisons
Model:
$$Y_{pem} = \mu_{em} + A_p + B_{pe} + C_{pm} + \varepsilon_{pem}$$
where $A_p \sim N(0, \omega^2)$, $B_{pe} \sim N(0, \tau_e^2)$, $C_{pm} \sim N(0, \tau_m^2)$, $\varepsilon_{pem} \sim N(0, \sigma^2)$.
Difference between magnification averages:
$$\bar Y_{\cdot\cdot m_1} - \bar Y_{\cdot\cdot m_2} = \mu\text{-stuff} + \bar C_{\cdot m_1} - \bar C_{\cdot m_2} + \bar\varepsilon_{\cdot\cdot m_1} - \bar\varepsilon_{\cdot\cdot m_2}$$
$$\mathrm{Var}(\bar Y_{\cdot\cdot m_1} - \bar Y_{\cdot\cdot m_2}) = \frac{2}{7}\left(\tau_m^2 + \frac{1}{2}\sigma^2\right)$$

Consequence for magnification comparisons
- $\tau_m^2$ is not that large (people react more or less identically to the different magnifications)
- We can detect smaller differences
- P-values rather small

If we ignore correlations
i.e. a model with no random effects
Eye differences:
$$\mathrm{Var}(\bar Y_{\cdot e_1 \cdot} - \bar Y_{\cdot e_2 \cdot}) = \frac{1}{14}\sigma^2$$
Magnification differences:
$$\mathrm{Var}(\bar Y_{\cdot\cdot m_1} - \bar Y_{\cdot\cdot m_2}) = \frac{1}{7}\sigma^2$$
but another $\sigma^2$

Incorrect analysis, ignoring random effects
    Covariance Parameter Estimates
    Cov Parm    Estimate
    Residual

    Solution for Fixed Effects
    Effect      eye     power   Estimate   Standard Error   DF   t Value   Pr > |t|
    Intercept                                                              <.0001
    eye         left
    eye         right
    power
    power
    power
    power

    Type 3 Tests of Fixed Effects
    Effect   Num DF   Den DF   Chi-Square   F Value   Pr > ChiSq   Pr > F
    eye                                                                     <-- too small
    power                                                                   <-- too large
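The variances of the two kinds of mean differences follow from averaging the model over 7 persons and 4 powers (eye averages) or over 7 persons and 2 eyes (magnification averages): the person effect $A_p$ cancels in both, $C_{pm}$ cancels for eyes and $B_{pe}$ cancels for magnifications. A small Python sketch of the resulting formulas, with hypothetical variance components chosen only to make the arithmetic easy to follow:

```python
def var_eye_difference(tau_e2, sigma2, n_person=7, n_power=4):
    """Var of the difference between two eye averages under the mixed model:
    2*tau_e2/n_person + 2*sigma2/(n_person*n_power)."""
    return 2.0 * tau_e2 / n_person + 2.0 * sigma2 / (n_person * n_power)


def var_power_difference(tau_m2, sigma2, n_person=7, n_eye=2):
    """Var of the difference between two magnification averages:
    2*tau_m2/n_person + 2*sigma2/(n_person*n_eye)."""
    return 2.0 * tau_m2 / n_person + 2.0 * sigma2 / (n_person * n_eye)


# Hypothetical variance components (not the estimates from the slides).
tau_e2, tau_m2, sigma2 = 7.0, 3.5, 14.0

mixed_eye = var_eye_difference(tau_e2, sigma2)
mixed_power = var_power_difference(tau_m2, sigma2)
# Ignoring the random effects corresponds to setting tau to zero
# (and, in practice, ending up with another residual variance).
naive_eye = var_eye_difference(0.0, sigma2)
naive_power = var_power_difference(0.0, sigma2)
print(mixed_eye, naive_eye, mixed_power, naive_power)
```

With the default design sizes the first function reduces to $\frac{2}{7}(\tau_e^2 + \frac{1}{4}\sigma^2)$ and the second to $\frac{2}{7}(\tau_m^2 + \frac{1}{2}\sigma^2)$.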

Systematic vs. random effects
Could the patients be treated as systematic here? Yes:
    Covariance Parameter Estimates
    Cov Parm    Subject    Estimate
    eye         patient
    power       patient
    Residual

    Type 3 Tests of Fixed Effects
    Effect      Num DF   Den DF   F Value   Pr > F
    patient
    eye
    power
    eye*power
Can you think why?

Example with only two time points (baseline and follow-up)
from Vickers, A.J. & Altman, D.G.: Analysing controlled clinical trials with baseline and follow-up measurements. British Medical Journal 2001; 323.
52 patients with shoulder pain are randomized to either
- Acupuncture (n=25)
- Placebo (n=27)
Pain is evaluated on a 100 point scale before and after treatment.
High scores are good.

Results on pain scores
[Figure: development of pain, actual and hypothetical]

Comparison of the two groups
                        Average pain score                 Treatment effect
                        placebo        acupuncture         difference
                        (n=27)         (n=25)              (95% CI)             P-value
    Baseline            53.9 (14.0)    60.4 (12.3)
    Type of analysis
    Follow-up           62.3 (17.9)    79.6 (17.1)         17.3 (7.5; 27.1)
    Changes*            8.4 (14.6)     19.2 (16.1)         10.8 (2.3; 19.4)
    Ancova                                                 12.7 (4.1; 21.3)
* results published in Kleinhenz et al., Pain 1999; 83.

Approaches for pain score analysis
Baseline
- The acupuncture group lies somewhat above placebo
Follow-up
- We would expect the acupuncture group to be higher also after treatment
- Therefore, a direct comparison of follow-up values is unreasonable (we see too big a difference)

Approaches for pain score analysis, II
Change
- Low baseline implies an expected large positive change (regression to the mean)
- The placebo group is therefore expected to increase the most
- Therefore, a direct comparison of changes is unreasonable (we see too small a difference)

General approaches for handling baseline
Ancova: Analysis of covariance, a special case of multiple regression
- Outcome: follow-up data
- Covariates:
  - treatment (factor: acupuncture/placebo)
  - baseline measurement (quantitative)
Repeated measurement analysis
- Treatment effect appears as an interaction between treatment and time

Recommendation
When can we use follow-up data?
- when we have a control group and proper randomisation
- when the correlation is low
When can we use differences?
- when we have a control group and proper randomisation
- when the correlation is large
When can we use analysis of covariance?
- always, as long as baseline imbalance is not related to treatment effect!
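The three analyses differ only in how much of the baseline imbalance is subtracted from the follow-up difference. With a common regression slope $b$ on baseline, the adjusted treatment difference is (follow-up difference) $- \; b \times$ (baseline difference): $b = 0$ gives the follow-up comparison, $b = 1$ the comparison of changes, and Ancova estimates $b$ from the data. The sketch below applies this to the group means in the pain score table. The slope printed at the end is back-calculated from the rounded table values and is not reported in the slides.

```python
def adjusted_difference(followup_diff, baseline_diff, slope):
    """Treatment difference after subtracting slope * baseline imbalance."""
    return followup_diff - slope * baseline_diff


# Group means from the pain score table (acupuncture minus placebo).
baseline_diff = 60.4 - 53.9
followup_diff = 79.6 - 62.3

follow_up_only = adjusted_difference(followup_diff, baseline_diff, slope=0.0)
change_scores = adjusted_difference(followup_diff, baseline_diff, slope=1.0)

# Slope implied by the reported Ancova estimate of 12.7 (an inferred value).
implied_slope = (followup_diff - 12.7) / baseline_diff
ancova = adjusted_difference(followup_diff, baseline_diff, implied_slope)
print(round(follow_up_only, 1), round(change_scores, 1),
      round(implied_slope, 2), round(ancova, 1))
```

The Ancova estimate falls between the other two because the implied slope lies between 0 and 1.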

Example: Reading ability
as a function of age/training and cohort/age:
Longitudinal (within-individual, $\beta_W$) effect vs. cross-sectional (between-individual, $\beta_B$) effect.
[Figure: reading ability against age]

Model
Baseline level $a_{p1}$ at the age $x_{p1}$:
$$a_{p1} = \alpha + \beta_B x_{p1} + \delta_p, \qquad \beta_B \text{ negative}$$
Baseline measurement:
$$y_{p1} = a_{p1} + \varepsilon_{p1}$$
Follow-up level $a_{p2}$ at the age $x_{p2}$:
$$a_{p2} = a_{p1} + \beta_W (x_{p2} - x_{p1})$$
Follow-up measurement:
$$y_{p2} = a_{p2} + \varepsilon_{p2}$$
Difference:
$$y_{p2} - y_{p1} = \beta_W (x_{p2} - x_{p1}) + (\varepsilon_{p2} - \varepsilon_{p1})$$
Model for all y-observations:
$$y_{pj} = \alpha + \beta_B x_{p1} + \beta_W (x_{pj} - x_{p1}) + \delta_p + \varepsilon_{pj}$$

Actual analyses
Regression with inter- as well as intra-individual effect of age/time:
    proc mixed data=reading;
      class id;
      model read=age1 difage / s;
      random id;
    run;

    Covariance Parameter Estimates
    Cov Parm    Estimate
    id
    Residual

    Effect      Estimate   Standard Error   DF   t Value   Pr > |t|
    Intercept
    age1                                                              cross-sectional ($\beta_B$)
    difage                                                            longitudinal ($\beta_W$)

Estimation results
    Method (columns: Cohort effect, Age effect)
    $y_{i1}$ vs. $x_{i1}$:  (0.458)
    $y_{i2}$ vs. $x_{i2}$:  (0.534)
    $y_{ij}$ vs. $x_{ij}$, no individual effect:  (0.384)
    $y_{i2} - y_{i1}$ vs. $x_{i2} - x_{i1}$, no intercept:  (0.211)
    $y_{ij}$ vs. $x_{ij}$, random individual effect:  (0.307)
    $y_{ij}$ vs. $x_{i1}$ and $(x_{i2} - x_{i1})$:  (0.572)  (0.312)
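The distinction between the cohort effect $\beta_B$ and the age effect $\beta_W$ can be checked on noise-free data generated from the model. In the sketch below all parameter values and ages are hypothetical, with $\delta_p = 0$ and $\varepsilon_{pj} = 0$, so each regression recovers exactly what it targets. The helper names are illustrative.

```python
def ols_slope(x, y):
    """Slope of the least-squares line of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def slope_through_origin(x, y):
    """Least-squares slope of y on x without intercept."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Hypothetical parameters: negative cohort effect, positive age effect.
alpha, beta_b, beta_w = 50.0, -1.0, 2.0
age1 = [20.0, 30.0, 40.0, 50.0]       # age at baseline, x_p1
difage = [1.0, 2.0, 1.0, 2.0]         # x_p2 - x_p1

y1 = [alpha + beta_b * x for x in age1]             # baseline measurements
y2 = [y + beta_w * d for y, d in zip(y1, difage)]   # follow-up measurements

cross_sectional = ols_slope(age1, y1)                                   # y_p1 vs. x_p1
longitudinal = slope_through_origin(difage, [b - a for a, b in zip(y1, y2)])
age2 = [x + d for x, d in zip(age1, difage)]
pooled = ols_slope(age1 + age2, y1 + y2)                                # y_pj vs. x_pj
print(cross_sectional, longitudinal, round(pooled, 3))
```

The pooled regression of all y on the current age mixes the two effects and reproduces neither of them, which is the reason for entering `age1` and `difage` as separate covariates.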
