Faculty of Health Sciences. Correlated data. Variance component models. Lene Theil Skovgaard & Julie Lyng Forman.


1 Faculty of Health Sciences
  Correlated data
  Variance component models
  Lene Theil Skovgaard & Julie Lyng Forman
  November 28, 2017

2 Overview
  One-way anova with random variation
  The rabbit example
  Hierarchical models with several levels
  Crossed random effects (interactions)
  The visual acuity example
  Random regression
  Home pages: RepeatedMeasures2017.html

3 Terminology for correlated measurements
  Cluster/multilevel design: Same outcome (response) measured on all individuals in a number of families/villages/school classes.
  Repeated measurements: Same outcome (response) measured in different situations (or at different spots) for the same individual.
  Longitudinal measurements: Same outcome (response) measured consecutively over time for each individual.
  Multivariate outcome: Several outcomes (responses) for each individual, e.g. a number of hormone measurements that we want to study simultaneously.

4 Variance component models
  Models involving several sources of random variation:
    geographical/environmental variation: between regions, hospitals, schools or countries
    biological variation: between individuals, families or animals
    within-individual variation: between arms, teeth, injection sites, days
    variation due to uncontrollable circumstances: time of day, temperature, observer
    measurement error
  Of course, they may also include fixed effects, such as treatment, gender etc.

5 Example: Swelling due to vaccine
  Research question: How much swelling can be expected in relation to a vaccination?
  Experiment: 6 rabbits, each vaccinated in 6 (randomly?) selected spots on the back.
  Outcome $y_{rs}$: swelling in cm$^2$, where
    $r = 1, \ldots, R = 6$ denotes the rabbit,
    $s = 1, \ldots, S = 6$ denotes the spot.
  We have observed a total of 36 swelling areas, but we must expect swelling to be specific to the individual rabbit.

6 Scatter plot
  [Figure. X-axis: arbitrary numbering of rabbits]

7 Naive quantification of swelling
  The MEANS Procedure, analysis variable: swelling
  [Output columns: N, Mean, Std Error, Lower 95% CL for Mean, Upper 95% CL for Mean; numeric values not preserved]
  What is wrong here?
  Imagine that all measurements on a rabbit resulted in the same value. Then we would actually have only 6 measurements, and the SEM would be badly wrong.
  So what happens when they are only somewhat identical?
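The naive summary above pools all 36 measurements as if they were independent. A minimal sketch of a call that would produce such a table, assuming the data set rabbit and the variable swelling used on p. 10 (the slides do not show the actual call, so the statistic keywords below are an assumption):

```sas
/* Naive summary: treats all 36 swelling areas as independent observations. */
/* n, mean, stderr and clm request N, mean, SEM and 95% confidence limits.  */
proc means data=rabbit n mean stderr clm;
  var swelling;
run;
```

The standard error reported here divides by the square root of 36, which is exactly the step that is questioned on this slide.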

8 Correlated observations
  Observations on the same individual look alike: they are correlated.
  Why is this important?
    Variation between these will not reflect the population variation (the variation between individuals).
    The number of observations will seem misleadingly high.
  So, we have to take the correlation into account.

9 Neglecting correlation will lead to errors
  Typical errors:
    Wrong standard errors (too small or too big)
    Wrong confidence intervals (too narrow or too wide)
    Wrong conclusions (type I or type II errors)
  The type of error depends upon the kind of question asked (to be further explained).

10 Naive analysis of swelling
  Each rabbit has a mean level.
  There is some variation between the six injection sites for the same rabbit.
  In computer language: The rabbit is a factor, and the analysis is a one-way ANOVA:
    proc glm data=rabbit;
      class rabbit;                        /* rabbit is a categorical factor */
      model swelling=rabbit / solution;    /* print parameter estimates */
    run;

11 Output from naive model
  The GLM Procedure, dependent variable: swelling
  [ANOVA table (Source, DF, Sum of Squares, Mean Square, F Value, Pr > F), fit statistics and Type III test for rabbit; numeric values not preserved]
  The rabbits have different levels (P=0.0040), but this was NOT the question.

12 Output from the naive model, II
  [Parameter estimates (Estimate, Standard Error, t Value, Pr > |t|) for Intercept and the six rabbit levels; numeric values not preserved]
  But: Do we get any useful information from this?
  We are not interested in these particular 6 rabbits, only in rabbits in general, as a species.
  We assume these 6 rabbits to have been randomly selected from the species.

13 Variance component model
  Instead of fixed level parameters for each rabbit, we model the differences between rabbits as an extra source of variation:
    $y_{rs} = \mu + a_r + \varepsilon_{rs}$
  where the $a_r$'s and the $\varepsilon_{rs}$'s are assumed to be independent, Normally distributed, with variances
    $\operatorname{Var}(a_r) = \omega_B^2$, $\operatorname{Var}(\varepsilon_{rs}) = \sigma_W^2$.
  The variation between rabbits is now a random effect, or random factor.
  $\omega_B^2$ and $\sigma_W^2$ are called variance components, and the model is also called a two-level model.

14 Formulation in terms of correlation
  All swelling observations have common mean and variance:
    $y_{rs} \sim N(\mu, \omega_B^2 + \sigma_W^2)$
  But: Measurements made on the same rabbit are correlated, with the intra-class correlation
    $\operatorname{Corr}(y_{r1}, y_{r2}) = \rho = \frac{\omega_B^2}{\omega_B^2 + \sigma_W^2}$
  Measurements made on the same rabbit tend to look more alike than measurements made on different rabbits.
  All measurements on the same rabbit look equally much alike.
  This correlation structure is called compound symmetry (CS) or exchangeability.
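The intra-class correlation follows directly from the model on p. 13. A short derivation, using only the independence of the $a_r$'s and the $\varepsilon_{rs}$'s stated there:

```latex
\begin{align*}
\operatorname{Cov}(y_{r1}, y_{r2})
  &= \operatorname{Cov}(\mu + a_r + \varepsilon_{r1},\ \mu + a_r + \varepsilon_{r2})
   = \operatorname{Var}(a_r) = \omega_B^2, \\
\operatorname{Var}(y_{rs})
  &= \operatorname{Var}(a_r) + \operatorname{Var}(\varepsilon_{rs})
   = \omega_B^2 + \sigma_W^2, \\
\rho
  &= \frac{\operatorname{Cov}(y_{r1}, y_{r2})}
          {\sqrt{\operatorname{Var}(y_{r1})\,\operatorname{Var}(y_{r2})}}
   = \frac{\omega_B^2}{\omega_B^2 + \sigma_W^2}.
\end{align*}
```

Because the shared term $a_r$ is the only source of covariance, $\rho$ can never be negative in this model.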

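15 Covariance and correlation
  For the six injection sites, the covariance matrix for each rabbit is:
    $$\begin{pmatrix}
    \omega_B^2 + \sigma_W^2 & \omega_B^2 & \omega_B^2 & \omega_B^2 & \omega_B^2 & \omega_B^2 \\
    \omega_B^2 & \omega_B^2 + \sigma_W^2 & \omega_B^2 & \omega_B^2 & \omega_B^2 & \omega_B^2 \\
    \omega_B^2 & \omega_B^2 & \omega_B^2 + \sigma_W^2 & \omega_B^2 & \omega_B^2 & \omega_B^2 \\
    \omega_B^2 & \omega_B^2 & \omega_B^2 & \omega_B^2 + \sigma_W^2 & \omega_B^2 & \omega_B^2 \\
    \omega_B^2 & \omega_B^2 & \omega_B^2 & \omega_B^2 & \omega_B^2 + \sigma_W^2 & \omega_B^2 \\
    \omega_B^2 & \omega_B^2 & \omega_B^2 & \omega_B^2 & \omega_B^2 & \omega_B^2 + \sigma_W^2
    \end{pmatrix}$$
  and the corresponding compound symmetry correlation structure is:
    $$\begin{pmatrix}
    1 & \rho & \rho & \rho & \rho & \rho \\
    \rho & 1 & \rho & \rho & \rho & \rho \\
    \rho & \rho & 1 & \rho & \rho & \rho \\
    \rho & \rho & \rho & 1 & \rho & \rho \\
    \rho & \rho & \rho & \rho & 1 & \rho \\
    \rho & \rho & \rho & \rho & \rho & 1
    \end{pmatrix}$$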

16 Exchangeability = Compound Symmetry
  This covariance/correlation structure implies:
    All variances are equal: There should be the same variation between rabbits for all injection sites.
    Any pair of measurements are equally correlated: All injection sites should be equally related to each other.
  How could these assumptions be violated? Are the injection sites really randomly selected?
  If not, an unstructured covariance may be more appropriate: Some injection sites are more related than others (e.g. due to proximity).

17 Estimation in SAS
    proc mixed data=rabbit;
      class rabbit;
      model swelling = / ddfm=kr s cl;   /* intercept only; s cl print estimate and confidence limits */
      random rabbit;                     /* rabbit level as a random effect */
    run;
  [Output: Covariance Parameter Estimates (rabbit, Residual) and Solution for Fixed Effects (Intercept: Estimate, Standard Error, DF, t Value, Lower, Upper); numeric values not preserved]
  Comparison to p. 7 reveals that correctly taking the correlation into account yields the same estimate, but a substantially wider confidence interval.
  To ignore the correlation leads to a type 1 error.

18 Interpretation of variance components
  Variance component              Estimate   Proportion of variation
  Between   $\omega_B^2$             ...        ... %
  Within    $\sigma_W^2$             ...        ... %
  Total     $\omega_B^2 + \sigma_W^2$   ...        ... %
  Typical differences (95% prediction intervals):
    for spots on the same rabbit: ±2.16 cm$^2$
    for spots on different rabbits: ±2.70 cm$^2$
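The two prediction limits can be traced back to the variance components. The sketch below assumes the same construction as on p. 41 (a factor 2 times the standard deviation of a difference between two measurements); the intermediate numbers on this slide were not preserved, so the back-calculated values are approximations, not SAS output.

```latex
\begin{align*}
\text{same rabbit:}\quad
  & y_{rs} - y_{rs'} = \varepsilon_{rs} - \varepsilon_{rs'},
  && \operatorname{Var} = 2\sigma_W^2,
  && \pm 2\sqrt{2\sigma_W^2} = \pm 2.16, \\
\text{different rabbits:}\quad
  & y_{rs} - y_{r's'} = (a_r - a_{r'}) + (\varepsilon_{rs} - \varepsilon_{r's'}),
  && \operatorname{Var} = 2(\omega_B^2 + \sigma_W^2),
  && \pm 2\sqrt{2(\omega_B^2 + \sigma_W^2)} = \pm 2.70.
\end{align*}
% Back-calculation under the factor-2 assumption:
%   sigma_W^2              = (2.16/2)^2 / 2  ~ 0.58
%   omega_B^2 + sigma_W^2  = (2.70/2)^2 / 2  ~ 0.91
%   omega_B^2              ~ 0.33,  rho ~ 0.33 / 0.91 ~ 0.36
```

Under that assumption, roughly 64% of the total variation is within rabbits, which agrees with the "approx. 2/3" quoted on p. 19.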

19 Interpretation of variance components, cont'd
  Approx. 2/3 of the variation in the measurements comes from the variation within rabbits, i.e. between injection sites on the same rabbit.
  Why? Could there be a systematic difference between the injection sites?
  [Output from a model with spot as a fixed effect: Covariance Parameter Estimates (rabbit, Residual) and Type 3 Tests of Fixed Effects for spot; numeric values not preserved]
  This does not seem to be the case (P=0.26).

20 Design considerations, precision of overall mean
  [Figure: standard error of the overall mean, for R = no. of rabbits, varying from 3 to 20, and S = no. of spots, varying from 1 to 10]
  Standard error is the square root of:
    $\operatorname{Var}(\bar{y}) = \frac{\omega_B^2}{R} + \frac{\sigma_W^2}{RS}$
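The design curves can be tabulated directly from the variance formula. A minimal sketch: the data set name design and the variable names are hypothetical, and the two variance components are assumed values back-calculated from the prediction limits on p. 18, not figures read from the SAS output on p. 17.

```sas
/* Standard error of the overall mean over a grid of designs.              */
/* ASSUMPTION: omega2_b and sigma2_w are approximate, illustrative values. */
data design;
  omega2_b = 0.33;   /* between-rabbit variance (assumed) */
  sigma2_w = 0.58;   /* within-rabbit variance (assumed)  */
  do R = 3 to 20;    /* number of rabbits                 */
    do S = 1 to 10;  /* number of spots per rabbit        */
      se_mean = sqrt(omega2_b / R + sigma2_w / (R * S));
      output;
    end;
  end;
run;

/* One curve per value of S, plotted against the number of rabbits */
proc sgplot data=design;
  series x=R y=se_mean / group=S;
run;
```

Since the first term does not depend on S, adding more spots per rabbit can never bring the variance below $\omega_B^2 / R$; only more rabbits can.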

21 Effective sample size
  If we had only one observation for each of k rabbits, how many rabbits would we then need to obtain the same precision?
    $k = \frac{RS}{1 + \rho(S - 1)}$
  Inserting $R = 6$, $S = 6$ and $\rho = \frac{\omega_B^2}{\omega_B^2 + \sigma_W^2}$ (estimated from p. 17) yields $k = 12.8$.
  Effectively, we have only approximately two independent observations from each rabbit!
  Take care: This is a pilot study, do not rely too heavily on the results.
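The formula for $k$ is obtained by equating the variance of the overall mean (p. 20) with the variance of a mean of $k$ independent single observations. A short derivation in the notation of p. 13 and p. 14:

```latex
\begin{align*}
\operatorname{Var}(\bar{y})
  &= \frac{\omega_B^2}{R} + \frac{\sigma_W^2}{RS}
   = \frac{S\omega_B^2 + \sigma_W^2}{RS}
   = \frac{(\omega_B^2 + \sigma_W^2)\,\bigl(1 + \rho(S-1)\bigr)}{RS}, \\
\frac{\omega_B^2 + \sigma_W^2}{k}
  &= \operatorname{Var}(\bar{y})
  \quad\Longrightarrow\quad
  k = \frac{RS}{1 + \rho(S-1)}.
\end{align*}
% Solving for rho with R = S = 6 and k = 12.8:
%   rho = (RS/k - 1)/(S - 1) = (36/12.8 - 1)/5 ~ 0.36
% and k/R = 12.8/6 ~ 2.1 effective observations per rabbit.
```

The second step uses $\omega_B^2 = \rho(\omega_B^2 + \sigma_W^2)$ and $\sigma_W^2 = (1-\rho)(\omega_B^2 + \sigma_W^2)$.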

22 Reduced data set: omit 3 observations
  [Figure]
  What kind of effects would we expect?

23 Quantification of overall swelling
  Right column corresponds to the reduced data set, where the 3 smallest measurements from rabbit 2 (with the highest level) are omitted.
  Method                               All 36 data      Omitting 3 observations
                                       Estimate (SE)    Estimate (SE)
  Simple average of all (p. 7)         ... (0.155)      ... (0.163)
  Average of averages                  ... (0.267)      ... (0.333)
  Weighted average of averages                          ... (0.265)
  Variance component model (p. 17)     ... (0.267)      ... (0.298)

24 Comments on the quantifications in the table on p. 23
  Simple averages: Pool all 36 measurements, wrongly assuming independence. This will result in too small standard errors. In the reduced data set, the estimate is biased downwards, since we have omitted some of the largest observations.
  Average of averages: Start out by taking averages for each rabbit. This will be OK for balanced designs, but when we omit the three lowest observations for rabbit 2, this rabbit appears to have a higher level, which gives an upwards bias.

25 Comments, cont'd
  Weighted average of averages: As above, but weighted according to the number of observations. For balanced designs, all weights are equal, but when we omit three observations, rabbit 2 has a lower weight in the average due to having only 3 observations. This will result in a downwards bias, because rabbit 2 has a high level.
  Random rabbit: The variance component model will yield the correct result, provided that observations are missing at random. In the reduced data set, rabbit 2 has a lower weight in the average due to a larger standard error.

26 Estimation of individual rabbit means...?
  Two different approaches:
    Traditional averages $\bar{y}_{r\cdot}$
    BLUPs (best linear unbiased predictors), which rely on the assumption that individuals come from the same population, and become weighted averages that have been shrunk towards the overall mean:
      $k\,\bar{y}_{r\cdot} + (1 - k)\,\bar{y}_{\cdot\cdot}$, where $k = \frac{\omega_B^2}{\omega_B^2 + \sigma_W^2 / S}$
      ($k$ is close to 0 when $\sigma_W^2$ is large, otherwise closer to 1)
  More shrinkage if rabbits look alike.
  BLUPs are used for ranking, e.g. of schools.
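The predicted rabbit effects can be requested from the same model as on p. 17. A minimal sketch, assuming that data set and those variable names; the output data set name rabbit_blups is hypothetical. The solution option and the SolutionR table are the same ones used for the girls on p. 83.

```sas
/* Variance component model with predicted random effects (BLUPs).     */
proc mixed data=rabbit;
  class rabbit;
  model swelling = / ddfm=kr s cl;
  random rabbit / solution;             /* print the predicted a_r       */
  ods output SolutionR=rabbit_blups;    /* hypothetical data set name    */
run;
```

The values in SolutionR are deviations from the overall mean, so the shrunken mean for a rabbit is the intercept estimate plus that rabbit's predicted effect.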

27 BLUPs vs. averages, shrinkage
  [Figure. Left panel: the full data set. Right panel: the reduced data set]
  Larger shrinkage for rabbit no. 2 in the reduced data set.

28 Hierarchical designs, cluster designs
  e.g. School, School Class and Pupil
  Factor diagram: [I] = [S*C*P] → [S*C] → [S]

29 Hierarchical designs, with covariates
  Factor diagram: [S*C*P] → [S*C] → [S]
  Covariates: Gender (pupil level), Class grade (class level), School type (school level)

30 Examples of hierarchies
  level 1     level 2      level 3
  subjects    twin pairs   countries
  subjects    families     regions
  students    classes      schools
  spots       rabbits
  fields      sections     rats
  visits      subjects     centres
  Measurements belonging together in the same cluster look alike (are correlated).
  On all levels, we may have random variation (variance components), as well as covariates.

31 Merits of cluster designs
  Certain effects may be estimated more precisely, since some sources of variation are eliminated, e.g. by making comparisons within a family or a school class. This is analogous to the paired comparison situation.
  When planning subsequent investigations, the knowledge of the relative sizes of the variance components will be of help in deciding the number of repetitions needed at each level.

32 Drawbacks of cluster designs
  Bias may result if one or more sources of variation are disregarded:
    low efficiency (type 2 error) for evaluation of level 1 covariates (within-cluster effects)
    too small standard errors (type 1 error) for estimates of level 2 effects (between-cluster effects)
    possible bias in the mean value structure, in case of missing values
  For longitudinal data, we saw that ignoring correlations implies that:
    Time will appear less important
    Groups will appear more different

33 Level 1 covariates (unit: single observations)
  Time itself
  Covariates varying with time: blood pressure, heart rate, age
  Interaction between group and time
  If correlation is not taken into account, we ignore the paired situation, leading to low efficiency, i.e. too large P-values.
  Type 2 error: Effects may go undetected!

34 Level 2 covariates (unit: individuals)
  Treatment
  Gender, age
  If correlation is ignored, we act as if we have more information than we actually have, leading to too small P-values.
  Type 1 error: Noise may be taken to be real effects!

35 A school example
  Models for such data include 3 sources of variation:
    1: Variation between schools ([S])
    2: Variation between classes in each school ([S*C])
    3: Variation between pupils in each class ([S*C*P], residual variation)
  What may happen if we forget the variation between classes in the same school, [S*C]?
    Pupils in the same class will be assumed no more correlated than pupils from different classes (in the same school).
    Covariates on class level (e.g. class grade) will appear too important.
    Covariates on pupil level (e.g. gender) will appear less important.
  We will return to this example when we discuss binary data.

36 Another example of a 3-level model
  Research problem: In order to evaluate the effect of cytostatics on pancreas islet β-cells, we need to quantify the number of nuclei per cell. (Henrik Winther Nielsen, Inst. Med. Anat.)
  How should data be collected in order to maximize precision with low expense and work load?
    How many animals (rats)?
    How many slices of the pancreas?
    How many sections of each slice should be counted?
  Hierarchy and variance components: fields ($\sigma^2$), sections ($\tau^2$), rats ($\omega^2$)
  Factor diagram: [I] = [R*S*F] → [R*S] → [R] → 0

37 Pilot study
  4 rats (R)
  3 sections for each rat (S)
  5 randomly chosen fields from each section (F)
  [Figure: scatter plot, with jitter (symbols indicate sections)]

38 3-level model in SAS
    proc mixed data=nuclei;
      class rat section;
      model nuclei= / ddfm=kr s;
      random intercept section / subject=rat;   /* random rat level and random section level within rat */
    run;
  [Output: Covariance Parameter Estimates (Intercept with subject rat, section with subject rat, Residual) and Solution for Fixed Effects (Intercept); numeric values not preserved]
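The same three-level structure can also be written with explicitly nested effects, in parallel to the two notations shown for the girls on p. 61. A sketch, assuming the data set and variable names from the code above:

```sas
/* Alternative notation for the 3-level model: rats, and sections nested */
/* within rats, are both random; fields form the residual variation.     */
proc mixed data=nuclei;
  class rat section;
  model nuclei = / ddfm=kr s;
  random rat section(rat);
run;
```

Both forms should give the same variance component estimates; the subject= form is often preferred in practice because it tells PROC MIXED that the data are independent between rats, which can speed up the computations.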

39 Variances are positive!
  ... and therefore these models describe all correlations to be positive.
  But note: It may happen that correlations are in reality negative!
    by coincidence
    as a result of competition between units belonging together, e.g. when measuring yield for plants grown in the same pot
  In such a case, the corresponding variance component will be reported as zero.
  Here, the variation between sections is close to 0.

40 Interpretation of variance components
  Variance component                       Estimate   Proportion of variation
  Rats       $\omega^2$                       ...        ... %
  Sections   $\tau^2$                         ...        ... %
  Fields     $\sigma^2$                       ...        ... %
  Total      $\omega^2 + \tau^2 + \sigma^2$   ...        ... %
  Almost all variation is on the lowest level:
    Rats appear quite identical, perhaps they are from the same litter?
    Sections appear extremely identical, is the pancreas homogeneous?

41 Typical differences between two measurements:
  for different fields on the same section: $\pm 2\sqrt{2\sigma^2} = \pm 1.255$
  for different sections on the same rat: $\pm 2\sqrt{2(\tau^2 + \sigma^2)} = \pm 1.264$
  for sections on different rats: $\pm 2\sqrt{2(\omega^2 + \tau^2 + \sigma^2)} = \pm \ldots$

42 Correlations vary, depending on
  Measurements on the same section:
    $\operatorname{Corr}(y_{rs1}, y_{rs2}) = \frac{\omega^2 + \tau^2}{\omega^2 + \tau^2 + \sigma^2} = \ldots$
  Measurements on different sections of the same rat:
    $\operatorname{Corr}(y_{r11}, y_{r22}) = \frac{\omega^2}{\omega^2 + \tau^2 + \sigma^2} = \ldots$
  Measurements from different rats are independent.

43 Previous example: Calcium supplements
  A total of ... girls, ... years old, were randomized to receive either calcium or placebo.
  Outcome: BMD = bone mineral density, in g/cm$^2$, ideally measured every 6 months (5 visits), but in reality...
  Scientific question: Does calcium improve the rate of bone gain for adolescent women?

44 Previous analyses of this example
  [Figure: response profiles, with unstructured or patterned covariance]

45 Timing of the 5 visits
  Of course, the girls were not seen with intervals of precisely 6 months...
  Neither were they precisely at the same age at the first visit.
  What to do about that? What is the proper time scale?
    Age?
    Time since randomization?
  Assuming that the date of the first visit is also the time of randomization, we shall take this as time 0.
  More on baseline handling later.

46 Time since randomization
  Time points are specific to each single girl.
  Time 0 is the individual time of the first visit.
  visit1, visit2 etc. have no real meaning any more, because they do not refer to the same time point.
  Time is now in units of years from randomization.
  Note on the output on the next page: The number of measurements decreases over time, due to missing values/dropout.

47 Overview of individual time points
  The MEANS Procedure
  [Table: N, Mean, Minimum and Maximum of bmd and time, for each of the 5 visits in each group (C and P). At visit 1, N Obs is 55 in group C and 57 in group P; remaining values not preserved]

48 Individual profiles
  [Figure: spaghetti plots]

49 Plausible models for BMD data
  Mean value structure:
    We need a model for the effect of time, since 5 separate mean values are not possible (the time points are not identical).
    The simplest mean value structure is linearity.
  Covariance structure:
    We cannot use the construction type=un, but we can still use the random-statement, or CS in the repeated-statement.
    A lot of other covariance structures will still be possible. E.g., the non-equidistant analogue to the autoregressive structure is
      $\operatorname{Corr}(Y_{git_1}, Y_{git_2}) = \rho^{|t_1 - t_2|}$
    which is written as TYPE=SP(POW)(ctime).
    A new covariance structure comes from random regression.
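A sketch of how the spatial power structure might be requested for the calcium data. The variable ctime is named on the slide but not defined there; the sketch assumes it is simply a copy of time, and the data set name calcium2 is hypothetical.

```sas
/* ASSUMPTION: ctime is a plain copy of time, used as the coordinate */
/* for the covariance structure.                                     */
data calcium2;
  set calcium;
  ctime = time;
run;

/* Linear mean structure with correlation rho**|t1 - t2| within girl */
proc mixed data=calcium2;
  class grp girl;
  model bmd = time grp*time / ddfm=kr s cl;
  repeated / type=sp(pow)(ctime) subject=girl(grp);
run;
```

With this structure, the correlation between two measurements on the same girl decays with the actual time distance between them, so no common visit schedule is needed.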

50 Baseline issues
  If the first visit is a baseline measurement (which it is), and randomization has been performed:
  The two groups are known to be equal at baseline.
  To allow a group effect at baseline
    may weaken a possible difference between these (type 2 error)
    may convert a treatment effect to an interaction
  Dissimilarities may be present in small studies.
  For slowly varying outcomes, even a small difference may produce non-treatment related differences, i.e. bias.

51 Hypothetical comparison of two treatment groups, A
  [Figure] Truth: Constant difference between the treatments. Finding: Interaction between time and treatment.

52 Hypothetical comparison of two treatment groups, B
  [Figure] Truth: No effect of treatment. Finding: Constant difference between treatments.

53 Baseline difference I: Observational studies
  Research question: Compare the outcomes for individuals from different groups (e.g. gender or illness groups).
  The groups are likely to differ in many respects, including baseline outcome value.
  Differences in the outcome may be due to any of these characteristics, and the results will depend on which of these are included in the model.
  Adjust for the covariates that are sensible in the context.
  The scientific question answered depends upon the model.

54 Baseline difference II: Randomized studies
  Research question: Compare the outcomes for individuals treated differently, but otherwise identical (with respect to all baseline characteristics, including baseline outcome value).
  There ought to be no difference in either covariates or baseline outcome.
  Even so, small chance differences in baseline may create important outcome differences that may erroneously be taken to be treatment effects, if the covariate (or baseline) is highly predictive of the outcome.
  Using the baseline measurement as a covariate (Ancova) to adjust for chance differences is most sensible in simple before/after studies, but is not optimal with more than one follow-up measurement.

55 Approaches for handling baseline in randomized studies
  Use follow-up data only (exclude baseline from analysis): most reasonable if correlation between repeated measurements is very low.
  Subtract baseline from successive measurements: most reasonable if correlation between repeated measurements is very high.
  Use a model with equal mean values at baseline: may be used for any degree of correlation and gives the most sensible interpretation.

56 Random girl level in SAS, and linearity in time:
    proc mixed covtest data=calcium;
      class grp girl;
      model bmd=time grp*time / ddfm=kr s cl;             /* no grp main effect: equal levels at baseline */
      random intercept / subject=girl(grp) v vcorr;       /* random level for each girl */
    run;
  Girls are nested in groups, specified by the notation girl(grp).
  kr could be replaced by satterth, see p. 57.
  v and vcorr are printing options.

57 The options ddfm=satterth and ddfm=kr (Kenward-Roger)
  When the distributions are exact (balanced situations), they have no effect.
  When approximations are necessary, these two are considered best. This is the case in unbalanced situations, i.e.
    for almost all observational designs
    in case of missing observations
  They may give rise to fractional degrees of freedom.
  The computations may require a little more time, but in most cases this will not be noticeable.
  When in doubt, use it!

58 Random girl level, output from code on p. 56
  [Covariance Parameter Estimates: Intercept with subject girl(grp), Pr > Z <.0001; Residual, Pr > Z <.0001. Type 3 Tests of Fixed Effects: time, Pr > F <.0001; time*grp, Pr > F <.0001. Remaining values not preserved]
  No doubt, we see an interaction GRP*TIME.

59 Random girl level, output, continued
  [Solution for Fixed Effects: Estimate, Standard Error, DF, t Value, Pr > |t|, Alpha, Lower and Upper for Intercept, time, time*grp C and time*grp P; numeric values not preserved]
  Excess slope in the C-group: ... g/cm$^2$ extra per year if C-treated, CI=(0.0053, ...)

60 Model synonyms
  Two-level model
  Model with random subject levels
  Model with random intercepts
  Model with compound symmetry correlation structure (TYPE=CS)
  Model with exchangeability correlation structure

61 Alternative specifications
  Note that the specification
    random girl(grp);
  can be written in two other ways:
    repeated visit / type=cs subject=girl(grp);    (CS: compound symmetry)
    random intercept / subject=girl(grp);
  In the following, we shall see generalizations of the RANDOM-statement.

62 Individual growth rates?
  The time course is reasonably linear, but maybe the girls have different growth rates (slopes)?
  If we let $y_{git}$ denote BMD for the $i$th girl (in the $g$th group) at time $t$ (in years), we could look at the model:
    $y_{git} = A_{gi} + B_{gi}\,t + \varepsilon_{git}$, $\varepsilon_{git} \sim N(0, \sigma^2)$
  i.e., with different intercepts ($A_{gi}$) and different slopes ($B_{gi}$) for each girl.

63 Fit a straight line for each girl
  [Figure: scatter plot of slopes vs. levels at first visit, as estimated by individual regressions]
  Slopes in the Calcium-group (blue dots) seem to be bigger.
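The individual regressions can be obtained with one BY-group regression per girl. A minimal sketch, assuming the data set calcium and the variables grp, girl, bmd and time used elsewhere in these slides; the output data set name indiv_fits is hypothetical, and the slides do not show how the original figure was produced.

```sas
/* One straight line per girl: sort first, then regress within each BY group */
proc sort data=calcium;
  by grp girl;
run;

proc reg data=calcium outest=indiv_fits noprint;
  by grp girl;
  model bmd = time;      /* outest= stores Intercept and the slope (named time) */
run;

/* Slopes against estimated levels at time 0, coloured by group */
proc sgplot data=indiv_fits;
  scatter x=Intercept y=time / group=grp;
run;
```

Girls with a single visit cannot contribute a slope, which is one reason why this approach is suboptimal with unequal numbers of measurements (p. 72).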

64 Results from individual regression
  Estimates with standard errors in brackets:
  Group        Level at baseline   Slope
  P            ... (0.0091)        ... (0.0025)
  C            ... (0.0082)        ... (0.0030)
  Difference   ... (0.0123)        ... (0.0039)
  P-value      ...                 ...
  NOTE: No restrictions on baseline here.

65 Random regression: a generalization of the idea of a random level
  We let each individual (girl) have
    her own level $A_{gi}$
    her own slope $B_{gi}$
  but...

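66 Random regression, II
  ... we bind these individual parameters ($A_{gi}$ and $B_{gi}$) together by normal distributions:
    $$\begin{pmatrix} A_{gi} \\ B_{gi} \end{pmatrix} \sim N_2\left( \begin{pmatrix} \alpha \\ \beta_g \end{pmatrix},\ G \right), \qquad
    G = \begin{pmatrix} \tau_a^2 & \omega \\ \omega & \tau_b^2 \end{pmatrix}
      = \begin{pmatrix} \tau_a^2 & \rho\tau_a\tau_b \\ \rho\tau_a\tau_b & \tau_b^2 \end{pmatrix}$$
  $G$ describes the population variation of the lines, i.e. the inter-individual variation (reflected by the picture on p. 63).
  Note: No subscript on $\alpha$, because the groups are equal at baseline.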
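The matrix $G$ determines the covariance between measurements on the same girl. A short derivation from the model on p. 62, writing $\omega = \rho\tau_a\tau_b$ and assuming, as in the variance component model, that the residuals are independent of $(A_{gi}, B_{gi})$:

```latex
\begin{align*}
\operatorname{Var}(y_{git})
  &= \operatorname{Var}(A_{gi} + B_{gi}\,t) + \operatorname{Var}(\varepsilon_{git})
   = \tau_a^2 + 2t\,\omega + t^2\tau_b^2 + \sigma^2, \\
\operatorname{Cov}(y_{git_1}, y_{git_2})
  &= \operatorname{Cov}(A_{gi} + B_{gi}\,t_1,\ A_{gi} + B_{gi}\,t_2)
   = \tau_a^2 + (t_1 + t_2)\,\omega + t_1 t_2\,\tau_b^2,
   \qquad t_1 \neq t_2.
\end{align*}
```

Both the variance and the covariance therefore change with time, unlike in the compound symmetry structure, and the matrix is specific to each girl because her time points are her own.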

67 Estimation in random regression
  keeping levels at baseline equal by omitting grp in the model-statement:
    proc mixed covtest data=calcium;
      class grp girl;
      model bmd=time grp*time / ddfm=kr s cl;
      random intercept time / type=un subject=girl g v vcorr;   /* random level and slope for each girl */
    run;
  type=un in the random-statement refers to the matrix G on the previous slide, and the estimate is seen on the next page.

68 Output from random regression
  [Estimated G Matrix (rows and columns: Intercept, time). Covariance Parameter Estimates with subject girl: UN(1,1), Pr Z <.0001; UN(2,1); UN(2,2), Pr Z <.0001; Residual, Pr Z <.0001. Fit Statistics: -2 Res Log Likelihood. Numeric values not preserved]

69 Output II: Estimated covariance and correlation for the 5 visits for one particular girl
  [Estimated V Matrix for girl 101 and Estimated V Correlation Matrix for girl 101, both 5 x 5; numeric values not preserved]

70 Output III: Estimated mean value structure
  [Solution for Fixed Effects (Intercept, time, time*grp C, time*grp P) with confidence limits, and Type 3 Tests of Fixed Effects (time, Pr > F <.0001; time*grp); numeric values not preserved]
  Thus, we find an extra increase in BMD of ... g/cm$^2$ per year, CI=(0.0026, ...), when giving calcium supplement, a little less than found with the random girl level only.

71 Note concerning MIXED-notation
  It is necessary to use TYPE=UN in the RANDOM-statement in order to allow intercept and slope to be arbitrarily correlated.
  The default option in RANDOM is TYPE=VC, which only specifies variance components with different variances (and no covariance between them).
  If TYPE=UN is omitted, we may experience convergence problems and sometimes totally incomprehensible results.
  In this particular case, the correlation between intercept and slope is not that impressive (the intercept is not completely out of range in this example, since it refers to the baseline).

72 Individual regressions approach
  Merits:
    Easy to understand and interpret
  Drawbacks:
    Suboptimal in case of unequal sample sizes
    Only simple models feasible
    Difficult/impossible to include covariates
    Not possible to account for equal baseline values

73 Random regression approach
  Merits:
    Uses all available information
    Optimal procedure if the model holds
    Easy to include covariates
  Drawbacks:
    Biased in case of informative missing values (or informative sample sizes)

74 Random regression vs. individual regressions
  Slopes from:
  Group        Individual regressions   Random regression
  P            ... (0.0025)             ... (0.0022)
  C            ... (0.0030)             ... (0.0022)
  Difference   ... (0.0039)             ... (0.0031)
  P-value      ...                      ...
  Random regression gives a steeper slope.
  The girls with flat (and low) profiles tend to have shorter profiles.
  These slopes contribute less to the random regression slope because they are less accurate.
  Is this a coincidence?? Otherwise, we may see an example of informative missing values (last lecture).

75 Model checks
  Two types of residuals:
    Ordinary: Observed minus predicted group mean (only systematic effects)
      $Y_{ij} - X_{ij}^T \hat{\beta}$
    Conditional: Observed minus predicted individual mean value (systematic and random effects)
      $\hat{\varepsilon}_{ij} = Y_{ij} - (X_{ij}^T \hat{\beta} + Z_{ij}^T \hat{b}_i)$
  Conditional residuals are usually much smaller than the ordinary ones, since they describe deviations from subject-specific predictions.

76 Coding for model checks
    proc mixed plots=all data=calcium;
      class grp girl;
      model bmd=time grp*time / ddfm=kr s cl outpm=fitpm outp=fitp;
      random intercept time / type=un subject=girl g v vcorr;
    run;
  plots=all gives us panels to check stability of variance and normality of residuals.
  outpm= and outp= (as in the code on p. 83) create two output data sets:
    fitpm: Predicted mean BMD-values, common to girls in the same group
    fitp: Individually predicted BMD-values, specific for each girl

77 Model check, ordinary residuals
  [Figure]

78 Model check, conditional residuals
  [Figure]

79 Additional model checks
  Investigating linearity in age:
    proc sort data=fitpm; by grp time; run;
    title "Ordinary residuals";
    proc sgplot data=fitpm;
      loess Y=Resid X=time / group=grp;   /* smoothed residuals against time, per group */
    run;

    proc sort data=fitp; by grp time; run;
    title "Conditional residuals";
    proc sgplot data=fitp;
      loess Y=Resid X=time / group=grp;
    run;

80 Check of linearity, ordinary residuals (fitpm)
  [Figure]

81 Check of linearity, conditional residuals (fitp)
  [Figure]

82 Comments on model checks
  Ordinary residuals (p. 77): Homogeneity of variance; slightly skew distribution, almost Normal
  Conditional residuals (p. 78): Evidently Normal
  Linearity (p. 81): Some non-systematic deviation from linearity seen in conditional residuals, but somewhat consistently for the two groups
  Linearity (p. 80): Deviation from linearity cannot be seen in the ordinary residuals (they drown in the between-subject effect)

83 Normality of random effects?
  Histogram or box plots of the estimated $\hat{b}_i$'s from the model are not worth much.
    proc mixed covtest plots=all data=calcium;
      class grp girl;
      model bmd=time grp*time / ddfm=kr s cl outpm=fitpm outp=fitp residual influence;
      random intercept time / type=un subject=girl(grp) g v vcorr s;
      ods output solutionr=random_effects;    /* predicted random effects for each girl */
    run;

    proc sgplot data=random_effects;
      vbox Estimate / category=grp;
    run;

84 Predicted values from random regression
  [Figure] Predicted group means, shown for two girls from different groups.

85 Predicted values from random regression, II
  [Figure] Individual predictions.

86 Plausible models for BMD data
  Response profiles: Unstructured mean and unstructured covariance (only for balanced data)
  Compound symmetry covariance/correlation: Synonym for random effect/level for each girl
  Autoregressive covariance/correlation, or other covariance structures
  Random regression: Random effects of both intercept and slope for each girl

87 How can we choose between models?
  Think...
  Graphical assessment of fit, e.g. comparison of predicted profiles with average curves
  Inspection of residuals
    Automatic model checks, using ods graphics
    More extensive model checks using output data sets
  Tests against more flexible alternatives
    Fixed effects tested by the usual output
    Covariance patterns evaluated by $\chi^2$-tests on $-2 \log L$
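The $\chi^2$-test for covariance patterns compares two nested models through the "-2 Res Log Likelihood" line in the Fit Statistics (p. 68). A sketch of the comparison between the random intercept model (p. 56) and the random regression model (p. 67), which share the same mean value structure; the numeric likelihood values were not preserved in these slides, so only the form of the test is shown:

```latex
\begin{align*}
G^2 &= \bigl(-2\log L_{\text{random intercept}}\bigr)
     - \bigl(-2\log L_{\text{random regression}}\bigr)
     \;\approx\; \chi^2_{df}, \\
df  &= \#\{\text{covariance parameters in the larger model}\}
     - \#\{\text{covariance parameters in the smaller model}\}
     = 4 - 2 = 2 .
\end{align*}
% Larger model:  tau_a^2, omega, tau_b^2, sigma^2   (4 parameters)
% Smaller model: tau_a^2, sigma^2                   (2 parameters)
```

With REML, such a comparison is only meaningful when both models have identical fixed effects. Because the hypothesis $\tau_b^2 = 0$ lies on the boundary of the parameter space, the ordinary $\chi^2$ reference distribution tends to give P-values that are somewhat too large.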

88 The mean value structure
  Look for:
    Linearity in scatter plot?
    Curves in residual plots?
  Alternatives:
    Splines
    More covariates
    Non-linear models

89 The covariance/correlation structure
  1. Random effects
  2. Serial correlation (the pattern)
  3. Error of measurement

90 Assumptions in a mixed effects model
  Linearity in covariates $X_{ij}$ (including $Z_{ij}$)
  Normality of residuals $\varepsilon_{ij}$
  Normality of random effects $b_i$
  Plausibility of covariance structure
  Independence between individuals
  Independence between $X_{ij}$ and $b_i$, e.g. does the timing and number of measurements relate to the development for the girl?

91 Importance of assumptions
  Important:
    Linearity
    Independence between individuals (normally not an issue)
    Independence between $X_{ij}$ and $b_i$
    Appropriateness of the covariance structure (may be circumvented by using the empirical sandwich estimator, option empirical in proc mixed)
  Less important (especially when the number of observations is large):
    Normality of residuals $\varepsilon_{ij}$
    Normality of random effects $b_i$
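A sketch of how the empirical (sandwich) standard errors mentioned above might be requested for the random girl level model from p. 56. The ddfm=kr option is left out here, on the assumption that the Kenward-Roger and Satterthwaite methods cannot be combined with the empirical option in PROC MIXED; the default degrees of freedom are used instead.

```sas
/* Robust (sandwich) standard errors for the fixed effects:        */
/* a safeguard against a misspecified covariance structure.        */
proc mixed data=calcium empirical;
  class grp girl;
  model bmd = time grp*time / s cl;
  random intercept / subject=girl(grp);
run;
```

Comparing these standard errors with the model-based ones from p. 59 gives an informal impression of how sensitive the conclusions are to the chosen covariance structure.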

92 Influential observations
  i.e. observations with a large influence on the estimates, either on the mean value or on the covariance parameters.
  These observations could have
    an unusual combination of covariates $X_i$
    large ordinary residuals (from $X_i \beta$)
    an unusual combination of covariates $Z_i$
  or it could be a sign of
    a bad choice of mean value structure
    a bad choice of covariance pattern

93 Cook's distance
  [Figure] Tentative limit for being "influential": $4/n = \ldots$

94 Repetition
  Typical set-up for repeated measurements:
    Two or more groups of subjects (typically receiving different treatments)
    Randomization at baseline
    Longitudinal measurements of the same quantity over time for each subject, typically as a function of
      time (duration of treatment)
      age
      cumulative dose of some drug
  Level 1: Single observations
  Level 2: Patients/Subjects

95 Repeated measurement designs: Merits
  It is much more powerful in detecting time changes (data are paired, with the subject as its own control).
  We may discover that subjects have different time courses. (In designs with only cross-sectional data, this may also be the case, but we have no way of knowing!)
  We may identify important characteristics of the time courses, specific for each subject (trend, peak etc.).

96 Repeated measurement designs: Drawbacks
  The traditional independence assumption is violated, since repeated observations on the same individual are correlated (look alike).
  Traditional anova-models become impossible.
  Comparison of time averages (or other characteristics) cannot incorporate time dependent covariates.
