Analysis of variance and regression. December 4, 2007
Variance component models
- Variance components
- One-way anova with random variation: estimation, interpretations
- Two-way anova with random variation
- Crossed random effects
- Ecological analyses
Lene Theil Skovgaard, Dept. of Biostatistics, Institute of Public Health, University of Copenhagen
Terminology for correlated measurements:
- Multivariate outcome: several outcomes (responses) for each individual, e.g. a number of hormone measurements that we want to study simultaneously.
- Cluster design: the same outcome (response) measured on all individuals in a number of families/villages/school classes.
- Repeated measurements: the same outcome (response) measured in different situations (or at different spots) for the same individual.
- Longitudinal measurements: the same outcome (response) measured consecutively over time for each individual.
Variance component models
Generalisations of ANOVA-type models or regression models, involving several sources of random variation (variance components):
- environmental variation: between regions, hospitals or countries
- biological variation: variation between individuals, families or animals
- within-individual variation: variation between arms, teeth, injection sites, days
- variation due to uncontrollable circumstances: time of day, temperature, observer
- measurement error
Typical studies involve data from:
- a number of family members from a sample of households
- pupils from a sample of school classes
- measurements on several spots of each individual

Alternative name (for some of them): multilevel models
- variation on each level (variance component)
- possibly systematic effects (covariates) on each level
Examples of hierarchies:

  level 1 (individual)   level 2 (context/cluster)   level 3
  subjects               twin pairs                  countries
  subjects               families                    regions
  students               classes                     schools
  visits                 subjects                    centres
Merits
- Certain effects may be estimated more precisely, since some sources of variation are eliminated, e.g. by making comparisons within a family. This is analogous to the paired-comparison situation.
- When planning subsequent investigations, knowledge of the relative sizes of the variance components will help in deciding the number of repetitions needed at each level (if possible).
Drawbacks
When making inference (estimation and testing), it is important to take all sources of variation into account, and effects have to be evaluated against the relevant variation! Bias may result if one or more sources of variation are disregarded.
Measurements belonging to the same cluster look alike (are correlated). If we fail to take this correlation into account, we will experience:
- possible bias in the mean value structure
- low efficiency (type 2 error) for evaluation of level 1 covariates (within-cluster effects)
- too small standard errors (type 1 error) for estimates of level 2 effects (between-cluster effects)
Concepts of the day:
- advantage/necessity of random effects
- generalisations of ANOVA-type models
- examples with small data sets: some of them too small to allow for trustworthy interpretations, but illustrative precisely because of their limited size
- illustrated with SAS PROC MIXED
One-way analysis of variance with random variation: comparison of k groups/clusters, satisfying
- the groups are not of individual interest, and it is of no interest to test whether they have identical means
- the groups may be thought of as representatives from a population that we want to describe

Example: 10 consecutive measurements of blood pressure on a sample of 50 women. We know that the women differ, and we do not care! We only want to learn something about blood pressure in the female population in general.
Example of one-way anova structure: 6 rabbits are vaccinated, each in 6 spots on the back.
Response Y: swelling in cm².

Model: swelling = grand mean + rabbit deviation + variation

    y_rs = mu + alpha_r + eps_rs,   eps_rs ~ N(0, sigma^2),

where r = 1, ..., R = 6 denotes the rabbit and s = 1, ..., S = 6 denotes the spot. The variation eps_rs can be regarded either as within-rabbit variation or as measurement error (probably a combination of the two).
Rabbit means: mu_r = mu + alpha_r
anova table:

  Source    SS    df             MS = SS/df    F
  Between         R - 1 = 5
  Within          R(S - 1) = 30
  Total           RS - 1 = 35

Test for identical rabbit means: F = 4.39 ~ F(5, 30), P = 0.004.
But: we are not interested in these particular 6 rabbits, only in rabbits in general, as a species! We assume these 6 rabbits to have been randomly selected from the species.
We choose to model rabbit variation instead of rabbit levels:

    swelling = grand mean + between-rabbit variation + within-rabbit variation
    y_rs = mu + a_r + eps_rs,

where the a_r's and the eps_rs's are assumed independent and normally distributed with Var(a_r) = omega_B^2 and Var(eps_rs) = sigma_W^2.
The variation between rabbits has been made random. omega_B^2 and sigma_W^2 are variance components, and the model is also called a two-level model.
Fixed vs. random effects?

Fixed:
- all values of the factor are present (typically only a few, e.g. treatment)
- allows inference for these particular factor values only
- must include a reasonable number of observations for each factor value

Random:
- a representative sample of values of the factor is present
- allows inference to be extended beyond the values in the experiment, to the population of possible factor values (e.g. geographical areas, classes, rabbits)
- is necessary when we have a covariate on this level
Interpretation: all observations have common mean and variance:

    y_rs ~ N(mu, omega_B^2 + sigma_W^2)

but measurements made on the same rabbit are correlated, with the intra-class correlation

    Corr(y_r1, y_r2) = rho = omega_B^2 / (omega_B^2 + sigma_W^2)

Measurements made on the same rabbit tend to look more alike than measurements made on different rabbits, and all measurements on the same rabbit look equally much alike. This correlation structure is called compound symmetry (CS) or exchangeability.
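As a quick sketch (in Python rather than the SAS used in these slides, and with assumed illustrative variance components, not the rabbit estimates), the compound-symmetry structure and the intra-class correlation can be computed like this:

```python
import numpy as np

# Assumed illustrative variance components (not the rabbit estimates)
omega2_B = 0.4   # between-rabbit variance
sigma2_W = 0.6   # within-rabbit variance
S = 6            # spots per rabbit

# Intra-class correlation: Corr(y_r1, y_r2) = omega2_B / (omega2_B + sigma2_W)
rho = omega2_B / (omega2_B + sigma2_W)

# Compound-symmetry covariance matrix for the S measurements on one rabbit:
# total variance on the diagonal, omega2_B everywhere off the diagonal
cov = np.full((S, S), omega2_B) + sigma2_W * np.eye(S)
corr = cov / (omega2_B + sigma2_W)   # every off-diagonal correlation equals rho
```

All off-diagonal entries of `corr` are identical, which is exactly the exchangeability property described above.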
Estimation of variance components
The first step is to determine the expected values of the mean squares (in balanced situations):

    E(MS_B) = S * omega_B^2 + sigma_W^2
    E(MS_W) = sigma_W^2

and from this we get the estimates

    sigma_W^2 (hat) = MS_W
    omega_B^2 (hat) = (MS_B - MS_W) / S
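These moment estimators are easy to check numerically. A Python sketch with simulated balanced data (the true component values are arbitrary assumptions chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(2007)
R, S = 6, 6                              # rabbits (groups) and spots per rabbit
mu, omega2_B, sigma2_W = 7.0, 0.4, 0.6   # assumed true values (illustrative)

# Simulate y_rs = mu + a_r + eps_rs
a = rng.normal(0.0, np.sqrt(omega2_B), size=R)
y = mu + a[:, None] + rng.normal(0.0, np.sqrt(sigma2_W), size=(R, S))

# Mean squares from the balanced one-way anova
row_means = y.mean(axis=1)
MS_B = S * np.sum((row_means - y.mean()) ** 2) / (R - 1)
MS_W = np.sum((y - row_means[:, None]) ** 2) / (R * (S - 1))

# Moment estimates: E(MS_B) = S*omega2_B + sigma2_W, E(MS_W) = sigma2_W
sigma2_W_hat = MS_W
omega2_B_hat = max((MS_B - MS_W) / S, 0.0)   # report negative estimates as zero
```

The `max(..., 0.0)` truncation anticipates the note below: the moment estimate of the between-component can come out negative.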
Note: it may happen that the estimate of omega_B^2 becomes negative!
- by coincidence
- as a result of competition between units belonging together, e.g. when measuring yield for plants grown in the same pot

In this case, it will be reported as zero.
Reading in data in SAS:

data rabbit_orig;
  input spot $ y1-y6;
  cards;
a
b
c
d
e
f
;
run;

data rabbit;
  set rabbit_orig;
  rabbit=1; swelling=y1; output;
  rabbit=2; swelling=y2; output;
  rabbit=3; swelling=y3; output;
  rabbit=4; swelling=y4; output;
  rabbit=5; swelling=y5; output;
  rabbit=6; swelling=y6; output;
run;
In SAS, the estimation can be performed as:

proc mixed data=rabbit;
  class rabbit;
  model swelling = / s;
  random rabbit;
run;

Covariance Parameter Estimates

  Cov Parm    Estimate
  rabbit
  Residual

Solution for Fixed Effects

  Effect      Estimate   Standard Error   DF   t Value   Pr > |t|
  Intercept                                              <.0001
Interpretation of variance components:

  Variance component              Estimate   Proportion of variation
  Between   omega_B^2                        %
  Within    sigma_W^2                        %
  Total     omega_B^2 + sigma_W^2            100%

Typical differences (95% prediction intervals):
- for spots on the same rabbit:    +/- 2*sqrt(2*sigma_W^2) = +/- 2.16 cm²
- for spots on different rabbits:  +/- 2*sqrt(2*(omega_B^2 + sigma_W^2)) = +/- 2.70 cm²
Interpretation of the size of the variance components: approx. 2/3 of the variation in the measurements comes from the variation within rabbits.

Maybe there is a systematic difference between the injection spots? Two-way anova:

  Source   DF   Type III SS   Mean Square   F Value   Pr > F
  rabbit
  spot

It does not look as if there is any systematic difference (P = 0.26).
Design considerations
Imaginary experiment with measurements on R rabbits, and S spots for each rabbit:

    Var(ybar) = omega_B^2 / R + sigma_W^2 / (R*S)

[Figure: standard error of the overall mean as a function of S = number of spots (1 to 10), for various numbers of rabbits]
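The design formula can be evaluated directly. A Python sketch with assumed variance components shows the diminishing returns of extra spots: the standard error can never fall below sqrt(omega_B^2 / R), no matter how many spots are measured per rabbit.

```python
import math

omega2_B, sigma2_W = 0.4, 0.6   # assumed illustrative variance components
R = 6                           # number of rabbits

def se_mean(R, S, omega2_B, sigma2_W):
    """Standard error of the overall mean with R rabbits and S spots each."""
    return math.sqrt(omega2_B / R + sigma2_W / (R * S))

ses = [se_mean(R, S, omega2_B, sigma2_W) for S in range(1, 11)]
floor = math.sqrt(omega2_B / R)   # limit as S grows without bound
```

Adding rabbits shrinks both terms; adding spots only shrinks the second.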
Effective sample size
If we had only one observation for each of k rabbits, how many would we need to obtain the same precision?

    k = R*S / (1 + rho*(S - 1))

We have here rho = omega_B^2 / (omega_B^2 + sigma_W^2), giving k = 12.8.
Effectively, we have only approximately two independent observations from each rabbit!
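A sketch of this computation in Python. The value rho = 0.3625 used here is an assumption, chosen so that the rabbit numbers R = S = 6 reproduce k = 12.8 as on the slide:

```python
def effective_sample_size(R, S, rho):
    """Number of independent singleton observations giving the same precision
    as R clusters with S correlated measurements each (intra-class corr. rho)."""
    return R * S / (1 + rho * (S - 1))

k = effective_sample_size(6, 6, 0.3625)   # approx. 12.8, i.e. about 2 per rabbit
```

Note the two extremes: with rho = 0 all 36 measurements are independent (k = 36), while with rho = 1 each rabbit contributes only one effective observation (k = 6).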
Quantification of overall swelling

  method               estimate (s.e.)
  1: forget rabbit     (0.155)
  2: fixed rabbit      (0.127)
  3: rabbit averages   (0.267)
  4: random rabbit     (0.267)

1. We pool all 36 measurements, mixing up the two variance components and assuming independence.
2. We estimate the mean swelling of exactly these 6 rabbits (using only within-rabbit variation).
3. We only look at averages for each rabbit (ecological analysis).
4. We estimate the mean swelling of rabbits as a species (the correct approach).
Estimation of individual rabbit means:
- Simple averages rely on the individual rabbit's own measurements only: ybar_r.
- BLUPs (or EBLUPs, estimated best linear unbiased predictors) rely on the assumption that the individuals come from the same population, and become weighted averages:

    [omega_B^2 / (omega_B^2 + sigma_W^2/S)] * ybar_r. + [(sigma_W^2/S) / (omega_B^2 + sigma_W^2/S)] * ybar..

which have been shrunk towards the overall mean ybar..
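A sketch of this shrinkage formula in Python; the numbers are made up for illustration and are not the rabbit estimates:

```python
def eblup(ybar_r, ybar_all, omega2_B, sigma2_W, S):
    """Shrink a rabbit's own average towards the overall mean. The weight on
    the rabbit's own average grows with omega2_B and with the number of
    measurements S, so rabbits with fewer observations are shrunk more."""
    w = omega2_B / (omega2_B + sigma2_W / S)
    return w * ybar_r + (1 - w) * ybar_all

# Hypothetical example: rabbit average 8.0, overall mean 6.0
shrunk = eblup(8.0, 6.0, omega2_B=0.4, sigma2_W=0.6, S=6)   # weight w = 0.8
```

This directly explains the reduced-data example below: a rabbit with only 3 remaining measurements gets a smaller weight on its own average.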
When the 3 smallest measurements from rabbit 2 (the rabbit with the largest level) are omitted, the results become:

  method                             estimate (s.e.)
  1: forget rabbit                   (0.163)
  2: fixed rabbit                    (0.136)
  3a: rabbit averages (weighted)     (0.265)
  3b: rabbit averages (unweighted)   (0.333)
  4: random rabbit                   (0.298)
  reference                          (0.267)

1: we have omitted some of the largest observations
2+3a: rabbit 2 has a lower weight in the average, due to only 3 observations
3b: the average for rabbit 2 has increased
4: rabbit 2 has a lower weight in the average, due to a larger standard error
[Figure: EBLUPs for the reduced data set]
Larger shrinkage than before for rabbit no. 2.
Confidence limits for the variance components:
- intra-individual variation sigma_W^2:   < sigma_W^2 <
- inter-individual variation omega_B^2:   < omega_B^2 < 2.48

So, we should take care not to over-interpret...
We imagine that the rabbits are grouped in two (grp=1,2):

proc mixed data=rabbit;
  class grp rabbit;
  model swelling = grp / s;
  random rabbit(grp);
run;

Covariance Parameter Estimates

  Cov Parm      Estimate
  rabbit(grp)             <- this changes
  Residual                <- this stays the same

Solution for Fixed Effects

  Effect      grp   Estimate   Standard Error   DF   t Value   Pr > |t|
  Intercept                                                    <.0001
  grp         1
  grp         2
Such a comparison cannot be performed in the usual way (ignoring the rabbits), since we would then perform the comparison/test against a wrong variation. Type 1 error will occur!

proc glm data=rabbit;
  class grp;
  model swelling = grp / solution;
run;

  Parameter    Estimate   T for H0: Parameter=0   Pr > |T|   Std Error of Estimate
  INTERCEPT       B
  GRP             B
                  B   .    .    .
Two-level model:

  level   unit          variation         covariates
  1       rabbit*spot   within rabbit     spot
  2       rabbit        between rabbits   group, overall mean

When the random rabbit variation is ignored:
- low efficiency (type 2 error) for evaluation of level 1 covariates (spot)
- too small standard errors (type 1 error) for estimates of level 2 effects (group, overall mean)
Factor diagrams:
- in the traditional one-way anova:  [I] = [R*S] -> [R] -> 0
- in case of grouping:               [I] = [R*S] -> [R] -> G -> 0

We have here used the notation: arrows indicate simplifications/groupings, and [ ] denotes the random effects, corresponding to variance components on the various levels.
Example: number of nuclei per cell in the rat pancreas, used for the evaluation of cytostatica (Henrik Winther Nielsen, Inst. Med. Anat.):
- 4 rats (R)
- 3 sections for each rat (S)
- 5 randomly chosen fields from each section (F)

3-level model, with variance components:

  fields     sigma^2
  sections   tau^2
  rats       omega^2

Factor diagram:  [I] = [R*S*F] -> [R*S] -> [R] -> 0
Covariance Parameter Estimates

  Cov Parm       Estimate
  rat
  section(rat)
  Residual

Solution for Fixed Effects

  Effect      Estimate   Standard Error   DF   t Value   Pr > |t|
  Intercept
Estimation of variance components:

  Variance component                    Estimate   Proportion of variation
  Rats      omega^2                                %
  Sections  tau^2                                  %
  Fields    sigma^2                                %
  Total     omega^2 + tau^2 + sigma^2              100%
Typical differences:
- for sections on different rats:            +/- 2*sqrt(2*(omega^2 + tau^2 + sigma^2)) = +/- 1.319
- for different sections on the same rat:    +/- 2*sqrt(2*(tau^2 + sigma^2)) = +/- 1.264
- for different fields on the same section:  +/- 2*sqrt(2*sigma^2) = +/- 1.255
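The pattern behind these numbers is that a 95% prediction interval for the difference between two measurements is about +/- 2 times the standard deviation of the difference, and that difference only involves the variance components NOT shared by the two measurements. A Python sketch (the component values below are assumptions chosen for illustration, not the rat estimates):

```python
import math

def typical_diff(*components):
    """95% prediction half-width for the difference between two measurements:
    the difference has variance 2 * (sum of the variance components in which
    the two measurements differ)."""
    return 2 * math.sqrt(2 * sum(components))

# Assumed illustrative 3-level components: rats, sections, fields
omega2, tau2, sigma2 = 0.010, 0.002, 0.195

d_rats = typical_diff(omega2, tau2, sigma2)   # different rats: all three differ
d_sections = typical_diff(tau2, sigma2)       # same rat, different sections
d_fields = typical_diff(sigma2)               # same section, different fields
```

The ordering d_rats > d_sections > d_fields always holds, since each step adds one more shared level.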
The correlation between two measurements on the same rat becomes:
- if they are measured on the same section:

    Corr(y_rs1, y_rs2) = (omega^2 + tau^2) / (omega^2 + tau^2 + sigma^2)

- if they are measured on different sections:

    Corr(y_r11, y_r22) = omega^2 / (omega^2 + tau^2 + sigma^2) = 0.082
Examples of hierarchies:

  level 1 (individual)   level 2 (cluster)   level 3
  spots                  rabbits
  fields                 sections            rats
  subjects               twin pairs          countries
  subjects               families            regions
  students               classes             schools
  visits                 subjects            centres

On all levels, we may have random variation (random effects or variance components), as well as covariates.
Average profiles
Example: 2 groups of dogs (5 resp. 6 dogs). Outcome: osmolality, measured at 4 different times (with treatments along the way).
Do we have repetitions?
Residual plot (after a suitable analysis): we see a clear trumpet shape, corresponding to the fact that dogs with a high level also vary more than dogs with a low level. Solution: make a logarithmic transformation!
Profiles on logarithmic scale, with corresponding residual plot.
Two-level model:

  level   unit       variation      covariates
  1       dog*time   within dogs    grp*time, time
  2       dog        between dogs   group, overall mean

proc mixed data=dogs;
  class grp time dog;
  model losmol = grp time grp*time / outpm=fit1 ddfm=satterth;
  random dog(grp);
run;
Class Level Information

  Class   Levels   Values
  grp
  time
  dog

Covariance Parameter Estimates

  Cov Parm   Estimate
  dog(grp)
  Residual

P = 0.08 for the test of interaction, i.e. no convincing indication of this.

Type 3 Tests of Fixed Effects

  Effect     Num DF   Den DF   F Value   Pr > F
  grp
  time                                   <.0001
  grp*time
When there is no interaction, we simply omit the term from the model (but we could also just use averages, since the design is balanced):

proc mixed covtest data=dogs;
  class grp time dog;
  model losmol = grp time / outpm=fit2 ddfm=satterth s;
  random dog(grp);
run;

Covariance Parameter Estimates

  Cov Parm   Estimate   Standard Error   Z Value   Pr Z
  dog(grp)
  Residual                                         <.0001
Solution for Fixed Effects

  Effect      grp   time   Estimate   Standard Error   DF   t Value   Pr > |t|
  Intercept
  grp         1
  grp         2
  time              1
  time              2
  time              3                                                 <.0001
  time              4

Type 3 Tests of Fixed Effects

  Effect   Num DF   Den DF   F Value   Pr > F
  grp
  time                                 <.0001
In contrast, if we forget the random dog effect and perform a traditional two-way anova:

proc glm data=dogs;
  class grp time;
  model losmol = grp time / solution;
run;

The GLM Procedure
Dependent Variable: losmol

  Source   DF   Type III SS   Mean Square   F Value   Pr > F
  grp
  time
  Parameter    Estimate   Standard Error   t Value   Pr > |t|
  Intercept       B                                  <.0001
  grp 1           B
  grp 2           B   .    .    .
  time 1          B
  time 2          B
  time 3          B
  time 4          B   .    .    .

- Type 2 error for the effect of time (level 1 covariate): time is evaluated in an unpaired fashion
- Type 1 error for the effect of grp (level 2 covariate): we think we have more information than we actually have (we disregard the correlation)
Factor diagram:

  [I] = [Dog*Time] -> Grp*Time -> Time -> 0
             |           |
           [Dog]  ->    Grp

We note the following:
- the effect of GRP*TIME is evaluated against DOG*TIME
- if GRP*TIME is not considered significant, we thereafter evaluate
  - TIME against DOG*TIME
  - GRP against DOG, also called DOG(GRP)
Interpretation of group effect: the estimated group difference is 0.279 (0.166), corresponding to a 95% confidence interval of (-0.053, 0.611). But this is on a logarithmic scale! We perform a back-transformation with the exponential function and may conclude that group 1 lies exp(0.279) = 1.321 times higher than group 2, i.e. 32.1% higher. The 95% confidence interval is (exp(-0.053), exp(0.611)) = (0.948, 1.842).
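This back-transformation is easily reproduced (Python sketch, using the estimate and standard error from the slide):

```python
import math

est, se = 0.279, 0.166               # group difference on the log scale
lo, hi = est - 2 * se, est + 2 * se  # approx. 95% CI on the log scale

ratio = math.exp(est)                # multiplicative group 1 / group 2 ratio
ci = (math.exp(lo), math.exp(hi))    # CI for the ratio
```

Note that exponentiating the interval endpoints gives an interval for the ratio, but the resulting interval is no longer symmetric around the estimate.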
Example of a non-hierarchical model
Visual acuity: time in msec from a stimulus (light flash) to the electrical response at the back of the cortex, measured for
- 7 individuals (patient)
- 2 eyes for each individual (eye)
- 4 lens magnifications (power) for each eye

Crowder & Hand (1990)
Predictors of visual acuity:
- Main effects: systematic (mean value): eye, power; random: patient
- Interactions: systematic (mean value): eye*power; random: patient*eye, patient*power

Example of crossed random factors.
proc mixed data=visual;
  class patient eye power;
  model acuity = eye power eye*power / ddfm=satterth;
  random patient patient*eye patient*power;
run;

Covariance Parameter Estimates

  Cov Parm        Estimate
  patient
  patient*eye
  patient*power
  Residual

Type 3 Tests of Fixed Effects

  Effect      Num DF   Den DF   F Value   Pr > F
  eye
  power
  eye*power
Factor diagram:

  [I] = [Pa*Ey*Po], with
  [Pa*Ey*Po] -> Ey*Po, [Pa*Ey], [Pa*Po]
  Ey*Po -> Ey, Po;   [Pa*Ey] -> [Pa], Ey;   [Pa*Po] -> [Pa], Po
  [Pa] -> 0
  level   unit                  covariates
  1       single measurements   Ey*Po
  2       interactions:
  2A        [Pa*Ey]             Ey
  2B        [Pa*Po]             Po
  3       individuals, [Pa]     overall level
Blood pressure and social inequity
Women in 17 regions of Malmö. Covariates:
- Individual (level 1): low educational achievement (x1) (less than 9 years of school); age group (x2)
- Regional (level 2): rate of people with low educational achievement (z1), from the Skåne Council Statistics Office
Ecological analysis = level 2 analysis (analysis of regional averages):
- Y: average blood pressure in residential area
- Z1: rate of people with low educational achievement

Estimate of regression coefficient: 4.655 (1.420). It seems to be an important explanatory variable!?
[Figure: size of circle indicates size of investigation; obvious effect of the aggregated covariate]
Estimates from variance component model:

  model          x1 (individual     sigma_W^2 (between   z1 (rate of       omega_B^2 (between
                 low education)     individuals)         low education)    regions)
  none
  age
  x1, age        (0.170)
  z1, age                                                (1.345)
  x1, z1, age    (0.167)                                 (1.250)           0.087
We note the following:
- Region as a random effect could only account for 0.4% of the variation in blood pressure!
- The ecological variable (rate of low income) will have very little impact!
- The ecological analysis sums up the two effects, but is not able to distinguish between them:
  - it overestimates the level 2 effect
  - it cannot be interpreted as a level 1 effect
Covariate effects on level 1 and level 2 can be very different. Example: reading ability, as a function of age and cohort.
Misspecification and its results:

  missing random effect:
    - type 2 error for x (unpaired)
    - type 1 error for z (too many df's, wrong variation)
  missing z:
    - estimate of omega_B^2 too big
    - estimate of sigma_W^2 perhaps too big (in unbalanced designs)
  missing x:
    - estimate of omega_B^2 too big or too small
    - estimate of sigma_W^2 too big
Simulated data: random effect of individual. [Figure: y plotted against individual p]
Estimates:

  Level   Variation             standard deviation
  1       within individuals    sigma_W (hat) =
  2       between individuals   omega_B (hat) = 1.23
Individual (level 1) covariate x, e.g. time/age. [Figure: y plotted against x]
Estimates:

  Level   Variation             standard deviation       regression coefficient
  1       within individuals    sigma_W (hat) = 0.41     beta_x (hat) = 1.028 (0.046)
  2       between individuals   omega_B (hat) =
Addition of a level 2 covariate z, e.g. age. [Figure: y plotted against x]
Estimates:

  Level   Variation             standard deviation       regression coefficient
  1       within individuals    sigma_W (hat) = 0.41     beta_x (hat) = 1.033 (0.046)
  2       between individuals   omega_B (hat) = 1.14     beta_z (hat) = 1.316 (0.206)
Comparison of estimates:

  Model   within individual:            between individual:
          beta_x (hat)     sigma_W      beta_z (hat)     omega_B
  x       1.028 (0.046)
  z                                     (0.201)          1.03
  x, z    1.033 (0.046)                 1.316 (0.206)    1.14
Example: suicide and religion
Ecological analysis of regions: % suicides increases with % Protestants, i.e. Protestants are more likely to commit suicide. Or??
  level   unit          variation                    covariates
  1       individuals   within region, sigma_W^2     religion, x
  2       regions       between regions, omega_B^2   % Protestants, z

True explanation: interaction between the individual effect (x) and the region covariate (z): more suicides among Catholics in regions with many Protestants.
Outline Analysis of Variance Analysis of variance and regression course http://staff.pubhealth.ku.dk/~lts/regression10_2/index.html Comparison of serveral groups Model checking Marc Andersen, mja@statgroup.dk
More informationVariance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression.
10/3/011 Functional Connectivity Correlation and Regression Variance VAR = Standard deviation Standard deviation SD = Unbiased SD = 1 10/3/011 Standard error Confidence interval SE = CI = = t value for
More informationIntroduction to the Analysis of Hierarchical and Longitudinal Data
Introduction to the Analysis of Hierarchical and Longitudinal Data Georges Monette, York University with Ye Sun SPIDA June 7, 2004 1 Graphical overview of selected concepts Nature of hierarchical models
More informationLecture 1 Introduction to Multi-level Models
Lecture 1 Introduction to Multi-level Models Course Website: http://www.biostat.jhsph.edu/~ejohnson/multilevel.htm All lecture materials extracted and further developed from the Multilevel Model course
More informationRandom Intercept Models
Random Intercept Models Edps/Psych/Soc 589 Carolyn J. Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Spring 2019 Outline A very simple case of a random intercept
More informationOutline. Analysis of Variance. Comparison of 2 or more groups. Acknowledgements. Comparison of serveral groups
Outline Analysis of Variance Analysis of variance and regression course http://staff.pubhealth.ku.dk/~jufo/varianceregressionf2011.html Comparison of serveral groups Model checking Marc Andersen, mja@statgroup.dk
More informationIntroduction to Within-Person Analysis and RM ANOVA
Introduction to Within-Person Analysis and RM ANOVA Today s Class: From between-person to within-person ANOVAs for longitudinal data Variance model comparisons using 2 LL CLP 944: Lecture 3 1 The Two Sides
More informationSTAT 5200 Handout #23. Repeated Measures Example (Ch. 16)
Motivating Example: Glucose STAT 500 Handout #3 Repeated Measures Example (Ch. 16) An experiment is conducted to evaluate the effects of three diets on the serum glucose levels of human subjects. Twelve
More informationProfile Analysis Multivariate Regression
Lecture 8 October 12, 2005 Analysis Lecture #8-10/12/2005 Slide 1 of 68 Today s Lecture Profile analysis Today s Lecture Schedule : regression review multiple regression is due Thursday, October 27th,
More informationSimple Linear Regression
Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.
More informationANOVA approaches to Repeated Measures. repeated measures MANOVA (chapter 3)
ANOVA approaches to Repeated Measures univariate repeated-measures ANOVA (chapter 2) repeated measures MANOVA (chapter 3) Assumptions Interval measurement and normally distributed errors (homogeneous across
More informationAnalysis of Variance
1 / 70 Analysis of Variance Analysis of variance and regression course http://staff.pubhealth.ku.dk/~lts/regression11_2 Marc Andersen, mja@statgroup.dk Analysis of variance and regression for health researchers,
More informationModels for binary data
Faculty of Health Sciences Models for binary data Analysis of repeated measurements 2015 Julie Lyng Forman & Lene Theil Skovgaard Department of Biostatistics, University of Copenhagen 1 / 63 Program for
More information3 Multiple Linear Regression
3 Multiple Linear Regression 3.1 The Model Essentially, all models are wrong, but some are useful. Quote by George E.P. Box. Models are supposed to be exact descriptions of the population, but that is
More informationwith the usual assumptions about the error term. The two values of X 1 X 2 0 1
Sample questions 1. A researcher is investigating the effects of two factors, X 1 and X 2, each at 2 levels, on a response variable Y. A balanced two-factor factorial design is used with 1 replicate. The
More informationReview of CLDP 944: Multilevel Models for Longitudinal Data
Review of CLDP 944: Multilevel Models for Longitudinal Data Topics: Review of general MLM concepts and terminology Model comparisons and significance testing Fixed and random effects of time Significance
More informationWU Weiterbildung. Linear Mixed Models
Linear Mixed Effects Models WU Weiterbildung SLIDE 1 Outline 1 Estimation: ML vs. REML 2 Special Models On Two Levels Mixed ANOVA Or Random ANOVA Random Intercept Model Random Coefficients Model Intercept-and-Slopes-as-Outcomes
More informationIntroduction to Random Effects of Time and Model Estimation
Introduction to Random Effects of Time and Model Estimation Today s Class: The Big Picture Multilevel model notation Fixed vs. random effects of time Random intercept vs. random slope models How MLM =
More informationCh 2: Simple Linear Regression
Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component
More informationStatistics for exp. medical researchers Comparison of groups, T-tests and ANOVA
Faculty of Health Sciences Outline Statistics for exp. medical researchers Comparison of groups, T-tests and ANOVA Lene Theil Skovgaard Sept. 14, 2015 Paired comparisons: tests and confidence intervals
More information36-309/749 Experimental Design for Behavioral and Social Sciences. Dec 1, 2015 Lecture 11: Mixed Models (HLMs)
36-309/749 Experimental Design for Behavioral and Social Sciences Dec 1, 2015 Lecture 11: Mixed Models (HLMs) Independent Errors Assumption An error is the deviation of an individual observed outcome (DV)
More informationBayesian Linear Regression
Bayesian Linear Regression Sudipto Banerjee 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. September 15, 2010 1 Linear regression models: a Bayesian perspective
More informationIntroducing Generalized Linear Models: Logistic Regression
Ron Heck, Summer 2012 Seminars 1 Multilevel Regression Models and Their Applications Seminar Introducing Generalized Linear Models: Logistic Regression The generalized linear model (GLM) represents and
More informationDay 4: Shrinkage Estimators
Day 4: Shrinkage Estimators Kenneth Benoit Data Mining and Statistical Learning March 9, 2015 n versus p (aka k) Classical regression framework: n > p. Without this inequality, the OLS coefficients have
More informationECO220Y Simple Regression: Testing the Slope
ECO220Y Simple Regression: Testing the Slope Readings: Chapter 18 (Sections 18.3-18.5) Winter 2012 Lecture 19 (Winter 2012) Simple Regression Lecture 19 1 / 32 Simple Regression Model y i = β 0 + β 1 x
More informationRon Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011)
Ron Heck, Fall 2011 1 EDEP 768E: Seminar in Multilevel Modeling rev. January 3, 2012 (see footnote) Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October
More informationAnalysis of variance and regression. November 22, 2007
Analysis of variance and regression November 22, 2007 Parametrisations: Choice of parameters Comparison of models Test for linearity Linear splines Lene Theil Skovgaard, Dept. of Biostatistics, Institute
More informationBiostatistics Workshop Longitudinal Data Analysis. Session 4 GARRETT FITZMAURICE
Biostatistics Workshop 2008 Longitudinal Data Analysis Session 4 GARRETT FITZMAURICE Harvard University 1 LINEAR MIXED EFFECTS MODELS Motivating Example: Influence of Menarche on Changes in Body Fat Prospective
More informationChapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression
BSTT523: Kutner et al., Chapter 1 1 Chapter 1: Linear Regression with One Predictor Variable also known as: Simple Linear Regression Bivariate Linear Regression Introduction: Functional relation between
More informationSTAT 525 Fall Final exam. Tuesday December 14, 2010
STAT 525 Fall 2010 Final exam Tuesday December 14, 2010 Time: 2 hours Name (please print): Show all your work and calculations. Partial credit will be given for work that is partially correct. Points will
More informationLinear models Analysis of Covariance
Esben Budtz-Jørgensen April 22, 2008 Linear models Analysis of Covariance Confounding Interactions Parameterizations Analysis of Covariance group comparisons can become biased if an important predictor
More informationSimple linear regression
Simple linear regression Biometry 755 Spring 2008 Simple linear regression p. 1/40 Overview of regression analysis Evaluate relationship between one or more independent variables (X 1,...,X k ) and a single
More informationLinear models Analysis of Covariance
Esben Budtz-Jørgensen November 20, 2007 Linear models Analysis of Covariance Confounding Interactions Parameterizations Analysis of Covariance group comparisons can become biased if an important predictor
More informationChapter 9. Multivariate and Within-cases Analysis. 9.1 Multivariate Analysis of Variance
Chapter 9 Multivariate and Within-cases Analysis 9.1 Multivariate Analysis of Variance Multivariate means more than one response variable at once. Why do it? Primarily because if you do parallel analyses
More informationANCOVA. ANCOVA allows the inclusion of a 3rd source of variation into the F-formula (called the covariate) and changes the F-formula
ANCOVA Workings of ANOVA & ANCOVA ANCOVA, Semi-Partial correlations, statistical control Using model plotting to think about ANCOVA & Statistical control You know how ANOVA works the total variation among
More informationCorrelated data. Non-normal outcomes. Reminder on binary data. Non-normal data. Faculty of Health Sciences. Non-normal outcomes
Faculty of Health Sciences Non-normal outcomes Correlated data Non-normal outcomes Lene Theil Skovgaard December 5, 2014 Generalized linear models Generalized linear mixed models Population average models
More informationCorrelation and Linear Regression
Correlation and Linear Regression Correlation: Relationships between Variables So far, nearly all of our discussion of inferential statistics has focused on testing for differences between group means
More informationInteractions. Interactions. Lectures 1 & 2. Linear Relationships. y = a + bx. Slope. Intercept
Interactions Lectures 1 & Regression Sometimes two variables appear related: > smoking and lung cancers > height and weight > years of education and income > engine size and gas mileage > GMAT scores and
More informationSerial Correlation. Edps/Psych/Stat 587. Carolyn J. Anderson. Fall Department of Educational Psychology
Serial Correlation Edps/Psych/Stat 587 Carolyn J. Anderson Department of Educational Psychology c Board of Trustees, University of Illinois Fall 017 Model for Level 1 Residuals There are three sources
More informationRandom Effects. Edps/Psych/Stat 587. Carolyn J. Anderson. Fall Department of Educational Psychology. university of illinois at urbana-champaign
Random Effects Edps/Psych/Stat 587 Carolyn J. Anderson Department of Educational Psychology I L L I N O I S university of illinois at urbana-champaign Fall 2012 Outline Introduction Empirical Bayes inference
More informationBiostatistics Quantitative Data
Biostatistics Quantitative Data Descriptive Statistics Statistical Models One-sample and Two-Sample Tests Introduction to SAS-ANALYST T- and Rank-Tests using ANALYST Thomas Scheike Quantitative Data This
More informationFaculty of Health Sciences. Correlated data. More about LMMs. Lene Theil Skovgaard. December 4, / 104
Faculty of Health Sciences Correlated data More about LMMs Lene Theil Skovgaard December 4, 2015 1 / 104 Further topics Model check and diagnostics Cross-over studies Paired T-tests with missing values
More informationGeneral Principles Within-Cases Factors Only Within and Between. Within Cases ANOVA. Part One
Within Cases ANOVA Part One 1 / 25 Within Cases A case contributes a DV value for every value of a categorical IV It is natural to expect data from the same case to be correlated - NOT independent For
More informationLecture 14: Introduction to Poisson Regression
Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why
More informationModelling counts. Lecture 14: Introduction to Poisson Regression. Overview
Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week
More informationUNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator
UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages
More informationSection 3: Simple Linear Regression
Section 3: Simple Linear Regression Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction
More informationIntroduction to Crossover Trials
Introduction to Crossover Trials Stat 6500 Tutorial Project Isaac Blackhurst A crossover trial is a type of randomized control trial. It has advantages over other designed experiments because, under certain
More information6. Multiple regression - PROC GLM
Use of SAS - November 2016 6. Multiple regression - PROC GLM Karl Bang Christensen Department of Biostatistics, University of Copenhagen. http://biostat.ku.dk/~kach/sas2016/ kach@biostat.ku.dk, tel: 35327491
More information2.1 Linear regression with matrices
21 Linear regression with matrices The values of the independent variables are united into the matrix X (design matrix), the values of the outcome and the coefficient are represented by the vectors Y and
More informationCorrelation and Simple Linear Regression
Correlation and Simple Linear Regression Sasivimol Rattanasiri, Ph.D Section for Clinical Epidemiology and Biostatistics Ramathibodi Hospital, Mahidol University E-mail: sasivimol.rat@mahidol.ac.th 1 Outline
More informationLab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p )
Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p. 376-390) BIO656 2009 Goal: To see if a major health-care reform which took place in 1997 in Germany was
More informationLecture 10 Multiple Linear Regression
Lecture 10 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 10-1 Topic Overview Multiple Linear Regression Model 10-2 Data for Multiple Regression Y i is the response variable
More informationNonlinear regression. Nonlinear regression analysis. Polynomial regression. How can we model non-linear effects?
Nonlinear regression Nonlinear regression analysis Peter Dalgaard (orig. Lene Theil Skovgaard) Department of Biostatistics University of Copenhagen Simple kinetic model Compartment models Michaelis Menten
More informationReview of Multiple Regression
Ronald H. Heck 1 Let s begin with a little review of multiple regression this week. Linear models [e.g., correlation, t-tests, analysis of variance (ANOVA), multiple regression, path analysis, multivariate
More informationCorrelation and Regression Bangkok, 14-18, Sept. 2015
Analysing and Understanding Learning Assessment for Evidence-based Policy Making Correlation and Regression Bangkok, 14-18, Sept. 2015 Australian Council for Educational Research Correlation The strength
More informationThe Application and Promise of Hierarchical Linear Modeling (HLM) in Studying First-Year Student Programs
The Application and Promise of Hierarchical Linear Modeling (HLM) in Studying First-Year Student Programs Chad S. Briggs, Kathie Lorentz & Eric Davis Education & Outreach University Housing Southern Illinois
More informationUnbalanced Designs & Quasi F-Ratios
Unbalanced Designs & Quasi F-Ratios ANOVA for unequal n s, pooled variances, & other useful tools Unequal nʼs Focus (so far) on Balanced Designs Equal n s in groups (CR-p and CRF-pq) Observation in every
More information4 Multiple Linear Regression
4 Multiple Linear Regression 4. The Model Definition 4.. random variable Y fits a Multiple Linear Regression Model, iff there exist β, β,..., β k R so that for all (x, x 2,..., x k ) R k where ε N (, σ
More informationDifferences of Least Squares Means
STAT:5201 Homework 9 Solutions 1. We have a model with two crossed random factors operator and machine. There are 4 operators, 8 machines, and 3 observations from each operator/machine combination. (a)
More information