Random and Mixed Effects Models - Part II

Similar documents
Random and Mixed Effects Models - Part III

Outline. Mixed models in R using the lme4 package Part 3: Longitudinal data. Sleep deprivation data. Simple longitudinal data

The factors in higher-way ANOVAs can again be considered fixed or random, depending on the context of the study. For each factor:

Statistics 512: Applied Linear Models. Topic 9

Stat 579: Generalized Linear Models and Extensions

20. REML Estimation of Variance Components. Copyright c 2018 (Iowa State University) 20. Statistics / 36

Workshop 9.3a: Randomized block designs

A Handbook of Statistical Analyses Using R. Brian S. Everitt and Torsten Hothorn

Topic 32: Two-Way Mixed Effects Model

A brief introduction to mixed models

Homework 3 - Solution

lme4 Luke Chang Last Revised July 16, Fitting Linear Mixed Models with a Varying Intercept

Unbalanced Designs Mechanics. Estimate of σ 2 becomes weighted average of treatment combination sample variances.

Mixed effects models

Introduction and Background to Multilevel Analysis

Hierarchical Linear Models (HLM) Using R Package nlme. Interpretation. 2 = ( x 2) u 0j. e ij

Mixed models in R using the lme4 package Part 2: Longitudinal data, modeling interactions

Correlated Data: Linear Mixed Models with Random Intercepts

Value Added Modeling

A Handbook of Statistical Analyses Using R 2nd Edition. Brian S. Everitt and Torsten Hothorn

A Handbook of Statistical Analyses Using R 2nd Edition. Brian S. Everitt and Torsten Hothorn

Outline for today. Two-way analysis of variance with random effects

Hierarchical Linear Models

General Linear Statistical Models

Outline. Example and Model ANOVA table F tests Pairwise treatment comparisons with LSD Sample and subsample size determination

Example: Poisondata. 22s:152 Applied Linear Regression. Chapter 8: ANOVA

STAT 526 Spring Midterm 1. Wednesday February 2, 2011

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

Unit 7: Random Effects, Subsampling, Nested and Crossed Factor Designs

Latent random variables

Outline. Statistical inference for linear mixed models. One-way ANOVA in matrix-vector form

TWO OR MORE RANDOM EFFECTS. The two-way complete model for two random effects:

Theorem A: Expectations of Sums of Squares Under the two-way ANOVA model, E(X i X) 2 = (µ i µ) 2 + n 1 n σ2

Mixed Model Theory, Part I

over Time line for the means). Specifically, & covariances) just a fixed variance instead. PROC MIXED: to 1000 is default) list models with TYPE=VC */

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model

Subject-specific observed profiles of log(fev1) vs age First 50 subjects in Six Cities Study

Using R formulae to test for main effects in the presence of higher-order interactions

A Re-Introduction to General Linear Models (GLM)

PROBLEM TWO (ALKALOID CONCENTRATIONS IN TEA) 1. Statistical Design

Non-Gaussian Response Variables

#Alternatively we could fit a model where the rail values are levels of a factor with fixed effects

Analysis of Variance and Design of Experiments-I

SAS Syntax and Output for Data Manipulation: CLDP 944 Example 3a page 1

36-463/663: Hierarchical Linear Models

Intruction to General and Generalized Linear Models

Time-Invariant Predictors in Longitudinal Models

Exploring Hierarchical Linear Mixed Models

Coping with Additional Sources of Variation: ANCOVA and Random Effects

Time-Invariant Predictors in Longitudinal Models

Stat 579: Generalized Linear Models and Extensions

Categorical Predictor Variables

Mixed models in R using the lme4 package Part 7: Generalized linear mixed models

Aedes egg laying behavior Erika Mudrak, CSCU November 7, 2018

STAT 526 Advanced Statistical Methodology

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models

Time Invariant Predictors in Longitudinal Models

Simple Linear Regression

Ch 2: Simple Linear Regression

Lecture 22 Mixed Effects Models III Nested designs

Time-Invariant Predictors in Longitudinal Models

Introduction to Linear Mixed Models: Modeling continuous longitudinal outcomes

Topic 20: Single Factor Analysis of Variance

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models

Mixed models with correlated measurement errors

Introduction to Within-Person Analysis and RM ANOVA

22s:152 Applied Linear Regression. Take random samples from each of m populations.

Analysis of Variance

R Output for Linear Models using functions lm(), gls() & glm()

WITHIN-PARTICIPANT EXPERIMENTAL DESIGNS

Generalized Linear Mixed-Effects Models. Copyright c 2015 Dan Nettleton (Iowa State University) Statistics / 58

Outline. Mixed models in R using the lme4 package Part 5: Generalized linear mixed models. Parts of LMMs carried over to GLMMs

Multiple Predictor Variables: ANOVA

These slides illustrate a few example R commands that can be useful for the analysis of repeated measures data.

HW 2 due March 6 random effects and mixed effects models ELM Ch. 8 R Studio Cheatsheets In the News: homeopathic vaccines

Stat 209 Lab: Linear Mixed Models in R This lab covers the Linear Mixed Models tutorial by John Fox. Lab prepared by Karen Kapur. ɛ i Normal(0, σ 2 )

STAT 525 Fall Final exam. Tuesday December 14, 2010

BIO5312 Biostatistics Lecture 13: Maximum Likelihood Estimation

Longitudinal Data Analysis

Outline for today. Maximum likelihood estimation. Computation with multivariate normal distributions. Multivariate normal distribution

Mixed models in R using the lme4 package Part 3: Linear mixed models with simple, scalar random effects

Formal Statement of Simple Linear Regression Model

Stat 5303 (Oehlert): Randomized Complete Blocks 1

Lecture 4. Random Effects in Completely Randomized Design

Lecture 6: Linear Regression (continued)

Research Design: Topic 18 Hierarchical Linear Modeling (Measures within Persons) 2010 R.C. Gardner, Ph.d.

13. October p. 1

Regression: Main Ideas Setting: Quantitative outcome with a quantitative explanatory variable. Example, cont.

Hierarchical Random Effects

Outline Topic 21 - Two Factor ANOVA

Section Least Squares Regression

A (Brief) Introduction to Crossed Random Effects Models for Repeated Measures Data

Two-factor studies. STAT 525 Chapter 19 and 20. Professor Olga Vitek

Answer to exercise: Blood pressure lowering drugs

The linear mixed model: introduction and the basic model

Power analysis examples using R

Statistical Analysis of Hierarchical Data. David Zucker Hebrew University, Jerusalem, Israel

36-309/749 Experimental Design for Behavioral and Social Sciences. Sep. 22, 2015 Lecture 4: Linear Regression

Inference for Regression Inference about the Regression Model and Using the Regression Line

Transcription:

Random and Mixed Effects Models - Part II Statistics 149 Spring 2006 Copyright 2006 by Mark E. Irwin

Two-Factor Random Effects Model Example: Miles per Gallon (Neter, Kutner, Nachtsheim, & Wasserman, problem 24.15) An automobile manufacturer studied the effects of differences between drivers (Driver) and cars (Car) on gasoline consumption. Four drivers and 5 cars of the same model were both selected at random. Each driver drove each car twice over a 40 mile test course and the miles per gallon (MPG) were recorded. Miles per Gallon 26 28 30 32 34 36 In this example, it seems reasonable to treat driver and car as random effects. 1 2 3 4 5 Car Two-Factor Random Effects Model 1

One possible model for this data is y ijk = µ + α i + β j + (αβ) ij + ɛ ijk α i iid N(0, σ 2 α) β j iid N(0, σ 2 β) (αβ) ij iid N(0, σ 2 αβ) ɛ ijk iid N(0, σ 2 ) For what follows, lets assume a balance design, with a levels for factor A (Driver), b levels for factor B (Car), and m observations for each (i, j) combination. (Of course you could have an unbalanced designed, however the following formulas get really ugly.) Two-Factor Random Effects Model 2

For this design, the expected mean squares satisfy Mean Square df E[M S] MSA a 1 σ 2 + bmσ 2 α + mσ 2 αβ MSB b 1 σ 2 + amσ 2 β + mσ2 αβ MSAB (a 1)(b 1) σ 2 + mσαβ 2 MSE (m 1)ab σ 2 Based on these expected mean squares, we estimate the various variance components as follows σ 2 : ˆσ 2 = MSE σ 2 αβ : ˆσ 2 αβ = MSAB MSE m Two-Factor Random Effects Model 3

σ 2 α: ˆσ 2 α = MSA MSAB bm σ 2 β : ˆσ 2 β = MSB MSAB am For the example, these happen to be Two-Factor Random Effects Model 4

> mpg.rr <- lmer(mpg ~ 1 + (1 Driver) + (1 Car) + (1 Driver:Car), > mpg.rr Linear mixed-effects model fit by REML Formula: MPG ~ 1 + (1 Driver) + (1 Car) + (1 Driver:Car) Data: mpg AIC BIC loglik MLdeviance REMLdeviance 94.77908 101.5346-43.38954 89.67671 86.77908 Random effects: Groups Name Variance Std.Dev. Driver:Car (Intercept) 0.014063 0.11859 Car (Intercept) 2.934312 1.71298 Driver (Intercept) 9.322437 3.05327 Residual 0.175750 0.41923 number of obs: 40, groups: Driver:Car, 20; Car, 5; Driver, 4 Fixed effects: Estimate Std. Error t value (Intercept) 30.0475 1.7096 17.576 Two-Factor Random Effects Model 5

So similarly to the single random effect case, we can use the ANOVA table to help us to estimate the variance components. However we need to know the structure of the expected mean squares. Different assumptions about which terms are to be included in the model and which factors are random and which are fixed, will lead to different mean square errors, and thus different estimates for the variance components. The resulting structure also affects tests on the variance components. If all of the factors are fixed, the three standard F tests are F A = MSA MSE F B = MSB MSE F AB = MSAB MSE For the test of the interaction (H 0 : σ 2 αβ ), F AB = MSAB MSE since both mean squares have the same expectation under H 0. Two-Factor Random Effects Model 6

For the example, this can be examined in R by > mpg.rr <- lmer(mpg ~ 1 + (1 Driver) + (1 Car) + (1 Driver:Car), data=mpg) > mpg.a.rr <- lmer(mpg ~ 1 + (1 Driver) + (1 Car), data=mpg) > anova(mpg.rr, mpg.a.rr) Data: mpg Models: mpg.a.rr: MPG ~ 1 + (1 Driver) + (1 Car) mpg.rr: MPG ~ 1 + (1 Driver) + (1 Car) + (1 Driver:Car) Df AIC BIC loglik Chisq Chi Df Pr(>Chisq) mpg.a.rr 3 92.863 97.929-43.431 mpg.rr 4 94.779 101.535-43.390 0.0836 1 0.7725 Note that the R approach using the lme4 package is to use χ 2 based likelihood ratio tests instead of F tests. Asymptotically these will give the similar results. Two-Factor Random Effects Model 7

However for the main effects, the tests aren t valid. For the test on factor A, under H 0 : σ 2 α, E[MSA] = σ 2 + mσ 2 αβ E[MS] = σ 2 so the F used in the fixed effects case will not work here. Instead F A = MSA MSAB will work as both mean squares have the same expectation under the null. Similarly can be used to examine H 0 : σ 2 β F B = MSB MSAB While these are valid test statistics, whether testing the corresponding hypotheses is usually questionable. There is usually no reason to test a main effect when an interaction containing that main effect is included in the model. Two-Factor Random Effects Model 8

Two-Factor Mixed Model As suggested before, it is possible to combine fixed and random factors in a model. For example, suppose that driver is a random effect and car is a fixed effect. This might occur if we are considering a single Zipcar location that only has 5 cars. In that case, the company might be interested in the individual cars and not some larger population of cars. In this case we can model the data as y ijk = µ + α i + β j + (αβ) ij + ɛ ijk α i iid N(0, σ 2 α) β j fixed but unknown constants, subject to β j = 0 (αβ) ij iid N(0, σ 2 αβ) ɛ ijk iid N(0, σ 2 ) Two-Factor Mixed Model 9

(Note the constraint on the β j was chosen to make the expect mean square formula nicer.) Usually when you include interactions involving fixed and random effects, the interactions are considered as random effects. The expected mean squares for this model satisfy Mean Square df E[M S] MSA a 1 σ 2 + bmσ 2 α + mσ 2 αβ P MSB b 1 σ 2 + mσαβ 2 + ma β 2 j b 1 MSAB (a 1)(b 1) σ 2 + mσαβ 2 MSE (m 1)ab σ 2 Note that this table of mean squares doesn t match what you will see in other and software packages. Other books discuss a slightly different form of the mixed model, sometimes referred to as the restricted model. Two-Factor Mixed Model 10

In this case, they assume that (αβ) ij = 0 i for each j The unrestricted model was chosen as it matches with the function lmer. Also Dean and Vos argue that usually the unrestricted model makes for sense (what really is the correct way to set up the restrictions). For a further discussion of the restricted form of the model, see Neter, Kutner, Nachtsheim, and Wasserman or Montgomery. Under the unrestricted model, the estimated variance components are σ 2 : ˆσ 2 = MSE σ 2 αβ : ˆσ 2 αβ = MSAB MSE m Two-Factor Mixed Model 11

σ 2 α: ˆσ 2 α = MSA MSAB bm > options(contrasts=c("contr.sum","contr.poly")) > mpg.rf <- lmer(mpg ~ Car + (1 Driver) + (1 Car:Driver), data=mpg) > mpg.rf Linear mixed-effects model fit by REML Formula: MPG ~ Car + (1 Driver) + (1 Car:Driver) Data: mpg AIC BIC loglik MLdeviance REMLdeviance 86.69653 98.51868-36.34826 66.10478 72.69653 Random effects: Groups Name Variance Std.Dev. Car:Driver (Intercept) 0.014063 0.11859 Driver (Intercept) 9.322437 3.05327 Residual 0.175750 0.41923 number of obs: 40, groups: Car:Driver, 20; Driver, 4 Two-Factor Mixed Model 12

Fixed effects: Estimate Std. Error t value (Intercept) 30.04750 1.52830 19.6607 Car1-1.08500 0.14278-7.5988 Car2 2.20250 0.14278 15.4253 Car3-2.13500 0.14278-14.9526 Car4 1.11500 0.14278 7.8090 Correlation of Fixed Effects: (Intr) Car1 Car2 Car3 Car1 0.000 Car2 0.000-0.250 Car3 0.000-0.250-0.250 Car4 0.000-0.250-0.250-0.250 Note that the constraint chosen on the fixed effects affects their estimates but not those of the random effects (at least under the default estimation scheme method="reml"). Switching to contr.treatment gives Two-Factor Mixed Model 13

> options(contrasts=c("contr.treatment","contr.poly")) > mpg.rf2 <- lmer(mpg ~ Car + (1 Driver) + (1 Car:Driver), data=mpg) > mpg.rf2 Linear mixed-effects model fit by REML Formula: MPG ~ Car + (1 Driver) + (1 Car:Driver) Data: mpg AIC BIC loglik MLdeviance REMLdeviance 83.47765 95.2998-34.73883 66.10478 69.47765 Random effects: Groups Name Variance Std.Dev. Car:Driver (Intercept) 0.014062 0.11859 Driver (Intercept) 9.322436 3.05327 Residual 0.175750 0.41923 number of obs: 40, groups: Car:Driver, 20; Driver, 4 Two-Factor Mixed Model 14

Fixed effects: Estimate Std. Error t value (Intercept) 28.96250 1.53496 18.8686 Car2 3.28750 0.22576 14.5618 Car3-1.05000 0.22576-4.6509 Car4 2.20000 0.22576 9.7447 Car5 0.98750 0.22576 4.3741 Correlation of Fixed Effects: (Intr) Car2 Car3 Car4 Car2-0.074 Car3-0.074 0.500 Car4-0.074 0.500 0.500 Car5-0.074 0.500 0.500 0.500 Two-Factor Mixed Model 15

Again we can see whether there is an interaction effect as follows > anova(mpg.a.rf, mpg.rf) Data: mpg Models: mpg.a.rf: MPG ~ Car + (1 Driver) mpg.rf: MPG ~ Car + (1 Driver) + (1 Car:Driver) Df AIC BIC loglik Chisq Chi Df Pr(>Chisq) mpg.a.rf 6 84.780 94.913-36.390 mpg.rf 7 86.697 98.519-36.348 0.0836 1 0.7725 Actually this is the same test as with the random effects case, even though the base models are different. Be careful as this won t always happen in more complex cases. Two-Factor Mixed Model 16

Combining Continuous Predictors and Random Effects In the next example, we want to combine a continuous predictor with a categorical factors, one fixed and one random. Of course we can have a wide range of different ways we can combine predictors in this sort of situation. Example: This dataset, from the S-plus manual and collected by the Dental School of North Carolina, investigated the distance from the pituitary to the ptergomaxillary fissure (Distance). There were 27 subjects (Subject) (16 boys, 11 girls - Sex) each measured at 4 ages (8, 10, 12, 14 - age). Of interest is the difference between the boys and girls, after accounting for the effects of age and subject to subject variability. Combining Continuous Predictors and Random Effects 17

8 9 11 13 8 9 11 13 8 9 11 13 8 9 11 13 M08 M09 M10 M11 M12 M13 M14 M15 M16 30 25 20 F10 F11 M01 M02 M03 M04 M05 M06 M07 30 Distance 25 20 F01 F02 F03 F04 F05 F06 F07 F08 F09 30 25 20 8 9 11 13 8 9 11 13 8 9 11 13 8 9 11 13 8 9 11 13 Age Combining Continuous Predictors and Random Effects 18

So we want to develop a model of the basic form y = µ + α }{{} Sex effect + β }{{} Subject effect + }{{} γt +ɛ Age effect Two possible models of this basic form are Common age effect: y it = µ + α(sex) i + β i + γt + ɛ it β i iid N(0, σ 2 β) ɛ it iid N(0, σ 2 ) In this case, the effect of age is the same for each subject. Combining Continuous Predictors and Random Effects 19

Subject specific age effect: y it = µ + α(sex) i + β i + γ i t + ɛ it β i iid N(0, σ 2 β) γ i iid N(γ, σ 2 γ) ɛ it iid N(0, σ 2 ) In this case, each subject in the trial has there own slope. As written, their is a fixed age effect γ and γ i γ describe the the random slope deviations for each person. For the first model, the fitted values are Combining Continuous Predictors and Random Effects 20

> orthodont.mix <- lmer(distance ~ Sex + age + (1 Subject), data=orthodont) > orthodont.mix Linear mixed-effects model fit by REML Formula: distance ~ Sex + age + (1 Subject) Data: Orthodont AIC BIC loglik MLdeviance REMLdeviance 445.5125 456.241-218.7563 434.8982 437.5125 Random effects: Groups Name Variance Std.Dev. Subject (Intercept) 3.2667 1.8074 Residual 2.0495 1.4316 number of obs: 108, groups: Subject, 27 Fixed effects: Estimate Std. Error t value (Intercept) 15.385690 0.895983 17.1718 SexMale 2.321023 0.761412 3.0483 age 0.660185 0.061606 10.7162 Combining Continuous Predictors and Random Effects 21

Correlation of Fixed Effects: (Intr) SexMal SexMale -0.504 age -0.756 0.000 The second model can be fit by > orthodont.mix2 <- lmer(distance ~ Sex + age + (1 Subject) + (age - 1 Subject), data=orthodont) To fit the model, we need to include the fixed effect for age to estimate γ. The term (age - 1 Subject) is written this way to guarantee that the random effects for Subject and the age:subject are independent. If the (age Subject) was used, there would be an interaction. Combining Continuous Predictors and Random Effects 22

Formula: distance ~ Sex + age + (1 Subject) + (age - 1 Subject) > orthodont.mix2 Linear mixed-effects model fit by REML Data: Orthodont AIC BIC loglik MLdeviance REMLdeviance 446.6453 460.056-218.3227 434.0781 436.6453 Random effects: Groups Name Variance Std.Dev. Subject (Intercept) 2.172948 1.47409 Subject age 0.009996 0.09998 Residual 1.967260 1.40259 number of obs: 108, groups: Subject, 27; Subject, 27 Fixed effects: Estimate Std. Error t value (Intercept) 15.568993 0.861517 18.0716 SexMale 2.011700 0.759759 2.6478 age 0.660185 0.063351 10.4211 Combining Continuous Predictors and Random Effects 23

Correlation of Fixed Effects: (Intr) SexMal SexMale -0.523 age -0.734 0.000 One potentially important question is whether the subject specific slopes are needed. Note that the estimate of σ 2 γ is very small relative to the estimate σ 2, suggesting that a single slope is adequate. This is supported by > anova(orthodont.mix2,orthodont.mix) Data: Orthodont Models: orthodont.mix: distance ~ Sex + age + (1 Subject) orthodont.mix2: distance ~ Sex + age + (1 Subject) + (age - 1 S Df AIC BIC loglik Chisq Chi Df Pr(>Chisq) orthodont.mix 4 445.51 456.24-218.76 orthodont.mix2 5 446.65 460.06-218.32 0.8672 1 0.3517 Combining Continuous Predictors and Random Effects 24