Random and Mixed Effects Models - Part III

Random and Mixed Effects Models - Part III Statistics 149 Spring 2006 Copyright 2006 by Mark E. Irwin

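Quasi-F Tests

When we get to more than two categorical factors, sometimes there are no nice F tests available. For example, consider the 3-way model with 3 random effects (A: $a$ levels, B: $b$ levels, C: $c$ levels) and $m$ observations per treatment combination. The expected mean squares are

    MS      df                  E[MS]
    MSA     $a-1$               $\sigma^2 + mbc\,\sigma^2_{\alpha} + mc\,\sigma^2_{\alpha\beta} + mb\,\sigma^2_{\alpha\gamma} + m\,\sigma^2_{\alpha\beta\gamma}$
    MSB     $b-1$               $\sigma^2 + mac\,\sigma^2_{\beta} + mc\,\sigma^2_{\alpha\beta} + ma\,\sigma^2_{\beta\gamma} + m\,\sigma^2_{\alpha\beta\gamma}$
    MSC     $c-1$               $\sigma^2 + mab\,\sigma^2_{\gamma} + mb\,\sigma^2_{\alpha\gamma} + ma\,\sigma^2_{\beta\gamma} + m\,\sigma^2_{\alpha\beta\gamma}$
    MSAB    $(a-1)(b-1)$        $\sigma^2 + mc\,\sigma^2_{\alpha\beta} + m\,\sigma^2_{\alpha\beta\gamma}$
    MSAC    $(a-1)(c-1)$        $\sigma^2 + mb\,\sigma^2_{\alpha\gamma} + m\,\sigma^2_{\alpha\beta\gamma}$
    MSBC    $(b-1)(c-1)$        $\sigma^2 + ma\,\sigma^2_{\beta\gamma} + m\,\sigma^2_{\alpha\beta\gamma}$
    MSABC   $(a-1)(b-1)(c-1)$   $\sigma^2 + m\,\sigma^2_{\alpha\beta\gamma}$
    MSE     $(m-1)abc$          $\sigma^2$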

If we want to test $H_0: \sigma^2_{\alpha\beta} = 0$, we can use

$$F_{AB} = \frac{MSAB}{MSABC}$$

as $E[MSAB] = E[MSABC]$ under $H_0$.

Now consider testing $H_0: \sigma^2_{\alpha} = 0$. Looking through the previous table, there is no entry that has the same expectation as $E[MSA]$ has under $H_0$, i.e. there is no mean square MSX satisfying

$$E[MSX] = \sigma^2 + mc\,\sigma^2_{\alpha\beta} + mb\,\sigma^2_{\alpha\gamma} + m\,\sigma^2_{\alpha\beta\gamma}$$

So we can't use a test statistic of the form

$$F = \frac{MSA}{MSX}$$

One alternative is to use a Quasi-F test (also known as Satterthwaite's approximate F test). The idea is to find a linear combination of mean squares

$$\hat{L} = c_1 MS_1 + \ldots + c_h MS_h$$

that has the desired expectation, i.e.

$$E[\hat{L}] = c_1 E[MS_1] + \ldots + c_h E[MS_h] = E[MSX]$$

where $E[MSX]$ is the expectation of MSA under $H_0$. Then the hypothesis can be tested with the test statistic

$$F_A = \frac{MSA}{\hat{L}}$$
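As a rough sketch of how a candidate combination can be checked, the R snippet below stores the coefficient of each variance component in the relevant expected mean squares and confirms that MSAB + MSAC - MSABC has the expectation of MSA under $H_0: \sigma^2_{\alpha} = 0$. The design sizes and all variable names are illustrative assumptions, not part of the slides.

```r
# Illustrative design sizes (assumed values; any positive integers work)
n_a <- 3; n_b <- 2; n_c <- 5; m <- 3

# Coefficients of (sigma2, alpha, alpha.beta, alpha.gamma, alpha.beta.gamma)
# in the expected mean squares of the 3-way random effects model
ems <- rbind(
  MSA   = c(1, m * n_b * n_c, m * n_c, m * n_b, m),
  MSAB  = c(1, 0,             m * n_c, 0,       m),
  MSAC  = c(1, 0,             0,       m * n_b, m),
  MSABC = c(1, 0,             0,       0,       m)
)
colnames(ems) <- c("sigma2", "alpha", "alpha.beta",
                   "alpha.gamma", "alpha.beta.gamma")

# Expectation of the candidate denominator L = MSAB + MSAC - MSABC
e_L <- ems["MSAB", ] + ems["MSAC", ] - ems["MSABC", ]

# Expectation of MSA when sigma2_alpha = 0
e_msa_h0 <- ems["MSA", ]
e_msa_h0["alpha"] <- 0

# TRUE when the two expectations agree component by component
all(e_L == e_msa_h0)
```

The same bookkeeping can be reused to screen other candidate combinations when more than one denominator is possible.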

In this example, $\hat{L} = MSAB + MSAC - MSABC$ will work. Note that there may be multiple possibilities for the denominator in the F ratio.

This test statistic has an approximate F distribution with $a-1$ and $df$ degrees of freedom, where

$$df = \frac{(c_1 MS_1 + \ldots + c_h MS_h)^2}{\dfrac{(c_1 MS_1)^2}{df_1} + \ldots + \dfrac{(c_h MS_h)^2}{df_h}}$$

Example: A study on the daily output of a factory examined three factors: operators (3 levels), machines (2 levels), and batches (5 levels). For each treatment combination, 3 observations were taken. The resulting ANOVA table for this study was

    Source of Variation      SS    df     MS
    A (operators)          17.3     2   8.65
    B (machines)            4.2     1   4.20
    C (batches)            24.8     4   6.20
    AB                      4.8     2   2.40
    AC                     31.7     8   3.96
    BC                     12.5     4   3.13
    ABC                    11.9     8   1.49
    Error                 137.7    60   2.30

For example, to examine the effect of operators, we can use the Quasi-F statistic

$$F_A = \frac{MSA}{MSAB + MSAC - MSABC} = \frac{8.65}{2.40 + 3.96 - 1.49} = \frac{8.65}{4.87} = 1.78$$

The denominator degrees of freedom for this case is

$$df = \frac{4.87^2}{\dfrac{2.40^2}{2} + \dfrac{3.96^2}{8} + \dfrac{1.49^2}{8}} = 4.63$$

Note that this formula will usually give non-integer degrees of freedom. Rounding to the closest integer is usually fine when working with tables. Many packages allow non-integer degrees of freedom in their probability functions.

The p-value for this F statistic is 0.267 (= pf(1.78, 2, 4.63, lower.tail=FALSE)).
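The calculation above can be reproduced in R along the following lines. This is only a sketch: the mean squares and degrees of freedom are taken from the ANOVA table, while the variable names are made up for illustration.

```r
# Mean squares and degrees of freedom from the ANOVA table
ms <- c(A = 8.65, AB = 2.40, AC = 3.96, ABC = 1.49)
df <- c(A = 2,    AB = 2,    AC = 8,    ABC = 8)

# Coefficients of the denominator L = MSAB + MSAC - MSABC
coefs <- c(AB = 1, AC = 1, ABC = -1)
parts <- coefs * ms[names(coefs)]

L_hat <- sum(parts)                      # 4.87
F_A   <- unname(ms["A"] / L_hat)         # Quasi-F statistic, about 1.78

# Satterthwaite approximation for the denominator degrees of freedom
df_den <- L_hat^2 / sum(parts^2 / df[names(coefs)])   # about 4.63

# Upper-tail p-value; pf() accepts non-integer degrees of freedom
p_value <- pf(F_A, df["A"], df_den, lower.tail = FALSE)
round(c(F = F_A, df = df_den, p = unname(p_value)), 3)
```

Because the unrounded F statistic is used here, the p-value may differ from the slide's 0.267 in the third decimal place.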

Note that exactly how you deal with these non-integer degrees of freedom usually doesn't matter, as exhibited by

    > pf(1.78, 2, 4.63, lower.tail=FALSE)
    [1] 0.2670347
    > pf(1.78, 2, 4, lower.tail=FALSE)
    [1] 0.2799474
    > pf(1.78, 2, 5, lower.tail=FALSE)
    [1] 0.2607598

For the example, if anybody is interested, none of the tests are significant. The smallest p-value (0.094) occurs with the test on the AC interaction.

As mentioned before, it is possible that there is more than one linear combination $\hat{L}$ that can be chosen. When that occurs, you generally want to choose the one that involves the fewest terms. If there are two possibilities and they both involve the same number of terms, pick the one that avoids low order interactions. This test tends to work better when the mean square components are close to estimating the measurement error variance $\sigma^2$.

One way of figuring out a possible $\hat{L}$ is to consider the estimate of the variance component of interest. In the example, one estimate of $\sigma^2_{\alpha}$ is

$$\hat{\sigma}^2_{\alpha} = \frac{MSA - MSAB - MSAC + MSABC}{mbc} = \frac{MSA - \hat{L}}{mbc}$$

Often the $\hat{L}$ you want to use will come from a relationship like this.
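For the factory example, this variance component estimate can be worked out directly. The short R sketch below plugs in the table values ($m = 3$, $b = 2$, $c = 5$); the variable names are illustrative only.

```r
# Design sizes for the factory example: m replicates, b machines, c batches
m <- 3; n_b <- 2; n_c <- 5

ms_a  <- 8.65                   # MSA from the ANOVA table
L_hat <- 2.40 + 3.96 - 1.49     # MSAB + MSAC - MSABC

# Moment estimate of the operator variance component
sigma2_alpha_hat <- (ms_a - L_hat) / (m * n_b * n_c)
sigma2_alpha_hat
```

Estimates of this form can be negative when MSA happens to fall below the combination being subtracted, in which case they are commonly truncated at zero.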

General Form of the Linear Mixed Model

In all the examples involving random effects, the models can be written in the general matrix form

$$Y = X\beta + Zu + \epsilon, \qquad \begin{bmatrix} u \\ \epsilon \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} G & 0 \\ 0 & \sigma^2 R \end{bmatrix} \right)$$

where $\beta$ is the vector of fixed effects, $u$ is the vector of random effects, $X$ and $Z$ are known design matrices, and $G$ and $R$ are scaled covariance matrices. The elements of the matrix $G$ are the variances of the random effects, plus possibly covariances between the random effects. $R$ is the correlation matrix of the measurement errors. In the examples discussed so far, both matrices have been diagonal, but in general they don't need to be.

For example, consider a mixed model with one fixed effect (A: 4 levels), two random effects (C: 2 levels, D: 3 levels), 2 observations per treatment combination, and no interactions. Then

$$\beta^T = [\, \mu \;\; \alpha_1 \;\; \alpha_2 \;\; \alpha_3 \;\; \alpha_4 \,]$$

$$u^T = [\, \gamma_1 \;\; \gamma_2 \;\; \delta_1 \;\; \delta_2 \;\; \delta_3 \,]$$

$Z$ is a $48 \times 5$ matrix of 0s and 1s. In each row there will be two 1s, one in the first two columns (indicating the C random effect) and one in the last three columns (indicating the D random effect).

$$G = \mathrm{diag}\left( \sigma^2_{\gamma},\; \sigma^2_{\gamma},\; \sigma^2_{\delta},\; \sigma^2_{\delta},\; \sigma^2_{\delta} \right)$$

$$R = I_{48}$$
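The matrices for this example can be built explicitly in R. The sketch below assumes a balanced layout generated with expand.grid(); the variance values used for $G$ are arbitrary placeholders, chosen only so the matrix can be displayed.

```r
# Balanced layout: A (4 levels, fixed), C (2 levels), D (3 levels), 2 replicates
design <- expand.grid(rep = 1:2,
                      D = factor(1:3),
                      C = factor(1:2),
                      A = factor(1:4))

# Indicator columns for each random factor, keeping every level
Z_C <- model.matrix(~ 0 + C, data = design)
Z_D <- model.matrix(~ 0 + D, data = design)
Z   <- cbind(Z_C, Z_D)
dim(Z)                      # 48 rows, 5 columns

# X matches beta = (mu, alpha_1, ..., alpha_4): an intercept column plus
# one indicator per level of A (overparameterized, as on the slide)
X <- cbind(mu = 1, model.matrix(~ 0 + A, data = design))

# Placeholder variance components (assumed values, for illustration only)
sigma2_gamma <- 2.0
sigma2_delta <- 0.5
G <- diag(c(rep(sigma2_gamma, 2), rep(sigma2_delta, 3)))
R <- diag(nrow(design))     # I_48

# Each row of Z has one 1 among the C columns and one among the D columns
table(rowSums(Z))
```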

Restricted Maximum Likelihood - REML

The likelihood function in the random effects case has two components, as the model being fit can be considered as a hierarchical model. The two components are

$$Y \mid u \sim N(X\beta + Zu,\; \sigma^2 R)$$

$$u \sim N(0, G)$$

So the likelihood function factors as

$$L(\beta, \sigma^2, G) = f(y \mid u)\, f(u)$$

So the maximum likelihood approach would choose the values of $\beta$, $\sigma^2$, and $G$ (actually the variances that make up $G$) that maximize $L(\beta, \sigma^2, G)$.

One known problem with maximum likelihood in this case is that the estimates of the variance components are known to have a negative bias. Consider the case when there is one variance component $\sigma^2$ and $\beta$ has $p$ components. Then the MLE of $\sigma^2$ is

$$\tilde{\sigma}^2 = \frac{SSE}{n}$$

However, its expectation is

$$E[\tilde{\sigma}^2] = \frac{n-p}{n}\, \sigma^2 < \sigma^2$$

As we have seen before, the easy solution here is to use

$$\hat{\sigma}^2 = \frac{SSE}{n-p} = MSE$$
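The relationship between the two estimators can be seen on any ordinary regression fit. The following R sketch uses simulated data (the sample size, coefficients, and seed are arbitrary assumptions) and compares SSE/n with SSE/(n - p).

```r
set.seed(1)

# Simulated simple regression (all settings are illustrative)
n <- 20
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = 3)

fit <- lm(y ~ x)
p   <- length(coef(fit))        # number of fixed-effect coefficients
sse <- sum(resid(fit)^2)

sigma2_ml       <- sse / n        # maximum likelihood estimate
sigma2_unbiased <- sse / (n - p)  # MSE, the bias-corrected estimate

# The ML estimate is smaller by exactly the factor (n - p) / n
c(ML = sigma2_ml, MSE = sigma2_unbiased,
  ratio = sigma2_ml / sigma2_unbiased, expected = (n - p) / n)
```

The shrinkage factor depends only on $n$ and $p$, so the bias matters most when the number of fixed effects is large relative to the sample size.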

This solution is a special case of a more general method, Restricted Maximum Likelihood (REML). In REML, we use structure that must occur in the residuals (like $\sum e_i = 0$ in standard regression) to create a correction term in the likelihood and correct for the bias problems.

For example, for known $G$ and $R$ (with $\sigma^2$ absorbed into $R$) we get

$$X^T R^{-1} (Y - X\hat{\beta} - Z\hat{u}) = 0$$

and

$$Z^T R^{-1} (Y - X\hat{\beta} - Z\hat{u}) - G^{-1} \hat{u} = 0$$

From these, we get

$$\hat{u} = (G^{-1} + Z^T R^{-1} Z)^{-1} Z^T R^{-1} (Y - X\hat{\beta})$$

and

$$\hat{\beta} = \left[ X^T (R + ZGZ^T)^{-1} X \right]^{-1} X^T (R + ZGZ^T)^{-1} Y$$
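To make these estimating equations concrete, the R sketch below solves them jointly (in the block form usually called Henderson's mixed model equations) on a small simulated data set and compares the result with the closed-form expressions. The data, the group structure, and the "known" values of $G$ and $R$ are all assumptions made for illustration, with $R$ taken as the full error covariance.

```r
set.seed(42)

# Small simulated random-intercept data set (illustrative settings)
q <- 5; n_per <- 4; n <- q * n_per
group <- factor(rep(seq_len(q), each = n_per))
x <- rnorm(n)
X <- cbind(1, x)                       # fixed effects design
Z <- model.matrix(~ 0 + group)         # random effects design
G <- diag(2, q)                        # Var(u), treated as known
R <- diag(1, n)                        # Var(eps), treated as known
y <- drop(X %*% c(1, 0.5) + Z %*% rnorm(q, sd = sqrt(2)) + rnorm(n))

Rinv <- solve(R)
Ginv <- solve(G)

# Block system: the two estimating equations stacked together
lhs <- rbind(
  cbind(t(X) %*% Rinv %*% X, t(X) %*% Rinv %*% Z),
  cbind(t(Z) %*% Rinv %*% X, t(Z) %*% Rinv %*% Z + Ginv)
)
rhs <- rbind(t(X) %*% Rinv %*% y, t(Z) %*% Rinv %*% y)
sol <- solve(lhs, rhs)
beta_hat <- sol[1:2]
u_hat    <- sol[-(1:2)]

# Closed forms: generalized least squares for beta, then u given beta
V        <- R + Z %*% G %*% t(Z)
Vinv     <- solve(V)
beta_gls <- solve(t(X) %*% Vinv %*% X, t(X) %*% Vinv %*% y)
u_closed <- solve(Ginv + t(Z) %*% Rinv %*% Z,
                  t(Z) %*% Rinv %*% (y - X %*% beta_hat))

# Both comparisons should report TRUE
all.equal(as.numeric(beta_hat), as.numeric(beta_gls))
all.equal(as.numeric(u_hat), as.numeric(u_closed))
```

In practice $G$ and $R$ are not known, and software alternates between estimating the variance components and solving a system of this kind.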

Plugging this into the previous equation gives $\hat{u}$ and the information for the correction term.

The default estimation method for lmer is REML (method="REML"). It will also fit by maximum likelihood (method="ML").

To see the effect of the two estimation methods, let's examine the orthodontic example from the last class.

    > orthodont.reml
    Linear mixed-effects model fit by REML
    Formula: distance ~ Sex + age + (1 | Subject)
       Data: Orthodont
          AIC     BIC    logLik MLdeviance REMLdeviance
     445.5125 456.241 -218.7563   434.8982     437.5125
    Random effects:
     Groups   Name        Variance Std.Dev.
     Subject  (Intercept) 3.2667   1.8074
     Residual             2.0495   1.4316
    number of obs: 108, groups: Subject, 27

    Fixed effects:
                 Estimate Std. Error t value
    (Intercept) 15.385690   0.895983 17.1718
    SexMale      2.321023   0.761412  3.0483
    age          0.660185   0.061606 10.7162

    Correlation of Fixed Effects:
            (Intr) SexMal
    SexMale -0.504
    age     -0.756  0.000

    > orthodont.mle
    Linear mixed-effects model fit by maximum likelihood
    Formula: distance ~ Sex + age + (1 | Subject)
       Data: Orthodont
          AIC     BIC    logLik MLdeviance REMLdeviance
     442.8565 453.585 -217.4282   434.8565     437.5526
    Random effects:
     Groups   Name        Variance Std.Dev.
     Subject  (Intercept) 2.9932   1.7301
     Residual             2.0242   1.4227
    number of obs: 108, groups: Subject, 27

    Fixed effects:
                 Estimate Std. Error t value
    (Intercept) 15.385690   0.878447 17.5146
    SexMale      2.321023   0.732672  3.1679
    age          0.660185   0.061225 10.7830

    Correlation of Fixed Effects:
            (Intr) SexMal
    SexMale -0.494
    age     -0.767  0.000

In this case, both estimated variances are smaller with maximum likelihood, particularly the subject-to-subject variance. Also note that the estimates of the fixed effects are the same, though their standard errors and correlations are different.
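A comparison of this kind can be rerun with the nlme package, which ships with R and includes the Orthodont data. This is a sketch that swaps in nlme::lme() for the lmer call on the slides; in current versions of lme4 the corresponding choice is made with the REML = TRUE / REML = FALSE argument of lmer() rather than a method argument. The factor coding of Sex in nlme's copy of the data may differ from the output shown above, so the intercept and Sex coefficient can appear in a different parameterization.

```r
library(nlme)   # provides lme() and the Orthodont data

# Random-intercept model fit by REML and by maximum likelihood
fit_reml <- lme(distance ~ Sex + age, random = ~ 1 | Subject,
                data = Orthodont, method = "REML")
fit_ml   <- lme(distance ~ Sex + age, random = ~ 1 | Subject,
                data = Orthodont, method = "ML")

# Variance components: first entry is the Subject intercept, second the residual
v_reml <- as.numeric(VarCorr(fit_reml)[, "Variance"])
v_ml   <- as.numeric(VarCorr(fit_ml)[, "Variance"])
rbind(REML = v_reml, ML = v_ml)

# Fixed effect estimates under the two methods, side by side
cbind(REML = fixef(fit_reml), ML = fixef(fit_ml))
```

Since the data are balanced, the fixed effect estimates should agree between the two fits up to numerical tolerance, while the variance components should be somewhat smaller under maximum likelihood.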