Lecture 4. Random Effects in Completely Randomized Design

Similar documents
Topic 25 - One-Way Random Effects Models. Outline. Random Effects vs Fixed Effects. Data for One-way Random Effects Model. One-way Random effects

Lecture 10: Experiments with Random Effects

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model

Single Factor Experiments

Lecture 3. Experiments with a Single Factor: ANOVA Montgomery 3-1 through 3-3

Lecture 3. Experiments with a Single Factor: ANOVA Montgomery 3.1 through 3.3

Comparison of a Population Means

Lecture 7: Latin Square and Related Design

This is a Randomized Block Design (RBD) with a single factor treatment arrangement (2 levels) which are fixed.

Chapter 11. Analysis of Variance (One-Way)

13. The Cochran-Satterthwaite Approximation for Linear Combinations of Mean Squares

Lecture 7: Latin Squares and Related Designs

Outline. Topic 19 - Inference. The Cell Means Model. Estimates. Inference for Means Differences in cell means Contrasts. STAT Fall 2013

Lecture 10: Factorial Designs with Random Factors

Ch 2: Simple Linear Regression

SAS Syntax and Output for Data Manipulation:

Statistics 512: Applied Linear Models. Topic 9

5.3 Three-Stage Nested Design Example

Analysis of Covariance

Subject-specific observed profiles of log(fev1) vs age First 50 subjects in Six Cities Study

Topic 20: Single Factor Analysis of Variance

Analysis of Longitudinal Data: Comparison Between PROC GLM and PROC MIXED. Maribeth Johnson Medical College of Georgia Augusta, GA

Lecture 4. Checking Model Adequacy

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

The Random Effects Model Introduction

Unbalanced Designs Mechanics. Estimate of σ 2 becomes weighted average of treatment combination sample variances.

One-way ANOVA (Single-Factor CRD)

Answer to exercise: Blood pressure lowering drugs

Ch 3: Multiple Linear Regression

Introduction to Design and Analysis of Experiments with the SAS System (Stat 7010 Lecture Notes)

Analysis of Variance

Review of CLDP 944: Multilevel Models for Longitudinal Data

SAS Syntax and Output for Data Manipulation: CLDP 944 Example 3a page 1

SAS Commands. General Plan. Output. Construct scatterplot / interaction plot. Run full model

Covariance Structure Approach to Within-Cases

STAT 115:Experimental Designs

Lecture 11: Nested and Split-Plot Designs

The One-Way Repeated-Measures ANOVA. (For Within-Subjects Designs)

ANOVA Longitudinal Models for the Practice Effects Data: via GLM

Unit 7: Random Effects, Subsampling, Nested and Crossed Factor Designs

MIXED MODELS FOR REPEATED (LONGITUDINAL) DATA PART 2 DAVID C. HOWELL 4/1/2010

Topic 32: Two-Way Mixed Effects Model

Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2

Variance component models part I

Lecture 5: Comparing Treatment Means Montgomery: Section 3-5

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1)

Introduction to SAS proc mixed

PLS205 Lab 2 January 15, Laboratory Topic 3

Lecture 9. ANOVA: Random-effects model, sample size

WITHIN-PARTICIPANT EXPERIMENTAL DESIGNS

over Time line for the means). Specifically, & covariances) just a fixed variance instead. PROC MIXED: to 1000 is default) list models with TYPE=VC */

Introduction to SAS proc mixed

Lec 1: An Introduction to ANOVA

Assignment 6 Answer Keys

Outline. Topic 20 - Diagnostics and Remedies. Residuals. Overview. Diagnostics Plots Residual checks Formal Tests. STAT Fall 2013

Lecture 5: Determining Sample Size

Split-plot Designs. Bruce A Craig. Department of Statistics Purdue University. STAT 514 Topic 21 1

17. Example SAS Commands for Analysis of a Classic Split-Plot Experiment 17. 1

Repeated Measures Design. Advertising Sales Example

QUEEN MARY, UNIVERSITY OF LONDON

Outline Topic 21 - Two Factor ANOVA

Topic 23: Diagnostics and Remedies

THE ANOVA APPROACH TO THE ANALYSIS OF LINEAR MIXED EFFECTS MODELS

Repeated Measures Data

Lecture 11: Simple Linear Regression

Lecture 9: Factorial Design Montgomery: chapter 5

Topic 6. Two-way designs: Randomized Complete Block Design [ST&D Chapter 9 sections 9.1 to 9.7 (except 9.6) and section 15.8]

T-test: means of Spock's judge versus all other judges 1 12:10 Wednesday, January 5, judge1 N Mean Std Dev Std Err Minimum Maximum

MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD. Copyright c 2012 (Iowa State University) Statistics / 30

IX. Complete Block Designs (CBD s)

The legacy of Sir Ronald A. Fisher. Fisher s three fundamental principles: local control, replication, and randomization.

Homework 3 - Solution

Overview Scatter Plot Example

Introduction to Random Effects of Time and Model Estimation

STAT 5200 Handout #23. Repeated Measures Example (Ch. 16)

Mixed Model: Split plot with two whole-plot factors, one split-plot factor, and CRD at the whole-plot level (e.g. fancier split-plot p.

Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010

SAS Code for Data Manipulation: SPSS Code for Data Manipulation: STATA Code for Data Manipulation: Psyc 945 Example 1 page 1

Random Intercept Models

STOR 455 STATISTICAL METHODS I

df=degrees of freedom = n - 1

Odor attraction CRD Page 1

Multiple Linear Regression

Math 423/533: The Main Theoretical Topics

STAT 525 Fall Final exam. Tuesday December 14, 2010

Linear Combinations. Comparison of treatment means. Bruce A Craig. Department of Statistics Purdue University. STAT 514 Topic 6 1

SAS Procedures Inference about the Line ffl model statement in proc reg has many options ffl To construct confidence intervals use alpha=, clm, cli, c

Chapter 1 Linear Regression with One Predictor

Multi-factor analysis of variance

Descriptions of post-hoc tests

Two-factor studies. STAT 525 Chapter 19 and 20. Professor Olga Vitek

Models for Clustered Data

Models for Clustered Data

Lecture 7 Randomized Complete Block Design (RCBD) [ST&D sections (except 9.6) and section 15.8]

20. REML Estimation of Variance Components. Copyright c 2018 (Iowa State University) 20. Statistics / 36

Random Coefficient Model (a.k.a. multilevel model) (Adapted from UCLA Statistical Computing Seminars)

Chap The McGraw-Hill Companies, Inc. All rights reserved.

Serial Correlation. Edps/Psych/Stat 587. Carolyn J. Anderson. Fall Department of Educational Psychology

Sociology 6Z03 Review II

21.0 Two-Factor Designs

Transcription:

Lecture 4. Random Effects in Completely Randomized Design Montgomery: 3.9, 13.1 and 13.7 1 Lecture 4 Page 1

Random Effects vs Fixed Effects Consider factor with numerous possible levels Want to draw inference on population of levels Not just concerned with levels in experiment Example of differences Fixed: Compare reading ability of 10 2nd grade classes in NY Select a = 10 specific classes of interest. Randomly choose n students from each classroom. Want to compareτ i (class-specific effects). Random: Compare variability among all 2nd grade classes in NY Randomly choose a = 10 classes from large number of classes. Randomly choose n students from each classroom. Want to assessσ 2 τ (class to class variability). Inference broader in random effects case: inference on population with randomly chosen levels 2 Lecture 4 Page 2

Random Effects Model (CRD) Same model as in fixed effects case y ij = µ+τ i +ǫ ij i = 1,2...a j = 1,2,...n i µ - grand mean τ i -ith treatment effect ǫ ij iid N(0,σ 2 ) But view treatment effects in different way Instead of τ i = 0, assume τ i iid N(0,σ 2 τ) {τ i } and{ǫ ij } independent Var(y ij ) =σ 2 τ +σ 2 3 Lecture 4 Page 3

The hypotheses are: Random Effects Model H 0 : στ 2 = 0 H 1 : στ 2 > 0 Partitioning of Total Sum of Squares identical E(MS E )=σ 2 E(MS Treatment )=σ 2 +nστ 2 UnderH 0,F 0 = MS Treatment /MS E F a 1,N a Same test as before: Direct comparison of variabilities (between vs within) Conclusions, however, pertain to entire population 4 Lecture 4 Page 4

Model Estimates Usually interested in estimating variances Use mean squares (known as ANOVA method) ˆσ 2 = MS E ˆσ τ 2 = (MS Treatment MS E )/n If unbalanced, replacenwith n 0 = (( n i ) 2 n 2 i )/((a 1) n i ) Estimate ofσ 2 τ can be negative - SupportsH 0? Use zero as estimate? - Is the model reasonable? - Bayesian approach (nonnegative prior) - Use estimation method that only gives nonnegative estimates 5 Lecture 4 Page 5

σ 2 : Same as fixed case σ 2 τ Confidence intervals (N a)ms E σ 2 (N a)ms E χ 2 α/2,n a σ 2 : Approximate Confidence Interval χ 2 N a ˆσ 2 τ = (MS Trt MS E )/n No exact calculation of CI available. (N a)ms E χ 2 1 α/2,n a Approximate CI based on Satterthwaite s Approximation: rˆσ 2 τ/ χ 2 α/2,r σ 2 τ rˆσ 2 τ/ χ 2 1 α/2,r r = (MS Trt MS E ) 2 MS 2 Trt /(a 1)+MS2 E /(N a) 6 Lecture 4 Page 6

Approximate Confidence interval CI for a variance component: σ 2 0 = E[MS MS ]/k MS = MS r + +MS s MS = MS u + +MS v No common mean squares terms shared byms andms. Notef i MS i /σi 2 = SS i/σi 2 ind χ 2 f i Point estimate ofσ0 2 : ˆσ2 0 = (MS MS )/k Satterthwaite s Approximation: rˆσ 2 0/σ 2 0 χ 2 r r = (ˆσ 2 0) 2 / i MS 2 i k 2 f i Approximate(1 α) 100% CI ofσ 2 0 rˆσ 2 0/ χ 2 α/2,r σ 2 0 rˆσ 2 0/ χ 2 1 α/2,r 7 Lecture 4 Page 7

Proportion ofσ 2 τ in Var(y ij ), i.e., Intraclass Correlation Coefficient (ICC) Common estimate if goal is to reduce variance Uses ratio of twoχ 2 distributions (i.e.,f dist) L L+1 σ2 τ σ 2 +σ 2 τ U U+1 L = 1 n U = 1 n ( ( ) MS Trt 1 MS E F α/2,a 1,N a MS Trt MS E F 1 α/2,a 1,N a 1 = 1 ( ) F 0 1 n F α/2,a 1,N a ) = 1 ( F 0 1 n F 1 α/2,a 1,N a ) Grand mean: µ Example: Average reading ability of 2nd grade class, y.. = (y 1. +y 2. +...+y a. )/a. y i. iid Normal. But what is the variance? CI forµ: y.. ±t α/2,a 1 MSTrt /(an) 8 Lecture 4 Page 8

Example A supplier delivers several hundred batches of raw material to a company each year. The company is interested in a high yield from each batch of raw material (percent usable). Therefore, to investigate the consistency of this supplier, an experiment is done where five batches were selected at random and three yield determinations were made on each batch. Batch 1 2 3 4 5 74 68 75 72 79 76 71 77 74 81 75 72 77 73 79 9 Lecture 4 Page 9

Source of Sum of Degrees of Mean Variation Squares Freedom Square F 0 Between 147.73 4 36.93 20.5 Within 18.00 10 1.80 Total 165.73 14 Highly significant result (F.05,4,10 = 3.48) ˆσ τ 2 = (36.93 1.80)/3 = 11.71 86.7% (=11.71/(11.71+1.80)) is attributable to batch differences Time to improve consistency of the batches 10 Lecture 4 Page 10

95% CI forσ 2 ( SS E, χ 2.025,10 SS E χ 2.975,10 ) = (18.00/20.48, 18.00/3.25) = (0.879, 5.538) 95% CI forσ 2 τ r = (36.93 1.80) 2 36.93 2 /4+1.80 2 /10 = 3.62 χ 2 0.025,3.62 = (1.62)χ 2.025,3 +.62χ 2.0.025,4 =.38 9.35+.62 11.14 = 10.4598 χ 2 0.975,3.62 = (1.62)χ 2.975,3 +.62χ 2.0.975,4 =.38.22+.62.48 =.3812 (3.62 11.71/10.4598, 3.62 11.71/.3812) = (4.0527, 111.2020) SAS allows noninteger degrees of freedom forχ 2 with :CINV(p,df) 11 Lecture 4 Page 11

95% CI for Intraclass Correlation L = 1 ( ) 20.52 3 4.47 1 U = 1 3 ( 20.52 1/8.84 1 = 1.1969 = L L+1 = 0.5448 ) = 60.1323 = U U +1 = 0.9836 using property that 95% CI: (0.545, 0.984) F 1 α/2,v1,v 2 = 1/F α/2,v2,v 1 12 Lecture 4 Page 12

Using SAS data example; input batch percent @@; cards; 1 74 1 76 1 75 2 68 2 71 2 72 3 75 3 77 3 77 4 72 4 74 4 73 5 79 5 81 5 79 ; proc glm data=example; class batch; model percent=batch; random batch; output out=diag r=res p=pred; proc gplot data=diag; plot res*pred; proc mixed data=example cl; class batch; model percent = ; random batch / vcorr; run; 13 Lecture 4 Page 13

Dependent Variable: PERCENT Sum of Mean Source DF Squares Square F Value Pr > F Model 4 147.73333 36.93333 20.52 0.0001 Error 10 18.00000 1.80000 Corrected Total 14 165.73333 Source DF Type I SS Mean Square F Value Pr > F BATCH 4 147.73333 36.93333 20.52 0.0001 Source BATCH Type III Expected Mean Square Var(Error) + 3 Var(BATCH) GLM : Uses least squares for estimation Not designed for random effects...must compute variance estimates / CIs by hand MIXED: Uses restricted maximum likelihood (REML) Often preferred to ML because it produces unbiased estimates of covariance parameters by taking into account the loss of degrees of freedom in estimating fixed effects Usually residual variance profiled out of the likelihood 14 Lecture 4 Page 14

The MIXED Procedure Estimated V Correlation Matrix for batch 1 Row Col1 Col2 Col3 1 1.0000 0.8668 0.8668 2 0.8668 1.0000 0.8668 3 0.8668 0.8668 1.0000 Covariance Parameter Estimates Cov Parm Estimate Alpha Lower Upper BATCH 11.71111111 0.05 4.0450 114.2090 Residual 1.80000000 0.05 0.8788 5.5436 Fit Statistics -2 Res Log Likelihood 62.8 AIC (smaller is better) 66.8 AICC (smaller is better) 67.8 BIC (smaller is better) 66.0 15 Lecture 4 Page 15

Negativeσ 2 τ Estimate Example data new; input school subj score @@; cards; 1 1 74.62 1 2 73.90 1 3 72.27 1 4 71.60 1 5 73.80 1 6 77.42 1 7 72.16 1 8 76.69 1 9 75.84 1 10 70.35 2 1 72.55 2 2 71.44 2 3 72.67 2 4 72.59 2 5 71.25 2 6 68.99 2 7 69.61 2 8 77.44 2 9 73.99 2 10 73.90 3 1 76.66 3 2 74.76 3 3 70.47 3 4 75.38 3 5 68.32 3 6 76.69 3 7 73.34 3 8 68.24 3 9 69.33 3 10 78.22 ; proc glm data=new; class school; model score = school; random school; proc mixed data=new cl; class school; model score = ; random school; run; 16 Lecture 4 Page 16

The GLM Procedure Sum of Source DF Squares Mean Square F Value Pr > F Model 2 10.1115467 5.0557733 0.60 0.5557 Error 27 227.3489500 8.4203315 Corrected Total 29 237.4604967 R-Square Coeff Var Root MSE score Mean 0.042582 3.966909 2.901781 73.14967 Source DF Type I SS Mean Square F Value Pr > F school 2 10.11154667 5.05577333 0.60 0.5557 Source DF Type III SS Mean Square F Value Pr > F school 2 10.11154667 5.05577333 0.60 0.5557 Source school Type III Expected Mean Square Var(Error) + 10 Var(school) Parameter Estimates using ANOVA method ˆσ 2 = 8.42 ˆσ 2 τ = 5.0558 8.4203 10 = 3.3645/10 = 0.37 17 Lecture 4 Page 17

The MIXED Procedure Covariance Parameter Estimates Cov Parm Estimate Alpha Lower Upper school 0... Residual 8.1883 0.05 5.1935 14.7977 Fit Statistics -2 Res Log Likelihood 146.7 AIC (smaller is better) 148.7 AICC (smaller is better) 148.8 BIC (smaller is better) 147.8 18 Lecture 4 Page 18

In log file for PROC MIXED analysis, Word of Caution NOTE: Convergence criteria met. NOTE: Estimated G matrix is not positive definite. Caused by variance estimated to be zero Suggests possibly removing this term from the model Decision often an issue in more complicated models Can remove positivity constraint (NOBOUND) proc mixed cl nobound; class school; model score = ; random school ; run; Includes null model LRT to determine if it is necessary to model the covariance structure 19 Lecture 4 Page 19