Nested Designs & Random Effects


Timothy Hanson
Department of Statistics, University of South Carolina
Stat 506: Introduction to Design of Experiments

Bottling plant production

A production engineer studied the effects of machine model (factor A) and operator (factor B) on the output in a bottling plant. Three bottling machines were used, each a different model. Twelve different machine operators were studied, four on each machine, each working a six-hour shift per day for one week. The numbers of cases produced by each machine/operator combination are:

    Machine i:     1   1   1   1   2   2   2   2   3   3   3   3
    Operator j:    1   2   3   4   5   6   7   8   9  10  11  12
    Day k = 1:    65  68  56  45  74  69  52  73  69  63  81  67
    Day k = 2:    58  62  65  56  81  76  56  78  83  70  72  79
    Day k = 3:    63  75  58  54  76  80  62  83  74  72  73  73
    Day k = 4:    57  64  70  48  80  78  58  75  78  68  76  77
    Day k = 5:    66  70  64  60  68  73  51  76  80  75  70  71

Each operator is assigned to only one of the three treatments (machines); operator is nested within treatment.

Model

The model is

$$y_{ijk} = \mu + \alpha_i + \beta_j + \epsilon_{ijk},$$

where i = 1, 2, 3 indexes machines, j = 1, ..., 12 indexes operators, and k = 1, ..., 5 indexes days.

Note that this is different from a crossed design, where every operator would use all three machines. Here, each operator is observed at one machine only.

Nested notation

We can also use different notation in which the operator number starts at one within each machine. Then each pairing (i, j) is a unique, different operator; this is commonly written as j(i), operator j nested within machine i. Thus 2(1) (second operator at machine 1) is a different person than 2(2) (second operator at machine 2).

    Machine i:     1   1   1   1   2   2   2   2   3   3   3   3
    Operator j:    1   2   3   4   1   2   3   4   1   2   3   4
    Day k = 1:    65  68  56  45  74  69  52  73  69  63  81  67
    Day k = 2:    58  62  65  56  81  76  56  78  83  70  72  79
    Day k = 3:    63  75  58  54  76  80  62  83  74  72  73  73
    Day k = 4:    57  64  70  48  80  78  58  75  78  68  76  77
    Day k = 5:    66  70  64  60  68  73  51  76  80  75  70  71

Using this notation the model is

$$y_{ijk} = \mu + \alpha_i + \beta_{j(i)} + \epsilon_{ijk}.$$

Every operator has a unique j(i).
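The two labelings can be checked against each other in base R. The sketch below is not from the slides; the variable names (`op_within`, `op_unique`, `op_from_pair`) are made up for illustration, and the layout assumes observations are stored day by day, as in the table above.

```r
# Two labelings of the same 12 operators, stored day by day (5 days x 12 operators).
machine   <- factor(rep(rep(1:3, each = 4), times = 5))   # machine i
op_within <- factor(rep(rep(1:4, times = 3), times = 5))  # j(i): restarts at 1 within each machine
op_unique <- factor(rep(1:12, times = 5))                 # j = 1, ..., 12

# Pairing machine with the within-machine index recovers 12 distinct operators.
op_from_pair <- interaction(machine, op_within, drop = TRUE)
n_operators  <- nlevels(op_from_pair)

# Nesting check: every uniquely labeled operator appears under exactly one machine.
nested <- all(colSums(table(machine, op_unique) > 0) == 1)

# The within-machine index alone looks crossed with machine (every cell is filled),
# so the model formula must say explicitly that operator is nested in machine.
looks_crossed <- all(table(machine, op_within) > 0)

print(c(n_operators = n_operators, nested = nested, looks_crossed = looks_crossed))
```

The last check is the practical reason for the j(i) notation: with labels 1 to 4 reused at every machine, nothing in the data alone distinguishes a nested layout from a crossed one.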

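ANOVA on simple nested design

    Source   df           MS                        F
    A        a - 1        MSA = SSA/dfA             MSA/MSE
    B(A)     a(b - 1)     MSB(A) = SSB(A)/dfB(A)    MSB(A)/MSE
    Error    ab(n - 1)    MSE = SSE/dfE

The sums of squares are

$$SSA = bn \sum_{i=1}^{a} (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2,$$

$$SSB(A) = n \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot})^2,$$

$$SSE = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (y_{ijk} - \bar{y}_{ij\cdot})^2.$$

The F-tests are as usual and test $H_0: \alpha_i = 0$ for all i and $H_0: \beta_{j(i)} = 0$ for all i and j, respectively.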
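As a check on the table, the three sums of squares can be computed directly from their definitions and compared with what `lm` and `anova` report. This is a minimal base-R sketch using the bottling data; the names `m_mach`, `m_cell`, and `SSBA` are introduced here for illustration only.

```r
cases <- c(65,68,56,45,74,69,52,73,69,63,81,67,
           58,62,65,56,81,76,56,78,83,70,72,79,
           63,75,58,54,76,80,62,83,74,72,73,73,
           57,64,70,48,80,78,58,75,78,68,76,77,
           66,70,64,60,68,73,51,76,80,75,70,71)
machine  <- factor(rep(rep(1:3, each = 4), times = 5))
operator <- factor(rep(rep(1:4, times = 3), times = 5))  # j(i) labeling
a <- 3; b <- 4; n <- 5

grand  <- mean(cases)                    # overall mean
m_mach <- ave(cases, machine)            # machine mean, repeated for each observation
m_cell <- ave(cases, machine, operator)  # operator-within-machine mean, repeated likewise

# Summing over all abn observations supplies the bn and n multipliers automatically.
SSA  <- sum((m_mach - grand)^2)
SSBA <- sum((m_cell - m_mach)^2)
SSE  <- sum((cases - m_cell)^2)

df <- c(A = a - 1, "B(A)" = a * (b - 1), Error = a * b * (n - 1))
MS <- c(SSA, SSBA, SSE) / df
F_stats <- MS[1:2] / MS[3]

# Same decomposition from the nested model formula.
tab <- anova(lm(cases ~ machine / operator))
print(cbind(by_hand = c(SSA, SSBA, SSE), from_lm = tab[["Sum Sq"]]))
print(F_stats)
```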

Bottling plant production in R

    library(cfcdae)
    # cases are entered day by day: each row below is one day, 12 operators
    cases=c(65,68,56,45,74,69,52,73,69,63,81,67,
            58,62,65,56,81,76,56,78,83,70,72,79,
            63,75,58,54,76,80,62,83,74,72,73,73,
            57,64,70,48,80,78,58,75,78,68,76,77,
            66,70,64,60,68,73,51,76,80,75,70,71)
    machine =factor(rep(c(1,1,1,1,2,2,2,2,3,3,3,3),5))
    operator=factor(rep(c(1,2,3,4,1,2,3,4,1,2,3,4),5))
    f=lm(cases~machine/operator) # note use of "/" to indicate nesting!
    anova(f)
    model.effects(f,machine)
    pairwise(f,machine)

One-way random effects model

If treatment levels come from a larger population, their effects are best modeled as random. A random-effects one-way model is

$$y_{ij} = \mu + \alpha_i + \epsilon_{ij},$$

where $\alpha_1, \ldots, \alpha_g \overset{iid}{\sim} N(0, \sigma_\alpha^2)$, independent of $\epsilon_{ij} \overset{iid}{\sim} N(0, \sigma^2)$. As usual, $i = 1, \ldots, g$ and $j = 1, \ldots, n_i$.

The hypothesis of interest is $H_0: \alpha_1 = \cdots = \alpha_g = 0$. Because the $\alpha_i$ are random draws, this holds if and only if $H_0: \sigma_\alpha = 0$.

$\alpha_1, \ldots, \alpha_g$ are called random effects, and $\sigma_\alpha^2$ and $\sigma^2$ are termed variance components.

This model is an example of a random effects model, because it has only random effects in it beyond the intercept $\mu$ (which is fixed).

Testing $H_0: \sigma_\alpha = 0$

The MSE and MSTR are defined as they were before. One can show $E(MSE) = \sigma^2$ and $E(MSTR) = \sigma^2 + n\sigma_\alpha^2$ when $n_i = n$ for all i.

If $\sigma_\alpha = 0$, both mean squares estimate $\sigma^2$, so we expect $F = MSTR/MSE$ to be near one. In fact, just as in the fixed-effects case, $F \sim F(g-1, N-g)$ under $H_0$, where $N = \sum_i n_i$. This is the usual test given by lm followed by anova.
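The expected mean squares quoted above can be sketched in a few lines for the balanced case $n_i = n$. This derivation is not on the slides; it is the standard argument, written in the notation of the model.

```latex
% Treatment means under the random-effects model (balanced case, n_i = n):
\bar{y}_{i\cdot} = \mu + \alpha_i + \bar{\epsilon}_{i\cdot},
\qquad
\operatorname{Var}(\bar{y}_{i\cdot}) = \sigma_\alpha^2 + \frac{\sigma^2}{n}.

% The g treatment means are iid, so their sample variance is unbiased for that variance:
E\left[\frac{1}{g-1}\sum_{i=1}^{g}(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2\right]
  = \sigma_\alpha^2 + \frac{\sigma^2}{n}.

% MSTR is n times this sample variance:
E(MSTR) = n\left(\sigma_\alpha^2 + \frac{\sigma^2}{n}\right)
        = \sigma^2 + n\,\sigma_\alpha^2 .
```

Since $E(MSE) = \sigma^2$, the two mean squares have the same expectation exactly when $\sigma_\alpha = 0$, which is why the ratio MSTR/MSE is compared with one.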

Apex Enterprises

g = 5 personnel officers were selected at random, and $n_i = 4$ prospective employee candidates were assigned at random to each officer. $y_{ij}$ is the rating given by the ith officer to their jth candidate.

Since the personnel officers are chosen randomly from a large population of personnel officers, the random-effects one-way model applies,

$$y_{ij} = \mu + \alpha_i + \epsilon_{ij}.$$

exactRLRT (from the RLRsim package) provides a different test of $H_0: \sigma_\alpha^2 = 0$ vs. $H_a: \sigma_\alpha^2 > 0$.

    rating=c(76,65,85,74,59,75,81,67,49,63,61,46,74,71,85,89,66,84,80,79)
    officer=factor(c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5))
    f=lm(rating~officer)
    anova(f) # fixed effects model
    AIC(f)
    library(lme4)   # provides lmer
    library(RLRsim) # provides exactRLRT
    f=lmer(rating~1+(1|officer),REML=FALSE)
    summary(f)
    exactRLRT(f) # random effects model
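The variance components for these data can also be estimated without lmer by equating the mean squares to their expectations, $E(MSE) = \sigma^2$ and $E(MSTR) = \sigma^2 + n\sigma_\alpha^2$. The slides do not show this calculation; the sketch below is the usual ANOVA (method-of-moments) estimator in base R, and the names `sigma2_alpha_hat` and `icc` are chosen here for illustration. The estimates need not match the ML estimates reported by `summary(f)`.

```r
rating  <- c(76,65,85,74, 59,75,81,67, 49,63,61,46, 74,71,85,89, 66,84,80,79)
officer <- factor(rep(1:5, each = 4))
n <- 4                                   # candidates per officer (balanced)

tab  <- anova(lm(rating ~ officer))
MSTR <- tab["officer",   "Mean Sq"]
MSE  <- tab["Residuals", "Mean Sq"]

# Usual F test of H0: sigma_alpha = 0, on (g - 1, N - g) degrees of freedom.
F_stat <- MSTR / MSE
p_val  <- pf(F_stat, tab["officer", "Df"], tab["Residuals", "Df"], lower.tail = FALSE)

# Moment estimators; the between-officer component is truncated at zero.
sigma2_hat       <- MSE
sigma2_alpha_hat <- max(0, (MSTR - MSE) / n)

# Share of rating variability attributable to differences among officers.
icc <- sigma2_alpha_hat / (sigma2_alpha_hat + sigma2_hat)

print(c(MSTR = MSTR, MSE = MSE, F = F_stat, p = p_val))
print(c(sigma2 = sigma2_hat, sigma2_alpha = sigma2_alpha_hat, icc = icc))
```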

Nested designs with random nested effects

Often the nested (subsampled) factor, e.g. operators, can be thought of as random, $\beta_{j(i)} \overset{iid}{\sim} N(0, \sigma_\beta^2)$.

However, it's easier to fit the nested model $y_{ijk} = \mu + \alpha_i + \beta_j + \epsilon_{ijk}$ in R, where $\beta_j \overset{iid}{\sim} N(0, \sigma_\beta^2)$ and every operator has its own index j = 1, ..., 12.

    operator=factor(rep(1:12,5)) # each operator now has own index
    f=lmer(cases~machine+(1|operator),REML=FALSE)
    summary(f)
    exactRLRT(f) # are random operator effects significantly different?
    library(lsmeans)
    lsmeans(f,"machine") # machine effects averaged over operators
    pairs(lsmeans(f,"machine")) # which machine(s) best?
    library(car)
    Anova(f,type=3)
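When operators are random, a classical alternative to the lmer fit is to reuse the nested ANOVA table but test machines against the operator mean square, since operator-to-operator variation then contributes to differences between machine means. This is the standard textbook treatment for balanced nested designs rather than something shown on the slides, so take the sketch below as an assumption-laden illustration; `sigma2_beta_hat` is a name introduced here.

```r
cases <- c(65,68,56,45,74,69,52,73,69,63,81,67,
           58,62,65,56,81,76,56,78,83,70,72,79,
           63,75,58,54,76,80,62,83,74,72,73,73,
           57,64,70,48,80,78,58,75,78,68,76,77,
           66,70,64,60,68,73,51,76,80,75,70,71)
machine  <- factor(rep(rep(1:3, each = 4), times = 5))
operator <- factor(rep(rep(1:4, times = 3), times = 5))  # j(i) labeling
n <- 5                                                   # days per operator

tab  <- anova(lm(cases ~ machine / operator))
MSA  <- tab["machine",          "Mean Sq"]
MSBA <- tab["machine:operator", "Mean Sq"]
MSE  <- tab["Residuals",        "Mean Sq"]

# Machines are compared against operator-within-machine variation, not MSE.
df_A  <- tab["machine", "Df"]
df_BA <- tab["machine:operator", "Df"]
F_machine <- MSA / MSBA
p_machine <- pf(F_machine, df_A, df_BA, lower.tail = FALSE)

# Moment estimate of the operator variance component, truncated at zero.
sigma2_beta_hat <- max(0, (MSBA - MSE) / n)

print(c(F_machine = F_machine, p_machine = p_machine,
        sigma2_beta = sigma2_beta_hat, sigma2 = MSE))
```

Note the denominator degrees of freedom drop from 48 to 9, so this test is more conservative than the fixed-effects F-test for machines.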

Random block effects and repeated measures

When block levels come from a large population, we can consider a randomized complete block design with random block effects.

One very important example of this is the repeated measures design, where each block is an experimental unit to which all treatment levels are applied in random order. In fact, the blocks are renamed subjects, and we consider a sample of subjects from their population. The model is

$$y_{ij} = \mu + \underbrace{\rho_i}_{\text{subject}} + \underbrace{\tau_j}_{\text{level}} + \epsilon_{ij},$$

where $\rho_1, \ldots, \rho_n \overset{iid}{\sim} N(0, \sigma_\rho^2)$, independent of $\epsilon_{ij} \overset{iid}{\sim} N(0, \sigma^2)$.

There are $i = 1, \ldots, n$ subjects, each receiving all of the $j = 1, \ldots, r$ treatments.

Random block effects

Note that the model can be extended to factorial treatment structure, e.g.

$$Y_{ijk} = \mu + \rho_i + \alpha_j + \beta_k + (\alpha\beta)_{jk} + \epsilon_{ijk}.$$

Examples of subjects include people, animals, families, cities, and clinics.

This is an example of a mixed effects model; there is a mix of random ($\rho_i$'s) and fixed ($\alpha_j$'s, $\beta_k$'s, and $(\alpha\beta)_{jk}$'s) effects in the model.

Random blocks, comments

We assume subject effects and treatment effects do not interact. This can be checked via Tukey's 1 df test for additivity. Also look at interaction plots, as in fixed-effects randomized complete block designs.

The ANOVA table and sums of squares are exactly the same, except now the F-test for blocks tests $H_0: \sigma_\rho = 0$ instead of $H_0: \rho_1 = \cdots = \rho_n = 0$.

The test for treatment is the same, $H_0: \tau_1 = \cdots = \tau_r = 0$.

Every treatment is given to every experimental unit in randomized order.

There are two sets of residuals to consider: (1) $e_{ij} = y_{ij} - \{\hat{\mu} + \hat{\rho}_i + \hat{\tau}_j\}$, and (2) $\hat{\rho}_i$. Both should be normal; $e_{ij}$ should have constant variance.

$\operatorname{corr}(y_{ij_1}, y_{ij_2}) = \sigma_\rho^2/(\sigma^2 + \sigma_\rho^2)$ for $j_1 \neq j_2$ tells you how correlated the repeated measures are.
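The correlation formula follows directly from the model $y_{ij} = \mu + \rho_i + \tau_j + \epsilon_{ij}$: two measurements on the same subject share the same $\rho_i$ and nothing else random. A short sketch of the step, which the slides leave implicit:

```latex
% Variance of a single observation (mu and tau_j are fixed):
\operatorname{Var}(y_{ij}) = \operatorname{Var}(\rho_i) + \operatorname{Var}(\epsilon_{ij})
                           = \sigma_\rho^2 + \sigma^2 .

% Covariance of two observations on the same subject i, with j_1 \neq j_2;
% the errors are independent of each other and of \rho_i:
\operatorname{Cov}(y_{ij_1}, y_{ij_2})
  = \operatorname{Cov}(\rho_i + \epsilon_{ij_1},\; \rho_i + \epsilon_{ij_2})
  = \operatorname{Var}(\rho_i) = \sigma_\rho^2 .

% Hence
\operatorname{corr}(y_{ij_1}, y_{ij_2})
  = \frac{\sigma_\rho^2}{\sigma^2 + \sigma_\rho^2}.
```

Observations on different subjects are uncorrelated under this model, and the within-subject correlation is the same for every pair of treatments.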

Road paint wear (p. 1082)

A state highway department studied wear of five paints at eight randomly picked locations. The standard is paint 1. Paints 1, 3, and 5 are white; paints 2 and 4 are yellow. At each location the paints were applied to the road in random order. After an exposure period, a combined measure of wear $y_{ij}$ was recorded. The higher the score, the better the wearing characteristics.

Recall the model

$$y_{ij} = \mu + \underbrace{\rho_i}_{\text{location}} + \underbrace{\tau_j}_{\text{paint}} + \epsilon_{ij},$$

where $\rho_1, \ldots, \rho_n \overset{iid}{\sim} N(0, \sigma_\rho^2)$, independent of $\epsilon_{ij} \overset{iid}{\sim} N(0, \sigma^2)$. Here r = 5 paints and n = 8 locations.

Road paint wear in R

    location=factor(rep(1:8,each=5))
    paint=factor(rep(1:5,times=8))
    wear=c(11.0,13.0,10.0,18.0,15.0,20.0,28.0,15.0,30.0,18.0,
            8.0,10.0, 8.0,16.0,12.0,30.0,35.0,27.0,41.0,28.0,
           14.0,16.0,13.0,22.0,16.0,25.0,27.0,26.0,33.0,25.0,
           43.0,46.0,41.0,55.0,42.0,13.0,14.0,12.0,20.0,13.0)
    d=data.frame(wear,location,paint)
    with(d,interactplot(location,paint,wear)) # parallel?
    source("http://people.stat.sc.edu/hansont/stat506/tukey.r")
    tukeys.add.test(wear,location,paint)
    f=lm(wear~paint+location)
    anova(f)
    AIC(f)
    f=lmer(wear~paint+(1|location),REML=FALSE)
    summary(f)
    exactRLRT(f) # are random location effects significantly different?
    lsmeans(f,"paint") # paint effects
    pairs(lsmeans(f,"paint")) # which paint(s) best?
    # yellow vs. white:
    contrast(lsmeans(f,"paint"),list(c1=c(-1/3,1/2,-1/3,1/2,-1/3)))
    Anova(f,type=3) # whether paint is significant
    qqnorm((ranef(f)$location[,1])) # random effects normal?
    plot(f) # different plot because fit w/ lmer
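The correlation between paints at the same location, $\sigma_\rho^2/(\sigma^2 + \sigma_\rho^2)$, can be estimated from the ordinary ANOVA table alone. The sketch below assumes the standard expected mean squares for a balanced randomized complete block design with random blocks, $E(MSBL) = \sigma^2 + r\sigma_\rho^2$ and $E(MSE) = \sigma^2$, which the slides do not state; the names `MSBL`, `sigma2_rho_hat`, and `corr_hat` are illustrative.

```r
location <- factor(rep(1:8, each = 5))
paint    <- factor(rep(1:5, times = 8))
wear <- c(11,13,10,18,15, 20,28,15,30,18,
           8,10, 8,16,12, 30,35,27,41,28,
          14,16,13,22,16, 25,27,26,33,25,
          43,46,41,55,42, 13,14,12,20,13)
r <- nlevels(paint)                      # treatments per block

tab  <- anova(lm(wear ~ paint + location))
MSBL <- tab["location",  "Mean Sq"]      # block (location) mean square
MSE  <- tab["Residuals", "Mean Sq"]

# F test for H0: sigma_rho = 0, same statistic as the fixed-block test.
F_block <- MSBL / MSE
p_block <- pf(F_block, tab["location", "Df"], tab["Residuals", "Df"], lower.tail = FALSE)

# Moment estimates of the variance components and the implied correlation.
sigma2_rho_hat <- max(0, (MSBL - MSE) / r)
corr_hat       <- sigma2_rho_hat / (sigma2_rho_hat + MSE)

print(c(MSBL = MSBL, MSE = MSE, F_block = F_block, p_block = p_block))
print(c(sigma2_rho = sigma2_rho_hat, sigma2 = MSE, corr = corr_hat))
```

A correlation estimate near one would indicate that most of the variability in wear comes from differences among locations, which is the situation in which blocking on location pays off.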

Wine tasting

r = 4 Chardonnay wines of the same vintage were judged by n = 6 judges. Each wine was blinded and given to each judge in randomized order. The wines were scored on a 40-point scale $y_{ij}$, with higher scores meaning better wine.

The six judges are considered to come from a large population of wine-tasting judges, and so a repeated measures model is appropriate.

Let's examine these data in R...

Wine tasting in R

    rating=c(20,24,28,28,15,18,23,24,18,19,24,23,26,26,30,30,22,24,28,26,19,21,27,25)
    judge=factor(c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5,6,6,6,6))
    wine=factor(c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4))
    d=data.frame(rating,judge,wine)
    with(d,interactplot(judge,wine,rating)) # parallel?
    tukeys.add.test(rating,wine,judge)
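The `tukeys.add.test` function used above is sourced from the course web page, so its internals are not visible here. For readers without access to it, the following is a base-R sketch of Tukey's 1 df test for additivity as it is usually defined, applied to the wine data; it is an independent illustration and is not claimed to reproduce the course function's output format. The names `a_i`, `b_j`, and `z` are introduced for this example.

```r
rating <- c(20,24,28,28, 15,18,23,24, 18,19,24,23,
            26,26,30,30, 22,24,28,26, 19,21,27,25)
judge <- factor(rep(1:6, each = 4))
wine  <- factor(rep(1:4, times = 6))

add_fit <- lm(rating ~ judge + wine)          # additive (no-interaction) model

# Estimated judge and wine effects, repeated for each observation.
a_i <- ave(rating, judge) - mean(rating)
b_j <- ave(rating, wine)  - mean(rating)
z   <- a_i * b_j                              # single interaction covariate

# One-df sum of squares for non-additivity and the remaining error.
ss_nonadd <- sum(rating * z)^2 / sum(z^2)
sse_add   <- sum(resid(add_fit)^2)
df_rem    <- df.residual(add_fit) - 1         # (n - 1)(r - 1) - 1

F_stat <- ss_nonadd / ((sse_add - ss_nonadd) / df_rem)
p_val  <- pf(F_stat, 1, df_rem, lower.tail = FALSE)

# Equivalent check: add z to the additive model and compare the two fits.
aug_fit <- lm(rating ~ judge + wine + z)
cmp <- anova(add_fit, aug_fit)

print(c(F = F_stat, df1 = 1, df2 = df_rem, p = p_val))
```

A large p-value is consistent with the assumption that judge and wine effects do not interact, which is what the random-block model requires.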