
Lecture 13
Topic 9: Factorial treatment structures (Part II)

Increasing precision by partitioning the error sum of squares:

    F = MST / MSE = s²(among) / s²(within) = (SST / df_trt) / (SSE / df_e)

    Blocking: SSE (CRD) → SSB + SSE (RCBD)

Increasing precision by partitioning the treatment sum of squares:

    F = MST / MSE = s²(among) / s²(within) = (SST / df_trt) / (SSE / df_e)

    Contrasts: SST → (t - 1) orthogonal contrasts

A factorial treatment structure ensures that the treatment SS (SST) can be partitioned into very specific orthogonal components (main effects and interaction effects):

    SST → SSA + SSB + SS(AB)
    SST → SSA + SSB + SSC + SS(AB) + SS(AC) + SS(BC) + SS(ABC)

The significance of the interaction F test determines what kind of subsequent analysis is appropriate:

    No significant interaction: Subsequent analyses (mean comparisons, contrasts, etc.) are performed on the main effects.

    Significant interaction: Subsequent analyses (mean comparisons, contrasts, etc.) are performed on the simple effects.
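This partition is easy to see in R. A minimal sketch with simulated data (not from the lecture; all object names here are illustrative): in a balanced factorial, the sums of squares reported by anova() for A, B, and A:B are orthogonal pieces that, together with the error SS, add up to the total SS.

#Sketch: partition of the total SS in a balanced 3x3 factorial (simulated data)
set.seed(1)
fake_dat <- expand.grid(A = factor(1:3), B = factor(1:3), rep = 1:4)
fake_dat$y <- rnorm(nrow(fake_dat), mean = 10 + as.numeric(fake_dat$A) * as.numeric(fake_dat$B))
fake_mod <- lm(y ~ A + B + A*B, fake_dat)
anova(fake_mod)                              #SSA, SSB, SS(AB), SSE
sum(anova(fake_mod)[, "Sum Sq"])             #equals the total SS...
sum((fake_dat$y - mean(fake_dat$y))^2)       #...computed directly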

Another example of partitioning the interaction SS

A 3x3 factorial experiment was conducted to determine the effects of the vernalization genes Vrn1 and Vrn2 on flowering time (days to flowering) in wheat. 102 plants from a segregating population (parents A and B) were characterized with molecular markers, and the number of alleles of parent A was recorded for each of the two genes (BB = 0, AB = 1, AA = 2).

    Gene Vrn1            Gene Vrn2 genotype (dose of allele A)
    genotype (dose)      BB (0)          AB (1)          AA (2)
    BB (0)               00 = Type 1     01 = Type 2     02 = Type 3
    AB (1)               10 = Type 4     11 = Type 5     12 = Type 6
    AA (2)               20 = Type 7     21 = Type 8     22 = Type 9

#The ANOVA
vrn_mod <- lm(days ~ Vrn1 + Vrn2 + Vrn1*Vrn2, vrn_dat)

Results (obtained via Anova(), part of the "car" package, using partial SS):

                 Sum Sq   Df   F value     Pr(>F)
    (Intercept)   85812    1   7761.1659   < 2.2e-16  ***
    Vrn1           4435    2      2.558    5.734e-8   ***
    Vrn2           2131    2     96.3683   < 2.2e-16  ***
    Vrn1:Vrn2        88    4      1.8267   0.133      NS
    Residuals      1282   93

Is it worth partitioning the interaction SS?

    F_calc = [SS(AB) / 1] / MSE = 87.8671 / 11.56357 = 7.31
    F_crit = F(0.05; 1, 93) = 3.96

Since F_calc > F_crit, a single 1-df component of the interaction could be significant, so it is worth partitioning.
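The check above can be scripted. A minimal sketch, assuming the vrn_mod fit and the Anova() output from the "car" package shown above (the row and column names below are the usual ones for that output and are used here as an assumption): the largest sum of squares any single-df contrast within the interaction could capture is SS(Vrn1:Vrn2) itself, so compare [SS(AB)/1]/MSE to the critical F with 1 and 93 df.

#Sketch: is it worth partitioning the interaction SS?
library(car)
vrn_anova <- Anova(vrn_mod, type = 3)
ss_ab  <- vrn_anova["Vrn1:Vrn2", "Sum Sq"]
df_e   <- vrn_anova["Residuals", "Df"]
mse    <- vrn_anova["Residuals", "Sum Sq"] / df_e
F_calc <- (ss_ab / 1) / mse
F_crit <- qf(0.95, df1 = 1, df2 = df_e)
c(F_calc = F_calc, F_crit = F_crit)     #F_calc > F_crit => worth partitioning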

Partitioning the interaction

#The same contrasts are used for factors Vrn1 and Vrn2
# Contrast 'Linear':    -1,  0, 1
# Contrast 'Quadratic':  1, -2, 1
contrastmatrix <- cbind(c(-1, 0, 1), c(1, -2, 1))
contrasts(vrn_dat$Vrn1) <- contrastmatrix
contrasts(vrn_dat$Vrn2) <- contrastmatrix

vrn_rcon_mod <- lm(days ~ Vrn1 + Vrn2 + Vrn1*Vrn2, vrn_dat)
summary(vrn_rcon_mod)

                   Estimate    Std. Error   t value   Pr(>|t|)
    Vrn11          -9.45893     1.877        -5.58    2.12e-6    ***
    Vrn12           2.20747     0.79774       2.767   0.006822   **
    Vrn21          22.6763      1.87398      12.098   < 2e-16    ***
    Vrn22          -3.06971     0.79467      -3.863   0.000207   ***
    Vrn11:Vrn21    -6.30952     2.64122      -2.389   0.01892    *
    Vrn12:Vrn21     0.35317     1.08924       0.324   0.746484   NS
    Vrn11:Vrn22    -0.06488     1.0825       -0.06    0.952335   NS
    Vrn12:Vrn22    -0.59239     0.49082      -1.207   0.230512   NS

There is a significant linear by linear component of the interaction! Great. What does that mean?
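What "linear by linear" means can be seen directly from the contrast coefficients. A small sketch (purely illustrative): the linear x linear interaction contrast on the nine genotype-class means is the product of the two single-gene linear contrasts, so it weights only the four "corner" classes (no dose vs. full dose of each gene).

#Sketch: coefficients of the linear x linear interaction contrast
lin <- c(-1, 0, 1)                          #linear contrast across doses 0, 1, 2
lin_by_lin <- kronecker(lin, lin)           #one coefficient per genotype class
matrix(lin_by_lin, nrow = 3,
       dimnames = list(Vrn1 = c("BB", "AB", "AA"), Vrn2 = c("BB", "AB", "AA")))
#Only the BB/BB, BB/AA, AA/BB, and AA/AA classes receive non-zero weight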

A linear x linear interaction: The effect of one factor (full dose vs. no dose) is different depending on the level (full dose vs. no dose) of the other factor.

[Two interaction plots: "Interactions of Vrn2 and Vrn1" (Days vs. Vrn1 dose 0-2, one line per Vrn2 level) and "Interactions of Vrn1 and Vrn2" (Days vs. Vrn2 dose 0-2, one line per Vrn1 level).]

2x2 CRD with a significant interaction (testing simple effects)

In this CRD with five replications per treatment combination, Factor A is the genotype of mungbean (Control vs. Advanced Line) and Factor B is drought stress (Not irrigated vs. Irrigated). The response variable is yield.

                                 Factor A
    Factor B                Control    Advanced    Means      Simple effect of A within B_i
    Not irrigated            19.36      27.81      23.585      8.45
    Irrigated                13.28      36.53      24.905     23.25
    Means                    16.32      32.17      24.245
    Simple effect of B
    within A_i               -6.08       8.72

#The ANOVA
mung_mod <- lm(yield ~ A + B + A*B, mung_dat)
anova(mung_mod)

Output

                 Df   Sum Sq    Mean Sq   F value   Pr(>F)
    A             1   1256.75   1256.75   52.9263   1.859e-6   ***
    B             1      8.71      8.71    0.3669   0.553198
    A:B           1    273.95    273.95   11.537    0.003686   **
    Residuals    16    379.92     23.75

[Two interaction plots: "Interactions of B and A" (Yield vs. A = Adv/Cont, one line per irrigation level) and "Interactions of A and B" (Yield vs. B = Irr/NoIrr, one line per genotype).]

If an interaction is present in a fixed-effects model, the next step is to analyze the simple effects.
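The means and simple effects in the table above can be computed directly from the data. A minimal sketch, assuming the mung_dat data frame used above (factor level names as in the subset() calls shown next):

#Sketch: cell means and simple effects for the 2x2
cell_means <- with(mung_dat, tapply(yield, list(B, A), mean))
cell_means                                    #2x2 table of cell means
cell_means[, "Adv"] - cell_means[, "Cont"]    #simple effects of A within each level of B
cell_means["Irr", ] - cell_means["NoIrr", ]   #simple effects of B within each level of A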

An easy way of testing simple effects is to subset the data according to the levels of the factors involved and conduct individual ANOVAs within each level:

#Analyze the simple effects by subsetting the data...
mung_irr_dat   <- subset(mung_dat, mung_dat$B == "Irr")
mung_noirr_dat <- subset(mung_dat, mung_dat$B == "NoIrr")
mung_cont_dat  <- subset(mung_dat, mung_dat$A == "Cont")
mung_adv_dat   <- subset(mung_dat, mung_dat$A == "Adv")

#...and then performing multiple ANOVAs
anova(lm(yield ~ A, mung_irr_dat))
anova(lm(yield ~ A, mung_noirr_dat))
anova(lm(yield ~ B, mung_cont_dat))
anova(lm(yield ~ B, mung_adv_dat))

The results (F tests pulled from the four separate ANOVAs):

                    Df   Sum Sq     Mean Sq    F value   Pr(>F)
    Geno (Irr)       1   1352.1     1352.1     33.559    0.000484   ***
    Geno (NoIrr)     1    178.591    178.591   24.82     0.001079   **
    Irr (Cont)       1     92.477     92.477    7.767    0.02371    *
    Irr (Adv)        1    190.18     190.183    5.3461   0.04952    *

Compare this to the previous F test for the main effect of Irrigation:

                 Df   Sum Sq    Mean Sq   F value   Pr(>F)
    Geno          1   1256.75   1256.75   52.9263   1.859e-6   ***
    Irr           1      8.71      8.71    0.3669   0.553198   NS
    Geno:Irr      1    273.95    273.95   11.537    0.003686   **
    Residuals    16    379.92     23.75

Why not use contrasts to probe the simple effects? These contrasts are not orthogonal (prove it to yourself), so the results you would obtain are not identical to those above.
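To see why the simple-effect contrasts are not mutually orthogonal, write each simple effect as a contrast on the four cell means and check the cross-products. A minimal sketch (cell means ordered Cont:NoIrr, Adv:NoIrr, Cont:Irr, Adv:Irr; purely illustrative):

#Sketch: simple effects written as contrasts on the four cell means
simple_effects <- rbind(
  A_within_NoIrr = c(-1,  1,  0,  0),   #Adv vs. Cont, not irrigated
  A_within_Irr   = c( 0,  0, -1,  1),   #Adv vs. Cont, irrigated
  B_within_Cont  = c(-1,  0,  1,  0),   #Irr vs. NoIrr, control line
  B_within_Adv   = c( 0, -1,  0,  1)    #Irr vs. NoIrr, advanced line
)
simple_effects %*% t(simple_effects)    #non-zero off-diagonal entries => not orthogonal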

Example of a factorial experiment with subsamples

Suppose in the quack-grass shoots experiment that two random areas of 1 square foot each were evaluated in each plot (each R-D combination):

    D     R    Block   Plot   Area   Number
    3     0    1       1      1      14.7
    3     0    2       1      1      13.6
    3     0    3       1      1      15.5
    ...   ...  ...     ...    ...     ...
    10    8    2       1      2       9.2
    10    8    3       1      2      12.3
    10    8    4       1      2      12.2

The "Plot" and "Area" columns are here just to be very explicit about the fact that there is one experimental unit (Plot) per D:R:Block combination, and there are two subsamples (Areas) per experimental unit. The correct linear model in R, however, does not make use of either of these classification variables:

#The ANOVA
quack_mod <- lm(Number ~ D + R + D*R + Block + D:R:Block, quack_dat)
anova(quack_mod)

The output:

                  Df   Sum Sq    Mean Sq    F value   Pr(>F)
    D              1     3.000     3.000     1.5000   0.23256
    R              2   307.327   153.663    76.8317   3.693e-11  ***
    Block          3     1.163     0.388     0.1939   0.89952
    D:R            2     0.980     0.490     0.2450   0.78464
    D:R:Block     15    78.767     5.251     2.6256   0.01698    *
    Residuals     24    48.000     2.000

Using the correct error term (D:R:Block) for testing D, R, Block, and D:R...

                  Df   Sum Sq    Mean Sq    F value   Pr(>F)
    D              1     3.000     3.000     0.5713   0.46144
    R              2   307.327   153.663    29.2636   6.642e-6   ***
    Block          3     1.163     0.388     0.0739   0.9739
    D:R            2     0.980     0.490     0.0933   0.91143
    D:R:Block     15    78.767     5.251     2.6256   0.01698    *
    Residuals     24    48.000     2.000
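The same tests against the D:R:Block mean square can be obtained directly by declaring the experimental unit as an error stratum. A minimal sketch, assuming the quack_dat data frame above with D, R, and Block coded as factors:

#Sketch: fitting the subsampling model with an explicit error stratum
quack_aov <- aov(Number ~ D*R + Block + Error(Block:D:R), data = quack_dat)
summary(quack_aov)
#The "Error: Block:D:R" stratum tests D, R, Block, and D:R against MS(D:R:Block) = 5.251;
#the "Error: Within" stratum is the subsampling error (MSE = 2.0 above)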

The objective of analyzing the experiment using the individual subsample values is simply to understand the sources of variation in the experimental error:

#Calculating components of variance
library(lme4)
quackcv_mod <- lmer(Number ~ D + R + D*R + Block + (1 | Block:D:R), quack_dat)
summary(quackcv_mod)

The output:

    Random effects:
     Groups      Name         Variance   Std.Dev.
     Block:D:R   (Intercept)  1.626      1.275
     Residual                 2.000      1.414
    Number of obs: 48, groups: Block:D:R, 24

If one plot (EU) costs $50 and one subsample costs $5, then

    n_sub = sqrt[ (C_eu * s²_sub) / (C_sub * s²_eu) ] = sqrt[ (50 * 2.00) / (5 * 1.63) ] = 3.5

and the optimum allocation of resources would be to take more than 3 subsamples per plot (i.e. 4).
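The same calculation can be scripted from the fitted variance components. A minimal sketch, assuming the quackcv_mod fit above (the cost figures are the ones assumed in the text):

#Sketch: optimum number of subsamples per plot
library(lme4)
vc      <- as.data.frame(VarCorr(quackcv_mod))
var_eu  <- vc$vcov[vc$grp == "Block:D:R"]     #plot-to-plot (EU) variance, ~1.63
var_sub <- vc$vcov[vc$grp == "Residual"]      #subsampling variance, ~2.0
C_eu  <- 50                                   #cost of one plot
C_sub <- 5                                    #cost of one subsample
sqrt((C_eu * var_sub) / (C_sub * var_eu))     #~3.5, so take 4 subsamples per plot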

Three-way ANOVA (Model I, fixed-effects model)

Three or more factors may be analyzed simultaneously, each at different levels. However:

    As the number of factors increases, the required number of experimental units becomes very large.

    It may not be possible to run all the tests on the same day or to hold all of the material in a single controlled environmental chamber.

    A large number of higher-order interactions arise that are difficult to interpret.

The linear model:

    Y_ijkl = µ + τ_Ai + τ_Bj + τ_Ck + (τ_A τ_B)_ij + (τ_A τ_C)_ik + (τ_B τ_C)_jk + (τ_A τ_B τ_C)_ijk + ε_ijkl

Example

C.J. Monlezun (1979). Two-dimensional plots for interpreting interactions in the three-factor analysis of variance model. The American Statistician 33: 63-69.

The following group means from a hypothetical 3x5x2 experiment are used to illustrate an example with no three-way interaction.

          A1C1   A2C1   A3C1   A1C2   A2C2   A3C2
    B1     61     38     81     31     27    113
    B2     39     61     49     68    103    143
    B3    121     82     41     78     57     63
    B4     79     68     59    122    127    167
    B5     91     31     61     92     43    128
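In R, the full three-factor model is specified with A*B*C, which expands to the three main effects, the three two-way interactions, and the three-way interaction, matching the linear model above. A minimal sketch with simulated balanced data (names are illustrative, not from the Monlezun example):

#Sketch: specifying a three-way fixed-effects ANOVA
set.seed(2)
three_dat <- expand.grid(A = factor(1:3), B = factor(1:5), C = factor(1:2), rep = 1:2)
three_dat$Response <- rnorm(nrow(three_dat), mean = 50)
three_mod <- lm(Response ~ A*B*C, three_dat)
anova(three_mod)        #SSA, SSB, SSC, SS(AB), SS(AC), SS(BC), SS(ABC), SSE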

There is a two-way interaction between factors A and B at both levels of factor C:

[Two interaction plots: "A x B at C1" and "A x B at C2", response vs. B1-B5 with one line per level of A (A1, A2, A3).]

So the first-order interaction (A:B) has two values: (A:B at C1) and (A:B at C2). The interaction term (A:B) is the average of these values.

Similarly, there is a two-way interaction between factors B and C at each level of factor A:

[Three interaction plots: "B x C at A1", "B x C at A2", and "B x C at A3", response vs. B1-B5 with one line per level of C (C1, C2).]

Five interaction plots are needed to visualize the two-way A x C interactions at each level of B.

Visualizing a three-way interaction:

    A   B   C   Response   Effect of C (C1 - C2)
    1   1   1      61          30
    2   1   1      38          11
    3   1   1      81         -32
    1   1   2      31
    2   1   2      27
    3   1   2     113
    1   2   1      39         -29
    2   2   1      61         -42
    3   2   1      49         -94
    1   2   2      68
    2   2   2     103
    3   2   2     143
    1   3   1     121          43
    2   3   1      82          25
    3   3   1      41         -22
    1   3   2      78
    2   3   2      57
    3   3   2      63
    1   4   1      79         -43
    2   4   1      68         -59
    3   4   1      59        -108
    1   4   2     122
    2   4   2     127
    3   4   2     167
    1   5   1      91          -1
    2   5   1      31         -12
    3   5   1      61         -67
    1   5   2      92
    2   5   2      43
    3   5   2     128

[Interaction plot: "A x B, where Response = Effect of C" — effect of C (C1 - C2) vs. B1-B5, one line per level of A (A1, A2, A3).]
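That plot can be reproduced with base R's interaction.plot(). A minimal sketch, with the "Effect of C" column typed in from the table above:

#Sketch: A x B interaction plot where the response is the effect of C (C1 - C2)
effC_dat <- data.frame(
  A        = factor(rep(1:3, times = 5)),
  B        = factor(rep(1:5, each = 3)),
  effect_C = c( 30,  11,  -32,        #B1
               -29, -42,  -94,        #B2
                43,  25,  -22,        #B3
               -43, -59, -108,        #B4
                -1, -12,  -67))       #B5
with(effC_dat, interaction.plot(B, A, effect_C,
                                xlab = "B", trace.label = "A",
                                ylab = "Effect of C (C1 - C2)"))
#Roughly parallel lines: the A:C differences change little across B (no three-way interaction)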

Each line represents one level of A. The average height of each line represents the average effect of C at that level of A. While these averages differ among lines (i.e. A:C is significant), the differences among lines are fairly constant across all levels of B. The roughly parallel nature of the lines in this interaction plot shows us that the differences in the effect of C among the levels of A do not vary significantly across the levels of B. [Translation: No significant three-way interaction.]

Example: Let A be nitrogen level and B be plow depth. In a simple two-factor experiment, a significant two-way interaction (A:B) means that the crop's response to nitrogen varies, depending on plow depth.

Now introduce a third factor C, which could be soil type. A significant three-way interaction (A:B:C) means that the effect of plow depth on the crop's response to nitrogen varies, depending on the soil type.