Lecture 10. Factorial experiments (2-way ANOVA etc)

Jesper Rydén, Matematiska institutionen, Uppsala universitet (jesper@math.uu.se)
Regression and Analysis of Variance, autumn 2014

A factorial experiment
Two factors A and B, each run at two levels ("low" and "high").
[Figure: two-factor factorial experiment, responses in the corners]
[Figure: interaction plot (no interaction)]

Another factorial experiment
Two factors A and B, each run at two levels ("low" and "high").
[Figure: two-factor factorial experiment, responses in the corners]
[Figure: interaction plot (interaction)]
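The two situations above can be illustrated numerically. The sketch below computes the main effects and the interaction effect for a 2x2 factorial with one response per corner. The corner responses are hypothetical illustration values (the figures' own numbers are not reproduced here), and the function name `two_level_effects` is likewise only an assumption for this example.

```python
def two_level_effects(y_ll, y_hl, y_lh, y_hh):
    """Effects in a 2x2 factorial with one response per corner.

    Argument names: first letter is the level of A, second the level of B
    (l = low, h = high), so y_hl is the response at A high, B low.
    """
    # Main effect: average response at the high level minus average at the low level.
    effect_a = (y_hl + y_hh) / 2 - (y_ll + y_lh) / 2
    effect_b = (y_lh + y_hh) / 2 - (y_ll + y_hl) / 2
    # Interaction: half the difference between the effect of A at B high
    # and the effect of A at B low.
    effect_ab = ((y_hh - y_lh) - (y_hl - y_ll)) / 2
    return effect_a, effect_b, effect_ab


# Hypothetical corner responses (not taken from the lecture figures).
parallel = two_level_effects(20, 40, 30, 50)   # lines in the interaction plot are parallel
crossing = two_level_effects(20, 50, 40, 12)   # lines in the interaction plot cross

print("no interaction:", parallel)
print("interaction:   ", crossing)
```

With a large interaction, as in the second call, the main effect of A is close to zero even though A clearly matters at each level of B, which is why main effects are hard to interpret on their own in that case.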

Data arrays (Battery data)
Response: life (in hours) of a battery.
Factor A: material type (1, 2 or 3).
Factor B: temperature (in °F), levels 15, 70 or 125.
This is a $3^2$ factorial design. Temperature can be controlled in the laboratory.

The effects model
The effects model for a two-factor factorial design:
$y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \varepsilon_{ijk}$, with $i = 1, 2, \ldots, a$; $j = 1, 2, \ldots, b$; $k = 1, 2, \ldots, n$,
where
$\mu$: overall mean effect
$\tau_i$: effect of the $i$th level of the row factor A
$\beta_j$: effect of the $j$th level of the column factor B
$(\tau\beta)_{ij}$: effect of the interaction between $\tau_i$ and $\beta_j$
$\varepsilon_{ijk}$: random error component
Both factors are assumed to be fixed, with
$\sum_{i=1}^{a} \tau_i = 0$, $\sum_{j=1}^{b} \beta_j = 0$,
and for the interactions
$\sum_{i=1}^{a} (\tau\beta)_{ij} = \sum_{j=1}^{b} (\tau\beta)_{ij} = 0$.

The means model
The means model for a two-factor factorial design:
$y_{ijk} = \mu_{ij} + \varepsilon_{ijk}$, with $i = 1, 2, \ldots, a$; $j = 1, 2, \ldots, b$; $k = 1, 2, \ldots, n$,
where the mean of the $ij$th cell is
$\mu_{ij} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij}$.

Testing hypotheses
Testing equality of row treatment effects:
$H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0$
$H_1:$ at least one $\tau_i \neq 0$
Testing equality of column treatment effects:
$H_0: \beta_1 = \beta_2 = \cdots = \beta_b = 0$
$H_1:$ at least one $\beta_j \neq 0$
Test of interaction:
$H_0: (\tau\beta)_{ij} = 0$ for all $i, j$
$H_1:$ at least one $(\tau\beta)_{ij} \neq 0$

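Estimation of parameters
The least-squares method applied to the effects model with constraints results in the point estimates
$\hat\mu = \bar y_{...}$,
$\hat\tau_i = \bar y_{i..} - \bar y_{...}$,
$\hat\beta_j = \bar y_{.j.} - \bar y_{...}$,
$\widehat{(\tau\beta)}_{ij} = \bar y_{ij.} - \bar y_{i..} - \bar y_{.j.} + \bar y_{...}$,
for $i = 1, 2, \ldots, a$ and $j = 1, 2, \ldots, b$.
Fitted value: $\hat y_{ijk} = \hat\mu + \hat\tau_i + \hat\beta_j + \widehat{(\tau\beta)}_{ij} = \bar y_{ij.}$.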

ANOVA table: Fixed-effects case, two factors
Source of variation   Sum of squares   Degrees of freedom   Mean square                           F_0
A treatments          SS_A             a - 1                MS_A = SS_A / (a - 1)                 F_0 = MS_A / MS_E
B treatments          SS_B             b - 1                MS_B = SS_B / (b - 1)                 F_0 = MS_B / MS_E
Interaction           SS_AB            (a - 1)(b - 1)       MS_AB = SS_AB / ((a - 1)(b - 1))      F_0 = MS_AB / MS_E
Error                 SS_E             ab(n - 1)            MS_E = SS_E / (ab(n - 1))
Total                 SS_T             abn - 1
Decomposition of the total sum of squares, with degrees of freedom under each term ($N = abn$):
$\underbrace{SS_T}_{N-1} = \underbrace{SS_{Treat}}_{ab-1} + \underbrace{SS_E}_{N-ab} = \underbrace{SS_A}_{a-1} + \underbrace{SS_B}_{b-1} + \underbrace{SS_{AB}}_{(a-1)(b-1)} + \underbrace{SS_E}_{N-ab}$
For the so-called machine formulae for computing the sums of squares, see the textbook.
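As a reminder of what the entries in the table stand for, the sums of squares can be written in their definitional form, using the same dot notation as for the parameter estimates. These are the standard expressions for a balanced two-factor design; the textbook's machine formulae are algebraically equivalent rearrangements, so this is a sketch for orientation rather than the computational form referred to above.

```latex
\begin{align*}
SS_T    &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (y_{ijk} - \bar y_{...})^2, \\
SS_A    &= bn \sum_{i=1}^{a} (\bar y_{i..} - \bar y_{...})^2, \\
SS_B    &= an \sum_{j=1}^{b} (\bar y_{.j.} - \bar y_{...})^2, \\
SS_{AB} &= n \sum_{i=1}^{a}\sum_{j=1}^{b} (\bar y_{ij.} - \bar y_{i..} - \bar y_{.j.} + \bar y_{...})^2, \\
SS_E    &= \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (y_{ijk} - \bar y_{ij.})^2 .
\end{align*}
```

Each sum of squares is built from the corresponding estimated effects, which is why $SS_T = SS_A + SS_B + SS_{AB} + SS_E$ holds in the balanced case.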

ANOVA table (battery data)

ANOVA table, no interaction (battery data)
An experimenter may assume a two-factor model without interaction, i.e.
$y_{ijk} = \mu + \tau_i + \beta_j + \varepsilon_{ijk}$, with $i = 1, 2, \ldots, a$; $j = 1, 2, \ldots, b$; $k = 1, 2, \ldots, n$.
ANOVA table for the battery data (where the interaction was in fact significant, $p < 0.05$):

Interaction plots (battery data)

Example. Yield of harvesting
Three types of crop are investigated at two stations (North and South). At each station, four fields are chosen at random and harvested after a fixed time, when the yield is measured. The data are given as a table of mean values for each factor combination:
           Type of crop
Station    I     II    III
North      12    5     9
South      14    9     11
Analyse the experiment. (Blackboard)
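As a starting point for the analysis, the effect estimates and the treatment sums of squares can be computed directly from the table of cell means. The sketch below does this with exact fractions. It assumes a crossed two-factor layout with n = 4 observations behind every cell mean, which is one reading of "four fields at each station"; the variable names are illustrative. The error sum of squares cannot be obtained from cell means alone, so it is left out.

```python
from fractions import Fraction

# Cell means from the table: rows are stations, columns are crop types I, II, III.
cell_means = {
    "North": [12, 5, 9],
    "South": [14, 9, 11],
}
n = 4  # ASSUMPTION: each cell mean is based on four observations

stations = list(cell_means)
a = len(stations)                # levels of the row factor (station)
b = len(cell_means["North"])     # levels of the column factor (crop type)

# Point estimates: grand mean, row effects, column effects, interaction effects.
grand = Fraction(sum(sum(row) for row in cell_means.values()), a * b)
station_eff = {s: Fraction(sum(cell_means[s]), b) - grand for s in stations}
crop_eff = [Fraction(sum(cell_means[s][j] for s in stations), a) - grand
            for j in range(b)]
interaction = {
    s: [cell_means[s][j] - grand - station_eff[s] - crop_eff[j] for j in range(b)]
    for s in stations
}

# Treatment sums of squares built from the estimated effects.
ss_station = b * n * sum(e ** 2 for e in station_eff.values())
ss_crop = a * n * sum(e ** 2 for e in crop_eff)
ss_inter = n * sum(e ** 2 for row in interaction.values() for e in row)

print("grand mean:", grand)
print("station effects:", station_eff)
print("crop effects:", crop_eff)
print("SS station, crop, interaction:", ss_station, ss_crop, ss_inter)
```

Completing the ANOVA table additionally requires the within-cell variation, that is, the individual field yields or an error sum of squares.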

Residuals (battery data)
Residual analysis of
$e_{ijk} = y_{ijk} - \hat y_{ijk} = y_{ijk} - \bar y_{ij.}$

Residuals (battery data)
Montgomery: mild inequality of variance, with the treatment combination of 15 °F and material type 1 having larger variance than the others.

A two-factor factorial with random factors
In an experiment, $a$ levels are chosen at random from factor A and $b$ levels are chosen at random from factor B. The experiment is replicated $n$ times.
$y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \varepsilon_{ijk}$, with $i = 1, 2, \ldots, a$; $j = 1, 2, \ldots, b$; $k = 1, 2, \ldots, n$,
where the model parameters $\tau_i$, $\beta_j$, $(\tau\beta)_{ij}$ and $\varepsilon_{ijk}$ are independent random variables:
$\tau_i \sim N(0, \sigma^2_\tau)$, $\beta_j \sim N(0, \sigma^2_\beta)$, $(\tau\beta)_{ij} \sim N(0, \sigma^2_{\tau\beta})$, $\varepsilon_{ijk} \sim N(0, \sigma^2)$.
ANOVA table, estimation: see blackboard.
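For reference, the expected mean squares for this random-effects model take the form below. These are the standard textbook results for the balanced two-factor random model; the lecture itself defers the ANOVA table to the blackboard, so this is a reminder under that assumption, not a transcript of the derivation.

```latex
\begin{align*}
E(MS_A)    &= \sigma^2 + n\sigma^2_{\tau\beta} + bn\,\sigma^2_{\tau}, \\
E(MS_B)    &= \sigma^2 + n\sigma^2_{\tau\beta} + an\,\sigma^2_{\beta}, \\
E(MS_{AB}) &= \sigma^2 + n\sigma^2_{\tau\beta}, \\
E(MS_E)    &= \sigma^2 .
\end{align*}
```

Two consequences follow. The main effects are tested against the interaction mean square, $F_0 = MS_A / MS_{AB}$ and $F_0 = MS_B / MS_{AB}$, while the interaction is tested with $F_0 = MS_{AB} / MS_E$. Equating observed and expected mean squares gives the variance component estimates $\hat\sigma^2 = MS_E$, $\hat\sigma^2_{\tau\beta} = (MS_{AB} - MS_E)/n$, $\hat\sigma^2_{\tau} = (MS_A - MS_{AB})/(bn)$ and $\hat\sigma^2_{\beta} = (MS_B - MS_{AB})/(an)$.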

Example. Piglets
In an experiment, we have 20 randomly chosen farms (factor B) with pigs of the type GIP and 15 randomly chosen breeding boars (Swedish: avelsgaltar), factor A. Each breeding boar is mated with two independent sows (Swedish: suggor) from each farm. The response is the number of piglets born alive in each litter (Swedish: kull).
$SS_A = 2296$, $SS_B = 931$, $SS_{AB} = 1064$, $SS_T = 4591$. Moreover, $\sum_{i,j,k} y_{ijk} = 8760$.
Propose a model, estimate the parameters and analyse the experiment. (Blackboard)
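One possible analysis treats both boars and farms as random, so that the two-factor random-effects model applies with a = 15, b = 20 and n = 2. The sketch below fills in the ANOVA table from the given sums of squares and estimates the variance components by equating mean squares to their expectations. The expected mean squares used are the standard textbook ones for the balanced random model, which is an assumption here since the lecture works this example on the blackboard.

```python
# Design constants stated in the example.
a, b, n = 15, 20, 2      # boars (factor A), farms (factor B), sows per boar and farm
ss_a, ss_b, ss_ab, ss_t = 2296, 931, 1064, 4591
total = 8760             # sum of all observations

N = a * b * n
ss_e = ss_t - ss_a - ss_b - ss_ab     # error sum of squares by subtraction

# Mean squares: sum of squares divided by degrees of freedom.
ms_a = ss_a / (a - 1)
ms_b = ss_b / (b - 1)
ms_ab = ss_ab / ((a - 1) * (b - 1))
ms_e = ss_e / (a * b * (n - 1))

# Point estimates (method of moments on the expected mean squares).
mu_hat = total / N
var_error = ms_e
var_ab = (ms_ab - ms_e) / n
var_a = (ms_a - ms_ab) / (b * n)
var_b = (ms_b - ms_ab) / (a * n)

# F statistics: main effects are tested against the interaction mean square.
f_ab = ms_ab / ms_e
f_a = ms_a / ms_ab
f_b = ms_b / ms_ab

print(f"MS_A={ms_a}, MS_B={ms_b}, MS_AB={ms_ab}, MS_E={ms_e}")
print(f"mu={mu_hat}, var_A={var_a}, var_B={var_b}, var_AB={var_ab}, var_error={var_error}")
print(f"F_A={f_a}, F_B={f_b}, F_AB={f_ab}")
```

The F statistics are to be compared with quantiles of the F distribution with (14, 266), (19, 266) and (266, 300) degrees of freedom for A, B and AB respectively.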

Example: Electrical circuit
We will study an example with respect to coding the factor levels.
Response: voltage (in V). Two factors: resistance R and current I.
             I = 4 A          I = 6 A
R = 1 Ω      3.802, 4.013     6.065, 5.992
R = 2 Ω      7.934, 8.159     11.865, 12.138
Let us fit a regression model in original units as well as with coded factor levels ("low" and "high", $-1$ and $+1$).

R output: Original units
Call:
lm(formula = Voltage ~ R + I + R * I)

Residuals:
      1       2       3       4       5       6       7       8
-0.1055  0.1055  0.0365 -0.0365 -0.1125  0.1125 -0.1365  0.1365

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.8055     0.8432  -0.955 0.393518
R             0.4710     0.5333   0.883 0.427003
I             0.1435     0.1654   0.868 0.434467
R:I           0.9170     0.1046   8.768 0.000933 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1479 on 4 degrees of freedom
Multiple R-squared: 0.9988, Adjusted R-squared: 0.9979
F-statistic: 1086 on 3 and 4 DF, p-value: 2.818e-06

R output: Coded units
Call:
lm(formula = Voltage ~ Rc + Ic + Rc * Ic)

Residuals:
      1       2       3       4       5       6       7       8
-0.1055  0.1055  0.0365 -0.0365 -0.1125  0.1125 -0.1365  0.1365

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  7.49600    0.05229 143.349 1.42e-08 ***
Rc           2.52800    0.05229  48.344 1.10e-06 ***
Ic           1.51900    0.05229  29.049 8.36e-06 ***
Rc:Ic        0.45850    0.05229   8.768 0.000933 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1479 on 4 degrees of freedom
Multiple R-squared: 0.9988, Adjusted R-squared: 0.9979
F-statistic: 1086 on 3 and 4 DF, p-value: 2.818e-06
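The coded-unit estimates can be checked by hand, because the coded columns are orthogonal: each least-squares coefficient is then simply the sum of (coded column times response) divided by the number of runs, and all coefficients share the same standard error. The sketch below does this in plain Python for the circuit data. The coding Rc = -1/+1 for R = 1/2 Ω and Ic = -1/+1 for I = 4/6 A is assumed from the low/high convention, and the variable names are illustrative.

```python
from math import sqrt

# (Rc, Ic, voltage): coded levels -1/+1 for R = 1/2 ohm and I = 4/6 A (assumed coding).
runs = [
    (-1, -1, 3.802), (-1, -1, 4.013),
    (-1, +1, 6.065), (-1, +1, 5.992),
    (+1, -1, 7.934), (+1, -1, 8.159),
    (+1, +1, 11.865), (+1, +1, 12.138),
]
N = len(runs)

# Orthogonal +/-1 columns: each coefficient is sum(x * y) / N.
b0 = sum(y for _, _, y in runs) / N
b_r = sum(r * y for r, _, y in runs) / N
b_i = sum(i * y for _, i, y in runs) / N
b_ri = sum(r * i * y for r, i, y in runs) / N

# Residuals and the common standard error of the four coefficients.
residuals = [y - (b0 + b_r * r + b_i * i + b_ri * r * i) for r, i, y in runs]
mse = sum(e ** 2 for e in residuals) / (N - 4)   # 4 estimated parameters
std_err = sqrt(mse / N)

print(f"intercept={b0:.4f}, Rc={b_r:.4f}, Ic={b_i:.4f}, Rc:Ic={b_ri:.4f}")
print(f"residual standard error={sqrt(mse):.4f}, coefficient std. error={std_err:.5f}")
```

The equal standard errors are a direct consequence of the orthogonal coding; in original units the columns R, I and R·I are correlated, which inflates the standard errors of the individual coefficients.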

Result of regressions
Original units:
$\hat y = -0.806 + 0.471R + 0.144I + 0.917RI$
Statistical test: only the interaction is significant. (N.B. Ohm's law.)
Coded variables:
$\hat y = 7.50 + 2.53x_1 + 1.52x_2 + 0.458x_1 x_2$
Statistical test: all variables are significant.
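The two fitted equations describe the same surface; one is obtained from the other by substituting the coding transformation. The sketch below carries out that substitution, assuming the usual coding x1 = (R - 1.5)/0.5 and x2 = (I - 5)/1 (centre and half-range of each factor), and starting from the unrounded coded-unit coefficients in the R output. The variable names are illustrative.

```python
# Coded-unit coefficients taken from the regression output (unrounded).
b0, b1, b2, b12 = 7.496, 2.528, 1.519, 0.4585

# ASSUMED coding: x1 = (R - r_mid) / r_half, x2 = (I - i_mid) / i_half.
r_mid, r_half = 1.5, 0.5   # R levels 1 and 2 ohm
i_mid, i_half = 5.0, 1.0   # I levels 4 and 6 ampere

# Expand b0 + b1*x1 + b2*x2 + b12*x1*x2 in powers of R and I.
scale = r_half * i_half
c_ri = b12 / scale
c_r = b1 / r_half - b12 * i_mid / scale
c_i = b2 / i_half - b12 * r_mid / scale
c_0 = b0 - b1 * r_mid / r_half - b2 * i_mid / i_half + b12 * r_mid * i_mid / scale

print(f"y = {c_0:.4f} + {c_r:.4f} R + {c_i:.4f} I + {c_ri:.4f} RI")
```

The result agrees with the original-units fit, including the negative intercept. The interaction coefficient is close to 1 and the remaining terms are small, which is what Ohm's law (voltage = R·I) would suggest.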

ANOVA tables
# ... ORIGINAL UNITS ...
Response: Voltage
          Df Sum Sq Mean Sq  F value    Pr(>F)
R          1 51.126  51.126 2337.148 1.095e-06 ***
I          1 18.459  18.459  843.816 8.361e-06 ***
R:I        1  1.682   1.682   76.879 0.0009328 ***
Residuals  4  0.088   0.022
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

# ... CODED UNITS ...
Response: Voltage
          Df Sum Sq Mean Sq  F value    Pr(>F)
Rc         1 51.126  51.126 2337.148 1.095e-06 ***
Ic         1 18.459  18.459  843.816 8.361e-06 ***
Rc:Ic      1  1.682   1.682   76.879 0.0009328 ***
Residuals  4  0.088   0.022
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
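The ANOVA table is identical in both unit systems because the sums of squares depend only on the cell, row and column means, not on how the factor levels are labelled. The sketch below recomputes the table for the circuit data from the definitional sums of squares of a balanced two-factor design. It is a plain-Python illustration with made-up variable names, not the computation R performs internally.

```python
from itertools import product

# Voltage readings keyed by (R in ohm, I in ampere); two replicates per cell.
data = {
    (1, 4): [3.802, 4.013],
    (1, 6): [6.065, 5.992],
    (2, 4): [7.934, 8.159],
    (2, 6): [11.865, 12.138],
}
r_levels = sorted({r for r, _ in data})
i_levels = sorted({i for _, i in data})
a, b, n = len(r_levels), len(i_levels), 2


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Grand mean, cell means, and marginal means (balanced design).
grand = mean(y for ys in data.values() for y in ys)
cell = {key: mean(ys) for key, ys in data.items()}
row = {r: mean(cell[r, i] for i in i_levels) for r in r_levels}
col = {i: mean(cell[r, i] for r in r_levels) for i in i_levels}

# Definitional sums of squares.
ss_r = b * n * sum((row[r] - grand) ** 2 for r in r_levels)
ss_i = a * n * sum((col[i] - grand) ** 2 for i in i_levels)
ss_ri = n * sum((cell[r, i] - row[r] - col[i] + grand) ** 2
                for r, i in product(r_levels, i_levels))
ss_e = sum((y - cell[key]) ** 2 for key, ys in data.items() for y in ys)

ms_e = ss_e / (a * b * (n - 1))
# Every effect has one degree of freedom here, so MS = SS for the effects.
f_r, f_i, f_ri = ss_r / ms_e, ss_i / ms_e, ss_ri / ms_e

print(f"SS: R={ss_r:.3f}, I={ss_i:.3f}, R:I={ss_ri:.3f}, error={ss_e:.3f}")
print(f"F:  R={f_r:.3f}, I={f_i:.3f}, R:I={f_ri:.3f}")
```

Relabelling the levels as -1 and +1 changes none of these means, so the coded-unit table comes out the same.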

A factorial experiment
Coded variables:
Magnitudes of the model coefficients are directly comparable (dimensionless, measuring the effect of changing each design factor over a one-unit interval).
All coefficients are estimated with the same precision (see the regression output).
Effective for determining the relative size of factor effects.
Engineering units:
Interpretation with physical meaning, in terms of underlying mechanisms (e.g. Ohm's law).
Discussion: Montgomery, Chapter 6.9 ("Why we work with coded design variables").

ANOVA table: Three-factor fixed-effects model

Classification of ANOVA models
Model I ANOVA: the levels of the factors are fixed.
Model II ANOVA: the levels of the factors are random.
Model III ANOVA: the design has both fixed-effect and random-effect factors (also called a mixed model).