iron retention (log) high Fe2+ medium Fe2+ high Fe3+ medium Fe3+ low Fe2+ low Fe3+ 2 Two-way ANOVA

iron retention (log) 0 1 2 3 high Fe2+ high Fe3+ low Fe2+ low Fe3+ medium Fe2+ medium Fe3+ 2 Two-way ANOVA In the one-way design there is only one factor. What if there are several factors? Often, we are interested to know the simultaneous effects of multiple factors, e.g, gender and smoking on hypertension. The statistical approach to analyze data from a two-way design is two-way ANOVA. Iron retention. An experiment was performed to determine the factors that affect iron retention. The experiment was done on 108 mice, which were randomly divided into 6 groups of 18 each. These groups consist of the six combinations of two forms of iron and three concentrations. The table below shows the mean logged retention percentage. 15

Form low medium high Fe3+ 2.28 (Ȳ11.) 1.90 (Ȳ12.) 1.16 (Ȳ13.) Fe2+ 2.40 (Ȳ21.) 2.09 (Ȳ22.) 1.68 (Ȳ23.) The figure below shows the mean retention levels (log). Several scientific questions: (1) Are the means different by iron form? (2) Are the means different by dosage? (3) Does difference in the means between the two iron forms change as a function of dosage? The first two questions can be answered by one-way ANOVA. Here we use the two-way anova to answer all the questions simultaneously. 16

2.1 The two-way layout A two-way layout is an experiment design involving two factors, each at two or more levels. If there are I levels of factor A and J of factor B, there are I by J combinations. We assume that K independent observations are taken for each of the IJ combinations. Let Y ijk denote the kth observation in cell ij. The statistical model is a natural extension of the one-way anova: Y ijk = µ + α i + β j + δ ij + ǫ ijk, with ǫ ijk N(0, σ 2 ) with the constraints α i = 0 β i = 0 δ ij = δ ij = 0 Interpretations of the parameters: µ is the grand mean over all cells α i is the main effect of the ith level of factor A β j is the main effect of the jth level of factor B δ ij is the interaction effect between the ith level of factor A and the jth level of factor B, i.e., it denotes the effect of combining the ith level of factor A and the jth level of factor B. interpretation Under the two-way layout, the mean response in the ith level of factor A is µ + α i the mean response in the jth level of factor B is µ + β j the mean response in the cell ij is µ + α i + β j + δ ij We say the two factors have an interaction effect on response if any of δ ij is not zero. When there is no interaction, the means are 17

Form low medium high Fe2+ µ + α 1 + β 1 µ + α 1 + β 2 µ + α 1 + β 3 Fe3+ µ + α 2 + β 1 µ + α 2 + β 2 µ + α 2 + β 3 Difference α 1 α 2 α 1 α 2 α 1 α 2 The graph of the means would look like 2.2 Two-way ANOVA When there is an interaction between iron form and dosage, the means are Form low medium high Fe3+ µ + α 1 + β 1 + δ 11 µ + α 1 + β 2 + δ 12 µ + α 1 + β 3 + δ 13 Fe2+ µ + α 2 + β 1 + δ 12 µ + α 2 + β 2 + δ 22 µ + α 2 + β 3 + δ 23 Difference α 1 α 2 + (δ 11 δ 21 ) α 1 α 2 + (δ 12 δ 22 )) α 1 α 2 + (δ 13 δ 23 ) The graph of the means would look like 2.2.1 Notations Ȳ = Ȳ i = Ȳ j = Ȳ ij = Y ijk /(IJK) k=1 k=1 Y ijk /(JK) Y ijk /(IK) Y ijk /K k=1 18

Y ijk = µ + α i + β j + γ ij + ǫ ij, ǫ ijk iid N(0, σ 2 ), i α i = j β j = i γ ij = j γ ij = 0 Assume that the observations are independent and normally distributed with equal variance, the log likelihood is l = IJK 2 log(2πσ2 ) 1 2σ 2 = IJK IJK log(2π) 2 2 log(σ2 ) 1 2σ 2 (Y ijk µ α i β j δ ij ) 2 (Y ijk µ α i β j δ ij ) 2 The mles are ˆµ = Y ˆα i = Ȳi Ȳ, i = 1,, I ˆβ j = Ȳ j Ȳ, j = 1,, J ˆδ ij = Ȳij Ȳi Ȳ j + Ȳ Similar to the definition used in the one-way ANOVA, the sum of squares of total is defined as SSTO = = (Y ijk Ȳ ) 2 (Y ijk ˆµ) 2 SSTO can be decomposed to SSTO = SSA + SSB + SSAB + SSE where 19

SSA = JK = JK (Ȳi Ȳ ) 2 ˆα 2 i SSB = IK (Ȳ j Ȳ ) 2 = IK ˆβ j 2 SSAB = K (Ȳij Ȳi Ȳ j + Ȳ ) 2 = K ˆδ 2 ij SSE = ijk (Y Ȳij ) 2 = (K 1) i j = (Y ijk (ˆµ + ˆα i + ˆβ j + ˆδ ij )) 2 S 2 ij The proof is also similar to that of one-way ANOVA. Essentially, we need to use the identity Y ijk Ȳ = (Y ijk Ȳij ) + (Ȳi Ȳ ) + (Ȳ j Ȳ ) + (Ȳij Ȳi Ȳ j + Ȳ ) Theorem A: Expectations of Sums of Squares Under the two-way ANOVA model, 20

(1) E(MSE) = E(SSE/[IJ(K 1)]) = σ 2 (2) E(MSA) = E(SSA/(I 1)) = σ 2 + JK I 1 (3) E(MSB) = E(SSB/(J 1)) = σ 2 + IK J 1 (4) E(MSAB) = E(SSAB/[(I 1)(J 1)]) = σ 2 + α 2 i β 2 i K (I 1)(J 1) δ 2 ij Proof: They can be proved by using Lemma A: Let X i, where i = 1,, n be independent random variables with E(X i ) = µ i and V ar(x i ) = σ 2. Then E(X i X) 2 = (µ i µ) 2 + n 1 n σ2 Apply it to (1): E(Y ijk Ȳij.) 2 = (0 0) 2 + (K 1)/Kσ 2, thus E(SSE) = E[ ij (K 1)[S 2 ij] = IJKE[S 2 ij] = IJ(K 1) Apply it (2): Apply it to (3): Notice that Y ijk N(µ + α i + β j + δ ij, σ 2 ) and apply the lemma to SSTO, E(SST O) = E(Y ijk Ȳ ) 2 = [(µ + α i + β j + δ ij µ) 2 + IJK 1 IJK σ2 ] = (IJK 1)σ 2 + (α i + β j + δ ij ) 2 = (IJK 1)σ 2 + JK αi 2 + IK βj 2 + K 21 δ 2 ij