MAT3378 ANOVA Summary
April 18, 2016

Before you do the analysis:
- How many factors? (one-factor/one-way ANOVA, two-factor ANOVA, etc.)
- Fixed, random, or mixed effects?
- Crossed factors, nested factors, or partially nested factors?
- The same or different sample sizes for each treatment?

Contents
1 One Factor ANOVA
  1.1 Fixed effects
  1.2 Random effects
2 Two-Factor ANOVA with crossed-factors
  2.1 Fixed effects - equal sample sizes for each treatment
  2.2 Random effects - equal sample sizes for each treatment
  2.3 Mixed effects - equal sample sizes for each treatment (Chapter 25.2)
3 Two-Factor ANOVA with nested design
  3.1 Fixed effects
  3.2 Random effects
  3.3 Mixed effects
4 Complicated designs: Three-Factor ANOVA with partially nested design and mixed effects
  4.1 Fixed-effects only
  4.2 Mixed-effects
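1 One Factor ANOVA

1.1 Fixed effects

Textbook: Chapter 16; p. 681

Model:
\[ Y_{ij} = \mu_i + \varepsilon_{ij} = \mu_{\cdot} + \alpha_i + \varepsilon_{ij}, \]
- $\mu_{\cdot}$ is a constant,
- $\mu_i$ are factor level means (fixed, deterministic),
- $\alpha_i$ are factor level effects, $\alpha_i = \mu_i - \mu_{\cdot}$,
- $\varepsilon_{ij}$ are independent $N(0, \sigma^2)$,
- $i = 1, \ldots, r$, where $r$ is the number of factor levels,
- $j = 1, \ldots, n_i$, where $n_i$ is the sample size for level $i$, and $N = n_1 + \cdots + n_r$.

Estimates:
- $\bar{Y}_{i\cdot} = \frac{1}{n_i} \sum_{j=1}^{n_i} Y_{ij}$, $i = 1, \ldots, r$, sample mean for level $i$,
- $\bar{Y}_{\cdot\cdot} = \frac{1}{N} \sum_{i=1}^{r} \sum_{j=1}^{n_i} Y_{ij}$, overall mean.

Sums of Squares: The decomposition
\[ Y_{ij} - \bar{Y}_{\cdot\cdot} = (Y_{ij} - \bar{Y}_{i\cdot}) + (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}), \]
where the left-hand side is the total deviation, the first term on the right is the deviation of each observation from the factor level mean, and the second term is the deviation of the factor level mean from the overall mean, leads to $SSTO = SSE + SSTR$,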
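\[ SSTO = \sum_{i=1}^{r} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2, \]
\[ SSTR = \sum_{i=1}^{r} n_i (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2, \]
\[ SSE = \sum_{i=1}^{r} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2. \]

Degrees of freedom:
- SSTO has $N - 1$ degrees of freedom, since there are $N$ observations and one constraint: $\sum_{i=1}^{r} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{\cdot\cdot}) = 0$;
- SSTR has $r - 1$ degrees of freedom, since there are $r$ levels and one constraint: $\sum_{i=1}^{r} n_i (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}) = 0$;
- SSE has $N - r$ degrees of freedom, since we must have $N - 1 = (N - r) + (r - 1)$.

Mean Squares: Sums of squares and the numbers of degrees of freedom lead to
\[ MSTR = \frac{SSTR}{r - 1}, \qquad MSE = \frac{SSE}{N - r}. \]

Expected Values:
\[ E[MSE] = \sigma^2, \qquad (1) \]
\[ E[MSTR] = \sigma^2 + \frac{\sum_{i=1}^{r} n_i (\mu_i - \mu_{\cdot})^2}{r - 1}, \]
where $\mu_{\cdot}$ is a weighted mean: $\mu_{\cdot} = \sum_{i=1}^{r} n_i \mu_i / N$.

Test: $H_0 : \mu_1 = \cdots = \mu_r$. If all factor level means are the same, then $E[MSTR] = \sigma^2$. Hence, the test statistic for $H_0$ can be constructed by comparing MSTR with MSE.

ANOVA table: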
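| Source | df | SS | MS | F |
|---|---|---|---|---|
| Treatment | $r - 1$ | SSTR | MSTR | $F$ |
| Error (Residuals) | $N - r$ | SSE | MSE | |
| Total | $N - 1$ | SSTO | | |

Under $H_0$, $F = MSTR / MSE$ has an F distribution with $(r - 1, N - r)$ degrees of freedom.

Estimation:
- Goal: estimate $\mu_i$;
- Estimator: $\bar{Y}_{i\cdot}$;
- $\sigma^2[\bar{Y}_{i\cdot}] = \sigma^2 / n_i$;
- By (1), $\sigma^2$ can be estimated by MSE, which has $N - r$ degrees of freedom;
- The confidence interval for $\mu_i$ is
\[ \bar{Y}_{i\cdot} \pm t(1 - \alpha/2, N - r) \sqrt{MSE \cdot \frac{1}{n_i}}. \]

R implementation: aov(y~x)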
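The steps of Section 1.1 can be sketched in R as follows. This is only an illustration on simulated data: the level labels, sample sizes, true means, and object names (`n_i`, `ci`, and so on) are assumptions made for the example, not part of the course material. The F statistic and the confidence intervals are recomputed by hand from the mean squares in the `aov` table.

```r
set.seed(1)

# Hypothetical data: r = 3 factor levels with unequal sample sizes n_i
n_i  <- c(5, 7, 6)
x    <- factor(rep(c("A", "B", "C"), times = n_i))
mu_i <- c(10, 12, 15)                       # assumed true level means
y    <- rep(mu_i, times = n_i) + rnorm(sum(n_i), sd = 2)

fit <- aov(y ~ x)
tab <- summary(fit)[[1]]                    # rows: treatment, residuals

r <- nlevels(x)
N <- length(y)
MSTR <- tab[["Mean Sq"]][1]
MSE  <- tab[["Mean Sq"]][2]

# F = MSTR / MSE with (r - 1, N - r) degrees of freedom
F_stat  <- MSTR / MSE
p_value <- pf(F_stat, r - 1, N - r, lower.tail = FALSE)

# 95% confidence intervals: Ybar_i +/- t(1 - alpha/2, N - r) * sqrt(MSE / n_i)
ybar_i <- tapply(y, x, mean)
half   <- qt(0.975, N - r) * sqrt(MSE / n_i)
ci     <- cbind(lower = ybar_i - half, estimate = ybar_i, upper = ybar_i + half)
print(ci)
```

The hand-computed `F_stat` and `p_value` should agree with the `F value` and `Pr(>F)` columns printed by `summary(fit)`.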
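1.2 Random effects

Textbook: Chapter 25.1; p. 1031

Model:
\[ Y_{ij} = \mu_i + \varepsilon_{ij}, \]
- $\mu_i$ are independent $N(\mu_{\cdot}, \sigma_\mu^2)$,
- $\varepsilon_{ij}$ are independent $N(0, \sigma^2)$,
- $\mu_i$ and $\varepsilon_{ij}$ are independent,
- $i = 1, \ldots, r$, where $r$ is the number of factor levels,
- $j = 1, \ldots, n$, where $n$ is the (common) sample size for each level, and $N = nr$.

Estimates:
- $\bar{Y}_{i\cdot} = \frac{1}{n} \sum_{j=1}^{n} Y_{ij}$, $i = 1, \ldots, r$, sample mean for level $i$,
- $\bar{Y}_{\cdot\cdot} = \frac{1}{N} \sum_{i=1}^{r} \sum_{j=1}^{n} Y_{ij}$, overall mean.

Sums of Squares: The decomposition
\[ Y_{ij} - \bar{Y}_{\cdot\cdot} = (Y_{ij} - \bar{Y}_{i\cdot}) + (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}), \]
where the left-hand side is the total deviation, the first term on the right is the deviation of each observation from the factor level mean, and the second term is the deviation of the factor level mean from the overall mean, leads to $SSTO = SSE + SSTR$,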
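\[ SSTO = \sum_{i=1}^{r} \sum_{j=1}^{n} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2, \]
\[ SSTR = n \sum_{i=1}^{r} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2, \]
\[ SSE = \sum_{i=1}^{r} \sum_{j=1}^{n} (Y_{ij} - \bar{Y}_{i\cdot})^2. \]

Degrees of freedom:
- SSTO has $N - 1$ degrees of freedom, since there are $N$ observations and one constraint: $\sum_{i=1}^{r} \sum_{j=1}^{n} (Y_{ij} - \bar{Y}_{\cdot\cdot}) = 0$;
- SSTR has $r - 1$ degrees of freedom, since there are $r$ levels and one constraint: $\sum_{i=1}^{r} n (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}) = 0$;
- SSE has $N - r$ degrees of freedom, since we must have $N - 1 = (N - r) + (r - 1)$.

Mean Squares: Sums of squares and the numbers of degrees of freedom lead to
\[ MSTR = \frac{SSTR}{r - 1}, \qquad MSE = \frac{SSE}{N - r}. \]

Expected Values:
\[ E[MSE] = \sigma^2, \qquad (2) \]
\[ E[MSTR] = \sigma^2 + n \sigma_\mu^2. \qquad (3) \]
Note the difference in $E[MSTR]$ as compared to the fixed effects ANOVA.

Test: $H_0 : \sigma_\mu = 0$.

ANOVA table: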
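| Source | df | SS | MS | F |
|---|---|---|---|---|
| Treatment | $r - 1$ | SSTR | MSTR | $F$ |
| Error (Residuals) | $N - r$ | SSE | MSE | |
| Total | $N - 1$ | SSTO | | |

Under $H_0$, $F = MSTR / MSE$ has an F distribution with $(r - 1, N - r)$ degrees of freedom.

Estimation I:
- Goal: estimate $\mu_{\cdot}$;
- Estimator: $\bar{Y}_{\cdot\cdot}$;
- $\sigma^2[\bar{Y}_{\cdot\cdot}] = \dfrac{n \sigma_\mu^2 + \sigma^2}{rn}$;
- By (3), the right-hand side can be estimated by $MSTR / (nr)$, which has $r - 1$ degrees of freedom;
- The confidence interval for $\mu_{\cdot}$ is
\[ \bar{Y}_{\cdot\cdot} \pm t(1 - \alpha/2, r - 1) \sqrt{\frac{MSTR}{rn}}. \]
Note the difference compared to the fixed effects ANOVA: the interval is based on MSTR with $r - 1$ degrees of freedom, not on MSE with $N - r$.

Estimation II:
- Goal: estimate $\dfrac{\sigma_\mu^2}{\sigma_\mu^2 + \sigma^2}$, $\sigma^2$, $\sigma_\mu^2$.

R implementation: aov(y~x)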
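A minimal R sketch of the random-effects estimation above, on simulated balanced data. The numbers of levels and replicates, the true variances, and all object names are assumptions for illustration. The point estimates of the variance components are the moment estimates obtained by equating MSE and MSTR to their expectations in (2) and (3); the summary itself does not prescribe estimators, and truncating a negative estimate at zero is one common convention, used here as an assumption.

```r
set.seed(2)

# Hypothetical balanced design: r = 6 randomly chosen levels, n = 5 observations each
r <- 6
n <- 5
x    <- factor(rep(seq_len(r), each = n))
mu_i <- rnorm(r, mean = 50, sd = 3)          # random level means (sigma_mu = 3)
y    <- mu_i[as.integer(x)] + rnorm(r * n, sd = 2)

tab  <- summary(aov(y ~ x))[[1]]
MSTR <- tab[["Mean Sq"]][1]
MSE  <- tab[["Mean Sq"]][2]

# Estimation I: Ybar.. +/- t(1 - alpha/2, r - 1) * sqrt(MSTR / (r n))
ybar  <- mean(y)
half  <- qt(0.975, r - 1) * sqrt(MSTR / (r * n))
ci_mu <- c(lower = ybar - half, upper = ybar + half)

# Estimation II (moment estimates from (2) and (3))
s2_hat    <- MSE
s2_mu_hat <- max((MSTR - MSE) / n, 0)        # truncated at zero by convention
ratio_hat <- s2_mu_hat / (s2_mu_hat + s2_hat)

print(ci_mu)
print(c(sigma2 = s2_hat, sigma2_mu = s2_mu_hat, ratio = ratio_hat))
```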
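2 Two-Factor ANOVA with crossed-factors

2.1 Fixed effects - equal sample sizes for each treatment

Textbook: Chapter 19; p. 812

Model:
\[ Y_{ijk} = \mu_{\cdot\cdot} + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}, \]
- $\mu_{\cdot\cdot}$ is a constant,
- $\alpha_i$ are main effects for Factor A, $\sum_{i=1}^{a} \alpha_i = 0$,
- $\beta_j$ are main effects for Factor B, $\sum_{j=1}^{b} \beta_j = 0$,
- $(\alpha\beta)_{ij}$ are interactions between A and B, with
  $\sum_{i=1}^{a} (\alpha\beta)_{ij} = 0$ for $j = 1, \ldots, b$, and
  $\sum_{j=1}^{b} (\alpha\beta)_{ij} = 0$ for $i = 1, \ldots, a$,
- $\varepsilon_{ijk}$ are independent $N(0, \sigma^2)$,
- $i = 1, \ldots, a$, where $a$ is the number of factor levels for A,
- $j = 1, \ldots, b$, where $b$ is the number of factor levels for B,
- $k = 1, \ldots, n$, where $n$ is the sample size for treatment $(i, j)$,
- $N = nab$ is the total sample size.

Estimates:
- $\bar{Y}_{ij\cdot} = \frac{1}{n} \sum_{k=1}^{n} Y_{ijk}$, $i = 1, \ldots, a$, $j = 1, \ldots, b$, sample mean for treatment $(i, j)$,
- $\bar{Y}_{i\cdot\cdot} = \frac{1}{bn} \sum_{j=1}^{b} \sum_{k=1}^{n} Y_{ijk}$, $i = 1, \ldots, a$, sample mean for level $i$ of factor A,
- $\bar{Y}_{\cdot j\cdot} = \frac{1}{an} \sum_{i=1}^{a} \sum_{k=1}^{n} Y_{ijk}$, $j = 1, \ldots, b$, sample mean for level $j$ of factor B,
- $\bar{Y}_{\cdot\cdot\cdot} = \frac{1}{N} \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} Y_{ijk}$, overall mean.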
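Sums of Squares: The decomposition
\[ Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot} = (Y_{ijk} - \bar{Y}_{ij\cdot}) + (\bar{Y}_{ij\cdot} - \bar{Y}_{\cdot\cdot\cdot}), \]
where the left-hand side is the total deviation, the first term on the right is the deviation of each observation from the treatment mean, and the second term is the deviation of the treatment mean from the overall mean, leads to $SSTO = SSE + SSTR$,
\[ SSTO = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot})^2, \]
\[ SSTR = n \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{Y}_{ij\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2, \]
\[ SSE = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (Y_{ijk} - \bar{Y}_{ij\cdot})^2. \]

The further decomposition of the deviation of the treatment mean from the overall mean,
\[ \bar{Y}_{ij\cdot} - \bar{Y}_{\cdot\cdot\cdot} = (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot}) + (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot}) + (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot}), \]
into the Factor A main effect, the Factor B main effect, and the interaction, leads to $SSTR = SSA + SSB + SSAB$,
\[ SSA = nb \sum_{i=1}^{a} (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2, \]
\[ SSB = na \sum_{j=1}^{b} (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2, \]
\[ SSAB = n \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot})^2. \]

Degrees of freedom: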
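- SSTO has $N - 1$ degrees of freedom, since there are $N$ observations and one constraint;
- SSTR has $ab - 1$ degrees of freedom, since there are $ab$ treatments and one constraint;
- SSE has $N - ab$ degrees of freedom, since we must have $N - 1 = (N - ab) + (ab - 1)$;
- SSA has $a - 1$ degrees of freedom, since there are $a$ levels and one constraint;
- SSB has $b - 1$ degrees of freedom, since there are $b$ levels and one constraint;
- SSAB has $(a - 1)(b - 1)$ degrees of freedom, since we must have $ab - 1 = (a - 1) + (b - 1) + (a - 1)(b - 1)$.

Mean Squares: Sums of squares and the numbers of degrees of freedom lead to
\[ MSTR = \frac{SSTR}{ab - 1}, \qquad MSE = \frac{SSE}{N - ab}, \]
\[ MSA = \frac{SSA}{a - 1}, \qquad MSB = \frac{SSB}{b - 1}, \qquad MSAB = \frac{SSAB}{(a - 1)(b - 1)}. \]

Expected Values:
\[ E[MSE] = \sigma^2, \qquad (4) \]
\[ E[MSA] = \sigma^2 + nb \frac{\sum_{i=1}^{a} (\mu_{i\cdot} - \mu_{\cdot\cdot})^2}{a - 1} = \sigma^2 + nb \frac{\sum_{i=1}^{a} \alpha_i^2}{a - 1}, \qquad (5) \]
\[ E[MSB] = \sigma^2 + na \frac{\sum_{j=1}^{b} (\mu_{\cdot j} - \mu_{\cdot\cdot})^2}{b - 1} = \sigma^2 + na \frac{\sum_{j=1}^{b} \beta_j^2}{b - 1}, \qquad (6) \]
\[ E[MSAB] = \sigma^2 + n \frac{\sum_{i=1}^{a} \sum_{j=1}^{b} (\alpha\beta)_{ij}^2}{(a - 1)(b - 1)}. \qquad (7) \]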
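Tests:
- $H_0 : \alpha_1 = \cdots = \alpha_a = 0$, equivalently $H_0 : \mu_{1\cdot} = \cdots = \mu_{a\cdot}$;
- $H_0 : \beta_1 = \cdots = \beta_b = 0$, equivalently $H_0 : \mu_{\cdot 1} = \cdots = \mu_{\cdot b}$;
- $H_0 : (\alpha\beta)_{ij} = 0$ for all $i, j$.

Equations (4)-(7) suggest evaluation of MSA/MSE; MSB/MSE; MSAB/MSE.

ANOVA table:

| Source | df | SS | MS | F |
|---|---|---|---|---|
| Factor A | $a - 1$ | SSA | MSA | $F_A$ |
| Factor B | $b - 1$ | SSB | MSB | $F_B$ |
| Interactions AB | $(a - 1)(b - 1)$ | SSAB | MSAB | $F_{AB}$ |
| Error (Residuals) | $N - ab$ | SSE | MSE | |
| Total | $N - 1$ | SSTO | | |

Under the respective null hypotheses,
\[ F_A = \frac{MSA}{MSE}, \qquad F_B = \frac{MSB}{MSE}, \qquad F_{AB} = \frac{MSAB}{MSE} \]
have F distributions with $(a - 1, N - ab)$, $(b - 1, N - ab)$ and $((a - 1)(b - 1), N - ab)$ degrees of freedom, respectively.

Estimation:
- Goal: estimate $L = \sum_{i=1}^{a} c_i \mu_{i\cdot}$;
- Estimator: $\hat{L} = \sum_{i=1}^{a} c_i \bar{Y}_{i\cdot\cdot}$;
- $\sigma^2[\hat{L}] = \dfrac{\sigma^2}{bn} \sum_{i=1}^{a} c_i^2$;
- We estimate $\sigma^2$ by MSE, which has $N - ab$ degrees of freedom;
- The confidence interval for $L$ is
\[ \hat{L} \pm t(1 - \alpha/2, N - ab) \sqrt{\frac{MSE}{bn} \sum_{i=1}^{a} c_i^2}. \]

R implementation: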
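Any of the following equivalent commands:
- aov(y~xa*xb)
- aov(y~xa+xb+xa*xb)
- aov(y~xa+xb+xa:xb)

Note:
- aov(y~xa+xb) produces the output for the model
\[ Y_{ijk} = \mu_{\cdot\cdot} + \alpha_i + \beta_j + \varepsilon_{ijk}, \]
- aov(y~xa:xb) produces the output for the model
\[ Y_{ijk} = \mu_{\cdot\cdot} + (\alpha\beta)_{ij} + \varepsilon_{ijk}. \]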
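The equivalence of the three commands, and the identity $SSTR = SSA + SSB + SSAB$, can be checked numerically. The sketch below uses simulated balanced data; the design sizes, the effect sizes, and the helper `same_table` are hypothetical choices for illustration.

```r
set.seed(3)

# Hypothetical balanced design: a = 3, b = 2, n = 4 replicates per treatment
a <- 3
b <- 2
n <- 4
d <- expand.grid(rep = seq_len(n),
                 xb  = factor(seq_len(b)),
                 xa  = factor(seq_len(a)))
d$y <- 10 + 2 * as.integer(d$xa) - as.integer(d$xb) + rnorm(nrow(d))

t1 <- summary(aov(y ~ xa * xb,           data = d))[[1]]
t2 <- summary(aov(y ~ xa + xb + xa * xb, data = d))[[1]]
t3 <- summary(aov(y ~ xa + xb + xa:xb,   data = d))[[1]]

# Hypothetical helper: compare two ANOVA tables entry by entry
same_table <- function(u, v) isTRUE(all.equal(as.matrix(u), as.matrix(v)))
print(c(t1_vs_t2 = same_table(t1, t2), t1_vs_t3 = same_table(t1, t3)))

# y ~ xa:xb fits one mean per treatment; its single model line carries SSTR
tr   <- summary(aov(y ~ xa:xb, data = d))[[1]]
SSTR <- tr[["Sum Sq"]][1]
SS_parts <- sum(t1[["Sum Sq"]][1:3])     # SSA + SSB + SSAB
print(c(SSTR = SSTR, SSA_SSB_SSAB = SS_parts))
```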
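2.2 Random effects - equal sample sizes for each treatment

Textbook: Chapter 25.2; p. 1047

Model:
\[ Y_{ijk} = \mu_{\cdot\cdot} + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}, \]
- $\mu_{\cdot\cdot}$ is a constant,
- $\alpha_i$, $\beta_j$, $(\alpha\beta)_{ij}$ are independent zero-mean normal random variables with variances $\sigma_\alpha^2$, $\sigma_\beta^2$, $\sigma_{\alpha\beta}^2$,
- $\varepsilon_{ijk}$ are independent $N(0, \sigma^2)$,
- $\alpha_i$, $\beta_j$, $(\alpha\beta)_{ij}$ and $\varepsilon_{ijk}$ are independent,
- $i = 1, \ldots, a$, $j = 1, \ldots, b$, $k = 1, \ldots, n$,
- $N = nab$ is the total sample size.

Expected Values: Instead of (4)-(7), we have
\[ E[MSE] = \sigma^2, \qquad (8) \]
\[ E[MSA] = \sigma^2 + nb \sigma_\alpha^2 + n \sigma_{\alpha\beta}^2, \qquad (9) \]
\[ E[MSB] = \sigma^2 + na \sigma_\beta^2 + n \sigma_{\alpha\beta}^2, \qquad (10) \]
\[ E[MSAB] = \sigma^2 + n \sigma_{\alpha\beta}^2. \qquad (11) \]

Tests:
- $H_0 : \sigma_\alpha = 0$,
- $H_0 : \sigma_\beta = 0$,
- $H_0 : \sigma_{\alpha\beta} = 0$.

Equations (8)-(11) suggest evaluation of MSA/MSAB; MSB/MSAB; MSAB/MSE.

ANOVA table:

| Source | df | SS | MS | F |
|---|---|---|---|---|
| Factor A | $a - 1$ | SSA | MSA | $F_A$ |
| Factor B | $b - 1$ | SSB | MSB | $F_B$ |
| Interactions AB | $(a - 1)(b - 1)$ | SSAB | MSAB | $F_{AB}$ |
| Error (Residuals) | $N - ab$ | SSE | MSE | |
| Total | $N - 1$ | SSTO | | |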
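Under the respective null hypotheses,
\[ F_A = \frac{MSA}{MSAB}, \qquad F_B = \frac{MSB}{MSAB}, \qquad F_{AB} = \frac{MSAB}{MSE} \]
have F distributions with $(a - 1, (a - 1)(b - 1))$, $(b - 1, (a - 1)(b - 1))$ and $((a - 1)(b - 1), N - ab)$ degrees of freedom, respectively.

R implementation: Use the fixed-effects commands:
- aov(y~xa*xb)
- aov(y~xa+xb+xa*xb)
- aov(y~xa+xb+xa:xb)

Ignore the F statistics and p-values in the output (they use MSE as the denominator throughout) and compute them on your own from the mean squares.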
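One way to "compute them on your own" is to take the mean squares and degrees of freedom from the fixed-effects `aov` table and form the ratios suggested by (8)-(11). The sketch below does this on simulated data; the design sizes, variances, and object names are assumptions made for illustration.

```r
set.seed(4)

# Hypothetical balanced design with both factors random: a = 4, b = 3, n = 2
a <- 4
b <- 3
n <- 2
d <- expand.grid(rep = seq_len(n),
                 xb  = factor(seq_len(b)),
                 xa  = factor(seq_len(a)))
i <- as.integer(d$xa)
j <- as.integer(d$xb)
alpha <- rnorm(a, sd = 2)                    # random A effects
beta  <- rnorm(b, sd = 1.5)                  # random B effects
ab    <- matrix(rnorm(a * b, sd = 1), a, b)  # random interactions
d$y <- 20 + alpha[i] + beta[j] + ab[cbind(i, j)] + rnorm(nrow(d))

tab <- summary(aov(y ~ xa * xb, data = d))[[1]]
MS  <- tab[["Mean Sq"]]                      # order: xa, xb, xa:xb, Residuals
df  <- tab[["Df"]]

# Random-effects F ratios: A and B are tested against MSAB, AB against MSE
F_A  <- MS[1] / MS[3]
F_B  <- MS[2] / MS[3]
F_AB <- MS[3] / MS[4]
p_A  <- pf(F_A,  df[1], df[3], lower.tail = FALSE)
p_B  <- pf(F_B,  df[2], df[3], lower.tail = FALSE)
p_AB <- pf(F_AB, df[3], df[4], lower.tail = FALSE)

print(data.frame(F = c(F_A, F_B, F_AB), p = c(p_A, p_B, p_AB),
                 row.names = c("A", "B", "AB")))
```

Only the interaction line agrees with the default `aov` output, since it is the only one that keeps MSE as the denominator.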
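2.3 Mixed effects - equal sample sizes for each treatment (Chapter 25.2)

Textbook: Chapter 25.2; p. 1047

Model:
\[ Y_{ijk} = \mu_{\cdot\cdot} + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}, \]
- $\mu_{\cdot\cdot}$ is a constant,
- $\alpha_i$ are fixed Factor A main effects, $\sum_{i=1}^{a} \alpha_i = 0$,
- $\beta_j$, $(\alpha\beta)_{ij}$ are independent zero-mean normal random variables with variances $\sigma_\beta^2$, $(a - 1)\sigma_{\alpha\beta}^2 / a$,
- $\varepsilon_{ijk}$ are independent $N(0, \sigma^2)$,
- $\beta_j$, $(\alpha\beta)_{ij}$ and $\varepsilon_{ijk}$ are independent,
- $i = 1, \ldots, a$, $j = 1, \ldots, b$, $k = 1, \ldots, n$,
- $N = nab$ is the total sample size.

Expected Values: Instead of (8)-(11), we have
\[ E[MSE] = \sigma^2, \qquad (12) \]
\[ E[MSA] = \sigma^2 + nb \frac{\sum_{i=1}^{a} \alpha_i^2}{a - 1} + n \sigma_{\alpha\beta}^2, \qquad (13) \]
\[ E[MSB] = \sigma^2 + na \sigma_\beta^2, \qquad (14) \]
\[ E[MSAB] = \sigma^2 + n \sigma_{\alpha\beta}^2. \qquad (15) \]

Tests:
- $H_0 : \alpha_1 = \cdots = \alpha_a = 0$,
- $H_0 : \sigma_\beta = 0$,
- $H_0 : \sigma_{\alpha\beta} = 0$.

Equations (12)-(15) suggest evaluation of MSA/MSAB; MSB/MSE; MSAB/MSE.

ANOVA table: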
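| Source | df | SS | MS | F |
|---|---|---|---|---|
| Factor A | $a - 1$ | SSA | MSA | $F_A$ |
| Factor B | $b - 1$ | SSB | MSB | $F_B$ |
| Interactions AB | $(a - 1)(b - 1)$ | SSAB | MSAB | $F_{AB}$ |
| Error (Residuals) | $N - ab$ | SSE | MSE | |
| Total | $N - 1$ | SSTO | | |

Under the respective null hypotheses,
\[ F_A = \frac{MSA}{MSAB}, \qquad F_B = \frac{MSB}{MSE}, \qquad F_{AB} = \frac{MSAB}{MSE} \]
have F distributions with $(a - 1, (a - 1)(b - 1))$, $(b - 1, N - ab)$ and $((a - 1)(b - 1), N - ab)$ degrees of freedom, respectively.

R implementation: Use the fixed-effects commands:
- aov(y~xa*xb)
- aov(y~xa+xb+xa*xb)
- aov(y~xa+xb+xa:xb)

Ignore the F statistics and p-values in the output and compute them on your own from the mean squares.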
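Because the only thing that changes between the fixed, random, and mixed cases is which mean square sits in the denominator, the manual computation can be wrapped in a small function. The helper `f_ratio` below is a hypothetical convenience written for this sketch, not a function from R or the textbook; it takes the row positions of the numerator and denominator in the `aov` table. The simulated data assume A fixed and B random.

```r
set.seed(5)

# Hypothetical helper: F ratio of two rows of an aov summary table
f_ratio <- function(tab, num, den) {
  Fv  <- tab[["Mean Sq"]][num] / tab[["Mean Sq"]][den]
  df1 <- tab[["Df"]][num]
  df2 <- tab[["Df"]][den]
  c(F = Fv, df1 = df1, df2 = df2, p = pf(Fv, df1, df2, lower.tail = FALSE))
}

# Hypothetical mixed design: A fixed (a = 3), B random (b = 5), n = 2
a <- 3
b <- 5
n <- 2
d <- expand.grid(rep = seq_len(n),
                 xb  = factor(seq_len(b)),
                 xa  = factor(seq_len(a)))
i <- as.integer(d$xa)
j <- as.integer(d$xb)
alpha <- c(-1, 0, 1)                         # fixed A effects, summing to zero
beta  <- rnorm(b, sd = 2)                    # random B effects
ab    <- matrix(rnorm(a * b, sd = 1), a, b)  # random interactions
d$y <- 30 + alpha[i] + beta[j] + ab[cbind(i, j)] + rnorm(nrow(d))

tab <- summary(aov(y ~ xa * xb, data = d))[[1]]  # rows: xa, xb, xa:xb, Residuals

# Mixed model: A against MSAB, B and AB against MSE, as suggested by (12)-(15)
res <- rbind(A  = f_ratio(tab, 1, 3),
             B  = f_ratio(tab, 2, 4),
             AB = f_ratio(tab, 3, 4))
print(res)
```

Calling `f_ratio(tab, 2, 3)` instead would give the Factor B test of the fully random model in Section 2.2.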
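3 Two-Factor ANOVA with nested design

3.1 Fixed effects

Textbook: Chapter 26.1; p. 1089. For random and mixed effects see Table 26.5 on page 1099.

Model:
\[ Y_{ijk} = \mu_{\cdot\cdot} + \alpha_i + \beta_{j(i)} + \varepsilon_{ijk}, \]
- $\mu_{\cdot\cdot}$ is a constant,
- $\alpha_i$ are fixed Factor A main effects, $\sum_{i=1}^{a} \alpha_i = 0$,
- $\beta_{j(i)}$ are the effects of Factor B within level $i$ of Factor A, $\sum_{j=1}^{b} \beta_{j(i)} = 0$,
- $\varepsilon_{ijk}$ are independent $N(0, \sigma^2)$,
- $i = 1, \ldots, a$,
- $j = 1, \ldots, b$, where $b$ is the number of levels of Factor B within each level of Factor A,
- $k = 1, \ldots, n$,
- $N = nab$ is the total sample size.

Sums of Squares: The decomposition
\[ Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot} = (Y_{ijk} - \bar{Y}_{ij\cdot}) + (\bar{Y}_{ij\cdot} - \bar{Y}_{\cdot\cdot\cdot}), \]
where the left-hand side is the total deviation, the first term on the right is the deviation of each observation from the treatment mean, and the second term is the deviation of the treatment mean from the overall mean, leads to $SSTO = SSE + SSTR$,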
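\[ SSTO = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot})^2, \]
\[ SSTR = n \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{Y}_{ij\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2, \]
\[ SSE = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (Y_{ijk} - \bar{Y}_{ij\cdot})^2. \]

The further decomposition of the deviation of the treatment mean from the overall mean,
\[ \bar{Y}_{ij\cdot} - \bar{Y}_{\cdot\cdot\cdot} = (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot}) + (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot}), \]
into the Factor A main effect and the Factor B effects within A, leads to $SSTR = SSA + SSB(A)$,
\[ SSA = nb \sum_{i=1}^{a} (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2, \]
\[ SSB(A) = n \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot})^2. \]

Degrees of freedom:
- SSTO has $N - 1$ degrees of freedom, since there are $N$ observations and one constraint;
- SSTR has $ab - 1$ degrees of freedom, since there are $ab$ treatments and one constraint;
- SSE has $N - ab$ degrees of freedom, since we must have $N - 1 = (N - ab) + (ab - 1)$;
- SSA has $a - 1$ degrees of freedom, since there are $a$ levels and one constraint;
- SSB(A) has $a(b - 1)$ degrees of freedom, since for each level of A we have $b$ levels less one constraint.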
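Mean Squares: Sums of squares and the numbers of degrees of freedom lead to
\[ MSTR = \frac{SSTR}{ab - 1}, \qquad MSE = \frac{SSE}{N - ab}, \]
\[ MSA = \frac{SSA}{a - 1}, \qquad MSB(A) = \frac{SSB(A)}{a(b - 1)}. \]

Expected Values:
\[ E[MSE] = \sigma^2, \qquad (16) \]
\[ E[MSA] = \sigma^2 + nb \frac{\sum_{i=1}^{a} \alpha_i^2}{a - 1}, \qquad (17) \]
\[ E[MSB(A)] = \sigma^2 + n \frac{\sum_{i=1}^{a} \sum_{j=1}^{b} \beta_{j(i)}^2}{a(b - 1)}. \qquad (18) \]

Tests:
- $H_0 : \alpha_1 = \cdots = \alpha_a = 0$,
- $H_0 : \beta_{j(i)} = 0$ for all $i, j$.

Equations (16)-(18) suggest evaluation of MSA/MSE; MSB(A)/MSE.

ANOVA table:

| Source | df | SS | MS | F |
|---|---|---|---|---|
| Factor A | $a - 1$ | SSA | MSA | $F_A$ |
| Factor B within A | $a(b - 1)$ | SSB(A) | MSB(A) | $F_{B(A)}$ |
| Error (Residuals) | $N - ab$ | SSE | MSE | |
| Total | $N - 1$ | SSTO | | |

Under the respective null hypotheses,
\[ F_A = \frac{MSA}{MSE}, \qquad F_{B(A)} = \frac{MSB(A)}{MSE}, \]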
have F distributions with $(a - 1, N - ab)$ and $(a(b - 1), N - ab)$ degrees of freedom, respectively.

R implementation: aov(y~xa+xa/xb)

For random and mixed effects, ignore the F and p-value part of the output and calculate them manually.

3.2 Random effects

Expected Values:
\[ E[MSE] = \sigma^2, \qquad (19) \]
\[ E[MSA] = \sigma^2 + nb \sigma_\alpha^2 + n \sigma_\beta^2, \qquad (20) \]
\[ E[MSB(A)] = \sigma^2 + n \sigma_\beta^2. \qquad (21) \]

3.3 Mixed effects

Model:
\[ Y_{ijk} = \mu_{\cdot\cdot} + \alpha_i + \beta_{j(i)} + \varepsilon_{ijk}, \]
- $\mu_{\cdot\cdot}$ is a constant,
- $\alpha_i$ are fixed Factor A main effects, $\sum_{i=1}^{a} \alpha_i = 0$,
- $\beta_{j(i)}$ are $N(0, \sigma_\beta^2)$,
- $\varepsilon_{ijk}$ are independent $N(0, \sigma^2)$,
- $\varepsilon_{ijk}$, $\beta_{j(i)}$ are independent,
- $j = 1, \ldots, b$, where $b$ is the number of levels of Factor B within each level of Factor A,
- $k = 1, \ldots, n$,
- $N = nab$ is the total sample size.

Expected Values:
A - fixed; B - random:
\[ E[MSE] = \sigma^2, \qquad (22) \]
\[ E[MSA] = \sigma^2 + nb \frac{\sum_{i=1}^{a} \alpha_i^2}{a - 1} + n \sigma_\beta^2, \qquad (23) \]
\[ E[MSB(A)] = \sigma^2 + n \sigma_\beta^2. \qquad (24) \]

4 Complicated designs: Three-Factor ANOVA with partially nested design and mixed effects

Textbook: Chapter 26.9; p. 1114

4.1 Fixed-effects only

Model: Set-up: All effects are fixed, A and B interact, C nested in A.
\[ Y_{ijkm} = \mu_{\cdot\cdot\cdot} + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \gamma_{k(i)} + (\beta\gamma)_{jk(i)} + \varepsilon_{ijkm}, \]
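- $\mu_{\cdot\cdot\cdot}$ is a constant,
- $\alpha_i$ are fixed Factor A main effects, $\sum_{i=1}^{a} \alpha_i = 0$,
- $\beta_j$ are fixed Factor B main effects, $\sum_{j=1}^{b} \beta_j = 0$,
- $(\alpha\beta)_{ij}$ are interactions between A and B, with
  $\sum_{i=1}^{a} (\alpha\beta)_{ij} = 0$ for $j = 1, \ldots, b$, and
  $\sum_{j=1}^{b} (\alpha\beta)_{ij} = 0$ for $i = 1, \ldots, a$,
- $\gamma_{k(i)}$ are C within A effects, $\sum_{k=1}^{c} \gamma_{k(i)} = 0$,
- $(\beta\gamma)_{jk(i)}$ are B and C interactions within A, $\sum_{k=1}^{c} (\beta\gamma)_{jk(i)} = 0$,
- $\varepsilon_{ijkm}$ are independent $N(0, \sigma^2)$,
- $i = 1, \ldots, a$, $j = 1, \ldots, b$, $k = 1, \ldots, c$, $m = 1, \ldots, n$,
- $N = nabc$ is the total sample size.

Tests: $H_0 : \alpha_1 = \cdots = \alpha_a = 0$.

ANOVA table: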
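| Source | df | SS | MS | F |
|---|---|---|---|---|
| Factor A | $a - 1$ | SSA | MSA | $F_A$ |
| Factor B | $b - 1$ | SSB | MSB | $F_B$ |
| Interactions A and B | $(a - 1)(b - 1)$ | SSAB | MSAB | $F_{AB}$ |
| C nested in A | $a(c - 1)$ | SSC(A) | MSC(A) | $F_{C(A)}$ |
| Interactions B and C nested in A | $a(b - 1)(c - 1)$ | SSBC(A) | MSBC(A) | $F_{BC(A)}$ |
| Error (Residuals) | $N - abc$ | SSE | MSE | |
| Total | $N - 1$ | SSTO | | |

The error degrees of freedom are $N - abc$, since the model degrees of freedom add up to $abc - 1$ and we must have $N - 1 = (N - abc) + (abc - 1)$.

Under the respective null hypotheses,
\[ F_A = \frac{MSA}{MSE}, \quad F_B = \frac{MSB}{MSE}, \quad F_{AB} = \frac{MSAB}{MSE}, \quad F_{C(A)} = \frac{MSC(A)}{MSE}, \quad F_{BC(A)} = \frac{MSBC(A)}{MSE} \]
have F distributions with $(a - 1, N - abc)$, $(b - 1, N - abc)$, $((a - 1)(b - 1), N - abc)$, $(a(c - 1), N - abc)$, $(a(b - 1)(c - 1), N - abc)$ degrees of freedom, respectively.

4.2 Mixed-effects

Model: Set-up: A and B are fixed, C is random; A and B interact, C nested in A.
\[ Y_{ijkm} = \mu_{\cdot\cdot\cdot} + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \gamma_{k(i)} + (\beta\gamma)_{jk(i)} + \varepsilon_{ijkm}, \]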
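- $\mu_{\cdot\cdot\cdot}$ is a constant,
- $\alpha_i$ are fixed Factor A main effects, $\sum_{i=1}^{a} \alpha_i = 0$,
- $\beta_j$ are fixed Factor B main effects, $\sum_{j=1}^{b} \beta_j = 0$,
- $(\alpha\beta)_{ij}$ are interactions between A and B, with
  $\sum_{i=1}^{a} (\alpha\beta)_{ij} = 0$ for $j = 1, \ldots, b$, and
  $\sum_{j=1}^{b} (\alpha\beta)_{ij} = 0$ for $i = 1, \ldots, a$,
- $\gamma_{k(i)}$ are random C within A effects, $N(0, \sigma_\gamma^2)$,
- $(\beta\gamma)_{jk(i)}$ are random B and C interactions within A, $N(0, \sigma_{\beta\gamma}^2)$,
- $\varepsilon_{ijkm}$ are independent $N(0, \sigma^2)$,
- all random variables are considered to be normal and independent,
- $i = 1, \ldots, a$, $j = 1, \ldots, b$, $k = 1, \ldots, c$, $m = 1, \ldots, n$,
- $N = nabc$ is the total sample size.

Expected Values:
\[ E[MSE] = \sigma^2, \qquad (25) \]
\[ E[MSA] = \sigma^2 + nbc \frac{\sum_{i=1}^{a} \alpha_i^2}{a - 1} + bn \sigma_\gamma^2, \qquad (26) \]
\[ E[MSB] = \sigma^2 + nac \frac{\sum_{j=1}^{b} \beta_j^2}{b - 1} + n \sigma_{\beta\gamma}^2, \qquad (27) \]
\[ E[MSC(A)] = \sigma^2 + bn \sigma_\gamma^2, \qquad (28) \]
\[ E[MSAB] = \sigma^2 + cn \frac{\sum_{i=1}^{a} \sum_{j=1}^{b} (\alpha\beta)_{ij}^2}{(a - 1)(b - 1)} + n \sigma_{\beta\gamma}^2, \qquad (29) \]
\[ E[MSBC(A)] = \sigma^2 + n \sigma_{\beta\gamma}^2. \qquad (30) \]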
Tests: $H_0 : \alpha_1 = \cdots = \alpha_a = 0$.

ANOVA table:

| Source | df | SS | MS | F |
|---|---|---|---|---|
| Factor A | $a - 1$ | SSA | MSA | $F_A$ |
| Factor B | $b - 1$ | SSB | MSB | $F_B$ |
| Interactions A and B | $(a - 1)(b - 1)$ | SSAB | MSAB | $F_{AB}$ |
| C nested in A | $a(c - 1)$ | SSC(A) | MSC(A) | |
| Interactions B and C nested in A | $a(b - 1)(c - 1)$ | SSBC(A) | MSBC(A) | |
| Error (Residuals) | $N - abc$ | SSE | MSE | |
| Total | $N - 1$ | SSTO | | |

Under the respective null hypotheses,
\[ F_A = \frac{MSA}{MSC(A)}, \qquad F_B = \frac{MSB}{MSBC(A)}, \qquad F_{AB} = \frac{MSAB}{MSBC(A)} \]
have F distributions with $(a - 1, a(c - 1))$, $(b - 1, a(b - 1)(c - 1))$, $((a - 1)(b - 1), a(b - 1)(c - 1))$ degrees of freedom, respectively.

R implementation: aov(y~xa+xb+xa:xb+xa/xc+xa/(xb:xc))

For random and mixed effects, ignore the F and p-value part of the output and calculate them manually.
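The manual calculation for the partially nested mixed design can be sketched as follows. The data are simulated, and the design sizes, effect sizes, and object names (`c_lev`, `gam`, `bg`, and so on) are assumptions for illustration. The sketch relies on the rows of the `aov` table appearing in the order xa, xb, xa:xb, xa:xc, xa:xb:xc, Residuals, which is how R is expected to order terms by degree; checking the printed row names and degrees of freedom before using positional indices is advisable.

```r
set.seed(6)

# Hypothetical design: a = 2, b = 3, c = 3 levels of C within each level of A, n = 2
a <- 2
b <- 3
c_lev <- 3
n <- 2
d <- expand.grid(rep = seq_len(n),
                 xc  = factor(seq_len(c_lev)),
                 xb  = factor(seq_len(b)),
                 xa  = factor(seq_len(a)))
i <- as.integer(d$xa)
j <- as.integer(d$xb)
k <- as.integer(d$xc)
gam <- matrix(rnorm(a * c_lev, sd = 2), a, c_lev)          # random gamma_k(i)
bg  <- array(rnorm(a * b * c_lev, sd = 1), c(a, b, c_lev)) # random (beta gamma)_jk(i)
d$y <- 5 + c(-1.5, 1.5)[i] + c(-1, 0, 1)[j] +
  gam[cbind(i, k)] + bg[cbind(i, j, k)] + rnorm(nrow(d))

fit <- aov(y ~ xa + xb + xa:xb + xa/xc + xa/(xb:xc), data = d)
tab <- summary(fit)[[1]]
print(tab)                                   # check row order and df first

MS <- tab[["Mean Sq"]]   # assumed order: A, B, AB, C(A), BC(A), Residuals
df <- tab[["Df"]]

# Mixed-effects ratios from the table above
F_A  <- MS[1] / MS[4]    # MSA  / MSC(A)
F_B  <- MS[2] / MS[5]    # MSB  / MSBC(A)
F_AB <- MS[3] / MS[5]    # MSAB / MSBC(A)
p_A  <- pf(F_A,  df[1], df[4], lower.tail = FALSE)
p_B  <- pf(F_B,  df[2], df[5], lower.tail = FALSE)
p_AB <- pf(F_AB, df[3], df[5], lower.tail = FALSE)

print(data.frame(F = c(F_A, F_B, F_AB), p = c(p_A, p_B, p_AB),
                 row.names = c("A", "B", "AB")))
```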