Factorial and Unbalanced Analysis of Variance

1 Factorial and Unbalanced Analysis of Variance Nathaniel E. Helwig Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 04-Jan-2017 Nathaniel E. Helwig (U of Minnesota) Factorial & Unbalanced Analysis of Variance Updated 04-Jan-2017 : Slide 1

2 Copyright
Copyright 2017 by Nathaniel E. Helwig

3 Outline of Notes
1) Balanced Two-Way ANOVA:
   - Model Form & Assumptions
   - Least-Squares Estimation
   - Basic Inference
   - Hypertension Example (pt 1)
   - Multiple Comparisons
   - Hypertension Example (pt 2)
2) Balanced Three-Way ANOVA:
   - Model Form & Estimation
   - Hypertension Example (pt 3)
3) Unbalanced ANOVA Models:
   - Overview of problem
   - Types of sums-of-squares
   - Hypertension Example (pt 4)

4 Balanced Two-Way ANOVA

5 Balanced Two-Way ANOVA: Model Form & Assumptions
Two-Way ANOVA Model (cell means form)
The two-way Analysis of Variance (ANOVA) model has the form
$y_{ijk} = \mu_{jk} + e_{ijk}$
for $i \in \{1, \ldots, n_{jk}\}$, $j \in \{1, \ldots, a\}$, and $k \in \{1, \ldots, b\}$, where
- $y_{ijk} \in \mathbb{R}$ is the real-valued response for the $i$-th subject in factor cell $(j, k)$
- $\mu_{jk} \in \mathbb{R}$ is the real-valued population mean for factor cell $(j, k)$
- $e_{ijk} \overset{iid}{\sim} N(0, \sigma^2)$ is a Gaussian error term
- $n_{jk}$ is the number of subjects in cell $(j, k)$, and the total sample size is $\sum_{j=1}^{a} \sum_{k=1}^{b} n_{jk}$ (note: $n_{jk}$ is the same for all $j, k$ in balanced two-way ANOVA)
- $a$ and $b$ are the numbers of levels of the first and second factors
Implies that $y_{ijk} \overset{ind}{\sim} N(\mu_{jk}, \sigma^2)$.

6 Balanced Two-Way ANOVA: Model Form & Assumptions
Two-Way ANOVA Model (effect coding)
Using effect coding, the mean for factor cell $(j, k)$ has the form
$\mu_{jk} = \mu + \alpha_j + \beta_k + (\alpha\beta)_{jk}$
for $j \in \{1, \ldots, a\}$ and $k \in \{1, \ldots, b\}$, where
- $\mu$ is the overall population mean
- $\alpha_j$ is the main effect of the first factor, such that $\sum_{j=1}^{a} \alpha_j = 0$
- $\beta_k$ is the main effect of the second factor, such that $\sum_{k=1}^{b} \beta_k = 0$
- $(\alpha\beta)_{jk}$ is the interaction effect, such that $\sum_{j=1}^{a} (\alpha\beta)_{jk} = 0 \ \forall k$ and $\sum_{k=1}^{b} (\alpha\beta)_{jk} = 0 \ \forall j$
Set $(\alpha\beta)_{jk} = 0 \ \forall j, k$ to fit an additive model.

7 Balanced Two-Way ANOVA: Model Form & Assumptions
Two-Way ANOVA Model (matrix form)
In matrix form, the two-way ANOVA model is $y = Xb + e$, where the $i$-th row of $X$ is
$(1,\ x_{i1}, \ldots, x_{i(a-1)},\ z_{i1}, \ldots, z_{i(b-1)},\ x_{i1}z_{i1}, \ldots, x_{i(a-1)}z_{i1},\ \ldots,\ x_{i1}z_{i(b-1)}, \ldots, x_{i(a-1)}z_{i(b-1)})$
and
$b = (\mu,\ \alpha_1, \ldots, \alpha_{a-1},\ \beta_1, \ldots, \beta_{b-1},\ (\alpha\beta)_{11}, \ldots, (\alpha\beta)_{(a-1)1},\ \ldots,\ (\alpha\beta)_{1(b-1)}, \ldots, (\alpha\beta)_{(a-1)(b-1)})'$
where
- $X$ has $1 + (a-1) + (b-1) + (a-1)(b-1) = ab$ columns
- $x_{ij} = 1$ if the $i$-th observation is in the $j$-th level of the first factor, $-1$ if it is in the $a$-th level of the first factor, and $0$ otherwise
- $z_{ik} = 1$ if the $i$-th observation is in the $k$-th level of the second factor, $-1$ if it is in the $b$-th level of the second factor, and $0$ otherwise
- $i \in \{1, \ldots, n\}$ and the additional subscripts on $y$ and $e$ are dropped
Implies that $y \sim N(Xb, \sigma^2 I_n)$.
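As a small illustration of the matrix form, the sketch below builds an effect-coded design matrix with base R's `contr.sum` and `model.matrix`. The layout (a = 3, b = 2, two observations per cell) and the names `dat`, `A`, `B` are assumptions made up for this example and are not part of the hypertension data.

```r
# Hypothetical balanced layout: a = 3, b = 2, two observations per cell
a <- 3; b <- 2
dat <- expand.grid(obs = 1:2, A = factor(1:a), B = factor(1:b))

# Effect (sum-to-zero) coding for both factors
contrasts(dat$A) <- contr.sum(a)
contrasts(dat$B) <- contr.sum(b)

# Design matrix with intercept, main-effect, and interaction columns
X <- model.matrix(~ A * B, data = dat)
ncol(X)      # 1 + (a-1) + (b-1) + (a-1)(b-1) columns
colnames(X)

# Rows in the last level of each factor carry -1 in the main-effect columns
last_cell <- X[dat$A == a & dat$B == b, , drop = FALSE]
last_cell[1, ]
```

Observations in the a-th level of the first factor and the b-th level of the second factor are coded -1 on every main-effect column, so their interaction columns (products of the two) equal +1.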

8 Balanced Two-Way ANOVA: Model Form & Assumptions
Two-Way ANOVA Model (assumptions)
The fundamental assumptions of the two-way ANOVA model are:
1. $x_{ij}$, $z_{ik}$ and $y_i$ are observed random variables (known constants)
2. $e_i \overset{iid}{\sim} N(0, \sigma^2)$ is an unobserved random variable
3. $\mu_{jk}$ are unknown constants
4. $(y_i \mid x_{ij}, z_{ik}) \overset{ind}{\sim} N(\mu_{jk}, \sigma^2)$ (note: homogeneity of variance)
Interpretation of $\mu_{jk}$ depends on the model form:
- Additive: $\mu_{jk} = \mu + \alpha_j + \beta_k$
- Interaction: $\mu_{jk} = \mu + \alpha_j + \beta_k + (\alpha\beta)_{jk}$

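9 Balanced Two-Way ANOVA: Least-Squares Estimation
Ordinary Least-Squares
We want to find the effect estimates (i.e., the $\hat\mu$, $\hat\alpha_j$, $\hat\beta_k$, and $\widehat{(\alpha\beta)}_{jk}$ terms) that minimize the ordinary least squares criterion
$SSE = \sum_{k=1}^{b} \sum_{j=1}^{a} \sum_{i=1}^{n_{jk}} \left( y_{ijk} - \mu - \alpha_j - \beta_k - (\alpha\beta)_{jk} \right)^2$
If $n_{jk} = n \ \forall j, k$ (with $n$ denoting the common cell size), the least-squares estimates have the form
$\hat\mu = \frac{1}{abn} \sum_{k=1}^{b} \sum_{j=1}^{a} \sum_{i=1}^{n} y_{ijk} = \bar{y}_{\cdot\cdot\cdot}$
$\hat\alpha_j = \left( \frac{1}{bn} \sum_{k=1}^{b} \sum_{i=1}^{n} y_{ijk} \right) - \hat\mu = \bar{y}_{\cdot j \cdot} - \bar{y}_{\cdot\cdot\cdot}$
$\hat\beta_k = \left( \frac{1}{an} \sum_{j=1}^{a} \sum_{i=1}^{n} y_{ijk} \right) - \hat\mu = \bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot}$
$\widehat{(\alpha\beta)}_{jk} = \left( \frac{1}{n} \sum_{i=1}^{n} y_{ijk} \right) - \hat\mu - \hat\alpha_j - \hat\beta_k = \bar{y}_{\cdot jk} - \bar{y}_{\cdot j \cdot} - \bar{y}_{\cdot\cdot k} + \bar{y}_{\cdot\cdot\cdot}$
which implies that $\hat{y}_{ijk} = \bar{y}_{\cdot jk}$ for all $(i, j, k)$.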
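The closed-form estimates can be checked numerically. The following is a minimal sketch on simulated balanced data (the data, seed, and variable names are made up for illustration), comparing the mean-based formulas with the coefficients `lm` returns under `contr.sum` coding.

```r
# Simulated balanced two-way layout (hypothetical data, not the hypertension study)
set.seed(1)
a <- 3; b <- 2; n <- 4
dat <- expand.grid(i = 1:n, A = factor(1:a), B = factor(1:b))
dat$y <- rnorm(nrow(dat), mean = 10 + as.integer(dat$A) + 2 * as.integer(dat$B))
contrasts(dat$A) <- contr.sum(a)
contrasts(dat$B) <- contr.sum(b)

# Grand, marginal, and cell means
ybar    <- mean(dat$y)
ybar_A  <- tapply(dat$y, dat$A, mean)
ybar_B  <- tapply(dat$y, dat$B, mean)
ybar_AB <- tapply(dat$y, list(dat$A, dat$B), mean)

# Effect estimates from the balanced-design formulas
alpha_hat <- ybar_A - ybar
beta_hat  <- ybar_B - ybar
ab_hat    <- ybar_AB - outer(ybar_A, ybar_B, "+") + ybar

# lm() reports only the first (a-1), (b-1), and (a-1)(b-1) effects
fit <- lm(y ~ A * B, data = dat)
by_hand <- c(ybar, alpha_hat[1:(a - 1)], beta_hat[1:(b - 1)],
             ab_hat[1:(a - 1), 1:(b - 1)])
all.equal(unname(coef(fit)), unname(by_hand))
```

The remaining effects (last level of each factor) follow from the sum-to-zero constraints, which is why `lm` does not print them.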

10 Balanced Two-Way ANOVA: Least-Squares Estimation
Fitted Values and Residuals
Form of the fitted values depends on the fitted model:
- Additive: $\hat\mu_{jk} = \bar{y}_{\cdot j \cdot} + \bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot}$
- Interaction: $\hat\mu_{jk} = \bar{y}_{\cdot jk}$
Residuals have the form
$\hat{e}_{ijk} = y_{ijk} - \hat\mu_{jk}$
where the form of $\hat\mu_{jk}$ depends on the fitted model (additive versus interaction).

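11 Balanced Two-Way ANOVA: Basic Inference
ANOVA Sums-of-Squares
In the balanced two-way ANOVA model with interaction ($n$ is the common cell size):
- $SST = \sum_{k=1}^{b} \sum_{j=1}^{a} \sum_{i=1}^{n} (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2$, $df = abn - 1$
- $SSR = n \sum_{k=1}^{b} \sum_{j=1}^{a} (\bar{y}_{\cdot jk} - \bar{y}_{\cdot\cdot\cdot})^2$, $df = ab - 1$
- $SSE = \sum_{k=1}^{b} \sum_{j=1}^{a} \sum_{i=1}^{n} (y_{ijk} - \bar{y}_{\cdot jk})^2$, $df = abn - ab$
In the balanced two-way ANOVA model with no interaction:
- $SST = \sum_{k=1}^{b} \sum_{j=1}^{a} \sum_{i=1}^{n} (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2$, $df = abn - 1$
- $SSR = n \sum_{k=1}^{b} \sum_{j=1}^{a} ([\bar{y}_{\cdot j \cdot} + \bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot}] - \bar{y}_{\cdot\cdot\cdot})^2$, $df = a + b - 2$
- $SSE = \sum_{k=1}^{b} \sum_{j=1}^{a} \sum_{i=1}^{n} (y_{ijk} - [\bar{y}_{\cdot j \cdot} + \bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot}])^2$, $df = abn - (a + b - 1)$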

12 Balanced Two-Way ANOVA: Basic Inference
Partitioning the Variance
From the MLR notes we know that $SST = SSR + SSE$.
If $n_{jk} = n \ \forall j, k$, we can partition $SSR = SSA + SSB + SSAB$, where
- $SSA = bn \sum_{j=1}^{a} \hat\alpha_j^2$, $df = a - 1$
- $SSB = an \sum_{k=1}^{b} \hat\beta_k^2$, $df = b - 1$
- $SSAB = n \sum_{k=1}^{b} \sum_{j=1}^{a} \widehat{(\alpha\beta)}_{jk}^2$, $df = (a - 1)(b - 1)$
Implies that $SSR = SSA + SSB$ for the additive model (if $n_{jk} = n \ \forall j, k$).
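A quick numerical check of the partition may help. This is a sketch on simulated balanced data (hypothetical values and names), computing each sum-of-squares from the effect estimates and comparing with the sequential table from `anova`.

```r
# Simulated balanced design (hypothetical data)
set.seed(2)
a <- 3; b <- 2; n <- 5
dat <- expand.grid(i = 1:n, A = factor(1:a), B = factor(1:b))
dat$y <- rnorm(nrow(dat), mean = as.integer(dat$A) * as.integer(dat$B))

# Effect estimates from grand, marginal, and cell means
ybar      <- mean(dat$y)
alpha_hat <- tapply(dat$y, dat$A, mean) - ybar
beta_hat  <- tapply(dat$y, dat$B, mean) - ybar
cell      <- tapply(dat$y, list(dat$A, dat$B), mean)
ab_hat    <- cell - ybar - outer(alpha_hat, beta_hat, "+")

# Sums-of-squares from the slide formulas
SSA  <- b * n * sum(alpha_hat^2)
SSB  <- a * n * sum(beta_hat^2)
SSAB <- n * sum(ab_hat^2)
SSR  <- n * sum((cell - ybar)^2)
c(SSR = SSR, SSA_SSB_SSAB = SSA + SSB + SSAB)

# The same three pieces appear in the anova() table
tab <- anova(lm(y ~ A * B, data = dat))
tab[["Sum Sq"]][1:3]
```

With equal cell sizes the two entries of the first printed vector agree; dropping a few rows from `dat` breaks the equality, which is the issue taken up in the unbalanced section.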

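13 Balanced Two-Way ANOVA: Basic Inference
Extended ANOVA Table and F Tests
We typically organize the SS information into an ANOVA table:
Source | SS | df | MS | F | p-value
SSR | $n \sum_{k=1}^{b} \sum_{j=1}^{a} (\bar{y}_{\cdot jk} - \bar{y}_{\cdot\cdot\cdot})^2$ | $ab - 1$ | MSR | $F$ | $p$
SSA | $bn \sum_{j=1}^{a} (\bar{y}_{\cdot j \cdot} - \bar{y}_{\cdot\cdot\cdot})^2$ | $a - 1$ | MSA | $F_a$ | $p_a$
SSB | $an \sum_{k=1}^{b} (\bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot})^2$ | $b - 1$ | MSB | $F_b$ | $p_b$
SSAB | $n \sum_{k=1}^{b} \sum_{j=1}^{a} (\bar{y}_{\cdot jk} - \bar{y}_{\cdot j \cdot} - \bar{y}_{\cdot\cdot k} + \bar{y}_{\cdot\cdot\cdot})^2$ | $(a - 1)(b - 1)$ | MSAB | $F_{ab}$ | $p_{ab}$
SSE | $\sum_{k=1}^{b} \sum_{j=1}^{a} \sum_{i=1}^{n} (y_{ijk} - \bar{y}_{\cdot jk})^2$ | $ab(n - 1)$ | MSE | |
SST | $\sum_{k=1}^{b} \sum_{j=1}^{a} \sum_{i=1}^{n} (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2$ | $abn - 1$ | | |
where
$MSR = \frac{SSR}{ab - 1}$, $MSA = \frac{SSA}{a - 1}$, $MSB = \frac{SSB}{b - 1}$, $MSAB = \frac{SSAB}{(a - 1)(b - 1)}$, $MSE = \frac{SSE}{ab(n - 1)}$,
and, with $\mathcal{F}_{d_1, d_2}$ denoting the F distribution with $(d_1, d_2)$ degrees of freedom,
- $F = \frac{MSR}{MSE} \sim \mathcal{F}_{ab-1,\, ab(n-1)}$ and $p = P(\mathcal{F}_{ab-1,\, ab(n-1)} > F)$
- $F_a = \frac{MSA}{MSE} \sim \mathcal{F}_{a-1,\, ab(n-1)}$ and $p_a = P(\mathcal{F}_{a-1,\, ab(n-1)} > F_a)$
- $F_b = \frac{MSB}{MSE} \sim \mathcal{F}_{b-1,\, ab(n-1)}$ and $p_b = P(\mathcal{F}_{b-1,\, ab(n-1)} > F_b)$
- $F_{ab} = \frac{MSAB}{MSE} \sim \mathcal{F}_{(a-1)(b-1),\, ab(n-1)}$ and $p_{ab} = P(\mathcal{F}_{(a-1)(b-1),\, ab(n-1)} > F_{ab})$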

14 Balanced Two-Way ANOVA: Basic Inference
ANOVA Table F Tests
The $F$ statistic and $p$-value are testing
$H_0: \alpha_j = \beta_k = (\alpha\beta)_{jk} = 0 \ \forall j, k$ versus $H_1: (\exists (j, k) \in \{1, \ldots, a\} \times \{1, \ldots, b\})(\alpha_j = \beta_k = (\alpha\beta)_{jk} = 0 \text{ is false})$
- Equivalent to $H_0: \mu_{jk} = \mu \ \forall j, k$ versus $H_1$: not all $\mu_{jk}$ are equal
The $F_a$ statistic and $p_a$-value are testing
$H_0: \alpha_j = 0 \ \forall j$ versus $H_1: (\exists j \in \{1, \ldots, a\})(\alpha_j \neq 0)$
- Testing main effect of first factor
The $F_b$ statistic and $p_b$-value are testing
$H_0: \beta_k = 0 \ \forall k$ versus $H_1: (\exists k \in \{1, \ldots, b\})(\beta_k \neq 0)$
- Testing main effect of second factor
The $F_{ab}$ statistic and $p_{ab}$-value are testing
$H_0: (\alpha\beta)_{jk} = 0 \ \forall j, k$ versus $H_1: (\exists (j, k) \in \{1, \ldots, a\} \times \{1, \ldots, b\})((\alpha\beta)_{jk} \neq 0)$
- Testing interaction effect

15 Balanced Two-Way ANOVA: Hypertension Example (Part 1)
Hypertension Example: Data Description
Hypertension example from Maxwell & Delany (2003).
Total of $n = 72$ subjects participate in the hypertension experiment.
- Factor A: drug type ($a = 3$ levels: X, Y, Z)
- Factor B: diet type ($b = 2$ levels: yes, no)
Randomly assign $n_{jk} = 12$ subjects to each treatment cell:
- Note there are $ab = (3)(2) = 6$ treatment cells
- Observations are independent within and between cells

16 Balanced Two-Way ANOVA: Hypertension Example (Part 1)
Hypertension Example: Descriptive Statistics
Sum of blood pressure for each treatment cell ($\sum_{i=1}^{12} y_{ijk}$):
[Table: rows Drug X (j = 1), Y (j = 2), Z (j = 3), Total; columns Diet No (k = 1), Yes (k = 2), Total; entries not preserved in this transcription]
Sum-of-squares of blood pressure for each treatment cell ($\sum_{i=1}^{12} y_{ijk}^2$):
[Table: rows Drug X (j = 1), Y (j = 2), Z (j = 3), Total; columns Diet No (k = 1), Yes (k = 2), Total; entries not preserved in this transcription]

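17 Balanced Two-Way ANOVA: Hypertension Example (Part 1)
Hypertension Example: OLS Estimation (by hand)
Least-squares estimates are cell means, $\hat\mu_{jk} = \bar{y}_{\cdot jk}$, and
$\hat\mu = \frac{1}{abn} \sum_{k=1}^{b} \sum_{j=1}^{a} \sum_{i=1}^{n} y_{ijk} = \bar{y}_{\cdot\cdot\cdot} = \frac{13284}{72} = 184.5$
$\hat\alpha_1 = \left( \frac{1}{bn} \sum_{k=1}^{b} \sum_{i=1}^{n} y_{i1k} \right) - \hat\mu = \bar{y}_{\cdot 1 \cdot} - \bar{y}_{\cdot\cdot\cdot}$
$\hat\alpha_2 = \left( \frac{1}{bn} \sum_{k=1}^{b} \sum_{i=1}^{n} y_{i2k} \right) - \hat\mu = \bar{y}_{\cdot 2 \cdot} - \bar{y}_{\cdot\cdot\cdot}$
$\hat\alpha_3 = \left( \frac{1}{bn} \sum_{k=1}^{b} \sum_{i=1}^{n} y_{i3k} \right) - \hat\mu = \bar{y}_{\cdot 3 \cdot} - \bar{y}_{\cdot\cdot\cdot}$
$\hat\beta_1 = \left( \frac{1}{an} \sum_{j=1}^{a} \sum_{i=1}^{n} y_{ij1} \right) - \hat\mu = \bar{y}_{\cdot\cdot 1} - \bar{y}_{\cdot\cdot\cdot}$
$\hat\beta_2 = \left( \frac{1}{an} \sum_{j=1}^{a} \sum_{i=1}^{n} y_{ij2} \right) - \hat\mu = \bar{y}_{\cdot\cdot 2} - \bar{y}_{\cdot\cdot\cdot}$
Here $n = 12$ is the cell size, $a = 3$, and $b = 2$. [Numerical values of the effect estimates are not preserved in this transcription.]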

18 Balanced Two-Way ANOVA: Hypertension Example (Part 1)
Hypertension Example: OLS Estimation (by hand)
Continuing from the previous slide, for $j \in \{1, 2, 3\}$ and $k \in \{1, 2\}$:
$\widehat{(\alpha\beta)}_{jk} = \bar{y}_{\cdot jk} - \bar{y}_{\cdot j \cdot} - \bar{y}_{\cdot\cdot k} + \bar{y}_{\cdot\cdot\cdot}$
[Numerical values are not fully preserved in this transcription; $\widehat{(\alpha\beta)}_{11}$ and $\widehat{(\alpha\beta)}_{12}$ both have magnitude 5.]

19 Balanced Two-Way ANOVA: Hypertension Example (Part 1)
Hypertension Example: Enter Data (in R)
> bp = scan("/users/nate/desktop/hypertension.dat")
Read 72 items
> # factor levels follow the order of the observations in the data file
> diet = factor(rep(rep(c("no","yes"),each=6),6))
> drug = factor(rep(rep(c("X","Y","Z"),each=12),2))
> biof = factor(rep(c("present","absent"),each=36))
> hyper = data.frame(bp=bp, diet=diet, drug=drug, biof=biof)
> hyper[1:20,]
[Output: columns bp, diet, drug, biof. Rows 1-6 are (no, X, present), rows 7-12 are (yes, X, present), rows 13-18 are (no, Y, present), and rows 19-20 are (yes, Y, present); the bp values are not preserved in this transcription.]

20 Balanced Two-Way ANOVA: Hypertension Example (Part 1)
Hypertension Example: OLS Estimation (in R)
Effect coding for drug and diet:
> contrasts(hyper$drug) <- contr.sum(3)
> contrasts(hyper$drug)
  [,1] [,2]
X    1    0
Y    0    1
Z   -1   -1
> contrasts(hyper$diet) <- contr.sum(2)
> contrasts(hyper$diet)
    [,1]
no     1
yes   -1
> mymod = lm(bp ~ drug * diet, data=hyper)
> summary(mymod)   # I deleted some output
Coefficients (estimates, standard errors, and t values not preserved in this transcription):
              Pr(>|t|)
(Intercept)   < 2e-16 ***
drug1            e-05 ***
drug2                 **
diet1            e-06 ***
drug1:diet1           *
drug2:diet1
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: [not preserved] on 66 degrees of freedom
F-statistic: [not preserved] on 5 and 66 DF,  p-value: 3.385e-07

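21 Balanced Two-Way ANOVA: Hypertension Example (Part 1)
Hypertension Example: Sums-of-Squares (by hand 1)
Defining $n = \sum_{k=1}^{b} \sum_{j=1}^{a} n_{jk}$, the relevant sums-of-squares are
$SST = \sum_{k=1}^{b} \sum_{j=1}^{a} \sum_{i=1}^{n_{jk}} (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2 = \sum_{k=1}^{b} \sum_{j=1}^{a} \sum_{i=1}^{n_{jk}} y_{ijk}^2 - \frac{1}{n} \left( \sum_{k=1}^{b} \sum_{j=1}^{a} \sum_{i=1}^{n_{jk}} y_{ijk} \right)^2$, where the grand total is $13284$ and $n = 72$
$SSE = \sum_{k=1}^{b} \sum_{j=1}^{a} \sum_{i=1}^{n_{jk}} (y_{ijk} - \bar{y}_{\cdot jk})^2 = \sum_{k=1}^{b} \sum_{j=1}^{a} \sum_{i=1}^{n_{jk}} y_{ijk}^2 - \sum_{k=1}^{b} \sum_{j=1}^{a} \frac{\left( \sum_{i=1}^{n_{jk}} y_{ijk} \right)^2}{n_{jk}}$, with $n_{jk} = 12$
$SSR = SST - SSE = 9780$
[The numerical values of SST and SSE are not preserved in this transcription.]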

22 Balanced Two-Way ANOVA: Hypertension Example (Part 1)
Hypertension Example: Sums-of-Squares (by hand 2)
The sums-of-squares for the main and interaction effects are given by
$SSA = bn \sum_{j=1}^{a} (\bar{y}_{\cdot j \cdot} - \bar{y}_{\cdot\cdot\cdot})^2 = bn \sum_{j=1}^{a} \hat\alpha_j^2 = 3675$, with $bn = (2)(12)$
$SSB = an \sum_{k=1}^{b} (\bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot})^2 = an \sum_{k=1}^{b} \hat\beta_k^2 = (3)(12) \left[ 2 \times 8.5^2 \right] = 5202$
$SSAB = n \sum_{k=1}^{b} \sum_{j=1}^{a} (\bar{y}_{\cdot jk} - \bar{y}_{\cdot j \cdot} - \bar{y}_{\cdot\cdot k} + \bar{y}_{\cdot\cdot\cdot})^2 = n \sum_{k=1}^{b} \sum_{j=1}^{a} \widehat{(\alpha\beta)}_{jk}^2 = 12 \left[ 2 \times (5^2 + 2.75^2 + 2.25^2) \right] = 903$
and since $n_{jk} = 12 \ \forall j, k$, we have
$SSR = SSA + SSB + SSAB$, i.e., $9780 = 3675 + 5202 + 903$

23 Balanced Two-Way ANOVA: Hypertension Example (Part 1)
Hypertension Example: ANOVA Table (by hand)
Putting things together in an ANOVA table (SS and df from the preceding slides; the remaining entries are not preserved in this transcription except where shown):
Source | SS | df | MS | F | p-value
SSR | 9780 | 5 | | | <.0001
SSA | 3675 | 2 | | |
SSB | 5202 | 1 | | | <.0001
SSAB | 903 | 2 | | |
SSE | | 66 | | |
SST | | 71 | | |
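The F statistics and p-values for the effect rows can be recomputed from quantities that do appear in these notes. The sketch below assumes SSA = 3675, SSB = 5202, and SSAB = 903 from the by-hand slide and takes MSE = 194.2 (the rounded value used later in the pairwise-comparison slide), so the results are approximate; the variable names are illustrative.

```r
# Design constants for the hypertension example
a <- 3; b <- 2; n <- 12

# Effect sums-of-squares reported in the notes
SS <- c(A = 3675, B = 5202, AB = 903)
df <- c(A = a - 1, B = b - 1, AB = (a - 1) * (b - 1))

# Rounded mean squared error and its degrees of freedom, ab(n - 1)
MSE    <- 194.2
df_err <- a * b * (n - 1)

# F = MS / MSE and upper-tail p-values from the F distribution
MS    <- SS / df
Fstat <- MS / MSE
pval  <- pf(Fstat, df1 = df, df2 = df_err, lower.tail = FALSE)

signif(cbind(SS, df, MS, F = Fstat, p = pval), 4)
```

Because the MSE is rounded, the printed values may differ slightly in the last digits from what `anova(mymod)` would report on the full data.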

24 Balanced Two-Way ANOVA: Hypertension Example (Part 1)
Hypertension Example: ANOVA Table (in R)
> anova(mymod)
Analysis of Variance Table
Response: bp
[Columns Df, Sum Sq, Mean Sq, F value are not preserved in this transcription]
            Pr(>F)
drug               ***
diet          e-06 ***
drug:diet
Residuals
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

25 Balanced Two-Way ANOVA: Multiple Comparisons
Multiple Comparisons Overview
Still have the multiple comparison problem:
- Overall test is not very informative
- Can examine effect estimates for group differences
- Need follow-up tests to examine linear combinations of means
Still can use the same tools as before:
- Bonferroni
- Tukey (Tukey-Kramer)
- Scheffé

26 Balanced Two-Way ANOVA: Multiple Comparisons
Two-Way ANOVA Linear Combinations
Assuming the interaction model, we now have
$\hat{L} = \sum_{k=1}^{b} \sum_{j=1}^{a} c_{jk} \bar{y}_{\cdot jk}$ and $\hat{V}(\hat{L}) = \hat\sigma^2 \sum_{k=1}^{b} \sum_{j=1}^{a} c_{jk}^2 / n_{jk}$
where $c_{jk}$ are the coefficients and $\hat\sigma^2$ is the MSE.
Assuming the additive model, we have
$\hat{L}_a = \sum_{j=1}^{a} c_j \bar{y}_{\cdot j \cdot}$ and $\hat{V}(\hat{L}_a) = \hat\sigma^2 \sum_{j=1}^{a} c_j^2 / n_{j \cdot}$
$\hat{L}_b = \sum_{k=1}^{b} c_k \bar{y}_{\cdot\cdot k}$ and $\hat{V}(\hat{L}_b) = \hat\sigma^2 \sum_{k=1}^{b} c_k^2 / n_{\cdot k}$
where $c_j$ and $c_k$ are main effect coefficients, $\hat\sigma^2$ is the MSE, and $n_{j \cdot} = \sum_{k=1}^{b} n_{jk}$ and $n_{\cdot k} = \sum_{j=1}^{a} n_{jk}$ are the marginal sample sizes.
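The estimate and variance formulas translate directly into a few lines of R. In this sketch the helper `lin_comb` and its argument names are hypothetical, and the two cell means (180 and 195) are placeholder values chosen only for illustration; the MSE of 194.2 and cell size of 12 are the values used in the hypertension slides.

```r
# Estimate and estimated variance of a linear combination of means.
# `lin_comb` is an illustrative helper, not a function from the notes.
lin_comb <- function(means, n_cell, coefs, mse) {
  stopifnot(length(means) == length(coefs),
            length(n_cell) == length(coefs))
  c(estimate = sum(coefs * means),            # sum of c * ybar
    variance = mse * sum(coefs^2 / n_cell))   # MSE * sum of c^2 / n
}

# Pairwise contrast between two cells of size 12 (placeholder means)
res <- lin_comb(means = c(180, 195), n_cell = c(12, 12),
                coefs = c(-1, 1), mse = 194.2)
res
```

For a main-effect contrast under the additive model, the same helper applies with marginal means and marginal sample sizes in place of the cell quantities.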

27 Balanced Two-Way ANOVA: Multiple Comparisons
Two-Way Multiple Comparisons in Practice
For the interaction model, you follow up on $\hat\mu_{jk} = \bar{y}_{\cdot jk}$:
- Bonferroni for any $f$ tests (independent or not)
- Tukey (Tukey-Kramer) for all pairwise comparisons
- Scheffé for all possible contrasts
For the additive model, you follow up on $\hat\mu_j = \bar{y}_{\cdot j \cdot}$ and $\hat\mu_k = \bar{y}_{\cdot\cdot k}$:
- Bonferroni for any $f$ tests (independent or not)
- Tukey (Tukey-Kramer) for all pairwise comparisons
- Scheffé for all possible contrasts
For the additive model, Tukey and Scheffé control the FWER for each main effect family separately.
- Use Bonferroni in combination with Tukey/Scheffé to control the FWER for both families simultaneously

28 Balanced Two-Way ANOVA: Hypertension Example (Part 2)
Hypertension Example: Interaction (by hand)
All $ab(ab - 1)/2 = 15$ possible pairwise comparisons of $\hat\mu_{jk}$:
$\hat{L} = \bar{y}_{\cdot jk} - \bar{y}_{\cdot j'k'}$ and $\hat{V}(\hat{L}) = 194.2\,(2/12) \approx 32.37$
and we know that $\sqrt{2}\,\hat{L} / \sqrt{\hat{V}(\hat{L})} \sim q_{ab,\, abn - ab}$, so the $100(1 - \alpha)\%$ CI is given by
$\hat{L} \pm \frac{1}{\sqrt{2}}\, q^{(\alpha)}_{ab,\, abn - ab} \sqrt{\hat{V}(\hat{L})}$
where $q^{(\alpha)}_{ab,\, abn - ab}$ is the critical value from the studentized range distribution.
For example, the 95% CI for $\mu_{21} - \mu_{11}$ is given by
$(\hat\mu_{21} - \hat\mu_{11}) \pm \frac{1}{\sqrt{2}}\, q^{(.05)}_{6,\, 66} \sqrt{\hat{V}(\hat{L})}$
[The numerical evaluation of this interval is not preserved in this transcription.]
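The critical value and half-width can be obtained with base R's `qtukey`. This is a rough sketch assuming MSE = 194.2 and 12 subjects per cell as stated above; the point estimate `L_hat` is a placeholder, since the cell means are not available here.

```r
# Design constants and MSE from the hypertension example
a <- 3; b <- 2; n <- 12
mse <- 194.2

# Estimated variance of a difference between two cell means
V_L <- mse * (2 / n)

# Upper 5% point of the studentized range with ab means and abn - ab df
q_crit <- qtukey(0.95, nmeans = a * b, df = a * b * n - a * b)

# Half-width of the Tukey interval: q / sqrt(2) * sqrt(V)
half_width <- q_crit / sqrt(2) * sqrt(V_L)

# Interval around a placeholder point estimate (hypothetical value)
L_hat <- 10
ci <- c(lower = L_hat - half_width, upper = L_hat + half_width)
ci
```

Every one of the 15 pairwise intervals uses the same half-width in the balanced case, so only the point estimates change from one comparison to the next.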

29 Balanced Two-Way ANOVA: Hypertension Example (Part 2)
Hypertension Example: Interaction (in R)
All $ab(ab - 1)/2 = 15$ possible pairwise comparisons of $\hat\mu_{jk}$:
> mymod = aov(bp ~ drug * diet, data=hyper)
> TukeyHSD(mymod, "drug:diet")
  Tukey multiple comparisons of means
    95% family-wise confidence level
Fit: aov(formula = bp ~ drug * diet, data = hyper)
$`drug:diet`
[Columns diff, lwr, upr, p adj are not preserved in this transcription; the rows are:]
Y:no-X:no, Z:no-X:no, X:yes-X:no, Y:yes-X:no, Z:yes-X:no,
Z:no-Y:no, X:yes-Y:no, Y:yes-Y:no, Z:yes-Y:no,
X:yes-Z:no, Y:yes-Z:no, Z:yes-Z:no,
Y:yes-X:yes, Z:yes-X:yes, Z:yes-Y:yes

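30 Balanced Two-Way ANOVA: Hypertension Example (Part 2)
Hypertension Example: Additive (by hand A part 1)
All $a(a - 1)/2 = 3$ possible pairwise comparisons of $\hat\mu_j$:
- Y − X: $\hat{L}_{a1} = \bar{y}_{\cdot 2 \cdot} - \bar{y}_{\cdot 1 \cdot}$
- Z − X: $\hat{L}_{a2} = \bar{y}_{\cdot 3 \cdot} - \bar{y}_{\cdot 1 \cdot}$
- Z − Y: $\hat{L}_{a3} = \bar{y}_{\cdot 3 \cdot} - \bar{y}_{\cdot 2 \cdot} = 2.5$
and the variance is given by
$\hat{V}(\hat{L}_{aj}) = \hat\sigma^2 \sum_{j=1}^{a} c_j^2 / n_{j \cdot} = \hat\sigma^2 (2/24)$
where $\hat\sigma^2 = \frac{SSE + SSAB}{abn - (a + b - 1)}$, with $abn - (a + b - 1) = 68$.
[The remaining numerical values are not preserved in this transcription.]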

31 Balanced Two-Way ANOVA: Hypertension Example (Part 2)
Hypertension Example: Additive (by hand A part 2)
Note that $\sqrt{2}\,\hat{L}_{aj} / \sqrt{\hat{V}(\hat{L}_{aj})} \sim q_{a,\, abn - (a + b - 1)}$, so the $100(1 - \alpha)\%$ CI is given by
$\hat{L}_{aj} \pm \frac{1}{\sqrt{2}}\, q^{(\alpha)}_{a,\, abn - (a + b - 1)} \sqrt{\hat{V}(\hat{L}_{aj})}$
where $q^{(\alpha)}_{a,\, abn - (a + b - 1)}$ is the critical value from the studentized range distribution.
The 95% CIs for all three pairwise comparisons are given by
$\hat{L}_{aj} \pm \frac{1}{\sqrt{2}}\, q^{(.05)}_{3,\, 68} \sqrt{\hat{V}(\hat{L}_{aj})}$ for $j = 1, 2, 3$, where $\hat{L}_{a3} = 2.5$
[The interval endpoints are not preserved in this transcription.]

32 Balanced Two-Way ANOVA: Hypertension Example (Part 2)
Hypertension Example: Additive (by hand B part 1)
All $b(b - 1)/2 = 1$ possible pairwise comparison of $\hat\mu_k$:
- yes − no: $\hat{L}_b = \bar{y}_{\cdot\cdot 2} - \bar{y}_{\cdot\cdot 1}$, with $|\hat{L}_b| = 17$
and the variance is given by
$\hat{V}(\hat{L}_b) = \hat\sigma^2 \sum_{k=1}^{b} c_k^2 / n_{\cdot k} = \hat\sigma^2 (2/36)$
where $\hat\sigma^2 = \frac{SSE + SSAB}{abn - (a + b - 1)}$, with $abn - (a + b - 1) = 68$.
[The remaining numerical values are not preserved in this transcription.]

33 Balanced Two-Way ANOVA: Hypertension Example (Part 2)
Hypertension Example: Additive (by hand B part 2)
Note that $\sqrt{2}\,\hat{L}_b / \sqrt{\hat{V}(\hat{L}_b)} \sim q_{b,\, abn - (a + b - 1)}$, so the $100(1 - \alpha)\%$ CI is given by
$\hat{L}_b \pm \frac{1}{\sqrt{2}}\, q^{(\alpha)}_{b,\, abn - (a + b - 1)} \sqrt{\hat{V}(\hat{L}_b)}$
where $q^{(\alpha)}_{b,\, abn - (a + b - 1)}$ is the critical value from the studentized range distribution.
The 95% CI for the pairwise comparison is given by
$\hat{L}_b \pm \frac{1}{\sqrt{2}}\, q^{(.05)}_{2,\, 68} \sqrt{\hat{V}(\hat{L}_b)}$
[The interval endpoints are not preserved in this transcription.]

34 Balanced Two-Way ANOVA: Hypertension Example (Part 2)
Hypertension Example: Additive (in R)
All $a(a - 1)/2 = 3$ possible pairwise comparisons of $\hat\mu_j$:
> mymod = aov(bp ~ drug + diet, data=hyper)
> TukeyHSD(mymod, "drug")
  Tukey multiple comparisons of means
    95% family-wise confidence level
Fit: aov(formula = bp ~ drug + diet, data = hyper)
$drug
[Rows Y-X, Z-X, Z-Y; columns diff, lwr, upr, p adj are not preserved in this transcription]
All $b(b - 1)/2 = 1$ possible pairwise comparison of $\hat\mu_k$:
> mymod = aov(bp ~ drug + diet, data=hyper)
> TukeyHSD(mymod, "diet")
  Tukey multiple comparisons of means
    95% family-wise confidence level
Fit: aov(formula = bp ~ drug + diet, data = hyper)
$diet
[Row yes-no; columns diff, lwr, upr are not preserved; the adjusted p-value is of order e-06]

35 Balanced Three-Way ANOVA

36 Balanced Three-Way ANOVA: Model Form and Estimation
Three-Way ANOVA Model (cell means form)
The three-way Analysis of Variance (ANOVA) model has the form
$y_{ijkl} = \mu_{jkl} + e_{ijkl}$
for $i \in \{1, \ldots, n_{jkl}\}$, $j \in \{1, \ldots, a\}$, $k \in \{1, \ldots, b\}$, $l \in \{1, \ldots, c\}$, where
- $y_{ijkl} \in \mathbb{R}$ is the response for the $i$-th subject in factor cell $(j, k, l)$
- $\mu_{jkl} \in \mathbb{R}$ is the population mean for factor cell $(j, k, l)$
- $e_{ijkl} \overset{iid}{\sim} N(0, \sigma^2)$ is a Gaussian error term
- $n_{jkl}$ is the number of subjects in cell $(j, k, l)$ (note: $n_{jkl} = n \ \forall j, k, l$ in balanced three-way ANOVA)
- $(a, b, c)$ are the numbers of factor levels for Factors (A, B, C)
Implies that $y_{ijkl} \overset{ind}{\sim} N(\mu_{jkl}, \sigma^2)$.

37 Balanced Three-Way ANOVA: Model Form and Estimation
OLS Estimation (cell means form)
Similar to balanced two-way ANOVA, we want to minimize
$\sum_{l=1}^{c} \sum_{k=1}^{b} \sum_{j=1}^{a} \sum_{i=1}^{n_{jkl}} (y_{ijkl} - \mu_{jkl})^2$
which is equivalent to minimizing $\sum_{i=1}^{n_{jkl}} (y_{ijkl} - \mu_{jkl})^2$ for all $j, k, l$.
Taking the derivative of $SSE_{jkl} = \sum_{i=1}^{n_{jkl}} (y_{ijkl} - \mu_{jkl})^2$, we see that
$\frac{d\, SSE_{jkl}}{d\, \mu_{jkl}} = -2 \sum_{i=1}^{n_{jkl}} y_{ijkl} + 2 n_{jkl} \mu_{jkl}$
and setting this to zero and solving gives
$\hat\mu_{jkl} = \frac{1}{n_{jkl}} \sum_{i=1}^{n_{jkl}} y_{ijkl} = \bar{y}_{\cdot jkl}$
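The result that the least-squares fit of the full model reproduces the cell means can be verified directly. This sketch uses simulated data (hypothetical values, seed, and factor names) and compares `ave`-based cell means with the fitted values from the full three-way model.

```r
# Simulated three-way layout with 3 observations per cell (hypothetical data)
set.seed(3)
dat <- expand.grid(i = 1:3, A = factor(1:3), B = factor(1:2), C = factor(1:2))
dat$y <- rnorm(nrow(dat), mean = 180, sd = 10)

# Cell mean for each observation's (A, B, C) cell; ave() averages by default
cell_means <- ave(dat$y, dat$A, dat$B, dat$C)

# Full model with all interactions has one free parameter per cell
fit <- lm(y ~ A * B * C, data = dat)
all.equal(unname(fitted(fit)), cell_means)
```

The equality does not depend on the coding of the factors or on balance, since the full model always has as many mean parameters as there are non-empty cells.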

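38 Balanced Three-Way ANOVA: Model Form and Estimation
Three-Way ANOVA Model (effect coding)
The three-way ANOVA with all interactions assumes that
$\mu_{jkl} = \mu + \alpha_j + \beta_k + \gamma_l + (\alpha\beta)_{jk} + (\alpha\gamma)_{jl} + (\beta\gamma)_{kl} + (\alpha\beta\gamma)_{jkl}$
for $j \in \{1, \ldots, a\}$, $k \in \{1, \ldots, b\}$, and $l \in \{1, \ldots, c\}$, where
- $\mu$ is the overall population mean
- $\alpha_j$ is the main effect of the first factor, such that $\sum_{j=1}^{a} \alpha_j = 0$
- $\beta_k$ is the main effect of the second factor, such that $\sum_{k=1}^{b} \beta_k = 0$
- $\gamma_l$ is the main effect of the third factor, such that $\sum_{l=1}^{c} \gamma_l = 0$
- $(\alpha\beta)_{jk}$ is the A×B interaction effect, such that $\sum_{j=1}^{a} (\alpha\beta)_{jk} = 0 \ \forall k$ and $\sum_{k=1}^{b} (\alpha\beta)_{jk} = 0 \ \forall j$
- $(\alpha\gamma)_{jl}$ is the A×C interaction effect, such that $\sum_{j=1}^{a} (\alpha\gamma)_{jl} = 0 \ \forall l$ and $\sum_{l=1}^{c} (\alpha\gamma)_{jl} = 0 \ \forall j$
- $(\beta\gamma)_{kl}$ is the B×C interaction effect, such that $\sum_{k=1}^{b} (\beta\gamma)_{kl} = 0 \ \forall l$ and $\sum_{l=1}^{c} (\beta\gamma)_{kl} = 0 \ \forall k$
- $(\alpha\beta\gamma)_{jkl}$ is the A×B×C interaction effect, such that $\sum_{j=1}^{a} (\alpha\beta\gamma)_{jkl} = 0 \ \forall k, l$ and $\sum_{k=1}^{b} (\alpha\beta\gamma)_{jkl} = 0 \ \forall j, l$ and $\sum_{l=1}^{c} (\alpha\beta\gamma)_{jkl} = 0 \ \forall j, k$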

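39 Balanced Three-Way ANOVA: Model Form and Estimation
OLS Estimation (effect coding)
The OLS estimates of the various effects are given by
$\hat\mu = \bar{y}_{\cdot\cdot\cdot\cdot}$
$\hat\alpha_j = \bar{y}_{\cdot j \cdot\cdot} - \bar{y}_{\cdot\cdot\cdot\cdot}$
$\hat\beta_k = \bar{y}_{\cdot\cdot k \cdot} - \bar{y}_{\cdot\cdot\cdot\cdot}$
$\hat\gamma_l = \bar{y}_{\cdot\cdot\cdot l} - \bar{y}_{\cdot\cdot\cdot\cdot}$
$\widehat{(\alpha\beta)}_{jk} = \bar{y}_{\cdot jk \cdot} - \bar{y}_{\cdot j \cdot\cdot} - \bar{y}_{\cdot\cdot k \cdot} + \bar{y}_{\cdot\cdot\cdot\cdot}$
$\widehat{(\alpha\gamma)}_{jl} = \bar{y}_{\cdot j \cdot l} - \bar{y}_{\cdot j \cdot\cdot} - \bar{y}_{\cdot\cdot\cdot l} + \bar{y}_{\cdot\cdot\cdot\cdot}$
$\widehat{(\beta\gamma)}_{kl} = \bar{y}_{\cdot\cdot kl} - \bar{y}_{\cdot\cdot k \cdot} - \bar{y}_{\cdot\cdot\cdot l} + \bar{y}_{\cdot\cdot\cdot\cdot}$
$\widehat{(\alpha\beta\gamma)}_{jkl} = \bar{y}_{\cdot jkl} - [\hat\mu + \hat\alpha_j + \hat\beta_k + \hat\gamma_l + \widehat{(\alpha\beta)}_{jk} + \widehat{(\alpha\gamma)}_{jl} + \widehat{(\beta\gamma)}_{kl}]$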

40 Balanced Three-Way ANOVA: Model Form and Estimation
Fitted Values and Residuals
Form of the fitted values depends on the fitted model:
- Additive: $\hat\mu_{jkl} = \hat\mu + \hat\alpha_j + \hat\beta_k + \hat\gamma_l$
- All 2-way Int: $\hat\mu_{jkl} = \hat\mu + \hat\alpha_j + \hat\beta_k + \hat\gamma_l + \widehat{(\alpha\beta)}_{jk} + \widehat{(\alpha\gamma)}_{jl} + \widehat{(\beta\gamma)}_{kl}$
- 3-way Int: $\hat\mu_{jkl} = \bar{y}_{\cdot jkl}$
Residuals have the form
$\hat{e}_{ijkl} = y_{ijkl} - \hat\mu_{jkl}$
where the form of $\hat\mu_{jkl}$ depends on the fitted model (additive versus interaction).

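41 Balanced Three-Way ANOVA: Basic Inference
ANOVA Sums-of-Squares
Defining $N = abcn$, the three-way ANOVA sums-of-squares are:
- $SST = \sum_{l=1}^{c} \sum_{k=1}^{b} \sum_{j=1}^{a} \sum_{i=1}^{n} (y_{ijkl} - \bar{y}_{\cdot\cdot\cdot\cdot})^2$, $df = N - 1$
- $SSR = n \sum_{l=1}^{c} \sum_{k=1}^{b} \sum_{j=1}^{a} (\bar{y}_{\cdot jkl} - \bar{y}_{\cdot\cdot\cdot\cdot})^2$, $df = abc - 1$
- $SSE = \sum_{l=1}^{c} \sum_{k=1}^{b} \sum_{j=1}^{a} \sum_{i=1}^{n} (y_{ijkl} - \bar{y}_{\cdot jkl})^2$, $df = N - abc$
with $SSR = SS_A + SS_B + SS_C + SS_{AB} + SS_{AC} + SS_{BC} + SS_{ABC}$, where
- $SS_A = bcn \sum_{j=1}^{a} \hat\alpha_j^2$, $df = a - 1$
- $SS_B = acn \sum_{k=1}^{b} \hat\beta_k^2$, $df = b - 1$
- $SS_C = abn \sum_{l=1}^{c} \hat\gamma_l^2$, $df = c - 1$
- $SS_{AB} = cn \sum_{k=1}^{b} \sum_{j=1}^{a} \widehat{(\alpha\beta)}_{jk}^2$, $df = (a - 1)(b - 1)$
- $SS_{AC} = bn \sum_{l=1}^{c} \sum_{j=1}^{a} \widehat{(\alpha\gamma)}_{jl}^2$, $df = (a - 1)(c - 1)$
- $SS_{BC} = an \sum_{l=1}^{c} \sum_{k=1}^{b} \widehat{(\beta\gamma)}_{kl}^2$, $df = (b - 1)(c - 1)$
- $SS_{ABC} = n \sum_{l=1}^{c} \sum_{k=1}^{b} \sum_{j=1}^{a} \widehat{(\alpha\beta\gamma)}_{jkl}^2$, $df = (a - 1)(b - 1)(c - 1)$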

42 Balanced Three-Way ANOVA: Basic Inference
Hypertension Example: Data Description (revisited)
Hypertension example from Maxwell & Delany (2003).
Total of $N = 72$ subjects participate in the hypertension experiment.
- Factor A: drug type ($a = 3$ levels: X, Y, Z)
- Factor B: diet type ($b = 2$ levels: yes, no)
- Factor C: biof type ($c = 2$ levels: present, absent)
Randomly assign $n_{jkl} = 6$ subjects to each treatment cell:
- Note there are $abc = (3)(2)(2) = 12$ treatment cells
- Observations are independent within and between cells

43 Balanced Three-Way ANOVA: Basic Inference
Hypertension Example: Look at Data
> bp = scan("/users/nate/desktop/hypertension.dat")
Read 72 items
> diet = factor(rep(rep(c("no","yes"),each=6),6))
> drug = factor(rep(rep(c("X","Y","Z"),each=12),2))
> biof = factor(rep(c("present","absent"),each=36))
> hyper = data.frame(bp=bp, diet=diet, drug=drug, biof=biof)
> # effect (sum-to-zero) coding for all three factors
> contrasts(hyper$drug) <- contr.sum(3)
> contrasts(hyper$diet) <- contr.sum(2)
> contrasts(hyper$biof) <- contr.sum(2)
> hyper[1:15,]
[Output: columns bp, diet, drug, biof. Rows 1-6 are (no, X, present), rows 7-12 are (yes, X, present), and rows 13-15 are (no, Y, present); the bp values are not preserved in this transcription.]

44 Balanced Three-Way ANOVA: Basic Inference
Hypertension Example: All Interactions
> mymod = lm(bp ~ drug * diet * biof, data=hyper)
> anova(mymod)
Analysis of Variance Table
Response: bp
[Columns Df, Sum Sq, Mean Sq, F value are not preserved in this transcription]
                 Pr(>F)
drug               e-05 ***
diet               e-07 ***
biof                    ***
drug:diet
drug:biof
diet:biof
drug:diet:biof          *
Residuals
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

45 Balanced Three-Way ANOVA: Basic Inference
Hypertension Example: All 2-Way Interactions
> mymod = lm(bp ~ drug * diet + drug * biof + diet * biof, data=hyper)
> anova(mymod)
Analysis of Variance Table
Response: bp
[Columns Df, Sum Sq, Mean Sq, F value are not preserved in this transcription]
            Pr(>F)
drug          e-05 ***
diet          e-07 ***
biof               ***
drug:diet
drug:biof
diet:biof
Residuals
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

46 Balanced Three-Way ANOVA: Basic Inference
Hypertension Example: Additive Model
> mymod = lm(bp ~ drug + diet + biof, data=hyper)
> anova(mymod)
Analysis of Variance Table
Response: bp
[Columns Df, Sum Sq, Mean Sq, F value are not preserved in this transcription]
            Pr(>F)
drug               ***
diet          e-07 ***
biof               **
Residuals
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

47 Balanced Three-Way ANOVA: Basic Inference
Hypertension Example: Interaction Plot
If you choose the three-way interaction model, you could visualize the interaction using an interaction plot.
[Figure: two panels, "Biofeedback Absent" and "Biofeedback Present"; x-axis Drug (X, Y, Z), y-axis Mean BP; separate lines for Diet No and Diet Yes.]

48 Balanced Three-Way ANOVA: Basic Inference
Hypertension Example: Interaction Plot (R code)
# cell means indexed by (drug, diet, biof)
yhat = tapply(hyper$bp, list(hyper$drug, hyper$diet, hyper$biof), mean)
par(mfrow=c(1,2))
mytitles = c("Biofeedback Absent", "Biofeedback Present")
# one panel per biofeedback level; solid line = diet no, dashed line = diet yes
for(k in 1:2){
  plot(1:3, yhat[,1,k], ylim=c(165,215), xlab="Drug",
       ylab="Mean BP", main=mytitles[k], axes=FALSE, type="l")
  lines(1:3, yhat[,2,k], lty=2)
  legend("topleft", c("Diet No","Diet Yes"), lty=1:2, bty="n")
  axis(1, at=1:3, labels=c("X","Y","Z"))
  axis(2)
}

49 Balanced Three-Way ANOVA: Basic Inference
Hypertension Example: Multiple Comparisons
Assuming we chose the additive model, we would perform follow-up tests on the marginal means.
- Factor A: $\hat\mu_{aj} = \hat\mu + \hat\alpha_j = \bar{y}_{\cdot j \cdot\cdot}$
- Factor B: $\hat\mu_{bk} = \hat\mu + \hat\beta_k = \bar{y}_{\cdot\cdot k \cdot}$
- Factor C: $\hat\mu_{cl} = \hat\mu + \hat\gamma_l = \bar{y}_{\cdot\cdot\cdot l}$
If we chose the three-way interaction model, we would perform follow-up tests on the individual cell means.
$\hat\mu_{jkl} = \hat\mu + \hat\alpha_j + \hat\beta_k + \hat\gamma_l + \widehat{(\alpha\beta)}_{jk} + \widehat{(\alpha\gamma)}_{jl} + \widehat{(\beta\gamma)}_{kl} + \widehat{(\alpha\beta\gamma)}_{jkl} = \bar{y}_{\cdot jkl}$

50 Balanced Three-Way ANOVA: Basic Inference
Hypertension Example: Multiple Comparisons
> mymod = aov(bp ~ drug + diet + biof, data=hyper)
> TukeyHSD(mymod, "drug")   # I deleted some output
  Tukey multiple comparisons of means
    95% family-wise confidence level
$drug
[Rows Y-X, Z-X, Z-Y; columns diff, lwr, upr, p adj are not preserved in this transcription]
> TukeyHSD(mymod, "diet")   # I deleted some output
  Tukey multiple comparisons of means
    95% family-wise confidence level
$diet
[Row yes-no; columns diff, lwr, upr are not preserved; the adjusted p-value is of order e-07]
> TukeyHSD(mymod, "biof")   # I deleted some output
  Tukey multiple comparisons of means
    95% family-wise confidence level
$biof
[Row present-absent; columns diff, lwr, upr, p adj are not preserved in this transcription]

51 Unbalanced ANOVA Models

52 Unbalanced ANOVA Models: Overview of Problem
Unbalanced ANOVA: Model Form
Unbalanced ANOVA has the same model form as balanced ANOVA, but unequal sample sizes across cells.
- 1-way: $n_j \neq n_{j'}$ for some $j, j'$
- 2-way: $n_{jk} \neq n_{j'k'}$ for some $(j, k), (j', k')$
- 3-way: $n_{jkl} \neq n_{j'k'l'}$ for some $(j, k, l), (j', k', l')$
In the previous slides, we assumed $n_{jk} = n \ \forall j, k$ (two-way ANOVA) or $n_{jkl} = n \ \forall j, k, l$ (three-way ANOVA), which made life easy.
- Effects were orthogonal in the balanced design
- Parameter estimates had a simple relation to cell/marginal means

53 Unbalanced ANOVA Models: Overview of Problem
Unbalanced ANOVA: Implications
Main consequences for a two-way (and higher-way) unbalanced design:
- Non-orthogonal SS (e.g., $SSR \neq SSA + SSB + SSAB$)
- Design is less efficient (larger variances of parameter estimates)
An unbalanced design also affects our estimation and follow-up tests:
- Parameter estimates require matrix inversion: $\hat{b} = (X'X)^{-1} X'y$
- Need to do follow-up tests on least-squares means
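For concreteness, here is a minimal sketch of the matrix solution on a small unbalanced design. The data are simulated and the rows removed to create imbalance are arbitrary; all names are illustrative. It solves the normal equations with `solve` and `crossprod` and compares the result with `lm`.

```r
# Simulated two-way data, then unbalanced by dropping three rows (hypothetical)
set.seed(4)
dat <- expand.grid(i = 1:4, A = factor(1:3), B = factor(1:2))
dat$y <- rnorm(nrow(dat), mean = as.integer(dat$A) + as.integer(dat$B))
dat <- dat[-c(1, 2, 9), ]
table(dat$A, dat$B)   # unequal cell sizes

# Effect-coded design matrix
contrasts(dat$A) <- contr.sum(3)
contrasts(dat$B) <- contr.sum(2)
X <- model.matrix(~ A * B, data = dat)

# Solve the normal equations (X'X) b = X'y
b_hat <- solve(crossprod(X), crossprod(X, dat$y))

fit <- lm(y ~ A * B, data = dat)
all.equal(unname(drop(b_hat)), unname(coef(fit)))
```

In practice `lm` uses a QR decomposition rather than forming the inverse of X'X, but the two approaches give the same estimates for a full-rank design like this one.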

54 Unbalanced ANOVA Models: Overview of Problem
Unbalanced ANOVA: Testing Effects
Because of non-orthogonality, we cannot test effects using $F = MS_{?} / MSE$, where $MS_{?}$ is the mean square for the effect of interest.
Instead we use the General Linear Model (GLM) F test statistic:
$F = \frac{(SSE_R - SSE_F) / (df_R - df_F)}{SSE_F / df_F} \sim \mathcal{F}_{(df_R - df_F,\, df_F)}$
where
- $SSE_R$ is the sum-of-squares error for the reduced model
- $SSE_F$ is the sum-of-squares error for the full model
- $df_R$ is the error degrees of freedom for the reduced model
- $df_F$ is the error degrees of freedom for the full model
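The GLM F statistic is simple to compute from two nested fits. The sketch below uses simulated unbalanced data (hypothetical values and names), computes the statistic from the formula, and compares it with the model-comparison form of `anova`.

```r
# Simulated unbalanced two-way data (hypothetical)
set.seed(5)
dat <- expand.grid(i = 1:4, A = factor(1:3), B = factor(1:2))
dat$y <- rnorm(nrow(dat), mean = 2 * as.integer(dat$B))
dat <- dat[-c(1, 2, 9), ]

# Full model includes A; reduced model omits it
full    <- lm(y ~ A + B, data = dat)
reduced <- lm(y ~ B, data = dat)

# Error sums-of-squares and error degrees of freedom
sse_f <- sum(resid(full)^2);    df_f <- df.residual(full)
sse_r <- sum(resid(reduced)^2); df_r <- df.residual(reduced)

# GLM F statistic and its p-value
F_stat <- ((sse_r - sse_f) / (df_r - df_f)) / (sse_f / df_f)
p_val  <- pf(F_stat, df_r - df_f, df_f, lower.tail = FALSE)

# anova() on two nested fits performs the same comparison
cmp <- anova(reduced, full)
c(by_hand = F_stat, anova = cmp$F[2])
```

The choice of which pair of models counts as "full" and "reduced" for a given effect is exactly what distinguishes the types of sums-of-squares.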

55 Unbalanced ANOVA Models: Overview of Problem
Unbalanced ANOVA: Testing Example
Consider the two-way ANOVA and all 7 possible models:
(1) $y_{ijk} = \mu + \alpha_j + \beta_k + (\alpha\beta)_{jk} + e_{ijk}$
(2) $y_{ijk} = \mu + \alpha_j + \beta_k + e_{ijk}$
(3) $y_{ijk} = \mu + \alpha_j + (\alpha\beta)_{jk} + e_{ijk}$
(4) $y_{ijk} = \mu + \beta_k + (\alpha\beta)_{jk} + e_{ijk}$
(5) $y_{ijk} = \mu + \alpha_j + e_{ijk}$
(6) $y_{ijk} = \mu + \beta_k + e_{ijk}$
(7) $y_{ijk} = \mu + e_{ijk}$

56 Unbalanced ANOVA Models: Overview of Problem
Unbalanced ANOVA: Testing Example (continued)
To test an effect, use an F test comparing full (F) and reduced (R) models.
To test each effect, there are multiple choices we could use for the full and reduced models:
- A: F=1 and R=4, or F=2 and R=6, or F=5 and R=7
- B: F=1 and R=3, or F=2 and R=5, or F=6 and R=7
- AB: F=1 and R=2, or F=3 and R=5, or F=4 and R=6

57 Unbalanced ANOVA Models: Types of Sums-of-Squares
Types of Sums-of-Squares
Type I SS: Amount of additional variation explained by the model when a term is added to the model (aka sequential sum-of-squares). In two-way ANOVA, type I SS would compare:
(a) Main Effect A: F=5 and R=7
(b) Main Effect B: F=2 and R=5
(c) Interaction Effect: F=1 and R=2
Type II SS: Amount of variation a term adds to the model when all other terms are included except terms that contain the effect being tested (e.g., $(\alpha\beta)_{jk}$ contains $\alpha_j$ and $\beta_k$). In two-way ANOVA, type II SS would compare:
(a) Main Effect A: F=2 and R=6
(b) Main Effect B: F=2 and R=5
(c) Interaction Effect: F=1 and R=2
Type III SS: Amount of variation a term adds to the model when all other terms are included, which is sometimes called partial sum-of-squares. In two-way ANOVA, type III SS would compare:
(a) Main Effect A: F=1 and R=4
(b) Main Effect B: F=1 and R=3
(c) Interaction Effect: F=1 and R=2
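The three definitions for Main Effect A can be reproduced as differences in error sums-of-squares between the model pairs listed above. This is a sketch on simulated unbalanced data using only base R; the helper names `sse` and `rss` are hypothetical, and the Type III comparison drops the effect-coded A columns from the design matrix, which assumes sum-to-zero contrasts.

```r
# Simulated unbalanced two-way data (hypothetical)
set.seed(6)
dat <- expand.grid(i = 1:4, A = factor(1:3), B = factor(1:2))
dat$y <- rnorm(nrow(dat), mean = as.integer(dat$A) + 2 * as.integer(dat$B))
dat <- dat[-c(1, 2, 9), ]
contrasts(dat$A) <- contr.sum(3)
contrasts(dat$B) <- contr.sum(2)

# Error sum-of-squares of a model given by a formula
sse <- function(formula) sum(resid(lm(formula, data = dat))^2)

# Type I (sequential, order A, B, A:B): models 7 vs 5, 5 vs 2, 2 vs 1
typeI <- c(A  = sse(y ~ 1) - sse(y ~ A),
           B  = sse(y ~ A) - sse(y ~ A + B),
           AB = sse(y ~ A + B) - sse(y ~ A * B))

# Type II for A: model 6 vs model 2
typeII_A <- sse(y ~ B) - sse(y ~ A + B)

# Type III for A: model 4 vs model 1, built by dropping the A columns
X   <- model.matrix(~ A * B, data = dat)
rss <- function(M) sum(lm.fit(M, dat$y)$residuals^2)
typeIII_A <- rss(X[, !colnames(X) %in% c("A1", "A2")]) - rss(X)

seq_tab <- anova(lm(y ~ A * B, data = dat))
rbind(by_hand = typeI, anova = seq_tab[["Sum Sq"]][1:3])
c(typeI_A = typeI[["A"]], typeII_A = typeII_A, typeIII_A = typeIII_A)
```

With equal cell sizes the three values for A coincide; in this unbalanced sample they generally do not.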

58 Unbalanced ANOVA Models Types of Sums-of-Squares Types of Sums-of-Squares (in R)

When fitting multi-way ANOVAs, the anova function gives Type I SS.
Order matters in an unbalanced design!
bp ~ drug + diet produces different Type I SS tests than bp ~ diet + drug if the design is unbalanced.

Use the Anova function in the car package for Type II and Type III SS.
The function performs Type II SS tests by default.
Use the type=3 option for Type III SS tests.
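The order dependence of Type I SS can be seen in a minimal base-R sketch. The data below are simulated and purely hypothetical (they are not the hypertension data), and the cell sizes in `n_per_cell` are an arbitrary unbalanced choice.

```r
# Simulated unbalanced two-way layout (hypothetical data, not the hypertension set)
set.seed(1)
cells <- expand.grid(drug = c("x", "y", "z"), diet = c("no", "yes"))
n_per_cell <- c(3, 5, 2, 6, 4, 7)            # unequal cell sizes => unbalanced
dat <- cells[rep(seq_len(nrow(cells)), times = n_per_cell), ]
dat$drug <- factor(dat$drug)
dat$diet <- factor(dat$diet)
dat$bp <- 170 + 5 * as.integer(dat$drug) + 8 * as.integer(dat$diet) +
  rnorm(nrow(dat), sd = 10)

# Type I (sequential) SS: each term is adjusted only for terms listed before it
ss_drug_first <- anova(lm(bp ~ drug + diet, data = dat))
ss_diet_first <- anova(lm(bp ~ diet + drug, data = dat))

# Type II SS for drug as an explicit comparison: full = model (2), reduced = model (6)
type2_drug <- anova(lm(bp ~ diet, data = dat), lm(bp ~ drug + diet, data = dat))

print(ss_drug_first["drug", "Sum Sq"])   # drug entered first: model (5) vs (7)
print(ss_diet_first["drug", "Sum Sq"])   # drug entered last:  model (2) vs (6)
print(type2_drug[2, "Sum of Sq"])        # same comparison as the line above
```

The first two printed values differ because the design is unbalanced, while the last two agree because both correspond to F=2 and R=6. One caveat, offered as general R knowledge rather than something stated on the slide: Type III tests from car's Anova depend on the contrast coding, and sum-to-zero contrasts such as contr.sum are generally needed for the main-effect tests to be interpretable.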

59 Unbalanced ANOVA Models Hypertension Example (Part 4) Hypertension Example: Type I

> mymod = lm(bp ~ drug * diet * biof, data=hyper[1:71,])
> anova(mymod)
Analysis of Variance Table

Response: bp
Df Sum Sq Mean Sq F value Pr(>F)
drug e-05 ***
diet e-07 ***
biof ***
drug:diet
drug:biof
diet:biof
drug:diet:biof *
Residuals
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

60 Unbalanced ANOVA Models Hypertension Example (Part 4) Hypertension Example: Type II

> library(car)
> Anova(mymod, type=2)
Anova Table (Type II tests)

Response: bp
Sum Sq Df F value Pr(>F)
drug e-05 ***
diet e-07 ***
biof ***
drug:diet
drug:biof
diet:biof
drug:diet:biof *
Residuals
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

61 Unbalanced ANOVA Models Hypertension Example (Part 4) Hypertension Example: Type III

> library(car)
> Anova(mymod, type=3)
Anova Table (Type III tests)

Response: bp
Sum Sq Df F value Pr(>F)
(Intercept) < 2.2e-16 ***
drug e-05 ***
diet e-07 ***
biof ***
drug:diet
drug:biof
diet:biof
drug:diet:biof *
Residuals
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

62 Appendix

63 Appendix Ordinary Least-Squares (proof for µ)

Expanding the first summation produces

$$
SSE = \sum_{k=1}^{b}\sum_{j=1}^{a}\left[\sum_{i=1}^{n} y_{ijk}^{2} - 2\big(\mu + \alpha_j + \beta_k + (\alpha\beta)_{jk}\big)\sum_{i=1}^{n} y_{ijk} + n\big(\mu + \alpha_j + \beta_k + (\alpha\beta)_{jk}\big)^{2}\right]
$$

Taking the derivative with respect to $\mu$ we have

$$
\frac{dSSE}{d\mu} = \sum_{k=1}^{b}\sum_{j=1}^{a}\left[-2\sum_{i=1}^{n} y_{ijk} + 2n\mu + 2n\big(\alpha_j + \beta_k + (\alpha\beta)_{jk}\big)\right]
= -2\left(\sum_{k=1}^{b}\sum_{j=1}^{a}\sum_{i=1}^{n} y_{ijk}\right) + 2abn\mu
$$

where the effect terms drop out because of the sum-to-zero constraints. Setting to zero and solving for $\mu$ gives

$$
\hat{\mu} = \frac{1}{abn}\sum_{k=1}^{b}\sum_{j=1}^{a}\sum_{i=1}^{n} y_{ijk} = \bar{y}_{\cdot\cdot\cdot}
$$

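64 Appendix Ordinary Least-Squares (proof for α_j)

Taking the derivative with respect to $\alpha_j$ we have

$$
\frac{dSSE}{d\alpha_j} = \sum_{k=1}^{b}\left[-2\sum_{i=1}^{n} y_{ijk} + 2n\alpha_j + 2n\big(\mu + \beta_k + (\alpha\beta)_{jk}\big)\right]
= -2\left(\sum_{k=1}^{b}\sum_{i=1}^{n} y_{ijk}\right) + 2bn\alpha_j + 2bn\mu
$$

where the second equality uses $\sum_{k=1}^{b}\beta_k = 0$ and $\sum_{k=1}^{b}(\alpha\beta)_{jk} = 0$. Setting to zero, using $\hat{\mu}$ for $\mu$, and solving for $\alpha_j$ gives

$$
\hat{\alpha}_j = \frac{1}{bn}\left(\sum_{k=1}^{b}\sum_{i=1}^{n} y_{ijk}\right) - \hat{\mu} = \bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}
$$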

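65 Appendix Ordinary Least-Squares (proof for β_k)

Taking the derivative with respect to $\beta_k$ we have

$$
\frac{dSSE}{d\beta_k} = \sum_{j=1}^{a}\left[-2\sum_{i=1}^{n} y_{ijk} + 2n\beta_k + 2n\big(\mu + \alpha_j + (\alpha\beta)_{jk}\big)\right]
= -2\left(\sum_{j=1}^{a}\sum_{i=1}^{n} y_{ijk}\right) + 2an\beta_k + 2an\mu
$$

where the second equality uses $\sum_{j=1}^{a}\alpha_j = 0$ and $\sum_{j=1}^{a}(\alpha\beta)_{jk} = 0$. Setting to zero, using $\hat{\mu}$ for $\mu$, and solving for $\beta_k$ gives

$$
\hat{\beta}_k = \frac{1}{an}\left(\sum_{j=1}^{a}\sum_{i=1}^{n} y_{ijk}\right) - \hat{\mu} = \bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot}
$$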

66 Appendix Ordinary Least-Squares (proof for (αβ)_jk)

Taking the derivative with respect to $(\alpha\beta)_{jk}$ we have

$$
\frac{dSSE}{d(\alpha\beta)_{jk}} = -2\sum_{i=1}^{n} y_{ijk} + 2n(\alpha\beta)_{jk} + 2n\big(\mu + \alpha_j + \beta_k\big)
$$

and setting to zero, using $(\hat{\mu}, \hat{\alpha}_j, \hat{\beta}_k)$ for $(\mu, \alpha_j, \beta_k)$, and solving for $(\alpha\beta)_{jk}$ gives

$$
\widehat{(\alpha\beta)}_{jk} = \frac{1}{n}\left(\sum_{i=1}^{n} y_{ijk}\right) - \hat{\mu} - \hat{\alpha}_j - \hat{\beta}_k
= \bar{y}_{\cdot jk} - \bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot k} + \bar{y}_{\cdot\cdot\cdot}
$$
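The closed-form least-squares estimates for the balanced case can be checked numerically. The sketch below uses simulated, hypothetical data; the variable names (`alpha_hat`, `ab_hat`, and so on) are chosen here for illustration, and the comparison with `lm` assumes sum-to-zero contrasts (`contr.sum`) so that the fitted coefficients match the effect-coding parameters.

```r
# Balanced two-way layout with simulated responses (hypothetical data)
set.seed(2)
a <- 3; b <- 2; n <- 4
dat <- expand.grid(i = seq_len(n), j = seq_len(a), k = seq_len(b))
dat$y <- rnorm(nrow(dat), mean = 10 + dat$j - 2 * dat$k)

ybar    <- mean(dat$y)                                   # overall mean
ybar_j  <- as.vector(tapply(dat$y, dat$j, mean))         # first-factor level means
ybar_k  <- as.vector(tapply(dat$y, dat$k, mean))         # second-factor level means
ybar_jk <- unname(tapply(dat$y, list(dat$j, dat$k), mean))  # a x b cell means

# Closed-form estimates from the derivation
mu_hat    <- ybar
alpha_hat <- ybar_j - ybar
beta_hat  <- ybar_k - ybar
ab_hat    <- ybar_jk - outer(ybar_j, ybar_k, "+") + ybar

# Fitted cell means under effect coding reproduce the observed cell means
fitted_jk <- mu_hat + outer(alpha_hat, beta_hat, "+") + ab_hat

# Same estimates from lm() with sum-to-zero contrasts
dat$fj <- factor(dat$j)
dat$fk <- factor(dat$k)
fit <- lm(y ~ fj * fk, data = dat,
          contrasts = list(fj = "contr.sum", fk = "contr.sum"))
print(unname(coef(fit)[1:3]))            # intercept, alpha_1, alpha_2
print(c(mu_hat, alpha_hat[1:2]))         # closed-form counterparts
```

With `contr.sum`, the last level's effect is not reported directly; it is minus the sum of the others, consistent with the constraint that the effects sum to zero.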

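67 Appendix Partitioning the Variance (proof part 1)

To prove SSR = SSA + SSB + SSAB when $n_{jk} = n \;\forall j, k$, note that

$$
y_{ijk} - \bar{y}_{\cdot\cdot\cdot} = (y_{ijk} - \bar{y}_{\cdot jk}) + (\bar{y}_{\cdot jk} - [\bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot}]) + (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot})
$$

Now if we square both sides we have

$$
\begin{aligned}
(y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2 ={}& (y_{ijk} - \bar{y}_{\cdot jk})^2 + (\bar{y}_{\cdot jk} - [\bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot}])^2 + (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 + (\bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot})^2 \\
&+ 2(y_{ijk} - \bar{y}_{\cdot jk})\left\{(\bar{y}_{\cdot jk} - [\bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot}]) + (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot})\right\} \\
&+ 2(\bar{y}_{\cdot jk} - [\bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot}])\left[(\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot})\right] \\
&+ 2(\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})(\bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot})
\end{aligned}
$$

Now if we apply the triple summation to the left-hand side we have SST:

$$
SST = \sum_{k=1}^{b}\sum_{j=1}^{a}\sum_{i=1}^{n_{jk}} (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2
$$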

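68 Appendix Partitioning the Variance (proof part 2)

First, note that we have

$$
\begin{aligned}
SSE &= \sum_{k=1}^{b}\sum_{j=1}^{a}\sum_{i=1}^{n_{jk}} (y_{ijk} - \bar{y}_{\cdot jk})^2 \\
SSAB &= \sum_{k=1}^{b}\sum_{j=1}^{a}\sum_{i=1}^{n_{jk}} (\bar{y}_{\cdot jk} - [\bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot}])^2 \\
SSA &= \sum_{k=1}^{b}\sum_{j=1}^{a}\sum_{i=1}^{n_{jk}} (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 \\
SSB &= \sum_{k=1}^{b}\sum_{j=1}^{a}\sum_{i=1}^{n_{jk}} (\bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot})^2
\end{aligned}
$$

so we need to prove that the crossproduct terms sum to zero.

To prove that the first crossproduct term sums to zero, define

$$
(\alpha\beta)^{*}_{jk} = (\bar{y}_{\cdot jk} - [\bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot}]) + (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot}) + (\bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot})
$$

which does not depend on $i$, and note that

$$
\sum_{k=1}^{b}\sum_{j=1}^{a}\sum_{i=1}^{n_{jk}} 2(y_{ijk} - \bar{y}_{\cdot jk})(\alpha\beta)^{*}_{jk}
= 2\sum_{k=1}^{b}\sum_{j=1}^{a} (\alpha\beta)^{*}_{jk}\sum_{i=1}^{n_{jk}} (y_{ijk} - \bar{y}_{\cdot jk})
= 2\sum_{k=1}^{b}\sum_{j=1}^{a} (\alpha\beta)^{*}_{jk}\,(0) = 0
$$

because we are summing a mean-centered variable within each cell.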

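69 Appendix Partitioning the Variance (proof part 3)

To prove that the second crossproduct term sums to zero, note that $\widehat{(\alpha\beta)}_{jk} = (\bar{y}_{\cdot jk} - [\bar{y}_{\cdot j\cdot} + \bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot}])$, $\hat{\alpha}_j = (\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})$, and $\hat{\beta}_k = (\bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot})$, so

$$
\sum_{k=1}^{b}\sum_{j=1}^{a}\sum_{i=1}^{n_{jk}} 2\,\widehat{(\alpha\beta)}_{jk}\,(\hat{\alpha}_j + \hat{\beta}_k)
= 2\sum_{k=1}^{b}\sum_{j=1}^{a} n_{jk}\,\widehat{(\alpha\beta)}_{jk}\,(\hat{\alpha}_j + \hat{\beta}_k)
$$

Now assuming that $n_{jk} = n \;\forall j, k$:

$$
\begin{aligned}
\sum_{k=1}^{b}\sum_{j=1}^{a} n_{jk}\,\widehat{(\alpha\beta)}_{jk}\,\hat{\alpha}_j
&= n\sum_{j=1}^{a}\hat{\alpha}_j\left(\sum_{k=1}^{b}\widehat{(\alpha\beta)}_{jk}\right)
= n\sum_{j=1}^{a}\hat{\alpha}_j\,(0) = 0 \\
\sum_{k=1}^{b}\sum_{j=1}^{a} n_{jk}\,\widehat{(\alpha\beta)}_{jk}\,\hat{\beta}_k
&= n\sum_{k=1}^{b}\hat{\beta}_k\left(\sum_{j=1}^{a}\widehat{(\alpha\beta)}_{jk}\right)
= n\sum_{k=1}^{b}\hat{\beta}_k\,(0) = 0
\end{aligned}
$$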

70 Appendix Partitioning the Variance (proof part 4)

To prove that the third crossproduct term sums to zero, note that

$$
\sum_{k=1}^{b}\sum_{j=1}^{a}\sum_{i=1}^{n_{jk}} 2(\bar{y}_{\cdot j\cdot} - \bar{y}_{\cdot\cdot\cdot})(\bar{y}_{\cdot\cdot k} - \bar{y}_{\cdot\cdot\cdot})
= 2\sum_{k=1}^{b}\sum_{j=1}^{a} n_{jk}\,\hat{\alpha}_j\hat{\beta}_k
$$

and if $n_{jk} = n \;\forall j, k$ we have that

$$
2\sum_{k=1}^{b}\sum_{j=1}^{a} n_{jk}\,\hat{\alpha}_j\hat{\beta}_k
= 2n\sum_{k=1}^{b}\sum_{j=1}^{a}\hat{\alpha}_j\hat{\beta}_k
= 2n\sum_{k=1}^{b}\hat{\beta}_k\left(\sum_{j=1}^{a}\hat{\alpha}_j\right)
= 2n\sum_{k=1}^{b}\hat{\beta}_k\,(0) = 0
$$

which completes the proof.
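The role of the balance assumption in this proof can be illustrated numerically. The sketch below is a hypothetical check on simulated data: the helper `ss_parts` and the rows dropped to create imbalance are choices made here for illustration, and the level means in the unbalanced case are taken as plain averages over the available observations.

```r
set.seed(3)

# Sums-of-squares from the partition, computed from observation-level means
ss_parts <- function(dat) {
  ybar <- mean(dat$y)
  m_j  <- ave(dat$y, dat$j)            # first-factor level mean, per observation
  m_k  <- ave(dat$y, dat$k)            # second-factor level mean, per observation
  m_jk <- ave(dat$y, dat$j, dat$k)     # cell mean, per observation
  c(SST  = sum((dat$y - ybar)^2),
    SSE  = sum((dat$y - m_jk)^2),
    SSAB = sum((m_jk - m_j - m_k + ybar)^2),
    SSA  = sum((m_j - ybar)^2),
    SSB  = sum((m_k - ybar)^2))
}

# Balanced design: n = 4 observations in each of the 3 x 2 cells
balanced <- expand.grid(i = 1:4, j = 1:3, k = 1:2)
balanced$y <- rnorm(nrow(balanced), mean = 10 + balanced$j - 2 * balanced$k)

# Unbalanced design: drop a few rows so the cell sizes differ
unbalanced <- balanced[-c(1, 2, 3, 9, 14), ]

bal <- ss_parts(balanced)
unb <- ss_parts(unbalanced)

# Difference between SST and the sum of the four components
gap_bal <- unname(bal["SST"] - sum(bal[c("SSE", "SSAB", "SSA", "SSB")]))
gap_unb <- unname(unb["SST"] - sum(unb[c("SSE", "SSAB", "SSA", "SSB")]))
print(c(balanced = gap_bal, unbalanced = gap_unb))
```

In the balanced case the gap is zero up to rounding error, matching the proof. In the unbalanced case the second and third crossproduct terms generally do not vanish, so the components no longer add up to SST.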


More information

Residual Analysis for two-way ANOVA The twoway model with K replicates, including interaction,

Residual Analysis for two-way ANOVA The twoway model with K replicates, including interaction, Residual Analysis for two-way ANOVA The twoway model with K replicates, including interaction, is Y ijk = µ ij + ɛ ijk = µ + α i + β j + γ ij + ɛ ijk with i = 1,..., I, j = 1,..., J, k = 1,..., K. In carrying

More information

These are multifactor experiments that have

These are multifactor experiments that have Design of Engineering Experiments Nested Designs Text reference, Chapter 14, Pg. 525 These are multifactor experiments that have some important industrial applications Nested and split-plot designs frequently

More information

Multiple Pairwise Comparison Procedures in One-Way ANOVA with Fixed Effects Model

Multiple Pairwise Comparison Procedures in One-Way ANOVA with Fixed Effects Model Biostatistics 250 ANOVA Multiple Comparisons 1 ORIGIN 1 Multiple Pairwise Comparison Procedures in One-Way ANOVA with Fixed Effects Model When the omnibus F-Test for ANOVA rejects the null hypothesis that

More information

Topic 9: Factorial treatment structures. Introduction. Terminology. Example of a 2x2 factorial

Topic 9: Factorial treatment structures. Introduction. Terminology. Example of a 2x2 factorial Topic 9: Factorial treatment structures Introduction A common objective in research is to investigate the effect of each of a number of variables, or factors, on some response variable. In earlier times,

More information

STAT22200 Spring 2014 Chapter 5

STAT22200 Spring 2014 Chapter 5 STAT22200 Spring 2014 Chapter 5 Yibi Huang April 29, 2014 Chapter 5 Multiple Comparisons Chapter 5-1 Chapter 5 Multiple Comparisons Note the t-tests and C.I. s are constructed assuming we only do one test,

More information

Chapter 20 : Two factor studies one case per treatment Chapter 21: Randomized complete block designs

Chapter 20 : Two factor studies one case per treatment Chapter 21: Randomized complete block designs Chapter 20 : Two factor studies one case per treatment Chapter 21: Randomized complete block designs Adapted from Timothy Hanson Department of Statistics, University of South Carolina Stat 705: Data Analysis

More information

Straw Example: Randomized Block ANOVA

Straw Example: Randomized Block ANOVA Math 3080 1. Treibergs Straw Example: Randomized Block ANOVA Name: Example Jan. 23, 2014 Today s example was motivated from problem 13.11.9 of Walpole, Myers, Myers and Ye, Probability and Statistics for

More information

Lecture 5: Comparing Treatment Means Montgomery: Section 3-5

Lecture 5: Comparing Treatment Means Montgomery: Section 3-5 Lecture 5: Comparing Treatment Means Montgomery: Section 3-5 Page 1 Linear Combination of Means ANOVA: y ij = µ + τ i + ɛ ij = µ i + ɛ ij Linear combination: L = c 1 µ 1 + c 1 µ 2 +...+ c a µ a = a i=1

More information

Biostatistics 380 Multiple Regression 1. Multiple Regression

Biostatistics 380 Multiple Regression 1. Multiple Regression Biostatistics 0 Multiple Regression ORIGIN 0 Multiple Regression Multiple Regression is an extension of the technique of linear regression to describe the relationship between a single dependent (response)

More information

Statistics for EES Factorial analysis of variance

Statistics for EES Factorial analysis of variance Statistics for EES Factorial analysis of variance Dirk Metzler June 12, 2015 Contents 1 ANOVA and F -Test 1 2 Pairwise comparisons and multiple testing 6 3 Non-parametric: The Kruskal-Wallis Test 9 1 ANOVA

More information

Page: 3, Line: 24 ; Replace 13 Building... with 13 More on Multiple Regression. Page: 83, Line: 9 ; Replace The ith ordered... with The jth ordered...

Page: 3, Line: 24 ; Replace 13 Building... with 13 More on Multiple Regression. Page: 83, Line: 9 ; Replace The ith ordered... with The jth ordered... ERRATA FOR THE OTT/LONGNECKER BOOK AN INTRODUCTION TO STATISTICAL METHODS AND DATA ANALYSIS (Some of these typos have been corrected in later printings) Page: xiv, Line: 15 ; Replace these with the Page:

More information

STAT 540: Data Analysis and Regression

STAT 540: Data Analysis and Regression STAT 540: Data Analysis and Regression Wen Zhou http://www.stat.colostate.edu/~riczw/ Email: riczw@stat.colostate.edu Department of Statistics Colorado State University Fall 205 W. Zhou (Colorado State

More information

STAT22200 Spring 2014 Chapter 8A

STAT22200 Spring 2014 Chapter 8A STAT22200 Spring 2014 Chapter 8A Yibi Huang May 13, 2014 81-86 Two-Way Factorial Designs Chapter 8A - 1 Problem 81 Sprouting Barley (p166 in Oehlert) Brewer s malt is produced from germinating barley,

More information

Two-Factor Full Factorial Design with Replications

Two-Factor Full Factorial Design with Replications Two-Factor Full Factorial Design with Replications Dr. John Mellor-Crummey Department of Computer Science Rice University johnmc@cs.rice.edu COMP 58 Lecture 17 March 005 Goals for Today Understand Two-factor

More information

y = µj n + β 1 b β b b b + α 1 t α a t a + e

y = µj n + β 1 b β b b b + α 1 t α a t a + e The contributions of distinct sets of explanatory variables to the model are typically captured by breaking up the overall regression (or model) sum of squares into distinct components This is useful quite

More information

Multiple Comparisons. The Interaction Effects of more than two factors in an analysis of variance experiment. Submitted by: Anna Pashley

Multiple Comparisons. The Interaction Effects of more than two factors in an analysis of variance experiment. Submitted by: Anna Pashley Multiple Comparisons The Interaction Effects of more than two factors in an analysis of variance experiment. Submitted by: Anna Pashley One way Analysis of Variance (ANOVA) Testing the hypothesis that

More information

MAT3378 (Winter 2016)

MAT3378 (Winter 2016) MAT3378 (Winter 2016) Assignment 2 - SOLUTIONS Total number of points for Assignment 2: 12 The following questions will be marked: Q1, Q2, Q4 Q1. (4 points) Assume that Z 1,..., Z n are i.i.d. normal random

More information

Stat 6640 Solution to Midterm #2

Stat 6640 Solution to Midterm #2 Stat 6640 Solution to Midterm #2 1. A study was conducted to examine how three statistical software packages used in a statistical course affect the statistical competence a student achieves. At the end

More information