Analysis of Variance (ANOVA)
Cancer Research UK, 10th of May 2018
D.-L. Couturier / R. Nicholls / M. Fernandes

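Quick review: Normal distribution

$$Y \sim N(\mu, \sigma^2), \qquad f_Y(y) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(y-\mu)^2}{2\sigma^2}}, \qquad E[Y] = \mu, \quad \mathrm{Var}[Y] = \sigma^2,$$

$$Z = \frac{Y-\mu}{\sigma} \sim N(0, 1), \qquad f_Z(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}.$$

[Figure: probability density function of a normal distribution; x-axis from $\mu - 3\sigma$ to $\mu + 3\sigma$, density from 0.0 to 0.4. The intervals $\mu \pm \sigma$, $\mu \pm 2\sigma$ and $\mu \pm 3\sigma$ contain 68.27%, 95.45% and 99.73% of the probability mass.]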
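The three coverage percentages follow from standardising to $Z$: $P(|Z| < k) = \mathrm{erf}(k/\sqrt{2})$. A minimal sketch using only the Python standard library; the function name `normal_coverage` is an illustrative choice, not something from the slides.

```python
import math


def normal_coverage(k: float) -> float:
    """P(mu - k*sigma < Y < mu + k*sigma) for Y ~ N(mu, sigma^2)."""
    # After standardising, the probability is P(|Z| < k) = erf(k / sqrt(2)),
    # which does not depend on mu or sigma.
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within {k} sigma: {100 * normal_coverage(k):.2f}%")
```

The loop prints the coverage for one, two and three standard deviations, which can be compared with the percentages quoted on the slide.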

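Quick review: Normal distribution (continued)

The normal distribution is a suitable model for many phenomena, for example IQ $\sim N(100, 15^2)$.

[Figure: probability density function of IQ $\sim N(100, 15^2)$; x-axis at 55, 70, 85, 100, 115, 130, 145 (that is, $\mu - 3\sigma$ to $\mu + 3\sigma$), density from 0.0000 to a peak of 0.0266. The intervals $\mu \pm \sigma$, $\mu \pm 2\sigma$ and $\mu \pm 3\sigma$ contain 68.27%, 95.45% and 99.73% of the probability mass.]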
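The IQ example can be explored numerically with `statistics.NormalDist` from the Python standard library. This is only a sketch; the variable names are illustrative assumptions.

```python
from statistics import NormalDist

# IQ modelled as N(100, 15^2), as on the slide.
iq = NormalDist(mu=100, sigma=15)

peak_density = iq.pdf(100)                 # height of the density at the mean
z_130 = (130 - iq.mean) / iq.stdev         # standardised value of IQ = 130
p_within_one_sd = iq.cdf(115) - iq.cdf(85) # P(85 < IQ < 115)

print(round(peak_density, 4), z_130, round(p_within_one_sd, 4))
```

The peak density corresponds to the top tick of the density axis in the figure, and an IQ of 130 sits two standard deviations above the mean.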

Grand Picture of Statistics

Idea: Tamoxifen represses the progression of ER+ breast cancer.

Statistical hypotheses:
H0: $\mu_{\text{Tamoxifen}} = \mu_{\text{Control}}$
H1: $\mu_{\text{Tamoxifen}} < \mu_{\text{Control}}$

Sample data: tumour size at day 42,
$(x_{T,1}; x_{T,2}; \ldots; x_{T,n_T})$ and $(x_{C,1}; x_{C,2}; \ldots; x_{C,n_C})$.

Point estimation: $\hat\mu_{\text{Tamoxifen}} - \hat\mu_{\text{Control}}$.

Inference: under H0,
$$T_{obs} = \frac{\hat\mu_{\text{Tamoxifen}} - \hat\mu_{\text{Control}}}{s_p \sqrt{\frac{1}{n_T} + \frac{1}{n_C}}} \sim St_{n_T + n_C - 2}.$$

One-sample Student's t-test

[Figure: weight loss plot.]

Assumed model:
$$Y_i = \mu + \epsilon_i, \quad \text{where } i = 1, \ldots, n \text{ and } \epsilon_i \sim N(0, \sigma^2).$$

Hypotheses: H0: $\mu = 0$, H1: $\mu > 0$.

Test statistic's distribution under H0:
$$T = \frac{\bar{Y} - \mu_0}{s / \sqrt{n}} \sim \text{Student}(n - 1).$$

R output:

    One Sample t-test

    data:  dietb
    t = 6.6301, df = 24, p-value = 3.697e-07
    alternative hypothesis: true mean is greater than 0
    95 percent confidence interval:
     2.424694      Inf
    sample estimates:
    mean of x
        3.268
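The one-sample statistic is the sample mean minus the hypothesised mean, divided by the estimated standard error $s/\sqrt{n}$. A minimal sketch follows; the data are made up for illustration (the `dietb` values are not given in the slides), and `one_sample_t` is a hypothetical helper name.

```python
import math
from statistics import mean, stdev


def one_sample_t(y, mu0=0.0):
    """Return (t statistic, degrees of freedom) for H0: mu = mu0."""
    n = len(y)
    standard_error = stdev(y) / math.sqrt(n)  # s / sqrt(n), with s the sample SD
    return (mean(y) - mu0) / standard_error, n - 1


# Hypothetical weight-loss values, NOT the dietb data from the slides.
weight_loss = [1.0, 2.0, 3.0, 4.0, 5.0]
t_stat, df = one_sample_t(weight_loss, mu0=0.0)
print(f"t = {t_stat:.4f}, df = {df}")
```

The p-value then comes from the Student distribution with `df` degrees of freedom, which the standard library does not provide. As a consistency check on the reported output, t = 6.6301 and a mean of 3.268 imply a standard error of about 0.493.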

Two-sample location tests: t-tests and the Mann-Whitney-Wilcoxon test

[Figure: weight loss plot.]

Two independent sample Student's t-test

[Figure: weight loss plot.]

Assumed model:
$$Y_{i(g)} = \mu_g + \epsilon_{i(g)} = \mu + \delta_g + \epsilon_{i(g)},$$
where $g = A, B$, $i = 1, \ldots, n_g$, $\epsilon_{i(g)} \sim N(0, \sigma^2)$ and $\sum_g n_g \delta_g = 0$.

Hypotheses: H0: $\mu_A = \mu_B$, H1: $\mu_A \neq \mu_B$.

Test statistic's distribution under H0:
$$T = \frac{(\bar{Y}_A - \bar{Y}_B) - (\mu_A - \mu_B)}{s_p \sqrt{n_A^{-1} + n_B^{-1}}} \sim \text{Student}(n_A + n_B - 2).$$

R output:

    Two Sample t-test

    data:  dieta and dietb
    t = 0.0475, df = 47, p-value = 0.9623
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -1.323275  1.387275
    sample estimates:
    mean of x mean of y
        3.300     3.268
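The pooled statistic combines the two sample variances into a single estimate $s_p^2$ before forming the standard error. A minimal sketch under H0 ($\mu_A - \mu_B = 0$); the data and the helper name `pooled_t` are illustrative assumptions, not the `dieta`/`dietb` data.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Return (t statistic, df) for the equal-variance two-sample t-test."""
    n_a, n_b = len(a), len(b)
    # Pooled variance: weighted average of the two sample variances.
    s2_pooled = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    standard_error = math.sqrt(s2_pooled * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / standard_error, n_a + n_b - 2


# Hypothetical samples for illustration only.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0]
t_stat, df = pooled_t(group_a, group_b)
print(f"t = {t_stat:.4f}, df = {df}")
```

In the reported output, df = 47 corresponds to $n_A + n_B = 49$ observations in total.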

Two independent sample Welch's t-test

[Figure: weight loss plot.]

Assumed model:
$$Y_{i(g)} = \mu_g + \epsilon_{i(g)} = \mu + \delta_g + \epsilon_{i(g)},$$
where $g = A, B$, $i = 1, \ldots, n_g$, $\epsilon_{i(g)} \sim N(0, \sigma_g^2)$ and $\sum_g n_g \delta_g = 0$.

Hypotheses: H0: $\mu_A = \mu_B$, H1: $\mu_A \neq \mu_B$.

Test statistic's distribution under H0:
$$T = \frac{(\bar{Y}_A - \bar{Y}_B) - (\mu_A - \mu_B)}{\sqrt{s_A^2 / n_A + s_B^2 / n_B}} \sim \text{Student}(df).$$

R output:

    Welch Two Sample t-test

    data:  dieta and dietb
    t = 0.047594, df = 46.865, p-value = 0.9622
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -1.320692  1.384692
    sample estimates:
    mean of x mean of y
        3.300     3.268
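The slide leaves the degrees of freedom unspecified. The usual choice, and the one consistent with the non-integer df in the R output, is the Welch-Satterthwaite approximation. The sketch below assumes that formula; the data and the name `welch_t` are illustrative.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite df) without assuming equal variances."""
    v_a = variance(a) / len(a)  # estimated variance of the mean of a
    v_b = variance(b) / len(b)  # estimated variance of the mean of b
    t = (mean(a) - mean(b)) / math.sqrt(v_a + v_b)
    # Welch-Satterthwaite approximation (assumed here; not written on the slide).
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (len(a) - 1) + v_b ** 2 / (len(b) - 1))
    return t, df


# Hypothetical samples for illustration only.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0]
t_stat, df = welch_t(group_a, group_b)
print(f"t = {t_stat:.4f}, df = {df:.3f}")
```

The approximate df always lies between $\min(n_A, n_B) - 1$ and $n_A + n_B - 2$, which is why the reported 46.865 is just below the pooled test's 47.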

Two independent sample Mann-Whitney-Wilcoxon test

[Figure: weight loss plot.]

Assumed model:
$$Y_{i(g)} = \theta_g + \epsilon_{i(g)} = \theta + \delta_g + \epsilon_{i(g)},$$
where $g = A, B$, $i = 1, \ldots, n_g$, $\epsilon_{i(g)} \sim \text{iid}(0, \sigma^2)$ and $\sum_g n_g \delta_g = 0$.

Hypotheses: H0: $\theta_A = \theta_B$, H1: $\theta_A \neq \theta_B$.

Test statistic's distribution under H0:
$$z = \frac{\sum_{i=1}^{n_B} R_{i(B)} - n_B (n_A + n_B + 1)/2}{\sqrt{n_A n_B (n_A + n_B + 1)/12}},$$
which is approximately $N(0, 1)$, and where $R_{i(g)}$ denotes the global rank of the $i$th observation of group $g$.

R output:

    Wilcoxon rank sum test with continuity correction

    data:  dieta and dietb
    W = 277, p-value = 0.6526
    alternative hypothesis: true location shift is not equal to 0
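The z statistic only needs the global ranks of the pooled sample. The sketch below assigns average ranks to ties and then standardises the rank sum of the second group. It is a simplified illustration: it applies neither the continuity correction nor the tie correction of the variance that R uses, and the helper names are assumptions.

```python
import math


def midranks(values):
    """Global ranks (1-based); tied values receive the average of their ranks."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]


def rank_sum_z(a, b):
    """Standardised rank sum of group b (no continuity or tie correction)."""
    n_a, n_b = len(a), len(b)
    ranks = midranks(list(a) + list(b))
    rank_sum_b = sum(ranks[n_a:])                       # ranks belonging to group b
    expected = n_b * (n_a + n_b + 1) / 2                # mean of the rank sum under H0
    sd = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)    # SD of the rank sum under H0
    return (rank_sum_b - expected) / sd


# Hypothetical samples for illustration only.
z = rank_sum_z([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(f"z = {z:.4f}")
```

Swapping the two groups flips the sign of z, since the two rank sums add up to a constant.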

Two or more sample location tests: one-way ANOVA & multiple comparisons

[Figure: weight loss plot.]

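More than two sample case: Fisher's one-way ANOVA

[Figure: weight loss plot.]

Assumed model:
$$Y_{i(g)} = \mu_g + \epsilon_{i(g)} = \mu + \delta_g + \epsilon_{i(g)},$$
where $g = 1, \ldots, G$, $i = 1, \ldots, n_g$, $\epsilon_{i(g)} \sim N(0, \sigma^2)$ and $\sum_g n_g \delta_g = 0$.

Hypotheses: H0: $\mu_1 = \mu_2 = \ldots = \mu_G$, H1: $\mu_k \neq \mu_l$ for at least one pair $(k, l)$.

Test statistic's distribution under H0:
$$F = \frac{N s_{\bar{Y}}^2}{s_p^2} \sim \text{Fisher}(G - 1, N - G),$$
where
$$s_{\bar{Y}}^2 = \frac{1}{G-1} \sum_{g=1}^{G} \frac{n_g}{N} \left(\bar{Y}_g - \bar{Y}\right)^2, \qquad s_p^2 = \frac{1}{N-G} \sum_{g=1}^{G} (n_g - 1) s_g^2,$$
$$N = \sum_{g=1}^{G} n_g, \qquad \bar{Y} = \frac{1}{N} \sum_{g=1}^{G} n_g \bar{Y}_g.$$

R output:

                Df Sum Sq Mean Sq F value Pr(>F)
    diet.type    2   60.5  30.264   5.383 0.0066 **
    Residuals   73  410.4   5.622
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1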
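The F statistic is the between-group mean square divided by the pooled within-group variance, which is the ratio of the two "Mean Sq" entries in the table. A minimal sketch with made-up data; `one_way_anova` is a hypothetical helper name.

```python
from statistics import mean, variance


def one_way_anova(groups):
    """Return (F, df1, df2) for Fisher's one-way ANOVA."""
    G = len(groups)
    N = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / N
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: pooled residual variation.
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    ms_between = ss_between / (G - 1)
    ms_within = ss_within / (N - G)  # this is the pooled variance s_p^2
    return ms_between / ms_within, G - 1, N - G


# Hypothetical groups for illustration only.
f_stat, df1, df2 = one_way_anova([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(f"F = {f_stat:.3f} on ({df1}, {df2}) df")

# Consistency check on the reported table: F is the ratio of the mean squares.
reported_f = round(30.264 / 5.622, 3)
```

The last line recomputes the F value of the `diet.type` row from the two reported mean squares.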

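More than two sample case: Welch's one-way ANOVA

[Figure: weight loss plot.]

Assumed model:
$$Y_{i(g)} = \mu_g + \epsilon_{i(g)} = \mu + \delta_g + \epsilon_{i(g)},$$
where $g = 1, \ldots, G$, $i = 1, \ldots, n_g$, $\epsilon_{i(g)} \sim N(0, \sigma_g^2)$ and $\sum_g n_g \delta_g = 0$.

Hypotheses: H0: $\mu_1 = \mu_2 = \ldots = \mu_G$, H1: $\mu_k \neq \mu_l$ for at least one pair $(k, l)$.

Test statistic's distribution under H0:
$$F = \frac{s_{\bar{Y}}^2}{1 + \frac{2(G-2)}{3\Delta}} \sim \text{Fisher}(G - 1, \Delta),$$
where
$$s_{\bar{Y}}^2 = \frac{1}{G-1} \sum_{g=1}^{G} w_g \left(\bar{Y}_g - \bar{Y}\right)^2, \qquad \Delta = \left[\frac{3}{G^2 - 1} \sum_{g=1}^{G} \frac{1}{n_g - 1} \left(1 - \frac{w_g}{\sum_g w_g}\right)^2\right]^{-1},$$
$$w_g = \frac{n_g}{s_g^2}, \qquad \bar{Y} = \frac{\sum_{g=1}^{G} w_g \bar{Y}_g}{\sum_{g=1}^{G} w_g}.$$

R output:

    One-way analysis of means (not assuming equal variances)

    data:  weight.diff and diet.type
    F = 5.2693, num df = 2.00, denom df = 48.48, p-value = 0.008497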
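Welch's ANOVA weights each group mean by $n_g / s_g^2$ and replaces the residual degrees of freedom with an approximation. The sketch below follows the standard form of Welch's statistic; the data and the name `welch_anova` are illustrative assumptions.

```python
from statistics import mean, variance


def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA (unequal variances)."""
    G = len(groups)
    n = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    w = [n_g / variance(g) for n_g, g in zip(n, groups)]  # weights n_g / s_g^2
    w_total = sum(w)
    weighted_mean = sum(w_g * m for w_g, m in zip(w, means)) / w_total
    s2 = sum(w_g * (m - weighted_mean) ** 2 for w_g, m in zip(w, means)) / (G - 1)
    # Sum that drives both the correction factor and the denominator df.
    lam = sum((1 - w_g / w_total) ** 2 / (n_g - 1) for w_g, n_g in zip(w, n))
    f_stat = s2 / (1 + 2 * (G - 2) * lam / (G ** 2 - 1))
    df2 = (G ** 2 - 1) / (3 * lam)
    return f_stat, G - 1, df2


# Hypothetical two-group example: here Welch's F equals the squared Welch t.
f_stat, df1, df2 = welch_anova([[1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0]])
print(f"F = {f_stat:.4f}, num df = {df1}, denom df = {df2:.3f}")
```

With two groups the correction factor vanishes, F reduces to the square of Welch's t statistic and the denominator df matches the Welch-Satterthwaite value, which gives a convenient sanity check.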

More than two sample case: Kruskal-Wallis test

[Figure: weight loss plot.]

Assumed model:
$$Y_{i(g)} = \theta_g + \epsilon_{i(g)} = \theta + \delta_g + \epsilon_{i(g)},$$
where $g = 1, \ldots, G$, $i = 1, \ldots, n_g$, $\epsilon_{i(g)} \sim \text{iid}(0, \sigma^2)$ and $\sum_g n_g \delta_g = 0$.

Hypotheses: H0: $\theta_1 = \theta_2 = \ldots = \theta_G$, H1: $\theta_k \neq \theta_l$ for at least one pair $(k, l)$.

Test statistic's distribution under H0:
$$H = \frac{\frac{12}{N(N+1)} \sum_{g=1}^{G} n_g \bar{R}_g^2 - 3(N+1)}{1 - \sum_{v=1}^{V} \frac{t_v^3 - t_v}{N^3 - N}} \sim \chi^2(G - 1),$$
where $\bar{R}_g = \frac{1}{n_g} \sum_{i=1}^{n_g} R_{i(g)}$ and $R_{i(g)}$ denotes the global rank of the $i$th observation of group $g$, $V$ is the number of different values/levels in $y$, and $t_v$ denotes the number of times a given value/level occurs in $y$.

R output:

    Kruskal-Wallis rank sum test

    data:  weight.loss by diet.type
    Kruskal-Wallis chi-squared = 9.4159, df = 2, p-value = 0.009023
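The Kruskal-Wallis statistic can be computed from the mean rank of each group, with the denominator correcting for ties. A minimal sketch with made-up data; the helper names are assumptions, and the chi-squared p-value is left out because the standard library has no chi-squared distribution.

```python
from collections import Counter


def midranks(values):
    """Global ranks (1-based); tied values receive the average of their ranks."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]


def kruskal_wallis(groups):
    """Return (H, df) with the tie correction applied."""
    pooled = [y for g in groups for y in g]
    N = len(pooled)
    ranks = midranks(pooled)
    weighted_sq_mean_ranks = 0.0
    start = 0
    for g in groups:
        n_g = len(g)
        mean_rank = sum(ranks[start:start + n_g]) / n_g
        weighted_sq_mean_ranks += n_g * mean_rank ** 2
        start += n_g
    h = 12 / (N * (N + 1)) * weighted_sq_mean_ranks - 3 * (N + 1)
    # Tie correction: t is the number of times each distinct value occurs.
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1 - ties / (N ** 3 - N)), len(groups) - 1


# Hypothetical groups for illustration only.
h_no_ties, df = kruskal_wallis([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
h_ties, _ = kruskal_wallis([[1, 1], [2, 2]])
print(h_no_ties, h_ties, df)
```

When there are no ties the correction term is zero and the denominator equals 1.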

Model check: Residual analysis

$$Y_{i(g)} = \theta_g + \epsilon_{i(g)} \quad \Rightarrow \quad \hat\epsilon_{i(g)} = Y_{i(g)} - \hat\theta_g,$$
where
$\epsilon_{i(g)} \sim N(0, \sigma^2)$ for Fisher's ANOVA,
$\epsilon_{i(g)} \sim N(0, \sigma_g^2)$ for Welch's ANOVA,
$\epsilon_{i(g)} \sim \text{iid}(0, \sigma^2)$ for the Kruskal-Wallis test.

[Figure: residual boxplot per group; x-axis: diet type (A, B, C), y-axis: residuals from -4 to 6.]

[Figure: normal Q-Q plot; x-axis: theoretical quantiles from -2 to 2, y-axis: sample quantiles from -4 to 6.]

R output:

    Shapiro-Wilk normality test

    data:  diet$resid.mean
    W = 0.99175, p-value = 0.9088

    Bartlett test of homogeneity of variances

    data:  diet$resid.mean by as.numeric(diet$diet.type)
    Bartlett's K-squared = 0.21811, df = 2, p-value = 0.8967
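For the ANOVA models, the residuals are the observations minus their group mean, and Bartlett's test compares the group variances of those residuals. The sketch below computes both, using the standard form of Bartlett's statistic (not written on the slide, so treat it as an assumption). The data and helper names are illustrative, and the Shapiro-Wilk test is not sketched.

```python
import math
from statistics import mean, variance


def group_mean_residuals(groups):
    """Residuals Y - group mean, kept in the same group structure."""
    return [[y - mean(g) for y in g] for g in groups]


def bartlett_k2(groups):
    """Return (Bartlett's K-squared, df) for homogeneity of variances."""
    G = len(groups)
    n = [len(g) for g in groups]
    N = sum(n)
    s2 = [variance(g) for g in groups]
    s2_pooled = sum((n_g - 1) * v for n_g, v in zip(n, s2)) / (N - G)
    numerator = (N - G) * math.log(s2_pooled) - sum(
        (n_g - 1) * math.log(v) for n_g, v in zip(n, s2)
    )
    denominator = 1 + (sum(1 / (n_g - 1) for n_g in n) - 1 / (N - G)) / (3 * (G - 1))
    return numerator / denominator, G - 1


# Hypothetical groups for illustration only.
data = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 3.0, 5.0]]
residuals = group_mean_residuals(data)
k2, df = bartlett_k2(residuals)
print(f"K-squared = {k2:.4f}, df = {df}")
```

Because variances are unaffected by subtracting a constant, the statistic is the same whether it is computed on the raw observations or on the residuals.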

Finding different pairs: Multiple comparisons

All-pairwise comparison problem: we are interested in finding which pair(s) differ by testing
$$H0_1: \mu_1 = \mu_2, \quad H0_2: \mu_1 = \mu_3, \quad \ldots, \quad H0_K: \mu_{G-1} = \mu_G,$$
leading to a total of $K = G(G-1)/2$ pairwise comparisons.

Family-wise type I error for $K$ tests, $\alpha_K$:
For each test, the probability of rejecting H0 when H0 is true equals $\alpha$.
For $K$ independent tests, the probability of rejecting H0 at least once when H0 is true, $\alpha_K$, is given by
$$\alpha_K = 1 - (1 - \alpha)^K.$$
With $\alpha = 0.05$: $\alpha_1 = 0.05$, $\alpha_2 = 0.0975$, $\alpha_{10} = 0.4013$.

Multiplicity correction. Principle: change the level of each test so that $\alpha_K = 0.05$, for example:
Bonferroni's correction (indep. tests): $\alpha = \alpha_K / K$,
Dunn-Šidák's correction (indep. tests): $\alpha = 1 - (1 - \alpha_K)^{1/K}$,
Tukey's correction (dependent tests).

[Figure: 95% family-wise confidence level; intervals for the pairwise differences B-A, C-A and C-B, x-axis from -1 to 3.]
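The family-wise error and the two per-test corrections are one-line formulas, so the quoted values are easy to reproduce. A minimal sketch; the function names are illustrative, and Tukey's correction is not covered because it needs the studentised range distribution.

```python
def n_pairwise(G: int) -> int:
    """Number of all-pairwise comparisons among G groups."""
    return G * (G - 1) // 2


def familywise_error(alpha: float, K: int) -> float:
    """P(at least one false rejection) across K independent tests at level alpha."""
    return 1 - (1 - alpha) ** K


def bonferroni_level(alpha_K: float, K: int) -> float:
    """Per-test level under Bonferroni's correction."""
    return alpha_K / K


def sidak_level(alpha_K: float, K: int) -> float:
    """Per-test level under Dunn-Sidak's correction."""
    return 1 - (1 - alpha_K) ** (1 / K)


for K in (1, 2, 10):
    print(K, round(familywise_error(0.05, K), 4))

K = n_pairwise(3)  # three diets give three pairwise comparisons
print(K, bonferroni_level(0.05, K), sidak_level(0.05, K))
```

Plugging the Šidák level back into `familywise_error` returns the target 0.05 exactly (up to rounding), whereas the Bonferroni level is slightly smaller and therefore slightly conservative.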

Two or more sample location tests: two-way ANOVA

[Figure: weight loss plot, with separate panels for Male and Female.]

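More than one factor: Fisher's two-way ANOVA

[Figure: weight loss plot, with separate panels for Male and Female.]

Assumed model:
$$Y_{i(gk)} = \mu_{gk} + \epsilon_{i(gk)} = \mu + \delta_g + \delta_k + \delta_{gk} + \epsilon_{i(gk)},$$
where $g = 1, \ldots, G$, $k = 1, \ldots, K$, $i = 1, \ldots, n_{gk}$, $\epsilon_{i(gk)} \sim N(0, \sigma^2)$ and
$$\sum_g n_g \delta_g = \sum_k n_k \delta_k = \sum_{g,k} n_{gk} \delta_{gk} = 0.$$

Hypotheses:
$H0_1$: $\delta_g = 0 \ \forall g$, $H1_1$: $H0_1$ is false.
$H0_2$: $\delta_k = 0 \ \forall k$, $H1_2$: $H0_2$ is false.
$H0_3$: $\delta_{gk} = 0 \ \forall g, k$, $H1_3$: $H0_3$ is false.

[Figure: mean weight loss (y-axis, 0.0 to 3.0, with the overall mean $\mu$ marked) by diet type (A, B, C) for Male and Female, in six panels: Scenario 1, Scenario 2, Scenario 3 ($H1_1$), Scenario 4 ($H1_2$), Scenario 5 ($H1_1$ & $H1_2$), Scenario 6 ($H1_1$ & $H1_2$ & $H1_3$).]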

More than one factor: Fisher's two-way ANOVA (continued)

R output for the two-way model with factors diet.type and gender and their interaction:

                     Df Sum Sq Mean Sq F value  Pr(>F)
    diet.type         2   60.5  30.264   5.629 0.00541 **
    gender            1    0.2   0.169   0.031 0.85991
    diet.type:gender  2   33.9  16.952   3.153 0.04884 *
    Residuals        70  376.3   5.376
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
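Each F value in the two-way table is the mean square of that term divided by the residual mean square. The short sketch below recomputes them from the reported (rounded) table entries; the dictionary layout is an illustrative choice.

```python
# Mean squares as reported in the two-way ANOVA table.
mean_sq = {"diet.type": 30.264, "gender": 0.169, "diet.type:gender": 16.952}
ms_residual = 5.376

# F value of each term = its mean square / residual mean square.
f_values = {term: round(ms / ms_residual, 3) for term, ms in mean_sq.items()}
print(f_values)

# Sums of squares: adding gender and the interaction splits the one-way
# residual sum of squares (410.4) into three parts.
one_way_residual_ss = 410.4
two_way_parts = 0.2 + 33.9 + 376.3
print(round(two_way_parts, 1), one_way_residual_ss)
```

The second check shows how the two factors relate: the `diet.type` sum of squares is unchanged from the one-way table, and the gender, interaction and residual sums of squares together account for the former residual variation.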

Summary