Introduction to Statistical Inference Lecture 10: ANOVA, Kruskal-Wallis Test

Size: px

Start display at page:

Download "Introduction to Statistical Inference Lecture 10: ANOVA, Kruskal-Wallis Test"

Aubrey Riley
5 years ago
Views:

1 Introduction to Statistical Inference Lecture 10: ANOVA, Kruskal-Wallis Test la

2 Contents

4 The two sample t-test generalizes into Analysis of Variance. In analysis of variance ANOVA the population consists of two or more groups. Observations are assumed to follow normal distribution. ANOVA tests the equality of the expected values of the groups. For example, we could test the equality of the expected monthly salary of MScs (Tech.) in Helsinki, Tampere and Oulu.

5 MANOVA In multivariate analysis of variance, MANOVA, the tested expected values are vectors. We could test the equality of the expected monthly salary and weekly overtime of MScs (Tech.) in Helsinki, Tampere and Oulu.

6 Multifactor ANOVA The population could also be divided into groups based on multiple factors (Multifactor ANOVA), of which some can be continuous (analysis of covariance, ANCOVA). For example, we could divide MScs (Tech.) in to groups based on the city they live in and based on their gender.

7 Multifactor MANOVA In multifactor MANOVA, the tested expected values are vectors.

8 ANOVA On this lecture course, we only consider cases where the population is divided into groups with respect to just one factor and the expected value that is tested is univariate.

9 ANOVA Let x 1j, x 2j,..., x nj j be observed values of a random variable x j, j {1,..., k}. Assume that the observations x 1j, x 2j,..., x nj j are i.i.d. and follow normal distribution N(µ j, σ 2 ), j {1,..., k}. (We now have k random samples from univariate normal distribution. The variance of all the k normal distributions is the same.) Assume that the k samples are independent. The null hypothesis H 0 : µ 1 = µ 2 = = µ k. The alternative hypothesis H 1 : µ j µ i for some i, j.

10 ANOVA In analysis of variance, the total variance is divided into two parts. First part measures the variance between group means and the second part measures variance within groups. If these two variances differ a lot, there is evidence against the null hypothesis, and it can be rejected. The test of the equality of the expected values is based on comparison of between groups variance and within groups variances. This the name to the method analysis of variance.

11 ANOVA Calculate the group means x j = 1 n j n j i=1 x ij and combined sample mean x = 1 n n k j x ij, j=1 i=1 where n = k j=1 n j.

12 ANOVA Consider the total sum of squares n k j SST = (x ij x) 2, j=1 i=1 the variance between groups (group sum of squares) SSG = n k j ( x j x) 2 = j=1 i=1 k n j ( x j x) 2 j=1 and the variance within groups (error sum of squares) SSE = n k j (x ij x j ) 2 = j=1 i=1 k (n j 1)sj 2, j=1 where sj 2 = 1 nj n j 1 i=1 (x ij x j ) 2. Now the total sum of squares SST = SSG + SSE.

13 ANOVA F-test statistic F = n k SSG k 1 SSE. Under the null hypothesis, the test statistic follows F distribution with parameters (k 1) and (n k). The expected value of the test statistic, under H 0, is (E[F] = n k n k 2 ). Large values of the test statistic suggests that the null hypothesis H 0 is false. n k n k 2 The null hypothesis H 0 is rejected, if the p-value is small enough.

14 F -distribution For more about F-distribution, check Wolfram MathWorld and Wikipedia. Figure: F-distributions with different parameters.

15 Numerical example A research group was set to study whether the expected value of a specific L laboratory test differs between patients that are on different medications (A, B, C). 10 patients receiving medication A (group 1), 10 patients receiving medication B (group 2) and 10 receiving medication C (group 3) were picked randomly and lab test L was taken. Next table shows the accurately measured laboratory test results.

16 Group 1 (A) Group 2 (B) Group 3 (C) Table: L laboratory test results for patient groups 1, 2 and 3

17 Group means are x 1 = , x 2 = and x 3 = and combined mean x = Group variances are s 2 1 = , s 2 2 = and s 2 3 = Total sum of squares SST = n k j (x ij x) 2 = j=1 i= i=1 group sum of squares 10 i=1 (x 1i ) i=1 (x 3 i ) 2 = , (x 2i ) 2 SSG = k n j ( x j x) 2 = j=1 3 10( x j ) 2 = j=1 and error sum of squares SSE = k (n j 1)sj 2 = 9( ) j=1 =

18 The value of the test statistic is F = n k k 1 SSG SSE = = Under the null hypothesis, the test statistic follows F -distribution with parameters (2) and (27). One-tailed critical value on 5% significance level is < The null hypothesis can be rejected.

20 If the null hypothesis (equality of the expected values) is rejected based on the F -test, the next step is pairwise comparison. The goal in pairwise comparison is to identify the groups with statistically significant differences in expected values.

21 Bonferroni s method Consider pairwise comparison of the expected values. There are c = k(k 1) 2 pairs in total to compare. The first idea would be to analyse the pairs with t-test or to construct confidence intervals for the differences µ j µ s. Let β be the probability that H 0 is incorrectly rejected in a single test and let α be the probability that H 0 is incorrectly rejected in at least one test, when the test is repeated c times. Then α cβ. For this reason, if significance level α is chosen for the combined comparison, then individual comparisons must be done on level β = α c. (For pairwise tests, instead of p-value α, you look for p-values α c.)

23 In ANOVA, it is assumed that the group variances are equal. This assumption should be tested separately.

24 Let x 1j, x 2j,..., x nj j be observed values of a random variable x j, j {1,..., k}. Assume that the observations are i.i.d. and follow normal distribution N(µ j, σ 2 j ), j {1,..., k}. Assume that all the k samples are independent. The null hypothesis H 0 : σ 2 1 = σ2 2 = = σ2 k. The alternative hypothesis H 1 : σ 2 j σ 2 i for some i, j..

25 Let and Let s 2 = 1 n k s 2 j = 1 n j 1 n k j (x ij x j ) 2, j=1 i=1 n j (x ij x j ) 2. i=1 B = Q h, where and Q = (n k) ln s 2 h = ( ( 3(k 1) k j=1 k (n j 1) ln sj 2 j=1 1 n j 1 ) 1 ). n k

26 Bartlett s test statistic where and B = Q h, Q = (n k) ln s 2 h = ( ( 3(k 1) k j=1 k (n j 1) ln sj 2 j=1 1 n j 1 ) 1 ). n k If the sample size is large, then, under the null hypothesis, the test statistic approximately follows χ 2 distribution with (k 1) degrees of freedom. The expected value of the test statistic, under H 0, is k 1 (E[B] = k 1). Large values of the test statistic suggest that the null hypothesis H 0 is false. The null hypothesis H 0 is rejected if the p-value is small enough.

28 ANOVA tests the equality of the expected values of normally distributed samples. We next consider a non-parametric alternative to ANOVA.

29 is similar to one way analysis of variance, but it does not require normality assumption.

30 Let x 1j, x 2j,..., x nj j be observed values of a continuous random variable x j, j {1,..., k}. Assume that the observations x 1j, x 2j,..., x nj j are i.i.d. Assume also that the k samples are independent and that the variables x j, j {1,..., k} follow the same distribution up to location shifts (i.e. x j follow otherwise the same distribution, but possibly with different medians) and assume that the variables x j have population medias m j, j {1,..., k}. The null hypothesis H 0 : m 1 = m 2 = = m k. The alternative hypothesis H 1 : m j m i for some i, j.

31 is based on examining the ranks of the observations.

32 Combine the groups x 1j, x 2j,..., x nj j, j {1,..., k} into one big sample z 1, z 2,..., z n, where n = k j=1 n j. Order the observations z s from the smallest to the largest. Let R(z s ) be the rank of the observation z s in the combined sample z 1, z 2,..., z n. Calculate the group means of the ranks r j = 1 n j n j z s=x ij,i=1 and the mean of the combined sample r = 1 n R(z s ) n R(z s ). s=1

33 Consider the group sum of squares, which describes the variation of the ranks between the groups k n j ( r j r) 2 j=1 and the total sum of squares, which describes the variation of the ranks in the combined sample n (R(z s ) r) 2. s=1

34 statistic k j=1 K = (n 1) n j( r j r) 2 n s=1 (R(z s) r). 2 Under the null hypothesis H 0, if the sample size is large, the test statistic approximately follows χ 2 distribution with k 1 degrees of freedom. Under H 0, the (asymptotic) expected value of the test statistic is k 1. Large values of the test statistic suggests that the null hypothesis H 0 is false. The null hypothesis is rejected, if the p-value is small enough.

35 Statistical softwares often calculate exact p-values for the when the sample size is small. With large sample sizes, calculation of exact p-values requires intense computing and in these cases the asymptotic p-values (based on the χ 2 -distribution) are used.

36 Discrete distributions We assumed above that the observations follow some continuous distribution. However, can be used for discrete observations as well. Then it is possible that some of the observations have the same rank. In that case, all those observations are assigned to have the median of the corresponding ranks. For example, if two observations have the same rank corresponding to ranks 7 and 8, then both are assigned to have rank 7.5. If three observations have the same ranks corresponding to ranks 3, 4, and 5, then each is assigned to have rank 4.

37 ANOVA vs Kruskall-Wallis ANOVA is explicitly a test for the equality of the expected values. The can, technically, be seen as a comparison of the expected ranks. Hence, the Kruskal-Wallis test is in fact more general than a test for the equality of the medians. It tests whether the probability that a random observation from each group is equally likely to be above or below a random observation from another group. The test is sensitive to differences in medians and that is why it is usually considered as a test for the equality of the medians.

38 Example Consider three student groups (1, 2, 3) and their statistics exam scores. The table below displays the scores and the corresponding ranks (in parenthesis). group 1 group 2 group (14) 16.5 (11) 23 (22) 11.0 (4.5) 10.0 (3) 22 (20) 17.0 (12) 15.0 (8.5) 23 (22) 14.0 (7) 15.0 (8.5) 24 (24) 11.0 (4.5) 20.5 (17) 21 (18) 9.5 (2) 8.0 (1) 21.5 (19) 16.0 (10) 12.0 (6) 23 (22) 20.0 (16) 17.5 (13) 19.0 (15)

39 Example Calculate the rank means within groups r 1 = 1 54 ( ) = 7 7 = , r 2 = 1 7 ( ) = 55 7 = , r 3 = ( ) = = 19.1, and the mean rank of the combined sample r = ( ) = = 12.5.

40 Example Calculate the group sum of squares k n j ( r j r) 2 = 7 ( ) ( ) 2 j=1 +10 ( ) 2 = and the total sum of squares n (R(z s ) r) 2 s=1 = ( ) 2 + ( ) 2 + ( ) 2 + (7 12.5) 2 +( ) 2 + (2 12.5) 2 + ( ) 2 + ( ) 2 +(3 12.5) 2 + ( ) 2 + ( ) 2 +( ) 2 + (1 12.5) 2 + (6 12.5) 2 +( ) 2 +( ) 2 +( ) 2 +( ) 2 +( ) 2 +( ) 2 +( ) 2 +( ) 2 +( ) 2 +( ) 2 = 1147

41 Example Now k j=1 K = (n 1) n j( r j r) 2 n s=1 (R(z s) r) = (24 1) = p value of the test is clearly less than 0.05 value 5.79 has p value approximately The null hypothesis can be rejected. There is a statistically significant difference in the exam success between the three groups.

42 Bonferroni s method If the null hypothesis of the is rejected, the next step is pairwise comparison.

43 Bonferroni s method Compare pairwise equality/difference of the medians. There are c = k(k 1) 2 pairs to compare. First idea would be to use the Wilcoxon two sample rank test for pairwise comparison. It should be noted that if the comparison is done on significance level α, then the pairwise comparisons should be done on significance level β = α c. For example, if significance level 0.05 is used for the combined comparison, then the pairwise comparisons can be used to reject the null hypothesis if the p-value is smaller than or equal to 0.05 c.

44 J. S. Milton, J. C. Arnold: Introduction to Probability and Statistics, McGraw-Hill Inc R. V. Hogg, J. W. McKean, A. T. Craig: Introduction to Mathematical Statistics, Pearson Education P. Laininen: Todennäköisyys ja sen tilastollinen soveltaminen, Otatieto 1998, numero 586. I. Mellin: Tilastolliset menetelmät,

Introduction to Statistical Inference Lecture 8: Linear regression, Tests and confidence intervals

Introduction to Statistical Inference Lecture 8: Linear regression, Tests and confidence la Non-ar Contents Non-ar Non-ar Non-ar Consider n observations (pairs) (x 1, y 1 ), (x 2, y 2 ),..., (x n, y n