Data Analysis and Statistical Methods
Statistics 651
http://www.stat.tamu.edu/~suhasini/teaching.html
Suhasini Subba Rao

Motivations for the ANOVA

We defined the F-distribution; it is mainly used in the ANOVA (analysis of variance).

Motivating the ANOVA

Suppose we have three samples from three populations:
  13 heights of 10 year old children from Country 1.
  15 heights of 10 year old children from Country 2.
  18 heights of 10 year old children from Country 3.

We assume (to make everything easy for now) that all the observations are normal, and we want to test whether they have the same mean. Let
  $\mu_1$ = mean height of 10 year olds in Country 1,
  $\mu_2$ = mean height of 10 year olds in Country 2,
  $\mu_3$ = mean height of 10 year olds in Country 3.

Formally we want to test $H_0: \mu_1 = \mu_2 = \mu_3$ against $H_A$: the means are not all the same.

How can we do the test? One method would be to go through every pair and test individually
  $H_0: \mu_1 = \mu_2$ against $H_A: \mu_1 \neq \mu_2$
  $H_0: \mu_1 = \mu_3$ against $H_A: \mu_1 \neq \mu_3$
  $H_0: \mu_2 = \mu_3$ against $H_A: \mu_2 \neq \mu_3$.

There are problems associated with doing multiple tests; one of the main ones is false discoveries (false positives). For example, if the three tests were independent and each done at the 5% level, the chance of at least one false positive when all the means are equal would be $1 - 0.95^3 \approx 0.14$, well above 5%. To test equality of several means we often use what is known as ANOVA: whereas above we had to do multiple tests, ANOVA is just one test.

The speed of light data

In 1879, A. A. Michelson wanted to measure the speed of light. He conducted 5 trials and for each trial made 20 measurements. That is, he has 5 samples, where each sample is of size 20. Each sample comes from a different population.

Here are the summary statistics (to make life easy we have removed 299,000 from each observation):

                   Trial 1   Trial 2   Trial 3   Trial 4   Trial 5
sample mean          909       856       845      820.5     831.5
sample variance    11009      3741      6257      3605      2939

It is now known that the speed of light (minus 299,000 km/s) is 734.5 km/s.
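The summary statistics above are enough to compare each trial with the value 734.5 using a one-sample t statistic, $t = (\bar{X} - 734.5)/\sqrt{s^2/20}$. The following is only a minimal sketch in Python using the standard library; the variable names are illustrative and not part of the original notes. Each statistic would be compared with a t-distribution on 19 degrees of freedom (two-sided 5% critical value about 2.09).

```python
import math

# Summary statistics from the table above (299,000 already subtracted).
means = [909, 856, 845, 820.5, 831.5]
variances = [11009, 3741, 6257, 3605, 2939]
n = 20        # measurements per trial
mu0 = 734.5   # accepted speed of light minus 299,000 km/s

# One-sample t statistic for each trial: (mean - mu0) / sqrt(s^2 / n).
t_stats = [(m - mu0) / math.sqrt(v / n) for m, v in zip(means, variances)]

for trial, t in enumerate(t_stats, start=1):
    print(f"Trial {trial}: t = {t:.2f} on {n - 1} degrees of freedom")
```

All five statistics are well above 2.09, so each trial mean is significantly different from 734.5.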
If we do a hypothesis test for each trial and test whether the population mean for each trial is 734.5 km/s, we would reject the null. For practice you can try doing the test, and see what you get. This indicates there were some problems in his experiment (quite a lot of measurement error).

Another interesting question is whether all trials have the same mean or whether some are different. There is a belief that he changed some of the equipment for each of the trials. By changing the equipment he has produced a new population.

Translating this into statistical language, we can ask whether the data all have the same population mean. First we need to make some plots: a boxplot for each trial.

Boxplots

[Figure: boxplots of the measurements for Trials 1 to 5; the measurement axis runs from 700 to 1000.]

What do you think?

Let $\mu_1, \mu_2, \mu_3, \mu_4, \mu_5$ be the population means of Trials 1, 2, 3, 4, 5 respectively. We want to test the hypothesis $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$.

Before we do tests using this data let us look at a simple example.

The principles of ANOVA: Dummy example

Consider the following dummy example:

                   Sample 1   Sample 2   Sample 3   Combined Sample
                     4.1        5.1        6.6
                     3.3        5.0        6.2
                     2.6        5.3        7.3
                     2.8                   6.5
average              3.2        5.13       6.65       4.98
sample variance      0.45       0.023      0.22       0.26

Notice that in the table we have the mean for each group, and the last average is the overall average.

The variances are calculated using:

$s_1^2 = \frac{1}{3}\left((4.1-3.2)^2 + (3.3-3.2)^2 + (2.6-3.2)^2 + (2.8-3.2)^2\right) = 0.45$

$s_2^2 = \frac{1}{2}\left((5.1-5.13)^2 + (5.0-5.13)^2 + (5.3-5.13)^2\right) = 0.023$

$s_3^2 = \frac{1}{3}\left((6.6-6.65)^2 + (6.2-6.65)^2 + (7.3-6.65)^2 + (6.5-6.65)^2\right) = 0.22$

$s_A^2 = \frac{1}{8}\left((4.1-3.2)^2 + \ldots + (6.5-6.65)^2\right) = \frac{1}{8}\left(3s_1^2 + 2s_2^2 + 3s_3^2\right) = 0.26$

Note that in $s_A^2$ each observation is measured from its own group mean, so $s_A^2$ is a pooled within-group variance.

Do you think there is a difference between the three population means? We want to test $H_0: \mu_1 = \mu_2 = \mu_3$ (the population means are the same) against $H_A$: at least one of the means is different.
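The summary rows of the dummy-example table can be checked with a few lines of code. This is only a sketch using Python's standard `statistics` module; the names `samples`, `group_vars` and `pooled_var` are illustrative choices, not notation from the notes.

```python
from statistics import mean, variance

# Dummy data from the table above.
samples = {
    "Sample 1": [4.1, 3.3, 2.6, 2.8],
    "Sample 2": [5.1, 5.0, 5.3],
    "Sample 3": [6.6, 6.2, 7.3, 6.5],
}

group_means = {name: mean(x) for name, x in samples.items()}
# statistics.variance divides by n - 1, i.e. it is the sample variance.
group_vars = {name: variance(x) for name, x in samples.items()}

all_obs = [v for x in samples.values() for v in x]
overall_mean = mean(all_obs)

# Pooled variance: sum of (n_i - 1) * s_i^2 over N - (number of groups).
df_within = len(all_obs) - len(samples)
pooled_var = sum(
    (len(x) - 1) * group_vars[name] for name, x in samples.items()
) / df_within

for name in samples:
    print(f"{name}: mean = {group_means[name]:.2f}, "
          f"variance = {group_vars[name]:.3f}")
print(f"overall mean = {overall_mean:.2f}, pooled variance = {pooled_var:.3f}")
```

Working from the unrounded data, the pooled variance comes out at about 0.255, which agrees with the table up to the rounding of the group variances.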
Graphical representation of data

[Figure: the observations of the dummy example plotted on an axis from 1 to 7, with the "Group mean" of each sample and the "Total mean" marked.]

The idea of ANOVA

The idea of ANOVA is to look at the variation within the samples (each observation measured from its own group mean) and compare it to the variation between the sample means (each group mean measured from the global average, calculated using all of the data).

[Figure: the same axis from 1 to 7, with brackets labelled "Within variance" and "Between group variance".]

The methods of ANOVA

                                  Sample 1       Sample 2       Sample 3       Combined
mean                                3.2            5.13           6.65           4.98
within variance                     0.45           0.023          0.22
mean - combined mean               -1.78           0.15           1.67
n_i (mean - combined mean)^2      4 x 1.78^2     3 x 0.15^2     4 x 1.67^2

We evaluate the Sum of Squares Between the samples (SSB):

$SSB = 4 \times 1.78^2 + 3 \times 0.15^2 + 4 \times 1.67^2 \approx 23.9$

So $SSB/(3-1) \approx 11.95$ can almost (BUT IT IS NOT) be considered as the sample variance of the sample means 3.2, 5.13, 6.65; the difference is that each squared deviation is weighted by its sample size $n_i$.

We evaluate the Sum of Squares Within the samples (SSW):

$SSW = (4-1) \times 0.45 + (3-1) \times 0.023 + (4-1) \times 0.22 \approx 2.06$

So $SSW/[(4-1)+(3-1)+(4-1)] = SSW/8 \approx 0.257$ can basically be considered as an extension of the pooled variance to more than two samples.

If the means of the populations are the same, then $SSB/(3-1)$ and $SSW/8$ should be close to each other. If the population means are different, then the between group variation will tend to be much greater than the within group variation.

Look at the picture two pages back: the between group variation is represented by the lowest bracket, and the within group variation by an average of the brackets above it. Of course we need a test statistic to actually test this.

The F-distribution and ANOVA

We consider the ratio

$F = \frac{SSB/2}{SSW/8}$;

under the null the ratio should be close to one (hmm, this looks rather like the F-test...). Under the null $H_0: \mu_1 = \mu_2 = \mu_3$, we have

$\frac{SSB/2}{SSW/8} \sim F_{2,8}$.

The ratio is $11.95/0.257 \approx 46.5$, and $P(F_{2,8} > 46.5) \approx 0$. This is very small, hence there is enough evidence to reject the null!

The ANOVA table - for the dummy example

When we divide a sum of squares by its degrees of freedom, the result is known as the mean square.

                  Sum of Squares   df   Mean square    F      Sig.
Between Groups    SSB = 23.9        2     11.95        46.5   very small
Within Groups     SSW = 2.06        8      0.257
Total             25.96            10

The total = SSW + SSB. We call the 2 and 8 in $\frac{SSB/2}{SSW/8}$ the degrees of freedom.

What exactly the SSW and SSB are estimating

Suppose the means of the populations in this example are $\mu_1$, $\mu_2$ and $\mu_3$. The underlying assumption for the ANOVA is that the variances of all three populations are the same; let us call this common variance $\sigma^2$.

$SSW/(11-3)$ is basically an estimate of the variance, that is

$\frac{SSW}{11-3} \approx \sigma^2$.

$SSB/(3-1)$ is basically an estimate of the variance plus additional terms:

$\frac{SSB}{3-1} \approx \sigma^2 + \frac{1}{3-1}\left(4(\mu_1-\mu)^2 + 3(\mu_2-\mu)^2 + 4(\mu_3-\mu)^2\right)$,

where $\mu$ is the average of the means weighted by the sample sizes, that is $\mu = \frac{1}{11}(4\mu_1 + 3\mu_2 + 4\mu_3)$.

We see that if the null is true, that is $\mu_1 = \mu_2 = \mu_3$, then the additional terms are zero and $\frac{SSB}{3-1}$ is an estimator of $\sigma^2$. Hence $\frac{SSW}{11-3}$ and $\frac{SSB}{3-1}$ are both estimators of the variance.

However, if the alternative is true, and at least one of the means is different, then on average $\frac{SSB}{3-1}$ will be larger than $\sigma^2$. So the ratio $\frac{SSB}{3-1} \big/ \frac{SSW}{11-3}$ will tend to be much bigger than one. This is why we reject the null when $\frac{SSB}{3-1} \big/ \frac{SSW}{11-3}$ is large: such a large ratio would be highly unlikely if the means were the same.
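The whole ANOVA calculation for the dummy example can be sketched in a few lines. This is an illustrative sketch using only Python's standard library, not the notes' own code; the variable names are assumptions. The p-value line uses the fact that, when the numerator degrees of freedom equal 2, the tail probability of $F_{2,\nu}$ has the closed form $P(F_{2,\nu} > x) = (1 + 2x/\nu)^{-\nu/2}$; for other degrees of freedom a statistics library would be needed.

```python
from statistics import mean

# Dummy data: three samples of sizes 4, 3 and 4.
samples = [
    [4.1, 3.3, 2.6, 2.8],
    [5.1, 5.0, 5.3],
    [6.6, 6.2, 7.3, 6.5],
]

k = len(samples)                      # number of groups
N = sum(len(x) for x in samples)      # total number of observations
grand_mean = mean(v for x in samples for v in x)
group_means = [mean(x) for x in samples]

# Between-groups sum of squares: n_i * (group mean - overall mean)^2.
ssb = sum(len(x) * (m - grand_mean) ** 2 for x, m in zip(samples, group_means))
# Within-groups sum of squares: deviations from each group's own mean.
ssw = sum((v - m) ** 2 for x, m in zip(samples, group_means) for v in x)

df_between, df_within = k - 1, N - k
msb, msw = ssb / df_between, ssw / df_within
f_ratio = msb / msw

# Closed-form tail probability, valid only for numerator df = 2.
assert df_between == 2
p_value = (1 + 2 * f_ratio / df_within) ** (-df_within / 2)

print(f"Between: SS = {ssb:.2f}, df = {df_between}, MS = {msb:.2f}")
print(f"Within:  SS = {ssw:.2f}, df = {df_within}, MS = {msw:.3f}")
print(f"F = {f_ratio:.1f}, p-value = {p_value:.2g}")
```

Because this works from the raw observations rather than the rounded means and variances, the numbers differ slightly from the hand calculation (F comes out at about 46.9), but the conclusion is the same.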
ANOVA for the speed of light example

                   Trial 1   Trial 2   Trial 3   Trial 4   Trial 5
sample mean          909       856       845      820.5     831.5
sample variance    11009      3741      6257      3605      2939

We need to calculate the sum of squares between the samples (SSB) and the sum of squares within the samples (SSW), divide each of these by its number of degrees of freedom, and use the F-distribution to determine whether the two mean squares are close to each other or not.

We use the sample variances to calculate the SSW. Note that the sample size in each sample is 20.

$SSW = (20-1) \times (11009 + 3741 + 6257 + 3605 + 2939) = 19 \times 27551 = 523469$.

To calculate the SSB we need the distance from each group sample mean to the total sample mean. The total sample mean is

$\bar{X} = \frac{909 + 856 + 845 + 820.5 + 831.5}{5} = 852.4$

$SSB = 20 \times \left((909-852.4)^2 + (856-852.4)^2 + (845-852.4)^2 + (820.5-852.4)^2 + (831.5-852.4)^2\right) = 20 \times 4725.7 = 94514$

We need to make an ANOVA table:

                  Sum of Squares   df             Mean square             F      Sig.
Between Groups    SSB = 94514      5 - 1 = 4      94514/4 = 23628.5       4.29   0.003
Within Groups     SSW = 523469     100 - 5 = 95   523469/95 = 5510.2
Total             617983           99

You see that the ratio 23628.5/5510.2 = 4.29 is quite far from one, and the p-value is about 0.003. Hence at the 5% level, since 0.003 < 0.05, there is evidence to reject the null. That is, based on the data, the population means of Trials 1, 2, 3, 4, 5 are not all the same.
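The ANOVA table for the speed of light data can be rebuilt from the summary statistics alone, keeping the factor $(20-1)$ in the SSW and the factor 20 in the SSB. The following is a minimal sketch in plain Python with illustrative variable names; it stops at the F ratio, since the tail probability of $F_{4,95}$ needs a statistics library (for instance, assuming SciPy is installed, `scipy.stats.f.sf(f_ratio, 4, 95)`).

```python
# Summary statistics for the five trials (299,000 already subtracted).
means = [909, 856, 845, 820.5, 831.5]
variances = [11009, 3741, 6257, 3605, 2939]
n = 20              # observations per trial
k = len(means)      # number of trials

# With equal sample sizes the overall mean is the average of the trial means.
grand_mean = sum(means) / k

# Within: (n - 1) * s_i^2 summed over trials.
ssw = (n - 1) * sum(variances)
# Between: n * (trial mean - overall mean)^2 summed over trials.
ssb = n * sum((m - grand_mean) ** 2 for m in means)

df_between = k - 1        # 4
df_within = k * n - k     # 95
msb = ssb / df_between
msw = ssw / df_within
f_ratio = msb / msw

print(f"Between: SS = {ssb:.0f}, df = {df_between}, MS = {msb:.1f}")
print(f"Within:  SS = {ssw:.0f}, df = {df_within}, MS = {msw:.1f}")
print(f"F = {f_ratio:.2f}")
```

With equal sample sizes the within mean square is simply the average of the five sample variances, which gives a quick check on the arithmetic.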