CHAPTER 4
Analysis of Variance

One-way ANOVA
Two-way ANOVA
  i) Two-way ANOVA without replication
  ii) Two-way ANOVA with replication
Introduction
In this chapter, we expand the idea of hypothesis tests. We describe a test of variances and then a test that simultaneously compares several sample means to determine whether they came from populations with equal means.
Key Concepts
- ANOVA is analysis of variance.
- ANOVA can be used to analyze the data obtained from experimental or observational studies.
- A factor is a variable that the experimenter has selected for investigation.
- A level is the intensity setting of a factor.
- A treatment is a specific combination of factor levels.
- Experimental units are the objects of interest in the experiment.
- Variation between treatment groups captures the effect of the treatment.
- Variation within treatment groups represents random error not explained by the experimental treatments.
Example 4.1
Suppose that the experimenter began by randomly selecting 20 men and 20 women for the experiment. Each of these two groups was then randomly divided into two groups of ten, one experimental and one control. What are the factors, levels and treatments in this experiment?

Solution:
There are now two factors of interest, and each factor has two levels:
i. Gender at two levels: men and women
ii. Meal at two levels: breakfast and no breakfast
In this more complex experiment, there are four treatments, one for each specific combination of factor levels: men without breakfast, men with breakfast, women without breakfast and women with breakfast.
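The way treatments arise as combinations of factor levels in Example 4.1 can be sketched in Python. This is only an illustration: the dictionary layout and the variable names are assumptions made for the sketch, not part of the original example.

```python
from itertools import product

# Factors and their levels from Example 4.1
# (the dictionary layout is an illustrative choice).
factors = {
    "gender": ["men", "women"],
    "meal": ["no breakfast", "breakfast"],
}

# A treatment is one specific combination of factor levels,
# so the treatments are the Cartesian product of the level lists.
treatments = list(product(*factors.values()))

for gender, meal in treatments:
    print(f"{gender}, {meal}")

print("number of treatments:", len(treatments))
```

With two factors at two levels each, the product has 2 x 2 = 4 entries, matching the four treatments listed in the solution.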
One-way ANOVA
The one-way analysis of variance allows us to compare several groups of observations and test whether or not their population means are equal. One-way ANOVA is also known as the Completely Randomized Design (CRD). This design involves only one factor.
The application of one-way ANOVA requires that the following assumptions hold true:
(i) The populations from which the samples are drawn are (approximately) normally distributed.
(ii) The populations from which the samples are drawn have the same variance.
(iii) The samples drawn from different populations are random and independent.
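Each observation may be written as:
$$y_{ij} = \mu_i + \varepsilon_{ij}$$
or alternatively as:
$$y_{ij} = \mu + \tau_i + \varepsilon_{ij}$$
where $i = 1, 2, \ldots, k$ and $j = 1, 2, \ldots, n_i$, and
$y_{ij}$: the $j$-th observation from the $i$-th treatment
$\mu$: the overall mean
$\mu_i$: the $i$-th treatment mean
$\tau_i$: the effect of the $i$-th treatment
$\varepsilon_{ij}$: random error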
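The two ways of writing the model are linked by a simple substitution. As a sketch, assuming the usual reading in which $\mu$ is an overall mean and $\tau_i$ is the deviation of the $i$-th treatment mean from it:

```latex
\mu_i = \mu + \tau_i
\quad\Longrightarrow\quad
y_{ij} = \mu_i + \varepsilon_{ij} = \mu + \tau_i + \varepsilon_{ij},
\qquad i = 1,\ldots,k,\; j = 1,\ldots,n_i .
```

If $\mu$ is taken as an average of the treatment means, then all the $\mu_i$ being equal is the same as every $\tau_i$ being zero, which is why the hypotheses can be stated in either form.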
$Y_{i.}$ is the total of all observations from the $i$-th treatment, while $Y_{..}$ is the grand total of all $N$ observations.

Treatment    1        2        ...   i        ...   k
             y_11     y_21     ...   y_i1     ...   y_k1
             y_12     y_22     ...   y_i2     ...   y_k2
             ...      ...            ...            ...
             y_1n1    y_2n2    ...   y_ini    ...   y_knk
Total        Y_1.     Y_2.     ...   Y_i.     ...   Y_k.     Y..

Then the hypotheses can be written as:
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k \quad \text{(all the population means are equal)}$$
$$H_1: \mu_i \neq \mu_j \text{ for at least one pair } i, j \quad \text{(at least one of the means is not equal)}$$
Note: These hypotheses are used for the model $y_{ij} = \mu_i + \varepsilon_{ij}$.
For the model $y_{ij} = \mu + \tau_i + \varepsilon_{ij}$, the hypotheses are as follows:
$$H_0: \tau_1 = \tau_2 = \cdots = \tau_k = 0 \quad \text{(there is no treatment effect)}$$
$$H_1: \tau_i \neq 0 \text{ for at least one } i \quad \text{(a treatment effect exists)}$$

The computations for an analysis of variance problem are usually summarized in tabular form, as shown in the table below. This table is referred to as the ANOVA table.

Source of Variation          Sum of Squares   Degrees of Freedom   Mean Square             F Calculated
Treatment (between levels)   SSTR             k - 1                MSTR = SSTR / (k - 1)   F_cal = MSTR / MSE
Error (within levels)        SSE              N - k                MSE = SSE / (N - k)
Total                        SST              N - 1
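where
$$SST = \sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij}^2 - \frac{Y_{..}^2}{N}$$
$$SSTR = \sum_{i=1}^{k} \frac{Y_{i.}^2}{n_i} - \frac{Y_{..}^2}{N}$$
$$SSE = SST - SSTR$$
$k$ = number of treatments
$N$ = total number of observations

We reject $H_0$ if $F_{cal} > f_{\alpha,\,k-1,\,N-k}$ and conclude that some of the variation in the data is due to differences in the treatment levels.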
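The computational formulas for SST, SSTR and SSE can be sketched as a small Python function. This is a minimal illustration, not a library routine: the function name `one_way_anova`, the dictionary keys and the toy data are all assumptions introduced here.

```python
def one_way_anova(groups):
    """Return one-way ANOVA quantities for a list of samples (one list per treatment)."""
    k = len(groups)                                  # number of treatments
    N = sum(len(g) for g in groups)                  # total number of observations
    grand_total = sum(sum(g) for g in groups)        # Y..
    correction = grand_total ** 2 / N                # Y..^2 / N

    sst = sum(y ** 2 for g in groups for y in g) - correction
    sstr = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    sse = sst - sstr

    mstr = sstr / (k - 1)
    mse = sse / (N - k)
    return {
        "SST": sst, "SSTR": sstr, "SSE": sse,
        "df_treatment": k - 1, "df_error": N - k,
        "MSTR": mstr, "MSE": mse, "F": mstr / mse,
    }


# Made-up numbers, chosen only to keep the arithmetic easy to follow.
toy = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
table = one_way_anova(toy)
for name, value in table.items():
    print(name, value)
```

The function does not look up the critical value; the resulting F would still be compared with the tabulated $f_{\alpha,\,k-1,\,N-k}$ to reach a decision.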
Example 4.2
Three different types of acid can be used in a particular chemical process. The resulting yields (in %) from several batches using the different types of acid are given below:

Acid    A     B     C
        93    95    76
        95    97    77
        74    87    84

Test whether or not the three populations appear to have equal means using $\alpha = 0.05$.
Solution:
1. Construct the table of calculation, with $N = 9$ and $k = 3$:

Acid    A             B             C
        93            95            76
        95            97            77
        74            87            84
Total   Y_1. = 262    Y_2. = 279    Y_3. = 237    Y.. = 778

2. Set up the hypotheses:
$$H_0: \mu_A = \mu_B = \mu_C$$
$$H_1: \mu_i \neq \mu_j \text{ for at least one pair } i, j$$
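3. Construct the ANOVA table:
$$SST = \sum_{i=1}^{k} \sum_{j=1}^{n_i} y_{ij}^2 - \frac{Y_{..}^2}{N} = \left(93^2 + 95^2 + 74^2 + \cdots + 76^2 + 77^2 + 84^2\right) - \frac{778^2}{9} = 67914 - 67253.7778 = 660.2222$$
$$SSTR = \sum_{i=1}^{k} \frac{Y_{i.}^2}{n_i} - \frac{Y_{..}^2}{N} = \frac{262^2}{3} + \frac{279^2}{3} + \frac{237^2}{3} - \frac{778^2}{9} = 67551.3333 - 67253.7778 = 297.5556$$
$$SSE = SST - SSTR = 660.2222 - 297.5556 = 362.6667$$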
Source of Variation          Sum of Squares   Degrees of Freedom   Mean Square                 F Calculated
Treatment (between levels)   297.5556         3 - 1 = 2            297.5556 / 2 = 148.7778     F_cal = 148.7778 / 60.4444 = 2.4614
Error (within levels)        362.6667         9 - 3 = 6            362.6667 / 6 = 60.4444
Total                        660.2222         9 - 1 = 8

4. At $\alpha = 0.05$, from the statistical table for the F distribution, we have $f_{0.05,\,2,\,6} = 5.14$.
5. Since $F_{cal} = 2.4614 < f_{0.05,\,2,\,6} = 5.14$, we fail to reject $H_0$ and conclude that there is insufficient evidence of a difference in mean yield among the three types of acid at the $\alpha = 0.05$ significance level.
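The figures in this solution can be cross-checked with the definitional (deviation) forms of the sums of squares, which should agree with the shortcut formulas used above. The sketch below uses only the Python standard library; the variable names are illustrative, and the critical value 5.14 is simply copied from step 4 rather than computed.

```python
from statistics import mean

# Yields (%) from Example 4.2, one list per acid.
yields = {
    "A": [93, 95, 74],
    "B": [95, 97, 87],
    "C": [76, 77, 84],
}

all_values = [y for ys in yields.values() for y in ys]
grand_mean = mean(all_values)
k, N = len(yields), len(all_values)

# Deviation forms: between-group and within-group sums of squares.
sstr = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in yields.values())
sse = sum((y - mean(ys)) ** 2 for ys in yields.values() for y in ys)

f_cal = (sstr / (k - 1)) / (sse / (N - k))
f_crit = 5.14  # tabulated f(0.05; 2, 6), taken from the worked solution

print("SSTR:", round(sstr, 4))
print("SSE:", round(sse, 4))
print("F_cal:", round(f_cal, 4))
print("reject H0" if f_cal > f_crit else "fail to reject H0")
```

Both routes give the same sums of squares up to rounding, so the choice between them is a matter of convenience: the shortcut formulas suit hand calculation, while the deviation forms show more directly what "between" and "within" variation mean.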
Anova: Single Factor

SUMMARY
Groups      Count   Sum   Average    Variance
Column 1    3       262   87.33333   134.3333
Column 2    3       279   93         28
Column 3    3       237   79         19

ANOVA
Source of Variation   SS         df   MS         F          P-value   F crit
Between Groups        297.5556   2    148.7778   2.461397   0.16575   5.143253
Within Groups         362.6667   6    60.44444
Total                 660.2222   8
Output from Excel
Compare the calculated values from the example with the Excel output:

Groups      Count   Sum   Average   Variance
Column 1    3       262   87.333    134.333
Column 2    3       279   93        28
Column 3    3       237   79        19

ANOVA
Source of Variation   SS    df   MS       F        P-value   F crit
Between Groups        298   2    148.78   2.4614   0.1657    5.143
Within Groups         363   6    60.444
Total                 660   8

The test statistic: F = 2.4614
The p-value: P-value = 0.1657
The critical bound: F crit = 5.143
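The P-value and F crit entries in the Excel output can be reproduced by hand in this particular case. When the numerator degrees of freedom equal 2, the upper tail of the F distribution has a simple closed form, P(F > f) = (df2 / (df2 + 2f))^(df2 / 2). The sketch below relies on that special case only and does not apply to other numerator degrees of freedom; the function names are hypothetical.

```python
def f_upper_tail_df1_2(f, df2):
    """P(F > f) for an F distribution with 2 and df2 degrees of freedom."""
    return (df2 / (df2 + 2.0 * f)) ** (df2 / 2.0)


def f_critical_df1_2(alpha, df2):
    """Value f with P(F > f) = alpha, for 2 and df2 degrees of freedom."""
    # Obtained by solving the tail formula above for f.
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)


f_cal = 2.461397   # test statistic from the Excel output
df_error = 6       # within-groups degrees of freedom

p_value = f_upper_tail_df1_2(f_cal, df_error)
f_crit = f_critical_df1_2(0.05, df_error)

print("p-value:", round(p_value, 5))
print("F crit:", round(f_crit, 6))
```

The two decision rules agree: the p-value exceeds 0.05 exactly when the test statistic falls below the critical bound, so either comparison leads to the same conclusion about $H_0$.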
Exercise 4.1
Four catalysts that may affect the concentration of one component in a three-component liquid mixture are being investigated. The following concentrations are obtained.

Catalyst: 1 2 3 4
58.2 56.3 50.1 52.9 57.2 54.5 54.2 49.9 58.4 57.0 55.4 50.0 55.8 55.3 51.7 54.9
Totals: Y_1. Y_2. Y_3. Y_4. Y..

Compute a one-way analysis of variance for this experiment, test the hypothesis at the 0.05 level of significance, and state your conclusion concerning the effect of the catalyst on the concentration of one component in the three-component liquid mixture.
Exercise 4.2
The following is sample information. Test the hypothesis that the treatment means are equal. Use $\alpha = 0.05$. (Answer: Reject $H_0$)

Treatment 1   Treatment 2   Treatment 3
8             3             3
6             2             4
10            4             5
9             3             4