Analysis of Variance

Read Chapter 14 and Sections 15.1-15.2 to review one-way ANOVA.

Design of an experiment: the process of planning an experiment to ensure that an appropriate analysis is possible.

Some important steps:
1. Statement of experimental objectives.
2. Statement of how much information is required about the relevant parameters.
3. Selection of an experimental design.
4. Selection of an appropriate analysis and sample size.
Definitions

Experimental unit: the basic unit on which a measurement is made.

Factors: different sets of conditions to which the experimental units are subjected. Any factor will have at least two different levels.

Treatment: a specific combination of the levels of the different factors. The number of experimental units to which a particular treatment is applied is called the number of replications of that treatment.

Quantitative factor: a factor whose levels correspond to a numerical (at least ordinal) scale.

Qualitative factor: a factor whose levels correspond to a nominal-level scale.
Example

A chemist wants to determine the effect of air exposure time and relative humidity on the weight loss of a 5 gram portion of a compound. Four different exposure times and the relative humidities 0.20, 0.30 and 0.40 were of interest. In addition, weight loss was determined by three different analysts.

An experimental unit is a single 5 gram portion of compound.

The experiment has three factors:
exposure time, a quantitative variable with four levels,
relative humidity, a quantitative variable with three levels, and
analyst, a qualitative variable with three levels.

The number of treatments in this experiment is 4 × 3 × 3 = 36.

If multiple replications of a given treatment are done, what might cause variation in the results?
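As a quick check of the treatment count in the example, the treatments can be enumerated as every combination of one level from each factor. This is only a sketch: the humidity levels come from the example, but the exposure-time labels and analyst names below are hypothetical placeholders, since the example does not list them.

```python
from itertools import product

# Humidity levels are from the example; the exposure-time labels and
# analyst names are hypothetical placeholders (not given in the example).
exposure_times = ["t1", "t2", "t3", "t4"]
humidities = [0.20, 0.30, 0.40]
analysts = ["analyst A", "analyst B", "analyst C"]

# A treatment is one specific combination of levels, one per factor.
treatments = list(product(exposure_times, humidities, analysts))

print(len(treatments))  # 4 * 3 * 3 = 36
print(treatments[0])    # first combination in the enumeration
```

With r replications per treatment, the experiment would need 36 × r portions of compound.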
Randomization

Formal idea of randomization introduced by R.A. Fisher in the 1930s.

Suppose we're comparing two treatments. The treatments are to be applied to 16 distinct experimental units.

Ideal case: The experimental units are identical in every respect. Simply assign Treatment 1 to any 8 of the units and Treatment 2 to the other 8.

Case requiring complete randomization: The experimental units are known to differ in some way, but it isn't possible to distinguish among them. Randomly assign Treatment 1 to 8 units and Treatment 2 to the other 8. This is a reasonable way to ensure that, with high likelihood, the two treatments are applied to groups of experimental units with similar mixes of characteristics.

Note: It's probably wise to use randomization even in the ideal case, to guard against unanticipated differences between units.

Blocking: Suppose the 16 experimental units are different in an obvious way, but can be arranged into four categories, or blocks, such that the four units within a block are similar. For each block, randomly assign Treatment 1 to two units and Treatment 2 to the other two.
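The two assignment schemes for the 16 units can be sketched in a few lines. The unit labels 1 to 16, the grouping of consecutive labels into blocks, and the function names are all assumptions made for illustration; in practice the blocks would be formed from whatever obvious characteristic distinguishes the units.

```python
import random

def completely_randomize(units, rng):
    """Randomly split the units into two equal-sized treatment groups."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {1: sorted(shuffled[:half]), 2: sorted(shuffled[half:])}

def randomize_within_blocks(blocks, rng):
    """Within each block, randomly give half the units to each treatment."""
    assignment = {1: [], 2: []}
    for block in blocks:
        split = completely_randomize(block, rng)
        assignment[1].extend(split[1])
        assignment[2].extend(split[2])
    return assignment

rng = random.Random(0)  # fixed seed only so the sketch is reproducible
units = list(range(1, 17))  # hypothetical labels for the 16 units

# Complete randomization: 8 units per treatment, chosen at random.
crd = completely_randomize(units, rng)

# Blocking: assume units 1-4, 5-8, 9-12, 13-16 form the four blocks.
blocks = [units[i:i + 4] for i in range(0, 16, 4)]
blocked = randomize_within_blocks(blocks, rng)

print(crd)
print(blocked)
```

Under blocking, every block contributes exactly two units to each treatment, so the treatment groups are balanced with respect to the blocking characteristic by construction rather than only with high likelihood.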
Completely Randomized Design (CRD)

In a CRD, one randomly assigns $n_1, n_2, \ldots, n_t$ experimental units to treatments $1, 2, \ldots, t$, respectively.

CRD is equivalent to selecting $t$ random samples, one from each of $t$ populations, in such a way that the samples are independent of each other. The $n_i$ responses to treatment $i$ represent a random sample from a population of possible responses.

After a treatment has been applied to an experimental unit, a response or measurement is recorded from that unit. We denote the $j$th response from the $i$th treatment by $X_{ij}$, $j = 1, 2, \ldots, n_i$, $i = 1, 2, \ldots, t$.
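A CRD with possibly unequal group sizes can be produced by shuffling the units once and cutting the shuffled list into consecutive pieces of sizes $n_1, \ldots, n_t$. The function name `assign_crd`, the unit labels, and the sizes 3, 3, 4 are hypothetical choices for this sketch.

```python
import random

def assign_crd(units, sizes, seed=None):
    """Randomly assign sizes[i-1] units to treatment i (i = 1, ..., t)."""
    units = list(units)
    if sum(sizes) != len(units):
        raise ValueError("sizes must sum to the number of units")
    rng = random.Random(seed)
    rng.shuffle(units)
    assignment = {}
    start = 0
    for treatment, n_i in enumerate(sizes, start=1):
        # The next n_i shuffled units receive this treatment.
        assignment[treatment] = units[start:start + n_i]
        start += n_i
    return assignment

# Ten hypothetical units, three treatments with n_1 = 3, n_2 = 3, n_3 = 4.
design = assign_crd(range(1, 11), sizes=[3, 3, 4], seed=1)
print(design)
```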
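Data Structure

          Observations                   Total              Mean
Trt. 1    $X_{11}, \ldots, X_{1n_1}$     $X_{1\cdot}$       $\bar{X}_{1\cdot} = X_{1\cdot}/n_1$
Trt. 2    $X_{21}, \ldots, X_{2n_2}$     $X_{2\cdot}$       $\bar{X}_{2\cdot} = X_{2\cdot}/n_2$
...
Trt. t    $X_{t1}, \ldots, X_{tn_t}$     $X_{t\cdot}$       $\bar{X}_{t\cdot} = X_{t\cdot}/n_t$
                                         $X_{\cdot\cdot}$   $\bar{X}_{\cdot\cdot} = X_{\cdot\cdot}/N$

$N = n_1 + n_2 + \cdots + n_t$

Important decomposition

$$X_{ij} = \bar{X}_{\cdot\cdot} + (\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot}) + (X_{ij} - \bar{X}_{i\cdot})$$

$\bar{X}_{\cdot\cdot}$: Grand mean
$\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot}$: Deviation due to treatment
$X_{ij} - \bar{X}_{i\cdot}$: Residual, or deviation due to differences among experimental units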
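To see the decomposition at work, the sketch below splits each observation into grand mean, treatment deviation, and residual. The data are made up for illustration (three treatments with 3, 2, and 4 observations) and do not come from any real experiment.

```python
# Made-up responses for t = 3 treatments with n_1 = 3, n_2 = 2, n_3 = 4.
data = {
    1: [5.0, 7.0, 6.0],
    2: [8.0, 10.0],
    3: [1.0, 2.0, 3.0, 3.0],
}

N = sum(len(xs) for xs in data.values())
grand_mean = sum(sum(xs) for xs in data.values()) / N
trt_means = {i: sum(xs) / len(xs) for i, xs in data.items()}

# One row per observation: (treatment, X_ij, grand mean,
# deviation due to treatment, residual).
rows = []
for i, xs in data.items():
    for x in xs:
        trt_dev = trt_means[i] - grand_mean
        resid = x - trt_means[i]
        rows.append((i, x, grand_mean, trt_dev, resid))

for row in rows:
    print(row)
```

In every row the last three entries add back up to the observation itself, which is all the decomposition asserts.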
Parametric analog of previous decomposition

$$X_{ij} = \mu + \alpha_i + \epsilon_{ij}, \quad i = 1, 2, \ldots, t, \quad j = 1, \ldots, n_i.$$

$\mu$: a constant common to all the $X_{ij}$
$\alpha_i$: a constant representing the effect due to treatment $i$
$\epsilon_{ij}$: a random variable representing an error term, i.e., error due to differences among experimental units

Usual assumptions about errors:
1. Each $\epsilon_{ij}$ has a normal distribution with mean 0 and variance $\sigma^2$.
2. All the error terms are independent of each other.
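Decomposition of sums of squares

For each observation,

$$X_{ij} - \bar{X}_{\cdot\cdot} = (\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot}) + (X_{ij} - \bar{X}_{i\cdot}).$$

The sum of all $(X_{ij} - \bar{X}_{\cdot\cdot})^2$ is called $TSS$, for total sum of squares:

$$TSS = \sum_{i,j} (X_{ij} - \bar{X}_{\cdot\cdot})^2.$$

Define $SST$ and $SSE$ by

$$SST = \sum_{i} n_i (\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2 \quad \text{and} \quad SSE = \sum_{i,j} (X_{ij} - \bar{X}_{i\cdot})^2.$$

Important identity

$$TSS = SST + SSE$$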
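The identity can be checked numerically. The sketch below computes the three sums of squares for a small made-up data set; the function name `sums_of_squares` and the data are assumptions for illustration only.

```python
def sums_of_squares(data):
    """Return (TSS, SST, SSE) for a dict mapping treatment -> responses."""
    N = sum(len(xs) for xs in data.values())
    grand_mean = sum(sum(xs) for xs in data.values()) / N
    tss = sst = sse = 0.0
    for xs in data.values():
        n_i = len(xs)
        mean_i = sum(xs) / n_i
        # Between-treatment contribution, weighted by group size.
        sst += n_i * (mean_i - grand_mean) ** 2
        for x in xs:
            tss += (x - grand_mean) ** 2  # total deviation
            sse += (x - mean_i) ** 2      # within-treatment residual
    return tss, sst, sse

# Made-up responses for three treatments.
data = {
    1: [5.0, 7.0, 6.0],
    2: [8.0, 10.0],
    3: [1.0, 2.0, 3.0, 3.0],
}

tss, sst, sse = sums_of_squares(data)
print(tss, sst, sse)  # 72.0 65.25 6.75
```

Here 65.25 + 6.75 = 72, as the identity requires. The cross-product term vanishes because the residuals within each treatment sum to zero.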
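Testing for no treatment effect

Of central importance in the CRD is testing the hypothesis of no treatment effect, which in parametric terms is

$$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_t = 0.$$

The alternative hypothesis is

$$H_1: \alpha_i \neq 0 \text{ for at least one } i.$$

Define

$$MST = \frac{SST}{t - 1} \quad \text{and} \quad MSE = \frac{SSE}{N - t},$$

where $N = n_1 + \cdots + n_t$ is the total number of observations. We test $H_0$ vs. $H_1$ using the $F$-statistic

$$F = MST/MSE.$$

$H_0$ is rejected at level of significance $\alpha$ if $F \geq F_{t-1,\,N-t,\,\alpha}$, the upper-$\alpha$ critical value of the $F$ distribution with $t - 1$ and $N - t$ degrees of freedom.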
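A minimal sketch of the test on made-up data follows. The function name `f_statistic` is hypothetical, and the critical value 5.14 for 2 and 6 degrees of freedom at the 0.05 level is taken from a standard F table rather than computed; a statistics library could supply it instead.

```python
def f_statistic(data):
    """Return (F, df_treatment, df_error) for a dict treatment -> responses."""
    t = len(data)
    N = sum(len(xs) for xs in data.values())
    grand_mean = sum(sum(xs) for xs in data.values()) / N
    sst = sse = 0.0
    for xs in data.values():
        mean_i = sum(xs) / len(xs)
        sst += len(xs) * (mean_i - grand_mean) ** 2
        sse += sum((x - mean_i) ** 2 for x in xs)
    mst = sst / (t - 1)  # mean square for treatments
    mse = sse / (N - t)  # mean square for error
    return mst / mse, t - 1, N - t

# Made-up responses for t = 3 treatments, N = 9 observations.
data = {
    1: [5.0, 7.0, 6.0],
    2: [8.0, 10.0],
    3: [1.0, 2.0, 3.0, 3.0],
}

F, df1, df2 = f_statistic(data)

# Tabled upper 5% point of the F distribution with (2, 6) degrees of
# freedom; looked up, not computed here.
critical_value = 5.14
reject_h0 = F >= critical_value

print(F, df1, df2, reject_h0)
```

For these data MST = 65.25/2 and MSE = 6.75/6, so F = 29, which is well beyond the tabled critical value and would lead to rejecting the hypothesis of no treatment effect at the 0.05 level.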