One-Way Analysis of Variance (ANOVA) Paul K. Strode, Ph.D.

Similar documents
The One-Way Repeated-Measures ANOVA. (For Within-Subjects Designs)

The One-Way Independent-Samples ANOVA. (For Between-Subjects Designs)

One-way ANOVA. Experimental Design. One-way ANOVA

CHAPTER 10 ONE-WAY ANALYSIS OF VARIANCE. It would be very unusual for all the research one might conduct to be restricted to

Introduction to the Analysis of Variance (ANOVA) Computing One-Way Independent Measures (Between Subjects) ANOVAs

ONE FACTOR COMPLETELY RANDOMIZED ANOVA

Difference in two or more average scores in different groups

20.0 Experimental Design

Econ 3790: Business and Economic Statistics. Instructor: Yogesh Uppal

PSYC 331 STATISTICS FOR PSYCHOLOGISTS

PLSC PRACTICE TEST ONE

AMS7: WEEK 7. CLASS 1. More on Hypothesis Testing Monday May 11th, 2015

Comparing Several Means: ANOVA

Analysis of Variance: Part 1

Analysis of Variance (ANOVA)

An inferential procedure to use sample data to understand a population Procedures

Hypothesis testing: Steps

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

The legacy of Sir Ronald A. Fisher. Fisher s three fundamental principles: local control, replication, and randomization.

Lecture 18: Analysis of variance: ANOVA

In a one-way ANOVA, the total sums of squares among observations is partitioned into two components: Sums of squares represent:

Introduction to Business Statistics QM 220 Chapter 12

Econometrics. 4) Statistical inference

Unit 27 One-Way Analysis of Variance

Sampling Distributions: Central Limit Theorem

Lab #12: Exam 3 Review Key

Design of Experiments. Factorial experiments require a lot of resources

10/31/2012. One-Way ANOVA F-test

ANOVA CIVL 7012/8012

HYPOTHESIS TESTING. Hypothesis Testing

Introduction to Analysis of Variance. Chapter 11

Hypothesis testing: Steps

STATS Analysis of variance: ANOVA

Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t =

Multiple t Tests. Introduction to Analysis of Variance. Experiments with More than 2 Conditions

Department of Economics. Business Statistics. Chapter 12 Chi-square test of independence & Analysis of Variance ECON 509. Dr.

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1)

Notes for Week 13 Analysis of Variance (ANOVA) continued WEEK 13 page 1

Lec 1: An Introduction to ANOVA

Chapter 8 Student Lecture Notes 8-1. Department of Economics. Business Statistics. Chapter 12 Chi-square test of independence & Analysis of Variance

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)

LAB 2. HYPOTHESIS TESTING IN THE BIOLOGICAL SCIENCES- Part 2

Comparing the means of more than two groups

Analysis of variance (ANOVA) Comparing the means of more than two groups

1. What does the alternate hypothesis ask for a one-way between-subjects analysis of variance?

Multiple comparisons - subsequent inferences for two-way ANOVA

Analysis of Variance

Analysis of variance (ANOVA) ANOVA. Null hypothesis for simple ANOVA. H 0 : Variance among groups = 0

Introduction to the Analysis of Variance (ANOVA)

Two-Sample Inferential Statistics

ANOVA Analysis of Variance

1 Descriptive statistics. 2 Scores and probability distributions. 3 Hypothesis testing and one-sample t-test. 4 More on t-tests

Calculating Fobt for all possible combinations of variances for each sample Calculating the probability of (F) for each different value of Fobt

Study Guide #3: OneWay ANALYSIS OF VARIANCE (ANOVA)

STAT Chapter 10: Analysis of Variance

One-Way ANOVA Source Table J - 1 SS B / J - 1 MS B /MS W. Pairwise Post-Hoc Comparisons of Means

Data Analysis and Statistical Methods Statistics 651

The t-test: A z-score for a sample mean tells us where in the distribution the particular mean lies

2 Hand-out 2. Dr. M. P. M. M. M c Loughlin Revised 2018

Single Sample Means. SOCY601 Alan Neustadtl

Statistics For Economics & Business

Statistics Primer. ORC Staff: Jayme Palka Peter Boedeker Marcus Fagan Trey Dejong

T.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS

7.2 One-Sample Correlation ( = a) Introduction. Correlation analysis measures the strength and direction of association between

Variance Estimates and the F Ratio. ERSH 8310 Lecture 3 September 2, 2009

Keppel, G. & Wickens, T.D. Design and Analysis Chapter 2: Sources of Variability and Sums of Squares

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8

2 and F Distributions. Barrow, Statistics for Economics, Accounting and Business Studies, 4 th edition Pearson Education Limited 2006

Lecture 14. Analysis of Variance * Correlation and Regression. The McGraw-Hill Companies, Inc., 2000

Lecture 14. Outline. Outline. Analysis of Variance * Correlation and Regression Analysis of Variance (ANOVA)

Independent Samples ANOVA

The t-statistic. Student s t Test

One-Way ANOVA. Some examples of when ANOVA would be appropriate include:

Statistics Introductory Correlation

Hypothesis T e T sting w ith with O ne O One-Way - ANOV ANO A V Statistics Arlo Clark Foos -

Factorial Analysis of Variance

Section 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples

Analysis of variance

CHAPTER 4 Analysis of Variance. One-way ANOVA Two-way ANOVA i) Two way ANOVA without replication ii) Two way ANOVA with replication

Keppel, G. & Wickens, T. D. Design and Analysis Chapter 4: Analytical Comparisons Among Treatment Means

Advanced Experimental Design

Multiple Pairwise Comparison Procedures in One-Way ANOVA with Fixed Effects Model

Battery Life. Factory

P-values and statistical tests 3. t-test

Review. One-way ANOVA, I. What s coming up. Multiple comparisons

8/23/2018. One-Way ANOVA F-test. 1. Situation/hypotheses. 2. Test statistic. 3.Distribution. 4. Assumptions

Note: k = the # of conditions n = # of data points in a condition N = total # of data points

Analysis of Variance

10/4/2013. Hypothesis Testing & z-test. Hypothesis Testing. Hypothesis Testing

Analysis of Variance (ANOVA)

QUEEN MARY, UNIVERSITY OF LONDON

An Old Research Question

HYPOTHESIS TESTING: THE CHI-SQUARE STATISTIC

Statistics and Quantitative Analysis U4320

9 One-Way Analysis of Variance

Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals

Statistical Inference for Means

Factorial Analysis of Variance

While you wait: Enter the following in your calculator. Find the mean and sample variation of each group. Bluman, Chapter 12 1

Lecture notes 13: ANOVA (a.k.a. Analysis of Variance)

Transcription:

One-Way Analysis of Variance (ANOVA) Paul K. Strode, Ph.D. Purpose While the T-test is useful to compare the means of two samples, many biology experiments involve the parallel measurement of three or more different samples. For example, rather than simply test the effect of the presence or absence of plants on the ability of goldfish to find and eat Daphnia, scientists might design an experiment in which Daphnia and goldfish are placed together in tanks with either no plants, underwater plans, or plants on the surface of the water. It is possible to perform three t-tests to discover if any of the three treatments is significantly different from any other. However, the more t- Tests that are performed on experimental data, the greater the chance that one of the tests will produce a statistically significant result by accident. A more careful strategy is to analyze all of the treatments together in one go, and the one-way analysis of variance (ANOVA) makes a single analysis possible. ANOVA allows for statistical tests concerning the means of three or more treatment conditions or levels. Specifically, ANOVA provides a test of the null statistical hypothesis (H 0 ) that the population means for three or more groups are equal ( ) and that any observed differences have occurred by chance. If at least one of the means differs significantly from all the others, then H 0 is rejected and the alternative hypothesis (H 1 ) is supported. In other words, one of the treatments (variations of the independent variable) is significantly different from the others. Ultimately, ANOVA compares the variance between the treatment means (called the mean square between : ) to the average variance within all the treatments (called the mean square within : ). If the variance between the treatment means is large and the variance within all the treatments is small, it is likely that the difference between the means reflects the differential treatment of the samples. However, if there is too much variance within each sample, it cannot be safely concluded that any observed differences between the treatment means are statistically significant. Equation and Sample Problem The test statistic that is calculated for the ANOVA is called the F-ratio. The F-ratio (F) is named for Ronald Fisher, the statistician who invented ANOVA in the 1920s. is calculated by determining the ratio of the variance between the treatment means ( ) to the average variance within all the treatments ( ). = Sample problem and steps for calculating mean squares and : Students wanted to test the hypothesis that phosphorus is a limiting factor for algal growth. The students predicted that growing algae in three different phosphate solutions would produce a treatment effect of increasing chlorophyll concentration in the algae with increasing phosphorus concentration. The null hypothesis for this experiment is that there is no effect of phosphorous concentration on algal growth. The data the students collected are in Table 1 below.

Table 1. Chlorophyll concentration (mg/m 3 ) after two weeks of growing algae in three different phosphorus treatments of 10, 15, and 20 mg/m 3. Total Phosphorus Concentration (mg/m 3 ) Sample # 10 15 20 Chlorophyll Concentration (mg/m 3 ) 1 2.9 5.7 11.3 2 3.7 6.1 9.8 3 2.9 4.9 10.6 4 4.0 5.5 10.9 5 4.1 5.4 10.4 6 3.6 5.9 12.0 7 3.4 5.1 10.9 8 3.7 4.7 10.1 9 3.0 6.1 9.7 10 2.7 6.9 11.1 11 4.1 7.2 11.4 12 3.9 6.1 10.2 Totals 42 69.6 128.4 Means = 3.5 = 5.8 = 10.7 Grand Total 240 Grand Mean = 6.7 1. Determine the treatment totals (sums), treatment means, the grand total, and grand mean. a. totals = 42, 69.6, and 128.4 b. means = 3.5, 5.8, and 10.7 c. Grand Total of all scores = 240 d. Grand Mean = 6.7 2. Determine the degrees of freedom ( ). a. Between groups = (number of groups 1) = (3 1) = 2 b. Within groups = (the number of treatments) X ( for each treatment) = 3(12 1) =33 3. Determine the total variation among all the values in the experiment ( ). This is done by calculating the sum of the squares of the difference of each score from the grand mean. a. Square each value and sum all the squares to get the Sum of Squares ( ). = 2.9 2 + 3.7 2 + 2.9 2 + 4.0 2 + 11.4 2 + 10.2 2 = 1,938.92 b. Square the Grand Total and divide it by the total number of observations. = = 1,600 c. Subtract the answer in 3b from the answer in 3a (SS) to get. = 1,938.92 1,600 = 338.92

4. Determine the Sum of the Squares within the groups ( ). To do this, sum the squares of the differences of each score from its group (treatment) mean. Square the quantity of each score minus its own column mean, then sum the values for all the subjects in the experiment: = (2.9 3.5) 2 + (3.7 3.5) 2 + (2.9 3.5) 2 + + (11.4 10.7) 2 + (10.2 10.7) 2 = 14.36 5. Determine the sum of the squares between the groups ( ). - = 338.92 14.36 = 324.56 6. Determine the mean squares between ( ) and mean squares within ( ) by dividing each by the between and within, respectively. = = = 162.28 = = = 0.435 7. Determine the. = = = 373 8. Compare the observed F-ratio ( ) with the critical value ( ) of at (alpha, the H 0 rejection level) = 0.05 and the between (numerator) and within (denominator) degrees of freedom = 2 and 33, respectively:. 9. Compare the to the critical values in Table 3 on the following page. 10. Create an ANOVA table to summarize the results. (Table 2 is an example.) Table 2. Results of one-way analysis of variance for the effect of phosphorus concentration on chlorophyll concentration. Source of Variation SS df MS F P-value F crit Between s 324.56 2 162.28 373 P < 0.001 3.28 Within s 14.36 33 0.435 Total 338.92 35

Table 3: Critical values for at a rejection level ( ) of 0.05. Degrees of Freedom (df) Between the s Degrees of Freedom Within the s 1 2 3 4 5 6 7 8 9 10 1 161.45 199.50 215.71 224.58 230.16 233.99 236.77 238.88 240.54 241.88 2 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38 19.40 3 10.13 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81 8.79 4 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00 5.96 5 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77 4.74 6 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10 4.06 7 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68 3.64 8 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39 3.35 9 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18 3.14 10 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02 2.98 11 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90 2.85 12 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80 2.75 13 4.67 3.81 3.41 3.18 3.03 2.92 2.83 2.77 2.71 2.67 14 4.60 3.74 3.34 3.11 2.96 2.85 2.76 2.70 2.65 2.60 15 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59 2.54 16 4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54 2.49 17 4.45 3.59 3.20 2.96 2.81 2.70 2.61 2.55 2.49 2.45 18 4.41 3.55 3.16 2.93 2.77 2.66 2.58 2.51 2.46 2.41 19 4.38 3.52 3.13 2.90 2.74 2.63 2.54 2.48 2.42 2.38 20 4.35 3.49 3.10 2.87 2.71 2.60 2.51 2.45 2.39 2.35 21 4.32 3.47 3.07 2.84 2.68 2.57 2.49 2.42 2.37 2.32 22 4.30 3.44 3.05 2.82 2.66 2.55 2.46 2.40 2.34 2.30 23 4.28 3.42 3.03 2.80 2.64 2.53 2.44 2.37 2.32 2.27 24 4.26 3.40 3.01 2.78 2.62 2.51 2.42 2.36 2.30 2.25 25 4.24 3.39 2.99 2.76 2.60 2.49 2.40 2.34 2.28 2.24 26 4.23 3.37 2.98 2.74 2.59 2.47 2.39 2.32 2.27 2.22 27 4.21 3.35 2.96 2.73 2.57 2.46 2.37 2.31 2.25 2.20 28 4.20 3.34 2.95 2.71 2.56 2.45 2.36 2.29 2.24 2.19 29 4.18 3.33 2.93 2.70 2.55 2.43 2.35 2.28 2.22 2.18 30 4.17 3.32 2.92 2.69 2.53 2.42 2.33 2.27 2.21 2.16 40 4.08 3.23 2.84 2.61 2.45 2.34 2.25 2.18 2.12 2.08 60 4.00 3.15 2.76 2.53 2.37 2.25 2.17 2.10 2.04 1.99 120 3.92 3.07 2.68 2.45 2.29 2.18 2.09 2.02 1.96 1.91 3.84 3.00 2.60 2.37 2.21 2.10 2.01 1.94 1.88 1.83

Summary Table 3, critical values for, shows us the F-ratio the students could obtain by accident 5% of the time in an ANOVA test for various combinations of degrees of freedom between and within groups and treatments. In Table 3, is 3.28 for data that include 2 and 33 degrees of freedom between and within groups, respectively. The calculated in the ANOVA for the phosphorus-chlorophyll data is 373. This number is much higher than 3.28, so the probability ( ) of getting a value as large as 373 if H 0 is true is less than 0.05. Therefore, the H 0 (statistical null hypothesis) is rejected (that the treatment means are equal, and any difference occurred by chance) and it is reasonable to conclude that the phosphorus treatment had a statistically significant effect on algae growth. Therefore, the experimental hypothesis has been supported by the results of this experiment and the ANOVA statistical test. Solve this Problem A student observed that just after bean plants (Phaseolus vulgaris) germinate, their cotyledons slowly shrivel. She hypothesized that the young plants are using energy from their cotyledons to fuel extra growth and predicted that bean plants with either one or both cotyledons removed just after germination would be shorter after three weeks relative to bean plants with their cotyledons intact. 1. What is the null statistical hypothesis (H 0 ) for this experiment? 2. Perform the ANOVA on the student s data to test the H 0. 3. Write a short conclusion about what the results of the ANOVA suggest about the student s experimental hypothesis and statistical hypothesis. Table 4. Heights (cm) of bean plants with 0, 1, or 2 cotyledons removed, three weeks after germination. Number of Cotyledons Removed from Plant Plant # 0 1 2 1 10.3 9.7 6.5 2 9.6 10.2 4.6 3 10.0 6.8 7.9 4 9.8 8.3 8.1 5 10.1 7.8 5.8 6 8.4 9.9 5.9 7 12.4 10 6.0 8 7.9 8.0 7.1 9 8.8 7.8 5.2 Totals Means = = = Grand Total Grand Mean = Table 5. ANOVA results of the effect of removing cotyledons on growth in bean plants. Source of Variation SS df MS F P-value F crit Between s Within s Total