OHSU OGI Class ECE-580-DOE: Design of Experiments, Steve Brainerd

Why We Use Analysis of Variance to Compare Group Means and How it Works

The question of how to compare the population means of more than two groups is an important one to researchers. Let us suppose that we are testing three new drugs against a control group in their effect on some type of medical problem. Thus, we are comparing four groups:

    Control Group
    Drug A Treated Group
    Drug B Treated Group
    Drug C Treated Group

Let us assume that we draw a sample of people from some population of people with a particular medical problem and randomly assign them to the four groups, so each person is equally likely to be placed into any of the four groups. Because the assignment is random, the researcher has no control over who goes into which group.

12/30/02 DOE_Class_2003, ECE-5XXX

ANOVA Basics and Examples

A drug is administered to the patients in such a manner that neither the administrator of the drug nor the patient knows which group the patient was assigned to or which drug (or placebo) the patient is receiving. After some period of time the patient is measured on some variable that captures the medical effect of interest. The question is: How do we proceed with the analysis of the data?

Most beginning researchers who know about two-sample tests would propose performing a whole series of t-tests: comparing, for example, the Control Group with each of the Drug groups, then comparing the Drug groups with each other. The number of t-tests required to make all possible comparisons for the four groups would be six:

    Control with A
    Control with B
    Control with C
    A with B
    A with C
    B with C

ANOVA

Now each of these t-tests has an α level. Let us presume that we predetermine that α should be set at .05, so we have a reasonable β. Because we are performing statistical tests on the same set of data more than once, the tests are no longer independent and the α probabilities accumulate. With six tests at α = 0.05, the overall α level for the analysis of this experiment could be as high as 6 × 0.05 = 0.30. Clearly this is unacceptable for most research.

The problem can be reduced by using a smaller α level for each test, but when this is done the probability of committing a Type II error balloons to unacceptable levels. Additionally, if we have many groups (more than just four), the overall α level becomes unacceptable very quickly.

What we want is a statistical procedure that tests the equality of all of the means in a single test:

$$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$$
$$H_1: \text{at least one pair of means is unequal}$$
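The arithmetic behind the six comparisons and the inflated overall α can be sketched in a few lines of Python. This is only an illustration: the variable names are made up for the example, and the second figure (the rate if the six tests were independent) is a standard textbook formula that the slides do not state; it is included as an assumption for comparison with the additive upper bound of 0.30.

```python
from itertools import combinations

# The four groups from the drug example.
groups = ["Control", "Drug A", "Drug B", "Drug C"]
alpha = 0.05  # per-test significance level used in the slides

# Every unordered pair of groups is one t-test.
pairs = list(combinations(groups, 2))
num_tests = len(pairs)

# Additive upper bound on the overall alpha (the "could be as high as" figure).
upper_bound = min(1.0, num_tests * alpha)

# Assumption (not in the slides): overall rate if the tests were independent.
independent_rate = 1.0 - (1.0 - alpha) ** num_tests

for first, second in pairs:
    print(f"{first} with {second}")
print(f"tests: {num_tests}")
print(f"upper bound on overall alpha: {upper_bound:.2f}")
print(f"overall alpha if tests were independent: {independent_rate:.3f}")
```

Either way the overall error rate is several times the nominal 0.05, which is the motivation for testing all means at once.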

ANOVA

This is the hypothesis that is tested with a completely randomized analysis of variance, or ANOVA. Notice that when the test is finished we still don't know which pair (or pairs) of means are unequal. That is determined afterwards by a post hoc comparison, which identifies the pair (or pairs) of means that are significantly different from each other.

Why is this hypothesis test about multiple means called analysis of variance? The answer is that it is possible to infer what is happening to population means by examining and analyzing the variability, or variance, of the data. If we were to examine the data, we would find that the overall variability of the data, as measured by the variance, can be calculated in several ways.

Way One - we measure the overall variability of the data using a formula such as

$$s^2 = \frac{\sum_{j}\sum_{i}\left(x_{ij}-\bar{\bar{x}}\right)^2}{N-1}$$

where $x_{ij}$ is observation $i$ in group $j$ and $N$ is the total number of observations. In this formula, one can easily see that the variance is computed by summing the squared differences from the overall mean to each piece of data. In our four-group experiment, this would be like finding the variability across all four groups by using the mean across the four groups, denoted by $\bar{\bar{x}}$.

ANOVA

Way Two - we partition the variance into several components. We could, for instance, find the variability for each group and add them. For group $j$, with group mean $\bar{x}_j$ and $n_j$ observations, this would give us something like the following:

$$s_j^2 = \frac{\sum_{i}\left(x_{ij}-\bar{x}_j\right)^2}{n_j-1}$$

We would have such an equation for each of the four groups. Thus, another way to estimate the overall variability would be to add the variability of the individual groups, measured by their sums of squared deviations. We would then have something like:

$$\sum_{j=1}^{4}\sum_{i}\left(x_{ij}-\bar{x}_j\right)^2$$

ANOVA

Now, if each of the group means is identically equal to the overall mean across all groups, the sum of squared deviations we computed in Way One will be exactly equal to what we compute in Way Two. When the group means are not all equal to the overall mean, the Way One result will be larger than the Way Two result. The reason is that Way One includes some variability that comes from the differences between the individual group means and the overall mean. This can be shown in the following equation, where $n_j$ is the number of observations in group $j$, $\bar{x}_j$ is the group mean and $\bar{\bar{x}}$ is the overall mean:

$$\sum_{j}\sum_{i}\left(x_{ij}-\bar{\bar{x}}\right)^2 = \sum_{j} n_j\left(\bar{x}_j-\bar{\bar{x}}\right)^2 + \sum_{j}\sum_{i}\left(x_{ij}-\bar{x}_j\right)^2$$

The total variation is equal to the sum of the between-group and within-group variation:

$$SS_{total} = SS_{between} + SS_{within}$$

ANOVA

If we were to compute a ratio of the between-sample variation divided by the within-sample variation, each divided by its degrees of freedom, we would get a ratio like the following for $k$ groups and $N$ total observations:

$$F = \frac{SS_{between}/(k-1)}{SS_{within}/(N-k)}$$

If you look at this equation, you should notice that as the mean difference between groups increases, the numerator becomes large relative to the denominator. The between variation depends on the differences between the group means and the overall mean, while the denominator is relatively independent of these differences.
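The partition of the total variation and the resulting ratio can be sketched numerically. The measurements below are hypothetical values invented purely for illustration (the slides give no data for the drug example), and the variable names are likewise assumptions; the point is only to show that the total sum of squares splits into a between-group and a within-group part and that their ratio, after dividing by degrees of freedom, is the F statistic.

```python
import math

# Hypothetical measurements for the four groups (made-up numbers, illustration only).
groups = {
    "Control": [10.0, 12.0, 11.0, 13.0],
    "Drug A": [14.0, 15.0, 13.0, 16.0],
    "Drug B": [11.0, 10.0, 12.0, 11.0],
    "Drug C": [17.0, 18.0, 16.0, 19.0],
}

all_values = [x for values in groups.values() for x in values]
n_total = len(all_values)
k = len(groups)
grand_mean = sum(all_values) / n_total

# Way One: squared deviations of every observation from the overall mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

# Way Two: squared deviations of every observation from its own group mean.
ss_within = 0.0
ss_between = 0.0
for values in groups.values():
    group_mean = sum(values) / len(values)
    ss_within += sum((x - group_mean) ** 2 for x in values)
    # Contribution of this group's mean differing from the overall mean.
    ss_between += len(values) * (group_mean - grand_mean) ** 2

# Mean squares and the F ratio.
ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_ratio = ms_between / ms_within

print(f"SS total   = {ss_total:.3f}")
print(f"SS between = {ss_between:.3f}")
print(f"SS within  = {ss_within:.3f}")
print(f"F          = {f_ratio:.3f}")
print("partition holds:", math.isclose(ss_total, ss_between + ss_within))
```

Moving the group means further apart in the made-up data increases `ss_between` and therefore `f_ratio`, while `ss_within` is unaffected.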

ANOVA

When the null hypothesis is true, this ratio follows a statistical distribution called the F-distribution. The F-distribution is frequently used in statistics to make probability statements about the ratio of two variances.

The F-test is always a one-tail test. If the computed test statistic, F, exceeds the tabled value, one rejects the hypothesis that all of the population means are equal. One then concludes that at least one pair of means is significantly different and proceeds to use the Scheffé post hoc comparison procedure to identify which pair or pairs of means caused us to reject the null hypothesis of equal group means.

ANOVA Calculations Example

3 operators measure sample volume deltas to specification for 5 different wort vats in a brewery on different days. Is there a difference between Operators or the Vat volumes?

    Volume   OPER 1   OPER 2   OPER 3
    V1          4        0        7
    V2          1        2        1
    V3          6        3        8
    V4          7        0        6
    V5          2        9        8

ANOVA Calculations Example

Run an ANOVA to test for any statistically significant difference between Operators or between Vat volumes. This is an "Anova: Two-Factor Without Replication" test.

ANOVA Manual Calculations Example: 3 operators measure sample volume.

Columns j = 3 (operators), rows i = 5 (volumes). SQ is the square of the value to its left.

    Volume   OPER 1   SQ   OPER 2   SQ   OPER 3   SQ   Row sum   (Row sum)^2   Sum of SQ
    V1          4     16      0      0      7     49      11        121            65
    V2          1      1      2      4      1      1       4         16             6
    V3          6     36      3      9      8     64      17        289           109
    V4          7     49      0      0      6     36      13        169            85
    V5          2      4      9     81      8     64      19        361           149
    Total                                                  64        956           414

                     OPER 1   OPER 2   OPER 3   Total
    Column sum         20       14       30       64
    (Column sum)^2    400      196      900     1496
    Sum of SQ         106       94      214      414

Both tables give the same grand total, $\sum\sum x_{ij} = 64$, and the same total of squared values, $\sum\sum x_{ij}^2 = 414$.

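ANOVA Manual Calculations Example: 3 operators measure sample volume.

Total number of observations n = 15; number of rows (volumes) r = 5; number of columns (operators) c = 3.

    Source of variation        Sum of squares   df   Mean square (variance estimate)   Fcal   F crit               Significant
    Volumes (rows), II             45.60         4      11.40   (σ_i^2)                1.32   F(0.05,4,8) = 3.84       no
    Operators (columns), III       26.13         2      13.07   (σ_j^2)                1.51   F(0.05,2,8) = 4.46       no
    Residual or error              69.20         8       8.65   (σ_o^2)
    Total, I                      140.93        14

Degrees of freedom are the number of levels minus one for each factor (r - 1 = 4, c - 1 = 2), (r - 1)(c - 1) = 8 for the residual, and n - 1 = 14 for the total. Each mean square is the sum of squares divided by its degrees of freedom, and Fcal is the factor's mean square divided by the residual mean square: σ_i^2/σ_o^2 for Volumes and σ_j^2/σ_o^2 for Operators.

The sums of squares are computed from the totals on the previous slide:

$$\text{Total I} = \frac{n\sum\sum x_{ij}^2 - \left(\sum\sum x_{ij}\right)^2}{n} = \frac{15(414) - 64^2}{15} = 140.93$$

$$\text{Volumes (rows) II} = \frac{r\sum_i\left(\sum_j x_{ij}\right)^2 - \left(\sum\sum x_{ij}\right)^2}{n} = \frac{5(956) - 64^2}{15} = 45.60$$

$$\text{Operators (columns) III} = \frac{c\sum_j\left(\sum_i x_{ij}\right)^2 - \left(\sum\sum x_{ij}\right)^2}{n} = \frac{3(1496) - 64^2}{15} = 26.13$$

$$\text{Residual} = \text{I} - \text{II} - \text{III} = 140.93 - 45.60 - 26.13 = 69.20$$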
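The manual calculation can be reproduced with a short Python sketch. The function name `two_factor_anova` and the dictionary keys are hypothetical choices for this illustration; the data and the critical values 3.84 and 4.46 are taken from the slides. The sums of squares are computed in the algebraically equivalent form "sum of squared totals divided by group size, minus the correction term" rather than the slide's exact expression.

```python
# Brewery data: rows are vats V1..V5, columns are operators 1..3.
data = [
    [4, 0, 7],
    [1, 2, 1],
    [6, 3, 8],
    [7, 0, 6],
    [2, 9, 8],
]

def two_factor_anova(table):
    """Two-factor ANOVA without replication for a rectangular table."""
    r = len(table)       # number of rows (volumes)
    c = len(table[0])    # number of columns (operators)
    n = r * c
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / n
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]

    ss_total = sum(x * x for row in table for x in row) - correction
    ss_rows = sum(s * s for s in row_sums) / c - correction
    ss_cols = sum(s * s for s in col_sums) / r - correction
    ss_error = ss_total - ss_rows - ss_cols

    df_rows, df_cols = r - 1, c - 1
    df_error = df_rows * df_cols
    ms_error = ss_error / df_error
    return {
        "ss_total": ss_total, "ss_rows": ss_rows,
        "ss_cols": ss_cols, "ss_error": ss_error,
        "ms_rows": ss_rows / df_rows, "ms_cols": ss_cols / df_cols,
        "ms_error": ms_error,
        "f_rows": (ss_rows / df_rows) / ms_error,
        "f_cols": (ss_cols / df_cols) / ms_error,
    }

result = two_factor_anova(data)

# Critical values quoted in the slides for alpha = 0.05.
F_CRIT_ROWS = 3.84   # F(0.05, 4, 8)
F_CRIT_COLS = 4.46   # F(0.05, 2, 8)

print(f"Volumes:   SS={result['ss_rows']:.2f}  F={result['f_rows']:.2f}  "
      f"significant={result['f_rows'] > F_CRIT_ROWS}")
print(f"Operators: SS={result['ss_cols']:.2f}  F={result['f_cols']:.2f}  "
      f"significant={result['f_cols'] > F_CRIT_COLS}")
print(f"Error:     SS={result['ss_error']:.2f}  MS={result['ms_error']:.2f}")
print(f"Total:     SS={result['ss_total']:.2f}")
```

The decision rule at the end is the same one used in the table: a factor is flagged as significant only when its computed F exceeds the tabled critical value.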

A faster way: ANOVA using DATA ANALYSIS in EXCEL (Anova: Two-Factor Without Replication)

Example: 3 operators measure sample volume (non-replicated experiment).

    Volume   OPER 1   OPER 2   OPER 3
    V1          4        0        7
    V2          1        2        1
    V3          6        3        8
    V4          7        0        6
    V5          2        9        8

Anova: Two-Factor Without Replication

    SUMMARY   Count   Sum   Average    Variance
    V1          3      11   3.666667   12.33333
    V2          3       4   1.333333   0.333333
    V3          3      17   5.666667   6.333333
    V4          3      13   4.333333   14.33333
    V5          3      19   6.333333   14.33333

    OPER 1      5      20   4          6.5
    OPER 2      5      14   2.8        13.7
    OPER 3      5      30   6          8.5

A faster way: ANOVA using DATA ANALYSIS in EXCEL (Anova: Two-Factor Without Replication)

Example: 3 operators measure sample volume.

    ANOVA
    Source of Variation    SS            df   MS            F          P-value   F crit (0.05 level)
    Volume (Rows)          45.6           4   11.4          1.317919   0.34      3.84
    Operator (Columns)     26.13333333    2   13.06666667   1.510597   0.28      4.46
    Error                  69.2           8   8.65
    Total                  140.9333333   14

P value = the probability that an Fcal value at least as large as the one observed would occur due to random chance alone, assuming the null hypothesis of no difference is true.

In our case, if there were truly no difference between Volumes, an F value of 1.32 or larger would still occur about 34% of the time by chance. Put another way, rejecting the null hypothesis of no difference between Volumes whenever F is this large would produce a false alarm (Type I error) about 34% of the time when no difference exists, far above the 5% level we set. Based on this sampling we therefore cannot claim a difference between Volumes.

If P is close to 1, a larger F value would almost always occur due to random chance. A low P value (below the chosen α, here 0.05) means that the factor being tested has a statistically significant effect, that is, one unlikely to be due to random chance alone. Equivalently, if Fcal > F crit then the factor has a significant effect.
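The meaning of the P value can be illustrated by simulation, using only the Python standard library. This is a rough sketch, not the method Excel uses: it relies on the standard fact that an F variable is the ratio of two independent chi-square variables, each divided by its degrees of freedom, and draws those chi-squares as sums of squared standard normal values. The function names, trial count, and seed are arbitrary choices for the illustration.

```python
import random

def simulated_f(df_num, df_den, rng):
    """Draw one F value as a ratio of two scaled chi-square variables."""
    numerator = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df_num)) / df_num
    denominator = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df_den)) / df_den
    return numerator / denominator

def tail_probability(f_observed, df_num, df_den, trials=20000, seed=1):
    """Estimate P(F >= f_observed) when the null hypothesis is true."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(trials)
        if simulated_f(df_num, df_den, rng) >= f_observed
    )
    return hits / trials

# Observed F values and degrees of freedom from the Excel ANOVA table.
p_volumes = tail_probability(1.317919, 4, 8)
p_operators = tail_probability(1.510597, 2, 8)

print(f"simulated P value, Volumes:   {p_volumes:.3f}")
print(f"simulated P value, Operators: {p_operators:.3f}")
```

With enough trials the two estimates should land near the 0.34 and 0.28 reported by Excel; small differences are simulation noise. Each estimate is simply the fraction of "no real effect" experiments that produce an F at least as large as the one observed.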