ANOVA: Comparing More Than Two Means

Similar documents
Inference for Regression

Notes for Week 13 Analysis of Variance (ANOVA) continued WEEK 13 page 1

Econ 3790: Business and Economic Statistics. Instructor: Yogesh Uppal

CHAPTER 4 Analysis of Variance. One-way ANOVA Two-way ANOVA i) Two way ANOVA without replication ii) Two way ANOVA with replication

16.3 One-Way ANOVA: The Procedure

IEOR165 Discussion Week 12

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1)

Chapter 11 - Lecture 1 Single Factor ANOVA

1 Introduction to One-way ANOVA

Chapter 10: Analysis of variance (ANOVA)

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College

Statistical methods for comparing multiple groups. Lecture 7: ANOVA. ANOVA: Definition. ANOVA: Concepts

Wolf River. Lecture 19 - ANOVA. Exploratory analysis. Wolf River - Data. Sta 111. June 11, 2014

Sociology 6Z03 Review II

Chapter 11 - Lecture 1 Single Factor ANOVA

Week 14 Comparing k(> 2) Populations

One-Way Analysis of Variance (ANOVA)

ANOVA Analysis of Variance

Announcements. Unit 4: Inference for numerical variables Lecture 4: ANOVA. Data. Statistics 104

Lecture 6 Multiple Linear Regression, cont.

ANOVA: Analysis of Variation

In ANOVA the response variable is numerical and the explanatory variables are categorical.

Variance Decomposition and Goodness of Fit

Sample Size and Power I: Binary Outcomes. James Ware, PhD Harvard School of Public Health Boston, MA

T.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS

Multiple Pairwise Comparison Procedures in One-Way ANOVA with Fixed Effects Model

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA)

While you wait: Enter the following in your calculator. Find the mean and sample variation of each group. Bluman, Chapter 12 1

Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017

Lec 1: An Introduction to ANOVA

Wolf River. Lecture 15 - ANOVA. Exploratory analysis. Wolf River - Data. Sta102 / BME102. October 22, 2014

Research Methods II MICHAEL BERNSTEIN CS 376

Standard Normal Calculations

The legacy of Sir Ronald A. Fisher. Fisher s three fundamental principles: local control, replication, and randomization.

STATS Analysis of variance: ANOVA

Wolf River. Lecture 15 - ANOVA. Exploratory analysis. Wolf River - Data. Sta102 / BME102. October 26, 2015

22s:152 Applied Linear Regression. 1-way ANOVA visual:

Assignment #7. Chapter 12: 18, 24 Chapter 13: 28. Due next Friday Nov. 20 th by 2pm in your TA s homework box

Lecture notes 13: ANOVA (a.k.a. Analysis of Variance)

Example: Four levels of herbicide strength in an experiment on dry weight of treated plants.

Chapter 10. Design of Experiments and Analysis of Variance

3. Design Experiments and Variance Analysis

Statistiek II. John Nerbonne using reworkings by Hartmut Fitz and Wilbert Heeringa. February 13, Dept of Information Science

Lecture 15 - ANOVA cont.

Lecture 5: ANOVA and Correlation

Example - Alfalfa (11.6.1) Lecture 16 - ANOVA cont. Alfalfa Hypotheses. Treatment Effect

EX1. One way ANOVA: miles versus Plug. a) What are the hypotheses to be tested? b) What are df 1 and df 2? Verify by hand. , y 3

ANOVA: Comparing More Than Two Means

The One-Way Independent-Samples ANOVA. (For Between-Subjects Designs)

Inferences for Regression

Multiple comparisons - subsequent inferences for two-way ANOVA

Analysis of variance (ANOVA) ANOVA. Null hypothesis for simple ANOVA. H 0 : Variance among groups = 0

Introduction to Analysis of Variance. Chapter 11

Disadvantages of using many pooled t procedures. The sampling distribution of the sample means. The variability between the sample means

Example - Alfalfa (11.6.1) Lecture 14 - ANOVA cont. Alfalfa Hypotheses. Treatment Effect

Factorial designs. Experiments

BIOL Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES

The One-Way Repeated-Measures ANOVA. (For Within-Subjects Designs)

Chapter 4: Regression Models

2 Hand-out 2. Dr. M. P. M. M. M c Loughlin Revised 2018

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

Analysis of Variance: Part 1

Statistics for Managers Using Microsoft Excel Chapter 10 ANOVA and Other C-Sample Tests With Numerical Data

Inference for Regression Inference about the Regression Model and Using the Regression Line

ANOVA (Analysis of Variance) output RLS 11/20/2016

9 One-Way Analysis of Variance

MSc / PhD Course Advanced Biostatistics. dr. P. Nazarov

One-Way Analysis of Variance. With regression, we related two quantitative, typically continuous variables.

Chapter Seven: Multi-Sample Methods 1/52

The t-test: A z-score for a sample mean tells us where in the distribution the particular mean lies

Analysis Of Variance Compiled by T.O. Antwi-Asare, U.G

Bernoulli Trials and Binomial Distribution

The Normal Distribuions

1 Use of indicator random variables. (Chapter 8)

Tukey Complete Pairwise Post-Hoc Comparison

Chapter 4. Regression Models. Learning Objectives

Density Curves & Normal Distributions

What If There Are More Than. Two Factor Levels?

CHAPTER 10 ONE-WAY ANALYSIS OF VARIANCE. It would be very unusual for all the research one might conduct to be restricted to

Analysis of Variance Bios 662

Comparing the means of more than two groups

Chapter 8 Student Lecture Notes 8-1. Department of Economics. Business Statistics. Chapter 12 Chi-square test of independence & Analysis of Variance

their contents. If the sample mean is 15.2 oz. and the sample standard deviation is 0.50 oz., find the 95% confidence interval of the true mean.

DESAIN EKSPERIMEN Analysis of Variances (ANOVA) Semester Genap 2017/2018 Jurusan Teknik Industri Universitas Brawijaya

Chapter 9. Hypothesis testing. 9.1 Introduction

Statistics and Quantitative Analysis U4320

1 The Randomized Block Design

STAT 401A - Statistical Methods for Research Workers

Josh Engwer (TTU) 1-Factor ANOVA / 32

Confidence Intervals for Two Means

Unit 27 One-Way Analysis of Variance

Pooled Variance t Test

Bernoulli Trials, Binomial and Cumulative Distributions

AMS7: WEEK 7. CLASS 1. More on Hypothesis Testing Monday May 11th, 2015

Chap The McGraw-Hill Companies, Inc. All rights reserved.

Regression Models. Chapter 4. Introduction. Introduction. Introduction

Tests about a population mean

Nested 2-Way ANOVA as Linear Models - Unbalanced Example

22s:152 Applied Linear Regression. Take random samples from each of m populations.

Week 12 Hypothesis Testing, Part II Comparing Two Populations

Transcription:

ANOVA: Comparing More Than Two Means Chapter 11 Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c Department of Mathematics University of Houston Lecture 25-3339 Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture) 25-3339 1 / 23

Outline 1 Comparing More Than Two Means 2 ANOVA 3 Pairwise Tests Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture) 25-3339 2 / 23

Popper #20 Questions We wish to test the hypotheses H 0 : µ = 13.5 versus H a : µ > 13.5 at α = 0.02 significance level. From a random sample of 40 the sample mean is 15.9. Assume that the population standard deviation is 2.7. 1. Which test would be appropriate to use? a) z-test for proportions b) z-tests for means c) t-test for means d) t-tests for proportions 2. Calculate the test statistic. a) 2.4 b) -5.62 c) 5.62 d) 0.89 e) 2.0537 3. Determine the p-value, approximately. 4. What is our decision of the test? a) 0 b) 0.02 c) 1.434 d) 4.68 a) Reject H 0 b) Fail to reject H 0 c) Accept H 0 Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture) 25-3339 3 / 23

Popper #20 Questions For each of the following scenarios, determine if it is a) paired t-test or b) two sample t-test 5. The weight of 14 patients before and after open-heart surgery. 6. The smoking rates of 14 men measured before and after a stroke. 7. The number of cigarettes smoked per day by 14 men who have had stokes compared with the number smoked by 14 men who have not had strokes. 8. The photosynthetic rates of 10 randomly chosen Douglas-fir trees compared with 10 randomly chosen western red cedar trees. 9. The photosynthetic rate measured on 10 randomly chosen Sitka spruce trees compared with the rate measured on the western red cedar growing next to each of the Sitka spruce trees. Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture) 25-3339 4 / 23

Weight Loss From: http://sphweb.bumc.bu.edu/otlt/mph-modules/ BS/BS704_HypothesisTesting-ANOVA/BS704_ HypothesisTesting-Anova_print.html Is there a difference in the mean weight loss among different programs? A clinical trial is run to compare weight loss programs and participants are randomly assigned to one of the comparison programs and are counseled on the details of the assigned program. Participants follow the assigned program for 8 weeks. Three popular weight loss programs are considered. Low calorie diet. Low fat diet Low carbohydrate diet Control group Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture) 25-3339 5 / 23

Results Response variable = weight loss = weight at the end of 8 weeks - weight at beginning of the study The observed weight losses of twenty people in this study are as follows: Low Calorie Low Fat Low Carbohydrate Control 8 2 3 2 9 4 5 2 6 3 4-1 7 5 2 0 3 1 3 3 Is there a statistically significant difference in the mean weight loss among the four diets? Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture) 25-3339 6 / 23

Box Plots of Weight Loss 0 2 4 6 8 calorie carb control fat Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture) 25-3339 7 / 23

Hypotheses We want to know if there is a "statistically significant difference" in the mean weight loss among the four diets. Null hypothesis: mean weight loss is the same among the four diets H 0 : µ calorie = µ carb = µ control = µ fat Alternative hypothesis is that at least one of the mean weight loss among the four diets is different. Rejecting H0 is evidence that the mean of at least one group is different from the other means. Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture) 25-3339 8 / 23

Hypotheses We want to know if there is a "statistically significant difference" in the mean weight loss among the four diets. Null hypothesis: mean weight loss is the same among the four diets H 0 : µ calorie = µ carb = µ control = µ fat Alternative hypothesis is that at least one of the mean weight loss among the four diets is different. Rejecting H0 is evidence that the mean of at least one group is different from the other means. Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture) 25-3339 8 / 23

Hypotheses We want to know if there is a "statistically significant difference" in the mean weight loss among the four diets. Null hypothesis: mean weight loss is the same among the four diets H 0 : µ calorie = µ carb = µ control = µ fat Alternative hypothesis is that at least one of the mean weight loss among the four diets is different. Rejecting H0 is evidence that the mean of at least one group is different from the other means. Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture) 25-3339 8 / 23

Hypotheses We want to know if there is a "statistically significant difference" in the mean weight loss among the four diets. Null hypothesis: mean weight loss is the same among the four diets H 0 : µ calorie = µ carb = µ control = µ fat Alternative hypothesis is that at least one of the mean weight loss among the four diets is different. Rejecting H0 is evidence that the mean of at least one group is different from the other means. Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture) 25-3339 8 / 23

Assumptions The assumptions of analysis of variance are the same as those of the two sample t-test, but they must hold for all k groups. The measurements in every group is a SRS. We have a Normal distribution for each of the k populations. The variance is the same in all k populations. Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture) 25-3339 9 / 23

Analysis: ANOVA ANalysis Of VAriance We can estimate how much variation among group means ought to be present from sampling error alone if the null hypothesis is true. ANOVA lets us determine whether there is more variance among the sample means than we would expect by chance alone. Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture ) 25-3339 10 / 23

The Formulas Let X i. = group. ni j=1 X ij n i denote the average of the observations in the ith Let N = M i=1 n i, be the total number of observations in all the M groups. Let X M i=1.. = n i X i. N grand average) be the average of all the observations (the The treatment sum of squares (between groups) is SS(betw) = M i=1 n i( X i. X.. ) 2. The error sum of squares (residual) is SSE = M ni i=i n=1 (X ij X i. ) 2 = M i=1 (n 1)S2 i Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture ) 25-3339 11 / 23

Diets Example Low Calorie Low Fat Low Carbohydrate Control 8 2 3 2 9 4 5 2 6 3 4-1 7 5 2 0 3 1 3 3 Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture ) 25-3339 12 / 23

The F Test The mean square for treatments is MSTr = SSTr M 1. The mean square for error is MSE = SSE N M. The test statistic is F = MSTr MSE. This test statistic has an F distribution with parameters "numerator degrees of freedom" = M - 1 and "denominator degrees of freedom" = N - M. Where N is the total number of observations and M is the number of groups. Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture ) 25-3339 13 / 23

The ANOVA Table Source of degrees of Sum of Mean F Variation freedom Squares Square MSTr Treatments M - 1 SSTr MSTr MSE Error N - M SSE MSE Total N - 1 SST Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture ) 25-3339 14 / 23

ANOVA for Diets Source of degrees of Sum of Mean F Variation freedom Squares Square Diets Error Total p-value = 1 - pf(f,m - 1, N - M) Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture ) 25-3339 15 / 23

R Code > diet.lm=lm(loss~diet,data=diet) > anova(diet.lm) Analysis of Variance Table Response: Loss Df Sum Sq Mean Sq F value Pr(>F) Diet 3 75.75 25.25 8.5593 0.001278 ** Residuals 16 47.20 2.95 --- Signif. codes: 0 *** 0.001 ** 0.01 * 0.05. 0.1 1 Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture ) 25-3339 16 / 23

Tukey s Method (The T Method) The T-method is used to determine which pair (or pairs) of means differs significanctly. 1. Select α, determine Q α,k,n k in R it is qtukey(1 α, k, N k) where, k = number of groups and N = total sample size. 2. Calculate w = Q α,k,n k MSE/j. Where j = the number of elements in each group. 3. List the sample means in increasing order and underline those pairs that differ by less than w. 4. Any pair of sample means not underscored by the same line corresponds to a pair of population or treatment means that are judged significantly different. Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture ) 25-3339 17 / 23

Tukey s Method for Diet Example Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture ) 25-3339 18 / 23

Multiple Comparisons If our p-value is small for the ANOVA F test, this implies that at least one of the means is different from the other. Which one(s) are different? We could do a t-test for each pair of means. Problem: when we do multiple t-tests our P(Type 1 error) becomes greater than α. Solution: There are methods of adjustments to reduce the significance level of the pairwise test enough so that the probability of one or more type I errors in the whole set of comparisons is less than α. Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture ) 25-3339 19 / 23

Multiple Comparisons If our p-value is small for the ANOVA F test, this implies that at least one of the means is different from the other. Which one(s) are different? We could do a t-test for each pair of means. Problem: when we do multiple t-tests our P(Type 1 error) becomes greater than α. Solution: There are methods of adjustments to reduce the significance level of the pairwise test enough so that the probability of one or more type I errors in the whole set of comparisons is less than α. Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture ) 25-3339 19 / 23

Multiple Comparisons If our p-value is small for the ANOVA F test, this implies that at least one of the means is different from the other. Which one(s) are different? We could do a t-test for each pair of means. Problem: when we do multiple t-tests our P(Type 1 error) becomes greater than α. Solution: There are methods of adjustments to reduce the significance level of the pairwise test enough so that the probability of one or more type I errors in the whole set of comparisons is less than α. Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture ) 25-3339 19 / 23

Multiple Comparisons If our p-value is small for the ANOVA F test, this implies that at least one of the means is different from the other. Which one(s) are different? We could do a t-test for each pair of means. Problem: when we do multiple t-tests our P(Type 1 error) becomes greater than α. Solution: There are methods of adjustments to reduce the significance level of the pairwise test enough so that the probability of one or more type I errors in the whole set of comparisons is less than α. Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture ) 25-3339 19 / 23

The Bonferroni Method The Bonferroni Method of adjustments reduces the significance level for the pairwise test to α/k, where k is the number of comparisons. R Code: > attach(diet) > pairwise.t.test(loss,diet,"bonferroni") Pairwise comparisons using t tests with pooled SD data: Loss and Diet calorie carb control carb 0.05695 - - control 0.00083 0.35914 - fat 0.02632 1.00000 0.70193 P value adjustment method: bonferroni Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture ) 25-3339 20 / 23

The Bonferroni Method The Bonferroni Method of adjustments reduces the significance level for the pairwise test to α/k, where k is the number of comparisons. R Code: > attach(diet) > pairwise.t.test(loss,diet,"bonferroni") Pairwise comparisons using t tests with pooled SD data: Loss and Diet calorie carb control carb 0.05695 - - control 0.00083 0.35914 - fat 0.02632 1.00000 0.70193 P value adjustment method: bonferroni Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture ) 25-3339 20 / 23

Example MPG Is there a difference in the average miles per gallon for different makes of automobiles? The following table shows the mean mpg of three different makes of automobiles. The data is on the data sets list called mpg. Make n X S Honda 5 29.9 1.468 Toyota 6 33.04 2.1173 Nissan 4 29.3 1.3115 Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture ) 25-3339 21 / 23

Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture ) 25-3339 22 / 23

Cathy Poliak, Ph.D. cathy@math.uh.edu Office Fleming 11c (Department Chapter 11of Mathematics University of Houston Lecture ) 25-3339 23 / 23