Chapter 15: Analysis of Variance


15.1 Introduction

In this chapter, we introduced the analysis of variance technique, which deals with problems whose objective is to compare two or more populations of quantitative data. You are expected to learn how to do the following:

1. Recognize when the analysis of variance is to be employed.
2. Recognize which of the three models introduced in this chapter is to be used.
3. Interpret the ANOVA table.
4. Perform Fisher's LSD method, the Bonferroni adjustment, and Tukey's multiple comparison procedure.
5. Conduct Bartlett's test.

15.2 Single-Factor (One-Way) Analysis of Variance: Independent Samples

When the samples are drawn independently of each other, we partition the total sum of squares into two sources of variability: sum of squares for treatments and sum of squares for error. The F-test is then used to complete the technique. With $k$ treatments, $n_j$ observations in sample $j$, and $n$ observations in total, the important formulas are

$$SS(\text{Total}) = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{\bar{x}})^2$$

$$SST = \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2 = n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + \cdots + n_k(\bar{x}_k - \bar{\bar{x}})^2$$

$$SSE = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2 = \sum_{i=1}^{n_1} (x_{i1} - \bar{x}_1)^2 + \sum_{i=1}^{n_2} (x_{i2} - \bar{x}_2)^2 + \cdots + \sum_{i=1}^{n_k} (x_{ik} - \bar{x}_k)^2$$

$$SSE = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2$$

$$MST = \frac{SST}{k - 1} \qquad MSE = \frac{SSE}{n - k}$$

$$F = \frac{MST}{MSE}$$

The ANOVA table for the completely randomized design is shown below.

| Source | d.f. | Sums of Squares | Mean Squares | F-ratio |
|---|---|---|---|---|
| Treatments | $k - 1$ | SST | MST | F |
| Error | $n - k$ | SSE | MSE | |
| Total | $n - 1$ | SS(Total) | | |

Example 15.1

A major computer manufacturer has received numerous complaints concerning the short life of its disk drives. Most need repair within two years. Since the cost of repairs often exceeds the cost of a new disk drive, the manufacturer is concerned. In his search for a better disk drive, he finds three new products. He decides to test these three plus his current disk drive to determine if differences in lifetimes exist among the products. He takes a random sample of five disk drives of each type and links each one with a computer. The number of weeks until the drive breaks down is recorded and is shown below. Do these data allow us to conclude at the 5% significance level that there are differences among the disk drives?

| Type 1 | Type 2 | Type 3 | Current Product |
|---|---|---|---|
| 78 | 125 | 143 | 101 |
| 92 | 110 | 125 | 96 |
| 101 | 116 | 133 | 88 |
| 105 | 88 | 108 | 125 |
| 98 | 128 | 121 | 128 |

$\bar{x}_1 = 94.8$, $\bar{x}_2 = 113.4$, $\bar{x}_3 = 126.0$, $\bar{x}_4 = 107.6$

$s_1^2 = 110.7$, $s_2^2 = 252.8$, $s_3^2 = 172.0$, $s_4^2 = 320.3$

Solution

The problem objective is to compare the populations of lifetimes of the four disk drives, and the data are quantitative. Because the samples are independent, the appropriate technique is the completely randomized design of the analysis of variance. The null and alternative hypotheses automatically follow.

$H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4$

$H_1$: At least two means differ.

The rejection region is

$$F > F_{\alpha,\, k-1,\, n-k} = F_{.05,\, 3,\, 16} = 3.24$$

The test statistic is computed as follows:

$n_1 = n_2 = n_3 = n_4 = 5$

$$\bar{\bar{x}} = \frac{5(94.8) + 5(113.4) + 5(126.0) + 5(107.6)}{20} = \frac{2{,}209}{20} = 110.45$$

$$SST = 5(94.8 - 110.45)^2 + 5(113.4 - 110.45)^2 + 5(126.0 - 110.45)^2 + 5(107.6 - 110.45)^2 = 2{,}517.75$$

$$SSE = 4(110.7) + 4(252.8) + 4(172.0) + 4(320.3) = 3{,}423.2$$

$$MST = \frac{2{,}517.75}{3} = 839.25 \qquad MSE = \frac{3{,}423.2}{16} = 213.95 \qquad F = \frac{839.25}{213.95} = 3.92$$

The complete ANOVA table follows.

| Source | d.f. | Sums of Squares | Mean Squares | F-ratio |
|---|---|---|---|---|
| Treatments | 3 | 2,517.75 | 839.25 | 3.92 |
| Error | 16 | 3,423.2 | 213.95 | |
| Total | 19 | 5,940.95 | | |

Since the F-ratio (3.92) exceeds $F_{\alpha,\, k-1,\, n-k}$ (3.24), we reject $H_0$ and conclude that at least two means differ.
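The arithmetic of the one-way F-test can be checked with a few lines of Python. The sketch below is only an illustration: it takes the sample means, sample variances, and sample sizes of the disk-drive example as inputs, the function and variable names are made up for this note, and the critical value 3.24 is copied from the example rather than computed.

```python
# One-way ANOVA from summary statistics (illustrative sketch).
# Inputs follow the disk-drive example: four samples of five drives each.
means = [94.8, 113.4, 126.0, 107.6]        # sample means
variances = [110.7, 252.8, 172.0, 320.3]   # sample variances s_j^2
sizes = [5, 5, 5, 5]                       # sample sizes n_j


def one_way_anova(means, variances, sizes):
    """Return the entries of the one-way ANOVA table as a dict."""
    k = len(means)                 # number of treatments
    n = sum(sizes)                 # total number of observations
    grand_mean = sum(nj * m for nj, m in zip(sizes, means)) / n
    # SST: variation of the sample means around the grand mean.
    sst = sum(nj * (m - grand_mean) ** 2 for nj, m in zip(sizes, means))
    # SSE: pooled within-sample variation, (n_j - 1) * s_j^2 summed over samples.
    sse = sum((nj - 1) * v for nj, v in zip(sizes, variances))
    mst = sst / (k - 1)
    mse = sse / (n - k)
    return {
        "grand_mean": grand_mean,
        "SST": sst,
        "SSE": sse,
        "MST": mst,
        "MSE": mse,
        "F": mst / mse,
        "df": (k - 1, n - k),
    }


table = one_way_anova(means, variances, sizes)
F_CRITICAL = 3.24  # F(.05; 3, 16), copied from the example, not computed here
reject_h0 = table["F"] > F_CRITICAL

for name, value in table.items():
    print(name, value)
print("reject H0:", reject_h0)
```

Working from summary statistics keeps the sketch short; with raw observations the same quantities follow from the definitions of SST and SSE.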

EXERCISES

15.1 Complete the ANOVA table and F-test with α = .05.

Source d.f. Sums of Squares Mean Squares F-ratio
Treatments 5
Error 3,000
Total 0 3,500

15.2 Develop the ANOVA table from the following information and perform the F-test with α = .0.

SS(Total) = 50 SST = 90 SSE = 60 = 5 n = 7 n = 5 n 3 = 8 n 4 = 9 n 5 = 4

ANOVA Table

Source d.f. Sums of Squares Mean Squares F-ratio

15.3 Test the hypotheses (with α = .0)

$H_0$: $\mu_1 = \mu_2 = \mu_3$
$H_1$: At least two means differ.

given the following statistics:

x = 5 x = x 3 = 3 s = 6 s = s 3 = 9 n = 5 n = 8 n 3 = 0

ANOVA Table

Source d.f. Sums of Squares Mean Squares F-ratio

15.4 Perform the analysis of variance test with α = .05 using the following information:

Treatment 3 4
x ij 30 60 540 40
x ij 9,650 8,50 0,00 4,800
n j 0 8 5

$H_0$:
$H_1$:
Rejection region:
Value of the test statistic:

ANOVA Table

Source d.f. Sums of Squares Mean Squares F-ratio

Conclusion:

15.5 A nationwide real estate chain is in the process of examining condominium prices across the country. The company has hired a statistician who takes a random sample of six sales in each of four cities. The results are recorded to the nearest thousand dollars and are shown below. Can we conclude from these data that there are differences in the selling prices of condominiums among the four cities? (Use α = .05.)

New York Chicago Dallas Los Angeles
53 99 73 305 78 56 85 88 85 76 66 35 3 03 38 48 77 9 75 96 89 4 44
x i =,50 x i =,438 x i =,80 x i =,706
x i = 388,98 x i = 35,636 x i = 79,076 x i = 489,470

$H_0$:
$H_1$:
Rejection region:
Value of the test statistic:

ANOVA Table

Source d.f. Sum of Squares Mean Squares F-ratio

Conclusion:

15.3 Analysis of Variance Models

In this section we described some of the models that are available to statisticians. However, we covered only three of these.

15.4 Single-Factor Analysis of Variance: Randomized Blocks

If the experimental design is randomized block, the total variation as measured by SS(Total) must be partitioned into three sources of variation: treatments (measured by SST), blocks (measured by SSB), and error (measured by SSE). The general form of the ANOVA table is shown below.

ANOVA Table for Randomized Block Design

| Source | d.f. | Sums of Squares | Mean Squares | F-ratio |
|---|---|---|---|---|
| Treatments | $k - 1$ | SST | MST | F = MST/MSE |
| Blocks | $b - 1$ | SSB | MSB | F = MSB/MSE |
| Error | $n - k - b + 1$ | SSE | MSE | |
| Total | $n - 1$ | SS(Total) | | |

The rejection region for testing the treatment means is

$$F > F_{\alpha,\, k-1,\, n-k-b+1}$$

The rejection region for testing the block means is

$$F > F_{\alpha,\, b-1,\, n-k-b+1}$$

Example 15.2

The statistician in Exercise 15.5 decides to redo the experiment to eliminate the variation among condominium prices. In each city, the selling prices of a 1,000-square-foot, a 1,500-square-foot, a 2,000-square-foot, a 2,500-square-foot and a 3,000-square-foot condominium are randomly selected. The results are shown below.

| Condominium Size | New York | Chicago | Dallas | Los Angeles |
|---|---|---|---|---|
| 1,000 square feet | 65 | 85 | 73 | 00 |
| 1,500 square feet | 98 | 93 | 8 | 96 |
| 2,000 square feet | 5 | 5 | 97 | 78 |
| 2,500 square feet | 3 | 68 | 9 | 33 |
| 3,000 square feet | 405 | 38 | 94 | 446 |

The following statistics were computed:

SST = 15,190.9
SSB = 107,157.7
SSE = 8,360.3

Complete the ANOVA table to determine if we can conclude at the 5% significance level that there are differences in condominium prices among the four cities.

Solution

We test the hypotheses

$H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_1$: At least two treatment means differ.

Rejection region:

$$F > F_{\alpha,\, k-1,\, n-k-b+1} = F_{.05,\, 3,\, 12} = 3.49$$

Value of the test statistic: From the table below, we find F = 7.3.

| Source | d.f. | Sums of Squares | Mean Squares | F-ratio |
|---|---|---|---|---|
| Treatments | 3 | 15,190.9 | 5,063.6 | 7.3 |
| Blocks | 4 | 107,157.7 | 26,789.4 | 38.5 |
| Error | 12 | 8,360.3 | 696.7 | |
| Total | 19 | 130,708.9 | | |

Conclusion: Reject $H_0$. There is sufficient evidence to conclude that there are differences in condominium prices among the four cities.

EXERCISES

15.6 Complete the following ANOVA table (randomized block design).

Source d.f. Sums of Squares Mean Squares F-ratio
Treatments 5 300
Blocks 8
Error 400
Total 53,000

15.7 Refer to Exercise 15.6. Test to determine if there are differences among the treatment means with α = .0.

15.8 Refer to Exercise 15.6. Test to determine if there are differences among the block means with α = .0.
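Completing a randomized block ANOVA table from the three sums of squares is mechanical, as the condominium example illustrates. The following sketch is only an illustration under stated assumptions: the function name and dictionary layout are hypothetical, the sums of squares are those quoted in the example (four cities as treatments, five condominium sizes as blocks), and the critical value 3.49 is copied from the example rather than computed.

```python
# Randomized block ANOVA table from sums of squares (illustrative sketch).

def randomized_block_table(sst, ssb, sse, k, b):
    """Build the table for k treatments and b blocks (one observation per cell)."""
    n = k * b
    df_treat = k - 1
    df_block = b - 1
    df_error = n - k - b + 1
    mst = sst / df_treat
    msb = ssb / df_block
    mse = sse / df_error
    return {
        "Treatments": {"df": df_treat, "SS": sst, "MS": mst, "F": mst / mse},
        "Blocks": {"df": df_block, "SS": ssb, "MS": msb, "F": msb / mse},
        "Error": {"df": df_error, "SS": sse, "MS": mse},
        "Total": {"df": n - 1, "SS": sst + ssb + sse},
    }


# Values quoted in the condominium example.
table = randomized_block_table(sst=15190.9, ssb=107157.7, sse=8360.3, k=4, b=5)

F_CRITICAL = 3.49  # F(.05; 3, 12), copied from the example, not computed here
reject_h0 = table["Treatments"]["F"] > F_CRITICAL

for source, row in table.items():
    print(source, row)
print("reject H0 (treatments):", reject_h0)
```

The error degrees of freedom, n - k - b + 1, equal (k - 1)(b - 1) when every treatment appears once in every block, which is the case assumed here.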

15.9 A large catalogue chain store has been experimenting with several methods of advertising its extensive variety of bicycles. Three kinds of catalogues have been prepared. In one, a side view of each bicycle is shown. In another, each bicycle's excellent record of longevity is extolled. In the third, pictures of the bicycles with their riders are shown. The company's management would like to know if there are differences in sales among the stores that use the different catalogues. The monthly sales of bicycles of three randomly selected stores, each using a different catalogue, are shown below. Do these data allow us to conclude that there are differences in bicycle sales among the stores using the three catalogues? Use α = .0.

Monthly Bicycle Sales

Catalogue
Month 3
March 7 8 4
April 5 9 0
May 0 7 88
June 7 95 4
July 83 6 303
August 85 5 6
September 90 38 85
October 5 4

SST(Catalogues) = 53
SSB(Months) = 05,35
SS(Total) = 4,08

$H_0$:
$H_1$:
Rejection region:
Value of the test statistic:

ANOVA Table

Source d.f. Sums of Squares Mean Squares F-ratio

Conclusion:

15.5 Two-Factor Analysis of Variance: Independent Samples

When the treatments are defined by two factors, we are often interested in determining whether treatment differences (if they exist) are due to factor A, factor B, or interaction between the two factors. The answer is provided by applying the two-factor analysis of variance. The format of the ANOVA table appears below.

| Source | d.f. | Sums of Squares | Mean Squares | F-Ratios |
|---|---|---|---|---|
| Factor A | $a - 1$ | SS(A) | MS(A) = SS(A)/$(a - 1)$ | F = MS(A)/MSE |
| Factor B | $b - 1$ | SS(B) | MS(B) = SS(B)/$(b - 1)$ | F = MS(B)/MSE |
| Interaction | $(a - 1)(b - 1)$ | SS(AB) | MS(AB) = SS(AB)/$[(a - 1)(b - 1)]$ | F = MS(AB)/MSE |
| Error | $n - ab$ | SSE | MSE = SSE/$(n - ab)$ | |
| Total | $n - 1$ | SS(Total) | | |

Example 15.3

The dean of a business school was examining the factors that lead to success in the MBA program. She felt that the type of degree and whether the student had previous work experience were likely to be critical factors. To test her beliefs she took a random sample of 100 students. For each she recorded their MBA grade point average (response variable), the type of undergraduate degree (B.A., B.Sc., B.Eng., B.B.A.) (factor A), and whether the student had work experience (factor B). The sums of squares are listed below. What conclusions can be reached from these results?

SS(Total) = 618
SS(A) = 27
SS(B) = 6
SS(AB) = 68

Solution

The number of levels of factor A (type of degree) is 4; the number of levels of factor B (previous work experience, yes or no) is 2. The ANOVA table is shown below.

| Source | d.f. | Sums of Squares | Mean Squares | F-Ratios |
|---|---|---|---|---|
| Factor A | 3 | 27 | 9.0 | 1.60 |
| Factor B | 1 | 6 | 6.0 | 1.07 |
| Interaction | 3 | 68 | 22.67 | 4.03 |
| Error | 92 | 517 | 5.62 | |
| Total | 99 | 618 | | |

We first test for interaction. At the 5% significance level the rejection region is

$$F > F_{\alpha,\, (a-1)(b-1),\, n-ab} = F_{.05,\, 3,\, 92} \approx 2.68$$

Since F = 4.03 we conclude that there is evidence of an interaction effect. We do not conduct tests to determine if factors A and B are significant.

EXERCISES

15.10 Complete the following ANOVA table.

Source d.f. Sums of Squares Mean Squares F-Ratios
Factor A 4 80
Factor B 3 450
Interaction 40
Error 330
Total 30 00

15.11 Refer to Exercise 15.10. Test to determine whether there is evidence of an interaction effect. Use α = .05.

15.12 Refer to Exercises 15.10 and 15.11. Test to determine whether Factors A and B are significant. Use α = .05.

15.6 Operations Management Application: Finding and Reducing Variation

In this section we presented a common application of the analysis of variance. We also discussed the Taguchi method, which involves designing experiments to produce operations methods that lead to improved quality.

15.7 Multiple Comparisons

If we wish to determine which treatment means differ, we can use one of several multiple comparison procedures. In this section we introduced Fisher's least significant difference (LSD) method, the Bonferroni adjustment to LSD, and Tukey's multiple comparison method. These techniques are applied when the samples are independent, the sample sizes are equal, and all conditions required to perform the F-test of the analysis of variance are satisfied.

Fisher's Least Significant Difference Method

If the analysis of variance reveals that there is evidence that at least two means differ, we can determine which ones differ by computing the difference between each pair of means and comparing the difference to LSD. For the means of sample i and sample j we define the least significant difference as

$$LSD = t_{\alpha/2} \sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$

The degrees of freedom are $n - k$. The value of α is the same as used in the analysis of variance, often 5%. The problem with this approach is that we increase the probability of concluding that some differences

exist when in fact they do not. We can lower that probability by using the Bonferroni adjustment, which sets the significance level as α/C where $C = k(k - 1)/2$.

Tukey's Multiple Comparison Method

The key statistic is

$$\omega = q_{\alpha}(k, \nu) \sqrt{\frac{MSE}{n_g}}$$

where

$k$ = number of samples
$\nu$ = number of degrees of freedom associated with MSE ($\nu = n - k$)
$n_g$ = number of observations in each sample ($n_g = n_1 = n_2 = \cdots = n_k$)
$q_{\alpha}(k, \nu)$ = critical value of the studentized range (from Table 7 in Appendix B)

Example 15.4

Refer to Example 15.1. Apply the LSD method, the LSD method with the Bonferroni adjustment, and Tukey's multiple comparison method to determine which disk drives are different at the 5% significance level.

Solution

The sample means are

$\bar{x}_1 = 94.8$, $\bar{x}_2 = 113.4$, $\bar{x}_3 = 126.0$, $\bar{x}_4 = 107.6$

The sample sizes are equal to 5 and MSE = 213.95.

The least significant difference is defined as

$$LSD = t_{\alpha/2} \sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$

The number of degrees of freedom is $n - k = 20 - 4 = 16$. With a 5% significance level we have

$$t_{\alpha/2,\, n-k} = t_{.025,\, 16} = 2.120$$

Thus,

$$LSD = 2.120 \sqrt{213.95\left(\frac{1}{5} + \frac{1}{5}\right)} = 19.61$$

The least significant difference using the Bonferroni adjustment is calculated by altering the significance level. Because there are 4(3)/2 = 6 possible pairs of means we divide 5% by 6. We used Excel to find

$$t_{\alpha/2,\, n-k} = t_{.00417,\, 16} = 3.010$$

Thus,

$$LSD = 3.010 \sqrt{213.95\left(\frac{1}{5} + \frac{1}{5}\right)} = 27.85$$

The critical value of Tukey's multiple comparison method is

$$\omega = q_{\alpha}(k, \nu) \sqrt{\frac{MSE}{n_g}} = q_{.05}(4, 16) \sqrt{\frac{213.95}{5}} = (4.05)(6.54) = 26.49$$

Putting the sample means in ascending order, we find

| $\bar{x}_1$ | $\bar{x}_4$ | $\bar{x}_2$ | $\bar{x}_3$ |
|---|---|---|---|
| 94.8 | 107.6 | 113.4 | 126.0 |

The only difference that exceeds LSD = 19.61, LSD (Bonferroni) = 27.85, and ω = 26.49 is $\bar{x}_3 - \bar{x}_1 = 31.2$. Therefore, there is sufficient evidence at the 5% significance level to conclude that $\mu_1$ and $\mu_3$ differ no matter which method is used. There is not enough evidence to indicate that there are any other differences among the population means.
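The three thresholds and the pairwise comparisons can be reproduced with a short script. This is a minimal sketch, not a general implementation: the critical values 2.120, 3.010, and 4.05 are taken from the example as constants (nothing here looks them up or computes them), equal sample sizes are assumed, and all names are illustrative.

```python
# Pairwise comparisons for the disk-drive example (illustrative sketch).
from itertools import combinations
from math import sqrt

means = {1: 94.8, 2: 113.4, 3: 126.0, 4: 107.6}  # sample means by drive type
MSE = 213.95
N_G = 5                # observations per sample (equal sample sizes assumed)

# Critical values copied from the example, not computed here.
T_LSD = 2.120          # t(.025; 16)
T_BONFERRONI = 3.010   # t(.05 / (2 * 6); 16)
Q_TUKEY = 4.05         # q(.05; 4, 16)

# Standard error of a difference between two sample means.
se_diff = sqrt(MSE * (1 / N_G + 1 / N_G))

thresholds = {
    "LSD": T_LSD * se_diff,
    "Bonferroni": T_BONFERRONI * se_diff,
    "Tukey": Q_TUKEY * sqrt(MSE / N_G),
}


def differing_pairs(means, threshold):
    """Return the pairs (i, j) whose mean difference exceeds the threshold."""
    return [
        (i, j)
        for i, j in combinations(sorted(means), 2)
        if abs(means[i] - means[j]) > threshold
    ]


results = {name: differing_pairs(means, t) for name, t in thresholds.items()}

for name in thresholds:
    print(name, round(thresholds[name], 2), results[name])
```

Under these inputs each method flags the same single pair, drive types 1 and 3, which matches the conclusion of the example.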

EXERCISES

15.13 Use the least significant difference method to determine which means differ. Use α = .05.

= 4 n g = 7 MSE = 5 x = 0 x = 9 x 3 = 53 x 4 = 54

15.14 Refer to Exercise 15.13. Use the Bonferroni adjustment to determine which means differ.

15.15 Refer to Exercise 15.13. Use Tukey's multiple comparison method to determine which means differ.

15.8 Bartlett's Test

Bartlett's test is a diagnostic tool that is used to test one of the requirements of the analysis of variance. It tests the equality of variances. The test statistic is

$$B = \frac{1}{C}\left[(n - k)\ln(MSE) - \sum_{i=1}^{k} (n_i - 1)\ln(s_i^2)\right]$$

where

$$C = 1 + \frac{1}{3(k - 1)}\left[\sum_{i=1}^{k} \frac{1}{n_i - 1} - \frac{1}{n - k}\right]$$

The statistic is chi-squared distributed with $k - 1$ degrees of freedom.

Example 15.5

Refer to Example 15.1. Is there enough evidence at the 5% significance level to infer that the population variances are unequal?

Solution

We test the following hypotheses.

$H_0$: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma_4^2$
$H_1$: At least one variance differs

$$C = 1 + \frac{1}{3(k - 1)}\left[\sum_{i=1}^{k} \frac{1}{n_i - 1} - \frac{1}{n - k}\right] = 1 + \frac{1}{3(4 - 1)}\left[\sum_{i=1}^{4} \frac{1}{5 - 1} - \frac{1}{20 - 4}\right] = 1.104$$

$$B = \frac{1}{C}\left[(n - k)\ln(MSE) - \sum_{i=1}^{k} (n_i - 1)\ln(s_i^2)\right]$$

$$B = \frac{1}{1.104}\left[(20 - 4)\ln(213.95) - \{(5 - 1)\ln(110.7) + (5 - 1)\ln(252.8) + (5 - 1)\ln(172.0) + (5 - 1)\ln(320.3)\}\right]$$

$$B = \frac{1}{1.104}\left[16(5.366) - \{4(4.707) + 4(5.533) + 4(5.147) + 4(5.769)\}\right] = 1.116$$

The rejection region is

$$B > \chi^2_{\alpha,\, k-1} = \chi^2_{.05,\, 3} = 7.81473$$

Conclusion: There is not enough evidence to infer that any variances differ. The equality of variances requirement appears to be satisfied.

EXERCISES

15.16 Refer to Example 15.2. Apply Bartlett's test at the 5% significance level to determine whether the population variances differ.

$H_0$:
$H_1$:

Test statistic:
Rejection region:
Value of the test statistic:

15.17 Refer to Exercise 15.5. Apply Bartlett's test at the 5% significance level to determine whether the population variances differ.

$H_0$:
$H_1$:
Test statistic:
Rejection region:
Value of the test statistic:
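Bartlett's statistic is easy to get wrong by hand, so a numerical check of the worked disk-drive calculation may help. The sketch below is an illustration only: the function name is hypothetical, the sample variances and sizes are those of the disk-drive data, and the critical value 7.81473 is copied from the example rather than computed. Because the script uses unrounded logarithms, its value of B comes out slightly below the hand calculation, which rounds each logarithm to three decimals.

```python
# Bartlett's test statistic from sample variances (illustrative sketch).
from math import log

variances = [110.7, 252.8, 172.0, 320.3]  # sample variances s_i^2
sizes = [5, 5, 5, 5]                      # sample sizes n_i


def bartlett_statistic(variances, sizes):
    """Return (B, C) following the formulas given for Bartlett's test."""
    k = len(variances)
    n = sum(sizes)
    # MSE is the pooled variance: SSE / (n - k).
    mse = sum((ni - 1) * v for ni, v in zip(sizes, variances)) / (n - k)
    c = 1 + (sum(1 / (ni - 1) for ni in sizes) - 1 / (n - k)) / (3 * (k - 1))
    b = (
        (n - k) * log(mse)
        - sum((ni - 1) * log(v) for ni, v in zip(sizes, variances))
    ) / c
    return b, c


B, C = bartlett_statistic(variances, sizes)
CHI2_CRITICAL = 7.81473  # chi-squared(.05; 3), copied from the example
variances_differ = B > CHI2_CRITICAL

print("C =", round(C, 3))
print("B =", round(B, 3))
print("evidence of unequal variances:", variances_differ)
```

The same function can be pointed at other sets of sample variances and sizes, provided the critical value is replaced by the one for the appropriate k - 1 degrees of freedom.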