CHAPTER 10 ONE-WAY ANALYSIS OF VARIANCE

It would be very unusual for all the research one might conduct to be restricted to comparisons of only two samples. Respondents and various groups are seldom divided easily into just two samples. As a result, the researcher often needs to make comparisons among three, four, five, or even more sample means.

A series of t tests could be used to compare every pair of sample means. Four samples, for example, would require six pairwise t tests. This is a complex and unwieldy process that requires a great number of calculations. The approach also has an important statistical limitation: the probability of committing an alpha (Type I) error at least once increases with every additional t test that is performed.

This chapter introduces a statistic called the one-way Analysis of Variance (ANOVA), which is used to test hypotheses involving differences in means when there are three or more samples to be examined. The requirements for using an Analysis of Variance are the presence of interval data, the need to consider more than two sample means, and the assumption that the characteristic under consideration is normally distributed.

The core function of an ANOVA is very similar to that of the t-test. In both cases, the statistic is calculated using a formula that requires calculation of the amount of variance that exists within the samples as well as the amount of variance that exists between the samples under consideration. In the t-test, the numerator of the formula for t represents the variance that exists between the two sample means being compared. The numerator of the formula used to perform an ANOVA likewise represents the variance that occurs between the three or more groups being studied.
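As an aside on the multiple t test problem described in this chapter, the growth in both the number of tests and the chance of an alpha error can be illustrated with a short Python sketch. It assumes, as a simplification that is not part of the text, that the pairwise tests are independent; tests that share samples are not strictly independent, so the figures are only a rough guide. The function names are illustrative.

```python
from math import comb

def pairwise_tests(k):
    """Number of distinct pairs of sample means among k samples."""
    return comb(k, 2)

def familywise_alpha(alpha, m):
    """Chance of at least one alpha error in m tests.

    Assumes the m tests are independent (a simplification).
    """
    return 1 - (1 - alpha) ** m

for k in (3, 4, 5):
    m = pairwise_tests(k)
    print(f"{k} samples: {m} t tests, "
          f"approximate overall alpha = {familywise_alpha(0.05, m):.3f}")
```

Under this assumption, the six t tests needed for four samples raise the overall chance of an alpha error from .05 to roughly .26.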

Statistics reflecting the variance that exists within the individual samples are used in the denominator of the formula for each statistic in much the same way. Since the terms between group variance and within group variance are new to most students in a basic statistics course, these terms will be explained as background for understanding analysis of variance.

One type of variance should already be familiar from the work that has been done in this class, even though the term has not been used previously. The measurement of deviations from a sample mean, variance, standard deviation, and standard error has focused on a type of variance called within group variance. This represents the degree to which individual values within a group fluctuate or deviate from the mean of that particular group.

The second type of variance is called between group variance. This type of variance measures the differences that exist between the group means of multiple samples when working with t-tests or ANOVAs. Between group variance is reflected in the difference between the two sample means when conducting a t-test and is measured by a statistic called the Mean of Squares Between when working with an ANOVA.

Just like the t-test, the ANOVA produces a statistic, labeled F, which is determined by comparing the variance between groups with the variance that exists within those groups. The formula used to calculate this statistic is:

$$F = \frac{MS_{between}}{MS_{within}}$$

When an F-ratio is large, it provides a strong indication that the variation observed between the group means under consideration is likely to be the result of real differences in the populations represented by each sample and not the result of random variation within the samples.

The actual calculations for obtaining an F ratio are not difficult, but they are more complex than those for the statistics covered so far in the text. This is primarily because the student is working with more than two samples, and there is some new symbolism associated with the formulas. However, some of the basics have already been learned in earlier chapters in the form of the sum of squares, or finding deviation values. It will be recalled that the sum of squares was the basis for obtaining the variance ($s^2$) for a distribution of values. The sum of squares is also the basis for obtaining the denominator ($MS_{within}$) of the F ratio. The Mean Square Within is calculated using the following formula:

$$MS_{within} = \frac{\sum (X - \bar{X}_j)^2}{N - k}$$

Where:
$N - k$ = total number of cases included in all samples ($N$) minus the number of groups being compared ($k$)
$\sum (X - \bar{X}_j)^2$ = sum of squared deviations of each value from its corresponding sample mean

The following is a step-by-step analysis for finding the $MS_{within}$ portion of the ratio. The analysis is based on a hypothetical scenario in which a researcher has data for three sample income distributions that document the percent of funds spent for entertainment per year as follows:

Family    X1 (High Income)    X2 (Middle Income)    X3 (Low Income)
1         20%                 50%                   60%
2         10%                 40%                   30%
3         30%                 10%                   20%
4         15%                 8%                    10%
5         5%                  2%                    5%

After stating the null hypothesis, determining that the data are interval, and noting that there are more than two groups, the researcher would then calculate the means, deviation values, sum of the squares, sum of the squares within ($SS_{within}$), degrees of freedom within groups ($df_{within}$), and mean of the squares within groups.

Group 1 (mean = 16):
X       X - mean    (X - mean) squared
20      4           16
10      -6          36
30      14          196
15      -1          1
5       -11         121
Sum of squares = 370

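Group 2 (mean = 22):
X       X - mean    (X - mean) squared
50      28          784
40      18          324
10      -12         144
8       -14         196
2       -20         400
Sum of squares = 1848

Group 3 (mean = 25):
X       X - mean    (X - mean) squared
60      35          1225
30      5           25
20      -5          25
10      -15         225
5       -20         400
Sum of squares = 1900

After the means and deviation values have been calculated, the $MS_{within}$ can be calculated as follows: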

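Step 1: Add the squared deviations for all groups to obtain the sum of the squares within.

$$SS_{within} = 370 + 1848 + 1900 = 4118$$

Step 2: Determine the degrees of freedom within groups.

$$df_{within} = N - k = 15 - 3 = 12$$

Step 3: Divide the sum of the squares within by the degrees of freedom within.

$$MS_{within} = \frac{SS_{within}}{df_{within}} = \frac{4118}{12} = 343.17$$

At this point, the researcher has calculated the denominator of the F ratio formula in three easy steps. The next stage of the process involves the calculation of the Mean of Squares Between. This statistic is determined by the formula:

$$MS_{between} = \frac{\sum n_j (\bar{X}_j - \bar{X}_G)^2}{k - 1}$$

Where:
$n_j$ = n of each sample group
$\bar{X}_j$ = mean of each sample group
$\bar{X}_G$ = overall mean across all groups
$k$ = number of groups being compared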

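Working with the same example as before, the process for obtaining $MS_{between}$ is as follows:

Step 1: Compute the overall mean ($\bar{X}_G$).

$$\bar{X}_G = \frac{80 + 110 + 125}{15} = \frac{315}{15} = 21$$

Step 2: Compute $SS_{between}$.

$$SS_{between} = 5(16 - 21)^2 + 5(22 - 21)^2 + 5(25 - 21)^2 = 125 + 5 + 80 = 210$$

Step 3: Determine $df_{between}$.

$$df_{between} = k - 1 = 3 - 1 = 2$$

Step 4: Determine $MS_{between}$.

$$MS_{between} = \frac{SS_{between}}{df_{between}} = \frac{210}{2} = 105$$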
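The two mean squares for the entertainment example can be checked with a short Python sketch. The data are the three income samples from the chapter; the variable names and the `mean` helper are illustrative choices rather than notation from the text.

```python
# Percent of funds spent for entertainment, from the chapter's example.
groups = {
    "High income": [20, 10, 30, 15, 5],
    "Middle income": [50, 40, 10, 8, 2],
    "Low income": [60, 30, 20, 10, 5],
}

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

n_total = sum(len(v) for v in groups.values())  # N, all cases
k = len(groups)                                  # number of groups

# Within groups: squared deviations of each value from its own group mean.
ss_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v)
df_within = n_total - k
ms_within = ss_within / df_within

# Between groups: squared deviations of each group mean from the overall
# mean, weighted by the number of cases in the group.
grand_mean = sum(sum(v) for v in groups.values()) / n_total
ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
df_between = k - 1
ms_between = ss_between / df_between

print(f"SS within  = {ss_within:.2f}, df = {df_within}, MS = {ms_within:.2f}")
print(f"SS between = {ss_between:.2f}, df = {df_between}, MS = {ms_between:.2f}")
```

The overall mean is computed here from all raw values rather than by averaging the group means, so the same sketch would also work for samples of unequal size.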

Once the mean of the squares within and the mean of the squares between have been calculated, obtaining the F ratio is a simple division:

$$F = \frac{MS_{between}}{MS_{within}} = \frac{105.00}{343.17} = .31$$

With the F ratio calculated, the researcher can then consult the tables in Appendix E and F for the critical values of F at .01 and .05 to determine whether the null hypothesis should be accepted or rejected. With 2 degrees of freedom between groups and 12 degrees of freedom within groups, the tables yield critical values of 6.93 (.01 or 99% level) and 3.88 (.05 or 95% level). Since the obtained F ratio of .31 is smaller than either of the critical values, the null hypothesis is accepted at both levels of significance. Statistically, the mean percent of income spent for entertainment does not differ significantly among the populations of these three income groups. These data provide no evidence that income level affects the percent spent for entertainment.

When reporting the results of an analysis of variance, one must always construct a table showing the sources of variation and the F ratio. The table provides a summary of the calculations and clarifies the findings for the reader of the research report. An example of an F ratio table, which includes the data from the above illustration, is shown below.

F RATIO TABLE
Sources of Variation    df    SS         MS        F Ratio
Between Groups          2     210.00     105.00    .31
Within Groups           12    4118.00    343.17
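The division and the comparison with the critical values can be sketched in Python as follows. The sums of squares, degrees of freedom, and critical values are the ones quoted in the text for this example; the function names and the wording of the decision labels are illustrative assumptions.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) from sums of squares and df."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within

def decide(f_obtained, f_critical):
    """Compare the obtained F with a critical value taken from a table."""
    return "reject H0" if f_obtained >= f_critical else "accept H0"

ms_b, ms_w, f = f_ratio(210.0, 2, 4118.0, 12)

# Critical values for df = (2, 12) as quoted in the text.
critical_values = {0.05: 3.88, 0.01: 6.93}

print(f"{'Source':<16}{'df':>4}{'SS':>10}{'MS':>10}{'F':>7}")
print(f"{'Between Groups':<16}{2:>4}{210.0:>10.2f}{ms_b:>10.2f}{f:>7.2f}")
print(f"{'Within Groups':<16}{12:>4}{4118.0:>10.2f}{ms_w:>10.2f}")
for level, f_crit in critical_values.items():
    print(f"alpha = {level}: critical F = {f_crit}, decision: {decide(f, f_crit)}")
```

Keeping the critical values as plain table look-ups mirrors the chapter's procedure; a statistics library could supply them instead, but that is not required for the comparison itself.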

On those occasions when the obtained value for the F-ratio is significant and the null hypothesis must be rejected, one additional step is needed to determine which sample means are significantly different from each other. The statistic used for this follow-up test is called Tukey's HSD. Tukey's HSD is used to determine the amount of difference that must exist between two sample means for those means to be considered significantly different in a statistical sense. The formula for Tukey's HSD is:

$$HSD = q \sqrt{\frac{MS_{within}}{n}}$$

Where:
$q$ = the value obtained from Appendix G at the appropriate level of confidence
$n$ = the number of cases in each sample

(**Tukey's HSD can only be used when all samples are of equal size.)

In conclusion, analysis of variance is a comparison among three or more independent means. In order to conduct an analysis of variance test, one must have data that are measured at the interval level. Ordinal and nominal data cannot be used. The test also assumes that the samples have been drawn at random from their populations and that the characteristic being studied is normally distributed in the populations. A review of the steps for calculating the F ratio is at the end of the chapter.
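The Tukey's HSD calculation can be sketched in Python, assuming the standard form of the formula, HSD = q times the square root of MS within divided by n. The numbers come from the entertainment example and are used only to show the arithmetic, since that F ratio was not significant and no follow-up test would actually be run. The value q = 3.77 is a placeholder standing in for the Appendix G entry, which is not reproduced here; the function names are illustrative.

```python
from itertools import combinations
from math import sqrt

def tukey_hsd(q, ms_within, n_per_group):
    """Minimum difference between two means needed for significance.

    q must be looked up in a studentized range table; all samples are
    assumed to be the same size.
    """
    return q * sqrt(ms_within / n_per_group)

def significant_pairs(means, hsd):
    """Pairs of groups whose means differ by at least hsd."""
    return [
        (a, b)
        for (a, mean_a), (b, mean_b) in combinations(means.items(), 2)
        if abs(mean_a - mean_b) >= hsd
    ]

means = {"High": 16.0, "Middle": 22.0, "Low": 25.0}
q_placeholder = 3.77  # assumed table value, replace with the Appendix G entry
hsd = tukey_hsd(q_placeholder, ms_within=4118 / 12, n_per_group=5)

print(f"HSD = {hsd:.2f}")
print("Significantly different pairs:", significant_pairs(means, hsd))
```

With these inputs no pair of means differs by as much as the HSD, which agrees with the non-significant F ratio obtained for the example.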

A Major Idea: Remember this concept when applying analysis of variance. Every time a hypothesis is tested for a research situation, the following MUST ALWAYS be applied:
(1) State the RESEARCH QUESTION.
(2) State the NULL HYPOTHESIS.
(3) Conduct the STATISTICAL ANALYSIS.
(4) Draw STATISTICAL CONCLUSIONS.

SEQUENTIAL STATISTICAL STEPS: FINDING THE F RATIO

Step 1 (Organize Data Matrices): Organize all of the distributions in matrices.

Step 2 (sample means): Calculate the means for all of the distributions by adding all of the raw values in each distribution and dividing by the number of values in each distribution.

Step 3 (sums of squares): Find the sums of the deviation values squared for each distribution: $\sum (X - \bar{X}_1)^2$, $\sum (X - \bar{X}_2)^2$, $\sum (X - \bar{X}_3)^2$, $\sum (X - \bar{X}_4)^2$.

Step 4 (SS within): Find the sum of the squares within by adding the deviation values squared for all of the distributions.

Step 5 (overall mean): What is the mean for all the distributions? Add all the means and divide by the number of samples, for example $(\bar{X}_1 + \bar{X}_2 + \bar{X}_3 + \bar{X}_4) / 4$. This shortcut applies when all samples are the same size; otherwise, add all of the raw values and divide by the total number of cases.

Step 6 (SS between): Find the sum of the squares between by subtracting the overall mean from each sample mean, squaring the result, multiplying by the number of values in that distribution, and adding those results.

Step 7 (df between, df within): What are the degrees of freedom? df between = the number of distributions minus 1. df within = add the number of values in each distribution and subtract one for each distribution.

Step 8 (MS within): What is the mean of the squares within groups? Divide the sum of the squares within by the degrees of freedom within.

Step 9 (MS between): What is the mean of the squares between groups? Divide the sum of the squares between by the degrees of freedom between.

Step 10 (MS between / MS within): What is the F ratio? Divide the mean of the squares between by the mean of the squares within.

Step 11 (Construct F Ratio Table): Construct a sources of variation table and enter all F ratio values.

Step 12 (Obtain Critical Values for F): Consult the F ratio table for the critical value of F. The degrees of freedom between is the numerator and the degrees of freedom within is the denominator in the table.

Step 13 (Accept or Reject H0): Make the decision to accept or reject the null hypothesis by comparing the obtained F ratio with the critical value from the tables.

Step 14 (Draw Research Conclusions): Draw research conclusions based on the findings related to the statistical tests.
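Steps 1 through 10 can be collected into one small Python function, sketched below. It accepts data in the two-column layout of value and grouping variable, and it computes the overall mean from all raw values so that samples of unequal size are handled. The function name `one_way_anova` and the keys of the returned dictionary are illustrative assumptions; the critical value in Step 12 must still be taken from the F tables.

```python
from collections import defaultdict

def one_way_anova(records):
    """Compute the F ratio table entries from (value, group) pairs."""
    # Step 1: organize the values by group.
    groups = defaultdict(list)
    for value, label in records:
        groups[label].append(value)

    # Step 2: group means.
    means = {g: sum(v) / len(v) for g, v in groups.items()}

    # Steps 3-4: sum of the squares within.
    ss_within = sum((x - means[g]) ** 2 for g, v in groups.items() for x in v)

    # Step 5: overall mean, taken over all raw values.
    n_total = sum(len(v) for v in groups.values())
    grand_mean = sum(sum(v) for v in groups.values()) / n_total

    # Step 6: sum of the squares between.
    ss_between = sum(
        len(v) * (means[g] - grand_mean) ** 2 for g, v in groups.items()
    )

    # Step 7: degrees of freedom.
    df_between = len(groups) - 1
    df_within = n_total - len(groups)

    # Steps 8-10: mean squares and the F ratio.
    ms_within = ss_within / df_within
    ms_between = ss_between / df_between
    return {
        "df_between": df_between, "ss_between": ss_between,
        "ms_between": ms_between, "df_within": df_within,
        "ss_within": ss_within, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

# The chapter's entertainment example in value/group layout.
example = (
    [(x, "high") for x in (20, 10, 30, 15, 5)]
    + [(x, "middle") for x in (50, 40, 10, 8, 2)]
    + [(x, "low") for x in (60, 30, 20, 10, 5)]
)
result = one_way_anova(example)
print(result)
```

The sketch does not guard against a within-groups mean square of zero, which would occur only if every value equaled its own group mean.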

EXERCISES - CHAPTER 10

(1) Following the step-by-step procedures, calculate an F ratio for the following samples. Show all work. Test the null hypothesis at the .05 and .01 levels. Draw research conclusions.

Factor/Grouping Variable    X
1                           20, 10, 30, 30
2                           50, 40, 30, 40
3                           80, 90, 70, 80

(2) The daily output (in hundreds of units) of four machines in a plant is shown below. Test a hypothesis that there is no statistically significant difference in the productivity of the machines. Draw research conclusions at both the 95% and 99% levels of confidence.

Machine #    Units
1            36, 34, 37, 35, 33
2            19, 24, 24, 26, 22
3            31, 35, 32, 33, 39
4            56, 48, 54, 52, 50

(3) Net receipts (in thousands) for three restaurants in a chain of restaurants for one month are as follows. Test a hypothesis that there is no statistically significant difference in receipts of the restaurants. Draw research conclusions.

Restaurant #    Receipts
1               25, 29, 33, 15, 28, 14
2               20, 24, 17, 35, 22, 16
3               81, 74, 94, 74, 54, 74

(4) A test was conducted to compare the fuel economy of three different government automobiles. The results were expressed in miles per gallon after five tanks of gasoline had been used in each of the automobiles. Test whether or not there are statistically significant differences in fuel economy among the automobiles.

Type    Fuel Economy
a       16, 18, 15, 20, 19
b       13, 13, 15, 14, 16
c       17, 18, 16, 19, 20

If MPG is a major consideration, which model automobile should the government buy?

(5) The Department of Commerce used three different methods to rate effectiveness of trainee programs. Are the three methods equally effective? Draw research conclusions.

Method    Effectiveness Rating
1         89, 100, 99, 70, 50, 21
2         90, 80, 81, 70, 75, 88
3         100, 130, 115, 106, 121, 149