One-Way Analysis of Variance. With regression, we related two quantitative, typically continuous variables.

Similar documents
Analysis of Variance

Statistical methods for comparing multiple groups. Lecture 7: ANOVA. ANOVA: Definition. ANOVA: Concepts

The legacy of Sir Ronald A. Fisher. Fisher s three fundamental principles: local control, replication, and randomization.

Data Analysis and Statistical Methods Statistics 651

Introduction to Business Statistics QM 220 Chapter 12

Lecture (chapter 10): Hypothesis testing III: The analysis of variance

In a one-way ANOVA, the total sums of squares among observations is partitioned into two components: Sums of squares represent:

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1)

DESAIN EKSPERIMEN Analysis of Variances (ANOVA) Semester Genap 2017/2018 Jurusan Teknik Industri Universitas Brawijaya

Analysis of Variance: Part 1

20.0 Experimental Design

Econometrics. 4) Statistical inference

Ch 2: Simple Linear Regression

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t =

While you wait: Enter the following in your calculator. Find the mean and sample variation of each group. Bluman, Chapter 12 1

ANOVA Analysis of Variance

Lec 1: An Introduction to ANOVA

Disadvantages of using many pooled t procedures. The sampling distribution of the sample means. The variability between the sample means

Sociology 6Z03 Review II

STAT 135 Lab 9 Multiple Testing, One-Way ANOVA and Kruskal-Wallis

One-way ANOVA. Experimental Design. One-way ANOVA

Lecture 5: ANOVA and Correlation

The One-Way Repeated-Measures ANOVA. (For Within-Subjects Designs)

Inference for the Regression Coefficient

Factorial designs. Experiments

CHAPTER 4 Analysis of Variance. One-way ANOVA Two-way ANOVA i) Two way ANOVA without replication ii) Two way ANOVA with replication

Introduction to the Analysis of Variance (ANOVA)

Table 1: Fish Biomass data set on 26 streams

Department of Economics. Business Statistics. Chapter 12 Chi-square test of independence & Analysis of Variance ECON 509. Dr.

4.1. Introduction: Comparing Means

ANOVA: Comparing More Than Two Means

WELCOME! Lecture 13 Thommy Perlinger

TWO OR MORE RANDOM EFFECTS. The two-way complete model for two random effects:

Analysis Of Variance Compiled by T.O. Antwi-Asare, U.G

Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z).

What is Experimental Design?

Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest.

Independent Samples ANOVA

Unit 27 One-Way Analysis of Variance

Categorical Predictor Variables

Chapter 8 Student Lecture Notes 8-1. Department of Economics. Business Statistics. Chapter 12 Chi-square test of independence & Analysis of Variance

1 Introduction to One-way ANOVA

Econ 3790: Business and Economic Statistics. Instructor: Yogesh Uppal

Design & Analysis of Experiments 7E 2009 Montgomery

Lecture 3: Inference in SLR

Orthogonal contrasts for a 2x2 factorial design Example p130

Battery Life. Factory

Review. One-way ANOVA, I. What s coming up. Multiple comparisons

Chapter 7 Comparison of two independent samples

22s:152 Applied Linear Regression. Take random samples from each of m populations.

2 Hand-out 2. Dr. M. P. M. M. M c Loughlin Revised 2018

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA

20.1. Balanced One-Way Classification Cell means parametrization: ε 1. ε I. + ˆɛ 2 ij =

CptS 570 Machine Learning School of EECS Washington State University. CptS Machine Learning 1

One-Way Analysis of Variance (ANOVA)

Data Skills 08: General Linear Model

Sampling Distributions: Central Limit Theorem

Lecture 13 Extra Sums of Squares

14 Multiple Linear Regression

W&M CSCI 688: Design of Experiments Homework 2. Megan Rose Bryant

Chapter Seven: Multi-Sample Methods 1/52

AMS7: WEEK 7. CLASS 1. More on Hypothesis Testing Monday May 11th, 2015

ANOVA CIVL 7012/8012

Analysis of Variance Bios 662

The One-Way Independent-Samples ANOVA. (For Between-Subjects Designs)

Unit 12: Analysis of Single Factor Experiments

Review: General Approach to Hypothesis Testing. 1. Define the research question and formulate the appropriate null and alternative hypotheses.

Lec 5: Factorial Experiment

STA 4210 Practise set 2a

Multiple Linear Regression

PLS205 Lab 2 January 15, Laboratory Topic 3

Theorem A: Expectations of Sums of Squares Under the two-way ANOVA model, E(X i X) 2 = (µ i µ) 2 + n 1 n σ2

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.

Chemometrics. Matti Hotokka Physical chemistry Åbo Akademi University

Lecture 7: Hypothesis Testing and ANOVA

Analysis of Variance and Design of Experiments-I

10/31/2012. One-Way ANOVA F-test

Comparing Nested Models

DOE DESIGN OF EXPERIMENT EXPERIMENTAL DESIGN. G. Olmi Università degli Studi di Bologna

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.

ANOVA Situation The F Statistic Multiple Comparisons. 1-Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College

Notes for Week 13 Analysis of Variance (ANOVA) continued WEEK 13 page 1

Unbalanced Data in Factorials Types I, II, III SS Part 1

ANOVA: Comparing More Than Two Means

Topic 17 - Single Factor Analysis of Variance. Outline. One-way ANOVA. The Data / Notation. One way ANOVA Cell means model Factor effects model

Hypothesis Testing hypothesis testing approach

Multiple comparisons - subsequent inferences for two-way ANOVA

Smart Home Health Analytics Information Systems University of Maryland Baltimore County

Analysis of Covariance

Recall that a measure of fit is the sum of squared residuals: where. The F-test statistic may be written as:

Introduction to the Analysis of Variance (ANOVA) Computing One-Way Independent Measures (Between Subjects) ANOVAs

Analysis of Variance (ANOVA)

Ch 3: Multiple Linear Regression

Hypothesis Testing. Hypothesis: conjecture, proposition or statement based on published literature, data, or a theory that may or may not be true

Stat 217 Final Exam. Name: May 1, 2002

Statistics For Economics & Business

Self-Assessment Weeks 8: Multiple Regression with Qualitative Predictors; Multiple Comparisons

Transcription:

One-Way Analysis of Variance With regression, we related two quantitative, typically continuous variables. Often we wish to relate a quantitative response variable with a qualitative (or simply discrete) independent variable, also called a factor. In particular, we wish to compare the mean response value at several levels of the discrete independent variable. Example: We wish to compare the mean wage of farm laborers for 3 different races (black, white, Hispanic). Is there a difference in true mean wage among the ethnic groups? If there were only 2 levels, could do a: For 3 or more levels, must use the Analysis of Variance (ANOVA). The Analysis of Variance tests whether the means of t populations are equal. We test:

Suppose we have t = 4 populations. Why not test: with a series of t-tests? If each test has α =.05, probability of correctly failing to reject H 0 in all 6 tests (when all nulls are true) is: Actual significance level of the procedure is 0.265, not 0.05 We will make some Type I error with probability 0.265 if all 6 means are truly equal. Why Analyze Variances to Compare Means? Look at Figure 6.1, page 223. Case I and Case II: Both have independent samples from 3 populations. The positions of the 3 sample means are the same in each case. In which case would we conclude a definite difference among population means μ 1, μ 2, μ 3? Case I? Case II?

This comparison of variances is at the heart of ANOVA. Assumptions for the ANOVA test: (1) There are t independent samples taken from t populations having means μ 1, μ 2,, μ t. (2) Each population has the same variance, σ 2. (3) Each population has a normal distribution. The data (observed values of the response variable) are denoted: Each sample has size n i, for a total of observations. Example: Y 47 = Notation The i-th level s total: Y i (sum over j) The i-th level s mean: Y i The overall total: Y (sum over i and j) The overall mean: Y

Estimating the variance σ 2 For i = 1,, t, the sum of squares for each level is SS i = Adding all the SS i s gives the pooled sum of squares: Dividing by our degrees of freedom gives our estimate of σ 2 : Recall: For 2-sample t-test, pooled sample variance was: This is the correct estimate of σ 2 if all t populations have equal variances. We will have to check this assumption.

Development of ANOVA F-test Assume sample sizes all equal to n: n 1 = n 2 = = n t (= n) balanced data Suppose H 0 : μ 1 = μ 2 = = μ t (= μ) is true. Then each sample mean variance Y i has mean and Treat these group sample means as the data and treat the overall sample mean as the mean of the group means. Then an estimate of σ 2 / n is: Recall: Consider the statistic:

With normal data, the ratio of two independent estimates of a common variance has an F-distribution. If H 0 true, we expect F* has an F-distribution. If H 0 false (μ 1, μ 2,, μ t not all equal), the sample means should be more spread out. General ANOVA Formulas (Balanced or Unbalanced) We want to compare the variance between (among) the sample means with the variance within the different groups. Variance between group means measured by: and, after dividing by the between groups degrees of freedom,

Variance within groups measured by: and, after dividing by the within groups degrees of freedom, In general, our F-ratio is: Under H 0, F* has an F-distribution with: The total sum of squares for the data: can be partitioned into The degrees of freedom are also partitioned:

This can be summarized in the ANOVA table: Source df SS MS F Example: Table 6.4 (p. 227) gives yields (in pounds/acre) for 4 different varieties of rice (4 observations for each variety) Y 2 i i ni Y 2 n i = = SSB =

Y 2 ij = SSW = ANOVA table for Rice Data: Back to original question: Do the four rice varieties have equal population mean yields or not? H 0 : μ 1 = μ 2 = μ 3 = μ 4 H a : At least one equality is not true Test statistic: At α = 0.05, compare to: Conclusion:

Treatment Effects Linear Model: Our ANOVA model equation: Denote the i-th treatment effect by: The ANOVA model can now be written as: Note that our ANOVA test of: H 0 : μ 1 = μ 2 = = μ t is the same as testing: Note: For balanced data, E(MSB) = and E(MSW) = If H 0 is true (all τ i = 0): If H 0 is false: