STATS Analysis of variance: ANOVA


STATS 1060 Analysis of variance: ANOVA

READINGS: Chapter 28 of your textbook (De Veaux, Velleman and Bock); on-line notes for ANOVA; on-line practice problems for ANOVA.

NOTICE: You should print a copy of both (1) the problems and (2) the F-tables, and bring them with you to class. Solutions will be reviewed in class, and you will have trouble keeping up if you do not have a copy of them with you.

Learning objectives:
Even though you will explore ANOVA in the simplest setting, you will gain insights that will allow you to carry out one-way ANOVA when it is appropriate, you will be able to interpret published results of ANOVA (e.g., in biology and medicine), and you will have a base from which you can delve deeper into this important statistical method.

For this part of the course, your specific objectives are:
1. To understand and be able to explain how ANOVA works.
2. To be able to construct an ANOVA table, and interpret the statistics contained in that table.
3. To be able to use the F distribution to test the null hypothesis that all treatment means are equal.
4. To be able to answer ANOVA problems 1 to 8 that are provided on-line.

ANOVA: tests if means of different groups are equal
One-way ANalysis Of VAriance (ANOVA) is used to compare 3 or more group means, where the groups are defined in just one way.
1. EXPERIMENTAL DATA: Do different treatments have the same mean?
2. OBSERVATIONAL DATA: Do different populations have the same mean?
[Figure: Group 1, Group 2 and Group 3, with the sample mean marked]

ANOVA compares variation within and between groups
[Figure: GPA scores for Dorm A, Dorm B and Dorm C, showing the mean GPA of each dormitory (dorm), the variation between dorm means (a) and the variation within dorms (b). Here a/b < 1.]

Dorm   GPA scores                                          Mean
A      0.60 3.82 4.00 2.22 1.46 2.91 2.20 1.60 0.89 2.30   2.2
B      2.12 2.00 1.03 3.47 3.70 1.72 3.15 3.93 1.26 2.62   2.5
C      3.65 1.57 3.36 1.17 2.55 3.12 3.60 4.00 2.85 2.13   2.8

ANOVA compares variation within and between groups
[Figure: GPA scores for Dorm D, Dorm E and Dorm F, showing the mean GPA of each dormitory (dorm), the variation between dorm means (a) and the variation within dorms (b). Here a/b > 1. Horizontal axis: GPA.]

Dorm   GPA scores                                          Mean
D      2.16 2.23 2.09 2.17 2.25 2.19 2.24 2.28 2.25 2.14   2.2
E      2.45 2.34 2.58 2.49 2.60 2.42 2.55 2.62 2.45 2.50   2.5
F      2.80 2.75 2.93 2.68 2.88 2.75 2.87 2.81 2.73 2.80   2.8

ANOVA: a conceptual overview
ANOVA uses two measures of sample variability that do not depend on the null or alternative hypotheses:
(a) The variability between group means
(b) The variability within each group
ANOVA compares a and b (as the ratio a/b):
If same means, expect: a/b to be about 1 or less
If different means, expect: a/b > 1
But how do we know when a/b is large enough?

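The mathematical model for ANOVA
Observation = grand mean (µ) + treatment effect (τ) + residual (ε)
[Figure: the mean of group 1 ($\mu_1$) and the mean of group 2 ($\mu_2$) on either side of the grand mean $\mu$, with an observation $y_{1,2}$, its treatment effect $\tau_2$ and its residual $\varepsilon_{1,2}$ marked]

$\mu$ = grand mean
$\mu_j$ = mean of the $j$th group
$\tau_j = \mu_j - \mu$ (treatment effect)
$\varepsilon_{i,j} = y_{i,j} - \mu_j$ (residual)

$y_{i,j} = \mu + \tau_j + \varepsilon_{i,j}$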

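The mathematical model for ANOVA
Observations = grand mean + treatment effect + residual

$y_{i,j} = \mu + \tau_j + \varepsilon_{i,j}$ (true parameters)
$y_{i,j} = \bar{y} + \hat{\tau}_j + e_{i,j}$ (sample statistics)

$\bar{y}$, $\hat{\tau}_j$ and $e_{i,j}$ are STATISTICS, and are estimators of the PARAMETERS $\mu$, $\tau_j$ and $\varepsilon_{i,j}$.

$y_{i,j} = \bar{y} + (\bar{y}_j - \bar{y}) + (y_{i,j} - \bar{y}_j)$

where $\hat{\tau}_j = \bar{y}_j - \bar{y}$ = a, and $e_{i,j} = y_{i,j} - \bar{y}_j$ = b.

$\bar{y}$ = grand mean
$\bar{y}_j$ = mean of group $j$
$y_{i,j}$ = the $i$th observation in the $j$th group

* Now we have a way to measure a and b.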
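The sample decomposition can be checked numerically. The sketch below is only an illustration, not part of the course material: it uses the Dorm D, E and F scores from the table in these notes, splits every observation into grand mean + estimated treatment effect + residual, and confirms that the squared pieces add up (SST = SSTR + SSE). The variable names are assumptions chosen for this example.

```python
from statistics import mean

# GPA scores for Dorms D, E and F, copied from the table in these notes.
groups = {
    "D": [2.16, 2.23, 2.09, 2.17, 2.25, 2.19, 2.24, 2.28, 2.25, 2.14],
    "E": [2.45, 2.34, 2.58, 2.49, 2.60, 2.42, 2.55, 2.62, 2.45, 2.50],
    "F": [2.80, 2.75, 2.93, 2.68, 2.88, 2.75, 2.87, 2.81, 2.73, 2.80],
}

all_scores = [y for scores in groups.values() for y in scores]
grand_mean = mean(all_scores)  # y-bar

sstr = 0.0     # sum of squared treatment effects (between groups)
sse = 0.0      # sum of squared residuals (within groups)
max_gap = 0.0  # largest reconstruction error over all observations

for scores in groups.values():
    group_mean = mean(scores)            # y-bar_j
    effect = group_mean - grand_mean     # tau-hat_j = y-bar_j - y-bar
    for y in scores:
        residual = y - group_mean        # e_ij = y_ij - y-bar_j
        rebuilt = grand_mean + effect + residual
        max_gap = max(max_gap, abs(y - rebuilt))
        sstr += effect ** 2
        sse += residual ** 2

# Total sum of squares around the grand mean.
sst = sum((y - grand_mean) ** 2 for y in all_scores)

print(f"grand mean = {grand_mean:.3f}")
print(f"SSTR = {sstr:.4f}, SSE = {sse:.4f}, SST = {sst:.4f}")
print(f"largest reconstruction error = {max_gap:.2e}")
```

Adding the squared effect once per observation is the same as weighting each group's squared effect by its size n_j, which is the form used for SSTR in the mean square formulas.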

Summarize variability with a mean square (MS) statistic

$y_{i,j} = \bar{y} + \underbrace{(\bar{y}_j - \bar{y})}_{\text{between groups}} + \underbrace{(y_{i,j} - \bar{y}_j)}_{\text{within groups}}$

MEAN SQUARED TREATMENT (MSTR) measures the variability between groups (treatments):

$MSTR = \frac{SSTR}{df_1} = \frac{\sum_j n_j (\bar{y}_j - \bar{y})^2}{k - 1}$

MEAN SQUARED ERROR (MSE) measures the variability within groups:

$MSE = \frac{SSE}{df_2} = \frac{\sum_j \sum_i (y_{i,j} - \bar{y}_j)^2}{n - k}$
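As a worked illustration of the MSTR formula (an added example, using only the dorm means 2.2, 2.5 and 2.8 and the group size of 10 from the GPA tables in these notes), the between-groups mean square is the same for Dorms A, B, C and for Dorms D, E, F, because the two sets of dorms share the same three means:

\[
\bar{y} = \frac{10(2.2) + 10(2.5) + 10(2.8)}{30} = 2.5
\]

\[
SSTR = 10(2.2 - 2.5)^2 + 10(2.5 - 2.5)^2 + 10(2.8 - 2.5)^2 = 0.9 + 0 + 0.9 = 1.8
\]

\[
MSTR = \frac{SSTR}{k - 1} = \frac{1.8}{3 - 1} = 0.9
\]

What differs between the two sets of dorms is therefore only the MSE: the scores within Dorms A, B and C are widely spread, while the scores within Dorms D, E and F are tightly clustered around their means.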

The F-ratio of ANOVA is a ratio of mean squares

F-ratio = (variation between group means) / (variation within groups) = "a" / "b"

$F_{df_1, df_2} = F_{k-1,\, n-k} = \frac{MSTR}{MSE}$

If same means, expect: F to be about 1 or less
If different means, expect: F > 1
The F-ratio computed from a sample is called F_DATA.
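To see the F-ratio at work, the following sketch computes MSTR, MSE and F_DATA for the two sets of dorms in the GPA tables. It is an illustration only; the function name `f_ratio` and the variable names are assumptions made for this example, not part of the course software.

```python
def f_ratio(groups):
    """Return (MSTR, MSE, F) for a list of groups of observations."""
    k = len(groups)                          # number of groups
    n = sum(len(g) for g in groups)          # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Between groups: squared distance of each group mean from the grand mean,
    # weighted by group size.
    sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within groups: squared distance of each observation from its group mean.
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)

    mstr = sstr / (k - 1)
    mse = sse / (n - k)
    return mstr, mse, mstr / mse


# Scores copied from the dorm tables in these notes.
dorms_abc = [
    [0.60, 3.82, 4.00, 2.22, 1.46, 2.91, 2.20, 1.60, 0.89, 2.30],
    [2.12, 2.00, 1.03, 3.47, 3.70, 1.72, 3.15, 3.93, 1.26, 2.62],
    [3.65, 1.57, 3.36, 1.17, 2.55, 3.12, 3.60, 4.00, 2.85, 2.13],
]
dorms_def = [
    [2.16, 2.23, 2.09, 2.17, 2.25, 2.19, 2.24, 2.28, 2.25, 2.14],
    [2.45, 2.34, 2.58, 2.49, 2.60, 2.42, 2.55, 2.62, 2.45, 2.50],
    [2.80, 2.75, 2.93, 2.68, 2.88, 2.75, 2.87, 2.81, 2.73, 2.80],
]

mstr_abc, mse_abc, f_abc = f_ratio(dorms_abc)
mstr_def, mse_def, f_def = f_ratio(dorms_def)

print(f"Dorms A-C: MSTR = {mstr_abc:.3f}, MSE = {mse_abc:.4f}, F = {f_abc:.2f}")
print(f"Dorms D-F: MSTR = {mstr_def:.3f}, MSE = {mse_def:.4f}, F = {f_def:.2f}")
```

Both sets of dorms have the same MSTR (0.9), but F_DATA comes out at about 0.84 for Dorms A to C and about 157 for Dorms D to F, because the within-dorm variation differs so much.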

ANOVA without a computer
Calculations are organized in an ANOVA table:

Source      df            Sum of Squares     Mean Square   F_DATA
Treatment   df1 = k - 1   SSTR               MSTR          MSTR/MSE
Error       df2 = n - k   SSE                MSE
Total       df3 = n - 1   SST = SSTR + SSE

To compute MSTR: The easiest way to compute MSTR = SSTR/df1 is to use the following shortcut for computing the grand mean of the sample:

$\bar{y} = \frac{n_1 \bar{y}_1 + n_2 \bar{y}_2 + \dots + n_k \bar{y}_k}{n_1 + n_2 + \dots + n_k}$

The value $n_j$ is the size of the $j$th group and accounts for different sized groups.

To compute MSE: The easiest way to compute MSE = SSE/df2 is to compute SSE from the sample variances ($s_j^2$) as follows:

$SSE = \sum_j (n_j - 1) s_j^2$

This formulation avoids having to compute $e_{i,j}$ for all observations ($i$) and groups ($j$).
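The two shortcuts can be sketched in a few lines of Python. This is an illustration under stated assumptions: to get groups of different sizes, it uses all 10 scores from Dorm A but only the first 8 from Dorm B and the first 6 from Dorm C (an arbitrary choice made for this example, not something done in the course). It builds the ANOVA table from group sizes, means and sample variances alone, then checks the result against the direct calculation.

```python
from statistics import mean, variance

# Dorm scores from these notes; B and C are truncated ONLY to create
# unequal group sizes for this illustration.
dorm_a = [0.60, 3.82, 4.00, 2.22, 1.46, 2.91, 2.20, 1.60, 0.89, 2.30]
dorm_b = [2.12, 2.00, 1.03, 3.47, 3.70, 1.72, 3.15, 3.93, 1.26, 2.62][:8]
dorm_c = [3.65, 1.57, 3.36, 1.17, 2.55, 3.12, 3.60, 4.00, 2.85, 2.13][:6]
groups = [dorm_a, dorm_b, dorm_c]

# Summary statistics: all the shortcut formulas need.
sizes = [len(g) for g in groups]           # n_j
means = [mean(g) for g in groups]          # y-bar_j
variances = [variance(g) for g in groups]  # s_j^2 (sample variance, n_j - 1)

k = len(groups)
n = sum(sizes)
df1, df2 = k - 1, n - k

# Shortcut 1: grand mean as a size-weighted average of the group means.
grand_mean = sum(nj * m for nj, m in zip(sizes, means)) / n
sstr = sum(nj * (m - grand_mean) ** 2 for nj, m in zip(sizes, means))

# Shortcut 2: SSE from the sample variances, no residuals needed.
sse = sum((nj - 1) * s2 for nj, s2 in zip(sizes, variances))

mstr, mse = sstr / df1, sse / df2
f_data = mstr / mse

# Direct calculation, for comparison with the shortcuts.
grand_direct = mean([y for g in groups for y in g])
sse_direct = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)

print(f"{'Source':<10}{'df':>4}{'SS':>10}{'MS':>10}{'F':>8}")
print(f"{'Treatment':<10}{df1:>4}{sstr:>10.4f}{mstr:>10.4f}{f_data:>8.3f}")
print(f"{'Error':<10}{df2:>4}{sse:>10.4f}{mse:>10.4f}")
print(f"{'Total':<10}{n - 1:>4}{sstr + sse:>10.4f}")
```

Because only the sizes, means and variances are used, the same calculation works when a problem gives summary statistics instead of raw data.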

We need to know when the F-ratio is larger than expected by chance when the null hypothesis is true

$H_0: \mu_1 = \mu_2 = \dots = \mu_k$ (this is equivalent to $\tau_1 = \tau_2 = \dots = \tau_k = 0$)
$H_A$: At least one of the means is different

F has a distribution with df1 and df2.
[Figure: F-distribution curves for three different pairs of df1 and df2]

F follows this distribution if the means are the same (i.e., $H_0$ is true)
Total area under the curve = 1
F is always positive, so the curve always starts at 0 and is right skewed
The curve has df1 and df2 because MSTR and MSE of the F-ratio have different dfs
There is a different curve for each pair of dfs
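One way to get a feel for this distribution is to simulate it. The sketch below is an added illustration with assumed settings: 3 groups of 10 observations each, all drawn from the same normal population (so H0 is true by construction), repeated 5000 times with a fixed random seed. The cut-off 3.35 is the tabulated 5% critical value for df1 = 2 and df2 = 27; it is quoted here from a standard F-table rather than computed.

```python
import random


def f_ratio(groups):
    """F = MSTR / MSE for a list of groups of observations."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return (sstr / (k - 1)) / (sse / (n - k))


random.seed(1060)      # fixed seed so the run is repeatable
K = 3                  # number of groups (assumed for this illustration)
N_PER_GROUP = 10       # observations per group (assumed)
REPS = 5000            # number of simulated experiments
F_CRIT = 3.35          # 5% critical value for df1 = 2, df2 = 27, from an F-table

f_values = []
for _rep in range(REPS):
    # Every group comes from the SAME population, so H0 is true.
    groups = [
        [random.gauss(2.5, 1.0) for _obs in range(N_PER_GROUP)]
        for _grp in range(K)
    ]
    f_values.append(f_ratio(groups))

mean_f = sum(f_values) / REPS
share_above = sum(f > F_CRIT for f in f_values) / REPS

print(f"smallest F = {min(f_values):.4f}")
print(f"average F  = {mean_f:.3f}")
print(f"share of F values above {F_CRIT} = {share_above:.3f}")
```

The simulated F values are never negative, they pile up near 0 to 1 with a long right tail, and only a small share (close to 5%) exceed the tabulated cut-off even though the null hypothesis is true in every run.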

Use the F-distribution to test the null hypothesis
[Figure: F-distribution curve divided at F_CRIT (the boundary value of F) into a non-critical region of area 1 - α and a critical region]

CRITICAL VALUE: The value of a random variable (in this case, F) at the BOUNDARY between the acceptance region and the rejection region of a hypothesis test.

Use the CRITICAL VALUE METHOD to test the null:

Step 1: State the null hypothesis and rejection rule
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$
Reject $H_0$ if F_DATA > F_CRIT

Step 2: Determine the critical (boundary) value of F (F_CRIT)
Obtain the CRITICAL VALUE from an F-table

Step 3: Compute the ANOVA statistics and F-ratio for the data (F_DATA)
Display the statistics in an ANOVA table

Step 4: Compare F_DATA to F_CRIT
Reject $H_0$ if F_DATA > F_CRIT; otherwise do not reject $H_0$
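The four steps can be sketched as a small function. This is an added illustration, not the course's own software: the names `one_way_anova` and `critical_value_test` are hypothetical, and the critical value 3.35 (α = 0.05, df1 = 2, df2 = 27) is typed in by hand from a standard F-table, exactly as Step 2 describes. The data are the two sets of dorm GPA scores from these notes.

```python
def one_way_anova(groups):
    """Step 3: compute the entries of the ANOVA table."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    df1, df2 = k - 1, n - k
    mstr, mse = sstr / df1, sse / df2
    return {"df1": df1, "df2": df2, "SSTR": sstr, "SSE": sse,
            "MSTR": mstr, "MSE": mse, "F": mstr / mse}


def critical_value_test(groups, f_crit):
    """Steps 1 and 4: apply the rule 'reject H0 if F_DATA > F_CRIT'."""
    table = one_way_anova(groups)
    return table, table["F"] > f_crit


# Step 2: critical value looked up in an F-table (alpha = 0.05, df = 2 and 27).
F_CRIT = 3.35

dorms_abc = [
    [0.60, 3.82, 4.00, 2.22, 1.46, 2.91, 2.20, 1.60, 0.89, 2.30],
    [2.12, 2.00, 1.03, 3.47, 3.70, 1.72, 3.15, 3.93, 1.26, 2.62],
    [3.65, 1.57, 3.36, 1.17, 2.55, 3.12, 3.60, 4.00, 2.85, 2.13],
]
dorms_def = [
    [2.16, 2.23, 2.09, 2.17, 2.25, 2.19, 2.24, 2.28, 2.25, 2.14],
    [2.45, 2.34, 2.58, 2.49, 2.60, 2.42, 2.55, 2.62, 2.45, 2.50],
    [2.80, 2.75, 2.93, 2.68, 2.88, 2.75, 2.87, 2.81, 2.73, 2.80],
]

table_abc, reject_abc = critical_value_test(dorms_abc, F_CRIT)
table_def, reject_def = critical_value_test(dorms_def, F_CRIT)

for label, table, reject in [("A-C", table_abc, reject_abc),
                             ("D-F", table_def, reject_def)]:
    decision = "reject H0" if reject else "do not reject H0"
    print(f"Dorms {label}: F_DATA = {table['F']:.2f}, "
          f"F_CRIT = {F_CRIT}, decision: {decision}")
```

The critical value depends on α and on both degrees of freedom, so it has to be looked up again whenever the number of groups or the sample size changes.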

Be careful to avoid drawing the wrong conclusions
Rejection of $H_0$ takes only one mean among k to be different!
The most you can conclude for $H_A$ is that at least one mean is different.
You CANNOT determine which group(s) is (are) responsible for rejecting the null by looking at the estimated means! You must carry out POST TESTS.

Lastly, you must verify that the requirements for ANOVA have been met:
Independence
Normality
Equal Variances

Practice problems
The in-class practice problems are distributed on-line via the course web site (through Dal's Online Web Learning, or OWL, resource). Additional problems, and real-time solutions, are provided on-line in the form of screencasts. The additional problems are also provided in PDF form via a link on that site. You are strongly encouraged to try working those problems before watching the screencasts. The additional problems will NOT be covered during class time.

Primary URL: http://awarnach.mathstat.dal.ca/~joeb/stats1060_webcasts/Part_2.html
Alternate URL: http://web.me.com/cadair_idris/stats1060/part_2.html

On-line supplements ANOVA Fall 2011