Lecture 14: ANOVA and the F-test

S. Massa, Department of Statistics, University of Oxford, 3 February 2016

Example

Consider a study of 973 individuals that examines the relationship between duration of breastfeeding and adult intelligence. Each individual took 3 tests, and breastfeeding duration was recorded in 5 classes.

                             Duration of breastfeeding (months)
Test            Statistic    1      2-3    4-6    7-9    > 9
N                            272    305    269    104    23
Verbal IQ       Adj. mean    99.7   102.3  102.7  105.7  103.0
                SD           16.0   14.9   15.7   13.3   15.2
Performance IQ  Adj. mean    99.1   100.6  101.3  105.1  104.4
                SD           15.8   15.2   15.6   13.9   14.9
Full Scale IQ   Adj. mean    99.4   101.7  102.3  106.0  104.0
                SD           15.9   15.2   15.7   13.1   14.4

Example

First of all, notice that we list adjusted means. This means that the actual data have been analysed to remove the effects of confounding factors (mother smoking, parents' income, etc.), so that the effect of breastfeeding can be isolated.

Test whether the duration of breastfeeding affects adult intelligence.

The General Setup

Suppose we have independent samples from $K$ different normal distributions, with means $\mu_1, \dots, \mu_K$ and common variance $\sigma^2$. We test $H_0 : \mu_1 = \dots = \mu_K$.

We call the $K$ groups levels. We have $n_i$ samples from the $i$-th level, $X_{i1}, X_{i2}, \dots, X_{in_i}$, and $N = \sum_{i=1}^{K} n_i$ total observations. The overall sample mean is $\bar{X}$, while the sample mean of group $i$ is $\bar{X}_i$:

\[ \bar{X}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} X_{ij}, \qquad \bar{X} = \frac{1}{N} \sum_{i,j} X_{ij}. \]

The Idea Behind ANOVA

If the $K$ means are all equal, then the observations should be about as far from their own level mean $\bar{X}_i$ as they are from the overall mean $\bar{X}$.

If the means are different, then the observations should be closer to the mean of their own level than to the overall mean.

Between Groups and Error Sum of Squares

The Between Groups Sum of Squares (BSS) is the total squared deviation of the group means from the overall mean, weighted by group size:

\[ BSS = \sum_{i=1}^{K} n_i (\bar{X}_i - \bar{X})^2. \]

The Error Sum of Squares (ESS) is the total squared deviation of the samples from their group means:

\[ ESS = \sum_{i=1}^{K} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2 = \sum_{i=1}^{K} (n_i - 1) s_i^2, \]

where $s_i$ is the sample SD of the observations in level $i$.

Total Sum of Squares

The Total Sum of Squares (TSS) is the total squared deviation of the samples from the overall mean:

\[ TSS = \sum_{i,j} (X_{ij} - \bar{X})^2 = (N - 1) s^2, \]

where $s$ is the sample SD of all observations together. We also define the mean squares

\[ BMS = \frac{BSS}{K - 1}, \qquad EMS = \frac{ESS}{N - K}. \]

ANOVA

Two basic mathematical facts lie behind ANOVA.

First, $TSS = ESS + BSS$. The variability among the data can be split into two pieces:

1. the variability among the means of the groups;
2. the variability within the groups.

We evaluate how the total variability is split between the two types: too much between-group variability casts doubt on the validity of the null hypothesis.

Second, EMS is an estimate of $\sigma^2$, and under the null hypothesis so is BMS. If the means differ, BMS tends to be larger than $\sigma^2$.
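The first fact can be checked directly. The following is a short derivation in the notation of the earlier slides; it is a standard argument, not taken from the slides themselves. It adds and subtracts the level mean inside each squared deviation:

```latex
\begin{align*}
TSS &= \sum_{i=1}^{K} \sum_{j=1}^{n_i} (X_{ij} - \bar{X})^2
     = \sum_{i=1}^{K} \sum_{j=1}^{n_i}
       \bigl[ (X_{ij} - \bar{X}_i) + (\bar{X}_i - \bar{X}) \bigr]^2 \\
    &= \sum_{i=1}^{K} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2
     + 2 \sum_{i=1}^{K} (\bar{X}_i - \bar{X})
         \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)
     + \sum_{i=1}^{K} n_i (\bar{X}_i - \bar{X})^2 \\
    &= ESS + 0 + BSS .
\end{align*}
```

The cross term vanishes because the deviations from a level's own mean sum to zero: $\sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i) = 0$ for every $i$.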

The F-statistic

The F-statistic is

\[ F = \frac{BMS}{EMS} = \frac{N - K}{K - 1} \cdot \frac{BSS}{ESS}. \]

The critical region is of the form $\{F \ge f\}$, where $f$ depends on the significance level of the test. Essentially, we reject the null hypothesis for large values of $F$ (that is, when BMS is much bigger than EMS).

The F-distribution

Under the null hypothesis, the F-statistic has the F-distribution with $(K - 1, N - K)$ degrees of freedom. This is a continuous distribution on the positive real numbers with two parameters.

Figure: The probability density function of the F-distribution for various degrees of freedom.

The F-statistic

The important quantities are summarised in the following table:

Source           SS     d.f.    MS                   F
Between Groups   BSS    K - 1   BMS = BSS/(K - 1)    BMS/EMS
Within Groups    ESS    N - K   EMS = ESS/(N - K)
Total            TSS    N - 1

ANOVA: the Breastfeeding Study

Recall the data from the breastfeeding study. The calculation that follows uses the Full Scale IQ rows:

                             Duration of breastfeeding (months)
Test            Statistic    1      2-3    4-6    7-9    > 9
N                            272    305    269    104    23
Full Scale IQ   Adj. mean    99.4   101.7  102.3  106.0  104.0
                SD           15.9   15.2   15.7   13.1   14.4

Find TSS via $TSS = ESS + BSS$.

Example: the Breastfeeding Study

Using the Full Scale IQ group sizes and SDs:

\[ ESS = \sum_{k=1}^{5} (n_k - 1) s_k^2 = 271 \cdot 15.9^2 + 304 \cdot 15.2^2 + 268 \cdot 15.7^2 + 103 \cdot 13.1^2 + 22 \cdot 14.4^2 \approx 227000. \]

Example: the Breastfeeding Study

Using the Full Scale IQ group sizes and adjusted means:

\[ BSS = \sum_{k=1}^{5} n_k (\bar{x}_k - \bar{x})^2 = 272 (99.4 - 101.7)^2 + 305 (101.7 - 101.7)^2 + 269 (102.3 - 101.7)^2 + 104 (106.0 - 101.7)^2 + 23 (104.0 - 101.7)^2 \approx 3579, \]

where the overall mean $\bar{x} \approx 101.74$ is shown rounded to 101.7; the final figure uses the unrounded value.

Example: the Breastfeeding Study

We complete the calculation as follows.

Table: ANOVA table for breastfeeding data: Full Scale IQ, Adjusted.

Source            SS                           d.f.   MS                       F
Between Samples   3579                         4      894.8 (= 3579/4)         3.81 (= 894.8/234.6)
Within Samples    227000                       968    234.6 (= 227000/968)
Total             230600 (= 3579 + 227000)     972

Since $N = 973$ and $K = 5$, under the null hypothesis the F-statistic is distributed according to $F(4, 968)$.
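The table can be reproduced from the summary statistics alone. The sketch below is a minimal illustration, not part of the original slides; the function name `anova_from_summary` is made up for this example. It uses the Full Scale IQ group sizes, adjusted means and SDs. With the unrounded overall mean, the between-groups sum of squares evaluates to about 3579, consistent with the tabulated BMS of 894.8.

```python
# One-way ANOVA from summary statistics (group sizes, means, SDs).
# Values are the Full Scale IQ rows of the breastfeeding table.
n = [272, 305, 269, 104, 23]
mean = [99.4, 101.7, 102.3, 106.0, 104.0]
sd = [15.9, 15.2, 15.7, 13.1, 14.4]


def anova_from_summary(n, mean, sd):
    """Return (BSS, ESS, F) for a one-way layout given per-group summaries.

    Hypothetical helper written for this example.
    """
    N = sum(n)   # total number of observations
    K = len(n)   # number of levels
    grand_mean = sum(ni * mi for ni, mi in zip(n, mean)) / N
    # Between-groups: squared deviations of group means, weighted by size.
    bss = sum(ni * (mi - grand_mean) ** 2 for ni, mi in zip(n, mean))
    # Within-groups: (n_i - 1) * s_i^2 summed over the levels.
    ess = sum((ni - 1) * si ** 2 for ni, si in zip(n, sd))
    bms = bss / (K - 1)
    ems = ess / (N - K)
    return bss, ess, bms / ems


bss, ess, f_stat = anova_from_summary(n, mean, sd)
print(f"BSS = {bss:.0f}, ESS = {ess:.0f}, F = {f_stat:.2f}")
```

Working from summaries is only possible because ESS can be written as a sum of $(n_i - 1) s_i^2$; TSS then follows from $TSS = ESS + BSS$ without access to the raw data.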

Example: the Breastfeeding Study

Having computed $F = 3.81$, we now look up the critical value in our table for the 0.05 level: $K - 1 = 4$, so we pick the fourth column, and $N - K = 968$ is much more than 60, so we take the bottom row. The critical value turns out to be 2.37, so we reject the null hypothesis at the 0.05 level.

The Kruskal-Wallis Test

The F-test rests on a basic assumption: the samples are normally distributed; that is, the F-test is parametric.

A non-parametric alternative is the Kruskal-Wallis test. As with the rank sum test, the basic idea is to substitute ranks for the actual observed values.

The Kruskal-Wallis test

Suppose there are $K$ levels, with $n_i$ observations in level $i$. Assign to each observation its rank relative to the whole sample. Sum the ranks in each group, giving rank sums $R_1, \dots, R_K$. The Kruskal-Wallis test statistic is

\[ H = \frac{12}{N(N + 1)} \sum_{i=1}^{K} \frac{R_i^2}{n_i} - 3(N + 1). \tag{1} \]

Under the null hypothesis, $H$ is approximately distributed as $\chi^2_{K-1}$.

Exercise and Bone Density in Rats

A study was performed to examine the effect of exercise on bone density in rats. 30 rats were divided into three groups of ten: high, low and control. Their bone density was measured at the end of the treatment period. Test

$H_0$: different groups have the same mean density
$H_1$: different groups have different mean density

Exercise and Bone Density in Rats

High      626  650  622  674  626  643  622  650  643  631
Low       594  599  635  605  632  588  596  631  607  638
Control   614  569  653  593  611  600  603  593  621  554

Compute

$\bar{x}_{high} = 638.70$, $s^2_{high} = 275.34$
$\bar{x}_{low} = 612.50$, $s^2_{low} = 373.61$
$\bar{x}_{cont} = 601.10$, $s^2_{cont} = 748.77$
$\bar{x} = 617.4$

Thus

\[ ESS = 9 s^2_{high} + 9 s^2_{low} + 9 s^2_{cont} = 12579.5, \]

\[ BSS = 10 (638.7 - 617.4)^2 + 10 (612.5 - 617.4)^2 + 10 (601.1 - 617.4)^2 = 7433.9. \]

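Exercise and Bone Density in Rats

ANOVA table:

Source                        SS      d.f.   MS     F
Between Treatments            7434    2      3717   7.98
Errors (Within Treatments)    12580   27     466
Total                         20014   29

The number of degrees of freedom here is $(K - 1, N - K) = (2, 27)$. There is no row for 27 in the table, so we just look at the row for 30 and find the critical value to be 3.32. The exact critical value for 27 degrees of freedom is slightly larger (about 3.35), but $F = 7.98$ far exceeds either, so we reject the null at the 5% level.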
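The same table can be computed from the raw measurements. The sketch below is an illustration rather than part of the slides; `one_way_anova` is a made-up helper name. It also evaluates a p-value using a closed form for the F tail that holds only when the numerator degrees of freedom equal 2, as they do here; that shortcut is an addition for this example and does not replace the table lookup.

```python
# Bone density data for the three exercise groups (from the slides).
groups = {
    "high": [626, 650, 622, 674, 626, 643, 622, 650, 643, 631],
    "low": [594, 599, 635, 605, 632, 588, 596, 631, 607, 638],
    "control": [614, 569, 653, 593, 611, 600, 603, 593, 621, 554],
}


def one_way_anova(samples):
    """Return (BSS, ESS, TSS, F) for a list of samples, one per level.

    Hypothetical helper written for this example.
    """
    all_obs = [x for s in samples for x in s]
    N, K = len(all_obs), len(samples)
    grand = sum(all_obs) / N
    means = [sum(s) / len(s) for s in samples]
    bss = sum(len(s) * (m - grand) ** 2 for s, m in zip(samples, means))
    ess = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    tss = sum((x - grand) ** 2 for x in all_obs)
    f_stat = (bss / (K - 1)) / (ess / (N - K))
    return bss, ess, tss, f_stat


bss, ess, tss, f_stat = one_way_anova(list(groups.values()))

# Closed-form tail probability, valid only when the numerator d.f. is 2:
# P(F > f) = (1 + 2 f / d2) ** (-d2 / 2), with d2 = N - K.
d2 = 27
p_value = (1 + 2 * f_stat / d2) ** (-d2 / 2)

print(f"BSS={bss:.1f} ESS={ess:.1f} TSS={tss:.1f} F={f_stat:.2f} p={p_value:.4f}")
```

Computing TSS independently of BSS and ESS gives a numerical check of the identity $TSS = ESS + BSS$ on this data set.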

Exercise and Bone Density in Rats I

Use the Kruskal-Wallis test. First we assign ranks to the data, giving tied observations the average of the ranks they share.

High      18.5  27.5  16.5  30    18.5  25.5  16.5  27.5  25.5  20.5
Low       6     8     23    11    22    3     7     20.5  12    24
Control   14    2     29    4.5   13    9     10    4.5   15    1

The rank sums are then

$R_{high} = 226.5$, $R_{low} = 136.5$, $R_{cont} = 102$.

Exercise and Bone Density in Rats II

Then compute

\[ H = \frac{12}{N(N + 1)} \sum_{i=1}^{K} \frac{R_i^2}{n_i} - 3(N + 1) = \frac{12}{30 \cdot 31} \left[ \frac{226.5^2}{10} + \frac{136.5^2}{10} + \frac{102^2}{10} \right] - 3 \cdot 31 = 10.66. \]

At the 5% level, the critical value for $\chi^2$ with $K - 1 = 2$ degrees of freedom is 5.99. We therefore still reject the null hypothesis.
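The ranking and the statistic can be sketched in a few lines. This is an illustrative implementation of formula (1) as stated on the slides, with averaged ranks for ties and no tie correction; the helper names `average_ranks` and `kruskal_wallis` are made up for this example. The p-value uses the fact that a chi-square tail with exactly 2 degrees of freedom equals exp(-h/2), which applies here because K - 1 = 2.

```python
import math

# Bone density data for the three exercise groups (from the slides).
groups = {
    "high": [626, 650, 622, 674, 626, 643, 622, 650, 643, 631],
    "low": [594, 599, 635, 605, 632, 588, 596, 631, 607, 638],
    "control": [614, 569, 653, 593, 611, 600, 603, 593, 621, 554],
}


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1..end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def kruskal_wallis(samples):
    """Return (rank sums, H) using formula (1), without a tie correction."""
    pooled = [x for s in samples for x in s]
    ranks = average_ranks(pooled)
    N = len(pooled)
    rank_sums, start = [], 0
    for s in samples:
        rank_sums.append(sum(ranks[start:start + len(s)]))
        start += len(s)
    weighted = sum(r ** 2 / len(s) for r, s in zip(rank_sums, samples))
    h = 12 / (N * (N + 1)) * weighted - 3 * (N + 1)
    return rank_sums, h


rank_sums, h = kruskal_wallis(list(groups.values()))

# For 2 degrees of freedom the chi-square tail is exactly exp(-h / 2).
p_value = math.exp(-h / 2)

print(rank_sums, round(h, 2), round(p_value, 4))
```

Statistical libraries often apply a correction for ties, so their reported H may be slightly larger than the value obtained from formula (1).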

Recap

Given independent samples from $K$ normally distributed populations $N(\mu_1, \sigma^2), \dots, N(\mu_K, \sigma^2)$, we want to test whether the level means $\mu_1, \dots, \mu_K$ could all be equal.

We compute ESS, the squared deviations of observations from their own group mean, and BSS, the squared deviations of group means from the overall mean.

Failure of the null should result in a higher BMS compared to EMS. The F-statistic is defined as

\[ F = \frac{N - K}{K - 1} \cdot \frac{BSS}{ESS} = \frac{BMS}{EMS}. \]

Under the null, $F$ has the F-distribution with $(K - 1, N - K)$ degrees of freedom.

Recap

We summarise our calculation in an ANOVA table:

Source                        SS        d.f.        MS              F
Between Treatments            BSS (A)   K - 1 (B)   BMS (X = A/B)   X/Y
Errors (Within Treatments)    ESS (C)   N - K (D)   EMS (Y = C/D)
Total                         TSS       N - 1

Recap

The F-test depends crucially on our data being normally distributed. If we have reason to believe this may not be satisfied, then we use the non-parametric Kruskal-Wallis test.

Replace the data by their ranks relative to the whole sample. Let $R_i$ be the rank sum in the $i$-th level. Then

\[ H = \frac{12}{N(N + 1)} \sum_{i=1}^{K} \frac{R_i^2}{n_i} - 3(N + 1). \tag{2} \]

Under the null, $H$ is approximately distributed as $\chi^2_{K-1}$.