STAT 135 Lab 9 Multiple Testing, One-Way ANOVA and Kruskal-Wallis


Rebecca Barter, April 6, 2015

Multiple Testing

Multiple Testing

Recall that when we were doing two-sample t-tests, we were testing the equality of means between two populations:

$H_0: \mu_X = \mu_Y$ versus $H_1: \mu_X \neq \mu_Y$

We could think of these two samples, X and Y, as coming from two experimental conditions: X corresponds to the treatment group and Y corresponds to the control group. But what if we had more than two experimental conditions?

Multiple Testing

Suppose we had I experimental conditions for which we wanted to test the equality of means, rejecting if our p-value is less than $\alpha = 0.05$:

$H_0: \mu_1 = \mu_2 = \ldots = \mu_I$
$H_1$: at least one of the experimental conditions has a different mean

Why might it be a bad idea to test each of these equalities via multiple two-sample t-tests, rejecting each individual test if its p-value is below $\alpha = 0.05$?

$H_0: \mu_1 = \mu_2$
$H_0: \mu_2 = \mu_3$
$\vdots$
$H_0: \mu_{I-1} = \mu_I$

Multiple Testing

Conducting multiple hypothesis tests leads to the multiple testing problem. Suppose we are conducting multiple tests, each at the 5% level. For example, suppose that we want to assess the efficacy of a drug in terms of the reduction of any one of I disease symptoms:

$H_0: \mu_X^{(1)} = \mu_Y^{(1)}$ (symptom 1)
$H_0: \mu_X^{(2)} = \mu_Y^{(2)}$ (symptom 2)
$\vdots$
$H_0: \mu_X^{(I)} = \mu_Y^{(I)}$ (symptom I)

As more symptoms are considered, it becomes more likely that the new drug (Y) will appear to improve over the existing drug (X) in terms of at least one symptom just by chance.

Multiple Testing

$H_0: \mu_X^{(1)} = \mu_Y^{(1)}$ (symptom 1)
$H_0: \mu_X^{(2)} = \mu_Y^{(2)}$ (symptom 2)
$\vdots$
$H_0: \mu_X^{(I)} = \mu_Y^{(I)}$ (symptom I)

For each individual test, the probability of incorrectly rejecting the null hypothesis (when the null hypothesis is true) is 5%. For 100 tests where the null hypotheses are all true, the expected number of incorrect rejections is 5. If the tests are independent, the probability of at least one incorrect rejection is $1 - 0.95^{100} \approx 99.4\%$ (which is MUCH larger than 5%).
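As a quick numerical check, both figures above follow from a one-line binomial calculation; a minimal Python sketch:

```python
# Expected false rejections and FWER for m independent tests at level
# alpha, when every null hypothesis is true.
alpha, m = 0.05, 100

expected_rejections = m * alpha      # 100 * 0.05 = 5
fwer = 1 - (1 - alpha) ** m          # 1 - 0.95^100 ~ 0.994

print(expected_rejections, round(fwer, 3))  # 5.0 0.994
```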

Multiple Testing

How can we deal with the multiple testing problem? One common approach is to control the family-wise error rate (FWER), the probability of at least one incorrect rejection, at 5%. Recall that if we conduct 100 independent tests, each at the 5% level, then the FWER is 99.4%. We need to adjust our method so that we lower the FWER to 5%. Idea: maybe we can perform each individual test at a lower significance level?

Multiple Testing: Bonferroni Correction

The Bonferroni multiple testing correction: suppose that you intend to conduct I hypothesis tests, each at level $\alpha$. Then to fix the family-wise error rate at $\alpha$, you should instead conduct each of the I hypothesis tests at level $\alpha / I$. That is, for each test, only reject $H_0$ if the p-value is less than $\alpha / I$ instead of $\alpha$. If $\alpha = 0.05$ and $I = 100$, this means you reject when you get p-values less than $0.05/100 = 0.0005$. Then the probability of (at least) one incorrect rejection will be at most $\alpha$.
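A small Python sketch of the rule (the p-values below are made-up numbers, purely for illustration):

```python
import numpy as np

# Bonferroni correction: reject test i only if p_i < alpha / I.
alpha = 0.05
p_values = np.array([0.0001, 0.003, 0.02, 0.04, 0.51])  # illustrative only
I = len(p_values)

reject_uncorrected = p_values < alpha      # each test at level alpha
reject_bonferroni = p_values < alpha / I   # each test at level alpha / I

print(reject_uncorrected)  # [ True  True  True  True False]
print(reject_bonferroni)   # [ True  True False False False]
```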

Multiple Testing

Instead of testing each equality separately and correcting for multiple comparisons, is there a way to simply test the joint hypothesis

$H_0: \mu_1 = \mu_2 = \ldots = \mu_I$
$H_1$: at least one of the experimental conditions has a different mean

using a single test? The answer is yes: we can use ANOVA. Downside of ANOVA: rejecting $H_0$ does not tell you which of the groups have different means!

ANOVA: Analysis of Variance

ANOVA In ANOVA, we are asking the question: Do all our groups come from populations with the same mean? To answer this question, we need to compare the sample means. However, there will always be some differences due to sampling variation. A more reasonable question might instead be: Are the observed differences between the sample means simply due to sampling variation or due to real differences in the population means? This question cannot be answered just from the sample means - we also need to look at the variability.

ANOVA In ANOVA we compare: the variability between the groups (how much variability is observed between the means from different groups?) to the variability within the groups (how much natural variation is there in our measurements?). If the variability between the groups is not significantly more than the variability within the groups, then it is likely that the observed differences between the group means are simply due to chance.

ANOVA

The assumptions required for ANOVA are:
- the observations are independent
- the observations are random samples from normal distributions
- the populations have a common variance, $\sigma^2$.

ANOVA

Suppose we have a sample of J observations from each of I populations (groups):

G_1    G_2    ...    G_I
Y_11   Y_21   ...    Y_I1
Y_12   Y_22   ...    Y_I2
...
Y_1J   Y_2J   ...    Y_IJ

We are interested in testing

$H_0: \mu_1 = \mu_2 = \ldots = \mu_I$
$H_1$: at least one of the populations has a different mean

where $\mu_k$ is the true mean for population k.

ANOVA

We can formulate a model as follows:

$Y_{ij} = \mu + \alpha_i + \epsilon_{ij}$, where $\epsilon_{ij} \overset{IID}{\sim} N(0, \sigma^2)$

- $Y_{ij}$ is the observed response for the jth observation in group i
- $\mu$ is the overall global mean across all groups
- $\alpha_i$ is an offset from the global mean for group i, such that $\mu_i = \mu + \alpha_i$ (with the identifiability constraint $\sum_{i=1}^{I} \alpha_i = 0$)
- $\epsilon_{ij}$ is a random error term for the jth observation in group i
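To make the model concrete, here is a small Python simulation from it; the values of mu, the offsets alpha_i and sigma below are arbitrary choices for illustration (the offsets are chosen to sum to zero):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate from the one-way ANOVA model Y_ij = mu + alpha_i + eps_ij.
mu = 10.0                              # illustrative global mean
alphas = np.array([-1.0, 0.0, 1.0])    # illustrative group offsets (sum to 0)
sigma, J = 2.0, 20                     # common SD, observations per group

# Y has one row per group, one column per observation.
Y = np.array([mu + a + rng.normal(0, sigma, size=J) for a in alphas])
print(Y.mean(axis=1))   # group means, close to mu + alpha_i
```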

ANOVA

The ANOVA model is $Y_{ij} = \mu + \alpha_i + \epsilon_{ij}$. Recall that the null hypothesis in ANOVA is

$H_0: \mu_1 = \mu_2 = \ldots = \mu_I$

Since $\mu_i = \mu + \alpha_i$, this is equivalent to

$H_0: \alpha_i = 0$ for all $i$

ANOVA

Let's define some notation:
- $Y_{ij}$ represents the jth observation in group i
- $\bar{Y}_i$ represents the mean in group i
- $\bar{Y}$ represents the mean of all observations

which we can calculate as follows:

$\bar{Y}_i = \frac{1}{J} \sum_{j=1}^{J} Y_{ij}$, $\qquad \bar{Y} = \frac{1}{IJ} \sum_{i=1}^{I} \sum_{j=1}^{J} Y_{ij} = \frac{1}{IJ} \sum_{i,j} Y_{ij}$

ANOVA

We can capture the different types of variance via the sums of squares:

Total sum of squares: compares each observation to the overall mean (overall variance)

$SS_T = \sum_{i=1}^{I} \sum_{j=1}^{J} (Y_{ij} - \bar{Y})^2$

Between sum of squares: compares the group means to the overall mean (variance between groups)

$SS_B = J \sum_{i=1}^{I} (\bar{Y}_i - \bar{Y})^2$

Within sum of squares: compares each observation to its corresponding group mean (variance within groups)

$SS_W = \sum_{i=1}^{I} \sum_{j=1}^{J} (Y_{ij} - \bar{Y}_i)^2$

ANOVA

It is not hard to show that the total sum of squares can be decomposed into the sum of the between sum of squares and the within sum of squares:

$SS_T = SS_B + SS_W$

that is,

$\sum_{i=1}^{I} \sum_{j=1}^{J} (Y_{ij} - \bar{Y})^2 = J \sum_{i=1}^{I} (\bar{Y}_i - \bar{Y})^2 + \sum_{i=1}^{I} \sum_{j=1}^{J} (Y_{ij} - \bar{Y}_i)^2$

ANOVA

Each source of variability also has its own associated degrees of freedom:
- $SS_T$ compares the IJ observations to the overall mean, so it has $IJ - 1$ degrees of freedom
- $SS_B$ compares the I group means to the overall mean, so it has $I - 1$ degrees of freedom
- $SS_W$ compares the IJ observations to the I group means, so it has $IJ - I = I(J - 1)$ degrees of freedom

Notice that

$IJ - 1 = (I - 1) + I(J - 1)$, i.e., $df_T = df_B + df_W$

so the degrees of freedom are related in the same way as the sums of squares.

ANOVA

Recall that to test $H_0$, we want to compare the between sum of squares to the within sum of squares. Thus we define our test statistic (standardized by degrees of freedom) to be

$F := \frac{SS_B / (I-1)}{SS_W / (I(J-1))}$

We note that, under $H_0$,

$\frac{SS_B}{\sigma^2} = \frac{J \sum_{i=1}^{I} (\bar{Y}_i - \bar{Y})^2}{\sigma^2} \sim \chi^2_{I-1}$

$\frac{SS_W}{\sigma^2} = \frac{\sum_{i=1}^{I} \sum_{j=1}^{J} (Y_{ij} - \bar{Y}_i)^2}{\sigma^2} \sim \chi^2_{I(J-1)}$

and thus, under $H_0$, the distribution of our test statistic is

$F := \frac{SS_B / (I-1)}{SS_W / (I(J-1))} \sim F_{I-1,\, I(J-1)}$

ANOVA

In summary, if we want to test $H_0: \mu_1 = \mu_2 = \ldots = \mu_I$ using ANOVA, we calculate the test statistic

$F = \frac{SS_B / (I-1)}{SS_W / (I(J-1))}$

and the corresponding p-value

$\text{p-value} = P\left( F_{I-1,\, I(J-1)} \geq F \right)$

(to understand the form of the p-value, think about what a more extreme value of F with respect to $H_1$ would correspond to).
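Putting the pieces together, a minimal Python sketch of the whole procedure for an I x J data array (rows are groups; the helper name one_way_anova is just for illustration):

```python
import numpy as np
from scipy import stats

def one_way_anova(Y):
    """One-way ANOVA for an I x J array Y (rows = groups, equal sizes)."""
    I, J = Y.shape
    grand_mean = Y.mean()
    group_means = Y.mean(axis=1)

    ss_b = J * np.sum((group_means - grand_mean) ** 2)   # between groups
    ss_w = np.sum((Y - group_means[:, None]) ** 2)       # within groups

    F = (ss_b / (I - 1)) / (ss_w / (I * (J - 1)))
    p = stats.f.sf(F, I - 1, I * (J - 1))   # P(F_{I-1, I(J-1)} >= F)
    return F, p
```

For the same array, scipy.stats.f_oneway(*Y) should return an identical F statistic and p-value.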

ANOVA

It is common to summarize the data in the following table, from which the test statistic can be read off:

Source of variability    df        SS      MS      F
Treatments (between)     I - 1     SS_B    MS_B    MS_B / MS_W
Error (within)           I(J - 1)  SS_W    MS_W
Total                    IJ - 1    SS_T

where $MS_B = SS_B / (I-1)$ and $MS_W = SS_W / (I(J-1))$ are the mean sums of squares.

Example: ANOVA

Suppose that we only knew some of the values in the table. We can usually fill in the rest and perform the test. Suppose we have 6 groups and 4 samples from each group, but the only values in the table given to us are:

Source of variability    df    SS    MS    F
Treatments               ?     ?     ?     ?
Error                    ?     55    ?
Total                    ?     98

So all we know is $SS_W = 55$ and $SS_T = 98$.

Example: ANOVA

We know, however, that $I = 6$ and $J = 4$, so we can fill in the degrees of freedom:

$df_B = I - 1 = 5$
$df_W = I(J - 1) = 6 \times 3 = 18$
$df_T = IJ - 1 = 6 \times 4 - 1 = 23$

Source of variability    df    SS    MS    F
Treatments               5     ?     ?     ?
Error                    18    55    ?
Total                    23    98

We can check that this makes sense, since we should have $df_T = df_B + df_W$.

Example: ANOVA

We also know that $SS_T = SS_B + SS_W$, so

$SS_B = SS_T - SS_W = 98 - 55 = 43$

Source of variability    df    SS    MS    F
Treatments               5     43    ?     ?
Error                    18    55    ?
Total                    23    98

Example: ANOVA

The remaining entries are easy: to get the MS column, just divide the SS column by the df column.

Source of variability    df    SS    MS       F
Treatments               5     43    43/5     ?
Error                    18    55    55/18
Total                    23    98

Example: ANOVA

To get the F value, divide $MS_B$ (treatments) by $MS_W$ (error):

Source of variability    df    SS    MS       F
Treatments               5     43    43/5     2.81
Error                    18    55    55/18
Total                    23    98

So our p-value is $P(F_{5,18} \geq 2.81) = 0.048$.
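The arithmetic in the filled-in table can be checked in a couple of lines of Python:

```python
from scipy import stats

# F = MS_B / MS_W = (43/5) / (55/18) on (5, 18) degrees of freedom.
F = (43 / 5) / (55 / 18)
print(round(F, 2))                     # 2.81
print(round(stats.f.sf(F, 5, 18), 3))  # ~0.048, matching the table
```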

Exercise

Exercise: ANOVA (Rice 12.5.3)

For a one-way ANOVA with two groups ($I = 2$), show that the F statistic is just $t^2$, where t is the t statistic for a two-sample t-test. That is, show that ANOVA with two groups is equivalent to a two-sample t-test with the common variance assumption.
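Before working through the algebra, you can sanity-check the claim numerically; the sketch below simulates two arbitrary groups and compares the ANOVA F statistic with the square of the pooled t statistic (a check, not the requested proof):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Two illustrative groups; any two samples will do for this check.
x = rng.normal(0, 1, size=15)
y = rng.normal(0.5, 1, size=15)

F, _ = stats.f_oneway(x, y)
t, _ = stats.ttest_ind(x, y)   # pooled-variance t-test by default

print(np.isclose(F, t ** 2))   # True: F equals t squared
```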

Exercise: ANOVA

Recall that a few weeks ago, I wanted to compare the average delivery time for two pizza companies. Well, this week I went pizza crazy and ordered 10 pizzas from each of La Val's, Sliver and Pieology, and recorded how long it took them to deliver to my office in Evans.

Exercise: ANOVA

The recorded delivery times are as follows:

La Val's    Sliver    Pieology
32          27        52
48          32        40
51          37        41
47          21        36
41          25        32
35          28        38
37          36        37
41          38        42
43          29        39
38          30        30

Test the hypothesis that all three pizza places have the same mean delivery time.
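Try filling in the ANOVA table by hand first; a Python sketch using scipy.stats.f_oneway to check your answer:

```python
from scipy import stats

# Delivery times read off the table above, one list per pizza place.
lavals   = [32, 48, 51, 47, 41, 35, 37, 41, 43, 38]
sliver   = [27, 32, 37, 21, 25, 28, 36, 38, 29, 30]
pieology = [52, 40, 41, 36, 32, 38, 37, 42, 39, 30]

# One-way ANOVA: H0 is that all three mean delivery times are equal.
F, p = stats.f_oneway(lavals, sliver, pieology)
print(F, p)  # reject H0 at the 5% level if p < 0.05
```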

A non-parametric version of ANOVA: the Kruskal-Wallis Test

Kruskal-Wallis

The Kruskal-Wallis test is a generalization of the Mann-Whitney test to multiple groups, and corresponds to a non-parametric version of one-way ANOVA. It tests whether all of the groups come from the same distribution:

$H_0: F_1 = F_2 = \ldots = F_I$

- The observations are assumed to be independent
- No distributional form, such as normal, is assumed (the test is non-parametric)

Kruskal-Wallis

The observations are pooled together and ranked. Let $R_{ij}$ be the rank of $Y_{ij}$ in the combined sample. The average rank in the ith group is then

$\bar{R}_i = \frac{1}{J} \sum_{j=1}^{J} R_{ij}$

and the global mean of the ranks is

$\bar{R} = \frac{1}{IJ} \sum_{i=1}^{I} \sum_{j=1}^{J} R_{ij} = \frac{IJ + 1}{2}$

Kruskal-Wallis

As in ANOVA, define the between sum of squares (a measure of the variance of the group mean ranks) as

$SS_B = \sum_{i=1}^{I} J (\bar{R}_i - \bar{R})^2$

We can use $SS_B$ to test the null hypothesis that the groups have the same distribution; a larger $SS_B$ provides stronger evidence against $H_0$. Under $H_0$, a transformation of $SS_B$ is approximately chi-squared:

$K = \frac{12}{IJ(IJ + 1)} SS_B \;\approx\; \chi^2_{I-1}$

Note that K can also be expressed as

$K = \frac{12}{IJ(IJ + 1)} \left( \sum_{i=1}^{I} J \bar{R}_i^2 \right) - 3(IJ + 1)$

Kruskal-Wallis

Our test statistic is

$K = \frac{12}{IJ(IJ + 1)} SS_B \;\approx\; \chi^2_{I-1}$ under $H_0$

and our p-value can thus be calculated as

$\text{p-value} = P\left( \chi^2_{I-1} \geq K \right)$

where this is a "greater than" statement since a larger value of K provides more evidence against $H_0$, so larger values are more extreme.
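A minimal Python sketch of the test for an I x J array of observations (rows are groups; the helper name kruskal_wallis is illustrative):

```python
import numpy as np
from scipy import stats

def kruskal_wallis(Y):
    """Kruskal-Wallis statistic for an I x J array Y (rows = groups)."""
    I, J = Y.shape
    N = I * J
    # Rank all N observations in the pooled sample (ties get average ranks).
    R = stats.rankdata(Y.ravel()).reshape(I, J)
    R_bar = R.mean(axis=1)               # mean rank within each group

    # K = 12 / (N(N+1)) * SS_B, with SS_B = sum_i J * (R_bar_i - (N+1)/2)^2
    K = 12 / (N * (N + 1)) * J * np.sum((R_bar - (N + 1) / 2) ** 2)
    p = stats.chi2.sf(K, I - 1)          # P(chi2_{I-1} >= K)
    return K, p
```

scipy.stats.kruskal(*Y) provides a cross-check; note that scipy additionally applies a tie correction, so the two can differ slightly when ties are present.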

Exercise

Exercise: Kruskal-Wallis (non-parametric ANOVA)

Recall that the delivery times did not particularly seem to satisfy the normality assumption required by ANOVA. Instead, conduct a non-parametric test to investigate whether the delivery times for La Val's, Pieology and Sliver come from the same distribution.
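A sketch of the corresponding check in Python, using scipy.stats.kruskal on the same delivery-time data:

```python
from scipy import stats

# Same delivery-time data as in the ANOVA exercise.
lavals   = [32, 48, 51, 47, 41, 35, 37, 41, 43, 38]
sliver   = [27, 32, 37, 21, 25, 28, 36, 38, 29, 30]
pieology = [52, 40, 41, 36, 32, 38, 37, 42, 39, 30]

# Kruskal-Wallis: H0 is that the three distributions are identical.
K, p = stats.kruskal(lavals, sliver, pieology)
print(K, p)
```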