(Foundation of Medical Statistics)


4. ANOVA and the multiple comparisons

Math and Stat in Medical Sciences, Basic Statistics, 26/10/2018

Analysis of variance (ANOVA)

Consider more than two groups (populations) $\Omega_1, \Omega_2, \dots, \Omega_m$, $m \ge 3$, whose means are $\mu_1, \mu_2, \dots, \mu_m$. Then:

1. Null hypothesis ($H_0$): $\mu_1 = \mu_2 = \dots = \mu_m$.
2. Alternative hypothesis ($H_1$): $\mu_i \ne \mu_j$ for some $i$ and $j$.

This test is called the (one-way) analysis of variance, ANOVA.

Analysis of variance (ANOVA)

When there are two factors, the test is called the two-way analysis of variance; when there are three or more factors, it is called the multi-way analysis of variance. These are not treated here.

Assumption

Suppose that the factor X is divided into m levels $X_1, \dots, X_m$.

- Each $X_i$ follows a normal distribution.
- The variances are equal.

Remark
(1) When m = 2, the one-way ANOVA is equivalent to the t test.
(2) Equality of variances can be verified with the Bartlett test or the Levene test.

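Table of data

Table: Ex

| Level | Group size | Data | Total | Mean | Variance |
|-------|------------|------|-------|------|----------|
| $X_1$ | $n_1$ | $x_{11}, x_{12}, \dots, x_{1n_1}$ | $T_1$ | $\bar{x}_1$ | $V_1$ |
| $X_2$ | $n_2$ | $x_{21}, x_{22}, \dots, x_{2n_2}$ | $T_2$ | $\bar{x}_2$ | $V_2$ |
| ... | ... | ... | ... | ... | ... |
| $X_m$ | $n_m$ | $x_{m1}, x_{m2}, \dots, x_{mn_m}$ | $T_m$ | $\bar{x}_m$ | $V_m$ |
| Total | $N$ | | $T$ | | |

The mean of all data is $\bar{x} = T / N$.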

Sum of squared deviation between groups

Let $S_T$ be the sum of squared deviations with respect to the total mean $\bar{x}$:

$$S_T = \sum_{i,j} (x_{ij} - \bar{x})^2$$

The sum of squared deviations between groups, $S_A$, is defined by

$$S_A = \sum_{i=1}^{m} n_i (\bar{x}_i - \bar{x})^2$$

Sum of squared deviation within a group

A larger $S_A$ indicates larger differences between the group means.

The sum of squared deviations within groups (the sum of squares of errors), $S_E$, is defined by

$$S_E = \sum_{i=1}^{m} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 = (n_1 - 1)V_1 + \dots + (n_m - 1)V_m$$

Degrees of freedom

Theorem: $S_T = S_A + S_E$.

Variances and the degrees of freedom

- The degrees of freedom of $S_A$ is $\phi_A = m - 1$.
- The degrees of freedom of $S_E$ is $\phi_E = N - m$.
- The degrees of freedom of $S_T$ is $\phi_T = N - 1$.

Variance: $V_A = S_A / \phi_A$, $V_E = S_E / \phi_E$ (variance of errors).
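The theorem can be checked by a short calculation. The following is a sketch that uses only the definitions of $S_T$, $S_A$ and $S_E$ given on the preceding slides; it is added for illustration and is not part of the original slides.

```latex
\begin{align*}
S_T &= \sum_{i=1}^{m}\sum_{j=1}^{n_i} (x_{ij}-\bar{x})^2
     = \sum_{i=1}^{m}\sum_{j=1}^{n_i}
       \bigl( (x_{ij}-\bar{x}_i) + (\bar{x}_i-\bar{x}) \bigr)^2 \\
    &= \sum_{i=1}^{m}\sum_{j=1}^{n_i} (x_{ij}-\bar{x}_i)^2
     + 2\sum_{i=1}^{m} (\bar{x}_i-\bar{x}) \sum_{j=1}^{n_i} (x_{ij}-\bar{x}_i)
     + \sum_{i=1}^{m} n_i (\bar{x}_i-\bar{x})^2 \\
    &= S_E + 0 + S_A .
\end{align*}
```

The cross term vanishes because $\sum_{j=1}^{n_i} (x_{ij}-\bar{x}_i) = 0$ for every group $i$.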

Ratio of variances and F-distribution

Let $F_0 = V_A / V_E$.

Fact: Under $H_0$, $F_0$ follows the F-distribution with degrees of freedom $(\phi_A, \phi_E)$.

[Figure: plot of an F-distribution density]
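To make these quantities concrete, here is a minimal Python sketch that computes $S_A$, $S_E$, $S_T$, the degrees of freedom, $V_A$, $V_E$ and $F_0$ from raw data. The function name `one_way_anova` and the data are hypothetical and chosen only for illustration. The slides themselves use EZR, not Python. Each group needs at least two observations so that its variance is defined.

```python
from statistics import mean, variance


def one_way_anova(groups):
    """Return the one-way ANOVA quantities for a list of samples (one list per level)."""
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)        # N, the total number of observations
    m = len(groups)                  # number of levels
    grand_mean = mean(all_values)    # mean of all data, T / N

    # Between-group sum of squares: S_A = sum_i n_i (xbar_i - xbar)^2
    s_a = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: S_E = sum_i (n_i - 1) V_i
    s_e = sum((len(g) - 1) * variance(g) for g in groups)
    # Total sum of squares: S_T = sum_{i,j} (x_ij - xbar)^2
    s_t = sum((x - grand_mean) ** 2 for x in all_values)

    phi_a, phi_e = m - 1, n_total - m
    v_a, v_e = s_a / phi_a, s_e / phi_e
    return {
        "S_A": s_a, "S_E": s_e, "S_T": s_t,
        "phi_A": phi_a, "phi_E": phi_e,
        "V_A": v_a, "V_E": v_e, "F_0": v_a / v_e,
    }


# Hypothetical data: three levels with four observations each.
groups = [
    [4.0, 5.0, 6.0, 5.0],
    [6.0, 7.0, 8.0, 7.0],
    [9.0, 8.0, 10.0, 9.0],
]
result = one_way_anova(groups)
print(result)
```

For this made-up data, $S_A = 32$, $S_E = 6$ and $S_T = 38$, so the decomposition $S_T = S_A + S_E$ holds, and $F_0 = 16 / (6/9) = 24$ with degrees of freedom $(2, 9)$.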

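Decision

Let $F^{\phi_A}_{\phi_E}(\alpha)$ denote the upper $\alpha$ point of the F-distribution with degrees of freedom $(\phi_A, \phi_E)$.

- When $F_0 \ge F^{\phi_A}_{\phi_E}(\alpha)$, i.e. p-value $\le \alpha$: ($H_0$): $\mu_1 = \dots = \mu_m$ is rejected. Hence $\mu_i \ne \mu_j$ for some $i, j$.
- When $F_0 < F^{\phi_A}_{\phi_E}(\alpha)$, i.e. p-value $> \alpha$: ($H_0$): $\mu_1 = \dots = \mu_m$ cannot be rejected.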

Using EZR

Read ANOVA.csv into EZR.

1. Show the boxplots of groups 1 to 4.
2. Verify the equality of variances by the Bartlett test.
3. Perform the one-way ANOVA.

Remark 1

When normality or equality of variances is not satisfied, use the Kruskal-Wallis test, a nonparametric version of the analysis of variance.

R: kruskal.test(list(data1, data2, data3, ...))

Remark 2

In a two-way ANOVA with replication, the effects of the two factors X and Y and of their interaction X × Y can be tested.

Multiple comparison problem

Suppose the above test has rejected the null hypothesis. Then at least one population mean differs from the others.

Question: Which two population means differ?

ANOVA does not answer this question.

Misuse of t test

To answer that question, it may seem necessary to repeat the t test for all pairs. But this should not be done. Why?

Because...

If there are 4 populations, we need to do ${}_4C_2 = 6$ tests. Assuming that the reliability of a single t test is 95% and treating the tests as independent, the total reliability of the 6 t tests is

$$0.95^6 \times 100\% = 73.5\%$$

Thus the total reliability is lower than 95%.
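The arithmetic above can be reproduced for any number of groups. This is a small Python sketch; the function name `total_reliability` is hypothetical, and the calculation treats the pairwise tests as independent, which is a simplifying assumption.

```python
from math import comb


def total_reliability(num_groups, single_reliability=0.95):
    """Reliability of all pairwise tests together, treating the tests as independent."""
    num_tests = comb(num_groups, 2)  # number of pairs, e.g. 4C2 = 6
    return num_tests, single_reliability ** num_tests


num_tests, reliability = total_reliability(4)
print(num_tests, round(reliability * 100, 1))  # prints: 6 73.5
```

With more groups the loss is faster: 5 groups already require 10 pairwise tests.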

Multiple comparisons: Bonferroni method

Various multiple comparison methods have been proposed to avoid such difficulties.

Bonferroni correction: Take a smaller significance level for each test, in order to guarantee a reliability of 95% even when the t test is repeated.

Since $(1 - \alpha)^n \ge 1 - n\alpha$, if we take the significance level of each test to be $\alpha/n$, the total reliability after n t tests is $(1 - \alpha/n)^n \ge 1 - \alpha$, so the total significance level is at most $\alpha$.

Ex: If n = 6 and α = 0.05, we perform the 6 t tests at the significance level $0.05 / 6 \approx 0.0083$.
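As an illustration of the Bonferroni correction, the following Python sketch computes the per-test significance level and flags which p-values fall below it. The function name and the p-values are hypothetical and were not produced by any real analysis.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag each p-value as significant at the Bonferroni-corrected level alpha / n."""
    threshold = alpha / len(p_values)
    return threshold, [p < threshold for p in p_values]


# Hypothetical p-values from n = 6 pairwise t tests.
p_values = [0.001, 0.012, 0.030, 0.004, 0.200, 0.049]
threshold, flags = bonferroni_significant(p_values)
print(round(threshold, 4))  # prints: 0.0083
print(flags)

# Total significance level if the 6 tests are treated as independent;
# it stays at or below alpha = 0.05.
familywise_level = 1 - (1 - threshold) ** len(p_values)
print(familywise_level <= 0.05)  # prints: True
```

Only the p-values 0.001 and 0.004 are below 0.0083 here, so only those two comparisons would be declared significant.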

Multiple comparisons: Holm method

Bonferroni's method is conservative, i.e., the larger n is, the lower the power, since α/n becomes very small.

Holm's method improves on the Bonferroni method.

Multiple comparisons: Holm method

Perform the n t tests and arrange the resulting p-values (which the software will output) in ascending order $p_1 < p_2 < \dots < p_n$. Let α be the overall significance level.

Procedure

1. If $p_1 < \alpha/n$, $p_1$ is significant.
2. If $p_2 < \alpha/(n-1)$, $p_2$ is significant.
3. If $p_3 < \alpha/(n-2)$, $p_3$ is significant, and so on.
4. If $p_k \ge \alpha/(n-k+1)$ for the first time, $p_k, \dots, p_n$ are not significant.
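The step-down procedure above can be sketched in Python as follows. The function name `holm_significant` and the p-values are hypothetical. The code sorts the p-values itself and stops at the first one that fails its threshold, as in step 4.

```python
def holm_significant(p_values, alpha=0.05):
    """Apply Holm's step-down procedure; return one flag per p-value in input order."""
    n = len(p_values)
    # Indices of the p-values in ascending order of p.
    order = sorted(range(n), key=lambda i: p_values[i])
    significant = [False] * n
    for rank, idx in enumerate(order):  # rank = k - 1 for the k-th smallest p-value
        if p_values[idx] < alpha / (n - rank):  # threshold alpha / (n - k + 1)
            significant[idx] = True
        else:
            break  # p_k and all larger p-values are not significant
    return significant


# Hypothetical p-values from n = 6 pairwise t tests.
p_values = [0.001, 0.012, 0.030, 0.004, 0.200, 0.049]
holm_flags = holm_significant(p_values)
print(holm_flags)

# For comparison: every test at the fixed Bonferroni level alpha / n.
bonferroni_flags = [p < 0.05 / len(p_values) for p in p_values]
print(bonferroni_flags)
```

For these made-up values, Holm's method declares 0.001, 0.004 and 0.012 significant, while the fixed level 0.05/6 ≈ 0.0083 accepts only 0.001 and 0.004. This illustrates the gain in power.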

Multiple comparisons: Tukey-Kramer method

A method to compare all pairs of m groups in one test.

Assumption

- Each group follows a normal distribution.
- The variances are equal.

Using EZR

Read ANOVA.csv into EZR. Perform the Tukey-Kramer method.

Using EZR

Result: The simultaneous confidence intervals are displayed. They show a difference between group 1 and group 3, and also between group 1 and group 4.

Multiple comparisons: Dunnett method

A method for comparing the control group $X_1$ with each of the other groups $X_2, \dots, X_m$ (there are $m - 1$ comparisons).

Multiple comparisons: nonparametric methods

When normality is not satisfied, use nonparametric methods.

Assumption

- The distributions of all groups are assumed to have the same shape.
- The sample size of each group is large (10 or more in each group).

Nonparametric methods

- All-pair comparisons: Steel-Dwass method
- Pair comparisons between the control group and the other groups: Steel method