Wilcoxon Test and Calculating Sample Sizes


Wilcoxon Test and Calculating Sample Sizes Dan Spencer UC Santa Cruz Dan Spencer (UC Santa Cruz) Wilcoxon Test and Calculating Sample Sizes 1 / 33

Differences in the Means of Two Independent Groups
When using the $t$, $t'$, or $t_p$ test statistics, we assume that the responses in both groups are normally distributed.
What if they are not normally distributed?
If $n_1$ and $n_2$ are large enough, it is still okay to use the t-distribution.
However, if $n_1$ and $n_2$ are small, this is a problem.
This non-normality sometimes occurs in animal studies.

Wilcoxon Rank-Sum Test
Sometimes called the Mann-Whitney-Wilcoxon test, the Mann-Whitney U test, or the Wilcoxon-Mann-Whitney test.
Tests whether the location of the responses differs between the groups.
Interpreted as a test for a difference in medians.
An example of a nonparametric test, as it does not test hypotheses about the parameters of an assumed distribution.

Wilcoxon Rank-Sum: Assumptions
Responses are either continuous or ordinal.
Observations from both groups are independent.
The shape and spread of the response are the same in the two populations, but not necessarily normal.

t-test Group Density Assumption
[Figure: Density Assumption for t Tests. Density curves for Groups 1 and 2; x-axis: Values, y-axis: Density.]

Wilcoxon Group Density Assumption
[Figure: Wilcoxon Density Assumption. Density curves for Group1 and Group2; x-axis: Values, y-axis: Density.]

Wilcoxon Rank-Sum: Hypotheses
Null Hypothesis ($H_0$): The probability that a randomly selected response from the first population exceeds a randomly selected response from the second population is equal to 0.5.
A slightly stronger hypothesis is that the distributions are equal in terms of location. This hypothesis implies the above null hypothesis.
Alternative Hypothesis ($H_1$): The probability that a randomly selected response from the first population exceeds a randomly selected response from the second population is, depending on the research question, not equal to 0.5, greater than 0.5, or less than 0.5.

Case Study: Chick Weights
Newly hatched chicks were separated into two groups: a sunflower seed diet and a horsebean seed diet.
After six weeks, the weights of the chicks were measured in grams.

Case Study: Chick Weights
[Figure: Boxplots of Chick Weights by Feed Type. x-axis: Feed Type (horsebean, sunflower); y-axis: Weight (grams).]

Case Study: Chick Weights
Both distributions look somewhat skewed to the right, because each has either a long tail or an outlier (shown as a solitary point).
Sample sizes are small (8 and 10, respectively), so the t-based test statistics are not appropriate here.
Hypotheses:
$H_0$: The distribution of chick weights in the two groups is equal.
$H_1$: The distribution of chick weights is lower for the horsebean group.

Wilcoxon Rank-Sum Test Statistic
Combine the groups, and rank all responses from smallest to largest.
The ranks run from 1 to $n$, where $n = n_1 + n_2$.
If there are ties, the ranks should be averaged. For example, the values 7, 5, 6, 6 would have ranks 4, 1, 2.5, 2.5.
The test statistic $T$ is the sum of the ranks for the group with the smallest sample size.
If $n_1 = n_2$, T falls between the two rank sums.
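The tie-averaging rule can be sketched in a few lines of Python. This is only an illustration of the ranking step described on the slide; the function name average_ranks is hypothetical and is not part of JMP or the original slides.

```python
def average_ranks(values):
    """Rank values from smallest to largest, averaging the ranks of ties."""
    # Indices of the values in ascending order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Find the run of tied values starting at sorted position `pos`.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Sorted positions pos..end occupy ranks pos+1..end+1; assign their mean.
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


print(average_ranks([7, 5, 6, 6]))  # [4.0, 1.0, 2.5, 2.5]
```

The two 6s occupy ranks 2 and 3, so each receives the average, 2.5, matching the example on the slide.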

Rank Sums

Horsebean
Weight    Rank
266.84    14
264.07     6
263.82     4
263.47     2
264.33     8
264.25     7
263.22     1
263.92     5
Sum of ranks = 47

Sunflower
Weight    Rank
267.75    15
266.02    12
266.29    13
264.89    10
269.24    17
271.63    18
264.74     9
268.36    16
264.99    11
263.69     3
Sum of ranks = 124
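As a check on the rank sums, here is a minimal Python sketch that pools the chick weights listed on this slide, ranks them, and adds up the ranks per group. The variable names are illustrative; because no two weights are tied, the rank is simply the position in the sorted pooled list.

```python
horsebean = [266.84, 264.07, 263.82, 263.47, 264.33, 264.25, 263.22, 263.92]
sunflower = [267.75, 266.02, 266.29, 264.89, 269.24,
             271.63, 264.74, 268.36, 264.99, 263.69]

# Pool the two groups, remembering which group each weight came from.
pooled = sorted(
    [(w, "horsebean") for w in horsebean] + [(w, "sunflower") for w in sunflower]
)

# No weights are tied here, so each rank is the 1-based sorted position.
rank_sums = {"horsebean": 0, "sunflower": 0}
for rank, (_, group) in enumerate(pooled, start=1):
    rank_sums[group] += rank

# T is the rank sum of the smaller group (horsebean, with 8 chicks).
T = rank_sums["horsebean"]
print(rank_sums, T)  # {'horsebean': 47, 'sunflower': 124} 47
```

The two sums add to 171, which equals 18(19)/2, the total of the ranks 1 through 18.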

Case Study: Chick Weights
$T = 47$
Wilcoxon Rank-Sum rejection region values can be found in a table at https://metxstats.soe.ucsc.edu/node/5
Since the research hypothesis is that the horsebean group has a lower-shifted distribution than the sunflower group, reject $H_0$ if $T$ is less than the value in the table for $n_1 = 8$ and $n_2 = 10$.
$T$ is larger than the critical value for $\alpha = 0.025$, $0.05$, and $0.10$.
Fail to reject $H_0$ and conclude that the distributions are not significantly shifted from one another.

Normal Approximation
When both treatment groups are larger than 10, the normal distribution approximates the distribution of the Wilcoxon Rank-Sum test statistic rather well.
\[ z = \frac{T - \mu_T}{\sigma_T} \]
\[ \mu_T = \frac{n_1 (n_1 + n_2 + 1)}{2} \]
\[ \sigma_T = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} \]

Normal Approximation: Our Example
\[ \mu_T = \frac{n_1 (n_1 + n_2 + 1)}{2} = \frac{8(8 + 10 + 1)}{2} = 76 \]
\[ \sigma_T = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{(8)(10)(8 + 10 + 1)}{12}} = 11.25463 \]

Normal Approximation: Our Example
\[ z = \frac{47 - 76}{11.25463} = -2.576717 \]
This z-score certainly does fall in the rejection region.
P-value $\approx 0.00499$
This contradicts the conclusion reached from the table of critical values!
Use this approximation only when samples are large enough!
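The arithmetic of the normal approximation can be reproduced with Python's standard library. This is a sketch of the formulas above applied to T = 47, n1 = 8, n2 = 10; the function name rank_sum_z is hypothetical, and the p-value is the lower-tail probability because the research hypothesis is that the horsebean weights are shifted lower.

```python
from math import sqrt
from statistics import NormalDist


def rank_sum_z(T, n1, n2):
    """z-score of the rank sum T under the normal approximation."""
    mu_T = n1 * (n1 + n2 + 1) / 2
    sigma_T = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (T - mu_T) / sigma_T


z = rank_sum_z(47, 8, 10)
# Lower-tail probability of the standard normal at z (one-sided p-value).
p_lower = NormalDist().cdf(z)
print(round(z, 4), round(p_lower, 5))
```

The printed values should agree with the slide: z is about -2.5767 and the one-sided p-value is about 0.00499.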

Wilcoxon Rank-Sum Test in JMP
Analyze > Fit Y by X
Drag your variables to the appropriate Response and Factor boxes and click OK.
Click Nonparametric > Wilcoxon Test.

Wilcoxon Rank-Sum Test in JMP
JMP calls the test statistic S instead of T.
Only the two-sided p-value for the normal approximation is given.
For the one-sided p-value, divide by 2.

Sample Size
Researchers aim to present evidence to support their hypotheses about how the world works.
Most of the time, the hypothesis is that treatments are significantly different from one another.
Usually, the aim is to reject $H_0$.
Ideally, sample sizes would be as big as possible.
However, time and money often limit sample sizes.

Power
We want to minimize the chance of failing to reject a false $H_0$. This chance is often represented by $\beta$.
An experiment's power is the chance that a false $H_0$ is correctly rejected: $1 - \beta$.
When the chance of incorrectly rejecting $H_0$ is fixed at some value $\alpha$, the power of a test can be estimated for different sample sizes.

Power: t Distributions
When $H_0$ is true, the test statistic is centered around 0.
When $H_1$ is true, the test statistic is proportionally centered at
\[ \frac{\mu_1 - \mu_2 - D_0}{\sigma \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \]
For simplicity, the quantity $\mu_1 - \mu_2 - D_0$ is represented as $\Delta$.

Calculating Power
Consider an experiment where $n_1 = n_2 = 5$, $\sigma = 10$, and $\Delta = 25$.
$\alpha$ is fixed at 0.05 for the hypotheses
$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 \ne 0$
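A rough feel for the power of this design can be had from a normal approximation. The sketch below is not the t-based calculation illustrated on the slides: it assumes a two-sided test, treats σ as known, and replaces the t distributions with standard normal ones, so it tends to be optimistic for samples this small. The function name approx_power_two_sided is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sided(n, sigma, delta, alpha=0.05):
    """Approximate power of a two-sided, equal-n two-sample test (normal approx.)."""
    std_normal = NormalDist()
    # Center of the test statistic under H1: delta / (sigma * sqrt(1/n + 1/n)).
    shift = delta / (sigma * sqrt(2 / n))
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Probability of landing in either rejection tail when H1 is true.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


power = approx_power_two_sided(n=5, sigma=10, delta=25)
print(round(power, 3))
```

Calling the same function with a smaller sigma or a larger n gives a higher value, since either change moves the center of the test statistic under the alternative further from 0.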

Power Illustrated
[Figure: $\beta$, $\alpha$, and $t^*$. Densities of the test statistic under $H_0$ and $H_1$, with the critical value $t^*$ marked; x-axis: t, y-axis: Density.]

Changing $\sigma$
[Figure: two panels, $\sigma = 10$ and $\sigma = 8$. Each shows the densities of the test statistic under $H_0$ and $H_1$, with the critical value $t^*$ marked; x-axis: t, y-axis: Density.]

Changing $n$
[Figure: two panels, $n_1 = n_2 = 5$ and $n_1 = n_2 = 10$. Each shows the densities of the test statistic under $H_0$ and $H_1$, with the critical value $t^*$ marked; x-axis: t, y-axis: Density.]

Maximizing Power
Increase $n_1$ and $n_2$ and decrease experimental error as much as possible.
We have previously discussed reducing experimental error by standardizing measurement practices.
How do we choose the smallest possible sample size while achieving a fixed $\alpha$ and $\beta$?

Calculating n
Fix or estimate:
$\alpha$: chance of incorrectly rejecting $H_0$
$\beta$: chance of incorrectly failing to reject $H_0$
$\sigma$: estimated population standard deviation
$\Delta$: the size of difference that is desirable to detect
One-sided tests for $\mu_1 - \mu_2$:
\[ n_1 = n_2 = \frac{2\sigma^2 (z_\alpha + z_\beta)^2}{\Delta^2} \]
Two-sided tests for $\mu_1 - \mu_2$:
\[ n_1 = n_2 = \frac{2\sigma^2 (z_{\alpha/2} + z_\beta)^2}{\Delta^2} \]

Calculating n
If $\mu_1 - \mu_2 - D_0 \ge \Delta$, the type II error probability is $\le \beta$.
Typically, $\beta$ is chosen to be 0.2.
$\sigma$ is estimated as $s$ calculated from previous experiments.
$\Delta$ is set as the minimum difference that is desirable to detect.
Example: a treatment is only preferable if it increases CD4 cell count by 100 or more, so $\Delta$ is set to 100.

Calculating n: Tooth Growth
In a previous lesson, we examined the effects of the source of vitamin C on tooth growth in guinea pigs.
Let's say we want to conduct another study, but this time, we want to be able to detect a true difference of 3 millimeters in tooth length.
We'll estimate that $\sigma = 7.5$, which was our estimate $s_p$.
Fix $\alpha = 0.05$.
Fix $\beta = 0.20$.
We'll assume a two-sided test.

Calculating n: Tooth Growth
\[ n_1 = n_2 = \frac{2(7.5^2)(z_{0.05/2} + z_{0.20})^2}{3^2} = \frac{2(7.5^2)(1.959964 + 0.8416212)^2}{3^2} = 98.111 \]
In order to have power $= 1 - 0.2 = 0.8$, the minimum sample size for each group is 99 guinea pigs.
In the case where a non-integer sample size is found, round up to the nearest whole number.
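The same calculation can be sketched in Python using only the standard library. The function name sample_size_per_group and its parameter names are hypothetical; the formula is the equal-group-size formula from the slides, with the normal quantiles taken from statistics.NormalDist.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(sigma, delta, alpha=0.05, beta=0.20, two_sided=True):
    """Unrounded per-group n from 2 * sigma^2 * (z_a + z_b)^2 / delta^2."""
    std_normal = NormalDist()
    # Two-sided tests split alpha across both tails.
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)
    z_beta = std_normal.inv_cdf(1 - beta)
    return 2 * sigma**2 * (z_alpha + z_beta) ** 2 / delta**2


n_raw = sample_size_per_group(sigma=7.5, delta=3)
# Round up, since a fraction of a guinea pig cannot be sampled.
print(round(n_raw, 3), ceil(n_raw))
```

With the tooth growth inputs this should reproduce the 98.111 above, which rounds up to 99 per group. Passing two_sided=False switches to the one-sided formula and gives a smaller n.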

Calculating Sample Size in JMP
DOE > Sample Size and Power > Two Sample Means
Enter:
$\alpha$
$\sigma$ (Std Dev)
Difference to detect ($\Delta$)
Power ($1 - \beta$)
Click Continue.
Note that small differences may exist due to rounding errors.

JMP Output
[Figure: screenshot of the JMP output.]

Notes on JMP
Note that this tool can also be used to evaluate the power of a proposed study.
A plot of power versus sample size can also be useful in determining sample size.