Statistics for EES and MEME 5. Rank-sum tests


Dirk Metzler
June 4, 2018

Wilcoxon's rank-sum test is also called the Mann-Whitney U test.

References

[1] Wilcoxon, F. (1945). Individual comparisons by ranking methods. Biometrics Bulletin 1:80-83.
[2] Mann, H. B., Whitney, D. R. (1947). On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics 18:50-60.

Contents

1 Motivation
2 Wilcoxon-test for independent samples
3 Application to Hipparion tooth measurements
4 Wilcoxon signed-rank test

1 Motivation

For (more or less) bell-shaped, symmetrically distributed observations, or for sufficiently large sample sizes, we can apply the t-test for the null hypothesis $\mu_1 = \mu_2$: the t-statistic is then approximately Student-t-distributed.

This approximation can fail, especially for very asymmetric, long-tailed distributions. Assume that we have to compare these distributions:

[Figure: two histograms, frequency of x and frequency of y, each on an axis from 0 to 150]

Examples
- Waiting times
- Distances of dispersion
- Frequencies of cell types

Wanted: a test that does not assume a particular distribution. Such tests are called non-parametric.

2 Wilcoxon-test for independent samples

Observations: two samples
X : $x_1, x_2, \dots, x_m$
Y : $y_1, y_2, \dots, y_n$

Null hypothesis: X and Y come from the same population.
Alternative: X is "typically larger" than Y, or Y is "typically larger" than X.

Idea
- Sort all observations by size.
- Determine the ranks of the m X-values among all m + n observations.
- If the null hypothesis is true, then the m X-ranks are randomly chosen from {1, 2, ..., m + n}.
- Compute the sum of the X-ranks and check whether it is untypically small or large compared to a sum of random ranks.

(Frank Wilcoxon, 1892-1965)

Wilcoxon's rank-sum statistic

$W = (\text{sum of the X-ranks}) - (1 + 2 + \dots + m)$

is called Wilcoxon's rank-sum statistic.

Note: we could also use the sum of the Y-ranks, because

$\text{sum of the X-ranks} + \text{sum of the Y-ranks} = \text{sum of all ranks} = 1 + 2 + \dots + (m + n) = \frac{(m+n)(m+n+1)}{2}$

A small example

Observations:
X : 1.5, 5.6, 35.2
Y : 7.9, 38.1, 41.0, 56.7, 112.1, 197.4, 381.8

Pool observations and sort (X-values marked with *):
1.5*, 5.6*, 7.9, 35.2*, 38.1, 41.0, 56.7, 112.1, 197.4, 381.8

Determine ranks (X-ranks marked with *):
1*, 2*, 3, 4*, 5, 6, 7, 8, 9, 10

Rank-sum statistic: W = 1 + 2 + 4 - (1 + 2 + 3) = 1

Interpretation of W

X-population smaller => W small:
X-ranks 1, 2, 3:   W = 0
X-ranks 1, 2, 4:   W = 1
X-ranks 1, 2, 5:   W = 2
X-ranks 1, 3, 4:   W = 2

X-population larger => W large:
X-ranks 8, 9, 10:  W = 21
X-ranks 7, 9, 10:  W = 20
X-ranks 6, 9, 10:  W = 19
X-ranks 7, 8, 10:  W = 19

Significance

Null hypothesis: the X-sample and the Y-sample were taken from the same distribution.

The 3 ranks of the X-sample could just as well have been any 3 of the ranks 1, 2, ..., 10. There are

$\frac{10 \cdot 9 \cdot 8}{3 \cdot 2 \cdot 1} = 120$

possibilities. In general there are

$\frac{(m+n)(m+n-1)\cdots(n+1)}{m(m-1)\cdots 1} = \frac{(m+n)!}{n!\,m!} = \binom{m+n}{m}$

possibilities.

Distribution of the Wilcoxon statistic (m = 3, n = 7)

[Figure: number of possibilities (y-axis, 0 to 10) for each value of W (x-axis)]

Under the null hypothesis all rank configurations are equally likely, thus

$P(W = w) = \frac{\text{number of possibilities with rank-sum statistic } w}{120}$
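The counting argument above can be checked by brute force. The following is a minimal sketch in R, assuming the example data X = (1.5, 5.6, 35.2) and Y = (7.9, ..., 381.8) from this section. The variable names (`ranks_x`, `w_obs`, `null_w`, `null_dist`, `p_two_sided`) are illustrative and not part of the lecture. It enumerates all rank configurations with `combn` and tabulates W.

```r
# Example data from the lecture
x <- c(1.5, 5.6, 35.2)
y <- c(7.9, 38.1, 41.0, 56.7, 112.1, 197.4, 381.8)
m <- length(x)
n <- length(y)

# Ranks of the X-values among all m + n pooled observations
ranks_x <- rank(c(x, y))[seq_len(m)]

# Rank-sum statistic: sum of the X-ranks minus its smallest possible value
w_obs <- sum(ranks_x) - sum(seq_len(m))

# Enumerate every possible set of m ranks out of m + n and compute W for each
null_w <- combn(m + n, m, FUN = function(r) sum(r) - sum(seq_len(m)))

# Null distribution: P(W = w) = (number of configurations with value w) / choose(m + n, m)
null_dist <- table(null_w) / length(null_w)

# Two-sided tail probability; the null distribution is symmetric around m * n / 2
p_two_sided <- mean(null_w <= w_obs | null_w >= m * n - w_obs)

print(w_obs)
print(length(null_w))
print(null_dist)
print(p_two_sided)
```

Full enumeration is only practical for small samples, since the number of configurations grows as choose(m + n, m).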

We see in our example:

1.5, 5.6, 7.9, 35.2, 38.1, 41.0, 56.7, 112.1, 197.4, 381.8

W = 1

$P(W \le 1) + P(W \ge 20) = P(W = 0) + P(W = 1) + P(W = 20) + P(W = 21) = \frac{1+1+1+1}{120} \approx 0.033$

Distribution of the Wilcoxon statistic (m = 3, n = 7)

[Figure: probability (y-axis, 0.00 to 0.08) for each value of W (x-axis)]

For our example (W = 1):

p-value = P(such an extreme W) = 4/120 ≈ 0.033

We reject the null hypothesis that the distributions of X and Y are equal at the 5% level.

Wilcoxon test in R with wilcox.test:

    > x
    [1]  1.5  5.6 35.2
    > y
    [1]   7.9  38.1  41.0  56.7 112.1 197.4 381.8
    > wilcox.test(x,y)

            Wilcoxon rank sum test

    data:  x and y
    W = 1, p-value = 0.03333
    alternative hypothesis: true location shift is not equal to 0

Let's compare to the t-test:

    > x
    [1]  1.5  5.6 35.2
    > y
    [1]   7.9  38.1  41.0  56.7 112.1 197.4 381.8
    > t.test(x,y)

            Welch Two Sample t-test

    data:  x and y
    t = -2.0662, df = 6.518, p-value = 0.08061
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -227.39182   17.02039
    sample estimates:
    mean of x mean of y
      14.1000  119.2857
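The t-statistic and the non-integer degrees of freedom in the Welch output above can be checked by hand. This is a minimal sketch, assuming the usual Welch standard error and the Welch-Satterthwaite approximation for the degrees of freedom. The names `se2_x`, `se2_y`, `t_stat`, `df_welch` and `p_value` are illustrative only.

```r
# Example data from the lecture
x <- c(1.5, 5.6, 35.2)
y <- c(7.9, 38.1, 41.0, 56.7, 112.1, 197.4, 381.8)

# Squared standard errors of the two sample means
se2_x <- var(x) / length(x)
se2_y <- var(y) / length(y)

# Welch t-statistic: difference of means divided by its estimated standard error
t_stat <- (mean(x) - mean(y)) / sqrt(se2_x + se2_y)

# Welch-Satterthwaite approximation of the degrees of freedom
df_welch <- (se2_x + se2_y)^2 /
  (se2_x^2 / (length(x) - 1) + se2_y^2 / (length(y) - 1))

# Two-sided p-value from the t-distribution with df_welch degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df_welch)

print(c(t = t_stat, df = df_welch, p = p_value))
```

The large variance of the Y-sample dominates the standard error, which is one way to see why the t-test does not reach significance here although the rank-sum test does.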

[Figure: the X and Y samples plotted on a common axis from 0 to 300]

3 Application to Hipparion tooth measurements

Remember Welch's t-test

We examine one trait in two populations:

Population   1         2
mean         $\mu_1$   $\mu_2$

Null hypothesis: $\mu_1 = \mu_2$

We draw samples from the two populations, with sample means $\bar{x}_1$ and $\bar{x}_2$.

To check the null hypothesis $H_0$, we compute the t-statistic

$t = \frac{\bar{x}_1 - \bar{x}_2}{f} \quad \text{with} \quad f^2 = \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}$

p-value under $H_0$: $p \approx P(|T_g| \ge |t|)$, where $T_g$ is t-distributed with g degrees of freedom (the estimated degrees of freedom g depend on $n_1, n_2, s_1, s_2$).

If the distributions are not close to normal, Wilcoxon's rank-sum statistic can be used.

    > t.test(md[art=="africanum"],md[art=="libycum"])

            Welch Two Sample t-test

    data:  md[art == "africanum"] and md[art == "libycum"]
    t = -3.2043, df = 54.975, p-value = 0.002255
    alternative hypothesis: true difference in means is not equal to 0
    95 percent confidence interval:
     -4.1025338 -0.9453745
    sample estimates:
    mean of x mean of y
     25.91026  28.43421

    > wilcox.test(md[art=="africanum"],md[art=="libycum"])

            Wilcoxon rank sum test with continuity correction

    data:  md[art == "africanum"] and md[art == "libycum"]
    W = 492, p-value = 0.01104
    alternative hypothesis: true location shift is not equal to 0

    Warning message:
    In wilcox.test.default(md[art == "africanum"], md[art == "libycum"]) :
      cannot compute exact p-value with ties

IMPORTANT! Is the Wilcoxon test really appropriate here? This is not clear, because its null hypothesis is that the data come from the same distribution, not just that the means are equal. If we want to test whether the means differ while allowing the standard deviations to differ (as in the assumptions of Welch's t-test), the Wilcoxon test cannot be applied!

4 Wilcoxon signed-rank test

Remember the pied flycatchers?

x := "green length" - "blue length"

[Figure: the differences x plotted on an axis from -0.05 to 0.20]

t-test with R

    > x <- length$green-length$blue
    > t.test(x)

            One Sample t-test

    data:  x
    t = 2.3405, df = 16, p-value = 0.03254
    alternative hypothesis: true mean is not equal to 0
    95 percent confidence interval:
     0.004879627 0.098649784
    sample estimates:
     mean of x
    0.05176471

[Figure: Normal Q-Q plot of x; x-axis: Theoretical Quantiles (-2 to 2), y-axis: Sample Quantiles (-0.05 to 0.20)]

Wilcoxon test for paired samples

Wilcoxon signed rank test: given independent samples $X_1, X_2, \dots, X_n$ from an unknown distribution.
Null hypothesis: the distribution of $X_i$ is symmetric around 0.

Application: paired samples $(Y_1, Z_1), \dots, (Y_n, Z_n)$.
Null hypothesis: $Y_i$ and $Z_i$ have the same distribution.
Apply the Wilcoxon signed rank test to

$X_1 = Y_1 - Z_1,\; X_2 = Y_2 - Z_2,\; \dots,\; X_n = Y_n - Z_n.$

    wilcox.test(flycatchers$blue,flycatchers$green,paired=TRUE)

and

    wilcox.test(flycat[...]

(second command truncated in the source) both give:

            Wilcoxon signed rank test with continuity correction

    data:  flycatchers$blue and flycatchers$green
    V = 22.5, p-value = 0.03553
    alternative hypothesis: true location shift is not equal to 0

    Warning messages:
    1: In wilcox.test.default(flycatchers$blue, flycatchers$green, paired = TRUE) :
      cannot compute exact p-value with ties
    2: In wilcox.test.default(flycatchers$blue, flycatchers$green, paired = TRUE) :
      cannot compute exact p-value with zeroes

Statistic: sum of signed ranks.
Signed rank: take the rank of the absolute value, equipped with the sign of the original value.

Example:

values:           -2.6  -2.5  -2.3  -1.3  -0.6   1.6   2.2   6.1
absolute values:   2.6   2.5   2.3   1.3   0.6   1.6   2.2   6.1
ranks:               7     6     5     2     1     3     4     8
signed ranks:       -7    -6    -5    -2    -1     3     4     8

Sum of signed ranks = -7 - 6 - 5 - 2 - 1 + 3 + 4 + 8 = -6
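The signed-rank computation in the example can be sketched in a few lines of R. This assumes that the first five example values are negative and the last three positive, which is what the stated sum of signed ranks implies. The names `v`, `abs_ranks`, `signed_ranks`, `sum_signed` and `v_stat` are illustrative. Note that `wilcox.test` does not report the sum of signed ranks itself but V, the sum of the ranks of the positive values. The two are equivalent, since the sum of signed ranks equals 2V - n(n+1)/2 when no value is zero.

```r
# Example values (signs assumed as described in the lead-in)
v <- c(-2.6, -2.5, -2.3, -1.3, -0.6, 1.6, 2.2, 6.1)
n <- length(v)

# Rank the absolute values, then attach the sign of the original value
abs_ranks <- rank(abs(v))
signed_ranks <- sign(v) * abs_ranks

# Sum of signed ranks, as in the worked example
sum_signed <- sum(signed_ranks)

# Statistic V as reported by wilcox.test: sum of ranks of the positive values
v_stat <- sum(abs_ranks[v > 0])

print(signed_ranks)
print(sum_signed)
print(v_stat)
```

Calling `wilcox.test(v)` on these values should report the same V, since there are no ties and no zeros.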