Advanced Statistics II: Non-Parametric Tests


Aurélien Garivier, ParisTech. February 27, 2011.

Outline
- Fitting a distribution
- Rank Tests for the comparison of two samples
  - Two unrelated samples: the Mann-Whitney rank-sum test
  - Two related samples: the Wilcoxon signed-rank test
- Bootstrap

A First Motivation: Model Validation
In a Gaussian linear model
$$Y_i = \sum_{j=1}^{p} \theta_j X_{i,j} + \sigma \varepsilon_i, \qquad i = 1, \dots, n,$$
it is assumed that the $\varepsilon_i$ are i.i.d. $\mathcal{N}(0,1)$. Can we check that this is the case? Two aspects:
- identically distributed: residuals vs. fitted values plot
- Gaussian distribution: Q-Q plots

Key Remarks
Prop: if $X$ has a continuous Cumulative Distribution Function (CDF) $F : t \mapsto P(X \le t)$, then $F(X) \sim \mathcal{U}[0,1]$.
Consequence: if $X_1, \dots, X_n$ are i.i.d. with continuous CDF $F$, then $F(X_1), \dots, F(X_n)$ are i.i.d. $\mathcal{U}[0,1]$ (their distribution is free of $F$).
Definition: The order statistics of an $n$-tuple $(U_1, \dots, U_n)$ is the $n$-tuple $(U_{(1)}, \dots, U_{(n)})$ such that $\{U_1, \dots, U_n\} = \{U_{(1)}, \dots, U_{(n)}\}$ and $U_{(1)} \le \dots \le U_{(n)}$.
Prop: If $(U_{(1)}, \dots, U_{(n)})$ is the order statistics of i.i.d. $\mathcal{U}[0,1]$ random variables, then $E[U_{(i)}] = \frac{i}{n+1}$.
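The probability integral transform can be checked by simulation; a minimal R sketch (illustrative data, not from the slides), with pnorm playing the role of $F$:

```r
# empirical check: if X ~ F with F continuous, then F(X) ~ U[0,1]
set.seed(9)
x <- rnorm(1e4)          # X_1, ..., X_n i.i.d. with CDF F = pnorm
u <- pnorm(x)            # F(X_i)
hist(u, breaks = 20)     # the histogram is approximately flat on [0, 1]
```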

Free Statistic
Definition: A statistic $S = S(X_1, \dots, X_n)$ is free if its distribution depends only on $n$ and not on the distribution of $(X_1, \dots, X_n)$.
Free statistics are useful to build (non-parametric) tests. The permutation sorting a sample is a free statistic (uniformly distributed).

Q-Q Plots
Prop: Let $X_1, \dots, X_n$ be i.i.d. random variables with CDF $F$, and let $G$ be a CDF. Consider the points
$$\left\{ \left( G^{-1}\!\left(\tfrac{i}{n+1}\right),\ X_{(i)} \right),\ 1 \le i \le n \right\}.$$
If $F = G$, then the points approximately lie on the first diagonal. If $F \neq G$, then (at least some of) these points deviate from the first diagonal.
Remark: in practice, for Gaussian variables one may use $G^{-1}\!\left(\frac{i - 0.375}{n + 0.25}\right)$ for better performance.
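A minimal R sketch of such a plot against the standard normal, using the plotting positions from the remark (illustrative data, not from the slides):

```r
# normal Q-Q plot by hand, with the (i - 0.375)/(n + 0.25) plotting positions
set.seed(1)
x <- rnorm(50)
n <- length(x)
p <- (seq_len(n) - 0.375) / (n + 0.25)
plot(qnorm(p), sort(x),
     xlab = "theoretical quantiles G^{-1}(p_i)",
     ylab = "order statistics X_(i)")
abline(0, 1)   # the first diagonal
```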

Kolmogorov-Smirnov Statistic
Definition: The empirical distribution function $F_n$ of $X_1, \dots, X_n$ is the mapping $\mathbb{R} \to [0,1]$ defined by
$$F_n(t) = \frac{1}{n} \sum_{i=1}^{n} 1\{X_i \le t\}.$$
It is the CDF of the empirical measure $P_n = \frac{1}{n} \sum_{i=1}^{n} \delta_{X_i}$.
Definition: The Kolmogorov-Smirnov statistic between the sample $X_1, \dots, X_n$ and the CDF $G$ is
$$D_n(X_{1:n}, G) = \sup_{t \in \mathbb{R}} \left| G(t) - F_n(t) \right|.$$

Properties of the K-S Statistic
Prop: The KS statistic can be computed by:
$$D_n(X_{1:n}, G) = \max_{1 \le i \le n} \max\left\{ \left| G(X_{(i)}) - \frac{i-1}{n} \right|,\ \left| G(X_{(i)}) - \frac{i}{n} \right| \right\}.$$
Prop: if $F$ is the CDF of the $X_i$, then $D_n(X_{1:n}, F)$ is a free statistic, and it has the distribution of
$$\sup_{u \in [0,1]} \left| u - \frac{1}{n} \sum_{i=1}^{n} 1\{U_i \le u\} \right|,$$
where the $U_i$ are i.i.d. $\mathcal{U}[0,1]$.
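A small R sketch of this order-statistics formula (ks_stat is a hypothetical helper, not from the slides); its value can be checked against the built-in ks.test:

```r
# D_n(X_1:n, G) via the order-statistics formula above
ks_stat <- function(x, G) {
  n <- length(x)
  Gx <- G(sort(x))          # G evaluated at the order statistics
  i <- seq_len(n)
  max(pmax(abs(Gx - (i - 1) / n), abs(Gx - i / n)))
}

set.seed(1)
x <- rnorm(30)
ks_stat(x, pnorm)           # matches ks.test(x, "pnorm")$statistic
```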

Glivenko-Cantelli Theorem and K-S Limiting Distribution
Theorem: If $F$ denotes the CDF of the $X_i$, then $D_n(X_{1:n}, F) \to 0$ a.s. as $n$ goes to infinity. If $F \neq G$, then $D_n(X_{1:n}, G)$ remains bounded away from 0 as $n$ goes to infinity (it converges a.s. to $\sup_t |G(t) - F(t)| > 0$).
Theorem: As $n$ goes to infinity,
$$P\left( D_n(X_{1:n}, F) > \frac{c}{\sqrt{n}} \right) \to 2 \sum_{k=1}^{\infty} (-1)^{k-1} e^{-2k^2 c^2}.$$
The RHS is 5% when $c = 1.36$.
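The resulting asymptotic 5% test can be run in R with the built-in ks.test; a sketch on illustrative data, using the $c = 1.36$ threshold from the theorem:

```r
# asymptotic one-sample K-S test at level 5%
set.seed(2)
x <- rnorm(200)
Dn <- ks.test(x, "pnorm")$statistic   # D_n(X_1:n, N(0,1))
sqrt(length(x)) * Dn > 1.36           # TRUE rejects H0: F = N(0,1)
```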

Comparison of Two Samples
Let $X_1, \dots, X_m$ and $Y_1, \dots, Y_n$ be two samples, with CDF respectively $F$ and $G$, and with empirical CDF $F_m$ and $G_n$.
Definition: The Kolmogorov-Smirnov statistic between the sample $X_1, \dots, X_m$ and the sample $Y_1, \dots, Y_n$ is
$$D_{m,n}(X_{1:m}, Y_{1:n}) = \sup_{t} \left| G_n(t) - F_m(t) \right|.$$
Prop: Asymptotic behavior of the K-S statistic:
$$\lim_{m,n \to \infty} P\left( \sqrt{\frac{mn}{m+n}}\, D_{m,n}(X_{1:m}, Y_{1:n}) > c \right) = 2 \sum_{k=1}^{\infty} (-1)^{k-1} e^{-2k^2 c^2}.$$
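In R, the two-sample version is again ks.test (illustrative data):

```r
# two-sample K-S test: H0: F = G
set.seed(3)
x <- rnorm(80)
y <- rnorm(120, mean = 0.5)
ks.test(x, y)
```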

Outline
- Fitting a distribution
- Rank Tests for the comparison of two samples
  - Two unrelated samples: the Mann-Whitney rank-sum test
  - Two related samples: the Wilcoxon signed-rank test
- Bootstrap

Unrelated Samples
Assume that $(X_i)_{1 \le i \le m}$ and $(Y_j)_{1 \le j \le n}$ are two independent samples with CDF, respectively, $F$ and $G$. The goal is to test $H_0: F = G$ against $H_1: P(Y_j > X_i) \neq 1/2$.
The alternative hypothesis is more restrictive than for the K-S test. The idea is that, under $H_0$, the $X_i$ and $Y_j$ are well interleaved while, under $H_1$, the $X_i$ tend to be smaller (or larger) than the $Y_j$.

Mann-Whitney Statistic
The Mann-Whitney statistic is defined as
$$U_{m,n} = \sum_{i=1}^{m} \sum_{j=1}^{n} 1\{Y_j > X_i\}.$$
Prop: Under $H_0$, $U_{m,n}$ is a free statistic,
$$E[U_{m,n}] = \frac{mn}{2} \quad \text{and} \quad \mathrm{Var}(U_{m,n}) = \frac{mn(m+n+1)}{12}.$$
Besides, when $m, n \to \infty$,
$$\zeta_{m,n} = \frac{U_{m,n} - \frac{mn}{2}}{\sqrt{\frac{mn(m+n+1)}{12}}} \to \mathcal{N}(0,1) \text{ in distribution},$$
while, under $H_1$, $\zeta_{m,n} \to \pm\infty$.

Mann-Whitney Test
Testing procedures are derived as usual. The Mann-Whitney test can also be used with one-sided alternatives.
For effective computation, observe that $U_{m,n}$ can be obtained as follows (see the sketch after this list):
- sort all the elements of $\{X_1, \dots, X_m, Y_1, \dots, Y_n\}$ in increasing order;
- define $R_{m,n}$ to be the sum of the ranks of all the elements $Y_1, \dots, Y_n$;
- then $U_{m,n} = R_{m,n} - \frac{n(n+1)}{2}$.
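A minimal R sketch of this computation (illustrative data; the comparison with wilcox.test assumes no ties):

```r
# U_{m,n} from pooled ranks, checked against R's wilcox.test
set.seed(4)
x <- rnorm(10)
y <- rnorm(12, mean = 1)
m <- length(x); n <- length(y)
r <- rank(c(x, y))            # ranks in the pooled sample
R <- sum(r[(m + 1):(m + n)])  # sum of the ranks of the Y_j
U <- R - n * (n + 1) / 2
U
wilcox.test(y, x)$statistic   # same count of pairs with Y_j > X_i
```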

Student, K-S or Mann-Whitney?
Student's test is the most powerful, but it has strong requirements (normality of the two samples, equality of the variances). The Mann-Whitney test is non-parametric and more robust; in the normal case, its relative efficiency w.r.t. Student's test is about 96%. Besides, it can be used on ordinal data. The K-S test has a more general alternative hypothesis; however, it is less powerful.
⟹ If normality cannot be assumed, and if its alternative hypothesis is sufficiently discriminating, use Mann-Whitney.
Warning: in R, this test is implemented under the name wilcox.test.

Related Samples
Let $(X_i, Y_i)$, $1 \le i \le n$, be independent, identically distributed pairs of real-valued random variables. We assume that the CDF of the $X_i$ is $F$, while the CDF of the $Y_i$ is $t \mapsto F(t - \theta)$ for some real $\theta$. In other words, $Y_i$ has the same distribution as $X_i + \theta$.
Goal: we want to test $H_0: \theta = 0$ against $H_1: \theta \neq 0$.
Examples: evolution of blood pressure after administration of a drug; the same exam graded by two different graders.

Wilcoxon Statistic
Definition: for $1 \le i \le n$, let $Z_i = X_i - Y_i$. The Wilcoxon statistic $W_n^+$ is defined by
$$W_n^+ = \sum_{k=1}^{n} k\, 1\{Z_{[k]} > 0\},$$
where $Z_{[k]}$ is such that $|Z_{[1]}| \le |Z_{[2]}| \le \dots \le |Z_{[n]}|$.
Prop: if the distribution of $Z$ is symmetric and if $P(Z = 0) = 0$, then the sign $\mathrm{sgn}(Z)$ and the absolute value $|Z|$ are independent.
Prop: under $H_0$, the variables $1\{Z_{[k]} > 0\}$ are i.i.d. $\mathcal{B}(1/2)$ and
$$E[W_n^+] = \frac{n(n+1)}{4}, \qquad \mathrm{Var}[W_n^+] = \frac{n(n+1)(2n+1)}{24}.$$

Limiting Distribution
Prop: Under $H_0$, as $n$ goes to infinity,
$$\zeta_n = \frac{W_n^+ - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}}$$
converges in distribution to the standard normal $\mathcal{N}(0,1)$.
Prop: Under $H_1$, $\zeta_n$ goes to $-\infty$ (resp. $+\infty$) if $\theta > 0$ (resp. $\theta < 0$).
Remark: the test need not be two-sided; for example, if $H_1: \theta < 0$, the null hypothesis is rejected when $W_n^+$ is too large.
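A hand computation of $W_n^+$ in R, checked against the paired wilcox.test (illustrative data; assumes no zero or tied differences):

```r
# signed-rank statistic W_n^+ for paired data
set.seed(5)
x <- rnorm(15)
y <- x + 0.8 + rnorm(15, sd = 0.5)          # paired data with a positive shift
z <- x - y
Wplus <- sum(rank(abs(z))[z > 0])           # ranks of |Z_i| where Z_i > 0
Wplus
wilcox.test(x, y, paired = TRUE)$statistic  # R's V is the same statistic
```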

Outline
- Fitting a distribution
- Rank Tests for the comparison of two samples
  - Two unrelated samples: the Mann-Whitney rank-sum test
  - Two related samples: the Wilcoxon signed-rank test
- Bootstrap

Pulling Yourself Up by Your Own Bootstraps
What to do when the classical testing or estimating procedures can't be trusted? When the distribution is strongly non-Gaussian? When the amount of data is not sufficient to assume normality?
Idea: create new data from the data available!

Principle: Resampling
Given a sample $X_1, \dots, X_n$ from a distribution $P$, let $P_n$ be the empirical distribution.
Plug-in: to estimate a functional $T(P)$, use $T(P_n)$!
Let $\hat\theta = \hat\theta(X_1, \dots, X_n)$ be an estimator of $T(P)$. For $k$ from 1 to $N$ (with $N \gg 1$), repeat:
1. Resampling: sample $X_1^k, \dots, X_n^k$ from $P_n$, i.e., from $\{X_1, \dots, X_n\}$ with replacement.
2. Compute the estimator $\hat\theta_k = \hat\theta(X_1^k, \dots, X_n^k)$ of $T(P_n)$.
Bootstrap idea: the empirical distribution of the $(\hat\theta_k)_k$ is close to the sampling distribution of $\hat\theta$. A minimal sketch in R follows below.
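In this sketch (illustrative, not from the slides), the median of an exponential sample stands in for $\hat\theta$:

```r
# bootstrap: resample from P_n, recompute the estimator, collect
set.seed(6)
x <- rexp(50)                 # observed sample X_1, ..., X_n
theta_hat <- median(x)        # estimator of T(P) on the original sample
N <- 2000
theta_star <- replicate(N, median(sample(x, replace = TRUE)))
hist(theta_star)              # empirical distribution of the theta_k
```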

Berry-Esseen Theorem
Theorem: Let $X_1, \dots, X_n$ be i.i.d. with $E[X_i] = 0$, $E[X_i^2] = \sigma^2$ and $E[|X_i|^3] = \rho < \infty$. If $F^{(n)}$ is the CDF of $(X_1 + \dots + X_n)/(\sigma\sqrt{n})$ and $N$ is the CDF of the standard normal distribution, then
$$\left| F^{(n)}(x) - N(x) \right| \le \frac{3\rho}{\sigma^3 \sqrt{n}}.$$

Properties of the Bootstrap Distribution
- Shape: because it approximates the sampling distribution, the bootstrap distribution can be used to check normality of the latter.
- Spread: the standard deviation of the bootstrap distribution is approximately the standard error of the statistic $\hat\theta_n$.
- Center: the bias of the bootstrap distribution's mean from the value of the statistic on the sample, $\hat\theta$, is approximately the same as the bias of $\hat\theta_n$ from $T(P)$.

Bootstrap Confidence Intervals
- Bootstrap t-confidence interval: instead of $\hat\sigma/\sqrt{n}$, use the standard deviation of the bootstrap distribution to estimate the deviation of the sampling distribution. This requires that its shape is nearly Gaussian.
- Bootstrap percentile confidence interval: keep, as a $(1-\alpha)$-confidence interval, the central fraction $1-\alpha$ of the values of $(\hat\theta_k)_k$.
Examples: confidence interval for the mean, regression.
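A sketch of the percentile interval in R (illustrative data; $\alpha = 5\%$):

```r
# 95% bootstrap percentile confidence interval for the mean
set.seed(7)
x <- rexp(40)
theta_star <- replicate(2000, mean(sample(x, replace = TRUE)))
quantile(theta_star, c(0.025, 0.975))   # central 95% of the bootstrap values
```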

Comparing Two Groups
Given independent samples $X_1, \dots, X_n$ and $Y_1, \dots, Y_m$:
1. Draw a resample of size $n$ of the first sample $X_1, \dots, X_n$, and a separate resample of size $m$ of the second sample $Y_1, \dots, Y_m$.
2. Compute the statistic that compares the two groups, such as the difference between the two sample means.
3. Repeat the first two steps 10000 times.
4. Construct the bootstrap distribution of the statistic.
5. Inspect its shape, bias, and bootstrap error (i.e., the standard deviation of the bootstrap distribution).
A sketch of these steps in R follows below.
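A minimal R sketch of this procedure (illustrative data; the statistic is the difference of means):

```r
# bootstrap distribution of the difference of two sample means
set.seed(8)
x <- rnorm(30)
y <- rnorm(40, mean = 0.5)
diff_star <- replicate(10000,
  mean(sample(x, replace = TRUE)) - mean(sample(y, replace = TRUE)))
hist(diff_star)   # shape of the bootstrap distribution
sd(diff_star)     # bootstrap standard error
```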