Simultaneous Confidence Intervals and Multiple Contrast Tests


Edgar Brunner
Abteilung Medizinische Statistik, Universität Göttingen

Contents

I Parametric Methods
- Motivating Example
- SCI Method
- Analysis of the Example

II Nonparametric Methods
- Motivating Example
- SCI Method
- Analysis of the Example
- Particular Difficulties

References

I Parametric Methods

Motivating Example: O2-Consumption of Leucocytes

[Figure: O2-consumption of leucocytes [µl] per group, bars show min to max; groups PL ($n_1 = 8$), D1 ($n_2 = 8$), D2 ($n_3 = 7$); axis from 3.0 to 4.0 µl]

Question: Which dose is different from control?

Motivating Example

Classical Analysis
(1) ANOVA: $H_0: \mu_P = \mu_1 = \mu_2$
(2) if $H_0$ is rejected: multiple comparisons (FWE_s = 0.05)
(3) confidence intervals for $\mu_1 - \mu_P$ and $\mu_2 - \mu_P$
    - must be compatible with the decisions of the MCP
    - i.e. the confidence interval (CI) for $\mu_i - \mu_P$ does not contain 0 if and only if $H_0: \mu_i - \mu_P = 0$ is rejected, $i = 1, 2$

Statistical Methods / Procedures
- ANOVA (F-test)
- multiple comparisons using the closure principle (CTP)
- Bonferroni confidence intervals ($1 - \alpha = 0.975$)

Results
- global hypothesis: F = 2.53, p-value 0.1056 (n.s.)
- MCP: PL - D1: p = 0.1424 (n.s.) / PL - D2: p = 0.0488 (n.s.)

Motivating Example

Shift of the D1 Data

[Figure: O2-consumption of leucocytes [µl] per group after shifting the D1 data; groups PL ($n_1 = 8$), D1 ($n_2 = 8$), D2 ($n_3 = 7$); axis from 3.0 to 4.0 µl]

Results
- global hypothesis: F = 4.06, p-value 0.0355 (*)
- MCP (CTP): PL - D1: p = 0.0256 (*) / PL - D2: p = 0.0488 (*)
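
The decision rule behind both result tables can be sketched as follows. This is a minimal illustration, assuming that the only intersection hypothesis in the closure of the two many-to-one comparisons is the global hypothesis and that each pairwise p-value comes from a local level-$\alpha$ test; the function name `ctp_many_to_one` is hypothetical. It shows why the unchanged p-value 0.0488 for PL - D2 switches from n.s. to (*) once the D1 data are shifted.

```python
def ctp_many_to_one(p_global, p_pairwise, alpha=0.05):
    """Closed testing for two comparisons against one control.

    Assumption: the closure contains only the two elementary hypotheses
    and their intersection (the global hypothesis), so an elementary
    hypothesis is rejected iff the global test AND its own test reject.
    """
    global_rejected = p_global <= alpha
    return {name: global_rejected and p <= alpha
            for name, p in p_pairwise.items()}

# p-values taken from the two result slides
original = ctp_many_to_one(0.1056, {"PL-D1": 0.1424, "PL-D2": 0.0488})
shifted = ctp_many_to_one(0.0355, {"PL-D1": 0.0256, "PL-D2": 0.0488})

print("original data:", original)
print("D1 shifted:   ", shifted)
```

The decision on PL - D2 changes although neither the PL nor the D2 data were touched, which is the undesirable dependence discussed next.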

Conclusions from the Motivating Example

Confidence Intervals (Bonferroni)
- PL - D1: [-0.024, 0.557], contains 0 / not compatible with CTP
- PL - D2: [-0.063, 0.538], contains 0 / not compatible with CTP

Conclusions (undesirable properties)
- decision on effect PL - D2 depends on effect PL - D1
- confidence intervals are not compatible
- dependency of the statistics $\bar{X}_1 - \bar{X}_P$ and $\bar{X}_2 - \bar{X}_P$ is not used (wasting information)

Consequence: a different method is needed

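Different Method

Idea
- the statistical model is adapted and reduced to the particular questions of the experimenter
- take the dependence of the statistics into account
  - statistics completely dependent: no $\alpha$-adjustment necessary
  - independence is the worst case

Example of O2-consumption
- contrast matrix and vector of means:
  $C = \begin{pmatrix} -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix} = \left( -\mathbf{1}_2 \,\vdots\, I_2 \right)$ and $\bar{X} = (\bar{X}_P, \bar{X}_1, \bar{X}_2)'$
- desired contrasts:
  $C\bar{X} = \begin{pmatrix} \bar{X}_1 - \bar{X}_P \\ \bar{X}_2 - \bar{X}_P \end{pmatrix}$, $\mu_\delta = \begin{pmatrix} \mu_1 - \mu_P \\ \mu_2 - \mu_P \end{pmatrix}$
- consider the distribution of $C\bar{X} \sim N(\mu_\delta, \Sigma)$ with
  $\Sigma = \sigma^2 \left[ \begin{pmatrix} n_1^{-1} & 0 \\ 0 & n_2^{-1} \end{pmatrix} + n_P^{-1} J_2 \right] = (s_{ij})_{i,j=1,2}$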
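
As a small numerical sketch of the covariance structure above, the following computes $\Sigma/\sigma^2$ and the implied correlation of the two contrasts for the sample sizes of the example ($n_P = 8$, $n_1 = 8$, $n_2 = 7$, read off the figure). The function names are hypothetical and the matrices are plain nested lists.

```python
from math import sqrt

def contrast_covariance(n_p, n_1, n_2):
    """Sigma / sigma^2 = diag(1/n_1, 1/n_2) + (1/n_P) * J_2."""
    return [[1 / n_1 + 1 / n_p, 1 / n_p],
            [1 / n_p, 1 / n_2 + 1 / n_p]]

def correlation(cov):
    """Turn a 2x2 covariance matrix into the corresponding correlation matrix."""
    d = [sqrt(cov[i][i]) for i in range(2)]
    return [[cov[i][j] / (d[i] * d[j]) for j in range(2)] for i in range(2)]

S = contrast_covariance(n_p=8, n_1=8, n_2=7)
R = correlation(S)
print("Sigma / sigma^2 =", S)
print("correlation of the two contrasts =", round(R[0][1], 4))
```

Because $\sigma^2$ cancels, the off-diagonal element reduces to $\sqrt{n_1 n_2 / ((n_1 + n_P)(n_2 + n_P))}$, roughly 0.48 here: the two statistics are clearly dependent because they share the control mean.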

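Different Method

Derivation of the Statistic
- $s_{ii} = \sigma^2 (n_i + n_P)/(n_i n_P)$, $i = 1, 2$: diagonal elements of $\Sigma$
- $\hat{s}_{ii}$: LS-estimator of $s_{ii}$, replacing $\sigma^2$ with the pooled estimator
  $\hat{\sigma}_N^2 = \frac{1}{N-3} \sum_{i=P,1,2} \sum_{k=1}^{n_i} (X_{ik} - \bar{X}_i)^2$, $N = n_1 + n_2 + n_P$
- studentize each component of $C\bar{X} = \begin{pmatrix} \bar{X}_1 - \bar{X}_P \\ \bar{X}_2 - \bar{X}_P \end{pmatrix}$ with $\hat{s}_{ii}$
- under $H_0: \mu_\delta = 0$, consider the statistics ($i = 1, 2$)
  $T_i = \frac{\bar{X}_i - \bar{X}_P}{\hat{\sigma}_N} \sqrt{\frac{n_i n_P}{n_i + n_P}} \doteq N(0,1)$, $N \to \infty$, $N/n_i \le N_0 < \infty$
- multivariate statistic $T = (T_1, T_2)' \doteq N(0, R)$, $R$: correlation matrix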
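
The pooled estimator and the studentized statistics can be sketched directly from their definitions. The data below are hypothetical values chosen only to make the block runnable (they are not the measurements of the example), and the function names are assumptions.

```python
from math import sqrt
from statistics import mean

def pooled_variance(groups):
    """Pooled estimator: within-group sums of squares divided by N - (number of groups)."""
    n_total = sum(len(g) for g in groups)
    ss = 0.0
    for g in groups:
        m = mean(g)
        ss += sum((x - m) ** 2 for x in g)
    return ss / (n_total - len(groups))

def many_to_one_t(control, treatment, sigma2):
    """T_i = (mean_i - mean_P) / sigma_hat * sqrt(n_i * n_P / (n_i + n_P))."""
    n_p, n_i = len(control), len(treatment)
    return (mean(treatment) - mean(control)) / sqrt(sigma2) * sqrt(n_i * n_p / (n_i + n_p))

# HYPOTHETICAL data (8, 8 and 7 observations, as in the design of the example)
pl = [3.1, 3.3, 3.2, 3.4, 3.0, 3.5, 3.3, 3.2]
d1 = [3.4, 3.6, 3.3, 3.7, 3.5, 3.2, 3.6, 3.5]
d2 = [3.5, 3.7, 3.4, 3.8, 3.3, 3.6, 3.5]

sigma2 = pooled_variance([pl, d1, d2])
T = [many_to_one_t(pl, d1, sigma2), many_to_one_t(pl, d2, sigma2)]
print("pooled variance:", round(sigma2, 4))
print("T_1, T_2:", [round(t, 3) for t in T])
```

With three groups the divisor is $N - 3$, matching the formula above; both statistics share the same $\hat{\sigma}_N$ and the same control mean.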

Different Method

Derivation of the $(1-\alpha)$-quantiles
- same quantile $z_{1-\alpha,2,R}$ for all components, such that
  $\int_{-z_{1-\alpha,2,R}}^{z_{1-\alpha,2,R}} \int_{-z_{1-\alpha,2,R}}^{z_{1-\alpha,2,R}} dN(0,R) = 1 - \alpha$
- better approximation: multivariate t-distribution, $t_{1-\alpha,2,\nu,\hat{R}_N}$
  - $\hat{R}_N$: LS-estimator of $R$, replacing $\sigma^2$ with $\hat{\sigma}_N^2$
  - diagonal elements = 1
  - off-diagonal elements depend only on sample sizes and $\hat{\sigma}_N^2$
  - $T \doteq$ multivariate t-distribution
- references
  - original paper: Bretz, Genz and Hothorn (2001)
  - multivariate integration: Genz and Bretz (2009)
  - heteroscedastic case: Hasler and Hothorn (2008)
- in general: $C$ may be any appropriate contrast matrix

SCI-Method / Quantiles

[Figure: three panels showing bivariate normal distributions with the square containing mass $1-\alpha$, both axes from -4 to 4; panel titles: correlation = 0.99, quantile = 2.0133; correlation = 0.5, quantile = 2.2121; correlation = 0, quantile = 2.2365]

- equi-coordinate quantiles of different bivariate normal distributions
- squares containing mass $1 - \alpha$ of the bivariate normal distributions
- computation by means of the R-package mvtnorm
- SAS-macro: to be developed, or input of R-code in SAS/IML Studio 3.2
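
The equi-coordinate quantiles in the figure can be approximated without special software. The sketch below is not the mvtnorm algorithm; it is a simple stand-in that assumes a bivariate standard normal with $|\rho| < 1$, reduces the probability of the square to a one-dimensional integral (Simpson's rule), and solves for the quantile by bisection. Function names and tuning constants are assumptions.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal, stdlib only

def square_prob(z, rho, steps=1000):
    """P(|Z1| <= z, |Z2| <= z) for a bivariate standard normal, |rho| < 1.

    Uses P = integral over x in [-z, z] of phi(x) * P(|Z2| <= z | Z1 = x),
    evaluated with Simpson's rule (steps must be even).
    """
    s = sqrt(1.0 - rho * rho)

    def g(x):
        upper = STD.cdf((z - rho * x) / s)
        lower = STD.cdf((-z - rho * x) / s)
        return STD.pdf(x) * (upper - lower)

    h = 2.0 * z / steps
    total = g(-z) + g(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(-z + i * h)
    return total * h / 3.0

def equicoordinate_quantile(rho, alpha=0.05, lo=1.0, hi=4.0, tol=1e-6):
    """Two-sided equi-coordinate (1 - alpha)-quantile by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if square_prob(mid, rho) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

quantiles = {rho: equicoordinate_quantile(rho) for rho in (0.0, 0.5, 0.99)}
for rho, q in quantiles.items():
    print(f"correlation = {rho}: quantile = {q:.4f}")
```

For correlation 0 the quantile has the closed form $\Phi^{-1}((1 + \sqrt{1-\alpha})/2)$, which gives a convenient check; as the correlation approaches 1 the quantile falls towards the unadjusted 1.96, in line with the remark that completely dependent statistics need no $\alpha$-adjustment.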

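SCI-Method / Procedure

Multiple Comparisons
- reject $H_0^{(i)}: \delta_i = \mu_i - \mu_P = 0$ if $|T_i| \ge z_{1-\alpha,2,R}$, or if $|T_i| \ge t_{1-\alpha,2,\nu,\hat{R}_N}$

Global Hypothesis
- reject $H_0: C\mu = \mu_\delta = 0$ if $\max\{|T_1|, |T_2|\} \ge z_{1-\alpha,2,R}$, or if $\max\{|T_1|, |T_2|\} \ge t_{1-\alpha,2,\nu,\hat{R}_N}$

Simultaneous Confidence Intervals
- $P\left( \bigcap_{i=1}^{2} \left\{ \delta_i \in \left[ \bar{X}_i - \bar{X}_P \pm z_{1-\alpha,2,\hat{R}_N}\, \hat{\sigma}_N \sqrt{\frac{n_i + n_P}{n_i n_P}} \right] \right\} \right) \doteq 1 - \alpha$

Error Control?
- FWE_s (by Gabriel's Theorem, 1969)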
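
The compatibility of tests and intervals follows because both use the same quantile and the same standard error. A minimal sketch, assuming hypothetical summary statistics (`diff`, `sigma_hat`) and, purely for illustration, the normal quantile 2.2121 quoted for correlation 0.5 on the quantile slide:

```python
from math import sqrt

def sci(diff, sigma_hat, n_i, n_p, q):
    """Simultaneous interval: diff +/- q * sigma_hat * sqrt((n_i + n_P) / (n_i * n_P))."""
    half = q * sigma_hat * sqrt((n_i + n_p) / (n_i * n_p))
    return diff - half, diff + half

def t_stat(diff, sigma_hat, n_i, n_p):
    """Studentized contrast T_i."""
    return diff / sigma_hat * sqrt(n_i * n_p / (n_i + n_p))

q = 2.2121                    # illustrative quantile, see lead-in
diff, sigma_hat = 0.30, 0.20  # HYPOTHETICAL mean difference and pooled SD
n_i, n_p = 8, 8

lo, hi = sci(diff, sigma_hat, n_i, n_p, q)
t = t_stat(diff, sigma_hat, n_i, n_p)
rejected = abs(t) >= q
excludes_zero = lo > 0 or hi < 0

print(f"T = {t:.2f}, interval = [{lo:.4f}, {hi:.4f}]")
print("test rejects:", rejected, "| interval excludes 0:", excludes_zero)
```

Since $|T_i| \ge q$ is algebraically the same statement as "0 lies outside the interval" (up to the boundary case), the two decisions cannot disagree, unlike the combination of closure test and Bonferroni intervals.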

Example: Analysis by SCI-Method

Original Data Set (O2-Consumption of Leucocytes)

[Figure: O2-consumption of leucocytes [µl] per group; groups PL ($n_1 = 8$), D1 ($n_2 = 8$), D2 ($n_3 = 7$); axis from 3.0 to 4.0 µl]

Comparison   t      p-value   SCI    Classical
PL - D1      2.10   0.0965    n.s.   n.s.
PL - D2      2.18   0.0864    n.s.   n.s.

Example: Analysis by SCI-Method

Shift of the D1 Data

[Figure: O2-consumption of leucocytes [µl] per group after shifting the D1 data; groups PL ($n_1 = 8$), D1 ($n_2 = 8$), D2 ($n_3 = 7$); axis from 3.0 to 4.0 µl]

Comparison   t      p-value   SCI    Classical
PL - D1      2.53   0.0460    (*)    (*)
PL - D2      2.18   0.0864    n.s.   (*)

Conclusions from the Analysis

Confidence Intervals (D1 Shifted)
- PL - D1: [0.0049, 0.5276], does not contain 0 / compatible
- PL - D2: [-0.0324, 0.5074], contains 0 / compatible

Conclusions
- decision on effect PL - D2 does not depend on effect PL - D1
- confidence intervals are compatible
- dependency of the statistics $\bar{X}_1 - \bar{X}_P$ and $\bar{X}_2 - \bar{X}_P$ is used

Extensions / Generalizations

Factorial Designs
- Biesheuvel and Hothorn (2002) / stratified samples
- general case under research: diploma thesis

Large Number of Dimensions
- $\Sigma_N$ may become singular (breakdown?)

Repeated Measures
- n d and n < d (breakdown?)
- high-dimensional data / Froemke, Hothorn and Kropf (2008)
- is there a limit distribution?

Binomial Data
- Schaarschmidt, Sill and Hothorn (2008)

Nonparametric Effects
- non-normal data (Konietschke, 2009)
- ordinal data: ordinal effect size measure (Ryu and Agresti, 2008)

II Nonparametric Methods

Motivating Example

Toxicity Trial (60 Wistar Rats)
- damage by an inhalable substance on the mucosa of the nose
- 3 concentrations (2 [ppm], 5 [ppm], 10 [ppm])
- score (0 = no damage, ..., 3 = severe damage): ordinal data

Number of rats with score:

Concentration   0    1    2    3
2 [ppm]         18   2    0    0
5 [ppm]         12   6    2    0
10 [ppm]        3    7    6    4

Motivating Example

Classical Analysis Strategy
- statistical model: $X_{ik} \sim F_i(x)$, $i = 1, 2, 3$; $k = 1, \ldots, 20$
- hypotheses
  - $H_0^{(1)}: F_1 = F_2 = F_3$
  - $H_0^{(2)}: F_1 = F_2$, relative effect: $p_{12} = \int F_1 \, dF_2$
  - $H_0^{(3)}: F_1 = F_3$, relative effect: $p_{13} = \int F_1 \, dF_3$
  - $H_0^{(4)}: F_2 = F_3$, relative effect: $p_{23} = \int F_2 \, dF_3$
- relative effect $p_{ij}$, interpretation
  - $p_{ij} = \int F_i \, dF_j = P(X_{i1} < X_{j1}) + \frac{1}{2} P(X_{i1} = X_{j1})$
  - probability that the observations in group $i$ tend to smaller values than in group $j$
  - ordinal data: effect size measure (Ryu and Agresti, 2008)
- needed: confidence intervals for $p_{ij} = \int F_i \, dF_j$, $i \ne j = 1, 2, 3$
- error control: FWE_s
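
To make the interpretation concrete, the plug-in version of $P(X_{i1} < X_{j1}) + \frac{1}{2} P(X_{i1} = X_{j1})$ can be evaluated directly from the score counts of the toxicity trial. This is a sketch of the definition only (no variance estimation); the function name is hypothetical.

```python
def relative_effect(counts_i, counts_j):
    """Plug-in estimate of P(X_i < X_j) + 0.5 * P(X_i = X_j).

    counts_i[s] is the number of animals in group i with score s.
    """
    n_i, n_j = sum(counts_i), sum(counts_j)
    less = sum(ci * cj
               for a, ci in enumerate(counts_i)
               for b, cj in enumerate(counts_j)
               if a < b)
    ties = sum(ci * cj for ci, cj in zip(counts_i, counts_j))
    return (less + 0.5 * ties) / (n_i * n_j)

# score counts (scores 0..3) from the toxicity trial
ppm2 = [18, 2, 0, 0]
ppm5 = [12, 6, 2, 0]
ppm10 = [3, 7, 6, 4]

p12 = relative_effect(ppm2, ppm5)
p13 = relative_effect(ppm2, ppm10)
print("p_12 estimate:", p12)
print("p_13 estimate:", p13)
```

Values above 1/2 indicate that scores under the higher concentration tend to be larger than under 2 ppm; the heavy ties at score 0 enter with weight 1/2.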

SCI-Method

Hypotheses of Interest
- $H_0^{(1)}: p_{12} = \frac{1}{2}$, $H_0^{(2)}: p_{13} = \frac{1}{2}$

Estimators of the Relative Effects $p_{ij}$
- $\hat{p}_{ij} = \int \hat{F}_i \, d\hat{F}_j = \frac{1}{n_i} \left( \bar{R}_{j\cdot}^{(ij)} - \frac{n_j + 1}{2} \right)$, $\hat{p} = (\hat{p}_{12}, \hat{p}_{13})'$
- asymptotic distribution of $\sqrt{N}(\hat{p} - p) \doteq N(0, V_N)$
  - depends on unknown parameters (elements of $V_N$)
  - no pivotal quantity

Statistics
- studentize each component $(i, j)$ of $\hat{p}$ by $\sqrt{\hat{v}_{ij}}$
- $\hat{v}_{ij}$: estimated variance of $\sqrt{N}\,\hat{p}_{ij}$ (diagonal elements of $V_N$)
- estimation by means of ranks $R_{ik}^{(ij)}$, $R_{jk}^{(ij)}$, $R_{ik}^{(i)}$, and $R_{jk}^{(j)}$
- Reference: Brunner, Munzel and Puri (2002)
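
The rank representation of the estimator can be checked on the toxicity data. The sketch below assumes that $\bar{R}_{j\cdot}^{(ij)}$ is the mean of the midranks of group $j$ within the pooled sample of groups $i$ and $j$; the helper names are hypothetical.

```python
def midranks(values):
    """Midranks: tied observations share the mean of the positions they occupy."""
    order = sorted(values)
    rank_of = {}
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and order[j] == order[i]:
            j += 1
        # 1-based positions i+1 .. j are tied; their mean is (i + 1 + j) / 2
        rank_of[order[i]] = (i + 1 + j) / 2
        i = j
    return [rank_of[v] for v in values]

def relative_effect_ranks(x_i, x_j):
    """p_hat_ij = (mean pooled rank of group j - (n_j + 1) / 2) / n_i."""
    ranks = midranks(x_i + x_j)
    mean_rank_j = sum(ranks[len(x_i):]) / len(x_j)
    return (mean_rank_j - (len(x_j) + 1) / 2) / len(x_i)

def expand(counts):
    """Turn score counts into a list of individual scores."""
    return [score for score, c in enumerate(counts) for _ in range(c)]

ppm2, ppm5, ppm10 = expand([18, 2, 0, 0]), expand([12, 6, 2, 0]), expand([3, 7, 6, 4])
p12_rank = relative_effect_ranks(ppm2, ppm5)
p13_rank = relative_effect_ranks(ppm2, ppm10)
print("p_12 via ranks:", p12_rank)
print("p_13 via ranks:", p13_rank)
```

Rounded to two decimals these are 0.66 and 0.90, the point estimates reported for the comparisons against the 2 ppm group.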

SCI-Method

Asymptotic Distribution of the Statistics
- asymptotic distribution under $H_0^{(ij)}: p_{ij} = \frac{1}{2}$ of
  $T_{ij} = \sqrt{N} \left( \hat{p}_{ij} - \frac{1}{2} \right) / \sqrt{\hat{v}_{ij}} \doteq N(0, 1)$
- $T = (T_{12}, T_{13})' \doteq N(0, R)$, $R$: correlation matrix
- use the same procedure as in the parametric case
- error control: FWE_s
- problem: confidence intervals may exceed the [0,1]-interval

SCI-Method / Properties

Problem
- intervals are not range preserving

[Figure: lower and upper bound of a 95% confidence interval (n = 10)]

Solution
- multivariate $\delta$-method

Range Preserving Intervals

Procedure
- continuous transformation $G(\hat{p}_{ij}) \in (-\infty, \infty)$
  - $G = (G_1, \ldots, G_q)': (0,1)^q \to \mathbb{R}^q$
  - strictly monotone, i.e. $G_l'(p_{ij}) \ne 0$
  - differentiable, bijective, $G_l(\frac{1}{2}) = 0$, $l = 1, \ldots, q$
  - in the example: $q = 2$
- asymptotic distribution of $G$: Cramér's $\delta$-theorem
  - transformed estimators are also multivariate normal
  - elements $\tilde{v}_{ij}$ of the covariance matrix of $G$
  - multivariate $\delta$-theorem: $\tilde{v}_{ij} = [G'(p_{ij})]^2 v_{ij}$
- back transformation of the limits: intervals lie within [0,1], i.e. they are range preserving
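
A one-dimensional sketch of the transformation idea, using the probit function $G = \Phi^{-1}$ (which satisfies $G(\frac{1}{2}) = 0$) and $G'(p) = 1/\varphi(\Phi^{-1}(p))$. The standard error `se` of $\hat{p}$ and the estimate itself are hypothetical inputs, and the quantile 2.2121 is borrowed from the quantile slide purely for illustration.

```python
from statistics import NormalDist

STD = NormalDist()

def probit_interval(p_hat, se, q):
    """Delta-method interval on the probit scale, transformed back to (0, 1).

    se is the standard error of p_hat itself; on the probit scale it is
    multiplied by G'(p_hat) = 1 / phi(G(p_hat)).
    """
    g = STD.inv_cdf(p_hat)
    g_prime = 1.0 / STD.pdf(g)
    half = q * g_prime * se
    return STD.cdf(g - half), STD.cdf(g + half)

p_hat, se, q = 0.90, 0.06, 2.2121    # HYPOTHETICAL estimate and SE, illustrative quantile
plain = (p_hat - q * se, p_hat + q * se)
probit = probit_interval(p_hat, se, q)

print("untransformed interval:", tuple(round(v, 3) for v in plain))
print("probit-based interval: ", tuple(round(v, 3) for v in probit))
```

The untransformed upper limit exceeds 1, while the back-transformed limits stay inside (0, 1) by construction, because $\Phi$ maps every real number into that range.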

Example: Analysis by SCI-Method

Toxicity Trial (60 Wistar Rats)

Number of rats with score:

Concentration   0    1    2    3
2 [ppm]         18   2    0    0
5 [ppm]         12   6    2    0
10 [ppm]        3    7    6    4

Results (Probit)

Comparison   Effect (null value 0.5)   Interval         p-value
2 vs. 5      0.66                      [0.501; 0.787]   0.049
2 vs. 10     0.90                      [0.753; 0.970]   < 0.0001

Nonparametric Methods / Difficulties

Non-Transitivity
- pairwise relative effects are not transitive
- e.g.: $p_1 < p_2 < p_3 < p_1$
- counter-example: Efron's paradox dice (Rump, 2001)
- Brown and Hettmansperger (2002): one-way layout
- Thangavelu and Brunner (2007): stratified Wilcoxon tests

New Definition of Relative Effects for $a > 2$
- e.g. $p_i = \int H \, dF_i$, $H$ = mean of the $F_i$
- all distributions are compared to $H$
- or all distributions are compared to the same reference: to be worked out
- the covariance matrix of $\sqrt{N}(\hat{p}_1, \ldots, \hat{p}_d)'$ is quite involved
- first results: Konietschke (2009)

Factorial Designs
- consider each factor separately?
- or combine all comparisons in one vector?
- to be worked out
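
The non-transitivity can be reproduced with the standard set of Efron's four dice. The sketch computes the pairwise relative effect $P(X < Y) + \frac{1}{2} P(X = Y)$ exactly with fractions; the labels A to D and the function name are only illustrative.

```python
from fractions import Fraction
from itertools import product

def relative_effect(x, y):
    """Exact P(X < Y) + 1/2 * P(X = Y) for two dice with equally likely faces."""
    total = Fraction(0)
    for a, b in product(x, y):
        if a < b:
            total += 1
        elif a == b:
            total += Fraction(1, 2)
    return total / (len(x) * len(y))

# Efron's dice
dice = {
    "A": [4, 4, 4, 4, 0, 0],
    "B": [3, 3, 3, 3, 3, 3],
    "C": [6, 6, 2, 2, 2, 2],
    "D": [5, 5, 5, 1, 1, 1],
}

# each pair (first, second): the second die tends to larger values than the first
cycle = [("B", "A"), ("C", "B"), ("D", "C"), ("A", "D")]
effects = {pair: relative_effect(dice[pair[0]], dice[pair[1]]) for pair in cycle}
for (first, second), p in effects.items():
    print(f"p({first},{second}) = {p}")
```

Every effect along the cycle equals 2/3 > 1/2, so A tends to larger values than B, B than C, C than D, and D than A again: pairwise relative effects do not induce an ordering of the distributions.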

Discussion and Outlook

SCI-Method
- unifies 3 steps of the classical analysis strategy in one procedure
  - ANOVA
  - multiple comparisons (controlling FWE_s)
  - confidence intervals for the effects, compatible with the multiple comparisons
- further research
  - detailed results regarding power
  - extension to factorial designs
  - extension to repeated measures designs
  - for parametric as well as nonparametric models

Software
- so far only for independent samples (one-factorial design)
- parametric models: R-package SimComp on CRAN
- nonparametric models: R-package nparcomp on CRAN

Cooperation / Credits
- Ludwig Hothorn and assistants (Biostatistik, LU Hannover)
- Frank Konietschke (Medizinische Statistik, University of Göttingen)

References

BIESHEUVEL, E. and HOTHORN, L.A. (2002). Many-to-one comparisons in stratified designs. Biometrical Journal 44, 101-116.

BRETZ, F., GENZ, A. and HOTHORN, L.A. (2001). On the numerical availability of multiple comparison procedures. Biometrical Journal 43, 645-656.

BROWN, B.M. and HETTMANSPERGER, T.P. (2002). Kruskal-Wallis, Multiple Comparisons and Efron Dice. Australian and New Zealand Journal of Statistics 44, 427-438.

BRUNNER, E., MUNZEL, U. and PURI, M. (2002). The multivariate nonparametric Behrens-Fisher problem. Journal of Statistical Planning and Inference 108, 37-53.

FROEMKE, C., HOTHORN, L.A. and KROPF, S. (2008). Nonparametric relevance-shifted multiple testing procedures for the analysis of high-dimensional multivariate data with small sample sizes. BMC Bioinformatics 9:54, doi: 10.1186/1471-2105-9-54.

GABRIEL, K.R. (1969). Simultaneous Test Procedures - Some Theory of Multiple Comparisons. The Annals of Mathematical Statistics 40, 224-250.

GENZ, A. and BRETZ, F. (2009). Computation of Multivariate Normal and t Probabilities. Lecture Notes in Statistics 195. Springer, Heidelberg, New York.

HASLER, M. and HOTHORN, L.A. (2008). Multiple Contrast Tests in the Presence of Heteroscedasticity. Biometrical Journal 50, 793-800.

KONIETSCHKE, F. (2009). Simultane Konfidenzintervalle für nichtparametrische relative Kontrasteffekte. Dissertation, Georg-August-Universität Göttingen.

RUMP, C.M. (2001). Strategies for Rolling the Efron Dice. Mathematics Magazine 74, 212-216.

RYU, E. and AGRESTI, A. (2008). Modeling and inference for an ordinal effect size measure. Statistics in Medicine 27, 1703-1717.

SCHAARSCHMIDT, F., SILL, M. and HOTHORN, L.A. (2008). Approximate Simultaneous Confidence Intervals for Multiple Contrasts of Binomial Proportions. Biometrical Journal 50, 782-792.

THANGAVELU, K. and BRUNNER, E. (2007). Wilcoxon Mann-Whitney Test for Stratified Samples and Efron's Paradox Dice. Journal of Statistical Planning and Inference 137, 720-737.