Multiple Testing of General Contrasts: Truncated Closure and the Extended Shaffer-Royen Method

Peter H. Westfall, Texas Tech University
Randall D. Tobias, SAS Institute

Pairwise Comparisons
- ANOVA, g = 10 groups, n = 2 per group
- Averages: 1.9695, 2.1520, 5.5915, 0.8695, 4.6315, 0.2780, 1.3210, 0.3490, 7.1530, and 3.6220; MSE = 1.051976
- k = 45 pairwise comparisons
- Bonferroni-based: Bonferroni, Holm, Shaffer Method 2
- MaxT-based: Tukey, Stepdown MaxT-based, Royen

SAS Code

    proc glimmix data=allpairs;
       class g;
       model y=g;
       lsmeans g / pdiff adjust=bonferroni;                          /* Bonferroni */
       lsmeans g / pdiff adjust=bonferroni stepdown(type=free);      /* Holm */
       lsmeans g / pdiff adjust=bonferroni stepdown(type=logical);   /* Shaffer Method 2 */
       lsmeans g / pdiff adjust=tukey;                               /* Tukey */
       lsmeans g / pdiff adjust=simulate stepdown(type=free);        /* stepdown MaxT */
       lsmeans g / pdiff adjust=simulate stepdown(type=logical);     /* Royen */
    run;

Assumptions
- $\hat{\mu} \sim N_g(\mu, \sigma^2 V)$; $\nu\hat{\sigma}^2/\sigma^2 \sim \chi^2_\nu$, independent of $\hat{\mu}$
- $V$ is known, positive definite
- $H_i: c_i'\mu = 0$ vs. $H_i': c_i'\mu \neq 0$, $i = 1, \ldots, k$, for contrast vectors $c_i$
- $T_i = c_i'\hat{\mu} / (\hat{\sigma}^2 c_i' V c_i)^{1/2}$
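As a rough illustration of the statistic $T_i$ for pairwise contrasts, the sketch below uses the ten group averages, n = 2, and the MSE quoted in the pairwise-comparison example. It assumes a balanced one-way ANOVA, so that V = I/n and c'Vc = 2/n for a pairwise contrast, and it takes the MSE as the variance estimate. The function name `pairwise_t` is hypothetical and is not part of the original analysis.

```python
from itertools import combinations
from math import sqrt

# Group averages, per-group sample size, and MSE quoted in the pairwise-comparison example.
AVERAGES = [1.9695, 2.1520, 5.5915, 0.8695, 4.6315,
            0.2780, 1.3210, 0.3490, 7.1530, 3.6220]
N_PER_GROUP = 2
MSE = 1.051976


def pairwise_t(averages, n, mse):
    """Return {(a, b): T} for every pair of groups (1-based labels).

    Assumption: balanced one-way ANOVA, so V = I/n and the contrast
    c = e_a - e_b has c'Vc = 2/n.
    """
    se = sqrt(mse * 2.0 / n)
    return {(a + 1, b + 1): (averages[a] - averages[b]) / se
            for a, b in combinations(range(len(averages)), 2)}


t_stats = pairwise_t(AVERAGES, N_PER_GROUP, MSE)
# Pair with the largest |T|, i.e. the first hypothesis a step-down method would test.
largest = max(t_stats, key=lambda pair: abs(t_stats[pair]))
print(len(t_stats), largest, round(abs(t_stats[largest]), 3))
```

The dictionary has one entry per pairwise comparison, so its length should equal k = 45.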

Shaffer 2 Method (JASA 81, 826-831)
- Let $H_{(i)}$ correspond with $T_{(i)} = T_{a_i}$. Let $S = \{1, \ldots, k\}$.
- Define $S_i = \{K \subseteq S \mid a_i \in K \text{ and } H_K \cap H'_{(k)} \cap \cdots \cap H'_{(k-i+2)} \neq \emptyset\}$.
- S2: starting with $i = k$, $H_{(i)}$ is rejected iff $H_{(k)}, \ldots, H_{(i+1)}$ are rejected and $p_{(i)} \leq \alpha/k_i$, where $k_i = \max\{|K| ; K \in S_i\}$.
- Adjusted p-values are $\tilde{p}^{S2}_{(i)} = \min\{\max_{j \geq i}(k_j p_{(j)}), 1\}$.
- S2 is uniformly more powerful than Holm.
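The adjusted p-value rule can be sketched in a few lines. The sketch assumes the inputs are listed in testing order (most significant hypothesis first), which is the indexing used in the g = 5 all-pairs example in these slides; each adjusted p-value is then the largest $k_j p_{(j)}$ among the hypotheses tested up to and including it, capped at 1. The p-values and S2 multipliers are taken from that example (the multipliers for the last three steps are read off as the sizes of the listed sets). The Holm multipliers k, k-1, ..., 1 are the free step-down counts. The function name `stepdown_adjusted` is hypothetical.

```python
def stepdown_adjusted(p_in_test_order, multipliers):
    """Adjusted p-values: min(running max of k_j * p_j, 1).

    p_in_test_order -- raw p-values, most significant first
    multipliers     -- k_j for each step, in the same order
    """
    adjusted, running = [], 0.0
    for p, k in zip(p_in_test_order, multipliers):
        running = max(running, k * p)      # enforce monotonicity
        adjusted.append(min(running, 1.0))
    return adjusted


# Values from the g = 5 all-pairs example.
P = [.001, .001, .006, .010, .015, .020, .039, .060, .266, .392]
K_S2 = [10, 6, 6, 4, 3, 4, 2, 1, 2, 1]       # Shaffer Method 2 multipliers k_i
K_HOLM = list(range(len(P), 0, -1))          # Holm: 10, 9, ..., 1

s2 = stepdown_adjusted(P, K_S2)
holm = stepdown_adjusted(P, K_HOLM)
print(sum(a <= 0.05 for a in s2), sum(a <= 0.05 for a in holm))
```

Because every S2 multiplier is no larger than the corresponding Holm multiplier, the S2 adjusted p-values are never larger than Holm's, which is the sense in which S2 is uniformly more powerful.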

Extended Shaffer 2 Method (JASA 92, 299-306)
- Define $t^{ES}(\alpha, i) = \max\{t(\alpha, K) \mid K \in S_i\}$, where $P_0\{\max_{j \in K} T_j \geq t(\alpha, K)\} = \alpha$.
- Starting with $i = k$, $H_{(i)}$ is rejected iff $H_{(k)}, \ldots, H_{(i+1)}$ are rejected and $T_{(i)} \geq t^{ES}(\alpha, i)$.
- Adjusted p-values are $\tilde{p}^{ES2}_{(i)} = \max_{j \geq i} \{\max_{K \in S_j} P_0(\max_{l \in K} T_l \geq t_{(j)})\}$.
- Uniformly more powerful than S2.

Example: All Pairs, g = 5

i        1     2     3     4     5     6     7     8     9     10
p_(i)    .001  .001  .006  .010  .015  .020  .039  .060  .266  .392
H        H13   H15   H34   H12   H45   H23   H14   H25   H24   H35

$S_1$ = {{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}} ($k_1$ = 10)
$S_2$ = {{2, 4, 5, 7, 8, 9}, {2, 3, 4, 8}, {2, 5, 6, 7}, {2, 3, 6, 9}} ($k_2$ = 6)
$S_3$ = {{3, 4, 5, 10}, {3, 5, 6, 8, 9, 10}} ($k_3$ = 6)
$S_4$ = {{4, 7, 9, 10}, {4, 5}} ($k_4$ = 4)
$S_5$ = {{5, 6}, {5, 8, 9}} ($k_5$ = 3)
$S_6$ = {{6, 7, 8, 10}} ($k_6$ = 4)
$S_7$ = {{7, 8}, {7, 10}} ($k_7$ = 2)
$S_8$ = {{8}}, $S_9$ = {{9, 10}}, $S_{10}$ = {{10}}
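The sets in this example can be checked by brute force. The sketch below is not the authors' algorithm; it is a naive enumeration that is feasible only because g = 5 has just 52 partitions of the groups. It assumes the indexing of the table (i = 1 is the hypothesis with the smallest p-value), enumerates every way the five means can be grouped into equal blocks, records which pairwise hypotheses are true under each grouping, and keeps the maximal sets that contain hypothesis i and none of hypotheses 1, ..., i-1. All function names are hypothetical.

```python
from itertools import combinations

# Ordered hypotheses from the table: position i (1-based) -> pair (a, b), i.e. H: mu_a = mu_b.
ORDERED_PAIRS = [(1, 3), (1, 5), (3, 4), (1, 2), (4, 5),
                 (2, 3), (1, 4), (2, 5), (2, 4), (3, 5)]


def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` into each existing block in turn ...
        for idx, block in enumerate(partition):
            yield partition[:idx] + [[first] + block] + partition[idx + 1:]
        # ... or into a new block of its own.
        yield [[first]] + partition


def possible_true_sets(groups, ordered_pairs):
    """Index sets K of hypotheses that can be exactly the true ones."""
    position = {pair: i for i, pair in enumerate(ordered_pairs, start=1)}
    return {frozenset(position[tuple(sorted(pair))]
                      for block in partition
                      for pair in combinations(block, 2))
            for partition in set_partitions(list(groups))}


def covering_sets(i, true_sets):
    """Maximal K containing hypothesis i and none of hypotheses 1..i-1."""
    candidates = [k for k in true_sets
                  if i in k and not any(j in k for j in range(1, i))]
    return {k for k in candidates if not any(k < other for other in candidates)}


TRUE_SETS = possible_true_sets(range(1, 6), ORDERED_PAIRS)
S = {i: covering_sets(i, TRUE_SETS) for i in range(1, 11)}
k_values = [max(len(k) for k in S[i]) for i in range(1, 11)]
print(k_values)
```

The printed list should agree with the $k_i$ values shown in the example, and `S[2]` should contain the same four sets as $S_2$.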

S2 and Closure Connection
- Westfall proved FWE control for ES2 directly, not using closure.
- Rom and Holland, and Hommel and Bernhard, showed that S2 is more conservative than full closure with Bonferroni tests for all pairwise comparisons.
- Royen noted that closed methods are sometimes nonmonotonic in p-values, and developed a truncated closed procedure: closed testing follows the ordered p-values, stopping as soon as there is insignificance, with application to all pairwise comparisons in ANOVA.

New Results
Westfall and Tobias
- (a) extend Royen's truncated closure to general contrasts,
- (b) prove equivalence with ES2, and
- (c) develop a tree-based branch-and-bound algorithm to
  - prune the all-subsets tree, reducing the $O(2^k)$ computational complexity;
  - stop the search at early stages if the problem is computationally infeasible, with increasing power for increased search depth.
- They call the method ESR (extended Shaffer-Royen).

Tree representation of the search space

Covering Sets $S^{(0)}_{i+}$

$S^{(0)}_{1+}$ = {{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}} ($k^{(0)}_{1+}$ = 10)
$S^{(0)}_{2+}$ = {{2, 3, 4, 5, 6, 7, 8, 9}} ($k^{(0)}_{2+}$ = 8) (9 for free stepdown)
$S^{(0)}_{3+}$ = {{3, 4, 5, 6, 8, 9, 10}} ($k^{(0)}_{3+}$ = 7) (8 for free stepdown)
$S^{(0)}_{4+}$ = {{4, 5, 7, 9, 10}} ($k^{(0)}_{4+}$ = 5) (7 for free stepdown)
$S^{(0)}_{5+}$ = {{5, 6, 8, 9}} ($k^{(0)}_{5+}$ = 4)
$S^{(0)}_{6+}$ = {{6, 7, 8, 10}} ($k^{(0)}_{6+}$ = 4)
$S^{(0)}_{7+}$ = {{7, 8, 10}} ($k^{(0)}_{7+}$ = 3)
$S^{(0)}_{8+}$ = {{8}}, $S^{(0)}_{9+}$ = {{9, 10}}, $S^{(0)}_{10+}$ = {{10}}

[Figure: Ratios of adjusted p-values of S2, $S^{(0)}_{i+}$, $S^{(1)}_{i+}$, and $S^{(2)}_{i+}$ to truncated closure for the g = 10, k = 45 example.]

Response Surface Comparisons
- Compare health for Active and Placebo at various initial health (PreTrt) and Age values.
- Estimates: $\hat{y}_A = 1.96204 + 0.42977\,\mathrm{PreTrt} - 0.05410\,\mathrm{Age}/100$, $\hat{y}_P = 1.54004 + 0.53557\,\mathrm{PreTrt} - 1.73130\,\mathrm{Age}/100$.
- Comparisons made at (PreTrt, Age) $\in \{1, 2, 3, 4, 5\} \times \{10, 11, \ldots, 70\}$ ($k = 5 \times 61 = 305$).
- B&B search finds every element of $S_i$ despite $k = 305$.
- SAS GLIMMIX with CONTRAST statement used for analysis.

[Figure: Significant (0.05 level simultaneous) differences using the Benjamini-Hochberg FDR-controlling method (B), Liu et al.'s method (L), the discrete grid approximation to Liu et al. (D), free combination covering sets (F), $S^{(0)}_{i+}$, and $S^{(1)}_{i+}$ (= $S_i$ here).]

Timing Issues
- 26 minutes to find all relevant subsets in the response surface example (not $O(2^{305})$).
- The full method is feasible for all pairwise comparisons with g = 11 (k = 55); $S^{(2)}_{i+}$ is feasible with as many as 19 groups (k = 171); $S^{(1)}_{i+}$ is feasible with as many as 32 groups (k = 496); and $S^{(0)}_{i+}$ is feasible with as many as 80 groups (k = 3160).
- Donoghue developed an algorithm for S2 that is faster, but limited.
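The group counts and comparison counts quoted above are linked by k = g(g - 1)/2. The short sketch below checks that arithmetic and, to give a sense of why an exhaustive all-subsets search is out of reach, prints the number of decimal digits in $2^k$. The function name `n_pairs` is hypothetical.

```python
def n_pairs(g):
    """Number of pairwise comparisons among g groups."""
    return g * (g - 1) // 2


for g in (11, 19, 32, 80):
    k = n_pairs(g)
    # Digits in 2**k, the number of subsets an exhaustive search would visit.
    print(g, k, len(str(2 ** k)))
```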

Conclusions
- ESR is a truncated closed testing procedure.
- Branch-and-bound provides conservative but computable approximations that are usually much more powerful than standard alternatives.
- The method is fully computable in response surface comparisons, even with large numbers of comparisons.
- Available in PROC GLIMMIX for arbitrary contrasts and general model structures (approximate analyses used for more general models).
- General strategy: choose depth as large as compute resources allow and use the resulting multiplicity adjustment.

References
- Donoghue, J. R. (2004), Implementing Shaffer's Multiple Comparison Procedure for a Large Number of Groups, in Recent Developments in Multiple Comparison Procedures, Institute of Mathematical Statistics Lecture Notes-Monograph Series, Vol. 47, Y. Benjamini, F. Bretz, and S. Sarkar, eds., 1-23.
- Holm, S. (1979), A Simple Sequentially Rejective Multiple Test Procedure, Scandinavian Journal of Statistics, 6, 65-70.
- Hommel, G., and Bernhard, G. (1999), Bonferroni Procedures for Logically Related Hypotheses, Journal of Statistical Planning and Inference, 82, 119-128.
- Rom, D. M., and Holland, B. (1995), A New Closed Multiple Testing Procedure for Hierarchical Families of Hypotheses, Journal of Statistical Planning and Inference, 46, 265-275.
- Royen, T. (1989), Generalized Maximum Range Tests for Pairwise Comparisons of Several Populations, Biometrical Journal, 31, 905-929.
- Shaffer, J. P. (1986), Modified Sequentially Rejective Multiple Test Procedures, Journal of the American Statistical Association, 81, 826-831.
- Westfall, P. H., and Tobias, R. D. (2007), Multiple Testing of General Contrasts: Truncated Closure and the Extended Shaffer-Royen Method, Journal of the American Statistical Association, 102, 487-494.