Statistical Applications in Genetics and Molecular Biology

Similar documents
Statistical Applications in Genetics and Molecular Biology

University of California, Berkeley

University of California, Berkeley

Multiple Hypothesis Testing in Microarray Data Analysis

Statistical Applications in Genetics and Molecular Biology

Step-down FDR Procedures for Large Numbers of Hypotheses

PB HLTH 240A: Advanced Categorical Data Analysis Fall 2007

University of California, Berkeley

Statistical Applications in Genetics and Molecular Biology

Control of Generalized Error Rates in Multiple Testing

Research Article Sample Size Calculation for Controlling False Discovery Proportion

Non-specific filtering and control of false positives

Statistical Applications in Genetics and Molecular Biology

High-Throughput Sequencing Course. Introduction. Introduction. Multiple Testing. Biostatistics and Bioinformatics. Summer 2018

ON STEPWISE CONTROL OF THE GENERALIZED FAMILYWISE ERROR RATE. By Wenge Guo and M. Bhaskara Rao

University of California, Berkeley

On Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses

Exact and Approximate Stepdown Methods For Multiple Hypothesis Testing

Political Science 236 Hypothesis Testing: Review and Bootstrapping

University of California, Berkeley

University of California, Berkeley

Christopher J. Bennett

False discovery rate and related concepts in multiple comparisons problems, with applications to microarray data

MULTISTAGE AND MIXTURE PARALLEL GATEKEEPING PROCEDURES IN CLINICAL TRIALS

The International Journal of Biostatistics

Multiple testing: Intro & FWER 1

A NEW APPROACH FOR LARGE SCALE MULTIPLE TESTING WITH APPLICATION TO FDR CONTROL FOR GRAPHICALLY STRUCTURED HYPOTHESES

Control of Directional Errors in Fixed Sequence Multiple Testing

Exceedance Control of the False Discovery Proportion Christopher Genovese 1 and Larry Wasserman 2 Carnegie Mellon University July 10, 2004

Resampling-Based Control of the FDR

Familywise Error Rate Controlling Procedures for Discrete Data

Multiple Testing of One-Sided Hypotheses: Combining Bonferroni and the Bootstrap

Applying the Benjamini Hochberg procedure to a set of generalized p-values

Single gene analysis of differential expression

PROCEDURES CONTROLLING THE k-fdr USING. BIVARIATE DISTRIBUTIONS OF THE NULL p-values. Sanat K. Sarkar and Wenge Guo

Modified Simes Critical Values Under Positive Dependence

This paper has been submitted for consideration for publication in Biometrics

LECTURE 5 HYPOTHESIS TESTING

Department of Statistics University of Central Florida. Technical Report TR APR2007 Revised 25NOV2007

University of California San Diego and Stanford University and

Controlling the False Discovery Rate: Understanding and Extending the Benjamini-Hochberg Method

Reports of the Institute of Biostatistics

Studies in Nonlinear Dynamics & Econometrics

FDR and ROC: Similarities, Assumptions, and Decisions

Summary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing

Mixtures of multiple testing procedures for gatekeeping applications in clinical trials

STEPDOWN PROCEDURES CONTROLLING A GENERALIZED FALSE DISCOVERY RATE. National Institute of Environmental Health Sciences and Temple University

The International Journal of Biostatistics

Resampling and the Bootstrap

IMPROVING TWO RESULTS IN MULTIPLE TESTING

Math 494: Mathematical Statistics

Exact and Approximate Stepdown Methods for Multiple Hypothesis Testing

On Methods Controlling the False Discovery Rate 1

On adaptive procedures controlling the familywise error rate

5 Introduction to the Theory of Order Statistics and Rank Statistics

The miss rate for the analysis of gene expression data

SOME STEP-DOWN PROCEDURES CONTROLLING THE FALSE DISCOVERY RATE UNDER DEPENDENCE

QED. Queen s Economics Department Working Paper No Hypothesis Testing for Arbitrary Bounds. Jeffrey Penney Queen s University

INTERVAL ESTIMATION AND HYPOTHESES TESTING

Inference on distributions and quantiles using a finite-sample Dirichlet process

Exam: high-dimensional data analysis February 28, 2014

Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview

Procedures controlling generalized false discovery rate

Lecture 21. Hypothesis Testing II

Deductive Derivation and Computerization of Semiparametric Efficient Estimation

Control of the False Discovery Rate under Dependence using the Bootstrap and Subsampling

ON TWO RESULTS IN MULTIPLE TESTING

Lecture 13: p-values and union intersection tests

FDR-CONTROLLING STEPWISE PROCEDURES AND THEIR FALSE NEGATIVES RATES

Multiple Testing. Hoang Tran. Department of Statistics, Florida State University

EMPIRICAL BAYES METHODS FOR ESTIMATION AND CONFIDENCE INTERVALS IN HIGH-DIMENSIONAL PROBLEMS

The Components of a Statistical Hypothesis Testing Problem

Estimation of a Two-component Mixture Model

University of California, Berkeley

Stat 206: Estimation and testing for a mean vector,

Sample Size Estimation for Studies of High-Dimensional Data

Controlling Bayes Directional False Discovery Rate in Random Effects Model 1

Lecture 21: October 19

Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants

1 Statistical inference for a population mean

Chapter 1. Stepdown Procedures Controlling A Generalized False Discovery Rate

Robust Backtesting Tests for Value-at-Risk Models

On Generalized Fixed Sequence Procedures for Controlling the FWER

University of California, Berkeley

On weighted Hochberg procedures

Hypothesis Tests and Estimation for Population Variances. Copyright 2014 Pearson Education, Inc.

Topic 3: Hypothesis Testing

Marginal Screening and Post-Selection Inference

The Simplex Method: An Example

Empirical Bayes Moderation of Asymptotically Linear Parameters

Uniformly Most Powerful Bayesian Tests and Standards for Statistical Evidence

STEPUP PROCEDURES FOR CONTROL OF GENERALIZATIONS OF THE FAMILYWISE ERROR RATE

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3

Looking at the Other Side of Bonferroni

Confidence Thresholds and False Discovery Control

Probabilistic Inference for Multiple Testing

Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. T=number of type 2 errors

False discovery control for multiple tests of association under general dependence

Multiple Testing Procedures under Dependence, with Applications

MULTIPLE TESTING PROCEDURES AND SIMULTANEOUS INTERVAL ESTIMATES WITH THE INTERVAL PROPERTY

Transcription:

Statistical Applications in Genetics and Molecular Biology Volume 3, Issue 1 2004 Article 14 Multiple Testing. Part II. Step-Down Procedures for Control of the Family-Wise Error Rate Mark J. van der Laan Sandrine Dudoit Katherine S. Pollard Division of Biostatistics, School of Public Health, University of California, Berkeley, laan@stat.berkeley.edu Division of Biostatistics, School of Public Health, University of California, Berkeley, sandrine@stat.berkeley.edu University of California, Santa Cruz, kpollard@gladstone.ucsf.edu Copyright c 2004 by the authors. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher, bepress, which has been given certain exclusive rights by the author. Statistical Applications in Genetics and Molecular Biology is produced by The Berkeley Electronic Press bepress. http://www.bepress.com/sagmb

Multiple Testing. Part II. Step-Down Procedures for Control of the Family-Wise Error Rate Mark J. van der Laan, Sandrine Dudoit, and Katherine S. Pollard Abstract The present article proposes two step-down multiple testing procedures for asymptotic control of the family-wise error rate FWER: the first procedure is based on maxima of test statistics step-down maxt, while the second relies on minima of unadjusted p-values step-down minp. A key feature of our approach is the characterization and construction of a test statistics null distribution rather than data generating null distribution for deriving cut-offs for these test statistics i.e., rejection regions and the resulting adjusted p-values. For general null hypotheses, corresponding to submodels for the data generating distribution, we identify an asymptotic domination condition for a null distribution under which the step-down maxt and minp procedures asymptotically control the Type I error rate, for arbitrary data generating distributions, without the need for conditions such as subset pivotality. Inspired by this general characterization, we then propose as an explicit null distribution the asymptotic distribution of the vector of null value shifted and scaled test statistics. Step-down procedures based on consistent estimators of the null distribution are shown to also provide asymptotic control of the Type I error rate. A general bootstrap algorithm is supplied to conveniently obtain consistent estimators of the null distribution. KEYWORDS: Adjusted p-value, asymptotic control, bootstrap, consistency, cut-off, family-wise error rate, maxima of test statistics, minima of p-values, multiple testing, null distribution, null hypothesis, quantile, step-down, test statistic, Type I error rate The authors would like to thank two referees for constructive comments on an earlier version of this manuscript.

van der Laan et al.: Step-Down Procedures for Control of the Family-Wise Error Rate 1 Introduction 1.1 Multiple hypothesis testing framework The present article is concerned with step-down multiple testing procedures for controlling the family-wise error rate i.e., the probability of at least one Type I error, when testing general null hypotheses defined in terms of submodels for the data generating distribution. Our approach is based on a null distribution for the test statistics, rather than a data generating null distribution, and provides asymptotic control of the Type I error rate for general data generating distributions, without the need for conditions such as subset pivotality Westfall and Young 1993, p. 42 43. The companion article Dudoit et al., 2004 gives a detailed introduction to our general approach to multiple testing and provides single-step multiple testing procedures for controlling Type I error rates defined as arbitrary parameters of the distribution of the number of Type I errors. The third article in this series van der Laan et al., 2004 proposes simple augmentations of FWER-controlling procedures which control error rates such as tail probabilities for the number of false positives i.e., generalized family-wise error rate, gfwer and for the proportion of false positives among the rejected hypotheses TPPFP, under general data generating distributions, with arbitrary dependence structures among variables. The reader is referred to Korn et al. 2004, Troendle 1995, 1996, and Westfall and Young 1993, for recent work on stepwise methods. In particular, Korn et al. 2004 provide permutation-based step-down procedures for controlling the gfwer and TPPFP, in the special case where the null hypotheses concern equality of the marginal distributions of the components of a random vector in two populations. Troendle 1996 proposes a permutation-based step-up multiple testing procedure which takes into account the dependence structure among the test statistics and is related to the Westfall and Young 1993 step-down maxt procedure. We follow the framework described in the companion article on singlestep procedures and refer the reader to Sections 1 and 2 of this earlier article for a detailed introduction Dudoit et al., 2004. The basic set-up and main definitions are recalled below for convenience. As in Dudoit et al. 2004, we adopt the following definitions for inverses of cumulative distribution functions c.d.f. and survivor functions. Let F denote a non-decreasing and right-continuous c.d.f. and let F denote the corresponding non-increasing and right-continuous survivor function, Published by The Berkeley Electronic Press, 2004 1

Statistical Applications in Genetics and Molecular Biology, Vol. 3 [2004], Iss. 1, Art. 14 defined as F 1 F. For α [0, 1], define inverses as F 1 α inf{x : F x α} and F 1 α inf{x : F x α}. 1 With these definitions, F 1 α = F 1 1 α. Model. [Section 2.1 in Dudoit et al. 2004] Let X 1,..., X n be n independent and identically distributed i.i.d. random J vectors, where X i = X i j : j = 1,..., J P M, i = 1,..., n, and the data generating distribution P is known to be an element of a particular statistical model M possibly non-parametric. Let P n denote the corresponding empirical distribution, which places probability 1/n on each realization of X. For example, in cancer microarray studies, X i j : j = 1,..., G may denote a G vector of gene expression measures and X i j : j = G + 1,..., J a J G vector of biological and clinical outcomes for patient i, i = 1,..., n. Null hypotheses. [Section 2.3 in Dudoit et al. 2004] In order to cover a broad class of testing problems, we define M null hypotheses in terms of a collection of submodels, Mm M, m = 1,..., M, for the data generating distribution P. The M null hypotheses are defined as H 0 m IP Mm and the corresponding alternative hypotheses as H 1 m IP / Mm. Here, I is the indicator function, equaling 1 if the condition in parentheses is true and 0 otherwise. Thus, H 0 m is true, i.e., H 0 m = 1, if P Mm, and H 0 m is false otherwise. This general representation of null hypotheses includes the familiar case of single-parameter null hypotheses. In this setting, one considers an M vector of parameters, ΨP = ψ = ψm : m = 1,..., M, defined as functions ψm = ΨP m IR of the unknown data generating distribution P, and specifies each null hypothesis in terms of one of these parameters. Parameters of interest include means, differences in means, correlations, and can refer to linear models, generalized linear models, survival models e.g., Cox proportional hazards model, time-series models, dose-response models, etc. One distinguishes between two types of testing problems for single parameters. One-sided tests H 0 m = I ψm ψ 0 m vs. H 1 m = I ψm > ψ 0 m, m = 1,..., M. Two-sided tests H 0 m = I ψm = ψ 0 m vs. H 1 m = I ψm ψ 0 m, m = 1,..., M. http://www.bepress.com/sagmb/vol3/iss1/art14 2

van der Laan et al.: Step-Down Procedures for Control of the Family-Wise Error Rate The hypothesized null values, ψ 0 m, are frequently zero e.g., no difference in mean expression levels for gene m between two populations of patients. Let H 0 = H 0 P {m : H 0 m = 1} = {m : P Mm} be the set of h 0 H 0 true null hypotheses, where we note that H 0 depends on the true data generating distribution P. Let H 1 = H 1 P H c 0P = {m : H 1 m = 1} = {m : P / Mm} be the set of h 1 H 1 = M h 0 false null hypotheses, i.e., true positives. The goal of a multiple testing procedure is to accurately estimate the set H 0, and thus its complement H 1, while controlling probabilistically the number of false positives at a user-supplied level α. Test statistics. [Section 2.4 in Dudoit et al. 2004] The decisions to reject or not the null hypotheses are based on an M vector of test statistics, T n = T n m : m = 1,..., M, that are functions of the data, X 1,..., X n. Denote the finite sample joint distribution of the test statistics T n by Q n = Q n P. Unless specified otherwise, it is assumed that large values of the test statistic T n m provide evidence against the corresponding null hypothesis H 0 m, that is, the null H 0 m is rejected whenever T n m exceeds a specified cut-off cm. For two-sided tests of single-parameter null hypotheses, one could take absolute values of the test statistics. Multiple testing procedures. [Section 2.5 in Dudoit et al. 2004] A multiple testing procedure MTP can be represented by a random subset R n of rejected hypotheses, that estimates the set H 1 of true positives, R n = RT n, Q 0, α {m : H 0 m is rejected} {1,..., M}. 2 As indicated by the long notation RT n, Q 0, α, the set R n is a function of: i the data, X 1,..., X n, through an M vector of test statistics, T n = T n m : m = 1,..., M, where each T n m corresponds to a null hypothesis H 0 m; ii a test statistics null distribution, Q 0, for computing cut-offs for each T n m and the resulting adjusted p-values; and iii the nominal level α of the MTP, i.e., the desired upper bound for a suitably defined Type I error rate. As in the companion article Dudoit et al., 2004, and unless specified otherwise, we consider multiple testing procedures that reject null hypotheses for large values of the test statistics, i.e., that can be represented as R n = RT n, Q 0, α = {m : T n m > cm}, 3 where cm = ct n, Q 0, αm, m = 1,..., M, are possibly random cut-offs, Published by The Berkeley Electronic Press, 2004 3

Statistical Applications in Genetics and Molecular Biology, Vol. 3 [2004], Iss. 1, Art. 14 or critical values, computed under the null distribution Q 0 for the test statistics. Type I error rates. [Section 2.7 in Dudoit et al. 2004] In any testing situation, two types of errors can be committed: a false positive, or Type I error, is committed by rejecting a true null hypothesis, and a false negative, or Type II error, is committed when the test procedure fails to reject a false null hypothesis. Denote the number of Type I errors by V n = V Q 0 Q n RT n, Q 0, α H 0 P = R n H 0, where the longer notation V Q 0 Q n emphasizes the dependence of the distribution for the number of Type I errors on the null distribution Q 0, used to derive cut-offs for the test statistics T n, and on the true underlying distribution Q n = Q n P for these test statistics here, the subset H 0 is kept fixed at the truth H 0 P and the nominal level α of the test is also held fixed. Likewise, let R n = RQ 0 Q n RT n, Q 0, α = R n denote the number of rejected hypotheses. The companion articles consider general Type I error rates, defined as parameters θf Vn,Rn of the joint distribution F Vn,Rn of the numbers of Type I errors V n and rejected hypotheses R n Dudoit et al., 2004; van der Laan et al., 2004. Here, we focus on control of the family-wise error rate FWER, or probability of at least one Type I error, F W ER P rv n 1 = 1 F Vn 0, 4 where F Vn is the discrete cumulative distribution function on {0,..., M} for the number of Type I errors, V n. van der Laan et al. 2004 provide simple augmentations of FWER-controlling procedures that control the generalized family-wise error rate gf W ERk P rv n k + 1 = 1 F Vn k, for a user-supplied integer k 0 and tail probabilities for the proportion of false positives among the rejected hypotheses T P P F P q P rv n /R n > q = 1 F Vn/Rn q, for a user-supplied q 0, 1, under general data generating distributions P, with arbitrary dependence structures among variables. Adjusted p-values. [Section 2.9 in Dudoit et al. 2004] Given any multiple testing procedure R n = RT n, Q 0, α = {m : T n m > cm}, based on an M vector of cut-offs, c = ct n, Q 0, α, the adjusted p-values, http://www.bepress.com/sagmb/vol3/iss1/art14 4

van der Laan et al.: Step-Down Procedures for Control of the Family-Wise Error Rate P 0n = P T n, Q 0 = P RT n, Q 0, α : α [0, 1], are defined as P 0n m inf {α [0, 1] : m RT n, Q 0, α} 5 = inf {α [0, 1] : ct n, Q 0, αm < T n m}, m = 1,..., M. That is, the adjusted p-value P 0n m for null hypothesis H 0 m is the nominal level of the entire MTP e.g., gfwer or FDR at which H 0 m would just be rejected, given T n. For continuous null distributions Q 0, P0n m = c 1 m T n m, where c 1 m is the inverse of the monotone decreasing function α c m α = ct n, Q 0, αm. The particular mapping c m, defining the cutoffs ct n, Q 0, αm, will depend on the choice of MTP e.g., single-step vs. stepwise, common cut-offs vs. common-quantile cut-offs. In contrast, the unadjusted p-value i.e., marginal or raw p-value, P 0n m = P T n m, Q 0,m, for the test of single null hypothesis H 0 m, based on cutoffs c m α = cq 0,m, α = Q 0,mα, involves only the marginal distribution 1 Q 0,m of the test statistic T n m for that hypothesis P 0n m inf {α [0, 1] : cq 0,m, α < T n m}, m = 1,..., M. 6 That is, P 0n m is the nominal level of the single hypothesis testing procedure at which H 0 m would just be rejected, given T n m. For continuous marginal null distributions Q 0,m, the unadjusted p-values are given by P 0n m = c 1 m T n m = Q 0,m T n m, where c 1 m is the inverse of the monotone decreasing function α c m α = cq 0,m, α. Stepwise multiple testing procedures. [Section 2.10 in Dudoit et al. 2004] One usually distinguishes between two main classes of multiple testing procedures, single-step and stepwise procedures, depending on whether the cut-off vector c = cm : m = 1,..., M for the test statistics T n is constant or random given the null distribution Q 0, i.e., is independent or not of these test statistics. In single-step procedures, each hypothesis H 0 m is evaluated using a critical value cm = cq 0, αm that is independent of the results of the tests of other hypotheses and is not a function of the data X 1,..., X n unless these data are used to estimate the null distribution Q 0, as in Section 3. Improvement in power, while preserving asymptotic Type I error rate control, may be achieved by stepwise procedures, in which rejection of a particular hypothesis depends on the outcome of the tests of other hypotheses. That is, the cut-offs cm = ct n, Q 0, αm are allowed to Published by The Berkeley Electronic Press, 2004 5

Statistical Applications in Genetics and Molecular Biology, Vol. 3 [2004], Iss. 1, Art. 14 depend on the data, X 1,..., X n, via the test statistics T n. In step-down procedures, the hypotheses corresponding to the most significant test statistics i.e., largest absolute test statistics or smallest unadjusted p-values are considered successively, with further tests depending on the outcome of earlier ones. As soon as one fails to reject a null hypothesis, no further hypotheses are rejected. In contrast, for step-up procedures, the hypotheses corresponding to the least significant test statistics are considered successively, again with further tests depending on the outcome of earlier ones. As soon as one hypothesis is rejected, all remaining more significant hypotheses are rejected. 1.2 Outline Section 2 proposes two step-down multiple testing procedures for controlling the family-wise error rate FWER, when testing general null hypotheses defined in terms of submodels for the data generating distribution. The first method relies on successive maxima of test statistics step-down maxt, Procedure 1 and the second involves successive minima of unadjusted p- values step-down minp, Procedure 2. We derive two main types of results concerning asymptotic control of the FWER by Procedures 1 and 2, under a null distribution for the test statistics, rather than a data generating null distribution. The more general Theorems 1 and 4 prove that the step-down maxt and minp procedures provide asymptotic control of the FWER, under general asymptotic domination conditions for the null distribution, and imply that gains in power from step-down procedures, relative to their single-step counterparts, do not come at the expense of Type I error control. By making additional asymptotic separation assumptions, Theorems 2 and 5 provide sharper control results. Theorem 3 proposes as an explicit null distribution the asymptotic distribution of the vector of null value shifted and scaled test statistics. In Section 3, step-down maxt and minp procedures, based on a consistent estimator of the null distribution, are shown to also provide asymptotic control of the Type I error rate Theorems 6 and 8. A general bootstrap procedure is supplied to conveniently obtain consistent estimators of the null distribution Procedure 3. The proposed methods are evaluated by a simulation study and applied to genomic data in the fourth article of the series Pollard et al., 2004. Software implementing the bootstrap single-step and step-down multiple testing procedures will be available in the R package multtest, released as part of the Bioconductor Project www.bioconductor. org. http://www.bepress.com/sagmb/vol3/iss1/art14 6

van der Laan et al.: Step-Down Procedures for Control of the Family-Wise Error Rate 2 Step-down procedures for control of the familywise error rate 2.1 Step-down procedure based on maxima of test statistics 2.1.1 Step-down maxt procedure Procedure 1. Step-down maxt procedure for control of the FWER. Let T nm denote the ordered test statistics, T n1... T nm, and let O n m denote the indices for these ordered statistics, so that T nm T n O n m, m = 1,..., M. For a level α 0, 1 test, given an M variate null distribution Q 0 and a random M vector Z = Zm : m = 1,..., M Q 0, define 1 α quantiles, ca = ca, Q 0, α IR, for the distributions of maxima, max m A Zm, of random variables Zm over subsets A {1,..., M}, ca, Q 0, α F 1 A,Q 0 1 α = inf {z : F A,Q0 z 1 α}, 7 where F A,Q0 z P r Q0 max m A Zm z denotes the c.d.f. of max m A Zm under Z Q 0. Next, given the indices O n m for the ordered statistics Tnm, define 1 α quantiles, C n m, for subsets of the form O n m {O n m,..., O n M}, and step-down cut-offs C n m co n m, Q 0, α = F 1 O nm,q 0 1 α, 8 Cn1 C n 1, 9 { Cnm C n m, if T nm 1 > Cnm 1, m = 2,..., M. +, otherwise The step-down maxt multiple testing procedure for controlling the FWER at level α is defined by the following rule: Reject null hypothesis H 0 O n m, corresponding to the mth most significant test statistic T nm = T n O n m, if T nm > C nm, m = 1,..., M, that is, RT n, Q 0, α {O n m : T nm > C nm, m = 1,..., M}. 10 Published by The Berkeley Electronic Press, 2004 7

Statistical Applications in Genetics and Molecular Biology, Vol. 3 [2004], Iss. 1, Art. 14 Procedure 1 can be stated in a more compact manner, as RT n, Q 0, α {O n 1,..., O n R n }, 11 where R n, the number of rejected hypotheses, is defined as { m } R n max m : ITnh > C n h = m. 12 h=1 Note that the definition C nm = +, if T nm 1 C nm 1, ensures that the procedure is indeed step-down, that is, one can only reject a particular hypothesis provided all hypotheses with more significant i.e., larger test statistics were rejected beforehand. In addition, the cut-offs C n m used in the rejection rule are random variables that depend on the data via the ranks of the test statistics T n i.e., via the random subsets O n m, again reflecting the stepwise nature of the method. This is in contrast to the constant cut-offs used in single-step Procedures 1 and 2 of the companion article Dudoit et al., 2004. Procedure 1, based on successive maxima of test statistics, is a step-down analogue of the single-step maxt procedure that arises as a special case of single-step common-cut-off Procedure 2 of Dudoit et al. 2004. Similar stepdown maxt procedures are discussed in Dudoit et al. 2003 and Westfall and Young 1993, Algorithm 4.1, p. 116 117, with an important distinction in the choice of the null distribution Q 0 used to derive the quantiles C n m and the resulting adjusted p-values, as detailed in Section 2.1.4, below. 2.1.2 Asymptotic control of FWER In order to establish asymptotic control of the FWER by Procedure 1, we rely on one or both of the following two assumptions concerning the true joint distribution Q n = Q n P of the test statistics T n and the null distribution Q 0. Assumption AT1 [Asymptotic null domination] There exists an M variate null distribution Q 0 = Q 0 P so that lim sup P r Qn max T n m > x P r Q0 max Zm > x for all x, 13 where T n and Z are random M vectors with T n Q n = Q n P and Z Q 0 = Q 0 P. http://www.bepress.com/sagmb/vol3/iss1/art14 8

van der Laan et al.: Step-Down Procedures for Control of the Family-Wise Error Rate Assumption AT2 [Asymptotic separation of true and false null hypotheses] Let K 0 be a possibly degenerate e.g., + maximal value, so that P r Qn max m T n m < K 0 = 1, for all n. Assume that for all K < K 0, lim P r Q n min T n m K = 1 14 m H 1 and lim K K 0 lim P r Qn max T n m K = 0. 15 In addition, for α 0, 1 and Z = Zm : m = 1,..., M Q 0, the distributions of maxima, max m A Zm, of random variables Zm over subsets A {1,..., M}, are assumed to have 1 α quantiles bounded by K 0, that is, max A {1,...,M} ca, Q 0, α < K 0, 16 where the quantiles ca, Q 0, α are defined as in Procedure 1. Note that Assumption AT1 follows from the main asymptotic null domination Assumption AQ0 in Theorem 1 of the companion article on singlestep procedures Dudoit et al., 2004. Condition AQ0 was stated there to asymptotically control general Type I error rates of the form θf Vn and it is sufficient, but not necessary, for control of the FWER. Assumption AT1 can therefore be viewed as an FWER-specific asymptotic null domination condition, where θf Vn = 1 F Vn 0. Specific guidelines for constructing a null distribution Q 0 that satisfies Assumption AT1 are given in Theorem 3, below. Also note that conditions 14 and 15 in Assumption AT2 only require that T n = T n m : m = 1,..., M represent a sensible set of test statistics, that separate into two groups as n, depending on the truth or falsity of the null hypotheses, i.e., such that the largest h 1 test statistics correspond to the h 1 false null hypotheses. We derive two main results concerning asymptotic control of the FWER by Procedure 1. The more general Theorem 1 proves that Procedure 1 provides asymptotic control of the FWER under asymptotic null domination Assumption AT1 only. Since step-down cut-offs are always less than or equal to the corresponding single-step cut-offs, this result shows that the gain in power from the step-down maxt procedure, relative to the single-step maxt procedure, does not come at the expense of failure of Type I error control. Published by The Berkeley Electronic Press, 2004 9

Statistical Applications in Genetics and Molecular Biology, Vol. 3 [2004], Iss. 1, Art. 14 By making the additional asymptotic separation Assumption AT2, Theorem 2 provides a sharper result. In particular, consistent identification of the set H 1 of false null hypotheses as in Assumption AT2, leads to exact asymptotic control of the FWER, when condition 13 in Assumption AT1 holds with equality. However, asymptotic separation of the test statistics for the true and false null hypotheses does not hold at local alternatives. Theorem 1 [Asymptotic control of FWER for step-down maxt Procedure 1, under Assumption AT1] Suppose Assumption AT1 of asymptotic null domination is satisfied by the distribution Q n = Q n P of the test statistics T n = T n m : m = 1,..., M and by the null distribution Q 0. Denote the number of Type I errors for Procedure 1 by V n M ITnm > Cnm, O n m H 0. m=1 Then, Procedure 1 provides asymptotic control of the family-wise error rate at level α, that is, lim sup P rv n > 0 α. Proof of Theorem 1. Let m n min{m : O n m H 0 }, that is, O n m n is the index of the true null hypothesis with the largest test statistic. Thus, by definition of m n, T o nm n = T n O n m n = max m H0 T n m and {O n 1,..., O n m n 1} = O c nm n H 1. It then follows that P rv n > 0 = P ro n m n R n P r Tnm o n > co n m n, Q 0, α = P r max T n m > co n m n, Q 0, α P r max T n m > ch 0, Q 0, α, where the first inequality follows from the step-down property and the last inequality follows from the fact that H 0 O n m n implies ch 0, Q 0, α http://www.bepress.com/sagmb/vol3/iss1/art14 10

van der Laan et al.: Step-Down Procedures for Control of the Family-Wise Error Rate co n m n, Q 0, α. Finally, under Assumption AT1 and for Z Q 0, P rv n > 0 lim sup P r max lim sup which completes the proof. P r α, T n m > ch 0, Q 0, α max Zm > ch 0, Q 0, α Theorem 2 [Asymptotic control of FWER for step-down maxt Procedure 1, under Assumptions AT1 and AT2] Suppose Assumptions AT1 and AT2 hold, specifically, conditions 13, 14, 15, and 16, are satisfied by the distribution Q n = Q n P of the test statistics T n = T n m : m = 1,..., M and by the null distribution Q 0. Then, as under the weaker assumptions of Theorem 1, Procedure 1 provides asymptotic control of the family-wise error rate at level α. In addition, if condition 13 in Assumption AT1 holds with equality and Q 0 is continuous so that max m H0 Zm is a continuous random variable for Z Q 0, then asymptotic control is exact lim P rv n > 0 = α. Proof of Theorem 2. Procedure 1 can be stated equivalently in terms of statistics T nm, as follows. Let Tn1 Tn1, 17 { Tnm T nm, if Tnm 1 > C n m 1, m = 2,..., M,, otherwise and reject H 0 O n m if T nm > C n m, m = 1,..., M. We first state the main ideas of the proof. Note that, from asymptotic separation Assumption AT2, with probability one in the limit, the first h 1 = H 1 rejected hypotheses correspond to the h 1 false null hypotheses see argument with indicator B n, below. Thus, no Type I errors are committed for these first h 1 rejections and one can focus on the h 0 least significant statistics, T nm, m = h 1 +1,..., M, Published by The Berkeley Electronic Press, 2004 11

Statistical Applications in Genetics and Molecular Biology, Vol. 3 [2004], Iss. 1, Art. 14 which now correspond to the test statistics for the true null hypotheses, T n m : m H 0. By definition of the step-down procedure, a Type I error is then committed if and only if Tnh 1 +1 = max m H0 T n m > C n h 1 +1 = ch 0. Thus, one needs to control P r max m H0 T n m > ch 0, which the procedure indeed asymptotically controls at level α, conditional on the event that the first h 1 rejections correspond exactly with rejecting the true positives H 1. Details of the proof are given next. Define Bernoulli random variables B n I {O n 1,..., O n h 1 } = H 1, Tn1 > C n 1,..., Tnh 1 > C n h 1. 18 Under Assumption AT2, P rb n = 1 1 as n. Then, the FWER for Procedure 1 is given by P rv n > 0 = P r M m=1{t nm > C n m, O n m H 0 } = P r M m=1{t nm > C n m, O n m H 0 } B n = 1 + o1 = P r M m=h 1 +1{Tnm > C n m} B n = 1 + o1 [ = E P r M m=h 1 +1{Tnm > C n m} O n h 1 + 1, B n = 1 ] B n = 1 + o1 [ = E P r M m=h 1 +1{Tnm > C n m} T nh 1 + 1 > ch 0, O n h 1 + 1, B n = 1 P r Tnh 1 + 1 > ch 0 On h 1 + 1, B n = 1 ] Bn = 1 + o1, where B n = 1 implies that O n h 1 + 1 = H 0 and hence C n h 1 + 1 = ch 0. The last equality follows by noting that, given B n = 1, then T nh 1 + 1 = T nh 1 + 1 = T n O n h 1 + 1, and, if T nh 1 + 1 C n h 1 + 1, then T nm = for m = h 1 + 2,..., M. Now, note that the first probability within the conditional expectation equals one, so that P rv n > 0 = = E [ P rtnh 1 + 1 > ch 0 O n h 1 + 1, B n = 1 Bn = 1 ] + o1 [ = E P r max T n m > ch 0 O n h 1 + 1, B n = 1 B n = 1] + o1 = P r max T n m > ch 0 B n = 1 + o1 = P r max T n m > ch 0 + o1, http://www.bepress.com/sagmb/vol3/iss1/art14 12

van der Laan et al.: Step-Down Procedures for Control of the Family-Wise Error Rate where we again use the fact that P rb n = 1 1. Finally, under Assumption AT1 and for Z Q 0, lim sup P r max T n m > ch 0 P r max Zm > ch 0 α. In addition, if condition 13 in Assumption AT1 holds with equality i.e., lim n P r max m H0 T n m > x = P r max m H0 Zm > x and the null distribution Q 0 is continuous, so that quantiles ca = ca, Q 0, α yield exact α survival probabilities, then Procedure 1 provides exact asymptotic FWER control at level α lim P r max T n m > ch 0 = P r max Zm > ch 0 = α. 2.1.3 Explicit proposal for the test statistics null distribution One can make the following explicit proposal for a null distribution Q 0 that satisfies asymptotic null domination Assumption AT1. Theorem 3 [General construction for null distribution Q 0 ] Suppose there exist known M vectors λ 0 IR M and τ 0 IR M 0 of null values, so that Let lim sup E[T n m] λ 0 m and 19 lim sup V ar[t n m] τ 0 m, for m H 0. ν 0n m min τ 0 m 1, V ar[t n m] 20 and define an M vector of null value shifted and scaled test statistics Z n by Z n m ν 0n m T n m + λ 0 m E[T n m], m = 1,..., M. 21 Suppose that Z n L Z Q 0 P. 22 Published by The Berkeley Electronic Press, 2004 13

Statistical Applications in Genetics and Molecular Biology, Vol. 3 [2004], Iss. 1, Art. 14 Then, for this choice of null distribution Q 0 = Q 0 P, and for all x, lim sup P r Qn max T n m > x P r Q0 max Zm > x, 23 so that asymptotic null domination condition 13 in Assumption AT1 holds. In particular, if for all m H 0, lim n E[T n m] = λ 0 m, then 23 holds with equality. Proof of Theorem 3. Define an intermediate random vector Z n m : m H 0, for the true null hypotheses, by Z n m T n m + max0, λ 0 m E[T n m], m H 0. 24 Then, T n m Z n m. In addition, since lim sup n E[T n m] λ 0 m and lim sup n V ar[t n m] τ 0 m for m H 0 and thus lim n ν 0n m = 1, it follows that Z n m : m H 0 and Z n m : m H 0 have the same limit distribution Z n m : m H 0 L Zm : m H 0 Q 0,H0. Thus, by the Continuous Mapping Theorem, lim sup P r max T n m > x lim sup = P r P r max max Zm > x Zn m > x for all x. In particular, if 19 holds with equality, then 23 also holds with equality. The null distribution Q 0 proposed in Theorem 3 for step-down procedures is the same as the general null distribution proposed for single-step procedures in Theorem 2 of the companion article Dudoit et al., 2004. The reader is referred to this earlier article for motivation for the construction of the null distribution Q 0 and for a detailed discussion of its properties Sections 3, 5.2, and 7. In particular, Section 7 provides null values λ 0 m and τ 0 m for a broad range of testing problems and also discusses null distributions for specific choices of test statistics e.g., t-statistics, F -statistics. In http://www.bepress.com/sagmb/vol3/iss1/art14 14

van der Laan et al.: Step-Down Procedures for Control of the Family-Wise Error Rate many testing problems of interest, Q 0 is continuous. For instance, for the test of single-parameter null hypotheses using t-statistics, Q 0 is an M variate Gaussian distribution with mean vector zero Section 7.1, in Dudoit et al. 2004. In practice, one can estimate the null distribution Q 0 using a bootstrap procedure, as discussed in detail in Section 3. For B bootstrap samples, one has an M B matrix of test statistics, T # n = T n # m, b, with rows corresponding to the M hypotheses and columns to the B bootstrap samples. The expected values, E[T n m], and variances, V ar[t n m], are estimated by simply taking row means and variances of the matrix T # n. The matrix of test statistics T # n can then be row-shifted and scaled using the supplied null values λ 0 m and τ 0 m, to produce an M B matrix Z # n = Z n # m, b. The null distribution Q 0 is estimated by the empirical distribution of the columns of matrix Z # n. 2.1.4 Adjusted p-values Rather than simply reporting rejection or not of a subset of null hypotheses at a prespecified level α, one can report adjusted p-values for step-down Procedure 1, computed under the assumed null distribution Q 0 for the test statistics T n for a more detailed discussion of adjusted p-values, consult Section 2.9 of the companion article, Dudoit et al. 2004. While the definition of adjusted p-values in Equation 5 of Section 1 holds for general null distributions, in this section, we consider for simplicity a null distribution Q 0 with continuous and strictly monotone marginal c.d.f. s, Q 0,m, and survivor functions, Q0,m = 1 Q 0,m, m = 1,..., M. Result 1 [Adjusted p-values for step-down maxt Procedure 1] The adjusted p-values for step-down maxt Procedure 1, based on a null distribution Q 0 with continuous and strictly monotone marginal distributions, are given by } {P r Q0 P 0n O n m = max k=1,...,m max Zh T no n k h {O nk,...,o nm} 25 where Z = Zm : m = 1,..., M Q 0 and H 0 O n m is the null hypothesis corresponding to the mth most significant test statistic T nm = T n O n m, that is, the indices O n m are defined such that T n O n 1, Published by The Berkeley Electronic Press, 2004 15

Statistical Applications in Genetics and Molecular Biology, Vol. 3 [2004], Iss. 1, Art. 14... T n O n M. Step-down Procedure 1 for controlling the FWER at level α can then be stated equivalently as RT n, Q 0, α = { O n m : P 0n O n m α, m = 1,..., M }. 26 Note that the adjusted p-values are conditional on the observed test statistics T n m and their ranks. In addition, taking successive maxima of the probabilities in Equation 25 enforces the step-down property via monotonicity of the adjusted p-values, P 0n O n 1... P 0n O n M. Proof of Result 1. As in Procedure 1, let F A,Q0 z = P r Q0 max m A Zm z denote the c.d.f. of max m A Zm for Z Q 0. Also, let O n m = {O n m,..., O n M} and C n m = F 1 O nm,q 0 1 α. Then, as in Equation 12, P 0n O n m = inf { α [0, 1] : m I T n O n k > C n k } = m k=1 = inf {α [0, 1] : T n O n k > C n k, k = 1,..., m} = max inf {α [0, 1] : T no n k > C n k} k=1,...,m { } = max inf α [0, 1] : T n O n k > F 1 1 α k=1,...,m O nk,q 0 { = max inf α [0, 1] : F } Onk,Q k=1,...,m 0 T n O n k < α = max k=1,...,m = max k=1,...,m F Onk,Q0 T n O n k } {P r Q0 max Zh > T n O n k, h {O nk,...,o nm} where in we use the fact that, for a c.d.f. F and corresponding survivor function F, F 1 α = F 1 1 α. http://www.bepress.com/sagmb/vol3/iss1/art14 16

van der Laan et al.: Step-Down Procedures for Control of the Family-Wise Error Rate 2.2 Step-down procedure based on minima of unadjusted p-values 2.2.1 Step-down minp procedure One can also prove asymptotic control of the FWER for an analogue of Procedure 1, where maxima of test statistics, T n m, are replaced by minima of unadjusted p-values, P 0n m, also computed under the null distribution Q 0. Procedure 2, p. 19, below, is a step-down analogue of the single-step minp procedure that arises as a special case of single-step common-quantile Procedure 1 of Dudoit et al. 2004. Similar step-down minp procedures are discussed in Dudoit et al. 2003 and Westfall and Young 1993, Algorithm 2.8, p. 66 67, with an important distinction in the choice of the null distribution Q 0 used to derive the quantiles C n m and the resulting adjusted p-values, as detailed in Section 2.2.4, below. Note that procedures based on maxima of test statistics maxt and minima of unadjusted p-values minp are equivalent when the test statistics T n m, m = 1,..., M, are identically distributed under Q 0, i.e., when the marginal distributions Q 0,m do not depend on m: in this case, the significance rankings based on test statistics T n m and marginal p-values P 0n m coincide. In general, however, the two types of procedures produce different results, and considerations of balance, power, and computational feasibility should dictate the choice between the two approaches Dudoit et al., 2003, 2004; Ge et al., 2003. For non-identically distributed test statistics e.g., some null hypotheses tested with F -statistics and others with t-statistics, null hypotheses tested with t-statistics having different degrees of freedom, procedures based on minima of p-values may be preferable, as they place the null hypotheses on an equal footing, i.e., are more balanced than their maxt counterparts. Also note that while nominal p-values computed from a standard normal or some other distribution may not be correct, a step-down procedure based on minima of such transformed test statistics nonetheless provides asymptotic control of the FWER e.g., P n m = ΦT n m, where Φ is the standard normal survivor function. That is, these p-values can be viewed as just another type of test statistics and one can apply Procedure 1 to T n m = P n m and appeal to Theorems 1, 2, and 3 for FWER control. Here, however, we propose a step-down minp multiple testing procedure where unadjusted p-values are also defined in terms of the null distribution Published by The Berkeley Electronic Press, 2004 17

Statistical Applications in Genetics and Molecular Biology, Vol. 3 [2004], Iss. 1, Art. 14 Q 0. Specifically, we define unadjusted p-values as P 0n m Q 0,m T n m, where Q 0,m = 1 Q 0,m denote the marginal survivor functions corresponding to Q 0, m = 1,..., M. We therefore have a more specific method and assumptions for proving asymptotic control of the family-wise error rate than in Section 2.1. Asymptotic Type I error control by step-down minp Procedure 2 relies on Assumptions AP1 and AP2, below; guidelines for constructing the null distribution Q 0 are given in Lemma 1 and 2 and are as in Theorem 3 with a few additional requirements. Thus, while similar, maxt Procedure 1 and minp Procedure 2 are not equivalent and require a distinct treatment if one is to use the same test statistics null distribution Q 0 for both approaches as opposed to a different null distribution for the p-values in Procedure 2. As for Procedure 1, Procedure 2 can be stated more succinctly as RT n, Q 0, α {O n 1,..., O n R n }, 27 where R n, the number of rejected hypotheses, is defined as { m } R n max m : IP0nh < C n h = m. 28 h=1 The definition C nm = 0, if P 0nm 1 C nm 1, ensures that the procedure is indeed step-down, that is, one can only reject a particular hypothesis provided all hypotheses with more significant i.e., smaller unadjusted p-values were rejected beforehand. Note that for a null distribution Q 0 with continuous margins, the unadjusted p-values P 0 m = Q 0,m Zm have U0, 1 marginal distributions. However, the P 0 m are not independent, therefore, the quantiles C n m cannot be obtained trivially from the Beta1, M m + 1 distribution for the minimum of M m+1 independent U0, 1 random variables. A key feature of Theorem 3 is that it provides a null distribution Q 0 for multiple testing procedures that take into account the joint distribution of the test statistics, i.e., the correlation structure of the null distribution Q 0 is implied by the correlation structure of the test statistics T n, via the null value shifted and scaled statistics Z n. http://www.bepress.com/sagmb/vol3/iss1/art14 18

van der Laan et al.: Step-Down Procedures for Control of the Family-Wise Error Rate Procedure 2. Step-down minp procedure for control of the FWER. Given an M variate null distribution Q 0, with marginal c.d.f. s Q 0,m and survivor functions Q 0,m = 1 Q 0,m, m = 1,..., M, define unadjusted p-values P 0n m Q 0,m T n m 29 and P 0 m Q 0,m Zm, 30 for random M vectors T n = T n m : m = 1,..., M Q n = Q n P and Z = Zm : m = 1,..., M Q 0. Let P 0nm denote the ordered unadjusted p-values, P 0n1... P 0nM, and let O n m denote the indices for these ordered statistics, so that P 0nm P 0n O n m, m = 1,..., M. For a level α 0, 1 test, define α quantiles, ca = ca, Q 0, α [0, 1], for the distributions of minima, min m A P 0 m, of unadjusted p-values P 0 m over subsets A {1,..., M}, ca, Q 0, α F 1 A,Q 0 α = inf {z : F A,Q0 z α}, 31 where F A,Q0 z P r Q0 min m A P 0 m z denotes the c.d.f. of min m A P 0 m for Z Q 0. Next, given the indices O n m for the ordered unadjusted p-values P0nm, define α quantiles, C n m, for subsets of the form O n m {O n m,..., O n M}, and step-down cut-offs C n m co n m, Q 0, α = F 1 O nm,q 0 α, 32 Cn1 C n 1, 33 { Cnm C n m, if P 0nm 1 < Cnm 1, m = 2,..., M. 0, otherwise The step-down minp multiple testing procedure for controlling the FWER at level α is defined by the following rule: Reject null hypothesis H 0 O n m, corresponding to the mth most significant unadjusted p- value P0nm = P 0n O n m = Q 0,OnmT n O n m, if P0nm < Cnm, m = 1,..., M, that is, RT n, Q 0, α {O n m : P 0nm < C nm, m = 1,..., M}. 34 Published by The Berkeley Electronic Press, 2004 19

Statistical Applications in Genetics and Molecular Biology, Vol. 3 [2004], Iss. 1, Art. 14 2.2.2 Asymptotic control of FWER As with step-down maxt Procedure 1, we prove two main theorems concerning asymptotic control of the FWER by Procedure 2, under the following p-value analogues of Assumptions AT1 and AT2. The more general result Theorem 4 is proved under only asymptotic null domination Assumption AP1, and the sharper result Theorem 5 is proved under both Assumptions AP1 and AP2. Guidelines for constructing a null distribution Q 0 that satisfies Assumptions AP1 and AP2 are as in Theorem 3, with a few additional conditions stated in Lemma 1 and 2, below. Assumption AP1 [Asymptotic null domination] There exists an M variate null distribution Q 0 = Q 0 P so that lim sup P r Qn min P 0n m < x P r Q0 min P 0 m < x for all x, 35 where P 0n m = Q 0,m T n m and P 0 m = Q 0,m Zm are unadjusted p- values defined for random M vectors T n Q n = Q n P and Z Q 0 = Q 0 P, respectively, and Q 0,m denote the marginal survivor functions corresponding to the null distribution Q 0, m = 1,..., M. Note that, like Assumption AT1 for the step-down maxt procedure, Assumption AP1 follows from the main asymptotic null domination Assumption AQ0 in Theorem 1 of the companion article on single-step procedures Dudoit et al., 2004. It is a weaker form of null domination, that is specific to FWER-controlling procedures based on p-values. Assumption AP2 [Asymptotic separation of true and false null hypotheses] Let P 0n m = Q 0,m T n m and P 0 m = Q 0,m Zm denote unadjusted p-values defined for random M vectors T n Q n = Q n P and Z Q 0, respectively, where Q 0,m denote the marginal survivor functions corresponding to the null distribution Q 0, m = 1,..., M. For each ɛ > 0, assume that and lim ɛ 0 lim P r Q n lim P r Qn max m H 1 P 0n m ɛ = 1 36 min P 0n m ɛ = 0. 37 http://www.bepress.com/sagmb/vol3/iss1/art14 20

van der Laan et al.: Step-Down Procedures for Control of the Family-Wise Error Rate In addition, for α 0, 1 and Z = Zm : m = 1,..., M Q 0, the distributions of minima, min m A P 0 m, of unadjusted p-values P 0 m = Q 0,m Zm over subsets A {1,..., M}, are assumed to have positive α quantiles, that is, min ca, Q 0, α > 0, 38 A {1,...,M} where the quantiles ca, Q 0, α are defined as in Procedure 2. Theorem 4 [Asymptotic control of FWER for step-down minp Procedure 2, under Assumption AP1] Suppose Assumption AP1 of asymptotic null domination is satisfied by the unadjusted p-values, P 0n = P 0n m = Q 0,m T n m : m = 1,..., M, i.e., by the distribution Q n = Q n P of the test statistics T n = T n m : m = 1,..., M and by the null distribution Q 0. Denote the number of Type I errors for Procedure 2 by V n M IP0nm < Cnm, O n m H 0. m=1 Then, Procedure 2 provides asymptotic control of the family-wise error rate at level α, that is, lim sup P rv n > 0 α. Proof of Theorem 4. The proof follows that of Theorem 1, with unadjusted p-values P 0n m replacing test statistics T n m. Theorem 5 [Asymptotic control of FWER for step-down minp Procedure 2, under Assumptions AP1 and AP2] Suppose Assumptions AP1 and AP2 hold, specifically, conditions 35, 36, 37, and 38, are satisfied by the unadjusted p-values, P 0n = P 0n m = Q 0,m T n m : m = 1,..., M, i.e., by the distribution Q n = Q n P of the test statistics T n = T n m : m = 1,..., M and by the null distribution Q 0. Then, as under the weaker assumptions of Theorem 4, Procedure 2 provides asymptotic control of the family-wise error rate at level α. In addition, if condition 35 in Assumption AP1 holds with equality and Q 0 is continuous so that Published by The Berkeley Electronic Press, 2004 21

Statistical Applications in Genetics and Molecular Biology, Vol. 3 [2004], Iss. 1, Art. 14 min m H0 Q0,m Zm is a continuous random variable for Z Q 0, then asymptotic control is exact lim P rv n > 0 = α. Proof of Theorem 5. The proof is analogous to that of Theorem 2. Therefore, only the main steps where Assumptions AP1 and AP2 come into play are highlighted. Compared to the previous proof, maxima of test statistics are replaced by minima of unadjusted p-values and the direction of the cut-off rules is reversed. Again, the procedure can be stated equivalently in terms of statistics P 0nm, as follows. Let P0n1 P0n1, 39 { P0nm P 0nm, if P0nm 1 < C n m 1, m = 2,..., M, 1, otherwise and reject H 0 O n m if P0nm < C n m, m = 1,..., M. As before, define Bernoulli random variables B n I {O n 1,..., O n h 1 } = H 1, P0n1 < C n 1,..., P0nh 1 < C n h 1 40 and argue that, under asymptotic separation Assumption AP2, then P rb n = 1 1 as n. Asymptotic null domination Assumption AP1 comes into play at the very last step of the proof to show that lim sup P r min P 0n m < ch 0 α. 2.2.3 Explicit proposal for the test statistics null distribution A null distribution Q 0 for Procedure 2 can be constructed as described in Theorem 3, with a few additional requirements in order to meet Assumptions AP1 and AP2 of Theorems 4 and 5. Lemma 1 and 2 are concerned with providing sufficient conditions in terms of continuity and monotonicity assumptions on the null distribution Q 0 so that Assumptions AP1 and AP2 are implied by their maxt counterparts, i.e., by Assumptions AT1 and AT2, respectively. http://www.bepress.com/sagmb/vol3/iss1/art14 22

van der Laan et al.: Step-Down Procedures for Control of the Family-Wise Error Rate Lemma 1 [Asymptotic null domination] The null distribution Q 0 = Q 0 P defined in Theorem 3, with the additional condition that the marginal survivor functions Q 0,m = 1 Q 0,m, m = 1,..., M, are continuous, satisfies Assumption AP1. Proof of Lemma 1. Define an intermediate random vector Z n m : m H 0 as in the proof of Theorem 3, so that T n m Z n m, for m H 0, and Z n m : m H 0 and Z n m : m H 0 have the same limit distribution Q 0,H0. Then, by the Continuous Mapping Theorem, lim sup P r min P 0n m < x P r Q0,m T n m < x Q0,m Z n m < x = lim sup min lim sup P r min = P r min Q0,m Zm < x = P r min P 0 m < x for all x, where Z Q 0. In particular, if 19 holds with equality, then 35 also holds with equality. Lemma 2 [Asymptotic separation of true and false null hypotheses] Suppose that the marginal survivor functions Q 0,m, m = 1,..., M, corresponding to the null distribution Q 0, satisfy the following: i Q 0,m is continuous; ii Q 0,m is strictly decreasing; and iii there exists a constant K 0 possibly degenerate such that lim Q 1 ɛ 0 0,mɛ = K 0 for each m. Then, asymptotic separation Assumption AT2 for the test statistics T n m implies Assumption AP2 for the unadjusted p-values P 0n m = Q 0,m T n m, m = 1,..., M. Published by The Berkeley Electronic Press, 2004 23

Statistical Applications in Genetics and Molecular Biology, Vol. 3 [2004], Iss. 1, Art. 14 Proof of Lemma 2. Condition 36 follows from 14 by noting that, for each ɛ > 0, K 1 ɛ max Q 1 m H1 0,mɛ < K 0 and P r max P 0n m ɛ = P r Q0,m T n m ɛ, m H 1 m H 1 = P r 1 T n m Q 0,mɛ, m H 1 P r T n m K 1 ɛ, m H 1 = P r min T n m K 1 ɛ. m H 1 Similarly, condition 37 follows from 15 by noting that, for each ɛ > 0 and k 0 ɛ min Q 1 m H0 0,mɛ, then P r min P 0n m ɛ = P r Q0,m T n m ɛ, for some m H 0 = P r 1 T n m Q 0,mɛ, for some m H 0 P r T n m k 0 ɛ, for some m H 0 = P r max T n m k 0 ɛ. Also, k 0 ɛ K 0 as ɛ 0, thus, by 15, lim lim P r min P 0n m ɛ lim lim P r ɛ 0 ɛ 0 max T n m k 0 ɛ = 0. 2.2.4 Adjusted p-values Adjusted p-values are obtained similarly as for Procedure 1. Again, for simplicity and as in Lemma 1 and 2, consider a null distribution Q 0 with continuous and strictly monotone marginal c.d.f. s, Q 0,m, and survivor functions, Q0,m = 1 Q 0,m, m = 1,..., M. Result 2 [Adjusted p-values for step-down minp Procedure 2] The adjusted p-values for step-down minp Procedure 2, based on a null distribution Q 0 with continuous and strictly monotone marginal distributions, are given by } {P r Q0 P 0n O n m = max k=1,...,m min P 0h P 0n O n k h {O nk,...,o nm}, 41 http://www.bepress.com/sagmb/vol3/iss1/art14 24

van der Laan et al.: Step-Down Procedures for Control of the Family-Wise Error Rate where P 0 m = Q 0,m Zm, Z = Zm : m = 1,..., M Q 0, and H 0 O n m is the null hypothesis corresponding to the mth most significant unadjusted p-value P0nm = P 0n O n m = Q 0,OnmT n O n m, that is, the indices O n m are defined such that P 0n O n 1... P 0n O n M. Step-down Procedure 2 for controlling the FWER at level α can then be stated equivalently as RT n, Q 0, α = { O n m : P 0n O n m α, m = 1,..., M }. 42 Proof of Result 2. As in Procedure 2, let F A,Q0 z = P r Q0 min m A P 0 m z denote the c.d.f. of min m A P 0 m for P 0 m = Q 0,m Zm and Z Q 0. Also, let O n m = {O n m,..., O n M} and C n m = F 1 α. Then, O nm,q 0 as in Equation 28, { m P 0n O n m = inf α [0, 1] : I P 0n O n k < C n k } = m k=1 = inf {α [0, 1] : P 0n O n k < C n k, k = 1,..., m} = max inf {α [0, 1] : P 0nO n k < C n k} k=1,...,m { } = max inf α [0, 1] : P 0n O n k < F 1 α k=1,...,m O nk,q 0 { } = max inf α [0, 1] : F Onk,Q k=1,...,m 0 P 0n O n k α = max F O k=1,...,m nk,q 0 P 0n O n k {P r Q0 = max k=1,...,m min P 0 h P 0n O n k h {O nk,...,o nm} }. The adjusted p-values in Equation 41 correspond to those given in Equation 2.10, p. 66, of Westfall and Young 1993, again with an important distinction in the choice of the null distribution Q 0. Consider the special case where the random M vector Z Q 0 has independent components Zm, with continuous marginal distributions Q 0,m, m = 1,..., M. Then, the unadjusted p-values P 0 m = Q 0,m Zm are independent U0, 1 random variables and the minima min m A P 0 m have Beta1, A distributions. The adjusted p-values for Procedure 2 then reduce to the step-down Published by The Berkeley Electronic Press, 2004 25