Two-stage stepup procedures controlling FDR


Journal of Statistical Planning and Inference 138 (2008)

Sanat K. Sarkar

Department of Statistics, Temple University, Philadelphia, USA

Received 10 June 2005; received in revised form 6 August 2006; accepted 23 March 2007. Available online 7 July 2007.

Abstract

A two-stage stepup procedure is defined and an explicit formula for the FDR of this procedure is derived under any distributional setting. Sets of critical values are determined that provide control of the FDR of a two-stage stepup procedure under the iid mixture model. A class of two-stage FDR procedures modifying the Benjamini-Hochberg (BH) procedure and containing the one given in Storey et al. [2004. Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach. J. Roy. Statist. Soc. Ser. B 66] is obtained. The FDR-controlling property of the Storey-Taylor-Siegmund procedure is proved assuming only independence, by an argument different from the one presented by these authors. A single-stage stepup procedure controlling the FDR under any form of dependence, which is different from and in some situations performs better than the Benjamini-Yekutieli (BY) procedure, is given before discussing how to obtain two-stage versions of the BY procedure and this new procedure. Simulations reveal that procedures proposed in this article under the mixture model can perform quite well in terms of improving the FDR control of the BH procedure. However, the similar idea of improving the FDR control of a stepup procedure under any form of dependence does not seem to work.

© 2007 Elsevier B.V. All rights reserved.

Keywords: Modified BH procedure; Mixture model; Average power.
1. Introduction

Since its introduction by Benjamini and Hochberg (1995) in multiple testing, the concept of false discovery rate (FDR) as a measure of error rate has been receiving considerable attention due to its relevance in many scientific investigations, particularly in the context of DNA microarray analysis. The FDR is the expected proportion of type I errors among the rejected hypotheses. There are many procedures that control the FDR. Among them, the stepup procedure of Benjamini and Hochberg (1995) has received the most attention.

Consider $n$ null hypotheses $H_1, \ldots, H_n$ that are being tested simultaneously using the corresponding p-values $P_1, \ldots, P_n$. Let these $P_i$'s be ordered as $0 = P_{0:n} < P_{1:n} \le \cdots \le P_{n:n}$. Then, given a set of ordered critical values $0 = \alpha_0 < \alpha_1 \le \cdots \le \alpha_n < 1$, a stepup procedure rejects $H_i$ (the hypothesis with p-value $P_{i:n}$) for all $i \le K_{SU}$, where

$$K_{SU} = \max\{0 \le i \le n : P_{i:n} \le \alpha_i\}.$$

Assuming that $P_i \sim U(0, 1)$, for each $i = 1, \ldots, n$, under the null hypotheses, the critical values of the Benjamini-Hochberg (BH) procedure controlling the FDR at $\alpha$ are given by

$$\alpha_i = i\alpha/n, \quad i = 1, \ldots, n. \tag{1.1}$$

The research is supported by two NSF DMS grants. E-mail address: sanat@temple.edu. doi:10.1016/j.jspi
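As a minimal sketch of the stepup rule and the BH critical values in (1.1), the following Python fragment computes $K_{SU}$ for a small set of p-values. The function names and the p-values are illustrative assumptions, not part of the paper.

```python
def stepup_count(pvalues, critical_values):
    """Return K_SU = max{0 <= i <= n : P_(i:n) <= alpha_i}; 0 if no index qualifies."""
    k_su = 0
    for i, (p, crit) in enumerate(zip(sorted(pvalues), critical_values), start=1):
        if p <= crit:
            k_su = i  # keep the largest index whose ordered p-value passes
    return k_su


def bh_critical_values(n, alpha):
    """BH critical values alpha_i = i * alpha / n, as in (1.1)."""
    return [i * alpha / n for i in range(1, n + 1)]


# Illustrative p-values (made up for this sketch).
pvals = [0.04, 0.045, 0.049]
k_su = stepup_count(pvals, bh_critical_values(len(pvals), alpha=0.05))
print(k_su)
```

In this example the two smallest p-values exceed their own critical values, yet all three hypotheses are rejected because the largest ordered p-value falls below $\alpha_3$. That is the defining feature of a stepup rule: the search is for the largest index that passes, not the first one that fails.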

One of the most important results on the BH procedure is that its FDR is exactly equal to $(n_0/n)\alpha$, where $n_0/n$ is the proportion of true null hypotheses, when the p-values are independent, and is less than or equal to $(n_0/n)\alpha$ when they are positive regression dependent on the subset (PRDS) of those corresponding to the null hypotheses, a positive dependence property shared by many of the multivariate distributions arising in multiple testing (Benjamini and Yekutieli, 2001; Sarkar, 2002). Finner and Roters (2001) and Storey et al. (2004) offered different proofs of the above equality result under independence. Zhang (2005) contains some of these results.

In the absence of knowledge of any particular form of dependence among the $P_i$'s, the procedure of Benjamini and Yekutieli (2001) is generally recommended. The critical values of this procedure (referred to as the BY procedure) are given by

$$\alpha_i = \frac{i\alpha}{n \sum_{j=1}^{n} 1/j}, \quad i = 1, \ldots, n, \tag{1.2}$$

with the FDR being less than or equal to $(n_0/n)\alpha$.

Since $n_0$ is unknown, both the BH and BY procedures could potentially be improved, in the sense of providing better control of the FDR, if they are modified using an appropriately chosen estimate of $n_0$. This idea, nevertheless, has so far been implemented successfully only in modifying the BH procedure under independence. Benjamini and Hochberg (2000) first proposed a data-adaptive modification of the BH procedure, modifying further a similar approach in Hochberg and Benjamini (1990) aimed at improving the familywise error rate (FWER) control of the Bonferroni single-step and related stepwise procedures. They borrowed the idea of estimating $n_0$ based on the number of relatively large p-values outlined in Schweder and Spjøtvoll (1982). This has also been used in Benjamini et al. (2006) and Storey et al. (2004), who provided alternative modifications to the BH procedure. While Benjamini et al. (2006) and Storey et al. (2004) offer proofs of the FDR-controlling property of their modified BH procedures under independence of the p-values, no proof is yet available for the adaptive BH procedure of Benjamini and Hochberg (2000). These are all basically two-stage procedures that estimate $n_0$ at the first stage before applying the BH procedure at the second stage with its critical values modified to incorporate this estimate. Sarkar (2006) has recently modified the single-step Bonferroni procedure using the estimate of $n_0$ in Storey et al. (2004) through a two-stage approach and proved, assuming independence, that it controls the FDR.

In this article, we obtain a wider class of two-stage modifications of the BH procedure under the mixture model (Genovese and Wasserman, 2002; Storey, 2002) and obtain similar two-stage versions of the BY procedure and of an alternative to it, proposed for the first time in this article, under any form of dependence. We start with a general two-stage stepup procedure in Section 2 and obtain in the same section an explicit expression for its FDR under any distributional setting. We then determine in Section 3, assuming the mixture model, sets of critical values of a two-stage stepup procedure that provide control of the FDR at a pre-specified value $\alpha \in (0, 1)$. The resulting two-stage FDR procedures represent a newer class of modified BH procedures, including the one given in Storey et al. (2004). For the Storey-Taylor-Siegmund procedure (to be referred to as the STS procedure), we also give a slightly more general result providing a proof of its FDR-controlling property under independence alone, that is, without the additional assumption on the probability distribution of different configurations of true and false null hypotheses. This is different from Storey et al. (2004).
We choose one procedure in our proposed class and study it numerically in comparison with the STS, the adaptive BH and the original BH procedures in terms of their FDR and average power. While our procedure appears to have the best performance when the difference between true and false nulls is small, the STS procedure seems to work best when this difference is large. We obtain an inequality in Section 5 for the FDR of a two-stage stepup procedure under any form of dependence among the p-values. This suggests how to develop the newer FDR procedure and leads to two-stage versions of this and the BY procedures. To set the stage for a numerical investigation of whether a two-stage stepup procedure really improves the FDR control of its respective single-stage version in the dependence case, we first assess numerically how our new FDR procedure compares with the commonly used BY procedure. Interestingly, our procedure is seen to exhibit better control of the FDR when there is high dependence among the p-values and not many of the null hypotheses are false. Unfortunately, however, none of these two-stage procedures seems to work in terms of improving the FDR control of the single-stage procedures.

2. Two-stage stepup procedure and its FDR

In this section, we define a two-stage stepup procedure and obtain an explicit formula for its FDR under any distributional setting.

Definition 2.1 (Two-stage stepup procedure). Given $0 = \lambda_0 < \lambda_1 \le \cdots \le \lambda_n < \lambda_{n+1} = 1$ and a set of other constants $\alpha_{jk}$, $0 \le k \le j \le n$, satisfying $0 = \alpha_{j0} < \alpha_{j1} \le \cdots \le \alpha_{jj} \le \lambda_j$ for each $j$, let

$$J = \max\{0 \le i \le n : P_{i:n} \le \lambda_i\} \quad \text{and} \quad K(J) = \max\{0 \le i \le J : P_{i:n} \le \alpha_{Ji}\}.$$

Reject $H_i$ for all $i \le K(J)$.

Remark 2.1. A two-stage stepup procedure starts with a stepup test with the critical values $\lambda_1 \le \cdots \le \lambda_n$ at the first stage and continues testing as long as no hypothesis is rejected. Once a rejection occurs, testing at the first stage stops, the $n - J$ hypotheses accepted are declared true, and those which are not accepted are deferred to the second stage to be further tested using another stepup test with a smaller set of critical values. Given $J = j$, $K(j)$ is the number of hypotheses that are finally rejected. The procedure reduces to a single-stage stepup procedure with the critical values $\lambda_1 \le \cdots \le \lambda_n$ if $\alpha_{jj} = \lambda_j$, $1 \le j \le n$.

The critical values $\lambda_j$ at the first stage are chosen to be relatively large compared to those at the second stage. The reason is that the hypotheses that are accepted based on larger critical values (in terms of p-values) should not be tested further. Moreover, the p-values that are insignificant compared to larger critical values are better indicators of the true null hypotheses, providing reliable information about the number of true null hypotheses that may be used to modify the critical values before testing the remaining hypotheses. Our definition of a two-stage stepup procedure does not cover the type considered by Benjamini et al. (2006), who introduced a two-stage version of the linear stepup procedure. Unlike ours, their procedure allows the first-stage critical values to be smaller than those at the second stage.
This, however, presents an inconsistency in the procedure in that some hypotheses that are insignificant at the first stage may be found significant at the second stage. Our definition avoids such inconsistencies. It is actually more in the spirit of Storey et al. (2004), who restrict the significance threshold for their modified BH procedure to the region which is found significant at the first stage. Notice that we accept the hypotheses found insignificant at the first stage before proceeding to the second stage, whereas Benjamini et al. (2006) make no decision at the first stage unless all the hypotheses are accepted or rejected, and Storey et al. (2004) make no decision unless all the hypotheses are accepted. As seen from the following proposition, under certain conditions on the critical values of our procedure, it makes no difference whether or not we make any decision at the first stage.

Proposition 2.1. With an additional set of constants $\alpha_{jk}$, $j < k \le n$, satisfying $\alpha_{jj} \le \alpha_{j,j+1} \le \cdots \le \alpha_{jn}$ for each $j$, let $K^*(J) = \max\{0 \le i \le n : P_{i:n} \le \alpha_{Ji}\}$ and replace $K(J)$ by $K^*(J)$ in Definition 2.1. This new two-stage stepup procedure remains equivalent to the original one as long as $\alpha_{jk} \le \lambda_k$, $j \le k \le n$.

Towards deriving a formula for the FDR of a two-stage stepup procedure, first note that if $J = 0$, that is, if all the hypotheses are declared true in the first stage, the FDR is 0. Therefore, letting $\{H_i, i \in I_0\}$ be the set of true null hypotheses, we can write a formula for the FDR of a two-stage stepup procedure as follows:

$$\mathrm{FDR} = \sum_{i \in I_0} \sum_{j=1}^{n} \sum_{k=1}^{j} \frac{1}{k} \Pr\{J = j, K(j) = k, H_i \text{ is rejected}\} = \sum_{i \in I_0} \sum_{j=1}^{n} \sum_{k=1}^{j} \frac{1}{k} \Pr\{J = j, K(j) = k, P_i \le \alpha_{jk}\}. \tag{2.1}$$

Note that, for $j = 2, \ldots, n$,

$$\begin{aligned}
\sum_{k=1}^{j} \frac{1}{k} \Pr\{J = j, K(j) = k, P_i \le \alpha_{jk}\}
&= \sum_{k=1}^{j} \frac{1}{k} \Pr\{J = j, K(j) \le k, P_i \le \alpha_{jk}\} - \sum_{k=2}^{j} \frac{1}{k} \Pr\{J = j, K(j) \le k - 1, P_i \le \alpha_{jk}\} \\
&= \frac{1}{j} \Pr\{J = j, P_i \le \alpha_{jj}\} + \sum_{k=1}^{j-1} E\left[ \Pr\{J = j, K(j) \le k \mid P_i\} \left\{ \frac{I(P_i \le \alpha_{jk})}{k} - \frac{I(P_i \le \alpha_{j,k+1})}{k+1} \right\} \right].
\end{aligned} \tag{2.2}$$

Let $P^{(-i)}_{1:n-1} \le \cdots \le P^{(-i)}_{n-1:n-1}$ denote the ordered components of the set obtained by removing $P_i$ from $(P_1, \ldots, P_n)$. Then, we have, for any $j = 1, \ldots, n$,

$$\begin{aligned}
\{J = j, P_i \le \alpha_{jj}\} &= \{P_{j:n} \le \lambda_j, P_{j+1:n} > \lambda_{j+1}, \ldots, P_{n:n} > \lambda_n, P_i \le \alpha_{jj}\} \\
&= \{P^{(-i)}_{j-1:n-1} \le \lambda_j, P^{(-i)}_{j:n-1} > \lambda_{j+1}, \ldots, P^{(-i)}_{n-1:n-1} > \lambda_n, P_i \le \alpha_{jj}\}.
\end{aligned} \tag{2.3}$$

And, for any $1 \le k < j \le n$ and $\beta \le \alpha_{j,k+1}$,

$$\begin{aligned}
\{J = j, K(j) \le k, P_i \le \beta\}
&= \bigcup_{l=0}^{k} \{J = j, K(j) = l, P_i \le \beta\} \\
&= \bigcup_{l=0}^{k} \{P_{l:n} \le \alpha_{jl}, P_{l+1:n} > \alpha_{j,l+1}, \ldots, \alpha_{jj} < P_{j:n} \le \lambda_j, P_{j+1:n} > \lambda_{j+1}, \ldots, P_{n:n} > \lambda_n, P_i \le \beta\} \\
&= \{P_{k+1:n} > \alpha_{j,k+1}, \ldots, P_{j-1:n} > \alpha_{j,j-1}, \alpha_{jj} < P_{j:n} \le \lambda_j, P_{j+1:n} > \lambda_{j+1}, \ldots, P_{n:n} > \lambda_n, P_i \le \beta\} \\
&= \{P^{(-i)}_{k:n-1} > \alpha_{j,k+1}, \ldots, P^{(-i)}_{j-2:n-1} > \alpha_{j,j-1}, \alpha_{jj} < P^{(-i)}_{j-1:n-1} \le \lambda_j, P^{(-i)}_{j:n-1} > \lambda_{j+1}, \ldots, P^{(-i)}_{n-1:n-1} > \lambda_n, P_i \le \beta\}
\end{aligned} \tag{2.4}$$

(with $P^{(-i)}_{0:n-1} = 0$ and $P^{(-i)}_{n:n-1} = 1$). Thus, applying (2.3) and (2.4), respectively, to the probability in the first term and the conditional probabilities inside the summation in the second term of (2.2), and going back to (2.1), we have the following lemma providing an explicit expression for the FDR of a two-stage stepup procedure under any distributional setting.

Lemma 2.1. The FDR of a two-stage stepup procedure with the first-stage critical values $0 = \lambda_0 < \lambda_1 \le \cdots \le \lambda_n < \lambda_{n+1} = 1$ and the second-stage critical values $0 = \alpha_{j0} < \alpha_{j1} \le \cdots \le \alpha_{jj}$, with $\alpha_{jj} \le \lambda_j$, $j = 1, \ldots, n$, is given by

$$\begin{aligned}
\mathrm{FDR} = {} & \sum_{i \in I_0} \sum_{j=1}^{n} \frac{1}{j} \Pr\{P^{(-i)}_{j-1:n-1} \le \lambda_j, P^{(-i)}_{j:n-1} > \lambda_{j+1}, \ldots, P^{(-i)}_{n-1:n-1} > \lambda_n, P_i \le \alpha_{jj}\} \\
& + \sum_{i \in I_0} \sum_{j=2}^{n} \sum_{k=1}^{j-1} E\Big[ \Pr\{P^{(-i)}_{k:n-1} > \alpha_{j,k+1}, \ldots, P^{(-i)}_{j-2:n-1} > \alpha_{j,j-1}, \alpha_{jj} < P^{(-i)}_{j-1:n-1} \le \lambda_j, \\
& \qquad P^{(-i)}_{j:n-1} > \lambda_{j+1}, \ldots, P^{(-i)}_{n-1:n-1} > \lambda_n \mid P_i\} \left\{ \frac{I(P_i \le \alpha_{jk})}{k} - \frac{I(P_i \le \alpha_{j,k+1})}{k+1} \right\} \Big].
\end{aligned} \tag{2.5}$$

Remark 2.2. Lemma 2.1 generalizes the formula for the FDR given in Lemma 3.2 (with $r = 1$) of Sarkar (2002) from a single-stage stepup to a two-stage stepup procedure. It reduces to the formula for a single-stage stepup procedure with the critical values $0 = \lambda_0 < \lambda_1 \le \cdots \le \lambda_n < \lambda_{n+1} = 1$ when $\alpha_{jj} = \lambda_j$, $1 \le j \le n$, because the event $\alpha_{jj} < P^{(-i)}_{j-1:n-1} \le \lambda_j$ in the second term is then empty.

3. FDR-controlling two-stage stepup procedures under independence

We determine in this section the first-stage critical values $\lambda_j$, $j = 1, \ldots, n$, and the second-stage critical values $\alpha_{jk}$, $1 \le k \le j \le n$, in the above two-stage stepup procedure providing control of the FDR at $\alpha \in (0, 1)$, assuming that the $P_i$'s are independent and, when the null hypotheses are true, distributed as $U(0, 1)$.
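Before specific critical values are chosen, the generic two-stage rule of Definition 2.1 can be sketched as follows. This is only an illustration under stated assumptions: the function name `two_stage_stepup`, the constant first-stage values, the second-stage rule `toy_alpha` and the p-values are all hypothetical and are not taken from the paper.

```python
def two_stage_stepup(pvalues, lam, alpha):
    """Return (J, K(J)) for a two-stage stepup procedure in the sense of Definition 2.1.

    lam   -- list [lambda_1, ..., lambda_n] of first-stage critical values
    alpha -- function (j, k) -> alpha_{jk}, the second-stage critical values,
             assumed to satisfy alpha_{j1} <= ... <= alpha_{jj} <= lambda_j
    """
    ordered = sorted(pvalues)
    n = len(ordered)
    # Stage one: J = max{0 <= i <= n : P_(i:n) <= lambda_i}.
    J = 0
    for i in range(1, n + 1):
        if ordered[i - 1] <= lam[i - 1]:
            J = i
    # Stage two: K(J) = max{0 <= i <= J : P_(i:n) <= alpha_{Ji}}.
    K = 0
    for i in range(1, J + 1):
        if ordered[i - 1] <= alpha(J, i):
            K = i
    return J, K


# Hypothetical inputs, chosen only to exercise both stages.
pvals = [0.001, 0.012, 0.6, 0.7]
lam = [0.5, 0.5, 0.5, 0.5]


def toy_alpha(j, k):
    """Illustrative second-stage values, truncated at lambda_j."""
    return min(lam[j - 1], 0.01 * k)


J, K = two_stage_stepup(pvals, lam, toy_alpha)
print(J, K)
```

Passing `lambda j, k: lam[k - 1]` as the second-stage rule gives $\alpha_{jj} = \lambda_j$, in which case $K(J) = J$ and the procedure behaves like the single-stage stepup test with the $\lambda_j$ as critical values, as noted in Remark 2.1.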

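Theorem 3.1. Consider a two-stage stepup procedure with the second-stage critical values $\alpha_{jk}$ satisfying $\alpha_{jk} = \min\{\lambda_j, k\alpha_j\}$, for $k = 1, \ldots, j$, given the first-stage critical values $\lambda_j$ and some other constants $\alpha_j$. The FDR of this procedure under independence is given by

$$\mathrm{FDR} = \sum_{i \in I_0} \sum_{j=1}^{n} \min\left\{ \frac{\lambda_j}{j}, \alpha_j \right\} \Pr\{P^{(-i)}_{j-1:n-1} \le \lambda_j, P^{(-i)}_{j:n-1} > \lambda_{j+1}, \ldots, P^{(-i)}_{n-1:n-1} > \lambda_n\}. \tag{3.1}$$

Proof. The FDR of a two-stage stepup procedure under independence, as we see from Lemma 2.1, is

$$\begin{aligned}
\mathrm{FDR} = {} & \sum_{i \in I_0} \sum_{j=1}^{n} \frac{\alpha_{jj}}{j} \Pr\{P^{(-i)}_{j-1:n-1} \le \lambda_j, P^{(-i)}_{j:n-1} > \lambda_{j+1}, \ldots, P^{(-i)}_{n-1:n-1} > \lambda_n\} \\
& + \sum_{i \in I_0} \sum_{j=2}^{n} \sum_{k=1}^{j-1} \Pr\{P^{(-i)}_{k:n-1} > \alpha_{j,k+1}, \ldots, P^{(-i)}_{j-2:n-1} > \alpha_{j,j-1}, \alpha_{jj} < P^{(-i)}_{j-1:n-1} \le \lambda_j, \\
& \qquad P^{(-i)}_{j:n-1} > \lambda_{j+1}, \ldots, P^{(-i)}_{n-1:n-1} > \lambda_n\} \left\{ \frac{\alpha_{jk}}{k} - \frac{\alpha_{j,k+1}}{k+1} \right\}.
\end{aligned} \tag{3.2}$$

For every fixed $j = 2, \ldots, n$ in the above triple summation, consider the two possible situations: (i) $j\alpha_j \le \lambda_j$ and (ii) $j\alpha_j > \lambda_j$. In situation (i), since $k\alpha_j \le j\alpha_j \le \lambda_j$, we have $\alpha_{jk} = \min\{\lambda_j, k\alpha_j\} = k\alpha_j$ for all $k = 1, \ldots, j$, so that $\alpha_{jk}/k - \alpha_{j,k+1}/(k+1) = 0$. In situation (ii), we have $\lambda_j = \min\{\lambda_j, j\alpha_j\} = \alpha_{jj}$, whatever $k$ is, so that the event $\alpha_{jj} < P^{(-i)}_{j-1:n-1} \le \lambda_j$ is empty. In each of these situations, the summation over $k = 1$ to $j - 1$ for that particular $j$ is zero. In other words, the expression (3.2) reduces to the right-hand side of (3.1) if we choose the $\alpha_{jk}$ as stated in the theorem. Thus, the theorem is proved.

Remark 3.1. Theorem 3.1 with the $\lambda_j$ and $\alpha_j$ satisfying the inequality $\mathrm{FDR} \le \alpha$ provides a class of two-stage stepup FDR procedures under independence. The single-stage BH procedure belongs to this class. To see this, let $\lambda_j = j\alpha/n$ and choose any $k\alpha_j \ge j\alpha/n$, $k = 1, \ldots, j$. Since $\min\{\lambda_j/j, \alpha_j\} = \alpha/n$ in this case, the FDR is equal to $n_0\alpha/n$, as

$$\sum_{j=1}^{n} \Pr\{P^{(-i)}_{j-1:n-1} \le \lambda_j, P^{(-i)}_{j:n-1} > \lambda_{j+1}, \ldots, P^{(-i)}_{n-1:n-1} > \lambda_n\} = 1, \tag{3.3}$$

for any $\lambda_1 \le \cdots \le \lambda_n$. However, as the $\alpha_{jk}$ satisfy $\alpha_{jj} = j\alpha/n$, $1 \le j \le n$, this is no different from the BH procedure (see Remark 2.1).

Remark 3.2. To obtain a proper two-stage FDR procedure, first note that if we let $\alpha_j = \alpha(1 - \lambda_j)/(n - j + 1)$, $j = 1, \ldots, n$, then the FDR in (3.1) becomes less than or equal to

$$\begin{aligned}
& \alpha \sum_{i \in I_0} \sum_{j=1}^{n} \frac{1 - \lambda_j}{n - j + 1} \Pr\{P^{(-i)}_{j-1:n-1} \le \lambda_j, P^{(-i)}_{j:n-1} > \lambda_{j+1}, \ldots, P^{(-i)}_{n-1:n-1} > \lambda_n\} \\
& \quad = \alpha n_0 (1 - \lambda_n) + \alpha \sum_{i \in I_0} \sum_{j=1}^{n-1} \frac{1 - \lambda_j}{n - j + 1} \Pr\{P^{(-i)}_{j:n-1} > \lambda_{j+1}, \ldots, P^{(-i)}_{n-1:n-1} > \lambda_n\} \\
& \qquad - \alpha \sum_{i \in I_0} \sum_{j=2}^{n} \frac{1 - \lambda_j}{n - j + 1} \Pr\{P^{(-i)}_{j-1:n-1} > \lambda_j, \ldots, P^{(-i)}_{n-1:n-1} > \lambda_n\} \\
& \quad = \alpha n_0 (1 - \lambda_n) + \alpha \sum_{i \in I_0} \sum_{j=1}^{n-1} \Pr\{P^{(-i)}_{j:n-1} > \lambda_{j+1}, \ldots, P^{(-i)}_{n-1:n-1} > \lambda_n\} \left\{ \frac{1 - \lambda_j}{n - j + 1} - \frac{1 - \lambda_{j+1}}{n - j} \right\}.
\end{aligned} \tag{3.4}$$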

We choose the $\lambda_j$ as follows. For any fixed $1 \le m \le n$ and $0 < \lambda < 1$, let $1 - \lambda_j = (n - j + 1)(1 - \lambda)/n$ for $j = 1, \ldots, m$, and $\lambda_j = \lambda_m$ for $j = m + 1, \ldots, n$. For this choice of the $\lambda_j$, the expression (3.4) reduces to

$$\alpha n_0 (1 - \lambda_m) - \alpha (1 - \lambda_m) \sum_{i \in I_0} \sum_{j=m}^{n-1} \frac{\Pr\{P^{(-i)}_{j:n-1} > \lambda_m\}}{(n - j)(n - j + 1)}. \tag{3.5}$$

We now consider the following:

Mixture model: With $\omega_i \equiv \omega(H_i)$ defined to be 0 or 1 according as $H_i$ is true or false, let $(P_i, \omega_i)$, $i = 1, \ldots, n$, be iid with

$$\Pr\{P_i \le u \mid \omega_i\} = (1 - \omega_i) F_0(u) + \omega_i F_1(u), \quad \omega_i \sim \mathrm{Bernoulli}(1 - \pi_0), \tag{3.6}$$

where $F_0(u) = u$ and $F_1$ is stochastically smaller than $F_0$.

Under this model, the $P_i$'s are now iid with $F(u) = \pi_0 u + (1 - \pi_0) F_1(u)$. Therefore, (3.5) can be expressed as

$$\begin{aligned}
& \alpha n \pi_0 (1 - \lambda_m) - \alpha n \pi_0 (1 - \lambda_m) \sum_{j=m}^{n-1} \frac{\Pr\{P_{j:n-1} > \lambda_m\}}{(n - j)(n - j + 1)} \\
& \quad = \alpha n \pi_0 (1 - \lambda_m) - \alpha n \pi_0 (1 - \lambda_m) \sum_{j=1}^{n-m} \frac{\Pr\{P_{n-j:n-1} > \lambda_m\}}{j(j + 1)} \\
& \quad \le \alpha n \pi_0 (1 - \lambda_m) - \alpha n \pi_0 (1 - \lambda_m) \sum_{j=1}^{n-m} \frac{\Pr\{P_{n-m+1-j:n-m} > \lambda_m\}}{j(j + 1)} \\
& \quad = \alpha n \frac{\pi_0 (1 - \lambda_m)}{1 - F(\lambda_m)} \{1 - F(\lambda_m)\} \left\{ 1 - \sum_{j=1}^{n-m} \frac{\Pr\{P_{n-m+1-j:n-m} > \lambda_m\}}{j(j + 1)} \right\} \\
& \quad \le \frac{n \pi_0 \alpha}{n - m + 1} \left\{ 1 - \lambda_m^{n-m+1} \right\},
\end{aligned} \tag{3.7}$$

where $P_{1:n-m} \le \cdots \le P_{n-m:n-m}$ are the ordered components of a subset of $n - m$ of the original $n$ p-values. The first inequality in (3.7) follows from the fact that, since the $j$th ordered component of a set of $n$ values is no less than the $(j - 1)$th ordered component of any of its subsets of $n - 1$ values, we have $P_{n-j:n-1} \ge P_{n-j-1:n-2} \ge \cdots \ge P_{n-m+1-j:n-m}$. The second inequality follows, first, from the fact that

$$\frac{\pi_0 (1 - \lambda_m)}{1 - F(\lambda_m)} = \frac{\pi_0 (1 - \lambda_m)}{\pi_0 (1 - \lambda_m) + (1 - \pi_0)(1 - F_1(\lambda_m))},$$

and second, from knowing that

$$(n - m + 1) \{1 - F(\lambda_m)\} \left\{ 1 - \sum_{j=1}^{n-m} \frac{\Pr\{P_{n-m+1-j:n-m} > \lambda_m\}}{j(j + 1)} \right\} = \Pr\{P_{n-m+1:n-m+1} > \lambda_m\}, \tag{3.8}$$

from Sarkar (1998, p. 499), which is less than or equal to $\Pr_{F_0}\{P_{n-m+1:n-m+1} > \lambda_m\} = 1 - \lambda_m^{n-m+1}$, as the $P_i$'s are stochastically smaller under $F$ than under $F_0$. If we choose $0 < \lambda < 1$ and $1 \le m \le n$ in such a way that

$$\lambda_m = 1 - \frac{n - m + 1}{n} (1 - \lambda) \ge \left( \frac{m - 1}{n} \right)^{1/(n-m+1)},$$

that is,

$$\lambda \ge \frac{((m - 1)/n)^{1/(n-m+1)} - (m - 1)/n}{1 - (m - 1)/n}, \tag{3.9}$$

then the FDR in (3.7) is controlled at $\pi_0 \alpha$. Thus, we have the following theorem.

Theorem 3.2. The following class of two-stage stepup procedures controls the FDR at $\alpha$ under the mixture model (3.6):

(i) At stage one, consider the set of constants $1 - \lambda_j = ((n - j + 1)/n)(1 - \lambda)$, $j = 1, \ldots, m$, and $\lambda_j = \lambda_m$, $j = m + 1, \ldots, n$, for any fixed $0 < \lambda < 1$ and $1 \le m \le n$ satisfying (3.9). Find $j = \max\{0 \le i \le n : P_{i:n} \le \lambda_i\}$.
(ii) Accept $H_i$ for all $i > j$ and go to the second stage.
(iii) At stage two, determine the constants $0 = \alpha_{j0} \le \alpha_{j1} \le \cdots \le \alpha_{jj}$ satisfying $\alpha_{jl} = \min\{\lambda_j, l\alpha(1 - \lambda_j)/(n - j + 1)\}$, $l = 0, 1, \ldots, j$, and find $k = \max\{0 \le l \le j : P_{l:n} \le \alpha_{jl}\}$.
(iv) Reject $H_i$ for all $i \le k$ and accept the rest.

Remark 3.3. The FDR of a procedure in Theorem 3.2 under the mixture model, as we see from (3.1), is given by

$$\mathrm{FDR} = \sum_{i \in I_0} \sum_{j=1}^{n} \min\left\{ \frac{\lambda_j}{j}, \frac{\alpha(1 - \lambda_j)}{n - j + 1} \right\} \Pr\{P^{(-i)}_{j-1:n-1} \le \lambda_j, P^{(-i)}_{j:n-1} > \lambda_{j+1}, \ldots, P^{(-i)}_{n-1:n-1} > \lambda_n\}. \tag{3.10}$$

Theorem 3.2 provides a newer class of FDR procedures under the mixture model modifying the BH procedure in the spirit of Benjamini et al. (2006) and Storey et al. (2004). Each modifies the critical values of the BH procedure, from $\alpha_{jk} = k\alpha/n$ to those satisfying $\alpha_{jk} = \min\{\lambda_j, k\alpha/\hat{n}_0(j)\}$ at the second stage, using an estimate $\hat{n}_0(j)$ of $n_0$ based on the number of hypotheses accepted at the first stage. The rationale behind such a modification, as given in Benjamini and Hochberg (2000), Benjamini et al. (2006) and Storey et al. (2004), is that if in the BH procedure $\alpha$ is replaced by $\min\{1, n/n_0\}\alpha$, the FDR, at least under independence, can be controlled less conservatively. The idea of estimating $n_0$ based on the number of insignificant hypotheses, first outlined in Schweder and Spjøtvoll (1982), was used in the aforementioned papers and Sarkar (2006).
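The four steps of Theorem 3.2 can be sketched in Python as follows. The sketch reads the first-stage constants as $1 - \lambda_j = (n - j + 1)(1 - \lambda)/n$ for $j \le m$, which gives $\lambda_1 = \lambda$ and nondecreasing $\lambda_j$, and it uses the right-hand side of (3.9) as the lower bound on $\lambda$. The function names and the sample p-values are assumptions made for illustration only.

```python
def first_stage_constants(n, m, lam):
    """lambda_j with 1 - lambda_j = (n - j + 1)(1 - lam)/n for j <= m, lambda_j = lambda_m for j > m."""
    head = [1 - (n - j + 1) * (1 - lam) / n for j in range(1, m + 1)]
    return head + [head[-1]] * (n - m)


def lambda_lower_bound(n, m):
    """Right-hand side of condition (3.9)."""
    a = (m - 1) / n
    return (a ** (1 / (n - m + 1)) - a) / (1 - a)


def theorem_3_2_rejections(pvalues, alpha, m, lam):
    """Number of rejections k from steps (i)-(iv) of Theorem 3.2."""
    n = len(pvalues)
    if lam < lambda_lower_bound(n, m):
        raise ValueError("lam violates condition (3.9) for this n and m")
    lams = first_stage_constants(n, m, lam)
    ordered = sorted(pvalues)
    # Step (i): j = max{0 <= i <= n : P_(i:n) <= lambda_i}.
    j = 0
    for i in range(1, n + 1):
        if ordered[i - 1] <= lams[i - 1]:
            j = i
    if j == 0:
        return 0  # every hypothesis is accepted at stage one
    # Steps (iii)-(iv): stepup test on the j smallest p-values with
    # alpha_{jl} = min{lambda_j, l * alpha * (1 - lambda_j) / (n - j + 1)}.
    lam_j = lams[j - 1]
    k = 0
    for l in range(1, j + 1):
        if ordered[l - 1] <= min(lam_j, l * alpha * (1 - lam_j) / (n - j + 1)):
            k = l
    return k


# Illustrative call with m = 1, the STS-type member of the class (made-up p-values).
k = theorem_3_2_rejections([0.001, 0.012, 0.6, 0.7], alpha=0.05, m=1, lam=0.5)
print(k, lambda_lower_bound(1000, 10))
```

For $m = 1$ and $m = n$ the bound in (3.9) is zero, so any $0 < \lambda < 1$ is admissible. For intermediate $m$ the bound can be close to one. With $n = 1000$ and $m = 10$, for example, the function above returns a value of roughly 0.995.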
Notice that we have chosen the estimate $\hat{n}_0(j) = (n - j + 1)/(1 - \lambda_j)$, with $n - j + 1$ in the numerator, and restricted the second-stage critical values to the region found significant at the first stage, as in Storey et al. (2004) and Sarkar (2006). This is in contrast with Benjamini et al. (2006), who chose $n - j$, the number of hypotheses accepted at the first stage, and allowed the second-stage critical values to fall in the insignificant region found at the first stage. Also notice that in the above theorem $\alpha_{jk} \le \lambda_k$ for all $j \le k \le n$. So, according to Proposition 2.1, it does not make any difference whether or not we make any decision at stage one in the above procedure unless all the hypotheses are accepted. In other words, the procedures in Theorem 3.2 can be described equivalently as in the following corollary.

Corollary 3.1. The following class of two-stage stepup procedures controls the FDR at $\alpha$ under the mixture model: At stage one, consider the set of constants as in Theorem 3.2 and find $j = \max\{0 \le i \le n : P_{i:n} \le \lambda_i\}$. If $j = 0$, accept all hypotheses and stop; otherwise, go to stage two and apply the stepup procedure to all the hypotheses using the modified BH critical values $\alpha_{jk} = \min\{\lambda_j, k\alpha(1 - \lambda_j)/(n - j + 1)\}$, $k = 1, \ldots, n$.

The above theorem or its corollary provides a number of two-stage modified BH procedures, for different choices of $m$, each controlling the FDR under the mixture model. It is important to note, however, that for $m = 1$ and $m = n$, we do not need to assume the mixture model; independence under any fixed, yet unknown, configuration of true and false null hypotheses will suffice. More precisely, when $m = n$, since $1 - \lambda_j = (n - j + 1)(1 - \lambda)/n$, $j = 1, \ldots, n$, the FDR in (3.4) equals $n_0 \alpha (1 - \lambda)/n \le \alpha$, for any $0 < \lambda < 1$. When $m = 1$, (3.4) is equal to

$$\begin{aligned}
& \alpha n_0 (1 - \lambda) - \alpha (1 - \lambda) \sum_{i \in I_0} \sum_{j=1}^{n-1} \frac{\Pr\{P^{(-i)}_{j:n-1} > \lambda\}}{(n - j)(n - j + 1)} \\
& \quad = \frac{n_0}{n} \alpha (1 - \lambda) + \alpha (1 - \lambda) \sum_{i \in I_0} \sum_{j=1}^{n-1} \frac{\Pr\{P^{(-i)}_{j:n-1} \le \lambda\}}{(n - j)(n - j + 1)} \\
& \quad \le \alpha (1 - \lambda) + \alpha (1 - \lambda) \sum_{i=1}^{n} \sum_{j=1}^{n-1} \frac{\Pr\{P^{(-i)}_{j:n-1} \le \lambda\}}{(n - j)(n - j + 1)} \\
& \quad = \alpha \Pr\{P_{n:n} > \lambda\},
\end{aligned} \tag{3.11}$$

which is less than or equal to $\alpha$ for any $0 < \lambda < 1$. The last line in (3.11) follows from Sarkar (1998).

In fact, our procedure with $m = 1$, that is, the two-stage procedure with the $\alpha_{jk}$ given by

$$\alpha_{jk} = \min\left\{ \lambda, \frac{k\alpha(1 - \lambda)}{n - j + 1} \right\}, \quad k = 1, \ldots, n,$$

for any arbitrary $0 < \lambda < 1$, is the same as the one developed by Storey et al. (2004). The FDR of this procedure under independence, as seen from (3.1), is given by

$$\mathrm{FDR} = \sum_{i \in I_0} \sum_{j=1}^{n} \min\left\{ \frac{\lambda}{j}, \frac{\alpha(1 - \lambda)}{n - j + 1} \right\} \Pr\{P^{(-i)}_{j-1:n-1} \le \lambda < P^{(-i)}_{j:n-1}\}, \tag{3.12}$$

which is much more explicit than the expression given in Storey et al. (2004), and we have given here an alternative proof that it controls the FDR. Some other interesting results related to this procedure are worth noting. For instance,

$$\mathrm{FDR} \le \sum_{i \in I_0} \sum_{j=1}^{n} \frac{\lambda}{j} \Pr\{P^{(-i)}_{j-1:n-1} \le \lambda < P^{(-i)}_{j:n-1}\} = \mathrm{FDR}_1, \tag{3.13}$$

with the equality holding when $\lambda \le \alpha/(n + \alpha)$, where $\mathrm{FDR}_1$ is the FDR of the first-stage single-step procedure. Let $\mathrm{FNR}_1$ and $\mathrm{pFNR}_1$, respectively, be the FNR and pFNR of this single-step procedure. Then, we also have

$$\begin{aligned}
\mathrm{FDR} & \le \alpha \sum_{i \in I_0} \sum_{j=1}^{n} \frac{1 - \lambda}{n - j + 1} \Pr\{P^{(-i)}_{j-1:n-1} \le \lambda < P^{(-i)}_{j:n-1}\} \\
& = \alpha \Pr\{P_{n:n} > \lambda\} - \alpha \sum_{i \in I \setminus I_0} \sum_{j=1}^{n} \frac{1}{n - j + 1} \Pr\{P^{(-i)}_{j-1:n-1} \le \lambda < P^{(-i)}_{j:n-1}, P_i > \lambda\} \\
& = \alpha [\Pr\{P_{n:n} > \lambda\} - \mathrm{FNR}_1] = \alpha \Pr\{P_{n:n} > \lambda\} (1 - \mathrm{pFNR}_1),
\end{aligned} \tag{3.14}$$

with the equality holding when $\lambda \ge n\alpha/(1 + n\alpha)$. See Sarkar (2006) for formulas of the FDR and FNR of a single-step procedure. Again, the inequality (3.14) provides an alternative proof of the FDR-controlling property of this procedure under independence. Storey et al. (2004) used the inequality $\mathrm{FDR} \le \alpha \Pr\{\max_{1 \le i \le n_0} P_i > \lambda\}$ to prove this property.

4. Numerical study

Given 1000 independent random variables $X_i \sim N(\mu_i, 1)$, $i = 1, \ldots, 1000$, we consider simultaneous testing of the null hypotheses $H_i : \mu_i = 0$, $i = 1, \ldots, 1000$, against their respective alternatives $K_i : \mu_i > 0$, $i = 1, \ldots, 1000$. Among several possible choices within the proposed class of FDR procedures, we consider the one with $m = 10$ and a value of $\lambda$ satisfying (3.9), and numerically compare it with the STS procedure ($m = 1$) with $\lambda = 0.5$, the adaptive BH procedure and the original BH procedure. Notice that the choice of $\lambda$ in our procedure is dictated by the constraint in (3.9) given $m = 10$. We examine the amount of improvement our procedure (labeled Sarkar) offers over the BH procedure relative to the STS and the adaptive BH procedures. We also compare these procedures in terms of the average power, defined as the expected proportion of false null hypotheses that are rejected (Dudoit et al., 2003; Shaffer, 2002).

Having generated values of $n = 1000$ independent random variables $X_i \sim N(\mu_i, 1)$, $i = 1, \ldots, 1000$, we perform 1000 Z tests of $\mu = 0$ against $\mu > 0$ using each of the four procedures at FDR level $\alpha$. We randomly identify each hypothesis to be true ($\mu = 0$) or false ($\mu = \delta$), for some fixed $\delta > 0$, according to a Bernoulli model with $\pi_0$ as the probability of a null hypothesis being true. The values of $Q = V/R$, the ratio of the number of rejected true null hypotheses ($V$) to the total number of rejected null hypotheses ($R$), and $S/n_1$, the ratio of the number of rejected false null hypotheses ($S$) to the total number of false null hypotheses ($n_1$), are then calculated for each procedure. Repeating this 30,000 times, we estimate the FDR by averaging the 30,000 $Q$ values and the average power by averaging the 30,000 $S/n_1$ values. The value of $Q$ is taken to be zero if $R = 0$ in a particular repetition. Similarly, since $n_1$ is random in a mixture model and can be zero with positive (even though extremely small) probability, we take $S/n_1 = 0$ if $n_1 = 0$.

Figs. 1 and 2 compare the FDR and the average power, respectively, of these procedures for $\delta = 1$ and 2. As seen from Fig. 1, for improving the FDR control of the BH procedure, our procedure seems to have the best performance when the mean under a false null hypothesis departs only slightly from its null value, whereas the STS procedure seems to work best when this departure is large. A similar picture is seen in Fig. 2, which compares the average power, even though, for each of these procedures, the power of detecting small departures of the means from their null values is generally low and can be even lower than $\alpha$ unless many of the null hypotheses are false.

Fig. 1. Comparison of different procedures in terms of the FDR. (Curves: Adaptive BH, BH, Sarkar, $m = 1$; one panel per value of $\delta$; vertical axis: FDR; horizontal axis: $\pi_0$.)
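The simulation design described above can be sketched for the BH procedure alone as follows. This is a scaled-down illustration, not the study itself: the defaults `n=100`, `reps=300`, `alpha=0.05`, `pi0=0.5` and the seed are assumptions chosen so the sketch runs quickly, and the function names are hypothetical.

```python
import random
from statistics import NormalDist


def bh_rejections(pvalues, alpha):
    """Set of indices rejected by the BH stepup procedure at level alpha."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / n:
            k = rank
    return set(order[:k])


def simulate_bh(n=100, pi0=0.5, delta=2.0, alpha=0.05, reps=300, seed=0):
    """Monte Carlo estimates of (FDR, average power) for BH under the mixture model."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    q_sum = 0.0
    power_sum = 0.0
    for _ in range(reps):
        # Each hypothesis is false with probability 1 - pi0 (Bernoulli mixture).
        is_false = [rng.random() >= pi0 for _ in range(n)]
        x = [rng.gauss(delta if f else 0.0, 1.0) for f in is_false]
        # One-sided Z test of mu = 0 against mu > 0.
        pvals = [1.0 - std_normal.cdf(v) for v in x]
        rejected = bh_rejections(pvals, alpha)
        v = sum(1 for i in rejected if not is_false[i])  # rejected true nulls
        s = len(rejected) - v                            # rejected false nulls
        n1 = sum(is_false)
        q_sum += v / len(rejected) if rejected else 0.0  # Q = 0 when R = 0
        power_sum += s / n1 if n1 > 0 else 0.0           # S/n1 = 0 when n1 = 0
    return q_sum / reps, power_sum / reps


fdr_hat, power_hat = simulate_bh()
print(fdr_hat, power_hat)
```

Under independence the BH procedure has FDR equal to $(n_0/n)\alpha$ for a fixed configuration, so the estimate should sit near $\pi_0 \alpha$ up to Monte Carlo error. Swapping `bh_rejections` for another rejection rule gives the corresponding comparison.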

Fig. 2. Comparison of different procedures in terms of the average power. (Curves: Adaptive BH, BH, Sarkar, $m = 1$; one panel per value of $\delta$; vertical axis: AvePower; horizontal axis: $\pi$.)

5. FDR-controlling two-stage stepup procedures under dependence

In this section, we discuss how to determine critical values of a two-stage stepup procedure providing control of the FDR under any form of dependence among the test statistics. To this end, we first have the following lemma.

Lemma 5.1. The FDR of a two-stage stepup procedure satisfies the following inequality:

$$\mathrm{FDR} \le \sum_{i \in I_0} \sum_{j=1}^{n} \sum_{k=1}^{j} \frac{\alpha_{jk} - \alpha_{j,k-1}}{k}. \tag{5.1}$$

Proof. Note that in the expression (2.1) for the FDR of a two-stage stepup procedure,

$$\begin{aligned}
\sum_{k=1}^{j} \frac{1}{k} \Pr\{J = j, K(j) = k, P_i \le \alpha_{jk}\}
& = \sum_{k=1}^{j} \frac{1}{k} \sum_{l=1}^{k} \Pr\{J = j, K(j) = k, \alpha_{j,l-1} < P_i \le \alpha_{jl}\} \\
& = \sum_{l=1}^{j} \sum_{k=l}^{j} \frac{1}{k} \Pr\{J = j, K(j) = k, \alpha_{j,l-1} < P_i \le \alpha_{jl}\} \\
& \le \sum_{l=1}^{j} \frac{1}{l} \sum_{k=l}^{j} \Pr\{J = j, K(j) = k, \alpha_{j,l-1} < P_i \le \alpha_{jl}\} \\
& \le \sum_{l=1}^{j} \frac{1}{l} \Pr\{J = j, \alpha_{j,l-1} < P_i \le \alpha_{jl}\} \\
& \le \sum_{l=1}^{j} \frac{\alpha_{jl} - \alpha_{j,l-1}}{l}.
\end{aligned} \tag{5.2}$$

Using (5.2) in (2.1), we get the inequality (5.1).

Let us choose the $\alpha_{jk}$ satisfying

$$\alpha_{jk} = \min\left\{ \lambda_j, \frac{k\alpha_j}{\sum_{l=1}^{j} (1/l)} \right\}, \quad 1 \le k \le j \le n, \tag{5.3}$$

for some $\alpha_j$, modifying the BY critical values at the second stage using the number of hypotheses accepted (which is $n - j$) at the first stage and truncating each at $\lambda_j$, as in the case of modifying the BH procedure (Section 3). Then, we have from (5.1) that

$$\mathrm{FDR} \le n_0 \sum_{j=1}^{n} \alpha_j,$$

which can be made less than or equal to $\alpha$ by choosing the $\alpha_j$ in several different ways. For instance, one might choose the $\alpha_j$ as follows: $\alpha_j = \alpha / \{n(n - j + 1) \sum_{k=1}^{n} (1/k)\}$. Thus, we have the following as a two-stage version of the BY procedure:

Procedure 5.1. At stage one, consider any set of constants $0 = \lambda_0 < \lambda_1 \le \cdots \le \lambda_n < \lambda_{n+1} = 1$. At stage two, consider the critical values $\alpha_{jk}$ satisfying

$$\alpha_{jk} = \min\left\{ \lambda_j, \frac{k\alpha}{n(n - j + 1) \sum_{l=1}^{j} (1/l) \sum_{l=1}^{n} (1/l)} \right\}, \quad k = 1, \ldots, n. \tag{5.4}$$

Considering Lemma 5.1 for a single-stage procedure, we notice that one can develop a single-stage procedure different from the BY procedure that will also control the FDR under any form of dependence among the p-values. We present this single-stage procedure in the following lemma before constructing a two-stage version of it.

Lemma 5.2. Consider a single-stage stepup procedure with the critical values $\alpha_k$ satisfying $\alpha_k = k(k + 1)\alpha / (2n^2)$. Its FDR is less than or equal to $n_0 \alpha / n$.

Proof. From Lemma 5.1, we see that the FDR of this single-stage stepup procedure satisfies

$$\mathrm{FDR} \le \sum_{i \in I_0} \sum_{k=1}^{n} \frac{\alpha_k - \alpha_{k-1}}{k} = n_0 \frac{\alpha}{2n^2} \sum_{k=1}^{n} \frac{1}{k} [k(k + 1) - (k - 1)k] = \frac{n_0}{n} \alpha. \tag{5.5}$$

The following is a two-stage version of the alternative single-stage procedure in Lemma 5.2.

Procedure 5.2. At stage one, consider any set of constants $0 = \lambda_0 < \lambda_1 \le \cdots \le \lambda_n < \lambda_{n+1} = 1$. At stage two, consider the critical values $\alpha_{jk}$ satisfying

$$\alpha_{jk} = \min\left\{ \lambda_j, \frac{k(k + 1)\alpha}{2nj(n - j + 1) \sum_{l=1}^{n} (1/l)} \right\}, \quad k = 1, \ldots, n. \tag{5.6}$$
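To make the comparison between the two single-stage procedures concrete, the sketch below tabulates the BY critical values from (1.2) and the critical values of Lemma 5.2, and evaluates the per-hypothesis bound $\sum_k (\alpha_k - \alpha_{k-1})/k$ from Lemma 5.1 for each. The function names and the choice $n = 10$, $\alpha = 0.05$ are illustrative assumptions.

```python
def by_critical_values(n, alpha):
    """BY critical values alpha_i = i * alpha / (n * sum_{j=1}^n 1/j), as in (1.2)."""
    harmonic = sum(1.0 / j for j in range(1, n + 1))
    return [i * alpha / (n * harmonic) for i in range(1, n + 1)]


def new_critical_values(n, alpha):
    """Critical values alpha_k = k (k + 1) alpha / (2 n^2) from Lemma 5.2."""
    return [k * (k + 1) * alpha / (2 * n * n) for k in range(1, n + 1)]


def lemma_5_1_bound_per_null(critical_values):
    """sum_k (alpha_k - alpha_{k-1}) / k with alpha_0 = 0 (single-stage case of (5.1))."""
    total, previous = 0.0, 0.0
    for k, crit in enumerate(critical_values, start=1):
        total += (crit - previous) / k
        previous = crit
    return total


n, alpha = 10, 0.05
by_vals = by_critical_values(n, alpha)
new_vals = new_critical_values(n, alpha)
print(lemma_5_1_bound_per_null(by_vals), lemma_5_1_bound_per_null(new_vals))
```

Both sets of critical values give the same bound $\alpha/n$ per true null hypothesis, hence $n_0 \alpha / n$ overall, but they distribute it differently. The Lemma 5.2 values are smaller than the BY values for small $k$ and larger for large $k$ (here from $k = 6$ onwards), so neither procedure dominates the other.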

Fig. 3. Comparison of the BY and the new procedures in terms of the FDR. (Curves: BY, Sarkar; one panel per value of $\rho$; vertical axis: FDR; horizontal axis: $n_0$.)

Before numerically investigating how Procedures 5.1 and 5.2 perform as two-stage modifications of their respective single-stage versions, we assess how the new single-stage procedure proposed in Lemma 5.2 compares with the well-known BY procedure. Fig. 3 compares the simulated values of the FDR for the BY procedure and the new procedure (labeled Sarkar) in simultaneous testing of the means of 1000 normal random variables, each mean being tested at 0 against a positive value using the Z test. The correlations are assumed to be common, equal to $\rho$, and each variance is 1. The simulated values are based on 20,000 repetitions and the alternative mean in each test is chosen to be 2. Interestingly, the new procedure seems to perform much better than the BY procedure when the p-values are highly dependent and the number of true null hypotheses is neither extremely small nor extremely large. However, when we brought Procedures 5.1 and 5.2 into our comparisons, we noticed, unfortunately, that neither of these two-stage procedures provides any significant improvement over the BY procedure or its alternative in terms of FDR control. Thus, it appears that the way we estimate $n_0$ to improve the BH procedure under independence may not work under dependence for improving the BY procedure or its alternative.

6. Concluding remarks

Starting with a general two-stage stepup procedure, we have made an attempt in this article to present a wider class of FDR procedures. In particular, we have offered a theory of modifying the BH procedure through a two-stage approach incorporating an estimate of the number of true null hypotheses and maintaining FDR control, which is more general than that of Storey et al. (2004).
Although our general theory relies on the iid mixture model for the p-values, for the STS procedure it not only reduces to one requiring only the independence condition on the p-values, as they have assumed, but also provides an alternative proof of its FDR control property. We have chosen only one procedure from our proposed class and shown numerically that it works quite well in terms of improving the FDR control of the BH procedure. We believe that there are other procedures in this class that would also perform well, and more studies are needed to investigate this. Another positive contribution of this article is the introduction of a new single-stage stepup procedure that, in some instances, might work better than the commonly used BY procedure in terms of controlling the FDR under any form of dependence among the p-values.

Acknowledgment

The author is grateful to an Associate Editor and two referees for making useful and constructive comments, and to Wenge Guo for giving some additional comments and pointing out an error in an earlier version of the manuscript, all of which led to a much improved form of the paper. He also thanks Zijiang Yang for doing the numerical calculations.

Appendix

Proof of Proposition 2.1. The two events $\{J = j, K^*(j) = k\}$ and $\{J = j, K(j) = k\}$ can be seen to be identical if $\alpha_{jk} \le \lambda_k$, for $j \le k \le n$. This is because, for $1 \le k < j \le n$,

$$\begin{aligned}
\{J = j, K^*(j) = k\}
& = \{P_{j:n} \le \lambda_j, P_{j+1:n} > \lambda_{j+1}, \ldots, P_{n:n} > \lambda_n\} \cap \{P_{k:n} \le \alpha_{jk}, P_{k+1:n} > \alpha_{j,k+1}, \ldots, P_{n:n} > \alpha_{jn}\} \\
& = \{P_{k:n} \le \alpha_{jk}, P_{k+1:n} > \alpha_{j,k+1}, \ldots, \alpha_{jj} < P_{j:n} \le \lambda_j, P_{j+1:n} > \lambda_{j+1}, \ldots, P_{n:n} > \lambda_n\},
\end{aligned} \tag{A.1}$$

which is the same as $\{J = j, K(j) = k\}$. For $k = j \le n$,

$$\{J = j, K^*(j) = j\} = \{P_{j:n} \le \alpha_{jj}, P_{j+1:n} > \lambda_{j+1}, \ldots, P_{n:n} > \lambda_n\} = \{J = j, K(j) = j\}. \tag{A.2}$$

And, for $j < k \le n$, both of these events are null. This proves the proposition.

References

Benjamini, Y., Hochberg, Y., 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B 57.
Benjamini, Y., Hochberg, Y., 2000. On the adaptive control of the false discovery rate in multiple testing with independent statistics. J. Educ. Behav. Statist. 25.
Benjamini, Y., Yekutieli, D., 2001. The control of the false discovery rate in multiple testing under dependence. Ann. Statist. 29.
Benjamini, Y., Krieger, A.M., Yekutieli, D., 2006. Adaptive linear step-up procedures that control the false discovery rate. Biometrika 93.
Dudoit, S., Shaffer, J.P., Boldrick, J.C., 2003. Multiple hypothesis testing in microarray experiments. Statist. Sci. 18.
Finner, H., Roters, M., 2001. On the false discovery rate and expected number of type I errors. Biometrical J. 43.
Genovese, C., Wasserman, L., 2002. Operating characteristics and extensions of the false discovery rate procedure. J. Roy. Statist. Soc. Ser. B 64.
Hochberg, Y., Benjamini, Y., 1990. More powerful procedures for multiple significance testing. Statist. Med. 9.
Sarkar, S.K., 1998. Some probability inequalities for ordered MTP2 random variables: a proof of the Simes conjecture. Ann. Statist. 26.
Sarkar, S.K., 2002. Some results on false discovery rate in stepwise multiple testing procedures. Ann. Statist. 30.
Sarkar, S.K., 2006. False discovery and false non-discovery rates in single-step multiple testing procedures. Ann. Statist. 34.
Schweder, T., Spjøtvoll, E., 1982. Plots of p-values to evaluate many tests simultaneously. Biometrika 69.
Shaffer, J.P., 2002. Multiplicity, directional (type III) errors, and the null hypothesis. Psychol. Methods 7.
Storey, J.D., 2002. A direct approach to false discovery rates. J. Roy. Statist. Soc. Ser. B 64.
Storey, J.D., Taylor, J.E., Siegmund, D., 2004. Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach. J. Roy. Statist. Soc. Ser. B 66.
Zhang, Z., 2005. Multiple hypothesis testing for finite and infinite number of hypotheses. Ph.D. Thesis, Case Western Reserve University.


More information

Multiple hypothesis testing using the excess discovery count and alpha-investing rules

Multiple hypothesis testing using the excess discovery count and alpha-investing rules Multiple hypothesis testing using the excess discovery count and alpha-investing rules Dean P. Foster and Robert A. Stine Department of Statistics The Wharton School of the University of Pennsylvania Philadelphia,

More information

MULTISTAGE AND MIXTURE PARALLEL GATEKEEPING PROCEDURES IN CLINICAL TRIALS

MULTISTAGE AND MIXTURE PARALLEL GATEKEEPING PROCEDURES IN CLINICAL TRIALS Journal of Biopharmaceutical Statistics, 21: 726 747, 2011 Copyright Taylor & Francis Group, LLC ISSN: 1054-3406 print/1520-5711 online DOI: 10.1080/10543406.2011.551333 MULTISTAGE AND MIXTURE PARALLEL

More information

Post-Selection Inference

Post-Selection Inference Classical Inference start end start Post-Selection Inference selected end model data inference data selection model data inference Post-Selection Inference Todd Kuffner Washington University in St. Louis

More information

A Sequential Bayesian Approach with Applications to Circadian Rhythm Microarray Gene Expression Data

A Sequential Bayesian Approach with Applications to Circadian Rhythm Microarray Gene Expression Data A Sequential Bayesian Approach with Applications to Circadian Rhythm Microarray Gene Expression Data Faming Liang, Chuanhai Liu, and Naisyin Wang Texas A&M University Multiple Hypothesis Testing Introduction

More information

Tools and topics for microarray analysis

Tools and topics for microarray analysis Tools and topics for microarray analysis USSES Conference, Blowing Rock, North Carolina, June, 2005 Jason A. Osborne, osborne@stat.ncsu.edu Department of Statistics, North Carolina State University 1 Outline

More information

New Approaches to False Discovery Control

New Approaches to False Discovery Control New Approaches to False Discovery Control Christopher R. Genovese Department of Statistics Carnegie Mellon University http://www.stat.cmu.edu/ ~ genovese/ Larry Wasserman Department of Statistics Carnegie

More information

Statistical Applications in Genetics and Molecular Biology

Statistical Applications in Genetics and Molecular Biology Statistical Applications in Genetics and Molecular Biology Volume 3, Issue 1 2004 Article 13 Multiple Testing. Part I. Single-Step Procedures for Control of General Type I Error Rates Sandrine Dudoit Mark

More information

High-throughput Testing

High-throughput Testing High-throughput Testing Noah Simon and Richard Simon July 2016 1 / 29 Testing vs Prediction On each of n patients measure y i - single binary outcome (eg. progression after a year, PCR) x i - p-vector

More information

Adaptive Filtering Multiple Testing Procedures for Partial Conjunction Hypotheses

Adaptive Filtering Multiple Testing Procedures for Partial Conjunction Hypotheses Adaptive Filtering Multiple Testing Procedures for Partial Conjunction Hypotheses arxiv:1610.03330v1 [stat.me] 11 Oct 2016 Jingshu Wang, Chiara Sabatti, Art B. Owen Department of Statistics, Stanford University

More information

Sample Size Estimation for Studies of High-Dimensional Data

Sample Size Estimation for Studies of High-Dimensional Data Sample Size Estimation for Studies of High-Dimensional Data James J. Chen, Ph.D. National Center for Toxicological Research Food and Drug Administration June 3, 2009 China Medical University Taichung,

More information

A note on profile likelihood for exponential tilt mixture models

A note on profile likelihood for exponential tilt mixture models Biometrika (2009), 96, 1,pp. 229 236 C 2009 Biometrika Trust Printed in Great Britain doi: 10.1093/biomet/asn059 Advance Access publication 22 January 2009 A note on profile likelihood for exponential

More information

Journal Club: Higher Criticism

Journal Club: Higher Criticism Journal Club: Higher Criticism David Donoho (2002): Higher Criticism for Heterogeneous Mixtures, Technical Report No. 2002-12, Dept. of Statistics, Stanford University. Introduction John Tukey (1976):

More information

FDR and ROC: Similarities, Assumptions, and Decisions

FDR and ROC: Similarities, Assumptions, and Decisions EDITORIALS 8 FDR and ROC: Similarities, Assumptions, and Decisions. Why FDR and ROC? It is a privilege to have been asked to introduce this collection of papers appearing in Statistica Sinica. The papers

More information

Control of Generalized Error Rates in Multiple Testing

Control of Generalized Error Rates in Multiple Testing Institute for Empirical Research in Economics University of Zurich Working Paper Series ISSN 1424-0459 Working Paper No. 245 Control of Generalized Error Rates in Multiple Testing Joseph P. Romano and

More information