Procedures controlling generalized false discovery rate

Size: px
Start display at page:

Download "Procedures controlling generalized false discovery rate"

Transcription

1 rocedures controlling generalized false discovery rate By SANAT K. SARKAR Department of Statistics, Temple University, hiladelphia, A 922, U.S.A. sanat@temple.edu AND WENGE GUO Department of Environmental Health, University of Cincinnati, Cincinnati, OH , U.S.A. guowg@ .uc.edu November 29, 2006 Summary rocedures controlling error rates measuring at least false rejections, instead of at least one, can potentially increase the ability of a procedure to detect false null hypotheses in situations where one sees to control or more false rejections having tolerated a few of them. The -FWER, which is the probability of at least false rejections and generalizes the usual familywise error rate (FWER), is such an error rate that is recently introduced in the literature and procedures controlling it have been proposed. This paper considers an alternative and less conservative notion of error rate, the -FDR, which is the expected proportion of or more false rejections among all rejections and generalizes the usual notion of false discovery rate (FDR). rocedures with the -FDR control dominating the Benjamini-Hochberg stepup FDR procedure and its stepdown analog under independence or positive dependence and the Benjamini-Yeutieli stepup FDR procedure under any form of dependence have been constructed. Some ey words: Multiple hypothesis testing; Generalized FDR; Stepwise procedures; ositive regression dependence on subset; Clumpy dependence; Arbitrary dependence; Average power; Gene expressions.

2 . Introduction We consider in this article the general problem of simultaneously testing of n null hypotheses H i, i =,..., n, against their respective alternatives, using tests that are available for these individual hypotheses. rocedures developed for dealing with this problem are traditionally based on the idea of controlling the familywise error rate (FWER), the probability of rejecting at least one true null hypothesis [Hochberg & Tamhane (987)]. However, quite often, especially when large number of hypotheses are simultaneously tested, the notion of FWER turns out to be too stringent, allowing little chance to detect many false null hypotheses. Therefore, researchers have focused in the last decade on defining alternative less stringent error rates and developing methods that control them. The false discovery rate (FDR), the expected proportion of falsely rejected null hypotheses, due to Benjamini & Hochberg (995), is the first of these alternative error rates that has received considerable attention [Benjamini & Yeutieli (200,2005), Genovese & Wasserman (2002, 2004), Sarar (2002, 2004, 2006a, b), Storey (2002, 2003) and Storey et al. (2004)]. Recently, the ideas of controlling the probabilities of falsely rejecting at least null hypotheses, which is the -FWER, and the false discovery proportion (FD) exceeding a certain threshold γ [0, ) have been introduced as alternatives to the FWER and methods controlling these new error rates have been suggested. Korn et al. (2004) gave a permutation-based procedure to control the -FWER and FD approximately. In van der Laan et al. (2004), alternative procedures controlling both the -FWER and FD are proposed augmenting single step and stepwise FWER procedures and related finite sample as well as asymptotic results are given. Single-step and stepwise -FWER and FD procedures for finite number of hypotheses in terms of only the marginal null distributions of the p-values either under independence or arbitrary dependence are given in Hommel & Hoffmann (987), Lehmann & Romano (2005), Guo & Romano (2006), Romano & Shaih (2006a, b) and Romano & Wolf (2005). When the p-values are independent or positively dependent in the sense of being multivariate totally positive of order two (MT 2 ) [Karlin & Rinott (980)], a condition satisfied in some multiple testing situations, Sarar (2005a) developed alternative single-step and stepwise -FWER procedures utilizing th order joint null distributions of the p-values. Other methods controlling the -FWER and FD are discussed in Dudoit et al. (2004). Often in practice one is willing to tolerate a few false rejections but wants to control too many of them, say or more. A procedure controlling the -FWER in this case will have better ability to 2

3 mae or more true discoveries than the corresponding FWER ( = ) procedure. Sarar (2005b) considered generalizing the FDR to the -FDR, the expected ratio of or more false rejections to the total number of rejections, which is a less conservative notion of error rate than the -FWER. Under the independence or the MT 2 positive dependence of the p-values, he developed a -FDR procedure utilizing th order joint null distributions of the p-values. It is less conservative and uniformly more powerful in maing or more true discoveries than the -FWER procedure that Sarar (2005a) developed as a generalization of Hochberg s (988) FWER procedure, and often outperforms the Benjamini-Hochberg (BH) stepup FDR procedure. Sarar (2005b) has also developed a -FDR procedure utilizing th order joint null distributions of the p-values under any form of dependence of the p-values, which generalizes Benjamini & Yeutieli s (200) (BY) stepup procedure but does not uniformly outperform it. In this article, we focus on developing -FDR procedures uniformly outperforming the BH and BY stepup procedures and also the stepdown analog of the BH procedure. Specifically, we consider positively dependent p-values for which a stepwise procedure, stepdown or steup, with critical values of the BH procedure (to be simply referred to as the BH stepwise procedure) is nown to control the FDR, and hence the -FDR conservatively [Benjamini & Yeutieli (200) and Sarar (2002)], and generalize it to a -FDR stepwise procedure that allows more rejections. The positive dependence condition in this paper is slightly weaer than considered earlier in the above two papers. We offer two such generalizations in the positive dependence case using bivariate distributions of the null p-values, one more conservative than the other but easier to implement, both reducing to the same procedure under independence. Based on numerical calculations, we show that, for appropriately chosen values of, our -FDR stepwise procedure under independence is uniformly more powerful than the corresponding BH stepwise procedure. Under positive dependence, each of these -FDR procedures remains more powerful than the corresponding BH stepwise procedure under wea dependence, but unfortunately loses its edge with increasing dependence. Often in practice, lie in microarray analysis and fmri studies, the p-values tend to be clumpy dependent in that they are more dependent within small groups, strongly or wealy, but are independent between these groups. The -FDR procedures in this case are seen to improve their performance. They remain more powerful than the corresponding BH stepwise procedure even for large positive correlations. An alternative stepdown -FDR procedure is proposed that performs better than the one in the above stepwise procedure under independence. When applied to the gene expression data in Hedenfal et al. (200), 3

4 it is seen to detect more differentially expressed genes than the BH stepup procedure for different values of. Finally, we develop a generalized BY procedure that uniformly outperforms the original BY procedure as a -FDR procedure under any form of dependence. 2. reliminaries Suppose that H,..., H n are the null hypotheses that are to be simultaneously tested using the corresponding p-values,... n, respectively. Let () (n) be the ordered p-values and H (),..., H (n) the associated null hypotheses. Then, given a non-decreasing set of critical constants 0 < α α n <, a stepdown multiple testing procedure accepts the set of null hypotheses H (i), i i SD and rejects the rest, where i SD = mini : (i) > α i, if the minimum exists, otherwise rejects all the null hypotheses. A stepup procedure, on the other hand, rejects the set H i, i i SU and accepts the rest, where i SU = maxi : (i) α i, if the maximum exists, otherwise accepts all the null hypotheses. A stepwise (stepdown or stepup) procedure with the same constant is referred to as a single-step procedure. The constants in a stepwise procedure are determined subject to the control at a pre-specified level α of a suitable error rate. Let V denote the number of falsely rejected null hypotheses. Then, the - FWER is defined as -FWER = V, with -FWER being the original FWER. The following lemma provides a general theory towards determining the constants in a stepwise procedure providing a control of the -FWER. It is assumed that there are n 0 true null hypotheses and ˆ,..., ˆ n0 are the p-values that correspond to these null hypotheses. Lemma. Given a stepwise procedure involving,..., n and the critical values 0 < α α n <, consider the corresponding stepwise procedure in terms of the null p-values ˆ,..., ˆ n0 and the critical values α n n0 + α n. Let V n denote the number of false rejections in the original stepwise procedure and ˆR n0 denote the number of rejections in the stepwise procedure involving the null p-values. Then, we have for any fixed n 0 n, V n ˆRn0. roof. Let ˆ () ˆ (n0 ) be the ordered values of ˆ,..., ˆ n0. Note that ˆ (j) (n n0 +j) for all j n 0. Therefore, the original stepwise procedure rejects less number of null hypotheses, and hence has less number of falsely rejected null hypotheses, than the corresponding stepwise 4

5 procedure where the p-values corresponding to the false null hypotheses are all very close to zero. Thus, the lemma follows. Remar. According to Lemma, construction of a stepwise procedure providing a control of the -FWER at α basically reduces to that of finding the constants in that procedure guaranteeing the inequality ˆR n0 α for all n 0 n. It unifies the arguments used separately towards constructing stepdown -FWER and stepup -FWER procedures [Lehmann & Romano (2005) and Romano & Shaih (2006b)]. It essentially implies that the least favorable configuration for the - FWER corresponds to the situation where p-values corresponding to false null hypotheses are all zero. Sarar (2005b) has recently introduced the concept of -FDR. Let R denote the total number of rejections. Then, it is defined as -FDR = E (-FD), where -FD = V R if V 0 otherwise. That is, the -FDR is the expected ratio of or more false rejections to the total number of rejections. When =, it reduces to the original FDR. The -FDR is a less conservative notion of error rate than the FDR, as -FDR FDR. Using it, instead of the FDR, when one is willing to control at least false rejections rather than at least one, is a natural generalization of the idea of using the -FWER instead of the FWER. We will be assuming in this article that each i under its respective true null hypothesis H i is U(0, ) and jointly the p-values are positively dependent in the following sense: E φ(,..., n ) ˆi u u (0, ), () for every ˆ i and any increasing (coordinatewise) function φ. The condition () is more relaxed than the positive regression dependence on subset (RDS) condition, that is, E φ(,..., n ) ˆi = u u (0, ), used in Benjamini & Yeutieli (200) and Sarar (2002), and is satisfied by the p-values arising in a number of multiple testing situations. In particular, it is satisfied by the p-values corresponding to multivariate normal test statistics with a common non-negative correlation that we consider in our numerical calculations later. Since the -FDR is 0, and hence trivially controlled, by 5

6 any procedure if n 0 <, we will assume throughout this paper that n 0 n. 3. -FDR ROCEDURES UNDER INDEENDENCE OR OSITIVE DEENDENCE We will develop in this section stepwise procedures that control the -FDR, for 2, under the condition (). To this end, we first have V -F DR = E I (V ) = E R r = i= r= r= i= r ˆi α r, V, R = r ( ) I ˆi α r, V, R = r = i= r= α r r V, R = r ˆi α r. (2) With r = max, r, let α r = (r )α /, for some fixed 0 < α <. Then, the -FDR in (2) simplifies to -F DR = α i= r= V, R = r ˆi α r. Now, for any r > and i n 0, V, R = r ˆi α r = V, R r ˆi α r V, R r + ˆi α r V, R r ˆi α r V, R r + ˆi α r. The inequality follows from the assumption (), since V, R r is a decreasing set for a stepwise procedure. Thus, we get -F DR α α = α i= n 0 i= n 0 i= Applying Lemma to (3), we get V, R = ˆi α + α V, R = ˆi α + α i= i= r=+ n 0 V, R = r ˆi α r V, R + ˆi α i= V ˆi α = V, ˆ i α. (3) -F DR i= ˆRn0, ˆ i α. Let ˆR ( i) n 0 denote the number of rejections in the corresponding stepwise procedure based on the 6

7 ordered values ˆ ( i) () ( i) ˆ (n 0 ) of ˆ,..., ˆ n0 \ ˆi and the n 0 critical values α n n0 +2 α n. Then, for any r n 0, we notice that, when our procedure is a stepup procedure ˆRn0 = r, ˆ i α = ˆ(r) α n n0 +r, ˆ (r+) > α n n0 +r+,..., ˆ (n0 ) > α n, ˆi α ( i) = ˆ (r ) α ( i) ( i) n n 0 +r, ˆ (r) > α n n0 +r+,..., ˆ (n 0 ) > α n, ˆi α = ˆR( i) n 0 = r, ˆ i α ; (4) whereas, for a stepdown procedure, we have ˆRn0 = r, ˆ i α = ˆ() α n n0 +,, ˆ (r) α n n0 +r, ˆ (r+) > α n n0 +r+, ˆi α ( i) ( i) ˆ () α n n0 +2,, ˆ (r ) α ( i) n n 0 +r, ˆ (r) > α n n0 +r+, ˆi α = ˆR( i) n 0 = r, ˆ i α. (5) Thus, for a stepwise procedure, we get -F DR i= r= = i= j( i)= r= ˆR( i) n 0 = r, ˆ i α r ˆR( i) n 0 = r, ˆ i α, ˆ j α n n0 +r. (6) As explained in (4) and (5), ˆR( i) n 0 = r, ˆ i α, ˆ j α n n0 +r = ˆR( i, j) n 0 2 = r 2, ˆ i α, ˆ j α n n0 +r, for a stepup procedure, and ˆR( i) n 0 = r, ˆ i α, ˆ j α n n0 +r ˆR( i, j) n 0 2 = r 2, ˆ i α, ˆ j α n n0 +r, ˆR ( i, j) for a stepdown procedure, where n 0 2 is the number of rejections in the corresponding stepwise procedure involving ˆ,..., ˆ n0 \ ˆi, ˆ j and critical values α n n0 +3 α n. Thus, -F DR (n n 0 + )α 2 ( ) i= j( i)= r= ˆR( i, j) n 0 2 = r 2, ˆ i α ˆj α n n0 +r. 7

8 The inequality follows from the fact that n n 0 + r r n n 0 +, for all r n 0. Since for r >, = ˆR( i, j) n 0 2 = r 2, ˆ i α ˆj α n n0 +r ˆR( i, j) n 0 2 r 2, ˆ i α ˆ j α n n0 +r ˆR( i, j) n 0 2 ˆ j α n n0 +r r 2, ˆ i α ˆ j α n n0 +r ˆ j α n n0 +r, ˆR( i, j) n 0 2 r, ˆ i α ( i, j) ˆR n 0 2 r, ˆ i α with the inequality following due to the property () and the fact that is decreasing, we have ˆR( i, j) n 0 2 r 2, ˆ i α = = r= ˆR( i, j) n 0 2 = r 2, ˆ i α ˆj α n n0 +r ˆR( i, j) n 0 2 = 2, ˆ i α ˆj α n n0 + ˆ i α ˆ j α n n0 +r ˆR( i, j) n 0 2 = 2, ˆ i α ˆj α n n0 + ˆ j α n n0 + ˆR( i, j) n 0 2 2, ˆ i α ˆj α n n0 + + r=+ [ ˆR( i, j) n 0 2 r 2, ˆR( i, j) n 0 2 r, ˆ i α ˆj α n n0 +r. + ] ˆR( i, j) n 0 2, ˆ i α Therefore, -F DR (n n 0 + )α 2 ( ) (n n 0 + )α 2 ( ) = ( ) i= j( i)= i= j( i)= i= j( i)= ( i, j) ˆR n 0 2 2, ˆ i α ˆj α n n0 + ˆ i α ˆj α n n0 + ˆ i α, ˆ j α n n0 +. (7) 8

9 Thus, we have the following theorem as one of our main results. Theorem. For a stepwise procedure with the p-values satisfying () and critical values given by α i = i α /, i =,..., n, for some fixed 2 n and 0 < α <, we have -F DR max n 0 n ( ) i= j( i)= ( H ij α, (n n ) 0 + )α, (8) where H ij (u, v) = ˆ i u, ˆ j v, for i j. Remar 2. When =, the right-hand side of (3) reduces to n 0 α, providing the inequality FDR n 0 α. It is being generalized in Theorem by using joint distributions of the p-values when 2. This result for = is nown in the literature [Benjamini & Hochberg (200) and Sarar (2002)], but under conditions slightly more restrictive than (). Theorem highlights the point, which was emphasized in Sarar (2005b), that the concept of -FDR, lie that of the -FWER, allows one to explicitly utilize the joint distribution of the p-values. However, while Sarar (2005b) used the th order joint null distributions of the p-values, here we use only the bivariate null distributions. Moreover, we are relying on a less restrictive assumption on the dependence of the p-values. We can invoe the same positive dependence property () shared by every pair ( ˆ i, ˆ j ) and apply the inequality ˆ i α ˆj α n n0 + ˆ i α ˆj α to the second to last line in (7) to obtain the following corollary to Theorem, providing an upper bound to the -FDR (with 2) that is more relaxed and easier to implement than the one in (8). Corollary. For the stepwise procedure in Theorem and under the conditions stated therein, we have, for 2, -F DR max n 0 n n n ( ) i= j( i)= H ij (α, α ). (9) The -FDR of the stepwise procedure in Theorem can be controlled at α under the conditions stated in the theorem by equating the upper bound in (8) or (9) to α and solving the resulting equation for α. Of course, one needs to now the bivariate distributions of all pairs of null p-values. For instance, when the null p-values are exchangeable with H as the common and nown bivariate 9

10 cdf, the α can be obtained satisfying the equation ( n0 (n 0 ) max n 0 n ( ) H α, (n n ) 0 + )α = α, (0) or, can be obtained little more conservatively by solving D(, n)h (α, α ) 2 ( ) = α, () where D(, n) = max n 0(n 0 )(n n 0 + ). n 0 n In particular, when the p-values are independent, we have the following: roposition. Consider a stepwise procedure with the critical values α i = (i )β/n, i =,..., n, with β = n ( )α/d(, n). It controls the -FDR at α when the p-values are independent. Remar 3. It is important to note that in a stepwise procedure with a control of the -FDR, the first critical values can be chosen arbitrarily without affecting the -FDR, as in the case of a -FWER stepwise procedure. Nevertheless, as argued in Lehmann & Romano (2005) and Sarar (2005a, b), eeping these critical values constant at the th critical value would be the best option. Thus, even though one can use the stepwise BH procedure with the critical values α i = iα/n, i =,..., n, as a -FDR procedure, since it controls the FDR under the conditions assumed in this paper (see Remar 2), one may consider improving it by modifying the critical values to α i = (i )α/n, i =,..., n. Nevertheless, as seen in Theorem, it does not tae full advantage of the notion of -FDR in the sense that its critical values are not determined subject to a direct control of the -FDR, and hence can potentially be improved. roposition helps us develop such an improvement for some choices of (n, ) at least under the independence. The procedure in roposition will be uniformly less conservative and more powerful than the corresponding stepwise BH procedure if β > α, that is, if n 2 ( ) D(, n) > α. (2) 0

11 Since D(, n) = ñ 0 (ñ 0 )(n + ñ 0 ), with ñ 0 = n [ 3(n + ) (n + + ) 2 ] 2, one can determine the minimum value of for which the inequality in (2) holds for given n and α. Table presents such values of for some values of n and α = For instance, with 000 null hypotheses, if one is willing to allow at most false rejections and sees to control the -FDR at α = 0.05, our procedure in roposition does provide a better control of the -FDR than the corresponding stepwise BH procedure under the independence of the p-values as long as 9. The stepdown procedure in roposition can be improved under independence as follows. Consider a stepdown procedure with critical values α i = i β/n, i =,..., n, for a fixed 0 < β <. From the first line in (6), we see that for this procedure -F DR n 0β n R n 0, where R n0 is the number of rejections in the stepdown procedure based on ˆ ():n0 ˆ (n0 ):n 0, the ordered versions of any n 0 of the n 0 null p-values, and the corresponding critial values α n n0 +2 α n. Let G,n (u) = U () u = n j j= u j ( u) n j, the cdf of the th order statistic based on n iid U(0, ). Then, since R n0 = ˆ:n0 α n n0 +2,..., ˆ ( ):n0 α n n0 + G,n0 (α n n0 +), we have the following: roposition 2. Consider a stepdown procedure with the critical values α i = (i )β/n, i =

12 ,..., n, where ( ) β (n n max n0 + )β n 0 G,n0 = α. n 0 n n It controls the -FDR at α when the p-values are independent. Table 2 presents values of β in roposition 2 for some choices of (n, ) and α = It shows that the stepdown procedure in roposition 2 indeed performs better than the stepdown procedure in roposition under independence. It continues to be a more powerful -FDR procedure than the stepdown BH procedure even for values of smaller than those for the stepdown procedure in roposition. With positively dependent p-values, since the upper bound in Theorem or its corollary is greater than or equal to the corresponding upper bound under independence, the same critical values in roposition may not continue to control the -FDR. In this case, as noted before, one needs to compute α, or equivalently β, where α = β/n, from these upper bounds using the bivariate distributions H ij. For p-values generated from multivariate normal test statistics with a common non-negative correlation ρ, values of β are numerically computed using these bounds for some values of (n, ) and α = 0.05 and are presented in Table 3, where β β = nα / with α in (0) and β β 2 = nα / with α in (). With significantly larger critical values than the corresponding BH stepwise procedure, our proposed -FDR stepwise procedures, corresponding to β and β 2, are seen to be quite powerful compared to the corresponding stepwise BH procedure when the p-values are independent or wealy but positively dependent. However, as the p-values become more and more positively dependent our procedures unfortunately lose their edges over the corresponding stepwise BH procedure. Often in practice, lie in microarray analysis and fmri studies, the p-values tend to be clumpy dependent in that they are more dependent within groups than between groups. Suppose that there are g independent groups and, for each i =,..., n 0, J i is the set of indices of null hypotheses in the group containing the ˆ i. Clearly, J i J j if i and j belong to the same group. The upper bounds in 2

13 Theorem and Corollary can be expressed, respectively, in this case as -F DR max n 0 n n n ( ) i= j( i) J i [ n n 0 + H ij ( α, (n n ) ] 0 + )α + (n 0 J i )α 2, and -F DR max n 0 n n n ( ) i= j( i) J i [ Hij (α, α ) + (n 0 J i )α] 2, where J i is the cardinality of J i. Generating n p-values from multivariate normal test statistics with a common non-negative correlation ρ within independent groups each containing n/g p-values, we repeat our numerical calculations to examine how our stepwise procedures perform compared to the corresponding stepwise BH procedure in this case. Table 4 presents the values of β and β 2 for some values of (n,, g, ρ) and α = This time, our stepwise procedures are seen to uniformly dominate the corresponding stepwise BH procedure even for positive correlations as large as 0.5. For correlation larger than 0.5, while the procedure corresponding to β continues to uniformly dominate the corresponding stepwise BH procedure, the procedure corresponding to β 2 wors well only when the number of null hypotheses is large. We investigate the extent of power improvement our -FDR stepup procedures offer over the stepup BH procedure. We simulated the average power, the expected proportion of false null hypotheses that are rejected, for each of these stepup procedures. Figure presents this power comparison, with our procedures corresponding to β and β 2 labelled -FDR SU and -FDR SU 2, respectively, and the BH procedure labelled FDR BH. Each simulated power was obtained by (i) generating n = 200 dependent normal random variables N(µ i, ), i =,, n, with a common correlation ρ = 0. and with n of the 200 µ i s being equal to d = 2 and the rest 0, (ii) applying the corresponding stepup procedure with = 8 to the generated data to test H i : µ i = 0 against K i : µ i > 0 simultaneously for i =,..., 200 at α = 0.05, and (iii) repeating steps (i) and (ii),000 times before observing the proportion of the n false H i s that are correctly declared significant. As we can see from this figure, our proposed -FDR stepup procedures are uniformly more powerful than the BH stepup procedure under wea dependence, with the power difference getting significantly 3

14 higher with increasing number of false null hypotheses. 4. AN ALICATION TO GENE EXRESSION DATA Hereditary breast cancer is nown to be associated with mutations in BRCA and BRCA2 proteins. Hedenfal et al. (200) report that a group of genes are differentially expressed between tumors with BRCA mutations and tumors with BRCA2 mutations. The data, which are publicly available from the web site Supplement/, consist of 22 breast cancer samples, among which n = 7 are BRCA mutants, n 2 = 8 are BRCA2 mutants, and n 3 = 7 are sporadic (not used in this illustration). Expression levels in terms of florescent intensity ratios of a tumor sample to a common reference sample, are measured for 3, 226 genes using cdna microarrays. If any gene has one ratio exceeding 20, then this gene is eliminated, since a value of 20 is so high that it does not seem trustworthy for this data set. Such preprocessing leaves m = 3, 70 genes. We test each gene for differential expression between these two tumor types by using a twosample t-test statistic. For each gene, the base 2 logarithmic transformation of the ratio is obtained before we compute the two-sample t-test statistic. We then calculate the raw p-value by using a permutation method from Storey & Tibshirani (2003) with the times of permutation B = 2, 000. Finally, we adjust these raw p-values by using the BH stepup procedure and our proposed -FDR stepdown procedure in roposition 2. At α = 0.05 and 0.07, the BH procedure results in 73 and 95 significant genes, respectively, while those numbers for our -FDR method are presented in Table 5 for = 2, 5, 0, 20, 30, 40 and 50. The -FDR stepdown procedure is seen to always detect more differentially expressed genes than the BH procedure for these values of and α. 5. -FDR ROCEDURE UNDER ARBITRARY DEENDENCE We now consider developing a stepup procedure with a control of the -FDR under any form of dependence of the p-values, which will uniformly dominate the BY procedure with the critical values α i = iα/n n j= j, i =,..., n, or its modification obtained by eeping the first critical values same as the th one. Define p ijr = ˆi [α j, α j ], V, R = r, given a set critical values 4

15 0 = α 0 < α < < α n. Then, from (2) we have, -F DR = r= i= n 0 i= j= i= j= r ˆi α r, V, R = r = j r= j p ijr i= j= r r= i= j= r p ijr = j ˆi [α j, α j ], V i= j= r= j r p ijr α j α j. (3) j Let α i = (i )α /, for some fixed 0 < α <. Then, -F DR n 0α + j=+ j. (4) Thus, we have the following theorem. Theorem 2. A stepup procedure with the critical values α i = (i )α +, i =,..., n, n j=+ j for a fixed 0 < α <, controls the -FDR at α. Remar 4. It would be interesting to see if there exists a joint distribution of p-values under which the -FDR of a stepup procedure with critical values of the form α i = (i )α /, i =,..., n, for some 0 < α <, is equal to the upper bound in (4). Indeed, there is such a joint distribution, which is presented in the Appendix. Acnowledgement The research of the first author is supported by grants from U.S. National Science Foundation (DMS and DMS ) Appendix The construction of a joint distribution This appendix presents the construction of a joint distribution of p-values under which the - 5

16 FDR of the stepup procedure with critical values of the form α i = (i )α /, i =,..., n, for some 0 < α <, attains the upper bound in (4). Let I =, 2,, n denote the set of indices of all null hypotheses, with I 0 and I being those of true and false null hypotheses, respectively. The main idea is to construct a joint distribution under which for each i I 0 and j, each p-value ˆ i U[0, ] and the event ˆ i [α j, α j ] is same as the event V, R = j. That is, if j = r, p ijr = ˆ i [α j, α j ], V, R = r = ˆ i [α j, α j ] = α j α j. Otherwise p ijr = 0. Then, by the second equality in (3), we have -F DR = i= j= r= j r p ijr = n 0 α + j=+ j. The construction of the joint distribution proceeds as follows: let U,..., U n+ be n + 2 uniformly distributed random variables such that U U[0, α ], U n+ U[α n, ], and U i U[α i, α i ], i = +,..., n. Let M be a random variable taing values,..., n + with the following probability distribution. M = m = n 0 α if m =, α n 0 m if + m n 0, α if n 0 + m n, n + n n0 0 j=+ j α if m = n +. Here, we assume n + n 0 n0 j=+ j α, which is easily satisfied. We want to associate a p-value to each of the indices, 2,..., n. The association proceeds in 3 distinct phases. hase. For each given M = m,..., n 0, choose m indices i, i 2,..., i m randomly without replacement from I 0. Each of these chosen indices has the same p-value ij = U m. Each of the indices s in I i, i 2,..., i m has the same p-value s = U n+. hase 2. For each given M = m n 0 +,..., n, choose all n 0 indices from I 0 and (m n 0 ) indices randomly without replacement from I. Each of these indices is associated with the same 6

17 p-value U m. Each of the remaining unaccounted indices from I is associated with the same p-value U n+. hase 3. Given M = n +, each of the indices in I 0 is associated with the same p-value U n+, and each of the indices in I is associated with the same p-value zero. It is easy to verify that, for each i I 0 and j, unconditionally, each p-value ˆ i U[0, ] and when ˆ i [α j, α j ], there are totally j p-values and at least p-values corresponding to true null hypotheses less than α j, i.e., the event ˆ i [α j, α j ] is same as the event V, R = j. References Benjamini, Y. & Hochberg, Y. (995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B Benjamini, Y. & Yeutieli, D. (200). The control of the false discovery rate in multiple testing under dependency. Ann. Statist Benjamini, Y. & Yeutieli, D. (2005). False discovery rate-adjusted multiple confidence intervals for selected parameters. J Amer. Statist. Assoc Dudoit, S., van der Laan, M. & ollard, K. (2004). Multiple testing: art I. single-step procedures for control of general type I error rates. Statist. App. Gen. Mol. Bio. 3 (): Article 3. Genovese, C. & Wasserman, L. (2002). Operarting characteristics and extensions of the false discovery rate procedure. J. Roy. Statist. Soc. Ser. B Genovese, C. & Wasserman, L. (2004). A stochastic process approach to false discovery control. Ann. Statist Guo, W. & Romano, J.. (2006). A generalized Sidá procedure and control of generalized error rates under independence. Unpublished Manuscript. Hedenfal, I., Duggan, D., Chen, Y., Radmacher, M., Bittner, M., Simon, R., Meltzer,., Gusterson, B., Esteller, M., Kallioniemi, O., Wilfond, B., Borg, A. & Trent, J. (2006). Gene-expression profiles in hereditary breast cancer. The New England Journal of Medicine

18 Hochberg, Y. (988). A sharper Bonferroni procedure for multiple tests of significance. Biometria Hochberg, Y. & Tamhane, A. C. (987). Multiple Comparison rocedures. New Yor: Wiley. Hommel, G. & Hoffmann, T. (987). Controlled uncertainty. In Multiple Hypothesis Testing. Bauer, G. Hommel and E. Sonnemann, eds. 54 6, Springer, Heidelberg. Karlin, S. & Rinott, Y. (980). Classes of orderings of measures and related correlation inequalities I: Multivariate totally positive distributions. J. Mult. Anal Korn, E., Troendle, T., McShane, L. & Simon, R. (2004). Controlling the number of false discoveries: Application to high-dimensional genomic data. J. Statist. lann. Inf Lehmann, E. L. & Romano, J. (2005). Generalizations of the familywise error rate. Ann. Statist Romano, J.. & Shaih, A. M. (2006a). On stepdown control of the false discovery proportion In. The Second E.L. Lehamann Symposium - Optimality Ed. J. Rojo, IMS Lecture Notes - Monograph Series. Romano, J.. & Shaih, A. M. (2006b). Stepup procedures for control of generalizations of the familywise error rate. Ann. Statist. To appear. Romano, J.. & Wolf, M. (2005). Control of generalized error rates in multiple testing. Unpublished Manuscript. Sarar, S. K. (2002). Some results on false discovery rate in stepwise multiple testing procedures. Ann. Statist Sarar, S. K. (2004). FDR-controlling stepwise procedures and their false negatives rates. J. Statist. lann. Inf Sarar, S. K. (2005a). Generalizing Simes test and Hochberg s stepup procedure. Technical Report, Temple University. Sarar, S. K. (2005b). Stepup procedures controlling generalized FWER and generalized FDR. Technical Report, Temple University. 8

19 Sarar, S. K. (2006a). False discovery and false non-discovery rates in single-step multiple testing procedures. Ann. Statist Sarar, S. K. (2006b). Two-stage stepup procedures controlling FDR Technical Report, Temple University. Storey, J. D. (2002). A direct approach to false discovery rates. J. Roy. Statist. Soc. Ser. B Storey, J. D. (2003). The positive false discovery rate: A Bayesian interpretation and the q-value. Ann. Statist Storey, J. D., Taylor, J. E. & Siegmund, D. (2004). Strong control, conservative point estimation and simultanaeous conservative consistency of false discovery rates: A unified approach. J. Roy. Statist. Soc. Ser. B Storey, J. D. & Tibshirani, R. (2003). Statistical significance for genomewide studies. roceedings of the National Academy of Sciences van der Laan, M., Dodoit, S. & ollard, K. (2004). Augmentation procedures for control of the generalized family-wise error rate and tail probabilities for the proportion of false positives. Stat. App. Gen. Mol. Bio. 3(): Article 5. 9

20 Table : Minimum values for different n under independence with α = n Table 2: Values of β for the stepdown procedure in roposition 2 for different (n, ) with α = n = = = Table 3: Values of β and β 2 for different (n, ) and non-negative ρ with α = ρ = 0.0 ρ = 0.05 ρ = 0.0 ρ = 0.5 ρ = 0.20 n β β 2 β β 2 β β 2 β β 2 β β

21 Table 4: Values of β and β 2 for different (n,, g) and non-negative ρ under clumpy dependence with α = ρ = 0.0 ρ = 0.2 ρ = 0.5 ρ = 0.8 n g β β 2 β β 2 β β 2 β β Table 5: Numbers of differentially expressed genes for the data in Hedenfal et al. (200) using the -FDR stepdown procedure in roposition 2. = 2 = 5 = 0 = 20 = 30 = 40 = 50 α = α =

22 Average ower FDR BH FDR SU FDR SU The number of false null hypotheses Figure : ower of two -FDR stepup procedures in the case of positive dependence with parameters n = 200, = 8, d = 2, ρ = 0. and α =

STEPDOWN PROCEDURES CONTROLLING A GENERALIZED FALSE DISCOVERY RATE. National Institute of Environmental Health Sciences and Temple University

STEPDOWN PROCEDURES CONTROLLING A GENERALIZED FALSE DISCOVERY RATE. National Institute of Environmental Health Sciences and Temple University STEPDOWN PROCEDURES CONTROLLING A GENERALIZED FALSE DISCOVERY RATE Wenge Guo 1 and Sanat K. Sarkar 2 National Institute of Environmental Health Sciences and Temple University Abstract: Often in practice

More information

Chapter 1. Stepdown Procedures Controlling A Generalized False Discovery Rate

Chapter 1. Stepdown Procedures Controlling A Generalized False Discovery Rate Chapter Stepdown Procedures Controlling A Generalized False Discovery Rate Wenge Guo and Sanat K. Sarkar Biostatistics Branch, National Institute of Environmental Health Sciences, Research Triangle Park,

More information

PROCEDURES CONTROLLING THE k-fdr USING. BIVARIATE DISTRIBUTIONS OF THE NULL p-values. Sanat K. Sarkar and Wenge Guo

PROCEDURES CONTROLLING THE k-fdr USING. BIVARIATE DISTRIBUTIONS OF THE NULL p-values. Sanat K. Sarkar and Wenge Guo PROCEDURES CONTROLLING THE k-fdr USING BIVARIATE DISTRIBUTIONS OF THE NULL p-values Sanat K. Sarkar and Wenge Guo Temple University and National Institute of Environmental Health Sciences Abstract: Procedures

More information

ON STEPWISE CONTROL OF THE GENERALIZED FAMILYWISE ERROR RATE. By Wenge Guo and M. Bhaskara Rao

ON STEPWISE CONTROL OF THE GENERALIZED FAMILYWISE ERROR RATE. By Wenge Guo and M. Bhaskara Rao ON STEPWISE CONTROL OF THE GENERALIZED FAMILYWISE ERROR RATE By Wenge Guo and M. Bhaskara Rao National Institute of Environmental Health Sciences and University of Cincinnati A classical approach for dealing

More information

Two-stage stepup procedures controlling FDR

Two-stage stepup procedures controlling FDR Journal of Statistical Planning and Inference 38 (2008) 072 084 www.elsevier.com/locate/jspi Two-stage stepup procedures controlling FDR Sanat K. Sarar Department of Statistics, Temple University, Philadelphia,

More information

On adaptive procedures controlling the familywise error rate

On adaptive procedures controlling the familywise error rate , pp. 3 On adaptive procedures controlling the familywise error rate By SANAT K. SARKAR Temple University, Philadelphia, PA 922, USA sanat@temple.edu Summary This paper considers the problem of developing

More information

On Methods Controlling the False Discovery Rate 1

On Methods Controlling the False Discovery Rate 1 Sankhyā : The Indian Journal of Statistics 2008, Volume 70-A, Part 2, pp. 135-168 c 2008, Indian Statistical Institute On Methods Controlling the False Discovery Rate 1 Sanat K. Sarkar Temple University,

More information

Web-based Supplementary Materials for. of the Null p-values

Web-based Supplementary Materials for. of the Null p-values Web-based Supplementary Materials for ocedures Controlling the -FDR Using Bivariate Distributions of the Null p-values Sanat K Sarar and Wenge Guo Temple University and National Institute of Environmental

More information

Step-down FDR Procedures for Large Numbers of Hypotheses

Step-down FDR Procedures for Large Numbers of Hypotheses Step-down FDR Procedures for Large Numbers of Hypotheses Paul N. Somerville University of Central Florida Abstract. Somerville (2004b) developed FDR step-down procedures which were particularly appropriate

More information

Modified Simes Critical Values Under Positive Dependence

Modified Simes Critical Values Under Positive Dependence Modified Simes Critical Values Under Positive Dependence Gengqian Cai, Sanat K. Sarkar Clinical Pharmacology Statistics & Programming, BDS, GlaxoSmithKline Statistics Department, Temple University, Philadelphia

More information

FDR-CONTROLLING STEPWISE PROCEDURES AND THEIR FALSE NEGATIVES RATES

FDR-CONTROLLING STEPWISE PROCEDURES AND THEIR FALSE NEGATIVES RATES FDR-CONTROLLING STEPWISE PROCEDURES AND THEIR FALSE NEGATIVES RATES Sanat K. Sarkar a a Department of Statistics, Temple University, Speakman Hall (006-00), Philadelphia, PA 19122, USA Abstract The concept

More information

Applying the Benjamini Hochberg procedure to a set of generalized p-values

Applying the Benjamini Hochberg procedure to a set of generalized p-values U.U.D.M. Report 20:22 Applying the Benjamini Hochberg procedure to a set of generalized p-values Fredrik Jonsson Department of Mathematics Uppsala University Applying the Benjamini Hochberg procedure

More information

Comments on: Control of the false discovery rate under dependence using the bootstrap and subsampling

Comments on: Control of the false discovery rate under dependence using the bootstrap and subsampling Test (2008) 17: 443 445 DOI 10.1007/s11749-008-0127-5 DISCUSSION Comments on: Control of the false discovery rate under dependence using the bootstrap and subsampling José A. Ferreira Mark A. van de Wiel

More information

Rejoinder on: Control of the false discovery rate under dependence using the bootstrap and subsampling

Rejoinder on: Control of the false discovery rate under dependence using the bootstrap and subsampling Test (2008) 17: 461 471 DOI 10.1007/s11749-008-0134-6 DISCUSSION Rejoinder on: Control of the false discovery rate under dependence using the bootstrap and subsampling Joseph P. Romano Azeem M. Shaikh

More information

A Bayesian Determination of Threshold for Identifying Differentially Expressed Genes in Microarray Experiments

A Bayesian Determination of Threshold for Identifying Differentially Expressed Genes in Microarray Experiments A Bayesian Determination of Threshold for Identifying Differentially Expressed Genes in Microarray Experiments Jie Chen 1 Merck Research Laboratories, P. O. Box 4, BL3-2, West Point, PA 19486, U.S.A. Telephone:

More information

Sanat Sarkar Department of Statistics, Temple University Philadelphia, PA 19122, U.S.A. September 11, Abstract

Sanat Sarkar Department of Statistics, Temple University Philadelphia, PA 19122, U.S.A. September 11, Abstract Adaptive Controls of FWER and FDR Under Block Dependence arxiv:1611.03155v1 [stat.me] 10 Nov 2016 Wenge Guo Department of Mathematical Sciences New Jersey Institute of Technology Newark, NJ 07102, U.S.A.

More information

FALSE DISCOVERY AND FALSE NONDISCOVERY RATES IN SINGLE-STEP MULTIPLE TESTING PROCEDURES 1. BY SANAT K. SARKAR Temple University

FALSE DISCOVERY AND FALSE NONDISCOVERY RATES IN SINGLE-STEP MULTIPLE TESTING PROCEDURES 1. BY SANAT K. SARKAR Temple University The Annals of Statistics 2006, Vol. 34, No. 1, 394 415 DOI: 10.1214/009053605000000778 Institute of Mathematical Statistics, 2006 FALSE DISCOVERY AND FALSE NONDISCOVERY RATES IN SINGLE-STEP MULTIPLE TESTING

More information

Bayesian Determination of Threshold for Identifying Differentially Expressed Genes in Microarray Experiments

Bayesian Determination of Threshold for Identifying Differentially Expressed Genes in Microarray Experiments Bayesian Determination of Threshold for Identifying Differentially Expressed Genes in Microarray Experiments Jie Chen 1 Merck Research Laboratories, P. O. Box 4, BL3-2, West Point, PA 19486, U.S.A. Telephone:

More information

On Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses

On Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses On Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses Gavin Lynch Catchpoint Systems, Inc., 228 Park Ave S 28080 New York, NY 10003, U.S.A. Wenge Guo Department of Mathematical

More information

STEPUP PROCEDURES FOR CONTROL OF GENERALIZATIONS OF THE FAMILYWISE ERROR RATE

STEPUP PROCEDURES FOR CONTROL OF GENERALIZATIONS OF THE FAMILYWISE ERROR RATE AOS imspdf v.2006/05/02 Prn:4/08/2006; 11:19 F:aos0169.tex; (Lina) p. 1 The Annals of Statistics 2006, Vol. 0, No. 00, 1 26 DOI: 10.1214/009053606000000461 Institute of Mathematical Statistics, 2006 STEPUP

More information

GENERALIZING SIMES TEST AND HOCHBERG S STEPUP PROCEDURE 1. BY SANAT K. SARKAR Temple University

GENERALIZING SIMES TEST AND HOCHBERG S STEPUP PROCEDURE 1. BY SANAT K. SARKAR Temple University The Annals of Statistics 2008, Vol. 36, No. 1, 337 363 DOI: 10.1214/009053607000000550 Institute of Mathematical Statistics, 2008 GENERALIZING SIMES TEST AND HOCHBERG S STEPUP PROCEDURE 1 BY SANAT K. SARKAR

More information

Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. T=number of type 2 errors

Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. T=number of type 2 errors The Multiple Testing Problem Multiple Testing Methods for the Analysis of Microarray Data 3/9/2009 Copyright 2009 Dan Nettleton Suppose one test of interest has been conducted for each of m genes in a

More information

arxiv: v1 [math.st] 31 Mar 2009

arxiv: v1 [math.st] 31 Mar 2009 The Annals of Statistics 2009, Vol. 37, No. 2, 619 629 DOI: 10.1214/07-AOS586 c Institute of Mathematical Statistics, 2009 arxiv:0903.5373v1 [math.st] 31 Mar 2009 AN ADAPTIVE STEP-DOWN PROCEDURE WITH PROVEN

More information

False discovery rate and related concepts in multiple comparisons problems, with applications to microarray data

False discovery rate and related concepts in multiple comparisons problems, with applications to microarray data False discovery rate and related concepts in multiple comparisons problems, with applications to microarray data Ståle Nygård Trial Lecture Dec 19, 2008 1 / 35 Lecture outline Motivation for not using

More information

arxiv: v1 [math.st] 13 Mar 2008

arxiv: v1 [math.st] 13 Mar 2008 The Annals of Statistics 2008, Vol. 36, No. 1, 337 363 DOI: 10.1214/009053607000000550 c Institute of Mathematical Statistics, 2008 arxiv:0803.1961v1 [math.st] 13 Mar 2008 GENERALIZING SIMES TEST AND HOCHBERG

More information

Statistical Applications in Genetics and Molecular Biology

Statistical Applications in Genetics and Molecular Biology Statistical Applications in Genetics and Molecular Biology Volume 5, Issue 1 2006 Article 28 A Two-Step Multiple Comparison Procedure for a Large Number of Tests and Multiple Treatments Hongmei Jiang Rebecca

More information

Control of Directional Errors in Fixed Sequence Multiple Testing

Control of Directional Errors in Fixed Sequence Multiple Testing Control of Directional Errors in Fixed Sequence Multiple Testing Anjana Grandhi Department of Mathematical Sciences New Jersey Institute of Technology Newark, NJ 07102-1982 Wenge Guo Department of Mathematical

More information

Research Article Sample Size Calculation for Controlling False Discovery Proportion

Research Article Sample Size Calculation for Controlling False Discovery Proportion Probability and Statistics Volume 2012, Article ID 817948, 13 pages doi:10.1155/2012/817948 Research Article Sample Size Calculation for Controlling False Discovery Proportion Shulian Shang, 1 Qianhe Zhou,

More information

Familywise Error Rate Controlling Procedures for Discrete Data

Familywise Error Rate Controlling Procedures for Discrete Data Familywise Error Rate Controlling Procedures for Discrete Data arxiv:1711.08147v1 [stat.me] 22 Nov 2017 Yalin Zhu Center for Mathematical Sciences, Merck & Co., Inc., West Point, PA, U.S.A. Wenge Guo Department

More information

POSITIVE FALSE DISCOVERY PROPORTIONS: INTRINSIC BOUNDS AND ADAPTIVE CONTROL

POSITIVE FALSE DISCOVERY PROPORTIONS: INTRINSIC BOUNDS AND ADAPTIVE CONTROL Statistica Sinica 18(2008, 837-860 POSITIVE FALSE DISCOVERY PROPORTIONS: INTRINSIC BOUNDS AND ADAPTIVE CONTROL Zhiyi Chi and Zhiqiang Tan University of Connecticut and Rutgers University Abstract: A useful

More information

Lecture 7 April 16, 2018

Lecture 7 April 16, 2018 Stats 300C: Theory of Statistics Spring 2018 Lecture 7 April 16, 2018 Prof. Emmanuel Candes Scribe: Feng Ruan; Edited by: Rina Friedberg, Junjie Zhu 1 Outline Agenda: 1. False Discovery Rate (FDR) 2. Properties

More information

Department of Statistics University of Central Florida. Technical Report TR APR2007 Revised 25NOV2007

Department of Statistics University of Central Florida. Technical Report TR APR2007 Revised 25NOV2007 Department of Statistics University of Central Florida Technical Report TR-2007-01 25APR2007 Revised 25NOV2007 Controlling the Number of False Positives Using the Benjamini- Hochberg FDR Procedure Paul

More information

Hochberg Multiple Test Procedure Under Negative Dependence

Hochberg Multiple Test Procedure Under Negative Dependence Hochberg Multiple Test Procedure Under Negative Dependence Ajit C. Tamhane Northwestern University Joint work with Jiangtao Gou (Northwestern University) IMPACT Symposium, Cary (NC), November 20, 2014

More information

ON TWO RESULTS IN MULTIPLE TESTING

ON TWO RESULTS IN MULTIPLE TESTING ON TWO RESULTS IN MULTIPLE TESTING By Sanat K. Sarkar 1, Pranab K. Sen and Helmut Finner Temple University, University of North Carolina at Chapel Hill and University of Duesseldorf Two known results in

More information

Control of Generalized Error Rates in Multiple Testing

Control of Generalized Error Rates in Multiple Testing Institute for Empirical Research in Economics University of Zurich Working Paper Series ISSN 1424-0459 Working Paper No. 245 Control of Generalized Error Rates in Multiple Testing Joseph P. Romano and

More information

Controlling Bayes Directional False Discovery Rate in Random Effects Model 1

Controlling Bayes Directional False Discovery Rate in Random Effects Model 1 Controlling Bayes Directional False Discovery Rate in Random Effects Model 1 Sanat K. Sarkar a, Tianhui Zhou b a Temple University, Philadelphia, PA 19122, USA b Wyeth Pharmaceuticals, Collegeville, PA

More information

IMPROVING TWO RESULTS IN MULTIPLE TESTING

IMPROVING TWO RESULTS IN MULTIPLE TESTING IMPROVING TWO RESULTS IN MULTIPLE TESTING By Sanat K. Sarkar 1, Pranab K. Sen and Helmut Finner Temple University, University of North Carolina at Chapel Hill and University of Duesseldorf October 11,

More information

A NEW APPROACH FOR LARGE SCALE MULTIPLE TESTING WITH APPLICATION TO FDR CONTROL FOR GRAPHICALLY STRUCTURED HYPOTHESES

A NEW APPROACH FOR LARGE SCALE MULTIPLE TESTING WITH APPLICATION TO FDR CONTROL FOR GRAPHICALLY STRUCTURED HYPOTHESES A NEW APPROACH FOR LARGE SCALE MULTIPLE TESTING WITH APPLICATION TO FDR CONTROL FOR GRAPHICALLY STRUCTURED HYPOTHESES By Wenge Guo Gavin Lynch Joseph P. Romano Technical Report No. 2018-06 September 2018

More information

New Procedures for False Discovery Control

New Procedures for False Discovery Control New Procedures for False Discovery Control Christopher R. Genovese Department of Statistics Carnegie Mellon University http://www.stat.cmu.edu/ ~ genovese/ Elisha Merriam Department of Neuroscience University

More information

A GENERAL DECISION THEORETIC FORMULATION OF PROCEDURES CONTROLLING FDR AND FNR FROM A BAYESIAN PERSPECTIVE

A GENERAL DECISION THEORETIC FORMULATION OF PROCEDURES CONTROLLING FDR AND FNR FROM A BAYESIAN PERSPECTIVE A GENERAL DECISION THEORETIC FORMULATION OF PROCEDURES CONTROLLING FDR AND FNR FROM A BAYESIAN PERSPECTIVE Sanat K. Sarkar 1, Tianhui Zhou and Debashis Ghosh Temple University, Wyeth Pharmaceuticals and

More information

Controlling the False Discovery Rate in Two-Stage. Combination Tests for Multiple Endpoints

Controlling the False Discovery Rate in Two-Stage. Combination Tests for Multiple Endpoints Controlling the False Discovery Rate in Two-Stage Combination Tests for Multiple ndpoints Sanat K. Sarkar, Jingjing Chen and Wenge Guo May 29, 2011 Sanat K. Sarkar is Professor and Senior Research Fellow,

More information

False discovery control for multiple tests of association under general dependence

False discovery control for multiple tests of association under general dependence False discovery control for multiple tests of association under general dependence Nicolai Meinshausen Seminar für Statistik ETH Zürich December 2, 2004 Abstract We propose a confidence envelope for false

More information

The miss rate for the analysis of gene expression data

The miss rate for the analysis of gene expression data Biostatistics (2005), 6, 1,pp. 111 117 doi: 10.1093/biostatistics/kxh021 The miss rate for the analysis of gene expression data JONATHAN TAYLOR Department of Statistics, Stanford University, Stanford,

More information

The Pennsylvania State University The Graduate School Eberly College of Science GENERALIZED STEPWISE PROCEDURES FOR

The Pennsylvania State University The Graduate School Eberly College of Science GENERALIZED STEPWISE PROCEDURES FOR The Pennsylvania State University The Graduate School Eberly College of Science GENERALIZED STEPWISE PROCEDURES FOR CONTROLLING THE FALSE DISCOVERY RATE A Dissertation in Statistics by Scott Roths c 2011

More information

Statistical Applications in Genetics and Molecular Biology

Statistical Applications in Genetics and Molecular Biology Statistical Applications in Genetics and Molecular Biology Volume 3, Issue 1 2004 Article 13 Multiple Testing. Part I. Single-Step Procedures for Control of General Type I Error Rates Sandrine Dudoit Mark

More information

Multiple Testing. Hoang Tran. Department of Statistics, Florida State University

Multiple Testing. Hoang Tran. Department of Statistics, Florida State University Multiple Testing Hoang Tran Department of Statistics, Florida State University Large-Scale Testing Examples: Microarray data: testing differences in gene expression between two traits/conditions Microbiome

More information

Statistical Applications in Genetics and Molecular Biology

Statistical Applications in Genetics and Molecular Biology Statistical Applications in Genetics and Molecular Biology Volume 3, Issue 1 2004 Article 14 Multiple Testing. Part II. Step-Down Procedures for Control of the Family-Wise Error Rate Mark J. van der Laan

More information

Estimation of a Two-component Mixture Model

Estimation of a Two-component Mixture Model Estimation of a Two-component Mixture Model Bodhisattva Sen 1,2 University of Cambridge, Cambridge, UK Columbia University, New York, USA Indian Statistical Institute, Kolkata, India 6 August, 2012 1 Joint

More information

Statistica Sinica Preprint No: SS R1

Statistica Sinica Preprint No: SS R1 Statistica Sinica Preprint No: SS-2017-0072.R1 Title Control of Directional Errors in Fixed Sequence Multiple Testing Manuscript ID SS-2017-0072.R1 URL http://www.stat.sinica.edu.tw/statistica/ DOI 10.5705/ss.202017.0072

More information

Resampling-Based Control of the FDR

Resampling-Based Control of the FDR Resampling-Based Control of the FDR Joseph P. Romano 1 Azeem S. Shaikh 2 and Michael Wolf 3 1 Departments of Economics and Statistics Stanford University 2 Department of Economics University of Chicago

More information

A Unified Computational Framework to Compare Direct and Sequential False Discovery Rate Algorithms for Exploratory DNA Microarray Studies

A Unified Computational Framework to Compare Direct and Sequential False Discovery Rate Algorithms for Exploratory DNA Microarray Studies Journal of Data Science 3(2005), 331-352 A Unified Computational Framework to Compare Direct and Sequential False Discovery Rate Algorithms for Exploratory DNA Microarray Studies Danh V. Nguyen University

More information

On Generalized Fixed Sequence Procedures for Controlling the FWER

On Generalized Fixed Sequence Procedures for Controlling the FWER Research Article Received XXXX (www.interscience.wiley.com) DOI: 10.1002/sim.0000 On Generalized Fixed Sequence Procedures for Controlling the FWER Zhiying Qiu, a Wenge Guo b and Gavin Lynch c Testing

More information

arxiv:math/ v1 [math.st] 29 Dec 2006 Jianqing Fan Peter Hall Qiwei Yao

arxiv:math/ v1 [math.st] 29 Dec 2006 Jianqing Fan Peter Hall Qiwei Yao TO HOW MANY SIMULTANEOUS HYPOTHESIS TESTS CAN NORMAL, STUDENT S t OR BOOTSTRAP CALIBRATION BE APPLIED? arxiv:math/0701003v1 [math.st] 29 Dec 2006 Jianqing Fan Peter Hall Qiwei Yao ABSTRACT. In the analysis

More information

Familywise Error Rate Control via Knockoffs

Familywise Error Rate Control via Knockoffs Familywise Error Rate Control via Knockoffs Abstract We present a novel method for controlling the k-familywise error rate (k-fwer) in the linear regression setting using the knockoffs framework first

More information

Controlling the False Discovery Rate: Understanding and Extending the Benjamini-Hochberg Method

Controlling the False Discovery Rate: Understanding and Extending the Benjamini-Hochberg Method Controlling the False Discovery Rate: Understanding and Extending the Benjamini-Hochberg Method Christopher R. Genovese Department of Statistics Carnegie Mellon University joint work with Larry Wasserman

More information

Exceedance Control of the False Discovery Proportion Christopher Genovese 1 and Larry Wasserman 2 Carnegie Mellon University July 10, 2004

Exceedance Control of the False Discovery Proportion Christopher Genovese 1 and Larry Wasserman 2 Carnegie Mellon University July 10, 2004 Exceedance Control of the False Discovery Proportion Christopher Genovese 1 and Larry Wasserman 2 Carnegie Mellon University July 10, 2004 Multiple testing methods to control the False Discovery Rate (FDR),

More information

Summary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing

Summary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing Summary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing Statistics Journal Club, 36-825 Beau Dabbs and Philipp Burckhardt 9-19-2014 1 Paper

More information

EMPIRICAL BAYES METHODS FOR ESTIMATION AND CONFIDENCE INTERVALS IN HIGH-DIMENSIONAL PROBLEMS

EMPIRICAL BAYES METHODS FOR ESTIMATION AND CONFIDENCE INTERVALS IN HIGH-DIMENSIONAL PROBLEMS Statistica Sinica 19 (2009), 125-143 EMPIRICAL BAYES METHODS FOR ESTIMATION AND CONFIDENCE INTERVALS IN HIGH-DIMENSIONAL PROBLEMS Debashis Ghosh Penn State University Abstract: There is much recent interest

More information

MIXTURE MODELS FOR DETECTING DIFFERENTIALLY EXPRESSED GENES IN MICROARRAYS

MIXTURE MODELS FOR DETECTING DIFFERENTIALLY EXPRESSED GENES IN MICROARRAYS International Journal of Neural Systems, Vol. 16, No. 5 (2006) 353 362 c World Scientific Publishing Company MIXTURE MOLS FOR TECTING DIFFERENTIALLY EXPRESSED GENES IN MICROARRAYS LIAT BEN-TOVIM JONES

More information

Comparison of the Empirical Bayes and the Significance Analysis of Microarrays

Comparison of the Empirical Bayes and the Significance Analysis of Microarrays Comparison of the Empirical Bayes and the Significance Analysis of Microarrays Holger Schwender, Andreas Krause, and Katja Ickstadt Abstract Microarrays enable to measure the expression levels of tens

More information

SOME STEP-DOWN PROCEDURES CONTROLLING THE FALSE DISCOVERY RATE UNDER DEPENDENCE

SOME STEP-DOWN PROCEDURES CONTROLLING THE FALSE DISCOVERY RATE UNDER DEPENDENCE Statistica Sinica 18(2008), 881-904 SOME STEP-DOWN PROCEDURES CONTROLLING THE FALSE DISCOVERY RATE UNDER DEPENDENCE Yongchao Ge 1, Stuart C. Sealfon 1 and Terence P. Speed 2,3 1 Mount Sinai School of Medicine,

More information

Multiple testing: Intro & FWER 1

Multiple testing: Intro & FWER 1 Multiple testing: Intro & FWER 1 Mark van de Wiel mark.vdwiel@vumc.nl Dep of Epidemiology & Biostatistics,VUmc, Amsterdam Dep of Mathematics, VU 1 Some slides courtesy of Jelle Goeman 1 Practical notes

More information

False discovery rate control for non-positively regression dependent test statistics

False discovery rate control for non-positively regression dependent test statistics Journal of Statistical Planning and Inference ( ) www.elsevier.com/locate/jspi False discovery rate control for non-positively regression dependent test statistics Daniel Yekutieli Department of Statistics

More information

Looking at the Other Side of Bonferroni

Looking at the Other Side of Bonferroni Department of Biostatistics University of Washington 24 May 2012 Multiple Testing: Control the Type I Error Rate When analyzing genetic data, one will commonly perform over 1 million (and growing) hypothesis

More information

A moment-based method for estimating the proportion of true null hypotheses and its application to microarray gene expression data

A moment-based method for estimating the proportion of true null hypotheses and its application to microarray gene expression data Biostatistics (2007), 8, 4, pp. 744 755 doi:10.1093/biostatistics/kxm002 Advance Access publication on January 22, 2007 A moment-based method for estimating the proportion of true null hypotheses and its

More information

Statistical testing. Samantha Kleinberg. October 20, 2009

Statistical testing. Samantha Kleinberg. October 20, 2009 October 20, 2009 Intro to significance testing Significance testing and bioinformatics Gene expression: Frequently have microarray data for some group of subjects with/without the disease. Want to find

More information

Non-specific filtering and control of false positives

Non-specific filtering and control of false positives Non-specific filtering and control of false positives Richard Bourgon 16 June 2009 bourgon@ebi.ac.uk EBI is an outstation of the European Molecular Biology Laboratory Outline Multiple testing I: overview

More information

The International Journal of Biostatistics

The International Journal of Biostatistics The International Journal of Biostatistics Volume 7, Issue 1 2011 Article 12 Consonance and the Closure Method in Multiple Testing Joseph P. Romano, Stanford University Azeem Shaikh, University of Chicago

More information

Weighted Adaptive Multiple Decision Functions for False Discovery Rate Control

Weighted Adaptive Multiple Decision Functions for False Discovery Rate Control Weighted Adaptive Multiple Decision Functions for False Discovery Rate Control Joshua D. Habiger Oklahoma State University jhabige@okstate.edu Nov. 8, 2013 Outline 1 : Motivation and FDR Research Areas

More information

FDR and ROC: Similarities, Assumptions, and Decisions

FDR and ROC: Similarities, Assumptions, and Decisions EDITORIALS 8 FDR and ROC: Similarities, Assumptions, and Decisions. Why FDR and ROC? It is a privilege to have been asked to introduce this collection of papers appearing in Statistica Sinica. The papers

More information

Statistical Applications in Genetics and Molecular Biology

Statistical Applications in Genetics and Molecular Biology Statistical Applications in Genetics and Molecular Biology Volume 6, Issue 1 2007 Article 28 A Comparison of Methods to Control Type I Errors in Microarray Studies Jinsong Chen Mark J. van der Laan Martyn

More information

Control of the False Discovery Rate under Dependence using the Bootstrap and Subsampling

Control of the False Discovery Rate under Dependence using the Bootstrap and Subsampling Institute for Empirical Research in Economics University of Zurich Working Paper Series ISSN 1424-0459 Working Paper No. 337 Control of the False Discovery Rate under Dependence using the Bootstrap and

More information

High-Throughput Sequencing Course. Introduction. Introduction. Multiple Testing. Biostatistics and Bioinformatics. Summer 2018

High-Throughput Sequencing Course. Introduction. Introduction. Multiple Testing. Biostatistics and Bioinformatics. Summer 2018 High-Throughput Sequencing Course Multiple Testing Biostatistics and Bioinformatics Summer 2018 Introduction You have previously considered the significance of a single gene Introduction You have previously

More information

Stat 206: Estimation and testing for a mean vector,

Stat 206: Estimation and testing for a mean vector, Stat 206: Estimation and testing for a mean vector, Part II James Johndrow 2016-12-03 Comparing components of the mean vector In the last part, we talked about testing the hypothesis H 0 : µ 1 = µ 2 where

More information

Doing Cosmology with Balls and Envelopes

Doing Cosmology with Balls and Envelopes Doing Cosmology with Balls and Envelopes Christopher R. Genovese Department of Statistics Carnegie Mellon University http://www.stat.cmu.edu/ ~ genovese/ Larry Wasserman Department of Statistics Carnegie

More information

arxiv: v3 [stat.me] 9 Nov 2015

arxiv: v3 [stat.me] 9 Nov 2015 Familywise Error Rate Control via Knockoffs Lucas Janson Weijie Su Department of Statistics, Stanford University, Stanford, CA 94305, USA arxiv:1505.06549v3 [stat.me] 9 Nov 2015 May 2015 Abstract We present

More information

TO HOW MANY SIMULTANEOUS HYPOTHESIS TESTS CAN NORMAL, STUDENT S t OR BOOTSTRAP CALIBRATION BE APPLIED? Jianqing Fan Peter Hall Qiwei Yao

TO HOW MANY SIMULTANEOUS HYPOTHESIS TESTS CAN NORMAL, STUDENT S t OR BOOTSTRAP CALIBRATION BE APPLIED? Jianqing Fan Peter Hall Qiwei Yao TO HOW MANY SIMULTANEOUS HYPOTHESIS TESTS CAN NORMAL, STUDENT S t OR BOOTSTRAP CALIBRATION BE APPLIED? Jianqing Fan Peter Hall Qiwei Yao ABSTRACT. In the analysis of microarray data, and in some other

More information

STAT 263/363: Experimental Design Winter 2016/17. Lecture 1 January 9. Why perform Design of Experiments (DOE)? There are at least two reasons:

STAT 263/363: Experimental Design Winter 2016/17. Lecture 1 January 9. Why perform Design of Experiments (DOE)? There are at least two reasons: STAT 263/363: Experimental Design Winter 206/7 Lecture January 9 Lecturer: Minyong Lee Scribe: Zachary del Rosario. Design of Experiments Why perform Design of Experiments (DOE)? There are at least two

More information

High-throughput Testing

High-throughput Testing High-throughput Testing Noah Simon and Richard Simon July 2016 1 / 29 Testing vs Prediction On each of n patients measure y i - single binary outcome (eg. progression after a year, PCR) x i - p-vector

More information

Testing Statistical Hypotheses

Testing Statistical Hypotheses E.L. Lehmann Joseph P. Romano Testing Statistical Hypotheses Third Edition 4y Springer Preface vii I Small-Sample Theory 1 1 The General Decision Problem 3 1.1 Statistical Inference and Statistical Decisions

More information

A Mixture Gatekeeping Procedure Based on the Hommel Test for Clinical Trial Applications

A Mixture Gatekeeping Procedure Based on the Hommel Test for Clinical Trial Applications A Mixture Gatekeeping Procedure Based on the Hommel Test for Clinical Trial Applications Thomas Brechenmacher (Dainippon Sumitomo Pharma Co., Ltd.) Jane Xu (Sunovion Pharmaceuticals Inc.) Alex Dmitrienko

More information

MULTISTAGE AND MIXTURE PARALLEL GATEKEEPING PROCEDURES IN CLINICAL TRIALS

MULTISTAGE AND MIXTURE PARALLEL GATEKEEPING PROCEDURES IN CLINICAL TRIALS Journal of Biopharmaceutical Statistics, 21: 726 747, 2011 Copyright Taylor & Francis Group, LLC ISSN: 1054-3406 print/1520-5711 online DOI: 10.1080/10543406.2011.551333 MULTISTAGE AND MIXTURE PARALLEL

More information

Positive false discovery proportions: intrinsic bounds and adaptive control

Positive false discovery proportions: intrinsic bounds and adaptive control Positive false discovery proportions: intrinsic bounds and adaptive control Zhiyi Chi and Zhiqiang Tan University of Connecticut and The Johns Hopkins University Running title: Bounds and control of pfdr

More information

Supplement to Post hoc inference via joint family-wise error rate control

Supplement to Post hoc inference via joint family-wise error rate control Supplement to Post hoc inference via joint family-wise error rate control Gilles Blanchard Universität Potsdam, Institut für Mathemati Karl-Liebnecht-Straße 24-25 14476 Potsdam, Germany e-mail: gilles.blanchard@math.uni-potsdam.de

More information

Improving the Performance of the FDR Procedure Using an Estimator for the Number of True Null Hypotheses

Improving the Performance of the FDR Procedure Using an Estimator for the Number of True Null Hypotheses Improving the Performance of the FDR Procedure Using an Estimator for the Number of True Null Hypotheses Amit Zeisel, Or Zuk, Eytan Domany W.I.S. June 5, 29 Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving

More information

An Alpha-Exhaustive Multiple Testing Procedure

An Alpha-Exhaustive Multiple Testing Procedure Current Research in Biostatistics Original Research Paper An Alpha-Exhaustive Multiple Testing Procedure, Mar Chang, Xuan Deng and John Balser Boston University, Boston MA, USA Veristat, Southborough MA,

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2004 Paper 164 Multiple Testing Procedures: R multtest Package and Applications to Genomics Katherine

More information

Multiple Testing. Anjana Grandhi. BARDS, Merck Research Laboratories. Rahway, NJ Wenge Guo. Department of Mathematical Sciences

Multiple Testing. Anjana Grandhi. BARDS, Merck Research Laboratories. Rahway, NJ Wenge Guo. Department of Mathematical Sciences Control of Directional Errors in Fixed Sequence arxiv:1602.02345v2 [math.st] 18 Mar 2017 Multiple Testing Anjana Grandhi BARDS, Merck Research Laboratories Rahway, NJ 07065 Wenge Guo Department of Mathematical

More information

Exact and Approximate Stepdown Methods For Multiple Hypothesis Testing

Exact and Approximate Stepdown Methods For Multiple Hypothesis Testing Exact and Approximate Stepdown Methods For Multiple Hypothesis Testing Joseph P. Romano Department of Statistics Stanford University Michael Wolf Department of Economics and Business Universitat Pompeu

More information

Hunting for significance with multiple testing

Hunting for significance with multiple testing Hunting for significance with multiple testing Etienne Roquain 1 1 Laboratory LPMA, Université Pierre et Marie Curie (Paris 6), France Séminaire MODAL X, 19 mai 216 Etienne Roquain Hunting for significance

More information

arxiv: v3 [math.st] 15 Jul 2018

arxiv: v3 [math.st] 15 Jul 2018 A New Step-down Procedure for Simultaneous Hypothesis Testing Under Dependence arxiv:1503.08923v3 [math.st] 15 Jul 2018 Contents Prasenjit Ghosh 1 and Arijit Chakrabarti 2 1 Department of Statistics, Presidency

More information

A Large-Sample Approach to Controlling the False Discovery Rate

A Large-Sample Approach to Controlling the False Discovery Rate A Large-Sample Approach to Controlling the False Discovery Rate Christopher R. Genovese Department of Statistics Carnegie Mellon University Larry Wasserman Department of Statistics Carnegie Mellon University

More information

Two simple sufficient conditions for FDR control

Two simple sufficient conditions for FDR control Electronic Journal of Statistics Vol. 2 (2008) 963 992 ISSN: 1935-7524 DOI: 10.1214/08-EJS180 Two simple sufficient conditions for FDR control Gilles Blanchard, Fraunhofer-Institut FIRST Kekuléstrasse

More information

Closure properties of classes of multiple testing procedures

Closure properties of classes of multiple testing procedures AStA Adv Stat Anal (2018) 102:167 178 https://doi.org/10.1007/s10182-017-0297-0 ORIGINAL PAPER Closure properties of classes of multiple testing procedures Georg Hahn 1 Received: 28 June 2016 / Accepted:

More information

Effects of dependence in high-dimensional multiple testing problems. Kyung In Kim and Mark van de Wiel

Effects of dependence in high-dimensional multiple testing problems. Kyung In Kim and Mark van de Wiel Effects of dependence in high-dimensional multiple testing problems Kyung In Kim and Mark van de Wiel Department of Mathematics, Vrije Universiteit Amsterdam. Contents 1. High-dimensional multiple testing

More information

More powerful control of the false discovery rate under dependence

More powerful control of the false discovery rate under dependence Statistical Methods & Applications (2006) 15: 43 73 DOI 10.1007/s10260-006-0002-z ORIGINAL ARTICLE Alessio Farcomeni More powerful control of the false discovery rate under dependence Accepted: 10 November

More information

arxiv: v1 [math.st] 14 Nov 2012

arxiv: v1 [math.st] 14 Nov 2012 The Annals of Statistics 2010, Vol. 38, No. 6, 3782 3810 DOI: 10.1214/10-AOS829 c Institute of Mathematical Statistics, 2010 arxiv:1211.3313v1 [math.st] 14 Nov 2012 THE SEQUENTIAL REJECTION PRINCIPLE OF

More information

CHOOSING THE LESSER EVIL: TRADE-OFF BETWEEN FALSE DISCOVERY RATE AND NON-DISCOVERY RATE

CHOOSING THE LESSER EVIL: TRADE-OFF BETWEEN FALSE DISCOVERY RATE AND NON-DISCOVERY RATE Statistica Sinica 18(2008), 861-879 CHOOSING THE LESSER EVIL: TRADE-OFF BETWEEN FALSE DISCOVERY RATE AND NON-DISCOVERY RATE Radu V. Craiu and Lei Sun University of Toronto Abstract: The problem of multiple

More information

Consonance and the Closure Method in Multiple Testing. Institute for Empirical Research in Economics University of Zurich

Consonance and the Closure Method in Multiple Testing. Institute for Empirical Research in Economics University of Zurich Institute for Empirical Research in Economics University of Zurich Working Paper Series ISSN 1424-0459 Working Paper No. 446 Consonance and the Closure Method in Multiple Testing Joseph P. Romano, Azeem

More information

Multiple Testing Procedures under Dependence, with Applications

Multiple Testing Procedures under Dependence, with Applications Multiple Testing Procedures under Dependence, with Applications Alessio Farcomeni November 2004 ii Dottorato di ricerca in Statistica Metodologica Dipartimento di Statistica, Probabilità e Statistiche

More information