Procedures controlling generalized false discovery rate
|
|
- George Hampton
- 5 years ago
- Views:
Transcription
1 rocedures controlling generalized false discovery rate By SANAT K. SARKAR Department of Statistics, Temple University, hiladelphia, A 922, U.S.A. sanat@temple.edu AND WENGE GUO Department of Environmental Health, University of Cincinnati, Cincinnati, OH , U.S.A. guowg@ .uc.edu November 29, 2006 Summary rocedures controlling error rates measuring at least false rejections, instead of at least one, can potentially increase the ability of a procedure to detect false null hypotheses in situations where one sees to control or more false rejections having tolerated a few of them. The -FWER, which is the probability of at least false rejections and generalizes the usual familywise error rate (FWER), is such an error rate that is recently introduced in the literature and procedures controlling it have been proposed. This paper considers an alternative and less conservative notion of error rate, the -FDR, which is the expected proportion of or more false rejections among all rejections and generalizes the usual notion of false discovery rate (FDR). rocedures with the -FDR control dominating the Benjamini-Hochberg stepup FDR procedure and its stepdown analog under independence or positive dependence and the Benjamini-Yeutieli stepup FDR procedure under any form of dependence have been constructed. Some ey words: Multiple hypothesis testing; Generalized FDR; Stepwise procedures; ositive regression dependence on subset; Clumpy dependence; Arbitrary dependence; Average power; Gene expressions.
2 . Introduction We consider in this article the general problem of simultaneously testing of n null hypotheses H i, i =,..., n, against their respective alternatives, using tests that are available for these individual hypotheses. rocedures developed for dealing with this problem are traditionally based on the idea of controlling the familywise error rate (FWER), the probability of rejecting at least one true null hypothesis [Hochberg & Tamhane (987)]. However, quite often, especially when large number of hypotheses are simultaneously tested, the notion of FWER turns out to be too stringent, allowing little chance to detect many false null hypotheses. Therefore, researchers have focused in the last decade on defining alternative less stringent error rates and developing methods that control them. The false discovery rate (FDR), the expected proportion of falsely rejected null hypotheses, due to Benjamini & Hochberg (995), is the first of these alternative error rates that has received considerable attention [Benjamini & Yeutieli (200,2005), Genovese & Wasserman (2002, 2004), Sarar (2002, 2004, 2006a, b), Storey (2002, 2003) and Storey et al. (2004)]. Recently, the ideas of controlling the probabilities of falsely rejecting at least null hypotheses, which is the -FWER, and the false discovery proportion (FD) exceeding a certain threshold γ [0, ) have been introduced as alternatives to the FWER and methods controlling these new error rates have been suggested. Korn et al. (2004) gave a permutation-based procedure to control the -FWER and FD approximately. In van der Laan et al. (2004), alternative procedures controlling both the -FWER and FD are proposed augmenting single step and stepwise FWER procedures and related finite sample as well as asymptotic results are given. Single-step and stepwise -FWER and FD procedures for finite number of hypotheses in terms of only the marginal null distributions of the p-values either under independence or arbitrary dependence are given in Hommel & Hoffmann (987), Lehmann & Romano (2005), Guo & Romano (2006), Romano & Shaih (2006a, b) and Romano & Wolf (2005). When the p-values are independent or positively dependent in the sense of being multivariate totally positive of order two (MT 2 ) [Karlin & Rinott (980)], a condition satisfied in some multiple testing situations, Sarar (2005a) developed alternative single-step and stepwise -FWER procedures utilizing th order joint null distributions of the p-values. Other methods controlling the -FWER and FD are discussed in Dudoit et al. (2004). Often in practice one is willing to tolerate a few false rejections but wants to control too many of them, say or more. A procedure controlling the -FWER in this case will have better ability to 2
3 mae or more true discoveries than the corresponding FWER ( = ) procedure. Sarar (2005b) considered generalizing the FDR to the -FDR, the expected ratio of or more false rejections to the total number of rejections, which is a less conservative notion of error rate than the -FWER. Under the independence or the MT 2 positive dependence of the p-values, he developed a -FDR procedure utilizing th order joint null distributions of the p-values. It is less conservative and uniformly more powerful in maing or more true discoveries than the -FWER procedure that Sarar (2005a) developed as a generalization of Hochberg s (988) FWER procedure, and often outperforms the Benjamini-Hochberg (BH) stepup FDR procedure. Sarar (2005b) has also developed a -FDR procedure utilizing th order joint null distributions of the p-values under any form of dependence of the p-values, which generalizes Benjamini & Yeutieli s (200) (BY) stepup procedure but does not uniformly outperform it. In this article, we focus on developing -FDR procedures uniformly outperforming the BH and BY stepup procedures and also the stepdown analog of the BH procedure. Specifically, we consider positively dependent p-values for which a stepwise procedure, stepdown or steup, with critical values of the BH procedure (to be simply referred to as the BH stepwise procedure) is nown to control the FDR, and hence the -FDR conservatively [Benjamini & Yeutieli (200) and Sarar (2002)], and generalize it to a -FDR stepwise procedure that allows more rejections. The positive dependence condition in this paper is slightly weaer than considered earlier in the above two papers. We offer two such generalizations in the positive dependence case using bivariate distributions of the null p-values, one more conservative than the other but easier to implement, both reducing to the same procedure under independence. Based on numerical calculations, we show that, for appropriately chosen values of, our -FDR stepwise procedure under independence is uniformly more powerful than the corresponding BH stepwise procedure. Under positive dependence, each of these -FDR procedures remains more powerful than the corresponding BH stepwise procedure under wea dependence, but unfortunately loses its edge with increasing dependence. Often in practice, lie in microarray analysis and fmri studies, the p-values tend to be clumpy dependent in that they are more dependent within small groups, strongly or wealy, but are independent between these groups. The -FDR procedures in this case are seen to improve their performance. They remain more powerful than the corresponding BH stepwise procedure even for large positive correlations. An alternative stepdown -FDR procedure is proposed that performs better than the one in the above stepwise procedure under independence. When applied to the gene expression data in Hedenfal et al. (200), 3
4 it is seen to detect more differentially expressed genes than the BH stepup procedure for different values of. Finally, we develop a generalized BY procedure that uniformly outperforms the original BY procedure as a -FDR procedure under any form of dependence. 2. reliminaries Suppose that H,..., H n are the null hypotheses that are to be simultaneously tested using the corresponding p-values,... n, respectively. Let () (n) be the ordered p-values and H (),..., H (n) the associated null hypotheses. Then, given a non-decreasing set of critical constants 0 < α α n <, a stepdown multiple testing procedure accepts the set of null hypotheses H (i), i i SD and rejects the rest, where i SD = mini : (i) > α i, if the minimum exists, otherwise rejects all the null hypotheses. A stepup procedure, on the other hand, rejects the set H i, i i SU and accepts the rest, where i SU = maxi : (i) α i, if the maximum exists, otherwise accepts all the null hypotheses. A stepwise (stepdown or stepup) procedure with the same constant is referred to as a single-step procedure. The constants in a stepwise procedure are determined subject to the control at a pre-specified level α of a suitable error rate. Let V denote the number of falsely rejected null hypotheses. Then, the - FWER is defined as -FWER = V, with -FWER being the original FWER. The following lemma provides a general theory towards determining the constants in a stepwise procedure providing a control of the -FWER. It is assumed that there are n 0 true null hypotheses and ˆ,..., ˆ n0 are the p-values that correspond to these null hypotheses. Lemma. Given a stepwise procedure involving,..., n and the critical values 0 < α α n <, consider the corresponding stepwise procedure in terms of the null p-values ˆ,..., ˆ n0 and the critical values α n n0 + α n. Let V n denote the number of false rejections in the original stepwise procedure and ˆR n0 denote the number of rejections in the stepwise procedure involving the null p-values. Then, we have for any fixed n 0 n, V n ˆRn0. roof. Let ˆ () ˆ (n0 ) be the ordered values of ˆ,..., ˆ n0. Note that ˆ (j) (n n0 +j) for all j n 0. Therefore, the original stepwise procedure rejects less number of null hypotheses, and hence has less number of falsely rejected null hypotheses, than the corresponding stepwise 4
5 procedure where the p-values corresponding to the false null hypotheses are all very close to zero. Thus, the lemma follows. Remar. According to Lemma, construction of a stepwise procedure providing a control of the -FWER at α basically reduces to that of finding the constants in that procedure guaranteeing the inequality ˆR n0 α for all n 0 n. It unifies the arguments used separately towards constructing stepdown -FWER and stepup -FWER procedures [Lehmann & Romano (2005) and Romano & Shaih (2006b)]. It essentially implies that the least favorable configuration for the - FWER corresponds to the situation where p-values corresponding to false null hypotheses are all zero. Sarar (2005b) has recently introduced the concept of -FDR. Let R denote the total number of rejections. Then, it is defined as -FDR = E (-FD), where -FD = V R if V 0 otherwise. That is, the -FDR is the expected ratio of or more false rejections to the total number of rejections. When =, it reduces to the original FDR. The -FDR is a less conservative notion of error rate than the FDR, as -FDR FDR. Using it, instead of the FDR, when one is willing to control at least false rejections rather than at least one, is a natural generalization of the idea of using the -FWER instead of the FWER. We will be assuming in this article that each i under its respective true null hypothesis H i is U(0, ) and jointly the p-values are positively dependent in the following sense: E φ(,..., n ) ˆi u u (0, ), () for every ˆ i and any increasing (coordinatewise) function φ. The condition () is more relaxed than the positive regression dependence on subset (RDS) condition, that is, E φ(,..., n ) ˆi = u u (0, ), used in Benjamini & Yeutieli (200) and Sarar (2002), and is satisfied by the p-values arising in a number of multiple testing situations. In particular, it is satisfied by the p-values corresponding to multivariate normal test statistics with a common non-negative correlation that we consider in our numerical calculations later. Since the -FDR is 0, and hence trivially controlled, by 5
6 any procedure if n 0 <, we will assume throughout this paper that n 0 n. 3. -FDR ROCEDURES UNDER INDEENDENCE OR OSITIVE DEENDENCE We will develop in this section stepwise procedures that control the -FDR, for 2, under the condition (). To this end, we first have V -F DR = E I (V ) = E R r = i= r= r= i= r ˆi α r, V, R = r ( ) I ˆi α r, V, R = r = i= r= α r r V, R = r ˆi α r. (2) With r = max, r, let α r = (r )α /, for some fixed 0 < α <. Then, the -FDR in (2) simplifies to -F DR = α i= r= V, R = r ˆi α r. Now, for any r > and i n 0, V, R = r ˆi α r = V, R r ˆi α r V, R r + ˆi α r V, R r ˆi α r V, R r + ˆi α r. The inequality follows from the assumption (), since V, R r is a decreasing set for a stepwise procedure. Thus, we get -F DR α α = α i= n 0 i= n 0 i= Applying Lemma to (3), we get V, R = ˆi α + α V, R = ˆi α + α i= i= r=+ n 0 V, R = r ˆi α r V, R + ˆi α i= V ˆi α = V, ˆ i α. (3) -F DR i= ˆRn0, ˆ i α. Let ˆR ( i) n 0 denote the number of rejections in the corresponding stepwise procedure based on the 6
7 ordered values ˆ ( i) () ( i) ˆ (n 0 ) of ˆ,..., ˆ n0 \ ˆi and the n 0 critical values α n n0 +2 α n. Then, for any r n 0, we notice that, when our procedure is a stepup procedure ˆRn0 = r, ˆ i α = ˆ(r) α n n0 +r, ˆ (r+) > α n n0 +r+,..., ˆ (n0 ) > α n, ˆi α ( i) = ˆ (r ) α ( i) ( i) n n 0 +r, ˆ (r) > α n n0 +r+,..., ˆ (n 0 ) > α n, ˆi α = ˆR( i) n 0 = r, ˆ i α ; (4) whereas, for a stepdown procedure, we have ˆRn0 = r, ˆ i α = ˆ() α n n0 +,, ˆ (r) α n n0 +r, ˆ (r+) > α n n0 +r+, ˆi α ( i) ( i) ˆ () α n n0 +2,, ˆ (r ) α ( i) n n 0 +r, ˆ (r) > α n n0 +r+, ˆi α = ˆR( i) n 0 = r, ˆ i α. (5) Thus, for a stepwise procedure, we get -F DR i= r= = i= j( i)= r= ˆR( i) n 0 = r, ˆ i α r ˆR( i) n 0 = r, ˆ i α, ˆ j α n n0 +r. (6) As explained in (4) and (5), ˆR( i) n 0 = r, ˆ i α, ˆ j α n n0 +r = ˆR( i, j) n 0 2 = r 2, ˆ i α, ˆ j α n n0 +r, for a stepup procedure, and ˆR( i) n 0 = r, ˆ i α, ˆ j α n n0 +r ˆR( i, j) n 0 2 = r 2, ˆ i α, ˆ j α n n0 +r, ˆR ( i, j) for a stepdown procedure, where n 0 2 is the number of rejections in the corresponding stepwise procedure involving ˆ,..., ˆ n0 \ ˆi, ˆ j and critical values α n n0 +3 α n. Thus, -F DR (n n 0 + )α 2 ( ) i= j( i)= r= ˆR( i, j) n 0 2 = r 2, ˆ i α ˆj α n n0 +r. 7
8 The inequality follows from the fact that n n 0 + r r n n 0 +, for all r n 0. Since for r >, = ˆR( i, j) n 0 2 = r 2, ˆ i α ˆj α n n0 +r ˆR( i, j) n 0 2 r 2, ˆ i α ˆ j α n n0 +r ˆR( i, j) n 0 2 ˆ j α n n0 +r r 2, ˆ i α ˆ j α n n0 +r ˆ j α n n0 +r, ˆR( i, j) n 0 2 r, ˆ i α ( i, j) ˆR n 0 2 r, ˆ i α with the inequality following due to the property () and the fact that is decreasing, we have ˆR( i, j) n 0 2 r 2, ˆ i α = = r= ˆR( i, j) n 0 2 = r 2, ˆ i α ˆj α n n0 +r ˆR( i, j) n 0 2 = 2, ˆ i α ˆj α n n0 + ˆ i α ˆ j α n n0 +r ˆR( i, j) n 0 2 = 2, ˆ i α ˆj α n n0 + ˆ j α n n0 + ˆR( i, j) n 0 2 2, ˆ i α ˆj α n n0 + + r=+ [ ˆR( i, j) n 0 2 r 2, ˆR( i, j) n 0 2 r, ˆ i α ˆj α n n0 +r. + ] ˆR( i, j) n 0 2, ˆ i α Therefore, -F DR (n n 0 + )α 2 ( ) (n n 0 + )α 2 ( ) = ( ) i= j( i)= i= j( i)= i= j( i)= ( i, j) ˆR n 0 2 2, ˆ i α ˆj α n n0 + ˆ i α ˆj α n n0 + ˆ i α, ˆ j α n n0 +. (7) 8
9 Thus, we have the following theorem as one of our main results. Theorem. For a stepwise procedure with the p-values satisfying () and critical values given by α i = i α /, i =,..., n, for some fixed 2 n and 0 < α <, we have -F DR max n 0 n ( ) i= j( i)= ( H ij α, (n n ) 0 + )α, (8) where H ij (u, v) = ˆ i u, ˆ j v, for i j. Remar 2. When =, the right-hand side of (3) reduces to n 0 α, providing the inequality FDR n 0 α. It is being generalized in Theorem by using joint distributions of the p-values when 2. This result for = is nown in the literature [Benjamini & Hochberg (200) and Sarar (2002)], but under conditions slightly more restrictive than (). Theorem highlights the point, which was emphasized in Sarar (2005b), that the concept of -FDR, lie that of the -FWER, allows one to explicitly utilize the joint distribution of the p-values. However, while Sarar (2005b) used the th order joint null distributions of the p-values, here we use only the bivariate null distributions. Moreover, we are relying on a less restrictive assumption on the dependence of the p-values. We can invoe the same positive dependence property () shared by every pair ( ˆ i, ˆ j ) and apply the inequality ˆ i α ˆj α n n0 + ˆ i α ˆj α to the second to last line in (7) to obtain the following corollary to Theorem, providing an upper bound to the -FDR (with 2) that is more relaxed and easier to implement than the one in (8). Corollary. For the stepwise procedure in Theorem and under the conditions stated therein, we have, for 2, -F DR max n 0 n n n ( ) i= j( i)= H ij (α, α ). (9) The -FDR of the stepwise procedure in Theorem can be controlled at α under the conditions stated in the theorem by equating the upper bound in (8) or (9) to α and solving the resulting equation for α. Of course, one needs to now the bivariate distributions of all pairs of null p-values. For instance, when the null p-values are exchangeable with H as the common and nown bivariate 9
10 cdf, the α can be obtained satisfying the equation ( n0 (n 0 ) max n 0 n ( ) H α, (n n ) 0 + )α = α, (0) or, can be obtained little more conservatively by solving D(, n)h (α, α ) 2 ( ) = α, () where D(, n) = max n 0(n 0 )(n n 0 + ). n 0 n In particular, when the p-values are independent, we have the following: roposition. Consider a stepwise procedure with the critical values α i = (i )β/n, i =,..., n, with β = n ( )α/d(, n). It controls the -FDR at α when the p-values are independent. Remar 3. It is important to note that in a stepwise procedure with a control of the -FDR, the first critical values can be chosen arbitrarily without affecting the -FDR, as in the case of a -FWER stepwise procedure. Nevertheless, as argued in Lehmann & Romano (2005) and Sarar (2005a, b), eeping these critical values constant at the th critical value would be the best option. Thus, even though one can use the stepwise BH procedure with the critical values α i = iα/n, i =,..., n, as a -FDR procedure, since it controls the FDR under the conditions assumed in this paper (see Remar 2), one may consider improving it by modifying the critical values to α i = (i )α/n, i =,..., n. Nevertheless, as seen in Theorem, it does not tae full advantage of the notion of -FDR in the sense that its critical values are not determined subject to a direct control of the -FDR, and hence can potentially be improved. roposition helps us develop such an improvement for some choices of (n, ) at least under the independence. The procedure in roposition will be uniformly less conservative and more powerful than the corresponding stepwise BH procedure if β > α, that is, if n 2 ( ) D(, n) > α. (2) 0
11 Since D(, n) = ñ 0 (ñ 0 )(n + ñ 0 ), with ñ 0 = n [ 3(n + ) (n + + ) 2 ] 2, one can determine the minimum value of for which the inequality in (2) holds for given n and α. Table presents such values of for some values of n and α = For instance, with 000 null hypotheses, if one is willing to allow at most false rejections and sees to control the -FDR at α = 0.05, our procedure in roposition does provide a better control of the -FDR than the corresponding stepwise BH procedure under the independence of the p-values as long as 9. The stepdown procedure in roposition can be improved under independence as follows. Consider a stepdown procedure with critical values α i = i β/n, i =,..., n, for a fixed 0 < β <. From the first line in (6), we see that for this procedure -F DR n 0β n R n 0, where R n0 is the number of rejections in the stepdown procedure based on ˆ ():n0 ˆ (n0 ):n 0, the ordered versions of any n 0 of the n 0 null p-values, and the corresponding critial values α n n0 +2 α n. Let G,n (u) = U () u = n j j= u j ( u) n j, the cdf of the th order statistic based on n iid U(0, ). Then, since R n0 = ˆ:n0 α n n0 +2,..., ˆ ( ):n0 α n n0 + G,n0 (α n n0 +), we have the following: roposition 2. Consider a stepdown procedure with the critical values α i = (i )β/n, i =
12 ,..., n, where ( ) β (n n max n0 + )β n 0 G,n0 = α. n 0 n n It controls the -FDR at α when the p-values are independent. Table 2 presents values of β in roposition 2 for some choices of (n, ) and α = It shows that the stepdown procedure in roposition 2 indeed performs better than the stepdown procedure in roposition under independence. It continues to be a more powerful -FDR procedure than the stepdown BH procedure even for values of smaller than those for the stepdown procedure in roposition. With positively dependent p-values, since the upper bound in Theorem or its corollary is greater than or equal to the corresponding upper bound under independence, the same critical values in roposition may not continue to control the -FDR. In this case, as noted before, one needs to compute α, or equivalently β, where α = β/n, from these upper bounds using the bivariate distributions H ij. For p-values generated from multivariate normal test statistics with a common non-negative correlation ρ, values of β are numerically computed using these bounds for some values of (n, ) and α = 0.05 and are presented in Table 3, where β β = nα / with α in (0) and β β 2 = nα / with α in (). With significantly larger critical values than the corresponding BH stepwise procedure, our proposed -FDR stepwise procedures, corresponding to β and β 2, are seen to be quite powerful compared to the corresponding stepwise BH procedure when the p-values are independent or wealy but positively dependent. However, as the p-values become more and more positively dependent our procedures unfortunately lose their edges over the corresponding stepwise BH procedure. Often in practice, lie in microarray analysis and fmri studies, the p-values tend to be clumpy dependent in that they are more dependent within groups than between groups. Suppose that there are g independent groups and, for each i =,..., n 0, J i is the set of indices of null hypotheses in the group containing the ˆ i. Clearly, J i J j if i and j belong to the same group. The upper bounds in 2
13 Theorem and Corollary can be expressed, respectively, in this case as -F DR max n 0 n n n ( ) i= j( i) J i [ n n 0 + H ij ( α, (n n ) ] 0 + )α + (n 0 J i )α 2, and -F DR max n 0 n n n ( ) i= j( i) J i [ Hij (α, α ) + (n 0 J i )α] 2, where J i is the cardinality of J i. Generating n p-values from multivariate normal test statistics with a common non-negative correlation ρ within independent groups each containing n/g p-values, we repeat our numerical calculations to examine how our stepwise procedures perform compared to the corresponding stepwise BH procedure in this case. Table 4 presents the values of β and β 2 for some values of (n,, g, ρ) and α = This time, our stepwise procedures are seen to uniformly dominate the corresponding stepwise BH procedure even for positive correlations as large as 0.5. For correlation larger than 0.5, while the procedure corresponding to β continues to uniformly dominate the corresponding stepwise BH procedure, the procedure corresponding to β 2 wors well only when the number of null hypotheses is large. We investigate the extent of power improvement our -FDR stepup procedures offer over the stepup BH procedure. We simulated the average power, the expected proportion of false null hypotheses that are rejected, for each of these stepup procedures. Figure presents this power comparison, with our procedures corresponding to β and β 2 labelled -FDR SU and -FDR SU 2, respectively, and the BH procedure labelled FDR BH. Each simulated power was obtained by (i) generating n = 200 dependent normal random variables N(µ i, ), i =,, n, with a common correlation ρ = 0. and with n of the 200 µ i s being equal to d = 2 and the rest 0, (ii) applying the corresponding stepup procedure with = 8 to the generated data to test H i : µ i = 0 against K i : µ i > 0 simultaneously for i =,..., 200 at α = 0.05, and (iii) repeating steps (i) and (ii),000 times before observing the proportion of the n false H i s that are correctly declared significant. As we can see from this figure, our proposed -FDR stepup procedures are uniformly more powerful than the BH stepup procedure under wea dependence, with the power difference getting significantly 3
14 higher with increasing number of false null hypotheses. 4. AN ALICATION TO GENE EXRESSION DATA Hereditary breast cancer is nown to be associated with mutations in BRCA and BRCA2 proteins. Hedenfal et al. (200) report that a group of genes are differentially expressed between tumors with BRCA mutations and tumors with BRCA2 mutations. The data, which are publicly available from the web site Supplement/, consist of 22 breast cancer samples, among which n = 7 are BRCA mutants, n 2 = 8 are BRCA2 mutants, and n 3 = 7 are sporadic (not used in this illustration). Expression levels in terms of florescent intensity ratios of a tumor sample to a common reference sample, are measured for 3, 226 genes using cdna microarrays. If any gene has one ratio exceeding 20, then this gene is eliminated, since a value of 20 is so high that it does not seem trustworthy for this data set. Such preprocessing leaves m = 3, 70 genes. We test each gene for differential expression between these two tumor types by using a twosample t-test statistic. For each gene, the base 2 logarithmic transformation of the ratio is obtained before we compute the two-sample t-test statistic. We then calculate the raw p-value by using a permutation method from Storey & Tibshirani (2003) with the times of permutation B = 2, 000. Finally, we adjust these raw p-values by using the BH stepup procedure and our proposed -FDR stepdown procedure in roposition 2. At α = 0.05 and 0.07, the BH procedure results in 73 and 95 significant genes, respectively, while those numbers for our -FDR method are presented in Table 5 for = 2, 5, 0, 20, 30, 40 and 50. The -FDR stepdown procedure is seen to always detect more differentially expressed genes than the BH procedure for these values of and α. 5. -FDR ROCEDURE UNDER ARBITRARY DEENDENCE We now consider developing a stepup procedure with a control of the -FDR under any form of dependence of the p-values, which will uniformly dominate the BY procedure with the critical values α i = iα/n n j= j, i =,..., n, or its modification obtained by eeping the first critical values same as the th one. Define p ijr = ˆi [α j, α j ], V, R = r, given a set critical values 4
15 0 = α 0 < α < < α n. Then, from (2) we have, -F DR = r= i= n 0 i= j= i= j= r ˆi α r, V, R = r = j r= j p ijr i= j= r r= i= j= r p ijr = j ˆi [α j, α j ], V i= j= r= j r p ijr α j α j. (3) j Let α i = (i )α /, for some fixed 0 < α <. Then, -F DR n 0α + j=+ j. (4) Thus, we have the following theorem. Theorem 2. A stepup procedure with the critical values α i = (i )α +, i =,..., n, n j=+ j for a fixed 0 < α <, controls the -FDR at α. Remar 4. It would be interesting to see if there exists a joint distribution of p-values under which the -FDR of a stepup procedure with critical values of the form α i = (i )α /, i =,..., n, for some 0 < α <, is equal to the upper bound in (4). Indeed, there is such a joint distribution, which is presented in the Appendix. Acnowledgement The research of the first author is supported by grants from U.S. National Science Foundation (DMS and DMS ) Appendix The construction of a joint distribution This appendix presents the construction of a joint distribution of p-values under which the - 5
16 FDR of the stepup procedure with critical values of the form α i = (i )α /, i =,..., n, for some 0 < α <, attains the upper bound in (4). Let I =, 2,, n denote the set of indices of all null hypotheses, with I 0 and I being those of true and false null hypotheses, respectively. The main idea is to construct a joint distribution under which for each i I 0 and j, each p-value ˆ i U[0, ] and the event ˆ i [α j, α j ] is same as the event V, R = j. That is, if j = r, p ijr = ˆ i [α j, α j ], V, R = r = ˆ i [α j, α j ] = α j α j. Otherwise p ijr = 0. Then, by the second equality in (3), we have -F DR = i= j= r= j r p ijr = n 0 α + j=+ j. The construction of the joint distribution proceeds as follows: let U,..., U n+ be n + 2 uniformly distributed random variables such that U U[0, α ], U n+ U[α n, ], and U i U[α i, α i ], i = +,..., n. Let M be a random variable taing values,..., n + with the following probability distribution. M = m = n 0 α if m =, α n 0 m if + m n 0, α if n 0 + m n, n + n n0 0 j=+ j α if m = n +. Here, we assume n + n 0 n0 j=+ j α, which is easily satisfied. We want to associate a p-value to each of the indices, 2,..., n. The association proceeds in 3 distinct phases. hase. For each given M = m,..., n 0, choose m indices i, i 2,..., i m randomly without replacement from I 0. Each of these chosen indices has the same p-value ij = U m. Each of the indices s in I i, i 2,..., i m has the same p-value s = U n+. hase 2. For each given M = m n 0 +,..., n, choose all n 0 indices from I 0 and (m n 0 ) indices randomly without replacement from I. Each of these indices is associated with the same 6
17 p-value U m. Each of the remaining unaccounted indices from I is associated with the same p-value U n+. hase 3. Given M = n +, each of the indices in I 0 is associated with the same p-value U n+, and each of the indices in I is associated with the same p-value zero. It is easy to verify that, for each i I 0 and j, unconditionally, each p-value ˆ i U[0, ] and when ˆ i [α j, α j ], there are totally j p-values and at least p-values corresponding to true null hypotheses less than α j, i.e., the event ˆ i [α j, α j ] is same as the event V, R = j. References Benjamini, Y. & Hochberg, Y. (995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Roy. Statist. Soc. Ser. B Benjamini, Y. & Yeutieli, D. (200). The control of the false discovery rate in multiple testing under dependency. Ann. Statist Benjamini, Y. & Yeutieli, D. (2005). False discovery rate-adjusted multiple confidence intervals for selected parameters. J Amer. Statist. Assoc Dudoit, S., van der Laan, M. & ollard, K. (2004). Multiple testing: art I. single-step procedures for control of general type I error rates. Statist. App. Gen. Mol. Bio. 3 (): Article 3. Genovese, C. & Wasserman, L. (2002). Operarting characteristics and extensions of the false discovery rate procedure. J. Roy. Statist. Soc. Ser. B Genovese, C. & Wasserman, L. (2004). A stochastic process approach to false discovery control. Ann. Statist Guo, W. & Romano, J.. (2006). A generalized Sidá procedure and control of generalized error rates under independence. Unpublished Manuscript. Hedenfal, I., Duggan, D., Chen, Y., Radmacher, M., Bittner, M., Simon, R., Meltzer,., Gusterson, B., Esteller, M., Kallioniemi, O., Wilfond, B., Borg, A. & Trent, J. (2006). Gene-expression profiles in hereditary breast cancer. The New England Journal of Medicine
18 Hochberg, Y. (988). A sharper Bonferroni procedure for multiple tests of significance. Biometria Hochberg, Y. & Tamhane, A. C. (987). Multiple Comparison rocedures. New Yor: Wiley. Hommel, G. & Hoffmann, T. (987). Controlled uncertainty. In Multiple Hypothesis Testing. Bauer, G. Hommel and E. Sonnemann, eds. 54 6, Springer, Heidelberg. Karlin, S. & Rinott, Y. (980). Classes of orderings of measures and related correlation inequalities I: Multivariate totally positive distributions. J. Mult. Anal Korn, E., Troendle, T., McShane, L. & Simon, R. (2004). Controlling the number of false discoveries: Application to high-dimensional genomic data. J. Statist. lann. Inf Lehmann, E. L. & Romano, J. (2005). Generalizations of the familywise error rate. Ann. Statist Romano, J.. & Shaih, A. M. (2006a). On stepdown control of the false discovery proportion In. The Second E.L. Lehamann Symposium - Optimality Ed. J. Rojo, IMS Lecture Notes - Monograph Series. Romano, J.. & Shaih, A. M. (2006b). Stepup procedures for control of generalizations of the familywise error rate. Ann. Statist. To appear. Romano, J.. & Wolf, M. (2005). Control of generalized error rates in multiple testing. Unpublished Manuscript. Sarar, S. K. (2002). Some results on false discovery rate in stepwise multiple testing procedures. Ann. Statist Sarar, S. K. (2004). FDR-controlling stepwise procedures and their false negatives rates. J. Statist. lann. Inf Sarar, S. K. (2005a). Generalizing Simes test and Hochberg s stepup procedure. Technical Report, Temple University. Sarar, S. K. (2005b). Stepup procedures controlling generalized FWER and generalized FDR. Technical Report, Temple University. 8
19 Sarar, S. K. (2006a). False discovery and false non-discovery rates in single-step multiple testing procedures. Ann. Statist Sarar, S. K. (2006b). Two-stage stepup procedures controlling FDR Technical Report, Temple University. Storey, J. D. (2002). A direct approach to false discovery rates. J. Roy. Statist. Soc. Ser. B Storey, J. D. (2003). The positive false discovery rate: A Bayesian interpretation and the q-value. Ann. Statist Storey, J. D., Taylor, J. E. & Siegmund, D. (2004). Strong control, conservative point estimation and simultanaeous conservative consistency of false discovery rates: A unified approach. J. Roy. Statist. Soc. Ser. B Storey, J. D. & Tibshirani, R. (2003). Statistical significance for genomewide studies. roceedings of the National Academy of Sciences van der Laan, M., Dodoit, S. & ollard, K. (2004). Augmentation procedures for control of the generalized family-wise error rate and tail probabilities for the proportion of false positives. Stat. App. Gen. Mol. Bio. 3(): Article 5. 9
20 Table : Minimum values for different n under independence with α = n Table 2: Values of β for the stepdown procedure in roposition 2 for different (n, ) with α = n = = = Table 3: Values of β and β 2 for different (n, ) and non-negative ρ with α = ρ = 0.0 ρ = 0.05 ρ = 0.0 ρ = 0.5 ρ = 0.20 n β β 2 β β 2 β β 2 β β 2 β β
21 Table 4: Values of β and β 2 for different (n,, g) and non-negative ρ under clumpy dependence with α = ρ = 0.0 ρ = 0.2 ρ = 0.5 ρ = 0.8 n g β β 2 β β 2 β β 2 β β Table 5: Numbers of differentially expressed genes for the data in Hedenfal et al. (200) using the -FDR stepdown procedure in roposition 2. = 2 = 5 = 0 = 20 = 30 = 40 = 50 α = α =
22 Average ower FDR BH FDR SU FDR SU The number of false null hypotheses Figure : ower of two -FDR stepup procedures in the case of positive dependence with parameters n = 200, = 8, d = 2, ρ = 0. and α =
STEPDOWN PROCEDURES CONTROLLING A GENERALIZED FALSE DISCOVERY RATE. National Institute of Environmental Health Sciences and Temple University
STEPDOWN PROCEDURES CONTROLLING A GENERALIZED FALSE DISCOVERY RATE Wenge Guo 1 and Sanat K. Sarkar 2 National Institute of Environmental Health Sciences and Temple University Abstract: Often in practice
More informationChapter 1. Stepdown Procedures Controlling A Generalized False Discovery Rate
Chapter Stepdown Procedures Controlling A Generalized False Discovery Rate Wenge Guo and Sanat K. Sarkar Biostatistics Branch, National Institute of Environmental Health Sciences, Research Triangle Park,
More informationPROCEDURES CONTROLLING THE k-fdr USING. BIVARIATE DISTRIBUTIONS OF THE NULL p-values. Sanat K. Sarkar and Wenge Guo
PROCEDURES CONTROLLING THE k-fdr USING BIVARIATE DISTRIBUTIONS OF THE NULL p-values Sanat K. Sarkar and Wenge Guo Temple University and National Institute of Environmental Health Sciences Abstract: Procedures
More informationON STEPWISE CONTROL OF THE GENERALIZED FAMILYWISE ERROR RATE. By Wenge Guo and M. Bhaskara Rao
ON STEPWISE CONTROL OF THE GENERALIZED FAMILYWISE ERROR RATE By Wenge Guo and M. Bhaskara Rao National Institute of Environmental Health Sciences and University of Cincinnati A classical approach for dealing
More informationTwo-stage stepup procedures controlling FDR
Journal of Statistical Planning and Inference 38 (2008) 072 084 www.elsevier.com/locate/jspi Two-stage stepup procedures controlling FDR Sanat K. Sarar Department of Statistics, Temple University, Philadelphia,
More informationOn adaptive procedures controlling the familywise error rate
, pp. 3 On adaptive procedures controlling the familywise error rate By SANAT K. SARKAR Temple University, Philadelphia, PA 922, USA sanat@temple.edu Summary This paper considers the problem of developing
More informationOn Methods Controlling the False Discovery Rate 1
Sankhyā : The Indian Journal of Statistics 2008, Volume 70-A, Part 2, pp. 135-168 c 2008, Indian Statistical Institute On Methods Controlling the False Discovery Rate 1 Sanat K. Sarkar Temple University,
More informationWeb-based Supplementary Materials for. of the Null p-values
Web-based Supplementary Materials for ocedures Controlling the -FDR Using Bivariate Distributions of the Null p-values Sanat K Sarar and Wenge Guo Temple University and National Institute of Environmental
More informationStep-down FDR Procedures for Large Numbers of Hypotheses
Step-down FDR Procedures for Large Numbers of Hypotheses Paul N. Somerville University of Central Florida Abstract. Somerville (2004b) developed FDR step-down procedures which were particularly appropriate
More informationModified Simes Critical Values Under Positive Dependence
Modified Simes Critical Values Under Positive Dependence Gengqian Cai, Sanat K. Sarkar Clinical Pharmacology Statistics & Programming, BDS, GlaxoSmithKline Statistics Department, Temple University, Philadelphia
More informationFDR-CONTROLLING STEPWISE PROCEDURES AND THEIR FALSE NEGATIVES RATES
FDR-CONTROLLING STEPWISE PROCEDURES AND THEIR FALSE NEGATIVES RATES Sanat K. Sarkar a a Department of Statistics, Temple University, Speakman Hall (006-00), Philadelphia, PA 19122, USA Abstract The concept
More informationApplying the Benjamini Hochberg procedure to a set of generalized p-values
U.U.D.M. Report 20:22 Applying the Benjamini Hochberg procedure to a set of generalized p-values Fredrik Jonsson Department of Mathematics Uppsala University Applying the Benjamini Hochberg procedure
More informationComments on: Control of the false discovery rate under dependence using the bootstrap and subsampling
Test (2008) 17: 443 445 DOI 10.1007/s11749-008-0127-5 DISCUSSION Comments on: Control of the false discovery rate under dependence using the bootstrap and subsampling José A. Ferreira Mark A. van de Wiel
More informationRejoinder on: Control of the false discovery rate under dependence using the bootstrap and subsampling
Test (2008) 17: 461 471 DOI 10.1007/s11749-008-0134-6 DISCUSSION Rejoinder on: Control of the false discovery rate under dependence using the bootstrap and subsampling Joseph P. Romano Azeem M. Shaikh
More informationA Bayesian Determination of Threshold for Identifying Differentially Expressed Genes in Microarray Experiments
A Bayesian Determination of Threshold for Identifying Differentially Expressed Genes in Microarray Experiments Jie Chen 1 Merck Research Laboratories, P. O. Box 4, BL3-2, West Point, PA 19486, U.S.A. Telephone:
More informationSanat Sarkar Department of Statistics, Temple University Philadelphia, PA 19122, U.S.A. September 11, Abstract
Adaptive Controls of FWER and FDR Under Block Dependence arxiv:1611.03155v1 [stat.me] 10 Nov 2016 Wenge Guo Department of Mathematical Sciences New Jersey Institute of Technology Newark, NJ 07102, U.S.A.
More informationFALSE DISCOVERY AND FALSE NONDISCOVERY RATES IN SINGLE-STEP MULTIPLE TESTING PROCEDURES 1. BY SANAT K. SARKAR Temple University
The Annals of Statistics 2006, Vol. 34, No. 1, 394 415 DOI: 10.1214/009053605000000778 Institute of Mathematical Statistics, 2006 FALSE DISCOVERY AND FALSE NONDISCOVERY RATES IN SINGLE-STEP MULTIPLE TESTING
More informationBayesian Determination of Threshold for Identifying Differentially Expressed Genes in Microarray Experiments
Bayesian Determination of Threshold for Identifying Differentially Expressed Genes in Microarray Experiments Jie Chen 1 Merck Research Laboratories, P. O. Box 4, BL3-2, West Point, PA 19486, U.S.A. Telephone:
More informationOn Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses
On Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses Gavin Lynch Catchpoint Systems, Inc., 228 Park Ave S 28080 New York, NY 10003, U.S.A. Wenge Guo Department of Mathematical
More informationSTEPUP PROCEDURES FOR CONTROL OF GENERALIZATIONS OF THE FAMILYWISE ERROR RATE
AOS imspdf v.2006/05/02 Prn:4/08/2006; 11:19 F:aos0169.tex; (Lina) p. 1 The Annals of Statistics 2006, Vol. 0, No. 00, 1 26 DOI: 10.1214/009053606000000461 Institute of Mathematical Statistics, 2006 STEPUP
More informationGENERALIZING SIMES TEST AND HOCHBERG S STEPUP PROCEDURE 1. BY SANAT K. SARKAR Temple University
The Annals of Statistics 2008, Vol. 36, No. 1, 337 363 DOI: 10.1214/009053607000000550 Institute of Mathematical Statistics, 2008 GENERALIZING SIMES TEST AND HOCHBERG S STEPUP PROCEDURE 1 BY SANAT K. SARKAR
More informationTable of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. T=number of type 2 errors
The Multiple Testing Problem Multiple Testing Methods for the Analysis of Microarray Data 3/9/2009 Copyright 2009 Dan Nettleton Suppose one test of interest has been conducted for each of m genes in a
More informationarxiv: v1 [math.st] 31 Mar 2009
The Annals of Statistics 2009, Vol. 37, No. 2, 619 629 DOI: 10.1214/07-AOS586 c Institute of Mathematical Statistics, 2009 arxiv:0903.5373v1 [math.st] 31 Mar 2009 AN ADAPTIVE STEP-DOWN PROCEDURE WITH PROVEN
More informationFalse discovery rate and related concepts in multiple comparisons problems, with applications to microarray data
False discovery rate and related concepts in multiple comparisons problems, with applications to microarray data Ståle Nygård Trial Lecture Dec 19, 2008 1 / 35 Lecture outline Motivation for not using
More informationarxiv: v1 [math.st] 13 Mar 2008
The Annals of Statistics 2008, Vol. 36, No. 1, 337 363 DOI: 10.1214/009053607000000550 c Institute of Mathematical Statistics, 2008 arxiv:0803.1961v1 [math.st] 13 Mar 2008 GENERALIZING SIMES TEST AND HOCHBERG
More informationStatistical Applications in Genetics and Molecular Biology
Statistical Applications in Genetics and Molecular Biology Volume 5, Issue 1 2006 Article 28 A Two-Step Multiple Comparison Procedure for a Large Number of Tests and Multiple Treatments Hongmei Jiang Rebecca
More informationControl of Directional Errors in Fixed Sequence Multiple Testing
Control of Directional Errors in Fixed Sequence Multiple Testing Anjana Grandhi Department of Mathematical Sciences New Jersey Institute of Technology Newark, NJ 07102-1982 Wenge Guo Department of Mathematical
More informationResearch Article Sample Size Calculation for Controlling False Discovery Proportion
Probability and Statistics Volume 2012, Article ID 817948, 13 pages doi:10.1155/2012/817948 Research Article Sample Size Calculation for Controlling False Discovery Proportion Shulian Shang, 1 Qianhe Zhou,
More informationFamilywise Error Rate Controlling Procedures for Discrete Data
Familywise Error Rate Controlling Procedures for Discrete Data arxiv:1711.08147v1 [stat.me] 22 Nov 2017 Yalin Zhu Center for Mathematical Sciences, Merck & Co., Inc., West Point, PA, U.S.A. Wenge Guo Department
More informationPOSITIVE FALSE DISCOVERY PROPORTIONS: INTRINSIC BOUNDS AND ADAPTIVE CONTROL
Statistica Sinica 18(2008, 837-860 POSITIVE FALSE DISCOVERY PROPORTIONS: INTRINSIC BOUNDS AND ADAPTIVE CONTROL Zhiyi Chi and Zhiqiang Tan University of Connecticut and Rutgers University Abstract: A useful
More informationLecture 7 April 16, 2018
Stats 300C: Theory of Statistics Spring 2018 Lecture 7 April 16, 2018 Prof. Emmanuel Candes Scribe: Feng Ruan; Edited by: Rina Friedberg, Junjie Zhu 1 Outline Agenda: 1. False Discovery Rate (FDR) 2. Properties
More informationDepartment of Statistics University of Central Florida. Technical Report TR APR2007 Revised 25NOV2007
Department of Statistics University of Central Florida Technical Report TR-2007-01 25APR2007 Revised 25NOV2007 Controlling the Number of False Positives Using the Benjamini- Hochberg FDR Procedure Paul
More informationHochberg Multiple Test Procedure Under Negative Dependence
Hochberg Multiple Test Procedure Under Negative Dependence Ajit C. Tamhane Northwestern University Joint work with Jiangtao Gou (Northwestern University) IMPACT Symposium, Cary (NC), November 20, 2014
More informationON TWO RESULTS IN MULTIPLE TESTING
ON TWO RESULTS IN MULTIPLE TESTING By Sanat K. Sarkar 1, Pranab K. Sen and Helmut Finner Temple University, University of North Carolina at Chapel Hill and University of Duesseldorf Two known results in
More informationControl of Generalized Error Rates in Multiple Testing
Institute for Empirical Research in Economics University of Zurich Working Paper Series ISSN 1424-0459 Working Paper No. 245 Control of Generalized Error Rates in Multiple Testing Joseph P. Romano and
More informationControlling Bayes Directional False Discovery Rate in Random Effects Model 1
Controlling Bayes Directional False Discovery Rate in Random Effects Model 1 Sanat K. Sarkar a, Tianhui Zhou b a Temple University, Philadelphia, PA 19122, USA b Wyeth Pharmaceuticals, Collegeville, PA
More informationIMPROVING TWO RESULTS IN MULTIPLE TESTING
IMPROVING TWO RESULTS IN MULTIPLE TESTING By Sanat K. Sarkar 1, Pranab K. Sen and Helmut Finner Temple University, University of North Carolina at Chapel Hill and University of Duesseldorf October 11,
More informationA NEW APPROACH FOR LARGE SCALE MULTIPLE TESTING WITH APPLICATION TO FDR CONTROL FOR GRAPHICALLY STRUCTURED HYPOTHESES
A NEW APPROACH FOR LARGE SCALE MULTIPLE TESTING WITH APPLICATION TO FDR CONTROL FOR GRAPHICALLY STRUCTURED HYPOTHESES By Wenge Guo Gavin Lynch Joseph P. Romano Technical Report No. 2018-06 September 2018
More informationNew Procedures for False Discovery Control
New Procedures for False Discovery Control Christopher R. Genovese Department of Statistics Carnegie Mellon University http://www.stat.cmu.edu/ ~ genovese/ Elisha Merriam Department of Neuroscience University
More informationA GENERAL DECISION THEORETIC FORMULATION OF PROCEDURES CONTROLLING FDR AND FNR FROM A BAYESIAN PERSPECTIVE
A GENERAL DECISION THEORETIC FORMULATION OF PROCEDURES CONTROLLING FDR AND FNR FROM A BAYESIAN PERSPECTIVE Sanat K. Sarkar 1, Tianhui Zhou and Debashis Ghosh Temple University, Wyeth Pharmaceuticals and
More informationControlling the False Discovery Rate in Two-Stage. Combination Tests for Multiple Endpoints
Controlling the False Discovery Rate in Two-Stage Combination Tests for Multiple ndpoints Sanat K. Sarkar, Jingjing Chen and Wenge Guo May 29, 2011 Sanat K. Sarkar is Professor and Senior Research Fellow,
More informationFalse discovery control for multiple tests of association under general dependence
False discovery control for multiple tests of association under general dependence Nicolai Meinshausen Seminar für Statistik ETH Zürich December 2, 2004 Abstract We propose a confidence envelope for false
More informationThe miss rate for the analysis of gene expression data
Biostatistics (2005), 6, 1,pp. 111 117 doi: 10.1093/biostatistics/kxh021 The miss rate for the analysis of gene expression data JONATHAN TAYLOR Department of Statistics, Stanford University, Stanford,
More informationThe Pennsylvania State University The Graduate School Eberly College of Science GENERALIZED STEPWISE PROCEDURES FOR
The Pennsylvania State University The Graduate School Eberly College of Science GENERALIZED STEPWISE PROCEDURES FOR CONTROLLING THE FALSE DISCOVERY RATE A Dissertation in Statistics by Scott Roths c 2011
More informationStatistical Applications in Genetics and Molecular Biology
Statistical Applications in Genetics and Molecular Biology Volume 3, Issue 1 2004 Article 13 Multiple Testing. Part I. Single-Step Procedures for Control of General Type I Error Rates Sandrine Dudoit Mark
More informationMultiple Testing. Hoang Tran. Department of Statistics, Florida State University
Multiple Testing Hoang Tran Department of Statistics, Florida State University Large-Scale Testing Examples: Microarray data: testing differences in gene expression between two traits/conditions Microbiome
More informationStatistical Applications in Genetics and Molecular Biology
Statistical Applications in Genetics and Molecular Biology Volume 3, Issue 1 2004 Article 14 Multiple Testing. Part II. Step-Down Procedures for Control of the Family-Wise Error Rate Mark J. van der Laan
More informationEstimation of a Two-component Mixture Model
Estimation of a Two-component Mixture Model Bodhisattva Sen 1,2 University of Cambridge, Cambridge, UK Columbia University, New York, USA Indian Statistical Institute, Kolkata, India 6 August, 2012 1 Joint
More informationStatistica Sinica Preprint No: SS R1
Statistica Sinica Preprint No: SS-2017-0072.R1 Title Control of Directional Errors in Fixed Sequence Multiple Testing Manuscript ID SS-2017-0072.R1 URL http://www.stat.sinica.edu.tw/statistica/ DOI 10.5705/ss.202017.0072
More informationResampling-Based Control of the FDR
Resampling-Based Control of the FDR Joseph P. Romano 1 Azeem S. Shaikh 2 and Michael Wolf 3 1 Departments of Economics and Statistics Stanford University 2 Department of Economics University of Chicago
More informationA Unified Computational Framework to Compare Direct and Sequential False Discovery Rate Algorithms for Exploratory DNA Microarray Studies
Journal of Data Science 3(2005), 331-352 A Unified Computational Framework to Compare Direct and Sequential False Discovery Rate Algorithms for Exploratory DNA Microarray Studies Danh V. Nguyen University
More informationOn Generalized Fixed Sequence Procedures for Controlling the FWER
Research Article Received XXXX (www.interscience.wiley.com) DOI: 10.1002/sim.0000 On Generalized Fixed Sequence Procedures for Controlling the FWER Zhiying Qiu, a Wenge Guo b and Gavin Lynch c Testing
More informationarxiv:math/ v1 [math.st] 29 Dec 2006 Jianqing Fan Peter Hall Qiwei Yao
TO HOW MANY SIMULTANEOUS HYPOTHESIS TESTS CAN NORMAL, STUDENT S t OR BOOTSTRAP CALIBRATION BE APPLIED? arxiv:math/0701003v1 [math.st] 29 Dec 2006 Jianqing Fan Peter Hall Qiwei Yao ABSTRACT. In the analysis
More informationFamilywise Error Rate Control via Knockoffs
Familywise Error Rate Control via Knockoffs Abstract We present a novel method for controlling the k-familywise error rate (k-fwer) in the linear regression setting using the knockoffs framework first
More informationControlling the False Discovery Rate: Understanding and Extending the Benjamini-Hochberg Method
Controlling the False Discovery Rate: Understanding and Extending the Benjamini-Hochberg Method Christopher R. Genovese Department of Statistics Carnegie Mellon University joint work with Larry Wasserman
More informationExceedance Control of the False Discovery Proportion Christopher Genovese 1 and Larry Wasserman 2 Carnegie Mellon University July 10, 2004
Exceedance Control of the False Discovery Proportion Christopher Genovese 1 and Larry Wasserman 2 Carnegie Mellon University July 10, 2004 Multiple testing methods to control the False Discovery Rate (FDR),
More informationSummary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing
Summary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing Statistics Journal Club, 36-825 Beau Dabbs and Philipp Burckhardt 9-19-2014 1 Paper
More informationEMPIRICAL BAYES METHODS FOR ESTIMATION AND CONFIDENCE INTERVALS IN HIGH-DIMENSIONAL PROBLEMS
Statistica Sinica 19 (2009), 125-143 EMPIRICAL BAYES METHODS FOR ESTIMATION AND CONFIDENCE INTERVALS IN HIGH-DIMENSIONAL PROBLEMS Debashis Ghosh Penn State University Abstract: There is much recent interest
More informationMIXTURE MODELS FOR DETECTING DIFFERENTIALLY EXPRESSED GENES IN MICROARRAYS
International Journal of Neural Systems, Vol. 16, No. 5 (2006) 353 362 c World Scientific Publishing Company MIXTURE MOLS FOR TECTING DIFFERENTIALLY EXPRESSED GENES IN MICROARRAYS LIAT BEN-TOVIM JONES
More informationComparison of the Empirical Bayes and the Significance Analysis of Microarrays
Comparison of the Empirical Bayes and the Significance Analysis of Microarrays Holger Schwender, Andreas Krause, and Katja Ickstadt Abstract Microarrays enable to measure the expression levels of tens
More informationSOME STEP-DOWN PROCEDURES CONTROLLING THE FALSE DISCOVERY RATE UNDER DEPENDENCE
Statistica Sinica 18(2008), 881-904 SOME STEP-DOWN PROCEDURES CONTROLLING THE FALSE DISCOVERY RATE UNDER DEPENDENCE Yongchao Ge 1, Stuart C. Sealfon 1 and Terence P. Speed 2,3 1 Mount Sinai School of Medicine,
More informationMultiple testing: Intro & FWER 1
Multiple testing: Intro & FWER 1 Mark van de Wiel mark.vdwiel@vumc.nl Dep of Epidemiology & Biostatistics,VUmc, Amsterdam Dep of Mathematics, VU 1 Some slides courtesy of Jelle Goeman 1 Practical notes
More informationFalse discovery rate control for non-positively regression dependent test statistics
Journal of Statistical Planning and Inference ( ) www.elsevier.com/locate/jspi False discovery rate control for non-positively regression dependent test statistics Daniel Yekutieli Department of Statistics
More informationLooking at the Other Side of Bonferroni
Department of Biostatistics University of Washington 24 May 2012 Multiple Testing: Control the Type I Error Rate When analyzing genetic data, one will commonly perform over 1 million (and growing) hypothesis
More informationA moment-based method for estimating the proportion of true null hypotheses and its application to microarray gene expression data
Biostatistics (2007), 8, 4, pp. 744 755 doi:10.1093/biostatistics/kxm002 Advance Access publication on January 22, 2007 A moment-based method for estimating the proportion of true null hypotheses and its
More informationStatistical testing. Samantha Kleinberg. October 20, 2009
October 20, 2009 Intro to significance testing Significance testing and bioinformatics Gene expression: Frequently have microarray data for some group of subjects with/without the disease. Want to find
More informationNon-specific filtering and control of false positives
Non-specific filtering and control of false positives Richard Bourgon 16 June 2009 bourgon@ebi.ac.uk EBI is an outstation of the European Molecular Biology Laboratory Outline Multiple testing I: overview
More informationThe International Journal of Biostatistics
The International Journal of Biostatistics Volume 7, Issue 1 2011 Article 12 Consonance and the Closure Method in Multiple Testing Joseph P. Romano, Stanford University Azeem Shaikh, University of Chicago
More informationWeighted Adaptive Multiple Decision Functions for False Discovery Rate Control
Weighted Adaptive Multiple Decision Functions for False Discovery Rate Control Joshua D. Habiger Oklahoma State University jhabige@okstate.edu Nov. 8, 2013 Outline 1 : Motivation and FDR Research Areas
More informationFDR and ROC: Similarities, Assumptions, and Decisions
EDITORIALS 8 FDR and ROC: Similarities, Assumptions, and Decisions. Why FDR and ROC? It is a privilege to have been asked to introduce this collection of papers appearing in Statistica Sinica. The papers
More informationStatistical Applications in Genetics and Molecular Biology
Statistical Applications in Genetics and Molecular Biology Volume 6, Issue 1 2007 Article 28 A Comparison of Methods to Control Type I Errors in Microarray Studies Jinsong Chen Mark J. van der Laan Martyn
More informationControl of the False Discovery Rate under Dependence using the Bootstrap and Subsampling
Institute for Empirical Research in Economics University of Zurich Working Paper Series ISSN 1424-0459 Working Paper No. 337 Control of the False Discovery Rate under Dependence using the Bootstrap and
More informationHigh-Throughput Sequencing Course. Introduction. Introduction. Multiple Testing. Biostatistics and Bioinformatics. Summer 2018
High-Throughput Sequencing Course Multiple Testing Biostatistics and Bioinformatics Summer 2018 Introduction You have previously considered the significance of a single gene Introduction You have previously
More informationStat 206: Estimation and testing for a mean vector,
Stat 206: Estimation and testing for a mean vector, Part II James Johndrow 2016-12-03 Comparing components of the mean vector In the last part, we talked about testing the hypothesis H 0 : µ 1 = µ 2 where
More informationDoing Cosmology with Balls and Envelopes
Doing Cosmology with Balls and Envelopes Christopher R. Genovese Department of Statistics Carnegie Mellon University http://www.stat.cmu.edu/ ~ genovese/ Larry Wasserman Department of Statistics Carnegie
More informationarxiv: v3 [stat.me] 9 Nov 2015
Familywise Error Rate Control via Knockoffs Lucas Janson Weijie Su Department of Statistics, Stanford University, Stanford, CA 94305, USA arxiv:1505.06549v3 [stat.me] 9 Nov 2015 May 2015 Abstract We present
More informationTO HOW MANY SIMULTANEOUS HYPOTHESIS TESTS CAN NORMAL, STUDENT S t OR BOOTSTRAP CALIBRATION BE APPLIED? Jianqing Fan Peter Hall Qiwei Yao
TO HOW MANY SIMULTANEOUS HYPOTHESIS TESTS CAN NORMAL, STUDENT S t OR BOOTSTRAP CALIBRATION BE APPLIED? Jianqing Fan Peter Hall Qiwei Yao ABSTRACT. In the analysis of microarray data, and in some other
More informationSTAT 263/363: Experimental Design Winter 2016/17. Lecture 1 January 9. Why perform Design of Experiments (DOE)? There are at least two reasons:
STAT 263/363: Experimental Design Winter 206/7 Lecture January 9 Lecturer: Minyong Lee Scribe: Zachary del Rosario. Design of Experiments Why perform Design of Experiments (DOE)? There are at least two
More informationHigh-throughput Testing
High-throughput Testing Noah Simon and Richard Simon July 2016 1 / 29 Testing vs Prediction On each of n patients measure y i - single binary outcome (eg. progression after a year, PCR) x i - p-vector
More informationTesting Statistical Hypotheses
E.L. Lehmann Joseph P. Romano Testing Statistical Hypotheses Third Edition 4y Springer Preface vii I Small-Sample Theory 1 1 The General Decision Problem 3 1.1 Statistical Inference and Statistical Decisions
More informationA Mixture Gatekeeping Procedure Based on the Hommel Test for Clinical Trial Applications
A Mixture Gatekeeping Procedure Based on the Hommel Test for Clinical Trial Applications Thomas Brechenmacher (Dainippon Sumitomo Pharma Co., Ltd.) Jane Xu (Sunovion Pharmaceuticals Inc.) Alex Dmitrienko
More informationMULTISTAGE AND MIXTURE PARALLEL GATEKEEPING PROCEDURES IN CLINICAL TRIALS
Journal of Biopharmaceutical Statistics, 21: 726 747, 2011 Copyright Taylor & Francis Group, LLC ISSN: 1054-3406 print/1520-5711 online DOI: 10.1080/10543406.2011.551333 MULTISTAGE AND MIXTURE PARALLEL
More informationPositive false discovery proportions: intrinsic bounds and adaptive control
Positive false discovery proportions: intrinsic bounds and adaptive control Zhiyi Chi and Zhiqiang Tan University of Connecticut and The Johns Hopkins University Running title: Bounds and control of pfdr
More informationSupplement to Post hoc inference via joint family-wise error rate control
Supplement to Post hoc inference via joint family-wise error rate control Gilles Blanchard Universität Potsdam, Institut für Mathemati Karl-Liebnecht-Straße 24-25 14476 Potsdam, Germany e-mail: gilles.blanchard@math.uni-potsdam.de
More informationImproving the Performance of the FDR Procedure Using an Estimator for the Number of True Null Hypotheses
Improving the Performance of the FDR Procedure Using an Estimator for the Number of True Null Hypotheses Amit Zeisel, Or Zuk, Eytan Domany W.I.S. June 5, 29 Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving
More informationAn Alpha-Exhaustive Multiple Testing Procedure
Current Research in Biostatistics Original Research Paper An Alpha-Exhaustive Multiple Testing Procedure, Mar Chang, Xuan Deng and John Balser Boston University, Boston MA, USA Veristat, Southborough MA,
More informationUniversity of California, Berkeley
University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2004 Paper 164 Multiple Testing Procedures: R multtest Package and Applications to Genomics Katherine
More informationMultiple Testing. Anjana Grandhi. BARDS, Merck Research Laboratories. Rahway, NJ Wenge Guo. Department of Mathematical Sciences
Control of Directional Errors in Fixed Sequence arxiv:1602.02345v2 [math.st] 18 Mar 2017 Multiple Testing Anjana Grandhi BARDS, Merck Research Laboratories Rahway, NJ 07065 Wenge Guo Department of Mathematical
More informationExact and Approximate Stepdown Methods For Multiple Hypothesis Testing
Exact and Approximate Stepdown Methods For Multiple Hypothesis Testing Joseph P. Romano Department of Statistics Stanford University Michael Wolf Department of Economics and Business Universitat Pompeu
More informationHunting for significance with multiple testing
Hunting for significance with multiple testing Etienne Roquain 1 1 Laboratory LPMA, Université Pierre et Marie Curie (Paris 6), France Séminaire MODAL X, 19 mai 216 Etienne Roquain Hunting for significance
More informationarxiv: v3 [math.st] 15 Jul 2018
A New Step-down Procedure for Simultaneous Hypothesis Testing Under Dependence arxiv:1503.08923v3 [math.st] 15 Jul 2018 Contents Prasenjit Ghosh 1 and Arijit Chakrabarti 2 1 Department of Statistics, Presidency
More informationA Large-Sample Approach to Controlling the False Discovery Rate
A Large-Sample Approach to Controlling the False Discovery Rate Christopher R. Genovese Department of Statistics Carnegie Mellon University Larry Wasserman Department of Statistics Carnegie Mellon University
More informationTwo simple sufficient conditions for FDR control
Electronic Journal of Statistics Vol. 2 (2008) 963 992 ISSN: 1935-7524 DOI: 10.1214/08-EJS180 Two simple sufficient conditions for FDR control Gilles Blanchard, Fraunhofer-Institut FIRST Kekuléstrasse
More informationClosure properties of classes of multiple testing procedures
AStA Adv Stat Anal (2018) 102:167 178 https://doi.org/10.1007/s10182-017-0297-0 ORIGINAL PAPER Closure properties of classes of multiple testing procedures Georg Hahn 1 Received: 28 June 2016 / Accepted:
More informationEffects of dependence in high-dimensional multiple testing problems. Kyung In Kim and Mark van de Wiel
Effects of dependence in high-dimensional multiple testing problems Kyung In Kim and Mark van de Wiel Department of Mathematics, Vrije Universiteit Amsterdam. Contents 1. High-dimensional multiple testing
More informationMore powerful control of the false discovery rate under dependence
Statistical Methods & Applications (2006) 15: 43 73 DOI 10.1007/s10260-006-0002-z ORIGINAL ARTICLE Alessio Farcomeni More powerful control of the false discovery rate under dependence Accepted: 10 November
More informationarxiv: v1 [math.st] 14 Nov 2012
The Annals of Statistics 2010, Vol. 38, No. 6, 3782 3810 DOI: 10.1214/10-AOS829 c Institute of Mathematical Statistics, 2010 arxiv:1211.3313v1 [math.st] 14 Nov 2012 THE SEQUENTIAL REJECTION PRINCIPLE OF
More informationCHOOSING THE LESSER EVIL: TRADE-OFF BETWEEN FALSE DISCOVERY RATE AND NON-DISCOVERY RATE
Statistica Sinica 18(2008), 861-879 CHOOSING THE LESSER EVIL: TRADE-OFF BETWEEN FALSE DISCOVERY RATE AND NON-DISCOVERY RATE Radu V. Craiu and Lei Sun University of Toronto Abstract: The problem of multiple
More informationConsonance and the Closure Method in Multiple Testing. Institute for Empirical Research in Economics University of Zurich
Institute for Empirical Research in Economics University of Zurich Working Paper Series ISSN 1424-0459 Working Paper No. 446 Consonance and the Closure Method in Multiple Testing Joseph P. Romano, Azeem
More informationMultiple Testing Procedures under Dependence, with Applications
Multiple Testing Procedures under Dependence, with Applications Alessio Farcomeni November 2004 ii Dottorato di ricerca in Statistica Metodologica Dipartimento di Statistica, Probabilità e Statistiche
More information