Modified Simes Critical Values Under Positive Dependence


Gengqian Cai, Sanat K. Sarkar

Clinical Pharmacology Statistics & Programming, BDS, GlaxoSmithKline
Statistics Department, Temple University, Philadelphia

The research is supported by NSF Grant DMS. Email address: sanat@temple.edu (Sanat K. Sarkar). Preprint submitted to Elsevier Science, 12 April 2005.

Abstract

A modification of the critical values of Simes' test is suggested in this article for the case where the underlying test statistics are multivariate normal with a common nonnegative correlation, yielding a more powerful test than the original Simes test. A step-up multiple testing procedure with these modified critical values, which is shown to control the false discovery rate (FDR), is presented as a modification of the traditional Benjamini-Hochberg (BH) procedure. Simulations were carried out to compare this modified BH procedure with the BH and other modified BH procedures in terms of false non-discovery rate (FNR), 1 − FDR − FNR and average power. The present modified BH procedure is observed to perform well compared to the others when the test statistics are highly correlated and most of the hypotheses are true.

Key words: Multivariate normal with non-negative common correlation; Adjusted Simes test; False Discovery Rate; Modified Benjamini-Hochberg procedures

1 Introduction

Simes' (1986) test has recently received considerable attention from researchers and practitioners in multiple comparisons. This is because it performs much better than the Bonferroni and Sidak procedures and, more importantly, because its critical values are the same as those of the Benjamini-Hochberg (1995) step-up procedure that controls the false discovery rate (FDR).

Suppose that we have a set of test statistics $X_1, \ldots, X_n$ for testing null hypotheses $H_1, \ldots, H_n$, respectively, and that $P_1, \ldots, P_n$ are the corresponding p-values. Simes (1986) proposed his test for the overall null hypothesis $H = \bigcap_{i=1}^{n} H_i$, as a modification of the Bonferroni test, based on the ordered p-values $P_{(1)} \le \cdots \le P_{(n)}$ as follows:

$$\text{Reject } H \text{ at level } \alpha \text{ if } P_{(i)} \le i\alpha/n \text{ for at least one } i. \tag{1}$$

Benjamini and Hochberg's (1995) FDR-controlling procedure (BH procedure) is a step-up test with these same critical values; that is, it rejects the hypotheses corresponding to $P_{(1)}, \ldots, P_{(k)}$ and accepts the rest, where $k = \max\{j : P_{(j)} \le j\alpha/n\}$.

Simes (1986) proved that the type I error rate of his method, i.e., the probability of (1) under the null hypotheses, is exactly $\alpha$ when the $X_i$'s are iid, and conjectured that it is conservative for a variety of commonly used multivariate distributions exhibiting positive dependence. Sarkar and Chang (1997) and Sarkar (1998) noticed that many of the commonly used positively dependent multivariate distributions arising in multiple comparisons are characterized by the MTP$_2$ (multivariate totally positive of order two) property of Karlin and Rinott (1980), and proved Simes' conjecture for these distributions.

Regarding the BH procedure, Benjamini and Yekutieli (2001) and Sarkar (2002) proved that the FDR of this procedure is less than or equal to $n_0\alpha/n$, where $n_0$ is the (unknown) number of true null hypotheses, for multivariate distributions that are positively dependent in the sense of being PRDS (positive regression dependent on subset), a slightly weaker condition than MTP$_2$, and that it is exactly equal to $n_0\alpha/n$ under independence. Readers are referred to Benjamini and Yekutieli (2001) and Sarkar (2002) for details of PRDS and to the Appendix for a brief review of the MTP$_2$ property. Notice that the FDR-controlling property of the BH procedure, applied with $n_0 = n$, implies that Simes' test controls the type I error rate. Therefore, Simes' test is actually conservative even for the larger class of PRDS distributions. The multivariate normal distribution with positive correlations, which commonly arises in multiple testing situations, is an example of such a distribution.
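As an illustration of rule (1) and of the BH step-up rule described above, the following Python sketch applies both to a vector of p-values. The function names `simes_test` and `bh_step_up` and the example p-values are introduced here for illustration only; they do not come from the paper.

```python
import numpy as np

def simes_test(pvalues, alpha=0.05):
    """Return True if Simes' test rejects the overall null H at level alpha,
    i.e. if P_(i) <= i * alpha / n for at least one i (rule (1))."""
    p = np.sort(np.asarray(pvalues, dtype=float))
    n = p.size
    thresholds = alpha * np.arange(1, n + 1) / n
    return bool(np.any(p <= thresholds))

def bh_step_up(pvalues, alpha=0.05):
    """Return a boolean array marking the hypotheses rejected by the BH
    step-up procedure: with k = max{j : P_(j) <= j * alpha / n}, the
    hypotheses with the k smallest p-values are rejected."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)                       # indices of sorted p-values
    below = p[order] <= alpha * np.arange(1, n + 1) / n
    reject = np.zeros(n, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])        # 0-based position of the largest j
        reject[order[:k + 1]] = True
    return reject

# Hypothetical p-values, for illustration only.
pvals = [0.012, 0.2, 0.031, 0.6, 0.004]
print(simes_test(pvals), bh_step_up(pvals))
```

Because the BH rule uses the largest qualifying index, a p-value above its own threshold can still be rejected when a larger ordered p-value falls below its threshold; this is what distinguishes the step-up rule from a step-down rule with the same critical values.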
Test statistics having a multivariate normal distribution with a known nonnegative common correlation are often encountered in multiple comparisons; for instance, in many-to-one comparisons using a balanced one-way layout (Hochberg and Tamhane, 1987; Hsu, 1996). This paper is motivated by the fact that, since Simes' test is conservative for such a distribution, its power can potentially be improved by modifying its critical values to incorporate the correlation. These modified Simes critical values can also potentially improve the BH procedure.

Of course, the idea of adjusting the critical values of Simes' test by incorporating the dependence structure is not new; it was implemented before by Kwong et al. (2002) in an attempt to modify the BH procedure. There is, however, a methodological difference in our approach to adjusting the critical values. Rather than modifying only one critical value, as done in Kwong et al. (2002), we modify all but one critical value. This results in a modified Simes procedure that becomes more powerful, relative to the unmodified one, as the correlation increases. This is in sharp contrast with Kwong et al.'s version of the modified Simes procedure, whose power behaves inconsistently, particularly for large $n$: it rises quite sharply for small correlation but weakens as the correlation increases.

We propose a modification of the BH procedure based on the adjusted Simes critical values derived in this article, and we prove that this modified BH procedure controls the FDR. Kwong et al.'s version of the modified BH procedure also controls the FDR but, as discussed above, it provides minimal improvement over the unmodified BH procedure for large correlation. Yekutieli and Benjamini (1999) and Troendle (2000) also offered ideas for improving the BH procedure by incorporating the dependence structure. The Yekutieli-Benjamini method, however, relies on a resampling technique, and its FDR-controlling property can be verified only empirically. Troendle proposed powerful step-up and step-down procedures based on normal theory, but they rest on an assumption, valid only asymptotically, that the ordering of the test statistics corresponds to that of the means.

How good is the performance of our version of the modified BH procedure, referred to as the extended BH (EBH) procedure, compared to the original BH procedure and Kwong et al.'s (2002) version of the modified BH procedure? We investigate this in terms of three different concepts of power. The first is the average power (Benjamini and Liu, 1999; Kwong et al., 2002; Storey, 2002). The second is based on the concept of false non-discovery rate (FNR) (Genovese and Wasserman, 2002; Sarkar, 2004, 2005; Storey, 2003). The third is 1 − FDR − FNR, proposed in Sarkar (2004), which reflects the strength of unbiasedness of an FDR procedure.

The paper is organized as follows. In Section 2, we explain our method of modifying Simes' critical values using Sarkar's (1998) formula.
The fact that the BH procedure based on these modified Simes critical values controls the FDR is established in Section 3. Section 4 compares the EBH procedure to the BH procedure and Kwong et al.'s procedure. Section 5 presents simulation results for cases where the assumption of multivariate normality with a common correlation is not met. The paper concludes with some remarks. Proofs of some supporting lemmas are deferred to the Appendix.

2 Modified Simes Critical Values

As mentioned in the introduction, we assume that the set of test statistics $X = (X_1, \ldots, X_n)$ is distributed as multivariate normal with mean vector $\Theta = (\theta_1, \ldots, \theta_n)$ and covariance matrix $\Sigma = (1-\rho)I + \rho J$, for some known $\rho \ge 0$. We are interested in testing $H_i : \theta_i = 0$ versus $K_i : \theta_i > 0$, for $i = 1, \ldots, n$. Simes' test at level $\alpha$ for testing the overall null hypothesis $H : \bigcap_{i=1}^{n} H_i$ against the alternative $K : \bigcup_{i=1}^{n} K_i$, in terms of the $X_i$'s, rejects $H$ if

$$X_{(i)} \ge \Phi^{-1}\!\left(1 - \frac{n-i+1}{n}\,\alpha\right) \text{ for at least one } i,$$

where $X_{(1)} \le \cdots \le X_{(n)}$ are the ordered components of $X$, and $\Phi(z)$ is the cdf of $N(0, 1)$ at $z$. This test, under the present set-up, is conservative, i.e., its type I error rate is less than or equal to $\alpha$, and hence it can be improved if we replace the critical values by $c_{1,\alpha}(\rho) \le \cdots \le c_{n,\alpha}(\rho)$ satisfying

$$1 - P_H\{X_{(i)} < c_{i,\alpha}(\rho),\ i = 1, \ldots, n\} = \alpha. \tag{2}$$

For the time being, we simply write $c_i$ instead of $c_{i,\alpha}(\rho)$. From Lemma 2.1 in Sarkar (1998), the left-hand side of (2) can be expressed as

$$1 - P_H\{X_{(i)} < c_i,\ i = 1, \ldots, n\} = \frac{1}{n}\sum_{i=1}^{n} P_H\{X_i \ge c_1\} + \sum_{i=1}^{n}\sum_{j=1}^{n-1} E_H\!\left[ P_H\{X^{(-i)}_{(1)} < c_1, \ldots, X^{(-i)}_{(j)} < c_j \mid X_i\} \left\{ \frac{I(X_i \ge c_{j+1})}{n-j} - \frac{I(X_i \ge c_j)}{n-j+1} \right\} \right], \tag{3}$$

where $X^{(-i)}_{(1)} \le \cdots \le X^{(-i)}_{(n-1)}$ denote the ordered components of the $(n-1)$-dimensional random vector $X^{(-i)}$ obtained by eliminating $X_i$ from $X$. Since, under $H$, $X$ is exchangeable, we have

$$1 - P_H\{X_{(i)} < c_i,\ i = 1, \ldots, n\} = [1 - \Phi(c_1)] + n\sum_{j=1}^{n-1} K_j, \tag{4}$$

where

$$K_j = E_H\!\left[ \Psi_j(X_1) \left\{ \frac{I(X_1 \ge c_{j+1})}{n-j} - \frac{I(X_1 \ge c_j)}{n-j+1} \right\} \right] = \int \Psi_j(x)\,\phi(x) \left\{ \frac{I(x \ge c_{j+1})}{n-j} - \frac{I(x \ge c_j)}{n-j+1} \right\} dx, \tag{5}$$

with

$$\Psi_j(x) = P_H\{X^{(-1)}_{(1)} < c_1, \ldots, X^{(-1)}_{(j)} < c_j \mid X_1 = x\} \tag{6}$$

and $\phi(\cdot)$ being the density of the standard normal distribution.

Note that $K_j$ depends only on $c_1, \ldots, c_{j+1}$. Given $c_1, \ldots, c_j$, a unique $c_{j+1}$ ($> c_j$) can be found from the equation $K_j = 0$. Therefore, from (4), we can derive the following procedure to calculate the unique desired critical values $c_{1,\alpha}(\rho) \le \cdots \le c_{n,\alpha}(\rho)$:

(1) Let $1 - \Phi(c_{1,\alpha}(\rho)) = \alpha$, that is, $c_{1,\alpha}(\rho) = \Phi^{-1}(1-\alpha)$ for any $\rho \ge 0$.

(2) Set $K_j = 0$ for $j = 1, 2, \ldots, n-1$, and solve these $n-1$ equations sequentially for $c_{i,\alpha}(\rho)$, $i = 2, \ldots, n$.

Different values of $\rho$ give different critical values, except for $c_1$. Notice that when $\rho = 0$, this method yields Simes' critical values.

Theorem 1 Simes' test with modified critical values $c_{i,\alpha}(\rho)$, $i = 1, \ldots, n$, controls the type I error rate exactly at $\alpha$ when the test statistics follow a multivariate normal distribution with a common nonnegative correlation $\rho$.

Proof. The result follows from (4) and the way the $c_{i,\alpha}(\rho)$'s are obtained.

In the process of computing the modified critical values, we need to evaluate (6). For this, we employ the same algorithm as in Dunnett and Tamhane (1992), using an R program. Table 1 gives the modified critical values in terms of p-value for $n = 2, 3, 4, 5$ corresponding to different values of $\rho$ at level $\alpha = 0.05$. From Table 1, we note that $c_{i,\alpha}(\rho)$ is a decreasing function of $\rho$, providing empirical evidence that Simes' test with these critical values becomes more powerful as $\rho$ increases. A theoretical proof of this monotonicity property appears to be difficult. Nevertheless, we prove in the following that $c_{i,\alpha}(\rho) < c_{i,\alpha}(0)$ for any $i > 1$ and $\rho > 0$, which shows that the modified Simes test is more powerful than the original Simes test.

Theorem 2 For any $\rho > 0$ and $i = 2, \ldots, n$, $c_{i,\alpha}(\rho) < c_{i,\alpha}(0)$.

Proof. We prove this theorem by induction. For this, first note that $c_{i,\alpha}(0)$, $i = 2, \ldots, n$, are obtained from the following equations:

$$\int \phi(x) \left\{ \frac{I(x \ge c_{i+1,\alpha}(0))}{n-i} - \frac{I(x \ge c_{i,\alpha}(0))}{n-i+1} \right\} dx = 0, \quad i = 1, \ldots, n-1.$$

Let us write $\Psi_j(x)$ as $\Psi^{\rho}_j(x)$. Since the distribution of $X$ has the PRDS property, $\Psi^{\rho}_j(x)$ is a decreasing function of $x$. Also, for any $c_j < c_{j+1}$,

$$\frac{I(x \ge c_{j+1})}{n-j} - \frac{I(x \ge c_j)}{n-j+1} \ \ge \text{ or } \le 0, \quad \text{according as } x \ge \text{ or } < c_{j+1}.$$
Therefore, we have

$$\int \Psi^{\rho}_j(x)\,\phi(x) \left\{ \frac{I(x \ge c_{j+1,\alpha}(0))}{n-j} - \frac{I(x \ge c_{j,\alpha}(0))}{n-j+1} \right\} dx \le \Psi^{\rho}_j(c_{j+1,\alpha}(0)) \int \phi(x) \left\{ \frac{I(x \ge c_{j+1,\alpha}(0))}{n-j} - \frac{I(x \ge c_{j,\alpha}(0))}{n-j+1} \right\} dx = 0 = \int \Psi^{\rho}_j(x)\,\phi(x) \left\{ \frac{I(x \ge c_{j+1,\alpha}(\rho))}{n-j} - \frac{I(x \ge c_{j,\alpha}(\rho))}{n-j+1} \right\} dx. \tag{7}$$

Now, assume that the result is true for $i = j$; i.e., $c_{j,\alpha}(\rho) < c_{j,\alpha}(0)$. Using this in (7), we see that

$$\int_{c_{j+1,\alpha}(0)}^{\infty} \Psi^{\rho}_j(x)\,\phi(x)\,dx < \int_{c_{j+1,\alpha}(\rho)}^{\infty} \Psi^{\rho}_j(x)\,\phi(x)\,dx, \tag{8}$$

which implies that $c_{j+1,\alpha}(\rho) < c_{j+1,\alpha}(0)$, i.e., the result must be true for $i = j + 1$ also. This proves the theorem.

Table 1. Modified Simes critical values $p_i(\rho) = 1 - \Phi[c_i(\rho)]$ with $\alpha = 0.05$, indexed by $n$ and $i$. (Table entries not recoverable.)
Note: $p_1(\rho) = 1 - \Phi[c_1(\rho)] = \alpha = 0.05$.

3 Extended BH Procedure

The BH procedure, in terms of $X$, is a step-up procedure that rejects the hypotheses corresponding to $X_{(i)}$ for all $i \ge k$, where $k = \min\{i : X_{(i)} \ge \Phi^{-1}(1 - \frac{n-i+1}{n}\alpha)\}$. It is designed to control the FDR, which is the expected ratio of the number of false rejections to the total number of rejections, exactly at $n_0\alpha/n$ when $\rho = 0$, where $n_0$ is the number of true null hypotheses. We modify this procedure by replacing its critical values with the modified Simes critical values obtained in the previous section, and refer to the result as the extended BH (EBH) procedure. Thus, the EBH procedure rejects the hypotheses corresponding to $X_{(i)}$ for all $i \ge k$, where $k = \min\{i : X_{(i)} \ge c_{i,\alpha}(\rho)\}$. We show in this section that the EBH procedure also controls the FDR at $\alpha$. More specifically, we prove the following theorem.

Theorem 3 The FDR of the EBH procedure is less than or equal to $n_0\alpha/n$, where $n_0$ is the number of true null hypotheses, if $X$ follows a multivariate normal distribution with a common nonnegative correlation $\rho$.

Proof. Our proof requires two supporting lemmas that are stated and proved in the Appendix. If $n_0 = 0$, the result is clearly true as FDR = 0. So let us assume that $n_0 > 0$ and, without any loss of generality, that the first $n_0$ of the null hypotheses are true and the rest are false. For notational convenience, we simply write $c_i$ instead of $c_{i,\alpha}(\rho)$. According to Sarkar (2002), the FDR of a step-up procedure with any set of critical values $c_1 \le \cdots \le c_n$ is

$$\mathrm{FDR} = \frac{1}{n}\sum_{i=1}^{n_0} P\{X_i \ge c_1\} + \sum_{i=1}^{n_0}\sum_{j=1}^{n-1} E\!\left[ P\{X^{(-i)}_{(1)} < c_1, \ldots, X^{(-i)}_{(j)} < c_j \mid X_i\} \left\{ \frac{I(X_i \ge c_{j+1})}{n-j} - \frac{I(X_i \ge c_j)}{n-j+1} \right\} \right]. \tag{9}$$

Since the distribution of $X$ is invariant under permutations within the set of the first $n_0$ $X_i$'s, the FDR in (9) simplifies to

$$\mathrm{FDR} = \frac{n_0}{n}[1 - \Phi(c_1)] + n_0 \sum_{j=1}^{n-1} K_j(\Theta_1), \tag{10}$$

where

$$K_j(\Theta_1) = E\!\left[ P\{X^{(-1)}_{(1)} < c_1, \ldots, X^{(-1)}_{(j)} < c_j \mid X_1\} \left\{ \frac{I(X_1 \ge c_{j+1})}{n-j} - \frac{I(X_1 \ge c_j)}{n-j+1} \right\} \right], \tag{11}$$

and $\Theta_1 = (\theta_1, \ldots, \theta_n)$, with $\theta_i = 0$ for $i = 1, \ldots, n_0$. Writing $X_i = \sqrt{1-\rho}\,Z_i - \sqrt{\rho}\,Z_0 + \theta_i$, $i = 1, \ldots, n$, where the $Z_i \sim N(0, 1)$, $i = 0, 1, \ldots, n$, are independent, we can simplify $K_j(\Theta_1)$ as follows:

$$K_j(\Theta_1) = E\!\left[ P\{X^{(-1)}_{(1)} \le c_1, \ldots, X^{(-1)}_{(j)} \le c_j \mid Z_0\} \left\{ \frac{P(X_1 \ge c_{j+1} \mid Z_0)}{n-j} - \frac{P(X_1 \ge c_j \mid Z_0)}{n-j+1} \right\} \right]$$

$$= E\!\left[ P\!\left\{ Y_{(1)} \le \frac{c_1 + \sqrt{\rho}Z_0}{\sqrt{1-\rho}}, \ldots, Y_{(j)} \le \frac{c_j + \sqrt{\rho}Z_0}{\sqrt{1-\rho}} \right\} \bar\Phi\!\left\{ \frac{c_{j+1} + \sqrt{\rho}Z_0}{\sqrt{1-\rho}} \right\} \left\{ \frac{1}{n-j} - \frac{\bar\Phi\!\left\{ \frac{c_j + \sqrt{\rho}Z_0}{\sqrt{1-\rho}} \right\}}{(n-j+1)\,\bar\Phi\!\left\{ \frac{c_{j+1} + \sqrt{\rho}Z_0}{\sqrt{1-\rho}} \right\}} \right\} \right], \tag{12}$$

for $j = 1, \ldots, n-1$, where $Y_{(1)} \le \cdots \le Y_{(n-1)}$ are the ordered components of an $(n-1)$-dimensional normal vector with mean vector $\left(\frac{\theta_2}{\sqrt{1-\rho}}, \ldots, \frac{\theta_n}{\sqrt{1-\rho}}\right)$ and covariance matrix $I_{n-1}$, and $\bar\Phi = 1 - \Phi$. Let

$$G_{\Theta_1}(Z_0; j) = P\!\left\{ Y_{(1)} \le \frac{c_1 + \sqrt{\rho}Z_0}{\sqrt{1-\rho}}, \ldots, Y_{(j)} \le \frac{c_j + \sqrt{\rho}Z_0}{\sqrt{1-\rho}} \right\}$$

and

$$r(Z_0; j) = \frac{\bar\Phi\!\left\{ \frac{c_j + \sqrt{\rho}Z_0}{\sqrt{1-\rho}} \right\}}{\bar\Phi\!\left\{ \frac{c_{j+1} + \sqrt{\rho}Z_0}{\sqrt{1-\rho}} \right\}},$$

so that we have

$$K_j(\Theta_1) = E\!\left[ G_{\Theta_1}(Z_0; j)\, \bar\Phi\!\left\{ \frac{c_{j+1} + \sqrt{\rho}Z_0}{\sqrt{1-\rho}} \right\} \left\{ \frac{1}{n-j} - \frac{r(Z_0; j)}{n-j+1} \right\} \right]$$

$$= E\!\left[ R(Z_0; j)\, G_0(Z_0; j)\, \bar\Phi\!\left\{ \frac{c_{j+1} + \sqrt{\rho}Z_0}{\sqrt{1-\rho}} \right\} \left\{ \frac{1}{n-j} - \frac{r(Z_0; j)}{n-j+1} \right\} \right]$$

$$= E\!\left[ G_0(Z_0; j)\, \bar\Phi\!\left\{ \frac{c_{j+1} + \sqrt{\rho}Z_0}{\sqrt{1-\rho}} \right\} \right] E\!\left[ R(Z_0^*; j) \left\{ \frac{1}{n-j} - \frac{r(Z_0^*; j)}{n-j+1} \right\} \right], \tag{13}$$

where $R(z_0; j) = G_{\Theta_1}(z_0; j)/G_0(z_0; j)$, and $Z_0^*$ has the following probability density at $z$:

$$\frac{G_0(z; j)\, \bar\Phi\!\left\{ \frac{c_{j+1} + \sqrt{\rho}z}{\sqrt{1-\rho}} \right\} \phi(z)}{E\!\left[ G_0(Z_0; j)\, \bar\Phi\!\left\{ \frac{c_{j+1} + \sqrt{\rho}Z_0}{\sqrt{1-\rho}} \right\} \right]}.$$

Since $\frac{1}{n-j} - \frac{r(Z_0^*; j)}{n-j+1}$ is a decreasing function of $Z_0^*$ (from Lemma 1) and $R(Z_0^*; j)$ is an increasing function of $Z_0^*$ (from Lemma 2), the expectation of the product of these two functions with respect to the distribution of $Z_0^*$ is less than or equal to the product of their expectations. But the expectation of the first function with respect to $Z_0^*$ is

$$\frac{K_j(0)}{E\!\left[ G_0(Z_0; j)\, \bar\Phi\!\left\{ \frac{c_{j+1} + \sqrt{\rho}Z_0}{\sqrt{1-\rho}} \right\} \right]},$$

which is zero when the $c_i$'s are the modified Simes critical values. Hence $K_j(\Theta_1) \le 0$ for each $j$, and (10) gives $\mathrm{FDR} \le \frac{n_0}{n}[1 - \Phi(c_1)] = n_0\alpha/n$. This proves the theorem.

4 Power Comparisons

In this section, we examine how much improvement over the BH procedure can be achieved by using our proposed EBH procedure. We also compare the EBH procedure with Kwong et al.'s version of the modified BH procedure, referred to as the MBH procedure.

When comparing different FDR-controlling procedures, it is important to keep in mind that power can be conceptualized in many different ways. Three particular concepts have received attention in the literature. The first is the average power (see, e.g., Benjamini and Liu, 1999; Kwong et al., 2002; Storey, 2002), which is the expected proportion of false null hypotheses that are correctly rejected. The second is based on the concept of false non-discovery rate (FNR), which is an analog of the FDR, defined in terms of Type II errors as the expected proportion of falsely accepted null hypotheses among those that are accepted (Genovese and Wasserman, 2002; Sarkar, 2004, 2005; Storey, 2003). Between two FDR-controlling procedures, the one with the smaller FNR is considered more powerful. The third is 1 − FDR − FNR, proposed in Sarkar (2004), which reflects the strength of unbiasedness of an FDR procedure. Of two FDR-controlling procedures, the one with the larger 1 − FDR − FNR is considered better. We use these three concepts of power in this section.

4.1 Comparisons with the BH procedure

Tables 2 and 3 show part of the extensive simulation carried out for $n = 5$. These tables reveal that the proposed EBH procedure can perform much better than the unmodified BH procedure when the correlation among the test statistics is large and many of the null hypotheses are true. It should be noted that Table 3 also gives an idea of how well our modification of Simes' test controls the Type I error rate. Each value in Tables 2 and 3 is based on 50,000 simulations, and the test statistics corresponding to false null hypotheses are assumed to have a common mean $\theta = 2$.

Table 2. Comparison of BH and EBH for $n = 5$, $n_0 = 3$ and $\alpha = 0.05$. Rows: FDR, FNR, 1 − FDR − FNR and average power, each for BH and EBH; columns: $\rho$. (Table entries not recoverable.)
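The simulation design described in Section 4.1 can be sketched in Python for the unmodified BH procedure. This is an illustrative sketch, not the authors' program: the function name `simulate_bh`, its arguments and the seed are assumptions made here, and the EBH procedure is not included because its critical values (Table 1) are not reproduced. The statistics are generated from a one-factor representation with a shared component $Z_0$, which gives unit variances and common correlation $\rho$.

```python
import numpy as np
from scipy.stats import norm

def simulate_bh(n=5, n0=3, rho=0.5, theta=2.0, alpha=0.05, reps=50_000, seed=0):
    """Monte Carlo estimates of FDR, FNR, 1-FDR-FNR and average power of the
    BH procedure for equicorrelated normal statistics (first n0 nulls true)."""
    rng = np.random.default_rng(seed)
    is_null = np.arange(n) < n0
    means = np.where(is_null, 0.0, theta)
    # One-factor construction: common correlation rho, unit variances.
    z0 = rng.standard_normal((reps, 1))
    z = rng.standard_normal((reps, n))
    x = np.sqrt(1.0 - rho) * z + np.sqrt(rho) * z0 + means
    p = norm.sf(x)                                   # one-sided p-values
    below = np.sort(p, axis=1) <= alpha * np.arange(1, n + 1) / n
    # k = largest j with P_(j) <= j*alpha/n, or 0 if there is none.
    k = np.where(below.any(axis=1), n - np.argmax(below[:, ::-1], axis=1), 0)
    ranks = np.argsort(np.argsort(p, axis=1), axis=1)  # 0-based rank of each p-value
    reject = ranks < k[:, None]
    n_rej = reject.sum(axis=1)
    false_rej = (reject & is_null).sum(axis=1)
    false_acc = (~reject & ~is_null).sum(axis=1)
    fdr = np.mean(false_rej / np.maximum(n_rej, 1))
    fnr = np.mean(false_acc / np.maximum(n - n_rej, 1))
    result = {"FDR": fdr, "FNR": fnr, "1-FDR-FNR": 1.0 - fdr - fnr}
    if n0 < n:  # average power is defined only when some nulls are false
        result["average_power"] = np.mean((reject & ~is_null).sum(axis=1)) / (n - n0)
    return result

print(simulate_bh(rho=0.0))
print(simulate_bh(rho=0.5))
```

Under independence ($\rho = 0$) the estimated FDR should be close to $n_0\alpha/n$, and for $\rho > 0$ it should not exceed that bound beyond simulation error, in line with the results cited in the Introduction.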

Table 3. Comparison of BH and EBH for $n = 5$, $n_0 = 5$ and $\alpha = 0.05$. Rows: FDR for BH and EBH; columns: $\rho$. (Table entries not recoverable.)

4.2 Comparison with the MBH procedure

Since Kwong et al. (2002) do not provide critical values for $\rho > 0.5$, we ran simulations for $\rho \le 0.5$ and $n = 5$ to compare the two procedures in terms of FDR, FNR, 1 − FDR − FNR and average power. Table 4 presents these simulated values. As expected, the MBH procedure appears to dominate when $\rho$ and $n_0$ are both small, and the EBH procedure dominates when both are large. Each number in the table is based on 50,000 simulations, and the mean corresponding to each alternative hypothesis is $\theta = 2$.

Table 4. Comparison of the EBH and MBH procedures with $n = 5$, $\alpha = 0.05$. Only the MBH entries (in parentheses) are recoverable; the EBH entries and the $n_0$ row labels are missing. Rows within each panel follow the original order of $n_0$.

FDR
  ρ = 0.1    ρ = 0.2    ρ = 0.3    ρ = 0.4    ρ = 0.5
  (0.024)    (0.027)    (0.029)    (0.031)    (0.028)
  (0.027)    (0.029)    (0.032)    (0.034)    (0.032)
  (0.033)    (0.034)    (0.036)    (0.039)    (0.037)
  (0.042)    (0.042)    (0.044)    (0.045)    (0.044)
  (0.050)    (0.050)    (0.051)    (0.051)    (0.049)

FNR
  ρ = 0.1    ρ = 0.2    ρ = 0.3    ρ = 0.4    ρ = 0.5
  (0.525)    (0.481)    (0.447)    (0.418)    (0.434)
  (0.517)    (0.499)    (0.486)    (0.475)    (0.471)
  (0.378)    (0.370)    (0.363)    (0.357)    (0.353)
  (0.247)    (0.244)    (0.242)    (0.240)    (0.239)
  (0.125)    (0.124)    (0.124)    (0.124)    (0.124)

1 − FDR − FNR
  ρ = 0.1    ρ = 0.2    ρ = 0.3    ρ = 0.4    ρ = 0.5
  (0.475)    (0.519)    (0.553)    (0.582)    (0.566)
  (0.459)    (0.474)    (0.484)    (0.495)    (0.501)
  (0.595)    (0.601)    (0.605)    (0.609)    (0.615)
  (0.721)    (0.722)    (0.722)    (0.722)    (0.724)
  (0.834)    (0.834)    (0.832)    (0.831)    (0.833)
  (0.950)    (0.950)    (0.949)    (0.949)    (0.951)

Average Power
  ρ = 0.1    ρ = 0.2    ρ = 0.3    ρ = 0.4    ρ = 0.5
  (0.688)    (0.690)    (0.690)    (0.689)    (0.663)
  (0.546)    (0.545)    (0.541)    (0.537)    (0.528)
  (0.483)    (0.484)    (0.484)    (0.482)    (0.479)
  (0.434)    (0.434)    (0.435)    (0.435)    (0.432)
  (0.382)    (0.382)    (0.383)    (0.382)    (0.381)

Note: The values in parentheses are for the MBH procedure.

5 Robustness

Simes' critical values are modified under the assumptions of normality and equal correlation. For other distributions and/or unequal correlations, the conditional distributions in (3) become complicated and hard to simplify further. Therefore, at least three questions can be raised about the robustness of the proposed EBH procedure: What if the common correlation assumption is not satisfied? What if the common correlation is not known? What if the distribution involved is not multivariate normal? In this section, we carry out simulations to investigate these robustness issues.

5.1 Unequal Correlations

Let us first look at the case where the correlation matrix has an AR(1) structure; that is, when our test statistics follow an $n$-dimensional multivariate normal distribution with unit variances and the following covariance (or correlation) matrix:

$$\Sigma = \{\sigma_{ij} = \rho^{|i-j|},\ i = 1, \ldots, n,\ j = 1, \ldots, n\}.$$

Table 5 shows the simulation results based on this AR(1) correlation structure with $n = 5$, where the modified Simes critical values $c_i(\rho^{n-1})$, corresponding to the smallest value of the correlation, are used in the EBH procedure. In these simulation results, the FDR is still controlled at $n_0\alpha/n$. Each number in the table is based on simulations with the mean corresponding to each alternative hypothesis being $\theta = 2$.

Table 5. Simulated FDR with AR(1) correlation structure. Columns: minimum correlation $\rho^4$. (Table entries not recoverable.)

Table 6 shows the simulation results based on two randomly generated positive correlation structures with $n = 5$. Each number in the table is based on simulations with multivariate normal test statistics with unit variance, where the mean corresponding to each alternative hypothesis is $\theta$. The correlation matrices used in the simulation are $\Sigma_1$ and $\Sigma_2$ (their entries are not recoverable here).

The simulation results in Table 5 and Table 6 suggest that, for a general positive correlation structure, if the critical values corresponding to the smallest correlation are used in the EBH procedure, the FDR can still be controlled at $n_0\alpha/n$.

5.2 Unknown Common Correlation

Sometimes in practice, even though it might make sense to assume that the correlations are the same, this common value may not be exactly known. In such a situation, one can use a conservative (small) estimate of the common correlation. Based on the simulations in Section 5.1, this will still control the FDR.

Table 6. Simulated FDR with arbitrary correlation structure. Columns: mean for the alternative hypotheses ($\theta$); rows: $\Sigma_1$ with critical values $c_i(0.3)$ and $\Sigma_2$ with critical values $c_i(0.5)$. (Table entries not recoverable.)
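To make the "smallest correlation" rule of Sections 5.1 and 5.2 concrete, the following Python sketch builds the AR(1) correlation matrix $\sigma_{ij} = \rho^{|i-j|}$, extracts the smallest off-diagonal correlation (the value at which the modified critical values would be looked up), and draws multivariate normal statistics with that structure. The helper names `ar1_correlation` and `smallest_correlation`, the choice $\rho = 0.8$ and the seed are assumptions made here for illustration.

```python
import numpy as np

def ar1_correlation(n, rho):
    """AR(1) correlation matrix with entries rho ** |i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def smallest_correlation(sigma):
    """Smallest off-diagonal entry of a correlation matrix."""
    sigma = np.asarray(sigma, dtype=float)
    off_diagonal = ~np.eye(sigma.shape[0], dtype=bool)
    return sigma[off_diagonal].min()

sigma = ar1_correlation(5, 0.8)
rho_min = smallest_correlation(sigma)   # rho ** (n - 1) for an AR(1) matrix

# Draw test statistics under the overall null with this correlation structure.
rng = np.random.default_rng(1)
x = rng.multivariate_normal(mean=np.zeros(5), cov=sigma, size=1000)
print(rho_min, x.shape)
```

The same `smallest_correlation` helper applies to an arbitrary positive correlation matrix, which is the situation of Table 6.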

Table 7. Simulated FDR with multivariate t distribution. Columns: common correlation $\rho$ among test statistics; rows: $n_0$ and df. (Table entries not recoverable.)

5.3 Multivariate t Distribution

If the population follows a multivariate normal distribution with unknown variance $\sigma^2$, then the test statistics used for inference follow multivariate t distributions. How does the EBH procedure perform in such cases? Table 7 shows the results of simulations based on the modified critical values with $n = 5$, in terms of p-value, as listed in Table 1. Based on the simulation results in this table, the EBH procedure appears to perform well when $n_0 < n$ and the degrees of freedom are not too small ($\ge 10$). When $n_0 = n$, the EBH procedure cannot control the FDR at $\alpha$ for small degrees of freedom. The larger the correlation, the larger the degrees of freedom needed for the EBH procedure to control the FDR at $\alpha$. Each number in the table is based on simulations with multivariate t test statistics with a common correlation structure, and the mean corresponding to each alternative hypothesis is $\theta = 2$.

6 Concluding Remarks

We have suggested in this article an alternative method of adjusting the critical values of Simes' test, making it more powerful, when the underlying test statistics are known to be equicorrelated normal. The step-up procedure based on these adjusted critical values, which is shown to control the FDR, provides a modification of the BH procedure that differs from that of Kwong et al. (2002). This newer modified BH procedure performs much better than the BH procedure when the correlation is high and many of the null hypotheses are actually true, as opposed to Kwong et al.'s version of the modified BH procedure, which is designed to work well for small correlation and a small number of true null hypotheses.

The normality and equal-correlation assumptions are crucial in our proposed modifications of Simes' test and the BH procedure, as is also the case in Kwong et al. (2002). Without these assumptions, that is, for other distributions and/or unequal-correlation cases, it seems difficult to propose such modifications. However, simulation results show that, for unequally correlated cases, if conservative (small) correlation estimates are used for deriving the modified critical values, the proposed EBH procedure can still control the FDR at $n_0\alpha/n$ for multivariate normal distributions. The EBH procedure also performs well for multivariate t distributions if the modified critical values, in terms of p-value, obtained under the normal distribution are used.

It is important to point out that the computing time involved in obtaining the modified critical values increases dramatically as $n$ increases, as they require evaluations of joint probabilities of the ordered components of a dependent random vector. It took us about 10 hours to get these critical values for $n = 6$ using R on a Pentium III 800MHz PC.
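For the smallest case, $n = 2$, the equation $K_1 = 0$ of Section 2 involves only a one-dimensional integral, because $\Psi_1(x) = P_H\{X_2 < c_1 \mid X_1 = x\} = \Phi\{(c_1 - \rho x)/\sqrt{1-\rho^2}\}$. The following Python sketch solves it numerically. It is an illustration under that assumption, not the authors' R program (which uses the Dunnett and Tamhane (1992) algorithm); the function names `modified_c2` and `type_one_error_n2` are introduced here, and $\rho < 1$ is required.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq
from scipy.stats import multivariate_normal, norm

def modified_c2(rho, alpha=0.05):
    """Solve K_1 = 0 for c_2 when n = 2 (requires 0 <= rho < 1):
    int_{c2}^inf Psi_1(x) phi(x) dx = (1/2) int_{c1}^inf Psi_1(x) phi(x) dx."""
    c1 = norm.ppf(1.0 - alpha)                  # step (1): c_1 does not depend on rho
    scale = np.sqrt(1.0 - rho ** 2)

    def integrand(x):
        psi = norm.cdf((c1 - rho * x) / scale)  # P(X_2 < c_1 | X_1 = x)
        return psi * norm.pdf(x)

    half = 0.5 * quad(integrand, c1, np.inf)[0]

    def k1(c2):
        return quad(integrand, c2, np.inf)[0] - half

    # k1 is positive at c1 and negative far in the upper tail.
    return brentq(k1, c1, 10.0)

def type_one_error_n2(rho, c1, c2):
    """P(X_(1) >= c1 or X_(2) >= c2) for a standard bivariate normal pair
    with correlation rho, computed from the joint cdf F as
    1 - 2 F(c1, c2) + F(c1, c1)."""
    mvn = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, rho], [rho, 1.0]])
    return 1.0 - 2.0 * mvn.cdf([c1, c2]) + mvn.cdf([c1, c1])

c1 = norm.ppf(0.95)
for rho in (0.0, 0.3, 0.6, 0.9):
    c2 = modified_c2(rho)
    print(rho, 1.0 - norm.cdf(c2), type_one_error_n2(rho, c1, c2))
```

At $\rho = 0$ the solution reduces to the Simes value $\Phi^{-1}(1 - \alpha/2)$, and for $\rho > 0$ it should be smaller, consistent with Theorem 2. For larger $n$ the functions $\Psi_j$ involve joint probabilities of ordered components, which is where the computational burden arises.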
While this computing burden limits our ability to study the proposed procedure for a larger number of tests, given the computing facilities presently available to us, it does not limit the scope of the procedure, since quite often in practice, particularly in safety assessment and pharmacology studies in drug development, only a small number of statistics are involved; see, for example, Zhang et al. (1997) and Dmitrienko et al. (2003). We also tried the algorithm in Kwong and Liu (2000), but observed no improvement in computing time over the algorithm of Dunnett and Tamhane (1992) used here. A faster algorithm for evaluating joint probabilities of the ordered components of a dependent random vector needs to be formulated, or an approximation formula needs to be developed to accurately calculate the adjusted Simes critical values for large $n$.

7 Acknowledgements

We thank an Associate Editor and two referees for their valuable comments.

8 Appendix

Lemma 1 For any fixed $\theta < \theta'$, $\bar\Phi(x - \theta')/\bar\Phi(x - \theta)$ is an increasing function of $x$.

Lemma 1 follows from the TP$_2$ (totally positive of order two) property of $\bar\Phi(x - \theta)$ in $(x, \theta)$; see Karlin and Rinott (1980) and Das Gupta and Sarkar (1984).

The next lemma will be proved using certain results on multivariate totally positive of order two (MTP$_2$) distributions. For a better understanding of the proof, we now recall the definition of the MTP$_2$ property, due to Karlin and Rinott (1980), and then briefly review some basic related results that will be used in the proof.

Definition. A non-negative real-valued function $f(x_1, \ldots, x_n)$ defined on $R^n$ is MTP$_2$ in $(x_1, \ldots, x_n)$ if, for any two points $(x_1, \ldots, x_n)$ and $(y_1, \ldots, y_n)$ in $R^n$,

$$f(\max(x_1, y_1), \ldots, \max(x_n, y_n))\, f(\min(x_1, y_1), \ldots, \min(x_n, y_n)) \ge f(x_1, \ldots, x_n)\, f(y_1, \ldots, y_n).$$

When $n = 2$, this is referred to as the totally positive of order two (TP$_2$) property of Karlin (1968).

Result A.1. If $f$ is MTP$_2$ in $(x_1, \ldots, x_n)$, then the ratio

$$\frac{f(x_1, \ldots, x_k, x'_{k+1}, \ldots, x'_n)}{f(x_1, \ldots, x_k, x_{k+1}, \ldots, x_n)}$$

is increasing in $(x_1, \ldots, x_k)$, for any fixed $(x_{k+1}, \ldots, x_n)$ and $(x'_{k+1}, \ldots, x'_n)$ with $x'_i > x_i$ for $i = k+1, \ldots, n$.

Result A.2. If $\psi_1(x_1, \ldots, x_n)$ and $\psi_2(y_1, \ldots, y_n)$ are both MTP$_2$, then the product $\psi_1\psi_2$ is MTP$_2$ in $(x_1, \ldots, x_n, y_1, \ldots, y_n)$.

The above two results follow easily from the definition of MTP$_2$. The MTP$_2$ properties of two particular functions arising in the proof of our next lemma follow from Result A.2. The first function is $\prod_{i=1}^{n} \phi(x_i - \theta_i)$, where $\phi(x - \theta)$ is the density of $N(\theta, 1)$ and is TP$_2$. The other function is $I(x_1 \le \cdots \le x_n) = \prod_{i=1}^{n-1} I(x_i \le x_{i+1})$, where $I(x \le y)$, which is 1 if $x \le y$ and 0 otherwise, is also TP$_2$.
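The TP$_2$ property of the $N(\theta, 1)$ density quoted above can be checked directly from the definition. The following short verification is added as a worked step; the labels $x_1 \le x_2$ and $\theta_1 \le \theta_2$ are introduced here for illustration and denote two values of each argument, not components of $X$ or $\Theta$.

```latex
% Factor the N(theta, 1) density:
\phi(x-\theta) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, e^{-\theta^2/2}\, e^{x\theta}.
% For x_1 \le x_2 and \theta_1 \le \theta_2 the factors in x^2 and \theta^2 cancel:
\frac{\phi(x_2-\theta_2)\,\phi(x_1-\theta_1)}{\phi(x_1-\theta_2)\,\phi(x_2-\theta_1)}
  = \exp\{x_2\theta_2 + x_1\theta_1 - x_1\theta_2 - x_2\theta_1\}
  = \exp\{(x_2-x_1)(\theta_2-\theta_1)\} \;\ge\; 1 .
```

This is the defining inequality with the two points taken as $(x_1, \theta_2)$ and $(x_2, \theta_1)$, whose componentwise maximum and minimum are $(x_2, \theta_2)$ and $(x_1, \theta_1)$.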

Result A.3. If $f(x_1, \ldots, x_n)$ is MTP$_2$, then

$$\int \cdots \int f(x_1, \ldots, x_n) \prod_{i=k+1}^{n} dx_i$$

is MTP$_2$ in $(x_1, \ldots, x_k)$.

A proof of this is given in Karlin and Rinott (1980).

Lemma 2 Let $X \sim N_n(\Theta, I_n)$. Then, for any fixed $a_1 \le \cdots \le a_n$,

$$\frac{P_\Theta\{X_{(1)} \le a_1 + x, \ldots, X_{(n)} \le a_n + x\}}{P_0\{X_{(1)} \le a_1 + x, \ldots, X_{(n)} \le a_n + x\}} \tag{14}$$

is an increasing function of $x$.

Proof. The joint cdf of $(X_{(1)}, \ldots, X_{(n)})$ at $(x_1, \ldots, x_n)$, under any $\Theta$, is given by

$$P_\Theta\{X_{(1)} \le x_1, \ldots, X_{(n)} \le x_n\} = \sum_{(i_1, \ldots, i_n) \in \mathcal{P}} k_{i_1, \ldots, i_n}(\Theta)\, P_\Theta\{X_{i_1} \le x_1, \ldots, X_{i_n} \le x_n \mid X_{i_1} \le \cdots \le X_{i_n}\}, \tag{15}$$

where $\mathcal{P}$ is the set of all permutations $(i_1, \ldots, i_n)$ of $(1, \ldots, n)$ and $k_{i_1, \ldots, i_n}(\Theta) = P_\Theta\{X_{i_1} \le \cdots \le X_{i_n}\}$. When $\Theta = 0$, the conditional probabilities in (15) are all equal. Therefore, the ratio of the probabilities in (14) can be written as

$$\sum_{(i_1, \ldots, i_n) \in \mathcal{P}} k_{i_1, \ldots, i_n}(\Theta)\, \frac{P_\Theta\{X_{i_1} \le x_1, \ldots, X_{i_n} \le x_n \mid X_{i_1} \le \cdots \le X_{i_n}\}}{P_0\{X_{i_1} \le x_1, \ldots, X_{i_n} \le x_n \mid X_{i_1} \le \cdots \le X_{i_n}\}}. \tag{16}$$

We will now prove that the ratio of the conditional probabilities in (16), for each permutation $(i_1, \ldots, i_n)$, is increasing in $(x_1, \ldots, x_n)$. The lemma follows once this is proved. We give a proof of this result only for $(i_1, \ldots, i_n) = (1, \ldots, n)$; it can be proved similarly for other permutations.

The joint density of $(X_1, \ldots, X_n)$ conditional on $X_1 \le \cdots \le X_n$, under any $\Theta$, is given by

$$g_\Theta(x_1, \ldots, x_n) = k^{-1}_{1, \ldots, n}(\Theta) \prod_{i=1}^{n} \phi(x_i - \theta_i)\, I(x_1 \le \cdots \le x_n). \tag{17}$$

Hence, we have

$$P_\Theta\{X_1 \le x_1, \ldots, X_n \le x_n \mid X_1 \le \cdots \le X_n\} = \int \cdots \int \prod_{i=1}^{n} I(y_i \le x_i)\, g_\Theta(y_1, \ldots, y_n) \prod_{i=1}^{n} dy_i.$$

As discussed above following Result A.2, $g_\Theta(y_1, \ldots, y_n)$ is MTP$_2$ in $(y_1, \ldots, y_n, \theta_1, \ldots, \theta_n)$ and $\prod_{i=1}^{n} I(y_i \le x_i)$ is MTP$_2$ in $(y_1, \ldots, y_n, x_1, \ldots, x_n)$. Hence, from Result A.3, we see that the conditional probability

$$P_\Theta\{X_1 \le x_1, \ldots, X_n \le x_n \mid X_1 \le \cdots \le X_n\}$$

is MTP$_2$ in $(x_1, \ldots, x_n, \theta_1, \ldots, \theta_n)$. This implies, as stated in Result A.1, that the ratio

$$\frac{P_\Theta\{X_1 \le x_1, \ldots, X_n \le x_n \mid X_1 \le \cdots \le X_n\}}{P_0\{X_1 \le x_1, \ldots, X_n \le x_n \mid X_1 \le \cdots \le X_n\}}$$

is increasing in $(x_1, \ldots, x_n)$. This completes the proof.

References

Benjamini, Y. and Y. Hochberg (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B 57.

Benjamini, Y. and W. Liu (1999). A step-down multiple hypotheses testing procedure that controls the false discovery rate under independence. Journal of Statistical Planning and Inference 82.

Benjamini, Y. and D. Yekutieli (2001). The control of the false discovery rate in multiple testing under dependency. Annals of Statistics 29.

Das Gupta, S. and S. K. Sarkar (1984). On TP2 and log-concavity. In Inequalities in Statistics and Probability, Volume 5 of IMS Lecture Notes Monograph Series. Institute of Mathematical Statistics.

Dunnett, C. W. and A. C. Tamhane (1992). A step-up multiple test procedure. Journal of the American Statistical Association 87.

Genovese, C. and L. Wasserman (2002). Operating characteristics and extensions of the false discovery rate procedure. Journal of the Royal Statistical Society, Series B 64.

Hochberg, Y. and A. C. Tamhane (1987). Multiple Comparison Procedures. John Wiley & Sons.

Hsu, J. C. (1996). Multiple Comparisons: Theory and Methods. Chapman & Hall.

Karlin, S. (1968). Total Positivity. Stanford, CA: Stanford University Press.

Karlin, S. and Y. Rinott (1980). Classes of orderings of measures and related correlation inequalities. I. Multivariate totally positive distributions. Journal of Multivariate Analysis 10.

Kwong, K. S., B. Holland, and S. H. Cheung (2002). A modified Benjamini-Hochberg multiple comparisons procedure for controlling the false discovery rate. Journal of Statistical Planning and Inference 104.

Kwong, K. S. and W. Liu (2000). Calculation of critical values for Dunnett and Tamhane's step-up multiple test procedure. Statistics & Probability Letters 49.

Sarkar, S. K. (1998). Some probability inequalities for ordered MTP2 random variables: a proof of the Simes conjecture. Annals of Statistics 26 (2).

Sarkar, S. K. (2002). Some results on false discovery rate in stepwise multiple testing procedures. Annals of Statistics 30.

Sarkar, S. K. (2004). FDR-controlling stepwise procedures and their false negatives rates. Journal of Statistical Planning and Inference 125.

Sarkar, S. K. (2005). False discovery and false non-discovery rates in single-step multiple testing procedures. To appear in Annals of Statistics.

Sarkar, S. K. and C.-K. Chang (1997). The Simes method for multiple hypothesis testing with positively dependent test statistics. Journal of the American Statistical Association 92.

Simes, R. J. (1986). An improved Bonferroni procedure for multiple tests of significance. Biometrika 73.

Storey, J. D. (2002). A direct approach to false discovery rates. Journal of the Royal Statistical Society, Series B 64.

Storey, J. D. (2003). The positive false discovery rate: a Bayesian interpretation and the q-value. Annals of Statistics 31 (6).

Troendle, J. F. (2000). Stepwise normal theory multiple test procedures controlling the false discovery rate. Journal of Statistical Planning and Inference 84.

Yekutieli, D. and Y. Benjamini (1999). Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics. Journal of Statistical Planning and Inference 82.


Exceedance Control of the False Discovery Proportion Christopher Genovese 1 and Larry Wasserman 2 Carnegie Mellon University July 10, 2004 Exceedance Control of the False Discovery Proportion Christopher Genovese 1 and Larry Wasserman 2 Carnegie Mellon University July 10, 2004 Multiple testing methods to control the False Discovery Rate (FDR),

More information

The International Journal of Biostatistics

The International Journal of Biostatistics The International Journal of Biostatistics Volume 7, Issue 1 2011 Article 12 Consonance and the Closure Method in Multiple Testing Joseph P. Romano, Stanford University Azeem Shaikh, University of Chicago

More information

Hunting for significance with multiple testing

Hunting for significance with multiple testing Hunting for significance with multiple testing Etienne Roquain 1 1 Laboratory LPMA, Université Pierre et Marie Curie (Paris 6), France Séminaire MODAL X, 19 mai 216 Etienne Roquain Hunting for significance

More information

Exact and Approximate Stepdown Methods For Multiple Hypothesis Testing

Exact and Approximate Stepdown Methods For Multiple Hypothesis Testing Exact and Approximate Stepdown Methods For Multiple Hypothesis Testing Joseph P. Romano Department of Statistics Stanford University Michael Wolf Department of Economics and Business Universitat Pompeu

More information

Heterogeneity and False Discovery Rate Control

Heterogeneity and False Discovery Rate Control Heterogeneity and False Discovery Rate Control Joshua D Habiger Oklahoma State University jhabige@okstateedu URL: jdhabigerokstateedu August, 2014 Motivating Data: Anderson and Habiger (2012) M = 778 bacteria

More information

Statistical Applications in Genetics and Molecular Biology

Statistical Applications in Genetics and Molecular Biology Statistical Applications in Genetics and Molecular Biology Volume 6, Issue 1 2007 Article 28 A Comparison of Methods to Control Type I Errors in Microarray Studies Jinsong Chen Mark J. van der Laan Martyn

More information

MULTIPLE TESTING PROCEDURES AND SIMULTANEOUS INTERVAL ESTIMATES WITH THE INTERVAL PROPERTY

MULTIPLE TESTING PROCEDURES AND SIMULTANEOUS INTERVAL ESTIMATES WITH THE INTERVAL PROPERTY MULTIPLE TESTING PROCEDURES AND SIMULTANEOUS INTERVAL ESTIMATES WITH THE INTERVAL PROPERTY BY YINGQIU MA A dissertation submitted to the Graduate School New Brunswick Rutgers, The State University of New

More information

Sequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process

Sequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process Applied Mathematical Sciences, Vol. 4, 2010, no. 62, 3083-3093 Sequential Procedure for Testing Hypothesis about Mean of Latent Gaussian Process Julia Bondarenko Helmut-Schmidt University Hamburg University

More information

Power and sample size determination for a stepwise test procedure for finding the maximum safe dose

Power and sample size determination for a stepwise test procedure for finding the maximum safe dose Journal of Statistical Planning and Inference 136 (006) 163 181 www.elsevier.com/locate/jspi Power and sample size determination for a stepwise test procedure for finding the maximum safe dose Ajit C.

More information