Statistical Applications in Genetics and Molecular Biology
Volume 5, Issue 1, Article 28

A Two-Step Multiple Comparison Procedure for a Large Number of Tests and Multiple Treatments

Hongmei Jiang, Northwestern University, hongmei@northwestern.edu
Rebecca W. Doerge, Purdue University, doerge@purdue.edu

Copyright © 2006 The Berkeley Electronic Press. All rights reserved.
Abstract

As the number of tested hypotheses grows, the power to detect statistically significant multiple treatment effects decreases. As is the case with microarray technology, researchers are often interested in identifying differentially expressed genes for more than two types of cells or treatments. A two-step procedure is proposed for the purpose of increasing power to detect significant effects (i.e., to identify differentially expressed genes). Specifically, in the first step, the null hypothesis of equality across the mean expression levels for all treatments is tested for each gene. In the second step, only pairwise comparisons corresponding to the genes for which the treatment means are statistically different in the first step are tested. We propose an approach to estimate the overall FDR for both fixed rejection regions and fixed FDR significance levels. Also proposed is a procedure to find the FDR significance levels used in the first step and the second step such that the overall FDR can be controlled below a pre-specified FDR significance level. When compared via simulation, the two-step approach has increased power over a one-step procedure and controls the FDR at a desired significance level.

KEYWORDS: false discovery rate, multiple comparisons, multiple tests, testing differential expression

Acknowledgments: We are very grateful to two reviewers and the Associate Editor for their helpful comments and suggestions.
1 Introduction

Advances in many areas of technology (e.g., communication, health care, and biotechnology) are giving rise to vast experiments that provide data for conducting a very large number of repetitive tests. These situations require a multiple comparison correction that not only accommodates the number of tests being conducted, but also controls the rate of false positives at a desired level. While this problem presents itself in a variety of applications, the one that motivated this work is microarray technology, a powerful tool that is widely applicable to almost every area of science (e.g., basic science, agriculture, and medical research). Microarrays provide a systematic way to study transcript variation for thousands of genes simultaneously. The key question addressed by most microarray experiments is which genes are differentially expressed between a pair of conditions (e.g., control and treatment). Numerous approaches that range from traditional statistical analyses to new statistical models have been proposed for testing differential gene expression (Schena et al., 1996; Baldi and Long, 2001; Efron, 2003; Newton et al., 2001; Gottardo et al., 2003; Tusher et al., 2001; Kerr et al., 2000; Wolfinger et al., 2001) between pairs of conditions. Since the traditional familywise error rate (FWER) multiple comparison procedures, such as Bonferroni's procedure, are too conservative, false discovery rate (FDR) controlling procedures (Benjamini and Hochberg, 1995) have been widely used in microarray studies. Benjamini and Hochberg (2000) propose an adaptive procedure that has increased power over the original procedure by incorporating the estimate of the proportion of true null hypotheses.
A variety of methods have been proposed to estimate the proportion of true null hypotheses for multiple testing problems, such as Storey's bootstrap method (Storey, 2002), Storey and Tibshirani's smoother estimate (Storey and Tibshirani, 2003), and the method of Langaas et al. based on nonparametric maximum likelihood estimation of the p-value density, under the restriction of decreasing and convex decreasing densities (Langaas et al., 2005).

Although testing for differential expression of a gene between pairs of conditions or treatments is informative, in a microarray study it is quite common for researchers to be interested in comparing more than two treatment conditions for thousands of genes in the experiment. For instance, Hedenfalk et al. (2001) studied gene expression changes among breast cancers due to mutations in either the gene BRCA1 or the gene BRCA2 and sporadic tumor (i.e., three conditions) using 5,361 genes. With a large number ($m$) of genes, the number of pairwise comparisons is typically very large ($3m$ for 3 treatments, $6m$ for 4 treatments, etc.). Therefore, when the goal is to identify statistically
differentially expressed genes between each pair of conditions, in the typical one-step multiple comparison procedure $C \times m$ hypothesis tests (where $C$ is the number of pairwise comparisons for each gene) are treated as a family, and a false discovery rate (FDR) controlling procedure such as Benjamini and Hochberg's procedure (Benjamini and Hochberg, 1995) is applied at a significance level $\alpha$. In situations where the majority of genes are not differentially expressed across the treatments, applying the FDR controlling procedure to a large family of multiple comparisons may not be the most powerful approach, simply because the power of detecting differentially expressed genes decreases as the number of hypotheses increases.

Lu et al. (2005) explored this issue and proposed a two-step strategy. In the first step, a subset of genes that are potentially differentially expressed among the treatments is identified with a loose criterion. In the second step, these candidate genes are combined for detecting differentially expressed genes with a more stringent criterion. It is expected that the smaller number of genes in the second step will give rise to a more powerful test. In both steps of the procedure Lu et al. (2005) employ a Bonferroni adjustment to address the multiple comparison problem. Lu et al. (2005) point out that Benjamini and Hochberg's FDR controlling procedure (Benjamini and Hochberg, 1995) can be used in both steps, but do not address the familywise error rate (FWER) or the FDR for the entire procedure. Specifically, suppose the FDR significance levels used in the two steps are 0.05 and 0.01, respectively. The FDR for the whole procedure must be taken into account, not only the individual FDRs at each step, since the false rejections in the first step will affect the results of the second step.
Using this as our motivation, a two-step multiple comparison procedure is proposed for testing pairwise comparisons of more than two treatments for a large number of genes, such that the power to detect differentially expressed genes, while controlling the FDR at a pre-chosen significance level, is higher than that of a one-step procedure. Although Lu et al. (2005) used a mixed model approach for their two-step procedure, our proposed two-step procedure is not limited by the specifics of the model. Specifically, in the first step, the null hypothesis of equality across the mean expression levels for all treatments is tested for each gene. In the second step, only pairwise comparisons corresponding to the genes for which the treatment means are statistically different in the first step are tested. The two-step procedure can be applied in practice in three different ways:

1. The rejection regions in the first and second step can both be fixed. That is, equality tests of expression levels for the genes in the first step with corresponding p-values less than or equal to $c_1$ are considered statistically significant, and pairwise comparisons in the second step with p-values less than or equal to $c_2$ are statistically significant, where
$c_1$ and $c_2$ are fixed and known. Although it is typical to use the term rejection region in conjunction with test statistics, here we use it in conjunction with p-values for ease of explanation;

2. One can apply an FDR controlling procedure at significance level $\alpha_1$ in the first step, and an FDR controlling procedure at significance level $\alpha_2$ in the second step, where $\alpha_1$ and $\alpha_2$ are fixed and known;

3. One can pre-specify an overall FDR significance level $\alpha$ and choose the levels used in the two steps so that the overall FDR is controlled below $\alpha$.

In this work we propose an approach to estimate the overall FDR for both fixed rejection regions (situation 1) and fixed FDR significance levels (situation 2). We also propose a procedure to find the FDR significance levels used in the first step and the second step such that the overall FDR can be controlled below a pre-specified FDR significance level. Using simulated data we demonstrate that our proposed two-step procedure has increased power over a one-step procedure and controls the FDR for the entire procedure at a desired significance level.

2 A two-step multiple comparison procedure

A novel two-step multiple comparison procedure is proposed in the context of testing for differential expression. Initially, we present it generally, with no specific FDR controlling procedure specified:

Step 1. The null hypothesis that a gene is not differentially expressed across all treatment conditions is tested for each gene (e.g., the global F-test from an ANOVA model). For the family of $m$ tests corresponding to the $m$ genes, an FDR controlling procedure is applied to control the FDR at level $\alpha_1$. Suppose there are $K$ tests that are significant. Let $A$ denote the collection of the genes which have statistically significant treatment effects.
If $K = 0$, the procedure is stopped and it is concluded that no pairwise comparisons are significant and that there are no differentially expressed genes; otherwise, go to Step 2.

Step 2. (a) For genes not belonging to $A$, conclude that the pairwise comparisons among the treatments for these genes are not significant. (b) For genes belonging to $A$, perform the $C$ pairwise comparisons for each gene. Since there are $K$ genes, in total there are $C \times K$ pairwise comparisons. Apply an FDR controlling procedure for this family of $C \times K$ tests at level $\alpha_2$.

The procedure above uses FDR significance levels $\alpha_1$ and $\alpha_2$; it can also be carried out with fixed rejection regions in a similar way:
Step 1. The null hypothesis that a gene is not differentially expressed across all treatment conditions is tested for each gene (e.g., the global F-test from an ANOVA model). Tests with p-values $\le c_1$ are considered statistically significant. Suppose there are $K$ tests that are significant. Let $A$ denote the collection of the genes which have statistically significant treatment effects. If $K = 0$, the procedure is stopped and it is concluded that no pairwise comparisons are significant and that there are no differentially expressed genes; otherwise, go to Step 2.

Step 2. (a) For genes not belonging to $A$, conclude that the pairwise comparisons among the treatments for these genes are not significant. (b) For genes belonging to $A$, perform the $C$ pairwise comparisons for each gene. Since there are $K$ genes, in total there are $C \times K$ pairwise comparisons. Pairwise comparisons with p-values $\le c_2$ are considered statistically significant.

We assume that if a gene does not have a significant treatment effect (tested in Step 1), then all of the pairwise comparisons among the treatments corresponding to that gene are not significant. Only genes with a statistically significant treatment effect enter the second step to be tested for pairwise comparisons. However, if a gene has a significant treatment effect (Step 1), some or all of the pairwise comparisons may not be significant.

For the fixed FDR significance levels $\alpha_1$ and $\alpha_2$, or the fixed rejection regions $[0, c_1]$ and $[0, c_2]$ in Step 1 and Step 2, respectively, the overall FDR still has to be determined. Choosing the significance level $\alpha_1$ in Step 1 and $\alpha_2$ in Step 2 so that the FDR for the entire two-step procedure is controlled at a desired significance level $\alpha$ is an additional issue of interest.
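The fixed-rejection-region version of the two steps can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `two_step_fixed_regions` and the p-values are hypothetical, and the p-values are assumed to have been computed beforehand (for example, from a global F-test and pairwise tests).

```python
from typing import List, Sequence, Tuple


def two_step_fixed_regions(
    gene_pvalues: Sequence[float],
    pairwise_pvalues: Sequence[Sequence[float]],
    c1: float,
    c2: float,
) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Return (genes passing Step 1, rejected (gene, pair) indices from Step 2)."""
    # Step 1: the set A holds genes whose global-test p-value is <= c1.
    selected = [i for i, p in enumerate(gene_pvalues) if p <= c1]
    # K = 0: stop and declare nothing significant.
    if not selected:
        return [], []
    # Step 2(b): only pairwise comparisons of genes in A are tested, at threshold c2.
    # Step 2(a) is implicit: genes outside A contribute no rejections.
    rejected = [
        (i, j)
        for i in selected
        for j, p in enumerate(pairwise_pvalues[i])
        if p <= c2
    ]
    return selected, rejected


# Hypothetical p-values for 4 genes and C = 3 pairwise comparisons per gene.
gene_p = [0.001, 0.20, 0.04, 0.60]
pair_p = [
    [0.002, 0.300, 0.004],
    [0.001, 0.500, 0.700],  # small pairwise p-value, but the gene fails Step 1
    [0.030, 0.008, 0.900],
    [0.400, 0.600, 0.800],
]
genes_in_A, rejections = two_step_fixed_regions(gene_p, pair_p, c1=0.05, c2=0.01)
print(genes_in_A)   # [0, 2]
print(rejections)   # [(0, 0), (0, 2), (2, 1)]
```

The second gene shows the two-criteria nature of the procedure: its first pairwise p-value is below $c_2$, yet it is not rejected because the gene does not pass Step 1.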
To address these issues the two-step multiple comparison procedure is investigated further to gain an appreciation of the overall FDR relative to the FDR in each step of the procedure.

3 Estimating FDR for fixed rejection regions

3.1 Derivation of the FDR

Assume the two-step procedure with fixed rejection regions is used. That is, assume that genes with p-values $\le c_1$ have a significant treatment effect (i.e., at least one treatment mean is different from the others) in Step 1, and that the pairwise comparisons with p-values $\le c_2$ are identified as statistically significant in Step 2, where $c_1$ and $c_2$ are known. Our goal is to compute the overall FDR for the
two-step multiple comparison procedure. The approach is similar to Storey's positive false discovery rate (pFDR) procedure (Storey, 2002, 2003b), where one estimates the FDR for a given rejection region.

Let $H_0^{i}$ denote the null hypothesis of no treatment effect for the $i$th gene and let $H_0^{ij}$ denote the null hypothesis that the $j$th pair of treatment means are not different for the $i$th gene. For instance, if three treatments are of interest, $j = 1, 2, 3$; if four treatments are of interest, $j = 1, 2, \ldots, 6$. Let $D_i = 0$ indicate that there is no treatment effect for the $i$th gene, and let $D_i = 1$ indicate a treatment effect for the $i$th gene. Furthermore, let $D_{ij} = 0$ indicate that the means of the $j$th pair of treatments for gene $i$ are the same, and $D_{ij} = 1$ that they are different. If $D_i = 0$, then $D_{ij} = 0$ for all $j$. Finally, let $p_i$ denote the p-value for testing the null hypothesis $H_0^{i}$ in Step 1, and $p_{ij}$ the p-value for testing the null hypothesis $H_0^{ij}$ in Step 2.

Our two-step multiple comparison approach differs from the one-step multiple comparison procedure, where the decision to reject depends only on $p_{ij}$: in the two-step procedure the decision whether to reject $H_0^{ij}$ depends on both $p_i$ and $p_{ij}$. Essentially, the two-step multiple comparison procedure has two criteria. The null hypothesis $H_0^{ij}$ is rejected if and only if both conditions $p_i \le c_1$ and $p_{ij} \le c_2$ are satisfied. The two-step procedure is exactly the one-step procedure when $c_1 = 1$. In fact, if $c_1$ is large enough that, for every gene $i$, the event $\{p_i \le c_1\}$ occurs whenever $\{p_{ij} \le c_2\}$ occurs for some $j$, then the two-step procedure will produce the same results as the one-step procedure.

Theorem 1.
In a two-step multiple comparison procedure, suppose that objects/genes with p-values $\le c_1$ are considered as having a significant treatment effect (i.e., at least one treatment mean is different from the others) in Step 1, and that the pairwise comparisons with p-values $\le c_2$ are identified as statistically significant in Step 2. Assume $c_1$ and $c_2$ are known, and the objects/genes are independent. The pFDR of this two-step multiple comparison procedure is

$$
\mathrm{pFDR} = \mathrm{pFDR}_1 \, \frac{P(p_{ij} \le c_2 \mid D_i = 0,\ p_i \le c_1)}{P(p_{ij} \le c_2 \mid p_i \le c_1)}
+ (1 - \mathrm{pFDR}_1) \, \frac{P(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 1,\ p_i \le c_1)\, P(D_{ij} = 0 \mid D_i = 1,\ p_i \le c_1)}{P(p_{ij} \le c_2 \mid p_i \le c_1)},
\tag{1}
$$

where $\mathrm{pFDR}_1 = P(D_i = 0 \mid p_i \le c_1)$ is the pFDR in Step 1.
Proof. Since the goal of the two-step multiple comparison procedure is to identify statistically significant pairwise comparisons, only the rejections in Step 2 are of interest. Assume the objects/genes are independent. Using the Bayesian interpretation of the pFDR (Storey, 2003b), the pFDR for the whole procedure is the probability of having a false rejection of a pairwise comparison given that it is in the rejection region (i.e., the probability that $D_{ij} = 0$ given that $p_i \le c_1$ and $p_{ij} \le c_2$):

$$
\mathrm{pFDR} = P(D_{ij} = 0 \mid p_i \le c_1,\ p_{ij} \le c_2)
= \frac{P(D_{ij} = 0,\ p_{ij} \le c_2 \mid p_i \le c_1)}{P(p_{ij} \le c_2 \mid p_i \le c_1)}.
\tag{2}
$$

To compute the numerator of equation (2), genes that are falsely rejected in the first step are treated separately from rejected genes that do in fact have a treatment effect:

$$
\begin{aligned}
P(D_{ij} = 0,\ p_{ij} \le c_2 \mid p_i \le c_1)
&= P(D_{ij} = 0,\ p_{ij} \le c_2 \mid D_i = 0,\ p_i \le c_1)\, P(D_i = 0 \mid p_i \le c_1) \\
&\quad + P(D_{ij} = 0,\ p_{ij} \le c_2 \mid D_i = 1,\ p_i \le c_1)\, P(D_i = 1 \mid p_i \le c_1) \\
&= P(D_{ij} = 0,\ p_{ij} \le c_2 \mid D_i = 0,\ p_i \le c_1)\, \mathrm{pFDR}_1 \\
&\quad + P(D_{ij} = 0,\ p_{ij} \le c_2 \mid D_i = 1,\ p_i \le c_1)\, (1 - \mathrm{pFDR}_1) \\
&= P(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 0,\ p_i \le c_1)\, P(D_{ij} = 0 \mid D_i = 0,\ p_i \le c_1)\, \mathrm{pFDR}_1 \\
&\quad + P(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 1,\ p_i \le c_1)\, (1 - \mathrm{pFDR}_1)\, P(D_{ij} = 0 \mid D_i = 1,\ p_i \le c_1).
\end{aligned}
$$

Assume that all pairwise comparisons for a gene are not significant if that gene does not have a significant treatment effect; then

$$
P(D_{ij} = 0 \mid D_i = 0,\ p_i \le c_1) = P(D_{ij} = 0 \mid D_i = 0) = 1.
$$

Then,

$$
\begin{aligned}
P(D_{ij} = 0,\ p_{ij} \le c_2 \mid p_i \le c_1)
&= P(p_{ij} \le c_2 \mid D_i = 0,\ p_i \le c_1)\, \mathrm{pFDR}_1 \\
&\quad + (1 - \mathrm{pFDR}_1)\, P(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 1,\ p_i \le c_1)\, P(D_{ij} = 0 \mid D_i = 1,\ p_i \le c_1).
\end{aligned}
\tag{3}
$$

Combining equation (3) with equation (2) gives the pFDR formulation in equation (1).
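As a consistency check on equation (1), consider the special case $c_1 = 1$, in which every gene passes Step 1 and the two-step procedure should coincide with the one-step procedure. The following short derivation is an added illustration, using only the definitions above. With $c_1 = 1$ the conditioning on $p_i \le c_1$ is vacuous and $\mathrm{pFDR}_1 = P(D_i = 0)$, so equation (1) collapses to the Bayesian form of the one-step pFDR:

```latex
\begin{aligned}
\mathrm{pFDR}
&= \frac{P(D_i = 0)\, P(p_{ij} \le c_2 \mid D_i = 0)
   + P(D_i = 1)\, P(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 1)\, P(D_{ij} = 0 \mid D_i = 1)}
   {P(p_{ij} \le c_2)} \\
&= \frac{P(p_{ij} \le c_2,\ D_{ij} = 0,\ D_i = 0) + P(p_{ij} \le c_2,\ D_{ij} = 0,\ D_i = 1)}
   {P(p_{ij} \le c_2)} \\
&= \frac{P(D_{ij} = 0,\ p_{ij} \le c_2)}{P(p_{ij} \le c_2)}
 = P(D_{ij} = 0 \mid p_{ij} \le c_2).
\end{aligned}
```

The second line uses the fact that $D_i = 0$ implies $D_{ij} = 0$, so $P(D_i = 0)\, P(p_{ij} \le c_2 \mid D_i = 0) = P(p_{ij} \le c_2,\ D_{ij} = 0,\ D_i = 0)$.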
3.2 Estimation of the FDR

With respect to microarray studies, the probability of having at least one rejection, $P(R > 0)$, is almost 1, making the FDR and the pFDR essentially the same (Storey et al., 2004; Black, 2004). Therefore, the pFDR can be replaced with the FDR in equation (1), and the FDR for a two-step multiple comparison procedure is

$$
\mathrm{FDR} = \mathrm{FDR}_1 \, \frac{P(p_{ij} \le c_2 \mid D_i = 0,\ p_i \le c_1)}{P(p_{ij} \le c_2 \mid p_i \le c_1)}
+ (1 - \mathrm{FDR}_1) \, \frac{P(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 1,\ p_i \le c_1)\, P(D_{ij} = 0 \mid D_i = 1,\ p_i \le c_1)}{P(p_{ij} \le c_2 \mid p_i \le c_1)}.
\tag{4}
$$

To estimate the FDR of the two-step multiple comparison procedure with fixed rejection regions, the five components of equation (4) have to be estimated:

(1) $P(p_{ij} \le c_2 \mid p_i \le c_1)$ can be estimated using the proportion of rejections among the pairwise comparisons performed in Step 2. That is,

$$
\hat{P}(p_{ij} \le c_2 \mid p_i \le c_1) = \frac{\#\{p_{ij} : p_{ij} \le c_2,\ p_i \le c_1\}}{\#\{p_i : p_i \le c_1\} \cdot C},
\tag{5}
$$

where $C$ is the number of pairwise comparisons for each gene, $\#\{p_i : p_i \le c_1\}$ is the number of statistically significant genes (i.e., with p-values $\le c_1$) in Step 1, and $\#\{p_{ij} : p_{ij} \le c_2,\ p_i \le c_1\}$ is the number of significant pairwise comparisons (i.e., with p-values $\le c_2$) in Step 2.

(2) The FDR in Step 1, $\mathrm{FDR}_1$, can be estimated using the approach of Storey (2002):

$$
\widehat{\mathrm{FDR}}_1 = \frac{c_1\, \hat{\pi}_{01}}{\#\{p_i : p_i \le c_1\} / m},
\tag{6}
$$

where $m$ is the total number of genes, $\#\{p_i : p_i \le c_1\}$ is the number of p-values $\le c_1$ in Step 1, and $\hat{\pi}_{01}$ is the estimate of $\pi_{01}$, the proportion of true null hypotheses in Step 1 (i.e., the proportion of genes which in fact have no treatment effect among all $m$ genes). Details about estimating the proportion of true null hypotheses are not covered here; references are given in Section 1.
(3) $P(p_{ij} \le c_2 \mid D_i = 0,\ p_i \le c_1)$ is the probability of claiming a statistically significant pairwise comparison that is associated with a falsely rejected gene (tested in Step 1). A resampling technique can be employed to estimate this probability. The following procedure applies to cases where a global F-test from an ANOVA model, with constant variance and normal distribution assumptions, is employed to test for the treatment effect in Step 1. The concept is to generate a large data set under the true null hypothesis (i.e., all treatment means are the same for all genes) and then analyze these data in the same manner as the real data. The proportion of rejections in Step 2 (the ratio of the number of rejections to the total number of pairwise comparisons) is then computed. The specifics are as follows:

(i) Using the same sample size as the real data, generate a random sample from a standard normal distribution for a large number of genes (e.g., $M = 100{,}000$). Assume there are 3 treatment conditions and $n$ observations within each treatment condition, making the random sample of size $3nM$.

(ii) These data are then analyzed using the same analysis as used for the real data. The p-value ($p_i$) for testing the null hypothesis that the treatment means are equal, and the p-values ($p_{ij}$) for testing the pairwise comparisons, are computed for $i = 1, \ldots, M$. Let $\#\{p_i : p_i \le c_1\}$ be the number of p-values such that $p_i \le c_1$, and $\#\{p_{ij} : p_{ij} \le c_2,\ p_i \le c_1\}$ be the number of p-values such that $p_{ij} \le c_2$, where $i$ is chosen such that $p_i \le c_1$.

These quantities, obtained by resampling, provide an estimate of the probability of claiming a statistically significant pairwise comparison that is associated with a falsely rejected gene, namely

$$
\hat{P}(p_{ij} \le c_2 \mid D_i = 0,\ p_i \le c_1) = \frac{\#\{p_{ij} : p_{ij} \le c_2,\ p_i \le c_1\}}{\#\{p_i : p_i \le c_1\} \cdot C},
\tag{7}
$$

where the counts are taken over the simulated null data and $C$ is the number of pairwise comparisons for each gene.
In Section 6, we present an algorithm for situations when the experimental design is unbalanced and the data are not normally distributed. A permutation method is used to estimate the true null distribution of the test statistics.

(4) The estimate of $P(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 1,\ p_i \le c_1)$ is $c_2$ when the
probability $P(p_i \le c_1 \mid D_{ij} = 0,\ D_i = 1) = 1$. Notice that

$$
\begin{aligned}
& P(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 1,\ p_i \le c_1) - P(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 1) \\
&\quad = \frac{P(p_{ij} \le c_2,\ p_i \le c_1 \mid D_{ij} = 0,\ D_i = 1)}{P(p_i \le c_1 \mid D_{ij} = 0,\ D_i = 1)} - P(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 1) \\
&\quad \le \frac{P(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 1)}{P(p_i \le c_1 \mid D_{ij} = 0,\ D_i = 1)} - P(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 1) \\
&\quad = P(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 1)\, \frac{1 - P(p_i \le c_1 \mid D_{ij} = 0,\ D_i = 1)}{P(p_i \le c_1 \mid D_{ij} = 0,\ D_i = 1)}.
\end{aligned}
$$

Since the p-value $p_{ij}$ corresponding to $D_i = 1$ and $D_{ij} = 0$ is uniformly distributed on the interval $(0, 1)$, $P(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 1) = c_2$. Hence,

$$
P(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 1,\ p_i \le c_1) - c_2 \le \frac{1 - P(p_i \le c_1 \mid D_{ij} = 0,\ D_i = 1)}{P(p_i \le c_1 \mid D_{ij} = 0,\ D_i = 1)}\, c_2.
\tag{8}
$$

Therefore, when $P(p_i \le c_1 \mid D_{ij} = 0,\ D_i = 1) = 1$, $P(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 1,\ p_i \le c_1) = c_2$ holds. For an infinite sample size, the event $\{p_i \le c_1 \mid D_{ij} = 0,\ D_i = 1\}$ is deterministic regardless of the value that $c_1$ takes. For a finite sample size, $P(p_i \le c_1 \mid D_{ij} = 0,\ D_i = 1)$ can be very close to, or equal to, 1 for a reasonable value of $c_1$. For example, suppose there are three treatment conditions with an equal sample size $n$ under each of the three conditions. Suppose further that a gene has treatment means $(0, 0, 3)$. Using the noncentral F-distribution under the assumption of normality, $P(p_i \le 0.01 \mid D_{ij} = 0,\ D_i = 1)$ = […] when $n = 6$, […] when $n = 10$, and 1 when $n = 30$; $P(p_i \le […] \mid D_{ij} = 0,\ D_i = 1)$ = […] when $n = 6$, […] when $n = 10$, and 1 when $n = 30$. When $c_1$ is extremely small, $P(p_i \le c_1 \mid D_{ij} = 0,\ D_i = 1)$ can be much smaller than 1 for a finite sample size. Using equation (8), the following method can be employed to provide an overestimate of $P(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 1,\ p_i \le c_1)$:

$$
\hat{P}(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 1,\ p_i \le c_1) = c_2 + \frac{1 - \hat{P}(p_i \le c_1 \mid D_{ij} = 0,\ D_i = 1)}{\hat{P}(p_i \le c_1 \mid D_{ij} = 0,\ D_i = 1)}\, c_2.
\tag{9}
$$
Let $E$ be the set of genes which enter the second step of the two-step procedure, but do not have all pairwise comparisons statistically significant, i.e., $E = \{\text{gene } g : p_g \le c_1, \text{ at least one } j \text{ such that } p_{gj} > c_2\}$. Since the true means are unknown, they have to be estimated. For gene $g \in E$, let $\bar{x}_{gj}$ denote the sample mean for gene $g$ under treatment condition $j$, and let $[j]$ denote the treatment whose sample mean has the $j$th smallest magnitude (absolute value), so that $[J]$ is the treatment with the largest magnitude. For example, if the three treatment means for gene $g$ satisfy $|\bar{x}_{g3}| < |\bar{x}_{g1}| < |\bar{x}_{g2}|$, then $[1] = 3$, $[2] = 1$ and $[3] = 2$. For gene $g \in E$, define the pseudo means under the $J$ treatment conditions as follows: $\tilde{\mu}_{g[1]} = \cdots = \tilde{\mu}_{g[J-1]} = 0$ and $\tilde{\mu}_{g[J]} = \tilde{\mu}_g$, where $\tilde{\mu}_g = \max\{|\bar{x}_{gi} - \bar{x}_{gj}|,\ i, j = 1, \ldots, J,\ i \ne j\}$. It becomes necessary to compute the probability that a gene with these pseudo means will have a p-value for testing the equality of means below $c_1$. Under the assumption of normality, the global F-test statistic for testing the equality of the means has a non-central F-distribution with non-centrality parameter

$$
\mathrm{ncp}_g = \left( \sum_{j=1}^{J-1} n_{[j]} \left(0 - \tilde{\mu}_g / J\right)^2 + n_{[J]} \left(\tilde{\mu}_g - \tilde{\mu}_g / J\right)^2 \right) \Big/ \hat{\sigma}_g^2,
$$

where $n_j$ is the sample size under treatment $j$ and $\hat{\sigma}_g^2$ is the estimate of the variance for gene $g$. Then

$$
P(p_g \le c_1 \mid D_{gj} = 0,\ D_g = 1) = P\!\left(f_{J-1,\,N-J,\,\mathrm{ncp}_g} \ge F^{-1}_{J-1,\,N-J}(1 - c_1)\right),
$$

where $N = \sum_j n_j$, $f_{J-1,\,N-J,\,\mathrm{ncp}_g}$ is a random variable with a non-central F-distribution with degrees of freedom $J - 1$ and $N - J$ and non-centrality parameter $\mathrm{ncp}_g$, and $F^{-1}_{J-1,\,N-J}(1 - c_1)$ is the $(1 - c_1) \times 100$th percentile of an F-distribution with degrees of freedom $J - 1$ and $N - J$. Thus,

$$
\hat{P}(p_i \le c_1 \mid D_{ij} = 0,\ D_i = 1) = \text{average of } P(p_g \le c_1 \mid D_{gj} = 0,\ D_g = 1),
\tag{10}
$$

where $g \in E$. When the assumption of normality does not hold, a permutation method is presented (in Section 6) to estimate this probability.
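The non-central F calculation used in component (4) can be illustrated for the example of three treatments with true means $(0, 0, 3)$. The sketch below is an illustration rather than the authors' code: the function name `prob_step1_rejection` is hypothetical, a balanced design is assumed, and a unit standard deviation is assumed because the text does not state the variance used in that example. It relies on `scipy.stats.f` and `scipy.stats.ncf`.

```python
from scipy import stats


def prob_step1_rejection(means, n_per_group, sigma, c1):
    """P(global F-test p-value <= c1) for given true means (balanced design, normal errors)."""
    J = len(means)
    N = J * n_per_group
    grand_mean = sum(means) / J
    # Non-centrality parameter: sum over treatments of n * (mean - grand mean)^2 / sigma^2.
    ncp = n_per_group * sum((mu - grand_mean) ** 2 for mu in means) / sigma ** 2
    # Critical value: the (1 - c1) quantile of the central F(J - 1, N - J) distribution.
    f_crit = stats.f.ppf(1.0 - c1, J - 1, N - J)
    # Probability that a non-central F variable exceeds the critical value.
    return float(stats.ncf.sf(f_crit, J - 1, N - J, ncp))


# Assumed setting: means (0, 0, 3), sigma = 1 (assumption), c1 = 0.01.
p6, p10, p30 = (
    prob_step1_rejection((0.0, 0.0, 3.0), n, sigma=1.0, c1=0.01) for n in (6, 10, 30)
)
for n, p in zip((6, 10, 30), (p6, p10, p30)):
    print(f"n = {n:2d}: P(p_i <= 0.01) = {p:.4f}")
```

For the pseudo means of equation (10), the same function could be called once per gene in $E$ with means $(0, \ldots, 0, \tilde{\mu}_g)$ and $\hat{\sigma}_g$, and the results averaged.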
(5) The last component of equation (4), $P(D_{ij} = 0 \mid D_i = 1,\ p_i \le c_1)$, can be estimated using the proportion of non-significant pairwise comparisons among all pairwise comparisons associated with correctly rejected genes in
Step 1. However, it is impossible to separate the correctly rejected genes from the falsely rejected genes, hence an overestimate is pursued. Define $\hat{\pi}_{02}$ as the estimate of the proportion of "true" null hypotheses given the distribution of the p-values in Step 2. We emphasize "true" here because $\hat{\pi}_{02}$ is computed from the distribution of p-values in Step 2 using the same methods as those used to estimate $\pi_{01}$, and it is not exactly $P(D_{ij} = 0 \mid p_i \le c_1)$. Let $K$ denote the number of genes in Step 2; then $C \times K \times \hat{\pi}_{02}$ estimates the number of true null hypotheses based on the p-values, and $C \times K \times (1 - \widehat{\mathrm{FDR}}_1)$ is the estimated number of pairwise comparisons generated by correctly rejected genes. Since the p-value ($p_{ij}$) corresponding to $D_i = 1$ and $D_{ij} = 0$ is approximately uniformly distributed, and the estimate $C \times K \times \hat{\pi}_{02}$ also includes some true null hypotheses corresponding to $D_i = 0$ and $D_{ij} = 0$, the number of true null hypotheses ($D_{ij} = 0$) corresponding to $D_i = 1$ is less than or equal to $C \times K \times \hat{\pi}_{02}$. Therefore,

$$
\hat{P}(D_{ij} = 0 \mid D_i = 1,\ p_i \le c_1) = \frac{C \times K \times \hat{\pi}_{02}}{C \times K \times (1 - \widehat{\mathrm{FDR}}_1)} = \frac{\hat{\pi}_{02}}{1 - \widehat{\mathrm{FDR}}_1}.
$$

Using equations (5)-(9), along with the estimates of the proportions of true null hypotheses ($\hat{\pi}_{01}$ and $\hat{\pi}_{02}$) in Step 1 and Step 2, the FDR (equation 4) of the two-step multiple comparison procedure can be estimated by

$$
\widehat{\mathrm{FDR}} = \frac{\hat{P}(p_{ij} \le c_2 \mid D_i = 0,\ p_i \le c_1)}{\hat{P}(p_{ij} \le c_2 \mid p_i \le c_1)}\, \widehat{\mathrm{FDR}}_1
+ \frac{\hat{P}(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 1,\ p_i \le c_1)}{\hat{P}(p_{ij} \le c_2 \mid p_i \le c_1)}\, \hat{\pi}_{02}.
\tag{11}
$$

3.3 Simulation study and results

A simulation study is employed to illustrate the accuracy of the proposed method for estimating the FDR of the two-step multiple comparison procedure. Assume there are 3 treatments and $m = 1000$ genes. Allow a proportion ($R_1$) of the genes to have a treatment effect.
For any gene having a treatment effect, there are two cases: it is differentially expressed across all three treatments; or it is not differentially expressed between two treatments, but differentially expressed under the third treatment. Among the genes which have a treatment effect, assume a proportion ($R_2$) of them are not differentially expressed between two treatments, but differentially expressed under the third treatment.
That is, $R_1 \times m$ genes have treatment effects; $R_1 \times R_2 \times m$ genes have treatment means $(\mu_a, \mu_0, \mu_0)$ or $(\mu_0, \mu_a, \mu_0)$ or $(\mu_0, \mu_0, \mu_a)$, where $\mu_0$ and $\mu_a$ are different; and $R_1 \times (1 - R_2) \times m$ genes have treatment means $(\mu_1, \mu_2, \mu_3)$, where $\mu_1$, $\mu_2$, and $\mu_3$ are different. In this simulation, half of the $R_1 \times R_2 \times m$ genes are chosen to have means $(2, 0, 0)$ and the other half have means $(4, 0, 0)$; the $R_1 \times (1 - R_2) \times m$ genes have means $(4, 2, 0)$. For the $(1 - R_1) \times m$ genes not having a treatment effect, the mean vector is $(0, 0, 0)$. The values for $R_1$ are 0.10, 0.20, 0.30, 0.40, and 0.50, and the values for $R_2$ are 0.0, 0.20, 0.40, 0.60, 0.80 and 1. Large values of $R_1$ are not used in this simulation because the proportion of significant genes in most microarray studies is relatively small.

Assume for each gene that there are $n = 6$ observations under each of the treatments. For each combination of $R_1$ and $R_2$, 1000 data sets (each of size 1000 genes × 6 replicates × 3 treatments) are generated from normal distributions with standard deviation 1. For each simulated data set, 1000 global F-test statistics, one for each of the $m = 1000$ genes, are computed for testing equality of the three treatment means. If a gene has a p-value smaller than or equal to a pre-specified level $c_1$, then it is considered as having a significant treatment effect, and thus enters the second step. In the second step, for the genes with statistically significant treatment effects from Step 1, pairwise comparisons are performed using t-tests. Pairwise comparisons with a p-value less than or equal to a pre-specified level $c_2$ are considered statistically significant. Various values of $c_1$ and $c_2$ are used in the simulation.
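One replicate of this simulation design can be sketched as follows. This is an illustrative sketch, not the authors' code: the seed, the variable names, the chosen configuration ($R_1 = 0.20$, $R_2 = 0.40$, $c_1 = 0.10$, $c_2 = 0.05$), and the use of the equal-variance two-sample t-test for the pairwise comparisons are assumptions, since the text only says that t-tests are used.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)           # fixed seed, chosen only for reproducibility
m, n, J = 1000, 6, 3                     # genes, replicates per treatment, treatments
R1, R2 = 0.20, 0.40                      # one of the configurations listed above
c1, c2 = 0.10, 0.05                      # fixed rejection regions [0, c1] and [0, c2]

# True treatment means for every gene, following the configuration in the text.
n_effect = int(round(R1 * m))            # genes with a treatment effect
n_single = int(round(R1 * R2 * m))       # of these, genes with exactly one mean different
means = np.zeros((m, J))
means[: n_single // 2] = (2, 0, 0)
means[n_single // 2 : n_single] = (4, 0, 0)
means[n_single:n_effect] = (4, 2, 0)

# data[i, j, k]: observation k for gene i under treatment j, standard deviation 1.
data = rng.normal(loc=means[:, :, None], scale=1.0, size=(m, J, n))

# Step 1: global F-test of equal treatment means, computed gene by gene.
group_means = data.mean(axis=2)
grand_mean = group_means.mean(axis=1, keepdims=True)
ss_between = n * ((group_means - grand_mean) ** 2).sum(axis=1)
ss_within = ((data - group_means[:, :, None]) ** 2).sum(axis=(1, 2))
f_stat = (ss_between / (J - 1)) / (ss_within / (J * n - J))
p_gene = stats.f.sf(f_stat, J - 1, J * n - J)

# Step 2: pairwise two-sample t-tests (equal-variance form is an assumption here).
pairs = [(0, 1), (0, 2), (1, 2)]
p_pair = np.column_stack(
    [stats.ttest_ind(data[:, a, :], data[:, b, :], axis=1).pvalue for a, b in pairs]
)
truly_different = np.column_stack([means[:, a] != means[:, b] for a, b in pairs])

# A pairwise null hypothesis is rejected only if both criteria are met.
rejected = (p_gene[:, None] <= c1) & (p_pair <= c2)
false_rejections = rejected & ~truly_different
true_fdr = false_rejections.sum() / max(rejected.sum(), 1)
print(f"rejections: {rejected.sum()}, true FDR in this data set: {true_fdr:.3f}")
```

Averaging `true_fdr` over many such data sets gives the kind of "true FDR" summary described in the text; the p-values are computed for all genes here only for convenience, and the Step 1 criterion is applied when forming `rejected`.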
For each simulated data set, $\hat{\pi}_{01}$ and $\hat{\pi}_{02}$, the estimates of the proportion of true null hypotheses in Step 1 and Step 2, are computed using Storey and Tibshirani's smoother estimate (Storey and Tibshirani, 2003), and the FDR is estimated using equation (11). The average of the estimated FDR from 1000 simulations for $(c_1, c_2) = (0.10, 0.05)$, $(0.10, 0.01)$ and $(0.05, 0.01)$ is presented in Table 1. The average of the true FDR from the 1000 simulations is also presented. For the estimated FDR presented in Table 1, $P(p_{ij} \le c_2 \mid D_{ij} = 0,\ D_i = 1,\ p_i \le c_1)$ is estimated using $c_2$ instead of equation (9). The estimated FDR is very close to the true FDR when $c_1$ is not too small, which indicates that $P(p_i \le c_1 \mid D_{ij} = 0,\ D_i = 1)$ is close to 1.

As seen in Table 1, the proposed method yields accurate estimates of the overall FDR. As one would expect, the overall FDR for any two-step procedure depends on the configuration of $R_1$ and $R_2$. For our two-step approach with $c_1 = 0.10$ and $c_2 = 0.05$, when $R_1 = 0.10$ and $R_2 = 1.0$ the FDR can be as big as 0.39, yet when $R_1 = 0.50$ and $R_2 = 0.0$ the FDR can be as small as […]. For the same value of $R_1$ and the same rejection regions $[0, c_1]$ in Step 1 and $[0, c_2]$ in Step 2, the FDR increases as $R_2$ increases. On the other hand, for the same value of $R_2$ and the same rejection regions $[0, c_1]$ in Step 1 and $[0, c_2]$ in Step
2, the FDR decreases as $R_1$ (the proportion of genes having a treatment effect) increases.

4 Estimating FDR for fixed FDR significance levels

The two-step multiple comparison procedure can also be applied using fixed FDR significance levels in Step 1 and Step 2, respectively. For instance, an FDR controlling procedure at FDR significance level $\alpha_1$ ($\alpha_1$ is known and fixed) is applied to the p-values in Step 1, and statistically significant genes are identified. Let $A$ denote the collection of statistically significant genes. Define $d_1$ to be the smallest p-value in Step 1 which is not statistically significant, i.e., $d_1 = \min\{p_i,\ i \in A^c\}$, where $A^c$ is the complement of $A$. In Step 2, pairwise comparisons associated with the statistically significant genes (i.e., genes in set $A$) are tested using an FDR controlling procedure at FDR significance level $\alpha_2$ ($\alpha_2$ is known and fixed), and statistically significant effects are identified. Let $d_2$ be the smallest p-value among the pairwise comparisons in Step 2 which are not statistically significant. Since the goal is to compute the overall FDR, this can be achieved by replacing $c_1$ and $c_2$ with $d_1$ and $d_2$, respectively, when using the method for estimating the FDR for fixed rejection regions (equation 11). That is, assuming $d_1$ and $d_2$ are known,

$$
\widehat{\mathrm{FDR}}(\alpha_1, \alpha_2) = \frac{\hat{P}(p_{ij} \le d_2 \mid D_i = 0,\ p_i \le d_1)}{\hat{P}(p_{ij} \le d_2 \mid p_i \le d_1)}\, \widehat{\mathrm{FDR}}_1
+ \frac{\hat{P}(p_{ij} \le d_2 \mid D_{ij} = 0,\ D_i = 1,\ p_i \le d_1)}{\hat{P}(p_{ij} \le d_2 \mid p_i \le d_1)}\, \hat{\pi}_{02}.
\tag{12}
$$

It is worth noting that for this approach, $d_1$ is determined by the p-values in Step 1, $\alpha_1$, and the FDR controlling procedure applied in Step 1; and $d_2$ is determined by the p-values in both steps, $\alpha_1$, $\alpha_2$, and the FDR controlling procedures applied in Step 1 and Step 2, respectively.
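The thresholds $d_1$ and $d_2$ can be obtained as sketched below. This is a minimal sketch under stated assumptions: the Benjamini and Hochberg (1995) step-up procedure is used as the FDR controlling procedure in both steps (the text allows any FDR controlling procedure), and the helper names `bh_reject`, `smallest_nonrejected`, and `two_step_thresholds`, as well as the p-values, are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def bh_reject(pvalues: Sequence[float], alpha: float) -> List[bool]:
    """Benjamini-Hochberg step-up procedure: True where the hypothesis is rejected."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Largest rank k (1-based) with p_(k) <= k * alpha / m; 0 if there is none.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            k_max = rank
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


def smallest_nonrejected(pvalues: Sequence[float], reject: Sequence[bool]) -> Optional[float]:
    """Smallest p-value that is not statistically significant (None if all are rejected)."""
    kept = [p for p, r in zip(pvalues, reject) if not r]
    return min(kept) if kept else None


def two_step_thresholds(
    gene_p: Sequence[float],
    pair_p: Sequence[Sequence[float]],
    alpha1: float,
    alpha2: float,
) -> Tuple[Optional[float], Optional[float]]:
    """Return (d1, d2) as defined in the text, using BH in both steps."""
    step1 = bh_reject(gene_p, alpha1)
    d1 = smallest_nonrejected(gene_p, step1)
    # Step 2 family: all pairwise comparisons of the genes in A.
    family = [p for i, row in enumerate(pair_p) if step1[i] for p in row]
    if not family:
        return d1, None
    d2 = smallest_nonrejected(family, bh_reject(family, alpha2))
    return d1, d2


# Hypothetical p-values for 4 genes and 3 pairwise comparisons per gene.
gene_p = [0.001, 0.20, 0.04, 0.60]
pair_p = [
    [0.002, 0.300, 0.004],
    [0.001, 0.500, 0.700],
    [0.030, 0.008, 0.900],
    [0.400, 0.600, 0.800],
]
d1, d2 = two_step_thresholds(gene_p, pair_p, alpha1=0.10, alpha2=0.05)
print(d1, d2)   # 0.2 0.3
```

The resulting `d1` and `d2` would then play the roles of $c_1$ and $c_2$ in the fixed-rejection-region estimate.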
Table 1: Simulation results. Estimated FDR ($\widehat{\mathrm{FDR}}$) and true FDR of pairwise comparisons for 3 treatments and 1000 genes as applied to the two-step multiple comparison procedure using fixed rejection regions $c_1$ and $c_2$ in Steps 1 and 2, respectively. $R_1$: the proportion of genes having a treatment effect; $R_2$: the proportion of genes with a treatment effect having one treatment mean different and the other two the same.
[Table body not recovered. Layout: rows indexed by $R_1$, columns by $R_2$; for each of the three $(c_1, c_2)$ settings the table lists $\widehat{\mathrm{FDR}}$ and the true FDR.]

5 Controlling the FDR at a desired significance level

Instead of estimating the FDR for a fixed rejection region, traditional multiple comparison procedures (Hochberg and Tamhane, 1987; Hsu, 1996) reject the null hypotheses at a pre-chosen significance level. If the desired FDR significance level of the two-step multiple comparison procedure is $\alpha$, then the problem becomes choosing the FDR significance levels $\alpha_1$ and $\alpha_2$ in Step 1 and Step 2, respectively, so that the overall FDR is controlled at $\alpha$.

5.1 An approximate upper bound for FDR

The resampling procedure required for estimating the FDR (equation 11) can be a disadvantage: when the experimental design is complicated, it may be difficult to generate data under the null hypothesis. Fortunately, an upper bound for $\frac{P(p_{ij} \le c_2 \mid D_i = 0, p_i \le c_1)}{P(p_{ij} \le c_2 \mid p_i \le c_1)}$ is available, so estimating $P(p_{ij} \le c_2 \mid D_i = 0, p_i \le c_1)$ via simulation can be avoided.

Theorem 2. In the two-step multiple comparison procedure,

$$\frac{P(p_{ij} \le c_2 \mid D_i = 0, p_i \le c_1)}{P(p_{ij} \le c_2 \mid p_i \le c_1)} \le 1.$$

Proof.

$$\begin{aligned}
\frac{P(p_{ij} \le c_2 \mid D_i = 0, p_i \le c_1)}{P(p_{ij} \le c_2 \mid p_i \le c_1)}
&= \frac{P(p_{ij} \le c_2 \mid D_{ij} = 0, D_i = 0, p_i \le c_1)}{P(p_{ij} \le c_2 \mid p_i \le c_1)} \\
&= \frac{P(p_{ij} \le c_2, D_{ij} = 0, D_i = 0 \mid p_i \le c_1) \,/\, P(D_{ij} = 0, D_i = 0 \mid p_i \le c_1)}{P(p_{ij} \le c_2 \mid p_i \le c_1)} \\
&= \frac{P(p_{ij} \le c_2, D_{ij} = 0, D_i = 0 \mid p_i \le c_1)}{P(p_{ij} \le c_2 \mid p_i \le c_1)\, P(D_{ij} = 0, D_i = 0 \mid p_i \le c_1)} \\
&= \frac{P(D_{ij} = 0, D_i = 0 \mid p_i \le c_1, p_{ij} \le c_2)}{P(D_{ij} = 0, D_i = 0 \mid p_i \le c_1)} \le 1. \qquad (13)
\end{aligned}$$

The first equality uses the fact that $D_i = 0$ (no treatment effect for gene $i$) implies $D_{ij} = 0$. When $c_2 \le 1$ this inequality (equation 13) holds for two specific reasons. First, the probability of a false rejection in Step 1 (rejecting the null hypothesis $H_0^{i}$ when it is true) depends only on the p-values $p_i$ and $c_1$. Second, with a constraint in Step 2 ($p_{ij} \le c_2$ and $c_2 < 1$), the chance of making a false rejection (rejecting the null hypothesis $H_0^{ij}$ when it is true) will be smaller than for a procedure in which no constraint is applied.
When $P(p_i \le c_1 \mid D_i = 1, D_{ij} = 0 \text{ for some } j) = 1$, the FDR (equation 4) is

$$\mathrm{FDR} = \mathrm{FDR}_1\, \frac{P(p_{ij} \le c_2 \mid D_i = 0, p_i \le c_1)}{P(p_{ij} \le c_2 \mid p_i \le c_1)} + (1 - \mathrm{FDR}_1)\, \frac{c_2\, P(D_{ij} = 0 \mid D_i = 1, p_i \le c_1)}{P(p_{ij} \le c_2 \mid p_i \le c_1)}.$$

Define $\pi_{02}$, $\mathrm{FDR}_2$ and $\mathrm{pFDR}_2$ to be the proportion of true null hypotheses, the FDR and the pFDR in Step 2 based on the empirical distribution of the p-values. Then

$$\mathrm{FDR}_2 = \mathrm{pFDR}_2 = \frac{c_2\, \pi_{02}}{P(p_{ij} \le c_2 \mid p_i \le c_1)}.$$

Notice that

$$\frac{c_2\, P(D_{ij} = 0 \mid D_i = 1, p_i \le c_1)}{P(p_{ij} \le c_2 \mid p_i \le c_1)} \le \frac{c_2\, \pi_{02} / (1 - \mathrm{FDR}_1)}{P(p_{ij} \le c_2 \mid p_i \le c_1)} = \frac{\mathrm{FDR}_2}{1 - \mathrm{FDR}_1},$$

thus, combined with Theorem 2, an upper bound for the overall FDR (equation 4) is

$$\mathrm{FDR} \le \mathrm{FDR}_1 + \mathrm{FDR}_2. \qquad (14)$$

Therefore, the overall FDR can be controlled below level $\alpha$ as long as the FDR significance levels $\alpha_1$ and $\alpha_2$ used in Step 1 and Step 2, respectively, satisfy $\alpha_1 + \alpha_2 \le \alpha$. However, when $P(p_i \le c_1 \mid D_i = 1, D_{ij} = 0 \text{ for some } j)$ is far less than 1, the realized FDR may exceed $\mathrm{FDR}_1 + \mathrm{FDR}_2$. One strategy is to put more weight of the overall FDR on $\mathrm{FDR}_1$ so that $P(p_i \le c_1 \mid D_i = 1, D_{ij} = 0 \text{ for some } j)$ is closer to 1, and at the same time more genes can be included in the analysis in Step 2. Next, we investigate the performance of the two-step procedure with fixed FDR significance levels in Step 1 and Step 2, and propose a method to choose FDR significance levels in the two steps so that the overall FDR can be controlled below a pre-chosen overall FDR significance level.

5.2 Fixing the FDR significance levels

A simulation study is employed to illustrate the improved power of the two-step multiple comparison procedure over the one-step procedure. The simulation scenario is the same as in Section 3.3. There are 3 treatment conditions, a sample size of $n = 6$ within each treatment condition, and $m = 1000$ genes. For
each combination of $R_1$ and $R_2$, 1000 data sets are generated from standard normal distributions, and there are $3nm$ data points within each data set. The FDR controlling procedure is then applied to the corresponding 1000 genes at an FDR significance level $\alpha_1$. In the second step, for the genes with significant treatment effects from Step 1, pairwise comparisons are performed with the FDR controlling procedure at FDR significance level $\alpha_2$. The FDR significance levels used in the first and second steps are $(\alpha_1, \alpha_2) = (0.04, 0.01)$ and $(0.03, 0.02)$, and the estimated FDR, the true FDR and the average power are listed in Tables 2 and 3. Here, the average power is defined to be the expected proportion of correct rejections among the true alternative hypotheses. For the purpose of comparing the results with the one-step FDR controlling procedure, the estimated FDR, the true FDR, and the average power for the one-step procedure are also listed in Table 2. For the one-step procedure, an FDR controlling procedure is applied to the family of $3m$ pairwise comparisons. Specifically, Benjamini and Hochberg's adaptive FDR controlling procedure (Benjamini and Hochberg, 2000), incorporating the estimate of the proportion of null hypotheses given by Storey and Tibshirani's smoother estimate (Storey and Tibshirani, 2003), is employed. When the proportion of genes having a treatment effect ($R_1$) is small, the two-step multiple comparison procedure is more powerful than the one-step multiple comparison procedure because of the reduced number of tests in Step 2. For example, in this simulation, when $R_1 = 0.2$ and $R_2 = 0.2$, the one-step procedure has 80% power, while the two-step procedure has approximately 96% power.
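The true FDR and average power reported for these simulations are averages of per-data-set quantities. The following is a minimal sketch of how the realized false discovery proportion and power could be computed for one simulated data set, assuming that the truth of every pairwise hypothesis is known and that comparisons belonging to genes screened out in Step 1 are simply counted as non-rejections. The function name and argument layout are hypothetical.

```python
from typing import Sequence, Tuple


def fdp_and_power(is_null: Sequence[bool],
                  rejected: Sequence[bool]) -> Tuple[float, float]:
    """Realized false discovery proportion and power for one data set.

    is_null[k]  -- True if pairwise hypothesis k is a true null
    rejected[k] -- True if pairwise hypothesis k was rejected
    """
    false_rejections = sum(1 for n, r in zip(is_null, rejected) if n and r)
    true_rejections = sum(1 for n, r in zip(is_null, rejected) if (not n) and r)
    n_rejections = false_rejections + true_rejections
    n_alternatives = sum(1 for n in is_null if not n)
    # By convention the proportion is taken to be 0 when nothing is rejected.
    fdp = false_rejections / n_rejections if n_rejections else 0.0
    power = true_rejections / n_alternatives if n_alternatives else 0.0
    return fdp, power


def average(values: Sequence[float]) -> float:
    """Plain mean, used to average per-data-set results over simulations."""
    return sum(values) / len(values)
```

Averaging the first component over the simulated data sets approximates the true FDR, and averaging the second approximates the average power as defined above.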
As observed from the simulations, when $R_2$ (the proportion of significant genes for which one treatment effect is different but the other two are the same) increases, the power of the two-step procedure decreases. This is due to the fact that when $R_2$ increases, fewer genes are included in Step 2. In this simulation, the power for $\alpha_1 = 0.04$, $\alpha_2 = 0.01$ is slightly larger than that for $\alpha_1 = 0.03$, $\alpha_2 = 0.02$ when $R_1$ is small. Furthermore, when $\alpha_1 = 0.04$ more genes are included in Step 2. Simulations have been performed for values of the FDR level $\alpha$ ranging over $0.01, 0.02, \ldots, 0.2$. The FDR controlling procedure incorporating the estimate of the proportion of true null hypotheses is applied in both steps of the two-step procedure, and the Step 1 and Step 2 FDR levels are set to $\alpha_1 = \frac{4}{5}\alpha$ and $\alpha_2 = \frac{1}{5}\alpha$. These simulations (Figure 1) demonstrate that the overall FDR is controlled at FDR level $\alpha$ for all values of $\alpha$. Based on this work and experience, our ad hoc suggestion is to use $\alpha_1 = \frac{4}{5}\alpha$ and $\alpha_2 = \frac{1}{5}\alpha$ if the overall FDR is required to be controlled at FDR level $\alpha$.
Table 2: Simulation results. Estimated FDR ($\widehat{\mathrm{FDR}}$), true FDR, and average power for pairwise comparisons for 3 treatment conditions and 1000 genes using both the two-step and one-step procedure, respectively. For the two-step procedure, the FDR significance levels $\alpha_1 = 0.04$ and $\alpha_2 = 0.01$ are used in Step 1 and Step 2, respectively. For the one-step procedure, the FDR significance level is [value missing in source].
[Table body not recovered. Layout: rows indexed by $R_1$, columns by $R_2$; the two-step entries are $\widehat{\mathrm{FDR}}$, true FDR and power; the one-step entries are true FDR and power.]
Table 3: Simulation results. Estimated FDR ($\widehat{\mathrm{FDR}}$), true FDR, and average power for pairwise comparisons for 3 treatment conditions and 1000 genes using the two-step procedure at the FDR significance levels $\alpha_1 = 0.03$ and $\alpha_2 = 0.02$ in Step 1 and Step 2, respectively.
[Table body not recovered. Layout: rows indexed by $R_1$, columns by $R_2$; entries are $\widehat{\mathrm{FDR}}$, true FDR and power.]
[Figure 1 image not recovered. Vertical axis: true FDR; horizontal axis: $\alpha$.]
Figure 1: Simulation results of the FDR for the two-step multiple comparison procedure using $\alpha_1 = \frac{4}{5}\alpha$ and $\alpha_2 = \frac{1}{5}\alpha$ for different levels of $\alpha$. In total there are $m = 1000$ genes, 3 treatment conditions, and four different combinations of $R_1$ and $R_2$: $R_1 = 0.2$, $R_2 = 0.2$ (short dashed line); $R_1 = 0.2$, $R_2 = 0.8$ (dotted line); $R_1 = 0.4$, $R_2 = 0.2$ (dotted-dashed line); and $R_1 = 0.4$, $R_2 = 0.8$ (long dashed line). The black straight line represents the pre-chosen FDR level. Here $R_1$ is the proportion of genes having a treatment effect; $R_2$ is the proportion of genes with a treatment effect having one treatment mean different but the other two the same.
5.3 Choosing the FDR significance levels

Here we propose an adaptive approach for choosing $\alpha_1$ and $\alpha_2$, and suggest some guidelines for selecting them. First, $\alpha_1$ should be larger than $\alpha_2$. When a looser criterion is used in Step 1, more genes are available to enter the second step. Second, $\alpha_1$ and $\alpha_2$ should be chosen such that the overall FDR is close to but below the pre-specified significance level, so that the power for detecting a significant effect is maximized. Third, the choice of $\alpha_1$ and $\alpha_2$ should lead to the largest number of rejections occurring in Step 2. With these guidelines in mind, we propose the following procedure for finding the significance levels $\alpha_1$ and $\alpha_2$. Let $S$ be the set of values $i\alpha/n$, where $i = 1, \ldots, n-1$ and $n$ is a positive integer. That is, $S = \{\alpha/n, 2\alpha/n, \ldots, (n-1)\alpha/n\}$. Let $\widehat{\mathrm{FDR}}(\alpha_1, \alpha_2)$ be the estimated overall FDR and $R(\alpha_1, \alpha_2)$ the number of rejections (or statistically significant pairwise comparisons) in Step 2 when a two-step procedure with significance levels $\alpha_1$ and $\alpha_2$ in Step 1 and Step 2, respectively, is applied. Then $\alpha_1^*$ and $\alpha_2^*$ are chosen such that

$$(\alpha_1^*, \alpha_2^*) = \arg_{\alpha_1, \alpha_2} \Big\{ \max_{\alpha_1, \alpha_2 \in S,\ \alpha_1 > \alpha_2,\ \alpha_1 + \alpha_2 \le \alpha,\ \widehat{\mathrm{FDR}}(\alpha_1, \alpha_2) \le \alpha} R(\alpha_1, \alpha_2) \Big\}. \qquad (15)$$

Using the same simulation as in Section 3.3, for each of the 1000 data sets, we apply our guidelines to find $\alpha_1^*$ and $\alpha_2^*$. Suppose the overall FDR significance level is $\alpha = 0.05$ and $S = \{\alpha/5, 2\alpha/5, 3\alpha/5, 4\alpha/5\}$; then $\alpha_1^*$ and $\alpha_2^*$ can be chosen from $(\alpha_1, \alpha_2) = (0.02, 0.01)$, $(0.03, 0.01)$, $(0.03, 0.02)$, and $(0.04, 0.01)$. Table 4 gives the frequency distribution of $\alpha_1^*$ and $\alpha_2^*$ based on these 1000 simulations. As can be seen, when $R_1 = 0.20$ and $R_2 = 0.60$, the choice of $(\alpha_1^*, \alpha_2^*)$ is $(0.03, 0.01)$ for 12 simulated data sets, $(0.03, 0.02)$ for 877 simulated data sets, and $(0.04, 0.01)$ for 111 simulated data sets.
The chosen significance levels in the two-step method are more diverse when $R_1$ is small, and they converge to $(\alpha_1^*, \alpha_2^*) = (0.03, 0.02)$ as $R_1$ gets larger. Evidently, the case where $R_2 = 0.0$ (genes that have a treatment effect where all means are different from each other) yields random results. This is most likely due to the fact that almost all pairwise comparisons in Step 2 are significant. Given the choices of $\alpha_1^*$ and $\alpha_2^*$ (Table 4), the average FDR is controlled below $\alpha = 0.05$ (Table 5), and the two-step procedure has more power than the one-step procedure (Table 2). For these results, $\alpha_1^*$ and $\alpha_2^*$ take values from $S = \{\alpha/5, 2\alpha/5, 3\alpha/5, 4\alpha/5\}$. However, for more accurate results, we suggest $S = \{\alpha/20, 2\alpha/20, \ldots, 19\alpha/20\}$.
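The selection rule in equation (15) amounts to a small grid search. The sketch below is only an illustration under stated assumptions: `estimate_fdr` and `count_rejections` are hypothetical callables standing in for the estimated overall FDR and the number of Step 2 rejections of a two-step procedure run at $(\alpha_1, \alpha_2)$, and ties in the number of rejections are broken in favour of the first candidate encountered, which the text does not specify.

```python
from typing import Callable, List, Optional, Tuple


def candidate_levels(alpha: float, n: int) -> List[Tuple[float, float]]:
    """Pairs (alpha1, alpha2) from S = {alpha/n, ..., (n-1)alpha/n}
    with alpha1 > alpha2 and alpha1 + alpha2 <= alpha."""
    pairs = []
    for i in range(1, n):          # alpha1 = i * alpha / n
        for k in range(1, i):      # k < i  is equivalent to  alpha1 > alpha2
            if i + k <= n:         # integer check of alpha1 + alpha2 <= alpha
                pairs.append((i * alpha / n, k * alpha / n))
    return pairs


def choose_levels(
    alpha: float,
    n: int,
    estimate_fdr: Callable[[float, float], float],     # hypothetical: estimated overall FDR
    count_rejections: Callable[[float, float], int],   # hypothetical: rejections in Step 2
) -> Optional[Tuple[float, float]]:
    """Candidate pair maximizing Step 2 rejections subject to estimated FDR <= alpha."""
    best: Optional[Tuple[float, float]] = None
    best_count = -1
    for a1, a2 in candidate_levels(alpha, n):
        if estimate_fdr(a1, a2) > alpha:
            continue  # violates the overall FDR constraint
        r = count_rejections(a1, a2)
        if r > best_count:  # strict comparison keeps the first of any tied candidates
            best, best_count = (a1, a2), r
    return best
```

With `alpha = 0.05` and `n = 5`, `candidate_levels` yields the four candidate pairs listed in the text; `choose_levels` returns `None` when no candidate satisfies the FDR constraint.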
Table 4: Frequency distribution of $\alpha_1^*$ and $\alpha_2^*$ from 1000 simulations for pairwise comparisons for 3 treatment conditions and 1000 genes. Here $\alpha_1^*$ and $\alpha_2^*$ are determined using the stated guidelines, and by controlling the overall FDR for the two-step procedure below $\alpha$.
[Table body not recovered. Layout: rows indexed by $R_1$ and $R_2$; columns indexed by the chosen pair $(\alpha_1^*, \alpha_2^*)$.]

Table 5: Simulation results. Estimated FDR ($\widehat{\mathrm{FDR}}$), true FDR, and power for pairwise comparisons for 3 treatment conditions and 1000 genes using the two-step procedure. The FDR for the entire procedure is controlled below 0.05 with significance levels $\alpha_1$ and $\alpha_2$ chosen automatically (results are listed in Table 4).
[Table body not recovered. Layout: rows indexed by $R_1$, columns by $R_2$; entries are $\widehat{\mathrm{FDR}}$, true FDR and power.]
More informationIntroduction to the Analysis of Variance (ANOVA) Computing One-Way Independent Measures (Between Subjects) ANOVAs
Introduction to the Analysis of Variance (ANOVA) Computing One-Way Independent Measures (Between Subjects) ANOVAs The Analysis of Variance (ANOVA) The analysis of variance (ANOVA) is a statistical technique
More informationEMPIRICAL BAYES METHODS FOR ESTIMATION AND CONFIDENCE INTERVALS IN HIGH-DIMENSIONAL PROBLEMS
Statistica Sinica 19 (2009), 125-143 EMPIRICAL BAYES METHODS FOR ESTIMATION AND CONFIDENCE INTERVALS IN HIGH-DIMENSIONAL PROBLEMS Debashis Ghosh Penn State University Abstract: There is much recent interest
More informationTools and topics for microarray analysis
Tools and topics for microarray analysis USSES Conference, Blowing Rock, North Carolina, June, 2005 Jason A. Osborne, osborne@stat.ncsu.edu Department of Statistics, North Carolina State University 1 Outline
More informationMULTIPLE TESTING PROCEDURES AND SIMULTANEOUS INTERVAL ESTIMATES WITH THE INTERVAL PROPERTY
MULTIPLE TESTING PROCEDURES AND SIMULTANEOUS INTERVAL ESTIMATES WITH THE INTERVAL PROPERTY BY YINGQIU MA A dissertation submitted to the Graduate School New Brunswick Rutgers, The State University of New
More informationTwo-stage stepup procedures controlling FDR
Journal of Statistical Planning and Inference 38 (2008) 072 084 www.elsevier.com/locate/jspi Two-stage stepup procedures controlling FDR Sanat K. Sarar Department of Statistics, Temple University, Philadelphia,
More informationResampling-Based Control of the FDR
Resampling-Based Control of the FDR Joseph P. Romano 1 Azeem S. Shaikh 2 and Michael Wolf 3 1 Departments of Economics and Statistics Stanford University 2 Department of Economics University of Chicago
More informationarxiv: v1 [stat.me] 25 Aug 2016
Empirical Null Estimation using Discrete Mixture Distributions and its Application to Protein Domain Data arxiv:1608.07204v1 [stat.me] 25 Aug 2016 Iris Ivy Gauran 1, Junyong Park 1, Johan Lim 2, DoHwan
More informationA GENERAL DECISION THEORETIC FORMULATION OF PROCEDURES CONTROLLING FDR AND FNR FROM A BAYESIAN PERSPECTIVE
A GENERAL DECISION THEORETIC FORMULATION OF PROCEDURES CONTROLLING FDR AND FNR FROM A BAYESIAN PERSPECTIVE Sanat K. Sarkar 1, Tianhui Zhou and Debashis Ghosh Temple University, Wyeth Pharmaceuticals and
More informationChapter Seven: Multi-Sample Methods 1/52
Chapter Seven: Multi-Sample Methods 1/52 7.1 Introduction 2/52 Introduction The independent samples t test and the independent samples Z test for a difference between proportions are designed to analyze
More informationLec 1: An Introduction to ANOVA
Ying Li Stockholm University October 31, 2011 Three end-aisle displays Which is the best? Design of the Experiment Identify the stores of the similar size and type. The displays are randomly assigned to
More informationSTAT 461/561- Assignments, Year 2015
STAT 461/561- Assignments, Year 2015 This is the second set of assignment problems. When you hand in any problem, include the problem itself and its number. pdf are welcome. If so, use large fonts and
More informationSemi-Penalized Inference with Direct FDR Control
Jian Huang University of Iowa April 4, 2016 The problem Consider the linear regression model y = p x jβ j + ε, (1) j=1 where y IR n, x j IR n, ε IR n, and β j is the jth regression coefficient, Here p
More informationThe One-Way Independent-Samples ANOVA. (For Between-Subjects Designs)
The One-Way Independent-Samples ANOVA (For Between-Subjects Designs) Computations for the ANOVA In computing the terms required for the F-statistic, we won t explicitly compute any sample variances or
More informationSTAT 135 Lab 9 Multiple Testing, One-Way ANOVA and Kruskal-Wallis
STAT 135 Lab 9 Multiple Testing, One-Way ANOVA and Kruskal-Wallis Rebecca Barter April 6, 2015 Multiple Testing Multiple Testing Recall that when we were doing two sample t-tests, we were testing the equality
More informationJournal Club: Higher Criticism
Journal Club: Higher Criticism David Donoho (2002): Higher Criticism for Heterogeneous Mixtures, Technical Report No. 2002-12, Dept. of Statistics, Stanford University. Introduction John Tukey (1976):
More informationDoing Cosmology with Balls and Envelopes
Doing Cosmology with Balls and Envelopes Christopher R. Genovese Department of Statistics Carnegie Mellon University http://www.stat.cmu.edu/ ~ genovese/ Larry Wasserman Department of Statistics Carnegie
More informationPeak Detection for Images
Peak Detection for Images Armin Schwartzman Division of Biostatistics, UC San Diego June 016 Overview How can we improve detection power? Use a less conservative error criterion Take advantage of prior
More informationBiostatistics Advanced Methods in Biostatistics IV
Biostatistics 140.754 Advanced Methods in Biostatistics IV Jeffrey Leek Assistant Professor Department of Biostatistics jleek@jhsph.edu Lecture 11 1 / 44 Tip + Paper Tip: Two today: (1) Graduate school
More informationFamilywise Error Rate Controlling Procedures for Discrete Data
Familywise Error Rate Controlling Procedures for Discrete Data arxiv:1711.08147v1 [stat.me] 22 Nov 2017 Yalin Zhu Center for Mathematical Sciences, Merck & Co., Inc., West Point, PA, U.S.A. Wenge Guo Department
More informationNew Procedures for False Discovery Control
New Procedures for False Discovery Control Christopher R. Genovese Department of Statistics Carnegie Mellon University http://www.stat.cmu.edu/ ~ genovese/ Elisha Merriam Department of Neuroscience University
More informationAdaptive Filtering Multiple Testing Procedures for Partial Conjunction Hypotheses
Adaptive Filtering Multiple Testing Procedures for Partial Conjunction Hypotheses arxiv:1610.03330v1 [stat.me] 11 Oct 2016 Jingshu Wang, Chiara Sabatti, Art B. Owen Department of Statistics, Stanford University
More informationMultidimensional local false discovery rate for microarray studies
Bioinformatics Advance Access published December 20, 2005 The Author (2005). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org
More informationOn Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses
On Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses Gavin Lynch Catchpoint Systems, Inc., 228 Park Ave S 28080 New York, NY 10003, U.S.A. Wenge Guo Department of Mathematical
More informationFalse Discovery Rate
False Discovery Rate Peng Zhao Department of Statistics Florida State University December 3, 2018 Peng Zhao False Discovery Rate 1/30 Outline 1 Multiple Comparison and FWER 2 False Discovery Rate 3 FDR
More informationLecture 21: October 19
36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 21: October 19 21.1 Likelihood Ratio Test (LRT) To test composite versus composite hypotheses the general method is to use
More informationJournal of Statistical Software
JSS Journal of Statistical Software MMMMMM YYYY, Volume VV, Issue II. doi: 10.18637/jss.v000.i00 GroupTest: Multiple Testing Procedure for Grouped Hypotheses Zhigen Zhao Abstract In the modern Big Data
More informationA NEW APPROACH FOR LARGE SCALE MULTIPLE TESTING WITH APPLICATION TO FDR CONTROL FOR GRAPHICALLY STRUCTURED HYPOTHESES
A NEW APPROACH FOR LARGE SCALE MULTIPLE TESTING WITH APPLICATION TO FDR CONTROL FOR GRAPHICALLY STRUCTURED HYPOTHESES By Wenge Guo Gavin Lynch Joseph P. Romano Technical Report No. 2018-06 September 2018
More informationExam: high-dimensional data analysis January 20, 2014
Exam: high-dimensional data analysis January 20, 204 Instructions: - Write clearly. Scribbles will not be deciphered. - Answer each main question not the subquestions on a separate piece of paper. - Finish
More information