Adaptive Filtering Procedures for Replicability Analysis of High-throughput Experiments


Jingshu Wang^1, Weijie Su^1, Chiara Sabatti^2, and Art B. Owen^2

^1 Department of Statistics, University of Pennsylvania
^2 Department of Statistics, Stanford University

January 2018

Abstract

Replicability is a fundamental quality of scientific discoveries. While meta-analysis provides a framework to evaluate the strength of signals across multiple studies, accounting for experimental variability, it does not investigate replicability: a single, possibly non-reproducible, study can be enough to reach significance. In contrast, the partial conjunction (PC) alternative hypothesis stipulates that, for a chosen number r (r > 1), at least r out of n related individual hypotheses are non-null, making it a useful measure of replicability. Motivated by genetics problems, we consider settings where a large number M of partial conjunction null hypotheses are tested using an n × M matrix of p-values, where n is the number of studies. Applying multiple testing adjustments directly to PC p-values can be very conservative. We here introduce AdaFilter, a new procedure that, mindful of the fact that the PC null is a composite hypothesis, increases power by filtering out unlikely candidate PC hypotheses using the whole p-value matrix. We prove that appropriate versions of AdaFilter control the familywise error rate and the per family error rate under independence. We show that these error rates and the false discovery rate can be controlled under independence and a within-study local dependence structure while achieving much higher power than existing methods. We illustrate the effectiveness of the AdaFilter procedures with three different case studies.

Keywords: meta-analysis, partial conjunction, multiple hypothesis testing, composite null

jingshuw@upenn.edu. Jingshu Wang is supported by NIH grant 2 R01HG and the Wharton Dean's Fund. Art Owen is supported by the US NSF grant DMS. Chiara Sabatti is supported by NIH grants R01MH and U01HG.

1 Introduction

Replication is the cornerstone of science (Moonesinghe et al., 2007). In the last decade, however, both the popular press (Lehrer, 2010) and the scientific press (Begley and Ellis, 2012; Baker, 2016) have given space to a number of publications highlighting the lack of reproducibility of research results. The National Academy of Sciences has identified this reproducibility crisis as a major focus area. While addressing this challenge requires efforts on multiple fronts, statistics has important contributions to make. On one hand, there is a need for data analysis methods that yield findings with a higher chance of being reproducible. On the other hand, it is crucial to have a framework with which to evaluate the reproducibility of scientific results in a manner that is objective and precise, while properly accounting for experimental and sampling variability. The field known as meta-analysis (Hedges and Olkin, 1985) has provided one such framework, but with an emphasis on aggregating information across n studies to increase power rather than on evaluating their agreement or lack thereof. In meta-analysis one tests the global null of no effect in any study. It is possible that this null is rejected largely on the basis of just one study with an extremely significant effect. Such a rejection may be undesirable, as it could arise from some irreproducible property, such as technical or modeling bias, of that one significant study. The accepted approach in meta-analysis to account for heterogeneity across studies is to rely on a random effects model. While this is robust to inter-study variability, rejection of the global null is still not equivalent to replication: it does not guarantee that the effect in question has been observed in multiple separate studies. Functional magnetic resonance imaging (fMRI) is one field in which the need for explicit guarantees of replication was felt early.
The solution put forward in Price and Friston (1997) is conjunction (logical "and") testing: a result is considered replicated when the corresponding null hypothesis is rejected in all n settings where it is tested. While conjunction tests guarantee replication, their power decreases quickly as the number of studies n increases, as the decision relies on the least significant p-value across studies. The partial conjunction (PC) hypothesis represents a compromise between these two extremes: it provides replication guarantees while maintaining reasonable power. Introduced by Friston et al. (2005), and studied in Benjamini and Heller (2008), the partial conjunction (PC) test asks whether at least r out of n null hypotheses are false. As long as r ≥ 2, rejecting the partial conjunction null explicitly guarantees that the finding is replicable in at least some of these studies. Testing a PC hypothesis can evaluate the consistency of genetic findings across multiple studies or platforms (Heller and Yekutieli, 2014). It can also be used to identify effects that are common across different conditions/subgroups (Sun and Wei, 2011; Heller et al., 2007), even when these are assessed in the context of a single study. For example, in multiple-tissue eQTL (expression quantitative trait loci) studies, PC can help to judge whether the detected gene regulation patterns are consistent across tissues or tissue-specific (Flutre et al., 2013). Or, in genome-wide association studies (GWAS) of multiple phenotypes (Shin et al., 2014), PC can identify the single nucleotide polymorphisms (SNPs) that are associated with more than one trait, and therefore are likely to probe an important biological pathway.

In high-throughput experiments like the ones just described, it is common to test tens of thousands of hypotheses in the same study. A typical application of the PC framework in this context, therefore, requires controlling for multiplicity. For example, when studying the influence of a condition on gene expression, one typically quantifies the expression levels of 20,000 genes and tests the corresponding 20,000 null hypotheses of no association between the condition of interest and the expression levels in one study. To identify which of the potential rejections are replicated r times across n studies using a PC approach, one would need to carry out the corresponding 20,000 PC tests. A straightforward approach to controlling for multiplicity is to apply standard adjustment procedures to the p-values for the many PC hypotheses. However, this strategy has been shown to have very low power (Heller and Yekutieli, 2014; Sun et al., 2015). The issue is particularly severe in genetics studies, where the large number of hypotheses tested is coupled with a sparse signal (only a small portion of the PC hypotheses is non-null). This power loss, which represents a major roadblock to the application of the PC framework in replicability studies, is linked to the composite nature of the PC null hypotheses for r ≥ 2. Both Heller and Yekutieli (2014) and Bogomolov and Heller (2015) suggest procedures that try to work around this difficulty and result in power increases. However, Bogomolov and Heller (2015) is designed for the case of n = r = 2, and the empirical Bayes approach in Heller and Yekutieli (2014) encounters computational difficulties as n increases substantially. We here address this limitation, introducing new adaptive filtering procedures (AdaFilter) to test multiple PC hypotheses that control global error rates without dramatically decreasing power for any number of studies n.
The key idea is that when we are testing whether at least r out of n hypotheses are false, we can treat the hypotheses that appear to be false in at least r − 1 studies differently from those that do not. Our AdaFilter procedures for PC hypotheses provably control several simultaneous error rates when all the test statistics are independent. Simulation studies suggest that these properties extend to the case of local and weak dependence between the hypotheses within each study. In addition, AdaFilter has a partial monotonicity property: reducing any of the n p-values for a given null hypothesis can only make it more likely to be rejected. It is, however, possible that lowering one of the p-values for one null hypothesis makes it less likely that another null hypothesis will be rejected. This lack of a corresponding complete monotonicity property is the source of the power gain of our proposals. The structure of the paper is as follows. Section 2 precisely defines the PC testing problem and illustrates the power limitation of out-of-the-box multiplicity adjustment strategies. Section 3 introduces our AdaFilter procedures. Section 4 proves their validity under independence, provides further understanding of the procedures through an important monotonicity property, discusses extensions to more general scenarios, and offers a comparison with previous methods. Section 5 explores the performance with simulations and, finally, Section 6 describes the application of our procedures to three case studies, one testing for replication in multiple experiments and the other two testing for signal consistency across different conditions/subgroups.

2 Multiple testing for partial conjunctions

2.1 Problem set up

We consider the problem where M null hypotheses are tested in n studies, $(H_{0ij})_{n \times M}$. In keeping with what is true in many applications, we assume that only summary statistics are available from these studies, and, for simplicity, we focus on the situation where the summary statistics are the p-values $(p_{ij})_{n \times M}$ of the hypotheses in each of the studies, which is the realization of a random matrix $(P_{ij})_{n \times M}$. To evaluate whether the findings have been replicated at least r times, we want to simultaneously test M partial conjunction null hypotheses at replicability level r:
$$H^{r/n}_{0j}: \text{fewer than } r \text{ out of } n \text{ hypotheses are non-null}$$
for $r \in \{1, \dots, n\}$ and $j = 1, 2, \dots, M$. The replicability level r is specified beforehand. In practice, one can set r = 2 for a minimal amount of reproducibility or take r proportional to n to describe a desired rate of reproducibility. We use the indicator variable $v_j$ to keep track of the truth status of $H^{r/n}_{0j}$ ($v_j = 0$ if the null is true and $v_j = 1$ otherwise). For each $j = 1, 2, \dots, M$, let $p_j = (p_{1j}, \dots, p_{nj})$ be the vector of p-values for the individual hypotheses $(H_{01j}, \dots, H_{0nj})$ involved in $H^{r/n}_{0j}$, and let $p_{(1)j} \le p_{(2)j} \le \dots \le p_{(n)j}$ be its sorted values. For a multiple testing procedure, the decision function is $\phi_j = 1$ if we reject $H^{r/n}_{0j}$ and $\phi_j = 0$ otherwise. Because of multiple testing adjustment, our rejection of the jth PC hypothesis ($\phi_j = 1$) will depend most strongly on $p_j$. However, the elements of $p_{j'}$ for $j' \ne j$ may also play a role, which makes AdaFilter different from both the usual meta-analysis (r = 1) and straightforward partial conjunction testing. We use 1:M as a concise notation for the index set $\{1, 2, \dots, M\}$. The total number of discoveries is $R = \sum_{j=1}^{M} \phi_j$ and the total number of false discoveries is $V = \sum_{j=1}^{M} \phi_j 1_{\{v_j = 0\}}$.
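As a concrete reading of this bookkeeping, the counts R and V can be computed from the decision and truth vectors; the helper below is a hypothetical illustration, not part of the paper's procedures.

```python
def discovery_counts(phi, v):
    """R = total rejections; V = rejections of true PC nulls.

    phi[j] = 1 if the jth PC null hypothesis is rejected (phi_j),
    v[j] = 0 if that null is true (v_j)."""
    R = sum(phi)
    V = sum(1 for phi_j, v_j in zip(phi, v) if phi_j == 1 and v_j == 0)
    return R, V
```

For example, with rejections `phi = [1, 1, 0, 1]` and truth statuses `v = [0, 1, 0, 1]`, there are R = 3 discoveries, of which V = 1 is false.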
There are many possible definitions of simultaneous error rates (Dudoit and Van Der Laan, 2007), with the familywise error rate (FWER) and the false discovery rate (FDR) being the most common ones. In addition, we consider the per-family error rate (PFER), as it provides a motivation for our procedures. With the notation introduced, $\mathrm{FWER} := P(V \ge 1)$, $\mathrm{PFER} := E(V)$ and $\mathrm{FDR} := E(V / \max(R, 1))$. We say that an error rate is controlled at level α when, for any configuration of true and non-true null hypotheses, the expectations defining the error rates are bounded by α. Throughout the paper, we use an upper case letter to denote a random variable and the corresponding lower case letter for the specific realization of the random variable.

2.2 Multiple comparison adjustment procedures on PC p-values

For a vector of individual p-values $(P_1, P_2, \dots, P_n)$ from multiple studies, let $P_{r/n}$ denote the p-value for a single PC null hypothesis $H^{r/n}_0$ obtained by combining the individual p-values into a single value. Benjamini and Heller (2008) discuss three approaches to obtain these p-values, which we report here, using the standard notation $P_{(1)} \le P_{(2)} \le \dots \le P_{(n)}$:

1. Simes' method:
$$P^S_{r/n} = \min_{i = r, \dots, n} \left\{ \frac{n - r + 1}{i - r + 1} P_{(i)} \right\},$$
which is valid for positively dependent p-values (Benjamini and Heller, 2008).

2. Fisher's method:
$$P^F_{r/n} = P\left(\chi^2_{2(n - r + 1)} \ge -2 \sum_{i = r}^{n} \log P_{(i)}\right),$$
which is valid for independent p-values.

3. Bonferroni's method:
$$P^B_{r/n} = (n - r + 1) P_{(r)},$$
which is valid under any dependence structure of the p-values.

Applying any of these rules, we can obtain a p-value $P_{r/n,j}$ for the jth PC null hypothesis, with $j = 1, \dots, M$. With these premises, a straightforward approach to the multiple PC testing problem consists of simply applying standard multiplicity adjustment procedures to the individual PC p-values $\{P_{r/n,j} : j = 1, 2, \dots, M\}$. For example, to control FWER at level α, we could use the Bonferroni rule, which rejects $H^{r/n}_{0j}$ if $P_{r/n,j} \le \alpha/M$. This would also control the PFER at level α (Tukey, 1953). To control FDR we could use the procedure introduced in Benjamini and Hochberg (1995) (BH). This can be described as rejecting $H^{r/n}_{0j}$ if $P_{r/n,j} \le \gamma_0$, where $\gamma_0$ is a data-dependent threshold defined as
$$\gamma_0 = \max\left\{\gamma \in \left\{0, \tfrac{\alpha}{M}, \tfrac{2\alpha}{M}, \dots, \alpha\right\} : M\gamma \le \alpha \sum_{j=1}^{M} 1_{\{P_{r/n,j} \le \gamma\}}\right\}.$$
Aside from the appeal of relying on standard procedures, these strategies are often too conservative in the context of PC testing, especially when the true non-null individual hypotheses are a small subset. To quantify this conservativeness, we start by considering the example of the Bonferroni correction when r = n. We will then discuss a more general case, before proposing our alternatives in Section 3.

2.2.1 The conjunction case r = n

Rejecting $H^{n/n}_{0j}$ corresponds to declaring that finding j is replicated in all n studies. The combined p-value for $H^{n/n}_{0j}$ is simply $P_{n/n,j} = P_{(n)j}$. To control PFER (or FWER) at level α, the Bonferroni correction rejects $H^{n/n}_{0j}$ if $p_{(n)j} \le \alpha/M$.
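The three combining rules above can be sketched in a few lines; this is an illustrative implementation (function name `pc_pvalue` is ours, not from the paper). Fisher's rule uses the fact that the chi-square survival function with an even number of degrees of freedom 2k has the closed form $e^{-x/2}\sum_{i<k}(x/2)^i/i!$, so no external statistics library is needed.

```python
import math

def pc_pvalue(pvals, r, method="bonferroni"):
    """Combine n individual p-values into a single partial-conjunction
    p-value for H0: fewer than r of the n hypotheses are non-null,
    following the three rules of Benjamini and Heller (2008)."""
    n = len(pvals)
    p = sorted(pvals)  # p[i-1] is the order statistic P_(i)
    if method == "bonferroni":
        # (n - r + 1) * P_(r); valid under any dependence
        return min(1.0, (n - r + 1) * p[r - 1])
    if method == "simes":
        # min over i = r..n of (n - r + 1)/(i - r + 1) * P_(i)
        return min(1.0, min((n - r + 1) / (i - r + 1) * p[i - 1]
                            for i in range(r, n + 1)))
    if method == "fisher":
        # -2 * sum_{i=r}^{n} log P_(i) is chi^2 with 2(n - r + 1) df
        # under the null; survival function in closed form for even df
        x = -2.0 * sum(math.log(p[i - 1]) for i in range(r, n + 1))
        k = n - r + 1
        return min(1.0, math.exp(-x / 2) *
                   sum((x / 2) ** i / math.factorial(i) for i in range(k)))
    raise ValueError(f"unknown method: {method}")
```

With `pvals = [0.01, 0.2, 0.5]` and r = 2, Bonferroni gives 2 × 0.2 = 0.4; Simes also gives 0.4 here; Fisher combines the two largest p-values and gives roughly 0.33.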
To understand the performance of this procedure, and how it interplays with the composite nature of the PC hypothesis, define sets $I_k \subseteq 1{:}M$ such that $j \in I_k$ means that precisely k of the alternative hypotheses $H_{1ij}$ hold for the given j. In terms of the partial conjunction hypotheses, for $k = 1, \dots, n-1$ we have
$$I_k = \{j \in 1{:}M \mid H^{(k+1)/n}_{0j} \text{ is true but } H^{k/n}_{0j} \text{ is false}\},$$
while $I_0 = \{j \in 1{:}M \mid H^{1/n}_{0j} \text{ is true}\}$ and $I_n = \{j \in 1{:}M \mid H^{n/n}_{0j} \text{ is false}\}$. The sets $\{I_k, k = 0, 1, \dots, n\}$ form a valid partition and $H^{n/n}_0 = \{j \mid H^{n/n}_{0j} \text{ is true}\} = \bigcup_{k=0}^{n-1} I_k$. Assuming that the individual p-values are all independent, we can use this decomposition of the set of null $H^{n/n}_{0j}$ to understand the value of the PFER and obtain an upper bound:
$$E(V) = \sum_{j \in H^{n/n}_0} E\left(1_{\{P_{(n)j} \le \alpha/M\}}\right) = \sum_{k=0}^{n-1} \sum_{j \in I_k} P(P_{(n)j} \le \alpha/M)$$
$$\le \sum_{k=0}^{n-1} \sum_{j \in I_k} P\left(P_{ij} \le \tfrac{\alpha}{M} \ \forall i \text{ s.t. } H_{0ij} \text{ is true}\right) \quad (1)$$
$$\le \sum_{k=0}^{n-1} \sum_{j \in I_k} \left(\frac{\alpha}{M}\right)^{n-k} = \sum_{k=0}^{n-1} |I_k| \left(\frac{\alpha}{M}\right)^{n-k}. \quad (2)$$
The first inequality, in (1), is close to an equality when all the tests of the non-null hypotheses $H_{1ij}$ have high power, so that their associated p-values are very small. The second inequality, in (2), reduces to an equality for many standard tests. Therefore, the bound is tight and there are definitely situations where it provides a good approximation of $E(V)$. We are interested in the setting where M is large. Letting $\delta_k = |I_k|/M$, the bound on the PFER becomes
$$E(V) \le \alpha\left(\delta_{n-1} + \delta_{n-2}\frac{\alpha}{M} + \delta_{n-3}\left(\frac{\alpha}{M}\right)^2 + \dots + \delta_0\left(\frac{\alpha}{M}\right)^{n-1}\right):$$
in the limit, the bound on $E(V)$ is dominated by $\delta_{n-1}\alpha$ so long as $\delta_{n-1}$ is bounded away from 0. If instead $\delta_{n-1} = 0$, then $E(V) = O(M^{-1})$. More generally, if $\delta_{n-1}$ is small then $E(V)$ will be much less than its nominal level α for large M. In genetics problems we might have M in the thousands (microarrays) or millions (SNPs) with $\delta_0 \approx 1$ and $\delta_{n-1} \approx 0$.
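The bound above is easy to evaluate numerically, which makes the conservativeness concrete; a small sketch (helper name ours):

```python
def pfer_bound(deltas, M, alpha):
    """Upper bound on E(V) for Bonferroni on conjunction (r = n) p-values:
    alpha * sum_k delta_{n-1-k} * (alpha/M)^k, with deltas = [delta_0, ..., delta_{n-1}]."""
    n = len(deltas)
    return alpha * sum(deltas[n - 1 - k] * (alpha / M) ** k for k in range(n))
```

With n = 2, M = 20000 and α = 0.05, the worst case δ₁ = 1 gives the bound α = 0.05, but the sparse genetics-like case δ₀ = 1, δ₁ = 0 gives α²/M ≈ 1.25 × 10⁻⁷: the realized PFER is orders of magnitude below the nominal level.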
Under these circumstances the procedure would be very conservative, in fact much more conservative than Bonferroni usually is.

2.2.2 The general case r ≤ n

A key lesson from the calculation in the previous section is that this straightforward application of multiple comparison correction strategies to the PC p-values does not account for the fact that PC hypotheses are composite, and therefore it controls the global error rate under the worst-case scenario ($\delta_{n-1} = 1$). However, this is not a requirement for a valid multiplicity adjustment procedure, which should simply control the target global error under the true (and unknown) configuration of null hypotheses. The problem is not limited to the case r = n. For general $2 \le r \le n$, the magnitude of $E(V)$ for the Bonferroni correction of individual PC p-values will depend mainly on $\delta_{r-1}$ in the large-M setting. It is clear that more efficient procedures would be available if the fractions $\delta_k$ were known. Unfortunately, this is typically not the case. However, if good estimates of the $\delta_k$ can be obtained from

the given p-value matrix, it is possible that these could be employed in more efficient procedures for simultaneous testing of PC null hypotheses. This is what motivates the Bayesian methods (Heller and Yekutieli, 2014; Flutre et al., 2013). In this paper we take a frequentist perspective and, rather than estimating the $\delta_k$, we work directly on an alternative estimate of V.

Figure 1: Illustration of the adaptive filtering procedures for n = r = 2. (a) Rejection (red) and filtering (blue) regions when α = 0.2; each triangle corresponds to a pair of p-values. (b) Selection of γ when α = 0.2 under the complete null scenario.

3 Adaptive filtering procedures

The observations made in the previous section can be recast as follows. The key reason for the conservativeness of the straightforward approaches is that, the partial conjunction null being composite, the inequality $P(P_{r/n} \le \gamma) \le \gamma$ can be very loose. When r = n, the value of $P(P_{(n)j} \le \gamma)$ can be as low as $\gamma^n$ under the complete null (all individual hypotheses are null) but as high as γ if only one individual hypothesis is null. The fact that $P(P_{r/n} \le \gamma)$ can be substantially lower than γ results in a power loss, as the standard multiple testing procedures are designed to control error when $P(P_{r/n} \le \gamma) = \gamma$. To overcome this, our AdaFilter procedures leverage regions $A_\gamma \subseteq [0, 1]^n$ such that the much tighter inequality $P(P_{r/n,j} \le \gamma \mid P_j \in A_\gamma) \le \gamma$ holds for any configuration in the partial conjunction null space. To motivate our construction, we start with the simple example where n = r = 2 and $P_{1j}$ and $P_{2j}$ are independent. There are three possible scenarios for $H^{2/2}_{0j}$ being true: the vector of non-null indicators of $(H_{01j}, H_{02j})$ lies in

$\{(0, 0), (0, 1), (1, 0)\}$. The rejection region has the form $R_\gamma = \{(p_1, p_2) \mid \max(p_1, p_2) \le \gamma\}$, with $\gamma \in [0, 1]$. Let us now define the filtering region $A_\gamma = \{(p_1, p_2) \mid \min(p_1, p_2) \le \gamma\}$, illustrated in Figure 1(a) as the blue L-shaped region. It is easy to check geometrically that $P(P_j \in R_\gamma \mid P_j \in A_\gamma \text{ and } H^{2/2}_{0j} \text{ is true}) \le \gamma$ under all three null scenarios, since at least one of $P_{1j}$ and $P_{2j}$ is stochastically greater than uniform. Recall that both the Bonferroni and BH procedures are based on an implicit estimate of the number of false rejections V associated with a threshold γ: $\hat{V}_\gamma = \gamma M$. However, in the case of partial conjunctions, using the total number of hypotheses M is too conservative: one needs to consider as candidates for rejection only those $H^{2/2}_{0j}$ such that $P_j \in A_\gamma$. The observation that $P(P_j \in R_\gamma \mid P_j \in A_\gamma \text{ and } H^{2/2}_{0j} \text{ is true}) \le \gamma$ leads us to another estimate,
$$\hat{V}_{A_\gamma} = \gamma \sum_{j=1}^{M} 1_{\{\min(p_{1j}, p_{2j}) \le \gamma\}}.$$
Obviously, $\hat{V}_{A_\gamma} \le \hat{V}_\gamma$, and the gap can be large if most of the hypotheses are complete nulls. We can use the estimate $\hat{V}_{A_\gamma}$ in multiple testing procedures, replicating the construction of Bonferroni or BH and determining the threshold γ adaptively. To control FWER and PFER at level α (that is, to guarantee $E(V) \le \alpha$), we choose γ as large as possible, subject to $\hat{V}_{A_\gamma} \le \alpha$. This choice is well defined because $\hat{V}_{A_\gamma}$ is a non-decreasing function of γ, ranging from 0 at γ = 0 to M at γ = 1. To control the FDR at level α, one can estimate the false discovery proportion V/R by $\hat{V}_{A_\gamma}/R$ and select the largest $\gamma \ge 0$ such that $\hat{V}_{A_\gamma} / \max(R, 1) \le \alpha$. These choices push γ towards 0 when all $H^{2/2}_{0j}$ are true, avoiding false discoveries (Figure 1(b)). In the next section we introduce multiple testing procedures that follow this scheme for $\{H^{r/n}_{0j}, j = 1, \dots, M\}$ with any value of r.
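For the n = r = 2 case just described, the adaptive choice of γ can be sketched as follows (this is a simplified illustration of the idea, searching γ over the grid {α/M, …, α} used later in Definition 1; the function name is ours):

```python
def adafilter_2of2_pfer(p1, p2, alpha):
    """n = r = 2 sketch: pick the largest gamma with
    hat{V}_{A_gamma} = gamma * #{j : min(p1j, p2j) <= gamma} <= alpha,
    then reject H_0j^{2/2} when max(p1j, p2j) <= gamma."""
    M = len(p1)
    best = 0.0
    for k in range(1, M + 1):       # candidate thresholds gamma = alpha / k
        g = alpha / k
        vhat = g * sum(1 for a, b in zip(p1, p2) if min(a, b) <= g)
        if vhat <= alpha:
            best = max(best, g)
    return [max(a, b) <= best for a, b in zip(p1, p2)]
```

For instance, with `p1 = [0.01, 0.4, 0.5]`, `p2 = [0.02, 0.6, 0.7]` and α = 0.05, only the first pair passes the filter at γ = 0.05, so $\hat V_{A_\gamma} = 0.05 \le \alpha$ and the first PC hypothesis is rejected, even though the plain Bonferroni threshold would be α/M ≈ 0.017.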
3.1 AdaFilter procedures for partial conjunctions

The AdaFilter procedures for $H^{r/n}_{0j}$ are designed keeping in mind that the truth status of $H^{(r-1)/n}_0$ is important in evaluating the probability of the rejection region for $H^{r/n}_0$. We use the Bonferroni method, $P^B_{r/n,j} = (n - r + 1) P_{(r)j}$, for a single PC hypothesis and compare it against the threshold γ, determined adaptively from the whole p-value matrix. The choice of γ depends on the filtering regions
$$A_\gamma := \left\{(p_1, p_2, \dots, p_n) \mid (n - r + 1) p_{(r-1)} \le \gamma\right\}.$$
We propose an AdaFilter Bonferroni test (two versions) and an AdaFilter Benjamini-Hochberg (BH) test. It is convenient to first introduce the notion of filtering and selection p-values.

Filtering p-values $F_j$:
$$F_j := (n - r + 1) P_{(r-1)j} \quad (3)$$

Selection p-values $S_j$:
$$S_j := P^B_{r/n,j} = (n - r + 1) P_{(r)j} \quad (4)$$

Definition 1 (AdaFilter Bonferroni). For a level α, and with $F_j$ and $S_j$ given by (3) and (4) respectively, reject $H^{r/n}_{0j}$ if $S_j \le \gamma_0^{\mathrm{Bon}}$, where
$$\gamma_0^{\mathrm{Bon}} = \max\left\{\gamma \in \left\{\tfrac{\alpha}{M}, \dots, \tfrac{\alpha}{2}, \alpha\right\} : \gamma \sum_{j=1}^{M} 1_{\{F_j \le \gamma\}} \le \alpha\right\}.$$

Definition 2 (AdaFilter BH). For a level α, and with $F_j$ and $S_j$ given by (3) and (4) respectively, reject $H^{r/n}_{0j}$ if $S_j \le \gamma_0^{\mathrm{BH}}$, where
$$\gamma_0^{\mathrm{BH}} = \max\left\{\gamma \in I_{\alpha,M} : \gamma \sum_{j=1}^{M} 1_{\{F_j \le \gamma\}} \le \alpha \sum_{j=1}^{M} 1_{\{S_j \le \gamma\}}\right\}, \quad I_{\alpha,M} = \left\{\frac{k\alpha}{m} : k \in 0{:}M,\ m \in 1{:}M,\ k \le m\right\}.$$

Our AdaFilter Bonferroni can be defined equivalently using a more programmable two-step approach, as illustrated in Figure 2.

Definition 3 (AdaFilter Bonferroni, alternative version). For a level α, and with $F_j$ and $S_j$ given by (3) and (4) respectively, proceed as follows. First sort $\{F_1, F_2, \dots, F_M\}$ into $F_{(1)} \le F_{(2)} \le \dots \le F_{(M)}$. Then let
$$m' = \min\left\{j \in 1{:}M : \frac{\alpha}{j} < F_{(j)}\right\} \quad (5)$$
and define
$$m = \begin{cases} m', & \text{if } F_{(m')} \le \frac{\alpha}{m' - 1} \\ m' - 1, & \text{otherwise.} \end{cases}$$
Finally, reject $H^{r/n}_{0j}$ if $S_j \le \alpha/m$. If the set in (5) is empty, then take $m = m' = M$.

Proposition 3.1. The two definitions of the adaptive filtering Bonferroni procedure in Definition 1 and Definition 3 are equivalent.

The alternative form provides another view of the AdaFilter procedures. We filter out unlikely hypotheses with large $F_j$ values and only adjust for the m remaining hypotheses. If $m \ll M$ then adaptive filtering can reject many more hypotheses than direct Bonferroni or BH on $S_j$. One will see from (6) in the next section that, because of the conditional validity of the selection p-values, there is no need to correct for selection bias when selecting based on $F_j$.
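The two-step form of Definition 3 translates directly into code. The sketch below is our own illustration, not the authors' released implementation; the m′ = 1 edge case is clamped to m = 1, which gives the same rejections since then every $S_j \ge F_j > \alpha$.

```python
def adafilter_bonferroni(P, r, alpha):
    """Two-step AdaFilter Bonferroni in the spirit of Definition 3.
    P: n x M matrix (list of rows) of p-values. Returns rejected indices."""
    n, M = len(P), len(P[0])
    F, S = [], []
    for j in range(M):
        col = sorted(P[i][j] for i in range(n))
        # filtering p-value F_j = (n-r+1) P_{(r-1)j} (F_j = 0 when r = 1)
        F.append((n - r + 1) * col[r - 2] if r >= 2 else 0.0)
        # selection p-value S_j = (n-r+1) P_{(r)j} (Bonferroni PC p-value)
        S.append((n - r + 1) * col[r - 1])
    Fs = sorted(F)
    # m' = smallest j with alpha/j < F_(j); if no such j, take m = m' = M
    viol = [j for j in range(1, M + 1) if alpha / j < Fs[j - 1]]
    if not viol:
        m = M
    else:
        m_prime = viol[0]
        if m_prime >= 2 and Fs[m_prime - 1] <= alpha / (m_prime - 1):
            m = m_prime
        else:
            m = max(m_prime - 1, 1)
    return [j for j in range(M) if S[j] <= alpha / m]
```

On the toy matrix `P = [[0.01, 0.3], [0.04, 0.5]]` with n = r = 2 and α = 0.05, the second hypothesis is filtered out (F = 0.3), so m = 1 and the first hypothesis is rejected at threshold α = 0.05, while direct Bonferroni on the PC p-values (threshold α/M = 0.025) would reject nothing.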

Figure 2: AdaFilter Bonferroni procedure. On the x-axis, indexes j of the hypotheses, ordered by increasing values of the filtering p-values: $F_{(1)} \le \dots \le F_{(M)}$. On the y-axis, filtering and selection p-values for each of the hypotheses: the $F_{(j)}$ are plotted with filled triangles, and $S_{j^*}$ with crosses, where $j^*$ is the true index of $F_{(j)}$. The solid green line is the α/j boundary; the color blue indicates hypotheses that pass the filtering step, and a red circle marks hypotheses that pass the selection step. The left and right panels represent the two possible definitions of m ($m = m' - 1$ and $m = m'$).

4 Theoretical properties of AdaFilter

When the p-values are independent across both the studies and the multiple individual hypotheses, we can prove that the AdaFilter procedures control the corresponding global errors under any configuration of the true individual hypotheses. In addition to independence we assume validity.

Definition 4 (Valid p-value). If $P(P_{ij} \le \alpha) \le \alpha$ under $H_{0ij}$ for any $\alpha \in [0, 1]$, then $P_{ij}$ is a valid p-value.

Theorem 4.1. Let $(P_{ij})_{n \times M}$ contain independent valid p-values. Then the AdaFilter Bonferroni procedure in Definition 1 controls FWER and PFER at level α for the null hypotheses $\{H^{r/n}_{0j} : j = 1, 2, \dots, M\}$.

The proof of Theorem 4.1 is given in the appendix. It depends on Lemma 4.2, which gives a conditional validity property of the selection statistics $S_j$ after they are filtered via $F_j$.

Lemma 4.2. Let $(P_{ij})_{n \times M}$ contain independent valid p-values and assume that $H^{r/n}_{0j}$ is true for some $j \in 1{:}M$. Then
$$P(S_j \le \beta \mid F_j \le \beta) \le \beta \quad (6)$$

holds whenever $\beta > 0$ and $P(F_j \le \beta) > 0$. Here $F_j$ and $S_j$ are given by (3) and (4) respectively. Equation (6) can be equivalently written as $P(S_j \le \beta) \le \beta P(F_j \le \beta)$; in this form it also holds when $P(F_j \le \beta) = 0$, since $S_j \ge F_j$ always.

We note that the independence assumption on $(P_{ij})_{n \times M}$ in Theorem 4.1 can be relaxed. In the proof, we only require that if $v_j = 0$ (that is, $H^{r/n}_{0j}$ is true), then the p-values $(P_{1j}, P_{2j}, \dots, P_{nj})$ are independent of each other and of the other entries of $(P_{ij})_{n \times M}$. Even in this more relaxed form, independence is not a reasonable assumption in many applications. We will investigate with simulations (Section 5) how robust our procedures are to violations of this assumption. In addition, while the fact that the AdaFilter BH procedure controls FDR remains a conjecture at this stage, we will use simulations to illustrate that FDR appears to be well controlled in practice.

4.1 Partial monotonicity

Wang and Owen (2017) discussed the importance of monotonicity for the combined p-value of a single PC test, where monotonicity is defined as the property that the combined p-value is a non-decreasing function of the individual p-values. Wang and Owen (2017) show that, due to the complexity of the null space of the PC hypothesis, there do exist non-monotone PC p-values that are uniformly most powerful. However, one needs to avoid proposing a non-monotone test, as such tests are completely unreasonable in practice (Perlman and Wu, 1999). In terms of multiple testing of many PC hypotheses, we will show that investigating monotonicity leads to a deeper understanding of our AdaFilter procedures. To clarify our analysis, we first define two monotonicity concepts: complete monotonicity and partial monotonicity.

Definition 5 (Complete monotonicity). A multiple testing procedure has complete monotonicity if each decision function $\phi_j(p_1, \dots, p_M)$ is a non-increasing function of all the elements of $(p_{ij})_{n \times M}$ for $j = 1, 2, \dots, M$.

Definition 6 (Partial monotonicity).
A multiple testing procedure has partial monotonicity if, for all $j \in 1{:}M$, its decision function $\phi_j(p_1, \dots, p_M)$ is non-increasing in all elements of $(p_{1j}, p_{2j}, \dots, p_{nj})$.

Complete monotonicity guarantees that if any p-value $p_{ij}$ decreases while all others are held fixed, we reject all PC null hypotheses that we originally rejected and maybe more. Partial monotonicity only requires the test of hypothesis j to be monotone in the p-values for that same hypothesis. It allows a reduction in $p_{ij'}$ for $j' \ne j$ to reverse a rejection of $H^{r/n}_{0j}$. Most multiple comparison procedures for a single study (for example, the BH procedure) satisfy complete monotonicity. However, it turns out that in our problem complete monotonicity is a stringent requirement that prevents us from finding efficient procedures for PC nulls. We can show that our procedures satisfy partial monotonicity but not complete monotonicity.

Corollary 4.3. Let $(P_{ij})$ be a matrix of valid p-values. Then both the AdaFilter Bonferroni and the AdaFilter BH procedures satisfy partial monotonicity for all null hypotheses $H^{r/n}_{0j}$, $j = 1, 2, \dots, M$.

Table 1: Toy example showing the power gain of AdaFilter through violating complete monotonicity, with n = r = 2 and M = 2; columns are j, the p-values in Study 1 and Study 2, and $F_j$ and $S_j$. AdaFilter Bonferroni finds the first hypothesis significant ($\gamma_0^{\mathrm{Bon}} = 0.05$) in both studies, while neither of its individual p-values can be rejected in each study separately using Bonferroni at the same FWER level α.

To see how our two AdaFilter procedures violate complete monotonicity, consider a hypothesis j with $F_j > \gamma_0$, where $\gamma_0$ is the original selection threshold. Now let $F_j$ increase while keeping $S_j$ and the individual p-values not involving j fixed. As $\gamma \sum_{j=1}^{M} 1_{\{F_j \le \gamma\}}$ keeps decreasing, the selection threshold increases, which can change the decision for $H^{r/n}_{0k}$ for some $k \ne j$ from acceptance to rejection. In other words, an increase of individual p-values can cause a larger number of unlikely PC hypotheses to be filtered out and, in turn, increase the number of rejections among the remaining non-null candidates. Violating complete monotonicity is necessary for a PC multiple testing procedure to have high power. An increase of individual p-values can make the non-null configuration look more similar across studies, which helps in getting more rejections and is incompatible with the complete monotonicity rule. As in the toy example illustrated in Table 1, when combining results of two repeated experiments and setting r = 2, we may even reject a conjunction hypothesis using AdaFilter while neither of the single hypotheses it joins is rejected separately. The gain in power here derives from having learned the similarity between the two experiments.

4.2 Extensions and relation to literature

Remark 1: variable r and n. In some applications, the M PC null hypotheses can have varying $r_j$ or $n_j$. Then the jth PC null hypothesis becomes
$$H^{r_j/n_j}_{0j}: \text{fewer than } r_j \text{ out of } n_j \text{ hypotheses are non-null.}$$
For example, in genetic analysis, if each $P_{ij}$ is the p-value testing the null hypothesis related to gene j in experiment i, then gene j might not be present in every single experiment, making the $n_j$ different. The AdaFilter procedures still work in this scenario. We replace formulas (3) and (4) by
$$F_j = (n_j - r_j + 1) P_{(r_j - 1)j} \quad \text{and} \quad S_j = (n_j - r_j + 1) P_{(r_j)j},$$
respectively. Both AdaFilter Bonferroni and AdaFilter BH still control the required error rates, as the conditional validity of Lemma 4.2 still holds.

Remark 2: other strategies to increase power. Two methods related to our AdaFilter procedures for PC hypotheses are the multiple testing procedure developed in Bogomolov and Heller (2015) for replicability analysis where n = r = 2, and the empirical Bayes approach in Heller

and Yekutieli (2014) for controlling the Bayes FDR for multiple PC hypotheses. Both methods were developed to improve the efficiency of the straightforward procedures. Our AdaFilter procedures generalize the procedures in Bogomolov and Heller (2015) to arbitrary n and r, and provide a frequentist approach comparable to the empirical Bayes method in Heller and Yekutieli (2014).

Bogomolov and Heller (2015) describe procedures to control either FWER or FDR when n = r = 2. These are also based on a filtering step: promising candidate hypotheses for each study are identified as those whose corresponding p-value in the other study is below a threshold. Then a promising candidate hypothesis is rejected if both of its individual p-values are small. To maximize efficiency, the authors suggest data-adaptive thresholds that are very similar to our AdaFilter procedures. For instance, to control FWER, they choose two thresholds $\gamma_1$ and $\gamma_2$ to satisfy
$$\gamma_1 \sum_{j=1}^{M} 1_{\{P_{2j} \le \gamma_2\}} \le \alpha/2, \quad \text{and} \quad \gamma_2 \sum_{j=1}^{M} 1_{\{P_{1j} \le \gamma_1\}} \le \alpha/2.$$
When $\gamma_1 \le \gamma_2$, then
$$\gamma_1 \sum_{j=1}^{M} 1_{\{\min(P_{1j}, P_{2j}) \le \gamma_1\}} \le \gamma_1 \sum_{j=1}^{M} \left(1_{\{P_{1j} \le \gamma_1\}} + 1_{\{P_{2j} \le \gamma_1\}}\right) \le \alpha.$$
Thus $\gamma_0^{\mathrm{Bon}} \ge \gamma_1 \wedge \gamma_2$, where $\gamma_0^{\mathrm{Bon}}$ is defined in our AdaFilter Bonferroni procedure for n = r = 2, making our procedure similar to theirs. The selection step in Bogomolov and Heller (2015) consists in rejecting $H^{2/2}_{0j}$ if both $p_{2j} \le \gamma_2$ and $p_{1j} \le \gamma_1$, again comparable to what we described. These similarities extend to the FDR controlling procedures. Our contribution is a general adaptive filtering idea that can be applied to any combination of r and n.

Heller and Yekutieli (2014) used an empirical Bayes approach. They first learn the proportion of each of the $2^n$ possible configurations of individual hypotheses being null or non-null, along with the distribution of the Z-values under each configuration.
While the empirical Bayes approach certainly deals with the loss of power due to the composite nature of the PC hypotheses, it requires adopting a Bayesian framework and has substantial computational costs. AdaFilter provides an alternative approach to increasing the efficiency of multiple comparisons of PC tests, with no need to estimate the fraction of each configuration or to abandon the frequentist framework. More importantly, the computational cost of our proposal is O(n), while the cost of the empirical Bayes method increases exponentially in n. Simulation results in Section 5 show that our procedure is more accurate in controlling the FDR of PC hypotheses when n is large.

Remark 3: testing for all possible values of r. The partial conjunction null H_0^{r/n} can be meaningfully defined for any 1 ≤ r ≤ n, and sometimes it is of interest to test for all possible values of r, adding another layer of multiplicity. Benjamini and Heller (2008) specifically consider this case, providing an FDR-controlling procedure. A desirable property in this setting is for the rejection set of the PC hypotheses to be monotonically decreasing as r increases. The filtering procedure we describe, applied to control FDR for each value of r separately, does not have this property: the filtering information based on the whole p-value matrix can be different for different

r. Of course, by controlling FDR for each value of r separately, the expected number of false discoveries accumulates when different values of r are considered at the same time; this is not a strategy we would suggest, even aside from the lack of monotonicity. The investigation of powerful strategies in this context is left for future research.

n    r
Table 2: Configurations of n and r in our simulation.

5 Simulations

We benchmark the performance of AdaFilter against (a) straightforward multiple-comparison correction procedures applied to the three forms of PC p-values in Section 2.2 and (b) the empirical Bayes method of Heller and Yekutieli (2014). To implement (b), we used the R package repfdr from the original authors. We consider two dependence structures across the individual-hypothesis p-values within each study: independence (which is assumed in our theoretical results) and local dependence, which is a reasonable assumption in many high-dimensional settings, such as GWAS. We set M = and control PFER at the nominal level α = 1, FDR at the nominal level α = 0.2, and the Bayes FDR at the same level α = 0.2 for repfdr. The Bayes FDR corresponds to the posterior probability of the hypothesis being null given the rejection region, which has been shown to be similar to, but slightly more conservative than, the frequentist FDR under independence (Efron, 2012).

For a given n, there are 2^n combinations of individual hypotheses being null or non-null. We use two parameters to control the probability of each combination: π_0 is the probability of the complete null combination (0, 0, ..., 0) and π_{r/n} is the total probability of the combinations not belonging to H_{0j}^{r/n}. We set π_{r/n} = 0.02 and consider two values for π_0: π_0 = 0.5 or 0.9. All PC null combinations except the complete null have equal probabilities and add up to 1 − π_0 − π_{r/n}. All non-null PC combinations also have equal probabilities. We consider six different configurations of n and r, as listed in Table 2.
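A minimal sketch of generating the configuration matrix under this design (the function name and sampling mechanics are ours; the probability assignment follows the description above):

```python
import numpy as np
from itertools import product

def sample_configs(M, n, r, pi0, pi_alt, seed=0):
    """Sample M true/false configurations of n individual hypotheses:
    probability pi0 on the complete null (0,...,0), total probability
    pi_alt split equally over configurations with >= r non-nulls (the
    PC non-nulls), and the remainder split equally over the other
    PC-null configurations. Returns an M x n 0/1 matrix (1 = non-null)."""
    rng = np.random.default_rng(seed)
    configs = np.array(list(product([0, 1], repeat=n)))
    k = configs.sum(axis=1)
    alt = k >= r                       # configurations violating H_0^{r/n}
    other_null = (~alt) & (k > 0)      # PC-null but not the complete null
    probs = np.zeros(len(configs))
    probs[k == 0] = pi0
    probs[alt] = pi_alt / alt.sum()
    probs[other_null] = (1 - pi0 - pi_alt) / other_null.sum()
    idx = rng.choice(len(configs), size=M, p=probs)
    return configs[idx]
```

For instance, with n = 4, r = 2, π_0 = 0.5 and π_{r/n} = 0.02, roughly half of the sampled rows are the complete null.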
We assume that p-values belonging to different studies are independent and that, within one study, the correlation matrix of the M Z-values is Σ_ρ ⊗ I_{b×b}, where ⊗ denotes the Kronecker product, the block size is b = 100, and Σ_ρ ∈ R^{m×m} has 1s on the diagonal and a common value ρ ≥ 0 off the diagonal. The Z-values then have a distribution with local block dependence, a reasonable assumption for many genetics data sets, such as GWA or eQTL studies. Based on the Z-values, we calculate the two-sided p-value for each individual hypothesis. To generate the signals, when an individual component hypothesis is non-null we sample the mean of its Z-value uniformly and independently from I = {±µ_1, ±µ_2, ±µ_3, ±µ_4}, where the four values {µ_1, µ_2, µ_3, µ_4} correspond to detection powers of 0.02, 0.2, 0.5 and 0.95, respectively (the detection power is calculated assuming that the individual null component hypothesis H_{0ij} is rejected when P_ij ≤ α/M). For each parameter configuration, we run B = 100 random experiments and calculate the average power, number of false discoveries and

false discovery proportions.

Studying PFER control, we compare our AdaFilter Bonferroni procedure with three straightforward procedures that apply the Bonferroni correction to the three forms of the M PC p-values in Section 2.2. Table 3 shows the average PFER and power of the four procedures over the six configurations in Table 2 (more detailed results for each configuration are in Figures B.1 and B.2). All methods successfully control PFER at the nominal level under both the independence and block-dependence cases. However, the straightforward procedures are much more conservative than our AdaFilter Bonferroni procedure, especially when both n and r are large. The gain in power is more pronounced when the complete null fraction is higher, as is expected in many genetics applications.

                           π_0 = 0.5                        π_0 = 0.8
                     ρ = 0          ρ = 0.8           ρ = 0          ρ = 0.8
Method               PFER Recall(%) PFER Recall(%)    PFER Recall(%) PFER Recall(%)
Bonferroni-P_{r/n}^B
Bonferroni-P_{r/n}^F
Bonferroni-P_{r/n}^S
AdaFilter Bonferroni

Table 3: Results for methods targeting a PFER of α = 1. The numbers in the table are averages over all 6 combinations of n and r values, with 2 π_0 values and 2 ρ values (24 combinations in total). Each combination has 100 repeated experiments.

We analyze five procedures that control FDR: our AdaFilter BH procedure, three straightforward procedures in which the BH correction is applied to the three forms of the M PC p-values, and repfdr. Table 4 shows the average FDR and power of the five procedures over the six configurations (more detailed results in Figures B.3 and B.4). Our AdaFilter BH procedure and the three straightforward procedures control FDR at the nominal level. However, as with PFER control, the straightforward approaches are too conservative; our AdaFilter BH procedure has much more power than these methods.
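The AdaFilter Bonferroni computation being compared here can be sketched as follows. The filtering and selection p-values follow formulas (3) and (4); the adaptive threshold search over the observed selection p-values is our plausible reading of the procedure, not the authors' exact implementation:

```python
import numpy as np

def filter_select_pvalues(P, r):
    """Filtering and selection p-values for each column of the n x M
    p-value matrix P: F_j = (n - r + 1) * P_{(r-1) j} (taken as 0 when
    r = 1) and S_j = (n - r + 1) * P_{(r) j}, both capped at 1."""
    P = np.sort(np.asarray(P, dtype=float), axis=0)
    n = P.shape[0]
    F = (n - r + 1) * (P[r - 2] if r >= 2 else np.zeros(P.shape[1]))
    S = (n - r + 1) * P[r - 1]
    return np.minimum(F, 1.0), np.minimum(S, 1.0)

def adafilter_bonferroni(F, S, alpha=1.0):
    """Take the largest threshold gamma (searched over the observed
    selection p-values) with gamma * #{j: F_j <= gamma} <= alpha, and
    reject H_0j^{r/n} whenever S_j <= gamma."""
    F, S = np.asarray(F), np.asarray(S)
    gamma0 = 0.0
    for g in np.sort(np.unique(S)):
        if g * np.sum(F <= g) <= alpha:
            gamma0 = g
    return np.flatnonzero(S <= gamma0)
```

With a 2 × 3 p-value matrix and r = 2, the procedure filters on the smallest p-value per column and selects on the largest.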
The repfdr method fails to consistently control the FDR at its nominal level, especially when n is large and the individual hypotheses are locally correlated within studies. One explanation is that a large n translates into a large number of parameters to be estimated by repfdr, a task that might not be performed satisfactorily. In the cases where repfdr does control FDR, its power is comparable with that of our proposal. However, it is consistently less powerful than AdaFilter BH when π_0 = 0.9 and ρ = 0.8, the scenario closest to a real GWA study, indicating that AdaFilter BH can be more powerful when applied to GWAS.
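The local block-dependence design used in these simulations (equicorrelated blocks of size b with common off-diagonal value ρ) can be simulated with a single shared factor per block. A minimal sketch, with function name and factor construction ours:

```python
import math
import numpy as np

def simulate_pvalues(M=10000, b=100, rho=0.8, mu=0.0, seed=0):
    """Simulate M z-values whose correlation is block-equicorrelated
    (blocks of size b, off-diagonal rho, unit variances) and return
    two-sided p-values. mu is a scalar or length-M vector of means
    (0 = null); M must be divisible by b."""
    rng = np.random.default_rng(seed)
    shared = np.repeat(rng.standard_normal(M // b), b)  # one factor per block
    z = np.sqrt(rho) * shared + np.sqrt(1 - rho) * rng.standard_normal(M) + mu
    # two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    return np.array([math.erfc(abs(v) / math.sqrt(2)) for v in z])
```

Under the complete null (mu = 0) the p-values are marginally uniform; a large mean shifts them toward 0.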

                           π_0 = 0.5                        π_0 = 0.8
                     ρ = 0          ρ = 0.8           ρ = 0          ρ = 0.8
Method               FDR Recall(%)  FDR Recall(%)     FDR Recall(%)  FDR Recall(%)
BH-P_{r/n}^B
BH-P_{r/n}^F
BH-P_{r/n}^S
repfdr
AdaFilter BH

Table 4: Results for methods targeting a nominal FDR of α = 0.2. The numbers in the table are averages over all 6 combinations of n and r values, with 2 π_0 values and 2 ρ values (24 combinations in total). Each combination has 100 repeated experiments.

6 Case studies

Here we use our methods to analyze three data sets: one tests for replication in multiple experiments, and the other two test for consistent signals across different conditions/subgroups of the data.

6.1 Duchenne Muscular Dystrophy microarray studies

We consider a replication analysis based on four publicly available microarray datasets on Duchenne muscular dystrophy (DMD). Affecting about 1 in 3500 newborn males, DMD is the most common form of muscular dystrophy and the most common sex-linked disease in males. Following Kotelnikova et al. (2012), we investigate four NCBI GEO DMD-related microarray datasets (GDS 214, GDS 563, GDS 1956 and GDS 3027; Table 5) that were conducted in different labs and on different platforms with no shared biological samples; each dataset was collected with the purpose of identifying genes that are differentially expressed in DMD cases. For each experiment, the data are preprocessed using RMA (Irizarry et al., 2003), and the individual p-values for each probe set are calculated using Limma (Smyth, 2004) and then rescaled to achieve an approximately uniform empirical null distribution of the p-values. If a gene is represented by two or more probe sets on a chip, we use the Bonferroni rule to combine the probe-specific p-values and obtain a single gene p-value. Study GDS 1956 is generally less powerful than the other studies (Figure B.5). As the four studies used different microarray platforms, not every gene is present in all studies.
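The Bonferroni rule used above to combine probe-level p-values into a gene-level p-value is simply k times the smallest probe p-value, capped at 1. A minimal sketch (function name ours):

```python
def gene_pvalue(probe_pvals):
    """Bonferroni combination of the probe-level p-values for one gene:
    k * min(p), capped at 1, where k is the number of probe sets."""
    k = len(probe_pvals)
    return min(1.0, k * min(probe_pvals))
```

For example, two probes with p-values 0.01 and 0.3 give a gene p-value of 0.02.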
However, applying our AdaFilter BH procedure at level α = 0.05 still leads to the discovery of many consistently differentially expressed genes at r = 2, 3, 4 (Table 5(b)). Table 5(c) shows how AdaFilter BH increases power, allowing rejection of the complete conjunction hypotheses (r = n = 4) even when one of the studies has a relatively high p-value. Specifically, while the third study (GDS 1956) has less power than the others, our method can compensate for this deficiency by leveraging the overall similarity between the results of this study and those of the other studies. The

GEO ID    Platform            Description          Source
GDS 214   custom Affymetrix   4 healthy, 26 DMD    Muscle
GDS 563   Affymetrix U95A     11 healthy, 12 DMD   Quadriceps Muscle
GDS 1956  Affymetrix U133A    18 healthy, 10 DMD   Muscle
GDS 3027  Affymetrix U133A    14 healthy, 23 DMD   Quadriceps Muscle
(a)

r    M    Rejected
(b)

Gene Symbol   GDS 214   GDS 563   GDS 1956   GDS 3027
HLA-DPB1      8.11e     e         e          e-08
MYL4          1.48e     e         e          e-08
PPP1R1A       6.93e     e         e          e-04
SRPX          1.64e     e         e          e-08
DMD                     e         e-10
(c)

Table 5: (a) Information on the GEO datasets used. (b) AdaFilter BH selection results. (c) Some of the selected genes. The first four are genes selected at r = 4 whose individual p-value in study GDS 1956 is above ; the last gene (DMD) is selected at r = 2. FDR is controlled at α = 0.05 for each r.

disease-causal gene DMD is, on the other hand, not significantly differentially expressed in every study, indicating that partial conjunction is preferable to complete conjunction for encouraging more true findings.

6.2 Multi-tissue AGEMAP data

The AGEMAP study (Zahn et al., 2007) investigated age-related gene expression in mice. The experiments comprised ten mice in each of four age groups; each of these 40 mice contributed samples from 16 different tissues, resulting in 618 microarray data sets (some mice are missing in specific tissues). From each microarray, 8932 probes were sampled. We use the robust-regression approach of Wang et al. (2017) to calculate confounder-adjusted p-values of the age effect for each gene and each tissue; this is a simplified version of the LEAPP algorithm, originally used on these data by Sun et al. (2012). Wang et al. (2017) showed that the p-values of the genes are asymptotically independent within each tissue after adjusting for confounders. Correlations of p-values across tissues are considered negligible. We here consider the problem of identifying genes whose expression changes significantly with age across multiple tissues.
Figure 3 gives the results, in terms of the number of significant genes, when PFER is controlled at 1 and r ranges from 2 to 7. For all r values, the adaptive filtering Bonferroni procedure is much more powerful than applying the Bonferroni adjustment to the PC p-values.

Figure 3: The number of genes whose H^{r/n} null hypotheses were rejected by each of the compared procedures (Bonferroni applied to the P_{r/n}^B, P_{r/n}^F and P_{r/n}^S PC p-values, and AdaFilter Bonferroni). Here n = 16 for the 16 tissues, r ranges from 2 to 7, and PFER is controlled at α = 1.

6.3 Metabolites super-pathways GWAS data

Finally, we examine the multi-trait GWAS data from Shin et al. (2014), a comprehensive study of the genetic loci influencing human metabolism. A total of 7824 adult individuals from 2 European populations were recruited for the study, and M = 2,182,555 SNPs were recorded, either directly genotyped or imputed from the HapMap 2 panel. For each sample, the study measured the levels of 333 metabolites, categorized into 8 non-overlapping super-pathways. Due to privacy concerns, only summary statistics are publicly available: these consist of the p-values and test statistics for the null hypotheses of no association between each marker and each of the m = 275 annotated metabolite measurements (only 275 of the 333 annotated metabolites can be downloaded from the original authors' website GWAS/gwas/index.php?task=download). We are interested in discovering SNPs that significantly influence metabolites in multiple super-pathways, as this information gives a more comprehensive view of genetic regulation and benefits downstream analyses, such as the assessment of side effects in drug development. To calculate individual p-values p_ij for each marker j and each super-pathway i, we start with the Z-values Z_sj for the test of association between each metabolite s and marker j, which are given as summary statistics in the original GWAS. For a super-pathway i, let {s_1, s_2, ..., s_{n_i}} be the index

set of metabolite measures that belong to it. We assume that (Z_{s_1 j}, Z_{s_2 j}, ..., Z_{s_{n_i} j}) ∼ N(0, Σ_i), where Σ_i can in principle be accurately estimated, since we have millions of markers and most of the individual hypotheses are null. We estimate Σ_i as the graphical-Lasso-estimated covariance matrix of 2000 randomly sampled SNPs (markers) that lie at least 1 Mbp away from each other, with the tuning parameters selected by cross-validation. The graphical Lasso approach guarantees an accurate sparse inverse covariance matrix estimate, which is needed for computing the p-values for each super-pathway. Let p_ij be the p-value for the association between super-pathway i and marker j, in other words, the p-value for the null H_{0ij}: no metabolite measure of super-pathway i is associated with marker j. Given (Z_{s_1 j}, ..., Z_{s_{n_i} j}) and the estimate Σ̂_i, we calculate the p-values p_ij from the chi-square test, treating the estimated Σ_i as known. These p_ij serve as the individual p-values used in the partial conjunction testing.

Figure 4a shows the estimated correlation across metabolites, assuming (Z_{1j}, Z_{2j}, ..., Z_{mj}) ∼ N(0, Σ) for j = 1, 2, ..., M. We estimate Σ by applying the Minimum Covariance Determinant (MCD) estimator (Rousseeuw and Driessen, 1999) to the 2000 randomly sampled SNPs; MCD is a highly robust method that reduces the influence of the sparse non-null hypotheses. Notice that we choose MCD instead of the graphical Lasso here because we do not need an estimate of the inverse of Σ. It is evident that most of the nonzero correlations are between traits within the same super-pathways: this allows us to apply the adaptive filtering procedures for PC hypotheses across super-pathways with confidence. We can now input these p-values to the multiple-comparison adjustment procedures. Figure 4b compares the number of significant SNPs when FDR is controlled at 0.05 and r ranges from 2 to 5.
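The chi-square computation of p_ij just described can be sketched as follows. The statistic is z' Σ̂_i^{-1} z, compared against a chi-square distribution with n_i degrees of freedom; for simplicity the sketch assumes an even number of metabolites in the super-pathway so that the chi-square tail has a closed form, and the function names are ours:

```python
import math
import numpy as np

def chi2_sf_even(x, k):
    """Survival function of a chi-square with even df k, via the closed
    form exp(-x/2) * sum_{i=0}^{k/2-1} (x/2)^i / i!."""
    assert k % 2 == 0 and k > 0
    term, total = 1.0, 1.0
    for i in range(1, k // 2):
        term *= (x / 2) / i
        total += term
    return math.exp(-x / 2) * total

def pathway_pvalue(z, Sigma):
    """P-value for H_0ij (no metabolite of the super-pathway is associated
    with the marker): z' Sigma^{-1} z ~ chi^2_k under the null, treating
    the estimated Sigma as known."""
    z = np.asarray(z, dtype=float)
    stat = float(z @ np.linalg.solve(np.asarray(Sigma, dtype=float), z))
    return chi2_sf_even(stat, z.size)
```

For instance, two independent standard-normal Z-values each equal to 1 give a statistic of 2 and a p-value of e^{-1} ≈ 0.368.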
Compared with the other four methods, our AdaFilter BH procedure is much more powerful in rejecting PC null hypotheses for any value of r. The repfdr method rejects fewer hypotheses than AdaFilter BH. This is consistent with our simulations, as in this GWAS the fraction of complete nulls π_0 is almost 1 and there are strong local correlations among nearby SNPs. Table 6 shows the selected SNPs for r = 4. Most of the individual p-values are not below the usual genome-wide threshold (10⁻⁸), showing the power gain of our method. The genes that correspond to these SNPs are GCKR, C2orf16, ZNF512, GPN1, SLC17A3, ABO and ABCC1. Looking up the traits that have been found to be affected by these genes using the Phenotype-Genotype Integrator (PheGenI) website, we find that most of these genes are associated with a diverse class of traits, consistent with our results. For example, gene GCKR is a glucokinase regulator and affects 62 different traits, including glucose, cholesterol, obesity, uric acid, calcium and insulin. Gene ABO is related to the ABO blood group system, the first blood group system discovered, and affects 52 traits, including blood coagulation factors, alkaline phosphatase, stroke and coronary artery disease. The protein encoded by gene ABCC1 is a member of the superfamily of ATP-binding cassette (ABC) transporters; this gene is found to affect 4 traits, including metabolism and stroke. Finally, gene SLC17A3 encodes a voltage-driven transporter that excretes intracellular urate and organic anions from the blood into renal tubule cells, and affects uric acid, blood pressure and homocysteine.

(a) Correlation of the metabolite measures. (b) Number of significant SNPs.

Figure 4: Figure 4a shows the correlation of the test statistics of the metabolite measures under the null. The darker the color, the higher the absolute value of the correlation. The metabolite measurements are reordered so that metabolites in the same pathway (Lipid, Cofactors and vitamins, Energy, Nucleotide, Carbohydrate, Xenobiotics, Amino acid, Peptide) are adjacent to each other; the red lines and blue text label the eight super-pathways. Figure 4b compares the results of adaptive-filtering BH with those of applying BH to the PC p-values calculated with the three rules, and with repfdr. r ranges from 2 to 5 and FDR is targeted at α = 0.05.

7 Conclusion

Testing partial conjunction hypotheses provides a framework to assess the replicability of scientific findings across multiple studies. However, simultaneous testing of thousands of partial conjunction hypotheses often results in very low power, limiting the utility of this framework in high-throughput experiments. We introduced here a new multiple testing approach based on adaptive filtering, AdaFilter, which can greatly increase the power of simultaneous testing of PC hypotheses while controlling global error rates. We prove theoretically that appropriate variants of our procedure control FWER and PFER under independence of all individual hypotheses. More importantly, we show through extensive simulations that the AdaFilter procedures control FWER, PFER and FDR both under independence and under weak dependence of the individual hypotheses within each study. One limitation of our approach, however, is that it does require independence of the results across different studies.
The key reason for the low power of previous methods in PC multiple testing is that a null PC hypothesis is a composite hypothesis. To increase power, AdaFilter filters out unlikely candidate PC null hypotheses by establishing conditional validity, the validity of a p-value conditional on sub-


More information

The miss rate for the analysis of gene expression data

The miss rate for the analysis of gene expression data Biostatistics (2005), 6, 1,pp. 111 117 doi: 10.1093/biostatistics/kxh021 The miss rate for the analysis of gene expression data JONATHAN TAYLOR Department of Statistics, Stanford University, Stanford,

More information

Statistical Applications in Genetics and Molecular Biology

Statistical Applications in Genetics and Molecular Biology Statistical Applications in Genetics and Molecular Biology Volume, Issue 006 Article 9 A Method to Increase the Power of Multiple Testing Procedures Through Sample Splitting Daniel Rubin Sandrine Dudoit

More information

arxiv: v1 [math.st] 31 Mar 2009

arxiv: v1 [math.st] 31 Mar 2009 The Annals of Statistics 2009, Vol. 37, No. 2, 619 629 DOI: 10.1214/07-AOS586 c Institute of Mathematical Statistics, 2009 arxiv:0903.5373v1 [math.st] 31 Mar 2009 AN ADAPTIVE STEP-DOWN PROCEDURE WITH PROVEN

More information

Improving the Performance of the FDR Procedure Using an Estimator for the Number of True Null Hypotheses

Improving the Performance of the FDR Procedure Using an Estimator for the Number of True Null Hypotheses Improving the Performance of the FDR Procedure Using an Estimator for the Number of True Null Hypotheses Amit Zeisel, Or Zuk, Eytan Domany W.I.S. June 5, 29 Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving

More information

Empirical Bayes Moderation of Asymptotically Linear Parameters

Empirical Bayes Moderation of Asymptotically Linear Parameters Empirical Bayes Moderation of Asymptotically Linear Parameters Nima Hejazi Division of Biostatistics University of California, Berkeley stat.berkeley.edu/~nhejazi nimahejazi.org twitter/@nshejazi github/nhejazi

More information

Statistical testing. Samantha Kleinberg. October 20, 2009

Statistical testing. Samantha Kleinberg. October 20, 2009 October 20, 2009 Intro to significance testing Significance testing and bioinformatics Gene expression: Frequently have microarray data for some group of subjects with/without the disease. Want to find

More information

Family-wise Error Rate Control in QTL Mapping and Gene Ontology Graphs

Family-wise Error Rate Control in QTL Mapping and Gene Ontology Graphs Family-wise Error Rate Control in QTL Mapping and Gene Ontology Graphs with Remarks on Family Selection Dissertation Defense April 5, 204 Contents Dissertation Defense Introduction 2 FWER Control within

More information

Statistical Methods for Analysis of Genetic Data

Statistical Methods for Analysis of Genetic Data Statistical Methods for Analysis of Genetic Data Christopher R. Cabanski A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in partial fulfillment of the requirements

More information

Model-Free Knockoffs: High-Dimensional Variable Selection that Controls the False Discovery Rate

Model-Free Knockoffs: High-Dimensional Variable Selection that Controls the False Discovery Rate Model-Free Knockoffs: High-Dimensional Variable Selection that Controls the False Discovery Rate Lucas Janson, Stanford Department of Statistics WADAPT Workshop, NIPS, December 2016 Collaborators: Emmanuel

More information

Research Article Sample Size Calculation for Controlling False Discovery Proportion

Research Article Sample Size Calculation for Controlling False Discovery Proportion Probability and Statistics Volume 2012, Article ID 817948, 13 pages doi:10.1155/2012/817948 Research Article Sample Size Calculation for Controlling False Discovery Proportion Shulian Shang, 1 Qianhe Zhou,

More information

Lesson 11. Functional Genomics I: Microarray Analysis

Lesson 11. Functional Genomics I: Microarray Analysis Lesson 11 Functional Genomics I: Microarray Analysis Transcription of DNA and translation of RNA vary with biological conditions 3 kinds of microarray platforms Spotted Array - 2 color - Pat Brown (Stanford)

More information

Association studies and regression

Association studies and regression Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration

More information

Selecting explanatory variables with the modified version of Bayesian Information Criterion

Selecting explanatory variables with the modified version of Bayesian Information Criterion Selecting explanatory variables with the modified version of Bayesian Information Criterion Institute of Mathematics and Computer Science, Wrocław University of Technology, Poland in cooperation with J.K.Ghosh,

More information

25 : Graphical induced structured input/output models

25 : Graphical induced structured input/output models 10-708: Probabilistic Graphical Models 10-708, Spring 2016 25 : Graphical induced structured input/output models Lecturer: Eric P. Xing Scribes: Raied Aljadaany, Shi Zong, Chenchen Zhu Disclaimer: A large

More information

Lecture 21: October 19

Lecture 21: October 19 36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 21: October 19 21.1 Likelihood Ratio Test (LRT) To test composite versus composite hypotheses the general method is to use

More information

Procedures controlling generalized false discovery rate

Procedures controlling generalized false discovery rate rocedures controlling generalized false discovery rate By SANAT K. SARKAR Department of Statistics, Temple University, hiladelphia, A 922, U.S.A. sanat@temple.edu AND WENGE GUO Department of Environmental

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2004 Paper 147 Multiple Testing Methods For ChIP-Chip High Density Oligonucleotide Array Data Sunduz

More information

Sanat Sarkar Department of Statistics, Temple University Philadelphia, PA 19122, U.S.A. September 11, Abstract

Sanat Sarkar Department of Statistics, Temple University Philadelphia, PA 19122, U.S.A. September 11, Abstract Adaptive Controls of FWER and FDR Under Block Dependence arxiv:1611.03155v1 [stat.me] 10 Nov 2016 Wenge Guo Department of Mathematical Sciences New Jersey Institute of Technology Newark, NJ 07102, U.S.A.

More information

CS 4491/CS 7990 SPECIAL TOPICS IN BIOINFORMATICS

CS 4491/CS 7990 SPECIAL TOPICS IN BIOINFORMATICS CS 4491/CS 7990 SPECIAL TOPICS IN BIOINFORMATICS * Some contents are adapted from Dr. Hung Huang and Dr. Chengkai Li at UT Arlington Mingon Kang, Ph.D. Computer Science, Kennesaw State University Problems

More information

Hunting for significance with multiple testing

Hunting for significance with multiple testing Hunting for significance with multiple testing Etienne Roquain 1 1 Laboratory LPMA, Université Pierre et Marie Curie (Paris 6), France Séminaire MODAL X, 19 mai 216 Etienne Roquain Hunting for significance

More information

Lecture 7: Interaction Analysis. Summer Institute in Statistical Genetics 2017

Lecture 7: Interaction Analysis. Summer Institute in Statistical Genetics 2017 Lecture 7: Interaction Analysis Timothy Thornton and Michael Wu Summer Institute in Statistical Genetics 2017 1 / 39 Lecture Outline Beyond main SNP effects Introduction to Concept of Statistical Interaction

More information

Linear Regression (1/1/17)

Linear Regression (1/1/17) STA613/CBB540: Statistical methods in computational biology Linear Regression (1/1/17) Lecturer: Barbara Engelhardt Scribe: Ethan Hada 1. Linear regression 1.1. Linear regression basics. Linear regression

More information

A Large-Sample Approach to Controlling the False Discovery Rate

A Large-Sample Approach to Controlling the False Discovery Rate A Large-Sample Approach to Controlling the False Discovery Rate Christopher R. Genovese Department of Statistics Carnegie Mellon University Larry Wasserman Department of Statistics Carnegie Mellon University

More information

Biostatistics Advanced Methods in Biostatistics IV

Biostatistics Advanced Methods in Biostatistics IV Biostatistics 140.754 Advanced Methods in Biostatistics IV Jeffrey Leek Assistant Professor Department of Biostatistics jleek@jhsph.edu Lecture 11 1 / 44 Tip + Paper Tip: Two today: (1) Graduate school

More information

Computational Systems Biology: Biology X

Computational Systems Biology: Biology X Bud Mishra Room 1002, 715 Broadway, Courant Institute, NYU, New York, USA L#7:(Mar-23-2010) Genome Wide Association Studies 1 The law of causality... is a relic of a bygone age, surviving, like the monarchy,

More information

Bayesian Regression (1/31/13)

Bayesian Regression (1/31/13) STA613/CBB540: Statistical methods in computational biology Bayesian Regression (1/31/13) Lecturer: Barbara Engelhardt Scribe: Amanda Lea 1 Bayesian Paradigm Bayesian methods ask: given that I have observed

More information

Familywise Error Rate Controlling Procedures for Discrete Data

Familywise Error Rate Controlling Procedures for Discrete Data Familywise Error Rate Controlling Procedures for Discrete Data arxiv:1711.08147v1 [stat.me] 22 Nov 2017 Yalin Zhu Center for Mathematical Sciences, Merck & Co., Inc., West Point, PA, U.S.A. Wenge Guo Department

More information

The International Journal of Biostatistics

The International Journal of Biostatistics The International Journal of Biostatistics Volume 7, Issue 1 2011 Article 12 Consonance and the Closure Method in Multiple Testing Joseph P. Romano, Stanford University Azeem Shaikh, University of Chicago

More information

ON STEPWISE CONTROL OF THE GENERALIZED FAMILYWISE ERROR RATE. By Wenge Guo and M. Bhaskara Rao

ON STEPWISE CONTROL OF THE GENERALIZED FAMILYWISE ERROR RATE. By Wenge Guo and M. Bhaskara Rao ON STEPWISE CONTROL OF THE GENERALIZED FAMILYWISE ERROR RATE By Wenge Guo and M. Bhaskara Rao National Institute of Environmental Health Sciences and University of Cincinnati A classical approach for dealing

More information

25 : Graphical induced structured input/output models

25 : Graphical induced structured input/output models 10-708: Probabilistic Graphical Models 10-708, Spring 2013 25 : Graphical induced structured input/output models Lecturer: Eric P. Xing Scribes: Meghana Kshirsagar (mkshirsa), Yiwen Chen (yiwenche) 1 Graph

More information

The Simplex Method: An Example

The Simplex Method: An Example The Simplex Method: An Example Our first step is to introduce one more new variable, which we denote by z. The variable z is define to be equal to 4x 1 +3x 2. Doing this will allow us to have a unified

More information

Estimation of a Two-component Mixture Model

Estimation of a Two-component Mixture Model Estimation of a Two-component Mixture Model Bodhisattva Sen 1,2 University of Cambridge, Cambridge, UK Columbia University, New York, USA Indian Statistical Institute, Kolkata, India 6 August, 2012 1 Joint

More information

STAT 263/363: Experimental Design Winter 2016/17. Lecture 1 January 9. Why perform Design of Experiments (DOE)? There are at least two reasons:

STAT 263/363: Experimental Design Winter 2016/17. Lecture 1 January 9. Why perform Design of Experiments (DOE)? There are at least two reasons: STAT 263/363: Experimental Design Winter 206/7 Lecture January 9 Lecturer: Minyong Lee Scribe: Zachary del Rosario. Design of Experiments Why perform Design of Experiments (DOE)? There are at least two

More information

Lecture 28. Ingo Ruczinski. December 3, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University

Lecture 28. Ingo Ruczinski. December 3, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University Lecture 28 Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University December 3, 2015 1 2 3 4 5 1 Familywise error rates 2 procedure 3 Performance of with multiple

More information

Genotype Imputation. Biostatistics 666

Genotype Imputation. Biostatistics 666 Genotype Imputation Biostatistics 666 Previously Hidden Markov Models for Relative Pairs Linkage analysis using affected sibling pairs Estimation of pairwise relationships Identity-by-Descent Relatives

More information

arxiv: v2 [stat.me] 9 Aug 2018

arxiv: v2 [stat.me] 9 Aug 2018 Submitted to the Annals of Applied Statistics arxiv: arxiv:1706.09375 MULTILAYER KNOCKOFF FILTER: CONTROLLED VARIABLE SELECTION AT MULTIPLE RESOLUTIONS arxiv:1706.09375v2 stat.me 9 Aug 2018 By Eugene Katsevich,

More information

Resampling-Based Control of the FDR

Resampling-Based Control of the FDR Resampling-Based Control of the FDR Joseph P. Romano 1 Azeem S. Shaikh 2 and Michael Wolf 3 1 Departments of Economics and Statistics Stanford University 2 Department of Economics University of Chicago

More information

Semi-Penalized Inference with Direct FDR Control

Semi-Penalized Inference with Direct FDR Control Jian Huang University of Iowa April 4, 2016 The problem Consider the linear regression model y = p x jβ j + ε, (1) j=1 where y IR n, x j IR n, ε IR n, and β j is the jth regression coefficient, Here p

More information

Multiple Comparison Methods for Means

Multiple Comparison Methods for Means SIAM REVIEW Vol. 44, No. 2, pp. 259 278 c 2002 Society for Industrial and Applied Mathematics Multiple Comparison Methods for Means John A. Rafter Martha L. Abell James P. Braselton Abstract. Multiple

More information

Multiple Hypothesis Testing in Microarray Data Analysis

Multiple Hypothesis Testing in Microarray Data Analysis Multiple Hypothesis Testing in Microarray Data Analysis Sandrine Dudoit jointly with Mark van der Laan and Katie Pollard Division of Biostatistics, UC Berkeley www.stat.berkeley.edu/~sandrine Short Course:

More information

Statistical Applications in Genetics and Molecular Biology

Statistical Applications in Genetics and Molecular Biology Statistical Applications in Genetics and Molecular Biology Volume 3, Issue 1 2004 Article 13 Multiple Testing. Part I. Single-Step Procedures for Control of General Type I Error Rates Sandrine Dudoit Mark

More information

More powerful control of the false discovery rate under dependence

More powerful control of the false discovery rate under dependence Statistical Methods & Applications (2006) 15: 43 73 DOI 10.1007/s10260-006-0002-z ORIGINAL ARTICLE Alessio Farcomeni More powerful control of the false discovery rate under dependence Accepted: 10 November

More information

Estimation and Confidence Sets For Sparse Normal Mixtures

Estimation and Confidence Sets For Sparse Normal Mixtures Estimation and Confidence Sets For Sparse Normal Mixtures T. Tony Cai 1, Jiashun Jin 2 and Mark G. Low 1 Abstract For high dimensional statistical models, researchers have begun to focus on situations

More information

On Methods Controlling the False Discovery Rate 1

On Methods Controlling the False Discovery Rate 1 Sankhyā : The Indian Journal of Statistics 2008, Volume 70-A, Part 2, pp. 135-168 c 2008, Indian Statistical Institute On Methods Controlling the False Discovery Rate 1 Sanat K. Sarkar Temple University,

More information

Multiple Change-Point Detection and Analysis of Chromosome Copy Number Variations

Multiple Change-Point Detection and Analysis of Chromosome Copy Number Variations Multiple Change-Point Detection and Analysis of Chromosome Copy Number Variations Yale School of Public Health Joint work with Ning Hao, Yue S. Niu presented @Tsinghua University Outline 1 The Problem

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Gene Expression an Overview of Problems & Solutions: 3&4. Utah State University Bioinformatics: Problems and Solutions Summer 2006

Gene Expression an Overview of Problems & Solutions: 3&4. Utah State University Bioinformatics: Problems and Solutions Summer 2006 Gene Expression an Overview of Problems & Solutions: 3&4 Utah State University Bioinformatics: Problems and Solutions Summer 006 Review Considering several problems & solutions with gene expression data

More information

Rejoinder on: Control of the false discovery rate under dependence using the bootstrap and subsampling

Rejoinder on: Control of the false discovery rate under dependence using the bootstrap and subsampling Test (2008) 17: 461 471 DOI 10.1007/s11749-008-0134-6 DISCUSSION Rejoinder on: Control of the false discovery rate under dependence using the bootstrap and subsampling Joseph P. Romano Azeem M. Shaikh

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2004 Paper 164 Multiple Testing Procedures: R multtest Package and Applications to Genomics Katherine

More information

Association Testing with Quantitative Traits: Common and Rare Variants. Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5

Association Testing with Quantitative Traits: Common and Rare Variants. Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5 Association Testing with Quantitative Traits: Common and Rare Variants Timothy Thornton and Katie Kerr Summer Institute in Statistical Genetics 2014 Module 10 Lecture 5 1 / 41 Introduction to Quantitative

More information

McGill University. Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II. Final Examination

McGill University. Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II. Final Examination McGill University Faculty of Science MATH 204 PRINCIPLES OF STATISTICS II Final Examination Date: 20th April 2009 Time: 9am-2pm Examiner: Dr David A Stephens Associate Examiner: Dr Russell Steele Please

More information

BTRY 4830/6830: Quantitative Genomics and Genetics

BTRY 4830/6830: Quantitative Genomics and Genetics BTRY 4830/6830: Quantitative Genomics and Genetics Lecture 23: Alternative tests in GWAS / (Brief) Introduction to Bayesian Inference Jason Mezey jgm45@cornell.edu Nov. 13, 2014 (Th) 8:40-9:55 Announcements

More information

Selection-adjusted estimation of effect sizes

Selection-adjusted estimation of effect sizes Selection-adjusted estimation of effect sizes with an application in eqtl studies Snigdha Panigrahi 19 October, 2017 Stanford University Selective inference - introduction Selective inference Statistical

More information

Structure learning in human causal induction

Structure learning in human causal induction Structure learning in human causal induction Joshua B. Tenenbaum & Thomas L. Griffiths Department of Psychology Stanford University, Stanford, CA 94305 jbt,gruffydd @psych.stanford.edu Abstract We use

More information

A GENERAL DECISION THEORETIC FORMULATION OF PROCEDURES CONTROLLING FDR AND FNR FROM A BAYESIAN PERSPECTIVE

A GENERAL DECISION THEORETIC FORMULATION OF PROCEDURES CONTROLLING FDR AND FNR FROM A BAYESIAN PERSPECTIVE A GENERAL DECISION THEORETIC FORMULATION OF PROCEDURES CONTROLLING FDR AND FNR FROM A BAYESIAN PERSPECTIVE Sanat K. Sarkar 1, Tianhui Zhou and Debashis Ghosh Temple University, Wyeth Pharmaceuticals and

More information

Estimating False Discovery Proportion Under Arbitrary Covariance Dependence

Estimating False Discovery Proportion Under Arbitrary Covariance Dependence Estimating False Discovery Proportion Under Arbitrary Covariance Dependence arxiv:1010.6056v2 [stat.me] 15 Nov 2011 Jianqing Fan, Xu Han and Weijie Gu May 31, 2018 Abstract Multiple hypothesis testing

More information

A TUTORIAL ON THE INHERITANCE PROCEDURE FOR MULTIPLE TESTING OF TREE-STRUCTURED HYPOTHESES

A TUTORIAL ON THE INHERITANCE PROCEDURE FOR MULTIPLE TESTING OF TREE-STRUCTURED HYPOTHESES A TUTORIAL ON THE INHERITANCE PROCEDURE FOR MULTIPLE TESTING OF TREE-STRUCTURED HYPOTHESES by Dilinuer Kuerban B.Sc. (Statistics), Southwestern University of Finance & Economics, 2011 a Project submitted

More information