Resampling-based Multiple Testing with Applications to Microarray Data Analysis


Resampling-based Multiple Testing with Applications to Microarray Data Analysis

DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By

Dongmei Li, B.A., M.S.

The Ohio State University
2009

Dissertation Committee:
Dr. Jason C. Hsu, Adviser
Dr. Elizabeth Stasny
Dr. William Notz
Dr. Steve MacEachern

Approved by: Adviser, Graduate Program in Biostatistics, The Ohio State University

© Copyright by Dongmei Li 2009

ABSTRACT

In microarray data analysis, resampling methods are widely used to discover significantly differentially expressed genes under different biological conditions when the distributions of test statistics are unknown. When the sample size is small, however, the simultaneous testing of thousands, or even millions, of null hypotheses in microarray data analysis poses challenges for the multiple hypothesis testing field. We study the small-sample behavior of three commonly used resampling methods in multiple hypothesis testing: permutation tests, post-pivot resampling methods, and pre-pivot resampling methods. We show that the model-based pre-pivot resampling methods have the largest maximum number of unique resampled test statistic values, and therefore tend to produce more reliable P-values than the other two resampling methods. To avoid problems with the application of the three resampling methods in practice, we propose new conditions, based on the Partitioning Principle, for controlling the multiple testing error rates in fixed-effects general linear models. Using both theoretical results and simulation studies, we demonstrate discrepancies between the true expected values of order statistics and the expected values of order statistics estimated by permutation in the Significance Analysis of Microarrays (SAM) procedure. Moreover, we derive conditions for the permutation-based SAM procedure to control the expected number of false rejections. We also propose a more powerful adaptive two-step procedure that controls the expected number of false rejections with larger critical values than the Bonferroni procedure.

This is dedicated to my dear husband Zidian Xie, my cute daughter Catherine Xie, my cute son Matthew Xie, and my dear parents.

ACKNOWLEDGMENTS

I would like to express my heartfelt gratitude to my advisor, Professor Jason C. Hsu, for his encouragement, constant guidance, and extreme patience. Without his advice, it would have been impossible for me to finish this dissertation. A special thanks goes to Professor Elizabeth Stasny, Graduate Studies Chair in Statistics, who carefully proofread my papers and gave me a great deal of help during my Ph.D. study. I would also like to thank my other committee members, Professor William Notz and Professor Steve MacEachern, for their thoughtful questions and advice. I am enormously grateful to my parents, my husband, and my kids for their support and love, especially my husband Zidian Xie, who always supports me whenever I need him.

VITA

B.A. Pomology, Laiyang Agriculture College, China
M.S. Biophysics, China Agriculture University, China
M.S. Statistics, The Ohio State University, U.S.A.
–present: Graduate Teaching and Research Associate, The Ohio State University

PUBLICATIONS

Research Publications

Violeta Calian, Dongmei Li, and Jason C. Hsu. Partitioning to Uncover Conditions for Permutation Tests to Control Multiple Testing Error Rates. Biometrical Journal, 50(5), DOI: /bimj

FIELDS OF STUDY

Major Field: Biostatistics

TABLE OF CONTENTS

Abstract
Dedication
Acknowledgments
Vita
List of Tables
List of Figures

Chapters:

1. Multiple hypotheses testing and resampling methods
   1.1 Multiple hypotheses testing: Introduction; Two definitions of Type I error rate; Familywise Error Rate (FWER); False Discovery Rate (FDR); Multiple testing principles
   1.2 Resampling methods: Permutation tests; Bootstrap methods

2. Small sample behavior of resampling methods
   2.1 Tomato microarray example
   2.2 Conditions for getting adjusted P-values of zero using the post-pivot resampling method: with a sample size of two; with a sample size of three
   2.3 Conditions for getting adjusted P-values of zero using the pre-pivot resampling method
   2.4 Discreteness of resampled test statistic distributions: Paired samples; Two independent samples; Multiple independent samples; General linear mixed-effects models

3. Conditions for resampling methods to control multiple testing error rates
   3.1 Two-group comparison: Permutation tests; Post-pivot resampling method; Pre-pivot resampling method
   3.2 Fixed-effects general linear model: Estimating the test statistic's null distribution (permutation tests; pre-pivot resampling method; post-pivot resampling method); Estimating critical values for strong control of FWER (permutation tests; pre-pivot resampling method; post-pivot resampling method); Shortcuts of partitioning tests using resampling methods (permutation tests; pre-pivot resampling method; post-pivot resampling method)

4. Conditions for Significance Analysis of Microarrays (SAM) to control the empirical FDR
   4.1 Introduction to the Significance Analysis of Microarrays (SAM) method
   4.2 Discrepancies between true expected values of order statistics and expected values estimated by permutation: Effect of unequal variance-covariance matrices and sample sizes; Effect of higher order cumulants with equal sample sizes
   4.3 Conditions for controlling the expected number of false rejections in SAM
   4.4 An adaptive two-step procedure controlling the expected number of false rejections
   4.5 Discussion

5. Concluding remarks

References

LIST OF TABLES

1.1 Summary of possible outcomes from testing k null hypotheses
2.1 Adjusted P-values calculated from formula (2.1) for the permutation test, post-pivot resampling method, and pre-pivot resampling method
2.2 Maximum number of unique resampled test statistic values for the permutation test, post-pivot resampling method, and pre-pivot resampling method

LIST OF FIGURES

2.1 Null distribution of max_{i=1,2,3} |T_i| for k = 3 and n = 3. Observed test statistics and resampled test statistics from the permutation test, post-pivot resampling, and pre-pivot resampling methods.
4.1 Q-Q plot of the true expected values of order statistics against the expected values estimated by permutation for unequal variances and sample sizes. The dashed line in the Q-Q plot is the 45-degree diagonal line.
4.2 Q-Q plot of the true expected values of order statistics against the expected values estimated by permutation for unequal correlations and sample sizes. The dashed line in the Q-Q plot is the 45-degree diagonal line.
4.3 Q-Q plot of the true expected values of order statistics against the expected values estimated by permutation for unequal skewness. The dashed line in the Q-Q plot is the 45-degree diagonal line.
4.4 Q-Q plot of the true expected values of order statistics against the expected values estimated by permutation for unequal third-order cross cumulants. The dashed line in the Q-Q plot is the 45-degree diagonal line.

CHAPTER 1

MULTIPLE HYPOTHESES TESTING AND RESAMPLING METHODS

1.1 Multiple hypotheses testing

1.1.1 Introduction

With the rapid development of biotechnology, microarray technology has become widely used in biomedical and biological fields to identify differentially expressed genes and transcription factor binding sites, and to map complex traits using single nucleotide polymorphisms (SNPs) (Kulesh et al. (1987), Schena et al. (1995), Lashkari et al. (1997), Pollack et al. (1999), Buck and Lieb (2004), Mei et al. (2000), Hehir-Kwa et al. (2007)). Having thousands, or even millions, of genes on a small array makes multiple comparisons a central topic in today's statistics field, because thousands, or even millions, of hypotheses need to be tested simultaneously.

Without multiplicity adjustment, if each hypothesis is tested at level α, the probability of rejecting at least one true null hypothesis increases dramatically as the number of hypotheses grows. If, for example, 20 hypotheses are tested simultaneously and each hypothesis is tested at 5%, the probability of rejecting at least one true null hypothesis is 1 − (1 − 0.05)^20 ≈ 64%, assuming all the test statistics are independent. Therefore, in order to make the multiplicity adjustment, a multiple hypotheses testing procedure needs to control a certain type of error rate at a level α. A popular multiple testing error rate controlled by many multiple hypotheses testing procedures is the familywise error rate (FWER) (Hochberg and Tamhane (1987), Shaffer (1995)), which is defined as the probability of at least one false rejection. Another, less stringent, multiple testing error rate commonly used is the false discovery rate (FDR) (Benjamini and Hochberg (1995)), which is defined as the expected proportion of falsely rejected null hypotheses among all rejections.

1.1.2 Two definitions of Type I error rate

Suppose k genes are probed to compare expression levels between high risk and low risk patients. Let μ_Hi, μ_Li, i = 1,..., k, denote the expected (logarithms of) expression levels of the ith gene of a randomly sampled patient from the high risk and low risk groups, respectively. Let θ_i = μ_Hi − μ_Li denote the difference of expected (logarithm of) expression levels of the ith gene between the high risk group and the low risk group. To determine which of the genes are differentially expressed in expectation between the high risk and low risk patients, we need to test the following null hypotheses:

H_0i: θ_i = 0, i = 1,..., k. (1.1)

There are two different ways to define the Type I error rate when testing a single null hypothesis. Let θ = (θ_1, θ_2,..., θ_k), and let Σ denote generically all nuisance parameters that the observed expression levels depend on, such as the covariance of the expression levels within each of the high risk and low risk groups. Let θ^0 = (θ^0_1,..., θ^0_k) and Σ^0 be the collection of all (unknown) true parameter values. A traditional definition of the Type I error rate, given by Casella and Berger (1990) or Berger (1993), is

sup_{θ, Σ: θ_i = 0} P_{θ, Σ}{Reject H_0i},

where the supremum is taken over all possible θ and Σ subject to θ_i = 0. Another definition of the Type I error rate, given by Pollard and van der Laan (2005), is

P_{θ^0, Σ^0}{Reject H_0i}, where θ^0_i = 0,

and θ^0 = (θ^0_1,..., θ^0_k) and Σ^0 are the (unknown) true parameter values. The first definition of the Type I error rate is more widely used than the second. The second definition can only be controlled asymptotically, since the true parameter values are unknown in microarray data analysis.

1.1.3 Familywise Error Rate (FWER)

When we test k null hypotheses simultaneously, the possible outcomes are summarized in Table 1.1.

Table 1.1: Summary of possible outcomes from testing k null hypotheses

                           Number not rejected   Number rejected   Total
True null hypotheses       U                     V                 k_0
Non-true null hypotheses   T                     S                 k − k_0
Total                      k − R                 R                 k

In Table 1.1, V denotes the number of incorrectly rejected true null hypotheses when testing k null hypotheses; R denotes the number of hypotheses rejected among those k null hypotheses; k_0 denotes the number of true null hypotheses; and k − k_0 denotes the number of false null hypotheses.

FWER is defined as the probability of rejecting at least one true null hypothesis (at least one false rejection):

FWER = P{V ≥ 1}. (1.2)

There are two kinds of control of FWER. One is strong control of FWER, which controls the probability of at least one false rejection under any combination of true and false null hypotheses (it controls the supremum). The other is weak control of FWER, which controls the probability of at least one false rejection only under the complete null hypothesis H^C_0: ∩_{i=1}^{k} H_0i with k_0 = k (Westfall and Young (1993), Lehmann and Romano (2005)). In microarray experiments, since it is rare that no gene is differentially expressed, controlling FWER strongly is more appropriate than controlling it weakly. Strong control of FWER is desired to minimize the number of false rejections in some cases, such as selecting genes to build diagnostic or prognostic chips for diseases. An example is the MammaPrint chip developed by Agendia, which is based on the well-known Amsterdam 70-gene breast cancer gene signature (van 't Veer et al. (2002), van de Vijver et al. (2002), Buyse et al. (2006), Glas et al. (2006)). MammaPrint is used to predict whether existing breast cancer will metastasize (spread to other parts of a patient's body). The multiple testing procedure proposed by Pollard and van der Laan (2005) has strong asymptotic control of FWER: it controls an error rate α_n for a sample of size n, with limsup_{n→∞} α_n ≤ α under the true data generating distribution as the sample size n goes to infinity.
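As a quick numerical check on the motivating example in Section 1.1.1 (this sketch is illustrative and not part of the dissertation's methods), the 64% figure follows from the closed form 1 − (1 − α)^k for k independent level-α tests, and a small Monte Carlo simulation under the complete null reproduces it; the Bonferroni adjustment α/k brings the FWER back below α:

```python
import random

alpha, k = 0.05, 20

# Closed form: with k independent level-alpha tests,
# P(at least one false rejection) = 1 - (1 - alpha)^k
fwer_exact = 1 - (1 - alpha) ** k
print(round(fwer_exact, 3))  # 0.642 -- the "64%" quoted in the text

# Monte Carlo check: k independent uniform p-values under the complete null,
# counting trials where at least one p-value falls below alpha
random.seed(0)
trials = 20_000
hits = sum(
    any(random.random() < alpha for _ in range(k))
    for _ in range(trials)
)
fwer_mc = hits / trials

# Bonferroni: testing each hypothesis at alpha/k keeps FWER at or below alpha
fwer_bonf = 1 - (1 - alpha / k) ** k
```

The Monte Carlo estimate agrees with the closed form to within simulation error, which is the uniformity argument behind the multiplicity adjustments discussed in the rest of this chapter.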

1.1.4 False Discovery Rate (FDR)

The concept of the false discovery rate (FDR) was first proposed by Benjamini and Hochberg (1995) to reduce the stringency of strong FWER control. FDR is more widely used than FWER in bioinformatics studies because investigators are more interested in finding all potentially differentially expressed genes, even if some genes could be falsely identified (Benjamini and Yekutieli (2001), Storey (2002), Storey and Tibshirani (2003b), Storey and Tibshirani (2003a), Benjamini et al. (2006), Strimmer (2008)). FDR is defined as the expected proportion of erroneously rejected null hypotheses among all rejected null hypotheses:

FDR = E(V/R | R > 0) Pr(R > 0).

Benjamini and Hochberg (1995) also presented four alternative formulations of FDR:

(1) Positive FDR: pFDR = E(V/R | R > 0). The pFDR is recommended by Storey (2002), who argued that pFDR is a more appropriate error measure to use than FDR.

(2) Conditional FDR: cFDR = E(V/R | R = r), where r is the observed number of rejected null hypotheses.

(3) Marginal FDR: mFDR = E(V)/E(R).

(4) Empirical FDR: Fdr = E(V)/r.

Benjamini and Hochberg (1995) argued that none of these four FDRs can be controlled when all null hypotheses are true (k_0 = k): if k_0 = k and even a single null hypothesis is rejected, then V/R = 1, so FDR cannot be controlled. Controlling pFDR, cFDR, mFDR, and Fdr runs into the same problem; they are identically 1 when k_0 = k. Tsai et al. (2003) showed that pFDR, cFDR, and mFDR are equivalent under the Bayesian framework, in which the number of true null hypotheses is modeled as a random variable. The Significance Analysis of Microarrays (SAM) method, which will be discussed in Chapter 4, estimates the empirical FDR.

1.1.5 Multiple testing principles

A general principle of multiple testing is the Partitioning Principle, proposed by Stefansson et al. (1988) and further refined by Finner and Strassburger (2002). Both Holm (1979)'s step-down method and Hochberg (1988)'s step-up method are special cases of partition testing (Huang and Hsu (2007)). The principle of partition testing is to partition the parameter space into disjoint subspaces, test each partitioning null hypothesis at level α, and collate the results across the subspaces, as follows. Let P = {1,..., k}, and consider testing H_0i: θ_i = 0, i = 1,..., k. To control FWER strongly, the Partitioning Principle states:

P1: For each I ⊆ {1,..., k}, I ≠ ∅, form H*_0I: θ_i = 0 for all i ∈ I and θ_j ≠ 0 for all j ∉ I. In total, there are 2^k parameter subspaces and 2^k − 1 null hypotheses to be tested.

P2: Test each H*_0I at level α. Since all the null hypotheses are disjoint, at most one null hypothesis is true. Therefore, no multiplicity adjustment is required for the H*_0I.

P3: For each i, infer θ_i ≠ 0 if and only if all H*_0I with i ∈ I are rejected, since H_0i is the union of the H*_0I with i ∈ I.

Taking k = 3 as an example, the parameter space Θ = {θ_1, θ_2, θ_3} is partitioned into eight disjoint subspaces:

Θ_1 = {θ_1 = 0 and θ_2 = 0 and θ_3 = 0}
Θ_2 = {θ_1 = 0 and θ_2 = 0 and θ_3 ≠ 0}
Θ_3 = {θ_1 = 0 and θ_2 ≠ 0 and θ_3 = 0}
...
Θ_7 = {θ_1 ≠ 0 and θ_2 ≠ 0 and θ_3 = 0}
Θ_8 = {θ_1 ≠ 0 and θ_2 ≠ 0 and θ_3 ≠ 0}

Next, we test each of the following H*_0I at level α:

H*_0{123}: θ_1 = 0 and θ_2 = 0 and θ_3 = 0
H*_0{12}: θ_1 = 0 and θ_2 = 0 and θ_3 ≠ 0
H*_0{13}: θ_1 = 0 and θ_2 ≠ 0 and θ_3 = 0
...
H*_0{2}: θ_1 ≠ 0 and θ_2 = 0 and θ_3 ≠ 0
H*_0{3}: θ_1 ≠ 0 and θ_2 ≠ 0 and θ_3 = 0

Finally, infer θ_i ≠ 0 if and only if all H*_0I involving θ_i = 0 are rejected.

Another multiple testing principle, similar to the Partitioning Principle, is the closed testing principle (Marcus et al. (1976)). The closed testing principle states:

C1: For each I ⊆ {1,..., k}, form the intersection null hypothesis H_0I: θ_i = 0 for all i ∈ I.

C2: Test each H_0I at level α.

C3: For each i, infer θ_i ≠ 0 if and only if all H_0I with i ∈ I are rejected.

Compared to the partition testing procedure, the closed testing procedure tests less restrictive hypotheses. However, the closed testing procedure still controls FWER strongly, because a level-α test for H_0I is also a level-α test for H*_0I.

To test H_0i: θ_i = 0 (i = 1,..., k) using the test statistics T_i = |θ̂_i| (i = 1,..., k), we would test 2^k − 1 null hypotheses in accordance with the Partitioning Principle. A typical partitioning null hypothesis is

H*_0{12...t}: θ_1 = 0 and ... and θ_t = 0 and θ_{t+1} ≠ 0 and ... and θ_k ≠ 0 (1 ≤ t ≤ k).

The above null hypothesis can be simplified to

H_0{12...t}: θ_1 = 0 and θ_2 = 0 and ... and θ_t = 0 (1 ≤ t ≤ k)

according to the closed testing principle. The simplified test still controls FWER strongly, because a level-α test for H_0{12...t} is also a level-α test for H*_0{12...t}. The test statistic for testing H_0{12...t} is max_{i=1,...,t} T_i = max_{i=1,...,t} |θ̂_i|, because H_0{12...t} is tested with a union-intersection test (Casella and Berger (1990)), and the rejection region for a union-intersection test is ∪_{i∈{1,...,t}} {T_i > c} = {max_{i=1,...,t} T_i > c}, where c is the critical value for testing H_0{12...t}.

1.2 Resampling methods

Resampling methods can be used to estimate the precision of sample statistics (means, medians, percentiles), perform significance tests, and validate models (Westfall and Young (1993), Efron and Tibshirani (1994), Davison and Hinkley (1997), Good (2005)). The commonly used resampling techniques include permutation tests and bootstrap methods. Two different bootstrap methods, the post-pivot resampling method and the pre-pivot resampling method, will be introduced in this section. Westfall and Young (1993) introduced resampling-based procedures for adjusting P-values in multiple testing to control multiple testing error rates.

1.2.1 Permutation tests

A permutation test is a type of non-parametric statistical significance test in which a reference distribution is constructed by calculating all possible values of the test statistic from permuted observations under a null hypothesis. The theory of permutation tests is based on the work of Fisher and Pitman in the 1930s (Good (2005)). Compared to parametric testing procedures, the weaker distributional assumptions and simpler procedures make permutation tests attractive to many researchers and statisticians. For example, when comparing the means of two populations, a two-sample t-test assumes that the sampling distribution of the difference between sample averages is normal, which is often not the case; the t-test is only valid when both populations have independent or jointly normal distributions. In contrast, the permutation test is distribution-free, so it can give exact P-values even when the sample size is small. The permutation test permutes the labels of observations between the two groups, and obtains the P-value as the proportion of test statistic values from the resamples that are as extreme as or more extreme than the observed test statistic value. In microarray data analysis, when the correlations between genes are included in the joint distribution of the test statistics, the parametric form of a multivariate t distribution becomes very complex and difficult to calculate. In contrast, the permutation test is easy to conduct and avoids complex calculations.

To carry out a permutation test based on a test statistic that measures the size of an effect of interest, we proceed as follows:

1. Compute the test statistic for the observed data set.

2. Permute the original data in a way that matches the null hypothesis to get permuted resamples, and construct the reference distribution from the test statistics calculated from the permuted resamples.

3. Calculate the critical value of a level-α test as the upper α percentile of the reference distribution, or obtain the P-value as the proportion of permutation test statistics that are as extreme as or more extreme than the observed test statistic.

Permutation tests can be used in a wide variety of settings. For example, Fisher's exact test (a permutation test) is used to detect the association between a row variable and a column variable for small, sparse, or unbalanced data sets. Ein-Dor et al. (2005) used a permutation test for selecting genes whose expression profiles are significantly correlated with breast cancer survival status. Based on random permutations of time points, Ptitsyn et al. (2006) applied the permutation test to identifying periodic patterns in relatively short time series obtained with microarray technology; such periodic processes are important for modulating and coordinating the transcription of genes governing key metabolic pathways. Churchill and Doerge (1994) used a permutation test based on permutation of the observed quantitative traits to determine quantitative trait loci. To identify significant changes in gene expression in microarray experiments, Tusher et al. (2001) used permutations of the repeated measurements in the Significance Analysis of Microarrays (SAM) procedure.

For two-group comparisons, permuting the labels of observations between the two groups requires the assumption that the two populations are identical when the null hypothesis is true; that is, not only are their means the same, but so are their spreads and shapes. Pollard and van der Laan (2005) demonstrated that, if both the correlation structures and the sample sizes differ between the two populations, then a permutation test does not control the Type I error rate at its nominal significance level for detecting differentially expressed genes between the two groups. The conditions for permutation tests to control multiple testing error rates, both when comparing two groups and when finding significant predictor variables in fixed-effects general linear models, will be discussed further in Chapter 3.

For testing hypotheses about a single population, comparing populations that differ even under the null hypothesis, or testing general relationships, permutation tests cannot be used, because we do not know how to resample in a way that matches the null hypothesis in these settings. Hence, bootstrap methods should be used instead.

1.2.2 Bootstrap methods

The bootstrap method was first introduced by Efron (1979) and further discussed by Efron and Tibshirani (1994).

The bootstrap method is a way of approximating a sampling distribution from just one sample. Instead of taking many simple random samples from the population to find the sampling distribution of a sample statistic, the bootstrap method repeatedly resamples with replacement from one random sample. The bootstrap distribution of a statistic collects the values of the statistic from many resamples, and gives information about the sampling distribution of the statistic. For example, the bootstrap distribution of a sample mean is obtained from the resampled means calculated from hundreds of resamples drawn with replacement from a single original sample. The bootstrap distribution of a sample mean has the following mean and standard error:

mean_boot = X̄_boot = (1/B) Σ X̄*,

SE_boot = [ (1/(B − 1)) Σ (X̄* − mean_boot)² ]^{1/2},

where X̄* is the sample mean of each bootstrap resample and B is the number of resamples.

Since the bootstrap distribution of a statistic is generated from a single original sample, it is centered at the value of the sample statistic rather than at the parameter value. Bootstrap distributions include two sources of random variation: one comes from choosing an original sample at random from the population, and the other comes from choosing bootstrap resamples at random from the original sample, which introduces little additional variation.

Bootstrap methods are asymptotically valid (as the original sample size goes to infinity). Efron (1979) showed that the bootstrap method can (asymptotically) correctly estimate the variance of a sample median, and the error rates in a linear discrimination problem (outperforming cross-validation). Freedman (1981) showed that the bootstrap approximation to the distribution of least squares estimates is valid. Hall (1986) showed that the bootstrap reduces the coverage error probability from O(n^{−1/2}) to O(n^{−1}), which makes the bootstrap method one order more accurate than the delta method.

Bootstrap methods are widely used in all kinds of data analysis. Davison and Hinkley (1997) illustrated the application of bootstrap methods to stratified data; finite populations; censored and missing data; linear, nonlinear, and smooth regression models; classification; and time series and spatial problems. For example, by using Efron's bootstrap resampling method, Liu et al. (2004) analyzed the performance of artificial neural networks (ANNs) for feature classification in the analysis of mammographic masses to achieve more accurate results; feature classification in mammography is used to discover the salient information that can discriminate benign from malignant masses.

In microarray data analysis, there are two commonly used bootstrap methods: the post-pivot resampling method and the pre-pivot resampling method. Both methods can control FWER asymptotically, and they give similar results in a fixed-effects general linear model with i.i.d. errors. In two-group comparisons, the null distribution estimated by the pre-pivot resampling method has more resampled test statistic values than that estimated by the post-pivot resampling method under a reasonable assumption for microarray data (that the distributions of the errors are exchangeable).
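The bootstrap mean and standard error defined above can be computed directly. Here is a minimal Python sketch (the sample values are made up for illustration):

```python
import random
from statistics import mean, stdev

random.seed(1)
sample = [4.2, 5.1, 3.8, 6.0, 5.5, 4.9, 5.2, 4.4]  # one observed sample (hypothetical)

B = 2000
# Resample with replacement from the single original sample,
# recording the mean of each resample
boot_means = [mean(random.choices(sample, k=len(sample))) for _ in range(B)]

mean_boot = mean(boot_means)  # centers near the sample mean, not the population mean
se_boot = stdev(boot_means)   # bootstrap SE: sd of the B resampled means
```

As the text notes, `mean_boot` sits close to the observed sample mean rather than the unknown population mean, and `se_boot` is the sample standard deviation of the resampled means, matching the SE_boot formula above.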

Post-pivot resampling method

The post-pivot resampling method was introduced by Pollard and van der Laan (2005) to estimate the null distribution of test statistics in multiple hypotheses testing and achieve asymptotic control of multiple testing error rates. The post-pivot resampling method obtains the asymptotically correct null distribution of the test statistic (based on the true data generating distribution) from centered and/or scaled resampled test statistics. In microarray data analysis with two or more treatment groups, the post-pivot resampling method resamples the observed data within each group, calculates the resampled test statistics from each resample, centers and/or scales the resampled test statistics (subtracts the average of the resampled test statistics and/or divides by their standard deviation), and estimates the test statistic's null distribution from the centered and/or scaled resampled test statistics.

To carry out a hypothesis test based on a test statistic that measures the location difference between two populations, the post-pivot resampling method proceeds as follows:

1. Compute the test statistic for the observed data set.

2. Resample the data with replacement within each group to obtain bootstrap resamples, compute the test statistic for each resampled data set, and construct the reference distribution from the centered and/or scaled resampled test statistics.

3. Calculate the critical value of a level-α test as the upper α percentile of the reference distribution, or obtain the P-value as the proportion of bootstrapped test statistics that are as extreme as or more extreme than the observed test statistic.
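The three steps above can be sketched for a single gene, using the difference in group means as the test statistic (the data are hypothetical, and only centering, not scaling, is applied in this sketch):

```python
import random
from statistics import mean

random.seed(2)
x = [7.1, 6.4, 6.9, 7.5, 6.8]   # group 1 (hypothetical expression values)
y = [5.9, 6.2, 5.5, 6.4, 5.8]   # group 2

t_obs = mean(x) - mean(y)        # step 1: observed test statistic

# Step 2: resample with replacement *within* each group, recompute the
# statistic, then center the resampled statistics at their average
B = 4000
t_star = []
for _ in range(B):
    xb = random.choices(x, k=len(x))
    yb = random.choices(y, k=len(y))
    t_star.append(mean(xb) - mean(yb))
t_bar = mean(t_star)
t_null = [t - t_bar for t in t_star]   # centered: estimated null distribution

# Step 3: P-value = proportion of null statistics as extreme as the observed one
p = sum(abs(t) >= abs(t_obs) for t in t_null) / B
```

The uncentered bootstrap statistics cluster around `t_obs`; it is the centering step that turns them into an estimated null distribution, which is the defining feature of the post-pivot approach.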

Pre-pivot resampling method

The pre-pivot resampling method first fits a model to the observed data and then estimates the test statistic's null distribution by bootstrapping the centered residuals (residuals with their sample mean subtracted) (Freedman (1981)). Under the assumption that the model fits the data well, the pre-pivot resampling method provides asymptotically valid results, i.e., it controls multiple testing error rates asymptotically when testing multiple null hypotheses. In microarray data analysis, the pre-pivot resampling method estimates the null distributions of test statistics by bootstrapping residuals from a probe-level or gene-level model with treatment effects. The way the residuals are resampled with replacement (bootstrapped) depends on the assumptions made about the residuals. The residuals can be resampled across treatments under the assumption that their distributions are the same across treatments, but not across genes. If the distributions are also the same across genes, then residuals across treatments and genes can be pooled together for resampling with replacement.

To carry out a hypothesis test based on a test statistic that measures the location difference between two populations, the pre-pivot resampling method proceeds as follows:

1. Compute the test statistic for the observed data set.

2. Fit a one-way model to the observed data, and compute the residuals from the one-way model (subtract the sample mean from each observation within each group).

3. Combine the residuals of the two groups under the assumption that the distributions of the residuals are the same for these two groups.

4. Resample the pooled residuals with replacement to get bootstrapped residuals, and center the bootstrapped residuals at their average (subtract the average of the bootstrapped residuals) if that average is not zero.

5. Add the centered bootstrapped residuals from each resample back to the one-way model, and recompute the test statistic for each resample. The test statistics from all resamples form the reference distribution.

6. Calculate the critical value of a level-α test as the upper α percentile of the reference distribution, or obtain the P-value as the proportion of bootstrapped test statistics that are as extreme as or more extreme than the observed test statistic.
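Steps 1-6 above can be sketched for a single gene with hypothetical data, where the one-way model is just the pair of group means. One interpretive choice is made in this sketch: because the fitted model retains the observed group effect, the resampled statistics center at the observed statistic, so they are recentered by it before being read as a null reference:

```python
import random
from statistics import mean

random.seed(3)
x = [7.1, 6.4, 6.9, 7.5, 6.8]   # group 1 (hypothetical expression values)
y = [5.9, 6.2, 5.5, 6.4, 5.8]   # group 2

t_obs = mean(x) - mean(y)        # step 1: observed test statistic

# Step 2: residuals from the one-way model (subtract each group's mean),
# step 3: pooled under the equal-distribution assumption
res = [v - mean(x) for v in x] + [v - mean(y) for v in y]

B = 4000
n = len(x)
t_star = []
for _ in range(B):
    rb = random.choices(res, k=2 * n)      # step 4: resample pooled residuals...
    rb = [r - mean(rb) for r in rb]        # ...and center them at zero
    xb = [mean(x) + r for r in rb[:n]]     # step 5: add residuals back to the
    yb = [mean(y) + r for r in rb[n:]]     #         fitted one-way model
    t_star.append(mean(xb) - mean(yb))

# Step 6: the resampled statistics center at t_obs, so recentering by t_obs
# gives a null reference (an interpretive choice in this sketch)
p = sum(abs(t - t_obs) >= abs(t_obs) for t in t_star) / B
```

Note that, unlike the post-pivot method, the resampling here mixes residuals across the two groups, which is why the exchangeability assumption on the error distributions matters.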

29 CHAPTER 2 SMALL SAMPLE BEHAVIOR OF RESAMPLING METHODS Resampling techniques are popular in microarray data analysis. In this chapter, we will discuss the small sample behavior of three popular resampling techniques for multiple testing: the permutation test, the post-pivot resampling method, and the pre-pivot resampling method. We will show that when the sample size is small, for matched pairs, a permutation test is unlikely to give small P-values, while both post-pivot and pre-pivot resampling methods might give P-values of zero for the same data, even adjusting for multiplicity. The discreteness of the test statistics null distributions estimated by the above three resampling methods will be compared based on the maximum number of unique test statistic values. 2.1 Tomato microarray example A biology professor in the Department of Horticulture and Crop Science at the Ohio State University wishes to identify differentially expressed genes between control tomato plants and mutant tomato plants at different tomato fruit developmental stages (flower bud, flower, and fruit). Lee et al. (2000) recommended that at least three replicates should be used in designing experiments by using cdna microarrays, 17

30 particularly when gene expression data from single specimens will be analyzed. In the tomato microarray experiment, there are three paired samples at each stage (three plants in the control group and three plants in the mutant group). Suppose we only have three genes at the fruit stage and wish to learn which genes are differently expressed between the mutant group and the control group using the single step maxt method, a method based on resampling techniques, for the multiplicity adjustment. Let X ij (i =1, 2, 3, and j =1, 2, 3) denote the gene expression levels for the ith gene, jth sample in the control group, and Y ij (i =1, 2, 3, and j =1, 2, 3) denote the gene expression levels for the ith gene, jth sample in the treatment group. For the ith gene, X ij i.i.d. F Xi, and Y ij i.i.d. F Yi. Let d ij = x ij y ij denote the observed paired difference for the ith gene, jth paired sample, θ i denote the true paired difference between the paired samples. To identify the differentially expressed genes among these three genes, we will test the null hypotheses H 0 : θ i = 0 (i =1, 2, 3) using the test statistics T i = d i (i =1, 2, 3). The raw P-values are calculated according to the following formula using resampling methods: Raw P i = {b : T i,b T i }, for i = 1,...,k. B The single step maxt method based on resampling techniques will be used to calculate the adjusted P-values for adjusting multiplicity when we are testing three null hypotheses simultaneously. The formula for calculating maxt adjusted P-values with monotonicity enforced is (cf Westfall and Young (1993)): Adjusted P i = {b : max i=1,2,3 T i,b T i }, for i = 1, 2, 3, (2.1) B 18

where T_{i,b} denotes the resampled test statistic for the ith gene, bth resampling, and B is the total number of resamplings (b = 1, ..., B).

Figure 2.1 shows the absolute values of the observed test statistics |T_i| and the maxima of the absolute values of the resampled test statistics, max_{i=1,2,3} |T_{i,b}|, from the three resampling methods. The dots denote the observed test statistics; the rectangles denote the maxima of the resampled test statistics from the permutation test; the diamonds denote the maxima of the resampled test statistics from the post-pivot resampling method; and the triangles denote the maxima of the resampled test statistics from the pre-pivot resampling method. As shown in Figure 2.1, the permutation test always produces permuted test statistics that are greater than or equal to the observed test statistic. Thus, it is unlikely that the permutation test gives zero adjusted P-values. In contrast, for either the pre-pivot or the post-pivot resampling method, there is a high probability that the observed test statistic is far from the resampled test statistics. Therefore, we might get zero adjusted P-values using these two resampling methods.

Based on the formula of the single-step maxT method for calculating adjusted P-values, we can obtain the adjusted P-values for all three genes. Table 2.1 summarizes the adjusted P-values obtained from the permutation test, the post-pivot resampling method, and the pre-pivot resampling method for the three tomato fruit genes, based on Figure 2.1. Based on the null distribution of max|T| estimated from the permutation test (the rectangles), we can observe that the adjusted P-value for gene 1 is 0.75, since 6 out of 8 max|T| values (rectangles in Figure 2.1) are greater than or equal to |T_1| (dot in Figure 2.1). Similarly, the adjusted P-values for gene 2 and gene 3 are both 0.25 based on the permutation test. Using the post-pivot resampling method, the adjusted

Figure 2.1: Null distribution of max_{i=1,2,3} |T_i| for k = 3 and n = 3: observed test statistics and resampled test statistics from the permutation test, the post-pivot resampling method, and the pre-pivot resampling method.

P-value for gene 1 is 0.30, since 3 out of 10 max|T| values (diamonds in Figure 2.1) are greater than or equal to |T_1| (dot in Figure 2.1). For gene 2 and gene 3, however, there is no resampled max|T| value from the post-pivot resampling method that is greater than or equal to either |T_2| or |T_3| (dots in Figure 2.1). Thus, the adjusted P-values for gene 2 and gene 3 are both zero using the post-pivot resampling method. We obtain the same adjusted P-values from the pre-pivot resampling method as from the post-pivot resampling method for all three fruit genes.

Table 2.1: Adjusted P-values calculated from formula (2.1) for the permutation test, the post-pivot resampling method, and the pre-pivot resampling method

         Permutation   Post-pivot resampling   Pre-pivot resampling
gene 1   6/8 = 0.75    3/10 = 0.30             3/10 = 0.30
gene 2   2/8 = 0.25    0/10 = 0                0/10 = 0
gene 3   2/8 = 0.25    0/10 = 0                0/10 = 0
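Formula (2.1) and the raw P-value formula can be made concrete with a short sketch. The code below is not part of the dissertation: it computes raw and single-step maxT adjusted P-values by complete enumeration of the 2^n sign flips, which play the role of the permutation distribution for matched pairs; the function name and the toy difference matrix are illustrative assumptions.

```python
from itertools import product

import numpy as np

def maxt_adjusted_pvalues(d):
    """Raw and single-step maxT adjusted P-values for matched pairs.

    d : (k, n) array of paired differences d_ij; the observed statistic
        for gene i is T_i = mean(d_i1, ..., d_in).  The B = 2^n sign
        flips enumerate all relabelings of the paired samples.
    """
    d = np.asarray(d, dtype=float)
    k, n = d.shape
    T_obs = np.abs(d.mean(axis=1))                      # |T_i|
    signs = np.array(list(product([1, -1], repeat=n)))  # (2^n, n) sign flips
    T_perm = np.abs(signs @ d.T) / n                    # |T_{i,b}|, shape (2^n, k)
    raw = (T_perm >= T_obs).mean(axis=0)                # #{b: |T_{i,b}| >= |T_i|} / B
    max_T = T_perm.max(axis=1)                          # max_i |T_{i,b}| for each b
    adjusted = np.array([(max_T >= t).mean() for t in T_obs])  # formula (2.1)
    return raw, adjusted
```

Because the identity relabeling is always one of the B = 2^n permutations, each raw and adjusted P-value is at least 1/2^n, which is why the permutation column of Table 2.1 contains no zeros. For example, with two genes and made-up differences [[5, 4, 6], [1, -1, 0]], the adjusted P-values are 0.25 and 1.0.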

Strikingly, for matched pairs, the permuted test statistics (unstandardized or standardized) under complete enumeration always have a mean of zero. The reason is that each sample in a pair can be assigned either zero or one as its group label. When the labels are switched, the sign of the test statistic is also switched. Thus, the positive signs and negative signs cancel each other out, so the mean of all permuted test statistics is equal to zero. For standardized test statistics, since the MSEs are always the same for paired permuted samples when the labels are switched, the mean of all permuted test statistics is also zero.

2.2 Conditions for getting adjusted P-values of zero using the post-pivot resampling method

The tomato microarray example suggests that P-values of zero may occur often, even after multiplicity adjustment. Therefore, we need to explore the conditions for getting an adjusted P-value of zero using the post-pivot and pre-pivot resampling methods for paired samples with small sample sizes (two or three per group).

2.2.1 Conditions for getting adjusted P-values of zero with a sample size of two

To expand the three genes in our tomato microarray example to k genes, let X_{ij} (i = 1, 2, ..., k and j = 1, 2, ..., n) denote the gene expression level for the ith gene, jth sample in the control group, and Y_{ij} (i = 1, 2, ..., k and j = 1, 2, ..., n) denote the gene expression level for the ith gene, jth sample in the mutant group. For the ith gene, X_{ij} are i.i.d. F_{X_i} and Y_{ij} are i.i.d. F_{Y_i}. Assume d_{ij} = x_{ij} - y_{ij} are the observed paired differences for the ith gene in the jth paired sample. We wish to determine which genes are differentially expressed

among those k genes by testing the k null hypotheses H_0: θ_i = 0 (i = 1, ..., k) using the test statistics T_i = d̄_i.

When the sample size n is two, the observed differences are d_{ij} = x_{ij} - y_{ij} (i = 1, 2, ..., k and j = 1, 2). For the first two genes, we have the following observation matrix:

( d_{11}  d_{12} )
( d_{21}  d_{22} )

The observed test statistics are T_1 = (d_{11} + d_{12})/2 and T_2 = (d_{21} + d_{22})/2. Using the post-pivot resampling method, the matrix of unique resampled test statistics is:

( d_{11}  (d_{11} + d_{12})/2  d_{12} )
( d_{21}  (d_{21} + d_{22})/2  d_{22} )

We can get the following matrix after subtracting the average in each row:

( (d_{11} - d_{12})/2  0  (d_{12} - d_{11})/2 )
( (d_{21} - d_{22})/2  0  (d_{22} - d_{21})/2 )

To get a raw P-value of zero for the first gene, we need to have

|(d_{11} + d_{12})/2| > |(d_{11} - d_{12})/2|   and   |(d_{11} + d_{12})/2| > 0.

Similarly, to have a raw P-value of zero for the second gene, we need

|(d_{21} + d_{22})/2| > |(d_{21} - d_{22})/2|   and   |(d_{21} + d_{22})/2| > 0.

Therefore, the necessary and sufficient conditions for getting a raw P-value of zero for the ith gene are:

either  { d_{i1} > 0 and d_{i2} > 0 }

or      { d_{i1} < 0 and d_{i2} < 0 },

for i = 1, 2.

Using the single-step maxT method, the necessary and sufficient conditions for getting an adjusted P-value of zero for the first gene are:

either
  d_{11} > 0,  d_{12} > 0,  d_{11} + d_{12} > |d_{21} - d_{22}|
or
  d_{11} < 0,  d_{12} < 0,  d_{11} + d_{12} < -|d_{21} - d_{22}|.

Similarly, the necessary and sufficient conditions for getting an adjusted P-value of zero for the second gene are:

either
  d_{21} > 0,  d_{22} > 0,  d_{21} + d_{22} > |d_{11} - d_{12}|
or
  d_{21} < 0,  d_{22} < 0,  d_{21} + d_{22} < -|d_{11} - d_{12}|.

In other words, to have both raw P-values of zero and adjusted P-values of zero with a sample size of two for two genes, the conditions are:

1. To have a raw P-value of zero, the necessary and sufficient condition is that both observations are in the same direction (either both are bigger than zero or both are smaller than zero).

2. To have an adjusted P-value of zero, the necessary and sufficient conditions that need to be satisfied are:

(a) Both observations for the same gene are in the same direction.

(b) The sum of the two observations for one gene is either bigger than the absolute difference of the two observations of the other gene (in the positive direction) or smaller than the negative of that absolute difference (in the negative direction).

If k genes are considered, the necessary and sufficient conditions for the ith gene to have a raw P-value of zero with a sample size of two are:

either  { d_{i1} > 0 and d_{i2} > 0 }
or      { d_{i1} < 0 and d_{i2} < 0 },

for i = 1, 2, ..., k. For getting an adjusted P-value of zero for the ith gene with a sample size of two, the necessary and sufficient conditions are:

either
  d_{i1} > 0,  d_{i2} > 0,  d_{i1} + d_{i2} > max_{j ≠ i, j = 1, 2, ..., k} |d_{j1} - d_{j2}|
or
  d_{i1} < 0,  d_{i2} < 0,  d_{i1} + d_{i2} < -max_{j ≠ i, j = 1, 2, ..., k} |d_{j1} - d_{j2}|,

for i = 1, 2, ..., k.

2.2.2 Conditions for getting adjusted P-values of zero with a sample size of three

When the sample size increases from two to three in each group, the observed differences are d_{ij} = x_{ij} - y_{ij} (i = 1, 2, ..., k and j = 1, 2, 3). The observed

difference matrix for the first two genes is:

( d_{11}  d_{12}  d_{13} )
( d_{21}  d_{22}  d_{23} )

T_1 = (d_{11} + d_{12} + d_{13})/3 and T_2 = (d_{21} + d_{22} + d_{23})/3 will be our observed test statistics for the first two genes when the sample size is three, and there will be 3^3 = 27 complete bootstrap resampled test statistics. The ten bootstrap resamples that will give ten unique test statistic values are:

111, 112, 113, 122, 123, 133, 222, 223, 233, 333,

where 1 is the label for the first paired difference, 2 is the label for the second paired difference, and 3 is the label for the third paired difference. If the bootstrap resamplings all come from the first paired difference, then we will have the following resampled difference matrix for the first two genes:

( d_{11}  d_{11}  d_{11} )
( d_{21}  d_{21}  d_{21} )

The resampled test statistics computed from the above difference matrix are T_{1,b=1} = d_{11} and T_{2,b=1} = d_{21}. If the bootstrap resamplings include the first paired difference twice and the second paired difference once, then the resampled difference matrix is:

( d_{11}  d_{11}  d_{12} )
( d_{21}  d_{21}  d_{22} )

The resampled test statistics computed from the above difference matrix are T_{1,b=2} = (2d_{11} + d_{12})/3 and T_{2,b=2} = (2d_{21} + d_{22})/3. In the post-pivot resampling method, we subtract the average of all resampled test statistics, which is T_1 = (d_{11} + d_{12} + d_{13})/3 for the first gene and T_2 = (d_{21} + d_{22} + d_{23})/3 for the second gene respectively, from each resampled test statistic to get the reference distribution Z_b for both genes. For the first gene, the ten values of Z_{1,b}, in the order of the resample labels above, are:

(2d_{11} - d_{12} - d_{13})/3,  (d_{11} - d_{13})/3,  (d_{11} - d_{12})/3,  (d_{12} - d_{13})/3,  0,
(d_{13} - d_{12})/3,  (2d_{12} - d_{11} - d_{13})/3,  (d_{12} - d_{11})/3,  (d_{13} - d_{11})/3,  (2d_{13} - d_{11} - d_{12})/3,

and the values of Z_{2,b} for the second gene are obtained by replacing d_{1j} with d_{2j}. According to the formula for calculating raw P-values, if |Z_{1,b}| < |T_1| for all b, the raw P-value of the first gene is equal to zero. To have |Z_{1,b}| < |T_1| for all b, the following relationships need to be satisfied:

|d_{11} - d_{13}|/3 < |d_{11} + d_{12} + d_{13}|/3
|d_{11} - d_{12}|/3 < |d_{11} + d_{12} + d_{13}|/3
|d_{12} - d_{13}|/3 < |d_{11} + d_{12} + d_{13}|/3
|2d_{11} - d_{12} - d_{13}|/3 < |d_{11} + d_{12} + d_{13}|/3
|2d_{12} - d_{11} - d_{13}|/3 < |d_{11} + d_{12} + d_{13}|/3
|2d_{13} - d_{11} - d_{12}|/3 < |d_{11} + d_{12} + d_{13}|/3
0 < |d_{11} + d_{12} + d_{13}|/3

From the above inequalities, we derive the following necessary and sufficient conditions for the first gene to have a raw P-value of zero:

either
  d_{11} > 0,  d_{12} > 0,  d_{13} > 0,
  d_{11} + d_{12} > d_{13}/2,  d_{11} + d_{13} > d_{12}/2,  d_{12} + d_{13} > d_{11}/2
or
  d_{11} < 0,  d_{12} < 0,  d_{13} < 0,
  d_{11} + d_{12} < d_{13}/2,  d_{11} + d_{13} < d_{12}/2,  d_{12} + d_{13} < d_{11}/2.

For the second gene, to have |Z_{2,b}| < |T_2| for all b, the following relationships need to be satisfied:

|d_{21} - d_{23}|/3 < |d_{21} + d_{22} + d_{23}|/3
|d_{21} - d_{22}|/3 < |d_{21} + d_{22} + d_{23}|/3
|d_{22} - d_{23}|/3 < |d_{21} + d_{22} + d_{23}|/3
|2d_{21} - d_{22} - d_{23}|/3 < |d_{21} + d_{22} + d_{23}|/3
|2d_{22} - d_{21} - d_{23}|/3 < |d_{21} + d_{22} + d_{23}|/3
|2d_{23} - d_{21} - d_{22}|/3 < |d_{21} + d_{22} + d_{23}|/3
0 < |d_{21} + d_{22} + d_{23}|/3

From the above inequalities, the necessary and sufficient conditions for the second gene to have a raw P-value of zero are:

either
  d_{21} > 0,  d_{22} > 0,  d_{23} > 0,
  d_{21} + d_{22} > d_{23}/2,  d_{21} + d_{23} > d_{22}/2,  d_{22} + d_{23} > d_{21}/2
or
  d_{21} < 0,  d_{22} < 0,  d_{23} < 0,
  d_{21} + d_{22} < d_{23}/2,  d_{21} + d_{23} < d_{22}/2,  d_{22} + d_{23} < d_{21}/2.

If we expand the two-gene case to the k-gene case, the necessary and sufficient conditions for the ith gene to have a raw P-value of zero using the post-pivot resampling method are:

either
  d_{i1} > 0,  d_{i2} > 0,  d_{i3} > 0,
  d_{i1} + d_{i2} > d_{i3}/2,  d_{i1} + d_{i3} > d_{i2}/2,  d_{i2} + d_{i3} > d_{i1}/2
or
  d_{i1} < 0,  d_{i2} < 0,  d_{i3} < 0,
  d_{i1} + d_{i2} < d_{i3}/2,  d_{i1} + d_{i3} < d_{i2}/2,  d_{i2} + d_{i3} < d_{i1}/2,

for i = 1, 2, ..., k.
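The characterization above can be checked mechanically: enumerate every unique bootstrap resample of one gene's paired differences, center at the observed mean, and test whether |T| strictly dominates all centered values. This is a minimal sketch, not dissertation code; the function name and the numeric rows are our own illustrations.

```python
from itertools import combinations_with_replacement

import numpy as np

def raw_p_is_zero(d_row):
    """True iff the post-pivot bootstrap raw P-value for one gene is zero,
    i.e. |T| = |mean(d)| strictly exceeds |Z_b| for every centered
    bootstrap mean Z_b = mean(resampled d) - mean(d)."""
    d = np.asarray(d_row, dtype=float)
    n = len(d)
    T = d.mean()
    Z = [d[list(idx)].mean() - T
         for idx in combinations_with_replacement(range(n), n)]
    return bool(all(abs(z) < abs(T) for z in Z))

# n = 2: a zero raw P-value iff both differences share a sign
assert raw_p_is_zero([3.0, 2.0]) and raw_p_is_zero([-1.0, -4.0])
assert not raw_p_is_zero([3.0, -2.0])
# n = 3: same signs alone are not enough; [10, 1, 1] fails d_2 + d_3 > d_1 / 2
assert raw_p_is_zero([3.0, 2.0, 4.0])
assert not raw_p_is_zero([10.0, 1.0, 1.0])
```

The last pair of checks illustrates why the pairwise-sum inequalities appear alongside the sign conditions when the sample size is three.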

To have an adjusted P-value of zero for the first gene when we only have two genes, the following relationships need to be satisfied:

max(|d_{11} - d_{13}|/3, |d_{21} - d_{23}|/3) < |d_{11} + d_{12} + d_{13}|/3
max(|d_{11} - d_{12}|/3, |d_{21} - d_{22}|/3) < |d_{11} + d_{12} + d_{13}|/3
max(|d_{12} - d_{13}|/3, |d_{22} - d_{23}|/3) < |d_{11} + d_{12} + d_{13}|/3
max(|2d_{11} - d_{12} - d_{13}|/3, |2d_{21} - d_{22} - d_{23}|/3) < |d_{11} + d_{12} + d_{13}|/3
max(|2d_{12} - d_{11} - d_{13}|/3, |2d_{22} - d_{21} - d_{23}|/3) < |d_{11} + d_{12} + d_{13}|/3
max(|2d_{13} - d_{11} - d_{12}|/3, |2d_{23} - d_{21} - d_{22}|/3) < |d_{11} + d_{12} + d_{13}|/3
0 < |d_{11} + d_{12} + d_{13}|/3

The above inequalities give us the following necessary and sufficient conditions for getting an adjusted P-value of zero for the first gene:

either
  d_{11} > 0,  d_{12} > 0,  d_{13} > 0,
  d_{11} + d_{12} > d_{13}/2,  d_{11} + d_{13} > d_{12}/2,  d_{12} + d_{13} > d_{11}/2,
  d_{11} + d_{12} + d_{13} > max(|d_{21} - d_{23}| + |d_{21} - d_{22}|, |d_{21} - d_{22}| + |d_{22} - d_{23}|, |d_{21} - d_{23}| + |d_{22} - d_{23}|)
or
  d_{11} < 0,  d_{12} < 0,  d_{13} < 0,
  d_{11} + d_{12} < d_{13}/2,  d_{11} + d_{13} < d_{12}/2,  d_{12} + d_{13} < d_{11}/2,
  d_{11} + d_{12} + d_{13} < -max(|d_{21} - d_{23}| + |d_{21} - d_{22}|, |d_{21} - d_{22}| + |d_{22} - d_{23}|, |d_{21} - d_{23}| + |d_{22} - d_{23}|).

If we have k genes instead of two genes, we need to solve the following inequalities to get an adjusted P-value of zero for the ith gene (i = 1, ..., k):

max_{l=1,...,k} (|d_{l1} - d_{l3}|/3) < |d_{i1} + d_{i2} + d_{i3}|/3
max_{l=1,...,k} (|d_{l1} - d_{l2}|/3) < |d_{i1} + d_{i2} + d_{i3}|/3
max_{l=1,...,k} (|d_{l2} - d_{l3}|/3) < |d_{i1} + d_{i2} + d_{i3}|/3
max_{l=1,...,k} (|2d_{l1} - d_{l2} - d_{l3}|/3) < |d_{i1} + d_{i2} + d_{i3}|/3
max_{l=1,...,k} (|2d_{l2} - d_{l1} - d_{l3}|/3) < |d_{i1} + d_{i2} + d_{i3}|/3
max_{l=1,...,k} (|2d_{l3} - d_{l1} - d_{l2}|/3) < |d_{i1} + d_{i2} + d_{i3}|/3
0 < |d_{i1} + d_{i2} + d_{i3}|/3

The following necessary and sufficient conditions are derived for getting an adjusted P-value of zero for the ith gene when the sample size is three in each group:

either
  d_{i1} > 0,  d_{i2} > 0,  d_{i3} > 0,
  d_{i1} + d_{i2} > d_{i3}/2,  d_{i1} + d_{i3} > d_{i2}/2,  d_{i2} + d_{i3} > d_{i1}/2,
  d_{i1} + d_{i2} + d_{i3} > max_{l ≠ i, l = 1, 2, ..., k} (|d_{l1} - d_{l3}| + |d_{l1} - d_{l2}|, |d_{l1} - d_{l2}| + |d_{l2} - d_{l3}|, |d_{l1} - d_{l3}| + |d_{l2} - d_{l3}|)
or
  d_{i1} < 0,  d_{i2} < 0,  d_{i3} < 0,
  d_{i1} + d_{i2} < d_{i3}/2,  d_{i1} + d_{i3} < d_{i2}/2,  d_{i2} + d_{i3} < d_{i1}/2,
  d_{i1} + d_{i2} + d_{i3} < -max_{l ≠ i, l = 1, 2, ..., k} (|d_{l1} - d_{l3}| + |d_{l1} - d_{l2}|, |d_{l1} - d_{l2}| + |d_{l2} - d_{l3}|, |d_{l1} - d_{l3}| + |d_{l2} - d_{l3}|),

for i = 1, 2, ..., k.

2.3 Conditions for getting adjusted P-values of zero using the pre-pivot resampling method

For paired data, the comparison of two groups is equivalent to a one-sample problem. The pre-pivot resampling method first subtracts the difference of the two group means and then resamples the residuals with replacement for paired data. Since (x_i - x̄) - (y_i - ȳ) = (x_i - y_i) - (x̄ - ȳ), the null distribution of the test statistic estimated by the pre-pivot resampling method is the same as that estimated by the post-pivot resampling method for paired data, as shown below.

With a sample size of n, the observed test statistic is d̄_i = (d_{i1} + d_{i2} + ... + d_{in})/n for the ith gene. For the post-pivot resampling method, there are ten unique bootstrap test statistics calculated from the resamples for each gene when the sample size is

three (n = 3). The bootstrap resampled test statistics matrix T_b has one row per gene; for the ith gene (i = 1, 2, ..., k), the ten unique values, in the order of the resample labels above, are:

d_{i1},  (2d_{i1} + d_{i2})/3,  (2d_{i1} + d_{i3})/3,  (d_{i1} + 2d_{i2})/3,  (d_{i1} + d_{i2} + d_{i3})/3,
(d_{i1} + 2d_{i3})/3,  d_{i2},  (2d_{i2} + d_{i3})/3,  (d_{i2} + 2d_{i3})/3,  d_{i3}.

The estimated mean vector Ê(T_b) is:

( (d_{11} + d_{12} + d_{13})/3,  (d_{21} + d_{22} + d_{23})/3,  ...,  (d_{k1} + d_{k2} + d_{k3})/3 )',

and the estimated null distribution matrix Z_b has ith row:

(2d_{i1} - d_{i2} - d_{i3})/3,  (d_{i1} - d_{i3})/3,  (d_{i1} - d_{i2})/3,  (d_{i2} - d_{i3})/3,  0,
(d_{i3} - d_{i2})/3,  (2d_{i2} - d_{i1} - d_{i3})/3,  (d_{i2} - d_{i1})/3,  (d_{i3} - d_{i1})/3,  (2d_{i3} - d_{i1} - d_{i2})/3.

The residuals for paired data sets using the pre-pivot resampling method are:

d_{i1} - (x̄_i - ȳ_i),  d_{i2} - (x̄_i - ȳ_i),  d_{i3} - (x̄_i - ȳ_i),  for i = 1, 2, ..., k,

where x̄_i - ȳ_i = (x_{i1} + x_{i2} + x_{i3})/3 - (y_{i1} + y_{i2} + y_{i3})/3 = (d_{i1} + d_{i2} + d_{i3})/3. The number of unique bootstrap resampled test statistics from the pre-pivot resampling method is C(n + n - 1, n) = (2n - 1)!/(n!(n - 1)!), which is the same as that from the post-pivot resampling method, since n data points are resampled from the n paired differences with replacement in both methods. Therefore, there are ten unique resampled test statistic values for each gene when the sample size n is three. The calculated bootstrap test statistics matrix, which is the estimated null distribution of the test statistic, is the same as the matrix Z_b shown above.
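Both the C(2n - 1, n) count of unique resamples and the pre-pivot/post-pivot equivalence claimed above can be verified numerically. This is a minimal sketch with a made-up difference vector; the variable names are ours.

```python
from itertools import combinations_with_replacement
from math import comb

import numpy as np

d = np.array([1.5, -0.5, 2.0])   # made-up paired differences for one gene, n = 3
n = len(d)
resamples = list(combinations_with_replacement(range(n), n))

# C(n + n - 1, n) = (2n - 1)!/(n!(n - 1)!) unique resamples with replacement
assert len(resamples) == comb(2 * n - 1, n) == 10

# Post-pivot: bootstrap the differences, then center each mean at d-bar
post = np.array([d[list(idx)].mean() for idx in resamples]) - d.mean()
# Pre-pivot: form residuals d_j - d-bar first, then bootstrap their means
pre = np.array([(d - d.mean())[list(idx)].mean() for idx in resamples])

# Identical estimated null distributions, as claimed for paired data
assert np.allclose(post, pre)
```

The ten entries of `post` reproduce one row of the Z_b matrix above, including the zero entry contributed by the resample that picks each paired difference exactly once.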


More information

A Simple, Graphical Procedure for Comparing Multiple Treatment Effects

A Simple, Graphical Procedure for Comparing Multiple Treatment Effects A Simple, Graphical Procedure for Comparing Multiple Treatment Effects Brennan S. Thompson and Matthew D. Webb May 15, 2015 > Abstract In this paper, we utilize a new graphical

More information

Biochip informatics-(i)

Biochip informatics-(i) Biochip informatics-(i) : biochip normalization & differential expression Ju Han Kim, M.D., Ph.D. SNUBI: SNUBiomedical Informatics http://www.snubi snubi.org/ Biochip Informatics - (I) Biochip basics Preprocessing

More information

Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview

Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations

More information

c 2011 Kuo-mei Chen ALL RIGHTS RESERVED

c 2011 Kuo-mei Chen ALL RIGHTS RESERVED c 2011 Kuo-mei Chen ALL RIGHTS RESERVED ADMISSIBILITY AND CONSISTENCY FOR MULTIPLE COMPARISON PROBLEMS WITH DEPENDENT VARIABLES BY KUO-MEI CHEN A dissertation submitted to the Graduate School New Brunswick

More information

Large-Scale Multiple Testing of Correlations

Large-Scale Multiple Testing of Correlations Large-Scale Multiple Testing of Correlations T. Tony Cai and Weidong Liu Abstract Multiple testing of correlations arises in many applications including gene coexpression network analysis and brain connectivity

More information

Rejoinder on: Control of the false discovery rate under dependence using the bootstrap and subsampling

Rejoinder on: Control of the false discovery rate under dependence using the bootstrap and subsampling Test (2008) 17: 461 471 DOI 10.1007/s11749-008-0134-6 DISCUSSION Rejoinder on: Control of the false discovery rate under dependence using the bootstrap and subsampling Joseph P. Romano Azeem M. Shaikh

More information

Mixtures of multiple testing procedures for gatekeeping applications in clinical trials

Mixtures of multiple testing procedures for gatekeeping applications in clinical trials Research Article Received 29 January 2010, Accepted 26 May 2010 Published online 18 April 2011 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/sim.4008 Mixtures of multiple testing procedures

More information

Comparison of the Empirical Bayes and the Significance Analysis of Microarrays

Comparison of the Empirical Bayes and the Significance Analysis of Microarrays Comparison of the Empirical Bayes and the Significance Analysis of Microarrays Holger Schwender, Andreas Krause, and Katja Ickstadt Abstract Microarrays enable to measure the expression levels of tens

More information

A NEW APPROACH FOR LARGE SCALE MULTIPLE TESTING WITH APPLICATION TO FDR CONTROL FOR GRAPHICALLY STRUCTURED HYPOTHESES

A NEW APPROACH FOR LARGE SCALE MULTIPLE TESTING WITH APPLICATION TO FDR CONTROL FOR GRAPHICALLY STRUCTURED HYPOTHESES A NEW APPROACH FOR LARGE SCALE MULTIPLE TESTING WITH APPLICATION TO FDR CONTROL FOR GRAPHICALLY STRUCTURED HYPOTHESES By Wenge Guo Gavin Lynch Joseph P. Romano Technical Report No. 2018-06 September 2018

More information

Bayesian Determination of Threshold for Identifying Differentially Expressed Genes in Microarray Experiments

Bayesian Determination of Threshold for Identifying Differentially Expressed Genes in Microarray Experiments Bayesian Determination of Threshold for Identifying Differentially Expressed Genes in Microarray Experiments Jie Chen 1 Merck Research Laboratories, P. O. Box 4, BL3-2, West Point, PA 19486, U.S.A. Telephone:

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2005 Paper 168 Multiple Testing Procedures and Applications to Genomics Merrill D. Birkner Katherine

More information

Finite Population Correction Methods

Finite Population Correction Methods Finite Population Correction Methods Moses Obiri May 5, 2017 Contents 1 Introduction 1 2 Normal-based Confidence Interval 2 3 Bootstrap Confidence Interval 3 4 Finite Population Bootstrap Sampling 5 4.1

More information

Summary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing

Summary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing Summary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing Statistics Journal Club, 36-825 Beau Dabbs and Philipp Burckhardt 9-19-2014 1 Paper

More information

Experimental Design and Data Analysis for Biologists

Experimental Design and Data Analysis for Biologists Experimental Design and Data Analysis for Biologists Gerry P. Quinn Monash University Michael J. Keough University of Melbourne CAMBRIDGE UNIVERSITY PRESS Contents Preface page xv I I Introduction 1 1.1

More information

Control of the False Discovery Rate under Dependence using the Bootstrap and Subsampling

Control of the False Discovery Rate under Dependence using the Bootstrap and Subsampling Institute for Empirical Research in Economics University of Zurich Working Paper Series ISSN 1424-0459 Working Paper No. 337 Control of the False Discovery Rate under Dependence using the Bootstrap and

More information

Familywise Error Rate Controlling Procedures for Discrete Data

Familywise Error Rate Controlling Procedures for Discrete Data Familywise Error Rate Controlling Procedures for Discrete Data arxiv:1711.08147v1 [stat.me] 22 Nov 2017 Yalin Zhu Center for Mathematical Sciences, Merck & Co., Inc., West Point, PA, U.S.A. Wenge Guo Department

More information

A better way to bootstrap pairs

A better way to bootstrap pairs A better way to bootstrap pairs Emmanuel Flachaire GREQAM - Université de la Méditerranée CORE - Université Catholique de Louvain April 999 Abstract In this paper we are interested in heteroskedastic regression

More information

Probabilistic Inference for Multiple Testing

Probabilistic Inference for Multiple Testing This is the title page! This is the title page! Probabilistic Inference for Multiple Testing Chuanhai Liu and Jun Xie Department of Statistics, Purdue University, West Lafayette, IN 47907. E-mail: chuanhai,

More information

Hunting for significance with multiple testing

Hunting for significance with multiple testing Hunting for significance with multiple testing Etienne Roquain 1 1 Laboratory LPMA, Université Pierre et Marie Curie (Paris 6), France Séminaire MODAL X, 19 mai 216 Etienne Roquain Hunting for significance

More information

EVALUATING THE REPEATABILITY OF TWO STUDIES OF A LARGE NUMBER OF OBJECTS: MODIFIED KENDALL RANK-ORDER ASSOCIATION TEST

EVALUATING THE REPEATABILITY OF TWO STUDIES OF A LARGE NUMBER OF OBJECTS: MODIFIED KENDALL RANK-ORDER ASSOCIATION TEST EVALUATING THE REPEATABILITY OF TWO STUDIES OF A LARGE NUMBER OF OBJECTS: MODIFIED KENDALL RANK-ORDER ASSOCIATION TEST TIAN ZHENG, SHAW-HWA LO DEPARTMENT OF STATISTICS, COLUMBIA UNIVERSITY Abstract. In

More information

Estimation of a Two-component Mixture Model

Estimation of a Two-component Mixture Model Estimation of a Two-component Mixture Model Bodhisattva Sen 1,2 University of Cambridge, Cambridge, UK Columbia University, New York, USA Indian Statistical Institute, Kolkata, India 6 August, 2012 1 Joint

More information

Multiple comparisons of slopes of regression lines. Jolanta Wojnar, Wojciech Zieliński

Multiple comparisons of slopes of regression lines. Jolanta Wojnar, Wojciech Zieliński Multiple comparisons of slopes of regression lines Jolanta Wojnar, Wojciech Zieliński Institute of Statistics and Econometrics University of Rzeszów ul Ćwiklińskiej 2, 35-61 Rzeszów e-mail: jwojnar@univrzeszowpl

More information

Sample Size Estimation for Studies of High-Dimensional Data

Sample Size Estimation for Studies of High-Dimensional Data Sample Size Estimation for Studies of High-Dimensional Data James J. Chen, Ph.D. National Center for Toxicological Research Food and Drug Administration June 3, 2009 China Medical University Taichung,

More information

Looking at the Other Side of Bonferroni

Looking at the Other Side of Bonferroni Department of Biostatistics University of Washington 24 May 2012 Multiple Testing: Control the Type I Error Rate When analyzing genetic data, one will commonly perform over 1 million (and growing) hypothesis

More information

Post-Selection Inference

Post-Selection Inference Classical Inference start end start Post-Selection Inference selected end model data inference data selection model data inference Post-Selection Inference Todd Kuffner Washington University in St. Louis

More information

A Bayesian Determination of Threshold for Identifying Differentially Expressed Genes in Microarray Experiments

A Bayesian Determination of Threshold for Identifying Differentially Expressed Genes in Microarray Experiments A Bayesian Determination of Threshold for Identifying Differentially Expressed Genes in Microarray Experiments Jie Chen 1 Merck Research Laboratories, P. O. Box 4, BL3-2, West Point, PA 19486, U.S.A. Telephone:

More information

False discovery rate control for non-positively regression dependent test statistics

False discovery rate control for non-positively regression dependent test statistics Journal of Statistical Planning and Inference ( ) www.elsevier.com/locate/jspi False discovery rate control for non-positively regression dependent test statistics Daniel Yekutieli Department of Statistics

More information

FALSE DISCOVERY AND FALSE NONDISCOVERY RATES IN SINGLE-STEP MULTIPLE TESTING PROCEDURES 1. BY SANAT K. SARKAR Temple University

FALSE DISCOVERY AND FALSE NONDISCOVERY RATES IN SINGLE-STEP MULTIPLE TESTING PROCEDURES 1. BY SANAT K. SARKAR Temple University The Annals of Statistics 2006, Vol. 34, No. 1, 394 415 DOI: 10.1214/009053605000000778 Institute of Mathematical Statistics, 2006 FALSE DISCOVERY AND FALSE NONDISCOVERY RATES IN SINGLE-STEP MULTIPLE TESTING

More information

Procedures controlling generalized false discovery rate

Procedures controlling generalized false discovery rate rocedures controlling generalized false discovery rate By SANAT K. SARKAR Department of Statistics, Temple University, hiladelphia, A 922, U.S.A. sanat@temple.edu AND WENGE GUO Department of Environmental

More information

PROCEDURES CONTROLLING THE k-fdr USING. BIVARIATE DISTRIBUTIONS OF THE NULL p-values. Sanat K. Sarkar and Wenge Guo

PROCEDURES CONTROLLING THE k-fdr USING. BIVARIATE DISTRIBUTIONS OF THE NULL p-values. Sanat K. Sarkar and Wenge Guo PROCEDURES CONTROLLING THE k-fdr USING BIVARIATE DISTRIBUTIONS OF THE NULL p-values Sanat K. Sarkar and Wenge Guo Temple University and National Institute of Environmental Health Sciences Abstract: Procedures

More information

A Sequential Bayesian Approach with Applications to Circadian Rhythm Microarray Gene Expression Data

A Sequential Bayesian Approach with Applications to Circadian Rhythm Microarray Gene Expression Data A Sequential Bayesian Approach with Applications to Circadian Rhythm Microarray Gene Expression Data Faming Liang, Chuanhai Liu, and Naisyin Wang Texas A&M University Multiple Hypothesis Testing Introduction

More information

A TUTORIAL ON THE INHERITANCE PROCEDURE FOR MULTIPLE TESTING OF TREE-STRUCTURED HYPOTHESES

A TUTORIAL ON THE INHERITANCE PROCEDURE FOR MULTIPLE TESTING OF TREE-STRUCTURED HYPOTHESES A TUTORIAL ON THE INHERITANCE PROCEDURE FOR MULTIPLE TESTING OF TREE-STRUCTURED HYPOTHESES by Dilinuer Kuerban B.Sc. (Statistics), Southwestern University of Finance & Economics, 2011 a Project submitted

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2004 Paper 147 Multiple Testing Methods For ChIP-Chip High Density Oligonucleotide Array Data Sunduz

More information

This paper has been submitted for consideration for publication in Biometrics

This paper has been submitted for consideration for publication in Biometrics BIOMETRICS, 1 10 Supplementary material for Control with Pseudo-Gatekeeping Based on a Possibly Data Driven er of the Hypotheses A. Farcomeni Department of Public Health and Infectious Diseases Sapienza

More information

Multiple Testing of One-Sided Hypotheses: Combining Bonferroni and the Bootstrap

Multiple Testing of One-Sided Hypotheses: Combining Bonferroni and the Bootstrap University of Zurich Department of Economics Working Paper Series ISSN 1664-7041 (print) ISSN 1664-705X (online) Working Paper No. 254 Multiple Testing of One-Sided Hypotheses: Combining Bonferroni and

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis. 401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis

More information

Confidence Estimation Methods for Neural Networks: A Practical Comparison

Confidence Estimation Methods for Neural Networks: A Practical Comparison , 6-8 000, Confidence Estimation Methods for : A Practical Comparison G. Papadopoulos, P.J. Edwards, A.F. Murray Department of Electronics and Electrical Engineering, University of Edinburgh Abstract.

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

Permutation Tests and Multiple Testing

Permutation Tests and Multiple Testing Master Thesis Permutation Tests and Multiple Testing Jesse Hemerik Leiden University Mathematical Institute Track: Applied Mathematics December 2013 Thesis advisor: Prof. dr. J.J. Goeman Leiden University

More information

University of California, Berkeley

University of California, Berkeley University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 2005 Paper 198 Quantile-Function Based Null Distribution in Resampling Based Multiple Testing Mark J.

More information

Inferences about Parameters of Trivariate Normal Distribution with Missing Data

Inferences about Parameters of Trivariate Normal Distribution with Missing Data Florida International University FIU Digital Commons FIU Electronic Theses and Dissertations University Graduate School 7-5-3 Inferences about Parameters of Trivariate Normal Distribution with Missing

More information

Supplementary Materials for Residuals and Diagnostics for Ordinal Regression Models: A Surrogate Approach

Supplementary Materials for Residuals and Diagnostics for Ordinal Regression Models: A Surrogate Approach Supplementary Materials for Residuals and Diagnostics for Ordinal Regression Models: A Surrogate Approach Part A: Figures and tables Figure 2: An illustration of the sampling procedure to generate a surrogate

More information

Aliaksandr Hubin University of Oslo Aliaksandr Hubin (UIO) Bayesian FDR / 25

Aliaksandr Hubin University of Oslo Aliaksandr Hubin (UIO) Bayesian FDR / 25 Presentation of The Paper: The Positive False Discovery Rate: A Bayesian Interpretation and the q-value, J.D. Storey, The Annals of Statistics, Vol. 31 No.6 (Dec. 2003), pp 2013-2035 Aliaksandr Hubin University

More information

THE PRINCIPLES AND PRACTICE OF STATISTICS IN BIOLOGICAL RESEARCH. Robert R. SOKAL and F. James ROHLF. State University of New York at Stony Brook

THE PRINCIPLES AND PRACTICE OF STATISTICS IN BIOLOGICAL RESEARCH. Robert R. SOKAL and F. James ROHLF. State University of New York at Stony Brook BIOMETRY THE PRINCIPLES AND PRACTICE OF STATISTICS IN BIOLOGICAL RESEARCH THIRD E D I T I O N Robert R. SOKAL and F. James ROHLF State University of New York at Stony Brook W. H. FREEMAN AND COMPANY New

More information

DETECTING DIFFERENTIALLY EXPRESSED GENES WHILE CONTROLLING THE FALSE DISCOVERY RATE FOR MICROARRAY DATA

DETECTING DIFFERENTIALLY EXPRESSED GENES WHILE CONTROLLING THE FALSE DISCOVERY RATE FOR MICROARRAY DATA University of Nebraska - Lincoln DigitalCommons@University of Nebraska - Lincoln Dissertations and Theses in Statistics Statistics, Department of 2009 DETECTING DIFFERENTIALLY EXPRESSED GENES WHILE CONTROLLING

More information

Sanat Sarkar Department of Statistics, Temple University Philadelphia, PA 19122, U.S.A. September 11, Abstract

Sanat Sarkar Department of Statistics, Temple University Philadelphia, PA 19122, U.S.A. September 11, Abstract Adaptive Controls of FWER and FDR Under Block Dependence arxiv:1611.03155v1 [stat.me] 10 Nov 2016 Wenge Guo Department of Mathematical Sciences New Jersey Institute of Technology Newark, NJ 07102, U.S.A.

More information

The Pennsylvania State University The Graduate School A BAYESIAN APPROACH TO FALSE DISCOVERY RATE FOR LARGE SCALE SIMULTANEOUS INFERENCE

The Pennsylvania State University The Graduate School A BAYESIAN APPROACH TO FALSE DISCOVERY RATE FOR LARGE SCALE SIMULTANEOUS INFERENCE The Pennsylvania State University The Graduate School A BAYESIAN APPROACH TO FALSE DISCOVERY RATE FOR LARGE SCALE SIMULTANEOUS INFERENCE A Thesis in Statistics by Bing Han c 2007 Bing Han Submitted in

More information

STAT 263/363: Experimental Design Winter 2016/17. Lecture 1 January 9. Why perform Design of Experiments (DOE)? There are at least two reasons:

STAT 263/363: Experimental Design Winter 2016/17. Lecture 1 January 9. Why perform Design of Experiments (DOE)? There are at least two reasons: STAT 263/363: Experimental Design Winter 206/7 Lecture January 9 Lecturer: Minyong Lee Scribe: Zachary del Rosario. Design of Experiments Why perform Design of Experiments (DOE)? There are at least two

More information

Dr. Junchao Xia Center of Biophysics and Computational Biology. Fall /8/2016 1/38

Dr. Junchao Xia Center of Biophysics and Computational Biology. Fall /8/2016 1/38 BIO5312 Biostatistics Lecture 11: Multisample Hypothesis Testing II Dr. Junchao Xia Center of Biophysics and Computational Biology Fall 2016 11/8/2016 1/38 Outline In this lecture, we will continue to

More information