Improving the Performance of the FDR Procedure Using an Estimator for the Number of True Null Hypotheses
|
|
- Joy Mosley
- 5 years ago
- Views:
Transcription
1 Improving the Performance of the FDR Procedure Using an Estimator for the Number of True Null Hypotheses Amit Zeisel, Or Zuk, Eytan Domany W.I.S. June 5, 29 Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving the Performance of the FDR Procedure Using an Estimator June 5, for 29 the Number / 7 of T
2 Introduction/Motivation The vast use of high throughput technologies involves testing thousands of hypotheses simultaneously. The field of multiple testing deals with developing methods to determine the level of significance in such a scenario. Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving the Performance of the FDR Procedure Using an Estimator June 5, for 29 the Number 2 / 7 of T
3 Multiple testing m null hypotheses H,i, i =,2,...m. Example: for m variables H,i : µ A i = µ B i. Calculate p-values p i, and set a threshold for significance. The set of random variables (U,V,T,S), and parameters (m,m,m ) describes this scenario: ground truth non-rejected rejected total hypotheses hypotheses null hypothesis is true U V m null hypothesis is false T S m total m R R m Fraction of false discoveries = V R Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving the Performance of the FDR Procedure Using an Estimator June 5, for 29 the Number 3 / 7 of T
4 Control of the FDR=E( V R + ) In 995 Benjamini and Hochberg proposed a procedure (BH95) to control the FDR for a given set of p-values: Sort and re-label the p-values, p () p (2)... p (m). 2 Choose q the desired FDR level. 3 Define the set of constants α i = i mq, i =,2,,,,m. 4 Identify R = max{i : p (i) α i }. 5 If R reject all hypotheses (i) =,2,,,R, else no hypothesis is rejected. BH proved that FDR m q m q. The bound is not tight, there is room for improvement. Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving the Performance of the FDR Procedure Using an Estimator June 5, for 29 the Number 4 / 7 of T
5 Control vs. estimation, and improved procedures Control (q R) - significance is preset at a desired level q. The procedure yields a set of rejected hypotheses with FDR q. Estimation (R q) - the threshold is preset at a level that yields a desired number of rejections R. The corresponding FDR is estimated. There were many attempts to produce tighter bounds on the FDR, using an estimator for m. Difficulty: the estimator ˆm is a fluctuating random variable. Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving the Performance of the FDR Procedure Using an Estimator June 5, for 29 the Number 5 / 7 of T
6 Aim Produce an improved BH procedure using an estimator ˆm for m, the number of true null hypotheses. Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving the Performance of the FDR Procedure Using an Estimator June 5, for 29 the Number 6 / 7 of T
7 Definitions: monotonic estimator An estimator for m is a family of functions ˆm ˆm (m) : [,] m R, ˆm ˆm (p,..,p m ). ˆm is a monotonic estimator if it satisfies: ˆm (m) (p,..,p i,..,p m ) ˆm (m) (p,..,p i,..,p m), if p i p i, i =,2,,,m, m 2 ˆm (m) (p,..,p i,..,p m ) ˆm (m ) (p,..,p i,p i+,..,p m ), i =,2,,,m, m 2 Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving the Performance of the FDR Procedure Using an Estimator June 5, for 29 the Number 7 / 7 of T
8 Definitions: modified BH procedure (m ˆm ) Given m hypotheses of which m are null, let p,..,p m be the respective p-values. The modified BH procedure with estimator ˆm is: Compute ˆm ˆm (p,..,p m ). 2 Sort and relabel the p-values p ()... p (m). 3 Define the set of constants q k = qk ˆm k =,2...,m. 4 Let R = max{k : p (k) q k }. 5 If R reject p (),..,p (R) else don t reject any hypothesis. Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving the Performance of the FDR Procedure Using an Estimator June 5, for 29 the Number 8 / 7 of T
9 Theorem for control (see Benjamini et al 26 (BKY)) Let ˆm ˆm (p,..,p m ) be a monotonic estimator for m. Let ˆm ( ) (p,..,p m ) ˆm (p 2,..,p m ) be the same estimator, but disregarding the first p-value p. Assume that the null p-values are i.i.d. U[,]. Then the modified BH procedure satisfies: FDR = E [ ] V R + m qe [ ] Note: if E ˆm ( ) m, then FDR q. [ ˆm ( ) ] () Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving the Performance of the FDR Procedure Using an Estimator June 5, for 29 the Number 9 / 7 of T
10 Both procedures satisfy E(V /R + ) q Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving the Performance of the FDR Procedure Using an Estimator June 5, for 29 the Number / 7 of T Two improved procedures The two procedures are based on the estimators: [ IBHsum: ˆm = C(m) min m,max ( s(m),2 m j= p j) ]. C(m),s(m) are universal correction factors..3 (a) C (b).3 s/m m IBHlog: m = 2 m i= log( p i).
11 .5.3 Performance: simulations (IBHsum) ρ is the correlation between test statistics: µ q=.5, ρ= µ q=.5, ρ= µ m /m q=.2, ρ= m /m µ m /m q=.2, ρ= m /m Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving the Performance of the FDR Procedure Using an Estimator June 5, for 29 the Number / 7 of T
12 Compare to other methods Results from simulations, m = 5,µ = 3.5: ρ= E(V/R) ρ= Oracle BH (995) BKY (Benjamini et al 26) STS (Storey et al 24) IBHsum IBHlog E(S)/m m /m m /m Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving the Performance of the FDR Procedure Using an Estimator June 5, for 29 the Number 2 / 7 of T
13 Applying to 33 gene expression datasets Large number of discoveries Small number of discoveries.8.8 p Two tailed p One tailed i/m i/m Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving the Performance of the FDR Procedure Using an Estimator June 5, for 29 the Number 3 / 7 of T
14 q BKY STS IBHsum IBHlog a. Two tailed, large number of discoveries ( studies) (.43) (.38) (.) (.3) b. Two tailed, small number of discoveries ( studies) (.3) (.97) (.4) (.79) c. One tailed, large number of discoveries (8 studies) (.9) (.33) (.26) (.36) d. One tailed, small number of discoveries (5 studies) (.2) (.52) (.7) (.23) Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving the Performance of the FDR Procedure Using an Estimator June 5, for 29 the Number 4 / 7 of T
15 Summary and conclusions We proved a theorem that provides a bound on the FDR for any improved procedure based on an monotonic estimator of m. We proposed two improved procedures based on the estimators 2 m j= p j and m i= log( p i). For the case of independent statistics all improved procedures provide similar results: saturation of the bound and more power than BH95. We showed by simulations that even in the case of dependent statistics our procedures provide a reliable bound and improved power. For real gene expression data, where dependencies are expected, our methods improve, in general, over existing ones. Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving the Performance of the FDR Procedure Using an Estimator June 5, for 29 the Number 5 / 7 of T
16 Theorem for monotonicity Let p = (p..m ) be a set of independent p-values. Assume that f, the marginal probability density function of the alternatives, is monotonically non-increasing and differentiable. Let B (i) be two threshold FDR procedures rejecting R (i) ( p) hypotheses and each having FDR (i), i =,2. Assume that for any q, R () ( p) R (2) ( p), p. Then it also holds that FDR () FDR (2). Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving the Performance of the FDR Procedure Using an Estimator June 5, for 29 the Number 6 / 7 of T
17 THANKS Amit Zeisel, Or Zuk, Eytan Domany (W.I.S.)Improving the Performance of the FDR Procedure Using an Estimator June 5, for 29 the Number 7 / 7 of T
Looking at the Other Side of Bonferroni
Department of Biostatistics University of Washington 24 May 2012 Multiple Testing: Control the Type I Error Rate When analyzing genetic data, one will commonly perform over 1 million (and growing) hypothesis
More informationarxiv: v1 [math.st] 31 Mar 2009
The Annals of Statistics 2009, Vol. 37, No. 2, 619 629 DOI: 10.1214/07-AOS586 c Institute of Mathematical Statistics, 2009 arxiv:0903.5373v1 [math.st] 31 Mar 2009 AN ADAPTIVE STEP-DOWN PROCEDURE WITH PROVEN
More informationResampling-Based Control of the FDR
Resampling-Based Control of the FDR Joseph P. Romano 1 Azeem S. Shaikh 2 and Michael Wolf 3 1 Departments of Economics and Statistics Stanford University 2 Department of Economics University of Chicago
More informationHigh-Throughput Sequencing Course. Introduction. Introduction. Multiple Testing. Biostatistics and Bioinformatics. Summer 2018
High-Throughput Sequencing Course Multiple Testing Biostatistics and Bioinformatics Summer 2018 Introduction You have previously considered the significance of a single gene Introduction You have previously
More informationHeterogeneity and False Discovery Rate Control
Heterogeneity and False Discovery Rate Control Joshua D Habiger Oklahoma State University jhabige@okstateedu URL: jdhabigerokstateedu August, 2014 Motivating Data: Anderson and Habiger (2012) M = 778 bacteria
More informationAlpha-Investing. Sequential Control of Expected False Discoveries
Alpha-Investing Sequential Control of Expected False Discoveries Dean Foster Bob Stine Department of Statistics Wharton School of the University of Pennsylvania www-stat.wharton.upenn.edu/ stine Joint
More informationSummary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing
Summary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing Statistics Journal Club, 36-825 Beau Dabbs and Philipp Burckhardt 9-19-2014 1 Paper
More informationStatistical testing. Samantha Kleinberg. October 20, 2009
October 20, 2009 Intro to significance testing Significance testing and bioinformatics Gene expression: Frequently have microarray data for some group of subjects with/without the disease. Want to find
More informationNon-specific filtering and control of false positives
Non-specific filtering and control of false positives Richard Bourgon 16 June 2009 bourgon@ebi.ac.uk EBI is an outstation of the European Molecular Biology Laboratory Outline Multiple testing I: overview
More informationHigh-throughput Testing
High-throughput Testing Noah Simon and Richard Simon July 2016 1 / 29 Testing vs Prediction On each of n patients measure y i - single binary outcome (eg. progression after a year, PCR) x i - p-vector
More informationSta$s$cs for Genomics ( )
Sta$s$cs for Genomics (140.688) Instructor: Jeff Leek Slide Credits: Rafael Irizarry, John Storey No announcements today. Hypothesis testing Once you have a given score for each gene, how do you decide
More informationTable of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. T=number of type 2 errors
The Multiple Testing Problem Multiple Testing Methods for the Analysis of Microarray Data 3/9/2009 Copyright 2009 Dan Nettleton Suppose one test of interest has been conducted for each of m genes in a
More informationThe miss rate for the analysis of gene expression data
Biostatistics (2005), 6, 1,pp. 111 117 doi: 10.1093/biostatistics/kxh021 The miss rate for the analysis of gene expression data JONATHAN TAYLOR Department of Statistics, Stanford University, Stanford,
More informationStat 206: Estimation and testing for a mean vector,
Stat 206: Estimation and testing for a mean vector, Part II James Johndrow 2016-12-03 Comparing components of the mean vector In the last part, we talked about testing the hypothesis H 0 : µ 1 = µ 2 where
More informationFalse Discovery Rate
False Discovery Rate Peng Zhao Department of Statistics Florida State University December 3, 2018 Peng Zhao False Discovery Rate 1/30 Outline 1 Multiple Comparison and FWER 2 False Discovery Rate 3 FDR
More informationStatistical Emerging Pattern Mining with Multiple Testing Correction
Statistical Emerging Pattern Mining with Multiple Testing Correction Junpei Komiyama 1, Masakazu Ishihata 2, Hiroki Arimura 2, Takashi Nishibayashi 3, Shin-Ichi Minato 2 1. U-Tokyo 2. Hokkaido Univ. 3.
More informationEstimation of the False Discovery Rate
Estimation of the False Discovery Rate Coffee Talk, Bioinformatics Research Center, Sept, 2005 Jason A. Osborne, osborne@stat.ncsu.edu Department of Statistics, North Carolina State University 1 Outline
More informationControlling the False Discovery Rate: Understanding and Extending the Benjamini-Hochberg Method
Controlling the False Discovery Rate: Understanding and Extending the Benjamini-Hochberg Method Christopher R. Genovese Department of Statistics Carnegie Mellon University joint work with Larry Wasserman
More informationLecture 28. Ingo Ruczinski. December 3, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University
Lecture 28 Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University December 3, 2015 1 2 3 4 5 1 Familywise error rates 2 procedure 3 Performance of with multiple
More informationA Large-Sample Approach to Controlling the False Discovery Rate
A Large-Sample Approach to Controlling the False Discovery Rate Christopher R. Genovese Department of Statistics Carnegie Mellon University Larry Wasserman Department of Statistics Carnegie Mellon University
More informationFalse discovery rate and related concepts in multiple comparisons problems, with applications to microarray data
False discovery rate and related concepts in multiple comparisons problems, with applications to microarray data Ståle Nygård Trial Lecture Dec 19, 2008 1 / 35 Lecture outline Motivation for not using
More informationJournal Club: Higher Criticism
Journal Club: Higher Criticism David Donoho (2002): Higher Criticism for Heterogeneous Mixtures, Technical Report No. 2002-12, Dept. of Statistics, Stanford University. Introduction John Tukey (1976):
More informationAnnouncements. Proposals graded
Announcements Proposals graded Kevin Jamieson 2018 1 Hypothesis testing Machine Learning CSE546 Kevin Jamieson University of Washington October 30, 2018 2018 Kevin Jamieson 2 Anomaly detection You are
More informationMultiple Testing. Hoang Tran. Department of Statistics, Florida State University
Multiple Testing Hoang Tran Department of Statistics, Florida State University Large-Scale Testing Examples: Microarray data: testing differences in gene expression between two traits/conditions Microbiome
More informationAdaptive Filtering Multiple Testing Procedures for Partial Conjunction Hypotheses
Adaptive Filtering Multiple Testing Procedures for Partial Conjunction Hypotheses arxiv:1610.03330v1 [stat.me] 11 Oct 2016 Jingshu Wang, Chiara Sabatti, Art B. Owen Department of Statistics, Stanford University
More informationLecture 7 April 16, 2018
Stats 300C: Theory of Statistics Spring 2018 Lecture 7 April 16, 2018 Prof. Emmanuel Candes Scribe: Feng Ruan; Edited by: Rina Friedberg, Junjie Zhu 1 Outline Agenda: 1. False Discovery Rate (FDR) 2. Properties
More informationAliaksandr Hubin University of Oslo Aliaksandr Hubin (UIO) Bayesian FDR / 25
Presentation of The Paper: The Positive False Discovery Rate: A Bayesian Interpretation and the q-value, J.D. Storey, The Annals of Statistics, Vol. 31 No.6 (Dec. 2003), pp 2013-2035 Aliaksandr Hubin University
More informationStatistical Applications in Genetics and Molecular Biology
Statistical Applications in Genetics and Molecular Biology Volume 5, Issue 1 2006 Article 28 A Two-Step Multiple Comparison Procedure for a Large Number of Tests and Multiple Treatments Hongmei Jiang Rebecca
More informationOn Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses
On Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses Gavin Lynch Catchpoint Systems, Inc., 228 Park Ave S 28080 New York, NY 10003, U.S.A. Wenge Guo Department of Mathematical
More informationIntroductory Econometrics
Session 4 - Testing hypotheses Roland Sciences Po July 2011 Motivation After estimation, delivering information involves testing hypotheses Did this drug had any effect on the survival rate? Is this drug
More informationPROCEDURES CONTROLLING THE k-fdr USING. BIVARIATE DISTRIBUTIONS OF THE NULL p-values. Sanat K. Sarkar and Wenge Guo
PROCEDURES CONTROLLING THE k-fdr USING BIVARIATE DISTRIBUTIONS OF THE NULL p-values Sanat K. Sarkar and Wenge Guo Temple University and National Institute of Environmental Health Sciences Abstract: Procedures
More informationAdaptive FDR control under independence and dependence
Adaptive FDR control under independence and dependence Gilles Blanchard, Etienne Roquain To cite this version: Gilles Blanchard, Etienne Roquain. Adaptive FDR control under independence and dependence.
More informationA Sequential Bayesian Approach with Applications to Circadian Rhythm Microarray Gene Expression Data
A Sequential Bayesian Approach with Applications to Circadian Rhythm Microarray Gene Expression Data Faming Liang, Chuanhai Liu, and Naisyin Wang Texas A&M University Multiple Hypothesis Testing Introduction
More informationA NEW APPROACH FOR LARGE SCALE MULTIPLE TESTING WITH APPLICATION TO FDR CONTROL FOR GRAPHICALLY STRUCTURED HYPOTHESES
A NEW APPROACH FOR LARGE SCALE MULTIPLE TESTING WITH APPLICATION TO FDR CONTROL FOR GRAPHICALLY STRUCTURED HYPOTHESES By Wenge Guo Gavin Lynch Joseph P. Romano Technical Report No. 2018-06 September 2018
More informationOptional Stopping Theorem Let X be a martingale and T be a stopping time such
Plan Counting, Renewal, and Point Processes 0. Finish FDR Example 1. The Basic Renewal Process 2. The Poisson Process Revisited 3. Variants and Extensions 4. Point Processes Reading: G&S: 7.1 7.3, 7.10
More informationTools and topics for microarray analysis
Tools and topics for microarray analysis USSES Conference, Blowing Rock, North Carolina, June, 2005 Jason A. Osborne, osborne@stat.ncsu.edu Department of Statistics, North Carolina State University 1 Outline
More informationHunting for significance with multiple testing
Hunting for significance with multiple testing Etienne Roquain 1 1 Laboratory LPMA, Université Pierre et Marie Curie (Paris 6), France Séminaire MODAL X, 19 mai 216 Etienne Roquain Hunting for significance
More informationEffects of dependence in high-dimensional multiple testing problems. Kyung In Kim and Mark van de Wiel
Effects of dependence in high-dimensional multiple testing problems Kyung In Kim and Mark van de Wiel Department of Mathematics, Vrije Universiteit Amsterdam. Contents 1. High-dimensional multiple testing
More informationFalse Discovery Control in Spatial Multiple Testing
False Discovery Control in Spatial Multiple Testing WSun 1,BReich 2,TCai 3, M Guindani 4, and A. Schwartzman 2 WNAR, June, 2012 1 University of Southern California 2 North Carolina State University 3 University
More informationSTAT 263/363: Experimental Design Winter 2016/17. Lecture 1 January 9. Why perform Design of Experiments (DOE)? There are at least two reasons:
STAT 263/363: Experimental Design Winter 206/7 Lecture January 9 Lecturer: Minyong Lee Scribe: Zachary del Rosario. Design of Experiments Why perform Design of Experiments (DOE)? There are at least two
More informationAdvanced Statistical Methods: Beyond Linear Regression
Advanced Statistical Methods: Beyond Linear Regression John R. Stevens Utah State University Notes 3. Statistical Methods II Mathematics Educators Worshop 28 March 2009 1 http://www.stat.usu.edu/~jrstevens/pcmi
More informationTests for Two Coefficient Alphas
Chapter 80 Tests for Two Coefficient Alphas Introduction Coefficient alpha, or Cronbach s alpha, is a popular measure of the reliability of a scale consisting of k parts. The k parts often represent k
More informationMultiple testing with the structure-adaptive Benjamini Hochberg algorithm
J. R. Statist. Soc. B (2019) 81, Part 1, pp. 45 74 Multiple testing with the structure-adaptive Benjamini Hochberg algorithm Ang Li and Rina Foygel Barber University of Chicago, USA [Received June 2016.
More informationSTAT 461/561- Assignments, Year 2015
STAT 461/561- Assignments, Year 2015 This is the second set of assignment problems. When you hand in any problem, include the problem itself and its number. pdf are welcome. If so, use large fonts and
More informationIncorporation of Sparsity Information in Large-scale Multiple Two-sample t Tests
Incorporation of Sparsity Information in Large-scale Multiple Two-sample t Tests Weidong Liu October 19, 2014 Abstract Large-scale multiple two-sample Student s t testing problems often arise from the
More informationModel Identification for Wireless Propagation with Control of the False Discovery Rate
Model Identification for Wireless Propagation with Control of the False Discovery Rate Christoph F. Mecklenbräuker (TU Wien) Joint work with Pei-Jung Chung (Univ. Edinburgh) Dirk Maiwald (Atlas Elektronik)
More informationProbabilistic Inference for Multiple Testing
This is the title page! This is the title page! Probabilistic Inference for Multiple Testing Chuanhai Liu and Jun Xie Department of Statistics, Purdue University, West Lafayette, IN 47907. E-mail: chuanhai,
More informationMultiple testing: Intro & FWER 1
Multiple testing: Intro & FWER 1 Mark van de Wiel mark.vdwiel@vumc.nl Dep of Epidemiology & Biostatistics,VUmc, Amsterdam Dep of Mathematics, VU 1 Some slides courtesy of Jelle Goeman 1 Practical notes
More informationAdaptive False Discovery Rate Control for Heterogeneous Data
Adaptive False Discovery Rate Control for Heterogeneous Data Joshua D. Habiger Department of Statistics Oklahoma State University 301G MSCS Stillwater, OK 74078 June 11, 2015 Abstract Efforts to develop
More informationOn Methods Controlling the False Discovery Rate 1
Sankhyā : The Indian Journal of Statistics 2008, Volume 70-A, Part 2, pp. 135-168 c 2008, Indian Statistical Institute On Methods Controlling the False Discovery Rate 1 Sanat K. Sarkar Temple University,
More informationThe optimal discovery procedure: a new approach to simultaneous significance testing
J. R. Statist. Soc. B (2007) 69, Part 3, pp. 347 368 The optimal discovery procedure: a new approach to simultaneous significance testing John D. Storey University of Washington, Seattle, USA [Received
More informationBIBLE CODE PREDICTION. and multiple comparisons problem
BIBLE CODE PREDICTION and multiple comparisons problem Equidistant Letter Sequences in the Book of Genesis Doron Witztum, Eliyahu Rips, and Yoav Rosenberg Statistical Science, Vol. 9 (1994) 429-438. Bible
More informationLecture 27. December 13, Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Johns Hopkins University.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this
More informationDepartment of Statistics University of Central Florida. Technical Report TR APR2007 Revised 25NOV2007
Department of Statistics University of Central Florida Technical Report TR-2007-01 25APR2007 Revised 25NOV2007 Controlling the Number of False Positives Using the Benjamini- Hochberg FDR Procedure Paul
More informationhypothesis a claim about the value of some parameter (like p)
Testing hypotheses hypothesis a claim about the value of some parameter (like p) significance test procedure to assess the strength of evidence provided by a sample of data against the claim of a hypothesized
More informationPlan Martingales cont d. 0. Questions for Exam 2. More Examples 3. Overview of Results. Reading: study Next Time: first exam
Plan Martingales cont d 0. Questions for Exam 2. More Examples 3. Overview of Results Reading: study Next Time: first exam Midterm Exam: Tuesday 28 March in class Sample exam problems ( Homework 5 and
More informationStatistical Applications in Genetics and Molecular Biology
Statistical Applications in Genetics and Molecular Biology Volume, Issue 006 Article 9 A Method to Increase the Power of Multiple Testing Procedures Through Sample Splitting Daniel Rubin Sandrine Dudoit
More informationModified Simes Critical Values Under Positive Dependence
Modified Simes Critical Values Under Positive Dependence Gengqian Cai, Sanat K. Sarkar Clinical Pharmacology Statistics & Programming, BDS, GlaxoSmithKline Statistics Department, Temple University, Philadelphia
More informationInferential Statistics Hypothesis tests Confidence intervals
Inferential Statistics Hypothesis tests Confidence intervals Eva Riccomagno, Maria Piera Rogantin DIMA Università di Genova riccomagno@dima.unige.it rogantin@dima.unige.it Part G. Multiple tests Part H.
More informationSequential Analysis & Testing Multiple Hypotheses,
Sequential Analysis & Testing Multiple Hypotheses, CS57300 - Data Mining Spring 2016 Instructor: Bruno Ribeiro 2016 Bruno Ribeiro This Class: Sequential Analysis Testing Multiple Hypotheses Nonparametric
More informationhypothesis testing 1
hypothesis testing 1 Does smoking cause cancer? competing hypotheses (a) No; we don t know what causes cancer, but smokers are no more likely to get it than nonsmokers (b) Yes; a much greater % of smokers
More informationFalse discovery control for multiple tests of association under general dependence
False discovery control for multiple tests of association under general dependence Nicolai Meinshausen Seminar für Statistik ETH Zürich December 2, 2004 Abstract We propose a confidence envelope for false
More informationLarge-Scale Multiple Testing of Correlations
Large-Scale Multiple Testing of Correlations T. Tony Cai and Weidong Liu Abstract Multiple testing of correlations arises in many applications including gene coexpression network analysis and brain connectivity
More informationStep-down FDR Procedures for Large Numbers of Hypotheses
Step-down FDR Procedures for Large Numbers of Hypotheses Paul N. Somerville University of Central Florida Abstract. Somerville (2004b) developed FDR step-down procedures which were particularly appropriate
More informationEstimation of a Two-component Mixture Model
Estimation of a Two-component Mixture Model Bodhisattva Sen 1,2 University of Cambridge, Cambridge, UK Columbia University, New York, USA Indian Statistical Institute, Kolkata, India 6 August, 2012 1 Joint
More informationNew Approaches to False Discovery Control
New Approaches to False Discovery Control Christopher R. Genovese Department of Statistics Carnegie Mellon University http://www.stat.cmu.edu/ ~ genovese/ Larry Wasserman Department of Statistics Carnegie
More informationOn adaptive procedures controlling the familywise error rate
, pp. 3 On adaptive procedures controlling the familywise error rate By SANAT K. SARKAR Temple University, Philadelphia, PA 922, USA sanat@temple.edu Summary This paper considers the problem of developing
More informationReview: General Approach to Hypothesis Testing. 1. Define the research question and formulate the appropriate null and alternative hypotheses.
1 Review: Let X 1, X,..., X n denote n independent random variables sampled from some distribution might not be normal!) with mean µ) and standard deviation σ). Then X µ σ n In other words, X is approximately
More informationSpecific Differences. Lukas Meier, Seminar für Statistik
Specific Differences Lukas Meier, Seminar für Statistik Problem with Global F-test Problem: Global F-test (aka omnibus F-test) is very unspecific. Typically: Want a more precise answer (or have a more
More informationLarge-Scale Hypothesis Testing
Chapter 2 Large-Scale Hypothesis Testing Progress in statistics is usually at the mercy of our scientific colleagues, whose data is the nature from which we work. Agricultural experimentation in the early
More informationResearch Article Sample Size Calculation for Controlling False Discovery Proportion
Probability and Statistics Volume 2012, Article ID 817948, 13 pages doi:10.1155/2012/817948 Research Article Sample Size Calculation for Controlling False Discovery Proportion Shulian Shang, 1 Qianhe Zhou,
More informationPeak Detection for Images
Peak Detection for Images Armin Schwartzman Division of Biostatistics, UC San Diego June 016 Overview How can we improve detection power? Use a less conservative error criterion Take advantage of prior
More informationTwo-stage stepup procedures controlling FDR
Journal of Statistical Planning and Inference 38 (2008) 072 084 www.elsevier.com/locate/jspi Two-stage stepup procedures controlling FDR Sanat K. Sarar Department of Statistics, Temple University, Philadelphia,
More informationThe Pennsylvania State University The Graduate School Eberly College of Science GENERALIZED STEPWISE PROCEDURES FOR
The Pennsylvania State University The Graduate School Eberly College of Science GENERALIZED STEPWISE PROCEDURES FOR CONTROLLING THE FALSE DISCOVERY RATE A Dissertation in Statistics by Scott Roths c 2011
More informationSTAT 5200 Handout #7a Contrasts & Post hoc Means Comparisons (Ch. 4-5)
STAT 5200 Handout #7a Contrasts & Post hoc Means Comparisons Ch. 4-5) Recall CRD means and effects models: Y ij = µ i + ϵ ij = µ + α i + ϵ ij i = 1,..., g ; j = 1,..., n ; ϵ ij s iid N0, σ 2 ) If we reject
More informationSome General Types of Tests
Some General Types of Tests We may not be able to find a UMP or UMPU test in a given situation. In that case, we may use test of some general class of tests that often have good asymptotic properties.
More informationNew Procedures for False Discovery Control
New Procedures for False Discovery Control Christopher R. Genovese Department of Statistics Carnegie Mellon University http://www.stat.cmu.edu/ ~ genovese/ Elisha Merriam Department of Neuroscience University
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationSanat Sarkar Department of Statistics, Temple University Philadelphia, PA 19122, U.S.A. September 11, Abstract
Adaptive Controls of FWER and FDR Under Block Dependence arxiv:1611.03155v1 [stat.me] 10 Nov 2016 Wenge Guo Department of Mathematical Sciences New Jersey Institute of Technology Newark, NJ 07102, U.S.A.
More informationIntroduction to Statistical Hypothesis Testing
Introduction to Statistical Hypothesis Testing Arun K. Tangirala Power of Hypothesis Tests Arun K. Tangirala, IIT Madras Intro to Statistical Hypothesis Testing 1 Learning objectives I Computing Pr(Type
More informationComments on: Control of the false discovery rate under dependence using the bootstrap and subsampling
Test (2008) 17: 443 445 DOI 10.1007/s11749-008-0127-5 DISCUSSION Comments on: Control of the false discovery rate under dependence using the bootstrap and subsampling José A. Ferreira Mark A. van de Wiel
More informationApplying the Benjamini Hochberg procedure to a set of generalized p-values
U.U.D.M. Report 20:22 Applying the Benjamini Hochberg procedure to a set of generalized p-values Fredrik Jonsson Department of Mathematics Uppsala University Applying the Benjamini Hochberg procedure
More informationLarge-Scale Multiple Testing of Correlations
University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 5-5-2016 Large-Scale Multiple Testing of Correlations T. Tony Cai University of Pennsylvania Weidong Liu Follow this
More informationDoing Cosmology with Balls and Envelopes
Doing Cosmology with Balls and Envelopes Christopher R. Genovese Department of Statistics Carnegie Mellon University http://www.stat.cmu.edu/ ~ genovese/ Larry Wasserman Department of Statistics Carnegie
More informationEstimation and Control of the False Discovery Rate of Bayesian Network Skeleton Identification
Estimation and Control of the False Discovery Rate of Bayesian Network Skeleton Identification Angelos P. Armen and Ioannis Tsamardinos Computer Science Department, University of Crete, and Institute of
More informationarxiv: v2 [stat.me] 14 Mar 2011
Submission Journal de la Société Française de Statistique arxiv: 1012.4078 arxiv:1012.4078v2 [stat.me] 14 Mar 2011 Type I error rate control for testing many hypotheses: a survey with proofs Titre: Une
More informationFALSE DISCOVERY AND FALSE NONDISCOVERY RATES IN SINGLE-STEP MULTIPLE TESTING PROCEDURES 1. BY SANAT K. SARKAR Temple University
The Annals of Statistics 2006, Vol. 34, No. 1, 394 415 DOI: 10.1214/009053605000000778 Institute of Mathematical Statistics, 2006 FALSE DISCOVERY AND FALSE NONDISCOVERY RATES IN SINGLE-STEP MULTIPLE TESTING
More informationControlling the False Discovery Rate in Two-Stage. Combination Tests for Multiple Endpoints
Controlling the False Discovery Rate in Two-Stage Combination Tests for Multiple ndpoints Sanat K. Sarkar, Jingjing Chen and Wenge Guo May 29, 2011 Sanat K. Sarkar is Professor and Senior Research Fellow,
More informationControl of the False Discovery Rate under Dependence using the Bootstrap and Subsampling
Institute for Empirical Research in Economics University of Zurich Working Paper Series ISSN 1424-0459 Working Paper No. 337 Control of the False Discovery Rate under Dependence using the Bootstrap and
More informationControlling Bayes Directional False Discovery Rate in Random Effects Model 1
Controlling Bayes Directional False Discovery Rate in Random Effects Model 1 Sanat K. Sarkar a, Tianhui Zhou b a Temple University, Philadelphia, PA 19122, USA b Wyeth Pharmaceuticals, Collegeville, PA
More informationTesting Jumps via False Discovery Rate Control
Testing Jumps via False Discovery Rate Control Yu-Min Yen August 12, 2011 Abstract Many recently developed nonparametric jump tests can be viewed as multiple hypothesis testing problems. For such multiple
More informationMultiple Testing Issues & K-Means Clustering. Definitions related to the significance level (or type I error) of multiple tests
StatsM254 Statistical Methods in Coputational Biology Lecture 3-04/08/204 Multiple Testing Issues & K-Means Clustering Lecturer: Jingyi Jessica Li Scribe: Arturo Rairez Multiple Testing Issues When trying
More informationFDR-CONTROLLING STEPWISE PROCEDURES AND THEIR FALSE NEGATIVES RATES
FDR-CONTROLLING STEPWISE PROCEDURES AND THEIR FALSE NEGATIVES RATES Sanat K. Sarkar a a Department of Statistics, Temple University, Speakman Hall (006-00), Philadelphia, PA 19122, USA Abstract The concept
More informationSimultaneous Testing of Grouped Hypotheses: Finding Needles in Multiple Haystacks
University of Pennsylvania ScholarlyCommons Statistics Papers Wharton Faculty Research 2009 Simultaneous Testing of Grouped Hypotheses: Finding Needles in Multiple Haystacks T. Tony Cai University of Pennsylvania
More informationCHOOSING THE LESSER EVIL: TRADE-OFF BETWEEN FALSE DISCOVERY RATE AND NON-DISCOVERY RATE
Statistica Sinica 18(2008), 861-879 CHOOSING THE LESSER EVIL: TRADE-OFF BETWEEN FALSE DISCOVERY RATE AND NON-DISCOVERY RATE Radu V. Craiu and Lei Sun University of Toronto Abstract: The problem of multiple
More informationRejoinder on: Control of the false discovery rate under dependence using the bootstrap and subsampling
Test (2008) 17: 461 471 DOI 10.1007/s11749-008-0134-6 DISCUSSION Rejoinder on: Control of the false discovery rate under dependence using the bootstrap and subsampling Joseph P. Romano Azeem M. Shaikh
More informationHypothesis Tests Solutions COR1-GB.1305 Statistics and Data Analysis
Hypothesis Tests Solutions COR1-GB.1305 Statistics and Data Analysis Introduction 1. An analyst claims to have a reliable model for Twitter s quarterly revenues. His model predicted that the most recent
More informationarxiv: v2 [stat.me] 31 Aug 2017
Multiple testing with discrete data: proportion of true null hypotheses and two adaptive FDR procedures Xiongzhi Chen, Rebecca W. Doerge and Joseph F. Heyse arxiv:4274v2 [stat.me] 3 Aug 207 Abstract We
More informationVisual interpretation with normal approximation
Visual interpretation with normal approximation H 0 is true: H 1 is true: p =0.06 25 33 Reject H 0 α =0.05 (Type I error rate) Fail to reject H 0 β =0.6468 (Type II error rate) 30 Accept H 1 Visual interpretation
More informationarxiv: v1 [stat.me] 25 Aug 2016
Empirical Null Estimation using Discrete Mixture Distributions and its Application to Protein Domain Data arxiv:1608.07204v1 [stat.me] 25 Aug 2016 Iris Ivy Gauran 1, Junyong Park 1, Johan Lim 2, DoHwan
More information