False discovery rate and related concepts in multiple comparisons problems, with applications to microarray data

False discovery rate and related concepts in multiple comparisons problems, with applications to microarray data
Ståle Nygård, Trial Lecture, Dec 19, 2008

1 / 35 Lecture outline
- Motivation for not using classical p values in large-scale simultaneous multiple testing situations
- False discovery rate (FDR) and other multiple testing error measurements
- Estimation of FDR
- FDR, power and sample size

2 / 35 Classical single hypothesis testing
Let $\mu$ be the difference in means between two groups. We want to test the hypotheses $H_0: \mu = 0$ vs $H_1: \mu \neq 0$.
Observations in group 1: $X = X_1, X_2, \ldots, X_{n_x}$
Observations in group 2: $Y = Y_1, Y_2, \ldots, Y_{n_y}$
Test procedure: find a test statistic $Z = h(X, Y)$. Reject $H_0$ if $p = 2P(Z > z_{\text{obs}} \mid H_0 \text{ true}) < \alpha$, where $\alpha$ is the significance level (e.g. 0.05) and $z_{\text{obs}}$ is the observed value of $Z$.
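
As a concrete illustration (not from the slides), a two-sample test of this form can be run on simulated data; here Welch's t-test from scipy plays the role of the test statistic $Z$, and the names and data are ours:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=20)   # group 1 observations
y = rng.normal(loc=0.5, scale=1.0, size=20)   # group 2 observations

# Two-sided Welch t-test: reject H0: mu = 0 if the p value falls below alpha
t_obs, p_value = stats.ttest_ind(x, y, equal_var=False)
alpha = 0.05
print(f"t = {t_obs:.2f}, p = {p_value:.4f}, reject H0: {p_value < alpha}")
```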

3 / 35 When $H_0$ is correct
[Figure: distribution of the test statistic and histogram of the resulting p values.]
Given that the model for the data used under $H_0$ is correct, p values have a Uniform(0,1) distribution.

4 / 35 Single hypothesis testing set-up

              Not reject $H_0$    Reject $H_0$
$H_0$ true    Correct             Type I error
$H_0$ false   Type II error       Correct

Significance level = P(Type I error) = $\alpha$.
Power = 1 − P(Type II error) = $1 - \beta$, i.e. the probability of detecting a difference when there is a true difference.

5 / 35 Microarrays
Microarrays measure differences in expression levels between two conditions, e.g. sick vs healthy.
[Figure: microarray gene expressions; spots indicate genes more expressed in the sick individual, genes more expressed in the healthy individual, and genes with the same expression level in both.]

6 / 35 Microarray test statistic
We want to test differential expression between two groups for $i = 1, \ldots, m$ genes ($m$ of order 10000). This can be done using the ordinary two-sample t statistic
$$t_i = \frac{\bar{x}_i - \bar{y}_i}{\hat{\sigma}_i},$$
where $\hat{\sigma}_i$ is the (estimated) standard deviation of the difference $\bar{x}_i - \bar{y}_i$.
Variance estimates can be improved by borrowing strength across genes, a technique called variance shrinkage:
$$z_i = \frac{\bar{x}_i - \bar{y}_i}{\sqrt{B\,\hat{\sigma}^2_{\text{all}} + (1 - B)\,\hat{\sigma}^2_i}}.$$
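
A minimal numpy sketch of this gene-wise statistic, assuming expression matrices with genes in rows; the pooling of $\hat{\sigma}^2_{\text{all}}$ as the mean per-gene variance and the weight B are our own simple choices, not necessarily those used in the lecture:

```python
import numpy as np

def shrunken_t(X, Y, B=0.5):
    """Gene-wise shrunken t-like statistics for two expression matrices.

    X, Y : arrays of shape (m genes, n_x / n_y samples).
    B    : shrinkage weight toward the variance pooled across all genes.
    """
    nx, ny = X.shape[1], Y.shape[1]
    diff = X.mean(axis=1) - Y.mean(axis=1)
    # per-gene variance of the difference in means
    var_i = X.var(axis=1, ddof=1) / nx + Y.var(axis=1, ddof=1) / ny
    var_all = var_i.mean()                      # strength borrowed across genes
    return diff / np.sqrt(B * var_all + (1 - B) * var_i)
```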

7 / 35 Bootstrap estimated test statistic
Variance shrinkage is often accompanied by bootstrap estimation of the distribution of the test statistic under $H_0$.
For each of $B$ bootstrap samples: from the pooled data $\{x_1, \ldots, x_n, y_1, \ldots, y_n\}$, draw $\{x^*_1, \ldots, x^*_n\}$ and $\{y^*_1, \ldots, y^*_n\}$, and calculate the null statistic $z^*$ from the $x^*$'s and $y^*$'s.
Compare the observed test statistic $z_{\text{obs}}$ with the $B$ $z^*$ values.
[Figure: histogram of $z^*$ with the observed statistic $z_{\text{obs}}$ marked.]
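
A sketch of this resampling scheme, assuming the pooled observations are drawn with replacement so the group labels carry no signal; `stat` could be the `shrunken_t` helper from the previous sketch (all names are ours):

```python
import numpy as np

def bootstrap_null(X, Y, stat, B=200, seed=0):
    """Null distribution of a gene-wise statistic by resampling the pooled samples.

    stat(X, Y) must return one statistic per gene.
    Returns an array of shape (B, m) of null statistics z*.
    """
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([X, Y], axis=1)     # genes x (n_x + n_y)
    nx, n = X.shape[1], pooled.shape[1]
    null = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)        # draw columns with replacement
        Z = pooled[:, idx]
        null.append(stat(Z[:, :nx], Z[:, nx:]))
    return np.asarray(null)
```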

8 / 35 P values from a microarray experiment
[Figure: histograms of p values for null genes (approximately uniform), for non-null genes (concentrated near 0), and for all genes on the microarray; a cut-off at $\alpha$ splits the genes into true positives, false positives, true negatives and false negatives.]

9 / 35 Multiple testing set-up

              Not reject $H_0$    Reject $H_0$    Total
$H_0$ true    TN                  FP              $m_0$
$H_0$ false   FN                  TP              $m - m_0$
Total         $m - R$             $R$             $m$

$m$ = # of hypotheses, $m_0$ = # of true $H_0$'s, $R$ = # of rejected $H_0$'s,
TP = # of true positives, FP = # of false positives, TN = # of true negatives, FN = # of false negatives.

10 / 35 Type I error rates
Family-wise error rate (FWER): $\text{FWER} = P(FP \geq 1)$.
False discovery rate (FDR): $\text{FDR} = E\{\frac{FP}{R} I(R > 0)\}$, i.e. the expected proportion of falsely rejected $H_0$'s among all rejections if there are any rejections, otherwise zero.
Positive false discovery rate (pFDR): $\text{pFDR} = E(\frac{FP}{R} \mid R > 0)$, i.e. the same as FDR, but conditioned on having at least one rejection.
Per comparison error rate (PCER): $\text{PCER} = E(FP)/m$.

11 / 35 Family-wise error rate (FWER)
The usual way of controlling for multiple testing in the pre-genomic era. $\text{FWER} = P(FP \geq 1)$ is the probability of at least one false positive.
Most common method: Bonferroni (1936): $\tilde{p} = \min(mp, 1)$.
Other methods: Šidák (1967); stepwise procedures, e.g. Holm (1979); Westfall & Young (1993).
For genome-wide data, controlling FWER leads to very low power!
A less conservative approach is the generalized FWER (Dudoit et al., 2004, and van der Laan et al., 2004), which controls $P(FP \geq k)$.
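
The Bonferroni adjustment named above is a one-liner on a vector of raw p values; a hedged numpy sketch (names are ours):

```python
import numpy as np

def bonferroni(p):
    """Bonferroni-adjusted p values: reject where the adjusted value is below alpha."""
    p = np.asarray(p, dtype=float)
    return np.minimum(len(p) * p, 1.0)
```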

12 / 35 False discovery rate (FDR): Benjamini and Hochberg (1995)
$\text{FDR} = E\{\frac{FP}{R} I(R > 0)\}$ is the expected proportion of false positives if there are any positives, else zero.
Common method: Benjamini & Hochberg's (BH) step-up procedure:
Let $p_{(1)} \leq p_{(2)} \leq \cdots \leq p_{(m)}$ be the ordered raw p values.
Let $\hat{k} = \max\{k : \frac{m p_{(k)}}{k} \leq \alpha\}$.
Reject all hypotheses whose p values are at most $p_{(\hat{k})}$, i.e. those corresponding to $p_{(1)}, \ldots, p_{(\hat{k})}$.
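
A compact sketch of the BH step-up rule on a vector of raw p values (illustrative only; names are ours):

```python
import numpy as np

def benjamini_hochberg(p, alpha=0.05):
    """Return a boolean mask of hypotheses rejected by the BH step-up procedure."""
    p = np.asarray(p, dtype=float)
    m = len(p)
    order = np.argsort(p)                      # indices giving p_(1) <= ... <= p_(m)
    ranked = p[order]
    below = ranked <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k_hat = np.max(np.nonzero(below)[0])   # largest k with m * p_(k) / k <= alpha
        reject[order[: k_hat + 1]] = True      # reject p_(1), ..., p_(k_hat)
    return reject
```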

13 / 35 BH step-up: motivation
$$\hat{k} = \max\{k : \tfrac{m p_{(k)}}{k} \leq \alpha\}$$
The core of the BH step-up is the quantity $\frac{m p_{(k)}}{k}$. Here $m_0 p_{(k)}$ is an estimate of the expected number of false positives when $p_{(k)}$ is the cut-off value for the raw p values. Since $m_0$ is unknown, $m$ is used as a conservative estimate of $m_0$. $\frac{m p_{(k)}}{k}$ is then an estimate of the expected proportion of false positives among the total number of positives $k$.

14 / 35 Modification for general dependence: Benjamini & Yekutieli (2001)
The Benjamini & Yekutieli (BY) step-up procedure modifies the threshold to allow general dependence:
$$\hat{k} = \max\Big\{k : \frac{m \sum_{l=1}^{m} \tfrac{1}{l}\; p_{(k)}}{k} \leq \alpha\Big\}$$
When $m$ is large, the penalty of the BY procedure is about $\log(m)$ compared to the BH procedure. This can be a large price to pay for allowing arbitrary dependence (Ge et al., 2003).
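
The BH sketch above adapts directly to the BY rule by deflating the level with the harmonic sum; again an illustrative adaptation, not code from the lecture:

```python
import numpy as np

def benjamini_yekutieli(p, alpha=0.05):
    """BY step-up: BH run at level alpha / c(m), where c(m) = sum_{l=1}^m 1/l."""
    m = len(p)
    c_m = np.sum(1.0 / np.arange(1, m + 1))
    return benjamini_hochberg(p, alpha=alpha / c_m)   # reuses the BH sketch above
```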

15 / 35 Proportion of true nulls
The number of null genes $m_0$ is unknown, and therefore so is the proportion $\pi_0 = m_0 / m$. $\pi_0$ is important in the estimation of FDR.

16 / 35 Estimating $\pi_0$: Schweder and Spjøtvoll's estimator
Look at an interval $[\lambda, 1]$ where most p values are assumed to come from true nulls. The Schweder and Spjøtvoll (1982) estimator is
$$\hat{\pi}_0(\lambda) = \frac{\#\{p_i > \lambda\}}{m(1 - \lambda)}$$
for a fixed $\lambda \in (0, 1)$.
[Figure: histogram of p values for null and non-null genes, with the cut-off $\lambda$ marked.]
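
A direct sketch of this estimator (the default $\lambda = 0.5$ and the cap at 1 are our choices):

```python
import numpy as np

def pi0_schweder_spjotvoll(p, lam=0.5):
    """Estimate the proportion of true nulls from the p values above a cut-off lambda."""
    p = np.asarray(p, dtype=float)
    pi0 = np.mean(p > lam) / (1.0 - lam)
    return min(pi0, 1.0)          # the ratio can exceed 1 by chance; cap it
```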

17 / 35 Estimating $\pi_0$ using a convex decreasing p value density (Langaas et al., 2005)
For $p$ close to 1, $f(p) \approx \pi_0$. It is reasonable to assume that $f(p)$ is decreasing in $p$. Assuming that $f(p)$ is also convex leads to improved estimation of $f(1)$, which can be used as an estimate of $\pi_0$.
[Figure: decreasing and convex decreasing p value densities.]

18 / 35 Inserting $\hat{\pi}_0$ to improve the FDR estimate
The BH step-up procedure finds $\hat{k} = \max\{k : \frac{m p_{(k)}}{k} \leq \alpha\}$, where $m$ is a conservative estimate of the number of true nulls. The BH procedure with adaptive control (Benjamini & Hochberg, 2000) finds
$$\hat{k} = \max\Big\{k : \frac{\hat{\pi}_0 m p_{(k)}}{k} \leq \alpha\Big\}.$$
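
Combining the two earlier sketches gives a simple adaptive BH procedure; note that plugging $\hat{\pi}_0$ into the threshold is equivalent to running BH at level $\alpha / \hat{\pi}_0$ (illustrative only):

```python
def adaptive_bh(p, alpha=0.05, lam=0.5):
    """Adaptive BH: plug an estimate of pi0 into the BH threshold."""
    pi0 = max(pi0_schweder_spjotvoll(p, lam=lam), 1e-8)   # guard against pi0 = 0
    # pi0 * m * p_(k) / k <= alpha  is the same as BH at level alpha / pi0
    return benjamini_hochberg(p, alpha=alpha / pi0)
```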

19 / 35 Mixture model for p values
According to Genovese & Wasserman (2002), the conditional distributions of the p values are:
Null genes: Uniform(0,1) (when the correct distribution for the test statistic is used to calculate the p values).
Non-null genes: $h(p)$.
The unconditional distribution of the p values is then $f(p) = \pi_0 \cdot 1 + (1 - \pi_0)\, h(p)$.

20 / 35 Mixture model for the test statistic
The unconditional distribution of the z values is (Efron et al., 2001)
$$f(z) = \pi_0 f_0(z) + (1 - \pi_0) f_1(z),$$
where $f_0(z)$ is the distribution of the test statistic $Z$ for null genes and $f_1(z)$ is the distribution of $Z$ for non-null genes.

21 / 35 (Empirical) Bayesian Fdr and local fdr
Assume (without loss of generality) that $H_0$ is rejected for large values of $Z$. The mixture-model-based, or (empirical) Bayesian, false discovery rate is
$$q(z) = \text{Fdr}(z) = P(H_0 \text{ true} \mid Z \geq z) = \frac{P(Z \geq z \mid H_0 \text{ true})\, P(H_0 \text{ true})}{P(Z \geq z)} = \frac{\pi_0 (1 - F_0(z))}{1 - F(z)},$$
where $F_0$ is the cumulative distribution of $Z$ under $H_0$, and $F$ is the unconditional cumulative distribution of $Z$.
The local fdr (locally at $Z = z$) is defined as (Efron et al., 2001)
$$\text{fdr}(z) = P(H_0 \text{ true} \mid Z = z) = \frac{\pi_0 f_0(z)}{f(z)}.$$
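
A rough illustration of the local fdr formula, assuming a theoretical N(0,1) null and a kernel density estimate for the unconditional density $f$ (both choices are ours); an estimate of $\pi_0$, e.g. from the earlier sketch applied to the corresponding p values, can be supplied:

```python
import numpy as np
from scipy import stats

def local_fdr(z, pi0=1.0):
    """Estimate fdr(z) = pi0 * f0(z) / f(z) with an N(0,1) null and a KDE for f."""
    z = np.asarray(z, dtype=float)
    f_hat = stats.gaussian_kde(z)(z)            # smoothed unconditional density f(z)
    f0 = stats.norm.pdf(z)                      # theoretical null density
    return np.minimum(pi0 * f0 / f_hat, 1.0)    # fdr is a probability, cap at 1
```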

22 / 35 Connection between BH ("frequentist") FDR and empirical Bayesian Fdr
Frequentist procedure: the BH step-up procedure with adaptive control finds $\hat{k} = \max\{k : \frac{\hat{\pi}_0 p_{(k)}}{k/m} \leq \alpha\}$. Rejecting $p_{(1)}, \ldots, p_{(\hat{k})}$ provides $\text{FDR} \leq \alpha$.
Let $z_{(1)} \geq z_{(2)} \geq \cdots \geq z_{(m)}$ be the ordered z values. The empirical Bayesian procedure finds $\hat{l} = \max\{l : \widehat{\text{Fdr}}(z_{(l)}) \leq \alpha\}$, where
$$\widehat{\text{Fdr}}(z_{(l)}) = \frac{\hat{\pi}_0 P(Z \geq z_{(l)} \mid H_0 \text{ true})}{\hat{P}(Z \geq z_{(l)})} = \frac{\hat{\pi}_0 p_{(l)}}{l/m}.$$

23 / 35 Estimation under the mixture model
Recall the mixture model $f(z) = \pi_0 f_0(z) + (1 - \pi_0) f_1(z)$.
The null distribution $f_0(z)$ is usually assumed to be N(0,1) (but the normality assumption may be violated), or it is found by bootstrap estimation via resampling of group labels.
The unconditional distribution $f(z)$ can be approximated by smoothing the empirical distribution.

24 / 35 Estimation under the mixture model (continued)
An upper bound for $\pi_0$ can be found by requiring (Efron et al., 2001)
$$1 - \text{fdr}(z) = 1 - \pi_0 f_0(z)/f(z) > 0 \quad \text{for all } z.$$
This yields $\pi_0 \leq \min_z f(z)/f_0(z)$.
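
Continuing the previous sketch, this bound can be read off the same density estimates; the N(0,1) null, the KDE for $f$, and the evaluation grid over the observed range are our assumptions, and the ratio is unstable in the extreme tails, so this is only a rough illustration:

```python
import numpy as np
from scipy import stats

def pi0_upper_bound(z):
    """Upper bound pi0 <= min_z f(z) / f0(z), with a KDE for f and N(0,1) for f0."""
    z = np.asarray(z, dtype=float)
    grid = np.linspace(z.min(), z.max(), 512)
    ratio = stats.gaussian_kde(z)(grid) / stats.norm.pdf(grid)
    return min(ratio.min(), 1.0)
```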

25 / 35 Violation of the N(0,1) assumption
The null distribution is not necessarily N(0,1). Deviations from N(0,1) are caused by:
(1) Non-normal data with $n$ too small for asymptotic theory to be valid.
(2) Unobserved covariates, which inflate the distribution.
(3) Correlation across arrays.
(4) Correlation between genes.
The bootstrap cannot resolve (2)-(4). Efron (2007) suggests estimating an empirical null distribution.

26 / 35 Estimating the empirical null distribution (Efron, 2007)
Assume $f_0(z) \sim N(\delta_0, \sigma_0^2)$. Estimate $\delta_0$ and $\sigma_0^2$ by fitting a quadratic curve to the log of the distribution of $Z$ around 0. The procedure is called central matching.
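
A rough sketch of central matching under the stated assumption: fit a quadratic to the log of histogram counts in a central window. The bin width, window, and polynomial fit are our own implementation choices, not Efron's exact algorithm:

```python
import numpy as np

def central_matching(z, window=1.0, bins=50):
    """Estimate (delta0, sigma0) of an empirical null via a quadratic fit to log counts near 0."""
    z = np.asarray(z, dtype=float)
    counts, edges = np.histogram(z, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    keep = (np.abs(centers) <= window) & (counts > 0)   # central, non-empty bins only
    b2, b1, _ = np.polyfit(centers[keep], np.log(counts[keep]), deg=2)
    sigma0 = np.sqrt(-1.0 / (2.0 * b2))   # quadratic coefficient of a normal log density is -1/(2 sigma^2)
    delta0 = -b1 / (2.0 * b2)             # vertex of the fitted quadratic
    return delta0, sigma0
```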

27 / 35 Type II errors
False non-discovery rate (FNDR): the proportion of non-null genes among all non-significant genes.
False negative rate (FNR): the proportion of non-significant genes among all non-null genes.
Sensitivity = power = 1 − FNR, i.e. the proportion of significant genes among all non-null genes.
Given a Type I error rate $\alpha$, an optimal testing procedure maximizes sensitivity (minimizes FNR).
[Figure: histogram of p values with the cut-off $\alpha$ dividing true/false positives and true/false negatives.]

28 / 35 Optimal discovery procedure (Storey, 2007)
Neyman-Pearson (NP) lemma (1933): given observed data, the optimal testing procedure is based on the likelihood ratio $\frac{P(\text{data} \mid H_1)}{P(\text{data} \mid H_0)}$.
Storey (2007) applies the NP lemma to the multiple testing situation. Assume that test $j$ has density $f_j$ under $H_0$ and $g_j$ under $H_1$. The optimal discovery procedure (ODP) statistic for a gene with observation vector $x$ is defined as
$$S_{\text{ODP}}(x) = \frac{\text{sum of } P(x \text{ under } H_1) \text{ over all non-null genes}}{\text{sum of } P(x \text{ under } H_0) \text{ over all null genes}} = \frac{\sum_{j=m_0+1}^{m} g_j(x)}{\sum_{j=1}^{m_0} f_j(x)}.$$
The $f_j$'s and $g_j$'s, as well as $m_0$, must be estimated.

29 / 35 Optimal discovery procedure (Storey, 2007), continued
The ODP procedure:
1. Evaluate the estimated ODP statistic for each gene.
2. Use the bootstrap to simulate data from the null distribution for each gene, and recompute the ODP statistic to obtain its null distribution.
3. Use the observed and resampled ODP statistics to calculate a q-value for each gene.

30 / 35 Covariate-modulated FDR (Ferkingstad et al., 2008)
Sensitivity can also be increased by adding external covariates $x_i$, $i = 1, \ldots, m$. Let $g(p \mid x)$ be the conditional density of $p$ under $H_1$ and $\pi_0(x) = P(H_0 \text{ true} \mid x)$. The mixture model for the p values given $x$ is then
$$f(p \mid x) = \pi_0(x) + (1 - \pi_0(x))\, g(p \mid x).$$

31 / 35 Sample size assessments (Pawitan et al., 2005)
[Figure: FDR (and FNR) as a function of sample size.]

32 / 35 Sample size assessments (Efron, 2007)
Efron (2007) studied how multiplying the sample size by a factor $c$ would affect the local Fdr:

c                 1      1.5    2      2.5    3
Prostate cancer   0.68   0.54   0.44   0.38   0.34
HIV               0.45   0.31   0.23   0.18   0.14

33 / 35 Summary
Use of classical p values is problematic in large-scale simultaneous hypothesis testing situations, as it easily generates too many false positives.
For microarrays, the False Discovery Rate (FDR) is a convenient measure for balancing the number of false positives and false negatives.
FDR can be controlled using the Benjamini & Hochberg step-up procedure (the "frequentist" approach) or estimated under a mixture model (the "Bayesian" or "empirical Bayesian" approach).
The mixture model approach has recently been used to avoid the N(0,1) null distribution assumption, and to include external covariates.
Methods for power and sample size calculations when controlling significance via FDR have recently been proposed.

34 / 35 References
Benjamini, Y. and Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Statist. Soc. B, 57:289-300.
Benjamini, Y. and Hochberg, Y. (2000). The adaptive control of the false discovery rate in multiple hypothesis testing. J. Educ. Behav. Statist., 25:60-83.
Benjamini, Y. and Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. Ann. Statist., 29:1165-1188.
Efron, B. (2007). Size, power and false discovery rates. Ann. Statist., 35:1351-1377.
Efron, B. et al. (2001). Empirical Bayes analysis of a microarray experiment. J. Amer. Statist. Assoc., 96:1151-1160.
Ferkingstad, E. et al. (2008). Unsupervised empirical Bayesian multiple testing with external covariates. Ann. Appl. Statist., 2:714-735.

35 / 35 References (continued)
Genovese, C. and Wasserman, L. (2004). A stochastic process approach to false discovery control. Ann. Statist., 32:1035-1061.
Langaas, M. et al. (2005). Estimating the proportion of true null hypotheses, with application to DNA microarray data. J. Roy. Statist. Soc. B, 67:555-572.
Pawitan, Y. et al. (2005). False discovery rate, sensitivity and sample size for microarray studies. Bioinformatics, 21:3017-3024.
Storey, J. D. (2002). A direct approach to false discovery rates. J. Roy. Statist. Soc. B, 64:479-498.
Storey, J. D. and Tibshirani, R. (2003). Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA, 100:9440-9445.
Storey, J. D. (2007). The optimal discovery procedure: a new approach to simultaneous significance testing. J. Roy. Statist. Soc. B, 69:347-368.