Estimation of a Two-component Mixture Model

Similar documents
Estimation of a Two-component Mixture Model with Applications to Multiple Testing

arxiv: v5 [stat.me] 9 Nov 2015

SEMIPARAMETRIC INFERENCE WITH SHAPE CONSTRAINTS ROHIT KUMAR PATRA

Controlling the False Discovery Rate: Understanding and Extending the Benjamini-Hochberg Method

A Large-Sample Approach to Controlling the False Discovery Rate

Estimation and Confidence Sets For Sparse Normal Mixtures

The miss rate for the analysis of gene expression data

False discovery rate and related concepts in multiple comparisons problems, with applications to microarray data

A Sequential Bayesian Approach with Applications to Circadian Rhythm Microarray Gene Expression Data

Doing Cosmology with Balls and Envelopes

Separating Signal from Background

Multiple Testing. Hoang Tran. Department of Statistics, Florida State University

Non-specific filtering and control of false positives

Statistical testing. Samantha Kleinberg. October 20, 2009

Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. Table of Outcomes. T=number of type 2 errors

PROCEDURES CONTROLLING THE k-fdr USING. BIVARIATE DISTRIBUTIONS OF THE NULL p-values. Sanat K. Sarkar and Wenge Guo

A contamination model for approximate stochastic order

New Approaches to False Discovery Control

More Powerful Tests for Homogeneity of Multivariate Normal Mean Vectors under an Order Restriction

Simultaneous Testing of Grouped Hypotheses: Finding Needles in Multiple Haystacks

Statistical Applications in Genetics and Molecular Biology

Modified Simes Critical Values Under Positive Dependence

Summary and discussion of: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing

Nonparametric Estimation of Distributions in a Large-p, Small-n Setting

Confidence Thresholds and False Discovery Control

Control of Directional Errors in Fixed Sequence Multiple Testing

FDR-CONTROLLING STEPWISE PROCEDURES AND THEIR FALSE NEGATIVES RATES

Tweedie s Formula and Selection Bias. Bradley Efron Stanford University

Optimal Rates of Convergence for Estimating the Null Density and Proportion of Non-Null Effects in Large-Scale Multiple Testing

High-throughput Testing

A NEW APPROACH FOR LARGE SCALE MULTIPLE TESTING WITH APPLICATION TO FDR CONTROL FOR GRAPHICALLY STRUCTURED HYPOTHESES

Large-Scale Hypothesis Testing

Frequentist Accuracy of Bayesian Estimates

arxiv: v1 [math.st] 31 Mar 2009

Probabilistic Inference for Multiple Testing

FALSE DISCOVERY AND FALSE NONDISCOVERY RATES IN SINGLE-STEP MULTIPLE TESTING PROCEDURES 1. BY SANAT K. SARKAR Temple University

Alpha-Investing. Sequential Control of Expected False Discoveries

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

ON TWO RESULTS IN MULTIPLE TESTING

Hunting for significance with multiple testing

On Procedures Controlling the FDR for Testing Hierarchically Ordered Hypotheses

Statistical Applications in Genetics and Molecular Biology

Exceedance Control of the False Discovery Proportion Christopher Genovese 1 and Larry Wasserman 2 Carnegie Mellon University July 10, 2004

Some General Types of Tests

Let us first identify some classes of hypotheses. simple versus simple. H 0 : θ = θ 0 versus H 1 : θ = θ 1. (1) one-sided

Procedures controlling generalized false discovery rate

Journal of Statistical Software

Stat 206: Estimation and testing for a mean vector,

Advanced Statistical Methods: Beyond Linear Regression

Peak Detection for Images

False discovery control for multiple tests of association under general dependence

Research Article Sample Size Calculation for Controlling False Discovery Proportion

Bayesian estimation of the discrepancy with misspecified parametric models

Applying the Benjamini Hochberg procedure to a set of generalized p-values

Controlling Bayes Directional False Discovery Rate in Random Effects Model 1

Lecture 2: Basic Concepts and Simple Comparative Experiments Montgomery: Chapter 2

Control of Generalized Error Rates in Multiple Testing

High-Throughput Sequencing Course. Introduction. Introduction. Multiple Testing. Biostatistics and Bioinformatics. Summer 2018

Sanat Sarkar Department of Statistics, Temple University Philadelphia, PA 19122, U.S.A. September 11, Abstract

Resampling-Based Control of the FDR

Bayesian Inference and the Parametric Bootstrap. Bradley Efron Stanford University

A GENERAL DECISION THEORETIC FORMULATION OF PROCEDURES CONTROLLING FDR AND FNR FROM A BAYESIAN PERSPECTIVE

14.30 Introduction to Statistical Methods in Economics Spring 2009

New Procedures for False Discovery Control

Statistics: Learning models from data

Looking at the Other Side of Bonferroni

Minimum Hellinger Distance Estimation with Inlier Modification

Step-down FDR Procedures for Large Numbers of Hypotheses

Nonparametric Inference via Bootstrapping the Debiased Estimator

Two-stage stepup procedures controlling FDR

arxiv: v3 [stat.me] 12 Jul 2018

Lecture 8 Inequality Testing and Moment Inequality Models

Hypothesis Testing - Frequentist

Large-Scale Multiple Testing of Correlations

Rejoinder on: Control of the false discovery rate under dependence using the bootstrap and subsampling

STAT 461/561- Assignments, Year 2015

STAT 5200 Handout #7a Contrasts & Post hoc Means Comparisons (Ch. 4-5)

Distribution-free inference for estimation and classification

Maximum Smoothed Likelihood for Multivariate Nonparametric Mixtures

Minimum Hellinger Distance Estimation in a. Semiparametric Mixture Model

EXPLICIT NONPARAMETRIC CONFIDENCE INTERVALS FOR THE VARIANCE WITH GUARANTEED COVERAGE

Association studies and regression

IMPROVING TWO RESULTS IN MULTIPLE TESTING

EMPIRICAL BAYES METHODS FOR ESTIMATION AND CONFIDENCE INTERVALS IN HIGH-DIMENSIONAL PROBLEMS

More powerful control of the false discovery rate under dependence

Estimating the Null and the Proportion of Non- Null Effects in Large-Scale Multiple Comparisons

Efficient Estimation in Convex Single Index Models 1

Correlation, z-values, and the Accuracy of Large-Scale Estimators. Bradley Efron Stanford University

Family-wise Error Rate Control in QTL Mapping and Gene Ontology Graphs

Positive false discovery proportions: intrinsic bounds and adaptive control

Primer on statistics:

ESTIMATING THE PROPORTION OF TRUE NULL HYPOTHESES UNDER DEPENDENCE

Testing against a parametric regression function using ideas from shape restricted estimation

The Pennsylvania State University The Graduate School Eberly College of Science GENERALIZED STEPWISE PROCEDURES FOR

Statistical Inference. Hypothesis Testing

arxiv: v2 [stat.me] 31 Aug 2017

Nonparametric Estimation of Luminosity Functions

Optimal Detection of Heterogeneous and Heteroscedastic Mixtures

Frequentist Accuracy of Bayesian Estimates

Supplementary Material for On the evaluation of climate model simulated precipitation extremes

Transcription:

Estimation of a Two-component Mixture Model Bodhisattva Sen 1,2 University of Cambridge, Cambridge, UK Columbia University, New York, USA Indian Statistical Institute, Kolkata, India 6 August, 2012 1 Joint work with Rohit Patra, Columbia University, USA 2 Supported by NSF grants DMS-0906597, AST-1107373, DMS-1150435 (CAREER)

Mixture model with two components F (x) = αf s (x) + (1 α)f b (x) (1) F b is a known distribution function (DF). Unknowns: mixing proportion α (0, 1) and DF F s ( F b ). Problem: Given a random sample from X 1, X 2,..., X n ind. F, we wish to (nonparametrically) estimate F s and the parameter α. Previous work Most of the previous work on this problem assume some constraint (parametric models) on the form of F s; see e.g., Lindsay (AoS, 1983), Lindsay and Basak (JASA, 1993). Bordes et al. (AoS, 2006) assume that both the components belong to an unknown symmetric location-shift family. In the multiple testing setup, this problem has been addressed mostly to estimate α under suitable assumptions; see e.g., Storey (JRSSB, 2002), Genevese & Wasserman (AoS, 2004), Meinshausen & Rice (AoS, 2006).

Applications In multiple testing problems, e.g., microarray analysis, neuroimaging (fmri). Any situation where numerous independent hypotheses tests are performed. The p-values are uniformly distributed on [0,1], under H 0, while their distribution associated with H 1 is unknown; see e.g., Efron (2010). Estimate the proportion of false null hypotheses α; also needed to control multiple error rates, such as the FDR of Benjamini & Hochberg (JRSSB, 1995). In contamination problems application in astronomy.

An application in astronomy We analyse the radial velocity (RV; line of sight velocity) distribution of stars in Carina, a dwarf spheroidal (dsph) galaxy. The data have been obtained by Magellan and MMT telescopes and consist of radial velocity measurements for n = 1215 stars from Carina, contaminated with Milky Way stars in the field of view. We would like to understand the distribution of the line of sight velocity of stars in Carina. For the contaminating stars from the Milky Way in the field of view, we assume a non-gaussian velocity distribution F b that is known from the Besancon Milky Way model (Robin et. al, 2003), calculated along the line of sight to Carina. Here α is the proportion of stars from Carina.

Outline 1 Estimation of α 2 3 Estimating of F s and its density f s

When α is known We observe an i.i.d. sample from F = αf s + (1 α)f b. A naive estimator of F s would be ˆF s,n α = F n (1 α)f b, α where F n is the empirical DF of the observed sample. ˆF α s,n is not a valid DF: need not be non-decreasing or lie in [0, 1]. Find the closest DF: impose the known shape constraint of monotonicity, accomplished by minimizing {W (x) ˆF s,n(x)} α 2 df n (x) 1 n over all DFs W. n {W (X i ) ˆF s,n(x α i )} 2 i=1

Isotonized estimator Let ˇF s,n α 1 = arg min W DF n n {W (X i ) ˆF s,n(x α i )} 2. i=1 ˇF α s,n is the isotonized estimator; uniquely defined at the data points X i, for all i = 1,..., n. CDF Plot CDF Plot CDF 0.0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 Naive CDF Estimate CDF Estimate True CDF CDF 0.0 0.5 1.0 1.5 Naive CDF Estimate CDF Estimate True CDF 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 x x Plot of ˆF ˇαn s,n, ˇF ˇαn s,n and F s for n = 300 and 500, when α = 0.3.

Computation Recall ˇF α s,n = arg min W DF 1 n n {ˆF s,n(x α (i) ) W (X (i) )} 2. i=1 The above optimization problem is the same as minimizing V θ 2 over θ = (θ 1,..., θ n ) Θ inc where Θ inc = {θ R n : 0 θ 1 θ 2... θ n 1}, V = (V 1, V 2,..., V n ), V i := ˆF α s,n(x (i) ), i = 1, 2,..., n. The estimator ˆθ is uniquely defined by the projection theorem; it is the L 2 projection of V on a closed convex cone in R n. Can easily be computed using the pool-adjacent-violators algorithm (PAVA); see Section 1.2 of Robertson et al. (1988).

Identifiability When α is unknown, the problem is non-identifiable. If F = αf s + (1 α)f b for some F b (known) and α (unknown), then the mixture model can be re-written as ( α F = (α + γ) α + γ F s + γ ) α + γ F b + (1 α γ)f b, for 0 γ 1 α, and the term (αf s + γf b )/(α + γ) can be thought of as the nonparametric component. A trivial solution occurs when we take α + γ = 1, in which case ˇF 1 s,n = ˆF 1 s,n = F n. Hence, α is not uniquely defined.

Identifiability We redefine the mixing proportion as { α 0 := inf γ (0, 1) : F (1 γ)f b γ } is a valid DF. Lemma Intuitively, this definition makes sure that the signal distribution F s does not include any contribution from the known background F b. We consider the estimation of α 0 as defined above. Suppose that F s and F b are absolutely continuous, i.e., they have densities f s and f b, respectively. Then α 0 < α iff there exists c > 0 such that f s (x) cf b (x), for all x R. If the support of F s is strictly contained in that of F b, then the problem is identifiable.

Estimation of α 0 When γ = 1, ˆF γ s,n = F n = ˇF γ s,n. When γ is much smaller than α 0, the regularization of ˆF γ s,n modifies it, and thus ˆF γ s,n and ˇF γ s,n are quite different. Measure distance by d n the L 2 (F n ) distance, i.e., if g, h : R R are two functions, then d n (g, h) = {g(x) h(x)}2 df n (x). 0.05 0.045 0.04 0.035 0.03 0.025 0.02 0.015 0.01 0.045 0.04 0.035 0.03 0.025 0.02 0.015 0.01 0.005 0.005 0 0 0.05 0.1 0.15 0.2 0.25 γ 0 0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18 0.2 γ Plot of γ d n (ˆF γ s,n, ˇF γ s,n) (in solid blue) when α 0 = 0.1 and n = 5000.

We will study γ d n (ˆF γ s,n, ˇF γ s,n) = d n (F n, γ ˇF γ s,n + (1 γ)f b ). Lemma γ d n (ˆF γ s,n, ˇF γ s,n) is a non-increasing convex function of γ in (0, 1). Lemma For 1 γ α 0, γ d n (ˆF γ s,n, ˇF γ s,n) d n (F, F n ). Thus, { γ d n (ˆF s,n, γ ˇF s,n) γ a.s. 0, γ α0, > 0, γ < α 0. Note, γ d n (ˆF s,n, γ ˇF ) s,n) γ γ γ d n (ˆF s,n, F (1 γ)f b γ from which ( Fn (1 γ)f b γ d n, F (1 γ)f ) b = d n (F, F n ). γ γ

0.05 0.045 0.04 0.035 0.03 0.025 0.02 0.015 0.01 0.045 0.04 0.035 0.03 0.025 0.02 0.015 0.01 0.005 0.005 0 0 0.05 0.1 0.15 0.2 0.25 γ 0 0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18 0.2 γ Estimation of α 0 We would like to compare ˆF γ s,n and ˇF γ s,n, and choose the smallest γ for which their distance is still small. Let { ˆα n = inf γ (0, 1] : γ d n (ˆF s,n, γ ˇF s,n) γ c } n. (2) n Theorem (Consistency of ˆα n ) If c n / n 0 and c n, then ˆα n P α0.

Lower bound for α 0 Construct a finite sample (honest) lower confidence bound ˆα n with the property P(α 0 ˆα n ) 1 β, for all n, (3) for a specified confidence level (1 β), 0 < β < 1. Would allow one to assert, with a specified level of confidence, that the proportion of signal is at least ˆα n. It can also be used to test the hypothesis that there is no signal at level β by rejecting when ˆα n > 0. Genovese & Wasserman (AoS, 2004) and Meinshausen & Rice (AoS, 2006) usually yield approximate conservative lower bounds.

Theorem Let H n be the DF of nd n (F n, F ) := n {F n (x) F (x)} 2 df n (x). Let ˆα n be defined as in (2) with c n defined as the (1 β)-quantile of H n. Then P(α 0 ˆα n ) 1 β, for all n. Proof of Theorem ( P(α 0 < ˆα n ) = P α 0 d n (ˆF s,n, α0 ( P α 0 d n (ˆF α0 s,n, Fs α0 α0 ˇF s,n) c ) n n = P ( n d n (F n, F ) c n ) = 1 H n (c n ) = β, ) c ) n n as α 0 d n (ˆF α0 s,n, Fs α0 ( ) ) = α 0 d Fn (1 α 0)F b n α 0, F (1 α0)f b α 0 = d n (F n, F ).

H n is distribution-free and can be easily simulated. Requires no tuning parameters. Lower bound holds for all n. For moderately large n (e.g., n 500) the distribution H n can be very well approximated by that of the (square-root of) Cramér-von Mises statistic, defined as nd(fn, F ) := n {F n (x) F (x)} 2 df (x). Theorem Letting G n to be the DF of nd(f n, F ), we have sup H n (x) G n (x) 0 as n. x R

0.05 0.045 0.04 0.035 0.03 0.025 0.02 0.015 0.01 0.045 0.04 0.035 0.03 0.025 0.02 0.015 0.01 0.005 0.005 0 0 0.05 0.1 0.15 0.2 0.25 γ 0 0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18 0.2 γ Plot of γd n (ˆF γ s,n, ˇF γ s,n) (in solid blue) overlaid with its (scaled) second derivative (in dashed red) for α 0 = 0.1 and n = 5000. A tuning-parameter-free estimator of α 0 We can use the elbow of γ d n (ˆF γ s,n, ˇF γ s,n) to estimate α 0. It is the point that has the maximum curvature, i.e., the point where the second derivative is maximum.

Estimation of F s Assume now that the model (1) is identifiable, i.e., α = α 0, and ˇα n be an estimator of α 0 (ˇα n can be ˆα n ). A natural nonparametric estimator of F s is ˇF ˇαn s,n. CDF 0.0 0.2 0.4 0.6 0.8 1.0 CDF 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.1 0.2 0.3 0.4 x 0.0 0.1 0.2 0.3 0.4 x Plot of ˇF ˇαn s,n (in dotted red), F s (in dashed black) and F s,n (in blue). Theorem Suppose that ˇα n P α0. Then, sup x R ˇF ˇαn s,n(x) F s (x) P 0.

Estimating the density f s Suppose now that F s has a non-increasing density f s (w.l.o.g., we assume that f s is non-increasing on [0, )). Natural assumption in many situations, see e.g., Genovese & Wasserman (AoS, 2004). Define F s,n := LCM[ˇF ˇαn s,n], where for a bounded function g : [0, ) R, let us represent the least concave majorant (LCM) of g by LCM[g]. F s,n is a valid DF with a density f s,n. We can now estimate f s by f s,n = [F s,n].

10 10 9 9 8 8 7 7 6 6 Density 5 Density 5 4 4 3 3 2 2 1 1 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 x 0 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 x Theorem Plot of the estimate f s,n (in solid red) and f s (in solid blue). Assume F s has non-increasing density f s on [0, ). If ˇα n P α0, then, sup F s,n(x) F s (x) P 0. x R Further, if for any x > 0, f s (x) is continuous at x, then, f s,n(x) P f s (x).

Multiple testing Estimate the proportion of genes that are differentially expressed in DNA microarray experiments. We wish to test n null hypothesis H 01, H 02,..., H 0n using p-values X 1, X 2,..., X n. FWER = Prob (# of false rejections 1), the probability of committing at least one type I error. Benjamini and Hochberg (1995) proposed FDR = E { V R 1(R > 0)}, where V is the number of false rejections and R is the number of total rejections. The BH method guarantees FDR βα 0 ; an estimate of α 0 can be used to yield a procedure with FDR approximately equal to β and thus will result in an increased power.

Application to multiple testing Our method can be used to yield an estimator of α 0. We can also obtain a completely nonparametric estimator of F s, the distribution of the p-values arising from the alternative hypotheses. The local false discovery rate (LFDR) is defined as the function l : (0, 1) [0, ), such where l(x) = P(H i = 0 X i = x) = (1 α 0)f b (x). f (x) The LFDR method can help detect interesting cases (e.g., l(x) 0.20); see Section 5 of Efron (2010). If f s is a assumed to have a non-increasing density, we have a natural tuning-parameter-free estimator ˆl of the LFDR: ˆl(x) =, for x (0, 1). (1 ˇα n)f b (x) ˇα nf s,n(x)+(1 ˇα n)f b (x)

Simulation: lower bounds Compare our method with that of Genevose & Wasserman (AoS, 2004), and Meinshausen & Rice (AoS, 2006). The method in Meinshausen & Rice (AoS, 2006) are extensions of those proposed in Genevose & Wasserman (AoS, 2004). Table: Coverage probabilities of nominal 95% lower confidence bounds for the three methods when n = 1000. α ˆα 0 L Setting I ˆα GW L Setting II ˆα L MR ˆα L 0 ˆα L GW ˆα MR L 0 0.95 0.98 0.93 0.95 0.98 0.93 0.01 0.97 0.98 0.99 0.97 0.97 0.99 0.03 0.98 0.98 0.99 0.98 0.98 0.99 0.05 0.98 0.98 0.99 0.98 0.98 0.99 0.10 0.99 0.99 1.00 0.99 0.98 0.99

Real example: Prostate data Genetic expression levels for n = 6033 genes for m 1 = 50 normal control subjects and m 2 = 52 prostate cancer patients. Goal: To discover a small number of interesting genes whose expression levels differ between the cancer and control patients. Such genes, once identified, might be further investigated for a causal link to prostate cancer development. 0.035 250 0.03 200 0.025 Frequency 150 100 0.02 0.015 0.01 50 0.005 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 p values 0 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 α

The two-sample t-statistic for testing significance of gene i is t i = x i(1) x i (2) s i t 100 ( under H 0 ), where s i is an estimate of the standard error of x i (1) x i (2), i.e., si 2 = (1/50 + 1/52)[ 50 j=1 {x ij x i (1)} 2 + 102 j=51 {x ij x i (2)} 2 ]/100. CDF 0.0 0.2 0.4 0.6 0.8 1.0 Density 0 2 4 6 8 10 0.0 0.1 0.2 0.3 0.4 0.5 0.6 x 0.0 0.2 0.4 0.6 0.8 1.0 x ˇF s,n ˇαn (in dotted red) and F s,n (in solid blue). The density f s,n. ˆα n = 0.0877. The lower bound for α 0 is 0.0512.

An application in astronomy We analyse the radial velocity (RV; line of sight velocity) distribution of stars in Carina, a dwarf spheroidal (dsph) galaxy. The data have been obtained by Magellan and MMT telescopes and consist of radial velocity measurements for n = 1215 stars from Carina, contaminated with Milky Way stars in the field of view. We would like to understand the distribution of the line of sight velocity of stars in Carina. For the contaminating stars from the Milky Way in the field of view, we assume a non-gaussian velocity distribution F b that is known from the Besancon Milky Way model (Robin et. al, 2003), calculated along the line of sight to Carina.

0.025 0.2 0.02 0.18 0.16 0.015 0.14 0.12 0.1 0.01 0.08 0.06 0.005 0.04 0.02 0 50 0 50 100 150 200 250 300 350 Radial Velocity (RV) 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 γ In the left panel we have the histogram of the radial velocity of the contaminating stars overlaid with the (scaled) kernel density estimator of the Carina data set. The right panel shows the plot of γd n (ˆF γ s,n, ˇF γ s,n) (in solid blue) overlaid with its (scaled) second derivative (in dashed red). Our estimate ˆα n of α 0 for this data set turns out to be 0.356, while the lower bound for α 0 is found to be 0.322.

Astronomers usually assume the distribution of the radial velocities for these dsph galaxies to be Gaussian in nature. ˇF ˆαn The figure below shows s,n (in dashed red) overlaid with the closest (in terms of minimising the L 2 (ˇF s,n) ˆαn distance) Gaussian distribution (in solid blue). ˆαn Indeed, we see that ˇF s,n is close to a normal distribution (with mean 222.9 and standard deviation 7.51). 1 0.9 0.8 0.7 0.6 CDF 0.5 0.4 0.3 0.2 0.1 0 190 200 210 220 230 240 250 260 Line of sight velocity

Summary Consistent estimation of a two-parameter mixture model using techniques from shape-restricted function estimation. Avoids the need to specify tuning parameters. Such shape constraints arise naturally in many contexts. References Benjamini, Y. & Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Statist. Soc. B, 57, 289 300. Efron, B. (2010). Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction. IMS Monographs. Genovese, C. & Wasserman, L. (2004). A stochastic process approach to false discovery control. Ann. Statist., 32, 1035 1061. Meinshausen, N. & Rice, J. (2006). Estimating the proportion of false null hypotheses among a large number of independently tested hypothesis. Ann. Statist., 34, 373 393. Robertson,T., Wright, F. T. & Dykstra, R.L. (1988). Order restricted statistical inference. Wiley, New York. Thank You! Questions?