Accepted for publication in: Comm. Statist. Theory and Methods April 14, 2015 ASYMPTOTICS OF GOODNESS-OF-FIT TESTS BASED ON MINIMUM P-VALUE STATISTICS


Veronika Gontscharuk and Helmut Finner

Institute for Biometrics and Epidemiology, German Diabetes Center, Leibniz Institute for Diabetes Research at Heinrich-Heine-University Düsseldorf, Auf'm Hennekamp 65, D-40225 Düsseldorf, Germany

veronika.gontscharuk@ddz.uni-duesseldorf.de

Key Words: equal local levels; goodness-of-fit; higher criticism; Kolmogorov-Smirnov test; minimum level attained; order statistics.

ABSTRACT

This paper provides some new results on the asymptotics of goodness-of-fit (GOF) tests based on minimum p-value statistics. In connection with detectability of sparse signals in high-dimensional data, various tests were proposed and investigated during the last decade, especially with respect to asymptotic properties. Minimum p-value GOF statistics were already investigated as minimum level attained statistics by Berk and Jones with respect to Bahadur efficiency. The distribution of minimum p-value GOF statistics is closely related to the distribution of higher criticism statistics, the distribution of the supremum of a normalized Brownian bridge and the supremum of an Ornstein-Uhlenbeck process.

1. INTRODUCTION

This paper is concerned with the asymptotics of goodness-of-fit (GOF) tests based on so-called minimum level attained statistics studied in Berk and Jones (1978) and Berk and Jones (1979). The term "level attained" is a synonym for p-value. Nowadays a minimum level attained statistic is often referred to as a min-p or minP statistic. If we have p-values $p_i$, $i \in I$, the minP statistic is defined by $\min_{i \in I} p_i$. The minP statistic is closely related to the union-intersection principle and has been well known in multiple hypothesis testing since at least the 1930s. We refer to the minimum level attained GOF tests of Berk and Jones as minP GOF or minP tests. Recently, these tests have gained new interest and were rediscovered and studied under different names and representations by various authors, e.g., as a special case of calibration for simultaneity in Buja and Rolke (2006), as tests based on the new tail-sensitive simultaneous confidence bands in Aldor-Noiman et al. (2013), as GOF tests with equal local levels in Gontscharuk et al. (2014), as new higher criticism (HC) tests in Gontscharuk et al. (2015) and as a non-asymptotic standardization of binomial counts in Mary and Ferrari (2014).

To set up notation, let $X_1, \ldots, X_n$, $n \in \mathbb{N}$, be real-valued independent and identically distributed (iid) random variables with continuous cumulative distribution function (cdf) $F$. For a continuous cdf $F_0$ we consider the testing problems

\[ H_0^+ : F(x) \le F_0(x) \ \forall x \in \mathbb{R} \quad \text{vs.} \quad H_1^+ : F(x) > F_0(x) \ \text{for some } x \in \mathbb{R} \]

and

\[ H_0 : F(x) = F_0(x) \ \forall x \in \mathbb{R} \quad \text{vs.} \quad H_1 : F(x) \ne F_0(x) \ \text{for some } x \in \mathbb{R}. \]

Since $F_0(X_i)$, $i = 1, \ldots, n$, are iid uniformly distributed on $[0,1]$ if $H_0$ is true, we restrict attention to the case where $F_0(x) = x$, $x \in [0,1]$, and $X_i$, $i = 1, \ldots, n$, are iid with values in $[0,1]$. Let $X_{1:n}, \ldots, X_{n:n}$ be the order statistics of $X_1, \ldots, X_n$ and let $F_{i,n}$ denote the cdf of the beta distribution with parameters $i$ and $n - i + 1$, i.e., the cdf of $X_{i:n}$ under $H_0$. The one- and two-sided versions of the minP statistic are given by

\[ M_n^+ = \min_{1 \le i \le n} F_{i,n}(X_{i:n}), \qquad M_n = 2 \min_{1 \le i \le n} \min\{F_{i,n}(X_{i:n}),\, 1 - F_{i,n}(X_{i:n})\}. \]
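To make the definitions of $M_n^+$ and $M_n$ concrete, the following Python sketch evaluates both statistics for a sample that has already been transformed to $[0,1]$. It is an illustration only and not part of the paper: the function names are hypothetical, and the beta cdf $F_{i,n}$ is evaluated through the standard identity $F_{i,n}(x) = P(\mathrm{Bin}(n,x) \ge i)$, which is convenient for small $n$ but not meant for large samples.

```python
from math import comb


def beta_order_cdf(i, n, x):
    """F_{i,n}(x): cdf of the i-th of n uniform order statistics.

    Uses the identity P(U_{i:n} <= x) = P(Bin(n, x) >= i).
    """
    return sum(comb(n, k) * x**k * (1.0 - x) ** (n - k) for k in range(i, n + 1))


def minp_statistics(sample):
    """Return (M_n^+, M_n) for a sample with values in [0, 1]."""
    xs = sorted(sample)
    n = len(xs)
    # Local p-values F_{i,n}(X_{i:n}), i = 1, ..., n.
    local = [beta_order_cdf(i, n, x) for i, x in enumerate(xs, start=1)]
    m_plus = min(local)
    m_two_sided = 2.0 * min(min(p, 1.0 - p) for p in local)
    return m_plus, m_two_sided


if __name__ == "__main__":
    print(minp_statistics([0.3, 0.1, 0.2]))
```

For the sample above, the smallest local p-value belongs to the largest order statistic, since $F_{3,3}(0.3) = 0.3^3$.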

Thereby, $F_{i,n}(X_{i:n})$ and $\min\{F_{i,n}(X_{i:n}),\, 1 - F_{i,n}(X_{i:n})\}$ can be viewed as local p-values based on $X_{i:n}$. Since a minP statistic tends to smaller values under alternatives than under the null hypothesis, the related minP GOF test rejects $H_0^+$ and/or $H_0$ if $M_n^+$ and/or $M_n$, respectively, are not larger than the minP critical value $d_n$ (say). For a given $\alpha \in (0,1)$ define a critical value $\alpha_n^{loc} \equiv \alpha_n^{loc}(\alpha)$ such that the related one- or two-sided minP test is an exact level $\alpha$ test, i.e.,

\[ P(M_n^+ \le \alpha_n^{loc} \mid H_0) = \alpha \quad \text{or} \quad P(M_n \le \alpha_n^{loc} \mid H_0) = \alpha, \tag{1} \]

respectively. Clearly, $\alpha/n < \alpha_n^{loc} < \alpha$ for $n \ge 2$. A characterization of the asymptotic behavior of the minP critical values was mentioned in Gontscharuk et al. (2014) and Gontscharuk et al. (2015) without proof. More precisely,

\[ \lim_{n \to \infty} P(M_n^+ \le d_n \mid H_0) = \alpha \quad \text{and/or} \quad \lim_{n \to \infty} P(M_n \le d_n \mid H_0) = \alpha, \tag{2} \]

if and only if

\[ \lim_{n \to \infty} d_n\, \frac{2 \log(\log(n)) \log(n)}{-\log(1 - \alpha)} = 1. \]

Among others, we provide a proof for this result.

The paper is organized as follows. In Section 2 we show a strong connection between minP and higher criticism (HC) tests and summarize some important results related to HC statistics. In Section 3 we provide three different asymptotic critical values leading to asymptotic level $\alpha$ minP GOF tests. We also present the asymptotic minP distribution under the null hypothesis and show that the HC statistic and a z-transformed version of the minP statistic coincide asymptotically in distribution. Section 4 addresses the applicability of the asymptotic results to the finite case. Proofs are deferred to an Appendix.

2. LOCAL LEVELS OF HIGHER CRITICISM AND MINP TESTS

A thorough analysis of the asymptotic behavior of HC related quantities is a key step to obtain results for the asymptotics of the minP test. We first provide some known results concerning HC tests. The HC tests considered here can be seen as normalized versions of the well-known Kolmogorov-Smirnov (KS) tests, e.g., cf. Eicker (1979) and Jaeschke (1979) for the asymptotics of the normalized KS statistic, Donoho and Jin (2004), Hall and Jin (2008) and Donoho and Jin (2009) for the HC concept, Jager and Wellner (2007), Gontscharuk et al. (2014) and Chapter 16 in Shorack and Wellner (2009) for the relationship to the normalized Brownian bridge and Ornstein-Uhlenbeck process. The one- and two-sided HC statistics considered here are

\[ HC_n^+ = \max_{1 \le i \le n} \sqrt{n}\, \frac{i/n - X_{i:n}}{\sqrt{X_{i:n}(1 - X_{i:n})}}, \tag{3} \]

\[ HC_n = \max_{1 \le i \le n} \max\left\{ \sqrt{n}\, \frac{i/n - X_{i:n}}{\sqrt{X_{i:n}(1 - X_{i:n})}},\ \sqrt{n}\, \frac{X_{i:n} - (i-1)/n}{\sqrt{X_{i:n}(1 - X_{i:n})}} \right\}. \tag{4} \]

The limiting null distributions are characterized by

\[ \lim_{n \to \infty} P(HC_n^+ < b_n(t) \mid H_0) = \exp(-\exp(-t)), \tag{5} \]

\[ \lim_{n \to \infty} P(HC_n < b_n(t) \mid H_0) = \exp(-2\exp(-t)), \tag{6} \]

where

\[ b_n(t) = \sqrt{2 \log_2(n)} + \frac{\log_3(n) - \log(\pi) + 2t}{2 \sqrt{2 \log_2(n)}} \tag{7} \]

with $\log_2(n) = \log(\log(n))$ and $\log_3(n) = \log_2(\log(n))$, cf. Eicker (1979) and Jaeschke (1979). We consider HC tests based on the critical value $b_n(t)$, i.e., a one-sided HC test rejects the null hypothesis $H_0^+$ if $HC_n^+ \ge b_n(t)$ while a two-sided HC test rejects $H_0$ if $HC_n \ge b_n(t)$. Replacing $t$ in $b_n(t)$ in (5) and (6) by

\[ t_\alpha^+ = -\log(-\log(1 - \alpha)) \quad \text{and} \quad t_\alpha = -\log(-\log(1 - \alpha)/2) \tag{8} \]

leads to asymptotic level $\alpha$ one- and two-sided HC tests, respectively.

Instead of considering union-intersection related GOF tests in terms of test statistics, many of them can be rewritten in terms of so-called local levels, cf. Gontscharuk et al. (2014) for this concept. In our case, the $i$th local level of a GOF test based on all order statistics is defined as the probability that the $i$th local test statistic based on the order statistic $X_{i:n}$ exceeds its critical value. For example, local levels related to the one-sided asymptotic level $\alpha$ HC test are given by

\[ \alpha_{i,n}^{HC} \equiv \alpha_{i,n}^{HC}(\alpha) = P\left( \sqrt{n}\, \frac{i/n - X_{i:n}}{\sqrt{X_{i:n}(1 - X_{i:n})}} \ge b_n(t_\alpha^+) \,\Big|\, H_0 \right). \tag{9} \]

Local levels related to the one-sided minP test with critical value $d_n \in (0,1)$ (say) are given by

\[ \alpha_{i,n}^{minP} \equiv d_n = P(F_{i,n}(X_{i:n}) \le d_n \mid H_0), \tag{10} \]

i.e., all minP local levels are equal to the underlying critical value $d_n$. By construction, all minP local levels are equal in the two-sided case, too. Moreover, as shown in Gontscharuk et al. (2014), almost all local levels of the one- and two-sided HC tests are asymptotically equal in the sense that the ratio of two local levels tends to 1 for $n \to \infty$. The remaining local levels do not contribute to the asymptotic HC global level. Therefore, there is some evidence that minP and HC tests coincide asymptotically in some sense. The following lemma summarizes HC properties which are useful in order to get asymptotic results related to the minP test in Section 3.

Lemma 1. (i) The local levels of the one-sided HC asymptotic level $\alpha$ test with critical value $b_n(t_\alpha^+)$ (see (7) and (8)) satisfy

\[ \alpha_{i,n}^{HC}(\alpha) = \frac{-\log(1 - \alpha)}{2 \log_2(n) \log(n)} \left[ 1 + O\left( \frac{\log_3(n)}{\log_2(n)} \right) \right] \]

for $\log(n) \le i \le n - \log(n)$.

(ii) The local levels $\alpha_{i,n}$ (say) of two-sided HC tests with critical value $b_n(t_\alpha^+) > 1$ from the one-sided test in (i) satisfy $\alpha_{i,n} = \alpha_{i,n}^{HC}(\alpha) + \alpha_{n-i+1,n}^{HC}(\alpha)$, $i = 1, \ldots, n$.

(iii) A restricted version of the HC statistic, where the maximum in (3) or (4) is taken over $i \in [\log(n),\, n - \log(n)]$ only, has the same asymptotic distribution as the corresponding original HC statistic, i.e., (5) and (6) are also fulfilled if $HC_n^+$ and $HC_n$ are replaced by the corresponding restricted versions.
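The HC quantities in (3), (4), (7) and (8) can be evaluated directly. The sketch below is a minimal illustration, assuming a sample in the open interval $(0,1)$ and $n > e$ so that $\log_2(n) > 0$; the function names are hypothetical and not taken from the paper.

```python
from math import log, pi, sqrt


def b_n(n, t):
    """Critical value b_n(t) from (7); assumes n > e so that log2(n) > 0."""
    log2n = log(log(n))
    log3n = log(log2n)
    root = sqrt(2.0 * log2n)
    return root + (log3n - log(pi) + 2.0 * t) / (2.0 * root)


def t_one_sided(alpha):
    """t_alpha^+ from (8): solves exp(-exp(-t)) = 1 - alpha."""
    return -log(-log(1.0 - alpha))


def t_two_sided(alpha):
    """t_alpha from (8): solves exp(-2 exp(-t)) = 1 - alpha."""
    return -log(-log(1.0 - alpha) / 2.0)


def hc_statistics(sample):
    """Return (HC_n^+, HC_n) from (3) and (4) for a sample in (0, 1)."""
    xs = sorted(sample)
    n = len(xs)
    hc_plus = hc_two = float("-inf")
    for i, x in enumerate(xs, start=1):
        scale = sqrt(n) / sqrt(x * (1.0 - x))
        upper = scale * (i / n - x)        # term appearing in (3)
        lower = scale * (x - (i - 1) / n)  # additional term in (4)
        hc_plus = max(hc_plus, upper)
        hc_two = max(hc_two, upper, lower)
    return hc_plus, hc_two


if __name__ == "__main__":
    print(hc_statistics([0.3, 0.1, 0.2]))
    print(b_n(10**4, t_one_sided(0.05)), b_n(10**4, t_two_sided(0.05)))
```

A one-sided asymptotic level $\alpha$ HC test would then reject when the first returned statistic is at least `b_n(n, t_one_sided(alpha))`.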

3. ASYMPTOTICS OF MINP GOF TESTS

In this section we present several results for one- and two-sided minP tests. For convenience, we sometimes refer to both as minP tests. The following theorem provides a rate of convergence for the critical values related to the asymptotic level $\alpha$ minP tests and hence the asymptotic null distribution of the minP statistics.

Theorem 1. The minP test with critical value $d_n$ is an asymptotic level $\alpha$ test if and only if

\[ \lim_{n \to \infty} d_n / \alpha_n^* = 1 \quad \text{with} \quad \alpha_n^* \equiv \alpha_n^*(\alpha) = \frac{-\log(1 - \alpha)}{2 \log_2(n) \log(n)}. \tag{11} \]

The asymptotic null distribution of the minP statistics is given by

\[ \lim_{n \to \infty} P(M_n^+ \le \alpha_n^*(t) \mid H_0) = \lim_{n \to \infty} P(M_n \le \alpha_n^*(t) \mid H_0) = t, \quad t \in (0,1). \]

Some important conclusions are given in the following remarks.

Remark 1. Theorem 1 shows that a minP critical value $d_n$, which leads to an asymptotic level $\alpha$ one-sided minP test, also leads to an asymptotic level $\alpha$ two-sided minP test. However, for finite $n$ the levels of one- and two-sided tests based on the same critical value $d_n$ may differ to some extent. For example, $d_n = 0.00246, 0.00145, 0.00122$ leads to $P(M_n^+ \le d_n \mid H_0) \approx 0.05$ and $P(M_n \le d_n \mid H_0) \approx 0.055$ for $n = 100, 500, 1000$. It looks like the two-sided minP critical values $\alpha_n^{loc}$ defined in (1) should be somewhat smaller than the one-sided counterparts.

Remark 2. Theorem 1 implies that the critical values $d_n$ related to asymptotic level $\alpha$ minP tests converge to 0 as the sample size increases. Note that all local levels of the HC test also converge to 0, cf. Theorem 5.2 in Gontscharuk et al. (2014). Compared to the Bonferroni adjustment $\alpha/n$, the local levels of the minP GOF test converge to 0 very slowly.

Remark 3. Considering the proof of Theorem 1, it can be easily seen that the sensitivity range of the asymptotic level $\alpha$ minP test, i.e., the index set of order statistics that contribute to the asymptotic level $\alpha$ under the null hypothesis $H_0$, coincides with the sensitivity range of the HC test, cf. Eicker (1979), Jaeschke (1979) and Gontscharuk et al. (2015) for sensitivity ranges of HC tests.

Let $\Phi$ be the cdf of the standard normal distribution and $\Phi^{-1}$ be the inverse of $\Phi$. Note that $Z_i = \Phi^{-1}(1 - F_{i,n}(U_{i:n})) \sim N(0,1)$, $i = 1, \ldots, n$, where $U_{1:n} \le \cdots \le U_{n:n}$ denote the order statistics of iid $U(0,1)$-distributed random variables. Herewith, the z-transformed minP statistics are maxima of (dependent) normals/absolute normals, that is,

\[ \Phi^{-1}(1 - M_n^+) = \max_{1 \le i \le n} Z_i \quad \text{and} \quad \Phi^{-1}(1 - M_n/2) = \max_{1 \le i \le n} |Z_i|. \]

The z-transformed minP statistics have some advantage compared to the HC statistics defined in (3) and (4), cf. Gontscharuk et al. (2015). For example, most of the global $\alpha$ level of the HC tests is taken away by a very small number of the most extreme order statistics, even for extremely large $n$, although they are asymptotically negligible. Loosely speaking, even for extremely large $n$ there is a great gap between finite and asymptotic HC behavior. The next theorem provides the relationship between the asymptotic distributions of the HC and z-transformed minP statistics.

Theorem 2. For the HC critical values $b_n(t)$ we get

\[ \lim_{n \to \infty} P(\Phi^{-1}(1 - M_n^+) < b_n(t) \mid H_0) = \exp(-\exp(-t)) \]

and

\[ \lim_{n \to \infty} P(\Phi^{-1}(1 - M_n/2) < b_n(t) \mid H_0) = \exp(-2\exp(-t)), \]

i.e., the asymptotic distribution of the z-transformed minP statistics $\Phi^{-1}(1 - M_n^+)$ and $\Phi^{-1}(1 - M_n/2)$ is the same as the asymptotic distribution of $HC_n^+$ and $HC_n$, respectively.

Remark 4. Theorem 2 yields two competing choices of the minP critical values $\alpha_n'(\alpha)$ and $\alpha_n''(\alpha)$ (say), i.e.,

\[ \alpha_n' \equiv \alpha_n'(\alpha) \equiv 1 - \Phi(b_n(t_\alpha^+)) \quad \text{and} \quad \alpha_n'' \equiv \alpha_n''(\alpha) \equiv 2(1 - \Phi(b_n(t_\alpha))), \tag{12} \]

which both lead to asymptotic level $\alpha$ minP tests. This is because of $\lim_{n \to \infty} \alpha_n'/\alpha_n^* = \lim_{n \to \infty} \alpha_n''/\alpha_n^* = 1$, cf. the proof of Theorem 2.
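The three asymptotic critical values from (11) and (12) are simple closed-form expressions and can be computed side by side. The following sketch is an illustration under the assumption $n > e$; the function names are hypothetical, and the normal upper tail $1 - \Phi(x)$ is evaluated with the complementary error function from the standard library.

```python
from math import erfc, log, pi, sqrt


def b_n(n, t):
    """Critical value b_n(t) from (7); assumes n > e."""
    log2n = log(log(n))
    root = sqrt(2.0 * log2n)
    return root + (log(log2n) - log(pi) + 2.0 * t) / (2.0 * root)


def normal_upper_tail(x):
    """1 - Phi(x) for the standard normal distribution."""
    return 0.5 * erfc(x / sqrt(2.0))


def asymptotic_critical_values(n, alpha):
    """Return the critical values of (11) and (12) as a triple.

    The order is: value from (11), one-sided value from (12),
    two-sided value from (12).
    """
    rate = -log(1.0 - alpha)
    star = rate / (2.0 * log(log(n)) * log(n))
    t_one = -log(rate)        # t_alpha^+ from (8)
    t_two = -log(rate / 2.0)  # t_alpha from (8)
    prime = normal_upper_tail(b_n(n, t_one))
    double_prime = 2.0 * normal_upper_tail(b_n(n, t_two))
    return star, prime, double_prime


if __name__ == "__main__":
    for n in (10**2, 10**3, 10**4):
        print(n, asymptotic_critical_values(n, 0.05))
```

Although all three values are asymptotically equivalent, for moderate $n$ they can differ by a large factor, which is the motivation for the finite-sample comparison in Section 4.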

4. FINITE SAMPLES AND MINP GOF TESTS

We briefly investigate the applicability of asymptotic minP critical values in the finite setting. The critical value $\alpha_n^{loc}$ related to the exact level $\alpha$ minP test, i.e., $\alpha_n^{loc}$ fulfilling (1), can be calculated numerically, e.g., via some search algorithm. Note that the probabilities in (1) can be represented as $P(X_{i:n} < c_i,\ i = 1, \ldots, n \mid H_0)$ for some $c_1 \le \cdots \le c_n$ in the one-sided case and $P(a_i < X_{i:n} < c_i,\ i = 1, \ldots, n \mid H_0)$ for some $a_1 \le \cdots \le a_n$ with $a_i < c_i$, $i = 1, \ldots, n$, in the two-sided case. Below, probabilities of the first type are calculated via Bolshev's recursion, cf. pp. 362-370 in Shorack and Wellner (2009), and probabilities of the second type are calculated via a recursive procedure proposed in Khmaladze and Shinjikashvili (2001). For example, $\alpha_n^{loc}$-values for $n = 10, 10^2, 10^3, 10^4$ and $\alpha = 0.01, 0.05, 0.1$ are provided in Table 1.

Table 1: Local levels $\alpha_n^{loc}$ fulfilling (1) for various $n$- and $\alpha$-values such that the corresponding one- or two-sided minP test is an exact level $\alpha$ test.

              α = 0.01                α = 0.05                α = 0.1
n             one-sided  two-sided    one-sided  two-sided    one-sided  two-sided
n = 10        0.001368   0.001311     0.007944   0.007390     0.017604   0.015992
n = 10^2      0.000387   0.000359     0.002461   0.002197     0.005728   0.004974
n = 10^3      0.000185   0.000170     0.001217   0.001075     0.002881   0.002464
n = 10^4      0.000115   0.000105     0.000765   0.000672     0.001816   0.001550

Now we compare critical values leading to asymptotic level $\alpha$ minP tests, i.e., $\alpha_n^*$ defined in (11), $\alpha_n'$ and $\alpha_n''$ defined in (12), with a finite counterpart $\alpha_n^{loc}$ defined by (1). The left graph in Figure 1 shows $\alpha_n^*$, $\alpha_n'$, $\alpha_n''$ and $\alpha_n^{loc}$ for $\alpha = 0.05$ and $n = 50, \ldots, 10^4$, and the right graph in Figure 1 shows related probabilities to reject the true null hypothesis $H_0$, i.e., probabilities given in (1). Similar pictures can be observed for other values of $\alpha$, e.g., $\alpha = 0.01, 0.1$.
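The numerical search for $\alpha_n^{loc}$ in the one-sided case can be sketched as follows. This is not the method used in the paper: instead of Bolshev's recursion, the sketch uses a plain dynamic program over the number of sample points below each threshold, which involves only non-negative terms but costs $O(n^3)$ operations and is therefore intended for small $n$. All function names are hypothetical, and the beta quantiles are obtained by bisection on the binomial-tail form of $F_{i,n}$.

```python
from math import comb


def beta_order_cdf(i, n, x):
    """F_{i,n}(x) = P(Bin(n, x) >= i), the cdf of U_{i:n}."""
    return sum(comb(n, k) * x**k * (1.0 - x) ** (n - k) for k in range(i, n + 1))


def beta_order_quantile(i, n, p, iters=80):
    """Solve F_{i,n}(c) = p for c by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if beta_order_cdf(i, n, mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def one_sided_level(n, d):
    """P(M_n^+ <= d | H0) = 1 - P(U_{i:n} > c_i for all i), c_i = F_{i,n}^{-1}(d)."""
    c = [beta_order_quantile(i, n, d) for i in range(1, n + 1)]
    # state[k]: probability of no rejection so far with exactly k points
    # at or below the current threshold (initially threshold 0, k = 0).
    state = [1.0] + [0.0] * n
    prev = 0.0
    for i in range(1, n + 1):
        # A point known to exceed prev falls into (prev, c_i] with probability p.
        p = (c[i - 1] - prev) / (1.0 - prev)
        new = [0.0] * (n + 1)
        for k, w in enumerate(state):
            if w == 0.0:
                continue
            rest = n - k
            # U_{i:n} > c_i holds iff at most i - 1 points are <= c_i.
            for j in range(0, min(rest, i - 1 - k) + 1):
                new[k + j] += w * comb(rest, j) * p**j * (1.0 - p) ** (rest - j)
        state, prev = new, c[i - 1]
    return 1.0 - sum(state)


def exact_local_level(n, alpha, iters=60):
    """Bisection for d with one_sided_level(n, d) = alpha on (alpha/n, alpha)."""
    lo, hi = alpha / n, alpha
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if one_sided_level(n, mid) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(exact_local_level(10, 0.05))
```

The value printed for $n = 10$ and $\alpha = 0.05$ can be compared with the one-sided entry of Table 1; for the sample sizes up to $10^4$ considered in the paper a faster recursion is needed.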
It looks like the probability to reject the true null hypothesis $H_0$ by a minP test based on an asymptotic critical value is larger in the two-sided case than in the one-sided case. Moreover, minP tests based on the asymptotic critical value $\alpha_n^*$ exceed the $\alpha$ level at least for $n \le 10^4$, while minP tests based on $\alpha_n'$ or $\alpha_n''$ are conservative and minP tests based on $\alpha_n''$ are most conservative. Surprisingly, although the minP test based on the critical value $\alpha_n^*$ is an asymptotic level $\alpha$ test, cf. Theorem 1, the finite global level, i.e., the related probability in (1), is considerably larger than $\alpha = 0.05$ and even increases in $n \in \{50, \ldots, 10^4\}$, cf. the right graph in Figure 1.

Figure 1: Left graph: asymptotic minP critical values $\alpha_n^*$ (dotted curve), $\alpha_n'$ (dash-dotted curve) and $\alpha_n''$ (dashed curve) together with $\alpha_n^{loc}$ leading to finite level $\alpha$ one-sided (upper solid curve) and two-sided (lower solid curve) minP tests for $n = 50, \ldots, 10^4$ and $\alpha = 0.05$. Right graph: probabilities to reject the true $H_0$ by the corresponding minP tests, i.e., $P(M_n^+ \le d_n \mid H_0)$ (lower curves) and $P(M_n \le d_n \mid H_0)$ (upper curves) for $d_n = \alpha_n^*, \alpha_n', \alpha_n''$ (dotted curves, dash-dotted curves and dashed curves, respectively).

Finally, we focus on z-transformed minP statistics, which have the same asymptotic distribution as related HC tests, cf. Theorem 2. Since for any $x \in \mathbb{R}$ we get $\Phi^{-1}(1 - M_n^+) \ge x$ and $\Phi^{-1}(1 - M_n/2) \ge x$ iff $M_n^+ \le 1 - \Phi(x)$ and $M_n \le 2(1 - \Phi(x))$, respectively, the right graph in Figure 1 also provides probabilities that z-transformed minP statistics exceed the

corresponding (asymptotic) HC critical values, that is, $P(\Phi^{-1}(1 - M_n^+) \ge b_n(t_\alpha^+) \mid H_0)$ and $P(\Phi^{-1}(1 - M_n/2) \ge b_n(t_\alpha) \mid H_0)$ are given by the lower dash-dotted and upper dashed curves, respectively. We observe that z-transformed minP GOF tests based on the corresponding asymptotic HC critical values are level $\alpha$ tests but too conservative in the finite setting.

Figure 2: The cdf of $\Phi^{-1}(1 - M_n/2)$ (dashed curves) and the cdf of $HC_n$ (dotted curves) simulated by $10^5$ repetitions together with the corresponding asymptotic cdf $F_n(y)$ (solid curves) for $n = 10^4$ (left graph) and $n = 10^6$ (right graph).

Contrary to this behavior, asymptotic level $\alpha$ HC tests, i.e., GOF tests based on statistics $HC_n^+$, $HC_n$ and critical values $b_n(t_\alpha^+)$, $b_n(t_\alpha)$, respectively, exceed the $\alpha$ level drastically for a finite sample size, e.g., cf. Jaeschke (1979), Khmaladze and Shinjikashvili (2001) and Gontscharuk et al. (2015). For example, Figure 2 shows the (simulated) cdfs of the two-sided HC statistic $HC_n$ and the two-sided z-transformed minP statistic $\Phi^{-1}(1 - M_n/2)$ together with the asymptotic (Gumbel-related) cdf $F_n(y)$ (say) for $n = 10^4, 10^6$. Thereby, the asymptotic cdf $F_n(y)$ is defined by (6) so that $F_n(y) = \exp(-2\exp(-b_n^{-1}(y)))$, where $b_n^{-1}(y)$ is the inverse function of $b_n(t)$ defined in (7). The finite distribution of z-transformed two-sided minP statistics seems to be closer to the asymptotic Gumbel-related distribution

than the finite two-sided HC distribution. In the one-sided case we get a similar picture, cf. Figure 7 in Gontscharuk et al. (2015).

We note that the minP critical values $\alpha_n^{loc}$ fulfilling (1) can be calculated exactly at least for $n \le 10^4$. The asymptotic critical values defined in (12) always lead to finite level $\alpha$ minP tests in Figure 1. Although these tests seem to be too conservative, one may prefer minP GOF tests based on these critical values for $n > 10^4$.

Figure 3: Critical values $d_n(\alpha)$ defined in (13) (diamonds) with $c_\alpha = 1.1, 1.3, 1.6$ (from top to bottom) and two-sided critical values $\alpha_n^{loc}$ fulfilling (1) (solid curves) for $\alpha = 0.01, 0.05, 0.1$ (from bottom to top) and $n = 10^3, \ldots, 10^4$.

In order to get minP critical values $d_n$ leading to an asymptotic level $\alpha$ test with improved finite behavior, we have to keep in mind that Theorem 1 requires $d_n/\alpha_n^* \to 1$ as $n \to \infty$. Motivated by (i) in Lemma 1, we can try the minP critical values defined by

\[ d_n(\alpha) = \frac{-\log(1 - \alpha)}{2 \log_2(n) \log(n)} \left[ 1 - c_\alpha \frac{\log_3(n)}{\log_2(n)} \right], \tag{13} \]

where $c_\alpha \in \mathbb{R}$ is a suitable constant. Note that, in fact, $d_n(\alpha)/\alpha_n^* \to 1$ as $n \to \infty$. Figure 3 shows $d_n(\alpha)$ based on $c_\alpha = 1.6, 1.3, 1.1$ for $\alpha = 0.01, 0.05, 0.1$, respectively (the pairing used in the caption of Figure 3 and in Table 2), together with the corresponding minP two-sided critical values $\alpha_n^{loc}$ fulfilling (1) for $n = 10^3, \ldots, 10^4$. It seems

that $d_n(\alpha)$ approximates $\alpha_n^{loc}$ very well at least for the considered $\alpha$- and $n$-values. However, one cannot expect that this approximation works well for all $n > 10^4$. It is also difficult to check the appropriateness of any approximation since even simulations become more and more time consuming for larger values of $n$. Table 2 provides simulated probabilities $P(M_n \le d_n(\alpha) \mid H_0)$ based on $d_n(\alpha)$ defined in (13) with $c_\alpha = 1.6, 1.3, 1.1$ for $\alpha = 0.01, 0.05, 0.1$, respectively, and some $n \ge 10^4$, and, for $n = 10^4$, simulated probabilities $P(M_n \le \alpha_n^{loc} \mid H_0)$ where $\alpha_n^{loc}$ refers to two-sided exact level $\alpha$ minP tests. Taking account of the simulation inaccuracy, minP critical values defined in (13) seem to work reasonably well for the considered $n$- and $\alpha$-values.

Table 2: Probabilities $P(M_n \le d_n(\alpha) \mid H_0)$ (and $P(M_n \le \alpha_n^{loc} \mid H_0)$ in parentheses, for $n = 10^4$ only) simulated by $10^5$ repetitions, where $d_n(\alpha)$ is based on $c_\alpha = 1.6, 1.3, 1.1$ for $\alpha = 0.01, 0.05, 0.1$, respectively, and $\alpha_n^{loc}$ fulfills (1) in the two-sided case.

n               α = 0.01             α = 0.05             α = 0.1
n = 10^4        0.00966 (0.00972)    0.04874 (0.04905)    0.09969 (0.09937)
n = 5 · 10^4    0.00961              0.04971              0.10016
n = 10^5        0.01018              0.05019              0.10188
n = 5 · 10^5    0.01001              0.05018              0.10115
n = 10^6        0.00973              0.04942              0.10135

In summary, the asymptotic representation of the minP critical values $\alpha_n^*$ defined in (11) gives us some idea about the magnitude of the local level w.r.t. a single p-value. In order to improve the finite sample behavior of an asymptotic minP GOF test, new results with respect to higher order asymptotics for local levels or the Gumbel approximation seem desirable. However, this seems to be a difficult issue.

APPENDIX

Proof of Lemma 1. Part (i) immediately follows from Lemma 4.3 in Gontscharuk et

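al. (2014); for (ii) see p. 10 in Gontscharuk et al. (2014); Proposition 1 in Eicker (1979) implies (iii).

Proof of Theorem 1. First, we show that the one-sided minP test with critical value $d_n \equiv \alpha_n^*(\alpha)$ is an asymptotic level $\alpha$ test, i.e.,

\[ \lim_{n \to \infty} P(M_n^+ \le \alpha_n^*(\alpha) \mid H_0) = \alpha. \tag{14} \]

Let $U_{1:n} \le \cdots \le U_{n:n}$ denote the order statistics of iid $U(0,1)$-distributed random variables. Setting $c_{i,n}^{minP} \equiv F_{i,n}^{-1}(\alpha_n^*(\alpha))$, $i = 1, \ldots, n$, we get for the minP local levels defined in (10) that

\[ \alpha_{i,n}^{minP} = P(U_{i:n} \le c_{i,n}^{minP}) = \alpha_n^*(\alpha) \quad \text{and} \quad P(M_n^+ \le \alpha_n^*(\alpha) \mid H_0) = P\Big( \bigcup_{i=1}^n \{U_{i:n} \le c_{i,n}^{minP}\} \Big). \]

For notational convenience, we split the index set $I_n = \{i : i = 1, \ldots, n\}$ into $J_1 = \{i \in I_n : i < \log(n) \text{ or } i > n - \log(n)\}$ and $J_2 = I_n \setminus J_1$. By means of the Bonferroni inequality we obtain

\[ P(M_n^+ \le \alpha_n^*(\alpha) \mid H_0) \le P\Big( \bigcup_{i \in J_1} \{U_{i:n} \le c_{i,n}^{minP}\} \Big) + P\Big( \bigcup_{i \in J_2} \{U_{i:n} \le c_{i,n}^{minP}\} \Big). \]

Moreover, the Bonferroni inequality also implies

\[ P\Big( \bigcup_{i \in J_1} \{U_{i:n} \le c_{i,n}^{minP}\} \Big) \le \sum_{i \in J_1} P(U_{i:n} \le c_{i,n}^{minP}) \le \frac{-\log(1 - \alpha)}{\log_2(n)}. \]

Hence,

\[ \lim_{n \to \infty} P(M_n^+ \le \alpha_n^*(\alpha) \mid H_0) = \lim_{n \to \infty} P\Big( \bigcup_{i \in J_2} \{U_{i:n} \le c_{i,n}^{minP}\} \Big). \tag{15} \]

For an arbitrary but fixed $\varepsilon \in (0, \min(\alpha, 1 - \alpha))$ we consider two one-sided HC tests at the asymptotic level $\alpha + \varepsilon$ and $\alpha - \varepsilon$, respectively. That is, these HC tests are based on the critical values $b_n(t_{\alpha+\varepsilon}^+)$ and $b_n(t_{\alpha-\varepsilon}^+)$, respectively, with $b_n$ defined in (7) and $t_\alpha^+$ defined in (8). Due to (i) in Lemma 1, the corresponding HC local levels $\alpha_{i,n}^{HC}(\alpha - \varepsilon)$ and $\alpha_{i,n}^{HC}(\alpha + \varepsilon)$, cf. (9), fulfill

\[ \alpha_{i,n}^{HC}(\alpha \pm \varepsilon) / \alpha_n^*(\alpha \pm \varepsilon) \to 1 \quad \text{as } n \to \infty \]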

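uniformly for $i \in J_2$. Since $\alpha_n^*(\alpha)$ is a monotonically increasing function in $\alpha \in (0,1)$, we obtain for $n \in \mathbb{N}$ large enough that

\[ \alpha_{i,n}^{HC}(\alpha - \varepsilon) \le \alpha_n^*(\alpha) \le \alpha_{i,n}^{HC}(\alpha + \varepsilon), \quad i \in J_2. \tag{16} \]

Setting $g_{i,n}(u) \equiv \sqrt{n}\,(i/n - u)/\sqrt{u(1 - u)}$, local levels related to the asymptotic level $\alpha$ HC test can be represented as $\alpha_{i,n}^{HC}(\alpha) = P(g_{i,n}(U_{i:n}) > b_n(t_\alpha^+))$, $i = 1, \ldots, n$. Hence, setting $c_{i,n}^{HC}(\alpha) \equiv g_{i,n}^{-1}(b_n(t_\alpha^+))$, $i = 1, \ldots, n$, we get $\alpha_{i,n}^{HC}(\alpha) = P(U_{i:n} < c_{i,n}^{HC}(\alpha))$, $i = 1, \ldots, n$. Therefore, (16) implies

\[ c_{i,n}^{HC}(\alpha - \varepsilon) \le c_{i,n}^{minP} \le c_{i,n}^{HC}(\alpha + \varepsilon), \quad i \in J_2, \]

for $n \in \mathbb{N}$ large enough, which immediately leads to

\[ P\Big( \bigcup_{i \in J_2} \{U_{i:n} \le c_{i,n}^{minP}\} \Big) \ge P\Big( \bigcup_{i \in J_2} \{U_{i:n} \le c_{i,n}^{HC}(\alpha - \varepsilon)\} \Big) \tag{17} \]

and

\[ P\Big( \bigcup_{i \in J_2} \{U_{i:n} \le c_{i,n}^{minP}\} \Big) \le P\Big( \bigcup_{i \in J_2} \{U_{i:n} \le c_{i,n}^{HC}(\alpha + \varepsilon)\} \Big). \tag{18} \]

The last statement in Lemma 1 yields that order statistics $U_{i:n}$, $i \notin J_2$, do not contribute anything to the asymptotic level of the HC test in the sense

\[ \lim_{n \to \infty} P\Big( \bigcup_{i=1}^n \{U_{i:n} \le c_{i,n}^{HC}(\alpha)\} \Big) = \lim_{n \to \infty} P\Big( \bigcup_{i \in J_2} \{U_{i:n} \le c_{i,n}^{HC}(\alpha)\} \Big) = \alpha \]

for any $\alpha \in (0,1)$. This together with (15), (17) and (18) implies

\[ \alpha - \varepsilon \le \lim_{n \to \infty} P(M_n^+ \le \alpha_n^*(\alpha) \mid H_0) \le \alpha + \varepsilon, \]

which is true for any $\varepsilon \in (0, \min(\alpha, 1 - \alpha))$ and hence (14) follows.

Now we focus on the one-sided minP test in the general case $d_n \in (\alpha/n, \alpha)$. For a given $d_n$ we define a level $\alpha_n$ (say) as a solution of $d_n = \alpha_n^*(\alpha_n)$, i.e., $\alpha_n = 1 - \exp(-2 d_n \log_2(n) \log(n))$. W.l.o.g. let $\alpha_n \to \tilde\alpha$ for some $\tilde\alpha \in [0,1]$. Obviously,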

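if $0 < \tilde\alpha < 1$ then for any $\varepsilon \in (0, \min(\tilde\alpha, 1 - \tilde\alpha))$ and larger $n$ we get $\alpha_n^*(\tilde\alpha - \varepsilon) \le d_n \le \alpha_n^*(\tilde\alpha + \varepsilon)$,

if $\tilde\alpha = 0$ then for any $\varepsilon \in (0,1)$ and larger $n$ we get $d_n \le \alpha_n^*(\varepsilon)$,

if $\tilde\alpha = 1$ then for any $\varepsilon \in (0,1)$ and larger $n$ we get $d_n \ge \alpha_n^*(1 - \varepsilon)$.

Since the probability $P(M_n^+ \le d_n \mid H_0)$ is a monotonically increasing function in $d_n$, we obtain by (14) that

\[ \lim_{n \to \infty} P(M_n^+ \le d_n \mid H_0) \le \tilde\alpha + \varepsilon \quad \text{and/or} \quad \lim_{n \to \infty} P(M_n^+ \le d_n \mid H_0) \ge \tilde\alpha - \varepsilon \]

for any (feasible) $\varepsilon > 0$. Therefore, $\lim_{n \to \infty} P(M_n^+ \le d_n \mid H_0) = \tilde\alpha$. Noting that $\tilde\alpha = \alpha$ iff $\lim_{n \to \infty} d_n/\alpha_n^*(\alpha) = 1$, the assertion for the one-sided test follows. The two-sided case can be proved in the same way via the assertion (ii) in Lemma 1.

Proof of Theorem 2. We first restrict attention to the one-sided case. Setting $t \equiv t_\alpha^+$ given in (8), the assertion of the theorem can be rewritten as

\[ \lim_{n \to \infty} P(M_n^+ \le \bar\Phi(b_n(t_\alpha^+)) \mid H_0) = \alpha, \]

where $\bar\Phi(x) = 1 - \Phi(x)$. Due to Theorem 1 it suffices to show that

\[ \lim_{n \to \infty} \bar\Phi(b_n(t_\alpha^+)) / \alpha_n^*(\alpha) = 1. \]

Applying Mills' ratio, we get $\bar\Phi(b_n(t_\alpha^+)) = \varphi(b_n(t_\alpha^+))/b_n(t_\alpha^+)\,[1 + o(1)]$, where $\varphi(\cdot)$ is the density of the standard normal distribution. This together with simple analysis leads to

\[ \bar\Phi(b_n(t_\alpha^+)) = \frac{1}{\sqrt{2 \log_2(n)}}\, \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{b_n(t_\alpha^+)^2}{2} \right) [1 + o(1)] \]

\[ = \frac{\exp\big( -\log_2(n) - \log(\sqrt{\log_2(n)}) + \log(\sqrt{\pi}) - t_\alpha^+ + o(1) \big)}{2 \sqrt{\pi \log_2(n)}}\, [1 + o(1)] \]

\[ = \frac{-\log(1 - \alpha)}{2 \log_2(n) \log(n)}\, [1 + o(1)] = \alpha_n^*(\alpha)\, [1 + o(1)], \]

which implies the assertion. The two-sided case follows by noting that $2\bar\Phi(b_n(t_\alpha)) = \alpha_n^*(\alpha)[1 + o(1)]$ for $t_\alpha$ in (8).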
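The limits used in the proofs hold only as $n \to \infty$, and a quick numerical check suggests how slowly they are approached. The sketch below is an illustration, not part of the paper: it evaluates the ratio $\bar\Phi(b_n(t_\alpha^+))/\alpha_n^*(\alpha)$ from the proof of Theorem 2 and the corrected critical value $d_n(\alpha)$ from (13). The function names are hypothetical, floating-point values of $n$ are used so that very large sample sizes can be plugged in, and the constant 1.3 is the value of $c_\alpha$ reported for $\alpha = 0.05$.

```python
from math import erfc, log, pi, sqrt


def alpha_star(n, alpha):
    """Asymptotic critical value from (11)."""
    return -log(1.0 - alpha) / (2.0 * log(log(n)) * log(n))


def b_n(n, t):
    """Critical value b_n(t) from (7); assumes n > e."""
    log2n = log(log(n))
    root = sqrt(2.0 * log2n)
    return root + (log(log2n) - log(pi) + 2.0 * t) / (2.0 * root)


def mills_ratio_check(n, alpha):
    """Ratio (1 - Phi(b_n(t_alpha^+))) / alpha_star, which tends to 1."""
    t_one = -log(-log(1.0 - alpha))
    tail = 0.5 * erfc(b_n(n, t_one) / sqrt(2.0))
    return tail / alpha_star(n, alpha)


def d_n(n, alpha, c_alpha):
    """Corrected critical value from (13)."""
    log2n = log(log(n))
    return alpha_star(n, alpha) * (1.0 - c_alpha * log(log2n) / log2n)


if __name__ == "__main__":
    for n in (1e4, 1e6, 1e300):
        print(n, mills_ratio_check(n, 0.05), d_n(n, 0.05, 1.3) / alpha_star(n, 0.05))
```

Both ratios increase with $n$ but remain well below 1 even for astronomically large $n$, which is in line with the conservative finite-sample behavior reported in Section 4 for tests based on (12).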

ACKNOWLEDGEMENTS

The authors gratefully acknowledge the constructive comments and suggestions of the anonymous referee. This work was supported by the Ministry of Science and Research of the State of North Rhine-Westphalia (MIWF NRW) and the German Federal Ministry of Health (BMG).

BIBLIOGRAPHY

Aldor-Noiman, S., Brown, L., Buja, A., Rolke, W. and Stine, R. (2013). The power to see: A new graphical test of normality. Am. Stat., 67:4, 249–260.

Berk, R. and Jones, D. (1978). Relatively optimal combinations of test statistics. Scand. J. Stat., 5, 158–162.

Berk, R. and Jones, D. (1979). Goodness-of-fit test statistics that dominate the Kolmogorov Statistics. Z. Wahrscheinlichkeit., 47, 47–59.

Buja, A. and Rolke, W. (2006). Calibration for simultaneity: (Re)Sampling methods for simultaneous inference with applications to function estimation and functional data. Unpublished manuscript.

Donoho, D. and Jin, J. (2004). Higher criticism for detecting sparse heterogeneous mixtures. Ann. Stat., 32, 962–994.

Donoho, D. and Jin, J. (2009). Feature selection by higher criticism thresholding achieves the optimal phase diagram. Philos. Tr. R. Soc. A, 367, 4449–4470.

Eicker, F. (1979). The asymptotic distribution of the suprema of the standardized empirical processes. Ann. Stat., 7, 116–138.

Gontscharuk, V., Landwehr, S. and Finner, H. (2014). Goodness of fit tests in terms of local levels with special emphasis on higher criticism tests. Accepted for publication in Bernoulli, http://www.e-publications.org/ims/submission/bej/user/submissionfile/18412?confirm=bd254ced

Gontscharuk, V., Landwehr, S. and Finner, H. (2015). The intermediates take it all: asymptotics of higher criticism statistics and a powerful alternative based on equal local levels. Biometrical J., 57, 159–180.

Hall, P. and Jin, J. (2008). Properties of higher criticism under strong dependence. Ann. Stat., 36, 381–402.

Jaeschke, D. (1979). The asymptotic distribution of the supremum of the standardized empirical distribution function on subintervals. Ann. Stat., 7, 108–115.

Jager, L. and Wellner, J. (2007). Goodness-of-fit tests via phi-divergences. Ann. Stat., 35, 2018–2053.

Khmaladze, E. and Shinjikashvili, E. (2001). Calculation of noncrossing probabilities for Poisson processes and its corollaries. Adv. Appl. Probab., 33, 702–716.

Mary, D. and Ferrari, A. (2014). A non-asymptotic standardization of binomial counts in higher criticism. IEEE International Symposium on Information Theory (ISIT), 561–565.

Shorack, G. and Wellner, J. (2009). Empirical Processes with Applications to Statistics. Philadelphia: Society for Industrial and Applied Mathematics.