An Empirical Likelihood Approach To Goodness of Fit Testing

Submitted to the Bernoulli

An Empirical Likelihood Approach To Goodness of Fit Testing

HANXIANG PENG and ANTON SCHICK

Indiana University Purdue University at Indianapolis, Department of Mathematical Sciences, Indianapolis, IN 46202, USA. E-mail: hpeng@math.iupui.edu

Binghamton University, Department of Mathematical Sciences, Binghamton, NY 13902, USA. E-mail: anton@math.binghamton.edu

Motivated by applications to goodness of fit testing, the empirical likelihood approach is generalized to allow for the number of constraints to grow with the sample size and for the constraints to use estimated criteria functions. The latter is needed to deal with nuisance parameters. The proposed empirical likelihood based goodness of fit tests are asymptotically distribution free. For univariate observations, tests for a specified distribution, for a distribution of parametric form, and for a symmetric distribution are presented. For bivariate observations, tests for independence are developed.

Keywords: infinitely many constraints; nuisance parameter; estimated constraint functions; regression model; testing for a specific distribution; testing for a parametric model; testing for symmetry; testing for independence.

1. Introduction

The empirical likelihood approach was introduced by Owen (1988, 1990) to construct confidence intervals in a nonparametric setting, see also Owen (2001). As a likelihood approach possessing nonparametric properties, it does not require us to specify a distribution for the data and often yields more efficient estimates of the parameters. It allows data to decide the shape of confidence regions and is Bartlett correctable (DiCiccio, Hall and Romano, 1991).

The approach has been developed for various situations, e.g., generalized linear models (Kolaczyk, 1994), local linear smoothers (Chen and Qin, 2000), partially linear models (Shi and Lau, 2000; Wang and Jing, 2003), parametric and semiparametric models in multiresponse regression (Chen and Van Keilegom, 2009), linear regression with censored data (Zhou and Li, 2008), and plug-in estimates of nuisance parameters in estimating equations in the context of survival analysis (Qin and Jing, 2001; Wang and Jing, 2001; Li and Wang, 2003). Algorithms, calibration and higher-order precision of the approach can be found in Hall and La Scala (1990), Emerson and Owen (2009) and Liu and Chen (2010) among others. It is especially convenient to incorporate side information expressed through equality constraints. Qin and Lawless (1994) linked empirical likelihood with finitely many estimating equations. These estimating equations serve as finitely many equality constraints.

In semiparametric settings, information on the model can often be expressed by means of infinitely many constraints which may also depend on parameters of the model. In goodness of fit testing, the null hypothesis can typically be expressed by infinitely many such constraints. This is the case when testing for a fixed distribution (see Example 1 below), when testing for a given parametric model (Example 2), when testing for symmetry about a fixed point (Example 3), and when testing for independence (Example 4). Modeling conditional expectations can also be done by means of infinitely many constraints. This has applications to heteroscedastic regression models (Section 3) and to conditional moment restriction models treated by Tripathi and Kitamura (2003) using a smoothed empirical likelihood approach.

Recently Hjort, McKeague and Van Keilegom (2009) extended the scope of the empirical likelihood method. In particular, they developed a general theory for constraints with nuisance parameters and considered the case with infinitely many constraints. Their results for infinitely many constraints, however, do not allow for nuisance parameters. In this paper we will fill this gap and in the process improve on their results.

Let us now discuss some of our results in the following special case. Let $Z_1, \dots, Z_n$ be independent copies of a random vector $Z$ with distribution $Q$. Let $u_1, u_2, \dots$ be orthonormal elements of $L_{2,0}(Q) = \{u \in L_2(Q) : \int u \, dQ = 0\}$. Then the random variables $u_1(Z), u_2(Z), \dots$ have mean zero, variance one and are uncorrelated. Now consider the empirical likelihood based on the first $m_n$ of these functions,

$$\mathcal{R}_n = \sup\Big\{ \prod_{j=1}^n n\pi_j : \pi \in \mathcal{P}_n, \ \sum_{j=1}^n \pi_j u_k(Z_j) = 0, \ k = 1, \dots, m_n \Big\},$$

where $\mathcal{P}_n = \{\pi = (\pi_1, \dots, \pi_n)^\top \in [0,1]^n : \pi_1 + \dots + \pi_n = 1\}$ denotes the closed probability simplex in dimension $n$. For fixed $m_n = m$, it follows from Owen's work that $-2\log \mathcal{R}_n$ has asymptotically a chi-square distribution with $m$ degrees of freedom. In other words,

$$P\big(-2\log \mathcal{R}_n > \chi^2_{1-\alpha}(m_n)\big) \to \alpha, \quad 0 < \alpha < 1, \qquad (1.1)$$

where $\chi^2_\beta(m)$ denotes the $\beta$-quantile of the chi-square distribution with $m$ degrees of freedom. Hjort et al (2009) have shown that (1.1) holds under some additional assumptions even if $m_n$ tends to infinity with $n$ by proving the asymptotic normality result

$$(-2\log \mathcal{R}_n - m_n)/\sqrt{2m_n} \Rightarrow N(0,1). \qquad (1.2)$$

This result requires higher moment assumptions on the functions $u_1, u_2, \dots$ and restrictions on the rate at which $m_n$ can tend to infinity. For example, if the functions $u_1, u_2, \dots$ are uniformly bounded, then the rate $m_n^3 = o(n)$ suffices for (1.2). They also state in their Theorem 4.1 that if $\sup_k \int |u_k|^q \, dQ$ is finite for some $q > 2$, then $m_n^{3+6/(q-2)} = o(n)$ suffices for (1.2). A gap in their argument was fixed by Peng and Schick (2012). We shall show that larger $m_n$ are allowed in some cases. In particular, for $q = 4$, it suffices that $m_n^4 = o(n)$ holds (instead of their $m_n^6 = o(n)$) and if $q = 3$, then $m_n^6 = o(n)$ is enough (instead of their $m_n^9 = o(n)$), see our Theorems 7.2 and 7.3 below. Our rate $m_n^4 = o(n)$ for $q = 4$ matches the rate given in Theorem 2 of Chen, Peng and Qin (2009). These authors obtain asymptotic normality for $m_n$ larger than in Hjort et al (2009) by imposing additional structural assumptions. These assumptions, however, are typically not met in the applications we have in mind.

One of the key points in our proof is a simple condition for the convex hull of some vectors $x_1, \dots, x_n$ to have the origin as an interior point. Our condition is that the smallest eigenvalue of $\sum_{i=1}^n x_i x_i^\top$ exceeds $5 \, |\sum_{i=1}^n x_i| \max_{1 \le j \le n} |x_j|$. Here $|x|$ denotes the euclidean norm of a vector $x$. This sufficient condition ties in nicely with the other requirements used to establish the asymptotic behavior of the empirical likelihood and is typically implied by these. For example, conditions (A1)-(A3) in Theorem 2.1 of Hjort et al (2009) already imply their (A0). Thus the conclusion of their theorem is valid under (A1)-(A3) only, see our Theorem 6.1.

Let us now look at the case when the functions $u_1, u_2, \dots$ are unknown. Then we can work with the empirical likelihood

$$\hat{\mathcal{R}}_n = \sup\Big\{ \prod_{j=1}^n n\pi_j : \pi \in \mathcal{P}_n, \ \sum_{j=1}^n \pi_j \hat u_k(Z_j) = 0, \ k = 1, \dots, m_n \Big\},$$

where $\hat u_k$ is an estimator of $u_k$ such that

$$\sum_{k=1}^{m_n} \frac{1}{n} \sum_{j=1}^n \big(\hat u_k(Z_j) - u_k(Z_j)\big)^2 = o_p(m_n^{-1}). \qquad (1.3)$$

Now we have the conclusion $(-2\log \hat{\mathcal{R}}_n - m_n)/\sqrt{2m_n} \Rightarrow N(0,1)$ under the condition

$$\sum_{k=1}^{m_n} \Big( n^{-1/2} \sum_{j=1}^n \big(\hat u_k(Z_j) - u_k(Z_j)\big) \Big)^2 = o_p(1) \qquad (1.4)$$

and mild additional conditions such as (i) $|\hat u_k| + |u_k| \le B$ for some constant $B$ and all $k$ and $m_n^3 = o(n)$, or (ii) $\sum_{k=1}^{m_n} \int u_k^4 \, dQ = O(m_n^2)$ and $m_n^4 = o(n)$. Our results, however, go beyond this simple result. If (1.4) is replaced by

$$\sum_{k=1}^{m_n} \Big( n^{-1/2} \sum_{j=1}^n \big(\hat u_k(Z_j) - u_k(Z_j) + E[u_k(Z)\psi^\top(Z)]\psi(Z_j)\big) \Big)^2 = o_p(1) \qquad (1.5)$$

with $\psi$ a measurable function into $\mathbb{R}^q$ which is standardized under $Q$ in the sense that $E[\psi(Z)] = 0$ and $E[\psi(Z)\psi^\top(Z)] = I_q$, the $q \times q$ identity matrix, then the conclusion $(-2\log \hat{\mathcal{R}}_n - (m_n - q))/\sqrt{2(m_n - q)} \Rightarrow N(0,1)$ holds under (i) or (ii).

Our paper is organized as follows. In Section 2, we give four examples that motivate our research. The emphasis in these examples is on goodness of fit testing. The proposed empirical likelihood based goodness of fit tests are asymptotically distribution free. For univariate observations, tests for a specified distribution, for a distribution of parametric form, and for a symmetric distribution are presented. For bivariate observations, tests for independence are discussed. Another example is given in Section 3 with a small simulation study. This example considers tests for the regression parameters in simple linear heteroscedastic regression. The simulations compare our new procedure based on infinitely many constraints with the classical empirical likelihood procedure and illustrate improvements by the new procedures. In Section 4 we introduce notation and recall some results on the spectral norm of matrices. In Section 5 we derive a lemma that extracts the essence from the proofs of Owen (2001, Chapter 11) and also obtains the aforementioned sufficient condition for a convex hull of vectors to contain the origin as interior point. The results are derived for non-stochastic vectors and formulated as inequalities. The inequalities are used in Section 6 to obtain the behavior of the empirical likelihood with random vectors whose dimension may increase. The results are formulated abstractly and do not require independence. In Section 7 we specialize our results to the case of independent observations with infinitely many constraints, both known and unknown. We also briefly discuss the behavior under contiguous alternatives. The details for our examples are given in Section 8.

2. Motivating examples

In this section, we give examples that motivated the research in this paper.

Example 1. Testing for a fixed distribution. Let $X_1, \dots, X_n$ be independent copies of a random variable $X$. Suppose we want to test whether their common distribution function $F$ equals a known continuous distribution function $F_0$. Under the null hypothesis, we have $E[h(X)] = 0$ for every $h \in L_{2,0}(F_0)$, and $F_0(X)$ has a uniform distribution. An orthonormal basis of $L_{2,0}(F_0)$ is thus given by $v_1 \circ F_0, v_2 \circ F_0, \dots$ for any orthonormal basis $v_1, v_2, \dots$ of $L_{2,0}(U)$, where $U$ is the uniform distribution on $(0,1)$. We shall work with the trigonometric basis $\varphi_1, \varphi_2, \dots$ defined by

$$\varphi_k(x) = \sqrt{2}\cos(k\pi x), \quad x \in [0,1], \ k = 1, 2, \dots, \qquad (2.1)$$

as these basis functions are uniformly bounded by $\sqrt{2}$. As test statistic we take

$$\mathcal{R}_n(F_0) = \sup\Big\{ \prod_{j=1}^n n\pi_j : \pi \in \mathcal{P}_n, \ \sum_{j=1}^n \pi_j \varphi_k(F_0(X_j)) = 0, \ k = 1, \dots, m_n \Big\},$$

which uses the first $m_n$ of the trigonometric functions. Under the null hypothesis we have $P(-2\log \mathcal{R}_n(F_0) > \chi^2_{1-\alpha}(m_n)) \to \alpha$ for every $0 < \alpha < 1$ as both $m_n$ and $n$ tend to infinity and $m_n^3/n$ tends to zero. Thus the test $\mathbf{1}[-2\log \mathcal{R}_n(F_0) > \chi^2_{1-\alpha}(m_n)]$ has asymptotic size $\alpha$. Here we are still in the framework of Hjort et al (2009) with infinitely many known constraints.
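The constraint functions of Example 1 are simple to compute. The following is a minimal sketch, not taken from the paper: the helper names `phi`, `inner_product` and `constraint_rows` are hypothetical, and the exponential distribution function used in the usage note is only an illustrative choice of $F_0$. It checks numerically (midpoint rule) that the basis in (2.1) is orthonormal under the uniform distribution and builds the rows $(\varphi_1(F_0(X_j)), \dots, \varphi_m(F_0(X_j)))$ that enter $\mathcal{R}_n(F_0)$.

```python
import math


def phi(k, x):
    """Trigonometric basis function phi_k(x) = sqrt(2) * cos(k * pi * x) from (2.1)."""
    return math.sqrt(2.0) * math.cos(k * math.pi * x)


def inner_product(k, l, grid=2000):
    """Midpoint-rule approximation of the integral of phi_k * phi_l over (0, 1)."""
    total = 0.0
    for i in range(grid):
        x = (i + 0.5) / grid
        total += phi(k, x) * phi(l, x)
    return total / grid


def constraint_rows(xs, cdf, m):
    """One row (phi_1(F0(x)), ..., phi_m(F0(x))) per observation x.

    `cdf` is the hypothesized distribution function F0, passed in as a callable
    (an assumption of this sketch).
    """
    return [[phi(k, cdf(x)) for k in range(1, m + 1)] for x in xs]
```

For instance, with the illustrative choice $F_0(x) = 1 - e^{-x}$, the call `constraint_rows(data, lambda x: 1.0 - math.exp(-x), m)` returns an `n` by `m` list of lists whose column means should be close to zero when the data really come from $F_0$.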

Example 2. Testing for a parametric model. Let $X_1, \dots, X_n$ be again independent and identically distributed random variables. But now suppose we want to test whether their common distribution function $F$ belongs to a model $\mathcal{F} = \{F_\vartheta : \vartheta \in \Theta\}$ indexed by an open subset $\Theta$ of $\mathbb{R}^q$. Suppose that the distribution functions $F_\vartheta$ have densities $f_\vartheta$ such that the map $\vartheta \mapsto s_\vartheta = \sqrt{f_\vartheta}$ is continuously differentiable in $L_2$ with derivative $\vartheta \mapsto \dot s_\vartheta$ and the matrix

$$J(\vartheta) = 4 \int \dot s_\vartheta(x) \dot s_\vartheta^\top(x) \, dx$$

is invertible for each $\vartheta \in \Theta$. In this case we set $\dot\ell_\vartheta = 2\dot s_\vartheta / s_\vartheta$. Let now $\hat\theta$ be an estimator of the parameter in the model. We require it to satisfy the stochastic expansion

$$\hat\theta = \theta + \frac{1}{n} \sum_{j=1}^n J(\theta)^{-1} \dot\ell_\theta(X_j) + o_{P_\theta}(n^{-1/2}) \qquad (2.2)$$

for each $\theta \in \Theta$, where $P_\theta$ is the measure for which $F = F_\theta$. Such estimators are efficient in the parametric model. Candidates are maximum likelihood estimators. As test statistic we take $\mathcal{R}_n(F_{\hat\theta})$, the test statistic from the previous example with $F_0$ replaced by $F_{\hat\theta}$. Here we are no longer in the framework of Hjort et al (2009) as we now have infinitely many unknown constraints. We shall show that under the null hypothesis $P(-2\log \mathcal{R}_n(F_{\hat\theta}) > \chi^2_{1-\alpha}(m_n - q)) \to \alpha$ for every $0 < \alpha < 1$ as both $m_n$ and $n$ tend to infinity and $m_n^3 \log n / n$ tends to zero. In view of this result the test $\mathbf{1}[-2\log \mathcal{R}_n(F_{\hat\theta}) > \chi^2_{1-\alpha}(m_n - q)]$ has asymptotic size $\alpha$. It is crucial for our result that we have chosen an estimator $\hat\theta$ satisfying (2.2).

Example 3. Testing for symmetry. Let $X_1, \dots, X_n$ be independent copies of a random variable $X$ with a continuous distribution function $F$. We want to test whether $F$ is symmetric about zero in the sense that $F(t) = 1 - F(-t)$ for all real $t$. Under the null hypothesis of symmetry, the random variables $\mathrm{sign}(X)$ and $|X|$ are independent, and $\mathrm{sign}(X)$ takes values $-1$ and $1$ with probability one half. This is equivalent to $E[\mathrm{sign}(X) v(|X|)] = 0$ for every $v \in L_2(H)$, where $H$ is the distribution function of $|X|$. Since $H$ is continuous, an orthonormal system of $L_2(H)$ is given by $\varphi_0 \circ H, \varphi_1 \circ H, \dots$ where $\varphi_0 = 1$ and $\varphi_1, \varphi_2, \dots$ are given in (2.1). This suggests the test statistic

$$\mathcal{R}_n = \sup\Big\{ \prod_{j=1}^n n\pi_j : \pi \in \mathcal{P}_n, \ \sum_{j=1}^n \pi_j \, \mathrm{sign}(X_j) \varphi_k(R_j) = 0, \ k = 0, \dots, m_n \Big\},$$

where $R_j = \mathbb{H}_n(|X_j|)$ and $\mathbb{H}_n$ is the empirical distribution function based on $|X_1|, \dots, |X_n|$. We shall show that under symmetry one has $P(-2\log \mathcal{R}_n > \chi^2_{1-\alpha}(m_n + 1)) \to \alpha$ for every $0 < \alpha < 1$ as $m_n$ and $n$ tend to infinity and $m_n^3/n$ tends to zero. From this we derive that the test $\mathbf{1}[-2\log \mathcal{R}_n > \chi^2_{1-\alpha}(m_n + 1)]$ has asymptotic size $\alpha$.

Example 4. Testing for independence. Let $(X_1, Y_1), \dots, (X_n, Y_n)$ be independent copies of a bivariate random vector $(X, Y)$. We assume that the marginal distribution functions $F$ and $G$ are continuous. We want to test whether $X$ and $Y$ are independent. Independence is equivalent to $E[a(X)b(Y)] = 0$ for all $a \in L_{2,0}(F)$ and $b \in L_{2,0}(G)$ and thus equivalent to $E[\varphi_k(F(X))\varphi_l(G(Y))] = 0$ for all positive integers $k$ and $l$.
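The product moment conditions of Example 4 can be turned into constraint vectors directly. The sketch below is illustrative only: the function name `product_constraints` and the argument `r` (how many basis functions are used in each coordinate) are assumptions of this example, and the marginal distribution functions are passed in as callables. For each pair $(X_j, Y_j)$ it returns the $r^2$ products $\varphi_k(F(X_j))\varphi_l(G(Y_j))$, ordered with $k$ as the outer index.

```python
import math


def phi(k, x):
    """Trigonometric basis function phi_k(x) = sqrt(2) * cos(k * pi * x) from (2.1)."""
    return math.sqrt(2.0) * math.cos(k * math.pi * x)


def product_constraints(xs, ys, cdf_x, cdf_y, r):
    """Rows of products phi_k(F(x_j)) * phi_l(G(y_j)) for k, l = 1, ..., r.

    cdf_x and cdf_y play the roles of F and G; each row has r * r entries,
    ordered as (k, l) = (1, 1), (1, 2), ..., (r, r).
    """
    rows = []
    for x, y in zip(xs, ys):
        fx, gy = cdf_x(x), cdf_y(y)
        rows.append([phi(k, fx) * phi(l, gy)
                     for k in range(1, r + 1)
                     for l in range(1, r + 1)])
    return rows
```

Under independence each column of the returned matrix has expectation zero, which is what the empirical likelihood constraints encode.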

(a) Assume first that $F$ and $G$ are known. This is for example the case in an actuarial setting where $X$ and $Y$ denote residual lifetimes and their distribution functions are available from life tables. Motivated by the above we take as test statistic

$$\mathcal{R}_n(F, G) = \sup\Big\{ \prod_{j=1}^n n\pi_j : \pi \in \mathcal{P}_n, \ \sum_{j=1}^n \pi_j \varphi_k(F(X_j)) \varphi_l(G(Y_j)) = 0, \ k, l = 1, \dots, r_n \Big\}.$$

Under the null hypothesis one has $P(-2\log \mathcal{R}_n(F, G) > \chi^2_{1-\alpha}(r_n^2)) \to \alpha$ for every $0 < \alpha < 1$ as $r_n$ and $n$ tend to infinity and $r_n^6/n$ tends to zero. Here we are in the framework of Hjort, McKeague and Van Keilegom (2009). The above shows that the test $\mathbf{1}[-2\log \mathcal{R}_n(F, G) > \chi^2_{1-\alpha}(r_n^2)]$ has asymptotic size $\alpha$.

(b) Now assume that $F$ and $G$ are unknown. In this case we replace both marginal distribution functions by their empirical distribution functions. The resulting test statistic is $\mathcal{R}_n(\mathbb{F}_n, \mathbb{G}_n)$, where $\mathbb{F}_n$ denotes the empirical distribution function based on $X_1, \dots, X_n$ and $\mathbb{G}_n$ the one based on $Y_1, \dots, Y_n$. We shall show that under the null hypothesis $P(-2\log \mathcal{R}_n(\mathbb{F}_n, \mathbb{G}_n) > \chi^2_{1-\alpha}(r_n^2)) \to \alpha$ for every $0 < \alpha < 1$ as $r_n$ and $n$ tend to infinity and $r_n^6/n$ tends to zero. Thus the test $\mathbf{1}[-2\log \mathcal{R}_n(\mathbb{F}_n, \mathbb{G}_n) > \chi^2_{1-\alpha}(r_n^2)]$ has asymptotic size $\alpha$.

Remark 2.1. Suppose that $(X, Y)$ form a simple linear homoscedastic regression model, $Y = \beta_1 + \beta_2 X + \varepsilon$, with $X$ and $\varepsilon$ independent. We can use the test statistic from case (b) to test the hypothesis that the slope parameter $\beta_2$ is zero. Indeed, $\beta_2 = 0$ is equivalent to the independence of $X$ and $Y$.

Remark 2.2. The asymptotic distributions of the above tests under contiguous alternatives are linked to non-central chi-square distributions; see Remark 7.3 for details. As the non-centrality parameters are bounded, the local asymptotic power along such a contiguous alternative coincides with the level. Our tests are asymptotically equivalent to Neyman's smooth tests with increasing dimensions. In view of the optimality results of Inglot and Ledwina (1996) for those tests under moderate deviations, we expect similar results for our tests. Of course, this needs to be explored more carefully.

3. Another example and simulations

Let $(X_1, Y_1), \dots, (X_n, Y_n)$ be independent copies of $(X, Y)$, where $Y = \beta_1 + \beta_2 X + \varepsilon$, with $E[\varepsilon \mid X] = 0$, $\sigma^2(X) = E[\varepsilon^2 \mid X]$ bounded and bounded away from zero, and $E[\varepsilon^4] < \infty$. Assume that $X$ has a finite variance and a continuous distribution function $G$. We are interested in testing whether the regression parameter $\beta = (\beta_1, \beta_2)^\top$ equals some specific value $\theta$. We could proceed as in Owen (1991) and use the test $\delta_0 = \mathbf{1}[-2\log \mathcal{R}_{0n}(\theta) > \chi^2_{1-\alpha}(2)]$ based on the empirical likelihood

$$\mathcal{R}_{0n}(\theta) = \sup\Big\{ \prod_{j=1}^n n\pi_j : \pi \in \mathcal{P}_n, \ \sum_{j=1}^n \pi_j \begin{pmatrix} 1 \\ X_j \end{pmatrix} (Y_j - \theta_1 - \theta_2 X_j) = 0 \Big\}.$$

But this empirical likelihood does not use all the information of the model. Here we have $E[a(X)\varepsilon] = 0$ for every $a \in L_2(G)$. Since $G$ is continuous (but unknown), we work with the empirical likelihood

$$\hat{\mathcal{R}}_n(\theta) = \sup\Big\{ \prod_{j=1}^n n\pi_j : \pi \in \mathcal{P}_n, \ \sum_{j=1}^n \pi_j u_{r_n}(\mathbb{G}_n(X_j)) (Y_j - \theta_1 - \theta_2 X_j) = 0 \Big\},$$

where $u_r = (1, \varphi_1, \dots, \varphi_r)^\top$ and $\mathbb{G}_n$ is the empirical distribution function based on the covariate observations $X_1, \dots, X_n$. It follows from Corollary 7.6 and Lemma 8.1 below that $P(-2\log \hat{\mathcal{R}}_n(\beta) > \chi^2_{1-\alpha}(1 + r_n)) \to \alpha$ if $r_n^4 = o(n)$. The resulting test is $\delta_{r_n} = \mathbf{1}[-2\log \hat{\mathcal{R}}_n(\theta) > \chi^2_{1-\alpha}(r_n + 1)]$. Both tests have asymptotic size $\alpha$.

We performed a small simulation study to compare the procedures. For our simulation we chose $\alpha = .05$ and $n = 100$ and took $\theta = (1, 2)^\top$. We modeled the error $\varepsilon$ as $s(X)\eta$, with $s(X) = \min(1 + X^2, 100)$ and $\eta$ independent of $X$. As distributions for $X$ we chose the exponential distribution with mean 5 (Ex(5)) and the t-distribution with three degrees of freedom (t(3)), while for $\eta$ we chose the standard normal distribution (N(0,1)) and the double exponential distribution with location 0 and scale .5 (L(0,.5)).

Table 1. Simulated powers of the tests $\delta_0$ and $\delta_r$. Columns labeled 0 refer to $\delta_0$; columns labeled 2, 3, 4, 5 refer to $\delta_r$ with that value of $r$.

                          X ~ t(3)                    X ~ Ex(5)
eta        beta1 beta2    0    2    3    4    5       0    2    3    4    5
N(0,1)     0.6   2.3     .71  .88  .86  .85  .84     .38  .37  .39  .40  .41
           0.8   1.5     .68  .82  .84  .83  .83     .95  .99  .99  .99  .99
           1.0   2.0     .13  .09  .10  .12  .13     .12  .07  .09  .12  .14
           1.2   2.2     .37  .42  .43  .43  .44     .51  .54  .52  .50  .52
           1.4   1.7     .71  .88  .87  .86  .86     .37  .34  .37  .40  .44
L(0,.5)    0.6   2.3     .89  .98  .99  .98  .98     .61  .64  .68  .71  .74
           0.8   1.5     .84  .96  .98  .98  .98     .93 1.00 1.00 1.00 1.00
           1.0   2.0     .14  .10  .14  .17  .21     .13  .10  .11  .14  .17
           1.2   2.2     .57  .70  .70  .70  .74     .68  .84  .84  .82  .83
           1.4   1.7     .89  .99  .99  .99  .99     .62  .67  .72  .73  .76

Table 1 reports simulated powers of the tests $\delta_0$ and $\delta_r$ (with several choices of $r$) for some values of the true parameter. The reported values are based on 1000 repetitions. The column labeled 0 corresponds to Owen's test $\delta_0$, while the columns labeled 2, 3, 4, 5 correspond to our tests $\delta_r$ with $r = 2, 3, 4, 5$ respectively. Clearly our new test is more powerful than the traditional test. The values in the rows corresponding to the parameter values (1.0, 2.0) are the observed significance levels for the nominal significance level .05. Our new test overall has closer observed significance levels than the traditional one except for $r = 5$.

4. Notation

In this section we introduce some of the notation we use throughout. We write $|A|$ for the euclidean norm and $|A|_o$ for the operator (or spectral) norm of a matrix $A$, which are defined by

$$|A|^2 = \mathrm{trace}(A^\top A) = \sum_{i,j} A_{ij}^2 \quad \text{and} \quad |A|_o = \sup_{|u|=1} |Au| = \sup_{|u|=1} (u^\top A^\top A u)^{1/2}.$$

In other words, the squared euclidean norm $|A|^2$ equals the sum of the eigenvalues of $A^\top A$, while the squared operator norm $|A|_o^2$ equals the largest eigenvalue of $A^\top A$. Consequently, the inequality $|A|_o \le |A|$ holds. Thus we have $|Ax| \le |A|_o |x| \le |A| |x|$ for compatible vectors $x$. We should also point out the identity

$$|A|_o = \sup_{|u|=1} \sup_{|v|=1} u^\top A v.$$

If $A$ is a nonnegative definite symmetric matrix, this simplifies to

$$|A|_o = \sup_{|u|=1} u^\top A u.$$

Using this and the Cauchy-Schwarz inequality we obtain

$$\Big| \int f g^\top \, d\mu \Big|_o^2 \le \Big| \int f f^\top \, d\mu \Big|_o \Big| \int g g^\top \, d\mu \Big|_o, \qquad (4.1)$$

$$\Big| \int f f^\top \, d\mu \Big|_o \le \int |f|^2 \, d\mu, \qquad (4.2)$$

whenever $\mu$ is a measure and $f$ and $g$ are measurable functions into $\mathbb{R}^s$ and $\mathbb{R}^t$ such that $\int |f|^2 \, d\mu$ and $\int |g|^2 \, d\mu$ are finite. As a special case we derive the inequality $|S_{x+y} - S_x|_o \le |S_y|_o + 2 |S_x|_o^{1/2} |S_y|_o^{1/2}$ and therefore

$$|S_{x+y} - S_x|_o \le \frac{1}{n} \sum_{j=1}^n |y_j|^2 + 2 |S_x|_o^{1/2} \Big( \frac{1}{n} \sum_{j=1}^n |y_j|^2 \Big)^{1/2} \qquad (4.3)$$

with

$$S_{x+y} = \frac{1}{n} \sum_{j=1}^n (x_j + y_j)(x_j + y_j)^\top, \quad S_x = \frac{1}{n} \sum_{j=1}^n x_j x_j^\top, \quad S_y = \frac{1}{n} \sum_{j=1}^n y_j y_j^\top$$

for vectors $x_1, y_1, \dots, x_n, y_n$ of the same dimension.
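The two matrix norms of this section can be compared numerically. The following is a small sketch under stated assumptions: the function names are hypothetical, matrices are plain lists of rows, and the operator norm is approximated by power iteration on $A^\top A$ (a standard technique chosen here for illustration, not something the paper prescribes). Power iteration assumes that the starting vector is not orthogonal to the leading eigenvector.

```python
import math


def euclidean_norm(a):
    """Euclidean (Frobenius) norm: square root of the sum of squared entries."""
    return math.sqrt(sum(v * v for row in a for v in row))


def operator_norm(a, iters=500):
    """Approximate the spectral norm sup_{|u|=1} |A u| by power iteration on A^T A."""
    rows, cols = len(a), len(a[0])
    u = [1.0 / math.sqrt(cols)] * cols  # unit starting vector (assumed non-degenerate)
    for _ in range(iters):
        au = [sum(a[i][j] * u[j] for j in range(cols)) for i in range(rows)]
        w = [sum(a[i][j] * au[i] for i in range(rows)) for j in range(cols)]  # A^T A u
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0.0:
            return 0.0
        u = [v / norm for v in w]
    au = [sum(a[i][j] * u[j] for j in range(cols)) for i in range(rows)]
    return math.sqrt(sum(v * v for v in au))
```

For the diagonal matrix with entries 3 and 4 the euclidean norm is 5 while the operator norm is 4, in line with the inequality $|A|_o \le |A|$.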

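5. A maximization problem

Let $x_1, \dots, x_n$ be $m$-dimensional vectors. Set

$$x_* = \max_{1 \le j \le n} |x_j|, \quad \bar x = \frac{1}{n} \sum_{j=1}^n x_j, \quad S = \frac{1}{n} \sum_{j=1}^n x_j x_j^\top, \quad x^{(\nu)} = \sup_{|u|=1} \frac{1}{n} \sum_{j=1}^n (u^\top x_j)^\nu, \ \nu = 3, 4,$$

and let $\lambda$ and $\Lambda$ denote the smallest and largest eigenvalue of the matrix $S$,

$$\lambda = \inf_{|u|=1} u^\top S u \quad \text{and} \quad \Lambda = \sup_{|u|=1} u^\top S u.$$

Using Lagrange multipliers, Owen (1988, 2001) obtained the identity

$$\mathcal{R} = \sup\Big\{ \prod_{j=1}^n n\pi_j : \pi \in \mathcal{P}_n, \ \sum_{j=1}^n \pi_j x_j = 0 \Big\} = \prod_{j=1}^n \frac{1}{1 + \zeta^\top x_j}$$

if there exists a $\zeta$ in $\mathbb{R}^m$ such that

$$1 + \zeta^\top x_j > 0, \ j = 1, \dots, n, \quad \text{and} \quad \sum_{j=1}^n \frac{x_j}{1 + \zeta^\top x_j} = 0. \qquad (5.1)$$

He also showed that such a vector $\zeta$ exists and is unique if (i) the origin is an interior point of the convex hull of $x_1, \dots, x_n$ and (ii) the matrix $S$ is invertible. Let us now show that the inequality $\lambda > 5 x_* |\bar x|$ implies these two conditions. Indeed, the matrix $S$ is then positive definite and hence invertible as its smallest eigenvalue $\lambda$ is positive. To show (i) we will rely on the following lemma.

Lemma 5.1. A random variable $Y$ with $E[Y] = 0$ and $P(|Y| \le K) = 1$ for some positive $K$ obeys the inequality

$$P(Y > a) \ge \frac{E[Y^2] - 2Ka}{2K^2}, \quad 0 \le a < K.$$

Proof. Fix $a$ in $[0, K)$. By the properties of $Y$, we obtain $2K^2 P(Y > a) \ge 2K E[Y \mathbf{1}[Y > a]] \ge 2K E[Y \mathbf{1}[Y > 0]] - 2Ka$ and $2K E[Y \mathbf{1}[Y > 0]] = K E[|Y|] \ge E[Y^2]$.

The origin is an interior point of the convex hull of $x_1, \dots, x_n$ if for every unit vector $u \in \mathbb{R}^m$ there is at least one $j \in \{1, \dots, n\}$ such that $u^\top x_j > 0$. This latter condition is equivalent to

$$N = \inf_{|u|=1} \sum_{j=1}^n \mathbf{1}[u^\top x_j > 0] \ge 1.$$

For a unit vector $u$, we have $u^\top \bar x \ge -|\bar x|$ and thus

$$\sum_{j=1}^n \mathbf{1}[u^\top x_j > 0] \ge \sum_{j=1}^n \mathbf{1}[u^\top (x_j - \bar x) > |\bar x|] = N(u).$$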

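It follows from the triangle inequality that $|x_j - \bar x| \le |x_j| + |\bar x| \le 2x_*$ for $j = 1, \dots, n$. Note that $x_*$ is positive if $S$ is positive definite. Thus Lemma 5.1 yields the lower bound $N(u)/n \ge (\sigma^2(u) - 4x_*|\bar x|)/(8x_*^2)$ with

$$\sigma^2(u) = \frac{1}{n} \sum_{j=1}^n (u^\top (x_j - \bar x))^2 = u^\top S u - (u^\top \bar x)^2 \ge \lambda - |\bar x|^2 \ge \lambda - x_* |\bar x|.$$

Thus we have $N/n \ge (\lambda - 5x_*|\bar x|)/(8x_*^2)$. This shows that the inequality $\lambda > 5x_*|\bar x|$ implies $N \ge 1$ and hence the desired condition (i).

Assume now that the inequality $\lambda > 5x_*|\bar x|$ holds. We proceed as on page 220 of Owen (2001). Let $u$ be a unit vector such that $\zeta = |\zeta| u$. Then we have the identity

$$0 = \frac{1}{n} \sum_{j=1}^n u^\top x_j \Big( 1 - \frac{\zeta^\top x_j}{1 + \zeta^\top x_j} \Big) = u^\top \bar x - |\zeta| \frac{1}{n} \sum_{j=1}^n \frac{(u^\top x_j)^2}{1 + \zeta^\top x_j}$$

and the inequality

$$\lambda \le u^\top S u = \frac{1}{n} \sum_{j=1}^n (u^\top x_j)^2 \le \frac{1}{n} \sum_{j=1}^n \frac{(u^\top x_j)^2 (1 + |\zeta| x_*)}{1 + \zeta^\top x_j}.$$

Consequently, we find $\lambda |\zeta| \le (1 + |\zeta| x_*) u^\top \bar x \le (1 + |\zeta| x_*) |\bar x|$ and obtain the bound

$$|\zeta| \le \frac{|\bar x|}{\lambda - x_* |\bar x|}. \qquad (5.2)$$

From this one immediately derives

$$|\zeta| x_* \le \frac{x_* |\bar x|}{\lambda - x_* |\bar x|} < \frac{1}{4}, \qquad (5.3)$$

$$\max_{1 \le j \le n} \frac{1}{1 + \zeta^\top x_j} \le \frac{1}{1 - |\zeta| x_*} < \frac{4}{3}, \qquad (5.4)$$

$$\frac{1}{n} \sum_{j=1}^n (\zeta^\top x_j)^2 = \zeta^\top S \zeta \le \Lambda |\zeta|^2 \le \frac{\Lambda |\bar x|^2}{(\lambda - x_* |\bar x|)^2}. \qquad (5.5)$$

The identity $1/(1 + d) - 1 + d = d^2 - d^3/(1 + d)$ and (5.4) yield

$$\Big| \frac{1}{n} \sum_{j=1}^n \frac{r_j}{1 + \zeta^\top x_j} - \frac{1}{n} \sum_{j=1}^n r_j + \frac{1}{n} \sum_{j=1}^n r_j x_j^\top \zeta \Big| \le \Big| \frac{1}{n} \sum_{j=1}^n r_j (\zeta^\top x_j)^2 \Big| + \frac{4}{3} \frac{1}{n} \sum_{j=1}^n |r_j| |\zeta^\top x_j|^3$$

for vectors $r_1, \dots, r_n$ of the same dimension. Taking $r_j = S^{-1} x_j$, we derive with the help of (5.1)

$$|\zeta - S^{-1} \bar x| \le \Big| \frac{1}{n} \sum_{j=1}^n S^{-1} x_j (\zeta^\top x_j)^2 \Big| + \frac{4}{3} \frac{1}{n} \sum_{j=1}^n |S^{-1} x_j| |\zeta^\top x_j|^3.$$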

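Using $|x| = \sup_{|v|=1} v^\top x$, the Cauchy-Schwarz inequality, (5.3) and (5.5) we bound the square of the first summand of the right-hand side by

$$\frac{1}{n} \sum_{j=1}^n (\zeta^\top x_j)^4 \sup_{|v|=1} v^\top S^{-1} v \le \frac{1}{\lambda} |\zeta|^4 x^{(4)}$$

and the square of the second summand by

$$\frac{16}{9\lambda^2} x_*^2 \, \zeta^\top S \zeta \, \frac{1}{n} \sum_{j=1}^n (\zeta^\top x_j)^4 \le \frac{\Lambda}{9\lambda^2} |\zeta|^4 x^{(4)}.$$

Combining the above we obtain

$$|\zeta - S^{-1} \bar x|^2 \le \Big( \frac{2}{\lambda} + \frac{2\Lambda}{9\lambda^2} \Big) |\zeta|^4 x^{(4)}. \qquad (5.6)$$

Using the inequality $|2\log(1 + t) - 2t + t^2 - 2t^3/3| \le t^4/(2(1 - |t|)^4)$ valid for $|t| < 1$, and then (5.4), we derive

$$\Big| 2 \sum_{j=1}^n \log(1 + \zeta^\top x_j) - 2n \zeta^\top \bar x + n \zeta^\top S \zeta - \frac{2}{3} \sum_{j=1}^n (\zeta^\top x_j)^3 \Big| \le \frac{1}{2} \Big( \frac{4}{3} \Big)^4 \sum_{j=1}^n (\zeta^\top x_j)^4.$$

With $\Delta = \zeta - S^{-1} \bar x$, we can write $\zeta^\top S \zeta = \zeta^\top \bar x + \zeta^\top S \Delta$ and $\zeta^\top \bar x = \bar x^\top S^{-1} \bar x + \Delta^\top \bar x$, and obtain the identity $2\zeta^\top \bar x - \zeta^\top S \zeta = \bar x^\top S^{-1} \bar x - \Delta^\top S \Delta$. Using this and (5.6) we arrive at the bound

$$\Big| 2 \sum_{j=1}^n \log(1 + \zeta^\top x_j) - n \bar x^\top S^{-1} \bar x \Big| \le \frac{2}{3} n |\zeta|^3 x^{(3)} + \Big( \frac{16}{9} + \frac{2\Lambda}{\lambda} + \frac{2\Lambda^2}{9\lambda^2} \Big) n |\zeta|^4 x^{(4)}.$$

In view of (5.2) and $\Lambda \ge \lambda$, this becomes

$$\Big| 2 \sum_{j=1}^n \log(1 + \zeta^\top x_j) - n \bar x^\top S^{-1} \bar x \Big| \le \frac{n |\bar x|^3 x^{(3)}}{(\lambda - x_* |\bar x|)^3} + \frac{\Lambda^2}{\lambda^2} \frac{4n |\bar x|^4 x^{(4)}}{(\lambda - x_* |\bar x|)^4}. \qquad (5.7)$$

If we bound $x^{(3)}$ by $x_* \Lambda$ and $x^{(4)}$ by $x_*^2 \Lambda$ and use (5.3) we obtain the bound

$$\Big| 2 \sum_{j=1}^n \log(1 + \zeta^\top x_j) - n \bar x^\top S^{-1} \bar x \Big| \le \Big( \Lambda + \frac{\Lambda^3}{\lambda^2} \Big) \frac{n x_* |\bar x|^3}{(\lambda - x_* |\bar x|)^3}. \qquad (5.8)$$

Thus we have proved the following result.

Lemma 5.2. The inequality $\lambda > 5x_*|\bar x|$ implies that there is a unique $\zeta$ in $\mathbb{R}^m$ satisfying $1 + \zeta^\top x_j > 0$, $j = 1, \dots, n$, and (5.1) to (5.8).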
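The sufficient condition of Lemma 5.2 is cheap to evaluate. The sketch below is an illustration under stated assumptions rather than anything from the paper: it is restricted to two-dimensional vectors so that the smallest eigenvalue of $S$ has a closed form, the function names are hypothetical, and the direct check that every direction $u$ has some $u^\top x_j > 0$ is done only on a finite grid of directions, so it is approximate.

```python
import math


def hull_condition(xs):
    """For 2-d vectors xs, return (lam, bound, holds) where lam is the smallest
    eigenvalue of S = (1/n) sum x_j x_j^T, bound = 5 * max|x_j| * |mean|, and
    holds is the truth value of the sufficient condition lam > bound."""
    n = len(xs)
    mean0 = sum(x[0] for x in xs) / n
    mean1 = sum(x[1] for x in xs) / n
    a = sum(x[0] * x[0] for x in xs) / n
    b = sum(x[0] * x[1] for x in xs) / n
    c = sum(x[1] * x[1] for x in xs) / n
    # Smallest eigenvalue of the symmetric matrix [[a, b], [b, c]].
    lam = 0.5 * (a + c) - math.sqrt(0.25 * (a - c) ** 2 + b * b)
    x_star = max(math.hypot(x[0], x[1]) for x in xs)
    bound = 5.0 * x_star * math.hypot(mean0, mean1)
    return lam, bound, lam > bound


def origin_in_interior(xs, grid=3600):
    """Grid check (approximate): every unit vector u on the grid has some u.x_j > 0."""
    for i in range(grid):
        t = 2.0 * math.pi * i / grid
        u0, u1 = math.cos(t), math.sin(t)
        if max(u0 * x[0] + u1 * x[1] for x in xs) <= 0.0:
            return False
    return True
```

The condition is sufficient but not necessary, so `hull_condition` may return `False` for point sets whose convex hull nevertheless contains the origin in its interior.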

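6. Applications with random vectors

We shall now discuss implications of Lemma 5.2 for the case when the vectors $x_j$ are replaced by random vectors. We are interested in the case when the dimension of the random vectors increases with $n$. Let $T_{n1}, \dots, T_{nn}$ be $m_n$-dimensional random vectors. With these random vectors we associate the empirical likelihood

$$\mathcal{R}_n = \sup\Big\{ \prod_{j=1}^n n\pi_j : \pi \in \mathcal{P}_n, \ \sum_{j=1}^n \pi_j T_{nj} = 0 \Big\}.$$

To study the asymptotic behavior of $\mathcal{R}_n$ we introduce

$$T_n^* = \max_{1 \le j \le n} |T_{nj}|, \quad \bar T_n = \frac{1}{n} \sum_{j=1}^n T_{nj}, \quad T_n^{(\nu)} = \sup_{|u|=1} \frac{1}{n} \sum_{j=1}^n (u^\top T_{nj})^\nu,$$

and the matrix

$$S_n = \frac{1}{n} \sum_{j=1}^n T_{nj} T_{nj}^\top,$$

and let $\lambda_n$ and $\Lambda_n$ denote the smallest and largest eigenvalues of $S_n$,

$$\lambda_n = \inf_{|u|=1} u^\top S_n u \quad \text{and} \quad \Lambda_n = \sup_{|u|=1} u^\top S_n u.$$

We say a sequence $W_n$ of $m_n \times m_n$ dispersion matrices is regular if the following condition holds,

$$0 < \inf_n \inf_{|u|=1} u^\top W_n u \le \sup_n \sup_{|u|=1} u^\top W_n u < \infty.$$

We impose the following conditions.

(A1) $m_n^{1/2} T_n^* = o_p(n^{1/2})$.
(A2) $n |\bar T_n|^2 = O_p(m_n)$.
(A3) There is a regular sequence of dispersion matrices $W_n$ such that $|S_n - W_n|_o = o_p(m_n^{-1/2})$.
(A4) $m_n T_n^{(3)} = o_p(n^{1/2})$ and $m_n^{3/2} T_n^{(4)} = o_p(n)$.

The first two conditions imply $T_n^* |\bar T_n| = o_p(1)$, the third condition implies that there are positive numbers $a < b$ such that $P(a \le \lambda_n \le \Lambda_n \le b) \to 1$. Thus all three conditions imply that the probability of the event $\{\lambda_n > 5 T_n^* |\bar T_n|\}$ tends to one. Consequently, by Lemma 5.2, there exists an $m_n$-dimensional random vector $\hat\zeta_n$ which is uniquely determined on this event by the properties

$$1 + \hat\zeta_n^\top T_{nj} > 0, \ j = 1, \dots, n, \quad \text{and} \quad \sum_{j=1}^n \frac{T_{nj}}{1 + \hat\zeta_n^\top T_{nj}} = 0. \qquad (6.1)$$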

On this event we have

\[ -2\log R_n = 2\sum_{j=1}^n \log(1 + \hat\zeta_n^\top T_{nj}). \]

It follows from (A3) that $S_n$ is invertible except on an event whose probability tends to zero. It follows from (A2) and (A4) that $n\|\bar T_n\|^3 T_n^{(3)} = o_p(m_n^{1/2})$ and $n\|\bar T_n\|^4 T_n^{(4)} = o_p(m_n^{1/2})$. Thus, under (A1)–(A4), the following expansion follows from (5.7):

\[ -2\log R_n = n\bar T_n^\top S_n^{-1}\bar T_n + o_p(m_n^{1/2}). \tag{6.2} \]

From (A3) we can also derive the rate $\|S_n^{-1} - W_n^{-1}\|_o = o_p(m_n^{-1/2})$. Thus, if (A1)–(A4) hold, then (6.2) holds with $S_n$ replaced by $W_n$,

\[ -2\log R_n = n\bar T_n^\top W_n^{-1}\bar T_n + o_p(m_n^{1/2}). \tag{6.3} \]

In view of the inequalities $T_n^{(3)} \le \Lambda_n T_n^*$ and $T_n^{(4)} \le \Lambda_n (T_n^*)^2$, a sufficient condition for (A1) and (A4) is given by

\[ m_n T_n^* = o_p(n^{1/2}). \tag{B1} \]

In view of the bound $(T_n^{(3)})^2 \le \Lambda_n T_n^{(4)}$, which is a consequence of the Cauchy-Schwarz inequality, a sufficient condition for (A4) is given by

\[ m_n^2\, T_n^{(4)} = o_p(n). \tag{B2} \]

We first treat the case when the dimension $m_n$ does not increase with $n$. In this case (B1) and (A2) are implied by $T_n^* = o_p(n^{1/2})$ and $\bar T_n = O_p(n^{-1/2})$, and (A3) is implied by the condition $S_n = W + o_p(1)$ for some positive definite matrix $W$. Thus we have the following result.

Theorem 6.1. Let $m_n = m$ for all $n$. Suppose $T_n^* = o_p(n^{1/2})$, $n^{1/2}\bar T_n \Rightarrow N(0, V)$ and $S_n = W + o_p(1)$, for dispersion matrices $V$ and $W$, with $W$ positive definite. Then $-2\log R_n$ converges in distribution to $Z^\top V^{1/2} W^{-1} V^{1/2} Z$, where the $m$-dimensional random vector $Z$ is standard normal.

For $V = W$ the limiting distribution is a chi-square distribution with $m$ degrees of freedom. If we replace $n^{1/2}\bar T_n \Rightarrow N(0, V)$ by $n^{1/2}\bar T_n \Rightarrow U$ for some random variable $U$, then the conclusion becomes: $-2\log R_n$ converges in distribution to $U^\top W^{-1} U$. This version of the theorem yields Theorem 2.1 of Hjort et al (2009) without their (A0).

Theorem 6.1 does not require the independence of the random vectors $T_{n1},\dots,T_{nn}$. This is important when dealing with estimated constraint functions, as we shall see below.

Suppose the conditions in the theorem hold with $V = W$. Under a contiguous alternative, one typically has $n^{1/2}\bar T_n \Rightarrow N(\mu, V)$ for some $\mu$ different from zero, but retains the other conditions. In this case, $-2\log R_n$ has a limiting chi-square distribution with $m$ degrees of freedom and non-centrality parameter $\|V^{-1/2}\mu\|$.
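The characterization (6.1) and the expansion (6.2) can be illustrated numerically. The following is a minimal sketch, not the authors' procedure: it assumes the rows of an array `T` play the role of $T_{n1},\dots,T_{nn}$, solves (6.1) with a damped Newton iteration (a solver chosen here for illustration), and compares $2\sum_j \log(1+\hat\zeta_n^\top T_{nj})$ with the quadratic form $n\bar T_n^\top S_n^{-1}\bar T_n$. The function names `neg2_log_el` and `quadratic_form` are hypothetical.

```python
import numpy as np


def neg2_log_el(T, tol=1e-9, max_iter=100):
    """Return (-2 log R_n, zeta) for the rows T_j of T.

    zeta solves the estimating equation (6.1); it maximizes the concave
    function sum_j log(1 + zeta' T_j), which is done by damped Newton steps.
    """
    n, m = T.shape
    zeta = np.zeros(m)
    for _ in range(max_iter):
        d = 1.0 + T @ zeta                       # 1 + zeta' T_j, must stay positive
        grad = (T / d[:, None]).sum(axis=0)      # sum_j T_j / (1 + zeta' T_j)
        if np.linalg.norm(grad) / n < tol:
            break
        neg_hess = (T / d[:, None] ** 2).T @ T   # sum_j T_j T_j' / (1 + zeta' T_j)^2
        step = np.linalg.solve(neg_hess, grad)
        obj = np.log(d).sum()
        t = 1.0
        while t > 1e-12:                         # halve until feasible and non-decreasing
            d_new = 1.0 + T @ (zeta + t * step)
            if d_new.min() > 0 and np.log(d_new).sum() >= obj - 1e-12:
                break
            t /= 2.0
        zeta = zeta + t * step
    d = 1.0 + T @ zeta
    return 2.0 * np.log(d).sum(), zeta


def quadratic_form(T):
    """Leading term n * Tbar' S^{-1} Tbar of the expansion (6.2)."""
    n = T.shape[0]
    t_bar = T.mean(axis=0)
    S = T.T @ T / n
    return n * t_bar @ np.linalg.solve(S, t_bar)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T = rng.standard_normal((500, 3))            # simulated data, purely illustrative
    stat, _ = neg2_log_el(T)
    print(stat, quadratic_form(T))
```

For fixed dimension and centered data with finite second moments, the two printed values should be close, in line with (6.2); the sketch makes no attempt to handle the event on which zero lies outside the convex hull of the $T_{nj}$.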

Let us address some applications of Theorem 6.1. For this discussion we let $Z_1,\dots,Z_n$ be independent copies of a $k$-dimensional random vector $Z$ with distribution $Q$ and let $w$ be a measurable function from $\mathbb{R}^k$ into $\mathbb{R}^m$ such that $E[w(Z)] = \int w\,dQ = 0$ and $W = E[w(Z)w^\top(Z)] = \int ww^\top dQ$ is positive definite. Let us first look at the empirical likelihood

\[ R_n = \sup\Big\{ \prod_{j=1}^n n\pi_j : \pi \in \mathscr{P}_n,\ \sum_{j=1}^n \pi_j w(Z_j) = 0 \Big\}. \]

It follows from Owen that $-2\log R_n$ has a limiting chi-square distribution with $m$ degrees of freedom. This also follows from Theorem 6.1 applied with $T_{nj} = w(Z_j)$. Indeed, the first condition follows from the inequality

\[ P\Big(\max_{1\le j\le n}\|w(Z_j)\| > \epsilon n^{1/2}\Big) \le \epsilon^{-2} E\big[\|w(Z)\|^2\, \mathbf{1}[\|w(Z)\| > \epsilon n^{1/2}]\big] \tag{6.4} \]

and the Lebesgue dominated convergence theorem; the central limit theorem yields the second condition with $V = W$; the third condition

\[ \frac{1}{n}\sum_{j=1}^n w(Z_j)w^\top(Z_j) = W + o_p(1) \tag{6.5} \]

follows from the weak law of large numbers. This shows that Owen's result is a special case of our result.

Now consider the empirical likelihood

\[ \hat R_n = \sup\Big\{ \prod_{j=1}^n n\pi_j : \pi \in \mathscr{P}_n,\ \sum_{j=1}^n \pi_j \hat w(Z_j) = 0 \Big\}, \]

where $\hat w$ is an estimator of $w$ based on the observations $Z_1,\dots,Z_n$ which is consistent in the following sense,

\[ \frac{1}{n}\sum_{j=1}^n \|\hat w(Z_j) - w(Z_j)\|^2 = o_p(1). \tag{6.6} \]

Then $-2\log \hat R_n$ has a limiting chi-square distribution with $m$ degrees of freedom if also

\[ n^{-1/2}\sum_{j=1}^n \hat w(Z_j) = n^{-1/2}\sum_{j=1}^n w(Z_j) + o_p(1) \tag{6.7} \]

holds. To see this, we verify the assumptions of Theorem 6.1 with $T_{nj} = \hat w(Z_j)$. The first condition follows from (6.4), (6.6) and the inequality

\[ T_n^* \le \max_{1\le j\le n}\|w(Z_j)\| + \Big(\sum_{j=1}^n \|\hat w(Z_j) - w(Z_j)\|^2\Big)^{1/2}. \]

The central limit theorem, Slutsky's theorem and (6.7) yield the second condition with $V = W$. The third condition follows from (6.5), (6.6) and the inequality (4.3).

The requirement (6.7) is rather strong. One often only derives

\[ n^{-1/2}\sum_{j=1}^n \hat w(Z_j) = n^{-1/2}\sum_{j=1}^n v(Z_j) + o_p(1) \tag{6.8} \]

for some function $v$ satisfying $E[v(Z)] = 0$ and $E[\|v(Z)\|^2] < \infty$. Under (6.6) and (6.8), $-2\log\hat R_n$ has the limiting distribution given in Theorem 6.1 with $V$ the dispersion matrix of $v(Z)$. This follows from Theorem 6.1, whose assumptions are now verified as above. In situations when $w(z) = u(z,\eta)$ for some $q$-dimensional nuisance parameter $\eta$ and $\hat w(z) = u(z,\hat\eta)$ for some estimator $\hat\eta$ of $\eta$, one typically has $v(z) = w(z) + D\psi(z)$, where the $m\times q$ matrix $D$ is the derivative of the map $t \mapsto E[u(Z,\eta+t)]$ at $t = 0$, and $\psi$ is the influence function of $\hat\eta$.

We now address the case when $m_n$ increases with the sample size.

Theorem 6.2. Let (A1)–(A4) hold. Suppose that $m_n$ increases with $n$ to infinity and that there are $m_n\times m_n$ dispersion matrices $V_n$ such that $m_n/\operatorname{trace}(V_n^2) = O(1)$ and

\[ \frac{n\bar T_n^\top W_n^{-1}\bar T_n - \operatorname{trace}(V_n)}{\sqrt{2\operatorname{trace}(V_n^2)}} \Rightarrow N(0,1). \tag{6.9} \]

Then we have

\[ \frac{-2\log R_n - \operatorname{trace}(V_n)}{\sqrt{2\operatorname{trace}(V_n^2)}} \Rightarrow N(0,1). \tag{6.10} \]

Proof. We have already seen that (A1)–(A4) imply (6.3). It follows from (6.3) and $m_n/\operatorname{trace}(V_n^2) = O(1)$ that the difference of the left-hand sides of (6.9) and (6.10) converges to zero in probability. Thus the desired (6.10) follows from (6.9) and Slutsky's Theorem.

Of special interest is the case when $V_n$ is the $m_n\times m_n$ identity matrix $I_{m_n}$. Then $\operatorname{trace}(V_n) = \operatorname{trace}(V_n^2) = m_n$ and (6.10) simplifies to (1.2). Sufficient conditions for (6.9) are given by Peng and Schick (2012).

7. Main results

In this section we assume that $(\mathcal{Z}, \mathscr{S})$ is a measurable space, that $Z_1,\dots,Z_n$ are independent copies of the $\mathcal{Z}$-valued random variable $Z$ with distribution $Q$, and that $m_n$ is a positive integer that tends to infinity with $n$. We let $w_n$ denote a measurable function from $\mathcal{Z}$ to $\mathbb{R}^{m_n}$ such that $\int w_n\,dQ = 0$ and $\int \|w_n\|^2 dQ$ is finite. We first study

\[ R_n = \sup\Big\{ \prod_{j=1}^n n\pi_j : \pi \in \mathscr{P}_n,\ \sum_{j=1}^n \pi_j w_n(Z_j) = 0 \Big\}. \]

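Our goal is to show (1.2). To this end we set

\[ \bar w_n = \frac{1}{n}\sum_{j=1}^n w_n(Z_j), \qquad \bar W_n = \frac{1}{n}\sum_{j=1}^n w_n(Z_j)w_n^\top(Z_j), \qquad W_n = \int w_n w_n^\top dQ \]

and introduce the following condition.

(C1) The sequence $W_n$ is regular.

Motivated by the results in Peng and Schick (2012) we call a sequence $v_n$ of measurable functions from $\mathcal{Z}$ to $\mathbb{R}$ Lindeberg if

\[ \int \|v_n\|^2\, \mathbf{1}[\|v_n\| > \epsilon n^{1/2}]\,dQ \to 0, \quad \epsilon > 0. \tag{7.1} \]

The following are easy to check. If the sequences $u_n$ and $v_n$ are Lindeberg, so are the sequences $\max\{\|u_n\|, \|v_n\|\}$ and $u_n + v_n$. If the sequence $v_n$ is Lindeberg and $\|u_n\| \le \|v_n\|$, then the sequence $u_n$ is also Lindeberg. We also need the following properties.

(L1) If $v_n$ is Lindeberg, then one has the rate $\max_{1\le j\le n}\|v_n(Z_j)\| = o_p(n^{1/2})$.
(L2) If $\int \|v_n\|^r dQ = o(n^{r/2-1})$ for some $r > 2$, then $v_n$ is Lindeberg.

The first statement follows from an inequality similar to (6.4), the second from Remark 1 in Peng and Schick (2012).

To show (1.2) we apply Theorem 6.2 with $T_{nj} = w_n(Z_j)$. In the presence of (C1), the conditions (6.9) and (A1)–(A4) of this theorem are implied by

\[ \frac{n\bar w_n^\top W_n^{-1}\bar w_n - m_n}{\sqrt{2m_n}} \Rightarrow N(0,1), \tag{D0} \]
\[ \max_{1\le j\le n} m_n^{1/2}\|w_n(Z_j)\| = o_p(n^{1/2}), \tag{D1} \]
\[ n\|\bar w_n\|^2 = O_p(m_n), \tag{D2} \]
\[ \|\bar W_n - W_n\|_o = o_p(m_n^{-1/2}), \tag{D3} \]
\[ m_n^2 \sup_{\|u\|=1} \frac{1}{n}\sum_{j=1}^n |u^\top w_n(Z_j)|^4 = o_p(n). \tag{D4} \]

By part (c) of Corollary 3 in Peng and Schick (2012), (D0) follows if the function $\|W_n^{-1/2}w_n\|$ is Lindeberg. In the presence of (C1), the latter condition is equivalent to $\|w_n\|$ being Lindeberg. By (L1), a sufficient condition for (D1) is that $m_n^{1/2}\|w_n\|$ is Lindeberg. It follows from (C1) that $\operatorname{trace}(W_n) \le Bm_n$ for some constant $B$. Thus (C1) implies $E[n\|\bar w_n\|^2] = \operatorname{trace}(W_n) = O(m_n)$ and hence (D2).

In view of (C1), a sufficient condition for (D3) is that $m_n\|w_n\|$ is Lindeberg. To see this fix $\epsilon > 0$ and let $\bar W_{n,1}$ and $\bar W_{n,2}$ be the matrices obtained by replacing in the definition of $\bar W_n$ the function $w_n$ by $v_n = w_n \mathbf{1}[m_n\|w_n\| \le \epsilon n^{1/2}]$ and $w_n - v_n = w_n \mathbf{1}[m_n\|w_n\| > \epsilon n^{1/2}]$, respectively. Then we find

\[ E\big[\|\bar W_{n,1} - E[\bar W_{n,1}]\|^2\big] \le \frac{1}{n}E[\|v_n\|^4(Z)] \le \frac{\epsilon^2}{m_n^2}E[\|w_n\|^2(Z)] \le \frac{\epsilon^2 B m_n}{m_n^2}, \]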

and using (4.2)

\[ P(\bar W_{n,2} \ne 0) \le P\Big(\max_{1\le j\le n} m_n\|w_n(Z_j)\| > \epsilon n^{1/2}\Big) \to 0, \]
\[ \|E[\bar W_{n,2}]\|_o \le E\big[\|w_n\|^2(Z)\, \mathbf{1}[m_n\|w_n(Z)\| > \epsilon n^{1/2}]\big] = o(m_n^{-2}). \]

The above inequalities show that (C1) and the Lindeberg property of $m_n\|w_n\|$ imply statement (D3). The latter condition also implies (B1) and hence (D1) and (D4), the latter in the presence of (C1). Thus we have the following result.

Theorem 7.1. Suppose (C1) holds and the sequence $m_n\|w_n\|$ is Lindeberg. Then (1.2) holds as $m_n$ tends to infinity with $n$.

From this, simple calculations and the property (L2) we immediately derive the following corollaries.

Corollary 7.1. Suppose (C1) holds and $\|w_n\| \le m_n^{1/2}B$ for some constant $B$. Then (1.2) holds if $m_n^3 = o(n)$.

Corollary 7.2. Suppose (C1) holds and $\int \|w_n\|^r dQ = O(m_n^{r/2})$ for some $r > 2$. Then (1.2) holds if $m_n^{3r/(r-2)} = o(n)$.

These two corollaries give the conclusions in Theorem 4.1 in Hjort et al (2009) under slightly weaker conditions in the case of Corollary 7.2. We now present some additional results that allow for larger $m_n$ if $r$ is small. For example, if $r = 4$, Corollary 7.2 requires $m_n^6 = o(n)$, while Theorem 7.2 below allows $m_n^4 = o(n)$. For $r = 3$, Corollary 7.2 requires $m_n^9 = o(n)$, while Theorem 7.3 below allows $m_n^6 = o(n)$.

Theorem 7.2. Suppose (C1) holds and $\int \|w_n\|^4 dQ = O(m_n^2)$. Then (1.2) holds if $m_n^4 = o(n)$.

Proof. Using (L2) and $m_n^4 = o(n)$ we derive that $m_n^{1/2}\|w_n\|$ is Lindeberg. This latter condition and (C1) imply (D1)–(D2) as shown prior to Theorem 7.1. Next we calculate

\[ nE[\|\bar W_n - W_n\|^2] \le E[\|w_n\|^4(Z)] = O(m_n^2). \]

This yields (D3) in view of $\|\bar W_n - W_n\|_o \le \|\bar W_n - W_n\| = O_p(m_n n^{-1/2})$ and $m_n^4 = o(n)$. Finally we have (D4), as the left-hand side of (D4) is bounded by

\[ m_n^2\, \frac{1}{n}\sum_{j=1}^n \|w_n(Z_j)\|^4 = O_p(m_n^4) = o_p(n). \]

Thus (D0)–(D4) hold and we obtain the desired result from Theorem 6.2.

Theorem 7.3. Suppose (C1) holds and $\int \|w_n\|^r dQ = O(m_n^{r/2})$ for some $2 < r < 4$. Then (1.2) holds if $m_n^{2r/(r-2)} = o(n)$.

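Proof. There is a constant $B$ such that $\int \|w_n\|^r dQ \le Bm_n^{r/2}$. In view of (L2) and the properties of $m_n$ we derive that $m_n^{1/2}\|w_n\|$ is Lindeberg. This condition and (C1) imply (D0)–(D2). It follows from (D1), the moment condition on $w_n$, and the properties of $m_n$ that

\[ \frac{m_n^2}{n^2}\sum_{j=1}^n \|w_n(Z_j)\|^4 \le \frac{m_n^2}{n^2}\sum_{j=1}^n \|w_n(Z_j)\|^r \max_{1\le j\le n}\|w_n(Z_j)\|^{4-r} = o_p\big(n^{-1} m_n^2\, m_n^{r/2} (n/m_n)^{(4-r)/2}\big) = o_p\big(m_n^r\, n^{(4-r)/2-1}\big) = o_p(1). \]

This establishes (D4). Finally (D3) follows as we have $\|\bar W_n - W_n\|_o = o_p(m_n^{-1})$. To prove the latter we mimic the argument prior to Theorem 7.1 used to verify (D3) if $m_n\|w_n\|$ is Lindeberg. But now $w_n \mathbf{1}[m_n^{1/2}\|w_n\| \le n^{1/2}]$ plays the role of $v_n$. For the corresponding matrices $\bar W_{n,1}$ and $\bar W_{n,2}$ we have

\[ m_n^2\, E\big[\|\bar W_{n,1} - E[\bar W_{n,1}]\|^2\big] \le \frac{m_n^2}{n}\Big(\frac{n}{m_n}\Big)^{(4-r)/2} B m_n^{r/2} = \frac{B m_n^r}{n^{r/2-1}} \to 0, \]
\[ P(\bar W_{n,2} \ne 0) \le P\Big(\max_{1\le j\le n} m_n^{1/2}\|w_n(Z_j)\| > n^{1/2}\Big) \to 0, \]
\[ m_n\|E[\bar W_{n,2}]\|_o \le \frac{m_n^{r/2}}{n^{(r-2)/2}} \int \|w_n\|^r dQ \le \frac{B m_n^r}{n^{r/2-1}} \to 0. \]

Consequently (D0)–(D4) hold and the desired result follows.

Now we study

\[ \hat R_n = \sup\Big\{ \prod_{j=1}^n n\pi_j : \pi \in \mathscr{P}_n,\ \sum_{j=1}^n \pi_j \hat w_n(Z_j) = 0 \Big\}, \]

where $\hat w_n$ is an estimator of $w_n$. Let us set

\[ \hat W_n = \frac{1}{n}\sum_{j=1}^n \hat w_n(Z_j)\hat w_n^\top(Z_j). \]

Theorem 7.4. Suppose (C1) holds and assume we have the expansions

\[ m_n \max_{1\le j\le n}\|\hat w_n(Z_j)\| = o_p(n^{1/2}), \tag{7.2} \]
\[ \|\hat W_n - W_n\|_o = o_p(m_n^{-1/2}), \tag{7.3} \]
\[ \frac{1}{n}\sum_{j=1}^n \hat w_n(Z_j) = \frac{1}{n}\sum_{j=1}^n v_n(Z_j) + o_p(n^{-1/2}) \tag{7.4} \]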

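for some measurable function $v_n$ from $\mathcal{Z}$ into $\mathbb{R}^{m_n}$ such that $\int v_n\,dQ = 0$ and $\|v_n\|$ is Lindeberg. Furthermore assume that the dispersion matrix

\[ U_n = W_n^{-1/2}\int v_n v_n^\top dQ\; W_n^{-1/2} \]

of $W_n^{-1/2}v_n(Z)$ satisfies $\|U_n\|_o = O(1)$ and $m_n/\operatorname{trace}(U_n^2)$ is bounded. Then, as $m_n$ tends to infinity with $n$,

\[ \frac{-2\log\hat R_n - \operatorname{trace}(U_n)}{\sqrt{2\operatorname{trace}(U_n^2)}} \]

is asymptotically standard normal.

Proof. Set $\xi_{n,j} = W_n^{-1/2}v_n(Z_j)$, and introduce the averages

\[ \bar v_n = \frac{1}{n}\sum_{j=1}^n v_n(Z_j) \quad\text{and}\quad \bar T_n = \frac{1}{n}\sum_{j=1}^n \hat w_n(Z_j). \]

It follows from (C1) that $\|W_n^{1/2}\|_o + \|W_n^{-1/2}\|_o = O(1)$. Using this and the Lindeberg property of $v_n$ we derive

\[ L_n(\epsilon) = E\big[\|\xi_{n,1}\|^2\, \mathbf{1}[\|\xi_{n,1}\| > \epsilon n^{1/2}]\big] \to 0, \quad \epsilon > 0. \tag{7.5} \]

We have $\operatorname{trace}(U_n)/\operatorname{trace}(U_n^2) \le \|U_n\|_o\, m_n/\operatorname{trace}(U_n^2) = O(1)$. From $m_n/\operatorname{trace}(U_n^2) = O(1)$ we conclude $\operatorname{trace}(U_n^2) \to \infty$. Thus Theorem 2 in Peng and Schick (2012) yields that

\[ \frac{n\bar v_n^\top W_n^{-1}\bar v_n - \operatorname{trace}(U_n)}{\sqrt{2\operatorname{trace}(U_n^2)}} \]

is asymptotically standard normal. From this, (C1), $\operatorname{trace}(U_n) = O(m_n)$ and $\operatorname{trace}(U_n^2) \le \|U_n\|_o^2\, m_n$ we conclude $n\|\bar v_n\|^2 = O_p(m_n)$. With the help of (7.4) and the assumption $m_n/\operatorname{trace}(U_n^2) = O(1)$ we then derive $n\|\bar T_n\|^2 = O_p(m_n)$ and that

\[ \frac{n\bar T_n^\top W_n^{-1}\bar T_n - \operatorname{trace}(U_n)}{\sqrt{2\operatorname{trace}(U_n^2)}} \]

is asymptotically standard normal. Thus, in view of (B1), conditions (A1)–(A4) hold with $T_{nj} = \hat w_n(Z_j)$, and the desired result follows from Theorem 6.2.

Let us first mention the special case when $v_n = w_n$. In this case $U_n$ equals $I_{m_n}$ and $\operatorname{trace}(U_n) = \operatorname{trace}(U_n^2) = m_n$.

Corollary 7.3. Suppose (C1), (7.2) and (7.3) hold, $\|w_n\|$ is Lindeberg, and the following expansion is valid,

\[ \frac{1}{n}\sum_{j=1}^n \hat w_n(Z_j) = \frac{1}{n}\sum_{j=1}^n w_n(Z_j) + o_p(n^{-1/2}). \tag{7.6} \]

Then $(-2\log\hat R_n - m_n)/\sqrt{2m_n}$ is asymptotically standard normal.

Next we treat $v_n = w_n - A_n\psi$ with $A_n$ and $\psi$ as in the next condition.

(C2) There is a measurable function $\psi$ from $\mathcal{Z}$ into $\mathbb{R}^q$ satisfying $\int \psi\,dQ = 0$ and $\int \psi\psi^\top dQ = I_q$ such that, with $A_n = \int w_n\psi^\top dQ$, the expansion

\[ \frac{1}{n}\sum_{j=1}^n \hat w_n(Z_j) = \frac{1}{n}\sum_{j=1}^n \big[w_n(Z_j) - A_n\psi(Z_j)\big] + o_p(n^{-1/2}) \]

and the convergence $\operatorname{trace}(A_n^\top W_n^{-1}A_n) \to q$ hold.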

Corollary 7.4. Suppose (C1), (C2), (7.2) and (7.3) hold, and $\|w_n\|$ is Lindeberg. Then $(-2\log\hat R_n - m_n + q)/\sqrt{2(m_n - q)}$ is asymptotically standard normal.

Remark 7.1. Suppose that $w_n$ is the vector formed by the first $m_n$ elements of an orthonormal basis $u_1, u_2, \dots$ for $L_{2,0}(Q)$. Then the $\nu$-th column of the matrix $A_n$ is formed by the first $m_n$ Fourier coefficients of the $\nu$-th component of $\psi$ with respect to this basis. In this case we have the identity

\[ \operatorname{trace}(A_n^\top W_n^{-1}A_n) = \operatorname{trace}(A_n^\top A_n) = \sum_{\nu=1}^q \sum_{k=1}^{m_n} \Big(\int \psi_\nu u_k\,dQ\Big)^2 \]

and obtain under the assumptions $\int \psi\,dQ = 0$ and $\int \psi\psi^\top dQ = I_q$ the convergence

\[ \operatorname{trace}(A_n^\top W_n^{-1}A_n) \to \int \|\psi\|^2 dQ = q. \]

In our goodness-of-fit examples the following condition holds.

(C3) There is a constant $B$ such that $\|w_n\| \le Bm_n^{1/2}$ and $\|\hat w_n\| \le Bm_n^{1/2}$.

Under this condition, the rate $m_n^3/n \to 0$ implies (7.2), the Lindeberg property of $m_n\|w_n\|$, and (D3). Sufficient conditions for (7.3) can now be given directly or by verifying

\[ \|\hat W_n - \bar W_n\|_o = o_p(m_n^{-1/2}). \tag{7.7} \]

In view of the inequality (4.3) a sufficient condition for the latter is

\[ D_n = \frac{1}{n}\sum_{j=1}^n \|\hat w_n(Z_j) - w_n(Z_j)\|^2 = o_p(m_n^{-1}). \tag{7.8} \]

Thus we have the following results.

Corollary 7.5. Suppose (C1), (C3), $m_n^3 = o(n)$, and one of (7.3), (7.7), (7.8) holds. Then (i) (7.6) implies that $(-2\log\hat R_n - m_n)/\sqrt{2m_n}$ is asymptotically standard normal, while (ii) (C2) implies that $(-2\log\hat R_n - m_n + q)/\sqrt{2(m_n - q)}$ is asymptotically standard normal.

Remark 7.2. The conditions in Theorem 7.4 are based on the sufficient condition (B1) for (A1) and (A4). Working with (A1) and (B2) instead, we see that (7.2) can be replaced by the conditions

\[ m_n^{1/2}\max_{1\le j\le n}\|\hat w_n(Z_j)\| = o_p(n^{1/2}) \quad\text{and}\quad \frac{m_n^2}{n^2}\sum_{j=1}^n \|\hat w_n(Z_j)\|^4 = o_p(1). \]

With $D_n$ as in (7.8), we derive the bounds

\[ \max_{1\le j\le n}\|\hat w_n(Z_j)\| \le \max_{1\le j\le n}\|w_n(Z_j)\| + (nD_n)^{1/2}, \]
\[ \frac{m_n^2}{n^2}\sum_{j=1}^n \|\hat w_n(Z_j)\|^4 \le \frac{8m_n^2}{n^2}\sum_{j=1}^n \|w_n(Z_j)\|^4 + 8m_n^2 D_n^2. \]

Here we used that $(a+b)^4 \le 8(a^4 + b^4)$ for non-negative $a$ and $b$. Assume now that $\int \|w_n\|^4 dQ = O(m_n^2)$ and that $m_n^4/n \to 0$. Then we have (D1) and (D3) as shown in the proof of Theorem 7.2 and obtain the above two conditions and (7.3) from (7.8).

Corollary 7.6. Suppose (C1), (7.8), $\int \|w_n\|^4 dQ = O(m_n^2)$ and $m_n^4 = o(n)$ hold. Then (i) (7.6) implies that $(-2\log\hat R_n - m_n)/\sqrt{2m_n}$ is asymptotically standard normal, while (ii) (C2) implies that $(-2\log\hat R_n - m_n + q)/\sqrt{2(m_n - q)}$ is asymptotically standard normal.

Remark 7.3. Let us now describe the behavior of $-2\log\hat R_n$ under a local alternative. For this we follow Remarks 6 and 7 in Peng and Schick (2012). As there, let $h$ be a measurable function satisfying $\int h\,dQ = 0$ and $\int h^2 dQ < \infty$ and let $Q_{n,h}$ be a distribution satisfying

\[ \int \Big(n^{1/2}\big(\sqrt{dQ_{n,h}} - \sqrt{dQ}\big) - \tfrac{1}{2}h\sqrt{dQ}\Big)^2 \to 0. \tag{7.9} \]

Then the product measures $Q_{n,h}^n$ and $Q^n$ are mutually contiguous. All results in this section obtain the expansion

\[ -2\log\hat R_n - \Big\|n^{-1/2}\sum_{j=1}^n u_n(Z_j)\Big\|^2 = o_p(m_n^{1/2}) \tag{7.10} \]

for some measurable function $u_n$ from $\mathcal{Z}$ into $\mathbb{R}^{m_n}$ with the properties $\int u_n\,dQ = 0$, $\int \|u_n\|^2 dQ = O(m_n)$, $\|u_n\|$ is Lindeberg, and the matrix $U_n = \int u_n u_n^\top dQ$ satisfies $\|U_n\|_o = O(1)$ and $m_n/\operatorname{trace}(U_n^2) = O(1)$. For example, in Theorem 7.4 one has $u_n = W_n^{-1/2}v_n$. By contiguity, one has the expansion (7.10) even if $Z_1,\dots,Z_n$ are independent with distribution $Q_{n,h}$. Under this distributional assumption one has

\[ \frac{\big\|n^{-1/2}\sum_{j=1}^n u_n(Z_j)\big\|^2 - \|\mu_n(h)\|^2 - \operatorname{trace}(U_n)}{\sqrt{2\operatorname{trace}(U_n^2)}} \Rightarrow N(0,1) \]

with $\mu_n(h) = \int u_n h\,dQ$. Thus, under the local alternative $Q_{n,h}$ one has

\[ \frac{-2\log\hat R_n - \|\mu_n(h)\|^2 - \operatorname{trace}(U_n)}{\sqrt{2\operatorname{trace}(U_n^2)}} \Rightarrow N(0,1). \]

If $U_n = I_{m_n}$, this simplifies to

\[ \frac{-2\log\hat R_n - \|\mu_n(h)\|^2 - m_n}{\sqrt{2m_n}} \Rightarrow N(0,1) \]

and may be interpreted as $-2\log\hat R_n$ being approximately a non-central chi-square random variable with $m_n$ degrees of freedom and non-centrality parameter $\|\mu_n(h)\|$.

8. Details for the examples

In this section we use the results of the previous section to provide the details for the examples of Section 2. In all examples, the components of $w_n$ are orthonormal and uniformly bounded, so that (C1) and (C3) hold with $W_n = I_{m_n}$. We begin with a technical lemma.

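Lemma 8.1. Let $(S_1, T_1),\dots,(S_n, T_n)$ be independent copies of the bivariate random vector $(S, T)$, where $T$ has a continuous distribution function $H$, $E[S\mid T] = 0$, and $\sigma^2(T) = E[S^2\mid T]$ is bounded (by say $B$) and bounded away from zero (by say $b$). Let $\mathbb{H}$ denote the empirical distribution function based on $T_1,\dots,T_n$. Set $u_r = (1, \phi_1,\dots,\phi_r)^\top$, $D_j = u_r(\mathbb{H}(T_j)) - u_r(H(T_j))$, and $M = E[S^2 u_r(H(T))u_r^\top(H(T))]$. Then we have the following inequalities:

\[ b \le v^\top M v \le B, \quad v \in \mathbb{R}^{1+r},\ \|v\| = 1, \tag{8.1} \]
\[ \Big\|\frac{1}{n}\sum_{j=1}^n u_r(\mathbb{H}(T_j))u_r^\top(\mathbb{H}(T_j)) - I_{1+r}\Big\|^2 \le \frac{16\pi^2 r^2(1+r)^2}{n^2} \quad\text{a.s.}, \tag{8.2} \]
\[ \frac{1}{n}\sum_{j=1}^n E[\|S_jD_j\|^2] \le B\,\frac{1}{n}\sum_{j=1}^n E[\|D_j\|^2] \le \frac{B\pi^2 r^3}{n}, \tag{8.3} \]
\[ E\Big[\Big\|n^{-1/2}\sum_{j=1}^n S_jD_j\Big\|^2\Big] \le \frac{2B\pi^2 r^3}{n}. \tag{8.4} \]

Moreover, if $E[S^4]$ is finite, then we have the bound $E[S^4\|u_r(H(T))\|^4] \le (1+2r)^2 E[S^4]$.

Proof. The last inequality follows from the bound $\|u_r\|^2 \le 1 + 2r$. The inequality (8.1) is an easy consequence of $b \le \sigma^2(T) \le B$. Conditioning on $T_1,\dots,T_n$ shows that the left-hand side of (8.4) is bounded by the left-hand side of (8.3) and yields the first inequality in (8.3). Since $|\phi_k'| \le \sqrt{2}\pi k$, we obtain $\|D_j\|^2 \le 2\pi^2 r^3(\mathbb{H}(T_j) - H(T_j))^2$. It is easy to check that $E[(\mathbb{H}(T_j) - H(T_j))^2] \le 1/n$. This proves (8.3) and (8.4). Next, we have almost surely,

\[ \frac{1}{n}\sum_{j=1}^n u_r(\mathbb{H}(T_j))u_r^\top(\mathbb{H}(T_j)) = \frac{1}{n}\sum_{j=1}^n u_r(j/n)u_r^\top(j/n). \]

For a function $h$ defined on $[0,1]$ with Lipschitz constant $L$, we have

\[ \Big|\frac{1}{n}\sum_{j=1}^n h(j/n) - \int_0^1 h(u)\,du\Big| \le \sup_{1\le j\le n}\ \sup_{j-1\le nu\le j} |h(j/n) - h(u)| \le L/n. \]

Since the function $\phi_k\phi_l$ is Lipschitz with Lipschitz constant $2\pi(k+l)$, we derive the desired bound (8.2).

Details for Example 2. Let $X_1,\dots,X_n$ be independent copies of a random variable $X$ that has distribution function $F_\theta$ and density $f_\theta$ for some $\theta$ in the open subset $\Theta$ of $\mathbb{R}^q$. Recall we assumed in Example 2 that the map $\vartheta \mapsto s_\vartheta = \sqrt{f_\vartheta}$ is continuously differentiable in $L_2$ with derivative $\vartheta \mapsto \dot s_\vartheta$ and that the information matrix $J(\vartheta) = 4\int \dot s_\vartheta(x)\dot s_\vartheta^\top(x)\,dx$ is invertible for each $\vartheta$ in $\Theta$. Thus we have

\[ \rho(\tau) = \int \big(s_{\theta+\tau}(x) - s_\theta(x) - \tau^\top\dot s_\theta(x)\big)^2 dx = o(\|\tau\|^2). \tag{8.5} \]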

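Recall also that $l_\theta = 2\dot s_\theta/s_\theta$ denotes the score function. By the properties of the densities, there is a $\delta > 0$ and a constant $K$ such that

\[ \int |f_{\vartheta_1}(x) - f_{\vartheta_2}(x)|\,dx \le K\|\vartheta_1 - \vartheta_2\|, \quad \|\vartheta_1 - \theta\| < \delta,\ \|\vartheta_2 - \theta\| < \delta. \tag{8.6} \]

As a consequence we have

\[ \sup_{x\in\mathbb{R}} |F_{\vartheta_1}(x) - F_{\vartheta_2}(x)| \le K\|\vartheta_1 - \vartheta_2\|, \quad \|\vartheta_1 - \theta\| < \delta,\ \|\vartheta_2 - \theta\| < \delta. \tag{8.7} \]

Let $m = m_n$ and $\log(n)\,m^3 = o(n)$. It suffices to show

\[ \frac{-2\log R_n(F_{\hat\theta}) - m + q}{\sqrt{2(m-q)}} \Rightarrow N(0,1). \]

For this, we take $w_n = q_n\circ F_\theta$ and $\hat w_n = q_n\circ F_{\hat\theta}$ with $q_n = (\phi_1,\dots,\phi_m)^\top$ and verify (7.7) and (C2) with $\psi = J(\theta)^{-1/2}l_\theta$. The desired result then follows from (ii) of Corollary 7.5. We have $W_n = I_m = \int \hat w_n\hat w_n^\top dF_{\hat\theta}$ and obtain

\[ \Big\|\int \hat w_n\hat w_n^\top dF_\theta - W_n\Big\| \le 2m\int |f_{\hat\theta}(x) - f_\theta(x)|\,dx = o_p(m^{-1/2}) \]

in view of (8.6) and (2.2). Thus (7.7) follows if we verify

\[ \Big\|\hat W_n - \bar W_n - \int \hat w_n\hat w_n^\top dF_\theta + W_n\Big\|^2 = o_p(m^{-1}). \tag{8.8} \]

Note that $\psi$ has mean 0 and identity dispersion matrix under $F_\theta$ and that $A_n\psi$ equals $D_nJ(\theta)^{-1}l_\theta$, with $D_n = \int w_n l_\theta^\top dF_\theta$. Thus (C2) follows from Remark 7.1,

\[ \frac{1}{n}\sum_{j=1}^n \big[\hat w_n(X_j) - w_n(X_j)\big] + D_n(\hat\theta - \theta) = o_p(n^{-1/2}), \tag{8.9} \]

the stochastic expansion (2.2), and the fact that $\|D_n\|_o$ is bounded. We are left to verify (8.8) and (8.9). For this we set

\[ U_k(t) = \frac{1}{n}\sum_{j=1}^n \big[\phi_k(F_{\theta+n^{-1/2}t}(X_j)) - \phi_k(F_\theta(X_j))\big], \]
\[ V_{kl}(t) = \frac{1}{n}\sum_{j=1}^n \big[(\phi_k\phi_l)(F_{\theta+n^{-1/2}t}(X_j)) - (\phi_k\phi_l)(F_\theta(X_j))\big], \]

and note that $D_n = (d_1,\dots,d_m)^\top$ with $d_k = \int \phi_k(F_\theta)\,l_\theta\,dF_\theta$. The statements (8.8) and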

(8.9) follow if we show that, for each finite $C$,

\[ T_1(C) = \sup_{\|t\|\le C} \sum_{k=1}^m\sum_{l=1}^m \big(V_{kl}(t) - E[V_{kl}(t)]\big)^2 = o_p(m^{-1}), \]
\[ T_2(C) = \sup_{\|t\|\le C} \sum_{k=1}^m \big(U_k(t) - E[U_k(t)]\big)^2 = o_p(n^{-1}), \]
\[ T_3(C) = \sup_{\|t\|\le C} \sum_{k=1}^m \big(E[U_k(t)] + n^{-1/2}d_k^\top t\big)^2 = o(n^{-1}). \]

The first two statements can be verified using the exponential inequality given in Lemma 5.2 in Peng and Schick (2004). This requires the fact that $(\log n)\,m^3/n \to 0$. The identity

\[ f_{\theta+\tau} - f_\theta - l_\theta^\top\tau f_\theta = 2s_\theta(s_{\theta+\tau} - s_\theta - \dot s_\theta^\top\tau) + (s_{\theta+\tau} - s_\theta)^2 \]

and the definition of $d_k$ yield the formula

\[ \int \phi_k(F_\theta(x))\big(f_{\theta+\tau}(x) - f_\theta(x)\big)\,dx = d_k^\top\tau + \int \phi_k(F_\theta(x))\big(s_{\theta+\tau}(x) - s_\theta(x)\big)^2 dx + 2\int \phi_k(F_\theta(x))s_\theta(x)\big(s_{\theta+\tau}(x) - s_\theta(x) - \dot s_\theta^\top(x)\tau\big)\,dx. \]

In view of this and the fact that $\int \phi_k(F_\vartheta)\,dF_\vartheta = 0$ for all $\vartheta$, we have, with $\tau = n^{-1/2}t$, the identity

\[ E[U_k(t)] + n^{-1/2}d_k^\top t = \int \big(\phi_k(F_\theta(x)) - \phi_k(F_{\theta+\tau}(x))\big)\big(f_{\theta+\tau}(x) - f_\theta(x)\big)\,dx - \int \phi_k(F_\theta(x))\big(s_{\theta+\tau}(x) - s_\theta(x)\big)^2 dx - 2\int \phi_k(F_\theta(x))s_\theta(x)\big(s_{\theta+\tau}(x) - s_\theta(x) - \tau^\top\dot s_\theta(x)\big)\,dx. \]

Using (8.6), (8.7) and the orthonormality of the functions $s_\theta\,\phi_k\circ F_\theta$, $k = 1, 2, \dots$, in $L_2$, $T_3(C)$ can be bounded by

\[ \frac{6\pi^2 m^3 K^4 C^4}{n^2} + 6m\Big(\sup_{\|t\|\le C}\int \big(s_{\theta+n^{-1/2}t}(x) - s_\theta(x)\big)^2 dx\Big)^2 + 12\sup_{\|t\|\le C}\rho(n^{-1/2}t). \]

The desired statement $T_3(C) = o(n^{-1})$ now follows from (8.5) and $m^3 = o(n)$. This completes the proof of (7.4).

Details for Example 3. Assume that the distribution function of $X$ is symmetric and continuous. Then $S = \operatorname{sign}(X)$ and $T = |X|$ are independent, $S$ has mean zero and variance 1, and $T$ has a continuous distribution function $H$. Let $R_n$ be defined as in Example 3 with $r = r_n$ and $r_n^3 = o(n)$. It suffices to show that $(-2\log R_n - (1 + r_n))/\sqrt{2(1 + r_n)}$ is asymptotically standard normal. This follows from Corollary 7.5 if we verify (7.3) and (7.6). These conditions follow from Lemma 8.1 applied with $S_j = \operatorname{sign}(X_j)$ and $T_j = |X_j|$. Indeed, in view of the properties of $r_n$, (7.3) is a consequence of (8.2) and (7.6) of (8.4).
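The role of (8.2) in verifying (7.3) can be checked numerically. The sketch below is illustrative only and rests on an assumption not restated in this section: it takes the basis to be $\phi_k(x) = \sqrt{2}\cos(k\pi x)$, which is consistent with the derivative bound $|\phi_k'| \le \sqrt{2}\pi k$ used in the proof of Lemma 8.1. Because $T$ has a continuous distribution, the values of the empirical distribution function at the sample are a permutation of $1/n,\dots,n/n$, so the Gram matrix does not depend on the data. The helper names (`u_r`, `empirical_df_at_sample`, `gram_deviation`, `bound_82`) are hypothetical.

```python
import numpy as np


def u_r(x, r):
    """Rows (1, phi_1(x), ..., phi_r(x)), assuming phi_k(x) = sqrt(2) cos(k pi x)."""
    x = np.asarray(x, dtype=float)
    k = np.arange(1, r + 1)
    cosines = np.sqrt(2.0) * np.cos(np.pi * np.outer(x, k))
    return np.hstack([np.ones((x.size, 1)), cosines])


def empirical_df_at_sample(t):
    """Empirical distribution function evaluated at the sample points (no ties assumed)."""
    t = np.asarray(t, dtype=float)
    ranks = np.argsort(np.argsort(t)) + 1
    return ranks / t.size


def gram_deviation(points, r):
    """Squared Euclidean (Frobenius) distance between the Gram matrix and the identity."""
    U = u_r(points, r)
    G = U.T @ U / U.shape[0]
    return np.linalg.norm(G - np.eye(r + 1), "fro") ** 2


def bound_82(n, r):
    """Right-hand side of (8.2)."""
    return 16 * np.pi**2 * r**2 * (1 + r) ** 2 / n**2


if __name__ == "__main__":
    n, r = 200, 5
    rng = np.random.default_rng(1)
    t = np.abs(rng.standard_normal(n))      # T_j = |X_j| for simulated symmetric X_j
    print(gram_deviation(empirical_df_at_sample(t), r), bound_82(n, r))
```

The printed deviation should sit below the bound, and it shrinks at the rate $n^{-2}$ for fixed $r$, which is what makes (7.3) hold when $r_n^3 = o(n)$.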

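Details for Example 4. Assume that $X$ and $Y$ are independent. Part (a) is an immediate consequence of Corollary 7.1. Part (b) follows if we show that $(-2\log R_n(\mathbb{F}, \mathbb{G}) - r_n^2)/(\sqrt{2}\,r_n)$ is asymptotically standard normal. We shall use Corollary 7.5 to conclude this. Here $m_n$ equals $r_n^2$ and thus satisfies $m_n^3 = o(n)$. We shall now verify (7.8) and (7.6). Let us set

\[ D_{klj} = \phi_k(\mathbb{F}(X_j))\phi_l(\mathbb{G}(Y_j)) - \phi_k(F(X_j))\phi_l(G(Y_j)), \]
\[ \Phi_{kj} = \phi_k(\mathbb{F}(X_j)) - \phi_k(F(X_j)) \quad\text{and}\quad \Gamma_{lj} = \phi_l(\mathbb{G}(Y_j)) - \phi_l(G(Y_j)). \]

In view of the inequality $|D_{klj}| \le \sqrt{2}|\Phi_{kj}| + \sqrt{2}|\Gamma_{lj}|$, we obtain with the help of (8.3) the bound

\[ \sum_{k=1}^{r_n}\sum_{l=1}^{r_n} E[D_{klj}^2] \le \frac{8\pi^2 r_n^4}{n}. \]

From this and $r_n^6 = o(n)$ we conclude (7.8). In view of the identity

\[ D_{klj} = \phi_k(F(X_j))\Gamma_{lj} + \phi_l(G(Y_j))\Phi_{kj} + \Phi_{kj}\Gamma_{lj}, \]

(7.6) follows if we verify

\[ T_1 = \sum_{k=1}^{r_n}\sum_{l=1}^{r_n} \Big(n^{-1/2}\sum_{j=1}^n \phi_k(F(X_j))\Gamma_{lj}\Big)^2 = o_p(1), \]
\[ T_2 = \sum_{k=1}^{r_n}\sum_{l=1}^{r_n} \Big(n^{-1/2}\sum_{j=1}^n \Phi_{kj}\phi_l(G(Y_j))\Big)^2 = o_p(1), \]
\[ T_3 = \sum_{k=1}^{r_n}\sum_{l=1}^{r_n} \Big(n^{-1/2}\sum_{j=1}^n \Phi_{kj}\Gamma_{lj}\Big)^2 = o_p(1). \]

Applications of (8.4) with $S_j = \phi_k(F(X_j))$ yield the bound $E[T_1] \le \pi^2 r_n^4/n$, and this proves $T_1 = o_p(1)$. The proof of $T_2 = o_p(1)$ is similar. To deal with $T_3$ we set

\[ H(k,l) = \sum_{j=1}^n \Phi_{kj}\Gamma_{lj}, \qquad \bar\Phi_k = \frac{1}{n}\sum_{j=1}^n \Phi_{kj} \quad\text{and}\quad \bar\Gamma_l = \frac{1}{n}\sum_{j=1}^n \Gamma_{lj}. \]

Note that $R_j = n\mathbb{F}(X_j)$ is the rank of $X_j$. Given $Y_1,\dots,Y_n$ and the order statistics $X_{(1)},\dots,X_{(n)}$, the sum $H(k,l)$ is a simple linear rank statistic with scores $a(j) = \phi_k(j/n) - \phi_k(F(X_{(j)}))$ and coefficients $\Gamma_{lj}$ and consequently has (conditional) mean $n\bar\Phi_k\bar\Gamma_l$ and (conditional) variance

\[ \frac{1}{n-1}\sum_{j=1}^n (\Phi_{kj} - \bar\Phi_k)^2 \sum_{i=1}^n (\Gamma_{li} - \bar\Gamma_l)^2 \le \frac{1}{n-1}\sum_{j=1}^n \Phi_{kj}^2 \sum_{j=1}^n \Gamma_{lj}^2. \]

In view of this bound we derive the inequality

\[ E[T_3] \le \frac{1}{n(n-1)}\sum_{k=1}^{r_n} E\Big[\sum_{j=1}^n \Phi_{kj}^2\Big] \sum_{l=1}^{r_n} E\Big[\sum_{j=1}^n \Gamma_{lj}^2\Big] + n\sum_{k=1}^{r_n} E[\bar\Phi_k^2] \sum_{l=1}^{r_n} E[\bar\Gamma_l^2]. \]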

We have

\[ E[\bar\Gamma_k^2] = E[\bar\Phi_k^2] = \frac{1}{n} + \Big(\frac{1}{n}\sum_{j=1}^n \phi_k(j/n)\Big)^2 \le \frac{1}{n} + \frac{2\pi^2 k^2}{n^2}. \]

Using this and (8.3), we obtain $E[T_3] = O(r_n^6 n^{-2}) = o(1)$ and thus $T_3 = o_p(1)$.

Acknowledgements

This work was completed while Anton Schick was visiting the Department of Statistics at Texas A&M University. He wants to thank the members of the department for their extraordinary hospitality. Thanks go also to Ingrid Van Keilegom for discussions and for providing an important reference. Hanxiang Peng's research was supported in part by NSF Grant DMS 0940365. Anton Schick's research was supported in part by NSF Grant DMS 090655.

References

[1] Chen, S.X., Peng, L. and Qin, Y.-L. (2009). Effects of data dimension on empirical likelihood. Biometrika 96, 711–722.
[2] Chen, S.X. and Qin, Y.S. (2000). Empirical likelihood confidence intervals for local linear smoothers. Biometrika 87, 946–953.
[3] Chen, S.X. and Van Keilegom, I. (2009). A goodness-of-fit test for parametric and semiparametric models in multiresponse regression. Bernoulli 15, 955–976.
[4] Hall, P. and La Scala, B. (1990). Methodology and algorithms of empirical likelihood. Internat. Statist. Review 58, 109–127.
[5] DiCiccio, T.J., Hall, P. and Romano, J.P. (1991). Ann. Statist. 19, 1053–1061.
[6] Emerson, S. and Owen, A. (2009). Calibration of the empirical likelihood method for a vector mean. Electron. J. Statist. 3, 1161–1192.
[7] Helland, I. (1982). Central limit theorems for martingales with discrete or continuous time. Scand. J. Statist. 9, 79–94.
[8] Hjort, N.L., McKeague, I.W. and Van Keilegom, I. (2009). Extending the scope of empirical likelihood. Ann. Statist. 37, 1079–1111.
[9] Inglot, T. and Ledwina, T. (1996). Asymptotic optimality of data-driven Neyman's tests for uniformity. Ann. Statist. 24, 1982–2019.
[10] Kolaczyk, E.D. (1994). Empirical likelihood for generalized linear models. Statist. Sinica 4, 199–218.
[11] Li, G. and Wang, Q.-H. (2003). Empirical likelihood regression analysis for right censored data. Statist. Sinica 13, 51–68.
[12] Liu, Y. and Chen, J. (2010). Adjusted empirical likelihood with high-order precision. Ann. Statist. 38, 1341–1362.
[13] Neyman, J. (1937). Smooth test for goodness of fit. Skand. Aktuarietidskr. 20, 149–199.

[14] Owen, A. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75, 237–249.
[15] Owen, A. (1990). Empirical likelihood ratio confidence regions. Ann. Statist. 18, 90–120.
[16] Owen, A. (1991). Empirical likelihood for linear models. Ann. Statist. 19, 1725–1747.
[17] Owen, A. (2001). Empirical Likelihood. Chapman & Hall/CRC, London.
[18] Peng, H. and Schick, A. (2004). Estimation of linear functionals of bivariate distributions with parametric marginals. Statist. Decisions 22, 61–77.
[19] Peng, H. and Schick, A. (2005). Efficient estimation of linear functionals of a bivariate distribution with equal, but unknown marginals: the least-squares approach. J. Multivariate Anal. 95, 385–409.
[20] Peng, H. and Schick, A. (2012). Asymptotic normality of quadratic forms with random vectors of increasing dimension. Preprint.
[21] Portnoy, S. (1988). Behavior of likelihood methods for exponential families when the number of parameters tends to infinity. Ann. Statist. 16, 356–366.
[22] Qin, G. and Jing, B.-Y. (2001). Empirical likelihood for censored linear regression. Scand. J. Statist. 28, 661–673.
[23] Qin, J. and Lawless, J. (1994). Empirical likelihood and general estimating equations. Ann. Statist. 22, 300–325.
[24] Shi, J. and Lau, T.S. (2000). Empirical likelihood for partially linear models. J. Multivariate Anal. 72, 132–148.
[25] Tripathi, G. and Kitamura, Y. (2003). Testing conditional moment restrictions. Ann. Statist. 31, 2059–2095.
[26] Wang, Q.-H. and Jing, B.-Y. (2001). Empirical likelihood for a class of functionals of survival distribution with censored data. Ann. Inst. Statist. Math. 53, 517–527.
[27] Wang, Q.-H. and Jing, B.-Y. (2003). Empirical likelihood for partially linear models. Ann. Inst. Statist. Math. 55, 585–595.
[28] Zhou, M. and Li, G. (2008). Empirical likelihood analysis of the Buckley-James estimator. J. Multivariate Anal. 99, 649–664.