FIXED-b ASYMPTOTICS FOR BLOCKWISE EMPIRICAL LIKELIHOOD


Statistica Sinica 24 (2014), doi:http://dx.doi.org/10.5705/ss.2012.321

Xianyang Zhang and Xiaofeng Shao

University of Missouri-Columbia and University of Illinois at Urbana-Champaign

Abstract: We describe an extension of the fixed-b approach introduced by Kiefer and Vogelsang (2005) to the empirical likelihood estimation framework. Under fixed-b asymptotics, the empirical likelihood ratio statistic evaluated at the true parameter converges to a nonstandard yet pivotal limiting distribution that can be approximated numerically. The impact of the bandwidth parameter and kernel choice is reflected in the fixed-b limiting distribution. Compared to the $\chi^2$-based inference procedure used by Kitamura (1997) and Smith (2011), the fixed-b approach provides a better approximation to the finite sample distribution of the empirical likelihood ratio statistic. Correspondingly, as shown in our simulation studies, the confidence region based on the fixed-b approach has more accurate coverage than its traditional counterpart.

Key words and phrases: Blocking, empirical likelihood, fixed-b asymptotics, time series.

1. Introduction

Empirical likelihood (EL) (Owen (1988, 1990)) is a nonparametric technique for conducting inference for parameters in nonparametric settings. EL has been studied extensively in the statistics and econometrics literature (see Owen (2001), Kitamura (2006), and Chen and Van Keilegom (2009) for comprehensive reviews). One striking property of EL is the nonparametric version of Wilks' theorem, which states that the EL ratio statistic evaluated at the true parameter converges to a $\chi^2$ limiting distribution. This property was first demonstrated for the mean parameter by Owen (1990) and was further extended to the estimating equation framework by Qin and Lawless (1994). However, the Wilks phenomenon fails to hold for stationary time series because the dependence within the observations is not taken into account in EL.
Kitamura (1997) proposed blockwise empirical likelihood (BEL), which is able to accommodate the dependence of the data, and Wilks' theorem continues to hold for the BEL ratio statistic under suitable weak dependence assumptions. The BEL can be viewed as a special case of the generalized empirical likelihood (GEL) with smoothed moment conditions (Smith (2011)).

The performance of BEL and its variations (Nordman (2009) and Smith (2011)) can depend crucially on the choice of the bandwidth parameter, for which no sound guidance is available. Kiefer and Vogelsang (2005) proposed the so-called fixed-b asymptotic theory in the heteroscedasticity-autocorrelation robust (HAR) testing context. It was found that the asymptotic distribution obtained by treating the bandwidth as a fixed proportion (say $b$) of the sample size provides a better approximation to the sampling distribution of the studentized test statistic than the traditional $\chi^2$-based approximation. See Jansson (2004), Sun, Phillips, and Jin (2008), and Zhang and Shao (2013) for rigorous theoretical justifications. The fixed-b approach has the advantage of accounting for the effect of the bandwidth and the kernel, as different bandwidth parameters and kernels correspond to different limiting (null) distributions (also see Shao and Politis (2013) for a recent extension to the subsampling and block bootstrap context).

The main thrust of the present paper is the development of a new asymptotic theory in the BEL estimation framework made possible by the fixed-b approach. We consider the problem in the moment condition model (Qin and Lawless (1994) and Smith (2011)), a fairly general framework used by both statisticians and econometricians. Under the fixed-b asymptotic framework, we show that the asymptotic null distribution of the EL ratio statistic evaluated at the true parameter is nonstandard yet pivotal, and that it can be approximated numerically. It is interesting to note that the fixed-b limiting distribution coincides with the $\chi^2$ distribution as $b$ gets close to zero. We also illustrate the idea in the GEL estimation framework and demonstrate the usefulness of the fixed-b approach through simulation studies.
For notation, let $D[0,1]$ be the space of functions on $[0,1]$ which are right-continuous and have left limits, endowed with the Skorokhod topology (see Billingsley (1999)). Weak convergence in $D[0,1]$, or more generally in the $\mathbb{R}^m$-valued function space $D^m[0,1]$, is denoted by $\Rightarrow$, where $m \in \mathbb{N}$. Convergence in probability and convergence in distribution are denoted by $\to^{p}$ and $\to^{d}$ respectively. Let $C[0,1]$ be the space of continuous functions on $[0,1]$. Denote by $\lfloor a \rfloor$ the integer part of $a \in \mathbb{R}$.

2. Methodology

2.1. Empirical likelihood

Suppose we are interested in the inference of a $p$-dimensional parameter vector $\theta$ that is identified by a set of moment conditions. Denote by $\theta_0$ the true value of $\theta$, an interior point of a compact parameter space $\Theta \subseteq \mathbb{R}^p$. Let $\{y_t\}_{t=1}^{n}$ be a sequence of $\mathbb{R}^l$-valued stationary time series and assume the moment conditions

$$E[f(y_t, \theta_0)] = 0, \quad t = 1, 2, \dots, n, \tag{2.1}$$

where $f(y, \theta): \mathbb{R}^{l+p} \to \mathbb{R}^k$ is a map that is differentiable with respect to $\theta$ and $\mathrm{rank}(E[\partial f(y_t, \theta_0)/\partial \theta']) = p$ with $k \ge p$. To deal with time series data, we consider the smoothed moment conditions introduced by Smith (2011),

$$f_{tn}(\theta) = \frac{1}{S_n} \sum_{s=t-n}^{t-1} K\left(\frac{s}{S_n}\right) f(y_{t-s}, \theta), \tag{2.2}$$

where $K(\cdot)$ is a kernel function and $S_n = bn$ with $b \in (0,1)$ is the bandwidth parameter. Smoothing of the moment conditions induces a heteroskedasticity and autocorrelation consistent (HAC) covariance estimator of the long run variance matrix of $\{f(y_t, \theta)\}_{t=1}^{n}$. Let $f_t(\theta) = f(y_t, \theta)$ and $\bar{f}_n(\theta) = \sum_{t=1}^{n} f_{tn}(\theta)/n$, where $f_{tn}(\theta)$ is defined in (2.2). Consider the profile empirical log-likelihood function based on the smoothed moment restrictions,

$$L_n(\theta) = \sup\left\{ \sum_{t=1}^{n} \log(p_t) : p_t \ge 0, \ \sum_{t=1}^{n} p_t = 1, \ \sum_{t=1}^{n} p_t f_{tn}(\theta) = 0 \right\}. \tag{2.3}$$

Standard Lagrange multiplier arguments imply that the maximum is attained when

$$p_t = \frac{1}{n\{1 + \lambda' f_{tn}(\theta)\}}, \quad \text{with} \quad \sum_{t=1}^{n} \frac{f_{tn}(\theta)}{1 + \lambda' f_{tn}(\theta)} = 0.$$

The maximum empirical likelihood estimate (MELE) is then given by $\hat{\theta}_{el} = \mathrm{argmax}_{\theta \in \Theta} L_n(\theta)$. Following Kitamura (2006), the empirical log-likelihood function can also be derived by considering the dual problem (see e.g., Borwein and Lewis (1991)),

$$L_n(\theta) = \min_{\lambda \in \mathbb{R}^k} \left\{ -\sum_{t=1}^{n} \log(1 + \lambda' f_{tn}(\theta)) \right\} - n \log n, \tag{2.4}$$

where $\log(x) = -\infty$ for $x < 0$. Here (2.4) has a natural connection with the generalized empirical likelihood (GEL), and it facilitates our theoretical derivation under the fixed-b asymptotics. To introduce the fixed-b approach, we define the empirical log-likelihood ratio function

$$elr(\theta) = \frac{2}{S_n} \max_{\lambda \in \mathbb{R}^k} \sum_{t=1}^{n} \log(1 + \lambda' f_{tn}(\theta)), \tag{2.5}$$

for $\theta \in \Theta$ and $S_n = bn$. Under the traditional small-b asymptotics, $nb^2 + 1/(nb) \to 0$ as $n \to \infty$, and suitable weak dependence assumptions (see e.g., Smith (2011)), it can be shown that

$$elr(\theta_0) = n \bar{f}_n(\theta_0)' \left( b \sum_{t=1}^{n} f_{tn}(\theta_0) f_{tn}(\theta_0)' \right)^{-1} \bar{f}_n(\theta_0) + o_p(1) \to^{d} \frac{\kappa_1^2}{\kappa_2} \chi^2_k,$$

where $\kappa_1 = \int_{-\infty}^{+\infty} K(x)\,dx$ and $\kappa_2 = \int_{-\infty}^{+\infty} K^2(x)\,dx$ (assuming that $\kappa_1, \kappa_2 < \infty$). However, the $\chi^2$-based approximation can be poor, especially when the dependence is strong and the bandwidth parameter is large (see Section 3). To derive the fixed-b limiting distribution, we make an assumption that is standard in moment condition models.

Assumption 1. $\sum_{t=1}^{\lfloor nr \rfloor} f_t(\theta_0)/\sqrt{n} \Rightarrow \Lambda W_k(r)$ for $r \in [0,1]$, where $\Lambda \Lambda' = \Omega = \sum_{j=-\infty}^{+\infty} \Gamma_j$ with $\Gamma_j = E f_{t+j}(\theta_0) f_t(\theta_0)'$, and $W_k(r)$ is a $k$-dimensional vector of independent standard Brownian motions.

Assumption 1 can be verified under suitable moment and weak dependence assumptions on $f_t(\theta_0)$ (see e.g., Phillips (1987)). For the kernel function, we assume the following.

Assumption 2. The kernel $K: \mathbb{R} \to [-c_0, c_0]$, for some $0 < c_0 < \infty$, is piecewise continuously differentiable.

Fix $b \in (0,1)$, where $b = S_n/n$. Using summation by parts, the Continuous Mapping Theorem and Itô's formula, it is not hard to show that, for $t = \lfloor nr \rfloor$ with $r \in [0,1]$,

$$\sqrt{n} f_{tn}(\theta_0) = \frac{\sqrt{n}}{S_n} \sum_{s=t-n}^{t-1} K\left(\frac{s}{S_n}\right) f_{t-s}(\theta_0) \Rightarrow \frac{\Lambda D_k(r; b)}{b}, \tag{2.6}$$

where $D_k(r; b) = \int_0^1 K((r-s)/b)\,dW_k(s)$. Let $C^k[0,1] = \{(f_1, f_2, \dots, f_k) : f_i \in C[0,1]\}$. For any $g \in C^k[0,1]$, take $G_{el}(g) = \max_{\lambda \in \mathbb{R}^k} \int_0^1 \log(1 + \lambda' g(t))\,dt$. We show in the Appendix that the functional $G_{el}(\cdot)$ is continuous under the sup norm. Therefore, by the Continuous Mapping Theorem, we can characterize the asymptotic behavior of $elr(\theta_0)$.

Theorem 1. Suppose Assumptions 1-2 hold. For $n \to +\infty$ and $b$ fixed,

$$elr(\theta_0) \to^{d} U_{el,k}(b; K) := \frac{2}{b} \max_{\lambda \in \mathbb{R}^k} \int_0^1 \log\left(1 + \lambda' \int_0^1 K\left(\frac{r-s}{b}\right) dW_k(s)\right) dr. \tag{2.7}$$

The proof of Theorem 1 is given in the supplementary material. Theorem 1 shows that the fixed-b limiting distribution of $elr(\theta_0)$ is nonstandard yet pivotal for a given bandwidth and kernel, and its critical values can be obtained via simulation or iid bootstrap (because the bootstrapped sample satisfies the Functional Central Limit Theorem). Let $u_{el,k}(b; K; 1-\alpha)$ be the $100(1-\alpha)\%$ quantile of $U_{el,k}(b; K)/(1-b)$.
Given $b \in (0,1)$, a $100(1-\alpha)\%$ confidence region for the parameter $\theta_0$ is then given by

$$CI(1-\alpha; b) = \left\{ \theta \in \mathbb{R}^p : \frac{elr(\theta)}{1-b} \le u_{el,k}(b; K; 1-\alpha) \right\}. \tag{2.8}$$
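The simulation route to the critical values $u_{el,k}(b; K; 1-\alpha)$ can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it treats only the scalar case $k = 1$, approximates the Brownian motion on a grid of `n_grid` points, uses the Bartlett kernel purely as an example of a kernel satisfying Assumption 2, and drops draws for which the limit in (2.7) is infinite. All function names are hypothetical.

```python
import numpy as np


def bartlett(x):
    """Bartlett kernel, an illustrative choice (not the kernel used in the paper)."""
    return np.clip(1.0 - np.abs(x), 0.0, None)


def max_mean_log(g, n_iter=200):
    """Maximise mean(log(1 + lam * g)) over scalar lam (k = 1 only).

    Returns np.inf when all entries of g share one sign, in which case
    the supremum is unbounded.
    """
    g = np.asarray(g, dtype=float)
    if not np.any(g):
        return 0.0
    g_max, g_min = g.max(), g.min()
    if g_min >= 0.0 or g_max <= 0.0:
        return np.inf
    # Feasible set {lam : 1 + lam * g > 0} is the open interval (lo, hi).
    lo, hi = -1.0 / g_max, -1.0 / g_min
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        # The derivative of the concave objective is decreasing in lam.
        if np.mean(g / (1.0 + mid * g)) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return np.mean(np.log1p(lam * g))


def simulate_fixed_b_limit(b, kernel, n_grid=200, n_rep=500, seed=0):
    """Draw from a discretised version of U_{el,1}(b; K) in (2.7)."""
    rng = np.random.default_rng(seed)
    r = (np.arange(n_grid) + 0.5) / n_grid            # midpoints of [0, 1]
    k_mat = kernel((r[:, None] - r[None, :]) / b)     # K((r - s) / b)
    draws = np.empty(n_rep)
    for i in range(n_rep):
        d_w = rng.standard_normal(n_grid) / np.sqrt(n_grid)  # Brownian increments
        d_k = k_mat @ d_w                                    # D_1(r; b) on the grid
        draws[i] = (2.0 / b) * max_mean_log(d_k)
    return draws


def fixed_b_critical_value(b, kernel, level=0.95, **kwargs):
    """Quantile of U/(1 - b), computed from the finite draws only."""
    draws = simulate_fixed_b_limit(b, kernel, **kwargs)
    finite = draws[np.isfinite(draws)]
    return np.quantile(finite / (1.0 - b), level)


if __name__ == "__main__":
    print(fixed_b_critical_value(0.1, bartlett, level=0.95))
```

A confidence region as in (2.8) would then collect the values of $\theta$ whose rescaled ratio statistic does not exceed this simulated quantile. For $k > 1$ the bisection step would have to be replaced by a multivariate concave maximisation, which is not attempted here.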

When $K(x) = I(x \ge 0)$, we have $D_k(r; b) = W_k(r)$ and $A := \{\lambda \in \mathbb{R}^k : \min_{r \in [0,1]} (1 + \lambda' D_k(r; b)) \ge 0\} = \{\lambda \in \mathbb{R}^k : \min_{r \in [0,1]} (1 + \lambda' W_k(r)) \ge 0\}$. By Lemma 1 of Nordman, Bunzel, and Lahiri (2013), we know that $A$ is bounded with probability one, which implies that $P(U_{el,k}(b; K) = \infty) = 0$. We conjecture that $P(U_{el,k}(b; K) = \infty)$ can be positive for particular $K(\cdot)$ and $b \in (0,1)$. In our simulations, critical values are calculated based on the cases where $U_{el,k}(b; K) < \infty$ (when $b$ is close to zero, $P(U_{el,k}(b; K) = \infty)$ is rather small, as seen from our unreported simulation results). The nonstandard limiting distribution also provides some insight into how likely it is that the origin is not contained in the convex hull of $\{f_{tn}(\theta_0)\}_{t=1}^{n}$ when the sample size $n$ is large.

Remark 1. To capture the dependence within the observations, one can employ the commonly used blocking technique first applied to the EL by Kitamura (1997). To illustrate, we consider the fully overlapping smoothed moment condition given by $f_{tn}(\theta) = (1/m) \sum_{j=t}^{t+m-1} f(y_j, \theta)$ with $t = 1, 2, \dots, n-m+1$ and $m = \lfloor nb \rfloor$ for $b \in (0,1)$. Under suitable weak dependence assumptions, we have $\sqrt{n} f_{tn}(\theta_0) \Rightarrow \Lambda\{W_k(r+b) - W_k(r)\}/b$ for $t = \lfloor nr \rfloor$. Using similar arguments to those in Theorem 1, we can show that

$$elr(\theta_0) \to^{d} U_{el,k}(b) := \frac{2}{b} \max_{\lambda \in \mathbb{R}^k} \int_0^{1-b} \log(1 + \lambda' \{W_k(r+b) - W_k(r)\})\,dr.$$

We generate the critical values of $U_{el,k}(b)/(1-b)$ (conditioning on $U_{el,k}(b) < \infty$) for b from .1 to .3 with spacing .1, and further approximate the critical values by a cubic function of $b$ following the practice of Kiefer and Vogelsang (2005). The estimates of the coefficients of the corresponding cubic functions are given in Table 1. Similarly, we summarize the critical values of $U_{el,k}(b; K)/(1-b)$ (conditioning on $U_{el,k}(b; K) < \infty$) with $K(x) = (5\pi/8)^{1/2} (1/x) J_1(6\pi x/5)$ for b from .1 to .2 in Table 2, where $J_1(\cdot)$ denotes the Bessel function of the first kind.

Remark 2.
A natural question to ask is whether the fixed-b asymptotics is consistent with the traditional small-b asymptotics when $b$ is close to zero. We provide an affirmative answer by showing that $U_{el,k}(b; K)$ converges to a scaled $\chi^2_k$ distribution as $b \to 0$. We assume $K$ satisfies certain regularity conditions (see Assumption 2.2 in Smith (2011)). Using the Taylor expansion and some standard arguments for EL, it is not hard to show that

$$U_{el,k}(b; K) = \frac{1}{b} \left( \int_0^1 D_k(r; b)\,dr \right)' \left( \int_0^1 D_k(r; b) D_k(r; b)'\,dr \right)^{-1} \int_0^1 D_k(r; b)\,dr + o_p(1).$$
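The scale factor $\kappa_1^2/\kappa_2$ of the small-b $\chi^2_k$ limit depends only on the kernel and is easy to evaluate numerically. The sketch below is an illustration with assumed inputs: the two kernels (Bartlett and a uniform window on $[-1/2, 1/2]$) are examples chosen here, not kernels taken from the paper, and the function name `kernel_constants` is hypothetical. Both kernels vanish outside $[-1, 1]$, so a Riemann sum over that interval suffices.

```python
import numpy as np


def kernel_constants(kernel, support=1.0, n_points=200_001):
    """Approximate kappa_1 = int K(x) dx and kappa_2 = int K(x)^2 dx.

    Uses a Riemann sum on [-support, support]; the kernel is assumed to
    vanish outside that interval.
    """
    x = np.linspace(-support, support, n_points)
    dx = x[1] - x[0]
    values = kernel(x)
    kappa1 = np.sum(values) * dx
    kappa2 = np.sum(values ** 2) * dx
    return kappa1, kappa2


def bartlett(x):
    """Illustrative kernel: K(x) = max(1 - |x|, 0)."""
    return np.clip(1.0 - np.abs(x), 0.0, None)


def uniform(x):
    """Illustrative kernel: K(x) = 1 on [-1/2, 1/2] and 0 elsewhere."""
    return (np.abs(x) <= 0.5).astype(float)


if __name__ == "__main__":
    for name, kernel in [("bartlett", bartlett), ("uniform", uniform)]:
        k1, k2 = kernel_constants(kernel)
        # kappa_1^2 / kappa_2 multiplies the chi-square limit as b -> 0.
        print(name, k1, k2, k1 ** 2 / k2)
```

For the Bartlett kernel the exact values are $\kappa_1 = 1$ and $\kappa_2 = 2/3$, so the limit is $1.5\,\chi^2_k$; for the uniform window $\kappa_1 = \kappa_2 = 1$ and the limit is an unscaled $\chi^2_k$.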

Table 1. Critical value function coefficients.

                        a_0       a_1        a_2         a_3       R^2
u_{el,1}(b; .9)       2.661     6.547     12.819      -8.329     .9984
u_{el,1}(b; .95)      3.917     5.819     34.483     -14.192     .9976
u_{el,1}(b; .99)      6.593     4.631     231.74    -484.131     .9855
u_{el,2}(b; .9)       4.827    -1.521    212.177    -477.642     .9983
u_{el,2}(b; .95)      6.586    -18.86     469.34    -176.779     .9945
u_{el,2}(b; .99)      9.928   -36.918    128.429    -253.245     .996
u_{el,3}(b; .9)       6.424     -1.99     45.193    -172.778     .9962
u_{el,3}(b; .95)      7.783     3.125     56.737   -1552.979     .999
u_{el,3}(b; .99)      1.138     22.29     18.359    -333.819     .9565

The critical value $u_{el,k}(b; 1-\alpha)$ is approximated by a cubic function $a_0 + a_1 b + a_2 b^2 + a_3 b^3$ of $b$. The estimated coefficients and multiple $R^2$ are reported. The Brownian motion is approximated by a normalized partial sum of 1, i.i.d. standard normal random variables and the number of Monte Carlo replications is 5,.

Table 2. Critical value function coefficients.

                          a_0       a_1        a_2          a_3       R^2
u_{el,1}(b; K; .9)      3.324     8.243    112.149     -155.159     .9992
u_{el,1}(b; K; .95)     5.116    -9.585    533.935    -1459.216     .9989
u_{el,1}(b; K; .99)     9.612     -85.1    2237.47    -6595.415     .9955
u_{el,2}(b; K; .9)       5.65     8.652    799.757     -2928.35     .993
u_{el,2}(b; K; .95)      7.79   -23.367   1833.534    -6752.999     .9868
u_{el,2}(b; K; .99)     11.78    -54.68    451.733   -17748.759     .982
u_{el,3}(b; K; .9)       6.86     48.99    1373.57     -6288.66     .984
u_{el,3}(b; K; .95)     7.656     15.88   1714.933     -8718.14     .9731
u_{el,3}(b; K; .99)     4.963    55.832     346.21    -9711.193     .952

The critical value $u_{el,k}(b; K; 1-\alpha)$ is approximated by a cubic function $a_0 + a_1 b + a_2 b^2 + a_3 b^3$ of $b$. The estimated coefficients and multiple $R^2$ are reported. The Brownian motion is approximated by a normalized partial sum of 1, iid standard normal random variables and the number of Monte Carlo replications is 5,.

Under Assumption 2, we derive that

$$\frac{1}{b} \int_0^1 D_k(r; b)\,dr = \int_0^1 \int_0^1 K\left(\frac{r-s}{b}\right) dr\, \frac{dW_k(s)}{b} = \int_0^1 \int_{-s/b}^{(1-s)/b} K(t)\,dt\,dW_k(s) \to^{d} \kappa_1 W_k(1).$$

Define the semi-positive definite kernel $K_b(r, s) = \int_0^1 K((t-r)/b) K((t-s)/b)\,dt/(b\kappa_2)$.

Then

$$V_k(b) = \frac{1}{b} \int_0^1 D_k(r; b) D_k(r; b)'\,dr = \kappa_2 \int_0^1 \int_0^1 K_b(r, s)\,dW_k(r)\,dW_k(s)' =^{d} \kappa_2 \sum_{j=1}^{+\infty} \lambda_{j,b}\, \eta_j \eta_j',$$

where $\{\eta_j\}_{j=1}^{+\infty}$ is an independent sequence of $N_k(0, I_k)$ random vectors and the $\lambda_{j,b}$ are the eigenvalues associated with $K_b(r, s)$. Note that

$$\frac{E\{V_k(b)\}}{\kappa_2} = I_k \int_0^1 K_b(r, r)\,dr = \frac{I_k}{\kappa_2 b} \int_0^1 \int_0^1 K^2\left(\frac{t-r}{b}\right) dt\,dr \to I_k.$$

Let $\eta_j = (\eta_{j1}, \dots, \eta_{jk})'$ and denote by $V_k^{(l,m)}(b)$ the $(l, m)$th element of $V_k(b)$ with $1 \le l, m \le k$. Since $\sum_{j=1}^{+\infty} \lambda_{j,b}^2 = \int_0^1 \int_0^1 \{K_b(r, s)\}^2\,dr\,ds \to 0$ as $b \to 0$ (see e.g., Sun (2010)), we get

$$E\left\{ \frac{V_k^{(l,m)}(b)}{\kappa_2} \right\}^2 = \sum_{j=1}^{+\infty} \sum_{j'=1}^{+\infty} \lambda_{j,b} \lambda_{j',b}\, E\,\eta_{jl}\eta_{jm}\eta_{j'l}\eta_{j'm} = \begin{cases} \sum_{j=1}^{+\infty} \lambda_{j,b}^2 \to 0, & l \ne m; \\ \left( \sum_{j=1}^{+\infty} \lambda_{j,b} \right)^2 + 2 \sum_{j=1}^{+\infty} \lambda_{j,b}^2 \to 1, & l = m, \end{cases}$$

which implies that $V_k(b) \to^{p} \kappa_2 I_k$. Therefore, we have $U_{el,k}(b; K) \to^{d} (\kappa_1^2/\kappa_2) \chi^2_k$ as $b \to 0$. Compared to the $\chi^2$-approximation, the fixed-b limiting distribution, which captures the choice of the kernel and the bandwidth, is expected to provide a better approximation to the finite sample distribution of the BEL ratio statistic at the true parameter when $b$ is relatively large.

2.2. Generalized empirical likelihood

We extend the fixed-b approach to the generalized empirical likelihood (GEL) estimation framework (Newey and Smith (2004)). To describe GEL, we let $\rho$ be a concave function defined on an open set $I$ that contains the origin. Set $\rho(x) = -\infty$ for $x \notin I$, and let $\rho_j(x) = \partial^j \rho(x)/\partial x^j$ and $\rho_j = \rho_j(0)$ for $j = 0, 1, 2$. We normalize $\rho$ so that $\rho_1 = \rho_2 = -1$. Consider the set $\Pi_n(\theta) = \{\lambda : \lambda' f_{tn}(\theta) \in I, \ t = 1, 2, \dots, n\}$. The GEL estimator is the solution to a saddle point problem,

$$\hat{\theta}_{gel} = \mathrm{argmin}_{\theta \in \Theta} \sup_{\lambda \in \mathbb{R}^k} \hat{P}(\theta, \lambda) = \mathrm{argmin}_{\theta \in \Theta} \sup_{\lambda \in \Pi_n(\theta)} \hat{P}(\theta, \lambda),$$

where $\hat{P}(\theta, \lambda) = \frac{1}{S_n} \sum_{t=1}^{n} \{\rho(\lambda' f_{tn}(\theta)) - \rho_0\}$. The GEL ratio function is given by

$$gelr(\theta) = 2 \sup_{\lambda \in \mathbb{R}^k} \hat{P}(\theta, \lambda). \tag{2.9}$$

The GEL estimator includes a number of special cases that have been well studied in the statistics and econometrics literature. The EL, exponential tilting (ET), and continuous updating (CUE) are special cases of the GEL.
Thus $\rho(x) = \log(1 - x)$ and $I = (-\infty, 1)$ for EL, $\rho(x) = -e^{x}$ and $I = \mathbb{R}$ for ET, and $\rho(x) = -(1 + x)^2/2$ and $I = \mathbb{R}$ for CUE. More generally, members of the Cressie-Read power divergence family of discrepancies discussed by Imbens, Spady, and Johnson (1998) are included in the GEL class with $\rho(x) = -(1 + \gamma x)^{(\gamma+1)/\gamma}/(\gamma + 1)$ (see Newey and Smith (2004)).

Let $G_{gel}(g) = \max_{\lambda \in \mathbb{R}^k} \int_0^1 \{\rho(\lambda' g(t)) - \rho_0\}\,dt$ for $g \in C^k[0,1]$. If $\rho(\cdot)$ is strictly concave and twice continuously differentiable, under suitable assumptions it can be shown that $G_{gel}(\cdot)$ is a continuous functional under the sup norm. Since the argument follows from that presented in the appendix with a minor modification, we skip the details (see Remark 1.1 in the supplementary material). Therefore, we have

$$gelr(\theta_0) \to^{d} U_{\rho,k}(b; K) := \frac{2}{b} \max_{\lambda \in \mathbb{R}^k} \int_0^1 \left\{ \rho\left( \lambda' \int_0^1 K((r-s)/b)\,dW_k(s) \right) - \rho_0 \right\} dr.$$

The GEL-based confidence region for the parameter $\theta_0$ is

$$CI(1-\alpha; b) = \left\{ \theta \in \mathbb{R}^p : \frac{gelr(\theta)}{1-b} \le u_{\rho,k}(b; K; 1-\alpha) \right\}, \tag{2.10}$$

where $u_{\rho,k}(b; K; 1-\alpha)$ is the $100(1-\alpha)\%$ quantile of $U_{\rho,k}(b; K)/(1-b)$, which can again be obtained via simulation or iid bootstrap.

3. Numerical Studies

We conducted two sets of simulation studies to compare and contrast the finite sample performance of the inference procedure based on the fixed-b approximation and the BEL of Kitamura (1997) and Smith (2011). The simulation results presented below are based on the simulation runs where the origin is contained in the convex hull of $\{f_{tn}(\theta_0)\}$.

3.1. Mean and quantiles

Consider the time series models AR(1), $Y_t = \rho Y_{t-1} + \epsilon_t$ with $\rho = -0.5, 0.2, 0.5, 0.8$, and AR(2), $Y_t = (5/6) Y_{t-1} - (1/6) Y_{t-2} + \epsilon_t$. The latter was used in Chen and Wong (2009), where the focus was to compare the finite sample coverages of the quantile delivered by BEL. In both models, $\{\epsilon_t\}$ is a sequence of iid standard normal random variables. We focus on the inference for the mean, the median and the 5% quantile. For the mean, $f(y_t, \theta) = y_t - \theta$.
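For the $q$-th quantile, we consider the moment condition $f_q(y_t, \theta) = \int_{-\infty}^{(\theta - y_t)/h} K(x)\,dx - q$, where $K(\cdot)$ is an $r$-th order window that satisfies

$$\int u^j K(u)\,du = \begin{cases} 1, & j = 0; \\ 0, & 1 \le j \le r-1; \\ \kappa, & j = r, \end{cases}$$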

for some integer $r \ge 2$, and $h$ is a bandwidth such that $h \to 0$ as $n \to +\infty$. When $h = 0$, we have $f_q(y_t, \theta) = I(y_t \le \theta) - q$. To accommodate dependence, we consider the BEL with fully overlapping moment conditions $f_{tn}(\theta) = (1/m) \sum_{j=t}^{t+m-1} f(y_j, \theta)$ for $t = 1, 2, \dots, n-m+1$ and $m = \lfloor nb \rfloor$ with $b \in (0,1)$. For comparisons, we consider smoothed EL with the kernel $K(x) = (5\pi/8)^{1/2} (1/x) J_1(6\pi x/5)$, where $J_1(\cdot)$ is the Bessel function of the first kind. The HAC covariance estimator induced by using $K(\cdot)$ is essentially the same as the nonparametric long run variance estimator with the Quadratic spectral kernel (see Example 2.3 of Smith (2011)). The sample sizes considered were n = 1 and 4, and b was chosen from .2 to .2. To draw inference for the quantiles, we employed the second order Epanechnikov window with bandwidth $h = c n^{-1/4}$ for $c = 0, 1$, following Chen and Wong (2009).

The coverage probabilities and corresponding interval widths for the mean and quantiles delivered by the fixed-b approximation and the $\chi^2$-based approximation are depicted in Figures 1-4. For the mean, undercoverage occurs for both the fixed-b calibration and the $\chi^2$-based approximation when the dependence is positive, and becomes more severe as the dependence strengthens. Inference based on the fixed-b calibration provided uniformly better coverage probabilities in all cases, and was quite robust to the choice of $b$. The improvement was significant, especially for large bandwidth. On the other hand, the fixed-b based interval was slightly wider than the $\chi^2$-based interval. For negative dependence with $\rho = -0.5$, the fixed-b calibration tended to provide overcoverage, but the improvement over the $\chi^2$-based approximation could be seen for relatively large $b$. These findings are consistent with the intuition that the larger $b$ is, the more accurate the fixed-b based approximation is relative to the $\chi^2$-based approximation used by Kitamura (1997) and Smith (2011).
The results for the median and 5% quantile were qualitatively similar to those in the mean case. The choice of $h = 1$ tended to provide slightly shorter interval widths than the unsmoothed counterpart, $h = 0$, in some cases (see Chen and Wong (2009)). A comparison of Figure 1 with Figure 3 (Figure 2 with Figure 4) shows that the coverage probabilities for the EL based on the kernel $K(x)$ are generally closer to the nominal level than those of the BEL counterpart, while the corresponding intervals are wider. This phenomenon is consistent with the finding that the QS kernel provides better coverage, but wider interval widths, compared to the Bartlett kernel in Kiefer and Vogelsang (2005) under the GMM framework. Our unreported simulation results also demonstrate the usefulness of fixed-b calibration under the GEL estimation framework. The results for ET are available upon request.
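The ingredients of the Section 3.1 design can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' simulation code: it covers only scalar moment functions, the function names are hypothetical, the burn-in length is an arbitrary choice, and the normalisation $2/m$ of the ratio statistic is assumed to follow (2.5) with $S_n = m$.

```python
import numpy as np


def simulate_ar1(n, rho, rng, burn_in=200):
    """AR(1) series Y_t = rho * Y_{t-1} + eps_t with iid N(0, 1) errors."""
    eps = rng.standard_normal(n + burn_in)
    y = np.empty(n + burn_in)
    y[0] = eps[0]
    for t in range(1, n + burn_in):
        y[t] = rho * y[t - 1] + eps[t]
    return y[burn_in:]          # discard the burn-in segment (assumed length)


def epanechnikov_cdf(z):
    """Integral of 0.75 * (1 - u^2) from -1 to z, clipped to [-1, 1]."""
    z = np.clip(z, -1.0, 1.0)
    return 0.5 + 0.75 * z - 0.25 * z ** 3


def quantile_moment(y, theta, q, h):
    """Smoothed moment f_q(y, theta); h = 0 gives I(y <= theta) - q."""
    if h == 0:
        return (y <= theta).astype(float) - q
    return epanechnikov_cdf((theta - y) / h) - q


def block_means(f, m):
    """Fully overlapping block averages f_tn, t = 1, ..., n - m + 1."""
    c = np.concatenate(([0.0], np.cumsum(f)))
    return (c[m:] - c[:-m]) / m


def bel_ratio(f, b, n_iter=200):
    """Blockwise EL ratio (2 / m) * max_lam sum log(1 + lam * f_tn), scalar f."""
    n = f.shape[0]
    m = max(int(np.floor(n * b)), 1)
    g = block_means(f, m)
    if g.min() >= 0.0 or g.max() <= 0.0:
        return np.inf           # origin outside the convex hull of the blocks
    lo, hi = -1.0 / g.max(), -1.0 / g.min()
    for _ in range(n_iter):     # bisection on the decreasing score function
        mid = 0.5 * (lo + hi)
        if np.sum(g / (1.0 + mid * g)) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return (2.0 / m) * np.sum(np.log1p(lam * g))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = simulate_ar1(400, 0.5, rng)
    print(bel_ratio(y - 0.0, b=0.1))                             # mean, theta_0 = 0
    print(bel_ratio(quantile_moment(y, 0.0, 0.5, 0.0), b=0.1))   # median, unsmoothed
```

Coverage would then be estimated by repeating this over many replications and comparing `bel_ratio(...) / (1 - b)` with either the fixed-b critical value or the $\chi^2$ one.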

Figure 1. Coverage probabilities for the mean delivered by the BEL based on the fixed-b approximation and the $\chi^2$-based approximation. The nominal level is 95% and the number of Monte Carlo replications is 1,.

Figure 2. Coverage probabilities for the median and 5% quantile delivered by the BEL based on the fixed-b approximation and the $\chi^2$-based approximation. The nominal level is 95% and the number of Monte Carlo replications is 1,.

Figure 3. Coverage probabilities for the mean delivered by the smoothed EL based on the fixed-b approximation and the $\chi^2$-based approximation. The corresponding kernel is $K(x) = (5\pi/8)^{1/2} (1/x) J_1(6\pi x/5)$. The nominal level is 95% and the number of Monte Carlo replications is 1,.

Figure 4. Coverage probabilities for the median and 5% quantile delivered by the smoothed EL based on the fixed-b approximation and the $\chi^2$-based approximation. The corresponding kernel is $K(x) = (5\pi/8)^{1/2} (1/x) J_1(6\pi x/5)$. The nominal level is 95% and the number of Monte Carlo replications is 1,.

Figure 5. Coverage probabilities delivered by the BEL (left panels) and the smoothed EL (right panels) based on the fixed-b approximation and the $\chi^2$-based approximation. The corresponding kernel for the smoothed EL is $K(x) = (5\pi/8)^{1/2} (1/x) J_1(6\pi x/5)$. The nominal level is 95% and the number of Monte Carlo replications is 1,.

3.2. Time series regression

We consider the stylized linear regression model with an intercept and a regressor $x_t$: $y_t = \beta_1 + \beta_2 x_t + u_t$ for $1 \le t \le n$, where $\{x_t\}$ and $\{u_t\}$ are generated independently from an AR(1) model with common coefficient $\rho$. We set the true parameter $\beta_0 = (\beta_1, \beta_2)' = (0, 0)'$ and chose $\rho \in \{0.2, 0.5, 0.8\}$. We are interested in constructing a confidence contour for $\beta_0$. Consider the moment conditions $f_t(\beta) = (u_t(\beta), x_t u_t(\beta), x_{t-1} u_t(\beta), x_{t-2} u_t(\beta))'$ with $u_t(\beta) = y_t - \beta_1 - \beta_2 x_t$ and $3 \le t \le n$. We report the coverage probabilities for the BEL and the

smoothed EL with kernel $K(x)$ based on the fixed-b approximation and the $\chi^2$-based approximation in Figure 5. As the dependence strengthens, the fixed-b and $\chi^2$-based approximations deteriorate. The coverage probabilities obtained from the fixed-b calibration are consistently closer to the nominal level, and the improvement is significant for large bandwidths. In contrast, the coverage probabilities based on the $\chi^2$ approximation are severely downward biased for relatively large $b$.

To sum up, the fixed-b approximation provides a uniformly better approximation to the sampling distribution of the EL ratio statistic for a wide range of $b$, and it tends to deliver more accurate coverage probability in confidence interval construction and size in testing. From a practical viewpoint, the choice of the bandwidth parameter has a great impact on the finite sample performance of the EL ratio statistic; it is of interest to consider the optimal bandwidth under the fixed-b paradigm.

Acknowledgements

We are grateful to two referees for their helpful comments that led to substantial improvements. Shao's research is supported in part by National Science Foundation grant DMS-114545.

References

Billingsley, P. (1999). Convergence of Probability Measures. Second edition. Wiley, New York.
Borwein, J. M. and Lewis, A. S. (1991). Duality relationships for entropy-type minimization problems. SIAM J. Control Optim. 29, 325-338.
Chen, S. X. and Van Keilegom, I. (2009). A review on empirical likelihood methods for regression. TEST 18, 415-447.
Chen, S. X. and Wong, C. (2009). Smoothed block empirical likelihood for quantiles of weakly dependent processes. Statist. Sinica 19, 71-82.
Cressie, N. and Read, T. (1984). Multinomial goodness-of-fit tests. J. Roy. Statist. Soc. Ser. B 46, 440-464.
Imbens, G. W., Spady, R. H. and Johnson, P. (1998). Information theoretic approaches to inference in moment condition models.
Econometrica 66, 333-357.
Jansson, M. (2004). On the error of rejection probability in simple autocorrelation robust tests. Econometrica 72, 937-946.
Kiefer, N. M. and Vogelsang, T. J. (2005). A new asymptotic theory for heteroskedasticity-autocorrelation robust tests. Econometric Theory 21, 1130-1164.
Kitamura, Y. (1997). Empirical likelihood methods with weakly dependent processes. Ann. Statist. 25, 2084-2102.
Kitamura, Y. (2006). Empirical likelihood methods in econometrics: theory and practice. Cowles Foundation Discussion Paper 1569.
Newey, W. K. and Smith, R. (2004). Higher order properties of GMM and generalized empirical likelihood estimators. Econometrica 72, 219-255.

16 XIANYANG ZHANG AND XIAOFENG SHAO Nordman, D. J. (29). Tapered empirical likelihood for time series data in time and frequency domains. Biometrika 96, 119-132. Nordman, D. J., Bunzel, H. and Lahiri, S. N. (213). A non-standard empirical likelihood for time series. Ann. Statist. To appear. Owen, A. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75, 237-249. Owen, A. (199). Empirical likelihood confidence regions. Ann. Statist. 18, 9-12. Owen, A. (21). Empirical Likelihood. Chapman and Hall, New York. Phillips, P. C. B. (1987). Time series regression with unit roots. Econometrica 55, 277-31. Qin, J. and Lawless, J. (1994). Empirical likelihood and general estimating equations. Ann. Statist. 22, 3-325. Shao, X. and Politis, D. N. (213). Fixed-b subsampling and block bootstrap: improved confidence sets based on p-value calibration. J. Roy. Statist. Soc. Ser. B 75, 161-184. Smith, R. (211). GEL criteria for moment condition models. Econometric Theory 27, 1192-1235. Sun, Y. (21). Let s fix it: fixed-b asymptotics versus small-b asymptotics in heteroscedasticity and autocorrelation robust Inference. working paper. Sun, Y., Phillips, P. C. B. and Jin, S. (28). Optimal bandwidth selection in heteroscedasicityautocorrlation robust testing. Econometrica 76, 175-194. Zhang, X. and Shao, X. (213). Fixed-smoothing asymptotics for time series. Ann. Statist. 41, 1329-1349. Department of Statistics, University of Missouri-Columbia, Columbia, MO 65211, USA. E-mail: zhangxiany@missouri.edu Department of Statistics, University of Illinois at Urbana-Champaign, 725 S. Wright St., Champaign, IL 6182, USA. E-mail: xshao@illinois.edu (Received November 212; accepted July 213)

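Statistica Sinica: Supplement

Fixed-b Asymptotics for Blockwise Empirical Likelihood

Xianyang Zhang and Xiaofeng Shao

University of Missouri-Columbia and University of Illinois at Urbana-Champaign

Supplementary Material

The following supplementary material contains the proof of Theorem 2.1.

S1 Technical appendix

Define the set of functions $Q = \{g = (g_1, g_2, \dots, g_k)' \in C^k[0,1] : g_1, \dots, g_k \text{ are linearly independent}\}$, and let
\[
G_{el}(g) = \max_{\lambda \in \mathbb{R}^k} \int_0^1 \log(1 + \lambda' g(t))\,dt
\]
be a nonlinear functional from $C^k[0,1]$ to the real line $\mathbb{R}$, where $\log(x) = -\infty$ for $x < 0$. We shall prove in the following that $G_{el}(g)$ is a continuous map for functions in $Q$ under the sup norm (Seijo and Sen (2011)). For any $g \in C^k[0,1]$, we define $H_g = \{\lambda \in \mathbb{R}^k : \min_{t \in [0,1]} (1 + \lambda' g(t)) \ge 0\}$ and $L_g(\lambda) = \int_0^1 \log(1 + \lambda' g(t))\,dt$. It is straightforward to show that $L_g(\lambda)$ is strictly concave for $g \in Q$ on the set $H_g$. We also note that $H_g$ is a closed convex set, which contains a neighborhood of the origin. Let $\lambda_g = \operatorname{argmax}_{\lambda \in \mathbb{R}^k} \int_0^1 \log(1 + \lambda' g(t))\,dt$ be the maximizer of $L_g(\lambda)$.

We first show that $G_{el}(g) < \infty$ if and only if $H_g$ is bounded. If $G_{el}(g) = \infty$, then $\lambda_g$ cannot be finite, which implies that $H_g$ is unbounded. On the other hand, suppose $H_g$ is unbounded. Note that $H_g = \bigcap_{t \in [0,1]} \{\lambda \in \mathbb{R}^k : \lambda' g(t) \ge -1\}$, which is the intersection of a set of closed half-spaces. The recession cone of $H_g$ is then given by $0^+ H_g = \bigcap_{t \in [0,1]} \{\lambda \in \mathbb{R}^k : \lambda' g(t) \ge 0\}$ (see Section 8 of Rockafellar (1970)). By Theorem 8.4 of Rockafellar (1970), there exists a nonzero vector $\bar\lambda \in 0^+ H_g$, and the set $\{t \in [0,1] : \bar\lambda' g(t) > 0\}$ has positive Lebesgue measure because of the linear independence of the components of $g$. We have $G_{el}(g) \ge L_g(a\bar\lambda)$ for any $a > 0$, where $L_g(a\bar\lambda) \to \infty$ as $a \to \infty$. Thus we get $G_{el}(g) = \infty$.

Next, we consider the case $G_{el}(g) = \infty$. Following the discussion above, there exists $\delta_0 > 0$ such that the set $B := \{t \in [0,1] : \bar\lambda' g(t) > \delta_0\}$ has Lebesgue measure $\Lambda(B) > 0$. For any $A > 0$, we choose $\epsilon_0 \in (0,1)$ and large enough $a > 0$ so that
\[
\Lambda(B) \log(1 + a\delta_0 - \epsilon_0) + \log(1 - \epsilon_0) > A.
\]
For any $f \in Q$ with $\|f - g\| := \sup_{t \in [0,1]} |f(t) - g(t)| \le \epsilon_0 / (\|\bar\lambda\| a)$, we have
\[
\begin{aligned}
\int_0^1 \log(1 + a\bar\lambda' f(t))\,dt
&= \int_B \log\big(1 + a\bar\lambda'(f(t) - g(t)) + a\bar\lambda' g(t)\big)\,dt
 + \int_{B^c} \log\big(1 + a\bar\lambda'(f(t) - g(t)) + a\bar\lambda' g(t)\big)\,dt \\
&\ge \Lambda(B) \log(1 + a\delta_0 - \epsilon_0) + \log(1 - \epsilon_0) > A.
\end{aligned}
\]
Hence $G_{el}(f) > A$ for all such $f$.

In what follows, we turn to the case $G_{el}(g) < \infty$, i.e., $H_g$ is bounded as shown before.

Case 1: we first consider the case that $\lambda_g \in H_g^{\circ} = \{\lambda \in \mathbb{R}^k : \min_{t \in [0,1]} (1 + \lambda' g(t)) > 0\}$. Since $H_g^{\circ}$ is open, we can pick a positive number $\tau$ so that $\bar B(\lambda_g; \tau) := \{\lambda \in \mathbb{R}^k : \|\lambda - \lambda_g\| \le \tau\} \subseteq H_g^{\circ}$. Then we have $\min_{\lambda \in \bar B(\lambda_g;\tau)} \min_{t \in [0,1]} (1 + \lambda' g(t)) > c > 0$. Furthermore, there exists a sufficiently small $\delta_1$ such that for any $f \in Q$ with $\|f - g\| \le \delta_1$, we have $\min_{t \in [0,1]} (1 + \lambda' f(t)) > c' > 0$ for any $\lambda \in \bar B(\lambda_g; \tau)$, i.e., $\bar B(\lambda_g; \tau) \subseteq H_f^{\circ}$. Notice that the constant $c'$ only depends on $g$, $\delta_1$ and $c$. Given any $\epsilon > 0$, we shall first show that $\sup_{\lambda \in \bar B(\lambda_g;\tau)} |L_f(\lambda) - L_g(\lambda)| < \epsilon$ for any $f \in Q$ with $\|f - g\| < \delta(\epsilon)$, where $0 < \delta(\epsilon) < \delta_1$. Because $G_{el}(g) < \infty$, we have $\left|\int_0^1 \log(1 + \lambda' g(t))\,dt\right| < \infty$ for any $\lambda \in \bar B(\lambda_g; \tau)$. Simple algebra yields that
\[
\left| \int_0^1 \log(1 + \lambda' f(t))\,dt - \int_0^1 \log(1 + \lambda' g(t))\,dt \right|
\le \max\left\{ \log(1 + M\delta(\epsilon)/c'),\ \log(1 + M\delta(\epsilon)/c) \right\}, \tag{S1.1}
\]
where $M = \|\lambda_g\| + \tau$. The RHS of (S1.1) can be made arbitrarily small for sufficiently small $\delta(\epsilon)$. Therefore we get $\sup_{\lambda \in \bar B(\lambda_g;\tau)} |L_f(\lambda) - L_g(\lambda)| < \epsilon$ for small enough $\delta(\epsilon)$, which implies that
\[
\left| G_{el}(g) - \sup_{\lambda \in \bar B(\lambda_g;\tau)} \int_0^1 \log(1 + \lambda' f(t))\,dt \right| < \epsilon.
\]
Next, we show that there exists a local maximum of $L_f(\lambda)$ in $\bar B(\lambda_g; \tau)$. Suppose $\epsilon$ is sufficiently small and choose $0 < \xi < \tau$ such that $L_g(\lambda_g) > \max_{\lambda \in \bar B(\lambda_g;\tau) \cap B^c(\lambda_g;\xi)} L_g(\lambda) + 2\epsilon$, where $B(\lambda_g; \xi) = \{\lambda \in \mathbb{R}^k : \|\lambda - \lambda_g\| < \xi\}$. Thus we get
\[
\max_{\lambda \in \bar B(\lambda_g;\tau) \cap B^c(\lambda_g;\xi)} L_f(\lambda)
\le \max_{\lambda \in \bar B(\lambda_g;\tau) \cap B^c(\lambda_g;\xi)} L_g(\lambda) + \epsilon
< L_g(\lambda_g) - \epsilon
< L_f(\lambda_g)
\le \max_{\lambda \in B(\lambda_g;\xi)} L_f(\lambda).
\]
Because $f \in Q$, $L_f(\lambda)$ is strictly concave. Hence, the local maximum is also the global maximum, which implies that $|G_{el}(g) - G_{el}(f)| < \epsilon$.

Case 2: We now consider the case $\min_{t \in [0,1]} (1 + \lambda_g' g(t)) = 0$. For any $0 < \delta' < \delta'' < 1$, let $H_g(\delta') = \{(1 - \delta')\lambda : \lambda \in H_g\}$ and $H_f(\delta'') = \{(1 - \delta'')\lambda : \lambda \in H_f\}$.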
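There exists a small enough $\delta > 0$ such that for any $f \in Q$ with $\|f - g\| < \delta$, $H_f(\delta'') \subseteq H_g(\delta') \subseteq H_f \cap H_g$. By the continuity of $L_g(\lambda)$, we know for any $\epsilon > 0$, there exists a $\delta' > 0$ such that when $\|\lambda - \lambda_g\| \le \delta' \|\lambda_g\|$, $L_g(\lambda_g) < L_g(\lambda) + \epsilon/4$. By the construction of $H_g(\delta')$, we have
\[
L_g(\lambda_g) < L_g((1 - \delta')\lambda_g) + \epsilon/4 \le \sup_{\lambda \in H_g(\delta')} L_g(\lambda) + \epsilon/4.
\]
Using arguments similar to those in the first case and the boundedness of $H_g$, we can show that
\[
\left| \sup_{\lambda \in H_g(\delta')} L_g(\lambda) - \sup_{\lambda \in H_g(\delta')} L_f(\lambda) \right| < \epsilon/8,
\]
for sufficiently small $\delta$. Furthermore, when $\lambda_f \in H_g(\delta')$, we have $L_f(\lambda_f) = \sup_{\lambda \in H_g(\delta')} L_f(\lambda)$. When $\lambda_f \notin H_g(\delta')$, by the concavity of $L_f(\lambda)$ and $L_f(0) = 0$, we get $L_f((1 - \delta'')\lambda_f) \ge (1 - \delta'') L_f(\lambda_f)$, which implies that
\[
\begin{aligned}
L_f(\lambda_f)
&\le \frac{L_f((1 - \delta'')\lambda_f)}{1 - \delta''}
 \le \frac{\sup_{\lambda \in H_g(\delta')} L_f(\lambda)}{1 - \delta''}
 \le \frac{\sup_{\lambda \in H_g(\delta')} L_g(\lambda) + \epsilon/8}{1 - \delta''} \\
&\le \sup_{\lambda \in H_g(\delta')} L_g(\lambda) + \epsilon/4
 < \sup_{\lambda \in H_g(\delta')} L_f(\lambda) + \epsilon/2
\end{aligned}
\]
for small enough $\delta''$ (e.g., $\delta'' < \min(1/3, \epsilon/(24 G_{el}(g)))$). Thus we have
\[
|G_{el}(f) - G_{el}(g)|
\le \left| L_f(\lambda_f) - \sup_{\lambda \in H_g(\delta')} L_f(\lambda) \right|
 + \left| L_g(\lambda_g) - \sup_{\lambda \in H_g(\delta')} L_g(\lambda) \right|
 + \left| \sup_{\lambda \in H_g(\delta')} L_f(\lambda) - \sup_{\lambda \in H_g(\delta')} L_g(\lambda) \right|
< \epsilon.
\]
Combining the above arguments, we show that the map $G_{el}$ is continuous under the sup norm.

Next, we consider the limiting process $D_k(r; b) = \int_0^1 K((r - s)/b)\,dW_k(s)$ with $b \in (0, 1)$ being fixed in the asymptotics. Because the components of $D_k(r; b)$ are mutually independent, we have $P(\alpha' D_k(\cdot; b) \equiv 0 \text{ for some nonzero } \alpha \in \mathbb{R}^k) = 0$, which implies that $P(D_k(\cdot; b) \in Q) = 1$. Under the assumptions in Theorem 2.1, the set $\{\lambda : \min_{r \in [0,1]} (1 + \lambda' D_k(r; b)) \ge 0\}$ is compact and convex almost surely (note that the convexity and closedness of the set follow directly from its definition). Using summation by parts, we get
\[
\begin{aligned}
\sqrt{n} f_{tn}(\theta_0)
&= \frac{1}{\sqrt{n}\, b} \sum_{s=t-n}^{t-1} K\left(\frac{s}{nb}\right) f_{t-s}(\theta_0)
 = \frac{1}{\sqrt{n}\, b} \sum_{s=1}^{n} K\left(\frac{t-s}{nb}\right) f_s(\theta_0) \\
&= \frac{1}{\sqrt{n}\, b} K\left(\frac{t-n}{nb}\right) \sum_{k=1}^{n} f_k(\theta_0)
 + \frac{1}{\sqrt{n}\, b} \sum_{s=1}^{n-1} \left\{ K\left(\frac{t-s}{nb}\right) - K\left(\frac{t-s-1}{nb}\right) \right\} \sum_{k=1}^{s} f_k(\theta_0).
\end{aligned}
\]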
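The summation-by-parts step rewrites the kernel-weighted sum $\sum_{s=1}^{n} K((t-s)/(nb)) f_s$ as a boundary term $K((t-n)/(nb)) S_n$ plus kernel increments weighted by the partial sums $S_s = f_1 + \cdots + f_s$. The following Python sketch checks this identity numerically. It is only an illustration under stated assumptions: the Bartlett-type kernel, the Gaussian draws standing in for $f_s(\theta_0)$, and the function names `kernel`, `direct_sum` and `by_parts_sum` are all hypothetical choices, not taken from the paper.

```python
import random


def kernel(x):
    """Bartlett-type kernel; an illustrative choice, not one fixed by the text."""
    return max(0.0, 1.0 - abs(x))


def direct_sum(f, t, b):
    """sum_{s=1}^{n} K((t - s) / (n b)) * f_s, computed term by term."""
    n = len(f)
    return sum(kernel((t - s) / (n * b)) * f[s - 1] for s in range(1, n + 1))


def by_parts_sum(f, t, b):
    """The same quantity written with partial sums S_s = f_1 + ... + f_s."""
    n = len(f)
    partial, running = [], 0.0
    for value in f:
        running += value
        partial.append(running)
    # Boundary term: K((t - n) / (n b)) * S_n.
    total = kernel((t - n) / (n * b)) * partial[n - 1]
    # Kernel increments weighted by partial sums, for s = 1, ..., n - 1.
    for s in range(1, n):
        weight = kernel((t - s) / (n * b)) - kernel((t - s - 1) / (n * b))
        total += weight * partial[s - 1]
    return total


rng = random.Random(0)
f = [rng.gauss(0.0, 1.0) for _ in range(200)]  # stand-in for f_s(theta_0)
b = 0.2  # illustrative bandwidth ratio
max_gap = max(
    abs(direct_sum(f, t, b) - by_parts_sum(f, t, b)) for t in range(1, 201)
)
print(max_gap)  # expected to be at floating-point rounding level
```

Writing the sum in terms of partial sums is what lets the functional central limit theorem for $n^{-1/2} \sum_{k \le ns} f_k(\theta_0)$ be applied directly in the next step.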

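By the continuous mapping theorem and Itô's formula, we obtain
\[
\sqrt{n} f_{tn}(\theta_0) \to_d \Lambda \left\{ \frac{1}{b} K\left(\frac{r-1}{b}\right) W_k(1) + \frac{1}{b^2} \int_0^1 K'\left(\frac{r-s}{b}\right) W_k(s)\,ds \right\} =^d \Lambda D_k(r; b)/b,
\]
for $t = \lfloor nr \rfloor$ with $r \in [0, 1]$. Finally, by the continuous mapping theorem, we get
\[
\begin{aligned}
elr(\theta_0) &= \frac{2}{b} \max_{\tilde\lambda \in \mathbb{R}^k} \sum_{t=1}^{n} \log\big(1 + \tilde\lambda' \sqrt{n}\, b\, \Lambda^{-1} f_{tn}(\theta_0)\big)/n, \qquad \tilde\lambda = \Lambda' \lambda / (\sqrt{n}\, b), \\
&\to_d U_{el,k}(b; K) := \frac{2}{b} \max_{\lambda \in \mathbb{R}^k} \int_0^1 \log\big(1 + \lambda' D_k(r; b)\big)\,dr.
\end{aligned} \tag{S1.2}
\]

Remark S1.1. For ET and CUE, we have $I = \mathbb{R}$. Given any $g \in Q$ with $G_{gel}(g) < \infty$, we have $H_g = \{\lambda \in \mathbb{R}^k : \lambda' g(t) \in I, \text{ for all } t \in [0, 1]\} = \mathbb{R}^k$ and $\|\lambda_g\| < \infty$. Therefore, $\lambda_g$ is an interior point of $H_g$ and the arguments in Case 1 can be applied to show the continuity of $G_{gel}(\cdot)$ at $g$.

References

Rockafellar, R. T. (1970). Convex Analysis. Princeton Univ. Press.

Seijo, E. and Sen, B. (2011). A continuous mapping theorem for the smallest argmax functional. Electron. J. Stat. 5, 421-439.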
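As a numerical companion to the functional $G_{el}$ studied in Section S1, the Python sketch below evaluates a grid approximation of $G_{el}(g)$ in the scalar case $k = 1$. It relies on two facts from the proof: the objective is concave in $\lambda$, and $G_{el}(g)$ is finite only when the feasible set $H_g$ is bounded, which for scalar $g$ means that $g$ takes both signs. The function name `g_el_scalar`, the bisection on the derivative, the tolerance and the example path are assumptions made for illustration, not the authors' procedure.

```python
import math


def g_el_scalar(g_values, tol=1e-12):
    """Grid approximation of G_el(g) = max_lambda mean(log(1 + lambda * g)), k = 1.

    Returns (value, lambda_hat). The value is math.inf when g keeps one sign on
    the grid, because the feasible set H_g is then unbounded.
    """
    g_max, g_min = max(g_values), min(g_values)
    if g_max == 0.0 and g_min == 0.0:
        raise ValueError("g is identically zero on the grid, so it is not in Q")
    if g_min >= 0.0 or g_max <= 0.0:
        return math.inf, None

    def score(lam):
        # Derivative of the concave objective; strictly decreasing in lam.
        return sum(x / (1.0 + lam * x) for x in g_values) / len(g_values)

    # Interior of H_g on the grid: 1 + lam * g > 0 at every grid point.
    lo, hi = -1.0 / g_max, -1.0 / g_min
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:
            break  # interval cannot be split further in floating point
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    lam_hat = 0.5 * (lo + hi)
    value = sum(math.log1p(lam_hat * x) for x in g_values) / len(g_values)
    return value, lam_hat


# Example path g(t) = t - 0.3 on a midpoint grid of [0, 1] (hypothetical input).
grid = [(i + 0.5) / 1000 - 0.3 for i in range(1000)]
value, lam_hat = g_el_scalar(grid)
print(value, lam_hat)
```

Under (S1.2), applying such a routine to a discretized path of $D_1(\cdot; b)$ and multiplying the value by $2/b$ would give one approximate draw from $U_{el,1}(b; K)$; simulating the path itself is left out of this sketch.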