AN EFFICIENT ESTIMATOR FOR THE EXPECTATION OF A BOUNDED FUNCTION UNDER THE RESIDUAL DISTRIBUTION OF AN AUTOREGRESSIVE PROCESS


Ann. Inst. Statist. Math. Vol. 46, No. 2, 309-315 (1994)

W. WEFELMEYER
Mathematical Institute, University of Cologne, Weyertal 86, 50931 Cologne, Germany

(Received December 18, 1992; revised August 18, 1993)

Abstract. Consider a stationary first-order autoregressive process, with i.i.d. residuals following an unknown mean zero distribution. The customary estimator for the expectation of a bounded function under the residual distribution is the empirical estimator based on the estimated residuals. We show that this estimator is not efficient, and construct a simple efficient estimator. It is adaptive with respect to the autoregression parameter.

Key words and phrases: Autoregressive model, efficient estimator, empirical estimator, residual distribution.

1. Introduction

Let Q be a distribution on the real line with zero mean and positive and finite variance. We observe realizations X_0, ..., X_n from a stationary autoregressive process X_i = θX_{i-1} + ε_i, where the ε_i are i.i.d. with unknown distribution Q. A simple, but inefficient, estimator for θ is the least squares estimator

    θ̂_n = Σ_{i=1}^n X_{i-1}X_i / Σ_{i=1}^n X_{i-1}².

The unobserved residuals ε_i are estimated by ε̂_{ni} = X_i − θ̂_n X_{i-1}. A natural estimator for the expectation Ef(ε) of a bounded function f is the empirical estimator based on the estimated residuals,

    Ê_n f = n^{-1} Σ_{i=1}^n f(ε̂_{ni}).

For fixed t ∈ ℝ and f(x) = 1(x ≤ t) we obtain an estimator for the residual distribution function F(t) = Q(−∞, t],

    F̂_n(t) = n^{-1} Σ_{i=1}^n 1(ε̂_{ni} ≤ t).
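The estimators above are straightforward to compute. The following sketch (an illustration, not from the paper) simulates a stationary AR(1) path and forms the least squares estimator, the estimated residuals, and the empirical estimator Ê_n f; the choices θ = 0.5, standard normal Q, and f(x) = 1(x ≤ 0) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n = 0.5, 2000                        # true parameter and sample size (assumptions)

# Simulate a stationary AR(1) path X_i = theta * X_{i-1} + eps_i, with a burn-in.
eps = rng.standard_normal(n + 100)          # residuals from Q (here N(0,1); mean zero)
X = np.zeros(n + 100)
for i in range(1, n + 100):
    X[i] = theta * X[i - 1] + eps[i]
X = X[99:]                                  # keep X_0, ..., X_n after burn-in

# Least squares estimator: theta_hat = sum X_{i-1} X_i / sum X_{i-1}^2.
theta_hat = np.dot(X[:-1], X[1:]) / np.dot(X[:-1], X[:-1])

# Estimated residuals eps_hat_{ni} = X_i - theta_hat * X_{i-1}.
res = X[1:] - theta_hat * X[:-1]

# Empirical estimator of E f(eps); with f(x) = 1(x <= t) this is F_hat_n(t).
f = lambda x: (x <= 0.0).astype(float)
E_n_f = f(res).mean()                       # estimates F(0) = 1/2 under N(0,1)
```

With n = 2000 both θ̂_n and Ê_n f land close to their targets (0.5 in both cases here).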

The estimator F̂_n is well studied. For the stationary case, with |θ| < 1, a functional central limit theorem is proved by Boldin (1982, 1983). For moving average models see Boldin (1989). He requires a uniformly bounded second derivative of F. His result is generalized by Kreiss (1991) to linear processes with coefficients depending on a finite-dimensional parameter. Koul and Leventhal (1989) treat the explosive autoregressive process, with |θ| > 1. They get by with a uniformly bounded first derivative of F. For related results under weak smoothness assumptions on F we refer to Koul (1991, 1992).

Here we restrict attention to the stationary case, with |θ| < 1. First we show that Ê_n f is not efficient. This is not due to the fact that an inefficient estimator, the least squares estimator, is used for θ. Rather, the reason is that Ê_n f ignores that Q has mean zero. The estimator Ê_n f can be improved by adding noise,

    E*_n f = Ê_n f − â_n n^{-1} Σ_{i=1}^n ε̂_{ni}  with  â_n = Σ_{i=1}^n ε̂_{ni}f(ε̂_{ni}) / Σ_{i=1}^n ε̂_{ni}².

Here â_n estimates a = Eεf(ε)/Eε², and n^{-1} Σ ε̂_{ni} estimates zero and may be considered as noise. The asymptotic distribution of Ê_n f has variance E(f(ε) − Ef(ε))². The asymptotic distribution of E*_n f has variance E(f(ε) − Ef(ε))² − (Eεf(ε))²/Eε². This is strictly smaller than E(f(ε) − Ef(ε))² unless ε and f(ε) are uncorrelated. In particular, an improved estimator for the distribution function F(t) is

    F*_n(t) = F̂_n(t) − â_n(t) n^{-1} Σ_{i=1}^n ε̂_{ni}  with  â_n(t) = Σ_{i=1}^n ε̂_{ni}1(ε̂_{ni} ≤ t) / Σ_{i=1}^n ε̂_{ni}².

It is easy to check that E*_n f is asymptotically optimal among estimators of the form Ê_n f − b̂_n n^{-1} Σ_{i=1}^n ε̂_{ni} with b̂_n a consistent estimator of some functional of Q. In Section 2 we prove, for smooth f, that E*_n f is efficient in the much larger class of regular estimators.

If θ is known, observing the X_i is equivalent to observing the ε_i. Moreover, one can replace the estimated residuals ε̂_{ni} by the true residuals ε_i. The problem then reduces to estimating Ef(ε) efficiently on the basis of i.i.d. observations ε_i with Eε = 0.
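The correction term is equally simple to implement. Below is a minimal sketch (an illustration, not the paper's code) of E*_n f and of the improved distribution function estimator F*_n(t), taking the estimated residuals as input; the function names are hypothetical.

```python
import numpy as np

def improved_estimator(res, f):
    """E*_n f = E_hat_n f - a_n * mean(res): the empirical estimator of E f(eps)
    corrected by the 'noise' term, using that Q has mean zero.

    res : array of estimated residuals eps_hat_{ni}
    f   : bounded function, vectorized
    """
    fr = f(res)
    a_n = np.dot(res, fr) / np.dot(res, res)    # estimates a = E eps f(eps) / E eps^2
    return fr.mean() - a_n * res.mean()

def improved_cdf(res, t):
    """F*_n(t): the improved estimator of the residual distribution function."""
    return improved_estimator(res, lambda x: (x <= t).astype(float))
```

For instance, with mean-zero residuals, improved_cdf(res, 0.0) estimates F(0) just as F̂_n(0) does, but with smaller asymptotic variance whenever ε and 1(ε ≤ 0) are correlated.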
It can be solved with the methods described in Pfanzagl and Wefelmeyer (1982) and Bickel et al. (1993). The main technical difficulty in the present setting lies in showing that observing the ε̂_{ni} is asymptotically equivalent to observing the ε_i. Boldin (1982, 1983, 1989), Koul and Leventhal (1989), Koul (1991, 1992) and Kreiss (1991) consider the residual distribution function, i.e. step functions

f(x) = 1(x ≤ t), and need smoothness of the distribution function. We consider smooth functions f and get by without smoothness of the distribution function. This version is appropriate for the application we have in mind; see Wefelmeyer (1992).

Two questions remain open. Under which conditions is our improved estimator F*_n(t) efficient? Over which classes of functions f does one obtain a functional version of our result?

The present result is an example for efficient estimation in models with certain restrictions. Here the restriction is zero mean of the residual distribution. Such a restriction is also common in regression models. We note that E*_n f is not a one-step improvement of Ê_n f as discussed, e.g., by Bickel (1982) and Schick (1987) for the i.i.d. case. For a general theory of one-step estimators under restrictions in the i.i.d. case see Koshevnik (1992).

2. Result

First we determine an asymptotic variance bound for estimators of Ef(ε) when θ is known. Then we find an estimator which attains the bound for all θ. Such an estimator is called adaptive. Its existence implies that the variance bound for known θ equals the variance bound for unknown θ. We expect adaptivity to be symmetric in θ and Q. In fact, there exist estimators for θ which attain, for all Q, the asymptotic variance bound for the model in which θ varies and Q is known; see Kreiss (1987).

Fix θ with |θ| < 1 and a distribution Q on the real line with zero mean and positive and finite variance. For known θ, a local model around Q is introduced as follows. Set

    H = {h: ℝ → ℝ bounded, Eh(ε) = 0, Eεh(ε) = 0}.

For h ∈ H define Q_{nh} by

    Q_{nh}(dx) = (1 + n^{-1/2}h(x))Q(dx).

Then Q_{nh} is a probability measure for large n, and the residual has again mean zero under Q_{nh},

    E_{nh}ε = Eε + n^{-1/2}Eεh(ε) = 0.

Let P_n denote the joint distribution of X_0, ..., X_n if Q is true. Define P_{nh} accordingly.
It is easy to check local asymptotic normality,

    log dP_{nh}/dP_n = n^{-1/2} Σ_{i=1}^n h(X_i − θX_{i-1}) − (1/2)Eh(ε)² + o_{P_n}(1),

and

    n^{-1/2} Σ_{i=1}^n h(X_i − θX_{i-1}) ⇒ N_h  under P_n,

where N_h is normal with mean zero and variance Eh(ε)². For details see Huang (1986) or Kreiss (1987).
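As a quick numerical sanity check (illustrative only, taking Q to be N(0,1)), one can verify on a grid that the perturbed measure Q_{nh} integrates to one and keeps mean zero for an h satisfying both constraints. Here h(x) = cos(2x) − e^{-2} is such a function, since E cos(2ε) = e^{-2} and Eε cos(2ε) = 0 by symmetry; the choice of h and of n = 100 is arbitrary.

```python
import numpy as np

n = 100                                            # local-model index (illustrative)
x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
phi = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)       # density of Q = N(0,1)

# h is bounded with E h(eps) = 0 and E eps h(eps) = 0, so h lies in H.
h = np.cos(2 * x) - np.exp(-2.0)

# Density of the perturbed measure Q_nh(dx) = (1 + n^{-1/2} h(x)) Q(dx).
q_nh = (1 + n ** -0.5 * h) * phi

total_mass = (q_nh * dx).sum()                     # should be close to 1
mean = (x * q_nh * dx).sum()                       # should be close to 0
nonneg = (1 + n ** -0.5 * h).min() > 0.0           # density is nonnegative here
```

The nonnegativity check reflects the statement that Q_{nh} is a probability measure only for large n: the perturbation n^{-1/2}h must not push the density below zero.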

Now we describe an asymptotic variance bound for estimators of the expectation Ef(ε) and characterize efficient estimators. The norm (Eh(ε)²)^{1/2} in local asymptotic normality provides an inner product Eh(ε)k(ε) on L₂(Q). We can express differences E_{nh}f(ε) − Ef(ε) in terms of this inner product,

    n^{1/2}(E_{nh}f(ε) − Ef(ε)) = Eh(ε)f(ε).

The right side is a linear functional on H; the function f is a gradient. Let H̄ denote the closure of H in L₂(Q). The gradient f may be replaced by its projection into H̄, the canonical gradient. We show that the latter is given by the function

(2.1)    g(x) = f(x) − Ef(ε) − ax  with  a = Eεf(ε)/Eε².

To see this, note that g is a gradient since for h ∈ H,

    Eh(ε)g(ε) = Eh(ε)f(ε) − Eh(ε)Ef(ε) − aEεh(ε) = Eh(ε)f(ε),

and that g ∈ H̄ since

    Eg(ε) = Ef(ε) − Ef(ε) − aEε = 0,
    Eεg(ε) = Eεf(ε) − EεEf(ε) − aEε² = 0.

An estimator T_n for Ef(ε) is called regular at Q with limit L if, for h ∈ H,

    n^{1/2}(T_n − E_{nh}f(ε)) ⇒ L  under P_{nh}.

By the convolution theorem, an asymptotic variance bound for such an estimator is given by the squared length of the canonical gradient. More precisely, L = M + N in distribution, where M is independent of N, and N is normal with mean zero and variance equal to

    Eg(ε)² = E(f(ε) − Ef(ε))² − (Eεf(ε))²/Eε².

The convolution theorem justifies calling an estimator efficient for Ef(ε) at Q if its limit distribution under P_n is N. It is well known that an estimator T_n for Ef(ε) is regular and efficient at Q if and only if it is asymptotically linear with influence function equal to the canonical gradient,

(2.2)    n^{1/2}(T_n − Ef(ε)) = n^{-1/2} Σ_{i=1}^n g(X_i − θX_{i-1}) + o_{P_n}(1).

We use this characterization to prove our main result. A convenient reference for the characterization is Greenwood and Wefelmeyer (1990).

A different local model is introduced in Huang (1986) and Kreiss (1987). It does not take into account that the residuals have mean zero. Hence the condition

Eεh(ε) = 0 is missing in the definition of H, the relation E_{nh}ε = 0 does not hold, and the local model is not contained in the given model. With this local model the canonical gradient is f(x) − Ef(ε), and the empirical estimator Ê_n f would be declared efficient. We have seen that it can be improved. The mistake is harmless for estimating θ. In fact, because of adaptivity, it suffices then to consider local models with Q fixed and θ varying. Local asymptotic normality for such models is proved in Akahira (1976) and Akritas and Johnson (1982).

THEOREM 2.1. Let f be bounded with uniformly continuous and bounded derivative. Then the estimator

    E*_n f = Ê_n f − â_n n^{-1} Σ_{i=1}^n ε̂_{ni}  with  â_n = Σ_{i=1}^n ε̂_{ni}f(ε̂_{ni}) / Σ_{i=1}^n ε̂_{ni}²

is regular and efficient for Ef(ε) at Q.

3. Proof

In this section, all sums extend over i from 1 to n, as before. We must show that (2.2), with g given in (2.1), holds for T_n = E*_n f. This follows if we prove

(3.1)    n^{1/2}(Ê_n f − Ef(ε)) = n^{-1/2} Σ(f(ε_i) − Ef(ε)) + o_{P_n}(1),

(3.2)    n^{-1/2} Σ ε̂_{ni} = n^{-1/2} Σ ε_i + o_{P_n}(1),

(3.3)    â_n = a + o_{P_n}(1).

Fix δ > 0. Since θ̂_n is n^{1/2}-consistent, there exists c > 0 such that

    P_n{|θ̂_n − θ| > cn^{-1/2}} ≤ δ.

Since X_i is square-integrable,

    P_n{max_{0≤i≤n} |X_i| > δc^{-1}n^{1/2}} → 0.

Hence we may restrict attention to X_0, ..., X_n with

    |θ̂_n − θ| ≤ cn^{-1/2},  |X_i| ≤ δc^{-1}n^{1/2}  (i = 0, ..., n).

Note that then |ε̂_{ni} − ε_i| = |θ̂_n − θ||X_{i-1}| ≤ δ for all i.

(i) To prove (3.1), choose ε̄_{ni} between ε_i and ε̂_{ni} such that

(3.4)    f(ε̂_{ni}) = f(ε_i) + (ε̂_{ni} − ε_i)f'(ε̄_{ni}).

Write

    n^{1/2}(Ê_n f − Ef(ε)) = n^{-1/2} Σ(f(ε_i) − Ef(ε)) − n^{1/2}(θ̂_n − θ)n^{-1} Σ X_{i-1}f'(ε̄_{ni}).

The least squares estimator is n^{1/2}-consistent, n^{1/2}(θ̂_n − θ) = O_{P_n}(1). Write

    n^{-1} Σ X_{i-1}f'(ε̄_{ni}) = n^{-1} Σ X_{i-1}f'(ε_i) + n^{-1} Σ X_{i-1}(f'(ε̄_{ni}) − f'(ε_i)).

Relation (3.1) follows if this is of order o_{P_n}(1). The second right-hand term is o_{P_n}(1) since f' is uniformly continuous, |ε̄_{ni} − ε_i| ≤ δ with δ arbitrarily small, and n^{-1} Σ |X_{i-1}| = O_{P_n}(1) by the ergodic theorem. The first right-hand term is o_{P_n}(1) by the ergodic theorem since

    EX_0f'(X_1 − θX_0) = EX_0Ef'(ε) = 0.

(ii) To prove (3.2), write

    n^{-1/2} Σ(ε̂_{ni} − ε_i) = −n^{1/2}(θ̂_n − θ)n^{-1} Σ X_{i-1}.

The least squares estimator is n^{1/2}-consistent, and the ergodic theorem and EX_0 = 0 imply n^{-1} Σ X_{i-1} = o_{P_n}(1). Hence (3.2) holds.

(iii) To prove (3.3), note that â_n is a ratio of two averages. Consider first the numerator. Write

(3.5)    n^{-1} Σ(ε̂_{ni}f(ε̂_{ni}) − ε_if(ε_i)) = n^{-1} Σ(ε̂_{ni} − ε_i)f(ε̂_{ni}) + n^{-1} Σ ε_i(f(ε̂_{ni}) − f(ε_i)).

The function f is bounded, say by b. Hence the first right-hand term in (3.5) is bounded by

    b|θ̂_n − θ|n^{-1} Σ |X_{i-1}|.

This is of order o_{P_n}(1) since θ̂_n is consistent and n^{-1} Σ |X_{i-1}| = E|X_0| + o_{P_n}(1) by the ergodic theorem. Use (3.4) to write the second right-hand term in (3.5) as

    −(θ̂_n − θ)n^{-1} Σ ε_iX_{i-1}f'(ε̄_{ni}).

The function f' is bounded, say by b. Hence the second right-hand term in (3.5) is bounded by δbn^{-1} Σ |ε_i| and is therefore of order o_{P_n}(1) by the ergodic theorem since δ is arbitrarily small. We obtain that (3.5) is o_{P_n}(1), and by the ergodic theorem,

(3.6)    n^{-1} Σ ε̂_{ni}f(ε̂_{ni}) = n^{-1} Σ ε_if(ε_i) + o_{P_n}(1) = Eεf(ε) + o_{P_n}(1).

Consider now the denominator of â_n. Write

    n^{-1} Σ(ε̂²_{ni} − ε²_i) = n^{-1} Σ(ε̂_{ni} − ε_i)(ε̂_{ni} + ε_i) = −(θ̂_n − θ)n^{-1} Σ X_{i-1}(2X_i − (θ̂_n + θ)X_{i-1}).

This is bounded by δn^{-1} Σ(2|X_i| + (|θ̂_n| + |θ|)|X_{i-1}|) and hence is of order o_{P_n}(1) by the ergodic theorem since δ is arbitrarily small. Again by the ergodic theorem,

(3.7)    n^{-1} Σ ε̂²_{ni} = n^{-1} Σ ε²_i + o_{P_n}(1) = Eε² + o_{P_n}(1).

Since Eε² is positive, relations (3.6) and (3.7) imply (3.3), and we are done.

Acknowledgements

I would like to thank the referees for several corrections and additional references.

REFERENCES

Akahira, M. (1976). On the asymptotic efficiency of estimators in an autoregressive process, Ann. Inst. Statist. Math., 28, 35-48.

Akritas, M. G. and Johnson, R. A. (1982). Efficiencies of tests and estimators for p-order autoregressive processes when the error distribution is nonnormal, Ann. Inst. Statist. Math., 34, 579-589.

Bickel, P. J. (1982). On adaptive estimation, Ann. Statist., 10, 647-671.

Bickel, P. J., Klaassen, C. A. J., Ritov, Y. and Wellner, J. A. (1993). Efficient and Adaptive Estimation in Semiparametric Models, Johns Hopkins University Press, Baltimore.

Boldin, M. V. (1982). Estimation of the distribution of noise in an autoregression scheme, Theory Probab. Appl., 27, 866-871.

Boldin, M. V. (1983). Testing hypotheses in autoregression schemes by the Kolmogorov and ω² criteria, Soviet Math. Dokl., 28, 550-553.

Boldin, M. V. (1989). On testing hypotheses in the sliding average scheme by the Kolmogorov-Smirnov and ω² tests, Theory Probab. Appl., 34, 699-704.

Greenwood, P. E. and Wefelmeyer, W. (1990). Efficiency of estimators for partially specified filtered models, Stochastic Process. Appl., 36, 353-370.

Huang, W.-M. (1986). A characterization of limiting distributions of estimators in an autoregressive process, Ann. Inst. Statist. Math., 38, 137-144.

Koshevnik, Y. (1992). Efficient estimation for restricted nonparametric models, preprint.

Koul, H. L. (1991).
A weak convergence result useful in robust autoregression, J. Statist. Plann. Inference, 29, 291-308.

Koul, H. L. (1992). Weighted Empiricals and Linear Models, IMS Lecture Notes-Monograph Series, Vol. 21, Hayward, California.

Koul, H. L. and Leventhal, S. (1989). Weak convergence of the residual empirical process in explosive autoregression, Ann. Statist., 17, 1784-1794.

Kreiss, J.-P. (1987). On adaptive estimation in autoregressive models when there are nuisance parameters, Statist. Decisions, 5, 59-76.

Kreiss, J.-P. (1991). Estimation of the distribution function of noise in stationary processes, Metrika, 38, 285-297.

Pfanzagl, J. and Wefelmeyer, W. (1982). Contributions to a General Asymptotic Statistical Theory, Lecture Notes in Statistics, 13, Springer, New York.

Schick, A. (1987). A note on the construction of asymptotically linear estimators, J. Statist. Plann. Inference, 16, 89-105 (correction: ibid. (1989). 22, 269-270).

Wefelmeyer, W. (1992). Quasi-likelihood models and optimal inference, preprint.