Markov Chain Monte Carlo confidence intervals

Similar documents
Markov Chain Monte Carlo confidence intervals

Convergence of random variables. (telegram style notes) P.J.C. Spreij

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Math Solutions to homework 6

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013

On the convergence rates of Gladyshev s Hurst index estimator

Sequences and Series of Functions

Entropy Rates and Asymptotic Equipartition

17. Joint distributions of extreme order statistics Lehmann 5.1; Ferguson 15

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014.

Distribution of Random Samples & Limit theorems

Lecture 2: Monte Carlo Simulation

7.1 Convergence of sequences of random variables

Lecture 3 The Lebesgue Integral

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4.

4. Partial Sums and the Central Limit Theorem

Chapter 6 Infinite Series

Asymptotic distribution of products of sums of independent random variables

1 Introduction to reducing variance in Monte Carlo simulations

Random Walks on Discrete and Continuous Circles. by Jeffrey S. Rosenthal School of Mathematics, University of Minnesota, Minneapolis, MN, U.S.A.

Stochastic Simulation

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013. Large Deviations for i.i.d. Random Variables

MAT1026 Calculus II Basic Convergence Tests for Series

32 estimating the cumulative distribution function

Advanced Stochastic Processes.

Lecture 19: Convergence

Ma 4121: Introduction to Lebesgue Integration Solutions to Homework Assignment 5

The Central Limit Theorem

EFFECTIVE WLLN, SLLN, AND CLT IN STATISTICAL MODELS

Chapter 7 Isoperimetric problem

LECTURE 8: ASYMPTOTICS I

MAS111 Convergence and Continuity

Kernel density estimator

Notes 19 : Martingale CLT

2 Markov Chain Monte Carlo Sampling

A Proof of Birkhoff s Ergodic Theorem

6.3 Testing Series With Positive Terms

Math 525: Lecture 5. January 18, 2018

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 3 9/11/2013. Large deviations Theory. Cramér s Theorem

7.1 Convergence of sequences of random variables

Notes 27 : Brownian motion: path properties

An Introduction to Randomized Algorithms

Output Analysis and Run-Length Control

Kolmogorov-Smirnov type Tests for Local Gaussianity in High-Frequency Data

Lecture 20: Multivariate convergence and the Central Limit Theorem

Mathematics 170B Selected HW Solutions.

A RANK STATISTIC FOR NON-PARAMETRIC K-SAMPLE AND CHANGE POINT PROBLEMS

Regression with an Evaporating Logarithmic Trend

Monte Carlo Integration

CS284A: Representations and Algorithms in Molecular Biology

Lecture 3 : Random variables and their distributions

Introduction to Extreme Value Theory Laurens de Haan, ISM Japan, Erasmus University Rotterdam, NL University of Lisbon, PT

Integrable Functions. { f n } is called a determining sequence for f. If f is integrable with respect to, then f d does exist as a finite real number

This exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam.

Achieving Stationary Distributions in Markov Chains. Monday, November 17, 2008 Rice University

(A sequence also can be thought of as the list of function values attained for a function f :ℵ X, where f (n) = x n for n 1.) x 1 x N +k x N +4 x 3

Machine Learning Theory Tübingen University, WS 2016/2017 Lecture 12

Archimedes - numbers for counting, otherwise lengths, areas, etc. Kepler - geometry for planetary motion

Self-normalized deviation inequalities with application to t-statistic

6.867 Machine learning, lecture 7 (Jaakkola) 1

A Hilbert Space Central Limit Theorem for Geometrically Ergodic Markov Chains

Modern Discrete Probability Spectral Techniques

ECE 330:541, Stochastic Signals and Systems Lecture Notes on Limit Theorems from Probability Fall 2002

Infinite Sequences and Series

ACO Comprehensive Exam 9 October 2007 Student code A. 1. Graph Theory

Lecture Notes for Analysis Class

Ma 530 Infinite Series I

Singular Continuous Measures by Michael Pejic 5/14/10

5 Birkhoff s Ergodic Theorem

Lecture 8: Convergence of transformations and law of large numbers

Measure and Measurable Functions

6. Kalman filter implementation for linear algebraic equations. Karhunen-Loeve decomposition

Application to Random Graphs

EECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1

Random Variables, Sampling and Estimation

1 Approximating Integrals using Taylor Polynomials

On forward improvement iteration for stopping problems

1 The Haar functions and the Brownian motion

We are mainly going to be concerned with power series in x, such as. (x)} converges - that is, lims N n

1 Inferential Methods for Correlation and Regression Analysis

1 1 2 = show that: over variables x and y. [2 marks] Write down necessary conditions involving first and second-order partial derivatives for ( x0, y

Solutions to home assignments (sketches)

Lecture 4. We also define the set of possible values for the random walk as the set of all x R d such that P(S n = x) > 0 for some n.

Empirical Processes: Glivenko Cantelli Theorems

Slide Set 13 Linear Model with Endogenous Regressors and the GMM estimator

IIT JAM Mathematical Statistics (MS) 2006 SECTION A

Fall 2013 MTH431/531 Real analysis Section Notes

ENGI Series Page 6-01

Math 341 Lecture #31 6.5: Power Series

Expectation and Variance of a random variable

It is always the case that unions, intersections, complements, and set differences are preserved by the inverse image of a function.

Advanced Analysis. Min Yan Department of Mathematics Hong Kong University of Science and Technology

Detailed proofs of Propositions 3.1 and 3.2

TR/46 OCTOBER THE ZEROS OF PARTIAL SUMS OF A MACLAURIN EXPANSION A. TALBOT

Notes 5 : More on the a.s. convergence of sums

Notes for Lecture 11

Lesson 10: Limits and Continuity

62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 +

Transcription:

Beroulli 3, 6, 88 838 DOI:.35/5-BEJ7 Markov Chai Mote Carlo cofidece itervals YVES F. ATCHADÉ Uiversity of Michiga, 85 South Uiversity, A Arbor, 489, MI, Uited States. E-mail: yvesa@umich.edu For a reversible ad ergodic Markov chai {X, } with ivariat distributio π, we show that a valid cofidece iterval for πh ca be costructed wheever the asymptotic variace σp h is fiite ad positive. We do ot impose ay additioal coditio o the covergece rate of the Markov chai. The cofidece iterval is derived usig the so-called fixed-b lag-widow estimator of σp h. We also derive a result that suggests that the proposed cofidece iterval procedure coverges faster tha classical cofidece iterval procedures based o the Gaussia distributio ad stadard cetral limit theorems for Markov chais. Keywords: Berry Essee bouds; cofidece iterval; lag-widow estimators; martigale approximatio; MCMC; reversible Markov chais. Itroductio Cofidece itervals play a importat role i Mote Carlo simulatio Robert ad Casella [6], Asmusse ad Gly []. I Markov Chai Mote Carlo MCMC, the existig literature requires the Markov chai to be geometrically ergodic for the validity of cofidece iterval procedures Joes et al. [5], Flegal ad Joes [8], Atchadé [3]. The mai objective of this work is to simplify some of these assumptios. We show that for a reversible ergodic Markov chai, a valid cofidece iterval ca be costructed wheever the asymptotic variace itself is fiite. No additioal covergece rate assumptio o the Markov chai is required. Let {X, } be a reversible statioary Markov chai with ivariat distributio π. For h L π, the asymptotic variace of h is deoted σp h see below for the iitio. A remarkable result by C. Kipis ad S. R. Varadha Kipis ad Varadha [9] says that if <σp h <, the σ P h hx i πh coverges weakly to N, where πh = hzπdz. I order to tur this result ito a cofidece iterval for πh, a estimator σ of σ P h is eeded. A commo practice cosists i choosig σ as a cosistet estimator of σ P h. However, cosistet estimatio of σ P h typically requires further assumptios o the covergece rate of the Markov chai typically geometric ergodicity, ad o the fuctio h. Istead of isistig o cosistecy, we cosider the so-called fixed-b approach developed by Kiefer, Vogelsag ad Buzel [8], Kiefer ad Vogelsag [7], where the proposed estimator σ is kow to be icosistet. Usig this icosistet estimator we show i Theorem. that a Studetized aalog of the Kipis Varadha s theorem holds: if <σp h <, the T = σ hx i πh coverges weakly to a o-gaussia distributio. The theorem exteds to ostatioary Markov chais that satisfy a very mild ergodicity assumptio. To 35-765 6 ISI/BS

Markov Chai Mote Carlo cofidece itervals 89 a certai extet, the result is a geeralizatio of Atchadé ad Cattaeo [4] which establishes the same limit theorem for geometrically ergodic but ot ecessarily reversible Markov chais. The result is particularly relevat for Markov chais with sub-geometric covergece rates. For such Markov chais, the author is ot aware of ay result that guaratees the asymptotic validity of cofidece itervals. However, it is importat to poit out that the fiiteess of σp h carries some implicatios i terms of covergece rate of P, ad is ot always easy to check. But the mai poit of this work is that the fiiteess of σp h is all that is eeded for cosistet cofidece iterval. As we shall see, Theorem. comes from the fact that there exists a pair of radom variables N, D, say, such that the joit process hx i πh, σ coverges weakly to σ P hn, σp hd. As a result, σ P h cacels out i the limitig distributio of T.This approach to cofidece itervals is closely related to the stadardized time series method of Schrube [9] see also Gly ad Iglehart [9], well kow i operatios research. Ideed i its simplest form, the stadardized time series method is the aalog of the fixed-b procedure usig the batch-mea estimator with a fixed umber of batches. Despite this close coectio, this paper focuses oly o the fixed-b cofidece iterval. We also compare the fixed-b lag-widow estimators with the more commoly used lagwidow estimators. We limit this compariso to the case of geometrically ergodic Markov chais. We prove i Theorem.6 that the covergece rate of the fixed-b lag-widow estimator is of order log/, better tha the fastest rate achievable by the more commoly used lag-widow estimator. Similar comparisos based o the covergece of T has bee reported elsewhere i the literature. Jasso [3] studied statioary Gaussia movig average models ad established that the rate of covergece of T is log. Su, Phillips ad Ji [3] obtaied the rate, uder the mai assumptio that the uderlyig process is Gaussia ad statioary. It seems ulikely that the covergece rate will hold without the Gaussia assumptio. However, it is uclear whether the covergece rate log/ obtaied i Theorem.6 is tight. We orgaize the paper as follows. Sectio cotais the mai results, icludig the rate of covergece of the fixed-b lag-widow estimator i Sectio.4. We preset a simulatio example to illustrate the fiite sample properties of the cofidece itervals i Sectio.5. All the mai proofs are postpoed to Sectio 3 ad the Appedix... Notatio Throughout the paper X, B deotes a measure space with a coutably geerated sigma-algebra B with a probability measure of iterest π. We deote L π the usual space of L -itegrable fuctios with respect to π, with orm ad associated ier product, ad we de- π ={f ote L π the subspace of L π of fuctios orthogoal to the costats: L L π: fxπdx = }. For a measurable fuctio f : X R, a probability measure ν o X, B ad a Markov kerel Q o X, we use the otatio: νf = fxνdx, f = f πf, Qf x = fyqx,dy, ad Q j fx = Q{Q j f }x, with Q fx= fx.forv : X [,, we ie L V as the space of all measurable real-valued fuctios f : X R s.t. f V = sup x X fx /Vx <.

8 Y.F. Atchadé For two probability measures ν,ν, we deotes ν ν tv = sup f ν f ν f, the total variatio distace betwee ν ad ν, ad ν ν V = sup {f, f V } ν f ν f, its V -orm geeralizatio. For sequeces {a,b } of real oegative umbers, the otatio a b meas that a cb for all, ad for some costat c that does ot deped o. For a radom sequece {X },we write X = O p a if the sequece X /a is bouded i probability. We say that X = o p a if X /a coverges i probability to zero as.. Mote Carlo cofidece itervals for reversible Markov chais Throughout the paper, P deotes a Markov kerel o X, B that is reversible with respect to π. This meas that for ay pair f,g L π, f,pg = g,pf. We assume that P satisfies the followig. A For π-almost all x X, lim P x, π tv =. Remark. Assumptio A is very basic. For istace, if P is φ-irreducible, ad aperiodic i additio to beig reversible with respect to π, the A holds. If i additio P is Harris recurret, the holds for all x X. IfP is a Metropolis Hastigs kerel, Harris recurrece typically follows from π-irreducibility. All these statemets ca be foud, for istace, i Tierey [3]. Throughout the sectio, uless stated otherwise, {X, } is a ostatioary Markov chai o X, B with trasitio kerel P ad started at some arbitrary but fixed poit x X for which holds. The Markov kerel P iduces i the usual way a self-adjoit operator also deoted P o the Hilbert space L π that maps h Ph. This operator P admits a spectral measure E o [, ], ad for h L π we will write μ h = h, E h for the associated oegative Borel measure o [, ]. Assumptio A implies that μ h does ot charge or, that is μ h {, } =. This is Lemma 5 of Tierey [5]... Cofidece iterval for πh Let h L π. We ie σ P + λ h = λ μ hdλ, that we call the asymptotic variace of h. The termiology comes from the fact that if the Markov chai is assumed statioary, a calculatio see, e.g., Häggström ad Rosethal [], Theorem 4

Markov Chai Mote Carlo cofidece itervals 8 usig the properties of the spectral measure μ h gives lim E [ ] hx k = σp h. 3 For ostatioary Markov chais, such as the oe cosidered i this paper, it is uclear whether 3 cotiues to hold i complete geerality. The estimatio of σp h is ofte of iterest because whe 3 holds, σp h/ approximates the mea squared error of the Mote Carlo estimate hx k.aestimateofσp h is ofte also sought i order to exploit the Kipis Varadha theorem for cofidece iterval purposes. It is kow Häggström ad Rosethal [], Theorem 4 that σp h ca also be writte as σ P h = + l= γ l h, 4 where for l, γ l h = h, P l h. This suggests the so-called lag-widow estimator of σ P h σ b = l= + l w γ, l, b 5 l where γ,l = hxj ˆπ h hx j+l ˆπ h. I the above display, ˆπ h = hx k, b is a iteger such that b, as, ad w : R R is a eve fuctio w x = wx with support [, ], that is, wx o, ad wx = for x. Sice w has support [, ], the actual rage for l i the summatio iig σb is b + l b. The lag-widow estimator σb ca be applied more broadly i time series ad the method has a log history. Some of the earlier work go back to the 95s Greader ad Roseblatt [], Parze [4]. Covergece results specific to ostatioary Markov chais have bee established recetly see, e.g., Damerdji [6], Flegal ad Joes [8], Atchadé [3] ad the refereces therei; however, uder assumptios that are much stroger tha A. It remais a ope problem whether σb ca be show to coverge to σp h assumig oly A. I particular, the author is ot aware of ay result that establishes the cosistecy of σb without assumig that P is geometrically ergodic. However, if the goal is to costruct a cofidece iterval for πh, we will ow see that it is eough to assume A ad σp h <. Cosider the lag-widow estimator obtaied by settig b =. This writes σ = l= + j= l w γ, l. 6

8 Y.F. Atchadé This estimator is well kow to be icosistet for estimatig σp h, but has recetly attracted a lot of iterest i the Ecoometrics literature uder the ame of fixed-b asymptotics Kiefer, Vogelsag ad Buzel [8], Kiefer ad Vogelsag [7], Su, Phillips ad Ji [3],seealsoNeave [] for some pioeer work. This paper takes ispiratio from this literature. However, ulike these works, we exploit the Markov structure ad we do ot impose ay statioary assumptio. We itroduce the fuctio vt = wt u du, t [, ], ad the kerel φ : [, ] [, ] R, where φs,t= ws t vs vt + vtdt, s,t [, ]. 7 We say that a kerel k : [, ] [, ] R is positive iite if for all, all a,...,a R, ad t,...,t [, ], j= a i a j kt i,t j. We will assume that the weight fuctio w i 6 is such that the followig holds. A The fuctio w : R R is a eve fuctio, with support [, ], ad of class C o,. Furthermore, the kerel φ ied i 7 is positive iite, ad ot idetically zero. Example. Assumptio A holds for the fuctio w give by wu = u, u. Ideed i this case, a simple calculatio gives that φs,t= s.5t.5, which by its multiplicative form is clearly positive iite. I this particular case, solvig φs,tutdt = αus yields the uique eigevalue α = t.5 dt = /6. A geeral approach to guaratee that φ as i 7 is positive iite is to start with a positive iite fuctio w, as the ext lemma shows. Lemma.. Suppose that the kerel [, ] [, ] R ied by s, t ws t is cotiuous ad positive iite. The φ as i 7 is also positive iite. Proof. By Mercer s theorem see Theorem A., there exist oegative umbers {λ j,j }, orthoormal fuctios ξ j : [, ] R such that wt sξ j s ds = λ j ξ j t, ad wt s = j λ j ξ j tξ j s, ad the series coverges uiformly ad absolutely. It is easy to show that oe ca iterchage itegral ad sum ad write vt = wt sds = j λ j ξ j t ξ j s ds, vtdt = wt sds dt = j λ j ξ j t dt, ad the we get φs,t= λ j ξ j t j ξ j t dt ξ j s ξ j s ds. This expressio of φ easily shows that it is positive iite.

Markov Chai Mote Carlo cofidece itervals 83 The usual approach for showig that the kerel s, t ws t is positive iite is by showig that the weight fuctio t wt is a characteristic fuctio or more geerally the Fourier trasform of a positive measure ad applyig Bocher s theorem. This approach shows that A holds for the Bartlett fuctio wx = x, x, the Parze fuctio 6x + 6 x 3, if x, wx = x 3, if x,, if x >, ad for a umber of others weight fuctios see, e.g., Haa [], pages 78 79 for details. I the case of the Bartlett fuctio, the kerel φ is give by For the Parze fuctio, we have φs,t= 3 s s t t s t. vs = 3 8 + s s s s 3 + s s 4 where a b = mia, b. ad vtdt = 3 4, Assumptio A implies that φ, cosidered as a liear operator o L [, ] φf s = φs,tftdt is self-adjoit, compact ad positive. Therefore, it has oly oegative eigevalues, ad a coutable umber of positive eigevalues. We deote {α j,j I} the set of positive eigevalues of φ each repeated accordig to its multiplicity. The idex set I {,,...} is either fiite or I ={,,...}. We itroduce the radom variable T w ied as T w = Z i I α iz i where {Z,Z i,i I} i.i.d. N,. Here is the mai result. Theorem.. Assume A A, ad h L π. If <σp h <, the as, σ where {Z i,i I} i.i.d. N,. w σp h α i Z i ad T = σ i I hxk πh w Tw, Proof. See Sectio 3.. The theorem implies that the cofidece iterval ˆπ h ± t α/ σ, 8

84 Y.F. Atchadé is a asymptotically valid Mote Carlo cofidece iterval for πh, where t α/ is the α/-quatile of the distributio of T w. These quatiles are itractable i geeral but ca be easily approximated by Mote Carlo simulatio see Sectio.3. TheassumptiothatσP h is fiite ca be difficult to check. Whe P is kow to satisfy a drift coditio, oe ca fid whole class of fuctios for which the asymptotic variace is fiite, as the followig propositio shows. The propositio uses Markov chai cocepts that have ot bee ied above, ad we refer the reader to Mey ad Tweedie [] for details. Propositio.3. Suppose that P is φ-irreducible ad aperiodic, with ivariat distributio π. Suppose also that there exist measurable fuctios V,f : X [,, costat b<, ad some petite set C B such that If πfv <, the for all h L f, σp h <. PVx Vx fx+ b C x, x X. 9 Proof. This is a well-kow result. We give the proof oly for completeess. Without ay loss of geerality, suppose that πh =. We recall that σp h = πh + j h, P j h. Sice h, P j h hx P j hx πdx, we obtai h, P j h h f hx { P j x, π f }πdx. j Sice P is φ-irreducible ad aperiodic, ad uder the drift coditio 9, Mey ad Tweedie [], Theorem 4.. implies that there exists a fiite costat B such that j P j x, π f BVx, x X. We coclude that σp h B h f hx Vxπdx B h f fxvxπdx <. Remark. Propositio.3 has a umber of well-kow special cases. The most commo case is whe f = λv for some λ,, i which case P is geometrically ergodic ad σp h < for all h L V /. Aother importat special case is f = V α,forsomeα [,. Such drift coditio implies that the Markov chai coverges at a polyomial rate. If α.5, the Propositio.3 implies that σp h < for all h L V α.5. To see this, otice that 9 with f = V α, ad Jarer ad Roberts [4], Lemma 3.5 imply that PV / V / cv α / + b C. Sice πv α <, the claim follows from Propositio.3. j.. Example: Metropolis Adjusted Lagevi Algorithm for smooth desities We give aother example where it is possible to check that σ P h < without geometric ergodicity. Take X = R d equipped with the usual Euclidea ier product,, orm, ad the Lebesgue measure deoted dx. We cosider a probability measure π that has a desity with

Markov Chai Mote Carlo cofidece itervals 85 respect to the Lebesgue measure, ad i a slight abuse of otatio we use the same symbol to represet π ad its desity: πx = e ux /Z, for some fuctio u : X R that we assume is differetiable, with gradiet u. Let q σ x, deotes the desity of the Gaussia distributio Nx σ ρx ux, σ I d, where the term ρx is used to modulate the drift σ ux, ad σ>is a scalig costat. We cosider the Metropolis Hastigs algorithm that geerates a Markov chai {X, } with ivariat distributio π as follows. Give X = x, we propose Y q σ x,. We either accept Y ad set X + = Y with probability αx,y, or we reject Y ad set X + = x, where αx,y = mi, πy q σ y, x. πx q σ x, y Whe ρx =, we get the Radom Walk Metropolis RWM, ad whe ρx =, we get the Metropolis Adjusted Lagevi Algorithm MaLa. However, we are maily iterested i the case where ρx = τ maxτ, ux, x X for some give costat τ>, which correspods to the trucated MaLa proposed by Roberts ad Tweedie [8]. The trucated MaLa combies the stability of the RWM ad the mixig of the MaLa. It is kow to be geometrically ergodic wheever RWM is geometrically ergodic Atchadé []. However, checkig i practice that the trucated MaLa is geometrically ergodic ca be difficult, as this ivolves checkig coditios o the curvature of the log-desity. We show i the ext result that if the gradiet of the log-desity u is Lipschitz ad ubouded the P satisfies a drift coditio of the type 9, ad σp h is guarateed to be fiite for certai fuctios. B Suppose that u is bouded from below, cotiuously differetiable, ad u is Lipschitz, ad lim sup ux =+. x Theorem.4. Assume B ad. Set Vx = a + ux, where a R is chose such that V. The there exist b,r, such that PVx Vx σ 4 ρx ux + b{ x r} x, x X. I particular, if ux ux e ux dx<, the σ P h < for all h L f, where fx= ρx ux. Proof. See Sectio 3.. Remark 3. This result ca be useful i cotexts where the log-desity u is kow to have a Lipschitz gradiet, but is too complicated to allow a easy verificatio of the geometric ergodicity coditios.

86 Y.F. Atchadé.3. O the distributio of the radom variable T w It is clear that the limitig distributio T w used for costructig the cofidece iterval 8 depedso thechoiceof w. More research is eededto explaihow to bestchoose w i this regard. But from the limited simulatios doe i this paper, we foud that weight fuctios w with large characteristic expoets lead to heavy-tailed limitig distributios T w, ad wider cofidece itervals. The characteristic expoet of a weight fuctio w is the largest umber r> such that lim u u r wu,. Overall, we recommed the use of the Bartlett weight fuctio wu = u, u, which has characteristic expoet, ad has behaved very well i the simulatios coducted. Aother issue is how to compute the quatiles of T w. As ied, the distributio of T w is itractable i geeral, as it requires kowig the eigevalues of φ. But the ext result gives a straightforward method for approximate simulatio from T w. Propositio.5. Let {Z j, j N} be i.i.d. stadard ormal radom variables. The T N w = Nj= Z j N Nj= φ i N, j N Z iz j w T w as N. Remark 4. As poited out by a referee, oe ca also approximately sample from T w by geeratig X :N N,, ad compute T N, with hx = x. The approach i Propositio.5 is i.i.d. similar, but replaces σn by ˇσ N as ied i 7. By Lemma 3.4, the two approaches are essetially equivalet. Proof of Propositio.5. Let {,α j,j I} be the eigevalues of φ, with associated eigefuctios {, j,j I}. By Mercer s theorem see Theorem 4 i the Appedix, N N α j Hece, i φ N, k Z i Z k = N N j I N N i j Z i. N T N w = / N N i /NZ i j I α j / N. N j i /NZ i It is a applicatio of Lemma 3.3 that as N, { N N i N Z i, N N j i N Z i, j I} coverges weakly to {Z,Z j,j I}. The result the follows from the cotiuous mappig theorem. We use Propositio.5 to approximately simulate T w for the fuctio wu = u, u, ad for the Bartlett ad Parze fuctios. Table reports the 95% ad 97.5%

Markov Chai Mote Carlo cofidece itervals 87 Table. Approximatios of t such that PT w >t= α/ α = % α = 5% wu = u + 5.49.6 3..9 Parze 4.. 5.64 Bartlett 3.77.5 4.78. quatiles, computed based o idepedet samples of T N w, with N = 3. We replicate these estimates 5 times to evaluate the Mote Carlo errors reported i parethesis. As explaied i Example, i the case wu = u, u, T w = 6T, where T ν deotes the studet s distributio with ν degree of freedom; thus, is this case we ca compute accurately the quatiles. I particular, the 95% ad 97.5% quatiles are 5.465 ad 3.3, respectively..4. Rate of covergece of σ A iterestig questio is uderstadig how the lag-widow estimators σ ad σ b compare. O oe had, the asymptotic behavior of σb is better uderstood. I the statioary case, the best rate of covergece of σb towards σp h is q/+q see, e.g., Parze [4], Theorem 5A B, where q is the largest umber q,r] such that j j q γ j h <, where γ j h = h, P j h, ad r is the characteristic expoet of w. This optimal rate is achieved by choosig b /+q. Hece, the optimal rate i the case of a geometrically ergodic Markov chai is r/+r. However, it is well documeted see, e.g., Newey ad West [3] that the fiite sample properties of σb are very sesitive to the actual costat i b /+q, ad some tuig is ofte required i practice. O the other had, the fixed-b framework has the advatage that it requires o tuig, sice b =. Furthermore, we establish i this sectio that σ has a better covergece rate. Reversibility plays o role i this discussio. We further simplify the aalysis by assumig that P satisfies a geometric ergodicity assumptio: G There exists a measurable fuctio V : X [, such that πv <, ad for all β, ], P x, π V β Cρ V β x,,x X. Deote Lip R the set of all bouded Lipschitz fuctios f : R R such that fx fy f Lip = sup. x y x y For P,Q two probability measures o R, we ie d P, Q = sup f dp f dq. f Lip R

88 Y.F. Atchadé d P, Q is the Wasserstei metric betwee P,Q. A upper boud o d P,Pgives a Berry Essee-type boud o the rate of weak covergece of P to P. I a slight abuse of otatio, if X, Y are radom variables, ad X P ad Y Q, we shall also write d X, Y to mea d P, Q. Theorem.6. Suppose that A ad G hold. Suppose also that I is fiite. For δ [, /4, let h L V δ be such that πh =, ad σp h =. The d σ,χ log as, 3 where χ = i I α iz i, {Z i,i I} are i.i.d. N,, ad {α i,i I} is the set of positive eigevalues of φ. Proof. See Sectio 3.3. Remark 5. The assumptio that I is fiite is mostly techical ad it seems plausible that this result cotiues to hold without that assumptio. For example, I is fiite for the kerel wu = u, u..5. A simulatio example This sectio illustrates the fiite sample behavior of the fixed-b cofidece iterval procedure. We will compare the fixed-b procedure ad the stadard cofidece iterval procedure based o σb usig a Gaussia limit. As example, we cosider the posterior distributio of a logistic regressio model, ad use the Radom Walk Metropolis algorithm Robert ad Casella [6]. Let X = = R d equipped with its Borel sigma-algebra, ad π be absolutely cotiuous w.r.t. the Lebesgue measure dθ with desity still deoted by π. We write θ for the Euclidea orm of θ. Letq deotes the desity of the ormal distributio N, o with covariace matrix. The Radom Walk Metropolis algorithm RWMA is a popular MCMC algorithm that geerates a Markov chai with ivariat distributio π ad trasitio kerel give by P θ, A = A θ + αθ,θ + z A θ + z A θ q z dz, θ,a B, X where A deotes the idicator fuctio, ad αθ,ϑ = mi, πϑ πθ is the acceptace probability. We assume that π is the posterior distributio from a logistic regressio model. More precisely, we assume that we have biary resposes y i {, }, where y i B p x i θ, i =,...,, ad x i R d is a vector of covariate, ad θ R d is the vector of parameter. Bp deotes the Beroulli distributio with parameter p,, ad px = +e ex x is the cdf of the logistic distributio. Let X R d deote the matrix with ith row x i.letlθ X deotes the log-likelihood

Markov Chai Mote Carlo cofidece itervals 89 Table. Coverage probability ad half-legth for fixed-b cofidece itervals Coverage Half-legth wu = u +.945 ±.3. ±. Parze.94 ±.3.3 ±. Bartlett.955 ±.3. ±. fuctio of the model. We assume a Gaussia prior N,s I d for θ, with s =. The posterior distributio of θ the becomes πθ X e lθ X e /s θ. It is kow that for this target distributio the RWM is geometrically ergodic see, e.g., Atchadé [3], Sectio 5.. Therefore, for all polyomial fuctios Theorem. holds. It is also kow that with a appropriate choice of b, σb coverges i probability to σp h see, e.g., Atchadé [3], Theorem 4., ad Corollary 4.. So we will compare the fixed-b cofidece itervals ad the classical cofidece itervals based o σb. We simulate a Gaussia dataset with = 5, d = 5, ad simulate the compoets of the true value of β from a U,. We first ru the adaptive chai for 6 iteratios ad take the sample posterior mea of β as the true posterior mea. We focus o the coefficiet β. Each sampler is ru for 3 iteratios, with o bur-i period. For the RMW, we use a covariace matrix = ci 5, where c is chose such that the acceptace probability i statioarity is about 3%, obtaied from a prelimiary ru. From each sampler, we compute the fixed-b 95% cofidece iterval, ad a classical 95% cofidece iterval. To explore the rage of behavior of the classical procedure, we use b = δ for differet values of δ,. To estimate coverage probability ad half-legth of these cofidece itervals, K = replicatios are performed. The result is summarized i Table for the fixed-b procedure, ad i Figure for the classical procedure. We see from the results that usig b = gives very good coverage, except for the choice wu = u +, which geerates sigificatly wider itervals. This is somewhat expected give the very heavy tail of the limitig distributio. The result also shows that the cofidece iterval procedure based o σb works equally well whe b is carefully chose, but ca perform poorly otherwise. We also test the coclusio of Theorem.6 by comparig the fiite sample covergece rate of the two cofidece iterval procedures. Here, we use oly the Bartlett fuctio. For the stadard procedure, we use the best choice of δ δ.66, as give by the previous simulatio. We compute the cofidece itervals after MCMC rus of legth, where {,..., 4 }. Each ru is repeated 3 times to approximate the coverage probabilities ad iterval legths. The result is plotted o Figure, ad is cosistet with Theorem.6 that the fixed-b procedure has faster covergece. The price to pay is a slightly wider iterval legth as see o Figure.

8 Y.F. Atchadé Figure. Coverage probability ad cofidece iterval half-legth for parameter β for differet values of δ usig σ b,adb = δ. The dashed lie is the 95% cofidece bad estimated from replicatios. Figure. Coverage probability ad cofidece iterval half-legth for parameter β as fuctio of umber of MCMC iteratios. The square-lie correspods to usig σ.

Markov Chai Mote Carlo cofidece itervals 8 3. Proofs 3.. Proof of Theorem. Let φ as i 7. Assumptio A ad Mercer s theorem implies that the kerel φ has a coutable umber of positive eigevalues {α i,i I} with associated eigefuctios { j,j I} such that φs,t= j I α j j s j t, s, t [, ] [, ], 4 where the covergece of the series is uiform o [, ] [, ]. Sice φs,tdt =, is also a eigevalue of φ with eigefuctio x. Hece, we ie Ī ={} I, α ={α j,j Ī}, with α =, ad l α the associated Hilbert space of real umbers sequeces {x j,j Ī} such that j x j <, equipped with the orm x α = j α j xj ad the ier product x,y α = j α j x j y j. We will eed the differetiability of the eigefuctio j. This is give by Kadota s theorem Kadota [6]. Uder the assumptio that w is cotiuously twice differetiable, the eigefuctios j, j I are cotiuously differetiable with derivative ad s t φs,t= j I α j j s j t, s, t [, ] [, ], 5 where agai the covergece of the series is uiform o [, ] [, ]. The expasios 4 ad 5 easily imply that α j <, j I sup t [,] sup t [,] α j j t < j I α j j t <. j I ad 6 It is easy to check that σ ca also be writte as σ = = j= j= i j w hx i π h hx j π h { i j w v,i v,j + u } hx i hx j, where v,i = l= w i l, ad u = = w i j. Notice that v,i is a Riema sum approximatio of vi/, where vt = wt u du, ad u approximates wt

8 Y.F. Atchadé u du dt = vtdt. I view of this, we itroduce { i j i ˇσ = = j= j= w v v j i φ, j hx i hx j = l I + α l } vtdt hx i hx j i l hx i. The last equality uses the Mercer s expasio for φ as give i 4. This implies that hx i T = σ = /σ P h i / hx i l I α l/σ P h l i / hx i + σ ˇσ /σ P h. 7 Hece, the proof of the theorem boils dow to the limitig behavior of the l α-valued process { } i σ P h j hx i, j Ī, ad the remaider σ ˇσ. I Lemma 3.5, we show that { σ P h l i hx i, l Ī} coverges weakly to {Z l,l Ī}, ad that σ ˇσ coverges i probability to zero. This is doe first i the statioary case i Lemmas 3.3 3.4, ad i the ostatioary case i Lemma 3.5. Hece, the theorem follows by applyig Slutszy s theorem ad the cotiuous mappig theorem. Everythig rely o a refiemet of the martigale approximatio of Kipis ad Varadha [9] that we establish first i Lemma 3.. 3... Martigale approximatio for Markov chais Throughout this sectio, uless stated otherwise, {X, } deotes a statioary reversible Markov chai with ivariat distributio π ad trasitio kerel P, ad we fix h L π. We deote F = σx,...,x. We itroduce the probability measure πdx,dy = πdxpx,dy o X X, ad we deote L π the associated L -space with orm f = fx,y πdxpx,dy.forε>, ie U ε x = + ε j+ P j hx, G ε x, y = U ε y PU ε x. j Sice P is a cotractio of L π, it is clear that U ε L π, ad G ε L π. Furthermore, for all ε>, U ε ε h ad G ε U ε. 8

Markov Chai Mote Carlo cofidece itervals 83 Whe σp h < a stroger coclusio is possible, ad this is the key observatio made by Kipis ad Varadha [9], Theorem.3. We summarize their result as follows. Lemma 3. Kipis ad Varadha [9]. Suppose that h L π, ad σ P h <. The for ay sequece {ε, } of positive umbers such that lim ε =, lim ε U ε =. Furthermore, there exists G L π, with Px,dzGx, z = π-a.e. such that σ P h = G, ad lim G ε G =. For, ie the process B t = σ P h t GX i,x i, t, ad let {Bt, t } deotes the stadard Browia motio. It is a easy cosequece of Lemma 3. that {GX i,x i, i } is a statioary martigale differece sequece with w fiite variace. Therefore, by the weak ivariace priciple for statioary martigales, B B i D[, ] equipped with the Skorohod metric. I Corollary.5, [9], it is show that the Markov chai {X, } iherits this weak ivariace priciple. For the purpose of this paper, we eed some refiemets of this result. Let {a,k, k } be a sequece of real umbers. Set a = sup k a,k, ad a tv = a,k a,k. Lemma 3.. Let h L π be such that σ P h <. If a + a tv is bouded i, the a,i hx i = where E R as. If f : [, ] R is a cotiuously differetiable fuctio, the a,i GX i,x i + R, 9 coverges weakly to ftdbt, as. σ P h f i hx i Proof. Set S = a,i hx i. The fuctio U ε satisfies + εu ε x PU ε x = hx, π-a.e. x X.Thisisusedtowrite a,k hx k = a,k εuε X k + U ε X k PU ε X k = a,k εu ε X k + a,k Uε X k PU ε X k + a,k PU ε X k a,k PU ε X k + a,k a,k P U ε X k.

84 Y.F. Atchadé It follows that S = ε + a,k U ε X k + a,k GX k,x k a,k Gε X k,x k GX k,x k + a, PU ε X a, PU ε X + which is valid for ay ε>. I particular with ε = ε = /,wehave S = where a,k GX k,x k + R R 3 = ε = a,k U ε X k, R a,k a,k P U ε X k. By statioarity ad the martigale property, a,k a,k P U ε X k, a,k Gε X k,x k GX k,x k +R +R +R3, = a, PU ε X a, PU ε X ad [ E a,k Gε X k,x k GX k,x k ] = G ε G a,k, usig Lemma 3., ad the assumptio o a. The other remaiders are also easily dealt with. E / R 3 a,k a,k E / PUε X k = ε U ε a tv, usig Lemma 3. ad the assumptio o a. Similarly, E / R a ε U ε ad E / R ε U ε a,k.

Markov Chai Mote Carlo cofidece itervals 85 This proves part of the lemma. For part, we use part with a,i = fi/ to coclude that σ P h = = i f hx i σ P h i f GX i,x i + o p ftdb t + o p, where A = o p meas that A coverges i probability to zero as. To coclude the proof, it suffices to show that ftdb t coverges weakly to ftdbt. This follows from the weak covergece cotiuous mappig theorem by oticig that B has cotiuous sample path almost surely, ad the map D[, ] R, x ftdxt is cotiuous at all poits x C[, ], where the itegral ftdxt is uderstood as a Riema Stietjes itegral. To see the cotiuity, take {x } a sequece of elemets i D[, ] that coverges to x C[, ] i the Skorohod metric. Sice x C[, ], the sequece {x } coverges to x i C[, ] as well. By itegratio by part, ftdx t = fx fx x tf t dt, ad ftdx t ftdx t x x f + f t dt, as. Lemma 3.3. Let h L π be such that σ P h <. Defie { Z = σ P h } i j hx i, j Ī The as, Z coverges weakly to Z i l α. ad { } Z = j t dbt,j Ī. Proof. We eed to show that for all u l α, Z w,u α Z,u α, ad that {Z } is tight. For u l α, Z,u α = σ P h f u i hx i, where f u t = j α j u j j t. From basic results i calculus, it follows from Kadota s theorem that f u is cotiuously differetiable o [, ]. Hece, by Lemma 3., part, Z w,u α f ut dbt = u, Z α.to show that {Z } is tight, it suffices to show that lim E Z,e j =. α sup N j=n

86 Y.F. Atchadé We have E Z,e j α α j = σp h = α j σp h i k j j π hp i k h i k j j λ i k μ h dλ. By Fubii s theorem, for N, E Z,e j = α σp h j=n j=n i k α j j j λ i k μ h dλ. Let ε>. By uiform covergece of the series j α j j s j t, we ca fid N such that for ay N N ad for all s,t [, ], l N α l l t l s ε. So that for all, E Z ε,e l α σp h l=n j= λ i j μ h dλ ε σp h + λ λ μ hdλ = ε, sice ε> is arbitrary, this proves. Lemma 3.4. Let h L π be such that σ P h <. The as, E σ ˇσ = O/. Hece σ ˇσ coverges i probability to, as. Proof. Comparig the expressio of σ ad ˇσ, we see that σ ˇσ = u vtdt hx i i hx i v,i v hx i. Sice the sequece E[ hx i ] coverges to the fiite limit σ h by assumptio, it is bouded, ad there exists a fiite costat c such that E σ ˇσ c u vtdt + c E/ [ v,i v i ] hx i.

Markov Chai Mote Carlo cofidece itervals 87 Set a, =, a,i = v,i v i. We recall that v,i = l= w i l, ad vt = wt u du, ad write l/ [ i a,i = w l ] i w u du = l= l / l/ l= l / l i u w l t u l dt du. Usig this expressio, it is easy to show that a w /. Ad sice w is of class C,a mea-value theorem o w usig the above expressio shows that a tv = a, + i= a,i a,i w + w /. We are the i positio to apply Lemma 3. to obtai [ ] E a,i hx i = O. By similar argumets as above, ad sice u = = w i j is a Riema sum approximatio of vtdt, we obtai that u vtdt =O. I coclusio, E σ ˇσ = O. Lemma 3.5. Assume A. Suppose that the Markov chai {X, } starts at X = x for x X such that holds. Let h L π be such that σ P h <. The as, σ ˇσ coverges i probability to zero, ad Z w Z i l α. Proof. Ergodicity is equivalet to the existece of a successful couplig of the Markov chai ad its statioary copy. More precisely, we ca costruct a process {X, X, } such that {X, } is a Markov chai with iitial distributio δ x ad trasitio kerel P, { X, } is a Markov chai with iitial distributio π ad trasitio kerel P, ad there exists a fiite couplig time τ such that X = X for all τ. For a proof of this result, see for istace Lidvall [], Theorem 4.; see also Roberts ad Rosethal [7], Propositio 8. We use a wide tilde to deote quatities computed from the statioary chai { X, }. Sice X = X for all τ, ad i view of the expressio of σ ˇσ give i, it is straightforward to check that σ ˇσ σ ˇσ coverges to zero i probability. The covergece of Z Z α is hadled similarly. Z Z α = α l l I k hxk l h X k

88 Y.F. Atchadé = τ k hxk α l l h X k l I τ α l l t τ sup t [,] l I hxk h X k, which coverges almost surely to zero, give 6, ad sice τ is fiite almost surely. 3.. Proof of Theorem.4 Sice u is bouded from below, we ca choose a = if x X ux such that Vx = a + ux. Let q σ x, y be the desity of the proposal Nx σ ρx ux, σ I d, ad ie Rx ={y R p : αx,y < }.Wehave PVx Vx= αx,y Vy Vx q σ x, y dy [ ] = αx,y Vy Vx qσ x, y dy 3 Rx + Vy Vx qσ x, y dy. Sice u is Lipschitz, with Lipschitz costat L, say, we have by Taylor expasio Vy Vx ux, y x + L y x. Itegratig both sides, ad usig the fact that ρx ux τ, we get Vy Vx qσ x, y dy σ ρx ux L σ 4 + 4 ρx ux + dσ 4 σ ρx ux + L τ σ 4 + dσ. 4 We also have πy q σ y, x πx q σ x, y = exp Vx Vy σ x y + σ ρy uy + σ y x σ ρx ux.

Markov Chai Mote Carlo cofidece itervals 89 If y Rx, we ecessarily have πy q σ y,x πx q σ x,y <, which traslates to Vy Vx> σ x y + σ ρy uy + σ y x σ ρx ux. Hece, if y Rx, [ αx,y ] Vy Vx [ αx,y ] σ x y + σ ρy uy + σ y x σ ρx ux = [ αx,y ] σ ρ y uy ρ x ux 8 y x,ρx ux + ρy uy σ σ 8 τ + 4τ y x σ. Hece, Rx [ ] αx,y Vy Vx qσ x, y dy σ τ + τ dσ 8 + σ τ. 5 We combie 3 5 to coclude that PVx Vx σ ρx ux + K, where K = L τ σ 4 4 + dσ + σ τ 8 + τ dσ + σ τ. Sice fx = σ ρx ux is cotiuous ad fx,as x by assumptio, the results follow readily. 3.3. Proof Theorem.6 We follow Dedecker ad Rio [7], Theorem.. With the geometric ergodicity assumptio, the martigale approximatio to hx i ca be costructed more explicitly tha i Lemmas 3. ad 3.. Defie gx = j P j hx, x X. By the geometric ergodicity assumptio, g is well-ied ad belogs to L V δ. The we ie D =, ad D k = gx k PgX k, k. It is easy to see that {D k,k } is a martigaledifferece sequece with respect to the atural filtratio of {X, }. Usig this martigale,

83 Y.F. Atchadé we ie σ = l I α l i l D i, ad we recall that ˇσ σ = l I = l I α l l i hx i. Hece, α l i l D i + σ ˇσ + ˇσ σ. Although the martigales are costructed differetly, the argumet i Lemma 3.4 carries through ad shows that E σ ˇσ = O/. The proof is similar to the proof of Lemma 3.4 ad is omitted. Also E ˇσ σ = O/. To see this, use the Cauchy Schwarz iequalities for sequeces i l α ad for radom variables to write E ˇσ σ = E[ α l l I [{ E α l l I { α l l I l I i hxi l D i i hxi } / l D i i hxi } / ] l + D i { [ i hxi ]} / α l E l D i { [ i hxi ]} / α l E l + D i. l I By the martigale approximatio, we have i hxi l D i = l PgX + i= l i l i PgX hxi ] + D i i l PgX i.

Markov Chai Mote Carlo cofidece itervals 83 The details of these calculatios ca be foud for istace i [4], Propositio A. It is the easy to show that [ i hxi ] α l E l D i l I 6 sup α l l t + 3 sup α l l t h V t t δ. l I For the secod term, otice that i hxi i i hxi l + D i = l D i + l D i. Hece, with similar calculatios, we obtai [ i hxi ] α l E l + D i l I h V δ sup α l l t t + 6 sup t l I l I l I α l l t + sup α l l t h V t δ. Give 6, these calculatios show that E ˇσ σ = O/. We coclude that σ = i α j l D i + O p, l I which implies that l I d σ,χ d σ,χ +. 6 Therefore, we oly eed to focus o the term d σ,χ. O the Euclidea space R I, we ie the orms x α = i I α ixi, x = i I x i ad the ier-products x,y α = i I α ix i y i, ad x,y = i I x iy i. For a sequece a,a,..., we use the otatio a i:k = a i,...,a k ad a i:k is the empty set if i>k. We itroduce ew radom variables {Z i,j,i I, j } which are i.i.d. N,, ad set S l:k = k j=l Z j,..., k j=l Z Ij T R I, so that χ dist. = i I α i Z i,j = S :. j= α

83 Y.F. Atchadé For l k, ad omittig the depedece o,wesetb l:k as the R I k l+ matrix j B l:k i, j = i, i I,l j k. By the Mercer s expasio for φ, wehave σ = k α i i D k = B : D :. i I α For f Lip R, we itroduce the fuctio f α : R I R, ied as f α x = f x α.asa matter of telescopig the sums, we have E [ f σ f χ ] [ ] = E f α B : D : f α S : = = l= l= [ E f α B :l D :l + S l+: f α B :l D :l + ] S l: [ E f α,,l+ B :l D :l + B l:l D l f α,,l+ B :l D :l + S l:l ], where we ie f α,,l x = E [f α x + S l: ] ad set f α,,+ x = f α x. First, we claim that f α,,l is differetiable everywhere o R I. To prove this, it suffices to obtai the almost everywhere differetiability of z R I f α x + z for ay x R I. By Rademacher s theorem, f as a Lipschitz fuctio is differetiable almost everywhere o R. IfE is the set of poits where f is ot differetiable, we coclude that f α is differetiable at all poits z/ {z R I : x + z α E}. Now by Poomarëv [5], Theorem, the Lebesgue measure of the set {z R I : x + z α E} is zero, which proves the claim. As a result, the fuctio x f α,,l x is differetiable with derivative [ f α,,l x h = E x + S l: x + S l:,h By writig this expectatio wrt the distributio of x + S l:, we get f α,,l x h = f α f α z z, h α exp α ]. x x,z μ,l dz, l +

Markov Chai Mote Carlo cofidece itervals 833 where μ,l is the distributio of S l:. This implies that f α,,l is ifiitely differetiable with secod derivatives give by f α,,l x h,h = l + f α z z, h α x z, h exp x x,z μ,l dz l + [ = E f α x + S l: x l + + S l: S l:,h,h ], l + l + which implies after some easy calculatios that for h R I, f α,,l x h, h h + Similarly for h R I, 3 f α,,l x h,h,h Now, by Taylor expasio we have α l + x α l + h 3 + l + x α. 7. 8 f α,,l+ B :l D :l + B l:l D l f α,,l+ B :l D :l + S l:l = f α,,l+ B :l D :l B l:l D l S l:l + f α,,l+ B :l D :l [B l:l D l, B l:l D l S l:l,s l:l ] + ϱ 3,l, where, usig 8, ϱ 3 l,l l + 3/ + B :l D :l α Bl:l l + D l 3 α + S l:l 3 α. l It follows that E 3 ϱ l=,l l= l + / By first coditioig o F l,wehave [ ] E f α,,l+ B :l D :l B l:l D l S l:l =. l= l / log. 9

834 Y.F. Atchadé Writig K,l = f α,,l B :l D :l,wehave f α,,l+ B :l D :l [B l:l D l, B l:l D l S l:l,s l:l ] l l = Dl i j K,l i, j i,j i,j i l l j K,l i, jz i,l Z jl. Therefore, E f α,,l+ B :l D :l [B l:l D l, B l:l D l S l:l,s l:l ] F l = i,j i l l j K,l+ i, j [ E Dl F ] l δij, where δ ij = ifi = j ad zero otherwise. We claim that the proof will be fiished if we show that for all i, j I, ad l, E /[ K,l i, j K,l+ i, j ] l +. 3 To prove this claim, it suffice to use 3 to show that l= i l j l EK,l+i, j / log for i j, ad l= i l j l EK,l+i, j[ed l F l ] / log for all i, j I. To show this, write l= = i l { l= l= l j E K,l+ i, j i l } l j E K,i, j [ + l l l i j ] [ E K,l i, j K,l+ i, j ]. By the covergece of Riema sums, ad 3, this implies that l i l= l j E K,l+ i, j l= i l j l. Combied with 7 + log. k For the secod term, otice from the iitio of D l at the begiig of the proof that ED l F l = GX l πg, where Gx = Pg x Pgx. Sice h L V δ for

Markov Chai Mote Carlo cofidece itervals 835 δ</4, G L V δ, ad δ </. Therefore, by geometric ergodicity, the solutio of the Poisso equatio for G ied as Ux= j P j Gx πg is well-ied, U L V δ, ad we have almost surely UX l PUX l = E D l F l. Notice that, sice δ </, for ay p such that pδ, the geometric ergodicity assumptio G implies that sup k E UX k p <. Now we use the usual martigale approximatio trick see, e.g., Atchadé ad Cattaeo [4], Propositio A to write l= i l = i i + l= l j E K,l+ i, j [ E Dl F ] l j E K, i, jux j E K, i, jux [{ l l E i j K,l+ i, j } ] l l i j K,l i, j UX l. We ow use the fact that i j is of class C see Theorem A.ii, 7, ad 3 to coclude that l l i j E K,l+ i, j [ E Dl F ] l l= + E / K,l+ i, j K,l+ i, j log. l= This proves the claim. It remais to establish 3. Write E l to deote the expectatio operator wrt / S l:. We the have for ay h,h R I, K,l h,h = f α,,l B :l D :l h,h = l + E l [f α B :l D :l + S l: B :l D :l + S l:,h α Sl:,h ]

836 Y.F. Atchadé Therefore, l = l + + l + K,l K,l+ h,h f α,,l+ B :l D :l + S l h,h O. = f α,,l+ B :l D :l + S l h,h f α,,l+ B :l D :l + B ld l h,h l + f α,,l+ B :l D :l + S l h,h + = 3 f α,,l+ B :l D :l + t S l + t B ld l l + f α,,l+ B :l D :l + S l h,h + for some t,.usig7 ad 8, 3 follows from the above. l + h,h, S l B ld l l + O O, Appedix: Mercer s theorem We recall Mercer s theorem below. Part i is the stadard Mercer s theorem, ad part ii is a special case of a result due to T. Kadota Kadota [6]. Theorem A. Mercer s theorem. i Let k : [, ] [, ] R be a cotiuous positive semiiite kerel. The there exist oegative umbers {λ j,j }, ad orthoormal fuctios {φ j,j }, φ j L [, ], such that kx,yφ j y dy = λ j φ j x for all x [, ], j, ad lim sup kx,y λ j φ j xφ j y =. x,y [,] j= Furthermore, if λ j, φ j is cotiuous. ii Let k as above. If i additio k is of class C o [, ] [, ], the for λ j, φ j is of class C o [, ] ad lim sup x y kx,y λ j φ j xφ j y =. x,y [,] j=

Markov Chai Mote Carlo cofidece itervals 837 By settig x = y, i both expasios, it follows that sup x λ j φ j x sup x kx,x < j A. ad sup λ j φ j x sup u v ku,v u=x,v=x <. x j x A. Ackowledgmets I am grateful to Matias Cattaeo ad Shae Hederso for very helpful discussios. This work is partly supported by NSF Grats DMS-9-663, NSF-SES 96 ad DMS--864. Refereces [] Asmusse, S. ad Gly, P.W. 7. Stochastic Simulatio: Algorithms ad Aalysis. Stochastic Modellig ad Applied Probability 57. New York: Spriger. MR333 [] Atchadé, Y.F. 6. A adaptive versio for the Metropolis adjusted Lagevi algorithm with a trucated drift. Methodol. Comput. Appl. Probab. 8 35 54. MR34873 [3] Atchadé, Y.F.. Kerel estimators of asymptotic variace for adaptive Markov chai Mote Carlo. A. Statist. 39 99. MR86345 [4] Atchadé, Y.F. ad Cattaeo, M.D. 4. A martigale decompositio for quadratic forms of Markov chais with applicatios. Stochastic Process. Appl. 4 646 677. MR3339 [5] Cha, K.S. ad Geyer, C. 994. Commet o Markov chais for explorig posterior distributios. A. Statist. 747 758. [6] Damerdji, H. 995. Mea-square cosistecy of the variace estimator i steady-state simulatio output aalysis. Oper. Res. 43 8 9. MR3746 [7] Dedecker, J. ad Rio, E. 8. O mea cetral limit theorems for statioary sequeces. A. Ist. Heri Poicaré Probab. Stat. 44 693 76. MR44694 [8] Flegal, J.M. ad Joes, G.L.. Batch meas ad spectral variace estimators i Markov chai Mote Carlo. A. Statist. 38 34 7. MR6474 [9] Gly, P.W. ad Iglehart, D.L. 99. Simulatio output aalysis usig stadardized time series. Math. Oper. Res. 5 6. MR383 [] Greader, U. ad Roseblatt, M. 953. Statistical spectral aalysis of time series arisig from statioary stochastic processes. A. Math. Statist. 4 537 558. MR589 [] Häggström, O. ad Rosethal, J.S. 7. O variace coditios for Markov chai CLTs. Electro. Commu. Probab. 454 464 electroic. MR365647 [] Haa, E.J. 97. Multiple Time Series. New York: Wiley. MR7995 [3] Jasso, M. 4. The error i rejectio probability of simple autocorrelatio robust tests. Ecoometrica 7 937 946. MR544 [4] Jarer, S.F. ad Roberts, G.O.. Polyomial covergece rates of Markov chais. A. Appl. Probab. 4 47. MR8963

838 Y.F. Atchadé [5] Joes, G.L., Hara, M., Caffo, B.S. ad Neath, R. 6. Fixed-width output aalysis for Markov chai Mote Carlo. J. Amer. Statist. Assoc. 537 547. MR79478 [6] Kadota, T.T. 967. Term-by-term differetiability of Mercer s expasio. Proc. Amer. Math. Soc. 8 69 7. MR3397 [7] Kiefer, N.M. ad Vogelsag, T.J. 5. A ew asymptotic theory for heteroskedasticityautocorrelatio robust tests. Ecoometric Theory 3 64. MR988 [8] Kiefer, N.M., Vogelsag, T.J. ad Buzel, H.. Simple robust testig of regressio hypotheses. Ecoometrica 68 695 74. MR76938 [9] Kipis, C. ad Varadha, S.R.S. 986. Cetral limit theorem for additive fuctioals of reversible Markov processes ad applicatios to simple exclusios. Comm. Math. Phys. 4 9. MR834478 [] Lidvall, T. 99. Lectures o the Couplig Method. Wiley Series i Probability ad Mathematical Statistics: Probability ad Mathematical Statistics. New York: Wiley. MR85 [] Mey, S.P. ad Tweedie, R.L. 993. Markov Chais ad Stochastic Stability. Commuicatios ad Cotrol Egieerig Series. Lodo: Spriger. MR8769 [] Neave, H.R. 97. A improved formula for the asymptotic variace of spectrum estimates. A. Math. Statist. 4 7 77. MR5876 [3] Newey, W.K. ad West, K.D. 994. Automatic lag selectio i covariace matrix estimatio. Rev. Eco. Stud. 6 63 653. MR9938 [4] Parze, E. 957. O cosistet estimates of the spectrum of a statioary time series. A. Math. Statist. 8 39 348. MR88833 [5] Poomarëv, S.P. 987. Submersios ad pre-images of sets of measure zero. Sibirsk. Mat. Zh. 8 99. MR88687 [6] Robert, C.P. ad Casella, G. 4. Mote Carlo Statistical Methods, d ed. Spriger Texts i Statistics. New York: Spriger. MR878 [7] Roberts, G.O. ad Rosethal, J.S. 4. Geeral state space Markov chais ad MCMC algorithms. Probab. Surv. 7. MR95565 [8] Roberts, G.O. ad Tweedie, R.L. 996. Expoetial covergece of Lagevi distributios ad their discrete approximatios. Beroulli 34 363. MR4473 [9] Schrube, L. 983. Cofidece iterval estimatio usig stadardized time series. Oper. Res. 3 9 8. [3] Su, Y., Phillips, P.C.B. ad Ji, S. 8. Optimal badwidth selectio i heteroskedasticityautocorrelatio robust testig. Ecoometrica 76 75 94. MR374985 [3] Tierey, L. 994. Markov chais for explorig posterior distributios. A. Statist. 7 76. MR3966 Received February 4 ad revised October 4