Strong approximation for additive functionals of geometrically ergodic Markov chains

Florence Merlevède, joint work with E. Rio
Université Paris-Est-Marne-La-Vallée (UPEM)
Cincinnati Symposium on Probability Theory and Applications, September 2014

Strong approximation in the iid setting (1)

Assume that $(X_i)_{i \ge 1}$ is a sequence of iid centered real-valued random variables with a finite second moment $\sigma^2$, and define $S_n = X_1 + X_2 + \cdots + X_n$. The ASIP says that a sequence $(Z_i)_{i \ge 1}$ of iid standard Gaussian variables may be constructed in such a way that, with $B_k = Z_1 + \cdots + Z_k$,
$$\sup_{1 \le k \le n} |S_k - \sigma B_k| = o(b_n) \quad \text{almost surely},$$
where $b_n = (n \log \log n)^{1/2}$ (Strassen (1964)). When $(X_i)_{i \ge 1}$ is in addition assumed to be in $\mathbb{L}^p$ with $p > 2$, rates in the ASIP can be obtained: $b_n = n^{1/p}$ (see Major (1976) for $p \in\, ]2,3]$ and Komlós, Major and Tusnády for $p > 3$).

Strong approximation in the iid setting (2)

When $(X_i)_{i \ge 1}$ is assumed to have a finite moment generating function in a neighborhood of $0$, the famous Komlós-Major-Tusnády theorem (1975 and 1976) says that one can construct a standard Brownian motion $(B_t)_{t \ge 0}$ in such a way that
$$\mathbb{P}\Bigl(\sup_{k \le n} |S_k - \sigma B_k| \ge x + c \log n\Bigr) \le a \exp(-bx), \qquad (1)$$
where $a$, $b$ and $c$ are positive constants depending only on the law of $X_1$. In particular, (1) implies that
$$\sup_{1 \le k \le n} |S_k - \sigma B_k| = O(\log n) \quad \text{almost surely}.$$
It follows from the Erdős-Rényi law of large numbers (1970) that this result cannot be improved.
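
To spell out the step from (1) to the almost sure $O(\log n)$ rate (a short Borel-Cantelli argument, not detailed on the slide): taking $x = t \log n$ in (1) gives
$$\mathbb{P}\Bigl(\sup_{k \le n} |S_k - \sigma B_k| \ge (c + t)\log n\Bigr) \le a\, n^{-bt},$$
which is summable in $n$ as soon as $t > 1/b$; by Borel-Cantelli, almost surely $\sup_{k \le n} |S_k - \sigma B_k| \le (c + t)\log n$ for all $n$ large enough.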

Strong approximation in the multivariate iid setting

Einmahl (1989) proved that one can obtain the rate $O((\log n)^2)$ in the almost sure approximation of the partial sums of iid random vectors with finite moment generating function in a neighborhood of $0$ by Gaussian partial sums. Zaitsev (1998) removed the extra logarithmic factor and obtained the KMT inequality in the case of iid random vectors. What about KMT type results in the dependent setting?

An extension to functions of iid (Berkes, Liu and Wu (2014))

Let $(X_k)_{k \in \mathbb{Z}}$ be a stationary process defined as follows. Let $(\varepsilon_k)_{k \in \mathbb{Z}}$ be a sequence of iid r.v.'s and $g : \mathbb{R}^{\mathbb{Z}} \to \mathbb{R}$ be a measurable function such that, for all $k \in \mathbb{Z}$, $X_k = g(\xi_k)$ with $\xi_k := (\ldots, \varepsilon_{k-1}, \varepsilon_k)$ is well defined, $\mathbb{E}(g(\xi_k)) = 0$ and $\|g(\xi_k)\|_p < \infty$ for some $p > 2$. Let $(\varepsilon'_k)_{k \in \mathbb{Z}}$ be an independent copy of $(\varepsilon_k)_{k \in \mathbb{Z}}$. For any integer $k \ge 0$, let $\xi^*_k = (\xi_{-1}, \varepsilon'_0, \varepsilon_1, \ldots, \varepsilon_{k-1}, \varepsilon_k)$ and $X^*_k = g(\xi^*_k)$. For $k \ge 0$, let $\delta(k)$ be as introduced by Wu (2005): $\delta(k) = \|X_k - X^*_k\|_p$.

Berkes, Liu and Wu (2014): the almost sure strong approximation holds with the rate $o(n^{1/p})$ and $\sigma^2 = \sum_{k \in \mathbb{Z}} \mathbb{E}(X_0 X_k)$, provided that $\delta(k) = O(k^{-\alpha})$ with $\alpha > 2$ if $p \in\, ]2,4]$ and $\alpha > f(p)$ if $p > 4$, where
$$f(p) = 1 + \frac{p^2 - 4 + (p-2)\sqrt{p^2 + 20p + 4}}{8p}.$$
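
For intuition (an illustration, not from the talk): for a causal linear process $X_k = \sum_{j \ge 0} a_j \varepsilon_{k-j}$, only the $\varepsilon_0$ coordinate differs between $\xi_k$ and $\xi^*_k$, so $\delta(k) = |a_k| \, \|\varepsilon_0 - \varepsilon'_0\|_p$. A minimal Monte Carlo sketch of this identity, with the hypothetical choice $a_j = (j+1)^{-3}$:

```python
import numpy as np

# delta(k) = ||X_k - X_k*||_p for the causal linear process X_k = sum_j a_j eps_{k-j}:
# the coupled copy X_k* replaces eps_0 by an independent eps_0', so all terms
# except the j = k one cancel and X_k - X_k* = a_k (eps_0 - eps_0').
# a_j = (j+1)**(-3) is a hypothetical choice of coefficients.
rng = np.random.default_rng(0)
p, n_mc, K = 4.0, 100_000, 8
a = (np.arange(K + 1) + 1.0) ** (-3.0)

eps0 = rng.standard_normal(n_mc)    # eps_0
eps0p = rng.standard_normal(n_mc)   # independent copy eps_0'

for k in range(K + 1):
    diff = a[k] * (eps0 - eps0p)                      # X_k - X_k*
    delta_k = (np.abs(diff) ** p).mean() ** (1 / p)   # Monte Carlo L^p norm
    print(f"k={k}  delta(k) ≈ {delta_k:.6f}  (a_k = {a[k]:.6f})")
```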

What about strong approximation in the Markov setting?

Let $(\xi_n)$ be an irreducible and aperiodic Harris recurrent Markov chain on a countably generated measurable state space $(E, \mathcal{B})$. Let $P(x, \cdot)$ be the transition probability. We assume that the chain is positive recurrent. Let $\pi$ be its (unique) invariant probability measure. Then there exist some positive integer $m$, some measurable function $h$ with values in $[0,1]$ with $\pi(h) > 0$, and some probability measure $\nu$ on $E$, such that
$$P^m(x, A) \ge h(x)\, \nu(A).$$
We assume that $m = 1$. The Nummelin splitting technique (1984) allows one to extend the Markov chain in such a way that the extended chain has a recurrent atom. This allows regeneration.
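
As a toy illustration of the minorization condition (my example, not from the talk): for a finite-state chain whose columns have strictly positive minima, one can take $\nu$ proportional to the column minima of $P$ and $h \equiv c$, the total mass of those minima:

```python
import numpy as np

# One concrete minorization P(x, y) >= h(x) nu(y) for a finite-state chain:
# nu(y) proportional to the column minimum min_x P(x, y), h constant.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])   # hypothetical transition matrix

col_min = P.min(axis=0)           # min_x P(x, y) for each y
c = col_min.sum()                 # total mass available for the atom
nu = col_min / c                  # minorizing probability measure
h = np.full(P.shape[0], c)        # h(x) = c, so pi(h) = c > 0

# check P(x, y) >= h(x) * nu(y) entrywise
assert np.all(P >= np.outer(h, nu) - 1e-12)
print("c =", c, " nu =", nu)
```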

The Nummelin splitting technique (1)

Let $Q(x, \cdot)$ be the sub-stochastic kernel defined by $Q = P - h \otimes \nu$. The minorization condition allows one to define an extended chain $(\tilde{\xi}_n, U_n)$ on $E \times [0,1]$ as follows. At time $0$, $U_0$ is independent of $\tilde{\xi}_0$ and has the uniform distribution over $[0,1]$; for any $n \in \mathbb{N}$,
$$P(\tilde{\xi}_{n+1} \in A \mid \tilde{\xi}_n = x, U_n = y) = \mathbf{1}_{y \le h(x)}\, \nu(A) + \mathbf{1}_{y > h(x)}\, \frac{Q(x, A)}{1 - h(x)} =: \tilde{P}((x,y), A),$$
and $U_{n+1}$ is independent of $(\tilde{\xi}_{n+1}, \tilde{\xi}_n, U_n)$ and has the uniform distribution over $[0,1]$. The transition kernel of the extended chain is $\tilde{P} \otimes \lambda$ ($\lambda$ is the Lebesgue measure on $[0,1]$), and $(\tilde{\xi}_n, U_n)$ is an irreducible and aperiodic Harris recurrent chain, with unique invariant probability measure $\pi \otimes \lambda$. Moreover $(\tilde{\xi}_n)$ is a homogeneous Markov chain with transition probability $P(x, \cdot)$.

Regeneration

Define now the set $C$ in $E \times [0,1]$ by $C = \{(x,y) \in E \times [0,1] : y \le h(x)\}$. For any $(x,y)$ in $C$, $P(\tilde{\xi}_{n+1} \in A \mid \tilde{\xi}_n = x, U_n = y) = \nu(A)$. Since $\pi \otimes \lambda(C) = \pi(h) > 0$, the set $C$ is an atom of the extended chain, and it can be proven that this atom is recurrent. Define $T_0 = \inf\{n \ge 1 : U_n \le h(\tilde{\xi}_n)\}$ and $T_k = \inf\{n > T_{k-1} : U_n \le h(\tilde{\xi}_n)\}$, and the return times $(\tau_k)_{k > 0}$ by $\tau_k = T_k - T_{k-1}$. Note that $T_0$ is a.s. finite and the return times $\tau_k$ are iid and integrable. Let $S_n(f) = \sum_{k=1}^n f(\tilde{\xi}_k)$. The random vectors $(\tau_k, S_{T_k}(f) - S_{T_{k-1}}(f))_{k > 0}$ are iid and their common law is the law of $(\tau_1, S_{T_1}(f) - S_{T_0}(f))$ under $P_C$.
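
A minimal simulation sketch (continuing the hypothetical 3-state example, where $h \equiv c$ so the regeneration test $U_n \le h(\tilde{\xi}_n)$ does not depend on the state): it draws the split chain, records the regeneration times $T_k$, and extracts the iid blocks $(\tau_k, S_{T_k}(f) - S_{T_{k-1}}(f))$:

```python
import numpy as np

# Split-chain simulation: with probability c the next state is drawn from nu
# (regeneration), otherwise from the normalized residual kernel Q(x,.)/(1-c).
rng = np.random.default_rng(1)
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
col_min = P.min(axis=0); c = col_min.sum(); nu = col_min / c
Q = P - c * nu                       # residual kernel, rows sum to 1 - c
f = np.array([1.0, -0.5, 2.0])       # a bounded test function

n = 200_000
xi = np.empty(n + 1, dtype=int); xi[0] = 0
regen = np.zeros(n + 1, dtype=bool)
for t in range(n):
    if rng.random() <= c:            # U_t <= h(xi_t) = c: the atom is hit
        regen[t] = True
        xi[t + 1] = rng.choice(3, p=nu)
    else:
        xi[t + 1] = rng.choice(3, p=Q[xi[t]] / (1 - c))

T = np.flatnonzero(regen[1:]) + 1    # regeneration times T_0 < T_1 < ...
tau = np.diff(T)                     # iid return times tau_k
S = np.cumsum(f[xi[1:]])             # S_m(f) = sum_{k<=m} f(xi_k)
blocks = S[T[1:] - 1] - S[T[:-1] - 1]   # iid block sums over (T_{k-1}, T_k]
print("E[tau_1] ≈", tau.mean(), "  E[block] ≈", blocks.mean())
```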

Csáki and Csörgő (1995): if the r.v.'s $S_{T_k}(f) - S_{T_{k-1}}(f)$ have a finite moment of order $p$ for some $p$ in $]2,4]$ and if $E(\tau_k^{p/2}) < \infty$, then one can construct a standard Wiener process $(W_t)_{t \ge 0}$ such that
$$S_n(f) - n\pi(f) - \sigma(f) W_n = O(a_n) \quad \text{a.s.,}$$
with $a_n = n^{1/p} (\log n)^{1/2} (\log \log n)^{\alpha}$ and $\sigma^2(f) = \lim_n \frac{1}{n} \operatorname{Var} S_n(f)$. For any bounded function $f$, the above result holds as soon as the return times have a finite moment of order $p$. The proof is based on the regeneration properties of the chain, on the Skorohod embedding, and on an application of the results of KMT (1975) to the partial sums of the iid random variables $S_{T_{k+1}}(f) - S_{T_k}(f)$, $k > 0$.

On the proof of Csáki and Csörgő

For any $i \ge 1$, let $X_i = \sum_{l=T_{i-1}+1}^{T_i} f(\tilde{\xi}_l)$. Since the $(X_i)_{i>0}$ are iid, if $E|X_1|^{2+\delta} < \infty$, there exists a standard Brownian motion $(W(t))_{t>0}$ such that
$$\sup_{k \le n} \Bigl| \sum_{i=1}^{k} X_i - \sigma(f) W(k) \Bigr| = o(n^{1/(2+\delta)}) \quad \text{a.s.}$$
Let $\rho(n) = \max\{k : T_k \le n\}$. If $E\tau_1^q < \infty$ for some $1 \le q \le 2$, then
$$\rho(n) = \frac{n}{E(\tau_1)} + O(n^{1/q} (\log \log n)^{\alpha}) \quad \text{a.s.,}$$
$$\sum_{i=1}^{\rho(n)} X_i - S_n(f) = o(n^{1/(2+\delta)}) \quad \text{a.s.,}$$
$$W(\rho(n)) - W\Bigl(\frac{n}{E(\tau_1)}\Bigr) = O\bigl(n^{1/(2q)} (\log n)^{1/2} (\log \log n)^{\alpha}\bigr) \quad \text{a.s.}$$
With this method, there is no way to do better than $O(n^{1/(2q)} (\log n)^{1/2})$ ($1 \le q \le 2$), even if $f$ is bounded and $\tau_1$ has an exponential moment.
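
A quick numerical look at the bottleneck (my illustration, with hypothetical geometric return times): the renewal count $\rho(n)$ concentrates around $n/E(\tau_1)$ but fluctuates on the scale $\sqrt{n}$ (the $q = 2$ case), which is exactly what limits the Skorohod-embedding route:

```python
import numpy as np

# Renewal count rho(n) = max{k : T_k <= n} for iid return times tau_k;
# deviations from n/E(tau_1) stay on the sqrt(n) scale.
rng = np.random.default_rng(2)
E_tau = 2.0
for n in [10**4, 10**5, 10**6]:
    tau = rng.geometric(1 / E_tau, size=n)   # geometric return times, mean 2
    T = np.cumsum(tau)                       # regeneration times T_k
    rho = np.searchsorted(T, n, side="right")
    print(f"n={n:>8}  rho(n)={rho:>7}  n/E(tau)={n / E_tau:>9.0f}  "
          f"dev/sqrt(n)={(rho - n / E_tau) / np.sqrt(n):+.3f}")
```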

Link between the moments of return times and the coefficients of absolute regularity

For positive measures $\mu$ and $\nu$, let $\|\mu - \nu\|$ denote the total variation of $\mu - \nu$. Set
$$\beta_n = \int_E \|P^n(x, \cdot) - \pi\| \, d\pi(x).$$
The coefficients $\beta_n$ are called absolute regularity (or $\beta$-mixing) coefficients of the chain. Bolthausen (1980-1982): for any $p > 1$,
$$E(\tau_1^p) = E_C(T_0^p) < \infty \quad \text{if and only if} \quad \sum_{n > 0} n^{p-2} \beta_n < \infty.$$
Hence, according to the strong approximation result of M.-Rio (2012), if $f$ is bounded and $E(\tau_1^p) < \infty$ for some $p$ in $]2,3[$, then the strong approximation result holds with the rate $o(n^{1/p} (\log n)^{(p-2)/(2p)})$.
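
To make the coefficients concrete (same hypothetical 3-state chain as above, my illustration): in the finite case $\beta_n$ can be computed exactly, and the geometric decay assumed by the main result below is plainly visible:

```python
import numpy as np

# beta_n = integral ||P^n(x,.) - pi|| dpi(x) for the toy 3-state chain,
# with ||.|| the total variation norm sum_y |.|.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
# stationary distribution: left eigenvector of P for eigenvalue 1
w, V = np.linalg.eig(P.T)
pi = np.real(V[:, np.argmax(np.real(w))]); pi /= pi.sum()

Pn = np.eye(3)
for n in range(1, 9):
    Pn = Pn @ P                                       # P^n
    beta_n = pi @ np.abs(Pn - pi).sum(axis=1)         # average TV distance
    print(f"n={n}  beta_n={beta_n:.2e}")
```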

Main result: M.-Rio (2014)

Assume that $\beta_n = O(\rho^n)$ for some real $\rho$ with $0 < \rho < 1$. If $f$ is bounded and such that $\pi(f) = 0$, then there exist a standard Wiener process $(W_t)_{t \ge 0}$ and positive constants $a$, $b$ and $c$ depending on $f$ and on the transition probability $P(x, \cdot)$ such that, for any positive real $x$ and any integer $n \ge 2$,
$$\mathbb{P}_\pi\Bigl(\sup_{k \le n} |S_k(f) - \sigma(f) W_k| \ge c \log n + x\Bigr) \le a \exp(-bx),$$
where $\sigma^2(f) = \pi(f^2) + 2 \sum_{n > 0} \pi(f P^n f) > 0$. Therefore
$$\sup_{k \le n} |S_k(f) - \sigma(f) W_k| = O(\log n) \quad \text{a.s.}$$
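
The asymptotic variance has a closed form that is easy to evaluate in the finite toy example above (my illustration; under geometric ergodicity the series converges geometrically):

```python
import numpy as np

# sigma^2(f) = pi(f^2) + 2 * sum_{n>0} pi(f * P^n f) for the toy chain
# and a centered bounded f.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
w, V = np.linalg.eig(P.T)
pi = np.real(V[:, np.argmax(np.real(w))]); pi /= pi.sum()

f = np.array([1.0, -0.5, 2.0])
f = f - pi @ f                        # center so that pi(f) = 0

sigma2 = pi @ (f * f)                 # pi(f^2)
Pn_f = f.copy()
for n in range(1, 200):               # geometric decay => quick convergence
    Pn_f = P @ Pn_f                   # P^n f
    sigma2 += 2 * (pi @ (f * Pn_f))   # 2 * pi(f * P^n f)
print("sigma^2(f) ≈", sigma2)
```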

Some comments on the condition

The condition $\beta_n = O(\rho^n)$ for some real $\rho$ with $0 < \rho < 1$ is equivalent to saying that the Markov chain is geometrically ergodic (see Nummelin and Tuominen (1982)). If the Markov chain is geometrically ergodic, then there exists a positive real $\delta$ such that $E(e^{t\tau_1}) < \infty$ and $E_\pi(e^{tT_0}) < \infty$ for any $t \le \delta$. Let $\mu$ be any law on $E$ such that $\int_E \|P^n(x, \cdot) - \pi\| \, d\mu(x) = O(r^n)$ for some $r < 1$. Then $P_\mu(T_0 > n)$ decreases exponentially fast (see Nummelin and Tuominen (1982)). The result extends to the Markov chain $(\xi_n)$ with transition probability $P$ and initial law $\mu$.

Some insights into the proof

Let $S_n(f) = \sum_{l=1}^n f(\tilde{\xi}_l)$ and $X_i = \sum_{l=T_{i-1}+1}^{T_i} f(\tilde{\xi}_l)$. Recall that $(X_i, \tau_i)_{i>0}$ are iid. Let $\alpha$ be the unique real such that $\operatorname{Cov}(X_k - \alpha \tau_k, \tau_k) = 0$, namely $\alpha = \operatorname{Cov}(X_1, \tau_1)/\operatorname{Var}(\tau_1)$. The random vectors $(X_i - \alpha \tau_i, \tau_i)_{i>0}$ of $\mathbb{R}^2$ are then iid and their marginals are uncorrelated. By the multidimensional strong approximation theorem of Zaitsev (1998), there exist two independent standard Brownian motions $(B_t)_t$ and $(\tilde{B}_t)_t$ such that
$$S_{T_n}(f) - \alpha(T_n - nE(\tau_1)) - v B_n = O(\log n) \quad \text{a.s.} \qquad (1)$$
and
$$T_n - nE(\tau_1) - \tilde{v} \tilde{B}_n = O(\log n) \quad \text{a.s.,} \qquad (2)$$
where $v^2 = \operatorname{Var}(X_1 - \alpha \tau_1)$ and $\tilde{v}^2 = \operatorname{Var}(\tau_1)$. We associate to $T_n$ a Poisson process via (2).
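
Numerically, $\alpha$ is just an ordinary least-squares slope; a tiny sketch with synthetic correlated pairs (a stand-in for the true regeneration blocks, not the chain itself) checks the decorrelation:

```python
import numpy as np

# alpha = Cov(X_1, tau_1) / Var(tau_1) makes Cov(X - alpha*tau, tau) = 0.
rng = np.random.default_rng(3)
tau = rng.geometric(0.5, size=100_000)           # synthetic return times
X = 0.8 * tau + rng.standard_normal(tau.size)    # block sums correlated with tau

C = np.cov(X, tau)
alpha = C[0, 1] / C[1, 1]
resid = X - alpha * tau
print("alpha ≈", alpha)                                        # ≈ 0.8 here
print("Cov(X - alpha*tau, tau) ≈", np.cov(resid, tau)[0, 1])   # ≈ 0
```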

Let $\lambda = (E(\tau_1))^2 / \operatorname{Var}(\tau_1)$. Via KMT, one can construct a Poisson process $N$ (depending on $\tilde{B}$) with parameter $\lambda$ in such a way that
$$\gamma N(n) - nE(\tau_1) - \tilde{v} \tilde{B}_n = O(\log n) \quad \text{a.s.,}$$
where $\gamma = \operatorname{Var}(\tau_1)/E(\tau_1)$, so that $\gamma \lambda = E(\tau_1)$. Therefore, via (2),
$$T_n - \gamma N(n) = O(\log n) \quad \text{a.s.,}$$
and then, via (1),
$$S_{\gamma N(n)}(f) - \alpha \gamma N(n) + \alpha n E(\tau_1) - v B_n = O(\log n) \quad \text{a.s.} \qquad (3)$$
The processes $(B_t)_t$ and $(N_t)_t$ appearing here are independent. Via (3), setting $N^{-1}(k) = \inf\{t > 0 : N(t) \ge k\} = \sum_{l=1}^k E_l$ (the $E_l$ being the iid exponential interarrival times),
$$S_n(f) = v B_{N^{-1}(n/\gamma)} + \alpha n - \alpha E(\tau_1) N^{-1}(n/\gamma) + O(\log n) \quad \text{a.s.}$$
If $v = 0$, the proof is finished. Indeed, by KMT, there exists a Brownian motion $(\widetilde{W}_n)$ (depending on $N$) such that
$$\alpha n - \alpha E(\tau_1) N^{-1}(n/\gamma) = \widetilde{W}_n + O(\log n) \quad \text{a.s.}$$

If $v \ne 0$ and $\alpha = 0$, we have $S_n(f) = v B_{N^{-1}(n/\gamma)} + O(\log n)$ a.s. Using Csörgő, Deheuvels and Horváth (1987) ($B$ and $N$ are independent), one can construct a Brownian motion $W$ (depending on $N$) such that
$$B_{N^{-1}(n/\gamma)} - W_n = O(\log n) \quad \text{a.s.,} \qquad (\star)$$
which leads to the expected result when $\alpha = 0$. However, in the case $\alpha \ne 0$ and $v \ne 0$, we still have
$$S_n(f) = v B_{N^{-1}(n/\gamma)} + \alpha n - \alpha E(\tau_1) N^{-1}(n/\gamma) + O(\log n) \quad \text{a.s.,}$$
and then $S_n(f) = v W_n + \widetilde{W}_n + O(\log n)$ a.s. Since $W$ and $\widetilde{W}$ are not independent, we cannot conclude. Can we construct $W_n$ independent of $N$ (and then of $\widetilde{W}$) such that $(\star)$ still holds?

The key lemma

Let $(B_t)_{t \ge 0}$ be a standard Brownian motion on the line and $\{N(t) : t \ge 0\}$ be a Poisson process with parameter $\lambda > 0$, independent of $(B_t)_{t \ge 0}$. Then one can construct a standard Brownian motion $(W_t)_{t \ge 0}$ independent of $N(\cdot)$ and such that, for any integer $n \ge 2$ and any positive real $x$,
$$\mathbb{P}\Bigl(\sup_{k \le n} \Bigl| B_k - \frac{1}{\sqrt{\lambda}} W_{N(k)} \Bigr| \ge C \log n + x\Bigr) \le A \exp(-Bx),$$
where $A$, $B$ and $C$ are positive constants depending only on $\lambda$. $(W_t)_{t \ge 0}$ may be constructed from the processes $(B_t)_{t \ge 0}$, $N(\cdot)$ and some auxiliary atomless random variable $\delta$ independent of the $\sigma$-field generated by the processes $(B_t)_{t \ge 0}$ and $N(\cdot)$.

Construction of W (1/3)

It will be constructed from $B$ by expanding $B$ in the Haar basis. For $j \in \mathbb{Z}$ and $k \in \mathbb{N}$, let
$$e_{j,k} = 2^{-j/2}\bigl(\mathbf{1}_{]k2^j, (k+\frac{1}{2})2^j]} - \mathbf{1}_{](k+\frac{1}{2})2^j, (k+1)2^j]}\bigr)$$
and
$$Y_{j,k} = \int_0^\infty e_{j,k}(t)\, dB(t) = 2^{-j/2}\bigl(2B_{(k+\frac{1}{2})2^j} - B_{k2^j} - B_{(k+1)2^j}\bigr).$$
Then, since $(e_{j,k})_{j \in \mathbb{Z}, k \ge 0}$ is a total orthonormal system of $L^2(\mathbb{R}^+)$, for any $t \in \mathbb{R}^+$,
$$B_t = \sum_{j \in \mathbb{Z}} \sum_{k \ge 0} \Bigl(\int_0^t e_{j,k}(s)\, ds\Bigr) Y_{j,k}.$$
To construct $W$, we modify the $e_{j,k}$.
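
A small numerical sanity check (my sketch, not from the talk): computing the $Y_{j,k}$ of a simulated Brownian path at the dyadic scales inside $[0,1]$ and verifying that they behave like iid $\mathcal{N}(0,1)$ coefficients:

```python
import numpy as np

# Y_{j,k} = 2^{-j/2} (2 B_{(k+1/2)2^j} - B_{k 2^j} - B_{(k+1)2^j}) from a
# simulated path on the grid k 2^-J of [0, 1]; they should be ~ iid N(0,1).
rng = np.random.default_rng(4)
J = 10
N = 2 ** J
dB = rng.standard_normal(N) * np.sqrt(1.0 / N)
B = np.concatenate([[0.0], np.cumsum(dB)])     # B at the grid points

Y = []
for j in range(-J + 1, 1):           # dyadic scales 2^j, j <= 0, inside [0,1]
    step = int(2 ** j * N)           # grid points per interval of length 2^j
    for k in range(N // step):
        a, m, b = k * step, k * step + step // 2, (k + 1) * step
        Y.append(2 ** (-j / 2) * (2 * B[m] - B[a] - B[b]))
Y = np.array(Y)
print("mean ≈", Y.mean(), " var ≈", Y.var())   # ≈ 0 and ≈ 1
```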

Construction of W (2/3)

Let $E_j = \{k \in \mathbb{N} : N(k2^j) < N((k+\frac{1}{2})2^j) < N((k+1)2^j)\}$. For $j \in \mathbb{Z}$ and $k \in E_j$, let
$$f_{j,k} = c_{j,k}^{-1/2}\bigl(b_{j,k}\, \mathbf{1}_{]N(k2^j), N((k+\frac{1}{2})2^j)]} - a_{j,k}\, \mathbf{1}_{]N((k+\frac{1}{2})2^j), N((k+1)2^j)]}\bigr),$$
where $a_{j,k} = N((k+\frac{1}{2})2^j) - N(k2^j)$, $b_{j,k} = N((k+1)2^j) - N((k+\frac{1}{2})2^j)$, and $c_{j,k} = a_{j,k} b_{j,k} (a_{j,k} + b_{j,k})$. $(f_{j,k})_{j \in \mathbb{Z}, k \in E_j}$ is an orthonormal system whose closure contains the vectors $\mathbf{1}_{]0, N(t)]}$ for $t \in \mathbb{R}^+$, and hence the vectors $\mathbf{1}_{]0, l]}$ for $l \in \mathbb{N}$. Setting $f_{j,k} = 0$ if $k \notin E_j$, we define
$$W_l = \sum_{j \in \mathbb{Z}} \sum_{k \ge 0} \Bigl(\int_0^l f_{j,k}(t)\, dt\Bigr) Y_{j,k} \quad \text{for any } l \in \mathbb{N}, \qquad W_0 = 0.$$

Construction of W (3/3)

$$W_l = \sum_{j \in \mathbb{Z}} \sum_{k \ge 0} \Bigl(\int_0^l f_{j,k}(t)\, dt\Bigr) Y_{j,k} \quad \text{for any } l \in \mathbb{N}, \qquad W_0 = 0.$$
Conditionally on $N$, $(f_{j,k})_{j \in \mathbb{Z}, k \in E_j}$ is an orthonormal system and $(Y_{j,k})$ is a sequence of iid $\mathcal{N}(0,1)$ random variables, independent of $N$. Hence, conditionally on $N$, $(W_l)_{l \ge 0}$ is a Gaussian sequence such that $\operatorname{Cov}(W_l, W_m) = l \wedge m$. Therefore this Gaussian sequence is independent of $N$. By the Skorohod embedding theorem, we can extend it to a standard Wiener process $(W_t)_t$ still independent of $N$.

Take $\lambda = 1$. For any $l \in \mathbb{N}$,
$$B_l - W_{N(l)} = \sum_{j \ge 0} \sum_{k \ge 0} \Bigl(\int_0^l e_{j,k}(t)\, dt - \int_0^{N(l)} f_{j,k}(t)\, dt\Bigr) Y_{j,k}.$$
If $l \notin\, ]k2^j, (k+1)2^j[$, then
$$\int_0^l e_{j,k}(t)\, dt = \int_0^{N(l)} f_{j,k}(t)\, dt = 0.$$
Hence, setting $l_j = [l2^{-j}]$,
$$B_l - W_{N(l)} = \sum_{j \ge 0} \Bigl(\int_0^l e_{j,l_j}(t)\, dt - \int_0^{N(l)} f_{j,l_j}(t)\, dt\Bigr) Y_{j,l_j}.$$
Let $t_j = (l - l_j 2^j)\, 2^{-j}$. We then have
$$B_l - W_{N(l)} = \sum_{j \ge 0} U_{j,l_j} Y_{j,l_j} \mathbf{1}_{t_j \in\, ]0,1/2]} + \sum_{j \ge 0} V_{j,l_j} Y_{j,l_j} \mathbf{1}_{t_j \in\, ]1/2,1[}.$$

The rest of the proof consists in showing that
$$\mathbb{P}\Bigl(\sum_{j \ge 0} U_{j,l_j}^2 \mathbf{1}_{t_j \in\, ]0,1/2]} \ge c_1 \log n + c_2 x\Bigr) \le c_3 e^{-c_4 x},$$
and the same with $V_{j,l_j}^2 \mathbf{1}_{t_j \in\, ]1/2,1[}$. Let $\Pi_{j,k} = N((k+1)2^j) - N(k2^j)$. We use in particular that the conditional law of $\Pi_{j-1,2k}$ given $\Pi_{j,k}$ is a $\mathcal{B}(\Pi_{j,k}, 1/2)$. This property allows one to construct a sequence $(\xi_{j,k})_{j,k}$ of iid $\mathcal{N}(0,1)$ random variables such that
$$\bigl|\Pi_{j-1,2k} - \tfrac{1}{2}\Pi_{j,k}\bigr| \le 1 + \tfrac{1}{2}\Pi_{j,k}^{1/2}\, |\xi_{j,k}|.$$
This comes from Tusnády's lemma: setting $\Phi_m$ the d.f. of $\mathcal{B}(m, 1/2)$ and $\Phi$ the d.f. of $\mathcal{N}(0,1)$,
$$\bigl|\Phi_m^{-1}(\Phi(\xi)) - \tfrac{m}{2}\bigr| \le 1 + \tfrac{\sqrt{m}}{2}\, |\xi|.$$
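
A quick empirical check of the quantile coupling behind Tusnády's lemma (my sketch, assuming scipy is available): $B = \Phi_m^{-1}(\Phi(\xi))$ is exactly $\mathcal{B}(m, 1/2)$-distributed, and the bound above can be tested sample by sample:

```python
import numpy as np
from scipy.stats import binom, norm

# Quantile coupling B = Phi_m^{-1}(Phi(xi)) with xi ~ N(0,1): B ~ Bin(m, 1/2),
# and Tusnady's lemma asserts |B - m/2| <= 1 + (sqrt(m)/2)|xi| surely,
# so the check below should print True.
rng = np.random.default_rng(5)
m = 64
xi = rng.standard_normal(100_000)
B = binom.ppf(norm.cdf(xi), m, 0.5)        # quantile transform

lhs = np.abs(B - m / 2)
rhs = 1 + (np.sqrt(m) / 2) * np.abs(xi)
print("coupling bound holds for all samples:", np.all(lhs <= rhs))
print("max slack:", np.max(lhs - rhs))
```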

Thank you for your attention!