SUMMARY OF RESULTS ON PATH SPACES AND CONVERGENCE IN DISTRIBUTION FOR STOCHASTIC PROCESSES

Similar documents
Weak convergence and Brownian Motion. (telegram style notes) P.J.C. Spreij

Functional Limit theorems for the quadratic variation of a continuous time random walk and for certain stochastic integrals

1 Weak Convergence in R k

Convergence of Feller Processes

Proofs of the martingale FCLT

Stochastic Processes. Winter Term Paolo Di Tella Technische Universität Dresden Institut für Stochastik

LIMITS FOR QUEUES AS THE WAITING ROOM GROWS. Bell Communications Research AT&T Bell Laboratories Red Bank, NJ Murray Hill, NJ 07974

SEPARABILITY AND COMPLETENESS FOR THE WASSERSTEIN DISTANCE

MATH 6605: SUMMARY LECTURE NOTES

The Fluid Limit of an Overloaded Processor Sharing Queue

Introduction to Empirical Processes and Semiparametric Inference Lecture 08: Stochastic Convergence

Empirical Processes: General Weak Convergence Theory

Fundamental Inequalities, Convergence and the Optional Stopping Theorem for Continuous-Time Martingales

Notes 1 : Measure-theoretic foundations I

Filtrations, Markov Processes and Martingales. Lectures on Lévy Processes and Stochastic Calculus, Braunschweig, Lecture 3: The Lévy-Itô Decomposition

The Kadec-Pe lczynski theorem in L p, 1 p < 2

ELEMENTS OF PROBABILITY THEORY

Existence of a Limit on a Dense Set, and. Construction of Continuous Functions on Special Sets

Chapter 5. Weak convergence

OPTIMAL SOLUTIONS TO STOCHASTIC DIFFERENTIAL INCLUSIONS

Stochastic integration. P.J.C. Spreij

3 (Due ). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?

SUPPLEMENT TO PAPER CONVERGENCE OF ADAPTIVE AND INTERACTING MARKOV CHAIN MONTE CARLO ALGORITHMS

Mi-Hwa Ko. t=1 Z t is true. j=0

On the convergence of sequences of random variables: A primer

Verona Course April Lecture 1. Review of probability

LECTURE 12 UNIT ROOT, WEAK CONVERGENCE, FUNCTIONAL CLT

Lecture 17 Brownian motion as a Markov process

Convergence of Markov Processes. Amanda Turner University of Cambridge

1. Stochastic Processes and filtrations

On the Converse Law of Large Numbers

The Lebesgue Integral

STAT 7032 Probability Spring Wlodek Bryc

STAT 331. Martingale Central Limit Theorem and Related Results

An almost sure invariance principle for additive functionals of Markov chains

PROBABILITY THEORY II

Analysis Finite and Infinite Sets The Real Numbers The Cantor Set

Lecture 19 L 2 -Stochastic integration

DISTRIBUTION OF THE SUPREMUM LOCATION OF STATIONARY PROCESSES. 1. Introduction

Martingale Problems. Abhay G. Bhatt Theoretical Statistics and Mathematics Unit Indian Statistical Institute, Delhi

An invariance result for Hammersley s process with sources and sinks

Prohorov s theorem. Bengt Ringnér. October 26, 2008

Weak convergence and Compactness.

µ X (A) = P ( X 1 (A) )

Normed Vector Spaces and Double Duals

i=1 β i,i.e. = β 1 x β x β 1 1 xβ d

An essay on the general theory of stochastic processes

for all f satisfying E[ f(x) ] <.

JUSTIN HARTMANN. F n Σ.

7 Convergence in R d and in Metric Spaces

e - c o m p a n i o n

Elementary properties of functions with one-sided limits

Functional Analysis. Franck Sueur Metric spaces Definitions Completeness Compactness Separability...

Weak convergence. Amsterdam, 13 November Leiden University. Limit theorems. Shota Gugushvili. Generalities. Criteria

Introduction to Empirical Processes and Semiparametric Inference Lecture 09: Stochastic Convergence, Continued

Integration on Measure Spaces

Lecture 21 Representations of Martingales

Stat 8112 Lecture Notes Weak Convergence in Metric Spaces Charles J. Geyer January 23, Metric Spaces

Random Process Lecture 1. Fundamentals of Probability

Lecture 10. Theorem 1.1 [Ergodicity and extremality] A probability measure µ on (Ω, F) is ergodic for T if and only if it is an extremal point in M.

1 Independent increments

A Concise Course on Stochastic Partial Differential Equations

Logarithmic scaling of planar random walk s local times

4th Preparation Sheet - Solutions

MATH5011 Real Analysis I. Exercise 1 Suggested Solution

Part V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory

Metric Spaces and Topology

(c) For each α R \ {0}, the mapping x αx is a homeomorphism of X.

Weak convergence in Probability Theory A summer excursion! Day 3

Lecture 5. If we interpret the index n 0 as time, then a Markov chain simply requires that the future depends only on the present and not on the past.

Stanford Mathematics Department Math 205A Lecture Supplement #4 Borel Regular & Radon Measures

arxiv: v1 [math.pr] 6 Sep 2012

Introduction and Preliminaries

MAT 570 REAL ANALYSIS LECTURE NOTES. Contents. 1. Sets Functions Countability Axiom of choice Equivalence relations 9

Brownian Motion. Chapter Stochastic Process

Stochastic Processes II/ Wahrscheinlichkeitstheorie III. Lecture Notes

Jump Processes. Richard F. Bass

Jointly measurable and progressively measurable stochastic processes

Brownian Motion. 1 Definition Brownian Motion Wiener measure... 3

Exercise Solutions to Functional Analysis

We denote the space of distributions on Ω by D ( Ω) 2.

An Introduction to Stochastic Processes in Continuous Time

OPTIMAL TRANSPORTATION PLANS AND CONVERGENCE IN DISTRIBUTION

Lecture 12. F o s, (1.1) F t := s>t

PACKING-DIMENSION PROFILES AND FRACTIONAL BROWNIAN MOTION

Some basic elements of Probability Theory

GAUSSIAN PROCESSES; KOLMOGOROV-CHENTSOV THEOREM

Part III. 10 Topological Space Basics. Topological Spaces

B. Appendix B. Topological vector spaces

Chapter 2 Random Ordinary Differential Equations

A BOREL SOLUTION TO THE HORN-TARSKI PROBLEM. MSC 2000: 03E05, 03E20, 06A10 Keywords: Chain Conditions, Boolean Algebras.

Stable Lévy motion with values in the Skorokhod space: construction and approximation

Theorem 2.1 (Caratheodory). A (countably additive) probability measure on a field has an extension. n=1

CLASSES OF STRICTLY SINGULAR OPERATORS AND THEIR PRODUCTS

7 About Egorov s and Lusin s theorems

Ergodic Theorems. Samy Tindel. Purdue University. Probability Theory 2 - MA 539. Taken from Probability: Theory and examples by R.

NOTES ON THE REGULARITY OF QUASICONFORMAL HOMEOMORPHISMS

CLASSES OF STRICTLY SINGULAR OPERATORS AND THEIR PRODUCTS

The Skorokhod problem in a time-dependent interval

Translation Invariant Exclusion Processes (Book in Progress)

Transcription:

SUMMARY OF RESULTS ON PATH SPACES AND CONVERGENCE IN DISTRIBUTION FOR STOCHASTIC PROCESSES RUTH J. WILLIAMS October 2, 2017 Department of Mathematics, University of California, San Diego, 9500 Gilman Drive, La Jolla CA 92093-0112, USA, email: williams@math.ucsd.edu 1

Contents A Notation 3 B Stochastic Processes 3 C Path Spaces 4 C.1 Definitions and Topologies for C d and D d............ 4 C.2 Compact Sets in D d........................ 5 D Weak Convergence of Probability Measures 7 D.1 General Definitions and Results................. 7 D.2 Tightness of Probability Measures on D d............ 8 E Convergence in Distribution for Stochastic Processes 9 E.1 Types of Convergence....................... 9 E.2 Tightness and Continuity of Limit Processes.......... 9 E.3 Continuous Mapping Theorem.................. 11 E.4 Random Time Change...................... 11 F Functional Central Limit Theorems 12 2

A Notation For each integer d 1, let R d denote d-dimensional Euclidean space and let R d + denote the d-dimensional non-negative orthant {x R d : x = (x 1,...,x d ), x i 0 for i = 1,...,d}. We will use the following norm on R d : x max 1 i d x i, x R d. (1) We endow R d with the σ-algebra of Borel sets. When d = 1, we will omit the qualifying superscript d. B Stochastic Processes A d-dimensional stochastic process is a collection of random variables X = {X(t),t 0} defined on a common probability space (Ω,F,P) and taking values in R d. In particular, for each t 0, the function X(t) : Ω R d is measurable, where Ω is endowed with the σ-algebra F and R d is endowed with the Borel σ-algebra. We shall often write X(t,ω) or X t (ω) in place of X(t)(ω) for t 0, ω Ω. Such a stochastic process X is said to have r.c.l.l. (sample) paths if for each ω Ω, the function t X(t,ω) from [0, ) into R d is right continuous on [0, ) and has finite limits from the left on (0, ). Here r.c.l.l. stands for right continuous with finite left limits. The acronym càdlàg for the equivalent French phrase is used by some authors instead of r.c.l.l. The stochastic process X is said to have continuous (sample) paths if for each ω Ω, the function t X(t,ω) is continuous from [0, ) into R d. We will be concerned with the construction of, and convergence in distribution of, d-dimensional stochastic processes having continuous paths or r.c.l.l. paths. The next few sections summarize some basic definitions and properties needed for this. For more details, the reader is referred to Billingsley (1999) or Ethier and Kurtz (1986) or Jacod and Shiryaev (1987). 3

C Path Spaces C.1 Definitions and Topologies for C d and D d For d 1, let C d C([0, ), R d ) denote the space of continuous functions from [0, ) into R d. When d = 1, we shall suppress the superscript d. We endow C d with the topology of uniform convergence on compact time intervals. Let D d denote the space of functions from [0, ) into R d that are right continuous on [0, ) and have finite left limits on (0, ). When d = 1, we shall suppress the superscript d. An element x D d only has discontinuities of jump type and there are only countably many points in (0, )where x has a jump discontinuity. We endow D d with the (Skorokhod) J 1 -topology. There is a metric m J1 on D d which induces this topology and under which the space is a complete, separable metric space (i.e., a Polish space). For our purposes, we do not need to know the precise form of this metric, rather it will suffice to characterize convergence of sequences in the J 1 -topology. For this, let Γ denote the set of functions γ : [0, ) [0, ) that are strictly increasing and continuous with γ(0) = 0 and lim t γ(t) =. (In particular, γ maps [0, ) onto [0, ).) A sequence {x n } n=1 in Dd converges to x D d in the J 1 -topology if and only if for each T > 0 there is a sequence {γ n } n=1 (possibly depending on T) in Γ such that sup γ n (t) t 0 as n, and (2) 0 t T sup x n (γ n (t)) x(t) 0 as n. (3) 0 t T If x is in C d, {x n } n=1 converges to x in the J 1-topology if and only if {x n } n=1 converges to x uniformly on compact time intervals. In particular, with the topologies described above, C d is a topological subspace of D d. For later use, for each T 0, we define x T = sup x(t), for x D d. (4) t [0,T] 4

We note in passing that one may endow D d with the topology of uniform convergence on compact time intervals. However, this topology is finer than the J 1 -topology and D d with this topology is not separable. Remark. The reader is cautioned here that for positive integers d and k, the product space C d C k is the same topologically as the space C d+k, because a sequence of functions {x n } in C d say, converges uniformly on each compact time interval if and only if each component function x n i,i = 1,...,d, converges uniformly on each compact time interval. On the other hand, the productspaced d D k isnotthesametopologicallyasthespaced d+k,because in defining the J 1 -topology a change of time scale (given by functions γ n, cf. (2)), is used and this may be different for two sequences taking values in the two spaces D d and D k, respectively. For the purpose of defining measurability, we endow C d with the Borel σ- algebra associated with the topology of uniform convergence on compact time intervals; this agrees with the σ-algebra C d = σ{x(t) : x C d, 0 t < } which is the smallest σ-algebra on C d such that for each t 0, the projection mapping x x(t), from C d into R d, is measurable. We endow the space D d with the Borel σ-algebra associated with the J 1 -topology; this agrees with the σ-algebra D d = σ{x(t) : x D d, 0 t < }. As usual, when d = 1, we shall omit the superscript d. Consider a d-dimensional stochastic process X defined on a probability space (Ω,F,P). If X has r.c.l.l. (resp. continuous) paths, then ω X(,ω) defines a measurable mapping from (Ω,F) into (D d,d d ) (resp. (C d,c d )) and it induces a probability measure π on (D d,d d ) (resp. (C d,c d )) via π(a) = P({ω : X(,ω) A}) for all A D d (resp. C d ). This probability measure π is called the law of X. C.2 Compact Sets in D d To develop a criterion for relative compactness of probability measures associated with d-dimensional stochastic processes having r.c.l.l. paths, we need a 5

characterization of the (relatively) compact sets in D d with the J 1 -topology. Definition C.1. For each x D d, δ > 0 and T > 0, define w (x,δ,t) = inf {t i } max i sup s,t [t i 1,t i ) x(s) x(t) where {t i } ranges over all partitions of the form0=t 0 < < t n 1 < T t n with min(t i t i 1 ) > δ and n 1. Remark. This definition is slightly different from that in Billingsley (1999) and is taken from Ethier and Kurtz (1986), p. 122. The notation w (x,δ,t) is used rather than simply w(x,δ,t) because w(x,δ,t) is commonly used for the usual modulus of continuity. The quantitly w (x,δ,t) is often called a modified modulus of continuity. Proposition C.2. For each x D d and T > 0, lim δ 0 w (x,δ,t) = 0. For each δ > 0 and T > 0,w (,δ,t) is a Borel measurable function from D d into R. Proof. See Ethier and Kurtz (1986), Lemma 3.6.2. Proposition C.3. A set A D d is relatively compact if and only if the following two conditions hold for each T > 0: (i) sup x A x T <, (ii) lim δ 0 sup x A w (x,δ,t) = 0. Proof. See Ethier and Kurtz (1986), Theorem 3.6.3. Remark. One can replace condition (i) by the following and then the same result holds: (i) for each rational t [0,T], sup x(t) <. x A 6

D Weak Convergence of Probability Measures D.1 General Definitions and Results Let (M,m) be a complete separable metric space where M denotes the set and m is the metric. Let M denote the σ-algebra of Borel sets associated with the topology induced on M by m. Definition D.1. A sequence of probability measures {π n } n=1 on (M,M) converges weakly to a probability measure π on (M,M) if and only if f dπ n f dπ as n M for each bounded continuous function f : M R. M Definition D.2. A family of probability measures Π on (M,M) is (weakly) relatively compact if each sequence {π n } n=1 in Π has a subsequence that converges weakly to a probability measure π on (M,M). Definition D.3. A family of probability measures Π on (M,M) is tight if for each ε > 0 there is a compact set A in M such that π(a) > 1 ε for all π Π. Theorem D.4 (Prohorov s Theorem). A family of probability measures on (M,M) is tight if and only if it is (weakly) relatively compact. Proof. See Billingsley (1999), Theorems 5.1 and 5.2, or Ethier and Kurtz (1986), Theorem 3.2.2. Remark. The if part of this proposition uses the fact that the metric space (M, m) is separable and complete. Corollary D.5. Suppose that {π n } n=1 is a tight family of probability measures on (M,M) and that there is a probability measure π on (M,M) such thateachweaklyconvergentsubsequenceof{π n } n=1 has limitπ. Then{π n } n=1 converges to π. 7

Proof. See Billingsley (1999), p. 59. The following theorem is often useful for reducing arguments about convergence of probability laws associated with stochastic processes to real analysis arguments based on almost sure convergence of equivalent distributional representatives for those processes. Theorem D.6 (Skorokhod Representation Theorem). Suppose that π and {π n } n=1 are all probability measures on (M,M) and that {π n} n=1 converges weakly to π. Then there exists a probability space (Ω,F,P) on which are defined M-valued random variables X and {X n } n=1 such that X has distribution π, X n has distribution π n for n = 1,2,..., and X n X P-a.s. as n. Proof. See Ethier and Kurtz (1986), Theorem 3.1.8. D.2 Tightness of Probability Measures on D d Combining the characterization of relative compactness in D d with Prohorov s Theorem yields the following. Theorem D.7. A sequence of probability measures {π n } n=1 on (Dd,D d ) is tight if and only if for each T > 0 and ε > 0, (i) lim limsup π n ({x D d : x T K}) = 0, K n (ii) lim δ 0 limsup n π n ({x D d : w (x,δ,t) ε}) = 0. Proof. See Ethier and Kurtz (1986), Theorem, 3.7.4, or Billingsley (1999), Theorem 16.8. Remark. Condition (i) can be replaced by the following and then the above theorem still holds: (i) for each rational t [0,T], lim limsup π n ({x D d : x(t) K}) = 0. K 8 n

E Convergence in Distribution for Stochastic Processes E.1 Types of Convergence Suppose that X, X n,n = 1,2,..., are d-dimensional stochastic processes with r.c.l.l. paths (these processes may be defined on different probability spaces). Let π denote the law of X and for each n = 1,2,..., let π n denote the law of X n. The sequence of processes {X n } n=1 converges in distribution to X if and only if {π n } n=1 converges weakly to π. Some authors abuse terminology and say that {X n } n=1 converges weakly to X. We denote such convergence in distribution by X n X as n. SupposethatX and{x n } n=1 arealldefinedonthesameprobabilityspace (Ω,F,P). Then {X n } n=1 converges to X almost surely (resp. in probability) if and only if lim n m J1 (X n (ω),x(ω)) = 0 for P-almost every ω Ω (resp. for each ε > 0, lim n P(ω Ω : m J1 (X n (ω),x(ω)) ε) = 0). (Here m J1 is the metric introduced previously which induces the J 1 -topology on D d.) We denote the almost sure convergence by X n X a.s. as n, and we denote the convergence in probability by X n X in prob. as n. If {X n } n=1 converges to X almost surely or in probability, then {Xn } n=1 converges in distribution to X. Conversely, if {X n } n=1 converges in distribution to X and X is a.s. constant, then {X n } n=1 converges in probability to X as n. E.2 Tightness and Continuity of Limit Processes The following criterion due to Aldous often provides a convenient mechanism for verifying tightness of the laws associated with d-dimensional stochastic processes having r.c.l.l. paths. Theorem E.1. Let {X n } n=1 be a sequence of d-dimensional stochastic processes with r.c.l.l. paths. Then the probability measures induced on D d by {X n } n=1 are tight if the following two conditions hold for each T > 0: 9

(i) lim limsup P( X n T K) = 0, K n (ii) for each ε > 0,η > 0, there are positive constants δ ε,η and n ε,η such that for all 0 < δ δ ε,η and n n ε,η, sup P( X n (τ +δ) X n (τ) ε) η τ T f [0,T] where T f [0,T] denotes the set of all stopping times relative to the filtration generated by X n that take values in a finite subset of [0,T]. Proof. See Billingsley (1999), Theorem 16.10. The following proposition provides a useful criterion for checking when a sequence of d-dimensional stochastic processes with r.c.l.l. paths has associated laws that are tight and whose limit points are concentrated on the set of continuous paths C d. (We say such a sequence of laws is C-tight and (with an abuse of terminology) we sometimes say the sequence of processes is C-tight.) Proposition E.2. Let {X n } n=1 be a sequence of d-dimensional processes with r.c.l.l. paths. Then the sequence of probability measures {π n } n=1 induced on D d by {X n } n=1 is tight and any weak limit point of this sequence is concentrated on C d if and only if the following two conditions hold for each T > 0 and ε > 0: (i) lim K limsup n P( X n T K) = 0, (ii) lim δ 0 limsup n P(w(X n,δ,t) ε) = 0, where for x D d, { } w(x,δ,t) = sup sup x(u) x(v) : 0 t < t+δ T u,v [t,t+δ]. (5) Proof. See Proposition VI.3.26 in Jacod and Shiryaev (1987). 10

E.3 Continuous Mapping Theorem Theorem E.3 (Continuous Mapping Theorem). For fixed positive integers d and k, let h n, n = 1,2,..., h be measurable functions from (D d,d d ) into (D k,d k ). Define C h = {x D d : h n (x n ) h(x) in D k whenever x n x in D d }. Let X, X n,n = 1,2,..., be d-dimensional stochastic processes with r.c.l.l. paths such that X n X as n and P(X C h ) = 1. Then h n (X n ) h(x) as n. Exercise E.4. Prove the continuous mapping theorem. Hint: use the Skorokhod Representation Theorem. Exercise E.5. Verify that the following mapping is continuous. Consider h : D d D defined by h(x)(t) = sup x(s), t 0. 0 s t E.4 Random Time Change Let D o = {φ D : φ is non-decreasing and 0 φ(t) < for all t 0}. Note that if x D d and φ D o, then x φ D d. Theorem E.6 (Random time change theorem). Suppose that {X n } n=1, X are d-dimensional stochastic processes with r.c.l.l. paths and {Φ n } n=1, Φ are one-dimensional stochastic processes whose paths are in D o. If (X n,φ n ) (X,Φ) as n and P(X C d ) = 1, then X n Φ n X Φ as n. Exercise E.7. Prove this theorem using the continuous mapping theorem by showing that the mapping ψ : D d D o D d given by ψ(x,φ) = x φ is continuous at x C d. 11

F Functional Central Limit Theorems Theorem F.1 (Donsker s Theorem). Let v 1,v 2,... be independent, identically distributed (i.i.d.) real-valued random variables with mean µ (, ) and variance σ 2 (0, ). Let V n (t) = 1 σ n [nt] v i µnt, t 0. (6) i=1 Then V n W, where W is a standard one-dimensional Brownian motion. Proof. This is usually shown in two steps: (i) convergence of finite dimensional distributions, and (ii) tightness. See Billingsley (1999), Theorem 14.1, or Ethier and Kurtz (1986), Theorem 5.1.2. Remark. This result is also called a Functional Central Limit Theorem (FCLT) or Invariance Principle. Remark. There are extensions of Donsker s theorem that relax the i.i.d. requirements. Theorem F.2. (Functional Central Limit Theorem for Renewal Processes) Let u 1,u 2,... be i.i.d. positive random variables with mean λ 1 (0, ) and variance σ 2 (0, ). For each t 0, define and set N(t) = sup{k 0 : u 1 +...+u k t} N n (t) = 1 λ 3/2 σ n (N(nt) λnt). Then N n W as n, where W is a standard one-dimensionalbrownian motion. Proof. This follows from Donsker s theorem by a clever inversion argument, see Billingsley (1999), Theorem 14.6. 12

References P. Billingsley. Convergence of probability measures. John Wiley & Sons Inc., New York, second edition, 1999. S. N. Ethier and T. G. Kurtz. Markov Processes: Characterization and Convergence. Wiley, New York, 1986. J. Jacod and A. N. Shiryaev. Limit Theorems for Stochastic Processes. Springer-Verlag, New York, 1987. 13