Weak convergence and Compactness.

Similar documents
Weak convergence and Brownian Motion. (telegram style notes) P.J.C. Spreij

THEOREMS, ETC., FOR MATH 515

Convergence of Feller Processes

Problem Set 2: Solutions Math 201A: Fall 2016

g 2 (x) (1/3)M 1 = (1/3)(2/3)M.

Reminder Notes for the Course on Measures on Topological Spaces

MATH 6605: SUMMARY LECTURE NOTES

1 Weak Convergence in R k

SEPARABILITY AND COMPLETENESS FOR THE WASSERSTEIN DISTANCE

MTG 5316/4302 FALL 2018 REVIEW FINAL

Wiener Measure and Brownian Motion

MATS113 ADVANCED MEASURE THEORY SPRING 2016

+ 2x sin x. f(b i ) f(a i ) < ɛ. i=1. i=1

(1) Consider the space S consisting of all continuous real-valued functions on the closed interval [0, 1]. For f, g S, define

THEOREMS, ETC., FOR MATH 516

CHAPTER I THE RIESZ REPRESENTATION THEOREM

Real Analysis Problems

7 About Egorov s and Lusin s theorems

Problem Set 5. 2 n k. Then a nk (x) = 1+( 1)k

PROBLEMS. (b) (Polarization Identity) Show that in any inner product space

ELEMENTS OF PROBABILITY THEORY

Exercises Measure Theoretic Probability

Recall that if X is a compact metric space, C(X), the space of continuous (real-valued) functions on X, is a Banach space with the norm

Continuity of convex functions in normed spaces

Weak convergence. Amsterdam, 13 November Leiden University. Limit theorems. Shota Gugushvili. Generalities. Criteria

From now on, we will represent a metric space with (X, d). Here are some examples: i=1 (x i y i ) p ) 1 p, p 1.

6. Brownian Motion. Q(A) = P [ ω : x(, ω) A )

(convex combination!). Use convexity of f and multiply by the common denominator to get. Interchanging the role of x and y, we obtain that f is ( 2M ε

Measure Theory on Topological Spaces. Course: Prof. Tony Dorlas 2010 Typset: Cathal Ormond

MATH 202B - Problem Set 5

Stochastic Processes. Winter Term Paolo Di Tella Technische Universität Dresden Institut für Stochastik

Random Process Lecture 1. Fundamentals of Probability

Empirical Processes: General Weak Convergence Theory

Exercises Measure Theoretic Probability

SUMMARY OF RESULTS ON PATH SPACES AND CONVERGENCE IN DISTRIBUTION FOR STOCHASTIC PROCESSES

A SKOROHOD REPRESENTATION THEOREM WITHOUT SEPARABILITY PATRIZIA BERTI, LUCA PRATELLI, AND PIETRO RIGO

3 (Due ). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?

7 Complete metric spaces and function spaces

Disintegration into conditional measures: Rokhlin s theorem

Prohorov s theorem. Bengt Ringnér. October 26, 2008

MATHS 730 FC Lecture Notes March 5, Introduction

(2) E M = E C = X\E M

Introduction and Preliminaries

Functional Analysis. Franck Sueur Metric spaces Definitions Completeness Compactness Separability...

Introduction to Empirical Processes and Semiparametric Inference Lecture 08: Stochastic Convergence

Probability and Measure

(U) =, if 0 U, 1 U, (U) = X, if 0 U, and 1 U. (U) = E, if 0 U, but 1 U. (U) = X \ E if 0 U, but 1 U. n=1 A n, then A M.

Analysis III Theorems, Propositions & Lemmas... Oh My!

Integration on Measure Spaces

REAL AND COMPLEX ANALYSIS

ABSTRACT INTEGRATION CHAPTER ONE

1.4 Outer measures 10 CHAPTER 1. MEASURE

02. Measure and integral. 1. Borel-measurable functions and pointwise limits

MATH 722, COMPLEX ANALYSIS, SPRING 2009 PART 5

7 Convergence in R d and in Metric Spaces

Analysis Comprehensive Exam Questions Fall F(x) = 1 x. f(t)dt. t 1 2. tf 2 (t)dt. and g(t, x) = 2 t. 2 t

A VERY BRIEF REVIEW OF MEASURE THEORY

Lecture Notes in Advanced Calculus 1 (80315) Raz Kupferman Institute of Mathematics The Hebrew University

Measures and Measure Spaces

δ xj β n = 1 n Theorem 1.1. The sequence {P n } satisfies a large deviation principle on M(X) with the rate function I(β) given by

ECARES Université Libre de Bruxelles MATH CAMP Basic Topology

l(y j ) = 0 for all y j (1)

MAT 578 FUNCTIONAL ANALYSIS EXERCISES

Spaces of continuous functions

Metric Spaces and Topology

Class Notes for Math 921/922: Real Analysis, Instructor Mikil Foss

Notes of the seminar Evolution Equations in Probability Spaces and the Continuity Equation

Compact operators on Banach spaces

Bounded and continuous functions on a locally compact Hausdorff space and dual spaces

Examples of Dual Spaces from Measure Theory

Continuous Functions on Metric Spaces

Riesz Representation Theorems

2 (Bonus). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?

Economics 204 Summer/Fall 2011 Lecture 5 Friday July 29, 2011

STAT331 Lebesgue-Stieltjes Integrals, Martingales, Counting Processes

Brownian Motion. Chapter Stochastic Process

4 Countability axioms

OPTIMAL SOLUTIONS TO STOCHASTIC DIFFERENTIAL INCLUSIONS

A note on some approximation theorems in measure theory

1 Preliminaries on locally compact spaces and Radon measure

MATH 104 : Final Exam

Real Analysis Notes. Thomas Goller

Contents Ordered Fields... 2 Ordered sets and fields... 2 Construction of the Reals 1: Dedekind Cuts... 2 Metric Spaces... 3

INTRODUCTION TO MEASURE THEORY AND LEBESGUE INTEGRATION

Measure and integration

INTRODUCTION TO TOPOLOGY, MATH 141, PRACTICE PROBLEMS

Part III. 10 Topological Space Basics. Topological Spaces

MATH MEASURE THEORY AND FOURIER ANALYSIS. Contents

APPROXIMATING BOCHNER INTEGRALS BY RIEMANN SUMS

Doléans measures. Appendix C. C.1 Introduction

Measures. Chapter Some prerequisites. 1.2 Introduction

The Kolmogorov extension theorem

Real Analysis Chapter 4 Solutions Jonathan Conder

Semicontinuous functions and convexity

Spectral theory for compact operators on Banach spaces

Basic Definitions: Indexed Collections and Random Functions

Math General Topology Fall 2012 Homework 11 Solutions

Summer Jump-Start Program for Analysis, 2012 Song-Ying Li. 1 Lecture 7: Equicontinuity and Series of functions

Analysis Finite and Infinite Sets The Real Numbers The Cantor Set

Daniel Akech Thiong Math 501: Real Analysis Homework Problems with Solutions Fall Problem. 1 Give an example of a mapping f : X Y such that

Transcription:

Chapter 4 Weak convergence and Compactness. Let be a complete separable metic space and B its Borel σ field. We denote by M() the space of probability measures on (, B). A sequence µ n M() of probability measures converges weakly to a probability measure µ M() (µ n µ) if for every bounded continuous function f : R we have f(x)dµ n = f(x)dµ (4.1) Theorem 4.1. The following are equivalent. 1. µ n µ 2. For every bounded uniformly continuous function f f(x)dµ n = f(x)dµ 3. For every closed set C 4. For every open set G sup µ n (C) µ(c) inf µ n(g) µ(g) 5. For every continuity set A, i.e. A suc that µ(ā) = µ(a) = µ(ao ), we have µ n(a) = µ(a) 1

2 CHAPTER 4. WEAK CONVERGENCE AND COMPACTNESS. 1 Proof. Clearly 1 2. To show that 2 3, we consider f k (x) = [ 1+d(x,C) ]k. f k (x) is uniformly continuous and bounded by 1. f k (x) 1 C (x) and f k (x) 1 C (x) as n. For every k sup µ n (C) f k (x)dµ n = f k (x)dµ By letting k, we obtain sup µ n (C) µ(c) We see that 3 4 by taking complements. For any set A, µ(a c ) = 1 µ(a). It is clear that 3 and 4 5. Because A o is open and Ā is closed, µ(a o ) inf µ n (A o ) inf µ n (A) sup µ n (A) supµ n (Ā) µ(ā) If µ(a o ) = µ(a) = µ(ā), we have equality everywhere and that implies 5. Finally to prove that 5 1, we take a bounded continuous function f and approximate it uniformly by a simple function f k = k i=1 a i1 Ai (x). The natural choice for A i is A i = {x : a i f(x) a i+1 }. Since the sets {x : f(x) = a} are disjoint all but a countable number of them are of measure 0, under µ. We can pick {a i } avoiding this countable set. Then all our sets A i will be continuity sets and so for every k. f k (x)dµ n = f k (x)dµ Since f k approximates f uniformly it follows that f(x)dµ n = f(x)dµ which is 1. In a complete separable metric space any probability measure P, is essentially supported on a compact set. More precisely Theorem 4.2. Given any probability measure µ on the Borel subsets B of a complete separable metric space and ǫ > 0, there exists a compact set K ǫ such that µ(k ǫ ) 1 ǫ. Proof.. Consider a countable dense subset D = {x j } of and spheres {S(x j, 1 k )} of radius 1 k around points x j D. Clearly j B(x j, 1 k ) = and by the countable additivity of µ, for each k, we can find n k such that µ[ n k j=1 B(x j, 1 k )] 1 ǫ 2 k The set E = k n k j=1 B(x j, 1 k ) is totally bounded and has the property µ(e) 1 ǫ. Since is complete, the closure Ē of E is compact.

3 Remark 4.1. The space M() of probability measures on, with weak convergence is a complete separable metric space. In other words, weak convergence can be metrized. To do this we note that weak convergence is a topological notion and is not altered if we change the metric to an equivalent one. But the notion of uniform continuity depends on the metric. We can change the metric so that the set of bounded uniformly continuous functions is separable. We imbed into Y, the countable product of unit intervals, which is compact. Then under the metric inherited from Y, will not be complete. But its completion, which is the closure of in Y is a compact metric space. The space of uniformly continuous functions on is the same as the space of continuous functions on. Since is a compact metric space, C( ) is separable. Then we can define d(µ 1, µ 2 ) = j=1 1 1 2 k 1 + f k f k dµ 1 f k dµ 2 where {f k } is a dense set of uniformly continuous functions on. By Theorem 4.1 convergence in this metric is equivalent to weak convergence. The next theorem due to Prokhorov gives a necessary and sufficient condition for a subset A M() to be conditionally compact in the weak topology, i.e. every sequence from A has a weakly convergent subsequence. Theorem 4.3. A M() is conditionally compact in the weak topology if and only if for any ǫ > 0, there is a compact set K ǫ (independent of µ) such that for all µ A. µ(k ǫ ) 1 ǫ Proof. Necessity. In the proof of Theorem 4.2 a crucial step was the it µ( n j=1s(x j, 1 k )) = 1 n j=1 S(x j, 1 k ) is an open set and A is contained in a compact set. By Theorem 4.1, µ(g) is lower semicontinuous as a function of µ in the weak topology. Dini s theorem ensures that the convergence above is uniform for all µ A and that completes the proof of necessity. The construction of K ǫ as in Theorem4.2 works simultaneously for all µ A. Sufficiency. We first note that if is compact, then M() is compact. C() is separable and given any sequence µ n from M(), we can definitely choose, by diagonalization, a subsequence such that Λ(f) = f(x)dµ n exists for f in a a countable dense subset and therefore for all f C(). We appeal to the Riesz representation theorem to find a µ M() such that Λ(f) = f(x)dµ

4 CHAPTER 4. WEAK CONVERGENCE AND COMPACTNESS. Actually if M() is taken as the set of all finite measures instead of just probability measures, only some minor changes need to be made. One has to be sure that µ n () µ() and this has to be added, if it not already implied, in any of the equivalent formulations of 4.1 (for instance in 3 and 4). In particular if is compact and µ n () is bounded there is a subsequence that converges weakly to a measure µ. If is not compact, but we have a collection A M(), satisfying the uniform tightness condition, we restrict µ n to K k to get µ n,k satisfying 1 µ n,k (K k ) 1 1 k. By the diagonalization procedure, we can assume that along a subsequence (that we continue to denote by µ n ), for every k, µ n,k = α k exists. It is not difficult to see that for k > l α k α l and it follows that k α k = α exists and µ n α. While one can use Garsia-Rodemich-Rumsey estimates to establish tightness there are some other methods are equally useful. There are stochastic processes that admit jumps like Poisson processes, and the space on which to put these measures is the space D[0, 1] of functions x(t) on [0, T] that have left and right its at every point, x(t + 0) = x(t) and x(t 0) = x(t). A sequence x n (t) of such functions converges (in Skorohod J-1 topology) to x(t) in D[0, 1] if there are one to one continuous maps λ n (t) of [0, 1] onto itself such that sup λ n (t) t + sup x n (λ n (t)) x(t) 0 0 t T 0 t T If x(t) and y(t) are close in this topology then their jumps do not have to align perfectly, but to every jump of significant size in one there be a corresponding jump of nearly the same size at a nearby location in the other. There is a version of Ascoli-Arzela theorem here. For a set D of functions in D[0, 1] to be conditionally compact it is necessary and sufficient that their jumps if any be uniformly bounded and for any δ > 0, there be a uniform bound on the number of jumps of size greater than δ. More over these jumps should stay away by a uniform distance from each other. Two jumps of significant size are not allowed to come together and combine. If we denote by f (a, b) the oscillation of the function f in the open interval (a, b), i.e. f (a, b) = sup f(c) f(d) a<c<d<b then f (a, b) = min a<c<b max{ f(a, c), f (c, b)}

5 It is almost the oscillation in (a, b) except that we discount one jump at some point c in between. The modulus continuity of f in D[0, 1] is defines as ω f (h) = max{ sup 0 a<b 1 a b h f(a, b), sup f(0) f(a), 0 a h sup 1 h a 1 f(1) f(a) } The Ascoli-Arzela theorem in this context requires ω f (h) to go to 0 uniformly and the functions f to be uniformly bounded. (which will follow from the behavior of the modulus of continuity if the functions are bounded at 0 and the jumps are uniformly bounded.) Given a function f in D[0, 1], let us define successively τ 1 = {inf t : f(t) f(0) ǫ} τ j = {inf t τ j 1 : f(t) f(τ j 1 ) ǫ} until τ j fails to exist, i.e sup τj 1 t 1 x(t) x(τ j 1) < ǫ. Then the modulus of continuity can be estimated. If h = min{τ 1, τ 2 τ 1,...,τ j 1 τ j 2 }, then it is not difficult to see that, because any interval of length less than h can contain at most one of the points τ 1,..., τ j 1, ω f (h) ǫ We now develop some tools to estimate the modulus of continuity. Lemma 4.4. Let {S j : 1 j N} be a collection of random variables such that P[S 0 = 0] = 1, S j is adapted to the filtration F j and sup esssup ω P[ S k S j ǫ F j ] p(ǫ) < 1 0 j k N Then Proof. P[ sup S j 2ǫ] p(ǫ) 1 p(ǫ) P[ sup N 1 S j 2ǫ, S N ǫ] = P[ sup S j < 2ǫ, S k 2ǫ, S N ǫ] 0 j k 1 k=1 N 1 k=1 P[ sup S j < 2ǫ, S k 2ǫ, S N S k ǫ] 0 j k 1 N 1 p(ǫ) P[ sup S j < 2ǫ, S k 2ǫ] 0 j k 1 k=1 = p(ǫ)p[ sup S j 2ǫ]

6 CHAPTER 4. WEAK CONVERGENCE AND COMPACTNESS. Adding the two bounds which proves the lemma. P[ sup S j 2ǫ, S N > ǫ] p(ǫ) P[ sup S j 2ǫ] p(ǫ) + p(ǫ)p[ sup S j 2ǫ] This lemma will allow us to estimate P[τ j τ j 1 h F τj 1 ] and can be used as input for the next lemma. Lemma 4.5. Let esssup ω P[τ j τ j 1 h F τj 1 ] φ ǫ (h). Then P[ ω f (h) ǫ] inf j,s [j φ ǫ(h) + e [ φ ǫ (s) + e s [1 φ ǫ (s)] ] j ] Proof. Clearly for any s > 0, P[ ω f (h) ǫ] P[min{τ 1, τ 2 τ 1,..., τ j 1 τ j 2 } h] + P[τ j 1] (j 1)φ(h) + ee[e τj ] (j 1)φ ǫ (h) + e [ φ ǫ (s) + e s [1 φ ǫ (s)] ] j Remark 4.2. If a sequence in D[0, 1] converges to a it that is a continuous function, then the it is uniform. If the size of the largest jump goes to zero, then the convergence is to a it in C[0, 1]. Example 4.1. Consider i.i.d. random variable { i } with mean zero and variance 1. Let S n = 1 + 2 + + n and n (t) = 1 n S [ nt]. By central it theorem the finite dimensional distributions of n (t) converge to the finite dimensional distributions of (t) the Brownian motion. The process n (t) can be realized in D[0, 1] and we want to show weak convergence of the processes P n, in D[0, 1]. First we note that for k j, and as n, P n [ x( k n ) x( j n ) ǫ B j ] k j n nǫ 2 P[ max 1 j n j ǫ n] np[ ǫ n] 0