Notes 19 : Martingale CLT


Math 733-734: Theory of Probability
Lecturer: Sebastien Roch
References: [Bil95, Chapter 35], [Roc, Chapter 3].

Since we have not encountered weak convergence in some time, we first recall a number of definitions and results that will be useful to keep in mind throughout this lecture.

DEF 19.1 (Distribution function) Let X be a RV on a triple (Ω, F, P). The law of X is L_X = P ∘ X^{-1}, which is a probability measure on (R, B). By the Uniqueness of Extension lemma, L_X is determined by the distribution function (DF) of X,

    F_X(x) = P[X ≤ x],  x ∈ R.

DEF 19.2 (Convergence in distribution) A sequence of DFs (F_n) converges in distribution (or weakly) to a DF F_∞ if F_n(x) → F_∞(x) for all points of continuity x of F_∞. If F_n and F_∞ are the DFs of X_n and X_∞ respectively, we write X_n ⇒ X_∞.

THM 19.3 (Convergence in distribution v. in probability) Suppose that (X_n) is a sequence of RVs with X_n ∼ F_n. The following holds:

1. If X_n → X in probability, then X_n ⇒ X.
2. If X_n ⇒ c, where c is a constant, then X_n → c in probability.

LEM 19.4 (Converging together lemma) If X_n ⇒ X and Z_n − X_n ⇒ 0, then Z_n ⇒ X.

LEM 19.5 (Multiplying by a converging sequence) If X_n ⇒ X and Y_n ⇒ c with Y_n ≥ 0 and c > 0, then X_n Y_n ⇒ cX.

DEF 19.6 (Characteristic function) The characteristic function (CF) of a RV X is

    φ_X(t) = E[e^{itX}] = E[cos(tX)] + i E[sin(tX)],

where the second equality is a definition and the expectations exist because the integrands are bounded.

EX 19.7 (Gaussian distribution) Let X ∼ N(0,1). Then φ_X(t) = e^{−t²/2}.

THM 19.8 If X_1 and X_2 are independent, then φ_{X_1 + X_2}(t) = φ_{X_1}(t) φ_{X_2}(t).

LEM 19.9 If DFs F and G have the same CF, then they are equal.

THM 19.10 (Lévy's Continuity Theorem) Let µ_n, n ≥ 1, be probability measures (PM) with CFs φ_n.

1. If µ_n ⇒ µ_∞, then φ_n(t) → φ_∞(t) for all t.
2. If φ_n(t) converges pointwise to a limit φ(t) that is continuous at 0, then the associated sequence of PMs µ_n converges weakly to a PM µ with CF φ.

THM 19.11 (CLT) Let (X_n) be IID with E[X_1] = µ and Var[X_1] = σ² < +∞. Then, if S_n = Σ_{k=1}^n X_k,

    (S_n − nµ) / (σ √n) ⇒ N(0,1).

1 Martingale CLT

We have seen that MGs are, in some ways, generalizations of unbiased RWs. Hence, one might expect a CLT to hold under appropriate conditions (namely, Lindeberg-type conditions). That turns out to be the case, but the situation is more complex than for sums of independent RVs, as the next example suggests. In brief, mixtures of unbiased RWs are also MGs, in which case the limit may be a mixture of Gaussians.
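As a quick sanity check on the classical CLT and the CF machinery recalled above, the following simulation sketch (Python with NumPy; the sample sizes and evaluation points t are arbitrary choices, not from the notes) compares the empirical CF of the standardized sum of IID uniform {−1,+1} steps with the N(0,1) CF e^{−t²/2}.

```python
import numpy as np

rng = np.random.default_rng(0)

# IID steps uniform in {-1, +1}: mean 0, variance 1.
n, reps = 2000, 20000
X = rng.choice([-1.0, 1.0], size=(reps, n))
S = X.sum(axis=1) / np.sqrt(n)  # standardized sums S_n / sqrt(n)

# Empirical CF phi(t) = E[e^{itS}] against the N(0,1) CF e^{-t^2/2}.
ts = np.array([0.5, 1.0, 2.0])
phi_emp = np.array([np.exp(1j * t * S).mean() for t in ts])
phi_gauss = np.exp(-ts**2 / 2)

err = np.abs(phi_emp - phi_gauss).max()
print(err)  # small: the empirical CF is close to the Gaussian CF
```

By the Lévy continuity theorem, pointwise closeness of the CFs is exactly what convergence in distribution amounts to here.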

EX 19.12 (Mixtures of RWs) Recall that, if the bounded process {C_n} is predictable (i.e. C_n ∈ F_{n−1} for all n) and {S_n}_{n≥0} is a MG, then the martingale transform {(C · S)_n}_{n≥0}, with

    M_n = (C · S)_n = Σ_{i ≤ n} C_i (S_i − S_{i−1}),

is also a MG. Suppose S_n is SRW on the integers started at 0, that is, S_n = Σ_{i=1}^n X_i where the X_i's are IID uniform in {−1, +1}. Hence S_i − S_{i−1} = X_i. Let also X_0 be independent uniform in {1, 2} and set C_i = X_0 for all i. Then, with respect to the filtration F_n = σ(X_0, ..., X_n) for n ≥ 0, {S_n}_{n≥0} is a MG (because E[X_i] = 0) and {C_n} is bounded and predictable. Hence {M_n}, as defined above, is a MG. Now note that {M_n} is a mixture of two RWs, in the following sense. On {X_0 = 1}, which occurs with probability 1/2, we have M_n = Σ_{i=1}^n X_i and THM 19.11 implies that M_n/√n ⇒ N(0,1). On the other hand, on {X_0 = 2}, which also occurs with probability 1/2, we have instead M_n = Σ_{i=1}^n 2X_i and THM 19.11 implies this time that M_n/√(4n) ⇒ N(0,1), or M_n/√n ⇒ N(0,4).

To handle this issue, we introduce the conditional variance. Let {M_n} be a MG in L² with corresponding MG difference Z_n = M_n − M_{n−1}, and let σ_n² = E[Z_n² | F_{n−1}]. (Note that this is a RV.) Consider the stopping time (ST)

    τ_n = inf { m ≥ 0 : Σ_{i=1}^m σ_i² ≥ n }.    (1)

This is indeed a ST since {τ_n ≤ m} = {Σ_{i=1}^m σ_i² ≥ n} ∈ F_{m−1} ⊆ F_m. Under the assumption that Σ_n σ_n² = +∞ with probability 1, we then have τ_n < +∞ a.s. for every n, and τ_n ↑ +∞ a.s. as n → +∞.

THM 19.13 (A CLT for MGs with Bounded Increments) Let {M_n} be a MG with corresponding MG difference Z_n = M_n − M_{n−1}. Assume there is a constant K < +∞ such that |Z_n| ≤ K with probability 1 for all n. Let σ_n² = E[Z_n² | F_{n−1}] and assume that Σ_n σ_n² = +∞ with probability 1. Let τ_n be as defined in (1). Then

    M_{τ_n} / √n ⇒ N(0,1),  as n → +∞.
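The effect of the random time change in the bounded-increments martingale CLT above can be seen in a small simulation of the mixture example (a Python sketch; the parameters n and reps are arbitrary choices, not from the notes). Without stopping, M_n/√n has the mixture variance (1 + 4)/2 = 2.5; the stopped and rescaled martingale M_{τ_n}/√n has variance ≈ 1 on both branches.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 400, 5000

# Mixture example: C_i = X0 uniform in {1,2}, M_n = X0 * (sum of n uniform {-1,+1} steps).
X0 = rng.choice([1.0, 2.0], size=reps)
steps = rng.choice([-1.0, 1.0], size=(reps, n))

# Unstopped: M_n / sqrt(n) is a mixture of N(0,1) and N(0,4); variance (1+4)/2 = 2.5.
M_unstopped = X0 * steps.sum(axis=1) / np.sqrt(n)

# Here sigma_m^2 = X0^2 for every m, so tau_n = ceil(n / X0^2):
# tau_n = n on {X0 = 1} and tau_n = ceil(n/4) on {X0 = 2}.
tau = np.ceil(n / X0**2).astype(int)
cum = np.cumsum(steps, axis=1)
M_stopped = X0 * cum[np.arange(reps), tau - 1] / np.sqrt(n)

print(M_unstopped.var(), M_stopped.var())  # ~2.5 vs ~1.0
```

Stopping when the accumulated conditional variance reaches n standardizes both branches of the mixture to unit variance, which is what makes a single Gaussian limit possible.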

Going back to the example above:

EX 19.14 (Mixtures of RWs, continued) The conditions of THM 19.13 are satisfied with K = 4, by noting that σ_n² ≥ 1 a.s. for all n. On {X_0 = 1} we have τ_n = n for all n, while on {X_0 = 2} we have τ_n = ⌈n/4⌉ for all n.

To prove THM 19.13, we establish a generalization of the Lindeberg-Feller CLT. The setting involves a sequence of MGs {M_{n,m}}_{m≥0}, for n ≥ 1, with respect to filtrations {F_{n,m}}_{m≥0}. (If the original MG is only defined for 0 ≤ m ≤ r_n, then one can extend it by letting M_{n,m} = M_{n,r_n} and F_{n,m} = F_{n,r_n} for all m > r_n. More generally, this setting accommodates a MG stopped at a sequence of stopping times diverging to infinity. See the proof of THM 19.13 below.)

THM 19.15 (Martingale CLT) For each n, let {M_{n,m}}_{m≥0} be a MG in L² with respect to filtration {F_{n,m}}_{m≥0}, with corresponding MG difference Z_{n,m} = M_{n,m} − M_{n,m−1} and conditional variance σ_{n,m}² = E[Z_{n,m}² | F_{n,m−1}]. Assume that, for each n, M_{n,m} and Γ_{n,m} = Σ_{r=1}^m σ_{n,r}² converge a.s. to a finite limit when m → +∞. Suppose that:

1. Γ_{n,∞} = Σ_{m=1}^{+∞} σ_{n,m}² → 1 in probability as n → +∞;
2. for all ε > 0, Σ_{m=1}^{+∞} E[Z_{n,m}²; |Z_{n,m}| > ε] → 0 as n → +∞.

Then

    M_{n,∞} = Σ_{m=1}^{+∞} Z_{n,m} ⇒ N(0,1).

Before giving the proof of the MG-CLT, we use it to establish THM 19.13.

Proof: (of THM 19.13) We construct a sequence of MGs as follows. Let F_{n,m} = F_m and

    M_{n,m} = M_{m ∧ τ_n} / √n,  so that  Z_{n,m} = (Z_m / √n) 1{τ_n ≥ m}.

This is indeed a MG for each n since τ_n is a ST (and multiplying by a constant

does not affect the MG property). Moreover,

    σ_{n,m}² = E[ (Z_m/√n)² 1{τ_n ≥ m} | F_{n,m−1} ] = 1{τ_n ≥ m} (1/n) E[Z_m² | F_{m−1}] = (σ_m²/n) 1{τ_n ≥ m},

where we used that {τ_n ≥ m} ∈ F_{m−1}. Summing over m and using the bound on |Z_m|, we get

    Γ_{n,∞} = (1/n) Σ_{m=1}^{τ_n} σ_m²,  with  1 ≤ Γ_{n,∞} ≤ 1 + K²/n,

by the definition of τ_n. So, as n → +∞, we have Γ_{n,∞} → 1. Also, for any ε > 0,

    |Z_{n,m}| = (|Z_m|/√n) 1{τ_n ≥ m} ≤ K/√n < ε,

whenever n is large enough, so that

    lim_{n → +∞} Σ_{m=1}^{+∞} E[ Z_{n,m}²; |Z_{n,m}| > ε ] = 0.

Therefore, the result follows from the MG-CLT after noting that M_{n,∞} = M_{τ_n}/√n, where we used the fact that τ_n < +∞ a.s. by the assumption Σ_n σ_n² = +∞. ∎

We move on to the proof of the MG-CLT.

Proof: (of THM 19.15) We first assume that there is a constant c such that Γ_{n,∞} ≤ c for all n. We seek to prove that the following goes to 0 as n → +∞:

    | E[e^{itM_{n,∞}}] − e^{−t²/2} |
      ≤ | E[ e^{itM_{n,∞}} − e^{itM_{n,∞}} e^{−t²/2} e^{t²Γ_{n,∞}/2} ] | + | E[ e^{itM_{n,∞}} e^{−t²/2} e^{t²Γ_{n,∞}/2} − e^{−t²/2} ] |
      ≤ e^{ct²/2} E[ | e^{−t²Γ_{n,∞}/2} − e^{−t²/2} | ] + | E[ e^{itM_{n,∞}} e^{t²Γ_{n,∞}/2} − 1 ] |.

The first term above → 0 by (BDD), the first condition of the statement, and the assumption that Γ_{n,∞} ≤ c.

To bound the second term, we use a telescoping argument. Write

    e^{itM_{n,∞}} e^{t²Γ_{n,∞}/2} − 1
      = Σ_m { e^{itM_{n,m}} e^{t²Γ_{n,m}/2} − e^{itM_{n,m−1}} e^{t²Γ_{n,m−1}/2} }
      = Σ_m e^{itM_{n,m−1}} e^{t²Γ_{n,m}/2} { e^{itZ_{n,m}} − e^{−t²σ_{n,m}²/2} }.

Hence,

    | E[ e^{itM_{n,∞}} e^{t²Γ_{n,∞}/2} − 1 ] |
      ≤ Σ_m | E[ e^{itM_{n,m−1}} e^{t²Γ_{n,m}/2} { e^{itZ_{n,m}} − e^{−t²σ_{n,m}²/2} } ] |
      = Σ_m | E[ e^{itM_{n,m−1}} e^{t²Γ_{n,m}/2} E[ e^{itZ_{n,m}} − e^{−t²σ_{n,m}²/2} | F_{n,m−1} ] ] |
      ≤ e^{ct²/2} Σ_m E[ | E[ e^{itZ_{n,m}} − e^{−t²σ_{n,m}²/2} | F_{n,m−1} ] | ],

where we used that M_{n,m−1}, Γ_{n,m} ∈ F_{n,m−1} and the assumption that Γ_{n,∞} ≤ c. We will use the following lemmas, proved in a previous lecture.

LEM 19.16 It holds that

    | E[e^{itX}] − (1 + it E[X] − (t²/2) E[X²]) | ≤ E[ min{ |tX|³, |tX|² } ].

LEM 19.17 If z is a complex number, then |e^z − (1 + z)| ≤ |z|² e^{|z|}.

By the MG property, E[Z_{n,m} | F_{n,m−1}] = 0. Also, by definition, E[Z_{n,m}² | F_{n,m−1}] = σ_{n,m}². By LEM 19.16,

    | E[ e^{itZ_{n,m}} | F_{n,m−1} ] − (1 − (t²/2) σ_{n,m}²) |
      ≤ E[ min{ |tZ_{n,m}|³, |tZ_{n,m}|² } | F_{n,m−1} ]
      ≤ E[ |tZ_{n,m}|³; |Z_{n,m}| ≤ ε | F_{n,m−1} ] + E[ |tZ_{n,m}|²; |Z_{n,m}| > ε | F_{n,m−1} ]
      ≤ ε|t|³ E[ Z_{n,m}²; |Z_{n,m}| ≤ ε | F_{n,m−1} ] + t² E[ Z_{n,m}²; |Z_{n,m}| > ε | F_{n,m−1} ]
      ≤ ε|t|³ σ_{n,m}² + t² E[ Z_{n,m}²; |Z_{n,m}| > ε | F_{n,m−1} ].

By LEM 19.17,

    | e^{−t²σ_{n,m}²/2} − (1 − t²σ_{n,m}²/2) | ≤ (t²σ_{n,m}²/2)² e^{t²σ_{n,m}²/2} ≤ t⁴ σ_{n,m}² e^{ct²} ( sup_m σ_{n,m}² ).

Plugging the previous two displays into the equation above, we get

    | E[ e^{itM_{n,∞}} e^{t²Γ_{n,∞}/2} − 1 ] |
      ≤ e^{ct²} Σ_m E[ | E[ e^{itZ_{n,m}} − e^{−t²σ_{n,m}²/2} | F_{n,m−1} ] | ]
      ≤ e^{ct²} { ε|t|³ c + t² Σ_{m=1}^{+∞} E[ Z_{n,m}²; |Z_{n,m}| > ε ] + t⁴ c e^{ct²} E[ sup_m σ_{n,m}² ] }.

(Note that the partial sums in the telescoping argument above, e^{itM_{n,m}} e^{t²Γ_{n,m}/2}, are bounded, and therefore we can take the expectation inside by (BDD).) To bound the supremum, we note that

    σ_{n,m}² ≤ ε² + E[ Z_{n,m}²; |Z_{n,m}| > ε | F_{n,m−1} ] ≤ ε² + Σ_{i=1}^{+∞} E[ Z_{n,i}²; |Z_{n,i}| > ε | F_{n,i−1} ],

so

    E[ sup_m σ_{n,m}² ] ≤ ε² + Σ_{i=1}^{+∞} E[ Z_{n,i}²; |Z_{n,i}| > ε ].

Taking n → +∞ with ε > 0 arbitrarily small, and using the second condition of the statement, we get

    | E[ e^{itM_{n,∞}} e^{t²Γ_{n,∞}/2} − 1 ] | → 0,

for all t. That concludes the proof in the case Γ_{n,∞} ≤ c. To remove the last assumption, multiply the difference Z_{n,m} by the indicator of {Γ_{n,m} ≤ c} ∈ F_{n,m−1} for some c > 1 and apply the argument above. Then note that P[Γ_{n,∞} ≤ c] → 1 as n → +∞ by the first condition of the statement, and use the converging together lemma. See the details in [Bil95]. ∎

2 An application: autoregressive processes

Let {W_n} be a sequence of IID RVs with |W_n| ≤ c for all n, for some constant c < +∞. Assume that E[W_n] = 0 and Var[W_n] = ν². For ρ ∈ (0,1), consider

the following autoregressive process (of order 1),

    X_n = ρ X_{n−1} + W_n,  with X_0 = 0.

Assume that ρ is unknown and that we only observe the sequence {X_n}. To estimate ρ, we observe that

    E[X_n X_{n−1}] = E[ρ X_{n−1}² + W_n X_{n−1}] = ρ E[X_{n−1}²],

where we used the independence of W_n and X_{n−1} ∈ σ(W_1, ..., W_{n−1}). Hence, a natural estimator for ρ from {X_n} is

    ρ̂_n = ( Σ_{k=1}^n X_k X_{k−1} ) / ( Σ_{k=1}^n X_{k−1}² ).

We use the MG-CLT to establish the asymptotic normality of ρ̂_n.

CLAIM 19.18 We have

    √( n Ψ / ν² ) (ρ̂_n − ρ) ⇒ N(0,1),

where Ψ = ν² / (1 − ρ²).

Proof: Our starting point is the following observation:

    (ρ̂_n − ρ) Σ_{k=1}^n X_{k−1}²
      = Σ_{k=1}^n X_k X_{k−1} − ρ Σ_{k=1}^n X_{k−1}²
      = Σ_{k=1}^n (X_k − ρ X_{k−1}) X_{k−1}
      = Σ_{k=1}^n W_k X_{k−1}.

The latter is not a sum of independent RVs, but it turns out to be a MG with respect to the filtration F_n = σ(W_1, ..., W_n), as we show next.

LEM 19.19 The process

    M_n = Σ_{k=1}^n W_k X_{k−1},  n ≥ 0,

is a MG.
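As a numerical check of the asymptotic normality claim for ρ̂_n, here is a simulation sketch (Python; the values of ρ, n and the number of replications are arbitrary choices, not from the notes, and the innovations are taken uniform in {−1,+1} so that they are bounded with mean 0 and variance ν² = 1). The standardized error √(nΨ/ν²)(ρ̂_n − ρ) should have mean ≈ 0 and variance ≈ 1.

```python
import numpy as np

rng = np.random.default_rng(2)
rho, nu, n, reps = 0.5, 1.0, 5000, 2000

# AR(1): X_k = rho * X_{k-1} + W_k, X_0 = 0, with bounded innovations W_k in {-1,+1}.
W = rng.choice([-1.0, 1.0], size=(reps, n))
X = np.zeros((reps, n + 1))
for k in range(1, n + 1):
    X[:, k] = rho * X[:, k - 1] + W[:, k - 1]

# Estimator rho_hat_n = sum_k X_k X_{k-1} / sum_k X_{k-1}^2.
num = (X[:, 1:] * X[:, :-1]).sum(axis=1)
den = (X[:, :-1] ** 2).sum(axis=1)
rho_hat = num / den

# Standardize as in the claim: sqrt(n * Psi / nu^2) * (rho_hat - rho), Psi = nu^2/(1-rho^2).
Psi = nu**2 / (1 - rho**2)
Z = np.sqrt(n * Psi / nu**2) * (rho_hat - rho)
print(Z.mean(), Z.var())  # ~0 and ~1
```

Note that Ψ is exactly the limit of (1/n) Σ_k X_{k−1}², so the standardization amounts to dividing M_n by √(nΨν²), the natural scale of the martingale.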

LEM 19.20 We have

    Ψ_n = (1/n) Σ_{k=1}^n X_{k−1}² → ν²/(1−ρ²) = Ψ  in probability.

Before proving the two claims, it will be helpful to note that, by induction,

    X_n = Σ_{k=1}^n ρ^{n−k} W_k,    (2)

and to define H_n(x) = Σ_{k=1}^n x^{k−1}. Then

    Var[X_n] = E[X_n²] = ν² H_n(ρ²),

where we used that the W_k's are independent and have mean 0 and variance ν². Note also that, by (2) and the bound on |W_n|, we have the following bound:

    |X_n| ≤ Σ_{k=1}^n ρ^{n−k} |W_k| ≤ c H_n(ρ) ≤ c/(1−ρ).    (3)

Proof: (of LEM 19.19) The process {M_n} is adapted by (2) and integrable by the bounds on |W_n| and |X_n|. Observe further that, since X_{k−1} ∈ F_{k−1},

    E[ Σ_{k=1}^n W_k X_{k−1} | F_{n−1} ] = Σ_{k=1}^{n−1} W_k X_{k−1} + X_{n−1} E[W_n | F_{n−1}] = Σ_{k=1}^{n−1} W_k X_{k−1}.

That concludes the proof of the claim. ∎

Proof: (of LEM 19.20) We use Chebyshev's inequality. We know that

    E[ (1/n) Σ_{k=1}^n X_{k−1}² ] = (ν²/n) Σ_{k=1}^n H_{k−1}(ρ²) → ν²/(1−ρ²),    (4)

as H_n(ρ²) → 1/(1−ρ²). We need a bound on the variance. Trivially,

    Var[X_k²] ≤ E[X_k⁴] ≤ {c H_k(ρ)}⁴.

Moreover, for k < l, we write X_l = ρ^{l−k} X_k + Y_{l,k}, where Y_{l,k} ∈ σ(W_{k+1}, ..., W_l),

and

    H_l(ρ²) = ρ^{2(l−k)} H_k(ρ²) + H_{l−k}(ρ²).

Hence,

    Cov[X_k², X_l²]
      = E[ (X_k² − ν² H_k(ρ²)) (X_l² − ν² H_l(ρ²)) ]
      = E[ (X_k² − ν² H_k(ρ²)) ( (ρ^{l−k} X_k + Y_{l,k})² − ν² [ρ^{2(l−k)} H_k(ρ²) + H_{l−k}(ρ²)] ) ]
      = E[ (X_k² − ν² H_k(ρ²)) ( ρ^{2(l−k)} X_k² + 2 ρ^{l−k} X_k Y_{l,k} + Y_{l,k}² − ν² [ρ^{2(l−k)} H_k(ρ²) + H_{l−k}(ρ²)] ) ]
      = ρ^{2(l−k)} E[ (X_k² − ν² H_k(ρ²))² ]
      ≤ ρ^{2(l−k)} {c H_k(ρ)}⁴,

where the remaining cross terms vanish because Y_{l,k} is independent of X_k, with E[Y_{l,k}] = 0 and E[Y_{l,k}²] = ν² H_{l−k}(ρ²). Therefore,

    Var[ Σ_{k=1}^n X_{k−1}² ]
      ≤ 2 Σ_{0 ≤ k ≤ l ≤ n−1} ρ^{2(l−k)} {c H_k(ρ)}⁴
      ≤ (2c⁴/(1−ρ)⁴) Σ_{0 ≤ k ≤ n−1} Σ_{m ≥ 0} ρ^{2m}
      ≤ (2c⁴ / ((1−ρ)⁴ (1−ρ²))) n
      =: K n.

By Chebyshev's inequality,

    P[ | (1/n) Σ_{k=1}^n X_{k−1}² − E[ (1/n) Σ_{k=1}^n X_{k−1}² ] | ≥ δ ] ≤ (K n)/(n δ)² = K/(n δ²) → 0,

as n → +∞. Together with (4), that proves the claim. ∎

We apply the MG-CLT with F_{n,m} = F_m and

    M_{n,m} = M_{m ∧ n} / √(n Ψ ν²).

Then

    Z_{n,m} = (M_m − M_{m−1}) 1{m ≤ n} / √(n Ψ ν²) = W_m X_{m−1} 1{m ≤ n} / √(n Ψ ν²),

and

    σ_{n,m}² = E[Z_{n,m}² | F_{n,m−1}] = 1{m ≤ n} (X_{m−1}² / (n Ψ ν²)) E[W_m² | F_{m−1}] = 1{m ≤ n} X_{m−1}² / (n Ψ),

and

    Γ_{n,∞} = (1/(n Ψ)) Σ_{m=1}^n X_{m−1}² = Ψ_n / Ψ → 1  in probability,

as n → +∞, by LEM 19.20. To check the second condition in the MG-CLT, we note that

    |Z_{n,m}| = |W_m X_{m−1}| 1{m ≤ n} / √(n Ψ ν²) ≤ c (c H_n(ρ)) / √(n Ψ ν²) ≤ c² / ((1−ρ) √(n Ψ ν²)) ≤ ε,

for all n large enough, so the sum in the second condition vanishes for such n. The MG-CLT then says that

    M_n / √(n Ψ ν²) ⇒ N(0,1),

or, put differently, that

    (ρ̂_n − ρ) ( Σ_{k=1}^n X_{k−1}² ) / √(n Ψ ν²) ⇒ N(0,1).

By LEM 19.5 and LEM 19.20, we finally have

    √( n Ψ / ν² ) (ρ̂_n − ρ) ⇒ N(0,1).

That concludes the proof of CLAIM 19.18. ∎

References

[Bil95] Patrick Billingsley. Probability and Measure. Wiley Series in Probability and Mathematical Statistics. John Wiley & Sons Inc., New York, 1995.

[Roc] Sébastien Roch. Modern Discrete Probability: An Essential Toolkit. Book in preparation. http://www.math.wisc.edu/~roch/mdp/.