Convergence of random variables. (telegram style notes) P.J.C. Spreij

This version: September 6, 2005

1 Introduction

As we know, random variables are by definition measurable functions on some underlying measurable space $(\Omega, \mathcal{F})$. If we have a sequence $X_1, X_2, \ldots$ of them and we ask for the limit behaviour of this sequence, then we have to specify the type of convergence. And since we are dealing with functions, there are many useful types available. In these notes we will treat the best known ones. They are called almost sure (a.s.) convergence, convergence in probability and convergence in $p$-th mean. Another important concept, weak convergence, with the Central Limit Theorem as its best known example, is treated somewhere else. In these notes we finally arrive at the strong law of large numbers for an iid sequence of random variables. This law states that the averages of an iid sequence converge almost surely to their common expectation.

2 Preliminaries

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. In the sequel, unless stated otherwise, we assume that all random variables are defined on this space. The σ-algebra $\mathcal{F}$ on $\Omega$ is the collection of events. One says that an event $F$ takes place almost surely if $\mathbb{P}(F^c) = 0$, i.e. if $\mathbb{P}(F) = 1$. For events $E_1, E_2, \ldots$ we define
$$\limsup E_n = \bigcap_{n \ge 1} \bigcup_{m \ge n} E_m \quad\text{and}\quad \liminf E_n = \bigcup_{n \ge 1} \bigcap_{m \ge n} E_m.$$
Notice that $(\limsup E_n)^c = \liminf E_n^c$. For the event $\limsup E_n$ one often writes "$E_n$ i.o." (i.o. means infinitely often) and for the event $\liminf E_n$ one also writes "$E_n$ eventually".

We will mostly consider real valued random variables $X$, functions $X : \Omega \to \mathbb{R}$ that are measurable. Recall that a map $X : \Omega \to \mathbb{R}$ is called measurable if $X^{-1}[B] \in \mathcal{F}$ for all $B \in \mathcal{B}$, the Borel sets of $\mathbb{R}$. Measurability thus depends on the choice of the σ-algebra $\mathcal{F}$ on $\Omega$. There is always a σ-algebra on $\Omega$ that makes a given function $X$ measurable, the power set. More interesting is the smallest σ-algebra that turns $X$ into a measurable function. This σ-algebra is denoted by $\sigma(X)$ and it is given by $\sigma(X) = \{X^{-1}[B] : B \in \mathcal{B}\}$.

3 Independence

We are used to calling two events $E$ and $F$ independent if $\mathbb{P}(E \cap F) = \mathbb{P}(E)\mathbb{P}(F)$. Below we extend this to independence of an arbitrary sequence of events, which comes at the end of a sequence of definitions.

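Definition 3.1 (i) Let $\mathcal{G}_1, \mathcal{G}_2, \ldots$ be a sequence of sub-σ-algebras of $\mathcal{F}$. One says that this sequence is independent if for all $n \in \mathbb{N}$, all distinct indices $i_1, \ldots, i_n$ and all $G_{i_k} \in \mathcal{G}_{i_k}$ it holds that
$$\mathbb{P}(G_{i_1} \cap \cdots \cap G_{i_n}) = \prod_{k=1}^{n} \mathbb{P}(G_{i_k}).$$
For finite sequences $\mathcal{G}_1, \ldots, \mathcal{G}_n$ of σ-algebras, independence is defined as independence of the sequence $\mathcal{G}_1, \mathcal{G}_2, \ldots$, where $\mathcal{G}_k = \{\emptyset, \Omega\}$ for $k > n$.
(ii) A sequence $X_1, X_2, \ldots$ of random variables is said to be an independent sequence if the σ-algebras $\mathcal{F}_k = \sigma(X_k)$ ($k \in \mathbb{N}$) are independent.
(iii) A sequence $E_1, E_2, \ldots$ of events is called independent if the random variables $X_k := \mathbf{1}_{E_k}$ are independent.

Remark 3.2 Notice that an independent sequence of events remains independent if, in any subsequence of it, the events $E_n$ are replaced with their complements.

Lemma 3.3 (Borel-Cantelli) Let $E_1, E_2, \ldots$ be a sequence of events.
(i) If it has the property that $\sum_{n \ge 1} \mathbb{P}(E_n) < \infty$, then $\mathbb{P}(\limsup E_n) = 0$.
(ii) If $\sum_{n \ge 1} \mathbb{P}(E_n) = \infty$ and if, moreover, the sequence is independent, then $\mathbb{P}(\limsup E_n) = 1$.

Proof (i) Let $U_n = \bigcup_{m \ge n} E_m$. Notice that the sequence $(U_n)$ decreases to $U = \limsup E_n$. Hence we have $\mathbb{P}(U) \le \mathbb{P}(U_n) \le \sum_{m \ge n} \mathbb{P}(E_m)$, which converges to zero by assumption.
(ii) We prove that $\mathbb{P}(\liminf E_n^c) = 0$. Let $D_n^N = \bigcap_{m=n}^{N} E_m^c$ ($N \ge n$). Notice that for fixed $n$ the sequence $(D_n^N)_{N \ge n}$ decreases to $D_n^\infty := \bigcap_{m=n}^{\infty} E_m^c$. By independence we obtain $\mathbb{P}(D_n^N) = \prod_{m=n}^{N} (1 - \mathbb{P}(E_m))$, which is at most $\exp(-\sum_{m=n}^{N} \mathbb{P}(E_m))$ because $1 - x \le e^{-x}$. Hence by taking limits for $N \to \infty$, we obtain for every $n$ that $\mathbb{P}(D_n^\infty) \le \exp(-\sum_{m=n}^{\infty} \mathbb{P}(E_m)) = 0$. Finally, we observe that $\liminf E_n^c = \bigcup_{n=1}^{\infty} D_n^\infty$ and hence $\mathbb{P}(\liminf E_n^c) \le \sum_{n=1}^{\infty} \mathbb{P}(D_n^\infty) = 0$.

Next to (ordinary) independence, we also have the notion of conditional independence. As in definition 3.1 this concept can be defined for infinite sequences of σ-algebras. We only need a special case.

Definition 3.4 Two σ-algebras $\mathcal{F}_1$ and $\mathcal{F}_2$ are called conditionally independent given a third σ-algebra $\mathcal{G}$ if
$$\mathbb{P}(F_1 \cap F_2 \mid G) = \mathbb{P}(F_1 \mid G)\,\mathbb{P}(F_2 \mid G), \tag{3.1}$$
for all $F_1 \in \mathcal{F}_1$, $F_2 \in \mathcal{F}_2$ and $G \in \mathcal{G}$ with $\mathbb{P}(G) > 0$. Multiplying both sides by $\mathbb{P}(G)^2$, the equality in (3.1) is easily seen to be equivalent to
$$\mathbb{P}(F_1 \cap F_2 \cap G)\,\mathbb{P}(G) = \mathbb{P}(F_1 \cap G)\,\mathbb{P}(F_2 \cap G). \tag{3.2}$$
Furthermore, (3.2) is obviously also equivalent to $\mathbb{P}(F_2 \mid F_1 \cap G) = \mathbb{P}(F_2 \mid G)$, provided that $\mathbb{P}(F_1 \cap G) > 0$.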
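The product formula in the proof of lemma 3.3(ii) can be checked numerically. The sketch below is only an illustration: the two choices $\mathbb{P}(E_m) = 1/m$ and $\mathbb{P}(E_m) = 1/m^2$, and all function names, are assumptions made for this example and do not come from the notes. It computes $\mathbb{P}(D_n^N) = \prod_{m=n}^{N}(1 - \mathbb{P}(E_m))$ for independent events and compares it with the bound $\exp(-\sum_{m=n}^{N} \mathbb{P}(E_m))$.

```python
import math


def prob_none_occur(p, n, N):
    """P(D_n^N): probability that none of the independent events E_n, ..., E_N occurs."""
    prod = 1.0
    for m in range(n, N + 1):
        prod *= 1.0 - p(m)
    return prod


def exp_bound(p, n, N):
    """Upper bound exp(-sum_{m=n}^N P(E_m)) from the proof of lemma 3.3(ii)."""
    return math.exp(-sum(p(m) for m in range(n, N + 1)))


def harmonic(m):
    # Hypothetical choice P(E_m) = 1/m: the series of probabilities diverges.
    return 1.0 / m


def inverse_square(m):
    # Hypothetical choice P(E_m) = 1/m^2: the series of probabilities converges.
    return 1.0 / (m * m)


if __name__ == "__main__":
    n = 2  # start at 2, since P(E_1) = 1 would make every product zero
    for N in (10, 100, 1000, 10000):
        print(N,
              prob_none_occur(harmonic, n, N),
              exp_bound(harmonic, n, N),
              prob_none_occur(inverse_square, n, N))
```

With $\mathbb{P}(E_m) = 1/m$ the product telescopes to $(n-1)/N$, which tends to zero, in line with part (ii). With $\mathbb{P}(E_m) = 1/m^2$ it telescopes to $\frac{n-1}{n}\cdot\frac{N+1}{N}$, which stays bounded away from zero, so the divergence of the series cannot be dropped from part (ii).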

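4 Convergence concepts

Let $X, X_1, X_2, \ldots$ be random variables. We have the following definitions of different modes of convergence. We will always assume that the parameter $n$ tends to infinity, unless stated otherwise.

Definition 4.1 (i) If $\mathbb{P}(\omega : X_n(\omega) \to X(\omega)) = 1$, then we say that $X_n$ converges to $X$ almost surely (a.s.).
(ii) If $\mathbb{P}(|X_n - X| > \varepsilon) \to 0$ for all $\varepsilon > 0$, then we say that $X_n$ converges to $X$ in probability.
(iii) If $\mathbb{E}|X_n - X|^p \to 0$ for some $p > 0$, then we say that $X_n$ converges to $X$ in $p$-th mean, or in $L^p$.
For these types of convergence we use the following notations: $X_n \xrightarrow{\text{a.s.}} X$, $X_n \xrightarrow{\mathbb{P}} X$ and $X_n \xrightarrow{L^p} X$ respectively.

First we study almost sure convergence of $X_n$ to $X$ in a bit more detail. If this type of convergence takes place we have
$$\mathbb{P}(\omega : \forall \varepsilon > 0 : \exists N : \forall n \ge N : |X_n(\omega) - X(\omega)| < \varepsilon) = 1.$$
But then also (dropping the $\omega$ in the notation) for all $\varepsilon > 0$:
$$\mathbb{P}(\exists N : \forall n \ge N : |X_n - X| < \varepsilon) = 1. \tag{4.3}$$
Conversely, if (4.3) holds for all $\varepsilon > 0$, we have almost sure convergence (take $\varepsilon = 1/k$, $k \in \mathbb{N}$, and intersect the resulting countably many events of probability one). Notice that we can rewrite the probability in (4.3) as $\mathbb{P}(\liminf E_n^\varepsilon) = 1$, with $E_n^\varepsilon = \{|X_n - X| < \varepsilon\}$.

Limits are often required to be unique in an appropriate sense. The natural concept of uniqueness here is that of almost sure uniqueness.

Proposition 4.2 In each of the convergence concepts in definition 4.1 the limit, when it exists, is almost surely unique. This means that if there are two candidate limits $X$ and $X'$, one must have $\mathbb{P}(X = X') = 1$.

Proof Suppose that $X_n \xrightarrow{\text{a.s.}} X$ and $X_n \xrightarrow{\text{a.s.}} X'$. Let $\Omega_0$ be the set of probability one on which $X_n(\omega) \to X(\omega)$ and $\Omega_0'$ be the set of probability one on which $X_n(\omega) \to X'(\omega)$. Then also $\mathbb{P}(\Omega_0 \cap \Omega_0') = 1$ and by uniqueness of limits of real numbers we must have that $X(\omega) = X'(\omega)$ for all $\omega \in \Omega_0 \cap \Omega_0'$. Hence $\mathbb{P}(X = X') \ge \mathbb{P}(\Omega_0 \cap \Omega_0') = 1$.

If $X_n \xrightarrow{\mathbb{P}} X$ and $X_n \xrightarrow{\mathbb{P}} X'$, then we have by the triangle inequality for any $\varepsilon > 0$
$$\mathbb{P}(|X - X'| > \varepsilon) \le \mathbb{P}(|X_n - X| > \varepsilon/2) + \mathbb{P}(|X_n - X'| > \varepsilon/2),$$
and the right hand side converges to zero by assumption. Hence $\mathbb{P}(|X - X'| > \varepsilon) = 0$ for every $\varepsilon > 0$, so that $\mathbb{P}(X = X') = 1$.

Finally we consider the third convergence concept. We need the basic inequality $|a + b|^p \le c_p(|a|^p + |b|^p)$ (exercise 6.3), where $c_p = \max\{2^{p-1}, 1\}$. This allows us to write $\mathbb{E}|X - X'|^p \le c_p(\mathbb{E}|X_n - X|^p + \mathbb{E}|X_n - X'|^p)$. It follows that $\mathbb{E}|X - X'|^p = 0$ and hence that $\mathbb{P}(X = X') = 1$.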

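The following relations hold between the types of convergence introduced in definition 4.1.

Proposition 4.3 (i) If $X_n \xrightarrow{\text{a.s.}} X$, then $X_n \xrightarrow{\mathbb{P}} X$.
(ii) If $X_n \xrightarrow{L^p} X$, then $X_n \xrightarrow{\mathbb{P}} X$.
(iii) If $p > q > 0$ and $X_n \xrightarrow{L^p} X$, then $X_n \xrightarrow{L^q} X$.

Proof (i) Fix $\varepsilon > 0$ and let $A_n = \{|X_n - X| \ge \varepsilon\}$. From (4.3) we know that $\mathbb{P}(\liminf A_n^c) = 1$, or that $\mathbb{P}(\limsup A_n) = 0$. But $A_n \subset U_n := \bigcup_{m \ge n} A_m$ and the $U_n$ form a decreasing sequence with $\limsup A_n$ as its limit. Hence we have $\limsup \mathbb{P}(A_n) \le \lim \mathbb{P}(U_n) = 0$.
(ii) By Markov's inequality we have
$$\mathbb{P}(|X_n - X| > \varepsilon) = \mathbb{P}(|X_n - X|^p > \varepsilon^p) \le \varepsilon^{-p}\,\mathbb{E}|X_n - X|^p,$$
and the result follows.
(iii) Recall that the function $x \mapsto |x|^r$ is convex for $r \ge 1$. Hence, Jensen's inequality ($|\mathbb{E} Z|^r \le \mathbb{E}|Z|^r$), applied to $Z = |X_n - X|^q$, yields for $r = p/q$ the inequality $(\mathbb{E}|X_n - X|^q)^r \le \mathbb{E}|X_n - X|^p$.

We close this section with criteria that can be used to decide whether convergence almost surely or in probability takes place.

Proposition 4.4 (i) If for all $\varepsilon > 0$ the series $\sum_n \mathbb{P}(|X_n - X| > \varepsilon)$ is convergent, then $X_n \xrightarrow{\text{a.s.}} X$.
(ii) There is equivalence between (a) $X_n \xrightarrow{\mathbb{P}} X$ and (b) every subsequence of $(X_n)$ contains a further subsequence that is almost surely convergent to $X$.

Proof (i) Fix $\varepsilon > 0$ and let $E_n = \{|X_n - X| > \varepsilon\}$. The first part of the Borel-Cantelli lemma (lemma 3.3) gives that $\mathbb{P}(\limsup E_n) = 0$, equivalently $\mathbb{P}(\liminf E_n^c) = 1$, but this is just (4.3).
(ii) Assume that (a) holds. Then for any $\varepsilon > 0$ and any subsequence $(n_k)$ we also have $\mathbb{P}(|X_{n_k} - X| > \varepsilon) \to 0$. Hence for every $p \in \mathbb{N}$ there is $k_p \in \mathbb{N}$, which we may take increasing in $p$, such that $\mathbb{P}(|X_{n_{k_p}} - X| > 2^{-p}) \le 2^{-p}$. For any fixed $\varepsilon > 0$ we then have $\mathbb{P}(|X_{n_{k_p}} - X| > \varepsilon) \le 2^{-p}$ for all $p$ with $2^{-p} \le \varepsilon$, so the series in part (i) converges along the subsequence $(n_{k_p})$. Now we apply part (i) of this proposition, which gives us (b). Conversely, assume that (b) holds. We reason by contradiction. Suppose that (a) doesn't hold. Then there exist an $\varepsilon > 0$ and a level $\delta > 0$ such that along some subsequence $(n_k)$ one has
$$\mathbb{P}(|X_{n_k} - X| > \varepsilon) > \delta, \quad \text{for all } k. \tag{4.4}$$
But the sequence $(X_{n_k})$ by assumption has an almost surely convergent subsequence $(X_{n_{k_p}})$, which, by proposition 4.3 (i), also converges in probability. But this contradicts (4.4).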
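The Markov bound used in the proof of proposition 4.3(ii) can be made concrete in a simple case. The sketch below assumes, purely for illustration, that $X_n - X$ has a normal $N(0, 1/n)$ distribution, so that $\mathbb{E}|X_n - X|^2 = 1/n$ and the tail probability is available in closed form through the complementary error function. The function names are hypothetical.

```python
import math


def tail_probability(n, eps):
    """Exact P(|X_n - X| > eps), assuming X_n - X ~ N(0, 1/n)."""
    # For a standard normal Z: P(|Z| > t) = erfc(t / sqrt(2)), here with t = eps * sqrt(n).
    return math.erfc(eps * math.sqrt(n / 2.0))


def markov_bound(n, eps):
    """Bound eps^{-p} E|X_n - X|^p for p = 2, where E|X_n - X|^2 = 1/n."""
    return 1.0 / (n * eps ** 2)


if __name__ == "__main__":
    eps = 0.5
    for n in (1, 10, 100, 1000):
        print(n, tail_probability(n, eps), markov_bound(n, eps))
```

In this example both columns tend to zero, and the exact probability is always below the bound. The bound is crude, but it is all that the implication from $L^p$ convergence to convergence in probability requires.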

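5 The strong law

The main result of this section is the strong law of large numbers for an iid sequence of random variables that have a finite expectation. Readers should be familiar with the weak law of large numbers for a sequence of random variables that have a finite variance (otherwise, make exercise 6.4 now!). The proof of theorem 5.2 uses approximations with bounded random variables. The following lemma (sometimes called the truncation lemma) prepares for that.

Lemma 5.1 Let $X_1, X_2, \ldots$ be an iid sequence with $\mathbb{E}|X_1| < \infty$ and $\mathbb{E} X_1 = \mu$. Put $Y_n = X_n \mathbf{1}_{\{|X_n| \le n\}}$. Then the following assertions hold true.
(i) $\mathbb{P}(X_n = Y_n \text{ eventually}) = 1$.
(ii) $\mathbb{E} Y_n \to \mu$.
(iii) $\sum_{n \ge 1} \frac{1}{n^2} \operatorname{Var} Y_n < \infty$.

Proof (i) Let $E_n = \{X_n \ne Y_n\} = \{|X_n| > n\}$. We will use the first part of the Borel-Cantelli lemma to conclude that the assertion holds, i.e. we will show that $\mathbb{P}(\limsup E_n) = 0$. We therefore look at $\sum_n \mathbb{P}(E_n)$. Since all $X_n$ have the same distribution as $X_1$, we also have $\mathbb{P}(|X_n| > n) = \mathbb{P}(|X_1| > n)$. Recall the familiar inequality $\mathbb{E}|X_1| \ge \sum_{n \ge 1} \mathbb{P}(|X_1| > n)$ (exercise 6.18). Using these ingredients we get $\sum_n \mathbb{P}(E_n) \le \mathbb{E}|X_1| < \infty$. Indeed, the Borel-Cantelli lemma now gives us the result.
(ii) Since $X_n$ has the same distribution as $X_1$, we also have that $Y_n$ has the same distribution as $X_1 \mathbf{1}_{\{|X_1| \le n\}}$. In particular, they have the same expectation. Hence $\mathbb{E} Y_n = \mathbb{E} X_1 \mathbf{1}_{\{|X_1| \le n\}}$, which tends to $\mathbb{E} X_1$ in view of theorem A.1 in the appendix (see also exercise 6.17).
(iii) This proof is a little tricky. It is sufficient to show that $\sum_n \frac{1}{n^2} \mathbb{E} Y_n^2 < \infty$. The sum is equal to $\sum_n \frac{1}{n^2} \mathbb{E} X_1^2 \mathbf{1}_{\{|X_1| \le n\}}$. Interchanging expectation and summation gives $\mathbb{E}\big[X_1^2 \sum_n \frac{1}{n^2} \mathbf{1}_{\{|X_1| \le n\}}\big]$ and we study the summation. Split it as follows:
$$\sum_{n \ge 1} \frac{1}{n^2} \mathbf{1}_{\{|X_1| \le n\}} = \sum_{n \ge 1} \frac{1}{n^2} \mathbf{1}_{\{|X_1| \le n\}} \mathbf{1}_{\{|X_1| \le 1\}} + \sum_{n \ge 1} \frac{1}{n^2} \mathbf{1}_{\{|X_1| \le n\}} \mathbf{1}_{\{|X_1| > 1\}}.$$
The first summation is less than $2 \cdot \mathbf{1}_{\{|X_1| \le 1\}}$, since $\sum_{n \ge 1} n^{-2} < 2$. For the second summation we use $\frac{1}{n^2} \le \frac{2}{n(n+1)} = 2\int_n^{n+1} x^{-2}\,dx$ to obtain
$$\sum_{n \ge 1} \frac{1}{n^2} \mathbf{1}_{\{|X_1| \le n\}} \mathbf{1}_{\{|X_1| > 1\}} = \sum_{n \ge |X_1|} \frac{1}{n^2} \mathbf{1}_{\{|X_1| > 1\}} \le 2 \sum_{n \ge |X_1|} \int_n^{n+1} x^{-2}\,dx\; \mathbf{1}_{\{|X_1| > 1\}} \le 2 \int_{|X_1|}^{\infty} x^{-2}\,dx\; \mathbf{1}_{\{|X_1| > 1\}} = \frac{2}{|X_1|} \mathbf{1}_{\{|X_1| > 1\}}.$$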

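Hence
$$\mathbb{E}\Big[X_1^2 \sum_{n \ge 1} \frac{1}{n^2} \mathbf{1}_{\{|X_1| \le n\}}\Big] \le 2\,\mathbb{E}\Big[X_1^2 \Big(\mathbf{1}_{\{|X_1| \le 1\}} + \frac{1}{|X_1|} \mathbf{1}_{\{|X_1| > 1\}}\Big)\Big] \le 2\big(\mathbb{E}|X_1| \mathbf{1}_{\{|X_1| \le 1\}} + \mathbb{E}|X_1| \mathbf{1}_{\{|X_1| > 1\}}\big) = 2\,\mathbb{E}|X_1|,$$
which is finite by assumption.

Here is the announced strong law of large numbers. We write $\bar{X}_n = \frac{1}{n} \sum_{k=1}^{n} X_k$ for the averages, and similarly $\bar{Y}_n$.

Theorem 5.2 Let $X_1, X_2, \ldots$ be a sequence of iid random variables and assume that $\mathbb{E}|X_1| < \infty$. Let $\mu = \mathbb{E} X_1$, then
$$\bar{X}_n \xrightarrow{\text{a.s.}} \mu. \tag{5.5}$$

Proof Let us first assume that the $X_n$ are nonnegative. Put $Y_n = X_n \mathbf{1}_{\{X_n \le n\}}$. Then
$$\bar{X}_n = \frac{1}{n} \sum_{k=1}^{n} Y_k + \frac{1}{n} \sum_{k=1}^{n} (X_k - Y_k).$$
Notice that on a set of probability one the sum $\sum_{k=1}^{\infty} (X_k(\omega) - Y_k(\omega))$ contains only finitely many nonzero terms (this follows from lemma 5.1 (i)), so that it is sufficient to show that
$$\bar{Y}_n \xrightarrow{\text{a.s.}} \mu. \tag{5.6}$$
Fix $\alpha > 1$, let $\beta_n = [\alpha^n]$ and put $\eta_n = \frac{1}{\beta_n} \sum_{k=1}^{\beta_n} Y_k$. We first show that
$$\eta_n - \mathbb{E}\eta_n \xrightarrow{\text{a.s.}} 0 \tag{5.7}$$
by applying proposition 4.4(i). Below we need the following technical result. There exists a constant $C_\alpha$ such that for all $i \ge 1$ it holds that $\sum_{n : \beta_n \ge i} \frac{1}{\beta_n^2} \le \frac{C_\alpha}{i^2}$ (exercise 6.10). Consider for any $\varepsilon > 0$, using Chebychev's inequality and the independence of the $Y_i$,
$$\sum_{n=1}^{\infty} \mathbb{P}(|\eta_n - \mathbb{E}\eta_n| > \varepsilon) \le \frac{1}{\varepsilon^2} \sum_{n=1}^{\infty} \operatorname{Var} \eta_n = \frac{1}{\varepsilon^2} \sum_{n=1}^{\infty} \frac{1}{\beta_n^2} \sum_{i=1}^{\beta_n} \operatorname{Var} Y_i = \frac{1}{\varepsilon^2} \sum_{i=1}^{\infty} \Big(\sum_{n : \beta_n \ge i} \frac{1}{\beta_n^2}\Big) \operatorname{Var} Y_i \le \frac{C_\alpha}{\varepsilon^2} \sum_{i=1}^{\infty} \frac{1}{i^2} \operatorname{Var} Y_i,$$
which is finite by lemma 5.1(iii). Hence proposition 4.4(i) yields the result. It is easy to show that $\mathbb{E}\eta_n$ converges to $\mu$ (exercise 6.11), and we then conclude from (5.7) that
$$\eta_n \xrightarrow{\text{a.s.}} \mu. \tag{5.8}$$
Recall that $\eta_n$ depends on $\alpha > 1$. We proceed by showing that, by a limit argument, the result (5.8) is also valid for $\alpha = 1$, i.e. for the full sequence of averages $\bar{Y}_n$. For every $n$, let $m = m(n) = \inf\{k : \beta_k > n\}$. Then $\beta_m > n \ge \beta_{m-1}$ and therefore, since the $Y_i$ are nonnegative,
$$\frac{1}{n} \sum_{i=1}^{n} Y_i \le \frac{1}{\beta_{m-1}} \sum_{i=1}^{\beta_m} Y_i = \frac{\beta_m}{\beta_{m-1}}\, \eta_m.$$
Using (5.8) and $\beta_m / \beta_{m-1} \to \alpha$ as $m \to \infty$, we conclude that $\limsup \bar{Y}_n \le \alpha\mu$ a.s., and, since this is true for every $\alpha > 1$ (let $\alpha \downarrow 1$ along a countable sequence), we must also have
$$\limsup \bar{Y}_n \le \mu \quad \text{a.s.} \tag{5.9}$$
By a similar argument we have
$$\frac{1}{n} \sum_{i=1}^{n} Y_i \ge \frac{1}{\beta_m} \sum_{i=1}^{\beta_{m-1}} Y_i = \frac{\beta_{m-1}}{\beta_m}\, \eta_{m-1}.$$
Using (5.8) again, we conclude that $\liminf \bar{Y}_n \ge \frac{1}{\alpha}\mu$ a.s., and then we must also have
$$\liminf \bar{Y}_n \ge \mu \quad \text{a.s.} \tag{5.10}$$
Combining (5.9) and (5.10) yields (5.6) for nonnegative $X_i$.

Finally, for arbitrary $X_i$ we proceed as follows. For every real number $x$ we define $x^+ = \max\{x, 0\}$ and $x^- = \max\{-x, 0\}$. Then $x = x^+ - x^-$. Similar notation applies to random variables. We apply the above results to the averages of the $X_i^+$ and $X_i^-$ to get $\bar{X}_n \xrightarrow{\text{a.s.}} \mathbb{E} X_1^+ - \mathbb{E} X_1^- = \mathbb{E} X_1 = \mu$.

The next proposition shows that if convergence of the averages to a constant takes place, this constant must be the (common) expectation.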

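Proposition 5.3 Let $X_1, X_2, \ldots$ be a sequence of iid random variables, and assume that
$$\bar{X}_n \xrightarrow{\text{a.s.}} \mu \tag{5.11}$$
for some constant $\mu \in \mathbb{R}$. Then $\mathbb{E}|X_1|$ is finite and $\mu = \mathbb{E} X_1$.

Proof Write $\bar{X}_n = \frac{n-1}{n} \bar{X}_{n-1} + \frac{X_n}{n}$ to conclude that $\frac{X_n}{n} \xrightarrow{\text{a.s.}} 0$. In particular we have $\frac{|X_n|}{n} \le 1$ eventually, i.e. $\mathbb{P}(\liminf\{\frac{|X_n|}{n} \le 1\}) = 1$, or $\mathbb{P}(\limsup\{\frac{|X_n|}{n} > 1\}) = 0$. Since the events $\{|X_n| > n\}$ are independent, the second half of the Borel-Cantelli lemma (lemma 3.3) lets us conclude that $\sum_{n=1}^{\infty} \mathbb{P}(\frac{|X_n|}{n} > 1) < \infty$, and thus $\sum_{n=1}^{\infty} \mathbb{P}(|X_1| > n) < \infty$. Since $\mathbb{E}|X_1| \le \sum_{n=0}^{\infty} \mathbb{P}(|X_1| > n)$ (exercise 6.18), we thus have $\mathbb{E}|X_1| < \infty$ and theorem 5.2 then yields that $\mu = \mathbb{E} X_1$.

The assertion of theorem 5.2 is stated under the assumption of the existence of a finite expectation. In the case where one deals with nonnegative random variables, this assumption can be dropped.

Theorem 5.4 Let $X_1, X_2, \ldots$ be a sequence of nonnegative iid random variables, defined on a common probability space. Let $\mu = \mathbb{E} X_1 \le \infty$, then
$$\bar{X}_n \xrightarrow{\text{a.s.}} \mu. \tag{5.12}$$

Proof We only need to consider the case where $\mu = \infty$. Fix $N \in \mathbb{N}$ and let $X_n^N = X_n \mathbf{1}_{\{X_n \le N\}}$, $n \in \mathbb{N}$. Then theorem 5.2 applies and we have that $\frac{1}{n} \sum_{k=1}^{n} X_k^N \xrightarrow{\text{a.s.}} \mathbb{E} X_1^N$. But $\bar{X}_n \ge \frac{1}{n} \sum_{k=1}^{n} X_k^N$ and hence $\ell := \liminf \bar{X}_n \ge \lim \frac{1}{n} \sum_{k=1}^{n} X_k^N = \mathbb{E} X_1^N$ a.s. for all $N$, and thus also $\ell \ge \lim_{N \to \infty} \mathbb{E} X_1^N$ a.s. But, by theorem A.1, the latter limit is equal to $\mathbb{E} X_1$. Hence $\ell = \infty$ a.s.

6 Exercises

6.1 Let $E_1$ and $E_2$ be two events such that $\mathbb{P}(E_1 \cap E_2) = \mathbb{P}(E_1)\mathbb{P}(E_2)$. Show that these events are independent in the sense of definition 3.1.

6.2 Let $E_1, E_2, \ldots$ be events and let $X_n = \mathbf{1}_{E_n}$, $n \ge 1$. Show that $\liminf X_n = \mathbf{1}_{\liminf E_n}$ and that $\limsup X_n = \mathbf{1}_{\limsup E_n}$.

6.3 Show that for any two real numbers $a$ and $b$ and for any $p > 0$ it holds that $(|a| + |b|)^p \le \max\{2^{p-1}, 1\}(|a|^p + |b|^p)$.

6.4 Let $X_1, X_2, \ldots$ be an iid sequence of random variables with common finite variance $\sigma^2$ and expectation $\mu$ and put $\bar{X}_n = \frac{1}{n} \sum_{k=1}^{n} X_k$. Show that $\operatorname{Var} \bar{X}_n = \frac{\sigma^2}{n}$ and deduce from Chebychev's inequality that $\bar{X}_n \xrightarrow{\mathbb{P}} \mu$.

6.5 Let $X, X_1, X_2, \ldots$ be random variables defined on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$. Show that the set $\{\omega : \lim X_n(\omega) = X(\omega)\}$ is an event, i.e. it is measurable.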

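6.6 Suppose that the real random variables $X_1, X_2, \ldots$ are defined on a common probability space and that they are iid with a uniform distribution on $[0, 1]$. Let $M_n = \max\{X_1, \ldots, X_n\}$. Show that $M_n \xrightarrow{\mathbb{P}} 1$ and that even $M_n \xrightarrow{\text{a.s.}} 1$.

6.7 Give an example of a sequence of random variables that converges in probability, but not almost surely.

6.8 Let $X_1, X_2, \ldots$ be a bounded sequence of random variables, i.e. $\mathbb{P}(|X_n| \le M) = 1$ for all $n$, for some real number $M$. Assume that for some random variable $X$ one has $X_n \xrightarrow{\mathbb{P}} X$. Show that also $\mathbb{P}(|X| \le M) = 1$ and that for all $p > 0$ one has $X_n \xrightarrow{L^p} X$.

6.9 Let $Z_1, Z_2, \ldots$ be an iid sequence of standard normals and let $S_n = \sum_{k=1}^{n} Z_k$. Let $a, b \in \mathbb{R}$ and define $X_n = \exp(a S_n - b n)$.
(i) Express for every $\varepsilon > 0$ the probability $\mathbb{P}(X_n > \varepsilon)$ in the cumulative distribution function of $Z_1$ and deduce that $X_n \xrightarrow{\mathbb{P}} 0$ iff $b > 0$.
(ii) Show that $\mathbb{E} \exp(\lambda Z_1) = \exp(\frac{1}{2}\lambda^2)$ for $\lambda \in \mathbb{R}$ and compute $\mathbb{E} X_n^p$ ($p > 0$).
(iii) Show that $X_n \xrightarrow{L^p} 0$ iff $p < 2b/a^2$.
(iv) Show that $X_n \xrightarrow{\text{a.s.}} 0$ iff $b > 0$. Hint: Use Markov's inequality for $X_n^p$.

6.10 Let $\alpha > 1$ and $\beta_k = [\alpha^k]$. Show that there exists a constant $C_\alpha$ such that for all integers $i \ge 1$ one has $\sum_{n : \beta_n \ge i} \frac{1}{\beta_n^2} \le \frac{C_\alpha}{i^2}$. Show also that $\frac{\beta_{k+1}}{\beta_k} \to \alpha$.

6.11 Let $x_n$ be real numbers with $x_n \to x$. Let $y_n = \frac{1}{n} \sum_{i=1}^{n} x_i$. Show that $y_n \to x$. Take the $\eta_n$ from the proof of theorem 5.2. Show that $\mathbb{E}\eta_n \to \mu$.

6.12 Let $X_1, X_2, \ldots$ be a sequence of iid random variables with $\mathbb{E} X_1^2 < \infty$. The aim is to show that both $\bar{X}_n \xrightarrow{L^2} \mu$, where $\mu = \mathbb{E} X_1$, and $\bar{X}_n \xrightarrow{\text{a.s.}} \mu$.
(i) Show the $L^2$ convergence.
(ii) Use Chebychev's inequality to show that $\sum_n \mathbb{P}(|\bar{X}_{n^2} - \mu| > \varepsilon) < \infty$ and deduce from a well known lemma that $\bar{X}_{n^2} \xrightarrow{\text{a.s.}} \mu$.
(iii) Show the almost sure convergence of $\bar{X}_n$ by filling the gaps.

6.13 Let $X, X_1, X_2, \ldots$ be real random variables and $g : \mathbb{R} \to \mathbb{R}$ a uniformly continuous function. Show that $g(X_n) \xrightarrow{\mathbb{P}} g(X)$ if $X_n \xrightarrow{\mathbb{P}} X$. What can be said of the $g(X_n)$ if $X_n \xrightarrow{\text{a.s.}} X$?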

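6.14 Let $X_1, Y_1, X_2, Y_2, \ldots$ be an iid sequence whose members have a uniform distribution on $[0, 1]$ and let $f : [0, 1] \to [0, 1]$ be continuous. Define $Z_i = \mathbf{1}_{\{f(X_i) > Y_i\}}$.
(i) Show that $\frac{1}{n} \sum_{i=1}^{n} Z_i \to \int_0^1 f(x)\,dx$ a.s.
(ii) Show that $\mathbb{E}\big(\frac{1}{n} \sum_{i=1}^{n} Z_i - \int_0^1 f(x)\,dx\big)^2 \le \frac{1}{4n}$.
(iii) Explain why these two results are useful.

6.15 If $X_n \xrightarrow{\mathbb{P}} X$ and $g$ is a continuous function, then also $g(X_n) \xrightarrow{\mathbb{P}} g(X)$. Show this.

6.16 Assume that $X_n \xrightarrow{\mathbb{P}} X$ and $Y_n \xrightarrow{\mathbb{P}} Y$. Show that $X_n + Y_n \xrightarrow{\mathbb{P}} X + Y$ and similar statements for products and ratios. What about convergence of the pairs $(X_n, Y_n)$? How do the results look for almost sure convergence instead of convergence in probability?

6.17 Let $X_1, X_2, \ldots$ be nonnegative random variables such that $X_{n+1} \ge X_n$ and let $X = \lim X_n$. Show that $\lim \mathbb{E} X_n \le \mathbb{E} X$. Let $X$ be a nonnegative random variable and $X_n = X \mathbf{1}_{\{X \le n\}}$, $n \in \mathbb{N}$. Show also that $\mathbb{E} X_n \to \mathbb{E} X$, when $X$ has a density, or when $X$ is discrete. (This is a special case of theorem A.1.)

6.18 Prove for every random variable $X$ the double inequalities
$$\sum_{n=1}^{\infty} \mathbb{P}(|X| > n) \le \mathbb{E}|X| \le \sum_{n=0}^{\infty} \mathbb{P}(|X| > n)$$
and
$$\sum_{n=1}^{\infty} \mathbb{P}(|X| \ge n) \le \mathbb{E}|X| \le 1 + \sum_{n=1}^{\infty} \mathbb{P}(|X| \ge n).$$

A The monotone convergence theorem

Although familiarity with measure theory is not assumed in this course, occasionally we need one of the main theorems that will be provided in any introductory course in measure theory: one that deals with interchanging limits and expectation. For a proof we refer to a course in measure theory, although a part of the proof of theorem A.1 is elementary (see exercise 6.17).

Theorem A.1 (Monotone convergence theorem) Let $X_1, X_2, \ldots$ be nonnegative random variables with the property that $X_{n+1} \ge X_n$ and let $X = \lim X_n$. Then $\lim \mathbb{E} X_n = \mathbb{E} X \le \infty$.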
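As a small numerical illustration of theorem A.1, consider the truncations $X_n = X \mathbf{1}_{\{X \le n\}}$ of a nonnegative random variable $X$. The sketch below assumes, for illustration only, that $X$ is exponentially distributed with parameter 1, so that $\mathbb{E} X = 1$ and $\mathbb{E} X_n = \int_0^n x e^{-x}\,dx = 1 - (n+1)e^{-n}$. The function names and the midpoint rule are choices made for this example, not part of the notes.

```python
import math


def truncated_mean(n, steps=100_000):
    """Midpoint-rule approximation of E[X 1{X <= n}] = int_0^n x e^{-x} dx, X ~ Exp(1)."""
    h = n / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h  # midpoint of the k-th subinterval
        total += x * math.exp(-x)
    return total * h


def closed_form(n):
    """Exact value of int_0^n x e^{-x} dx."""
    return 1.0 - (n + 1) * math.exp(-n)


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 20):
        print(n, truncated_mean(n), closed_form(n))
```

The printed values increase with $n$ and approach $\mathbb{E} X = 1$, as the theorem asserts for this monotone sequence. This is only a check in one example and does not replace the proof asked for in exercise 6.17.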