Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence

Chapter 3 Strong convergence

As pointed out in Chapter 2, there are multiple ways to define the notion of convergence of a sequence of random variables. That chapter defined convergence in probability, convergence in distribution, and convergence in kth mean. We now consider a fourth mode of convergence, almost sure convergence or convergence with probability one, which will be shown to imply both convergence in probability and convergence in distribution. It is for this reason that we attach the term "strong" to almost sure convergence and "weak" to the other two; these terms are not meant to indicate anything about their usefulness. In fact, the "weak" modes of convergence are used much more frequently in asymptotic statistics than the strong mode, and thus a reader new to the subject may wish to skip this chapter; most of the rest of the book may be understood without a grasp of strong convergence.

3.1 Definition of almost sure convergence

This form of convergence is in some sense the simplest to understand, since it depends only on the concept of limit of a sequence of real numbers, Definition 1.1. Since a random variable like $X_n$ or $X$ is a function on a sample space (say, $\Omega$), if we fix a particular element of that space (say, $\omega$), we obtain the real numbers $X_n(\omega)$ and $X(\omega)$. We may then ask, for each $\omega \in \Omega$, whether $X_n(\omega)$ converges to $X(\omega)$ as a sequence of real numbers.

Definition 3.1 Suppose $X$ and $X_1, X_2, \ldots$ are random variables defined on the same sample space $\Omega$ (and as usual $P$ denotes the associated probability measure). If

$P(\{\omega \in \Omega : X_n(\omega) \to X(\omega)\}) = 1$,

then $X_n$ is said to converge almost surely (or with probability one) to $X$, denoted $X_n \stackrel{a.s.}{\to} X$ or $X_n \to X$ a.s. or $X_n \to X$ w.p. 1.

In other words, convergence with probability one means exactly what it sounds like: The probability that $X_n$ converges to $X$ equals one. Later, in Theorem 3.3, we will formulate an equivalent definition of almost sure convergence that makes it much easier to see why it is such a strong form of convergence of random variables. Yet the intuitive simplicity of Definition 3.1 makes it the standard definition.

As in the case of convergence in probability, we may replace the limiting random variable $X$ by any constant $c$, in which case we write $X_n \stackrel{a.s.}{\to} c$. In the most common statistical usage of convergence to a constant, the random variable $X_n$ is some estimator of a particular parameter, say $g(\theta)$:

Definition 3.2 If $X_n \stackrel{a.s.}{\to} g(\theta)$, $X_n$ is said to be strongly consistent for $g(\theta)$.

As the names suggest, strong consistency implies consistency, a fact to be explored in more depth below.

3.1.1 Strong Consistency versus Consistency

As before, suppose that $X$ and $X_1, X_2, \ldots$ are random variables defined on the same sample space, $\Omega$. For given $n$ and $\epsilon > 0$, define the events

$A_n = \{\omega \in \Omega : |X_k(\omega) - X(\omega)| < \epsilon \text{ for all } k \ge n\}$   (3.1)

and

$B_n = \{\omega \in \Omega : |X_n(\omega) - X(\omega)| < \epsilon\}$.   (3.2)

Evidently, both $A_n$ and $B_n$ occur with high probability when $X_n$ is in some sense close to $X$, so it might be reasonable to say that $X_n$ converges to $X$ if $P(A_n) \to 1$ or $P(B_n) \to 1$ for any $\epsilon > 0$. In fact, Definition 2.1 is nothing more than the latter statement; that is, $X_n \stackrel{P}{\to} X$ if and only if $P(B_n) \to 1$ for any $\epsilon > 0$.

Yet what about the sets $A_n$? One fact is immediate: Since $A_n \subset B_n$, we must have $P(A_n) \le P(B_n)$. Therefore, $P(A_n) \to 1$ implies $P(B_n) \to 1$. By now, the reader might already have guessed that $P(A_n) \to 1$ for all $\epsilon$ is equivalent to $X_n \stackrel{a.s.}{\to} X$:

Theorem 3.3 With $A_n$ defined as in Equation (3.1), $P(A_n) \to 1$ for any $\epsilon > 0$ if and only if $X_n \stackrel{a.s.}{\to} X$.

The proof of Theorem 3.3 is the subject of Exercise 3.1. The following corollary now follows from the preceding discussion:

Corollary 3.4 If $X_n \stackrel{a.s.}{\to} X$, then $X_n \stackrel{P}{\to} X$.

The converse of Corollary 3.4 is not true, as the following example illustrates.

Example 3.5 Take $\Omega$ to be the half-open interval $(0, 1]$, and for any interval $J \subset \Omega$, say $J = (a, b]$, take $P(J) = b - a$ to be the length of that interval. Define a sequence of intervals $J_1, J_2, \ldots$ as follows:

$J_1 = (0, 1]$
$J_2$ through $J_4$ = $(0, \frac13], (\frac13, \frac23], (\frac23, 1]$
$J_5$ through $J_9$ = $(0, \frac15], (\frac15, \frac25], (\frac25, \frac35], (\frac35, \frac45], (\frac45, 1]$
...
$J_{m^2+1}$ through $J_{(m+1)^2}$ = $(0, \frac{1}{2m+1}], \ldots, (\frac{2m}{2m+1}, 1]$
...

Note in particular that $P(J_n) = 1/(2m + 1)$, where $m = \lfloor \sqrt{n-1} \rfloor$ is the largest integer not greater than $\sqrt{n-1}$. Now, define $X_n = I\{J_n\}$ and take $0 < \epsilon < 1$. Then $P(|X_n - 0| < \epsilon)$ is the same as $1 - P(J_n)$. Since $P(J_n) \to 0$, we conclude $X_n \stackrel{P}{\to} 0$ by definition.

However, it is not true that $X_n \stackrel{a.s.}{\to} 0$. Since every $\omega \in \Omega$ is contained in infinitely many $J_n$, the set $A_n$ defined in Equation (3.1) is empty for all $n$. Alternatively, consider the set $S = \{\omega : X_n(\omega) \to 0\}$. For any $\omega$, $X_n(\omega)$ has no limit because $X_n(\omega) = 1$ and $X_n(\omega) = 0$ both occur for infinitely many $n$. Thus $S$ is empty. This is not convergence with probability one; it is convergence with probability zero!

3.1.2 Multivariate Extensions

We may extend Definition 3.1 to the multivariate case in a completely straightforward way:

Definition 3.6 $\mathbf{X}_n$ is said to converge almost surely (or with probability one) to $\mathbf{X}$ ($\mathbf{X}_n \stackrel{a.s.}{\to} \mathbf{X}$) if

$P(\mathbf{X}_n \to \mathbf{X} \text{ as } n \to \infty) = 1$.

Alternatively, since the proof of Theorem 3.3 applies to random vectors as well as random variables, we say $\mathbf{X}_n \stackrel{a.s.}{\to} \mathbf{X}$ if for any $\epsilon > 0$,

$P(\|\mathbf{X}_k - \mathbf{X}\| < \epsilon \text{ for all } k \ge n) \to 1$ as $n \to \infty$.
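The construction in Example 3.5 can be checked numerically. The following is a minimal sketch rather than part of the original notes: the function names `interval` and `X` and the sample point 7/10 are illustrative choices. It indexes the intervals exactly as in the example and confirms that the interval lengths shrink while a fixed sample point keeps being hit once per block.

```python
import math
from fractions import Fraction

def interval(n):
    """Return (a, b) with J_n = (a, b], following Example 3.5."""
    m = math.isqrt(n - 1)          # largest integer not greater than sqrt(n - 1)
    j = n - m * m - 1              # position of J_n within its block, 0 .. 2m
    width = Fraction(1, 2 * m + 1)
    return j * width, (j + 1) * width

def X(n, omega):
    """Indicator X_n(omega) = I{omega in J_n}."""
    a, b = interval(n)
    return 1 if a < omega <= b else 0

omega = Fraction(7, 10)            # an arbitrary fixed sample point (illustrative choice)
N = 10_000
hits = [n for n in range(1, N + 1) if X(n, omega) == 1]

# P(J_n) shrinks to zero, so X_n -> 0 in probability ...
a, b = interval(N)
print(float(b - a))
# ... yet omega lies in exactly one interval of every block,
# so X_n(omega) = 1 keeps recurring and X_n(omega) has no limit.
print(len(hits), hits[:5])
```

With N = 10,000 there are 100 complete blocks, so the chosen point is hit 100 times; increasing N only adds more hits, which is the sense in which $X_n(\omega) = 1$ occurs infinitely often.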

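We saw in Theorems 2.24 and 2.30 that continuous functions preserve both convergence in probability and convergence in distribution. Yet these facts were quite difficult to prove. Fortunately, the analogous result for convergence with probability one is quite easy to prove. In fact, since almost sure convergence is defined in terms of convergence of sequences of real (not random) vectors, the following theorem may be proven using the same method of proof used for Theorem 1.16.

Theorem 3.7 Suppose that $f : S \to R^l$ is a continuous function defined on some subset $S \subset R^k$, $\mathbf{X}_n$ is a $k$-component random vector, and $P(\mathbf{X} \in S) = 1$. If $\mathbf{X}_n \stackrel{a.s.}{\to} \mathbf{X}$, then $f(\mathbf{X}_n) \stackrel{a.s.}{\to} f(\mathbf{X})$.

We conclude this section with a simple diagram summarizing the implications among the modes of convergence defined so far. In the diagram, a double arrow like $\Rightarrow$ means "implies". Note that the picture changes slightly when convergence is to a constant $c$ rather than a random vector $\mathbf{X}$.

Convergence to a random vector $\mathbf{X}$:
$\mathbf{X}_n \stackrel{qm}{\to} \mathbf{X} \;\Rightarrow\; \mathbf{X}_n \stackrel{P}{\to} \mathbf{X} \;\Rightarrow\; \mathbf{X}_n \stackrel{d}{\to} \mathbf{X}$
$\mathbf{X}_n \stackrel{a.s.}{\to} \mathbf{X} \;\Rightarrow\; \mathbf{X}_n \stackrel{P}{\to} \mathbf{X}$

Convergence to a constant $c$:
$\mathbf{X}_n \stackrel{qm}{\to} c \;\Rightarrow\; \mathbf{X}_n \stackrel{P}{\to} c \;\Leftrightarrow\; \mathbf{X}_n \stackrel{d}{\to} c$
$\mathbf{X}_n \stackrel{a.s.}{\to} c \;\Rightarrow\; \mathbf{X}_n \stackrel{P}{\to} c$

Exercises for Section 3.1

Exercise 3.1 Prove Theorem 3.3.
Hint: Note that the sets $A_n$ are increasing in $n$, so that by the lower continuity of any probability measure (which you may assume without proof), $\lim_n P(A_n)$ exists and is equal to $P(\bigcup_{n=1}^\infty A_n)$.

Exercise 3.2 Prove Theorem 3.7.

3.2 The Strong Law of Large Numbers

Some of the results in this section are presented for univariate random variables and some are presented for random vectors. Take note of the use of bold print to denote vectors.

Theorem 3.8 Strong Law of Large Numbers: Suppose that $\mathbf{X}_1, \mathbf{X}_2, \ldots$ are independent and identically distributed and have finite mean $\boldsymbol{\mu}$. Then $\overline{\mathbf{X}}_n \stackrel{a.s.}{\to} \boldsymbol{\mu}$.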
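To make the statement of Theorem 3.8 concrete, the sketch below follows one simulated path of sample means and reports the largest deviation from the mean over the tail $k \ge n$, mirroring the event $A_n$ of Equation (3.1). It is an illustration under assumptions, not a proof: the Uniform(0, 1) distribution, the seed, the path length, and the helper name `running_means` are all arbitrary choices made here.

```python
import random

def running_means(draw, n, seed=0):
    """Return the sample means X-bar_1, ..., X-bar_n of n i.i.d. draws."""
    rng = random.Random(seed)
    total, means = 0.0, []
    for k in range(1, n + 1):
        total += draw(rng)
        means.append(total / k)
    return means

mu = 0.5                                     # mean of Uniform(0, 1)
means = running_means(lambda rng: rng.random(), 200_000)

# Worst deviation from mu over the (finite) tail k >= n, for a few n.
# Along this one simulated path the tail supremum shrinks as n grows.
for n in (100, 10_000, 100_000):
    print(n, max(abs(m - mu) for m in means[n - 1:]))
```

A finite simulation can only inspect a finite stretch of the tail, so it suggests rather than establishes almost sure convergence.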

It is possible to use fairly simple arguments to prove a version of the Strong Law under more restrictive assumptions than those given above. See Exercise 3.4 for details of a proof of the univariate Strong Law under the additional assumption that $E X_n^4 < \infty$.

To aid the proof of the Strong Law in its full generality, we first establish a useful lemma.

Lemma 3.9 If $\sum_{k=1}^\infty P(\|\mathbf{X}_k - \mathbf{X}\| > \epsilon) < \infty$ for any $\epsilon > 0$, then $\mathbf{X}_n \stackrel{a.s.}{\to} \mathbf{X}$.

Proof: The proof relies on the countable subadditivity of any probability measure, a property stating that for any sequence $A_1, A_2, \ldots$ of events,

$P\left(\bigcup_{k=1}^\infty A_k\right) \le \sum_{k=1}^\infty P(A_k)$.   (3.3)

To prove the lemma, we must demonstrate that $P(\|\mathbf{X}_k - \mathbf{X}\| \le \epsilon \text{ for all } k \ge n) \to 1$ as $n \to \infty$, which (taking complements) is equivalent to $P(\|\mathbf{X}_k - \mathbf{X}\| > \epsilon \text{ for some } k \ge n) \to 0$. Letting $A_k$ denote the event that $\|\mathbf{X}_k - \mathbf{X}\| > \epsilon$, countable subadditivity implies

$P(A_k \text{ for some } k \ge n) = P\left(\bigcup_{k=n}^\infty A_k\right) \le \sum_{k=n}^\infty P(A_k)$,

and the right hand side tends to 0 as $n \to \infty$ because it is the tail of a convergent series.

Lemma 3.9 is nearly the same as a famous result called the First Borel-Cantelli Lemma, or sometimes simply the Borel-Cantelli Lemma; see Exercise 3.3. The utility of Lemma 3.9 is illustrated by the following useful result, which allows us to relate almost sure convergence to convergence in probability (see Theorem 2.24, for instance).

Theorem 3.10 $\mathbf{X}_n \stackrel{P}{\to} \mathbf{X}$ if and only if each subsequence $\mathbf{X}_{n_1}, \mathbf{X}_{n_2}, \ldots$ contains a further subsequence that converges almost surely to $\mathbf{X}$.

The proof of Theorem 3.10, which uses Lemma 3.9, is the subject of Exercise 3.7.

3.2.1 Independent but not identically distributed variables

Here, we generalize the univariate version of the Strong Law to a situation in which the $X_n$ are assumed to be independent and satisfy a second moment condition:

Theorem 3.11 Kolmogorov's Strong Law of Large Numbers: Suppose that $X_1, X_2, \ldots$ are independent with mean $\mu$ and

$\sum_{i=1}^\infty \frac{\operatorname{Var} X_i}{i^2} < \infty$.

Then $\overline{X}_n \stackrel{a.s.}{\to} \mu$.
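The tail-of-a-series step in the proof of Lemma 3.9 can be seen in numbers. The sketch below assumes, purely for illustration, events $A_k$ that are independent with $P(A_k) = 1/k^2$; independence is not required by the lemma but lets the left-hand side be computed as one minus a product. The names `tail_sum` and `prob_some_event` and the truncation length are hypothetical choices, and both quantities are truncated to finitely many terms.

```python
import math

def tail_sum(n, terms=200_000):
    """Approximate sum_{k >= n} 1/k^2 from below using `terms` terms."""
    return sum(1.0 / (k * k) for k in range(n, n + terms))

def prob_some_event(n, terms=200_000):
    """P(A_k for some n <= k < n + terms) for independent A_k with P(A_k) = 1/k^2."""
    none = math.prod(1.0 - 1.0 / (k * k) for k in range(n, n + terms))
    return 1.0 - none

# Left column: probability that some A_k with k >= n occurs.
# Right column: the subadditivity bound, a tail of a convergent series.
for n in (2, 10, 100):
    print(n, prob_some_event(n), tail_sum(n))
```

In this particular example the product telescopes, so the exact probability is $1/n$; both columns tend to 0 as $n$ grows, with the bound always on top.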

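Note that there is no reason the $X_i$ in Theorem 3.11 must have the same means: If $E X_i = \mu_i$, then the theorem as written implies that $(1/n) \sum_i (X_i - \mu_i) \stackrel{a.s.}{\to} 0$.

Theorem 3.11 may be proved using Kolmogorov's inequality from Exercise 1.31; this proof is the focus of Exercise 3.6. In fact, Theorem 3.11 turns out to be very important because it may be used to prove the Strong Law, Theorem 3.8. The key to completing this proof is to introduce truncated versions of $X_1, X_2, \ldots$ as in the following lemma.

Lemma 3.12 Suppose that $X_1, X_2, \ldots$ are independent and identically distributed and have finite mean $\mu$. Define $X_i^* = X_i I\{|X_i| \le i\}$. Then

$\sum_{i=1}^\infty \frac{\operatorname{Var} X_i^*}{i^2} < \infty$   (3.4)

and $\overline{X}_n - \overline{X}_n^* \stackrel{a.s.}{\to} 0$.

Under the assumptions of Lemma 3.12, we see immediately that

$\overline{X}_n = \overline{X}_n^* + (\overline{X}_n - \overline{X}_n^*) \stackrel{a.s.}{\to} \mu$,

because Equation (3.4) implies $\overline{X}_n^* \stackrel{a.s.}{\to} \mu$ by Theorem 3.11 (together with the fact that $E X_i^* \to \mu$, so that the averages of the $E X_i^*$ also tend to $\mu$). This proves the univariate version of Theorem 3.8; the full multivariate version follows because $\overline{\mathbf{X}}_n \stackrel{a.s.}{\to} \boldsymbol{\mu}$ if and only if $\overline{X}_{nj} \stackrel{a.s.}{\to} \mu_j$ for all $j$ (Lemma 1.31). A proof of Lemma 3.12 is the subject of Exercise 3.5.

Exercises for Section 3.2

Exercise 3.3 Let $B_1, B_2, \ldots$ denote a sequence of events. Let $B_n$ i.o., which stands for "$B_n$ infinitely often", denote the set

$B_n \text{ i.o.} \stackrel{def}{=} \{\omega \in \Omega : \text{for every } n, \text{ there exists } k \ge n \text{ such that } \omega \in B_k\}$.

Prove the first Borel-Cantelli Lemma, which states that if $\sum_{n=1}^\infty P(B_n) < \infty$, then $P(B_n \text{ i.o.}) = 0$.
Hint: Argue that

$B_n \text{ i.o.} = \bigcap_{n=1}^\infty \bigcup_{k=n}^\infty B_k$,

then adapt the proof of Lemma 3.9.

Exercise 3.4 Use the hint below to prove that if $X_1, X_2, \ldots$ are independent and identically distributed and $E X_1^4 < \infty$, then $\overline{X}_n \stackrel{a.s.}{\to} E X_1$. You may assume without loss of generality that $E X_1 = 0$.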

Hint: Use Markov's inequality (1.22) with $r = 4$ to put an upper bound on $P(|\overline{X}_n| > \epsilon)$ involving $E (X_1 + \cdots + X_n)^4$. Expand $E (X_1 + \cdots + X_n)^4$ and then count the nonzero terms. Finally, argue that the conditions of Lemma 3.9 are satisfied.

Exercise 3.5 Lemma 3.12 makes two assertions about the random variables $X_i^* = X_i I\{|X_i| \le i\}$:

(a) Prove that $\sum_{i=1}^\infty \frac{\operatorname{Var} X_i^*}{i^2} < \infty$.
Hint: Use the fact that the $X_i$ are independent and identically distributed, then show that

$X_1^2 \sum_{k=1}^\infty \frac{1}{k^2} I\{|X_1| \le k\} \le 2 |X_1|$,

perhaps by bounding the sum on the left by an easy-to-evaluate integral.

(b) Prove that $\overline{X}_n - \overline{X}_n^* \stackrel{a.s.}{\to} 0$.
Hint: Use Lemma 3.9 and Exercise 1.32 to show that $X_n - X_n^* \stackrel{a.s.}{\to} 0$. Then use Exercise 1.3.

Exercise 3.6 Prove Theorem 3.11. Use the following steps:

(a) For $k = 1, 2, \ldots$, define

$Y_k = \max_{2^{k-1} \le n < 2^k} |\overline{X}_n - \mu|$.

Use the Kolmogorov inequality from Exercise 1.31 to show that

$P(Y_k \ge \epsilon) \le \frac{4 \sum_{i=1}^{2^k} \operatorname{Var} X_i}{4^k \epsilon^2}$.

(b) Use Lemma 3.9 to show that $Y_k \stackrel{a.s.}{\to} 0$, then argue that this proves $\overline{X}_n \stackrel{a.s.}{\to} \mu$.
Hint: Letting $\lceil \log_2 i \rceil$ denote the smallest integer greater than or equal to $\log_2 i$ (the base-2 logarithm of $i$), verify and use the fact that

$\sum_{k=\lceil \log_2 i \rceil}^\infty \frac{1}{4^k} \le \frac{4}{3 i^2}$.

Exercise 3.7 Prove Theorem 3.10.
Hint: To simplify notation, let $\mathbf{Y}_k = \mathbf{X}_{n_k}$ denote an arbitrary subsequence. If $\mathbf{Y}_k \stackrel{P}{\to} \mathbf{X}$, show that there exist $k_1, k_2, \ldots$ such that

$P(\|\mathbf{Y}_{k_j} - \mathbf{X}\| > \epsilon) < \frac{1}{2^j}$,

then use Lemma 3.9. On the other hand, if $\mathbf{X}_n$ does not converge in probability to $\mathbf{X}$, argue that there exists a subsequence $\mathbf{Y}_1 = \mathbf{X}_{n_1}, \mathbf{Y}_2 = \mathbf{X}_{n_2}, \ldots$ and $\epsilon > 0$ such that

$P(\|\mathbf{Y}_k - \mathbf{X}\| > \epsilon) > \epsilon$

for all $k$. Then use Corollary 3.4 to argue that $\mathbf{Y}_n$ does not have a subsequence that converges almost surely.

3.3 The Dominated Convergence Theorem

We now consider the question of when $Y_n \stackrel{d}{\to} Y$ implies $E Y_n \to E Y$. This is not generally the case: Consider contaminated normal distributions with distribution functions

$F_n(x) = \left(1 - \frac{1}{n}\right) \Phi(x) + \frac{1}{n} \Phi(x - 37n)$.   (3.5)

These distributions converge in distribution to the standard normal $\Phi(x)$, yet each has mean 37.

However, recall Theorem 2.25, which guarantees that $Y_n \stackrel{d}{\to} Y$ implies $E Y_n \to E Y$ if all of the $Y_n$ and $Y$ are uniformly bounded (say, $|Y_n| < M$ and $|Y| < M$), since in that case, there is a bounded, continuous function $g(y)$ for which $g(Y_n) = Y_n$ and $g(Y) = Y$: Simply define $g(y) = y$ for $-M < y < M$, and $g(y) = My/|y|$ otherwise.

To say that the $Y_n$ are uniformly bounded is a much stronger statement than saying that each $Y_n$ is bounded. The latter statement implies that the bound we choose is allowed to depend on $n$, whereas the uniform bound means that the same bound must apply to all $Y_n$. When there are only finitely many $Y_n$, then boundedness implies uniform boundedness since we may take as a uniform bound the maximum of the bounds of the individual $Y_n$. However, in the case of an infinite sequence of $Y_n$, the maximum of an infinite set of individual bounds might not exist.
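The contaminated normal example of Equation (3.5) is easy to check numerically. The sketch below is an illustration with hypothetical helper names (`Phi`, `F`, `mean`): it evaluates the mixture distribution function on a grid and computes the mixture mean from the component means, showing the gap to $\Phi$ shrinking like $1/n$ while the mean stays at 37.

```python
import math

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def F(n, x):
    """Contaminated normal distribution function from Equation (3.5)."""
    return (1 - 1 / n) * Phi(x) + (1 / n) * Phi(x - 37 * n)

def mean(n):
    """Mean of the mixture: weight 1 - 1/n on N(0, 1) and 1/n on N(37n, 1)."""
    return (1 - 1 / n) * 0.0 + (1 / n) * (37 * n)

# Largest gap |F_n(x) - Phi(x)| over a grid on [-5, 5], next to the mean.
for n in (1, 10, 1000):
    gap = max(abs(F(n, x / 10) - Phi(x / 10)) for x in range(-50, 51))
    print(n, mean(n), gap)
```

The small mass $1/n$ placed far out at $37n$ is invisible to the distribution function in the limit but contributes a fixed amount to the expectation, which is exactly what a uniform bound rules out.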

The intuition above, then, is that some sort of uniform bound on the $Y_n$ should be enough to guarantee $E Y_n \to E Y$. The most common way to express this idea is the Dominated Convergence Theorem, given later in this section as Theorem 3.17.

The proof of the Dominated Convergence Theorem that we give here relies on a powerful technique that is often useful for proving results about convergence in distribution. This technique is called the Skorohod Representation Theorem, which guarantees that convergence in distribution implies almost sure convergence for a possibly different sequence of random variables. More precisely, if we know $X_n \stackrel{d}{\to} X$, the Skorohod Representation Theorem guarantees the existence of $Y_n \stackrel{d}{=} X_n$ and $Y \stackrel{d}{=} X$ such that $Y_n \stackrel{a.s.}{\to} Y$, where $\stackrel{d}{=}$ means "has the same distribution as". Construction of such $Y_n$ and $Y$ will depend upon inverting the distribution functions of $X_n$ and $X$. However, since not all distribution functions are invertible, we first generalize the notion of the inverse of a distribution function by defining the quantile function.

Definition 3.13 If $F(x)$ is a distribution function, then we define the quantile function $F^- : (0, 1) \to R$ by

$F^-(u) \stackrel{def}{=} \inf\{x \in R : u \le F(x)\}$.

With the quantile function thus defined, we may prove a useful lemma:

Lemma 3.14 $u \le F(x)$ if and only if $F^-(u) \le x$.

Proof: Using the facts that $F^-$ and $F$ are nondecreasing and $F^-[F(x)] \le x$,

$u \le F(x) \;\Rightarrow\; F^-(u) \le F^-[F(x)] \le x \;\Rightarrow\; F[F^-(u)] \le F(x) \;\Rightarrow\; u \le F(x)$,

where the first implication follows because $F^-[F(x)] \le x$ and the last follows because $u \le F[F^-(u)]$ (the latter fact requires right-continuity of $F$).

Now the construction of $Y_n$ and $Y$ proceeds as follows. Let $F_n$ and $F$ denote the distribution functions of $X_n$ and $X$, respectively, for all $n$. Take the sample space $\Omega$ to be the interval $(0, 1)$ and adopt the probability measure that assigns to each interval subset $(a, b) \subset (0, 1)$ its length $(b - a)$. (There is a unique probability measure on $(0, 1)$ with this property, a fact we do not prove here.) Then for every $\omega \in \Omega$, define

$Y_n(\omega) \stackrel{def}{=} F_n^-(\omega)$ and $Y(\omega) \stackrel{def}{=} F^-(\omega)$.   (3.6)

The random variables $Y_n$ and $Y$ are exactly the random variables we need, as asserted in the following theorem.
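As a concrete aside before the theorem is stated, the quantile function of Definition 3.13 and the equivalence in Lemma 3.14 can be checked on a small discrete distribution. This is a sketch under assumptions: the three-point distribution, the grids, and the names `F` and `F_minus` are illustrative choices, and exact rational arithmetic is used so that the comparisons are not affected by rounding.

```python
from fractions import Fraction
from itertools import accumulate

# Illustrative discrete distribution: support points and their probabilities.
support = [0, 1, 3]
probs = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
cum = list(accumulate(probs))              # F evaluated at each support point

def F(x):
    """Distribution function: P(X <= x)."""
    return sum((p for s, p in zip(support, probs) if s <= x), Fraction(0))

def F_minus(u):
    """Quantile function inf{x : u <= F(x)} for 0 < u < 1."""
    return next(s for s, c in zip(support, cum) if u <= c)

# Lemma 3.14 on a grid: u <= F(x) exactly when F_minus(u) <= x.
us = [Fraction(k, 20) for k in range(1, 20)]
xs = [Fraction(k, 2) for k in range(-2, 9)]
print(all((u <= F(x)) == (F_minus(u) <= x) for u in us for x in xs))

# Construction as in (3.6): Y(omega) = F_minus(omega) for omega in (0, 1).
# The set {omega : Y(omega) <= 1} is (0, F(1)], whose length is F(1) = 3/4.
print(sum(1 for u in us if F_minus(u) <= 1), "of", len(us), "grid points")
```

Because this $F$ is a step function it has no ordinary inverse, yet `F_minus` is still well defined, which is the reason for working with the quantile function rather than $F^{-1}$.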

Theorem 3.15 Skorohod representation theorem: Assume $F_n \stackrel{d}{\to} F$. The random variables defined in expression (3.6) satisfy

1. $P(Y_n \le t) = F_n(t)$ for all $n$ and $P(Y \le t) = F(t)$;
2. $Y_n \stackrel{a.s.}{\to} Y$.

Before proving Theorem 3.15, we first state a technical lemma, a proof of which is the subject of Exercise 3.8(a).

Lemma 3.16 Assume $F_n \stackrel{d}{\to} F$ and let the random variables $Y_n$ and $Y$ be defined as in expression (3.6). Then for any $\omega \in (0, 1)$ and any $\epsilon > 0$ such that $\omega + \epsilon < 1$,

$Y(\omega) \le \liminf_n Y_n(\omega) \le \limsup_n Y_n(\omega) \le Y(\omega + \epsilon)$.   (3.7)

Proof of Theorem 3.15: By Lemma 3.14, $Y \le t$ if and only if $\omega \le F(t)$. But $P(\omega \le F(t)) = F(t)$ by construction. A similar argument for $Y_n$ proves the first part of the theorem.

For the second part of the theorem, letting $\epsilon \to 0$ in inequality (3.7) shows that $Y_n(\omega) \to Y(\omega)$ whenever $\omega$ is a point of continuity of $Y(\omega)$. Since $Y(\omega)$ is a nondecreasing function of $\omega$, there are at most countably many points of discontinuity of $Y(\omega)$; see Exercise 3.8(b). Let $D$ denote the set of all points of discontinuity of $Y(\omega)$. Since each individual point in $\Omega$ has probability zero, the countable subadditivity property (3.3) implies that $P(D) = 0$. Since we have shown that $Y_n(\omega) \to Y(\omega)$ for all $\omega \notin D$, we conclude that $Y_n \stackrel{a.s.}{\to} Y$.

Note that the points of discontinuity of $Y(\omega)$ mentioned in the proof of Theorem 3.15 are not in any way related to the points of discontinuity of $F(x)$. In fact, "flat spots" of $F(x)$ lead to discontinuities of $Y(\omega)$ and vice versa.

Having thus established the Skorohod Representation Theorem, we now introduce the Dominated Convergence Theorem.

Theorem 3.17 Dominated Convergence Theorem: If for some random variable $Z$, $|X_n| \le |Z|$ for all $n$ and $E|Z| < \infty$, then $X_n \stackrel{d}{\to} X$ implies that $E X_n \to E X$.

Proof: Fatou's Lemma (see Exercise 3.9) states that

$E \liminf_n |X_n| \le \liminf_n E |X_n|$.   (3.8)

A second application of Fatou's Lemma to the nonnegative random variables $|Z| - |X_n|$ implies

$E|Z| - E \limsup_n |X_n| \le E|Z| - \limsup_n E|X_n|$.

Because $E|Z| < \infty$, subtracting $E|Z|$ preserves the inequality, so we obtain

$\limsup_n E|X_n| \le E \limsup_n |X_n|$.   (3.9)

Together, inequalities (3.8) and (3.9) imply

$E \liminf_n |X_n| \le \liminf_n E|X_n| \le \limsup_n E|X_n| \le E \limsup_n |X_n|$.

Therefore, the proof would be complete if $|X_n| \stackrel{a.s.}{\to} |X|$. This is where we invoke the Skorohod Representation Theorem: Because there exists a sequence $Y_n$ that does converge almost surely to $Y$, having the same distributions and expectations as $X_n$ and $X$, the above argument shows that $E Y_n \to E Y$, hence $E X_n \to E X$, completing the proof.

Exercises for Section 3.3

Exercise 3.8 This exercise proves two results used to establish Theorem 3.15.

(a) Prove Lemma 3.16.
Hint: For any $\delta > 0$, let $x$ be a continuity point of $F(t)$ in the interval $(Y(\omega) - \delta, Y(\omega))$. Use the fact that $F_n \stackrel{d}{\to} F$ to argue that for large $n$, $Y(\omega) - \delta < Y_n(\omega)$. Take the limit inferior of each side and note that $\delta$ is arbitrary. Similarly, argue that for large $n$, $Y_n(\omega) < Y(\omega + \epsilon) + \delta$.

(b) Prove that any nondecreasing function has at most countably many points of discontinuity.
Hint: If $x$ is a point of discontinuity, consider the open interval whose endpoints are the left- and right-sided limits at $x$. Note that each such interval contains a rational number, of which there are only countably many.

Exercise 3.9 Prove Fatou's lemma:

$E \liminf_n |X_n| \le \liminf_n E|X_n|$.   (3.10)

Hint: Argue that $E|X_n| \ge E \inf_{k \ge n} |X_k|$, then take the limit inferior of each side. Use the monotone convergence property on page 25.

Exercise 3.10 If $Y_n \stackrel{d}{\to} Y$, a sufficient condition for $E Y_n \to E Y$ is the uniform integrability of the $Y_n$.

Definition 3.18 The random variables $Y_1, Y_2, \ldots$ are said to be uniformly integrable if

$\sup_n E(|Y_n| I\{|Y_n| \ge \alpha\}) \to 0$ as $\alpha \to \infty$.

Prove that if $Y_n \stackrel{d}{\to} Y$ and the $Y_n$ are uniformly integrable, then $E Y_n \to E Y$.

Exercise 3.11 Prove that if there exists $\epsilon > 0$ such that $\sup_n E|Y_n|^{1+\epsilon} < \infty$, then the $Y_n$ are uniformly integrable.

Exercise 3.12 Prove that if there exists a random variable $Z$ such that $E|Z| = \mu < \infty$ and $P(|Y_n| \ge t) \le P(|Z| \ge t)$ for all $n$ and for all $t > 0$, then the $Y_n$ are uniformly integrable. You may use the fact (without proof) that for a nonnegative $X$,

$E(X) = \int_0^\infty P(X \ge t) \, dt$.

Hints: Consider the random variables $|Y_n| I\{|Y_n| \ge t\}$ and $|Z| I\{|Z| \ge t\}$. In addition, use the fact that

$E|Z| = \sum_{i=1}^\infty E(|Z| I\{i - 1 \le |Z| < i\})$

to argue that $E(|Z| I\{|Z| < \alpha\}) \to E|Z|$ as $\alpha \to \infty$.
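Definition 3.18 may be easier to absorb with two toy families side by side. The sketch below is illustrative only: the two-point families `spike` and `mild`, the helper names, and the cutoff `n_max` (which replaces the supremum over all $n$ by a maximum over finitely many) are assumptions made here, not part of the exercises.

```python
import math

def tail_expectation(value, prob, alpha):
    """E(|Y| I{|Y| >= alpha}) for Y equal to `value` with probability `prob`, else 0."""
    return value * prob if value >= alpha else 0.0

def sup_tail(family, alpha, n_max=100_000):
    """Approximate sup_n E(|Y_n| I{|Y_n| >= alpha}) by a maximum over n <= n_max."""
    return max(tail_expectation(*family(n), alpha) for n in range(1, n_max + 1))

spike = lambda n: (float(n), 1.0 / n)          # Y_n = n with probability 1/n
mild = lambda n: (math.sqrt(n), 1.0 / n)       # Y_n = sqrt(n) with probability 1/n

# For `spike` the tail expectation stays near 1 however large alpha is,
# so the family is not uniformly integrable; for `mild` it decays like 1/alpha.
for alpha in (1, 10, 100):
    print(alpha, sup_tail(spike, alpha), sup_tail(mild, alpha))
```

Both families converge in distribution to 0, but only the second has expectations tending to 0; the first has $E Y_n = 1$ for every $n$, in the same spirit as the contaminated normal example of Equation (3.5).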