Probability: Limit Theorems I. Charles Newman, Transcribed by Ian Tobasco

Size: px
Start display at page:

Download "Probability: Limit Theorems I. Charles Newman, Transcribed by Ian Tobasco"

Transcription

1 Probability: Limit Theorems I Charles Newma, Trascribed by Ia Tobasco

2 Abstract. This is part oe of a two semester course o measure theoretic probability. The course was offered i Fall 2011 at the Courat Istitute for Mathematical Scieces, a divisio of New York Uiversity. The text is Probability Theory by S. R. S. Varadha. Homework will be assiged o a regular basis.

3 Cotets Chapter 1. Itroductio to Probability Theory 5 1. Probability Spaces 5 2. Measure Theoretic Itegratio 9 3. Product Spaces ad Product Measures Distributios ad Expectatios 12 Chapter 2. Weak Covergece Characteristic Fuctios Weak Covergece 17 Chapter 3. Idepedet Radom Variables Idepedece ad Covolutio Weak Law of Large Numbers Strog Law of Large Numbers Kolmogorov s 0-1 Law Cetral Limit Theorem Triagular Arrays ad Ifiite Divisibility Law of the Iterated Logarithm 39 Chapter 4. Depedet Radom Variables Coditioig Markov Chais ad Radom Walks Trasiece ad Recurrece 48 3

4

5 CHAPTER 1 Itroductio to Probability Theory 1. Probability Spaces Defiitio 1.1. A measurable space is a pair (Ω, B) where Ω is a set ad B is a σ-field of subsets of Ω, i.e. B cotais ad is closed uder complemetatio ad coutable uios. It follows immediately from the defiitio that every σ-field also cotais Ω ad is closed uder coutable itersectios. Propositio Give ay collectio F of subsets of Ω, there exists a uique σ-field B such that B F ad for ay σ-field G with G F, it follows that G B. The σ-field B give by the propositio above is called the σ-field geerated by F ad is ofte deoted σ (F). It is the smallest σ-field cotaiig F. I geeral it is difficult to say exactly whether a set belogs to a give σ-field. But the propositio above gives a coveiet way to costruct a σ-field from the iterestig sets (whatever those may be). Examples 1.3. Here are some first examples of measure spaces. (1) A pair of dice: Ω 1 = {(ω 1, ω 2 ) : ω i {1, 2,..., 6} for i = 1, 2}, B 1 = P (Ω 1 ). Or if dice are idistiguishable, replace B 1 with B 1 = {A : (ω 1, ω 2 ) A = (ω 2, ω 1 ) A}. (2) Number of cars passig a itersectio (i a oe-miute period): Ω 2 = {0, 1, 2,... }, B 2 = P (Ω 2 ). (3) Arbitrarily log (or ifiite) sequeces of tosses of a coi: Ω 3 = {(ω 1, ω 2,... ) : ω i {0, 1}}. We wat to uderstad evets of the form {ω = (ω 1, ω 2,... ) : ω 1 = ε 1,..., ω k = ε k, ε 1,..., ε k {0, 1}}. So let F deote the collectio of such evets, ad take B 3 = σ (F). Lecture 1, 9/6/11 1 This is exercise 1.7 i the text. 5

6 6 1. INTRODUCTION TO PROBABILITY THEORY (4) Poit processes i space (e.g. oe could study the distributio of galaxies i the uiverse): Ω 4 = { ω : ω R 3 s.t. bouded Λ R 3, ω Λ is fiite }. Let Λ be a ope, bouded subset of R 3 ad let A = {ω : ω Λ = } for fixed {0, 1, 2,... }. Let F be the collectio of all such A for all possible Λ, ad take B 4 = σ (F). Defiitio 1.4. A probability measure P o (Ω, B) is a fuctio from B to [0, 1 such that P (Ω) = 1 P is coutably additive, i.e. if A 1, A 2,... are disjoit members of B the P ( =1A ) = P (A ). Note that coutable additivity is equivalet to the followig cotiuity property: 2 If B 1 B 2 ad B B for all the P ( =1B ) = lim P (B ). Takig complemets, we get aother equivalet cotiuity property: If C 1 C 2 ad C B for all the =1 P ( =1C ) = lim P (C ). Defiitio 1.5. A probability space is a triple (Ω, B, P ) as above. If Ω is coutable ad B = P (Ω) the a probability measure is uiquely determied by the umber p (ω) = P ({ω}) for each ω Ω, provided p (ω) 0 for each ω ad ω Ω p (ω) = 1. I this case, we have for ay A B. P (A) = ω A p (ω) Example 1.6. Recall the previous examples. For Ω 1, we could take p (ω) = 1/36 for each ω, supposig we have fair, idepedet dice. For Ω 2, we could take λ λω p (ω) = e ω!, the Poisso distributio with mea λ > 0. But (3) ad (4) are ot as straightforward. What if Ω = σ (F) is ucoutable? We may kow what P should be o the smaller collectio F, e.g. ( ) k 1 P ({ω : ω 1 = ε 1,..., ω k = ε k }) = 2 i the third example above. This gives rise to the followig questios i the geeral cotext: Give Ω ad B = σ (F), if P is first defied o F, the (1) Ca the domai of P be exteded to B such that P is a probability measure? (I particular, how ca we guaratee coutable additivity?) 2 Proof of which is problem 1 from problem set 1 (also exercise 1.2 i the text).

7 1. PROBABILITY SPACES 7 (2) Is the extesio to B uique? Basic aswers to these questios are i the followig defiitios ad theorems. Defiitio 1.7. A field F existig of subsets of Ω is a collectio closed uder complemetatio ad fiite uios. A fiitely addivitive fuctio P from F to [0, 1 with P (Ω) = 1 is called coutably additive o F if for ay decreasig sequece A 1 A 2 with A j F for all j ad with =1A =, we have lim P (A ) = 0. Theorem 1.8 (Caratheodory Extesio Theorem). If F is a field ad P is coutably additive o F as above, the there exists a uique probability measure P o B = σ (F) such that P F = P. See page four of the text for a discussio of the proof. To use the theorem, we eed a field. Cosider the ext example. Example 1.9. The field of subsets of R. Let I be the collectio of itervals of the form (a, b with a < b or (, b or (a, ). The the fiite disjoit uios of itervals from I form a field. Exercise Check that the collectio of fiite disjoit uios of itervals i the example above ideed form a field. Defiitio The σ-field geerated by the field i the previous example is called the Borel σ-field, B. Ay elemet of B is called a Borel subset of R. To defie a probability measure o (R, B), we ll defie it first o itervals (a, b, ad the exted uiquely via the Caratheodory extesio theorem. Defiitio The Lebesgue measure is the uique measure λ such that λ ((a, b) = b a. Remark. But λ is ot a probability measure, as λ (R) =. We ll see ext how to costruct probability measures via the extesio theorem. Suppose F (x) is a mootoe icreasig ad right-cotiuous fuctio o R. Defie the set fuctio λ F ((a, b) = F (b) F (a). Note if F (x) = x, the λ F ((a, b) = λ ((a, b) ad so λ F exteds to the Lebesgue measure. Now suppose that F ( ) = 0 ad F (+ ) = 1. The λ F is a probability measure, for by our assumptio. λ F (R) = lim F (b) F (a) = 1 0 = 1 b a Example Let F : R R be give by { 0 x < 0 F (x) = 1 x 0. Note that λ F ((, ɛ) = F ( ɛ) F ( ) = 0 0 = 0, λ F ((, 0) = F (0) F ( ) = 1, Lecture 2, 9/13/11

8 8 1. INTRODUCTION TO PROBABILITY THEORY ad λ F ((, )) = 1. λ F geerates the Dirac δ-measure (or poit mass) which satisfies { 1 0 U δ (U) = 0 0 / U. I a similar way we ca geerate the poit mass at x, { 1 x U δ x (U) = 0 x / U. Claim. λ F is coutably additive o F iff F is right-cotiuous. Defiitio A distributio fuctio F (x) is a right-cotiuous, odecreasig fuctio o R which satisfies F ( ) = 0 ad F (+ ) = 1. Propositio If P is a probability measure o (R, B), the F (x) = P ((, x) is a distributio fuctio. Theorem Coversely, for every distributio fuctio F, there exists a uique probability measure o (R, B) with P ((, x) = F (x). Now we ll see a importat class of probability measures. Let x 1, x 2,... be a sequece of distict real umbers, ad let p 1, p 2, 0 satisfy p = 1. Defie the measure P (A) = p = p 1 A (x ). :x A These are called discrete probability measures o R. What is the distributio fuctio for P? We have F (x) = P ((, x) = p. =1 Here is a example of such a probaiblity measure. :x x Example The Poisso probability measure o R with rate λ > 0. Set x 1 = 0, x 2 = 1,..., x = 1,... ad set p = λ 1 ( 1)! e λ. The discrete probability measure geerated by these x, p is the Poisso measure. Example Let x 1, x 2,... be a deumeratio of Q. Let p = 2. Let P = p δ x = 1 2 δ x. This is a importat measure ad has some strage properties. For example: Propositio The distributio fuctio F (x) = 1 2 is strictly icreasig. :x x Proof. Fix x ad let ɛ > 0. The iterval (x, x + ɛ cotais ratioal umbers, ad hece has positive probability. Thus F (x + ɛ) F (x) = P ((x, x + ɛ) > 0.

9 2. MEASURE THEORETIC INTEGRATION 9 This measure is a great source of couterexamples, ad is worth rememberig. Example Suppose f is a o-egative, (Lebesgue)-itegrable fuctio with f (y) dy = 1. (Such a f is kow as a desity fuctio.) Defie the distributio fuctio F (x) = x f (y) dy ad let P satisfy P ((, x) = F (x). (Check that F is ideed a distributio fuctio.) By the fudametal theorem of calculus, almost everywhere. We ll see why later. d F (x) = f (x) dx 2. Measure Theoretic Itegratio First we must idetify which fuctios we expect to itegrate. Defiitio 2.1. A (real-valued) measurable fuctio o a measure space (Ω, Σ) is a map f : Ω R such that for all Borel B R, f 1 (B) Ω. Geerically, a measurable fuctio has the property that the preimages of measurable sets are measurable. Defiitio 2.2. Let (Ω, Σ, P ) be a probability space. A radom variable (rv) o (Ω, Σ, P ) is a measurable fuctio. So what is the differece here? I measure theory we care about the space, (R, B), ad objects like measurable fuctios help you uderstad the space. I probability, we care about radom variables (observables), ad we choose the space Ω to help us uderstad the radom variables. We usually supress the argumet whe writig a radom variable, so we would write X to mea the radom variable (measurable fuctio) X (a). Defiitio 2.3. A map f : Ω 1 Ω 2 betwee measure spaces is measurable if for all B Σ 2, f 1 (B) Σ 1. Such a f is called a Ω 2 -valued radom variable. Remark. I this course, the fuctos we ll discuss will be measurable. So it s ice to kow that certai classes of fuctios (e.g. cotiuous fuctios) are measurable ad that the usual operatios (e.g. +,, ( ) 1,, ( ), d dx ( ), lim ( )) preserve the property of measurability. Proof of these facts ca be foud i ay stadard text o measure theory. Example 2.4. Let X 1, X 2, X 3, X 4 be real-valued radom variables (o some measure space (Ω, Σ, P )). Defie the matrix-valued radom variable ( ) X1 X Y = 2. X 3 X 4 Let Ω 2 2 be the space of 2 2 matrices (Ω 2 2 measurable. = R 4 ), the Y : Ω Ω 2 2 is Here are the steps to buildig a theory of itegratio:

10 10 1. INTRODUCTION TO PROBABILITY THEORY (1) Defie the itegral Ω f (ω) dp (ω) = f dp for simple fuctios f, i.e. fiite liear combiatios of idicator fuctios of measurable sets. (2) Exted the defiitio to bouded fuctios. (3) Exted to o-egative fuctios. (4) Exted to itegrable fuctios (suitably defied). Here is step four. If f is measurable, defie the fuctios f + (ω) = f (ω) 1 [0, ) (f (ω)) = max (f, 0) f (ω) = f (ω) 1 (,0 (f (ω)) = mi (f, 0). The f +, f are measurable, ad by the costructio f = f + f ad f = f + +f. We way that f is itegrable if f dp <, ad i that case we defie Ω f dp = f + dp f dp. Example 2.5. Cosider f : R R give by f (x) = si x. Oe might be tempted to say f = 0, but i fact f is ot itegrable uder our defiitio. So f is ot well defied. Lecture 3, 9/20/11 Defiitio 2.6. If X : Ω R is a real-valued radom variable which is itegrable, the we defie its expectatio to be the quatity E [X = X (ω) dp (ω). Here are some useful theorems of itegratio: Ω Mootoe Covergece Theorem: If X, the lim E [X = E [lim X. Domiated Covergece Theorem: If X Y for itegrable Y, the lim E [X = E [lim X. Bouded Covergece Theorem: If X M <, the lim E [X = E [lim X. Jese s Iequality: If φ is covex, the φ (E [X ) E [φ (X ). These theorems hold uder poitwise covergece of X X, but they also hold uder a weaker kid of covergece, i which the set where X X has probability zero. Defiitio 2.7. If X is a sequece of (real-valued) radom variables o (Ω, B, P ), the we say X X almost surely (a.s.) if P {ω : X (ω) X (ω)} = 1. Here is a aother importat kid of covergece. Defiitio 2.8. If P {ω : X (ω) X (ω) > ɛ} 0 for all ɛ > 0, the we say that X X i probability. Note. If we were doig aalysis, we would say almost everywhere istead of almost surely ad i measure istead of i probability.

11 3. PRODUCT SPACES AND PRODUCT MEASURES 11 Example 2.9. O {[0, 1, Borel sets, Lebesgue measure}, cosider the sequece of radom variables X 1 = 1 [0,1/2 X 2 = 1 [1/2,1 X 3 = 1 [0,1/4 X 4 = 1 [1/4,1/2.. ad so o. The X do ot coverge a.e. to aythig (cosider ay o-dyadic umber i [0, 1). But they do coverge i probability to X 0, for with k () as. P { X X > ɛ} 1 2 k() 3. Product Spaces ad Product Measures This is importat for studyig idepedece i the abstract settig. Let (Ω 1, B 1, P 1 ), (Ω 2, B 2, P 2 ) be probability spaces, ad recall their Cartesia product Ω = Ω 1 Ω 2 is the set of (ω 1, ω 2 ) with ω i Ω i. There is a atural way to costruct a σ-field B ad probability measure P o Ω 1 Ω 2, which is aalogous to goig from oe-dimesio Lebesgue measure (legth) o R 1 to two-dimesioal Lebesgue measure (area) o R 2. The product σ-field, B = B 1 B 2, is the σ-field geerated by {A 1 A 2 : A 1 B 1, A 2 B 2 } (the measurable rectagles). Now we look to defie a measure o B. Let F be the field (ot σ-field) of fiite disjoit uios of the measurable rectagles. The defie P o F by first settig P (A 1 A 2 ) = P 1 (A 1 ) P 2 (A 2 ) for oe measurable rectagle A 1 A 2 ad the extedig via fiite additivity to all of F. Lemma 3.1. P defied as above is coutably additive o F. Thus P exteds uiquely to a probability measure o B, the product measure P 1 P 2. Furthermore, if A B (but ot eccessarily i F), the P (A) ca be evaluated by iterated itegratio (i either order). This is corollary 1.11 i the text, ad is a special case of (ad leads to) the followig theorem. Theorem 3.2 (Fubii). If f (ω 1, ω 2 ) is a (real-valued) itegrable fuctio o (Ω, B, P ) = (Ω 1 Ω 2, B 1 B 2, P 1 P 2 ), the the fuctio ω 2 f (ω 1, ω 2 ) is itegrable for a.e. ω 1, the fuctio ω 1 Ω 2 f (ω 1, ω 2 ) dp 2 is itegrable, ad [ f (ω 1, ω 2 ) dp = f (ω 1, ω 2 ) dp 2 dp 1. Ω 1 Ω 2 Ω 2 Similarly, f (ω 1, ω 2 ) dp = Ω 1 Ω 2 Ω 1 Ω 2 [ f (ω 1, ω 2 ) dp 1 dp 2. Ω 1 Remark. The coverse holds for o-egative fuctios, a result due to Toelli.

12 12 1. INTRODUCTION TO PROBABILITY THEORY 4. Distributios ad Expectatios Defiitio 4.1. Suppose (Ω 1, B 1, P ) is a probability space ad (Ω 2, B 2 ) is a measure space. If X : Ω 1 Ω 2 is a Ω 2 -valued radom variable, the Q defied by Q (A) = P ( X 1 (A) ) for A B 2 is a probability measure o (Ω 2, B 2 ), called the iduced measure or the distributio of X. Note. I the aalysis settig, oe might see Q = P X 1. I the probability settig, we ca iterpret Q (A) as the probability that X takes values i A. Example 4.2. If X is real-valued ((Ω 2, B 2 ) = (R, Borel sets)), the Q is the probability measure o R whose distributio fuctio is F Q (x) = Q ((, x) = P (X x) = F X (x). F X is the cumulative distributio fuctio of X from classical probability. So the distributio of a radom variable (i the ew sese) is just a geeralizatio of cumulative distributio to abstract probability space. Theorem 4.3 (Chage of Variables). If X is as i the above defiitio ad h is a real-valued measureable fuctio o (Ω 2, B 2 ) the the fuctio g give by g (ω 1 ) = h (X (ω 1 )) is a real-valued radom variable o (Ω 1, B 1 ). g is itegrable o (Ω 1, B 1, P ) iff h is itegrable o (Ω 2, B 2, Q) where Q is the distributio of X, ad E [g = E [h (X) = h (X (ω 1 )) dp = h (ω 2 ) dq. Ω 1 Ω 2 Example 4.4. If X is real-valued, the we have the usual formula E [h (X) = h (X (ω 1 )) dp = h (ω 2 ) dq = h (x) df X (x) Ω 1 R with the Lebesgue-Stieltjes itegral o the right. Here is the situatio thus far. If X is a real-valued radom variable o (Ω, B, P ), the it iduces a measure α = P X 1 called the probability distributio of X. The distributio fuctio of X is the F (x) = α ((, x) = P (X x). Ad if X is itegrable, its expectatio (mea) is E [X = X (ω) dp = x dα = x df (x). Ω Fially, if g a α-itegrable real-valued fuctio o R, the E [g (X) = g (X (ω)) dp = g (x) dα = g (x) df (x). Ω R R Defiitios 4.5. The mth momet of X for m = 1, 2, 3,... is E [X m = x m dα, provided E [ X m = R x m dα <. The variace of X is [ Var (X) = E (X E (X)) 2 = E [ X 2 (E [X) 2 ad the stadard deviatio of X is R σ (X) = Var (X).

13 4. DISTRIBUTIONS AND EXPECTATIONS 13 Problem oe o the secod homework is to calculate the mea ad stadard deviatio of a shufflig radom variable. Here is a hit. Propositio 4.6. If X 1, X 2,..., X are itegrable radom variables the c i X i is itegrable ad E [ c i X i = c i E [X i.

14

15 CHAPTER 2 Weak Covergece 1. Characteristic Fuctios Note. I probability, characteristic fuctios are ot idicator fuctios. Recall the followig elemetary facts about complex umbers: i 2 = 1 C = {a + bi a, b R} e ir = cos r + i si r for r R. Exercise 1.1. Defie e ir = (ir)!. Prove the third fact usig i 2 = 1 ad Taylor series for cos r ad si r. Defiitio 1.2. A complex-valued fuctio f (ω) = a (ω) + ib (ω) is itegrable if a ad b are itegrable, ad the f (ω) dp = a (ω) dp + i b (ω) dp. Thus if α is a probability measure o R, the for t R we have e itx dα = cos tx dα + i si tx dα. R R If α is the probability distributio of some real-valued radom variable X, the cosider the complex-valued fuctio φ (t) = e itx dα = e itx df X = E [ e itx. R Defiitio 1.3. The fuctio φ above is called the characteristic fuctio of α (or of F X or of X). Characteristic fuctios are importat to may parts of this course. For example, a stadard proof of the cetral limit theorem is via characteristic fuctios. Examples 1.4. (1) If X has uiform distributio o [C, D, the D ( e itx dα = e itx 1 D C dx = 1 e itd e itc D C it R C Note that ( 1 e itd e itc ) lim = 1 t 0 D C it as we expect (α is a probability measure). A special case is C = 1/2, D = +1/2, for which e itx dα = si t 2 t. R 2 15 R ).

16 16 2. WEAK CONVERGENCE (2) If X is Biomial(, p), the E [ e itx ( ) = e itk p k (1 p) k = ( pe it + (1 p) ). k k=0 (3) If X is Poisso(λ), the E [ e itx = e itk λ λk e k! = it 1) eλ(e. k=0 (4) If X is expoetial with mea µ, the E [ e itx 1 = 1 iµt. (5) If X is Normal ( µ, σ 2), the Lecture 4, 9/27/11 E [ e itx = e iµt 1 2 σ2 t 2. Exercise 1.5. Verify these calculatios. What is the relatio betwee characteristic fuctios ad momets? Formally, by iterchagig derivatives ad itegrals, we get d dt φ (t) = d e itx dα = (ix) e itx dα dt ad by settig t = 0 φ (0) = ix dα = ie [X. More geerally ( m d e dt) itx = i t=0 m x m ad so φ (m) (0) = i m E [X m, at least formally. Whe is this justified? Propositio 1.6. If E [ X m <, the φ is m-times (cotiuously) differetiable ad φ (m) (0) = i m E [X m. Proof of this will be problem two o the secod problem set. Also see exercises 2.2, 2.4 of the text. What about a coverse? If φ is m-times (cotiuously) differetiable, does this imply that E [ X m < ad hece E [X m = 1 i m φ(m) (0)? The aswer is yes for m eve, but o for m odd. See exercise 2.3 of the text for the details. Here is a classic theorem, which tells us how to determie if a complex-valued fuctio is the characteristic fuctio of a radom variable. Theorem 1.7 (Bocher). A complex-valued fuctio φ (t) is the characteristic fuctio of some α (or F or X) iff φ satisfies the followig three properties: (1) φ (0) = 1

17 2. WEAK CONVERGENCE 17 (2) φ is cotiuous (for all t) (3) φ is positive (semi-)defiite; i.e. for ay = 1, 2, 3,... ad ay real t 1,..., t, the matrix (φ (t i t j )) i,j=1 should be positive (semi-)defiite (self-adjoit with all positive eigevalues). Equivaletly, give N ad t 1,..., t R, φ should satisfy c i c j φ (t i t j ) 0 i,j=1 for all complex c 1,..., c. See the text for the proof; the oly if part is easier. This ext propositio is used to fiish the proof of the cetral limit theorem. Observe first that a distributio fuctio F is determied uiquely by its values at its cotiuity poits (F is rightcotiuous ad icreasig). Propositio 1.8. A distributio α (or distributio fuctio F ) is uiquely determied by its characteristic fuctio: if a, b are cotiuity poits of F the 1 α ((a, b) = F (b) F (a) = lim T 2π See pp of the text for the proof. T T e ita e itb φ (t) dt. it Remark. Eve if a, b are ot cotiuity poits, the limit above equals This is cosistet with the propositio. α ((a, b)) α ({α}) α ({b}). 2. Weak Covergece We are workig towards the cetral limit theorem, which is about the covergece of distributios to the ormal distributio. We eed to reiterpret the word covergece to acheive the theorem. Theorem 2.1. Let α, α be probability measures o R ad let F, F be the correspodig ditributio fuctios. The followig are equivalet: (1) lim F (x) = F (x) for every x that is a cotiuity poit of F. (2) α ([a, b) α ([a, b) as for all a, b R which are ot atoms of α (i.e. α ({a}), α ({b}) 0). (3) lim R f (x) dα (x) = f (x) dα (x) for every bouded, cotiuous R (real/complex-valued) fuctio f o R. Remark. If X has distributio α ad X has distributio α, the (3) says that E [f (X ) E [f (X). Defiitio 2.2. If α, α (F, F ) satisfy ay of the coditios above, we say α coverges weakly to α, ad we write α α (F F ). Theorem 2.3. α α iff the characteristic fuctios φ, φ satisfy lim φ (t) = φ (t) for every real t.

18 18 2. WEAK CONVERGENCE This last iterpretatio of weak covergece is used to prove the cetral limit theorem. Remarks. (1) If X, X have distributios α, α with α α, we say X coverges i distributio (i law) to X. As oted above, this meas E [f (X ) E [f (X) for ay bouded, cotiuous f. (2) (1) or (2) i theorem 2.1 ca be stregtheed to: α (A) α (A) for every Borel set A so that α ( A\A 0) = 0. (This is theorem 2.6 i the text.) Such a set A is called a cotiuity set of α. Example 2.4. Here is a simple example. Take α = δ 1 1/ ad α = δ 1. The α α, but α ({1}) = 0 for all so that α ({1}) α ({1}). Ideed, {1} is ot a cotiuity set of α (α ( A\A 0) 0). Cosider the proof of theorems 2.1 ad 2.3. (1) = (3) is fairly straightforward (see pp of the text). (3) = φ (t) φ (t) is trivial (take f (x) = e itx ). The iterestig proof is that φ (t) φ (t) = (1), which follows from the ext theorem. Theorem 2.5. Suppose φ, = 1, 2,..., are characteristic fuctios (of probability measures α ) ad there exists a fuctio φ so that φ (t) φ (t) for each t. Suppose also that φ is cotiuous at t = 0. The φ is a characteristic fuctio of some probability measure α, ad α α. Proof sketch. Part 1. (Steps 1-3 o pp ) Let F be ay sequece of distributio fuctios, the there exists a subsequece k ad a sub-distributio fuctio G (G is o-decreasig, right-cotiuous, with values i [0, 1; but G ( ) > 0 ad/or G (+ ) < 1 are allowed) such that F k (x) G (x) at every cotiuity poit of G. Proof of this fact uses boudedess of {F (x)} for each x, coutability of the ratioals, diagoalizatio, ad so forth. I some books, this type of covergece is called vague covergece. Example 2.6. For α = 2 3 δ δ δ, F (x) G (x) with { 1 G (x) = 12 x < x 1. Part 2. (Steps 4-5 o pp ) The key argumet is to show that φ (t) is cotiuous at t = 0 implies that G ( ) = 0 ad G (+ ) = 1. This follows from the Lemma 2.7. If F is a distributio fuctio with characteristic fuctio φ, the for ay T > 0 we have ( ( ) ( 2 1 F F 2 )) ( ) T φ (t) dt. T T 2T T This lemma is proved by elemetary maipulatios ad Fubii s theorem (see p. 27 of the text). Applyig the lemma to the F k ad takig k yields that for small T such that ±2/T are cotiuity poits of G, ( ( ) ( )) ( G G ) T φ (t) dt. T T 2T T

19 2. WEAK CONVERGENCE 19 This holds eve though we do t yet kow G is a distributio fuctio (so we do t kow φ is a characteristic fuctio). But we do kow φ is cotiuous at t = 0 ad φ (0) = 1, so G ( ) = 0 ad G (+ ) = 1. So G is a distributio fuctio. Now we have a subsequece F k which coverges to a distributio fuctio G at cotiuity poits. By what we ve already show, φ k (t) e itx dg for every t. But sice φ k (t) φ (t), we have φ (t) = e itx dg. Now for ay subsequece k, by the same argumet there exists a sub-subsequece k j so that F kj coverges to a distributio fuctio G. But agai φ G = φ G so that G = G ad hece F kj G. This implies F G; ow reame G as F to complete the proof. The argumets used i this proof are iterestig i their ow right. Defiitio 2.8. A collectio A of probability distributios o R is called totally bouded if there exists a probability distributio α so that every sequece α from A has a subsequece α k α. Theorem 2.9. A family of probability distributios A is totally bouded iff either of the followig holds: (1) The family is tight, i.e. lim sup α ({x : x l}) = 0. l α A (2) Let φ α be the characteristic fuctio of α. The lim sup h 0 t h 1 φ α (t) = 0. That (1) ad (2) are equivalet is the cotet of lemma 2.7. That (1) implies A is totally bouded is part 1 of the proof of theorem 2.5. That (2) implies A is totally bouded is part 2 of the proof of theorem 2.5. Lecture 5, 10/4/11

20

21 CHAPTER 3 Idepedet Radom Variables 1. Idepedece ad Covolutio Recall the otio of idepedet radom variables from classical probability. We eed a more geeral defiitio to tackle the classic theorems of probability. Defiitios 1.1. A fiite family of (real-valued) radom variables X 1, X 2,, X is idepedet (joitly idepedet) if P (X 1 B 1, X 2 B 2,..., X B ) = P (X 1 B 1 ) P (X B ) for B 1, B 2,... ay Borel subsets of R. A ifiite family of radom variables {X α } is idepedet if every fiite subfamily is. Evets A 1, A 2,... are said to be idepedet if their idicators are idepedet, or equivaletly if P (C 1 C 2 C ) = P (C 1 ) P (C ) for each ad for every choice of C j = A j or (A j ) c. Note. (1) For a pair of evets A 1, A 2 it suffices to check P (A 1 A 2 ) = P (A 1 ) P (A 2 ). This is what is usually preseted i a first course o probability. Exercise 1.2. Prove this. (2) It is possible for a family of evets (or radom variables) to be pairwise idepedet but ot (joitly) idepedet. See the ext example. Example 1.3. Fair coi tossig. Set { +1 ith toss is heads X i =, 1 ith toss is tails the X 1, X 2 ad Y = X 1 X 2 are pairwise idepedet but ot joitly idepedet. It might seem o-ituitive that these are pairwise idepedet. But thik of the coditioal distributios. This really shows idepedece is a statistical property (ad ot a fuctioal property). Here is a secod way to uderstad idepedece. Recall that if X 1,..., X are real-valued radom variables o some (Ω, F, P ) we ca regard X = (X 1,..., X ) as a R -valued radom variable whose distributio (the joit distributio of X 1,..., X from elemetary probability) is the measure µ o R so that µ (B) = P ({ω : X (ω) B}). If B is a rectagle B 1 B ad if X 1,..., X are idepedet, the µ (B 1 B ) = µ 1 (B 1 ) µ (B ) where µ i is the distributio (o R) of X i. I this case, we write µ = µ 1 µ. The coverse is also true. 21

22 22 3. INDEPENDENT RANDOM VARIABLES Lemma 1.4. X 1,..., X are idepedet iff the distributio of (X 1,..., X ) is the product measure of the distributios of the idividual X i s. We re workig our way towards applyig the theory to characteristic fuctios. This will lead to the classic theorems. Here is a tool we ll use quite ofte. Propositio 1.5. Suppose f 1,..., f are (bouded) measurable fuctios o R. 1 If X 1,..., X are idepedet (real-valued) radom variables, the E [f 1 (X 1 ) f (X ) = Π j=1e [f j (X j ). Proof. Apply the above lemma ad Fubii s theorem. A importat case of this is whe f j (X j ) = e itjxj with real umbers t j ad j = 1,...,. The, the propositio implies [ E e i(t1x1+ +tx) = Π j=1e [ e itjxj. The object o the left is the (multivariate) characteristic fuctio φ (t) = φ ((t 1,..., t )) of X = (X 1,..., X ). Of course this defiitio does ot require idepedece. The coverse is also true (if the equality holds for all choices of t j, the X 1,..., X are idepedet). I this course, we ll maily cosider t = t 1 = t 2 = = t. I this case, we see that if X 1,..., X are idepedet, the the characteristic fuctio of X = X X is φ X1+ +X (t) = E [e it(x1+ +X) = Π j=1e [ e itxj = Π j=1φ Xj (t). This is a importat relatio which we ll use time ad time agai. But ote that eve if the equality above holds for all t it may ot be true that X 1,..., X are idepedet. (??) The discussio above shows the utility of the characteristic fuctios. But i practice, distributios are closer to what we re actually iterested i. So we ask: Is there a represetatio for the distributio of sums of idepedet radom variables? The aswer is yes, but ufortuately it s ot quite as ice. By Fubii s theorem, oe has that for Z = X + Y (X, Y idepedet) ad A ay Borel subset of R, µ Z (A) = µ X (A y) dµ Y (y) = µ Y (A x) dµ X (x). R This is called the covolutio of µ X ad µ Y ad is deoted by µ X µ Y. If X, Y have probability desities f X, f Y, the Z also has a desity f Z with f Z (z) = f X (z y) f Y (y) dy = R f Y (z x) f X (x) dx. Here is a stadard ad useful fact for idepedet X, Y, which follow from the relatio E [XY = E [X E [Y : If E [ X 2, E [ Y 2 <, the Var (X + Y ) = Var (X) + Var (Y ). j. 1 The assumptio that f1,..., f are bouded ca be weakeed to E [ f j (X j ) < for each

23 2. WEAK LAW OF LARGE NUMBERS 23 For X 1,..., X idepedet, Var (X X ) = Var (X 1 ) + + Var (X ). Exercise 1.6. Derive these results. May of the stadard limit theorems of probability theory (e.g. the Law of Large Numbers, the Cetral Limit Theorem) cocer sums of idepedet radom variables X X as. I order to make sese of these theorems, we eed to a probability space o which ifiitely may idepedet radom variables X 1, X 2,... are defied (e.g. with some commo prescribed distributio). This comes from the classical samplig techiques from statistics. Defiitio 1.7. If X 1, X 2,... are idepedet radom variables which share a commo distributio, the they are said to be idepedet, idetically distributed (i.i.d.) radom variables. Example 1.8. Coi tossig where { +1 jth toss is heads X j =. 1 jth toss is tails There is a geeral argumet from measure theory that gives the existece of such spaces. Theorem 1.9. Let µ 1, µ 2,... be (Borel) probability measures o R, ad let Ω = {a = (x 1, x 2,... ) : x j R} be the space of ifiite sequeces of real umbers (sometimes deoted R ). Let F be the field of fiite dimesioal cylider sets, i.e. sets of the form {ω = (x 1, x 2,... ) : (x 1,..., x ) A} for N ad A a Borel subset of R. Let Σ be the σ-field geerated by F. The there exists a uique probability measure µ = µ 1 µ 2 o (Ω, Σ) such that µ ({ω = (x 1, x 2,... ) : (x 1,..., x ) A}) = (µ 1 µ ) (A) for ay N ad ay Borel subset A of R. Remark. (1) There is o covergece issue here. We re workig o a probability space. (2) We could have defied µ via ay choice of coordiates from ω, istead of just the first. The result would be the same. 2. Weak Law of Large Numbers Now we ca cosider i.i.d. radom variables X 1, X 2,... ad study the sample mea X1+ +X as. We ll begi with the weaker result, where we ll assumig ot oly that E ( X 1 ) < (so that E (X 1 ) = m is defied) but also that E ( ) X1 2 < (fiite variace). Write S = X X. We first study the weak law of large umbers, which claims that S m i probability, i.e. that P ( S m ) > δ 0 as. Note. Usually covergece i probability ad i distributio are ot the same. But sice the covergece here is to a costat (ad ot a arbitrary radom variable), the weak law of large umbers equivaletly claims that the distributios of S coverge i distributio to δ m as. Exercise 2.1. Check that α δ 0 iff P ( X > δ) 0 as.

24 24 3. INDEPENDENT RANDOM VARIABLES Later, we ll see the strog law of large umbers, where the covergece is almost surely. This takes much more work! Recall the followig result from measure theory. Propositio 2.2 (Chebyshev s Iequality). For ay radom variable X with E [ X 2 <, for ay a > 0. Proof. We have P ( X > a) 1 a 2 E [ X 2 P ( X > a) = E [ 1 { X >a} [ X 2 E a 2 1 { X >a} 1 a 2 E [ X 2 ad so we re doe. Now apply this to X = S ( ) S P m > δ m with a = δ to get 1 δ 2 E [ ((S = 1 δ 2 Var ( S ) ) m ) 2 = 1 ( δ 2 Var X1 + + X = 1 ( ) δ 2 Var X1 = 1 1 δ 2 Var (X 1) = 1 δ 2 σ2 0 as. Note the secod lie follows as [ X1 + + X m = E [X 1 = E = E ) [ S. So we have obtaied a first weak law of large umbers. (See theorem 3.2 i the text.) Theorem 2.3. If X 1, X 2,... are i.i.d. with E [X 1, E [ X1 2 < ad m = E [X i, the S m i probability as. S We ca do better. Theorem 2.4. If X 1, X 2,... are i.i.d. with E [ X 1 < ad m = E [X i, the m i probability as. There are two proofs.

25 2. WEAK LAW OF LARGE NUMBERS 25 X C j First proof. This proof is by a trucatio argumet. Give C > 0, defie = X j 1 { Xj C} ad Yj C = X j Xj C = X j 1 { Xj >C}. The S = 1 j=1 X C j + 1 j=1 Y C j = ξ C + η C, ad m = E [X 1 [ S = E = E [ X1 C [ + E Y C 1 = a C + b C = E [ ξ C [ + E η C we re we ve itroduced ξ C, η C ad a C, b C for bookkeepig. To prove S m it suffices to show that E [ S m 0. (Follows from Chebyshev s iequality.) Write [ S E m = E [ ξ C + η C a C b C E [ ξ C a C + 1 i=1 E [ Y C i + b C = E [ ξ C a C + E [ Y1 C + b C ( E [ ξ C a C 2) 1/2 [ + 2 E Y C 1 sice by Cauchy-Schwarz E [ W ( E [ W 2) 1/2. Now E [ ξ C a C 2 = Var ( ξ C ( ) 2 1 = Var ) ( X1 C ) 0 as, so that [ S lim sup E m 2 E [ Y1 C. Now we claim E [ Y1 C 0 as C. This follows from the domiated covergece theorem sice X 1 1 { X1 >C} 0 almost everywhere. This completes the proof. The idea of trucatio will appear agai whe we prove the strog law of large umbers. The secod proof is completely differet. It uses a importat lie of reasoig that will show up time ad time agai (e.g. i the proof of the cetral limit theorem).

26 26 3. INDEPENDENT RANDOM VARIABLES Secod proof. Argue with characteristic fuctios. Let φ (t) = E [ e itxj (for ay j), the set [ ψ (t) = E e it S [ = E e it( X 1 + +X ) = ( [ = E ( = φ e i tx 1 ( t ) )). Sice E [ X 1 <, φ is differetiable with φ (0) = ie [X 1 = im. So by Taylor expasio, ( ) t φ = 1 + im t ( ) 1 + o. The ( ψ (t) = 1 + im t ( )) 1 + o Lecture 6, 10/18/11 ad the limit as is the same as of ( 1 + im t ) e imt. But the characteristic fuctio of δ m is e imt, ad hece S δ m. We used the followig result: Lemma 2.5. If a is a sequece of complex umbers such that a z C (i.e. a = z/ + o (1/)), the (1 + a ) e z. 3. Strog Law of Large Numbers The coclusio of the strog law of large umbers is that S m almost surely. To prove almost sure limits, a basic (ad very importat) tool is the Borel-Catelli lemma. Lemma 3.1 (Borel-Catelli). Let {A } be ay sequece of evets i some probability space (Ω, F, P ). If =1 P (A ) <, the P ({A occurs for oly fiitely may }) = 1 or equivaletly P ({A occurs ifiitely ofte}) = 0 where {A occurs ifiitely ofte} = {ω : ω A for ifiitely may s}. Coversely, if the A are (mutually) idepedet evets with =1 P (A ) =, the P ({A occurs ifiitely ofte}) = 1. Note. Sometimes we write i.o. to mea ifiitely ofte. Proof. {A occurs i.o.} = k=1 ( =k A ), a decreasig limit as k of the sets =k A. So by coutable additivity we have P ({A occurs i.o.}) = lim P k ( =ka ) lim P (A ) = 0 k sice =1 P (A ) < by assumptio. =k

27 3. STRONG LAW OF LARGE NUMBERS 27 I the other directio, write {A occurs for oly fiitely may } = m=1 ( =m (A ) c ), which is a icreasig limit as k of the sets =m (A ) c. So P ({A occurs for oly fiitely may }) = lim m P ( =m (A ) c ) = lim m Π =mp ((A ) c ) = lim m Π =m (1 P (A )) sice by assumptio the A are idepedet. Usig the fact that 1 x e x for x 0 (proof by calculus) we get P ({A occurs for oly fiitely may }) lim m e ad hece the claim. = lim m 0 = 0 P =m P (Am) Exercise 3.2. Check the details of the proof by replacig Π =mp ((A ) c ) with Π l =mp ((A ) c ) ad lettig l i the ed. Now we are equipped to prove a first strog law of large umbers. [ Theorem 3.3. If X 1, X 2,... are i.i.d. ad E (X 1 ) 4 <, the S m = E (X 1 ) almost surely. Proof. By replacig X i by X i m, we may assume that m = 0. To show S 0 almost surely, it suffices to show that for every δ > 0, ({ }) S P δ for oly fiitely may = 1 as { ω : S } { (ω) 0 = j=1 ω : S } 1 for oly fiitely may, j ad as the evets o the right form a decreasig sequece i j, ({ }) { S P 0 = lim P S 1 } j j for oly fiitely may. Fially by Borel-Catelli, it suffices to show =1 P ( S ) [ > δ <. Sice E [X 1 = 0 ad E (X 1 ) 4 <, Chebyshev s iequality with p = 4 yields the estimate ( ) S P > δ 1 [ δ 4 E S 4 = 1 4 δ 4 E [ S 4. So it will be eough to show E [ S 4 B 2 for some costat B. Write E [ S 4 = E [(X X ) 4 [ = E (X 1 ) (X ) (X 1 ) 2 (X 2 ) (X 1 ) 2 (X ) 2 + other terms like X 1 (X 2 ) 2 X 3 or X 4 (X 7 ) 3 or X 1 X 3 X 5 X 8.

28 28 3. INDEPENDENT RANDOM VARIABLES [ Now E (X i ) 4 [ = C < for all i, ad E (X j ) 2 (X k ) 2 [ = E (X j ) 2 [ E (X k ) 2 = σ 2 σ 2 = σ 4 for j k where σ 2 = Var [ (X 1 ) < (sice E [X 1 = 0). O the other had, E [X 1 (X 2 ) 2 X 3 = E [X 1 E (X 2 ) 2 E [X 3 = 0 ad similarly all other such terms have zero expectatio. Thus E [ S 4 ( ) = C + 6 σ 4 = C + 3 ( 1) σ 4 B 2 2 for some B <. This completes the proof. We wat to claim the law of large umbers holds uder weaker coditios. First we ll see some theorems about sums of idepedet (ot eccessarily idetically distributed) radom variables X j, which will say whe j=1 X j is coverget (almost surely). I order to do this, we eed a techical lemma that improves the Chebyshev iequality: P ({ S l}) E ((S ) 2) l 2. Lemma [ 3.4 (Kolmogorov s Iequality). Suppose X 1,..., X are idepedet with E (X j ) 2 = (σ j ) 2 < ad E [X j = 0 for all j. Let T (ω) = sup 1 k S k (ω). The, E ((S ) 2) P ({T l}) l 2. The proof has a iterestig structure, which shows up agai i the study of Markov chais ad martigales. Proof. This is a stoppig-time argumet. Cosider the first time that the sequece S 1, S 2,... is l, ad defie E j to be the evet that that time is j. Explicitly we have E 1 = { S 1 l}, E 2 = { S 1 < l, S 2 l}, ad so o. Also, {T l} = k=1 E k ad 1 {T l} = k=1 1 E k. Now it suffices to show that P ({T l}) 1 l E [(S 2 ) 2 1 {T l}, ad by our choice of E k we have E [(S ) 2 1 {T l} = k=1 [(S E ) 2 1 Ek. Now P ({T l}) = E [ 1 {T l} = E [1 Ek k=1 k=1 1 [ l 2 E (S k ) 2 1 Ek by our choice of E k. Goig forwards, 1 [( P ({T l}) l 2 E (S k ) 2 + (S S k ) 2) 1 Ek = k=1 k=1 1 [( ) l 2 E (S ) 2 2S k (S S k ) 1 Ek.

29 3. STRONG LAW OF LARGE NUMBERS 29 But observe E [S k (S S k ) 1 Ek = E [S k 1 Ek E [S S k = 0 by idepedece ad sice E [X j = 0 for all j. So we ve show 1 [ P ({T l}) l 2 E (S ) 2 1 Ek, k=1 ad hece the result. Lecture 7, 10/31/11 Now we have Kolmogorov s two- ad three-series theorems. These importat results describe the covergece of idepedet series of radom variables. Theorem 3.5 (Kolmogorov s two-series theorem). Suppose X 1, X 2,... (o some (Ω, F, P )) are idepedet. If i=1 E [X i ad i=1 Var (X i) are coverget, the the series i=1 X i (ω) coverges almost surely. Remark. The two-series theorem gives a sufficiet coditio for almost sure covergece of series of idepedet r.v.s. The ext theorem (the three-series theorem) will give a eccesary coditio. Example 3.6. Recall from calculus that =1 1/s < iff s > 1. But =1 ( 1) / s < coverges for s > 0. What if oe takes radom sigs? Let Y 1, Y 2,... be idepedet with P (Y = +1) = 1/2, P (Y = 1) = 1/2 (e.g. fair coi-tossig). Does =1 Y / s coverge? The ituitive aswer is that we still get eough cacellatio to allow covergece. By the two-series theorem, the series coverges a.s. for s > 1/2. Ideed, if we defie X = Y / s, the E [X = 0 ad Var (X ) = 1/ 2s so that =1 Var (X ) < if 2s > 1. After we prove the three-series theorem, we ll be able to say that =1 X is a.s. ot coverget for s 1/2. Proof. First, assume E [X i = 0 for all i. To show a.s. covergece, it suffices to prove for all δ > 0, ( ) lim P sup S k S m δ = 0 m, m<k where S k = X X k. By Kolmogorov s iequality applied to the radom variables X m+1, X m+1 + X m+2,..., X m X, ( ) P sup S k S m δ 1 m<k δ 2 Var (X m X ) = 1 δ 2 Var (X i ) 0 i=m+1 as m,. If the meas are o-zero, defie Y i = X i E [X i so that X i = Y i +E [X i with E [Y i = 0. The apply the previous result to the Y i s ad use that i E [X i <. Suppose ow we have idepedet X i s to which the two-series theorem does ot apply. Note that if i Var (X i) < but i E [X i is diverget, we ca still apply the two-series theorem to Y i = X i E [X i ad argue that i X i = i (Y i + E [X i ) is diverget. But suppose i Var (X i) = +. We ca the try a

30 30 3. INDEPENDENT RANDOM VARIABLES trucatio argumet, lettig Y i = X i 1 { Xi C} with C a fixed costat. If the twoseries theorem works for the Y i s, ad if i=1 P (X i Y i ) = i=1 P ( X i > C) <, the the Borel-Catelli lemma implies P (X i = Y i for all but fiitely may i) = 1. Ad the i X i coverges a.s. because i Y i does. This proves half of the ext theorem. Theorem 3.7 (Kolmogorov s three-series thereom). Let X 1, X 2,... be idepedet radom variables, ad let C > 0. The i X i coverges almost surely iff (1) i P ( X i > C) <, (2) i E [ X i 1 { Xi C} <, ad (3) i Var ( X i 1 { Xi C}) <. We already proved sufficiecy; for ecessity, see the text. Remark. It seems strage that we ca take ay C > 0 i the theorem above, for the series of trucated variables that result are differet for differet Cs. But as asserted above, they coverge simultaeously. Sometimes we ca make problems easier by choosig the right C. Now we state ad prove the strog law of large umbers. Theorem 3.8 (Strog law of large umbers). If X 1, X 2,... are i.i.d. with E [ X 1 <, the X1+ +X E [X 1 almost surely. Proof sketch. There are several steps. (1) W.l.o.g., assume E [X 1 = 0. (2) Let Y = X 1 { X } ad b = E [Y. Sice E [ X 1 <, we get P ( X 1 > ) <. As P ( X 1 > ) = P ( X > ), we get P ( X > ) < ad hece P (X Y ) <. So by Borel- Catelli, P (X Y oly fiitely ofte) = 1. Thus if Y b coverges a.s., the so does X b. (3) Sice E [X = 0, b E [X = b = E [ X 1 1 { X1 >} 0 as because E [ X 1 <. (This is by the domiated covergece theorem.) (4) A elemetary lemma about ifiite series says that X b < implies (X1 b1)+ +(X b) 0. Sice b 0 implies b1+ +b 0, we coclude X1+ +X 0 as desired. (5) It remais to show Y b <. This is a applicatio of the Komogorov three-series theorem. Coditios (1) ad (2) i the theorem are immediate. Showig (3) requires a estimate based o the assumptio that E [ X 1 <. 4. Kolmogorov s 0-1 Law Cosider radom variables X 1, X 2,... o the ifiite product space (Ω, B, P ) = Π i=1 (R, Borel sets, µ i). Defiitios 4.1. Let B be the σ-field geerated by all evets of the form {X j D} for j where D is ay Borel set. The B is called the σ-field geerated by X, X +1,..., or the future σ-field from time. The tail σ-field B is the σ-field B = =1B. Elemets of B are called tail evets.

31 5. CENTRAL LIMIT THEOREM 31 Oe way to thik about the future σ-field is to otice that if F B, the the defiitio of F does ot deped o X 1,..., X 1. Similarly if F B, the F does ot deped o ay fiite collectio of X i s. Example 4.2. Let ω = (ω 1, ω 2,... ) ad X (ω) = ω. The evets { } E 1 = ω : lim sup X (ω) 5, { } X 1 (ω) + + X (ω) E 2 = ω : lim = 2 are tail evets. Theorem 4.3 (Kolmogorov s 0-1 law). If X 1, X 2,... are idepedet ad A is a tail evet, the either P (A) = 0 or P (A) = 1. Example 4.4. Cosider the tail evet E 2 from the previous example, ad suppose the X i are i.i.d. evets. The if E [X 1 = 2, the strog law of large umbers says P (E 2 ) = 1. Ad if E [X 1 2, the strog law of large umbers says P (E 2 ) = 0. This agrees with the result of the Kolmogorov 0-1 law. Proof. The idea is to show that A is idepedet of A, sice the P (A) = P (A A) = P (A) P (A) ad so (P (A)) 2 P (A) = 0. Hece P (A) (P (A) 1) = 0 ad hece P (A) = 0 or P (A) = 1. Let B be the σ-field geerated by X 1,..., X. A B implies A is idepedet of every evet i B for every. Let A = {evets that are idepedet of A}, the A is a mootoe class (closed uder icreasig/decreasig limits). Sice A cotais each of the fields B, it cotais the whole σ-field B o the ifiite product space. I particular we have A A, so A is idepedet of itself. Lecture 8, 11/1/11 Here are some more examples of tail evets: Examples 4.5. Let {X i } be a sequece of idepedet evets. (1) { 1 ω : lim ( X i (ω)) 3 } { is a tail evet. (2) ω : lim sup 1 log log ( } i=1 X i (ω) a) = b is a tail evet. Here a, b are costats. (3) I fact, {ω : lim X i (ω) exists} is a tail evet. I most cases, this evet has probability zero. The degeerate case is whe the distributios approach a poit mass. (4) { 1 ω : sup i=1 X i (ω) 3 } is ot a tail evet. 5. Cetral Limit Theorem Let X 1, X 2,... be i.i.d. r.v. s ad let S = X X be the th partial sum. Our best law of large umbers says that if µ = E [ X 1 <, the S / µ with probability oe. We ca reiterpret this as sayig 1 (S µ) 0 with probability oe. This suggests the questio: how fast does 1 (S [ µ) ted to zero? The cetral limit theorem gives a aswer, at least whe E (X 1 ) 2 <. The, as we ll see, 1 (S µ) 0 roughly like order 1/. More precisely, the distributio of (S µ) = 1 (S µ) does ot ted to δ 0 as (except i the trivial case where the commo distributio of the X i s is δ µ ).

32 32 3. INDEPENDENT RANDOM VARIABLES [ Theorem 5.1 (Cetral limit theorem). Let X 1, X 2,... be i.i.d. r.v. s with E (X 1 ) 2 1 < ad call µ = E [X 1. The the distributio of (S µ) coverges [ weakly to the ormal distributio with mea zero ad variace σ 2 = E (X 1 µ) 2 = Var (X 1 ). Remark. I the otrivial case σ 0, we ca read the theorem as sayig ( ) S µ u 1 P u e x2 2σ 2 dx 2πσ for every u R. Proof. The proof is a exercise i characteristic fuctios. Recall that (1) The characteristic fuctio of a ormal r.v. W with mea µ ad variace σ 2 is ψ (t) = E [ e itw = e iµt 1 2 σ2 t 2. (2) Distributios of r.v. s W coverge weakly to the distributio of W iff ψ (t) = E [ e itw ψ (t) = E [ e itw for all real t. (3) If Y 1, Y 2,..., Y are idepedet, the E [ e it(α1y1+ +αy) = Π i=1 E [ e i(αit)yi. (4) If E [ Y 2 <, the E [ e ity = 1 + ite [Y + (it)2 2 E [ Y 2 + o ( t 2) = 1 + ie [Y t E [ Y 2 t 2 + o ( t 2) 2 as t 0. See exercise 2.4 i the text; see also the secod proof of the weak law of large umbers (theorem 2.4). Now the proof. By (1) ad (2), we oly eed to prove that E [e it 1 (S µ) e 1 2 σ2 t 2 for all t R. To study 1 (S µ) = 1 i=1 (X i µ), let Y i = X i µ [ ad ote that E [Y i = E [X i µ = 0 ad E (Y i ) 2 = Var (X i ) = σ 2. The by (3), E [e it 1 (S µ) = E [e it P Y i i=1 where φ (t) = E [ e ity1. Fially by (1), = Π i=1e [e it Y i = ( φ ( t )) φ (t) = 1 σ2 2 t2 + o ( t 2) as t 0, so for fixed t R [ ( ) t φ = [1 σ2 t 2 ( t 2 + o = 2 [1 σ2 2 e 1 2 σ2 t 2 by lemma 2.5. This completes the proof. t 2 + o ( 1 ) )

33 5. CENTRAL LIMIT THEOREM 33 There are a variety of extesios of the cetral limit theorem. I what follows we ll assume X 1, X 2,... are idepedet with E [ Xi 2 <, but ot that they are idetically distributed. To simplify the discussio, let s assume E [X i = 0 for all i so that Var (X i ) = E [ Xi 2. (Of course, we ca always replace Xi by Y i = X i E [X i to get to this case.) Let S = X X ad let s 2 = Var (S ) = i=1 Var (X i) = i=1 σ2 i. I the i.i.d. case, σ2 = σ 2 ad the cetral limit theorem says (for σ 2 0) that S /s N (0, 1) i distributio. 2 I the exteded settig, ca we coclude S /s N (0, 1) i distributio? The ext example shows we eed some more assumptios. Example 5.2. Let X 1, X 2,... be i.i.d. ad (to be cocrete) with values ±1 with probability 1/2. Let X j = σ j X j with j=1 σ2 j < (e.g. with σ j = j 2 ). The by the Komogorov 2- or 3-series theorem, sice s 2 = j=1 Var (X j) = j=1 σ2 2 s 2 = j=1 σ2 j < we have that S j=1 σ jx j a.s. ad so S /s 1 s j=1 σ jx j a.s. ad hece also i distributio. But this limit is ot N (0, 1)-distributed. There are (at least) two ways to see why the limit is ot N (0, 1)-distributed. The first is to cosider the characteristic fuctio of the limit. The secod is to observe that i the σ j = j 2 case, the limit is a bouded radom variable. The example above wet wrog because j=1 σ2 j least) eed to assume j=1 σ2 j =. Is this eough? Example 5.3. Let +j with probability p j /2 X j = j with probability p j /2. 0 with probability 1 p j <. So it seems we (at [ The σj 2 = E (X j ) 2 = j 2 p j. Suppose j=1 j2 p j = but also that j=1 P j < (e.g. p j = j 2 ). The by Borel-Catelli (or the 3-series theorem), j=1 X j is a.s. coverget, sice it is a.s. a fiite sum. So S /s 0 a.s. ad therefore ot to N (0, 1) i distributio. The problem i the example above is that although s = j=1 σ2 j =, the mai cotributio to s = j=1 σ2 j comes from very large values of X j. To avoid this pheomeo, oe imposes the Lideberg coditio : If α i deotes the distributio of X i, we require 1 x 2 dα i 0 x ɛ s s 2 i=1 for every ɛ > 0. This says that the cotributio to the variace from very large values of X j is egligible. (Recall for E [X i = 0, Var (X i ) = R x2 dα i.) As we ll see, Lideberg s coditio plus the requiremet that s are eough to imply that S /s N (0, 1) i distributio. Lecture 9, 11/8/11 Theorem [ 5.4 (Lideberg s CLT). Let X 1, X 2,... be idepedet with E [X j = 0 ad E (X j ) 2 = Var (X j ) = σj 2 < ad let S = X X ad s 2 = 2 N (0, 1) meas the stadard ormal distributio, with mea 0 ad variace 1.

34 34 3. INDEPENDENT RANDOM VARIABLES σ σ 2 = Var (S ). The S /s N (0, 1) if s ad if for all ɛ > 0, 1 (5.1) E [ Xj 2 1 { Xj/s >ɛ} 0 as. s 2 j=1 Remark. Coditio (5.1) is called the Lideberg coditio. Proof. We have S s = X1 s + + X s ad suppose X j /s has the characteristic fuctio φ,j. ({φ,j } is a example of a triagular array.) We eed to show Π j=1 φ,j (t) e t2 /2 as for ay fixed t R. The trick: as φ,j (t) is close to 1 for large (by the Lideberg coditio), write φ,j (t) = (1 + (φ,j (t) 1)) = e φ,j(t) 1 + O (φ,j (t) 1) 2 as. So replace φ,j with ψ,j (t) = e φ,j(t) 1 ; oe ca show it suffices to prove Π j=1 ψ,j (t) = e P j=1 (φ,j(t) 1) e t2 /2, or equivaletly ( ) K = φ,j (t) 1 + σ2 j t2 2s 2 0 j=1 j=1 for all t. Observe that [ K E e itxj/s 1 + it X j + t2 (X j ) 2, s 2 sice E [X j = 0 for all j. Now we ll write each term as a itegral w.r.t. α j, the distributio of X j, break up the itegral accordig to where x < ɛs ad x ɛs (with ɛ < 1), ad use the bouds { e ir 1 ir + r2 2 C r 3 r < t C r 2 otherwise with r = tx/s. So K Ct 3 j=1 = I + II. { x <ɛs } x 3 s 3 dα j + C t 2 j=1 s 2 x 2 { x ɛs } s 2 Notice the Lideberg coditio says exactly that II 0. Ad x 3 { x <ɛs } s 3 dα j = ɛ x 2 { x <ɛs } s 2 dα j x 2 ɛ dα j = ɛ σ2 j s 2 R s 2 which gives K ɛct 3 after summig o j. Thus lim sup K Dɛ for all ɛ > 0, ad so lim K = 0. This completes the proof. dα j

35 6. TRIANGULAR ARRAYS AND INFINITE DIVISIBILITY Triagular Arrays ad Ifiite Divisibility The proof of the Lideberg CLT used a example of a triagular array. Here is aother example. Example 6.1. Biomial approximatio of the Poisso distributio. Let S be the umber of successes i idepedet trials with probability p of success o each trial. Claim. If p λ (0, ), the S Poisso (mea λ). Proof. (By characteristic fuctios.) Write S = X,1 + + X, with X,j idepedet Beroulli variables with parameter p. The E [ e its = ( E [ e itx1,) = ( (1 p ) + p e it) = ( ( 1 + p e it 1 )) ( = 1 + λ ( e it 1 ) ( )) 1 + o e λ(eit 1) which is the characteristic fuctio of Poisso (λ). Hece the claim. Now we have two examples of triagular arrays, oe i which we get a ormally distributed limit ad oe i which we get a Poisso limit. This is the begiig of the theory of triagular arrays. Defiitio 6.2. A triagular array cosists of radom variables Note. Ofte k =. X 1,1,..., X 1,k1 X 2,1,..., X 2,k2. We ll cosider the case where the radom variables i each row are idepedet, ad ask whether S = X,1 + + X,k has a limitig distributio as. Examples 6.3. (1) I Lideberg CLT, k =, X,j = X j /s, ad the limit is N (0, 1). (2) I the biomial approximatio of the Poisso distributio, k =, { 1 probability p X,j = 0 probability (1 p ), ad the limit is Poisso (λ). Besides idepedece i each row, we wat o idividual summad to cotribute too much (as ). So we ll assume that for all δ > 0, sup P ( X,j δ) 0 1 j k as. This coditio is called uiform ifiitesimality or uifomly asymptotically egligible, ad is aalogous to the Lideberg coditio (which says idividual summads do ot cotribute too much to the variace). Through all of this, we seek to aswer the questios:

Convergence of random variables. (telegram style notes) P.J.C. Spreij

Convergence of random variables. (telegram style notes) P.J.C. Spreij Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space

More information

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence Chapter 3 Strog covergece As poited out i the Chapter 2, there are multiple ways to defie the otio of covergece of a sequece of radom variables. That chapter defied covergece i probability, covergece i

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit theorems Throughout this sectio we will assume a probability space (Ω, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014.

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014. Product measures, Toelli s ad Fubii s theorems For use i MAT3400/4400, autum 2014 Nadia S. Larse Versio of 13 October 2014. 1. Costructio of the product measure The purpose of these otes is to preset the

More information

Advanced Stochastic Processes.

Advanced Stochastic Processes. Advaced Stochastic Processes. David Gamarik LECTURE 2 Radom variables ad measurable fuctios. Strog Law of Large Numbers (SLLN). Scary stuff cotiued... Outlie of Lecture Radom variables ad measurable fuctios.

More information

Singular Continuous Measures by Michael Pejic 5/14/10

Singular Continuous Measures by Michael Pejic 5/14/10 Sigular Cotiuous Measures by Michael Peic 5/4/0 Prelimiaries Give a set X, a σ-algebra o X is a collectio of subsets of X that cotais X ad ad is closed uder complemetatio ad coutable uios hece, coutable

More information

Fall 2013 MTH431/531 Real analysis Section Notes

Fall 2013 MTH431/531 Real analysis Section Notes Fall 013 MTH431/531 Real aalysis Sectio 8.1-8. Notes Yi Su 013.11.1 1. Defiitio of uiform covergece. We look at a sequece of fuctios f (x) ad study the coverget property. Notice we have two parameters

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS MASSACHUSTTS INSTITUT OF TCHNOLOGY 6.436J/5.085J Fall 2008 Lecture 9 /7/2008 LAWS OF LARG NUMBRS II Cotets. The strog law of large umbers 2. The Cheroff boud TH STRONG LAW OF LARG NUMBRS While the weak

More information

Distribution of Random Samples & Limit theorems

Distribution of Random Samples & Limit theorems STAT/MATH 395 A - PROBABILITY II UW Witer Quarter 2017 Néhémy Lim Distributio of Radom Samples & Limit theorems 1 Distributio of i.i.d. Samples Motivatig example. Assume that the goal of a study is to

More information

4. Partial Sums and the Central Limit Theorem

4. Partial Sums and the Central Limit Theorem 1 of 10 7/16/2009 6:05 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 4. Partial Sums ad the Cetral Limit Theorem The cetral limit theorem ad the law of large umbers are the two fudametal theorems

More information

The Central Limit Theorem

The Central Limit Theorem Chapter The Cetral Limit Theorem Deote by Z the stadard ormal radom variable with desity 2π e x2 /2. Lemma.. Ee itz = e t2 /2 Proof. We use the same calculatio as for the momet geeratig fuctio: exp(itx

More information

Lecture 3 The Lebesgue Integral

Lecture 3 The Lebesgue Integral Lecture 3: The Lebesgue Itegral 1 of 14 Course: Theory of Probability I Term: Fall 2013 Istructor: Gorda Zitkovic Lecture 3 The Lebesgue Itegral The costructio of the itegral Uless expressly specified

More information

Chapter 6 Infinite Series

Chapter 6 Infinite Series Chapter 6 Ifiite Series I the previous chapter we cosidered itegrals which were improper i the sese that the iterval of itegratio was ubouded. I this chapter we are goig to discuss a topic which is somewhat

More information

Lecture 3 : Random variables and their distributions

Lecture 3 : Random variables and their distributions Lecture 3 : Radom variables ad their distributios 3.1 Radom variables Let (Ω, F) ad (S, S) be two measurable spaces. A map X : Ω S is measurable or a radom variable (deoted r.v.) if X 1 (A) {ω : X(ω) A}

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013 MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013 Fuctioal Law of Large Numbers. Costructio of the Wieer Measure Cotet. 1. Additioal techical results o weak covergece

More information

Infinite Sequences and Series

Infinite Sequences and Series Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet

More information

Measure and Measurable Functions

Measure and Measurable Functions 3 Measure ad Measurable Fuctios 3.1 Measure o a Arbitrary σ-algebra Recall from Chapter 2 that the set M of all Lebesgue measurable sets has the followig properties: R M, E M implies E c M, E M for N implies

More information

Lecture 19: Convergence

Lecture 19: Convergence Lecture 19: Covergece Asymptotic approach I statistical aalysis or iferece, a key to the success of fidig a good procedure is beig able to fid some momets ad/or distributios of various statistics. I may

More information

2.1. Convergence in distribution and characteristic functions.

2.1. Convergence in distribution and characteristic functions. 3 Chapter 2. Cetral Limit Theorem. Cetral limit theorem, or DeMoivre-Laplace Theorem, which also implies the wea law of large umbers, is the most importat theorem i probability theory ad statistics. For

More information

2.2. Central limit theorem.

2.2. Central limit theorem. 36.. Cetral limit theorem. The most ideal case of the CLT is that the radom variables are iid with fiite variace. Although it is a special case of the more geeral Lideberg-Feller CLT, it is most stadard

More information

6.3 Testing Series With Positive Terms

6.3 Testing Series With Positive Terms 6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial

More information

MAT1026 Calculus II Basic Convergence Tests for Series

MAT1026 Calculus II Basic Convergence Tests for Series MAT026 Calculus II Basic Covergece Tests for Series Egi MERMUT 202.03.08 Dokuz Eylül Uiversity Faculty of Sciece Departmet of Mathematics İzmir/TURKEY Cotets Mootoe Covergece Theorem 2 2 Series of Real

More information

Probability for mathematicians INDEPENDENCE TAU

Probability for mathematicians INDEPENDENCE TAU Probability for mathematicias INDEPENDENCE TAU 2013 28 Cotets 3 Ifiite idepedet sequeces 28 3a Idepedet evets........................ 28 3b Idepedet radom variables.................. 33 3 Ifiite idepedet

More information

Axioms of Measure Theory

Axioms of Measure Theory MATH 532 Axioms of Measure Theory Dr. Neal, WKU I. The Space Throughout the course, we shall let X deote a geeric o-empty set. I geeral, we shall ot assume that ay algebraic structure exists o X so that

More information

Introduction to Probability. Ariel Yadin. Lecture 7

Introduction to Probability. Ariel Yadin. Lecture 7 Itroductio to Probability Ariel Yadi Lecture 7 1. Idepedece Revisited 1.1. Some remiders. Let (Ω, F, P) be a probability space. Give a collectio of subsets K F, recall that the σ-algebra geerated by K,

More information

Integrable Functions. { f n } is called a determining sequence for f. If f is integrable with respect to, then f d does exist as a finite real number

Integrable Functions. { f n } is called a determining sequence for f. If f is integrable with respect to, then f d does exist as a finite real number MATH 532 Itegrable Fuctios Dr. Neal, WKU We ow shall defie what it meas for a measurable fuctio to be itegrable, show that all itegral properties of simple fuctios still hold, ad the give some coditios

More information

(A sequence also can be thought of as the list of function values attained for a function f :ℵ X, where f (n) = x n for n 1.) x 1 x N +k x N +4 x 3

(A sequence also can be thought of as the list of function values attained for a function f :ℵ X, where f (n) = x n for n 1.) x 1 x N +k x N +4 x 3 MATH 337 Sequeces Dr. Neal, WKU Let X be a metric space with distace fuctio d. We shall defie the geeral cocept of sequece ad limit i a metric space, the apply the results i particular to some special

More information

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4.

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4. 4. BASES I BAACH SPACES 39 4. BASES I BAACH SPACES Sice a Baach space X is a vector space, it must possess a Hamel, or vector space, basis, i.e., a subset {x γ } γ Γ whose fiite liear spa is all of X ad

More information

If a subset E of R contains no open interval, is it of zero measure? For instance, is the set of irrationals in [0, 1] is of measure zero?

If a subset E of R contains no open interval, is it of zero measure? For instance, is the set of irrationals in [0, 1] is of measure zero? 2 Lebesgue Measure I Chapter 1 we defied the cocept of a set of measure zero, ad we have observed that every coutable set is of measure zero. Here are some atural questios: If a subset E of R cotais a

More information

1 Convergence in Probability and the Weak Law of Large Numbers

1 Convergence in Probability and the Weak Law of Large Numbers 36-752 Advaced Probability Overview Sprig 2018 8. Covergece Cocepts: i Probability, i L p ad Almost Surely Istructor: Alessadro Rialdo Associated readig: Sec 2.4, 2.5, ad 4.11 of Ash ad Doléas-Dade; Sec

More information

Sequences and Series of Functions

Sequences and Series of Functions Chapter 6 Sequeces ad Series of Fuctios 6.1. Covergece of a Sequece of Fuctios Poitwise Covergece. Defiitio 6.1. Let, for each N, fuctio f : A R be defied. If, for each x A, the sequece (f (x)) coverges

More information

Introduction to Probability. Ariel Yadin

Introduction to Probability. Ariel Yadin Itroductio to robability Ariel Yadi Lecture 2 *** Ja. 7 ***. Covergece of Radom Variables As i the case of sequeces of umbers, we would like to talk about covergece of radom variables. There are may ways

More information

An Introduction to Randomized Algorithms

An Introduction to Randomized Algorithms A Itroductio to Radomized Algorithms The focus of this lecture is to study a radomized algorithm for quick sort, aalyze it usig probabilistic recurrece relatios, ad also provide more geeral tools for aalysis

More information

It is always the case that unions, intersections, complements, and set differences are preserved by the inverse image of a function.

It is always the case that unions, intersections, complements, and set differences are preserved by the inverse image of a function. MATH 532 Measurable Fuctios Dr. Neal, WKU Throughout, let ( X, F, µ) be a measure space ad let (!, F, P ) deote the special case of a probability space. We shall ow begi to study real-valued fuctios defied

More information

Notes 5 : More on the a.s. convergence of sums

Notes 5 : More on the a.s. convergence of sums Notes 5 : More o the a.s. covergece of sums Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces: Dur0, Sectios.5; Wil9, Sectio 4.7, Shi96, Sectio IV.4, Dur0, Sectio.. Radom series. Three-series

More information

EE 4TM4: Digital Communications II Probability Theory

EE 4TM4: Digital Communications II Probability Theory 1 EE 4TM4: Digital Commuicatios II Probability Theory I. RANDOM VARIABLES A radom variable is a real-valued fuctio defied o the sample space. Example: Suppose that our experimet cosists of tossig two fair

More information

Probability Theory. Muhammad Waliji. August 11, 2006

Probability Theory. Muhammad Waliji. August 11, 2006 Probability Theory Muhammad Waliji August 11, 2006 Abstract This paper itroduces some elemetary otios i Measure-Theoretic Probability Theory. Several probabalistic otios of the covergece of a sequece of

More information

6 Infinite random sequences

6 Infinite random sequences Tel Aviv Uiversity, 2006 Probability theory 55 6 Ifiite radom sequeces 6a Itroductory remarks; almost certaity There are two mai reasos for eterig cotiuous probability: ifiitely high resolutio; edless

More information

B Supplemental Notes 2 Hypergeometric, Binomial, Poisson and Multinomial Random Variables and Borel Sets

B Supplemental Notes 2 Hypergeometric, Binomial, Poisson and Multinomial Random Variables and Borel Sets B671-672 Supplemetal otes 2 Hypergeometric, Biomial, Poisso ad Multiomial Radom Variables ad Borel Sets 1 Biomial Approximatio to the Hypergeometric Recall that the Hypergeometric istributio is fx = x

More information

Notes 19 : Martingale CLT

Notes 19 : Martingale CLT Notes 9 : Martigale CLT Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces: [Bil95, Chapter 35], [Roc, Chapter 3]. Sice we have ot ecoutered weak covergece i some time, we first recall

More information

LECTURE 8: ASYMPTOTICS I

LECTURE 8: ASYMPTOTICS I LECTURE 8: ASYMPTOTICS I We are iterested i the properties of estimators as. Cosider a sequece of radom variables {, X 1}. N. M. Kiefer, Corell Uiversity, Ecoomics 60 1 Defiitio: (Weak covergece) A sequece

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 3 9/11/2013. Large deviations Theory. Cramér s Theorem

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 3 9/11/2013. Large deviations Theory. Cramér s Theorem MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/5.070J Fall 203 Lecture 3 9//203 Large deviatios Theory. Cramér s Theorem Cotet.. Cramér s Theorem. 2. Rate fuctio ad properties. 3. Chage of measure techique.

More information

Advanced Analysis. Min Yan Department of Mathematics Hong Kong University of Science and Technology

Advanced Analysis. Min Yan Department of Mathematics Hong Kong University of Science and Technology Advaced Aalysis Mi Ya Departmet of Mathematics Hog Kog Uiversity of Sciece ad Techology September 3, 009 Cotets Limit ad Cotiuity 7 Limit of Sequece 8 Defiitio 8 Property 3 3 Ifiity ad Ifiitesimal 8 4

More information

Math 61CM - Solutions to homework 3

Math 61CM - Solutions to homework 3 Math 6CM - Solutios to homework 3 Cédric De Groote October 2 th, 208 Problem : Let F be a field, m 0 a fixed oegative iteger ad let V = {a 0 + a x + + a m x m a 0,, a m F} be the vector space cosistig

More information

MATH301 Real Analysis (2008 Fall) Tutorial Note #7. k=1 f k (x) converges pointwise to S(x) on E if and

MATH301 Real Analysis (2008 Fall) Tutorial Note #7. k=1 f k (x) converges pointwise to S(x) on E if and MATH01 Real Aalysis (2008 Fall) Tutorial Note #7 Sequece ad Series of fuctio 1: Poitwise Covergece ad Uiform Covergece Part I: Poitwise Covergece Defiitio of poitwise covergece: A sequece of fuctios f

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013. Large Deviations for i.i.d. Random Variables

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013. Large Deviations for i.i.d. Random Variables MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013 Large Deviatios for i.i.d. Radom Variables Cotet. Cheroff boud usig expoetial momet geeratig fuctios. Properties of a momet

More information

Math Solutions to homework 6

Math Solutions to homework 6 Math 175 - Solutios to homework 6 Cédric De Groote November 16, 2017 Problem 1 (8.11 i the book): Let K be a compact Hermitia operator o a Hilbert space H ad let the kerel of K be {0}. Show that there

More information

Lecture Chapter 6: Convergence of Random Sequences

Lecture Chapter 6: Convergence of Random Sequences ECE5: Aalysis of Radom Sigals Fall 6 Lecture Chapter 6: Covergece of Radom Sequeces Dr Salim El Rouayheb Scribe: Abhay Ashutosh Doel, Qibo Zhag, Peiwe Tia, Pegzhe Wag, Lu Liu Radom sequece Defiitio A ifiite

More information

Introduction to Extreme Value Theory Laurens de Haan, ISM Japan, Erasmus University Rotterdam, NL University of Lisbon, PT

Introduction to Extreme Value Theory Laurens de Haan, ISM Japan, Erasmus University Rotterdam, NL University of Lisbon, PT Itroductio to Extreme Value Theory Laures de Haa, ISM Japa, 202 Itroductio to Extreme Value Theory Laures de Haa Erasmus Uiversity Rotterdam, NL Uiversity of Lisbo, PT Itroductio to Extreme Value Theory

More information

Lecture 20: Multivariate convergence and the Central Limit Theorem

Lecture 20: Multivariate convergence and the Central Limit Theorem Lecture 20: Multivariate covergece ad the Cetral Limit Theorem Covergece i distributio for radom vectors Let Z,Z 1,Z 2,... be radom vectors o R k. If the cdf of Z is cotiuous, the we ca defie covergece

More information

Lecture Notes for Analysis Class

Lecture Notes for Analysis Class Lecture Notes for Aalysis Class Topological Spaces A topology for a set X is a collectio T of subsets of X such that: (a) X ad the empty set are i T (b) Uios of elemets of T are i T (c) Fiite itersectios

More information

Solution. 1 Solutions of Homework 1. Sangchul Lee. October 27, Problem 1.1

Solution. 1 Solutions of Homework 1. Sangchul Lee. October 27, Problem 1.1 Solutio Sagchul Lee October 7, 017 1 Solutios of Homework 1 Problem 1.1 Let Ω,F,P) be a probability space. Show that if {A : N} F such that A := lim A exists, the PA) = lim PA ). Proof. Usig the cotiuity

More information

Notes 27 : Brownian motion: path properties

Notes 27 : Brownian motion: path properties Notes 27 : Browia motio: path properties Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces:[Dur10, Sectio 8.1], [MP10, Sectio 1.1, 1.2, 1.3]. Recall: DEF 27.1 (Covariace) Let X = (X

More information

The Boolean Ring of Intervals

The Boolean Ring of Intervals MATH 532 Lebesgue Measure Dr. Neal, WKU We ow shall apply the results obtaied about outer measure to the legth measure o the real lie. Throughout, our space X will be the set of real umbers R. Whe ecessary,

More information

5 Birkhoff s Ergodic Theorem

5 Birkhoff s Ergodic Theorem 5 Birkhoff s Ergodic Theorem Amog the most useful of the various geeralizatios of KolmogorovâĂŹs strog law of large umbers are the ergodic theorems of Birkhoff ad Kigma, which exted the validity of the

More information

A Proof of Birkhoff s Ergodic Theorem

A Proof of Birkhoff s Ergodic Theorem A Proof of Birkhoff s Ergodic Theorem Joseph Hora September 2, 205 Itroductio I Fall 203, I was learig the basics of ergodic theory, ad I came across this theorem. Oe of my supervisors, Athoy Quas, showed

More information

Law of the sum of Bernoulli random variables

Law of the sum of Bernoulli random variables Law of the sum of Beroulli radom variables Nicolas Chevallier Uiversité de Haute Alsace, 4, rue des frères Lumière 68093 Mulhouse icolas.chevallier@uha.fr December 006 Abstract Let be the set of all possible

More information

Entropy Rates and Asymptotic Equipartition

Entropy Rates and Asymptotic Equipartition Chapter 29 Etropy Rates ad Asymptotic Equipartitio Sectio 29. itroduces the etropy rate the asymptotic etropy per time-step of a stochastic process ad shows that it is well-defied; ad similarly for iformatio,

More information

This exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam.

This exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam. Probability ad Statistics FS 07 Secod Sessio Exam 09.0.08 Time Limit: 80 Miutes Name: Studet ID: This exam cotais 9 pages (icludig this cover page) ad 0 questios. A Formulae sheet is provided with the

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 6 9/23/2013. Brownian motion. Introduction

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 6 9/23/2013. Brownian motion. Introduction MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/5.070J Fall 203 Lecture 6 9/23/203 Browia motio. Itroductio Cotet.. A heuristic costructio of a Browia motio from a radom walk. 2. Defiitio ad basic properties

More information

This section is optional.

This section is optional. 4 Momet Geeratig Fuctios* This sectio is optioal. The momet geeratig fuctio g : R R of a radom variable X is defied as g(t) = E[e tx ]. Propositio 1. We have g () (0) = E[X ] for = 1, 2,... Proof. Therefore

More information

STAT Homework 1 - Solutions

STAT Homework 1 - Solutions STAT-36700 Homework 1 - Solutios Fall 018 September 11, 018 This cotais solutios for Homework 1. Please ote that we have icluded several additioal commets ad approaches to the problems to give you better

More information

MA131 - Analysis 1. Workbook 3 Sequences II

MA131 - Analysis 1. Workbook 3 Sequences II MA3 - Aalysis Workbook 3 Sequeces II Autum 2004 Cotets 2.8 Coverget Sequeces........................ 2.9 Algebra of Limits......................... 2 2.0 Further Useful Results........................

More information

1 = δ2 (0, ), Y Y n nδ. , T n = Y Y n n. ( U n,k + X ) ( f U n,k + Y ) n 2n f U n,k + θ Y ) 2 E X1 2 X1

1 = δ2 (0, ), Y Y n nδ. , T n = Y Y n n. ( U n,k + X ) ( f U n,k + Y ) n 2n f U n,k + θ Y ) 2 E X1 2 X1 8. The cetral limit theorems 8.1. The cetral limit theorem for i.i.d. sequeces. ecall that C ( is N -separatig. Theorem 8.1. Let X 1, X,... be i.i.d. radom variables with EX 1 = ad EX 1 = σ (,. Suppose

More information

n outcome is (+1,+1, 1,..., 1). Let the r.v. X denote our position (relative to our starting point 0) after n moves. Thus X = X 1 + X 2 + +X n,

n outcome is (+1,+1, 1,..., 1). Let the r.v. X denote our position (relative to our starting point 0) after n moves. Thus X = X 1 + X 2 + +X n, CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 9 Variace Questio: At each time step, I flip a fair coi. If it comes up Heads, I walk oe step to the right; if it comes up Tails, I walk oe

More information

Lecture 8: Convergence of transformations and law of large numbers

Lecture 8: Convergence of transformations and law of large numbers Lecture 8: Covergece of trasformatios ad law of large umbers Trasformatio ad covergece Trasformatio is a importat tool i statistics. If X coverges to X i some sese, we ofte eed to check whether g(x ) coverges

More information

Chapter 7 Isoperimetric problem

Chapter 7 Isoperimetric problem Chapter 7 Isoperimetric problem Recall that the isoperimetric problem (see the itroductio its coectio with ido s proble) is oe of the most classical problem of a shape optimizatio. It ca be formulated

More information

Probability and Random Processes

Probability and Random Processes Probability ad Radom Processes Lecture 5 Probability ad radom variables The law of large umbers Mikael Skoglud, Probability ad radom processes 1/21 Why Measure Theoretic Probability? Stroger limit theorems

More information

Lecture 2. The Lovász Local Lemma

Lecture 2. The Lovász Local Lemma Staford Uiversity Sprig 208 Math 233A: No-costructive methods i combiatorics Istructor: Ja Vodrák Lecture date: Jauary 0, 208 Origial scribe: Apoorva Khare Lecture 2. The Lovász Local Lemma 2. Itroductio

More information

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 22

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 22 CS 70 Discrete Mathematics for CS Sprig 2007 Luca Trevisa Lecture 22 Aother Importat Distributio The Geometric Distributio Questio: A biased coi with Heads probability p is tossed repeatedly util the first

More information

1 of 7 7/16/2009 6:06 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 6. Order Statistics Defiitios Suppose agai that we have a basic radom experimet, ad that X is a real-valued radom variable

More information

Random Variables, Sampling and Estimation

Random Variables, Sampling and Estimation Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig

More information

MA131 - Analysis 1. Workbook 2 Sequences I

MA131 - Analysis 1. Workbook 2 Sequences I MA3 - Aalysis Workbook 2 Sequeces I Autum 203 Cotets 2 Sequeces I 2. Itroductio.............................. 2.2 Icreasig ad Decreasig Sequeces................ 2 2.3 Bouded Sequeces..........................

More information

Lecture 2: April 3, 2013

Lecture 2: April 3, 2013 TTIC/CMSC 350 Mathematical Toolkit Sprig 203 Madhur Tulsiai Lecture 2: April 3, 203 Scribe: Shubhedu Trivedi Coi tosses cotiued We retur to the coi tossig example from the last lecture agai: Example. Give,

More information

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering CEE 5 Autum 005 Ucertaity Cocepts for Geotechical Egieerig Basic Termiology Set A set is a collectio of (mutually exclusive) objects or evets. The sample space is the (collectively exhaustive) collectio

More information

Problem Set 2 Solutions

Problem Set 2 Solutions CS271 Radomess & Computatio, Sprig 2018 Problem Set 2 Solutios Poit totals are i the margi; the maximum total umber of poits was 52. 1. Probabilistic method for domiatig sets 6pts Pick a radom subset S

More information

Chapter 0. Review of set theory. 0.1 Sets

Chapter 0. Review of set theory. 0.1 Sets Chapter 0 Review of set theory Set theory plays a cetral role i the theory of probability. Thus, we will ope this course with a quick review of those otios of set theory which will be used repeatedly.

More information

6 Integers Modulo n. integer k can be written as k = qn + r, with q,r, 0 r b. So any integer.

6 Integers Modulo n. integer k can be written as k = qn + r, with q,r, 0 r b. So any integer. 6 Itegers Modulo I Example 2.3(e), we have defied the cogruece of two itegers a,b with respect to a modulus. Let us recall that a b (mod ) meas a b. We have proved that cogruece is a equivalece relatio

More information

Math 140A Elementary Analysis Homework Questions 3-1

Math 140A Elementary Analysis Homework Questions 3-1 Math 0A Elemetary Aalysis Homework Questios -.9 Limits Theorems for Sequeces Suppose that lim x =, lim y = 7 ad that all y are o-zero. Detarime the followig limits: (a) lim(x + y ) (b) lim y x y Let s

More information

1+x 1 + α+x. x = 2(α x2 ) 1+x

1+x 1 + α+x. x = 2(α x2 ) 1+x Math 2030 Homework 6 Solutios # [Problem 5] For coveiece we let α lim sup a ad β lim sup b. Without loss of geerality let us assume that α β. If α the by assumptio β < so i this case α + β. By Theorem

More information

lim za n n = z lim a n n.

lim za n n = z lim a n n. Lecture 6 Sequeces ad Series Defiitio 1 By a sequece i a set A, we mea a mappig f : N A. It is customary to deote a sequece f by {s } where, s := f(). A sequece {z } of (complex) umbers is said to be coverget

More information

Discrete Mathematics for CS Spring 2005 Clancy/Wagner Notes 21. Some Important Distributions

Discrete Mathematics for CS Spring 2005 Clancy/Wagner Notes 21. Some Important Distributions CS 70 Discrete Mathematics for CS Sprig 2005 Clacy/Wager Notes 21 Some Importat Distributios Questio: A biased coi with Heads probability p is tossed repeatedly util the first Head appears. What is the

More information

Series III. Chapter Alternating Series

Series III. Chapter Alternating Series Chapter 9 Series III With the exceptio of the Null Sequece Test, all the tests for series covergece ad divergece that we have cosidered so far have dealt oly with series of oegative terms. Series with

More information

32 estimating the cumulative distribution function

32 estimating the cumulative distribution function 32 estimatig the cumulative distributio fuctio 4.6 types of cofidece itervals/bads Let F be a class of distributio fuctios F ad let θ be some quatity of iterest, such as the mea of F or the whole fuctio

More information

FUNDAMENTALS OF REAL ANALYSIS by

FUNDAMENTALS OF REAL ANALYSIS by FUNDAMENTALS OF REAL ANALYSIS by Doğa Çömez Backgroud: All of Math 450/1 material. Namely: basic set theory, relatios ad PMI, structure of N, Z, Q ad R, basic properties of (cotiuous ad differetiable)

More information

Math 113, Calculus II Winter 2007 Final Exam Solutions

Math 113, Calculus II Winter 2007 Final Exam Solutions Math, Calculus II Witer 7 Fial Exam Solutios (5 poits) Use the limit defiitio of the defiite itegral ad the sum formulas to compute x x + dx The check your aswer usig the Evaluatio Theorem Solutio: I this

More information

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Discrete Mathematics for CS Spring 2008 David Wagner Note 22 CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig

More information

Solutions to HW Assignment 1

Solutions to HW Assignment 1 Solutios to HW: 1 Course: Theory of Probability II Page: 1 of 6 Uiversity of Texas at Austi Solutios to HW Assigmet 1 Problem 1.1. Let Ω, F, {F } 0, P) be a filtered probability space ad T a stoppig time.

More information

We are mainly going to be concerned with power series in x, such as. (x)} converges - that is, lims N n

We are mainly going to be concerned with power series in x, such as. (x)} converges - that is, lims N n Review of Power Series, Power Series Solutios A power series i x - a is a ifiite series of the form c (x a) =c +c (x a)+(x a) +... We also call this a power series cetered at a. Ex. (x+) is cetered at

More information

The Borel hierarchy classifies subsets of the reals by their topological complexity. Another approach is to classify them by size.

The Borel hierarchy classifies subsets of the reals by their topological complexity. Another approach is to classify them by size. Lecture 7: Measure ad Category The Borel hierarchy classifies subsets of the reals by their topological complexity. Aother approach is to classify them by size. Filters ad Ideals The most commo measure

More information

Introduction to Functional Analysis

Introduction to Functional Analysis MIT OpeCourseWare http://ocw.mit.edu 18.10 Itroductio to Fuctioal Aalysis Sprig 009 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. LECTURE OTES FOR 18.10,

More information

Math 113 Exam 3 Practice

Math 113 Exam 3 Practice Math Exam Practice Exam will cover.-.9. This sheet has three sectios. The first sectio will remid you about techiques ad formulas that you should kow. The secod gives a umber of practice questios for you

More information

Application to Random Graphs

Application to Random Graphs A Applicatio to Radom Graphs Brachig processes have a umber of iterestig ad importat applicatios. We shall cosider oe of the most famous of them, the Erdős-Réyi radom graph theory. 1 Defiitio A.1. Let

More information

Central Limit Theorem using Characteristic functions

Central Limit Theorem using Characteristic functions Cetral Limit Theorem usig Characteristic fuctios RogXi Guo MAT 477 Jauary 20, 2014 RogXi Guo (2014 Cetral Limit Theorem usig Characteristic fuctios Jauary 20, 2014 1 / 15 Itroductio study a radom variable

More information

FUNDAMENTALS OF REAL ANALYSIS by. V.1. Product measures

FUNDAMENTALS OF REAL ANALYSIS by. V.1. Product measures FUNDAMENTALS OF REAL ANALSIS by Doğa Çömez V. PRODUCT MEASURE SPACES V.1. Product measures Let (, A, µ) ad (, B, ν) be two measure spaces. I this sectio we will costruct a product measure µ ν o that coicides

More information

1 Approximating Integrals using Taylor Polynomials

1 Approximating Integrals using Taylor Polynomials Seughee Ye Ma 8: Week 7 Nov Week 7 Summary This week, we will lear how we ca approximate itegrals usig Taylor series ad umerical methods. Topics Page Approximatig Itegrals usig Taylor Polyomials. Defiitios................................................

More information

Lecture 1 Probability and Statistics

Lecture 1 Probability and Statistics Wikipedia: Lecture 1 Probability ad Statistics Bejami Disraeli, British statesma ad literary figure (1804 1881): There are three kids of lies: lies, damed lies, ad statistics. popularized i US by Mark

More information

Probability and Statistics

Probability and Statistics ICME Refresher Course: robability ad Statistics Staford Uiversity robability ad Statistics Luyag Che September 20, 2016 1 Basic robability Theory 11 robability Spaces A probability space is a triple (Ω,

More information

1 Introduction. 1.1 Notation and Terminology

1 Introduction. 1.1 Notation and Terminology 1 Itroductio You have already leared some cocepts of calculus such as limit of a sequece, limit, cotiuity, derivative, ad itegral of a fuctio etc. Real Aalysis studies them more rigorously usig a laguage

More information