Advanced Stochastic Processes (15.070). David Gamarnik

LECTURE 2
Random variables and measurable functions. Strong Law of Large Numbers (SLLN). Scary stuff continued...

Outline of Lecture
Random variables and measurable functions.
Extension Theorem.
Borel-Cantelli Lemma and SLLN.

1.1. Random variables and measurable functions

Definition 1.1. Given two pairs $(\Omega_1, \mathcal{F}_1)$, $(\Omega_2, \mathcal{F}_2)$ of a sample space and a σ-field, a function $X : \Omega_1 \to \Omega_2$ is defined to be measurable if for every $A \in \mathcal{F}_2$ we have $X^{-1}(A) \in \mathcal{F}_1$. When $\Omega_2$ is the set of all reals $\mathbb{R}$ and $\mathcal{F}_2$ is the Borel σ-field, the function $X$ is called a random variable.

This definition naturally extends to the case when $\Omega_2 = \mathbb{R}^d$. In this case we call $X$ a random vector. Also, since the set of integers is a subset of $\mathbb{R}$, the definition of a random variable includes the case of integer valued random variables.

Exercise 1. Suppose a function $X : \Omega \to \mathbb{R}$ is such that $X^{-1}((-\infty, x)) \in \mathcal{F}$ for every real value $x$. Prove that $X$ is a measurable function.

Note that we do not need a probability measure $\mathbb{P}$ on $\Omega_1$ or $\Omega_2$ in order to define measurable functions. But a probability measure is needed when we discuss probability distributions below.

Examples. (a) It is easy to give an example of a function which is not measurable. Suppose, for example, $\Omega_1 = \Omega_2 = \Omega$ and both consist of exactly 3 elements $\omega_1, \omega_2, \omega_3$. Say $\mathcal{F}_1$ is the trivial σ-field (which consists of only $\emptyset$ and $\Omega$) and $\mathcal{F}_2$ is the full σ-field consisting of all 8 subsets of $\Omega$. Then the identity transformation $X : \omega \to \omega$ is not measurable: take any non-empty
set $A \subset \Omega$, $A \neq \Omega$. Then $A$ is measurable with respect to $\mathcal{F}_2$, but $X^{-1}(A) = A$ is not measurable with respect to $\mathcal{F}_1$.

(b) (Figure.) Say $\Omega = [0, 1]^2$ and $X : \Omega \to \mathbb{R}$ is defined by $X(\omega) = \omega_1 + \omega_2$. We claim that $X$ is a random variable when $\Omega$ is equipped with the Borel σ-field. Here is the proof. For every real value $x$ consider the set $A = \{\omega = (\omega_1, \omega_2) : \omega_1 + \omega_2 > x\}$. We will prove that $A$ is measurable (belongs to the Borel σ-field of $[0, 1]^2$). Then we will take the complement of $A$, which is the set $\{\omega : X(\omega) \le x\}$, and this will prove that $X$ is a random variable. Consider the countable set of pairs of rationals $(r_1, r_2)$ such that $r_1 + r_2 > x$. For each of them find $n = n(r_1, r_2)$, the smallest integer which is large enough so that the rectangle

$$B(r_1, r_2; 1/n) = \{(\omega_1, \omega_2) \in [0, 1]^2 : |\omega_1 - r_1| \le 1/n, \ |\omega_2 - r_2| \le 1/n\}$$

lies entirely in $A$ (this is possible by the strict inequality $r_1 + r_2 > x$). Observe that every pair $(\omega_1, \omega_2)$ satisfying $\omega_1 + \omega_2 > x$ lies in one of these rectangles. Thus $A$ is the union $\bigcup_{r_1, r_2} B(r_1, r_2; 1/n(r_1, r_2))$ of the countable collection of such rectangles and therefore belongs to the Borel σ-field of $[0, 1]^2$.

(c) Say $\Omega = C[0, \infty)$ equipped with the Borel σ-field, and $X : \Omega \to \mathbb{R}$ is the function which maps every continuous function $f(t)$ into $\max_{0 \le t \le 1} f(t)$. Then $X$ is a random variable on $\Omega$. Indeed, for every $x$, the set $X^{-1}((-\infty, x])$ is the set of all functions $f$ such that $\max_{0 \le t \le 1} f(t) \le x$. But this is exactly the set $B(0, x, 1)$ used in Definition 1.5 of Lecture 1. The sets of this type generate the Borel σ-field, and in particular, belong to it. Thus $X$ is measurable.

The concept of random variables naturally leads to the concept of a probability distribution.

Definition 1.2. (Figure.) Given a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and a random variable $X : \Omega \to \mathbb{R}$, the associated probability distribution is defined to be the function $F : \mathbb{R} \to [0, 1]$ given by $F(x) = \mathbb{P}(\{\omega \in \Omega : X(\omega) \le x\})$. When $F(x)$ is a differentiable function of $x$, its derivative $f(x) = F'(x)$ is called the density function.
In other words $F(x)$ is the probability given to the set of all elementary outcomes $\omega$ which are mapped by $X$ into a value at most $x$. It is the probability distributions which are usually discussed in elementary probability classes. There, one usually defines a probability distribution as a function satisfying certain properties (for example, it should be non-decreasing and should converge to unity as $x \to \infty$). Here these properties can be derived from the given definition of a probability distribution.

Proposition 1. Prove that $F(x)$ is non-decreasing, non-negative and $\lim_{x \to -\infty} F(x) = 0$, $\lim_{x \to \infty} F(x) = 1$.

Proof. HW

The concept of probability distributions allows one to perform probability related calculations without alluding to the more abstract notion of probability measures. This is not possible, however, when we discuss probability spaces like $C[0, \infty)$.
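
Definition 1.2 can be tried out on a small finite example. The following is a minimal sketch, assuming a hypothetical sample space of two fair dice with the uniform measure; this example is not part of the lecture, and the names `omega`, `prob`, `X` and `F` are illustrative only. On a finite sample space equipped with the σ-field of all subsets every function is measurable, so `X` is a random variable and `F(x)` can be computed by direct summation.

```python
from fractions import Fraction

# Hypothetical finite probability space (an assumption for illustration):
# two fair dice, every outcome has probability 1/36.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
prob = {w: Fraction(1, 36) for w in omega}


def X(w):
    """Random variable: the sum of the two dice."""
    return w[0] + w[1]


def F(x):
    """Distribution function F(x) = P({w : X(w) <= x})."""
    return sum(p for w, p in prob.items() if X(w) <= x)


if __name__ == "__main__":
    xs = [1, 2, 3.5, 7, 11.9, 12, 100]
    values = [F(x) for x in xs]
    for x, v in zip(xs, values):
        print(x, v)
    # Properties claimed in Proposition 1, checked on these sample points only.
    print(all(a <= b for a, b in zip(values, values[1:])))
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the values of `F` can be compared without rounding error. The check covers only the listed points and is not a proof of Proposition 1.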
Having defined random variables and associated probability distributions, we can further define expected values, moments, moment generating functions, etc., in a more formal way than is done in elementary probability classes. We do this only heuristically, highlighting the main ideas.

Definition 1.3. A random variable $X : \Omega \to \mathbb{R}$ is called simple if it takes only finitely many values $x_1, x_2, \ldots, x_m$. The expected value of a simple random variable $X$ is defined to be the quantity

$$\mathbb{E}[X] = \sum_{1 \le i \le m} x_i \, \mathbb{P}\{\omega \in \Omega : X(\omega) = x_i\}.$$

What if $X$ is not simple? How do we define its expected value? The idea is to approximate $X$ by a sequence of simple random variables. For simplicity assume that $X$ takes only values in the interval $[0, A]$ for some $A > 0$. That is, $X : \Omega \to [0, A]$. Now consider $X_n(\omega) = k/n$ if $X(\omega) \in ((k-1)/n, k/n]$, $k = 1, 2, \ldots$, and $X_n(\omega) = 0$ if $X(\omega) = 0$. Then $X_n$ is a simple random variable. It can be shown that the sequence of the corresponding expected values $\mathbb{E}[X_n]$ converges. Its limit is called the expected value $\mathbb{E}[X]$ of $X$. It is also sometimes written as $\int X(\omega) \, d\mathbb{P}(\omega)$. This definition of expected value satisfies all the properties of expected values one studies in elementary probability courses, for example the fact $\mathbb{E}[X^2] \ge (\mathbb{E}[X])^2$, the Markov inequality, the Chebyshev inequality, Jensen's inequality, etc.

1.2. What's an i.i.d. sequence of random variables?

Now we can give a formal definition of a stochastic process, the principal notion for this course.

Definition 1.4. Let $T$ be the set of all non-negative reals $\mathbb{R}_+$ or integers $\mathbb{Z}_+$. A stochastic process $\{X_t\}_{t \in T}$ is a family of random variables $X_t : \Omega \to \mathbb{R}$ parametrized by $T$.

Remark. Note that a sample outcome $\omega$ corresponding to a stochastic process is a function $X(\omega) : T \to \mathbb{R}$, and the sample space of the corresponding stochastic process is the space of functions from $T$ into $\mathbb{R}$. But often we consider restrictions. For example, when $T = [0, \infty)$ we might consider only continuous functions from $[0, \infty)$ into $\mathbb{R}$:

Example. Set $\Omega = C[0, \infty)$ equipped with the Borel σ-field. Define $X_t(\omega) = \omega(t)$ for every sample $\omega \in C[0, \infty)$. Then $\{X_t\}_{t \in [0, \infty)}$ is a stochastic process.
This is true because each function $X_t : C[0, \infty) \to \mathbb{R}$ is a random variable (we will prove this later in the course).

Remark. The definition naturally extends to the case when observations are functions from $T$ into the $d$-dimensional Euclidean space $\mathbb{R}^d$.

One of the simplest (to analyze, but not to define) examples of a stochastic process is an i.i.d. (independent, identically distributed) stochastic process. What is an i.i.d. stochastic process? In probability courses it was common to say "$X_1, X_2, \ldots$ is an i.i.d. sequence of Bernoulli random variables with parameter $0 < p < 1$", or "$Z_1, Z_2, \ldots$ is an i.i.d. sequence of Normal (Gaussian) random variables with expected value $\mu$ and variance $\sigma^2$". What do we mean by this? How does it fit with $(\Omega, \mathcal{F}, \mathbb{P})$? We are almost equipped to answer this question, but need a few more technicalities. The definition of a probability space includes defining a function $\mathbb{P} : \mathcal{F} \to [0, 1]$. How can we define this function on the entire σ-field, when sometimes we cannot even describe the σ-field explicitly? The help comes from the Extension Theorem (ET). A rough idea is that if the σ-field is generated by
some collection of sets $\mathcal{A}$ and we can define $\mathbb{P}$ on $\mathcal{A}$ only, then there is a unique extension of the function $\mathbb{P}$ onto the entire σ-field, provided some restrictions are satisfied.

1.2.1. Extension Theorem

Theorem 1.5 (Extension Theorem). Given a sample space $\Omega$ and a collection $\mathcal{A}$ of subsets of $\Omega$ such that for every $A \in \mathcal{A}$ its complement $\Omega \setminus A$ is also in $\mathcal{A}$ and for every finite sequence $A_1, \ldots, A_m \in \mathcal{A}$ its union $\bigcup_{1 \le j \le m} A_j$ is also in $\mathcal{A}$. Suppose $\mathbb{P} : \mathcal{A} \to [0, 1]$ is such that
(a) $\mathbb{P}(\Omega) = 1$,
(b) $\mathbb{P}(\bigcup_{j=1}^{\infty} A_j) \le \sum_{j=1}^{\infty} \mathbb{P}(A_j)$, whenever $\bigcup_{j=1}^{\infty} A_j \in \mathcal{A}$,
(c) $\mathbb{P}(\bigcup_{j=1}^{\infty} A_j) = \sum_{j=1}^{\infty} \mathbb{P}(A_j)$, whenever $\bigcup_{j=1}^{\infty} A_j \in \mathcal{A}$ and $A_i$, $i = 1, 2, \ldots$ are mutually exclusive.
Then the function $\mathbb{P}$ uniquely extends to a probability measure $\mathbb{P} : \mathcal{F}(\mathcal{A}) \to [0, 1]$ defined on the σ-field $\mathcal{F}(\mathcal{A})$ generated by $\mathcal{A}$.

Remark. Note that the requirement on $\mathcal{A}$ is to be a collection of sets with properties very similar to those of a σ-field. The only difference is that we do not require every infinite union of sets to be in $\mathcal{A}$ as well.

1.2.2. Examples and applications

Uniform probability measure. Consider $\Omega = [0, 1]$ and let $\mathcal{A}$ be the set of finite unions of open, closed or half-open non-intersecting intervals: $[a_1, b_1) \cup [a_2, b_2] \cup \cdots \cup (a_m, b_m)$. It is easy to check that $\mathcal{A}$ satisfies the conditions of the ET. Consider the function $\mathbb{P} : \mathcal{A} \to [0, 1]$ which maps every such set of intervals to the value $\sum_{1 \le i \le m} (b_i - a_i)$ (that is, the total length of these intervals). It can be checked that this also satisfies the conditions of the ET (we skip the proof). Thus, by ET, there exists a unique extension of the function $\mathbb{P}$ to a probability measure on the entire Borel σ-field, since this σ-field is generated by intervals. This probability measure is called the uniform probability measure on $[0, 1]$.

Other types of continuous distributions. What about other distributions like Normal, Exponential, etc.? The proper definition of these probability measures is introduced similarly. For example, the standard Normal distribution is defined as the probability space $(\mathbb{R}, \mathcal{B}, \mathbb{P})$, where $\mathcal{B}$ is the Borel σ-field on $\mathbb{R}$ and $\mathbb{P}$ assigns to each interval $[a, b]$ the value $\int_a^b \frac{1}{\sqrt{2\pi}} e^{-t^2/2} \, dt$.
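
The value that the standard Normal measure assigns to an interval can be evaluated numerically. The sketch below is only an illustration under stated assumptions: it uses the identity $\int_a^b \frac{1}{\sqrt{2\pi}} e^{-t^2/2} \, dt = \frac{1}{2}\left(\operatorname{erf}(b/\sqrt{2}) - \operatorname{erf}(a/\sqrt{2})\right)$ together with Python's `math.erf`, and the function name `normal_interval_probability` is hypothetical. It also checks, on one example, that splitting an interval into two adjacent pieces leaves the total value unchanged, in the spirit of condition (c) of the Extension Theorem.

```python
import math


def normal_interval_probability(a, b):
    """Value assigned to [a, b] by the standard Normal measure:
    the integral of exp(-t^2 / 2) / sqrt(2 * pi) over [a, b],
    expressed through the error function."""
    if a > b:
        raise ValueError("need a <= b")
    return 0.5 * (math.erf(b / math.sqrt(2)) - math.erf(a / math.sqrt(2)))


if __name__ == "__main__":
    # One interval versus the same interval split at 0.5.
    whole = normal_interval_probability(-1.0, 2.0)
    parts = (normal_interval_probability(-1.0, 0.5)
             + normal_interval_probability(0.5, 2.0))
    print(whole, parts)
    # The whole real line receives total mass 1.
    print(normal_interval_probability(-math.inf, math.inf))
```

A single point has probability zero under this measure, so it does not matter whether the two pieces share the endpoint 0.5.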
Then each non-intersecting collection of intervals $[a_i, b_i]$, $1 \le i \le m$, is assigned the value which is the sum of the corresponding integrals. Again the set of finite collections of non-intersecting intervals satisfies the conditions of ET, and applying ET we obtain that the probability measure $\mathbb{P}$ is defined on the entire Borel σ-field $\mathcal{B}$.

1.2.3. i.i.d. sequences

i.i.d. coin tosses. Let $\Omega = \{0, 1\}^{\infty}$. Recall that the product σ-field is the σ-field generated by cylinder type sets $A(\omega)$. Let $\mathcal{A}$ be the set of finite unions of such sets $\bigcup_{1 \le j \le k} A(\omega^j)$. Again, it can be checked that $\mathcal{A}$ satisfies the conditions of ET. For every finite sequence $\omega = \omega_1, \ldots, \omega_m$ and the corresponding set $A(\omega)$ we set $\mathbb{P}(A(\omega))$ simply to be $\frac{1}{2^m}$ (the probability of a particular sequence of 0/1 observations in the first $m$ coin tosses is $\frac{1}{2^m}$). For example, the probability that the first four tosses are zeros is $\frac{1}{2^4}$. Then, for every union of non-intersecting sets $\bigcup_{1 \le j \le k} A(\omega^j)$ we set their
corresponding value to $\frac{k}{2^m}$. The conditions of ET again can be checked, but we skip the proof. Then, by ET there is a unique extension of $\mathbb{P}$ to the entire product σ-field of $\Omega$. This is what we call a sequence of i.i.d. unbiased coin tosses, also known as a sequence of i.i.d. Bernoulli random variables with parameter 1/2. The phrase i.i.d., in proper probabilistic terms, means $(\Omega, \mathcal{F}, \mathbb{P})$, the probability space constructed above.

General i.i.d. type distributions. We have formally defined an i.i.d. Bernoulli sequence. What about general i.i.d. sequences? They are defined similarly by considering infinite products and cylinder type sets. First we set $\Omega = \mathbb{R}^{\infty}$. On it we consider the product σ-field $\mathcal{F}$. Define $\mathcal{A}$ to be the set of finite unions of cylinder type sets. Recall that a cylinder set $A$ is a set of the form $A = [a_1, b_1] \times (a_2, b_2) \times \cdots \times [a_m, b_m) \times \mathbb{R}^{\infty}$, a product of closed, open or half-open intervals. Recall also that cylinder sets generate, by definition, the product σ-field $\mathcal{F}$. Suppose we have a probability space $(\mathbb{R}, \mathcal{B}, \mathbb{P})$ defined on $\mathbb{R}$ and its Borel σ-field $\mathcal{B}$ (for example, $\mathbb{P}$ corresponds to the standard Normal distribution). Then for every cylinder set $A$ we define $\mathbb{P}(A) = \prod_{1 \le j \le m} \mathbb{P}([a_j, b_j])$. Again we check that $\mathcal{A}$ and $\mathbb{P}$ satisfy the conditions of ET (we skip the proof). Thus there is a unique extension of $\mathbb{P}$ to the entire product σ-field $\mathcal{F}$ of $\mathbb{R}^{\infty}$, since $\mathcal{A}$ generates this σ-field. Then we define $X_m(\omega) = \omega_m$ for every $\omega \in \mathbb{R}^{\infty}$. We note that $X_m$ is a random variable as it is a measurable function from $\mathbb{R}^{\infty}$ into $\mathbb{R}$. The sequence $X_1, X_2, \ldots$ is a stochastic process which we call an i.i.d. sequence of random variables. Essentially we have embedded a sequence of random variables $\{X_m\}$ into a single probability space $(\mathbb{R}^{\infty}, \mathcal{F}, \mathbb{P})$.

Is this definition consistent with the elementary definition of i.i.d.? Recall that the elementary definition of an i.i.d. sequence is that $\mathbb{P}(X_1 \le x_1, \ldots, X_m \le x_m) = \prod_{1 \le j \le m} \mathbb{P}(X_j \le x_j)$. Is this true in our case?
Note

$$\mathbb{P}(X_1 \le x_1, \ldots, X_m \le x_m) = \mathbb{P}\{\omega \in \mathbb{R}^{\infty} : \omega_1 \in (-\infty, x_1], \ldots, \omega_m \in (-\infty, x_m]\} = \mathbb{P}\{\omega \in (-\infty, x_1] \times \cdots \times (-\infty, x_m] \times \mathbb{R}^{\infty}\} = \prod_{1 \le j \le m} \mathbb{P}((-\infty, x_j]),$$

where the last equality follows from how we defined $\mathbb{P}$ on cylinder sets. But the product of these probabilities is exactly $\prod_{1 \le j \le m} \mathbb{P}(X_j \le x_j)$. Thus the identity checks.

1.3. Borel-Cantelli Lemma and Strong Law of Large Numbers (SLLN)

The Strong Law of Large Numbers (SLLN) (like the Central Limit Theorem) is one of the most fundamental theorems in probability theory. Yet properly stating it, let alone proving it, is not as straightforward as it is for, say, the Weak Law of Large Numbers (WLLN). We now use the $(\Omega, \mathcal{F}, \mathbb{P})$ framework to properly state and prove SLLN. We begin with a very useful tool, the Borel-Cantelli Lemma. Given a sample space $\Omega$, a σ-field $\mathcal{F}$ and an infinite sequence $A_1, A_2, \ldots, A_m, \ldots \in \mathcal{F}$, define $A_{i.o.}$ (i.o. stands for infinitely often) to be the set of all $\omega \in \Omega$ which belong to infinitely many $A_m$-s. One can write $A_{i.o.}$ as (check
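the validity of this identity) $A_{i.o.} = \bigcap_{m \ge 1} \bigcup_{j \ge m} A_j$.

Lemma 1.6 (Borel-Cantelli Lemma). Given a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and an infinite sequence of events $A_m$, $m \ge 1$, suppose $\sum_m \mathbb{P}(A_m) < \infty$. Then $\mathbb{P}(A_{i.o.}) = 0$. In words we say: the probability that $A_m$ happen infinitely often is equal to zero.

Proof. Define $B_m = \bigcup_{j \ge m} A_j$. Then $A_{i.o.} = \bigcap_m B_m$ and $B_1 \supset B_2 \supset B_3 \supset \cdots$. Using Proposition 4 part (b) (applied to complement sets) we obtain $\mathbb{P}(A_{i.o.}) = \lim_{m \to \infty} \mathbb{P}(B_m)$. But since $\sum_m \mathbb{P}(A_m) < \infty$, the tail parts of the sum satisfy $\lim_{m \to \infty} \sum_{j \ge m} \mathbb{P}(A_j) = 0$. But $\mathbb{P}(B_m) = \mathbb{P}(\bigcup_{j \ge m} A_j) \le \sum_{j \ge m} \mathbb{P}(A_j)$. Therefore $\lim_{m \to \infty} \mathbb{P}(B_m) = 0$. We conclude $\mathbb{P}(A_{i.o.}) = 0$.

Theorem 1.7 (SLLN). Consider an i.i.d. sequence of random variables $X_n$, $n = 1, 2, \ldots$ corresponding to some probability measure $(\mathbb{R}, \mathcal{B}, \mathbb{P})$. Suppose $\mathbb{E}[|X_1|] < \infty$. Then almost surely

$$\lim_{n \to \infty} \frac{\sum_{1 \le i \le n} X_i}{n} = \mathbb{E}[X_1].$$

Formally, define

$$A = \Big\{\omega \in \mathbb{R}^{\infty} : \lim_{n \to \infty} \frac{\sum_{1 \le i \le n} X_i(\omega)}{n} = \mathbb{E}[X_1]\Big\}.$$

Then $\mathbb{P}(A) = 1$.

Proof. The proof of this fundamental result in probability theory is complicated (see for example [2]). Here, for simplicity, we consider a special case when the random variable $X_1$ has a finite fourth moment. Namely, $\mathbb{E}[X_1^4] = \int X_1(\omega)^4 \, d\mathbb{P}(\omega) < \infty$. Let us center the random variables $X_i$ in the following way: $Y_i = X_i - \mathbb{E}[X_i]$. Then $Y_i$ have zero expected value. Since

$$\frac{\sum_{1 \le i \le n} Y_i}{n} = \frac{\sum_{1 \le i \le n} X_i}{n} - \mathbb{E}[X_1],$$

it suffices to prove that $\frac{\sum_{1 \le i \le n} Y_i}{n}$ converges almost surely to zero. Fix $\epsilon > 0$ and define the event $A_n(\epsilon)$ as $\left|\frac{\sum_{1 \le i \le n} Y_i}{n}\right| > \epsilon$. Formally

$$A_n(\epsilon) = \Big\{\omega \in \mathbb{R}^{\infty} : \Big|\frac{\sum_{1 \le i \le n} Y_i(\omega)}{n}\Big| > \epsilon\Big\}.$$

Applying the Markov inequality,

$$\mathbb{P}\Big(\Big|\frac{\sum_{1 \le i \le n} Y_i}{n}\Big| > \epsilon\Big) = \mathbb{P}\Big(\Big(\frac{\sum_{1 \le i \le n} Y_i}{n}\Big)^4 > \epsilon^4\Big) \le \frac{\mathbb{E}[(\sum_{1 \le i \le n} Y_i)^4]}{n^4 \epsilon^4}.$$

When we expand $\mathbb{E}[(\sum_{1 \le i \le n} Y_i)^4]$ we note that only the terms of the form $\mathbb{E}[Y_i^4]$ and $\mathbb{E}[Y_i^2 Y_j^2]$ are non-zero, since the expected value of $Y_i$ is zero and the sequence is i.i.d. Also by independence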
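$\mathbb{E}[Y_i^2 Y_j^2] = (\mathbb{E}[Y_1^2])^2$ for $i \neq j$. The expansion contains $n$ terms of the first form and $3n(n-1)$ terms of the second form (each unordered pair $i \neq j$ arises in $\binom{4}{2} = 6$ ways). We obtain a bound

$$\mathbb{P}(A_n(\epsilon)) \le \frac{n \mathbb{E}[Y_1^4] + 3n(n-1)(\mathbb{E}[Y_1^2])^2}{n^4 \epsilon^4} \le \frac{3n^2 \big[\mathbb{E}[Y_1^4] + (\mathbb{E}[Y_1^2])^2\big]}{n^4 \epsilon^4} = \frac{3\big[\mathbb{E}[Y_1^4] + (\mathbb{E}[Y_1^2])^2\big]}{n^2 \epsilon^4}.$$

This expression is finite by our assumption of finiteness of the fourth moment. Since the sum $\sum_n \frac{3[\mathbb{E}[Y_1^4] + (\mathbb{E}[Y_1^2])^2]}{n^2 \epsilon^4} < \infty$, applying the Borel-Cantelli Lemma we conclude that the probability that $A_n(\epsilon)$ happens for infinitely many $n$ is zero. In other words, for almost every $\omega \in \mathbb{R}^{\infty}$ there exists $n_0 = n_0(\epsilon, \omega)$ such that for all $n > n_0$ we must have $\left|\frac{\sum_{1 \le i \le n} Y_i(\omega)}{n}\right| \le \epsilon$. Applying this with $\epsilon = 1/k$ for $k = 1, 2, \ldots$, and using the fact that a countable union of events of probability zero has probability zero, we obtain that for almost every $\omega$ we have

$$\lim_{n \to \infty} \frac{\sum_{1 \le i \le n} Y_i(\omega)}{n} = 0.$$

This concludes the proof.

1.4. Reading assignments

Notes Modeling experiments, pages 1.4, 1.5, 2.2.
Grimmett and Stirzaker [2], Chapters 1 and 2. Chapter 7, Sections 7.3-7.5.
Durrett [1], Chapter 1, Sections 1-7.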
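
As a numerical companion to Theorem 1.7, the following sketch simulates running averages of fair coin tosses and evaluates the fourth-moment Markov bound on $\mathbb{P}(|\sum_{1 \le i \le n} Y_i / n| > \epsilon)$ from the exact expansion $\mathbb{E}[(\sum_{1 \le i \le n} Y_i)^4] = n\mathbb{E}[Y_1^4] + 3n(n-1)(\mathbb{E}[Y_1^2])^2$. It is an illustration under assumptions, not part of the proof: a seeded pseudo-random generator stands in for an i.i.d. sequence, the coin is assumed fair (so $Y_i = X_i - 1/2$, $\mathbb{E}[Y_1^2] = 1/4$, $\mathbb{E}[Y_1^4] = 1/16$), and the function names are hypothetical.

```python
import random


def running_means(n, seed=0):
    """Return the running averages (X_1 + ... + X_k) / k for k = 1..n,
    where the X_i are simulated fair coin tosses taking values 0 or 1."""
    rng = random.Random(seed)
    total = 0
    means = []
    for k in range(1, n + 1):
        total += rng.randint(0, 1)
        means.append(total / k)
    return means


def markov_fourth_moment_bound(n, eps):
    """Upper bound on P(|Y_1 + ... + Y_n| / n > eps) for centered fair
    coin tosses Y_i = X_i - 1/2, with E[Y^4] = 1/16 and E[Y^2] = 1/4."""
    m4, m2 = 1 / 16, 1 / 4
    fourth_moment_of_sum = n * m4 + 3 * n * (n - 1) * m2 ** 2
    return fourth_moment_of_sum / (n ** 4 * eps ** 4)


if __name__ == "__main__":
    means = running_means(200_000)
    for k in (10, 1_000, 200_000):
        # Running average after k tosses and the bound for eps = 0.05.
        print(k, means[k - 1], markov_fourth_moment_bound(k, eps=0.05))
```

The bound decays like $1/n^2$, which is what makes the series over $n$ summable and the Borel-Cantelli Lemma applicable. For small $n$ the bound can exceed 1 and is then uninformative.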
BIBLIOGRAPHY

1. R. Durrett, Probability: theory and examples, Duxbury Press, second edition, 1996.
2. G. R. Grimmett and D. R. Stirzaker, Probability and random processes, Oxford Science Publications, 1985.