Advanced Stochastic Processes.

David Gamarnik

LECTURE 2
Random variables and measurable functions. Strong Law of Large Numbers (SLLN). Scary stuff continued...

Outline of Lecture: Random variables and measurable functions. Extension Theorem. Borel-Cantelli Lemma and SLLN.

1.1. Random variables and measurable functions

Definition 1.1. Given two pairs (Ω_1, F_1), (Ω_2, F_2) of a sample space and a σ-field, a function X : Ω_1 → Ω_2 is defined to be measurable if for every A ∈ F_2 we must have X^{-1}(A) ∈ F_1. When Ω_2 is the set of all reals R and F_2 is the Borel σ-field, the function X is called a random variable.

This definition naturally extends to the case when Ω_2 = R^d. In this case we call X a random vector. Also, since the set of integers is a subset of R, the definition of a random variable includes the case of integer-valued random variables.

Exercise 1. Suppose a function X : Ω → R is such that X^{-1}((-∞, x)) ∈ F for every real value x. Prove that X is a measurable function.

Note that we do not need a probability measure P on Ω_1 or Ω_2 in order to define measurable functions. But a probability measure is needed when we discuss probability distributions below.

Examples. (a) It is easy to give an example of a function which is not measurable. Suppose, for example, Ω_1 = Ω_2 and both consist of exactly 3 elements ω_1, ω_2, ω_3. Say F_1 is the trivial σ-field (which consists of only ∅ and Ω) and F_2 is the full σ-field consisting of all 8 subsets of Ω. Then the identity transformation X : ω → ω is not measurable: take any non-empty set A ⊂ Ω with A ≠ Ω. Then A is measurable with respect to F_2, but X^{-1}(A) = A is not measurable with respect to F_1.

(b) (Figure.) Say Ω = [0, 1]^2 and X : Ω → R is defined by X(ω) = ω_1 + ω_2. We claim that X is a random variable when Ω is equipped with the Borel σ-field. Here is the proof. For every real value x consider the set A = {ω = (ω_1, ω_2) : ω_1 + ω_2 > x}. We will prove that A is measurable (belongs to the Borel σ-field of [0, 1]^2). Then we take the complement of A, and this proves that X is a random variable. Consider the countable set of pairs of rationals (r_1, r_2) such that r_1 + r_2 > x. For each of them find n = n(r_1, r_2), the smallest integer large enough so that the rectangle B(r_1, r_2; 1/n) = {(ω_1, ω_2) ∈ [0, 1]^2 : |ω_1 - r_1| ≤ 1/n, |ω_2 - r_2| ≤ 1/n} lies entirely in A (this is possible by the strict inequality r_1 + r_2 > x). Observe that every pair (ω_1, ω_2) satisfying ω_1 + ω_2 > x lies in one of these rectangles. Thus A is the union ∪_{r_1, r_2} B(r_1, r_2; 1/n(r_1, r_2)) of the countable collection of such rectangles and therefore belongs to the Borel σ-field of [0, 1]^2.

(c) Say Ω = C[0, ∞) equipped with the Borel σ-field, and X : Ω → R is the function which maps every continuous function f(t) into max_{0 ≤ t ≤ 1} f(t). Then X is a random variable on Ω. Indeed, for every x, X^{-1}((-∞, x]) is the set of all functions f such that max_{0 ≤ t ≤ 1} f(t) ≤ x. But this is exactly the set B(0, x, 1) used in Definition 1.5 of Lecture 1. The sets of this type generate the Borel σ-field and, in particular, belong to it. Thus X is measurable.
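Example (a) can be checked mechanically on a finite space, where measurability is just a preimage test. The Python sketch below is only an illustration (the set encodings and helper names such as is_measurable are ours, not part of the notes): it lists the trivial and the full σ-field on a 3-point space and confirms that the identity map fails the test of Definition 1.1 in one direction and passes it in the other.

from itertools import combinations

omega = ["w1", "w2", "w3"]

def power_set(points):
    # All subsets of a finite set, as frozensets.
    return [frozenset(c) for r in range(len(points) + 1) for c in combinations(points, r)]

# F1: the trivial sigma-field {empty set, Omega}.  F2: the full sigma-field of all 8 subsets.
F1 = [frozenset(), frozenset(omega)]
F2 = power_set(omega)

def preimage(X, A):
    # X^{-1}(A) for a map X given as a dict on a finite sample space.
    return frozenset(w for w in X if X[w] in A)

def is_measurable(X, F_domain, F_range):
    # Definition 1.1 on finite spaces: X^{-1}(A) must belong to F_domain for every A in F_range.
    return all(preimage(X, A) in F_domain for A in F_range)

identity = {w: w for w in omega}
print(is_measurable(identity, F1, F2))  # False: A = {"w1"} is in F2 but its preimage {"w1"} is not in F1
print(is_measurable(identity, F2, F1))  # True: with the sigma-fields swapped every preimage is in F2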

The concept of random variables naturally leads to the concept of a probability distribution.

Definition 1.2. (Figure.) Given a probability space (Ω, F, P) and a random variable X : Ω → R, the associated probability distribution is defined to be the function F : R → [0, 1] given by F(x) = P({ω ∈ Ω : X(ω) ≤ x}). When F(x) is a differentiable function of x, its derivative f(x) = F'(x) is called the density function.

In other words, F(x) is the probability given to the set of all elementary outcomes ω which are mapped by X into a value at most x. It is the probability distributions which are usually discussed in elementary probability classes. There, one usually defines a probability distribution as a function satisfying certain properties (for example, it should be non-decreasing and should converge to unity as x → ∞). Here these properties can be derived from the given definition of a probability distribution.

Proposition 1. Prove that F(x) is non-decreasing, non-negative and lim_{x → -∞} F(x) = 0, lim_{x → +∞} F(x) = 1.

Proof. HW.

The concept of probability distributions allows one to perform probability-related calculations without alluding to the more abstract notion of probability measures. This is not possible, however, when we discuss probability spaces like C[0, ∞).
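Definition 1.2 can be made concrete on a small finite probability space. The sketch below is a minimal illustration (the fair-die space and the helper name cdf are our choices): it computes F(x) = P({ω : X(ω) ≤ x}) directly from the definition and exhibits the monotonicity and the limiting values 0 and 1 claimed in Proposition 1.

from fractions import Fraction

# A finite probability space: a fair six-sided die.
omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}     # probability of each elementary outcome
X = {w: min(w, 4) for w in omega}          # a random variable: the roll, capped at 4

def cdf(x):
    # F(x) = P({omega : X(omega) <= x}) as in Definition 1.2.
    return sum(P[w] for w in omega if X[w] <= x)

for x in [0, 1, 2.5, 4, 10]:
    print(x, cdf(x))
# Output is non-decreasing in x, equals 0 for x < 1 and equals 1 for x >= 4,
# in line with the limits lim_{x -> -oo} F(x) = 0 and lim_{x -> +oo} F(x) = 1 of Proposition 1.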

Having defined random variables and associated probability distributions, we can further define expected values, moments, moment generating functions, etc., in a more formal way than is done in elementary probability classes. We do this only heuristically, highlighting the main ideas.

Definition 1.3. A random variable X : Ω → R is called simple if it takes only finitely many values x_1, x_2, ..., x_m. The expected value of a simple random variable X is defined to be the quantity

E[X] = ∑_{1 ≤ i ≤ m} x_i P({ω ∈ Ω : X(ω) = x_i}).

What if X is not simple? How do we define its expected value? The idea is to approximate X by a sequence of simple random variables. For simplicity assume that X takes only values in the interval [0, A] for some A > 0. That is, X : Ω → [0, A]. Now consider X_n(ω) = k/n if X(ω) ∈ ((k-1)/n, k/n]. Then X_n is a simple random variable. It can be shown that the sequence of the corresponding expected values E[X_n] converges. Its limit is called the expected value E[X] of X. It is also sometimes written as ∫_Ω X(ω) dP(ω). This definition of expected value satisfies all the properties of expected values one studies in elementary probability courses, for example the fact E[X^2] ≥ (E[X])^2, the Markov inequality, the Chebyshev inequality, Jensen's inequality, etc.
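The approximation of E[X] by simple random variables can be watched numerically. The sketch below is illustrative only (the choice X(ω) = ω^2 on Ω = [0, 1] with the uniform measure is ours): for this X the probability P(X ∈ ((k-1)/n, k/n]) has a closed form, so E[X_n] can be computed exactly and seen to approach ∫_Ω X(ω) dP(ω) = 1/3.

import math

# X(omega) = omega^2 on Omega = [0, 1] with the uniform probability measure, so E[X] = 1/3.
# Here P(X in ((k-1)/n, k/n]) = sqrt(k/n) - sqrt((k-1)/n), which gives E[X_n] in closed form.

def expected_value_of_simple_approximation(n):
    # E[X_n], where X_n(omega) = k/n whenever X(omega) lies in ((k-1)/n, k/n].
    total = 0.0
    for k in range(1, n + 1):
        p = math.sqrt(k / n) - math.sqrt((k - 1) / n)   # P(X in ((k-1)/n, k/n]) under the uniform measure
        total += (k / n) * p
    return total

for n in [1, 10, 100, 1000, 10000]:
    print(n, expected_value_of_simple_approximation(n))
# The printed values decrease towards E[X] = 1/3; indeed 0 <= X_n - X <= 1/n, so |E[X_n] - E[X]| <= 1/n.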

1.2. What is an i.i.d. sequence of random variables?

Now we can give a formal definition of a stochastic process, the principal notion for this course.

Definition 1.4. Let T be the set of all non-negative reals R_+ or integers Z_+. A stochastic process {X_t}_{t ∈ T} is a family of random variables X_t : Ω → R parametrized by T.

Remark. Note that a sample outcome ω corresponding to a stochastic process is a function X(ω) : T → R, and the sample space of the corresponding stochastic process is the space of functions from T into R. But often we consider restrictions. For example, when T = [0, ∞) we might consider only continuous functions from [0, ∞) into R.

Example. Set Ω = C[0, ∞) equipped with the Borel σ-field. Define X_t(ω) = ω(t) for every sample ω ∈ C[0, ∞). Then {X_t}_{t ∈ [0, ∞)} is a stochastic process. This is true because each function X_t : C[0, ∞) → R is a random variable (we will prove this later in the course).

Remark. The definition naturally extends to the case when the observations are functions from T into d-dimensional Euclidean space R^d.

One of the simplest (to analyze, but not to define) examples of a stochastic process is an i.i.d. (independent, identically distributed) stochastic process. What is an i.i.d. stochastic process? In probability courses it was common to say "X_1, X_2, ... is an i.i.d. sequence of Bernoulli random variables with parameter 0 < p < 1", or "Z_1, Z_2, ... is an i.i.d. sequence of Normal (Gaussian) random variables with expected value µ and variance σ^2". What do we mean by this? How does it fit with (Ω, F, P)? We are almost equipped to answer this question, but we need a few more technicalities. The definition of a probability space includes defining a function P : F → [0, 1]. How can we define this function on the entire σ-field, when we sometimes cannot even describe the σ-field explicitly? The help comes from the Extension Theorem (ET). A rough idea is that if the σ-field is generated by some collection of sets A and we can define P on A only, then there is a unique extension of the function P onto the entire σ-field, provided some restrictions are satisfied.

1.2.1. Extension Theorem

Theorem 1.5 (Extension Theorem). Given a sample space Ω and a collection A of subsets of Ω such that for every A ∈ A its complement Ω \ A is also in A, and for every finite sequence A_1, ..., A_m the union ∪_{1 ≤ j ≤ m} A_j is also in A. Suppose P : A → [0, 1] is such that
(a) P(Ω) = 1,
(b) P(∪_{j=1}^∞ A_j) ≤ ∑_{j=1}^∞ P(A_j), whenever ∪_{j=1}^∞ A_j ∈ A,
(c) P(∪_{j=1}^∞ A_j) = ∑_{j=1}^∞ P(A_j), whenever ∪_{j=1}^∞ A_j ∈ A and A_i, i = 1, 2, ... are mutually exclusive.
Then the function P uniquely extends to a probability measure P : F(A) → [0, 1] defined on the σ-field generated by A.

Remark. Note that the requirement on A is to be a collection of sets with properties very similar to those of a σ-field. The only difference is that we do not require every infinite union of sets to be in A as well.

1.2.2. Examples and applications

Uniform probability measure. Consider Ω = [0, 1] and let A be the set of finite unions of open or closed non-intersecting intervals: [a_1, b_1) ∪ [a_2, b_2] ∪ ... ∪ (a_m, b_m). It is easy to check that A satisfies the conditions of the ET. Consider the function P : A → [0, 1] which maps every such union of intervals to the value ∑_{1 ≤ i ≤ m} (b_i - a_i) (that is, the total length of these intervals). It can be checked that this also satisfies the conditions of the ET (we skip the proof). Thus, by the ET, there exists a unique extension of the function P to a probability measure on the entire Borel σ-field, since this σ-field is generated by intervals. This probability measure is called the uniform probability measure on [0, 1].

Other types of continuous distributions. What about other distributions like Normal, Exponential, etc.? The proper definition of these probability measures is introduced similarly. For example, the standard Normal distribution is defined as the probability space (R, B, P), where B is the Borel σ-field on R and P assigns to each interval [a, b] the value ∫_a^b (1/√(2π)) e^{-t^2/2} dt. Then each non-intersecting collection of intervals [a_i, b_i], 1 ≤ i ≤ m, is assigned the value which is the sum of the corresponding integrals. Again the set of finite collections of non-intersecting intervals satisfies the conditions of the ET, and applying the ET we obtain that the probability measure P is defined on the entire Borel σ-field B.
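To see what the collection A and the function P look like in the uniform-measure example, here is a small sketch (an illustration with arbitrarily chosen endpoints, not part of the construction): an element of A is encoded as a list of disjoint intervals, P returns its total length, and finite additivity on disjoint unions can be checked directly.

from fractions import Fraction

# An element of the collection A: a finite union of disjoint subintervals of [0, 1],
# encoded (for this illustration) as a list of (a, b) endpoint pairs with a <= b.
def P(intervals):
    # Total length of the union, as in the uniform-measure example.
    return sum(b - a for a, b in intervals)

A1 = [(Fraction(0), Fraction(1, 4)), (Fraction(1, 2), Fraction(3, 5))]   # measure 1/4 + 1/10 = 7/20
A2 = [(Fraction(7, 10), Fraction(9, 10))]                                # disjoint from A1, measure 1/5

print(P(A1), P(A2), P(A1 + A2))   # 7/20  1/5  11/20
# Finite additivity on A: P(A1 union A2) = P(A1) + P(A2) for disjoint unions of intervals.
# The Extension Theorem upgrades this elementary length assignment to the uniform probability
# measure on the entire Borel sigma-field of [0, 1].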

1.2.3. i.i.d. sequences

i.i.d. coin tosses. Let Ω = {0, 1}^∞. Recall that the product σ-field is the σ-field generated by cylinder-type sets A(ω). Let A be the set of finite unions of such sets ∪_{1 ≤ j ≤ k} A(ω_j). Again, it can be checked that A satisfies the conditions of the ET. For every finite sequence ω = (ω_1, ..., ω_m) and the corresponding set A(ω) we set P(A(ω)) simply to be 2^{-m} (the probability of a particular sequence of 0/1 observations in the first m coin tosses is 2^{-m}). For example, the probability that the first four tosses are zeros is 1/2^4. Then, for every union of k non-intersecting sets ∪_{1 ≤ j ≤ k} A(ω_j) we set the corresponding value to k · 2^{-m}. The conditions of the ET again can be checked, but we skip the proof. Then, by the ET, there is a unique extension of P to the entire product σ-field of Ω. This is what we call a sequence of i.i.d. unbiased coin tosses, also known as a sequence of i.i.d. Bernoulli random variables with parameter 1/2. The phrase "i.i.d.", in proper probabilistic terms, means (Ω, F, P), the probability space constructed above.

General i.i.d. type distributions. We have defined formally an i.i.d. Bernoulli sequence. What about general i.i.d. sequences? They are defined similarly by considering infinite products and cylinder-type sets. First we set Ω = R^∞. On it we consider the product σ-field F. Define A to be the set of finite unions of cylinder-type sets. Recall that a cylinder set A is a set of the form A = [a_1, b_1] × (a_2, b_2) × ... × [a_m, b_m) × R^∞, a product of closed, open or half-closed/half-open intervals. Recall also that cylinder sets generate, by definition, the product σ-field F. Suppose we have a probability space (R, B, P) defined on R and its Borel σ-field B (for example, P corresponds to the standard Normal distribution). Then for every cylinder set A we define P(A) = ∏_{1 ≤ j ≤ m} P([a_j, b_j]). Again we check that A and P satisfy the conditions of the ET (we skip the proof). Thus there is a unique extension of P to the entire product σ-field F of R^∞, since A generates this σ-field. Then we define X_m(ω) = ω_m for every ω ∈ R^∞. We note that X_m is a random variable as it is a measurable function from R^∞ into R. The sequence X_1, X_2, ... is a stochastic process which we call an i.i.d. sequence of random variables. Essentially we have embedded a sequence of random variables {X_m} into a single probability space (R^∞, F, P).

Is this definition consistent with the elementary definition of i.i.d.? Recall that the elementary definition of an i.i.d. sequence is that P(X_1 ≤ x_1, ..., X_m ≤ x_m) = ∏_{1 ≤ j ≤ m} P(X_j ≤ x_j). Is this true in our case? Note

P(X_1 ≤ x_1, ..., X_m ≤ x_m) = P({ω ∈ R^∞ : ω_1 ∈ (-∞, x_1], ..., ω_m ∈ (-∞, x_m]})
                             = P((-∞, x_1] × ... × (-∞, x_m] × R^∞)
                             = ∏_{1 ≤ j ≤ m} P((-∞, x_j]),

where the last equality follows from how we defined P on cylinder sets. But the product of these probabilities is exactly ∏_{1 ≤ j ≤ m} P(X_j ≤ x_j). Thus the identity checks.
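The consistency identity just verified can also be checked by simulation. The following sketch is a Monte Carlo illustration only (the sample size is an arbitrary choice): for i.i.d. fair coin tosses, the event {X_1 ≤ 0, ..., X_4 ≤ 0} is the cylinder set "the first four tosses are 0", and its empirical frequency should be close to the product of the marginals, 2^{-4}.

import random

random.seed(0)

# Monte Carlo check of the consistency identity above for i.i.d. fair coin tosses.
# The event {X_1 <= 0, ..., X_4 <= 0} is the cylinder set "the first four tosses are 0",
# whose probability was set to 2^{-4} in the construction.  The sample size is arbitrary.
N = 200_000
m = 4
hits = 0
for _ in range(N):
    tosses = [random.randint(0, 1) for _ in range(m)]   # the first m coordinates of one sampled omega
    if all(t <= 0 for t in tosses):
        hits += 1

print(hits / N, 0.5 ** m)   # empirical frequency and the product of marginals, both close to 0.0625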

1.3. Borel-Cantelli Lemma and Strong Law of Large Numbers (SLLN)

The Strong Law of Large Numbers (SLLN), like the Central Limit Theorem, is one of the most fundamental theorems in probability theory. Yet properly stating it, let alone proving it, is not as straightforward as it is for, say, the Weak Law of Large Numbers (WLLN). We now use the (Ω, F, P) framework to properly state and prove the SLLN. We begin with a very useful tool, the Borel-Cantelli Lemma.

Given a sample space Ω, a σ-field F and an infinite sequence A_1, A_2, ..., A_m, ... ∈ F, define A i.o. (i.o. stands for "infinitely often") to be the set of all ω ∈ Ω which belong to infinitely many of the A_m's. One can write A i.o. as (check the validity of this identity)

A i.o. = ∩_{m ≥ 1} ∪_{j ≥ m} A_j.

Lemma 1.6 (Borel-Cantelli Lemma). Given a probability space (Ω, F, P) and an infinite sequence of events A_m, m ≥ 1, suppose ∑_m P(A_m) < ∞. Then P(A i.o.) = 0.

In words we say: the probability that the A_m happen infinitely often is equal to zero.

Proof. Define B_m = ∪_{j ≥ m} A_j. Then A i.o. = ∩_m B_m, and B_1 ⊃ B_2 ⊃ B_3 ⊃ ... Using Proposition 4, part (b) (applied to the complement sets), we obtain P(A i.o.) = lim_m P(B_m). But since ∑_m P(A_m) < ∞, the tail parts of the sum satisfy lim_m ∑_{j ≥ m} P(A_j) = 0. Moreover, P(B_m) = P(∪_{j ≥ m} A_j) ≤ ∑_{j ≥ m} P(A_j). Therefore lim_m P(B_m) = 0. We conclude P(A i.o.) = 0.
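A numerical illustration of the Borel-Cantelli Lemma (a sketch under our own choice of events, not from the notes): take independent events A_n with P(A_n) = 1/n^2, so that ∑_n P(A_n) < ∞. In simulation, the last index at which any A_n occurs is almost always tiny, which is what P(A i.o.) = 0 predicts.

import random

random.seed(1)

# Independent events A_n with P(A_n) = 1/n^2, so that sum_n P(A_n) < infinity.
# For each simulated omega we record the last index n <= N at which A_n occurred.
N = 100_000
for _ in range(20):
    last_occurrence = 0
    for n in range(1, N + 1):
        if random.random() < 1.0 / n ** 2:   # the event A_n occurs
            last_occurrence = n
    print(last_occurrence, end=" ")
print()
# The printed indices are almost always tiny compared with N: beyond a (random) finite index
# no further A_n occur, which is the content of P(A i.o.) = 0.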

Theorem 1.7 (SLLN). Consider an i.i.d. sequence of random variables X_n, n = 1, 2, ..., corresponding to some probability measure (R, B, P). Suppose E[|X_1|] < ∞. Then almost surely

lim_n (∑_{1 ≤ i ≤ n} X_i) / n = E[X_1].

Formally, define A = {ω ∈ R^∞ : lim_n (∑_{1 ≤ i ≤ n} X_i(ω)) / n = E[X_1]}. Then P(A) = 1.

Proof. The proof of this fundamental result in probability theory is complicated (see for example [2]). Here, for simplicity, we consider the special case when the random variable X_1 has a finite fourth moment, namely E[X_1^4] = ∫ |X_1(ω)|^4 dP(ω) < ∞.

Let us center the random variables X_i in the following way: Y_i = X_i - E[X_i]. Then the Y_i have zero expected value. Since

(∑_{1 ≤ i ≤ n} Y_i) / n = (∑_{1 ≤ i ≤ n} X_i) / n - E[X_1],

it suffices to prove that (∑_{1 ≤ i ≤ n} Y_i) / n converges almost surely to zero. Fix ε > 0 and define the event A_n(ε) as {|∑_{1 ≤ i ≤ n} Y_i| / n > ε}. Formally,

A_n(ε) = {ω ∈ R^∞ : |∑_{1 ≤ i ≤ n} Y_i(ω)| / n > ε}.

Applying the Markov inequality,

P(|∑_{1 ≤ i ≤ n} Y_i| / n > ε) = P((∑_{1 ≤ i ≤ n} Y_i)^4 > ε^4 n^4) ≤ E[(∑_{1 ≤ i ≤ n} Y_i)^4] / (ε^4 n^4).

When we expand E[(∑_{1 ≤ i ≤ n} Y_i)^4] we note that only the terms of the form E[Y_i^4] and E[Y_i^2 Y_j^2], i ≠ j, are non-zero, since the expected value of Y_i is zero and the sequence is i.i.d. Also, by independence, E[Y_i^2 Y_j^2] = (E[Y_1^2])^2. We obtain the bound

(n E[Y_1^4] + 3n(n-1)(E[Y_1^2])^2) / (ε^4 n^4) ≤ 3n^2 (E[Y_1^4] + (E[Y_1^2])^2) / (ε^4 n^4) = 3 (E[Y_1^4] + (E[Y_1^2])^2) / (ε^4 n^2).

This expression is finite by our assumption of finiteness of the fourth moment. Since the sum

∑_n 3 (E[Y_1^4] + (E[Y_1^2])^2) / (ε^4 n^2) < ∞,

applying the Borel-Cantelli Lemma we conclude that the probability that A_n(ε) happens for infinitely many n is zero. In other words, for almost every ω ∈ R^∞ there exists n_0 = n_0(ε, ω) such that for all n > n_0 we have |∑_{1 ≤ i ≤ n} Y_i(ω)| / n ≤ ε. Applying this to ε = 1/k, k = 1, 2, ..., and intersecting the corresponding countably many almost sure events, we conclude that for almost every ω

lim_n (∑_{1 ≤ i ≤ n} Y_i(ω)) / n = 0.

This concludes the proof.
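Theorem 1.7 is easy to observe numerically. The sketch below is an illustration only (the exponential distribution with mean 2 and the sample sizes are our choices; this distribution also has a finite fourth moment, so the special case proved above applies): it simulates one sample path ω and prints the running averages (∑_{1 ≤ i ≤ n} X_i)/n, which settle near E[X_1] = 2.

import random

random.seed(2)

# One sample path omega of an i.i.d. sequence of exponential random variables with E[X_1] = 2.
# The running averages along this single path approach E[X_1], as Theorem 1.7 asserts
# happens for almost every omega.
mean = 2.0
N = 1_000_000
checkpoints = {10, 100, 1_000, 10_000, 100_000, 1_000_000}
running_sum = 0.0
for n in range(1, N + 1):
    running_sum += random.expovariate(1.0 / mean)   # X_n: exponential with rate 1/2, mean 2
    if n in checkpoints:
        print(n, running_sum / n)                   # (X_1 + ... + X_n) / n, settling near 2.0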

1.4. Reading assignments

Notes "Modeling experiments", pages 1.4, 1.5, 2.2.
Grimmett and Stirzaker [2], Chapters 1 and 2; Chapter 7, Sections 7.3-7.5.
Durrett [1], Chapter 1, Sections 1-7.

BIBLIOGRAPHY

1. R. Durrett, Probability: Theory and Examples, second edition, Duxbury Press, 1996.
2. G. R. Grimmett and D. R. Stirzaker, Probability and Random Processes, Oxford Science Publications, 1985.