LECTURE NOTES ON PROBABILITY


OMER TAMUZ

Contents

Disclaimer
1. Why we need measure theory
1.1. Riddle 1
1.2. Riddle 2 (Gabay-O'Connor game)
1.3. Riddle 3
1.4. Why we need measure theory
1.5. Bonus riddle
2. Measure theory
2.1. π-systems, algebras and sigma-algebras
3. Hahn-Kolmogorov Theorem and constructing measures
4. Events and random variables
5. Independence and the Borel-Cantelli Lemmas
6. The tail sigma-algebra
7. Expectations
8. A strong law of large numbers and the Chernoff bound
9. The weak law of large numbers
10. Conditional expectations
10.1. Why things are not as simple as they seem
10.2. Conditional expectations in finite spaces
10.3. Conditional expectations in $L^2$
10.4. Conditional expectations in $L^1$
10.5. Some properties of conditional expectation
11. The Galton-Watson process
12. Markov chains
13. Martingales
14. Stopping times
15. Harmonic and superharmonic functions
16. The Choquet-Deny Theorem

Date: November 29. Partially adapted from Williams [6]. Any comments or suggestions are welcome.

17. Characteristic functions and the Central Limit Theorem
18. Scenery Reconstruction: I
19. Stationary distributions and processes
20. Scenery reconstruction: II
21. Stationary processes and measure preserving transformations
22. The Ergodic Theorem
23. The Radon-Nikodym derivative
24. The weak topology and the simplex of invariant measures
25. Percolation
26. Large deviations
27. The mass transport principle
28. Majority dynamics
References

Disclaimer

This is not a textbook. These are lecture notes.

1. Why we need measure theory

1.1. Riddle 1. There are $N$ people standing in a line. Each person $n \in \{1, \ldots, N\}$ has a bit $X_n \in \{0, 1\}$ written above her head. Each person can see the bits of the people in front of her but not her own or the bits of those behind, so that person $n$ can see $(X_{n+1}, X_{n+2}, \ldots, X_N)$. Starting with person 1, each person $n$ declares in turn a bit $Y_n$, and this declaration is heard by the rest. $Y_n$ has to be a function of what is known to person $n$. Hence
$$Y_n = f_n(Y_1, \ldots, Y_{n-1}, X_{n+1}, \ldots, X_N)$$
for some function $f_n \colon \{0, 1\}^{N-1} \to \{0, 1\}$. Show that there exist functions $(f_1, \ldots, f_N)$ such that for any assignment of bits to $(X_1, \ldots, X_N)$ it holds that $Y_n = X_n$ for all $n > 1$.

1.2. Riddle 2 (Gabay-O'Connor game). This time there is a countably infinite line of people, so that person $n$ can see $(X_{n+1}, X_{n+2}, \ldots)$, and, as before, hears $(Y_1, \ldots, Y_{n-1})$. Thus
$$Y_n = f_n(Y_1, \ldots, Y_{n-1}, X_{n+1}, \ldots)$$
for some $f_n \colon \{0, 1\}^{\mathbb{N}} \to \{0, 1\}$. Show that there exist functions $(f_1, f_2, \ldots)$ such that for any assignment of bits to $(X_1, X_2, \ldots)$ it holds that $Y_n = X_n$ for all $n > 1$.

1.3. Riddle 3. Now there are again countably infinitely many people, but they do not hear the declarations, and so
$$Y_n = f_n(X_{n+1}, X_{n+2}, \ldots)$$
for some $f_n \colon \{0, 1\}^{\mathbb{N}} \to \{0, 1\}$. Show that there exist functions $(f_1, f_2, \ldots)$ such that for any assignment of bits to $(X_1, X_2, \ldots)$ the set of $n \in \mathbb{N}$ for which $Y_n \neq X_n$ is finite.

1.4. Why we need measure theory. Assume that the $X_n$'s are i.i.d. random variables with $P[X_n = 0] = P[X_n = 1] = 1/2$. In the setting of Riddle 3, fix any functions $(f_1, f_2, \ldots)$. Since $Y_n$ is a function of $(X_{n+1}, X_{n+2}, \ldots)$, it is independent of $X_n$. Thus
$$P[Y_n = X_n] = P[Y_n = X_n, X_n = 0] + P[Y_n = X_n, X_n = 1]$$
$$= P[Y_n = 0, X_n = 0] + P[Y_n = 1, X_n = 1]$$
$$= P[Y_n = 0] \cdot P[X_n = 0] + P[Y_n = 1] \cdot P[X_n = 1]$$
$$= (P[Y_n = 0] + P[Y_n = 1]) \cdot \tfrac{1}{2} = \tfrac{1}{2}.$$
Define $K \in \{1, 2, \ldots, \infty\}$ by $K = \max\{n : X_n \neq Y_n\}$, with $K = \infty$ if this maximum does not exist.

If $Y_n \neq X_n$ then $K \ge n$. Hence, and since $P[Y_n \neq X_n] = 1/2$, we have that $P[K \ge n] \ge 1/2$ for all $n$. Hence $P[K < n] \le 1/2$ for all $n$. Thus
$$\sum_{m=1}^{n-1} P[K = m] = P[K < n] \le \tfrac{1}{2}.$$
Taking the limit $n \to \infty$ we have shown that
$$\sum_{m=1}^{\infty} P[K = m] \le \tfrac{1}{2},$$
and so $P[K < \infty] \le 1/2$. Thus $P[K = \infty] \ge 1/2$, and in particular with positive probability infinitely many people guess wrongly. (In fact this happens with probability 1.)

Exercise 1.1. Show that in the setting of Riddle 2, $P[Y_n = X_n \text{ for all } n > 1] < 1$.

Another similar (and better known) example is the Banach-Tarski paradox.

1.5. Bonus riddle. Prove or disprove: every subset of $\mathbb{R}^2$ of size 9 is contained in the disjoint union of 9 closed disks of radius [...].
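The key step in Section 1.4, that $P[Y_n = X_n] = 1/2$ whenever $Y_n$ depends only on the bits in front of person $n$, can be checked exhaustively on a finite line. The following is a minimal sketch, not part of the notes: the line length `N = 6`, the 0-based indexing, and the particular guessing rules in `strategies` are all illustrative assumptions.

```python
from itertools import product

def success_counts(strategies, N):
    """For each person n (0-based), count the assignments in {0,1}^N with Y_n == X_n.

    strategies[n] receives only the bits in front of person n, as in Riddle 3.
    """
    counts = [0] * N
    for bits in product((0, 1), repeat=N):
        for n, f in enumerate(strategies):
            visible = bits[n + 1:]  # person n cannot see her own bit
            if f(visible) == bits[n]:
                counts[n] += 1
    return counts

N = 6  # hypothetical line length, small enough to enumerate all 2^N assignments
strategies = [
    lambda v: sum(v) % 2,                # parity of the visible bits
    lambda v: int(2 * sum(v) > len(v)),  # majority of the visible bits
    lambda v: v[0] if v else 0,          # copy the next person's bit
    lambda v: 1,                         # always guess 1
    lambda v: 1 - (sum(v) % 2),          # flipped parity
    lambda v: 0,                         # always guess 0
]
counts = success_counts(strategies, N)
print(counts)  # each entry equals 2**(N - 1): every rule is right on exactly half the assignments
```

Whatever rule is plugged in, each person is correct on exactly half of the assignments, which is the finite counterpart of the independence computation above.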

2. Measure theory

A probability measure $\mu$ on a finite space $\Omega$ assigns to each $\omega \in \Omega$ a number between 0 and 1, and has the property that these numbers sum to 1. We can also think about it as a function $\mu \colon 2^{\Omega} \to [0, 1]$ that assigns to each subset of $\Omega$ a number, and has the properties that
(1) $\mu(\Omega) = 1$.
(2) $\mu$ is additive. That is, if $A_1, A_2$ are disjoint (i.e., $A_1 \cap A_2 = \emptyset$) then $\mu(A_1 \cup A_2) = \mu(A_1) + \mu(A_2)$.

For example, when $\Omega = \{0, 1\}^n$, the i.i.d. fair coin toss measure can be defined by letting $\mu(\{\omega\}) = 2^{-n}$ for each $\omega \in \Omega$. In particular, for each $k \le n$,
$$\mu(\{\omega : \omega_1 = 1, \omega_2 = 1, \ldots, \omega_k = 1\}) = 2^{-k}.$$
We would like to define the same object for a countable number of coin tosses. That is, when $\Omega = \{0, 1\}^{\mathbb{N}}$, we would like to define a map $\mu \colon 2^{\Omega} \to [0, 1]$ that has the above properties, satisfies
$$\mu(\{\omega : \omega_1 = 1, \omega_2 = 1, \ldots, \omega_k = 1\}) = 2^{-k},$$
and is furthermore countably additive: if $(A_1, A_2, \ldots)$ is a sequence of disjoint sets then
$$\mu\Big(\bigcup_n A_n\Big) = \sum_n \mu(A_n).$$
As we saw in the riddle from the previous lecture, this is impossible. In order to solve this problem we will introduce some measure theoretical concepts.

2.1. π-systems, algebras and sigma-algebras. Given a set $\Omega$, a π-system on $\Omega$ is a collection $\mathcal{P}$ of subsets of $\Omega$ such that if $A, B \in \mathcal{P}$ then $A \cap B \in \mathcal{P}$.

Example 2.1. Let $\Omega = \mathbb{R}$, and let $\mathcal{P} = \{(-\infty, x] : x \in \mathbb{R}\}$. This is a π-system because $(-\infty, x] \cap (-\infty, y] = (-\infty, \min\{x, y\}]$.

Example 2.2. Let $\Omega = \{0, 1\}^{\mathbb{N}}$, and let $\mathcal{P}$ be the collection of sets $\{A_S\}$ indexed by finite $S \subset \mathbb{N}$, where
$$A_S = \{\omega \in \Omega : \omega_k = 1 \text{ for all } k \in S\}.$$
This is a π-system because $A_S \cap A_T = A_{S \cup T}$.
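The identity $A_S \cap A_T = A_{S \cup T}$ from Example 2.2 can be verified by brute force on a finite truncation of the coin-toss space. This is only a sketch under assumptions: it replaces $\{0,1\}^{\mathbb{N}}$ by $\{0,1\}^n$ with a hypothetical `n = 4`, uses 0-based coordinates, and takes the uniform measure on the truncation as a stand-in for the fair coin toss measure.

```python
from itertools import combinations, product

n = 4  # hypothetical truncation length
omega = list(product((0, 1), repeat=n))

def A(S):
    """Cylinder set A_S: all sequences with a 1 at every coordinate in S."""
    return frozenset(w for w in omega if all(w[k] == 1 for k in S))

# All index sets S that fit inside the truncation, including the empty set.
index_sets = [frozenset(c) for r in range(n + 1) for c in combinations(range(n), r)]

# Closure under intersection: A_S ∩ A_T = A_{S ∪ T}.
pi_system_ok = all(A(S) & A(T) == A(S | T) for S in index_sets for T in index_sets)

# Under the uniform measure, mu(A_S) = |A_S| / 2^n should equal 2^{-|S|}.
measure_ok = all(len(A(S)) * 2 ** len(S) == len(omega) for S in index_sets)

print(pi_system_ok, measure_ok)
```

The second check confirms that each cylinder set fixing $|S|$ coordinates has measure $2^{-|S|}$ on the truncation, matching the value the notes ask of $\mu$ on the sets $\{\omega_1 = 1, \ldots, \omega_k = 1\}$.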

Example 2.3. Let $X$ be a topological space. Then the set of closed sets in $X$ is a π-system.

An algebra of subsets of $\Omega$ is a π-system $\mathcal{A}$ on $\Omega$ with the following additional properties:
(1) $\Omega \in \mathcal{A}$.
(2) If $A \in \mathcal{A}$ then its complement $A^c \in \mathcal{A}$.

It is easy to see that if $\mathcal{A}$ is an algebra of subsets of $\Omega$ then
(1) $\emptyset \in \mathcal{A}$.
(2) If $A, B \in \mathcal{A}$ then $A \cup B \in \mathcal{A}$.

Example 2.4. Let $\Omega$ be any set. Then the collection of all subsets of $\Omega$ is an algebra.

Example 2.5. Let $\Omega = \{0, 1\}^{\mathbb{N}}$, and let $\mathcal{A}_{\text{clopen}}$ be the algebra of clopen sets. That is, $\mathcal{A}_{\text{clopen}}$ is the collection of finite unions of sets $A_x$ indexed by finite $x \in \{0, 1\}^n$, where
$$A_x = \{\omega \in \Omega : \omega_k = x_k \text{ for all } k \le n\}.$$

Exercise 2.6. Show that $\mathcal{A}_{\text{clopen}}$ is the collection of finite disjoint unions of sets of the form $A_x$.

Example 2.7. Let $\Omega = \mathbb{N}$, and let $\mathcal{A}$ be the collection of sets $A$ such that either $A$ is finite, or else $A^c$ is finite.

Exercise 2.8. Prove that $\mathcal{A}_{\text{clopen}}$ and $\mathcal{A}$ are algebras.

Given an algebra $\mathcal{A}$, a finitely additive probability measure is a function $\mu \colon \mathcal{A} \to [0, 1]$ with the following properties:
(1) $\mu(\Omega) = 1$.
(2) $\mu$ is additive. That is, if $A_1, A_2$ are disjoint (i.e., $A_1 \cap A_2 = \emptyset$) then $\mu(A_1 \cup A_2) = \mu(A_1) + \mu(A_2)$.

Exercise 2.9. Show that $\mu(\emptyset) = 0$.

Exercise 2.10. Define a finitely additive measure on the algebra $\mathcal{A}$ from Example 2.7.

An algebra $\mathcal{F}$ of subsets of $\Omega$ is a sigma-algebra if for any sequence $(A_1, A_2, \ldots)$ of elements of $\mathcal{F}$ it holds that $\bigcup_n A_n \in \mathcal{F}$. It follows that $\bigcap_n A_n \in \mathcal{F}$.

Exercise 2.11.
(1) Let $I$ be a set, and let $\{\mathcal{F}_i\}_{i \in I}$ be a collection of sigma-algebras of subsets of $\Omega$. Show that $\bigcap_{i \in I} \mathcal{F}_i$ is a sigma-algebra.

(2) Let $\mathcal{C}$ be a collection of subsets of $\Omega$. Then there exists a unique minimal (under inclusion) sigma-algebra $\mathcal{F} \supseteq \mathcal{C}$. $\mathcal{F}$ is called the sigma-algebra generated by $\mathcal{C}$, which we write as $\mathcal{F} = \sigma(\mathcal{C})$.

Exercise 2.12. Prove that $\mathcal{A}$ (Example 2.7) is not a sigma-algebra.

Given a topological space, the Borel sigma-algebra $\mathcal{B}$ is the sigma-algebra generated by the open sets. Hence it is also generated by any countable basis of the topology.

A measurable space is a pair $(\Omega, \mathcal{F})$, where $\mathcal{F}$ is a sigma-algebra of subsets of $\Omega$. A probability measure on $(\Omega, \mathcal{F})$ is a function $\mu \colon \mathcal{F} \to [0, 1]$ with the following properties:
(1) $\mu(\Omega) = 1$.
(2) $\mu$ is countably additive. That is, if $(A_1, A_2, \ldots)$ is a sequence of disjoint sets (i.e., $A_n \cap A_m = \emptyset$ for all $n \neq m$) then
$$\mu\Big(\bigcup_n A_n\Big) = \sum_n \mu(A_n).$$
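On a finite set $\Omega$ the generated sigma-algebra $\sigma(\mathcal{C})$ can be computed directly, since countable unions reduce to finite ones. The sketch below is an illustration only: the function name `generated_sigma_algebra` and the example collection are assumptions, and the closure loop is a naive fixed-point iteration that is practical only for very small $\Omega$.

```python
def generated_sigma_algebra(omega, collection):
    """Smallest family containing `collection` that is closed under complement and union.

    On a finite omega this is exactly sigma(collection); closure under
    intersection then follows from De Morgan's laws.
    """
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(c) for c in collection}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for a in current:
            # Candidates: the complement of a, and the union of a with every known set.
            for new in [omega - a] + [a | b for b in current]:
                if new not in family:
                    family.add(new)
                    changed = True
    return family

sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {2}])
for s in sorted(sigma, key=lambda s: (len(s), sorted(s))):
    print(sorted(s))
```

Here the atoms of the generated sigma-algebra are $\{1\}$, $\{2\}$ and $\{3, 4\}$, so the points 3 and 4 are never separated: no set in `sigma` contains one without the other.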

3. Hahn-Kolmogorov Theorem and constructing measures

Theorem 3.1 (Hahn-Kolmogorov Theorem). Let $\mathcal{C}$ be a collection of subsets of $\Omega$, and let $\mathcal{F} = \sigma(\mathcal{C})$. Let $\mu_0 \colon \mathcal{C} \to [0, 1]$ be a countably additive map with $\mu_0(\Omega) = 1$. We say that a probability measure $\mu \colon \mathcal{F} \to [0, 1]$ extends $\mu_0$ if $\mu(A) = \mu_0(A)$ for all $A \in \mathcal{C}$.
(1) If $\mathcal{C}$ is a π-system then there exists at most one probability measure $\mu$ that extends $\mu_0$.
(2) If $\mathcal{C}$ is an algebra then there exists exactly one probability measure $\mu$ that extends $\mu_0$.

Example 3.2. Let $\mathcal{A} = \mathcal{A}_{\text{clopen}}$ be the algebra defined in Example 2.5. Then there is a unique map $\mu_0 \colon \mathcal{A} \to [0, 1]$ that is additive and satisfies $\mu_0(A_x) = 2^{-|x|}$. Furthermore, this map is countably additive. Hence $\mu_0$ has a unique extension $\mu \colon \mathcal{B} \to [0, 1]$ (where $\mathcal{B} = \sigma(\mathcal{A})$ is the Borel sigma-algebra on $\{0, 1\}^{\mathbb{N}}$, equipped with the product topology). The probability measure $\mu$ is sometimes called the Bernoulli measure on $\{0, 1\}^{\mathbb{N}}$.

Exercise 3.3. Prove that $\mu_0 \colon \mathcal{A}_{\text{clopen}} \to [0, 1]$ is countably additive.

Example 3.4. Let $\mathcal{P}$ be the π-system on the interval $[0, 1]$ given by $\mathcal{P} = \{[0, x] : x \in [0, 1]\}$, and let $\mu_0 \colon \mathcal{P} \to [0, 1]$ be given by $\mu_0([0, x]) = x$. Then, by Theorem 3.1, there exists at most one probability measure $\mu \colon \mathcal{B} \to [0, 1]$ (where $\mathcal{B} = \sigma(\mathcal{P})$ is the Borel sigma-algebra on $[0, 1]$) that extends $\mu_0$. Such a $\mu$ does indeed exist; it is called the Lebesgue measure. To prove this we naturally extend $\mu_0$ to the algebra generated by $\mathcal{P}$, and then show that this extension is countably additive.

Example 3.5. Let $\mathcal{P}$ be the π-system from Example 2.1. Choose some monotone increasing, right continuous $F \colon \mathbb{R} \to [0, 1]$ with $\inf_x F(x) = 0$ and $\sup_x F(x) = 1$. Let $\mu_0 \colon \mathcal{P} \to [0, 1]$ be given by $\mu_0((-\infty, x]) = F(x)$. Then if there exists a probability measure $\mu \colon \mathcal{B} \to [0, 1]$ (where $\mathcal{B} = \sigma(\mathcal{P})$ is the Borel sigma-algebra on $\mathbb{R}$) that extends $\mu_0$, then it is unique. Such a probability measure also always exists.

Theorem 3.6. Let $(\Omega, \mathcal{F}, \mu)$ be a probability space.

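(1) If $(F_1, F_2, \ldots)$ is a sequence of sets in $\mathcal{F}$ such that $F_n \subseteq F_{n+1}$ then
$$\mu\Big(\bigcup_n F_n\Big) = \lim_n \mu(F_n).$$
(2) If $(F_1, F_2, \ldots)$ is a sequence of sets in $\mathcal{F}$ such that $F_n \supseteq F_{n+1}$ then
$$\mu\Big(\bigcap_n F_n\Big) = \lim_n \mu(F_n).$$

Proof. (1) Let $G_1 = F_1$, and for $n > 1$ let $G_n = F_n \setminus F_{n-1}$. Then $\bigcup_n F_n = \bigcup_n G_n$, and additionally the $G_n$'s are disjoint. Hence
$$\mu\Big(\bigcup_n F_n\Big) = \sum_n \mu(G_n) = \lim_n \sum_{k=1}^{n} \mu(G_k) = \lim_n \mu\Big(\bigcup_{k=1}^{n} G_k\Big) = \lim_n \mu(F_n).$$
(2) Left as an exercise.

Corollary 3.7. Let $(\Omega, \mathcal{F}, \mu)$ be a probability space, and let $(F_1, F_2, \ldots)$ be a sequence of sets in $\mathcal{F}$.
(1) If $\mu(F_n) = 0$ for all $n$ then $\mu(\bigcup_n F_n) = 0$.
(2) If $\mu(F_n) = 1$ for all $n$ then $\mu(\bigcap_n F_n) = 1$.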

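4. Events and random variables

Given a measurable space $(\Omega, \mathcal{F})$, an event $A$ is an element of $\mathcal{F}$. We sometimes call events measurable sets. A sub-sigma-algebra of $\mathcal{F}$ is a subset of $\mathcal{F}$ that is also a sigma-algebra.

Given another measurable space $(\Theta, \mathcal{G})$, a function $f \colon \Omega \to \Theta$ is measurable if for all $A \in \mathcal{G}$ it holds that $f^{-1}(A) \in \mathcal{F}$.

Exercise 4.1. Prove that $f$ is measurable iff the collection
$$(4.1) \qquad \sigma(f) = \{f^{-1}(A) : A \in \mathcal{G}\} = f^{-1}(\mathcal{G})$$
is a sub-sigma-algebra of $\mathcal{F}$.

Hence (assuming $f$ is onto, otherwise restrict to its image), $f^{-1} \colon \mathcal{G} \to \sigma(f)$ is an isomorphism of sigma-algebras.

Fix a measurable space $(\Omega, \mathcal{F})$, and let $f$ be a measurable function to some other measurable space. Given a sub-sigma-algebra $\mathcal{G} \subseteq \mathcal{F}$, we say that $f$ is $\mathcal{G}$-measurable if $\sigma(f)$ is a sub-sigma-algebra of $\mathcal{G}$.

We say that a sigma-algebra $\mathcal{F}$ is separable if it is generated by a countable subset. That is, if there exists some countable $\mathcal{C} \subseteq \mathcal{F}$ such that $\mathcal{F} = \sigma(\mathcal{C})$. We say that $\mathcal{F}$ separates points if for all $\omega_1 \neq \omega_2$ there exists some $A \in \mathcal{F}$ such that $\omega_1 \in A$ and $\omega_2 \notin A$.

Theorem 4.2. Let $(\Omega, \mathcal{F})$, $(\Theta_1, \mathcal{G}_1)$ and $(\Theta_2, \mathcal{G}_2)$ be measurable spaces with sigma-algebras that separate points. Let $f \colon \Omega \to \Theta_1$ and $g \colon \Omega \to \Theta_2$ be measurable functions. Then $g$ is $\sigma(f)$-measurable iff there exists a measurable $h \colon \Theta_1 \to \Theta_2$ such that $g = h \circ f$.

Exercise 4.3. Prove this for the case that $g = h \circ f$.

Measurable functions to $(\mathbb{R}, \mathcal{B})$ will be of particular interest.

Claim 4.4. Let $(\Omega, \mathcal{F})$ be a measurable space, and let $f \colon \Omega \to \mathbb{R}$. Then
(1) If $\mathcal{C} \subseteq \mathcal{B}$ satisfies $\sigma(\mathcal{C}) = \mathcal{B}$, and if $f^{-1}(A) \in \mathcal{F}$ for all $A \in \mathcal{C}$, then $f$ is measurable.
(2) For each $x \in \mathbb{R}$ let $A_x \subseteq \Omega$ be given by $A_x = \{\omega : f(\omega) \le x\}$. If each $A_x$ is in $\mathcal{F}$ then $f$ is measurable.
(3) If $\Omega$ is a topological space with Borel sigma-algebra $\mathcal{F}$, and if $f$ is continuous, then it is measurable.
(4) If $g$ is a measurable function from $(\mathbb{R}, \mathcal{B})$ to itself and $f$ is measurable then $g \circ f$ is measurable.

Claim 4.5. Let $(\Omega, \mathcal{F})$ be a measurable space, and let $\{f_n\}$ be a sequence of measurable functions to $(\mathbb{R}, \mathcal{B})$ with $0 \le f_n \le 1$ for all $n$. Then the following are measurable: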

(1) $\inf_n f_n$.
(2) $\liminf_n f_n$.
(3) The set $\{\omega : \lim_n f_n(\omega) \text{ exists}\}$.

Claim 4.6. The measurable functions $(\Omega, \mathcal{F}) \to (\mathbb{R}, \mathcal{B})$ are a vector space over the reals:
(1) If $f$ is measurable then $\lambda f$ is measurable, for all $\lambda \in \mathbb{R}$.
(2) If $f_1$ and $f_2$ are measurable, then $f_1 + f_2$ is measurable.

Given a probability space $(\Omega, \mathcal{F}, \mu)$ and a measurable space $(\Theta, \mathcal{G})$, we say that two measurable functions $f, g \colon \Omega \to \Theta$ are equivalent if $\mu(\{\omega : f(\omega) = g(\omega)\}) = 1$. A random variable is an equivalence class of measurable functions. We will often consider the case that $(\Theta, \mathcal{G}) = (\mathbb{R}, \mathcal{B})$, in which case we will call $X$ a real random variable. In fact, we will do this so often that we will often refer to real random variables as just random variables.

A few notes:
(1) We will often just think of random variables as measurable functions. We will say, for example, that a real random variable is non-negative, by which we will mean that there is a non-negative function in the equivalence class. We will also define random variables by just describing one element of the equivalence class.
(2) It is easy to verify that sums, products, limits etc. of random variables are well defined, in the sense that (for example) the equivalence class of $f + g$ is equal to the equivalence class of $f' + g'$ whenever $f$ and $f'$ are equivalent and $g$ and $g'$ are equivalent.
(3) We will later need to verify that the expectation of a random variable is well defined, i.e., is independent of the choice of representative.

Example 4.7. Let $\Omega = \{0, 1\}^{\mathbb{N}}$, and let $P$ be the Bernoulli measure defined in Example 3.2. Define the random variable $X \colon \Omega \to \mathbb{R}$ by
$$X(\omega) = \max\{n \in \mathbb{N} : \omega_k = 0 \text{ for all } k \le n\}.$$
Note that $X$ is not well defined at a single point in $\Omega$, the all zeros sequence. We accordingly extend $\mathbb{R}$ to include $\infty$ (and $-\infty$) and assign $X(\omega) = \infty$ in this case.

Given a random variable $X \colon \Omega \to \Theta$, we define the pushforward measure $\nu = X_* \mu$ on $(\Theta, \mathcal{G})$ by
$$\nu(A) = \mu\big(X^{-1}(A)\big).$$

The measure $\nu$ is also called the law of $X$. When $\Theta = \mathbb{R}$ we define the cumulative distribution function $F \colon \mathbb{R} \to [0, 1]$ of $X$ by
$$F(x) = \nu((-\infty, x]) = \mu(\{\omega : X(\omega) \le x\}).$$
As we noted in Example 3.5, $\nu$ is uniquely determined by $F$.

Exercise 4.8. Calculate the cumulative distribution function of the random variable defined in Example 4.7.

5. Independence and the Borel-Cantelli Lemmas

Let $(\Omega, \mathcal{F}, P)$ be a probability space. Let $(\mathcal{F}_1, \mathcal{F}_2, \ldots)$ be sub-sigma-algebras. We say that these sigma-algebras are independent if for any $(A_1, A_2, \ldots)$ with $A_n \in \mathcal{F}_n$ and any finite sequence $(n_k)$ of distinct indices it holds that
$$(5.1) \qquad P\Big[\bigcap_k A_{n_k}\Big] = \prod_k P[A_{n_k}].$$
We say that the random variables $(X_1, X_2, \ldots)$ are independent if $(\sigma(X_1), \sigma(X_2), \ldots)$ are independent. We say that the events $(A_1, A_2, \ldots)$ are independent if their indicator functions $(1_{A_1}, 1_{A_2}, \ldots)$ are independent. Note that $\sigma(1_A) = \{\emptyset, A, A^c, \Omega\}$.

Claim 5.1. Let the events $(A_1, A_2, \ldots)$ be independent. Then
$$P\Big[\bigcap_n A_n\Big] = \prod_n P[A_n].$$

Proof. By independence we have that for any $m \in \mathbb{N}$
$$P\Big[\bigcap_{n=1}^{m} A_n\Big] = \prod_{n=1}^{m} P[A_n].$$
Denote $B_m = \bigcap_{n=1}^{m} A_n$. Then $B_m$ is a decreasing sequence with $\bigcap_m B_m = \bigcap_n A_n$, and so by Theorem 3.6 we have that
$$P\Big[\bigcap_n A_n\Big] = P\Big[\bigcap_m B_m\Big] = \lim_m P[B_m] = \lim_m \prod_{n=1}^{m} P[A_n] = \prod_{n=1}^{\infty} P[A_n].$$

It turns out that to prove independence it suffices to show (5.1) for generating π-systems. The proof is by Carathéodory's Theorem.

Theorem 5.2. Let $(X_1, X_2, \ldots)$ be a sequence of independent real random variables, each with the distribution $P[X_n > x] = e^{-x}$ for $x \ge 0$. Let
$$L = \limsup_n \frac{X_n}{\log n}.$$
Then $P[L = 1] = 1$.

To prove this theorem we will need the Borel-Cantelli Lemmas.

Lemma 5.3 (Borel-Cantelli Lemmas). Let $(\Omega, \mathcal{F}, P)$ be a probability space, and let $(A_1, A_2, \ldots)$ be a sequence of events.
(1) If $\sum_n P[A_n] < \infty$ then
$$P[\omega \in \Omega : \omega \in A_n \text{ for infinitely many } n] = 0.$$

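(2) If $\sum_n P[A_n] = \infty$ and $(A_1, A_2, \ldots)$ are independent then
$$P[\omega \in \Omega : \omega \in A_n \text{ for infinitely many } n] = 1.$$

To see why independence is needed for the second part, consider the case that all the events $A_n$ are equal to some event $A$ with $0 < P[A] < 1$.

Proof of Lemma 5.3. (1) Note that
$$\{\omega : \omega \in A_n \text{ for infinitely many } n\} = \bigcap_n \bigcup_{m \ge n} A_m.$$
Let $B_n = \bigcup_{m \ge n} A_m$, so that we want to show that $P[\bigcap_n B_n] = 0$. Note that $B_n$ is a decreasing sequence (i.e., if $n' > n$ then $B_{n'} \subseteq B_n$) and therefore by Theorem 3.6 we have that
$$P\Big[\bigcap_n B_n\Big] = \lim_n P[B_n].$$
Since $B_n = \bigcup_{m \ge n} A_m$, we have that $P[B_n] \le \sum_{m \ge n} P[A_m]$. But the latter converges to 0, and so we are done.

(2) Note that
$$\{\omega : \omega \in A_n \text{ for infinitely many } n\}^c = \{\omega : \omega \in A_n \text{ for finitely many } n\}$$
$$= \{\omega : \omega \in A_n^c \text{ for all } n \text{ large enough}\} = \bigcup_n \bigcap_{m \ge n} A_m^c.$$
We would hence like to show that $P[\bigcup_n \bigcap_{m \ge n} A_m^c] = 0$. Let $C_n = \bigcap_{m \ge n} A_m^c$. Then by independence and Claim 5.1 we have that
$$P[C_n] = P\Big[\bigcap_{m \ge n} A_m^c\Big] = \prod_{m \ge n} (1 - P[A_m]).$$
Since $1 - x \le e^{-x}$ this implies that
$$P[C_n] \le \exp\Big(-\sum_{m \ge n} P[A_m]\Big) = 0.$$
Finally, by Corollary 3.7, $P[\bigcup_n C_n] = 0$.

Proof of Theorem 5.2. Let $A_n$ be the event that $X_n \ge \alpha \log n$. Then $P[A_n] = n^{-\alpha}$, and the events $(A_1, A_2, \ldots)$ are independent (exercise!). Also, note that
$$\sum_n P[A_n] \begin{cases} = \infty & \text{if } \alpha \le 1, \\ < \infty & \text{if } \alpha > 1. \end{cases}$$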

Thus, from the Borel-Cantelli Lemmas it follows that
$$P[X_n \ge \alpha \log n \text{ for infinitely many } n] = \begin{cases} 1 & \text{if } \alpha \le 1, \\ 0 & \text{if } \alpha > 1. \end{cases}$$
Now, note that the event $\{L \ge \alpha\}$ is identical to the event
$$\bigcap_{m > 0} \{X_n \ge (\alpha - 1/m) \log n \text{ for infinitely many } n\},$$
and so $P[L \ge 1] = 1$, by Corollary 3.7. It also follows that $P[L \ge 1 + 1/m] = 0$ for any $m > 0$, and so we have that $P[L > 1] = 0$, again by Corollary 3.7. Hence $P[L \le 1] = 1$, and so $P[L = 1] = 1$.
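Theorem 5.2 can be explored numerically by drawing independent exponential random variables and tracking $X_n / \log n$ far out in the sequence. The sketch below is a simulation, not a proof, and rests on assumptions: the window `[1_000, 100_000]`, the seed, and the function name `tail_sup` are arbitrary choices, and a finite window can only hint at a lim sup.

```python
import math
import random

def tail_sup(start, stop, seed=0):
    """Largest value of X_n / log(n) over start <= n <= stop, for i.i.d. Exp(1) draws X_n."""
    rng = random.Random(seed)
    best = float("-inf")
    for n in range(2, stop + 1):      # start at n = 2 so that log(n) > 0
        x = rng.expovariate(1.0)      # P[X > x] = exp(-x), as in Theorem 5.2
        if n >= start:
            best = max(best, x / math.log(n))
    return best

estimate = tail_sup(1_000, 100_000)
print(estimate)
```

Exceedances of $\alpha \log n$ with $\alpha$ slightly above 1 still occur occasionally in any finite window, so the printed value is expected to sit somewhat above 1; by the first Borel-Cantelli Lemma such exceedances become rarer as the window moves further out.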

6. The tail sigma-algebra

Consider a sequence of independent real random variables $(X_1, X_2, \ldots)$ such that there exists some $M \ge 0$ such that $P[|X_n| \le M] = 1$ for all $n$. That is, the sequence is uniformly bounded. Define the random variables
$$Y_n = \frac{1}{n} \sum_{k=1}^{n} X_k \quad \text{and} \quad L = \limsup_n Y_n.$$

Claim 6.1. $P[|L| \le M] = 1$.

Proof. Clearly $P[|Y_n| \le M] = 1$. Hence $P[|Y_n| \le M \text{ for all } n] = 1$, and thus $P[|L| \le M] = 1$.

Define the event $A = \{\lim_n Y_n \text{ exists}\}$.

Theorem 6.2. There exists some $c \in [-M, M]$ such that $P[L = c] = 1$, and $P[A] \in \{0, 1\}$.

An interesting observation is that $L$ is independent of $X_1$. To see this, define
$$L' = \limsup_n \frac{1}{n} \sum_{k=2}^{n} X_k,$$
which is clearly independent of $X_1$. But
$$L = \limsup_n \frac{1}{n} \sum_{k=1}^{n} X_k = \limsup_n \Big( \frac{X_1}{n} + \frac{1}{n} \sum_{k=2}^{n} X_k \Big) = L'.$$
In fact, by the same argument, $L$ is independent of $(X_1, X_2, \ldots, X_n)$ for any $n$. This makes $L$ a tail random variable, as we now explain.

For each $n \in \mathbb{N}$ define the sigma-algebra $\mathcal{T}_n$ by $\sigma(X_n, X_{n+1}, \ldots)$, which is the smallest sigma-algebra that contains $(\sigma(X_n), \sigma(X_{n+1}), \ldots)$. Define the tail sigma-algebra by $\mathcal{T} = \bigcap_n \mathcal{T}_n$. A random variable is a tail random variable if it is $\mathcal{T}$-measurable.

Claim 6.3. $L$ is a tail random variable.

Proof. Using a construction similar to the $L'$ construction above, it is easy to see that for every $n$ there exists a function $f_n$ such that $L = f_n(X_n, X_{n+1}, \ldots)$. It follows that $L$ is $\mathcal{T}_n$-measurable. Thus for every Borel set $A$ it holds that $L^{-1}(A) \in \mathcal{T}_n$, for every $n$. Thus $L^{-1}(A) \in \bigcap_n \mathcal{T}_n = \mathcal{T}$.

Let $(Z_1, Z_2, \ldots)$ be i.i.d. random variables, each distributed uniformly over the set of symbols $S = \{a, b, c\}$. Let $S^*$ be the set of finite strings over $S$, and define the random variables $W_n$ taking values in $S^*$ as follows: $W_1 = Z_1$. If $W_n$ is empty, or if the last symbol in $W_n$ is different than $Z_{n+1}$, then $W_{n+1}$ is the concatenation $W_n Z_{n+1}$. If the last symbol in $W_n$ is $Z_{n+1}$ then $W_{n+1}$ is equal to $W_n$, with this last symbol removed.

We will prove later in the course that with probability one it holds that $\lim_n |W_n| = \infty$, and hence we can define the random variable $T$ to be the eventual first symbol in all $W_n$ with $n$ high enough. It is immediate that $T$ is measurable in the tail sigma-algebra of the sequence $(W_1, W_2, \ldots)$. It is also easy to see that $P[T = a] = 1/3$, since by the symmetry of the definitions, $P[T = a] = P[T = b] = P[T = c]$, and these must sum to one. By the same argument, the probability that $W_n$ starts with some string $w$ for all $n$ high enough is [...].

Theorem 6.4 (Kolmogorov's Zero-One Law). Let $\mathcal{T}$ be the tail sigma-algebra of a sequence of independent random variables. Then $P[A] \in \{0, 1\}$ for any $A \in \mathcal{T}$.

Before proving this theorem we will prove a lemma.

Lemma 6.5. Let the event $A$ be independent of itself. Then $P[A] \in \{0, 1\}$.

Proof. $P[A] = P[A \cap A] = P[A] \cdot P[A]$.

Proof of Theorem 6.4. Let $\mathcal{G}_n = \sigma(X_1, \ldots, X_{n-1})$, $\mathcal{T}_n = \sigma(X_n, X_{n+1}, \ldots)$ and $\mathcal{T} = \bigcap_n \mathcal{T}_n$. We first claim that $\mathcal{G}_n$ and $\mathcal{T}_n$ are independent. To see this, define $\mathcal{T}_n^m = \sigma(X_n, \ldots, X_{n+m})$, and note that $\mathcal{T}_n^m$ and $\mathcal{G}_n$ are independent, and so $P[A \cap B] = P[A] \cdot P[B]$ for any $A \in \mathcal{G}_n$ and any $B \in \mathcal{T}_n^m$. Now $\mathcal{C}_n = \bigcup_m \mathcal{T}_n^m$ is not a sigma-algebra, but it is a π-system. Since $P[A \cap B] = P[A] \cdot P[B]$ for any $A \in \mathcal{G}_n$ and any $B \in \mathcal{C}_n$, it follows that $\mathcal{G}_n$ and $\sigma(\mathcal{C}_n) = \mathcal{T}_n$ are independent.

Since $\mathcal{T} \subseteq \mathcal{T}_n$, $\mathcal{G}_n$ and $\mathcal{T}$ are independent. Hence $\mathcal{T}$ is independent of $\sigma(\bigcup_n \mathcal{G}_n) = \sigma(\bigcup_n \sigma(X_n)) = \sigma(X_1, X_2, \ldots)$. Since $\mathcal{T} \subseteq \sigma(X_1, X_2, \ldots)$ it follows that $\mathcal{T}$ is independent of $\mathcal{T}$, and so $P[A] \in \{0, 1\}$ for any $A \in \mathcal{T}$.

Proof of Theorem 6.2. Since $A$ is a tail event, $P[A] \in \{0, 1\}$.

For any $q \in \mathbb{Q}$ define the tail event $A_q = \{L \ge q\}$. By Kolmogorov's zero-one law, the probability of each of these is either 0 or 1, and so there is some
$$c = \sup\{q : P[A_q] = 1\} = \inf\{q : P[A_q] = 0\}.$$
Since $\mathbb{Q}$ is countable, $P[L \ge c] = P[L \le c] = 1$, and so $P[L = c] = 1$. Finally, $c \in [-M, M]$, since $P[L \in [-M, M]] = 1$.

7. Expectations

Let $(\Omega, \mathcal{F}, P)$ be a probability space. Choose events $(A_1, \ldots, A_k)$ and non-negative numbers $(x_1, \ldots, x_k)$, and let $\tilde{f} = \sum_{n=1}^{k} x_n 1_{A_n}$. We call such a measurable function simple, and define its expectation $E[\tilde{f}]$ by
$$E[\tilde{f}] = \sum_{n=1}^{k} x_n \cdot P[A_n].$$
Note that one needs to check that $E[\cdot]$ is well defined, as there might be more than one way to write a simple function as a finite sum of indicators.

Given a (non-simple) non-negative real function $f$, we define its expectation by
$$E[f] = \sup\big\{E[\tilde{f}] : \tilde{f} \text{ is simple and } \tilde{f} \le f\big\}.$$
Note that this supremum may be infinite.

It is straightforward to verify that for any non-negative functions $f, g$ such that $P[f = g] = 1$ it holds that $E[f] = E[g]$. We can therefore define the expectation of a random variable $X$ as the expectation of any $f$ in the equivalence class. We will henceforth consider expectations of random variables.

It is likewise straightforward to verify that for any two non-negative random variables $X, Y$:
- Linearity of expectation: for any $\lambda > 0$ it holds that $E[X + \lambda Y] = E[X] + \lambda E[Y]$.
- If $X \le Y$ then $E[X] \le E[Y]$.

Theorem 7.1 (Markov's Inequality). If $X$ is a non-negative random variable with $E[X] < \infty$ then for every $\lambda > 0$
$$P[X \ge \lambda] \le \frac{E[X]}{\lambda}.$$

Proof. Let $A = \{X \ge \lambda\}$, and let $Y$ be given by
$$Y(\omega) = \lambda \cdot 1_A(\omega) = \begin{cases} \lambda & \text{if } \omega \in A, \\ 0 & \text{otherwise.} \end{cases}$$
Then $Y \le X$, and so $E[Y] \le E[X]$. Since $E[Y] = \lambda \cdot P[A]$, we get that $\lambda \cdot P[X \ge \lambda] \le E[X]$, and the claim follows by dividing both sides by $\lambda$.
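Markov's inequality can be checked exactly on a small discrete distribution. The following sketch uses exact rational arithmetic; the probability mass function `pmf` and the thresholds are made-up illustrative values, not taken from the notes.

```python
from fractions import Fraction

# Hypothetical non-negative random variable X, given by its probability mass function.
pmf = {0: Fraction(1, 2), 1: Fraction(1, 4), 4: Fraction(1, 8), 10: Fraction(1, 8)}

def expectation(pmf):
    """E[X] = sum of x * P[X = x]."""
    return sum(x * p for x, p in pmf.items())

def tail(pmf, lam):
    """P[X >= lam]."""
    return sum(p for x, p in pmf.items() if x >= lam)

mean = expectation(pmf)
# Map each threshold lam to the pair (P[X >= lam], E[X] / lam).
checks = {lam: (tail(pmf, lam), mean / lam) for lam in (1, 2, 4, 10)}
for lam, (prob, bound) in checks.items():
    print(lam, prob, bound, prob <= bound)
```

For this distribution the bound is far from tight at every threshold, which is typical: Markov's inequality uses only the mean, so it cannot see how the mass is spread.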

Consider the non-negative random variables $(X_1, X_2, \ldots)$ defined on the interval $(0, 1]$ (equipped with the Borel sigma-algebra and Lebesgue measure) which are given by
$$X_n(x) = \begin{cases} n & \text{if } x \le 1/n, \\ 0 & \text{otherwise.} \end{cases}$$
Then
(1) $E[X_n] = 1$.
(2) For every $x \in (0, 1]$ it holds that $\lim_n X_n(x) = X(x)$, where $X$ is the constant function $X(x) = 0$.
(3) $\lim_n E[X_n] \neq E[X]$.

Hence it is not necessarily true that if $X_n \to X$ then $E[X_n] \to E[X]$.

Theorem 7.2 (Monotone Convergence Theorem). Let $(\Omega, \mathcal{F}, P)$ be a probability space, and let $(X_1, X_2, \ldots)$ be a sequence of non-negative random variables such that $X_n(\omega)$ is increasing in $n$ for every $\omega \in \Omega$. Let $X(\omega) = \lim_n X_n(\omega) \in [0, \infty]$. Then
$$\lim_n E[X_n] = E[X] \in [0, \infty].$$

Theorem 7.3 (Dominated Convergence Theorem). Let $(\Omega, \mathcal{F}, P)$ be a probability space, and let $(X_1, X_2, \ldots)$ be a sequence of non-negative random variables. Let $X, Y$ be non-negative random variables with $E[Y] < \infty$, and such that $\lim_n X_n(\omega) = X(\omega)$ for every $\omega \in \Omega$, and $X_n(\omega) \le Y(\omega)$ for every $\omega \in \Omega$ and $n \in \mathbb{N}$. Then
$$\lim_n E[X_n] = E[X].$$

Given a random variable $X$, we define the random variables $X^+$ and $X^-$ by
$$X^+(\omega) = \max\{X(\omega), 0\} \quad \text{and} \quad X^-(\omega) = \max\{-X(\omega), 0\},$$
so that $X^+$ and $X^-$ are both non-negative, and $X = X^+ - X^-$. If $E[X^+]$ and $E[X^-]$ are both finite, we define
$$E[X] = E[X^+] - E[X^-],$$
and say that $X \in L^1(\Omega, \mathcal{F}, P)$, or just $X \in L^1$. Note that $X \in L^1$ iff $E[|X|] < \infty$ iff $|X| \in L^1$.

For $p \ge 1$ we say that $X \in L^p$ if $|X|^p \in L^1$.

Exercise 7.4. Show that $L^p$ is a vector space, and that $X \mapsto E[|X|^p]^{1/p}$ defines a norm on $L^p$.

Theorem 7.5. If $r > p \ge 1$ and $X \in L^r$ then $X \in L^p$ and $E[|X|^r]^{1/r} \ge E[|X|^p]^{1/p}$.

In fact, if we equip $L^p$ with this norm, then it is a Banach space; that is, it is complete with respect to the metric induced by this norm.

Theorem 7.6. Let $(X_1, X_2, \ldots)$ be a sequence of random variables in $L^p$ such that
$$\lim_r \sup_{m, n \ge r} E[|X_n - X_m|^p] = 0.$$
Then there exists an $X \in L^p$ such that
$$\lim_n E[|X_n - X|^p] = 0.$$

A particularly interesting case is $p = 2$. In this case we can define an inner product $(X, Y) := E[X \cdot Y]$, which makes $L^2$ a Hilbert space, with completeness given by Theorem 7.6.

Theorem 7.7. Let $X, Y \in L^2$. Then $X \cdot Y \in L^1$.

Proof. Note first that $|X|, |Y| \in L^2$. Since $L^2$ is a vector space, $E[(|X| + |Y|)^2] < \infty$, and so
$$E\big[|X|^2 + 2 |X| \cdot |Y| + |Y|^2\big] < \infty.$$
By the linearity of expectation
$$E\big[(|X| + |Y|)^2\big] = E[X^2] + 2 E[|X| \cdot |Y|] + E[Y^2],$$
and so we have that $E[|X| \cdot |Y|] < \infty$. Now, $E[|X \cdot Y|] = E[|X| \cdot |Y|]$, and so $|X \cdot Y| \in L^1$. Finally, since $|X \cdot Y| = (X \cdot Y)^+ + (X \cdot Y)^-$ it follows that $E[(X \cdot Y)^{\pm}] < \infty$ and so $X \cdot Y \in L^1$.

It follows from Theorems 7.6 and 7.7 that $L^2$ is a real Hilbert space, when equipped with the inner product $(X, Y) := E[X \cdot Y]$. We can therefore immediately conclude that for any $X, Y \in L^2$
(1) $E[X \cdot Y]^2 \le E[X^2] \cdot E[Y^2]$, with equality iff for some $\lambda \in \mathbb{R}$ it a.s. holds that $X = \lambda Y$ or $Y = \lambda X$.
(2) $E[(X + Y)^2] = E[X^2] + E[Y^2]$ iff $E[X \cdot Y] = 0$.

Given $X \in L^2$, we define the random variable $\tilde{X} := X - E[X]$, and denote $\mathrm{Var}(X) = E[\tilde{X} \cdot \tilde{X}]$ and $\mathrm{Cov}(X, Y) = E[\tilde{X} \cdot \tilde{Y}]$. We say that $X$ and $Y$ are uncorrelated if $\mathrm{Cov}(X, Y) = 0$. Using these definitions the facts above become
(1) $\mathrm{Cov}(X, Y)^2 \le \mathrm{Var}(X) \cdot \mathrm{Var}(Y)$, with equality iff for some $\lambda \in \mathbb{R}$ it a.s. holds that $\tilde{X} = \lambda \tilde{Y}$ or $\tilde{Y} = \lambda \tilde{X}$.
(2) $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$ iff $X$ and $Y$ are uncorrelated.

8. A strong law of large numbers and the Chernoff bound

Theorem 8.1. Let $X, Y \in L^1$ be independent. Then $X \cdot Y \in L^1$ and $E[X \cdot Y] = E[X] \cdot E[Y]$.

To prove this, we first note that it holds for indicator functions by the definition of independence, then show that it holds for simple functions, and apply the monotone convergence theorem to show that it holds in general.

Theorem 8.2. Let $(X_1, X_2, \ldots)$ be a sequence of independent random variables uniformly bounded in $L^4$ (so that $E[X_n^4] < K$ for all $n$ and some $K > 0$), and with $E[X_n] = 0$. Let $Y_n = \frac{1}{n} \sum_{k=1}^{n} X_k$. Then $\lim_n Y_n = 0$ a.s.

Proof. By independence, for distinct indices,
$$E[X_k \cdot X_l^3] = E[X_k \cdot X_l^2 \cdot X_m] = E[X_k \cdot X_l \cdot X_m \cdot X_j] = 0,$$
and so, by linearity, we have that
$$E[Y_n^4] = \frac{1}{n^4} E\Big[\Big(\sum_{k=1}^{n} X_k\Big)^4\Big] = \frac{1}{n^4} \Big( \sum_{k} E[X_k^4] + 6 \sum_{k < l} E[X_k^2 \cdot X_l^2] \Big).$$
By Theorem 7.5 we have that $E[X_k^2]^2 < K$, and so, using independence once more, $E[X_k^2 \cdot X_l^2] = E[X_k^2] \cdot E[X_l^2] < K$. Hence
$$E[Y_n^4] \le \frac{K}{n^3} + \frac{6K}{n^2} \le \frac{7K}{n^2}.$$
It follows from Markov's inequality that for any $\varepsilon > 0$
$$P[Y_n^4 \ge \varepsilon^4] \le \frac{7K}{\varepsilon^4 n^2},$$
and so, by Borel-Cantelli, $\limsup_n |Y_n| \le \varepsilon$ for any $\varepsilon > 0$ (almost surely, which we drop for the remainder of the proof). Intersecting these probability one events for $\varepsilon = 1/2, 1/3, 1/4, \ldots$ yields that $\limsup_n |Y_n| = 0$ and thus $\lim_n Y_n = 0$.

With a little additional effort we can prove that if $E[X_n] = \mu$ then $\lim_n Y_n = \mu$.

A natural question is: what is the probability that $Y_n$ is significantly far from $\mu$, for finite $n$? For example, for $\eta > \mu$, what is the probability that $Y_n \ge \eta$?
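The convergence in Theorem 8.2 can be watched along a single simulated path. This is a rough sketch under assumptions: the $X_k$ are taken to be i.i.d. fair $\pm 1$ signs (bounded, hence in $L^4$, with mean zero), and the seed, checkpoints, and function name `running_averages` are arbitrary illustrative choices.

```python
import random

def running_averages(n_max, checkpoints, seed=0):
    """Return {n: Y_n} at the requested checkpoints, where Y_n is the mean of n fair +-1 signs."""
    rng = random.Random(seed)
    total = 0
    out = {}
    for n in range(1, n_max + 1):
        total += rng.choice((-1, 1))  # one draw of X_n
        if n in checkpoints:
            out[n] = total / n
    return out

averages = running_averages(100_000, {100, 1_000, 10_000, 100_000})
for n in sorted(averages):
    print(n, averages[n])
```

A single path proves nothing about almost sure convergence, but the averages should shrink roughly like $1/\sqrt{n}$, which is consistent with the fourth-moment bound $E[Y_n^4] \le 7K/n^2$ used in the proof.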

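Theorem 8.3 (Chernoff Bound). Let $(X_1, X_2, \ldots)$ be a sequence of i.i.d. random variables in $L^\infty$, with $E[X_n] = \mu$, and let $Y_n = \frac{1}{n} \sum_{k=1}^{n} X_k$. Then for every $\eta > \mu$ there is an $r > 0$ such that
$$P[Y_n \ge \eta] \le e^{-r \cdot n}.$$

Proof. Denote $p_n = P[Y_n \ge \eta]$; we want to show that $p_n \le e^{-r \cdot n}$. Note that the event $\{Y_n \ge \eta\}$ is identical to the event $\{e^{t \cdot n \cdot Y_n} \ge e^{t \cdot n \cdot \eta}\}$, for any $t > 0$. Since $e^{t \cdot n \cdot Y_n}$ is a positive random variable, by the Markov inequality we have that
$$p_n = P\big[e^{t \cdot n \cdot Y_n} \ge e^{t \cdot n \cdot \eta}\big] \le \frac{E[e^{t \cdot n \cdot Y_n}]}{e^{t \cdot n \cdot \eta}}.$$
Now,
$$E\big[e^{t \cdot n \cdot Y_n}\big] = E\Big[\prod_{k \le n} e^{t \cdot X_k}\Big] = \prod_{k \le n} E\big[e^{t \cdot X_k}\big],$$
where the last equality uses independence. Let $X$ be a random variable with the same distribution as each $X_k$. Then we have shown that
$$E\big[e^{t \cdot n \cdot Y_n}\big] = E\big[e^{t \cdot X}\big]^n.$$
We now define the moment generating function of $X$ by
$$M(t) := E\big[e^{tX}\big].$$
The name comes from the fact that
$$(8.1) \qquad M(t) = \sum_{n=0}^{\infty} \frac{t^n}{n!} E[X^n].$$
Note that this means that $M'(0) = E[X]$. Using $M$ we can write
$$E\big[e^{t \cdot n \cdot Y_n}\big] = M(t)^n,$$
and so
$$p_n \le \exp\big(-(t \cdot \eta - \log M(t)) \cdot n\big).$$
If we define the cumulant generating function of $X$ by
$$K(t) := \log M(t),$$
then
$$p_n \le \exp\big(-(t \cdot \eta - K(t)) \cdot n\big).$$
Since $K'(0) = M'(0)/M(0) = E[X] = \mu$, and since $K$ is smooth (as it turns out), it follows that for $t > 0$ small enough,
$$t \cdot \eta - K(t) = t \cdot \eta - t \cdot \mu - O(t^2) > 0.$$
Hence, if we define
$$r = \sup_{t > 0} \{t \cdot \eta - K(t)\}$$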

we get that $r > 0$ and $p_n \le e^{-r \cdot n}$.

Note that we did not really need $X_k$ to be in $L^\infty$, but only that it is in $L^1$ and that its moment (or cumulant) generating function is defined and smooth around zero.

Claim 8.4. Let $X \in L^1$ have a cumulant generating function $K$ that is well defined and finite for some $t > 0$. Then
$$P[X \ge a] \le e^{-t \cdot a + K(t)}.$$

Proof. By Markov's inequality
$$P[X \ge a] = P\big[e^{t \cdot X} \ge e^{t \cdot a}\big] \le \frac{E[e^{t \cdot X}]}{e^{t \cdot a}} = e^{-t \cdot a + K(t)}.$$

It turns out that the Chernoff bound is asymptotically tight. We show this later in the notes.
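The Chernoff bound can be compared with an exact tail probability in the simplest case. The sketch below assumes fair coin flips $X_k \in \{0, 1\}$, so that $\mu = 1/2$ and $K(t) = \log\big((1 + e^t)/2\big)$, and takes a hypothetical threshold $\eta = 3/5$; the maximizer $t^* = \log(\eta/(1 - \eta))$ is obtained by setting the derivative of $t\eta - K(t)$ to zero, a step that is specific to this example and not part of the notes.

```python
import math
from fractions import Fraction

def exact_tail(n, eta):
    """P[Y_n >= eta] for the mean Y_n of n fair coin flips, computed exactly."""
    k_min = math.ceil(eta * n)  # eta is a Fraction, so this rounding is exact
    return Fraction(sum(math.comb(n, k) for k in range(k_min, n + 1)), 2 ** n)

def chernoff_rate(eta):
    """r = sup_t {t * eta - K(t)} with K(t) = log((1 + e^t) / 2), evaluated at the maximizer."""
    e = float(eta)
    t_star = math.log(e / (1 - e))            # solves K'(t) = eta; positive when eta > 1/2
    K = math.log((1 + math.exp(t_star)) / 2)
    return t_star * e - K

eta = Fraction(3, 5)
r = chernoff_rate(eta)
rows = [(n, float(exact_tail(n, eta)), math.exp(-r * n)) for n in (10, 50, 200)]
for n, tail, bound in rows:
    print(n, tail, bound)
```

In every row the exact tail sits below $e^{-rn}$, as Theorem 8.3 guarantees. The gap between the two columns grows only polynomially in $n$ while both decay exponentially, which is the sense in which the bound is asymptotically tight.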

9. The weak law of large numbers

Theorem 9.1. Let $(X_1, X_2, \ldots)$ be a sequence of independent real random variables in $L^2$, let $E[X_n] = \mu$, $\mathrm{Var}(X_n) \le \sigma^2$, and let $Y_n = \frac{1}{n} \sum_{k \le n} X_k$. Then for every $\varepsilon > 0$ and $n \in \mathbb{N}$
$$P[|Y_n - \mu| \ge \varepsilon] \le \frac{\sigma^2}{n \varepsilon^2},$$
and in particular
$$\lim_n P[|Y_n - \mu| \ge \varepsilon] = 0.$$
In this case we say that $Y_n$ converges in probability to $\mu$. More generally, we say that a sequence of real random variables $Y_n$ converges in probability to a real random variable $Y$ if for every $\varepsilon > 0$
$$\lim_n P[|Y_n - Y| \ge \varepsilon] = 0.$$

Exercise 9.2. Does convergence in probability imply pointwise convergence? Does pointwise convergence imply convergence in probability?

To prove this theorem we will need Chebyshev's inequality, which is just Markov's inequality in disguise.

Lemma 9.3 (Chebyshev's Inequality). For every $X \in L^2$ and for every $\lambda > 0$ it holds that
$$P[|X - E[X]| \ge \lambda] \le \frac{\mathrm{Var}(X)}{\lambda^2}.$$

Proof of Theorem 9.1. Note that $E[Y_n] = \mu$, and that, by independence,
$$\mathrm{Var}(Y_n) = \mathrm{Var}\Big(\frac{1}{n} \sum_{k \le n} X_k\Big) = \frac{1}{n^2} \mathrm{Var}\Big(\sum_{k \le n} X_k\Big) = \frac{1}{n^2} \sum_{k \le n} \mathrm{Var}(X_k) \le \frac{\sigma^2}{n}.$$
Hence Chebyshev's inequality yields that for every $\varepsilon > 0$ we have that
$$P[|Y_n - \mu| \ge \varepsilon] \le \frac{\sigma^2}{n \varepsilon^2}.$$

We can relax the assumption $X_n \in L^2$ to $X_n \in L^1$ and still prove the weak law of large numbers. In fact, even the strong law holds in this setting (for i.i.d. random variables), but we will leave the proof of that for after we prove the Ergodic Theorem.
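The bound of Theorem 9.1 can be compared with exact deviation probabilities for fair coin flips. This sketch assumes $X_k \in \{0, 1\}$ uniform, so that $\mu = 1/2$ and $\sigma^2 = 1/4$, and a hypothetical tolerance $\varepsilon = 1/10$; the function name `deviation_probability` is illustrative.

```python
import math
from fractions import Fraction

def deviation_probability(n, eps):
    """Exact P[|Y_n - 1/2| >= eps] for the mean Y_n of n fair coin flips."""
    half = Fraction(1, 2)
    favorable = sum(
        math.comb(n, k) for k in range(n + 1) if abs(Fraction(k, n) - half) >= eps
    )
    return Fraction(favorable, 2 ** n)

eps = Fraction(1, 10)
sigma2 = Fraction(1, 4)  # variance of a single fair coin flip
# Each row: (n, exact deviation probability, Chebyshev bound sigma^2 / (n * eps^2)).
table = [(n, deviation_probability(n, eps), sigma2 / (n * eps ** 2)) for n in (10, 100, 1000)]
for n, prob, bound in table:
    print(n, float(prob), float(bound))
```

For $n = 10$ the bound exceeds 1 and says nothing, while for larger $n$ the exact probability falls much faster than the $1/n$ rate of the bound, in line with the exponential decay given by the Chernoff bound.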

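Theorem 9.4. Let $(X_1, X_2, \ldots)$ be a sequence of i.i.d. real random variables in $L^1$, let $E[X_n] = \mu$, and let $Y_n = \frac{1}{n} \sum_{k \le n} X_k$. Then for every $\varepsilon > 0$
$$\lim_n P[|Y_n - \mu| \ge \varepsilon] = 0.$$

We show a proof adapted from [5].

Proof. We assume $\mu = 0$; the reduction is straightforward. Let $X = X_1$.

For $N \in \mathbb{N}$, and a r.v. $X$, denote $X^{\le N} = X \cdot 1_{\{|X| \le N\}}$ and $X^{>N} = X \cdot 1_{\{|X| > N\}}$, so that $X = X^{\le N} + X^{>N}$. By the Dominated Convergence Theorem, as $N \to \infty$,
$$(9.1) \qquad E\big[|X^{>N}|\big] \to 0 \quad \text{and} \quad E\big[X^{\le N}\big] \to E[X] = 0,$$
since both are dominated by $|X|$.

Fix $\varepsilon, \delta > 0$. To prove the claim (under our assumption that $\mu = 0$) we show that $P[|Y_n| \ge \varepsilon] < \delta$ for all $n$ large enough. For any $N \in \mathbb{N}$ we can write $Y_n$ as
$$Y_n = \frac{1}{n} \sum_{k \le n} \big(X_k^{\le N} + X_k^{>N}\big) = Y_n^{\le} + Y_n^{>},$$
where
$$Y_n^{\le} := \frac{1}{n} \sum_{k \le n} X_k^{\le N} \quad \text{and} \quad Y_n^{>} := \frac{1}{n} \sum_{k \le n} X_k^{>N}.$$
Note that $Y_n^{\le}$ is not the same as $Y_n^{\le N}$; we will not need the latter. Likewise, $Y_n^{>}$ is not the same as $Y_n^{>N}$.

Choose $N$ large enough so that $E[|X^{>N}|] < \varepsilon \cdot \delta / 4$ and $|E[X^{\le N}]| < \varepsilon / 4$; this is possible by (9.1). Now,
$$E\big[|Y_n^{>}|\big] = E\Big[\Big|\frac{1}{n} \sum_{k \le n} X_k^{>N}\Big|\Big] \le E\Big[\frac{1}{n} \sum_{k \le n} \big|X_k^{>N}\big|\Big] = E\big[|X^{>N}|\big] < \varepsilon \cdot \delta / 4.$$
Therefore, by Markov's inequality, we have that
$$P\big[|Y_n^{>}| \ge \varepsilon/2\big] < \delta/2.$$
Since $X_k^{\le N}$ is bounded it is in $L^2$. Therefore, by independence,
$$\mathrm{Var}\big(Y_n^{\le}\big) = \frac{\mathrm{Var}\big(X^{\le N}\big)}{n} \le \frac{N^2}{n}.$$
By linearity of expectation $E[Y_n^{\le}] = E[X^{\le N}]$, which is smaller than $\varepsilon/4$ in absolute value by our choice of $N$. It thus follows from Chebyshev's inequality that for $n$ large enough
$$P\big[|Y_n^{\le}| \ge \varepsilon/2\big] < \delta/2.$$
Since
$$P[|Y_n| \ge \varepsilon] \le P\big[|Y_n^{\le}| \ge \varepsilon/2 \text{ or } |Y_n^{>}| \ge \varepsilon/2\big],$$
the claim follows by the union bound.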

10. Conditional expectations

10.1. Why things are not as simple as they seem. Consider a point chosen uniformly from the surface of the (idealized, spherical) earth, so that the probability of falling on a set is proportional to its area. Say we condition on the point falling on the equator. What is the conditional distribution? It obviously has to be uniform: by symmetry, there cannot be a reason that it is more likely to be in one time zone than another.

Say now that we condition on the point falling on a particular meridian $m$. By the same reasoning, the conditional distribution is uniform, and so, for example, the probability that we are within 2 meters of the north pole is the same as the probability that we are within 1 meter from the equator. Integrating over $m$ we get that regardless of the meridian, the probability of being within 2 meters of the north pole is the same as the probability of being within 1 meter of the equator. But the area within 2 meters of the north pole is about $4\pi\ \mathrm{m}^2$, whereas the area within 1 meter of the equator is about $80{,}000{,}000\ \mathrm{m}^2$ (a band 2 meters wide along an equator of roughly 40,000 km).

10.2. Conditional expectations in finite spaces. Consider a probability space $(\Omega, \mathcal{F}, P)$ with $|\Omega| < \infty$, $\mathcal{F} = 2^{\Omega}$, and $P[\omega] > 0$ for all $\omega \in \Omega$. Let $\Omega = \{1, \ldots, n\}^2$, let $Y$ be the random variable given by $Y(\omega_1, \omega_2) = \omega_1$, and let $\mathcal{G} = \sigma(Y)$ be the sigma-algebra generated by the sets $A_k = \{k\} \times \{1, \ldots, n\}$.

Let $X$ be a real random variable. Then the usual definition is that $E[X|Y]$ is the random variable $\Omega \to \mathbb{R}$ given by
$$E[X|Y](\omega) = \frac{\sum_{\omega' \in Y^{-1}(Y(\omega))} X(\omega') P[\omega']}{\sum_{\omega' \in Y^{-1}(Y(\omega))} P[\omega']}.$$
This notation can be confusing: $E[X|Y]$ is a random variable and not a number! Indeed, given $A \in \mathcal{F}$ with $P[A] > 0$, we denote by $E[X|A]$ the number
$$E[X|A] = \frac{1}{P[A]} E[X \cdot 1_A].$$

Exercise 10.1.
(1) $E[X|Y] = \operatorname{argmin}_{Z \in L^2(\Omega, \mathcal{G}, P)} E[(X - Z)^2]$.
(2) $E[X|Y]$ is $\mathcal{G}$-measurable.
(3) If $A \in \mathcal{G}$ with $P[A] > 0$ then $E[X \cdot 1_A] = E[E[X|Y] \cdot 1_A]$.

10.3. Conditional expectations in $L^2$. Fix a probability space $(\Omega, \mathcal{F}, P)$. Given a sub-sigma-algebra $\mathcal{G} \subseteq \mathcal{F}$, we know by Theorem 7.6 that the

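subspace $L^2(\Omega, \mathcal{G}, \mathbb{P}) \subseteq L^2(\Omega, \mathcal{F}, \mathbb{P})$ is closed. We can therefore define the projection operator
\[ P_{\mathcal{G}} : L^2(\Omega, \mathcal{F}, \mathbb{P}) \to L^2(\Omega, \mathcal{G}, \mathbb{P}) \]
by
\[ P_{\mathcal{G}}(X) = \operatorname{argmin}_{Y \in L^2(\Omega, \mathcal{G}, \mathbb{P})} \mathbb{E}[(X - Y)^2]. \]
Some immediate observations:
(1) $P_{\mathcal{G}}(X)$ is $\mathcal{G}$-measurable.
(2) If $Y \in L^2(\Omega, \mathcal{G}, \mathbb{P})$ then $\mathbb{E}[(X - P_{\mathcal{G}}(X)) \cdot Y] = 0$, or $\mathbb{E}[X \cdot Y] = \mathbb{E}[P_{\mathcal{G}}(X) \cdot Y]$. Thus given $A \in \mathcal{G}$ with $\mathbb{P}[A] > 0$ we have that $\mathbb{E}[X \cdot \mathbf{1}_{\{A\}}] = \mathbb{E}[P_{\mathcal{G}}(X) \cdot \mathbf{1}_{\{A\}}]$.

10.4. Conditional expectations in $L^1$.

Theorem. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space with a r.v. $X \in L^1$ and a sub-sigma-algebra $\mathcal{G} \subseteq \mathcal{F}$. Then there exists a unique (up to almost sure equality) random variable $Y$ with the following properties:
(1) $Y \in L^1(\Omega, \mathcal{G}, \mathbb{P})$.
(2) For every $A \in \mathcal{G}$ it holds that $\mathbb{E}[Y \cdot \mathbf{1}_{\{A\}}] = \mathbb{E}[X \cdot \mathbf{1}_{\{A\}}]$.
We denote $\mathbb{E}[X|\mathcal{G}] := Y$. For $A \in \mathcal{F}$ with $\mathbb{P}[A] > 0$ we denote $\mathbb{E}[X|A] = \mathbb{E}[X \cdot \mathbf{1}_{\{A\}}]/\mathbb{P}[A]$.

Proof. We first prove uniqueness. Let $Y$ and $Z$ both satisfy the two conditions in the theorem, and assume by contradiction that $\mathbb{P}[Y > Z] > 0$. Then there is some $\varepsilon > 0$ such that $\mathbb{P}[Y - \varepsilon > Z] > 0$. Let $A = \{Y - \varepsilon > Z\}$, and note that $A \in \mathcal{G}$. Then
\[ \mathbb{E}[Y \cdot \mathbf{1}_{\{A\}}] = \mathbb{E}[(Y - \varepsilon) \cdot \mathbf{1}_{\{A\}}] + \varepsilon\,\mathbb{P}[A] \ge \mathbb{E}[Z \cdot \mathbf{1}_{\{A\}}] + \varepsilon\,\mathbb{P}[A] > \mathbb{E}[Z \cdot \mathbf{1}_{\{A\}}]. \]
But since $A \in \mathcal{G}$ we have that both $\mathbb{E}[Y \cdot \mathbf{1}_{\{A\}}]$ and $\mathbb{E}[Z \cdot \mathbf{1}_{\{A\}}]$ are equal to $\mathbb{E}[X \cdot \mathbf{1}_{\{A\}}]$, a contradiction.

We prove the remainder under the assumption that $X \ge 0$; the reduction is straightforward. Let $X_n = X \cdot \mathbf{1}_{\{X \le n\}}$. Then $X_n$ is bounded, and in particular is in $L^2$. Let $Y_n = P_{\mathcal{G}}(X_n)$. We claim that $Y_n$ is non-negative. To see this, assume by contradiction that $\mathbb{P}[Y_n < -\varepsilon] > 0$ for some $\varepsilon > 0$, and let $A = \{Y_n < -\varepsilon\}$. Then $\mathbb{E}[Y_n \cdot \mathbf{1}_{\{A\}}] \le -\varepsilon\,\mathbb{P}[A] < 0$, but $\mathbb{E}[Y_n \cdot \mathbf{1}_{\{A\}}] = \mathbb{E}[X_n \cdot \mathbf{1}_{\{A\}}] \ge 0$.

Now, $Y_n$ is a monotone increasing sequence. To see this, note that $X_n$ is monotone increasing, and that $P_{\mathcal{G}}$ is a linear operator, and so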

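$Y_{n+1} - Y_n = P_{\mathcal{G}}(X_{n+1} - X_n)$ is non-negative, by the same proof as above.

Since $Y_n$ is monotone increasing then so is $Y_n \cdot \mathbf{1}_{\{A\}}$, for any $A \in \mathcal{G}$. Therefore, if we define $Y = \lim_n Y_n$, then by the Monotone Convergence Theorem $\mathbb{E}[Y_n \cdot \mathbf{1}_{\{A\}}] \to \mathbb{E}[Y \cdot \mathbf{1}_{\{A\}}]$. But $\mathbb{E}[Y_n \cdot \mathbf{1}_{\{A\}}] = \mathbb{E}[X_n \cdot \mathbf{1}_{\{A\}}]$, and, since $X_n \cdot \mathbf{1}_{\{A\}}$ is also monotone increasing with $X \cdot \mathbf{1}_{\{A\}} = \lim_n X_n \cdot \mathbf{1}_{\{A\}}$, we have that
\[ \mathbb{E}[Y \cdot \mathbf{1}_{\{A\}}] = \lim_n \mathbb{E}[Y_n \cdot \mathbf{1}_{\{A\}}] = \lim_n \mathbb{E}[X_n \cdot \mathbf{1}_{\{A\}}] = \mathbb{E}[X \cdot \mathbf{1}_{\{A\}}]. \]
Finally, each $Y_n$ is $\mathcal{G}$-measurable by construction, and therefore so is $Y$.

10.5. Some properties of conditional expectation.

Exercise.
(1) If $X$ is $\mathcal{G}$-measurable (i.e., $\sigma(X) \subseteq \mathcal{G}$) then $\mathbb{E}[X|\mathcal{G}] = X$.
(2) The Law of Total Expectation. If $\mathcal{G}_2 \subseteq \mathcal{G}_1$ then $\mathbb{E}[\mathbb{E}[X|\mathcal{G}_1]\,|\,\mathcal{G}_2] = \mathbb{E}[X|\mathcal{G}_2]$. In particular $\mathbb{E}[\mathbb{E}[X|\mathcal{G}]] = \mathbb{E}[X]$.
(3) If $Z \in L^\infty(\Omega, \mathcal{G}, \mathbb{P})$ then $\mathbb{E}[Z \cdot X|\mathcal{G}] = Z \cdot \mathbb{E}[X|\mathcal{G}]$.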
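The finite-space definition of conditional expectation can be checked by direct computation. The following is a minimal sketch, assuming a uniform measure on $\{1,\ldots,n\}^2$ with $n = 3$ and a hypothetical random variable $X(\omega_1,\omega_2) = \omega_1\omega_2$; neither choice comes from the notes. It averages $X$ over the level sets of $Y$ and uses exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

n = 3
omega = list(product(range(1, n + 1), repeat=2))   # Omega = {1,...,n}^2
P = {w: Fraction(1, n * n) for w in omega}          # assumption: uniform measure


def X(w):
    # Hypothetical random variable, chosen only for illustration.
    return w[0] * w[1]


def Y(w):
    # Y(w1, w2) = w1, as in the finite-space example.
    return w[0]


def cond_exp(X, Y, P):
    """Return E[X|Y] as a dict w -> value: average X over the level set of Y."""
    out = {}
    for w in P:
        level = [v for v in P if Y(v) == Y(w)]
        mass = sum(P[v] for v in level)
        out[w] = sum(X(v) * P[v] for v in level) / mass
    return out


E_XY = cond_exp(X, Y, P)
```

With these choices `E_XY` is constant on each set $A_k$, and summing `X(w) * P[w]` over $A_k$ gives the same value as summing `E_XY[w] * P[w]` over $A_k$, which is property (3) of the exercise in the finite setting.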

11. The Galton-Watson process

Consider an asexual organism (in the original work these were Victorian men) whose number of offspring $X_1$ is chosen at random from some distribution on $\mathbb{N}_0 = \{0, 1, 2, \ldots\}$. Each of its descendants $i$ (assuming it has any) has $X_i$ offspring, with the random variables $(X_1, X_2, \ldots)$ distributed independently and identically. An interesting question is: what is the probability that the organism's progeny will live forever, and what is the probability that there will be a last one to its name?

Formally, consider generations $n \in \{1, 2, \ldots\}$, and to each generation $n$ associate an infinite sequence of random variables $(X_{n,1}, X_{n,2}, \ldots)$, with all the random variables $(X_{n,i})$ independent and identically distributed on $\mathbb{N}_0$. We will, to simplify some expressions, define $X = X_{1,1}$. We assume that $0 < \mathbb{E}[X] < \infty$, and denote $\mu = \mathbb{E}[X]$. We also assume that $\mathbb{P}[X = 0] > 0$.

To each generation $n$ we define the number of organisms $Z_n$, which is also a random variable. It is defined recursively by $Z_1 = 1$ and
\[ Z_{n+1} = \sum_{i=1}^{Z_n} X_{n,i}. \]
Clearly $Z_n = 0$ implies $Z_{n+1} = 0$. We are interested in the event that $Z_n = 0$ for some $n$, or that, equivalently, $Z_n = 0$ for all $n$ large enough. This is again equivalent to the event $\sum_n Z_n < \infty$, since each $Z_n$ is an integer. We denote this event by $E$ (for extinction), and denote $E_n = \{Z_n = 0\}$, so that the sequence $E_n$ is increasing and $E = \bigcup_n E_n$. Therefore, by Theorem 3.6,
\[ \mathbb{P}[E] = \lim_n \mathbb{P}[Z_n = 0]. \]

We first calculate the expectation of $Z_{n+1}$. Since $Z_n$ is independent of $(X_{n,1}, X_{n,2}, \ldots)$, it holds that
\[ \mathbb{E}[Z_{n+1}] = \mathbb{E}\left[\sum_{i=1}^{Z_n} X_{n,i}\right] = \mathbb{E}\left[\mathbb{E}\left[\sum_{i=1}^{Z_n} X_{n,i} \,\Big|\, Z_n\right]\right] = \mathbb{E}[Z_n \cdot \mathbb{E}[X|Z_n]] = \mathbb{E}[Z_n] \cdot \mathbb{E}[X], \]
and so
\[ \mathbb{E}[Z_{n+1}] = \mu^n. \]

Claim. If $\mu < 1$ then $\mathbb{P}[E] = 1$.

Proof 1. By Markov's inequality, $\mathbb{P}[Z_n \ge 1] \le \mathbb{E}[Z_n] = \mu^{n-1}$, which is summable. Thus by the Borel-Cantelli Lemma w.p. 1 there will be some $n$ with $Z_n < 1$, and thus $Z_n = 0$.

Proof 2. Note that $\mathbb{E}[\sum_n Z_n] = \sum_n \mathbb{E}[Z_n] < \infty$, and so $\mathbb{P}[\sum_n Z_n = \infty] = 0$.

It is also true that $\mathbb{P}[E] = 1$ when $\mu = 1$. Note that in this case
\[ \mathbb{E}[Z_{n+1}|Z_1, Z_2, \ldots, Z_n] = \mathbb{E}[Z_{n+1}|Z_n] = Z_n \cdot \mathbb{E}[X] = Z_n. \]
The first equality makes $Z_n$ a Markov chain. The second makes it a martingale; we will discuss both concepts formally. By the Martingale Convergence Theorem we have that $Z_n$ converges almost surely to some r.v. $Z_\infty$. But clearly $Z_n$ cannot converge to anything but 0, and so $\mathbb{P}[E] = 1$.

Note that the event $E$ is equal to the union of the event that $X_{1,1} = 0$ with the event that $X_{1,1} > 0$ but each of the sub-trees of the $Z_2$ offspring goes extinct. Since the process on each subtree is identical, and since the probability that all of $k$ such offspring trees go extinct is $\mathbb{P}[E]^k$, we have that $\mathbb{P}[E]$ must satisfy
\[ \mathbb{P}[E] = \sum_{k \in \mathbb{N}_0} \mathbb{P}[X = k]\,\mathbb{P}[E]^k. \qquad (11.1) \]
We accordingly define $f : [0, 1] \to [0, 1]$, the generating function of $X$, by
\[ f(t) = \sum_{k \in \mathbb{N}_0} \mathbb{P}[X = k] \cdot t^k = \mathbb{E}[t^X], \]
where we take $0^0 = 1$. Then (11.1) is equivalent to observing that $\mathbb{P}[E]$ is a fixed point of $f$. Note that 1 is always a fixed point, but in general there might be more.

Some observations:
(1) $f(0) = \mathbb{P}[X = 0]$ and $f(1) = 1$.
(2) $f'(t) = \sum_{k \in \mathbb{N}_0} \mathbb{P}[X = k] \cdot k \cdot t^{k-1} = \mathbb{E}[X \cdot t^{X-1}]$. Hence $f'(1) = \mathbb{E}[X] = \mu$. Note also that $f'(t) > 0$ for $t > 0$.
(3) Likewise, the $k$th derivative of $f$ is $\mathbb{E}[X(X-1)\cdots(X-k+1) \cdot t^{X-k}]$, which is non-negative. Thus $f$ is convex, and strictly convex if $\mathbb{P}[X \ge 2] > 0$.

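Let $f_n(t) = \mathbb{E}[t^{Z_n}]$ be the generating function of $Z_n$. Then
\[ f_{n+1}(t) = \mathbb{E}[t^{Z_{n+1}}] = \mathbb{E}\left[\mathbb{E}[t^{Z_{n+1}}|Z_n]\right] = \mathbb{E}\left[\mathbb{E}\left[t^{\sum_{k=1}^{Z_n} X_{n,k}} \,\Big|\, Z_n\right]\right] = \mathbb{E}\left[\mathbb{E}[t^X]^{Z_n}\right] = \mathbb{E}[f(t)^{Z_n}] = f_n(f(t)), \]
where we again used the fact that $Z_n$ is independent of $(X_{n,1}, X_{n,2}, \ldots)$. Since $Z_1 = 1$ we have $f_1(t) = t$, and so $f_{n+1}$ is the $n$-fold composition of $f$ with itself:
\[ f_{n+1} = f \circ f \circ \cdots \circ f. \]
Now $\mathbb{P}[Z_n = 0] = f_n(0)$. Since $f$ is continuous, $\mathbb{P}[E] = \lim_n \mathbb{P}[E_n] = \lim_n f_n(0)$ will be the fixed point of $f$ that one converges to by applying $f$ repeatedly to 0. Furthermore, $f(0) = \mathbb{P}[X = 0] > 0$, $f(1) = 1$, and $f$ is increasing and convex. Thus $f$ has at most one fixed point in $[0, 1)$, in addition to the fixed point at 1, and the iteration converges to the smallest fixed point. Finally, since $f'(1) = \mu$, this fixed point will be 1 iff $\mu \le 1$.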
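The fixed-point description of the extinction probability translates directly into an iteration. The following is a minimal sketch, not part of the notes: the function name `extinction_probability`, the tolerance and the iteration cap are assumptions, and the offspring law is passed as a finite list `[P[X=0], P[X=1], ...]`.

```python
def extinction_probability(offspring, max_iter=100_000, tol=1e-12):
    """Approximate P[E] by iterating t -> f(t) from t = 0.

    `offspring[k]` is P[X = k]; f(t) = sum_k P[X = k] * t**k is the
    generating function of X. The iterates f_n(0) = P[Z_n = 0] increase
    to the smallest fixed point of f in [0, 1].
    """
    def f(t):
        return sum(p * t**k for k, p in enumerate(offspring))

    t = 0.0
    for _ in range(max_iter):
        nxt = f(t)
        if abs(nxt - t) < tol:   # assumption: stop once iterates stabilise
            return nxt
        t = nxt
    return t
```

For the offspring law `[0.25, 0.25, 0.5]` (so $\mu = 1.25$) the fixed points of $f$ are $1/2$ and $1$, and the iteration approaches $1/2$; for `[0.5, 0.25, 0.25]` (so $\mu = 0.75$) it approaches $1$. In the critical case $\mu = 1$ convergence is slow, so the tolerance-based stopping rule is only a heuristic there.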

12. Markov chains

Let the state space $S$ be a countable or finite set. A sequence of $S$-valued random variables $(X_0, X_1, X_2, \ldots)$ is said to be a Markov chain if for all $x \in S$ and $n > 0$
\[ \mathbb{P}[X_n = x|X_0, X_1, \ldots, X_{n-1}] = \mathbb{P}[X_n = x|X_{n-1}]. \]
A Markov chain is said to be time homogeneous if $\mathbb{P}[X_n = y|X_{n-1} = x]$ does not depend on $n$. In this case it will be useful to study the associated stochastic $S$-indexed matrix $P(x, y) = \mathbb{P}[X_{n+1} = y|X_n = x]$. It is easy to see that
\[ \mathbb{P}[X_{n+m} = y|X_n = x] = P^m(x, y), \]
where $P^m$ denotes the usual matrix exponentiation. We call $P$ the transition matrix of the Markov chain. In the context of a transition matrix $P$, we will denote by $\mathbb{P}_x$ the measure of the Markov chain for which $\mathbb{P}[X_0 = x] = 1$.

The next claim is needed to formally apply the Markov property.

Claim. Let $(X_0, X_1, \ldots)$ be a time homogeneous Markov chain. Fix some measurable $f : S^{\mathbb{N}} \to \mathbb{R}$ and denote $Y_n = f(X_n, X_{n+1}, \ldots)$. Then for any $n, m \in \mathbb{N}$ and $x \in S$ such that $\mathbb{P}[X_n = x] > 0$ and $\mathbb{P}[X_m = x] > 0$ it holds that
\[ \mathbb{E}[Y_{n+1}|X_n = x] = \mathbb{E}[Y_{m+1}|X_m = x]. \]

Example: let $S = \mathbb{Z}$, let $X_0 = 0$, and let $P(x, y) = \frac{1}{2}\mathbf{1}_{\{|x - y| = 1\}}$. This is called the simple random walk on $\mathbb{Z}$. More generally (in some direction), one can consider a graph $G = (S, E)$ with finite positive out-degrees $d(x) = |E \cap (\{x\} \times S)|$ and let
\[ P(x, y) = \frac{\mathbf{1}_{\{(x, y) \in E\}}}{d(x)}. \]
The lazy random walk on $\mathbb{Z}$ has transition probabilities $P(x, y) = \frac{1}{3}\mathbf{1}_{\{|x - y| \le 1\}}$.

We say that a (time homogeneous) Markov chain is irreducible if for all $x, y \in S$ there exists some $m$ so that $P^m(x, y) > 0$. We say that an irreducible chain is aperiodic if for some (equivalently, every) $x \in S$ it holds that $P^m(x, x) > 0$ for all $m$ large enough.

Exercise. Show that if an irreducible chain is not aperiodic then for every $x \in S$ there is a $k \in \mathbb{N}$, $k > 1$, so that $P^m(x, x) = 0$ for all $m$ not divisible by $k$.

Exercise.
(1) Show that the simple random walk on $\mathbb{Z}$ is irreducible but not aperiodic.

(2) Show that the lazy random walk on $\mathbb{Z}$ is irreducible and aperiodic.
(3) Show that the simple random walk on a directed graph is irreducible iff the graph is strongly connected.
(4) Show that the simple random walk on a connected, undirected graph is aperiodic iff the graph is not bipartite.

We define the hitting time to $x \in S$ by
\[ T_x = \min\{n > 0 : X_n = x\}. \]
This is a random variable taking values in $\mathbb{N} \cup \{\infty\}$.

An irreducible Markov chain is said to be recurrent if $\mathbb{P}[T_x < \infty] = 1$ whenever $\mathbb{P}[X_0 = x] > 0$. A non-recurrent chain is called transient.

Theorem. Fix an irreducible Markov chain with $\mathbb{P}[X_0 = x] > 0$ for all $x \in S$. Then the following are equivalent.
(i) The Markov chain is recurrent.
(ii) For some (all) $x \in S$ it holds that $\mathbb{P}[X_n = x \text{ i.o.}] = 1$.
(iii) For some (all) $x \in S$ it holds that $\sum_m P^m(x, x) = \infty$.

Proof. Choose any $x \in S$. Since $\mathbb{P}[T_x < \infty] = 1$, and since $\mathbb{P}[X_0 = y] > 0$ for any $y \in S$, we have that $\mathbb{P}[T_x < \infty|X_0 = y] = 1$, or that
\[ \mathbb{P}[X_n = x \text{ for some } n > 0|X_0 = y] = 1. \]
By irreducibility we have that $\mathbb{P}[X_m = y] > 0$ for any $m$, and so by the Markov property it follows that
\[ \mathbb{P}[X_n = x \text{ for some } n > m|X_m = y] = 1. \]
Summing over $y$ yields that
\[ \mathbb{P}[X_n = x \text{ for some } n > m] = 1, \]
and so
\[ \mathbb{P}[X_n = x \text{ i.o.}] = 1. \]
We have thus shown that (i) implies (ii).

Note that $P^m(x, x) = \mathbb{P}[X_m = x|X_0 = x]$. Now, (ii) implies that $\mathbb{P}[X_n = x \text{ i.o.}|X_0 = x] = 1$ and so, by Borel-Cantelli, (ii) implies (iii).

Finally, to show that (iii) implies (i), assume that the Markov chain is transient. Then $\mathbb{P}[T_x < \infty] < 1$, and so $\mathbb{P}[T_x < \infty|X_0 = x] < 1$. Denote the latter by $p$. Hence, by the Markov property,
\[ p = \mathbb{P}[X_n = x \text{ for some } n > m|X_m = x]. \]
Therefore, conditioned on $X_0 = x$, the probability that $x$ is visited $k$ more times is $p^k(1 - p)$. In particular the expected number of visits is finite, and since this expectation is equal to $\sum_m P^m(x, x)$, the proof is complete.

Exercise. Prove that every irreducible Markov chain over a finite state space is recurrent.

Exercise. Let $P$ be the transition matrix of a Markov chain over $S$, and for $\varepsilon > 0$ let $P_\varepsilon = (1 - \varepsilon)P + \varepsilon I$, where $I$ is the identity matrix. Thus $P_\varepsilon$ is the $\varepsilon$-lazified version of $P$. Consider two Markov chains over $S$: both with $X_0 = x$, and one with transition matrix $P$ and the other with transition matrix $P_\varepsilon$. Prove that either both are recurrent or both are transient.

Corollary. The simple random walk on $\mathbb{Z}$ is recurrent.

Proof. Note that $\mathbb{P}[X_{2n+1} = 0] = 0$ and that
\[ \mathbb{P}[X_{2n} = 0] = 2^{-2n}\binom{2n}{n}. \]
By Stirling
\[ \binom{2n}{n} \ge \frac{2^{2n}}{2\sqrt{n}}, \]
and so
\[ \mathbb{P}[X_{2n} = 0] \ge \frac{1}{2\sqrt{n}}. \]
Hence
\[ \sum_m P^m(0, 0) \ge \sum_n \frac{1}{2\sqrt{n}} = \infty, \]
and the claim follows from condition (iii) of the theorem above.

Consider now a random walk with a drift on $\mathbb{Z}$. For example, let $P(x, y) = p$ if $y = x + 1$ and $P(x, y) = 1 - p$ if $y = x - 1$, for some $p > 1/2$. In this case, assuming $X_0 = 0$, $X_n = \sum_{k \le n} Y_k$ where the $Y_k$ are i.i.d. r.v. with $\mathbb{P}[Y_k = 1] = p$ and $\mathbb{P}[Y_k = -1] = 1 - p$. It follows from the strong law of large numbers that a.s. $\lim_n X_n/n = 2p - 1 > 0$, and so in particular $\lim_n X_n = \infty$, and the random walk is transient. The same

argument holds whenever the transition probabilities correspond to an $L^1$ random variable with non-zero expectation (although we have yet to prove an $L^1$ SLLN).

Exercise. Prove that the simple random walk on $\mathbb{Z}^2$ (given by $P(x, y) = \frac{1}{4}\mathbf{1}_{\{|x - y| = 1\}}$) is recurrent, but that the simple random walk on $\mathbb{Z}^d$ (given by $P(x, y) = \frac{1}{2d}\mathbf{1}_{\{|x - y| = 1\}}$) is transient for all $d \ge 3$.
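The recurrence criterion $\sum_m P^m(x, x) = \infty$ can be explored numerically for the simple random walk on $\mathbb{Z}$. The sketch below is an illustration only: the helper names are assumptions, and it uses the standard formula $\mathbb{P}[X_{2n} = 0] = \binom{2n}{n}4^{-n}$ together with the lower bound $1/(2\sqrt{n})$.

```python
from math import comb, sqrt


def return_probability(n):
    """P[X_{2n} = 0] for the simple random walk on Z started at 0."""
    return comb(2 * n, n) / 4**n


def partial_sum(N):
    """Sum of the return probabilities P[X_{2n} = 0] for n = 1..N."""
    return sum(return_probability(n) for n in range(1, N + 1))


def lower_bound_holds(N):
    """Check P[X_{2n} = 0] >= 1 / (2 sqrt(n)) for n = 1..N."""
    return all(return_probability(n) >= 1 / (2 * sqrt(n)) for n in range(1, N + 1))
```

The partial sums grow roughly like a constant times $\sqrt{N}$, so they are unbounded, consistent with recurrence. A finite computation of this kind is of course only a sanity check, not a proof of divergence.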

13. Martingales

A filtration $\Phi = (\mathcal{F}_1, \mathcal{F}_2, \ldots)$ is a sequence of increasing sigma-algebras $\mathcal{F}_1 \subseteq \mathcal{F}_2 \subseteq \cdots$. A natural (and in some sense only) example is the case that $\mathcal{F}_n = \sigma(Y_1, \ldots, Y_n)$ for some sequence of random variables $(Y_1, Y_2, \ldots)$.

A process $(X_1, X_2, \ldots)$ is said to be adapted to $\Phi$ if each $X_n$ is $\mathcal{F}_n$-measurable. A sequence of real random variables $(X_1, X_2, \ldots)$ that is adapted to $\Phi$ and is in $L^1$ is called a martingale with respect to $\Phi$ if for all $n \ge 1$
\[ \mathbb{E}[X_{n+1}|\mathcal{F}_n] = X_n. \]
It is called a supermartingale if
\[ \mathbb{E}[X_{n+1}|\mathcal{F}_n] \le X_n. \]
Note that if $(X_1, X_2, \ldots)$ is a martingale then $\mathbb{E}[X_n] = \mathbb{E}[X_1]$, and by subtracting the constant $\mathbb{E}[X_1]$ from all $X_n$'s we get that $(X_0, X_1, \ldots)$ is a martingale with $X_0 = 0$. A similar statement holds for supermartingales.

As a first example, let $W_n$ be i.i.d. r.v. with $\mathbb{P}[W_n = +1] = \mathbb{P}[W_n = -1] = 1/2$, let $X_n = \sum_{k \le n} W_k$, and let $\mathcal{F}_n = \sigma(X_1, \ldots, X_n)$. Then $X_n$ is the amount of money made in $n$ fair bets (or the location of a simple random walk on $\mathbb{Z}$) and is a martingale with respect to $(\mathcal{F}_1, \mathcal{F}_2, \ldots)$. If we set $\mathbb{P}[W_n = +1] = 1/2 - \varepsilon$ and $\mathbb{P}[W_n = -1] = 1/2 + \varepsilon$ for some $\varepsilon > 0$ then $X_n$ is a supermartingale.

As a second example we introduce Pólya's urn. Consider an urn in which there are initially a single black ball and a single white ball. In each time period we reach in, pull out a ball, and then put back two balls of the same color. Formally, let $(Y_1, Y_2, \ldots)$ be i.i.d. random variables distributed uniformly over $[0, 1]$, and let the number of black balls at time $n$ be $B_n$, given by $B_1 = 1$ and
\[ B_{n+1} = B_n + \mathbf{1}_{\{Y_n < B_n/(n+1)\}}. \]
Denote by $R_n = B_n/(n + 1)$ the fraction of black balls. Then
\[ \mathbb{E}[R_{n+1}|B_1, \ldots, B_n] = \mathbb{E}[R_{n+1}|B_n], \]

since the process $(B_1, B_2, \ldots)$ is a Markov chain. Furthermore
\[ \mathbb{E}[R_{n+1}|B_n] = \frac{\mathbb{E}[B_{n+1}|B_n]}{n + 2} = \frac{1}{n + 2}\left(B_n + \frac{B_n}{n + 1}\right) = \frac{B_n}{n + 1} = R_n, \]
and so $R_n$ is a martingale with respect to $\mathcal{F}_n = \sigma(B_1, \ldots, B_n)$.

Theorem 13.1. $R_n$ converges pointwise: there is a random variable $R$ such that $\mathbb{P}[\lim_n R_n = R] = 1$.

To prove this theorem we will prove a much more general theorem.

Theorem 13.2 (Martingale Convergence in $L^2$). Let $\Phi = (\mathcal{F}_1, \mathcal{F}_2, \ldots)$ be a filtration, and let $(X_1, X_2, \ldots)$ be a martingale w.r.t. $\Phi$. Furthermore, assume that there exists a $K$ such that $\mathbb{E}[X_n^2] < K$ for all $n$. Then there exists a random variable $X \in L^2$ such that $\mathbb{E}[(X_n - X)^2] \to 0$.

Proof. Set $X_0 = 0$, and for $n \ge 1$ let $Y_n = X_n - X_{n-1}$. Since $X_{n-1} = \mathbb{E}[X_n|\mathcal{F}_{n-1}]$, we have that $Y_n$ is orthogonal to any $\mathcal{F}_{n-1}$-measurable r.v. in $L^2$, and in particular is orthogonal to $Y_m$ for any $m < n$. Now,
\[ X_n = \sum_{k \le n} Y_k, \]
and so by the orthogonality of the $Y_k$'s it follows that
\[ \sum_{k \le n} \mathbb{E}[Y_k^2] = \mathbb{E}[X_n^2] < K. \]
Thus $\sum_k \mathbb{E}[Y_k^2] \le K$, and we have that $X_n$ is a Cauchy sequence in $L^2$. Therefore, since $L^2$ is complete (Theorem 7.6) there exists some $X \in L^2$ such that $\mathbb{E}[(X_n - X)^2] \to 0$.

This theorem still does not imply pointwise convergence, which we would need to prove Theorem 13.1.

Theorem 13.3 (Martingale Pointwise Convergence). Let $\Phi = (\mathcal{F}_1, \mathcal{F}_2, \ldots)$ be a filtration, and let $(X_1, X_2, \ldots)$ be a supermartingale w.r.t. $\Phi$. Furthermore, assume that there exists a $K$ such that $\mathbb{E}[|X_n|] < K$ for all

$n$. Then there exists a random variable $X \in L^1$ such that almost surely $\lim_n X_n = X$.

Before proving this theorem we will need the following lemmas.

Lemma 13.4. Let $(X_0, X_1, X_2, \ldots)$ be a supermartingale w.r.t. $\Phi = (\mathcal{F}_1, \mathcal{F}_2, \ldots)$ with $X_0 = 0$, let $B_n$ be $\{0, 1\}$-valued random variables adapted to $\Phi$, and let
\[ Y_n = \sum_{k \le n} B_{k-1}(X_k - X_{k-1}). \]
Then $Y_n$ is a supermartingale and $\mathbb{E}[Y_n] \le 0$.

The idea behind this lemma is the following: imagine that you are gambling at a casino with non-positive expected wins from every gamble. Say that you have some system for deciding when to gamble and when to stay out (i.e., the $B_n$'s). Then you do not expect to win more than you would have if you stayed in the game every time.

Proof.
\[ \mathbb{E}[Y_{n+1}|\mathcal{F}_n] = \mathbb{E}\left[\sum_{k \le n+1} B_{k-1}(X_k - X_{k-1}) \,\Big|\, \mathcal{F}_n\right] = \mathbb{E}[Y_n + B_n(X_{n+1} - X_n)|\mathcal{F}_n] \]
\[ = Y_n + B_n\,\mathbb{E}[X_{n+1} - X_n|\mathcal{F}_n] = Y_n + B_n(\mathbb{E}[X_{n+1}|\mathcal{F}_n] - X_n) \le Y_n. \]
Thus $Y_n$ is a supermartingale, and by induction $\mathbb{E}[Y_n] \le 0$.

Lemma 13.5. Let $(X_0, X_1, X_2, \ldots)$ be a supermartingale w.r.t. $\Phi = (\mathcal{F}_1, \mathcal{F}_2, \ldots)$ with $X_0 = 0$. Fix some $a < b$, and let $B_n$ be defined as follows: $B_0 = 0$, and $B_{n+1}$ is the indicator of the union of the events
(1) $B_n = 1$ and $X_n \le b$.
(2) $B_n = 0$ and $X_n < a$.
Let $U_n^{a,b}$ be the number of $k \le n$ such that $B_k = 0$ and $B_{k-1} = 1$. Then
\[ \mathbb{E}[U_n^{a,b}] \le \frac{\mathbb{E}[(X_n - a)^-]}{b - a}. \]

Proof. By picture, it is clear that for
\[ Y_n = \sum_{k \le n} B_{k-1}(X_k - X_{k-1}) \]
it holds that
\[ Y_n \ge (b - a)\,U_n^{a,b} - (X_n - a)^-. \]
By Lemma 13.4 we have that $\mathbb{E}[Y_n] \le 0$, and so the claim follows by taking expectations.

Proof of Theorem 13.3. For a given $a < b$, let $U_\infty^{a,b} = \lim_n U_n^{a,b}$. The limit exists since this is a monotone increasing sequence, and it also follows, by the Monotone Convergence Theorem and Lemma 13.5, that
\[ \mathbb{E}[U_\infty^{a,b}] = \lim_n \mathbb{E}[U_n^{a,b}] \le \limsup_n \frac{\mathbb{E}[(X_n - a)^-]}{b - a} \le \frac{|a| + K}{b - a} < \infty. \]
Thus $\mathbb{P}[U_\infty^{a,b} < \infty] = 1$, and it follows that with probability zero it occurs that $\limsup_n X_n > b$ and $\liminf_n X_n < a$. Applying this to a countable dense set of pairs $(a, b)$ we get that with probability zero $\limsup_n X_n > \liminf_n X_n$, and so $\limsup_n X_n = \liminf_n X_n$ almost surely.

Theorem 13.1 is now a direct consequence.

Exercise. Let $R_n$ be the fraction of black balls in Pólya's urn. Show that $\lim_n R_n$ is distributed uniformly on $(0, 1)$. Hint: calculate the distribution of $R_n$.
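The hint in the last exercise can be explored by computing the law of $B_n$ exactly from the recursion $B_{n+1} = B_n + \mathbf{1}_{\{Y_n < B_n/(n+1)\}}$. The following is a minimal sketch using exact rational arithmetic; the function name is an assumption, and the code is a numerical check rather than a solution to the exercise.

```python
from fractions import Fraction


def black_ball_distribution(n):
    """Exact law of B_n for Pólya's urn, as a dict {b: P[B_n = b]}.

    At time m the urn holds m + 1 balls, B_m of them black, so a black
    ball is drawn (and B increases by one) with probability B_m / (m + 1).
    """
    dist = {1: Fraction(1)}                   # B_1 = 1
    for m in range(1, n):                     # step from time m to time m + 1
        nxt = {}
        for b, p in dist.items():
            up = Fraction(b, m + 1)           # chance of drawing a black ball
            nxt[b + 1] = nxt.get(b + 1, 0) + p * up
            nxt[b] = nxt.get(b, 0) + p * (1 - up)
        dist = nxt
    return dist
```

Dividing the keys of `black_ball_distribution(n)` by $n + 1$ gives the law of $R_n$, and summing `b / (n + 1) * p` over the dict gives $\mathbb{E}[R_n]$, which the martingale property forces to equal $\mathbb{E}[R_1] = 1/2$.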

14. Stopping times

Let $\Phi = (\mathcal{F}_1, \mathcal{F}_2, \ldots)$ be a filtration, and let $(X_0, X_1, X_2, \ldots)$ be a supermartingale w.r.t. $\Phi$. Denote $\mathcal{F}_\infty = \sigma(\bigcup_n \mathcal{F}_n)$. A random variable $T$ taking values in $\mathbb{N} \cup \{\infty\}$ is called a stopping time if for all $n$ it holds that the event $\{T \le n\}$ is $\mathcal{F}_n$-measurable.

Example: $(X_1, X_2, \ldots)$ is a Markov chain over the state space $S$, and $T_x$ is the hitting time to $x \in S$ given by $T_x = \min\{n > 0 : X_n = x\}$.

Example: $(X_1, X_2, \ldots)$ is the simple random walk on $\mathbb{Z}$, and $T$ is the first $n \ge 3$ such that $X_n < X_{n-1} < X_{n-2}$.

Given a stopping time $T$, we define the stopped process
\[ (X_1^T, X_2^T, \ldots) = (X_1, X_2, \ldots, X_{T-1}, X_T, X_T, \ldots). \]
That is, $X_n^T = X_n$ if $n \le T$, and $X_n^T = X_T$ if $n \ge T$. Equivalently, $X_n^T = X_{\min\{T, n\}}$. Intuitively, the stopped process corresponds to the process of a gambler's bank account, when the gambler decides to stop at time $T$.

Theorem 14.1. If $(X_0, X_1, X_2, \ldots)$ is a (super)martingale (with $X_0 = 0$) then $(X_0^T, X_1^T, X_2^T, \ldots)$ is a (super)martingale.

Proof. We prove for the case of supermartingales; the proof for martingales is identical. Let $B_n = \mathbf{1}_{\{T > n\}}$ and
\[ Y_n = \sum_{k \le n} B_{k-1}(X_k - X_{k-1}). \]
Then by Lemma 13.4 we have that $Y_n$ is a supermartingale. But $Y_n = X_n^T$.

So the gambler's bank account is still a (super)martingale, no matter what the stopping time is, and in particular $\mathbb{E}[X_n^T] \le 0$ (with equality for martingales). However, consider a simple random walk on $\mathbb{Z}$, with stopping time $T_1$. That is, the gambler stops once she has earned a dollar. Then clearly $\mathbb{E}[X_{T_1}] = 1$. The following theorem gives conditions for when $\mathbb{E}[X_T] \le 0$.

Theorem 14.2 (Doob's Optional Stopping Time Theorem). Let $(X_0, X_1, \ldots)$ be a supermartingale with $X_0 = 0$, and let $T$ be a stopping time. Assume one of the following holds:
(1) $\exists N$ s.t. $\mathbb{P}[T \le N] = 1$.
(2) $\exists K$ s.t. $\mathbb{P}[|X_n| \le K \text{ for all } n] = 1$, and $\mathbb{P}[T < \infty] = 1$.
(3) $\mathbb{E}[T] < \infty$ and $\exists K$ s.t. $\mathbb{P}[|X_{n+1} - X_n| \le K \text{ for all } n] = 1$.
(4) $\mathbb{P}[T < \infty] = 1$ and $X_n$ is non-negative.
Then $\mathbb{E}[X_T] \le 0$, with equality if $(X_0, X_1, \ldots)$ is a martingale.

To prove this theorem we will need the following important lemma.

Lemma 14.3 (Fatou's Lemma). Let $(Z_1, Z_2, \ldots)$ be a sequence of non-negative real random variables. Then
\[ \mathbb{E}\left[\liminf_n Z_n\right] \le \liminf_n \mathbb{E}[Z_n]. \]
Recall from the Galton-Watson example that indeed this may be a strict inequality.

Exercise. Prove Fatou's Lemma. Hint: use the Monotone Convergence Theorem.

Proof of Theorem 14.2. We prove that $\mathbb{E}[X_T] \le 0$; the equality in the case of martingales follows easily. Note that $\mathbb{E}[X_n^T] \le 0$, by Theorem 14.1. Also $\lim_n X_n^T = X_T$, since $\mathbb{P}[T < \infty] = 1$ under all conditions.
(1) $X_T = X_N^T$.
(2) By the Bounded Convergence Theorem $\mathbb{E}[X_T] = \lim_n \mathbb{E}[X_n^T] \le 0$.
(3) We have
\[ |X_n^T| = \left|\sum_{k=1}^{\min\{T, n\}} (X_k - X_{k-1})\right| \le K \cdot T. \]
Hence by the Dominated Convergence Theorem $\mathbb{E}[X_T] = \lim_n \mathbb{E}[X_n^T]$.
(4) By Fatou's Lemma, $\mathbb{E}[X_T] \le \liminf_n \mathbb{E}[X_n^T] \le 0$.

Corollary. Let $T_1$ be the hitting time to 1 of the simple random walk on $\mathbb{Z}$. Then $\mathbb{E}[T_1] = \infty$.
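Optional stopping can be illustrated on a concrete stopping time that is not taken from the notes: let $T$ be the first time the simple random walk on $\mathbb{Z}$ hits $-a$ or $b$, for hypothetical positive integers $a, b$. The sketch below computes the exact law of the stopped process $X_n^T = X_{\min\{T,n\}}$ by dynamic programming. Since the stopped walk is bounded and $T$ is finite almost surely, one expects $\mathbb{E}[X_T] = 0$, which corresponds to $\mathbb{P}[X_T = b] = a/(a+b)$.

```python
from fractions import Fraction


def stopped_walk_law(a, b, n):
    """Exact law of X_{min(T, n)} for the simple random walk started at 0,
    where T is the first time the walk hits -a or b (a, b positive integers).
    Returns a dict {position: probability}.
    """
    law = {0: Fraction(1)}
    for _ in range(n):
        nxt = {}
        for x, p in law.items():
            if x in (-a, b):                      # already stopped: stay put
                nxt[x] = nxt.get(x, 0) + p
            else:                                 # fair +1 / -1 step
                for y in (x - 1, x + 1):
                    nxt[y] = nxt.get(y, 0) + p / 2
        law = nxt
    return law
```

For every $n$ the mean of `stopped_walk_law(a, b, n)` is exactly zero, in line with Theorem 14.1, and for large $n$ the mass at $b$ approaches $a/(a+b)$. The hitting time $T_1$ of the corollary behaves differently because the walk stopped at $T_1$ is not bounded from below.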

15. Harmonic and superharmonic functions

Let $(X_0, X_1, \ldots)$ be a Markov chain over the state space $S$ with transition matrix $P$. We say that a function $f : S \to \mathbb{R}$ is $P$-harmonic if $Pf = f$. Here $Pf : S \to \mathbb{R}$ is
\[ [Pf](x) = \sum_{y \in S} P(x, y) f(y). \]
We say that $f$ is $P$-superharmonic if $[Pf](x) \le f(x)$ for all $x \in S$.

Claim 15.1. Assume that for all $x$ there exists an $n$ such that $\mathbb{P}[X_n = x] > 0$. Let $Z_n = f(X_n)$. Then $Z_n$ is a (super)martingale iff $f$ is (super)harmonic.

Proof. We prove for the (super) case:
\[ \mathbb{E}[f(X_{n+1})|X_0, \ldots, X_n] = \mathbb{E}[f(X_{n+1})|X_n] = \sum_{y \in S} P(X_n, y) f(y), \]
which is at most $f(X_n)$ iff $f$ is superharmonic.

Theorem. Let $P$ be irreducible. Then the following are equivalent.
(i) Every Markov chain with transition matrix $P$ is recurrent.
(ii) Some Markov chain with transition matrix $P$ is recurrent.
(iii) Every non-negative $P$-superharmonic function is constant.

Proof. The equivalence of (i) and (ii) follows easily from the characterization of recurrence for irreducible chains given earlier.

To see that (i) implies (iii), let $T_y$ be the hitting time to $y$, and note that $\mathbb{P}_x[T_y < \infty] = 1$, by recurrence. Let $f$ be a non-negative superharmonic function, and let $Z_n = f(X_n)$. Then we can apply the Optional Stopping Time Theorem to $Z_n^{T_y}$ to get that
\[ \mathbb{E}_x[Z_{T_y}] \le \mathbb{E}_x[Z_0]. \]
The l.h.s. is equal to $f(y)$ and the r.h.s. is equal to $f(x)$. Since $x$ and $y$ are arbitrary, $f$ is constant.

Assume (iii), and note that
\[ \mathbb{P}_x[T_y < \infty] = \mathbb{P}_x[X_1 = y, T_y < \infty] + \mathbb{P}_x[X_1 \ne y, T_y < \infty] \]
\[ = \mathbb{P}_x[X_1 = y] + \sum_{z \ne y} \mathbb{P}_x[X_1 = z, T_y < \infty] \]
\[ = \mathbb{P}_x[X_1 = y] + \sum_{z \ne y} \mathbb{P}_x[T_y < \infty|X_1 = z] \cdot \mathbb{P}_x[X_1 = z] \]
\[ = P(x, y) + \sum_{z \ne y} P(x, z) \cdot \mathbb{P}_z[T_y < \infty] \]
\[ \ge \sum_z P(x, z) \cdot \mathbb{P}_z[T_y < \infty]. \]
Hence $f(x) = \mathbb{P}_x[T_y < \infty]$ is superharmonic, and thus constant by assumption. Say $p = \mathbb{P}_x[T_y < \infty]$. By irreducibility $p > 0$. Hence, by the Markov property, for every $N$ the expected number of visits to $y$ at times $n > N$ is at least $p$, and so the expected number of visits is infinite. Thus the random walk is recurrent.

The following claim is a direct consequence of Claim 15.1 and the Martingale Convergence Theorem.

Claim. Let $f : S \to \mathbb{R}$ be bounded and superharmonic. Then $Z_n = f(X_n)$ is a bounded supermartingale and therefore converges almost surely to $Z_\infty := \lim_n Z_n$.

Recall that $\mathcal{T}_n = \sigma(X_n, X_{n+1}, \ldots)$ and that $\mathcal{T} = \bigcap_n \mathcal{T}_n$ is the tail sigma-algebra. We think of our probability space as being $(\Omega, \mathcal{F}, \mathbb{P})$ with $\Omega = S^{\mathbb{N}}$ and $\mathcal{F}$ the Borel sigma-algebra of the product of the discrete topologies. Then $A \in \mathcal{T}_n$ iff $A$ is of the form $S^n \times B$ for some measurable $B \in \mathcal{F}$. Equivalently, $A \in \mathcal{T}_n$ iff for every $(x_0, x_1, \ldots) \in A$ and $(y_0, \ldots, y_{n-1}) \in S^n$ it holds that $(y_0, \ldots, y_{n-1}, x_n, x_{n+1}, \ldots) \in A$.

Another important sigma-algebra is the shift-invariant sigma-algebra $\mathcal{I}$. To define it, let $\varphi : S^{\mathbb{N}} \to S^{\mathbb{N}}$ be the shift map given by $\varphi(x_0, x_1, x_2, \ldots) = (x_1, x_2, \ldots)$. Then $\mathcal{I}$ is the collection of measurable subsets of $S^{\mathbb{N}}$ that are $\varphi^{-1}$-invariant. That is, $A \in \mathcal{I}$ iff for every $(x_0, x_1, x_2, \ldots) \in A$ it holds that $\varphi^{-1}(x_0, x_1, x_2, \ldots) = S \times \{(x_0, x_1, x_2, \ldots)\} \subseteq A$.

Exercise. Find an irreducible Markov chain on the state space $\mathbb{N}$ that has a random variable that is $\mathcal{T}$-measurable but not $\mathcal{I}$-measurable.

Claim. $Z_\infty$ is $\mathcal{I}$-measurable and $\mathcal{T}$-measurable.

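Proof. Fix some $z \in \mathbb{R}$, and let $A = \{Z_\infty \le z\}$. To show that $Z_\infty$ is $\mathcal{I}$-measurable it suffices to show that $A \in \mathcal{I}$. Let $(x_0, x_1, \ldots) \in A$, so that $\lim_n f(x_n) \le z$. But then $\lim_n f(x_{n-1}) \le z$, and so any $(x, x_0, x_1, x_2, \ldots) \in A$. Thus $A \in \mathcal{I}$.

$Z_\infty$ is clearly $\mathcal{T}_n$-measurable for every $n$, and therefore is also $\mathcal{T}$-measurable.

Since $Z_\infty$ is bounded we have that $Z_\infty \in L^\infty(\mathcal{I})$.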

16. The Choquet-Deny Theorem

As motivation, consider the simple random walk $(X_1, X_2, \ldots)$ on $\mathbb{Z}^3$. Let $P_n = X_n/|X_n|$ be the projection of $X_n$ to the unit sphere (and assume $P_n = 0$ whenever $X_n = 0$). Since this random walk is transient, it is easy to deduce that $\lim_n |X_n| = \infty$. It follows that $\lim_n |P_{n+1} - P_n| = 0$; that is, the projection moves more and more slowly. A natural question is: does $P_n$ converge?

Returning to a general Markov chain $(X_0, X_1, \ldots)$, let $f$ be bounded and harmonic, so that $Z_n = f(X_n)$ is a bounded martingale. Then by the martingale and Markov properties we have that $Z_n = \mathbb{E}[Z_\infty|X_n]$, and so $f(x) = \mathbb{E}[Z_\infty|X_n = x]$ whenever $\mathbb{P}[X_n = x] > 0$. Conversely, choose any $W \in L^\infty(\mathcal{T})$, and define a function $f(x) = \mathbb{E}[W|X_n = x]$ for some $n$ s.t. $\mathbb{P}[X_n = x] > 0$. We claim that $f$ is well defined, since for any such $n$
\[ f(x) = \mathbb{E}[W|X_n = x] = \mathbb{E}_x[W]. \]
It is easy to verify that $f$ is harmonic.

Denote by $h^\infty(S, P) \subseteq \ell^\infty(S)$ the bounded $P$-harmonic functions. Then the map $\Phi : L^\infty(\mathcal{T}) \to h^\infty(S, P)$ given by $\Phi : W \mapsto f$ is a linear isometry. Its inverse is the map $\Phi^{-1} : f \mapsto \lim_n f(X_n)$.

Theorem 16.1 (Choquet-Deny Theorem). Let $(Y_1, Y_2, \ldots)$ be i.i.d. random variables taking values in some countable abelian group $G$. Let $X_n = \sum_{k \le n} Y_k$. Then $(X_1, X_2, \ldots)$ is a time homogeneous Markov chain over the state space $G$. If $(X_1, X_2, \ldots)$ is also irreducible then every $W \in L^\infty(\mathcal{T})$ is constant.

For the proof of this theorem we will need an important classical result from convex analysis.

Theorem 16.2 (Krein-Milman Theorem). Let $X$ be a Hausdorff locally convex topological vector space, and let $C$ be a compact convex subset of $X$. A point $x \in C$ is extreme if whenever $x$ is equal to the non-trivial convex combination $\alpha y + (1 - \alpha)z$ of points $y, z \in C$ then $y = z$. Then every $x \in C$ can be written as the limit of convex combinations of extreme points in $C$.

Proof of Theorem 16.1. Denote by $P$ the transition matrix of $(X_1, X_2, \ldots)$, and let $\mu(g) = \mathbb{P}[Y_n = g]$. Then $P(g, k) = \mu(k - g)$. Thus, if $f$ is $P$-harmonic then
\[ f(g) = \sum_{k \in G} f(k) P(g, k) = \sum_{k \in G} f(k) \mu(k - g) = \sum_{k \in G} f(g + k) \mu(k). \]
Let $H = h_{[0,1]}(G, P)$ be the set of all $P$-harmonic functions with range in $[0, 1]$. We note that harmonicity is invariant to multiplication by a constant and addition of a constant, and so if we show that every $f \in h_{[0,1]}(G, P)$

is constant then we have shown that every bounded harmonic function $f \in h^\infty(G, P)$ is constant. It then follows that every $W \in L^\infty(\mathcal{T})$ is constant, by the fact that $\Phi$ is an isometry.

We state three properties of $H$ that are easy to verify.
(1) $H$ is invariant to the $G$ action: for any $f \in H$ and $g \in G$, the function $f^g : G \to \mathbb{R}$ given by $[f^g](k) = f(k - g)$ is also in $H$.
(2) $H$ is compact in the topology of pointwise convergence.
(3) $H$ is convex.

As a convex compact space, $H$ is the closed convex hull of its extreme points; this is the Krein-Milman Theorem. Thus $H$ has extreme points. Let $f \in H$ be an extreme point. Then, since $f$ is harmonic,
\[ f(g) = \sum_{k \in G} f(g + k) \mu(k) = \sum_{k \in G} f^{-k}(g) \mu(k). \]
By the first property of $H$ each $f^{-k}$ is also in $H$, and thus we have written $f$ as a convex combination of functions in $H$. But $f$ is extreme, and so $f = f^{-k}$ for all $k$ in the support of $\mu$. But since the Markov chain is irreducible, the support of $\mu$ generates $G$. Hence $f$ is invariant to the $G$-action, and therefore constant.

An immediate corollary of the Choquet-Deny Theorem is that every event in $\mathcal{T}$ has probability either 0 or 1.

As an application, consider the question on the simple random walk on $\mathbb{Z}^3$. We would like to show that $P_n$ does not converge pointwise. Note that the event that $P_n$ converges is a shift-invariant event, and therefore has measure in $\{0, 1\}$. Assume by contradiction that it has measure 1, and let $P_\infty = \lim_n P_n$. For each Borel subset $B$ of the sphere, the event that $P_\infty \in B$ is shift-invariant, and therefore has measure in $\{0, 1\}$. For each $k \in \mathbb{N}$, partition the sphere into finitely many disjoint Borel sets of diameter at most $1/k$. Then $\mathbb{P}[P_\infty \in B] = 1$ for exactly one of these sets, which we call $B_k$. The intersection of all these $B_k$'s is a singleton; let $b$ be the point of the sphere that it contains. Then we have shown that $P_\infty$ is equal to $b$, almost surely. Note that so far we have not used the fact that the random walk is simple. Finally, because the random walk is simple, then by the symmetry of the problem, it must hold that such a point $b$ is invariant to reflection about the $x$-$y$, $y$-$z$ and $x$-$z$ planes, which is impossible.

Claim 16.3. Every event in $\mathcal{T}$ has measure either 0 or 1.
Exercise. Derive Kolmogorov's zero-one law from Claim 16.3.

17. Characteristic functions and the Central Limit Theorem

Let $X$ be a real random variable. The characteristic function $\varphi_X : \mathbb{R} \to \mathbb{C}$ of $X$ is given by
\[ \varphi_X(t) = \mathbb{E}[e^{itX}] = \mathbb{E}[\cos(tX)] + i \cdot \mathbb{E}[\sin(tX)]. \]
This expectation exists for any real random variable $X$ and any real $t$, since the sine and cosine functions are bounded. Note that
\[ \varphi_{aX+b}(t) = \mathbb{E}[e^{it(aX+b)}] = \mathbb{E}[e^{itaX} \cdot e^{itb}] = \varphi_{aX}(t) \cdot e^{itb} = \varphi_X(at) \cdot e^{itb}. \]

Exercise. $\varphi_X$ is continuous, and is differentiable $n$ times if $X \in L^n$. In this case $\varphi_X^{(n)}(0) = i^n \cdot \mathbb{E}[X^n]$.

If $X$ and $Y$ are independent, then
\[ \varphi_{X+Y}(t) = \mathbb{E}[e^{it(X+Y)}] = \mathbb{E}[e^{itX}] \cdot \mathbb{E}[e^{itY}] = \varphi_X(t) \cdot \varphi_Y(t). \]

A real random variable $X$ is said to have a probability density function (or p.d.f.) $f_X : \mathbb{R} \to \mathbb{R}$ if for any measurable $h : \mathbb{R} \to \mathbb{R}$ it holds that
\[ \mathbb{E}[h(X)] = \int_{-\infty}^{\infty} h(x) f_X(x)\,dx, \]
whenever the l.h.s. exists. In this case
\[ \varphi_X(t) = \int_{-\infty}^{\infty} e^{itx} f_X(x)\,dx, \]
so that $\varphi_X$ is the Fourier transform of $f_X$.

Let $X$ be a real random variable. Recall that the Cumulative Distribution Function (or c.d.f.) $F : \mathbb{R} \to [0, 1]$ is given by $F(x) = \mathbb{P}[X \le x]$. We saw in Example 3.5 that $F$ uniquely determines the distribution of $X$.

Theorem 17.2 (Lévy's Inversion Formula). Let $X$ be a real random variable. For every $b > a$ such that $\mathbb{P}[X = a] = \mathbb{P}[X = b] = 0$ it holds that
\[ F(b) - F(a) = \lim_{T \to \infty} \frac{1}{2\pi} \int_{-T}^{T} \frac{e^{-ita} - e^{-itb}}{it}\,\varphi_X(t)\,dt. \]

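Since there are at most countably many $c \in \mathbb{R}$ such that $\mathbb{P}[X = c] > 0$, $F$ is determined by $\varphi_X$.

Let $X$ be a standard Gaussian (or normal) random variable. This is a real random variable with p.d.f. $f_X(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$. It is easy to calculate that $\varphi_X(t) = e^{-\frac{1}{2}t^2}$. Thus if $X_1$ and $X_2$ are independent standard Gaussians then
\[ \varphi_{(X_1 + X_2)/\sqrt{2}}(t) = e^{-\frac{1}{2}t^2}, \]
and more generally the same holds for $(X_1 + \cdots + X_n)/\sqrt{n}$.

If $(X_1, X_2, \ldots)$ are (not necessarily Gaussian) i.i.d. and $Y_n = \sum_{k \le n} X_k$ then
\[ \varphi_{Y_n}(t) = \varphi_X(t)^n. \]
If we define
\[ Z_n = \frac{1}{\sqrt{n}} Y_n = \frac{1}{\sqrt{n}} \sum_{k \le n} X_k \]
then
\[ \varphi_{Z_n}(t) = \varphi_X(t/\sqrt{n})^n. \]
Now, let $\mathbb{E}[X] = 0$ and $\mathbb{E}[X^2] = 1$. Since $X \in L^2$ then $\varphi_X$ is twice differentiable and
\[ \varphi_X(0) = 1, \qquad \varphi_X'(0) = 0, \qquad \varphi_X''(0) = -1. \]
It is an exercise to show that it follows that
\[ \varphi_X(t) = 1 - \tfrac{1}{2}t^2 + o(t^2), \]
where here we mean by $o(t^2)$ that as $t \to 0$ it holds that
\[ \frac{\varphi_X(t) - 1 + \tfrac{1}{2}t^2}{t^2} \to 0. \]
Thus we have that
\[ \varphi_{Z_n}(t) = \varphi_X(t/\sqrt{n})^n = \left(1 - \tfrac{1}{2}t^2/n + o(t^2/n)\right)^n, \]
and thus
\[ \lim_n \varphi_{Z_n}(t) = e^{-\frac{1}{2}t^2}. \]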

As we know, $e^{-\frac{1}{2}t^2}$ is the characteristic function of a standard Gaussian. Thus we have proved that if $G$ is a standard Gaussian then for any $t \in \mathbb{R}$ it holds that
\[ \mathbb{E}[e^{itZ_n}] \to \mathbb{E}[e^{itG}]. \]
This is almost the central limit theorem.
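The convergence of characteristic functions can be observed numerically in a case where $\varphi_X$ has a closed form. The sketch below assumes fair $\pm 1$ steps (an illustrative choice, not one made in the notes), for which $\varphi_X(t) = \cos t$, $\mathbb{E}[X] = 0$ and $\mathbb{E}[X^2] = 1$, so that $\varphi_{Z_n}(t) = \cos(t/\sqrt{n})^n$. The function names are hypothetical.

```python
from math import cos, exp, sqrt


def phi_Z(t, n):
    """Characteristic function of Z_n = (W_1 + ... + W_n) / sqrt(n),
    where the W_k are i.i.d. with P[W = +1] = P[W = -1] = 1/2."""
    return cos(t / sqrt(n)) ** n


def gaussian_cf(t):
    """Characteristic function of a standard Gaussian."""
    return exp(-t * t / 2)
```

Evaluating `abs(phi_Z(t, n) - gaussian_cf(t))` for increasing `n` at a fixed `t` shows the gap shrinking, roughly in proportion to $1/n$ for this particular step distribution.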

18. Scenery Reconstruction: I

Fix $n$, and let $(X = X_1, X_2, \ldots)$ be i.i.d. random variables on the abelian group $\mathbb{Z}/n\mathbb{Z}$. Denote by $\mu(k) = \mathbb{P}[X = k]$ their law. Let $X_0$ be uniformly distributed on $\mathbb{Z}/n\mathbb{Z}$, and let $Z_m = \sum_{k=0}^{m} X_k$ be the corresponding random walk. We assume throughout that the support of $\mu$ generates $\mathbb{Z}/n\mathbb{Z}$. Some important examples to keep in mind:
• $\mu(1) = 1$.
• $\mu(1) = 1 - \varepsilon$, $\mu(2) = \varepsilon$.

Fix some $f \in \{0, 1\}^{\mathbb{Z}/n\mathbb{Z}}$, and let $F_m = f(Z_m)$. The law of $(F_1, F_2, \ldots)$ depends on $f$; we think of these distributions as a family indexed by $f$. We denote by $\mathbb{P}_f[\cdot]$ the distribution when we fix a particular $f$. Note that $\mathbb{P}_f[\cdot]$ does not change if we shift $f$.

Exercise. Prove this.

Denote by $[f]$ the equivalence class of $f$ under shifts. That is, $f' \in [f]$ if there is some $k \in \mathbb{Z}/n\mathbb{Z}$ such that for every $m \in \mathbb{Z}/n\mathbb{Z}$ it holds that $f'(k + m) = f(m)$.

The question of scenery reconstruction is the following: is it possible to determine $[f]$ given $(F_1, F_2, \ldots)$? In particular we say that we can reconstruct $f$ if there is some measurable $\hat{f} : \{0, 1\}^{\mathbb{N}} \to \{0, 1\}^{\mathbb{Z}/n\mathbb{Z}}$ such that for every $f \in \{0, 1\}^{\mathbb{Z}/n\mathbb{Z}}$ it holds that
\[ \mathbb{P}_f\left[\hat{f}(F_1, F_2, \ldots) \in [f]\right] = 1. \qquad (18.1) \]
Equivalently, if
\[ \mathbb{P}\left[\hat{f}(f(Z_1), f(Z_2), \ldots) \in [f]\right] = 1. \]
In statistics, $\hat{f}$ is called an estimator of $f$, and the existence of such an $\hat{f}$ is called identifiability (of $f$). This clearly depends on $\mu$, and so we say that $\mu$ is reconstructive if this holds.

One can reformulate (18.1) in finitary terms. It is equivalent to the existence of a sequence $(\hat{f}_1, \hat{f}_2, \ldots)$ with $\hat{f}_k$ being $\sigma(F_1, \ldots, F_k)$-measurable and with
\[ \lim_k \mathbb{P}_f\left[\hat{f}_k(F_1, \ldots, F_k) \in [f]\right] = 1 \]
for all $f \in \{0, 1\}^{\mathbb{Z}/n\mathbb{Z}}$.

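A very interesting question is how quickly this converges to one (when it does), for $\mu$ chosen uniformly over $n$, that is, with the same $\mu$ for every $n$; for example for $\mu(1) = 0.99$, $\mu(2) = 0.01$.

Question. Let $N(n, \varepsilon)$ be the smallest $k$ such that there is an $\hat{f}_k : \{0, 1\}^k \to \{0, 1\}^{\mathbb{Z}/n\mathbb{Z}}$ with
\[ \mathbb{P}_f\left[\hat{f}_k(F_1, \ldots, F_k) \in [f]\right] \ge 1 - \varepsilon \]
for all $f$. For fixed $\varepsilon$ (say 1/3), how does $N(n, \varepsilon)$ grow with $n$?

This is not known; it is not even known if $N(n, \varepsilon)$ is exponential or polynomial. The question of whether a given $\mu$ is reconstructive is much better understood.

Theorem. Let $n$ be a prime $> 5$, and let $\mu$ take rational values. Then $\mu$ is reconstructive iff $\varphi_\mu(k) \ne \varphi_\mu(m)$ for all $k \ne m$.

Here $\varphi_\mu$ is given by
\[ \varphi_\mu(k) = \varphi_X(k) = \mathbb{E}\left[e^{\frac{2\pi i}{n} k \cdot X}\right] = \sum_{l \in \mathbb{Z}/n\mathbb{Z}} e^{\frac{2\pi i}{n} k \cdot l}\,\mu(l), \]
where $k \cdot X$ is multiplication mod $n$.

The first direction (the case that $\varphi_\mu(k) \ne \varphi_\mu(m)$ for all $k \ne m$) does not require the extra assumptions on $n$ and $\mu$. This is due to Matzinger and Lember [3].

To prove this theorem we will need to study a few new concepts.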

19. Stationary distributions and processes

Given a transition matrix $P$ on some state space $S$, and given a Markov chain $(X_1, X_2, \ldots)$ over this $P$, the law of $X_2$ is given by
\[ \mathbb{P}[X_2 = t] = \sum_s \mathbb{P}[X_1 = s, X_2 = t] = \sum_s \mathbb{P}[X_1 = s]\,\mathbb{P}[X_2 = t|X_1 = s] = \sum_s \mathbb{P}[X_1 = s]\,P(s, t). \]
Thus, if we think of the distributions of $X_1$ and $X_2$ as vectors $v_1, v_2 \in \ell^1(S)$, then we have that $v_2 = v_1 P$.

A non-negative left eigenvector of $P$ with eigenvalue 1, normalized to sum to one, is called a stationary distribution of $P$. It corresponds to a distribution of $X_1$ that induces the same distribution on $X_2$. By the Perron-Frobenius Theorem, if $S$ is finite then $P$ has a stationary distribution. Furthermore, if $P$ is also irreducible then this distribution is unique.

Exercise. The uniform distribution on $\mathbb{Z}/n\mathbb{Z}$ is the unique stationary distribution of the $\mu$ random walk (recall that $\mu$ is generating).

Let $(Y_1, Y_2, \ldots)$ be a general process. We say that this process is stationary (or shift-invariant) if its law is the same as the law of $(Y_2, Y_3, \ldots)$. Equivalently, for every $n$, the law of $(Y_{k+1}, \ldots, Y_{k+n})$ is independent of $k$.

Exercise. Show that the two definitions are indeed equivalent.

Claim. If $(Y_1, Y_2, \ldots)$ is a Markov chain, and if the distribution of $Y_1$ is stationary, then $(Y_1, Y_2, \ldots)$ is a stationary process.

Returning to our scenery reconstruction problem, we can use what we learned above to deduce that $(Z_1, Z_2, \ldots)$ is a stationary process. It easily follows that
\[ (F_1, F_2, \ldots) = (f(Z_1), f(Z_2), \ldots) \]
is also a stationary process.
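The relation $v_2 = v_1 P$ and the stationarity of the uniform distribution for the $\mu$ random walk on $\mathbb{Z}/n\mathbb{Z}$ can be checked directly. The following is a minimal sketch with hypothetical helper names; $\mu$ is passed as a list indexed by group elements, and exact rationals are used so that equality can be tested without rounding.

```python
from fractions import Fraction


def transition_matrix(mu):
    """P(x, y) = mu(y - x mod n) for the mu random walk on Z/nZ."""
    n = len(mu)
    return [[mu[(y - x) % n] for y in range(n)] for x in range(n)]


def step(v, P):
    """One step of the chain on distributions: the row vector v P."""
    n = len(v)
    return [sum(v[x] * P[x][y] for x in range(n)) for y in range(n)]
```

For instance, with $n = 5$, $\mu(1) = 9/10$ and $\mu(2) = 1/10$, applying `step` to the uniform vector returns the uniform vector, while applying it to a point mass at 0 returns $\mu$ itself.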

20. Scenery reconstruction: II

Fix $n \in \mathbb{N}$, $f \in \{0, 1\}^{\mathbb{Z}/n\mathbb{Z}}$ and $\mu$ a generating probability measure on $\mathbb{Z}/n\mathbb{Z}$, and recall our process in which $X_0$ is uniform on $\mathbb{Z}/n\mathbb{Z}$, $(X_1, X_2, \ldots)$ are i.i.d. with law $\mu$, $Z_m = X_0 + X_1 + \cdots + X_m$ and $F_m = f(Z_m)$. Recall also that we are interested in guessing (correctly, almost surely) what $[f]$ is (the equivalence class of functions that are shifts of $f$) from a single random instance of $(F_1, F_2, \ldots)$.

Define $a : \mathbb{Z}/n\mathbb{Z} \to \mathbb{R}$, the autocorrelation of $f$, by
\[ a(k) = \frac{1}{n} \sum_{m=0}^{n-1} f(m) \cdot f(m + k), \]
and note that $a$ is the same for any $f' \in [f]$. We also write $a_k = a(k)$. Imagine that we are willing to settle on reconstructing $a$ rather than $[f]$. We will show that if the values of the characteristic function $\varphi_\mu$ are unique then we can reconstruct the $a_k$'s.

To this end, we define $\alpha : \mathbb{N} \to \mathbb{R}$, the autocorrelation of $F$, by
\[ \alpha_k = \mathbb{E}[F_T \cdot F_{T+k}] \]
for some $T \in \mathbb{N}$; by stationarity, the choice of $T$ is immaterial. We will show that if we know the $\alpha_k$'s then we can infer the $a_k$'s. But this will not help us, unless there is some measurable $\hat{\alpha}_k : \{0, 1\}^{\mathbb{N}} \to \mathbb{R}$ such that
\[ \mathbb{P}_f[\hat{\alpha}_k(F_1, F_2, \ldots) = \alpha_k] = 1. \]
A natural candidate for $\hat{\alpha}_k$ is the empirical average; we take lim sup rather than lim to make sure $\hat{\alpha}_k$ is well defined:
\[ \hat{\alpha}_k = \limsup_m \frac{1}{m} \sum_{T=1}^{m} F_T \cdot F_{T+k}. \]
A statement such as $\hat{\alpha}_k = \alpha_k$ almost surely sounds a lot like the strong law of large numbers. We will show later that this is indeed true, and that it follows from the Ergodic Theorem, which is a generalization of the SLLN.

Let $\mu * \mu$ be the convolution of $\mu$ with itself, which is given by
\[ [\mu * \mu](k) = \sum_m \mu(k - m) \cdot \mu(m). \]
This is a probability distribution which is exactly the law of $X_1 + X_2$. Define analogously the $k$-fold convolution $\mu^{(k)}$, which is the law of $X_1 + \cdots + X_k$.

Claim. For every $k \in \mathbb{N}$ it holds that
\[
\alpha_k = \sum_m \mu^{(k)}(m) \cdot a_m.
\]

Proof. We set $T = 0$, condition on $X_0$ and $Z_k$, and thus
\[
\begin{aligned}
\alpha_k = \mathbb{E}[f(Z_0) \cdot f(Z_k)]
&= \sum_{m,l} \mathbb{E}[f(X_0) \cdot f(Z_k) \mid X_0 = l, Z_k = l + m] \cdot \mathbb{P}[X_0 = l, Z_k = l + m] \\
&= \sum_{m,l} f(l) \cdot f(l + m) \cdot \frac{1}{n} \mu^{(k)}(m) \\
&= \sum_m a_m \cdot \mu^{(k)}(m).
\end{aligned}
\]

It follows that if we denote by $\alpha$ the column vector $(\alpha_0, \ldots, \alpha_{n-1})$, by $a$ the column vector $(a_0, \ldots, a_{n-1})$, and by $M$ the matrix $M_{k,m} = \mu^{(k)}(m)$, then $\alpha = Ma$. Assuming (as we will show later) that we can determine $\alpha$, it follows that we can determine $a$ if $M$ is invertible.

Claim. $M$ is invertible iff the values of the characteristic function $\varphi_\mu$ are unique (that is, pairwise distinct).

Proof. We apply the Fourier transform to each row of $M$. Since the Fourier transform is an invertible (indeed, up to normalization, unitary) linear transformation, the resulting matrix $\hat{M}$ is invertible iff $M$ is invertible. Now, over $\mathbb{Z}/n\mathbb{Z}$ the Fourier transform is identical to the characteristic function. Since the $k$th row of $M$ is the law of $X_1 + \cdots + X_k$, the $k$th row of $\hat{M}$ is given by
\[
\varphi_{X_1 + \cdots + X_k}(m) = \mathbb{E}\left[e^{\frac{2\pi i}{n} m (X_1 + \cdots + X_k)}\right] = \varphi_X(m)^k.
\]
Thus $\hat{M}$ is a Vandermonde matrix, and is invertible iff $\varphi_X$ has unique values.

Recall that we are interested in reconstructing $[f]$ rather than $a$. To this end we need to define the two-fold autocorrelation
\[
a_{k,l} = \frac{1}{n} \sum_{m=0}^{n-1} f(m) \cdot f(m + k) \cdot f(m + k + l),
\]
and its analogue
\[
\alpha_{k,l} = \mathbb{E}[F_T \cdot F_{T+k} \cdot F_{T+k+l}].
\]

It is then easy to show that there is also a linear relation between these two objects, with the corresponding matrix being $M \otimes M$, the tensor product of $M$ with itself. This is invertible iff $M$ is invertible, and so we get the same result. However, this still does not suffice, and we need to add still more indices and calculate $n$-fold autocorrelations. The appropriate matrices are again invertible iff $M$ is, and moreover $[f]$ is uniquely determined by the $n$-fold autocorrelation.
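The relation $\alpha = Ma$ and the invertibility criterion for $M$ can be checked numerically on a small example. The sketch below is illustrative and makes several assumptions: rows of $M$ are indexed by $k = 0, \ldots, n-1$ with $\mu^{(0)}$ the point mass at 0, the scenery `f` and the measures `mu` and `mu_sym` are made-up examples, and all function names are hypothetical. It computes $\alpha_k$ both through $M$ and by brute-force enumeration of step sequences, and compares $\det M$ with distinctness of the values of $\varphi_\mu$.

```python
import cmath
from fractions import Fraction
from itertools import product


def autocorrelation(f):
    n = len(f)
    return [Fraction(sum(f[m] * f[(m + k) % n] for m in range(n)), n)
            for k in range(n)]


def convolution_matrix(mu):
    """Rows mu^(0), ..., mu^(n-1), where mu^(0) is the point mass at 0."""
    n = len(mu)
    rows = [[Fraction(1)] + [Fraction(0)] * (n - 1)]
    for _ in range(n - 1):
        prev = rows[-1]
        rows.append([sum(prev[(k - m) % n] * mu[m] for m in range(n))
                     for k in range(n)])
    return rows


def alpha_brute_force(f, mu, k):
    """E[f(Z_0) f(Z_k)], enumerating X_0 and all k-step sequences."""
    n = len(f)
    support = [x for x in range(n) if mu[x] != 0]
    total = Fraction(0)
    for x0 in range(n):
        for steps in product(support, repeat=k):
            prob = Fraction(1, n)
            for s in steps:
                prob *= mu[s]
            total += prob * f[x0] * f[(x0 + sum(steps)) % n]
    return total


def det(rows):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [row[:] for row in rows]
    n, d = len(a), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return d


def char_values(mu):
    """phi_mu(m) = sum_x mu(x) exp(2 pi i m x / n) for m = 0, ..., n-1."""
    n = len(mu)
    return [sum(float(mu[x]) * cmath.exp(2j * cmath.pi * m * x / n)
                for x in range(n)) for m in range(n)]


def all_distinct(values, tol=1e-9):
    return all(abs(u - v) > tol
               for i, u in enumerate(values) for v in values[i + 1:])


f = [1, 0, 0, 1, 1]
mu = [Fraction(0), Fraction(1, 3), Fraction(2, 3), Fraction(0), Fraction(0)]
mu_sym = [Fraction(0), Fraction(1, 2), Fraction(0), Fraction(0), Fraction(1, 2)]

n, a, M = len(f), autocorrelation(f), convolution_matrix(mu)
alpha = [sum(M[k][m] * a[m] for m in range(n)) for k in range(n)]
print(alpha == [alpha_brute_force(f, mu, k) for k in range(n)])
print(det(M) != 0, all_distinct(char_values(mu)))
print(det(convolution_matrix(mu_sym)) != 0, all_distinct(char_values(mu_sym)))
```

For the symmetric measure `mu_sym` the characteristic function is real and satisfies $\varphi(1) = \varphi(4)$, so this is an example where reconstruction of $a$ from $\alpha$ by inverting $M$ is not available.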

21. Stationary processes and measure preserving transformations

We say that a stationary process $(Y_1, Y_2, \ldots)$ is ergodic if its shift-invariant sigma-algebra is trivial. That is, if for every shift-invariant event $A$ it holds that $\mathbb{P}[A] \in \{0, 1\}$. Some examples:

- An i.i.d. process is obviously stationary. By Kolmogorov's zero-one law its tail sigma-algebra is trivial, and so its shift-invariant sigma-algebra is also trivial. Thus it is ergodic.

- Let $(Y_1, Y_2, \ldots)$ be binary random variables such that
\[
\mathbb{P}[(Y_1, Y_2, \ldots) = (1, 1, \ldots)] = 1/2
\quad\text{and}\quad
\mathbb{P}[(Y_1, Y_2, \ldots) = (0, 0, \ldots)] = 1/2.
\]
This process is stationary but not ergodic; the event $\lim_n Y_n = 1$ is shift-invariant and has probability 1/2.

- Let $(Y_1, Y_2, \ldots)$ be binary random variables such that
\[
\mathbb{P}[(Y_1, Y_2, \ldots) = (1, 0, 1, 0, \ldots)] = 1/2
\quad\text{and}\quad
\mathbb{P}[(Y_1, Y_2, \ldots) = (0, 1, 0, 1, \ldots)] = 1/2.
\]
This process is stationary and ergodic.

- Let $P$ be chosen uniformly over $[0, 1]$, and let $(Y_1, Y_2, \ldots)$ be binary random variables, which conditioned on $P$ are i.i.d. Bernoulli with parameter $P$. This process is stationary but not ergodic. For example, the event that
\[
\lim_{n \to \infty} \frac{1}{n} \sum_{k \le n} Y_k \le 1/2
\]
is a shift-invariant event that has probability 1/2.

- Let $(Y_1, Y_2, \ldots)$ be a Markov chain, with the distribution of $Y_1$ equal to some stationary distribution. Then this process is stationary. It is ergodic iff the distribution of $Y_1$ is not a non-trivial convex combination of two different stationary distributions.

- Let $Y_1$ be distributed uniformly on $[0, 1)$. Fix some $0 < \alpha < 1$, and let $Y_{n+1} = Y_n + \alpha \bmod 1$. This is a stationary process, and it is ergodic iff $\alpha$ is irrational.

A generalization of the last example is the following. Let $(\Omega, \mathcal{F}, \nu)$ be a probability space, and let $T \colon \Omega \to \Omega$ be a measurable transformation that preserves $\nu$. That is, $\nu(A) = \nu(T^{-1}(A))$ for all $A \in \mathcal{F}$. We say

that $A \in \mathcal{F}$ is $T$-invariant if $T^{-1}(A) = A$, and note that the collection of $T$-invariant sets is a sub-sigma-algebra. Let $Y_1$ have law $\nu$, and let each $Y_{n+1} = T(Y_n)$. Then $(Y_1, Y_2, \ldots)$ is a stationary process.

Claim. $(Y_1, Y_2, \ldots)$ is ergodic iff for every $T$-invariant $A \in \mathcal{F}$ it holds that $\nu(A) \in \{0, 1\}$.

Proof. The map $\pi \colon \Omega \to \Omega^{\mathbb{N}}$ given by $\pi(\omega) = (\omega, T(\omega), T^2(\omega), \ldots)$ is a bijection onto its image that pushes the measure $\nu$ to the law $\mathbb{P}$ of $(Y_1, Y_2, \ldots)$, and thus these two probability spaces are isomorphic. Furthermore, if we denote the shift by $\sigma \colon \Omega^{\mathbb{N}} \to \Omega^{\mathbb{N}}$, then $\pi$ is equivariant, in the sense that $\pi \circ T = \sigma \circ \pi$. It follows that the $T$-invariant sigma-algebra is mapped to the shift-invariant sigma-algebra, and thus one is trivial iff the other is trivial.

Of course, if we have a process $(Y_1, Y_2, \ldots)$ taking values in $\Omega$, then stationarity is precisely invariance w.r.t. the shift transformation $T \colon \Omega^{\mathbb{N}} \to \Omega^{\mathbb{N}}$ given by $T(x_1, x_2, x_3, \ldots) = (x_2, x_3, \ldots)$. Thus stationary processes and measure preserving transformations are two manifestations of the same object. We say that $T$ is ergodic if the $T$-invariant sigma-algebra is trivial.

Claim. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, with $T \colon \Omega \to \Omega$ an ergodic measure preserving transformation. If $Z \colon \Omega \to \mathbb{R}$ is a $T$-invariant random variable (i.e., $Z(\omega) = Z(T(\omega))$ for all $\omega \in \Omega$) then there is some $z \in \mathbb{R}$ such that $\mathbb{P}[Z = z] = 1$.

Exercise. Prove this claim. Hint: For any $a < b \in \mathbb{R}$, the event $Z \in [a, b]$ is $T$-invariant, and thus has measure either 0 or 1.

If $(\Omega, \mathcal{F})$ is a measurable space, if $T \colon \Omega \to \Omega$ is measurable, and if $\nu_1$ and $\nu_2$ are $T$-invariant, then it is easy to see that any convex combination of $\nu_1$ and $\nu_2$ is also $T$-invariant. Thus the set of $T$-invariant probability measures on $(\Omega, \mathcal{F})$ is convex. An extreme $T$-invariant probability measure on $(\Omega, \mathcal{F})$ is one that cannot be written as a non-trivial convex combination of two different invariant measures. We will later show (Proposition 24.2) that the extreme measures are precisely the ergodic ones.

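22. The Ergodic Theorem

Theorem 22.1 (The Pointwise Ergodic Theorem). Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, with $T \colon \Omega \to \Omega$ a measure preserving transformation. If $T$ is ergodic then for every $X \in L^1(\Omega, \mathcal{F}, \mathbb{P})$ it holds that for $\mathbb{P}$-almost every $\omega \in \Omega$
\[
\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} X(T^k(\omega)) = \mathbb{E}[X].
\]

In the language of stationary processes, one can say that if $(Y_1, Y_2, \ldots)$ is a stationary process with trivial shift-invariant sigma-algebra, and if $f(Y_1, Y_2, \ldots) \in L^1$, then almost surely
\[
\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} f(Y_k, Y_{k+1}, \ldots) = \mathbb{E}[f(Y_1, Y_2, \ldots)].
\]
This Theorem was originally proved by Birkhoff [1]. We give a proof due to Katznelson and Weiss [2].

Proof. We assume without loss of generality that $X$ is non-negative; otherwise apply the proof separately to $X^+$ and $X^-$. Define $X^* \colon \Omega \to \mathbb{R}$ by
\[
X^*(\omega) = \lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} X(T^k(\omega))
\]
whenever this limit exists. We want to show that it exists w.p. 1, and that $\mathbb{P}[X^* = \mathbb{E}[X]] = 1$. Define $\overline{X} \colon \Omega \to \mathbb{R}$ by
\[
\overline{X}(\omega) = \limsup_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} X(T^k(\omega)),
\]
and likewise
\[
\underline{X}(\omega) = \liminf_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} X(T^k(\omega)).
\]
Note that both are $T$-invariant, and so there are some $\overline{x}$ and $\underline{x}$ such that
\[
\mathbb{P}[\overline{X} = \overline{x},\ \underline{X} = \underline{x}] = 1.
\]
Proving that
\[
\overline{x} \le \mathbb{E}[X] \le \underline{x} \tag{22.1}
\]
will thus finish the proof, since $\underline{x} \le \overline{x}$ always holds.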

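Fix some $\varepsilon > 0$. Let $N(\omega)$ be the first positive integer such that
\[
\frac{1}{N(\omega)} \sum_{k=0}^{N(\omega)-1} X(T^k(\omega)) + \varepsilon \ge \overline{x}. \tag{22.2}
\]
Since $N(\omega)$ is a.s. finite, there is some $K \in \mathbb{N}$ such that the set $A = \{\omega : N(\omega) > K\}$ has measure less than $\varepsilon / \overline{x}$. Define
\[
\tilde{X}(\omega) =
\begin{cases}
X(\omega) & \omega \notin A \\
\max\{X(\omega), \overline{x}\} & \omega \in A,
\end{cases}
\]
and also
\[
\tilde{N}(\omega) =
\begin{cases}
N(\omega) & \omega \notin A \\
1 & \omega \in A.
\end{cases}
\]
Note that in analogy to (22.2) we have that
\[
\frac{1}{\tilde{N}(\omega)} \sum_{k=0}^{\tilde{N}(\omega)-1} \tilde{X}(T^k(\omega)) + \varepsilon \ge \overline{x},
\]
or, rearranging, that
\[
\sum_{k=0}^{\tilde{N}(\omega)-1} \tilde{X}(T^k(\omega)) \ge \tilde{N}(\omega)(\overline{x} - \varepsilon). \tag{22.3}
\]
Now $X$ and $\tilde{X}$ only differ on $A$, and when they do differ then it is at most by $\overline{x}$, since $X$ is non-negative. Hence
\[
\begin{aligned}
\mathbb{E}[\tilde{X}] &= \mathbb{E}[X + (\tilde{X} - X)] \\
&= \mathbb{E}[X] + \mathbb{E}[\tilde{X} - X] \\
&= \mathbb{E}[X] + \mathbb{E}[(\tilde{X} - X) \cdot 1_{\{A\}}] \\
&\le \mathbb{E}[X] + \mathbb{E}[\overline{x} \cdot 1_{\{A\}}] \\
&\le \mathbb{E}[X] + \overline{x} \cdot \varepsilon / \overline{x} = \mathbb{E}[X] + \varepsilon.
\end{aligned} \tag{22.4}
\]
Now, let $L = K \overline{x} / \varepsilon$. For each $\omega \in \Omega$, let $\omega_0 = \omega$ and let
\[
\omega_{j+1} = T^{\tilde{N}(\omega_j)}(\omega_j).
\]
It follows that
\[
\omega_j = T^{\tilde{N}(\omega_0) + \tilde{N}(\omega_1) + \cdots + \tilde{N}(\omega_{j-1})}(\omega).
\]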

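Let $J(\omega)$ be the maximal $j$ such that $\tilde{N}(\omega_0) + \tilde{N}(\omega_1) + \cdots + \tilde{N}(\omega_j) < L$, and let
\[
\tilde{N}_L(\omega) = \tilde{N}(\omega_0) + \tilde{N}(\omega_1) + \cdots + \tilde{N}(\omega_{J(\omega)}).
\]
Note that $\tilde{N}_L(\omega) > L - K$. Then we can write
\[
\sum_{k=0}^{L-1} \tilde{X}(T^k(\omega)) = \sum_{k=0}^{\tilde{N}(\omega_0)-1} \tilde{X}(T^k(\omega_0)) + \cdots + \sum_{k=0}^{\tilde{N}(\omega_{J(\omega)})-1} \tilde{X}(T^k(\omega_{J(\omega)})) + \sum_{k=\tilde{N}_L(\omega)}^{L-1} \tilde{X}(T^k(\omega)).
\]
Applying (22.3) to each term but the last yields
\[
\sum_{k=0}^{L-1} \tilde{X}(T^k(\omega)) \ge \tilde{N}_L(\omega)(\overline{x} - \varepsilon) + \sum_{k=\tilde{N}_L(\omega)}^{L-1} \tilde{X}(T^k(\omega)),
\]
and using the fact that $\tilde{X}$ is non-negative means
\[
\sum_{k=0}^{L-1} \tilde{X}(T^k(\omega)) \ge \tilde{N}_L(\omega)(\overline{x} - \varepsilon).
\]
Since $\tilde{N}_L(\omega) > L - K$ we can apply this estimate too, and, rearranging, arrive at
\[
\frac{1}{L} \sum_{k=0}^{L-1} \tilde{X}(T^k(\omega)) \ge \overline{x} - \frac{K}{L}\,\overline{x} - \varepsilon,
\]
which by the choice of $L$ we can write as
\[
\frac{1}{L} \sum_{k=0}^{L-1} \tilde{X}(T^k(\omega)) \ge \overline{x} - 2\varepsilon.
\]
Now, by $T$-invariance the expectation of the l.h.s. is just equal to the expectation of $\tilde{X}$. Hence
\[
\mathbb{E}[\tilde{X}] \ge \overline{x} - 2\varepsilon.
\]
Putting this together with (22.4) yields
\[
\overline{x} \le \mathbb{E}[\tilde{X}] + 2\varepsilon \le \mathbb{E}[X] + 3\varepsilon,
\]
and taking $\varepsilon$ to zero yields $\overline{x} \le \mathbb{E}[X]$. This completes the first half of the proof of (22.1); the second follows by a similar argument.

Exercise. Use the Ergodic Theorem to prove the strong law of large numbers.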
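To see the statement of the Ergodic Theorem at work, here is a hedged numerical sketch (not from the original notes) using the rotation $T(x) = x + \alpha \bmod 1$ on $[0, 1)$, which preserves the uniform measure and is ergodic iff $\alpha$ is irrational. The observable (the indicator of $[0, 1/4)$), the starting point, the values of $\alpha$ and the function name `birkhoff_average` are all illustrative choices; floating-point arithmetic stands in for exact real numbers.

```python
import math


def birkhoff_average(x0, alpha, n, observable):
    """(1/n) * sum_{k<n} X(T^k(x0)) for the rotation T(x) = x + alpha mod 1."""
    return sum(observable((x0 + k * alpha) % 1.0) for k in range(n)) / n


def in_first_quarter(x):
    """Indicator of [0, 1/4); its expectation under the uniform law is 1/4."""
    return 1.0 if x < 0.25 else 0.0


n_steps = 100_000
irrational = birkhoff_average(0.1, math.sqrt(2) - 1, n_steps, in_first_quarter)
rational = birkhoff_average(0.1, 0.5, n_steps, in_first_quarter)

print(irrational)  # close to the space average 1/4
print(rational)    # the orbit only visits two points, so the average differs
```

With $\alpha = 1/2$ the orbit of $0.1$ alternates between (approximately) $0.1$ and $0.6$, so the time average depends on the starting point, which is what non-ergodicity looks like in practice.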

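23. The Radon-Nikodym derivative

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. Given a non-negative r.v. $X$ with $\mathbb{E}[X] = 1$, we can define the measure $\mathbb{Q} = X \cdot \mathbb{P}$ by
\[
\mathbb{Q}[A] = \mathbb{E}[1_{\{A\}} \cdot X] = \int_\Omega 1_{\{A\}}(\omega) \cdot X(\omega)\, d\mathbb{P}(\omega).
\]
It is easy to show that $X$ is the unique r.v. (up to almost sure equality) such that $\mathbb{Q} = X \cdot \mathbb{P}$. In this case we call $X$ the Radon-Nikodym derivative of $\mathbb{Q}$ with respect to $\mathbb{P}$, and denote
\[
\frac{d\mathbb{Q}}{d\mathbb{P}}(\omega) = X(\omega).
\]
Note that
\[
\mathbb{P}[A] = 0 \text{ implies } \mathbb{Q}[A] = 0, \tag{23.1}
\]
so that not every measure $\mathbb{Q}$ can be written as $X \cdot \mathbb{P}$ for some $X$. When $\mathbb{Q}$ and $\mathbb{P}$ satisfy (23.1) then we say that $\mathbb{Q}$ is absolutely continuous relative to $\mathbb{P}$.

Example.
- The uniform distribution on $[0, 1]$ is absolutely continuous relative to the uniform distribution on $[0, 2]$.
- If $\mathbb{P}[A] > 0$ then $\mathbb{P}[\,\cdot \mid A]$ is absolutely continuous relative to $\mathbb{P}$.
- The point mass $\delta_{1/2}$ is not absolutely continuous relative to the uniform distribution on $[0, 1]$.
- The i.i.d. $q$ measure on $\{0, 1\}^{\mathbb{N}}$ is not absolutely continuous relative to the i.i.d. $p$ measure on $\{0, 1\}^{\mathbb{N}}$, unless $p = q$.

Lemma 23.2. If $\mathbb{Q}$ is absolutely continuous relative to $\mathbb{P}$, then for each $\varepsilon > 0$ there exists a $\delta > 0$ such that, for every measurable $A$, $\mathbb{P}[A] < \delta$ implies $\mathbb{Q}[A] < \varepsilon$.

Proof. Assume the contrary, so that there is some $\varepsilon$ and a sequence of events $(A_1, A_2, \ldots)$ with $\mathbb{P}[A_n] < 2^{-n}$ and $\mathbb{Q}[A_n] \ge \varepsilon$. Let $A = \bigcap_n \bigcup_{m > n} A_m$ be the event that infinitely many of these events occur. Then by Borel-Cantelli $\mathbb{P}[A] = 0$. On the other hand $\mathbb{Q}[\bigcup_{m > n} A_m] \ge \varepsilon$ for every $n$, and so by continuity from above $\mathbb{Q}[A] \ge \varepsilon$, in contradiction to absolute continuity.

Recall that $\mathcal{F}$ is separable if it is generated by a countable subset $\{F_1, F_2, \ldots\}$. We can assume w.l.o.g. that this subset is a $\pi$-system.

Theorem 23.3 (Radon-Nikodym Theorem). Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space with $\mathcal{F}$ separable, and let $\mathbb{Q}$ be absolutely continuous relative to $\mathbb{P}$. Then there exists a r.v. $X$ such that $\mathbb{Q} = X \cdot \mathbb{P}$.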

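Proof. Let $\mathcal{F}_n = \sigma(F_1, \ldots, F_n)$. Then $\mathcal{F}_n$ is a finite sigma-algebra, and as such is the set of all possible unions of $\{B^n_1, \ldots, B^n_{k_n}\}$, a finite partition of $\Omega$. Define the $\mathcal{F}_n$-measurable r.v. $X_n$ as follows. For a given $\omega \in \Omega$ there is a unique $B^n \in \{B^n_1, \ldots, B^n_{k_n}\}$ such that $\omega \in B^n$. Set
\[
X_n(\omega) = X_n(B^n) = \frac{\mathbb{Q}[B^n]}{\mathbb{P}[B^n]},
\]
where we take $0/0 = 0$. It is easy to verify that $\mathbb{E}[X_n] = 1$, and that for every $B \in \mathcal{F}_n$ it holds that $\mathbb{Q}[B] = \mathbb{E}[1_{\{B\}} \cdot X_n]$, so that on $\mathcal{F}_n$ it holds that $X_n$ is the Radon-Nikodym derivative $d\mathbb{Q}/d\mathbb{P}$.

Now, since $(\mathcal{F}_1, \mathcal{F}_2, \ldots)$ is a filtration, $B^n$ is the disjoint union of (at most) two sets $B^{n+1}_i$ and $B^{n+1}_j$. Hence
\[
\begin{aligned}
\mathbb{E}[X_{n+1} \mid \mathcal{F}_n](\omega)
&= \frac{X_{n+1}(B^{n+1}_i) \cdot \mathbb{P}[B^{n+1}_i] + X_{n+1}(B^{n+1}_j) \cdot \mathbb{P}[B^{n+1}_j]}{\mathbb{P}[B^{n+1}_i] + \mathbb{P}[B^{n+1}_j]} \\
&= \frac{\dfrac{\mathbb{Q}[B^{n+1}_i]}{\mathbb{P}[B^{n+1}_i]} \cdot \mathbb{P}[B^{n+1}_i] + \dfrac{\mathbb{Q}[B^{n+1}_j]}{\mathbb{P}[B^{n+1}_j]} \cdot \mathbb{P}[B^{n+1}_j]}{\mathbb{P}[B^n]} \\
&= \frac{\mathbb{Q}[B^{n+1}_i] + \mathbb{Q}[B^{n+1}_j]}{\mathbb{P}[B^n]}
= \frac{\mathbb{Q}[B^n]}{\mathbb{P}[B^n]} = X_n(\omega),
\end{aligned}
\]
and thus $(X_1, X_2, \ldots)$ is a martingale w.r.t. the filtration $(\mathcal{F}_1, \mathcal{F}_2, \ldots)$. Since it is non-negative it converges almost surely to some r.v. $X$.

We now claim that $(X_1, X_2, \ldots)$ are uniformly integrable, in the sense that for every $\varepsilon$ there exists a $K$ such that for all $n$ it holds that $\mathbb{E}[X_n \cdot 1_{\{X_n > K\}}] < \varepsilon$. To see this, recall that $\mathbb{E}[X_n] = 1$, note that $X_n$ is non-negative, and apply Markov's inequality to arrive at
\[
\mathbb{P}[X_n > K] \le \frac{1}{K}.
\]
Now, by Lemma 23.2, if we choose $K$ large enough then this implies that $\mathbb{Q}[X_n > K] < \varepsilon$. But the event $\{X_n > K\}$ is in $\mathcal{F}_n$, since $X_n$ is $\mathcal{F}_n$-measurable. Hence
\[
\mathbb{E}[X_n \cdot 1_{\{X_n > K\}}] = \mathbb{Q}[X_n > K] < \varepsilon.
\]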

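This proves that $(X_1, X_2, \ldots)$ are uniformly integrable. An important result (which is not hard but which we will not prove) is that if $X_n \to X$ almost surely, then uniform integrability implies that this convergence is also in $L^1$, in the sense that $\mathbb{E}[|X_n - X|] \to 0$. It follows that for any $F_i \in \{F_1, F_2, \ldots\}$
\[
\lim_n \left|\mathbb{E}[1_{\{F_i\}} \cdot (X_n - X)]\right| \le \lim_n \mathbb{E}[1_{\{F_i\}} \cdot |X_n - X|] = 0,
\]
and thus, since $\mathbb{E}[1_{\{F_i\}} \cdot X_n] = \mathbb{Q}[F_i]$ for all $n \ge i$,
\[
\mathbb{E}[1_{\{F_i\}} \cdot X] = \lim_n \mathbb{E}[1_{\{F_i\}} \cdot X_n] = \mathbb{Q}[F_i].
\]
Thus the measure $X \cdot \mathbb{P}$ agrees with $\mathbb{Q}$ on the generating $\pi$-system $\{F_1, F_2, \ldots\}$, and thus $\mathbb{Q} = X \cdot \mathbb{P}$.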
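The martingale $X_n = \mathbb{Q}[B^n]/\mathbb{P}[B^n]$ from the proof can be made concrete in a toy case. The sketch below is an illustration under stated assumptions, not the construction of the notes verbatim: $\mathbb{P}$ is taken to be the uniform measure on $[0, 1)$, $\mathbb{Q}$ has density $2x$ (so $\mathbb{Q}[[a, b)] = b^2 - a^2$), and the filtration is the dyadic one, in which every cell splits into exactly two halves at each step. All function names are hypothetical.

```python
from fractions import Fraction


def q_measure(a, b):
    """Q[[a, b)] for the measure with density 2x on [0, 1)."""
    return b * b - a * a


def cell(omega, n):
    """The dyadic interval [j / 2^n, (j + 1) / 2^n) that contains omega."""
    j = int(omega * 2**n)  # floor, since omega >= 0
    return Fraction(j, 2**n), Fraction(j + 1, 2**n)


def x_n(n, omega):
    """X_n(omega) = Q[B] / P[B], where B is the level-n cell of omega."""
    a, b = cell(omega, n)
    return q_measure(a, b) / (b - a)


def expectation(n):
    """E[X_n] under the uniform measure P."""
    return sum(x_n(n, Fraction(j, 2**n)) * Fraction(1, 2**n) for j in range(2**n))


def conditional_expectation_next(n, omega):
    """E[X_{n+1} | F_n](omega): average of X_{n+1} over the two half-cells."""
    a, b = cell(omega, n)
    mid = (a + b) / 2
    return (x_n(n + 1, a) + x_n(n + 1, mid)) / 2


omega = Fraction(3, 10)
print(expectation(6) == 1)
print(conditional_expectation_next(6, omega) == x_n(6, omega))
print(float(x_n(20, omega)))  # approaches the density 2 * omega
```

In this example $X_n(\omega) = a + b$ for the cell $[a, b)$ containing $\omega$, so the almost sure limit is $2\omega$, which is the density of $\mathbb{Q}$.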

24. The weak topology and the simplex of invariant measures

Let $X$ be a compact metrizable topological space. By the Riesz Representation Theorem we can identify $P(X)$, the set of probability measures on $X$, with the positive bounded linear functionals on $C(X)$ that assign 1 to the constant function 1. The space $C(X)^*$ of bounded linear functionals on $C(X)$ comes equipped with the weak* topology, under which $\varphi_n \to \varphi$ if $\varphi_n(f) \to \varphi(f)$ for all $f \in C(X)$, and in which the unit ball is compact and metrizable. The restriction of this topology to the (closed) set of probability measures yields what probabilists call the weak topology on the probability measures on $X$.

In the important case that $X = \{0, 1\}^{\mathbb{N}}$ we have that $\nu_n \to \nu$ weakly if for every clopen $A$ it holds that $\nu_n(A) \to \nu(A)$. In the case $X = \mathbb{R} \cup \{-\infty, \infty\}$ we have that $\nu_n \to \nu$ if $\limsup_n \nu_n(A) \le \nu(A)$ for all closed $A$, or if $\liminf_n \nu_n(A) \ge \nu(A)$ for all open $A$.

Let $X = \{0, 1\}^{\mathbb{Z}}$, and denote by $I(X)$ the set of stationary (or shift-invariant) probability measures on $X$.

Claim. $I(X)$ is a closed subset of $P(X)$.

Proof. Denote the shift by $\sigma \colon X \to X$. Assume that $\nu_n$ is a sequence in $I(X)$ that converges to some $\nu \in P(X)$. We prove the claim by showing that $\nu$ is stationary. Let $A$ be a clopen subset of $X$. Then
\[
\nu(A) = \lim_n \nu_n(A) = \lim_n \nu_n(\sigma(A)) = \nu(\sigma(A)),
\]
where the last equality follows from the fact that $A$ being clopen implies that $\sigma(A)$ is clopen. Thus $\nu$ is invariant on a generating sub-algebra of the sigma-algebra, and by a standard argument it is invariant.

Clearly, $I(X)$ is a convex set. The next proposition shows (a more general claim which implies) that its extreme points $I_e(X)$ are the ergodic measures.

Proposition. A $T$-invariant measure $\nu$ on $(\Omega, \mathcal{F})$ is ergodic iff it is extreme.

Proof. Assume that $\nu$ is not ergodic. Then there is some $T$-invariant $A \in \mathcal{F}$ such that $p := \nu(A) \in (0, 1)$. Let $\nu_1$ be given by $\nu_1(B) = \nu(B \mid A) = \frac{1}{p} \nu(B \cap A)$, and let $\nu_2(B) = \nu(B \mid A^c)$. Then
\[
\begin{aligned}
\nu_1(T^{-1}B) &= \frac{1}{p} \nu(T^{-1}(B) \cap A) \\
&= \frac{1}{p} \nu(T^{-1}(B) \cap T^{-1}(A)) \\
&= \frac{1}{p} \nu(T^{-1}(B \cap A)) \\
&= \frac{1}{p} \nu(B \cap A) \\
&= \nu_1(B).
\end{aligned}
\]
And thus $\nu_1$ is $T$-invariant. The same argument applies to $\nu_2$, since $A^c$ is also $T$-invariant. Finally, $\nu = p\nu_1 + (1 - p)\nu_2$ with $\nu_1 \ne \nu_2$, and so $\nu$ is not extreme.

For the other direction, assume that $\nu$ is ergodic and that $\nu = p\nu_1 + (1 - p)\nu_2$ for some $p \in (0, 1)$ and $T$-invariant $\nu_1, \nu_2$. Clearly, $\nu_1$ is absolutely continuous relative to $\nu$, and so we write $\nu_1 = X \cdot \nu$ for some $X \in L^1(\nu)$. We now claim that $X$ is $T$-invariant; we prove this for the case that $T$ is invertible (although it is true in general). In this case, for any $A \in \mathcal{F}$
\[
\begin{aligned}
\nu_1(A) = \nu_1(T(A)) &= \int_\Omega 1_{\{A\}}(T^{-1}(\omega)) \cdot X(\omega)\, d\nu(\omega) \\
&= \int_\Omega 1_{\{A\}}(T^{-1}(\omega)) \cdot X(\omega)\, d\nu(T^{-1}(\omega)) \\
&= \int_\Omega 1_{\{A\}}(\omega) \cdot X(T(\omega))\, d\nu(\omega),
\end{aligned}
\]
and so $X \circ T$ is also a Radon-Nikodym derivative $d\nu_1/d\nu$. But by the uniqueness of this derivative $X$ and $X \circ T$ agree almost everywhere. It is now a nice exercise to show that there exists some $X'$ that is equal to $X$ almost everywhere and is $T$-invariant. It then follows by Claim 21.2, and by the fact that $\mathbb{E}[X] = 1$, that $X = 1$ $\nu$-almost surely, and thus $\nu = \nu_1$.

This proposition has an interesting consequence.

Exercise. Assume $\nu \ne \mu$ are both $T$-invariant ergodic measures on $(\Omega, \mathcal{F})$. Show that there exist two disjoint sets $A, B \in \mathcal{F}$ such that $\nu(A) = 0$ and $\mu(A) = 1$, while $\nu(B) = 1$ and $\mu(B) = 0$.

Thus $\nu$ and $\mu$ "live in different places". In fact, it is possible to show that there is a map $\beta \colon I_e(X) \to \mathcal{F}$ with the properties that

(1) $\mu(\beta_\mu) = 1$ for all $\mu \in I_e(X)$.
(2) For all $\mu \ne \nu \in I_e(X)$ it holds that $\beta_\mu \cap \beta_\nu = \emptyset$.

Using this, it is possible to show that $I(X)$ is in fact a simplex: a compact convex set in which there is a unique way to write each element as a convex integral of the extreme points.

Proposition. The ergodic measures $I_e(X)$ are dense in $I(X)$.

Thus the simplex $I(X)$ has the interesting property that its extreme points are dense. It turns out that there is only one such simplex (up to affine homeomorphisms), which is called the Poulsen simplex.

Proof. It suffices to show that for $\nu, \mu \in I_e(X)$ and $\theta = \frac{1}{2}\nu + \frac{1}{2}\mu$ it is possible to find $\theta_n \in I_e(X)$ s.t. $\lim_n \theta_n = \theta$. To this end, fix $n$ and define $\theta_n$ as follows. Let the law of the r.v.s $(X_k)_{k \in \mathbb{Z}}$ be $\mu$, and the law of $(Y_k)_{k \in \mathbb{Z}}$ be $\nu$. For $m \in \mathbb{Z}$, let $(X^m_0, \ldots, X^m_{n-1})$ be independent of all previously defined random variables, and with law equal to that of $(X_0, \ldots, X_{n-1})$. Define $(Y^m_0, \ldots, Y^m_{n-1})$ analogously. Define $(W_k)_{k \in \mathbb{Z}}$ by
\[
W_k =
\begin{cases}
X^{\lfloor k/n \rfloor}_{k \bmod n} & \text{if } \lfloor k/n \rfloor \text{ is even} \\
Y^{\lfloor k/n \rfloor}_{k \bmod n} & \text{if } \lfloor k/n \rfloor \text{ is odd.}
\end{cases}
\]
Finally, choose $N$ uniformly at random from $\{0, 1, \ldots, 2n - 1\}$, and define $(Z_k)_{k \in \mathbb{Z}}$ by $Z_k = W_{k+N}$. Let $\theta_n$ be the law of $(Z_k)$. It is straightforward (if tedious) to verify that $(Z_k)$ is stationary. We leave it as an exercise to show that it is ergodic. Thus to finish the proof we have to show that $\lim_n \theta_n = \theta$.

Fix $M \in \mathbb{N}$, and consider the event that $N \in \{1, \ldots, M\}$. As $n$ tends to infinity, the probability of this event tends to zero. Thus, if we condition on $N$, with probability that tends to 1/2 we have that the law of $(Z_1, \ldots, Z_M)$ is equal to the law of $(X_1, \ldots, X_M)$, and likewise for $(Y_1, \ldots, Y_M)$. This completes the proof.

25. Percolation

Let $V$ be a countable set, and let $G = (V, E)$ be a locally finite, simple symmetric graph. That is, $E$ is a symmetric relation on $V$ with $E \cap (\{v\} \times V)$ finite for each $v \in V$. We also assume that $G$ is connected, so that the transitive closure of $E$ is $V \times V$.

The i.i.d. $p$ percolation measure on $\{0, 1\}^E$ is simply the product Bernoulli measure, in which we choose each edge independently with probability $p$. We will denote this measure by $\mathbb{P}_p[\cdot]$, and will denote by $\mathbf{E}$ the random edge set with this law. $\mathbf{G} = (V, \mathbf{E})$ will be the corresponding random graph. Note that $\mathbf{G}$ will in general not be connected.

For each $v \in V$ we denote by $K(v)$ the (random) connected component that $v$ belongs to in $\mathbf{G}$. We denote by $\{v \leftrightarrow \infty\}$ the event that $K(v)$ is infinite. We denote by $K_\infty$ the event that there is some $v$ for which $K(v)$ is infinite.

Claim. The probability of $K_\infty$ is either 0 or 1. In the former case, for every $v \in V$, $\mathbb{P}_p[v \leftrightarrow \infty] = 0$, while in the latter $\mathbb{P}_p[v \leftrightarrow \infty] > 0$.

Proof. Enumerate $E = (e_1, e_2, \ldots)$, and let $A_n = \{e_n \in \mathbf{E}\}$. Then $(A_1, A_2, \ldots)$ is an i.i.d. sequence. Clearly $K_\infty$ is $\sigma(A_1, A_2, \ldots)$-measurable, and also clearly it is a tail event. Hence the first part of the claim follows by Kolmogorov's 0-1 law.

Since the event $K_\infty$ contains $\{v \leftrightarrow \infty\}$, it is immediate that $\mathbb{P}_p[K_\infty] = 0$ implies $\mathbb{P}_p[v \leftrightarrow \infty] = 0$. Assume now that $\mathbb{P}_p[K_\infty] = 1$. Then, since $V$ is countable, there is some $w \in V$ such that $\mathbb{P}_p[w \leftrightarrow \infty] > 0$. Let $P = (e_1, e_2, \ldots, e_n)$ be a path between $v$ and $w$. Consider the random variable $\tilde{\mathbf{E}}$ taking values in $\{0, 1\}^E$ defined as follows: for every edge $e \notin P$, we set $e \in \tilde{\mathbf{E}}$ iff $e \in \mathbf{E}$. And we set $e \in \tilde{\mathbf{E}}$ for all $e \in P$. We (you) prove in the exercise below that the law of $\tilde{\mathbf{E}}$ is absolutely continuous relative to the law of $\mathbf{E}$.

Denote $\tilde{\mathbf{G}} = (V, \tilde{\mathbf{E}})$, and denote by $\tilde{K}(v)$ the connected component of $v$ in $\tilde{\mathbf{G}}$. Now, $\tilde{K}(v) = \tilde{K}(w)$, since $v$ and $w$ are connected in $\tilde{\mathbf{G}}$. Also, $\tilde{K}(w)$ contains $K(w)$, since $\tilde{\mathbf{E}}$ contains $\mathbf{E}$. Hence the event $\{|\tilde{K}(w)| = \infty\}$ occurs with positive probability, and so the same holds for $\tilde{K}(v) = \tilde{K}(w)$. Finally, by absolute continuity, the same holds for $K(v)$, and so $\mathbb{P}_p[v \leftrightarrow \infty] > 0$.
Exercise. Prove that the law of $\tilde{\mathbf{E}}$ is absolutely continuous relative to the law of $\mathbf{E}$.

Claim. If $q > p$ then $\mathbb{P}_q[K_\infty] \ge \mathbb{P}_p[K_\infty]$.

To prove this claim we prove a stronger theorem, and in the process introduce the technique of coupling. Let $\Omega = \{0, 1\}^{\mathbb{N}}$, with $\mathcal{F}$ the Borel sigma-algebra. We consider the natural partial order on $\Omega$ given by $\omega \ge \omega'$ if $\omega_n \ge \omega'_n$ for all $n \in \mathbb{N}$. We say that $A \in \mathcal{F}$ is increasing if for all $\omega \ge \omega'$ it holds that $\omega' \in A$ implies $\omega \in A$. Let $\mathbb{P}_p[\cdot]$ denote the i.i.d. $p$ measure on $\Omega$.

Theorem. If $A$ is increasing then $q > p$ implies $\mathbb{P}_q[A] \ge \mathbb{P}_p[A]$.

Proof. Let $(X_1, X_2, \ldots)$ be i.i.d. random variables, each distributed uniformly on $[0, 1]$. For each $n$ let $Q_n = 1_{\{X_n \le q\}}$ and $P_n = 1_{\{X_n \le p\}}$. Note that $\mathbb{P}[Q_n = 1] = q$ and $\mathbb{P}[P_n = 1] = p$, and that $(Q_1, Q_2, \ldots)$ is i.i.d., as is $(P_1, P_2, \ldots)$. Hence the law of $(Q_1, Q_2, \ldots)$ (resp., $(P_1, P_2, \ldots)$) is $\mathbb{P}_q[\cdot]$ (resp., $\mathbb{P}_p[\cdot]$). Note also that $(Q_1, Q_2, \ldots) \ge (P_1, P_2, \ldots)$, since $q > p$. Hence for any increasing event $A \subseteq \{0, 1\}^{\mathbb{N}}$ it holds that $(P_1, P_2, \ldots) \in A$ implies $(Q_1, Q_2, \ldots) \in A$, and thus $\mathbb{P}_q[A] \ge \mathbb{P}_p[A]$.

The construction in this proof is an example of coupling. Formally, a coupling of two probability spaces $(\Omega, \mathcal{F}, \mathbb{P})$ and $(\Omega', \mathcal{F}', \mathbb{P}')$ is a probability space $(\Omega \times \Omega', \sigma(\mathcal{F} \times \mathcal{F}'), \mathbb{Q})$ such that the projections on the two coordinates push $\mathbb{Q}$ forward to $\mathbb{P}$ and $\mathbb{P}'$.

Since $\mathbb{P}_p[K_\infty] \in \{0, 1\}$, and since $\mathbb{P}_p[K_\infty]$ is weakly increasing in $p$, we are interested in the critical percolation probability
\[
p_c = \sup\{p : \mathbb{P}_p[K_\infty] = 0\}.
\]
An interesting (and often hard) question is whether $\mathbb{P}_{p_c}[K_\infty]$ is zero or one.

Let $G$ be the infinite $k$-ary tree with root $o$. In this case we can calculate $p_c$, by noting that the event $\{o \leftrightarrow \infty\}$ can be thought of as the event that the Galton-Watson tree with children distribution $B(k, p)$ is infinite. We know that this happens with positive probability iff $p > 1/k$. Hence in this case $p_c = 1/k$, and $\mathbb{P}_{p_c}[K_\infty] = 0$.
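The coupling used in the proof of the monotonicity theorem can be simulated directly: a single sequence of uniform random variables drives both configurations, so the $q$-configuration dominates the $p$-configuration coordinate by coordinate. The following is a minimal sketch on finitely many coordinates; the parameters, the seed, the chosen increasing event (existence of a run of ones) and the function names are illustrative assumptions.

```python
import random


def coupled_configurations(p, q, n, rng):
    """Return (P_1..P_n, Q_1..Q_n) built from the same uniforms X_1..X_n."""
    xs = [rng.random() for _ in range(n)]
    return [int(x <= p) for x in xs], [int(x <= q) for x in xs]


def has_run_of_ones(omega, length=5):
    """An increasing event: omega contains `length` consecutive ones."""
    run = 0
    for bit in omega:
        run = run + 1 if bit else 0
        if run >= length:
            return True
    return False


rng = random.Random(0)
trials, hits_p, hits_q = 2000, 0, 0
for _ in range(trials):
    low, high = coupled_configurations(0.4, 0.6, 50, rng)
    # Pointwise domination, so the event for `low` forces the event for `high`.
    assert all(a <= b for a, b in zip(low, high))
    hits_p += has_run_of_ones(low)
    hits_q += has_run_of_ones(high)

print(hits_p / trials, hits_q / trials)
```

Because the domination holds sample by sample, the empirical frequency under $q$ is at least the one under $p$ in every run of the simulation, not just on average.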

26. Large deviations

Let $(X_1, X_2, \ldots)$ be i.i.d. real random variables. Denote $X = X_1$ and let $\mu = \mathbb{E}[X]$. Let
\[
Y_n = \frac{1}{n} \sum_{k=1}^{n} X_k.
\]
By the law of large numbers we expect that $Y_n$ should be close to $\mu$ for large $n$. What is the probability that it is larger than some $\eta > \mu$? We already proved the Chernoff bound, which is an upper bound on this probability. We here prove an asymptotically matching lower bound.

Recall that the moment generating function of $X$ is
\[
M(t) = \mathbb{E}[e^{tX}],
\]
and that its cumulant generating function is
\[
K(t) = \log M(t) = \log \mathbb{E}[e^{tX}].
\]
Of course, these may be infinite for some $t$. Let $I$, the domain of both, be the set on which they are finite, and note that $0 \in I$.

Claim. $I$ is an interval, and $K$ is convex on $I$.

For the proof of this claim we will need Hölder's inequality. For $p \in [1, \infty]$ and a real r.v. $X$ denote
\[
\|X\|_p = \mathbb{E}[|X|^p]^{1/p},
\]
with the usual essential supremum convention for $p = \infty$.

Lemma 26.2 (Hölder's inequality). For any $p, q \in [1, \infty]$ with $1/p + 1/q = 1$ and r.v.s $X, Y$ it holds that
\[
\|X \cdot Y\|_1 \le \|X\|_p \cdot \|Y\|_q.
\]

Exercise. Prove Hölder's inequality. Hint: use Young's inequality, which states that for every real $x, y \ge 0$ and $p, q > 1$ with $1/p + 1/q = 1$ it holds that
\[
xy \le \frac{x^p}{p} + \frac{y^q}{q}.
\]

Proof of the Claim. Assume $a, b \in I$. Then for any $r \in (0, 1)$
\[
K(ra + (1 - r)b) = \log \mathbb{E}\left[e^{(ra + (1 - r)b)X}\right] = \log \mathbb{E}\left[(e^{aX})^r \cdot (e^{bX})^{1-r}\right].
\]

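By Hölder's inequality (with exponents $1/r$ and $1/(1-r)$)
\[
\begin{aligned}
K(ra + (1 - r)b) &\le \log \mathbb{E}\left[\left((e^{aX})^r\right)^{1/r}\right]^{r} + \log \mathbb{E}\left[\left((e^{bX})^{1-r}\right)^{1/(1-r)}\right]^{1-r} \\
&= \log \mathbb{E}[e^{aX}]^r + \log \mathbb{E}[e^{bX}]^{1-r} \\
&= r \log \mathbb{E}[e^{aX}] + (1 - r) \log \mathbb{E}[e^{bX}] \\
&= rK(a) + (1 - r)K(b).
\end{aligned}
\]
Since the right-hand side is finite, and since $K > -\infty$ everywhere (because $M > 0$), it follows that $K$ is finite at $ra + (1 - r)b$, and thus $I$ is an interval on which $K$ is convex.

Applying the Dominated Convergence Theorem inductively can be used to show that $M$ and $K$ are smooth (i.e., infinitely differentiable) on the interior of $I$.

Let the Legendre transform of $K$ be given by
\[
K^*(\eta) = \sup_{t > 0}\,(t\eta - K(t)).
\]
It turns out that the fact that $K$ is smooth and convex implies that $K^*$ is also smooth and convex. Therefore, if the supremum in this definition is obtained at some $t$, then $K'(t) = \eta$. Conversely, if $K'(t) = \eta$ for some $t$, then this $t$ is unique and $K^*(\eta) = t\eta - K(t)$.

Theorem 26.4 (Chernoff bound).
\[
\mathbb{P}[Y_n \ge \eta] \le e^{-n \cdot K^*(\eta)}.
\]

Proof. For any $t \ge 0$
\[
\begin{aligned}
\mathbb{P}[Y_n \ge \eta] &\le \mathbb{P}[tY_n \ge t\eta] \\
&= \mathbb{P}\left[e^{t \sum_k X_k} \ge e^{t\eta n}\right] \\
&\le \frac{\mathbb{E}\left[e^{\sum_k tX_k}\right]}{e^{t\eta n}} \\
&= e^{-n \cdot (t\eta - K(t))},
\end{aligned}
\]
where the second inequality is Markov's inequality. Optimizing over $t$ yields the claim.

Theorem. If $\eta = K'(t)$ for some $t$ in the interior of $I$ then
\[
\mathbb{P}[Y_n \ge \eta] = e^{-n \cdot K^*(\eta) + o(n)}.
\]

Proof. One side is given by the Chernoff bound. It thus remains to prove the lower bound. Let $Z_n = \sum_{k=1}^{n} X_k$. We want to prove that
\[
\mathbb{P}[Z_n \ge \eta n] \ge e^{-n \cdot K^*(\eta) + o(n)}.
\]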

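Denote the law of $X$ by $\nu$, and for a $t \in I$ (to be determined later) define $\tilde{\nu}$ by
\[
\frac{d\tilde{\nu}}{d\nu}(x) = \frac{e^{tx}}{\mathbb{E}[e^{tX}]} = e^{tx - K(t)}.
\]
Let $(\tilde{X}_1, \tilde{X}_2, \ldots)$ be i.i.d. with law $\tilde{\nu}$, and let $\tilde{Z}_n = \sum_{k=1}^{n} \tilde{X}_k$. The law of $\tilde{Z}_n$ is $\tilde{\nu}^{(n)}$, the $n$-fold convolution of $\tilde{\nu}$. We claim that
\[
\frac{d\tilde{\nu}^{(n)}}{d\nu^{(n)}}(x) = \frac{e^{tx}}{\mathbb{E}[e^{tX}]^n} = e^{tx - nK(t)}.
\]
This is left as an exercise. Note also that
\[
\mathbb{E}[\tilde{X}] = \frac{\mathbb{E}[Xe^{tX}]}{\mathbb{E}[e^{tX}]} = K'(t).
\]
Now, for any $\bar{\eta} > \eta$ and $t \ge 0$,
\[
\begin{aligned}
\mathbb{P}[Z_n \ge \eta n] &\ge \mathbb{P}[\bar{\eta} n \ge Z_n \ge \eta n] \\
&= \int_{\eta n}^{\bar{\eta} n} 1 \, d\nu^{(n)}(x) \\
&= e^{nK(t)} \int_{\eta n}^{\bar{\eta} n} e^{-tx} \, d\tilde{\nu}^{(n)}(x) \\
&\ge e^{-n(t\bar{\eta} - K(t))} \int_{\eta n}^{\bar{\eta} n} d\tilde{\nu}^{(n)}(x) \\
&= e^{-n(t\bar{\eta} - K(t))} \cdot \mathbb{P}\left[\bar{\eta} n \ge \tilde{Z}_n \ge \eta n\right].
\end{aligned}
\]
Since $\mathbb{E}[\tilde{X}] = K'(t)$, it follows that if we choose $t$ so that $\bar{\eta} > K'(t) > \eta$ (which we can, by the claim hypothesis and the smoothness of $K$) then, by the law of large numbers,
\[
\mathbb{P}\left[\bar{\eta} n \ge \tilde{Z}_n \ge \eta n\right] \to 1,
\]
and so
\[
\liminf_{n \to \infty} \frac{1}{n} \log \mathbb{P}[Z_n \ge \eta n] \ge -(t\bar{\eta} - K(t)).
\]
Since this holds for any $\bar{\eta} > \eta$ and any $t$ with $\bar{\eta} > K'(t) > \eta$, by continuity it also holds for $\bar{\eta} = \eta$ and $t$ such that $K'(t) = \eta$. So
\[
\liminf_{n \to \infty} \frac{1}{n} \log \mathbb{P}[Z_n \ge \eta n] \ge -(t\eta - K(t))
\]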

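or
\[
\mathbb{P}[Z_n \ge \eta n] \ge e^{-n \cdot (t\eta - K(t)) + o(n)}.
\]
Finally, since $K$ is convex and smooth, and since $K'(t) = \eta$, then $t$ is the maximizer of $z\eta - K(z)$, and thus $t\eta - K(t) = K^*(\eta)$ and
\[
\mathbb{P}[Z_n \ge \eta n] \ge e^{-n \cdot K^*(\eta) + o(n)}.
\]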
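For Bernoulli summands the quantities in this section can be computed explicitly, which gives a quick sanity check of both bounds. The sketch below is illustrative only: it assumes $X \sim$ Bernoulli($p$), for which $K(t) = \log(1 - p + pe^t)$ and, by a standard computation, $K^*(\eta) = \eta \log\frac{\eta}{p} + (1 - \eta)\log\frac{1 - \eta}{1 - p}$ for $p < \eta < 1$. The numerical supremum uses a ternary search over $t \in [0, 50]$ (an arbitrary cutoff), and all function names are hypothetical.

```python
import math
from fractions import Fraction


def K(t, p):
    """Cumulant generating function of a Bernoulli(p) variable."""
    return math.log(1 - p + p * math.exp(t))


def K_star_numeric(eta, p, t_max=50.0, iters=200):
    """sup over t in [0, t_max] of t*eta - K(t), by ternary search (concave objective)."""
    lo, hi = 0.0, t_max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if m1 * eta - K(m1, p) < m2 * eta - K(m2, p):
            lo = m1
        else:
            hi = m2
    t = (lo + hi) / 2
    return t * eta - K(t, p)


def K_star_closed_form(eta, p):
    """Closed form of the Legendre transform for Bernoulli(p), p < eta < 1."""
    return eta * math.log(eta / p) + (1 - eta) * math.log((1 - eta) / (1 - p))


def log_tail(n, eta, p):
    """log P[Y_n >= eta], with the binomial tail summed exactly as a fraction."""
    k_min = math.ceil(eta * n)
    tail = sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_min, n + 1))
    return math.log(tail.numerator) - math.log(tail.denominator)


p, eta, n = Fraction(1, 2), Fraction(3, 5), 500
rate = K_star_closed_form(eta, p)
print(rate, K_star_numeric(float(eta), float(p)))
print(-log_tail(n, eta, p) / n)  # exceeds `rate` (Chernoff) by an o(1) amount
```

Passing `p` and `eta` as `Fraction` objects keeps the binomial tail exact; only the final logarithms are taken in floating point.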

27. The mass transport principle

Let $G = (V, E)$ be a locally finite, countable graph. A graph automorphism is a bijection $f \colon V \to V$ such that $(v, w) \in E$ iff $(f(v), f(w)) \in E$. The automorphisms of a graph form the group $\mathrm{Aut}(G)$ under composition. We say that $G$ is transitive if its automorphism group acts on it transitively. That is, if for all $v, w \in V$ there is a graph automorphism $f$ s.t. $f(v) = w$. Intuitively, this means that the geometry of the graph looks the same from the point of view of every vertex.

An important example is when $\Gamma$ is a group finitely generated by a symmetric finite subset $S$, and $G = (V, E)$ with $V = \Gamma$ is the corresponding Cayley graph. In this case it is easy to see that the $\Gamma$ action on itself is an action by graph automorphisms, which is furthermore already transitive. We will restrict our discussion to this setting, even though it all extends to unimodular transitive graphs; these are graphs with a unimodular automorphism group.

A map $f \colon \Gamma \times \Gamma \to [0, \infty)$ is a mass-transport if it is invariant under the diagonal $\Gamma$-action:
\[
f(h, k) = f(gh, gk) \quad \text{for all } g, h, k \in \Gamma.
\]
It is useful to think about $f$ as indicating how much mass is passed from $h$ to $k$, where the amount passed can depend on the identities of $h$ and $k$, but in a way that (in some sense) only depends on the geometry of the graph and not on their names.

Theorem 27.1 (Mass Transport Principle for Groups). For every mass transport $f \colon \Gamma \times \Gamma \to [0, \infty)$ and $g \in \Gamma$ it holds that
\[
\sum_{k \in \Gamma} f(g, k) = \sum_{k \in \Gamma} f(k, g).
\]
That is, the total mass flowing out of $g$ is equal to the total mass flowing in.

Proof. By invariance
\[
\sum_{k \in \Gamma} f(g, k) = \sum_{k \in \Gamma} f(k^{-1}g, e).
\]
Changing variables to $h = k^{-1}g$ yields
\[
= \sum_{h \in \Gamma} f(h, e).
\]
Applying invariance again yields
\[
= \sum_{h \in \Gamma} f(gh, g),
\]

and again changing variables to $k = gh$ yields the desired result.

[Figure 1. Spanning forest.]

As an application, consider the following random subgraph $\mathbf{E}$ of the standard Cayley graph of $\mathbb{Z}^2$. For each $z, w_1, w_2 \in \mathbb{Z}^2$ such that $w_1 = z + (0, 1)$ and $w_2 = z + (1, 0)$, we independently set $(z, w_1) \in \mathbf{E}$, $(z, w_2) \notin \mathbf{E}$ w.p. 1/2, and $(z, w_1) \notin \mathbf{E}$, $(z, w_2) \in \mathbf{E}$ w.p. 1/2. For distinct $z, w \in \mathbb{Z}^2$, we say that $w$ is a descendant of $z$ (and $z$ is an ancestor of $w$) in $\mathbf{E}$ if there is a path between $w$ and $z$, and if $w \le z$ in both coordinates. Note that, by construction,

(1) $\mathbf{E}$ has no cycles, and each node is adjacent to at least one edge, and so $\mathbf{E}$ is a spanning forest.
(2) Each $w$ has infinitely many ancestors.
(3) If $w$ […] $z$ then the number of descendants of $w$ is independent of the number of descendants of $z$.

Proposition. The number of descendants of each $w \in \mathbb{Z}^2$ is almost surely finite, with infinite expectation.

Proof. Let $f(w, z)$ equal the probability that $z$ is an ancestor of $w$, and note that by the invariance of the definitions $f$ is a mass transport.
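The Mass Transport Principle (Theorem 27.1) can be checked by hand on a small finite group, where all sums are finite. The sketch below is a toy illustration and not part of the notes: it takes $\Gamma = S_3$ (permutations of three points), picks arbitrary non-negative weights `phi` on the group, and uses $f(h, k) = \varphi(h^{-1}k)$, which is invariant under the diagonal action by construction. The names `compose`, `inverse`, `mass_out` and `mass_in` are hypothetical.

```python
from itertools import permutations


def compose(g, h):
    """The product gh, acting as (gh)(i) = g(h(i)); permutations are tuples."""
    return tuple(g[h[i]] for i in range(len(h)))


def inverse(g):
    inv = [0] * len(g)
    for i, gi in enumerate(g):
        inv[gi] = i
    return tuple(inv)


group = list(permutations(range(3)))  # the symmetric group S_3

# Arbitrary non-negative integer weights on the group (illustrative values).
phi = {g: i * i + 1 for i, g in enumerate(group)}


def f(h, k):
    """A mass transport: depends on (h, k) only through h^{-1} k."""
    return phi[compose(inverse(h), k)]


def mass_out(g):
    return sum(f(g, k) for k in group)


def mass_in(g):
    return sum(f(k, g) for k in group)


invariant = all(f(h, k) == f(compose(g, h), compose(g, k))
                for g in group for h in group for k in group)
print(invariant)
print(all(mass_out(g) == mass_in(g) for g in group))
```

Note that `f(g, k)` and `f(k, g)` generally differ term by term (since `phi` need not be symmetric under inversion); only the totals agree, which is the content of the theorem.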


More information

Random Models. Tusheng Zhang. February 14, 2013

Random Models. Tusheng Zhang. February 14, 2013 Radom Models Tusheg Zhag February 14, 013 1 Radom Walks Let me describe the model. Radom walks are used to describe the motio of a movig particle (object). Suppose that a particle (object) moves alog the

More information

Lecture 12: November 13, 2018

Lecture 12: November 13, 2018 Mathematical Toolkit Autum 2018 Lecturer: Madhur Tulsiai Lecture 12: November 13, 2018 1 Radomized polyomial idetity testig We will use our kowledge of coditioal probability to prove the followig lemma,

More information

Lecture 3 The Lebesgue Integral

Lecture 3 The Lebesgue Integral Lecture 3: The Lebesgue Itegral 1 of 14 Course: Theory of Probability I Term: Fall 2013 Istructor: Gorda Zitkovic Lecture 3 The Lebesgue Itegral The costructio of the itegral Uless expressly specified

More information

STAT Homework 1 - Solutions

STAT Homework 1 - Solutions STAT-36700 Homework 1 - Solutios Fall 018 September 11, 018 This cotais solutios for Homework 1. Please ote that we have icluded several additioal commets ad approaches to the problems to give you better

More information

An Introduction to Randomized Algorithms

An Introduction to Randomized Algorithms A Itroductio to Radomized Algorithms The focus of this lecture is to study a radomized algorithm for quick sort, aalyze it usig probabilistic recurrece relatios, ad also provide more geeral tools for aalysis

More information

Notes on Snell Envelops and Examples

Notes on Snell Envelops and Examples Notes o Sell Evelops ad Examples Example (Secretary Problem): Coside a pool of N cadidates whose qualificatios are represeted by ukow umbers {a > a 2 > > a N } from best to last. They are iterviewed sequetially

More information

Infinite Sequences and Series

Infinite Sequences and Series Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet

More information

Axioms of Measure Theory

Axioms of Measure Theory MATH 532 Axioms of Measure Theory Dr. Neal, WKU I. The Space Throughout the course, we shall let X deote a geeric o-empty set. I geeral, we shall ot assume that ay algebraic structure exists o X so that

More information

Lecture Notes for Analysis Class

Lecture Notes for Analysis Class Lecture Notes for Aalysis Class Topological Spaces A topology for a set X is a collectio T of subsets of X such that: (a) X ad the empty set are i T (b) Uios of elemets of T are i T (c) Fiite itersectios

More information

A Proof of Birkhoff s Ergodic Theorem

A Proof of Birkhoff s Ergodic Theorem A Proof of Birkhoff s Ergodic Theorem Joseph Hora September 2, 205 Itroductio I Fall 203, I was learig the basics of ergodic theory, ad I came across this theorem. Oe of my supervisors, Athoy Quas, showed

More information

Application to Random Graphs

Application to Random Graphs A Applicatio to Radom Graphs Brachig processes have a umber of iterestig ad importat applicatios. We shall cosider oe of the most famous of them, the Erdős-Réyi radom graph theory. 1 Defiitio A.1. Let

More information

ECE 330:541, Stochastic Signals and Systems Lecture Notes on Limit Theorems from Probability Fall 2002

ECE 330:541, Stochastic Signals and Systems Lecture Notes on Limit Theorems from Probability Fall 2002 ECE 330:541, Stochastic Sigals ad Systems Lecture Notes o Limit Theorems from robability Fall 00 I practice, there are two ways we ca costruct a ew sequece of radom variables from a old sequece of radom

More information

Entropy Rates and Asymptotic Equipartition

Entropy Rates and Asymptotic Equipartition Chapter 29 Etropy Rates ad Asymptotic Equipartitio Sectio 29. itroduces the etropy rate the asymptotic etropy per time-step of a stochastic process ad shows that it is well-defied; ad similarly for iformatio,

More information

Notes 19 : Martingale CLT

Notes 19 : Martingale CLT Notes 9 : Martigale CLT Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces: [Bil95, Chapter 35], [Roc, Chapter 3]. Sice we have ot ecoutered weak covergece i some time, we first recall

More information

Chapter 5. Inequalities. 5.1 The Markov and Chebyshev inequalities

Chapter 5. Inequalities. 5.1 The Markov and Chebyshev inequalities Chapter 5 Iequalities 5.1 The Markov ad Chebyshev iequalities As you have probably see o today s frot page: every perso i the upper teth percetile ears at least 1 times more tha the average salary. I other

More information

Chapter 0. Review of set theory. 0.1 Sets

Chapter 0. Review of set theory. 0.1 Sets Chapter 0 Review of set theory Set theory plays a cetral role i the theory of probability. Thus, we will ope this course with a quick review of those otios of set theory which will be used repeatedly.

More information

Lecture 2. The Lovász Local Lemma

Lecture 2. The Lovász Local Lemma Staford Uiversity Sprig 208 Math 233A: No-costructive methods i combiatorics Istructor: Ja Vodrák Lecture date: Jauary 0, 208 Origial scribe: Apoorva Khare Lecture 2. The Lovász Local Lemma 2. Itroductio

More information

Measure and Measurable Functions

Measure and Measurable Functions 3 Measure ad Measurable Fuctios 3.1 Measure o a Arbitrary σ-algebra Recall from Chapter 2 that the set M of all Lebesgue measurable sets has the followig properties: R M, E M implies E c M, E M for N implies

More information

Notes 5 : More on the a.s. convergence of sums

Notes 5 : More on the a.s. convergence of sums Notes 5 : More o the a.s. covergece of sums Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces: Dur0, Sectios.5; Wil9, Sectio 4.7, Shi96, Sectio IV.4, Dur0, Sectio.. Radom series. Three-series

More information

6.3 Testing Series With Positive Terms

6.3 Testing Series With Positive Terms 6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial

More information

Chapter 6 Infinite Series

Chapter 6 Infinite Series Chapter 6 Ifiite Series I the previous chapter we cosidered itegrals which were improper i the sese that the iterval of itegratio was ubouded. I this chapter we are goig to discuss a topic which is somewhat

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013 MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013 Fuctioal Law of Large Numbers. Costructio of the Wieer Measure Cotet. 1. Additioal techical results o weak covergece

More information

MA131 - Analysis 1. Workbook 3 Sequences II

MA131 - Analysis 1. Workbook 3 Sequences II MA3 - Aalysis Workbook 3 Sequeces II Autum 2004 Cotets 2.8 Coverget Sequeces........................ 2.9 Algebra of Limits......................... 2 2.0 Further Useful Results........................

More information

Lecture 2: Concentration Bounds

Lecture 2: Concentration Bounds CSE 52: Desig ad Aalysis of Algorithms I Sprig 206 Lecture 2: Cocetratio Bouds Lecturer: Shaya Oveis Ghara March 30th Scribe: Syuzaa Sargsya Disclaimer: These otes have ot bee subjected to the usual scrutiy

More information

BIRKHOFF ERGODIC THEOREM

BIRKHOFF ERGODIC THEOREM BIRKHOFF ERGODIC THEOREM Abstract. We will give a proof of the poitwise ergodic theorem, which was first proved by Birkhoff. May improvemets have bee made sice Birkhoff s orgial proof. The versio we give

More information

Solutions to HW Assignment 1

Solutions to HW Assignment 1 Solutios to HW: 1 Course: Theory of Probability II Page: 1 of 6 Uiversity of Texas at Austi Solutios to HW Assigmet 1 Problem 1.1. Let Ω, F, {F } 0, P) be a filtered probability space ad T a stoppig time.

More information

Part II Probability and Measure

Part II Probability and Measure Part II Probability ad Measure Based o lectures by J. Miller Notes take by Dexter Chua Michaelmas 2016 These otes are ot edorsed by the lecturers, ad I have modified them (ofte sigificatly) after lectures.

More information

Probability and Random Processes

Probability and Random Processes Probability ad Radom Processes Lecture 5 Probability ad radom variables The law of large umbers Mikael Skoglud, Probability ad radom processes 1/21 Why Measure Theoretic Probability? Stroger limit theorems

More information

The Borel hierarchy classifies subsets of the reals by their topological complexity. Another approach is to classify them by size.

The Borel hierarchy classifies subsets of the reals by their topological complexity. Another approach is to classify them by size. Lecture 7: Measure ad Category The Borel hierarchy classifies subsets of the reals by their topological complexity. Aother approach is to classify them by size. Filters ad Ideals The most commo measure

More information

If a subset E of R contains no open interval, is it of zero measure? For instance, is the set of irrationals in [0, 1] is of measure zero?

If a subset E of R contains no open interval, is it of zero measure? For instance, is the set of irrationals in [0, 1] is of measure zero? 2 Lebesgue Measure I Chapter 1 we defied the cocept of a set of measure zero, ad we have observed that every coutable set is of measure zero. Here are some atural questios: If a subset E of R cotais a

More information

Introduction to Probability. Ariel Yadin. Lecture 2

Introduction to Probability. Ariel Yadin. Lecture 2 Itroductio to Probability Ariel Yadi Lecture 2 1. Discrete Probability Spaces Discrete probability spaces are those for which the sample space is coutable. We have already see that i this case we ca take

More information

Lecture 14: Graph Entropy

Lecture 14: Graph Entropy 15-859: Iformatio Theory ad Applicatios i TCS Sprig 2013 Lecture 14: Graph Etropy March 19, 2013 Lecturer: Mahdi Cheraghchi Scribe: Euiwoog Lee 1 Recap Bergma s boud o the permaet Shearer s Lemma Number

More information

4. Partial Sums and the Central Limit Theorem

4. Partial Sums and the Central Limit Theorem 1 of 10 7/16/2009 6:05 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 4. Partial Sums ad the Cetral Limit Theorem The cetral limit theorem ad the law of large umbers are the two fudametal theorems

More information

Lecture 19: Convergence

Lecture 19: Convergence Lecture 19: Covergece Asymptotic approach I statistical aalysis or iferece, a key to the success of fidig a good procedure is beig able to fid some momets ad/or distributios of various statistics. I may

More information

The Central Limit Theorem

The Central Limit Theorem Chapter The Cetral Limit Theorem Deote by Z the stadard ormal radom variable with desity 2π e x2 /2. Lemma.. Ee itz = e t2 /2 Proof. We use the same calculatio as for the momet geeratig fuctio: exp(itx

More information

MAT1026 Calculus II Basic Convergence Tests for Series

MAT1026 Calculus II Basic Convergence Tests for Series MAT026 Calculus II Basic Covergece Tests for Series Egi MERMUT 202.03.08 Dokuz Eylül Uiversity Faculty of Sciece Departmet of Mathematics İzmir/TURKEY Cotets Mootoe Covergece Theorem 2 2 Series of Real

More information

ACO Comprehensive Exam 9 October 2007 Student code A. 1. Graph Theory

ACO Comprehensive Exam 9 October 2007 Student code A. 1. Graph Theory 1. Graph Theory Prove that there exist o simple plaar triagulatio T ad two distict adjacet vertices x, y V (T ) such that x ad y are the oly vertices of T of odd degree. Do ot use the Four-Color Theorem.

More information

An alternative proof of a theorem of Aldous. concerning convergence in distribution for martingales.

An alternative proof of a theorem of Aldous. concerning convergence in distribution for martingales. A alterative proof of a theorem of Aldous cocerig covergece i distributio for martigales. Maurizio Pratelli Dipartimeto di Matematica, Uiversità di Pisa. Via Buoarroti 2. I-56127 Pisa, Italy e-mail: pratelli@dm.uipi.it

More information

This exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam.

This exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam. Probability ad Statistics FS 07 Secod Sessio Exam 09.0.08 Time Limit: 80 Miutes Name: Studet ID: This exam cotais 9 pages (icludig this cover page) ad 0 questios. A Formulae sheet is provided with the

More information

Randomized Algorithms I, Spring 2018, Department of Computer Science, University of Helsinki Homework 1: Solutions (Discussed January 25, 2018)

Randomized Algorithms I, Spring 2018, Department of Computer Science, University of Helsinki Homework 1: Solutions (Discussed January 25, 2018) Radomized Algorithms I, Sprig 08, Departmet of Computer Sciece, Uiversity of Helsiki Homework : Solutios Discussed Jauary 5, 08). Exercise.: Cosider the followig balls-ad-bi game. We start with oe black

More information

The Boolean Ring of Intervals

The Boolean Ring of Intervals MATH 532 Lebesgue Measure Dr. Neal, WKU We ow shall apply the results obtaied about outer measure to the legth measure o the real lie. Throughout, our space X will be the set of real umbers R. Whe ecessary,

More information

Theorem 3. A subset S of a topological space X is compact if and only if every open cover of S by open sets in X has a finite subcover.

Theorem 3. A subset S of a topological space X is compact if and only if every open cover of S by open sets in X has a finite subcover. Compactess Defiitio 1. A cover or a coverig of a topological space X is a family C of subsets of X whose uio is X. A subcover of a cover C is a subfamily of C which is a cover of X. A ope cover of X is

More information

2 Banach spaces and Hilbert spaces

2 Banach spaces and Hilbert spaces 2 Baach spaces ad Hilbert spaces Tryig to do aalysis i the ratioal umbers is difficult for example cosider the set {x Q : x 2 2}. This set is o-empty ad bouded above but does ot have a least upper boud

More information

Here are some examples of algebras: F α = A(G). Also, if A, B A(G) then A, B F α. F α = A(G). In other words, A(G)

Here are some examples of algebras: F α = A(G). Also, if A, B A(G) then A, B F α. F α = A(G). In other words, A(G) MATH 529 Probability Axioms Here we shall use the geeral axioms of a probability measure to derive several importat results ivolvig probabilities of uios ad itersectios. Some more advaced results will

More information

FUNDAMENTALS OF REAL ANALYSIS by

FUNDAMENTALS OF REAL ANALYSIS by FUNDAMENTALS OF REAL ANALYSIS by Doğa Çömez Backgroud: All of Math 450/1 material. Namely: basic set theory, relatios ad PMI, structure of N, Z, Q ad R, basic properties of (cotiuous ad differetiable)

More information

6a Time change b Quadratic variation c Planar Brownian motion d Conformal local martingales e Hints to exercises...

6a Time change b Quadratic variation c Planar Brownian motion d Conformal local martingales e Hints to exercises... Tel Aviv Uiversity, 28 Browia motio 59 6 Time chage 6a Time chage..................... 59 6b Quadratic variatio................. 61 6c Plaar Browia motio.............. 64 6d Coformal local martigales............

More information

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 22

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 22 CS 70 Discrete Mathematics for CS Sprig 2007 Luca Trevisa Lecture 22 Aother Importat Distributio The Geometric Distributio Questio: A biased coi with Heads probability p is tossed repeatedly util the first

More information

PRELIM PROBLEM SOLUTIONS

PRELIM PROBLEM SOLUTIONS PRELIM PROBLEM SOLUTIONS THE GRAD STUDENTS + KEN Cotets. Complex Aalysis Practice Problems 2. 2. Real Aalysis Practice Problems 2. 4 3. Algebra Practice Problems 2. 8. Complex Aalysis Practice Problems

More information

Probability and Measure

Probability and Measure Probability ad Measure Stefa Grosskisky Cambridge, Michaelmas 2005 These otes ad other iformatio about the course are available o www.statslab.cam.ac.uk/ stefa/teachig/probmeas.html The text is based o

More information

Math 220A Fall 2007 Homework #2. Will Garner A

Math 220A Fall 2007 Homework #2. Will Garner A Math 0A Fall 007 Homewor # Will Garer Pg 3 #: Show that {cis : a o-egative iteger} is dese i T = {z œ : z = }. For which values of q is {cis(q): a o-egative iteger} dese i T? To show that {cis : a o-egative

More information

Math 525: Lecture 5. January 18, 2018

Math 525: Lecture 5. January 18, 2018 Math 525: Lecture 5 Jauary 18, 2018 1 Series (review) Defiitio 1.1. A sequece (a ) R coverges to a poit L R (writte a L or lim a = L) if for each ǫ > 0, we ca fid N such that a L < ǫ for all N. If the

More information

Introductory Ergodic Theory and the Birkhoff Ergodic Theorem

Introductory Ergodic Theory and the Birkhoff Ergodic Theorem Itroductory Ergodic Theory ad the Birkhoff Ergodic Theorem James Pikerto Jauary 14, 2014 I this expositio we ll cover a itroductio to ergodic theory. Specifically, the Birkhoff Mea Theorem. Ergodic theory

More information

LECTURE 8: ASYMPTOTICS I

LECTURE 8: ASYMPTOTICS I LECTURE 8: ASYMPTOTICS I We are iterested i the properties of estimators as. Cosider a sequece of radom variables {, X 1}. N. M. Kiefer, Corell Uiversity, Ecoomics 60 1 Defiitio: (Weak covergece) A sequece

More information

Learning Theory: Lecture Notes

Learning Theory: Lecture Notes Learig Theory: Lecture Notes Kamalika Chaudhuri October 4, 0 Cocetratio of Averages Cocetratio of measure is very useful i showig bouds o the errors of machie-learig algorithms. We will begi with a basic

More information

MAS111 Convergence and Continuity

MAS111 Convergence and Continuity MAS Covergece ad Cotiuity Key Objectives At the ed of the course, studets should kow the followig topics ad be able to apply the basic priciples ad theorems therei to solvig various problems cocerig covergece

More information

Law of the sum of Bernoulli random variables

Law of the sum of Bernoulli random variables Law of the sum of Beroulli radom variables Nicolas Chevallier Uiversité de Haute Alsace, 4, rue des frères Lumière 68093 Mulhouse icolas.chevallier@uha.fr December 006 Abstract Let be the set of all possible

More information

Chapter IV Integration Theory

Chapter IV Integration Theory Chapter IV Itegratio Theory Lectures 32-33 1. Costructio of the itegral I this sectio we costruct the abstract itegral. As a matter of termiology, we defie a measure space as beig a triple (, A, µ), where

More information

Math 140A Elementary Analysis Homework Questions 3-1

Math 140A Elementary Analysis Homework Questions 3-1 Math 0A Elemetary Aalysis Homework Questios -.9 Limits Theorems for Sequeces Suppose that lim x =, lim y = 7 ad that all y are o-zero. Detarime the followig limits: (a) lim(x + y ) (b) lim y x y Let s

More information

Math 216A Notes, Week 5

Math 216A Notes, Week 5 Math 6A Notes, Week 5 Scribe: Ayastassia Sebolt Disclaimer: These otes are ot early as polished (ad quite possibly ot early as correct) as a published paper. Please use them at your ow risk.. Thresholds

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 3 9/11/2013. Large deviations Theory. Cramér s Theorem

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 3 9/11/2013. Large deviations Theory. Cramér s Theorem MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/5.070J Fall 203 Lecture 3 9//203 Large deviatios Theory. Cramér s Theorem Cotet.. Cramér s Theorem. 2. Rate fuctio ad properties. 3. Chage of measure techique.

More information

( ) = p and P( i = b) = q.

( ) = p and P( i = b) = q. MATH 540 Radom Walks Part 1 A radom walk X is special stochastic process that measures the height (or value) of a particle that radomly moves upward or dowward certai fixed amouts o each uit icremet of

More information

Sequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

Sequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece 1, 1, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet

More information

62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 +

62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 + 62. Power series Defiitio 16. (Power series) Give a sequece {c }, the series c x = c 0 + c 1 x + c 2 x 2 + c 3 x 3 + is called a power series i the variable x. The umbers c are called the coefficiets of

More information

Lecture 2: April 3, 2013

Lecture 2: April 3, 2013 TTIC/CMSC 350 Mathematical Toolkit Sprig 203 Madhur Tulsiai Lecture 2: April 3, 203 Scribe: Shubhedu Trivedi Coi tosses cotiued We retur to the coi tossig example from the last lecture agai: Example. Give,

More information

Ada Boost, Risk Bounds, Concentration Inequalities. 1 AdaBoost and Estimates of Conditional Probabilities

Ada Boost, Risk Bounds, Concentration Inequalities. 1 AdaBoost and Estimates of Conditional Probabilities CS8B/Stat4B Sprig 008) Statistical Learig Theory Lecture: Ada Boost, Risk Bouds, Cocetratio Iequalities Lecturer: Peter Bartlett Scribe: Subhrasu Maji AdaBoost ad Estimates of Coditioal Probabilities We

More information

McGill University Math 354: Honors Analysis 3 Fall 2012 Solutions to selected problems

McGill University Math 354: Honors Analysis 3 Fall 2012 Solutions to selected problems McGill Uiversity Math 354: Hoors Aalysis 3 Fall 212 Assigmet 3 Solutios to selected problems Problem 1. Lipschitz fuctios. Let Lip K be the set of all fuctios cotiuous fuctios o [, 1] satisfyig a Lipschitz

More information

Mathematics 170B Selected HW Solutions.

Mathematics 170B Selected HW Solutions. Mathematics 17B Selected HW Solutios. F 4. Suppose X is B(,p). (a)fidthemometgeeratigfuctiom (s)of(x p)/ p(1 p). Write q = 1 p. The MGF of X is (pe s + q), sice X ca be writte as the sum of idepedet Beroulli

More information

Part A, for both Section 200 and Section 501

Part A, for both Section 200 and Section 501 Istructios Please write your solutios o your ow paper. These problems should be treated as essay questios. A problem that says give a example or determie requires a supportig explaatio. I all problems,

More information

Problem Set 2 Solutions

Problem Set 2 Solutions CS271 Radomess & Computatio, Sprig 2018 Problem Set 2 Solutios Poit totals are i the margi; the maximum total umber of poits was 52. 1. Probabilistic method for domiatig sets 6pts Pick a radom subset S

More information

7 Sequences of real numbers

7 Sequences of real numbers 40 7 Sequeces of real umbers 7. Defiitios ad examples Defiitio 7... A sequece of real umbers is a real fuctio whose domai is the set N of atural umbers. Let s : N R be a sequece. The the values of s are

More information

(b) What is the probability that a particle reaches the upper boundary n before the lower boundary m?

(b) What is the probability that a particle reaches the upper boundary n before the lower boundary m? MATH 529 The Boudary Problem The drukard s walk (or boudary problem) is oe of the most famous problems i the theory of radom walks. Oe versio of the problem is described as follows: Suppose a particle

More information

This section is optional.

This section is optional. 4 Momet Geeratig Fuctios* This sectio is optioal. The momet geeratig fuctio g : R R of a radom variable X is defied as g(t) = E[e tx ]. Propositio 1. We have g () (0) = E[X ] for = 1, 2,... Proof. Therefore

More information

Notes 27 : Brownian motion: path properties

Notes 27 : Brownian motion: path properties Notes 27 : Browia motio: path properties Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces:[Dur10, Sectio 8.1], [MP10, Sectio 1.1, 1.2, 1.3]. Recall: DEF 27.1 (Covariace) Let X = (X

More information

Square-Congruence Modulo n

Square-Congruence Modulo n Square-Cogruece Modulo Abstract This paper is a ivestigatio of a equivalece relatio o the itegers that was itroduced as a exercise i our Discrete Math class. Part I - Itro Defiitio Two itegers are Square-Cogruet

More information

6 Infinite random sequences

6 Infinite random sequences Tel Aviv Uiversity, 2006 Probability theory 55 6 Ifiite radom sequeces 6a Itroductory remarks; almost certaity There are two mai reasos for eterig cotiuous probability: ifiitely high resolutio; edless

More information

Lecture 4. We also define the set of possible values for the random walk as the set of all x R d such that P(S n = x) > 0 for some n.

Lecture 4. We also define the set of possible values for the random walk as the set of all x R d such that P(S n = x) > 0 for some n. Radom Walks ad Browia Motio Tel Aviv Uiversity Sprig 20 Lecture date: Mar 2, 20 Lecture 4 Istructor: Ro Peled Scribe: Lira Rotem This lecture deals primarily with recurrece for geeral radom walks. We preset

More information

lim za n n = z lim a n n.

lim za n n = z lim a n n. Lecture 6 Sequeces ad Series Defiitio 1 By a sequece i a set A, we mea a mappig f : N A. It is customary to deote a sequece f by {s } where, s := f(). A sequece {z } of (complex) umbers is said to be coverget

More information

Ergodicity of Stochastic Processes and the Markov Chain Central Limit Theorem

Ergodicity of Stochastic Processes and the Markov Chain Central Limit Theorem Uiversity of Bristol School of Mathematics Ergodicity of Stochastic Processes ad the Markov Chai Cetral Limit Theorem A Project of 30 Credit Poits at Level 7 For the Degree of MSci Mathematics Author:

More information