Probability Theory and Stochastic Processes UAB. Paul Jung



Notation used in the text:

Throughout the text, $:=$ denotes a definition or "set equal to".
We use the symbol $\equiv$ to indicate two different notations for the same object or, in the case of functions, identically equal to a constant.
$\sqcup$ denotes a disjoint union.
Lebesgue measure is denoted by $m$, $m(dx)$, $dx$, or $dt$.
$\bar{\mathbb{R}} := \mathbb{R} \cup \{-\infty, \infty\}$.
$\{X < x\} := \{\omega : X(\omega) < x\}$.
For events $A$ and $B$, we often use a comma for intersection: $\{A, B\} := \{A \cap B\}$.
For events $A, B$ or random variables $X, Y$, we denote independence by $A \perp B$ or $X \perp Y$.
$X \in \mathcal{F}$ denotes that $X$ is $\mathcal{F}$-measurable.
$\mathbb{N}_0 := \mathbb{N} \cup \{0\}$.
u.i. stands for uniformly integrable.
i.i.d. stands for independent and identically distributed.
$\mu_n \Rightarrow \mu$ denotes weak convergence of distributions.
$a_n \sim b_n$ denotes $\lim a_n / b_n = 1$.

Contents

Measure Theory
Random Variables
Expectation
Distributions
Bernoulli's Laws of Large Numbers
Independence and Convolution
Weak Law of Large Numbers
Strong Law of Large Numbers
Uniform Integrability and the $L^1$ Law of Large Numbers
The Ergodic Theorem
Conditional Expectation
Stationary Sequences
Birkhoff's Ergodic Theorem
The Central Limit Theorem
Convergence in Distribution
Characteristic Functions
The Central Limit Theorem
The Moment Method
Bochner's Theorem
The Law of Small Numbers
Poisson Convergence
Poisson Processes
Bernstein's Block Method
Random Walk
Recurrence and Transience
Stopping Times
The Markov and Martingale Properties
Large Deviations
Brownian Motion
Construction of the Process
Properties of Brownian Motion

1 Measure Theory

We will begin with a brief review of measure theory. As this is meant only for review, many proofs will be omitted, but can be found in [RF10] or [Rud87].

Definition 1.1. Given a set $\Omega$, a σ-algebra or σ-field is a subset $\mathcal{F} \subseteq \mathcal{P}(\Omega)$ (the power set of $\Omega$) which satisfies
(i) $\emptyset \in \mathcal{F}$ (thus $\mathcal{F}$ is nonempty),
(ii) $A \in \mathcal{F}$ implies $A^c \in \mathcal{F}$,
(iii) $A_n \in \mathcal{F}$ for all $n \in \mathbb{N}$ implies that $\bigcup_{n \in \mathbb{N}} A_n \in \mathcal{F}$.

Definition 1.2. A measure is a non-negative function $\mu : \mathcal{F} \to [0, \infty]$ which satisfies
(i) $\mu(\emptyset) = 0$,
(ii) countable additivity: $\sum_{n \in \mathbb{N}} \mu(A_n) = \mu\left(\bigsqcup_{n \in \mathbb{N}} A_n\right)$, where $\sqcup$ denotes the disjoint union.

Definition 1.3. A measurable space is a pair $(\Omega, \mathcal{F})$, and a measure space is a triplet $(\Omega, \mathcal{F}, \mu)$.

Example 1.4 (Lebesgue measure). Let $\mathcal{B}$ be the Borel σ-field on $\Omega = \mathbb{R}$, i.e., the smallest σ-field containing all open sets. Thus, any open set $G$ is in $\mathcal{B}$, and any closed set $F_{\text{closed}}$ is in $\mathcal{B}$. Also, $G_\delta \in \mathcal{B}$ and $F_\sigma \in \mathcal{B}$, where $G_\delta$ is a countable intersection of open sets and $F_\sigma$ is a countable union of closed sets. Furthermore, countable unions of $G_\delta$ sets form the collection $G_{\delta\sigma} := (G_\delta)_\sigma$ and countable intersections of $F_\sigma$ sets form $F_{\sigma\delta}$. This procedure can be recursively applied to get all of $\mathcal{B}$.

Let $m(\cdot)$ be the measure which assigns to any open interval its length, $m((a, b)) = b - a$. This is Lebesgue measure, and we have that $(\mathbb{R}, \mathcal{B}, m)$ is a measure space. One can extend the σ-field to include sets which are not in $\mathcal{B}$, but such a discussion is better left for a course in real analysis. One can also restrict Lebesgue measure to obtain a measure space of the form $(\mathbb{R}, \sigma(\{A\}), m)$. Here $\sigma(\{\cdot\})$ is the smallest σ-field containing $\{\cdot\}$, e.g. $\sigma(\{A\}) = \{\emptyset, A, A^c, \Omega\}$.

Example 1.5 (Counting measure). Let $\Omega = \mathbb{Z}$. We can define the counting measure space $(\mathbb{Z}, \mathcal{P}(\mathbb{Z}), \mu)$ where for $A \in \mathcal{P}(\mathbb{Z})$, $\mu(A) := \#\{x \in \mathbb{Z} : x \in A\}$. Again, we can also define measures by restricting the σ-field, for example $(\mathbb{Z}, \{\emptyset, A, A^c, \mathbb{Z}\}, \mu)$.

Definition 1.6. The norm of a measure $\mu$, denoted $\|\mu\|$, is $\mu(\Omega)$. If $\|\mu\| < \infty$, this corresponds to a true norm for the vector space of finite signed measures on $\Omega$.

1.1 Random Variables

Definition 1.7. A probability space is a measure space where the norm of the measure is 1. We generally denote probability spaces as $(\Omega, \mathcal{F}, P)$. Any element $\omega \in \Omega$ is called an outcome and any $A \in \mathcal{F}$ is called an event. The set $\Omega$ is often called the sample space.

Any measure $\mu$ with a nonzero, finite norm can be made into a probability measure by normalizing, i.e., scaling the measure by $\|\mu\|^{-1}$.

Example 1.8. Consider the set of all possible results from $n$ fair coin tosses which result in H or T. Let $\Omega = \{H, T\}^n$, $\mathcal{F} = \mathcal{P}(\Omega)$, and $P(A) = \#A / 2^n$. Then for $A = \{\text{exactly one T}\}$ we have that $P(A) = n / 2^n$. Note that $P$ is a normalized counting measure.

Example 1.9. If we count the number of H's appearing in each $\omega \in \Omega$ of the previous example and identify (or merge into a single element) all elements with the same number of H's, then we get the sample space $\tilde{\Omega} = \{0, 1, \ldots, n\}$ with $\tilde{\mathcal{F}} = \mathcal{P}(\tilde{\Omega})$. One can then check that
\[ \tilde{P}(A) = \sum_{k \in A} \binom{n}{k} \frac{1}{2^n} \]
is the measure induced by $P$ under this identification.

Example 1.10. Let $\Omega = [0, 1]$, $\mathcal{F} = \mathcal{B}([0, 1])$, and $P(A) = m(A)$ (Lebesgue measure). This can be thought of as the measure corresponding to infinitely many fair coin tosses with $\omega \in \Omega$ given by $\omega = (\omega_1, \omega_2, \ldots)$ and $\omega_i = 0$ or $1$, with the association that H = 1 and T = 0. In other words, we use the dyadic expansion $\omega = \sum_{i \ge 1} \omega_i 2^{-i}$ of $\omega \in [0, 1]$.

Definition 1.11. Given two measurable spaces $(\Omega, \mathcal{F})$ and $(\mathbb{R}, \mathcal{B})$, we say that $X : \Omega \to \mathbb{R}$ is an $\mathcal{F}$-measurable function if $X^{-1}(B) \in \mathcal{F}$ for all $B \in \mathcal{B}$. We often just say measurable when our choice of $\mathcal{F}$ is implicit. Probabilists call measurable functions on $(\Omega, \mathcal{F}, P)$ random variables.

Remarks 1.12.
1. It is important to look at inverse images of Borel sets from the range of $X$ and not forward images of $A \in \mathcal{F}$ from the domain of $X$. In general, the image of a measurable function $X$ may not be measurable with respect to the Borel σ-field on $\mathbb{R}$. For instance, set $\Omega \subset \mathbb{R}$ such that $\Omega \notin \mathcal{B}$ and consider the σ-field $\mathcal{F} = \{\Omega \cap B : B \in \mathcal{B}\}$. Then the identity map restricted to $\Omega$ (or inclusion map $X : \Omega \to \mathbb{R}$) is $\mathcal{F}$-measurable, but the forward image of $\Omega$ (under the map $X$) is $\Omega$, which is not a Borel set.

2. Since $\{(-\infty, c), c \in \mathbb{R}\}$, or alternatively $\{(-\infty, c], c \in \mathbb{R}\}$, generate the Borel sets, the measurability condition is equivalent to $X^{-1}((-\infty, c)) \in \mathcal{F}$ for all $c \in \mathbb{R}$, or alternatively $X^{-1}((-\infty, c]) \in \mathcal{F}$ for all $c \in \mathbb{R}$.

Exercise 1.1. If $X$ and $Y$ are random variables, show that $X + Y$ and $XY$ are random variables.

Example 1.13 (Binomial and Bernoulli random variables). The probability space described in Example 1.9 is a canonical one for the following important random variable. A random variable which counts the number of heads out of $n$ independent coin tosses is called a binomial random variable. In general, the coin tosses come up heads with probability $0 < p < 1$, in which case a binomial random variable $X \sim \mathrm{Bin}(n, p)$ is described by the probabilities
\[ P(\{\omega : X(\omega) = k\}) = \binom{n}{k} p^k (1 - p)^{n - k}. \]
The special instance $n = 1$ is called a Bernoulli random variable, in which case we write $X \sim \mathrm{Ber}(p)$.

Example 1.14 (Geometric random variables). Consider a probability space similar to Example 1.10, except that we toss an infinite number of coins which may not be fair. Suppose each coin comes up heads independently with probability $0 < p < 1$. A random variable $X$ which counts the number of tosses required in order to get our first H is called a geometric random variable and we write $X \sim \mathrm{Geom}(p)$. It is described by the probabilities
\[ P(\{\omega : X(\omega) = k\}) = p (1 - p)^{k - 1}. \]

Definition 1.15. If $X : (\Omega, \mathcal{F}) \to (\bar{\mathbb{R}}, \mathcal{B})$ is measurable, then we say that $X$ is an extended random variable (here, $\bar{\mathbb{R}} = \mathbb{R} \cup \{\pm\infty\}$). If $\vec{X} : (\Omega, \mathcal{F}) \to (\mathbb{R}^n, \mathcal{B}^n)$ is measurable, then we say that $\vec{X}$ is a random vector ($\mathcal{B}^n$ is just the Borel σ-field on $\mathbb{R}^n$).

Proposition 1.16. For a countable set of random variables $\{X_n, n \in \mathbb{N}\}$, we have that $\inf_{n \in \mathbb{N}} X_n$, $\sup_{n \in \mathbb{N}} X_n$, $\liminf_{n \in \mathbb{N}} X_n$ and $\limsup_{n \in \mathbb{N}} X_n$ are all extended random variables.

Proof. We shall only prove the first claim. We begin by noting that
\[ \{\omega : \inf_{n \in \mathbb{N}} X_n(\omega) < a\} = \bigcup_{n \in \mathbb{N}} \{\omega : X_n(\omega) < a\}. \tag{1.1} \]
Each set $\{\omega : X_n(\omega) < a\} \in \mathcal{F}$ since $X_n$ is measurable by assumption. Thus, $\bigcup_{n \in \mathbb{N}} \{\omega : X_n(\omega) < a\} \in \mathcal{F}$ as it is a countable union of measurable sets.
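The finite coin-tossing spaces of Examples 1.8, 1.9 and 1.13 are small enough to enumerate directly. The following is a minimal sketch, not part of the original notes: the helper names `coin_space` and `prob` and the choice $n = 4$ are illustrative assumptions. It checks that $P(\{\text{exactly one T}\}) = n/2^n$ and that the measure induced by counting heads agrees with the binomial probabilities for $p = 1/2$.

```python
from fractions import Fraction
from itertools import product
from math import comb


def coin_space(n):
    """All outcomes of n tosses, i.e. the sample space {H, T}^n."""
    return list(product("HT", repeat=n))


def prob(event, omega):
    """Normalized counting measure P(A) = #A / #Omega."""
    return Fraction(len(event), len(omega))


n = 4  # illustrative choice
omega = coin_space(n)

# Example 1.8: the event {exactly one T}.
exactly_one_tail = [w for w in omega if w.count("T") == 1]
p_one_tail = prob(exactly_one_tail, omega)

# Example 1.9: merge outcomes with the same number of heads and record
# the induced measure of each singleton {k}.
induced = {
    k: prob([w for w in omega if w.count("H") == k], omega)
    for k in range(n + 1)
}

# Example 1.13 with p = 1/2: binomial probabilities C(n, k) / 2^n.
binomial = {k: Fraction(comb(n, k), 2**n) for k in range(n + 1)}

print(p_one_tail)           # n / 2^n as an exact fraction
print(induced == binomial)  # the induced measure is the binomial law
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparisons are equalities rather than floating-point approximations.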

Example 1.17. It is important that the index set of $\inf_{n \in \mathbb{N}} X_n$ is countable. Consider the probability space $([0, 1], \mathcal{B}, m)$, a fixed subset $A \subset [0, 1]$, and the random variables
\[ X_t(\omega) = \begin{cases} 0 & \text{if } t = \omega \text{ and } \omega \in A^c \\ 1 & \text{otherwise.} \end{cases} \]
Then $\inf_{t \in [0,1]} X_t(\omega) = 1_A(\omega)$, which is measurable only if $A \in \mathcal{B}$.

1.2 Expectation

In the rest of this section we consider general measure spaces $(E, \mathcal{F}, \mu)$ which are finite (unless otherwise stated) so that in particular, $\mu(E) < \infty$.

Definition 1.18. A measurable function $\varphi(x)$ is simple if $\varphi$ takes finitely many values. A simple function $\varphi$ is called an indicator function if $\varphi(x) \in \{0, 1\}$ for all $x \in E$. We write
\[ 1_A(x) = \begin{cases} 1, & x \in A \\ 0, & x \in A^c. \end{cases} \]

Definition 1.19 (Integral of simple functions). If $\varphi : E \to \mathbb{R}$ is simple and takes values $\{a_1, \ldots, a_n\}$, then we define its Lebesgue integral as
\[ \int_E \varphi \, d\mu \equiv \int_E \varphi \, \mu(dx) := \sum_{j=1}^{n} a_j \, \mu(\varphi^{-1}(a_j)). \tag{1.2} \]

Example 1.20. Let $A$ be a measurable subset of $E$. Then
\[ \int_E 1_A \, d\mu = 1 \cdot \mu(A) + 0 \cdot \mu(A^c) = \mu(A). \]

The following lemma is what makes the Lebesgue integral work and what gives it power.

Lemma 1.21 (Simple Approximation Lemma). A function $f$ is measurable if and only if there exists a sequence of simple functions $(\varphi_n, n \in \mathbb{N})$ such that $\varphi_n(x) \to f(x)$ for all $x \in E$. Moreover, this sequence can be chosen to satisfy $|\varphi_n| \le |f|$ for all $n$. If $f$ is nonnegative, the sequence $(\varphi_n, n \in \mathbb{N})$ can be chosen to be nondecreasing.

Definition 1.22 (Integral of measurable functions). We proceed in three steps, by defining the integral first for bounded and nonnegative functions, then for nonnegative functions, and finally for general measurable functions.

(i) If $f \ge 0$ is a bounded, measurable function on $E$, we define
\[ \int_E f \, d\mu := \sup\left\{ \int_E \varphi \, d\mu : \varphi \text{ is simple and } \varphi \le f \right\}. \tag{1.3} \]

(ii) If $g \ge 0$ is measurable on $E$, we define
\[ \int_E g \, d\mu := \sup\left\{ \int_E f \, d\mu : f \text{ is a bounded, measurable function and } 0 \le f \le g \right\}. \]
Note that $\int_E g \, d\mu = \infty$ is possible.

(iii) If $g$ is measurable, then there exist measurable functions $g_1$ and $g_2$ such that $g_1 \ge 0$, $g_2 \le 0$, and $g = g_1 + g_2$. For example, let $A = g^{-1}([0, \infty))$. Then $A$ is measurable and $g = 1_A g + 1_{A^c} g$. Using this decomposition, we can then define
\[ \int_E g \, d\mu := \int_E g_1 \, d\mu - \int_E (-g_2) \, d\mu \tag{1.4} \]
whenever (at least) one of the integrals on the right is finite. If both integrals on the right are infinite, the integral is undefined.

Exercise 1.2. For bounded, measurable $f$,
\[ \int_E f \, d\mu = \inf\left\{ \int_E \varphi \, d\mu : \varphi \text{ is simple and } \varphi \ge f \right\}. \]

Exercise 1.3. If $(E, \mu) = ([0, 1], m)$ and $f$ is bounded and Riemann integrable, then
\[ \int_E f \, m(dx) = \int_0^1 f \, dx, \]
where the right-hand side is the Riemann integral.

Henceforth, all integrals are with respect to measures and $dx$ or $dt$ will denote Lebesgue measure (we also continue to sometimes use $m(dx)$ to denote Lebesgue measure).

Theorem 1.23 (Linearity and monotonicity). If $f$ and $g$ are measurable functions on $(E, \mu)$, then
\[ \int_E (af + bg) \, d\mu = a \int_E f \, d\mu + b \int_E g \, d\mu. \]
If $f \le g$, then
\[ \int_E f \, d\mu \le \int_E g \, d\mu. \]

In a typical course in measure theory, one would prove linearity and monotonicity in three steps for measurable $f$ and $g$: first for bounded functions, next for nonnegative functions, and finally for general measurable functions.
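To see formulas (1.2) and (1.3) at work, one can integrate a concrete nondecreasing sequence of simple functions. The sketch below is an illustration only, under stated assumptions: it takes $f(x) = x^2$ on $([0,1], m)$ and the dyadic staircase $\varphi_n = \lfloor 2^n f \rfloor / 2^n$, which is one standard choice for the sequence in Lemma 1.21 (no truncation is needed because $f$ is bounded). The function name is hypothetical.

```python
from math import sqrt


def integral_of_dyadic_approximation(n):
    """Integral over [0, 1] (Lebesgue measure) of the simple function
    phi_n = floor(2^n f) / 2^n for f(x) = x^2.

    phi_n takes the value k / 2^n on {x : k/2^n <= x^2 < (k+1)/2^n},
    which is the interval [sqrt(k/2^n), sqrt((k+1)/2^n)), so formula
    (1.2) becomes a finite sum of (value) * (length of preimage).
    The single point x = 1 has measure zero and is ignored.
    """
    levels = 2**n
    total = 0.0
    for k in range(levels):
        left = sqrt(k / levels)
        right = sqrt((k + 1) / levels)
        total += (k / levels) * (right - left)
    return total


approximations = [integral_of_dyadic_approximation(n) for n in range(1, 11)]
for n, value in enumerate(approximations, start=1):
    print(n, value)
```

Since $0 \le f - \varphi_n < 2^{-n}$ pointwise, the printed values increase toward $\int_0^1 x^2 \, dx = 1/3$ and stay within $2^{-n}$ of it, in line with the supremum in (1.3).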

Corollary 1.24 (Triangle Type Inequality). If $f$ is measurable, then
\[ \left| \int_E f \, d\mu \right| \le \int_E |f| \, d\mu. \]

Proof. Since $-|f| \le f \le |f|$, we have by monotonicity that
\[ -\int_E |f| \, d\mu \le \int_E f \, d\mu \le \int_E |f| \, d\mu. \]

Definition 1.25 (Integral over subsets). For any measurable subset $A \subset E$, we set
\[ \int_A f \, d\mu := \int_E f 1_A \, d\mu. \]

Proposition 1.26. If $E = \bigsqcup_{n=1}^{\infty} A_n$, then
\[ \int_E f \, d\mu = \sum_{n=1}^{\infty} \int_{A_n} f \, d\mu. \]

Proof. This follows from linearity of the integral and the countable additivity of $\mu$.

Definition 1.27. We say that $f(x) = g(x)$ almost everywhere on $E$ with respect to the measure $\mu$, denoted a.e., if $\mu(\{x : f(x) \ne g(x)\}) = 0$. If $\|\mu\| = 1$, then we say that $f = g$ almost surely, denoted a.s., on $E$. We say that two sets $A$ and $B$ are equal a.e. or a.s. if their indicator functions are equal a.e. or a.s.

Remark 1.28. It is often the case in both probability theory and measure theory that one refers to a function $f$ when really one means the equivalence class of functions which are equal to $f$ a.s. or a.e., since typically one only wants to know things up to measure zero.

Corollary 1.29. If $f \overset{a.e.}{=} g$, then $\int_E f \, d\mu = \int_E g \, d\mu$.

Proof. Let $A = \{x : f(x) = g(x)\}$, so that $\mu(A^c) = 0$ and integrals over $A^c$ vanish. Then
\[ \int_E f \, d\mu = \int_A f \, d\mu + \int_{A^c} f \, d\mu = \int_A g \, d\mu + \int_{A^c} f \, d\mu = \int_A g \, d\mu = \int_A g \, d\mu + \int_{A^c} g \, d\mu = \int_E g \, d\mu. \]

Definition 1.30 (Convergence almost everywhere, almost surely). We say $(f_n, n \in \mathbb{N})$ converges almost everywhere to $f$ and write $f_n \overset{a.e.}{\to} f$ whenever it converges pointwise except on a set of measure zero:
\[ \mu(\{x : \lim_{n \to \infty} f_n(x) \ne f(x)\}) = 0. \]
If $\mu$ is a probability measure, we say almost surely and write $f_n \overset{a.s.}{\to} f$.

Theorem 1.31 (Bounded Convergence Theorem). If $(f_n, n \in \mathbb{N})$ is a sequence of measurable functions for which there exists an $M \in \mathbb{R}$ such that $|f_n| \le M$ for all $n$ and $f_n \overset{a.e.}{\to} f$, then
\[ \int_E f \, d\mu = \lim_{n \to \infty} \int_E f_n \, d\mu. \]

It is crucial in the above theorem that $(E, \mathcal{F}, \mu)$ is a finite measure space. This is not so for the next three results, but for consistency let us continue to assume the setting of finite measure spaces.

Lemma 1.32 (Fatou's Lemma). If $(f_n, n \in \mathbb{N})$ is a sequence of nonnegative, measurable functions on $E$, then
\[ \int_E \liminf_{n \to \infty} f_n \, d\mu \le \liminf_{n \to \infty} \int_E f_n \, d\mu. \]

Theorem 1.33 (Monotone Convergence Theorem). If $(f_n, n \in \mathbb{N})$ is a sequence of nonnegative, measurable functions on $E$ such that $f_n \uparrow f$ a.e., then
\[ \int_E f \, d\mu = \lim_{n \to \infty} \int_E f_n \, d\mu. \]

Theorem 1.34 (Dominated Convergence Theorem). Suppose $(f_n, n \in \mathbb{N})$ is a sequence of measurable functions on $E$ for which $|f_n| \le g$ for some $g$ such that $\int_E g \, d\mu < \infty$. If $f_n \overset{a.e.}{\to} f$, then
\[ \lim_{n \to \infty} \int_E f_n \, d\mu = \int_E f \, d\mu. \]

Definition 1.35. If $\|\mu\| = 1$, then we call the integral (if it exists) of a random variable $X$ the expectation of $X$, denoted
\[ EX := \int_E X(x) \, d\mu(x). \]
Our typical notation for a probability space is $(\Omega, \mathcal{F}, P)$, in which case the above becomes
\[ EX = \int_\Omega X(\omega) \, dP. \]

Exercise 1.4. Show that if $g : \mathbb{R} \to \mathbb{R}$ is a Borel measurable function and $X$ is a random variable, then $g(X)$ is a random variable. In particular, the following is well-defined:
\[ E(g(X)) = \int_\Omega g(X(\omega)) \, dP. \]

Having introduced probabilist's notation for integration, in the rest of the section we shall couch some typical theorems from real analysis in this setting. First, however, let us introduce two more terms used in the language of probability.

Definition 1.36. Since $g(x) = x^p$ is a Borel measurable function, $g(X) = X^p$ is a random variable. The $p$th moment of $X$ is given by
\[ EX^p = \int_\Omega X^p(\omega) \, dP. \]

Definition 1.37. $E(X^2) - (EX)^2$ is called the variance of $X$, denoted $\mathrm{Var}\, X$. It is easy to see that $\mathrm{Var}\, X$ is also equal to $E(X - EX)^2$.

Because $EX$ and $\mathrm{Var}\, X$ play a central role, they are often denoted by $\mu = EX$ and $\sigma^2 = \mathrm{Var}\, X$. We acknowledge that $\mu$ is being used for several different objects, but the reader should be able to deduce the meaning in each case by context.

Theorem 1.38 (Hölder's Inequality). Suppose $p$ and $q$ are conjugate exponents, i.e.,
\[ \frac{1}{p} + \frac{1}{q} = 1, \qquad 1 < p, q < \infty. \]
If $X$ and $Y$ are random variables, then
\[ E|XY| \le \|X\|_p \|Y\|_q, \]
where $\|X\|_p := (E|X|^p)^{1/p}$.

In the case where $p = q = 2$, Hölder's Inequality is just the Cauchy-Schwarz Inequality (one can check that $E\,XY$ defines an inner product between $X$ and $Y$).

Exercise 1.5 (Paley-Zygmund Inequality). Let $Y \ge 0$ with $EY^2 < \infty$. For $\theta \in [0, 1]$,
\[ P(Y > \theta EY) \ge (1 - \theta)^2 \frac{(EY)^2}{EY^2}. \]
Hint: Use Hölder's Inequality on the product $Y 1_{\{Y > \theta EY\}}$.

Exercise 1.6. Let $E|X|^k < \infty$. Then for $0 < j < k$ we have
\[ E|X|^j \le (E|X|^k)^{j/k} < \infty. \]

Exercise 1.7 (Minkowski's Inequality). For $p \ge 1$ and random variables $X$ and $Y$, show that
\[ \|X\|_p + \|Y\|_p \ge \|X + Y\|_p. \]

Minkowski's Inequality shows that $\|\cdot\|_p$ is a norm for the space of all random variables with finite $p$th moment. In fact, it turns out that this normed linear space is complete, thus making $L^p(\Omega)$ a Banach space. This motivates the following definition.

Definition 1.39. We say that a random variable $X$ is integrable with respect to $P$ if $E|X| < \infty$ and write $X \in L^1(\Omega) \equiv L^1(\Omega, \mathcal{F}, P)$. If $E|X|^p < \infty$, then we say $X \in L^p(\Omega)$.

Example 1.40. If $X \in L^p(\Omega)$ and $Y \in L^q(\Omega)$ where $p$ and $q$ are conjugate exponents, then by Hölder's Inequality, $XY \in L^1(\Omega)$.

Example 1.41. Let $Z$ be a random variable. Define random variables $Y \equiv 1$ and $X = |Z|^\alpha$ for $\alpha \ge 1$. Then, by Hölder's Inequality, we have that
\[ E|X \cdot 1| \le (E\, 1^q)^{1/q} (E|X|^p)^{1/p}. \]
Thus,
\[ (E|Z|^\alpha)^{1/\alpha} \le (E|Z|^{\alpha p})^{1/(\alpha p)} \quad \text{or} \quad \|Z\|_\alpha \le \|Z\|_{\alpha p}. \]
Thus, for $1 \le p < q < \infty$, we have that $\|Z\|_p \le \|Z\|_q$. In particular, if $X \in L^q(\Omega)$, then $X \in L^p(\Omega)$. An immediate result of this fact is that for a random variable $X$, $\mathrm{Var}\, X < \infty$ implies $E|X| < \infty$ and thus $|EX| < \infty$.

Note: This sort of result does not hold for general measure spaces. It relies on the assumption that we are working on a probability (and thus finite) measure space.

Exercise 1.8. (a) Use summation-by-parts (the discrete analog of integration-by-parts) to show that when $X$ takes values in $\mathbb{N}$,
\[ EX = \sum_{n \in \mathbb{N}} P(\{\omega : X(\omega) \ge n\}). \]
(b) For a general random variable, use Example 1.41 to show also that $\mathrm{Var}\, X < \infty$ if and only if
\[ \sum_{n \in \mathbb{N}} E\left(|X| 1_{\{|X| > n\}}\right) < \infty. \]

Definition 1.42. A function $\varphi : \mathbb{R} \to \mathbb{R}$ is said to be convex on $\mathbb{R}$ if for all $x, y \in \mathbb{R}$ and $\lambda \in [0, 1]$,
\[ \varphi(\lambda x + (1 - \lambda) y) \le \lambda \varphi(x) + (1 - \lambda) \varphi(y). \]
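The monotonicity $\|Z\|_p \le \|Z\|_q$ from Example 1.41 and the tail-sum formula of Exercise 1.8(a) can be checked numerically on a small example. The sketch below is only an illustration, not a proof: the fair six-sided die and the helper names `expectation` and `lp_norm` are assumptions made for the example.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, X in {1, ..., 6}.
pmf = {k: Fraction(1, 6) for k in range(1, 7)}


def expectation(g, pmf):
    """E g(X) for a discrete random variable with the given pmf."""
    return sum(g(x) * p for x, p in pmf.items())


def lp_norm(p, pmf):
    """||X||_p = (E|X|^p)^(1/p), returned as a float."""
    return float(expectation(lambda x: abs(x) ** p, pmf)) ** (1.0 / p)


# Norms for increasing exponents; Example 1.41 says these are nondecreasing.
exponents = (1, 2, 3, 4, 8)
norms = [lp_norm(p, pmf) for p in exponents]

# Exercise 1.8(a): EX equals the sum over n >= 1 of P(X >= n).
mean = expectation(lambda x: x, pmf)
tail_sum = sum(
    sum(p for x, p in pmf.items() if x >= n) for n in range(1, 7)
)

print(norms)
print(mean, tail_sum)
```

As $p$ grows, the norms approach the largest value the die can take, which is consistent with the ordering of the $L^p$ norms on a probability space.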

Theorem 1.43 (Jensen's Inequality). If $\varphi : \mathbb{R} \to \mathbb{R}$ is a convex function, then
\[ E(\varphi(X)) \ge \varphi(EX), \]
provided that both are finite.

Proof. Consider a line $l(x) = ax + b$ satisfying $l(x) \le \varphi(x)$ and $l(EX) = \varphi(EX)$. For convex functions, such a line always exists. Then $\varphi(X) \ge aX + b$, thus by monotonicity and linearity of expectation,
\[ E\varphi(X) \ge E(aX + b) = aEX + b = l(EX) = \varphi(EX). \]

Example 1.44. Let $\varphi(t) = t^p$ for $t \ge 0$ and $p \ge 1$. Then $\varphi$ is a convex function. Let $Z$ be a random variable and define the random variable $X$ as $X = |Z|^\alpha$, $\alpha \ge 1$. Then Jensen's Inequality gives us that $E\varphi(X) \ge \varphi(EX)$. Therefore,
\[ E|Z|^{\alpha p} \ge (E|Z|^\alpha)^p. \]
By taking the $\alpha p$th root of both sides we get
\[ (E|Z|^{\alpha p})^{1/(\alpha p)} \ge (E|Z|^\alpha)^{1/\alpha}. \]
Thus, we again have the result that $\|Z\|_{\alpha p} \ge \|Z\|_\alpha$.

Theorem 1.45 (Markov's Inequality). For a random variable $X$ and $u > 0$, we have that
\[ P(\{\omega : |X(\omega)| \ge u\}) \le \frac{E|X|}{u}. \]

Proof. We begin by noting that
\[ u 1_{\{\omega : |X(\omega)| \ge u\}} \le |X|. \]
We can then take the expectation of both sides to get
\[ u E(1_{\{\omega : |X(\omega)| \ge u\}}) \le E|X|. \]
By rewriting the expectation on the left as a probability and dividing both sides by $u$, we get
\[ P(\{\omega : |X(\omega)| \ge u\}) \le \frac{E|X|}{u}. \]

Remark 1.46. When $u > 0$, the event $\{|X| \ge u\}$ is the same as $\{X^2 \ge u^2\}$, thus an easy corollary is Chebyshev's Inequality:
\[ P(\{\omega : |X(\omega)| \ge u\}) \le \frac{EX^2}{u^2}. \]
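The inequalities of Markov, Chebyshev and Jensen can be sanity-checked on any small discrete distribution. The following sketch uses an arbitrary three-point distribution chosen for illustration (an assumption, not an example from the notes) and exact rational arithmetic; the helper names are hypothetical.

```python
from fractions import Fraction

# Assumed example: P(X = -1) = 1/2, P(X = 0) = 1/4, P(X = 3) = 1/4.
pmf = {-1: Fraction(1, 2), 0: Fraction(1, 4), 3: Fraction(1, 4)}


def expectation(g):
    """E g(X) under the pmf above."""
    return sum(g(x) * p for x, p in pmf.items())


def tail(u):
    """P(|X| >= u)."""
    return sum(p for x, p in pmf.items() if abs(x) >= u)


thresholds = [Fraction(1, 2), 1, 2, 3, 4]

# Markov (Theorem 1.45): P(|X| >= u) <= E|X| / u.
markov_ok = all(tail(u) <= expectation(abs) / u for u in thresholds)

# Chebyshev (Remark 1.46): P(|X| >= u) <= E X^2 / u^2.
chebyshev_ok = all(
    tail(u) <= expectation(lambda x: x * x) / (u * u) for u in thresholds
)

# Jensen (Theorem 1.43) with the convex function phi(x) = x^2:
# E phi(X) - phi(EX) >= 0, and for this phi the gap is Var X.
jensen_gap = expectation(lambda x: x * x) - expectation(lambda x: x) ** 2

print(markov_ok, chebyshev_ok, jensen_gap)
```

For $\varphi(x) = x^2$ the Jensen gap is exactly the variance of Definition 1.37, which gives a quick way to remember the direction of the inequality.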

Exercise 1.9. If $\varphi$ is a strictly convex function, then $\varphi(EX) = E\varphi(X)$ implies that $X$ is a.s. constant.

Theorem 1.47 (Transformation Theorem). If $f : \mathbb{R}^n \to \mathbb{R}$ is a Borel measurable function and $\{X_i, 1 \le i \le n\}$ are random variables, then $f(X_1, \ldots, X_n)$ is a random variable.

We now want to discuss the idea of product measures. In order to simplify the mechanics while still providing insight, we will consider the case where $n = 2$.

Lemma 1.48 (Product measure). Let $(E_1, \mathcal{F}_1, \mu_1)$ and $(E_2, \mathcal{F}_2, \mu_2)$ be measure spaces. There is a unique measure $\mu$ on $E = E_1 \times E_2$ with σ-field
\[ \mathcal{F} = \mathcal{F}_1 \otimes \mathcal{F}_2 := \sigma(\{A \times B : A \in \mathcal{F}_1, B \in \mathcal{F}_2\}) \]
such that
\[ \mu(A \times B) = \mu_1 \otimes \mu_2 (A \times B) := \mu_1(A) \mu_2(B). \]

Using the lemma above, we may define the notion of a product σ-field and product measure as the ones described in the lemma. One should check that, in the case of Lebesgue measure, these coincide with our previous notion of $(\mathbb{R}^n, \mathcal{B}^n, m)$. This is done by confirming the countable additivity of Lebesgue measure on $\mathcal{B}^n$, and then utilizing Carathéodory's Extension Theorem (see [RF10]).

Theorem 1.49 (Fubini-Tonelli Theorem). Suppose $E = E_1 \times E_2$ is endowed with a product σ-field and product measure. For a measurable function $f$ on $E$, if $f \ge 0$ or $f \in L^1(E, \mathcal{F}, \mu)$, then
\[ \int_E f(x, y) \, \mu(dx, dy) = \int_{E_2} \underbrace{\int_{E_1} f(x, y) \, \mu_1(dx)}_{(*)} \, \mu_2(dy). \]

Remarks 1.50.
1. It is possible that $f(x, \cdot)$ be measurable for each $x$ and $f(\cdot, y)$ be measurable for each $y$, but $f$ is still not measurable. In such cases, the theorem cannot hold since the integral on the left is not even well-defined.
2. For Tonelli ($f \ge 0$), both sides may be infinite.
3. Part of the theorem is that $(*)$, as a function of $y$, is measurable with respect to $(E_2, \mathcal{F}_2, \mu_2)$.
4. Note that the iterated integral on the right may be done in either order since the designation of $E_1$ was arbitrary.
5. Similar to Fatou, Monotone Convergence, and Dominated Convergence, it is not necessary here that the measure spaces be finite.

1.3 Distributions

A property is probability-theoretical if and only if it is described in terms of a distribution.
M. Loève [Loè77]

Definition 1.51. The distribution [1] of a random variable $X$ is a measure $\mu_X$ on $(\mathbb{R}, \mathcal{B})$ such that for $A \in \mathcal{B}$,
\[ \mu_X(A) := P(X^{-1}(A)). \]
In other words, it is the probability measure on $\mathbb{R}$ induced by the measure $P$ on $\Omega$. We will also sometimes say that a measure $\mu$ on $(\mathbb{R}, \mathcal{B})$ is a distribution if it is a probability measure, even if there is no a priori associated random variable.

Remark 1.52. A less widely used term for the induced measure on $\mathbb{R}$ is the law of a random variable $X$. However, the term law is also sometimes used in reference to the measure $P$ on the measurable space $(\Omega, \mathcal{F})$. Due to its ambiguity, we do not use this terminology in the sequel.

The quote at the beginning of this section is in the context of M. Loève writing on the conceptual difference between probability theory and general measure theory. We add to this the claim that the notion of a distribution is the heart of a random variable. To explain this claim, consider that a measurable space $(\Omega, \mathcal{F})$ is required to have very little structure, just enough to define a measure on it (for example, we cannot discuss addition or continuity in the set $\Omega$ since it may not be a group and may not have a topology). This is reasonable since events which occur in real life typically do not come with natural algebraic, topological, or geometric structures.

Introducing a random variable $X$ into the picture (or random vector $\vec{X}$) allows us to push probabilities into a space in which we have a great deal of structure, namely $\mathbb{R}$ (respectively $\mathbb{R}^n$). We can then forget about $(\Omega, \mathcal{F}, P)$ and work with $(\mathbb{R}, \mathcal{B}, \mu_X)$ (respectively $(\mathbb{R}^n, \mathcal{B}^n, \mu_{\vec{X}})$). The values $X$ takes are now thought of as varying according to the induced measure $\mu_X$, which is the motivation behind calling it a random variable rather than a function. This is embodied by the shorthand notation
\[ P(X \in A) \equiv P(\{\omega : X(\omega) \in A\}). \]

Remark 1.53. Since $\Omega$ is typically an abstract space to begin with, one often just sets $\Omega = \mathbb{R}$. Then we may also set $\mu_X = P$, in which case we might as well set $X(\omega) = \omega$ (the identity function). This is taking the above discussion to the extreme, but this sort of thinking is sometimes helpful, particularly when we later encounter sequences of random variables taking values in an infinite product space $\mathbb{R}^{\mathbb{N}} = \prod_{i=1}^{\infty} \mathbb{R}$.

Exercise 1.10. Show that $\mu_X$ is a probability measure.

Definition 1.54. A collection of random variables $\{X_i, i \in I\}$ is said to be identically distributed (i.d.) if the distribution of $X_i$ is the same for all $i \in I$.

[1] These should not be confused with distributions in the theory of partial differential equations.

Example 1.55. Let $\Omega = \{H, T\}^3$ and consider the Bernoulli random variables
\[ X_i = \begin{cases} 1, & \text{if H on the } i\text{th toss} \\ 0, & \text{otherwise} \end{cases} \]
for $i \in \{1, 2, 3\}$. Then
\[ \Omega = \left\{ \begin{array}{ll} \omega_1 = (H, H, H), & \omega_5 = (H, T, T), \\ \omega_2 = (H, H, T), & \omega_6 = (T, H, T), \\ \omega_3 = (H, T, H), & \omega_7 = (T, T, H), \\ \omega_4 = (T, H, H), & \omega_8 = (T, T, T) \end{array} \right\}. \]
Then the distribution of $X_i$ for $i \in \{1, 2, 3\}$ is given by $\mu = \frac{1}{2}\delta_0 + \frac{1}{2}\delta_1$. Thus, these random variables are identically distributed with distribution $\mu = \mu_{X_i}$.

Example 1.56 (Exponential distribution with rate $\lambda$). We say that $X$ has an exponential distribution with rate $\lambda$ and write $X \sim \mathrm{Exp}(\lambda)$ if
\[ \mu_X(A) = \int_{A \cap [0, \infty)} \lambda e^{-\lambda x} \, dx \quad \text{for all } A \in \mathcal{B}. \]

Example 1.57 (Normal or Gaussian distribution). We say that $X$ has a normal or Gaussian distribution with mean $\mu_0$ and variance $\sigma^2$ and write $X \sim N(\mu_0, \sigma^2)$ if
\[ \mu_X(A) = \int_A \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x - \mu_0)^2}{2\sigma^2} \right) dx. \]

Definition 1.58. The distribution function of $X$ is defined to be
\[ F_X(x) := P(X \le x) \equiv P(\{\omega : X(\omega) \le x\}) = \mu_X((-\infty, x]). \]

One should not confuse distribution functions, which are actual functions on $\mathbb{R}$, with distributions, which are probability measures on $\mathbb{R}$. To make the distinction clear, one often calls $F_X$ a cumulative distribution function or more simply a cdf.

The following is a standard result from measure theory.

Lemma 1.59 (Continuity of measure). Suppose $A_n, A \in \mathcal{F}$ for some (possibly infinite) measure space $(E, \mathcal{F}, \mu)$. If $A_n \uparrow A$, then
\[ \mu(A) = \lim_{n \to \infty} \mu(A_n). \]
If $\mu$ is a finite measure and $A_n \downarrow A$, then
\[ \mu(A) = \lim_{n \to \infty} \mu(A_n). \]

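In the following, we use the notation
\[ f(x+) := \lim_{y \downarrow x} f(y) \quad \text{and} \quad f(x-) := \lim_{y \uparrow x} f(y). \]

Proposition 1.60 (Distribution function properties).
(i) $F_X$ is nondecreasing, i.e., $x \le y$ implies $F_X(x) \le F_X(y)$.
(ii) $F_X$ is right-continuous, i.e., $F_X(x+) = F_X(x)$ for all $x$.
(iii) $\lim_{x \to -\infty} F_X(x) = 0$ and $\lim_{x \to \infty} F_X(x) = 1$.
(iv) $P(X = x) = \mu_X(\{x\}) = F_X(x) - F_X(x-)$.

Proof.
(i) Since $x \le y$, we have $(-\infty, x] \subset (-\infty, y]$, which implies
\[ F_X(x) = \mu_X((-\infty, x]) \le \mu_X((-\infty, y]) = F_X(y). \]
(ii) Suppose $x_n \downarrow x$. Then $\bigcap_n (-\infty, x_n] = (-\infty, x]$, which implies
\[ F_X(x+) = \lim_{n \to \infty} \mu_X((-\infty, x_n]) = \mu_X\left( \bigcap_n (-\infty, x_n] \right) = \mu_X((-\infty, x]) = F_X(x), \]
where the second equality follows from continuity of measure.
(iii) From the fact that $\bigcap_{n \in \mathbb{N}} (-\infty, -n] = \emptyset$, we conclude that
\[ \lim_{x \to -\infty} F_X(x) = \lim_{n \to \infty} \mu_X((-\infty, -n]) = \mu_X\left( \bigcap_{n \in \mathbb{N}} (-\infty, -n] \right) = \mu_X(\emptyset) = 0. \]
From $\bigcup_{n \in \mathbb{N}} (-\infty, n] = \mathbb{R}$, we conclude that
\[ \lim_{x \to \infty} F_X(x) = \lim_{n \to \infty} \mu_X((-\infty, n]) = \mu_X\left( \bigcup_{n \in \mathbb{N}} (-\infty, n] \right) = \mu_X(\mathbb{R}) = 1, \]
where the second equalities in both of the above follow from continuity of measure.
(iv) This follows from the fact that $F_X(x-) = P(X < x)$, which one can easily check.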
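For a discrete distribution, the properties in Proposition 1.60 can be verified by direct computation. The sketch below assumes, purely for illustration, the distribution of the number of heads in three fair tosses; `cdf` and `cdf_left` are hypothetical helper names for $F_X(x)$ and $F_X(x-)$.

```python
from fractions import Fraction

# Assumed example: number of heads in three fair tosses.
atoms = {
    0: Fraction(1, 8),
    1: Fraction(3, 8),
    2: Fraction(3, 8),
    3: Fraction(1, 8),
}


def cdf(x):
    """F_X(x) = mu_X((-inf, x]): total mass of atoms at or below x."""
    return sum((p for a, p in atoms.items() if a <= x), Fraction(0))


def cdf_left(x):
    """F_X(x-) = P(X < x): total mass of atoms strictly below x."""
    return sum((p for a, p in atoms.items() if a < x), Fraction(0))


# Property (iv): the jump of F_X at x equals P(X = x).
jumps = {a: cdf(a) - cdf_left(a) for a in atoms}

# Property (i): F_X is nondecreasing along a grid of half-integers.
grid = [Fraction(k, 2) for k in range(-4, 11)]
values = [cdf(x) for x in grid]

print(jumps)
print(values[0], values[-1])  # limits in property (iii) are attained here
```

Between atoms the function is flat, so right-continuity in (ii) is visible from the fact that `cdf` uses `<=` while `cdf_left` uses `<`.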

Theorem 1.61 (Characterization by distribution functions). If a function $F$ satisfies properties (i), (ii), and (iii) of Proposition 1.60, then it is the distribution function of some random variable $X$, i.e., there is an $X$ such that $F = F_X$.

Proof. We will construct an $X$ using $(\Omega, \mathcal{F}, P) = ([0, 1], \mathcal{B}([0, 1]), m)$, i.e., Lebesgue measure on the Borel subsets of $[0, 1]$. We define
\[ X(\omega) := \sup\{y : F(y) < \omega\}. \]
When $F$ is continuous and strictly increasing, $X$ is its inverse, and this is how one should think about it even when $F$ is discontinuous or not strictly increasing. If we can show that
\[ \{\omega : X(\omega) \le x\} = \{\omega : \omega \le F(x)\}, \tag{1.5} \]
then
\[ P(X \le x) = P(\omega \le F(x)) = m([0, F(x)]) = F(x), \]
as desired. So now we show (1.5).

[$\supseteq$] Suppose that $\omega \le F(x)$. Since $X$ is nondecreasing, we have $X(\omega) \le X(F(x)) \le x$. The latter inequality is equivalent to saying
\[ \sup\{y : F(y) < F(x)\} \le x, \]
which is true since if $y$ is such that $F(y) < F(x)$, then $y \le x$ since $F$ is nondecreasing.

[$\subseteq$] Suppose that $X(\omega) \le x$. We must show $\omega \le F(x)$. By way of contradiction, suppose $\omega > F(x)$. Let $x_n \downarrow x$ with $x_n > x$ for all $n$. By right-continuity, $F(x_n) \downarrow F(x)$. Choose an $N \in \mathbb{N}$ such that $\omega > F(x_N) \ge F(x)$. Then we have
\[ \sup\{y : F(y) < \omega\} \ge x_N \]
because $x_N$ is one such $y$. Then by definition, $X(\omega) \ge x_N > x$, a contradiction.

Remark 1.62. Since $F_X$ is nondecreasing, all discontinuities are jump-discontinuities. Therefore, any $F_X$ with a discontinuity of height $c$ at point $x$ must be associated to a distribution $\mu_X$ which is partly made up by the point mass $c\delta_x$. The point $x$ is sometimes referred to as an atom.

Example 1.63 (Uniform distribution). Perhaps the simplest distribution functions are those of continuous uniform random variables $X \sim \mathrm{Unif}[a, b]$ and discrete uniform random variables $X \sim \mathrm{Unif}\{x_1, \ldots, x_n\}$. For the continuous case we have
\[ F_X(x) = \begin{cases} 0, & x \le a \\ \frac{x - a}{b - a}, & x \in (a, b) \\ 1, & x \ge b. \end{cases} \]
In the discrete case we have
\[ F_X(x) = \begin{cases} 0, & x < x_1 \\ \frac{k}{n}, & x_k \le x < x_{k+1} \text{ and } 1 \le k < n \\ 1, & x \ge x_n. \end{cases} \]

The use of the words discrete and continuous apply to more than just uniform random variables. In particular, we say that a random variable is discrete if its distribution can be written in the form
\[ \sum_{n \in \mathbb{N}} p_n \delta_{x_n} \]
for a countable set of values $\{x_n, n \in \mathbb{N}\}$ which occur with probabilities $\{p_n, n \in \mathbb{N}\}$ summing to one (in the finite case, all but finitely many of the probabilities are zero). A random variable is continuous if its distribution function $F_X$ is continuous. However, it is often the case that when one says $X$ is continuous, one really means the slightly stronger statement that its distribution is absolutely continuous, which we now discuss.

Definition 1.64. If $\mu$ and $\nu$ are measures on $(\mathbb{R}, \mathcal{B})$, we say that $\nu$ is absolutely continuous with respect to $\mu$ if for all $A \in \mathcal{B}$,
\[ \mu(A) = 0 \implies \nu(A) = 0, \]
and we write $\nu \ll \mu$.

Before moving on, let us motivate the above definition. Given a random variable $X$, we have up until now three different yet equivalent ways of describing a probability measure. Firstly, the abstract way, $(\Omega, \mathcal{F}, P)$. Secondly, using the measure $\mu_X$ on $(\mathbb{R}, \mathcal{B})$ induced by $X$. Finally, Theorem 1.61 tells us that the distribution function $F_X$ uniquely determines the measure $\mu_X$. In the rest of the section we will show that the notion of absolute continuity provides a fourth description of the probability measure that holds whenever a distribution is absolutely continuous with respect to Lebesgue measure.

Definition 1.65. A measure space $(E, \mathcal{F}, \mu)$ is σ-finite if there exists a countable collection $\{E_n, n \in \mathbb{N}\} \subset \mathcal{F}$ such that $E = \bigcup_{n \in \mathbb{N}} E_n$ and $\mu(E_n) < \infty$ for all $n$.

Theorem 1.66 (Radon-Nikodym Theorem). Suppose $(E, \mathcal{B}, \mu)$ is a σ-finite measure space. Then $\nu \ll \mu$ if and only if there is a measurable function $f \ge 0$ such that
\[ \nu(B) = \int_B f \, d\mu \quad \text{for all } B \in \mathcal{B}. \tag{1.6} \]
The function $f$ is called the Radon-Nikodym derivative and is unique $\mu$-a.e.

Proof. For the proof we refer to [Rud87]. However, one can easily see the $\mu$-a.e. uniqueness of the function $f$, for if $f$ and $g$ both satisfy (1.6) and differ on a set of positive $\mu$-measure, then there is a set $B \in \mathcal{B}$ such that
\[ \int_B (f - g) \, d\mu \ne 0. \]
Hence
\[ \nu(B) = \int_B f \, d\mu \ne \int_B g \, d\mu = \nu(B), \]
a contradiction.

Definition 1.67. We say that a function $F$ on $\mathbb{R}$ is absolutely continuous on an interval $[a, b]$ if for every $\epsilon > 0$, there exists a $\delta > 0$ such that for every $n \in \mathbb{N}$ and every collection $\{(a_k, b_k)\}_{k=1}^{n}$ of disjoint open subintervals of $[a, b]$ such that $\sum_{k=1}^{n} (b_k - a_k) < \delta$, we have
\[ \sum_{k=1}^{n} |F(b_k) - F(a_k)| < \epsilon. \]

Note that absolute continuity implies uniform continuity, which in turn implies continuity.

Theorem 1.68 (Fundamental Theorem of Calculus). A function $F$ is absolutely continuous on $[a, b]$ if and only if $F'(x)$ exists a.e., $F' \in L^1([a, b])$, and
\[ F(x) - F(a) = \int_a^x F'(t) \, dt \quad \text{for all } x \in [a, b]. \]
As usual, $dt$ denotes Lebesgue measure.

Definition 1.69. If $X$ is a random variable with an absolutely continuous distribution function $F_X$, then its density (sometimes called probability density function or pdf) is $f_X(x) := F_X'(x)$.

The previous theorem shows that a probability distribution on $\mathbb{R}$ is absolutely continuous with respect to Lebesgue measure precisely when its associated distribution function is absolutely continuous (thus affording us the dual use of this terminology). In particular, one gets that
\[ \mu_X(A) = \int_A f_X(x) \, dx, \]
thus when $F_X$ is absolutely continuous on $\mathbb{R}$, we have four equivalent ways of interpreting the probability measure.

Example 1.70 (Exponential and Normal densities). If $X$ has an exponential distribution with rate $\lambda$, then
\[ f_X(x) = \lambda e^{-\lambda x} 1_{[0, \infty)}(x) \]
and thus for $c > 0$,
\[ F_X(c) = P(X \le c) = \int_0^c \lambda e^{-\lambda x} \, dx. \]
If $X$ has a normal distribution with mean $\mu_0$ and variance $\sigma^2$, then
\[ f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x - \mu_0)^2}{2\sigma^2} \right). \]
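The construction $X(\omega) = \sup\{y : F(y) < \omega\}$ from the proof of Theorem 1.61 can be written out explicitly for the exponential distribution function of Example 1.70 and for a distribution function with atoms. This is a minimal sketch under stated assumptions: the function names, the rate $2$, the Bernoulli parameter $0.3$, and the evaluation grid are all illustrative choices, and $\omega$ is restricted to $(0, 1)$.

```python
import math


def exponential_cdf(x, rate):
    """F_X(x) = 1 - exp(-rate * x) for x >= 0 and 0 otherwise."""
    return 1.0 - math.exp(-rate * x) if x >= 0 else 0.0


def exponential_quantile(omega, rate):
    """X(omega) = sup{y : F(y) < omega} for the Exp(rate) cdf, 0 < omega < 1.

    F is continuous and strictly increasing on [0, inf), so the supremum
    is just the inverse function evaluated at omega.
    """
    return -math.log(1.0 - omega) / rate


def bernoulli_quantile(omega, p):
    """Same construction for Ber(p), whose cdf is 0 on (-inf, 0),
    1 - p on [0, 1) and 1 on [1, inf).

    For 0 < omega <= 1 - p the set {y : F(y) < omega} is (-inf, 0),
    with supremum 0; for omega > 1 - p it is (-inf, 1), with supremum 1.
    """
    return 0 if omega <= 1.0 - p else 1


rate = 2.0
grid = [k / 1000 for k in range(1, 1000)]  # stand-in for omega in (0, 1)

# Continuous case: F(X(omega)) should return omega.
round_trip_error = max(
    abs(exponential_cdf(exponential_quantile(w, rate), rate) - w)
    for w in grid
)

# Atomic case: the share of grid points mapped to 1 approximates P(X = 1).
fraction_of_ones = sum(bernoulli_quantile(w, 0.3) for w in grid) / len(grid)

print(round_trip_error, fraction_of_ones)
```

Feeding uniform random numbers on $(0, 1)$ into either quantile function in place of the deterministic grid is the usual way this construction is turned into a sampling method.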

Example 1.71. Clearly, $F_X$ must be continuous in order for it to be absolutely continuous, so there is no density for a Bernoulli random variable $X \sim \mathrm{Ber}(p)$, which is a coin flip assigning 1 to heads and 0 to tails. It is not enough, however, that $F_X$ is continuous in order for $X$ to have a density. Consider the continuous Cantor-Lebesgue function $F$ on $[0, 1]$, i.e., the Devil's Staircase, which can be extended to all of $\mathbb{R}$ by setting its value to 1 for $x > 1$ and 0 for $x < 0$. Then, it is a distribution function. Since $F'(x) = 0$ a.e.,
\[ F(1) - F(0) = 1 \ne 0 = \int_0^1 F'(x) \, dx. \]
We see that $F$ is not absolutely continuous and thus has no associated density.

Exercise 1.11. If the distribution $\mu_X$ is absolutely continuous with density $f_X$, show that for any Borel measurable function $h$,
\[ E h(X) = \int_{\mathbb{R}} h(x) f_X(x) \, dx. \]

2 Bernoulli's Laws of Large Numbers

2.1 Independence and Convolution

Measure theory ends and probability begins with the definition of independence.
R. Durrett [Dur10]

Before defining independence, let us consider the motivation behind the way it is defined. Let us suppose some event $B \subset \Omega$ (with $P(B) > 0$) is known to occur (say to someone with extra or inside information). Conditioned on the information that $B$ has occurred, the probability that $B^c$ occurs must then be 0. On the other hand, it may be that $P(B) < 1$, yet we know $B$ occurs. It is natural then, under the assumption that $B$ occurs, to normalize $P$ by dividing by $P(B)$. This gives us a new probability
\[ \tilde{P}(A) \equiv P(A \mid B) := \frac{P(A \cap B)}{P(B)} \]
on $\Omega$, called the conditional probability, where $\tilde{P}(A)$ and $P(A \mid B)$ are just two different notations for the same thing and are defined by the right-hand side. We read $P(A \mid B)$ as the probability that $A$ occurs given that $B$ occurs.

With this in mind, if we think of $A$ and $B$ as being independent of each other, then knowledge of $B$ should not affect the probability of $A$ occurring. So, we should expect $P(A \mid B) = P(A)$. This would then imply that $P(A \cap B) = P(A)P(B)$.

We now define independence in a variety of contexts. These considerations motivate us to make the following definition of independence on the space $(\Omega, \mathcal{F}, P)$: we say that two events $A$ and $B$ are independent (with respect to $P$), and we write $A \perp B$, if
\[ P(A \cap B) = P(A)P(B). \]

Given $(\Omega, \mathcal{F}, P)$, two sub-σ-fields of $\mathcal{F}$, say $\mathcal{G}$ and $\mathcal{H}$, are said to be independent if $A \perp B$ for all $A \in \mathcal{G}$ and $B \in \mathcal{H}$.

The notion of independent σ-fields is just an extension of independent events. To see this, note that if $A \perp B$, then $A \perp B^c$, for we have
\[ P(A \cap B^c) = P(A) - P(A \cap B) = P(A) - P(A)P(B) = P(A)(1 - P(B)) = P(A)P(B^c). \]
Since $\sigma(\{A\}) = \{\emptyset, A, A^c, \Omega\}$ and $\sigma(\{B\}) = \{\emptyset, B, B^c, \Omega\}$, it follows that $A \perp B$ implies that every element of $\sigma(\{A\})$ is independent of every element of $\sigma(\{B\})$.

We say that two random variables $X$ and $Y$ are independent if $\sigma(X) \perp \sigma(Y)$, where for a random variable $Z$ we define the σ-field generated by $Z$ to be $\sigma(Z) := \{Z^{-1}(B) : B \in \mathcal{B}\}$ (one can check that this is a σ-field). Note that $\sigma(Z)$ is in fact the smallest σ-field that can be constructed from $\Omega$ that allows $Z$ to be measurable. Also, notice that if $X \perp Y$, then
\[ P(X \in [a, b], Y \in (c, d)) = P\Big( \underbrace{\{\omega : X(\omega) \in [a, b]\}}_{= X^{-1}([a, b]) \in \sigma(X)} \cap \underbrace{\{\omega : Y(\omega) \in (c, d)\}}_{= Y^{-1}((c, d)) \in \sigma(Y)} \Big) = P(X \in [a, b]) P(Y \in (c, d)). \]
We could of course use any two Borel sets in place of $[a, b]$ and $(c, d)$.

We say that a finite number of events $A_1, \ldots, A_n$ are independent if for every index set $I \subset \{1, \ldots, n\}$, we have
\[ P\left( \bigcap_{i \in I} A_i \right) = \prod_{i \in I} P(A_i). \]
The events $A_1, \ldots, A_n$ are pairwise independent if for every $1 \le i < j \le n$,
\[ P(A_i \cap A_j) = P(A_i)P(A_j). \]
It is clear that independence implies pairwise independence.

Exercise 2.1. Show that the converse is not true. In other words, construct events which are pairwise independent, but not independent.

We say that finitely many σ-fields $\mathcal{F}_1, \ldots, \mathcal{F}_n$ are independent if
\[ P\left( \bigcap_{i=1}^{n} A_i \right) = \prod_{i=1}^{n} P(A_i) \]
for all $A_i \in \mathcal{F}_i$, $1 \le i \le n$. Defining this in terms of arbitrary index sets $I$, as above, is not necessary because we can always let some $A_i = \Omega$.

Finitely many random variables $X_1, \ldots, X_n$ are independent if $\sigma(X_1), \ldots, \sigma(X_n)$ are independent.

Lastly, whether we are talking about events, σ-fields, or random variables, an infinite collection is said to be independent if every finite subcollection is independent. For now, we assume that such infinite collections exist and address the existence issue in a later theorem.

Having defined independence eight times, let us develop some properties and see how it is useful. First, a result:

Theorem 2.1 (Independence Transformation Theorem). Suppose $\{X_i, i \in \mathbb{N}\}$ are independent random variables and $f_i : \mathbb{R} \to \mathbb{R}$, $i \in \mathbb{N}$, are Borel measurable. Then $\{f_i(X_i), i \in \mathbb{N}\}$ are independent.

Proof. Since we must show that $\{\sigma(f_i(X_i)), i \in \mathbb{N}\}$ are independent, it suffices to show that $\sigma(f_i(X_i)) \subset \sigma(X_i)$, since $\{\sigma(X_i), i \in \mathbb{N}\}$ are independent. Dropping the index $i$, this means we must show
\[ \{X^{-1}(f^{-1}(B)) : B \in \mathcal{B}\} \subset \{X^{-1}(B) : B \in \mathcal{B}\}, \]
but this is clear, for if $B$ is a Borel set, so is $f^{-1}(B)$ since $f$ is Borel measurable. Hence $X^{-1}(f^{-1}(B))$ is of the form $X^{-1}(B')$ where $B' \in \mathcal{B}$.
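The claim that $A \perp B$ forces every element of $\sigma(\{A\})$ to be independent of every element of $\sigma(\{B\})$ can be checked by brute force on a finite space. The sketch below assumes two fair coin tosses with the normalized counting measure; the helper names are illustrative and not part of the notes.

```python
from fractions import Fraction
from itertools import product

# Assumed example: two fair coin tosses, normalized counting measure.
omega = frozenset(product("HT", repeat=2))


def prob(event):
    """P(A) = #A / #Omega."""
    return Fraction(len(event), len(omega))


def generated_sigma_field(event):
    """sigma({A}) = {empty set, A, A^c, Omega}."""
    event = frozenset(event)
    return [frozenset(), event, omega - event, omega]


def independent(a, b):
    """Check the defining identity P(A and B) = P(A) P(B)."""
    return prob(a & b) == prob(a) * prob(b)


first_is_heads = frozenset(w for w in omega if w[0] == "H")
second_is_heads = frozenset(w for w in omega if w[1] == "H")

# Every pair drawn from the two generated sigma-fields is independent.
fields_independent = all(
    independent(a, b)
    for a in generated_sigma_field(first_is_heads)
    for b in generated_sigma_field(second_is_heads)
)

# By contrast, an event of probability 1/2 is not independent of itself.
self_independent = independent(first_is_heads, first_is_heads)

print(fields_independent, self_independent)
```

The contrast at the end is a reminder that independence is a property of the measure $P$, not of the sets alone: an event is independent of itself only when its probability is 0 or 1.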

By applying the same arguments, one can show

Corollary 2.2. If $\{X_{ij}, (i,j) \in \mathbb{N}^2\}$ are independent, and $f_i : \mathbb{R}^{m_i} \to \mathbb{R}$, $i \in \mathbb{N}$, are Borel measurable, then $\{f_i(X_{i1}, \ldots, X_{i m_i}), i \in \mathbb{N}\}$ are independent.

Proposition 2.3 (Product measures and independence). If $\{X_i, 1 \le i \le n\}$ are independent with distributions $\{\mu_i, 1 \le i \le n\}$ respectively, then the random vector $X = (X_1, \ldots, X_n)$ has distribution $\mu := \bigotimes_{i=1}^{n} \mu_i$ on $(\mathbb{R}^n, \mathcal{B}^n)$.

Proof. First we check that $\mu$ coincides with the distribution of $X$ when applied to product sets $A_1 \times \cdots \times A_n$, where $A_i \in \mathcal{B}$. We have

$$P((X_1, \ldots, X_n) \in A_1 \times \cdots \times A_n) = P(X_1 \in A_1, \ldots, X_n \in A_n) = P(X_1 \in A_1) \cdots P(X_n \in A_n) = \mu_1(A_1) \cdots \mu_n(A_n) = \mu(A_1 \times \cdots \times A_n).$$

Now we must show that $\mu(B) = P(X \in B)$ for arbitrary $B \in \mathcal{B}^n$. We have thus far only shown this for very special $B$; however, note that the Borel σ-field in $\mathbb{R}^n$ is generated by products $A_1 \times \cdots \times A_n$ of Borel sets in $\mathbb{R}$, and this class of product sets is closed under finite intersections. Hence, since $\mu$ and the distribution of $X$ coincide on such a generating class, they coincide on all elements of $\mathcal{B}^n$.

Example 2.4. If $X \sim N(0,1)$ and $Y \sim N(0,1)$ are independent, then they are distributed as

$$\mu_X(A) = \int_A \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\, m(dx), \qquad \mu_Y(B) = \int_B \frac{1}{\sqrt{2\pi}} e^{-y^2/2}\, m(dy).$$

By Proposition 2.3, the distribution of $(X, Y)$ is given by

$$\mu_{(X,Y)}(A \times B) = (\mu_X \otimes \mu_Y)(A \times B) = \int_B \int_A \frac{1}{2\pi} e^{-\frac{x^2 + y^2}{2}}\, dx\, dy.$$

As one might expect, we see that $\mu_{(X,Y)}$ turns out to be the two-dimensional standard Gaussian distribution.

More generally, if $X$ and $Y$ have absolutely continuous distribution functions $F_X$ and $F_Y$, then they have densities $f_X$ and $f_Y$. If $X \perp Y$, by Proposition 2.3 one has

$$\mu_{(X,Y)}(B) = \iint_B f_X(x) f_Y(y)\, dx\, dy. \tag{2.1}$$

Conversely, if $\mu_{(X,Y)}$ is given by the above equation, it is an easy exercise to see that $X \perp Y$.
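As a numerical sanity check of Example 2.4 (not a proof), the sketch below compares the product of the two marginal probabilities with a midpoint-rule approximation of the double integral of the two-dimensional standard Gaussian density over a rectangle. The rectangle, the grid size, and the function names are assumptions chosen for illustration.

```python
import math


def std_normal_cdf(t):
    """Distribution function of N(0, 1), written via the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def joint_density(x, y):
    """Density of the two-dimensional standard Gaussian distribution."""
    return math.exp(-(x * x + y * y) / 2.0) / (2.0 * math.pi)


def rectangle_mass(a, b, c, d, steps=400):
    """Midpoint-rule approximation of the joint density integrated over [a, b] x [c, d]."""
    hx = (b - a) / steps
    hy = (d - c) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * hx
        for j in range(steps):
            y = c + (j + 0.5) * hy
            total += joint_density(x, y)
    return total * hx * hy


a, b, c, d = -1.0, 0.5, 0.0, 2.0  # illustrative rectangle A x B = [a, b] x [c, d]
product_of_marginals = (std_normal_cdf(b) - std_normal_cdf(a)) * (
    std_normal_cdf(d) - std_normal_cdf(c)
)
double_integral = rectangle_mass(a, b, c, d)
print(product_of_marginals, double_integral)
```

The two printed numbers agree up to the discretization error of the midpoint rule, which is what the product form of the joint distribution predicts.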

Definition 2.5. The joint distribution for a collection of random variables $\{X_1, \ldots, X_n\}$ is the distribution $\mu$ on $(\mathbb{R}^n, \mathcal{B}^n)$ of the random vector $X = (X_1, \ldots, X_n)$, i.e.,

$$\mu(A) = P(X \in A) \quad \text{for all } A \in \mathcal{B}^n.$$

Likewise, if

$$\mu(A) = \int_A f_X(x_1, \ldots, x_n)\, m(dx) \quad \text{for all } A \in \mathcal{B}^n,$$

then $f_X$ is said to be a joint density for the collection $\{X_1, \ldots, X_n\}$.

Remark 2.6. We often abuse notation by switching indiscriminately between $X$ and $\{X_1, \ldots, X_n\}$. Similarly, we sometimes say joint distribution (when thought of as a collection) and other times simply say distribution (when thought of as a vector). Note that in the above definition, independence of the random variables is not required. See the examples below.

Corollary 2.7. If $\{X_1, \ldots, X_n\}$ are independent and have densities $f_{X_1}, \ldots, f_{X_n}$, then the random vector $X = (X_1, \ldots, X_n)$ has joint density $f_X : \mathbb{R}^n \to \mathbb{R}$ defined by

$$f_X(x_1, \ldots, x_n) = \prod_{i=1}^{n} f_{X_i}(x_i).$$

Example 2.8. If $X_1 \sim \mathrm{Unif}(0,1)$ and $X_1 = X_2$, i.e., the two random variables are not only identically distributed but in fact identical, then the random vector $X = (X_1, X_2)$ has a joint distribution but no joint density, since the distribution concentrates on the one-dimensional line $y = x$. This is despite the fact that $X_1$ and $X_2$ both have densities.

Example 2.9. For $X_1 \perp X_2$, let $X_1 \sim N(0,1)$ and let $X_2$ have a density defined by

$$f_{X_2}(x) = \begin{cases} \dfrac{2}{\sqrt{2\pi}}\, e^{-x^2/2}, & \text{if } x < 0, \\ 0, & \text{otherwise.} \end{cases}$$

Also, let $X_3 = X_1$ if $X_1 > 0$ and $X_3 = X_2$ if $X_1 < 0$. Then $X_3 \sim N(0,1)$, but $X_3$ is not independent of $X_1$. Again, the random vector $X = (X_1, X_3)$ has a joint distribution but no joint density, since part of the distribution concentrates on the line $y = x$ for $x > 0$. However, if one conditions on the event $X_1 < 0$ (or $X_3 < 0$), then the conditional distribution has a joint density. In fact, this conditional density is proportional to the density of a standard two-dimensional Gaussian random variable, restricted to the quadrant $x < 0$, $y < 0$.
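The claim $X_3 \sim N(0,1)$ in Example 2.9 can be verified directly. The following short computation is a sketch of one way to do it; the symbols $\Phi$ and $\varphi$ for the standard normal distribution function and density are notation introduced here only for this check.

```latex
% Case t < 0: X_3 <= t forces X_1 < 0 and X_3 = X_2; use X_1 \perp X_2.
P(X_3 \le t) = P(X_1 < 0,\ X_2 \le t)
             = P(X_1 < 0)\, P(X_2 \le t)
             = \tfrac12 \int_{-\infty}^{t} 2\varphi(x)\, dx
             = \Phi(t).

% Case t >= 0: X_2 < 0 almost surely, so the event {X_1 < 0} is contained in {X_3 <= t}.
P(X_3 \le t) = P(X_1 < 0) + P(0 < X_1 \le t)
             = \tfrac12 + \bigl(\Phi(t) - \tfrac12\bigr)
             = \Phi(t).
```

In both cases the distribution function of $X_3$ equals $\Phi$, so $X_3$ is standard normal even though it coincides with $X_1$ on the event $\{X_1 > 0\}$.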

Definition 2.10. Suppose the vector $X = (X_1, X_2)$ has distribution $\mu$. The probabilities $\mu(B)$ for all $B$ of the form $(a,b) \times \mathbb{R}$ determine a marginal distribution, $\mu_1$, defined by

$$\mu_1((a,b)) := \mu((a,b) \times \mathbb{R}).$$

Remarks 2.11.

1. Given $\mu((a_1,b_1) \times \mathbb{R})$ and $\mu(\mathbb{R} \times (a_2,b_2))$ for all open $(a_1,b_1)$ and $(a_2,b_2)$, we then know the marginal distributions $\mu_1$ and $\mu_2$, but we still do not know the distribution $\mu$ for the random vector. However, in the special case where $X_1$ and $X_2$ are independent, one can easily extract $\mu$, since then we obtain that

$$\mu\big(((a_1,b_1) \times \mathbb{R}) \cap (\mathbb{R} \times (a_2,b_2))\big) = \mu((a_1,b_1) \times (a_2,b_2)) = \mu_1((a_1,b_1))\,\mu_2((a_2,b_2)),$$

where the last equality holds by independence.

2. If $B_i$ is the support of $\mu_i$, then $\mu$ is supported on $B_1 \times B_2$, but $B_1 \times B_2$ may not be its support. In other words, even if $\mu_i(A_i) > 0$ for both $A_i \subset B_i$, this does not imply that $\mu(A_1 \times A_2) > 0$.

3. If for $B \in \mathcal{B} \otimes \mathcal{B}$ the distribution $\mu$ is given by a density function,

$$\mu(B) = \iint_B f_X(x_1, x_2)\, dx_1\, dx_2,$$

then for $B_1 \in \mathcal{B}$ the marginal distribution is $\mu_1(B_1) = \int_{B_1} f_{X_1}(x_1)\, dx_1$, where

$$f_{X_1}(x_1) = \int_{\mathbb{R}} f_X(x_1, x_2)\, dx_2$$

is called the marginal density. If in addition $X_1 \perp X_2$, then $f_X = f_{X_1} f_{X_2}$. In fact, when marginal densities exist, the previous statement is "if and only if."

4. If $X = (X_1, \ldots, X_n) \in \mathbb{R}^n$ with $n \ge 3$, then the marginal distributions described above are the one-point or one-dimensional marginals. More generally, the $k$-point or $k$-dimensional marginals are the distributions of vectors $(X_{j_1}, X_{j_2}, \ldots, X_{j_k})$ where $1 \le k < n$.

Example 2.12. Suppose the distribution of $(X, Y)$ is

$$\mu(A) = \iint_A e^{-y}\, 1_{\{0 < x < y\}}\, dx\, dy;$$

then

$$f_X(x) = \int_x^{\infty} e^{-y}\, dy = e^{-x} \quad \text{for } x > 0$$

and

$$f_Y(y) = \int_0^{y} e^{-y}\, dx = e^{-y} \int_0^{y} 1\, dx = y e^{-y} \quad \text{for } y > 0.$$

Hence, since $e^{-y}\, 1_{\{0 < x < y\}} \ne y e^{-y} e^{-x}\, 1_{\{x > 0,\, y > 0\}}$, we can conclude that $X$ and $Y$ are not independent.
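The marginal densities of Example 2.12 can be checked numerically. The sketch below integrates the joint density $e^{-y} 1_{\{0<x<y\}}$ with a midpoint rule and compares the result with $e^{-x}$ and $y e^{-y}$ at one test point; the test point, the truncation level `upper`, and the helper names are illustrative assumptions.

```python
import math


def joint(x, y):
    """Joint density e^{-y} on the region 0 < x < y, zero elsewhere."""
    return math.exp(-y) if 0.0 < x < y else 0.0


def integrate(func, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))


def marginal_x(x, upper=60.0):
    # Integrate out y over (x, upper); the tail beyond `upper` is negligible here.
    return integrate(lambda y: joint(x, y), x, upper)


def marginal_y(y):
    # Integrate out x over (0, y), where the joint density is positive.
    return integrate(lambda x: joint(x, y), 0.0, y)


x0, y0 = 1.0, 2.0
fx, fy = marginal_x(x0), marginal_y(y0)
print(fx, math.exp(-x0))       # numerical vs closed-form marginal of X
print(fy, y0 * math.exp(-y0))  # numerical vs closed-form marginal of Y
print(joint(x0, y0), fx * fy)  # joint density vs product of marginals
```

At the test point the joint density and the product of the marginals differ visibly, which is the pointwise failure of factorization used in the example to rule out independence.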

Proposition 2.13. If $X$ and $Y$ are independent and $E|X| < \infty$ and $E|Y| < \infty$, then $E|XY| < \infty$ and

$$EXY \equiv E(XY) = EX\,EY.$$

Proof. First note that if we let $X = 1_A$ and $Y = 1_B$, then

$$EXY = E 1_{A \cap B} = P(A \cap B) = P(A)P(B) = EX\,EY.$$

The idea is that the class of indicator random variables are in some sense the building blocks of all random variables. The next step is to use linearity to extend the result to all simple functions of the type

$$X = \sum_{i=1}^{n} c_i 1_{A_i}.$$

Then, using the Simple Approximation Lemma and the Monotone Convergence Theorem, one can show that the result holds for all $X \ge 0$ and $Y \ge 0$. Finally, considering the negative and nonnegative parts of arbitrary $X$ and $Y$ separately completes the proof.

Independence is hugely important in probability theory, and most of the fundamental theorems and basic models in the sequel are built on independent random variables at some level. These theorems and models are, however, only starting points for more complex models which require some level of dependence in order to be more realistic. Thus, we now quickly introduce the most basic tools for measuring dependence.

Definition 2.14.
(i) If $-\infty < EXY = EX\,EY < \infty$, then $X$ and $Y$ are said to be uncorrelated. Note that uncorrelated does not imply independence.
(ii) If $\infty > EXY > EX\,EY > -\infty$, then $X$ and $Y$ are positively correlated.
(iii) If $-\infty < EXY < EX\,EY < \infty$, then $X$ and $Y$ are negatively correlated.

Definition 2.15.
(i) The covariance of $X$ and $Y$ is

$$\operatorname{Cov}(X, Y) := E[(X - EX)(Y - EY)] = EXY - EX\,EY.$$

(ii) The correlation² of $X$ and $Y$ is

$$\operatorname{Corr}(X, Y) := \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var} X}\,\sqrt{\operatorname{Var} Y}}.$$

² The notion of correlation is extremely important in probability; it represents one of the first ways of measuring dependence, hence providing a foil to independence. The modern form of correlation is due to Pearson, but it is generally recognized (including by Pearson himself) that Galton invented this concept in 1888, after many writings on similar ideas. On a related note, Pearson is also the first person to use the term standard deviation [Sti89].

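(iii) The covariance matrix $\Sigma$ associated to a random vector $(X_1, \ldots, X_n) \in \mathbb{R}^n$ is

$$\Sigma = \begin{pmatrix} \operatorname{Cov}(X_1, X_1) & \operatorname{Cov}(X_1, X_2) & \cdots & \operatorname{Cov}(X_1, X_n) \\ \operatorname{Cov}(X_2, X_1) & \operatorname{Cov}(X_2, X_2) & \cdots & \operatorname{Cov}(X_2, X_n) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{Cov}(X_n, X_1) & \operatorname{Cov}(X_n, X_2) & \cdots & \operatorname{Cov}(X_n, X_n) \end{pmatrix},$$

where each entry $\Sigma_{ij} = \operatorname{Cov}(X_i, X_j)$.

Remarks 2.16.

1. It is easy to see that $\operatorname{Cov}(X, X) = \operatorname{Var} X$.

2. We have $\operatorname{Cov}(aX, bY) = ab \operatorname{Cov}(X, Y)$, which implies $\operatorname{Var}(cX) = c^2 \operatorname{Var} X$ and also

$$\underbrace{\sqrt{\operatorname{Var}(cX)}}_{\sigma_{cX}} = \underbrace{|c| \sqrt{\operatorname{Var}(X)}}_{|c|\,\sigma_X},$$

where $\sigma_X$ is called the standard deviation of $X$.

3. We have $\operatorname{Cov}(X, Y + Z) = \operatorname{Cov}(Y + Z, X) = \operatorname{Cov}(X, Y) + \operatorname{Cov}(X, Z)$, which together with item 2 shows that covariance is a symmetric, bilinear form.

4. A very common formula is

$$\operatorname{Var}\Big(\sum_{i=1}^{n} X_i\Big) = \sum_{i,j} \operatorname{Cov}(X_i, X_j),$$

which is the same as summing all the entries of the covariance matrix for $(X_1, \ldots, X_n)$.

Exercise 2.2. Show that $|\operatorname{Corr}(X, Y)| \le 1$. Also, find when $\operatorname{Corr}(X, Y) = 1$ and when $\operatorname{Corr}(X, Y) = -1$.

Exercise 2.3. Show that the covariance matrix $\Sigma$ of any random vector $X$ must be positive semi-definite, i.e., $v^T \Sigma v \ge 0$ for all $v \in \mathbb{R}^n$. One way to do this is to consider the variance of the scalar random variable given by the dot product $v \cdot X$. Conversely, show that any positive semi-definite matrix $\Sigma$ is the covariance matrix of some random vector. Hint: for the converse direction, take a random vector $X$ whose marginals are all independent and which each have variance one. Since $\Sigma$ is positive semi-definite, the square-root matrix $\sqrt{\Sigma}$ is well-defined. Calculate the covariance matrix for the random vector $\sqrt{\Sigma}\, X$.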
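The formula in Remark 4 and the quadratic form in Exercise 2.3 can be illustrated with exact arithmetic on a small finite probability space. In the sketch below, the random vector takes each of four hand-picked values with equal probability; the data, the test vector `v`, and the helper names are assumptions for illustration only.

```python
from fractions import Fraction

# A random vector taking each listed value with probability 1/4 (illustrative data).
POINTS = [(1, 2, 0), (0, 1, 3), (2, 2, 1), (1, 0, 4)]


def expect(func):
    """Exact expectation of func(X) under the uniform measure on POINTS."""
    return sum(Fraction(func(p)) for p in POINTS) / len(POINTS)


def variance(func):
    return expect(lambda p: func(p) ** 2) - expect(func) ** 2


def cov(i, j):
    """Cov(X_i, X_j) = E[X_i X_j] - E[X_i] E[X_j]."""
    return expect(lambda p: p[i] * p[j]) - expect(lambda p: p[i]) * expect(lambda p: p[j])


n = len(POINTS[0])
sigma = [[cov(i, j) for j in range(n)] for i in range(n)]

# Remark 4: the variance of the sum equals the sum of all covariance-matrix entries.
var_of_sum = variance(lambda p: sum(p))
sum_of_entries = sum(sum(row) for row in sigma)

# Exercise 2.3 idea: v^T Sigma v is the variance of the dot product v . X.
v = (3, -1, 2)
quadratic_form = sum(v[i] * sigma[i][j] * v[j] for i in range(n) for j in range(n))
var_of_dot = variance(lambda p: sum(vi * pi for vi, pi in zip(v, p)))

print(var_of_sum == sum_of_entries, quadratic_form == var_of_dot, quadratic_form >= 0)
```

Because `Fraction` arithmetic is exact, the identities hold with equality rather than up to rounding.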

Example 2.17 (Multivariate Gaussian distribution). Covariance matrices are especially useful when dealing with Gaussian distributions in high dimensions. In particular, if $\mu = (\mu_1, \ldots, \mu_n) \in \mathbb{R}^n$ and $\Sigma$ is a positive definite matrix, then the $n$-dimensional Gaussian distribution with mean vector $\mu$ and covariance matrix $\Sigma$ has the density

$$(2\pi)^{-n/2} (\det \Sigma)^{-1/2} \exp\Big(-\tfrac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu)\Big) \quad \text{for } x \in \mathbb{R}^n.$$

We are nearing our first big result in probability theory (both historically and in traditional pedagogy), which uses independence to analyze the limiting behavior of normalized sums of i.i.d. random variables. The limiting behavior is described by the so-called Law of Large Numbers attributed to Jakob Bernoulli [Ber13]. However, before moving to the limiting behavior, let us first present a method for obtaining a complete description of the distribution of sums of a fixed finite number of independent random variables.

Definition 2.18. If $\mu_X$ and $\mu_Y$ are distributions on $\mathbb{R}$ corresponding to independent random variables $X$ and $Y$, then their convolution, defined in terms of the product measure by

$$\mu_X * \mu_Y(A) := \mu_X \otimes \mu_Y(\{(x, y) : x + y \in A\}),$$

is the distribution of the sum $X + Y$.

Remarks 2.19.

1. It is very important that $X + Y$ is a sum of independent random variables.

2. The convolution is a distribution on $\mathbb{R}$ even though it is defined in terms of a distribution on $\mathbb{R}^2$.

3. This notion extends to random vectors $X, Y \in \mathbb{R}^n$ (as long as they are both in the same dimension $n$).

Exercise 2.4 (Poisson distribution). We say that $X \sim \mathrm{Poiss}(\lambda)$ has a Poisson³ distribution with mean $\lambda$ if

$$P(X = k) = e^{-\lambda} \frac{\lambda^k}{k!} \quad \text{for integers } k \in [0, \infty).$$

If $X \sim \mathrm{Poiss}(\lambda)$ and $Y \sim \mathrm{Poiss}(\kappa)$ are independent, use convolution to show $X + Y \sim \mathrm{Poiss}(\lambda + \kappa)$. If $X \sim \mathrm{Bin}(n, p)$ and $Y \sim \mathrm{Bin}(m, p)$ are independent, use convolution to show $X + Y \sim \mathrm{Bin}(n + m, p)$.

³ As noted in [JKK05, p. 157], this distribution and its corresponding convergence theorem were most likely first discovered by A. De Moivre in 1711, well before Poisson's time. This is an example of Stigler's Law, the notion that the name attached to a mathematical theorem is never the person that actually discovered the theorem.
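For distributions on the nonnegative integers, the convolution of Definition 2.18 reduces to the sum $P(X + Y = k) = \sum_{j=0}^{k} P(X = j)\,P(Y = k - j)$. The sketch below evaluates this sum numerically for two Poisson distributions and compares it with the $\mathrm{Poiss}(\lambda + \kappa)$ probabilities. It is only a numerical sanity check, not the argument requested in Exercise 2.4; the parameter values and function names are illustrative assumptions.

```python
import math


def poisson_pmf(lam, k):
    """P(X = k) for X ~ Poiss(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def convolve_pmf(p, q, k):
    """P(X + Y = k) for independent nonnegative-integer-valued X, Y with pmfs p, q."""
    return sum(p(j) * q(k - j) for j in range(k + 1))


lam, kappa = 1.5, 2.25  # illustrative means

max_gap = max(
    abs(
        convolve_pmf(
            lambda j: poisson_pmf(lam, j),
            lambda j: poisson_pmf(kappa, j),
            k,
        )
        - poisson_pmf(lam + kappa, k)
    )
    for k in range(30)
)
print(max_gap)
```

The printed gap is at the level of floating-point rounding, consistent with the claim of the exercise.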

Proposition 2.20 (Convolution is a semigroup). As an operation, convolution is commutative and associative. Moreover, $\delta_0$ is an identity with respect to convolution of distributions.

Proof. Commutativity and associativity follow from these properties for addition (of independent random variables). Similarly, since $X \equiv 0$ is an identity for addition of independent random variables, its distribution $\delta_0$ is the identity for the operation of convolution.

Remarks 2.21.

1. Convolution is not a group, since the only possible inverse of $X$ would be $-X$, but these are clearly not independent.

2. The $n$-fold convolution of $\mu_X$ with itself corresponds to the sum of $n$ i.i.d. random variables which have the same distribution as $X$. It is denoted by $\nu = \mu_X^{*n}$. Equivalently, we say that $\mu_X$ is the $n$th convolution root of $\nu$, and we may write $\nu^{*1/n} = \mu_X$.

3. If the $n$th convolution root of $\nu$ exists for every $n \in \mathbb{N}$, we say that $\nu$ is an infinitely divisible distribution.

Exercise 2.5. Show that the $n$th convolution root of the Gaussian distribution $N(\mu, \sigma^2)$ is the Gaussian distribution $N(\mu/n, \sigma^2/n)$. In particular, every Gaussian distribution is infinitely divisible.

Proposition 2.22. Let $A - y = \{z : z + y \in A\}$. The convolution of $\mu_X$ and $\mu_Y$ is given by

$$\mu_X * \mu_Y(A) = \int_{\mathbb{R}} \mu_X(A - y)\, \mu_Y(dy).$$

If $\mu_X$ and $\mu_Y$ have associated densities $f_X$ and $f_Y$, then $\mu_X * \mu_Y$ also has a density, which is given by

$$f_{X+Y}(z) = \int_{\mathbb{R}} f_X(z - y) f_Y(y)\, dy.$$

Proof. Set $B := \{(x, y) : x + y \in A\}$. Then, using Tonelli's Theorem, we have

$$\mu_X * \mu_Y(A) = \int_{\mathbb{R}} \Big( \int_{\mathbb{R}} 1_B(x, y)\, \mu_X(dx) \Big) \mu_Y(dy) = \int_{\mathbb{R}} \mu_X(A - y)\, \mu_Y(dy).$$

For the second part, integrate the function

$$g(z) := \int_{\mathbb{R}} f_X(z - y) f_Y(y)\, dy$$

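over a set $A \in \mathcal{B}$ to get

$$\int_A g(x)\, dx = \int_A \Big( \int_{\mathbb{R}} f_X(x - y) f_Y(y)\, dy \Big) dx = \int_{\mathbb{R}} \Big( \int_A f_X(x - y)\, dx \Big) f_Y(y)\, dy = \int_{\mathbb{R}} \mu_X(A - y) f_Y(y)\, dy.$$

The right side is equal to $\mu_X * \mu_Y(A)$ by the first part of the proposition, thus $g$ must be the density corresponding to the distribution $\mu_X * \mu_Y$.

Example 2.23 (Gamma distribution). We say that $X$ has a Gamma distribution with rate $\lambda > 0$ and shape parameter $\nu > 0$, and write $X \sim \mathrm{Gamma}(\nu, \lambda)$, if

$$\mu_X(A) = \int_{A \cap [0, \infty)} \frac{\lambda^{\nu} x^{\nu - 1}}{\Gamma(\nu)} e^{-\lambda x}\, dx \quad \text{for all } A \in \mathcal{B}.$$

Note that when $\nu = 1$, this is just an exponential distribution with rate $\lambda$. For $\nu \in \mathbb{N}$, $\Gamma(\nu) = (\nu - 1)!$, whereas for other values this is the well-known Gamma function. If $X \sim \mathrm{Gamma}(\nu_1, \lambda)$ and $Y \sim \mathrm{Gamma}(\nu_2, \lambda)$ are independent, then by Proposition 2.22, $X + Y$ has a density given, for $z > 0$, by

$$f_{X+Y}(z) = \int_0^z \frac{\lambda^{\nu_1 + \nu_2}}{\Gamma(\nu_1)\Gamma(\nu_2)} (z - y)^{\nu_1 - 1} e^{-\lambda(z - y)}\, y^{\nu_2 - 1} e^{-\lambda y}\, dy$$

$$= \frac{\lambda^{\nu_1 + \nu_2} e^{-\lambda z}}{\Gamma(\nu_1 + \nu_2)} \int_0^z \frac{\Gamma(\nu_1 + \nu_2)}{\Gamma(\nu_1)\Gamma(\nu_2)} (z - y)^{\nu_1 - 1} y^{\nu_2 - 1}\, dy$$

$$= \frac{\lambda^{\nu_1 + \nu_2} z^{\nu_1 + \nu_2 - 1} e^{-\lambda z}}{\Gamma(\nu_1 + \nu_2)} \int_0^1 \frac{\Gamma(\nu_1 + \nu_2)}{\Gamma(\nu_1)\Gamma(\nu_2)} (1 - u)^{\nu_1 - 1} u^{\nu_2 - 1}\, du = \frac{\lambda^{\nu_1 + \nu_2} z^{\nu_1 + \nu_2 - 1} e^{-\lambda z}}{\Gamma(\nu_1 + \nu_2)},$$

where we substituted $y = zu$ and $dy = z\, du$, and the remaining integral is seen to be equal to one because its integrand is a Beta density (or simply use the fact that the right side must be a density). Thus we see that the sum of two independent Gamma random variables with the same rate gives us another Gamma random variable, namely $X + Y \sim \mathrm{Gamma}(\nu_1 + \nu_2, \lambda)$.

2.2 Weak Law of Large Numbers

Definition 2.24. We say the sequence $(X_n, n \in \mathbb{N})$ converges in probability to $X$, denoted by $X_n \xrightarrow{pr} X$, if for every $\epsilon > 0$ there exists an $N$ such that $n \ge N$ implies that

$$P(|X_n - X| > \epsilon) < \epsilon.$$

Exercise 2.6. Show that Fatou's Lemma, the Dominated Convergence Theorem, and the Monotone Convergence Theorem all remain valid if we replace convergence a.s. with convergence in probability.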

Exercise 2.7. Suppose a function $h : \mathbb{R} \to \mathbb{R}$ is continuous. If $X_n \xrightarrow{pr} X$, then $h(X_n) \xrightarrow{pr} h(X)$. If $X_n \xrightarrow{a.s.} X$, then $h(X_n) \xrightarrow{a.s.} h(X)$. These are versions of what is known as the Continuous Mapping Theorem⁴.

Example 2.25 (The shrinking and revolving interval). Set

$$f_1(x) = 1_{[0,1]}, \quad f_2(x) = 1_{[1,\, 1 + \frac{1}{2}]}, \quad \ldots, \quad f_n(x) = 1_{\left[\sum_{k=1}^{n-1} \frac{1}{k},\ \sum_{k=1}^{n} \frac{1}{k}\right]}.$$

Considering the intervals above modulo 1, we set

$$g_1(x) = 1_{[0,1]}, \quad g_2(x) = 1_{[0,\, \frac{1}{2}]}, \quad \ldots, \quad g_n(x) = 1_{\left[\sum_{k=1}^{n-1} \frac{1}{k} \ (\mathrm{mod}\ 1),\ \sum_{k=1}^{n} \frac{1}{k} \ (\mathrm{mod}\ 1)\right]},$$

where modulo 1 simply means that we slide back to $[0,1]$ (if the left endpoint becomes greater than the right endpoint, modulo 1, we split the interval in two in the natural way). Then for any fixed $\omega \in [0,1]$, $g_n(\omega) = 1$ for infinitely many $n$. The sequence $(g_n, n \in \mathbb{N})$ converges pointwise nowhere, but if we let $g \equiv 0$, then $P(|g_n - g| > \epsilon) = \frac{1}{n}$ for $0 < \epsilon < 1$. Hence, the sequence converges in probability.

Example 2.26. Let the random variable $g_i = 1_{[i, i+1]}$. If $(\mathbb{R}, \mathcal{B}, \mu)$ is a probability space, then for all $x$, $\lim_{i \to \infty} g_i(x) = 0$, and hence $g_i \xrightarrow{a.s.} 0$. Also, $g_i \xrightarrow{pr} 0$. For each $\epsilon > 0$, simply choose $N_\epsilon$ large enough such that $\mu([-N_\epsilon, N_\epsilon]) > 1 - \epsilon$. Then for $n > N_\epsilon$, $P(|g_n - 0| > \epsilon) < \epsilon$. Note that in this example, if one uses Lebesgue measure instead of the probability measure $\mu$, then the sequence does not converge in measure; see for example [RF10].

Theorem 2.27 (Weak Law of Large Numbers, finite 2nd moments). If $\{X_n, n \in \mathbb{N}\}$ are independent and identically distributed (i.i.d.) and $EX_1^2 < \infty$, then

$$\frac{S_n}{n} \xrightarrow{pr} EX_1,$$

where $S_n = X_1 + \cdots + X_n$. In fact, the assumption of independence in the above can be weakened to $\operatorname{Cov}(X_i, X_j) \le 0$ for all $i \ne j$.

⁴ This result and analogs were proved in [MW43].
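Before turning to the proof, it may help to see the statement of Theorem 2.27 in numbers. The sketch below takes i.i.d. Bernoulli($p$) variables, so that $S_n \sim \mathrm{Bin}(n, p)$, and computes $P(|S_n/n - p| > \epsilon)$ exactly (up to floating point) for growing $n$, next to the quantity $\operatorname{Var} X_1 / (n \epsilon^2)$ that appears as an upper bound in the proof that follows. The choice of Bernoulli variables, the parameter values, and the function names are illustrative assumptions.

```python
import math


def binomial_pmf(n, k, p):
    """P(S_n = k) for S_n ~ Bin(n, p), computed in log space to avoid overflow."""
    log_pmf = (
        math.lgamma(n + 1)
        - math.lgamma(k + 1)
        - math.lgamma(n - k + 1)
        + k * math.log(p)
        + (n - k) * math.log(1.0 - p)
    )
    return math.exp(log_pmf)


def deviation_probability(n, p, eps):
    """P(|S_n / n - p| > eps) for a sum S_n of n i.i.d. Bernoulli(p) variables."""
    return sum(binomial_pmf(n, k, p) for k in range(n + 1) if abs(k / n - p) > eps)


def chebyshev_bound(n, p, eps):
    """Var(X_1) / (n eps^2), with Var(X_1) = p (1 - p) for Bernoulli(p)."""
    return p * (1.0 - p) / (n * eps ** 2)


p, eps = 0.3, 0.05  # illustrative parameters
rows = [
    (n, deviation_probability(n, p, eps), chebyshev_bound(n, p, eps))
    for n in (100, 400, 1600)
]
for n, exact, bound in rows:
    print(n, exact, bound)
```

In this example the exact deviation probabilities shrink much faster than the bound, which is typical: the Chebyshev-type estimate is enough to prove convergence in probability but is far from sharp.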

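Proof. By Chebyshev's Inequality, we obtain

$$P\Big(\Big|\frac{S_n}{n} - EX_1\Big| > \epsilon\Big) \le \frac{E\big|\frac{S_n}{n} - EX_1\big|^2}{\epsilon^2}$$

$$= \frac{\operatorname{Var}(S_n / n)}{\epsilon^2} \quad \text{(since } E(S_n/n) = nEX_1/n = EX_1 < \infty\text{)}$$

$$= \frac{1}{n^2 \epsilon^2} \operatorname{Var}(X_1 + \cdots + X_n)$$

$$= \frac{1}{n^2 \epsilon^2} \sum_{i=1}^{n} \sum_{j=1}^{n} \operatorname{Cov}(X_i, X_j)$$

$$\le \frac{1}{n^2 \epsilon^2}\, n \operatorname{Var} X_1 \quad \text{(since } \operatorname{Cov}(X_i, X_j) \le 0 \text{ for all } i \ne j\text{)}$$

$$= \frac{1}{n \epsilon^2} \operatorname{Var} X_1 \to 0.$$

Remarks 2.28.

1. The result is not generally valid for positively correlated random variables. If $EX_1^2 < \infty$ and all $X_i$ are identical, i.e., $X_i = X_1$ (think of $n$ people observing the same coin toss), then $\frac{S_n}{n} = \frac{nX_1}{n} = X_1 \ne EX_1$, unless $X_1 \equiv$ constant.

2. The assumption of being identically distributed can also be relaxed. For example, one can use $\{X_n, n \in \mathbb{N}\}$ which have bounded variances and are pairwise uncorrelated or negatively correlated. Then, without changing the proof too much, one can obtain

$$\frac{S_n - E(S_n)}{n} \xrightarrow{pr} 0.$$

Definition 2.29. We say that $(X_n, n \in \mathbb{N})$ converges in $L^p$, and write $X_n \xrightarrow{L^p} X$, if

$$\lim_{n \to \infty} E|X_n - X|^p = 0.$$

Remarks 2.30.

1. By Chebyshev's Inequality, for $p > 0$,

$$P(|X_n - X| > \epsilon) \le \frac{E|X_n - X|^p}{\epsilon^p},$$

and thus convergence in $L^p$ implies convergence in probability.

2. In the proof of the Weak Law of Large Numbers (WLLN), we actually proved the stronger result that $(\frac{S_n}{n}, n \in \mathbb{N})$ converges in $L^2$ to $EX_1$.

3. The shrinking, revolving interval in Example 2.25 shows that it is possible for a sequence to converge in $L^p$, but not almost surely.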
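The last remark can be made tangible by computing the intervals of Example 2.25 exactly. The sketch below uses rational arithmetic to build the support of $g_n$ (the interval between consecutive harmonic sums, taken modulo 1), confirms that its total length is $1/n$ (so $E|g_n|^p = 1/n \to 0$), and lists the indices $n < 200$ at which a fixed point $\omega$ is covered. The point $\omega = 1/3$, the range of $n$, and the function names are illustrative assumptions.

```python
from fractions import Fraction


def arcs(n):
    """Pieces of [0, 1] covered by [H_{n-1}, H_n] modulo 1, with H_n the nth harmonic sum."""
    left = sum(Fraction(1, k) for k in range(1, n)) % 1
    right = left + Fraction(1, n)
    if right <= 1:
        return [(left, right)]
    # The interval wraps past 1: split it in two in the natural way.
    return [(left, Fraction(1)), (Fraction(0), right - 1)]


def g(n, omega):
    """Value of the indicator g_n at the point omega."""
    return int(any(lo <= omega <= hi for lo, hi in arcs(n)))


def total_length(n):
    """Lebesgue measure of the support of g_n, i.e. E|g_n|^p for any p > 0."""
    return sum(hi - lo for lo, hi in arcs(n))


omega = Fraction(1, 3)
hits = [n for n in range(1, 200) if g(n, omega) == 1]
print(hits)
print([total_length(n) for n in (1, 2, 5, 50)])
```

The list of hits keeps growing as the range of $n$ is extended, since the harmonic sums diverge and every revolution sweeps over $\omega$ once, while the lengths $1/n$ tend to zero: convergence in $L^p$ without convergence at any single point.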


More information

Mathematics 170B Selected HW Solutions.

Mathematics 170B Selected HW Solutions. Mathematics 17B Selected HW Solutios. F 4. Suppose X is B(,p). (a)fidthemometgeeratigfuctiom (s)of(x p)/ p(1 p). Write q = 1 p. The MGF of X is (pe s + q), sice X ca be writte as the sum of idepedet Beroulli

More information

6.3 Testing Series With Positive Terms

6.3 Testing Series With Positive Terms 6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial

More information

Probability Theory. Muhammad Waliji. August 11, 2006

Probability Theory. Muhammad Waliji. August 11, 2006 Probability Theory Muhammad Waliji August 11, 2006 Abstract This paper itroduces some elemetary otios i Measure-Theoretic Probability Theory. Several probabalistic otios of the covergece of a sequece of

More information

The Borel hierarchy classifies subsets of the reals by their topological complexity. Another approach is to classify them by size.

The Borel hierarchy classifies subsets of the reals by their topological complexity. Another approach is to classify them by size. Lecture 7: Measure ad Category The Borel hierarchy classifies subsets of the reals by their topological complexity. Aother approach is to classify them by size. Filters ad Ideals The most commo measure

More information

STAT Homework 2 - Solutions

STAT Homework 2 - Solutions STAT-36700 Homework - Solutios Fall 08 September 4, 08 This cotais solutios for Homework. Please ote that we have icluded several additioal commets ad approaches to the problems to give you better isight.

More information

Random Variables, Sampling and Estimation

Random Variables, Sampling and Estimation Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig

More information

January 25, 2017 INTRODUCTION TO MATHEMATICAL STATISTICS

January 25, 2017 INTRODUCTION TO MATHEMATICAL STATISTICS Jauary 25, 207 INTRODUCTION TO MATHEMATICAL STATISTICS Abstract. A basic itroductio to statistics assumig kowledge of probability theory.. Probability I a typical udergraduate problem i probability, we

More information

Lecture 7: Properties of Random Samples

Lecture 7: Properties of Random Samples Lecture 7: Properties of Radom Samples 1 Cotiued From Last Class Theorem 1.1. Let X 1, X,...X be a radom sample from a populatio with mea µ ad variace σ

More information

Learning Theory: Lecture Notes

Learning Theory: Lecture Notes Learig Theory: Lecture Notes Kamalika Chaudhuri October 4, 0 Cocetratio of Averages Cocetratio of measure is very useful i showig bouds o the errors of machie-learig algorithms. We will begi with a basic

More information

Introduction to Probability. Ariel Yadin

Introduction to Probability. Ariel Yadin Itroductio to robability Ariel Yadi Lecture 2 *** Ja. 7 ***. Covergece of Radom Variables As i the case of sequeces of umbers, we would like to talk about covergece of radom variables. There are may ways

More information

Probability and Measure

Probability and Measure Probability ad Measure Stefa Grosskisky Cambridge, Michaelmas 2005 These otes ad other iformatio about the course are available o www.statslab.cam.ac.uk/ stefa/teachig/probmeas.html The text is based o

More information

Basics of Probability Theory (for Theory of Computation courses)

Basics of Probability Theory (for Theory of Computation courses) Basics of Probability Theory (for Theory of Computatio courses) Oded Goldreich Departmet of Computer Sciece Weizma Istitute of Sciece Rehovot, Israel. oded.goldreich@weizma.ac.il November 24, 2008 Preface.

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013. Large Deviations for i.i.d. Random Variables

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013. Large Deviations for i.i.d. Random Variables MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013 Large Deviatios for i.i.d. Radom Variables Cotet. Cheroff boud usig expoetial momet geeratig fuctios. Properties of a momet

More information

The Boolean Ring of Intervals

The Boolean Ring of Intervals MATH 532 Lebesgue Measure Dr. Neal, WKU We ow shall apply the results obtaied about outer measure to the legth measure o the real lie. Throughout, our space X will be the set of real umbers R. Whe ecessary,

More information

Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 19

Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 19 CS 70 Discrete Mathematics ad Probability Theory Sprig 2016 Rao ad Walrad Note 19 Some Importat Distributios Recall our basic probabilistic experimet of tossig a biased coi times. This is a very simple

More information

6 Integers Modulo n. integer k can be written as k = qn + r, with q,r, 0 r b. So any integer.

6 Integers Modulo n. integer k can be written as k = qn + r, with q,r, 0 r b. So any integer. 6 Itegers Modulo I Example 2.3(e), we have defied the cogruece of two itegers a,b with respect to a modulus. Let us recall that a b (mod ) meas a b. We have proved that cogruece is a equivalece relatio

More information

(A sequence also can be thought of as the list of function values attained for a function f :ℵ X, where f (n) = x n for n 1.) x 1 x N +k x N +4 x 3

(A sequence also can be thought of as the list of function values attained for a function f :ℵ X, where f (n) = x n for n 1.) x 1 x N +k x N +4 x 3 MATH 337 Sequeces Dr. Neal, WKU Let X be a metric space with distace fuctio d. We shall defie the geeral cocept of sequece ad limit i a metric space, the apply the results i particular to some special

More information

Law of the sum of Bernoulli random variables

Law of the sum of Bernoulli random variables Law of the sum of Beroulli radom variables Nicolas Chevallier Uiversité de Haute Alsace, 4, rue des frères Lumière 68093 Mulhouse icolas.chevallier@uha.fr December 006 Abstract Let be the set of all possible

More information

Lecture 12: November 13, 2018

Lecture 12: November 13, 2018 Mathematical Toolkit Autum 2018 Lecturer: Madhur Tulsiai Lecture 12: November 13, 2018 1 Radomized polyomial idetity testig We will use our kowledge of coditioal probability to prove the followig lemma,

More information

HOMEWORK I: PREREQUISITES FROM MATH 727

HOMEWORK I: PREREQUISITES FROM MATH 727 HOMEWORK I: PREREQUISITES FROM MATH 727 Questio. Let X, X 2,... be idepedet expoetial radom variables with mea µ. (a) Show that for Z +, we have EX µ!. (b) Show that almost surely, X + + X (c) Fid the

More information

Fall 2013 MTH431/531 Real analysis Section Notes

Fall 2013 MTH431/531 Real analysis Section Notes Fall 013 MTH431/531 Real aalysis Sectio 8.1-8. Notes Yi Su 013.11.1 1. Defiitio of uiform covergece. We look at a sequece of fuctios f (x) ad study the coverget property. Notice we have two parameters

More information

Archimedes - numbers for counting, otherwise lengths, areas, etc. Kepler - geometry for planetary motion

Archimedes - numbers for counting, otherwise lengths, areas, etc. Kepler - geometry for planetary motion Topics i Aalysis 3460:589 Summer 007 Itroductio Ree descartes - aalysis (breaig dow) ad sythesis Sciece as models of ature : explaatory, parsimoious, predictive Most predictios require umerical values,

More information

Sequences. Notation. Convergence of a Sequence

Sequences. Notation. Convergence of a Sequence Sequeces A sequece is essetially just a list. Defiitio (Sequece of Real Numbers). A sequece of real umbers is a fuctio Z (, ) R for some real umber. Do t let the descriptio of the domai cofuse you; it

More information

Chapter 6 Principles of Data Reduction

Chapter 6 Principles of Data Reduction Chapter 6 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 0 Chapter 6 Priciples of Data Reductio Sectio 6. Itroductio Goal: To summarize or reduce the data X, X,, X to get iformatio about a

More information

Math 10A final exam, December 16, 2016

Math 10A final exam, December 16, 2016 Please put away all books, calculators, cell phoes ad other devices. You may cosult a sigle two-sided sheet of otes. Please write carefully ad clearly, USING WORDS (ot just symbols). Remember that the

More information

Lecture 20: Multivariate convergence and the Central Limit Theorem

Lecture 20: Multivariate convergence and the Central Limit Theorem Lecture 20: Multivariate covergece ad the Cetral Limit Theorem Covergece i distributio for radom vectors Let Z,Z 1,Z 2,... be radom vectors o R k. If the cdf of Z is cotiuous, the we ca defie covergece

More information

Chapter 0. Review of set theory. 0.1 Sets

Chapter 0. Review of set theory. 0.1 Sets Chapter 0 Review of set theory Set theory plays a cetral role i the theory of probability. Thus, we will ope this course with a quick review of those otios of set theory which will be used repeatedly.

More information

Part II Probability and Measure

Part II Probability and Measure Part II Probability ad Measure Based o lectures by J. Miller Notes take by Dexter Chua Michaelmas 2016 These otes are ot edorsed by the lecturers, ad I have modified them (ofte sigificatly) after lectures.

More information

f X (12) = Pr(X = 12) = Pr({(6, 6)}) = 1/36

f X (12) = Pr(X = 12) = Pr({(6, 6)}) = 1/36 Probability Distributios A Example With Dice If X is a radom variable o sample space S, the the probablity that X takes o the value c is Similarly, Pr(X = c) = Pr({s S X(s) = c} Pr(X c) = Pr({s S X(s)

More information

MA131 - Analysis 1. Workbook 3 Sequences II

MA131 - Analysis 1. Workbook 3 Sequences II MA3 - Aalysis Workbook 3 Sequeces II Autum 2004 Cotets 2.8 Coverget Sequeces........................ 2.9 Algebra of Limits......................... 2 2.0 Further Useful Results........................

More information

Sequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

Sequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece 1, 1, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet

More information

6a Time change b Quadratic variation c Planar Brownian motion d Conformal local martingales e Hints to exercises...

6a Time change b Quadratic variation c Planar Brownian motion d Conformal local martingales e Hints to exercises... Tel Aviv Uiversity, 28 Browia motio 59 6 Time chage 6a Time chage..................... 59 6b Quadratic variatio................. 61 6c Plaar Browia motio.............. 64 6d Coformal local martigales............

More information

Precise Rates in Complete Moment Convergence for Negatively Associated Sequences

Precise Rates in Complete Moment Convergence for Negatively Associated Sequences Commuicatios of the Korea Statistical Society 29, Vol. 16, No. 5, 841 849 Precise Rates i Complete Momet Covergece for Negatively Associated Sequeces Dae-Hee Ryu 1,a a Departmet of Computer Sciece, ChugWoo

More information

ECE 330:541, Stochastic Signals and Systems Lecture Notes on Limit Theorems from Probability Fall 2002

ECE 330:541, Stochastic Signals and Systems Lecture Notes on Limit Theorems from Probability Fall 2002 ECE 330:541, Stochastic Sigals ad Systems Lecture Notes o Limit Theorems from robability Fall 00 I practice, there are two ways we ca costruct a ew sequece of radom variables from a old sequece of radom

More information

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 3 9/11/2013. Large deviations Theory. Cramér s Theorem

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 3 9/11/2013. Large deviations Theory. Cramér s Theorem MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/5.070J Fall 203 Lecture 3 9//203 Large deviatios Theory. Cramér s Theorem Cotet.. Cramér s Theorem. 2. Rate fuctio ad properties. 3. Chage of measure techique.

More information

1 Introduction to reducing variance in Monte Carlo simulations

1 Introduction to reducing variance in Monte Carlo simulations Copyright c 010 by Karl Sigma 1 Itroductio to reducig variace i Mote Carlo simulatios 11 Review of cofidece itervals for estimatig a mea I statistics, we estimate a ukow mea µ = E(X) of a distributio by

More information

Topic 8: Expected Values

Topic 8: Expected Values Topic 8: Jue 6, 20 The simplest summary of quatitative data is the sample mea. Give a radom variable, the correspodig cocept is called the distributioal mea, the epectatio or the epected value. We begi

More information

Chapter IV Integration Theory

Chapter IV Integration Theory Chapter IV Itegratio Theory Lectures 32-33 1. Costructio of the itegral I this sectio we costruct the abstract itegral. As a matter of termiology, we defie a measure space as beig a triple (, A, µ), where

More information

Problem Set 2 Solutions

Problem Set 2 Solutions CS271 Radomess & Computatio, Sprig 2018 Problem Set 2 Solutios Poit totals are i the margi; the maximum total umber of poits was 52. 1. Probabilistic method for domiatig sets 6pts Pick a radom subset S

More information

REAL ANALYSIS II: PROBLEM SET 1 - SOLUTIONS

REAL ANALYSIS II: PROBLEM SET 1 - SOLUTIONS REAL ANALYSIS II: PROBLEM SET 1 - SOLUTIONS 18th Feb, 016 Defiitio (Lipschitz fuctio). A fuctio f : R R is said to be Lipschitz if there exists a positive real umber c such that for ay x, y i the domai

More information

4. Basic probability theory

4. Basic probability theory Cotets Basic cocepts Discrete radom variables Discrete distributios (br distributios) Cotiuous radom variables Cotiuous distributios (time distributios) Other radom variables Lect04.ppt S-38.45 - Itroductio

More information

A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece,, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet as

More information

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss ECE 90 Lecture : Complexity Regularizatio ad the Squared Loss R. Nowak 5/7/009 I the previous lectures we made use of the Cheroff/Hoeffdig bouds for our aalysis of classifier errors. Hoeffdig s iequality

More information

17. Joint distributions of extreme order statistics Lehmann 5.1; Ferguson 15

17. Joint distributions of extreme order statistics Lehmann 5.1; Ferguson 15 17. Joit distributios of extreme order statistics Lehma 5.1; Ferguso 15 I Example 10., we derived the asymptotic distributio of the maximum from a radom sample from a uiform distributio. We did this usig

More information

s = and t = with C ij = A i B j F. (i) Note that cs = M and so ca i µ(a i ) I E (cs) = = c a i µ(a i ) = ci E (s). (ii) Note that s + t = M and so

s = and t = with C ij = A i B j F. (i) Note that cs = M and so ca i µ(a i ) I E (cs) = = c a i µ(a i ) = ci E (s). (ii) Note that s + t = M and so 3 From the otes we see that the parts of Theorem 4. that cocer us are: Let s ad t be two simple o-egative F-measurable fuctios o X, F, µ ad E, F F. The i I E cs ci E s for all c R, ii I E s + t I E s +

More information

Introduction to Extreme Value Theory Laurens de Haan, ISM Japan, Erasmus University Rotterdam, NL University of Lisbon, PT

Introduction to Extreme Value Theory Laurens de Haan, ISM Japan, Erasmus University Rotterdam, NL University of Lisbon, PT Itroductio to Extreme Value Theory Laures de Haa, ISM Japa, 202 Itroductio to Extreme Value Theory Laures de Haa Erasmus Uiversity Rotterdam, NL Uiversity of Lisbo, PT Itroductio to Extreme Value Theory

More information

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 22

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 22 CS 70 Discrete Mathematics for CS Sprig 2007 Luca Trevisa Lecture 22 Aother Importat Distributio The Geometric Distributio Questio: A biased coi with Heads probability p is tossed repeatedly util the first

More information

Application to Random Graphs

Application to Random Graphs A Applicatio to Radom Graphs Brachig processes have a umber of iterestig ad importat applicatios. We shall cosider oe of the most famous of them, the Erdős-Réyi radom graph theory. 1 Defiitio A.1. Let

More information

1 Introduction. 1.1 Notation and Terminology

1 Introduction. 1.1 Notation and Terminology 1 Itroductio You have already leared some cocepts of calculus such as limit of a sequece, limit, cotiuity, derivative, ad itegral of a fuctio etc. Real Aalysis studies them more rigorously usig a laguage

More information