
Probability Theory

Klaus Ritter
TU Kaiserslautern, WS 2017/18

Literature. In particular,

H. Bauer, Probability Theory, de Gruyter, Berlin, 1996.
P. Billingsley, Probability and Measure, Wiley, New York.
P. Gänssler, W. Stute, Wahrscheinlichkeitstheorie, Springer, Berlin, 1977.
A. Klenke, Probability Theory, Springer, Berlin, 2013.

Prerequisites: Stochastic Methods and Measure and Integration Theory.

Contents

I Introduction
II Basic Concepts of Probability Theory
  1 Random Variables and Distributions
  2 Convergence in Probability
  3 Convergence in Distribution
  4 Uniform Integrability
  5 Kernels and Product Measures
  6 Independence
III Limit Theorems
  1 Zero-One Laws
  2 The Strong Law of Large Numbers
  3 The Weak Law of Large Numbers
  4 Characteristic Functions
  5 The One-Dimensional Central Limit Theorem
  6 The Law of the Iterated Logarithm
  7 The Multi-Dimensional Central Limit Theorem
IV Brownian Motion
  1 Stochastic Processes
  2 Donsker's Invariance Principle
  3 The Brownian Motion
V Conditional Expectations and Martingales
  1 The Radon-Nikodym Theorem
  2 Conditional Expectations
  3 Discrete-Time Martingales
  4 Stopping Times and Optional Sampling
  5 Martingale Inequalities and Convergence Theorems

A Measure and Integration
  1 Measure Spaces and Measurable Mappings
  2 Borel Sets
  3 The Lebesgue Measure
  4 Real-Valued Measurable Mappings
  5 Integration
  6 $L^p$-Spaces
  7 Dynkin Classes
  8 Product Spaces
  9 Carathéodory's Theorem
  10 The Factorization Lemma

Literature

Definitions

Chapter I: Introduction

A stochastic model: a probability space $(\Omega, \mathcal{A}, P)$ together with a collection of random variables (measurable mappings) $\Omega \to \mathbb{R}$, say.

Main topics in this course: (i) limit theorems, (ii) conditional probabilities and expectations, (iii) discrete-time martingales, (iv) Brownian motion.

Example 1. Limit theorems like the law of large numbers or the central limit theorem deal with sequences $X_1, X_2, \ldots$ of random variables and their partial sums $S_n = \sum_{i=1}^n X_i$ (physics: position of a particle after $n$ collisions; gambling: cumulative gain after $n$ trials). Under which conditions and in which sense does $n^{-\alpha} S_n$ converge as $n$ tends to infinity?

Example 2. Consider two random variables $X_1$ and $X_2$. If $P(\{X_2 = v\}) > 0$ then the conditional probability of $\{X_1 \in A\}$ given $\{X_2 = v\}$ is defined by
\[ P(\{X_1 \in A\} \mid \{X_2 = v\}) = \frac{P(\{X_1 \in A\} \cap \{X_2 = v\})}{P(\{X_2 = v\})}. \]
How can we reasonably extend this definition to the case $P(\{X_2 = v\}) = 0$, e.g., for $X_2$ being normally distributed? How does the observation $X_2 = v$ change our stochastic model?

Example 3. Martingales may be used to model fair games, and a particular case of a martingale $S_0, S_1, \ldots$ arises in Example 1, if $X_1, X_2, \ldots$ is an independent sequence with zero mean. A gambling strategy is defined by a sequence $H_1, H_2, \ldots$ of random

variables, where $H_n$ may depend in any way on the outcomes of the first $n - 1$ trials. The cumulative gain after $n$ trials is given by the discrete integral
\[ \sum_{k=1}^n H_k \cdot X_k = \sum_{k=1}^n H_k \cdot (S_k - S_{k-1}). \]
Can we tilt the martingale in favor of the gambler by a suitable strategy?

Example 4. The fluctuation of a stock price defines a function on the time interval $[0, \infty[$ with values in $\mathbb{R}$ (for simplicity, we admit negative stock prices at this point). What is a reasonable $\sigma$-algebra on the space $\Omega$ of all mappings $[0, \infty[ \to \mathbb{R}$, or on the subspace of all continuous mappings? How can we define (non-discrete) probability measures on these spaces in order to model the random dynamics of stock prices? Analogous questions arise for random perturbations in physics, biology, etc. More generally, the same questions arise for mappings $I \to S$ with an arbitrary non-empty set $I$ and $S \subseteq \mathbb{R}^d$ (physics: phase transition in ferromagnetic materials, the orientation of magnetic dipoles on a set $I$ of sites; medicine: spread of diseases, certain biometric parameters for a set $I$ of individuals; environmental science: the concentration of certain pollutants in a region $I$).

Example 5. Suppose that $X_1, X_2, \ldots$ is an independent sequence with zero mean and variance one. Rescaling the partial sums $S_n$ in time and space according to
\[ S^{(m)}_{n/m} = \sum_{k=1}^n X_k / \sqrt{m} \]
and using piecewise linear interpolation of $S^{(m)}_0, S^{(m)}_{1/m}, \ldots$ at the knots $0, 1/m, \ldots$, we get a random variable $S^{(m)}$ taking values in the space $C([0, \infty[)$. By the central limit theorem $S^{(m)}_1$ converges to a standard normally distributed random variable as $m$ tends to infinity. The infinite-dimensional counterpart of this result, which deals with probability measures on the space $C([0, \infty[)$, guarantees convergence of $S^{(m)}$ to a Brownian motion. On the latter we quote from Schilling, Partzsch (2014, p. vi): "Brownian motion is arguably the single most important stochastic process."
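A minimal numerical illustration of the rescaling in Example 5 (an added sketch, not part of the original notes; it assumes NumPy and uses fair $\pm 1$ coin flips as the i.i.d. sequence):

```python
import numpy as np

rng = np.random.default_rng(0)

def rescaled_walk(m, T=1.0):
    # Values of S^(m) at the knots 0, 1/m, ..., T for i.i.d. steps
    # with zero mean and variance one (here: fair +/-1 coin flips).
    steps = rng.choice([-1.0, 1.0], size=int(T * m))
    return np.concatenate(([0.0], np.cumsum(steps))) / np.sqrt(m)

# By the central limit theorem, S^(m)_1 is approximately N(0, 1) for large m:
samples = np.array([rescaled_walk(10_000)[-1] for _ in range(2_000)])
print(samples.mean(), samples.var())  # close to 0 and 1, respectively
```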

Chapter II: Basic Concepts of Probability Theory

Context for probability theoretical concepts: a probability space $(\Omega, \mathcal{A}, P)$. Terminology: $A \in \mathcal{A}$ is an event, $P(A)$ is the probability of the event $A \in \mathcal{A}$.

1 Random Variables and Distributions

Given: a probability space $(\Omega, \mathcal{A}, P)$ and a measurable space $(\Omega', \mathcal{A}')$.

Definition 1. $X : \Omega \to \Omega'$ is a random element if $X$ is $\mathcal{A}$-$\mathcal{A}'$-measurable. Particular cases:
(i) $X$ is a (real) random variable if $(\Omega', \mathcal{A}') = (\mathbb{R}, \mathfrak{B})$,
(ii) $X$ is a numerical random variable if $(\Omega', \mathcal{A}') = (\overline{\mathbb{R}}, \overline{\mathfrak{B}})$,
(iii) $X$ is a $k$-dimensional (real) random vector if $(\Omega', \mathcal{A}') = (\mathbb{R}^k, \mathfrak{B}_k)$,
(iv) $X$ is a $k$-dimensional numerical random vector if $(\Omega', \mathcal{A}') = (\overline{\mathbb{R}}^k, \overline{\mathfrak{B}}_k)$.

As is customary, we use the abbreviation
\[ \{X \in A'\} = \{\omega \in \Omega : X(\omega) \in A'\} \]
for any $X : \Omega \to \Omega'$ and $A' \subseteq \Omega'$.

Definition 2.
(i) The distribution (probability law) of a random element $X : \Omega \to \Omega'$ (with respect to $P$) is the image measure $P_X = X(P)$. Notation: $X \sim Q$ if $P_X = Q$.

(ii) Given probability spaces $(\Omega_1, \mathcal{A}_1, P_1)$, $(\Omega_2, \mathcal{A}_2, P_2)$ and random elements $X_1 : \Omega_1 \to \Omega'$, $X_2 : \Omega_2 \to \Omega'$, the random elements $X_1$ and $X_2$ are identically distributed if $(P_1)_{X_1} = (P_2)_{X_2}$.

Definition 3. A property $\Pi$ holds $P$-almost surely ($P$-a.s., a.s., with probability one) if
\[ \exists\, A \in \mathcal{A}: \quad A \subseteq \{\omega \in \Omega : \Pi \text{ holds for } \omega\} \;\wedge\; P(A) = 1. \]

Remark 4.
(i) For random elements $X, Y : \Omega \to \Omega'$,
\[ X = Y \;\; P\text{-a.s.} \;\Rightarrow\; P_X = P_Y, \]
but the converse is not true in general. For instance, let $P$ be the uniform distribution on $\Omega = \{0, 1\}$ and define $X(\omega) = \omega$ and $Y(\omega) = 1 - \omega$.
(ii) For every probability measure $Q$ on $(\Omega', \mathcal{A}')$ there exists a probability space $(\Omega, \mathcal{A}, P)$ and a random element $X : \Omega \to \Omega'$ such that $X \sim Q$. Take $(\Omega, \mathcal{A}, P) = (\Omega', \mathcal{A}', Q)$ and $X = \mathrm{id}_{\Omega'}$. Carathéodory's Theorem is a general tool for the construction of probability measures, see Section A.9.
(iii) A major part of probability theory deals with properties of random elements that can be formulated in terms of their distributions only.

Example 5. A discrete distribution $P_X$ is specified by a countable set $D' \subseteq \Omega'$ and a mapping $p : D' \to \mathbb{R}$ such that
\[ \forall\, \omega' \in D': \; p(\omega') \geq 0 \qquad \wedge \qquad \sum_{\omega' \in D'} p(\omega') = 1, \]
namely,
\[ P_X = \sum_{\omega' \in D'} p(\omega') \cdot \varepsilon_{\omega'}. \]
Here $\varepsilon_{\omega'}$ denotes the Dirac measure at the point $\omega' \in \Omega'$, see Example A.1.3. Hence
\[ P_X(A') = P(\{X \in A'\}) = \sum_{\omega' \in A' \cap D'} p(\omega'), \qquad A' \in \mathcal{A}'. \]
Assume that $\{\omega'\} \in \mathcal{A}'$ for every $\omega' \in D'$. Then $P(\{X = \omega'\}) = p(\omega')$, and $p$, extended by zero to a mapping $\Omega' \to \mathbb{R}$, is the density of $P_X$ w.r.t. the counting measure on $\mathcal{A}'$.

If $|D'| < \infty$ then $p(\omega') = 1/|D'|$ yields the uniform distribution on $D'$. For $(\Omega', \mathcal{A}') = (\mathbb{R}, \mathfrak{B})$,
\[ B(n, p) = \sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k} \cdot \varepsilon_k \]
is the binomial distribution with parameters $n \in \mathbb{N}$ and $p \in [0, 1]$. In particular, for $n = 1$ we get the Bernoulli distribution
\[ B(1, p) = (1-p) \cdot \varepsilon_0 + p \cdot \varepsilon_1. \]
Further examples include the geometric distribution $G(p)$ with parameter $p \in \,]0, 1]$,
\[ G(p) = p \cdot \sum_{k=1}^{\infty} (1-p)^{k-1} \cdot \varepsilon_k, \]
and the Poisson distribution $\pi(\lambda)$ with parameter $\lambda > 0$,
\[ \pi(\lambda) = \sum_{k=0}^{\infty} \exp(-\lambda) \, \frac{\lambda^k}{k!} \cdot \varepsilon_k. \]

Example 6. Consider a distribution on $(\mathbb{R}^k, \mathfrak{B}_k)$ that is defined in terms of a probability density $f : \mathbb{R}^k \to [0, \infty[$ w.r.t. the Lebesgue measure $\lambda_k$. In this case
\[ P_X(A') = P(\{X \in A'\}) = \int_{A'} f \, d\lambda_k, \qquad A' \in \mathfrak{B}_k. \]
We present some examples in the case $k = 1$. The normal distribution $N(\mu, \sigma^2) = f \cdot \lambda_1$ with parameters $\mu \in \mathbb{R}$ and $\sigma^2$, where $\sigma > 0$, is obtained by
\[ f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Big( -\frac{1}{2} \, \frac{(x-\mu)^2}{\sigma^2} \Big), \qquad x \in \mathbb{R}. \]
The exponential distribution with parameter $\lambda > 0$ is obtained by
\[ f(x) = \begin{cases} 0 & \text{if } x < 0, \\ \lambda \exp(-\lambda x) & \text{if } x \geq 0. \end{cases} \]
The uniform distribution on $D \in \mathfrak{B}$ with $\lambda_1(D) \in \,]0, \infty[$ is obtained by $f = \frac{1}{\lambda_1(D)} \cdot 1_D$.
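A quick sanity check of the binomial weights above (an added illustration, assuming NumPy; the helper name binomial_pmf is ours):

```python
import numpy as np
from math import comb

def binomial_pmf(n, p):
    # Weights of B(n, p) on {0, ..., n}.
    return np.array([comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)])

w = binomial_pmf(10, 0.3)
print(w.sum())                    # 1.0: the weights form a probability measure
print((np.arange(11) * w).sum())  # 3.0 = n*p, cf. Example 14 below
```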

Distribution Functions

Definition 7. Let $X = (X_1, \ldots, X_k) : \Omega \to \mathbb{R}^k$ be a random vector. Then
\[ F_X : \mathbb{R}^k \to [0, 1], \qquad (x_1, \ldots, x_k) \mapsto P_X\Big( \prod_{i=1}^k \,]-\infty, x_i] \Big) = P\Big( \bigcap_{i=1}^k \{X_i \leq x_i\} \Big) \]
is called the distribution function of $X$.

Theorem 8. Given probability spaces $(\Omega_1, \mathcal{A}_1, P_1)$, $(\Omega_2, \mathcal{A}_2, P_2)$ and random vectors $X_1 : \Omega_1 \to \mathbb{R}^k$, $X_2 : \Omega_2 \to \mathbb{R}^k$, we have
\[ (P_1)_{X_1} = (P_2)_{X_2} \;\Leftrightarrow\; F_{X_1} = F_{X_2}. \]

Proof. "$\Rightarrow$" holds trivially. "$\Leftarrow$": By Theorem A.2.4, $\mathfrak{B}_k = \sigma(\mathcal{E})$ for
\[ \mathcal{E} = \Big\{ \prod_{i=1}^k \,]-\infty, x_i] : x_1, \ldots, x_k \in \mathbb{R} \Big\}, \]
and $\mathcal{E}$ is closed w.r.t. intersections. Use Theorem A.1.5.

For notational convenience, we consider the case $k = 1$ in the sequel. We refer to Elstrodt (2011, Satz II.5.10) or Übung 3.4 in Maß- und Integrationstheorie (2016) for the following two facts.

Theorem 9. (i) $F_X$ is non-decreasing, (ii) $F_X$ is right-continuous, (iii) $\lim_{x \to -\infty} F_X(x) = 0$ and $\lim_{x \to \infty} F_X(x) = 1$, (iv) $F_X$ is continuous at $x$ iff $P(\{X = x\}) = 0$.

Theorem 10. For every function $F$ that satisfies (i)-(iii) from Theorem 9,
\[ \exists_1\, Q \text{ probability measure on } \mathfrak{B} \;\; \forall\, x \in \mathbb{R}: \; Q(]-\infty, x]) = F(x). \]

Expectation and Variance

Remark 11. Define $\infty^r = \infty$ for $r > 0$. For $1 \leq p < q < \infty$ and $X \in \overline{Z}(\Omega, \mathcal{A})$,
\[ \int |X|^p \, dP \leq \Big( \int |X|^q \, dP \Big)^{p/q}, \]
due to Hölder's inequality, see Theorem A.6.2.

Notation:
\[ L = L(\Omega, \mathcal{A}, P) = \Big\{ X \in Z(\Omega, \mathcal{A}) : \int |X| \, dP < \infty \Big\} \]
is the class of $P$-integrable random variables, and analogously
\[ \overline{L} = \overline{L}(\Omega, \mathcal{A}, P) = \Big\{ X \in \overline{Z}(\Omega, \mathcal{A}) : \int |X| \, dP < \infty \Big\} \]
is the class of $P$-integrable numerical random variables. We consider $P_X$ as a distribution on $(\mathbb{R}, \mathfrak{B})$ if $P(\{X \in \mathbb{R}\}) = 1$ for a numerical random variable $X$, and we consider $L$ as a subspace of $\overline{L}$.

Definition 12. For $X \in \overline{L}$ or $X \in \overline{Z}_+$,
\[ \mathrm{E}(X) = \int X \, dP \]
is the expectation of $X$. For $X \in \overline{Z}(\Omega, \mathcal{A})$ such that $X^2 \in \overline{L}$,
\[ \mathrm{Var}(X) = \int (X - \mathrm{E}(X))^2 \, dP \]
and $\sqrt{\mathrm{Var}(X)}$ are the variance and the standard deviation of $X$, respectively.

Remark 13. The Transformation Theorem A.5.12 implies
\[ \int_\Omega |X|^p \, dP < \infty \;\Leftrightarrow\; \int_{\mathbb{R}} |x|^p \, P_X(dx) < \infty \]
for $X \in Z(\Omega, \mathcal{A})$, in which case, for $p = 1$,
\[ \mathrm{E}(X) = \int_{\mathbb{R}} x \, P_X(dx), \]
and for $p = 2$,
\[ \mathrm{Var}(X) = \int_{\mathbb{R}} (x - \mathrm{E}(X))^2 \, P_X(dx). \]
Thus $\mathrm{E}(X)$ and $\mathrm{Var}(X)$ depend only on $P_X$.

Example 14.
\[ X \sim B(n, p) \;\Rightarrow\; \mathrm{E}(X) = np, \quad \mathrm{Var}(X) = np(1-p); \]
\[ X \sim G(p) \;\Rightarrow\; \mathrm{E}(X) = \frac{1}{p}, \quad \mathrm{Var}(X) = \frac{1-p}{p^2}; \]
\[ X \sim \pi(\lambda) \;\Rightarrow\; \mathrm{E}(X) = \lambda, \quad \mathrm{Var}(X) = \lambda; \]
see the lecture on Stochastische Methoden. $X$ is Cauchy distributed with parameter $\alpha > 0$ if $X \sim f \cdot \lambda_1$ where
\[ f(x) = \frac{\alpha}{\pi (\alpha^2 + x^2)}, \qquad x \in \mathbb{R}. \]

Since
\[ \int_0^t \frac{x}{\alpha^2 + x^2} \, dx = \frac{1}{2} \log(\alpha^2 + t^2) - \frac{1}{2} \log(\alpha^2) \to \infty \qquad (t \to \infty), \]
neither $\mathrm{E}(X^+) < \infty$ nor $\mathrm{E}(X^-) < \infty$, and therefore $X \notin \overline{L}$. If $X \sim N(\mu, \sigma^2)$ then
\[ \mathrm{E}(X) = \mu, \qquad \mathrm{Var}(X) = \sigma^2. \]
If $X$ is exponentially distributed with parameter $\lambda > 0$ then
\[ \mathrm{E}(X) = \frac{1}{\lambda}, \qquad \mathrm{Var}(X) = \frac{1}{\lambda^2}. \]
See the lecture on Stochastische Methoden.

2 Convergence in Probability

Motivated by the Examples A.5.11 and A.6.8 we introduce a notion of convergence that is weaker than convergence in mean and convergence almost surely. In the sequel, $X$, $X_n$, etc. are random variables on a common probability space $(\Omega, \mathcal{A}, P)$.

Definition 1. $(X_n)_n$ converges to $X$ in probability if
\[ \forall\, \varepsilon > 0: \; \lim_{n \to \infty} P(\{|X_n - X| > \varepsilon\}) = 0. \]
Notation: $X_n \xrightarrow{P} X$.

Theorem 2 (Chebyshev-Markov Inequality). Let $(\Omega, \mathcal{A}, \mu)$ be a measure space and $f \in \overline{Z}(\Omega, \mathcal{A})$. For every $u > 0$ and every $1 \leq p < \infty$,
\[ \mu(\{|f| \geq u\}) \leq \frac{1}{u^p} \int |f|^p \, d\mu. \]

Proof. We have
\[ u^p \cdot \mu(\{|f| \geq u\}) = \int_{\{|f| \geq u\}} u^p \, d\mu \leq \int_\Omega |f|^p \, d\mu. \]

Corollary 3. If $\mathrm{E}(X^2) < \infty$ and $\varepsilon > 0$, then
\[ P(\{|X - \mathrm{E}(X)| \geq \varepsilon\}) \leq \frac{1}{\varepsilon^2} \, \mathrm{Var}(X). \]

Theorem 4.
\[ d(X, Y) = \int \min(1, |X - Y|) \, dP \]
defines a semi-metric on $Z(\Omega, \mathcal{A})$, and
\[ X_n \xrightarrow{P} X \;\Leftrightarrow\; \lim_{n \to \infty} d(X_n, X) = 0. \]

Proof. "$\Leftarrow$": For $\varepsilon > 0$,
\[ \int \min(1, |X_n - X|) \, dP = \int_{\{|X_n - X| > \varepsilon\}} \min(1, |X_n - X|) \, dP + \int_{\{|X_n - X| \leq \varepsilon\}} \min(1, |X_n - X|) \, dP \leq P(\{|X_n - X| > \varepsilon\}) + \min(1, \varepsilon). \]
"$\Rightarrow$": Let $0 < \varepsilon < 1$. Use Theorem 2 to obtain
\[ P(\{|X_n - X| > \varepsilon\}) = P(\{\min(1, |X_n - X|) > \varepsilon\}) \leq \frac{1}{\varepsilon} \int \min(1, |X_n - X|) \, dP = \frac{1}{\varepsilon} \, d(X_n, X). \]

Remark 5. By Theorems 4 and A.6.11,
\[ X_n \xrightarrow{L^p} X \;\Rightarrow\; X_n \xrightarrow{P} X. \]
Example A.5.11 shows that the converse does not hold in general.

Remark 6. By Theorems 4 and A.5.10,
\[ X_n \to X \;\; P\text{-a.s.} \;\Rightarrow\; X_n \xrightarrow{P} X. \]
Example A.6.8 shows that the converse does not hold in general.

The Law of Large Numbers deals with convergence almost surely or convergence in probability, see the introductory Example I.1 and Sections III.2 and III.3.

Subsequence Criteria

Corollary 7.
\[ X_n \xrightarrow{P} X \;\Rightarrow\; \exists \text{ subsequence } (X_{n_k})_{k \in \mathbb{N}}: \; X_{n_k} \to X \;\; P\text{-a.s.} \]

Proof. Due to Theorems 4 and A.6.7 there exists a subsequence $(X_{n_k})_{k \in \mathbb{N}}$ such that $\min(1, |X_{n_k} - X|) \to 0$ $P$-a.s.

Remark 8. In any semi-metric space $(M, d)$ a sequence $(a_n)_{n \in \mathbb{N}}$ converges to $a$ iff
\[ \forall \text{ subsequence } (a_{n_k})_{k \in \mathbb{N}} \;\exists \text{ subsequence } (a_{n_{k_l}})_{l \in \mathbb{N}}: \; \lim_{l \to \infty} d(a_{n_{k_l}}, a) = 0. \]

Corollary 9. $X_n \xrightarrow{P} X$ iff
\[ \forall \text{ subsequence } (X_{n_k})_{k \in \mathbb{N}} \;\exists \text{ subsequence } (X_{n_{k_l}})_{l \in \mathbb{N}}: \; X_{n_{k_l}} \to X \;\; P\text{-a.s.} \]

Proof. "$\Rightarrow$": Corollary 7. "$\Leftarrow$": Remarks 6 and 8 together with Theorem 4.
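A numerical illustration of the subsequence criterion (an added sketch, assuming NumPy): the classical "typewriter" sequence of indicators of sliding dyadic intervals converges to 0 in probability but not almost surely, while the subsequence along powers of two converges a.s.

```python
import numpy as np

rng = np.random.default_rng(1)
omega = rng.uniform(size=100_000)  # sample points from ([0,1], Lebesgue)

def X(n, w):
    # n = 2**k + j with 0 <= j < 2**k; X_n = indicator of [j/2**k, (j+1)/2**k]
    k = int(np.floor(np.log2(n)))
    j = n - 2**k
    return ((j / 2**k <= w) & (w <= (j + 1) / 2**k)).astype(float)

# P(|X_n| > eps) = 2**(-k) -> 0, so X_n -> 0 in probability ...
for n in [10, 100, 1000, 10_000]:
    print(n, X(n, omega).mean())
# ... but every fixed w is covered once per level k, so X_n(w) = 1 infinitely
# often; along the subsequence n_k = 2**k, however, X(2**k, w) -> 0 for every
# fixed w > 0, i.e. almost surely.
```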

Remark 10. We conclude that, in general, there is no semi-metric on $Z(\Omega, \mathcal{A})$ that induces a.s.-convergence. However, if $\Omega$ is countable, then
\[ X_n \to X \;\; P\text{-a.s.} \;\Leftrightarrow\; X_n \xrightarrow{P} X. \]
Proof: Übung 2.3.

Lemma 11. Let $\to$ denote convergence almost everywhere or convergence in probability. If $X_n^{(i)} \to X^{(i)}$ for $i = 1, \ldots, m$ and $f : \mathbb{R}^m \to \mathbb{R}$ is continuous, then
\[ f(X_n^{(1)}, \ldots, X_n^{(m)}) \to f(X^{(1)}, \ldots, X^{(m)}). \]

Proof. Trivial for convergence almost everywhere, and by Corollary 9 the conclusion holds for convergence in probability, too.

Corollary 12. Let $X_n \xrightarrow{P} X$. Then
\[ X_n \xrightarrow{P} Y \;\Rightarrow\; X = Y \;\; P\text{-a.s.} \]

Proof. Corollary 9 and Lemma A.6.6.

3 Convergence in Distribution

Given: a metric space $(M, \rho)$. Let $\mathcal{M}(M)$ denote the set of all probability measures on the Borel $\sigma$-algebra $\mathfrak{B}(M)$ in $M$. Moreover, let
\[ C_b(M) = \{f : M \to \mathbb{R} : f \text{ bounded, continuous}\}. \]

Definition 1.
(i) A sequence $(Q_n)_{n \in \mathbb{N}}$ in $\mathcal{M}(M)$ converges weakly to $Q \in \mathcal{M}(M)$ if
\[ \forall\, f \in C_b(M): \; \lim_{n \to \infty} \int f \, dQ_n = \int f \, dQ. \]
Notation: $Q_n \xrightarrow{w} Q$.
(ii) A sequence $(X_n)_{n \in \mathbb{N}}$ of random elements with values in $M$ converges in distribution to a random element $X$ with values in $M$ if $Q_n \xrightarrow{w} Q$ for the distributions $Q_n$ of $X_n$ and $Q$ of $X$, respectively. Notation: $X_n \xrightarrow{d} X$.

Remark 2. For convergence in distribution the random elements need not be defined on a common probability space. In the sequel: $Q_n, Q \in \mathcal{M}(M)$ for $n \in \mathbb{N}$.

Example 3.
(i) For $x_n, x \in M$,
\[ \varepsilon_{x_n} \xrightarrow{w} \varepsilon_x \;\Leftrightarrow\; \lim_{n \to \infty} \rho(x_n, x) = 0. \]
For the proof of "$\Leftarrow$", note that $\int f \, d\varepsilon_{x_n} = f(x_n)$ and $\int f \, d\varepsilon_x = f(x)$. For the proof of "$\Rightarrow$", suppose that $\limsup_{n} \rho(x_n, x) > 0$. Take $f(y) = \min(\rho(y, x), 1)$, $y \in M$, and observe that $f \in C_b(M)$ and
\[ \limsup_{n \to \infty} \int f \, d\varepsilon_{x_n} = \limsup_{n \to \infty} \min(\rho(x_n, x), 1) > 0, \]
while $\int f \, d\varepsilon_x = 0$.
(ii) For the Euclidean distance $\rho$ on $M = \mathbb{R}^k$ we have $(M, \mathfrak{B}(M)) = (\mathbb{R}^k, \mathfrak{B}_k)$. Now, in particular, let $k = 1$ and $Q_n = N(\mu_n, \sigma_n^2)$ where $\sigma_n > 0$. For $f \in C_b(\mathbb{R})$,
\[ \int f \, dQ_n = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(\sigma_n x + \mu_n) \exp(-x^2/2) \, \lambda_1(dx). \]
Put $N(\mu, 0) = \varepsilon_\mu$. Then
\[ \lim_{n \to \infty} \mu_n = \mu \;\wedge\; \lim_{n \to \infty} \sigma_n = \sigma \;\Rightarrow\; Q_n \xrightarrow{w} N(\mu, \sigma^2). \]
Otherwise $(Q_n)_{n \in \mathbb{N}}$ does not converge weakly. Übung 4.1.
(iii) For $M = C([0, T])$ let $\rho(x, y) = \sup_{t \in [0,T]} |x(t) - y(t)|$. Cf. the introductory Examples I.4 and I.5.

Remark 4. Note that $Q_n \xrightarrow{w} Q$ does not imply
\[ \forall\, A \in \mathfrak{B}(M): \; \lim_{n \to \infty} Q_n(A) = Q(A). \]
For instance, assume $\lim_{n} \rho(x_n, x) = 0$ with $x_n \neq x$ for every $n \in \mathbb{N}$. Then $\varepsilon_{x_n}(\{x\}) = 0$ but $\varepsilon_x(\{x\}) = 1$.

Theorem 5 (Portemanteau Theorem). The following properties are equivalent:
(i) $Q_n \xrightarrow{w} Q$,
(ii) $\forall\, f \in C_b(M)$ uniformly continuous: $\lim_{n \to \infty} \int f \, dQ_n = \int f \, dQ$,

(iii) $\forall\, A \subseteq M$ closed: $\limsup_{n \to \infty} Q_n(A) \leq Q(A)$,
(iv) $\forall\, A \subseteq M$ open: $\liminf_{n \to \infty} Q_n(A) \geq Q(A)$,
(v) $\forall\, A \in \mathfrak{B}(M)$: $Q(\partial A) = 0 \Rightarrow \lim_{n \to \infty} Q_n(A) = Q(A)$.

Proof. At first we show that (ii) $\Rightarrow$ (iii) $\Rightarrow$ (i); since (i) $\Rightarrow$ (ii) holds trivially and (iii) $\Leftrightarrow$ (iv) by taking complements, this yields the equivalence of (i)-(iv).

"(ii) $\Rightarrow$ (iii)": Let $A \subseteq M$ be closed, and put $A_m = \{x \in M : \mathrm{dist}(x, A) < 1/m\}$. Note that $A_m \downarrow A$. Let $\varepsilon > 0$, and take $m \in \mathbb{N}$ such that $Q(A_m) < Q(A) + \varepsilon$. Define $\varphi : \mathbb{R} \to \mathbb{R}$ by
\[ \varphi(z) = \begin{cases} 1 & \text{if } z \leq 0, \\ 1 - z & \text{if } 0 < z < 1, \\ 0 & \text{otherwise}, \end{cases} \]
and $f : M \to \mathbb{R}$ by $f(x) = \varphi(m \cdot \mathrm{dist}(x, A))$. Then $f \in C_b(M)$ is uniformly continuous, and $0 \leq f \leq 1$ with $f|_A = 1$ and $f|_{A_m^c} = 0$. Hence
\[ Q_n(A) \leq \int f \, dQ_n \]
and
\[ \limsup_{n \to \infty} Q_n(A) \leq \int f \, dQ \leq Q(A_m) \leq Q(A) + \varepsilon. \]

"(iii) $\Rightarrow$ (i)": Let $f \in C_b(M)$. Assume that $0 < f < 1$ without loss of generality. For every $m \in \mathbb{N}$ and every $P \in \mathcal{M}(M)$,
\[ \int f \, dP \leq \sum_{k=1}^m \frac{k}{m} \, P(\{(k-1)/m \leq f < k/m\}) = \frac{1}{m} \sum_{k=1}^m P(\{f \geq (k-1)/m\}) \]
and
\[ \sum_{k=1}^m \frac{k}{m} \, P(\{(k-1)/m \leq f < k/m\}) = \sum_{k=1}^m \frac{k-1}{m} \, P(\{(k-1)/m \leq f < k/m\}) + \frac{1}{m} \leq \int f \, dP + \frac{1}{m}. \]
Therefore, since the sets $\{f \geq (k-1)/m\}$ are closed,
\[ \limsup_{n \to \infty} \int f \, dQ_n \leq \frac{1}{m} \sum_{k=1}^m Q(\{f \geq (k-1)/m\}) \leq \int f \, dQ + \frac{1}{m}, \]
which implies $\limsup_{n} \int f \, dQ_n \leq \int f \, dQ$. In fact, the latter holds true for any bounded and upper semi-continuous mapping $f : M \to \mathbb{R}$. For $f$ being continuous consider $1 - f$, too, to obtain $\lim_{n} \int f \, dQ_n = \int f \, dQ$.

"(i) $\Rightarrow$ (v)": Let $A \in \mathfrak{B}(M)$ with $Q(\partial A) = 0$, and put $f = 1_A$ as well as
\[ f_* = \sup\{g : g \text{ lower semi-continuous}, \; g \leq f\}, \qquad f^* = \inf\{h : h \text{ upper semi-continuous}, \; h \geq f\}. \]

Then $f_*$ is lower semi-continuous, $f^*$ is upper semi-continuous, and $f_* \leq f \leq f^*$ with equality $Q$-a.s., since $Q(\partial A) = 0$. From the proof of "(iii) $\Rightarrow$ (i)" we get
\[ \int f \, dQ = \int f_* \, dQ \leq \liminf_{n \to \infty} \int f \, dQ_n \leq \limsup_{n \to \infty} \int f \, dQ_n \leq \int f^* \, dQ = \int f \, dQ. \]
Hence $\lim_{n} \int f \, dQ_n = \int f \, dQ$.

"(v) $\Rightarrow$ (i)": Übung 3.4.

Lemma 6. Consider metric spaces $M$, $M'$ and a continuous mapping $\pi : M \to M'$. Then
\[ Q_n \xrightarrow{w} Q \;\Rightarrow\; \pi(Q_n) \xrightarrow{w} \pi(Q). \]

Proof. Note that $f \in C_b(M')$ implies $f \circ \pi \in C_b(M)$. Employ Theorem A.5.12 and the definition of weak convergence.

Weak Convergence of Measures on $(\mathbb{R}, \mathfrak{B})$

In the sequel, we study the particular case $(M, \mathfrak{B}(M)) = (\mathbb{R}, \mathfrak{B})$, i.e., convergence in distribution for random variables. The Central Limit Theorem deals with this notion of convergence, see the introductory Example I.1 and Section III.5.

Notation: for any $Q \in \mathcal{M}(\mathbb{R})$ and for any function $F : \mathbb{R} \to \mathbb{R}$,
\[ F_Q(x) = Q(]-\infty, x]), \quad x \in \mathbb{R}, \qquad \mathrm{Cont}(F) = \{x \in \mathbb{R} : F \text{ continuous at } x\}. \]

Theorem 7.
\[ Q_n \xrightarrow{w} Q \;\Leftrightarrow\; \forall\, x \in \mathrm{Cont}(F_Q): \; \lim_{n \to \infty} F_{Q_n}(x) = F_Q(x). \]
Moreover, if $Q_n \xrightarrow{w} Q$ and $\mathrm{Cont}(F_Q) = \mathbb{R}$ then
\[ \lim_{n \to \infty} \sup_{x \in \mathbb{R}} |F_{Q_n}(x) - F_Q(x)| = 0. \]

Proof. "$\Rightarrow$": If $x \in \mathrm{Cont}(F_Q)$ and $A = ]-\infty, x]$ then $Q(\partial A) = Q(\{x\}) = 0$, see Theorem 1.9. Hence Theorem 5 implies
\[ \lim_{n \to \infty} F_{Q_n}(x) = \lim_{n \to \infty} Q_n(A) = Q(A) = F_Q(x). \]
"$\Leftarrow$": Consider a non-empty open set $A \subseteq \mathbb{R}$. Take pairwise disjoint open intervals $A_1, A_2, \ldots$ such that $A = \bigcup_{i} A_i$. Fatou's Lemma implies
\[ \liminf_{n \to \infty} Q_n(A) = \liminf_{n \to \infty} \sum_{i} Q_n(A_i) \geq \sum_{i} \liminf_{n \to \infty} Q_n(A_i). \]

Note that $\mathbb{R} \setminus \mathrm{Cont}(F_Q)$ is countable. Fix $\varepsilon > 0$, and take
\[ A_i' = ]a_i, b_i] \subseteq A_i \]
for $i \in \mathbb{N}$ such that
\[ a_i, b_i \in \mathrm{Cont}(F_Q) \qquad \wedge \qquad Q(A_i) \leq Q(A_i') + \varepsilon \cdot 2^{-i}. \]
Then
\[ \liminf_{n \to \infty} Q_n(A_i) \geq \liminf_{n \to \infty} Q_n(A_i') = Q(A_i') \geq Q(A_i) - \varepsilon \cdot 2^{-i}. \]
We conclude that
\[ \liminf_{n \to \infty} Q_n(A) \geq Q(A) - \varepsilon, \]
and therefore $Q_n \xrightarrow{w} Q$ by Theorem 5. Uniform convergence: see Übung 1.3.

Corollary 8.
\[ Q_n \xrightarrow{w} Q \;\wedge\; Q_n \xrightarrow{w} Q' \;\Rightarrow\; Q = Q'. \]

Proof. By Theorem 7, $F_Q(x) = F_{Q'}(x)$ if $x \in D = \mathrm{Cont}(F_Q) \cap \mathrm{Cont}(F_{Q'})$. Since $D$ is dense in $\mathbb{R}$ and $F_Q$ as well as $F_{Q'}$ are right-continuous, we get $F_Q = F_{Q'}$. Apply Theorem 1.8.

Given: random variables $X_n$, $X$ on $(\Omega, \mathcal{A}, P)$ for $n \in \mathbb{N}$.

Theorem 9.
\[ X_n \xrightarrow{P} X \;\Rightarrow\; X_n \xrightarrow{d} X \]
and
\[ X_n \xrightarrow{d} X \;\wedge\; X \text{ constant a.s.} \;\Rightarrow\; X_n \xrightarrow{P} X. \]

Proof. Assume $X_n \xrightarrow{P} X$. For $\varepsilon > 0$ and $x \in \mathbb{R}$,
\[ P(\{X \leq x - \varepsilon\}) - P(\{|X - X_n| > \varepsilon\}) \leq P(\{X \leq x - \varepsilon\} \cap \{|X - X_n| \leq \varepsilon\}) \leq P(\{X_n \leq x\}) \]
and
\[ P(\{X_n \leq x\}) = P(\{X_n \leq x\} \cap \{X \leq x + \varepsilon\}) + P(\{X_n \leq x\} \cap \{X > x + \varepsilon\}) \leq P(\{X \leq x + \varepsilon\}) + P(\{|X - X_n| > \varepsilon\}). \]
Thus
\[ F_X(x - \varepsilon) \leq \liminf_{n \to \infty} F_{X_n}(x) \leq \limsup_{n \to \infty} F_{X_n}(x) \leq F_X(x + \varepsilon). \]
For $x \in \mathrm{Cont}(F_X)$ we get $\lim_{n} F_{X_n}(x) = F_X(x)$. Apply Theorem 7.

Now, assume that $X_n \xrightarrow{d} X$ and $P_X = \varepsilon_x$. Let $\varepsilon > 0$ and take $f \in C_b(\mathbb{R})$ such that $f \geq 0$, $f(x) = 0$, and $f(y) = 1$ if $|x - y| \geq \varepsilon$. Then
\[ P(\{|X - X_n| > \varepsilon\}) = P(\{|x - X_n| > \varepsilon\}) = \int 1_{\mathbb{R} \setminus [x-\varepsilon, x+\varepsilon]} \, dP_{X_n} \leq \int f \, dP_{X_n} \]
and $\lim_{n} \int f \, dP_{X_n} = \int f \, dP_X = 0$.

Example 10. Consider the uniform distribution $P$ on $\Omega = \{0, 1\}$. Put
\[ X_n(\omega) = \omega, \qquad X(\omega) = 1 - \omega. \]
Then $P_{X_n} = P_X$ and therefore $X_n \xrightarrow{d} X$. However, $\{|X_n - X| < 1\} = \emptyset$, and therefore $X_n \xrightarrow{P} X$ does not hold.

Theorem 11 (Skorohod). There exists a probability space $(\Omega, \mathcal{A}, P)$ with the following property. If $Q_n \xrightarrow{w} Q$, then there exist $X_n, X \in Z(\Omega, \mathcal{A})$ for $n \in \mathbb{N}$ such that
\[ \forall\, n \in \mathbb{N}: \; Q_n = P_{X_n}, \qquad Q = P_X, \qquad X_n \to X \;\; P\text{-a.s.} \]

Proof. Take $\Omega = ]0, 1[$, $\mathcal{A} = \mathfrak{B}(\Omega)$, and consider the uniform distribution $P$ on $\Omega$. Define
\[ X_Q(\omega) = \inf\{z \in \mathbb{R} : \omega \leq F_Q(z)\}, \qquad \omega \in \,]0, 1[, \]
for any $Q \in \mathcal{M}(\mathbb{R})$. Since $X_Q$ is non-decreasing, we have $X_Q \in Z(\Omega, \mathcal{A})$. Furthermore,
\[ P_{X_Q} = Q, \tag{1} \]
see Übung 2.4.

Assuming $Q_n \xrightarrow{w} Q$, we define $X_n = X_{Q_n}$ and $X = X_Q$. Since $X$ is non-decreasing, we conclude that $\Omega \setminus \mathrm{Cont}(X)$ is countable. Thus it suffices to show
\[ \forall\, \omega \in \mathrm{Cont}(X): \; \lim_{n \to \infty} X_n(\omega) = X(\omega). \]
Let $\omega \in \mathrm{Cont}(X)$ and $\varepsilon > 0$. Put $x = X(\omega)$ and take $x_i \in \mathrm{Cont}(F_Q)$ such that
\[ x - \varepsilon < x_1 < x < x_2 < x + \varepsilon. \]
Hence
\[ F_Q(x_1) < \omega < F_Q(x_2). \]
By assumption there exists $n_0 \in \mathbb{N}$ such that $F_{Q_n}(x_1) < \omega < F_{Q_n}(x_2)$ for $n \geq n_0$. Hence $X_n(\omega) \in \,]x_1, x_2]$, i.e. $|X_n(\omega) - X(\omega)| < \varepsilon$.

Remark 12. By (1) we have a general method to transform uniformly distributed random numbers from $]0, 1[$ into random numbers with distribution $Q$.

Remark 13.
(i) Put
\[ C^{(r)} = \{f : \mathbb{R} \to \mathbb{R} : f, f^{(1)}, \ldots, f^{(r)} \text{ bounded}, \; f^{(r)} \text{ uniformly continuous}\}. \]
Then, for every $r \in \mathbb{N}_0$,
\[ Q_n \xrightarrow{w} Q \;\Leftrightarrow\; \forall\, f \in C^{(r)}: \; \lim_{n \to \infty} \int f \, dQ_n = \int f \, dQ, \]
see Gänssler, Stute (1977, p. 66).
(ii) The Lévy distance
\[ d(Q, R) = \inf\{h \in \,]0, \infty[ \,:\, \forall\, x \in \mathbb{R}: \; F_Q(x - h) - h \leq F_R(x) \leq F_Q(x + h) + h\} \]
defines a metric on $\mathcal{M}(\mathbb{R})$, and
\[ Q_n \xrightarrow{w} Q \;\Leftrightarrow\; \lim_{n \to \infty} d(Q_n, Q) = 0, \]
see Chow, Teicher (1978).
(iii) Suppose that $(M, \rho)$ is a complete separable metric space. Then there exists a metric $d$ on $\mathcal{M}(M)$ such that $(\mathcal{M}(M), d)$ is complete and separable as well, and
\[ Q_n \xrightarrow{w} Q \;\Leftrightarrow\; \lim_{n \to \infty} d(Q_n, Q) = 0, \]
see Parthasarathy (1967, Sec. II.6) or Klenke (2013, Abschn. 13.2) and cf. Übung 3.3.

Compactness

Finally, we present a compactness criterion, which is very useful for the construction of probability measures on $\mathfrak{B}(M)$.

Lemma 14. Let $x_{n,l} \in \mathbb{R}$ for $n, l \in \mathbb{N}$ with
\[ \forall\, l \in \mathbb{N}: \; \sup_{n \in \mathbb{N}} |x_{n,l}| < \infty. \]
Then there exists an increasing sequence $(n_i)_{i \in \mathbb{N}}$ in $\mathbb{N}$ such that
\[ \forall\, l \in \mathbb{N}: \; (x_{n_i,l})_{i \in \mathbb{N}} \text{ converges}. \]

Proof. See Billingsley (1979).
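The quantile transform from the proof of Skorohod's Theorem, used as in Remark 12 (a minimal added sketch, assuming NumPy; the exponential distribution serves as an example where $X_Q$ has a closed form):

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_via_quantile_transform(F_inverse, size):
    # Remark 12: if U is uniform on ]0,1[, then X_Q(U) ~ Q.
    u = rng.uniform(size=size)
    return F_inverse(u)

# Exponential distribution with parameter lam: F(x) = 1 - exp(-lam*x) for
# x >= 0, hence X_Q(u) = -log(1 - u) / lam.
lam = 2.0
x = sample_via_quantile_transform(lambda u: -np.log(1.0 - u) / lam, 100_000)
print(x.mean(), x.var())  # approximately 1/lam = 0.5 and 1/lam**2 = 0.25
```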

Definition 15.
(i) $\mathcal{P} \subseteq \mathcal{M}(M)$ is tight if
\[ \forall\, \varepsilon > 0 \;\exists\, K \subseteq M \text{ compact} \;\forall\, P \in \mathcal{P}: \; P(K) \geq 1 - \varepsilon. \]
(ii) $\mathcal{P} \subseteq \mathcal{M}(M)$ is relatively compact if every sequence in $\mathcal{P}$ contains a subsequence that converges weakly.

Theorem 16 (Prohorov). Assume that $M$ is a complete separable metric space and $\mathcal{P} \subseteq \mathcal{M}(M)$. Then
\[ \mathcal{P} \text{ relatively compact} \;\Leftrightarrow\; \mathcal{P} \text{ tight}. \]

Proof. For the general case see Parthasarathy (1967, Thm. II.6.7). Here: $M = \mathbb{R}$.

"$\Rightarrow$": Suppose that $\mathcal{P}$ is not tight. Then, for some $\varepsilon > 0$, there exists a sequence $(P_n)_{n \in \mathbb{N}}$ in $\mathcal{P}$ such that $P_n([-n, n]) < 1 - \varepsilon$. For a suitable subsequence, $P_{n_k} \xrightarrow{w} P \in \mathcal{M}(\mathbb{R})$. Take $m > 0$ such that $P(]-m, m[) > 1 - \varepsilon$. Theorem 5 implies
\[ P(]-m, m[) \leq \liminf_{k \to \infty} P_{n_k}(]-m, m[) \leq \liminf_{k \to \infty} P_{n_k}([-n_k, n_k]) \leq 1 - \varepsilon, \]
which is a contradiction.

"$\Leftarrow$": Consider any sequence $(P_n)_{n \in \mathbb{N}}$ in $\mathcal{P}$ and the corresponding sequence $(F_n)_{n \in \mathbb{N}}$ of distribution functions. Use Lemma 14 to obtain a subsequence $(F_{n_i})_{i \in \mathbb{N}}$ and a non-decreasing function $G : \mathbb{Q} \to [0, 1]$ with
\[ \forall\, q \in \mathbb{Q}: \; \lim_{i \to \infty} F_{n_i}(q) = G(q). \]
Put
\[ F(x) = \inf\{G(q) : q \in \mathbb{Q} \wedge x < q\}, \qquad x \in \mathbb{R}. \]
Claim (Helly's Theorem):
(i) $F$ is non-decreasing and right-continuous,
(ii) $\forall\, x \in \mathrm{Cont}(F)$: $\lim_{i \to \infty} F_{n_i}(x) = F(x)$.

Proof: Ad (i): Obviously $F$ is non-decreasing. For $x \in \mathbb{R}$ and $\varepsilon > 0$ take $\delta_2 > 0$ such that
\[ \forall\, q \in \mathbb{Q} \,\cap\, ]x, x + \delta_2[ \,: \; G(q) \leq F(x) + \varepsilon. \]
Thus, for $z \in \,]x, x + \delta_2[$,
\[ F(x) \leq F(z) \leq F(x) + \varepsilon. \]

Ad (ii): If $x \in \mathrm{Cont}(F)$ and $\varepsilon > 0$, take $\delta_1 > 0$ such that
\[ F(x) - \varepsilon \leq F(x - \delta_1). \]
Thus, for $q_1, q_2 \in \mathbb{Q}$ with
\[ x - \delta_1 < q_1 < x < q_2 < x + \delta_2, \]
we get
\[ F(x) - \varepsilon \leq F(x - \delta_1) \leq G(q_1) \leq \liminf_{i \to \infty} F_{n_i}(x) \leq \limsup_{i \to \infty} F_{n_i}(x) \leq G(q_2) \leq F(x) + \varepsilon. \]

Claim:
\[ \lim_{x \to -\infty} F(x) = 0 \qquad \wedge \qquad \lim_{x \to \infty} F(x) = 1. \]

Proof: For $\varepsilon > 0$ take $m \in \mathbb{Q}$ such that
\[ \forall\, n \in \mathbb{N}: \; P_n(]-m, m]) \geq 1 - \varepsilon. \]
Thus
\[ G(m) - G(-m) = \lim_{i \to \infty} \big( F_{n_i}(m) - F_{n_i}(-m) \big) = \lim_{i \to \infty} P_{n_i}(]-m, m]) \geq 1 - \varepsilon. \]
Since $F(m) \geq G(m)$ and $F(-m - 1) \leq G(-m)$, we obtain
\[ F(m) - F(-m - 1) \geq 1 - \varepsilon. \]
It remains to apply Theorems 1.10 and 7.

4 Uniform Integrability

In the sequel: $X_n$, $X$ are random variables on a common probability space $(\Omega, \mathcal{A}, P)$.

Definition 1. $(X_n)_{n \in \mathbb{N}}$ is uniformly integrable (u.i.) if
\[ \lim_{\alpha \to \infty} \sup_{n \in \mathbb{N}} \int_{\{|X_n| \geq \alpha\}} |X_n| \, dP = 0. \]

Remark 2.
(i) $(X_n)_{n \in \mathbb{N}}$ u.i. $\Rightarrow$ $(\forall\, n \in \mathbb{N}: X_n \in L^1)$ $\wedge$ $\sup_{n \in \mathbb{N}} \|X_n\|_1 < \infty$.
(ii) $Y \in L^1 \;\wedge\; \forall\, n \in \mathbb{N}: |X_n| \leq |Y|$ $\Rightarrow$ $(X_n)_{n \in \mathbb{N}}$ u.i.
(iii) For $p > 1$: $(\forall\, n \in \mathbb{N}: X_n \in L^p)$ $\wedge$ $\sup_{n \in \mathbb{N}} \|X_n\|_p < \infty$ $\Rightarrow$ $(X_n)_{n \in \mathbb{N}}$ u.i.
Proof:
\[ \int_{\{|X_n| \geq \alpha\}} |X_n| \, dP = \frac{1}{\alpha^{p-1}} \int_{\{|X_n| \geq \alpha\}} \alpha^{p-1} |X_n| \, dP \leq \frac{1}{\alpha^{p-1}} \, \|X_n\|_p^p. \]
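A numeric look at Remark 2.(iii) for p = 2 (an added sketch, assuming NumPy; the Exp(1) distribution is chosen only for illustration): an $L^2$-bound forces the truncated first moments to vanish as $\alpha$ grows.

```python
import numpy as np

rng = np.random.default_rng(8)
x = rng.exponential(size=1_000_000)  # E(X^2) = 2 for Exp(1)

for alpha in [1.0, 5.0, 10.0, 20.0]:
    tail = np.where(x >= alpha, x, 0.0).mean()  # int_{|X| >= alpha} |X| dP
    print(alpha, tail, (x**2).mean() / alpha)   # bound from the proof (p = 2)
```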

Example 3. For the uniform distribution $P$ on $[0, 1]$ and $X_n = n \cdot 1_{[0,1/n]}$ we have $X_n \in L^1$ and $\|X_n\|_1 = 1$, but for any $\alpha > 0$ and $n \geq \alpha$,
\[ \int_{\{|X_n| \geq \alpha\}} |X_n| \, dP = n \cdot P([0, 1/n]) = 1, \]
so that $(X_n)_{n \in \mathbb{N}}$ is not u.i.

Lemma 4. $(X_n)_{n \in \mathbb{N}}$ is u.i. iff
\[ \sup_{n \in \mathbb{N}} \mathrm{E}(|X_n|) < \infty \tag{1} \]
and
\[ \forall\, \varepsilon > 0 \;\exists\, \delta > 0 \;\forall\, A \in \mathcal{A}: \; \Big( P(A) < \delta \;\Rightarrow\; \sup_{n \in \mathbb{N}} \int_A |X_n| \, dP < \varepsilon \Big). \tag{2} \]

Proof. "$\Rightarrow$": For (1), see Remark 2.(i). Moreover,
\[ \int_A |X_n| \, dP = \int_{A \cap \{|X_n| \geq \alpha\}} |X_n| \, dP + \int_{A \cap \{|X_n| < \alpha\}} |X_n| \, dP \leq \int_{\{|X_n| \geq \alpha\}} |X_n| \, dP + \alpha \cdot P(A). \]
For $\varepsilon > 0$ take $\alpha > 0$ with
\[ \sup_{n \in \mathbb{N}} \int_{\{|X_n| \geq \alpha\}} |X_n| \, dP < \varepsilon/2 \]
and $\delta = \varepsilon/(2\alpha)$ to obtain (2).

"$\Leftarrow$": Put $M = \sup_{n \in \mathbb{N}} \mathrm{E}(|X_n|)$. Theorem 2.2 yields $P(\{|X_n| \geq \alpha\}) \leq M/\alpha$. Let $\varepsilon > 0$ and take $\delta > 0$ according to (2) to obtain, for $\alpha > M/\delta$,
\[ \sup_{n \in \mathbb{N}} \int_{\{|X_n| \geq \alpha\}} |X_n| \, dP < \varepsilon. \]

Theorem 5. Let $1 \leq p < \infty$, and assume $X_n \in L^p$ for every $n \in \mathbb{N}$. Then $(X_n)_{n \in \mathbb{N}}$ converges in $L^p$ iff
\[ (X_n)_{n \in \mathbb{N}} \text{ converges in probability} \;\wedge\; (|X_n|^p)_{n \in \mathbb{N}} \text{ is u.i.} \]

Proof. "$\Rightarrow$": Assume $X_n \xrightarrow{L^p} X$. It follows that $(X_n)_{n \in \mathbb{N}}$ is bounded in $L^p$, and from Remark 2.5 we get $X_n \xrightarrow{P} X$. Observe that
\[ \|1_A X_n\|_p \leq \|1_A (X_n - X)\|_p + \|1_A X\|_p \]

for every $A \in \mathcal{A}$. Let $\varepsilon > 0$, take $k \in \mathbb{N}$ such that
\[ \sup_{n > k} \|X_n - X\|_p < \varepsilon. \tag{3} \]
By Remark 2.(ii),
\[ \big( |X_1 - X|^p, \ldots, |X_k - X|^p, |X|^p, |X|^p, \ldots \big) \text{ is u.i.} \]
By Lemma 4,
\[ P(A) < \delta \;\Rightarrow\; \Big( \sup_{1 \leq n \leq k} \|1_A (X_n - X)\|_p < \varepsilon \;\wedge\; \|1_A X\|_p < \varepsilon \Big) \]
for a suitable $\delta > 0$. Together with (3) this implies
\[ P(A) < \delta \;\Rightarrow\; \sup_{n \in \mathbb{N}} \|1_A X_n\|_p < 2\varepsilon. \]

"$\Leftarrow$": Let $\varepsilon > 0$, put $A = A_{m,n} = \{|X_m - X_n| > \varepsilon\}$. Then
\[ \|X_m - X_n\|_p \leq \|1_A (X_m - X_n)\|_p + \|1_{A^c} (X_m - X_n)\|_p \leq \|1_A X_m\|_p + \|1_A X_n\|_p + \varepsilon. \]
By assumption $X_n \xrightarrow{P} X$ for some $X \in Z(\Omega, \mathcal{A})$. Take $\delta > 0$ according to (2) for $(|X_n|^p)_{n \in \mathbb{N}}$, and note that
\[ A_{m,n} \subseteq \{|X_m - X| > \varepsilon/2\} \cup \{|X_n - X| > \varepsilon/2\}. \]
Hence, for $m, n$ sufficiently large, $P(A_{m,n}) < \delta$, which implies
\[ \|X_m - X_n\|_p \leq 2\varepsilon^{1/p} + \varepsilon. \]
Apply Theorem A.6.7.

Remark 6.
(i) Theorem 5 yields a generalization of Lebesgue's convergence theorem: If $X_n \in L^1$ for every $n \in \mathbb{N}$ and $X_n \xrightarrow{P} X$, then
\[ (X_n)_{n \in \mathbb{N}} \text{ u.i.} \;\Rightarrow\; X \in L^1 \;\wedge\; X_n \xrightarrow{L^1} X. \]
(ii) Uniform integrability is a property of the distributions only.
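A numerical illustration of Example 3 and Remark 6 (an added sketch, assuming NumPy): without uniform integrability, convergence in probability does not carry expectations along.

```python
import numpy as np

rng = np.random.default_rng(3)
omega = rng.uniform(size=1_000_000)  # P = uniform distribution on [0, 1]

for n in [10, 100, 1000]:
    X_n = n * (omega <= 1.0 / n)     # X_n = n * 1_{[0,1/n]}
    # X_n -> 0 in probability, yet E(X_n) = 1 for every n (no u.i.):
    print(n, (X_n > 0).mean(), X_n.mean())
```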

Convergence of Expectations

Theorem 7.
\[ X_n \xrightarrow{d} X \;\Rightarrow\; \mathrm{E}(|X|) \leq \liminf_{n \to \infty} \mathrm{E}(|X_n|). \]

Proof. From Skorohod's Theorem 3.11 we get a probability space $(\tilde{\Omega}, \tilde{\mathcal{A}}, \tilde{P})$ with random variables $\tilde{X}_n$, $\tilde{X}$ such that
\[ \tilde{X}_n \to \tilde{X} \;\; \tilde{P}\text{-a.s.}, \qquad \tilde{P}_{\tilde{X}_n} = P_{X_n}, \qquad \tilde{P}_{\tilde{X}} = P_X. \]
Thus $\mathrm{E}(|X|) = \mathrm{E}(|\tilde{X}|)$ and $\mathrm{E}(|X_n|) = \mathrm{E}(|\tilde{X}_n|)$. Apply Fatou's Lemma A.5.5.

Theorem 8. If
\[ X_n \xrightarrow{d} X \;\wedge\; (X_n)_{n \in \mathbb{N}} \text{ u.i.}, \]
then $X \in L^1$ and $\lim_{n \to \infty} \mathrm{E}(X_n) = \mathrm{E}(X)$.

Proof. Notation as previously. Now $(\tilde{X}_n)_{n \in \mathbb{N}}$ is u.i., see Remark 6.(ii). Hence, by Remark 6.(i), $\tilde{X} \in L^1$ and $\tilde{X}_n \xrightarrow{L^1} \tilde{X}$. Thus $\mathrm{E}(|X|) < \infty$ and
\[ \lim_{n \to \infty} \mathrm{E}(X_n) = \lim_{n \to \infty} \mathrm{E}(\tilde{X}_n) = \mathrm{E}(\tilde{X}) = \mathrm{E}(X). \]

Example 9. Example 3 continued. With $X = 0$ we have $X_n \to X$ $P$-a.s., and therefore $X_n \xrightarrow{d} X$. But $\mathrm{E}(X_n) = 1 > 0 = \mathrm{E}(X)$.

5 Kernels and Product Measures

Recall the construction and properties of product (probability) measures from Elstrodt (2011, Kap. V) or Section V.2 in the Lecture Notes on Maß- und Integrationstheorie (2016).

Two-Stage Experiments

Given: measurable spaces $(\Omega_1, \mathcal{A}_1)$ and $(\Omega_2, \mathcal{A}_2)$. Motivation: a two-stage experiment, where the output $\omega_1 \in \Omega_1$ of the first stage determines the probabilistic model for the second stage.

Example 1. Choose one out of $n$ coins and throw it once. Parameters: $a_1, \ldots, a_n \geq 0$ such that $\sum_{i=1}^n a_i = 1$, and $b_1, \ldots, b_n \in [0, 1]$. Let $\Omega_1 = \{1, \ldots, n\}$, $\mathcal{A}_1 = \mathfrak{P}(\Omega_1)$, and define
\[ \mu = \sum_{i=1}^n a_i \cdot \varepsilon_i, \]

i.e., $a_i = \mu(\{i\})$ is the probability of choosing the $i$-th coin. Moreover, let
\[ \Omega_2 = \{\mathrm{H}, \mathrm{T}\}, \qquad \mathcal{A}_2 = \mathfrak{P}(\Omega_2), \]
and define
\[ K(i, \cdot) = b_i \cdot \varepsilon_{\mathrm{H}} + (1 - b_i) \cdot \varepsilon_{\mathrm{T}}, \]
i.e., $b_i = K(i, \{\mathrm{H}\})$ is the probability of obtaining H when throwing the $i$-th coin.

Definition 2. $K : \Omega_1 \times \mathcal{A}_2 \to \mathbb{R}$ is a Markov (transition) kernel (from $(\Omega_1, \mathcal{A}_1)$ to $(\Omega_2, \mathcal{A}_2)$) if
(i) $K(\omega_1, \cdot)$ is a probability measure on $\mathcal{A}_2$ for every $\omega_1 \in \Omega_1$,
(ii) $K(\cdot, A_2)$ is $\mathcal{A}_1$-$\mathfrak{B}$-measurable for every $A_2 \in \mathcal{A}_2$.

Example 3. Extremal cases, non-disjoint.
(i) The model for the second stage is not influenced by the output of the first stage, i.e., for a probability measure $\nu$ on $\mathcal{A}_2$,
\[ \forall\, \omega_1 \in \Omega_1: \; K(\omega_1, \cdot) = \nu. \]
In Example 1 this means $b_1 = \cdots = b_n$.
(ii) The output of the first stage determines the output of the second stage, i.e., for an $\mathcal{A}_1$-$\mathcal{A}_2$-measurable mapping $f : \Omega_1 \to \Omega_2$,
\[ \forall\, \omega_1 \in \Omega_1: \; K(\omega_1, \cdot) = \varepsilon_{f(\omega_1)}. \]
In Example 1 this means $b_1, \ldots, b_n \in \{0, 1\}$.

Given: a probability measure $\mu$ on $\mathcal{A}_1$ and a Markov kernel $K$ from $(\Omega_1, \mathcal{A}_1)$ to $(\Omega_2, \mathcal{A}_2)$. Question: what is a stochastic model $(\Omega, \mathcal{A}, P)$ for the compound experiment? Reasonable, and assumed in the sequel:
\[ \Omega = \Omega_1 \times \Omega_2, \qquad \mathcal{A} = \mathcal{A}_1 \otimes \mathcal{A}_2. \]
How to define $P$?

Example 4. In Example 1, a reasonable requirement for $P$ is
\[ P(A_1 \times \Omega_2) = \mu(A_1), \qquad A_1 \subseteq \Omega_1, \]
and
\[ P(\{i\} \times A_2) = K(i, A_2) \cdot \mu(\{i\}), \qquad A_2 \subseteq \Omega_2. \]

Consequently, for $A \subseteq \Omega$,
\[ P(A) = \sum_{i=1}^n P(\{(\omega_1, \omega_2) \in A : \omega_1 = i\}) = \sum_{i=1}^n P(\{i\} \times \{\omega_2 \in \Omega_2 : (i, \omega_2) \in A\}) = \sum_{i=1}^n K(i, \{\omega_2 \in \Omega_2 : (i, \omega_2) \in A\}) \cdot \mu(\{i\}) = \int_{\Omega_1} K(\omega_1, \{\omega_2 \in \Omega_2 : (\omega_1, \omega_2) \in A\}) \, \mu(d\omega_1). \]
May we generally use the right-hand side integral for the definition of $P$?

Lemma 5. Let $f \in \overline{Z}(\Omega, \mathcal{A})$. Then, for $\omega_1 \in \Omega_1$, the $\omega_1$-section
\[ f(\omega_1, \cdot) : \Omega_2 \to \overline{\mathbb{R}} \]
of $f$ is $\mathcal{A}_2$-$\overline{\mathfrak{B}}$-measurable, and for $\omega_2 \in \Omega_2$ the $\omega_2$-section
\[ f(\cdot, \omega_2) : \Omega_1 \to \overline{\mathbb{R}} \]
of $f$ is $\mathcal{A}_1$-$\overline{\mathfrak{B}}$-measurable.

Proof. In the case of an $\omega_1$-section: fix $\omega_1 \in \Omega_1$. Then
\[ \Omega_2 \to \Omega_1 \times \Omega_2 : \omega_2 \mapsto (\omega_1, \omega_2) \]
is $\mathcal{A}_2$-$\mathcal{A}$-measurable due to Theorem A.8.5.(i). Apply Theorem A.1.7.

Remark 6. In particular, for $A \in \mathcal{A}$ and $f = 1_A$,
\[ f(\omega_1, \cdot) = 1_A(\omega_1, \cdot) = 1_{A(\omega_1)}, \]
where
\[ A(\omega_1) = \{\omega_2 \in \Omega_2 : (\omega_1, \omega_2) \in A\} \]
is the $\omega_1$-section of $A$.¹ By Lemma 5,
\[ \forall\, \omega_1 \in \Omega_1: \; A(\omega_1) \in \mathcal{A}_2. \]
Analogously for the $\omega_2$-section $A(\omega_2) = \{\omega_1 \in \Omega_1 : (\omega_1, \omega_2) \in A\}$ of $A$.

Lemma 7. Let $f \in \overline{Z}_+(\Omega, \mathcal{A})$. Then
\[ g : \Omega_1 \to [0, \infty], \qquad \omega_1 \mapsto \int_{\Omega_2} f(\omega_1, \omega_2) \, K(\omega_1, d\omega_2) \]
is $\mathcal{A}_1$-$\mathfrak{B}([0, \infty])$-measurable.

¹ Poor notation.

Proof. Let $\mathcal{F}$ denote the set of all functions $f \in \overline{Z}_+(\Omega, \mathcal{A})$ with the measurability property as claimed. We show that
\[ \forall\, A_1 \in \mathcal{A}_1, A_2 \in \mathcal{A}_2: \; 1_{A_1 \times A_2} \in \mathcal{F}. \tag{1} \]
Indeed,
\[ \int_{\Omega_2} 1_{A_1 \times A_2}(\omega_1, \omega_2) \, K(\omega_1, d\omega_2) = 1_{A_1}(\omega_1) \cdot K(\omega_1, A_2). \]
Furthermore, we show that
\[ \forall\, A \in \mathcal{A}: \; 1_A \in \mathcal{F}. \tag{2} \]
To this end let
\[ \mathcal{D} = \{A \in \mathcal{A} : 1_A \in \mathcal{F}\} \]
and
\[ \mathcal{E} = \{A_1 \times A_2 : A_1 \in \mathcal{A}_1 \wedge A_2 \in \mathcal{A}_2\}. \]
Then $\mathcal{E} \subseteq \mathcal{D}$ by (1), $\mathcal{E}$ is closed w.r.t. intersections, and $\sigma(\mathcal{E}) = \mathcal{A}$. It easily follows that $\mathcal{D}$ is a Dynkin class. Hence Theorem A.7.4 yields
\[ \mathcal{A} = \sigma(\mathcal{E}) = \delta(\mathcal{E}) \subseteq \mathcal{D} \subseteq \mathcal{A}, \]
which implies (2). From Theorems A.4.6 and A.5.9.(iv) we get
\[ f_1, f_2 \in \mathcal{F} \;\wedge\; \alpha \in [0, \infty[ \;\Rightarrow\; \alpha f_1 + f_2 \in \mathcal{F}. \tag{3} \]
Finally, Theorem A.5.3 and Theorem A.4.4.(iii) imply that
\[ f_n \in \mathcal{F} \;\wedge\; f_n \uparrow f \;\Rightarrow\; f \in \mathcal{F}. \tag{4} \]
Use Theorem A.4.8 together with (2)-(4) to conclude that $\mathcal{F} = \overline{Z}_+$.

Theorem 8.
\[ \exists_1 \text{ probability measure } \mu \otimes K \text{ on } \mathcal{A} \;\; \forall\, A_1 \in \mathcal{A}_1 \;\forall\, A_2 \in \mathcal{A}_2: \; \mu \otimes K(A_1 \times A_2) = \int_{A_1} K(\omega_1, A_2) \, \mu(d\omega_1). \tag{5} \]
Moreover,
\[ \forall\, A \in \mathcal{A}: \; \mu \otimes K(A) = \int_{\Omega_1} K(\omega_1, A(\omega_1)) \, \mu(d\omega_1). \tag{6} \]

Proof. Existence: For $A \in \mathcal{A}$ and $\omega_1 \in \Omega_1$,
\[ K(\omega_1, A(\omega_1)) = \int_{\Omega_2} 1_{A(\omega_1)}(\omega_2) \, K(\omega_1, d\omega_2) = \int_{\Omega_2} 1_A(\omega_1, \omega_2) \, K(\omega_1, d\omega_2). \]
According to Lemma 7, $\mu \otimes K$ is well-defined via (6). Using Theorem A.5.3, it is easy to verify that $\mu \otimes K$ is a probability measure on $\mathcal{A}$. For $A_1 \in \mathcal{A}_1$ and $A_2 \in \mathcal{A}_2$,
\[ K(\omega_1, (A_1 \times A_2)(\omega_1)) = \begin{cases} K(\omega_1, A_2) & \text{if } \omega_1 \in A_1, \\ 0 & \text{otherwise}. \end{cases} \]
Hence $\mu \otimes K$ satisfies (5).

Uniqueness: Apply Theorem A.1.5 with $\tilde{\mathcal{A}}_0 = \{A_1 \times A_2 : A_i \in \mathcal{A}_i\}$.
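A simulation sketch of the two-stage experiment of Examples 1 and 4 (an added illustration, assuming NumPy): draw a coin according to $\mu$, flip it according to $K$, and compare the empirical frequencies with $\mu \otimes K(\{i\} \times \{\mathrm{H}\}) = a_i \cdot b_i$ from (5).

```python
import numpy as np

rng = np.random.default_rng(4)

a = np.array([0.5, 0.3, 0.2])  # mu({i}): probability of choosing coin i
b = np.array([0.1, 0.5, 0.9])  # K(i, {H}): head probability of coin i

N = 1_000_000
coin = rng.choice(3, size=N, p=a)      # first stage: draw from mu
heads = rng.uniform(size=N) < b[coin]  # second stage: draw from K(coin, .)

for i in range(3):
    empirical = ((coin == i) & heads).mean()
    print(i, empirical, a[i] * b[i])   # empirical ~ (mu x K)({i} x {H})
```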

Example 9. In Example 4 we have $P = \mu \otimes K$.

Theorem 10 (Fubini's Theorem).
(i) For $f \in \overline{Z}_+(\Omega, \mathcal{A})$,
\[ \int_\Omega f \, d(\mu \otimes K) = \int_{\Omega_1} \int_{\Omega_2} f(\omega_1, \omega_2) \, K(\omega_1, d\omega_2) \, \mu(d\omega_1). \]
(ii) For $f$ $(\mu \otimes K)$-integrable and
\[ A_1 = \{\omega_1 \in \Omega_1 : f(\omega_1, \cdot) \text{ is } K(\omega_1, \cdot)\text{-integrable}\}, \]
we have
(a) $A_1 \in \mathcal{A}_1$ and $\mu(A_1) = 1$,
(b) $A_1 \to \mathbb{R} : \omega_1 \mapsto \int_{\Omega_2} f(\omega_1, \cdot) \, dK(\omega_1, \cdot)$ is integrable w.r.t. $\mu|_{A_1 \cap \mathcal{A}_1}$,
(c)
\[ \int_\Omega f \, d(\mu \otimes K) = \int_{A_1} \int_{\Omega_2} f(\omega_1, \omega_2) \, K(\omega_1, d\omega_2) \, \mu|_{A_1 \cap \mathcal{A}_1}(d\omega_1). \]

Proof. Ad (i): algebraic induction. Ad (ii): consider $f^+$ and $f^-$ and use (i).

Remark 11. For brevity, we write
\[ \int_{\Omega_1} \int_{\Omega_2} f(\omega_1, \omega_2) \, K(\omega_1, d\omega_2) \, \mu(d\omega_1) = \int_{A_1} \int_{\Omega_2} f(\omega_1, \omega_2) \, K(\omega_1, d\omega_2) \, \mu|_{A_1 \cap \mathcal{A}_1}(d\omega_1) \]
if $f$ is $(\mu \otimes K)$-integrable. For $f \in \overline{Z}(\Omega, \mathcal{A})$,
\[ f \text{ is } (\mu \otimes K)\text{-integrable} \;\Leftrightarrow\; \int_{\Omega_1} \int_{\Omega_2} |f|(\omega_1, \omega_2) \, K(\omega_1, d\omega_2) \, \mu(d\omega_1) < \infty. \]

Multi-Stage Experiments

Now we construct a stochastic model for a series of experiments, where the outputs of the first $i - 1$ stages determine the model for the $i$-th stage. Given: measurable spaces $(\Omega_i, \mathcal{A}_i)$ for $i \in I$, where $I = \{1, \ldots, n\}$ or $I = \mathbb{N}$. Put
\[ \big( \Omega^i, \mathcal{A}^i \big) = \Big( \prod_{j=1}^i \Omega_j, \; \bigotimes_{j=1}^i \mathcal{A}_j \Big), \]
and note that
\[ \prod_{j=1}^i \Omega_j = \Omega^{i-1} \times \Omega_i, \qquad \bigotimes_{j=1}^i \mathcal{A}_j = \mathcal{A}^{i-1} \otimes \mathcal{A}_i \]

for $i \in I \setminus \{1\}$. Furthermore, let
\[ \Omega = \prod_{i \in I} \Omega_i, \qquad \mathcal{A} = \bigotimes_{i \in I} \mathcal{A}_i. \tag{7} \]
Given: a probability measure $\mu$ on $\mathcal{A}_1$ and Markov kernels $K_i$ from $(\Omega^{i-1}, \mathcal{A}^{i-1})$ to $(\Omega_i, \mathcal{A}_i)$ for $i \in I \setminus \{1\}$.

Theorem 12. For $I = \{1, \ldots, n\}$,
\[ \exists_1 \text{ probability measure } \nu \text{ on } \mathcal{A} \;\; \forall\, A_1 \in \mathcal{A}_1 \ldots \forall\, A_n \in \mathcal{A}_n: \]
\[ \nu(A_1 \times \cdots \times A_n) = \int_{A_1} \cdots \int_{A_{n-1}} K_n((\omega_1, \ldots, \omega_{n-1}), A_n) \, K_{n-1}((\omega_1, \ldots, \omega_{n-2}), d\omega_{n-1}) \cdots \mu(d\omega_1). \]
Moreover, for $f$ $\nu$-integrable (in the short notation),
\[ \int_\Omega f \, d\nu = \int_{\Omega_1} \cdots \int_{\Omega_n} f(\omega_1, \ldots, \omega_n) \, K_n((\omega_1, \ldots, \omega_{n-1}), d\omega_n) \cdots \mu(d\omega_1). \tag{8} \]
Notation: $\nu = \mu \otimes K_2 \otimes \cdots \otimes K_n$.

Proof. Induction, using Theorems 8 and 10.

Remark 13. Particular case of Theorem 12 with
\[ \mu = P_1, \qquad \forall\, i \in I \setminus \{1\} \;\forall\, \omega^{i-1} \in \Omega^{i-1}: \; K_i(\omega^{i-1}, \cdot) = P_i \tag{9} \]
for probability measures $P_i$ on $\mathcal{A}_i$:
\[ \exists_1 \text{ probability measure } P_1 \otimes \cdots \otimes P_n \text{ on } \mathcal{A} \;\; \forall\, A_1 \in \mathcal{A}_1 \ldots \forall\, A_n \in \mathcal{A}_n: \]
\[ P_1 \otimes \cdots \otimes P_n(A_1 \times \cdots \times A_n) = P_1(A_1) \cdots P_n(A_n). \]
Moreover, for every $P_1 \otimes \cdots \otimes P_n$-integrable function $f$, Fubini's Theorem reads
\[ \int_\Omega f \, d(P_1 \otimes \cdots \otimes P_n) = \int_{\Omega_1} \cdots \int_{\Omega_n} f(\omega_1, \ldots, \omega_n) \, P_n(d\omega_n) \cdots P_1(d\omega_1). \]
Analogously for any other order of integration.

Definition 14. $P_1 \otimes \cdots \otimes P_n$ is called the product probability measure corresponding to $P_i$ for $i = 1, \ldots, n$, and $(\Omega, \mathcal{A}, P_1 \otimes \cdots \otimes P_n)$ is called the product probability space corresponding to $(\Omega_i, \mathcal{A}_i, P_i)$ for $i = 1, \ldots, n$.

Example 15.
(i) In Example 4 with $b = b_1 = \cdots = b_n$ and $\nu = b \cdot \varepsilon_{\mathrm{H}} + (1 - b) \cdot \varepsilon_{\mathrm{T}}$ we have $P = \mu \otimes \nu$.

(ii) For countable spaces $\Omega_i$ and $\sigma$-algebras $\mathcal{A}_i = \mathfrak{P}(\Omega_i)$ we get
\[ P_1 \otimes P_2(A) = \sum_{\omega_1 \in \Omega_1} P_2(A(\omega_1)) \cdot P_1(\{\omega_1\}), \qquad A \subseteq \Omega. \]
(iii) For uniform distributions $P_i$ on finite spaces $\Omega_i$, $P_1 \otimes \cdots \otimes P_n$ is the uniform distribution on $\Omega$.

Remark 16. Theorems 8, 10, and 12 and Remark 13 extend to $\sigma$-finite measures $\mu$ instead of probability measures and $\sigma$-finite kernels $K$ or $K_i$, resp., instead of Markov kernels, with the resulting measure on $\mathcal{A}$ being $\sigma$-finite, too. Recall that $\mu$ is $\sigma$-finite if
\[ \exists\, A_{1,1}, A_{1,2}, \ldots \in \mathcal{A}_1 \text{ pairwise disjoint}: \; \Omega_1 = \bigcup_i A_{1,i} \;\wedge\; \forall\, i \in \mathbb{N}: \; \mu(A_{1,i}) < \infty. \]
By definition, a mapping $K : \Omega_1 \times \mathcal{A}_2 \to \overline{\mathbb{R}}$ is a kernel if $K(\omega_1, \cdot)$ is a measure on $\mathcal{A}_2$ for every $\omega_1 \in \Omega_1$ and if $K(\cdot, A_2)$ is $\mathcal{A}_1$-$\overline{\mathfrak{B}}$-measurable for every $A_2 \in \mathcal{A}_2$. Moreover, $K$ is a $\sigma$-finite kernel if, additionally,
\[ \exists\, A_{2,1}, A_{2,2}, \ldots \in \mathcal{A}_2 \text{ pairwise disjoint}: \; \Omega_2 = \bigcup_i A_{2,i} \;\wedge\; \forall\, i \in \mathbb{N}: \; \sup_{\omega_1 \in \Omega_1} K(\omega_1, A_{2,i}) < \infty. \]

Example 17.
(i) For every integrable random variable $X$ on any probability space $(\Omega, \mathcal{A}, P)$ with $X \geq 0$,
\[ \mathrm{E}(X) = \int_{]0,\infty[} (1 - F_X(u)) \, \lambda_1(du), \]
see Billingsley (1995, p. 275) or Übung 7.4 in Maß- und Integrationstheorie (2016).
(ii) For the Lebesgue measure,
\[ \lambda_n = \lambda_1 \otimes \cdots \otimes \lambda_1. \]

Theorem 18 (Ionescu-Tulcea). For $I = \mathbb{N}$,
\[ \exists_1 \text{ probability measure } P \text{ on } \mathcal{A} \;\; \forall\, n \in \mathbb{N} \;\forall\, A_1 \in \mathcal{A}_1 \ldots \forall\, A_n \in \mathcal{A}_n: \]
\[ P\Big( A_1 \times \cdots \times A_n \times \prod_{i=n+1}^\infty \Omega_i \Big) = (\mu \otimes K_2 \otimes \cdots \otimes K_n)(A_1 \times \cdots \times A_n). \tag{10} \]

Proof. Existence: Consider the $\sigma$-algebras
\[ \mathcal{A}^n = \bigotimes_{i=1}^n \mathcal{A}_i, \qquad \tilde{\mathcal{A}}_n = \sigma\big( \pi_{\{1,\ldots,n\}} \big) \]

in $\prod_{i=1}^n \Omega_i$ and $\prod_{i=1}^\infty \Omega_i$, respectively. Define a probability measure $\tilde{P}_n$ on $\tilde{\mathcal{A}}_n$ by
\[ \tilde{P}_n\Big( A \times \prod_{i=n+1}^\infty \Omega_i \Big) = (\mu \otimes K_2 \otimes \cdots \otimes K_n)(A), \qquad A \in \mathcal{A}^n. \]
Then (8) yields the following consistency property:
\[ \tilde{P}_{n+1}\Big( A \times \Omega_{n+1} \times \prod_{i=n+2}^\infty \Omega_i \Big) = \tilde{P}_n\Big( A \times \prod_{i=n+1}^\infty \Omega_i \Big), \qquad A \in \mathcal{A}^n. \]
Thus
\[ \hat{P}(\tilde{A}) = \tilde{P}_n(\tilde{A}), \qquad \tilde{A} \in \tilde{\mathcal{A}}_n, \]
yields a well-defined mapping on the algebra $\tilde{\mathcal{A}} = \bigcup_{n \in \mathbb{N}} \tilde{\mathcal{A}}_n$ of cylinder sets. Obviously, $\hat{P}$ is a content with $\hat{P}(\prod_{i=1}^\infty \Omega_i) = 1$, and (10) holds for $P = \hat{P}$.

Claim: $\hat{P}$ is $\sigma$-continuous at $\emptyset$. It suffices to show that for every sequence of sets
\[ A^{(n)} = B^{(n)} \times \prod_{i=n+1}^\infty \Omega_i, \qquad B^{(n)} \in \mathcal{A}^n, \]
with $A^{(n)} \downarrow \emptyset$ we have
\[ \lim_{n \to \infty} (\mu \otimes K_2 \otimes \cdots \otimes K_n)(B^{(n)}) = 0. \]
Assume the contrary, i.e., $\inf_{n \in \mathbb{N}} (\mu \otimes K_2 \otimes \cdots \otimes K_n)(B^{(n)}) > 0$. Observe that
\[ B^{(n)} \times \Omega_{n+1} \supseteq B^{(n+1)}. \tag{11} \]
Recursively, we define $\omega^* \in \Omega$ such that
\[ \forall\, n \in \mathbb{N}: \; B^{(n+1)}(\omega_1^*, \ldots, \omega_n^*) \neq \emptyset, \tag{12} \]
which implies $(\omega_1^*, \ldots, \omega_n^*) \in B^{(n)}$ for every $n \in \mathbb{N}$, see (11). Consequently,
\[ \omega^* \in \bigcap_{n=1}^\infty A^{(n)}, \]
contradicting $A^{(n)} \downarrow \emptyset$.

Put $\omega^n = (\omega_1, \ldots, \omega_n)$ for $n \geq 1$ and $\omega_i \in \Omega_i$. Consider the probability measure $K_{n+1}(\omega^n, \cdot)$ on $\mathcal{A}_{n+1}$ as well as the Markov kernels
\[ K_{n+m}((\omega^n, \cdot), \cdot) : \prod_{i=n+1}^{n+m-1} \Omega_i \times \mathcal{A}_{n+m} \to \mathbb{R} \]

for $m \geq 2$. By
\[ Q_{n,m}(\omega^n, \cdot) = K_{n+1}(\omega^n, \cdot) \otimes \cdots \otimes K_{n+m}((\omega^n, \cdot), \cdot) \]
we obtain a probability measure on $\bigotimes_{i=n+1}^{n+m} \mathcal{A}_i$ for $m \geq 1$; actually $Q_{n,m}$ is a Markov kernel. Finally, we put
\[ f_{n,m}(\omega^n) = Q_{n,m}\big( \omega^n, B^{(n+m)}(\omega^n) \big) \]
for $n, m \geq 1$. Recursively, we define $\omega^* \in \Omega$ such that
\[ \forall\, n \in \mathbb{N}: \; \inf_{m \geq 1} f_{n,m}(\omega_1^*, \ldots, \omega_n^*) > 0. \tag{13} \]
Take $m = 1$ in (13) to obtain (12). Due to (11) we have
\[ B^{(n+m)}(\omega^n) \times \Omega_{n+m+1} \supseteq B^{(n+m+1)}(\omega^n), \]
and therefore
\[ f_{n,m}(\omega^n) = Q_{n,m+1}\big( \omega^n, B^{(n+m)}(\omega^n) \times \Omega_{n+m+1} \big) \geq f_{n,m+1}(\omega^n). \]
Furthermore, $0 \leq f_{n,m} \leq 1$ and
\[ f_{n,m}(\omega^n) = \int_{\prod_{i=n+1}^{n+m} \Omega_i} 1_{B^{(n+m)}}(\omega^n, \omega_{n+1}, \ldots, \omega_{n+m}) \, Q_{n,m}(\omega^n, d(\omega_{n+1}, \ldots, \omega_{n+m})). \]
In particular, $f_{n,m}$ is $\bigotimes_{i=1}^n \mathcal{A}_i$-$\mathfrak{B}$-measurable. For $n = 1$,
\[ \int_{\Omega_1} f_{1,m}(\omega_1) \, \mu(d\omega_1) = (\mu \otimes K_2 \otimes \cdots \otimes K_{1+m})(B^{(1+m)}). \]
Therefore
\[ \int_{\Omega_1} \inf_{m \geq 1} f_{1,m}(\omega_1) \, \mu(d\omega_1) = \inf_{m \geq 1} (\mu \otimes K_2 \otimes \cdots \otimes K_{1+m})(B^{(1+m)}) > 0, \]
which yields
\[ \exists\, \omega_1^* \in \Omega_1: \; \inf_{m \geq 1} f_{1,m}(\omega_1^*) > 0. \]
For $n \geq 1$,
\[ \int_{\Omega_{n+1}} f_{n+1,m}(\omega^n, \omega_{n+1}) \, K_{n+1}(\omega^n, d\omega_{n+1}) = f_{n,m+1}(\omega^n). \]
Therefore
\[ \int_{\Omega_{n+1}} \inf_{m \geq 1} f_{n+1,m}(\omega^n, \omega_{n+1}) \, K_{n+1}(\omega^n, d\omega_{n+1}) = \inf_{m \geq 1} f_{n,m+1}(\omega^n). \]
Proceed inductively to establish (13). By Theorem A.9.3, $\hat{P}$ is $\sigma$-additive, and it remains to apply Theorem A.9.4.

Uniqueness: By (10), $P$ is uniquely determined on the class of measurable rectangles. Apply Theorem A.1.5.

Outlook: the probabilistic method.

Example 19. The queuing model, see Übung 5.3. Here $K_i((\omega_1, \ldots, \omega_{i-1}), \cdot)$ only depends on $\omega_{i-1}$. See also Übung 5.1. Outlook: Markov processes.
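As a computational aside (an added sketch, assuming NumPy; kernel and initial distribution are chosen only for illustration): the recursive structure behind Theorem 18 is exactly how one simulates a process whose $i$-th stage depends on the earlier outputs. Here, as in Example 19, each kernel depends only on the previous coordinate.

```python
import numpy as np

rng = np.random.default_rng(5)

def K(prev):
    # Markov kernel K_i(prev, .): here a random walk step, N(prev, 1).
    return rng.normal(loc=prev, scale=1.0)

def simulate(n, mu_sample):
    # Draw omega_1 ~ mu, then omega_i ~ K_i((omega_1,...,omega_{i-1}), .),
    # which in this example depends only on omega_{i-1}.
    path = [mu_sample()]
    for _ in range(n - 1):
        path.append(K(path[-1]))
    return np.array(path)

print(simulate(10, lambda: rng.normal()))
```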

Product Measures: The General Case

Given: a non-empty set $I$ and probability spaces $(\Omega_i, \mathcal{A}_i, P_i)$ for $i \in I$. Recall the definition (7). Put
\[ \mathfrak{P}_0(I) = \{J \subseteq I : J \text{ non-empty, finite}\}. \]

Theorem 20.
\[ \exists_1 \text{ probability measure } P \text{ on } \mathcal{A} \;\; \forall\, S \in \mathfrak{P}_0(I) \;\forall\, A_i \in \mathcal{A}_i, \, i \in S: \]
\[ P\Big( \prod_{i \in S} A_i \times \prod_{i \in I \setminus S} \Omega_i \Big) = \prod_{i \in S} P_i(A_i). \tag{14} \]
Notation: $P = \bigotimes_{i \in I} P_i$.

Proof. See Remark 13 in the case of a finite set $I$. If $I$ is countably infinite, assume $I = \mathbb{N}$ without loss of generality. The particular case of Theorem 18 with (9) for probability measures $P_i$ on $\mathcal{A}_i$ shows
\[ \exists_1 \text{ probability measure } P \text{ on } \mathcal{A} \;\; \forall\, n \in \mathbb{N} \;\forall\, A_1 \in \mathcal{A}_1 \ldots \forall\, A_n \in \mathcal{A}_n: \]
\[ P\Big( A_1 \times \cdots \times A_n \times \prod_{i=n+1}^\infty \Omega_i \Big) = P_1(A_1) \cdots P_n(A_n). \]
If $I$ is uncountable, we use Theorem A.8.7. For $S \subseteq I$ non-empty and countable and for $B \in \bigotimes_{i \in S} \mathcal{A}_i$ we put
\[ P\big( (\pi_S^I)^{-1}(B) \big) = \Big( \bigotimes_{i \in S} P_i \Big)(B). \]
Hereby we get a well-defined mapping $P : \mathcal{A} \to \mathbb{R}$, which clearly is a probability measure and satisfies (14). Use Theorem A.1.5 to obtain the uniqueness result.

Definition 21. $P = \bigotimes_{i \in I} P_i$ is called the product measure corresponding to $P_i$ for $i \in I$, and $(\Omega, \mathcal{A}, P)$ is called the product measure space corresponding to $(\Omega_i, \mathcal{A}_i, P_i)$ for $i \in I$.

6 Independence

"... the concept of independence ... plays a central role in probability theory; it is precisely this concept that distinguishes probability theory from the general theory of measure spaces", see Shiryayev (1984, p. 27).

In the sequel, $(\Omega, \mathcal{A}, P)$ denotes a probability space and $I$ is a non-empty set.

Independence of Events

Definition 1. Let $A_i \in \mathcal{A}$ for $i \in I$. Then $(A_i)_{i \in I}$ is independent if
\[ P\Big( \bigcap_{i \in S} A_i \Big) = \prod_{i \in S} P(A_i) \tag{1} \]
for every $S \in \mathfrak{P}_0(I)$. Elementary case: $|I| = 2$.

In the sequel, $\mathcal{E}_i \subseteq \mathcal{A}$ for $i \in I$.

Definition 2. $(\mathcal{E}_i)_{i \in I}$ is independent if (1) holds for every $S \in \mathfrak{P}_0(I)$ and all $A_i \in \mathcal{E}_i$ for $i \in S$.

Remark 3.
(i) $(\mathcal{E}_i)_{i \in I}$ independent $\wedge$ $\forall\, i \in I: \tilde{\mathcal{E}}_i \subseteq \mathcal{E}_i$ $\Rightarrow$ $(\tilde{\mathcal{E}}_i)_{i \in I}$ independent.
(ii) $(\mathcal{E}_i)_{i \in I}$ independent $\Leftrightarrow$ $\forall\, S \in \mathfrak{P}_0(I)$: $(\mathcal{E}_i)_{i \in S}$ independent.

Lemma 4. $(\mathcal{E}_i)_{i \in I}$ independent $\Rightarrow$ $(\delta(\mathcal{E}_i))_{i \in I}$ independent.

Proof. Without loss of generality, $I = \{1, \ldots, n\}$ and $n \geq 2$, see Remark 3.(ii). Put
\[ \mathcal{D}_1 = \{A \in \delta(\mathcal{E}_1) : (\{A\}, \mathcal{E}_2, \ldots, \mathcal{E}_n) \text{ independent}\}. \]
Then $\mathcal{D}_1$ is a Dynkin class and $\mathcal{E}_1 \subseteq \mathcal{D}_1$, hence $\delta(\mathcal{E}_1) = \mathcal{D}_1$. Thus $(\delta(\mathcal{E}_1), \mathcal{E}_2, \ldots, \mathcal{E}_n)$ is independent. Repeat this step for $2, \ldots, n$.

Theorem 5. If
\[ (\mathcal{E}_i)_{i \in I} \text{ independent} \;\wedge\; \forall\, i \in I: \; \mathcal{E}_i \text{ closed w.r.t. intersections}, \tag{2} \]
then $(\sigma(\mathcal{E}_i))_{i \in I}$ is independent.

Proof. Use Theorem A.7.4 and Lemma 4.

Corollary 6. Assume that $I = \bigcup_{j \in J} I_j$ for pairwise disjoint sets $I_j$. If (2) holds, then
\[ \Big( \sigma\Big( \bigcup_{i \in I_j} \mathcal{E}_i \Big) \Big)_{j \in J} \text{ is independent}. \]

Proof. Let
\[ \tilde{\mathcal{E}}_j = \Big\{ \bigcap_{i \in S} A_i : S \in \mathfrak{P}_0(I_j) \wedge A_i \in \mathcal{E}_i \text{ for } i \in S \Big\}. \]
Then $\tilde{\mathcal{E}}_j$ is closed w.r.t. intersections and $(\tilde{\mathcal{E}}_j)_{j \in J}$ is independent. Finally,
\[ \sigma\Big( \bigcup_{i \in I_j} \mathcal{E}_i \Big) = \sigma(\tilde{\mathcal{E}}_j). \]

Independence of Random Elements

In the sequel, $(\Omega_i', \mathcal{A}_i')$ denotes a measurable space for $i \in I$, and $X_i : \Omega \to \Omega_i'$ is $\mathcal{A}$-$\mathcal{A}_i'$-measurable for $i \in I$.

Definition 7. $(X_i)_{i \in I}$ is independent if $(\sigma(X_i))_{i \in I}$ is independent.

Remark 8. See Übung 6.2 for a characterization of independence in terms of prediction problems.

Example 9. Actually, the essence of independence. Assume that
\[ (\Omega, \mathcal{A}, P) = \Big( \prod_{i \in I} \Omega_i', \; \bigotimes_{i \in I} \mathcal{A}_i', \; \bigotimes_{i \in I} P_i \Big) \]
for probability measures $P_i$ on $\mathcal{A}_i'$. Let $X_i = \pi_i$. Then, for $S \in \mathfrak{P}_0(I)$ and $A_i \in \mathcal{A}_i'$ for $i \in S$,
\[ P\Big( \bigcap_{i \in S} \{X_i \in A_i\} \Big) = P\Big( \prod_{i \in S} A_i \times \prod_{i \in I \setminus S} \Omega_i' \Big) = \prod_{i \in S} P_i(A_i) = \prod_{i \in S} P(\{X_i \in A_i\}). \]
Hence $(X_i)_{i \in I}$ is independent. Furthermore, $P_{X_i} = P_i$.

Theorem 10. Given probability spaces $(\Omega_i', \mathcal{A}_i', P_i)$ for $i \in I$, there exist
(i) a probability space $(\Omega, \mathcal{A}, P)$ and
(ii) $\mathcal{A}$-$\mathcal{A}_i'$-measurable mappings $X_i : \Omega \to \Omega_i'$ for $i \in I$
such that $(X_i)_{i \in I}$ is independent and $\forall\, i \in I: \; P_{X_i} = P_i$.

Proof. See Example 9.

Theorem 11. Let $\mathcal{F}_i \subseteq \mathcal{A}_i'$ for $i \in I$. If
\[ \forall\, i \in I: \; \sigma(\mathcal{F}_i) = \mathcal{A}_i' \;\wedge\; \mathcal{F}_i \text{ closed w.r.t. intersections}, \]
then
\[ (X_i)_{i \in I} \text{ independent} \;\Leftrightarrow\; (X_i^{-1}(\mathcal{F}_i))_{i \in I} \text{ independent}. \]

Proof. By Elstrodt (2011, Satz I.4.4) or Lemma II.3.6 in Maß- und Integrationstheorie (2016) we have
\[ \sigma(X_i) = X_i^{-1}(\mathcal{A}_i') = \sigma(X_i^{-1}(\mathcal{F}_i)). \]
"$\Rightarrow$": See Remark 3.(i). "$\Leftarrow$": Note that $X_i^{-1}(\mathcal{F}_i)$ is closed w.r.t. intersections. Use Theorem 5.

Example 12. Independence of a family of random variables $X_i$, i.e., $(\Omega_i', \mathcal{A}_i') = (\mathbb{R}, \mathfrak{B})$ for $i \in I$. In this case $(X_i)_{i \in I}$ is independent iff
\[ \forall\, S \in \mathfrak{P}_0(I) \;\forall\, c_i \in \mathbb{R}, \, i \in S: \; P\Big( \bigcap_{i \in S} \{X_i \leq c_i\} \Big) = \prod_{i \in S} P(\{X_i \leq c_i\}). \]
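Example 12 suggests an empirical check of independence via joint versus factorized probabilities (an added sketch, assuming NumPy; the standard normal coordinates are our choice, and with finitely many samples the two sides agree only approximately):

```python
import numpy as np

rng = np.random.default_rng(6)
N = 1_000_000

# X_1, X_2 independent standard normal by construction:
x1, x2 = rng.normal(size=(2, N))

for c1, c2 in [(0.0, 0.0), (1.0, -0.5)]:
    joint = ((x1 <= c1) & (x2 <= c2)).mean()
    product = (x1 <= c1).mean() * (x2 <= c2).mean()
    print(joint, product)  # approximately equal for every choice of c1, c2
```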

Theorem 13. Let
(i) $I = \bigcup_{j \in J} I_j$ for pairwise disjoint sets $I_j$,
(ii) $(\tilde{\Omega}_j, \tilde{\mathcal{A}}_j)$ be measurable spaces for $j \in J$,
(iii) $f_j : \prod_{i \in I_j} \Omega_i' \to \tilde{\Omega}_j$ be $\big( \bigotimes_{i \in I_j} \mathcal{A}_i' \big)$-$\tilde{\mathcal{A}}_j$-measurable mappings for $j \in J$.
Put
\[ Y_j = (X_i)_{i \in I_j} : \Omega \to \prod_{i \in I_j} \Omega_i'. \]
Then
\[ (X_i)_{i \in I} \text{ independent} \;\Rightarrow\; (f_j \circ Y_j)_{j \in J} \text{ independent}. \]

Proof.
\[ \sigma(f_j \circ Y_j) = Y_j^{-1}\big( f_j^{-1}(\tilde{\mathcal{A}}_j) \big) \subseteq Y_j^{-1}\Big( \bigotimes_{i \in I_j} \mathcal{A}_i' \Big) = \sigma(\{X_i : i \in I_j\}) = \sigma\Big( \bigcup_{i \in I_j} X_i^{-1}(\mathcal{A}_i') \Big). \]
Use Corollary 6 and Remark 3.(i).

Example 14. For an independent sequence $(X_i)_{i \in \mathbb{N}}$ of random variables,
\[ \max(X_1, X_3), \qquad 1_{[0,\infty[}(X_2), \qquad \limsup_{n \to \infty} \frac{1}{n} \sum_{i=4}^{n} X_i \]
are independent.

Independence and Product Measures

Remark 15. Consider the mapping
\[ X : \Omega \to \prod_{i \in I} \Omega_i' : \omega \mapsto (X_i(\omega))_{i \in I}. \]
Clearly $X$ is $\mathcal{A}$-$\bigotimes_{i \in I} \mathcal{A}_i'$-measurable. By definition, $P_X(A) = P(\{X \in A\})$ for $A \in \bigotimes_{i \in I} \mathcal{A}_i'$. In particular, for measurable rectangles $A \in \bigotimes_{i \in I} \mathcal{A}_i'$, i.e.,
\[ A = \prod_{i \in S} A_i \times \prod_{i \in I \setminus S} \Omega_i' \tag{3} \]
with $S \in \mathfrak{P}_0(I)$ and $A_i \in \mathcal{A}_i'$,
\[ P_X(A) = P\Big( \bigcap_{i \in S} \{X_i \in A_i\} \Big). \tag{4} \]

Definition 16.
(i) $P_X$ is called the joint distribution of the random elements $X_i$, $i \in I$.
(ii) Let $P$ denote a probability measure on $\big( \prod_{i \in I} \Omega_i', \bigotimes_{i \in I} \mathcal{A}_i' \big)$, and let $i \in I$. Then $P_{\pi_i}$ is called a (one-dimensional) marginal distribution of $P$.

Example 17. Let $\Omega = \{1, \ldots, 6\}^2$ and consider the uniform distribution $P$ on $\mathcal{A} = \mathfrak{P}(\Omega)$, which is a model for rolling a die twice. Moreover, let $\Omega_i' = \mathbb{N}$ and $\mathcal{A}_i' = \mathfrak{P}(\Omega_i')$, so that $\bigotimes_{i=1}^2 \mathcal{A}_i' = \mathfrak{P}(\mathbb{N}^2)$. Consider the random variables
\[ X_1(\omega_1, \omega_2) = \omega_1, \qquad X_2(\omega_1, \omega_2) = \omega_1 + \omega_2. \]
Then
\[ P_X(A) = \frac{|A \cap M|}{36}, \qquad A \subseteq \mathbb{N}^2, \]
where
\[ M = \{(k, l) \in \mathbb{N}^2 : 1 \leq k \leq 6 \;\wedge\; k + 1 \leq l \leq k + 6\}. \]
Claim: $(X_1, X_2)$ is not independent. Proof:
\[ P(\{X_1 = 1\} \cap \{X_2 = 3\}) = P_X(\{(1, 3)\}) = P(\{(1, 2)\}) = 1/36, \]
but
\[ P(\{X_1 = 1\}) \cdot P(\{X_2 = 3\}) = \tfrac{1}{6} \cdot P(\{(1, 2), (2, 1)\}) = \tfrac{1}{3} \cdot \tfrac{1}{36} \neq \tfrac{1}{36}. \]
We add that
\[ P_{X_1} = \sum_{k=1}^{6} \tfrac{1}{6} \, \varepsilon_k, \qquad P_{X_2} = \sum_{l=2}^{12} \frac{6 - |l - 7|}{36} \, \varepsilon_l. \]

Theorem 18.
\[ (X_i)_{i \in I} \text{ independent} \;\Leftrightarrow\; P_X = \bigotimes_{i \in I} P_{X_i}. \]

Proof. For $A$ given by (3),
\[ \Big( \bigotimes_{i \in I} P_{X_i} \Big)(A) = \prod_{i \in S} P_{X_i}(A_i) = \prod_{i \in S} P(\{X_i \in A_i\}). \]
On the other hand, we have (4). Thus "$\Leftarrow$" holds trivially. Use Theorem A.1.5 to obtain "$\Rightarrow$".

In the sequel, we consider random variables $X_i$, i.e., $(\Omega_i', \mathcal{A}_i') = (\mathbb{R}, \mathfrak{B})$ for $i \in I$.

Theorem 19. Let $I = \{1, \ldots, n\}$. If
\[ (X_1, \ldots, X_n) \text{ independent} \;\wedge\; \forall\, i \in I: \; X_i \geq 0 \; (X_i \text{ integrable}), \]
then ($\prod_{i=1}^n X_i$ is integrable and)
\[ \mathrm{E}\Big( \prod_{i=1}^n X_i \Big) = \prod_{i=1}^n \mathrm{E}(X_i). \]

Proof. Use Fubini's Theorem and Theorem 18 to obtain
\[ \mathrm{E}\Big( \prod_{i=1}^n |X_i| \Big) = \int_{\mathbb{R}^n} |x_1 \cdots x_n| \, P_{(X_1, \ldots, X_n)}(d(x_1, \ldots, x_n)) = \int_{\mathbb{R}^n} |x_1 \cdots x_n| \, (P_{X_1} \otimes \cdots \otimes P_{X_n})(d(x_1, \ldots, x_n)) = \prod_{i=1}^n \int_{\mathbb{R}} |x_i| \, P_{X_i}(dx_i) = \prod_{i=1}^n \mathrm{E}(|X_i|). \]
Drop the absolute values if the random variables are integrable.

Uncorrelated Random Variables

Definition 20. $X_1, X_2 \in L^2$ are uncorrelated if $\mathrm{E}(X_1 X_2) = \mathrm{E}(X_1) \cdot \mathrm{E}(X_2)$.

Theorem 21 (Bienaymé). Let $X_1, \ldots, X_n \in L^2$ be pairwise uncorrelated. Then
\[ \mathrm{Var}\Big( \sum_{i=1}^n X_i \Big) = \sum_{i=1}^n \mathrm{Var}(X_i). \]

Proof. We have
\[ \mathrm{Var}\Big( \sum_{i=1}^n X_i \Big) = \mathrm{E}\Big( \sum_{i=1}^n (X_i - \mathrm{E}(X_i)) \Big)^2 = \sum_{i=1}^n \mathrm{E}(X_i - \mathrm{E}(X_i))^2 + \sum_{\substack{i,j=1 \\ i \neq j}}^n \mathrm{E}\big( (X_i - \mathrm{E}(X_i))(X_j - \mathrm{E}(X_j)) \big). \]
Moreover,
\[ \mathrm{E}\big( (X_i - \mathrm{E}(X_i))(X_j - \mathrm{E}(X_j)) \big) = \mathrm{E}(X_i X_j) - \mathrm{E}(X_i) \cdot \mathrm{E}(X_j). \]
(The latter quantity is called the covariance of $X_i$ and $X_j$.)

Convolutions

Definition 22. The convolution product of probability measures $P_1, \ldots, P_n$ on $\mathfrak{B}$ is defined by
\[ P_1 * \cdots * P_n = s(P_1 \otimes \cdots \otimes P_n), \]
where $s(x_1, \ldots, x_n) = x_1 + \cdots + x_n$.

Theorem 23. Let $(X_1, \ldots, X_n)$ be independent and $S = \sum_{i=1}^n X_i$. Then
\[ P_S = P_{X_1} * \cdots * P_{X_n}. \]

Proof. Put $X = (X_1, \ldots, X_n)$. Since $S = s \circ (X_1, \ldots, X_n)$, we get
\[ P_S = s(P_X) = s(P_{X_1} \otimes \cdots \otimes P_{X_n}). \]

Remark 24. The class of probability measures on $\mathfrak{B}$ forms an Abelian semi-group w.r.t. $*$, and $P * \varepsilon_0 = P$.

Theorem 25. For all probability measures $P_1, P_2$ on $\mathfrak{B}$ and every $P_1 * P_2$-integrable function $f$,
\[ \int_{\mathbb{R}} f \, d(P_1 * P_2) = \int_{\mathbb{R}} \int_{\mathbb{R}} f(x + y) \, P_1(dx) \, P_2(dy). \]
If $P_1 = h_1 \cdot \lambda_1$ then $P_1 * P_2 = h \cdot \lambda_1$ with
\[ h(x) = \int_{\mathbb{R}} h_1(x - y) \, P_2(dy). \]
If, additionally, $P_2 = h_2 \cdot \lambda_1$, then
\[ h(x) = \int_{\mathbb{R}} h_1(x - y) \, h_2(y) \, \lambda_1(dy). \]

Proof. Use Fubini's Theorem and the transformation theorem. See Billingsley (1979, p. 230).

Example 26.
(i) Put $N(\mu, 0) = \varepsilon_\mu$. By Theorems 21 and 25,
\[ N(\mu_1, \sigma_1^2) * N(\mu_2, \sigma_2^2) = N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2) \]
for $\mu_i \in \mathbb{R}$ and $\sigma_i \geq 0$.
(ii) Consider $n$ independent Bernoulli trials, i.e., $(X_1, \ldots, X_n)$ independent with
\[ P_{X_i} = p \cdot \varepsilon_1 + (1 - p) \cdot \varepsilon_0 \]
for every $i \in \{1, \ldots, n\}$, where $p \in [0, 1]$. Inductively, we get
\[ \sum_{i=1}^k X_i \sim B(k, p) \]
for $k \in \{1, \ldots, n\}$, see also Übung 4.4. Thus, for any $n, m \in \mathbb{N}$,
\[ B(n, p) * B(m, p) = B(n + m, p). \]
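For distributions on the integers the convolution product reduces to a discrete convolution of the weight sequences, so Example 26.(ii) can be checked numerically (an added sketch, assuming NumPy; binomial_pmf is again our helper):

```python
import numpy as np
from math import comb

def binomial_pmf(n, p):
    # Weights of B(n, p) on {0, ..., n}.
    return np.array([comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)])

p = 0.3
lhs = np.convolve(binomial_pmf(4, p), binomial_pmf(6, p))  # B(4,p) * B(6,p)
rhs = binomial_pmf(10, p)                                  # B(10,p)
print(np.max(np.abs(lhs - rhs)))  # ~1e-17: the two measures coincide
```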

Chapter III: Limit Theorems

Given: a sequence $(X_n)_{n \in \mathbb{N}}$ of random variables on a probability space $(\Omega, \mathcal{A}, P)$ and weights $0 < a_n$. Put
\[ S_n = \sum_{i=1}^n X_i, \qquad n \in \mathbb{N}_0. \]
For instance, $S_n$ might be the cumulative gain after $n$ trials or (one of the coordinates of) the position of a particle after $n$ collisions.

Question: convergence of $S_n / a_n$ for suitable weights $a_n$, in a suitable sense? Particular case: $a_n = n$.

Definition 1. $(X_n)_{n \in \mathbb{N}}$ is independent and identically distributed (i.i.d.) if $(X_n)_{n \in \mathbb{N}}$ is independent and
\[ \forall\, n \in \mathbb{N}: \; P_{X_n} = P_{X_1}. \]
In this case $(S_n)_{n \in \mathbb{N}_0}$ is called the associated random walk.

1 Zero-One Laws

Kolmogorov's Zero-One Law

Definition 1. For $\sigma$-algebras $\mathcal{A}_n \subseteq \mathcal{A}$, $n \in \mathbb{N}$, the corresponding tail $\sigma$-algebra is
\[ \mathcal{A}_\infty = \bigcap_{n \in \mathbb{N}} \sigma\Big( \bigcup_{m \geq n} \mathcal{A}_m \Big), \]
and $A \in \mathcal{A}_\infty$ is called a tail (terminal) event.

Example 2. Let $\mathcal{A}_n = \sigma(X_n)$. Put $\mathfrak{C} = \bigotimes_{i \in \mathbb{N}} \mathfrak{B}$. Then
\[ \mathcal{A}_\infty = \bigcap_{n \in \mathbb{N}} \sigma(\{X_m : m \geq n\}) \]
and
\[ A \in \mathcal{A}_\infty \;\Leftrightarrow\; \forall\, n \in \mathbb{N} \;\exists\, C \in \mathfrak{C}: \; A = \{(X_n, X_{n+1}, \ldots) \in C\}. \]

For instance,
\[ \{(S_n)_{n \in \mathbb{N}} \text{ converges}\}, \; \{(S_n/a_n)_{n \in \mathbb{N}} \text{ converges}\} \in \mathcal{A}_\infty, \]
and the function $\liminf_{n \to \infty} S_n/a_n$ is $\mathcal{A}_\infty$-$\overline{\mathfrak{B}}$-measurable. However, $S_n$ as well as $\liminf_{n \to \infty} S_n$ are not $\mathcal{A}_\infty$-$\overline{\mathfrak{B}}$-measurable, in general. Analogously for the $\limsup$'s.

Theorem 3 (Kolmogorov's Zero-One Law). Let $(\mathcal{A}_n)_{n \in \mathbb{N}}$ be an independent sequence of $\sigma$-algebras $\mathcal{A}_n \subseteq \mathcal{A}$. Then
\[ \forall\, A \in \mathcal{A}_\infty: \; P(A) \in \{0, 1\}. \]

Proof. We show that $\mathcal{A}_\infty$ and $\mathcal{A}_\infty$ are independent (terminology), which implies $P(A) = P(A) \cdot P(A)$ for every $A \in \mathcal{A}_\infty$. Put $\tilde{\mathcal{A}}_n = \sigma(\mathcal{A}_1 \cup \cdots \cup \mathcal{A}_n)$. Note that $\mathcal{A}_\infty \subseteq \sigma(\mathcal{A}_{n+1} \cup \cdots)$. By Corollary II.6.6 and Remark II.6.3.(i),
\[ \tilde{\mathcal{A}}_n \text{ and } \mathcal{A}_\infty \text{ are independent}, \]
and therefore $\bigcup_{n \in \mathbb{N}} \tilde{\mathcal{A}}_n$ and $\mathcal{A}_\infty$ are independent, too. Thus, by Theorem II.6.5,
\[ \sigma\Big( \bigcup_{n \in \mathbb{N}} \tilde{\mathcal{A}}_n \Big) \text{ and } \mathcal{A}_\infty \text{ are independent}. \]
Finally,
\[ \mathcal{A}_\infty \subseteq \sigma\Big( \bigcup_{n \in \mathbb{N}} \mathcal{A}_n \Big) = \sigma\Big( \bigcup_{n \in \mathbb{N}} \tilde{\mathcal{A}}_n \Big). \]

Corollary 4. Let $X \in \overline{Z}(\Omega, \mathcal{A}_\infty)$. Under the assumptions of Theorem 3, $X$ is constant $P$-a.s.

Remark 5. Assume that $(X_n)_{n \in \mathbb{N}}$ is independent. Then
\[ P(\{(S_n)_{n \in \mathbb{N}} \text{ converges}\}), \; P(\{(S_n/a_n)_{n \in \mathbb{N}} \text{ converges}\}) \in \{0, 1\}. \]
In case of convergence $P$-a.s., $\lim_{n \to \infty} S_n/a_n$ is constant $P$-a.s.

The Borel-Cantelli Lemma

Definition 6. Let $A_n \in \mathcal{A}$ for $n \in \mathbb{N}$. Then
\[ \liminf_{n \to \infty} A_n = \bigcup_{n \in \mathbb{N}} \bigcap_{m \geq n} A_m, \qquad \limsup_{n \to \infty} A_n = \bigcap_{n \in \mathbb{N}} \bigcup_{m \geq n} A_m. \]

Remark 7.
(i) $\big( \liminf_{n \to \infty} A_n \big)^c = \limsup_{n \to \infty} A_n^c$.

(ii) $P\big( \liminf_{n \to \infty} A_n \big) \leq \liminf_{n \to \infty} P(A_n) \leq \limsup_{n \to \infty} P(A_n) \leq P\big( \limsup_{n \to \infty} A_n \big)$.
(iii) If $(A_n)_{n \in \mathbb{N}}$ is independent, then $P\big( \limsup_{n \to \infty} A_n \big) \in \{0, 1\}$ (Borel's Zero-One Law).
Proof: Übung 6.3.

Theorem 8 (Borel-Cantelli Lemma). Let $A = \limsup_{n \to \infty} A_n$ with $A_n \in \mathcal{A}$.
(i) If $\sum_{n=1}^\infty P(A_n) < \infty$ then $P(A) = 0$.
(ii) If $\sum_{n=1}^\infty P(A_n) = \infty$ and $(A_n)_{n \in \mathbb{N}}$ is independent, then $P(A) = 1$.

Proof. Ad (i):
\[ P(A) \leq P\Big( \bigcup_{m \geq n} A_m \Big) \leq \sum_{m=n}^\infty P(A_m). \]
By assumption, the right-hand side tends to zero as $n$ tends to $\infty$.

Ad (ii): We have
\[ P(A^c) = P\big( \liminf_{n \to \infty} A_n^c \big) = \lim_{n \to \infty} P\Big( \bigcap_{m \geq n} A_m^c \Big). \]
Use $1 - x \leq \exp(-x)$ for $x \geq 0$ to obtain
\[ P\Big( \bigcap_{m=n}^{l} A_m^c \Big) = \prod_{m=n}^{l} (1 - P(A_m)) \leq \prod_{m=n}^{l} \exp(-P(A_m)) = \exp\Big( -\sum_{m=n}^{l} P(A_m) \Big). \]
By assumption, the right-hand side tends to zero as $l$ tends to $\infty$. Thus $P(A^c) = 0$.

Example 9. A fair coin is tossed an infinite number of times. Determine the probability that 0 occurs twice in a row infinitely often. Model: $(X_n)_{n \in \mathbb{N}}$ i.i.d. with
\[ P(\{X_1 = 0\}) = P(\{X_1 = 1\}) = 1/2. \]
Put
\[ A_n = \{X_n = X_{n+1} = 0\}. \]
Then $(A_{2n})_{n \in \mathbb{N}}$ is independent and $P(A_{2n}) = 1/4$. Thus $P(\limsup_{n \to \infty} A_n) = 1$.

Remark 10. A stronger version of Theorem 8.(ii) requires only pairwise independence, see Bauer (1996, p. 70).

Example 11. Let $(X_n)_{n \in \mathbb{N}}$ be i.i.d. with
\[ P(\{X_1 = 1\}) = p = 1 - P(\{X_1 = -1\}) \]
for some constant $p \in [0, 1]$. Put $\mathcal{A}_n = \sigma(X_n)$ and
\[ A = \limsup_{n \to \infty} \{S_n = 0\}, \]

and note that
\[ A \notin \mathcal{A}_\infty = \bigcap_{n \in \mathbb{N}} \sigma(\{X_m : m \geq n\}), \]
at least in the standard setting from Example II.6.9. Clearly
\[ \tfrac{1}{2}(S_n + n) \sim B(n, p). \]
Use Stirling's Formula
\[ n! \sim \sqrt{2\pi n} \, \Big( \frac{n}{e} \Big)^n \]
to obtain
\[ P(\{S_{2n} = 0\}) = \binom{2n}{n} p^n (1-p)^n \sim \frac{r^n}{\sqrt{\pi n}}, \]
where $r = 4p(1-p) \in [0, 1]$. Suppose that $p \neq 1/2$. Then $r < 1$, and therefore
\[ \sum_{n=0}^\infty P(\{S_n = 0\}) = \sum_{n=0}^\infty P(\{S_{2n} = 0\}) < \infty. \]
The Borel-Cantelli Lemma implies $P(A) = 0$. Suppose that $p = 1/2$. Then
\[ \sum_{n=0}^\infty P(\{S_n = 0\}) = \sum_{n=0}^\infty P(\{S_{2n} = 0\}) = \infty, \]
but $(\{S_n = 0\})_{n \in \mathbb{N}}$ is not independent. Using the Central Limit Theorem one can show that $P(A) = 1$.

2 The Strong Law of Large Numbers

Throughout this section: $(X_n)_{n \in \mathbb{N}}$ independent.

The $L^2$ Case

Put
\[ C = \{(S_n)_{n \in \mathbb{N}} \text{ converges in } \mathbb{R}\}. \]
By Remark 1.5, $P(C) \in \{0, 1\}$. First we provide sufficient conditions for $P(C) = 1$ to hold.


STAT 7032 Probability. Wlodek Bryc

STAT 7032 Probability. Wlodek Bryc STAT 7032 Probability Wlodek Bryc Revised for Spring 2019 Printed: January 14, 2019 File: Grad-Prob-2019.TEX Department of Mathematical Sciences, University of Cincinnati, Cincinnati, OH 45221 E-mail address:

More information

Introduction and Preliminaries

Introduction and Preliminaries Chapter 1 Introduction and Preliminaries This chapter serves two purposes. The first purpose is to prepare the readers for the more systematic development in later chapters of methods of real analysis

More information

Measure and integration

Measure and integration Chapter 5 Measure and integration In calculus you have learned how to calculate the size of different kinds of sets: the length of a curve, the area of a region or a surface, the volume or mass of a solid.

More information

2 n k In particular, using Stirling formula, we can calculate the asymptotic of obtaining heads exactly half of the time:

2 n k In particular, using Stirling formula, we can calculate the asymptotic of obtaining heads exactly half of the time: Chapter 1 Random Variables 1.1 Elementary Examples We will start with elementary and intuitive examples of probability. The most well-known example is that of a fair coin: if flipped, the probability of

More information

1 Probability space and random variables

1 Probability space and random variables 1 Probability space and random variables As graduate level, we inevitably need to study probability based on measure theory. It obscures some intuitions in probability, but it also supplements our intuition,

More information

Lecture 2: Random Variables and Expectation

Lecture 2: Random Variables and Expectation Econ 514: Probability and Statistics Lecture 2: Random Variables and Expectation Definition of function: Given sets X and Y, a function f with domain X and image Y is a rule that assigns to every x X one

More information

1 Measurable Functions

1 Measurable Functions 36-752 Advanced Probability Overview Spring 2018 2. Measurable Functions, Random Variables, and Integration Instructor: Alessandro Rinaldo Associated reading: Sec 1.5 of Ash and Doléans-Dade; Sec 1.3 and

More information

02. Measure and integral. 1. Borel-measurable functions and pointwise limits

02. Measure and integral. 1. Borel-measurable functions and pointwise limits (October 3, 2017) 02. Measure and integral Paul Garrett garrett@math.umn.edu http://www.math.umn.edu/ garrett/ [This document is http://www.math.umn.edu/ garrett/m/real/notes 2017-18/02 measure and integral.pdf]

More information

9 Radon-Nikodym theorem and conditioning

9 Radon-Nikodym theorem and conditioning Tel Aviv University, 2015 Functions of real variables 93 9 Radon-Nikodym theorem and conditioning 9a Borel-Kolmogorov paradox............. 93 9b Radon-Nikodym theorem.............. 94 9c Conditioning.....................

More information

P (A G) dp G P (A G)

P (A G) dp G P (A G) First homework assignment. Due at 12:15 on 22 September 2016. Homework 1. We roll two dices. X is the result of one of them and Z the sum of the results. Find E [X Z. Homework 2. Let X be a r.v.. Assume

More information

Brownian Motion. 1 Definition Brownian Motion Wiener measure... 3

Brownian Motion. 1 Definition Brownian Motion Wiener measure... 3 Brownian Motion Contents 1 Definition 2 1.1 Brownian Motion................................. 2 1.2 Wiener measure.................................. 3 2 Construction 4 2.1 Gaussian process.................................

More information

The Essential Equivalence of Pairwise and Mutual Conditional Independence

The Essential Equivalence of Pairwise and Mutual Conditional Independence The Essential Equivalence of Pairwise and Mutual Conditional Independence Peter J. Hammond and Yeneng Sun Probability Theory and Related Fields, forthcoming Abstract For a large collection of random variables,

More information

3 (Due ). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?

3 (Due ). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure? MA 645-4A (Real Analysis), Dr. Chernov Homework assignment 1 (Due ). Show that the open disk x 2 + y 2 < 1 is a countable union of planar elementary sets. Show that the closed disk x 2 + y 2 1 is a countable

More information

MATH 6605: SUMMARY LECTURE NOTES

MATH 6605: SUMMARY LECTURE NOTES MATH 6605: SUMMARY LECTURE NOTES These notes summarize the lectures on weak convergence of stochastic processes. If you see any typos, please let me know. 1. Construction of Stochastic rocesses A stochastic

More information

Stochastic Convergence, Delta Method & Moment Estimators

Stochastic Convergence, Delta Method & Moment Estimators Stochastic Convergence, Delta Method & Moment Estimators Seminar on Asymptotic Statistics Daniel Hoffmann University of Kaiserslautern Department of Mathematics February 13, 2015 Daniel Hoffmann (TU KL)

More information

Problem set 1, Real Analysis I, Spring, 2015.

Problem set 1, Real Analysis I, Spring, 2015. Problem set 1, Real Analysis I, Spring, 015. (1) Let f n : D R be a sequence of functions with domain D R n. Recall that f n f uniformly if and only if for all ɛ > 0, there is an N = N(ɛ) so that if n

More information

Basics of Stochastic Analysis

Basics of Stochastic Analysis Basics of Stochastic Analysis c Timo Seppäläinen This version November 16, 214 Department of Mathematics, University of Wisconsin Madison, Madison, Wisconsin 5376 E-mail address: seppalai@math.wisc.edu

More information

MATH MEASURE THEORY AND FOURIER ANALYSIS. Contents

MATH MEASURE THEORY AND FOURIER ANALYSIS. Contents MATH 3969 - MEASURE THEORY AND FOURIER ANALYSIS ANDREW TULLOCH Contents 1. Measure Theory 2 1.1. Properties of Measures 3 1.2. Constructing σ-algebras and measures 3 1.3. Properties of the Lebesgue measure

More information

1 Presessional Probability

1 Presessional Probability 1 Presessional Probability Probability theory is essential for the development of mathematical models in finance, because of the randomness nature of price fluctuations in the markets. This presessional

More information

Filtrations, Markov Processes and Martingales. Lectures on Lévy Processes and Stochastic Calculus, Braunschweig, Lecture 3: The Lévy-Itô Decomposition

Filtrations, Markov Processes and Martingales. Lectures on Lévy Processes and Stochastic Calculus, Braunschweig, Lecture 3: The Lévy-Itô Decomposition Filtrations, Markov Processes and Martingales Lectures on Lévy Processes and Stochastic Calculus, Braunschweig, Lecture 3: The Lévy-Itô Decomposition David pplebaum Probability and Statistics Department,

More information

Lecture 10. Theorem 1.1 [Ergodicity and extremality] A probability measure µ on (Ω, F) is ergodic for T if and only if it is an extremal point in M.

Lecture 10. Theorem 1.1 [Ergodicity and extremality] A probability measure µ on (Ω, F) is ergodic for T if and only if it is an extremal point in M. Lecture 10 1 Ergodic decomposition of invariant measures Let T : (Ω, F) (Ω, F) be measurable, and let M denote the space of T -invariant probability measures on (Ω, F). Then M is a convex set, although

More information

II - REAL ANALYSIS. This property gives us a way to extend the notion of content to finite unions of rectangles: we define

II - REAL ANALYSIS. This property gives us a way to extend the notion of content to finite unions of rectangles: we define 1 Measures 1.1 Jordan content in R N II - REAL ANALYSIS Let I be an interval in R. Then its 1-content is defined as c 1 (I) := b a if I is bounded with endpoints a, b. If I is unbounded, we define c 1

More information

Ergodic Theorems. Samy Tindel. Purdue University. Probability Theory 2 - MA 539. Taken from Probability: Theory and examples by R.

Ergodic Theorems. Samy Tindel. Purdue University. Probability Theory 2 - MA 539. Taken from Probability: Theory and examples by R. Ergodic Theorems Samy Tindel Purdue University Probability Theory 2 - MA 539 Taken from Probability: Theory and examples by R. Durrett Samy T. Ergodic theorems Probability Theory 1 / 92 Outline 1 Definitions

More information

Part III Advanced Probability

Part III Advanced Probability Part III Advanced Probability Based on lectures by M. Lis Notes taken by Dexter Chua Michaelmas 2017 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after

More information

PROBABILITY: LIMIT THEOREMS II, SPRING HOMEWORK PROBLEMS

PROBABILITY: LIMIT THEOREMS II, SPRING HOMEWORK PROBLEMS PROBABILITY: LIMIT THEOREMS II, SPRING 218. HOMEWORK PROBLEMS PROF. YURI BAKHTIN Instructions. You are allowed to work on solutions in groups, but you are required to write up solutions on your own. Please

More information

n [ F (b j ) F (a j ) ], n j=1(a j, b j ] E (4.1)

n [ F (b j ) F (a j ) ], n j=1(a j, b j ] E (4.1) 1.4. CONSTRUCTION OF LEBESGUE-STIELTJES MEASURES In this section we shall put to use the Carathéodory-Hahn theory, in order to construct measures with certain desirable properties first on the real line

More information

Chapter 1. Measure Spaces. 1.1 Algebras and σ algebras of sets Notation and preliminaries

Chapter 1. Measure Spaces. 1.1 Algebras and σ algebras of sets Notation and preliminaries Chapter 1 Measure Spaces 1.1 Algebras and σ algebras of sets 1.1.1 Notation and preliminaries We shall denote by X a nonempty set, by P(X) the set of all parts (i.e., subsets) of X, and by the empty set.

More information

Admin and Lecture 1: Recap of Measure Theory

Admin and Lecture 1: Recap of Measure Theory Admin and Lecture 1: Recap of Measure Theory David Aldous January 16, 2018 I don t use bcourses: Read web page (search Aldous 205B) Web page rather unorganized some topics done by Nike in 205A will post

More information

Lectures on Integration. William G. Faris

Lectures on Integration. William G. Faris Lectures on Integration William G. Faris March 4, 2001 2 Contents 1 The integral: properties 5 1.1 Measurable functions......................... 5 1.2 Integration.............................. 7 1.3 Convergence

More information

. Find E(V ) and var(v ).

. Find E(V ) and var(v ). Math 6382/6383: Probability Models and Mathematical Statistics Sample Preliminary Exam Questions 1. A person tosses a fair coin until she obtains 2 heads in a row. She then tosses a fair die the same number

More information

Lecture 5. If we interpret the index n 0 as time, then a Markov chain simply requires that the future depends only on the present and not on the past.

Lecture 5. If we interpret the index n 0 as time, then a Markov chain simply requires that the future depends only on the present and not on the past. 1 Markov chain: definition Lecture 5 Definition 1.1 Markov chain] A sequence of random variables (X n ) n 0 taking values in a measurable state space (S, S) is called a (discrete time) Markov chain, if

More information

Stat 8112 Lecture Notes Weak Convergence in Metric Spaces Charles J. Geyer January 23, Metric Spaces

Stat 8112 Lecture Notes Weak Convergence in Metric Spaces Charles J. Geyer January 23, Metric Spaces Stat 8112 Lecture Notes Weak Convergence in Metric Spaces Charles J. Geyer January 23, 2013 1 Metric Spaces Let X be an arbitrary set. A function d : X X R is called a metric if it satisfies the folloing

More information

1 Independent increments

1 Independent increments Tel Aviv University, 2008 Brownian motion 1 1 Independent increments 1a Three convolution semigroups........... 1 1b Independent increments.............. 2 1c Continuous time................... 3 1d Bad

More information

On the Set of Limit Points of Normed Sums of Geometrically Weighted I.I.D. Bounded Random Variables

On the Set of Limit Points of Normed Sums of Geometrically Weighted I.I.D. Bounded Random Variables On the Set of Limit Points of Normed Sums of Geometrically Weighted I.I.D. Bounded Random Variables Deli Li 1, Yongcheng Qi, and Andrew Rosalsky 3 1 Department of Mathematical Sciences, Lakehead University,

More information

Product measures, Tonelli s and Fubini s theorems For use in MAT4410, autumn 2017 Nadia S. Larsen. 17 November 2017.

Product measures, Tonelli s and Fubini s theorems For use in MAT4410, autumn 2017 Nadia S. Larsen. 17 November 2017. Product measures, Tonelli s and Fubini s theorems For use in MAT4410, autumn 017 Nadia S. Larsen 17 November 017. 1. Construction of the product measure The purpose of these notes is to prove the main

More information

MAT 571 REAL ANALYSIS II LECTURE NOTES. Contents. 2. Product measures Iterated integrals Complete products Differentiation 17

MAT 571 REAL ANALYSIS II LECTURE NOTES. Contents. 2. Product measures Iterated integrals Complete products Differentiation 17 MAT 57 REAL ANALSIS II LECTURE NOTES PROFESSOR: JOHN QUIGG SEMESTER: SPRING 205 Contents. Convergence in measure 2. Product measures 3 3. Iterated integrals 4 4. Complete products 9 5. Signed measures

More information

STOR 635 Notes (S13)

STOR 635 Notes (S13) STOR 635 Notes (S13) Jimmy Jin UNC-Chapel Hill Last updated: 1/14/14 Contents 1 Measure theory and probability basics 2 1.1 Algebras and measure.......................... 2 1.2 Integration................................

More information

Lecture 2: Repetition of probability theory and statistics

Lecture 2: Repetition of probability theory and statistics Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:

More information

Stochastic Processes (Stochastik II)

Stochastic Processes (Stochastik II) Stochastic Processes (Stochastik II) Lecture Notes Zakhar Kabluchko University of Ulm Institute of Stochastics L A TEX-version: Judith Schmidt Vorwort Dies ist ein unvollständiges Skript zur Vorlesung

More information

PCMI Introduction to Random Matrix Theory Handout # REVIEW OF PROBABILITY THEORY. Chapter 1 - Events and Their Probabilities

PCMI Introduction to Random Matrix Theory Handout # REVIEW OF PROBABILITY THEORY. Chapter 1 - Events and Their Probabilities PCMI 207 - Introduction to Random Matrix Theory Handout #2 06.27.207 REVIEW OF PROBABILITY THEORY Chapter - Events and Their Probabilities.. Events as Sets Definition (σ-field). A collection F of subsets

More information

An almost sure invariance principle for additive functionals of Markov chains

An almost sure invariance principle for additive functionals of Markov chains Statistics and Probability Letters 78 2008 854 860 www.elsevier.com/locate/stapro An almost sure invariance principle for additive functionals of Markov chains F. Rassoul-Agha a, T. Seppäläinen b, a Department

More information

1/12/05: sec 3.1 and my article: How good is the Lebesgue measure?, Math. Intelligencer 11(2) (1989),

1/12/05: sec 3.1 and my article: How good is the Lebesgue measure?, Math. Intelligencer 11(2) (1989), Real Analysis 2, Math 651, Spring 2005 April 26, 2005 1 Real Analysis 2, Math 651, Spring 2005 Krzysztof Chris Ciesielski 1/12/05: sec 3.1 and my article: How good is the Lebesgue measure?, Math. Intelligencer

More information

Lecture 9. d N(0, 1). Now we fix n and think of a SRW on [0,1]. We take the k th step at time k n. and our increments are ± 1

Lecture 9. d N(0, 1). Now we fix n and think of a SRW on [0,1]. We take the k th step at time k n. and our increments are ± 1 Random Walks and Brownian Motion Tel Aviv University Spring 011 Lecture date: May 0, 011 Lecture 9 Instructor: Ron Peled Scribe: Jonathan Hermon In today s lecture we present the Brownian motion (BM).

More information

1.1. MEASURES AND INTEGRALS

1.1. MEASURES AND INTEGRALS CHAPTER 1: MEASURE THEORY In this chapter we define the notion of measure µ on a space, construct integrals on this space, and establish their basic properties under limits. The measure µ(e) will be defined

More information

Fundamental Inequalities, Convergence and the Optional Stopping Theorem for Continuous-Time Martingales

Fundamental Inequalities, Convergence and the Optional Stopping Theorem for Continuous-Time Martingales Fundamental Inequalities, Convergence and the Optional Stopping Theorem for Continuous-Time Martingales Prakash Balachandran Department of Mathematics Duke University April 2, 2008 1 Review of Discrete-Time

More information

2 (Bonus). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?

2 (Bonus). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure? MA 645-4A (Real Analysis), Dr. Chernov Homework assignment 1 (Due 9/5). Prove that every countable set A is measurable and µ(a) = 0. 2 (Bonus). Let A consist of points (x, y) such that either x or y is

More information

Weak convergence. Amsterdam, 13 November Leiden University. Limit theorems. Shota Gugushvili. Generalities. Criteria

Weak convergence. Amsterdam, 13 November Leiden University. Limit theorems. Shota Gugushvili. Generalities. Criteria Weak Leiden University Amsterdam, 13 November 2013 Outline 1 2 3 4 5 6 7 Definition Definition Let µ, µ 1, µ 2,... be probability measures on (R, B). It is said that µ n converges weakly to µ, and we then

More information

Combinatorics in Banach space theory Lecture 12

Combinatorics in Banach space theory Lecture 12 Combinatorics in Banach space theory Lecture The next lemma considerably strengthens the assertion of Lemma.6(b). Lemma.9. For every Banach space X and any n N, either all the numbers n b n (X), c n (X)

More information

1.1 Review of Probability Theory

1.1 Review of Probability Theory 1.1 Review of Probability Theory Angela Peace Biomathemtics II MATH 5355 Spring 2017 Lecture notes follow: Allen, Linda JS. An introduction to stochastic processes with applications to biology. CRC Press,

More information

ON THE REGULARITY OF SAMPLE PATHS OF SUB-ELLIPTIC DIFFUSIONS ON MANIFOLDS

ON THE REGULARITY OF SAMPLE PATHS OF SUB-ELLIPTIC DIFFUSIONS ON MANIFOLDS Bendikov, A. and Saloff-Coste, L. Osaka J. Math. 4 (5), 677 7 ON THE REGULARITY OF SAMPLE PATHS OF SUB-ELLIPTIC DIFFUSIONS ON MANIFOLDS ALEXANDER BENDIKOV and LAURENT SALOFF-COSTE (Received March 4, 4)

More information

Dynkin (λ-) and π-systems; monotone classes of sets, and of functions with some examples of application (mainly of a probabilistic flavor)

Dynkin (λ-) and π-systems; monotone classes of sets, and of functions with some examples of application (mainly of a probabilistic flavor) Dynkin (λ-) and π-systems; monotone classes of sets, and of functions with some examples of application (mainly of a probabilistic flavor) Matija Vidmar February 7, 2018 1 Dynkin and π-systems Some basic

More information

MATH 418: Lectures on Conditional Expectation

MATH 418: Lectures on Conditional Expectation MATH 418: Lectures on Conditional Expectation Instructor: r. Ed Perkins, Notes taken by Adrian She Conditional expectation is one of the most useful tools of probability. The Radon-Nikodym theorem enables

More information

4 Countability axioms

4 Countability axioms 4 COUNTABILITY AXIOMS 4 Countability axioms Definition 4.1. Let X be a topological space X is said to be first countable if for any x X, there is a countable basis for the neighborhoods of x. X is said

More information

Part V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory

Part V. 17 Introduction: What are measures and why measurable sets. Lebesgue Integration Theory Part V 7 Introduction: What are measures and why measurable sets Lebesgue Integration Theory Definition 7. (Preliminary). A measure on a set is a function :2 [ ] such that. () = 2. If { } = is a finite

More information

1 Introduction. 2 Measure theoretic definitions

1 Introduction. 2 Measure theoretic definitions 1 Introduction These notes aim to recall some basic definitions needed for dealing with random variables. Sections to 5 follow mostly the presentation given in chapter two of [1]. Measure theoretic definitions

More information

Homework Assignment #2 for Prob-Stats, Fall 2018 Due date: Monday, October 22, 2018

Homework Assignment #2 for Prob-Stats, Fall 2018 Due date: Monday, October 22, 2018 Homework Assignment #2 for Prob-Stats, Fall 2018 Due date: Monday, October 22, 2018 Topics: consistent estimators; sub-σ-fields and partial observations; Doob s theorem about sub-σ-field measurability;

More information

REAL ANALYSIS LECTURE NOTES: 1.4 OUTER MEASURE

REAL ANALYSIS LECTURE NOTES: 1.4 OUTER MEASURE REAL ANALYSIS LECTURE NOTES: 1.4 OUTER MEASURE CHRISTOPHER HEIL 1.4.1 Introduction We will expand on Section 1.4 of Folland s text, which covers abstract outer measures also called exterior measures).

More information

36-752: Lecture 1. We will use measures to say how large sets are. First, we have to decide which sets we will measure.

36-752: Lecture 1. We will use measures to say how large sets are. First, we have to decide which sets we will measure. 0 0 0 -: Lecture How is this course different from your earlier probability courses? There are some problems that simply can t be handled with finite-dimensional sample spaces and random variables that

More information

Probability and Measure

Probability and Measure Chapter 4 Probability and Measure 4.1 Introduction In this chapter we will examine probability theory from the measure theoretic perspective. The realisation that measure theory is the foundation of probability

More information

18.175: Lecture 2 Extension theorems, random variables, distributions

18.175: Lecture 2 Extension theorems, random variables, distributions 18.175: Lecture 2 Extension theorems, random variables, distributions Scott Sheffield MIT Outline Extension theorems Characterizing measures on R d Random variables Outline Extension theorems Characterizing

More information

Lecture 5: Expectation

Lecture 5: Expectation Lecture 5: Expectation 1. Expectations for random variables 1.1 Expectations for simple random variables 1.2 Expectations for bounded random variables 1.3 Expectations for general random variables 1.4

More information

1 Sequences of events and their limits

1 Sequences of events and their limits O.H. Probability II (MATH 2647 M15 1 Sequences of events and their limits 1.1 Monotone sequences of events Sequences of events arise naturally when a probabilistic experiment is repeated many times. For

More information

Invariance Principle for Variable Speed Random Walks on Trees

Invariance Principle for Variable Speed Random Walks on Trees Invariance Principle for Variable Speed Random Walks on Trees Wolfgang Löhr, University of Duisburg-Essen joint work with Siva Athreya and Anita Winter Stochastic Analysis and Applications Thoku University,

More information

Gaussian Random Fields

Gaussian Random Fields Gaussian Random Fields Mini-Course by Prof. Voijkan Jaksic Vincent Larochelle, Alexandre Tomberg May 9, 009 Review Defnition.. Let, F, P ) be a probability space. Random variables {X,..., X n } are called

More information

Introduction to Ergodic Theory

Introduction to Ergodic Theory Introduction to Ergodic Theory Marius Lemm May 20, 2010 Contents 1 Ergodic stochastic processes 2 1.1 Canonical probability space.......................... 2 1.2 Shift operators.................................

More information