Andrew Tulloch
Advanced Probability
Trinity College, The University of Cambridge
Contents

1 Conditional Expectation
  1.1 Discrete Case
  1.2 Existence and Uniqueness
  1.3 Conditional Jensen's Inequalities
  1.4 Product Measures and Fubini's Theorem
  1.5 Examples of Conditional Expectation
  1.6 Notation for Example Sheet 1
2 Discrete Time Martingales
  2.1 Optional Stopping
  2.2 Hitting Probabilities for a Simple Symmetric Random Walk
  2.3 Martingale Convergence Theorem
  2.4 Uniform Integrability
  2.5 Backwards Martingales
  2.6 Applications of Martingales
    2.6.1 Martingale proof of the Radon-Nikodym theorem
3 Stochastic Processes in Continuous Time
4 Bibliography
1 Conditional Expectation

Let (Ω, F, P) be a probability space: Ω is a set, F is a σ-algebra on Ω, and P is a probability measure on (Ω, F).

Definition 1.1. F is a σ-algebra on Ω if it satisfies:
(i) Ω ∈ F;
(ii) A ∈ F implies Aᶜ ∈ F;
(iii) if (A_n)_{n≥0} is a collection of sets in F, then ∪_n A_n ∈ F.

Definition 1.2. P is a probability measure on (Ω, F) if P : F → [0, 1] is a set function with P(∅) = 0, P(Ω) = 1, and, whenever (A_n)_{n≥0} is a collection of pairwise disjoint sets in F,

P(∪_n A_n) = Σ_n P(A_n).

Definition 1.3. The Borel σ-algebra B(R) is the σ-algebra generated by the open sets of R. Calling O the collection of open subsets of R,

B(R) = ∩ {ξ : ξ is a σ-algebra containing O}.   (1.1)

Definition 1.4. For A a collection of subsets of Ω, we write

σ(A) = ∩ {ξ : ξ is a σ-algebra containing A}.

Definition 1.5. X is a random variable on (Ω, F) if X : Ω → R is a function with the property that X⁻¹(V) ∈ F for all open sets V ⊆ R.
Exercise 1.6. If X is a random variable then {B ⊆ R : X⁻¹(B) ∈ F} is a σ-algebra and contains B(R).

If (X_i, i ∈ I) is a collection of random variables, we write

σ(X_i, i ∈ I) = σ({ω ∈ Ω : X_i(ω) ∈ B}, i ∈ I, B ∈ B(R)),

the smallest σ-algebra that makes all the X_i measurable.

Definition 1.7. First we define expectation for positive simple random variables:

E(Σ_{i=1}^n c_i 1(A_i)) = Σ_{i=1}^n c_i P(A_i),   (1.2)

with c_i positive constants and (A_i) ⊆ F. We can extend this to any positive random variable X ≥ 0 by approximating X as the limit of simple functions. For a general X, we write X = X⁺ − X⁻ with X⁺ = max(X, 0), X⁻ = max(−X, 0). If at least one of E(X⁺), E(X⁻) is finite, then we define E(X) = E(X⁺) − E(X⁻). We call X integrable if E(|X|) < ∞.

Definition 1.8. Let A, B ∈ F with P(B) > 0. Then

P(A | B) = P(A ∩ B) / P(B),   E(X | B) = E(X 1(B)) / P(B).

Goal: we want to define E(X | G), a random variable measurable with respect to the σ-algebra G.

1.1 Discrete Case

Suppose G is countably generated: (B_i)_{i∈N} is a collection of pairwise disjoint sets in F with ∪_i B_i = Ω, and G = σ(B_i, i ∈ N). It is easy to check that G = {∪_{j∈J} B_j : J ⊆ N}. Let X be an integrable random variable. Then set

X′ = E(X | G) = Σ_{i∈N} E(X | B_i) I(B_i).
Then:
(i) X′ is G-measurable (check);
(ii) E(|X′|) ≤ E(|X|),   (1.3)
and so X′ is integrable;
(iii) for G ∈ G,
E(X′ I(G)) = E(X I(G))   (1.4)
(check).

1.2 Existence and Uniqueness

Definition 1.9. A ∈ F happens almost surely (a.s.) if P(A) = 1.

Theorem 1.10 (Monotone Convergence Theorem). If X_n ≥ 0 is a sequence of random variables and X_n ↑ X a.s. as n → ∞, then

E(X_n) ↑ E(X)   (1.5)

as n → ∞.

Theorem 1.11 (Dominated Convergence Theorem). If (X_n) is a sequence of random variables such that |X_n| ≤ Y for Y an integrable random variable, and X_n → X a.s., then E(X_n) → E(X).

Definition 1.12. For p ∈ [1, ∞) and f a measurable function,

‖f‖_p = E[|f|^p]^{1/p},   (1.6)
‖f‖_∞ = inf{λ : |f| ≤ λ a.e.}.   (1.7)

Definition 1.13. L^p = L^p(Ω, F, P) = {f : ‖f‖_p < ∞}. Formally, L^p is the collection of equivalence classes where two functions are equivalent if they are equal a.e. We will just represent an element of L^p by a function, but remember that equality is a.e.
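The discrete-case formula of Section 1.1, E(X | G) = Σ_i E(X | B_i) I(B_i), can be checked concretely on a small finite space. The following sketch (the space, partition, and function names are illustrative, not from the notes) computes the conditional expectation with exact rational arithmetic and verifies property (iii):

```python
from fractions import Fraction

def cond_exp(X, partition, P):
    """E(X | G) for G generated by a finite partition of Omega.

    X: dict omega -> value; partition: list of sets of omegas;
    P: dict omega -> probability.  Implements the discrete formula
    E(X | G) = sum_i E(X | B_i) I(B_i), where E(X | B) = E(X 1(B)) / P(B).
    """
    Y = {}
    for B in partition:
        pB = sum(P[w] for w in B)
        eXB = sum(X[w] * P[w] for w in B) / pB  # E(X 1(B)) / P(B)
        for w in B:
            Y[w] = eXB                          # constant on each block
    return Y

# Omega = {0,...,5} with uniform measure, X(w) = w,
# G generated by the partition {0,1,2}, {3,4,5}.
Omega = range(6)
P = {w: Fraction(1, 6) for w in Omega}
X = {w: Fraction(w) for w in Omega}
Y = cond_exp(X, [{0, 1, 2}, {3, 4, 5}], P)

# Property (iii): E(Y I(G)) = E(X I(G)) for every G in G.
A = {0, 1, 2}
assert sum(Y[w] * P[w] for w in A) == sum(X[w] * P[w] for w in A)
```

Here E(X | G) takes the value 1 on {0, 1, 2} and 4 on {3, 4, 5}, and averaging Y recovers E(X) = 5/2, matching Theorem 1.20 (ii) below.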
Theorem 1.14. The space (L², ‖·‖₂) is a Hilbert space with ⟨U, V⟩ = E(UV). If H is a closed subspace, then for every f ∈ L² there exists a unique g ∈ H such that ‖f − g‖₂ = inf{‖f − h‖₂ : h ∈ H} and ⟨f − g, h⟩ = 0 for all h ∈ H. We call g the orthogonal projection of f onto H.

Theorem 1.15. Let (Ω, F, P) be the underlying probability space, let X be an integrable random variable, and let G ⊆ F be a sub-σ-algebra. Then there exists a random variable Y such that

(i) Y is G-measurable;
(ii) for all A ∈ G,

E(X I(A)) = E(Y I(A)),   (1.8)

and Y is integrable.

Moreover, if Y′ also satisfies the above properties, then Y = Y′ a.s.

Remark 1.16. Y is called a version of the conditional expectation of X given G and we write Y = E(X | G); when G = σ(Z) we also write Y = E(X | Z).

Remark 1.17. (ii) could be replaced by the following condition: for all G-measurable bounded random variables Z,

E(XZ) = E(YZ).   (1.9)

Proof. Uniqueness: let Y′ also satisfy (i) and (ii), and consider A = {Y′ − Y > 0}, which is G-measurable. From (ii),

E((Y′ − Y) I(A)) = E(X I(A)) − E(X I(A)) = 0,

and hence P(Y′ − Y > 0) = 0, which implies Y′ ≤ Y a.s. Similarly, Y ≤ Y′ a.s.

Existence: complete the following three steps.

(i) Suppose X ∈ L²(Ω, F, P), a Hilbert space with ⟨U, V⟩ = E(UV). The space L²(Ω, G, P) is a closed subspace:

X_n → X in L² ⟹ X_n → X in probability ⟹ X_{n_k} → X a.s. along a subsequence ⟹ X = lim sup_k X_{n_k} is G-measurable.   (1.10)
We can write

L²(Ω, F, P) = L²(Ω, G, P) ⊕ L²(Ω, G, P)^⊥,   X = Y + Z.

Set Y = E(X | G); Y is G-measurable, and for A ∈ G,

E(X I(A)) = E(Y I(A)) + E(Z I(A)), with E(Z I(A)) = 0.

(ii) If X ≥ 0 then Y ≥ 0 a.s.: consider A = {Y < 0}; then

0 ≤ E(X I(A)) = E(Y I(A)) ≤ 0,   (1.11)

thus P(A) = 0 and Y ≥ 0 a.s. Now let X ≥ 0 and set X_n = min(X, n), so 0 ≤ X_n ≤ n and X_n ∈ L² for all n. Write Y_n = E(X_n | G); then Y_n ≥ 0 a.s. and Y_n is increasing a.s. Set Y = lim sup_n Y_n, so Y is G-measurable. We will show Y = E(X | G) a.s. For all A ∈ G, we need to check E(X I(A)) = E(Y I(A)). We know that E(X_n I(A)) = E(Y_n I(A)), and X_n ↑ X, Y_n ↑ Y a.s. Thus, by the monotone convergence theorem, E(X I(A)) = E(Y I(A)). If X is integrable, setting A = Ω shows that Y is integrable.

(iii) For a general random variable X, not necessarily in L² or non-negative, we write X = X⁺ − X⁻ and define E(X | G) = E(X⁺ | G) − E(X⁻ | G). This satisfies (i) and (ii).

Remark 1.18. If X ≥ 0, we can always define Y = E(X | G) a.s.; the integrability condition on Y may not be satisfied.

Definition 1.19. Let G₁, G₂, … be sub-σ-algebras of F. They are called independent if for all distinct indices i₁, …, i_n and all choices G_{i_k} ∈ G_{i_k},

P(G_{i₁} ∩ ⋯ ∩ G_{i_n}) = Π_{k=1}^n P(G_{i_k}).   (1.12)

Theorem 1.20.
(i) If X ≥ 0 then E(X | G) ≥ 0.
(ii) E(E(X | G)) = E(X) (take A = Ω).
(iii) If X is G-measurable, then E(X | G) = X a.s.
(iv) If X is independent of G, then E(X | G) = E(X).

Theorem 1.21 (Fatou's lemma). If X_n ≥ 0 for all n, then

E(lim inf X_n) ≤ lim inf E(X_n).   (1.13)

Theorem 1.22 (Conditional Monotone Convergence). Let X_n ≥ 0, X_n ↑ X a.s. Then

E(X_n | G) ↑ E(X | G) a.s.   (1.14)

Proof. Set Y_n = E(X_n | G). Then Y_n ≥ 0 and Y_n is increasing a.s. Set Y = lim sup Y_n; then Y is G-measurable. For A ∈ G we have E(Y_n I(A)) = E(X_n I(A)); letting n → ∞ and applying monotone convergence on both sides gives E(Y I(A)) = E(X I(A)), so Y = E(X | G) a.s.

Theorem 1.23 (Conditional Fatou's Lemma). If X_n ≥ 0 for all n, then

E(lim inf X_n | G) ≤ lim inf E(X_n | G) a.s.   (1.15)

Proof. Let X denote the limit inferior of the X_n. For every natural number k define pointwise the random variable Y_k = inf_{n≥k} X_n. Then the sequence Y₁, Y₂, … is increasing and converges pointwise to X. For k ≤ n, we have Y_k ≤ X_n, so that

E(Y_k | G) ≤ E(X_n | G) a.s.   (1.16)

by the monotonicity of conditional expectation, hence

E(Y_k | G) ≤ inf_{n≥k} E(X_n | G) a.s.,   (1.17)

because the countable union of the exceptional sets of probability zero is again a null set. Using the definition of X, its representation as the pointwise limit of the Y_k, the monotone convergence theorem for conditional expectations, the last inequality, and the definition of the limit inferior, it follows that almost surely
E(lim inf_n X_n | G) = E(X | G)   (1.18)
 = E(lim_k Y_k | G)   (1.19)
 = lim_k E(Y_k | G)   (1.20)
 ≤ lim_k inf_{n≥k} E(X_n | G)   (1.21)
 = lim inf_n E(X_n | G).   (1.22)

Theorem 1.24 (Conditional dominated convergence). TODO

1.3 Conditional Jensen's Inequalities

Let φ : R → R be convex and let X be an integrable random variable such that φ(X) is integrable or φ is non-negative. Suppose G ⊆ F is a σ-algebra. Then

E(φ(X) | G) ≥ φ(E(X | G))   (1.23)

almost surely. In particular, if 1 ≤ p < ∞, then

‖E(X | G)‖_p ≤ ‖X‖_p.   (1.24)

Proof. Every convex function can be written as φ(x) = sup_{i∈N} (a_i x + b_i) with a_i, b_i ∈ R. Then for every i,

E(φ(X) | G) ≥ a_i E(X | G) + b_i,

and taking the supremum over i,

E(φ(X) | G) ≥ sup_{i∈N} (a_i E(X | G) + b_i) = φ(E(X | G)).

The second part follows from

‖E(X | G)‖_p^p = E(|E(X | G)|^p) ≤ E(E(|X|^p | G)) = E(|X|^p) = ‖X‖_p^p.   (1.25)

Proposition 1.25 (Tower Property). Let X ∈ L¹ and H ⊆ G ⊆ F be
sub-σ-algebras. Then

E(E(X | G) | H) = E(X | H)   (1.26)

almost surely.

Proof. Clearly E(X | H) is H-measurable. Let A ∈ H ⊆ G. Then

E(E(X | H) I(A)) = E(X I(A)) = E(E(X | G) I(A)).   (1.27)

Proposition 1.26. Let X ∈ L¹ and G ⊆ F a sub-σ-algebra. Suppose that Y is bounded and G-measurable. Then

E(XY | G) = Y E(X | G)   (1.28)

almost surely.

Proof. Clearly Y E(X | G) is G-measurable. Let A ∈ G. Then, since Y I(A) is G-measurable and bounded,

E(Y E(X | G) I(A)) = E(E(X | G) · (Y I(A))) = E(X Y I(A)).   (1.29)

Definition 1.27. A collection A of subsets of Ω is called a π-system if for all A, B ∈ A we have A ∩ B ∈ A.

Proposition 1.28 (Uniqueness of extension). Suppose that ξ is a σ-algebra on E. Let µ₁, µ₂ be two measures on (E, ξ) that agree on a π-system generating ξ, with µ₁(E) = µ₂(E) < ∞. Then µ₁ = µ₂ everywhere on ξ.

Theorem 1.29. Let X ∈ L¹ and let G, H ⊆ F be two sub-σ-algebras. If σ(X, G) is independent of H, then

E(X | σ(G, H)) = E(X | G)   (1.30)

almost surely.
Proof. Assume X ≥ 0; the general case follows by writing X = X⁺ − X⁻. Take A ∈ G, B ∈ H. Then

E(E(X | G) I(A) I(B)) = P(B) E(E(X | G) I(A)) = P(B) E(X I(A)) = E(X I(A) I(B)) = E(E(X | σ(G, H)) I(A ∩ B)).

Now, for F ∈ σ(G, H), set µ(F) = E(E(X | G) I(F)) and ν(F) = E(E(X | σ(G, H)) I(F)) = E(X I(F)), two measures on (Ω, σ(G, H)). Setting A = {A ∩ B : A ∈ G, B ∈ H}, A is a π-system generating σ(G, H). By the computation above, µ and ν agree on A, and µ(Ω) = E(E(X | G)) = E(X) = ν(Ω) < ∞, since X is integrable. So, by the uniqueness of extension theorem, µ and ν agree everywhere on σ(G, H).

Remark 1.30. If we only had X independent of H and G independent of H, the conclusion can fail. For example, consider independent coin tosses X, Y taking values 0, 1 each with probability 1/2, and set Z = I(X = Y). Taking G = σ(Y) and H = σ(Z), X is independent of H and G is independent of H, but X is σ(G, H)-measurable, so E(X | σ(G, H)) = X ≠ E(X | G) = 1/2.

1.4 Product Measures and Fubini's Theorem

Definition 1.31. A measure space (E, ξ, µ) is called σ-finite if there exist sets (S_n)_n with ∪_n S_n = E and µ(S_n) < ∞ for all n.

Let (E₁, ξ₁, µ₁) and (E₂, ξ₂, µ₂) be two σ-finite measure spaces, with A = {A₁ × A₂ : A₁ ∈ ξ₁, A₂ ∈ ξ₂} a π-system of subsets of E = E₁ × E₂. Define ξ = ξ₁ ⊗ ξ₂ = σ(A).

Definition 1.32 (Product measure). Let (E₁, ξ₁, µ₁) and (E₂, ξ₂, µ₂) be two σ-finite measure spaces. Then there exists a unique measure µ = µ₁ ⊗ µ₂ on (E, ξ) satisfying

µ(A₁ × A₂) = µ₁(A₁) µ₂(A₂)   (1.31)

for all A₁ ∈ ξ₁, A₂ ∈ ξ₂.
Theorem 1.33 (Fubini's Theorem). Let (E₁, ξ₁, µ₁) and (E₂, ξ₂, µ₂) be σ-finite measure spaces and let f ≥ 0 be ξ-measurable. Then

µ(f) = ∫_{E₁} ( ∫_{E₂} f(x₁, x₂) µ₂(dx₂) ) µ₁(dx₁).   (1.32)

If f is integrable, then x₂ ↦ f(x₁, x₂) is µ₂-integrable for µ₁-almost all x₁. Moreover, x₁ ↦ ∫_{E₂} f(x₁, x₂) µ₂(dx₂) is µ₁-integrable and µ(f) is given by (1.32).

1.5 Examples of Conditional Expectation

Definition 1.34. A random vector (X₁, X₂, …, X_n) ∈ Rⁿ is called a Gaussian random vector if and only if for all a₁, …, a_n ∈ R,

a₁X₁ + ⋯ + a_nX_n   (1.33)

is a Gaussian random variable. (X_t)_{t≥0} is called a Gaussian process if for all 0 ≤ t₁ ≤ t₂ ≤ ⋯ ≤ t_n, the vector (X_{t₁}, …, X_{t_n}) is a Gaussian random vector.

Example 1.35 (Gaussian case). Let (X, Y) be a Gaussian vector in R². We want to calculate

E(X | Y) = E(X | σ(Y)) = X′   (1.34)

where X′ = f(Y) with f a Borel function. Let us try a linear f: X′ = aY + b with a, b ∈ R to be determined. Note that E(X) = E(X′) and E((X′ − X)Y) = 0, i.e. Cov(X′ − X, Y) = 0, by the defining properties of conditional expectation. Then we have

aE(Y) + b = E(X),   Cov(X, Y) = a Var(Y),   (1.35)

so (assuming Var(Y) > 0) a = Cov(X, Y)/Var(Y) and b = E(X) − aE(Y), giving

E(X | Y) = E(X) + (Cov(X, Y)/Var(Y)) (Y − E(Y)).

The linear guess succeeds because X − X′ is uncorrelated with Y and hence, by Gaussianity of the vector, independent of Y.

1.6 Notation for Example Sheet 1

(i) G ∨ H = σ(G, H).
(ii) Let X, Y be two random variables taking values in R with joint density f_{X,Y}(x, y), and let h : R → R be a Borel function such that
advanced probability 15 h(x) is integrable. We want to calculate E(h(X) Y) = E(h(X) σ(y)) (1.36) Let g be bounded and measurable. Then E(h(X)g(Y)) = h(x)g(y) f X,Y (x, y)dxdy (1.37) with 0/0 = 0 = h(x)g(y) f X,Y(x, y) f f Y (y) Y (y)dxdy (1.38) ( = h(x) f ) X,Y(x, y) dx g(y) f f Y (y) Y (y)dy (1.39) Set φ(y) = h(x) f X,Y(x,y) dx if f f Y (y) Y (y) > 0, and 0 otherwise. Then we have almost surely, and E(h(X) Y) = φ(y) (1.40) E(h(X) Y) = h(x)ν(y, dx) (1.41) with ν(y, dx) = f X,Y(x,y) f Y (y) I( f Y (y) > 0) dx = f X Y (x y)dx. ν(y, dx) is called the conditional distribution of X given Y = y and f X Y (x y) is the conditional density of X given Y = y.
2 Discrete Time Martingales

Let (Ω, F, P) be a probability space and (E, ξ) a measurable space. Usually E = R, R^d, or C; for us, E = R. A sequence X = (X_n)_{n≥0} of random variables taking values in E is called a stochastic process. A filtration is an increasing family (F_n)_{n≥0} of sub-σ-algebras of F, so F_n ⊆ F_{n+1}. Intuitively, F_n is the information available to us at time n. To every stochastic process X we associate a filtration called the natural filtration (F_n^X)_{n≥0},

F_n^X = σ(X_k, k ≤ n).   (2.1)

A stochastic process X is called adapted to (F_n)_{n≥0} if X_n is F_n-measurable for all n. A stochastic process X is called integrable if X_n is integrable for all n.

Definition 2.1. An adapted integrable process (X_n)_{n≥0} taking values in R is called a
(i) martingale if E(X_n | F_m) = X_m for all n ≥ m;
(ii) super-martingale if E(X_n | F_m) ≤ X_m for all n ≥ m;
(iii) sub-martingale if E(X_n | F_m) ≥ X_m for all n ≥ m.

Remark 2.2. A (sub/super-)martingale with respect to a filtration (F_n) is also a (sub/super-)martingale with respect to the natural filtration of X (by the tower property).
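For the simple random walk with ±1 steps, the one-step martingale property E(X_{n+1} | F_n) = X_n of Definition 2.1 can be verified exactly by brute force: conditioning on F_n amounts to fixing the first n steps and averaging over the next one. A minimal sketch (the horizon N = 4 is an arbitrary choice):

```python
from itertools import product

# X_n = xi_1 + ... + xi_n with iid steps xi_i = +/-1, each with
# probability 1/2.  For every prefix of n steps, the average of
# X_{n+1} over the two equally likely continuations must equal X_n.
N = 4
ok = True
for n in range(N):
    for prefix in product((-1, 1), repeat=n):
        x_n = sum(prefix)
        avg = sum(x_n + step for step in (-1, 1)) / 2  # E(X_{n+1} | F_n)
        ok = ok and (avg == x_n)
```

The check passes for every prefix, which is Example 2.3 below in miniature: partial sums of iid mean-zero steps form a martingale.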
Example 2.3. Suppose (ξ_i) are iid random variables with E(ξ_i) = 0. Set X_n = Σ_{i=1}^n ξ_i. Then (X_n) is a martingale.

Example 2.4. As above, but let (ξ_i) be iid with E(ξ_i) = 1. Then X_n = Π_{i=1}^n ξ_i is a martingale.

Definition 2.5. A random variable T : Ω → Z⁺ ∪ {∞} is called a stopping time if {T ≤ n} ∈ F_n for all n. Equivalently, {T = n} ∈ F_n for all n.

Example 2.6.
(i) Constant times are trivial stopping times.
(ii) Let A ∈ B(R). Define T_A = inf{n ≥ 0 : X_n ∈ A}, with inf ∅ = ∞. Then T_A is a stopping time.

Proposition 2.7. Let S, T, (T_n) be stopping times on the filtered probability space (Ω, F, (F_n), P). Then S ∧ T, S ∨ T, inf_n T_n, lim inf_n T_n, lim sup_n T_n are stopping times.

Notation. For T a stopping time, X_T(ω) = X_{T(ω)}(ω). The stopped process X^T is defined by X_t^T = X_{T∧t}. F_T = {A ∈ F : A ∩ {T ≤ t} ∈ F_t for all t}.

Proposition 2.8. Let (Ω, F, (F_n), P) be a filtered probability space and X = (X_n)_{n≥0} adapted.
(i) If S ≤ T are stopping times, then F_S ⊆ F_T.
(ii) X_T I(T < ∞) is F_T-measurable.
(iii) If T is a stopping time, then X^T is adapted.
(iv) If X is integrable, then X^T is integrable.

Proof. (ii) Let A ∈ ξ. We need to show that {X_T I(T < ∞) ∈ A} ∈ F_T. Indeed,

{X_T ∈ A} ∩ {T ≤ t} = ∪_{s≤t} ({T = s} ∩ {X_s ∈ A}) ∈ F_t,   (2.2)

since {T = s} ∈ F_s ⊆ F_t and {X_s ∈ A} ∈ F_s ⊆ F_t.

2.1 Optional Stopping

Theorem 2.9. Let X be a martingale.
(i) If T is a stopping time, then X^T is also a martingale. In particular, E(X_{T∧t}) = E(X_0) for all t.
(ii) If T is a bounded stopping time, then E(X_T) = E(X_0).
(iii) If T < ∞ a.s. and X is bounded, then E(X_T) = E(X_0).
(iv) If E(T) < ∞ and X has bounded increments (|X_{n+1} − X_n| ≤ C for all n), then E(X_T) = E(X_0).

Proof (of (i)). By the tower property, it is sufficient to check E(X_{T∧t} | F_{t−1}) = X_{T∧(t−1)}:

E(X_{T∧t} | F_{t−1}) = E( Σ_{s=0}^{t−1} X_s I(T = s) | F_{t−1} ) + E(X_t I(T > t − 1) | F_{t−1})
 = Σ_{s=0}^{t−1} X_s I(T = s) + I(T > t − 1) X_{t−1}
 = X_{T∧(t−1)},

using that X_s I(T = s) is F_{t−1}-measurable for s ≤ t − 1 and {T > t − 1} ∈ F_{t−1}. Since X^T is a martingale, E(X_{T∧t}) = E(X_0).

Theorem 2.10. Let X be a martingale.
(i) If T is a stopping time, then X^T is also a martingale, so in particular

E(X_{T∧t}) = E(X_0).   (2.3)

(ii) If S ≤ T are bounded stopping times, then E(X_T | F_S) = X_S almost surely.

Proof. Let S ≤ T ≤ n. Then

X_T = (X_T − X_{T−1}) + (X_{T−1} − X_{T−2}) + ⋯ + (X_{S+1} − X_S) + X_S = X_S + Σ_{k=0}^{n−1} (X_{k+1} − X_k) I(S ≤ k < T).

Let A ∈ F_S. Then

E(X_T I(A)) = E(X_S I(A)) + Σ_{k=0}^{n−1} E((X_{k+1} − X_k) I(S ≤ k < T) I(A))   (2.4)
 = E(X_S I(A)),   (2.5)

since I(S ≤ k < T) I(A) = I(A ∩ {S ≤ k} ∩ {T > k}) is F_k-measurable and X is a martingale.

Remark 2.11. The optional stopping theorem also holds for super/sub-martingales, with the respective inequalities in the statement.
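The identity E(X_{T∧t}) = E(X_0) of Theorem 2.9 (i) can be verified exactly for the simple symmetric random walk stopped on exiting an interval, by enumerating all 2^t step sequences with exact rational weights. A minimal sketch (the barrier values are arbitrary illustrative choices):

```python
from fractions import Fraction
from itertools import product

def stopped_mean(t, a, b):
    """E(X_{T and t}) for the simple symmetric random walk started at 0
    and stopped at T = first hitting time of {-a, b}, computed exactly
    by enumerating all 2^t equally likely paths."""
    total = Fraction(0)
    p = Fraction(1, 2) ** t  # probability of each path
    for steps in product((-1, 1), repeat=t):
        x, stopped = 0, None
        for s in steps:
            x += s
            if x in (-a, b):   # walk frozen once it exits (-a, b)
                stopped = x
                break
        total += p * (x if stopped is None else stopped)
    return total

# Theorem 2.9 (i): E(X_{T and t}) = E(X_0) = 0 for every t.
for t in range(1, 10):
    assert stopped_mean(t, 2, 3) == 0
```

The expectation is exactly 0 at every horizon, even though the barriers −2 and 3 are not symmetric: the stopped process X^T is itself a martingale.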
Example 2.12. Suppose that (ξ_i) are iid random variables with

P(ξ_i = 1) = P(ξ_i = −1) = 1/2.   (2.6)

Set X_0 = 0, X_n = Σ_{i=1}^n ξ_i; this is the simple symmetric random walk. Let T = inf{n ≥ 0 : X_n = 1}. Then P(T < ∞) = 1, but T is not bounded; indeed E(X_T) = 1 ≠ 0 = E(X_0), so optional stopping fails without further hypotheses.

Proposition 2.13. If X is a positive supermartingale and T is a stopping time which is finite almost surely (P(T < ∞) = 1), then

E(X_T) ≤ E(X_0).   (2.7)

Proof. By Fatou's lemma,

E(X_T) = E(lim inf_t X_{T∧t}) ≤ lim inf_t E(X_{T∧t}) ≤ E(X_0).   (2.8)

2.2 Hitting Probabilities for a Simple Symmetric Random Walk

Let (ξ_i) be iid, ±1 equally likely. Let X_0 = 0, X_n = Σ_{i=1}^n ξ_i. For all x ∈ Z let

T_x = inf{n ≥ 0 : X_n = x},   (2.9)

which is a stopping time. We want to compute the hitting probabilities P(T_{−a} < T_b) for a, b > 0. Let T = T_{−a} ∧ T_b. If E(T) < ∞, then by (iv) in Theorem 2.9, E(X_T) = E(X_0) = 0, so

E(X_T) = −a P(T_{−a} < T_b) + b P(T_b < T_{−a}) = 0,   (2.10)

and since the two probabilities sum to 1 we obtain

P(T_{−a} < T_b) = b / (a + b).   (2.11)

It remains to check E(T) < ∞. Since P(ξ_1 = 1, …, ξ_{a+b} = 1) = 2^{−(a+b)} > 0 and a + b consecutive up-steps force the walk to exit (−a, b), T is stochastically dominated by a + b times a geometric random variable, so E(T) < ∞.
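The gambler's-ruin formula (2.11) is easy to check by simulation. The following sketch (sample size and barriers are arbitrary choices) estimates P(T_{−a} < T_b) for a = 2, b = 3, where the formula predicts 3/5:

```python
import random

def hit_minus_a_first(a, b, rng):
    """Run one simple symmetric random walk from 0 until it exits
    the interval (-a, b); return True if it hits -a first."""
    x = 0
    while -a < x < b:
        x += rng.choice((-1, 1))
    return x == -a

a, b, trials = 2, 3, 100_000
rng = random.Random(1)
hits = sum(hit_minus_a_first(a, b, rng) for _ in range(trials))
estimate = hits / trials
# optional stopping predicts P(T_{-a} < T_b) = b / (a + b) = 0.6
```

With 10^5 trials the standard error is about 0.0015, so the estimate lands within a percent of b/(a + b).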
2.3 Martingale Convergence Theorem

Theorem 2.14. Let X = (X_n)_{n≥0} be a (super-)martingale bounded in L¹, that is, sup_{n≥0} E(|X_n|) < ∞. Then X_n converges as n → ∞ almost surely towards an a.s. finite limit X_∞ ∈ L¹(F_∞), where F_∞ = σ(F_n, n ≥ 0).

To prove it we will use Doob's trick, which counts up-crossings of intervals with rational endpoints.

Corollary 2.15. Let X be a positive supermartingale. Then it converges to an almost surely finite limit as n → ∞.

Proof. E(|X_n|) = E(X_n) ≤ E(X_0) < ∞.   (2.12)

Proof (of Theorem 2.14, setup). Let x = (x_n)_n be a sequence of real numbers, and let a < b be two real numbers. Let T_0(x) = 0 and inductively, for k ≥ 0,

S_{k+1}(x) = inf{n ≥ T_k(x) : x_n ≤ a},   T_{k+1}(x) = inf{n ≥ S_{k+1}(x) : x_n ≥ b},   (2.13)

with the usual convention that inf ∅ = ∞. Define N_n([a, b], x) = sup{k ≥ 0 : T_k(x) ≤ n}, the number of up-crossings of the interval [a, b] by the sequence x by time n. As n → ∞, we have

N_n([a, b], x) ↑ N([a, b], x) = sup{k ≥ 0 : T_k(x) < ∞},   (2.14)

the total number of up-crossings of the interval [a, b].

Lemma 2.16. A sequence of reals x = (x_n)_n converges in R̄ = R ∪ {±∞} if and only if N([a, b], x) < ∞ for all rationals a < b.

Proof. Assume x converges. If for some a < b we had N([a, b], x) = ∞, then lim inf_n x_n ≤ a < b ≤ lim sup_n x_n, which is a contradiction. Conversely, suppose that x does not converge. Then lim inf_n x_n < lim sup_n x_n, and taking rationals a < b between these two numbers gives N([a, b], x) = ∞, as required.
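Doob's up-crossing count is a simple state machine over the sequence: wait until the value drops to a or below, then wait until it rises to b or above, and count each completed passage. A minimal sketch of N_n([a, b], x) for a finite sequence (function name is illustrative):

```python
def upcrossings(xs, a, b):
    """Number of up-crossings of [a, b] completed by the finite
    sequence xs, following the S_k / T_k definition: a passage from
    some x_m <= a to a later x_n >= b counts as one up-crossing."""
    count = 0
    below = False  # True once we have seen x <= a since the last up-crossing
    for x in xs:
        if not below:
            if x <= a:
                below = True       # S_{k+1} reached
        elif x >= b:
            count += 1             # T_{k+1} reached: up-crossing complete
            below = False
    return count

# The sequence 0, 2, 0, 2, 1, 3 up-crosses [0, 2] twice and [1, 3] once.
```

A sequence oscillating forever across some fixed [a, b] would make this count diverge, which is exactly the failure of convergence that Lemma 2.16 detects and Theorem 2.17 below bounds in expectation.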
Theorem 2.17 (Doob's up-crossing inequality). Let X be a supermartingale and a < b two real numbers. Then for all n ≥ 0,

(b − a) E(N_n([a, b], X)) ≤ E((X_n − a)⁻).   (2.15)

Proof (idea). For all k,

X_{T_k} − X_{S_k} ≥ b − a.   (2.16)

2.4 Uniform Integrability

Theorem 2.18. Suppose X ∈ L¹. Then the collection of random variables

{E(X | G) : G ⊆ F a sub-σ-algebra}   (2.17)

is uniformly integrable.

Proof. Since X ∈ L¹, for all ε > 0 there exists δ > 0 such that if A ∈ F and P(A) ≤ δ, then E(|X| I(A)) ≤ ε. Set Y = E(X | G). Then E(|Y|) ≤ E(|X|). Choose λ < ∞ such that E(|X|) ≤ λδ. Then

P(|Y| ≥ λ) ≤ E(|Y|)/λ ≤ δ   (2.18)

by Markov's inequality. Then, since {|Y| ≥ λ} ∈ G,

E(|Y| I(|Y| ≥ λ)) ≤ E(E(|X| | G) I(|Y| ≥ λ))   (2.19)
 = E(|X| I(|Y| ≥ λ))   (2.20)
 ≤ ε.   (2.21)

Definition 2.19. A process X = (X_n)_{n≥0} is called a uniformly integrable martingale if it is a martingale and the collection (X_n) is uniformly integrable.

Theorem 2.20. Let X be a martingale. Then the following are equivalent.
(i) X is a uniformly integrable martingale.
(ii) X converges almost surely and in L¹ to a limit X_∞ as n → ∞.
(iii) There exists a random variable Z ∈ L¹ such that X_n = E(Z | F_n) almost surely for all n ≥ 0.

Theorem 2.21 (Chapter 13 of Williams). Let X_n, X ∈ L¹ for all n ≥ 0 and suppose that X_n → X almost surely as n → ∞. Then X_n converges to X in L¹ if and only if (X_n) is uniformly integrable.

Proof (of Theorem 2.20). We proceed as follows.

(i) ⟹ (ii): Since X is uniformly integrable, it is bounded in L¹, and by the martingale convergence theorem X_n converges almost surely to a finite limit X_∞. Theorem 2.21 then gives L¹ convergence.

(ii) ⟹ (iii): Set Z = X_∞. We need to show that X_n = E(X_∞ | F_n) almost surely for all n ≥ 0. For all m ≥ n, by the martingale property we have

‖X_n − E(X_∞ | F_n)‖₁ = ‖E(X_m − X_∞ | F_n)‖₁ ≤ ‖X_m − X_∞‖₁ → 0   (2.22)

as m → ∞.

(iii) ⟹ (i): (E(Z | F_n))_n is a martingale by the tower property of conditional expectation. Uniform integrability follows from Theorem 2.18.

Remark 2.22. If X is UI then X_∞ = E(Z | F_∞) a.s., where F_∞ = σ(F_n, n ≥ 0).

Remark 2.23. If X is a UI super/sub-martingale, then it converges almost surely and in L¹ to a finite limit X_∞ with E(X_∞ | F_n) ≤ (≥) X_n almost surely.

Example 2.24. Let X₁, X₂, … be iid random variables with P(X = 0) = P(X = 2) = 1/2. Set Y_n = X₁ ⋯ X_n. Then (Y_n) is a martingale. As E(Y_n) = 1 for all n, (Y_n) is bounded in L¹, and it converges almost surely to 0. But E(Y_n) = 1 for all n, and hence it does not converge in L¹.
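Example 2.24 can be made completely explicit: Y_n takes only the values 0 and 2^n, with P(Y_n = 2^n) = 2^{−n}. The following exact computation shows why a.s. convergence to 0 coexists with E(Y_n) = 1 (so the family cannot be UI):

```python
from fractions import Fraction

# Y_n = X_1 ... X_n with iid X_i in {0, 2}, each with probability 1/2.
# Y_n is 2^n if all n factors equal 2 (probability 2^-n) and 0 otherwise,
# so E(Y_n) = 2^-n * 2^n = 1 for every n while P(Y_n = 0) -> 1.
means, p_zero = [], []
for n in range(1, 21):
    p_nz = Fraction(1, 2) ** n       # P(Y_n = 2^n)
    means.append(p_nz * 2 ** n)      # E(Y_n), exactly 1 for all n
    p_zero.append(1 - p_nz)          # P(Y_n = 0), tends to 1
```

The expectation escapes to an ever-larger value on an ever-smaller event, which is precisely the mass-at-infinity behaviour that uniform integrability rules out.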
If X is a UI martingale and T is a stopping time, then we can unambiguously define

X_T = Σ_{n=0}^∞ X_n I(T = n) + X_∞ I(T = ∞).   (2.23)

Theorem 2.25 (Optional stopping for UI martingales). Let X be a UI martingale and let S, T be stopping times with S ≤ T. Then

E(X_T | F_S) = X_S   (2.24)

almost surely.

Proof. We first show that E(X_∞ | F_T) = X_T almost surely for any stopping time T. First, check that X_T ∈ L¹. Since |X_n| ≤ E(|X_∞| | F_n), we have

E(|X_T|) = Σ_{n∈Z⁺∪{∞}} E(|X_n| I(T = n))   (2.25)
 ≤ Σ_{n∈Z⁺∪{∞}} E(|X_∞| I(T = n))   (2.26)
 = E(|X_∞|).   (2.27)

Let B ∈ F_T. Then

E(I(B) X_T) = Σ_{n∈Z⁺∪{∞}} E(I(B) I(T = n) X_n)   (2.28)
 = Σ_{n∈Z⁺∪{∞}} E(I(B) I(T = n) X_∞)   (2.29)
 = E(I(B) X_∞),   (2.30)

where for the second equality we used that E(X_∞ | F_n) = X_n almost surely and B ∩ {T = n} ∈ F_n. Clearly X_T is F_T-measurable, and hence E(X_∞ | F_T) = X_T almost surely. Using the tower property of conditional expectation, we have
for stopping times S ≤ T (as F_S ⊆ F_T),

E(X_T | F_S) = E(E(X_∞ | F_T) | F_S)   (2.31)
 = E(X_∞ | F_S)   (2.32)
 = X_S   (2.33)

almost surely.

2.5 Backwards Martingales

Let ⋯ ⊆ G₂ ⊆ G₁ ⊆ G₀ be a decreasing sequence of σ-algebras …

2.6 Applications of Martingales

Theorem 2.26 (Kolmogorov's 0-1 law). Let (X_i)_{i≥1} be a sequence of iid random variables. Let F_n = σ(X_k, k ≥ n) and F_∞ = ∩_{n≥0} F_n. Then F_∞ is trivial: every A ∈ F_∞ has probability P(A) ∈ {0, 1}.

Proof. Let G_n = σ(X_k, k ≤ n) and let A ∈ F_∞. Since G_n is independent of F_{n+1} and A ∈ F_{n+1}, we have

E(I(A) | G_n) = P(A).   (2.34)

By Theorem 2.20, the UI martingale E(I(A) | G_n) converges to E(I(A) | G_∞) as n → ∞, where G_∞ = σ(G_n, n ≥ 0). Since A ∈ F_∞ ⊆ G_∞, we deduce that E(I(A) | G_∞) = I(A). Therefore P(A) = I(A) a.s., so P(A) ∈ {0, 1}.

Theorem 2.27 (Strong law of large numbers). Let (X_i)_{i≥1} be a sequence of iid random variables in L¹ with µ = E(X_i). Let S_n = Σ_{i=1}^n X_i and S_0 = 0. Then S_n / n → µ as n → ∞, almost surely and in L¹.

Proof. TODO: fill in from lecture notes.

Theorem 2.28 (Kakutani's product martingale theorem). Let (X_n)_{n≥1} be a sequence of independent non-negative random variables of mean 1. Let M_0 = 1, M_n = Π_{i=1}^n X_i for n ∈ N. Then (M_n)_{n≥0} is a non-negative martingale and M_n → M_∞ a.s. as n → ∞ for some random variable M_∞. Set a_n = E(√X_n); then a_n ∈ (0, 1]. Moreover:
(i) if Π_n a_n > 0, then M_n → M_∞ in L¹ and E(M_∞) = 1;
(ii) if Π_n a_n = 0, then M_∞ = 0 almost surely.
Proof. TODO: fill in.

2.6.1 Martingale proof of the Radon-Nikodym theorem

Let P, Q be two probability measures on the measurable space (Ω, F). Assume that F is countably generated, that is, there exists a collection of sets (F_n)_{n∈N} such that F = σ(F_n, n ∈ N). Then the following are equivalent.

(i) P(A) = 0 implies Q(A) = 0 for all A ∈ F; that is, Q is absolutely continuous with respect to P, written Q ≪ P.
(ii) For all ε > 0 there exists δ > 0 such that P(A) ≤ δ implies Q(A) ≤ ε.
(iii) There exists a non-negative random variable X such that

Q(A) = E_P(X I(A))   (2.35)

for all A ∈ F.

Proof. (i) ⟹ (ii). If (ii) does not hold, then there exists ε > 0 such that for all n ≥ 1 there exists a set A_n with P(A_n) ≤ 2^{−n} and Q(A_n) ≥ ε. By Borel-Cantelli, we get P(A_n i.o.) = 0, so from (i), Q(A_n i.o.) = 0. But

Q(A_n i.o.) = Q(∩_n ∪_{k≥n} A_k) = lim_n Q(∪_{k≥n} A_k) ≥ ε,   (2.36)

which is a contradiction.

(ii) ⟹ (iii). Consider the filtration F_n = σ(F_k, k ≤ n). Let

A_n = {H₁ ∩ ⋯ ∩ H_n : H_i = F_i or F_iᶜ};   (2.37)

then it is easy to see that F_n = σ(A_n). Note also that the sets in A_n are disjoint. TODO: continue proof.
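The dichotomy in Kakutani's theorem (Theorem 2.28) can be explored numerically. The sketch below uses the illustrative choice X_i ∈ {0.5, 1.5} with probability 1/2 each, so E(X_i) = 1 but a_i = E(√X_i) = (√0.5 + √1.5)/2 < 1; the product of the a_i is then 0 and case (ii) predicts M_n → 0 a.s.:

```python
import math
import random

rng = random.Random(42)

# a_i = E(sqrt(X_i)) for X_i uniform on {0.5, 1.5}; strictly below 1,
# so the infinite product of the a_i vanishes and M_infty = 0 a.s.
a = (math.sqrt(0.5) + math.sqrt(1.5)) / 2

def final_value(n, rng):
    """Sample M_n = X_1 ... X_n, tracking log M_n to avoid underflow."""
    log_m = 0.0
    for _ in range(n):
        log_m += math.log(rng.choice((0.5, 1.5)))
    return math.exp(log_m)

finals = [final_value(2000, rng) for _ in range(100)]
# every sampled path of M_2000 has collapsed towards 0, even though
# E(M_n) = 1 for all n -- the same non-UI behaviour as Example 2.24
```

The mean of log X_i is (log 1.5 + log 0.5)/2 ≈ −0.14 < 0, so log M_n drifts to −∞ linearly; the unit expectation is carried by exponentially unlikely paths.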
3 Stochastic Processes in Continuous Time

Our setting is a probability space (Ω, F, P), with time index t ∈ J ⊆ R⁺ = [0, ∞).

Definition 3.1. A filtration on (Ω, F, P) is an increasing collection of σ-algebras (F_t)_{t∈J} satisfying F_s ⊆ F_t for s ≤ t. A stochastic process in continuous time is an ordered collection of random variables (X_t)_{t∈J} on Ω.
4 Bibliography