ADVANCED PROBABILITY


ANDREW TULLOCH

ADVANCED PROBABILITY

TRINITY COLLEGE, THE UNIVERSITY OF CAMBRIDGE

Contents

1 Conditional Expectation
  1.1 Discrete Case
  1.2 Existence and Uniqueness
  1.3 Conditional Jensen's Inequalities
  1.4 Product Measures and Fubini's Theorem
  1.5 Examples of Conditional Expectation
  1.6 Notation for Example Sheet 1

2 Discrete Time Martingales
  2.1 Optional Stopping
  2.2 Hitting Probabilities for a Simple Symmetric Random Walk
  2.3 Martingale Convergence Theorem
  2.4 Uniform Integrability
  2.5 Backwards Martingales
  2.6 Applications of Martingales
    2.6.1 Martingale proof of the Radon-Nikodym theorem

3 Stochastic Processes in Continuous Time

4 Bibliography

1 Conditional Expectation

Let (Ω, F, P) be a probability space: Ω is a set, F is a σ-algebra on Ω, and P is a probability measure on (Ω, F).

Definition 1.1. F is a σ-algebra on Ω if it satisfies:
(i) Ω ∈ F;
(ii) A ∈ F implies A^c ∈ F;
(iii) if (A_n)_{n ≥ 0} is a collection of sets in F, then ∪_n A_n ∈ F.

Definition 1.2. P is a probability measure on (Ω, F) if P : F → [0, 1] is a set function with P(∅) = 0, P(Ω) = 1, and, for every collection (A_n)_{n ≥ 0} of pairwise disjoint sets in F,

    P(∪_n A_n) = Σ_n P(A_n).

Definition 1.3. The Borel σ-algebra B(R) is the σ-algebra generated by the open sets of R. Calling O the collection of open subsets of R,

    B(R) = ∩ {ξ : ξ is a σ-algebra containing O}.   (1.1)

Definition 1.4. For A a collection of subsets of Ω, we write

    σ(A) = ∩ {ξ : ξ is a σ-algebra containing A}.

Definition 1.5. X is a random variable on (Ω, F) if X : Ω → R is a function with the property that X⁻¹(V) ∈ F for all open sets V ⊆ R.

Exercise 1.6. If X is a random variable then {B ⊆ R : X⁻¹(B) ∈ F} is a σ-algebra and contains B(R).

If (X_i, i ∈ I) is a collection of random variables, then we write

    σ(X_i, i ∈ I) = σ({ω ∈ Ω : X_i(ω) ∈ B}, i ∈ I, B ∈ B(R)),

and it is the smallest σ-algebra that makes all the X_i measurable.

Definition 1.7. First we define expectation for positive simple random variables:

    E(Σ_{i=1}^n c_i 1(A_i)) = Σ_{i=1}^n c_i P(A_i),   (1.2)

with c_i positive constants and (A_i) ⊆ F. We extend this to any positive random variable X ≥ 0 by approximating X as the increasing limit of simple functions. For a general X, we write X = X⁺ − X⁻ with X⁺ = max(X, 0), X⁻ = max(−X, 0). If at least one of E(X⁺), E(X⁻) is finite, then we define E(X) = E(X⁺) − E(X⁻). We call X integrable if E(|X|) < ∞.

Definition 1.8. Let A, B ∈ F with P(B) > 0. Then

    P(A | B) = P(A ∩ B) / P(B),   E[X | B] = E(X 1(B)) / P(B).

Goal: we want to define E(X | G), a random variable measurable with respect to a sub-σ-algebra G.

1.1 Discrete Case

Suppose G is countably generated: (B_i)_{i ∈ N} is a collection of pairwise disjoint sets in F with ∪_i B_i = Ω, and G = σ(B_i, i ∈ N). It is easy to check that G = {∪_{j ∈ J} B_j : J ⊆ N}. Let X be an integrable random variable. Then set

    X′ = E(X | G) = Σ_{i ∈ N} E(X | B_i) 1(B_i),

where the sum runs over those i with P(B_i) > 0.
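In the countably generated case, conditioning is just averaging over the cells of the partition. A minimal numerical sketch (the probability space, random variable and partition below are invented for illustration):

```python
from fractions import Fraction

# Toy probability space: Ω = {0,...,5} with equal weights (hypothetical example).
omega = range(6)
P = {w: Fraction(1, 6) for w in omega}
X = {0: 3, 1: 1, 2: 4, 3: 1, 4: 5, 5: 9}          # an integrable random variable
partition = [{0, 1}, {2, 3, 4}, {5}]               # the cells B_i generating G

def cond_exp(X, partition, P):
    """E(X | G) for G = σ(B_i): constant, equal to E(X; B_i)/P(B_i), on each cell."""
    Y = {}
    for B in partition:
        pB = sum(P[w] for w in B)
        avg = sum(X[w] * P[w] for w in B) / pB
        for w in B:
            Y[w] = avg
    return Y

Y = cond_exp(X, partition, P)
# Defining property: E(X 1(G)) = E(Y 1(G)) for every G ∈ G, e.g. G = B_0 ∪ B_2.
G = {0, 1, 5}
lhs = sum(X[w] * P[w] for w in G)
rhs = sum(Y[w] * P[w] for w in G)
assert lhs == rhs
print(Y[0], Y[2], Y[5])   # → 2 10/3 9
```

Exact rationals are used so the defining identity holds with equality rather than up to floating-point error.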

(i) X′ is G-measurable (check).

(ii) We have

    E(|X′|) ≤ E(|X|),   (1.3)

and so X′ is integrable.

(iii) For G ∈ G,

    E(X′ 1(G)) = E(X 1(G))   (1.4)

(check).

1.2 Existence and Uniqueness

Definition 1.9. A ∈ F happens almost surely (a.s.) if P(A) = 1.

Theorem 1.10 (Monotone Convergence Theorem). If (X_n) is a sequence of random variables with X_n ≥ 0 and X_n ↑ X as n → ∞ a.s., then

    E(X_n) ↑ E(X)   (1.5)

as n → ∞.

Theorem 1.11 (Dominated Convergence Theorem). If (X_n) is a sequence of random variables such that |X_n| ≤ Y for Y an integrable random variable, and X_n → X a.s., then E(X_n) → E(X).

Definition 1.12. For p ∈ [1, ∞) and f a measurable function,

    ‖f‖_p = E[|f|^p]^{1/p},   (1.6)
    ‖f‖_∞ = inf{λ : |f| ≤ λ a.e.}.   (1.7)

Definition 1.13. L^p = L^p(Ω, F, P) = {f : ‖f‖_p < ∞}. Formally, L^p is the collection of equivalence classes where two functions are equivalent if they are equal a.e. We will represent an element of L^p by a function, but remember that equality is a.e.

Theorem 1.14. The space (L², ‖·‖₂) is a Hilbert space with ⟨U, V⟩ = E(UV). If H is a closed subspace, then for every f ∈ L² there exists a unique g ∈ H such that ‖f − g‖₂ = inf{‖f − h‖₂ : h ∈ H} and ⟨f − g, h⟩ = 0 for all h ∈ H. We call g the orthogonal projection of f onto H.

Theorem 1.15. Let (Ω, F, P) be the underlying probability space, let X be an integrable random variable, and let G ⊆ F be a sub-σ-algebra. Then there exists a random variable Y such that

(a) Y is G-measurable;

(b) for all A ∈ G,

    E(X 1(A)) = E(Y 1(A)),   (1.8)

and Y is integrable. Moreover, if Y′ also satisfies these properties, then Y = Y′ a.s.

Remark 1.16. Y is called a version of the conditional expectation of X given G, and we write Y = E(X | G); when G = σ(Z) we also write Y = E(X | Z).

Remark 1.17. (b) could be replaced by the following condition: for all G-measurable, bounded random variables Z,

    E(XZ) = E(YZ).   (1.9)

Proof. Uniqueness: let Y, Y′ satisfy (a) and (b), and consider the G-measurable set A = {Y − Y′ > 0}. From (b),

    E((Y − Y′) 1(A)) = E(X 1(A)) − E(X 1(A)) = 0,

and hence P(Y − Y′ > 0) = 0, which implies Y ≤ Y′ a.s. Similarly, Y′ ≤ Y a.s.

Existence: complete the following three steps.

(i) Suppose X ∈ L²(Ω, F, P), a Hilbert space with ⟨U, V⟩ = E(UV). The space L²(Ω, G, P) is a closed subspace: if G-measurable X_n → X in L², then X_n → X in probability, so a subsequence X_{n_k} → X a.s., and therefore X = lim sup_k X_{n_k} is G-measurable.   (1.10)

We can therefore write

    L²(Ω, F, P) = L²(Ω, G, P) ⊕ L²(Ω, G, P)^⊥,   X = Y + Z.

Set Y = E(X | G); Y is G-measurable, and for A ∈ G,

    E(X 1(A)) = E(Y 1(A)) + E(Z 1(A)),

where the last term vanishes since 1(A) ∈ L²(Ω, G, P) and Z ⊥ L²(Ω, G, P).

(ii) If X ≥ 0 then Y ≥ 0 a.s.: consider A = {Y < 0}; then

    0 ≤ E(X 1(A)) = E(Y 1(A)) ≤ 0,   (1.11)

thus P(A) = 0 and Y ≥ 0 a.s. Now let X ≥ 0 be general, and set 0 ≤ X_n = min(X, n) ≤ n, so X_n ∈ L² for all n. Write Y_n = E(X_n | G); then Y_n ≥ 0 a.s. and (Y_n) is increasing a.s. Set Y = lim sup_n Y_n, so Y is G-measurable. We show Y = E(X | G) a.s.: for all A ∈ G we need E(X 1(A)) = E(Y 1(A)). We know that E(X_n 1(A)) = E(Y_n 1(A)) and Y_n ↑ Y a.s., so by the monotone convergence theorem, E(X 1(A)) = E(Y 1(A)). If X is integrable, setting A = Ω shows that Y is integrable.

(iii) For a general random variable X, not necessarily in L² or positive, write X = X⁺ − X⁻ and define E(X | G) = E(X⁺ | G) − E(X⁻ | G). This satisfies (a) and (b).

Remark 1.18. If X ≥ 0, we can always define Y = E(X | G) a.s., but the integrability of Y may fail.

Definition 1.19. Let G₁, G₂, ... be sub-σ-algebras of F. They are called independent if for all distinct indices i₁, ..., i_n and all choices G_{i_j} ∈ G_{i_j},

    P(G_{i₁} ∩ ⋯ ∩ G_{i_n}) = Π_{j=1}^n P(G_{i_j}).   (1.12)

Theorem 1.20. (i) If X ≥ 0 then E(X | G) ≥ 0.

(ii) E(E(X | G)) = E(X) (take A = Ω).

(iii) If X is G-measurable, then E(X | G) = X a.s.

(iv) If X is independent of G, then E(X | G) = E(X).

Theorem 1.21 (Fatou's lemma). If X_n ≥ 0 for all n, then

    E(lim inf X_n) ≤ lim inf E(X_n).   (1.13)

Theorem 1.22 (Conditional Monotone Convergence). Let X_n ≥ 0, X_n ↑ X a.s. Then

    E(X_n | G) ↑ E(X | G) a.s.   (1.14)

Proof. Set Y_n = E(X_n | G). Then Y_n ≥ 0 and (Y_n) is increasing a.s. Set Y = lim sup Y_n; then Y is G-measurable, and for A ∈ G, letting n → ∞ in E(X_n 1(A)) = E(Y_n 1(A)) via monotone convergence gives E(X 1(A)) = E(Y 1(A)), so Y = E(X | G) a.s.

Theorem 1.23 (Conditional Fatou's Lemma). If X_n ≥ 0, then

    E(lim inf X_n | G) ≤ lim inf E(X_n | G) a.s.   (1.15)

Proof. Let X denote the limit inferior of the X_n. For every natural number k define pointwise the random variable Y_k = inf_{n ≥ k} X_n. Then the sequence Y₁, Y₂, ... is increasing and converges pointwise to X. For k ≤ n we have Y_k ≤ X_n, so that

    E(Y_k | G) ≤ E(X_n | G) a.s.   (1.16)

by the monotonicity of conditional expectation, hence

    E(Y_k | G) ≤ inf_{n ≥ k} E(X_n | G) a.s.,   (1.17)

because the countable union of the exceptional sets of probability zero is again a null set. Using the definition of X, its representation as the pointwise limit of the Y_k, the monotone convergence theorem for conditional expectations, the last inequality, and the definition of the limit inferior, it follows that almost surely

    E(lim inf_n X_n | G) = E(X | G)   (1.18)
                         = E(lim_k Y_k | G)   (1.19)
                         = lim_k E(Y_k | G)   (1.20)
                         ≤ lim_k inf_{n ≥ k} E(X_n | G)   (1.21)
                         = lim inf_n E(X_n | G).   (1.22)

Theorem 1.24 (Conditional dominated convergence). TODO

1.3 Conditional Jensen's Inequalities

Let φ be a convex function and let X be an integrable random variable such that φ(X) is integrable or φ is non-negative. Suppose G ⊆ F is a σ-algebra. Then

    E(φ(X) | G) ≥ φ(E(X | G))   (1.23)

almost surely. In particular, if 1 ≤ p < ∞, then

    ‖E(X | G)‖_p ≤ ‖X‖_p.   (1.24)

Proof. Every convex function can be written as φ(x) = sup_{i ∈ N} (a_i x + b_i) with a_i, b_i ∈ R. Then for each i,

    E(φ(X) | G) ≥ a_i E(X | G) + b_i,

and taking the supremum over i,

    E(φ(X) | G) ≥ sup_{i ∈ N} (a_i E(X | G) + b_i) = φ(E(X | G)).

The second part follows, taking φ(x) = |x|^p, from

    ‖E(X | G)‖_p^p = E(|E(X | G)|^p) ≤ E(E(|X|^p | G)) = E(|X|^p) = ‖X‖_p^p.   (1.25)

Proposition 1.25 (Tower Property). Let X ∈ L¹ and let H ⊆ G ⊆ F be

sub-σ-algebras. Then

    E(E(X | G) | H) = E(X | H)   (1.26)

almost surely.

Proof. Clearly E(X | H) is H-measurable. Let A ∈ H ⊆ G. Then

    E(E(X | H) 1(A)) = E(X 1(A)) = E(E(X | G) 1(A)).   (1.27)

Proposition 1.26. Let X ∈ L¹ and let G ⊆ F be a sub-σ-algebra. Suppose that Y is bounded and G-measurable. Then

    E(XY | G) = Y E(X | G)   (1.28)

almost surely.

Proof. Clearly Y E(X | G) is G-measurable. Let A ∈ G. Then, since Y 1(A) is G-measurable and bounded,

    E(Y E(X | G) 1(A)) = E(E(X | G) · Y 1(A)) = E(XY 1(A)).   (1.29)

Definition 1.27. A collection A of subsets of Ω is called a π-system if for all A, B ∈ A we have A ∩ B ∈ A.

Proposition 1.28 (Uniqueness of extension). Suppose that ξ is a σ-algebra on E. Let µ₁, µ₂ be two measures on (E, ξ) that agree on a π-system generating ξ, with µ₁(E) = µ₂(E) < ∞. Then µ₁ = µ₂ everywhere on ξ.

Theorem 1.29. Let X ∈ L¹ and let G, H ⊆ F be two sub-σ-algebras. If σ(X, G) is independent of H, then

    E(X | σ(G, H)) = E(X | G)   (1.30)

almost surely.

Proof. Assume X ≥ 0; the general case follows by writing X = X⁺ − X⁻. Take A ∈ G, B ∈ H. Then, using the independence of σ(X, G) and H,

    E(E(X | G) 1(A) 1(B)) = P(B) E(E(X | G) 1(A))
                          = P(B) E(X 1(A))
                          = E(X 1(A) 1(B))
                          = E(E(X | σ(G, H)) 1(A ∩ B)).

Now, for F′ ∈ σ(G, H), set µ(F′) = E(E(X | G) 1(F′)) and ν(F′) = E(E(X | σ(G, H)) 1(F′)), and let A = {A ∩ B : A ∈ G, B ∈ H}. Then A is a π-system, and µ, ν are two measures that agree on A, with µ(Ω) = E(E(X | G)) = E(X) = ν(Ω) < ∞ since X is integrable. Note that A generates σ(G, H). So, by the uniqueness of extension theorem, µ and ν agree everywhere on σ(G, H).

Remark 1.30. If we only had X independent of H and G independent of H, the conclusion could fail. For example, consider independent coin tosses X, Y taking values 0, 1 each with probability ½, and let Z = 1(X = Y). Then Z is independent of σ(X) and of σ(Y), yet with G = σ(Y), H = σ(Z) we have E(X | σ(G, H)) = X ≠ E(X) = E(X | G).

1.4 Product Measures and Fubini's Theorem

Definition 1.31. A measure space (E, ξ, µ) is called σ-finite if there exist sets (S_n)_n with ∪_n S_n = E and µ(S_n) < ∞ for all n.

Let (E₁, ξ₁, µ₁) and (E₂, ξ₂, µ₂) be two σ-finite measure spaces, with

    A = {A₁ × A₂ : A₁ ∈ ξ₁, A₂ ∈ ξ₂}

a π-system of subsets of E = E₁ × E₂. Define ξ = ξ₁ ⊗ ξ₂ = σ(A).

Definition 1.32 (Product measure). Let (E₁, ξ₁, µ₁) and (E₂, ξ₂, µ₂) be two σ-finite measure spaces. Then there exists a unique measure µ = µ₁ ⊗ µ₂ on (E, ξ) satisfying

    µ(A₁ × A₂) = µ₁(A₁) µ₂(A₂)   (1.31)

for all A₁ ∈ ξ₁, A₂ ∈ ξ₂.

Theorem 1.33 (Fubini's Theorem). Let (E₁, ξ₁, µ₁) and (E₂, ξ₂, µ₂) be σ-finite measure spaces and let f ≥ 0 be ξ-measurable. Then

    µ(f) = ∫_{E₁} ( ∫_{E₂} f(x₁, x₂) µ₂(dx₂) ) µ₁(dx₁).   (1.32)

If f is integrable, then x₂ ↦ f(x₁, x₂) is µ₂-integrable for µ₁-almost all x₁. Moreover, x₁ ↦ ∫_{E₂} f(x₁, x₂) µ₂(dx₂) is µ₁-integrable, and µ(f) is given by (1.32).

1.5 Examples of Conditional Expectation

Definition 1.34. A random vector (X₁, X₂, ..., X_n) ∈ R^n is called a Gaussian random vector if and only if for all a₁, ..., a_n ∈ R,

    a₁X₁ + ⋯ + a_nX_n   (1.33)

is a Gaussian random variable. (X_t)_{t ≥ 0} is called a Gaussian process if for all 0 ≤ t₁ ≤ t₂ ≤ ⋯ ≤ t_n, the vector (X_{t₁}, ..., X_{t_n}) is a Gaussian random vector.

Example 1.35 (Gaussian case). Let (X, Y) be a Gaussian vector in R². We want to calculate

    E(X | Y) = E(X | σ(Y)) = X′,   (1.34)

where X′ = f(Y) with f a Borel function. Let us try f linear, X′ = aY + b, with a, b ∈ R to be determined. We require E(X′) = E(X) and Cov(X − X′, Y) = 0, which give

    aE(Y) + b = E(X),   Cov(X, Y) = a Var(Y),   (1.35)

so (assuming Var(Y) > 0) a = Cov(X, Y)/Var(Y) and b = E(X) − aE(Y). Since (X − X′, Y) is Gaussian with Cov(X − X′, Y) = 0, the variable X − X′ is independent of Y; hence for A ∈ σ(Y), writing 1(A) = g(Y),

    E(X 1(A)) = E(X′ 1(A)) + E(X − X′) E(g(Y)) = E(X′ 1(A)),

since E(X − X′) = 0. Therefore E(X | Y) = aY + b.

1.6 Notation for Example Sheet 1

(i) G ∨ H = σ(G, H).

(ii) Let X, Y be two random variables taking values in R with joint density f_{X,Y}(x, y), and let h : R → R be a Borel function such that

h(X) is integrable. We want to calculate

    E(h(X) | Y) = E(h(X) | σ(Y)).   (1.36)

Let g be bounded and measurable. Then, with the convention 0/0 = 0,

    E(h(X) g(Y)) = ∫∫ h(x) g(y) f_{X,Y}(x, y) dx dy   (1.37)
                 = ∫∫ h(x) g(y) (f_{X,Y}(x, y) / f_Y(y)) f_Y(y) dx dy   (1.38)
                 = ∫ ( ∫ h(x) (f_{X,Y}(x, y) / f_Y(y)) dx ) g(y) f_Y(y) dy.   (1.39)

Set φ(y) = ∫ h(x) (f_{X,Y}(x, y) / f_Y(y)) dx if f_Y(y) > 0, and 0 otherwise. Then almost surely,

    E(h(X) | Y) = φ(Y),   (1.40)

and

    E(h(X) | Y) = ∫ h(x) ν(Y, dx)   (1.41)

with ν(y, dx) = (f_{X,Y}(x, y) / f_Y(y)) 1(f_Y(y) > 0) dx = f_{X|Y}(x | y) dx. ν(y, dx) is called the conditional distribution of X given Y = y, and f_{X|Y}(x | y) is the conditional density of X given Y = y.
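The Gaussian formula of Example 1.35, a = Cov(X, Y)/Var(Y) and b = E(X) − aE(Y), can be sanity-checked by simulation. A minimal sketch (the particular means and covariances below are invented for illustration; the residual orthogonality Cov(X − X′, Y) = 0 holds exactly for the sample coefficients):

```python
import random

random.seed(0)
n = 200_000
# Build a Gaussian vector (X, Y) from independent standard normals.
xs, ys = [], []
for _ in range(n):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    y = 1.0 + 2.0 * z1                 # Y ~ N(1, 4)
    x = 0.5 + 1.5 * z1 + 1.0 * z2     # X correlated with Y through z1
    xs.append(x); ys.append(y)

mean = lambda v: sum(v) / len(v)
mx, my = mean(xs), mean(ys)
cov = mean([(u - mx) * (v - my) for u, v in zip(xs, ys)])
var_y = mean([(v - my) ** 2 for v in ys])

a = cov / var_y                        # a = Cov(X,Y)/Var(Y); here truly 3/4
b = mx - a * my                        # b = E(X) - a E(Y)
# The residual X - (aY + b) is uncorrelated with Y, hence (jointly Gaussian)
# independent of Y, so E(X | Y) = aY + b.
resid_cov = mean([(u - (a * v + b)) * (v - my) for u, v in zip(xs, ys)])
print(round(a, 2), round(resid_cov, 6))   # a ≈ 0.75, residual covariance ≈ 0
```

The true coefficient here is a = (1.5 · 2)/4 = 0.75; the Monte Carlo estimate fluctuates around it, while the sample residual covariance vanishes up to floating-point error by construction.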

2 Discrete Time Martingales

Let (Ω, F, P) be a probability space and (E, ξ) a measurable space. Usually E = R, R^d or C; for us, E = R. A sequence X = (X_n)_{n ≥ 0} of random variables taking values in E is called a stochastic process.

A filtration is an increasing family (F_n)_{n ≥ 0} of sub-σ-algebras of F, so F_n ⊆ F_{n+1}. Intuitively, F_n is the information available to us at time n. To every stochastic process X we associate a filtration called the natural filtration (F_n^X)_{n ≥ 0},

    F_n^X = σ(X_k, k ≤ n).   (2.1)

A stochastic process X is called adapted to (F_n)_{n ≥ 0} if X_n is F_n-measurable for all n. A stochastic process X is called integrable if X_n is integrable for all n.

Definition 2.1. An adapted integrable process (X_n)_{n ≥ 0} taking values in R is called a

(i) martingale if E(X_n | F_m) = X_m for all n ≥ m;

(ii) super-martingale if E(X_n | F_m) ≤ X_m for all n ≥ m;

(iii) sub-martingale if E(X_n | F_m) ≥ X_m for all n ≥ m.

Remark 2.2. A (sub-, super-)martingale with respect to a filtration (F_n) is also a (sub-, super-)martingale with respect to the natural filtration of X (by the tower property).
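The defining identity of Definition 2.1 can be verified exactly for a simple symmetric random walk by enumerating all sign sequences; a toy sketch (the path length n = 5 and time m = 2 are arbitrary choices, not from the notes):

```python
from itertools import product

# Exact check of the martingale property for X_n = ξ_1 + ... + ξ_n with
# P(ξ_i = ±1) = 1/2, by enumerating all 2^n equally likely sign sequences.
n, m = 5, 2
total, weight = {}, {}
for xs in product([-1, 1], repeat=n):
    prefix = xs[:m]                        # the F_m-information: the first m steps
    total[prefix] = total.get(prefix, 0) + sum(xs)      # accumulate X_n
    weight[prefix] = weight.get(prefix, 0) + 1          # count paths per atom

# E(X_n | F_m) is the average of X_n over all paths sharing the first m steps;
# the martingale property says this equals X_m = sum(prefix) on every atom.
for prefix, tot in total.items():
    assert tot / weight[prefix] == sum(prefix)
print("martingale property verified on all", len(total), "atoms of F_m")
```

Since the steps beyond time m average to zero on each atom, the check passes with exact equality.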

Example 2.3. Suppose (ξ_i) are iid random variables with E(ξ_i) = 0. Set X_n = Σ_{i=1}^n ξ_i. Then (X_n) is a martingale.

Example 2.4. As above, but let (ξ_i) be iid with E(ξ_i) = 1. Then X_n = Π_{i=1}^n ξ_i is a martingale.

Definition 2.5. A random variable T : Ω → Z⁺ ∪ {∞} is called a stopping time if {T ≤ n} ∈ F_n for all n. Equivalently, {T = n} ∈ F_n for all n.

Example 2.6. (i) Constant times are trivial stopping times.

(ii) Let A ∈ B(R) and define T_A = inf{n ≥ 0 : X_n ∈ A}, with inf ∅ = ∞. Then T_A is a stopping time.

Proposition 2.7. Let S, T, (T_n) be stopping times on the filtered probability space (Ω, F, (F_n), P). Then S ∧ T, S ∨ T, inf_n T_n, lim inf_n T_n, lim sup_n T_n are stopping times.

Notation. For a stopping time T, X_T(ω) = X_{T(ω)}(ω). The stopped process X^T is defined by X_t^T = X_{T ∧ t}. Also, F_T = {A ∈ F : A ∩ {T ≤ t} ∈ F_t for all t}.

Proposition 2.8. On (Ω, F, (F_n), P), let X = (X_n)_{n ≥ 0} be adapted.

(i) If S ≤ T are stopping times, then F_S ⊆ F_T.

(ii) X_T 1(T < ∞) is F_T-measurable.

(iii) If T is a stopping time, then X^T is adapted.

(iv) If X is integrable, then X^T is integrable.

Proof of (ii). Let A ∈ ξ. We need to show that {X_T 1(T < ∞) ∈ A} ∈ F_T, which holds since

    {X_T 1(T < ∞) ∈ A} ∩ {T ≤ t} = ∪_{s ≤ t} ({T = s} ∩ {X_s ∈ A}) ∈ F_t,   (2.2)

because {T = s} ∈ F_s ⊆ F_t and {X_s ∈ A} ∈ F_s ⊆ F_t.

2.1 Optional Stopping

Theorem 2.9. Let X be a martingale.

(i) If T is a stopping time, then X^T is also a martingale; in particular, E(X_{T ∧ t}) = E(X_0) for all t.

Proof. By the tower property, it is sufficient to check E(X_{T ∧ t} | F_{t−1}) = X_{T ∧ (t−1)}:

    E(X_{T ∧ t} | F_{t−1}) = E( Σ_{s=0}^{t−1} X_s 1(T = s) + X_t 1(T > t − 1) | F_{t−1} )
                           = Σ_{s=0}^{t−1} X_s 1(T = s) + 1(T > t − 1) X_{t−1}
                           = X_{T ∧ (t−1)},

since each X_s 1(T = s) with s ≤ t − 1 is F_{t−1}-measurable, {T > t − 1} ∈ F_{t−1}, and E(X_t | F_{t−1}) = X_{t−1}. Since X^T is a martingale, E(X_{T ∧ t}) = E(X_0).

Theorem 2.10. Let X be a martingale.

(i) If T is a stopping time, then X^T is also a martingale, so in particular

    E(X_{T ∧ t}) = E(X_0).   (2.3)

(ii) If S ≤ T are bounded stopping times, then E(X_T | F_S) = X_S almost surely.

Proof of (ii). Let S ≤ T ≤ n. Then

    X_T = (X_T − X_{T−1}) + (X_{T−1} − X_{T−2}) + ⋯ + (X_{S+1} − X_S) + X_S
        = X_S + Σ_{k=0}^{n} (X_{k+1} − X_k) 1(S ≤ k < T).

Let A ∈ F_S. Then

    E(X_T 1(A)) = E(X_S 1(A)) + Σ_{k=0}^{n} E((X_{k+1} − X_k) 1(S ≤ k < T) 1(A))   (2.4)
                = E(X_S 1(A)),   (2.5)

since 1(S ≤ k < T) 1(A) is F_k-measurable and E(X_{k+1} − X_k | F_k) = 0.

Remark 2.11. The optional stopping theorem also holds for super-/sub-martingales, with the corresponding inequalities in the statement.

Example 2.12. Suppose that (ξ_i) are iid random variables with

    P(ξ_i = 1) = P(ξ_i = −1) = 1/2.   (2.6)

Set X_0 = 0, X_n = Σ_{i=1}^n ξ_i; this is a simple symmetric random walk. Let T = inf{n ≥ 0 : X_n = 1}. Then P(T < ∞) = 1, but T is not bounded, and indeed E(X_T) = 1 ≠ 0 = E(X_0): the boundedness assumption in Theorem 2.10(ii) cannot simply be dropped.

Proposition 2.13. If X is a positive supermartingale and T is a stopping time which is finite almost surely (P(T < ∞) = 1), then

    E(X_T) ≤ E(X_0).   (2.7)

Proof. By Fatou's lemma and the supermartingale property of X^T,

    E(X_T) = E(lim inf_t X_{T ∧ t}) ≤ lim inf_t E(X_{T ∧ t}) ≤ E(X_0).   (2.8)

2.2 Hitting Probabilities for a Simple Symmetric Random Walk

Let (ξ_i) be iid, ±1 equally likely. Let X_0 = 0, X_n = Σ_{i=1}^n ξ_i. For all x ∈ Z let

    T_x = inf{n ≥ 0 : X_n = x},   (2.9)

which is a stopping time. We want the hitting probabilities P(T_{−a} < T_b) for a, b > 0. Let T = T_{−a} ∧ T_b. If E(T) < ∞ then, since the increments of X are bounded, optional stopping gives E(X_T) = E(X_0) = 0, so

    E(X_T) = −a P(T_{−a} < T_b) + b P(T_b < T_{−a}) = 0,   (2.10)

and together with P(T_{−a} < T_b) + P(T_b < T_{−a}) = 1 we obtain

    P(T_{−a} < T_b) = b / (a + b).   (2.11)

It remains to check E(T) < ∞. We have P(ξ_1 = 1, ..., ξ_{a+b} = 1) = 2^{−(a+b)}, so each block of a + b consecutive steps takes the walk out of (−a, b) with probability at least 2^{−(a+b)}; hence T is stochastically dominated by (a + b) times a geometric random variable, and E(T) < ∞.
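The hitting probability (2.11) is easy to check by Monte Carlo; a minimal sketch (the values a = 2, b = 3 and the trial count are arbitrary choices for illustration):

```python
import random

random.seed(1)

def hits_minus_a_first(a, b):
    """Run a simple symmetric random walk from 0 until it hits -a or b."""
    x = 0
    while -a < x < b:
        x += random.choice((-1, 1))
    return x == -a

a, b, trials = 2, 3, 100_000
est = sum(hits_minus_a_first(a, b) for _ in range(trials)) / trials
print(round(est, 2), b / (a + b))   # estimate ≈ b/(a+b) = 0.6
```

Each walk terminates almost surely (in fact E(T) = ab here), so the loop always ends, and the empirical frequency should agree with b/(a + b) to within sampling error.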

2.3 Martingale Convergence Theorem

Theorem 2.14. Let X = (X_n)_{n ≥ 0} be a (super-)martingale bounded in L¹, that is, sup_{n ≥ 0} E(|X_n|) < ∞. Then X_n converges as n → ∞ almost surely towards an a.s. finite limit X_∞ ∈ L¹(F_∞), where F_∞ = σ(F_n, n ≥ 0).

To prove it we will use Doob's trick, which counts up-crossings of intervals with rational endpoints.

Corollary 2.15. Let X be a positive supermartingale. Then it converges to an almost surely finite limit as n → ∞.

Proof. E(|X_n|) = E(X_n) ≤ E(X_0) < ∞.   (2.12)

Towards the proof of Theorem 2.14, let x = (x_n)_n be a sequence of real numbers, and let a < b be two real numbers. Let T_0(x) = 0 and, inductively for k ≥ 0,

    S_{k+1}(x) = inf{n ≥ T_k(x) : x_n ≤ a},
    T_{k+1}(x) = inf{n ≥ S_{k+1}(x) : x_n ≥ b},   (2.13)

with the usual convention that inf ∅ = ∞. Define N_n([a, b], x) = sup{k ≥ 0 : T_k(x) ≤ n}, the number of up-crossings of the interval [a, b] by the sequence x by time n. As n → ∞,

    N_n([a, b], x) ↑ N([a, b], x) = sup{k ≥ 0 : T_k(x) < ∞},   (2.14)

the total number of up-crossings of the interval [a, b].

Lemma 2.16. A sequence of reals x = (x_n)_n converges in R̄ = R ∪ {±∞} if and only if N([a, b], x) < ∞ for all rationals a < b.

Proof. Assume x converges. If for some a < b we had N([a, b], x) = ∞, then lim inf_n x_n ≤ a < b ≤ lim sup_n x_n, which is a contradiction. Conversely, suppose that x does not converge. Then lim inf_n x_n < lim sup_n x_n, and taking rationals a < b between these two numbers gives N([a, b], x) = ∞, as required.
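The up-crossing count N([a, b], x) defined by (2.13)-(2.14) translates directly into code; a small sketch with two illustrative sequences (both invented here):

```python
def upcrossings(xs, a, b):
    """N([a,b], x): number of passages of the sequence from level ≤ a up to level ≥ b."""
    count = 0
    below = False          # have we reached level a since the last completed up-crossing?
    for x in xs:
        if not below and x <= a:
            below = True   # S_{k+1}: first time at or below a
        elif below and x >= b:
            below = False  # T_{k+1}: subsequent time at or above b
            count += 1
    return count

# An oscillating sequence up-crosses [0, 1] again and again and cannot converge;
# a convergent sequence crosses any fixed [a, b] with a < b only finitely often.
oscillating = [(-1) ** n for n in range(10)]     # 1, -1, 1, -1, ...
convergent = [1 / (n + 1) for n in range(10)]    # 1, 1/2, 1/3, ...
print(upcrossings(oscillating, 0, 1), upcrossings(convergent, 0, 1))   # → 4 0
```

This is exactly the dichotomy of Lemma 2.16: infinitely many up-crossings of some rational interval force lim inf ≤ a < b ≤ lim sup.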

Theorem 2.17 (Doob's up-crossing inequality). Let X be a supermartingale and let a < b be two real numbers. Then for all n ≥ 0,

    (b − a) E(N_n([a, b], X)) ≤ E((X_n − a)⁻).   (2.15)

Proof. For every completed up-crossing k,

    X_{T_k} − X_{S_k} ≥ b − a;   (2.16)

summing over the N_n completed up-crossings and controlling the final incomplete excursion by (X_n − a)⁻ gives the claim.

2.4 Uniform Integrability

Theorem 2.18. Suppose X ∈ L¹. Then the collection of random variables

    {E(X | G) : G ⊆ F a sub-σ-algebra}   (2.17)

is uniformly integrable.

Proof. Since X ∈ L¹, for all ε > 0 there exists δ > 0 such that if A ∈ F and P(A) ≤ δ, then E(|X| 1(A)) ≤ ε. Set Y = E(X | G); then E(|Y|) ≤ E(|X|). Choose λ < ∞ such that E(|X|) ≤ λδ. Then

    P(|Y| ≥ λ) ≤ E(|Y|)/λ ≤ δ   (2.18)

by Markov's inequality. Hence, using conditional Jensen and {|Y| ≥ λ} ∈ G,

    E(|Y| 1(|Y| ≥ λ)) ≤ E(E(|X| | G) 1(|Y| ≥ λ))   (2.19)
                      = E(|X| 1(|Y| ≥ λ))   (2.20)
                      ≤ ε.   (2.21)

Definition 2.19. A process X = (X_n)_{n ≥ 0} is called a uniformly integrable martingale if it is a martingale and the collection (X_n) is uniformly integrable.

Theorem 2.20. Let X be a martingale. Then the following are equivalent.

(i) X is a uniformly integrable martingale.

(ii) X converges almost surely and in L¹ to a limit X_∞ as n → ∞.

(iii) There exists a random variable Z ∈ L¹ such that X_n = E(Z | F_n) almost surely for all n ≥ 0.

Theorem 2.21 (Chapter 13 of Williams). Let X_n, X ∈ L¹ for all n ≥ 0 and suppose that X_n → X almost surely as n → ∞. Then X_n → X in L¹ if and only if (X_n) is uniformly integrable.

Proof of Theorem 2.20. We proceed as follows.

(i) ⟹ (ii): Since X is uniformly integrable, it is bounded in L¹, and by the martingale convergence theorem X_n converges almost surely to a finite limit X_∞. Theorem 2.21 then gives L¹ convergence.

(ii) ⟹ (iii): Set Z = X_∞. We need to show that X_n = E(Z | F_n) almost surely for all n ≥ 0. For all m ≥ n, by the martingale property and conditional Jensen,

    ‖X_n − E(X_∞ | F_n)‖₁ = ‖E(X_m − X_∞ | F_n)‖₁ ≤ ‖X_m − X_∞‖₁ → 0   (2.22)

as m → ∞, so X_n = E(X_∞ | F_n) a.s.

(iii) ⟹ (i): (E(Z | F_n))_n is a martingale by the tower property of conditional expectation, and uniform integrability follows from Theorem 2.18.

Remark 2.22. If X is a UI martingale with X_n = E(Z | F_n), then X_∞ = E(Z | F_∞) a.s., where F_∞ = σ(F_n, n ≥ 0).

Remark 2.23. If X is a UI super-/sub-martingale, then it converges almost surely and in L¹ to a finite limit X_∞ with E(X_∞ | F_n) ≤ (≥) X_n almost surely.

Example 2.24. Let X₁, X₂, ... be iid random variables with P(X_i = 0) = P(X_i = 2) = 1/2. Set Y_n = X₁ ⋯ X_n. Then (Y_n) is a martingale. As E(Y_n) = 1 for all n, (Y_n) is bounded in L¹, and it converges almost surely to 0. But E(Y_n) = 1 for all n, and hence it does not converge in L¹.
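Example 2.24 is easy to see numerically: the product dies the first time some X_i = 0. A minimal simulation sketch (horizon n = 30 and trial count are arbitrary illustrative choices):

```python
import random

random.seed(2)

# Y_n = X_1 ··· X_n with P(X_i = 0) = P(X_i = 2) = 1/2. Each Y_n has mean 1,
# but Y_n = 0 forever once a single factor is 0, so Y_n → 0 almost surely.
trials, n = 100_000, 30
final = []
for _ in range(trials):
    y = 1.0
    for _ in range(n):
        y *= random.choice((0.0, 2.0))
    final.append(y)

prop_zero = sum(y == 0.0 for y in final) / trials
mean_Yn = sum(final) / trials
print(round(prop_zero, 4))   # ≈ 1.0: essentially every path has hit 0 by time 30
# mean_Yn is driven by the huge rare value 2^30, carried with probability 2^-30:
# the empirical mean is wildly unstable, which is exactly the failure of
# convergence in L¹ (the a.s. limit is 0 while E(Y_n) = 1 for every n).
```

The probability that a single path survives to time 30 is 2⁻³⁰, so with 10⁵ trials the observed proportion of zeros is essentially 1.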

If X is a UI martingale and T is a stopping time, then we can unambiguously define

    X_T = Σ_{n=0}^∞ X_n 1(T = n) + X_∞ 1(T = ∞).   (2.23)

Theorem 2.25 (Optional stopping for UI martingales). Let X be a UI martingale and let S, T be stopping times with S ≤ T. Then

    E(X_T | F_S) = X_S   (2.24)

almost surely.

Proof. We first show that E(X_∞ | F_T) = X_T almost surely for any stopping time T. First, check that X_T ∈ L¹: since |X_n| ≤ E(|X_∞| | F_n) and {T = n} ∈ F_n,

    E(|X_T|) = Σ_{n ∈ Z⁺ ∪ {∞}} E(|X_T| 1(T = n))   (2.25)
             ≤ Σ_{n ∈ Z⁺ ∪ {∞}} E(|X_∞| 1(T = n))   (2.26)
             = E(|X_∞|) < ∞.   (2.27)

Let B ∈ F_T. Then

    E(1(B) X_T) = Σ_{n ∈ Z⁺ ∪ {∞}} E(1(B) 1(T = n) X_n)   (2.28)
                = Σ_{n ∈ Z⁺ ∪ {∞}} E(1(B) 1(T = n) X_∞)   (2.29)
                = E(1(B) X_∞),   (2.30)

where for the second equality we used that B ∩ {T = n} ∈ F_n and E(X_∞ | F_n) = X_n almost surely. Clearly X_T is F_T-measurable, and hence E(X_∞ | F_T) = X_T almost surely.

Using the tower property of conditional expectation, we have

for stopping times S ≤ T (as F_S ⊆ F_T),

    E(X_T | F_S) = E(E(X_∞ | F_T) | F_S)   (2.31)
                 = E(X_∞ | F_S)   (2.32)
                 = X_S   (2.33)

almost surely.

2.5 Backwards Martingales

Let ⋯ ⊆ G₂ ⊆ G₁ ⊆ G₀ be a decreasing sequence of σ-algebras. [To be continued.]

2.6 Applications of Martingales

Theorem 2.26 (Kolmogorov's 0-1 law). Let (X_i)_{i ≥ 1} be a sequence of iid random variables. Let F_n = σ(X_k, k ≥ n) and F_∞ = ∩_{n ≥ 0} F_n. Then F_∞ is trivial, that is, every A ∈ F_∞ has probability P(A) ∈ {0, 1}.

Proof. Let G_n = σ(X_k, k ≤ n) and A ∈ F_∞. Since G_n is independent of F_{n+1} ⊇ F_∞, we have

    E(1(A) | G_n) = P(A).   (2.34)

Theorem 2.20, applied to the UI martingale E(1(A) | G_n), gives that P(A) = E(1(A) | G_n) converges to E(1(A) | G_∞) as n → ∞, where G_∞ = σ(G_n, n ≥ 0). Since A ∈ F_∞ ⊆ G_∞, we deduce that E(1(A) | G_∞) = 1(A), so 1(A) = P(A) a.s. and therefore P(A) ∈ {0, 1}.

Theorem 2.27 (Strong law of large numbers). Let (X_i)_{i ≥ 1} be a sequence of iid random variables in L¹ with µ = E(X_i). Let S_n = Σ_{i=1}^n X_i and S_0 = 0. Then S_n/n → µ as n → ∞, almost surely and in L¹.

Proof. [To be filled in from the lecture notes.]

Theorem 2.28 (Kakutani's product martingale theorem). Let (X_n)_{n ≥ 0} be a sequence of independent non-negative random variables of mean 1. Let M_0 = 1 and M_n = Π_{i=1}^n X_i for n ∈ N. Then (M_n)_{n ≥ 0} is a non-negative martingale and M_n → M_∞ a.s. as n → ∞ for some random variable M_∞. Set a_n = E(√X_n); then a_n ∈ (0, 1]. Moreover:

(i) if Π_n a_n > 0, then M_n → M_∞ in L¹ and E(M_∞) = 1;

(ii) if Π_n a_n = 0, then M_∞ = 0 almost surely.

Proof. [To be filled in; this is somewhat involved.]

2.6.1 Martingale proof of the Radon-Nikodym theorem

Let P, Q be two probability measures on the measurable space (Ω, F). Assume that F is countably generated, that is, there exists a collection of sets (F_n)_{n ∈ N} such that F = σ(F_n, n ∈ N). Then the following are equivalent.

(i) P(A) = 0 implies Q(A) = 0 for all A ∈ F; that is, Q is absolutely continuous with respect to P, written Q ≪ P.

(ii) For all ε > 0 there exists δ > 0 such that P(A) ≤ δ implies Q(A) ≤ ε.

(iii) There exists a non-negative random variable X such that

    Q(A) = E_P(X 1(A)).   (2.35)

Proof. (i) ⟹ (ii): If (ii) does not hold, then there exists ε > 0 such that for all n ≥ 1 there exists a set A_n with P(A_n) ≤ 2⁻ⁿ and Q(A_n) ≥ ε. By Borel-Cantelli, P(A_n i.o.) = 0, so from (i), Q(A_n i.o.) = 0. But

    Q(A_n i.o.) = Q(∩_n ∪_{k ≥ n} A_k) = lim_n Q(∪_{k ≥ n} A_k) ≥ ε,   (2.36)

which is a contradiction.

(ii) ⟹ (iii): Consider the filtration F_n = σ(F_k, k ≤ n) and let

    A_n = {H₁ ∩ ⋯ ∩ H_n : H_i = F_i or F_i^c};   (2.37)

then it is easy to see that F_n = σ(A_n). Note also that distinct sets in A_n are disjoint. [Proof to be continued.]

3 Stochastic Processes in Continuous Time

Our setting is a probability space (Ω, F, P), with time indices t ∈ J ⊆ R⁺ = [0, ∞).

Definition 3.1. A filtration on (Ω, F, P) is an increasing collection of σ-algebras (F_t)_{t ∈ J} satisfying F_s ⊆ F_t for s ≤ t. A stochastic process in continuous time is a collection (X_t)_{t ∈ J} of random variables on Ω.

4 Bibliography