
Math 456: Mathematical Modeling Tuesday, March 6th, 2018

Markov Chains: Exit distributions and the Strong Markov Property Tuesday, March 6th, 2018

Last time
1. Weighted graphs.
2. Existence of stationary distributions (irreducible chains).
3. Informal definition of transient and recurrent states.
4. Definition of stopping times.
5. Statement of the Strong Markov Property.

Today we will see:
1. Stopping times and exit distributions.
2. The proof of the Strong Markov Property.
3. Probability of return: properties and computations.

Notation reminder (probability with respect to the initial state; hitting time): Recall the notation from last time. Given a Markov chain $X_0, X_1, X_2, \ldots$ we will write
$P_x[A] := P[A \mid X_0 = x]$ and $E_x[Y] := E[Y \mid X_0 = x]$.
Also recall: given a state $y$, we have the first hitting time
$T_y := \min\{n \geq 1 : X_n = y\}$,
which is a positive, integer-valued random variable.
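To make the notation concrete, here is a minimal Python sketch of how $P_x$ and $T_y$ translate into simulation. The chain representation (a dict of transition probabilities) and all names are illustrative choices, not part of the lecture.

```python
import random

def simulate_chain(p, x0, n_steps):
    """Simulate a Markov chain with transition probabilities p[x][y],
    started from X_0 = x0.  Returns the trajectory [X_0, ..., X_{n_steps}]."""
    traj = [x0]
    for _ in range(n_steps):
        states, probs = zip(*p[traj[-1]].items())
        traj.append(random.choices(states, weights=probs)[0])
    return traj

def first_hitting_time(traj, y):
    """T_y = min{n >= 1 : X_n = y}, or None if y is not hit in this trajectory."""
    for n in range(1, len(traj)):
        if traj[n] == y:
            return n
    return None
```

Averaging the indicator of an event $A$ over many trajectories started at $x$ then gives a Monte Carlo estimate of $P_x[A]$.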

Warm up: Consider the Gambler's ruin with a fair coin. What is $P_x[\text{Winning}]$? Using exit times, and writing $A = \{0, M\}$: what is $P_x[X_{T_A} = M]$?

Warm up: From the total probability formula, it follows that (for $x \neq 0, M$)
$P_x[X_{T_A} = M] = \tfrac{1}{2} P_{x+1}[X_{T_A} = M] + \tfrac{1}{2} P_{x-1}[X_{T_A} = M]$.
From here, with a little effort, one can show that
$P_x[X_{T_A} = M] = \tfrac{x}{M}, \qquad x = 0, 1, \ldots, M$.
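As a sanity check of the formula $P_x[X_{T_A} = M] = x/M$, here is a short Monte Carlo experiment; the values $M = 10$, $x = 3$ and the number of trials are arbitrary illustrative choices.

```python
import random

def exits_at_M(x, M):
    """Fair-coin gambler's ruin started at x, run until it hits A = {0, M};
    returns True if the exit state is M."""
    while 0 < x < M:
        x += random.choice([-1, 1])
    return x == M

M, x, trials = 10, 3, 100_000
estimate = sum(exits_at_M(x, M) for _ in range(trials)) / trials
print(estimate, x / M)   # the Monte Carlo estimate should be close to x/M = 0.3
```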

Warm up: What if the coin is not fair?

Exit probabilities: We are given an irreducible Markov chain with state space $S$ and a set $A \subset S$. The state of the chain at the time it first reaches $A$ is given by $X_{T_A}$. We may think of $A$ as representing exits from the state space, so $X_{T_A}$ represents the location of the chain where we exit.

Exit probabilities: To study the distribution of $X_{T_A}$, fix $y \in A$ and define
$G_y(x) := P[X_{T_A} = y \mid X_0 = x]$.
This defines a function $G_y : S \to \mathbb{R}$, known as the Green function of the chain (with Dirichlet conditions on $A$).

Exit probabilities: Observe: if $x \in A$, then $T_A = 0$, so $P[X_{T_A} = y \mid X_0 = x] = P[X_0 = y \mid X_0 = x]$, and this is 1 or 0 depending on whether $x = y$ or not, so
$G_y(x) = \begin{cases} 1 & \text{if } x \in A,\ x = y, \\ 0 & \text{if } x \in A,\ x \neq y. \end{cases}$
What can we say for $x \notin A$?

Exit probabilities: For $x \notin A$, we divide and conquer. In this case, the chain will move to a new state $X_1$, and we can condition on this new state. The total probability formula + Markov property then say that
$P[X_{T_A} = y \mid X_0 = x] = \sum_{x' \in S} P[X_{T_A} = y \mid X_1 = x']\, P[X_1 = x' \mid X_0 = x]$.

Exit probabilities: Now, the identity
$P[X_{T_A} = y \mid X_0 = x] = \sum_{x' \in S} P[X_{T_A} = y \mid X_1 = x']\, P[X_1 = x' \mid X_0 = x]$
can be rewritten as
$G_y(x) = \sum_{x' \in S} G_y(x')\, p(x, x')$
or, using that $\sum_{x' \in S} p(x, x') = 1$,
$\sum_{x' \in S} \bigl(G_y(x') - G_y(x)\bigr)\, p(x, x') = 0 \qquad \text{for all } x \in S \setminus A$.

Exit probabilities (Example): Let us go back to the Gambler's ruin. Assume we have a biased coin ($p \neq q$). Take $y = M$. Then for $x \in \{1, \ldots, M-1\}$
$q\,\bigl(G_M(x+1) - G_M(x)\bigr) + p\,\bigl(G_M(x-1) - G_M(x)\bigr) = 0$.
With some cleverness, we can write this as a recurrence relation:
$G_M(x+1) - G_M(x) = \tfrac{p}{q}\,\bigl(G_M(x) - G_M(x-1)\bigr)$.

Exit probabilities (Example): Thus, we have for $x \in \{1, \ldots, M-1\}$
$G_M(x+1) - G_M(x) = \left(\tfrac{p}{q}\right)^x \bigl(G_M(1) - G_M(0)\bigr) = \left(\tfrac{p}{q}\right)^x G_M(1)$,
this being since $G_M(0) = 0$. Now, we can add up all these identities, from $x = 0$ up to some arbitrary element of $\{1, \ldots, M\}$, obtaining
$G_M(x) - G_M(0) = G_M(1) \sum_{i=0}^{x-1} \left(\tfrac{p}{q}\right)^i$.

Exit probabilities (Example): Therefore (again, since $G_M(0) = 0$)
$G_M(x) = G_M(1) \sum_{i=0}^{x-1} \left(\tfrac{p}{q}\right)^i$.
What about $G_M(1)$? We use the elementary identity
$\sum_{i=0}^{x-1} \left(\tfrac{p}{q}\right)^i = \dfrac{(p/q)^x - 1}{(p/q) - 1}$.
Substituting...

Exit probabilities (Example): ...we obtain for every $x \in \{0, \ldots, M\}$ the formula
$G_M(x) = \dfrac{(p/q)^x - 1}{(p/q) - 1}\, G_M(1)$.
Since $G_M(M) = 1$, it follows that
$1 = \dfrac{(p/q)^M - 1}{(p/q) - 1}\, G_M(1)$,
and therefore
$G_M(1) = \dfrac{(p/q) - 1}{(p/q)^M - 1}$.

Exit probabilities (Example): As before, $G_M(x+1) - G_M(x) = \left(\tfrac{p}{q}\right)^x G_M(1)$ for $x \in \{1, \ldots, M-1\}$. At the same time, we have $G_M(M) = 1$, so
$1 - G_M(M-1) = \left(\tfrac{p}{q}\right)^{M-1} G_M(1)$.

Exit probabilities (Example): In conclusion, for every $x$,
$G_M(x) = \dfrac{(p/q)^x - 1}{(p/q)^M - 1}$.
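The closed form is easy to test numerically. The recurrence above suggests the convention that $q$ is the probability of stepping up and $p = 1 - q$ of stepping down; with that reading, the sketch below (illustrative values $M = 10$, $x = 3$, $q = 0.55$, not from the lecture) compares the formula with a Monte Carlo estimate.

```python
import random

def exits_at_M(x, M, q):
    """Biased gambler's ruin: step up with probability q, down with p = 1 - q."""
    while 0 < x < M:
        x += 1 if random.random() < q else -1
    return x == M

def G_M(x, M, q):
    """Closed form G_M(x) = ((p/q)^x - 1) / ((p/q)^M - 1)."""
    r = (1 - q) / q
    return (r**x - 1) / (r**M - 1)

M, x, q, trials = 10, 3, 0.55, 100_000
estimate = sum(exits_at_M(x, M, q) for _ in range(trials)) / trials
print(estimate, G_M(x, M, q))   # the two numbers should agree closely
```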

The Laplacian Operator: To every function $f : S \to \mathbb{R}$ we associate a new one,
$\Delta f(x) := \sum_{x' \in S} \bigl(f(x') - f(x)\bigr)\, p(x, x')$,
known as the Laplacian of $f$.

The Laplacian Operator: Then, for a given set $A \subset S$ and $y \in A$, we have
$\begin{cases} G_y(x) = \delta_{xy} & \text{if } x \in A, \\ \Delta G_y(x) = 0 & \text{if } x \in S \setminus A. \end{cases}$
Solving the above amounts to a linear algebra problem involving the transition matrix.

The Laplacian Operator: This means we can compute an exact formula for the exit distribution for a set $A$ without running a single simulation. The flip side is that, if one only wants to approximate the solution of the above problem, one can instead run lots of simulations and tally the exit points.
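For instance, the following sketch sets up the linear system from the previous slide for an arbitrary finite chain and solves it with NumPy; the gambler's-ruin matrix at the bottom is only an illustrative test case, not the lecture's.

```python
import numpy as np

def exit_distribution(P, A, y):
    """Solve the Dirichlet problem:
         G_y(x) = delta_{xy}                              for x in A,
         sum_{x'} (G_y(x') - G_y(x)) p(x, x') = 0         for x not in A.
    P: (n x n) transition matrix over states 0..n-1; A: exit set; y in A."""
    n = P.shape[0]
    system = np.eye(n)            # boundary rows stay as identity rows
    rhs = np.zeros(n)
    interior = [x for x in range(n) if x not in A]
    system[interior, :] = (P - np.eye(n))[interior, :]   # Laplacian rows = 0
    for x in A:
        rhs[x] = 1.0 if x == y else 0.0                  # G_y = delta_{xy} on A
    return np.linalg.solve(system, rhs)

# Illustrative test: fair gambler's ruin with M = 5, exits A = {0, 5}.
n = 6
P = np.zeros((n, n))
P[0, 0] = P[n - 1, n - 1] = 1.0
for x in range(1, n - 1):
    P[x, x - 1] = P[x, x + 1] = 0.5
print(exit_distribution(P, {0, n - 1}, n - 1))   # ~ [0, 0.2, 0.4, 0.6, 0.8, 1]
```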

Math 456: Mathematical Modeling Thursday, March 8th, 2018

Markov Chains: The Strong Markov Property (...and a bit more about the Laplacian) Thursday, March 8th, 2018

Last time
1. A bit of calculus on weighted graphs.
2. Definitions: gradient, divergence, and Laplacian.
3. The Laplacian as a measure of a function's smoothness.
4. Exit distributions. Example: Gambler's ruin.
5. Exit distributions and the associated graph Laplacian.

Today
1. The Laplacian and stationary distributions.
2. The Strong Markov Property.
3. Probability of return: properties and computations.

$\Delta$ vs. stationary distributions: Last time we defined the Laplacian on a weighted graph. For the graph associated to a Markov chain ($G = S$, weights given by transition probabilities) and a function $f$ on the state space, its Laplacian is given by
$\Delta f(x) = \sum_{y \in G} \bigl(f(y) - f(x)\bigr)\, p(x, y)$.

$\Delta$ vs. stationary distributions: Let $\pi$ be a stationary distribution for the chain, and let $f$ be a function solving $\Delta f = 0$. How do the two equations compare? We have
$\pi(x) = \sum_{y \in G} \pi(y)\, p(y, x)$ and $\sum_{y \in G} \bigl(f(y) - f(x)\bigr)\, p(x, y) = 0$.

$\Delta$ vs. stationary distributions: ...Rewriting the second one,
$\pi(x) = \sum_{y \in G} \pi(y)\, p(y, x)$ and $f(x) = \sum_{y \in G} f(y)\, p(x, y)$,
where we have used that $\sum_y p(x, y) = 1$ for every $x$. What is the difference? We have $p(x, y)$ in one and $p(y, x)$ in the other (and we are summing over $y$ in both cases).

$\Delta$ vs. stationary distributions: What does this look like in vector notation? Solving $\Delta f = 0$ in all of $G$ means $pf = f$, and, as we know, $\pi$ being stationary means $p^t \pi = \pi$. The two conditions are very similar, but they are not exactly the same in general! If $p^t = p$, they of course coincide! More generally, if $p$ is doubly stochastic, then they coincide.
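A quick numerical illustration of the difference (the matrices below are made up for the example, not from the lecture): constant functions always solve $pf = f$, while the stationary $\pi$ solves $p^t \pi = \pi$ and is uniform exactly when $p$ is doubly stochastic.

```python
import numpy as np

def stationary(P):
    """Left eigenvector of P for eigenvalue 1, normalized to sum to 1."""
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
    return pi / pi.sum()

# A stochastic (but not doubly stochastic) matrix: column sums differ from 1.
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.4, 0.0, 0.6]])
f = np.ones(3)
print(np.allclose(P @ f, f))   # True: constant f solves pf = f (Delta f = 0)
print(stationary(P))            # stationary pi, not uniform here

# A doubly stochastic matrix: rows AND columns sum to 1, so pi is uniform.
Q = np.array([[0.0, 0.5,  0.5],
              [0.5, 0.25, 0.25],
              [0.5, 0.25, 0.25]])
print(stationary(Q))            # ~ [1/3, 1/3, 1/3]: the two notions coincide
```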

Classification of states; the Strong Markov Property: Recall the Strong Markov Property from last time. Theorem (Strong Markov Property): Let $X_n$ denote a Markov chain. If $T$ is a stopping time and $Y_n := X_{T+n}$, then $Y_n$ is also a Markov chain. Moreover, the transition probability for $Y_n$ is the same as the one for $X_n$.

What does the theorem mean? It means that given states $\alpha_1, \ldots, \alpha_{n+1}$, we have
$P(Y_{n+1} = \alpha_{n+1} \mid Y_n = \alpha_n, \ldots, Y_1 = \alpha_1) = p(\alpha_n, \alpha_{n+1})$,
where $p(x, y)$ is the transition matrix for the original chain.

Equivalently, it means that
$P(Y_{n+1} = \alpha_{n+1},\ Y_n = \alpha_n, \ldots, Y_1 = \alpha_1)$
is equal to
$p(\alpha_n, \alpha_{n+1})\, p(\alpha_{n-1}, \alpha_n) \cdots p(\alpha_1, \alpha_2)\, P(Y_1 = \alpha_1)$.
(We shall skip the proof of this for now and revisit it in a couple of lectures.)
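The statement can also be checked empirically: stop a chain at the hitting time $T = T_y$ of some state $y$ and look at the distribution of $Y_1 = X_{T+1}$; it should match the row $p(y, \cdot)$ of the original transition matrix. The three-state chain below is an arbitrary illustration, not the lecture's example.

```python
import random
from collections import Counter

p = {0: {0: 0.5, 1: 0.5},
     1: {0: 0.3, 1: 0.2, 2: 0.5},
     2: {0: 0.7, 2: 0.3}}

def step(x):
    states, probs = zip(*p[x].items())
    return random.choices(states, weights=probs)[0]

y, runs, counts = 1, 50_000, Counter()
for _ in range(runs):
    x = 0
    while x != y:          # run until the stopping time T = T_y
        x = step(x)
    counts[step(x)] += 1   # record Y_1 = X_{T+1}

# Empirical law of X_{T+1} should be close to p(y, .) = {0: 0.3, 1: 0.2, 2: 0.5}
print({s: c / runs for s, c in sorted(counts.items())})
```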

Probability of return: Now, armed with the Strong Markov Property, let us put it to use to analyze various questions about the long-time behavior of a chain. Remember: we were trying to study the event that, starting from $x$, we eventually reach some other state $y$ at some (random) time $T_y$, as well as the probability that, starting from $x$, we eventually return to $x$ itself at some later (random) time $T_x$.

Probability of return: Recall the first hitting time
$T_y = \min\{n \geq 1 : X_n = y\}$,
with the convention that $T_y = \infty$ in the event $\{X_n \neq y \text{ for all } n\}$.

Probability of return: Also recall (with the same convention in case the respective event is empty)
$T_x^k := \min\{n > T_x^{k-1} : X_n = x\}$ and $T_A^k := \min\{n > T_A^{k-1} : X_n \in A\}$.
This is known as the time of the $k$-th return to $x$, or the $k$-th hitting time for $A$.

Probability of return: Evidently, each $T_x^k$ is a stopping time.

Probability of return: The probability of returning to $x$:
$\rho_{xx} := P_x[T_x < \infty]$.
The probability of visiting $y$ starting from $x$:
$\rho_{xy} := P_x[T_y < \infty]$.
Note: $\rho_{xy} > 0$ if and only if $x \to y$.

Probability of return: Using the Strong Markov Property, we can show the following. Lemma (see equation (1.4) in Durrett): $P_x[T_x^k < \infty] = \rho_{xx}^k$.

Probability of return: Note: in particular, if $\rho_{xx} < 1$, the probability of returning to the state $k$ times goes to zero exponentially fast as $k \to \infty$.

Probability of return: Proof. Observe that
$P_x[T_x^k < \infty] = P_x[T_x^k < \infty,\ T_x^{k-1} < \infty]$.
Therefore
$P_x[T_x^k < \infty] = P_x\bigl[X_{T_x^{k-1}+n} = x \text{ for some } n,\ T_x^{k-1} < \infty\bigr] = P_x\bigl[X_{T_x^{k-1}+n} = x \text{ for some } n \mid T_x^{k-1} < \infty\bigr]\, P_x[T_x^{k-1} < \infty]$.

Probability of return: Proof (continued). Let $Y_n := X_{T_x^{k-1}+n}$. By the Strong Markov Property,
$P_x\bigl[X_{T_x^{k-1}+n} = x \text{ for some } n \mid T_x^{k-1} < \infty\bigr] = P\bigl[Y_n = x \text{ for some } n \mid T_x^{k-1} < \infty\bigr] = P[X_n = x \text{ for some } n \mid X_0 = x] = P[T_x < \infty \mid X_0 = x] = P_x[T_x < \infty]$.
Repeating this argument, we obtain the desired formula.
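The lemma can be illustrated with a concrete transient example: a biased simple random walk on the integers (step up with probability 0.7), whose return probability is the classical $\rho_{00} = 2\min(p, q) = 0.6$. The sketch below estimates $P_0[T_0^k < \infty]$ by counting returns within a large but finite number of steps (a cutoff, so only an approximation of the event); the estimates should track $\rho_{00}^k$. All numerical choices here are illustrative.

```python
import random

def returns_to_zero(max_steps, up=0.7):
    """Count returns to 0 of a biased random walk within max_steps steps."""
    x, k = 0, 0
    for _ in range(max_steps):
        x += 1 if random.random() < up else -1
        if x == 0:
            k += 1
    return k

trials, max_steps = 20_000, 2_000
counts = [returns_to_zero(max_steps) for _ in range(trials)]
rho = 2 * (1 - 0.7)   # rho_00 = 0.6 for this walk
for k in (1, 2, 3):
    estimate = sum(c >= k for c in counts) / trials
    print(k, estimate, rho**k)   # P_0[T_0^k < infinity] vs rho^k
```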

Probability of return: Now, reviewing the definitions of transient and recurrent from last time, we agree to say that
1. a state $x$ will be called recurrent if $\rho_{xx} = 1$;
2. a state $x$ will be called transient if $\rho_{xx} < 1$.

Probability of return: The previous lemma and these definitions lead us to the following realizations (one trivial, two not so trivial).
1. If $x$ is transient, then the probability of returning to $x$ a total of $k$ times decreases geometrically with $k$.
2. If $x$ is recurrent and $k \in \mathbb{N}$, then with probability equal to 1 the chain will visit the state $x$ at least $k$ times.
3. Actually, if $x$ is recurrent, then with probability 1, starting from $x$, the chain will revisit $x$ infinitely many times.
This last point is particularly delicate, and the proof uses an important tool in probability known as the Borel-Cantelli lemma.