Random Walks: WEEK 2

1 Recurrence and transience

Consider the event $\{X_n = i \text{ for some } n > 0\}$, by which we mean $\{X_1 = i\}$ or $\{X_2 = i, X_1 \neq i\}$ or $\{X_3 = i, X_2 \neq i, X_1 \neq i\}$, etc.

Definition 1.1. A state $i \in S$ is recurrent if $P(X_n = i \text{ for some } n > 0 \mid X_0 = i) = 1$.

Definition 1.2. A state $i \in S$ is transient if $P(X_n = i \text{ for some } n > 0 \mid X_0 = i) < 1$.

Note that in the example in Lecture 1, the state Home is recurrent (and even absorbing), but all other states are transient.

Definition 1.3. Let $f_{ii}(n)$ be the probability that the first return time to $i$ is $n$ when we start at $i$:
\[
f_{ii}(n) = P(X_1 \neq i, \dots, X_{n-1} \neq i, X_n = i \mid X_0 = i), \qquad f_{ii}(0) = 0 \text{ by convention.}
\]
Then we define the probability of eventual return as
\[
f_{ii} = \sum_{n=1}^{+\infty} f_{ii}(n).
\]

Theorem 1.4. A rule to determine recurrent and transient states is:
(a) state $i \in S$ is recurrent iff $\sum_n p_{ii}(n) = +\infty$;
(b) state $i \in S$ is transient iff $\sum_n p_{ii}(n) < +\infty$.

Proof: We use generating functions. We define
\[
P_{ii}(s) = \sum_{n=0}^{+\infty} s^n p_{ii}(n), \qquad |s| < 1,
\]
and
\[
F_{ii}(s) = \sum_{n=0}^{+\infty} s^n f_{ii}(n), \qquad |s| < 1.
\]
One can show that, for $|s| < 1$,
\[
P_{ii}(s) = \frac{1}{1 - F_{ii}(s)}.
\]
This equation is enough to prove the theorem since, as $s \nearrow 1$, we have
\[
P_{ii}(s) \nearrow +\infty \iff f_{ii} = F_{ii}(s = 1) = 1.
\]
By Abel's theorem,
\[
\sum_{n=0}^{+\infty} p_{ii}(n) = \lim_{s \nearrow 1} P_{ii}(s).
\]
Thus
\[
\sum_{n=0}^{+\infty} p_{ii}(n) = +\infty \iff f_{ii} = 1.
\]
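Theorem 1.4 can be checked numerically for the simple random walk on $\mathbb{Z}$ (a small Python sketch, not part of the original notes): the walk steps $+1$ with probability $p$ and $-1$ with probability $q = 1 - p$, so $p_{00}(2n) = \binom{2n}{n}(pq)^n$ and $p_{00}(2n+1) = 0$; the series $\sum_n p_{00}(n)$ diverges exactly when $p = 1/2$.

# Partial sums of p_00(n) for the simple random walk on Z (illustration only).
def partial_sum_p00(p, n_terms):
    """Sum of p_00(2n) for n = 0..n_terms-1, using a stable term recursion."""
    q = 1.0 - p
    term = 1.0            # p_00(0) = 1
    total = 0.0
    for n in range(n_terms):
        total += term
        # C(2n+2, n+1) / C(2n, n) = (2n+1)(2n+2) / (n+1)^2
        term *= (2 * n + 1) * (2 * n + 2) / ((n + 1) ** 2) * p * q
    return total

for p in (0.5, 0.6):
    for n_terms in (10**2, 10**4, 10**6):
        print(f"p = {p}: sum of first {n_terms} terms ~ {partial_sum_p00(p, n_terms):.2f}")
# For p = 0.5 the partial sums keep growing (roughly like sqrt(n)), while for
# p = 0.6 they converge: recurrence vs. transience, as in Theorem 1.4.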

Theorem.5 (Abel s Theorem). If a i 0 for all i and G(s) = n 0 sn a n is finite for s <, then { + if lim G(s) = n 0 a n = + sր n 0 a n if n 0 a n < + Proof of formula P ii (s) = F : ii(s) Let the events A m = {X m = i}, B r = {X i,...,x r i,x r = i} r m. The B r are disjoint, thus: m P(A m X 0 = i) = P(A m B r X 0 = i) = = = p ii (m) = m P(A m B r,x 0 = i)p(b r X 0 = i) m P(A m X r = i)p(b r X 0 = i) m f ii (r)p ii (m r) Multiplying the last equation by s m with s < and sum over m we get: = m s m p ii (m) = m = m m s m f ii (r)p ii (m r) k+l=m = f ii (k)s k k l = P ii (s) = P ii (s)f ii (s) = P ii (s) = F ii (s) f ii (k)s k p ii (l)s l p ii (l)s l 2 Positive/null-recurrence Let T i = inf{n > 0 : X n = i} be the first return time of a random walk starting in X 0 = i. Note that P(T i < + X 0 = i) = P(T i = m X 0 = i) = n = i,x n i,...,x i X 0 = i) m n P(X = P(X n = i for some n X 0 = i) = f ii Recall also the notation (from Lecture 2) for the probability that the first return time to i is n f ii (n) = P(X n = i,x n i,...,x i X 0 = i) = P(T i = n X 0 = i) For a recurrent state, = f ii = n f ii(n) = P(T i < + X 0 = i) and for a transient state, f ii <. So for a transient state, the random variable T i must have some mass at T i = + : P(T i = + X 0 = i) > 0. We define the following: Definition 2.. The mean recurrence time is given by µ i E(T i X 0 = i) = n np(t i = n X 0 = i) = n nf ii (n). 2

Can we compute the mean recurrence time $\mu_i$? To that end, let us distinguish two cases:

Transient states: We know that for transient states the event $T_i = +\infty$ occurs with some strictly positive probability, $P(T_i = +\infty \mid X_0 = i) > 0$. We therefore have $\mu_i = E(T_i \mid X_0 = i) = +\infty$.

Recurrent states: By definition, $\sum_m f_{ii}(m) = 1$ for recurrent states. But $\sum_n n\, f_{ii}(n)$ is not necessarily finite. So even though you know for sure that the chain will return to state $i$ at some finite time, the average recurrence time may be infinite.

Based on this, we say that any recurrent state is one of the following:

Positive-recurrent if $\mu_i < +\infty$.
Null-recurrent if $\mu_i = +\infty$.

3 Irreducible chains

We say that $i \to j$ if there exists $n \geq 0$ such that $p_{i \to j}(n) > 0$. In words, we say that $i$ communicates with $j$, or that $j$ is accessible from $i$. We say that $i \leftrightarrow j$ if $i \to j$ and $j \to i$. In words, $i$ and $j$ intercommunicate. Note that this means there exists $n \geq 0$ such that $p_{i \to j}(n) > 0$ and $m \geq 0$ such that $p_{j \to i}(m) > 0$ (but it is not necessary that $n = m$).

The relation $\leftrightarrow$ is reflexive, symmetric and transitive, meaning:

Reflexivity: $i \leftrightarrow i$, because we always have that at least $p_{ii}(0) = 1$ holds.

Symmetry: If $i \leftrightarrow j$ holds then, of course, $j \leftrightarrow i$ also holds.

Transitivity: If $i \leftrightarrow k$ and $k \leftrightarrow j$, then $i \leftrightarrow j$. This statement follows from the Chapman-Kolmogorov equation:
\[
p_{i \to j}(n + m) = \sum_{l \in S} p_{i \to l}(n)\, p_{l \to j}(m) \geq p_{i \to k}(n)\, p_{k \to j}(m).
\]
One can always pick $n$ and $m$ such that the right-hand side is strictly positive, and hence an $n + m$ such that $p_{i \to j}(n + m) > 0$. The same argument shows that there exist $n', m'$ such that $p_{j \to i}(n' + m') > 0$. Thus $i$ and $j$ intercommunicate.

Since $\leftrightarrow$ is an equivalence relation, we can partition the state space $S$ into equivalence classes. For example, the music festival example of Lecture 1 had two such classes: $\{\text{Home}\}$ and $\{\text{Bar}, \text{Dancing}, \text{Concert}\}$. Note that Home is here trivially a positive-recurrent state (in fact it is absorbing: once you are at home, you return to home at every time step with probability one) and that Bar, Dancing and Concert are transient states (fortunately or unfortunately?).

Having stated these notions, we are ready to define irreducible chains as follows:

Definition 3.1. A Markov chain is said to be irreducible if it has only one equivalence class, i.e., for all $i, j \in S$ there exist $n, m$ such that $p_{i \to j}(n)\, p_{j \to i}(m) > 0$.

In other words, in an irreducible Markov chain every state is accessible from every state.
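For a finite chain, the equivalence classes (and hence irreducibility) can be computed from the zero pattern of $P$ alone. A minimal Python sketch, with a hypothetical 4-state matrix used purely for illustration:

def reachable(P, i):
    """Set of states j with i -> j (reachable from i in >= 0 steps)."""
    n = len(P)
    seen, stack = {i}, [i]
    while stack:
        u = stack.pop()
        for v in range(n):
            if P[u][v] > 0 and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def communicating_classes(P):
    """Partition the states into equivalence classes of the relation <->."""
    n = len(P)
    reach = [reachable(P, i) for i in range(n)]
    classes, assigned = [], set()
    for i in range(n):
        if i not in assigned:
            cls = [j for j in range(n) if j in reach[i] and i in reach[j]]
            assigned.update(cls)
            classes.append(cls)
    return classes

# Hypothetical example: from {0, 1} the chain can leak into {2, 3} but never
# come back, so there are two classes and the chain is not irreducible.
P = [[0.5, 0.5, 0.0, 0.0],
     [0.9, 0.0, 0.1, 0.0],
     [0.0, 0.0, 0.3, 0.7],
     [0.0, 0.0, 0.6, 0.4]]
classes = communicating_classes(P)
print(classes)                            # [[0, 1], [2, 3]]
print("irreducible:", len(classes) == 1)  # False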

Proposition 3.2. Within an equivalence class of a Markov chain, or for an irreducible Markov chain, it holds that:

1. All states $i$ have the same period.
2. All states $i$ are recurrent or all states are transient.
3. If all states $i$ are recurrent, then either they are all null-recurrent or they are all positive-recurrent.

Proof of point 2. Take two states in the same equivalence class, $i \leftrightarrow j$. Then, from the Chapman-Kolmogorov equation, we deduce the inequality
\[
p_{ii}(n + t + m) \;\geq\; \underbrace{p_{i \to j}(n)}_{>0}\; p_{jj}(t)\; \underbrace{p_{j \to i}(m)}_{>0} \;=\; \underbrace{\alpha}_{>0}\; p_{jj}(t),
\]
where $\alpha := p_{i \to j}(n)\, p_{j \to i}(m)$ and $n, m$ are chosen such that $\alpha > 0$ (possible because $i \leftrightarrow j$). If $i$ is transient, then $\sum_t p_{ii}(t) < +\infty$ (the criterion of Theorem 1.4), and thus $\sum_t p_{jj}(t) \leq \alpha^{-1} \sum_t p_{ii}(n + t + m) < +\infty$, so $j$ is also transient. To complete the proof, we note that the roles of $i$ and $j$ can be interchanged. This way we also get that if $j$ is transient, then $i$ is transient. The proof of point 1 is similar. The proof of point 3 requires more tools than we have at this point.

Lemma 3.3. If a Markov chain has a finite state space $S$ and is irreducible, then all its states are (positive-)recurrent.

Proof. We have the following property: $\lim_{n \to \infty} \sum_{j \in S} p_{i \to j}(n) = 1$. But our state space is finite, so we can interchange the limit and the sum:
\[
\sum_{j \in S} \lim_{n \to +\infty} p_{i \to j}(n) = 1. \tag{1}
\]
We continue by contradiction. Assume that all $j \in S$ are transient; then $p_{jj}(n) \to 0$ as $n \to \infty$. Even stronger, we proved in Homework 2 that $p_{ij}(n) \to 0$ as well. This contradicts (1):
\[
\sum_{j \in S} \underbrace{\lim_{n \to +\infty} p_{i \to j}(n)}_{=\,0} = 0 \neq 1.
\]
So not all states can be transient; since the chain is irreducible, Proposition 3.2 then implies that all $j \in S$ must be recurrent. The proof that all states are in fact positive-recurrent requires more tools that we don't yet have.

4 Stationary distribution

Definition 4.1. A distribution $\pi$ is called stationary if it satisfies the equation $\pi = \pi P$.

It follows immediately that any stationary distribution also satisfies $\pi = \pi P^k$ for any $k \geq 0$. In particular, if we initialize a chain in the stationary distribution, $\pi^{(0)} = \pi$, then at any time $n$, $\pi^{(n)} = \pi$ (and this is why $\pi$ is called stationary).

Discussion: For systems with a finite state space, one can show that the finite stochastic matrix $P$ always has a unit eigenvalue and an associated left eigenvector with non-negative components $\pi_i \geq 0$. (It is clear that the vector with all ones is a right eigenvector with eigenvalue $1$; but the existence of a left eigenvector with non-negative components for the same eigenvalue is not obvious.) However, it may not be unique (as will be clear from the discussion below). Uniqueness requires more conditions on $P$. For example, if $P$ has all its elements strictly positive, or if there exists $N$ such that $P^N$ has all its elements strictly positive, then the standard forms of the Perron-Frobenius theorem imply unicity. However, this is not the most general condition. The theory of Markov chains on finite state spaces can be developed through the Perron-Frobenius theorems (at various levels of sophistication), but this is not the route we take in this class, because we are also interested in infinite (countable) state spaces.
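Here is a small numpy sketch of the eigenvector computation just described (an illustration, not part of the notes): a stationary distribution of a finite chain is obtained as a normalized left eigenvector of $P$ for the eigenvalue $1$, i.e., a right eigenvector of $P^\top$.

import numpy as np

def stationary_distribution(P, tol=1e-8):
    """Return a stationary distribution of the stochastic matrix P.

    We look for a left eigenvector of P with eigenvalue 1 (a right
    eigenvector of P.T) and normalize it to sum to one.
    """
    P = np.asarray(P, dtype=float)
    eigvals, eigvecs = np.linalg.eig(P.T)
    k = np.argmin(np.abs(eigvals - 1.0))   # eigenvalue numerically closest to 1
    v = np.real(eigvecs[:, k])
    pi = v / v.sum()
    assert np.allclose(pi @ P, pi, atol=tol)   # sanity check: pi = pi P
    return pi

# Hypothetical 3-state chain, for illustration only.
P = [[0.1, 0.6, 0.3],
     [0.4, 0.4, 0.2],
     [0.5, 0.0, 0.5]]
print(stationary_distribution(P))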

An important theorem is the following.

Theorem 4.2 (existence and uniqueness of a stationary distribution). Consider an irreducible Markov chain. It has a stationary distribution if and only if the chain is positive-recurrent. Moreover, this distribution is unique and takes on the value
\[
\pi_i = \frac{1}{\mu_i} = \frac{1}{E(T_i \mid X_0 = i)}.
\]

Remark 4.3. Note that $\mu_i$ is finite because we assume the chain is positive-recurrent.

Remark 4.4. Take an irreducible Markov chain with a finite state space. Then by Lemma 3.3, we know it must be positive-recurrent. Thus an irreducible Markov chain on a finite state space has a unique stationary distribution $\pi_i = 1/\mu_i$.

Remark 4.5. This theorem is very powerful. Indeed, suppose you have a chain and you know that it is irreducible. With an infinite state space, it might be difficult to prove directly that it is positive-recurrent, but it might be easier to compute the stationary distribution. Then you can immediately conclude that it is necessarily positive-recurrent.

The proof of the theorem is not easy and we do not go through it here. We rather try to motivate the theorem through the following discussion.

Example 4.6. Consider the following random walk on a circle, a finite state space:

[Figure: a random walk on a cycle of $|S|$ states, moving one step clockwise with probability $p$ and one step counterclockwise with probability $q = 1 - p$.]

The stochastic matrix $P$ of the walk is of size $|S| \times |S|$ and looks as follows:
\[
P =
\begin{pmatrix}
0 & p & 0 & \cdots & 0 & q \\
q & 0 & p & \cdots & 0 & 0 \\
0 & q & 0 & p & \cdots & 0 \\
\vdots & & \ddots & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & q & 0 & p \\
p & 0 & \cdots & 0 & q & 0
\end{pmatrix}.
\]
One can easily verify that $\pi^*_i = \frac{1}{|S|}$ is the stationary distribution. This example suggests that on $\mathbb{Z}^d$ the random walk has no stationary distribution, because $|S| \to \infty$ and thus $\pi^*_i = \frac{1}{|S|} \to 0$ would not yield a valid probability distribution. This is true. Indeed, the random walk in $\mathbb{Z}^d$ is irreducible and null-recurrent for $d = 1, 2$ and transient for $d \geq 3$, so by the above theorem, it cannot have a stationary distribution.

When one verifies that $\pi^*_i = \frac{1}{|S|}$ is the stationary distribution in the example above, one sees that this works because the columns sum to one (recall that for a stochastic matrix the rows always sum to one). This motivates the following definition.

Definition 4.7. A doubly stochastic matrix is a matrix $P = [p_{ij}]$ with $p_{ij} \geq 0$ of which all the rows and all the columns sum to one.
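To make Example 4.6 and Definition 4.7 concrete, here is a short numpy check (an illustration, not from the notes) that the circle-walk matrix is doubly stochastic and that the uniform distribution is stationary.

import numpy as np

def circle_walk_matrix(n, p):
    """Transition matrix of the random walk on a circle with n states:
    one step clockwise with probability p, counterclockwise with 1 - p."""
    q = 1.0 - p
    P = np.zeros((n, n))
    for i in range(n):
        P[i, (i + 1) % n] = p
        P[i, (i - 1) % n] = q
    return P

n, p = 7, 0.3
P = circle_walk_matrix(n, p)
pi_uniform = np.full(n, 1.0 / n)
print(np.allclose(P.sum(axis=0), 1.0))          # columns sum to one: doubly stochastic
print(np.allclose(pi_uniform @ P, pi_uniform))  # uniform distribution is stationary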

One can easily see that the mechanism of the example above generalizes to any chain on a finite state space with a doubly stochastic matrix: in this case, $\pi^*_i = \frac{1}{|S|}$ is a stationary distribution because the columns sum to one.

Now what about the unicity of the stationary distribution? The following simple situation suggests that we don't have unicity for reducible chains.

Example 4.8. Consider two separate circles:

[Figure: two disjoint cycles, one on the state set $S_1$ and one on $S_2$, with no transitions between them.]

The state space is the union of two disconnected finite state spaces, $S = S_1 \cup S_2$. This Markov chain is not irreducible! Its transition matrix has the following block structure:
\[
P = \begin{pmatrix} P_1 & 0 \\ 0 & P_2 \end{pmatrix}. \tag{2}
\]
Consequently, $\pi = \pi P$ has many solutions. Examples are all distributions of the form
\[
\pi = \Big( \frac{\alpha}{|S_1|}, \dots, \frac{\alpha}{|S_1|},\ \frac{\beta}{|S_2|}, \dots, \frac{\beta}{|S_2|} \Big), \tag{3}
\]
where $\alpha + \beta = 1$ and $\alpha, \beta \geq 0$. The first $|S_1|$ components correspond to the first circle, the last $|S_2|$ to the second. The uniform stationary distribution corresponds to
\[
\alpha = \frac{|S_1|}{|S_1| + |S_2|} \quad \text{and} \quad \beta = \frac{|S_2|}{|S_1| + |S_2|}.
\]
Also note that the extreme cases, such as $\{\alpha = 0, \beta = 1\}$ and $\{\alpha = 1, \beta = 0\}$, are also perfectly valid stationary distributions. The general stationary distributions are convex combinations of the two extremal ones.
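Finally, the non-uniqueness in Example 4.8 is easy to verify numerically. A small Python sketch with two made-up blocks (illustration only): every choice of $\alpha \in [0, 1]$ in (3) gives a stationary distribution of the block-diagonal matrix in (2).

import numpy as np

# Hypothetical reducible chain: two blocks that never communicate.
P1 = np.array([[0.0, 1.0, 0.0],
               [0.0, 0.0, 1.0],
               [1.0, 0.0, 0.0]])          # deterministic 3-cycle
P2 = np.array([[0.5, 0.5],
               [0.5, 0.5]])               # lazy 2-state block
P = np.block([[P1, np.zeros((3, 2))],
              [np.zeros((2, 3)), P2]])

# pi as in (3): uniform within each block, with weights alpha and beta = 1 - alpha.
for alpha in (0.0, 0.25, 1.0):
    beta = 1.0 - alpha
    pi = np.concatenate([np.full(3, alpha / 3), np.full(2, beta / 2)])
    print(alpha, np.allclose(pi @ P, pi))  # True for every alpha in [0, 1]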