ISM206 Lecture, May 12, 2005
Markov Chains
Instructor: Kevin Ross
Scribe: Pritam Roy
May 26, 2005

1 Outline of topics for the 10 AM lecture

The topics are:
- Discrete Time Markov Chains
- Examples
- Chapman-Kolmogorov Equations
- Types of states
- Long Run Behavior
- Expected/Average cost
- Recurrence Times
- Absorbing states and Random Walks

2 Introduction

We are transitioning into problems involving uncertainty, such as sensitivity analysis, decision analysis, and queueing dynamics, and into solving optimization problems that deal with Markov chains.

3 Discrete Time Markov Chains

In this model time is discrete, i.e. t = 0, 1, 2, ....

States: the current status of the system may be one of (M+1) mutually exclusive
categories, called states. X_t represents the state of the system at time t, so its possible values are 0, 1, 2, ..., M. We are interested in how states evolve, e.g. an inventory level, an interest rate, or the number of waiting tasks in a queue.

3.1 Key Properties of Markov Chains

The Markovian property says that the conditional probability of any future event, given any past events and the present state X_t = i, is independent of the past events and depends only upon the present state:

    P[X_{t+1} = j | X_0 = k_0, X_1 = k_1, ..., X_{t-1} = k_{t-1}, X_t = i] = P[X_{t+1} = j | X_t = i].    (1)

A stochastic process that satisfies the Markovian property is called a Markov chain. For example, the amount of stock left in a shop depends only on what was there the previous day, not on every earlier day.

The conditional probabilities p_ij = P[X_{t+1} = j | X_t = i] for a Markov chain are called (one-step) transition probabilities. Similarly, the n-step transition probabilities are p_ij^(n) = P[X_{t+n} = j | X_t = i]. If for each i and j

    P[X_{t+n} = j | X_t = i] = P[X_n = j | X_0 = i],    (2)

then the transition probabilities are said to be stationary. Because the p_ij^(n) are conditional probabilities, they must be non-negative, and since the process must make a transition into some state, they must satisfy the properties

    p_ij^(n) >= 0, for all i and j; n = 0, 1, 2, ...,    (3)

    sum_{j=0}^{M} p_ij^(n) = 1, for all i; n = 0, 1, 2, ....    (4)

A convenient way of showing all the n-step transition probabilities p_ij^(n) is the matrix form

    State   0          1          ...   M
    0       p_00^(n)   p_01^(n)   ...   p_0M^(n)
    1       p_10^(n)   p_11^(n)   ...   p_1M^(n)
    ...     ...        ...        ...   ...
    M       p_M0^(n)   p_M1^(n)   ...   p_MM^(n)

for n = 0, 1, 2, ....
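As a quick sketch (not from the lecture), properties (3) and (4) can be checked mechanically for any candidate transition matrix. The matrix below is an illustrative two-state example; its numbers are hypothetical, not from the notes.

```python
def is_stochastic(P, tol=1e-9):
    """Check properties (3) and (4): all entries are non-negative
    and every row sums to 1."""
    return all(
        all(p >= 0 for p in row) and abs(sum(row) - 1.0) < tol
        for row in P
    )

# Illustrative two-state transition matrix (hypothetical numbers).
P = [[0.9, 0.1],
     [0.5, 0.5]]

print(is_stochastic(P))                  # True
print(is_stochastic([[0.7, 0.2],         # first row sums to 0.9,
                     [0.5, 0.5]]))       # so this one fails: False
```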
3.2 Formulating the Weather Example as a Markov Chain

The weather in Canterville from day to day is formulated as

    X_t = 0 if day t is dry,
        = 1 if day t is wet.

P[tomorrow is dry | today is dry] = P[X_{t+1} = 0 | X_t = 0] = p_00 = 0.8
P[tomorrow is dry | today is wet] = P[X_{t+1} = 0 | X_t = 1] = p_10 = 0.6

Furthermore, p_00 + p_01 = 1, so p_01 = 1 - 0.8 = 0.2, and p_10 + p_11 = 1, so p_11 = 1 - 0.6 = 0.4. The transition matrix is

    P =
        State   0     1
        0       0.8   0.2
        1       0.6   0.4

3.3 Gambling Problem

A gambler bets 1 dollar on every round of a game. The probability of winning is p and the probability of losing is (1-p). He plays until he either holds 3 dollars or goes broke. For example, if he has 1 dollar initially, then the probability of holding 2 dollars after the first round (winning) is p, and the probability of holding 0 dollars (losing) is (1-p). So the one-step transition matrix is

    P =
        State   0     1     2     3
        0       1     0     0     0
        1       1-p   0     p     0
        2       0     1-p   0     p
        3       0     0     0     1

Let X_n denote the row vector of state probabilities after n rounds. If the gambler starts with 1 dollar, X_0 = [0 1 0 0], and

    X_1 = X_0 P = [1-p  0  p  0]
    X_2 = X_1 P = X_0 P^2
    ...
    X_n = X_0 P^n
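The iteration X_n = X_0 P^n above can be sketched in a few lines. The win probability p = 0.4 is an illustrative choice, since the notes leave p symbolic.

```python
def vec_mat(x, P):
    """Row vector times matrix: (xP)_j = sum_i x_i * P[i][j]."""
    return [sum(x[i] * P[i][j] for i in range(len(x)))
            for j in range(len(P[0]))]

p = 0.4                          # hypothetical win probability
P = [[1.0,   0.0, 0.0, 0.0],     # state 0 (broke) is absorbing
     [1 - p, 0.0, p,   0.0],
     [0.0, 1 - p, 0.0, p],
     [0.0,   0.0, 0.0, 1.0]]     # state 3 (goal reached) is absorbing

x = [0.0, 1.0, 0.0, 0.0]         # X_0: start with 1 dollar
for _ in range(50):              # X_50 = X_0 P^50
    x = vec_mat(x, P)

# By now nearly all probability mass sits in the absorbing states 0 and 3,
# and the distribution has essentially stopped changing (X_{n+1} = X_n).
print([round(v, 4) for v in x])
```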
Eventually the iteration reaches a fixed point, and from then on X_{n+1} = X_n.

Inference: by the memoryless property, the evolution from time n onward depends only on X_n, not on how the process arrived there.

4 Chapman-Kolmogorov Equations

    p_ij^(n) = sum_{k=0}^{M} p_ik^(m) p_kj^(n-m),    (5)

where i = 0, 1, ..., M; j = 0, 1, ..., M; and any m = 1, 2, ..., n-1, with n = m+1, m+2, .... For n = 2 the expression becomes

    p_ij^(2) = sum_{k=0}^{M} p_ik p_kj    (6)

for all states i and j, where the p_ij^(2) are the elements of a matrix P^(2). These elements are obtained by multiplying the matrix of one-step transition probabilities by itself; i.e.,

    P^(2) = P . P = P^2.    (7)

In general, the n-step transition probabilities can be obtained by computing the n-th power of the one-step transition matrix, i.e. P^(n) = P . P^(n-1) = P^n. In the weather example,

    P^2 = [0.8  0.2] [0.8  0.2] = [0.76  0.24]
          [0.6  0.4] [0.6  0.4]   [0.72  0.28]

The unconditional state probabilities follow from the initial distribution:

    P[X_n = j] = sum_{i_0=0}^{M} P[X_0 = i_0] P[X_n = j | X_0 = i_0],    (8)

i.e., the row vector of unconditional probabilities at time n is P_0 P^n, where P_0 is the vector of probabilities of the initial state i_0.

5 Classification of States of a Markov Chain

Absorbing: A state i is said to be an absorbing state if, upon reaching this state, the process never leaves it again. State i is absorbing if and only if p_ii = 1.

Communicating: If state j is accessible from state i and state i is accessible from state j, then states i and j are said to communicate.

Accessible: A state j is said to be accessible from state i if p_ij^(n) > 0 for some n >= 0, i.e. one can get to j from i in some number of steps.
Transient: A state is said to be a transient state if, upon entering this state, the process may never return to it. State i is transient if there exists a state j (j != i) that is accessible from state i, but not vice-versa.

Recurrent: A state is said to be a recurrent state if, upon entering this state, the process definitely will return to it.

Periodic: A state is said to be periodic if, upon entering this state, the process can return to it only in a fixed number of steps (or multiples of that number).

Irreducible Markov Chain: If all states communicate, the Markov chain cannot be simplified and is said to be irreducible.

Ergodic: In a finite-state Markov chain, a recurrent state that is aperiodic is called ergodic.

6 Long Run Behavior

Steady-state probabilities: for large enough n, all rows of P^n are the same, i.e. the probability of being in each state is independent of the original state. They are called steady-state probabilities since they do not change. For any irreducible ergodic Markov chain, lim_{n->inf} p_ij^(n) exists and is independent of i. Furthermore,

    lim_{n->inf} p_ij^(n) = π_j > 0,    (9)

where the π_j uniquely satisfy the following steady-state equations:

    π_j = sum_{i=0}^{M} π_i p_ij,  for j = 0, 1, ..., M,

    sum_{j=0}^{M} π_j = 1.

Note: steady-state probabilities are NOT the same as stationary transition probabilities. In the weather example,

    π_0 = π_0 p_00 + π_1 p_10
    π_1 = π_0 p_01 + π_1 p_11
    π_0 + π_1 = 1
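As a numerical cross-check (not from the lecture), the steady-state vector can be approximated by power iteration: repeatedly multiplying a probability distribution by P until it stops changing. The sketch below uses the weather matrix from Section 3.2.

```python
P = [[0.8, 0.2],
     [0.6, 0.4]]              # weather chain from Section 3.2

pi = [1.0, 0.0]               # any starting distribution works
for _ in range(100):          # iterate pi <- pi P until it converges
    pi = [sum(pi[i] * P[i][j] for i in range(2)) for j in range(2)]

print([round(v, 4) for v in pi])   # [0.75, 0.25]
```

The iterate converges geometrically (the second eigenvalue of this P is 0.2), so 100 steps are far more than enough.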
We have 3 equations and 2 unknown variables; however, adding the first two equations gives an identity, so one of them is redundant and the system has a unique solution. Solving, we obtain π_0 = 0.75 and π_1 = 0.25.

Note the important results concerning steady-state probabilities:

- If i and j are recurrent states belonging to different classes, then p_ij^(n) = 0 for all n.
- If j is a transient state, then lim_{n->inf} p_ij^(n) = 0 for all i.

We can use long-run behavior to obtain the likelihood of states and to calculate expected/average costs.

7 Expected Average Cost per Unit Time

If the requirement that the states be aperiodic is relaxed, then the limit lim_{n->inf} p_ij^(n) may not exist. To illustrate this point, consider the two-state transition matrix

    P =
        State   0   1
        0       0   1
        1       1   0

If the process starts in state 0 at time 0, it will be in state 0 at times 2, 4, 6, ... and in state 1 at times 1, 3, 5, .... Thus p_00^(n) = 1 if n is even and p_00^(n) = 0 if n is odd, so lim_{n->inf} p_00^(n) does not exist. However, the following limit always exists for an irreducible (finite-state) Markov chain:

    lim_{n->inf} (1/n) sum_{k=1}^{n} p_ij^(k) = π_j,    (10)

where the π_j satisfy the steady-state equations given in the previous section. This result is important in computing the long-run average cost per unit time associated with a Markov chain. Suppose that a cost (or other penalty function) C(X_t) is incurred when the process is in state X_t at time t, for t = 0, 1, 2, .... Note: C(X_t) is a random variable that takes on any one of the values C(0), C(1), ..., C(M), and the function C(.) is independent of t.
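A small sketch of the point above: for the periodic two-state chain, p_00^(n) alternates between 0 and 1 and has no limit, but its running (Cesaro) average converges to π_0 = 1/2, as equation (10) predicts.

```python
def mat_mul(A, B):
    """Multiply two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P = [[0.0, 1.0],
     [1.0, 0.0]]             # periodic chain: the state flips every step

Pk = P                        # P^k, starting at k = 1
total = 0.0
for k in range(1, 1001):
    total += Pk[0][0]         # p_00^(k) alternates: 0, 1, 0, 1, ...
    Pk = mat_mul(Pk, P)

print(total / 1000)           # Cesaro average: prints 0.5
```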
The expected average cost incurred over the first n periods is given by

    E[(1/n) sum_{t=1}^{n} C(X_t)].

By using the result that lim_{n->inf} (1/n) sum_{k=1}^{n} p_ij^(k) = π_j, it can be shown that the (long-run) expected average cost per unit time is given by

    lim_{n->inf} E[(1/n) sum_{t=1}^{n} C(X_t)] = sum_{j=0}^{M} π_j C(j).

For further reference, please see the worked-out example in the text.

8 Expected Average Cost per Unit Time for Complex Cost Functions

In the previous section the cost function was based solely on the state the process is in at time t. In many important problems encountered in practice, the cost may also depend on some other random variables, which must be i.i.d.; e.g. a stock holding cost may depend on the number of workers or the interest rate, and X_t is independent of these other values. Suppose the costs to be considered are the ordering cost and the penalty cost for unsatisfied demand. Then the total cost for week t is a function of X_{t-1} and D_t, that is, C(X_{t-1}, D_t). Under the assumptions of the example, it can be shown that the (long-run) expected average cost per unit time is given by

    lim_{n->inf} E[(1/n) sum_{t=1}^{n} C(X_{t-1}, D_t)] = sum_{j=0}^{M} k(j) π_j,    (11)

where k(j) = E[C(j, D_t)],
where the expectation is taken with respect to the probability distribution of the random variable D_t, given the state j. Similarly, the long-run actual average cost per unit time is given by

    lim_{n->inf} (1/n) sum_{t=1}^{n} C(X_{t-1}, D_t) = sum_{j=0}^{M} k(j) π_j.    (12)

9 Recurrence Times

We often want to know how long we expect to take to reach state j from state i, and how often we return to i. Let f_ij^(n) be the probability that the first passage time from i to j equals n. Then

    f_ij^(1) = p_ij
    f_ij^(2) = sum_{k != j} p_ik f_kj^(1)
    ...
    f_ij^(n) = sum_{k != j} p_ik f_kj^(n-1).

In general, the n-step transition probabilities are tedious to calculate, but first passage times can be computed from the recursion above. The expected first passage time is

    μ_ij = inf,                          if sum_{n=1}^{inf} f_ij^(n) < 1,
         = sum_{n=1}^{inf} n f_ij^(n),   otherwise.

Whenever sum_{n=1}^{inf} f_ij^(n) = 1, μ_ij uniquely satisfies

    μ_ij = 1 + sum_{k != j} p_ik μ_kj.

The expected recurrence time for well-behaved systems is

    μ_ii = 1/π_i.    (13)

10 Absorbing States

Recall that a state k is called an absorbing state if p_kk = 1, so that once the chain visits k it remains there forever. If k is an absorbing state and the process starts in state i, the probability of ever going to state k is called the probability of absorption into state k, given that the system started in state i. This probability is denoted by f_ik, and it satisfies

    f_ik = sum_{j=0}^{M} p_ij f_jk,    (14)
for i = 0, 1, ..., M, subject to the conditions f_kk = 1, and f_ik = 0 if state i is recurrent and i != k.

10.1 Random Walks

Absorption probabilities are important in random walks. A random walk is a Markov chain with the property that, if the system is in state i, then in a single transition the system either remains at i or moves to one of the two states immediately adjacent to i. For example, a random walk is often used as a model for situations involving gambling. To illustrate, consider a gambling example with two players, A and B, each having 2 dollars. They agree to keep playing, betting 1 dollar at a time, until one player is broke. The probability of A winning a single bet is 1/3, so B wins a bet with probability 2/3. The number of dollars player A has before each bet (0, 1, 2, 3, or 4) provides the states of a Markov chain with transition matrix

    P =
        State   0     1     2     3     4
        0       1     0     0     0     0
        1       2/3   0     1/3   0     0
        2       0     2/3   0     1/3   0
        3       0     0     2/3   0     1/3
        4       0     0     0     0     1

Let f_ik be the probability of absorption at state k given start state i. We can check that

    f_ik = sum_{j=0}^{M} p_ij f_jk,    (15)

for i = 0, 1, ..., M, is satisfied with the conditions f_kk = 1, and f_ik = 0 if state i is recurrent and i != k.
For our gambling example,

    f_00 = 1
    f_10 = (2/3) f_00 + (1/3) f_20
    f_20 = (2/3) f_10 + (1/3) f_30
    f_30 = (2/3) f_20 + (1/3) f_40
    f_40 = 0.

After solving this set of equations, we obtain the probability of A losing, f_20 = 4/5, and the probability of A winning, f_24 = 1/5.

Note: the starting state has an effect on long-term behavior. For example, if A starts with 3 dollars, the probability of A losing is different.
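The absorption-probability system above can be double-checked with exact arithmetic. This sketch uses Python's fractions module, with p = 1/3 the probability that A wins a single bet, and solves the three unknowns by substitution.

```python
from fractions import Fraction as F

p, q = F(1, 3), F(2, 3)   # P(A wins a bet), P(A loses a bet)

# With the boundary conditions f_00 = 1 and f_40 = 0:
#   f_10 = q + p*f_20
#   f_30 = q*f_20
# Substituting both into f_20 = q*f_10 + p*f_30 gives
#   f_20 = q*(q + p*f_20) + p*q*f_20  =>  f_20*(1 - 2*p*q) = q*q.
f_20 = q * q / (1 - 2 * p * q)
f_10 = q + p * f_20
f_30 = q * f_20

print(f_20)      # 4/5: probability A (starting with 2 dollars) goes broke
print(1 - f_20)  # 1/5: probability A wins all 4 dollars
```

The exact fractions agree with the hand solution, and f_10 = 14/15 shows how much worse A's odds are when starting with only 1 dollar.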