The Mabinogion Sheep Problem

1 The Mabinogion Sheep Problem
Kun Dong, Cornell University
April 22, 2015

2 Introduction
(Williams 1991) We are given a herd of black and white sheep at time t = 0. At each time t = 1, 2, ..., a sheep is selected uniformly at random, and a sheep of the opposite color (if any remain) changes to the color of the chosen sheep. Just after t = 0 and after each transition, we may remove any number of white sheep (if any remain) from the herd. The goal is to maximize the expected final number of black sheep.

3 A naive lower bound
Definition. A policy is a function $a : (\mathbb{Z}_+)^3 \to \mathbb{Z}_+$ for which $a(t, \#\text{white}, \#\text{black})$ is the number of white sheep removed at time t.
Suppose we start with $w_0$ white sheep and $b_0$ black sheep. One possible policy is to remove all white sheep immediately at t = 0, which freezes the herd at $b_0$ black sheep. Hence the value of the optimal policy is at least $b_0$, the starting number of black sheep.

4–5 Decision Process
Let $(W_t, B_t)$ be the numbers of white and black sheep at time t, with the natural filtration $\mathcal{F}_t = \sigma((W_s, B_s)_{s \le t})$. Under a policy $a$,
$$P\big((W_{t+1}, B_{t+1}) = (w, b) \mid \mathcal{F}_t\big) = p\big((W_t, B_t), a(t), (w, b)\big).$$
What if $a \equiv 0$? The total number of sheep stays constant at $b_0 + w_0$, and $B_t$ is a Markov chain with
$$p(b, b+1) = \frac{b}{b_0 + w_0}, \qquad p(b, b-1) = 1 - \frac{b}{b_0 + w_0}$$
for $0 < b < b_0 + w_0$.
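
A quick simulation makes the do-nothing dynamics concrete. This is a minimal sketch; the function name simulate_do_nothing and the parameter choices are mine, not from the slides:

```python
import random

def simulate_do_nothing(w0, b0, rng):
    """Run the sheep chain with no removals (a = 0) until one color dies out;
    return the final number of black sheep (either 0 or w0 + b0)."""
    b, n = b0, w0 + b0
    while 0 < b < n:
        if rng.random() < b / n:   # a black sheep is chosen: a white turns black
            b += 1
        else:                      # a white sheep is chosen: a black turns white
            b -= 1
    return b

rng = random.Random(0)
runs = 20000
finals = [simulate_do_nothing(10, 10, rng) for _ in range(runs)]
print("P(end all black) ~", sum(f > 0 for f in finals) / runs)  # ~0.5 by symmetry
print("E[B_inf]         ~", sum(finals) / runs)                 # ~10
```

From a symmetric start the two absorbing states are equally likely, so doing nothing earns the same expected 10 sheep as removing every white sheep at once; the interesting policies sit in between.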

6 Transition probability
For the Markov chain $B_t$ with initially k sheep of each color, the transition matrix on the states $\{0, 1, \dots, 2k\}$ is
$$P = \begin{pmatrix} 1 & 0 & & & \\ \frac{2k-1}{2k} & 0 & \frac{1}{2k} & & \\ & \ddots & \ddots & \ddots & \\ & & \frac{1}{2k} & 0 & \frac{2k-1}{2k} \\ & & & 0 & 1 \end{pmatrix}, \qquad P_{i,i-1} = \frac{2k-i}{2k},\; P_{i,i+1} = \frac{i}{2k} \quad (0 < i < 2k).$$
$B_t$ is not irreducible: $\{0\}$ and $\{2k\}$ are recurrent (absorbing) while all other states are transient. Any $\pi$ with $\pi(0) + \pi(2k) = 1$ is stationary.
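
The matrix itself is easy to build and inspect for small k; a minimal numpy sketch (the helper name sheep_matrix is mine):

```python
import numpy as np

def sheep_matrix(k):
    """Transition matrix of the no-removal sheep chain on states 0..2k."""
    n = 2 * k
    P = np.zeros((n + 1, n + 1))
    P[0, 0] = P[n, n] = 1.0          # absorbing states: all white / all black
    for i in range(1, n):
        P[i, i - 1] = (n - i) / n    # a white sheep is chosen
        P[i, i + 1] = i / n          # a black sheep is chosen
    return P

P = sheep_matrix(2)                  # k = 2: states 0..4
print(P)
assert np.allclose(P.sum(axis=1), 1.0)   # each row is a probability distribution
```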

7–9 Absorbing Markov Chain
We can divide the communicating classes into transient classes and ergodic classes. We call a chain an absorbing Markov chain when all ergodic classes are singleton sets.

10–11 AMC cont.
Theorem. For any finite Markov chain, the probability that the process is in an ergodic class tends to 1 as $n \to \infty$.
Proof. From any transient state there is a positive probability of leaving its transient class, and hence a positive probability of reaching an ergodic class. Because the state space is finite, there exist $N$ and $p > 0$ such that from every state the probability of entering an ergodic state within $N$ steps is at least $p$. The probability of not having entered an ergodic state after $kN$ steps is therefore at most $(1-p)^k \to 0$ as $k \to \infty$.
Lemma. For the sheep chain, $P(B_t \in \{0, 2k\}) \to 1$.

12 Fundamental Matrix
We can put the transition matrix P into the canonical form
$$P = \begin{pmatrix} S & 0 \\ R & Q \end{pmatrix},$$
where for the sheep chain S is the $2 \times 2$ identity on the absorbing states $\{0, 2k\}$ and Q is the tridiagonal block on the transient states $1, \dots, 2k-1$.
S: transitions within ergodic classes
Q: transitions within transient classes
R: transitions from transient classes to ergodic classes
The fundamental matrix is
$$N = (I - Q)^{-1} = I + Q + Q^2 + \cdots = \sum_{n=0}^{\infty} Q^n.$$

13–14 $N = (I - Q)^{-1} = \sum_{n \ge 0} Q^n$
Remark 1. Let $n_j$ be the number of visits to a transient state j. Then
$$E_i[n_j] = \sum_{n=0}^{\infty} (Q^n)_{ij} = N_{ij}.$$
Remark 2. In a similar way we can also obtain $\mathrm{Var}_i(n_j)$, $E_i[\sum_{j \in T} n_j]$, and $\mathrm{Var}_i[\sum_{j \in T} n_j]$ from the fundamental matrix N.
Theorem. Let $b_{ij}$ be the probability that the chain, started in transient state i, ends in absorbing state j. Then $\{b_{ij}\} = B = NR$.
Proof. By the strong Markov property (conditioning on the first step),
$$b_{ij} = p_{ij} + \sum_{k \in T} p_{ik} b_{kj} \;\Longrightarrow\; B = R + QB \;\Longrightarrow\; (I - Q)B = R \;\Longrightarrow\; B = NR.$$
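
A short numpy sketch of this computation for the sheep chain (the variable names and the choice k = 3 are mine):

```python
import numpy as np

k = 3
n = 2 * k
P = np.zeros((n + 1, n + 1))
P[0, 0] = P[n, n] = 1.0                  # absorbing states 0 and 2k
for i in range(1, n):
    P[i, i - 1] = (n - i) / n            # a white sheep is chosen
    P[i, i + 1] = i / n                  # a black sheep is chosen

transient = list(range(1, n))            # states 1..2k-1
Q = P[np.ix_(transient, transient)]      # transient -> transient block
R = P[np.ix_(transient, [0, n])]         # transient -> absorbing block

N = np.linalg.inv(np.eye(n - 1) - Q)     # fundamental matrix
B = N @ R                                # absorption probabilities b_{ij}

print("expected visits, started from b0 = k:", N[k - 1].round(3))
print("P(absorb at 0), P(absorb at 2k)     :", B[k - 1].round(3))  # 0.5, 0.5
```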

15 The sheep chain
For the sheep chain with $b_0 + w_0 = 2k$, the absorption probability at 2k started from $b_0$ is
$$b_{b_0,2k} = \left(\sum_{i=1}^{b_0} \frac{1}{(i-1)!\,(2k-i)!}\right) \bigg/ \left(\sum_{i=1}^{2k} \frac{1}{(i-1)!\,(2k-i)!}\right) = 2^{-(2k-1)} \sum_{i=1}^{b_0} \binom{2k-1}{i-1},$$
and hence
$$E_{b_0}[B_\infty] = (2k)\, b_{b_0,2k}.$$
[Figure: $E[B_\infty]$ with respect to $b_0$.]
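
The closed form can be checked against B = NR directly; the helper names below are mine, and the formula is the binomial form reconstructed above:

```python
import numpy as np
from math import comb

def absorb_black_closed_form(b0, k):
    """P(chain ends all black | B_0 = b0), binomial form of b_{b0,2k}."""
    return sum(comb(2 * k - 1, i - 1) for i in range(1, b0 + 1)) / 2 ** (2 * k - 1)

def absorb_black_matrix(b0, k):
    """The same probability read off from B = N R."""
    n = 2 * k
    P = np.zeros((n + 1, n + 1))
    P[0, 0] = P[n, n] = 1.0
    for i in range(1, n):
        P[i, i - 1], P[i, i + 1] = (n - i) / n, i / n
    B = np.linalg.inv(np.eye(n - 1) - P[1:n, 1:n]) @ P[1:n, [0, n]]
    return B[b0 - 1, 1]                  # second column = absorption at 2k

k = 4
for b0 in range(1, 2 * k):
    cf = absorb_black_closed_form(b0, k)
    assert abs(cf - absorb_black_matrix(b0, k)) < 1e-10
    print(b0, round(2 * k * cf, 3))      # E[B_inf] as a function of b0
```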

16–18 Policy A
Policy A: do nothing if $W_t < B_t$; reduce $W_t$ to $B_t - 1$ if $W_t \ge B_t$.
Definition. The value function $V(w, b)$ is the expected final number of black sheep under Policy A if we start with w white sheep and b black sheep.
As a result, V has the following properties:
(a1) $V(0, b) = b$.
(a2) $V(w, b) = V(w-1, b) = V(b-1, b)$ whenever $w \ge b > 0$.
(a3) $V(w, b) = \frac{w}{w+b} V(w+1, b-1) + \frac{b}{w+b} V(w-1, b+1)$ whenever $b > w > 0$.
Claim. Under Policy A, $V(W_n, B_n)$ is a martingale w.r.t. $\{\mathcal{F}_n\}$:
$$E[V(W_{n+1}, B_{n+1}) \mid \mathcal{F}_n] = \frac{W_n}{W_n + B_n} V(W_n + 1, B_n - 1) + \frac{B_n}{W_n + B_n} V(W_n - 1, B_n + 1) = V(W_n, B_n).$$
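
Properties (a1)-(a3) pin V down numerically: on each diagonal w + b = n, (a3) couples every interior state to both of its neighbours on the same diagonal, so the interior states form a small linear system whose boundary values come from (a1) and, via (a2), from smaller diagonals. The sketch below solves these systems diagonal by diagonal; the helper name value_function and the implementation are my reconstruction, not code from the slides:

```python
import numpy as np

def value_function(max_total):
    """Tabulate V(w, b) under Policy A for all states with w + b <= max_total,
    using (a1)-(a3): a tridiagonal solve per diagonal w + b = n."""
    V = {}
    for n in range(max_total + 1):
        V[(0, n)] = n                       # (a1): no white sheep left
        if n > 0:
            V[(n, 0)] = 0                   # no black sheep: value 0
        for b in range(1, n // 2 + 1):      # (a2): w >= b > 0 trims to (b-1, b)
            V[(n - b, b)] = V[(b - 1, b)]
        W = (n - 1) // 2                    # interior states: w = 1..W, b = n - w
        if W < 1:
            continue
        A = np.zeros((W, W))
        rhs = np.zeros(W)
        for idx, w in enumerate(range(1, W + 1)):
            b = n - w
            A[idx, idx] = 1.0
            if w == 1:                      # neighbour (w-1, b+1) hits (a1)
                rhs[idx] += (b / n) * V[(0, n)]
            else:
                A[idx, idx - 1] -= b / n
            if w + 1 < b - 1:               # neighbour (w+1, b-1) still interior
                A[idx, idx + 1] -= w / n
            else:                           # trimmed by (a2) to a known value
                rhs[idx] += (w / n) * V[(b - 2, b - 1)]
        x = np.linalg.solve(A, rhs)
        for idx, w in enumerate(range(1, W + 1)):
            V[(w, n - w)] = x[idx]
    return V

V = value_function(60)
print(V[(1, 2)])     # 7/3: doing nothing beats removing the lone white sheep
print(V[(30, 30)])   # equals V(29, 30) by (a2)
```

The printed V(30, 30) lands close to the asymptotic 2k + π/4 − √(πk) from the long-term-limit slide below.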

19–20 Supermartingale
Lemma 1. $V(w, b) \ge V(w-1, b)$ whenever $w > 0$.
Lemma 2. $V(w, b) \ge \frac{w}{w+b} V(w+1, b-1) + \frac{b}{w+b} V(w-1, b+1)$ whenever $w, b > 0$.
Suppose these two lemmas hold; then we have the theorem:
Theorem. For any policy, $V(W_n, B_n)$ is a supermartingale.
Proof. Indeed, by Lemma 1 no removal action increases the value of V, and by Lemma 2 the random transition does not increase it in expectation. Hence
$$E[V(W_{n+1}, B_{n+1}) \mid \mathcal{F}_n] \le V(W_n, B_n).$$

21 Martingale Convergence
Because $V(W_n, B_n)$ is a non-negative supermartingale, it converges almost surely, and the chain must end up in an absorbing state in which all sheep are of one color. In this case, for deterministic $W_0$ and $B_0$,
$$E[B_\infty] = E[V(W_\infty, B_\infty)] \le V(W_0, B_0).$$
Therefore, for any initial numbers of black and white sheep, the expected final number of black sheep under any policy is no more than the expected final number under Policy A. We conclude that Policy A is optimal.
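
A Monte Carlo sanity check of the conclusion; a minimal sketch with names and parameters of my own choosing:

```python
import random

def run_policy_A(w, b, rng):
    """Simulate the herd under Policy A; return the final number of black sheep."""
    while True:
        if w >= b:                     # Policy A: trim the whites down to b - 1
            w = max(b - 1, 0)
        if w == 0 or b == 0:
            return b
        if rng.random() < b / (w + b):
            w, b = w - 1, b + 1        # black chosen: a white sheep turns black
        else:
            w, b = w + 1, b - 1        # white chosen: a black sheep turns white

rng = random.Random(1)
runs = 20000
w0 = b0 = 20
mean_A = sum(run_policy_A(w0, b0, rng) for _ in range(runs)) / runs
print("Policy A            :", round(mean_A, 2))  # ~32.9, close to V(20, 20)
print("remove all at t = 0 :", b0)                # the naive lower bound of 20
```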

22 Long-term Limit
Theorem. $V(k, k) - (2k + \frac{\pi}{4} - \sqrt{\pi k}) \to 0$ as $k \to \infty$.
Remark. If we start with 1000 black sheep and 1000 white sheep, we expect to finish with about 1945 black sheep.
Proof. $p_k := 2^{-2k}\binom{2k}{k} \sim (\pi k)^{-1/2}$ by Stirling's formula. Write $v_k = V(k, k)$ and let $\alpha_k = v_k - (2k + \frac{\pi}{4} - p_k^{-1})$. For $\rho_k = \frac{2p_k}{1 + p_k}$ one has $v_{k+1} = v_k + \rho_k(2k + 1 - v_k)$, which gives
$$\alpha_{k+1} = (1 - \rho_k)\alpha_k - \rho_k c_k, \qquad c_k = \frac{\pi}{4} - \frac{(p_k - p_{k+1})(1 + p_k)}{2 p_k^2\, p_{k+1}}.$$
Because $c_k \to 0$ and $\prod_j (1 - \rho_j) \to 0$, we get $\alpha_k \to 0$; since $p_k^{-1} - \sqrt{\pi k} \to 0$, the theorem follows.
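
The recursion from the proof can be iterated numerically to check the remark; a small sketch (the recursion is the one reconstructed above, starting from $v_1 = V(1,1) = 1$):

```python
from math import comb, pi, sqrt

v = 1.0                                  # v_1 = V(1, 1) = V(0, 1) = 1
for k in range(1, 1000):                 # v_{k+1} = v_k + rho_k (2k + 1 - v_k)
    p = comb(2 * k, k) / 4 ** k          # p_k = 2^{-2k} C(2k, k)
    rho = 2 * p / (1 + p)
    v += rho * (2 * k + 1 - v)

k = 1000
print("V(1000, 1000)          :", round(v, 2))                              # ~1944.7
print("2k + pi/4 - sqrt(pi k) :", round(2 * k + pi / 4 - sqrt(pi * k), 2))  # 1944.74
```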

23 Continuous Case
Let $B_t, W_t$ be the numbers of black and white sheep, but now for $t \in [0, \infty)$. The removal process $A_t$ is a continuous-time, non-decreasing process (heuristically $dA_t = a_t\,dt$ with $a_t \ge 0$). Set $X_t = B_t + W_t$, $Y_t = B_t - W_t$, and let $T = \inf\{t \ge 0 : Y_t \ge X_t\}$ be the first time no white sheep remain. Then
$$dX_t = -dA_t, \qquad dY_t = \frac{Y_t}{X_t}\,dt + dA_t + 2\,d\beta_t,$$
and
$$V(x, y) = \max_A E_{(x,y)}\big[X_T\, 1_{\{Y_T \ge X_T\}}\big].$$
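
To see the diffusion behave, here is an Euler–Maruyama sketch with no removals ($A \equiv 0$, so $X$ is constant); the discretization, names, and parameters are my own illustration, not Chan's analysis:

```python
import random
from math import sqrt

def simulate_diffusion(x0, y0, rng, dt=1e-3):
    """Euler-Maruyama for dX = 0, dY = (Y/X) dt + 2 dbeta (no removals, A = 0),
    run until Y >= X (all black) or Y <= -X (all white)."""
    x, y = x0, y0
    while -x < y < x:
        y += (y / x) * dt + 2.0 * sqrt(dt) * rng.gauss(0.0, 1.0)
    return x if y >= x else 0.0          # X_T on {Y_T >= X_T}, else 0

rng = random.Random(2)
runs = 1000
mean_final = sum(simulate_diffusion(5.0, 0.0, rng) for _ in range(runs)) / runs
print("E[final black], A = 0, symmetric start:", round(mean_final, 2))  # ~2.5
```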

24 Application
An analogous idea appears in portfolio selection with transaction costs (Davis and Norman, 1990). An investor divides wealth between a bank account paying a fixed interest rate and a stock with log-normal diffusion price. He consumes from the bank account and tries to maximize the expected discounted utility of consumption $E\int_0^\infty e^{-\delta t} u(c(t))\,dt$. Let $s_0(t)$ be the holding in the bank and $s_1(t)$ the holding in the stock:
$$ds_0(t) = (r s_0(t) - c(t))\,dt, \qquad ds_1(t) = \alpha s_1(t)\,dt + \sigma s_1(t)\,dB.$$
In this case, we want to keep the ratios $\frac{s_0(t)}{s_1(t)} = \pi$ and $\frac{c(t)}{s_0(t) + s_1(t)} = C$ constant.
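
A sketch of the two holdings under a constant proportional consumption rule, simulated with a simple Euler scheme; all parameter values below are illustrative assumptions, not taken from Davis and Norman:

```python
import random
from math import sqrt

# Euler scheme for the two holdings under the rule c(t) = C * (s0 + s1).
# Parameters are illustrative assumptions, not from the paper.
r, alpha, sigma, C = 0.03, 0.07, 0.2, 0.04
dt, steps = 1e-3, 10_000                       # simulate up to t = 10
rng = random.Random(3)

s0, s1 = 50.0, 50.0
for _ in range(steps):
    c = C * (s0 + s1)                          # consumption rate
    s0 += (r * s0 - c) * dt                    # bank account
    s1 += alpha * s1 * dt + sigma * s1 * sqrt(dt) * rng.gauss(0.0, 1.0)  # stock

print("after t = 10: bank", round(s0, 2), "stock", round(s1, 2))
```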

25 References
Terence Chan, Some diffusion models for the Mabinogion sheep problem of Williams, Advances in Applied Probability (1996).
Mark H. A. Davis and Andrew R. Norman, Portfolio selection with transaction costs, Mathematics of Operations Research 15 (1990), no. 4.
John G. Kemeny and J. Laurie Snell, Finite Markov Chains: with a New Appendix "Generalization of a Fundamental Matrix", Springer.
David Williams, Probability with Martingales, Cambridge University Press, 1991.

26 Thank you!
