MS&E 321 Spring 12-13: Stochastic Systems                              June 1, 2013
Prof. Peter W. Glynn                                                   Page 1 of 10

Section 4: Steady-State Theory

Contents
4.1 The Concept of Stochastic Equilibrium .............................. 1
4.2 Existence of Stationary Distributions for Finite-State Markov Chains ... 3
4.3 Definitions and Simple Consequences ................................ 5
4.4 A Test for Recurrence .............................................. 5
4.5 Proving Positive Recurrence ........................................ 6
4.6 Convergence of the n-step Transition Probabilities .................. 9

4.1 The Concept of Stochastic Equilibrium

In the setting of a deterministic dynamical system $(x_n : n \ge 0)$ governed by a recursion $x_{n+1} = f(x_n)$, a stable dynamical system ought typically to converge to an equilibrium $x_\infty$, so that $x_n \to x_\infty$ as $n \to \infty$. Provided that $f$ is continuous, the equilibrium $x_\infty$ will then necessarily satisfy the deterministic fixed point equation
$$x_\infty = f(x_\infty). \tag{4.1.1}$$
Note that if $x_0 = x_\infty$, then
$$x_n = x_\infty \tag{4.1.2}$$
for $n \ge 0$, so that the system is in equilibrium or steady-state when started in $x_\infty$.

For a stochastic system $(X_n : n \ge 0)$, it is rarely the case that
$$X_n \to X_\infty \quad \text{a.s.} \tag{4.1.3}$$
as $n \to \infty$. This is too strong a notion of convergence to steady-state. In particular, note that if $(X_n : n \ge 0)$ satisfies the stochastic recursion $X_{n+1} = f(X_n, Z_{n+1})$, new randomness (as determined by $Z_{n+1}$) is injected into the system, no matter how large $n$ is. As a consequence, one cannot usually expect that $f(X_n, Z_{n+1}) - X_n \to 0$ a.s., as would be required by (4.1.3). Rather, in the stochastic context, one might instead hope that $X_n \Rightarrow X_\infty$
as $n \to \infty$, so that we are demanding only that the distribution of $X_n$ converges to an equilibrium distribution as $n \to \infty$. The analog to (4.1.1) is then a stochastic fixed point equation: the equilibrium rv $X_\infty$ must satisfy
$$X_\infty \overset{D}{=} f(X_\infty, Z_\infty),$$
where $\overset{D}{=}$ means equality in distribution. In particular, when the stochastic recursion is such that $Z_{n+1}$ is independent of $X_n$ and identically distributed (as for a Markov chain), the distribution $\pi$ of $X_\infty$ must satisfy
$$\pi(\cdot) = P(X_\infty \in \cdot) = \int_S P(X_\infty \in dx)\, P_x(X_1 \in \cdot) = \int_S \pi(dx)\, P_x(X_1 \in \cdot) \tag{4.1.4}$$
(since $Z_\infty$ is then independent of $X_\infty$). Equation (4.1.4) is the equation that characterizes the equilibrium distribution (or steady-state distribution) of a Markov chain. Note that when $S$ is discrete, (4.1.4) asserts that any equilibrium distribution $\pi = (\pi(x) : x \in S)$ should satisfy
$$\pi(y) = \sum_{x \in S} \pi(x) P(x, y)$$
for $y \in S$, or, equivalently, the linear system
$$\pi P = \pi. \tag{4.1.5}$$
It should further be noted that if $X_0 \overset{D}{=} X_\infty$ (so that $X_0$ is initialized with distribution $\pi$), then
$$X_n \overset{D}{=} X_0 \tag{4.1.6}$$
for $n \ge 0$, in direct analog with (4.1.2).

A good deal of the theory of Markov chains is concerned with the question of existence and uniqueness of equilibrium distributions (or, in discrete state space, the question of existence and uniqueness of probability solutions of the linear system (4.1.5)). A closely related set of mathematical questions deals with the connection between the equilibrium distribution of a Markov chain and its long-run behavior when the chain is initialized from a non-equilibrium state. The typical form of such long-run (asymptotic) behavior is either a law of large numbers
$$\frac{1}{n} \sum_{j=0}^{n-1} I(X_j = y) \to \pi(y) \quad \text{a.s.}$$
as $n \to \infty$, or a pointwise limit theorem such as
$$P_x(X_n = y) \to \pi(y)$$
as $n \to \infty$.

Remark 4.1.1 The relation (4.1.6) is a manifestation of the fact that when the Markov chain is initialized with distribution $\pi$, $X = (X_n : n \ge 0)$ is a stationary process, so that $(X_{n+m} : n \ge 0) \overset{D}{=} (X_n : n \ge 0)$ for $m \ge 0$. For this reason, $\pi$ is often called a stationary distribution of $X$.
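For a small finite-state chain, the linear system (4.1.5) can be solved directly. The sketch below is a minimal illustration, assuming an invented 3-state transition matrix; any irreducible finite-state chain would do.

```python
import numpy as np

# A small illustrative transition matrix (rows sum to 1); the entries are
# invented for this example.
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.2, 0.4, 0.4]])

# pi P = pi together with sum_x pi(x) = 1 is a linear system.  The matrix
# P^T - I is singular (pi P = pi has a one-dimensional solution space for an
# irreducible chain), so we overwrite one row with the normalization
# constraint before solving.
A = P.T - np.eye(3)
A[-1, :] = 1.0
b = np.array([0.0, 0.0, 1.0])
pi = np.linalg.solve(A, b)

print(pi)            # the stationary distribution
print(pi @ P - pi)   # numerically zero: pi solves pi P = pi
```

Replacing an equation (rather than appending the normalization as an extra row) keeps the system square so that `np.linalg.solve` applies directly.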
4.2 Existence of Stationary Distributions for Finite-State Markov Chains

In this section, we use analytic (non-probabilistic) methods to show that every finite-state Markov chain has an equilibrium distribution. We first use a well-known fixed point theorem to assert existence of such equilibrium distributions.

Brouwer Fixed Point Theorem: Let $T : C \to C$ be a continuous mapping defined on a compact convex subset $C$ of $\mathbb{R}^d$. Then, there exists a fixed point $z_*$ such that $z_* = T(z_*)$.

To apply this to finite-state chains, let $\mathcal{P}$ be the set of all stochastic vectors $\mu = (\mu(x) : x \in S)$. Note that $\mathcal{P}$ can be viewed as a subset of $\mathbb{R}^d_+$, where $d = |S|$. Furthermore, $\mathcal{P}$ is compact and convex. Now, define $T(\mu) = \mu P$, where $P = (P(x, y) : x, y \in S)$ is the (one-step) transition matrix of the Markov chain. Since $T$ maps $\mathcal{P}$ into $\mathcal{P}$ and is continuous, Brouwer's theorem guarantees existence of $\pi \in \mathcal{P}$ such that $\pi = \pi P$.

But we can go further than this.

Theorem 4.2.1 Let $P = (P(x, y) : x, y \in S)$ be a stochastic matrix for which $|S| < \infty$. Then, there exists a matrix $\Lambda$ such that
$$P\Lambda = \Lambda P = \Lambda^2 = \Lambda. \tag{4.2.1}$$
Furthermore,
$$\frac{1}{n} \sum_{j=0}^{n-1} P^j \to \Lambda \tag{4.2.2}$$
as $n \to \infty$.

Proof: We first show that there exists a subsequence $(n_k : k \ge 1)$ along which the averages converge to a limit; we will call the limit matrix $\Lambda$. Note that the sequence
$$\bar P_n \triangleq \frac{1}{n} \sum_{j=0}^{n-1} P^j \tag{4.2.3}$$
can be identified with a sequence in $\mathbb{R}^{d^2}$. Because $\bar P_n$ lies in the unit hypercube in $\mathbb{R}^{d^2}$ and the unit hypercube is compact, it follows that there exists a subsequence $(n_k : k \ge 1)$ and a limit $\Lambda$ such that
$$\bar P_{n_k} \to \Lambda \tag{4.2.4}$$
as $k \to \infty$. Since $\bar P_n$ is stochastic for each $n \ge 1$, it follows that the limit $\Lambda$ is necessarily stochastic. Furthermore,
$$\bar P_{n_k+1} = \frac{1}{n_k+1}\left(I + \sum_{j=1}^{n_k} P^j\right) = \frac{I}{n_k+1} + \frac{n_k}{n_k+1}\, P \bar P_{n_k} \to 0 + P\Lambda = P\Lambda.$$
On the other hand,
$$\bar P_{n_k+1} = \frac{n_k}{n_k+1}\, \bar P_{n_k} + \frac{1}{n_k+1}\, P^{n_k} \to \Lambda + 0 = \Lambda,$$
so that we conclude that $P\Lambda = \Lambda$. An essentially identical argument proves that $\Lambda P = \Lambda$ and $\Lambda^2 = \Lambda$.

It remains to establish (4.2.2). We need to show that for each $\epsilon > 0$, there exists $N = N(\epsilon)$ such that for $n \ge N$,
$$\|\bar P_n - \Lambda\|_e < \epsilon. \tag{4.2.5}$$
Choose $n_k$ so that $\|\bar P_{n_k} - \Lambda\|_e < \epsilon/2$; this can be done on account of (4.2.4). Then, for $m \ge 1$,
$$\bar P_{mn_k} - \Lambda = \frac{1}{mn_k} \sum_{j=0}^{mn_k-1} P^j - \Lambda = \left(\frac{1}{m} \sum_{i=0}^{m-1} P^{i n_k}\right)\left(\frac{1}{n_k} \sum_{j=0}^{n_k-1} P^j\right) - \Lambda. \tag{4.2.6}$$
But (4.2.1) ensures that $P^j \Lambda = \Lambda P^j = \Lambda$ for $j \ge 1$, so
$$\left(\frac{1}{m} \sum_{i=0}^{m-1} P^{i n_k}\right) \Lambda = \frac{1}{m} \sum_{i=0}^{m-1} \Lambda = \Lambda,$$
so that we can re-write (4.2.6) as
$$\bar P_{mn_k} - \Lambda = \left(\frac{1}{m} \sum_{i=0}^{m-1} P^{i n_k}\right)\left(\bar P_{n_k} - \Lambda\right).$$
If $A$ is stochastic, it is easily verified that $\|A\|_e = 1$. Since $\frac{1}{m} \sum_{i=0}^{m-1} P^{i n_k}$ is stochastic,
$$\|\bar P_{mn_k} - \Lambda\|_e \le \left\|\frac{1}{m} \sum_{i=0}^{m-1} P^{i n_k}\right\|_e \|\bar P_{n_k} - \Lambda\|_e < \frac{\epsilon}{2}.$$
It follows that (4.2.5) holds whenever $n$ is a multiple of $n_k$. To deal with $n$ that are not multiples of $n_k$, say $n$ of the form $n = mn_k + l$ (with $0 \le l < n_k$), note
that
$$\bar P_n - \Lambda = \frac{mn_k}{n}\left(\bar P_{mn_k} - \Lambda\right) + \frac{1}{n} \sum_{j=mn_k}^{n-1} \left(P^j - \Lambda\right),$$
so that
$$\|\bar P_n - \Lambda\|_e \le \frac{mn_k}{mn_k+l}\, \|\bar P_{mn_k} - \Lambda\|_e + \frac{1}{n} \sum_{j=mn_k}^{n-1} \|P^j - \Lambda\|_e \le \frac{\epsilon}{2} + \frac{2l}{n} \le \frac{\epsilon}{2} + \frac{2}{m},$$
using $\|P^j - \Lambda\|_e \le \|P^j\|_e + \|\Lambda\|_e = 2$ and $l/n \le n_k/(mn_k) = 1/m$. The right-hand side is at most $\epsilon$ once $m \ge 4/\epsilon$, so (4.2.5) holds for all $n \ge N \triangleq \lceil 4/\epsilon \rceil n_k$, proving the theorem.

Remark 4.2.1 Every row of $\Lambda$ is a stationary distribution of $P$.

4.3 Definitions and Simple Consequences

For the definitions of irreducibility, transient state, and recurrent state, the reader is referred to the text and to Applied Probability and Queues by S. Asmussen (2003).

Proposition 4.3.1 Suppose $X = (X_n : n \ge 0)$ is an irreducible Markov chain. Let $\tau(x) = \inf\{n \ge 1 : X_n = x\}$ for each $x \in S$. Then:

- If one state $x$ is recurrent, then all states are recurrent.
- If one state $x$ is transient, then all states are transient.
- If $X$ is recurrent, then $P_x(\tau(y) < \infty) = 1$ for all $x, y \in S$.

4.4 A Test for Recurrence

Theorem 4.4.1 Suppose $X = (X_n : n \ge 0)$ is irreducible. Then:

i.) $X$ is transient if and only if there exist $x, y \in S$ such that $\sum_{n=0}^{\infty} P^n(x, y) < \infty$.

ii.) $X$ is recurrent if and only if there exist $x, y \in S$ such that $\sum_{n=0}^{\infty} P^n(x, y) = \infty$.

Example 4.4.1 Simple symmetric random walk is recurrent in $d = 1, 2$. Simple symmetric random walk is transient in $d \ge 3$.
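The dimension dependence in Example 4.4.1 can be seen in a small Monte Carlo experiment. The sketch below (horizon, trial count, and seed are arbitrary choices for illustration) estimates the probability that simple symmetric random walk returns to the origin within a fixed number of steps; in $d = 1$ this probability is close to 1, while in $d = 3$ it stays well below 1, consistent with transience.

```python
import random

def return_within(d, horizon, trials, seed=0):
    """Estimate P(simple symmetric random walk in Z^d returns to 0
    within `horizon` steps)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        pos = [0] * d
        for _ in range(horizon):
            i = rng.randrange(d)            # pick a coordinate at random
            pos[i] += rng.choice((-1, 1))   # step +/-1 in that coordinate
            if all(c == 0 for c in pos):
                hits += 1
                break
    return hits / trials

p1 = return_within(1, horizon=2000, trials=2000)
p3 = return_within(3, horizon=2000, trials=2000)
print(p1, p3)   # p1 near 1; p3 well below 1 (Polya: the infinite-horizon
                # return probability in d = 3 is roughly 0.34)
```

Note the finite horizon only gives a lower bound on the true return probability; in $d = 2$ the walk is recurrent but returns so slowly that a much longer horizon is needed to see the probability approach 1.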
Remark 4.4.1

- A random walk $(S_n : n \ge 0)$ in $d = 1$ with $S_n = S_0 + Z_1 + \cdots + Z_n$, with the $Z_i$'s iid and with $EZ_1 = 0$, is recurrent.

- A random walk $(S_n : n \ge 0)$ in $d = 1$ with $S_n = S_0 + Z_1 + \cdots + Z_n$, with the $Z_i$'s iid and with
$$P(Z_1 \in dx) = \frac{dx}{\pi(1 + x^2)}$$
(i.e. with a Cauchy distribution), is recurrent.

- A random walk $(S_n : n \ge 0)$ in $d = 2$ with $S_n = S_0 + Z_1 + \cdots + Z_n$, with $E\|Z_1\|^2 < \infty$ and with $EZ_1 = 0$, is recurrent.

- A genuinely $d$-dimensional random walk in $\mathbb{R}^d$ is transient for $d \ge 3$.

See R. Durrett, Probability: Theory and Examples (1991), pp. 159-170, for details.

4.5 Proving Positive Recurrence

Proposition 4.5.1 Suppose $X$ is irreducible. Then, $X$ is positive recurrent if and only if there exists a probability solution $\pi$ to
$$\pi = \pi P. \tag{4.5.1}$$

Remark 4.5.1 If one can solve (4.5.1) explicitly, one has established positive recurrence. If one cannot solve (4.5.1) explicitly, one needs to look for alternatives; see below.

To prove positive recurrence, we need to prove that there exists $z \in S$ such that $E_z \tau(z) < \infty$. Note that
$$E_z \tau(z) = 1 + \sum_{y \ne z} P(z, y) E_y \tau(z).$$
Put $C^c = \{z\}$ and $C = S \setminus C^c$. Note that for $y \in C$,
$$E_y \tau(z) \le E_y \sum_{j=0}^{T-1} w(X_j),$$
where $T = \inf\{n \ge 0 : X_n \in C^c\}$ and $w = (w(x) : x \in C)$ is a weight function (satisfying $w \ge 1$) as defined earlier in the quarter. Suppose that there exists $c < 1$ such that
$$E_x w(X_1) \le c\, w(x)$$
for $x \in C$. Then, it follows that
$$E_x w(X_1) I(X_1 \in C) \le c\, w(x),$$
so
$$\|B\|_w \le c,$$
where $B = (B(x, y) : x, y \in C)$ has entries defined by $B(x, y) = P(x, y)$. Set $f(x) = w(x)$ for $x \in C$ and note that $\|f\|_w = 1$. So,
$$E_y \sum_{j=0}^{T-1} w(X_j) = E_y \sum_{j=0}^{T-1} f(X_j) \le (1 - \|B\|_w)^{-1} \|f\|_w\, w(y) \le (1 - c)^{-1} w(y).$$
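The drift condition $E_x w(X_1) \le c\, w(x)$ is easy to check on a concrete chain. The sketch below uses an invented example, a reflected random walk on $\mathbb{Z}_+$ with downward bias, and the exponential weight $w(x) = e^{\theta x}$:

```python
import math

# Reflected random walk on {0, 1, 2, ...}: from x >= 1 step down with
# probability q and up with probability p.  Parameters are invented for
# illustration, with p < q so the chain drifts toward 0.
p, q = 0.3, 0.7
theta = 0.5

def w(x):
    return math.exp(theta * x)   # the weight function w(x) = e^{theta x}

def drift_ratio(x):
    """E_x w(X_1) / w(x) for x >= 1."""
    return (q * w(x - 1) + p * w(x + 1)) / w(x)

# For this chain the ratio is constant in x >= 1:
#   c = q e^{-theta} + p e^{theta},
# which is < 1 for this (p, q, theta), so E_x w(X_1) <= c w(x) holds with a
# single c < 1 simultaneously for all x >= 1.
c = q * math.exp(-theta) + p * math.exp(theta)
print(c, drift_ratio(1), drift_ratio(10))
```

Minimizing $q e^{-\theta} + p e^{\theta}$ over $\theta$ gives the best contraction constant; any sufficiently small $\theta > 0$ already works when $p < q$, mirroring the "$\theta > 0$ sufficiently small" choice in Example 4.5.1 below.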
Proposition 4.5.2 If there exists $w : S \to [1, \infty)$ such that
$$E_x w(X_1) \le c\, w(x) \tag{4.5.2}$$
for $x \ne z$ and some $c < 1$, then
$$E_x \sum_{j=0}^{\tau(z)-1} w(X_j) \le (1 - c)^{-1} w(x)$$
for $x \ne z$.

Example 4.5.1 Consider the embedded DTMC $(X_n : n \ge 0)$ (with $X_n = X(D_n+)$ for $n \ge 0$) for the M/G/1 queue with $E \exp(\theta V_1) < \infty$ for $\theta$ in a neighborhood of the origin and with $\lambda E V_1 < 1$. Set $w(x) = \exp(\theta x)$ for $\theta > 0$ and sufficiently small. Then, there exist $c < 1$ and $\theta > 0$ small so that
$$E_x e^{\theta X_1} \le c\, e^{\theta x}$$
for $x \ge 1$. Furthermore,
$$\sum_{y=1}^{\infty} P(0, y) e^{\theta y} < \infty.$$
Hence, $E_0 \tau(0) < \infty$, so $(X_n : n \ge 0)$ is positive recurrent. Therefore, there exists a probability solution $\pi$ to the linear system $\pi = \pi P$.

Exercise 4.5.1 For some problems, it is hard to prove (4.5.2) for $x \ne z$, but (much) easier to prove (4.5.2) for $x \in K^c$, where $K$ is a finite set containing $z$. To deal with such situations, suppose now that $X$ is irreducible and let $T = \inf\{n \ge 1 : X_n \in K\}$.

a.) Prove that if (4.5.2) holds for $x \in K^c$, then $P_x(T < \infty) = 1$ for $x \in K^c$.

b.) (continuation of a.)) Prove that for $x \in K$,
$$E_x \tau(z) = E_x T + \sum_{\substack{y \in K \\ y \ne z}} P_x(X_T = y)\, E_y \tau(z).$$

c.) (continuation of b.)) Prove that for $x \in K$,
$$E_x T \le 1 + (1 - c)^{-1} \sum_{y \in K^c} P(x, y)\, w(y).$$

d.) (continuation of c.)) Prove that if (4.5.2) holds and
$$\max_{x \in K} E_x w(X_1) < \infty, \tag{4.5.3}$$
then $E_z \tau(z) < \infty$.

e.) (continuation of d.)) Use a similar argument to that used above in a.)-d.) to prove that if $|f| \le w$, then
$$E_x \sum_{j=0}^{\tau(z)-1} |f(X_j)| < \infty$$
if (4.5.2) and (4.5.3) hold, so that $E_\pi |f(X_0)| = \sum_x \pi(x)\, |f(x)| < \infty$.
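Example 4.5.1 can be sanity-checked by simulation. The embedded chain satisfies $X_{n+1} = (X_n - 1)^+ + A_{n+1}$, where $A_{n+1}$ is the number of Poisson($\lambda$) arrivals during the $(n+1)$-st service. The sketch below specializes to exponential services (so $A$ is geometric, and the system is really M/M/1; the rates are invented for illustration) and estimates $E_0 \tau(0)$, which by positive recurrence should equal $1/\pi(0) = 1/(1 - \rho)$:

```python
import random

rng = random.Random(1)
lam, mu = 0.5, 1.0   # arrival and service rates (invented), rho = 0.5

def arrivals_during_service():
    # Number of Poisson(lam) arrivals during one Exp(mu) service time.
    # For exponential services this count is geometric:
    #   P(A = k) = (1 - r) r^k  with  r = lam / (lam + mu).
    r = lam / (lam + mu)
    k = 0
    while rng.random() < r:
        k += 1
    return k

# Estimate E_0 tau(0): the expected number of departures between successive
# visits of the embedded chain to state 0.
n_cycles, total = 20000, 0
for _ in range(n_cycles):
    x, steps = 0, 0
    while True:
        x = max(x - 1, 0) + arrivals_during_service()
        steps += 1
        if x == 0:
            break
    total += steps

rho = lam / mu
est = total / n_cycles
print(est)   # close to 1/(1 - rho) = 2, since pi(0) = 1 - rho for this queue
```

The identity $E_0 \tau(0) = 1/\pi(0)$ is the standard cycle (Kac) formula for positive recurrent chains; that $\pi(0) = 1 - \rho$ for the M/G/1 queue is derived below.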
Example 4.5.1 (continued) Under the stated conditions on $V_1$,
$$E_0 \sum_{j=0}^{\tau(0)-1} e^{\theta X_j} < \infty,$$
so the moment generating function
$$\pi(\theta) \triangleq \sum_x \pi(x)\, e^{\theta x} < \infty$$
for $\theta$ in a neighborhood of the origin. The transition matrix $P$ for $(X_n : n \ge 0)$ is
$$P = \begin{pmatrix} p_0 & p_1 & p_2 & p_3 & \cdots \\ p_0 & p_1 & p_2 & p_3 & \cdots \\ 0 & p_0 & p_1 & p_2 & \cdots \\ 0 & 0 & p_0 & p_1 & \cdots \\ \vdots & & & & \ddots \end{pmatrix},$$
where
$$p_i = \int_0^\infty e^{-\lambda t}\, \frac{(\lambda t)^i}{i!}\, P(V_1 \in dt).$$
Then the steady-state equations can be written as
$$\pi(i) = \pi(0)\, p_i + \sum_{r=1}^{i+1} \pi(r)\, p_{i-r+1}, \quad i = 0, 1, 2, \ldots$$
We solve for the moment generating function $\pi(\theta)$ in terms of
$$K(\theta) \triangleq \sum_{k=0}^{\infty} e^{\theta k}\, p_k.$$
Multiplying through by $e^{\theta i}$, we get
$$e^{\theta i} \pi(i) = \pi(0)\, p_i\, e^{\theta i} + e^{-\theta} \sum_{r=0}^{i+1} \pi(r)\, p_{i-r+1}\, e^{\theta(i+1)} - e^{-\theta} \pi(0)\, p_{i+1}\, e^{\theta(i+1)}, \quad i = 0, 1, 2, \ldots$$
Summing over $i$, and recognizing $\sum_{r=0}^{i+1} \pi(r)\, p_{i-r+1}$ as a convolution, we have
$$\sum_{i=0}^{\infty} e^{\theta i} \pi(i) = \pi(\theta) = \pi(0) K(\theta) + e^{-\theta}\left[K(\theta)\, \pi(\theta) - \pi(0)\, p_0\right] - e^{-\theta} \pi(0)\left[K(\theta) - p_0\right].$$
Solving for $\pi(\theta)$ gives
$$\pi(\theta) = \frac{\pi(0)\, K(\theta)\, (e^\theta - 1)}{e^\theta - K(\theta)}.$$
Note that
$$K(\theta) = \int_0^\infty \sum_{k=0}^\infty e^{\theta k}\, e^{-\lambda t}\, \frac{(\lambda t)^k}{k!}\, P(V_1 \in dt) = \int_0^\infty \exp\left(\lambda t (e^\theta - 1)\right) P(V_1 \in dt) = \varphi_V(\lambda(e^\theta - 1)),$$
where $\varphi_V(\theta) = E \exp(\theta V_1)$. So,
$$\pi(\theta) = \frac{\pi(0)\, \varphi_V(\lambda(e^\theta - 1))\, (e^\theta - 1)}{e^\theta - \varphi_V(\lambda(e^\theta - 1))}. \tag{4.5.4}$$
To identify $\pi(0)$, let $\theta \downarrow 0$ and use l'Hopital's rule on the right-hand side of (4.5.4), yielding
$$\pi(0) = 1 - \lambda E V_1.$$
Hence, the moment-generating function of the stationary distribution for the M/G/1 queue is given by
$$\pi(\theta) = \frac{(1 - \rho)\, \varphi_V(\lambda(e^\theta - 1))\, (e^\theta - 1)}{e^\theta - \varphi_V(\lambda(e^\theta - 1))},$$
where $\rho \triangleq \lambda E V_1$. Since $\pi(\cdot)$ is finite in a neighborhood of the origin, the Dominated Convergence Theorem guarantees that $\pi'(0) = \sum_{x=0}^{\infty} x\, \pi(x)\ (\triangleq L)$, yielding
$$L = \rho + \frac{\rho^2 (c^2 + 1)}{2(1 - \rho)}, \tag{4.5.5}$$
where $c^2$ is the squared coefficient of variation of $V_1$, given by $c^2 = \operatorname{var} V_1 / (E V_1)^2$. Equation (4.5.5) is the celebrated Pollaczek-Khintchine formula for the mean number-in-system for the M/G/1 queue. It demonstrates the cost paid for variability in the service times.

Exercise 4.5.2 Compute the second moment $\sum_{x=0}^{\infty} x^2\, \pi(x)$ for Example 4.5.1.

Exercise 4.5.3 Suppose that $X = (X(t) : t \ge 0)$ is the number-in-system process for a G/M/1 queue (i.e. a G/G/1 queue with exponential service times having rate parameter $\mu$). Suppose that $E\chi_1 > 1/\mu$, where $\chi_1$ is the first inter-arrival time.

a.) Argue that if $X_n = X(A_n-)$ (where $A_n$ is the arrival time of customer $n$), then $(X_n : n \ge 0)$ is a Markov chain on state space $S = \mathbb{Z}_+$.

b.) Compute the transition matrix $P$.

c.) Compute the stationary distribution of $(X_n : n \ge 0)$. (Hint: Try a geometric distribution.)

4.6 Convergence of the n-step Transition Probabilities

Definition 4.6.1 Let $X = (X_n : n \ge 0)$ be an irreducible Markov chain with transition matrix $P$. The state $x \in S$ is said to be periodic of period $p$ if $\gcd\{n \ge 1 : P^n(x, x) > 0\} = p$ (where gcd = "greatest common divisor"). If the period is 1, the state $x$ is said to be aperiodic.

Proposition 4.6.1 Suppose $X$ is irreducible. Then $x \in S$ is periodic of period $p$ if and only if $y$ is periodic of period $p$ for all $y \in S$.
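Definition 4.6.1 can be turned into a small computation: find the set $\{n \le N : P^n(x, x) > 0\}$ up to a finite horizon $N$ and take a gcd. The sketch below uses invented matrices, and the finite horizon is a heuristic that reveals the true period for small chains, not a proof:

```python
import math
from functools import reduce

import numpy as np

def period_estimate(P, x, horizon=50):
    """gcd of {1 <= n <= horizon : P^n(x, x) > 0} (finite-horizon heuristic)."""
    ns = []
    Q = np.eye(P.shape[0])
    for n in range(1, horizon + 1):
        Q = Q @ P
        if Q[x, x] > 0:
            ns.append(n)
    return reduce(math.gcd, ns) if ns else 0

# A deterministic 3-cycle: every state has period 3 (cf. Proposition 4.6.1).
C = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [1., 0., 0.]])
print([period_estimate(C, x) for x in range(3)])   # [3, 3, 3]

# A self-loop at state 0 (P(0,0) > 0) makes the whole chain aperiodic.
A = np.array([[0.5, 0.5, 0.],
              [0.,  0.,  1.],
              [1.,  0.,  0.]])
print([period_estimate(A, x) for x in range(3)])   # [1, 1, 1]
```

Note how the single self-loop drives every state's period to 1, even though states 1 and 2 have no self-loops themselves: they acquire return paths of lengths 3 and 4, whose gcd is 1.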
An irreducible Markov chain with transition matrix $P$ is periodic of period $p$ if and only if $S$ can be partitioned into $P_1, P_2, \ldots, P_p$ for which $P$ has the block structure
$$P = \begin{pmatrix} 0 & P_{12} & 0 & \cdots & 0 \\ 0 & 0 & P_{23} & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & P_{p-1,p} \\ P_{p,1} & 0 & 0 & \cdots & 0 \end{pmatrix}.$$
Note that if $P(x, x) > 0$ for some $x \in S$, $X$ must be aperiodic. (The converse is false, however.)

Theorem 4.6.1 If $X = (X_n : n \ge 0)$ is an irreducible positive recurrent aperiodic Markov chain, then
$$P^n(x, y) \to \pi(y)$$
as $n \to \infty$, where $\pi = (\pi(y) : y \in S)$ is the stationary distribution of $X$.

For a proof via coupling, see pp. 273-275 of Probability: Theory and Examples by R. Durrett (1991).
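Theorem 4.6.1 can be illustrated numerically: raise $P$ to a high power and compare each row with $\pi$. A minimal sketch, again with an invented matrix (irreducibility and $P(0,0) > 0$, hence aperiodicity, are arranged by construction):

```python
import numpy as np

# Irreducible (0 -> 1 -> 2 -> 0) and aperiodic (P(0,0) > 0); entries invented.
P = np.array([[0.2, 0.8, 0.0],
              [0.0, 0.5, 0.5],
              [0.6, 0.0, 0.4]])

# Stationary distribution via the linear system pi P = pi, sum_x pi(x) = 1.
A = P.T - np.eye(3)
A[-1, :] = 1.0
pi = np.linalg.solve(A, np.array([0.0, 0.0, 1.0]))

# By Theorem 4.6.1, P^n(x, y) -> pi(y) for every starting state x, i.e.
# every row of P^n approaches pi.
Pn = np.linalg.matrix_power(P, 200)
print(Pn)   # each row is (numerically) pi
print(pi)
```

For a chain of period $p > 1$, $P^n$ would not converge; only the Cesàro averages of Theorem 4.2.1 would, which is why aperiodicity is needed here.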