FINITE-STATE MARKOV CHAINS

Chapter 4

4.1 Introduction

The counting processes {N(t); t ≥ 0} of Chapters 2 and 3 have the property that N(t) changes at discrete instants of time, but is defined for all real t ≥ 0. Such stochastic processes are generally called continuous-time processes. The Markov chains to be discussed in this and the next chapter are stochastic processes defined only at integer values of time, n = 0, 1, .... At each integer time n ≥ 0, there is an integer-valued random variable (rv) X_n, called the state at time n, and the process is the family of rv's {X_n; n ≥ 0}. These processes are often called discrete-time processes, but we prefer the more specific term integer-time processes. An integer-time process {X_n; n ≥ 0} can also be viewed as a continuous-time process {X(t); t ≥ 0} by taking X(t) = X_n for n ≤ t < n + 1, but since changes occur only at integer times, it is usually simpler to view the process only at integer times.

In general, for Markov chains, the set of possible values for each rv X_n is a countable set, usually taken to be {0, 1, 2, ...}. In this chapter (except for Theorems 4.2 and 4.3), we restrict attention to a finite set of possible values, say {1, ..., M}. Thus we are looking at processes whose sample functions are sequences of integers, each between 1 and M. There is no special significance to using integer labels for states, and no compelling reason to include 0 as a state for the countably infinite case and not to include 0 for the finite case. For the countably infinite case, the most common applications come from queueing theory, and the state often represents the number of waiting customers, which can be zero. For the finite case, we often use vectors and matrices, and it is more conventional to use positive integer labels. In some examples, it will be more convenient to use more illustrative labels for states.

Definition 4.1. A Markov chain is an integer-time process {X_n; n ≥ 0} for which each rv X_n, n ≥ 1, is integer valued and depends on the past only through the most recent rv X_{n-1}, i.e., for all integer n ≥ 1 and all integer i, j, k, ..., m,

Pr{X_n = j | X_{n-1} = i, X_{n-2} = k, ..., X_0 = m} = Pr{X_n = j | X_{n-1} = i}.    (4.1)

Pr{X_n = j | X_{n-1} = i} depends only on i and j (not n) and is denoted by

Pr{X_n = j | X_{n-1} = i} = P_{ij}.    (4.2)

The initial state X_0 has an arbitrary probability distribution, which is required for a full probabilistic description of the process, but is not needed for most of the results. A Markov chain in which each X_n has a finite set of possible sample values is a finite-state Markov chain.

The rv X_n is called the state of the chain at time n. The possible values for the state at time n, namely {1, ..., M} or {0, 1, ...}, are also generally called states, usually without too much confusion. Thus P_{ij} is the probability of going to state j given that the previous state is i; the new state, given the previous state, is independent of all earlier states. The use of the word state here conforms to the usual idea of the state of a system: the state at a given time summarizes everything about the past that is relevant to the future.

Note that the transition probabilities, P_{ij}, do not depend on n. Occasionally, a more general model is required where the transition probabilities do depend on n. In such situations, (4.1) and (4.2) are replaced by

Pr{X_n = j | X_{n-1} = i, X_{n-2} = k, ..., X_0 = m} = Pr{X_n = j | X_{n-1} = i} = P_{ij}(n).    (4.3)

A process that obeys (4.3), with a dependence on n, is called a non-homogeneous Markov chain. Some people refer to a Markov chain (as defined in (4.1) and (4.2)) as a homogeneous Markov chain. We will discuss only the homogeneous case (since not much of general interest can be said about the non-homogeneous case) and thus omit the word homogeneous as a qualifier. An initial probability distribution for X_0, combined with the transition probabilities {P_{ij}} (or {P_{ij}(n)} for the non-homogeneous case), defines the probabilities for all events.

Markov chains can be used to model an enormous variety of physical phenomena and can be used to approximate most other kinds of stochastic processes.
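As a concrete illustration of Definition 4.1, a finite-state chain is fully specified by an initial state and its matrix of transition probabilities, and a sample path can be generated one step at a time, each step depending only on the current state. A minimal Python sketch (the 3-state matrix is a made-up example, not one from the text; states are labeled 0..2 for Python indexing rather than 1..M):

```python
import random

# A hypothetical 3-state transition matrix; row i lists P_{ij} for j = 0, 1, 2.
P = [[0.5, 0.5, 0.0],
     [0.2, 0.3, 0.5],
     [0.0, 0.4, 0.6]]

def is_stochastic(P, tol=1e-12):
    """A transition matrix has non-negative entries and rows summing to 1."""
    return all(min(row) >= 0 and abs(sum(row) - 1.0) <= tol for row in P)

def sample_path(P, x0, n, seed=0):
    """Generate X_0, ..., X_n; each X_k depends only on X_{k-1}, as in (4.1)-(4.2)."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n):
        # Draw the next state from row X_{k-1} of [P].
        path.append(rng.choices(range(len(P)), weights=P[path[-1]])[0])
    return path
```

Running sample_path(P, 0, 10) yields an 11-state trajectory; an arbitrary initial distribution for X_0 could be sampled the same way before the loop starts.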
To see this, consider sampling a given process at a high rate in time, and then quantizing it, thus converting it into a discrete-time process, {Z_n; −∞ < n < ∞}, where each Z_n takes on a finite set of possible values. In this new process, each variable Z_n will typically have a statistical dependence on past values that gradually dies out in time, so we can approximate the process by allowing Z_n to depend on only a finite number of past variables, say Z_{n-1}, ..., Z_{n-k}. Finally, we can define a Markov process where the state at time n is X_n = (Z_n, Z_{n-1}, ..., Z_{n-k+1}). The state X_n = (Z_n, Z_{n-1}, ..., Z_{n-k+1}) then depends only on X_{n-1} = (Z_{n-1}, ..., Z_{n-k+1}, Z_{n-k}), since the new part of X_n, i.e., Z_n, is independent of Z_{n-k-1}, Z_{n-k-2}, ..., and the other variables comprising X_n are specified by X_{n-1}. Thus {X_n} forms a Markov chain approximating the original process. This is not always an insightful or desirable model, but at least provides one possibility for modeling relatively general stochastic processes.

Markov chains are often described by a directed graph (see Figure 4.1). In the graphical representation, there is one node for each state and a directed arc for each non-zero transition probability. If P_{ij} = 0, then the arc from node i to node j is omitted; thus the difference between zero and non-zero transition probabilities stands out clearly in the graph. Several of the most important characteristics of a Markov chain depend only on which transition probabilities are zero, so the graphical representation is well suited for understanding these characteristics.

A finite-state Markov chain is also often described by a matrix [P] (see Figure 4.1). If the chain has M states, then [P] is an M by M matrix with elements P_{ij}. The matrix representation is ideally suited for studying algebraic and computational issues.

Figure 4.1: Graphical and matrix representation of a 6-state Markov chain; a directed arc from i to j is included in the graph if and only if (iff) P_{ij} > 0. Part (a) shows the directed graph; part (b) shows the corresponding 6 by 6 matrix [P] with elements P_{11}, P_{12}, ..., P_{66}.

4.2 Classification of states

This section, except where indicated otherwise, applies to Markov chains with both finite and countable state spaces. We start with several definitions.

Definition 4.2. An (n-step) walk[1] is an ordered string of nodes {i_0, i_1, ..., i_n}, n ≥ 1, in which there is a directed arc from i_{m-1} to i_m for each m, 1 ≤ m ≤ n. A path is a walk in which the nodes are distinct. A cycle is a walk in which the first and last nodes are the same and the other nodes are distinct.

Note that a walk can start and end on the same node, whereas a path cannot. Also the number of steps in a walk can be arbitrarily large, whereas a path can have at most M − 1 steps and a cycle at most M steps.

Definition 4.3. A state j is accessible from i (abbreviated as i → j) if there is a walk in the graph from i to j.

For example, in Figure 4.1(a), there is a walk from node 1 to node 3 (passing through node 2), so state 3 is accessible from 1. There is no walk from node 5 to 3, so state 3 is not accessible from 5. State 2, for example, is accessible from itself, but state 6 is not accessible from itself.

To see the probabilistic meaning of accessibility, suppose that a walk i_0, i_1, ..., i_n exists from node i_0 to i_n. Then, conditional on X_0 = i_0, there is a positive probability, P_{i_0 i_1}, that X_1 = i_1, and consequently (since P_{i_1 i_2} > 0), there is a positive probability that

[1] We are interested here only in directed graphs, and thus undirected walks and paths do not arise.

X_2 = i_2. Continuing this argument, there is a positive probability that X_n = i_n, so that Pr{X_n = i_n | X_0 = i_0} > 0. Similarly, if Pr{X_n = i_n | X_0 = i_0} > 0, then there is an n-step walk from i_0 to i_n. Summarizing, i → j if and only if (iff) Pr{X_n = j | X_0 = i} > 0 for some n ≥ 1. We denote Pr{X_n = j | X_0 = i} by P^n_{ij}. Thus, for n ≥ 1, P^n_{ij} > 0 iff the graph has an n-step walk from i to j (perhaps visiting the same node more than once). For the example in Figure 4.1(a), P^2_{13} = P_{12} P_{23} > 0. On the other hand, P^n_{53} = 0 for all n ≥ 1. An important relation that we use often in what follows is that if there is an n-step walk from state i to j and an m-step walk from state j to k, then there is a walk of m + n steps from i to k. Thus

P^n_{ij} > 0 and P^m_{jk} > 0 imply P^{n+m}_{ik} > 0.    (4.4)

This also shows that

i → j and j → k imply i → k.    (4.5)

Definition 4.4. Two distinct states i and j communicate (abbreviated i ↔ j) if i is accessible from j and j is accessible from i.

An important fact about communicating states is that if i ↔ j and m ↔ j then i ↔ m. To see this, note that i ↔ j and m ↔ j imply that i → j and j → m, so that i → m. Similarly, m → i, so i ↔ m.

Definition 4.5. A class T of states is a non-empty set of states such that each state i ∈ T communicates with each j ∈ T (except perhaps itself) and does not communicate with any j ∉ T.

For the example of Fig. 4.1(a), {1, 2, 3, 4} is one class of states, {5} is another, and {6} is another. Note that state 6 does not communicate with itself, but {6} is still considered to be a class. The entire set of states in a given Markov chain is partitioned into one or more disjoint classes in this way.

Definition 4.6. For finite-state Markov chains, a recurrent state is a state i that is accessible from all states that are accessible from i (i is recurrent if i → j implies that j → i). A transient state is a state that is not recurrent.

Recurrent and transient states for Markov chains with a countably infinite set of states will be defined in the next chapter.

According to the definition, a state i in a finite-state Markov chain is recurrent if there is no possibility of going to a state j from which there can be no return. As we shall see later, if a Markov chain ever enters a recurrent state, it returns to that state eventually with probability 1, and thus keeps returning infinitely often (in fact, this property serves as the definition of recurrence for Markov chains without the finite-state restriction). A state i is transient if there is some j that is accessible from i but from which there is no possible return. Each time the system returns to i, there is a possibility of going to j; eventually this possibility will occur, and then no more returns to i can occur (this can be thought of as a mathematical form of Murphy's law).
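The notions of accessibility, classes, and recurrence above are purely graph-theoretic: they depend only on which P_{ij} are non-zero, so they can be computed mechanically. A Python sketch using Warshall's transitive-closure algorithm (the 3-state example matrix is hypothetical, not the chain of Figure 4.1):

```python
def reach(P):
    """reach[i][j] = True iff j is accessible from i (a walk i -> j exists)."""
    M = len(P)
    R = [[P[i][j] > 0 for j in range(M)] for i in range(M)]
    for k in range(M):                      # Warshall's transitive closure
        for i in range(M):
            for j in range(M):
                R[i][j] = R[i][j] or (R[i][k] and R[k][j])
    return R

def classify(P):
    """Return (classes, recurrent_flags) in the spirit of Definitions 4.5 and 4.6."""
    R = reach(P)
    M = len(P)
    seen, classes, recurrent = set(), [], []
    for i in range(M):
        if i in seen:
            continue
        # The class of i: i itself plus every j that communicates with i.
        cls = [j for j in range(M) if j == i or (R[i][j] and R[j][i])]
        seen.update(cls)
        classes.append(cls)
        # Recurrent (Def. 4.6): every state accessible from i leads back to i.
        recurrent.append(all(R[j][i] for j in range(M) if R[i][j]))
    return classes, recurrent
```

For the test matrix below, state 0 can leave for the class {1, 2} but never return, so classify reports {0} as a transient class and {1, 2} as recurrent, matching Theorem 4.1's claim that recurrence is a class property.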

Theorem 4.1. For finite-state Markov chains, either all states in a class are transient or all are recurrent.[2]

Proof: Assume that state i is transient (i.e., for some j, i → j but j ↛ i) and suppose that i and m are in the same class (i.e., i ↔ m). Then m → i and i → j, so m → j. Now if j → m, then the walk from j to m could be extended to i; this is a contradiction, and therefore there is no walk from j to m, and m is transient. Since we have just shown that all nodes in a class are transient if any one is, it follows that the states in a class are either all recurrent or all transient.

For the example of Fig. 4.1(a), {1, 2, 3, 4} is a transient class and {5} is a recurrent class. In terms of the graph of a Markov chain, a class is transient if there are any directed arcs going from a node in the class to a node outside the class. Every finite-state Markov chain must have at least one recurrent class of states (see Exercise 4.1), and can have arbitrarily many additional classes of recurrent states and transient states.

States can also be classified according to their periods (see Figure 4.2). In Fig. 4.2(a), given that X_0 = 2, we see that X_1 must be either 1 or 3, X_2 must then be either 2 or 4, and in general, X_n must be 2 or 4 for n even and 1 or 3 for n odd. On the other hand, if X_0 is 1 or 3, then X_n is 2 or 4 for n odd and 1 or 3 for n even. Thus the effect of the starting state never dies out. Fig. 4.2(b) illustrates another example in which the state alternates from odd to even and the memory of the starting state never dies out. The states in both these Markov chains are said to be periodic with period 2.

Figure 4.2: Periodic Markov chains.

Definition 4.7. The period of a state i, denoted d(i), is the greatest common divisor (gcd) of those values of n for which P^n_{ii} > 0. If the period is 1, the state is aperiodic, and if the period is 2 or more, the state is periodic.[3]

For example, in Figure 4.2(a), P^n_{11} > 0 for n = 2, 4, 6, .... Thus d(1), the period of state 1, is two. Similarly, d(i) = 2 for the other states in Figure 4.2(a). For Fig. 4.2(b), we have

[2] This theorem is also true for Markov chains with a countable state space, but the proof here is inadequate. Also, recurrent classes with a countable state space are further classified into either positive-recurrent or null-recurrent, a distinction that does not appear in the finite-state case.

[3] For completeness, we say that the period is infinite if P^n_{ii} = 0 for all n ≥ 1. Such states do not have the intuitive characteristics of either periodic or aperiodic states. Such a state cannot communicate with any other state, and cannot return to itself, so it corresponds to a singleton class of transient states. The notion of periodicity is of primary interest for recurrent states.

P^n_{11} > 0 for n = 4, 8, 10, 12, ...; thus d(1) = 2, and it can be seen that d(i) = 2 for all the states. These examples suggest the following theorem.

Theorem 4.2. For any Markov chain (with either a finite or countably infinite number of states), all states in the same class have the same period.

Proof: Let i and j be any distinct pair of states in a class. Then i ↔ j and there is some r such that P^r_{ij} > 0 and some s such that P^s_{ji} > 0. Since there is a walk of length r + s going from i to j and back to i, r + s must be divisible by d(i). Let t be any integer such that P^t_{jj} > 0. Since there is a walk of length r + t + s that goes first from i to j, then to j again, and then back to i, r + t + s is divisible by d(i), and thus t is divisible by d(i). Since this is true for any t such that P^t_{jj} > 0, d(j) is divisible by d(i). Reversing the roles of i and j, d(i) is divisible by d(j), so d(i) = d(j).

Since the states in a class all have the same period and are either all recurrent or all transient, we refer to the class itself as having the period of its states and as being recurrent or transient. Similarly, if a Markov chain has a single class of states, we refer to the chain as having the corresponding period and being recurrent or transient.

Theorem 4.3. If a recurrent class in a finite-state Markov chain has period d, then the states in the class can be partitioned into d subsets, S_1, S_2, ..., S_d, such that all transitions out of subset S_m go to subset S_{m+1} for m < d and to subset S_1 for m = d. That is, if j ∈ S_m and P_{jk} > 0, then k ∈ S_{m+1} for m < d and k ∈ S_1 for m = d.

Proof: See Figure 4.3 for an illustration of the theorem. For a given state in the class, say state 1, define the sets S_1, ..., S_d by

S_m = {j : P^{nd+m}_{1j} > 0 for some n ≥ 0};  1 ≤ m ≤ d.    (4.6)

For each given j in the class, we first show that there is one and only one value of m such that j ∈ S_m. Since 1 → j, there is some r for which P^r_{1j} > 0 and some s for which P^s_{j1} > 0. Since there is a walk from 1 to 1 (through j) of length r + s, r + s is divisible by d. Define m, 1 ≤ m ≤ d, by r = m + nd, where n is an integer. From (4.6), j ∈ S_m. Now let r′ be any other integer such that P^{r′}_{1j} > 0. Then r′ + s is also divisible by d, so that r′ − r is divisible by d. Thus r′ = m + n′d for some integer n′ and that same m. Since r′ is any integer such that P^{r′}_{1j} > 0, j is in S_m for only that one value of m. Since j is arbitrary, this shows that the sets S_m are disjoint and partition the class.

Finally, suppose j ∈ S_m and P_{jk} > 0. Given a walk of length r = nd + m from state 1 to j, there is a walk of length nd + m + 1 from state 1 to k. It follows that if m < d, then k ∈ S_{m+1}, and if m = d, then k ∈ S_1, completing the proof.

We have seen that each class of states (for a finite-state chain) can be classified both in terms of its period and in terms of whether or not it is recurrent. The most important case is that in which a class is both recurrent and aperiodic.

Definition 4.8. For a finite-state Markov chain, an ergodic class of states is a class that

is both recurrent and aperiodic.[4] A Markov chain consisting entirely of one ergodic class is called an ergodic chain.

Figure 4.3: Structure of a periodic Markov chain with d = 3. Note that transitions only go from one subset S_m to the next subset S_{m+1} (or from S_d to S_1).

We shall see later that these chains have the desirable property that P^n_{ij} becomes independent of the starting state i as n → ∞. The next theorem establishes the first part of this by showing that P^n_{ij} > 0 for all i and j when n is sufficiently large. The Markov chain in Figure 4.4 illustrates the theorem by showing how large n must be in the worst case.

Figure 4.4: An ergodic chain with M = 6 states in which P^m_{ij} > 0 for all m > (M − 1)^2 and all i, j, but P^{(M−1)^2}_{11} = 0.

The figure also illustrates that an ergodic M-state Markov chain must have a cycle with M − 1 or fewer nodes. To see this, note that an ergodic chain must have cycles, since each node must have a walk to itself, and any subcycle of repeated nodes can be omitted from that walk, converting it into a cycle. Such a cycle might have M nodes, but a chain consisting only of an M-node cycle would be periodic. Thus some nodes must be on smaller cycles, such as the cycle of length 5 in the figure.

Theorem 4.4. For an ergodic M-state Markov chain, P^m_{ij} > 0 for all i, j, and all m ≥ (M − 1)^2 + 1.

[4] For Markov chains with a countably infinite state space, ergodic means that the states are positive-recurrent and aperiodic (see Chapter 5, Section 5.1).
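Before the proof, note that the period of Definition 4.7 depends only on which entries of [P] are non-zero, so it can be computed by brute force from Boolean matrix powers. A Python sketch (the gcd is correct once nmax is large enough to capture the return lengths; the ring matrix is a relabeling of the four-state chain of Fig. 4.2(a), with states 1..4 mapped to 0..3):

```python
from math import gcd

def period(P, i, nmax=50):
    """Brute-force sketch of Definition 4.7: gcd of the lengths n <= nmax
    with P^n_{ii} > 0, tracked via Boolean matrix products."""
    M = len(P)
    A = [[P[r][c] > 0 for c in range(M)] for r in range(M)]  # one-step arcs
    An = [row[:] for row in A]                               # A^n, starting at n = 1
    d = 0
    for n in range(1, nmax + 1):
        if An[i][i]:                 # an n-step walk from i back to i exists
            d = gcd(d, n)
        if n < nmax:
            An = [[any(An[r][k] and A[k][c] for k in range(M)) for c in range(M)]
                  for r in range(M)]
    return d  # 0 means no return within nmax (an "infinite period" state)

# The four-state ring of Fig. 4.2(a): each state moves to one of its two neighbors.
ring = [[0.0, 0.5, 0.0, 0.5],
        [0.5, 0.0, 0.5, 0.0],
        [0.0, 0.5, 0.0, 0.5],
        [0.5, 0.0, 0.5, 0.0]]
```

Here period(ring, 0) returns 2, agreeing with the discussion of Fig. 4.2(a); adding any self-loop to the ring would drive the gcd down to 1 and make the state aperiodic.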

Proof*:[5] As shown in Figure 4.4, the chain must contain a cycle with fewer than M nodes. Let τ ≤ M − 1 be the number of nodes on a smallest cycle in the chain and let i be any given state on such a cycle. Define T(m), m ≥ 1, as the set of states accessible from the fixed state i in m steps. Thus T(1) = {j : P_{ij} > 0}, and for arbitrary m ≥ 1,

T(m) = {j : P^m_{ij} > 0}.    (4.7)

Since i is on a cycle of length τ, P^τ_{ii} > 0. For any m ≥ 1 and any j ∈ T(m), we can then construct an m + τ step walk from i to j by going from i to i in τ steps and then to j in another m steps. This is true for all j ∈ T(m), so

T(m) ⊆ T(m + τ).    (4.8)

By defining T(0) to be the singleton set {i}, (4.8) also holds for m = 0, since i ∈ T(τ). By starting with m = 0 and iterating on (4.8),

T(0) ⊆ T(τ) ⊆ T(2τ) ⊆ ··· ⊆ T(nτ) ⊆ ···.    (4.9)

We now show that if one of the inclusion relations in (4.9) is satisfied with equality, then all the subsequent relations are satisfied with equality. More generally, assume that T(m) = T(m + s) for some m ≥ 0 and s ≥ 1. Note that T(m + 1) is the set of states that can be reached in one step from states in T(m), and similarly T(m + s + 1) is the set reachable in one step from T(m + s) = T(m). Thus T(m + 1) = T(m + s + 1). Iterating this result,

T(m) = T(m + s) implies T(n) = T(n + s) for all n ≥ m.    (4.10)

Thus, (4.9) starts with some number of strict inclusions and then continues with equalities. Since the entire set has M members, there can be at most M − 1 strict inclusions in (4.9). Thus

T((M − 1)τ) = T(nτ) for all integers n ≥ M − 1.    (4.11)

Define k as (M − 1)τ. We can then rewrite (4.11) as

T(k) = T(k + jτ) for all j ≥ 1.    (4.12)

We next show that T(k) consists of all M nodes in the chain. The central part of this is to show that T(k) = T(k + 1). Let t be any positive integer other than τ such that P^t_{ii} > 0. Letting m = k in (4.8) and using t in place of τ,

T(k) ⊆ T(k + t) ⊆ T(k + 2t) ⊆ ··· ⊆ T(k + τt).    (4.13)

Since τt is a multiple of τ, (4.12) gives T(k + τt) = T(k), so (4.13) shows that

T(k) = T(k + t).    (4.14)

Now let s be the smallest positive integer such that

T(k) = T(k + s).    (4.15)

[5] Proofs marked with an asterisk can be omitted without loss of continuity.

From (4.11), we see that (4.15) holds when s takes the value τ. Thus, the minimizing s must lie in the range 1 ≤ s ≤ τ. We will show that s = 1 by assuming s > 1 and establishing a contradiction. Since the chain is aperiodic, there is some t not divisible by s for which P^t_{ii} > 0. This t can be represented by t = js + ℓ, where 1 ≤ ℓ < s and j ≥ 0. Iterating (4.15), we get T(k) = T(k + js), and applying (4.10) to this,

T(k + ℓ) = T(k + js + ℓ) = T(k + t) = T(k),

where we have used t = js + ℓ followed by (4.14). This is the desired contradiction, since ℓ < s. Thus s = 1 and T(k) = T(k + 1). Iterating this,

T(k) = T(k + n) for all n ≥ 0.    (4.16)

Since the chain is ergodic, each state j continues to be accessible after k steps. Therefore j must be in T(k + n) for some n ≥ 0, which, from (4.16), implies that j ∈ T(k). Since j is arbitrary, T(k) must be the entire set of states. Thus P^n_{ij} > 0 for all n ≥ k and all j.

This same argument can be applied to any state i on the given cycle with τ nodes. Any state m not on this cycle has a path to the cycle using at most M − τ steps. Using this path to reach a node i on the cycle, and following this with all the walks from i of length k = (M − 1)τ, we see that P^{M−τ+(M−1)τ}_{mj} > 0 for all j, m. The proof is complete, since M − τ + (M − 1)τ ≤ (M − 1)^2 + 1 for all τ, 1 ≤ τ ≤ M − 1, with equality when τ = M − 1.

Figure 4.4 illustrates a situation where the bound (M − 1)^2 + 1 is met with equality. Note that there is one cycle of length M − 1, and the single node not on this cycle, node 1, is the unique starting node at which the bound is met with equality.

4.3 The matrix representation

The matrix [P] of transition probabilities of a Markov chain is called a stochastic matrix; that is, a stochastic matrix is a square matrix of non-negative terms in which the elements in each row sum to 1. We first consider the n-step transition probabilities P^n_{ij} in terms of [P]. The probability of going from state i to state j in two steps is the sum over h of all possible two-step walks, from i to h and from h to j. Using the Markov condition in (4.1),

P^2_{ij} = Σ_{h=1}^{M} P_{ih} P_{hj}.

It can be seen that this is just the (i, j) term of the product of matrix [P] with itself; denoting [P][P] as [P]^2, this means that P^2_{ij} is the (i, j) element of the matrix [P]^2. Similarly, P^n_{ij} is

the (i, j) element of the nth power of the matrix [P]. Since [P]^{m+n} = [P]^m [P]^n, this means that

P^{m+n}_{ij} = Σ_{h=1}^{M} P^m_{ih} P^n_{hj}.    (4.17)

This is known as the Chapman-Kolmogorov equation. An efficient approach to compute [P]^n (and thus P^n_{ij}) for large n is to multiply [P]^2 by [P]^2, then [P]^4 by [P]^4, and so forth, and then multiply these binary powers together as needed.

The matrix [P]^n (i.e., the matrix of transition probabilities raised to the nth power) is very important for a number of reasons. The (i, j) element of this matrix is P^n_{ij}, which is the probability of being in state j at time n given state i at time 0. If memory of the past dies out with increasing n, then we would expect the dependence of P^n_{ij} on both n and i to disappear. This means, first, that [P]^n should converge to a limit as n → ∞, and, second, that each row of [P]^n should tend to the same set of probabilities. If this convergence occurs (and we later determine the circumstances under which it occurs), [P]^n and [P]^{n+1} will be the same in the limit n → ∞, which means lim [P]^n = (lim [P]^n)[P]. If all the rows of lim [P]^n are the same, equal to some row vector π = (π_1, π_2, ..., π_M), this simplifies to π = π[P]. Since π is a probability vector (i.e., its components are the probabilities of being in the various states in the limit n → ∞), its components must be non-negative and sum to 1.

Definition 4.9. A steady-state probability vector (or a steady-state distribution) for a Markov chain with transition matrix [P] is a vector π that satisfies

π = π[P] ;  where  Σ_i π_i = 1 ;  π_i ≥ 0, 1 ≤ i ≤ M.    (4.18)

The steady-state probability vector is also often called a stationary distribution. If a probability vector π satisfying (4.18) is taken as the initial probability assignment of the chain at time 0, then that assignment is maintained forever. That is, if Pr{X_0 = i} = π_i for all i, then Pr{X_1 = j} = Σ_i π_i P_{ij} = π_j for all j, and, by induction, Pr{X_n = j} = π_j for all j and all n > 0. If [P]^n converges as above, then, for each starting state, the steady-state distribution is reached asymptotically.
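The binary-powering recipe just described, together with the claimed convergence of the rows of [P]^n, is easy to try numerically. A Python sketch with plain list-based matrix arithmetic (the two-state matrix is a made-up ergodic example, for which π = (5/6, 1/6)):

```python
def mat_mul(A, B):
    """Plain matrix product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_pow(P, n):
    """[P]^n by binary (repeated-squaring) powering, as suggested in the text."""
    M = len(P)
    result = [[float(i == j) for j in range(M)] for i in range(M)]  # identity
    base = [row[:] for row in P]
    while n:
        if n & 1:
            result = mat_mul(result, base)   # fold this binary power in
        base = mat_mul(base, base)           # [P]^2, [P]^4, [P]^8, ...
        n >>= 1
    return result

# A hypothetical two-state ergodic chain; both rows of [P]^64 approach pi.
P = [[0.9, 0.1],
     [0.5, 0.5]]
```

Here mat_pow(P, 64) costs only about log2(64) = 6 squarings plus a final multiply, and its two rows agree with π = (5/6, 1/6) to within roundoff, illustrating the convergence discussed above and the fixed-point property (4.18).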
There are a number of questions that must be answered for a steady-state distribution as defined above:

1. Does π = π[P] always have a probability vector solution?

2. Does π = π[P] have a unique probability vector solution?

3. Do the rows of [P]^n converge to a probability vector solution of π = π[P]?

We first give the answers to these questions for finite-state Markov chains and then derive them. First, (4.18) always has a solution (although this is not necessarily true for infinite-state chains). The answers to the second and third questions are simpler with the following definition:

Definition 4.10. A unichain is a finite-state Markov chain that contains a single recurrent class plus, perhaps, some transient states. An ergodic unichain is a unichain for which the recurrent class is ergodic.

A unichain, as we shall see, is the natural generalization of a recurrent chain to allow for some initial transient behavior without disturbing the long-term asymptotic behavior of the underlying recurrent chain.

The answer to the second question above is that the solution to (4.18) is unique iff [P] is the transition matrix of a unichain. If there are r recurrent classes, then π = π[P] has r linearly independent solutions. For the third question, each row of [P]^n converges to the unique solution of (4.18) if [P] is the transition matrix of an ergodic unichain. If there are multiple recurrent classes, but all of them are aperiodic, then [P]^n still converges, but to a matrix with non-identical rows. If the Markov chain has one or more periodic recurrent classes, then [P]^n does not converge.

We first look at these answers from the standpoint of matrix theory and then proceed in Chapter 5 to look at the more general problem of Markov chains with a countably infinite number of states. There we use renewal theory to answer these same questions (and to discover the differences that occur for infinite-state Markov chains). The matrix theory approach is useful computationally and also has the advantage of telling us something about rates of convergence. The approach using renewal theory is very simple (given an understanding of renewal processes), but is more abstract.

4.3.1 The eigenvalues and eigenvectors of [P]

A convenient way of dealing with the nth power of a matrix is to find the eigenvalues and eigenvectors of the matrix.

Definition 4.11. The row vector π is a left eigenvector of [P] of eigenvalue λ if π ≠ 0 and π[P] = λπ. The column vector ν is a right eigenvector of eigenvalue λ if ν ≠ 0 and [P]ν = λν.

We first treat the special case of a Markov chain with two states. Here the eigenvalues and eigenvectors can be found by elementary (but slightly tedious) algebra. The eigenvector equations can be written out as

π_1 P_11 + π_2 P_21 = λπ_1        P_11 ν_1 + P_12 ν_2 = λν_1
π_1 P_12 + π_2 P_22 = λπ_2        P_21 ν_1 + P_22 ν_2 = λν_2.    (4.19)

These equations have a non-zero solution iff the matrix [P − λI], where [I] is the identity matrix, is singular (i.e., there must be a non-zero ν for which [P − λI]ν = 0). Thus λ must be such that the determinant of [P − λI], namely (P_11 − λ)(P_22 − λ) − P_12 P_21, is equal to 0. Solving this quadratic equation in λ, we find that λ has two solutions, λ_1 = 1 and λ_2 = 1 − P_12 − P_21.

Assume initially that P_12 and P_21 are not both 0. Then the solutions for the left and right eigenvectors, π^(1) and ν^(1) of λ_1, and π^(2) and ν^(2) of λ_2, are given by

π^(1)_1 = P_21/(P_12 + P_21)    π^(1)_2 = P_12/(P_12 + P_21)    ν^(1)_1 = 1    ν^(1)_2 = 1
π^(2)_1 = 1                     π^(2)_2 = −1                    ν^(2)_1 = P_12/(P_12 + P_21)    ν^(2)_2 = −P_21/(P_12 + P_21).

These solutions contain an arbitrary normalization factor. Now let

[Λ] = | λ_1   0  |
      |  0   λ_2 |

and let [U] be a matrix with columns ν^(1) and ν^(2). Then the two right eigenvector equations in (4.19) can be combined compactly as [P][U] = [U][Λ]. It turns out (given the way we have normalized the eigenvectors) that the inverse of [U] is just the matrix whose rows are the left eigenvectors of [P] (this can be verified by direct calculation, and we show later that any right eigenvector of one eigenvalue must be orthogonal to any left eigenvector of another eigenvalue). We then see that [P] = [U][Λ][U]^{-1} and consequently [P]^n = [U][Λ]^n [U]^{-1}. Multiplying this out, we get

[P]^n = | π_1 + π_2 λ_2^n    π_2 − π_2 λ_2^n |
        | π_1 − π_1 λ_2^n    π_2 + π_1 λ_2^n |

where π_1 = P_21/(P_12 + P_21) and π_2 = 1 − π_1. Recalling that λ_2 = 1 − P_12 − P_21, we see that |λ_2| ≤ 1. If P_12 = P_21 = 0, then λ_2 = 1, so that [P] and [P]^n are simply identity matrices. If P_12 = P_21 = 1, then λ_2 = −1, so that [P]^n alternates between the identity matrix for n even and [P] for n odd. In all other cases, |λ_2| < 1 and [P]^n approaches the matrix whose rows are both equal to π.

Parts of this special case generalize to an arbitrary finite number of states. In particular, λ = 1 is always an eigenvalue, and the vector e whose components are all equal to 1 is always a right eigenvector of λ = 1 (this follows immediately from the fact that each row of a stochastic matrix sums to 1). Unfortunately, not all stochastic matrices can be represented in the form [P] = [U][Λ][U]^{-1} (since M independent right eigenvectors need not exist; see Exercise 4.9). In general, the diagonal matrix of eigenvalues in [P] = [U][Λ][U]^{-1} must be replaced by something called a Jordan form, which does not easily lead us to the desired results. In what follows, we develop the powerful Perron and Frobenius theorems, which are useful in their own right and also provide the necessary results about [P]^n in general.
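The closed form for [P]^n just derived is easy to check against direct repeated multiplication. A Python sketch (the values P_12 = 0.3, P_21 = 0.6 are an arbitrary test case, not from the text):

```python
def two_state_power(p12, p21, n):
    """Closed form for [P]^n from the eigendecomposition above:
    lambda_2 = 1 - p12 - p21, pi_1 = p21/(p12 + p21), pi_2 = 1 - pi_1."""
    lam2 = 1.0 - p12 - p21
    pi1 = p21 / (p12 + p21)
    pi2 = 1.0 - pi1
    return [[pi1 + pi2 * lam2**n, pi2 - pi2 * lam2**n],
            [pi1 - pi1 * lam2**n, pi2 + pi1 * lam2**n]]

def brute_power(p12, p21, n):
    """Direct computation of [P]^n by n successive multiplications."""
    P = [[1 - p12, p12], [p21, 1 - p21]]
    R = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        R = [[sum(R[i][k] * P[k][j] for k in range(2)) for j in range(2)]
             for i in range(2)]
    return R
```

The two agree to machine precision for any n, and since |λ_2| < 1 here, the λ_2^n terms decay geometrically, making both rows of the closed form approach π at rate |λ_2|^n, which is exactly the "rate of convergence" information the matrix approach provides.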
4.4 Perron-Frobenius theory

A real vector x (i.e., a vector with real components) is defined to be positive, denoted x > 0, if x_i > 0 for each component i. A real matrix [A] is positive, denoted [A] > 0, if A_{ij} > 0 for each i, j. Similarly, x is non-negative, denoted x ≥ 0, if x_i ≥ 0 for all i. [A] is non-negative, denoted [A] ≥ 0, if A_{ij} ≥ 0 for all i, j. Note that it is possible to have x ≥ 0 and x ≠ 0 without having x > 0, since x > 0 means that all components of x are positive, whereas x ≥ 0, x ≠ 0 means that at least one component of x is positive and all are non-negative. Next, x > y and y < x both mean x − y > 0. Similarly, x ≥ y and y ≤ x mean x − y ≥ 0. The corresponding matrix inequalities have corresponding meanings.

We start by looking at the eigenvalues and eigenvectors of positive square matrices. In what follows, when we assert that a matrix, vector, or number is positive or non-negative, we implicitly mean that it is real also. We will prove Perron's theorem, which is the critical result for dealing with positive matrices. We then generalize Perron's theorem to the Frobenius theorem, which treats a class of non-negative matrices called irreducible matrices. We finally specialize the results to stochastic matrices.

Perron's theorem shows that a square positive matrix [A] always has a positive eigenvalue λ that exceeds the magnitude of all other eigenvalues. It also shows that this λ has a right eigenvector ν that is positive and unique within a scale factor. It establishes these results by relating λ to the following frequently useful optimization problem. For a given square matrix [A] > 0, and for any non-zero vector[6] x ≥ 0, let g(x) be the largest real number a for which ax ≤ [A]x. Let λ be defined by

λ = sup_{x ≠ 0, x ≥ 0} g(x).    (4.20)

We can express g(x) explicitly by rewriting ax ≤ [A]x as ax_i ≤ Σ_j A_{ij} x_j for all i. Thus, the largest a for which this is satisfied is

g(x) = min_i g_i(x)  where  g_i(x) = (Σ_j A_{ij} x_j) / x_i.    (4.21)

Since [A] > 0, x ≥ 0 and x ≠ 0, it follows that the numerator Σ_j A_{ij} x_j is positive for all i. Thus g_i(x) is positive for x_i > 0 and infinite for x_i = 0, so g(x) > 0. It is shown in Exercise 4.10 that g(x) is a continuous function of x over x ≠ 0, x ≥ 0, and that the supremum in (4.20) is actually achieved as a maximum.

Theorem 4.5 (Perron). Let [A] > 0 be an M by M matrix, let λ > 0 be given by (4.20) and (4.21), and let ν be a vector x that maximizes (4.20). Then

1. λν = [A]ν and ν > 0.

2. For any other eigenvalue µ of [A], |µ| < λ.

3. If x satisfies λx = [A]x, then x = βν for some (possibly complex) number β.

Discussion: Property (1) asserts not only that the solution λ of the optimization problem is an eigenvalue of [A], but also that the optimizing vector ν is an eigenvector and is strictly positive. Property (2) says that λ is strictly greater than the magnitude of any other eigenvalue, and thus we refer to it in what follows as the largest eigenvalue of [A]. Property (3) asserts that the eigenvector ν is unique (within a scale factor), not only among positive vectors but among all (possibly complex) vectors.

Proof* Property 1: We are given that

λ = g(ν) ≥ g(x)  for each x ≥ 0, x ≠ 0.    (4.22)

We must show that λν = [A]ν, i.e., that λν_i = Σ_j A_{ij} ν_j for each i, or equivalently that

λ = g(ν) = g_i(ν) = (Σ_j A_{ij} ν_j) / ν_i  for each i.    (4.23)

Thus we want to show that the minimum in (4.21) is achieved by each i, 1 ≤ i ≤ M. To show this, we assume the contrary and demonstrate a contradiction. Thus, suppose that

[6] Note that the set of non-zero vectors x for which x ≥ 0 is different from the set {x > 0} in that the former allows some x_i to be zero, whereas the latter requires all x_i to be positive.

g(ν) < g_k(ν) for some k. Let e_k be the kth unit vector and let ε be a small positive number. The contradiction will be to show that g(ν + εe_k) > g(ν) for small enough ε, thus violating (4.22). For i ≠ k,

g_i(ν + εe_k) = (Σ_j A_{ij} ν_j + εA_{ik}) / ν_i > g_i(ν) ≥ g(ν).    (4.24)

g_k(ν + εe_k), on the other hand, is continuous in ε as ε increases from 0, and thus remains greater than g(ν) for small enough ε. This shows that g(ν + εe_k) > g(ν), completing the contradiction. This also shows that ν_k must be greater than 0 for each k.

Property 2: Let µ be any eigenvalue of [A]. Let x ≠ 0 be a right eigenvector (perhaps complex) for µ. Taking the magnitude of each side of µx = [A]x, we get the following for each component i:

|µ| |x_i| = |Σ_j A_{ij} x_j| ≤ Σ_j A_{ij} |x_j|.    (4.25)

Let u = (|x_1|, |x_2|, ..., |x_M|), so (4.25) becomes |µ|u ≤ [A]u. Since u ≥ 0, u ≠ 0, it follows from the definition of g(x) that |µ| ≤ g(u). From (4.20), g(u) ≤ λ, so |µ| ≤ λ.

Next assume that |µ| = λ. From (4.25), then, λu ≤ [A]u, so u achieves the maximization in (4.20), and part 1 of the theorem asserts that λu = [A]u. This means that (4.25) is satisfied with equality, and it follows from this (see Exercise 4.11) that x = βu for some (perhaps complex) scalar β. Thus x is an eigenvector of [A] with eigenvalue λ, so µ = λ. Thus |µ| = λ is impossible for µ ≠ λ, so |µ| < λ for all eigenvalues µ ≠ λ.

Property 3: Let x be any eigenvector of λ. Property 2 showed that x = βu, where u_i = |x_i| for each i and u is a non-negative eigenvector of eigenvalue λ. Since ν > 0, we can choose α > 0 so that ν − αu ≥ 0 and ν_i − αu_i = 0 for some i. Now ν − αu is either identically 0 or else an eigenvector of eigenvalue λ, and thus strictly positive by the argument in property 1. Since ν_i − αu_i = 0 for some i, ν − αu = 0. Thus u and x are scalar multiples of ν, completing the proof.

Next we apply the results above to a more general type of non-negative matrix called an irreducible matrix. Recall that we analyzed the classes of a finite-state Markov chain in terms of a directed graph where the nodes represent the states of the chain and a directed arc goes from i to j if P_{ij} > 0.
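Before moving to irreducible matrices, the characterization of λ in (4.20) and (4.21) can be explored numerically: for [A] > 0, power iteration drives a positive starting vector toward ν, and evaluating g along the way gives lower bounds that rise to λ. A Python sketch (the 2 by 2 matrix is an arbitrary positive example, whose largest eigenvalue is (5 + √5)/2):

```python
def g(A, x):
    """g(x) = min_i (sum_j A_ij x_j) / x_i for a positive vector x; by (4.21)
    this is the largest a with a*x <= [A]x, so g(x) <= lambda for every such x."""
    return min(sum(a * xj for a, xj in zip(row, x)) / xi
               for row, xi in zip(A, x))

def perron(A, iters=200):
    """Power-iteration sketch: for [A] > 0 the normalized iterates approach the
    positive eigenvector nu, and g evaluated there approaches lambda."""
    x = [1.0] * len(A)
    for _ in range(iters):
        x = [sum(a * xj for a, xj in zip(row, x)) for row in A]  # x <- [A]x
        s = sum(x)
        x = [xi / s for xi in x]                                  # renormalize
    return g(A, x), x

A = [[2.0, 1.0],
     [1.0, 3.0]]
lam, nu = perron(A)
```

Here lam converges to (5 + √5)/2 ≈ 3.618 and nu to a strictly positive vector, in line with properties 1 and 2 of Theorem 4.5; any non-negative trial vector gives g(A, x) ≤ λ, matching (4.20).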
We can draw the same type of directed graph for an arbitrary non-negative matrix [A]; i.e., a directed arc goes from i to j if A_ij > 0.

Definition. An irreducible matrix is a non-negative matrix such that for every pair of nodes i, j in its graph, there is a walk from i to j.

For stochastic matrices, an irreducible matrix is thus the matrix of a recurrent Markov chain. If we denote the i, j element of [A]^n by A_ij^n, then we see that A_ij^n > 0 iff there is a walk of length n from i to j in the graph. If [A] is irreducible, a walk exists from any i to any j (including j = i) with length at most M, since the walk need visit each other node at most once. Thus A_ij^n > 0 for some n, 1 ≤ n ≤ M, and therefore Σ_{n=1}^M A_ij^n > 0. The key to analyzing irreducible matrices is then the fact that the matrix [B] = Σ_{n=1}^M [A]^n is strictly positive.
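This criterion is mechanical to check. The following sketch forms [B] = Σ_{n=1}^M [A]^n and tests it for strict positivity; the two small matrices are invented for illustration.

```python
import numpy as np

def is_irreducible(A):
    """Test irreducibility of a non-negative matrix [A] by checking
    whether [B] = sum_{n=1}^{M} [A]^n is strictly positive."""
    M = A.shape[0]
    B = np.zeros_like(A, dtype=float)
    An = np.eye(M)
    for _ in range(M):
        An = An @ A          # [A]^n for n = 1, ..., M
        B += An
    return bool((B > 0).all())

# A 3-node cycle: every node reaches every other node, so irreducible.
cycle = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]], dtype=float)
# A matrix with a trapping node: node 0 never reaches node 1.
reducible = np.array([[1.0, 0.0], [0.5, 0.5]])
print(is_irreducible(cycle), is_irreducible(reducible))  # True False
```

For a stochastic matrix this is exactly the test for a recurrent Markov chain, since the graphs coincide.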

Theorem 4.6 (Frobenius). Let [A] ≥ 0 be an M by M irreducible matrix and let λ be the supremum in (4.20) and (4.21). Then the supremum is achieved as a maximum at some vector ν, and the pair λ, ν have the following properties:

1. λν = [A]ν and ν > 0.
2. For any other eigenvalue µ of [A], |µ| ≤ λ.
3. If x satisfies λx = [A]x, then x = βν for some (possibly complex) number β.

Discussion: Note that this is almost the same as the Perron theorem, except that here [A] is irreducible (but not necessarily positive), and the magnitudes of the other eigenvalues need not be strictly less than λ. When we look at recurrent matrices of period d, we shall find that there are d − 1 other eigenvalues of magnitude equal to λ. Because of the possibility of other eigenvalues with the same magnitude as λ, we refer to λ as the largest real eigenvalue of [A].

Proof* Property 1: We first establish property 1 for a particular choice of λ and ν and then show that this choice satisfies the optimization problem in (4.20) and (4.21). Let [B] = Σ_{n=1}^M [A]^n > 0. Using Theorem 4.5, we let λ_B be the largest eigenvalue of [B] and let ν > 0 be the corresponding right eigenvector. Then [B]ν = λ_B ν. Also, since [B][A] = [A][B], we have [B]{[A]ν} = [A][B]ν = λ_B [A]ν. Thus [A]ν is a right eigenvector of [B] for the eigenvalue λ_B and thus equal to ν multiplied by some positive scale factor. Define this scale factor to be λ, so that [A]ν = λν and ν > 0.

We can relate λ to λ_B by [B]ν = Σ_{n=1}^M [A]^n ν = (λ + λ² + ··· + λ^M)ν. Thus λ_B = λ + λ² + ··· + λ^M. Next, for any non-zero x ≥ 0, let g > 0 be the largest number such that [A]x ≥ gx. Multiplying both sides of this by [A], we see that [A]²x ≥ g[A]x ≥ g²x. Similarly, [A]^i x ≥ g^i x for each i ≥ 1, so it follows that [B]x ≥ (g + g² + ··· + g^M)x. From the optimization property of λ_B in Theorem 4.5, this shows that λ_B ≥ g + g² + ··· + g^M. Since λ_B = λ + λ² + ··· + λ^M, we conclude that λ ≥ g, showing that λ, ν solve the optimization problem for [A] in (4.20) and (4.21).

Properties 2 and 3: The first half of the proof of property 2 in Theorem 4.5 applies here also, to show that |µ| ≤ λ for all eigenvalues µ of [A]. Finally, let x be an arbitrary vector satisfying [A]x = λx.
Then, from the argument above, x is also a right eigenvector of [B] with eigenvalue λ_B, so from Theorem 4.5, x must be a scalar multiple of ν, completing the proof.

Corollary 4.1. The largest real eigenvalue λ of an irreducible matrix [A] ≥ 0 has a positive left eigenvector π. π is the unique left eigenvector of λ (within a scale factor) and is the only non-negative non-zero vector (within a scale factor) that satisfies λπ ≤ π[A].

Proof: A left eigenvector π of [A] is a right eigenvector (transposed) of [A]ᵀ. The graph corresponding to [A]ᵀ is the same as that for [A] with all the arc directions reversed, so that all pairs of nodes still communicate and [A]ᵀ is irreducible. Since [A] and [A]ᵀ have the same eigenvalues, the corollary is just a restatement of the theorem.
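The objects in Theorem 4.6 and Corollary 4.1 are easy to approximate numerically. The sketch below (a hypothetical 2 by 2 positive, hence irreducible, matrix of my own) finds λ and ν by power iteration, and finds π by applying the same iteration to [A]ᵀ, mirroring the transpose argument in the proof of Corollary 4.1.

```python
import numpy as np

def perron(M, iters=200):
    """Power iteration: x <- Mx, renormalized. For an irreducible,
    aperiodic M this converges to the positive eigenvector of the
    largest real eigenvalue lambda."""
    x = np.ones(M.shape[0])
    for _ in range(iters):
        x = M @ x
        x /= x.sum()
    lam = float(np.mean(M @ x / x))  # all components agree at convergence
    return lam, x

A = np.array([[1.0, 2.0],
              [3.0, 1.0]])           # hypothetical irreducible matrix

lam, nu = perron(A)                  # lambda and right eigenvector nu > 0
_, pi = perron(A.T)                  # left eigenvector of A = right eigenvector of A^T
print(lam)                           # approximately 1 + sqrt(6) = 3.449...
```

For this matrix the eigenvalues are 1 ± √6, so the iteration settles on λ = 1 + √6, with both ν and π strictly positive as the theorem requires.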

Corollary 4.2. Let λ be the largest real eigenvalue of an irreducible matrix [A] and let the right and left eigenvectors of λ be ν > 0 and π > 0. Then, within a scale factor, ν is the only non-negative right eigenvector of [A] (i.e., no other eigenvalue has a non-negative eigenvector). Similarly, within a scale factor, π is the only non-negative left eigenvector of [A].

Proof: Theorem 4.6 asserts that ν is the unique right eigenvector (within a scale factor) of the largest real eigenvalue λ, so suppose that u is a non-negative right eigenvector of some other eigenvalue µ. Letting π be the left eigenvector of λ, we have π[A]u = λπu and also π[A]u = µπu; since µ ≠ λ, this forces πu = 0. Since π > 0, u cannot be non-negative and non-zero. The same argument shows the uniqueness of π.

Corollary 4.3. Let [P] be a stochastic irreducible matrix (i.e., the matrix of a recurrent Markov chain). Then λ = 1 is the largest real eigenvalue of [P], e = (1, 1, ..., 1)ᵀ is the right eigenvector of λ = 1, unique within a scale factor, and there is a unique probability vector π > 0 that is a left eigenvector of λ = 1.

Proof: Since each row of [P] adds up to 1, [P]e = e. Corollary 4.2 asserts the uniqueness of e and the fact that λ = 1 is the largest real eigenvalue, and Corollary 4.1 asserts the uniqueness of π.

The proof above shows that every stochastic matrix, whether irreducible or not, has an eigenvalue λ = 1 with e = (1, ..., 1)ᵀ as a right eigenvector. In general, a stochastic matrix with r recurrent classes has r independent non-negative right eigenvectors and r independent non-negative left eigenvectors; the left eigenvectors can be taken as the steady-state probability vectors within the r recurrent classes (see Exercise 4.14). The following corollary, proved in Exercise 4.13, extends Corollary 4.3 to unichains.

Corollary 4.4. Let [P] be the transition matrix of a unichain.
Then λ = 1 is the largest real eigenvalue of [P], e = (1, 1, ..., 1)ᵀ is the right eigenvector of λ = 1, unique within a scale factor, and there is a unique probability vector π ≥ 0 that is a left eigenvector of λ = 1; π_i > 0 for each recurrent state i and π_i = 0 for each transient state i.

Corollary 4.5. The largest real eigenvalue λ of an irreducible matrix [A] ≥ 0 is a strictly increasing function of each component of [A].

Proof: For a given irreducible [A], let [B] satisfy [B] ≥ [A], [B] ≠ [A]. Let λ be the largest real eigenvalue of [A] and ν > 0 be the corresponding right eigenvector. Then λν = [A]ν ≤ [B]ν, but λν ≠ [B]ν. Let µ be the largest real eigenvalue of [B], which is also irreducible. If µ ≤ λ, then µν ≤ λν ≤ [B]ν, and µν ≠ [B]ν, which is a contradiction of property 1 in Theorem 4.6. Thus, µ > λ.

We are now ready to study the asymptotic behavior of [A]^n. The simplest and cleanest result holds for [A] > 0. We establish this in the following corollary and then look at the case of greatest importance, that of a stochastic matrix for an ergodic Markov chain. More general cases are treated in Exercises 4.13 and 4.14.
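The conclusions of Corollary 4.3 can be checked numerically for a small recurrent chain. In this sketch the 3-state transition matrix is invented for illustration, and π is read off from the eigendecomposition of [P]ᵀ.

```python
import numpy as np

# Hypothetical irreducible (recurrent) 3-state chain.
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])

# lambda = 1 with right eigenvector e = (1, 1, 1)^T, since rows sum to 1.
e = np.ones(3)
print(np.allclose(P @ e, e))          # True

# Left eigenvector of lambda = 1: eigenvector of P^T, rescaled to a
# probability vector (its components all come out with the same sign).
w, V = np.linalg.eig(P.T)
pi = np.real(V[:, np.argmax(np.real(w))])
pi = pi / pi.sum()
print(pi)                             # the unique steady-state vector
```

This particular [P] happens to be doubly stochastic, so π comes out as (1/3, 1/3, 1/3); for a general recurrent chain π is simply whatever positive probability vector solves π[P] = π.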

Corollary 4.6. Let λ be the largest eigenvalue of [A] > 0 and let π and ν be the positive left and right eigenvectors of λ, normalized so that πν = 1. Then

lim_{n→∞} [A]^n / λ^n = νπ.   (4.26)

Proof*: Since ν > 0 is a column vector and π > 0 is a row vector, νπ is a positive matrix of the same dimension as [A]. Since [A] > 0, we can define a matrix [B] = [A] − ανπ which is positive for small enough α > 0. Note that π and ν are left and right eigenvectors of [B] with eigenvalue µ = λ − α. We then have [B]^n ν = µ^n ν, which when pre-multiplied by π yields

(λ − α)^n = π[B]^n ν = Σ_i Σ_j π_i B_ij^n ν_j,

where B_ij^n is the i, j element of [B]^n. Since each term in the above summation is positive, we have (λ − α)^n ≥ π_i B_ij^n ν_j, and therefore B_ij^n ≤ (λ − α)^n / (π_i ν_j). Thus, for each i, j, lim_{n→∞} B_ij^n / λ^n = 0, and therefore lim_{n→∞} [B]^n / λ^n = 0.

Next we use a convenient matrix identity: for any eigenvalue λ of a matrix [A], and any corresponding right and left eigenvectors ν and π, normalized so that πν = 1, we have {[A] − λνπ}^n = [A]^n − λ^n νπ (see Exercise 4.12). Applying the same identity to [B], we have {[B] − µνπ}^n = [B]^n − µ^n νπ. Finally, since [B] = [A] − ανπ, we have [B] − µνπ = [A] − λνπ, so that

[A]^n − λ^n νπ = [B]^n − µ^n νπ.   (4.27)

Dividing both sides of (4.27) by λ^n and taking the limit as n → ∞, the right hand side goes to 0, completing the proof.

Note that for a stochastic matrix [P] > 0, this corollary simplifies to lim_{n→∞} [P]^n = eπ. This means that lim_{n→∞} P_ij^n = π_j, i.e., the probability of being in state j after a long time is π_j, independent of the starting state.

Theorem 4.7. Let [P] be the transition matrix of an ergodic finite-state Markov chain. Then λ = 1 is the largest real eigenvalue of [P], and λ > |µ| for every other eigenvalue µ. Furthermore, lim_{n→∞} [P]^n = eπ, where π > 0 is the unique probability vector satisfying π[P] = π and e = (1, 1, ..., 1)ᵀ is the unique vector ν (within a scale factor) satisfying [P]ν = ν.
Proof: From Corollary 4.3, λ = 1 is the largest real eigenvalue of [P], e is the unique (within a scale factor) right eigenvector of λ = 1, and there is a unique probability vector π such that π[P] = π. From Theorem 4.4, [P]^m is positive for sufficiently large m. Since [P]^m is also stochastic, λ = 1 is strictly larger than the magnitude of any other eigenvalue of [P]^m. Let µ be any other eigenvalue of [P] and let x be a right eigenvector of µ. Note that x is also a right eigenvector of [P]^m with eigenvalue µ^m. Since λ = 1 is the only eigenvalue of [P]^m of magnitude 1 or more, we either have |µ| < λ or µ^m = λ. If µ^m = λ, then x must be a scalar times e. This is impossible, since x cannot be an eigenvector of [P] with both eigenvalue λ and µ. Thus |µ| < λ. Similarly, π > 0 is the unique left eigenvector of [P]^m with eigenvalue λ = 1, and πe = 1. Corollary 4.6 then asserts that

lim_{n→∞} [P]^{mn} = eπ. Multiplying by [P]^i for any i, 1 ≤ i < m, we get lim_{n→∞} [P]^{mn+i} = eπ, so lim_{n→∞} [P]^n = eπ.

Theorem 4.7 generalizes easily to an ergodic unichain (see Exercise 4.15). In this case, as one might suspect, π_i = 0 for each transient state i and π_i > 0 within the ergodic class. Theorem 4.7 becomes:

Theorem 4.8. Let [P] be the transition matrix of an ergodic unichain. Then λ = 1 is the largest real eigenvalue of [P], and λ > |µ| for every other eigenvalue µ. Furthermore,

lim_{m→∞} [P]^m = eπ,   (4.28)

where π ≥ 0 is the unique probability vector satisfying π[P] = π and e = (1, 1, ..., 1)ᵀ is the unique vector ν (within a scale factor) satisfying [P]ν = ν.

If a chain has a periodic recurrent class, [P]^m never converges. The existence of a unique probability vector solution to π[P] = π for a periodic recurrent chain is somewhat mystifying at first. If the period is d, then the steady-state vector π assigns probability 1/d to each of the d subsets of Theorem 4.3. If the initial probabilities for the chain are chosen as Pr{X_0 = i} = π_i for each i, then for each subsequent time n, Pr{X_n = i} = π_i. What is happening is that this initial probability assignment starts the chain in each of the d subsets with probability 1/d, and subsequent transitions maintain this randomness over subsets. On the other hand, [P]^n cannot converge because P_ii^n, for each i, is zero except when n is a multiple of d. Thus the memory of the starting state never dies out. An ergodic Markov chain does not have this peculiar property, and the memory of the starting state dies out (from Theorem 4.7).

The intuition to be associated with the word ergodic is that of a process in which time averages are equal to ensemble averages. Using the general definition of ergodicity (which is beyond our scope here), a periodic recurrent Markov chain in steady state (i.e., with Pr{X_n = i} = π_i for all n and i) is ergodic. Thus the notion of ergodicity for Markov chains is slightly different from that in the general theory.
The difference is that we think of a Markov chain as being specified without specifying the initial state distribution, and thus different initial state distributions really correspond to different stochastic processes. If a periodic Markov chain starts in steady state, then the corresponding stochastic process is stationary, and otherwise not.

4.5 Markov chains with rewards

Suppose that each state i in a Markov chain is associated with some reward, r_i. As the Markov chain proceeds from state to state, there is an associated sequence of rewards that are not independent, but are related by the statistics of the Markov chain. The situation is similar to, but simpler than, that of renewal-reward processes. As with renewal-reward processes, the reward r_i could equally well be a cost or an arbitrary real-valued function of the state. In this section, the expected value of the aggregate reward over time is analyzed.
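As a concrete illustration of the setup (the 2-state chain and its rewards are invented here), one can simulate the reward sequence r_{X_0}, r_{X_1}, ... and compare its time average with the steady-state average Σ_i π_i r_i; for this chain π = (0.8, 0.2), so that average is 1.36.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-state chain and per-state rewards r_i.
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
r = np.array([1.4, 1.2])

# Simulate X_0, X_1, ... and accumulate the reward sequence r_{X_n}.
n_steps = 100_000
state = 0
total = 0.0
for _ in range(n_steps):
    total += r[state]
    state = rng.choice(2, p=P[state])
print(total / n_steps)   # time-average reward, close to 1.36 for this chain
```

The rewards along the path are dependent, exactly as the text describes, yet their long-run average still settles at the steady-state value.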

The model of Markov chains with rewards is surprisingly broad. We have already seen that almost any stochastic process can be approximated by a Markov chain. Also, as we saw in studying renewal theory, the concept of rewards is quite graphic not only in modeling such things as corporate profits or portfolio performance, but also for studying residual life, queueing delay, and many other phenomena.

In Section 4.6, we shall study Markov decision theory, or dynamic programming. This can be viewed as a generalization of Markov chains with rewards in the sense that there is a decision maker or policy maker who in each state can choose between several different policies; for each policy, there is a given set of transition probabilities to the next state and a given expected reward for the current state. Thus the decision maker must make a compromise between the expected reward of a given policy in the current state (i.e., the immediate reward) and the long-term benefit from the next state to be entered. This is a much more challenging problem than the current study of Markov chains with rewards, but a thorough understanding of the current problem provides the machinery to understand Markov decision theory also.

Frequently it is more natural to associate rewards with transitions rather than states. If r_ij denotes the reward associated with a transition from i to j and P_ij denotes the corresponding transition probability, then r_i = Σ_j P_ij r_ij is the expected reward associated with a transition from state i. Since we analyze only expected rewards here, and since the effect of the transition rewards r_ij is summarized in the state rewards r_i = Σ_j P_ij r_ij, we henceforth ignore transition rewards and consider only state rewards. The steady-state expected reward per unit time, assuming a single recurrent class of states, is easily seen to be g = Σ_i π_i r_i, where π_i is the steady-state probability of being in state i. The following examples demonstrate that it is also important to understand the transient behavior of rewards.
This transient behavior will turn out to be even more important when we study Markov decision theory and dynamic programming.

Example (Expected first-passage time). A common problem when dealing with Markov chains is that of finding the expected number of steps, starting in some initial state, before some given final state is entered. Since the answer to this problem does not depend on what happens after the given final state is entered, we can modify the chain to convert the given final state, say state 1, into a trapping state (a trapping state i is a state from which there is no exit, i.e., for which P_ii = 1). That is, we set P_11 = 1, P_1j = 0 for all j ≠ 1, and leave P_ij unchanged for all i ≠ 1 and all j (see Figure 4.5).

Figure 4.5: The conversion of a four-state Markov chain into a chain for which state 1 is a trapping state. Note that the outgoing arcs from node 1 have been removed.

Let v_i be the expected number of steps to reach state 1 starting in state i ≠ 1. This number

of steps includes the first step plus the expected number of steps from whatever state is entered next (which is 0 if state 1 is entered next). Thus, for the chain in Figure 4.5, we have the equations

v_2 = 1 + P_23 v_3 + P_24 v_4
v_3 = 1 + P_32 v_2 + P_33 v_3 + P_34 v_4
v_4 = 1 + P_42 v_2 + P_43 v_3.

For an arbitrary chain of M states where 1 is a trapping state and all other states are transient, this set of equations becomes

v_i = 1 + Σ_{j≠1} P_ij v_j;   i ≠ 1.   (4.29)

If we define r_i = 1 for i ≠ 1 and r_i = 0 for i = 1, then r_i is a unit reward for not yet entering the trapping state, and v_i is the expected aggregate reward before entering the trapping state. Thus by taking r_1 = 0, the reward ceases upon entering the trapping state, and v_i is the expected transient reward, i.e., the expected first-passage time from state i to state 1. Note that in this example, rewards occur only in transient states. Since transient states have zero steady-state probabilities, the steady-state gain per unit time, g = Σ_i π_i r_i, is 0.

If we define v_1 = 0, then (4.29), along with v_1 = 0, has the vector form

v = r + [P]v;   v_1 = 0.   (4.30)

For a Markov chain with M states, (4.29) is a set of M − 1 equations in the M − 1 variables v_2 to v_M. The equation v = r + [P]v is a set of M linear equations, of which the first is the vacuous equation v_1 = 0 + v_1, and, with v_1 = 0, the last M − 1 correspond to (4.29). It is not hard to show that (4.30) has a unique solution for v under the condition that states 2 to M are all transient states and 1 is a trapping state, but we prove this later, in Lemma 4.1, under more general circumstances.

Example. Assume that a Markov chain has M states, {0, 1, ..., M − 1}, and that the state represents the number of customers in an integer-time queueing system. Suppose we wish to find the expected sum of the times all customers spend in the system, starting at an integer time where i customers are in the system, and ending at the first instant when the system becomes idle.
From our discussion of Little's theorem in Section 3.6, we know that this sum of times is equal to the sum, over each integer time from the initial time with i customers to the final time when the system becomes empty, of the number of customers in the system. As in the previous example, we modify the Markov chain to make state 0 a trapping state. We take r_i = i as the reward in state i, and v_i as the expected aggregate reward until the trapping state is entered. Using the same reasoning as in the previous example, v_i is equal to the immediate reward r_i = i plus the expected reward from whatever state is entered next. Thus v_i = r_i + Σ_{j≥1} P_ij v_j. With v_0 = 0, this is v = r + [P]v. This has a unique solution for v, as will be shown later in Lemma 4.1. This same analysis is valid for any choice of reward r_i for each transient state i; the reward in the trapping state must be 0 so as to keep the expected aggregate reward finite.
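Both examples come down to solving v = r + [P]v with the trapping-state component of v fixed at 0, which over the transient states is the linear system (I − Q)v = r, where Q is the transient-to-transient block of [P]. A sketch with an invented 4-state chain (state 1 trapping, unit rewards as in the first-passage example):

```python
import numpy as np

# Hypothetical chain on states 1..4; state 1 is trapping (P_11 = 1).
P = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.3, 0.0, 0.4, 0.3],
              [0.2, 0.3, 0.2, 0.3],
              [0.5, 0.2, 0.3, 0.0]])

# (4.29)/(4.30): with v_1 = 0 and r_i = 1 on the transient states,
# v = r + [P]v reduces to (I - Q) v = r over states 2..4.
Q = P[1:, 1:]                 # transient-to-transient block of [P]
r = np.ones(3)                # unit reward per step not yet trapped
v = np.linalg.solve(np.eye(3) - Q, r)
print(v)                      # expected first-passage times from states 2, 3, 4
```

Because states 2 to 4 are transient, Q is substochastic and I − Q is invertible, which is the uniqueness claim deferred to Lemma 4.1; the queueing example is identical except that r_i = i replaces the unit rewards.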


More information

REAL ANALYSIS I HOMEWORK 1

REAL ANALYSIS I HOMEWORK 1 REAL ANALYSIS I HOMEWORK CİHAN BAHRAN The questons are from Tao s text. Exercse 0.0.. If (x α ) α A s a collecton of numbers x α [0, + ] such that x α

More information

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017

U.C. Berkeley CS294: Beyond Worst-Case Analysis Luca Trevisan September 5, 2017 U.C. Berkeley CS94: Beyond Worst-Case Analyss Handout 4s Luca Trevsan September 5, 07 Summary of Lecture 4 In whch we ntroduce semdefnte programmng and apply t to Max Cut. Semdefnte Programmng Recall that

More information

Section 8.3 Polar Form of Complex Numbers

Section 8.3 Polar Form of Complex Numbers 80 Chapter 8 Secton 8 Polar Form of Complex Numbers From prevous classes, you may have encountered magnary numbers the square roots of negatve numbers and, more generally, complex numbers whch are the

More information

LECTURE 9 CANONICAL CORRELATION ANALYSIS

LECTURE 9 CANONICAL CORRELATION ANALYSIS LECURE 9 CANONICAL CORRELAION ANALYSIS Introducton he concept of canoncal correlaton arses when we want to quantfy the assocatons between two sets of varables. For example, suppose that the frst set of

More information

C/CS/Phy191 Problem Set 3 Solutions Out: Oct 1, 2008., where ( 00. ), so the overall state of the system is ) ( ( ( ( 00 ± 11 ), Φ ± = 1

C/CS/Phy191 Problem Set 3 Solutions Out: Oct 1, 2008., where ( 00. ), so the overall state of the system is ) ( ( ( ( 00 ± 11 ), Φ ± = 1 C/CS/Phy9 Problem Set 3 Solutons Out: Oct, 8 Suppose you have two qubts n some arbtrary entangled state ψ You apply the teleportaton protocol to each of the qubts separately What s the resultng state obtaned

More information

Maximizing the number of nonnegative subsets

Maximizing the number of nonnegative subsets Maxmzng the number of nonnegatve subsets Noga Alon Hao Huang December 1, 213 Abstract Gven a set of n real numbers, f the sum of elements of every subset of sze larger than k s negatve, what s the maxmum

More information

The Second Anti-Mathima on Game Theory

The Second Anti-Mathima on Game Theory The Second Ant-Mathma on Game Theory Ath. Kehagas December 1 2006 1 Introducton In ths note we wll examne the noton of game equlbrum for three types of games 1. 2-player 2-acton zero-sum games 2. 2-player

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

COMPLEX NUMBERS AND QUADRATIC EQUATIONS

COMPLEX NUMBERS AND QUADRATIC EQUATIONS COMPLEX NUMBERS AND QUADRATIC EQUATIONS INTRODUCTION We know that x 0 for all x R e the square of a real number (whether postve, negatve or ero) s non-negatve Hence the equatons x, x, x + 7 0 etc are not

More information

Linear, affine, and convex sets and hulls In the sequel, unless otherwise specified, X will denote a real vector space.

Linear, affine, and convex sets and hulls In the sequel, unless otherwise specified, X will denote a real vector space. Lnear, affne, and convex sets and hulls In the sequel, unless otherwse specfed, X wll denote a real vector space. Lnes and segments. Gven two ponts x, y X, we defne xy = {x + t(y x) : t R} = {(1 t)x +

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

763622S ADVANCED QUANTUM MECHANICS Solution Set 1 Spring c n a n. c n 2 = 1.

763622S ADVANCED QUANTUM MECHANICS Solution Set 1 Spring c n a n. c n 2 = 1. 7636S ADVANCED QUANTUM MECHANICS Soluton Set 1 Sprng 013 1 Warm-up Show that the egenvalues of a Hermtan operator  are real and that the egenkets correspondng to dfferent egenvalues are orthogonal (b)

More information

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009 College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmal-dual schema Network Desgn:

More information

The equation of motion of a dynamical system is given by a set of differential equations. That is (1)

The equation of motion of a dynamical system is given by a set of differential equations. That is (1) Dynamcal Systems Many engneerng and natural systems are dynamcal systems. For example a pendulum s a dynamcal system. State l The state of the dynamcal system specfes t condtons. For a pendulum n the absence

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

n α j x j = 0 j=1 has a nontrivial solution. Here A is the n k matrix whose jth column is the vector for all t j=0

n α j x j = 0 j=1 has a nontrivial solution. Here A is the n k matrix whose jth column is the vector for all t j=0 MODULE 2 Topcs: Lnear ndependence, bass and dmenson We have seen that f n a set of vectors one vector s a lnear combnaton of the remanng vectors n the set then the span of the set s unchanged f that vector

More information

SL n (F ) Equals its Own Derived Group

SL n (F ) Equals its Own Derived Group Internatonal Journal of Algebra, Vol. 2, 2008, no. 12, 585-594 SL n (F ) Equals ts Own Derved Group Jorge Macel BMCC-The Cty Unversty of New York, CUNY 199 Chambers street, New York, NY 10007, USA macel@cms.nyu.edu

More information

2 More examples with details

2 More examples with details Physcs 129b Lecture 3 Caltech, 01/15/19 2 More examples wth detals 2.3 The permutaton group n = 4 S 4 contans 4! = 24 elements. One s the dentty e. Sx of them are exchange of two objects (, j) ( to j and

More information

Affine transformations and convexity

Affine transformations and convexity Affne transformatons and convexty The purpose of ths document s to prove some basc propertes of affne transformatons nvolvng convex sets. Here are a few onlne references for background nformaton: http://math.ucr.edu/

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have

More information

Additional Codes using Finite Difference Method. 1 HJB Equation for Consumption-Saving Problem Without Uncertainty

Additional Codes using Finite Difference Method. 1 HJB Equation for Consumption-Saving Problem Without Uncertainty Addtonal Codes usng Fnte Dfference Method Benamn Moll 1 HJB Equaton for Consumpton-Savng Problem Wthout Uncertanty Before consderng the case wth stochastc ncome n http://www.prnceton.edu/~moll/ HACTproect/HACT_Numercal_Appendx.pdf,

More information

Problem Solving in Math (Math 43900) Fall 2013

Problem Solving in Math (Math 43900) Fall 2013 Problem Solvng n Math (Math 43900) Fall 2013 Week four (September 17) solutons Instructor: Davd Galvn 1. Let a and b be two nteger for whch a b s dvsble by 3. Prove that a 3 b 3 s dvsble by 9. Soluton:

More information

Week 2. This week, we covered operations on sets and cardinality.

Week 2. This week, we covered operations on sets and cardinality. Week 2 Ths week, we covered operatons on sets and cardnalty. Defnton 0.1 (Correspondence). A correspondence between two sets A and B s a set S contaned n A B = {(a, b) a A, b B}. A correspondence from

More information

DIFFERENTIAL FORMS BRIAN OSSERMAN

DIFFERENTIAL FORMS BRIAN OSSERMAN DIFFERENTIAL FORMS BRIAN OSSERMAN Dfferentals are an mportant topc n algebrac geometry, allowng the use of some classcal geometrc arguments n the context of varetes over any feld. We wll use them to defne

More information

BOUNDEDNESS OF THE RIESZ TRANSFORM WITH MATRIX A 2 WEIGHTS

BOUNDEDNESS OF THE RIESZ TRANSFORM WITH MATRIX A 2 WEIGHTS BOUNDEDNESS OF THE IESZ TANSFOM WITH MATIX A WEIGHTS Introducton Let L = L ( n, be the functon space wth norm (ˆ f L = f(x C dx d < For a d d matrx valued functon W : wth W (x postve sem-defnte for all

More information

Edge Isoperimetric Inequalities

Edge Isoperimetric Inequalities November 7, 2005 Ross M. Rchardson Edge Isopermetrc Inequaltes 1 Four Questons Recall that n the last lecture we looked at the problem of sopermetrc nequaltes n the hypercube, Q n. Our noton of boundary

More information

U.C. Berkeley CS278: Computational Complexity Professor Luca Trevisan 2/21/2008. Notes for Lecture 8

U.C. Berkeley CS278: Computational Complexity Professor Luca Trevisan 2/21/2008. Notes for Lecture 8 U.C. Berkeley CS278: Computatonal Complexty Handout N8 Professor Luca Trevsan 2/21/2008 Notes for Lecture 8 1 Undrected Connectvty In the undrected s t connectvty problem (abbrevated ST-UCONN) we are gven

More information

THE SUMMATION NOTATION Ʃ

THE SUMMATION NOTATION Ʃ Sngle Subscrpt otaton THE SUMMATIO OTATIO Ʃ Most of the calculatons we perform n statstcs are repettve operatons on lsts of numbers. For example, we compute the sum of a set of numbers, or the sum of the

More information

Lecture Space-Bounded Derandomization

Lecture Space-Bounded Derandomization Notes on Complexty Theory Last updated: October, 2008 Jonathan Katz Lecture Space-Bounded Derandomzaton 1 Space-Bounded Derandomzaton We now dscuss derandomzaton of space-bounded algorthms. Here non-trval

More information

The Feynman path integral

The Feynman path integral The Feynman path ntegral Aprl 3, 205 Hesenberg and Schrödnger pctures The Schrödnger wave functon places the tme dependence of a physcal system n the state, ψ, t, where the state s a vector n Hlbert space

More information

Exercise Solutions to Real Analysis

Exercise Solutions to Real Analysis xercse Solutons to Real Analyss Note: References refer to H. L. Royden, Real Analyss xersze 1. Gven any set A any ɛ > 0, there s an open set O such that A O m O m A + ɛ. Soluton 1. If m A =, then there

More information

Prof. Dr. I. Nasser Phys 630, T Aug-15 One_dimensional_Ising_Model

Prof. Dr. I. Nasser Phys 630, T Aug-15 One_dimensional_Ising_Model EXACT OE-DIMESIOAL ISIG MODEL The one-dmensonal Isng model conssts of a chan of spns, each spn nteractng only wth ts two nearest neghbors. The smple Isng problem n one dmenson can be solved drectly n several

More information

8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS

8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS SECTION 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS 493 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS All the vector spaces you have studed thus far n the text are real vector spaces because the scalars

More information

PHYS 705: Classical Mechanics. Calculus of Variations II

PHYS 705: Classical Mechanics. Calculus of Variations II 1 PHYS 705: Classcal Mechancs Calculus of Varatons II 2 Calculus of Varatons: Generalzaton (no constrant yet) Suppose now that F depends on several dependent varables : We need to fnd such that has a statonary

More information

Economics 101. Lecture 4 - Equilibrium and Efficiency

Economics 101. Lecture 4 - Equilibrium and Efficiency Economcs 0 Lecture 4 - Equlbrum and Effcency Intro As dscussed n the prevous lecture, we wll now move from an envronment where we looed at consumers mang decsons n solaton to analyzng economes full of

More information

Assortment Optimization under MNL

Assortment Optimization under MNL Assortment Optmzaton under MNL Haotan Song Aprl 30, 2017 1 Introducton The assortment optmzaton problem ams to fnd the revenue-maxmzng assortment of products to offer when the prces of products are fxed.

More information

STAT 309: MATHEMATICAL COMPUTATIONS I FALL 2018 LECTURE 16

STAT 309: MATHEMATICAL COMPUTATIONS I FALL 2018 LECTURE 16 STAT 39: MATHEMATICAL COMPUTATIONS I FALL 218 LECTURE 16 1 why teratve methods f we have a lnear system Ax = b where A s very, very large but s ether sparse or structured (eg, banded, Toepltz, banded plus

More information

= = = (a) Use the MATLAB command rref to solve the system. (b) Let A be the coefficient matrix and B be the right-hand side of the system.

= = = (a) Use the MATLAB command rref to solve the system. (b) Let A be the coefficient matrix and B be the right-hand side of the system. Chapter Matlab Exercses Chapter Matlab Exercses. Consder the lnear system of Example n Secton.. x x x y z y y z (a) Use the MATLAB command rref to solve the system. (b) Let A be the coeffcent matrx and

More information

Norm Bounds for a Transformed Activity Level. Vector in Sraffian Systems: A Dual Exercise

Norm Bounds for a Transformed Activity Level. Vector in Sraffian Systems: A Dual Exercise ppled Mathematcal Scences, Vol. 4, 200, no. 60, 2955-296 Norm Bounds for a ransformed ctvty Level Vector n Sraffan Systems: Dual Exercse Nkolaos Rodousaks Department of Publc dmnstraton, Panteon Unversty

More information

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve

More information

Module 2. Random Processes. Version 2 ECE IIT, Kharagpur

Module 2. Random Processes. Version 2 ECE IIT, Kharagpur Module Random Processes Lesson 6 Functons of Random Varables After readng ths lesson, ou wll learn about cdf of functon of a random varable. Formula for determnng the pdf of a random varable. Let, X be

More information

Notes on Frequency Estimation in Data Streams

Notes on Frequency Estimation in Data Streams Notes on Frequency Estmaton n Data Streams In (one of) the data streamng model(s), the data s a sequence of arrvals a 1, a 2,..., a m of the form a j = (, v) where s the dentty of the tem and belongs to

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

Ballot Paths Avoiding Depth Zero Patterns

Ballot Paths Avoiding Depth Zero Patterns Ballot Paths Avodng Depth Zero Patterns Henrch Nederhausen and Shaun Sullvan Florda Atlantc Unversty, Boca Raton, Florda nederha@fauedu, ssull21@fauedu 1 Introducton In a paper by Sapounaks, Tasoulas,

More information

Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

Representation theory and quantum mechanics tutorial Representation theory and quantum conservation laws

Representation theory and quantum mechanics tutorial Representation theory and quantum conservation laws Representaton theory and quantum mechancs tutoral Representaton theory and quantum conservaton laws Justn Campbell August 1, 2017 1 Generaltes on representaton theory 1.1 Let G GL m (R) be a real algebrac

More information

Spectral graph theory: Applications of Courant-Fischer

Spectral graph theory: Applications of Courant-Fischer Spectral graph theory: Applcatons of Courant-Fscher Steve Butler September 2006 Abstract In ths second talk we wll ntroduce the Raylegh quotent and the Courant- Fscher Theorem and gve some applcatons for

More information

1 Generating functions, continued

1 Generating functions, continued Generatng functons, contnued. Generatng functons and parttons We can make use of generatng functons to answer some questons a bt more restrctve than we ve done so far: Queston : Fnd a generatng functon

More information

Text S1: Detailed proofs for The time scale of evolutionary innovation

Text S1: Detailed proofs for The time scale of evolutionary innovation Text S: Detaled proofs for The tme scale of evolutonary nnovaton Krshnendu Chatterjee Andreas Pavloganns Ben Adlam Martn A. Nowak. Overvew and Organzaton We wll present detaled proofs of all our results.

More information

Learning Theory: Lecture Notes

Learning Theory: Lecture Notes Learnng Theory: Lecture Notes Lecturer: Kamalka Chaudhur Scrbe: Qush Wang October 27, 2012 1 The Agnostc PAC Model Recall that one of the constrants of the PAC model s that the data dstrbuton has to be

More information

Introductory Cardinality Theory Alan Kaylor Cline

Introductory Cardinality Theory Alan Kaylor Cline Introductory Cardnalty Theory lan Kaylor Clne lthough by name the theory of set cardnalty may seem to be an offshoot of combnatorcs, the central nterest s actually nfnte sets. Combnatorcs deals wth fnte

More information

Supplement: Proofs and Technical Details for The Solution Path of the Generalized Lasso

Supplement: Proofs and Technical Details for The Solution Path of the Generalized Lasso Supplement: Proofs and Techncal Detals for The Soluton Path of the Generalzed Lasso Ryan J. Tbshran Jonathan Taylor In ths document we gve supplementary detals to the paper The Soluton Path of the Generalzed

More information