SIAM J. Matrix Anal. Appl., Vol. 15, No. 3, pp. 715-728, July 1994
(c) 1994 Society for Industrial and Applied Mathematics

SENSITIVITY OF THE STATIONARY DISTRIBUTION OF A MARKOV CHAIN*

CARL D. MEYER†

Abstract. It is well known that if the transition matrix of an irreducible Markov chain of moderate size has a subdominant eigenvalue which is close to 1, then the chain is ill conditioned in the sense that there are stationary probabilities which are sensitive to perturbations in the transition probabilities. However, the converse of this statement has heretofore been unresolved. The purpose of this article is to address this issue by establishing upper and lower bounds on the condition number of the chain such that the bounding terms are functions of the eigenvalues of the transition matrix. Furthermore, it is demonstrated how to obtain estimates for the condition number of an irreducible chain with little or no extra computational effort over that required to compute the stationary probabilities by means of an LU or QR factorization.

Key words. Markov chains, stationary distribution, stochastic matrix, sensitivity analysis, perturbation theory, character of a Markov chain, condition numbers

AMS subject classifications. 65U05, 65F35, 60J10, 60J20, 15A51, 15A12, 15A18

1. Introduction. The problem under consideration is that of analyzing the effects of small perturbations to the transition probabilities of a finite, irreducible, homogeneous Markov chain. More precisely, if $P_{n\times n}$ is the transition probability matrix for such a chain, and if $\pi^T = (\pi_1, \pi_2, \ldots, \pi_n)$ is the stationary distribution vector satisfying $\pi^T P = \pi^T$ and $\sum_{i=1}^n \pi_i = 1$, the goal is to describe the effect on $\pi^T$ when $P$ is perturbed by a matrix $E$ such that $\tilde P = P + E$ is the transition probability matrix of another irreducible Markov chain.

Schweitzer (1968) provided the first perturbation analysis in terms of Kemeny and Snell's fundamental matrix $Z = (A + e\pi^T)^{-1}$, in which $A = I - P$ and $e$ is a column of 1's. If $A^{\#}$ denotes the group inverse of $A$ [Meyer (1975) or Campbell and Meyer (1991)], then $Z = (A + e\pi^T)^{-1} = A^{\#} + e\pi^T$. But in virtually all applications involving $Z$, the term $e\pi^T$ is redundant; i.e., all relevant information is contained in $A^{\#}$. In particular, if $\tilde\pi^T = (\tilde\pi_1, \tilde\pi_2, \ldots, \tilde\pi_n)$ is the stationary distribution for $\tilde P = P + E$, then

(1.1)    $\tilde\pi^T = \pi^T\big(I + EA^{\#}\big)^{-1}$

and

(1.2)    $\|\tilde\pi^T - \pi^T\| \le \|E\|\,\|A^{\#}\|$,

in which $\|\cdot\|$ can be either the 1-, 2-, or $\infty$-norm. If the $j$th column and the $(i,j)$-entry of $A^{\#}$ are denoted by $A^{\#}_{*j}$ and $a^{\#}_{ij}$, respectively, then

(1.3)    $|\tilde\pi_j - \pi_j| \le \|E\|\,\|A^{\#}_{*j}\|$.

*Received by the editors April 6, 1992; accepted for publication (in revised form) October 30, 1992. This work was supported in part by National Science Foundation grants DMS-9020915 and DDM-8906248.
†North Carolina State University, Mathematics Department, Raleigh, North Carolina 27695-8205 (meyer@math.ncsu.edu).
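These identities are easy to exercise numerically. The following sketch is an illustration added to this transcription, not part of the paper: it builds $A^{\#}$ from the fundamental matrix $Z = (A + e\pi^T)^{-1}$ exactly as described above and checks (1.1) and (1.2); the 3-state matrix P and the zero-row-sum perturbation E are arbitrary choices, and NumPy is assumed.

```python
import numpy as np

# A small irreducible transition matrix (an arbitrary illustration).
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.8, 0.1],
              [0.2, 0.2, 0.6]])
n = P.shape[0]
A = np.eye(n) - P

# Stationary distribution: left eigenvector of P for the eigenvalue 1.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
pi /= pi.sum()

# Group inverse via the fundamental matrix: Z = (A + e*pi^T)^{-1}, A# = Z - e*pi^T.
e = np.ones(n)
Z = np.linalg.inv(A + np.outer(e, pi))
Ag = Z - np.outer(e, pi)

# Perturb P by a matrix E with zero row sums, so P + E is again stochastic.
E = 1e-3 * np.array([[ 1.0, -1.0,  0.0],
                     [ 0.0,  1.0, -1.0],
                     [-1.0,  0.0,  1.0]])
vals2, vecs2 = np.linalg.eig((P + E).T)
pit = np.real(vecs2[:, np.argmin(np.abs(vals2 - 1.0))])
pit /= pit.sum()

print(np.allclose(pit, pi @ np.linalg.inv(np.eye(n) + E @ Ag)))   # (1.1): True
lhs = np.linalg.norm(pit - pi, np.inf)
rhs = np.linalg.norm(E, np.inf) * np.linalg.norm(Ag, np.inf)
print(lhs <= rhs)                                                 # (1.2): True
```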

In addition,

(1.4)    $\max_j |\tilde\pi_j - \pi_j| \le \|E\|_\infty \max_{i,j}|a^{\#}_{ij}|$.

This bound is about as good as possible; see Ipsen and Meyer (1994) for a discussion of optimal bounds. Moreover, if the transition probabilities are analytic functions of a parameter $t$ so that $P = P(t)$, then

(1.5)    $\frac{d\pi^T}{dt} = \pi^T\frac{dP}{dt}A^{\#}$ and $\frac{d\pi_j}{dt} = \pi^T\frac{dP}{dt}A^{\#}_{*j}$.

The results (1.1) and (1.2) are due to Meyer (1980), and (1.3) appears in Golub and Meyer (1986). The inequality (1.4) was given by Funderlic and Meyer (1986), and the formulas (1.5) are derived in Golub and Meyer (1986) and Meyer and Stewart (1988). Seneta (1991) established an inequality similar to (1.2) using the coefficient of ergodicity $\tau_1(A^{\#})$ in place of $\|A^{\#}\|$.

These facts make it absolutely clear that the entries in $A^{\#}$ determine the extent to which $\pi^T$ is sensitive to small changes in $P$, so, on the basis of (1.4), it is natural to adopt the following definition of Funderlic and Meyer (1986).

Definition 1.1. The condition of a Markov chain with a transition matrix $P$ is measured by the size of its condition number, which is defined to be

$\kappa = \max_{i,j}|a^{\#}_{ij}|$,

where $a^{\#}_{ij}$ is the $(i,j)$-entry in the group inverse $A^{\#}$ of $A = I - P$.

It is an elementary fact that $\kappa$ is invariant under permutations of the states of the chain. For chains of moderate size, it is not difficult to show (see the proof of Theorem 2.1 given in §4) that if there exists a subdominant eigenvalue of $P$ which is close to 1, then $\kappa$ must be large. However, the converse of this statement has heretofore been unresolved, and our purpose is to focus on this issue. More precisely, we address the following question. If the subdominant eigenvalues of an irreducible Markov chain are well separated from 1, can we be sure that the chain is well conditioned? In other words, do the subdominant eigenvalues of $P$ (or equivalently, the nonzero eigenvalues of $A$) somehow provide complete information about the sensitivity of the chain, or do we really need to know something about the singular values of $A$?

The conjecture that $\kappa = \max_{i,j}|a^{\#}_{ij}|$ is somehow controlled by the nonzero eigenvalues of $A$ is contrary to what is generally true; a standard example is the triangular matrix

(1.6)    $T_{n\times n} = \begin{pmatrix} 1 & -2 & 0 & \cdots & 0 & 0\\ 0 & 1 & -2 & \cdots & 0 & 0\\ \vdots & & \ddots & \ddots & & \vdots\\ 0 & 0 & 0 & \cdots & 1 & -2\\ 0 & 0 & 0 & \cdots & 0 & 1 \end{pmatrix}, \qquad T^{-1} = \begin{pmatrix} 1 & 2 & 4 & \cdots & 2^{n-2} & 2^{n-1}\\ 0 & 1 & 2 & \cdots & 2^{n-3} & 2^{n-2}\\ \vdots & & \ddots & \ddots & & \vdots\\ 0 & 0 & 0 & \cdots & 1 & 2\\ 0 & 0 & 0 & \cdots & 0 & 1 \end{pmatrix}$,

for which $\max_{i,j}|[T^{-1}]_{ij}|$ is immense for even moderate values of $n$, but the eigenvalues of $T$ provide no clue whatsoever that this occurs.
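A quick check of this claim (again an added illustration, not the paper's):

```python
import numpy as np

def T(n):
    # The bidiagonal matrix from (1.6): 1's on the diagonal, -2's on the superdiagonal.
    return np.eye(n) - 2.0 * np.eye(n, k=1)

n = 30
Tn = T(n)
print(np.linalg.eigvals(Tn))              # every eigenvalue equals 1
print(np.abs(np.linalg.inv(Tn)).max())    # 2**(n-1) ~ 5.4e8: entries explode anyway
```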

The fact that the eigenvalues are repeated or that $T$ is nonsingular is irrelevant; consider a small perturbation of $T$ or the matrices

$\hat T = \begin{pmatrix} 0 & 0\\ 0 & T \end{pmatrix}$ and $\hat T^{\#} = \begin{pmatrix} 0 & 0\\ 0 & T^{-1} \end{pmatrix}$.

We will prove that, unlike the situation illustrated above, irreducible stochastic matrices $P$ possess enough structure to guarantee that growth of the entries in $A^{\#}$ is controlled by the nonzero eigenvalues of $A = I - P$. As a consequence, it will follow that the sensitivity of an irreducible Markov chain is governed by the location of its subdominant eigenvalues.

2. The main result. In the sequel, it is convenient to adopt the following terminology and notation.

Definition 2.1. Let $P$ be the transition probability matrix of an $n$-state irreducible Markov chain, and let $\sigma(P) = \{1, \lambda_2, \lambda_3, \ldots, \lambda_n\}$ denote the eigenvalues of $P$. The character¹ of the chain is defined to be the (necessarily real) number

$\partial = (1-\lambda_2)(1-\lambda_3)\cdots(1-\lambda_n)$.

It will follow from later developments that

(2.1)    $0 < \partial \le n$.

A chain is said to be of weak character when $\partial$ is close to 0, and the chain is said to have a strong character when $\partial$ is significantly larger than 0.

If

$P = T^{-1}\begin{pmatrix} 1 & 0\\ 0 & C \end{pmatrix}T$

(e.g., this may be the reduction to Jordan form), where the spectral radius of $C$ is less than 1, then

$A = T^{-1}\begin{pmatrix} 0 & 0\\ 0 & I-C \end{pmatrix}T$ and $A^{\#} = T^{-1}\begin{pmatrix} 0 & 0\\ 0 & (I-C)^{-1} \end{pmatrix}T$

[Campbell and Meyer (1991)], so $\partial = \det(I-C)$ and $1/\partial = \det\big((I-C)^{-1}\big)$. In other words, $\partial$ and $1/\partial$ are the respective determinants of the nonsingular parts of $A$ and $A^{\#}$ in the sense that $\partial = \det\big(A_{/R(A)}\big)$ and $1/\partial = \det\big(A^{\#}_{/R(A)}\big)$, where $A_{/R(A)}$ denotes the linear operator defined by restricting $A$ to $R(A)$. It is also true that $1/\partial = \det(Z)$, where $Z$ is Kemeny and Snell's fundamental matrix.

¹The character was defined by Meyer (1993) to be $\sqrt[n-1]{(1-\lambda_2)(1-\lambda_3)\cdots(1-\lambda_n)}$, which is the normalization of the definition given here.
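Since the character is used throughout what follows, here is a small added sketch of how it can be computed, together with a check of the identity $1/\partial = \det(Z)$; the 3-state chain is an arbitrary choice, and NumPy is assumed.

```python
import numpy as np

def character(P):
    # Product of (1 - lambda_i) over the eigenvalues of P other than the Perron root 1.
    vals = np.linalg.eigvals(P)
    return np.prod(1.0 - np.delete(vals, np.argmin(np.abs(vals - 1.0)))).real

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.8, 0.1],
              [0.2, 0.2, 0.6]])
n = P.shape[0]
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
pi /= pi.sum()
Z = np.linalg.inv(np.eye(n) - P + np.outer(np.ones(n), pi))   # fundamental matrix
print(np.isclose(1.0 / character(P), np.linalg.det(Z)))       # True: 1/char = det(Z)
```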

The main result of this paper is the following theorem, which establishes the connection between the condition of an irreducible chain and its character.

Theorem 2.1. For an irreducible stochastic matrix $P_{n\times n}$, let $A = I - P$, and for $i \ne j$, let $\delta_{ij}(A)$ denote the deleted product of diagonal entries

$\delta_{ij}(A) = \prod_{k\ne i,j} a_{kk} = \prod_{k\ne i,j}(1 - p_{kk})$.

If $\delta = \max_{i\ne j}\delta_{ij}(A)$ (the product of all but the two smallest diagonal entries), then the condition number $\kappa$ is bounded by

(2.2)    $\frac{1}{n\min_{\lambda_i\ne 1}|1-\lambda_i|} \le \kappa < \frac{2\delta(n-1)}{\partial} \le \frac{2(n-1)}{\partial}$.

The proof of this theorem depends on exploiting the rich structure of $A$, some of which is apparent, and some of which requires illumination. Before giving a formal argument, it is necessary to detail the various components of this structure, so the important facets are first laid out in §3 as a sequence of lemmas. After the necessary framework is in place, it will be a simple matter to connect the lemmas together in order to construct a proof; this is contained in §4. By combining Theorem 2.1 with (1.4) and the other facts listed in §1, we arrive at the following conclusion.

Theorem 2.2. The condition of an irreducible Markov chain is primarily governed by how close the subdominant eigenvalues of the chain are to 1. More precisely, if an irreducible chain is well conditioned, then all subdominant eigenvalues must be well separated from 1, and if all subdominant eigenvalues are well separated from 1 in the sense that the chain has a strong character, then it must be well conditioned.

It is a corollary of Theorem 2.1 that if $\max_{\lambda_i\ne 1}|\lambda_i| \ll 1$, then the chain is not overly sensitive, but it is important to underscore the point that the issue of sensitivity is not equivalent to the question of how close $\max_{\lambda_i\ne 1}|\lambda_i|$ is to 1. Knowing that some $|\lambda_i| \approx 1$ is not sufficient to guarantee that the chain is sensitive; e.g., consider the well-conditioned periodic chain (or any small perturbation thereof) for which

$P = \begin{pmatrix} 0 & 0 & 1\\ 1 & 0 & 0\\ 0 & 1 & 0 \end{pmatrix}$ and $A^{\#} = \frac{1}{3}\begin{pmatrix} 1 & -1 & 0\\ 0 & 1 & -1\\ -1 & 0 & 1 \end{pmatrix}$.

3. The underlying structure. The purpose of this section is to organize relevant properties of $A = I - P$ into a sequence of lemmas from which the formal proof of Theorem 2.1 can be constructed. Some of the more transparent or well-known features of $A$ are stated in the first lemma.

Lemma 3.1. If $A = I - P$ where $P_{n\times n}$ is an irreducible stochastic matrix, then the following statements are true.

(3.1) $A$ as well as each principal submatrix of $A$ has strictly positive diagonal entries, and the off-diagonal entries are nonpositive.

(3.2) $A$ is a singular M-matrix of rank $n - 1$.

(3.3) If $B_{k\times k}$ ($k < n$) is a principal submatrix of $A$, then each of the following statements is true:
(a) $B$ is a nonsingular M-matrix.
(b) $B^{-1} \ge 0$.
(c) $\det(B) > 0$.
(d) $B$ is diagonally dominant.
(e) $\det(B) \le b_{11}b_{22}\cdots b_{kk} \le 1$.

Proof. These facts are either self-evident, or they are direct consequences of well-known results; see Berman and Plemmons (1979) or Horn and Johnson (1991).
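These properties are easy to observe numerically. The sketch below (added here as an illustration) examines statements (3.3b), (3.3c), and (3.3e) on a principal submatrix of a randomly generated chain.

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.random((6, 6))
P /= P.sum(axis=1, keepdims=True)      # dense random chain, hence irreducible
A = np.eye(6) - P
B = A[:4, :4]                          # a principal submatrix with k < n
d = np.diag(B)
print(np.all(np.linalg.inv(B) >= 0))               # (3.3b)
print(np.linalg.det(B) > 0)                        # (3.3c)
print(np.linalg.det(B) <= d.prod() <= 1.0)         # (3.3e)
```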

Part of the less transparent structure of $A$ is illuminated in the following sequence of lemmas.

Lemma 3.2. If $P_{n\times n}$ is an irreducible stochastic matrix, and if $A_i$ denotes the principal submatrix of $A = I - P$ obtained by deleting the $i$th row and column from $A$, then

$\partial = \sum_{i=1}^n \det(A_i)$.

Proof. Suppose that the eigenvalues of $A$ are denoted by $\{\mu_1, \mu_2, \ldots, \mu_n\}$, and write the characteristic equation for $A$ as

$x^n + \alpha_{n-1}x^{n-1} + \cdots + \alpha_1 x + \alpha_0 = 0$.

Each coefficient $\alpha_{n-k}$ is given by $(-1)^k$ times the sum of the products of the eigenvalues of $A$ taken $k$ at a time. That is,

(3.4)    $\alpha_{n-k} = (-1)^k \sum_{1\le i_1 < \cdots < i_k \le n} \mu_{i_1}\mu_{i_2}\cdots\mu_{i_k}$.

But it is also a standard result from elementary matrix theory that each coefficient $\alpha_{n-k}$ can be described as

$\alpha_{n-k} = (-1)^k \sum\,(\text{all } k\times k \text{ principal minors of } A)$.

Since 0 is a simple eigenvalue for $A$, there is only one nonzero term in the sum (3.4) when $k = n-1$, and hence

$\alpha_1 = (-1)^{n-1}\mu_2\mu_3\cdots\mu_n = (-1)^{n-1}(1-\lambda_2)(1-\lambda_3)\cdots(1-\lambda_n)$ and $\alpha_1 = (-1)^{n-1}\sum_{i=1}^n \det(A_i)$.

Therefore,

$\sum_{i=1}^n \det(A_i) = \prod_{k=2}^n (1-\lambda_k) = \partial$.

Lemma 3.3. If $A_i$ denotes the principal submatrix of $A = I - P$ obtained by deleting the $i$th row and column from $A$, and if $\pi_i$ is the $i$th stationary probability, then the character of the chain is given by

$\partial = \frac{\det(A_i)}{\pi_i}$.

Proof. This result follows directly from Lemma 3.2 and the fact that the stationary distribution $\pi^T$ is given by the formula

$\pi^T = \frac{\big(\det(A_1), \det(A_2), \ldots, \det(A_n)\big)}{\sum_{i=1}^n \det(A_i)}$

[Golub and Meyer (1986) or Iosifescu (1980, p. 123)].
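Both lemmas can be confirmed directly. The added sketch below checks $\partial = \sum_i \det(A_i)$ and $\partial = \det(A_i)/\pi_i$ for every $i$ on an arbitrary 3-state chain.

```python
import numpy as np

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.8, 0.1],
              [0.2, 0.2, 0.6]])
n = P.shape[0]
A = np.eye(n) - P
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
pi /= pi.sum()

# det(A_i): delete the i-th row and column of A.
minors = np.array([np.linalg.det(np.delete(np.delete(A, i, 0), i, 1))
                   for i in range(n)])
ev = np.linalg.eigvals(P)
char = np.prod(1.0 - np.delete(ev, np.argmin(np.abs(ev - 1.0)))).real

print(np.isclose(minors.sum(), char))      # Lemma 3.2
print(np.allclose(minors / pi, char))      # Lemma 3.3 (holds for every i)
```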

The mean return time for the $k$th state is $R_k = 1/\pi_k$ [Kemeny and Snell (1960)], and, since not all of the $\pi_k$'s can be less than $1/n$, there must exist a state such that $R_k \le n$. By combining this with (3.3c) and (3.3e), an interesting corollary which proves (2.1) is produced.

Corollary 3.1. If $R_k$ denotes the mean return time for the $k$th state, then

$0 < \partial = \frac{\det(A_i)}{\pi_i} \le \min_k R_k \le n$ for each $i = 1, 2, \ldots, n$.

Lemma 3.4. If $A = I - P$ where $P_{n\times n}$ is an irreducible stochastic matrix, and if $B_{k\times k}$ ($k < n$) is a principal submatrix of $A$, then the largest entry in each column of $B^{-1}$ is the diagonal entry. That is, for $j = 1, 2, \ldots, k$, it must be the case that

$[B^{-1}]_{jj} \ge [B^{-1}]_{ij}$ for each $i \ne j$.

At least two different proofs are possible, and we shall give both because each is instructive in its own right. The first argument is shorter and more probabilistic, but it rests on a result which requires a proof of its own. The second argument involves more algebraic details, but it is entirely self-contained and depends only on elementary concepts.

Probabilistic proof. Without loss of generality, assume that $B$ is the leading $k\times k$ principal submatrix of $A$ so that $P$ has the form

$P = \begin{pmatrix} I - B & *\\ * & * \end{pmatrix}$.

Consider any pair of states $i$ and $j$ in the set $S = \{1, 2, \ldots, k\}$, and let $N_j$ denote the number of times the process is in state $j$ before first hitting a state in the complement $\bar S = \{k+1, k+2, \ldots, n\}$. If $X_n$ denotes the state of the process after $n$ steps, and if

$h_{ij} = P\big(\text{hitting state } j \text{ before entering } \bar S \mid X_0 = i\big)$,

then

(3.5)    $E[N_j \mid X_0 = i] = d_{ij} + h_{ij}\,E[N_j \mid X_0 = j]$, where $d_{ij} = \begin{cases} 1 & \text{if } i = j,\\ 0 & \text{if } i \ne j. \end{cases}$

This statement (which appears without proof on p. 62 in Kemeny and Snell (1960)) is intuitive, but it is not trivial. The theory of absorbing chains says that $[B^{-1}]_{ij} = E[N_j \mid X_0 = i]$, so for $i \ne j$ we have

$[B^{-1}]_{ij} = h_{ij}[B^{-1}]_{jj} \le [B^{-1}]_{jj}$.

Algebraic proof. Assume that $B$ is the leading $k\times k$ principal submatrix of $A$, and suppose the states have been arranged so that the $j$th state is listed first and the $i$th state is listed second. The goal is to prove that $[B^{-1}]_{11} \ge [B^{-1}]_{21}$. Because

$[B^{-1}]_{11} = \frac{\det(B_{11})}{\det(B)}$ and $[B^{-1}]_{21} = -\frac{\det(B_{12})}{\det(B)}$,

where $B_{ij}$ denotes the submatrix of $B$ obtained by deleting the $i$th row and $j$th column from $B$, and because Lemma 3.1 guarantees that $\det(B) > 0$, it suffices to prove that

$\det(B_{11}) + \det(B_{12}) \ge 0$.

Denote the first unit vector by $e_1^T = (1, 0, \ldots, 0)$, and partition $B$ as

(3.6)    $B = \begin{pmatrix} 1-p_{11} & -p_{12} & \cdots & -p_{1k}\\ -p_{21} & 1-p_{22} & \cdots & -p_{2k}\\ \vdots & \vdots & & \vdots\\ -p_{k1} & -p_{k2} & \cdots & 1-p_{kk} \end{pmatrix} = \begin{pmatrix} 1-p_{11} & -p_{12} & \cdots & -p_{1k}\\ b_1 & b_2 & \cdots & b_k \end{pmatrix}$.

In terms of these quantities, $\det(B_{11}) + \det(B_{12})$ is given by

$\det(B_{11}) + \det(B_{12}) = \det\big(b_2\ b_3\ \cdots\ b_k\big) + \det\big(b_1\ b_3\ \cdots\ b_k\big) = \det\big(b_2 + b_1\ \ b_3\ \cdots\ b_k\big) = \det\big(B_{11} + b_1e_1^T\big) = \det(B_{11})\big(1 + e_1^TB_{11}^{-1}b_1\big)$.

Lemma 3.1 also insures that $\det(B_{11}) > 0$, so the proof can be completed by arguing that $1 + e_1^TB_{11}^{-1}b_1 \ge 0$. To do so, modify the chain by making state 1 as well as states $k+1, k+2, \ldots, n$ absorbing states so that the transition matrix has the form

$\tilde P = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 & 0 & \cdots & 0\\ p_{21} & p_{22} & p_{23} & \cdots & p_{2k} & p_{2,k+1} & \cdots & p_{2n}\\ p_{31} & p_{32} & p_{33} & \cdots & p_{3k} & p_{3,k+1} & \cdots & p_{3n}\\ \vdots & & & & & & & \vdots\\ p_{k1} & p_{k2} & p_{k3} & \cdots & p_{kk} & p_{k,k+1} & \cdots & p_{kn}\\ 0 & 0 & 0 & \cdots & 0 & 1 & \cdots & 0\\ \vdots & & & & & & \ddots & \vdots\\ 0 & 0 & 0 & \cdots & 0 & 0 & \cdots & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0\\ -b_1 & Q & R\\ 0 & 0 & I_{n-k} \end{pmatrix}$.

It follows from the elementary theory of absorbing chains that the entries in the matrix

$(I - Q)^{-1}\big(-b_1 \mid R\big) = B_{11}^{-1}\big(-b_1 \mid R\big)$

represent the various absorption probabilities, and consequently all entries in $-B_{11}^{-1}b_1$ are between 0 and 1, so that

$0 \le 1 + e_1^TB_{11}^{-1}b_1 \le 1$.

Note. Although it may not be of optimal efficiency, the algebraic argument given above is also a proof of the statement (3.5).
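Lemma 3.4 is also easy to observe numerically (an added illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
P = rng.random((8, 8))
P /= P.sum(axis=1, keepdims=True)      # dense random chain, hence irreducible
B = (np.eye(8) - P)[:5, :5]            # a principal submatrix with k < n
Binv = np.linalg.inv(B)
# Column maxima of B^{-1} occur on the diagonal, as Lemma 3.4 asserts.
print(np.all(np.diag(Binv) >= Binv.max(axis=0) - 1e-12))   # True
```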

Lemma 3.5. If $A = I - P$ where $P_{n\times n}$ is an irreducible stochastic matrix, and if $B_{k\times k}$ ($k < n$) is a principal submatrix of $A$, then

$0 < \det(B) \le \frac{\max_i \delta_i(B)}{\max_{i,j}[B^{-1}]_{ij}} \le \frac{1}{\max_{i,j}[B^{-1}]_{ij}}$,

where $\delta_r(B)$ denotes the deleted product

$\delta_r(B) = b_{11}b_{22}\cdots b_{kk}/b_{rr}$.

Proof. Lemma 3.4 insures that there is some diagonal entry $[B^{-1}]_{rr}$ of $B^{-1}$ such that

(3.7)    $[B^{-1}]_{rr} = \max_{i,j}[B^{-1}]_{ij}$.

If $B_{rr}$ is the principal submatrix of $B$ obtained by deleting the $r$th row and column from $B$, then (3.3e) together with (3.7) produces

$\det(B) = \frac{\det(B_{rr})}{[B^{-1}]_{rr}} \le \frac{\delta_r(B)}{[B^{-1}]_{rr}} = \frac{\delta_r(B)}{\max_{i,j}[B^{-1}]_{ij}} \le \frac{\max_i \delta_i(B)}{\max_{i,j}[B^{-1}]_{ij}} \le \frac{1}{\max_{i,j}[B^{-1}]_{ij}}$.

Lemma 3.6. For an irreducible stochastic matrix $P_{n\times n}$, let $A_j$ be the principal submatrix of $A = I - P$ obtained by deleting the $j$th row and column from $A$, and let $Q$ be the permutation matrix such that

$Q^TAQ = \begin{pmatrix} A_j & c_j\\ d_j^T & a_{jj} \end{pmatrix}$.

If the stationary distribution for $Q^TPQ$ is written as $\psi^T = \pi^TQ = (\hat\pi^T, \pi_j)$, so that $\hat\pi^T$ is $\pi^T$ with its $j$th component removed, then the group inverse of $A$ is given by

$A^{\#} = Q\begin{pmatrix} (I - e\hat\pi^T)A_j^{-1}(I - e\hat\pi^T) & -\pi_j(I - e\hat\pi^T)A_j^{-1}e\\ -\hat\pi^TA_j^{-1}(I - e\hat\pi^T) & \pi_j\hat\pi^TA_j^{-1}e \end{pmatrix}Q^T$,

where $e$ is a column of 1's whose size is determined by the context in which it appears.

Proof. The group inverse possesses the property that $(T^{-1}AT)^{\#} = T^{-1}A^{\#}T$ for all nonsingular matrices $T$ [Campbell and Meyer (1991)], so

$Q^TA^{\#}Q = \begin{pmatrix} A_j & c_j\\ d_j^T & a_{jj} \end{pmatrix}^{\#}$.

Since $\mathrm{rank}(Q^TAQ) = n - 1$, it follows that $a_{jj} - d_j^TA_j^{-1}c_j = 0$, and this is used to verify that

$\begin{pmatrix} A_j & c_j\\ d_j^T & a_{jj} \end{pmatrix}^{\#} = (I - e\psi^T)\begin{pmatrix} A_j^{-1} & 0\\ 0 & 0 \end{pmatrix}(I - e\psi^T) = \begin{pmatrix} (I - e\hat\pi^T)A_j^{-1}(I - e\hat\pi^T) & -\pi_j(I - e\hat\pi^T)A_j^{-1}e\\ -\hat\pi^TA_j^{-1}(I - e\hat\pi^T) & \pi_j\hat\pi^TA_j^{-1}e \end{pmatrix}$.
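The block formula of Lemma 3.6 can be checked against $A^{\#}$ computed from the fundamental matrix. In the added sketch below, the permutation lists state $j$ last, mirroring $Q^TA^{\#}Q$; the chain and the choice $j = 1$ (0-based) are arbitrary.

```python
import numpy as np

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.8, 0.1],
              [0.2, 0.2, 0.6]])
n, j = P.shape[0], 1                  # delete state j (0-based index)
A = np.eye(n) - P
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
pi /= pi.sum()

Aj = np.delete(np.delete(A, j, 0), j, 1)
pj, ph = pi[j], np.delete(pi, j)      # pi_j and the reduced vector pihat
e = np.ones(n - 1)
M = np.eye(n - 1) - np.outer(e, ph)   # I - e*pihat^T
Ainv = np.linalg.inv(Aj)

top = np.hstack([M @ Ainv @ M, (-pj * M @ Ainv @ e)[:, None]])
bot = np.hstack([-ph @ Ainv @ M, [pj * ph @ Ainv @ e]])[None, :]
block = np.vstack([top, bot])

Ag = np.linalg.inv(A + np.outer(np.ones(n), pi)) - np.outer(np.ones(n), pi)
perm = [0, 2, 1]                      # reorder states so that state j comes last
print(np.allclose(block, Ag[np.ix_(perm, perm)]))   # True
```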

4. Proof of the main theorem. The preceding sequence of lemmas is now connected together to prove the primary results stated in Theorem 2.1.

The upper bound. To derive the inequalities

(4.1)    $\max_{i,j}|a^{\#}_{ij}| < \frac{2\delta(n-1)}{\partial} \le \frac{2(n-1)}{\partial}$,

begin by letting $Q$ be the permutation matrix given in Lemma 3.6 so that for $i \ne j$, the $(i,j)$-entry of $A^{\#}$ is the $(k,n)$-entry of $Q^TA^{\#}Q$ where $k \ne n$. In succession, use the formula of Lemma 3.6 and Hölder's inequality followed by the results of Lemmas 3.5 and 3.3 to write

$|a^{\#}_{ij}| = \pi_j\big|e_k^T(I - e\hat\pi^T)A_j^{-1}e\big| \le \pi_j\,\|e_k - \hat\pi\|_1\,\|A_j^{-1}e\|_\infty < 2\pi_j\,\|A_j^{-1}\|_\infty$
$\le 2\pi_j(n-1)\max_{r,s}\big[A_j^{-1}\big]_{rs} \le \frac{2\pi_j(n-1)\max_i\delta_i(A_j)}{\det(A_j)} \le \frac{2\pi_j(n-1)\,\delta}{\det(A_j)} = \frac{2\delta(n-1)}{\partial} \le \frac{2(n-1)}{\partial}$.

Now consider the diagonal elements. The $(j,j)$-entry of $A^{\#}$ is the $(n,n)$-entry of $Q^TA^{\#}Q$, so proceeding in a manner similar to that above produces

$a^{\#}_{jj} = \pi_j\,\hat\pi^TA_j^{-1}e \le \pi_j\,\|\hat\pi\|_1\,\|A_j^{-1}e\|_\infty < \pi_j\,\|A_j^{-1}\|_\infty \le \pi_j(n-1)\max_{r,s}\big[A_j^{-1}\big]_{rs} \le \frac{\pi_j(n-1)\,\delta}{\det(A_j)} = \frac{\delta(n-1)}{\partial} \le \frac{n-1}{\partial}$,

thus proving (4.1).

The lower bound. To establish that

(4.2)    $\frac{1}{n\min_{\lambda_i\ne 1}|1-\lambda_i|} \le \max_{i,j}|a^{\#}_{ij}|$,

make use of the fact that if $Ax = \mu x$ for $\mu \ne 0$, then $A^{\#}x = \mu^{-1}x$ [Campbell and Meyer (1991, p. 129)]. In particular, if $\lambda \ne 1$ is an eigenvalue of $P$, and if $x$ is a corresponding eigenvector, then $Ax = (1-\lambda)x$ implies that $A^{\#}x = (1-\lambda)^{-1}x$, so

$\frac{1}{|1-\lambda|} \le \|A^{\#}\|_\infty \le n\max_{i,j}|a^{\#}_{ij}|$,

and (4.2) follows by choosing $\lambda$ to minimize $|1-\lambda|$ over the subdominant eigenvalues.
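The two bounds just established, i.e., (2.2), can be evaluated on the periodic chain of §2. The added sketch below shows that $\kappa = 1/3$ sits comfortably between them.

```python
import numpy as np

P = np.array([[0., 0., 1.],
              [1., 0., 0.],
              [0., 1., 0.]])            # the periodic chain from Section 2
n = 3
A = np.eye(n) - P
pi = np.full(n, 1/3)
Ag = np.linalg.inv(A + np.outer(np.ones(n), pi)) - np.outer(np.ones(n), pi)
kappa = np.abs(Ag).max()                # condition number: 1/3

vals = np.linalg.eigvals(P)
sub = np.delete(vals, np.argmin(np.abs(vals - 1.0)))
char = np.prod(1.0 - sub).real          # character: 3, the largest possible value
delta = np.sort(np.diag(A))[2:].prod()  # all but the two smallest diagonal entries
lower = 1.0 / (n * np.abs(1.0 - sub).min())
upper = 2.0 * delta * (n - 1) / char
print(lower, kappa, upper)              # ~0.19 <= 1/3 < ~1.33
```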

5. Using an LU factorization. Except for chains which are too large to fit into a computer's main memory, the stationary distribution $\pi^T$ is generally computed by direct methods; i.e., either an LU or QR factorization of $A = I - P$ (or $A^T$) is computed [Harrod and Plemmons (1984); Grassmann, Taksar, and Heyman (1985); Funderlic and Meyer (1986); Golub and Meyer (1986); Barlow (1993)]. Even for very large chains which are nearly uncoupled, direct methods are usually involved; they can be the basis of the main algorithm [Stewart and Zhang (1991)], or they can be used to solve the aggregated and coupling chains in iterative aggregation/disaggregation algorithms [Chatelin and Miranker (1982), Haviv (1987)].

In the conclusion of their paper, Golub and Meyer (1986) make the following observation. Computational experience suggests that when a triangular factorization of $A_{n\times n}$ is used to solve an irreducible chain, the condition of the chain seems to be a function of the size of the nonzero pivots, and this means that it should be possible to estimate $\kappa$ with little or no extra cost beyond that incurred in computing $\pi^T$. For large chains, this can be a significant savings over the $O(n^2)$ operations demanded by traditional condition estimators. Of course, this is contrary to the situation which exists for general nonsingular matrices, because the absence of small pivots (or the existence of a large determinant) is not a guarantee of a well-conditioned matrix; consider the matrix in (1.6).

A mathematical formulation and proof (or even an intuitive explanation) of Golub and Meyer's observation has heretofore not been given, but the results of §2 and §3 now make it possible to give a more precise statement and a rigorous proof of the Golub-Meyer observation. The arguments hinge on the fact that whenever $\pi^T$ is computed by means of a triangular factorization of $A$ (or $A^T$), the character of the chain is always an immediate by-product. The results for an LU factorization are given below, and the analogous theory for a QR factorization is given in the next section.

Suppose that the LU factorization² of $A = I - P$ is computed to be

$A = LU = \begin{pmatrix} L_n & 0\\ r^T & 1 \end{pmatrix}\begin{pmatrix} U_n & c\\ 0 & 0 \end{pmatrix}$.

If $A_n$ is the principal submatrix of $A$ obtained by deleting the last row and column from $A$, then $A_n$ is a nonsingular M-matrix, and its LU factorization is $A_n = L_nU_n$. Since the LU factors of a nonsingular M-matrix are also nonsingular M-matrices [Berman and Plemmons (1979), Horn and Johnson (1991)], it follows that $L_n$ and $U_n$ are nonsingular M-matrices, and hence $L_n^{-1} \ge 0$ and $U_n^{-1} \ge 0$. Consequently, $r^T \le 0$, so the solution (obtained by a simple substitution process with no divisions) of the nonsingular triangular system $x^TL_n = -r^T$ is nonnegative. This together with the result of Lemma 3.3 and Theorem 2.1 produces the following conclusion.

²Regardless of whether $A$ or $A^T$ is used, Gaussian elimination with finite-precision arithmetic can prematurely produce a zero (or even a negative) pivot, and this can happen for well-conditioned chains. Practical implementation demands a strategy to deal with this situation, and Funderlic and Meyer (1986) and Stewart and Zhang (1991) discuss this problem along with possible remedies. Practical algorithms involve reordering schemes which introduce permutation matrices, but these permutations are not important in the context of this section, so they are suppressed.

Theorem 5.1. For an irreducible Markov chain whose transition matrix is $P$, let the LU factorization of $A = I - P$ be given by

$A = LU = \begin{pmatrix} L_n & 0\\ r^T & 1 \end{pmatrix}\begin{pmatrix} U_n & c\\ 0 & 0 \end{pmatrix}$.

If $x^T$ is the solution of $x^TL_n = -r^T$, then each of the following statements is true.

The stationary distribution of the chain is

(5.1)    $\pi^T = \frac{1}{1 + \|x\|_1}\,(x^T, 1)$.

The character of the chain is

(5.2)    $\partial = \frac{\det(U_n)}{\pi_n} = (1 + \|x\|_1)\det(U_n)$.

The condition number for the chain is bounded above by

(5.3)    $\kappa < \frac{2\delta(n-1)\pi_n}{\det(U_n)} = \frac{2\delta(n-1)}{(1 + \|x\|_1)\det(U_n)} \le \frac{2(n-1)}{(1 + \|x\|_1)\det(U_n)}$.

The condition number for the chain is bounded below by

(5.4)    $\pi_n\sum_{i=1}^{n-1}\frac{\pi_i}{u_{ii}} = \frac{1}{(1 + \|x\|_1)^2}\sum_{i=1}^{n-1}\frac{x_i}{u_{ii}} \le \kappa$,

where $u_{ii}$ is the $i$th pivot in $U_n$.

Proof. Statements (5.1), (5.2), and (5.3) are straightforward consequences of the previous discussion. To establish (5.4), first recall from Lemma 3.6 (with $j = n$, so that $\hat\pi^T = (\pi_1, \ldots, \pi_{n-1})$) that

$a^{\#}_{nn} = \pi_n\hat\pi^TA_n^{-1}e = \pi_n\hat\pi^TU_n^{-1}L_n^{-1}e > 0$.

Since $U_n^{-1} \ge 0$ and $L_n^{-1} \ge 0$, it follows that $\hat\pi^TU_n^{-1}$ and $L_n^{-1}e$ can be written as

$\hat\pi^TU_n^{-1} = \Big(\frac{\pi_1}{u_{11}},\ \frac{\pi_2}{u_{22}} + \alpha_2,\ \ldots,\ \frac{\pi_{n-1}}{u_{n-1,n-1}} + \alpha_{n-1}\Big)$ and $L_n^{-1}e = \big(1,\ 1 + \beta_2,\ \ldots,\ 1 + \beta_{n-1}\big)^T$,

where each $\alpha_i$ and $\beta_i$ is nonnegative, and consequently (setting $\alpha_1 = \beta_1 = 0$)

$\hat\pi^TA_n^{-1}e = \hat\pi^TU_n^{-1}L_n^{-1}e = \sum_{i=1}^{n-1}\Big(\frac{\pi_i}{u_{ii}} + \alpha_i\Big)(1 + \beta_i) \ge \sum_{i=1}^{n-1}\frac{\pi_i}{u_{ii}}$.

Therefore,

$\kappa \ge a^{\#}_{nn} = \pi_n\hat\pi^TU_n^{-1}L_n^{-1}e \ge \pi_n\sum_{i=1}^{n-1}\frac{\pi_i}{u_{ii}} = \frac{1}{(1 + \|x\|_1)^2}\sum_{i=1}^{n-1}\frac{x_i}{u_{ii}}$.

As mentioned before, the pivots or the determinant need not be indicators of the condition of a general nonsingular matrix. In particular, the absence of small pivots (or the existence of a large determinant) is not a guarantee of a well-conditioned matrix. However, for our special matrices $A = I - P$, the bounds in Theorem 5.1 allow the pivots to be used as condition estimators.
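Theorem 5.1 translates directly into a few lines of code. The sketch below is an added illustration, not a robust implementation: the local helper lu_nopivot is a naive Doolittle elimination without pivoting, chosen only to match the permutation-free form assumed in this section (see footnote 2 for why practical codes must do more).

```python
import numpy as np

def lu_nopivot(A):
    # Doolittle LU without pivoting; permutations are suppressed as in Section 5.
    n = A.shape[0]
    L, U = np.eye(n), A.astype(float).copy()
    for k in range(n - 1):
        L[k+1:, k] = U[k+1:, k] / U[k, k]
        U[k+1:, k:] -= np.outer(L[k+1:, k], U[k, k:])
    return L, U

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.8, 0.1],
              [0.2, 0.2, 0.6]])
n = P.shape[0]
A = np.eye(n) - P
L, U = lu_nopivot(A)
Ln, r, Un = L[:n-1, :n-1], L[n-1, :n-1], U[:n-1, :n-1]

x = np.linalg.solve(Ln.T, -r)                 # x^T L_n = -r^T; x >= 0
pi = np.append(x, 1.0) / (1.0 + x.sum())      # (5.1)
char = (1.0 + x.sum()) * np.linalg.det(Un)    # (5.2): a by-product of the pivots

u = np.diag(Un)
lower = (x / u).sum() / (1.0 + x.sum())**2    # (5.4)
delta = np.sort(np.diag(A))[2:].prod()
upper = 2.0 * delta * (n - 1) / char          # (5.3)
print(pi)                    # [0.207 0.552 0.241]
print(lower, upper)          # the true kappa = max|a#_ij| lies in between
```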

Corollary 5.1. For an irreducible Markov chain whose transition matrix is $P$, suppose that the LU factorization of $A = I - P$ and the stationary distribution $\pi^T$ have been computed as described in Theorem 5.1.

If the pivots $u_{ii}$ are large relative to $\pi_n$ in the sense that $\pi_n/\det(U_n)$ is not too large, then the chain is well conditioned.

If there are pivots $u_{ii}$ which are small relative to the $\pi_i$'s in the sense that $\pi_n\sum_{i=1}^{n-1}\pi_i/u_{ii}$ is large, then the chain is ill conditioned.

6. Using a QR factorization. The utility of orthogonal triangularization is well documented in the vast literature on matrix computations, and the use of a QR factorization to solve and analyze Markov chains is discussed by Golub and Meyer (1986). The following theorem shows that the character of an irreducible chain can be directly obtained from the diagonal entries of $R$ and the last column of $Q$, and this will establish an upper bound using a QR factorization which is analogous to that in Theorem 5.1 for an LU factorization. A lower bound analogous to the one in Theorem 5.1 is not readily available.

Theorem 6.1. For an irreducible Markov chain whose transition matrix is $P$, let the QR factorization of $A = I - P$ be given by

$A = QR = \begin{pmatrix} Q_n & c\\ d^T & q_{nn} \end{pmatrix}\begin{pmatrix} R_n & -R_ne\\ 0 & 0 \end{pmatrix} = \begin{pmatrix} Q_nR_n & -Q_nR_ne\\ d^TR_n & -d^TR_ne \end{pmatrix}$.

If $q$ denotes the last column of $Q$, then each of the following statements is true.

The stationary distribution of the chain is

(6.1)    $\pi^T = \frac{q^T}{\sum_{i=1}^n q_{in}}$.

The character of the chain is

(6.2)    $\partial = \|q\|_1\,|\det(R_n)|$.

The condition number for the chain is bounded above by

(6.3)    $\kappa < \frac{2\delta(n-1)}{\|q\|_1\,|\det(R_n)|} \le \frac{2(n-1)}{\|q\|_1\,|\det(R_n)|}$.

Proof. The formula (6.1) for $\pi^T$ is derived in Golub and Meyer (1986). To prove (6.2), first recall the result of Lemma 3.3, and observe that

$\partial^2 = \frac{\big(\det A_n\big)^2}{\pi_n^2} = \frac{\big(\det Q_nR_n\big)^2}{\pi_n^2} = \frac{(\det Q_n)^2(\det R_n)^2}{q_{nn}^2/\|q\|_1^2}$.

Use the fact that $QQ^T = I$ implies $Q_nQ_n^T + cc^T = I$ to obtain

$(\det Q_n)^2 = \det\big(Q_nQ_n^T\big) = \det\big(I - cc^T\big) = 1 - c^Tc = q_{nn}^2$,

and substitute this into the previous expression to obtain (6.2). The bound (6.3) is now a consequence of the result of Theorem 2.1.
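The QR analogue is equally short (an added illustration, using NumPy's Householder QR):

```python
import numpy as np

P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.8, 0.1],
              [0.2, 0.2, 0.6]])
n = P.shape[0]
A = np.eye(n) - P

Q, R = np.linalg.qr(A)           # last row of R vanishes since rank(A) = n - 1
q = Q[:, -1]                     # q^T A = 0, so q is proportional to pi
pi = q / q.sum()                                            # (6.1)
char = np.abs(q).sum() * abs(np.linalg.det(R[:n-1, :n-1]))  # (6.2)
delta = np.sort(np.diag(A))[2:].prod()
upper = 2.0 * delta * (n - 1) / char                        # (6.3)
print(pi, char, upper)           # char = 0.29, matching the LU computation
```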

7. Concluding remarks. It has been argued that the sensitivity of an irreducible chain is primarily governed by how close the subdominant eigenvalues are to 1 in the sense that the condition number of the chain is bounded by

(7.1)    $\frac{1}{n\min_{\lambda_i\ne 1}|1-\lambda_i|} \le \kappa < \frac{2\delta(n-1)}{\partial}$.

Although the upper bound explicitly involves $n$, it is generally not the case that $2\delta(n-1)/\partial$ grows in proportion to $n$. Except in the special case when the diagonal entries of $P$ are 0, the term $\delta$ somewhat mitigates the presence of $n$ because as $n$ becomes larger, $\delta$ becomes smaller. Computational experience suggests that $2\delta(n-1)/\partial$ is usually a rather conservative estimate of $\kappa$, and the term $\delta/\partial$ by itself, although not always an upper bound for $\kappa$, is often of the same order of magnitude as $\kappa$. However, there exist pathological cases for which even $\delta/\partial$ severely overestimates $\kappa$. This seems to occur for chains which are not too badly conditioned and no single eigenvalue is extremely close to 1, but enough eigenvalues are within range of 1 to force $1/\partial$ to be too large. This suggests that for the purposes of bounding $\kappa$ above, perhaps not all of the subdominant eigenvalues need to be taken into account. In a forthcoming article, Seneta (1993) addresses this issue by an analysis involving coefficients of ergodicity.

When direct methods are used to solve an irreducible chain, standard condition estimators can be used to produce reliable estimates for $\kappa$, but the cost of doing so is $O(n^2)$ operations beyond the solution process. The results of Theorems 5.1 and 6.1 make it possible to estimate $\kappa$ with the same computations which produce $\pi^T$. Although the bounds for $\kappa$ produced by Theorem 5.1 are sometimes rather loose, they are nevertheless virtually free. One must balance the cost of obtaining condition estimates against the information one desires to obtain from these estimates.

8. Acknowledgments. The exposition of this article was enhanced by suggestions provided by Dianne O'Leary, Guy Latouche, and Paul Schweitzer.

REFERENCES

J. L. Barlow (1993), Error bounds for the computation of null vectors with applications to Markov chains, SIAM J. Matrix Anal. Appl., 14, pp. 598-618.

A. Berman and R. J. Plemmons (1979), Nonnegative Matrices in the Mathematical Sciences, Academic Press, New York.

S. L. Campbell and C. D. Meyer (1991), Generalized Inverses of Linear Transformations, Dover Publications, New York (1979 edition by Pitman Pub. Ltd., London).

F. Chatelin and W. L. Miranker (1982), Acceleration by aggregation of successive approximation methods, Linear Algebra Appl., 43, pp. 17-47.

R. E. Funderlic and C. D. Meyer (1986), Sensitivity of the stationary distribution vector for an ergodic Markov chain, Linear Algebra Appl., 76, pp. 1-17.

G. H. Golub and C. D. Meyer (1986), Using the QR factorization and group inversion to compute, differentiate, and estimate the sensitivity of stationary probabilities for Markov chains, SIAM J. Algebraic Discrete Meth., 7, pp. 273-281.

W. K. Grassmann, M. I. Taksar, and D. P. Heyman (1985), Regenerative analysis and steady state distributions for Markov chains, Oper. Res., 33, pp. 1107-1116.

W. J. Harrod and R. J. Plemmons (1984), Comparison of some direct methods for computing stationary distributions of Markov chains, SIAM J. Sci. Statist. Comput., 5, pp. 453-469.

M. Haviv (1987), Aggregation/disaggregation methods for computing the stationary distribution of a Markov chain, SIAM J. Numer. Anal., 24, pp. 952-966.

R. A. Horn and C. R. Johnson (1991), Topics in Matrix Analysis, Cambridge University Press, Cambridge.

M. Iosifescu (1980), Finite Markov Processes and Their Applications, John Wiley and Sons, New York.

I. C. F. Ipsen and C. D. Meyer (1994), Uniform stability of Markov chains, SIAM J. Matrix Anal. Appl., 15, pp. 1061-1074.

J. G. Kemeny and J. L. Snell (1960), Finite Markov Chains, D. Van Nostrand, New York.

C. D. Meyer (1975), The role of the group generalized inverse in the theory of finite Markov chains, SIAM Rev., 17, pp. 443-464.

C. D. Meyer (1980), The condition of a finite Markov chain and perturbation bounds for the limiting probabilities, SIAM J. Algebraic Discrete Meth., 1, pp. 273-283.

C. D. Meyer (1993), The character of a finite Markov chain, in Linear Algebra, Markov Chains, and Queueing Models, C. D. Meyer and R. J. Plemmons, eds., IMA Volumes in Mathematics and its Applications, Vol. 48, Springer-Verlag, New York, pp. 47-58.

C. D. Meyer and G. W. Stewart (1988), Derivatives and perturbations of eigenvectors, SIAM J. Numer. Anal., 25, pp. 679-691.

P. J. Schweitzer (1968), Perturbation theory and finite Markov chains, J. Appl. Probab., 5, pp. 401-413.

E. Seneta (1991), Sensitivity analysis, ergodicity coefficients, and rank-one updates for finite Markov chains, in Numerical Solution of Markov Chains, W. J. Stewart, ed., Probability: Pure and Applied, No. 8, Marcel Dekker, New York, pp. 121-129.

E. Seneta (1993), Sensitivity of finite Markov chains under perturbation, Statist. Probab. Lett., 17, to appear.

G. W. Stewart and G. Zhang (1991), On a direct method for the solution of nearly uncoupled Markov chains, Numer. Math., 59, pp. 1-11.