THE REACHABILITY CONES OF ESSENTIALLY NONNEGATIVE MATRICES

by

Michael Neumann, Department of Mathematics, University of Connecticut, Storrs, CT 06269-3009
and
Ronald J. Stern, Department of Mathematics, Concordia University, Montreal, Canada H4X 1J7
and
Michael J. Tsatsomeros, Department of Mathematics, University of Connecticut, Storrs, CT 06269-3009

Research supported in part by US Air Force Research Grant No. AFOSR-88-0047 and by NSF Grant No. DMS-8901860. This author would also like to thank NSERC for making it possible for him to visit Ronald J. Stern in Montreal.
Research supported by the Natural Sciences and Engineering Council of Canada, grant No. A4641.
Research supported in part by US Air Force Research Grant No. AFOSR-88-0047 and by NSF Grant No. DMS-8901860.
Abstract

Let $A$ be an $n \times n$ essentially nonnegative matrix and consider the linear differential system $\dot{x}(t) = Ax(t)$, $t \ge 0$. We show that there exists a constant $h(A) > 0$ such that the trajectory emanating from $x_o$ reaches $\mathbb{R}^n_+$ at a finite time $t_o = t(x_o) \ge 0$ if and only if the sequence of points generated by a finite differences approximation from $x_o$, with time step $0 < h < h(A)$, reaches $\mathbb{R}^n_+$ at a finite index $k_o = k(x_o) \ge 0$. This generalizes and strengthens earlier results of two of the authors, where some additional spectral restrictions were imposed on $A$. Our proof makes use of the existence of a basis of nonnegative vectors for the Perron eigenspace.
1 Introduction

Let $A$ be an $n \times n$ matrix and consider the linear differential system
\[ \dot{x}(t) = Ax(t), \quad t \ge 0. \tag{1.1} \]
The solution to (1.1), given by
\[ x(t) = e^{tA} x_o, \quad t \ge 0, \tag{1.2} \]
will be referred to as the trajectory of the differential system emanating from $x_o$. A set $\Gamma \subseteq \mathbb{R}^n$ is called positively invariant with respect to $A$ if $e^{tA}(\Gamma) \subseteq \Gamma$ for all $t \ge 0$. This condition implies that once a trajectory emanating from $x_o$ reaches $\Gamma$ in some finite time, it remains in $\Gamma$ thereafter. The set of all initial points of trajectories which reach $\Gamma$ is called the reachability set for $\Gamma$ under $A$ and is denoted by $X_A(\Gamma)$. Clearly,
\[ X_A(\Gamma) = \{ x_o \in \mathbb{R}^n \mid (\exists\, t_o = t(x_o) \ge 0)(\forall\, t \ge t_o)[e^{tA} x_o \in \Gamma] \} = \bigcup_{t \ge 0} e^{-tA}\Gamma. \tag{1.3} \]
In the case when $\Gamma$ is a proper cone, namely, a closed, convex, pointed and solid cone, it was shown in Neumann and Stern [?] that $X_A(\Gamma)$ is a convex cone itself, which contains $\Gamma$, but which is not necessarily closed or pointed. We shall denote the closure of $X_A(\Gamma)$ by $\overline{X_A(\Gamma)}$.

In this paper $A = (a_{ij})$ will always denote an essentially nonnegative matrix, that is, a matrix whose off-diagonal entries are nonnegative. This condition is known to be equivalent to the positive invariance of the nonnegative orthant $\mathbb{R}^n_+$ with respect to $A$ (see Bellman [?] and Birkhoff and Varga [?]). In Neumann and Stern [?] and in Berman, Neumann, and Stern [?], explicit formulas were derived for $X_A(\mathbb{R}^n_+)$ under additional spectral assumptions on $A$. In Neumann and Stern [?], a numerical characterization of $\overline{X_A(\mathbb{R}^n_+)}$ based on a Cauchy-Euler approximation to the solution to (1.1) was developed. However, there, in addition to $A$ being essentially nonnegative, it was assumed that $A$ has a real spectrum or that it possesses a strictly positive generalized eigenvector corresponding to its Perron root. For further description of these reachability results the reader is referred to
Chapter 6 of Berman, Neumann, and Stern [?].

In the present work we pose the questions of generalizing and strengthening the aforementioned numerical characterization of $\overline{X_A(\mathbb{R}^n_+)}$ by assuming that $A$ is only essentially nonnegative and by numerically characterizing $X_A(\mathbb{R}^n_+)$ and not just its closure. For the sake of completeness we first give one of the main results in Neumann and Stern [?]. Define
\[ h(A) = \sup\{ h \mid \min_{1 \le i \le n} (1 + h a_{ii}) > 0 \}. \tag{1.4} \]
It is clear that $h(A)$ can be infinite and that for any $0 < h < h(A)$ the matrix $I + hA$ is nonnegative. For any such $h$ we define the discrete reachability set of $\mathbb{R}^n_+$ to be
\[ X_{A,h}(\mathbb{R}^n_+) = \{ x_o \in \mathbb{R}^n \mid (\exists\, k_o = k(x_o) \ge 0)(\forall\, k \ge k_o)[(I + hA)^k x_o \ge 0] \} \tag{1.5} \]
which can be shown to be an invariant cone under $I + hA$. The sequence of vectors $x_k = (I + hA)^k x_o$, $k = 0, 1, \ldots$, will be referred to as the discrete trajectory emanating from $x_o$. This sequence represents an approximation to the solution of (1.1), in the sense of Cauchy-Euler, at a discrete set of times.

Next, it is known by the classical Perron-Frobenius theory that the Perron root
\[ \lambda_1 := \max\{ \operatorname{Re}\mu \mid \mu \in \sigma(A) \} \in \sigma(A), \tag{1.6} \]
where $\sigma(A)$ denotes the spectrum of $A$. Recall now that the Perron (generalized) eigenspace of $A$ is defined to be the $A$-invariant subspace
\[ W_A = N((\lambda_1 I - A)^p), \tag{1.7} \]
where $p$ is the multiplicity of $\lambda_1$ in the minimal polynomial of $A$. The result of Neumann and Stern which we wish to quote here can now be stated as follows.

Theorem 1.1 ([?], Theorem 2.2) Suppose $A$ is an $n \times n$ essentially nonnegative matrix. If
\[ W_A \cap \operatorname{int}(\mathbb{R}^n_+) \ne \emptyset, \tag{1.8} \]
then for any $0 < h < h(A)$ which satisfies
\[ |1 + h\lambda_1| > |1 + h\mu|, \quad \forall\, \mu \in \sigma(A) \setminus \{\lambda_1\}, \tag{1.9} \]
we have that
\[ \overline{X_{A,h}(\mathbb{R}^n_+)} = \overline{X_A(\mathbb{R}^n_+)}. \tag{1.10} \]
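The quantities in (1.4) and (1.5) are easy to compute for a given matrix. The following is a minimal sketch in plain Python, not taken from the paper: the function names (`step_bound`, `euler_matrix`, `first_nonnegative_index`) and the cap `max_steps` are illustrative assumptions. Because only finitely many steps are examined, a return value of `None` suggests, but does not prove, that $x_o$ lies outside the discrete reachability set.

```python
import math


def step_bound(A):
    """h(A) of (1.4): sup of h with min_i (1 + h*a_ii) > 0 (infinite if no a_ii < 0)."""
    bounds = [-1.0 / A[i][i] for i in range(len(A)) if A[i][i] < 0]
    return min(bounds) if bounds else math.inf


def euler_matrix(A, h):
    """The Cauchy-Euler matrix B(h) = I + h*A."""
    n = len(A)
    return [[(1.0 if i == j else 0.0) + h * A[i][j] for j in range(n)]
            for i in range(n)]


def matvec(M, x):
    """Matrix-vector product for nested lists."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def first_nonnegative_index(A, x0, h, max_steps=1000):
    """Smallest k <= max_steps with (I + hA)^k x0 >= 0, or None if none is found."""
    B = euler_matrix(A, h)
    x = list(x0)
    for k in range(max_steps + 1):
        if all(xi >= 0 for xi in x):
            return k
        x = matvec(B, x)
    return None


# Illustrative matrix (an assumption, chosen for exact arithmetic).
A = [[-1.0, 1.0], [0.0, -1.0]]
print(step_bound(A))
print(first_nonnegative_index(A, [-1.0, 1.0], 0.25))
print(first_nonnegative_index(A, [1.0, -1.0], 0.25, max_steps=200))
```

Since $\mathbb{R}^n_+$ is invariant under $I + hA$ for $0 < h < h(A)$, the first index at which the iterate is nonnegative can serve as $k_o$ in (1.5).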
We should emphasize the fact that the time step $h$ in (1.9) depends only on $A$ and is not necessarily small. In other words, Theorem 1.1 provides us with a way of testing whether a point lies in $\overline{X_A(\mathbb{R}^n_+)}$ regardless of how much the continuous and discrete trajectories emanating from $x_o$ diverge from each other.

The main objectives of this paper are to extend and improve the result of Theorem 1.1. This will be achieved in two principal stages. In the first stage, Theorem 3.1, we shall show that the conclusion of Theorem 1.1 holds for any essentially nonnegative matrix. Our proof of this result will be aided by a block triangular form, due to Hartwig, Neumann, and Rose [?], to which, actually, any nonnegative matrix can be symmetrically permuted. For the sake of completeness, this form is displayed in Lemma 2.1. In the second stage, Theorem 3.3, we shall show that with a possible slight additional restriction on the supremum of $h$, the equality in (1.10) holds even when the closure signs are removed. The proof of Theorem 3.3 requires that we further expose geometrical properties of the discrete reachability cone $X_{A,h}(\mathbb{R}^n_+)$. We do this in Lemma 3.2.

2 Preliminaries

Suppose that $B$ is an $n \times n$ nonnegative matrix and let $W_B = N((\rho(B)I - B)^p)$ be the generalized Perron eigenspace of $B$ corresponding to its spectral radius $\rho(B)$, where $p$ is the degree of $\rho(B)$ in the minimal polynomial of $B$. The lemma to follow assigns to $B$ a certain block upper triangular form which will be essential to the generalization of Theorem 1.1. It is based on Lemma 3.1 and Corollaries 3.1-3.2 of Hartwig, Neumann, and Rose [?]. We comment that this form could also be derived from proofs of the existence of a nonnegative basis for $W_B$, which are based on the Frobenius normal form for $B$, as given in Rothblum [?] and Richman and Schneider [?]. However, the proof of our main result in Section 3 does not require that we consider the Frobenius normal form of a matrix. For a nonnegative vector $w$, we shall let $\nu(w)$ denote the number of positive entries in $w$.
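Lemma 2.1 Any $n \times n$ nonnegative matrix $B$ is permutationally similar to a block upper triangular matrix
\[ \begin{bmatrix} B_{11} & B_{12} & \cdots & B_{1p} \\ & B_{22} & \cdots & B_{2p} \\ & & \ddots & \vdots \\ 0 & & & B_{pp} \end{bmatrix} \tag{2.1} \]
with the following properties. If for each $j \in \{1, 2, \ldots, p\}$ we set
\[ B_j = \begin{bmatrix} B_{jj} & \cdots & B_{jp} \\ & \ddots & \vdots \\ 0 & & B_{pp} \end{bmatrix} \in \mathbb{R}^{q,q}_+, \tag{2.2} \]
then each diagonal block $B_{jj}$ in (2.1) is of size $k_j \times k_j$, where $k_j = \max\{\nu(w) \mid w \in W_{B_j} \cap \mathbb{R}^q_+\}$, $q = n - k_1 - \cdots - k_{j-1}$, and it possesses a strictly positive generalized eigenvector
\[ u_j \in W_{B_{jj}} \cap \operatorname{int}\mathbb{R}^{k_j}_+ \tag{2.3} \]
corresponding to $\rho(B_{jj})$. Moreover,
\[ \rho(B_j) = \rho(B_{jj}), \quad \rho(B_j) > \rho(B_{j+1}), \quad 1 \le j \le p - 1, \tag{2.4} \]
\[ W_{B_j} = \{ u \in \mathbb{R}^q \mid u = [\bar{u}^T, 0]^T, \ \bar{u} \in W_{B_{jj}} \} \tag{2.5} \]
and
\[ (B_{jj} - \rho(B_{jj}) I)^{m_j - 1} u_j \in \mathbb{R}^{k_j}_+ \setminus \{0\}, \tag{2.6} \]
where $m_j = \operatorname{index}_{\rho(B_{jj})} B_{jj}$.

We conclude this section by recalling that $\mathbb{R}^n_+$ admits an alternative representation as the intersection of $n$ closed halfspaces (e.g. Rockafellar [?]) as follows: Let $\langle \cdot, \cdot \rangle$ denote the usual inner product in $\mathbb{R}^n$ and suppose that $\{\nu_i \mid i = 1, 2, \ldots, n\}$ is the set of all outward unit normals to $\mathbb{R}^n_+$ (that is, $\nu_i = [0, \ldots, 0, -1, 0, \ldots, 0]^T$, where the nonzero entry occurs in the $i$-th position). Then,
\[ \mathbb{R}^n_+ = \bigcap_{i=1}^{n} \{ z \in \mathbb{R}^n \mid \langle \nu_i, z \rangle \le 0 \}. \tag{2.7} \]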
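The halfspace description (2.7) is what turns "a point lies outside the orthant" into "some outward normal has a positive inner product with it", which is the form used in the proofs of Section 3. A small sketch of that test follows; the helper names `outward_normals` and `violated_normals` are hypothetical and serve only as illustration.

```python
def outward_normals(n):
    """The outward unit normals nu_i = -e_i to the nonnegative orthant in R^n."""
    return [[-1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]


def violated_normals(z):
    """Indices i with <nu_i, z> > 0; the list is empty exactly when z >= 0."""
    normals = outward_normals(len(z))
    return [i for i, nu in enumerate(normals)
            if sum(a * b for a, b in zip(nu, z)) > 0]


print(violated_normals([1.0, 0.0, 2.0]))        # point inside the orthant
print(violated_normals([1.0, -3.0, 0.0, -1.0])) # point outside the orthant
```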
3 The Main Results

Let $A$ be an $n \times n$ essentially nonnegative matrix and let $h(A)$ be as defined in (1.4). In what follows we shall assume, without loss of generality, that the matrix $B = B(h) = I + hA$ is already in the upper triangular block form given in Lemma 2.1. Otherwise, our considerations apply to a permutation similarity of $A$. We are now ready to state the first main result of this paper.

Theorem 3.1 Let $A$ be an $n \times n$ essentially nonnegative matrix. Then for all $0 < h < h(A)$,
\[ \overline{X_{A,h}(\mathbb{R}^n_+)} = \overline{X_A(\mathbb{R}^n_+)}. \tag{3.1} \]

Proof We will first show that $\overline{X_A(\mathbb{R}^n_+)} \subseteq \overline{X_{A,h}(\mathbb{R}^n_+)}$. Suppose that there exists $z \in \overline{X_A(\mathbb{R}^n_+)}$ such that
\[ z \notin \overline{X_{A,h}(\mathbb{R}^n_+)}. \tag{3.2} \]
Define the vector
\[ u = [u_1^T, u_2^T, \ldots, u_p^T]^T \in \operatorname{int}\mathbb{R}^n_+, \tag{3.3} \]
where the vectors $u_j$, $j = 1, 2, \ldots, p$, are defined as in Lemma 2.1. Then there exists a strictly decreasing sequence of positive numbers $\{\epsilon_m\}_{m=1}^{\infty}$, with $\epsilon_m \to 0$, such that
\[ z + \epsilon_m u \notin X_{A,h}(\mathbb{R}^n_+), \quad m \ge 1. \tag{3.4} \]
For an arbitrary but fixed $m$, the exclusion in (3.4) means by (2.7) that for each $i \ge 1$ there exists an outward unit normal $\nu^{(m,i)}$ to $\mathbb{R}^n_+$ such that
\[ \langle \nu^{(m,i)}, B^i (z + \epsilon_m u) \rangle > 0. \tag{3.5} \]
Since there are only a finite number of such normals there exists a normal, say $\nu^{(m)}$, and a sequence $\{i_k^{(m)}\}_{k=1}^{\infty}$ such that
\[ \langle \nu^{(m)}, B^{i_k^{(m)}} (z + \epsilon_m u) \rangle > 0, \quad k \ge 1. \tag{3.6} \]
Suppose that $z$ and $\nu^{(m)}$ are partitioned in conformity with (2.1). Assume further that the nonzero entry of $\nu^{(m)}$ occurs in the $\ell$-th block, namely,
\[ z = [z_1^T, \ldots, z_\ell^T, \ldots, z_p^T]^T \quad \text{and} \quad \nu^{(m)} = [0, \ldots, 0, (\nu_\ell^{(m)})^T, 0, \ldots, 0]^T. \tag{3.7} \]
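Consider now the trailing submatrix (viz. Lemma 2.1)
\[ B_\ell = I + hA_\ell \in \mathbb{R}^{q,q}, \quad q = n - k_1 - \cdots - k_{\ell-1} \tag{3.8} \]
and construct the following $q$-vectors:
\[ \bar{\nu}^{(m)} := [(\nu_\ell^{(m)})^T, 0, \ldots, 0]^T, \quad \bar{z}^{(m)} := [z_\ell^T, \ldots, z_p^T]^T, \quad \bar{u}^{(m)} := [u_\ell^T, 0, \ldots, 0]^T \in \mathbb{R}^q_+, \quad \hat{u}^{(m)} := [0, u_{\ell+1}^T, \ldots, u_p^T]^T \in \mathbb{R}^q_+. \tag{3.9} \]
Then, for all $k \ge 1$,
\[ \begin{aligned} \langle \bar{\nu}^{(m)}, B_\ell^{i_k^{(m)}} (\bar{z}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle &\ge \langle \bar{\nu}^{(m)}, B_\ell^{i_k^{(m)}} (\bar{z}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle + \langle \bar{\nu}^{(m)}, \epsilon_m B_\ell^{i_k^{(m)}} \hat{u}^{(m)} \rangle \\ &= \langle \bar{\nu}^{(m)}, B_\ell^{i_k^{(m)}} (\bar{z}^{(m)} + \epsilon_m (\bar{u}^{(m)} + \hat{u}^{(m)})) \rangle \\ &= \langle \nu^{(m)}, B^{i_k^{(m)}} (z + \epsilon_m u) \rangle. \end{aligned} \tag{3.10} \]
Thus by (3.5) and (3.10),
\[ \langle \bar{\nu}^{(m)}, B_\ell^{i_k^{(m)}} (\bar{z}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle > 0, \quad k \ge 1. \tag{3.11} \]
Next, as $e^{tA}$ is block upper triangular for all $t \ge 0$, a necessary condition for the assumption $z \in \overline{X_A(\mathbb{R}^n_+)}$ to hold true is that
\[ \bar{z}^{(m)} \in \overline{X_{A_\ell}(\mathbb{R}^q_+)}. \tag{3.12} \]
Recall now that $u_\ell \in \operatorname{int}\mathbb{R}^{k_\ell}_+$. It was shown in Neumann and Stern [?] that $\operatorname{int} X_{A_{\ell\ell}}(\mathbb{R}^{k_\ell}_+) = X_{A_{\ell\ell}}(\operatorname{int}\mathbb{R}^{k_\ell}_+)$ and, furthermore, once a trajectory has entered $\operatorname{int}\mathbb{R}^{k_\ell}_+$ it cannot in finite time reach the boundary of $\mathbb{R}^{k_\ell}_+$. Consequently
\[ \langle \bar{\nu}^{(m)}, \epsilon_m e^{tA_\ell} \bar{u}^{(m)} \rangle < 0, \quad \forall\, t \ge 0. \tag{3.13} \]
Then, (3.12) and (3.13) imply that there exists a sufficiently large $t_m \ge 0$ such that
\[ \langle \bar{\nu}^{(m)}, e^{tA_\ell} (\bar{z}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle < 0, \quad \forall\, t \ge t_m. \tag{3.14} \]
We shall next show that (3.11) and (3.14) are incompatible as we vary $m$. But first, let
\[ \lambda_\ell := \max\{ \operatorname{Re}\mu \mid \mu \in \sigma(A_\ell) \} \tag{3.15} \]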
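and write
\[ B_\ell = (1 + h\lambda_\ell) I + h(A_\ell - \lambda_\ell I). \tag{3.16} \]
Observe then that, by Lemma 2.1, $\bar{u}^{(m)} \in W_{A_\ell} \cap \mathbb{R}^q_+$ and consider the resolution of $\bar{z}^{(m)} + \epsilon_m \bar{u}^{(m)}$ into
\[ \bar{z}^{(m)} + \epsilon_m \bar{u}^{(m)} = (\bar{w}^{(m)} + \epsilon_m \bar{u}^{(m)}) + r^{(m)}, \tag{3.17} \]
where $r^{(m)}$ is the projection of $\bar{z}^{(m)}$ (and hence of $\bar{z}^{(m)} + \epsilon_m \bar{u}^{(m)}$) onto $U_{A_\ell}$, the join of all eigenspaces of $A_\ell$ corresponding to eigenvalues $\mu \ne \lambda_\ell$, along $W_{A_\ell}$. We claim that, independently of $m$,
\[ \langle \bar{\nu}^{(m)}, \bar{w}^{(m)} + \epsilon_m \bar{u}^{(m)} \rangle = 0. \tag{3.18} \]
Suppose to the contrary. As $A_\ell - \lambda_\ell I$ is nilpotent on $W_{A_\ell}$, let $p_1 \ge 0$ be the largest integer such that
\[ \langle \bar{\nu}^{(m)}, (A_\ell - \lambda_\ell I)^{p_1} (\bar{w}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle \ne 0 \tag{3.19} \]
and
\[ \langle \bar{\nu}^{(m)}, (A_\ell - \lambda_\ell I)^{j} (\bar{w}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle = 0, \quad \forall\, j > p_1. \tag{3.20} \]
If $A$ is a nonnegative matrix, so that $h(A) = \infty$, then for any $h \in (0, \infty)$ the eigenvalues of $I + hA_\ell$ satisfy
\[ |1 + h\lambda_\ell| > |1 + h\mu|, \quad \forall\, \mu \in \sigma(A_\ell) \setminus \{\lambda_\ell\} \tag{3.21} \]
or else (3.15) is violated. Similarly, if $h(A) < \infty$, (3.21) holds true for all $0 < h < h(A)$. Consequently, the restriction of $B_\ell / (1 + h\lambda_\ell)$ to $U_{A_\ell}$ is a convergent matrix, namely,
\[ \lim_{k \to \infty} \left[ \frac{B_\ell}{1 + h\lambda_\ell} \right]^{i_k^{(m)}} r^{(m)} = 0. \tag{3.22} \]
Then for $i_k^{(m)} > p_1$, by (3.19) and (3.20) we can write that
\[ \begin{aligned} \langle \bar{\nu}^{(m)}, B_\ell^{i_k^{(m)}} (\bar{z}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle = {} & \binom{i_k^{(m)}}{p_1} (1 + h\lambda_\ell)^{i_k^{(m)} - p_1} h^{p_1} \langle \bar{\nu}^{(m)}, (A_\ell - \lambda_\ell I)^{p_1} (\bar{w}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle \\ & + \sum_{j=0}^{p_1 - 1} \binom{i_k^{(m)}}{j} (1 + h\lambda_\ell)^{i_k^{(m)} - j} h^{j} \langle \bar{\nu}^{(m)}, (A_\ell - \lambda_\ell I)^{j} (\bar{w}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle \\ & + \langle \bar{\nu}^{(m)}, B_\ell^{i_k^{(m)}} r^{(m)} \rangle. \end{aligned} \tag{3.23} \]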
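Note that as $k \to \infty$ and for all $0 \le j < p_1$,
\[ \binom{i_k^{(m)}}{j} \Big/ \binom{i_k^{(m)}}{p_1} \to 0. \tag{3.24} \]
Also, by (3.22),
\[ \lim_{k \to \infty} \frac{(1 + h\lambda_\ell)^{p_1}}{\binom{i_k^{(m)}}{p_1} h^{p_1}} \left\langle \bar{\nu}^{(m)}, \left[ \frac{B_\ell}{1 + h\lambda_\ell} \right]^{i_k^{(m)}} r^{(m)} \right\rangle = 0. \tag{3.25} \]
Now, since $1 + h\lambda_\ell > 0$, upon taking $k$ sufficiently large, relations (3.11) and (3.23)-(3.25) imply that
\[ \langle \bar{\nu}^{(m)}, (A_\ell - \lambda_\ell I)^{p_1} (\bar{w}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle > 0. \tag{3.26} \]
On the other hand, the restriction of $A_\ell - \lambda_\ell I$ to $U_{A_\ell}$ is a stability matrix and so
\[ \lim_{t \to \infty} e^{t(A_\ell - \lambda_\ell I)} r^{(m)} = 0. \tag{3.27} \]
Now, by (3.19) and (3.20) one obtains
\[ \begin{aligned} \langle \bar{\nu}^{(m)}, e^{tA_\ell} (\bar{z}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle = {} & e^{t\lambda_\ell} \langle \bar{\nu}^{(m)}, e^{t(A_\ell - \lambda_\ell I)} (\bar{w}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle + e^{t\lambda_\ell} \langle \bar{\nu}^{(m)}, e^{t(A_\ell - \lambda_\ell I)} r^{(m)} \rangle \\ = {} & \frac{t^{p_1} e^{t\lambda_\ell}}{p_1!} \langle \bar{\nu}^{(m)}, (A_\ell - \lambda_\ell I)^{p_1} (\bar{w}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle \\ & + e^{t\lambda_\ell} \sum_{j=0}^{p_1 - 1} \frac{t^{j}}{j!} \langle \bar{\nu}^{(m)}, (A_\ell - \lambda_\ell I)^{j} (\bar{w}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle \\ & + e^{t\lambda_\ell} \langle \bar{\nu}^{(m)}, e^{t(A_\ell - \lambda_\ell I)} r^{(m)} \rangle. \end{aligned} \tag{3.28} \]
But then, for $t$ sufficiently large, (3.14), (3.27) and (3.28) give us that
\[ \langle \bar{\nu}^{(m)}, (A_\ell - \lambda_\ell I)^{p_1} (\bar{w}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle \le 0, \tag{3.29} \]
a contradiction to (3.26) showing that (3.18) is valid. On setting $w^{(m)} := [0, (\bar{w}^{(m)})^T]^T \in \mathbb{R}^n$, by (3.17) and (3.18) we obtain that
\[ \langle \nu^{(m)}, w^{(m)} \rangle + \epsilon_m \langle \nu^{(m)}, u \rangle = 0. \tag{3.30} \]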
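The comparison of (3.23) with (3.28) rests on the fact that, on the Perron eigenspace, both the discrete and the continuous trajectories are eventually dominated by the same vector $(A_\ell - \lambda_\ell I)^{p_1}(\cdot)$, multiplied by a positive factor. The sketch below checks the binomial expansion numerically in the simplest setting, under assumptions that are not from the paper: a single $2 \times 2$ Jordan block (so that $r^{(m)} = 0$ and $p_1 = 1$), with hypothetical helper names.

```python
from math import comb


def matvec(M, x):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def iterate(B, w, i):
    """B^i w by repeated multiplication."""
    x = list(w)
    for _ in range(i):
        x = matvec(B, x)
    return x


def binomial_expansion(lam, N, w, h, i):
    """sum_j C(i,j) (1 + h*lam)^(i-j) h^j N^j w, for B = (1 + h*lam) I + h N."""
    total = [0.0] * len(w)
    Njw = list(w)
    for j in range(i + 1):
        if not any(Njw):  # N is nilpotent, so later terms vanish
            break
        coeff = comb(i, j) * (1 + h * lam) ** (i - j) * h ** j
        total = [t + coeff * v for t, v in zip(total, Njw)]
        Njw = matvec(N, Njw)
    return total


# Assumed example: A = lam*I + N with N nilpotent, and B = I + h*A.
lam, h = -1.0, 0.25
N = [[0.0, 1.0], [0.0, 0.0]]
B = [[1 + h * lam, h], [0.0, 1 + h * lam]]
w = [-1.0, 1.0]
nu = [-1.0, 0.0]  # outward normal -e_1

direct = iterate(B, w, 5)
formula = binomial_expansion(lam, N, w, h, 5)
print(direct, formula)
# Sign of <nu, B^i w> agrees with the sign of <nu, N w> once i is large enough.
print(dot(nu, direct), dot(nu, matvec(N, w)))
```

In this example the continuous counterpart is $e^{tA}w = e^{t\lambda}(w + tNw)$, whose first component changes sign at $t = 1$, mirroring the discrete behaviour.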
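Now vary $m$. Then, since the number of outward normals is finite, there must exist indices $m_2 > m_1 \ge 1$ such that $\nu := \nu^{(m_1)} = \nu^{(m_2)}$, and hence $w := w^{(m_1)} = w^{(m_2)}$, so that
\[ \langle \nu, w \rangle + \epsilon_{m_i} \langle \nu, u \rangle = 0, \quad i = 1, 2, \tag{3.31} \]
which, as $\epsilon_{m_2} < \epsilon_{m_1}$, is only possible if
\[ \langle \nu, u \rangle = 0, \tag{3.32} \]
which is not possible as $u \in \operatorname{int}\mathbb{R}^n_+$. This shows that
\[ \overline{X_A(\mathbb{R}^n_+)} \subseteq \overline{X_{A,h}(\mathbb{R}^n_+)} \tag{3.33} \]
and completes the first part of the theorem.

We shall next prove the reverse containment, namely, that
\[ \overline{X_{A,h}(\mathbb{R}^n_+)} \subseteq \overline{X_A(\mathbb{R}^n_+)}. \tag{3.34} \]
Let $z \in \overline{X_{A,h}(\mathbb{R}^n_+)}$ such that
\[ z \notin \overline{X_A(\mathbb{R}^n_+)}. \tag{3.35} \]
Suppose that $u$ and $\{\epsilon_m\}_{m=1}^{\infty}$ are chosen as before so that
\[ z + \epsilon_m u \notin X_A(\mathbb{R}^n_+), \quad m \ge 1. \tag{3.36} \]
Consider $m$ arbitrary but fixed. The exclusion in (3.36) means that for any unbounded sequence of strictly increasing positive times $\{t_j^{(m)}\}_{j=1}^{\infty}$, there exist $\nu^{(m)}$, $\bar{\nu}^{(m)}$, $\bar{z}^{(m)}$, $\bar{u}^{(m)}$, defined in a manner similar to the first part of the theorem (viz. equations (3.3)-(3.11)), so that
\[ \langle \bar{\nu}^{(m)}, e^{t_j^{(m)} A_\ell} (\bar{z}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle > 0, \quad j \ge 1. \tag{3.37} \]
Next, by equations (2.4) and (2.6) of Lemma 2.1 we can deduce that $B_{\ell\ell}^{\,j} u_\ell \in \operatorname{int}\mathbb{R}^{k_\ell}_+$ for all $j \ge 0$. Observe now that the assumption $z \in \overline{X_{A,h}(\mathbb{R}^n_+)}$ implies that there exists a sufficiently large exponent $j_m$ so that
\[ \langle \bar{\nu}^{(m)}, B_\ell^{\,j} (\bar{z}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle < 0, \quad \forall\, j \ge j_m. \tag{3.38} \]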
We shall now show that, as we vary $m$, equations (3.37) and (3.38) are incompatible. For this, let us resolve $\bar{z}^{(m)} + \epsilon_m \bar{u}^{(m)}$ as in (3.17). Then, an analysis similar to the one for equations (3.23) and (3.28) shows that
\[ \langle \bar{\nu}^{(m)}, \bar{w}^{(m)} + \epsilon_m \bar{u}^{(m)} \rangle = 0 \tag{3.39} \]
or else, for some $p_1 \ge 0$,
\[ \operatorname{sgn} \langle \bar{\nu}^{(m)}, e^{t_j^{(m)} A_\ell} (\bar{z}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle = \operatorname{sgn} \langle \bar{\nu}^{(m)}, (A_\ell - \lambda_\ell I)^{p_1} (\bar{w}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle \tag{3.40} \]
and
\[ \operatorname{sgn} \langle \bar{\nu}^{(m)}, B_\ell^{\,j} (\bar{z}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle = \operatorname{sgn} \langle \bar{\nu}^{(m)}, (A_\ell - \lambda_\ell I)^{p_1} (\bar{w}^{(m)} + \epsilon_m \bar{u}^{(m)}) \rangle, \tag{3.41} \]
which contradict (3.37) and (3.38). Then, in a similar fashion to (3.30) and (3.31), upon varying $m$, it follows from (3.39) that for some outward normal $\nu$ to $\mathbb{R}^n_+$, $\langle \nu, u \rangle = 0$. This is a contradiction to (3.3) which completes the proof of the theorem.

As a consequence of the fact that in general the reachability cone is not closed, in order to achieve our goal of providing a numerical characterization for the elements of $X_A(\mathbb{R}^n_+)$ we must improve the result of Theorem 3.1 by distinguishing between those boundary points which reach $\mathbb{R}^n_+$ and those which do not. The set of all such points was termed in [?] the effective part of the boundary of $X_A(\mathbb{R}^n_+)$. In the remainder of this section we shall utilize some geometrical properties of the continuous and discrete reachability cones to show that for all but a finite number of values of $h$ in $(0, h(A))$ we can strengthen the result of Theorem 3.1, given in equation (3.1), by proving that
\[ X_A(\mathbb{R}^n_+) = X_{A,h}(\mathbb{R}^n_+). \tag{3.42} \]
For this purpose we require the following lemma, some of whose clauses have already been established in the literature. For any set $\Gamma \subseteq \mathbb{R}^n$ and $A \in \mathbb{R}^{n,n}$ we define $\operatorname{core}_A(\Gamma)$ to be the maximal subset of $\Gamma$ which is positively invariant with respect to $A$ and $\operatorname{core}_{A,h}(\Gamma)$ to be the maximal subset of $\Gamma$ which is invariant with respect to $B = B(h) = I + hA$. As usual, we denote the boundary of $\Gamma$ by $\partial\Gamma$.

Lemma 3.2 Let $A$ be an $n \times n$ essentially nonnegative matrix and suppose that $0 < h < h(A)$ is chosen so that $B = B(h) = I + hA$ is invertible. Then the following hold.
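(i) $X_A(\operatorname{int}\mathbb{R}^n_+) = \operatorname{int} X_A(\mathbb{R}^n_+)$.
(ii) $X_A(\mathbb{R}^n_+)$ is positively invariant with respect to $A$.
(iii) $X_{A,h}(\operatorname{int}\mathbb{R}^n_+) = \operatorname{int} X_{A,h}(\mathbb{R}^n_+)$.
(iv) $X_{A,h}(\mathbb{R}^n_+)$ is invariant with respect to $B$.
(v) $X_A(\mathbb{R}^n_+) \cap \partial X_A(\mathbb{R}^n_+) = X_A(\operatorname{core}_A(\partial\mathbb{R}^n_+))$.
(vi) $X_{A,h}(\mathbb{R}^n_+) \cap \partial X_{A,h}(\mathbb{R}^n_+) = X_{A,h}(\operatorname{core}_{A,h}(\partial\mathbb{R}^n_+))$.
(vii) $\operatorname{core}_A(\partial\mathbb{R}^n_+)$ is the union of all the positively invariant faces of $\mathbb{R}^n_+$.
(viii) $\operatorname{core}_{A,h}(\partial\mathbb{R}^n_+) = \operatorname{core}_A(\partial\mathbb{R}^n_+)$.

Proof Claims (i), (ii), (v) and (vii) are due to Neumann and Stern [?] (see also Berman, Neumann, and Stern [?]). We comment that it is evident from the proof of (i) in [?] that once a trajectory emanating from a point in $\operatorname{int} X_A(\mathbb{R}^n_+)$ has entered $\operatorname{int}\mathbb{R}^n_+$, it remains in $\operatorname{int}\mathbb{R}^n_+$ for all finite time. Now let $h$ be as prescribed. Then the proofs of (iii), (iv), and (vi) are very similar to their respective counterparts in the continuous case and therefore they are omitted.

To show (viii) consider a point $z \in \operatorname{core}_A(\partial\mathbb{R}^n_+)$. By (vii) $z$ belongs to some positively invariant face $F$ of $\mathbb{R}^n_+$. Thus, since $F$ is a polyhedral proper cone in the span of $F$, the restriction of $A$ to $F$ must be essentially $F$-nonnegative, namely for some $\alpha \ge 0$,
\[ (A + \alpha I) F \subseteq F \tag{3.43} \]
(see Schneider and Vidyasagar [?] and Stern [?]). Then, for any $x \in F \subseteq \mathbb{R}^n_+$ we have that $y := (A + h^{-1} I)x \in \mathbb{R}^n_+$ and by (3.43) there exists some $\beta \ge 0$ such that $y + \beta x \in F$ and so, by the definition of a face, we have the implication
\[ 0 \le y \le y + \beta x \in F \implies y \in F. \tag{3.44} \]
Thus, $hy = Bx \in F$, that is, for any choice of $h \in (0, h(A))$, $F$ is $B = B(h)$ invariant. This means that $z \in \operatorname{core}_{A,h}(\partial\mathbb{R}^n_+)$ and hence
\[ \operatorname{core}_A(\partial\mathbb{R}^n_+) \subseteq \operatorname{core}_{A,h}(\partial\mathbb{R}^n_+). \tag{3.45} \]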
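The face argument in (3.43)-(3.45) can be illustrated concretely. The sketch below relies on a standard criterion that is assumed here rather than stated in the paper: the face $F_S = \{x \ge 0 \mid x_j = 0,\ j \notin S\}$ is invariant for a matrix $M$ with nonnegative off-diagonal entries exactly when $m_{ij} = 0$ for all $i \notin S$, $j \in S$. Since $A$ and $I + hA$ have the same off-diagonal zero pattern, they share the same invariant faces. The function names and the example matrix are illustrative assumptions.

```python
from itertools import combinations


def face_is_invariant(M, support):
    """Assumed criterion: M[i][j] == 0 whenever i is outside S and j is inside S."""
    S = set(support)
    n = len(M)
    return all(M[i][j] == 0 for i in range(n) if i not in S for j in S)


def invariant_proper_faces(M):
    """Supports S (as tuples) of the proper faces F_S that pass the criterion."""
    n = len(M)
    return [S for r in range(n) for S in combinations(range(n), r)
            if face_is_invariant(M, S)]


def euler_matrix(A, h):
    """B(h) = I + h*A."""
    n = len(A)
    return [[(1.0 if i == j else 0.0) + h * A[i][j] for j in range(n)]
            for i in range(n)]


# Assumed example: an upper triangular essentially nonnegative matrix, h(A) = 1/2.
A = [[-1.0, 1.0, 0.0],
     [0.0, -2.0, 1.0],
     [0.0, 0.0, -1.0]]
B = euler_matrix(A, 0.25)

print(invariant_proper_faces(A))
print(invariant_proper_faces(B))
```

The empty support corresponds to the zero-dimensional face (the origin), which is always invariant.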
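Conversely, suppose that $z \in \operatorname{core}_{A,h}(\partial\mathbb{R}^n_+)$ and note that, in particular, $z \in \partial\mathbb{R}^n_+$. Thus if $z \notin \operatorname{core}_A(\partial\mathbb{R}^n_+)$, then we must have that $z \in X_A(\operatorname{int}\mathbb{R}^n_+)$. It follows from (3.1) that
\[ \operatorname{int} X_A(\mathbb{R}^n_+) = \operatorname{int} X_{A,h}(\mathbb{R}^n_+) \tag{3.46} \]
which, combined with claims (i) and (iii), implies that
\[ X_A(\operatorname{int}\mathbb{R}^n_+) = X_{A,h}(\operatorname{int}\mathbb{R}^n_+). \tag{3.47} \]
But then $z \in X_{A,h}(\operatorname{int}\mathbb{R}^n_+)$, contradicting the fact that $z \in \operatorname{core}_{A,h}(\partial\mathbb{R}^n_+)$. This completes the proof of the lemma.

We can now prove our second main result.

Theorem 3.3 Let $A$ be an $n \times n$ essentially nonnegative matrix. Then for all $0 < h < h(A)$ such that $B = B(h) = I + hA$ is invertible we have
\[ X_A(\mathbb{R}^n_+) = X_{A,h}(\mathbb{R}^n_+). \tag{3.48} \]

Proof Let $h$ be as prescribed in the statement of the theorem. First we will show that
\[ X_A(\mathbb{R}^n_+) \subseteq X_{A,h}(\mathbb{R}^n_+). \tag{3.49} \]
By Theorem 3.1 the interior points of the two cones coincide so it suffices to show that
\[ X_A(\mathbb{R}^n_+) \cap \partial X_A(\mathbb{R}^n_+) \subseteq X_{A,h}(\mathbb{R}^n_+). \tag{3.50} \]
Suppose that $z \in X_A(\mathbb{R}^n_+) \cap \partial X_A(\mathbb{R}^n_+)$. Then, by (v) and (vii) of Lemma 3.2, $z \in X_A(F_1)$, where $F_1$ is some positively invariant and hence, according to the explanation involving (3.43) and (3.44), a $B$-invariant face of $\mathbb{R}^n_+$. Without loss of generality assume that, for some $k \ge 1$,
\[ F_1 = \{ x = (x_1, x_2, \ldots, x_n)^T \in \mathbb{R}^n_+ \mid x_j = 0, \ \forall\, j > n - k \} \tag{3.51} \]
in which case, as $BF_1 \subseteq F_1$, $B$ has necessarily the block upper triangular form
\[ B = \begin{bmatrix} B_{11} & B_{12} \\ 0 & B_{22} \end{bmatrix} = I + h \begin{bmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{bmatrix}, \tag{3.52} \]
where $A_{11} \in \mathbb{R}^{n-k,n-k}$ and $A_{22} \in \mathbb{R}^{k,k}$. Otherwise, our considerations apply to a permutation similarity of $A$. Then $z$ has the form
\[ z = [z_1^T, 0]^T, \quad z_1 \in X_{A_{11}}(\mathbb{R}^{n-k}_+). \tag{3.53} \]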
Thus, since $h(A) \le h(A_{11})$, Theorem 3.1 applied to $A_{11}$ yields that
\[ z_1 \in \overline{X_{A_{11},h}(\mathbb{R}^{n-k}_+)} = \overline{X_{A_{11}}(\mathbb{R}^{n-k}_+)} \tag{3.54} \]
and so, by (3.53),
\[ z \in \overline{X_{A,h}(F_1)} = \overline{X_A(F_1)}. \tag{3.55} \]
If now $z_1 \in \operatorname{int} X_{A_{11}}(\mathbb{R}^{n-k}_+) = \operatorname{int} X_{A_{11},h}(\mathbb{R}^{n-k}_+)$, then $z \in X_{A,h}(F_1)$ which implies that
\[ z \in X_{A,h}(\mathbb{R}^n_+) \tag{3.56} \]
and we are done. Suppose then that
\[ z_1 \in X_{A_{11}}(\mathbb{R}^{n-k}_+) \cap \partial X_{A_{11}}(\mathbb{R}^{n-k}_+). \tag{3.57} \]
We may then apply a similar analysis to $z_1$ to show that there exists a $B$-invariant face $F_2$ of $F_1$ such that
\[ z \in \overline{X_{A,h}(F_2)} = \overline{X_A(F_2)}. \tag{3.58} \]
Then, either $z \in X_{A,h}(F_2)$ or we continue the reduction. The process must terminate when the face under consideration is of dimension 0 (the origin) or of dimension 1 (the span of a Perron eigenvector). In either case the problem is trivial and (3.49) is shown. The reverse containment follows similarly by applying (vi) and (viii) of Lemma 3.2.
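The invertibility requirement on $B(h)$ in Theorem 3.3 cannot simply be dropped. The following sketch uses an example constructed for this purpose (it is not from the paper): $A = \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix}$, for which $h(A) = 1$, and $x_o = (1, -1)^T$, which satisfies $Ax_o = -2x_o$. The continuous trajectory is therefore $e^{-2t}x_o$ and never enters $\mathbb{R}^2_+$, while for $h = 1/2$ the matrix $B(h)$ is singular and maps $x_o$ to the origin. The helper names are hypothetical, and the closed form in `continuous_point` is valid only for this eigenvector.

```python
import math

A = [[-1.0, 1.0], [1.0, -1.0]]
x0 = [1.0, -1.0]  # eigenvector of A for the eigenvalue -2


def euler_matrix(A, h):
    """B(h) = I + h*A."""
    n = len(A)
    return [[(1.0 if i == j else 0.0) + h * A[i][j] for j in range(n)]
            for i in range(n)]


def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def matvec(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def discrete_reaches(A, x, h, max_steps=200):
    """True if some iterate (I + hA)^k x with k <= max_steps is nonnegative."""
    B = euler_matrix(A, h)
    x = list(x)
    for _ in range(max_steps + 1):
        if all(v >= 0 for v in x):
            return True
        x = matvec(B, x)
    return False


def continuous_point(t):
    """e^{tA} x0 = e^{-2t} x0, using A x0 = -2 x0 (special to this x0)."""
    return [math.exp(-2.0 * t) * v for v in x0]


print(det2(euler_matrix(A, 0.5)))    # singular time step
print(discrete_reaches(A, x0, 0.5))  # discrete trajectory hits the origin
print(discrete_reaches(A, x0, 0.25)) # invertible B(h): no iterate found in the orthant
print(all(continuous_point(t)[1] < 0 for t in range(0, 50)))
```

For $h = 1/4$ the iterates are $2^{-k}x_o$, so the discrete trajectory stays outside the orthant exactly as the continuous one does, which is consistent with (3.48).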