A FAMILY OF ANADROMIC NUMERICAL METHODS FOR MATRIX RICCATI DIFFERENTIAL EQUATIONS


REN-CANG LI AND WILLIAM KAHAN

Abstract. Matrix Riccati Differential Equations (MRDE)

    X' = A_{21} - X A_{11} + A_{22} X - X A_{12} X,   X(0) = X_0,

where A_{ij} ≡ A_{ij}(t), appear frequently throughout applied mathematics, science, and engineering. Naturally the existing conventional Runge-Kutta methods and linear multistep methods can be adapted to solve MRDE numerically, and indeed they have been. There are a few unconventional numerical methods, too, but they are suited more to time-invariant MRDE than to time-varying ones. For stiff MRDE, existing implicit methods, which are preferred to explicit ones, require solving nonlinear systems of equations (of possibly much higher dimension than the original problem itself), as implicit Runge-Kutta methods do; thus they not only pose implementation difficulties but also are expensive. Up to today, the most commonly preserved property of MRDE is the symmetry of a symmetric MRDE, while many other crucial properties are discarded. Besides the symmetry property, our proposed methods preserve two other important properties: Bilinear Dependence on the initial value, and the Generalized Inverse Relation between an MRDE and its complementary MRDE. By preserving the generalized inverse relation, our methods are able to accurately integrate MRDE whose solutions have singularities. By preserving bilinear dependence on the initial value, our methods also conserve the rank of a change to the initial value and a solution monotonicity property. Our methods are anadromic(1), meaning that if an MRDE is integrated by one of our methods from t = τ to τ + θ and then integrated backward from t = τ + θ to τ using the same method, the value at t = τ is recovered in the absence of rounding errors. This implies that our methods are necessarily of even order of convergence.
For time-invariant MRDE, methods of any even order of convergence are established, while for time-varying MRDE, methods of order as high as 10 are established, but only methods of order up to 6 are stated in detail. Our methods are semi-implicit in the sense that there are no nonlinear systems of matrix equations to solve, only linear ones, unlike any existing implicit method. Given the availability of high quality code for linear matrix equations, our methods can be easily implemented and embedded into any application software package that needs a robust MRDE solver. Numerical examples are presented to support our claims.

(1) We coined "anadromic" from the Greek roots ανα- (back up) and δρομος (act of running) of "anadromous" (which describes fish that must return from the ocean to spawn in the same streams whence they hatched) after we had tried to describe our numerical methods with words like "reflexive" [7, 8], "symmetric" (which is commonly used today in the literature, e.g., [2]), and "reversible", all of which turned out to be overworked.

Date: Received ??? and, in revised form, ???
Mathematics Subject Classification. 65L05.
Key words and phrases. Matrix Riccati differential equations, solution singularity, anadromic numerical method, group of two-sided bilinear rational functions, generalized inverse relation.
This work was supported in part by the National Science Foundation Grant DMS

1. Introduction

Matrix Riccati Differential Equations (MRDEs) arise frequently throughout applied mathematics, science, and engineering. They in particular play major roles in optimal control, filtering and estimation [30] and in solving linear two-point boundary value problems of ordinary differential equations (ODEs) [3, 4, 18, 19]. A number of algorithms have been proposed in the past for solving MRDEs numerically. These include carefully re-designed conventional Runge-Kutta methods and linear multistep methods for ODEs by Choi and Laub [10] and by Dieci [15], and unconventional methods for MRDEs arising from optimal control theory by, e.g., [1, 9, 3, 31, 33, 34, 37, 40, 4, 44]. It is known that these unconventional methods are either not suited or inefficient for time-varying MRDEs. While the re-designed conventional methods benefit greatly from past development of sophisticated general-purpose computer programs for ODEs, they can easily evolve into complicated programs thousands of lines long, with somewhat complicated interfaces. Implicit conventional methods, which are preferred to explicit ones for stiff systems, require solving nonlinear systems of equations (of possibly much higher dimension than the original problem itself) in the Runge-Kutta case, which not only poses implementation difficulties but also may be expensive. None of these existing methods can integrate over solution singularities (poles), a situation that does occur in applications, e.g., one-dimensional quantum Hamilton-Jacobi equations [11]. In this paper, we will establish a family of unconventional numerical methods capable of producing meaningful numerical results even if there are poles in the solution. This capability is a byproduct of our numerical formulas preserving crucial structural properties previously disregarded. One of our second order methods is not entirely new. It was used in 1987 by Babuška and Majer [4, Section 3.2]
for both time-invariant and time-varying MRDEs, and it had also been used independently by the second author here in one of his unpublished notes in the 1980s. While this method is only of order 2, it conserves many important properties of MRDE that are crucial to our investigation here. Such conservation was not known in [4], however.

Generally an MRDE takes the form

(MRDE)   X' = A_{21} - X A_{11} + A_{22} X - X A_{12} X,   X(0) = X_0,

where X is an n-by-m (not necessarily square) matrix-valued function of time t, and all A_{ij} are smooth matrix-valued functions of time t, too, with dimensions determined by the conformal partitioning

(1.1)   A ≡ A(t) = ( A_{11}  A_{12} ; A_{21}  A_{22} ),   with A_{11} m-by-m and A_{22} n-by-n.

The general form of our methods for integrating from t = τ to t = τ + θ is

(1.2)   (Y - X)/(θ/2) = Σ_{l=0}^{k-1} (θ/2)^{2l} c_l f_l(X, Y),    (Z - Y)/(θ/2) = Σ_{l=0}^{k-1} (θ/2)^{2l} c_l f_l(Z, Y),

where X ≈ X(τ) and Z ≈ X(τ + θ), the f_l(·,·) are matrix-valued functions to be determined so that this gives a method of order 2k while the equations in (1.2) remain linear in Y and Z, respectively, and the c_l are the coefficients in the power series of tanh t. For time-invariant MRDE, we have found all such

methods of all even orders, while for time-varying MRDE, we have found methods of order as high as 10, but only methods of order up to 6 are described in detail owing to their exponentially growing complexity as the order increases. Mathematica played a major role in our finding of those methods for the time-varying MRDE. The method (1.2) can be cast into the framework of modified integrators in the sense of [8], and is also closely related to the implicit midpoint rule applied to an associated linear differential equation. The latter fact, generously shared with us by Hairer [3], makes it possible for us to significantly simplify our earlier construction of high order methods in the form of (1.2) in [35], through establishing modified implicit midpoint rules for a linear differential equation [36]. Geometrically, MRDE can be viewed as a flow on the Grassmann manifold [46]. Schiff and Shnider [41] appear to have been the first to take advantage of this point of view; they proposed so-called Möbius schemes to better simulate the flow. The basic idea is to numerically preserve the (bilinear) rational dependence property, one of the three properties we shall discuss in Section 2, and it is done through approximating the fundamental solution of the associated linear differential equation. It was argued that this preservation would enable the schemes to deal with numerical instability and pass accurately through the singularities. Our methods (1.2) preserve the (bilinear) rational dependence property, besides the other two properties, symmetry and the generalized inverse relation. In this sense, our methods fit into the form of their Möbius schemes. But we argue that it is the preservation of the generalized inverse relation property that enables us to give a rigorous justification of why our methods can pass singularities.
Solutions X(t) to (MRDE) can also be regarded as sample-values of members of the two-sided bilinear rational matrix function group, selected by t and sampled at an indeterminate X(0). Our methods (1.2), as well as the Möbius schemes in [46], approximate X(t) by a sequence of sample-values, each drawn from the same group of two-sided bilinear rational matrix functions regardless of X(0). More detail is in Section 5. In analyzing our methods, the complementary MRDE to (MRDE),

(cMRDE)   U' = A_{12} - U A_{22} + A_{11} U - U A_{21} U,   U(0) = U_0,

plays a major role. We say (cMRDE) is the complement of (MRDE). It can be seen that the complement of the complement of an MRDE is the MRDE itself; in particular, the complement of (cMRDE) becomes (MRDE). To single out these two equations (MRDE) and (cMRDE) from the rest, we label them differently to make them instantly recognizable.

The rest of this paper is organized as follows. Section 2 reviews three important properties of MRDE that we should prefer our numerical methods to preserve, other things being equal. In Section 3, we first introduce two simple 2nd order anadromic methods, based on which there are two existing techniques to compute higher order approximations, classical extrapolation and composition. Our focus in this article, however, is on higher order anadromic methods in the general form of (1.2). In Section 4, we prove that our proposed methods (1.2) indeed preserve the three important properties and, as a by-product, a solution monotonicity property. Because of the preserved properties, we argue in Section 6 that our methods have the capability to march over solution poles and still render numerical solutions that are accurate as dictated by the step size, the order of the method used, and rounding errors in solving the involved linear matrix equations

along the way. A linear stability theory is outlined in Section 7 for our methods. The claim that our methods can march over the poles and our linear stability theory are validated by three numerical examples in Section 8. Section 9 presents our conclusions.

Notation. X and X(t) denote the solution of (MRDE), and X, Y, and Z are numerical approximations at t = τ, τ + θ/2, and τ + θ, respectively. Similarly the symbols U, U(t), U, V, and W mean the same for the corresponding (cMRDE). I_k is the k × k identity matrix, or simply I when its dimension is clear from the context. Superscript (·)^T denotes transpose, while (·)^* denotes conjugate transpose.

2. Important Properties of MRDE

An MRDE is said to be symmetric if

(2.1)   A_{21} = A_{21}^T,   A_{11} = -A_{22}^T,   A_{12} = A_{12}^T,   and X(0) = X(0)^T.

Note this definition of symmetry is valid regardless of whether A and X(0) are complex or not; in fact, in our later development it is not necessary for A and X(0) to be real when we refer to a symmetric MRDE. We say (MRDE) is Hermitian if

(2.2)   A_{21} = A_{21}^*,   A_{11} = -A_{22}^*,   A_{12} = A_{12}^*,   and X(0) = X(0)^*.

An MRDE is said to be time-invariant if A in (1.1) does not depend on t, i.e., is a constant matrix; otherwise it is time-varying. In what follows we shall explain three properties that are important to us in this study: Bilinear Dependence on the initial value, the Generalized Inverse Property, and Symmetry. Among them, the Symmetry property (Hermitian MRDEs included) is the easiest to preserve, and almost all known numerical schemes achieve that, e.g., straightforward applications of the Runge-Kutta and linear multistep methods. But preserving the other two properties cheaply is a nontrivial task achieved by few previous numerical schemes. All three properties, however, are preserved by our methods in the next section. Because of this, we later argue that our methods preserve a solution monotonicity for symmetric MRDEs and are capable of marching over solution poles.

2.1. Bilinear Property.
In principle, (MRDE) could be reduced to a (time-varying) linear homogeneous equation by the Bernoulli substitution X = T S^{-1} [39]:

(2.3)   dP/dt = A P,   P(0) = ( S_0 ; T_0 ),   wherein P = ( S ; T ) with S m-by-m on top and T n-by-m below.

The solution to (MRDE) with X(0) = T_0 S_0^{-1} relates to the solution of (2.3) by X(t) = T(t) S(t)^{-1} so long as S(t) remains invertible, and it would be if X(t) stayed finite because S(t) satisfies the linear homogeneous differential equation S' = (A_{11} + A_{12} X) S. But this reduction of the given MRDE to a linear homogeneous differential equation is vulnerable to numerical instability when, as sometimes happens, all the columns of P approach a subspace of dimension lower than the number of columns, and then S becomes too nearly non-invertible to allow X to be recovered accurately from T S^{-1}. This is why X(t), if it must be computed numerically, is

usually computed better from the given MRDE than from either its foregoing linear reduction or a 2nd linear reduction, namely

(2.4)   R' = -R A,   R(0) = ( -T_0, S_0 ),   wherein R = ( -T, S ),

for which X(t) = S(t)^{-1} T(t), provided X(0) = S_0^{-1} T_0. Both reductions can fizzle numerically. Still, the two linear reductions shed light upon how the desired solution X(t) depends upon its initial value X(0). Let Φ ≡ Φ(t) be the Fundamental Solution of (2.3):

(2.5)   dΦ/dt = A Φ,   Φ(0) = I_{m+n} (identity matrix),

so P(t) = Φ P(0). Consequently, after partitioning

(2.6)   Φ = ( Φ_{11}  Φ_{12} ; Φ_{21}  Φ_{22} ),   with Φ_{11} m-by-m and Φ_{22} n-by-n,

we find that

(2.7)   X(t) = [Φ_{21} + Φ_{22} X(0)][Φ_{11} + Φ_{12} X(0)]^{-1}.

This is well-defined when [Φ_{11} + Φ_{12} X(0)]^{-1} exists, which is guaranteed for small enough t. As t increases, if Φ_{11} + Φ_{12} X(0) becomes singular at some point t_0, then t_0 becomes either a singularity or a removable singularity of X(t). In any case, for any fixed t the matrix X(t) is a Bilinear Rational Function of X(0). X(t) is actually a Two-Sided Bilinear Rational Function of X(0), bilinear rational in two ways simultaneously due to the 2nd linear reduction (2.4). Actually the two kinds of bilinear rational dependence of X(t) upon X(0) coexist partly because of MRDE but more because of an obscure matrix identity independent of that differential equation.

Theorem 2.1 (Most bilinear rational functions are two-sided). Let Φ and Φ̂ be two (m+n)-by-(m+n) matrices partitioned in the same way as in (2.6). Suppose(1) that there is at least one n × m matrix G such that both Φ_{11} + Φ_{12} G and G Φ̂_{12} - Φ̂_{22} are invertible. Then

(2.8)   [Φ_{21} + Φ_{22} G][Φ_{11} + Φ_{12} G]^{-1} ≡ [G Φ̂_{12} - Φ̂_{22}]^{-1} [-G Φ̂_{11} + Φ̂_{21}]

for every n × m matrix G such that both matrix inverses exist if and only if

    Φ̂ Φ = ( Φ̂_{11}  Φ̂_{12} ; Φ̂_{21}  Φ̂_{22} ) ( Φ_{11}  Φ_{12} ; Φ_{21}  Φ_{22} ) = µ I

for some scalar µ.

Proof. The identity (2.8) is equivalent to

    [G Φ̂_{12} - Φ̂_{22}][Φ_{21} + Φ_{22} G] = [-G Φ̂_{11} + Φ̂_{21}][Φ_{11} + Φ_{12} G],

expanding which we get

(2.9)   -(Φ̂_{21}Φ_{11} + Φ̂_{22}Φ_{21}) + G(Φ̂_{11}Φ_{11} + Φ̂_{12}Φ_{21}) - (Φ̂_{21}Φ_{12} + Φ̂_{22}Φ_{22})G + G(Φ̂_{11}Φ_{12} + Φ̂_{12}Φ_{22})G = 0.
(1) This implies that there is a nonempty open set of G for which both matrices are invertible.

This holds for any G (such that [Φ_{11} + Φ_{12} G]^{-1} and [G Φ̂_{12} - Φ̂_{22}]^{-1} exist) if Φ̂Φ = µI is satisfied. On the other hand, if (2.8) holds for every matrix G in a nonempty open set, so does (2.9) for every matrix G in that nonempty open set and, consequently, for every n × m matrix G, because (2.9) is a constraint upon quadratic polynomials in the entries of G; if satisfied by all elements in an open set, the identity must be satisfied by every aptly dimensioned G. Now run G through every such matrix, each with at most one nonzero element, to conclude Φ̂Φ = µI for some scalar µ.

What use is the two-sided property of this bilinear rational function X(t) of X(0)? One application is a far simpler proof than is found in Reid [39] of the following theorem. Write the bilinear rational function X(t) of X(0) as defined in (2.7) as X(t) = F(X(0)).

Theorem 2.2 ([39]). rank(F(X_1) - F(X_2)) = rank(X_1 - X_2) if F(X_1) and F(X_2) are both finite.

Proof. Take F(X_1) = [Φ_{21} + Φ_{22}X_1][Φ_{11} + Φ_{12}X_1]^{-1} but change F(X_2) = [Φ_{21} + Φ_{22}X_2][Φ_{11} + Φ_{12}X_2]^{-1} to F(X_2) = [X_2 Φ̂_{12} - Φ̂_{22}]^{-1}[-X_2 Φ̂_{11} + Φ̂_{21}], as Theorem 2.1 provides with µ = 1 and Φ̂ = Φ^{-1}, since the fundamental solution Φ of (2.3) is nonsingular. Consequently

    F(X_1) - F(X_2) = [Φ̂_{22} - X_2 Φ̂_{12}]^{-1} (X_1 - X_2) [Φ_{11} + Φ_{12} X_1]^{-1},

and the conclusion follows.

This theorem implies that, when the solution X(t) = F(X(0)) of a matrix Riccati differential equation changes because its initial value X(0) has been changed, the rank of the change is conserved. Two-sided bilinear rational matrix functions like (2.7) form a group. Solutions X(t) to (MRDE) can be regarded as sample-values of members of the group selected by t and sampled at an indeterminate X(0). More discussion on this is in Section 5. We should prefer numerical methods that preserve this bilinear rational property in their computed solutions X, other things being equal. Observe that bilinear rational functions are closed under composition.

2.2. Generalized Inverse Property.
MRDE ensures that all the inverses in Subsection 2.1 exist while t is small enough, and this ensures X(t) is a two-sided bilinear rational function of X(0) unless t gets so big that X(t) has a pole; if such a thing exists, it is a finite t_0 at which X(t) becomes infinite. What happens when a square solution X(t) (i.e., m = n) passes through a pole? Typically its inverse U(t) passes through a singular matrix; this U(t) satisfies (cMRDE). When m ≠ n, the complementary MRDE is still well-defined. Theorem 2.3 below shows that a so-called Generalized Inverse Relation is enforced between two complementary MRDEs.

Theorem 2.3. If X_0 U_0 = I (or U_0 X_0 = I), and if the solutions X(t) to (MRDE) and U(t) to (cMRDE) have only isolated singularities and share no common ones, then X(t)U(t) ≡ I (or U(t)X(t) ≡ I).
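Before the proof, both the two-sided representation of Subsection 2.1 and this generalized inverse relation can be observed numerically in the square, time-invariant case. The sketch below is ours, not the paper's: `expm_taylor` is a hypothetical helper (a plain truncated Taylor sum, adequate for small t·||A||), Φ̂ = Φ^{-1} realizes µ = 1 in Theorem 2.1, and the coefficient matrix of (cMRDE) is J A J with J the block-swap permutation, so its fundamental solution is J Φ J.

```python
import numpy as np

def expm_taylor(M, terms=40):
    # e^M via truncated Taylor series; adequate because ||M|| is small here.
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for j in range(1, terms):
        term = term @ M / j
        E = E + term
    return E

rng = np.random.default_rng(1)
n, t = 3, 0.2
A = 0.5 * rng.standard_normal((2 * n, 2 * n))
Phi = expm_taylor(t * A)

def blocks(F):
    return F[:n, :n], F[:n, n:], F[n:, :n], F[n:, n:]

def bilinear(F, W):  # (2.7): [F21 + F22 W][F11 + F12 W]^{-1}
    F11, F12, F21, F22 = blocks(F)
    return (F21 + F22 @ W) @ np.linalg.inv(F11 + F12 @ W)

# Theorem 2.1 with Phi-hat = Phi^{-1} (mu = 1): one-sided (2.7) = two-sided (2.8).
Phh = np.linalg.inv(Phi)
H11, H12, H21, H22 = blocks(Phh)
G = rng.standard_normal((n, n))
err_two_sided = np.max(np.abs(
    bilinear(Phi, G) - np.linalg.solve(G @ H12 - H22, -G @ H11 + H21)))

# Theorem 2.3: if U0 X0 = I then U(t) X(t) = I, (cMRDE) propagated via J Phi J.
Z0 = np.zeros((n, n))
J = np.block([[Z0, np.eye(n)], [np.eye(n), Z0]])
X0 = np.eye(n) + 0.5 * rng.standard_normal((n, n))   # kept well conditioned
Ut = bilinear(J @ Phi @ J, np.linalg.inv(X0))
err_inverse = np.max(np.abs(Ut @ bilinear(Phi, X0) - np.eye(n)))
print(err_two_sided, err_inverse)
```

In exact arithmetic both residuals are zero; the printed values are pure roundoff.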

Proof. If U_0 X_0 = I, then UX ≡ I solves the following initial value problem for UX:

    d(UX)/dt = (A_{12} - U A_{22} + A_{11} U - U A_{21} U) X + U (A_{21} - X A_{11} + A_{22} X - X A_{12} X)
             = A_{12} X - (UX) A_{12} X + A_{11}(UX) - (UX) A_{11} - U A_{21}(UX) + U A_{21}
             = [I - (UX)] A_{12} X + A_{11}[(UX) - I] - [(UX) - I] A_{11} - U A_{21}[(UX) - I],
    (UX)|_{t=0} = I.

Therefore UX ≡ I, at least initially. Since all singularities of X(t) and U(t) are assumed to be isolated, so are those of U(t)X(t); thus all the singularities in the right-hand side of this ODE are removable. The other case, X_0 U_0 = I, is similar.

We should prefer numerical methods that retain this generalized inverse property in their computed solutions X and U, other things being equal. The equation XU = I_n necessarily implies n ≤ m and that X's rows are linearly independent. We call such a U a right generalized inverse of X. Similarly UX = I_m necessarily implies n ≥ m and that X's columns are linearly independent. We call such a U a left generalized inverse of X. In either case, we call U a generalized inverse of X.

2.3. Symmetry Property. A symmetric MRDE, i.e., one for which (2.1) holds, appears most commonly in optimal control and filtering problems [10, 9]. For a symmetric MRDE, A's eigenvalues come in pairs with opposite signs because

    A is similar to -A^T = J A J^{-1}, which is similar to -A,

where

(2.10)   J = ( 0  I_n ; -I_n  0 ),   J^T = -J,   J^{-1} = J^T.

Such a matrix A is said to be Hamiltonian. Therefore A and -A have the same eigenvalues. If 0 is among them, its multiplicity is even, even if it is defective for lack of an equal number of eigenvectors. A's eigenvalues are important in the discussion of attractive stationary points. Next consider the fundamental solution of (2.3),

(2.11)   Φ = ( Φ_{11}  Φ_{12} ; Φ_{21}  Φ_{22} ),   with each block n-by-n.

Since d/dt (J^{-1} Φ^T J) = -(J^{-1} Φ^T J) A and (J^{-1} Φ^T J)(0) = I,

    Φ^{-1} = J^{-1} Φ^T J = ( Φ_{22}^T  -Φ_{12}^T ; -Φ_{21}^T  Φ_{11}^T ).

Therefore

    Φ_{11}Φ_{22}^T - Φ_{12}Φ_{21}^T = Φ_{22}^T Φ_{11} - Φ_{12}^T Φ_{21} = I,
    Φ_{11}Φ_{12}^T - Φ_{12}Φ_{11}^T = Φ_{21}Φ_{22}^T - Φ_{22}Φ_{21}^T = 0,
    Φ_{22}^T Φ_{12} - Φ_{12}^T Φ_{22} = Φ_{11}^T Φ_{21} - Φ_{21}^T Φ_{11} = 0.

And then for every symmetric initial value X(0) = X_0, we find that

(2.12)   X(t) = [Φ_{21} + Φ_{22} X_0][Φ_{11} + Φ_{12} X_0]^{-1} = [Φ_{11}^T + X_0 Φ_{12}^T]^{-1}[Φ_{21}^T + X_0 Φ_{22}^T] = X(t)^T.

Theorem 2.4 (Most symmetry-preserving bilinear rational functions are two-sided). Let Φ be a 2n-by-2n matrix partitioned as in (2.11). Suppose that there is at least one n × n symmetric matrix G such that Φ_{11} + Φ_{12} G is invertible. Then

(2.13)   [Φ_{21} + Φ_{22} G][Φ_{11} + Φ_{12} G]^{-1} ≡ [Φ_{11}^T + G Φ_{12}^T]^{-1}[Φ_{21}^T + G Φ_{22}^T]

for every n × n matrix G = G^T such that the matrix inverse exists if and only if

(2.14)   ( Φ_{22}^T  -Φ_{12}^T ; -Φ_{21}^T  Φ_{11}^T ) ( Φ_{11}  Φ_{12} ; Φ_{21}  Φ_{22} ) = µI

for some scalar µ. Moreover µ ≠ 0 if and only if both sides of (2.13) actually vary with G.

Proof. The first claim is proved the same way as Theorem 2.1 was, except for running G through all symmetric matrices of the apt dimension with at most two nonzero elements. The second claim follows from the observation that (2.14) amounts to (J^{-1} Φ^T J) Φ = µI. If µ = 0, then the rank of Φ cannot exceed the nullity of J^{-1} Φ^T J, which is the same as that of Φ, and therefore cannot exceed n. Since the rank of (Φ_{11}, Φ_{12}) is n because [Φ_{11} + Φ_{12} G]^{-1} exists for some G, the rank of Φ is n, too. Then (Φ_{21}, Φ_{22}) = H(Φ_{11}, Φ_{12}) for some square H, and thus [Φ_{21} + Φ_{22} G][Φ_{11} + Φ_{12} G]^{-1} = H regardless of G. On the other hand, if both sides of (2.13) do not vary with G, then [Φ_{21} + Φ_{22} G][Φ_{11} + Φ_{12} G]^{-1} ≡ H for some constant symmetric matrix H with respect to G, and therefore Φ_{21} + Φ_{22} G ≡ H[Φ_{11} + Φ_{12} G] for every G in a nonempty open set of symmetric G, and consequently for every symmetric G. This leads to Φ_{21} = HΦ_{11} and Φ_{22} = HΦ_{12}. Substituting both relations into the left-hand side of (2.14) shows it holds with µ = 0.
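The symmetry relation (2.12), and the monotonicity property established in Theorem 2.5 below, can both be observed numerically. In this sketch (ours, not the paper's; `expm_taylor` is again a truncated-Taylor matrix exponential), A is a random Hamiltonian matrix as in (2.1), the exact flow is evaluated through the bilinear formula (2.7), and the ordering of two symmetric initial values is checked to persist:

```python
import numpy as np

def expm_taylor(M, terms=40):
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for j in range(1, terms):
        term = term @ M / j
        E = E + term
    return E

rng = np.random.default_rng(2)
n, t = 3, 0.05

A11 = rng.standard_normal((n, n))
B = rng.standard_normal((n, n)); A12 = B + B.T       # A12 symmetric
B = rng.standard_normal((n, n)); A21 = B + B.T       # A21 symmetric
A = 0.5 * np.block([[A11, A12], [A21, -A11.T]])      # Hamiltonian, cf. (2.1)

Phi = expm_taylor(t * A)

def flow(X0):  # exact propagation via the bilinear formula (2.7)
    return (Phi[n:, :n] + Phi[n:, n:] @ X0) @ np.linalg.inv(Phi[:n, :n] + Phi[:n, n:] @ X0)

B = rng.standard_normal((n, n)); Xa0 = 0.5 * (B + B.T)       # symmetric initial value
C = 0.5 * rng.standard_normal((n, n)); Xb0 = Xa0 - C @ C.T   # Xa0 - Xb0 is psd

Xa, Xb = flow(Xa0), flow(Xb0)
sym_err = np.max(np.abs(Xa - Xa.T))                  # symmetry persists, cf. (2.12)
D = Xa - Xb
min_eig = np.min(np.linalg.eigvalsh((D + D.T) / 2))  # ordering persists (>= 0 up to roundoff)
print(sym_err, min_eig)
```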
Symmetric MRDEs also have the following monotonicity property, which bears some resemblance to [17, Theorem 2], due originally to [38], but there are differences in the conditions; for example, there is no requirement here that both A_{21} and -A_{12} be positive semidefinite. That condition, however, guarantees that the solution of the symmetric MRDE exists for all time [16, Proposition 1].

Theorem 2.5. For a real symmetric (MRDE), i.e., A is real and (2.1) holds, let X(t) and X̃(t) be two of its solutions with initial values X(0) and X̃(0), respectively. If X(0) ⪰ X̃(0), by which we mean that X(0) - X̃(0) is positive semidefinite, then X(t) ⪰ X̃(t) over the interval [0, T) of their existence. Likewise X(0) ≻ X̃(0) means that X(0) - X̃(0) is positive definite.

Proof. It can be proved similarly to [17, Theorem 2]. For completeness, we present a proof here. Let W ≡ W(t) = X(t) - X̃(t). It can be verified that

    W' = W [A_{22} - ((X + X̃)/2) A_{12}]^T + [A_{22} - ((X + X̃)/2) A_{12}] W,

whose solution can be written as W(t) = Ψ(t) W(0) Ψ(t)^T, where Ψ(t) is the solution of

    Ψ' = [A_{22} - ((X + X̃)/2) A_{12}] Ψ,   Ψ(0) = I.

Therefore W(t) must remain positive semidefinite over the interval [0, T) in which both X(t) and X̃(t) have no singularity.

We should prefer numerical methods that retain these properties, namely Symmetry, Two-Sided Bilinear Rational dependence upon X(0), and Monotonicity in the sense of Theorem 2.5, in their computed solutions X, other things being equal. Observe also that bilinear rational functions that propagate matrix symmetry are closed under composition.

Remark 2.1. For a Hermitian MRDE, everything in this subsection holds after transposes (·)^T are replaced by conjugate transposes (·)^*.

3. Unconventional Anadromic Numerical Methods

We start by presenting two simple 2nd order anadromic numerical methods to pave the way for our general format of higher order ones, which also fall into the framework of modified integrators in [8].

3.1. Two Simple 2nd Order Methods. These methods are based on a partition technique. Consider one step of integration from τ to τ + θ, where θ is the current step size. Define the matrix-valued function f(X, Y) by

(3.1)   f(X, Y) = A_{21} - X A_{11} + A_{22} Y - X A_{12} Y,   where all A_{ij} = A_{ij}(τ + θ/2).

Let X ≈ X(τ). An approximation Z ≈ X(τ + θ) can be computed by solving

(3.2)   (Y - X)/(θ/2) = f(X, Y),   (Z - Y)/(θ/2) = f(Z, Y),

where Y ≈ X(τ + θ/2) and Z ≈ X(τ + θ). This defines a relatively inexpensive second-order anadromic method with

(3.3)   Updating Formula: Z = Q(θ, τ + θ/2, X)

that preserves all three properties, namely the Bilinear Relation, the Generalized Inverse Property, and Symmetry, as we shall prove.
It is relatively inexpensive because the determining equations for Y and Z are linear in Y and Z, unlike in existing implicit methods, which must solve nonlinear matrix equations. By anadromic we mean that if we integrate the MRDE backwards from τ + θ to τ using the same updating formula with the negative step size -θ and with X(τ + θ) = Z, then in the absence of rounding errors X is recovered exactly, namely

    X ≡ Q(-θ, (τ + θ) - θ/2, Z).
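One step of (3.2) is cheap to implement: each stage amounts to one linear matrix equation. The sketch below is ours, not the paper's, and assumes a time-invariant A (so A(τ + θ/2) = A); it checks the anadromic round trip just described, and also the equivalence with the implicit midpoint rule stated in Theorem 3.1 below.

```python
import numpy as np

def step(X, A, m, n, theta):
    # One step of (3.2); both stage equations are linear in the unknown.
    A11, A12, A21, A22 = A[:m, :m], A[:m, m:], A[m:, :m], A[m:, m:]
    h = theta / 2.0
    # (Y - X)/h = f(X, Y)  <=>  (I - h*(A22 - X A12)) Y = X + h*(A21 - X A11)
    Y = np.linalg.solve(np.eye(n) - h * (A22 - X @ A12), X + h * (A21 - X @ A11))
    # (Z - Y)/h = f(Z, Y)  <=>  Z (I + h*(A11 + A12 Y)) = Y + h*(A21 + A22 Y)
    M = np.eye(m) + h * (A11 + A12 @ Y)
    return np.linalg.solve(M.T, (Y + h * (A21 + A22 @ Y)).T).T

rng = np.random.default_rng(3)
m, n, theta = 2, 3, 0.1
A = 0.5 * rng.standard_normal((m + n, m + n))
X = 0.5 * rng.standard_normal((n, m))

Z = step(X, A, m, n, theta)
back = step(Z, A, m, n, -theta)               # anadromic round trip
err_anadromic = np.max(np.abs(back - X))

# Theorem 3.1: implicit midpoint on P' = A P reproduces Z = T1 S1^{-1}.
P2 = np.vstack([np.eye(m), X])                # S2 = I, T2 = X, so X = T2 S2^{-1}
I2 = np.eye(m + n)
P1 = np.linalg.solve(I2 - (theta / 2) * A, (I2 + (theta / 2) * A) @ P2)
S1, T1 = P1[:m, :], P1[m:, :]
err_midpoint = np.max(np.abs(T1 @ np.linalg.inv(S1) - Z))
print(err_anadromic, err_midpoint)
```

Both residuals vanish in exact arithmetic; what is printed is roundoff only.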

An anadromic method has many attractive properties. It is at least of second order convergence: [Q(θ, τ + θ/2, X(τ)) - X(τ + θ)]/θ = O(θ^2). In fact more can be said. Let Z(τ) be the computed solution with Z(0) = X(0) at t = τ, which is fixed and a multiple of θ. Then [2]

    Z(τ) - X(τ) = E_2(τ)θ^2 + E_4(τ)θ^4 + E_6(τ)θ^6 + ···,

i.e., its error expansion in terms of θ contains only even powers of θ. In the past, such a property was considered ideal for applying (traditional) extrapolation methods [7, 14, 1] to achieve higher order approximations.

The simple 2nd order method (3.2) is closely related to the implicit midpoint rule applied to (2.3), an observation E. Hairer [3] generously shared with the authors. The observation makes it possible to significantly simplify our earlier analysis [35] in constructing higher order methods.

Theorem 3.1 (Hairer). Let P_1 ≈ P(τ + θ) be the result of applying the implicit midpoint rule to (2.3) from t = τ to τ + θ:

    (P_1 - P_2)/θ = A (P_1 + P_2)/2,   P_2 = ( S_2 ; T_2 ) ≡ P(τ),   P_1 = ( S_1 ; T_1 ) ≈ P(τ + θ),

partitioned as in (2.3) with S m-by-m and T n-by-m, where A = A(τ + θ/2). Assume that S_2, S_1, and S_2 + S_1 are invertible. If X = T_2 S_2^{-1}, then

(3.4)   Y [(S_2 + S_1)/2] = (T_2 + T_1)/2,   Z = T_1 S_1^{-1},

where Y and Z are defined by (3.2).

Proof. It suffices to prove that Ŷ = [(T_2 + T_1)/2][(S_2 + S_1)/2]^{-1} and Ẑ = T_1 S_1^{-1} satisfy the defining equations in (3.2) for Y and Z. We have the identities

(3.5)   (Ŷ - X)(S_1 + S_2)/2 = (T_1 + T_2)/2 - X(S_1 + S_2)/2 = (T_1 - T_2)/2 - X(S_1 - S_2)/2,
(3.6)   (Ẑ - Ŷ)(S_1 + S_2)/2 = (T_1 - T_2)/2 - Ẑ(S_1 - S_2)/2.

Partition A = (A_{ij}) in the conformal way. The implicit midpoint rule gives

    S_1 - S_2 = (θ/2)[A_{11}(S_1 + S_2) + A_{12}(T_1 + T_2)],
    T_1 - T_2 = (θ/2)[A_{21}(S_1 + S_2) + A_{22}(T_1 + T_2)].

Substitute S_1 - S_2 and T_1 - T_2 into (3.5) and (3.6), then apply [(S_1 + S_2)/2]^{-1} from the right and divide by θ/2 to see that Ŷ and Ẑ satisfy the defining equations in (3.2) for Y and Z, respectively.

Updating formula Q above is not alone in its simplicity and preservation of the desired properties.
An obvious alternative is

(3.7)   (Y_a - X)/(θ/2) = f(Y_a, X),   (Z_a - Y_a)/(θ/2) = f(Y_a, Z_a),

where Y_a ≈ X(τ + θ/2) and Z_a ≈ X(τ + θ). Which circumstances favor one alternative over the other is not known at this time. Both preserve the same properties of the MRDE's solution. It comes as no surprise that this alternative method (3.7) closely relates to the implicit midpoint rule as well, except that this time the rule is applied to the second linear reduction (2.4). We state the theorem but omit its proof, as it is similar to that of Theorem 3.1.

Theorem 3.2. Let R_1 ≈ R(τ + θ) be the result of applying the implicit midpoint rule to (2.4) from t = τ to τ + θ:

    (R_1 - R_2)/θ = -[(R_1 + R_2)/2] A,   R_2 = ( -T_2, S_2 ) ≡ R(τ),   R_1 = ( -T_1, S_1 ) ≈ R(τ + θ),

where A = A(τ + θ/2). Assume that S_2, S_1, and S_2 + S_1 are invertible. If X = S_2^{-1} T_2, then

(3.8)   [(S_2 + S_1)/2] Y_a = (T_2 + T_1)/2,   Z_a = S_1^{-1} T_1,

where Y_a and Z_a are defined by (3.7).

The method defined by (3.2) appeared in the 1980s in an unpublished note of the second author, and was also discovered independently by Babuška and Majer [4], where its 2nd order convergence was proved by brute-force verification. But its many properties discussed in Section 4, properties critical to our effort to integrate an MRDE past poles, were not known, much less exploited.

3.2. Higher Order Methods. Higher order approximations can be derived from the two simple anadromic methods (3.2) and (3.7) in at least two known ways: extrapolation [7, 14] and composition [4, 7]. The focus of this article is a third way, given below, which can be cast in the framework of modified integrators in the sense of [8]: apply the simple 2nd order anadromic methods of Subsection 3.1 to truncated modified differential equations of (MRDE). For more on numerical integration through modified differential equations, the interested reader is referred to [8].
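The extrapolation route is straightforward precisely because the global error expansion of the 2nd order method contains only even powers of θ: combining an N-step and a 2N-step integration as (4 Z_{2N} - Z_N)/3 cancels the θ^2 term and yields 4th order. A sketch of this (ours, not the paper's; time-invariant data, reference solution from the exact bilinear formula (2.7) with a truncated-Taylor matrix exponential):

```python
import numpy as np

def expm_taylor(M, terms=60):
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for j in range(1, terms):
        term = term @ M / j
        E = E + term
    return E

def step(X, A, m, n, theta):
    # one step of the 2nd order anadromic method (3.2)
    A11, A12, A21, A22 = A[:m, :m], A[:m, m:], A[m:, :m], A[m:, m:]
    h = theta / 2.0
    Y = np.linalg.solve(np.eye(n) - h * (A22 - X @ A12), X + h * (A21 - X @ A11))
    M2 = np.eye(m) + h * (A11 + A12 @ Y)
    return np.linalg.solve(M2.T, (Y + h * (A21 + A22 @ Y)).T).T

def integrate(X0, A, m, n, T, N):
    X = X0.copy()
    for _ in range(N):
        X = step(X, A, m, n, T / N)
    return X

rng = np.random.default_rng(5)
m = n = 2
A = 0.5 * rng.standard_normal((m + n, m + n))
X0 = 0.3 * rng.standard_normal((n, m))
T, N = 0.4, 8

Phi = expm_taylor(T * A)
Xref = (Phi[m:, :m] + Phi[m:, m:] @ X0) @ np.linalg.inv(Phi[:m, :m] + Phi[:m, m:] @ X0)

ZN = integrate(X0, A, m, n, T, N)
Z2N = integrate(X0, A, m, n, T, 2 * N)
Zext = (4.0 * Z2N - ZN) / 3.0          # Richardson step: cancels the theta^2 term
err_base = np.max(np.abs(Z2N - Xref))
err_ext = np.max(np.abs(Zext - Xref))
print(err_base, err_ext)
```

The extrapolated result should be markedly more accurate than the finer of the two base integrations.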
Specifically, for one step of integration from τ to τ + θ, we shall seek a sequence of matrices

(3.9)   Ã_l = ( A_{11,l}  A_{12,l} ; A_{21,l}  A_{22,l} ),   l = 0, 1, 2, ...,

partitioned as in (1.1) and depending only on A(t) and its derivatives at t = τ + θ/2. We then define

(3.10)   f_l(X, Y) = A_{21,l} - X A_{11,l} + A_{22,l} Y - X A_{12,l} Y,

matrix-valued functions having two matrix arguments X and Y. Let c_l be the corresponding coefficients in the power series of tanh(t) [1, p.85]:

(3.11)   Σ_{l=0}^{∞} c_l t^{2l+1} = (exp(2t) - 1)/(exp(2t) + 1) = tanh(t) = t - (1/3)t^3 + (2/15)t^5 - (17/315)t^7 + (62/2835)t^9 + O(t^{11}).

Finally our (2k)th order numerical method takes the form

(3.12)   (Y - X)/(θ/2) = Σ_{l=0}^{k-1} (θ/2)^{2l} c_l f_l(X, Y),    (Z - Y)/(θ/2) = Σ_{l=0}^{k-1} (θ/2)^{2l} c_l f_l(Z, Y),

where Y ≈ X(τ + θ/2) and Z ≈ X(τ + θ). This defines a consistent numerical method for solving (MRDE) so long as

    lim_{θ→0} A_{ij,0} = A_{ij}(τ)   for i, j ∈ {1, 2}.

In particular if A_{ij,0} = A_{ij}(τ + θ/2), it is of 2nd order convergence because the Z given by (3.12) differs from the one given by (3.1) and (3.2) by O(θ^3). In seeking the A_{ij,l} later, care is taken so that (3.12) defines an anadromic method of order 2k. In particular, f_0(X, Y) is always taken to be the same as the f(X, Y) in (3.1), i.e., A_{ij,0} is A_{ij}(t) evaluated at τ + θ/2. It remains to find what f_l(X, Y) should be for l ≥ 1. It is important to notice that the determining equations for Y and Z are again linear in Y and Z, making the method easy to implement. The entering of the coefficients c_l may seem mysterious at first. In a way, it is a demonstration of the close link between (MRDE) and the linear ODE (2.3), especially for time-invariant A; see (3.18) below. It is quite evident that, in the sense of [8], the modified differential equation of (MRDE) to which the application of (3.2) from t = τ to τ + θ yields Z = X(τ + θ) exactly is

(3.13)   X̃' = Ã_{21} - X̃ Ã_{11} + Ã_{22} X̃ - X̃ Ã_{12} X̃,   X̃(τ) = X(τ),

where

(3.14)   Ã = Σ_{l=0}^{∞} (θ/2)^{2l} c_l Ã_l ≡ ( Ã_{11}  Ã_{12} ; Ã_{21}  Ã_{22} ).

The method (3.12) is simply the result of applying (3.2) to (3.13) with Ã truncated. The corresponding modified differential equation of (2.3) for the implicit midpoint rule is

(3.15)   dP̃/dt = Ã P̃,

and at the same time (3.12) is related, much in the same way as stated in Theorem 3.1 for (3.2), to the implicit midpoint rule applied to (3.15) with Ã truncated. The method (3.12) is not alone in its simplicity and preservation of the desired properties, either, as we commented before about (3.2). An obvious alternative is

(3.16)   (Y_a - X)/(θ/2) = Σ_{l=0}^{k-1} (θ/2)^{2l} c_l f_l(Y_a, X),    (Z_a - Y_a)/(θ/2) = Σ_{l=0}^{k-1} (θ/2)^{2l} c_l f_l(Y_a, Z_a),

where Y_a ≈ X(τ + θ/2) and Z_a ≈ X(τ + θ).
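For time-invariant A, Theorem 3.3 below identifies Ã_l = A^{2l+1}, which turns (3.12) into a concrete method: the truncated sum B = Σ_{l<k} (θ/2)^{2l} c_l A^{2l+1} simply replaces A in the 2nd order step (3.2), so each step still solves only two linear matrix equations. A sketch assuming that choice (all names are ours; reference solution from the exact bilinear formula (2.7)):

```python
import numpy as np

C = [1.0, -1.0 / 3, 2.0 / 15, -17.0 / 315]   # c_0..c_3 from (3.11)

def expm_taylor(M, terms=60):
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for j in range(1, terms):
        term = term @ M / j
        E = E + term
    return E

def step(X, B, m, n, h):
    # the 2nd order step (3.2) with half-step h = theta/2 and coefficient matrix B
    B11, B12, B21, B22 = B[:m, :m], B[:m, m:], B[m:, :m], B[m:, m:]
    Y = np.linalg.solve(np.eye(n) - h * (B22 - X @ B12), X + h * (B21 - X @ B11))
    M2 = np.eye(m) + h * (B11 + B12 @ Y)
    return np.linalg.solve(M2.T, (Y + h * (B21 + B22 @ Y)).T).T

def integrate(X0, A, m, n, T, N, k):
    h = T / N / 2
    B = np.zeros_like(A)
    Apow = A.copy()
    for l in range(k):          # truncated (3.14) with A~_l = A^(2l+1)
        B = B + h**(2 * l) * C[l] * Apow
        Apow = Apow @ A @ A
    X = X0.copy()
    for _ in range(N):
        X = step(X, B, m, n, h)
    return X

rng = np.random.default_rng(4)
m = n = 2
A = 0.5 * rng.standard_normal((m + n, m + n))
X0 = 0.3 * rng.standard_normal((n, m))
T, N = 0.4, 8

Phi = expm_taylor(T * A)   # reference via the exact bilinear formula (2.7)
Xref = (Phi[m:, :m] + Phi[m:, m:] @ X0) @ np.linalg.inv(Phi[:m, :m] + Phi[:m, m:] @ X0)

e1 = np.max(np.abs(integrate(X0, A, m, n, T, N, k=1) - Xref))   # order 2
e2 = np.max(np.abs(integrate(X0, A, m, n, T, N, k=2) - Xref))   # order 4
print(e1, e2)
```

With the same step count, the k = 2 error should sit far below the k = 1 error, reflecting orders 4 and 2 respectively.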
This alternative method can also be cast as a modified integrator of (3.7) for (MRDE). It is, not surprisingly, related to the implicit midpoint rule applied to

(3.17)   dR̃/dt = -R̃ Ã

after Ã is truncated. This close relationship between the proposed higher order methods and the implicit midpoint rule plays an instrumental role in simplifying our earlier constructions [35] of these higher order methods: the sought Ã for our methods (3.12) (and for (3.16), too) is the same as the one in the modified differential equation (3.15) for the implicit midpoint rule. The latter is detailed in [36]. We summarize the findings in what follows.

The time-invariant case. Now A does not depend on time t. It is found in [36] that

(3.18)   Ã = (2/θ)(e^{θA} - I)(e^{θA} + I)^{-1} = Σ_{l=0}^{∞} (θ/2)^{2l} c_l A^{2l+1}.

Theorem 3.3. Suppose A is constant. Then with Ã_l = A^{2l+1} for all l, (3.12) defines an anadromic method of order 2k for (MRDE). Moreover, if X(τ) = X,

(3.19)   X(τ + θ) = Z + (θ/2)^{2k+1} c_k f_k(X, X) + O(θ^{2k+2}).

Proof. The first claim has been proved already. The second claim follows by comparing Z to the one in (3.12) after letting k = ∞.

Example 3.1. Consider a scalar time-invariant RDE, i.e., m = n = 1 and

    X' = αX^2 + βX + γ,

which can be written in the form of (MRDE):

    X' = γ - X(-β/2 + δ) + (β/2 + δ)X - X(-α)X,

where δ is an arbitrary constant. The corresponding matrix is

(3.20)   A = ( -β/2 + δ   -α ; γ   β/2 + δ ).

It can be verified that

    A^2 = ( (β/2 - δ)^2 - αγ   -2αδ ; 2γδ   (β/2 + δ)^2 - αγ ).

In particular, A^2 = (β^2/4 - αγ)I if δ = 0. Then for δ = 0,

    A^{2l+1} = (β^2/4 - αγ)^l A,
    f_0(X, Y) = γ - X(-β/2) + (β/2)Y - X(-α)Y,
    f_l(X, Y) = (β^2/4 - αγ)^l f_0(X, Y).

Let Θ = β^2/4 - αγ. The first equation in (3.12) gives

(3.21)   Y = G(θ/2, X) := { [1 + (β/2) Σ_{l=0}^{k-1} (θ/2)^{2l+1} c_l Θ^l] X + γ Σ_{l=0}^{k-1} (θ/2)^{2l+1} c_l Θ^l } / { 1 - (αX + β/2) Σ_{l=0}^{k-1} (θ/2)^{2l+1} c_l Θ^l }.

Since here f_l(X, Y) ≡ f_l(Y, X), the second equation in (3.12) must yield Z = G(θ/2, Y). In the case k = ∞, Z = X(τ + θ) provided X = X(τ), by the idea of the modified differential equation. In fact, more can be said for this example (with δ = 0), namely Y = X(τ + θ/2) for k = ∞, too (which may fail for a non-scalar RDE, however [35, Example 7.1]).
In view of this fact,

(3.22)  X(τ + θ) ≈ { [1 + (β/2) Σ_{l=0}^{k-1} θ^{2l+1} c_l Θ^l] X + γ Σ_{l=0}^{k-1} θ^{2l+1} c_l Θ^l } / { 1 - (αX + β/2) Σ_{l=0}^{k-1} θ^{2l+1} c_l Θ^l }
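As a concrete illustration of the scalar example, the sketch below implements the k = 1 (second order) instance of the updating formula above and verifies the anadromic round trip numerically. The particular values of α, β, γ, the initial value, and the step size are illustrative assumptions, not taken from the paper.

```python
# Order-2 anadromic step for the scalar RDE  X' = a*X^2 + b*X + g  (k = 1, c_0 = 1).
# One half-step is the Moebius map G(s, X) with s = theta/2.
a, b, g = 1.0, 0.5, 2.0   # illustrative coefficients (alpha, beta, gamma)

def G(s, x):
    """Half-step Moebius map, truncated at k = 1 (so the inner sum is just s)."""
    return ((1.0 + (b / 2.0) * s) * x + g * s) / (1.0 - s * (a * x + b / 2.0))

def step(x, theta):
    """Full anadromic step: two half-steps G(theta/2, .) composed."""
    y = G(theta / 2.0, x)
    return G(theta / 2.0, y)

x0, theta = 0.3, 0.05
z = step(x0, theta)           # integrate forward from tau to tau + theta
x_back = step(z, -theta)      # integrate backward with the same method
roundtrip_err = abs(x_back - x0)   # zero in exact arithmetic (anadromic)
```

Because G(-s, .) is exactly the inverse Moebius map of G(s, .), the forward-then-backward error sits at rounding level, which is precisely the anadromic property claimed for the methods.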

is an order 2k updating formula for the scalar RDE X' = αX^2 + βX + γ. Closed formulas for Y, Z in the case δ ≠ 0 can also be established, but they are much more complicated. The key lies in computing A^{2l+1}. The matrix A in (3.20) can be written as

A = δI + B,   B = [ -β/2   -α
                     γ      β/2 ],   B^2 = ΘI.

Therefore we have, for m = 2l + 1,

A^m = Σ_{j=0}^{m} C(m, j) δ^{m-j} B^j = Σ_{j=0}^{l} C(m, 2j) δ^{m-2j} Θ^j I + Σ_{j=0}^{l} C(m, 2j+1) δ^{m-2j-1} Θ^j B,

where C(m, j) denotes the binomial coefficient, from which each entry of A^m can be explicitly written out, and so can f_l(X, Y) for each and every l. We omit the details.

The time-varying case. This is a much more complicated situation than the time-invariant case. With the help of Mathematica, we have found in [36] the A~_l, in terms of values of A(t) and its derivatives at t = τ + θ/2, for l up to 4, and they yield methods in the form of (3.12) for k = 1, 2, 3, 4, 5, corresponding to orders 2, 4, 6, 8, and 10 of convergence. However, the complexity of A~_l, measured by the number of summands, grows exponentially (in fact A~_l has 4^l terms). Therefore anadromic methods of orders higher than 6 are likely impractical in general. In what follows we present A~_l for l up to 2. Denote for l ≥ 0

(3.23)  A_l = (d^l/dt^l) A(t) evaluated at t = τ + θ/2.

We have from [36]

(3.24)  A~_0 = A_0 = A(τ + θ/2),
(3.25)  A~_1 = A_0^3 - (A_0 A_1 - A_1 A_0) - (1/2) A_2,
(3.26)  A~_2 = A_0^5 - A_0 (A_0 A_1 - A_1 A_0) A_0 + (A_0^3 A_1 - A_1 A_0^3)
               - [A_0 (A_1)^2 - 2 A_1 A_0 A_1 + (A_1)^2 A_0]
               - (1/4)(A_0^2 A_2 + 3 A_0 A_2 A_0 + A_2 A_0^2)
               + (A_1 A_2 - A_2 A_1) - (1/4)((A_0 A_3 - A_3 A_0) - (1/4) A_4).

These formulas for A~_l contain (higher order) derivatives, which can sometimes be hard or expensive to evaluate. In such situations we may naturally approximate the derivatives by divided differences. There are many ways to do that, but care must be taken to retain the anadromic property of the method and at the same time not to reduce the claimed order of convergence. Another consideration is to maximize the reuse of any evaluated A(t) (and possibly its low order derivatives) between consecutive steps of integration.
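A quick scalar sanity check of the modified-equation idea: for the scalar linear problem x' = a(t)x, the commutator term of A~_1 vanishes, and the k = 2 method amounts to the implicit midpoint rule applied with the frozen coefficient a_hat = A_0 + (θ/2)^2 c_1 (A_0^3 - A_2/2). Here we assume, as the series for A~ in the time-invariant case suggests, that c_1 = -1/3 (the Maclaurin coefficient of tanh); the choice a(t) = sin t is an illustrative assumption.

```python
import math

# Scalar check: midpoint rule on the modified coefficient is order 4 for x' = a(t)x.
a_func = math.sin          # a(t) = sin t (illustrative)
tm = 1.0                   # midpoint tau + theta/2
th = 0.1                   # step size theta
A0 = math.sin(tm)          # a(tm)
A2 = -math.sin(tm)         # a''(tm)

# exact flow factor exp( integral of a over [tm - th/2, tm + th/2] )
exact = math.exp(math.cos(tm - th / 2) - math.cos(tm + th / 2))

def midpoint_factor(ah):
    # implicit midpoint rule for x' = ah*x gives x1 = cay((th/2)*ah) * x0
    return (1 + th * ah / 2) / (1 - th * ah / 2)

x_order2 = midpoint_factor(A0)                                   # k = 1
x_order4 = midpoint_factor(A0 - th**2 / 12 * (A0**3 - A2 / 2))   # k = 2

err2 = abs(x_order2 - exact)
err4 = abs(x_order4 - exact)
```

The order-4 local error is dramatically smaller than the order-2 one, consistent with the claimed orders of convergence.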
Define

(3.27)  t_0 = τ + θ/2,   t_{-i} = t_0 - i(θ/2),   t_i = t_0 + i(θ/2),
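Over these grid points, the divided-difference derivative approximations proposed in the sequel can be checked numerically on a scalar example; the choice A(t) = sin t and the step size below are illustrative assumptions, not from the paper.

```python
import math

# First- and second-order divided differences over t_{-1}, t_0, t_1, used to
# replace derivatives of A(t) at the midpoint t_0 = tau + theta/2.
def dd1(f, x, y):
    return (f(x) - f(y)) / (x - y)

def dd2(f, x, y, z):
    return (dd1(f, x, y) - dd1(f, y, z)) / (x - z)

A, Ap = math.sin, math.cos      # A(t) = sin t, so A' = cos, A'' = -sin, A'''' = sin
t0, th = 1.0, 0.1               # midpoint and step size (illustrative)
tm1, tp1 = t0 - th / 2, t0 + th / 2

d1 = dd1(A, tm1, tp1)           # approximates A'(t0) to O(theta^2)
d2 = 2 * dd2(A, tm1, t0, tp1)   # approximates A''(t0) to O(theta^2)
d4 = (48 / th**2) * (dd1(Ap, tm1, tp1)       # approximates A''''(t0), combining
                     - 2 * dd2(A, tm1, t0, tp1))  # values of A and of A'

err1 = abs(d1 - math.cos(t0))
err2 = abs(d2 + math.sin(t0))
err4 = abs(d4 - math.sin(t0))
```

All three errors behave like O(θ^2), so the approximations do not degrade the claimed order when inserted in the right places.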

whose layout is shown in the following picture, where [t_{-1}, t_1] is the current interval of integration.

  t_{-4}     t_{-3}   t_{-2}    t_{-1}   t_0      t_1     t_2       t_3     t_4
  τ-3θ/2     τ-θ      τ-θ/2     τ        τ+θ/2    τ+θ     τ+3θ/2    τ+2θ    τ+5θ/2

The formulas (3.24)-(3.26) call for evaluating A(t) and its derivatives at ..., t_{-4}, t_{-2}, t_0, t_2, t_4, .... Our goal is to update these formulas so that evaluating derivatives, except perhaps only the first order derivatives, is not needed. Define the first and second order divided differences

A^{[1]}({α, β}) := (A(α) - A(β)) / (α - β),   A^{[2]}({α, β, γ}) := (A^{[1]}({α, β}) - A^{[1]}({β, γ})) / (α - γ).

In consideration of the objectives we mentioned above, we proposed in [36] the following sets of derivative approximations:

(3.28a)  A'(t_0) ≈ A^{[1]}({t_{-1}, t_1}),
(3.28b)  A''(t_0) ≈ 2 A^{[2]}({t_{-1}, t_0, t_1}),
(3.29a)  A'(t_0) ≈ A^{[1]}({t_{-2}, t_2}),
(3.29b)  A''(t_0) ≈ 2 A^{[2]}({t_{-2}, t_0, t_2}),
(3.30a)  A'''(t_0) ≈ (12/θ^2) { (1/2)[A'(t_1) + A'(t_{-1})] - A^{[1]}({t_{-1}, t_1}) },
(3.30b)  A''''(t_0) ≈ (48/θ^2) [ A'^{[1]}({t_{-1}, t_1}) - 2 A^{[2]}({t_{-1}, t_0, t_1}) ],
(3.31a)  A'''(t_0) ≈ (8/θ^2) [ A^{[1]}({t_{-2}, t_2}) - A^{[1]}({t_{-1}, t_1}) ],
(3.31b)  A''''(t_0) ≈ (32/θ^4) { (1/2)[A(t_2) + A(t_{-2})] - 2[A(t_1) + A(t_{-1})] + 3A(t_0) },
(3.32a)  A'''(t_0) ≈ (8/(2θ)^2) [ A^{[1]}({t_{-4}, t_4}) - A^{[1]}({t_{-2}, t_2}) ],
(3.32b)  A''''(t_0) ≈ (32/(2θ)^4) { (1/2)[A(t_4) + A(t_{-4})] - 2[A(t_2) + A(t_{-2})] + 3A(t_0) },

where A'^{[1]} denotes the first divided difference of A'. With them, we readily define a new A~_1:

(3.33)  the new A~_1 is obtained by (3.25) with (3.28) or with (3.29)

(by which we mean the 1st and 2nd order derivatives A_1 and A_2 in (3.25) are approximated by either (3.28) or (3.29)). Then (3.12) with k = 2, (3.24), and (3.33) defines anadromic methods of order 4. However, for methods of orders higher than 4, simply taking the A~_l given in (3.24), (3.25), and (3.26) and replacing the derivatives of A(t) at t = t_0 by the corresponding approximations above does not work. For example, (3.12) with k = 3, (3.24), (3.33), and (3.26), after all or some of the derivatives are approximated, does

not have order 6. It turns out that using (3.25) with (3.28) or (3.29) for A~_1 affects the next A~_2:

(3.34)  for (3.25) with (3.28): the new A~_2 is obtained by replacing the last line of (3.26) by -(1/6)((A_0 A_3 - A_3 A_0) - (1/4) A_4);
(3.35)  for (3.25) with (3.29): the new A~_2 is obtained by replacing the last line of (3.26) by -(17/12)((A_0 A_3 - A_3 A_0) - (1/4) A_4).

In theory, 6th order anadromic methods are now readily available by approximating the derivatives in the new A~_2 given in (3.34) or (3.35) by any combination of the approximation methods in (3.28)-(3.32). But again, in choosing the approximations we should keep in mind reusing the values of A(t) (and possibly its 1st order derivatives) between consecutive steps of integration as much as possible. Table 3.1 gives our suggested anadromic methods of 2nd, 4th, and 6th orders of convergence.

          k    A~_l and b_l for 0 ≤ l ≤ k - 1 given by
  odr2    1    (3.24)
  odr4    2    (3.24), (3.25)
  odr4a   2    (3.24), (3.25) with (3.28)
  odr4b   2    (3.24), (3.25) with (3.29)
  odr6    3    (3.24), (3.25), (3.26)
  odr6a   3    (3.24), (3.25) with (3.28), (3.34) with (3.30)
  odr6b   3    (3.24), (3.25) with (3.29), (3.35) with (3.31)
  odr6c   3    (3.24), (3.25) with (3.29), (3.34) with (3.32)

Table 3.1. Anadromic method (3.12) is of order 2k.

4. Preserved Properties of Proposed Methods

In this section we shall prove that any scheme defined by (3.12) preserves all three properties discussed in Section 2, namely, the Bilinear Relation, the Generalized Inverse Property, and the Symmetry Property.

4.1. Bilinear Property. Define H_ij, functions of θ, by

(4.1)  Σ_{l=0}^{k-1} (θ/2)^{2l} c_l [ A_{11,l}  A_{12,l}    =  [ H_11  H_12
                                       A_{21,l}  A_{22,l} ]      H_21  H_22 ],

with blocks partitioned with row and column sizes m and n. Solving for Y and Z in (3.12) yields

(4.2a)  Y = [(2/θ)I - (H_22 - X H_12)]^{-1} [(2/θ)X + (H_21 - X H_11)],
(4.2b)  Z = [(2/θ)Y + (H_21 + H_22 Y)] [(2/θ)I + (H_11 + H_12 Y)]^{-1}.

The formulas (4.2a) and (4.2b), in combination with Theorem 2.1 and the fact that bilinear rational functions are closed under composition, show that in the absence of rounding errors the numerical solution preserves the Bilinear Rational property discussed in Subsection 2.1.
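In the scalar case m = n = 1, the solved forms (4.2a) and (4.2b) can be checked directly against the implicit half-step relations of the scheme as used here; the numerical values of θ, X, and the H_ij below are arbitrary illustrative choices.

```python
# Scalar (m = n = 1) check that (4.2a)/(4.2b) solve the two implicit half-steps
#   Y = X + (theta/2) * (H21 + H22*Y - X*H11 - X*H12*Y)
#   Z = Y + (theta/2) * (H21 + H22*Y - Z*H11 - Z*H12*Y)
th = 0.1
X = 0.4
H11, H12, H21, H22 = 0.2, -0.3, 0.5, 0.1    # arbitrary H_ij values

Y = ((2 / th) * X + (H21 - X * H11)) / ((2 / th) - (H22 - X * H12))   # (4.2a)
Z = ((2 / th) * Y + (H21 + H22 * Y)) / ((2 / th) + (H11 + H12 * Y))   # (4.2b)

res_Y = Y - X - (th / 2) * (H21 + H22 * Y - X * H11 - X * H12 * Y)
res_Z = Z - Y - (th / 2) * (H21 + H22 * Y - Z * H11 - Z * H12 * Y)
```

Both residuals vanish up to rounding, confirming that Y and Z are the exact solutions of the two half-step equations, which are linear in Y and in Z respectively.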

There are two consequences of this preservation. The first is the conservation of the rank of change: if X is changed, the rank of the change is conserved in Y and Z, just as for the exact solution of MRDE; see Theorem 2.2. Therefore the approximate solutions computed numerically by any of our anadromic methods conserve the rank of change to the initial value, provided that they use the same stepsizes θ for one approximate solution as for another, and provided roundoff does not interfere. (But it will, except perhaps when the rank is min{m, n}.) The second is that the solution monotonicity property of Theorem 2.5 is retained by Z (see Theorem 4.3 below).

4.2. Generalized Inverse Property. For convenience, we identify (MRDE) by its defining parameters {m, n, A, X_0}. Doing so allows us to identify its complementary (cMRDE) as one of the MRDE in the form (MRDE) but with the defining parameters {n, m, A^c, U_0}, where

A^c := [ A_22  A_21
         A_12  A_11 ],

with blocks partitioned with row and column sizes n and m. Notice that A^c relates to A through permuting symmetrically the blocked columns and blocked rows of A. Through identifying (cMRDE) as one of (MRDE), we can apply the numerical scheme (3.12) to (cMRDE), with the f_l defined by

[ A_{22,l}  A_{21,l}
  A_{12,l}  A_{11,l} ],

obtained through again permuting symmetrically the blocked columns and blocked rows of the matrices in (3.9). The application leads to a numerical method for (cMRDE) as follows:

(4.3a)  V = [(2/θ)I - (H_11 - U H_21)]^{-1} [(2/θ)U + (H_12 - U H_22)],
(4.3b)  W = [(2/θ)V + (H_12 + H_11 V)] [(2/θ)I + (H_22 + H_21 V)]^{-1},

where U ≡ U(τ), V ≈ U(τ + θ/2), and W ≈ U(τ + θ), and the matrices H_ij are still those defined in (4.1). Theorem 4.1 below shows that the generalized inverse property is always preserved.

Theorem 4.1. Let Y and Z be defined by (4.2), and let V and W be defined by (4.3).
(1) If UX = I (and thus n ≥ m), then VY = WZ = I.
(2) If XU = I (and thus n ≤ m), then YV = ZW = I.

Proof.
We shall only prove Item (1), since the other can be dealt with similarly. Suppose that UX = I. Write Q_ii = (2/θ)I - H_ii. Then

VY = [Q_11 + U H_21]^{-1} [U Q_22 + H_12] [Q_22 + X H_12]^{-1} [X Q_11 + H_21].

We have U[Q_22 + X H_12] = U Q_22 + H_12 because of UX = I. Thus

[U Q_22 + H_12] [Q_22 + X H_12]^{-1} = U,

and therefore

VY = [Q_11 + U H_21]^{-1} U [X Q_11 + H_21] = I.

To prove WZ = I, set P_ii = (2/θ)I + H_ii. Then

WZ = [P_11 V + H_12] [P_22 + H_21 V]^{-1} [P_22 Y + H_21] [P_11 + H_12 Y]^{-1}.

We have [P_22 + H_21 V] Y = P_22 Y + H_21 because of VY = I. Thus

[P_22 + H_21 V]^{-1} [P_22 Y + H_21] = Y,

and therefore

WZ = [P_11 V + H_12] Y [P_11 + H_12 Y]^{-1} = I,

as was to be shown.

4.3. Symmetry Property. Our first few proofs of the Symmetry Property summarized in the following theorem, obtained independently by the authors and by David Bindel [5], were long and complicated. The proof given below is due to Hairer [3]. In the theorem we impose conditions (4.4) on the A_{ij,l}. These conditions are automatically satisfied by the A_{ij,l} given in Subsection 3.2 if they hold for the A_{ij}, as proved in [36].

Theorem 4.2. In (3.12), if

(4.4)  A_{21,l} = A_{21,l}^T,   A_{11,l} = -A_{22,l}^T,   A_{12,l} = A_{12,l}^T   for all l,

and X = X^T, then Z = Z^T.

Proof. Consider the associated linear differential equation

(4.5)  dP~(t)/dt = A^ P~,   P~(τ) = [ S
                                      T ],

where A^ = Σ_{l=0}^{k-1} (θ/2)^{2l} c_l A~_l is obtained after truncating A~ in (3.14), and X = T S^{-1}. Z relates to the implicit midpoint rule solution

(P_1 - P)/θ = A^ (P_1 + P)/2,   P = [ S      P_1 = [ S_1
                                      T ],           T_1 ],

for (4.5) by Z = T_1 S_1^{-1}, where S, S_1 are m-by-m and T, T_1 are n-by-m. The assumptions in (4.4) imply A^T J = -J A^ (with A^ in place of A), where J is defined in (2.10). It is not hard to verify that

d/dt [ P~(t)^T J P~(t) ] = 0,

which says P~(t)^T J P~(t) is a quadratic first integral of (4.5). Since the implicit midpoint rule preserves quadratic first integrals 3 [4, p.101], we have

P_1^T J P_1 = P^T J P   and so   -S_1^T T_1 + T_1^T S_1 = -S^T T + T^T S.

X is symmetric; so is S^T X S = S^T T. Thus -S_1^T T_1 + T_1^T S_1 = 0, i.e., S_1^T T_1 is symmetric; so is Z = T_1 S_1^{-1} = S_1^{-T} (S_1^T T_1) S_1^{-1}, as was to be shown.

It is worth emphasizing that, despite this symmetry property of our methods in the absence of rounding errors, numerically computed solutions often deviate from being symmetric (Hermitian) after many integration steps. This is what we observed in our numerical experiments, and it drove us to seek many different proofs of Theorem 4.2. It is therefore recommended to symmetrize the computed Z every few steps in an implementation.
Fortunately, the cost of doing so is marginal relative to the overall cost of integration.

3 That P_1^T J P_1 = P^T J P can also be verified directly, by noting that B = (I - (θ/2)A^)^{-1} (I + (θ/2)A^) is symplectic, i.e., B^T J B = J, and P_1 = B P.
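The footnote's claim can be verified numerically in the smallest case m = n = 1, where P is 2-by-2. We assume the standard skew-symmetric J = [0 -1; 1 0] (this excerpt does not reproduce (2.10)) and build a 2-by-2 matrix satisfying the condition of the theorem; the entries of the symmetric matrix S below are arbitrary.

```python
# Verify: B = (I - (th/2)*Ah)^{-1} (I + (th/2)*Ah) satisfies B^T J B = J
# whenever Ah^T J = -J Ah.  2x2 matrices as lists of rows; pure stdlib.
def mm(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def tp(A):
    return [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]

def inv(A):
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

J = [[0.0, -1.0], [1.0, 0.0]]
S = [[2.0, 0.5], [0.5, 1.0]]     # arbitrary symmetric matrix
Ah = mm(J, S)                    # then Ah^T J = -J Ah holds by construction
th = 0.3
I2 = [[1.0, 0.0], [0.0, 1.0]]
Iminus = [[I2[i][j] - th / 2 * Ah[i][j] for j in range(2)] for i in range(2)]
Iplus  = [[I2[i][j] + th / 2 * Ah[i][j] for j in range(2)] for i in range(2)]
B = mm(inv(Iminus), Iplus)       # the implicit midpoint (Cayley) propagator

BtJB = mm(tp(B), mm(J, B))
sympl_err = max(abs(BtJB[i][j] - J[i][j]) for i in range(2) for j in range(2))
```

The symplecticity defect is at rounding level, so the quadratic first integral P^T J P, and with it the symmetry of Z, is preserved exactly by the step.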

The next theorem says that Z defined by (4.2) retains the solution monotonicity property given in Theorem 2.5. Let Y^ and Z^ be defined by (4.2) after X is changed to X^, and let θ_0 be the smallest θ for which one of

(2/θ)I - (H_22 - X H_12),   (2/θ)I + (H_11 + H_12 Y),
(2/θ)I - (H_22 - X^ H_12),  (2/θ)I + (H_11 + H_12 Y^)

fails to be nonsingular.

Theorem 4.3. For a real symmetric (MRDE), assume (4.4) and that all A_{ij,l} are real. If both X and X^ are real and X ≤ X^, then Z ≤ Z^ for θ in [0, θ_0).

Proof. By Theorem 4.2, both Z and Z^ are real symmetric. For θ in [0, θ_0), Z^ - Z is continuous in θ; so are its eigenvalues. Since Z^ - Z = X^ - X ≥ 0 at θ = 0 and rank(Z^ - Z) = rank(X^ - X) as θ increases from 0 to any number that is less than θ_0, the eigenvalues of Z^ - Z cannot change their signs.

We point out that this theorem is intrinsically different from the obvious conclusion (if X < X^, then Z < Z^ for sufficiently small θ, because the exact solutions of real symmetric MRDE satisfy it). Firstly, for this obvious conclusion to hold one must assume strictly X < X^; secondly, how small a θ is sufficiently small depends on the smallest eigenvalue of the difference X^ - X. On this second point, θ_0 in Theorem 4.3 can be taken to be

(4.6)  θ_0 = min{ 2/(||A_0|| (1 + ||X||)), 2/(||A_0|| (1 + ||X^||)) }

for the second order method, where ||.|| is any consistent matrix norm. For the methods of order 2k it is

(4.7)  θ_0 = min{ δ(||X||), δ(||X^||) },

where δ(η) is the smallest positive root of

1 - (1 + η) Σ_{l=0}^{k-1} ||c_l A~_l|| (θ/2)^{2l+1} = 0.

Neither value of θ_0, in (4.6) or in (4.7), depends on the difference X^ - X.

Remark 4.1. For Hermitian MRDE, everything in this subsection holds after transposes (.)^T are replaced by conjugate transposes (.)^*.

5. The Group of Two-Sided Bilinear Rational Functions

Given an n-by-m matrix G of fixed, perhaps unequal, dimensions but indeterminate (variable) elements, the set of all invertible bilinear rational n-by-m matrix-valued functions

F(G) := [Φ_21 + Φ_22 G] [Φ_11 + Φ_12 G]^{-1},

where the constant matrices Φ_ij are the submatrices of an invertible

Φ = [ Φ_11  Φ_12
      Φ_21  Φ_22 ],

with blocks partitioned with row and column sizes m and n,

constitute a group BR_{n,m}. Its operation is composition, as we shall see in a moment. The domain of F(G) is nonempty because the m rows of (Φ_11  Φ_12) must be linearly independent. Note that the semi-group of all n-by-m bilinear rational functions includes non-invertible functions corresponding to non-invertible matrices Φ, but none of these is related to MRDE, all of whose linear reductions generate invertible matrizants Φ. Consequently what follows is confined to the group of invertible bilinear rational functions.

Given three invertible (m + n)-by-(m + n) matrices

Φ_j = [ Φ_{11,j}  Φ_{12,j}
        Φ_{21,j}  Φ_{22,j} ]

for j = 1, 2 and 3, corresponding respectively to three bilinear rational matrix functions

F_j(G) := [Φ_{21,j} + Φ_{22,j} G] [Φ_{11,j} + Φ_{12,j} G]^{-1},

we find that if Φ_1 = Φ_2 Φ_3 then F_1(G) = F_2(F_3(G)). Conversely, if F_1(G) = F_2(F_3(G)), then we find Φ_1 = βΦ_2Φ_3 for some nonzero scalar β. In general each bilinear rational matrix function F(G) in BR_{n,m} is associated with a ray of invertible matrices βΦ generated by running β through all nonzero scalars. Therefore the group BR_{n,m} of n-by-m bilinear rational matrix functions F(G) is isomorphic to the multiplicative quotient group of invertible (m + n)-by-(m + n) matrices Φ by nonzero scalars β. This quotient group is denoted by PGL_{m+n}, which stands for the Projective General Linear Group. We shall drop the dimensions m and n henceforth, since they will not change.

One connection between F in the group BR and βΦ in PGL is the equation

[ I       = βΦ [ I   S^{-1},
  F(G) ]         G ]

in which [ I; G ] denotes the stacked (m + n)-by-m matrix and S = β[Φ_11 + Φ_12 G]. Here the nonzero scalar β cancels away, and S is invertible except when G falls at a pole of F(G) = [Φ_21 + Φ_22 G][Φ_11 + Φ_12 G]^{-1}, where det(Φ_11 + Φ_12 G) = 0, which turns out to force F(G) to ∞.
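The composition law just stated, and (anticipating the second connection described next) the fact that Φ and its inverse represent the same function, can both be verified numerically in the scalar case n = m = 1; the particular 2-by-2 matrices below are arbitrary invertible choices.

```python
# Scalar (n = m = 1) checks in the group BR:
#  (i)  Phi1 = Phi2 * Phi3  implies  F1(g) = F2(F3(g));
#  (ii) the same F is represented two-sidedly by Phi and by Phi^{-1}.
def F(P, g):                  # F(G) = (Phi21 + Phi22*G) * (Phi11 + Phi12*G)^{-1}
    return (P[1][0] + P[1][1] * g) / (P[0][0] + P[0][1] * g)

def mm(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv(A):
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

P2 = [[2.0, 1.0], [0.0, 1.0]]   # arbitrary invertible matrices
P3 = [[1.0, 2.0], [3.0, 4.0]]
P1 = mm(P2, P3)
g = 0.7

comp_err = abs(F(P1, g) - F(P2, F(P3, g)))     # check (i)

Q = inv(P1)                     # blocks of Phi^{-1}
# row form: F(G) = (G*Q12 - Q22)^{-1} * (-G*Q11 + Q21)
F_row = (-g * Q[0][0] + Q[1][0]) / (g * Q[0][1] - Q[1][1])
twosided_err = abs(F_row - F(P1, g))           # check (ii)
```

Both errors are at rounding level, consistent with the ray/quotient-group picture: any nonzero scalar multiple of Φ, or its inverse, encodes the same bilinear rational function.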
A second connection is the less obvious equation

( F(G)   -I ) = S~^{-1} ( G   -I ) [ Φ~_11  Φ~_12
                                     Φ~_21  Φ~_22 ],

where

[ Φ~_11  Φ~_12     = [ Φ_11  Φ_12  ^{-1}    and    S~ = -G Φ~_12 + Φ~_22,
  Φ~_21  Φ~_22 ]       Φ_21  Φ_22 ]

which exhibits the same function, F(G) = [G Φ~_12 - Φ~_22]^{-1} [-G Φ~_11 + Φ~_21], now associated with βΦ^{-1} in PGL. The two formulas for F justify the term two-sided. In other words, there are two isomorphisms between the group BR of bilinear rational matrix functions F and the rays of matrices βΦ and βΦ^{-1} (the two sets of rays coincide in the quotient group PGL). The two isomorphisms supply two ways to compute F(G) numerically for any particular G. Sometimes one way is far more accurate than the other. For instance, if one of the matrices

Φ = [ Φ_11  Φ_12     and    Φ^{-1} = [ Φ~_11  Φ~_12
      Φ_21  Φ_22 ]                     Φ~_21  Φ~_22 ]

has just one relatively tiny singular value, then this matrix can usually provide a more accurately computed value of F(G) than can the other matrix, all but one of whose singular values must be relatively tiny. However, sometimes neither matrix provides accurate results; such can be the case when both matrices have too many relatively tiny singular values. Such is sometimes the case for the bilinear rational matrix functions that solve matrix Riccati differential equations.

Solutions X(t) of the matrix Riccati differential equation (MRDE) can be regarded not so much as functions of t selected by an initial value X(0), but rather as sample values X(t) = F_t(X(0)) of members of the group BR, selected by t and sampled at an indeterminate X(0). Thus, as t increases from 0, the differential equation's solution F_t(.) traces a trajectory through the group BR of bilinear rational matrix functions F, starting at the identity function F_0(.). But numerical methods that would compute F_t(.) from its matrix Φ_t often encounter numerical instability. Instead, numerical methods try to compute X(t) = F_t(X(0)) well only for a specific initial value X(0). Our anadromic methods approximate F_t(X(0)) by a sequence of sample values, each drawn from the same group BR of functions F(.), all intended to follow nearly along the trajectory traced by F_t regardless of X(0).

6. Marching Over Poles

The updating formula defined by (3.12) preserves a generalized inverse relation between the solutions of complementary MRDEs so long as their numerical solutions are computed with matching step-sizes θ small enough that all needed matrix inverses exist. But what do we gain from this? When X is square, we could naturally switch between the given MRDE for X and its complementary MRDE for U, whichever has no poles. This should allow us to march over poles without stopping the numerical integration unnecessarily, unlike existing methods!
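A minimal sketch of this idea on a hypothetical scalar example (not from the paper): X' = 1 + X^2, whose solution tan t has a pole at t = π/2. Carrying the homogeneous pair (numerator, denominator) of the bilinear update, i.e., working with the ray βΦ rather than with X itself, marches straight over the pole. The half-step matrix below is the k = 1 Moebius map for this particular equation.

```python
import math

# Integrate X' = 1 + X^2, X(0) = 0 (exact solution tan t) across the pole at
# t = pi/2 by propagating homogeneous coordinates (p, q) with X = p/q.
# One half-step of size s is the Moebius map X -> (X + s)/(1 - s*X), i.e. the
# matrix [[1, s], [-s, 1]]; a full step applies it twice.
s = 0.0005                  # half of the step size theta = 0.001
n_steps = 2000              # integrate from t = 0 to t = 2.0
p, q = 0.0, 1.0             # homogeneous representation of X(0) = 0

for _ in range(n_steps):
    # full step: [[1, s], [-s, 1]] squared = [[1 - s*s, 2*s], [-2*s, 1 - s*s]]
    p, q = (1 - s * s) * p + 2 * s * q, -2 * s * p + (1 - s * s) * q

x_end = p / q               # well past the pole; q simply changed sign there
err = abs(x_end - math.tan(2.0))
```

No division is performed during the march, so the pole, where q passes through 0, needs no special handling; only the final read-out divides.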
What about the non-square cases? Then such a switching mechanism cannot possibly work, because neither of two non-square generalized inverses determines the other uniquely. Lemmas 6.1 and 6.2 below show that the updating formula will work just fine as long as we do not accidentally step on any poles, or step too close to them, lest the associated linear matrix equations in the formulas (4.2) be too ill conditioned for us to solve with adequate accuracy.

Lemma 6.1. Suppose the n-by-m matrix X has full column rank, and let U = {U : UX = I}. If U X~ = I for all U in a nonempty relatively open subset of U, then X~ = X; in other words, X is uniquely determined by a nonempty relatively open subset of the collection of its (left) generalized inverses.

Proof. No proof is necessary if X is square. Suppose X is not square, and let V be from the nonempty open subset of U; so V is m-by-n and VX = I. We claim that both X and V can be embedded in invertible matrices X' and V' such that

(6.1)  V' X' = [ I_m  0
                 0    I_{n-m} ].

This is because both V and X have full rank, and therefore they can be embedded in invertible matrices

V' = ( V      and    X' = ( X  Z ).
       Y )


More information

Reducing round-off errors in symmetric multistep methods

Reducing round-off errors in symmetric multistep methods Reducing round-off errors in symmetric multistep methods Paola Console a, Ernst Hairer a a Section de Mathématiques, Université de Genève, 2-4 rue du Lièvre, CH-1211 Genève 4, Switzerland. (Paola.Console@unige.ch,

More information

Convex Functions and Optimization

Convex Functions and Optimization Chapter 5 Convex Functions and Optimization 5.1 Convex Functions Our next topic is that of convex functions. Again, we will concentrate on the context of a map f : R n R although the situation can be generalized

More information

Fundamentals of Linear Algebra. Marcel B. Finan Arkansas Tech University c All Rights Reserved

Fundamentals of Linear Algebra. Marcel B. Finan Arkansas Tech University c All Rights Reserved Fundamentals of Linear Algebra Marcel B. Finan Arkansas Tech University c All Rights Reserved 2 PREFACE Linear algebra has evolved as a branch of mathematics with wide range of applications to the natural

More information

Chapter III. Stability of Linear Systems

Chapter III. Stability of Linear Systems 1 Chapter III Stability of Linear Systems 1. Stability and state transition matrix 2. Time-varying (non-autonomous) systems 3. Time-invariant systems 1 STABILITY AND STATE TRANSITION MATRIX 2 In this chapter,

More information

Linear Algebra Massoud Malek

Linear Algebra Massoud Malek CSUEB Linear Algebra Massoud Malek Inner Product and Normed Space In all that follows, the n n identity matrix is denoted by I n, the n n zero matrix by Z n, and the zero vector by θ n An inner product

More information

A linear algebra proof of the fundamental theorem of algebra

A linear algebra proof of the fundamental theorem of algebra A linear algebra proof of the fundamental theorem of algebra Andrés E. Caicedo May 18, 2010 Abstract We present a recent proof due to Harm Derksen, that any linear operator in a complex finite dimensional

More information

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces.

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces. Math 350 Fall 2011 Notes about inner product spaces In this notes we state and prove some important properties of inner product spaces. First, recall the dot product on R n : if x, y R n, say x = (x 1,...,

More information

Unit 2, Section 3: Linear Combinations, Spanning, and Linear Independence Linear Combinations, Spanning, and Linear Independence

Unit 2, Section 3: Linear Combinations, Spanning, and Linear Independence Linear Combinations, Spanning, and Linear Independence Linear Combinations Spanning and Linear Independence We have seen that there are two operations defined on a given vector space V :. vector addition of two vectors and. scalar multiplication of a vector

More information

The method of lines (MOL) for the diffusion equation

The method of lines (MOL) for the diffusion equation Chapter 1 The method of lines (MOL) for the diffusion equation The method of lines refers to an approximation of one or more partial differential equations with ordinary differential equations in just

More information

Vectors. January 13, 2013

Vectors. January 13, 2013 Vectors January 13, 2013 The simplest tensors are scalars, which are the measurable quantities of a theory, left invariant by symmetry transformations. By far the most common non-scalars are the vectors,

More information

Linear Algebra: Matrix Eigenvalue Problems

Linear Algebra: Matrix Eigenvalue Problems CHAPTER8 Linear Algebra: Matrix Eigenvalue Problems Chapter 8 p1 A matrix eigenvalue problem considers the vector equation (1) Ax = λx. 8.0 Linear Algebra: Matrix Eigenvalue Problems Here A is a given

More information

Typical Problem: Compute.

Typical Problem: Compute. Math 2040 Chapter 6 Orhtogonality and Least Squares 6.1 and some of 6.7: Inner Product, Length and Orthogonality. Definition: If x, y R n, then x y = x 1 y 1 +... + x n y n is the dot product of x and

More information

Introduction to the Numerical Solution of IVP for ODE

Introduction to the Numerical Solution of IVP for ODE Introduction to the Numerical Solution of IVP for ODE 45 Introduction to the Numerical Solution of IVP for ODE Consider the IVP: DE x = f(t, x), IC x(a) = x a. For simplicity, we will assume here that

More information

Checking Consistency. Chapter Introduction Support of a Consistent Family

Checking Consistency. Chapter Introduction Support of a Consistent Family Chapter 11 Checking Consistency 11.1 Introduction The conditions which define a consistent family of histories were stated in Ch. 10. The sample space must consist of a collection of mutually orthogonal

More information

Newtonian Mechanics. Chapter Classical space-time

Newtonian Mechanics. Chapter Classical space-time Chapter 1 Newtonian Mechanics In these notes classical mechanics will be viewed as a mathematical model for the description of physical systems consisting of a certain (generally finite) number of particles

More information

Linear Algebra March 16, 2019

Linear Algebra March 16, 2019 Linear Algebra March 16, 2019 2 Contents 0.1 Notation................................ 4 1 Systems of linear equations, and matrices 5 1.1 Systems of linear equations..................... 5 1.2 Augmented

More information

The Simplex Method: An Example

The Simplex Method: An Example The Simplex Method: An Example Our first step is to introduce one more new variable, which we denote by z. The variable z is define to be equal to 4x 1 +3x 2. Doing this will allow us to have a unified

More information

EIGENVALUE PROBLEMS. EIGENVALUE PROBLEMS p. 1/4

EIGENVALUE PROBLEMS. EIGENVALUE PROBLEMS p. 1/4 EIGENVALUE PROBLEMS EIGENVALUE PROBLEMS p. 1/4 EIGENVALUE PROBLEMS p. 2/4 Eigenvalues and eigenvectors Let A C n n. Suppose Ax = λx, x 0, then x is a (right) eigenvector of A, corresponding to the eigenvalue

More information

MINIMAL NORMAL AND COMMUTING COMPLETIONS

MINIMAL NORMAL AND COMMUTING COMPLETIONS INTERNATIONAL JOURNAL OF INFORMATION AND SYSTEMS SCIENCES Volume 4, Number 1, Pages 5 59 c 8 Institute for Scientific Computing and Information MINIMAL NORMAL AND COMMUTING COMPLETIONS DAVID P KIMSEY AND

More information

Is there a small skew Cayley transform with zero diagonal?

Is there a small skew Cayley transform with zero diagonal? Linear Algebra and its Applications 417 (2006) 335 341 www.elsevier.com/locate/laa Is there a small skew Cayley transform with zero diagonal? W. Kahan Mathematics Dept. #3840, University of California,

More information

1 Linear Algebra Problems

1 Linear Algebra Problems Linear Algebra Problems. Let A be the conjugate transpose of the complex matrix A; i.e., A = A t : A is said to be Hermitian if A = A; real symmetric if A is real and A t = A; skew-hermitian if A = A and

More information

MATH 320, WEEK 11: Eigenvalues and Eigenvectors

MATH 320, WEEK 11: Eigenvalues and Eigenvectors MATH 30, WEEK : Eigenvalues and Eigenvectors Eigenvalues and Eigenvectors We have learned about several vector spaces which naturally arise from matrix operations In particular, we have learned about the

More information

Math Ordinary Differential Equations

Math Ordinary Differential Equations Math 411 - Ordinary Differential Equations Review Notes - 1 1 - Basic Theory A first order ordinary differential equation has the form x = f(t, x) (11) Here x = dx/dt Given an initial data x(t 0 ) = x

More information

CALCULUS ON MANIFOLDS. 1. Riemannian manifolds Recall that for any smooth manifold M, dim M = n, the union T M =

CALCULUS ON MANIFOLDS. 1. Riemannian manifolds Recall that for any smooth manifold M, dim M = n, the union T M = CALCULUS ON MANIFOLDS 1. Riemannian manifolds Recall that for any smooth manifold M, dim M = n, the union T M = a M T am, called the tangent bundle, is itself a smooth manifold, dim T M = 2n. Example 1.

More information

w T 1 w T 2. w T n 0 if i j 1 if i = j

w T 1 w T 2. w T n 0 if i j 1 if i = j Lyapunov Operator Let A F n n be given, and define a linear operator L A : C n n C n n as L A (X) := A X + XA Suppose A is diagonalizable (what follows can be generalized even if this is not possible -

More information

chapter 12 MORE MATRIX ALGEBRA 12.1 Systems of Linear Equations GOALS

chapter 12 MORE MATRIX ALGEBRA 12.1 Systems of Linear Equations GOALS chapter MORE MATRIX ALGEBRA GOALS In Chapter we studied matrix operations and the algebra of sets and logic. We also made note of the strong resemblance of matrix algebra to elementary algebra. The reader

More information

Linear algebra 2. Yoav Zemel. March 1, 2012

Linear algebra 2. Yoav Zemel. March 1, 2012 Linear algebra 2 Yoav Zemel March 1, 2012 These notes were written by Yoav Zemel. The lecturer, Shmuel Berger, should not be held responsible for any mistake. Any comments are welcome at zamsh7@gmail.com.

More information

Jim Lambers MAT 610 Summer Session Lecture 1 Notes

Jim Lambers MAT 610 Summer Session Lecture 1 Notes Jim Lambers MAT 60 Summer Session 2009-0 Lecture Notes Introduction This course is about numerical linear algebra, which is the study of the approximate solution of fundamental problems from linear algebra

More information

Isomorphisms between pattern classes

Isomorphisms between pattern classes Journal of Combinatorics olume 0, Number 0, 1 8, 0000 Isomorphisms between pattern classes M. H. Albert, M. D. Atkinson and Anders Claesson Isomorphisms φ : A B between pattern classes are considered.

More information

Eigenvectors and Hermitian Operators

Eigenvectors and Hermitian Operators 7 71 Eigenvalues and Eigenvectors Basic Definitions Let L be a linear operator on some given vector space V A scalar λ and a nonzero vector v are referred to, respectively, as an eigenvalue and corresponding

More information

EE731 Lecture Notes: Matrix Computations for Signal Processing

EE731 Lecture Notes: Matrix Computations for Signal Processing EE731 Lecture Notes: Matrix Computations for Signal Processing James P. Reilly c Department of Electrical and Computer Engineering McMaster University September 22, 2005 0 Preface This collection of ten

More information

Review of Linear Algebra

Review of Linear Algebra Review of Linear Algebra Throughout these notes, F denotes a field (often called the scalars in this context). 1 Definition of a vector space Definition 1.1. A F -vector space or simply a vector space

More information

Central Groupoids, Central Digraphs, and Zero-One Matrices A Satisfying A 2 = J

Central Groupoids, Central Digraphs, and Zero-One Matrices A Satisfying A 2 = J Central Groupoids, Central Digraphs, and Zero-One Matrices A Satisfying A 2 = J Frank Curtis, John Drew, Chi-Kwong Li, and Daniel Pragel September 25, 2003 Abstract We study central groupoids, central

More information

2. As we shall see, we choose to write in terms of σ x because ( X ) 2 = σ 2 x.

2. As we shall see, we choose to write in terms of σ x because ( X ) 2 = σ 2 x. Section 5.1 Simple One-Dimensional Problems: The Free Particle Page 9 The Free Particle Gaussian Wave Packets The Gaussian wave packet initial state is one of the few states for which both the { x } and

More information

What is A + B? What is A B? What is AB? What is BA? What is A 2? and B = QUESTION 2. What is the reduced row echelon matrix of A =

What is A + B? What is A B? What is AB? What is BA? What is A 2? and B = QUESTION 2. What is the reduced row echelon matrix of A = STUDENT S COMPANIONS IN BASIC MATH: THE ELEVENTH Matrix Reloaded by Block Buster Presumably you know the first part of matrix story, including its basic operations (addition and multiplication) and row

More information

(v, w) = arccos( < v, w >

(v, w) = arccos( < v, w > MA322 F all206 Notes on Inner Products Notes on Chapter 6 Inner product. Given a real vector space V, an inner product is defined to be a bilinear map F : V V R such that the following holds: Commutativity:

More information

Lecture Summaries for Linear Algebra M51A

Lecture Summaries for Linear Algebra M51A These lecture summaries may also be viewed online by clicking the L icon at the top right of any lecture screen. Lecture Summaries for Linear Algebra M51A refers to the section in the textbook. Lecture

More information

A connection between number theory and linear algebra

A connection between number theory and linear algebra A connection between number theory and linear algebra Mark Steinberger Contents 1. Some basics 1 2. Rational canonical form 2 3. Prime factorization in F[x] 4 4. Units and order 5 5. Finite fields 7 6.

More information

MATHEMATICS 23a/E-23a, Fall 2015 Linear Algebra and Real Analysis I Module #1, Week 4 (Eigenvectors and Eigenvalues)

MATHEMATICS 23a/E-23a, Fall 2015 Linear Algebra and Real Analysis I Module #1, Week 4 (Eigenvectors and Eigenvalues) MATHEMATICS 23a/E-23a, Fall 205 Linear Algebra and Real Analysis I Module #, Week 4 (Eigenvectors and Eigenvalues) Author: Paul Bamberg R scripts by Paul Bamberg Last modified: June 8, 205 by Paul Bamberg

More information

Lecture 2: Isoperimetric methods for the curve-shortening flow and for the Ricci flow on surfaces

Lecture 2: Isoperimetric methods for the curve-shortening flow and for the Ricci flow on surfaces Lecture 2: Isoperimetric methods for the curve-shortening flow and for the Ricci flow on surfaces Ben Andrews Mathematical Sciences Institute, Australian National University Winter School of Geometric

More information

(1)(a) V = 2n-dimensional vector space over a field F, (1)(b) B = non-degenerate symplectic form on V.

(1)(a) V = 2n-dimensional vector space over a field F, (1)(b) B = non-degenerate symplectic form on V. 18.704 Supplementary Notes March 21, 2005 Maximal parabolic subgroups of symplectic groups These notes are intended as an outline for a long presentation to be made early in April. They describe certain

More information

Linear Algebra I. Ronald van Luijk, 2015

Linear Algebra I. Ronald van Luijk, 2015 Linear Algebra I Ronald van Luijk, 2015 With many parts from Linear Algebra I by Michael Stoll, 2007 Contents Dependencies among sections 3 Chapter 1. Euclidean space: lines and hyperplanes 5 1.1. Definition

More information

Unconstrained optimization

Unconstrained optimization Chapter 4 Unconstrained optimization An unconstrained optimization problem takes the form min x Rnf(x) (4.1) for a target functional (also called objective function) f : R n R. In this chapter and throughout

More information

Introduction. Chapter One

Introduction. Chapter One Chapter One Introduction The aim of this book is to describe and explain the beautiful mathematical relationships between matrices, moments, orthogonal polynomials, quadrature rules and the Lanczos and

More information

1. Find the solution of the following uncontrolled linear system. 2 α 1 1

1. Find the solution of the following uncontrolled linear system. 2 α 1 1 Appendix B Revision Problems 1. Find the solution of the following uncontrolled linear system 0 1 1 ẋ = x, x(0) =. 2 3 1 Class test, August 1998 2. Given the linear system described by 2 α 1 1 ẋ = x +

More information

2-Dimensional Density Matrices

2-Dimensional Density Matrices A Transformational Property of -Dimensional Density Matrices Nicholas Wheeler January 03 Introduction. This note derives from recent work related to decoherence models described by Zurek and Albrecht.

More information

SPECTRALLY ARBITRARY FACTORIZATION: THE NONDEROGATORY CASE. 1. Introduction

SPECTRALLY ARBITRARY FACTORIZATION: THE NONDEROGATORY CASE. 1. Introduction SPECTRALLY ARBITRARY FACTORIZATION: THE NONDEROGATORY CASE CHARLES R. JOHNSON AND YULIN ZHANG Dedicated to E.M.Sá Abstract. It is known that a nonsingular, nonscalar, n-by-n complex matrix A may be factored

More information

Chapter 2 Linear Transformations

Chapter 2 Linear Transformations Chapter 2 Linear Transformations Linear Transformations Loosely speaking, a linear transformation is a function from one vector space to another that preserves the vector space operations. Let us be more

More information

Last Update: March 1 2, 201 0

Last Update: March 1 2, 201 0 M ath 2 0 1 E S 1 W inter 2 0 1 0 Last Update: March 1 2, 201 0 S eries S olutions of Differential Equations Disclaimer: This lecture note tries to provide an alternative approach to the material in Sections

More information

Vector Space Basics. 1 Abstract Vector Spaces. 1. (commutativity of vector addition) u + v = v + u. 2. (associativity of vector addition)

Vector Space Basics. 1 Abstract Vector Spaces. 1. (commutativity of vector addition) u + v = v + u. 2. (associativity of vector addition) Vector Space Basics (Remark: these notes are highly formal and may be a useful reference to some students however I am also posting Ray Heitmann's notes to Canvas for students interested in a direct computational

More information

7.2 Conformal mappings

7.2 Conformal mappings 7.2 Conformal mappings Let f be an analytic function. At points where f (z) 0 such a map has the remarkable property that it is conformal. This means that angle is preserved (in the sense that any 2 smooth

More information

Math 396. Quotient spaces

Math 396. Quotient spaces Math 396. Quotient spaces. Definition Let F be a field, V a vector space over F and W V a subspace of V. For v, v V, we say that v v mod W if and only if v v W. One can readily verify that with this definition

More information