Definite Matrix Polynomials and their Linearization by Definite Pencils. Higham, Nicholas J. and Mackey, D. Steven and Tisseur, Françoise

Similar documents
Hermitian Matrix Polynomials with Real Eigenvalues of Definite Type. Part I: Classification. Al-Ammari, Maha and Tisseur, Francoise

Linear Algebra and its Applications

Algorithms for Solving the Polynomial Eigenvalue Problem

Solving the Polynomial Eigenvalue Problem by Linearization

Solving Polynomial Eigenproblems by Linearization

Scaling, Sensitivity and Stability in Numerical Solution of the Quadratic Eigenvalue Problem

Linearizing Symmetric Matrix Polynomials via Fiedler pencils with Repetition

ANALYSIS OF STRUCTURED POLYNOMIAL EIGENVALUE PROBLEMS

Recent Advances in the Numerical Solution of Quadratic Eigenvalue Problems

Backward Error of Polynomial Eigenproblems Solved by Linearization. Higham, Nicholas J. and Li, Ren-Cang and Tisseur, Françoise. MIMS EPrint: 2006.

Nonlinear palindromic eigenvalue problems and their numerical solution

How to Detect Definite Hermitian Pairs

On the Definition of Two Natural Classes of Scalar Product

Polynomial Eigenvalue Problems: Theory, Computation, and Structure. Mackey, D. S. and Mackey, N. and Tisseur, F. MIMS EPrint: 2015.

An Algorithm for. Nick Higham. Françoise Tisseur. Director of Research School of Mathematics.

Jordan Structures of Alternating Matrix Polynomials

Quadratic Matrix Polynomials

Triangularizing matrix polynomials. Taslaman, Leo and Tisseur, Francoise and Zaballa, Ion. MIMS EPrint:

Math Matrix Algebra

Linear Algebra: Matrix Eigenvalue Problems

Definite versus Indefinite Linear Algebra. Christian Mehl Institut für Mathematik TU Berlin Germany. 10th SIAM Conference on Applied Linear Algebra

MATHEMATICS 217 NOTES

Inverse Eigenvalue Problem with Non-simple Eigenvalues for Damped Vibration Systems

A Note on Eigenvalues of Perturbed Hermitian Matrices

Trimmed linearizations for structured matrix polynomials

Skew-Symmetric Matrix Polynomials and their Smith Forms

Foundations of Matrix Analysis

A Backward Stable Hyperbolic QR Factorization Method for Solving Indefinite Least Squares Problem

Perturbation theory for eigenvalues of Hermitian pencils. Christian Mehl Institut für Mathematik TU Berlin, Germany. 9th Elgersburg Workshop

Finite and Infinite Elementary Divisors of Matrix Polynomials: A Global Approach. Zaballa, Ion and Tisseur, Francoise. MIMS EPrint: 2012.

Reduction of matrix polynomials to simpler forms. Nakatsukasa, Yuji and Taslaman, Leo and Tisseur, Francoise and Zaballa, Ion. MIMS EPrint: 2017.

LAPACK-Style Codes for Pivoted Cholesky and QR Updating. Hammarling, Sven and Higham, Nicholas J. and Lucas, Craig. MIMS EPrint: 2006.

Off-diagonal perturbation, first-order approximation and quadratic residual bounds for matrix eigenvalue problems

Eigenvector error bound and perturbation for nonlinear eigenvalue problems

EXPLICIT BLOCK-STRUCTURES FOR BLOCK-SYMMETRIC FIEDLER-LIKE PENCILS

Linear Algebra and its Applications

Fiedler Companion Linearizations and the Recovery of Minimal Indices. De Teran, Fernando and Dopico, Froilan M. and Mackey, D.

Lecture 3: QR-Factorization

Matrix functions that preserve the strong Perron- Frobenius property

Isospectral Families of High-order Systems

Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012

On the computation of the Jordan canonical form of regular matrix polynomials

Math 489AB Exercises for Chapter 2 Fall Section 2.3

PALINDROMIC POLYNOMIAL EIGENVALUE PROBLEMS: GOOD VIBRATIONS FROM GOOD LINEARIZATIONS

On an Inverse Problem for a Quadratic Eigenvalue Problem

ETNA Kent State University

PALINDROMIC LINEARIZATIONS OF A MATRIX POLYNOMIAL OF ODD DEGREE OBTAINED FROM FIEDLER PENCILS WITH REPETITION.

Quadratic Realizability of Palindromic Matrix Polynomials

MATH 315 Linear Algebra Homework #1 Assigned: August 20, 2018

G1110 & 852G1 Numerical Linear Algebra

Solution of the Inverse Eigenvalue Problem for Certain (Anti-) Hermitian Matrices Using Newton s Method

Math 102, Winter Final Exam Review. Chapter 1. Matrices and Gaussian Elimination

Structured Polynomial Eigenvalue Problems: Good Vibrations from Good Linearizations

Econ Slides from Lecture 7

Finite dimensional indefinite inner product spaces and applications in Numerical Analysis

Structured eigenvalue/eigenvector backward errors of matrix pencils arising in optimal control

An Algorithm for the Complete Solution of Quadratic Eigenvalue Problems. Hammarling, Sven and Munro, Christopher J. and Tisseur, Francoise

8. Diagonalization.

Problems in Linear Algebra and Representation Theory

Elementary linear algebra

Eigenvalue Problems. Eigenvalue problems occur in many areas of science and engineering, such as structural analysis

Fundamentals of Linear Algebra. Marcel B. Finan Arkansas Tech University c All Rights Reserved

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.

Introduction to Matrix Algebra

KU Leuven Department of Computer Science

The numerical stability of barycentric Lagrange interpolation

Basic Algebra. Final Version, August, 2006 For Publication by Birkhäuser Boston Along with a Companion Volume Advanced Algebra In the Series

A linear algebra proof of the fundamental theorem of algebra

Math 350 Fall 2011 Notes about inner product spaces. In this notes we state and prove some important properties of inner product spaces.

NONCOMMUTATIVE POLYNOMIAL EQUATIONS. Edward S. Letzter. Introduction

4. Determinants.

GEOMETRIC CONSTRUCTIONS AND ALGEBRAIC FIELD EXTENSIONS

Chapter 7. Canonical Forms. 7.1 Eigenvalues and Eigenvectors

Distance bounds for prescribed multiple eigenvalues of matrix polynomials

LAPACK-Style Codes for Pivoted Cholesky and QR Updating

Topics in linear algebra

Numerical Methods - Numerical Linear Algebra

Eigenvalues and Eigenvectors

NOTES on LINEAR ALGEBRA 1

Econ Slides from Lecture 8

ON THE MATRIX EQUATION XA AX = X P

Scientific Computing: An Introductory Survey

Improved Newton s method with exact line searches to solve quadratic matrix equation

A Framework for Structured Linearizations of Matrix Polynomials in Various Bases

Some Formulas for the Principal Matrix pth Root

Transcription:

Definite Matrix Polynomials and their Linearization by Definite Pencils. Higham, Nicholas J. and Mackey, D. Steven and Tisseur, Françoise. 2007. MIMS EPrint 2007.97. Manchester Institute for Mathematical Sciences, School of Mathematics, The University of Manchester. Reports available from: http://eprints.maths.manchester.ac.uk/ And by contacting: The MIMS Secretary, School of Mathematics, The University of Manchester, Manchester M13 9PL, UK. ISSN 1749-9097

DEFINITE MATRIX POLYNOMIALS AND THEIR LINEARIZATION BY DEFINITE PENCILS

NICHOLAS J. HIGHAM, D. STEVEN MACKEY, AND FRANÇOISE TISSEUR

Abstract. Hyperbolic matrix polynomials are an important class of Hermitian matrix polynomials that contain overdamped quadratics as a special case. They share with definite pencils the spectral property that their eigenvalues are real and semisimple. We extend the definition of hyperbolic matrix polynomial in a way that relaxes the requirement of definiteness of the leading coefficient matrix, yielding what we call definite polynomials. We show that this class of polynomials has an elegant characterization in terms of definiteness intervals on the extended real line, and that it includes definite pencils as a special case. A fundamental question is whether a definite matrix polynomial P can be linearized in a structure-preserving way. We show that the answer to this question is affirmative: P is definite if and only if it has a definite linearization in H(P), a certain vector space of Hermitian pencils; and for definite P we give a complete characterization of all the linearizations in H(P) that are definite. For the important special case of quadratics, we show how a definite quadratic polynomial can be transformed into a definite linearization with a positive definite leading coefficient matrix, a form that is particularly attractive numerically.

Key words. matrix polynomial, hyperbolic matrix polynomial, matrix pencil, definite pencil, structure-preserving linearization, quadratic eigenvalue problem, polynomial eigenvalue problem

AMS subject classifications. 65F15, 15A18

1. Introduction. Consider the matrix polynomial of degree l

(1.1)    P(λ) = Σ_{j=0}^{l} λ^j A_j,    A_j ∈ C^{n×n}.

We will assume throughout that P is regular, that is, det P(λ) ≢ 0. However, we do not insist that A_l is nonzero, so l is part of the problem specification. The polynomial eigenvalue problem is to find scalars λ and nonzero vectors x and y satisfying P(λ)x = 0 and y*P(λ) = 0; x and y are right and left
eigenvectors corresponding to the eigenvalue λ. A standard way of treating the polynomial eigenvalue problem P(λ)x = 0, both theoretically and numerically, is to convert it into an equivalent linear matrix pencil L(λ) = λX + Y ∈ C^{ln×ln} by the process known as linearization. Formally, L is a linearization of P if it satisfies

E(λ) L(λ) F(λ) = [ P(λ)  0 ; 0  I_{(l−1)n} ]

for some unimodular E(λ) and F(λ). This implies that c·det(L(λ)) = det(P(λ)) for some nonzero constant c, so that L and P have the same eigenvalues.

Version of October 13, 2008. School of Mathematics, The University of Manchester, Manchester, M13 9PL, UK (higham@ma.man.ac.uk, http://www.ma.man.ac.uk/~higham/; ftisseur@ma.man.ac.uk, http://www.ma.man.ac.uk/~ftisseur/). The work of both authors was supported by Engineering and Physical Sciences Research Council grant EP/D079403. The work of the first author was also supported by a Royal Society-Wolfson Research Merit Award. Department of Mathematics, Western Michigan University, Kalamazoo, MI 49008, USA (steve.mackey@wmich.edu). This work was supported by National Science Foundation grant DMS-0713799.

The most

widely used linearizations in practice are the companion forms [10, Sec. 14.1], defined by C_i(λ) = λX_i + Y_i, i = 1, 2, where

(1.2a)    X_1 = X_2 = diag(A_l, I_n, ..., I_n),

(1.2b)    Y_1 = [ A_{l−1}  A_{l−2}  ⋯  A_0
                  −I_n      0       ⋯  0
                   ⋮        ⋱       ⋱  ⋮
                   0        ⋯     −I_n  0 ],    Y_2 = [ A_{l−1}  −I_n  ⋯   0
                                                        A_{l−2}   0    ⋱   ⋮
                                                         ⋮        ⋮    ⋱  −I_n
                                                        A_0       0    ⋯   0 ].

In recent work [8], [12], two vector spaces of pencils and their intersection have been studied that generalize the companion forms and provide a systematic way of generating a wide class of linearizations. These spaces make it possible to identify linearizations having specific properties, such as optimal conditioning [9], optimal backward error bounds [7], and preservation of structure such as symmetry [8] or palindromic or odd-even structure [11]. In this paper we concentrate on matrix polynomials with symmetric or Hermitian matrix coefficients.

Before discussing our aims we recall some definitions and terminology. A pencil L(λ) = λX + Y is called Hermitian if X, Y ∈ C^{n×n} are Hermitian. We write A > 0 to denote that the Hermitian matrix A is positive definite. A Hermitian matrix A is definite if either A > 0 or −A > 0. Two definite matrices have opposite parity if one is positive definite and the other is negative definite. A sequence A_0, A_1, A_2, ... of definite matrices has alternating parity if A_j and A_{j+1} have opposite parity for all j.

Definition 1.1 (definite pencil). A Hermitian pencil L(λ) = λX + Y is definite (or, equivalently, the matrices X, Y form a definite pair) if

(1.3)    γ(X, Y) := min_{z ∈ C^n, ||z||_2 = 1} sqrt( (z*Xz)² + (z*Yz)² ) > 0.

The quantity γ(X, Y) is known as the Crawford number of the pencil.

Definition 1.2 (hyperbolic polynomial). A matrix polynomial P(λ) = Σ_{j=0}^l λ^j A_j is hyperbolic if A_l > 0 and, for every nonzero x ∈ C^n, the scalar equation x*P(λ)x = 0 has l distinct real zeros.

This latter definition, which can be found in Gohberg, Lancaster, and Rodman [4, Sec. 13.4], for example, does not explicitly require the A_i for i < l to be Hermitian. However, the fact that the coefficients a_j(x) = x*A_j x, j = 0:l, of the scalar polynomial x*P(λ)x = Σ_{j=0}^l λ^j a_j(x) are real-valued functions of x ∈ C^n (since their roots are real and the leading coefficient is real) has this implication.

Definite pencils and hyperbolic polynomials share an important spectral property: all their eigenvalues are real and semisimple. In light of this commonality, as well as the possibility of giving them analogous definitions, it is natural to wonder whether every hyperbolic P can be linearized by some definite pencil. In particular, can this always be done with one of the pencils in H(P), a vector space of Hermitian pencils associated with P studied in [8]? In the case of the quadratic Q(λ) = λ²A + λB + C with A, B, and C Hermitian and A > 0, Barkwell and Lancaster [2] and Veselić [16, Thm. A5] show that Q is hyperbolic if and only if the Hermitian pencil

(1.4)    L_2(λ) = λ [ 0  A ; A  B ] + [ −A  0 ; 0  C ]

is definite. We will show that L_2 is just one of many definite pencils in H(Q).

Our contributions in this paper are, first, to propose an extension of the definition of hyperbolicity for a matrix polynomial that retains all the spectral properties of hyperbolic polynomials and in the linear case (l = 1) is equivalent to definiteness of the pencil. Second, we prove that a Hermitian matrix polynomial P(λ) has a definite linearization in H(P) if and only if P is hyperbolic in this extended sense. For these reasons we will refer to this class of extended hyperbolic polynomials as definite polynomials. Furthermore, for a definite P we give a complete characterization of all the linearizations in H(P) that are definite. Finally, we specialize to definite quadratics Q(λ). In particular, we explain how Q can be linearized into a definite pencil L(λ) = λX + Y with X > 0, a form that is particularly attractive numerically. An important theme of this work is the preservation of the definiteness of a polynomial in the process of linearization. As such, this work continues the study of structure-preserving linearizations begun in [8] and [11].

While our extension of the notion of hyperbolicity is mathematically natural and fruitful in terms of the various properties it yields, it is also of practical relevance. To explain why, we note that in acoustic fluid-structure interaction problems a non-Hermitian generalized eigenvalue problem arises with pencil of the form [1, Chap. 8]

ω [ −M_s  0 ; M_fs*  M_f ] + [ K_s  M_fs ; 0  K_f ],

where M_s, K_s ∈ C^{n×n} and M_f, K_f ∈ C^{m×m} are Hermitian positive definite. Multiplying the first block row by ω yields the Hermitian quadratic polynomial

Q(ω) = ω² [ −M_s  0 ; 0  0 ] + ω [ K_s  M_fs ; M_fs*  M_f ] + [ 0  0 ; 0  K_f ].

This polynomial is not hyperbolic because its leading coefficient matrix is not positive definite; nor is the reversed polynomial ω² Q(1/ω) hyperbolic. However, Q is a definite polynomial, so that all the theory developed in this paper applies to it. To see why this is so, observe that for sufficiently small positive ǫ the matrix

Q(ǫ) = [ −ǫ²M_s + ǫK_s   ǫM_fs ; ǫM_fs*   ǫM_f + K_f ]

is congruent to the positive definite matrix

[ −ǫ²M_s + ǫK_s   0 ; 0   ǫM_f + K_f − ǫ M_fs*(K_s − ǫM_s)^{−1} M_fs ],

that is, Q(ǫ) > 0. Similarly, with P the permutation matrix [ 0  I_n ; I_m  0 ] we have

ǫ² P^T Q(1/ǫ) P = [ ǫM_f + ǫ²K_f   ǫM_fs* ; ǫM_fs   −M_s + ǫK_s ],

which is congruent to

[ ǫM_f + ǫ²K_f   0 ; 0   −M_s + ǫK_s − ǫ M_fs(M_f + ǫK_f)^{−1} M_fs* ].

For ǫ negative and sufficiently close to 0 this matrix is negative definite, that is, Q(ǫ^{−1}) < 0. Thus there exist distinct µ_1 < 0 and µ_2 > 0 such that Q(µ_1) < 0 and Q(µ_2) > 0. It then follows from Theorem 2.6 below that the quadratic Q is definite.
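The two-evaluation argument above can be checked numerically. The following is a minimal sketch using small illustrative (not physical) stand-in matrices for M_s, K_s, M_f, K_f, M_fs, with the block structure of Q as reconstructed above: Q is positive definite at a small positive point and negative definite at a large negative point, the alternating-parity pattern required by Theorem 2.6.

```python
import numpy as np

# Stand-in fluid-structure quadratic (all matrices illustrative):
# Q(w) = w^2*[-Ms 0; 0 0] + w*[Ks Mfs; Mfs' Mf] + [0 0; 0 Kf]
Ms = 2.0 * np.eye(2)
Ks = np.array([[3.0, 1.0], [1.0, 3.0]])
Mf = np.eye(2)
Kf = 4.0 * np.eye(2)
Mfs = 0.1 * np.eye(2)
Z = np.zeros((2, 2))

def Q(w):
    A2 = np.block([[-Ms, Z], [Z, Z]])
    A1 = np.block([[Ks, Mfs], [Mfs.T, Mf]])
    A0 = np.block([[Z, Z], [Z, Kf]])
    return w**2 * A2 + w * A1 + A0

# Q(mu2) > 0 for a small positive mu2, Q(mu1) < 0 for mu1 large and
# negative (mu1 = 1/eps with eps < 0 small): two evaluation points of
# alternating parity, as Theorem 2.6 requires.
assert np.all(np.linalg.eigvalsh(Q(1e-3)) > 0)
assert np.all(np.linalg.eigvalsh(Q(-1e3)) < 0)
```

The definiteness tests use `eigvalsh`, which exploits the Hermitian structure of Q(µ).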

Fig. 2.1. Correspondence between λ and (α, β). Note the two copies of R ∪ {∞} represented by the upper semicircle and lower semicircle.

2. Definiteness and hyperbolicity. We will use the homogeneous forms of the degree-l matrix polynomial P(λ) = Σ_{j=0}^l λ^j A_j and pencil L(λ) = λX + Y, which are given by

P(α, β) = Σ_{j=0}^{l} α^j β^{l−j} A_j,    L(α, β) = αX + βY.

Then λ is identified with any pair (α, β) ≠ (0, 0) for which λ = α/β. Without loss of generality we can take α² + β² = 1, giving the pictorial representation in Figure 2.1 of the unit circle, R ∪ {∞}, and the correspondence between λ and (α, β). Note that the unit circle contains two copies of R ∪ {∞}, since (α, β) and (−α, −β) correspond to the same λ ∈ R ∪ {∞}.

We say that the matrix polynomial P̃(α̃, β̃) is obtained from P(α, β) by homogeneous rotation if

(2.1)    [ α ; β ] = [ c  −s ; s  c ] [ α̃ ; β̃ ],    c, s ∈ R,    c² + s² = 1,

and

P(α, β) = Σ_{j=0}^{l} α^j β^{l−j} A_j = Σ_{j=0}^{l} (c α̃ − s β̃)^j (s α̃ + c β̃)^{l−j} A_j =: Σ_{j=0}^{l} α̃^j β̃^{l−j} Ã_j =: P̃(α̃, β̃).

It is easily checked that Ã_l = P(c, s) and Ã_0 = P(−s, c). Further relationships between P and P̃ are given in the following lemma.

Lemma 2.1. Suppose P̃(α̃, β̃) is obtained from P(α, β) by homogeneous rotation, with (α, β) and (α̃, β̃) related by (2.1). Then
(a) P is positive definite at (α, β) if and only if P̃ is positive definite at (α̃, β̃). More generally, the signatures of the matrices P(α, β) and P̃(α̃, β̃) are the same.
(b) x*P(α, β)x = x*P̃(α̃, β̃)x for all nonzero x ∈ C^n.
(c) The eigenvectors of P and P̃ are the same, but the corresponding eigenvalues are rotated.
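Homogeneous rotation is easy to exercise numerically. The sketch below, for a quadratic (l = 2) with illustrative Hermitian coefficients, expands (c α̃ − s β̃)^j (s α̃ + c β̃)^{2−j} to obtain the rotated coefficients Ã_j, and checks that P and P̃ agree at corresponding points and that Ã_2 = P(c, s), Ã_0 = P(−s, c).

```python
import numpy as np

# Illustrative Hermitian (real symmetric) coefficients of a quadratic.
A2 = np.array([[2.0, 0.0], [0.0, 1.0]])
A1 = np.array([[0.0, 1.0], [1.0, 3.0]])
A0 = np.array([[1.0, 0.0], [0.0, -1.0]])

def P(a, b):
    return a**2 * A2 + a * b * A1 + b**2 * A0

theta = 0.3
c, s = np.cos(theta), np.sin(theta)

# Rotated coefficients from expanding (c*at - s*bt)^j (s*at + c*bt)^(2-j).
At2 = c**2 * A2 + c * s * A1 + s**2 * A0            # = P(c, s)
At1 = -2 * c * s * A2 + (c**2 - s**2) * A1 + 2 * s * c * A0
At0 = s**2 * A2 - s * c * A1 + c**2 * A0            # = P(-s, c)

def Pt(at, bt):
    return at**2 * At2 + at * bt * At1 + bt**2 * At0

# Lemma 2.1: P(alpha, beta) and P~(alpha~, beta~) are the same matrix.
at, bt = np.cos(1.1), np.sin(1.1)
a, b = c * at - s * bt, s * at + c * bt
assert np.allclose(P(a, b), Pt(at, bt))
assert np.allclose(At2, P(c, s)) and np.allclose(At0, P(-s, c))
```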

Proof. Since for any (α, β) and (α̃, β̃) related by (2.1), P(α, β) and P̃(α̃, β̃) are exactly the same matrix, the proof is straightforward.

Since a Hermitian pencil L(λ) = λX + Y is definite if its Crawford number γ(X, Y) in (1.3) is strictly positive, it is clear that a sufficient condition for definiteness is that one of X and Y is definite. However, it is the definiteness of a suitable linear combination of X and Y that characterizes definiteness of the pair, as shown by the following result, which is essentially contained in [14], [15, Thm. 6.1.18] (see [15, p. 290] for references to earlier work on this topic).

Theorem 2.2. A Hermitian pencil L(λ) = λX + Y is definite if and only if L(µ) is a definite matrix for some µ ∈ R ∪ {∞}, or, equivalently, if L(α, β) > 0 for some (α, β) on the unit circle.

Definite pairs have the desirable properties that they are simultaneously diagonalizable under congruence and, in the associated eigenproblem L(λ)x = 0, the eigenvalues are real and semisimple [15, Cor. 6.1.19].

Recall that, by definition, for a hyperbolic matrix polynomial P(λ) = Σ_{i=0}^l λ^i A_i the equation x*P(λ)x = 0 has l distinct real zeros for x ≠ 0, and hence P has real eigenvalues. For such a P, let λ_1(x) > λ_2(x) > ⋯ > λ_l(x) be the roots of x*P(λ)x for some nonzero x ∈ C^n. Markus [13, Sec. 31] (see also [4, Sec. 13.4]) shows that the eigenvalues of P are distributed in l disjoint closed intervals

(2.2)    I_j = { λ_j(x) : x ∈ C^n, ||x||_2 = 1 },    j = 1:l.

Markus [13, Lem. 31.15] gives, moreover, the following characterization of hyperbolicity. In this result and below we write P(µ) when we are considering the definiteness of the matrix P(µ) ∈ C^{n×n}, and reserve the notation P(λ) for the matrix polynomial.

Theorem 2.3 (Markus's characterization of hyperbolicity). Let P(λ) = Σ_{j=0}^l λ^j A_j be a Hermitian matrix polynomial of degree l > 1 with A_l > 0. Then P is hyperbolic if and only if there exist µ_j ∈ R such that

(2.3)    (−1)^j P(µ_j) > 0,    j = 1:l−1,    µ_1 > µ_2 > ⋯ > µ_{l−1}.

These properties combine to give a useful definiteness diagram that summarizes many of the key properties of hyperbolic polynomials.

Theorem 2.4. A hyperbolic polynomial P(λ) = Σ_{j=0}^l λ^j A_j with eigenvalues λ_{ln} ≤ ⋯ ≤ λ_1 has the properties displayed in the following diagram, where the closed shaded intervals are the I_j defined in (2.2) and the matrix P(µ) is indefinite (i.e., has both positive and negative eigenvalues) on the interiors of these intervals:

  (−1)^l P(µ)>0 | I_l = [λ_{ln}, ..., λ_{(l−1)n+1}] | (−1)^{l−1} P(µ)>0 | ⋯ | I_2 = [λ_{2n}, ..., λ_{n+1}] | P(µ)<0 | I_1 = [λ_n, ..., λ_1] | P(µ)>0

Proof. The proof is essentially a matter of counting sign changes in eigenvalues. Since A_l > 0, P(µ) > 0 for all sufficiently large µ. Since λ_1 is the largest eigenvalue of the polynomial P(λ), P(λ_1) is singular and P(µ) is nonsingular for µ > λ_1. Hence we must have P(µ) > 0 for µ > λ_1. Likewise, (−1)^l P(µ) > 0 for µ < λ_{ln}. As µ decreases from λ_1, the inertia of the matrix P(µ) changes only at an eigenvalue of P(λ), and at a k-fold eigenvalue of P(λ) the number of negative eigenvalues

of P(µ) can increase by at most k. Hence any number µ_1 for which P(µ_1) < 0 satisfies µ_1 < λ_n. Similarly, any number µ_{l−1} for which (−1)^{l−1} P(µ_{l−1}) > 0 satisfies λ_{(l−1)n+1} < µ_{l−1}. By continuing this argument we find that the points µ_{l−1}, ..., µ_1 satisfying (2.3) can be accommodated in the diagram only by placing them in the bracketed intervals, and that P must be indefinite inside each of the shaded intervals. Finally, it is easily seen that the shaded intervals are precisely the I_j in (2.2).

Note that hyperbolic pencils L(λ) = λX + Y are definite, since their coefficient matrices are Hermitian with X > 0. However, a definite pair is not necessarily hyperbolic in the standard sense, since X and Y can both be indefinite. We can now make a definition of definite polynomial that extends the notion of hyperbolicity and is consistent with the definition of definite pencil.

Definition 2.5 (definite polynomial). A Hermitian matrix polynomial P(λ) = Σ_{j=0}^l λ^j A_j is definite if there exists µ ∈ R ∪ {∞} such that the matrix P(µ) is definite and for every nonzero x ∈ C^n the scalar equation x*P(λ)x = 0 has l distinct zeros in R ∪ {∞}.

To see the consistency of the definition, take a definite P and homogeneously rotate P into P̃ so that µ corresponds to ∞, that is, µ = α/β corresponds to µ̃ = α̃/β̃ = 1/0 = ∞ (this can be done by setting c = α and s = β in (2.1)); then by Lemma 2.1, P̃(µ̃) = Ã_l > 0. If x*P(λ)x = 0 has distinct roots in R ∪ {∞} then x*P̃(λ̃)x = 0 has real distinct zeros (and no infinite root, since Ã_l > 0). Here we adopt the convention that x*P(λ)x = Σ_{j=0}^l a_j(x)λ^j has a root at ∞ whenever a_l(x) = 0. Thus P̃(λ̃) is hyperbolic. Hence any definite matrix polynomial is actually a homogeneously rotated hyperbolic matrix polynomial. By Lemma 2.1, all the spectral and definiteness properties of hyperbolic polynomials are inherited by definite polynomials, as long as we interpret intervals homogeneously on the unit circle; see Figure 2.2.

Note that for a matrix polynomial P(α, β) of degree l in homogeneous variables α, β we have P(−α, −β) = (−1)^l P(α, β). Thus for odd degree Hermitian polynomials, P(−α, −β) and P(α, β) have opposite signature for any (α, β) on the unit circle, and hence an antisymmetric (through the origin) definiteness diagram (see Figure 2.2(a)). An even degree Hermitian polynomial has P(−α, −β) and P(α, β) with the same signature for any (α, β) on the unit circle, and hence a symmetric (through the origin) definiteness diagram (see Figure 2.2(b)). By virtue of this (anti)symmetry, the leftmost definiteness semi-interval in Theorem 2.4, that is, {µ < λ_{ln} : (−1)^l P(µ) > 0}, together with the rightmost definiteness semi-interval {µ > λ_1 : P(µ) > 0}, can be considered as one definiteness interval, and any hyperbolic matrix polynomial of degree l can be regarded as having l distinct definiteness intervals (and not l + 1, as the diagram of Theorem 2.4 might suggest).

In view of the connection between hyperbolic polynomials and definite polynomials via homogeneous rotation, we have the following extension of Markus's characterization of hyperbolic matrix polynomials in Theorem 2.3. Unlike the latter result, ours is valid for l = 1.

Theorem 2.6 (characterization of definite matrix polynomial). A Hermitian matrix polynomial P(λ) = Σ_{j=0}^l λ^j A_j is definite if and only if there exist γ_j ∈ R ∪ {∞} with γ_0 > γ_1 > γ_2 > ⋯ > γ_{l−1} (γ_0 = ∞ being possible) such that P(γ_0), P(γ_1), ..., P(γ_{l−1}) are definite matrices with alternating parity.
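The definiteness diagram and the alternating-parity characterization can be made concrete with a small sketch: for an illustrative overdamped (hence hyperbolic) quadratic, counting negative eigenvalues of Q(µ) at sample points shows the pattern positive / negative / positive predicted by Theorem 2.4, and two sample points of alternating parity suffice for Theorem 2.6.

```python
import numpy as np

# Illustrative overdamped quadratic Q(mu) = mu^2 A + mu B + C:
# Q(mu) > 0 to the right of the spectrum, Q(mu) < 0 between the two
# spectral intervals I_1 and I_2, and Q(mu) > 0 again to their left.
A = np.eye(2)
B = np.array([[10.0, 1.0], [1.0, 12.0]])
C = np.eye(2)

def neg_count(mu):
    """Number of negative eigenvalues of the Hermitian matrix Q(mu)."""
    Q = mu**2 * A + mu * B + C
    return int(np.sum(np.linalg.eigvalsh(Q) < 0))

assert neg_count(0.0) == 0      # rightmost region: P(mu) > 0
assert neg_count(-1.0) == 2     # between I_1 and I_2: P(mu) < 0
assert neg_count(-30.0) == 0    # leftmost region: (-1)^2 P(mu) > 0
# gamma_0 = 0 > gamma_1 = -1 give definite matrices of alternating
# parity, so Q is definite by the characterization of Theorem 2.6.
```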

Fig. 2.2. Examples of definiteness diagrams for (a) definite pencils L(α, β) = αX + βY (the cases X > 0 with Y indefinite; X indefinite with Y > 0; X, Y both indefinite) and (b) definite quadratics Q(α, β) = α²A + αβB + β²C (the cases A > 0, B > 0, C > 0 (overdamped); A > 0, C < 0; A, C both indefinite). The shaded arcs are the arcs of indefiniteness.

3. Hermitian linearizations. We recall some definitions and results from [8], [12]. With the notation

(3.1)    Λ = [λ^{l−1}, λ^{l−2}, ..., λ, 1]^T ∈ F^l,

where l = deg(P) and F = C or R, define two vector spaces of ln × ln pencils L(λ) = λX + Y:

(3.2)    L_1(P) = { L(λ) : L(λ)(Λ ⊗ I_n) = v ⊗ P(λ), v ∈ F^l },
(3.3)    L_2(P) = { L(λ) : (Λ^T ⊗ I_n)L(λ) = w^T ⊗ P(λ), w ∈ F^l },

where ⊗ is the Kronecker product [10, Chap. 12]. The vectors v and w are referred to as right ansatz and left ansatz vectors, respectively. It is easily checked that for the companion forms in (1.2), C_1(λ) ∈ L_1(P) with v = e_1 and C_2(λ) ∈ L_2(P) with w = e_1, where e_i denotes the ith column of I_l. For any regular P, almost all pencils in L_1(P) and L_2(P) are linearizations of P [12, Thm. 4.7]. The intersection

(3.4)    DL(P) = L_1(P) ∩ L_2(P)

is of particular interest because there is a simultaneous correspondence via Kronecker products between left and right eigenvectors of P and those of pencils in DL(P). Two

key facts are that L ∈ DL(P) if and only if L satisfies the conditions in (3.2) and (3.3) with w = v, and that every v ∈ F^l uniquely determines X and Y such that L(λ) = λX + Y is in DL(P) [8, Thm. 3.4], [12, Thm. 5.3]. Thus DL(P) is an l-dimensional space of pencils associated with P. Just as for L_1(P) and L_2(P), almost all pencils in DL(P) are linearizations when P is regular [12, Thm. 6.8].

Define the block Hankel, block j × j matrices

L_j = [ 0      ⋯   0     A_l
        0      ⋰  A_l    A_{l−1}
        ⋮      ⋰   ⋰     ⋮
        A_l  A_{l−1} ⋯  A_{l−j+1} ],    U_j = [ A_{j−1}  ⋯  A_1  A_0
                                                 ⋮       ⋰  A_0   0
                                                A_1     A_0  ⋰    ⋮
                                                A_0      0   ⋯    0 ].

It is shown in [8, Thm. 3.5] that the jth standard basis pencil in DL(P), with ansatz vector e_j (j = 1:l), can be expressed as

(3.5)    L_j(λ) = λX_j − X_{j−1},    X_j = [ L_j  0 ; 0  −U_{l−j} ]

(L_j and U_j are taken to be void when j = 0). Now, for a Hermitian matrix polynomial P(λ) of degree l, let

(3.6)    H(P) := { λX + Y ∈ L_1(P) : X = X*, Y = Y* }

denote the set of all Hermitian pencils in L_1(P). It is shown in [8, Thm. 6.1] that H(P) is the subset of all pencils in DL(P) with a real ansatz vector. In other words, for each vector v ∈ R^l there is a unique Hermitian pencil in H(P), defined by Σ_{j=1}^l v_j(λX_j − X_{j−1}) with X_j as in (3.5), and every pencil in H(P) can be written in this way.

4. Definite linearizations. In this section L(λ) = λX + Y is an element of H(P) with ansatz vector v ∈ R^l, where l is the degree of P. We begin with the statement of our two main results.

Theorem 4.1 (existence of definite linearizations). A Hermitian matrix polynomial P(λ) has a definite linearization in H(P) if and only if P is definite.

We denote by D(P) the subset of all definite pencils in H(P), i.e.,

D(P) = { L ∈ H(P) : L is a definite pencil } ⊂ H(P).

Note that since every definite pencil is regular, by [12, Thm. 4.3] any pencil in D(P) is automatically a definite linearization of P. With a vector v ∈ R^l we associate a scalar polynomial p(λ; v) := v^T Λ = Σ_{i=1}^l v_i λ^{l−i}, referred to as the v-polynomial. We adopt the convention that p(λ; v) has a root at ∞ whenever v_1 = 0.

Theorem 4.2 (characterization of D(P)). Suppose the Hermitian matrix polynomial P of degree l is definite and L(λ) = λX + Y ∈ H(P) with ansatz vector v ∈ R^l. Then L ∈ D(P) if and only if the roots of p(x; v), including ∞ if v_1 = 0, are real, simple (i.e., of multiplicity 1), and lie in distinct definiteness intervals for P. Moreover, L(α, β) = αX + βY is a definite matrix if and only if L ∈ D(P) and (α, β) lies in the one definiteness interval for P that is not occupied by a root of p(x; v).

The rest of this section is devoted to the proofs of these two theorems.
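For a quadratic, the standard basis pencils of DL(P) in (3.5) can be written down explicitly and the ansatz identity (3.2) checked directly. The sketch below uses illustrative real symmetric coefficients; the two basis pencils are L_1(λ) = λ[A 0; 0 −C] + [B C; C 0] and L_2(λ) = λ[0 A; A B] + [−A 0; 0 C], the latter being the pencil (1.4).

```python
import numpy as np

# Quadratic P(lambda) = lambda^2 A + lambda B + C, illustrative symmetric
# coefficients; Lambda = [lambda, 1]^T for l = 2.
A = np.array([[2.0, 0.0], [0.0, 3.0]])
B = np.array([[1.0, 1.0], [1.0, 4.0]])
C = np.array([[5.0, 0.0], [0.0, 1.0]])
I, Z = np.eye(2), np.zeros((2, 2))

lam = 0.7
P = lam**2 * A + lam * B + C
Lambda_kron_I = np.vstack([lam * I, I])        # Lambda kron I_n

# Standard basis pencils of DL(P)/H(P) from (3.5).
L1 = lam * np.block([[A, Z], [Z, -C]]) + np.block([[B, C], [C, Z]])
L2 = lam * np.block([[Z, A], [A, B]]) + np.block([[-A, Z], [Z, C]])

# Ansatz identity (3.2): L(lambda)(Lambda kron I) = v kron P(lambda).
assert np.allclose(L1 @ Lambda_kron_I, np.vstack([P, Z]))   # v = e_1
assert np.allclose(L2 @ Lambda_kron_I, np.vstack([Z, P]))   # v = e_2
```

A general pencil in H(P) is then the linear combination v_1 L_1(λ) + v_2 L_2(λ) for real v_1, v_2.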

4.1. Ansatz vector conditions. The first task is to eliminate from further consideration any ansatz vector v ∈ R^l such that p(x; v) has either a complex (nonreal) root or a real root (including ∞) with multiplicity 2 or greater. Synthetic division, carried out by Horner's method, is one of the key ingredients for this task.

Lemma 4.3. Let p(x) = a_0 + a_1 x + ⋯ + a_m x^m, and denote by q_i(s), i = 0:m, the scalars generated by Horner's method for evaluating the polynomial p at a point s, that is,

q_0(s) = a_m,    q_i(s) = s q_{i−1}(s) + a_{m−i},    i = 1:m.

Define the degree m−1 polynomial

q_s(x) := q_{m−1}(s) + q_{m−2}(s)x + ⋯ + q_k(s)x^{m−k−1} + ⋯ + q_1(s)x^{m−2} + q_0(s)x^{m−1}.

Then
(a) p(x) = (x − s) q_s(x) + p(s).
(b) If s is a root of p(x), then p(x) = (x − s) q_s(x) and q_s(x) = a_m Π_{i=1}^{m−1} (x − r_i), where r_1, r_2, ..., r_{m−1} are the other roots of p.
(c) If r and s are distinct roots of p(x), then q_s(r) = 0.
(d) s is a root of p with multiplicity at least two if and only if q_s(s) = 0.

Proof. (a) is a standard identity for Horner's method; see, e.g., [6, Sec. 5.2]. (b) is immediate from (a), and (b) implies (c) and (d).

In what follows we often use Λ in (3.1) with an argument: Λ(r) = [r^{l−1}, ..., r, 1]^T. We will need to refer to the following result from [12, Lem. 6.5].

Lemma 4.4. Suppose that L(λ) ∈ DL(P) with ansatz vector v, and p(x; v) is the v-polynomial of v. Let Y_j denote the jth block column of Y in L(λ) = λX + Y, where 1 ≤ j ≤ l − 1. Then

(Λ^T(x) ⊗ I) Y_j = q_{j−1}(x; v) P(x) − x p(x; v) P_{j−1}(x),

where q_{j−1}(x; v) and P_{j−1}(x) are the scalar and matrix generated by Horner's method for evaluating p(x; v) and P(x), respectively, as in Lemma 4.3.

We recall an operation on block matrices introduced in [12] that is useful for constructing pencils in L_1(P). For block l × l matrices X and Y with n × n blocks X_ij and Y_ij, the column-shifted sum X ⊞ Y of X and Y is defined by

X ⊞ Y := [ X_11 ⋯ X_1l 0 ; ⋮  ⋮ ⋮ ; X_l1 ⋯ X_ll 0 ] + [ 0 Y_11 ⋯ Y_1l ; ⋮ ⋮ ⋮ ; 0 Y_l1 ⋯ Y_ll ] ∈ F^{ln×l(n+1)},

where the zero blocks are n × n. It is shown in [12, Lem. 3.4] that

(4.1)    L(λ) ∈ L_1(P) with ansatz vector v ∈ F^l  ⟺  X ⊞ Y = v ⊗ [A_l, A_{l−1}, ..., A_0].

We can now prove the following result.

Lemma 4.5. Suppose P is a Hermitian matrix polynomial and L ∈ H(P) with ansatz vector v. If either
(a) the v-polynomial p(x; v) has a (nonreal) complex conjugate pair of roots s and s̄, or

10 N J HIGHAM D S MACKEY AND F TISSEUR (b) p(x;v) has a real root s with multiplicity at least 2 or (c) is a root of p(x;v) with multiplicity at least 2 ie v 1 = v 2 = 0 then L is not a definite pencil Proof To prove parts (a) and (b) we use a congruence transformation S(αX + βy )S so as to preserve and reveal the nondefiniteness of L(αβ) = αx + βy Let I 0 S = Λ T I n (s) where s is the given complex (or multiple real) root of p(x;v) Using Lemma 44 together with p(s;v) = 0 we see that the bottom block row of SY has the form (SY ) l: = q 0 (s;v)p(s) q 1 (s;v)p(s) q (s;v)p(s) where the scalars q j (s;v) j = 0: are generated by Horner s method for p(s;v) as in Lemma 43 Observe that S L(λ) is no longer in H(P) but is still in L 1 (P) since ( ) I 0 S L(λ)(Λ I) = S (v P(λ)) = Λ T I n (v P(λ)) (s) v 1 v 1 v 2 v = v P(λ) = 2 v P(λ) p(s; v) 0 so that we can use the column shifted sum to deduce the structure of SX and SY Since the right ansatz vector of S L(λ) is v 1 v 0 T we know from (41) that the bottom block row of the shifted sum SX SY must be zero Thus the bottom block rows of SX and SY are (SX) l: = 0 q 0 (s;v)p(s) q 1 (s;v)p(s) q (s;v)p(s) (SY ) l: = q 0 (s;v)p(s) q 1 (s;v)p(s) q (s;v)p(s) 0 From this we can now compute the (ll)-blocks of SXS and SY S (SXS ) ll = (SX) l: S :l = (SX) l: (Λ(s) I n ) = q s (s;v)p(s) (SY S ) ll = (SY ) l: (Λ(s) I n ) = s q s (s;v)p(s) where q s (x;v) = j=0 q j(s;v)x l j 2 But q s (s;v) = 0 by Lemma 43 (c) for part (a) or Lemma 43 (d) for part (b) Thus S(αX + βy )S ll = 0 for all (αβ) on the unit circle showing that αx + βy is not a definite pencil For the proof of part (c) we observe from (35) that for the standard basis pencils L 3 L 4 L l for H(P) with ansatz vectors e 3 e 4 e l the (11) block is identically 0 Thus for any ansatz vector v R l with v 1 = v 2 = 0 the corresponding L(λ) = λx + Y H(P) has zero (11) blocks in X and Y so that (αx + βy ) 11 0 for all αβ Hence αx + βy is not a definite matrix for any (αβ) on the unit circle 
and $L$ is therefore not a definite pencil.

In light of Lemma 4.5 we now assume throughout the remainder of section 4 that the ansatz vector $v \in \mathbb{R}^l$ is such that $p(x;v)$ has $l-1$ distinct real roots (including possibly $\infty$, when $v_1 = 0$).
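The scalars $q_j(s;v)$ of Lemma 4.3 are exactly the intermediate quantities of Horner's method, i.e., of synthetic division of $p(x;v)$ by $x - s$. A minimal sketch of this recurrence (the function names here are ours, not the paper's):

```python
# Synthetic division (Horner) applied to the v-polynomial
#   p(x;v) = v1*x^(l-1) + v2*x^(l-2) + ... + vl.
# The intermediate values are the scalars q_0, ..., q_{l-2} of Lemma 4.3, and
#   p(x;v) = (x - s) * qhat_s(x;v) + p(s;v),
# where qhat_s has coefficients q_0, ..., q_{l-2} in descending powers.
def horner_quotient(v, s):
    q = [v[0]]                        # q_0 = v_1
    for vk in v[1:-1]:
        q.append(s * q[-1] + vk)      # q_j = s*q_{j-1} + v_{j+1}
    rem = s * q[-1] + v[-1]           # remainder = p(s;v)
    return q, rem

def polyval(coeffs, x):
    # Evaluate a polynomial given by coefficients in descending powers.
    y = 0.0
    for c in coeffs:
        y = y * x + c
    return y

# p(x;v) = x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3); divide out the root s = 2.
q, rem = horner_quotient([1, -6, 11, -6], 2.0)
assert (q, rem) == ([1, -4, 3], 0)    # quotient is x^2 - 4x + 3, remainder 0
# Lemma 4.3(b): qhat_{r_i}(r_i) = v_1 * prod_{j != i} (r_i - r_j)
assert polyval(q, 2.0) == 1 * (2 - 1) * (2 - 3)
```

When $s$ is a root of $p(x;v)$, the remainder vanishes and the quotient $\widehat q_s(x;v)$ carries the remaining roots, which is what drives the block structure in the proofs below.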

The second major step in the proof of Theorems 4.1 and 4.2 is to block-diagonalize the pencil $\alpha X + \beta Y$ by a definiteness-revealing congruence. The generic case $v_1 \neq 0$ is treated first in its entirety; then we come back to look at $v_1 = 0$, $v_2 \neq 0$, and see which parts of the argument for $v_1 \neq 0$ must be modified to handle this case.

4.2. Case 1: $v_1 \neq 0$. As before, $L(\lambda) \in H(P) \subseteq DL(P)$. Let $r_1, r_2, \dots, r_{l-1}$ be the finite, real, and distinct roots of the $v$-polynomial $p(x;v)$. We start the reduction of $\alpha X + \beta Y$ to block diagonal form by a nonsingular congruence with the matrix
(4.2) \[ S = [\, e_1 \;\; \Lambda(r_1) \;\; \Lambda(r_2) \;\; \cdots \;\; \Lambda(r_{l-1}) \,]^T \otimes I_n. \]
It is easy to verify that $S\,L(\lambda) \in L_1(P)$ with right ansatz vector $v_1 e_1$. $SY$ has the form
\[
SY = \begin{bmatrix}
v_1A_{l-1} - v_2A_l & v_1A_{l-2} - v_3A_l & \cdots & v_1A_1 - v_lA_l & v_1A_0 \\
q_0(r_1;v)P(r_1) & q_1(r_1;v)P(r_1) & \cdots & q_{l-2}(r_1;v)P(r_1) & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
q_0(r_{l-1};v)P(r_{l-1}) & q_1(r_{l-1};v)P(r_{l-1}) & \cdots & q_{l-2}(r_{l-1};v)P(r_{l-1}) & 0
\end{bmatrix};
\]
for block rows $2\colon l$ this is obtained by using Lemma 4.4 repeatedly on block columns $1\colon l-1$ of $Y$, while the form of the first block row follows from that of $Y$ on using the basis elements in (3.5) (since $L(\lambda) \in DL(P)$). Combining this with the shifted sum property in (4.1),
\[ SX \boxplus SY = v_1 e_1 \otimes [\, A_l \;\; \cdots \;\; A_0 \,], \]
we find that
\[
SX = \begin{bmatrix}
v_1A_l & v_2A_l & \cdots & v_lA_l \\
0 & -q_0(r_1;v)P(r_1) & \cdots & -q_{l-2}(r_1;v)P(r_1) \\
\vdots & \vdots & & \vdots \\
0 & -q_0(r_{l-1};v)P(r_{l-1}) & \cdots & -q_{l-2}(r_{l-1};v)P(r_{l-1})
\end{bmatrix}.
\]
Completing the congruence by right multiplication with $S^*$ yields the two Hermitian matrices
\[
SXS^* = \begin{bmatrix}
v_1A_l & 0 & \cdots & 0 \\
0 & -\widehat q_{r_1}(r_1)P(r_1) & \cdots & -\widehat q_{r_1}(r_{l-1})P(r_1) \\
\vdots & \vdots & & \vdots \\
0 & -\widehat q_{r_{l-1}}(r_1)P(r_{l-1}) & \cdots & -\widehat q_{r_{l-1}}(r_{l-1})P(r_{l-1})
\end{bmatrix},
\]
\[
SYS^* = \begin{bmatrix}
v_1A_{l-1} - v_2A_l & v_1P(r_1) & v_1P(r_2) & \cdots & v_1P(r_{l-1}) \\
v_1P(r_1) & r_1\widehat q_{r_1}(r_1)P(r_1) & r_2\widehat q_{r_1}(r_2)P(r_1) & \cdots & r_{l-1}\widehat q_{r_1}(r_{l-1})P(r_1) \\
\vdots & \vdots & & & \vdots \\
v_1P(r_{l-1}) & r_1\widehat q_{r_{l-1}}(r_1)P(r_{l-1}) & r_2\widehat q_{r_{l-1}}(r_2)P(r_{l-1}) & \cdots & r_{l-1}\widehat q_{r_{l-1}}(r_{l-1})P(r_{l-1})
\end{bmatrix},
\]
where $\widehat q_{r_i}(x) = \sum_{j=0}^{l-2} q_j(r_i;v)\,x^{l-2-j}$.

But by Lemma 4.3(c), $\widehat q_{r_j}(r_i) = 0$ for $i \neq j$, so that all the off-diagonal blocks in the trailing principal $(l-1)\times(l-1)$ block submatrices of $SXS^*$ and $SYS^*$ are zero. Combining these results gives the arrowhead form
\[
S(\alpha X + \beta Y)S^* = \begin{bmatrix}
M_0 & v_1\beta P(r_1) & v_1\beta P(r_2) & \cdots & v_1\beta P(r_{l-1}) \\
v_1\beta P(r_1) & \mu_1 P(r_1) & 0 & \cdots & 0 \\
v_1\beta P(r_2) & 0 & \mu_2 P(r_2) & & \vdots \\
\vdots & \vdots & & \ddots & 0 \\
v_1\beta P(r_{l-1}) & 0 & \cdots & 0 & \mu_{l-1} P(r_{l-1})
\end{bmatrix},
\]
where
(4.3) \[ M_0 = \alpha v_1A_l + \beta(v_1A_{l-1} - v_2A_l), \]
(4.4) \[ \mu_i = (r_i\beta - \alpha)\,\widehat q_{r_i}(r_i), \quad i = 1\colon l-1. \]
From the form of the diagonal blocks $\mu_k P(r_k)$ we can already deduce two necessary conditions for $\alpha X + \beta Y$ to be definite:
1. $(\alpha,\beta)$ must be distinct from the roots $r_k \sim (r_k,1)$ in the homogeneous sense.
2. Each root $r_k$ must lie in some definiteness interval for $P$.

A sequence of $l-1$ more congruences eliminates the rest of the off-diagonal blocks. This begins with a congruence by the matrix $S_1$, equal to the identity except that its first block row is $[\, \mu_1 \;\; -v_1\beta \;\; 0 \;\cdots\; 0 \,] \otimes I_n$, in order to eliminate the $(1,2)$ and $(2,1)$ blocks:
\[
S_1S(\alpha X + \beta Y)S^*S_1^* = \begin{bmatrix}
M_1 & 0 & v_1\beta\mu_1 P(r_2) & \cdots & v_1\beta\mu_1 P(r_{l-1}) \\
0 & \mu_1 P(r_1) & & & \\
v_1\beta\mu_1 P(r_2) & & \mu_2 P(r_2) & & \\
\vdots & & & \ddots & \\
v_1\beta\mu_1 P(r_{l-1}) & & & & \mu_{l-1} P(r_{l-1})
\end{bmatrix},
\]
where $M_1 = \mu_1^2 M_0 - v_1^2\beta^2\mu_1 P(r_1)$. With $S_j$ ($j \geq 2$) equal to the identity except that its $(1,1)$ entry is $\mu_j$ and its $(1,j+1)$ entry is
\[ \sigma_j = -v_1\beta\prod_{k=1}^{j-1}\mu_k \]
(all entries $\otimes\, I_n$), it can be proved by induction that after $k$ such eliminations-by-congruence we have $S_k\cdots S_1S(\alpha X + \beta Y)S^*S_1^*\cdots S_k^* = $

\[
\begin{bmatrix}
M_k & 0 & v_1\beta\prod_{j=1}^{k}\mu_j\,\bigl[\,P(r_{k+1}) \;\cdots\; P(r_{l-1})\,\bigr] \\[2pt]
0 & \mathrm{diag}\bigl(\mu_1P(r_1),\dots,\mu_kP(r_k)\bigr) & 0 \\[2pt]
v_1\beta\prod_{j=1}^{k}\mu_j\,\bigl[\,P(r_{k+1}) \;\cdots\; P(r_{l-1})\,\bigr]^* & 0 & \mathrm{diag}\bigl(\mu_{k+1}P(r_{k+1}),\dots,\mu_{l-1}P(r_{l-1})\bigr)
\end{bmatrix},
\]
where
(4.5) \[ M_k = \Bigl(\prod_{j=1}^{k}\mu_j\Bigr)\Bigl[\Bigl(\prod_{j=1}^{k}\mu_j\Bigr)M_0 \;-\; v_1^2\beta^2\sum_{i=1}^{k}\Bigl(\prod_{\substack{j=1 \\ j\neq i}}^{k}\mu_j\Bigr)P(r_i)\Bigr].
\]
Thus after completing all $l-1$ eliminations-by-congruence we have the block diagonal form
(4.6) \[ S_{l-1}\cdots S_1S(\alpha X + \beta Y)S^*S_1^*\cdots S_{l-1}^* = \mathrm{diag}\bigl(M,\; \mu_1 P(r_1),\; \dots,\; \mu_{l-1}P(r_{l-1})\bigr), \]
where $M = M_{l-1}$ is given by (4.5) with $k$ replaced by $l-1$. Quite remarkably, $M$ simplifies to just a scalar multiple of $P(\alpha,\beta)$:
\[ M = v_1\prod_{j=1}^{l-1}\widehat q_{r_j}(r_j)^2\,\prod_{j=1}^{l-1}(\alpha - r_j\beta)\;P(\alpha,\beta). \]
The proof of this simplification for $M$, which involves some tedious calculations, is left to Appendix A. The block diagonal form in (4.6) can be simplified even further: a scaling congruence removes the squared term $\prod_j\widehat q_{r_j}(r_j)^2$ from the $(1,1)$ block, and $v_1$ can be factored out of each $\mu_i = (r_i\beta - \alpha)\widehat q_{r_i}(r_i)$, since from Lemma 4.3(b), $\widehat q_{r_i}(r_i) = v_1\prod_{j\neq i}(r_i - r_j)$. Thus we see that $\alpha X + \beta Y$ is congruent to the block diagonal form
(4.7) \[
v_1\,\mathrm{diag}\Bigl(\,\prod_{j=1}^{l-1}(\alpha - r_j\beta)\,P(\alpha,\beta),\;\; (r_1\beta - \alpha)\prod_{j\neq 1}(r_1 - r_j)\,P(r_1),\;\; \dots,\;\; (r_{l-1}\beta - \alpha)\prod_{j\neq l-1}(r_{l-1} - r_j)\,P(r_{l-1})\Bigr).
\]
This block diagonalization allows us to make the connection between definiteness of the polynomial $P$ and definiteness of the matrix $\alpha X + \beta Y$.
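For a quadratic ($l = 2$) the reduction above can be checked numerically: $\alpha X + \beta Y$ and the block diagonal form (4.7) are congruent and hence share the same inertia. The sketch below assumes the standard $DL(P)$ block structure for quadratics (recoverable from the shifted-sum equations); the test matrices and function names are our own illustrative choices:

```python
import numpy as np

def dl_pencil_quadratic(A, B, C, v1, v2):
    # L(lam) = lam*X + Y in DL(P) for P(lam) = lam^2 A + lam B + C,
    # ansatz vector v = (v1, v2); structure follows from X (+) Y = v ox [A B C].
    X = np.block([[v1*A, v2*A], [v2*A, v2*B - v1*C]])
    Y = np.block([[v1*B - v2*A, v1*C], [v1*C, v2*C]])
    return X, Y

def inertia(M, tol=1e-10):
    w = np.linalg.eigvalsh(M)
    return int((w > tol).sum()), int((w < -tol).sum())

A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[1.0, 0.0], [0.0, -1.0]])
C = np.array([[-1.0, 0.0], [0.0, -2.0]])
v1, v2 = 1.0, -1.0
r = -v2 / v1                              # the single root of p(x;v) = v1*x + v2
X, Y = dl_pencil_quadratic(A, B, C, v1, v2)

alpha, beta = 1.0, 2.0                    # distinct from (r,1) homogeneously
P = lambda a, b: a*a*A + a*b*B + b*b*C    # homogeneous P(alpha, beta)
lhs = alpha*X + beta*Y
# (4.7) for l = 2: v1*diag((alpha - r*beta)*P(alpha,beta), (r*beta - alpha)*P(r))
rhs = np.block([[v1*(alpha - r*beta)*P(alpha, beta), np.zeros((2, 2))],
                [np.zeros((2, 2)), v1*(r*beta - alpha)*P(r, 1.0)]])
assert inertia(lhs) == inertia(rhs)       # congruent matrices share inertia
```

Since congruence preserves inertia (Sylvester's law), equality of the two inertias is exactly what (4.7) predicts; the check fails only at the excluded points $(\alpha,\beta) \sim (r,1)$, where $\mu_1 = 0$ and the eliminating congruence is singular.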

Fig. 4.1. Pictorial representation of (4.8).

We first make the convention that $(\alpha,\beta)$ lies in the upper half-circle, since replacing $(\alpha,\beta)$ by $(-\alpha,-\beta)$ changes the signs of all the diagonal blocks and hence does not affect the definiteness or indefiniteness of $\alpha X + \beta Y$. The finite roots $r_1, \dots, r_{l-1}$ can be viewed in homogeneous terms as vectors $(r_i,1)$ in the upper $(\alpha,\beta)$-plane. Each diagonal block in (4.7) is a scalar multiple of $P$ evaluated at one of the constants in the set $\mathcal{R} = \{(\alpha,\beta), r_1, r_2, \dots, r_{l-1}\}$. For any $\gamma \in \mathcal{R}$, the scalar in front of $P(\gamma)$ can be interpreted as a product of $l-1$ factors in which $\gamma$ is compared with each of the other $l-1$ constants in $\mathcal{R}$ via $2\times 2$ determinants:
(4.8) \[
\begin{vmatrix} \gamma & \alpha \\ 1 & \beta \end{vmatrix} = \gamma\beta - \alpha; \qquad
\begin{vmatrix} \gamma & r_j \\ 1 & 1 \end{vmatrix} = \gamma - r_j; \qquad\text{or}\qquad
\begin{vmatrix} \alpha & r_j \\ \beta & 1 \end{vmatrix} = \alpha - r_j\beta \quad\text{when } \gamma = (\alpha,\beta).
\]
The sign of any of these determinants is positive for any constant in $\mathcal{R}$ that lies counterclockwise from $\gamma$, and negative for any that lies clockwise from $\gamma$; see Figure 4.1. Consequently, the sign of the whole scalar multiple of $P(\gamma)$ reveals the parity of the number of constants in $\mathcal{R}$ that lie clockwise from $\gamma$.

Hence from (4.7) we see that when $P$ is definite, any choice of $(\alpha,\beta)$ and finite $r_1, r_2, \dots, r_{l-1}$, placed one in each of the $l$ distinct definiteness intervals for $P$ in the upper half-circle, will result in a definite matrix $\alpha X + \beta Y$, and hence a definite pencil $L(\lambda) = \lambda X + Y \in H(P)$. This proves the "if" part of Theorem 4.1.

Now suppose there exists a definite pencil $\alpha X + \beta Y$ in $H(P)$. We first assume that $v_1 \neq 0$. By a final permutation congruence we rearrange the diagonal blocks in (4.7) so that the vectors $(\alpha,\beta)$, $(r_i,1)$, $i = 1\colon l-1$, at which $P$ is evaluated are encountered in counterclockwise order (starting with $\infty$ if $(\alpha,\beta) = \infty$) as we descend the diagonal. With this reordering of blocks, the scalar coefficient of $P$ in the $(1,1)$ block will be positive, and the rest of the scalar coefficients will have alternating signs as we descend the diagonal. Thus, in order for $\alpha X + \beta Y$ to be a definite matrix,
the definiteness parity of the matrices $P(r_i)$, $P(\alpha,\beta)$ must also alternate as we descend the diagonal. Thus by Theorem 2.6 we see that the existence of a definite pencil in $H(P)$ with $v_1 \neq 0$ implies that $P$ must be definite.

Now if there exists a definite pencil in $H(P)$ with $v_1 = 0$, $v_2 \neq 0$, then, since pencils in $H(P)$ vary continuously with the ansatz vector $v \in \mathbb{R}^l$ and definite pencils form an open subset of $H(P)$, a sufficiently small perturbation of $v_1$ away from zero will result in a definite pencil in $H(P)$ with $v_1 \neq 0$, and thereby imply the definiteness of $P$. Thus the existence of any definite pencil in $H(P)$ implies the definiteness of $P$. This completes the proof of Theorem 4.1.
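Theorem 4.1 reduces definiteness of a pencil $\lambda X + Y$ to positive (or negative) definiteness of $\alpha X + \beta Y$ at some point of the upper half-circle. A crude way to look for such a point is simply to sample the half-circle; this is only a sketch (sampling can miss a narrow definiteness interval, and rigorous detection methods exist in the literature on definite Hermitian pairs), with function name and tolerances of our own choosing:

```python
import numpy as np

def is_definite_pair_sampled(X, Y, n_samples=2000, tol=1e-12):
    """Crude test: scan (alpha, beta) = (cos t, sin t) over the upper
    half-circle and report a t at which alpha*X + beta*Y is positive or
    negative definite.  A negative definite combination also certifies a
    definite pair, since (alpha, beta) -> (-alpha, -beta) flips all signs."""
    for t in np.linspace(0.0, np.pi, n_samples, endpoint=False):
        w = np.linalg.eigvalsh(np.cos(t)*X + np.sin(t)*Y)
        if w[0] > tol or w[-1] < -tol:
            return True, t
    return False, None

# definite pair: cos(t)*diag(1,-1) + sin(t)*I > 0 whenever sin t > |cos t|
ok, t = is_definite_pair_sampled(np.diag([1.0, -1.0]), np.eye(2))
assert ok
# indefinite pair: every combination is (cos t + sin t)*diag(1,-1)
ok2, _ = is_definite_pair_sampled(np.diag([1.0, -1.0]), np.diag([1.0, -1.0]))
assert not ok2
```

A success here certifies definiteness; a failure is inconclusive, which is why careful algorithms (based, e.g., on the Crawford number) are preferred in practice.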

To complete the proof of Theorem 4.2, characterizing the set of all definite pencils in $H(P)$, we need to consider the case where one of the roots $r_j$ of $p(x;v)$ is $\infty$ (or, equivalently, $v_1 = 0$), assuming that one of the definiteness intervals of $P$ contains $\infty$.

4.3. Case 2: $v_1 = 0$, $v_2 \neq 0$. We can no longer start the block diagonalization of $\alpha X + \beta Y$ with $S$ as in (4.2), since one of the roots $r_i$ is $\infty$. Instead we use all available finite (real) roots $r_1, r_2, \dots, r_{l-2}$ and let
\[ \widetilde S = [\, e_1 \;\; e_2 \;\; \Lambda(r_1) \;\; \Lambda(r_2) \;\; \cdots \;\; \Lambda(r_{l-2}) \,]^T \otimes I_n. \]
By arguments similar to those used in subsection 4.2 we find that
\[
\widetilde SX = \begin{bmatrix}
0 & v_2A_l & v_3A_l & \cdots & v_lA_l \\
v_2A_l & v_2A_{l-1} + v_3A_l & v_3A_{l-1} + v_4A_l & \cdots & v_lA_{l-1} \\
0 & -q_0(r_1;v)P(r_1) & -q_1(r_1;v)P(r_1) & \cdots & -q_{l-2}(r_1;v)P(r_1) \\
\vdots & \vdots & & & \vdots \\
0 & -q_0(r_{l-2};v)P(r_{l-2}) & -q_1(r_{l-2};v)P(r_{l-2}) & \cdots & -q_{l-2}(r_{l-2};v)P(r_{l-2})
\end{bmatrix},
\]
\[
\widetilde SY = \begin{bmatrix}
-v_2A_l & -v_3A_l & \cdots & -v_lA_l & 0 \\
-v_3A_l & v_2A_{l-2} - v_3A_{l-1} - v_4A_l & \cdots & v_2A_1 - v_lA_{l-1} & v_2A_0 \\
q_0(r_1;v)P(r_1) & q_1(r_1;v)P(r_1) & \cdots & q_{l-2}(r_1;v)P(r_1) & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
q_0(r_{l-2};v)P(r_{l-2}) & q_1(r_{l-2};v)P(r_{l-2}) & \cdots & q_{l-2}(r_{l-2};v)P(r_{l-2}) & 0
\end{bmatrix}.
\]
Note that $q_0(x;v) = v_1 = 0$ and $q_1(x;v) = v_1x + v_2 = v_2$. Using these and completing the congruence by right multiplication with $\widetilde S^*$ yields
\[
\widetilde SX\widetilde S^* = \begin{bmatrix}
0 & v_2A_l & 0 & \cdots & 0 \\
v_2A_l & v_2A_{l-1} + v_3A_l & 0 & \cdots & 0 \\
0 & 0 & -\widehat q_{r_1}(r_1)P(r_1) & \cdots & -\widehat q_{r_1}(r_{l-2})P(r_1) \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & -\widehat q_{r_{l-2}}(r_1)P(r_{l-2}) & \cdots & -\widehat q_{r_{l-2}}(r_{l-2})P(r_{l-2})
\end{bmatrix},
\]
\[
\widetilde SY\widetilde S^* = \begin{bmatrix}
-v_2A_l & -v_3A_l & 0 & \cdots & 0 \\
-v_3A_l & v_2A_{l-2} - v_3A_{l-1} - v_4A_l & v_2P(r_1) & \cdots & v_2P(r_{l-2}) \\
0 & v_2P(r_1) & r_1\widehat q_{r_1}(r_1)P(r_1) & \cdots & r_{l-2}\widehat q_{r_1}(r_{l-2})P(r_1) \\
\vdots & \vdots & \vdots & & \vdots \\
0 & v_2P(r_{l-2}) & r_1\widehat q_{r_{l-2}}(r_1)P(r_{l-2}) & \cdots & r_{l-2}\widehat q_{r_{l-2}}(r_{l-2})P(r_{l-2})
\end{bmatrix}.
\]
But by Lemma 4.3(c) all the off-diagonal blocks in the bottom right $(l-2)\times(l-2)$ block submatrices of $\widetilde SX\widetilde S^*$ and $\widetilde SY\widetilde S^*$ are zero. Hence
\[
\widetilde S(\alpha X + \beta Y)\widetilde S^* = \begin{bmatrix}
-v_2\beta A_l & (v_2\alpha - v_3\beta)A_l & 0 & \cdots & 0 \\
(v_2\alpha - v_3\beta)A_l & N_0 & v_2\beta P(r_1) & \cdots & v_2\beta P(r_{l-2}) \\
0 & v_2\beta P(r_1) & \mu_1 P(r_1) & & \\
\vdots & \vdots & & \ddots & \\
0 & v_2\beta P(r_{l-2}) & & & \mu_{l-2} P(r_{l-2})
\end{bmatrix},
\]
where $N_0 = \alpha(v_2A_{l-1} + v_3A_l) + \beta(v_2A_{l-2} - v_3A_{l-1} - v_4A_l)$ and the $\mu_i = (r_i\beta - \alpha)\widehat q_{r_i}(r_i)$ are the same as in case 1 (now with $\widehat q_{r_i}(r_i) = v_2\prod_{j\neq i}(r_i - r_j)$, since the leading coefficient of $p(x;v)$ is $v_2$). Note that because of the term $-v_2\beta A_l$ in the $(1,1)$ block,

choosing $\beta = 0$ results in $\alpha X + \beta Y$ not being a definite matrix. Thus we cannot choose $(\alpha,\beta)$ to be $\infty$, which is entirely consistent with case 1, where we had to choose $(\alpha,\beta)$ distinct from all the $r_i$. From now on, then, we assume that $\beta \neq 0$. Note that the blocks $\mu_iP(r_i)$ on the diagonal of this condensed form once again show that each $r_i$ must lie in some definiteness interval for $P$, and that $(\alpha,\beta)$ must be chosen distinct from all the finite roots $r_1, r_2, \dots, r_{l-2}$; otherwise $\alpha X + \beta Y$ will not be a definite matrix.

The next step in the reduction is to eliminate the blocks in the second block row and column that are of the form $v_2\beta P(r_i)$, using $l-2$ congruences analogous to the ones used in case 1. The first of these is by the matrix $\widetilde S_1$, equal to the identity except that its second block row is $[\, 0 \;\; \mu_1 \;\; -v_2\beta \;\; 0 \;\cdots\; 0 \,] \otimes I_n$, yielding
\[
\widetilde S_1\widetilde S(\alpha X + \beta Y)\widetilde S^*\widetilde S_1^* = \begin{bmatrix}
-v_2\beta A_l & \mu_1(v_2\alpha - v_3\beta)A_l & 0 & 0 & \cdots & 0 \\
\mu_1(v_2\alpha - v_3\beta)A_l & N_1 & 0 & v_2\beta\mu_1 P(r_2) & \cdots & v_2\beta\mu_1 P(r_{l-2}) \\
0 & 0 & \mu_1 P(r_1) & 0 & \cdots & 0 \\
0 & v_2\beta\mu_1 P(r_2) & 0 & \mu_2 P(r_2) & & \vdots \\
\vdots & \vdots & \vdots & & \ddots & 0 \\
0 & v_2\beta\mu_1 P(r_{l-2}) & 0 & \cdots & 0 & \mu_{l-2} P(r_{l-2})
\end{bmatrix},
\]
where $N_1 = \mu_1^2N_0 - v_2^2\beta^2\mu_1 P(r_1)$. Continuing in analogous fashion, we ultimately obtain
\[
\widetilde S_{l-2}\cdots\widetilde S_1\widetilde S(\alpha X + \beta Y)\widetilde S^*\widetilde S_1^*\cdots\widetilde S_{l-2}^* = \mathrm{diag}\left(\begin{bmatrix}
-v_2\beta A_l & \prod_{j=1}^{l-2}\mu_j\,(v_2\alpha - v_3\beta)A_l \\
\prod_{j=1}^{l-2}\mu_j\,(v_2\alpha - v_3\beta)A_l & \widetilde N
\end{bmatrix},\; \mu_1 P(r_1),\; \dots,\; \mu_{l-2}P(r_{l-2})\right),
\]
where
\[ \widetilde N = \Bigl(\prod_{j=1}^{l-2}\mu_j\Bigr)\Bigl[\Bigl(\prod_{j=1}^{l-2}\mu_j\Bigr)N_0 - v_2^2\beta^2\sum_{i=1}^{l-2}\Bigl(\prod_{j\neq i}\mu_j\Bigr)P(r_i)\Bigr]. \]
One more congruence completes the block diagonalization, namely congruence by
\[ E = \begin{bmatrix} 1 & 0 \\ \prod_{j=1}^{l-2}\mu_j\,(v_2\alpha - v_3\beta) & v_2\beta \end{bmatrix} \otimes I_n, \]
acting on the leading two block rows and columns. Note that $E$ is nonsingular because $\beta \neq 0$. This gives

(4.9) \[ \mathrm{diag}\bigl(-v_2\beta A_l,\; N,\; \mu_1 P(r_1),\; \dots,\; \mu_{l-2}P(r_{l-2})\bigr), \]
where
(4.10) \[ N = v_2\beta\prod_{j=1}^{l-2}\mu_j\Bigl[\prod_{j=1}^{l-2}\mu_j\,(v_2\alpha - v_3\beta)^2A_l + v_2\beta\prod_{j=1}^{l-2}\mu_j\,N_0 - v_2^3\beta^3\sum_{i=1}^{l-2}\Bigl(\prod_{j\neq i}\mu_j\Bigr)P(r_i)\Bigr]. \]
It is shown in Appendix A.2 that
\[ N = v_2\Bigl(v_2\prod_{j=1}^{l-2}\widehat q_{r_j}(r_j)\Bigr)^2\beta\prod_{j=1}^{l-2}(\alpha - r_j\beta)\;P(\alpha,\beta). \]
The block diagonal form in (4.9) is simplified further by factoring out $v_2$ and by a scaling congruence to remove the squared term $\bigl(v_2\prod_j\widehat q_{r_j}(r_j)\bigr)^2$ from the $(2,2)$ block. Thus $\alpha X + \beta Y$ is congruent to the block diagonal form
\[
v_2\,\mathrm{diag}\Bigl(-\beta A_l,\;\; \beta\prod_{j=1}^{l-2}(\alpha - r_j\beta)\,P(\alpha,\beta),\;\; (r_1\beta - \alpha)\prod_{j\neq 1}(r_1 - r_j)\,P(r_1),\;\; \dots,\;\; (r_{l-2}\beta - \alpha)\prod_{j\neq l-2}(r_{l-2} - r_j)\,P(r_{l-2})\Bigr).
\]
With the block diagonalization of $\alpha X + \beta Y$ written in this particular form, it now becomes possible to give a common conceptual interpretation for all the diagonal blocks that is similar to the one we gave for case 1. We just point out the differences. Recall that when $v_1 = 0$ and $v_2 \neq 0$, $p(x;v)$ has $l-2$ finite roots $r_1, r_2, \dots, r_{l-2}$ and one root which is infinite, $r_{l-1} = \infty$, i.e., $(1,0)$ in homogeneous form. Then we have $P(r_{l-1}) = P(\infty) = P(1,0) = A_l$. Hence each diagonal block is a scalar multiple of $P$ evaluated at one of the constants in the set $\mathcal{R} = \{(\alpha,\beta), r_1, r_2, \dots, r_{l-2}, \infty\}$. For any $\gamma \in \mathcal{R}$, the scalar in front of $P(\gamma)$ can be interpreted as a product of $l-1$ factors in which $\gamma$ is compared with each of the other constants in $\mathcal{R}$ via $2\times 2$ determinants:
\[
\begin{vmatrix} \gamma & r_j \\ 1 & 1 \end{vmatrix} = \gamma - r_j, \quad
\begin{vmatrix} \gamma & \alpha \\ 1 & \beta \end{vmatrix} = \gamma\beta - \alpha, \quad
\begin{vmatrix} \gamma & 1 \\ 1 & 0 \end{vmatrix} = -1 \qquad \text{if $\gamma$ is finite;}
\]
\[
\begin{vmatrix} 1 & r_j \\ 0 & 1 \end{vmatrix} = 1, \quad
\begin{vmatrix} 1 & \alpha \\ 0 & \beta \end{vmatrix} = \beta \qquad \text{if } \gamma = (1,0) \text{ is infinite;}
\]
\[
\begin{vmatrix} \alpha & r_j \\ \beta & 1 \end{vmatrix} = \alpha - r_j\beta, \quad
\begin{vmatrix} \alpha & 1 \\ \beta & 0 \end{vmatrix} = -\beta \qquad \text{for } \gamma = (\alpha,\beta).
\]
Recall our convention that each $\gamma \in \mathcal{R}$ lies in the strict upper $(\alpha,\beta)$ half-plane or equals $(1,0)$. Then the sign of any of these determinants is positive for any constant in $\mathcal{R}$ that lies counterclockwise from $\gamma$, and negative for any that lies clockwise from $\gamma$. Consequently, the sign of the whole scalar multiple of $P(\gamma)$ reveals the parity of the number of constants in $\mathcal{R}$ that lie clockwise from $\gamma$. If we re-order the blocks

(via permutation congruence) so that, as we go down the diagonal, we encounter the constants from $\mathcal{R}$ in the evaluated $P$'s in strict counterclockwise order (starting with $\infty$ whenever $\infty \in \mathcal{R}$), then the scalar multiples will have alternating sign, starting with a positive sign in the $(1,1)$ block. We then see that $\alpha X + \beta Y$ is a definite matrix if and only if the matrices $P(\gamma)$ have strictly alternating definiteness parity as we descend the diagonal. This completes the characterization of the set of all definite pencils in $H(P)$, as given in Theorem 4.2.

5. Application to quadratics. We now concentrate our attention on quadratic polynomials $Q(\lambda) = \lambda^2A + \lambda B + C$ with Hermitian $A$, $B$, and $C$. For $x \in \mathbb{C}^n$ let
\[ q_x(\lambda) = x^*Q(\lambda)x = \lambda^2(x^*Ax) + \lambda(x^*Bx) + x^*Cx \equiv \lambda^2a_x + \lambda b_x + c_x \]
be the scalar section of $Q$ at $x$. The discriminant of $Q$ at $x$ is the discriminant of $q_x(\lambda)$:
\[ D_x := b_x^2 - 4a_xc_x = D_x(Q). \]
The following result is specific to quadratics.

Theorem 5.1. A Hermitian quadratic matrix polynomial $Q(\lambda)$ is definite if and only if any two (and hence all) of the following properties hold:
(a) $Q(\mu) > 0$ for some $\mu \in \mathbb{R} \cup \{\infty\}$;
(b) $D_x = (x^*Bx)^2 - 4(x^*Ax)(x^*Cx) > 0$ for all nonzero $x \in \mathbb{C}^n$;
(c) $Q(\gamma) < 0$ for some $\gamma \in \mathbb{R} \cup \{\infty\}$.

Proof. That (a) and (b) together are equivalent to $Q$ being definite is immediate from Definition 2.5; that (a) and (c) together are equivalent to $Q$ being definite follows from Theorem 2.6. Suppose now that (b) and (c) hold, and let $\widetilde Q(\lambda) = -Q(\lambda)$. Then $\widetilde Q(\gamma) > 0$ and $D_x(\widetilde Q) = D_x(Q)$, so $\widetilde Q$ satisfies (a) and (b) and hence is definite. But then by Theorem 2.6, $\widetilde Q(\mu) < 0$ for some $\mu$, which means that $Q(\mu) > 0$; so $Q$ satisfies (a) and (b) and hence is definite. Finally, $Q$ being definite implies that (b) and (c) hold, by Definition 2.5 and Theorem 2.6.

Here is a simple example where $A$, $B$, and $C$ are all indefinite but properties (a) and (c) of Theorem 5.1 hold, so that $Q(\lambda)$ is definite:
\[
Q(\lambda) = \lambda^2\begin{bmatrix} -3 & 1 \\ 1 & 2 \end{bmatrix} + \lambda\begin{bmatrix} 6 & -3 \\ -3 & -10 \end{bmatrix} + \begin{bmatrix} 0 & 2 \\ 2 & 9 \end{bmatrix}, \qquad Q(1) > 0, \quad Q(3) < 0.
\]
The definiteness diagram for $Q$ has the form of the last diagram in Figure 2.2(b). The standard basis pencils of $H(Q)$
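The situation in the example can also be verified numerically. The matrices below are one concrete sign pattern consistent with the stated properties ($A$, $B$, $C$ indefinite, $Q(1) > 0$, $Q(3) < 0$); treat the specific entries as an assumption of this sketch rather than as authoritative:

```python
import numpy as np

A = np.array([[-3.0, 1.0], [1.0, 2.0]])
B = np.array([[6.0, -3.0], [-3.0, -10.0]])
C = np.array([[0.0, 2.0], [2.0, 9.0]])
Q = lambda lam: lam*lam*A + lam*B + C

def is_pd(M):
    return bool(np.all(np.linalg.eigvalsh(M) > 0))

# A, B, C are each indefinite (negative 2x2 determinant)
assert all(np.linalg.det(M) < 0 for M in (A, B, C))

# (a) and (c) of Theorem 5.1: Q(1) > 0 and Q(3) < 0
assert is_pd(Q(1.0)) and is_pd(-Q(3.0))

# (b): the discriminant D_x is positive for (sampled) nonzero x
rng = np.random.default_rng(0)
for _ in range(100):
    x = rng.standard_normal(2)
    a, b, c = x @ A @ x, x @ B @ x, x @ C @ x
    assert b*b - 4*a*c > 0

# a definite quadratic has real eigenvalues: check via a companion form
comp = np.block([[-np.linalg.solve(A, B), -np.linalg.solve(A, C)],
                 [np.eye(2), np.zeros((2, 2))]])
assert np.max(np.abs(np.linalg.eigvals(comp).imag)) < 1e-8
```

By Theorem 5.1, properties (a) and (c) alone already certify definiteness, so the discriminant and real-eigenvalue checks are consequences rather than extra hypotheses.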
with ansatz vectors $e_1$ and $e_2$ are given by
\[
L_1(\lambda) = \lambda\begin{bmatrix} A & 0 \\ 0 & -C \end{bmatrix} + \begin{bmatrix} B & C \\ C & 0 \end{bmatrix}, \qquad
L_2(\lambda) = \lambda\begin{bmatrix} 0 & A \\ A & B \end{bmatrix} + \begin{bmatrix} -A & 0 \\ 0 & C \end{bmatrix}.
\]
$L_1(\lambda)$ is a linearization if the trailing coefficient matrix $C$ is nonsingular. Since the root of $p(x;e_1)$ is $0$, Theorem 4.2 implies that if $Q(\lambda)$ is definite with $C = Q(0)$ definite, then $L_1(\lambda)$ is definite. Similarly, $L_2(\lambda)$ is a linearization if the leading coefficient matrix $A$ is nonsingular, and since the root of $p(x;e_2)$ is $\infty$, Theorem 4.2 implies that if $Q(\lambda)$ is definite with $A = Q(\infty)$ definite, then $L_2(\lambda)$ is definite. Now if $Q$ is definite with $A > 0$ and $C < 0$, then
\[ \begin{bmatrix} A & 0 \\ 0 & -C \end{bmatrix} > 0. \]
Thus the eigenvalues of $Q$ can be computed by the Cholesky--QR method on either $L_1(\lambda)$ or $\lambda L_2(1/\lambda)$. Note that the Cholesky--QR method [3] has several advantages over the QZ algorithm for the numerical solution of hyperbolic quadratics. First, the Cholesky--QR method takes advantage of the symmetry of the pencil, which results in a reduction in both

DEFINITE MATRIX POLYNOMIALS 19 the storage requirement and the computational cost Second it guarantees to produce real eigenvalues and therefore preserves this spectral property of definite quadratics This is not necessarily the case for the QZ algorithm We now show how to transform definite quadratics into special forms Three different cases are considered Case (a) Suppose that Q is hyperbolic so that A > 0 and we wish to transform Q into Q with  > 0 and Ĉ < 0 An ordinary translation suffices to achieve this as long as we know one value γ R such that Q(γ) < 0 Then Q(λ) := Q(λ + γ) = λ 2 A + λ(b + 2γA) + C + γb + γ 2 A λ 2  + λ B + Ĉ with Q(0) = Q(γ) = Ĉ < 0 and  = A > 0 Case (b) Suppose Q is definite and we wish to transform it into a hyperbolic Q with à > 0 This can be done by a homogeneous rotation provided we know one value µ R { } for which Q(µ) > 0 We need to make µ for Q correspond to for Q For this we express µ in homogeneous coordinates: µ = (cs) with c 2 + s 2 = 1 Recall that = (10) in homogeneous coordinates From (21) we see that α β = c s = µ will correspond to eαeβ = 1 0 Thus homogenous rotation with these (cs)-values will give Q such that à = Q(cs) > 0 Case (c) Suppose Q is definite and we wish to transform Q into Q so that A > 0 and C < 0 To do this we need to know a µ R { } such that Q(µ) > 0 and a γ R { } such that Q(γ) < 0 With these two numbers in hand we first do the rotation of case (b) to obtain A > 0 and then the translation of case (a) to obtain C < 0 Finally we note that an efficient algorithm for testing whether a Hermitian quadratic is hyperbolic is developed by Guo Higham and Tisseur 5 In the case of an affirmative test this algorithm provides a µ such that Q(µ) < 0 Appendix A This appendix deals with the simplification of one of the diagonal blocks obtained during the block diagonalizations of αx + βy for both case 1 (v 1 0) and case 2 (v 1 = 0 v 2 0) The next lemma is the main technical result needed to achieve these simplifications 
Lemma A.1. Let $p_m(x;f)$ be the polynomial of degree $m-1$ that interpolates the function $f$ at the $m$ distinct points $r_1, r_2, \dots, r_m$. Rewrite $f$ as
(A.1) \[ f(x) = E(x;f) + p_m(x;f), \]
where $E(x;f)$ is the error in interpolation. Then
(a) $E(x;x^m) = \prod_{i=1}^m(x - r_i)$,
(b) $E(x;x^{m+1}) = \bigl(x + \sum_{i=1}^m r_i\bigr)\prod_{i=1}^m(x - r_i)$,
(c) $E(x;x^{m+2}) = \Bigl(x^2 + x\sum_{i=1}^m r_i + \bigl(\sum_{i=1}^m r_i\bigr)^2 - \sum_{i<j} r_ir_j\Bigr)\prod_{i=1}^m(x - r_i)$.
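Before the proof, the three error formulas can be checked numerically against a directly computed Lagrange interpolant. This is only a sketch; the node choice is arbitrary, with one node deliberately zero to exercise the continuity argument used in part (b):

```python
import numpy as np

def lagrange_interp(rs, fvals, x):
    """Evaluate the degree-(m-1) interpolant through (r_j, fvals[j]) at x."""
    total = 0.0
    for j, (rj, fj) in enumerate(zip(rs, fvals)):
        w = 1.0
        for i, ri in enumerate(rs):
            if i != j:
                w *= (x - ri) / (rj - ri)
        total += fj * w
    return total

rs = np.array([0.0, 1.0, 2.5, -1.3])      # distinct nodes, one of them zero
m = len(rs)
s = rs.sum()                              # sum of the r_i
s2 = sum(rs[i]*rs[j] for i in range(m) for j in range(i + 1, m))
x = 0.37
node = np.prod(x - rs)                    # prod (x - r_i)

# Lemma A.1 (a), (b), (c): E(x; x^k) = f(x) - p_m(x; f) for f(x) = x^k.
for k, pred in [(m, node),
                (m + 1, (x + s) * node),
                (m + 2, (x*x + s*x + s*s - s2) * node)]:
    E = x**k - lagrange_interp(rs, rs**k, x)
    assert abs(E - pred) < 1e-8
```

The same check at $x = \alpha/\beta$, cleared of denominators, reproduces the homogeneous identities (A.3)-(A.5) used below.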

Proof. Recall the Lagrange form of the interpolating polynomial to $f$ at the $m$ distinct points $r_1, r_2, \dots, r_m$:
\[ p_m(x;f) = \sum_{j=1}^m f(r_j)\prod_{i\neq j}\frac{x - r_i}{r_j - r_i}. \]
To avoid clutter in the proof we define
\[ s := \sum_{i=1}^m r_i, \qquad \widehat s := \sum_{i<j} r_ir_j. \]
We will use repeatedly the following fact: if two monic polynomials $p$ and $q$ of degree $m$ agree at $m$ distinct points, then they are identically equal (since $p - q$ is a polynomial of degree at most $m-1$ with $m$ zeros).

(a) $f(x) = x^m$ and $q_0(x) = \prod_{i=1}^m(x - r_i) + p_m(x;f)$ are monic polynomials that agree at the $m$ points $r_1, r_2, \dots, r_m$. Hence $q_0(x) \equiv x^m$, and the expression for $E(x;x^m)$ in (a) follows. This result can also be obtained from the standard formula for the error in polynomial interpolation. Observe that equating coefficients of the degree $m-1$ terms in $x^m \equiv q_0(x)$ gives the identity
(A.2) \[ \sum_{j=1}^m\frac{r_j^m}{\prod_{i\neq j}(r_j - r_i)} = s. \]

(b) Note that $x^{m+1}$ and $q_1(x) = (x + s)\prod_{i=1}^m(x - r_i) + p_m(x;x^{m+1})$ are monic degree $m+1$ polynomials that agree at the $m$ points $r_1, r_2, \dots, r_m$. Thus an $(m+1)$st point is needed to prove that $q_1(x) \equiv x^{m+1}$. Now
\[
q_1(0) = s\prod_{i=1}^m(-r_i) + \sum_{j=1}^m r_j^{m+1}\prod_{i\neq j}\frac{-r_i}{r_j - r_i}
= (-1)^m\prod_{i=1}^m r_i\,\Bigl(s - \sum_{j=1}^m\frac{r_j^m}{\prod_{i\neq j}(r_j - r_i)}\Bigr) = 0
\]
by (A.2). Thus $q_1(x) \equiv x^{m+1}$ whenever $r_1, r_2, \dots, r_m$ are all nonzero. Now suppose one of the points, $r_m$ say, is zero, so that the above argument is not valid. In this case we view $q_1$ as a function of $x$ and $r_1, r_2, \dots, r_m$, and observe that $q_1$ is continuous in all these variables as long as the $r_i$ remain distinct. We perturb $r_m$ away from zero, keeping it distinct from all the other $r_i$. Then for any fixed but arbitrary $x$ we have $q_1(x, r_1, r_2, \dots, r_m) = x^{m+1}$ when $r_m \neq 0$, and by continuity we have the same equality as $r_m \to 0$. Thus $q_1(x) \equiv x^{m+1}$ holds for any set of distinct $r_1, r_2, \dots, r_m$, even if one of them is zero.

(c) We begin by computing the three highest order terms of
\[ q_2(x) := (x^2 + sx + s^2 - \widehat s)\prod_{i=1}^m(x - r_i) + p_m(x;x^{m+2}), \]
which come solely from the first expression,

since $p_m(x;x^{m+2})$ is of degree $m-1$. We have
\[
(x^2 + sx + s^2 - \widehat s)\prod_{i=1}^m(x - r_i) = (x^2 + sx + s^2 - \widehat s)(x^m - sx^{m-1} + \widehat s\,x^{m-2} + \cdots)
= x^{m+2} + 0\cdot x^{m+1} + 0\cdot x^m + \cdots.
\]
Thus $h(x) := q_2(x) - x^{m+2}$ is actually a polynomial of degree at most $m-1$, and it is easy to see that $h(x)$ has the $m$ distinct zeros $r_1, r_2, \dots, r_m$, so that $h(x) \equiv 0$, i.e., $q_2(x) \equiv x^{m+2}$, and the expression for $E(x;x^{m+2})$ follows.

Using the Lagrange form of the interpolating polynomial and letting $x = \alpha/\beta$ with $\beta \neq 0$, parts (a), (b), (c) of Lemma A.1 together with (A.1) yield the following identities in the homogeneous variables $(\alpha,\beta)$:
(A.3) \[ \alpha^m = \prod_{i=1}^m(\alpha - r_i\beta) + \beta\sum_{i=1}^m r_i^m\prod_{j\neq i}\frac{\alpha - r_j\beta}{r_i - r_j}, \]
(A.4) \[ \alpha^{m+1} = \Bigl(\alpha + \beta\sum_{i=1}^m r_i\Bigr)\prod_{i=1}^m(\alpha - r_i\beta) + \beta^2\sum_{i=1}^m r_i^{m+1}\prod_{j\neq i}\frac{\alpha - r_j\beta}{r_i - r_j}, \]
(A.5) \[ \alpha^{m+2} = \Bigl(\alpha^2 + \alpha\beta\sum_{i=1}^m r_i + \beta^2\Bigl(\bigl(\textstyle\sum_i r_i\bigr)^2 - \sum_{i<j}r_ir_j\Bigr)\Bigr)\prod_{i=1}^m(\alpha - r_i\beta) + \beta^3\sum_{i=1}^m r_i^{m+2}\prod_{j\neq i}\frac{\alpha - r_j\beta}{r_i - r_j}. \]
With these results in hand we can now return to the simplification of the blocks $M = M_{l-1}$ in (4.5) and $N$ in (4.10).

A.1. Simplification of M. Recall from (4.5) that $M = M_{l-1} = \bigl(\prod_{j=1}^{l-1}\mu_j\bigr)\widetilde M$, with
(A.6) \[ \widetilde M = \Bigl(\prod_{j=1}^{l-1}\mu_j\Bigr)M_0 - v_1^2\beta^2\sum_{i=1}^{l-1}\Bigl(\prod_{j\neq i}\mu_j\Bigr)P(r_i), \]
where, from (4.3) and (4.4), $M_0 = \alpha v_1A_l + \beta(v_1A_{l-1} - v_2A_l)$ and $\mu_i = (r_i\beta - \alpha)\widehat q_{r_i}(r_i)$, $i = 1\colon l-1$. The first step is to break apart every instance of $P$ into three pieces:
(A.7) \[ P(\lambda) = \lambda^lA_l + \lambda^{l-1}A_{l-1} + \widetilde P(\lambda). \]
Then we rewrite $M_0$ so as to eliminate $v_2$ and group $A_l$ and $A_{l-1}$ together. For this, note that since the $r_i$ are the roots of the $v$-polynomial, $v_2/v_1 = -(r_1 + r_2 + \cdots + r_{l-1}) =: -s$, so that $v_2 = -v_1s$ and
(A.8) \[ M_0 = v_1(\alpha + s\beta)A_l + v_1\beta A_{l-1}. \]
Substituting (A.7) and (A.8) into (A.6) and grouping all the $A_l$ and $A_{l-1}$ terms together yields
(A.9) \[ \widetilde M = A_l\Bigl[v_1(\alpha + s\beta)\prod_{j=1}^{l-1}\mu_j - v_1^2\beta^2\sum_{i=1}^{l-1}\Bigl(\prod_{j\neq i}\mu_j\Bigr)r_i^l\Bigr] \]

\[
\;+\; A_{l-1}\Bigl[v_1\beta\prod_{j=1}^{l-1}\mu_j - v_1^2\beta^2\sum_{i=1}^{l-1}\Bigl(\prod_{j\neq i}\mu_j\Bigr)r_i^{l-1}\Bigr] \;-\; v_1^2\beta^2\sum_{i=1}^{l-1}\Bigl(\prod_{j\neq i}\mu_j\Bigr)\widetilde P(r_i).
\]
We now simplify each of these three pieces in turn. Since $\mu_j = (r_j\beta - \alpha)\widehat q_{r_j}(r_j)$ by (4.4),
(A.10) \[ \prod_{j=1}^{l-1}\mu_j = (-1)^{l-1}\prod_{j=1}^{l-1}(\alpha - r_j\beta)\prod_{j=1}^{l-1}\widehat q_{r_j}(r_j). \]
Also, from Lemma 4.3(b),
(A.11) \[ \widehat q_{r_j}(r_j) = v_1\prod_{i\neq j}(r_j - r_i). \]
Substituting (A.10) first and then (A.11) in the coefficient of $A_l$ gives
\[
v_1(\alpha + s\beta)\prod_j\mu_j - v_1^2\beta^2\sum_i\Bigl(\prod_{j\neq i}\mu_j\Bigr)r_i^l
= v_1(-1)^{l-1}\prod_j\widehat q_{r_j}(r_j)\Bigl[(\alpha + s\beta)\prod_j(\alpha - r_j\beta) + \beta^2\sum_i r_i^l\prod_{j\neq i}\frac{\alpha - r_j\beta}{r_i - r_j}\Bigr]
= v_1(-1)^{l-1}\prod_j\widehat q_{r_j}(r_j)\,\alpha^l,
\]
where we used (A.4), with $m = l-1$, for the last equality. The simplification of the coefficient of $A_{l-1}$ is very similar to that of $A_l$. On using (A.10) and (A.11) we obtain
\[
v_1\beta\prod_j\mu_j - v_1^2\beta^2\sum_i\Bigl(\prod_{j\neq i}\mu_j\Bigr)r_i^{l-1}
= v_1(-1)^{l-1}\prod_j\widehat q_{r_j}(r_j)\,\beta\Bigl[\prod_j(\alpha - r_j\beta) + \beta\sum_i r_i^{l-1}\prod_{j\neq i}\frac{\alpha - r_j\beta}{r_i - r_j}\Bigr]
= v_1(-1)^{l-1}\prod_j\widehat q_{r_j}(r_j)\,\alpha^{l-1}\beta,
\]
where we used (A.3) in the last equality. The rest of $\widetilde M$ is simplified as follows:
\[
-v_1^2\beta^2\sum_i\Bigl(\prod_{j\neq i}\mu_j\Bigr)\widetilde P(r_i)
= v_1(-1)^{l-1}\prod_j\widehat q_{r_j}(r_j)\,\beta^2\sum_i\widetilde P(r_i)\prod_{j\neq i}\frac{\alpha - r_j\beta}{r_i - r_j}.
\]
But for $\beta \neq 0$ and $x = \alpha/\beta$, the sum
\[
\sum_i\widetilde P(r_i)\prod_{j\neq i}\frac{x - r_j}{r_i - r_j}
\]

is the Lagrange form of $\widetilde P(x)$, since $\widetilde P$ is of degree $l-2$ and the $l-1$ points $r_i$ are distinct. Hence for $\beta \neq 0$,
\[
-v_1^2\beta^2\sum_i\Bigl(\prod_{j\neq i}\mu_j\Bigr)\widetilde P(r_i) = v_1(-1)^{l-1}\prod_j\widehat q_{r_j}(r_j)\,\beta^2\,\widetilde P(\alpha,\beta),
\]
where $\widetilde P(\alpha,\beta) = \sum_{k=0}^{l-2}\alpha^k\beta^{l-2-k}A_k$ is the homogeneous form of $\widetilde P$. Finally, note that the last equality also holds for $\beta = 0$, by continuity. Putting these three simplifications back into (A.9) yields
\[ \widetilde M = v_1(-1)^{l-1}\prod_{j=1}^{l-1}\widehat q_{r_j}(r_j)\;P(\alpha,\beta), \]
and on using (A.10), $M = M_{l-1} = \bigl(\prod_j\mu_j\bigr)\widetilde M$ in (4.5) becomes
\[ M = v_1\prod_{j=1}^{l-1}\widehat q_{r_j}(r_j)^2\prod_{j=1}^{l-1}(\alpha - r_j\beta)\;P(\alpha,\beta). \]

A.2. Simplification of N. To simplify
(A.12) \[ N = v_2\beta\prod_{j=1}^{l-2}\mu_j\Bigl[\prod_{j=1}^{l-2}\mu_j\,(v_2\alpha - v_3\beta)^2A_l + v_2\beta\prod_{j=1}^{l-2}\mu_j\,N_0 - v_2^3\beta^3\sum_{i=1}^{l-2}\Bigl(\prod_{j\neq i}\mu_j\Bigr)P(r_i)\Bigr], \]
where $N_0 = \alpha(v_2A_{l-1} + v_3A_l) + \beta(v_2A_{l-2} - v_3A_{l-1} - v_4A_l)$ and $\mu_i = (r_i\beta - \alpha)\widehat q_{r_i}(r_i)$, we this time break apart every instance of $P$ into four pieces:
(A.13) \[ P(\lambda) = \lambda^lA_l + \lambda^{l-1}A_{l-1} + \lambda^{l-2}A_{l-2} + \widetilde P(\lambda). \]
Leaving aside the product $v_2\beta\prod_j\mu_j$ at the beginning of the expression for $N$ and focusing on the quantity inside the square brackets, we now simplify the coefficients of the $A_l$-, $A_{l-1}$-, $A_{l-2}$-, and $\widetilde P(\lambda)$-terms.

The $A_l$-term. This is
\[
v_2^2\Bigl[\prod_i\mu_i\Bigl(\alpha - \frac{v_3}{v_2}\beta\Bigr)^2 + \beta\prod_i\mu_i\Bigl(\frac{v_3}{v_2}\alpha - \frac{v_4}{v_2}\beta\Bigr)\Bigr] - v_2^3\beta^3\sum_i\Bigl(\prod_{j\neq i}\mu_j\Bigr)r_i^l.
\]
Defining $\widehat s = r_1 + r_2 + \cdots + r_{l-2}$ and $\theta = \sum_{1\leq i<j\leq l-2} r_ir_j$, we note that $v_3/v_2 = -\widehat s$ and $v_4/v_2 = \theta$. From these notations, the definition of the $\mu_i$, and the case-2 analogue $\widehat q_{r_j}(r_j) = v_2\prod_{i\neq j}(r_j - r_i)$ of (A.11), the $A_l$-term equals
\[
v_2^2(-1)^{l-2}\prod_j\widehat q_{r_j}(r_j)\Bigl[\bigl(\alpha^2 + \widehat s\alpha\beta + (\widehat s^2 - \theta)\beta^2\bigr)\prod_j(\alpha - r_j\beta) + \beta^3\sum_i r_i^l\prod_{j\neq i}\frac{\alpha - r_j\beta}{r_i - r_j}\Bigr]
= v_2^2(-1)^{l-2}\prod_j\widehat q_{r_j}(r_j)\,\alpha^l,
\]
where we used $(\alpha + \widehat s\beta)^2 - \beta(\widehat s\alpha + \theta\beta) = \alpha^2 + \widehat s\alpha\beta + (\widehat s^2 - \theta)\beta^2$ and then the identity (A.5), with $m = l-2$, to obtain the last equality.

The $A_{l-1}$-term. This is
\[
v_2^2\beta\prod_i\mu_i\Bigl(\alpha - \frac{v_3}{v_2}\beta\Bigr) - v_2^3\beta^3\sum_i\Bigl(\prod_{j\neq i}\mu_j\Bigr)r_i^{l-1}.
\]
Using again the fact that $v_3/v_2 = -\widehat s$, this reduces via (A.4), with $m = l-2$, to
\[ v_2^2(-1)^{l-2}\prod_j\widehat q_{r_j}(r_j)\,\alpha^{l-1}\beta. \]

The $A_{l-2}$-term. This is
\[
v_2^2\beta^2\prod_i\mu_i - v_2^3\beta^3\sum_i\Bigl(\prod_{j\neq i}\mu_j\Bigr)r_i^{l-2},
\]
and its simplification is completely analogous to the treatment of the coefficient of $A_{l-1}$ in case 1: by (A.3), with $m = l-2$, we obtain
\[ v_2^2(-1)^{l-2}\prod_j\widehat q_{r_j}(r_j)\,\alpha^{l-2}\beta^2. \]

The $\widetilde P(\lambda)$-term. Substituting for the $\mu_i$ and factoring out all the $\widehat q_{r_i}(r_i)$ terms leads to
\[
-v_2^3\beta^3\sum_j\Bigl(\prod_{i\neq j}\mu_i\Bigr)\widetilde P(r_j)
= v_2^2(-1)^{l-2}\prod_j\widehat q_{r_j}(r_j)\,\beta^3\sum_i\widetilde P(r_i)\prod_{j\neq i}\frac{\alpha - r_j\beta}{r_i - r_j},
\]

which by the Lagrange interpolation formula simplifies to
\[ v_2^2(-1)^{l-2}\prod_j\widehat q_{r_j}(r_j)\,\beta^3\,\widetilde P(\alpha,\beta), \]
since $\widetilde P$ is now of degree $l-3$ and the $l-2$ points $r_i$ are distinct. Bringing these four simplifications all together, we have
\[ N = v_2\Bigl(v_2\prod_{j=1}^{l-2}\widehat q_{r_j}(r_j)\Bigr)^2\beta\prod_{j=1}^{l-2}(\alpha - r_j\beta)\;P(\alpha,\beta). \]

REFERENCES

[1] ANSYS, Inc., Theory Manual, SAS IP, Inc., twelfth ed.
[2] L. Barkwell and P. Lancaster, Overdamped and gyroscopic vibrating systems, Trans. ASME, J. Appl. Mech., 59 (1992), pp. 176-181.
[3] P. I. Davies, N. J. Higham, and F. Tisseur, Analysis of the Cholesky method with iterative refinement for solving the symmetric definite generalized eigenproblem, SIAM J. Matrix Anal. Appl., 23 (2001), pp. 472-493.
[4] I. Gohberg, P. Lancaster, and L. Rodman, Indefinite Linear Algebra and Applications, Birkhäuser, Basel, Switzerland, 2005.
[5] C.-H. Guo, N. J. Higham, and F. Tisseur, Detecting and solving hyperbolic quadratic eigenvalue problems, MIMS EPrint, Manchester Institute for Mathematical Sciences, The University of Manchester, UK, September 2007. Revised August 2008. To appear in SIAM J. Matrix Anal. Appl.
[6] N. J. Higham, Accuracy and Stability of Numerical Algorithms, second ed., Society for Industrial and Applied Mathematics, Philadelphia, PA, 2002.
[7] N. J. Higham, R.-C. Li, and F. Tisseur, Backward error of polynomial eigenproblems solved by linearization, SIAM J. Matrix Anal. Appl., 29 (2007), pp. 1218-1241.
[8] N. J. Higham, D. S. Mackey, N. Mackey, and F. Tisseur, Symmetric linearizations for matrix polynomials, SIAM J. Matrix Anal. Appl., 29 (2006), pp. 143-159.
[9] N. J. Higham, D. S. Mackey, and F. Tisseur, The conditioning of linearizations of matrix polynomials, SIAM J. Matrix Anal. Appl., 28 (2006), pp. 1005-1028.
[10] P. Lancaster and M. Tismenetsky, The Theory of Matrices, second ed., Academic Press, London, 1985.
[11] D. S. Mackey, N. Mackey, C. Mehl, and V. Mehrmann, Structured polynomial eigenvalue problems: Good vibrations from good linearizations, SIAM J. Matrix Anal. Appl., 28 (2006), pp. 1029-1051.
[12] D. S. Mackey, N. Mackey, C. Mehl, and V. Mehrmann, Vector spaces of linearizations for matrix polynomials, SIAM J. Matrix Anal. Appl., 28 (2006), pp. 971-1004.
[13] A. S. Markus, Introduction to the Spectral Theory of Polynomial Operator Pencils, American Mathematical Society, Providence, RI, 1988.
[14] G. W. Stewart, Perturbation bounds for the definite generalized eigenvalue problem, Linear Algebra Appl., 23 (1979), pp. 69-85.
[15] G. W. Stewart and J.-G. Sun, Matrix Perturbation Theory, Academic Press, London, 1990.
[16] K. Veselić, A Jacobi eigenreduction algorithm for definite matrix pairs, Numer. Math., 64 (1993), pp. 241-269.